From: "Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>
To: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Cc: James Bottomley <jejb@linux.vnet.ibm.com>,
	linux-scsi@vger.kernel.org,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	Frederic Barrat <fbarrat@linux.vnet.ibm.com>,
	"Manoj N. Kumar" <manoj@linux.vnet.ibm.com>,
	Andrew Donnellan <andrew.donnellan@au1.ibm.com>,
	linuxppc-dev@lists.ozlabs.org,
	Christophe Lombard <clombard@linux.vnet.ibm.com>
Subject: Re: [PATCH v3 40/41] cxlflash: Remove commmands from pending list on timeout
Date: Wed, 28 Mar 2018 09:50:32 -0500
Message-ID: <20180328145032.GB61145@p8tul1-build.aus.stglabs.ibm.com>
In-Reply-To: <1522082134-58938-1-git-send-email-ukrishn@linux.vnet.ibm.com>

On Mon, Mar 26, 2018 at 11:35:34AM -0500, Uma Krishnan wrote:
> The following Oops can occur if an internal command sent to the AFU does
> not complete within the timeout:
> 
> [c000000ff101b810] c008000016020d94 term_mc+0xfc/0x1b0 [cxlflash]
> [c000000ff101b8a0] c008000016020fb0 term_afu+0x168/0x280 [cxlflash]
> [c000000ff101b930] c0080000160232ec cxlflash_pci_error_detected+0x184/0x230
>                                        [cxlflash]
> [c000000ff101b9e0] c00800000d95d468 cxl_vphb_error_detected+0x90/0x150[cxl]
> [c000000ff101ba20] c00800000d95f27c cxl_pci_error_detected+0xa4/0x240 [cxl]
> [c000000ff101bac0] c00000000003eaf8 eeh_report_error+0xd8/0x1b0
> [c000000ff101bb20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170
> [c000000ff101bbb0] c00000000003f438 eeh_handle_normal_event+0x198/0x580
> [c000000ff101bc60] c00000000003fba4 eeh_handle_event+0x2a4/0x338
> [c000000ff101bd10] c0000000000400b8 eeh_event_handler+0x1f8/0x200
> [c000000ff101bdc0] c00000000013da48 kthread+0x1a8/0x1b0
> [c000000ff101be30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4
> 
> When an internal command times out, the command buffer is freed while it
> is still in the pending commands list of the context. This corrupts the
> list and when the context is cleaned up, a crash is encountered.
> 
> To resolve this issue, when an AFU command or TMF command times out, the
> command should be deleted from the hardware queue pending command list
> before freeing the buffer.
> 
> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>

Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
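
For readers following the thread, the ordering described in the patch
description (unlink the command from the pending list, then free the
buffer) can be illustrated with a small userspace sketch. This is not the
patch itself and not cxlflash code: the list helpers only imitate the
kernel's list_head, and struct hw_queue, struct cmd, cmd_submit(),
cmd_timeout() and queue_cleanup() are hypothetical names made up for the
illustration. Locking is left out; a real driver would presumably need to
hold whatever lock protects the pending list around the unlink.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Circular doubly linked list, loosely modelled on the kernel's list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Unlink a node and leave it pointing at itself. */
static void list_del_init(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = node;
}

static size_t list_len(const struct list_node *head)
{
	size_t n = 0;

	for (const struct list_node *p = head->next; p != head; p = p->next)
		n++;
	return n;
}

/* Hypothetical stand-ins for a hardware queue and an internal command. */
struct hw_queue {
	struct list_node pending;	/* commands sent but not yet completed */
};

struct cmd {
	struct list_node link;		/* first member, so a node casts to cmd */
	int id;
};

static struct cmd *cmd_submit(struct hw_queue *q, int id)
{
	struct cmd *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->id = id;
	list_add_tail(&c->link, &q->pending);
	return c;
}

/*
 * Timeout path: unlink first, then free. Freeing without the unlink would
 * leave the neighbours pointing at freed memory, which is the corruption
 * the patch description refers to.
 */
static void cmd_timeout(struct cmd *c)
{
	list_del_init(&c->link);
	free(c);
}

/* Context teardown: walks the pending list, so the list must be intact. */
static size_t queue_cleanup(struct hw_queue *q)
{
	size_t freed = 0;

	while (q->pending.next != &q->pending) {
		struct cmd *c = (struct cmd *)q->pending.next;

		list_del_init(&c->link);
		free(c);
		freed++;
	}
	return freed;
}
```

With two commands submitted and one of them timed out through
cmd_timeout(), queue_cleanup() only ever visits the surviving command,
because the timed-out one is no longer reachable from the list.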
