linuxppc-dev.lists.ozlabs.org archive mirror
From: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
To: linux-scsi@vger.kernel.org,
	James Bottomley <jejb@linux.vnet.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	"Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>,
	"Manoj N. Kumar" <manoj@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org,
	Andrew Donnellan <andrew.donnellan@au1.ibm.com>,
	Frederic Barrat <fbarrat@linux.vnet.ibm.com>,
	Christophe Lombard <clombard@linux.vnet.ibm.com>
Subject: [PATCH v3 40/41] cxlflash: Remove commands from pending list on timeout
Date: Mon, 26 Mar 2018 11:35:34 -0500
Message-ID: <1522082134-58938-1-git-send-email-ukrishn@linux.vnet.ibm.com>
In-Reply-To: <1522081759-57431-1-git-send-email-ukrishn@linux.vnet.ibm.com>

The following Oops can occur if an internal command sent to the AFU does
not complete within the timeout:

[c000000ff101b810] c008000016020d94 term_mc+0xfc/0x1b0 [cxlflash]
[c000000ff101b8a0] c008000016020fb0 term_afu+0x168/0x280 [cxlflash]
[c000000ff101b930] c0080000160232ec cxlflash_pci_error_detected+0x184/0x230
                                       [cxlflash]
[c000000ff101b9e0] c00800000d95d468 cxl_vphb_error_detected+0x90/0x150 [cxl]
[c000000ff101ba20] c00800000d95f27c cxl_pci_error_detected+0xa4/0x240 [cxl]
[c000000ff101bac0] c00000000003eaf8 eeh_report_error+0xd8/0x1b0
[c000000ff101bb20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170
[c000000ff101bbb0] c00000000003f438 eeh_handle_normal_event+0x198/0x580
[c000000ff101bc60] c00000000003fba4 eeh_handle_event+0x2a4/0x338
[c000000ff101bd10] c0000000000400b8 eeh_event_handler+0x1f8/0x200
[c000000ff101bdc0] c00000000013da48 kthread+0x1a8/0x1b0
[c000000ff101be30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4

When an internal command times out, its command buffer is freed while the
command is still linked into the context's pending commands list. This
corrupts the list, and the crash occurs later, when the context is
cleaned up.

To resolve this issue, delete the command from the hardware queue's
pending command list before freeing the buffer whenever an AFU command or
TMF command times out.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/main.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
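
Note (illustration only, not part of the commit): the ordering problem
described in the commit message can be sketched in plain userspace C. This
is a minimal sketch under stated assumptions, not the driver code. The
names list_node, demo_cmd, demo_submit and demo_timeout are hypothetical
stand-ins for the kernel's list_head, struct afu_cmd and the timeout paths
touched by the diff. Locking is omitted because the sketch is
single-threaded; the patch itself takes hwq->hsq_slock around list_del().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, modeled loosely on list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail_node(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Unlink a node; its neighbours no longer point at it afterwards. */
static void list_del_node(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->next = NULL;
	node->prev = NULL;
}

static size_t list_count(const struct list_node *head)
{
	size_t n = 0;

	for (const struct list_node *p = head->next; p != head; p = p->next)
		n++;
	return n;
}

/* Hypothetical command: the list linkage lives inside the heap buffer. */
struct demo_cmd {
	struct list_node list;
	int id;
};

/* Allocate a command and queue it on the pending list. */
static struct demo_cmd *demo_submit(struct list_node *pending, int id)
{
	struct demo_cmd *cmd = calloc(1, sizeof(*cmd));

	if (!cmd)
		return NULL;
	cmd->id = id;
	list_add_tail_node(&cmd->list, pending);
	return cmd;
}

/*
 * Timeout path with the corrected ordering: unlink first, then free.
 * Calling free() without the unlink would leave the neighbours pointing
 * into freed memory, so any later walk of the pending list would
 * dereference a dangling pointer.
 */
static void demo_timeout(struct demo_cmd *cmd)
{
	list_del_node(&cmd->list);
	free(cmd);
}
```

After demo_timeout() returns, the pending list contains only live
commands, so a later cleanup pass can walk it safely.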

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index dfe7648..c920328 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -473,6 +473,7 @@ static int send_tmf(struct cxlflash_cfg *cfg, struct scsi_device *sdev,
 	struct afu_cmd *cmd = NULL;
 	struct device *dev = &cfg->dev->dev;
 	struct hwq *hwq = get_hwq(afu, PRIMARY_HWQ);
+	bool needs_deletion = false;
 	char *buf = NULL;
 	ulong lock_flags;
 	int rc = 0;
@@ -527,6 +528,7 @@ static int send_tmf(struct cxlflash_cfg *cfg, struct scsi_device *sdev,
 	if (!to) {
 		dev_err(dev, "%s: TMF timed out\n", __func__);
 		rc = -ETIMEDOUT;
+		needs_deletion = true;
 	} else if (cmd->cmd_aborted) {
 		dev_err(dev, "%s: TMF aborted\n", __func__);
 		rc = -EAGAIN;
@@ -537,6 +539,12 @@ static int send_tmf(struct cxlflash_cfg *cfg, struct scsi_device *sdev,
 	}
 	cfg->tmf_active = false;
 	spin_unlock_irqrestore(&cfg->tmf_slock, lock_flags);
+
+	if (needs_deletion) {
+		spin_lock_irqsave(&hwq->hsq_slock, lock_flags);
+		list_del(&cmd->list);
+		spin_unlock_irqrestore(&hwq->hsq_slock, lock_flags);
+	}
 out:
 	kfree(buf);
 	return rc;
@@ -2284,6 +2292,7 @@ static int send_afu_cmd(struct afu *afu, struct sisl_ioarcb *rcb)
 	struct device *dev = &cfg->dev->dev;
 	struct afu_cmd *cmd = NULL;
 	struct hwq *hwq = get_hwq(afu, PRIMARY_HWQ);
+	ulong lock_flags;
 	char *buf = NULL;
 	int rc = 0;
 	int nretry = 0;
@@ -2329,6 +2338,11 @@ static int send_afu_cmd(struct afu *afu, struct sisl_ioarcb *rcb)
 	case -ETIMEDOUT:
 		rc = afu->context_reset(hwq);
 		if (rc) {
+			/* Delete the command from pending_cmds list */
+			spin_lock_irqsave(&hwq->hsq_slock, lock_flags);
+			list_del(&cmd->list);
+			spin_unlock_irqrestore(&hwq->hsq_slock, lock_flags);
+
 			cxlflash_schedule_async_reset(cfg);
 			break;
 		}
-- 
2.1.0
