From: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
To: linux-scsi@vger.kernel.org,
	James Bottomley <jejb@linux.vnet.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	"Matthew R. Ochs" <mrochs@linux.vnet.ibm.com>,
	"Manoj N. Kumar" <manoj@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org,
	Andrew Donnellan <andrew.donnellan@au1.ibm.com>,
	Frederic Barrat <fbarrat@linux.vnet.ibm.com>,
	Christophe Lombard <clombard@linux.vnet.ibm.com>
Subject: [PATCH v3 41/41] cxlflash: Handle spurious interrupts
Date: Mon, 26 Mar 2018 11:35:42 -0500
Message-ID: <1522082142-58975-1-git-send-email-ukrishn@linux.vnet.ibm.com>
In-Reply-To: <1522081759-57431-1-git-send-email-ukrishn@linux.vnet.ibm.com>

The following Oops can occur when there is heavy I/O traffic and the host
is reset by a tool such as sg_reset.

[c000200fff3fbc90] c00800001690117c process_cmd_doneq+0x104/0x500
                                       [cxlflash] (unreliable)
[c000200fff3fbd80] c008000016901648 cxlflash_rrq_irq+0xd0/0x150 [cxlflash]
[c000200fff3fbde0] c000000000193130 __handle_irq_event_percpu+0xa0/0x310
[c000200fff3fbea0] c0000000001933d8 handle_irq_event_percpu+0x38/0x90
[c000200fff3fbee0] c000000000193494 handle_irq_event+0x64/0xb0
[c000200fff3fbf10] c000000000198ea0 handle_fasteoi_irq+0xc0/0x230
[c000200fff3fbf40] c00000000019182c generic_handle_irq+0x4c/0x70
[c000200fff3fbf60] c00000000001794c __do_irq+0x7c/0x1c0
[c000200fff3fbf90] c00000000002a390 call_do_irq+0x14/0x24
[c000200e5828fab0] c000000000017b2c do_IRQ+0x9c/0x130
[c000200e5828fb00] c000000000009b04 h_virt_irq_common+0x114/0x120

When a context is reset, the pending commands are flushed and the AFU
is notified. Before the AFU handles this request, command completion
interrupts may already be queued to the PHB and not yet delivered to the
context. In this scenario, a context can receive an interrupt for a
command that has already been flushed, leading to a possible crash when
the memory for the flushed command is accessed.

To resolve this problem, add a boolean to the hardware queue that indicates
whether the queue is ready to process interrupts. The interrupt handler
evaluates this flag before processing an interrupt.
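
For illustration only, the guard can be sketched in user space as follows.
This is a minimal model under stated assumptions, not the driver code: the
demo_* names are hypothetical, a pthread mutex stands in for hrrq_slock, and
the counters exist only to make the behaviour observable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the driver's struct hwq. */
struct demo_hwq {
	pthread_mutex_t hrrq_lock;	/* plays the role of hrrq_slock */
	bool hrrq_online;		/* true only while completions may be processed */
	int processed;			/* completions that were handled */
	int dropped;			/* spurious completions that were ignored */
};

/* The queue starts offline until it has been fully set up. */
void demo_hwq_init(struct demo_hwq *q)
{
	assert(q != NULL);
	pthread_mutex_init(&q->hrrq_lock, NULL);
	q->hrrq_online = false;
	q->processed = 0;
	q->dropped = 0;
}

/* Setup path: mark the queue ready (the patch does this in init_global()). */
void demo_hwq_bring_online(struct demo_hwq *q)
{
	pthread_mutex_lock(&q->hrrq_lock);
	q->hrrq_online = true;
	pthread_mutex_unlock(&q->hrrq_lock);
}

/* Teardown path: clear the flag before pending commands are flushed. */
void demo_hwq_take_offline(struct demo_hwq *q)
{
	pthread_mutex_lock(&q->hrrq_lock);
	q->hrrq_online = false;
	pthread_mutex_unlock(&q->hrrq_lock);
}

/*
 * Model of the interrupt handler. Returns true if a completion was
 * processed and false if the interrupt was dropped as spurious.
 */
bool demo_rrq_irq(struct demo_hwq *q)
{
	bool handled;

	pthread_mutex_lock(&q->hrrq_lock);
	handled = q->hrrq_online;
	if (handled)
		q->processed++;	/* safe: commands have not been flushed */
	else
		q->dropped++;	/* late interrupt after reset: touch nothing */
	pthread_mutex_unlock(&q->hrrq_lock);

	return handled;
}
```

Because the flag is read and written under the same lock that the handler
takes, a handler that observes the queue as online cannot race with the
teardown path clearing it. In this model an interrupt that arrives before
demo_hwq_bring_online() or after demo_hwq_take_offline() is counted as
dropped and no command memory is touched.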

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
---
 drivers/scsi/cxlflash/common.h |  1 +
 drivers/scsi/cxlflash/main.c   | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/drivers/scsi/cxlflash/common.h b/drivers/scsi/cxlflash/common.h
index b69fd32..3556b1d 100644
--- a/drivers/scsi/cxlflash/common.h
+++ b/drivers/scsi/cxlflash/common.h
@@ -224,6 +224,7 @@ struct hwq {
 	u64 *hrrq_end;
 	u64 *hrrq_curr;
 	bool toggle;
+	bool hrrq_online;
 
 	s64 room;
 
diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index c920328..a24d7e6 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -801,6 +801,10 @@ static void term_mc(struct cxlflash_cfg *cfg, u32 index)
 		WARN_ON(cfg->ops->release_context(hwq->ctx_cookie));
 	hwq->ctx_cookie = NULL;
 
+	spin_lock_irqsave(&hwq->hrrq_slock, lock_flags);
+	hwq->hrrq_online = false;
+	spin_unlock_irqrestore(&hwq->hrrq_slock, lock_flags);
+
 	spin_lock_irqsave(&hwq->hsq_slock, lock_flags);
 	flush_pending_cmds(hwq);
 	spin_unlock_irqrestore(&hwq->hsq_slock, lock_flags);
@@ -1475,6 +1479,12 @@ static irqreturn_t cxlflash_rrq_irq(int irq, void *data)
 
 	spin_lock_irqsave(&hwq->hrrq_slock, hrrq_flags);
 
+	/* Silently drop spurious interrupts when queue is not online */
+	if (!hwq->hrrq_online) {
+		spin_unlock_irqrestore(&hwq->hrrq_slock, hrrq_flags);
+		return IRQ_HANDLED;
+	}
+
 	if (afu_is_irqpoll_enabled(afu)) {
 		irq_poll_sched(&hwq->irqpoll);
 		spin_unlock_irqrestore(&hwq->hrrq_slock, hrrq_flags);
@@ -1781,6 +1791,7 @@ static int init_global(struct cxlflash_cfg *cfg)
 
 		writeq_be((u64) hwq->hrrq_start, &hmap->rrq_start);
 		writeq_be((u64) hwq->hrrq_end, &hmap->rrq_end);
+		hwq->hrrq_online = true;
 
 		if (afu_is_sq_cmd_mode(afu)) {
 			writeq_be((u64)hwq->hsq_start, &hmap->sq_start);
-- 
2.1.0
