All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nilesh Javali <njavali@marvell.com>
To: <martin.petersen@oracle.com>
Cc: <linux-scsi@vger.kernel.org>,
	<GR-QLogic-Storage-Upstream@marvell.com>, <djeffery@redhat.com>,
	<loberman@redhat.com>
Subject: [PATCH v2 04/10] qla2xxx: Fix crash in NVME abort path
Date: Wed, 8 Sep 2021 09:46:16 -0700	[thread overview]
Message-ID: <20210908164622.19240-5-njavali@marvell.com> (raw)
In-Reply-To: <20210908164622.19240-1-njavali@marvell.com>

From: Arun Easi <aeasi@marvell.com>

System crash was seen when I/O was run against a NVME target and when I/O
aborts were occurring.

Crash stack is:

    -- relevant crash stack --
    BUG: kernel NULL pointer dereference, address: 0000000000000010
    :
    #6 [ffffae1f8666bdd0] page_fault at ffffffffa740122e
       [exception RIP: qla_nvme_abort_work+339]
       RIP: ffffffffc0f592e3  RSP: ffffae1f8666be80  RFLAGS: 00010297
       RAX: 0000000000000000  RBX: ffff9b581fc8af80  RCX: ffffffffc0f83bd0
       RDX: 0000000000000001  RSI: ffff9b5839c6c7c8  RDI: 0000000008000000
       RBP: ffff9b6832f85000   R8: ffffffffc0f68160   R9: ffffffffc0f70652
       R10: ffffae1f862ffdc8  R11: 0000000000000300  R12: 000000000000010d
       R13: 0000000000000000  R14: ffff9b5839cea000  R15: 0ffff9b583fab170
       ORIG_RAX: ffffffffffffffff   CS: 0010  SS: 0018
    #7 [ffffae1f8666be98] process_one_work at ffffffffa6aba184
    #8 [ffffae1f8666bed8] worker_thread at ffffffffa6aba39d
    #9 [ffffae1f8666bf10] kthread at ffffffffa6ac06ed

The crash was due to a stale SRB structure access after it was aborted.
Fixed the issue by removing stale access.

Fixes: 2cabf10dbbe38 (“scsi: qla2xxx: Fix hang on NVMe command timeouts ”)
Cc: stable@vger.kernel.org
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
---
 drivers/scsi/qla2xxx/qla_nvme.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c
index 1c5da2dbd6f9..877b2b625020 100644
--- a/drivers/scsi/qla2xxx/qla_nvme.c
+++ b/drivers/scsi/qla2xxx/qla_nvme.c
@@ -228,6 +228,8 @@ static void qla_nvme_abort_work(struct work_struct *work)
 	fc_port_t *fcport = sp->fcport;
 	struct qla_hw_data *ha = fcport->vha->hw;
 	int rval, abts_done_called = 1;
+	bool io_wait_for_abort_done;
+	uint32_t handle;
 
 	ql_dbg(ql_dbg_io, fcport->vha, 0xffff,
 	       "%s called for sp=%p, hndl=%x on fcport=%p desc=%p deleted=%d\n",
@@ -244,12 +246,20 @@ static void qla_nvme_abort_work(struct work_struct *work)
 		goto out;
 	}
 
+	/*
+	 * sp may not be valid after abort_command if return code is either
+	 * SUCCESS or ERR_FROM_FW codes, so cache the value here.
+	 */
+	io_wait_for_abort_done = ql2xabts_wait_nvme &&
+					QLA_ABTS_WAIT_ENABLED(sp);
+	handle = sp->handle;
+
 	rval = ha->isp_ops->abort_command(sp);
 
 	ql_dbg(ql_dbg_io, fcport->vha, 0x212b,
 	    "%s: %s command for sp=%p, handle=%x on fcport=%p rval=%x\n",
 	    __func__, (rval != QLA_SUCCESS) ? "Failed to abort" : "Aborted",
-	    sp, sp->handle, fcport, rval);
+	    sp, handle, fcport, rval);
 
 	/*
 	 * If async tmf is enabled, the abort callback is called only on
@@ -264,7 +274,7 @@ static void qla_nvme_abort_work(struct work_struct *work)
 	 * are waited until ABTS complete. This kref is decreased
 	 * at qla24xx_abort_sp_done function.
 	 */
-	if (abts_done_called && ql2xabts_wait_nvme && QLA_ABTS_WAIT_ENABLED(sp))
+	if (abts_done_called && io_wait_for_abort_done)
 		return;
 out:
 	/* kref_get was done before work was schedule. */
-- 
2.19.0.rc0


  parent reply	other threads:[~2021-09-08 16:47 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-08 16:46 [PATCH v2 00/10] qla2xxx driver bug fixes Nilesh Javali
2021-09-08 16:46 ` [PATCH v2 01/10] qla2xxx: Add support for mailbox passthru Nilesh Javali
2021-09-08 18:28   ` Bart Van Assche
2021-09-14  7:15     ` [EXT] " Nilesh Javali
2021-09-08 16:46 ` [PATCH v2 02/10] qla2xxx: Display 16G only as supported speeds for 3830c card Nilesh Javali
2021-09-08 16:46 ` [PATCH v2 03/10] qla2xxx: Check for firmware capability before creating QPair Nilesh Javali
2021-09-08 16:46 ` Nilesh Javali [this message]
2021-09-08 16:46 ` [PATCH v2 05/10] qla2xxx: edif: Use link event to wake up app Nilesh Javali
2021-09-08 16:46 ` [PATCH v2 06/10] qla2xxx: Fix kernel crash when accessing port_speed sysfs file Nilesh Javali
2021-09-08 16:46 ` [PATCH v2 07/10] qla2xxx: Call process_response_queue() in Tx path Nilesh Javali
2021-09-08 16:46 ` [PATCH v2 08/10] qla2xxx: Move heart beat handling from dpc thread to workqueue Nilesh Javali
2021-09-08 16:46 ` [PATCH v2 09/10] qla2xxx: Fix use after free in eh_abort path Nilesh Javali
2021-09-08 16:46 ` [PATCH v2 10/10] qla2xxx: Update version to 10.02.07.100-k Nilesh Javali
2021-09-15  2:59 ` [PATCH v2 00/10] qla2xxx driver bug fixes Martin K. Petersen
2021-09-22  4:45 ` Martin K. Petersen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210908164622.19240-5-njavali@marvell.com \
    --to=njavali@marvell.com \
    --cc=GR-QLogic-Storage-Upstream@marvell.com \
    --cc=djeffery@redhat.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=loberman@redhat.com \
    --cc=martin.petersen@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.