From: Nilesh Javali <njavali@marvell.com>
To: <martin.petersen@oracle.com>
Cc: <linux-scsi@vger.kernel.org>, <GR-QLogic-Storage-Upstream@marvell.com>
Subject: [PATCH v2 05/17] qla2xxx: Fix premature hw access after pci error
Date: Sun, 9 Jan 2022 21:02:06 -0800
Message-ID: <20220110050218.3958-6-njavali@marvell.com>
In-Reply-To: <20220110050218.3958-1-njavali@marvell.com>

From: Quinn Tran <qutran@marvell.com>

Fix premature hw access after PCI error.
After a recoverable PCI error has been detected and recovered, the qla driver
needs to check whether the error condition still persists and/or wait until
the OS gives the resume signal.

Sep  8 22:26:03 localhost kernel: WARNING: CPU: 9 PID: 124606 at qla_tmpl.c:440
qla27xx_fwdt_entry_t266+0x55/0x60 [qla2xxx]
Sep  8 22:26:03 localhost kernel: RIP: 0010:qla27xx_fwdt_entry_t266+0x55/0x60
[qla2xxx]
Sep  8 22:26:03 localhost kernel: Call Trace:
Sep  8 22:26:03 localhost kernel: ? qla27xx_walk_template+0xb1/0x1b0 [qla2xxx]
Sep  8 22:26:03 localhost kernel: ? qla27xx_execute_fwdt_template+0x12a/0x160
[qla2xxx]
Sep  8 22:26:03 localhost kernel: ? qla27xx_fwdump+0xa0/0x1c0 [qla2xxx]
Sep  8 22:26:03 localhost kernel: ? qla2xxx_pci_mmio_enabled+0xfb/0x120
[qla2xxx]
Sep  8 22:26:03 localhost kernel: ? report_mmio_enabled+0x44/0x80
Sep  8 22:26:03 localhost kernel: ? report_slot_reset+0x80/0x80
Sep  8 22:26:03 localhost kernel: ? pci_walk_bus+0x70/0x90
Sep  8 22:26:03 localhost kernel: ? aer_dev_correctable_show+0xc0/0xc0
Sep  8 22:26:03 localhost kernel: ? pcie_do_recovery+0x1bb/0x240
Sep  8 22:26:03 localhost kernel: ? aer_recover_work_func+0xaa/0xd0
Sep  8 22:26:03 localhost kernel: ? process_one_work+0x1a7/0x360
..
Sep  8 22:26:03 localhost kernel: qla2xxx [0000:42:00.2]-8041:22: detected PCI
disconnect.
Sep  8 22:26:03 localhost kernel: qla2xxx [0000:42:00.2]-107ff:22:
qla27xx_fwdt_entry_t262: dump ram MB failed. Area 5h start 198013h end 198013h
Sep  8 22:26:03 localhost kernel: qla2xxx [0000:42:00.2]-107ff:22: Unable to
capture FW dump
Sep  8 22:26:03 localhost kernel: qla2xxx [0000:42:00.2]-1015:22: cmd=0x0,
waited 5221 msecs
Sep  8 22:26:03 localhost kernel: qla2xxx [0000:42:00.2]-680d:22: mmio
enabled returning.
Sep  8 22:26:03 localhost kernel: qla2xxx [0000:42:00.2]-d04c:22: MBX
Command timeout for cmd 0, iocontrol=ffffffff jiffies=10140f2e5
mb[0-3]=[0xffff 0xffff 0xffff 0xffff]

Cc: stable@vger.kernel.org
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
---
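Illustration only, not part of the patch: a minimal user-space sketch of the
ordering the mmio_enabled change enforces. It assumes the register check boils
down to "a read that returns all ones means the function is still
disconnected", which matches the iocontrol=ffffffff and mb[0-3]=0xffff values
in the log above. The names fake_regs, reg_disconnected() and try_fw_dump()
are hypothetical and do not exist in the driver; a real driver would read the
register through an MMIO accessor rather than a plain struct field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped register block. */
struct fake_regs {
	uint32_t host_status;
};

/* A read from an unreachable PCI function completes with all bits set. */
static bool reg_disconnected(const struct fake_regs *regs)
{
	assert(regs != NULL);
	return regs->host_status == UINT32_MAX;
}

/*
 * Model of the mmio_enabled ordering: check the register state first and
 * skip the register-heavy firmware dump while the device is unreachable.
 * Returns true only if the dump step was attempted.
 */
static bool try_fw_dump(const struct fake_regs *regs)
{
	if (reg_disconnected(regs))
		return false;	/* bail out, no further hardware access */

	/* Placeholder for the RISC-paused check and firmware dump. */
	return true;
}
```

In this model the caller reports that a slot reset is needed in both cases;
the only difference is whether the hardware is touched before the reset.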
 drivers/scsi/qla2xxx/qla_os.c   | 10 +++++++++-
 drivers/scsi/qla2xxx/qla_tmpl.c |  9 +++++++--
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 0a7b00d165c7..c4b4b4496399 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -7639,7 +7639,7 @@ qla2xxx_pci_error_detected(struct pci_dev *pdev, pci_channel_state_t state)
 
 	switch (state) {
 	case pci_channel_io_normal:
-		ha->flags.eeh_busy = 0;
+		qla_pci_set_eeh_busy(vha);
 		if (ql2xmqsupport || ql2xnvmeenable) {
 			set_bit(QPAIR_ONLINE_CHECK_NEEDED, &vha->dpc_flags);
 			qla2xxx_wake_dpc(vha);
@@ -7680,9 +7680,16 @@ qla2xxx_pci_mmio_enabled(struct pci_dev *pdev)
 	       "mmio enabled\n");
 
 	ha->pci_error_state = QLA_PCI_MMIO_ENABLED;
+
 	if (IS_QLA82XX(ha))
 		return PCI_ERS_RESULT_RECOVERED;
 
+	if (qla2x00_isp_reg_stat(ha)) {
+		ql_log(ql_log_info, base_vha, 0x803f,
+		    "During mmio enabled, PCI/Register disconnect still detected.\n");
+		goto out;
+	}
+
 	spin_lock_irqsave(&ha->hardware_lock, flags);
 	if (IS_QLA2100(ha) || IS_QLA2200(ha)){
 		stat = rd_reg_word(&reg->hccr);
@@ -7704,6 +7711,7 @@ qla2xxx_pci_mmio_enabled(struct pci_dev *pdev)
 		    "RISC paused -- mmio_enabled, Dumping firmware.\n");
 		qla2xxx_dump_fw(base_vha);
 	}
+out:
 	/* set PCI_ERS_RESULT_NEED_RESET to trigger call to qla2xxx_pci_slot_reset */
 	ql_dbg(ql_dbg_aer, base_vha, 0x600d,
 	       "mmio enabled returning.\n");
diff --git a/drivers/scsi/qla2xxx/qla_tmpl.c b/drivers/scsi/qla2xxx/qla_tmpl.c
index 26c13a953b97..b0a74b036cf4 100644
--- a/drivers/scsi/qla2xxx/qla_tmpl.c
+++ b/drivers/scsi/qla2xxx/qla_tmpl.c
@@ -435,8 +435,13 @@ qla27xx_fwdt_entry_t266(struct scsi_qla_host *vha,
 {
 	ql_dbg(ql_dbg_misc, vha, 0xd20a,
 	    "%s: reset risc [%lx]\n", __func__, *len);
-	if (buf)
-		WARN_ON_ONCE(qla24xx_soft_reset(vha->hw) != QLA_SUCCESS);
+	if (buf) {
+		if (qla24xx_soft_reset(vha->hw) != QLA_SUCCESS) {
+			ql_dbg(ql_dbg_async, vha, 0x5001,
+			    "%s: unable to soft reset\n", __func__);
+			return INVALID_ENTRY;
+		}
+	}
 
 	return qla27xx_next_entry(ent);
 }
-- 
2.23.1

