All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] pm80xx updates.
@ 2020-12-30  4:57 Viswas G
  2020-12-30  4:57 ` [PATCH 1/8] pm80xx: No busywait in MPI init check Viswas G
                   ` (8 more replies)
  0 siblings, 9 replies; 18+ messages in thread
From: Viswas G @ 2020-12-30  4:57 UTC (permalink / raw)
  To: linux-scsi
  Cc: Vasanthalakshmi.Tharmarajan, Viswas.G, Ruksar.devadi, yuuzheng,
	vishakhavc, radha, akshatzen, bjashnani, jinpu.wang

From: Viswas G <Viswas.G@microchip.com>

This patch set include some bug fixes and enhancements for pm80xx driver.

Bhavesh Jashnani (1):
  pm80xx: Simultaneous poll for all FW readiness.

Vishakha Channapattan (2):
  pm80xx: Log SATA IOMB completion status on failure.
  pm80xx: Add sysfs attribute for ioc health

Viswas G (1):
  pm80xx: fix driver fatal dump failure.

akshatzen (4):
  pm80xx: No busywait in MPI init check
  pm80xx: check fatal error
  pm80xx: check main config table address
  pm80xx: fix missing tag_free in NVMD DATA req

 drivers/scsi/pm8001/pm8001_ctl.c  |  42 ++++++++
 drivers/scsi/pm8001/pm8001_hwi.c  |  15 ++-
 drivers/scsi/pm8001/pm8001_init.c |  11 ++-
 drivers/scsi/pm8001/pm8001_sas.c  |   9 ++
 drivers/scsi/pm8001/pm8001_sas.h  |   2 +
 drivers/scsi/pm8001/pm80xx_hwi.c  | 202 ++++++++++++++++++++++++--------------
 drivers/scsi/pm8001/pm80xx_hwi.h  |  17 +++-
 7 files changed, 216 insertions(+), 82 deletions(-)

-- 
2.16.3


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/8] pm80xx: No busywait in MPI init check
  2020-12-30  4:57 [PATCH 0/8] pm80xx updates Viswas G
@ 2020-12-30  4:57 ` Viswas G
  2021-01-04  6:45   ` Jinpu Wang
  2020-12-30  4:57 ` [PATCH 2/8] pm80xx: check fatal error Viswas G
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Viswas G @ 2020-12-30  4:57 UTC (permalink / raw)
  To: linux-scsi
  Cc: Vasanthalakshmi.Tharmarajan, Viswas.G, Ruksar.devadi, yuuzheng,
	vishakhavc, radha, akshatzen, bjashnani, jinpu.wang

From: akshatzen <akshatzen@google.com>

We do not need to busy wait during mpi_init_check. I confirmed that
mpi_init_check is not being invoked in an ATOMIC context. It is being
called from pm8001_pci_resume, pm8001_pci_probe. Hence we are
replacing the udelay which busy waits with msleep.

Signed-off-by: akshatzen <akshatzen@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
---
 drivers/scsi/pm8001/pm80xx_hwi.c | 6 +++---
 drivers/scsi/pm8001/pm80xx_hwi.h | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index 6772b0924dac..9c4b8b374ab8 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -997,7 +997,7 @@ static int mpi_init_check(struct pm8001_hba_info *pm8001_ha)
 		max_wait_count = SPC_DOORBELL_CLEAR_TIMEOUT;
 	}
 	do {
-		udelay(1);
+		msleep(FW_READY_INTERVAL);
 		value = pm8001_cr32(pm8001_ha, 0, MSGU_IBDB_SET);
 		value &= SPCv_MSGU_CFG_TABLE_UPDATE;
 	} while ((value != 0) && (--max_wait_count));
@@ -1010,9 +1010,9 @@ static int mpi_init_check(struct pm8001_hba_info *pm8001_ha)
 		return -EBUSY;
 	}
 	/* check the MPI-State for initialization upto 100ms*/
-	max_wait_count = 100 * 1000;/* 100 msec */
+	max_wait_count = 5;/* 100 msec */
 	do {
-		udelay(1);
+		msleep(FW_READY_INTERVAL);
 		gst_len_mpistate =
 			pm8001_mr32(pm8001_ha->general_stat_tbl_addr,
 					GST_GSTLEN_MPIS_OFFSET);
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.h b/drivers/scsi/pm8001/pm80xx_hwi.h
index ec48bc276de6..2b6b52551968 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.h
+++ b/drivers/scsi/pm8001/pm80xx_hwi.h
@@ -220,8 +220,8 @@
 #define SAS_DOPNRJT_RTRY_TMO            128
 #define SAS_COPNRJT_RTRY_TMO            128
 
-#define SPCV_DOORBELL_CLEAR_TIMEOUT	(30 * 1000 * 1000) /* 30 sec */
-#define SPC_DOORBELL_CLEAR_TIMEOUT	(15 * 1000 * 1000) /* 15 sec */
+#define SPCV_DOORBELL_CLEAR_TIMEOUT	(30 * 50) /* 30 sec */
+#define SPC_DOORBELL_CLEAR_TIMEOUT	(15 * 50) /* 15 sec */
 
 /*
   Making ORR bigger than IT NEXUS LOSS which is 2000000us = 2 second.
-- 
2.16.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/8] pm80xx: check fatal error
  2020-12-30  4:57 [PATCH 0/8] pm80xx updates Viswas G
  2020-12-30  4:57 ` [PATCH 1/8] pm80xx: No busywait in MPI init check Viswas G
@ 2020-12-30  4:57 ` Viswas G
  2021-01-05 13:19   ` Jinpu Wang
  2020-12-30  4:57 ` [PATCH 3/8] pm80xx: check main config table address Viswas G
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Viswas G @ 2020-12-30  4:57 UTC (permalink / raw)
  To: linux-scsi
  Cc: Vasanthalakshmi.Tharmarajan, Viswas.G, Ruksar.devadi, yuuzheng,
	vishakhavc, radha, akshatzen, bjashnani, jinpu.wang

From: akshatzen <akshatzen@google.com>

When controller runs into fatal error, commands which expect
response get stuck due to no response. If the controller is
in fatal error state, abort request issued to the controller
gets hung too. Hence we should fail it without trying.

Signed-off-by: akshatzen <akshatzen@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
---
 drivers/scsi/pm8001/pm8001_hwi.c |  1 +
 drivers/scsi/pm8001/pm8001_sas.c |  9 +++++++++
 drivers/scsi/pm8001/pm8001_sas.h |  2 ++
 drivers/scsi/pm8001/pm80xx_hwi.c | 36 ++++++++++++++++++++++++++++++++++++
 drivers/scsi/pm8001/pm80xx_hwi.h | 13 +++++++++++++
 5 files changed, 61 insertions(+)

diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
index c8d4d87c5473..f147193d67bd 100644
--- a/drivers/scsi/pm8001/pm8001_hwi.c
+++ b/drivers/scsi/pm8001/pm8001_hwi.c
@@ -4998,4 +4998,5 @@ const struct pm8001_dispatch pm8001_8001_dispatch = {
 	.fw_flash_update_req	= pm8001_chip_fw_flash_update_req,
 	.set_dev_state_req	= pm8001_chip_set_dev_state_req,
 	.sas_re_init_req	= pm8001_chip_sas_re_initialization,
+	.fatal_errors		= pm80xx_fatal_errors,
 };
diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c
index d1e9dba2ef19..f8d142f9b9ad 100644
--- a/drivers/scsi/pm8001/pm8001_sas.c
+++ b/drivers/scsi/pm8001/pm8001_sas.c
@@ -1183,12 +1183,21 @@ int pm8001_abort_task(struct sas_task *task)
 	int rc = TMF_RESP_FUNC_FAILED, ret;
 	u32 phy_id;
 	struct sas_task_slow slow_task;
+
 	if (unlikely(!task || !task->lldd_task || !task->dev))
 		return TMF_RESP_FUNC_FAILED;
+
 	dev = task->dev;
 	pm8001_dev = dev->lldd_dev;
 	pm8001_ha = pm8001_find_ha_by_dev(dev);
 	phy_id = pm8001_dev->attached_phy;
+
+	if (PM8001_CHIP_DISP->fatal_errors(pm8001_ha)) {
+		// If the controller is seeing fatal errors
+		// abort task will not get a response from the controller
+		return TMF_RESP_FUNC_FAILED;
+	}
+
 	ret = pm8001_find_tag(task, &tag);
 	if (ret == 0) {
 		pm8001_info(pm8001_ha, "no tag for task:%p\n", task);
diff --git a/drivers/scsi/pm8001/pm8001_sas.h b/drivers/scsi/pm8001/pm8001_sas.h
index f2c8cbad3853..039ed91e9841 100644
--- a/drivers/scsi/pm8001/pm8001_sas.h
+++ b/drivers/scsi/pm8001/pm8001_sas.h
@@ -215,6 +215,7 @@ struct pm8001_dispatch {
 	int (*sas_diag_execute_req)(struct pm8001_hba_info *pm8001_ha,
 		u32 state);
 	int (*sas_re_init_req)(struct pm8001_hba_info *pm8001_ha);
+	int (*fatal_errors)(struct pm8001_hba_info *pm8001_ha);
 };
 
 struct pm8001_chip_info {
@@ -725,6 +726,7 @@ ssize_t pm80xx_get_fatal_dump(struct device *cdev,
 ssize_t pm80xx_get_non_fatal_dump(struct device *cdev,
 		struct device_attribute *attr, char *buf);
 ssize_t pm8001_get_gsm_dump(struct device *cdev, u32, char *buf);
+int pm80xx_fatal_errors(struct pm8001_hba_info *pm8001_ha);
 /* ctl shared API */
 extern struct device_attribute *pm8001_host_attrs[];
 
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index 9c4b8b374ab8..86a3d483749c 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -1525,6 +1525,41 @@ static int mpi_uninit_check(struct pm8001_hba_info *pm8001_ha)
 	return 0;
 }
 
+/**
+ * pm80xx_fatal_errors - returns non zero *ONLY* when fatal errors
+ * @pm8001_ha: our hba card information
+ *
+ * Fatal errors are recoverable only after a host reboot.
+ */
+int
+pm80xx_fatal_errors(struct pm8001_hba_info *pm8001_ha)
+{
+	int ret = 0;
+	u32 scratch_pad_rsvd0 = pm8001_cr32(pm8001_ha, 0,
+					MSGU_HOST_SCRATCH_PAD_6);
+	u32 scratch_pad_rsvd1 = pm8001_cr32(pm8001_ha, 0,
+					MSGU_HOST_SCRATCH_PAD_7);
+	u32 scratch_pad1 = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_1);
+	u32 scratch_pad2 = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_2);
+	u32 scratch_pad3 = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_3);
+
+	if (pm8001_ha->chip_id != chip_8006 &&
+			pm8001_ha->chip_id != chip_8074 &&
+			pm8001_ha->chip_id != chip_8076) {
+		return 0;
+	}
+
+	if (MSGU_SCRATCHPAD1_STATE_FATAL_ERROR(scratch_pad1)) {
+		pm8001_dbg(pm8001_ha, FAIL,
+			"Fatal error SCRATCHPAD1 = 0x%x SCRATCHPAD2 = 0x%x SCRATCHPAD3 = 0x%x SCRATCHPAD_RSVD0 = 0x%x SCRATCHPAD_RSVD1 = 0x%x\n",
+				scratch_pad1, scratch_pad2, scratch_pad3,
+				scratch_pad_rsvd0, scratch_pad_rsvd1);
+		ret = 1;
+	}
+
+	return ret;
+}
+
 /**
  * pm8001_chip_soft_rst - soft reset the PM8001 chip, so that the clear all
  * the FW register status to the originated status.
@@ -4959,4 +4994,5 @@ const struct pm8001_dispatch pm8001_80xx_dispatch = {
 	.set_nvmd_req		= pm8001_chip_set_nvmd_req,
 	.fw_flash_update_req	= pm8001_chip_fw_flash_update_req,
 	.set_dev_state_req	= pm8001_chip_set_dev_state_req,
+	.fatal_errors		= pm80xx_fatal_errors,
 };
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.h b/drivers/scsi/pm8001/pm80xx_hwi.h
index 2b6b52551968..2c8e85cfdbc4 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.h
+++ b/drivers/scsi/pm8001/pm80xx_hwi.h
@@ -1368,6 +1368,19 @@ typedef struct SASProtocolTimerConfig SASProtocolTimerConfig_t;
 #define MSGU_HOST_SCRATCH_PAD_6			0x6C
 #define MSGU_HOST_SCRATCH_PAD_7			0x70
 
+#define MSGU_SCRATCHPAD1_RAAE_STATE_ERR(x) ((x & 0x3) == 0x2)
+#define MSGU_SCRATCHPAD1_ILA_STATE_ERR(x) (((x >> 2) & 0x3) == 0x2)
+#define MSGU_SCRATCHPAD1_BOOTLDR_STATE_ERR(x) ((((x >> 4) & 0x7) == 0x7) || \
+						(((x >> 4) & 0x7) == 0x4))
+#define MSGU_SCRATCHPAD1_IOP0_STATE_ERR(x) (((x >> 10) & 0x3) == 0x2)
+#define MSGU_SCRATCHPAD1_IOP1_STATE_ERR(x) (((x >> 12) & 0x3) == 0x2)
+#define MSGU_SCRATCHPAD1_STATE_FATAL_ERROR(x)  \
+			(MSGU_SCRATCHPAD1_RAAE_STATE_ERR(x) ||      \
+			 MSGU_SCRATCHPAD1_ILA_STATE_ERR(x) ||       \
+			 MSGU_SCRATCHPAD1_BOOTLDR_STATE_ERR(x) ||   \
+			 MSGU_SCRATCHPAD1_IOP0_STATE_ERR(x) ||      \
+			 MSGU_SCRATCHPAD1_IOP1_STATE_ERR(x))
+
 /* bit definition for ODMR register */
 #define ODMR_MASK_ALL			0xFFFFFFFF/* mask all
 					interrupt vector */
-- 
2.16.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/8] pm80xx: check main config table address
  2020-12-30  4:57 [PATCH 0/8] pm80xx updates Viswas G
  2020-12-30  4:57 ` [PATCH 1/8] pm80xx: No busywait in MPI init check Viswas G
  2020-12-30  4:57 ` [PATCH 2/8] pm80xx: check fatal error Viswas G
@ 2020-12-30  4:57 ` Viswas G
  2021-01-05 13:23   ` Jinpu Wang
  2020-12-30  4:57 ` [PATCH 4/8] pm80xx: fix missing tag_free in NVMD DATA req Viswas G
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Viswas G @ 2020-12-30  4:57 UTC (permalink / raw)
  To: linux-scsi
  Cc: Vasanthalakshmi.Tharmarajan, Viswas.G, Ruksar.devadi, yuuzheng,
	vishakhavc, radha, akshatzen, bjashnani, jinpu.wang

From: akshatzen <akshatzen@google.com>

pm8001 driver initializes main configuration, general status, inbound
queue and outbound queue table address based on value read from
MSGU_SCRATCH_PAD_0 register.

We should validate these addresses before dereferencing them.

This change adds two validations:
1. Check if main configuration table offset lies within the pcibar
mapped
2. Check if first dword of main configuration table reads "PMCS"

There are two calls to init_pci_device_addresses() done during
pm8001_pci_probe() in this sequence:
1. First inside chip_soft_rst, where if init_pci_device_addresses fails
we will go ahead assuming MPI state is not ready and reset the device as
long as bootloader is okay.  This gives chance to second call of
init_pci_device_addresses to set up the addresses after reset.

2. The second call is via pm80xx_chip_init, after soft reset is done and
firmware is checked to be ready. Once that is done we are safe to go
ahead and initialize default table values and use them.

Tested:

1. Enabled debugging logs and observed no issues during initialization,
with a controller with No issues.

pm80xx0:: pm8001_setup_msix 1034: pci_alloc_irq_vectors request ret:64
no of intr 64
pm80xx0:: init_pci_device_addresses 917: Scratchpad 0 Offset: 0x2000
value 0x40002000
pm80xx0:: init_pci_device_addresses 925: Scratchpad 0 PCI BAR: 0
pm80xx0:: init_pci_device_addresses 952: VALID main config signature
0x53434d50
pm80xx0:: init_pci_device_addresses 975: GST OFFSET 0xc4
pm80xx0:: init_pci_device_addresses 978: INBND OFFSET 0x20000128
pm80xx0:: init_pci_device_addresses 981: OBND OFFSET 0x24000928
pm80xx0:: init_pci_device_addresses 984: IVT OFFSET 0x8001408
pm80xx0:: init_pci_device_addresses 987: PSPA OFFSET 0x8001608
pm80xx0:: init_pci_device_addresses 991: addr - main cfg (ptrval)
general status (ptrval)
pm80xx0:: init_pci_device_addresses 995: addr - inbnd (ptrval) obnd
(ptrval)
pm80xx0:: init_pci_device_addresses 999: addr - pspa (ptrval) ivt
(ptrval)
pm80xx0:: pm80xx_chip_soft_rst 1446: reset register before write : 0x0
pm80xx0:: pm80xx_chip_soft_rst 1478: reset register after write 0x40
pm80xx0:: pm80xx_chip_soft_rst 1544: SPCv soft reset Complete
pm80xx0:: init_pci_device_addresses 917: Scratchpad 0 Offset: 0x2000
value 0x40002000
pm80xx0:: init_pci_device_addresses 925: Scratchpad 0 PCI BAR: 0
pm80xx0:: init_pci_device_addresses 952: VALID main config signature
0x53434d50
pm80xx0:: init_pci_device_addresses 975: GST OFFSET 0xc4
pm80xx0:: init_pci_device_addresses 978: INBND OFFSET 0x20000128
pm80xx0:: init_pci_device_addresses 981: OBND OFFSET 0x24000928
pm80xx0:: init_pci_device_addresses 984: IVT OFFSET 0x8001408
pm80xx0:: init_pci_device_addresses 987: PSPA OFFSET 0x8001608
pm80xx0:: init_pci_device_addresses 991: addr - main cfg (ptrval)
general status (ptrval)
pm80xx0:: init_pci_device_addresses 995: addr - inbnd (ptrval) obnd
(ptrval)
pm80xx0:: init_pci_device_addresses 999: addr - pspa (ptrval) ivt
(ptrval)
pm80xx0:: pm80xx_chip_init 1329: MPI initialize successful!

2. Tested controller with firmware known to have initialization issue
and observed no crashes with this fix

pm80xx 0000:01:00.0: pm80xx: driver version 0.1.38
pm80xx 0000:01:00.0: Removing from 1:1 domain
pm80xx 0000:01:00.0: Requesting non-1:1 mappings
pm80xx0:: init_pci_device_addresses 948: BAD main config signature 0x0
pm80xx0:: mpi_uninit_check 1365: Failed to init pci addresses
pm80xx0:: pm80xx_chip_soft_rst 1435: MPI state is not ready
scratch:0:8:62a01000:0
pm80xx0:: pm80xx_chip_soft_rst 1518: Firmware is not ready!
pm80xx0:: pm80xx_chip_soft_rst 1532: iButton Feature is not Available!!!
pm80xx0:: pm80xx_chip_init 1301: Firmware is not ready!
pm80xx0:: pm8001_pci_probe 1215: chip_init failed [ret: -16]
pm80xx: probe of 0000:01:00.0 failed with error -16
pm80xx 0000:07:00.0: pm80xx: driver version 0.1.38
pm80xx 0000:07:00.0: Removing from 1:1 domain
pm80xx 0000:07:00.0: Requesting non-1:1 mappings
scsi host6: pm80xx
pm80xx1:: pm8001_setup_sgpio 5568: failed sgpio_req timeout
pm80xx1:: mpi_phy_start_resp 3447: phy start resp status:0x0, phyid:0x0
pm80xx 0000:08:00.0: pm80xx: driver version 0.1.38
pm80xx 0000:08:00.0: Removing from 1:1 domain
pm80xx 0000:08:00.0: Requesting non-1:1 mappings

3. Without this fix we observe crash on the same controller.

pm80xx 0000:01:00.0: pm80xx: driver version 0.1.38
pm80xx 0000:01:00.0: Removing from 1:1 domain
pm80xx 0000:01:00.0: Requesting non-1:1 mappings
[<ffffffffc0451b3b>] pm80xx_chip_soft_rst+0x6b/0x4c0 [pm80xx]
[<ffffffffc043a933>] pm8001_pci_probe+0xa43/0x1630 [pm80xx]
RIP: 0010:pm80xx_chip_soft_rst+0x71/0x4c0 [pm80xx]
[<ffffffffc0451b3b>] ? pm80xx_chip_soft_rst+0x6b/0x4c0 [pm80xx]
[<ffffffffc043a933>] pm8001_pci_probe+0xa43/0x1630 [pm80xx]
pm80xx0:: mpi_uninit_check 1339: TIMEOUT:IBDB value/=2
pm80xx0:: pm80xx_chip_soft_rst 1387: MPI state is not ready
scratch:0:8:62a01000:0
pm80xx0:: pm80xx_chip_soft_rst 1470: Firmware is not ready!
pm80xx0:: pm80xx_chip_soft_rst 1484: iButton Feature is not Available!!!
pm80xx0:: pm80xx_chip_init 1266: Firmware is not ready!
pm80xx0:: pm8001_pci_probe 1207: chip_init failed [ret: -16]
pm80xx: probe of 0000:01:00.0 failed with error -16

Signed-off-by: akshatzen <akshatzen@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
---
 drivers/scsi/pm8001/pm8001_init.c | 11 +++++---
 drivers/scsi/pm8001/pm80xx_hwi.c  | 53 ++++++++++++++++++++++++++++++++++++---
 2 files changed, 58 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_init.c b/drivers/scsi/pm8001/pm8001_init.c
index ee2de177d0d0..d21078ca7fb3 100644
--- a/drivers/scsi/pm8001/pm8001_init.c
+++ b/drivers/scsi/pm8001/pm8001_init.c
@@ -466,9 +466,12 @@ static int pm8001_ioremap(struct pm8001_hba_info *pm8001_ha)
 			pm8001_ha->io_mem[logicalBar].memvirtaddr =
 				ioremap(pm8001_ha->io_mem[logicalBar].membase,
 				pm8001_ha->io_mem[logicalBar].memsize);
-			pm8001_dbg(pm8001_ha, INIT,
-				   "PCI: bar %d, logicalBar %d\n",
+			if (!pm8001_ha->io_mem[logicalBar].memvirtaddr) {
+				pm8001_dbg(pm8001_ha, INIT,
+					"Failed to ioremap bar %d, logicalBar %d",
 				   bar, logicalBar);
+				return -ENOMEM;
+			}
 			pm8001_dbg(pm8001_ha, INIT,
 				   "base addr %llx virt_addr=%llx len=%d\n",
 				   (u64)pm8001_ha->io_mem[logicalBar].membase,
@@ -540,9 +543,11 @@ static struct pm8001_hba_info *pm8001_pci_alloc(struct pci_dev *pdev,
 			tasklet_init(&pm8001_ha->tasklet[j], pm8001_tasklet,
 				(unsigned long)&(pm8001_ha->irq_vector[j]));
 #endif
-	pm8001_ioremap(pm8001_ha);
+	if (pm8001_ioremap(pm8001_ha))
+		goto failed_pci_alloc;
 	if (!pm8001_alloc(pm8001_ha, ent))
 		return pm8001_ha;
+failed_pci_alloc:
 	pm8001_free(pm8001_ha);
 	return NULL;
 }
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index 86a3d483749c..7d0eada11d3c 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -1115,7 +1115,7 @@ static int check_fw_ready(struct pm8001_hba_info *pm8001_ha)
 	return ret;
 }
 
-static void init_pci_device_addresses(struct pm8001_hba_info *pm8001_ha)
+static int init_pci_device_addresses(struct pm8001_hba_info *pm8001_ha)
 {
 	void __iomem *base_addr;
 	u32	value;
@@ -1124,15 +1124,48 @@ static void init_pci_device_addresses(struct pm8001_hba_info *pm8001_ha)
 	u32	pcilogic;
 
 	value = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_0);
+
+	/**
+	 * lower 26 bits of SCRATCHPAD0 register describes offset within the
+	 * PCIe BAR where the MPI configuration table is present
+	 */
 	offset = value & 0x03FFFFFF; /* scratch pad 0 TBL address */
 
 	pm8001_dbg(pm8001_ha, DEV, "Scratchpad 0 Offset: 0x%x value 0x%x\n",
 		   offset, value);
+	/**
+	 * Upper 6 bits describe the offset within PCI config space where BAR
+	 * is located.
+	 */
 	pcilogic = (value & 0xFC000000) >> 26;
 	pcibar = get_pci_bar_index(pcilogic);
 	pm8001_dbg(pm8001_ha, INIT, "Scratchpad 0 PCI BAR: %d\n", pcibar);
+
+	/**
+	 * Make sure the offset falls inside the ioremapped PCI BAR
+	 */
+	if (offset > pm8001_ha->io_mem[pcibar].memsize) {
+		pm8001_dbg(pm8001_ha, FAIL,
+			"Main cfg tbl offset outside %u > %u\n",
+				offset, pm8001_ha->io_mem[pcibar].memsize);
+		return -EBUSY;
+	}
 	pm8001_ha->main_cfg_tbl_addr = base_addr =
 		pm8001_ha->io_mem[pcibar].memvirtaddr + offset;
+
+	/**
+	 * Validate main configuration table address: first DWord should read
+	 * "PMCS"
+	 */
+	value = pm8001_mr32(pm8001_ha->main_cfg_tbl_addr, 0);
+	if (memcmp(&value, "PMCS", 4) != 0) {
+		pm8001_dbg(pm8001_ha, FAIL,
+			"BAD main config signature 0x%x\n",
+				value);
+		return -EBUSY;
+	}
+	pm8001_dbg(pm8001_ha, INIT,
+			"VALID main config signature 0x%x\n", value);
 	pm8001_ha->general_stat_tbl_addr =
 		base_addr + (pm8001_cr32(pm8001_ha, pcibar, offset + 0x18) &
 					0xFFFFFF);
@@ -1171,6 +1204,7 @@ static void init_pci_device_addresses(struct pm8001_hba_info *pm8001_ha)
 	pm8001_dbg(pm8001_ha, INIT, "addr - pspa %p ivt %p\n",
 		   pm8001_ha->pspa_q_tbl_addr,
 		   pm8001_ha->ivt_tbl_addr);
+	return 0;
 }
 
 /**
@@ -1438,7 +1472,12 @@ static int pm80xx_chip_init(struct pm8001_hba_info *pm8001_ha)
 	pm8001_ha->controller_fatal_error = false;
 
 	/* Initialize pci space address eg: mpi offset */
-	init_pci_device_addresses(pm8001_ha);
+	ret = init_pci_device_addresses(pm8001_ha);
+	if (ret) {
+		pm8001_dbg(pm8001_ha, FAIL,
+			"Failed to init pci addresses");
+		return ret;
+	}
 	init_default_table_values(pm8001_ha);
 	read_main_config_table(pm8001_ha);
 	read_general_status_table(pm8001_ha);
@@ -1482,7 +1521,15 @@ static int mpi_uninit_check(struct pm8001_hba_info *pm8001_ha)
 	u32 max_wait_count;
 	u32 value;
 	u32 gst_len_mpistate;
-	init_pci_device_addresses(pm8001_ha);
+	int ret;
+
+	ret = init_pci_device_addresses(pm8001_ha);
+	if (ret) {
+		pm8001_dbg(pm8001_ha, FAIL,
+			"Failed to init pci addresses");
+		return ret;
+	}
+
 	/* Write bit1=1 to Inbound DoorBell Register to tell the SPC FW the
 	table is stop */
 	pm8001_cw32(pm8001_ha, 0, MSGU_IBDB_SET, SPCv_MSGU_CFG_TABLE_RESET);
-- 
2.16.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/8] pm80xx: fix missing tag_free in NVMD DATA req
  2020-12-30  4:57 [PATCH 0/8] pm80xx updates Viswas G
                   ` (2 preceding siblings ...)
  2020-12-30  4:57 ` [PATCH 3/8] pm80xx: check main config table address Viswas G
@ 2020-12-30  4:57 ` Viswas G
  2021-01-05 13:24   ` Jinpu Wang
  2020-12-30  4:57 ` [PATCH 5/8] pm80xx: fix driver fatal dump failure Viswas G
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Viswas G @ 2020-12-30  4:57 UTC (permalink / raw)
  To: linux-scsi
  Cc: Vasanthalakshmi.Tharmarajan, Viswas.G, Ruksar.devadi, yuuzheng,
	vishakhavc, radha, akshatzen, bjashnani, jinpu.wang

From: akshatzen <akshatzen@google.com>

Tag is not free'd in NVMD get/set data request failure scenario,
which would have caused tag leak each time the request fails.

Signed-off-by: akshatzen <akshatzen@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
---
 drivers/scsi/pm8001/pm8001_hwi.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
index f147193d67bd..9cd6a654f8b2 100644
--- a/drivers/scsi/pm8001/pm8001_hwi.c
+++ b/drivers/scsi/pm8001/pm8001_hwi.c
@@ -3038,8 +3038,8 @@ void pm8001_mpi_set_nvmd_resp(struct pm8001_hba_info *pm8001_ha, void *piomb)
 	complete(pm8001_ha->nvmd_completion);
 	pm8001_dbg(pm8001_ha, MSG, "Set nvm data complete!\n");
 	if ((dlen_status & NVMD_STAT) != 0) {
-		pm8001_dbg(pm8001_ha, FAIL, "Set nvm data error!\n");
-		return;
+		pm8001_dbg(pm8001_ha, FAIL, "Set nvm data error %x\n",
+				dlen_status);
 	}
 	ccb->task = NULL;
 	ccb->ccb_tag = 0xFFFFFFFF;
@@ -3062,11 +3062,17 @@ pm8001_mpi_get_nvmd_resp(struct pm8001_hba_info *pm8001_ha, void *piomb)
 
 	pm8001_dbg(pm8001_ha, MSG, "Get nvm data complete!\n");
 	if ((dlen_status & NVMD_STAT) != 0) {
-		pm8001_dbg(pm8001_ha, FAIL, "Get nvm data error!\n");
+		pm8001_dbg(pm8001_ha, FAIL, "Get nvm data error %x\n",
+				dlen_status);
 		complete(pm8001_ha->nvmd_completion);
+		/* We should free tag during failure also, the tag is not being
+		 * free'd by requesting path anywhere.
+		 */
+		ccb->task = NULL;
+		ccb->ccb_tag = 0xFFFFFFFF;
+		pm8001_tag_free(pm8001_ha, tag);
 		return;
 	}
-
 	if (ir_tds_bn_dps_das_nvm & IPMode) {
 		/* indirect mode - IR bit set */
 		pm8001_dbg(pm8001_ha, MSG, "Get NVMD success, IR=1\n");
-- 
2.16.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/8] pm80xx: fix driver fatal dump failure.
  2020-12-30  4:57 [PATCH 0/8] pm80xx updates Viswas G
                   ` (3 preceding siblings ...)
  2020-12-30  4:57 ` [PATCH 4/8] pm80xx: fix missing tag_free in NVMD DATA req Viswas G
@ 2020-12-30  4:57 ` Viswas G
  2021-01-05 13:25   ` Jinpu Wang
  2020-12-30  4:57 ` [PATCH 6/8] pm80xx: Simultaneous poll for all FW readiness Viswas G
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Viswas G @ 2020-12-30  4:57 UTC (permalink / raw)
  To: linux-scsi
  Cc: Vasanthalakshmi.Tharmarajan, Viswas.G, Ruksar.devadi, yuuzheng,
	vishakhavc, radha, akshatzen, bjashnani, jinpu.wang

From: Viswas G <Viswas.G@microchip.com>

The fatal dump function pm80xx_get_fatal_dump() has two issues that
result in the fatal dump not being completed successfully.

1. When trying collect fatal_logs from the application it is getting
failed, because we are not shifting MEMBASE-II register properly.
Once we read 64K region of data we have to shift the MEMBASE-II register
and read the next chunk of data, then only we would be able to get
complete data.

2. If timeout occurs our application will get stuck because we are not
handling this case. In this patch it resolves all these issues.

Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
---
 drivers/scsi/pm8001/pm80xx_hwi.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index 7d0eada11d3c..407c0cf6ab5f 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -349,10 +349,15 @@ ssize_t pm80xx_get_fatal_dump(struct device *cdev,
 				sprintf(
 				pm8001_ha->forensic_info.data_buf.direct_data,
 				"%08x ", 0xFFFFFFFF);
-				pm8001_cw32(pm8001_ha, 0,
+				return((char *)pm8001_ha->forensic_info.data_buf.direct_data -
+						(char *)buf);
+			}
+	/* reset fatal_forensic_shift_offset back to zero and reset MEMBASE 2 register to zero */
+			pm8001_ha->fatal_forensic_shift_offset = 0; /* location in 64k region */
+			pm8001_cw32(pm8001_ha, 0,
 					MEMBASE_II_SHIFT_REGISTER,
 					pm8001_ha->fatal_forensic_shift_offset);
-			}
+		}
 			/* Read the next block of the debug data.*/
 			length_to_read = pm8001_mr32(fatal_table_address,
 			MPI_FATAL_EDUMP_TABLE_ACCUM_LEN) -
@@ -373,13 +378,12 @@ ssize_t pm80xx_get_fatal_dump(struct device *cdev,
 								= 0;
 				pm8001_ha->forensic_info.data_buf.read_len = 0;
 			}
-		}
 	}
 	offset = (int)((char *)pm8001_ha->forensic_info.data_buf.direct_data
 			- (char *)buf);
 	pm8001_dbg(pm8001_ha, IO, "get_fatal_spcv: return4 0x%x\n", offset);
-	return (char *)pm8001_ha->forensic_info.data_buf.direct_data -
-		(char *)buf;
+	return ((char *)pm8001_ha->forensic_info.data_buf.direct_data -
+		(char *)buf);
 }
 
 /* pm80xx_get_non_fatal_dump - dump the nonfatal data from the dma
-- 
2.16.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 6/8] pm80xx: Simultaneous poll for all FW readiness.
  2020-12-30  4:57 [PATCH 0/8] pm80xx updates Viswas G
                   ` (4 preceding siblings ...)
  2020-12-30  4:57 ` [PATCH 5/8] pm80xx: fix driver fatal dump failure Viswas G
@ 2020-12-30  4:57 ` Viswas G
  2021-01-05 13:31   ` Jinpu Wang
  2020-12-30  4:57 ` [PATCH 7/8] pm80xx: Log SATA IOMB completion status on failure Viswas G
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Viswas G @ 2020-12-30  4:57 UTC (permalink / raw)
  To: linux-scsi
  Cc: Vasanthalakshmi.Tharmarajan, Viswas.G, Ruksar.devadi, yuuzheng,
	vishakhavc, radha, akshatzen, bjashnani, jinpu.wang

From: Bhavesh Jashnani <bjashnani@google.com>

In check_fw_ready() we first wait for ILA to come up and then we
wait for RAAE to come up and IOPs and so on. This is a sequential check.
Because of this ILA image seems to be not ready in the allocated time
and so the driver marks it as "not ready" and then move on to other FW
images. But ILA does become ready eventually, but is not checked again.
In this way driver concludes that FW is not ready, when it actually is.

Fix: Instead of sequentially polling each image, we keep polling for all
images to be ready. The timeout for the polling has been set to the sum
of what was used for each individual image.

Signed-off-by: Bhavesh Jashnani <bjashnani@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
---
 drivers/scsi/pm8001/pm80xx_hwi.c | 80 ++++++++++++----------------------------
 1 file changed, 23 insertions(+), 57 deletions(-)

diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index 407c0cf6ab5f..df679e36954a 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -1043,6 +1043,7 @@ static int check_fw_ready(struct pm8001_hba_info *pm8001_ha)
 	u32 value;
 	u32 max_wait_count;
 	u32 max_wait_time;
+	u32 expected_mask;
 	int ret = 0;
 
 	/* reset / PCIe ready */
@@ -1052,70 +1053,35 @@ static int check_fw_ready(struct pm8001_hba_info *pm8001_ha)
 		value = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_1);
 	} while ((value == 0xFFFFFFFF) && (--max_wait_count));
 
-	/* check ila status */
-	max_wait_time = max_wait_count = 50;	/* 1000 milli sec */
-	do {
-		msleep(FW_READY_INTERVAL);
-		value = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_1);
-	} while (((value & SCRATCH_PAD_ILA_READY) !=
-			SCRATCH_PAD_ILA_READY) && (--max_wait_count));
-	if (!max_wait_count)
-		ret = -1;
-	else {
-		pm8001_dbg(pm8001_ha, MSG,
-			   " ila ready status in %d millisec\n",
-			   (max_wait_time - max_wait_count));
-	}
-
-	/* check RAAE status */
-	max_wait_time = max_wait_count = 90;	/* 1800 milli sec */
-	do {
-		msleep(FW_READY_INTERVAL);
-		value = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_1);
-	} while (((value & SCRATCH_PAD_RAAE_READY) !=
-				SCRATCH_PAD_RAAE_READY) && (--max_wait_count));
-	if (!max_wait_count)
-		ret = -1;
-	else {
-		pm8001_dbg(pm8001_ha, MSG,
-			   " raae ready status in %d millisec\n",
-			   (max_wait_time - max_wait_count));
+	/* check ila, RAAE and iops status */
+	if ((pm8001_ha->chip_id != chip_8008) &&
+			(pm8001_ha->chip_id != chip_8009)) {
+		max_wait_time = max_wait_count = 180;   /* 3600 milli sec */
+		expected_mask = SCRATCH_PAD_ILA_READY |
+			SCRATCH_PAD_RAAE_READY |
+			SCRATCH_PAD_IOP0_READY |
+			SCRATCH_PAD_IOP1_READY;
+	} else {
+		max_wait_time = max_wait_count = 170;   /* 3400 milli sec */
+		expected_mask = SCRATCH_PAD_ILA_READY |
+			SCRATCH_PAD_RAAE_READY |
+			SCRATCH_PAD_IOP0_READY;
 	}
-
-	/* check iop0 status */
-	max_wait_time = max_wait_count = 30;	/* 600 milli sec */
 	do {
 		msleep(FW_READY_INTERVAL);
 		value = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_1);
-	} while (((value & SCRATCH_PAD_IOP0_READY) != SCRATCH_PAD_IOP0_READY) &&
-			(--max_wait_count));
-	if (!max_wait_count)
+	} while (((value & expected_mask) !=
+				 expected_mask) && (--max_wait_count));
+	if (!max_wait_count) {
+		pm8001_dbg(pm8001_ha, INIT,
+		"At least one FW component failed to load within %d millisec: Scratchpad1: 0x%x\n",
+			max_wait_time * FW_READY_INTERVAL, value);
 		ret = -1;
-	else {
+	} else {
 		pm8001_dbg(pm8001_ha, MSG,
-			   " iop0 ready status in %d millisec\n",
-			   (max_wait_time - max_wait_count));
+			"All FW components ready by %d ms\n",
+			(max_wait_time - max_wait_count) * FW_READY_INTERVAL);
 	}
-
-	/* check iop1 status only for 16 port controllers */
-	if ((pm8001_ha->chip_id != chip_8008) &&
-			(pm8001_ha->chip_id != chip_8009)) {
-		/* 200 milli sec */
-		max_wait_time = max_wait_count = 10;
-		do {
-			msleep(FW_READY_INTERVAL);
-			value = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_1);
-		} while (((value & SCRATCH_PAD_IOP1_READY) !=
-				SCRATCH_PAD_IOP1_READY) && (--max_wait_count));
-		if (!max_wait_count)
-			ret = -1;
-		else {
-			pm8001_dbg(pm8001_ha, MSG,
-				   "iop1 ready status in %d millisec\n",
-				   (max_wait_time - max_wait_count));
-		}
-	}
-
 	return ret;
 }
 
-- 
2.16.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 7/8] pm80xx: Log SATA IOMB completion status on failure.
  2020-12-30  4:57 [PATCH 0/8] pm80xx updates Viswas G
                   ` (5 preceding siblings ...)
  2020-12-30  4:57 ` [PATCH 6/8] pm80xx: Simultaneous poll for all FW readiness Viswas G
@ 2020-12-30  4:57 ` Viswas G
  2021-01-05 13:33   ` Jinpu Wang
  2020-12-30  4:57 ` [PATCH 8/8] pm80xx: Add sysfs attribute for ioc health Viswas G
  2021-01-08  3:21 ` [PATCH 0/8] pm80xx updates Martin K. Petersen
  8 siblings, 1 reply; 18+ messages in thread
From: Viswas G @ 2020-12-30  4:57 UTC (permalink / raw)
  To: linux-scsi
  Cc: Vasanthalakshmi.Tharmarajan, Viswas.G, Ruksar.devadi, yuuzheng,
	vishakhavc, radha, akshatzen, bjashnani, jinpu.wang

From: Vishakha Channapattan <vishakhavc@google.com>

Added a log message in sata completion path to log the status of failed
command. If the status does not match any expected status, another
message will be logged.

On IO failure with known status, log message will be

[ 1712.951735] pm80xx0:: mpi_sata_completion 2269: IO failed device_id
16385 status 0x1 tag XX

If the firmware returns unexpected status, log message of the following
format will be logged -

[ 1712.951735] pm80xx0:: mpi_sata_completion XXXX: Unknown status
device_id XXXXX status 0xX tag XX

Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
---
 drivers/scsi/pm8001/pm80xx_hwi.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index df679e36954a..e7fef42b4f6c 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -2437,10 +2437,11 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha, void *piomb)
 		return;
 	}
 
-	if (unlikely(status))
-		pm8001_dbg(pm8001_ha, IOERR,
-			   "status:0x%x, tag:0x%x, task::0x%p\n",
-			   status, tag, t);
+	if (status != IO_SUCCESS) {
+		pm8001_dbg(pm8001_ha, FAIL,
+			"IO failed device_id %u status 0x%x tag %d\n",
+			pm8001_dev->device_id, status, tag);
+	}
 
 	/* Print sas address of IO failed device */
 	if ((status != IO_SUCCESS) && (status != IO_OVERFLOW) &&
@@ -2762,7 +2763,9 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha, void *piomb)
 			atomic_dec(&pm8001_dev->running_req);
 		break;
 	default:
-		pm8001_dbg(pm8001_ha, DEVIO, "Unknown status 0x%x\n", status);
+		pm8001_dbg(pm8001_ha, DEVIO,
+				"Unknown status device_id %u status 0x%x tag %d\n",
+			pm8001_dev->device_id, status, tag);
 		/* not allowed case. Therefore, return failed status */
 		ts->resp = SAS_TASK_COMPLETE;
 		ts->stat = SAS_DEV_NO_RESPONSE;
-- 
2.16.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 8/8] pm80xx: Add sysfs attribute for ioc health
  2020-12-30  4:57 [PATCH 0/8] pm80xx updates Viswas G
                   ` (6 preceding siblings ...)
  2020-12-30  4:57 ` [PATCH 7/8] pm80xx: Log SATA IOMB completion status on failure Viswas G
@ 2020-12-30  4:57 ` Viswas G
  2021-01-05 13:35   ` Jinpu Wang
  2021-01-08  3:21 ` [PATCH 0/8] pm80xx updates Martin K. Petersen
  8 siblings, 1 reply; 18+ messages in thread
From: Viswas G @ 2020-12-30  4:57 UTC (permalink / raw)
  To: linux-scsi
  Cc: Vasanthalakshmi.Tharmarajan, Viswas.G, Ruksar.devadi, yuuzheng,
	vishakhavc, radha, akshatzen, bjashnani, jinpu.wang

From: Vishakha Channapattan <vishakhavc@google.com>

A new sysfs variable 'health' is being introduced that tells if the
controller is alive by indicating controller ticks. If on subsequent
run we see the ticks changing that indicates that controller is not
dead.

Tested: Using 'health' sysfs variable we can see ticks incrementing
mvae14:~# cat  /sys/class/scsi_host/host*/health
MPI-S= MPI is successfully initialized   HMI_ERR=0
MSGUTCNT = 0x00000169 IOPTCNT=0x0000016a IOP1TCNT=0x0000016a
MPI-S= MPI is successfully initialized   HMI_ERR=0
MSGUTCNT = 0x0000014d IOPTCNT=0x0000014d IOP1TCNT=0x0000014d
MPI-S= MPI is successfully initialized   HMI_ERR=0
MSGUTCNT = 0x00000149 IOPTCNT=0x00000149 IOP1TCNT=0x00000149
mvae14:~#
mvae14:~#
mvae14:~#
mvae14:~# cat  /sys/class/scsi_host/host*/health
MPI-S= MPI is successfully initialized   HMI_ERR=0
MSGUTCNT = 0x0000016c IOPTCNT=0x0000016c IOP1TCNT=0x0000016c
MPI-S= MPI is successfully initialized   HMI_ERR=0
MSGUTCNT = 0x0000014f IOPTCNT=0x0000014f IOP1TCNT=0x0000014f
MPI-S= MPI is successfully initialized   HMI_ERR=0
MSGUTCNT = 0x0000014b IOPTCNT=0x0000014b IOP1TCNT=0x0000014b

Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
---
 drivers/scsi/pm8001/pm8001_ctl.c | 42 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/drivers/scsi/pm8001/pm8001_ctl.c b/drivers/scsi/pm8001/pm8001_ctl.c
index 12035baf0997..f46f341132fb 100644
--- a/drivers/scsi/pm8001/pm8001_ctl.c
+++ b/drivers/scsi/pm8001/pm8001_ctl.c
@@ -41,6 +41,7 @@
 #include <linux/slab.h>
 #include "pm8001_sas.h"
 #include "pm8001_ctl.h"
+#include "pm8001_chips.h"
 
 /* scsi host attributes */
 
@@ -886,6 +887,46 @@ static ssize_t pm8001_show_update_fw(struct device *cdev,
 
 static DEVICE_ATTR(update_fw, S_IRUGO|S_IWUSR|S_IWGRP,
 	pm8001_show_update_fw, pm8001_store_update_fw);
+
+/**
+ * pm8001_ctl_health_show - controller health check
+ * @cdev: pointer to embedded class device
+ * @buf: the buffer returned
+ *
+ * A sysfs 'read-only' shost attribute.
+ */
+
+char mpiStateText[][80] = {
+	"MPI is not initialized",
+	"MPI is successfully initialized",
+	"MPI termination is in progress",
+	"MPI initialization failed with error in [31:16]"
+};
+
+static ssize_t ctl_health_show(struct device *cdev,
+		struct device_attribute *attr, char *buf)
+{
+	struct Scsi_Host *shost = class_to_shost(cdev);
+	struct sas_ha_struct *sha = SHOST_TO_SAS_HA(shost);
+	struct pm8001_hba_info *pm8001_ha = sha->lldd_ha;
+	unsigned int mpiDW0 = 0;
+	unsigned int raaeCnt = 0;
+	unsigned int iop0Cnt = 0;
+	unsigned int iop1Cnt = 0;
+	int c;
+
+	pm8001_dbg(pm8001_ha, IOCTL, "%s\n", __func__);
+	mpiDW0 = pm8001_mr32(pm8001_ha->general_stat_tbl_addr, 0);
+	raaeCnt = pm8001_mr32(pm8001_ha->general_stat_tbl_addr, 12);
+	iop0Cnt = pm8001_mr32(pm8001_ha->general_stat_tbl_addr, 16);
+	iop1Cnt = pm8001_mr32(pm8001_ha->general_stat_tbl_addr, 20);
+	c = sprintf(buf, "MPI-S=%s\t HMI_ERR=%x\nMSGUTCNT=0x%08x IOPTCNT=0x%08x IOP1TCNT=0x%08x\n",
+			mpiStateText[mpiDW0 & 0x0003], ((mpiDW0 & 0xff00) >> 16),
+			raaeCnt, iop0Cnt, iop1Cnt);
+	return c;
+}
+static DEVICE_ATTR_RO(ctl_health);
+
 struct device_attribute *pm8001_host_attrs[] = {
 	&dev_attr_interface_rev,
 	&dev_attr_controller_fatal_error,
@@ -909,6 +950,7 @@ struct device_attribute *pm8001_host_attrs[] = {
 	&dev_attr_ob_log,
 	&dev_attr_ila_version,
 	&dev_attr_inc_fw_ver,
+	&dev_attr_ctl_health,
 	NULL,
 };
 
-- 
2.16.3


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/8] pm80xx: No busywait in MPI init check
  2020-12-30  4:57 ` [PATCH 1/8] pm80xx: No busywait in MPI init check Viswas G
@ 2021-01-04  6:45   ` Jinpu Wang
  0 siblings, 0 replies; 18+ messages in thread
From: Jinpu Wang @ 2021-01-04  6:45 UTC (permalink / raw)
  To: Viswas G
  Cc: Linux SCSI Mailinglist, Vasanthalakshmi.Tharmarajan, Viswas G,
	Ruksar.devadi, yuuzheng, vishakhavc, Radha Ramachandran,
	akshatzen, bjashnani

On Wed, Dec 30, 2020 at 5:47 AM Viswas G <Viswas.G@microchip.com.com> wrote:
>
> From: akshatzen <akshatzen@google.com>
>
> We do not need to busy wait during mpi_init_check. I confirmed that
> mpi_init_check is not being invoked in an ATOMIC context. It is being
> called from pm8001_pci_resume, pm8001_pci_probe. Hence we are
> replacing the udelay which busy waits with msleep.
>
> Signed-off-by: akshatzen <akshatzen@google.com>
> Signed-off-by: Viswas G <Viswas.G@microchip.com>
> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
> Signed-off-by: Radha Ramachandran <radha@google.com>
Thanks akshatzen!
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> ---
>  drivers/scsi/pm8001/pm80xx_hwi.c | 6 +++---
>  drivers/scsi/pm8001/pm80xx_hwi.h | 4 ++--
>  2 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
> index 6772b0924dac..9c4b8b374ab8 100644
> --- a/drivers/scsi/pm8001/pm80xx_hwi.c
> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
> @@ -997,7 +997,7 @@ static int mpi_init_check(struct pm8001_hba_info *pm8001_ha)
>                 max_wait_count = SPC_DOORBELL_CLEAR_TIMEOUT;
>         }
>         do {
> -               udelay(1);
> +               msleep(FW_READY_INTERVAL);
>                 value = pm8001_cr32(pm8001_ha, 0, MSGU_IBDB_SET);
>                 value &= SPCv_MSGU_CFG_TABLE_UPDATE;
>         } while ((value != 0) && (--max_wait_count));
> @@ -1010,9 +1010,9 @@ static int mpi_init_check(struct pm8001_hba_info *pm8001_ha)
>                 return -EBUSY;
>         }
>         /* check the MPI-State for initialization upto 100ms*/
> -       max_wait_count = 100 * 1000;/* 100 msec */
> +       max_wait_count = 5;/* 100 msec */
>         do {
> -               udelay(1);
> +               msleep(FW_READY_INTERVAL);
>                 gst_len_mpistate =
>                         pm8001_mr32(pm8001_ha->general_stat_tbl_addr,
>                                         GST_GSTLEN_MPIS_OFFSET);
> diff --git a/drivers/scsi/pm8001/pm80xx_hwi.h b/drivers/scsi/pm8001/pm80xx_hwi.h
> index ec48bc276de6..2b6b52551968 100644
> --- a/drivers/scsi/pm8001/pm80xx_hwi.h
> +++ b/drivers/scsi/pm8001/pm80xx_hwi.h
> @@ -220,8 +220,8 @@
>  #define SAS_DOPNRJT_RTRY_TMO            128
>  #define SAS_COPNRJT_RTRY_TMO            128
>
> -#define SPCV_DOORBELL_CLEAR_TIMEOUT    (30 * 1000 * 1000) /* 30 sec */
> -#define SPC_DOORBELL_CLEAR_TIMEOUT     (15 * 1000 * 1000) /* 15 sec */
> +#define SPCV_DOORBELL_CLEAR_TIMEOUT    (30 * 50) /* 30 sec */
> +#define SPC_DOORBELL_CLEAR_TIMEOUT     (15 * 50) /* 15 sec */
>
>  /*
>    Making ORR bigger than IT NEXUS LOSS which is 2000000us = 2 second.
> --
> 2.16.3
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/8] pm80xx: check fatal error
  2020-12-30  4:57 ` [PATCH 2/8] pm80xx: check fatal error Viswas G
@ 2021-01-05 13:19   ` Jinpu Wang
  0 siblings, 0 replies; 18+ messages in thread
From: Jinpu Wang @ 2021-01-05 13:19 UTC (permalink / raw)
  To: Viswas G
  Cc: Linux SCSI Mailinglist, Vasanthalakshmi.Tharmarajan, Viswas G,
	Ruksar.devadi, yuuzheng, vishakhavc, Radha Ramachandran,
	akshatzen, bjashnani

On Wed, Dec 30, 2020 at 5:47 AM Viswas G <Viswas.G@microchip.com.com> wrote:
>
> From: akshatzen <akshatzen@google.com>
>
> When controller runs into fatal error, commands which expect
> response get stuck due to no response. If the controller is
> in fatal error state, abort request issued to the controller
> gets hung too. Hence we should fail it without trying.
>
> Signed-off-by: akshatzen <akshatzen@google.com>
> Signed-off-by: Viswas G <Viswas.G@microchip.com>
> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
> Signed-off-by: Radha Ramachandran <radha@google.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Thx
> ---
>  drivers/scsi/pm8001/pm8001_hwi.c |  1 +
>  drivers/scsi/pm8001/pm8001_sas.c |  9 +++++++++
>  drivers/scsi/pm8001/pm8001_sas.h |  2 ++
>  drivers/scsi/pm8001/pm80xx_hwi.c | 36 ++++++++++++++++++++++++++++++++++++
>  drivers/scsi/pm8001/pm80xx_hwi.h | 13 +++++++++++++
>  5 files changed, 61 insertions(+)
>
> diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
> index c8d4d87c5473..f147193d67bd 100644
> --- a/drivers/scsi/pm8001/pm8001_hwi.c
> +++ b/drivers/scsi/pm8001/pm8001_hwi.c
> @@ -4998,4 +4998,5 @@ const struct pm8001_dispatch pm8001_8001_dispatch = {
>         .fw_flash_update_req    = pm8001_chip_fw_flash_update_req,
>         .set_dev_state_req      = pm8001_chip_set_dev_state_req,
>         .sas_re_init_req        = pm8001_chip_sas_re_initialization,
> +       .fatal_errors           = pm80xx_fatal_errors,
>  };
> diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c
> index d1e9dba2ef19..f8d142f9b9ad 100644
> --- a/drivers/scsi/pm8001/pm8001_sas.c
> +++ b/drivers/scsi/pm8001/pm8001_sas.c
> @@ -1183,12 +1183,21 @@ int pm8001_abort_task(struct sas_task *task)
>         int rc = TMF_RESP_FUNC_FAILED, ret;
>         u32 phy_id;
>         struct sas_task_slow slow_task;
> +
>         if (unlikely(!task || !task->lldd_task || !task->dev))
>                 return TMF_RESP_FUNC_FAILED;
> +
>         dev = task->dev;
>         pm8001_dev = dev->lldd_dev;
>         pm8001_ha = pm8001_find_ha_by_dev(dev);
>         phy_id = pm8001_dev->attached_phy;
> +
> +       if (PM8001_CHIP_DISP->fatal_errors(pm8001_ha)) {
> +               // If the controller is seeing fatal errors
> +               // abort task will not get a response from the controller
> +               return TMF_RESP_FUNC_FAILED;
> +       }
> +
>         ret = pm8001_find_tag(task, &tag);
>         if (ret == 0) {
>                 pm8001_info(pm8001_ha, "no tag for task:%p\n", task);
> diff --git a/drivers/scsi/pm8001/pm8001_sas.h b/drivers/scsi/pm8001/pm8001_sas.h
> index f2c8cbad3853..039ed91e9841 100644
> --- a/drivers/scsi/pm8001/pm8001_sas.h
> +++ b/drivers/scsi/pm8001/pm8001_sas.h
> @@ -215,6 +215,7 @@ struct pm8001_dispatch {
>         int (*sas_diag_execute_req)(struct pm8001_hba_info *pm8001_ha,
>                 u32 state);
>         int (*sas_re_init_req)(struct pm8001_hba_info *pm8001_ha);
> +       int (*fatal_errors)(struct pm8001_hba_info *pm8001_ha);
>  };
>
>  struct pm8001_chip_info {
> @@ -725,6 +726,7 @@ ssize_t pm80xx_get_fatal_dump(struct device *cdev,
>  ssize_t pm80xx_get_non_fatal_dump(struct device *cdev,
>                 struct device_attribute *attr, char *buf);
>  ssize_t pm8001_get_gsm_dump(struct device *cdev, u32, char *buf);
> +int pm80xx_fatal_errors(struct pm8001_hba_info *pm8001_ha);
>  /* ctl shared API */
>  extern struct device_attribute *pm8001_host_attrs[];
>
> diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
> index 9c4b8b374ab8..86a3d483749c 100644
> --- a/drivers/scsi/pm8001/pm80xx_hwi.c
> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
> @@ -1525,6 +1525,41 @@ static int mpi_uninit_check(struct pm8001_hba_info *pm8001_ha)
>         return 0;
>  }
>
> +/**
> + * pm80xx_fatal_errors - returns non zero *ONLY* when fatal errors
> + * @pm8001_ha: our hba card information
> + *
> + * Fatal errors are recoverable only after a host reboot.
> + */
> +int
> +pm80xx_fatal_errors(struct pm8001_hba_info *pm8001_ha)
> +{
> +       int ret = 0;
> +       u32 scratch_pad_rsvd0 = pm8001_cr32(pm8001_ha, 0,
> +                                       MSGU_HOST_SCRATCH_PAD_6);
> +       u32 scratch_pad_rsvd1 = pm8001_cr32(pm8001_ha, 0,
> +                                       MSGU_HOST_SCRATCH_PAD_7);
> +       u32 scratch_pad1 = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_1);
> +       u32 scratch_pad2 = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_2);
> +       u32 scratch_pad3 = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_3);
> +
> +       if (pm8001_ha->chip_id != chip_8006 &&
> +                       pm8001_ha->chip_id != chip_8074 &&
> +                       pm8001_ha->chip_id != chip_8076) {
> +               return 0;
> +       }
> +
> +       if (MSGU_SCRATCHPAD1_STATE_FATAL_ERROR(scratch_pad1)) {
> +               pm8001_dbg(pm8001_ha, FAIL,
> +                       "Fatal error SCRATCHPAD1 = 0x%x SCRATCHPAD2 = 0x%x SCRATCHPAD3 = 0x%x SCRATCHPAD_RSVD0 = 0x%x SCRATCHPAD_RSVD1 = 0x%x\n",
> +                               scratch_pad1, scratch_pad2, scratch_pad3,
> +                               scratch_pad_rsvd0, scratch_pad_rsvd1);
> +               ret = 1;
> +       }
> +
> +       return ret;
> +}
> +
>  /**
>   * pm8001_chip_soft_rst - soft reset the PM8001 chip, so that the clear all
>   * the FW register status to the originated status.
> @@ -4959,4 +4994,5 @@ const struct pm8001_dispatch pm8001_80xx_dispatch = {
>         .set_nvmd_req           = pm8001_chip_set_nvmd_req,
>         .fw_flash_update_req    = pm8001_chip_fw_flash_update_req,
>         .set_dev_state_req      = pm8001_chip_set_dev_state_req,
> +       .fatal_errors           = pm80xx_fatal_errors,
>  };
> diff --git a/drivers/scsi/pm8001/pm80xx_hwi.h b/drivers/scsi/pm8001/pm80xx_hwi.h
> index 2b6b52551968..2c8e85cfdbc4 100644
> --- a/drivers/scsi/pm8001/pm80xx_hwi.h
> +++ b/drivers/scsi/pm8001/pm80xx_hwi.h
> @@ -1368,6 +1368,19 @@ typedef struct SASProtocolTimerConfig SASProtocolTimerConfig_t;
>  #define MSGU_HOST_SCRATCH_PAD_6                        0x6C
>  #define MSGU_HOST_SCRATCH_PAD_7                        0x70
>
> +#define MSGU_SCRATCHPAD1_RAAE_STATE_ERR(x) ((x & 0x3) == 0x2)
> +#define MSGU_SCRATCHPAD1_ILA_STATE_ERR(x) (((x >> 2) & 0x3) == 0x2)
> +#define MSGU_SCRATCHPAD1_BOOTLDR_STATE_ERR(x) ((((x >> 4) & 0x7) == 0x7) || \
> +                                               (((x >> 4) & 0x7) == 0x4))
> +#define MSGU_SCRATCHPAD1_IOP0_STATE_ERR(x) (((x >> 10) & 0x3) == 0x2)
> +#define MSGU_SCRATCHPAD1_IOP1_STATE_ERR(x) (((x >> 12) & 0x3) == 0x2)
> +#define MSGU_SCRATCHPAD1_STATE_FATAL_ERROR(x)  \
> +                       (MSGU_SCRATCHPAD1_RAAE_STATE_ERR(x) ||      \
> +                        MSGU_SCRATCHPAD1_ILA_STATE_ERR(x) ||       \
> +                        MSGU_SCRATCHPAD1_BOOTLDR_STATE_ERR(x) ||   \
> +                        MSGU_SCRATCHPAD1_IOP0_STATE_ERR(x) ||      \
> +                        MSGU_SCRATCHPAD1_IOP1_STATE_ERR(x))
> +
>  /* bit definition for ODMR register */
>  #define ODMR_MASK_ALL                  0xFFFFFFFF/* mask all
>                                         interrupt vector */
> --
> 2.16.3
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/8] pm80xx: check main config table address
  2020-12-30  4:57 ` [PATCH 3/8] pm80xx: check main config table address Viswas G
@ 2021-01-05 13:23   ` Jinpu Wang
  0 siblings, 0 replies; 18+ messages in thread
From: Jinpu Wang @ 2021-01-05 13:23 UTC (permalink / raw)
  To: Viswas G
  Cc: Linux SCSI Mailinglist, Vasanthalakshmi.Tharmarajan, Viswas G,
	Ruksar.devadi, yuuzheng, vishakhavc, Radha Ramachandran,
	akshatzen, bjashnani

On Wed, Dec 30, 2020 at 5:47 AM Viswas G <Viswas.G@microchip.com.com> wrote:
>
> From: akshatzen <akshatzen@google.com>
>
> pm8001 driver initializes main configuration, general status, inbound
> queue and outbound queue table address based on value read from
> MSGU_SCRATCH_PAD_0 register.
>
> We should validate these addresses before dereferencing them.
>
> This change adds two validations:
> 1. Check if main configuration table offset lies within the pcibar
> mapped
> 2. Check if first dword of main configuration table reads "PMCS"
>
> There are two calls to init_pci_device_addresses() done during
> pm8001_pci_probe() in this sequence:
> 1. First inside chip_soft_rst, where if init_pci_device_addresses fails
> we will go ahead assuming MPI state is not ready and reset the device as
> long as bootloader is okay.  This gives chance to second call of
> init_pci_device_addresses to set up the addresses after reset.
>
> 2. The second call is via pm80xx_chip_init, after soft reset is done and
> firmware is checked to be ready. Once that is done we are safe to go
> ahead and initialize default table values and use them.
>
> Tested:
>
> 1. Enabled debugging logs and observed no issues during initialization,
> with a controller with No issues.
>
> pm80xx0:: pm8001_setup_msix 1034: pci_alloc_irq_vectors request ret:64
> no of intr 64
> pm80xx0:: init_pci_device_addresses 917: Scratchpad 0 Offset: 0x2000
> value 0x40002000
> pm80xx0:: init_pci_device_addresses 925: Scratchpad 0 PCI BAR: 0
> pm80xx0:: init_pci_device_addresses 952: VALID main config signature
> 0x53434d50
> pm80xx0:: init_pci_device_addresses 975: GST OFFSET 0xc4
> pm80xx0:: init_pci_device_addresses 978: INBND OFFSET 0x20000128
> pm80xx0:: init_pci_device_addresses 981: OBND OFFSET 0x24000928
> pm80xx0:: init_pci_device_addresses 984: IVT OFFSET 0x8001408
> pm80xx0:: init_pci_device_addresses 987: PSPA OFFSET 0x8001608
> pm80xx0:: init_pci_device_addresses 991: addr - main cfg (ptrval)
> general status (ptrval)
> pm80xx0:: init_pci_device_addresses 995: addr - inbnd (ptrval) obnd
> (ptrval)
> pm80xx0:: init_pci_device_addresses 999: addr - pspa (ptrval) ivt
> (ptrval)
> pm80xx0:: pm80xx_chip_soft_rst 1446: reset register before write : 0x0
> pm80xx0:: pm80xx_chip_soft_rst 1478: reset register after write 0x40
> pm80xx0:: pm80xx_chip_soft_rst 1544: SPCv soft reset Complete
> pm80xx0:: init_pci_device_addresses 917: Scratchpad 0 Offset: 0x2000
> value 0x40002000
> pm80xx0:: init_pci_device_addresses 925: Scratchpad 0 PCI BAR: 0
> pm80xx0:: init_pci_device_addresses 952: VALID main config signature
> 0x53434d50
> pm80xx0:: init_pci_device_addresses 975: GST OFFSET 0xc4
> pm80xx0:: init_pci_device_addresses 978: INBND OFFSET 0x20000128
> pm80xx0:: init_pci_device_addresses 981: OBND OFFSET 0x24000928
> pm80xx0:: init_pci_device_addresses 984: IVT OFFSET 0x8001408
> pm80xx0:: init_pci_device_addresses 987: PSPA OFFSET 0x8001608
> pm80xx0:: init_pci_device_addresses 991: addr - main cfg (ptrval)
> general status (ptrval)
> pm80xx0:: init_pci_device_addresses 995: addr - inbnd (ptrval) obnd
> (ptrval)
> pm80xx0:: init_pci_device_addresses 999: addr - pspa (ptrval) ivt
> (ptrval)
> pm80xx0:: pm80xx_chip_init 1329: MPI initialize successful!
>
> 2. Tested controller with firmware known to have initialization issue
> and observed no crashes with this fix
>
> pm80xx 0000:01:00.0: pm80xx: driver version 0.1.38
> pm80xx 0000:01:00.0: Removing from 1:1 domain
> pm80xx 0000:01:00.0: Requesting non-1:1 mappings
> pm80xx0:: init_pci_device_addresses 948: BAD main config signature 0x0
> pm80xx0:: mpi_uninit_check 1365: Failed to init pci addresses
> pm80xx0:: pm80xx_chip_soft_rst 1435: MPI state is not ready
> scratch:0:8:62a01000:0
> pm80xx0:: pm80xx_chip_soft_rst 1518: Firmware is not ready!
> pm80xx0:: pm80xx_chip_soft_rst 1532: iButton Feature is not Available!!!
> pm80xx0:: pm80xx_chip_init 1301: Firmware is not ready!
> pm80xx0:: pm8001_pci_probe 1215: chip_init failed [ret: -16]
> pm80xx: probe of 0000:01:00.0 failed with error -16
> pm80xx 0000:07:00.0: pm80xx: driver version 0.1.38
> pm80xx 0000:07:00.0: Removing from 1:1 domain
> pm80xx 0000:07:00.0: Requesting non-1:1 mappings
> scsi host6: pm80xx
> pm80xx1:: pm8001_setup_sgpio 5568: failed sgpio_req timeout
> pm80xx1:: mpi_phy_start_resp 3447: phy start resp status:0x0, phyid:0x0
> pm80xx 0000:08:00.0: pm80xx: driver version 0.1.38
> pm80xx 0000:08:00.0: Removing from 1:1 domain
> pm80xx 0000:08:00.0: Requesting non-1:1 mappings
>
> 3. Without this fix we observe crash on the same controller.
>
> pm80xx 0000:01:00.0: pm80xx: driver version 0.1.38
> pm80xx 0000:01:00.0: Removing from 1:1 domain
> pm80xx 0000:01:00.0: Requesting non-1:1 mappings
> [<ffffffffc0451b3b>] pm80xx_chip_soft_rst+0x6b/0x4c0 [pm80xx]
> [<ffffffffc043a933>] pm8001_pci_probe+0xa43/0x1630 [pm80xx]
> RIP: 0010:pm80xx_chip_soft_rst+0x71/0x4c0 [pm80xx]
> [<ffffffffc0451b3b>] ? pm80xx_chip_soft_rst+0x6b/0x4c0 [pm80xx]
> [<ffffffffc043a933>] pm8001_pci_probe+0xa43/0x1630 [pm80xx]
> pm80xx0:: mpi_uninit_check 1339: TIMEOUT:IBDB value/=2
> pm80xx0:: pm80xx_chip_soft_rst 1387: MPI state is not ready
> scratch:0:8:62a01000:0
> pm80xx0:: pm80xx_chip_soft_rst 1470: Firmware is not ready!
> pm80xx0:: pm80xx_chip_soft_rst 1484: iButton Feature is not Available!!!
> pm80xx0:: pm80xx_chip_init 1266: Firmware is not ready!
> pm80xx0:: pm8001_pci_probe 1207: chip_init failed [ret: -16]
> pm80xx: probe of 0000:01:00.0 failed with error -16
>
> Signed-off-by: akshatzen <akshatzen@google.com>
> Signed-off-by: Viswas G <Viswas.G@microchip.com>
> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
> Signed-off-by: Radha Ramachandran <radha@google.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Thx
> ---
>  drivers/scsi/pm8001/pm8001_init.c | 11 +++++---
>  drivers/scsi/pm8001/pm80xx_hwi.c  | 53 ++++++++++++++++++++++++++++++++++++---
>  2 files changed, 58 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/scsi/pm8001/pm8001_init.c b/drivers/scsi/pm8001/pm8001_init.c
> index ee2de177d0d0..d21078ca7fb3 100644
> --- a/drivers/scsi/pm8001/pm8001_init.c
> +++ b/drivers/scsi/pm8001/pm8001_init.c
> @@ -466,9 +466,12 @@ static int pm8001_ioremap(struct pm8001_hba_info *pm8001_ha)
>                         pm8001_ha->io_mem[logicalBar].memvirtaddr =
>                                 ioremap(pm8001_ha->io_mem[logicalBar].membase,
>                                 pm8001_ha->io_mem[logicalBar].memsize);
> -                       pm8001_dbg(pm8001_ha, INIT,
> -                                  "PCI: bar %d, logicalBar %d\n",
> +                       if (!pm8001_ha->io_mem[logicalBar].memvirtaddr) {
> +                               pm8001_dbg(pm8001_ha, INIT,
> +                                       "Failed to ioremap bar %d, logicalBar %d",
>                                    bar, logicalBar);
> +                               return -ENOMEM;
> +                       }
>                         pm8001_dbg(pm8001_ha, INIT,
>                                    "base addr %llx virt_addr=%llx len=%d\n",
>                                    (u64)pm8001_ha->io_mem[logicalBar].membase,
> @@ -540,9 +543,11 @@ static struct pm8001_hba_info *pm8001_pci_alloc(struct pci_dev *pdev,
>                         tasklet_init(&pm8001_ha->tasklet[j], pm8001_tasklet,
>                                 (unsigned long)&(pm8001_ha->irq_vector[j]));
>  #endif
> -       pm8001_ioremap(pm8001_ha);
> +       if (pm8001_ioremap(pm8001_ha))
> +               goto failed_pci_alloc;
>         if (!pm8001_alloc(pm8001_ha, ent))
>                 return pm8001_ha;
> +failed_pci_alloc:
>         pm8001_free(pm8001_ha);
>         return NULL;
>  }
> diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
> index 86a3d483749c..7d0eada11d3c 100644
> --- a/drivers/scsi/pm8001/pm80xx_hwi.c
> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
> @@ -1115,7 +1115,7 @@ static int check_fw_ready(struct pm8001_hba_info *pm8001_ha)
>         return ret;
>  }
>
> -static void init_pci_device_addresses(struct pm8001_hba_info *pm8001_ha)
> +static int init_pci_device_addresses(struct pm8001_hba_info *pm8001_ha)
>  {
>         void __iomem *base_addr;
>         u32     value;
> @@ -1124,15 +1124,48 @@ static void init_pci_device_addresses(struct pm8001_hba_info *pm8001_ha)
>         u32     pcilogic;
>
>         value = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_0);
> +
> +       /**
> +        * lower 26 bits of SCRATCHPAD0 register describes offset within the
> +        * PCIe BAR where the MPI configuration table is present
> +        */
>         offset = value & 0x03FFFFFF; /* scratch pad 0 TBL address */
>
>         pm8001_dbg(pm8001_ha, DEV, "Scratchpad 0 Offset: 0x%x value 0x%x\n",
>                    offset, value);
> +       /**
> +        * Upper 6 bits describe the offset within PCI config space where BAR
> +        * is located.
> +        */
>         pcilogic = (value & 0xFC000000) >> 26;
>         pcibar = get_pci_bar_index(pcilogic);
>         pm8001_dbg(pm8001_ha, INIT, "Scratchpad 0 PCI BAR: %d\n", pcibar);
> +
> +       /**
> +        * Make sure the offset falls inside the ioremapped PCI BAR
> +        */
> +       if (offset > pm8001_ha->io_mem[pcibar].memsize) {
> +               pm8001_dbg(pm8001_ha, FAIL,
> +                       "Main cfg tbl offset outside %u > %u\n",
> +                               offset, pm8001_ha->io_mem[pcibar].memsize);
> +               return -EBUSY;
> +       }
>         pm8001_ha->main_cfg_tbl_addr = base_addr =
>                 pm8001_ha->io_mem[pcibar].memvirtaddr + offset;
> +
> +       /**
> +        * Validate main configuration table address: first DWord should read
> +        * "PMCS"
> +        */
> +       value = pm8001_mr32(pm8001_ha->main_cfg_tbl_addr, 0);
> +       if (memcmp(&value, "PMCS", 4) != 0) {
> +               pm8001_dbg(pm8001_ha, FAIL,
> +                       "BAD main config signature 0x%x\n",
> +                               value);
> +               return -EBUSY;
> +       }
> +       pm8001_dbg(pm8001_ha, INIT,
> +                       "VALID main config signature 0x%x\n", value);
>         pm8001_ha->general_stat_tbl_addr =
>                 base_addr + (pm8001_cr32(pm8001_ha, pcibar, offset + 0x18) &
>                                         0xFFFFFF);
> @@ -1171,6 +1204,7 @@ static void init_pci_device_addresses(struct pm8001_hba_info *pm8001_ha)
>         pm8001_dbg(pm8001_ha, INIT, "addr - pspa %p ivt %p\n",
>                    pm8001_ha->pspa_q_tbl_addr,
>                    pm8001_ha->ivt_tbl_addr);
> +       return 0;
>  }
>
>  /**
> @@ -1438,7 +1472,12 @@ static int pm80xx_chip_init(struct pm8001_hba_info *pm8001_ha)
>         pm8001_ha->controller_fatal_error = false;
>
>         /* Initialize pci space address eg: mpi offset */
> -       init_pci_device_addresses(pm8001_ha);
> +       ret = init_pci_device_addresses(pm8001_ha);
> +       if (ret) {
> +               pm8001_dbg(pm8001_ha, FAIL,
> +                       "Failed to init pci addresses");
> +               return ret;
> +       }
>         init_default_table_values(pm8001_ha);
>         read_main_config_table(pm8001_ha);
>         read_general_status_table(pm8001_ha);
> @@ -1482,7 +1521,15 @@ static int mpi_uninit_check(struct pm8001_hba_info *pm8001_ha)
>         u32 max_wait_count;
>         u32 value;
>         u32 gst_len_mpistate;
> -       init_pci_device_addresses(pm8001_ha);
> +       int ret;
> +
> +       ret = init_pci_device_addresses(pm8001_ha);
> +       if (ret) {
> +               pm8001_dbg(pm8001_ha, FAIL,
> +                       "Failed to init pci addresses");
> +               return ret;
> +       }
> +
>         /* Write bit1=1 to Inbound DoorBell Register to tell the SPC FW the
>         table is stop */
>         pm8001_cw32(pm8001_ha, 0, MSGU_IBDB_SET, SPCv_MSGU_CFG_TABLE_RESET);
> --
> 2.16.3
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/8] pm80xx: fix missing tag_free in NVMD DATA req
  2020-12-30  4:57 ` [PATCH 4/8] pm80xx: fix missing tag_free in NVMD DATA req Viswas G
@ 2021-01-05 13:24   ` Jinpu Wang
  0 siblings, 0 replies; 18+ messages in thread
From: Jinpu Wang @ 2021-01-05 13:24 UTC (permalink / raw)
  To: Viswas G
  Cc: Linux SCSI Mailinglist, Vasanthalakshmi.Tharmarajan, Viswas G,
	Ruksar.devadi, yuuzheng, vishakhavc, Radha Ramachandran,
	akshatzen, bjashnani

On Wed, Dec 30, 2020 at 5:47 AM Viswas G <Viswas.G@microchip.com.com> wrote:
>
> From: akshatzen <akshatzen@google.com>
>
> Tag is not free'd in NVMD get/set data request failure scenario,
> which would have caused tag leak each time the request fails.
>
> Signed-off-by: akshatzen <akshatzen@google.com>
> Signed-off-by: Viswas G <Viswas.G@microchip.com>
> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
> Signed-off-by: Radha Ramachandran <radha@google.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Thx
> ---
>  drivers/scsi/pm8001/pm8001_hwi.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/scsi/pm8001/pm8001_hwi.c b/drivers/scsi/pm8001/pm8001_hwi.c
> index f147193d67bd..9cd6a654f8b2 100644
> --- a/drivers/scsi/pm8001/pm8001_hwi.c
> +++ b/drivers/scsi/pm8001/pm8001_hwi.c
> @@ -3038,8 +3038,8 @@ void pm8001_mpi_set_nvmd_resp(struct pm8001_hba_info *pm8001_ha, void *piomb)
>         complete(pm8001_ha->nvmd_completion);
>         pm8001_dbg(pm8001_ha, MSG, "Set nvm data complete!\n");
>         if ((dlen_status & NVMD_STAT) != 0) {
> -               pm8001_dbg(pm8001_ha, FAIL, "Set nvm data error!\n");
> -               return;
> +               pm8001_dbg(pm8001_ha, FAIL, "Set nvm data error %x\n",
> +                               dlen_status);
>         }
>         ccb->task = NULL;
>         ccb->ccb_tag = 0xFFFFFFFF;
> @@ -3062,11 +3062,17 @@ pm8001_mpi_get_nvmd_resp(struct pm8001_hba_info *pm8001_ha, void *piomb)
>
>         pm8001_dbg(pm8001_ha, MSG, "Get nvm data complete!\n");
>         if ((dlen_status & NVMD_STAT) != 0) {
> -               pm8001_dbg(pm8001_ha, FAIL, "Get nvm data error!\n");
> +               pm8001_dbg(pm8001_ha, FAIL, "Get nvm data error %x\n",
> +                               dlen_status);
>                 complete(pm8001_ha->nvmd_completion);
> +               /* We should free tag during failure also, the tag is not being
> +                * free'd by requesting path anywhere.
> +                */
> +               ccb->task = NULL;
> +               ccb->ccb_tag = 0xFFFFFFFF;
> +               pm8001_tag_free(pm8001_ha, tag);
>                 return;
>         }
> -
>         if (ir_tds_bn_dps_das_nvm & IPMode) {
>                 /* indirect mode - IR bit set */
>                 pm8001_dbg(pm8001_ha, MSG, "Get NVMD success, IR=1\n");
> --
> 2.16.3
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/8] pm80xx: fix driver fatal dump failure.
  2020-12-30  4:57 ` [PATCH 5/8] pm80xx: fix driver fatal dump failure Viswas G
@ 2021-01-05 13:25   ` Jinpu Wang
  0 siblings, 0 replies; 18+ messages in thread
From: Jinpu Wang @ 2021-01-05 13:25 UTC (permalink / raw)
  To: Viswas G
  Cc: Linux SCSI Mailinglist, Vasanthalakshmi.Tharmarajan, Viswas G,
	Ruksar.devadi, yuuzheng, vishakhavc, Radha Ramachandran,
	akshatzen, bjashnani

On Wed, Dec 30, 2020 at 5:47 AM Viswas G <Viswas.G@microchip.com.com> wrote:
>
> From: Viswas G <Viswas.G@microchip.com>
>
> The fatal dump function pm80xx_get_fatal_dump() has two issues that
> result in the fatal dump not being completed successfully.
>
> 1. When trying collect fatal_logs from the application it is getting
> failed, because we are not shifting MEMBASE-II register properly.
> Once we read 64K region of data we have to shift the MEMBASE-II register
> and read the next chunk of data, then only we would be able to get
> complete data.
>
> 2. If timeout occurs our application will get stuck because we are not
> handling this case. In this patch it resolves all these issues.
>
> Signed-off-by: Viswas G <Viswas.G@microchip.com>
> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Thx
> ---
>  drivers/scsi/pm8001/pm80xx_hwi.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
> index 7d0eada11d3c..407c0cf6ab5f 100644
> --- a/drivers/scsi/pm8001/pm80xx_hwi.c
> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
> @@ -349,10 +349,15 @@ ssize_t pm80xx_get_fatal_dump(struct device *cdev,
>                                 sprintf(
>                                 pm8001_ha->forensic_info.data_buf.direct_data,
>                                 "%08x ", 0xFFFFFFFF);
> -                               pm8001_cw32(pm8001_ha, 0,
> +                               return((char *)pm8001_ha->forensic_info.data_buf.direct_data -
> +                                               (char *)buf);
> +                       }
> +       /* reset fatal_forensic_shift_offset back to zero and reset MEMBASE 2 register to zero */
> +                       pm8001_ha->fatal_forensic_shift_offset = 0; /* location in 64k region */
> +                       pm8001_cw32(pm8001_ha, 0,
>                                         MEMBASE_II_SHIFT_REGISTER,
>                                         pm8001_ha->fatal_forensic_shift_offset);
> -                       }
> +               }
>                         /* Read the next block of the debug data.*/
>                         length_to_read = pm8001_mr32(fatal_table_address,
>                         MPI_FATAL_EDUMP_TABLE_ACCUM_LEN) -
> @@ -373,13 +378,12 @@ ssize_t pm80xx_get_fatal_dump(struct device *cdev,
>                                                                 = 0;
>                                 pm8001_ha->forensic_info.data_buf.read_len = 0;
>                         }
> -               }
>         }
>         offset = (int)((char *)pm8001_ha->forensic_info.data_buf.direct_data
>                         - (char *)buf);
>         pm8001_dbg(pm8001_ha, IO, "get_fatal_spcv: return4 0x%x\n", offset);
> -       return (char *)pm8001_ha->forensic_info.data_buf.direct_data -
> -               (char *)buf;
> +       return ((char *)pm8001_ha->forensic_info.data_buf.direct_data -
> +               (char *)buf);
>  }
>
>  /* pm80xx_get_non_fatal_dump - dump the nonfatal data from the dma
> --
> 2.16.3
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/8] pm80xx: Simultaneous poll for all FW readiness.
  2020-12-30  4:57 ` [PATCH 6/8] pm80xx: Simultaneous poll for all FW readiness Viswas G
@ 2021-01-05 13:31   ` Jinpu Wang
  0 siblings, 0 replies; 18+ messages in thread
From: Jinpu Wang @ 2021-01-05 13:31 UTC (permalink / raw)
  To: Viswas G
  Cc: Linux SCSI Mailinglist, Vasanthalakshmi.Tharmarajan, Viswas G,
	Ruksar.devadi, yuuzheng, vishakhavc, Radha Ramachandran,
	akshatzen, bjashnani

On Wed, Dec 30, 2020 at 5:47 AM Viswas G <Viswas.G@microchip.com.com> wrote:
>
> From: Bhavesh Jashnani <bjashnani@google.com>
>
> In check_fw_ready() we first wait for ILA to come up and then we
> wait for RAAE to come up and IOPs and so on. This is a sequential check.
> Because of this ILA image seems to be not ready in the allocated time
> and so the driver marks it as "not ready" and then move on to other FW
> images. But ILA does become ready eventually, but is not checked again.
> In this way driver concludes that FW is not ready, when it actually is.
>
> Fix: Instead of sequentially polling each image, we keep polling for all
> images to be ready. The timeout for the polling has been set to the sum
> of what was used for each individual image.
>
> Signed-off-by: Bhavesh Jashnani <bjashnani@google.com>
> Signed-off-by: Viswas G <Viswas.G@microchip.com>
> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
> Signed-off-by: Radha Ramachandran <radha@google.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Thx
> ---
>  drivers/scsi/pm8001/pm80xx_hwi.c | 80 ++++++++++++----------------------------
>  1 file changed, 23 insertions(+), 57 deletions(-)
>
> diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
> index 407c0cf6ab5f..df679e36954a 100644
> --- a/drivers/scsi/pm8001/pm80xx_hwi.c
> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
> @@ -1043,6 +1043,7 @@ static int check_fw_ready(struct pm8001_hba_info *pm8001_ha)
>         u32 value;
>         u32 max_wait_count;
>         u32 max_wait_time;
> +       u32 expected_mask;
>         int ret = 0;
>
>         /* reset / PCIe ready */
> @@ -1052,70 +1053,35 @@ static int check_fw_ready(struct pm8001_hba_info *pm8001_ha)
>                 value = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_1);
>         } while ((value == 0xFFFFFFFF) && (--max_wait_count));
>
> -       /* check ila status */
> -       max_wait_time = max_wait_count = 50;    /* 1000 milli sec */
> -       do {
> -               msleep(FW_READY_INTERVAL);
> -               value = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_1);
> -       } while (((value & SCRATCH_PAD_ILA_READY) !=
> -                       SCRATCH_PAD_ILA_READY) && (--max_wait_count));
> -       if (!max_wait_count)
> -               ret = -1;
> -       else {
> -               pm8001_dbg(pm8001_ha, MSG,
> -                          " ila ready status in %d millisec\n",
> -                          (max_wait_time - max_wait_count));
> -       }
> -
> -       /* check RAAE status */
> -       max_wait_time = max_wait_count = 90;    /* 1800 milli sec */
> -       do {
> -               msleep(FW_READY_INTERVAL);
> -               value = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_1);
> -       } while (((value & SCRATCH_PAD_RAAE_READY) !=
> -                               SCRATCH_PAD_RAAE_READY) && (--max_wait_count));
> -       if (!max_wait_count)
> -               ret = -1;
> -       else {
> -               pm8001_dbg(pm8001_ha, MSG,
> -                          " raae ready status in %d millisec\n",
> -                          (max_wait_time - max_wait_count));
> +       /* check ila, RAAE and iops status */
> +       if ((pm8001_ha->chip_id != chip_8008) &&
> +                       (pm8001_ha->chip_id != chip_8009)) {
> +               max_wait_time = max_wait_count = 180;   /* 3600 milli sec */
> +               expected_mask = SCRATCH_PAD_ILA_READY |
> +                       SCRATCH_PAD_RAAE_READY |
> +                       SCRATCH_PAD_IOP0_READY |
> +                       SCRATCH_PAD_IOP1_READY;
> +       } else {
> +               max_wait_time = max_wait_count = 170;   /* 3400 milli sec */
> +               expected_mask = SCRATCH_PAD_ILA_READY |
> +                       SCRATCH_PAD_RAAE_READY |
> +                       SCRATCH_PAD_IOP0_READY;
>         }
> -
> -       /* check iop0 status */
> -       max_wait_time = max_wait_count = 30;    /* 600 milli sec */
>         do {
>                 msleep(FW_READY_INTERVAL);
>                 value = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_1);
> -       } while (((value & SCRATCH_PAD_IOP0_READY) != SCRATCH_PAD_IOP0_READY) &&
> -                       (--max_wait_count));
> -       if (!max_wait_count)
> +       } while (((value & expected_mask) !=
> +                                expected_mask) && (--max_wait_count));
> +       if (!max_wait_count) {
> +               pm8001_dbg(pm8001_ha, INIT,
> +               "At least one FW component failed to load within %d millisec: Scratchpad1: 0x%x\n",
> +                       max_wait_time * FW_READY_INTERVAL, value);
>                 ret = -1;
> -       else {
> +       } else {
>                 pm8001_dbg(pm8001_ha, MSG,
> -                          " iop0 ready status in %d millisec\n",
> -                          (max_wait_time - max_wait_count));
> +                       "All FW components ready by %d ms\n",
> +                       (max_wait_time - max_wait_count) * FW_READY_INTERVAL);
>         }
> -
> -       /* check iop1 status only for 16 port controllers */
> -       if ((pm8001_ha->chip_id != chip_8008) &&
> -                       (pm8001_ha->chip_id != chip_8009)) {
> -               /* 200 milli sec */
> -               max_wait_time = max_wait_count = 10;
> -               do {
> -                       msleep(FW_READY_INTERVAL);
> -                       value = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_1);
> -               } while (((value & SCRATCH_PAD_IOP1_READY) !=
> -                               SCRATCH_PAD_IOP1_READY) && (--max_wait_count));
> -               if (!max_wait_count)
> -                       ret = -1;
> -               else {
> -                       pm8001_dbg(pm8001_ha, MSG,
> -                                  "iop1 ready status in %d millisec\n",
> -                                  (max_wait_time - max_wait_count));
> -               }
> -       }
> -
>         return ret;
>  }
>
> --
> 2.16.3
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 7/8] pm80xx: Log SATA IOMB completion status on failure.
  2020-12-30  4:57 ` [PATCH 7/8] pm80xx: Log SATA IOMB completion status on failure Viswas G
@ 2021-01-05 13:33   ` Jinpu Wang
  0 siblings, 0 replies; 18+ messages in thread
From: Jinpu Wang @ 2021-01-05 13:33 UTC (permalink / raw)
  To: Viswas G
  Cc: Linux SCSI Mailinglist, Vasanthalakshmi.Tharmarajan, Viswas G,
	Ruksar.devadi, yuuzheng, vishakhavc, Radha Ramachandran,
	akshatzen, bjashnani

On Wed, Dec 30, 2020 at 5:48 AM Viswas G <Viswas.G@microchip.com.com> wrote:
>
> From: Vishakha Channapattan <vishakhavc@google.com>
>
> Added a log message in sata completion path to log the status of failed
> command. If the status does not match any expected status, another
> message will be logged.
>
> On IO failure with known status, log message will be
>
> [ 1712.951735] pm80xx0:: mpi_sata_completion 2269: IO failed device_id
> 16385 status 0x1 tag XX
>
> If the firmware returns unexpected status, log message of the following
> format will be logged -
>
> [ 1712.951735] pm80xx0:: mpi_sata_completion XXXX: Unknown status
> device_id XXXXX status 0xX tag XX
>
> Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
> Signed-off-by: Viswas G <Viswas.G@microchip.com>
> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
> Signed-off-by: Radha Ramachandran <radha@google.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Thx
> ---
>  drivers/scsi/pm8001/pm80xx_hwi.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
> index df679e36954a..e7fef42b4f6c 100644
> --- a/drivers/scsi/pm8001/pm80xx_hwi.c
> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
> @@ -2437,10 +2437,11 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha, void *piomb)
>                 return;
>         }
>
> -       if (unlikely(status))
> -               pm8001_dbg(pm8001_ha, IOERR,
> -                          "status:0x%x, tag:0x%x, task::0x%p\n",
> -                          status, tag, t);
> +       if (status != IO_SUCCESS) {
> +               pm8001_dbg(pm8001_ha, FAIL,
> +                       "IO failed device_id %u status 0x%x tag %d\n",
> +                       pm8001_dev->device_id, status, tag);
> +       }
>
>         /* Print sas address of IO failed device */
>         if ((status != IO_SUCCESS) && (status != IO_OVERFLOW) &&
> @@ -2762,7 +2763,9 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha, void *piomb)
>                         atomic_dec(&pm8001_dev->running_req);
>                 break;
>         default:
> -               pm8001_dbg(pm8001_ha, DEVIO, "Unknown status 0x%x\n", status);
> +               pm8001_dbg(pm8001_ha, DEVIO,
> +                               "Unknown status device_id %u status 0x%x tag %d\n",
> +                       pm8001_dev->device_id, status, tag);
>                 /* not allowed case. Therefore, return failed status */
>                 ts->resp = SAS_TASK_COMPLETE;
>                 ts->stat = SAS_DEV_NO_RESPONSE;
> --
> 2.16.3
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 8/8] pm80xx: Add sysfs attribute for ioc health
  2020-12-30  4:57 ` [PATCH 8/8] pm80xx: Add sysfs attribute for ioc health Viswas G
@ 2021-01-05 13:35   ` Jinpu Wang
  0 siblings, 0 replies; 18+ messages in thread
From: Jinpu Wang @ 2021-01-05 13:35 UTC (permalink / raw)
  To: Viswas G
  Cc: Linux SCSI Mailinglist, Vasanthalakshmi.Tharmarajan, Viswas G,
	Ruksar.devadi, yuuzheng, vishakhavc, Radha Ramachandran,
	akshatzen, bjashnani

On Wed, Dec 30, 2020 at 5:48 AM Viswas G <Viswas.G@microchip.com.com> wrote:
>
> From: Vishakha Channapattan <vishakhavc@google.com>
>
> A new sysfs variable 'health' is being introduced that tells if the
> controller is alive by indicating controller ticks. If on subsequent
> run we see the ticks changing that indicates that controller is not
> dead.
>
> Tested: Using 'health' sysfs variable we can see ticks incrementing
> mvae14:~# cat  /sys/class/scsi_host/host*/health
> MPI-S= MPI is successfully initialized   HMI_ERR=0
> MSGUTCNT = 0x00000169 IOPTCNT=0x0000016a IOP1TCNT=0x0000016a
> MPI-S= MPI is successfully initialized   HMI_ERR=0
> MSGUTCNT = 0x0000014d IOPTCNT=0x0000014d IOP1TCNT=0x0000014d
> MPI-S= MPI is successfully initialized   HMI_ERR=0
> MSGUTCNT = 0x00000149 IOPTCNT=0x00000149 IOP1TCNT=0x00000149
> mvae14:~#
> mvae14:~#
> mvae14:~#
> mvae14:~# cat  /sys/class/scsi_host/host*/health
> MPI-S= MPI is successfully initialized   HMI_ERR=0
> MSGUTCNT = 0x0000016c IOPTCNT=0x0000016c IOP1TCNT=0x0000016c
> MPI-S= MPI is successfully initialized   HMI_ERR=0
> MSGUTCNT = 0x0000014f IOPTCNT=0x0000014f IOP1TCNT=0x0000014f
> MPI-S= MPI is successfully initialized   HMI_ERR=0
> MSGUTCNT = 0x0000014b IOPTCNT=0x0000014b IOP1TCNT=0x0000014b
>
> Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
> Signed-off-by: Viswas G <Viswas.G@microchip.com>
> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
> Signed-off-by: Radha Ramachandran <radha@google.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Thx
> ---
>  drivers/scsi/pm8001/pm8001_ctl.c | 42 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 42 insertions(+)
>
> diff --git a/drivers/scsi/pm8001/pm8001_ctl.c b/drivers/scsi/pm8001/pm8001_ctl.c
> index 12035baf0997..f46f341132fb 100644
> --- a/drivers/scsi/pm8001/pm8001_ctl.c
> +++ b/drivers/scsi/pm8001/pm8001_ctl.c
> @@ -41,6 +41,7 @@
>  #include <linux/slab.h>
>  #include "pm8001_sas.h"
>  #include "pm8001_ctl.h"
> +#include "pm8001_chips.h"
>
>  /* scsi host attributes */
>
> @@ -886,6 +887,46 @@ static ssize_t pm8001_show_update_fw(struct device *cdev,
>
>  static DEVICE_ATTR(update_fw, S_IRUGO|S_IWUSR|S_IWGRP,
>         pm8001_show_update_fw, pm8001_store_update_fw);
> +
> +/**
> + * pm8001_ctl_health_show - controller health check
> + * @cdev: pointer to embedded class device
> + * @buf: the buffer returned
> + *
> + * A sysfs 'read-only' shost attribute.
> + */
> +
> +char mpiStateText[][80] = {
> +       "MPI is not initialized",
> +       "MPI is successfully initialized",
> +       "MPI termination is in progress",
> +       "MPI initialization failed with error in [31:16]"
> +};
> +
> +static ssize_t ctl_health_show(struct device *cdev,
> +               struct device_attribute *attr, char *buf)
> +{
> +       struct Scsi_Host *shost = class_to_shost(cdev);
> +       struct sas_ha_struct *sha = SHOST_TO_SAS_HA(shost);
> +       struct pm8001_hba_info *pm8001_ha = sha->lldd_ha;
> +       unsigned int mpiDW0 = 0;
> +       unsigned int raaeCnt = 0;
> +       unsigned int iop0Cnt = 0;
> +       unsigned int iop1Cnt = 0;
> +       int c;
> +
> +       pm8001_dbg(pm8001_ha, IOCTL, "%s\n", __func__);
> +       mpiDW0 = pm8001_mr32(pm8001_ha->general_stat_tbl_addr, 0);
> +       raaeCnt = pm8001_mr32(pm8001_ha->general_stat_tbl_addr, 12);
> +       iop0Cnt = pm8001_mr32(pm8001_ha->general_stat_tbl_addr, 16);
> +       iop1Cnt = pm8001_mr32(pm8001_ha->general_stat_tbl_addr, 20);
> +       c = sprintf(buf, "MPI-S=%s\t HMI_ERR=%x\nMSGUTCNT=0x%08x IOPTCNT=0x%08x IOP1TCNT=0x%08x\n",
> +                       mpiStateText[mpiDW0 & 0x0003], ((mpiDW0 & 0xff00) >> 16),
> +                       raaeCnt, iop0Cnt, iop1Cnt);
> +       return c;
> +}
> +static DEVICE_ATTR_RO(ctl_health);
> +
>  struct device_attribute *pm8001_host_attrs[] = {
>         &dev_attr_interface_rev,
>         &dev_attr_controller_fatal_error,
> @@ -909,6 +950,7 @@ struct device_attribute *pm8001_host_attrs[] = {
>         &dev_attr_ob_log,
>         &dev_attr_ila_version,
>         &dev_attr_inc_fw_ver,
> +       &dev_attr_ctl_health,
>         NULL,
>  };
>
> --
> 2.16.3
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/8] pm80xx updates.
  2020-12-30  4:57 [PATCH 0/8] pm80xx updates Viswas G
                   ` (7 preceding siblings ...)
  2020-12-30  4:57 ` [PATCH 8/8] pm80xx: Add sysfs attribute for ioc health Viswas G
@ 2021-01-08  3:21 ` Martin K. Petersen
  8 siblings, 0 replies; 18+ messages in thread
From: Martin K. Petersen @ 2021-01-08  3:21 UTC (permalink / raw)
  To: Viswas G
  Cc: linux-scsi, Vasanthalakshmi.Tharmarajan, Viswas.G, Ruksar.devadi,
	yuuzheng, vishakhavc, radha, akshatzen, bjashnani, jinpu.wang


Hi Viswas!

> From: Viswas G <Viswas.G@microchip.com>

This series was submitted as <Viswas.G@microchip.com.com>. I was going
to fix it up but various tooling gets confused and we end up with weird
lore links.

Please fix your email address and resubmit.

Thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2021-01-08  3:24 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-30  4:57 [PATCH 0/8] pm80xx updates Viswas G
2020-12-30  4:57 ` [PATCH 1/8] pm80xx: No busywait in MPI init check Viswas G
2021-01-04  6:45   ` Jinpu Wang
2020-12-30  4:57 ` [PATCH 2/8] pm80xx: check fatal error Viswas G
2021-01-05 13:19   ` Jinpu Wang
2020-12-30  4:57 ` [PATCH 3/8] pm80xx: check main config table address Viswas G
2021-01-05 13:23   ` Jinpu Wang
2020-12-30  4:57 ` [PATCH 4/8] pm80xx: fix missing tag_free in NVMD DATA req Viswas G
2021-01-05 13:24   ` Jinpu Wang
2020-12-30  4:57 ` [PATCH 5/8] pm80xx: fix driver fatal dump failure Viswas G
2021-01-05 13:25   ` Jinpu Wang
2020-12-30  4:57 ` [PATCH 6/8] pm80xx: Simultaneous poll for all FW readiness Viswas G
2021-01-05 13:31   ` Jinpu Wang
2020-12-30  4:57 ` [PATCH 7/8] pm80xx: Log SATA IOMB completion status on failure Viswas G
2021-01-05 13:33   ` Jinpu Wang
2020-12-30  4:57 ` [PATCH 8/8] pm80xx: Add sysfs attribute for ioc health Viswas G
2021-01-05 13:35   ` Jinpu Wang
2021-01-08  3:21 ` [PATCH 0/8] pm80xx updates Martin K. Petersen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.