linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [smartpqi updates PATCH V2 00/11] smartpqi updates
@ 2021-09-28 23:54 Don Brace
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 01/11] smartpqi: update device removal management Don Brace
                   ` (12 more replies)
  0 siblings, 13 replies; 33+ messages in thread
From: Don Brace @ 2021-09-28 23:54 UTC (permalink / raw)
  To: hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, pmenzel, linux-kernel

These patches are based on Martin Petersen's 5.16/scsi-queue tree
  https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git
  5.16/scsi-queue

This set of changes consist of:
  * Aligning device removal with our out of box driver.
  * Aligning kdump timing with controller memory dump.
    The OS was rebooting before the controller was finished dumping its own
    memory. Now the driver will wait for the controller to indicate that its
    dump has completed.
  * In rare cases where the controller stops responding to the driver, the
    driver can set reason codes to aid in debugging.
  * Enhance device reset operations. The driver was not obtaining the current
    number of outstanding commands during the check for outstanding command
    completions. This was causing reset hangs.
  * Add in a check for HBA devices undergoing sanitize. This was causing long
    boot up delays while the OS waited for sanitize to complete. The fix is to
    check for sanitize and keep the HBA disk offline. Note that the SSA spec
    states that the disk must be manually re-enabled after sanitize has
    completed. The link to the spec is noted in the patch.
  * When the OS off-lines a disk, the SCSI command pointers are cleaned up.
    The driver was attempting to return some outstanding commands that were
    no longer valid.
  * Add in more enhanced report physical luns (RPL) command. This is an
    internal command that yields more complete WWID information.
  * Correct a rare case where a poll for a register status before the
    register has been updated.
  * When multi-LUN tape devices are added to the OS, the OS does its own
    report LUNs and the tape devices were duplicated. A simple fix was to update
    slave_alloc/slave_configure to prevent this.
  * Add in some new PCI devices.
  * Bump the driver version.

Changes since V1:
  * Corrected issues with my e-mail server.


Don Brace (3):
  smartpqi: update device removal management
  smartpqi: add tur check for sanitize operation
  smartpqi: update version to 2.1.12-055

Kevin Barnett (2):
  smartpqi: update LUN reset handler
  smartpqi: fix duplicate device nodes for tape changers

Mahesh Rajashekhara (2):
  smartpqi: add controller handshake during kdump
  smartpqi: avoid failing ios for offline devices

Mike McGowen (3):
  smartpqi: add extended report physical luns
  smartpqi: fix boot failure during lun rebuild
  smartpqi: add 3252-8i pci id

Murthy Bhat (1):
  smartpqi: capture controller reason codes

 drivers/scsi/smartpqi/smartpqi.h              |  61 +-
 drivers/scsi/smartpqi/smartpqi_init.c         | 540 +++++++++++++-----
 .../scsi/smartpqi/smartpqi_sas_transport.c    |   6 +-
 drivers/scsi/smartpqi/smartpqi_sis.c          |  60 +-
 drivers/scsi/smartpqi/smartpqi_sis.h          |   4 +-
 5 files changed, 509 insertions(+), 162 deletions(-)

-- 
2.28.0.rc1.9.ge7ae437ac1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [smartpqi updates PATCH V2 01/11] smartpqi: update device removal management
  2021-09-28 23:54 [smartpqi updates PATCH V2 00/11] smartpqi updates Don Brace
@ 2021-09-28 23:54 ` Don Brace
  2021-09-30 18:21   ` john.p.donnelly
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 02/11] smartpqi: add controller handshake during kdump Don Brace
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Don Brace @ 2021-09-28 23:54 UTC (permalink / raw)
  To: hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, pmenzel, linux-kernel

Update device removal path to handle issues for:
  rmmod - Correct stack trace when removing devices.
  rmmod - Synchronize SCSI cache.
  Update handling for removing devices using sysfs.

This patch also aligns the device removal code with
our out-of-box driver.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c | 64 ++++++++++++---------------
 1 file changed, 28 insertions(+), 36 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index ecb2af3f43ca..97027574eb1f 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -1693,8 +1693,6 @@ static inline void pqi_remove_device(struct pqi_ctrl_info *ctrl_info, struct pqi
 {
 	int rc;
 
-	pqi_device_remove_start(device);
-
 	rc = pqi_device_wait_for_pending_io(ctrl_info, device,
 		PQI_REMOVE_DEVICE_PENDING_IO_TIMEOUT_MSECS);
 	if (rc)
@@ -1708,6 +1706,8 @@ static inline void pqi_remove_device(struct pqi_ctrl_info *ctrl_info, struct pqi
 		scsi_remove_device(device->sdev);
 	else
 		pqi_remove_sas_device(device);
+
+	pqi_device_remove_start(device);
 }
 
 /* Assumes the SCSI device list lock is held. */
@@ -1986,7 +1986,7 @@ static void pqi_update_device_list(struct pqi_ctrl_info *ctrl_info,
 	list_for_each_entry_safe(device, next, &ctrl_info->scsi_device_list,
 		scsi_device_list_entry) {
 		if (device->device_gone) {
-			list_del_init(&device->scsi_device_list_entry);
+			list_del(&device->scsi_device_list_entry);
 			list_add_tail(&device->delete_list_entry, &delete_list);
 		}
 	}
@@ -2025,15 +2025,13 @@ static void pqi_update_device_list(struct pqi_ctrl_info *ctrl_info,
 		if (device->volume_offline) {
 			pqi_dev_info(ctrl_info, "offline", device);
 			pqi_show_volume_status(ctrl_info, device);
-		}
-		list_del(&device->delete_list_entry);
-		if (pqi_is_device_added(device)) {
-			pqi_remove_device(ctrl_info, device);
 		} else {
-			if (!device->volume_offline)
-				pqi_dev_info(ctrl_info, "removed", device);
-			pqi_free_device(device);
+			pqi_dev_info(ctrl_info, "removed", device);
 		}
+		if (pqi_is_device_added(device))
+			pqi_remove_device(ctrl_info, device);
+		list_del(&device->delete_list_entry);
+		pqi_free_device(device);
 	}
 
 	/*
@@ -2328,6 +2326,25 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 	return rc;
 }
 
+static void pqi_remove_all_scsi_devices(struct pqi_ctrl_info *ctrl_info)
+{
+	unsigned long flags;
+	struct pqi_scsi_dev *device;
+	struct pqi_scsi_dev *next;
+
+	spin_lock_irqsave(&ctrl_info->scsi_device_list_lock, flags);
+
+	list_for_each_entry_safe(device, next, &ctrl_info->scsi_device_list,
+		scsi_device_list_entry) {
+		if (pqi_is_device_added(device))
+			pqi_remove_device(ctrl_info, device);
+		list_del(&device->scsi_device_list_entry);
+		pqi_free_device(device);
+	}
+
+	spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock, flags);
+}
+
 static int pqi_scan_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 {
 	int rc;
@@ -6120,31 +6137,6 @@ static int pqi_slave_configure(struct scsi_device *sdev)
 	return 0;
 }
 
-static void pqi_slave_destroy(struct scsi_device *sdev)
-{
-	unsigned long flags;
-	struct pqi_scsi_dev *device;
-	struct pqi_ctrl_info *ctrl_info;
-
-	ctrl_info = shost_to_hba(sdev->host);
-
-	spin_lock_irqsave(&ctrl_info->scsi_device_list_lock, flags);
-
-	device = sdev->hostdata;
-	if (device) {
-		sdev->hostdata = NULL;
-		if (!list_empty(&device->scsi_device_list_entry))
-			list_del(&device->scsi_device_list_entry);
-	}
-
-	spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock, flags);
-
-	if (device) {
-		pqi_dev_info(ctrl_info, "removed", device);
-		pqi_free_device(device);
-	}
-}
-
 static int pqi_getpciinfo_ioctl(struct pqi_ctrl_info *ctrl_info, void __user *arg)
 {
 	struct pci_dev *pci_dev;
@@ -6938,7 +6930,6 @@ static struct scsi_host_template pqi_driver_template = {
 	.ioctl = pqi_ioctl,
 	.slave_alloc = pqi_slave_alloc,
 	.slave_configure = pqi_slave_configure,
-	.slave_destroy = pqi_slave_destroy,
 	.map_queues = pqi_map_queues,
 	.sdev_attrs = pqi_sdev_attrs,
 	.shost_attrs = pqi_shost_attrs,
@@ -8169,6 +8160,7 @@ static void pqi_remove_ctrl(struct pqi_ctrl_info *ctrl_info)
 {
 	pqi_cancel_rescan_worker(ctrl_info);
 	pqi_cancel_update_time_worker(ctrl_info);
+	pqi_remove_all_scsi_devices(ctrl_info);
 	pqi_unregister_scsi(ctrl_info);
 	if (ctrl_info->pqi_mode_enabled)
 		pqi_revert_to_sis_mode(ctrl_info);
-- 
2.28.0.rc1.9.ge7ae437ac1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [smartpqi updates PATCH V2 02/11] smartpqi: add controller handshake during kdump
  2021-09-28 23:54 [smartpqi updates PATCH V2 00/11] smartpqi updates Don Brace
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 01/11] smartpqi: update device removal management Don Brace
@ 2021-09-28 23:54 ` Don Brace
  2021-09-30 18:21   ` john.p.donnelly
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 03/11] smartpqi: capture controller reason codes Don Brace
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Don Brace @ 2021-09-28 23:54 UTC (permalink / raw)
  To: hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, pmenzel, linux-kernel

From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>

Correct kdump hangs when controller is locked up.

There are occasions when a controller reboot
(controller soft reset) is issued when a controller
firmware crash dump is in progress.

This leads to incomplete controller firmware crash dump.
 - When the controller crash dump is in progress,
   and a kdump is initiated, the driver issues
   inbound doorbell reset to bring back the
   controller in SIS mode.
 - If the controller is in locked up state,
   the inbound doorbell reset does not work causing
   controller initialization failures. This results
   in the driver hanging waiting for SIS mode.

To avoid an incomplete controller crash dump, add in
a controller crash dump handshake.
 - Controller will indicate start and end of the controller
   crash dump by setting some register bits.
 - Driver will look these bits when a kdump is initiated.
   If a controller crash dump is in progress, the driver will
   wait for the controller crash dump to complete
   before issuing the controller soft reset then complete
   driver initialization.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c | 41 +++++++++++++++++++--
 drivers/scsi/smartpqi/smartpqi_sis.c  | 51 +++++++++++++++++++++++++++
 drivers/scsi/smartpqi/smartpqi_sis.h  |  1 +
 3 files changed, 91 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 97027574eb1f..5655d240f7a7 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -234,15 +234,46 @@ static inline bool pqi_is_hba_lunid(u8 *scsi3addr)
 	return pqi_scsi3addr_equal(scsi3addr, RAID_CTLR_LUNID);
 }
 
+#define PQI_DRIVER_SCRATCH_PQI_MODE			0x1
+#define PQI_DRIVER_SCRATCH_FW_TRIAGE_SUPPORTED		0x2
+
 static inline enum pqi_ctrl_mode pqi_get_ctrl_mode(struct pqi_ctrl_info *ctrl_info)
 {
-	return sis_read_driver_scratch(ctrl_info);
+	return sis_read_driver_scratch(ctrl_info) & PQI_DRIVER_SCRATCH_PQI_MODE ? PQI_MODE : SIS_MODE;
 }
 
 static inline void pqi_save_ctrl_mode(struct pqi_ctrl_info *ctrl_info,
 	enum pqi_ctrl_mode mode)
 {
-	sis_write_driver_scratch(ctrl_info, mode);
+	u32 driver_scratch;
+
+	driver_scratch = sis_read_driver_scratch(ctrl_info);
+
+	if (mode == PQI_MODE)
+		driver_scratch |= PQI_DRIVER_SCRATCH_PQI_MODE;
+	else
+		driver_scratch &= ~PQI_DRIVER_SCRATCH_PQI_MODE;
+
+	sis_write_driver_scratch(ctrl_info, driver_scratch);
+}
+
+static inline bool pqi_is_fw_triage_supported(struct pqi_ctrl_info *ctrl_info)
+{
+	return (sis_read_driver_scratch(ctrl_info) & PQI_DRIVER_SCRATCH_FW_TRIAGE_SUPPORTED) != 0;
+}
+
+static inline void pqi_save_fw_triage_setting(struct pqi_ctrl_info *ctrl_info, bool is_supported)
+{
+	u32 driver_scratch;
+
+	driver_scratch = sis_read_driver_scratch(ctrl_info);
+
+	if (is_supported)
+		driver_scratch |= PQI_DRIVER_SCRATCH_FW_TRIAGE_SUPPORTED;
+	else
+		driver_scratch &= ~PQI_DRIVER_SCRATCH_FW_TRIAGE_SUPPORTED;
+
+	sis_write_driver_scratch(ctrl_info, driver_scratch);
 }
 
 static inline void pqi_ctrl_block_scan(struct pqi_ctrl_info *ctrl_info)
@@ -7292,6 +7323,7 @@ static void pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
 		ctrl_info->unique_wwid_in_report_phys_lun_supported =
 			firmware_feature->enabled;
 		break;
+		pqi_save_fw_triage_setting(ctrl_info, firmware_feature->enabled);
 	}
 
 	pqi_firmware_feature_status(ctrl_info, firmware_feature);
@@ -7618,6 +7650,11 @@ static int pqi_ctrl_init(struct pqi_ctrl_info *ctrl_info)
 	u32 product_id;
 
 	if (reset_devices) {
+		if (pqi_is_fw_triage_supported(ctrl_info)) {
+			rc = sis_wait_for_fw_triage_completion(ctrl_info);
+			if (rc)
+				return rc;
+		}
 		sis_soft_reset(ctrl_info);
 		msleep(PQI_POST_RESET_DELAY_SECS * PQI_HZ);
 	} else {
diff --git a/drivers/scsi/smartpqi/smartpqi_sis.c b/drivers/scsi/smartpqi/smartpqi_sis.c
index d63c46a8e38b..8acd3a80f582 100644
--- a/drivers/scsi/smartpqi/smartpqi_sis.c
+++ b/drivers/scsi/smartpqi/smartpqi_sis.c
@@ -51,12 +51,20 @@
 #define SIS_BASE_STRUCT_REVISION		9
 #define SIS_BASE_STRUCT_ALIGNMENT		16
 
+#define SIS_CTRL_KERNEL_FW_TRIAGE		0x3
 #define SIS_CTRL_KERNEL_UP			0x80
 #define SIS_CTRL_KERNEL_PANIC			0x100
 #define SIS_CTRL_READY_TIMEOUT_SECS		180
 #define SIS_CTRL_READY_RESUME_TIMEOUT_SECS	90
 #define SIS_CTRL_READY_POLL_INTERVAL_MSECS	10
 
+enum sis_fw_triage_status {
+	FW_TRIAGE_NOT_STARTED = 0,
+	FW_TRIAGE_STARTED,
+	FW_TRIAGE_COND_INVALID,
+	FW_TRIAGE_COMPLETED
+};
+
 #pragma pack(1)
 
 /* for use with SIS_CMD_INIT_BASE_STRUCT_ADDRESS command */
@@ -419,12 +427,55 @@ u32 sis_read_driver_scratch(struct pqi_ctrl_info *ctrl_info)
 	return readl(&ctrl_info->registers->sis_driver_scratch);
 }
 
+static inline enum sis_fw_triage_status
+	sis_read_firmware_triage_status(struct pqi_ctrl_info *ctrl_info)
+{
+	return ((enum sis_fw_triage_status)(readl(&ctrl_info->registers->sis_firmware_status) &
+		SIS_CTRL_KERNEL_FW_TRIAGE));
+}
+
 void sis_soft_reset(struct pqi_ctrl_info *ctrl_info)
 {
 	writel(SIS_SOFT_RESET,
 		&ctrl_info->registers->sis_host_to_ctrl_doorbell);
 }
 
+#define SIS_FW_TRIAGE_STATUS_TIMEOUT_SECS		300
+#define SIS_FW_TRIAGE_STATUS_POLL_INTERVAL_SECS		1
+
+int sis_wait_for_fw_triage_completion(struct pqi_ctrl_info *ctrl_info)
+{
+	int rc;
+	enum sis_fw_triage_status status;
+	unsigned long timeout;
+
+	timeout = (SIS_FW_TRIAGE_STATUS_TIMEOUT_SECS * PQI_HZ) + jiffies;
+	while (1) {
+		status = sis_read_firmware_triage_status(ctrl_info);
+		if (status == FW_TRIAGE_COND_INVALID) {
+			dev_err(&ctrl_info->pci_dev->dev,
+				"firmware triage condition invalid\n");
+			rc = -EINVAL;
+			break;
+		} else if (status == FW_TRIAGE_NOT_STARTED ||
+			status == FW_TRIAGE_COMPLETED) {
+			rc = 0;
+			break;
+		}
+
+		if (time_after(jiffies, timeout)) {
+			dev_err(&ctrl_info->pci_dev->dev,
+				"timed out waiting for firmware triage status\n");
+			rc = -ETIMEDOUT;
+			break;
+		}
+
+		ssleep(SIS_FW_TRIAGE_STATUS_POLL_INTERVAL_SECS);
+	}
+
+	return rc;
+}
+
 static void __attribute__((unused)) verify_structures(void)
 {
 	BUILD_BUG_ON(offsetof(struct sis_base_struct,
diff --git a/drivers/scsi/smartpqi/smartpqi_sis.h b/drivers/scsi/smartpqi/smartpqi_sis.h
index d29c1352a826..c1db93054c86 100644
--- a/drivers/scsi/smartpqi/smartpqi_sis.h
+++ b/drivers/scsi/smartpqi/smartpqi_sis.h
@@ -28,5 +28,6 @@ void sis_write_driver_scratch(struct pqi_ctrl_info *ctrl_info, u32 value);
 u32 sis_read_driver_scratch(struct pqi_ctrl_info *ctrl_info);
 void sis_soft_reset(struct pqi_ctrl_info *ctrl_info);
 u32 sis_get_product_id(struct pqi_ctrl_info *ctrl_info);
+int sis_wait_for_fw_triage_completion(struct pqi_ctrl_info *ctrl_info);
 
 #endif	/* _SMARTPQI_SIS_H */
-- 
2.28.0.rc1.9.ge7ae437ac1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [smartpqi updates PATCH V2 03/11] smartpqi: capture controller reason codes
  2021-09-28 23:54 [smartpqi updates PATCH V2 00/11] smartpqi updates Don Brace
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 01/11] smartpqi: update device removal management Don Brace
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 02/11] smartpqi: add controller handshake during kdump Don Brace
@ 2021-09-28 23:54 ` Don Brace
  2021-09-30 18:22   ` john.p.donnelly
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 04/11] smartpqi: update LUN reset handler Don Brace
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Don Brace @ 2021-09-28 23:54 UTC (permalink / raw)
  To: hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, pmenzel, linux-kernel

From: Murthy Bhat <Murthy.Bhat@microchip.com>

Store controller reason codes in a controller register for
debugging purposes.

In some rare cases, the driver can halt the controller.
- Add in a reason code on why the controller was halted.
- Store this reason code in a controller register to aid
  in debugging the issue.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h      | 25 +++++++++++++++--
 drivers/scsi/smartpqi/smartpqi_init.c | 40 ++++++++++++++++++---------
 drivers/scsi/smartpqi/smartpqi_sis.c  |  9 ++++--
 drivers/scsi/smartpqi/smartpqi_sis.h  |  3 +-
 4 files changed, 57 insertions(+), 20 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index 70eca203d72f..d66863f8d1cf 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -82,9 +82,11 @@ struct pqi_ctrl_registers {
 	__le32  sis_product_identifier;			/* B4h */
 	u8	reserved5[0xbc - (0xb4 + sizeof(__le32))];
 	__le32	sis_firmware_status;			/* BCh */
-	u8	reserved6[0x1000 - (0xbc + sizeof(__le32))];
+	u8	reserved6[0xcc - (0xbc + sizeof(__le32))];
+	__le32	sis_ctrl_shutdown_reason_code;		/* CCh */
+	u8	reserved7[0x1000 - (0xcc + sizeof(__le32))];
 	__le32	sis_mailbox[8];				/* 1000h */
-	u8	reserved7[0x4000 - (0x1000 + (sizeof(__le32) * 8))];
+	u8	reserved8[0x4000 - (0x1000 + (sizeof(__le32) * 8))];
 	/*
 	 * The PQI spec states that the PQI registers should be at
 	 * offset 0 from the PCIe BAR 0.  However, we can't map
@@ -102,6 +104,21 @@ struct pqi_ctrl_registers {
 
 #define PQI_DEVICE_REGISTERS_OFFSET	0x4000
 
+/* shutdown reasons for taking the controller offline */
+enum pqi_ctrl_shutdown_reason {
+	PQI_IQ_NOT_DRAINED_TIMEOUT = 1,
+	PQI_LUN_RESET_TIMEOUT = 2,
+	PQI_IO_PENDING_POST_LUN_RESET_TIMEOUT = 3,
+	PQI_NO_HEARTBEAT = 4,
+	PQI_FIRMWARE_KERNEL_NOT_UP = 5,
+	PQI_OFA_RESPONSE_TIMEOUT = 6,
+	PQI_INVALID_REQ_ID = 7,
+	PQI_UNMATCHED_REQ_ID = 8,
+	PQI_IO_PI_OUT_OF_RANGE = 9,
+	PQI_EVENT_PI_OUT_OF_RANGE = 10,
+	PQI_UNEXPECTED_IU_TYPE = 11
+};
+
 enum pqi_io_path {
 	RAID_PATH = 0,
 	AIO_PATH = 1
@@ -850,7 +867,8 @@ struct pqi_config_table_firmware_features {
 #define PQI_FIRMWARE_FEATURE_TMF_IU_TIMEOUT			14
 #define PQI_FIRMWARE_FEATURE_RAID_BYPASS_ON_ENCRYPTED_NVME	15
 #define PQI_FIRMWARE_FEATURE_UNIQUE_WWID_IN_REPORT_PHYS_LUN	16
-#define PQI_FIRMWARE_FEATURE_MAXIMUM				16
+#define PQI_FIRMWARE_FEATURE_FW_TRIAGE				17
+#define PQI_FIRMWARE_FEATURE_MAXIMUM				17
 
 struct pqi_config_table_debug {
 	struct pqi_config_table_section_header header;
@@ -1297,6 +1315,7 @@ struct pqi_ctrl_info {
 	u8		raid_iu_timeout_supported : 1;
 	u8		tmf_iu_timeout_supported : 1;
 	u8		unique_wwid_in_report_phys_lun_supported : 1;
+	u8		firmware_triage_supported : 1;
 	u8		enable_r1_writes : 1;
 	u8		enable_r5_writes : 1;
 	u8		enable_r6_writes : 1;
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 5655d240f7a7..b6ac4d607664 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -54,7 +54,8 @@ MODULE_DESCRIPTION("Driver for Microchip Smart Family Controller version "
 MODULE_VERSION(DRIVER_VERSION);
 MODULE_LICENSE("GPL");
 
-static void pqi_take_ctrl_offline(struct pqi_ctrl_info *ctrl_info);
+static void pqi_take_ctrl_offline(struct pqi_ctrl_info *ctrl_info,
+	enum pqi_ctrl_shutdown_reason ctrl_shutdown_reason);
 static void pqi_ctrl_offline_worker(struct work_struct *work);
 static int pqi_scan_scsi_devices(struct pqi_ctrl_info *ctrl_info);
 static void pqi_scan_start(struct Scsi_Host *shost);
@@ -226,7 +227,7 @@ static inline void pqi_check_ctrl_health(struct pqi_ctrl_info *ctrl_info)
 {
 	if (ctrl_info->controller_online)
 		if (!sis_is_firmware_running(ctrl_info))
-			pqi_take_ctrl_offline(ctrl_info);
+			pqi_take_ctrl_offline(ctrl_info, PQI_FIRMWARE_KERNEL_NOT_UP);
 }
 
 static inline bool pqi_is_hba_lunid(u8 *scsi3addr)
@@ -3180,9 +3181,10 @@ static int pqi_interpret_task_management_response(struct pqi_ctrl_info *ctrl_inf
 	return rc;
 }
 
-static inline void pqi_invalid_response(struct pqi_ctrl_info *ctrl_info)
+static inline void pqi_invalid_response(struct pqi_ctrl_info *ctrl_info,
+	enum pqi_ctrl_shutdown_reason ctrl_shutdown_reason)
 {
-	pqi_take_ctrl_offline(ctrl_info);
+	pqi_take_ctrl_offline(ctrl_info, ctrl_shutdown_reason);
 }
 
 static int pqi_process_io_intr(struct pqi_ctrl_info *ctrl_info, struct pqi_queue_group *queue_group)
@@ -3200,7 +3202,7 @@ static int pqi_process_io_intr(struct pqi_ctrl_info *ctrl_info, struct pqi_queue
 	while (1) {
 		oq_pi = readl(queue_group->oq_pi);
 		if (oq_pi >= ctrl_info->num_elements_per_oq) {
-			pqi_invalid_response(ctrl_info);
+			pqi_invalid_response(ctrl_info, PQI_IO_PI_OUT_OF_RANGE);
 			dev_err(&ctrl_info->pci_dev->dev,
 				"I/O interrupt: producer index (%u) out of range (0-%u): consumer index: %u\n",
 				oq_pi, ctrl_info->num_elements_per_oq - 1, oq_ci);
@@ -3215,7 +3217,7 @@ static int pqi_process_io_intr(struct pqi_ctrl_info *ctrl_info, struct pqi_queue
 
 		request_id = get_unaligned_le16(&response->request_id);
 		if (request_id >= ctrl_info->max_io_slots) {
-			pqi_invalid_response(ctrl_info);
+			pqi_invalid_response(ctrl_info, PQI_INVALID_REQ_ID);
 			dev_err(&ctrl_info->pci_dev->dev,
 				"request ID in response (%u) out of range (0-%u): producer index: %u  consumer index: %u\n",
 				request_id, ctrl_info->max_io_slots - 1, oq_pi, oq_ci);
@@ -3224,7 +3226,7 @@ static int pqi_process_io_intr(struct pqi_ctrl_info *ctrl_info, struct pqi_queue
 
 		io_request = &ctrl_info->io_request_pool[request_id];
 		if (atomic_read(&io_request->refcount) == 0) {
-			pqi_invalid_response(ctrl_info);
+			pqi_invalid_response(ctrl_info, PQI_UNMATCHED_REQ_ID);
 			dev_err(&ctrl_info->pci_dev->dev,
 				"request ID in response (%u) does not match an outstanding I/O request: producer index: %u  consumer index: %u\n",
 				request_id, oq_pi, oq_ci);
@@ -3260,7 +3262,7 @@ static int pqi_process_io_intr(struct pqi_ctrl_info *ctrl_info, struct pqi_queue
 			pqi_process_io_error(response->header.iu_type, io_request);
 			break;
 		default:
-			pqi_invalid_response(ctrl_info);
+			pqi_invalid_response(ctrl_info, PQI_UNEXPECTED_IU_TYPE);
 			dev_err(&ctrl_info->pci_dev->dev,
 				"unexpected IU type: 0x%x: producer index: %u  consumer index: %u\n",
 				response->header.iu_type, oq_pi, oq_ci);
@@ -3442,7 +3444,7 @@ static void pqi_process_soft_reset(struct pqi_ctrl_info *ctrl_info)
 		pqi_ofa_free_host_buffer(ctrl_info);
 		pqi_ctrl_ofa_done(ctrl_info);
 		pqi_ofa_ctrl_unquiesce(ctrl_info);
-		pqi_take_ctrl_offline(ctrl_info);
+		pqi_take_ctrl_offline(ctrl_info, PQI_OFA_RESPONSE_TIMEOUT);
 		break;
 	}
 }
@@ -3567,7 +3569,7 @@ static void pqi_heartbeat_timer_handler(struct timer_list *t)
 			dev_err(&ctrl_info->pci_dev->dev,
 				"no heartbeat detected - last heartbeat count: %u\n",
 				heartbeat_count);
-			pqi_take_ctrl_offline(ctrl_info);
+			pqi_take_ctrl_offline(ctrl_info, PQI_NO_HEARTBEAT);
 			return;
 		}
 	} else {
@@ -3631,7 +3633,7 @@ static int pqi_process_event_intr(struct pqi_ctrl_info *ctrl_info)
 	while (1) {
 		oq_pi = readl(event_queue->oq_pi);
 		if (oq_pi >= PQI_NUM_EVENT_QUEUE_ELEMENTS) {
-			pqi_invalid_response(ctrl_info);
+			pqi_invalid_response(ctrl_info, PQI_EVENT_PI_OUT_OF_RANGE);
 			dev_err(&ctrl_info->pci_dev->dev,
 				"event interrupt: producer index (%u) out of range (0-%u): consumer index: %u\n",
 				oq_pi, PQI_NUM_EVENT_QUEUE_ELEMENTS - 1, oq_ci);
@@ -7323,7 +7325,10 @@ static void pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
 		ctrl_info->unique_wwid_in_report_phys_lun_supported =
 			firmware_feature->enabled;
 		break;
+	case PQI_FIRMWARE_FEATURE_FW_TRIAGE:
+		ctrl_info->firmware_triage_supported = firmware_feature->enabled;
 		pqi_save_fw_triage_setting(ctrl_info, firmware_feature->enabled);
+		break;
 	}
 
 	pqi_firmware_feature_status(ctrl_info, firmware_feature);
@@ -7419,6 +7424,11 @@ static struct pqi_firmware_feature pqi_firmware_features[] = {
 		.feature_bit = PQI_FIRMWARE_FEATURE_UNIQUE_WWID_IN_REPORT_PHYS_LUN,
 		.feature_status = pqi_ctrl_update_feature_flags,
 	},
+	{
+		.feature_name = "Firmware Triage",
+		.feature_bit = PQI_FIRMWARE_FEATURE_FW_TRIAGE,
+		.feature_status = pqi_ctrl_update_feature_flags,
+	},
 };
 
 static void pqi_process_firmware_features(
@@ -7519,6 +7529,7 @@ static void pqi_ctrl_reset_config(struct pqi_ctrl_info *ctrl_info)
 	ctrl_info->raid_iu_timeout_supported = false;
 	ctrl_info->tmf_iu_timeout_supported = false;
 	ctrl_info->unique_wwid_in_report_phys_lun_supported = false;
+	ctrl_info->firmware_triage_supported = false;
 }
 
 static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
@@ -8459,7 +8470,8 @@ static void pqi_ctrl_offline_worker(struct work_struct *work)
 	pqi_take_ctrl_offline_deferred(ctrl_info);
 }
 
-static void pqi_take_ctrl_offline(struct pqi_ctrl_info *ctrl_info)
+static void pqi_take_ctrl_offline(struct pqi_ctrl_info *ctrl_info,
+	enum pqi_ctrl_shutdown_reason ctrl_shutdown_reason)
 {
 	if (!ctrl_info->controller_online)
 		return;
@@ -8468,7 +8480,7 @@ static void pqi_take_ctrl_offline(struct pqi_ctrl_info *ctrl_info)
 	ctrl_info->pqi_mode_enabled = false;
 	pqi_ctrl_block_requests(ctrl_info);
 	if (!pqi_disable_ctrl_shutdown)
-		sis_shutdown_ctrl(ctrl_info);
+		sis_shutdown_ctrl(ctrl_info, ctrl_shutdown_reason);
 	pci_disable_device(ctrl_info->pci_dev);
 	dev_err(&ctrl_info->pci_dev->dev, "controller offline\n");
 	schedule_work(&ctrl_info->ctrl_offline_work);
@@ -9303,6 +9315,8 @@ static void __attribute__((unused)) verify_structures(void)
 		sis_product_identifier) != 0xb4);
 	BUILD_BUG_ON(offsetof(struct pqi_ctrl_registers,
 		sis_firmware_status) != 0xbc);
+	BUILD_BUG_ON(offsetof(struct pqi_ctrl_registers,
+		sis_ctrl_shutdown_reason_code) != 0xcc);
 	BUILD_BUG_ON(offsetof(struct pqi_ctrl_registers,
 		sis_mailbox) != 0x1000);
 	BUILD_BUG_ON(offsetof(struct pqi_ctrl_registers,
diff --git a/drivers/scsi/smartpqi/smartpqi_sis.c b/drivers/scsi/smartpqi/smartpqi_sis.c
index 8acd3a80f582..d66eb8ea161c 100644
--- a/drivers/scsi/smartpqi/smartpqi_sis.c
+++ b/drivers/scsi/smartpqi/smartpqi_sis.c
@@ -397,14 +397,17 @@ void sis_enable_intx(struct pqi_ctrl_info *ctrl_info)
 	sis_set_doorbell_bit(ctrl_info, SIS_ENABLE_INTX);
 }
 
-void sis_shutdown_ctrl(struct pqi_ctrl_info *ctrl_info)
+void sis_shutdown_ctrl(struct pqi_ctrl_info *ctrl_info,
+	enum pqi_ctrl_shutdown_reason ctrl_shutdown_reason)
 {
 	if (readl(&ctrl_info->registers->sis_firmware_status) &
 		SIS_CTRL_KERNEL_PANIC)
 		return;
 
-	writel(SIS_TRIGGER_SHUTDOWN,
-		&ctrl_info->registers->sis_host_to_ctrl_doorbell);
+	if (ctrl_info->firmware_triage_supported)
+		writel(ctrl_shutdown_reason, &ctrl_info->registers->sis_ctrl_shutdown_reason_code);
+
+	writel(SIS_TRIGGER_SHUTDOWN, &ctrl_info->registers->sis_host_to_ctrl_doorbell);
 }
 
 int sis_pqi_reset_quiesce(struct pqi_ctrl_info *ctrl_info)
diff --git a/drivers/scsi/smartpqi/smartpqi_sis.h b/drivers/scsi/smartpqi/smartpqi_sis.h
index c1db93054c86..bd92ff49f385 100644
--- a/drivers/scsi/smartpqi/smartpqi_sis.h
+++ b/drivers/scsi/smartpqi/smartpqi_sis.h
@@ -21,7 +21,8 @@ int sis_get_pqi_capabilities(struct pqi_ctrl_info *ctrl_info);
 int sis_init_base_struct_addr(struct pqi_ctrl_info *ctrl_info);
 void sis_enable_msix(struct pqi_ctrl_info *ctrl_info);
 void sis_enable_intx(struct pqi_ctrl_info *ctrl_info);
-void sis_shutdown_ctrl(struct pqi_ctrl_info *ctrl_info);
+void sis_shutdown_ctrl(struct pqi_ctrl_info *ctrl_info,
+	enum pqi_ctrl_shutdown_reason ctrl_shutdown_reason);
 int sis_pqi_reset_quiesce(struct pqi_ctrl_info *ctrl_info);
 int sis_reenable_sis_mode(struct pqi_ctrl_info *ctrl_info);
 void sis_write_driver_scratch(struct pqi_ctrl_info *ctrl_info, u32 value);
-- 
2.28.0.rc1.9.ge7ae437ac1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [smartpqi updates PATCH V2 04/11] smartpqi: update LUN reset handler
  2021-09-28 23:54 [smartpqi updates PATCH V2 00/11] smartpqi updates Don Brace
                   ` (2 preceding siblings ...)
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 03/11] smartpqi: capture controller reason codes Don Brace
@ 2021-09-28 23:54 ` Don Brace
  2021-09-30 18:22   ` john.p.donnelly
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 05/11] smartpqi: add tur check for sanitize operation Don Brace
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Don Brace @ 2021-09-28 23:54 UTC (permalink / raw)
  To: hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, pmenzel, linux-kernel

From: Kevin Barnett <kevin.barnett@microchip.com>

Enhance check for commands queued to the controller.
 - Add new function pqi_nonempty_inbound_queue_count that
   will wait for all I/O queued for submission to controller
   across all queue groups to drain. Add helper functions
   to obtain queue command counts for each queue group.
   These queues should drain quickly as they are already staged
   to be submitted down to the controller's IB queue.
Enhance check for outstanding command completion.
 - Update the count of outstanding commands while waiting.
   This value was not re-obtained and was potentially causing
   infinite wait for all completions.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c | 105 ++++++++++++++++----------
 1 file changed, 66 insertions(+), 39 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index b6ac4d607664..01330fd67500 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -5799,64 +5799,91 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
 	return rc;
 }
 
-static int pqi_wait_until_queued_io_drained(struct pqi_ctrl_info *ctrl_info,
-	struct pqi_queue_group *queue_group)
+static unsigned int pqi_queued_io_count(struct pqi_ctrl_info *ctrl_info)
 {
+	unsigned int i;
 	unsigned int path;
 	unsigned long flags;
-	bool list_is_empty;
+	unsigned int queued_io_count;
+	struct pqi_queue_group *queue_group;
+	struct pqi_io_request *io_request;
 
-	for (path = 0; path < 2; path++) {
-		while (1) {
-			spin_lock_irqsave(
-				&queue_group->submit_lock[path], flags);
-			list_is_empty =
-				list_empty(&queue_group->request_list[path]);
-			spin_unlock_irqrestore(
-				&queue_group->submit_lock[path], flags);
-			if (list_is_empty)
-				break;
-			pqi_check_ctrl_health(ctrl_info);
-			if (pqi_ctrl_offline(ctrl_info))
-				return -ENXIO;
-			usleep_range(1000, 2000);
+	queued_io_count = 0;
+
+	for (i = 0; i < ctrl_info->num_queue_groups; i++) {
+		queue_group = &ctrl_info->queue_groups[i];
+		for (path = 0; path < 2; path++) {
+			spin_lock_irqsave(&queue_group->submit_lock[path], flags);
+			list_for_each_entry(io_request, &queue_group->request_list[path], request_list_entry)
+				queued_io_count++;
+			spin_unlock_irqrestore(&queue_group->submit_lock[path], flags);
 		}
 	}
 
-	return 0;
+	return queued_io_count;
 }
 
-static int pqi_wait_until_inbound_queues_empty(struct pqi_ctrl_info *ctrl_info)
+static unsigned int pqi_nonempty_inbound_queue_count(struct pqi_ctrl_info *ctrl_info)
 {
-	int rc;
 	unsigned int i;
 	unsigned int path;
+	unsigned int nonempty_inbound_queue_count;
 	struct pqi_queue_group *queue_group;
 	pqi_index_t iq_pi;
 	pqi_index_t iq_ci;
 
+	nonempty_inbound_queue_count = 0;
+
 	for (i = 0; i < ctrl_info->num_queue_groups; i++) {
 		queue_group = &ctrl_info->queue_groups[i];
-
-		rc = pqi_wait_until_queued_io_drained(ctrl_info, queue_group);
-		if (rc)
-			return rc;
-
 		for (path = 0; path < 2; path++) {
 			iq_pi = queue_group->iq_pi_copy[path];
+			iq_ci = readl(queue_group->iq_ci[path]);
+			if (iq_ci != iq_pi)
+				nonempty_inbound_queue_count++;
+		}
+	}
 
-			while (1) {
-				iq_ci = readl(queue_group->iq_ci[path]);
-				if (iq_ci == iq_pi)
-					break;
-				pqi_check_ctrl_health(ctrl_info);
-				if (pqi_ctrl_offline(ctrl_info))
-					return -ENXIO;
-				usleep_range(1000, 2000);
-			}
+	return nonempty_inbound_queue_count;
+}
+
+#define PQI_INBOUND_QUEUES_NONEMPTY_WARNING_TIMEOUT_SECS	10
+
+static int pqi_wait_until_inbound_queues_empty(struct pqi_ctrl_info *ctrl_info)
+{
+	unsigned long start_jiffies;
+	unsigned long warning_timeout;
+	unsigned int queued_io_count;
+	unsigned int nonempty_inbound_queue_count;
+	bool displayed_warning;
+
+	displayed_warning = false;
+	start_jiffies = jiffies;
+	warning_timeout = (PQI_INBOUND_QUEUES_NONEMPTY_WARNING_TIMEOUT_SECS * PQI_HZ) + start_jiffies;
+
+	while (1) {
+		queued_io_count = pqi_queued_io_count(ctrl_info);
+		nonempty_inbound_queue_count = pqi_nonempty_inbound_queue_count(ctrl_info);
+		if (queued_io_count == 0 && nonempty_inbound_queue_count == 0)
+			break;
+		pqi_check_ctrl_health(ctrl_info);
+		if (pqi_ctrl_offline(ctrl_info))
+			return -ENXIO;
+		if (time_after(jiffies, warning_timeout)) {
+			dev_warn(&ctrl_info->pci_dev->dev,
+				"waiting %u seconds for queued I/O to drain (queued I/O count: %u; non-empty inbound queue count: %u)\n",
+				jiffies_to_msecs(jiffies - start_jiffies) / 1000, queued_io_count, nonempty_inbound_queue_count);
+			displayed_warning = true;
+			warning_timeout = (PQI_INBOUND_QUEUES_NONEMPTY_WARNING_TIMEOUT_SECS * PQI_HZ) + jiffies;
 		}
+		usleep_range(1000, 2000);
 	}
 
+	if (displayed_warning)
+		dev_warn(&ctrl_info->pci_dev->dev,
+			"queued I/O drained after waiting for %u seconds\n",
+			jiffies_to_msecs(jiffies - start_jiffies) / 1000);
+
 	return 0;
 }
 
@@ -5922,7 +5949,7 @@ static int pqi_device_wait_for_pending_io(struct pqi_ctrl_info *ctrl_info,
 		if (pqi_ctrl_offline(ctrl_info))
 			return -ENXIO;
 		msecs_waiting = jiffies_to_msecs(jiffies - start_jiffies);
-		if (msecs_waiting > timeout_msecs) {
+		if (msecs_waiting >= timeout_msecs) {
 			dev_err(&ctrl_info->pci_dev->dev,
 				"scsi %d:%d:%d:%d: timed out after %lu seconds waiting for %d outstanding command(s)\n",
 				ctrl_info->scsi_host->host_no, device->bus, device->target,
@@ -5957,6 +5984,7 @@ static int pqi_wait_for_lun_reset_completion(struct pqi_ctrl_info *ctrl_info,
 {
 	int rc;
 	unsigned int wait_secs;
+	int cmds_outstanding;
 
 	wait_secs = 0;
 
@@ -5974,11 +6002,10 @@ static int pqi_wait_for_lun_reset_completion(struct pqi_ctrl_info *ctrl_info,
 		}
 
 		wait_secs += PQI_LUN_RESET_POLL_COMPLETION_SECS;
-
+		cmds_outstanding = atomic_read(&device->scsi_cmds_outstanding);
 		dev_warn(&ctrl_info->pci_dev->dev,
-			"scsi %d:%d:%d:%d: waiting %u seconds for LUN reset to complete\n",
-			ctrl_info->scsi_host->host_no, device->bus, device->target, device->lun,
-			wait_secs);
+			"scsi %d:%d:%d:%d: waiting %u seconds for LUN reset to complete (%d command(s) outstanding)\n",
+			ctrl_info->scsi_host->host_no, device->bus, device->target, device->lun, wait_secs, cmds_outstanding);
 	}
 
 	return rc;
-- 
2.28.0.rc1.9.ge7ae437ac1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [smartpqi updates PATCH V2 05/11] smartpqi: add tur check for sanitize operation
  2021-09-28 23:54 [smartpqi updates PATCH V2 00/11] smartpqi updates Don Brace
                   ` (3 preceding siblings ...)
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 04/11] smartpqi: update LUN reset handler Don Brace
@ 2021-09-28 23:54 ` Don Brace
  2021-09-29  7:56   ` Paul Menzel
  2021-09-30 18:23   ` john.p.donnelly
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 06/11] smartpqi: avoid failing ios for offline devices Don Brace
                   ` (7 subsequent siblings)
  12 siblings, 2 replies; 33+ messages in thread
From: Don Brace @ 2021-09-28 23:54 UTC (permalink / raw)
  To: hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, pmenzel, linux-kernel

Add in a TUR to HBA disks and do not present them to the OS if
0x02/0x04/0x1b (sanitize in progress) is returned.

During boot-up, some OSes appear to hang when there are one or
more disks undergoing sanitize.

According to SCSI SBC4 specification
section 4.11.2 Commands allowed during sanitize,
some SCSI commands are permitted, but read/write operations are not.

When the OS attempts to read the disk partition table a
CHECK CONDITION ASC 0x04 ASCQ 0x1b is returned which causes the OS
to retry the read until sanitize has completed. This can take hours.

Note: According to document HPE Smart Storage Administrator User Guide
Link: https://support.hpe.com/hpesc/public/docDisplay?docLocale=en_US&docId=c03909334

During the sanitize erase operation, the drive is unusable.
I.E.
      The expected behavior for sanitize is the that disk remains
      offline even after sanitize has completed. The customer is
      expected to re-enable the disk using the management utility.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c | 87 +++++++++++++++++++++++++++
 1 file changed, 87 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 01330fd67500..838274d8fadf 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -555,6 +555,10 @@ static int pqi_build_raid_path_request(struct pqi_ctrl_info *ctrl_info,
 	cdb = request->cdb;
 
 	switch (cmd) {
+	case TEST_UNIT_READY:
+		request->data_direction = SOP_READ_FLAG;
+		cdb[0] = TEST_UNIT_READY;
+		break;
 	case INQUIRY:
 		request->data_direction = SOP_READ_FLAG;
 		cdb[0] = INQUIRY;
@@ -1575,6 +1579,85 @@ static int pqi_get_logical_device_info(struct pqi_ctrl_info *ctrl_info,
 	return rc;
 }
 
+/*
+ * Prevent adding drive to OS for some corner cases such as a drive
+ * undergoing a sanitize operation. Some OSes will continue to poll
+ * the drive until the sanitize completes, which can take hours,
+ * resulting in long bootup delays. Commands such as TUR, READ_CAP
+ * are allowed, but READ/WRITE cause check condition. So the OS
+ * cannot check/read the partition table.
+ * Note: devices that have completed sanitize must be re-enabled
+ *       using the management utility.
+ */
+static bool pqi_keep_device_offline(struct pqi_ctrl_info *ctrl_info,
+	struct pqi_scsi_dev *device)
+{
+	u8 scsi_status;
+	int rc;
+	enum dma_data_direction dir;
+	char *buffer;
+	int buffer_length = 64;
+	size_t sense_data_length;
+	struct scsi_sense_hdr sshdr;
+	struct pqi_raid_path_request request;
+	struct pqi_raid_error_info error_info;
+	bool offline = false; /* Assume keep online */
+
+	/* Do not check controllers. */
+	if (pqi_is_hba_lunid(device->scsi3addr))
+		return false;
+
+	/* Do not check LVs. */
+	if (pqi_is_logical_device(device))
+		return false;
+
+	buffer = kmalloc(buffer_length, GFP_KERNEL);
+	if (!buffer)
+		return false; /* Assume not offline */
+
+	/* Check for SANITIZE in progress using TUR */
+	rc = pqi_build_raid_path_request(ctrl_info, &request,
+		TEST_UNIT_READY, RAID_CTLR_LUNID, buffer,
+		buffer_length, 0, &dir);
+	if (rc)
+		goto out; /* Assume not offline */
+
+	memcpy(request.lun_number, device->scsi3addr, sizeof(request.lun_number));
+
+	rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header, 0, &error_info);
+
+	if (rc)
+		goto out; /* Assume not offline */
+
+	scsi_status = error_info.status;
+	sense_data_length = get_unaligned_le16(&error_info.sense_data_length);
+	if (sense_data_length == 0)
+		sense_data_length =
+			get_unaligned_le16(&error_info.response_data_length);
+	if (sense_data_length) {
+		if (sense_data_length > sizeof(error_info.data))
+			sense_data_length = sizeof(error_info.data);
+
+		/*
+		 * Check for sanitize in progress: asc:0x04, ascq: 0x1b
+		 */
+		if (scsi_status == SAM_STAT_CHECK_CONDITION &&
+			scsi_normalize_sense(error_info.data,
+				sense_data_length, &sshdr) &&
+				sshdr.sense_key == NOT_READY &&
+				sshdr.asc == 0x04 &&
+				sshdr.ascq == 0x1b) {
+			device->device_offline = true;
+			offline = true;
+			goto out; /* Keep device offline */
+		}
+	}
+
+out:
+	kfree(buffer);
+	return offline;
+}
+
 static int pqi_get_device_info(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_scsi_dev *device,
 	struct bmic_identify_physical_device *id_phys)
@@ -2296,6 +2379,10 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 		if (!pqi_is_supported_device(device))
 			continue;
 
+		/* Do not present disks that the OS cannot fully probe */
+		if (pqi_keep_device_offline(ctrl_info, device))
+			continue;
+
 		/* Gather information about the device. */
 		rc = pqi_get_device_info(ctrl_info, device, id_phys);
 		if (rc == -ENOMEM) {
-- 
2.28.0.rc1.9.ge7ae437ac1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [smartpqi updates PATCH V2 06/11] smartpqi: avoid failing ios for offline devices
  2021-09-28 23:54 [smartpqi updates PATCH V2 00/11] smartpqi updates Don Brace
                   ` (4 preceding siblings ...)
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 05/11] smartpqi: add tur check for sanitize operation Don Brace
@ 2021-09-28 23:54 ` Don Brace
  2021-09-30 18:23   ` john.p.donnelly
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 07/11] smartpqi: add extended report physical luns Don Brace
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Don Brace @ 2021-09-28 23:54 UTC (permalink / raw)
  To: hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, pmenzel, linux-kernel

From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>

Prevent kernel crash by failing outstanding I/O request
when the OS takes device offline.

When posted IOs to the controller's inbound queue are
not picked by the controller, the driver will halt the
controller and take the controller offline.

When the driver takes the controller offline,
the driver will fail all the outstanding requests which
can sometime lead to an OS crash.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 838274d8fadf..c9f2a3d54663 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -8544,6 +8544,7 @@ static void pqi_fail_all_outstanding_requests(struct pqi_ctrl_info *ctrl_info)
 	unsigned int i;
 	struct pqi_io_request *io_request;
 	struct scsi_cmnd *scmd;
+	struct scsi_device *sdev;
 
 	for (i = 0; i < ctrl_info->max_io_slots; i++) {
 		io_request = &ctrl_info->io_request_pool[i];
@@ -8552,7 +8553,13 @@ static void pqi_fail_all_outstanding_requests(struct pqi_ctrl_info *ctrl_info)
 
 		scmd = io_request->scmd;
 		if (scmd) {
-			set_host_byte(scmd, DID_NO_CONNECT);
+			sdev = scmd->device;
+			if (!sdev || !scsi_device_online(sdev)) {
+				pqi_free_io_request(io_request);
+				continue;
+			} else {
+				set_host_byte(scmd, DID_NO_CONNECT);
+			}
 		} else {
 			io_request->status = -ENXIO;
 			io_request->error_info =
-- 
2.28.0.rc1.9.ge7ae437ac1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [smartpqi updates PATCH V2 07/11] smartpqi: add extended report physical luns
  2021-09-28 23:54 [smartpqi updates PATCH V2 00/11] smartpqi updates Don Brace
                   ` (5 preceding siblings ...)
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 06/11] smartpqi: avoid failing ios for offline devices Don Brace
@ 2021-09-28 23:54 ` Don Brace
  2021-09-30 18:23   ` john.p.donnelly
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 08/11] smartpqi: fix boot failure during lun rebuild Don Brace
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Don Brace @ 2021-09-28 23:54 UTC (permalink / raw)
  To: hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, pmenzel, linux-kernel

From: Mike McGowen <Mike.McGowen@microchip.com>

Add support for the new extended formats in
the data returned from the Report Physical LUNs
command for controllers that enable this feature.

The new formats allow the reporting of 16-byte WWIDs.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h              |  37 +++-
 drivers/scsi/smartpqi/smartpqi_init.c         | 163 +++++++++++++-----
 .../scsi/smartpqi/smartpqi_sas_transport.c    |   6 +-
 3 files changed, 147 insertions(+), 59 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index d66863f8d1cf..c439583a4ca5 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -868,7 +868,8 @@ struct pqi_config_table_firmware_features {
 #define PQI_FIRMWARE_FEATURE_RAID_BYPASS_ON_ENCRYPTED_NVME	15
 #define PQI_FIRMWARE_FEATURE_UNIQUE_WWID_IN_REPORT_PHYS_LUN	16
 #define PQI_FIRMWARE_FEATURE_FW_TRIAGE				17
-#define PQI_FIRMWARE_FEATURE_MAXIMUM				17
+#define PQI_FIRMWARE_FEATURE_RPL_EXTENDED_FORMAT_4_5		18
+#define PQI_FIRMWARE_FEATURE_MAXIMUM				18
 
 struct pqi_config_table_debug {
 	struct pqi_config_table_section_header header;
@@ -943,19 +944,21 @@ struct report_lun_header {
 #define CISS_REPORT_LOG_FLAG_QUEUE_DEPTH	(1 << 5)
 #define CISS_REPORT_LOG_FLAG_DRIVE_TYPE_MIX	(1 << 6)
 
-#define CISS_REPORT_PHYS_FLAG_OTHER		(1 << 1)
+#define CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_2		0x2
+#define CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_4		0x4
+#define CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_MASK	0xf
 
-struct report_log_lun_extended_entry {
+struct report_log_lun {
 	u8	lunid[8];
 	u8	volume_id[16];
 };
 
-struct report_log_lun_extended {
+struct report_log_lun_list {
 	struct report_lun_header header;
-	struct report_log_lun_extended_entry lun_entries[1];
+	struct report_log_lun lun_entries[1];
 };
 
-struct report_phys_lun_extended_entry {
+struct report_phys_lun_8byte_wwid {
 	u8	lunid[8];
 	__be64	wwid;
 	u8	device_type;
@@ -965,12 +968,27 @@ struct report_phys_lun_extended_entry {
 	u32	aio_handle;
 };
 
+struct report_phys_lun_16byte_wwid {
+	u8	lunid[8];
+	u8	wwid[16];
+	u8	device_type;
+	u8	device_flags;
+	u8	lun_count;	/* number of LUNs in a multi-LUN device */
+	u8	redundant_paths;
+	u32	aio_handle;
+};
+
 /* for device_flags field of struct report_phys_lun_extended_entry */
 #define CISS_REPORT_PHYS_DEV_FLAG_AIO_ENABLED	0x8
 
-struct report_phys_lun_extended {
+struct report_phys_lun_8byte_wwid_list {
+	struct report_lun_header header;
+	struct report_phys_lun_8byte_wwid lun_entries[1];
+};
+
+struct report_phys_lun_16byte_wwid_list {
 	struct report_lun_header header;
-	struct report_phys_lun_extended_entry lun_entries[1];
+	struct report_phys_lun_16byte_wwid lun_entries[1];
 };
 
 struct raid_map_disk_data {
@@ -1077,7 +1095,7 @@ struct pqi_scsi_dev {
 	int	target;
 	int	lun;
 	u8	scsi3addr[8];
-	__be64	wwid;
+	u8	wwid[16];
 	u8	volume_id[16];
 	u8	is_physical_device : 1;
 	u8	is_external_raid_device : 1;
@@ -1316,6 +1334,7 @@ struct pqi_ctrl_info {
 	u8		tmf_iu_timeout_supported : 1;
 	u8		unique_wwid_in_report_phys_lun_supported : 1;
 	u8		firmware_triage_supported : 1;
+	u8		rpl_extended_format_4_5_supported : 1;
 	u8		enable_r1_writes : 1;
 	u8		enable_r5_writes : 1;
 	u8		enable_r6_writes : 1;
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index c9f2a3d54663..1e27e6ba0159 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -572,10 +572,14 @@ static int pqi_build_raid_path_request(struct pqi_ctrl_info *ctrl_info,
 	case CISS_REPORT_PHYS:
 		request->data_direction = SOP_READ_FLAG;
 		cdb[0] = cmd;
-		if (cmd == CISS_REPORT_PHYS)
-			cdb[1] = CISS_REPORT_PHYS_FLAG_OTHER;
-		else
+		if (cmd == CISS_REPORT_PHYS) {
+			if (ctrl_info->rpl_extended_format_4_5_supported)
+				cdb[1] = CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_4;
+			else
+				cdb[1] = CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_2;
+		} else {
 			cdb[1] = ctrl_info->ciss_report_log_flags;
+		}
 		put_unaligned_be32(cdb_length, &cdb[6]);
 		break;
 	case CISS_GET_RAID_MAP:
@@ -1132,7 +1136,64 @@ static int pqi_report_phys_logical_luns(struct pqi_ctrl_info *ctrl_info, u8 cmd,
 
 static inline int pqi_report_phys_luns(struct pqi_ctrl_info *ctrl_info, void **buffer)
 {
-	return pqi_report_phys_logical_luns(ctrl_info, CISS_REPORT_PHYS, buffer);
+	int rc;
+	unsigned int i;
+	u8 rpl_response_format;
+	u32 num_physicals;
+	size_t rpl_16byte_wwid_list_length;
+	void *rpl_list;
+	struct report_lun_header *rpl_header;
+	struct report_phys_lun_8byte_wwid_list *rpl_8byte_wwid_list;
+	struct report_phys_lun_16byte_wwid_list *rpl_16byte_wwid_list;
+
+	rc = pqi_report_phys_logical_luns(ctrl_info, CISS_REPORT_PHYS, &rpl_list);
+	if (rc)
+		return rc;
+
+	if (ctrl_info->rpl_extended_format_4_5_supported) {
+		rpl_header = rpl_list;
+		rpl_response_format = rpl_header->flags & CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_MASK;
+		if (rpl_response_format == CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_4) {
+			*buffer = rpl_list;
+			return 0;
+		} else if (rpl_response_format != CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_2) {
+			dev_err(&ctrl_info->pci_dev->dev,
+				"RPL returned unsupported data format %u\n",
+				rpl_response_format);
+			return -EINVAL;
+		} else {
+			dev_warn(&ctrl_info->pci_dev->dev,
+				"RPL returned extended format 2 instead of 4\n");
+		}
+	}
+
+	rpl_8byte_wwid_list = rpl_list;
+	num_physicals = get_unaligned_be32(&rpl_8byte_wwid_list->header.list_length) / sizeof(rpl_8byte_wwid_list->lun_entries[0]);
+	rpl_16byte_wwid_list_length = sizeof(struct report_lun_header) + (num_physicals * sizeof(struct report_phys_lun_16byte_wwid));
+
+	rpl_16byte_wwid_list = kmalloc(rpl_16byte_wwid_list_length, GFP_KERNEL);
+	if (!rpl_16byte_wwid_list)
+		return -ENOMEM;
+
+	put_unaligned_be32(num_physicals * sizeof(struct report_phys_lun_16byte_wwid),
+		&rpl_16byte_wwid_list->header.list_length);
+	rpl_16byte_wwid_list->header.flags = rpl_8byte_wwid_list->header.flags;
+
+	for (i = 0; i < num_physicals; i++) {
+		memcpy(&rpl_16byte_wwid_list->lun_entries[i].lunid, &rpl_8byte_wwid_list->lun_entries[i].lunid, sizeof(rpl_8byte_wwid_list->lun_entries[i].lunid));
+		memset(&rpl_16byte_wwid_list->lun_entries[i].wwid, 0, 8);
+		memcpy(&rpl_16byte_wwid_list->lun_entries[i].wwid[8], &rpl_8byte_wwid_list->lun_entries[i].wwid, sizeof(rpl_8byte_wwid_list->lun_entries[i].wwid));
+		rpl_16byte_wwid_list->lun_entries[i].device_type = rpl_8byte_wwid_list->lun_entries[i].device_type;
+		rpl_16byte_wwid_list->lun_entries[i].device_flags = rpl_8byte_wwid_list->lun_entries[i].device_flags;
+		rpl_16byte_wwid_list->lun_entries[i].lun_count = rpl_8byte_wwid_list->lun_entries[i].lun_count;
+		rpl_16byte_wwid_list->lun_entries[i].redundant_paths = rpl_8byte_wwid_list->lun_entries[i].redundant_paths;
+		rpl_16byte_wwid_list->lun_entries[i].aio_handle = rpl_8byte_wwid_list->lun_entries[i].aio_handle;
+	}
+
+	kfree(rpl_8byte_wwid_list);
+	*buffer = rpl_16byte_wwid_list;
+
+	return 0;
 }
 
 static inline int pqi_report_logical_luns(struct pqi_ctrl_info *ctrl_info, void **buffer)
@@ -1141,14 +1202,14 @@ static inline int pqi_report_logical_luns(struct pqi_ctrl_info *ctrl_info, void
 }
 
 static int pqi_get_device_lists(struct pqi_ctrl_info *ctrl_info,
-	struct report_phys_lun_extended **physdev_list,
-	struct report_log_lun_extended **logdev_list)
+	struct report_phys_lun_16byte_wwid_list **physdev_list,
+	struct report_log_lun_list **logdev_list)
 {
 	int rc;
 	size_t logdev_list_length;
 	size_t logdev_data_length;
-	struct report_log_lun_extended *internal_logdev_list;
-	struct report_log_lun_extended *logdev_data;
+	struct report_log_lun_list *internal_logdev_list;
+	struct report_log_lun_list *logdev_data;
 	struct report_lun_header report_lun_header;
 
 	rc = pqi_report_phys_luns(ctrl_info, (void **)physdev_list);
@@ -1173,7 +1234,7 @@ static int pqi_get_device_lists(struct pqi_ctrl_info *ctrl_info,
 	} else {
 		memset(&report_lun_header, 0, sizeof(report_lun_header));
 		logdev_data =
-			(struct report_log_lun_extended *)&report_lun_header;
+			(struct report_log_lun_list *)&report_lun_header;
 		logdev_list_length = 0;
 	}
 
@@ -1181,7 +1242,7 @@ static int pqi_get_device_lists(struct pqi_ctrl_info *ctrl_info,
 		logdev_list_length;
 
 	internal_logdev_list = kmalloc(logdev_data_length +
-		sizeof(struct report_log_lun_extended), GFP_KERNEL);
+		sizeof(struct report_log_lun), GFP_KERNEL);
 	if (!internal_logdev_list) {
 		kfree(*logdev_list);
 		*logdev_list = NULL;
@@ -1190,9 +1251,9 @@ static int pqi_get_device_lists(struct pqi_ctrl_info *ctrl_info,
 
 	memcpy(internal_logdev_list, logdev_data, logdev_data_length);
 	memset((u8 *)internal_logdev_list + logdev_data_length, 0,
-		sizeof(struct report_log_lun_extended_entry));
+		sizeof(struct report_log_lun));
 	put_unaligned_be32(logdev_list_length +
-		sizeof(struct report_log_lun_extended_entry),
+		sizeof(struct report_log_lun),
 		&internal_logdev_list->header.list_length);
 
 	kfree(*logdev_list);
@@ -1845,7 +1906,7 @@ static inline bool pqi_device_equal(struct pqi_scsi_dev *dev1, struct pqi_scsi_d
 		return false;
 
 	if (dev1->is_physical_device)
-		return dev1->wwid == dev2->wwid;
+		return memcmp(dev1->wwid, dev2->wwid, sizeof(dev1->wwid)) == 0;
 
 	return memcmp(dev1->volume_id, dev2->volume_id, sizeof(dev1->volume_id)) == 0;
 }
@@ -1915,7 +1976,9 @@ static void pqi_dev_info(struct pqi_ctrl_info *ctrl_info,
 	else
 		count += scnprintf(buffer + count,
 			PQI_DEV_INFO_BUFFER_LENGTH - count,
-			" %016llx", device->sas_address);
+			" %016llx%016llx",
+			get_unaligned_be64(&device->wwid[0]),
+			get_unaligned_be64(&device->wwid[8]));
 
 	count += scnprintf(buffer + count, PQI_DEV_INFO_BUFFER_LENGTH - count,
 		" %s %.8s %.16s ",
@@ -2229,13 +2292,14 @@ static inline bool pqi_expose_device(struct pqi_scsi_dev *device)
 }
 
 static inline void pqi_set_physical_device_wwid(struct pqi_ctrl_info *ctrl_info,
-	struct pqi_scsi_dev *device, struct report_phys_lun_extended_entry *phys_lun_ext_entry)
+	struct pqi_scsi_dev *device, struct report_phys_lun_16byte_wwid *phys_lun)
 {
 	if (ctrl_info->unique_wwid_in_report_phys_lun_supported ||
+		ctrl_info->rpl_extended_format_4_5_supported ||
 		pqi_is_device_with_sas_address(device))
-		device->wwid = phys_lun_ext_entry->wwid;
+		memcpy(device->wwid, phys_lun->wwid, sizeof(device->wwid));
 	else
-		device->wwid = cpu_to_be64(get_unaligned_be64(&device->page_83_identifier));
+		memcpy(&device->wwid[8], device->page_83_identifier, 8);
 }
 
 static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
@@ -2243,10 +2307,10 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 	int i;
 	int rc;
 	LIST_HEAD(new_device_list_head);
-	struct report_phys_lun_extended *physdev_list = NULL;
-	struct report_log_lun_extended *logdev_list = NULL;
-	struct report_phys_lun_extended_entry *phys_lun_ext_entry;
-	struct report_log_lun_extended_entry *log_lun_ext_entry;
+	struct report_phys_lun_16byte_wwid_list *physdev_list = NULL;
+	struct report_log_lun_list *logdev_list = NULL;
+	struct report_phys_lun_16byte_wwid *phys_lun;
+	struct report_log_lun *log_lun;
 	struct bmic_identify_physical_device *id_phys = NULL;
 	u32 num_physicals;
 	u32 num_logicals;
@@ -2297,10 +2361,9 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 
 		if (pqi_hide_vsep) {
 			for (i = num_physicals - 1; i >= 0; i--) {
-				phys_lun_ext_entry =
-						&physdev_list->lun_entries[i];
-				if (CISS_GET_DRIVE_NUMBER(phys_lun_ext_entry->lunid) == PQI_VSEP_CISS_BTL) {
-					pqi_mask_device(phys_lun_ext_entry->lunid);
+				phys_lun = &physdev_list->lun_entries[i];
+				if (CISS_GET_DRIVE_NUMBER(phys_lun->lunid) == PQI_VSEP_CISS_BTL) {
+					pqi_mask_device(phys_lun->lunid);
 					break;
 				}
 			}
@@ -2344,16 +2407,14 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 		if ((!pqi_expose_ld_first && i < num_physicals) ||
 			(pqi_expose_ld_first && i >= num_logicals)) {
 			is_physical_device = true;
-			phys_lun_ext_entry =
-				&physdev_list->lun_entries[physical_index++];
-			log_lun_ext_entry = NULL;
-			scsi3addr = phys_lun_ext_entry->lunid;
+			phys_lun = &physdev_list->lun_entries[physical_index++];
+			log_lun = NULL;
+			scsi3addr = phys_lun->lunid;
 		} else {
 			is_physical_device = false;
-			phys_lun_ext_entry = NULL;
-			log_lun_ext_entry =
-				&logdev_list->lun_entries[logical_index++];
-			scsi3addr = log_lun_ext_entry->lunid;
+			phys_lun = NULL;
+			log_lun = &logdev_list->lun_entries[logical_index++];
+			scsi3addr = log_lun->lunid;
 		}
 
 		if (is_physical_device && pqi_skip_device(scsi3addr))
@@ -2368,7 +2429,7 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 		memcpy(device->scsi3addr, scsi3addr, sizeof(device->scsi3addr));
 		device->is_physical_device = is_physical_device;
 		if (is_physical_device) {
-			device->device_type = phys_lun_ext_entry->device_type;
+			device->device_type = phys_lun->device_type;
 			if (device->device_type == SA_DEVICE_TYPE_EXPANDER_SMP)
 				device->is_expander_smp_device = true;
 		} else {
@@ -2393,8 +2454,9 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 		if (rc) {
 			if (device->is_physical_device)
 				dev_warn(&ctrl_info->pci_dev->dev,
-					"obtaining device info failed, skipping physical device %016llx\n",
-					get_unaligned_be64(&phys_lun_ext_entry->wwid));
+					"obtaining device info failed, skipping physical device %016llx%016llx\n",
+					get_unaligned_be64(&phys_lun->wwid[0]),
+					get_unaligned_be64(&phys_lun->wwid[8]));
 			else
 				dev_warn(&ctrl_info->pci_dev->dev,
 					"obtaining device info failed, skipping logical device %08x%08x\n",
@@ -2407,21 +2469,21 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 		pqi_assign_bus_target_lun(device);
 
 		if (device->is_physical_device) {
-			pqi_set_physical_device_wwid(ctrl_info, device, phys_lun_ext_entry);
-			if ((phys_lun_ext_entry->device_flags &
+			pqi_set_physical_device_wwid(ctrl_info, device, phys_lun);
+			if ((phys_lun->device_flags &
 				CISS_REPORT_PHYS_DEV_FLAG_AIO_ENABLED) &&
-				phys_lun_ext_entry->aio_handle) {
+				phys_lun->aio_handle) {
 					device->aio_enabled = true;
 					device->aio_handle =
-						phys_lun_ext_entry->aio_handle;
+						phys_lun->aio_handle;
 			}
 		} else {
-			memcpy(device->volume_id, log_lun_ext_entry->volume_id,
+			memcpy(device->volume_id, log_lun->volume_id,
 				sizeof(device->volume_id));
 		}
 
 		if (pqi_is_device_with_sas_address(device))
-			device->sas_address = get_unaligned_be64(&device->wwid);
+			device->sas_address = get_unaligned_be64(&device->wwid[8]);
 
 		new_device_list[num_valid_devices++] = device;
 	}
@@ -6804,12 +6866,10 @@ static ssize_t pqi_unique_id_show(struct device *dev,
 		return -ENODEV;
 	}
 
-	if (device->is_physical_device) {
-		memset(unique_id, 0, 8);
-		memcpy(unique_id + 8, &device->wwid, sizeof(device->wwid));
-	} else {
+	if (device->is_physical_device)
+		memcpy(unique_id, device->wwid, sizeof(device->wwid));
+	else
 		memcpy(unique_id, device->volume_id, sizeof(device->volume_id));
-	}
 
 	spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock, flags);
 
@@ -7443,6 +7503,9 @@ static void pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
 		ctrl_info->firmware_triage_supported = firmware_feature->enabled;
 		pqi_save_fw_triage_setting(ctrl_info, firmware_feature->enabled);
 		break;
+	case PQI_FIRMWARE_FEATURE_RPL_EXTENDED_FORMAT_4_5:
+		ctrl_info->rpl_extended_format_4_5_supported = firmware_feature->enabled;
+		break;
 	}
 
 	pqi_firmware_feature_status(ctrl_info, firmware_feature);
@@ -7543,6 +7606,11 @@ static struct pqi_firmware_feature pqi_firmware_features[] = {
 		.feature_bit = PQI_FIRMWARE_FEATURE_FW_TRIAGE,
 		.feature_status = pqi_ctrl_update_feature_flags,
 	},
+	{
+		.feature_name = "RPL Extended Formats 4 and 5",
+		.feature_bit = PQI_FIRMWARE_FEATURE_RPL_EXTENDED_FORMAT_4_5,
+		.feature_status = pqi_ctrl_update_feature_flags,
+	},
 };
 
 static void pqi_process_firmware_features(
@@ -7644,6 +7712,7 @@ static void pqi_ctrl_reset_config(struct pqi_ctrl_info *ctrl_info)
 	ctrl_info->tmf_iu_timeout_supported = false;
 	ctrl_info->unique_wwid_in_report_phys_lun_supported = false;
 	ctrl_info->firmware_triage_supported = false;
+	ctrl_info->rpl_extended_format_4_5_supported = false;
 }
 
 static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
diff --git a/drivers/scsi/smartpqi/smartpqi_sas_transport.c b/drivers/scsi/smartpqi/smartpqi_sas_transport.c
index afd9bafebd1d..dea4ebaf1677 100644
--- a/drivers/scsi/smartpqi/smartpqi_sas_transport.c
+++ b/drivers/scsi/smartpqi/smartpqi_sas_transport.c
@@ -343,7 +343,7 @@ static int pqi_sas_get_enclosure_identifier(struct sas_rphy *rphy,
 	}
 
 	if (found_device->devtype == TYPE_ENCLOSURE) {
-		*identifier = get_unaligned_be64(&found_device->wwid);
+		*identifier = get_unaligned_be64(&found_device->wwid[8]);
 		rc = 0;
 		goto out;
 	}
@@ -364,7 +364,7 @@ static int pqi_sas_get_enclosure_identifier(struct sas_rphy *rphy,
 			memcmp(device->phys_connector,
 				found_device->phys_connector, 2) == 0) {
 			*identifier =
-				get_unaligned_be64(&device->wwid);
+				get_unaligned_be64(&device->wwid[8]);
 			rc = 0;
 			goto out;
 		}
@@ -380,7 +380,7 @@ static int pqi_sas_get_enclosure_identifier(struct sas_rphy *rphy,
 		if (device->devtype == TYPE_ENCLOSURE &&
 			CISS_GET_DRIVE_NUMBER(device->scsi3addr) ==
 				PQI_VSEP_CISS_BTL) {
-			*identifier = get_unaligned_be64(&device->wwid);
+			*identifier = get_unaligned_be64(&device->wwid[8]);
 			rc = 0;
 			goto out;
 		}
-- 
2.28.0.rc1.9.ge7ae437ac1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [smartpqi updates PATCH V2 08/11] smartpqi: fix boot failure during lun rebuild
  2021-09-28 23:54 [smartpqi updates PATCH V2 00/11] smartpqi updates Don Brace
                   ` (6 preceding siblings ...)
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 07/11] smartpqi: add extended report physical luns Don Brace
@ 2021-09-28 23:54 ` Don Brace
  2021-09-30 18:24   ` john.p.donnelly
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 09/11] smartpqi: fix duplicate device nodes for tape changers Don Brace
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Don Brace @ 2021-09-28 23:54 UTC (permalink / raw)
  To: hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, pmenzel, linux-kernel

From: Mike McGowen <Mike.McGowen@microchip.com>

Move the delay in the register polling loop to
the beginning of the loop to ensure there is always
a delay between writing the register and reading it.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 1e27e6ba0159..c28eb7ea4a24 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -4278,12 +4278,12 @@ static int pqi_create_admin_queues(struct pqi_ctrl_info *ctrl_info)
 
 	timeout = PQI_ADMIN_QUEUE_CREATE_TIMEOUT_JIFFIES + jiffies;
 	while (1) {
+		msleep(PQI_ADMIN_QUEUE_CREATE_POLL_INTERVAL_MSECS);
 		status = readb(&pqi_registers->function_and_status_code);
 		if (status == PQI_STATUS_IDLE)
 			break;
 		if (time_after(jiffies, timeout))
 			return -ETIMEDOUT;
-		msleep(PQI_ADMIN_QUEUE_CREATE_POLL_INTERVAL_MSECS);
 	}
 
 	/*
-- 
2.28.0.rc1.9.ge7ae437ac1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [smartpqi updates PATCH V2 09/11] smartpqi: fix duplicate device nodes for tape changers
  2021-09-28 23:54 [smartpqi updates PATCH V2 00/11] smartpqi updates Don Brace
                   ` (7 preceding siblings ...)
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 08/11] smartpqi: fix boot failure during lun rebuild Don Brace
@ 2021-09-28 23:54 ` Don Brace
  2021-09-30 18:24   ` john.p.donnelly
  2021-10-01  8:26   ` Paul Menzel
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 10/11] smartpqi: add 3252-8i pci id Don Brace
                   ` (3 subsequent siblings)
  12 siblings, 2 replies; 33+ messages in thread
From: Don Brace @ 2021-09-28 23:54 UTC (permalink / raw)
  To: hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, pmenzel, linux-kernel

From: Kevin Barnett <kevin.barnett@microchip.com>

Stop the OS from re-discovering multiple LUNs for
tape drive and medium changer.

Duplicate device nodes for Ultrium tape drive and
medium changer are being created.

The Ultrium tape drive is a multi-LUN SCSI target.
It presents a LUN for the tape drive and a 2nd
LUN for the medium changer.
Our controller FW lists both LUNs in the RPL
results.

As a result, the smartpqi driver exposes both
devices to the OS. Then the OS does its normal
device discovery via the SCSI REPORT LUNS command,
which causes it to re-discover both devices a 2nd time,
which results in the duplicate device nodes.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h      |  1 +
 drivers/scsi/smartpqi/smartpqi_init.c | 23 +++++++++++++++++++----
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index c439583a4ca5..aac88ac0a0b7 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -1106,6 +1106,7 @@ struct pqi_scsi_dev {
 	u8	keep_device : 1;
 	u8	volume_offline : 1;
 	u8	rescan : 1;
+	u8	ignore_device : 1;
 	bool	aio_enabled;		/* only valid for physical disks */
 	bool	in_remove;
 	bool	device_offline;
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index c28eb7ea4a24..8be116992cb0 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -6297,9 +6297,13 @@ static int pqi_slave_alloc(struct scsi_device *sdev)
 		rphy = target_to_rphy(starget);
 		device = pqi_find_device_by_sas_rphy(ctrl_info, rphy);
 		if (device) {
-			device->target = sdev_id(sdev);
-			device->lun = sdev->lun;
-			device->target_lun_valid = true;
+			if (device->target_lun_valid) {
+				device->ignore_device = true;
+			} else {
+				device->target = sdev_id(sdev);
+				device->lun = sdev->lun;
+				device->target_lun_valid = true;
+			}
 		}
 	} else {
 		device = pqi_find_scsi_dev(ctrl_info, sdev_channel(sdev),
@@ -6336,14 +6340,25 @@ static int pqi_map_queues(struct Scsi_Host *shost)
 					ctrl_info->pci_dev, 0);
 }
 
+static inline bool pqi_is_tape_changer_device(struct pqi_scsi_dev *device)
+{
+	return device->devtype == TYPE_TAPE || device->devtype == TYPE_MEDIUM_CHANGER;
+}
+
 static int pqi_slave_configure(struct scsi_device *sdev)
 {
+	int rc = 0;
 	struct pqi_scsi_dev *device;
 
 	device = sdev->hostdata;
 	device->devtype = sdev->type;
 
-	return 0;
+	if (pqi_is_tape_changer_device(device) && device->ignore_device) {
+		rc = -ENXIO;
+		device->ignore_device = false;
+	}
+
+	return rc;
 }
 
 static int pqi_getpciinfo_ioctl(struct pqi_ctrl_info *ctrl_info, void __user *arg)
-- 
2.28.0.rc1.9.ge7ae437ac1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [smartpqi updates PATCH V2 10/11] smartpqi: add 3252-8i pci id
  2021-09-28 23:54 [smartpqi updates PATCH V2 00/11] smartpqi updates Don Brace
                   ` (8 preceding siblings ...)
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 09/11] smartpqi: fix duplicate device nodes for tape changers Don Brace
@ 2021-09-28 23:54 ` Don Brace
  2021-09-30 18:24   ` john.p.donnelly
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 11/11] smartpqi: update version to 2.1.12-055 Don Brace
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 33+ messages in thread
From: Don Brace @ 2021-09-28 23:54 UTC (permalink / raw)
  To: hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, pmenzel, linux-kernel

From: Mike McGowen <Mike.McGowen@microchip.com>

Added PCI ID information for the
Adaptec SmartRAID 3252-8i controller:
    9005 / 028F / 9005 / 14A2

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 8be116992cb0..ffa217874352 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -9287,6 +9287,10 @@ static const struct pci_device_id pqi_pci_id_table[] = {
 		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
 			       PCI_VENDOR_ID_ADAPTEC2, 0x14a1)
 	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x14a2)
+	},
 	{
 		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
 			       PCI_VENDOR_ID_ADAPTEC2, 0x14b0)
-- 
2.28.0.rc1.9.ge7ae437ac1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [smartpqi updates PATCH V2 11/11] smartpqi: update version to 2.1.12-055
  2021-09-28 23:54 [smartpqi updates PATCH V2 00/11] smartpqi updates Don Brace
                   ` (9 preceding siblings ...)
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 10/11] smartpqi: add 3252-8i pci id Don Brace
@ 2021-09-28 23:54 ` Don Brace
  2021-09-30 18:25   ` john.p.donnelly
  2021-09-29  9:34 ` [smartpqi updates PATCH V2 00/11] smartpqi updates Paul Menzel
  2021-10-12 20:35 ` Martin K. Petersen
  12 siblings, 1 reply; 33+ messages in thread
From: Don Brace @ 2021-09-28 23:54 UTC (permalink / raw)
  To: hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, pmenzel, linux-kernel

Update driver version to reflect changes.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index ffa217874352..b966ce3b4385 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -33,11 +33,11 @@
 #define BUILD_TIMESTAMP
 #endif
 
-#define DRIVER_VERSION		"2.1.10-020"
+#define DRIVER_VERSION		"2.1.12-055"
 #define DRIVER_MAJOR		2
 #define DRIVER_MINOR		1
-#define DRIVER_RELEASE		10
-#define DRIVER_REVISION		20
+#define DRIVER_RELEASE		12
+#define DRIVER_REVISION		55
 
 #define DRIVER_NAME		"Microchip SmartPQI Driver (v" \
 				DRIVER_VERSION BUILD_TIMESTAMP ")"
-- 
2.28.0.rc1.9.ge7ae437ac1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 05/11] smartpqi: add tur check for sanitize operation
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 05/11] smartpqi: add tur check for sanitize operation Don Brace
@ 2021-09-29  7:56   ` Paul Menzel
  2021-09-30 18:23   ` john.p.donnelly
  1 sibling, 0 replies; 33+ messages in thread
From: Paul Menzel @ 2021-09-29  7:56 UTC (permalink / raw)
  To: Don Brace
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, linux-kernel, hch, martin.petersen, jejb, linux-scsi

Dear Don,


Thank you for the patch. Maybe rephrase the summary:

 > Check TUR for sanitize operation


Am 29.09.21 um 01:54 schrieb Don Brace:
> Add in a TUR to HBA disks and do not present them to the OS if

Maybe add what TUR means: Test Unit Ready.

> 0x02/0x04/0x1b (sanitize in progress) is returned.
> 
> During boot-up, some OSes appear to hang when there are one or
> more disks undergoing sanitize.

It’d be great, if you gave at least one concrete test setup, where the 
hang occurred.

> According to SCSI SBC4 specification
> section 4.11.2 Commands allowed during sanitize,
> some SCSI commands are permitted, but read/write operations are not.
> 
> When the OS attempts to read the disk partition table a
> CHECK CONDITION ASC 0x04 ASCQ 0x1b is returned which causes the OS
> to retry the read until sanitize has completed. This can take hours.
> 
> Note: According to document HPE Smart Storage Administrator User Guide
> Link: https://support.hpe.com/hpesc/public/docDisplay?docLocale=en_US&docId=c03909334
> 
> During the sanitize erase operation, the drive is unusable.
> I.E.
>        The expected behavior for sanitize is the that disk remains
>        offline even after sanitize has completed. The customer is
>        expected to re-enable the disk using the management utility.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>   drivers/scsi/smartpqi/smartpqi_init.c | 87 +++++++++++++++++++++++++++
>   1 file changed, 87 insertions(+)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 01330fd67500..838274d8fadf 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -555,6 +555,10 @@ static int pqi_build_raid_path_request(struct pqi_ctrl_info *ctrl_info,
>   	cdb = request->cdb;
>   
>   	switch (cmd) {
> +	case TEST_UNIT_READY:
> +		request->data_direction = SOP_READ_FLAG;
> +		cdb[0] = TEST_UNIT_READY;
> +		break;
>   	case INQUIRY:
>   		request->data_direction = SOP_READ_FLAG;
>   		cdb[0] = INQUIRY;
> @@ -1575,6 +1579,85 @@ static int pqi_get_logical_device_info(struct pqi_ctrl_info *ctrl_info,
>   	return rc;
>   }
>   
> +/*
> + * Prevent adding drive to OS for some corner cases such as a drive
> + * undergoing a sanitize operation. Some OSes will continue to poll
> + * the drive until the sanitize completes, which can take hours,
> + * resulting in long bootup delays. Commands such as TUR, READ_CAP
> + * are allowed, but READ/WRITE cause check condition. So the OS
> + * cannot check/read the partition table.
> + * Note: devices that have completed sanitize must be re-enabled
> + *       using the management utility.
> + */
> +static bool pqi_keep_device_offline(struct pqi_ctrl_info *ctrl_info,
> +	struct pqi_scsi_dev *device)
> +{
> +	u8 scsi_status;
> +	int rc;
> +	enum dma_data_direction dir;
> +	char *buffer;
> +	int buffer_length = 64;

Use size_t? And could be made const?

> +	size_t sense_data_length;
> +	struct scsi_sense_hdr sshdr;
> +	struct pqi_raid_path_request request;
> +	struct pqi_raid_error_info error_info;
> +	bool offline = false; /* Assume keep online */
> +
> +	/* Do not check controllers. */

I’d remove the dot/period at the end of the short comments.

> +	if (pqi_is_hba_lunid(device->scsi3addr))
> +		return false;
> +
> +	/* Do not check LVs. */
> +	if (pqi_is_logical_device(device))
> +		return false;
> +
> +	buffer = kmalloc(buffer_length, GFP_KERNEL);
> +	if (!buffer)
> +		return false; /* Assume not offline */
> +
> +	/* Check for SANITIZE in progress using TUR */
> +	rc = pqi_build_raid_path_request(ctrl_info, &request,
> +		TEST_UNIT_READY, RAID_CTLR_LUNID, buffer,
> +		buffer_length, 0, &dir);
> +	if (rc)
> +		goto out; /* Assume not offline */
> +
> +	memcpy(request.lun_number, device->scsi3addr, sizeof(request.lun_number));
> +
> +	rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header, 0, &error_info);
> +
> +	if (rc)
> +		goto out; /* Assume not offline */
> +
> +	scsi_status = error_info.status;
> +	sense_data_length = get_unaligned_le16(&error_info.sense_data_length);
> +	if (sense_data_length == 0)
> +		sense_data_length =
> +			get_unaligned_le16(&error_info.response_data_length);

As the variable is named `sense_data_length`, for an outsider like me, 
it’s suprising that `response_date_length` is stored in there.

> +	if (sense_data_length) {
> +		if (sense_data_length > sizeof(error_info.data))
> +			sense_data_length = sizeof(error_info.data);
> +
> +		/*
> +		 * Check for sanitize in progress: asc:0x04, ascq: 0x1b

Add a space after the second colon?

> +		 */
> +		if (scsi_status == SAM_STAT_CHECK_CONDITION &&
> +			scsi_normalize_sense(error_info.data,
> +				sense_data_length, &sshdr) &&
> +				sshdr.sense_key == NOT_READY &&
> +				sshdr.asc == 0x04 &&
> +				sshdr.ascq == 0x1b) {
> +			device->device_offline = true;
> +			offline = true;
> +			goto out; /* Keep device offline */
> +		}
> +	}

Should a error, warning, or debug message be printed, when 
`sense_data_length = 0` again?

> +
> +out:
> +	kfree(buffer);
> +	return offline;
> +}
> +
>   static int pqi_get_device_info(struct pqi_ctrl_info *ctrl_info,
>   	struct pqi_scsi_dev *device,
>   	struct bmic_identify_physical_device *id_phys)
> @@ -2296,6 +2379,10 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
>   		if (!pqi_is_supported_device(device))
>   			continue;
>   
> +		/* Do not present disks that the OS cannot fully probe */
> +		if (pqi_keep_device_offline(ctrl_info, device))

I’d use the positive `!pqi_get_device_online()`, but it’s subjective.

> +			continue;
> +
>   		/* Gather information about the device. */
>   		rc = pqi_get_device_info(ctrl_info, device, id_phys);
>   		if (rc == -ENOMEM) {
> 


Kind regards,

Paul

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 00/11] smartpqi updates
  2021-09-28 23:54 [smartpqi updates PATCH V2 00/11] smartpqi updates Don Brace
                   ` (10 preceding siblings ...)
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 11/11] smartpqi: update version to 2.1.12-055 Don Brace
@ 2021-09-29  9:34 ` Paul Menzel
  2021-09-29 14:08   ` Don.Brace
  2021-10-12 20:35 ` Martin K. Petersen
  12 siblings, 1 reply; 33+ messages in thread
From: Paul Menzel @ 2021-09-29  9:34 UTC (permalink / raw)
  To: Don Brace
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, linux-kernel, hch, martin.petersen, jejb, linux-scsi

Dear Don,


Just a small nit regarding most patches in the patch queue.

It’d be great if the full text width of 75 characters could be used in 
the commit message bodies. Currently they are well below that, and 
therefore take more lines than necessary and are harder to read for me.


Kind regards,

Paul

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [smartpqi updates PATCH V2 00/11] smartpqi updates
  2021-09-29  9:34 ` [smartpqi updates PATCH V2 00/11] smartpqi updates Paul Menzel
@ 2021-09-29 14:08   ` Don.Brace
  2021-09-29 14:12     ` Paul Menzel
  0 siblings, 1 reply; 33+ messages in thread
From: Don.Brace @ 2021-09-29 14:08 UTC (permalink / raw)
  To: pmenzel
  Cc: Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, Mike.McGowen, Murthy.Bhat,
	Balsundar.P, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, linux-kernel, hch, martin.petersen, jejb, linux-scsi

-----Original Message-----
From: Paul Menzel [mailto:pmenzel@molgen.mpg.de] 
Subject: Re: [smartpqi updates PATCH V2 00/11] smartpqi updates

Dear Don,


Just a small nit regarding most patches in the patch queue.

It’d be great if the full text width of 75 characters could be used in the commit message bodies. Currently they are well below that, and therefore take more lines than necessary and are harder to read for me.


Kind regards,

Paul
---
I can re-word and re-send if you like.
Thanks,
Don Brace

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 00/11] smartpqi updates
  2021-09-29 14:08   ` Don.Brace
@ 2021-09-29 14:12     ` Paul Menzel
  0 siblings, 0 replies; 33+ messages in thread
From: Paul Menzel @ 2021-09-29 14:12 UTC (permalink / raw)
  To: Don.Brace
  Cc: Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, Mike.McGowen, Murthy.Bhat,
	Balsundar.P, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, linux-kernel, hch, martin.petersen, jejb, linux-scsi

Dear Don,


Am 29.09.21 um 16:08 schrieb Don.Brace@microchip.com:
> -----Original Message-----
> From: Paul Menzel [mailto:pmenzel@molgen.mpg.de]
> Subject: Re: [smartpqi updates PATCH V2 00/11] smartpqi updates
> 
> Dear Don,
> 
> 
> Just a small nit regarding most patches in the patch queue.
> 
> It’d be great if the full text width of 75 characters could be used in the commit message bodies. Currently they are well below that, and therefore take more lines than necessary and are harder to read for me.
> 
> 
> Kind regards,
> 
> Paul
> ---
> I can re-word and re-send if you like.

If you sent V3 due to other reasons, then yes. Otherwise, please just 
keep it in mind for the future.


Kind regards,

Paul

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 01/11] smartpqi: update device removal management
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 01/11] smartpqi: update device removal management Don Brace
@ 2021-09-30 18:21   ` john.p.donnelly
  0 siblings, 0 replies; 33+ messages in thread
From: john.p.donnelly @ 2021-09-30 18:21 UTC (permalink / raw)
  To: Don Brace, hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, mwilck, pmenzel,
	linux-kernel

On 9/28/21 6:54 PM, Don Brace wrote:
> Update device removal path to handle issues for:
>    rmmod - Correct stack trace when removing devices.
>    rmmod - Synchronize SCSI cache.
>    Update handling for removing devices using sysfs.
> 
> This patch also aligns the device removal code with
> our out-of-box driver.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Acked-by: John Donnelly <john.p.donnelly@oracle.com>


> ---
>   drivers/scsi/smartpqi/smartpqi_init.c | 64 ++++++++++++---------------
>   1 file changed, 28 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index ecb2af3f43ca..97027574eb1f 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -1693,8 +1693,6 @@ static inline void pqi_remove_device(struct pqi_ctrl_info *ctrl_info, struct pqi
>   {
>   	int rc;
>   
> -	pqi_device_remove_start(device);
> -
>   	rc = pqi_device_wait_for_pending_io(ctrl_info, device,
>   		PQI_REMOVE_DEVICE_PENDING_IO_TIMEOUT_MSECS);
>   	if (rc)
> @@ -1708,6 +1706,8 @@ static inline void pqi_remove_device(struct pqi_ctrl_info *ctrl_info, struct pqi
>   		scsi_remove_device(device->sdev);
>   	else
>   		pqi_remove_sas_device(device);
> +
> +	pqi_device_remove_start(device);
>   }
>   
>   /* Assumes the SCSI device list lock is held. */
> @@ -1986,7 +1986,7 @@ static void pqi_update_device_list(struct pqi_ctrl_info *ctrl_info,
>   	list_for_each_entry_safe(device, next, &ctrl_info->scsi_device_list,
>   		scsi_device_list_entry) {
>   		if (device->device_gone) {
> -			list_del_init(&device->scsi_device_list_entry);
> +			list_del(&device->scsi_device_list_entry);
>   			list_add_tail(&device->delete_list_entry, &delete_list);
>   		}
>   	}
> @@ -2025,15 +2025,13 @@ static void pqi_update_device_list(struct pqi_ctrl_info *ctrl_info,
>   		if (device->volume_offline) {
>   			pqi_dev_info(ctrl_info, "offline", device);
>   			pqi_show_volume_status(ctrl_info, device);
> -		}
> -		list_del(&device->delete_list_entry);
> -		if (pqi_is_device_added(device)) {
> -			pqi_remove_device(ctrl_info, device);
>   		} else {
> -			if (!device->volume_offline)
> -				pqi_dev_info(ctrl_info, "removed", device);
> -			pqi_free_device(device);
> +			pqi_dev_info(ctrl_info, "removed", device);
>   		}
> +		if (pqi_is_device_added(device))
> +			pqi_remove_device(ctrl_info, device);
> +		list_del(&device->delete_list_entry);
> +		pqi_free_device(device);
>   	}
>   
>   	/*
> @@ -2328,6 +2326,25 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
>   	return rc;
>   }
>   
> +static void pqi_remove_all_scsi_devices(struct pqi_ctrl_info *ctrl_info)
> +{
> +	unsigned long flags;
> +	struct pqi_scsi_dev *device;
> +	struct pqi_scsi_dev *next;
> +
> +	spin_lock_irqsave(&ctrl_info->scsi_device_list_lock, flags);
> +
> +	list_for_each_entry_safe(device, next, &ctrl_info->scsi_device_list,
> +		scsi_device_list_entry) {
> +		if (pqi_is_device_added(device))
> +			pqi_remove_device(ctrl_info, device);
> +		list_del(&device->scsi_device_list_entry);
> +		pqi_free_device(device);
> +	}
> +
> +	spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock, flags);
> +}
> +
>   static int pqi_scan_scsi_devices(struct pqi_ctrl_info *ctrl_info)
>   {
>   	int rc;
> @@ -6120,31 +6137,6 @@ static int pqi_slave_configure(struct scsi_device *sdev)
>   	return 0;
>   }
>   
> -static void pqi_slave_destroy(struct scsi_device *sdev)
> -{
> -	unsigned long flags;
> -	struct pqi_scsi_dev *device;
> -	struct pqi_ctrl_info *ctrl_info;
> -
> -	ctrl_info = shost_to_hba(sdev->host);
> -
> -	spin_lock_irqsave(&ctrl_info->scsi_device_list_lock, flags);
> -
> -	device = sdev->hostdata;
> -	if (device) {
> -		sdev->hostdata = NULL;
> -		if (!list_empty(&device->scsi_device_list_entry))
> -			list_del(&device->scsi_device_list_entry);
> -	}
> -
> -	spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock, flags);
> -
> -	if (device) {
> -		pqi_dev_info(ctrl_info, "removed", device);
> -		pqi_free_device(device);
> -	}
> -}
> -
>   static int pqi_getpciinfo_ioctl(struct pqi_ctrl_info *ctrl_info, void __user *arg)
>   {
>   	struct pci_dev *pci_dev;
> @@ -6938,7 +6930,6 @@ static struct scsi_host_template pqi_driver_template = {
>   	.ioctl = pqi_ioctl,
>   	.slave_alloc = pqi_slave_alloc,
>   	.slave_configure = pqi_slave_configure,
> -	.slave_destroy = pqi_slave_destroy,
>   	.map_queues = pqi_map_queues,
>   	.sdev_attrs = pqi_sdev_attrs,
>   	.shost_attrs = pqi_shost_attrs,
> @@ -8169,6 +8160,7 @@ static void pqi_remove_ctrl(struct pqi_ctrl_info *ctrl_info)
>   {
>   	pqi_cancel_rescan_worker(ctrl_info);
>   	pqi_cancel_update_time_worker(ctrl_info);
> +	pqi_remove_all_scsi_devices(ctrl_info);
>   	pqi_unregister_scsi(ctrl_info);
>   	if (ctrl_info->pqi_mode_enabled)
>   		pqi_revert_to_sis_mode(ctrl_info);
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 02/11] smartpqi: add controller handshake during kdump
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 02/11] smartpqi: add controller handshake during kdump Don Brace
@ 2021-09-30 18:21   ` john.p.donnelly
  0 siblings, 0 replies; 33+ messages in thread
From: john.p.donnelly @ 2021-09-30 18:21 UTC (permalink / raw)
  To: Don Brace, hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, mwilck, pmenzel,
	linux-kernel

On 9/28/21 6:54 PM, Don Brace wrote:
> From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
> 
> Correct kdump hangs when controller is locked up.
> 
> There are occasions when a controller reboot
> (controller soft reset) is issued when a controller
> firmware crash dump is in progress.
> 
> This leads to incomplete controller firmware crash dump.
>   - When the controller crash dump is in progress,
>     and a kdump is initiated, the driver issues
>     inbound doorbell reset to bring back the
>     controller in SIS mode.
>   - If the controller is in locked up state,
>     the inbound doorbell reset does not work causing
>     controller initialization failures. This results
>     in the driver hanging waiting for SIS mode.
> 
> To avoid an incomplete controller crash dump, add in
> a controller crash dump handshake.
>   - Controller will indicate start and end of the controller
>     crash dump by setting some register bits.
>   - Driver will look these bits when a kdump is initiated.
>     If a controller crash dump is in progress, the driver will
>     wait for the controller crash dump to complete
>     before issuing the controller soft reset then complete
>     driver initialization.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
> Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Acked-by: John Donnelly <john.p.donnelly@oracle.com>


> ---
>   drivers/scsi/smartpqi/smartpqi_init.c | 41 +++++++++++++++++++--
>   drivers/scsi/smartpqi/smartpqi_sis.c  | 51 +++++++++++++++++++++++++++
>   drivers/scsi/smartpqi/smartpqi_sis.h  |  1 +
>   3 files changed, 91 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 97027574eb1f..5655d240f7a7 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -234,15 +234,46 @@ static inline bool pqi_is_hba_lunid(u8 *scsi3addr)
>   	return pqi_scsi3addr_equal(scsi3addr, RAID_CTLR_LUNID);
>   }
>   
> +#define PQI_DRIVER_SCRATCH_PQI_MODE			0x1
> +#define PQI_DRIVER_SCRATCH_FW_TRIAGE_SUPPORTED		0x2
> +
>   static inline enum pqi_ctrl_mode pqi_get_ctrl_mode(struct pqi_ctrl_info *ctrl_info)
>   {
> -	return sis_read_driver_scratch(ctrl_info);
> +	return sis_read_driver_scratch(ctrl_info) & PQI_DRIVER_SCRATCH_PQI_MODE ? PQI_MODE : SIS_MODE;
>   }
>   
>   static inline void pqi_save_ctrl_mode(struct pqi_ctrl_info *ctrl_info,
>   	enum pqi_ctrl_mode mode)
>   {
> -	sis_write_driver_scratch(ctrl_info, mode);
> +	u32 driver_scratch;
> +
> +	driver_scratch = sis_read_driver_scratch(ctrl_info);
> +
> +	if (mode == PQI_MODE)
> +		driver_scratch |= PQI_DRIVER_SCRATCH_PQI_MODE;
> +	else
> +		driver_scratch &= ~PQI_DRIVER_SCRATCH_PQI_MODE;
> +
> +	sis_write_driver_scratch(ctrl_info, driver_scratch);
> +}
> +
> +static inline bool pqi_is_fw_triage_supported(struct pqi_ctrl_info *ctrl_info)
> +{
> +	return (sis_read_driver_scratch(ctrl_info) & PQI_DRIVER_SCRATCH_FW_TRIAGE_SUPPORTED) != 0;
> +}
> +
> +static inline void pqi_save_fw_triage_setting(struct pqi_ctrl_info *ctrl_info, bool is_supported)
> +{
> +	u32 driver_scratch;
> +
> +	driver_scratch = sis_read_driver_scratch(ctrl_info);
> +
> +	if (is_supported)
> +		driver_scratch |= PQI_DRIVER_SCRATCH_FW_TRIAGE_SUPPORTED;
> +	else
> +		driver_scratch &= ~PQI_DRIVER_SCRATCH_FW_TRIAGE_SUPPORTED;
> +
> +	sis_write_driver_scratch(ctrl_info, driver_scratch);
>   }
>   
>   static inline void pqi_ctrl_block_scan(struct pqi_ctrl_info *ctrl_info)
> @@ -7292,6 +7323,7 @@ static void pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
>   		ctrl_info->unique_wwid_in_report_phys_lun_supported =
>   			firmware_feature->enabled;
>   		break;
> +		pqi_save_fw_triage_setting(ctrl_info, firmware_feature->enabled);
>   	}
>   
>   	pqi_firmware_feature_status(ctrl_info, firmware_feature);
> @@ -7618,6 +7650,11 @@ static int pqi_ctrl_init(struct pqi_ctrl_info *ctrl_info)
>   	u32 product_id;
>   
>   	if (reset_devices) {
> +		if (pqi_is_fw_triage_supported(ctrl_info)) {
> +			rc = sis_wait_for_fw_triage_completion(ctrl_info);
> +			if (rc)
> +				return rc;
> +		}
>   		sis_soft_reset(ctrl_info);
>   		msleep(PQI_POST_RESET_DELAY_SECS * PQI_HZ);
>   	} else {
> diff --git a/drivers/scsi/smartpqi/smartpqi_sis.c b/drivers/scsi/smartpqi/smartpqi_sis.c
> index d63c46a8e38b..8acd3a80f582 100644
> --- a/drivers/scsi/smartpqi/smartpqi_sis.c
> +++ b/drivers/scsi/smartpqi/smartpqi_sis.c
> @@ -51,12 +51,20 @@
>   #define SIS_BASE_STRUCT_REVISION		9
>   #define SIS_BASE_STRUCT_ALIGNMENT		16
>   
> +#define SIS_CTRL_KERNEL_FW_TRIAGE		0x3
>   #define SIS_CTRL_KERNEL_UP			0x80
>   #define SIS_CTRL_KERNEL_PANIC			0x100
>   #define SIS_CTRL_READY_TIMEOUT_SECS		180
>   #define SIS_CTRL_READY_RESUME_TIMEOUT_SECS	90
>   #define SIS_CTRL_READY_POLL_INTERVAL_MSECS	10
>   
> +enum sis_fw_triage_status {
> +	FW_TRIAGE_NOT_STARTED = 0,
> +	FW_TRIAGE_STARTED,
> +	FW_TRIAGE_COND_INVALID,
> +	FW_TRIAGE_COMPLETED
> +};
> +
>   #pragma pack(1)
>   
>   /* for use with SIS_CMD_INIT_BASE_STRUCT_ADDRESS command */
> @@ -419,12 +427,55 @@ u32 sis_read_driver_scratch(struct pqi_ctrl_info *ctrl_info)
>   	return readl(&ctrl_info->registers->sis_driver_scratch);
>   }
>   
> +static inline enum sis_fw_triage_status
> +	sis_read_firmware_triage_status(struct pqi_ctrl_info *ctrl_info)
> +{
> +	return ((enum sis_fw_triage_status)(readl(&ctrl_info->registers->sis_firmware_status) &
> +		SIS_CTRL_KERNEL_FW_TRIAGE));
> +}
> +
>   void sis_soft_reset(struct pqi_ctrl_info *ctrl_info)
>   {
>   	writel(SIS_SOFT_RESET,
>   		&ctrl_info->registers->sis_host_to_ctrl_doorbell);
>   }
>   
> +#define SIS_FW_TRIAGE_STATUS_TIMEOUT_SECS		300
> +#define SIS_FW_TRIAGE_STATUS_POLL_INTERVAL_SECS		1
> +
> +int sis_wait_for_fw_triage_completion(struct pqi_ctrl_info *ctrl_info)
> +{
> +	int rc;
> +	enum sis_fw_triage_status status;
> +	unsigned long timeout;
> +
> +	timeout = (SIS_FW_TRIAGE_STATUS_TIMEOUT_SECS * PQI_HZ) + jiffies;
> +	while (1) {
> +		status = sis_read_firmware_triage_status(ctrl_info);
> +		if (status == FW_TRIAGE_COND_INVALID) {
> +			dev_err(&ctrl_info->pci_dev->dev,
> +				"firmware triage condition invalid\n");
> +			rc = -EINVAL;
> +			break;
> +		} else if (status == FW_TRIAGE_NOT_STARTED ||
> +			status == FW_TRIAGE_COMPLETED) {
> +			rc = 0;
> +			break;
> +		}
> +
> +		if (time_after(jiffies, timeout)) {
> +			dev_err(&ctrl_info->pci_dev->dev,
> +				"timed out waiting for firmware triage status\n");
> +			rc = -ETIMEDOUT;
> +			break;
> +		}
> +
> +		ssleep(SIS_FW_TRIAGE_STATUS_POLL_INTERVAL_SECS);
> +	}
> +
> +	return rc;
> +}
> +
>   static void __attribute__((unused)) verify_structures(void)
>   {
>   	BUILD_BUG_ON(offsetof(struct sis_base_struct,
> diff --git a/drivers/scsi/smartpqi/smartpqi_sis.h b/drivers/scsi/smartpqi/smartpqi_sis.h
> index d29c1352a826..c1db93054c86 100644
> --- a/drivers/scsi/smartpqi/smartpqi_sis.h
> +++ b/drivers/scsi/smartpqi/smartpqi_sis.h
> @@ -28,5 +28,6 @@ void sis_write_driver_scratch(struct pqi_ctrl_info *ctrl_info, u32 value);
>   u32 sis_read_driver_scratch(struct pqi_ctrl_info *ctrl_info);
>   void sis_soft_reset(struct pqi_ctrl_info *ctrl_info);
>   u32 sis_get_product_id(struct pqi_ctrl_info *ctrl_info);
> +int sis_wait_for_fw_triage_completion(struct pqi_ctrl_info *ctrl_info);
>   
>   #endif	/* _SMARTPQI_SIS_H */
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 03/11] smartpqi: capture controller reason codes
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 03/11] smartpqi: capture controller reason codes Don Brace
@ 2021-09-30 18:22   ` john.p.donnelly
  0 siblings, 0 replies; 33+ messages in thread
From: john.p.donnelly @ 2021-09-30 18:22 UTC (permalink / raw)
  To: Don Brace, hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, mwilck, pmenzel,
	linux-kernel

On 9/28/21 6:54 PM, Don Brace wrote:
> From: Murthy Bhat <Murthy.Bhat@microchip.com>
> 
> Store controller reason codes in a controller register for
> debugging purposes.
> 
> In some rare cases, the driver can halt the controller.
> - Add in a reason code on why the controller was halted.
> - Store this reason code in a controller register to aid
>    in debugging the issue.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
> Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Acked-by: John Donnelly <john.p.donnelly@oracle.com>


> ---
>   drivers/scsi/smartpqi/smartpqi.h      | 25 +++++++++++++++--
>   drivers/scsi/smartpqi/smartpqi_init.c | 40 ++++++++++++++++++---------
>   drivers/scsi/smartpqi/smartpqi_sis.c  |  9 ++++--
>   drivers/scsi/smartpqi/smartpqi_sis.h  |  3 +-
>   4 files changed, 57 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
> index 70eca203d72f..d66863f8d1cf 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -82,9 +82,11 @@ struct pqi_ctrl_registers {
>   	__le32  sis_product_identifier;			/* B4h */
>   	u8	reserved5[0xbc - (0xb4 + sizeof(__le32))];
>   	__le32	sis_firmware_status;			/* BCh */
> -	u8	reserved6[0x1000 - (0xbc + sizeof(__le32))];
> +	u8	reserved6[0xcc - (0xbc + sizeof(__le32))];
> +	__le32	sis_ctrl_shutdown_reason_code;		/* CCh */
> +	u8	reserved7[0x1000 - (0xcc + sizeof(__le32))];
>   	__le32	sis_mailbox[8];				/* 1000h */
> -	u8	reserved7[0x4000 - (0x1000 + (sizeof(__le32) * 8))];
> +	u8	reserved8[0x4000 - (0x1000 + (sizeof(__le32) * 8))];
>   	/*
>   	 * The PQI spec states that the PQI registers should be at
>   	 * offset 0 from the PCIe BAR 0.  However, we can't map
> @@ -102,6 +104,21 @@ struct pqi_ctrl_registers {
>   
>   #define PQI_DEVICE_REGISTERS_OFFSET	0x4000
>   
> +/* shutdown reasons for taking the controller offline */
> +enum pqi_ctrl_shutdown_reason {
> +	PQI_IQ_NOT_DRAINED_TIMEOUT = 1,
> +	PQI_LUN_RESET_TIMEOUT = 2,
> +	PQI_IO_PENDING_POST_LUN_RESET_TIMEOUT = 3,
> +	PQI_NO_HEARTBEAT = 4,
> +	PQI_FIRMWARE_KERNEL_NOT_UP = 5,
> +	PQI_OFA_RESPONSE_TIMEOUT = 6,
> +	PQI_INVALID_REQ_ID = 7,
> +	PQI_UNMATCHED_REQ_ID = 8,
> +	PQI_IO_PI_OUT_OF_RANGE = 9,
> +	PQI_EVENT_PI_OUT_OF_RANGE = 10,
> +	PQI_UNEXPECTED_IU_TYPE = 11
> +};
> +
>   enum pqi_io_path {
>   	RAID_PATH = 0,
>   	AIO_PATH = 1
> @@ -850,7 +867,8 @@ struct pqi_config_table_firmware_features {
>   #define PQI_FIRMWARE_FEATURE_TMF_IU_TIMEOUT			14
>   #define PQI_FIRMWARE_FEATURE_RAID_BYPASS_ON_ENCRYPTED_NVME	15
>   #define PQI_FIRMWARE_FEATURE_UNIQUE_WWID_IN_REPORT_PHYS_LUN	16
> -#define PQI_FIRMWARE_FEATURE_MAXIMUM				16
> +#define PQI_FIRMWARE_FEATURE_FW_TRIAGE				17
> +#define PQI_FIRMWARE_FEATURE_MAXIMUM				17
>   
>   struct pqi_config_table_debug {
>   	struct pqi_config_table_section_header header;
> @@ -1297,6 +1315,7 @@ struct pqi_ctrl_info {
>   	u8		raid_iu_timeout_supported : 1;
>   	u8		tmf_iu_timeout_supported : 1;
>   	u8		unique_wwid_in_report_phys_lun_supported : 1;
> +	u8		firmware_triage_supported : 1;
>   	u8		enable_r1_writes : 1;
>   	u8		enable_r5_writes : 1;
>   	u8		enable_r6_writes : 1;
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 5655d240f7a7..b6ac4d607664 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -54,7 +54,8 @@ MODULE_DESCRIPTION("Driver for Microchip Smart Family Controller version "
>   MODULE_VERSION(DRIVER_VERSION);
>   MODULE_LICENSE("GPL");
>   
> -static void pqi_take_ctrl_offline(struct pqi_ctrl_info *ctrl_info);
> +static void pqi_take_ctrl_offline(struct pqi_ctrl_info *ctrl_info,
> +	enum pqi_ctrl_shutdown_reason ctrl_shutdown_reason);
>   static void pqi_ctrl_offline_worker(struct work_struct *work);
>   static int pqi_scan_scsi_devices(struct pqi_ctrl_info *ctrl_info);
>   static void pqi_scan_start(struct Scsi_Host *shost);
> @@ -226,7 +227,7 @@ static inline void pqi_check_ctrl_health(struct pqi_ctrl_info *ctrl_info)
>   {
>   	if (ctrl_info->controller_online)
>   		if (!sis_is_firmware_running(ctrl_info))
> -			pqi_take_ctrl_offline(ctrl_info);
> +			pqi_take_ctrl_offline(ctrl_info, PQI_FIRMWARE_KERNEL_NOT_UP);
>   }
>   
>   static inline bool pqi_is_hba_lunid(u8 *scsi3addr)
> @@ -3180,9 +3181,10 @@ static int pqi_interpret_task_management_response(struct pqi_ctrl_info *ctrl_inf
>   	return rc;
>   }
>   
> -static inline void pqi_invalid_response(struct pqi_ctrl_info *ctrl_info)
> +static inline void pqi_invalid_response(struct pqi_ctrl_info *ctrl_info,
> +	enum pqi_ctrl_shutdown_reason ctrl_shutdown_reason)
>   {
> -	pqi_take_ctrl_offline(ctrl_info);
> +	pqi_take_ctrl_offline(ctrl_info, ctrl_shutdown_reason);
>   }
>   
>   static int pqi_process_io_intr(struct pqi_ctrl_info *ctrl_info, struct pqi_queue_group *queue_group)
> @@ -3200,7 +3202,7 @@ static int pqi_process_io_intr(struct pqi_ctrl_info *ctrl_info, struct pqi_queue
>   	while (1) {
>   		oq_pi = readl(queue_group->oq_pi);
>   		if (oq_pi >= ctrl_info->num_elements_per_oq) {
> -			pqi_invalid_response(ctrl_info);
> +			pqi_invalid_response(ctrl_info, PQI_IO_PI_OUT_OF_RANGE);
>   			dev_err(&ctrl_info->pci_dev->dev,
>   				"I/O interrupt: producer index (%u) out of range (0-%u): consumer index: %u\n",
>   				oq_pi, ctrl_info->num_elements_per_oq - 1, oq_ci);
> @@ -3215,7 +3217,7 @@ static int pqi_process_io_intr(struct pqi_ctrl_info *ctrl_info, struct pqi_queue
>   
>   		request_id = get_unaligned_le16(&response->request_id);
>   		if (request_id >= ctrl_info->max_io_slots) {
> -			pqi_invalid_response(ctrl_info);
> +			pqi_invalid_response(ctrl_info, PQI_INVALID_REQ_ID);
>   			dev_err(&ctrl_info->pci_dev->dev,
>   				"request ID in response (%u) out of range (0-%u): producer index: %u  consumer index: %u\n",
>   				request_id, ctrl_info->max_io_slots - 1, oq_pi, oq_ci);
> @@ -3224,7 +3226,7 @@ static int pqi_process_io_intr(struct pqi_ctrl_info *ctrl_info, struct pqi_queue
>   
>   		io_request = &ctrl_info->io_request_pool[request_id];
>   		if (atomic_read(&io_request->refcount) == 0) {
> -			pqi_invalid_response(ctrl_info);
> +			pqi_invalid_response(ctrl_info, PQI_UNMATCHED_REQ_ID);
>   			dev_err(&ctrl_info->pci_dev->dev,
>   				"request ID in response (%u) does not match an outstanding I/O request: producer index: %u  consumer index: %u\n",
>   				request_id, oq_pi, oq_ci);
> @@ -3260,7 +3262,7 @@ static int pqi_process_io_intr(struct pqi_ctrl_info *ctrl_info, struct pqi_queue
>   			pqi_process_io_error(response->header.iu_type, io_request);
>   			break;
>   		default:
> -			pqi_invalid_response(ctrl_info);
> +			pqi_invalid_response(ctrl_info, PQI_UNEXPECTED_IU_TYPE);
>   			dev_err(&ctrl_info->pci_dev->dev,
>   				"unexpected IU type: 0x%x: producer index: %u  consumer index: %u\n",
>   				response->header.iu_type, oq_pi, oq_ci);
> @@ -3442,7 +3444,7 @@ static void pqi_process_soft_reset(struct pqi_ctrl_info *ctrl_info)
>   		pqi_ofa_free_host_buffer(ctrl_info);
>   		pqi_ctrl_ofa_done(ctrl_info);
>   		pqi_ofa_ctrl_unquiesce(ctrl_info);
> -		pqi_take_ctrl_offline(ctrl_info);
> +		pqi_take_ctrl_offline(ctrl_info, PQI_OFA_RESPONSE_TIMEOUT);
>   		break;
>   	}
>   }
> @@ -3567,7 +3569,7 @@ static void pqi_heartbeat_timer_handler(struct timer_list *t)
>   			dev_err(&ctrl_info->pci_dev->dev,
>   				"no heartbeat detected - last heartbeat count: %u\n",
>   				heartbeat_count);
> -			pqi_take_ctrl_offline(ctrl_info);
> +			pqi_take_ctrl_offline(ctrl_info, PQI_NO_HEARTBEAT);
>   			return;
>   		}
>   	} else {
> @@ -3631,7 +3633,7 @@ static int pqi_process_event_intr(struct pqi_ctrl_info *ctrl_info)
>   	while (1) {
>   		oq_pi = readl(event_queue->oq_pi);
>   		if (oq_pi >= PQI_NUM_EVENT_QUEUE_ELEMENTS) {
> -			pqi_invalid_response(ctrl_info);
> +			pqi_invalid_response(ctrl_info, PQI_EVENT_PI_OUT_OF_RANGE);
>   			dev_err(&ctrl_info->pci_dev->dev,
>   				"event interrupt: producer index (%u) out of range (0-%u): consumer index: %u\n",
>   				oq_pi, PQI_NUM_EVENT_QUEUE_ELEMENTS - 1, oq_ci);
> @@ -7323,7 +7325,10 @@ static void pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
>   		ctrl_info->unique_wwid_in_report_phys_lun_supported =
>   			firmware_feature->enabled;
>   		break;
> +	case PQI_FIRMWARE_FEATURE_FW_TRIAGE:
> +		ctrl_info->firmware_triage_supported = firmware_feature->enabled;
>   		pqi_save_fw_triage_setting(ctrl_info, firmware_feature->enabled);
> +		break;
>   	}
>   
>   	pqi_firmware_feature_status(ctrl_info, firmware_feature);
> @@ -7419,6 +7424,11 @@ static struct pqi_firmware_feature pqi_firmware_features[] = {
>   		.feature_bit = PQI_FIRMWARE_FEATURE_UNIQUE_WWID_IN_REPORT_PHYS_LUN,
>   		.feature_status = pqi_ctrl_update_feature_flags,
>   	},
> +	{
> +		.feature_name = "Firmware Triage",
> +		.feature_bit = PQI_FIRMWARE_FEATURE_FW_TRIAGE,
> +		.feature_status = pqi_ctrl_update_feature_flags,
> +	},
>   };
>   
>   static void pqi_process_firmware_features(
> @@ -7519,6 +7529,7 @@ static void pqi_ctrl_reset_config(struct pqi_ctrl_info *ctrl_info)
>   	ctrl_info->raid_iu_timeout_supported = false;
>   	ctrl_info->tmf_iu_timeout_supported = false;
>   	ctrl_info->unique_wwid_in_report_phys_lun_supported = false;
> +	ctrl_info->firmware_triage_supported = false;
>   }
>   
>   static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
> @@ -8459,7 +8470,8 @@ static void pqi_ctrl_offline_worker(struct work_struct *work)
>   	pqi_take_ctrl_offline_deferred(ctrl_info);
>   }
>   
> -static void pqi_take_ctrl_offline(struct pqi_ctrl_info *ctrl_info)
> +static void pqi_take_ctrl_offline(struct pqi_ctrl_info *ctrl_info,
> +	enum pqi_ctrl_shutdown_reason ctrl_shutdown_reason)
>   {
>   	if (!ctrl_info->controller_online)
>   		return;
> @@ -8468,7 +8480,7 @@ static void pqi_take_ctrl_offline(struct pqi_ctrl_info *ctrl_info)
>   	ctrl_info->pqi_mode_enabled = false;
>   	pqi_ctrl_block_requests(ctrl_info);
>   	if (!pqi_disable_ctrl_shutdown)
> -		sis_shutdown_ctrl(ctrl_info);
> +		sis_shutdown_ctrl(ctrl_info, ctrl_shutdown_reason);
>   	pci_disable_device(ctrl_info->pci_dev);
>   	dev_err(&ctrl_info->pci_dev->dev, "controller offline\n");
>   	schedule_work(&ctrl_info->ctrl_offline_work);
> @@ -9303,6 +9315,8 @@ static void __attribute__((unused)) verify_structures(void)
>   		sis_product_identifier) != 0xb4);
>   	BUILD_BUG_ON(offsetof(struct pqi_ctrl_registers,
>   		sis_firmware_status) != 0xbc);
> +	BUILD_BUG_ON(offsetof(struct pqi_ctrl_registers,
> +		sis_ctrl_shutdown_reason_code) != 0xcc);
>   	BUILD_BUG_ON(offsetof(struct pqi_ctrl_registers,
>   		sis_mailbox) != 0x1000);
>   	BUILD_BUG_ON(offsetof(struct pqi_ctrl_registers,
> diff --git a/drivers/scsi/smartpqi/smartpqi_sis.c b/drivers/scsi/smartpqi/smartpqi_sis.c
> index 8acd3a80f582..d66eb8ea161c 100644
> --- a/drivers/scsi/smartpqi/smartpqi_sis.c
> +++ b/drivers/scsi/smartpqi/smartpqi_sis.c
> @@ -397,14 +397,17 @@ void sis_enable_intx(struct pqi_ctrl_info *ctrl_info)
>   	sis_set_doorbell_bit(ctrl_info, SIS_ENABLE_INTX);
>   }
>   
> -void sis_shutdown_ctrl(struct pqi_ctrl_info *ctrl_info)
> +void sis_shutdown_ctrl(struct pqi_ctrl_info *ctrl_info,
> +	enum pqi_ctrl_shutdown_reason ctrl_shutdown_reason)
>   {
>   	if (readl(&ctrl_info->registers->sis_firmware_status) &
>   		SIS_CTRL_KERNEL_PANIC)
>   		return;
>   
> -	writel(SIS_TRIGGER_SHUTDOWN,
> -		&ctrl_info->registers->sis_host_to_ctrl_doorbell);
> +	if (ctrl_info->firmware_triage_supported)
> +		writel(ctrl_shutdown_reason, &ctrl_info->registers->sis_ctrl_shutdown_reason_code);
> +
> +	writel(SIS_TRIGGER_SHUTDOWN, &ctrl_info->registers->sis_host_to_ctrl_doorbell);
>   }
>   
>   int sis_pqi_reset_quiesce(struct pqi_ctrl_info *ctrl_info)
> diff --git a/drivers/scsi/smartpqi/smartpqi_sis.h b/drivers/scsi/smartpqi/smartpqi_sis.h
> index c1db93054c86..bd92ff49f385 100644
> --- a/drivers/scsi/smartpqi/smartpqi_sis.h
> +++ b/drivers/scsi/smartpqi/smartpqi_sis.h
> @@ -21,7 +21,8 @@ int sis_get_pqi_capabilities(struct pqi_ctrl_info *ctrl_info);
>   int sis_init_base_struct_addr(struct pqi_ctrl_info *ctrl_info);
>   void sis_enable_msix(struct pqi_ctrl_info *ctrl_info);
>   void sis_enable_intx(struct pqi_ctrl_info *ctrl_info);
> -void sis_shutdown_ctrl(struct pqi_ctrl_info *ctrl_info);
> +void sis_shutdown_ctrl(struct pqi_ctrl_info *ctrl_info,
> +	enum pqi_ctrl_shutdown_reason ctrl_shutdown_reason);
>   int sis_pqi_reset_quiesce(struct pqi_ctrl_info *ctrl_info);
>   int sis_reenable_sis_mode(struct pqi_ctrl_info *ctrl_info);
>   void sis_write_driver_scratch(struct pqi_ctrl_info *ctrl_info, u32 value);
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 04/11] smartpqi: update LUN reset handler
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 04/11] smartpqi: update LUN reset handler Don Brace
@ 2021-09-30 18:22   ` john.p.donnelly
  0 siblings, 0 replies; 33+ messages in thread
From: john.p.donnelly @ 2021-09-30 18:22 UTC (permalink / raw)
  To: Don Brace, hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, mwilck, pmenzel,
	linux-kernel

On 9/28/21 6:54 PM, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> Enhance check for commands queued to the controller.
>   - Add new function pqi_nonempty_inbound_queue_count that
>     will wait for all I/O queued for submission to controller
>     across all queue groups to drain. Add helper functions
>     to obtain queue command counts for each queue group.
>     These queues should drain quickly as they are already staged
>     to be submitted down to the controller's IB queue.
> Enhance check for outstanding command completion.
>   - Update the count of outstanding commands while waiting.
>     This value was not re-obtained and was potentially causing
>     infinite wait for all completions.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Acked-by: John Donnelly <john.p.donnelly@oracle.com>

> ---
>   drivers/scsi/smartpqi/smartpqi_init.c | 105 ++++++++++++++++----------
>   1 file changed, 66 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index b6ac4d607664..01330fd67500 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -5799,64 +5799,91 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
>   	return rc;
>   }
>   
> -static int pqi_wait_until_queued_io_drained(struct pqi_ctrl_info *ctrl_info,
> -	struct pqi_queue_group *queue_group)
> +static unsigned int pqi_queued_io_count(struct pqi_ctrl_info *ctrl_info)
>   {
> +	unsigned int i;
>   	unsigned int path;
>   	unsigned long flags;
> -	bool list_is_empty;
> +	unsigned int queued_io_count;
> +	struct pqi_queue_group *queue_group;
> +	struct pqi_io_request *io_request;
>   
> -	for (path = 0; path < 2; path++) {
> -		while (1) {
> -			spin_lock_irqsave(
> -				&queue_group->submit_lock[path], flags);
> -			list_is_empty =
> -				list_empty(&queue_group->request_list[path]);
> -			spin_unlock_irqrestore(
> -				&queue_group->submit_lock[path], flags);
> -			if (list_is_empty)
> -				break;
> -			pqi_check_ctrl_health(ctrl_info);
> -			if (pqi_ctrl_offline(ctrl_info))
> -				return -ENXIO;
> -			usleep_range(1000, 2000);
> +	queued_io_count = 0;
> +
> +	for (i = 0; i < ctrl_info->num_queue_groups; i++) {
> +		queue_group = &ctrl_info->queue_groups[i];
> +		for (path = 0; path < 2; path++) {
> +			spin_lock_irqsave(&queue_group->submit_lock[path], flags);
> +			list_for_each_entry(io_request, &queue_group->request_list[path], request_list_entry)
> +				queued_io_count++;
> +			spin_unlock_irqrestore(&queue_group->submit_lock[path], flags);
>   		}
>   	}
>   
> -	return 0;
> +	return queued_io_count;
>   }
>   
> -static int pqi_wait_until_inbound_queues_empty(struct pqi_ctrl_info *ctrl_info)
> +static unsigned int pqi_nonempty_inbound_queue_count(struct pqi_ctrl_info *ctrl_info)
>   {
> -	int rc;
>   	unsigned int i;
>   	unsigned int path;
> +	unsigned int nonempty_inbound_queue_count;
>   	struct pqi_queue_group *queue_group;
>   	pqi_index_t iq_pi;
>   	pqi_index_t iq_ci;
>   
> +	nonempty_inbound_queue_count = 0;
> +
>   	for (i = 0; i < ctrl_info->num_queue_groups; i++) {
>   		queue_group = &ctrl_info->queue_groups[i];
> -
> -		rc = pqi_wait_until_queued_io_drained(ctrl_info, queue_group);
> -		if (rc)
> -			return rc;
> -
>   		for (path = 0; path < 2; path++) {
>   			iq_pi = queue_group->iq_pi_copy[path];
> +			iq_ci = readl(queue_group->iq_ci[path]);
> +			if (iq_ci != iq_pi)
> +				nonempty_inbound_queue_count++;
> +		}
> +	}
>   
> -			while (1) {
> -				iq_ci = readl(queue_group->iq_ci[path]);
> -				if (iq_ci == iq_pi)
> -					break;
> -				pqi_check_ctrl_health(ctrl_info);
> -				if (pqi_ctrl_offline(ctrl_info))
> -					return -ENXIO;
> -				usleep_range(1000, 2000);
> -			}
> +	return nonempty_inbound_queue_count;
> +}
> +
> +#define PQI_INBOUND_QUEUES_NONEMPTY_WARNING_TIMEOUT_SECS	10
> +
> +static int pqi_wait_until_inbound_queues_empty(struct pqi_ctrl_info *ctrl_info)
> +{
> +	unsigned long start_jiffies;
> +	unsigned long warning_timeout;
> +	unsigned int queued_io_count;
> +	unsigned int nonempty_inbound_queue_count;
> +	bool displayed_warning;
> +
> +	displayed_warning = false;
> +	start_jiffies = jiffies;
> +	warning_timeout = (PQI_INBOUND_QUEUES_NONEMPTY_WARNING_TIMEOUT_SECS * PQI_HZ) + start_jiffies;
> +
> +	while (1) {
> +		queued_io_count = pqi_queued_io_count(ctrl_info);
> +		nonempty_inbound_queue_count = pqi_nonempty_inbound_queue_count(ctrl_info);
> +		if (queued_io_count == 0 && nonempty_inbound_queue_count == 0)
> +			break;
> +		pqi_check_ctrl_health(ctrl_info);
> +		if (pqi_ctrl_offline(ctrl_info))
> +			return -ENXIO;
> +		if (time_after(jiffies, warning_timeout)) {
> +			dev_warn(&ctrl_info->pci_dev->dev,
> +				"waiting %u seconds for queued I/O to drain (queued I/O count: %u; non-empty inbound queue count: %u)\n",
> +				jiffies_to_msecs(jiffies - start_jiffies) / 1000, queued_io_count, nonempty_inbound_queue_count);
> +			displayed_warning = true;
> +			warning_timeout = (PQI_INBOUND_QUEUES_NONEMPTY_WARNING_TIMEOUT_SECS * PQI_HZ) + jiffies;
>   		}
> +		usleep_range(1000, 2000);
>   	}
>   
> +	if (displayed_warning)
> +		dev_warn(&ctrl_info->pci_dev->dev,
> +			"queued I/O drained after waiting for %u seconds\n",
> +			jiffies_to_msecs(jiffies - start_jiffies) / 1000);
> +
>   	return 0;
>   }
>   
> @@ -5922,7 +5949,7 @@ static int pqi_device_wait_for_pending_io(struct pqi_ctrl_info *ctrl_info,
>   		if (pqi_ctrl_offline(ctrl_info))
>   			return -ENXIO;
>   		msecs_waiting = jiffies_to_msecs(jiffies - start_jiffies);
> -		if (msecs_waiting > timeout_msecs) {
> +		if (msecs_waiting >= timeout_msecs) {
>   			dev_err(&ctrl_info->pci_dev->dev,
>   				"scsi %d:%d:%d:%d: timed out after %lu seconds waiting for %d outstanding command(s)\n",
>   				ctrl_info->scsi_host->host_no, device->bus, device->target,
> @@ -5957,6 +5984,7 @@ static int pqi_wait_for_lun_reset_completion(struct pqi_ctrl_info *ctrl_info,
>   {
>   	int rc;
>   	unsigned int wait_secs;
> +	int cmds_outstanding;
>   
>   	wait_secs = 0;
>   
> @@ -5974,11 +6002,10 @@ static int pqi_wait_for_lun_reset_completion(struct pqi_ctrl_info *ctrl_info,
>   		}
>   
>   		wait_secs += PQI_LUN_RESET_POLL_COMPLETION_SECS;
> -
> +		cmds_outstanding = atomic_read(&device->scsi_cmds_outstanding);
>   		dev_warn(&ctrl_info->pci_dev->dev,
> -			"scsi %d:%d:%d:%d: waiting %u seconds for LUN reset to complete\n",
> -			ctrl_info->scsi_host->host_no, device->bus, device->target, device->lun,
> -			wait_secs);
> +			"scsi %d:%d:%d:%d: waiting %u seconds for LUN reset to complete (%d command(s) outstanding)\n",
> +			ctrl_info->scsi_host->host_no, device->bus, device->target, device->lun, wait_secs, cmds_outstanding);
>   	}
>   
>   	return rc;
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 05/11] smartpqi: add tur check for sanitize operation
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 05/11] smartpqi: add tur check for sanitize operation Don Brace
  2021-09-29  7:56   ` Paul Menzel
@ 2021-09-30 18:23   ` john.p.donnelly
  1 sibling, 0 replies; 33+ messages in thread
From: john.p.donnelly @ 2021-09-30 18:23 UTC (permalink / raw)
  To: Don Brace, hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, mwilck, pmenzel,
	linux-kernel

On 9/28/21 6:54 PM, Don Brace wrote:
> Add in a TUR to HBA disks and do not present them to the OS if
> 0x02/0x04/0x1b (sanitize in progress) is returned.
> 
> During boot-up, some OSes appear to hang when there are one or
> more disks undergoing sanitize.
> 
> According to SCSI SBC4 specification
> section 4.11.2 Commands allowed during sanitize,
> some SCSI commands are permitted, but read/write operations are not.
> 
> When the OS attempts to read the disk partition table a
> CHECK CONDITION ASC 0x04 ASCQ 0x1b is returned which causes the OS
> to retry the read until sanitize has completed. This can take hours.
> 
> Note: According to document HPE Smart Storage Administrator User Guide
> Link: https://support.hpe.com/hpesc/public/docDisplay?docLocale=en_US&docId=c03909334
> 
> During the sanitize erase operation, the drive is unusable.
> I.E.
>        The expected behavior for sanitize is the that disk remains
>        offline even after sanitize has completed. The customer is
>        expected to re-enable the disk using the management utility.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---

Acked-by: John Donnelly <john.p.donnelly@oracle.com>


>   drivers/scsi/smartpqi/smartpqi_init.c | 87 +++++++++++++++++++++++++++
>   1 file changed, 87 insertions(+)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 01330fd67500..838274d8fadf 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -555,6 +555,10 @@ static int pqi_build_raid_path_request(struct pqi_ctrl_info *ctrl_info,
>   	cdb = request->cdb;
>   
>   	switch (cmd) {
> +	case TEST_UNIT_READY:
> +		request->data_direction = SOP_READ_FLAG;
> +		cdb[0] = TEST_UNIT_READY;
> +		break;
>   	case INQUIRY:
>   		request->data_direction = SOP_READ_FLAG;
>   		cdb[0] = INQUIRY;
> @@ -1575,6 +1579,85 @@ static int pqi_get_logical_device_info(struct pqi_ctrl_info *ctrl_info,
>   	return rc;
>   }
>   
> +/*
> + * Prevent adding drive to OS for some corner cases such as a drive
> + * undergoing a sanitize operation. Some OSes will continue to poll
> + * the drive until the sanitize completes, which can take hours,
> + * resulting in long bootup delays. Commands such as TUR, READ_CAP
> + * are allowed, but READ/WRITE cause check condition. So the OS
> + * cannot check/read the partition table.
> + * Note: devices that have completed sanitize must be re-enabled
> + *       using the management utility.
> + */
> +static bool pqi_keep_device_offline(struct pqi_ctrl_info *ctrl_info,
> +	struct pqi_scsi_dev *device)
> +{
> +	u8 scsi_status;
> +	int rc;
> +	enum dma_data_direction dir;
> +	char *buffer;
> +	int buffer_length = 64;
> +	size_t sense_data_length;
> +	struct scsi_sense_hdr sshdr;
> +	struct pqi_raid_path_request request;
> +	struct pqi_raid_error_info error_info;
> +	bool offline = false; /* Assume keep online */
> +
> +	/* Do not check controllers. */
> +	if (pqi_is_hba_lunid(device->scsi3addr))
> +		return false;
> +
> +	/* Do not check LVs. */
> +	if (pqi_is_logical_device(device))
> +		return false;
> +
> +	buffer = kmalloc(buffer_length, GFP_KERNEL);
> +	if (!buffer)
> +		return false; /* Assume not offline */
> +
> +	/* Check for SANITIZE in progress using TUR */
> +	rc = pqi_build_raid_path_request(ctrl_info, &request,
> +		TEST_UNIT_READY, RAID_CTLR_LUNID, buffer,
> +		buffer_length, 0, &dir);
> +	if (rc)
> +		goto out; /* Assume not offline */
> +
> +	memcpy(request.lun_number, device->scsi3addr, sizeof(request.lun_number));
> +
> +	rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header, 0, &error_info);
> +
> +	if (rc)
> +		goto out; /* Assume not offline */
> +
> +	scsi_status = error_info.status;
> +	sense_data_length = get_unaligned_le16(&error_info.sense_data_length);
> +	if (sense_data_length == 0)
> +		sense_data_length =
> +			get_unaligned_le16(&error_info.response_data_length);
> +	if (sense_data_length) {
> +		if (sense_data_length > sizeof(error_info.data))
> +			sense_data_length = sizeof(error_info.data);
> +
> +		/*
> +		 * Check for sanitize in progress: asc:0x04, ascq: 0x1b
> +		 */
> +		if (scsi_status == SAM_STAT_CHECK_CONDITION &&
> +			scsi_normalize_sense(error_info.data,
> +				sense_data_length, &sshdr) &&
> +				sshdr.sense_key == NOT_READY &&
> +				sshdr.asc == 0x04 &&
> +				sshdr.ascq == 0x1b) {
> +			device->device_offline = true;
> +			offline = true;
> +			goto out; /* Keep device offline */
> +		}
> +	}
> +
> +out:
> +	kfree(buffer);
> +	return offline;
> +}
> +
>   static int pqi_get_device_info(struct pqi_ctrl_info *ctrl_info,
>   	struct pqi_scsi_dev *device,
>   	struct bmic_identify_physical_device *id_phys)
> @@ -2296,6 +2379,10 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
>   		if (!pqi_is_supported_device(device))
>   			continue;
>   
> +		/* Do not present disks that the OS cannot fully probe */
> +		if (pqi_keep_device_offline(ctrl_info, device))
> +			continue;
> +
>   		/* Gather information about the device. */
>   		rc = pqi_get_device_info(ctrl_info, device, id_phys);
>   		if (rc == -ENOMEM) {
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 06/11] smartpqi: avoid failing ios for offline devices
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 06/11] smartpqi: avoid failing ios for offline devices Don Brace
@ 2021-09-30 18:23   ` john.p.donnelly
  0 siblings, 0 replies; 33+ messages in thread
From: john.p.donnelly @ 2021-09-30 18:23 UTC (permalink / raw)
  To: Don Brace, hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, mwilck, pmenzel,
	linux-kernel

On 9/28/21 6:54 PM, Don Brace wrote:
> From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
> 
> Prevent kernel crash by failing outstanding I/O request
> when the OS takes device offline.
> 
> When posted IOs to the controller's inbound queue are
> not picked by the controller, the driver will halt the
> controller and take the controller offline.
> 
> When the driver takes the controller offline,
> the driver will fail all the outstanding requests which
> can sometime lead to an OS crash.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Acked-by: John Donnelly <john.p.donnelly@oracle.com>


> ---
>   drivers/scsi/smartpqi/smartpqi_init.c | 9 ++++++++-
>   1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 838274d8fadf..c9f2a3d54663 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -8544,6 +8544,7 @@ static void pqi_fail_all_outstanding_requests(struct pqi_ctrl_info *ctrl_info)
>   	unsigned int i;
>   	struct pqi_io_request *io_request;
>   	struct scsi_cmnd *scmd;
> +	struct scsi_device *sdev;
>   
>   	for (i = 0; i < ctrl_info->max_io_slots; i++) {
>   		io_request = &ctrl_info->io_request_pool[i];
> @@ -8552,7 +8553,13 @@ static void pqi_fail_all_outstanding_requests(struct pqi_ctrl_info *ctrl_info)
>   
>   		scmd = io_request->scmd;
>   		if (scmd) {
> -			set_host_byte(scmd, DID_NO_CONNECT);
> +			sdev = scmd->device;
> +			if (!sdev || !scsi_device_online(sdev)) {
> +				pqi_free_io_request(io_request);
> +				continue;
> +			} else {
> +				set_host_byte(scmd, DID_NO_CONNECT);
> +			}
>   		} else {
>   			io_request->status = -ENXIO;
>   			io_request->error_info =
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 07/11] smartpqi: add extended report physical luns
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 07/11] smartpqi: add extended report physical luns Don Brace
@ 2021-09-30 18:23   ` john.p.donnelly
  0 siblings, 0 replies; 33+ messages in thread
From: john.p.donnelly @ 2021-09-30 18:23 UTC (permalink / raw)
  To: Don Brace, hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, mwilck, pmenzel,
	linux-kernel

On 9/28/21 6:54 PM, Don Brace wrote:
> From: Mike McGowen <Mike.McGowen@microchip.com>
> 
> Add support for the new extended formats in
> the data returned from the Report Physical LUNs
> command for controllers that enable this feature.
> 
> The new formats allow the reporting of 16-byte WWIDs.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Acked-by: John Donnelly <john.p.donnelly@oracle.com>



> ---
>   drivers/scsi/smartpqi/smartpqi.h              |  37 +++-
>   drivers/scsi/smartpqi/smartpqi_init.c         | 163 +++++++++++++-----
>   .../scsi/smartpqi/smartpqi_sas_transport.c    |   6 +-
>   3 files changed, 147 insertions(+), 59 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
> index d66863f8d1cf..c439583a4ca5 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -868,7 +868,8 @@ struct pqi_config_table_firmware_features {
>   #define PQI_FIRMWARE_FEATURE_RAID_BYPASS_ON_ENCRYPTED_NVME	15
>   #define PQI_FIRMWARE_FEATURE_UNIQUE_WWID_IN_REPORT_PHYS_LUN	16
>   #define PQI_FIRMWARE_FEATURE_FW_TRIAGE				17
> -#define PQI_FIRMWARE_FEATURE_MAXIMUM				17
> +#define PQI_FIRMWARE_FEATURE_RPL_EXTENDED_FORMAT_4_5		18
> +#define PQI_FIRMWARE_FEATURE_MAXIMUM				18
>   
>   struct pqi_config_table_debug {
>   	struct pqi_config_table_section_header header;
> @@ -943,19 +944,21 @@ struct report_lun_header {
>   #define CISS_REPORT_LOG_FLAG_QUEUE_DEPTH	(1 << 5)
>   #define CISS_REPORT_LOG_FLAG_DRIVE_TYPE_MIX	(1 << 6)
>   
> -#define CISS_REPORT_PHYS_FLAG_OTHER		(1 << 1)
> +#define CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_2		0x2
> +#define CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_4		0x4
> +#define CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_MASK	0xf
>   
> -struct report_log_lun_extended_entry {
> +struct report_log_lun {
>   	u8	lunid[8];
>   	u8	volume_id[16];
>   };
>   
> -struct report_log_lun_extended {
> +struct report_log_lun_list {
>   	struct report_lun_header header;
> -	struct report_log_lun_extended_entry lun_entries[1];
> +	struct report_log_lun lun_entries[1];
>   };
>   
> -struct report_phys_lun_extended_entry {
> +struct report_phys_lun_8byte_wwid {
>   	u8	lunid[8];
>   	__be64	wwid;
>   	u8	device_type;
> @@ -965,12 +968,27 @@ struct report_phys_lun_extended_entry {
>   	u32	aio_handle;
>   };
>   
> +struct report_phys_lun_16byte_wwid {
> +	u8	lunid[8];
> +	u8	wwid[16];
> +	u8	device_type;
> +	u8	device_flags;
> +	u8	lun_count;	/* number of LUNs in a multi-LUN device */
> +	u8	redundant_paths;
> +	u32	aio_handle;
> +};
> +
>   /* for device_flags field of struct report_phys_lun_extended_entry */
>   #define CISS_REPORT_PHYS_DEV_FLAG_AIO_ENABLED	0x8
>   
> -struct report_phys_lun_extended {
> +struct report_phys_lun_8byte_wwid_list {
> +	struct report_lun_header header;
> +	struct report_phys_lun_8byte_wwid lun_entries[1];
> +};
> +
> +struct report_phys_lun_16byte_wwid_list {
>   	struct report_lun_header header;
> -	struct report_phys_lun_extended_entry lun_entries[1];
> +	struct report_phys_lun_16byte_wwid lun_entries[1];
>   };
>   
>   struct raid_map_disk_data {
> @@ -1077,7 +1095,7 @@ struct pqi_scsi_dev {
>   	int	target;
>   	int	lun;
>   	u8	scsi3addr[8];
> -	__be64	wwid;
> +	u8	wwid[16];
>   	u8	volume_id[16];
>   	u8	is_physical_device : 1;
>   	u8	is_external_raid_device : 1;
> @@ -1316,6 +1334,7 @@ struct pqi_ctrl_info {
>   	u8		tmf_iu_timeout_supported : 1;
>   	u8		unique_wwid_in_report_phys_lun_supported : 1;
>   	u8		firmware_triage_supported : 1;
> +	u8		rpl_extended_format_4_5_supported : 1;
>   	u8		enable_r1_writes : 1;
>   	u8		enable_r5_writes : 1;
>   	u8		enable_r6_writes : 1;
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index c9f2a3d54663..1e27e6ba0159 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -572,10 +572,14 @@ static int pqi_build_raid_path_request(struct pqi_ctrl_info *ctrl_info,
>   	case CISS_REPORT_PHYS:
>   		request->data_direction = SOP_READ_FLAG;
>   		cdb[0] = cmd;
> -		if (cmd == CISS_REPORT_PHYS)
> -			cdb[1] = CISS_REPORT_PHYS_FLAG_OTHER;
> -		else
> +		if (cmd == CISS_REPORT_PHYS) {
> +			if (ctrl_info->rpl_extended_format_4_5_supported)
> +				cdb[1] = CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_4;
> +			else
> +				cdb[1] = CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_2;
> +		} else {
>   			cdb[1] = ctrl_info->ciss_report_log_flags;
> +		}
>   		put_unaligned_be32(cdb_length, &cdb[6]);
>   		break;
>   	case CISS_GET_RAID_MAP:
> @@ -1132,7 +1136,64 @@ static int pqi_report_phys_logical_luns(struct pqi_ctrl_info *ctrl_info, u8 cmd,
>   
>   static inline int pqi_report_phys_luns(struct pqi_ctrl_info *ctrl_info, void **buffer)
>   {
> -	return pqi_report_phys_logical_luns(ctrl_info, CISS_REPORT_PHYS, buffer);
> +	int rc;
> +	unsigned int i;
> +	u8 rpl_response_format;
> +	u32 num_physicals;
> +	size_t rpl_16byte_wwid_list_length;
> +	void *rpl_list;
> +	struct report_lun_header *rpl_header;
> +	struct report_phys_lun_8byte_wwid_list *rpl_8byte_wwid_list;
> +	struct report_phys_lun_16byte_wwid_list *rpl_16byte_wwid_list;
> +
> +	rc = pqi_report_phys_logical_luns(ctrl_info, CISS_REPORT_PHYS, &rpl_list);
> +	if (rc)
> +		return rc;
> +
> +	if (ctrl_info->rpl_extended_format_4_5_supported) {
> +		rpl_header = rpl_list;
> +		rpl_response_format = rpl_header->flags & CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_MASK;
> +		if (rpl_response_format == CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_4) {
> +			*buffer = rpl_list;
> +			return 0;
> +		} else if (rpl_response_format != CISS_REPORT_PHYS_FLAG_EXTENDED_FORMAT_2) {
> +			dev_err(&ctrl_info->pci_dev->dev,
> +				"RPL returned unsupported data format %u\n",
> +				rpl_response_format);
> +			return -EINVAL;
> +		} else {
> +			dev_warn(&ctrl_info->pci_dev->dev,
> +				"RPL returned extended format 2 instead of 4\n");
> +		}
> +	}
> +
> +	rpl_8byte_wwid_list = rpl_list;
> +	num_physicals = get_unaligned_be32(&rpl_8byte_wwid_list->header.list_length) / sizeof(rpl_8byte_wwid_list->lun_entries[0]);
> +	rpl_16byte_wwid_list_length = sizeof(struct report_lun_header) + (num_physicals * sizeof(struct report_phys_lun_16byte_wwid));
> +
> +	rpl_16byte_wwid_list = kmalloc(rpl_16byte_wwid_list_length, GFP_KERNEL);
> +	if (!rpl_16byte_wwid_list)
> +		return -ENOMEM;
> +
> +	put_unaligned_be32(num_physicals * sizeof(struct report_phys_lun_16byte_wwid),
> +		&rpl_16byte_wwid_list->header.list_length);
> +	rpl_16byte_wwid_list->header.flags = rpl_8byte_wwid_list->header.flags;
> +
> +	for (i = 0; i < num_physicals; i++) {
> +		memcpy(&rpl_16byte_wwid_list->lun_entries[i].lunid, &rpl_8byte_wwid_list->lun_entries[i].lunid, sizeof(rpl_8byte_wwid_list->lun_entries[i].lunid));
> +		memset(&rpl_16byte_wwid_list->lun_entries[i].wwid, 0, 8);
> +		memcpy(&rpl_16byte_wwid_list->lun_entries[i].wwid[8], &rpl_8byte_wwid_list->lun_entries[i].wwid, sizeof(rpl_8byte_wwid_list->lun_entries[i].wwid));
> +		rpl_16byte_wwid_list->lun_entries[i].device_type = rpl_8byte_wwid_list->lun_entries[i].device_type;
> +		rpl_16byte_wwid_list->lun_entries[i].device_flags = rpl_8byte_wwid_list->lun_entries[i].device_flags;
> +		rpl_16byte_wwid_list->lun_entries[i].lun_count = rpl_8byte_wwid_list->lun_entries[i].lun_count;
> +		rpl_16byte_wwid_list->lun_entries[i].redundant_paths = rpl_8byte_wwid_list->lun_entries[i].redundant_paths;
> +		rpl_16byte_wwid_list->lun_entries[i].aio_handle = rpl_8byte_wwid_list->lun_entries[i].aio_handle;
> +	}
> +
> +	kfree(rpl_8byte_wwid_list);
> +	*buffer = rpl_16byte_wwid_list;
> +
> +	return 0;
>   }
>   
>   static inline int pqi_report_logical_luns(struct pqi_ctrl_info *ctrl_info, void **buffer)
> @@ -1141,14 +1202,14 @@ static inline int pqi_report_logical_luns(struct pqi_ctrl_info *ctrl_info, void
>   }
>   
>   static int pqi_get_device_lists(struct pqi_ctrl_info *ctrl_info,
> -	struct report_phys_lun_extended **physdev_list,
> -	struct report_log_lun_extended **logdev_list)
> +	struct report_phys_lun_16byte_wwid_list **physdev_list,
> +	struct report_log_lun_list **logdev_list)
>   {
>   	int rc;
>   	size_t logdev_list_length;
>   	size_t logdev_data_length;
> -	struct report_log_lun_extended *internal_logdev_list;
> -	struct report_log_lun_extended *logdev_data;
> +	struct report_log_lun_list *internal_logdev_list;
> +	struct report_log_lun_list *logdev_data;
>   	struct report_lun_header report_lun_header;
>   
>   	rc = pqi_report_phys_luns(ctrl_info, (void **)physdev_list);
> @@ -1173,7 +1234,7 @@ static int pqi_get_device_lists(struct pqi_ctrl_info *ctrl_info,
>   	} else {
>   		memset(&report_lun_header, 0, sizeof(report_lun_header));
>   		logdev_data =
> -			(struct report_log_lun_extended *)&report_lun_header;
> +			(struct report_log_lun_list *)&report_lun_header;
>   		logdev_list_length = 0;
>   	}
>   
> @@ -1181,7 +1242,7 @@ static int pqi_get_device_lists(struct pqi_ctrl_info *ctrl_info,
>   		logdev_list_length;
>   
>   	internal_logdev_list = kmalloc(logdev_data_length +
> -		sizeof(struct report_log_lun_extended), GFP_KERNEL);
> +		sizeof(struct report_log_lun), GFP_KERNEL);
>   	if (!internal_logdev_list) {
>   		kfree(*logdev_list);
>   		*logdev_list = NULL;
> @@ -1190,9 +1251,9 @@ static int pqi_get_device_lists(struct pqi_ctrl_info *ctrl_info,
>   
>   	memcpy(internal_logdev_list, logdev_data, logdev_data_length);
>   	memset((u8 *)internal_logdev_list + logdev_data_length, 0,
> -		sizeof(struct report_log_lun_extended_entry));
> +		sizeof(struct report_log_lun));
>   	put_unaligned_be32(logdev_list_length +
> -		sizeof(struct report_log_lun_extended_entry),
> +		sizeof(struct report_log_lun),
>   		&internal_logdev_list->header.list_length);
>   
>   	kfree(*logdev_list);
> @@ -1845,7 +1906,7 @@ static inline bool pqi_device_equal(struct pqi_scsi_dev *dev1, struct pqi_scsi_d
>   		return false;
>   
>   	if (dev1->is_physical_device)
> -		return dev1->wwid == dev2->wwid;
> +		return memcmp(dev1->wwid, dev2->wwid, sizeof(dev1->wwid)) == 0;
>   
>   	return memcmp(dev1->volume_id, dev2->volume_id, sizeof(dev1->volume_id)) == 0;
>   }
> @@ -1915,7 +1976,9 @@ static void pqi_dev_info(struct pqi_ctrl_info *ctrl_info,
>   	else
>   		count += scnprintf(buffer + count,
>   			PQI_DEV_INFO_BUFFER_LENGTH - count,
> -			" %016llx", device->sas_address);
> +			" %016llx%016llx",
> +			get_unaligned_be64(&device->wwid[0]),
> +			get_unaligned_be64(&device->wwid[8]));
>   
>   	count += scnprintf(buffer + count, PQI_DEV_INFO_BUFFER_LENGTH - count,
>   		" %s %.8s %.16s ",
> @@ -2229,13 +2292,14 @@ static inline bool pqi_expose_device(struct pqi_scsi_dev *device)
>   }
>   
>   static inline void pqi_set_physical_device_wwid(struct pqi_ctrl_info *ctrl_info,
> -	struct pqi_scsi_dev *device, struct report_phys_lun_extended_entry *phys_lun_ext_entry)
> +	struct pqi_scsi_dev *device, struct report_phys_lun_16byte_wwid *phys_lun)
>   {
>   	if (ctrl_info->unique_wwid_in_report_phys_lun_supported ||
> +		ctrl_info->rpl_extended_format_4_5_supported ||
>   		pqi_is_device_with_sas_address(device))
> -		device->wwid = phys_lun_ext_entry->wwid;
> +		memcpy(device->wwid, phys_lun->wwid, sizeof(device->wwid));
>   	else
> -		device->wwid = cpu_to_be64(get_unaligned_be64(&device->page_83_identifier));
> +		memcpy(&device->wwid[8], device->page_83_identifier, 8);
>   }
>   
>   static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
> @@ -2243,10 +2307,10 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
>   	int i;
>   	int rc;
>   	LIST_HEAD(new_device_list_head);
> -	struct report_phys_lun_extended *physdev_list = NULL;
> -	struct report_log_lun_extended *logdev_list = NULL;
> -	struct report_phys_lun_extended_entry *phys_lun_ext_entry;
> -	struct report_log_lun_extended_entry *log_lun_ext_entry;
> +	struct report_phys_lun_16byte_wwid_list *physdev_list = NULL;
> +	struct report_log_lun_list *logdev_list = NULL;
> +	struct report_phys_lun_16byte_wwid *phys_lun;
> +	struct report_log_lun *log_lun;
>   	struct bmic_identify_physical_device *id_phys = NULL;
>   	u32 num_physicals;
>   	u32 num_logicals;
> @@ -2297,10 +2361,9 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
>   
>   		if (pqi_hide_vsep) {
>   			for (i = num_physicals - 1; i >= 0; i--) {
> -				phys_lun_ext_entry =
> -						&physdev_list->lun_entries[i];
> -				if (CISS_GET_DRIVE_NUMBER(phys_lun_ext_entry->lunid) == PQI_VSEP_CISS_BTL) {
> -					pqi_mask_device(phys_lun_ext_entry->lunid);
> +				phys_lun = &physdev_list->lun_entries[i];
> +				if (CISS_GET_DRIVE_NUMBER(phys_lun->lunid) == PQI_VSEP_CISS_BTL) {
> +					pqi_mask_device(phys_lun->lunid);
>   					break;
>   				}
>   			}
> @@ -2344,16 +2407,14 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
>   		if ((!pqi_expose_ld_first && i < num_physicals) ||
>   			(pqi_expose_ld_first && i >= num_logicals)) {
>   			is_physical_device = true;
> -			phys_lun_ext_entry =
> -				&physdev_list->lun_entries[physical_index++];
> -			log_lun_ext_entry = NULL;
> -			scsi3addr = phys_lun_ext_entry->lunid;
> +			phys_lun = &physdev_list->lun_entries[physical_index++];
> +			log_lun = NULL;
> +			scsi3addr = phys_lun->lunid;
>   		} else {
>   			is_physical_device = false;
> -			phys_lun_ext_entry = NULL;
> -			log_lun_ext_entry =
> -				&logdev_list->lun_entries[logical_index++];
> -			scsi3addr = log_lun_ext_entry->lunid;
> +			phys_lun = NULL;
> +			log_lun = &logdev_list->lun_entries[logical_index++];
> +			scsi3addr = log_lun->lunid;
>   		}
>   
>   		if (is_physical_device && pqi_skip_device(scsi3addr))
> @@ -2368,7 +2429,7 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
>   		memcpy(device->scsi3addr, scsi3addr, sizeof(device->scsi3addr));
>   		device->is_physical_device = is_physical_device;
>   		if (is_physical_device) {
> -			device->device_type = phys_lun_ext_entry->device_type;
> +			device->device_type = phys_lun->device_type;
>   			if (device->device_type == SA_DEVICE_TYPE_EXPANDER_SMP)
>   				device->is_expander_smp_device = true;
>   		} else {
> @@ -2393,8 +2454,9 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
>   		if (rc) {
>   			if (device->is_physical_device)
>   				dev_warn(&ctrl_info->pci_dev->dev,
> -					"obtaining device info failed, skipping physical device %016llx\n",
> -					get_unaligned_be64(&phys_lun_ext_entry->wwid));
> +					"obtaining device info failed, skipping physical device %016llx%016llx\n",
> +					get_unaligned_be64(&phys_lun->wwid[0]),
> +					get_unaligned_be64(&phys_lun->wwid[8]));
>   			else
>   				dev_warn(&ctrl_info->pci_dev->dev,
>   					"obtaining device info failed, skipping logical device %08x%08x\n",
> @@ -2407,21 +2469,21 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
>   		pqi_assign_bus_target_lun(device);
>   
>   		if (device->is_physical_device) {
> -			pqi_set_physical_device_wwid(ctrl_info, device, phys_lun_ext_entry);
> -			if ((phys_lun_ext_entry->device_flags &
> +			pqi_set_physical_device_wwid(ctrl_info, device, phys_lun);
> +			if ((phys_lun->device_flags &
>   				CISS_REPORT_PHYS_DEV_FLAG_AIO_ENABLED) &&
> -				phys_lun_ext_entry->aio_handle) {
> +				phys_lun->aio_handle) {
>   					device->aio_enabled = true;
>   					device->aio_handle =
> -						phys_lun_ext_entry->aio_handle;
> +						phys_lun->aio_handle;
>   			}
>   		} else {
> -			memcpy(device->volume_id, log_lun_ext_entry->volume_id,
> +			memcpy(device->volume_id, log_lun->volume_id,
>   				sizeof(device->volume_id));
>   		}
>   
>   		if (pqi_is_device_with_sas_address(device))
> -			device->sas_address = get_unaligned_be64(&device->wwid);
> +			device->sas_address = get_unaligned_be64(&device->wwid[8]);
>   
>   		new_device_list[num_valid_devices++] = device;
>   	}
> @@ -6804,12 +6866,10 @@ static ssize_t pqi_unique_id_show(struct device *dev,
>   		return -ENODEV;
>   	}
>   
> -	if (device->is_physical_device) {
> -		memset(unique_id, 0, 8);
> -		memcpy(unique_id + 8, &device->wwid, sizeof(device->wwid));
> -	} else {
> +	if (device->is_physical_device)
> +		memcpy(unique_id, device->wwid, sizeof(device->wwid));
> +	else
>   		memcpy(unique_id, device->volume_id, sizeof(device->volume_id));
> -	}
>   
>   	spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock, flags);
>   
> @@ -7443,6 +7503,9 @@ static void pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
>   		ctrl_info->firmware_triage_supported = firmware_feature->enabled;
>   		pqi_save_fw_triage_setting(ctrl_info, firmware_feature->enabled);
>   		break;
> +	case PQI_FIRMWARE_FEATURE_RPL_EXTENDED_FORMAT_4_5:
> +		ctrl_info->rpl_extended_format_4_5_supported = firmware_feature->enabled;
> +		break;
>   	}
>   
>   	pqi_firmware_feature_status(ctrl_info, firmware_feature);
> @@ -7543,6 +7606,11 @@ static struct pqi_firmware_feature pqi_firmware_features[] = {
>   		.feature_bit = PQI_FIRMWARE_FEATURE_FW_TRIAGE,
>   		.feature_status = pqi_ctrl_update_feature_flags,
>   	},
> +	{
> +		.feature_name = "RPL Extended Formats 4 and 5",
> +		.feature_bit = PQI_FIRMWARE_FEATURE_RPL_EXTENDED_FORMAT_4_5,
> +		.feature_status = pqi_ctrl_update_feature_flags,
> +	},
>   };
>   
>   static void pqi_process_firmware_features(
> @@ -7644,6 +7712,7 @@ static void pqi_ctrl_reset_config(struct pqi_ctrl_info *ctrl_info)
>   	ctrl_info->tmf_iu_timeout_supported = false;
>   	ctrl_info->unique_wwid_in_report_phys_lun_supported = false;
>   	ctrl_info->firmware_triage_supported = false;
> +	ctrl_info->rpl_extended_format_4_5_supported = false;
>   }
>   
>   static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
> diff --git a/drivers/scsi/smartpqi/smartpqi_sas_transport.c b/drivers/scsi/smartpqi/smartpqi_sas_transport.c
> index afd9bafebd1d..dea4ebaf1677 100644
> --- a/drivers/scsi/smartpqi/smartpqi_sas_transport.c
> +++ b/drivers/scsi/smartpqi/smartpqi_sas_transport.c
> @@ -343,7 +343,7 @@ static int pqi_sas_get_enclosure_identifier(struct sas_rphy *rphy,
>   	}
>   
>   	if (found_device->devtype == TYPE_ENCLOSURE) {
> -		*identifier = get_unaligned_be64(&found_device->wwid);
> +		*identifier = get_unaligned_be64(&found_device->wwid[8]);
>   		rc = 0;
>   		goto out;
>   	}
> @@ -364,7 +364,7 @@ static int pqi_sas_get_enclosure_identifier(struct sas_rphy *rphy,
>   			memcmp(device->phys_connector,
>   				found_device->phys_connector, 2) == 0) {
>   			*identifier =
> -				get_unaligned_be64(&device->wwid);
> +				get_unaligned_be64(&device->wwid[8]);
>   			rc = 0;
>   			goto out;
>   		}
> @@ -380,7 +380,7 @@ static int pqi_sas_get_enclosure_identifier(struct sas_rphy *rphy,
>   		if (device->devtype == TYPE_ENCLOSURE &&
>   			CISS_GET_DRIVE_NUMBER(device->scsi3addr) ==
>   				PQI_VSEP_CISS_BTL) {
> -			*identifier = get_unaligned_be64(&device->wwid);
> +			*identifier = get_unaligned_be64(&device->wwid[8]);
>   			rc = 0;
>   			goto out;
>   		}
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 08/11] smartpqi: fix boot failure during lun rebuild
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 08/11] smartpqi: fix boot failure during lun rebuild Don Brace
@ 2021-09-30 18:24   ` john.p.donnelly
  0 siblings, 0 replies; 33+ messages in thread
From: john.p.donnelly @ 2021-09-30 18:24 UTC (permalink / raw)
  To: Don Brace, hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, mwilck, pmenzel,
	linux-kernel

On 9/28/21 6:54 PM, Don Brace wrote:
> From: Mike McGowen <Mike.McGowen@microchip.com>
> 
> Move the delay in the register polling loop to
> the beginning of the loop to ensure there is always
> a delay between writing the register and reading it.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>


Acked-by: John Donnelly <john.p.donnelly@oracle.com>


> ---
>   drivers/scsi/smartpqi/smartpqi_init.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 1e27e6ba0159..c28eb7ea4a24 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -4278,12 +4278,12 @@ static int pqi_create_admin_queues(struct pqi_ctrl_info *ctrl_info)
>   
>   	timeout = PQI_ADMIN_QUEUE_CREATE_TIMEOUT_JIFFIES + jiffies;
>   	while (1) {
> +		msleep(PQI_ADMIN_QUEUE_CREATE_POLL_INTERVAL_MSECS);
>   		status = readb(&pqi_registers->function_and_status_code);
>   		if (status == PQI_STATUS_IDLE)
>   			break;
>   		if (time_after(jiffies, timeout))
>   			return -ETIMEDOUT;
> -		msleep(PQI_ADMIN_QUEUE_CREATE_POLL_INTERVAL_MSECS);
>   	}
>   
>   	/*
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 09/11] smartpqi: fix duplicate device nodes for tape changers
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 09/11] smartpqi: fix duplicate device nodes for tape changers Don Brace
@ 2021-09-30 18:24   ` john.p.donnelly
  2021-10-01  8:26   ` Paul Menzel
  1 sibling, 0 replies; 33+ messages in thread
From: john.p.donnelly @ 2021-09-30 18:24 UTC (permalink / raw)
  To: Don Brace, hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, mwilck, pmenzel,
	linux-kernel

On 9/28/21 6:54 PM, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> Stop the OS from re-discovering multiple LUNs for
> tape drive and medium changer.
> 
> Duplicate device nodes for Ultrium tape drive and
> medium changer are being created.
> 
> The Ultrium tape drive is a multi-LUN SCSI target.
> It presents a LUN for the tape drive and a 2nd
> LUN for the medium changer.
> Our controller FW lists both LUNs in the RPL
> results.
> 
> As a result, the smartpqi driver exposes both
> devices to the OS. Then the OS does its normal
> device discovery via the SCSI REPORT LUNS command,
> which causes it to re-discover both devices a 2nd time,
> which results in the duplicate device nodes.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Acked-by: John Donnelly <john.p.donnelly@oracle.com>



> ---
>   drivers/scsi/smartpqi/smartpqi.h      |  1 +
>   drivers/scsi/smartpqi/smartpqi_init.c | 23 +++++++++++++++++++----
>   2 files changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
> index c439583a4ca5..aac88ac0a0b7 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -1106,6 +1106,7 @@ struct pqi_scsi_dev {
>   	u8	keep_device : 1;
>   	u8	volume_offline : 1;
>   	u8	rescan : 1;
> +	u8	ignore_device : 1;
>   	bool	aio_enabled;		/* only valid for physical disks */
>   	bool	in_remove;
>   	bool	device_offline;
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index c28eb7ea4a24..8be116992cb0 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -6297,9 +6297,13 @@ static int pqi_slave_alloc(struct scsi_device *sdev)
>   		rphy = target_to_rphy(starget);
>   		device = pqi_find_device_by_sas_rphy(ctrl_info, rphy);
>   		if (device) {
> -			device->target = sdev_id(sdev);
> -			device->lun = sdev->lun;
> -			device->target_lun_valid = true;
> +			if (device->target_lun_valid) {
> +				device->ignore_device = true;
> +			} else {
> +				device->target = sdev_id(sdev);
> +				device->lun = sdev->lun;
> +				device->target_lun_valid = true;
> +			}
>   		}
>   	} else {
>   		device = pqi_find_scsi_dev(ctrl_info, sdev_channel(sdev),
> @@ -6336,14 +6340,25 @@ static int pqi_map_queues(struct Scsi_Host *shost)
>   					ctrl_info->pci_dev, 0);
>   }
>   
> +static inline bool pqi_is_tape_changer_device(struct pqi_scsi_dev *device)
> +{
> +	return device->devtype == TYPE_TAPE || device->devtype == TYPE_MEDIUM_CHANGER;
> +}
> +
>   static int pqi_slave_configure(struct scsi_device *sdev)
>   {
> +	int rc = 0;
>   	struct pqi_scsi_dev *device;
>   
>   	device = sdev->hostdata;
>   	device->devtype = sdev->type;
>   
> -	return 0;
> +	if (pqi_is_tape_changer_device(device) && device->ignore_device) {
> +		rc = -ENXIO;
> +		device->ignore_device = false;
> +	}
> +
> +	return rc;
>   }
>   
>   static int pqi_getpciinfo_ioctl(struct pqi_ctrl_info *ctrl_info, void __user *arg)
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 10/11] smartpqi: add 3252-8i pci id
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 10/11] smartpqi: add 3252-8i pci id Don Brace
@ 2021-09-30 18:24   ` john.p.donnelly
  0 siblings, 0 replies; 33+ messages in thread
From: john.p.donnelly @ 2021-09-30 18:24 UTC (permalink / raw)
  To: Don Brace, hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, mwilck, pmenzel,
	linux-kernel

On 9/28/21 6:54 PM, Don Brace wrote:
> From: Mike McGowen <Mike.McGowen@microchip.com>
> 
> Added PCI ID information for the
> Adaptec SmartRAID 3252-8i controller:
>      9005 / 028F / 9005 / 14A2
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Acked-by: John Donnelly <john.p.donnelly@oracle.com>


> ---
>   drivers/scsi/smartpqi/smartpqi_init.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 8be116992cb0..ffa217874352 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -9287,6 +9287,10 @@ static const struct pci_device_id pqi_pci_id_table[] = {
>   		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
>   			       PCI_VENDOR_ID_ADAPTEC2, 0x14a1)
>   	},
> +	{
> +		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
> +			       PCI_VENDOR_ID_ADAPTEC2, 0x14a2)
> +	},
>   	{
>   		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
>   			       PCI_VENDOR_ID_ADAPTEC2, 0x14b0)
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 11/11] smartpqi: update version to 2.1.12-055
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 11/11] smartpqi: update version to 2.1.12-055 Don Brace
@ 2021-09-30 18:25   ` john.p.donnelly
  0 siblings, 0 replies; 33+ messages in thread
From: john.p.donnelly @ 2021-09-30 18:25 UTC (permalink / raw)
  To: Don Brace, hch, martin.petersen, jejb, linux-scsi
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, mwilck, pmenzel,
	linux-kernel

On 9/28/21 6:54 PM, Don Brace wrote:
> Update driver version to reflect changes.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---

Acked-by: John Donnelly <john.p.donnelly@oracle.com>


>   drivers/scsi/smartpqi/smartpqi_init.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index ffa217874352..b966ce3b4385 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -33,11 +33,11 @@
>   #define BUILD_TIMESTAMP
>   #endif
>   
> -#define DRIVER_VERSION		"2.1.10-020"
> +#define DRIVER_VERSION		"2.1.12-055"
>   #define DRIVER_MAJOR		2
>   #define DRIVER_MINOR		1
> -#define DRIVER_RELEASE		10
> -#define DRIVER_REVISION		20
> +#define DRIVER_RELEASE		12
> +#define DRIVER_REVISION		55
>   
>   #define DRIVER_NAME		"Microchip SmartPQI Driver (v" \
>   				DRIVER_VERSION BUILD_TIMESTAMP ")"
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 09/11] smartpqi: fix duplicate device nodes for tape changers
  2021-09-28 23:54 ` [smartpqi updates PATCH V2 09/11] smartpqi: fix duplicate device nodes for tape changers Don Brace
  2021-09-30 18:24   ` john.p.donnelly
@ 2021-10-01  8:26   ` Paul Menzel
  2021-10-05 20:23     ` Don.Brace
  1 sibling, 1 reply; 33+ messages in thread
From: Paul Menzel @ 2021-10-01  8:26 UTC (permalink / raw)
  To: Don Brace
  Cc: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, mike.mcgowen, murthy.bhat,
	balsundar.p, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, linux-kernel, hch, martin.petersen, jejb, linux-scsi

Dear Kevin, dear Don,


Am 29.09.21 um 01:54 schrieb Don Brace:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> Stop the OS from re-discovering multiple LUNs for
> tape drive and medium changer.
> 
> Duplicate device nodes for Ultrium tape drive and
> medium changer are being created.
> 
> The Ultrium tape drive is a multi-LUN SCSI target.
> It presents a LUN for the tape drive and a 2nd
> LUN for the medium changer.
> Our controller FW lists both LUNs in the RPL
> results.

Please document the firmware version (and controller) you tested with in 
the commit message.

> As a result, the smartpqi driver exposes both
> devices to the OS. Then the OS does its normal
> device discovery via the SCSI REPORT LUNS command,
> which causes it to re-discover both devices a 2nd time,
> which results in the duplicate device nodes.

Shortly describing the implementation (new struct member ignore_device) 
would be nice.

> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>   drivers/scsi/smartpqi/smartpqi.h      |  1 +
>   drivers/scsi/smartpqi/smartpqi_init.c | 23 +++++++++++++++++++----
>   2 files changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
> index c439583a4ca5..aac88ac0a0b7 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -1106,6 +1106,7 @@ struct pqi_scsi_dev {
>   	u8	keep_device : 1;
>   	u8	volume_offline : 1;
>   	u8	rescan : 1;
> +	u8	ignore_device : 1;

Why not type bool?

>   	bool	aio_enabled;		/* only valid for physical disks */
>   	bool	in_remove;
>   	bool	device_offline;
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index c28eb7ea4a24..8be116992cb0 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -6297,9 +6297,13 @@ static int pqi_slave_alloc(struct scsi_device *sdev)
>   		rphy = target_to_rphy(starget);
>   		device = pqi_find_device_by_sas_rphy(ctrl_info, rphy);
>   		if (device) {
> -			device->target = sdev_id(sdev);
> -			device->lun = sdev->lun;
> -			device->target_lun_valid = true;

Off topic, with `u8 target_lun_valid : 1`, why is `true` used.

> +			if (device->target_lun_valid) {
> +				device->ignore_device = true;
> +			} else {
> +				device->target = sdev_id(sdev);
> +				device->lun = sdev->lun;
> +				device->target_lun_valid = true;
> +			}

If the LUN should be ignored, is it actually valid? Why not extend 
target_lun_valid and add a third option (enums?) to ignore it?

>   		}
>   	} else {
>   		device = pqi_find_scsi_dev(ctrl_info, sdev_channel(sdev),
> @@ -6336,14 +6340,25 @@ static int pqi_map_queues(struct Scsi_Host *shost)
>   					ctrl_info->pci_dev, 0);
>   }
>   
> +static inline bool pqi_is_tape_changer_device(struct pqi_scsi_dev *device)
> +{
> +	return device->devtype == TYPE_TAPE || device->devtype == TYPE_MEDIUM_CHANGER;

Why also check for TYPE_TAPE? The function name should be updated then?

> +}
> +
>   static int pqi_slave_configure(struct scsi_device *sdev)
>   {
> +	int rc = 0;
>   	struct pqi_scsi_dev *device;
>   
>   	device = sdev->hostdata;
>   	device->devtype = sdev->type;
>   
> -	return 0;
> +	if (pqi_is_tape_changer_device(device) && device->ignore_device) {
> +		rc = -ENXIO;
> +		device->ignore_device = false;

I’d add a `return -ENXIO` here, and remove the variable.

> +	}
> +
> +	return rc;
>   }
>   
>   static int pqi_getpciinfo_ioctl(struct pqi_ctrl_info *ctrl_info, void __user *arg)
> 


Kind regards,

Paul

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [smartpqi updates PATCH V2 09/11] smartpqi: fix duplicate device nodes for tape changers
  2021-10-01  8:26   ` Paul Menzel
@ 2021-10-05 20:23     ` Don.Brace
  2021-10-06  2:37       ` Martin K. Petersen
  2021-10-07  9:38       ` Paul Menzel
  0 siblings, 2 replies; 33+ messages in thread
From: Don.Brace @ 2021-10-05 20:23 UTC (permalink / raw)
  To: pmenzel
  Cc: Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, Mike.McGowen, Murthy.Bhat,
	Balsundar.P, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, linux-kernel, hch, martin.petersen, jejb, linux-scsi

From: Paul Menzel [mailto:pmenzel@molgen.mpg.de] 

Subject: Re: [smartpqi updates PATCH V2 09/11] smartpqi: fix duplicate device nodes for tape changers

Dear Kevin, dear Don,
> Our controller FW lists both LUNs in the RPL results.

Please document the firmware version (and controller) you tested with in the commit message.

DON: Done in V3, thanks for your review.

Shortly describing the implementation (new struct member ignore_device) would be nice.
DON: Don in V3, thanks for your review.

>       u8      rescan : 1;
> +     u8      ignore_device : 1;

Why not type bool?
Don: They both take the same amount of memory and since the other members are also u8, the new member was also u8 for consistency.

> -                     device->lun = sdev->lun;
> -                     device->target_lun_valid = true;

Off topic, with `u8 target_lun_valid : 1`, why is `true` used.
Don: Has the same behavior, and carried forward from other member fields.

> +                     if (device->target_lun_valid) {
> +                             device->ignore_device = true;
> +                     } else {
> +                             device->target = sdev_id(sdev);
> +                             device->lun = sdev->lun;
> +                             device->target_lun_valid = true;
> +                     }

If the LUN should be ignored, is it actually valid? Why not extend target_lun_valid and add a third option (enums?) to ignore it?

Don: The reason is that it takes advantage of the order the devices are added and how slave_alloc and slave_configure fit into this order.

> +     return device->devtype == TYPE_TAPE || device->devtype == 
> +TYPE_MEDIUM_CHANGER;

Why also check for TYPE_TAPE? The function name should be updated then?
Don: Because out tape changer consisted of the changer and one or more tape units and both were duplicated.

>   static int pqi_slave_configure(struct scsi_device *sdev)
> +     if (pqi_is_tape_changer_device(device) && device->ignore_device) {
> +             rc = -ENXIO;
> +             device->ignore_device = false;

I’d add a `return -ENXIO` here, and remove the variable.
Don: This works in conjunction with slave_alloc and is needed.

>

Kind regards,
Paul

Thanks for your review. Appreciate the inspection.
Don and Kevin


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 09/11] smartpqi: fix duplicate device nodes for tape changers
  2021-10-05 20:23     ` Don.Brace
@ 2021-10-06  2:37       ` Martin K. Petersen
  2021-10-06 14:28         ` Don.Brace
  2021-10-07  9:38       ` Paul Menzel
  1 sibling, 1 reply; 33+ messages in thread
From: Martin K. Petersen @ 2021-10-06  2:37 UTC (permalink / raw)
  To: Don.Brace
  Cc: pmenzel, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, Mike.McGowen, Murthy.Bhat,
	Balsundar.P, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, linux-kernel, hch, martin.petersen, jejb, linux-scsi


Don,

> DON: Done in V3, thanks for your review.

Just a heads up that I already staged this for zeroday testing
yesterday. If you post a v3 I'll drop what I have. But I would prefer an
incremental patch.

Thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [smartpqi updates PATCH V2 09/11] smartpqi: fix duplicate device nodes for tape changers
  2021-10-06  2:37       ` Martin K. Petersen
@ 2021-10-06 14:28         ` Don.Brace
  0 siblings, 0 replies; 33+ messages in thread
From: Don.Brace @ 2021-10-06 14:28 UTC (permalink / raw)
  To: martin.petersen
  Cc: pmenzel, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, Mike.McGowen, Murthy.Bhat,
	Balsundar.P, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, linux-kernel, hch, jejb, linux-scsi

-----Original Message-----
From: Martin K. Petersen [mailto:martin.petersen@oracle.com] 
Subject: Re: [smartpqi updates PATCH V2 09/11] smartpqi: fix duplicate device nodes for tape changers

Don,

> DON: Done in V3, thanks for your review.

Just a heads up that I already staged this for zeroday testing yesterday. If you post a v3 I'll drop what I have. But I would prefer an incremental patch.

Thanks!

--
Martin K. Petersen      Oracle Linux Engineering

DON: Incremental patch it is. I appreciate this.
Thanks,
Don

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 09/11] smartpqi: fix duplicate device nodes for tape changers
  2021-10-05 20:23     ` Don.Brace
  2021-10-06  2:37       ` Martin K. Petersen
@ 2021-10-07  9:38       ` Paul Menzel
  1 sibling, 0 replies; 33+ messages in thread
From: Paul Menzel @ 2021-10-07  9:38 UTC (permalink / raw)
  To: Don Brace
  Cc: Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, Mike.McGowen, Murthy.Bhat,
	Balsundar.P, joseph.szczypek, jeff, POSWALD, john.p.donnelly,
	mwilck, linux-kernel, hch, martin.petersen, jejb, linux-scsi

Dear Don,


Am 05.10.21 um 22:23 schrieb Don.Brace@microchip.com:

>> Our controller FW lists both LUNs in the RPL results.
> 
> Please document the firmware version (and controller) you tested with in the commit message.
> 
> DON: Done in V3, thanks for your review.

When I understood Martin correctly, he already pulled the patches in. 
It’d be great if you added it in an answer then.

> Shortly describing the implementation (new struct member ignore_device) would be nice.
> DON: Don in V3, thanks for your review.
> 
>>        u8      rescan : 1;
>> +     u8      ignore_device : 1;
> 
> Why not type bool?
> Don: They both take the same amount of memory and since the other members are also u8, the new member was also u8 for consistency.

Well, the below struct members are declared as bool.

         u8      volume_offline : 1;
         u8      rescan : 1;
         bool    aio_enabled;            /* only valid for physical disks */

It’d be great, if you could clean that up in the future.

>> -                     device->lun = sdev->lun;
>> -                     device->target_lun_valid = true;
> 
> Off topic, with `u8 target_lun_valid : 1`, why is `true` used.
> Don: Has the same behavior, and carried forward from other member fields.

In my opinion, if bool is used, true/false should be used too.

>> +                     if (device->target_lun_valid) {
>> +                             device->ignore_device = true;
>> +                     } else {
>> +                             device->target = sdev_id(sdev);
>> +                             device->lun = sdev->lun;
>> +                             device->target_lun_valid = true;
>> +                     }
> 
> If the LUN should be ignored, is it actually valid? Why not extend target_lun_valid and add a third option (enums?) to ignore it?
> 
> Don: The reason is that it takes advantage of the order the devices are added and how slave_alloc and slave_configure fit into this order.

Ok. My answer should have also been to use a bitfield. Sorry about that. 
It does not look nice to me to add new attributes to work around 
firmware isuses.

>> +     return device->devtype == TYPE_TAPE || device->devtype ==
>> +TYPE_MEDIUM_CHANGER;
> 
> Why also check for TYPE_TAPE? The function name should be updated then?
> Don: Because our tape changer consisted of the changer and one or more tape units and both were duplicated.

Yes, I figured. But the function name is still incorrect/misleading then?

>>    static int pqi_slave_configure(struct scsi_device *sdev)
>> +     if (pqi_is_tape_changer_device(device) && device->ignore_device) {
>> +             rc = -ENXIO;
>> +             device->ignore_device = false;
> 
> I’d add a `return -ENXIO` here, and remove the variable.
> Don: This works in conjunction with slave_alloc and is needed.

Instead of

+	if (pqi_is_tape_changer_device(device) && device->ignore_device) {
+		rc = -ENXIO;
+		device->ignore_device = false;
+	}
+
+	return rc;

I meant

+	if (pqi_is_tape_changer_device(device) && device->ignore_device) {
+		device->ignore_device = false;
+		return -ENXIO;
+	}
+
+	return 0;

Lastly, some (debug) log messages would always be helpful in my opinion, 
if stuff is worked around.


Kind regards,

Paul

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [smartpqi updates PATCH V2 00/11] smartpqi updates
  2021-09-28 23:54 [smartpqi updates PATCH V2 00/11] smartpqi updates Don Brace
                   ` (11 preceding siblings ...)
  2021-09-29  9:34 ` [smartpqi updates PATCH V2 00/11] smartpqi updates Paul Menzel
@ 2021-10-12 20:35 ` Martin K. Petersen
  12 siblings, 0 replies; 33+ messages in thread
From: Martin K. Petersen @ 2021-10-12 20:35 UTC (permalink / raw)
  To: jejb, Don Brace, hch, linux-scsi
  Cc: Martin K . Petersen, gerry.morong, linux-kernel, balsundar.p,
	mwilck, Kevin.Barnett, POSWALD, mike.mcgowen, jeff, scott.teel,
	murthy.bhat, joseph.szczypek, scott.benesh, mahesh.rajashekhara,
	Justin.Lindley, pmenzel, john.p.donnelly

On Tue, 28 Sep 2021 18:54:31 -0500, Don Brace wrote:

> These patches are based on Martin Petersen's 5.16/scsi-queue tree
>   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git
>   5.16/scsi-queue
> 
> This set of changes consist of:
>   * Aligning device removal with our out of box driver.
>   * Aligning kdump timing with controller memory dump.
>     The OS was rebooting before the controller was finished dumping its own
>     memory. Now the driver will wait for the controller to indicate that its
>     dump has completed.
>   * In rare cases where the controller stops responding to the driver, the
>     driver can set reason codes to aid in debugging.
>   * Enhance device reset operations. The driver was not obtaining the current
>     number of outstanding commands during the check for outstanding command
>     completions. This was causing reset hangs.
>   * Add in a check for HBA devices undergoing sanitize. This was causing long
>     boot up delays while the OS waited for sanitize to complete. The fix is to
>     check for sanitize and keep the HBA disk offline. Note that the SSA spec
>     states that the disk must be manually re-enabled after sanitize has
>     completed. The link to the spec is noted in the patch.
>   * When the OS off-lines a disk, the SCSI command pointers are cleaned up.
>     The driver was attempting to return some outstanding commands that were
>     no longer valid.
>   * Add in more enhanced report physical luns (RPL) command. This is an
>     internal command that yields more complete WWID information.
>   * Correct a rare case where a poll for a register status before the
>     register has been updated.
>   * When multi-LUN tape devices are added to the OS, the OS does its own
>     report LUNs and the tape devices were duplicated. A simple fix was to update
>     slave_alloc/slave_configure to prevent this.
>   * Add in some new PCI devices.
>   * Bump the driver version.
> 
> [...]

Applied to 5.16/scsi-queue, thanks!

[01/11] smartpqi: update device removal management
        https://git.kernel.org/mkp/scsi/c/819225b03dc7
[02/11] smartpqi: add controller handshake during kdump
        https://git.kernel.org/mkp/scsi/c/9ee5d6e9ac52
[03/11] smartpqi: capture controller reason codes
        https://git.kernel.org/mkp/scsi/c/5d1f03e6f49a
[04/11] smartpqi: update LUN reset handler
        https://git.kernel.org/mkp/scsi/c/6ce1ddf53252
[05/11] smartpqi: add tur check for sanitize operation
        https://git.kernel.org/mkp/scsi/c/be76f90668d8
[06/11] smartpqi: avoid failing ios for offline devices
        https://git.kernel.org/mkp/scsi/c/4f3cefc3084d
[07/11] smartpqi: add extended report physical luns
        https://git.kernel.org/mkp/scsi/c/28ca6d876c5a
[08/11] smartpqi: fix boot failure during lun rebuild
        https://git.kernel.org/mkp/scsi/c/987d35605b7e
[09/11] smartpqi: fix duplicate device nodes for tape changers
        https://git.kernel.org/mkp/scsi/c/d4dc6aea93cb
[10/11] smartpqi: add 3252-8i pci id
        https://git.kernel.org/mkp/scsi/c/80982656b78e
[11/11] smartpqi: update version to 2.1.12-055
        https://git.kernel.org/mkp/scsi/c/605ae389ea02

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2021-10-12 20:36 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-28 23:54 [smartpqi updates PATCH V2 00/11] smartpqi updates Don Brace
2021-09-28 23:54 ` [smartpqi updates PATCH V2 01/11] smartpqi: update device removal management Don Brace
2021-09-30 18:21   ` john.p.donnelly
2021-09-28 23:54 ` [smartpqi updates PATCH V2 02/11] smartpqi: add controller handshake during kdump Don Brace
2021-09-30 18:21   ` john.p.donnelly
2021-09-28 23:54 ` [smartpqi updates PATCH V2 03/11] smartpqi: capture controller reason codes Don Brace
2021-09-30 18:22   ` john.p.donnelly
2021-09-28 23:54 ` [smartpqi updates PATCH V2 04/11] smartpqi: update LUN reset handler Don Brace
2021-09-30 18:22   ` john.p.donnelly
2021-09-28 23:54 ` [smartpqi updates PATCH V2 05/11] smartpqi: add tur check for sanitize operation Don Brace
2021-09-29  7:56   ` Paul Menzel
2021-09-30 18:23   ` john.p.donnelly
2021-09-28 23:54 ` [smartpqi updates PATCH V2 06/11] smartpqi: avoid failing ios for offline devices Don Brace
2021-09-30 18:23   ` john.p.donnelly
2021-09-28 23:54 ` [smartpqi updates PATCH V2 07/11] smartpqi: add extended report physical luns Don Brace
2021-09-30 18:23   ` john.p.donnelly
2021-09-28 23:54 ` [smartpqi updates PATCH V2 08/11] smartpqi: fix boot failure during lun rebuild Don Brace
2021-09-30 18:24   ` john.p.donnelly
2021-09-28 23:54 ` [smartpqi updates PATCH V2 09/11] smartpqi: fix duplicate device nodes for tape changers Don Brace
2021-09-30 18:24   ` john.p.donnelly
2021-10-01  8:26   ` Paul Menzel
2021-10-05 20:23     ` Don.Brace
2021-10-06  2:37       ` Martin K. Petersen
2021-10-06 14:28         ` Don.Brace
2021-10-07  9:38       ` Paul Menzel
2021-09-28 23:54 ` [smartpqi updates PATCH V2 10/11] smartpqi: add 3252-8i pci id Don Brace
2021-09-30 18:24   ` john.p.donnelly
2021-09-28 23:54 ` [smartpqi updates PATCH V2 11/11] smartpqi: update version to 2.1.12-055 Don Brace
2021-09-30 18:25   ` john.p.donnelly
2021-09-29  9:34 ` [smartpqi updates PATCH V2 00/11] smartpqi updates Paul Menzel
2021-09-29 14:08   ` Don.Brace
2021-09-29 14:12     ` Paul Menzel
2021-10-12 20:35 ` Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).