* [PATCH V3 00/25] smartpqi updates
@ 2020-12-10 20:34 Don Brace
  2020-12-10 20:34 ` [PATCH V3 01/25] smartpqi: add support for product id Don Brace
                   ` (25 more replies)
  0 siblings, 26 replies; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:34 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

These patches are based on Martin Petersen's 5.11/scsi-queue tree.

Note that these patches depend on the following three patches
applied to Martin Petersen's tree:
  https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git
  5.11/scsi-queue
Depends-on: 5443bdc4cc77 scsi: smartpqi: Update version to 1.2.16-012
Depends-on: 408bdd7e5845 scsi: smartpqi: Correct pqi_sas_smp_handler busy condition
Depends-on: 1bdf6e934387 scsi: smartpqi: Correct driver removal with HBA disks

This set of changes consists of:
  * Add support for newer controller hardware.
    * Refactor AIO and s/g processing code. (No functional changes.)
    * Add write support for the RAID 5/6/1 RAID bypass path (the accelerated I/O path).
    * Add a check for sequential streaming.
    * Add new PCI IDs.
  * Format changes to re-align with our in-house driver. (No functional changes.)
  * Correct some issues relating to suspend/hibernation/OFA/shutdown.
    * Block I/O requests during these conditions.
  * Add a qdepth limit check to cap outstanding commands at the maximum
    supported by the controller.
  * Correct some minor issues found during regression testing.
  * Update the driver version.

Changes since V1:
  * Re-added 32-bit calculations to patch smartpqi-refactor-aio-submission-code
    to fix i386 compile issues
    Reported-by: kernel test robot <lkp@intel.com>
    https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org/thread/VMBBGGGE5446SVEOQBRCKBTRRWTSH4AB/

Changes since V2:
  * Added 32-bit division to patch smartpqi-add-support-for-raid5-and-raid6-writes
    to fix i386 compile issues
    Reported-by: kernel test robot <lkp@intel.com>
    https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org/thread/ZCXJJDGPPTTXLZCSCGWEY6VXPRB3IFOQ/

---

Don Brace (7):
      smartpqi: refactor aio submission code
      smartpqi: refactor build sg list code
      smartpqi: add support for raid5 and raid6 writes
      smartpqi: add support for raid1 writes
      smartpqi: add stream detection
      smartpqi: add host level stream detection enable
      smartpqi: update version to 2.1.6-005

Kevin Barnett (14):
      smartpqi: add support for product id
      smartpqi: add support for BMIC sense feature cmd and feature bits
      smartpqi: update AIO Sub Page 0x02 support
      smartpqi: add support for long firmware version
      smartpqi: align code with oob driver
      smartpqi: enable support for NVMe encryption
      smartpqi: disable write_same for nvme hba disks
      smartpqi: fix driver synchronization issues
      smartpqi: convert snprintf to scnprintf
      smartpqi: change timing of release of QRM memory during OFA
      smartpqi: return busy indication for IOCTLs when ofa is active
      smartpqi: add additional logging for LUN resets
      smartpqi: correct system hangs when resuming from hibernation
      smartpqi: add new pci ids

Mahesh Rajashekhara (1):
      smartpqi: fix host qdepth limit

Murthy Bhat (3):
      smartpqi: add phy id support for the physical drives
      smartpqi: update sas initiator_port_protocols and target_port_protocols
      smartpqi: update enclosure identifier in sysfs


 drivers/scsi/smartpqi/smartpqi.h              |  301 +-
 drivers/scsi/smartpqi/smartpqi_init.c         | 3123 ++++++++++-------
 .../scsi/smartpqi/smartpqi_sas_transport.c    |   39 +-
 drivers/scsi/smartpqi/smartpqi_sis.c          |    4 +-
 4 files changed, 2189 insertions(+), 1278 deletions(-)

--
Signature


* [PATCH V3 01/25] smartpqi: add support for product id
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
@ 2020-12-10 20:34 ` Don Brace
  2021-01-07 16:43   ` Martin Wilck
  2020-12-10 20:34 ` [PATCH V3 02/25] smartpqi: refactor aio submission code Don Brace
                   ` (24 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:34 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Kevin Barnett <kevin.barnett@microchip.com>
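
Read the SIS product identifier register during initialization and cache
the controller's product ID and revision. As a minimal illustrative
sketch (not part of the patch) of how the 32-bit register value read in
the diff below is decomposed:

	/*
	 * Illustrative sketch only: the 32-bit sis_product_identifier
	 * register (offset B4h) carries the product ID in its low byte
	 * and the product revision in the next byte, so a value of
	 * 0x0107 decodes to product ID 7 (PQI_CTRL_PRODUCT_ID_GEN2)
	 * and revision 1 (PQI_CTRL_PRODUCT_REVISION_B).
	 */
	u32 product_id = sis_get_product_id(ctrl_info);

	ctrl_info->product_id = (u8)product_id;
	ctrl_info->product_revision = (u8)(product_id >> 8);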

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h      |   11 ++++++++++-
 drivers/scsi/smartpqi/smartpqi_init.c |   11 +++++++++--
 drivers/scsi/smartpqi/smartpqi_sis.c  |    5 +++++
 drivers/scsi/smartpqi/smartpqi_sis.h  |    1 +
 4 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index 3e54590e6e92..7d3f956e949f 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -79,7 +79,8 @@ struct pqi_ctrl_registers {
 	__le32	sis_ctrl_to_host_doorbell_clear;	/* A0h */
 	u8	reserved4[0xb0 - (0xa0 + sizeof(__le32))];
 	__le32	sis_driver_scratch;			/* B0h */
-	u8	reserved5[0xbc - (0xb0 + sizeof(__le32))];
+	__le32  sis_product_identifier;			/* B4h */
+	u8	reserved5[0xbc - (0xb4 + sizeof(__le32))];
 	__le32	sis_firmware_status;			/* BCh */
 	u8	reserved6[0x1000 - (0xbc + sizeof(__le32))];
 	__le32	sis_mailbox[8];				/* 1000h */
@@ -585,6 +586,7 @@ struct pqi_raid_error_info {
 /* these values are defined by the PQI spec */
 #define PQI_MAX_NUM_ELEMENTS_ADMIN_QUEUE	255
 #define PQI_MAX_NUM_ELEMENTS_OPERATIONAL_QUEUE	65535
+
 #define PQI_QUEUE_ELEMENT_ARRAY_ALIGNMENT	64
 #define PQI_QUEUE_ELEMENT_LENGTH_ALIGNMENT	16
 #define PQI_ADMIN_INDEX_ALIGNMENT		64
@@ -1082,6 +1084,11 @@ struct pqi_event {
 	(PQI_RESERVED_IO_SLOTS_LUN_RESET + PQI_RESERVED_IO_SLOTS_EVENT_ACK + \
 	PQI_RESERVED_IO_SLOTS_SYNCHRONOUS_REQUESTS)
 
+#define PQI_CTRL_PRODUCT_ID_GEN1	0
+#define PQI_CTRL_PRODUCT_ID_GEN2	7
+#define PQI_CTRL_PRODUCT_REVISION_A	0
+#define PQI_CTRL_PRODUCT_REVISION_B	1
+
 struct pqi_ctrl_info {
 	unsigned int	ctrl_id;
 	struct pci_dev	*pci_dev;
@@ -1089,6 +1096,8 @@ struct pqi_ctrl_info {
 	char		serial_number[17];
 	char		model[17];
 	char		vendor[9];
+	u8		product_id;
+	u8		product_revision;
 	void __iomem	*iomem_base;
 	struct pqi_ctrl_registers __iomem *registers;
 	struct pqi_device_registers __iomem *pqi_registers;
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index c53f456fbd09..68fc4327944e 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -6259,8 +6259,8 @@ static DEVICE_ATTR(model, 0444, pqi_model_show, NULL);
 static DEVICE_ATTR(serial_number, 0444, pqi_serial_number_show, NULL);
 static DEVICE_ATTR(vendor, 0444, pqi_vendor_show, NULL);
 static DEVICE_ATTR(rescan, 0200, NULL, pqi_host_rescan_store);
-static DEVICE_ATTR(lockup_action, 0644,
-	pqi_lockup_action_show, pqi_lockup_action_store);
+static DEVICE_ATTR(lockup_action, 0644, pqi_lockup_action_show,
+	pqi_lockup_action_store);
 
 static struct device_attribute *pqi_shost_attrs[] = {
 	&dev_attr_driver_version,
@@ -7146,6 +7146,7 @@ static int pqi_force_sis_mode(struct pqi_ctrl_info *ctrl_info)
 static int pqi_ctrl_init(struct pqi_ctrl_info *ctrl_info)
 {
 	int rc;
+	u32 product_id;
 
 	if (reset_devices) {
 		sis_soft_reset(ctrl_info);
@@ -7182,6 +7183,10 @@ static int pqi_ctrl_init(struct pqi_ctrl_info *ctrl_info)
 		return rc;
 	}
 
+	product_id = sis_get_product_id(ctrl_info);
+	ctrl_info->product_id = (u8)product_id;
+	ctrl_info->product_revision = (u8)(product_id >> 8);
+
 	if (reset_devices) {
 		if (ctrl_info->max_outstanding_requests >
 			PQI_MAX_OUTSTANDING_REQUESTS_KDUMP)
@@ -8602,6 +8607,8 @@ static void __attribute__((unused)) verify_structures(void)
 		sis_ctrl_to_host_doorbell_clear) != 0xa0);
 	BUILD_BUG_ON(offsetof(struct pqi_ctrl_registers,
 		sis_driver_scratch) != 0xb0);
+	BUILD_BUG_ON(offsetof(struct pqi_ctrl_registers,
+		sis_product_identifier) != 0xb4);
 	BUILD_BUG_ON(offsetof(struct pqi_ctrl_registers,
 		sis_firmware_status) != 0xbc);
 	BUILD_BUG_ON(offsetof(struct pqi_ctrl_registers,
diff --git a/drivers/scsi/smartpqi/smartpqi_sis.c b/drivers/scsi/smartpqi/smartpqi_sis.c
index 26ea6b9d4199..f0199bd87dd1 100644
--- a/drivers/scsi/smartpqi/smartpqi_sis.c
+++ b/drivers/scsi/smartpqi/smartpqi_sis.c
@@ -149,6 +149,11 @@ bool sis_is_kernel_up(struct pqi_ctrl_info *ctrl_info)
 				SIS_CTRL_KERNEL_UP;
 }
 
+u32 sis_get_product_id(struct pqi_ctrl_info *ctrl_info)
+{
+	return readl(&ctrl_info->registers->sis_product_identifier);
+}
+
 /* used for passing command parameters/results when issuing SIS commands */
 struct sis_sync_cmd_params {
 	u32	mailbox[6];	/* mailboxes 0-5 */
diff --git a/drivers/scsi/smartpqi/smartpqi_sis.h b/drivers/scsi/smartpqi/smartpqi_sis.h
index 878d34ca6532..12cd2ab1aead 100644
--- a/drivers/scsi/smartpqi/smartpqi_sis.h
+++ b/drivers/scsi/smartpqi/smartpqi_sis.h
@@ -27,5 +27,6 @@ int sis_reenable_sis_mode(struct pqi_ctrl_info *ctrl_info);
 void sis_write_driver_scratch(struct pqi_ctrl_info *ctrl_info, u32 value);
 u32 sis_read_driver_scratch(struct pqi_ctrl_info *ctrl_info);
 void sis_soft_reset(struct pqi_ctrl_info *ctrl_info);
+u32 sis_get_product_id(struct pqi_ctrl_info *ctrl_info);
 
 #endif	/* _SMARTPQI_SIS_H */



* [PATCH V3 02/25] smartpqi: refactor aio submission code
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
  2020-12-10 20:34 ` [PATCH V3 01/25] smartpqi: add support for product id Don Brace
@ 2020-12-10 20:34 ` Don Brace
  2021-01-07 16:43   ` Martin Wilck
  2020-12-10 20:34 ` [PATCH V3 03/25] smartpqi: refactor build sg list code Don Brace
                   ` (23 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:34 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

* No functional changes.
  * Refactor AIO submission code:
    1. Break up the submission function into smaller functions.
    2. Add a common block of data (struct pqi_scsi_dev_raid_map_data)
       that is carried into the newly added functions.
    3. Prepare for new AIO functionality.
  (A condensed sketch of the resulting flow is shown below.)
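
A condensed sketch of the refactored submission flow, mirroring the diff
below with error handling and the encryption setup elided:

	static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
		struct pqi_scsi_dev *device, struct scsi_cmnd *scmd,
		struct pqi_queue_group *queue_group)
	{
		struct pqi_scsi_dev_raid_map_data rmd = {0};

		if (pqi_get_aio_lba_and_block_count(scmd, &rmd))
			return PQI_RAID_BYPASS_INELIGIBLE;
		rmd.raid_level = device->raid_level;
		if (!pqi_aio_raid_level_supported(&rmd))
			return PQI_RAID_BYPASS_INELIGIBLE;
		if (pci_get_aio_common_raid_map_values(ctrl_info, &rmd,
				device->raid_map))
			return PQI_RAID_BYPASS_INELIGIBLE;
		/* ... RAID 1 / ADM / RAID 5-6 map_index calculation ... */
		pqi_set_aio_cdb(&rmd);
		return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
			rmd.cdb, rmd.cdb_length, queue_group, NULL, true);
	}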

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h      |   52 +++
 drivers/scsi/smartpqi/smartpqi_init.c |  554 ++++++++++++++++++---------------
 2 files changed, 360 insertions(+), 246 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index 7d3f956e949f..d486a2ec3045 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -908,6 +908,58 @@ struct raid_map {
 
 #pragma pack()
 
+struct pqi_scsi_dev_raid_map_data {
+	bool	is_write;
+	u8	raid_level;
+	u32	map_index;
+	u64	first_block;
+	u64	last_block;
+	u32	data_length;
+	u32	block_cnt;
+	u32	blocks_per_row;
+	u64	first_row;
+	u64	last_row;
+	u32	first_row_offset;
+	u32	last_row_offset;
+	u32	first_column;
+	u32	last_column;
+	u64	r5or6_first_row;
+	u64	r5or6_last_row;
+	u32	r5or6_first_row_offset;
+	u32	r5or6_last_row_offset;
+	u32	r5or6_first_column;
+	u32	r5or6_last_column;
+	u16	data_disks_per_row;
+	u32	total_disks_per_row;
+	u16	layout_map_count;
+	u32	stripesize;
+	u16	strip_size;
+	u32	first_group;
+	u32	last_group;
+	u32	current_group;
+	u32	map_row;
+	u32	aio_handle;
+	u64	disk_block;
+	u32	disk_block_cnt;
+	u8	cdb[16];
+	u8	cdb_length;
+	int	offload_to_mirror;
+
+	/* RAID1 specific */
+#define NUM_RAID1_MAP_ENTRIES 3
+	u32	num_it_nexus_entries;
+	u32	it_nexus[NUM_RAID1_MAP_ENTRIES];
+
+	/* RAID5 RAID6 specific */
+	u32	p_parity_it_nexus; /* aio_handle */
+	u32	q_parity_it_nexus; /* aio_handle */
+	u8	xor_mult;
+	u64	row;
+	u64	stripe_lba;
+	u32	p_index;
+	u32	q_index;
+};
+
 #define RAID_CTLR_LUNID		"\0\0\0\0\0\0\0\0"
 
 struct pqi_scsi_dev {
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 68fc4327944e..2348b9f24d8c 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -2237,332 +2237,394 @@ static inline void pqi_set_encryption_info(
  * Attempt to perform RAID bypass mapping for a logical volume I/O.
  */
 
+static bool pqi_aio_raid_level_supported(struct pqi_scsi_dev_raid_map_data *rmd)
+{
+	bool is_supported = true;
+
+	switch (rmd->raid_level) {
+	case SA_RAID_0:
+		break;
+	case SA_RAID_1:
+		if (rmd->is_write)
+			is_supported = false;
+		break;
+	case SA_RAID_5:
+		fallthrough;
+	case SA_RAID_6:
+		if (rmd->is_write)
+			is_supported = false;
+		break;
+	case SA_RAID_ADM:
+		if (rmd->is_write)
+			is_supported = false;
+		break;
+	default:
+		is_supported = false;
+	}
+
+	return is_supported;
+}
+
 #define PQI_RAID_BYPASS_INELIGIBLE	1
 
-static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
-	struct pqi_scsi_dev *device, struct scsi_cmnd *scmd,
-	struct pqi_queue_group *queue_group)
+static int pqi_get_aio_lba_and_block_count(struct scsi_cmnd *scmd,
+			struct pqi_scsi_dev_raid_map_data *rmd)
 {
-	struct raid_map *raid_map;
-	bool is_write = false;
-	u32 map_index;
-	u64 first_block;
-	u64 last_block;
-	u32 block_cnt;
-	u32 blocks_per_row;
-	u64 first_row;
-	u64 last_row;
-	u32 first_row_offset;
-	u32 last_row_offset;
-	u32 first_column;
-	u32 last_column;
-	u64 r0_first_row;
-	u64 r0_last_row;
-	u32 r5or6_blocks_per_row;
-	u64 r5or6_first_row;
-	u64 r5or6_last_row;
-	u32 r5or6_first_row_offset;
-	u32 r5or6_last_row_offset;
-	u32 r5or6_first_column;
-	u32 r5or6_last_column;
-	u16 data_disks_per_row;
-	u32 total_disks_per_row;
-	u16 layout_map_count;
-	u32 stripesize;
-	u16 strip_size;
-	u32 first_group;
-	u32 last_group;
-	u32 current_group;
-	u32 map_row;
-	u32 aio_handle;
-	u64 disk_block;
-	u32 disk_block_cnt;
-	u8 cdb[16];
-	u8 cdb_length;
-	int offload_to_mirror;
-	struct pqi_encryption_info *encryption_info_ptr;
-	struct pqi_encryption_info encryption_info;
-#if BITS_PER_LONG == 32
-	u64 tmpdiv;
-#endif
-
 	/* Check for valid opcode, get LBA and block count. */
 	switch (scmd->cmnd[0]) {
 	case WRITE_6:
-		is_write = true;
+		rmd->is_write = true;
 		fallthrough;
 	case READ_6:
-		first_block = (u64)(((scmd->cmnd[1] & 0x1f) << 16) |
+		rmd->first_block = (u64)(((scmd->cmnd[1] & 0x1f) << 16) |
 			(scmd->cmnd[2] << 8) | scmd->cmnd[3]);
-		block_cnt = (u32)scmd->cmnd[4];
-		if (block_cnt == 0)
-			block_cnt = 256;
+		rmd->block_cnt = (u32)scmd->cmnd[4];
+		if (rmd->block_cnt == 0)
+			rmd->block_cnt = 256;
 		break;
 	case WRITE_10:
-		is_write = true;
+		rmd->is_write = true;
 		fallthrough;
 	case READ_10:
-		first_block = (u64)get_unaligned_be32(&scmd->cmnd[2]);
-		block_cnt = (u32)get_unaligned_be16(&scmd->cmnd[7]);
+		rmd->first_block = (u64)get_unaligned_be32(&scmd->cmnd[2]);
+		rmd->block_cnt = (u32)get_unaligned_be16(&scmd->cmnd[7]);
 		break;
 	case WRITE_12:
-		is_write = true;
+		rmd->is_write = true;
 		fallthrough;
 	case READ_12:
-		first_block = (u64)get_unaligned_be32(&scmd->cmnd[2]);
-		block_cnt = get_unaligned_be32(&scmd->cmnd[6]);
+		rmd->first_block = (u64)get_unaligned_be32(&scmd->cmnd[2]);
+		rmd->block_cnt = get_unaligned_be32(&scmd->cmnd[6]);
 		break;
 	case WRITE_16:
-		is_write = true;
+		rmd->is_write = true;
 		fallthrough;
 	case READ_16:
-		first_block = get_unaligned_be64(&scmd->cmnd[2]);
-		block_cnt = get_unaligned_be32(&scmd->cmnd[10]);
+		rmd->first_block = get_unaligned_be64(&scmd->cmnd[2]);
+		rmd->block_cnt = get_unaligned_be32(&scmd->cmnd[10]);
 		break;
 	default:
 		/* Process via normal I/O path. */
 		return PQI_RAID_BYPASS_INELIGIBLE;
 	}
 
-	/* Check for write to non-RAID-0. */
-	if (is_write && device->raid_level != SA_RAID_0)
-		return PQI_RAID_BYPASS_INELIGIBLE;
+	put_unaligned_le32(scsi_bufflen(scmd), &rmd->data_length);
 
-	if (unlikely(block_cnt == 0))
-		return PQI_RAID_BYPASS_INELIGIBLE;
+	return 0;
+}
 
-	last_block = first_block + block_cnt - 1;
-	raid_map = device->raid_map;
+static int pci_get_aio_common_raid_map_values(struct pqi_ctrl_info *ctrl_info,
+					struct pqi_scsi_dev_raid_map_data *rmd,
+					struct raid_map *raid_map)
+{
+#if BITS_PER_LONG == 32
+	u64 tmpdiv;
+#endif
+
+	rmd->last_block = rmd->first_block + rmd->block_cnt - 1;
 
 	/* Check for invalid block or wraparound. */
-	if (last_block >= get_unaligned_le64(&raid_map->volume_blk_cnt) ||
-		last_block < first_block)
+	if (rmd->last_block >=
+		get_unaligned_le64(&raid_map->volume_blk_cnt) ||
+		rmd->last_block < rmd->first_block)
 		return PQI_RAID_BYPASS_INELIGIBLE;
 
-	data_disks_per_row = get_unaligned_le16(&raid_map->data_disks_per_row);
-	strip_size = get_unaligned_le16(&raid_map->strip_size);
-	layout_map_count = get_unaligned_le16(&raid_map->layout_map_count);
+	rmd->data_disks_per_row =
+			get_unaligned_le16(&raid_map->data_disks_per_row);
+	rmd->strip_size = get_unaligned_le16(&raid_map->strip_size);
+	rmd->layout_map_count = get_unaligned_le16(&raid_map->layout_map_count);
 
 	/* Calculate stripe information for the request. */
-	blocks_per_row = data_disks_per_row * strip_size;
+	rmd->blocks_per_row = rmd->data_disks_per_row * rmd->strip_size;
 #if BITS_PER_LONG == 32
-	tmpdiv = first_block;
-	do_div(tmpdiv, blocks_per_row);
-	first_row = tmpdiv;
-	tmpdiv = last_block;
-	do_div(tmpdiv, blocks_per_row);
-	last_row = tmpdiv;
-	first_row_offset = (u32)(first_block - (first_row * blocks_per_row));
-	last_row_offset = (u32)(last_block - (last_row * blocks_per_row));
-	tmpdiv = first_row_offset;
-	do_div(tmpdiv, strip_size);
-	first_column = tmpdiv;
-	tmpdiv = last_row_offset;
-	do_div(tmpdiv, strip_size);
-	last_column = tmpdiv;
+	tmpdiv = rmd->first_block;
+	do_div(tmpdiv, rmd->blocks_per_row);
+	rmd->first_row = tmpdiv;
+	tmpdiv = rmd->last_block;
+	do_div(tmpdiv, rmd->blocks_per_row);
+	rmd->last_row = tmpdiv;
+	rmd->first_row_offset = (u32)(rmd->first_block - (rmd->first_row * rmd->blocks_per_row));
+	rmd->last_row_offset = (u32)(rmd->last_block - (rmd->last_row * rmd->blocks_per_row));
+	tmpdiv = rmd->first_row_offset;
+	do_div(tmpdiv, rmd->strip_size);
+	rmd->first_column = tmpdiv;
+	tmpdiv = rmd->last_row_offset;
+	do_div(tmpdiv, rmd->strip_size);
+	rmd->last_column = tmpdiv;
 #else
-	first_row = first_block / blocks_per_row;
-	last_row = last_block / blocks_per_row;
-	first_row_offset = (u32)(first_block - (first_row * blocks_per_row));
-	last_row_offset = (u32)(last_block - (last_row * blocks_per_row));
-	first_column = first_row_offset / strip_size;
-	last_column = last_row_offset / strip_size;
+	rmd->first_row = rmd->first_block / rmd->blocks_per_row;
+	rmd->last_row = rmd->last_block / rmd->blocks_per_row;
+	rmd->first_row_offset = (u32)(rmd->first_block -
+				(rmd->first_row * rmd->blocks_per_row));
+	rmd->last_row_offset = (u32)(rmd->last_block - (rmd->last_row *
+				rmd->blocks_per_row));
+	rmd->first_column = rmd->first_row_offset / rmd->strip_size;
+	rmd->last_column = rmd->last_row_offset / rmd->strip_size;
 #endif
 
 	/* If this isn't a single row/column then give to the controller. */
-	if (first_row != last_row || first_column != last_column)
+	if (rmd->first_row != rmd->last_row ||
+			rmd->first_column != rmd->last_column)
 		return PQI_RAID_BYPASS_INELIGIBLE;
 
 	/* Proceeding with driver mapping. */
-	total_disks_per_row = data_disks_per_row +
+	rmd->total_disks_per_row = rmd->data_disks_per_row +
 		get_unaligned_le16(&raid_map->metadata_disks_per_row);
-	map_row = ((u32)(first_row >> raid_map->parity_rotation_shift)) %
+	rmd->map_row = ((u32)(rmd->first_row >>
+		raid_map->parity_rotation_shift)) %
 		get_unaligned_le16(&raid_map->row_cnt);
-	map_index = (map_row * total_disks_per_row) + first_column;
+	rmd->map_index = (rmd->map_row * rmd->total_disks_per_row) +
+			rmd->first_column;
 
-	/* RAID 1 */
-	if (device->raid_level == SA_RAID_1) {
-		if (device->offload_to_mirror)
-			map_index += data_disks_per_row;
-		device->offload_to_mirror = !device->offload_to_mirror;
-	} else if (device->raid_level == SA_RAID_ADM) {
-		/* RAID ADM */
-		/*
-		 * Handles N-way mirrors  (R1-ADM) and R10 with # of drives
-		 * divisible by 3.
-		 */
-		offload_to_mirror = device->offload_to_mirror;
-		if (offload_to_mirror == 0)  {
-			/* use physical disk in the first mirrored group. */
-			map_index %= data_disks_per_row;
-		} else {
-			do {
-				/*
-				 * Determine mirror group that map_index
-				 * indicates.
-				 */
-				current_group = map_index / data_disks_per_row;
-
-				if (offload_to_mirror != current_group) {
-					if (current_group <
-						layout_map_count - 1) {
-						/*
-						 * Select raid index from
-						 * next group.
-						 */
-						map_index += data_disks_per_row;
-						current_group++;
-					} else {
-						/*
-						 * Select raid index from first
-						 * group.
-						 */
-						map_index %= data_disks_per_row;
-						current_group = 0;
-					}
+	return 0;
+}
+
+static int pqi_calc_aio_raid_adm(struct pqi_scsi_dev_raid_map_data *rmd,
+				struct pqi_scsi_dev *device)
+{
+	/* RAID ADM */
+	/*
+	 * Handles N-way mirrors  (R1-ADM) and R10 with # of drives
+	 * divisible by 3.
+	 */
+	rmd->offload_to_mirror = device->offload_to_mirror;
+
+	if (rmd->offload_to_mirror == 0)  {
+		/* use physical disk in the first mirrored group. */
+		rmd->map_index %= rmd->data_disks_per_row;
+	} else {
+		do {
+			/*
+			 * Determine mirror group that map_index
+			 * indicates.
+			 */
+			rmd->current_group =
+				rmd->map_index / rmd->data_disks_per_row;
+
+			if (rmd->offload_to_mirror !=
+					rmd->current_group) {
+				if (rmd->current_group <
+					rmd->layout_map_count - 1) {
+					/*
+					 * Select raid index from
+					 * next group.
+					 */
+					rmd->map_index += rmd->data_disks_per_row;
+					rmd->current_group++;
+				} else {
+					/*
+					 * Select raid index from first
+					 * group.
+					 */
+					rmd->map_index %= rmd->data_disks_per_row;
+					rmd->current_group = 0;
 				}
-			} while (offload_to_mirror != current_group);
-		}
+			}
+		} while (rmd->offload_to_mirror != rmd->current_group);
+	}
 
-		/* Set mirror group to use next time. */
-		offload_to_mirror =
-			(offload_to_mirror >= layout_map_count - 1) ?
-				0 : offload_to_mirror + 1;
-		device->offload_to_mirror = offload_to_mirror;
-		/*
-		 * Avoid direct use of device->offload_to_mirror within this
-		 * function since multiple threads might simultaneously
-		 * increment it beyond the range of device->layout_map_count -1.
-		 */
-	} else if ((device->raid_level == SA_RAID_5 ||
-		device->raid_level == SA_RAID_6) && layout_map_count > 1) {
-		/* RAID 50/60 */
-		/* Verify first and last block are in same RAID group */
-		r5or6_blocks_per_row = strip_size * data_disks_per_row;
-		stripesize = r5or6_blocks_per_row * layout_map_count;
+	/* Set mirror group to use next time. */
+	rmd->offload_to_mirror =
+		(rmd->offload_to_mirror >= rmd->layout_map_count - 1) ?
+			0 : rmd->offload_to_mirror + 1;
+	device->offload_to_mirror = rmd->offload_to_mirror;
+	/*
+	 * Avoid direct use of device->offload_to_mirror within this
+	 * function since multiple threads might simultaneously
+	 * increment it beyond the range of device->layout_map_count -1.
+	 */
+
+	return 0;
+}
+
+static int pqi_calc_aio_r5_or_r6(struct pqi_scsi_dev_raid_map_data *rmd,
+				struct raid_map *raid_map)
+{
+#if BITS_PER_LONG == 32
+	u64 tmpdiv;
+#endif
+	/* RAID 50/60 */
+	/* Verify first and last block are in same RAID group */
+	rmd->stripesize = rmd->blocks_per_row * rmd->layout_map_count;
 #if BITS_PER_LONG == 32
-		tmpdiv = first_block;
-		first_group = do_div(tmpdiv, stripesize);
-		tmpdiv = first_group;
-		do_div(tmpdiv, r5or6_blocks_per_row);
-		first_group = tmpdiv;
-		tmpdiv = last_block;
-		last_group = do_div(tmpdiv, stripesize);
-		tmpdiv = last_group;
-		do_div(tmpdiv, r5or6_blocks_per_row);
-		last_group = tmpdiv;
+	tmpdiv = rmd->first_block;
+	rmd->first_group = do_div(tmpdiv, rmd->stripesize);
+	tmpdiv = rmd->first_group;
+	do_div(tmpdiv, rmd->blocks_per_row);
+	rmd->first_group = tmpdiv;
+	tmpdiv = rmd->last_block;
+	rmd->last_group = do_div(tmpdiv, rmd->stripesize);
+	tmpdiv = rmd->last_group;
+	do_div(tmpdiv, rmd->blocks_per_row);
+	rmd->last_group = tmpdiv;
 #else
-		first_group = (first_block % stripesize) / r5or6_blocks_per_row;
-		last_group = (last_block % stripesize) / r5or6_blocks_per_row;
+	rmd->first_group = (rmd->first_block % rmd->stripesize) / rmd->blocks_per_row;
+	rmd->last_group = (rmd->last_block % rmd->stripesize) / rmd->blocks_per_row;
 #endif
-		if (first_group != last_group)
-			return PQI_RAID_BYPASS_INELIGIBLE;
+	if (rmd->first_group != rmd->last_group)
+		return PQI_RAID_BYPASS_INELIGIBLE;
 
-		/* Verify request is in a single row of RAID 5/6 */
+	/* Verify request is in a single row of RAID 5/6 */
 #if BITS_PER_LONG == 32
-		tmpdiv = first_block;
-		do_div(tmpdiv, stripesize);
-		first_row = r5or6_first_row = r0_first_row = tmpdiv;
-		tmpdiv = last_block;
-		do_div(tmpdiv, stripesize);
-		r5or6_last_row = r0_last_row = tmpdiv;
+	tmpdiv = rmd->first_block;
+	do_div(tmpdiv, rmd->stripesize);
+	rmd->first_row = tmpdiv;
+	rmd->r5or6_first_row = tmpdiv;
+	tmpdiv = rmd->last_block;
+	do_div(tmpdiv, rmd->stripesize);
+	rmd->r5or6_last_row = tmpdiv;
 #else
-		first_row = r5or6_first_row = r0_first_row =
-			first_block / stripesize;
-		r5or6_last_row = r0_last_row = last_block / stripesize;
+	rmd->first_row = rmd->r5or6_first_row =
+		rmd->first_block / rmd->stripesize;
+	rmd->r5or6_last_row = rmd->last_block / rmd->stripesize;
 #endif
-		if (r5or6_first_row != r5or6_last_row)
-			return PQI_RAID_BYPASS_INELIGIBLE;
+	if (rmd->r5or6_first_row != rmd->r5or6_last_row)
+		return PQI_RAID_BYPASS_INELIGIBLE;
 
-		/* Verify request is in a single column */
+	/* Verify request is in a single column */
 #if BITS_PER_LONG == 32
-		tmpdiv = first_block;
-		first_row_offset = do_div(tmpdiv, stripesize);
-		tmpdiv = first_row_offset;
-		first_row_offset = (u32)do_div(tmpdiv, r5or6_blocks_per_row);
-		r5or6_first_row_offset = first_row_offset;
-		tmpdiv = last_block;
-		r5or6_last_row_offset = do_div(tmpdiv, stripesize);
-		tmpdiv = r5or6_last_row_offset;
-		r5or6_last_row_offset = do_div(tmpdiv, r5or6_blocks_per_row);
-		tmpdiv = r5or6_first_row_offset;
-		do_div(tmpdiv, strip_size);
-		first_column = r5or6_first_column = tmpdiv;
-		tmpdiv = r5or6_last_row_offset;
-		do_div(tmpdiv, strip_size);
-		r5or6_last_column = tmpdiv;
+	tmpdiv = rmd->first_block;
+	rmd->first_row_offset = do_div(tmpdiv, rmd->stripesize);
+	tmpdiv = rmd->first_row_offset;
+	rmd->first_row_offset = (u32)do_div(tmpdiv, rmd->blocks_per_row);
+	rmd->r5or6_first_row_offset = rmd->first_row_offset;
+	tmpdiv = rmd->last_block;
+	rmd->r5or6_last_row_offset = do_div(tmpdiv, rmd->stripesize);
+	tmpdiv = rmd->r5or6_last_row_offset;
+	rmd->r5or6_last_row_offset = do_div(tmpdiv, rmd->blocks_per_row);
+	tmpdiv = rmd->r5or6_first_row_offset;
+	do_div(tmpdiv, rmd->strip_size);
+	rmd->first_column = rmd->r5or6_first_column = tmpdiv;
+	tmpdiv = rmd->r5or6_last_row_offset;
+	do_div(tmpdiv, rmd->strip_size);
+	rmd->r5or6_last_column = tmpdiv;
 #else
-		first_row_offset = r5or6_first_row_offset =
-			(u32)((first_block % stripesize) %
-			r5or6_blocks_per_row);
+	rmd->first_row_offset = rmd->r5or6_first_row_offset =
+		(u32)((rmd->first_block %
+				rmd->stripesize) %
+				rmd->blocks_per_row);
+
+	rmd->r5or6_last_row_offset =
+		(u32)((rmd->last_block % rmd->stripesize) %
+		rmd->blocks_per_row);
+
+	rmd->first_column =
+			rmd->r5or6_first_row_offset / rmd->strip_size;
+	rmd->r5or6_first_column = rmd->first_column;
+	rmd->r5or6_last_column = rmd->r5or6_last_row_offset / rmd->strip_size;
+#endif
+	if (rmd->r5or6_first_column != rmd->r5or6_last_column)
+		return PQI_RAID_BYPASS_INELIGIBLE;
+
+	/* Request is eligible */
+	rmd->map_row =
+		((u32)(rmd->first_row >> raid_map->parity_rotation_shift)) %
+		get_unaligned_le16(&raid_map->row_cnt);
 
-		r5or6_last_row_offset =
-			(u32)((last_block % stripesize) %
-			r5or6_blocks_per_row);
+	rmd->map_index = (rmd->first_group *
+		(get_unaligned_le16(&raid_map->row_cnt) *
+		rmd->total_disks_per_row)) +
+		(rmd->map_row * rmd->total_disks_per_row) + rmd->first_column;
 
-		first_column = r5or6_first_row_offset / strip_size;
-		r5or6_first_column = first_column;
-		r5or6_last_column = r5or6_last_row_offset / strip_size;
-#endif
-		if (r5or6_first_column != r5or6_last_column)
-			return PQI_RAID_BYPASS_INELIGIBLE;
+	return 0;
+}
+
+static void pqi_set_aio_cdb(struct pqi_scsi_dev_raid_map_data *rmd)
+{
+	/* Build the new CDB for the physical disk I/O. */
+	if (rmd->disk_block > 0xffffffff) {
+		rmd->cdb[0] = rmd->is_write ? WRITE_16 : READ_16;
+		rmd->cdb[1] = 0;
+		put_unaligned_be64(rmd->disk_block, &rmd->cdb[2]);
+		put_unaligned_be32(rmd->disk_block_cnt, &rmd->cdb[10]);
+		rmd->cdb[14] = 0;
+		rmd->cdb[15] = 0;
+		rmd->cdb_length = 16;
+	} else {
+		rmd->cdb[0] = rmd->is_write ? WRITE_10 : READ_10;
+		rmd->cdb[1] = 0;
+		put_unaligned_be32((u32)rmd->disk_block, &rmd->cdb[2]);
+		rmd->cdb[6] = 0;
+		put_unaligned_be16((u16)rmd->disk_block_cnt, &rmd->cdb[7]);
+		rmd->cdb[9] = 0;
+		rmd->cdb_length = 10;
+	}
+}
+
+static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
+	struct pqi_scsi_dev *device, struct scsi_cmnd *scmd,
+	struct pqi_queue_group *queue_group)
+{
+	struct raid_map *raid_map;
+	int rc;
+	struct pqi_encryption_info *encryption_info_ptr;
+	struct pqi_encryption_info encryption_info;
+	struct pqi_scsi_dev_raid_map_data rmd = {0};
+
+	rc = pqi_get_aio_lba_and_block_count(scmd, &rmd);
+	if (rc)
+		return PQI_RAID_BYPASS_INELIGIBLE;
+
+	rmd.raid_level = device->raid_level;
+
+	if (!pqi_aio_raid_level_supported(&rmd))
+		return PQI_RAID_BYPASS_INELIGIBLE;
+
+	if (unlikely(rmd.block_cnt == 0))
+		return PQI_RAID_BYPASS_INELIGIBLE;
+
+	raid_map = device->raid_map;
 
-		/* Request is eligible */
-		map_row =
-			((u32)(first_row >> raid_map->parity_rotation_shift)) %
-			get_unaligned_le16(&raid_map->row_cnt);
+	rc = pci_get_aio_common_raid_map_values(ctrl_info, &rmd, raid_map);
+	if (rc)
+		return PQI_RAID_BYPASS_INELIGIBLE;
 
-		map_index = (first_group *
-			(get_unaligned_le16(&raid_map->row_cnt) *
-			total_disks_per_row)) +
-			(map_row * total_disks_per_row) + first_column;
+	/* RAID 1 */
+	if (device->raid_level == SA_RAID_1) {
+		if (device->offload_to_mirror)
+			rmd.map_index += rmd.data_disks_per_row;
+		device->offload_to_mirror = !device->offload_to_mirror;
+	} else if (device->raid_level == SA_RAID_ADM) {
+		rc = pqi_calc_aio_raid_adm(&rmd, device);
+	} else if ((device->raid_level == SA_RAID_5 ||
+		device->raid_level == SA_RAID_6) && rmd.layout_map_count > 1) {
+		rc = pqi_calc_aio_r5_or_r6(&rmd, raid_map);
+		if (rc)
+			return PQI_RAID_BYPASS_INELIGIBLE;
 	}
 
-	aio_handle = raid_map->disk_data[map_index].aio_handle;
-	disk_block = get_unaligned_le64(&raid_map->disk_starting_blk) +
-		first_row * strip_size +
-		(first_row_offset - first_column * strip_size);
-	disk_block_cnt = block_cnt;
+	if (unlikely(rmd.map_index >= RAID_MAP_MAX_ENTRIES))
+		return PQI_RAID_BYPASS_INELIGIBLE;
+
+	rmd.aio_handle = raid_map->disk_data[rmd.map_index].aio_handle;
+	rmd.disk_block = get_unaligned_le64(&raid_map->disk_starting_blk) +
+		rmd.first_row * rmd.strip_size +
+		(rmd.first_row_offset - rmd.first_column * rmd.strip_size);
+	rmd.disk_block_cnt = rmd.block_cnt;
 
 	/* Handle differing logical/physical block sizes. */
 	if (raid_map->phys_blk_shift) {
-		disk_block <<= raid_map->phys_blk_shift;
-		disk_block_cnt <<= raid_map->phys_blk_shift;
+		rmd.disk_block <<= raid_map->phys_blk_shift;
+		rmd.disk_block_cnt <<= raid_map->phys_blk_shift;
 	}
 
-	if (unlikely(disk_block_cnt > 0xffff))
+	if (unlikely(rmd.disk_block_cnt > 0xffff))
 		return PQI_RAID_BYPASS_INELIGIBLE;
 
-	/* Build the new CDB for the physical disk I/O. */
-	if (disk_block > 0xffffffff) {
-		cdb[0] = is_write ? WRITE_16 : READ_16;
-		cdb[1] = 0;
-		put_unaligned_be64(disk_block, &cdb[2]);
-		put_unaligned_be32(disk_block_cnt, &cdb[10]);
-		cdb[14] = 0;
-		cdb[15] = 0;
-		cdb_length = 16;
-	} else {
-		cdb[0] = is_write ? WRITE_10 : READ_10;
-		cdb[1] = 0;
-		put_unaligned_be32((u32)disk_block, &cdb[2]);
-		cdb[6] = 0;
-		put_unaligned_be16((u16)disk_block_cnt, &cdb[7]);
-		cdb[9] = 0;
-		cdb_length = 10;
-	}
+	pqi_set_aio_cdb(&rmd);
 
 	if (get_unaligned_le16(&raid_map->flags) &
 		RAID_MAP_ENCRYPTION_ENABLED) {
 		pqi_set_encryption_info(&encryption_info, raid_map,
-			first_block);
+			rmd.first_block);
 		encryption_info_ptr = &encryption_info;
 	} else {
 		encryption_info_ptr = NULL;
 	}
 
-	return pqi_aio_submit_io(ctrl_info, scmd, aio_handle,
-		cdb, cdb_length, queue_group, encryption_info_ptr, true);
+	return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
+				rmd.cdb, rmd.cdb_length, queue_group,
+				encryption_info_ptr, true);
 }
 
 #define PQI_STATUS_IDLE		0x0



* [PATCH V3 03/25] smartpqi: refactor build sg list code
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
  2020-12-10 20:34 ` [PATCH V3 01/25] smartpqi: add support for product id Don Brace
  2020-12-10 20:34 ` [PATCH V3 02/25] smartpqi: refactor aio submission code Don Brace
@ 2020-12-10 20:34 ` Don Brace
  2021-01-07 16:43   ` Martin Wilck
  2020-12-10 20:34 ` [PATCH V3 04/25] smartpqi: add support for raid5 and raid6 writes Don Brace
                   ` (22 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:34 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

* No functional changes.
* Factor out code common to all s/g list building; the shared helper's
  chaining behavior is sketched below.
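
For reference, the shared helper reserves the final descriptor slot of
each IU for a chain marker. A minimal worked example of the chaining
behavior (max_sg_per_iu = 4 is an assumed value for illustration; the
names are from the diff below):

	/*
	 * Worked example (assumed max_sg_per_iu = 4): one slot is reserved
	 * for the chain marker, so 3 data descriptors fit in the IU and a
	 * 5-entry scatterlist is laid out as:
	 *
	 *   IU:           [sg0][sg1][sg2][CHAIN -> sg_chain_buffer]
	 *   chain buffer: [sg3][sg4 + CISS_SG_LAST]
	 *
	 * num_sg_in_iu counts the in-IU entries plus the chain marker
	 * (4 here).
	 */
	num_sg_in_iu = pqi_build_sg_list(request->sg_descriptors,
		scsi_sglist(scmd), sg_count, io_request,
		ctrl_info->max_sg_per_iu, &chained);
	request->partial = chained;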

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c |  101 ++++++++++++++-------------------
 1 file changed, 42 insertions(+), 59 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 2348b9f24d8c..6bcb037ae9d7 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -4857,16 +4857,52 @@ static inline void pqi_set_sg_descriptor(
 	put_unaligned_le32(0, &sg_descriptor->flags);
 }
 
+static unsigned int pqi_build_sg_list(struct pqi_sg_descriptor *sg_descriptor,
+	struct scatterlist *sg, int sg_count, struct pqi_io_request *io_request,
+	int max_sg_per_iu, bool *chained)
+{
+	int i;
+	unsigned int num_sg_in_iu;
+
+	*chained = false;
+	i = 0;
+	num_sg_in_iu = 0;
+	max_sg_per_iu--;	/* Subtract 1 to leave room for chain marker. */
+
+	while (1) {
+		pqi_set_sg_descriptor(sg_descriptor, sg);
+		if (!*chained)
+			num_sg_in_iu++;
+		i++;
+		if (i == sg_count)
+			break;
+		sg_descriptor++;
+		if (i == max_sg_per_iu) {
+			put_unaligned_le64((u64)io_request->sg_chain_buffer_dma_handle,
+				&sg_descriptor->address);
+			put_unaligned_le32((sg_count - num_sg_in_iu) * sizeof(*sg_descriptor),
+				&sg_descriptor->length);
+			put_unaligned_le32(CISS_SG_CHAIN, &sg_descriptor->flags);
+			*chained = true;
+			num_sg_in_iu++;
+			sg_descriptor = io_request->sg_chain_buffer;
+		}
+		sg = sg_next(sg);
+	}
+
+	put_unaligned_le32(CISS_SG_LAST, &sg_descriptor->flags);
+
+	return num_sg_in_iu;
+}
+
 static int pqi_build_raid_sg_list(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_raid_path_request *request, struct scsi_cmnd *scmd,
 	struct pqi_io_request *io_request)
 {
-	int i;
 	u16 iu_length;
 	int sg_count;
 	bool chained;
 	unsigned int num_sg_in_iu;
-	unsigned int max_sg_per_iu;
 	struct scatterlist *sg;
 	struct pqi_sg_descriptor *sg_descriptor;
 
@@ -4882,36 +4918,10 @@ static int pqi_build_raid_sg_list(struct pqi_ctrl_info *ctrl_info,
 
 	sg = scsi_sglist(scmd);
 	sg_descriptor = request->sg_descriptors;
-	max_sg_per_iu = ctrl_info->max_sg_per_iu - 1;
-	chained = false;
-	num_sg_in_iu = 0;
-	i = 0;
 
-	while (1) {
-		pqi_set_sg_descriptor(sg_descriptor, sg);
-		if (!chained)
-			num_sg_in_iu++;
-		i++;
-		if (i == sg_count)
-			break;
-		sg_descriptor++;
-		if (i == max_sg_per_iu) {
-			put_unaligned_le64(
-				(u64)io_request->sg_chain_buffer_dma_handle,
-				&sg_descriptor->address);
-			put_unaligned_le32((sg_count - num_sg_in_iu)
-				* sizeof(*sg_descriptor),
-				&sg_descriptor->length);
-			put_unaligned_le32(CISS_SG_CHAIN,
-				&sg_descriptor->flags);
-			chained = true;
-			num_sg_in_iu++;
-			sg_descriptor = io_request->sg_chain_buffer;
-		}
-		sg = sg_next(sg);
-	}
+	num_sg_in_iu = pqi_build_sg_list(sg_descriptor, sg, sg_count, io_request,
+		ctrl_info->max_sg_per_iu, &chained);
 
-	put_unaligned_le32(CISS_SG_LAST, &sg_descriptor->flags);
 	request->partial = chained;
 	iu_length += num_sg_in_iu * sizeof(*sg_descriptor);
 
@@ -4925,12 +4935,10 @@ static int pqi_build_aio_sg_list(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_aio_path_request *request, struct scsi_cmnd *scmd,
 	struct pqi_io_request *io_request)
 {
-	int i;
 	u16 iu_length;
 	int sg_count;
 	bool chained;
 	unsigned int num_sg_in_iu;
-	unsigned int max_sg_per_iu;
 	struct scatterlist *sg;
 	struct pqi_sg_descriptor *sg_descriptor;
 
@@ -4947,35 +4955,10 @@ static int pqi_build_aio_sg_list(struct pqi_ctrl_info *ctrl_info,
 
 	sg = scsi_sglist(scmd);
 	sg_descriptor = request->sg_descriptors;
-	max_sg_per_iu = ctrl_info->max_sg_per_iu - 1;
-	chained = false;
-	i = 0;
 
-	while (1) {
-		pqi_set_sg_descriptor(sg_descriptor, sg);
-		if (!chained)
-			num_sg_in_iu++;
-		i++;
-		if (i == sg_count)
-			break;
-		sg_descriptor++;
-		if (i == max_sg_per_iu) {
-			put_unaligned_le64(
-				(u64)io_request->sg_chain_buffer_dma_handle,
-				&sg_descriptor->address);
-			put_unaligned_le32((sg_count - num_sg_in_iu)
-				* sizeof(*sg_descriptor),
-				&sg_descriptor->length);
-			put_unaligned_le32(CISS_SG_CHAIN,
-				&sg_descriptor->flags);
-			chained = true;
-			num_sg_in_iu++;
-			sg_descriptor = io_request->sg_chain_buffer;
-		}
-		sg = sg_next(sg);
-	}
+	num_sg_in_iu = pqi_build_sg_list(sg_descriptor, sg, sg_count, io_request,
+		ctrl_info->max_sg_per_iu, &chained);
 
-	put_unaligned_le32(CISS_SG_LAST, &sg_descriptor->flags);
 	request->partial = chained;
 	iu_length += num_sg_in_iu * sizeof(*sg_descriptor);
 



* [PATCH V3 04/25] smartpqi: add support for raid5 and raid6 writes
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (2 preceding siblings ...)
  2020-12-10 20:34 ` [PATCH V3 03/25] smartpqi: refactor build sg list code Don Brace
@ 2020-12-10 20:34 ` Don Brace
  2021-01-07 16:44   ` Martin Wilck
  2020-12-10 20:34 ` [PATCH V3 05/25] smartpqi: add support for raid1 writes Don Brace
                   ` (21 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:34 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

* Add a new IU definition for RAID 5/6 write bypass.
* Add support for RAID 5 and RAID 6 writes (see the sketch below).
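
RAID 5/6 write bypass is opt-in: two new host sysfs attributes gate it,
and the bypass eligibility check consults them. The excerpt below is
taken from the diff; the echo command in the comment is a usage
illustration (hostN is a placeholder):

	/*
	 * From the diff below: bypass of RAID 5/6 writes is gated on the
	 * new per-host knobs (off by default).  Enable at runtime with,
	 * e.g. (hostN is a placeholder for the SCSI host number):
	 *   echo 1 > /sys/class/scsi_host/hostN/enable_r5_writes
	 */
	case SA_RAID_5:
		if (rmd->is_write && !ctrl_info->enable_r5_writes)
			is_supported = false;
		break;
	case SA_RAID_6:
		if (rmd->is_write && !ctrl_info->enable_r6_writes)
			is_supported = false;
		break;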

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h      |   39 +++++
 drivers/scsi/smartpqi/smartpqi_init.c |  247 ++++++++++++++++++++++++++++++++-
 2 files changed, 278 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index d486a2ec3045..e9844210c4a0 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -257,6 +257,7 @@ struct pqi_device_capability {
 };
 
 #define PQI_MAX_EMBEDDED_SG_DESCRIPTORS		4
+#define PQI_MAX_EMBEDDED_R56_SG_DESCRIPTORS	3
 
 struct pqi_raid_path_request {
 	struct pqi_iu_header header;
@@ -312,6 +313,39 @@ struct pqi_aio_path_request {
 		sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
 };
 
+#define PQI_RAID56_XFER_LIMIT_4K	0x1000 /* 4 KiB */
+#define PQI_RAID56_XFER_LIMIT_8K	0x2000 /* 8 KiB */
+struct pqi_aio_r56_path_request {
+	struct pqi_iu_header header;
+	__le16	request_id;
+	__le16	volume_id;		/* ID of the RAID volume */
+	__le32	data_it_nexus;		/* IT nexus for the data drive */
+	__le32	p_parity_it_nexus;	/* IT nexus for the P parity drive */
+	__le32	q_parity_it_nexus;	/* IT nexus for the Q parity drive */
+	__le32	data_length;		/* total bytes to read/write */
+	u8	data_direction : 2;
+	u8	partial : 1;
+	u8	mem_type : 1;		/* 0b: PCIe, 1b: DDR */
+	u8	fence : 1;
+	u8	encryption_enable : 1;
+	u8	reserved : 2;
+	u8	task_attribute : 3;
+	u8	command_priority : 4;
+	u8	reserved1 : 1;
+	__le16	data_encryption_key_index;
+	u8	cdb[16];
+	__le16	error_index;
+	u8	num_sg_descriptors;
+	u8	cdb_length;
+	u8	xor_multiplier;
+	u8	reserved2[3];
+	__le32	encrypt_tweak_lower;
+	__le32	encrypt_tweak_upper;
+	u8	row;			/* row = logical lba/blocks per row */
+	u8	reserved3[8];
+	struct pqi_sg_descriptor sg_descriptors[PQI_MAX_EMBEDDED_R56_SG_DESCRIPTORS];
+};
+
 struct pqi_io_response {
 	struct pqi_iu_header header;
 	__le16	request_id;
@@ -484,6 +518,8 @@ struct pqi_raid_error_info {
 #define PQI_REQUEST_IU_TASK_MANAGEMENT			0x13
 #define PQI_REQUEST_IU_RAID_PATH_IO			0x14
 #define PQI_REQUEST_IU_AIO_PATH_IO			0x15
+#define PQI_REQUEST_IU_AIO_PATH_RAID5_IO		0x18
+#define PQI_REQUEST_IU_AIO_PATH_RAID6_IO		0x19
 #define PQI_REQUEST_IU_GENERAL_ADMIN			0x60
 #define PQI_REQUEST_IU_REPORT_VENDOR_EVENT_CONFIG	0x72
 #define PQI_REQUEST_IU_SET_VENDOR_EVENT_CONFIG		0x73
@@ -1179,6 +1215,7 @@ struct pqi_ctrl_info {
 	u16		max_inbound_iu_length_per_firmware;
 	u16		max_inbound_iu_length;
 	unsigned int	max_sg_per_iu;
+	unsigned int	max_sg_per_r56_iu;
 	void		*admin_queue_memory_base;
 	u32		admin_queue_memory_length;
 	dma_addr_t	admin_queue_memory_base_dma_handle;
@@ -1210,6 +1247,8 @@ struct pqi_ctrl_info {
 	u8		soft_reset_handshake_supported : 1;
 	u8		raid_iu_timeout_supported: 1;
 	u8		tmf_iu_timeout_supported: 1;
+	u8		enable_r5_writes : 1;
+	u8		enable_r6_writes : 1;
 
 	struct list_head scsi_device_list;
 	spinlock_t	scsi_device_list_lock;
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 6bcb037ae9d7..c813cec10003 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -67,6 +67,10 @@ static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,
 	struct scsi_cmnd *scmd, u32 aio_handle, u8 *cdb,
 	unsigned int cdb_length, struct pqi_queue_group *queue_group,
 	struct pqi_encryption_info *encryption_info, bool raid_bypass);
+static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info *ctrl_info,
+	struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
+	struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
+	struct pqi_scsi_dev_raid_map_data *rmd);
 static void pqi_ofa_ctrl_quiesce(struct pqi_ctrl_info *ctrl_info);
 static void pqi_ofa_ctrl_unquiesce(struct pqi_ctrl_info *ctrl_info);
 static int pqi_ofa_ctrl_restart(struct pqi_ctrl_info *ctrl_info);
@@ -2237,7 +2241,8 @@ static inline void pqi_set_encryption_info(
  * Attempt to perform RAID bypass mapping for a logical volume I/O.
  */
 
-static bool pqi_aio_raid_level_supported(struct pqi_scsi_dev_raid_map_data *rmd)
+static bool pqi_aio_raid_level_supported(struct pqi_ctrl_info *ctrl_info,
+	struct pqi_scsi_dev_raid_map_data *rmd)
 {
 	bool is_supported = true;
 
@@ -2245,13 +2250,14 @@ static bool pqi_aio_raid_level_supported(struct pqi_scsi_dev_raid_map_data *rmd)
 	case SA_RAID_0:
 		break;
 	case SA_RAID_1:
-		if (rmd->is_write)
-			is_supported = false;
+		is_supported = false;
 		break;
 	case SA_RAID_5:
-		fallthrough;
+		if (rmd->is_write && !ctrl_info->enable_r5_writes)
+			is_supported = false;
+		break;
 	case SA_RAID_6:
-		if (rmd->is_write)
+		if (rmd->is_write && !ctrl_info->enable_r6_writes)
 			is_supported = false;
 		break;
 	case SA_RAID_ADM:
@@ -2526,6 +2532,26 @@ static int pqi_calc_aio_r5_or_r6(struct pqi_scsi_dev_raid_map_data *rmd,
 		rmd->total_disks_per_row)) +
 		(rmd->map_row * rmd->total_disks_per_row) + rmd->first_column;
 
+	if (rmd->is_write) {
+		rmd->p_index = (rmd->map_row * rmd->total_disks_per_row) + rmd->data_disks_per_row;
+		rmd->p_parity_it_nexus = raid_map->disk_data[rmd->p_index].aio_handle;
+		if (rmd->raid_level == SA_RAID_6) {
+			rmd->q_index = (rmd->map_row * rmd->total_disks_per_row) +
+				(rmd->data_disks_per_row + 1);
+			rmd->q_parity_it_nexus = raid_map->disk_data[rmd->q_index].aio_handle;
+			rmd->xor_mult = raid_map->disk_data[rmd->map_index].xor_mult[1];
+		}
+		if (rmd->blocks_per_row == 0)
+			return PQI_RAID_BYPASS_INELIGIBLE;
+#if BITS_PER_LONG == 32
+		tmpdiv = rmd->first_block;
+		do_div(tmpdiv, rmd->blocks_per_row);
+		rmd->row = tmpdiv;
+#else
+		rmd->row = rmd->first_block / rmd->blocks_per_row;
+#endif
+	}
+
 	return 0;
 }
 
@@ -2567,7 +2593,7 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
 
 	rmd.raid_level = device->raid_level;
 
-	if (!pqi_aio_raid_level_supported(&rmd))
+	if (!pqi_aio_raid_level_supported(ctrl_info, &rmd))
 		return PQI_RAID_BYPASS_INELIGIBLE;
 
 	if (unlikely(rmd.block_cnt == 0))
@@ -2587,7 +2613,8 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
 	} else if (device->raid_level == SA_RAID_ADM) {
 		rc = pqi_calc_aio_raid_adm(&rmd, device);
 	} else if ((device->raid_level == SA_RAID_5 ||
-		device->raid_level == SA_RAID_6) && rmd.layout_map_count > 1) {
+		device->raid_level == SA_RAID_6) &&
+		(rmd.layout_map_count > 1 || rmd.is_write)) {
 		rc = pqi_calc_aio_r5_or_r6(&rmd, raid_map);
 		if (rc)
 			return PQI_RAID_BYPASS_INELIGIBLE;
@@ -2622,9 +2649,27 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
 		encryption_info_ptr = NULL;
 	}
 
-	return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
+	if (rmd.is_write) {
+		switch (device->raid_level) {
+		case SA_RAID_0:
+			return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
 				rmd.cdb, rmd.cdb_length, queue_group,
 				encryption_info_ptr, true);
+		case SA_RAID_5:
+		case SA_RAID_6:
+			return pqi_aio_submit_r56_write_io(ctrl_info, scmd, queue_group,
+					encryption_info_ptr, device, &rmd);
+		default:
+			return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
+				rmd.cdb, rmd.cdb_length, queue_group,
+				encryption_info_ptr, true);
+		}
+	} else {
+		return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
+			rmd.cdb, rmd.cdb_length, queue_group,
+			encryption_info_ptr, true);
+	}
+
 }
 
 #define PQI_STATUS_IDLE		0x0
@@ -4844,6 +4889,12 @@ static void pqi_calculate_queue_resources(struct pqi_ctrl_info *ctrl_info)
 		PQI_OPERATIONAL_IQ_ELEMENT_LENGTH) /
 		sizeof(struct pqi_sg_descriptor)) +
 		PQI_MAX_EMBEDDED_SG_DESCRIPTORS;
+
+	ctrl_info->max_sg_per_r56_iu =
+		((ctrl_info->max_inbound_iu_length -
+		PQI_OPERATIONAL_IQ_ELEMENT_LENGTH) /
+		sizeof(struct pqi_sg_descriptor)) +
+		PQI_MAX_EMBEDDED_R56_SG_DESCRIPTORS;
 }
 
 static inline void pqi_set_sg_descriptor(
@@ -4931,6 +4982,44 @@ static int pqi_build_raid_sg_list(struct pqi_ctrl_info *ctrl_info,
 	return 0;
 }
 
+static int pqi_build_aio_r56_sg_list(struct pqi_ctrl_info *ctrl_info,
+	struct pqi_aio_r56_path_request *request, struct scsi_cmnd *scmd,
+	struct pqi_io_request *io_request)
+{
+	u16 iu_length;
+	int sg_count;
+	bool chained;
+	unsigned int num_sg_in_iu;
+	struct scatterlist *sg;
+	struct pqi_sg_descriptor *sg_descriptor;
+
+	sg_count = scsi_dma_map(scmd);
+	if (sg_count < 0)
+		return sg_count;
+
+	iu_length = offsetof(struct pqi_aio_r56_path_request, sg_descriptors) -
+		PQI_REQUEST_HEADER_LENGTH;
+	num_sg_in_iu = 0;
+
+	if (sg_count == 0)
+		goto out;
+
+	sg = scsi_sglist(scmd);
+	sg_descriptor = request->sg_descriptors;
+
+	num_sg_in_iu = pqi_build_sg_list(sg_descriptor, sg, sg_count, io_request,
+		ctrl_info->max_sg_per_r56_iu, &chained);
+
+	request->partial = chained;
+	iu_length += num_sg_in_iu * sizeof(*sg_descriptor);
+
+out:
+	put_unaligned_le16(iu_length, &request->header.iu_length);
+	request->num_sg_descriptors = num_sg_in_iu;
+
+	return 0;
+}
+
 static int pqi_build_aio_sg_list(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_aio_path_request *request, struct scsi_cmnd *scmd,
 	struct pqi_io_request *io_request)
@@ -5335,6 +5424,88 @@ static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,
 	return 0;
 }
 
+static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info *ctrl_info,
+	struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
+	struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
+	struct pqi_scsi_dev_raid_map_data *rmd)
+{
+	int rc;
+	struct pqi_io_request *io_request;
+	struct pqi_aio_r56_path_request *r56_request;
+
+	io_request = pqi_alloc_io_request(ctrl_info);
+	io_request->io_complete_callback = pqi_aio_io_complete;
+	io_request->scmd = scmd;
+	io_request->raid_bypass = true;
+
+	r56_request = io_request->iu;
+	memset(r56_request, 0, offsetof(struct pqi_aio_r56_path_request, sg_descriptors));
+
+	if (device->raid_level == SA_RAID_5 || device->raid_level == SA_RAID_51)
+		r56_request->header.iu_type = PQI_REQUEST_IU_AIO_PATH_RAID5_IO;
+	else
+		r56_request->header.iu_type = PQI_REQUEST_IU_AIO_PATH_RAID6_IO;
+
+	put_unaligned_le16(*(u16 *)device->scsi3addr & 0x3fff, &r56_request->volume_id);
+	put_unaligned_le32(rmd->aio_handle, &r56_request->data_it_nexus);
+	put_unaligned_le32(rmd->p_parity_it_nexus, &r56_request->p_parity_it_nexus);
+	if (rmd->raid_level == SA_RAID_6) {
+		put_unaligned_le32(rmd->q_parity_it_nexus, &r56_request->q_parity_it_nexus);
+		r56_request->xor_multiplier = rmd->xor_mult;
+	}
+	put_unaligned_le32(scsi_bufflen(scmd), &r56_request->data_length);
+	r56_request->task_attribute = SOP_TASK_ATTRIBUTE_SIMPLE;
+	put_unaligned_le64(rmd->row, &r56_request->row);
+
+	put_unaligned_le16(io_request->index, &r56_request->request_id);
+	r56_request->error_index = r56_request->request_id;
+
+	if (rmd->cdb_length > sizeof(r56_request->cdb))
+		rmd->cdb_length = sizeof(r56_request->cdb);
+	r56_request->cdb_length = rmd->cdb_length;
+	memcpy(r56_request->cdb, rmd->cdb, rmd->cdb_length);
+
+	switch (scmd->sc_data_direction) {
+	case DMA_TO_DEVICE:
+		r56_request->data_direction = SOP_READ_FLAG;
+		break;
+	case DMA_FROM_DEVICE:
+		r56_request->data_direction = SOP_WRITE_FLAG;
+		break;
+	case DMA_NONE:
+		r56_request->data_direction = SOP_NO_DIRECTION_FLAG;
+		break;
+	case DMA_BIDIRECTIONAL:
+		r56_request->data_direction = SOP_BIDIRECTIONAL;
+		break;
+	default:
+		dev_err(&ctrl_info->pci_dev->dev,
+			"unknown data direction: %d\n",
+			scmd->sc_data_direction);
+		break;
+	}
+
+	if (encryption_info) {
+		r56_request->encryption_enable = true;
+		put_unaligned_le16(encryption_info->data_encryption_key_index,
+				&r56_request->data_encryption_key_index);
+		put_unaligned_le32(encryption_info->encrypt_tweak_lower,
+				&r56_request->encrypt_tweak_lower);
+		put_unaligned_le32(encryption_info->encrypt_tweak_upper,
+				&r56_request->encrypt_tweak_upper);
+	}
+
+	rc = pqi_build_aio_r56_sg_list(ctrl_info, r56_request, scmd, io_request);
+	if (rc) {
+		pqi_free_io_request(io_request);
+		return SCSI_MLQUEUE_HOST_BUSY;
+	}
+
+	pqi_start_io(ctrl_info, queue_group, AIO_PATH, io_request);
+
+	return 0;
+}
+
 static inline u16 pqi_get_hw_queue(struct pqi_ctrl_info *ctrl_info,
 	struct scsi_cmnd *scmd)
 {
@@ -6298,6 +6469,60 @@ static ssize_t pqi_lockup_action_store(struct device *dev,
 	return -EINVAL;
 }
 
+static ssize_t pqi_host_enable_r5_writes_show(struct device *dev,
+	struct device_attribute *attr, char *buffer)
+{
+	struct Scsi_Host *shost = class_to_shost(dev);
+	struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
+
+	return scnprintf(buffer, 10, "%hhx\n", ctrl_info->enable_r5_writes);
+}
+
+static ssize_t pqi_host_enable_r5_writes_store(struct device *dev,
+	struct device_attribute *attr, const char *buffer, size_t count)
+{
+	struct Scsi_Host *shost = class_to_shost(dev);
+	struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
+	u8 set_r5_writes = 0;
+
+	if (kstrtou8(buffer, 0, &set_r5_writes))
+		return -EINVAL;
+
+	if (set_r5_writes > 0)
+		set_r5_writes = 1;
+
+	ctrl_info->enable_r5_writes = set_r5_writes;
+
+	return count;
+}
+
+static ssize_t pqi_host_enable_r6_writes_show(struct device *dev,
+	struct device_attribute *attr, char *buffer)
+{
+	struct Scsi_Host *shost = class_to_shost(dev);
+	struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
+
+	return scnprintf(buffer, 10, "%hhx\n", ctrl_info->enable_r6_writes);
+}
+
+static ssize_t pqi_host_enable_r6_writes_store(struct device *dev,
+	struct device_attribute *attr, const char *buffer, size_t count)
+{
+	struct Scsi_Host *shost = class_to_shost(dev);
+	struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
+	u8 set_r6_writes = 0;
+
+	if (kstrtou8(buffer, 0, &set_r6_writes))
+		return -EINVAL;
+
+	if (set_r6_writes > 0)
+		set_r6_writes = 1;
+
+	ctrl_info->enable_r6_writes = set_r6_writes;
+
+	return count;
+}
+
 static DEVICE_ATTR(driver_version, 0444, pqi_driver_version_show, NULL);
 static DEVICE_ATTR(firmware_version, 0444, pqi_firmware_version_show, NULL);
 static DEVICE_ATTR(model, 0444, pqi_model_show, NULL);
@@ -6306,6 +6531,10 @@ static DEVICE_ATTR(vendor, 0444, pqi_vendor_show, NULL);
 static DEVICE_ATTR(rescan, 0200, NULL, pqi_host_rescan_store);
 static DEVICE_ATTR(lockup_action, 0644, pqi_lockup_action_show,
 	pqi_lockup_action_store);
+static DEVICE_ATTR(enable_r5_writes, 0644,
+	pqi_host_enable_r5_writes_show, pqi_host_enable_r5_writes_store);
+static DEVICE_ATTR(enable_r6_writes, 0644,
+	pqi_host_enable_r6_writes_show, pqi_host_enable_r6_writes_store);
 
 static struct device_attribute *pqi_shost_attrs[] = {
 	&dev_attr_driver_version,
@@ -6315,6 +6544,8 @@ static struct device_attribute *pqi_shost_attrs[] = {
 	&dev_attr_vendor,
 	&dev_attr_rescan,
 	&dev_attr_lockup_action,
+	&dev_attr_enable_r5_writes,
+	&dev_attr_enable_r6_writes,
 	NULL
 };
 



* [PATCH V3 05/25] smartpqi: add support for raid1 writes
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (3 preceding siblings ...)
  2020-12-10 20:34 ` [PATCH V3 04/25] smartpqi: add support for raid5 and raid6 writes Don Brace
@ 2020-12-10 20:34 ` Don Brace
  2021-01-07 16:44   ` Martin Wilck
  2020-12-10 20:34 ` [PATCH V3 06/25] smartpqi: add support for BMIC sense feature cmd and feature bits Don Brace
                   ` (20 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:34 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

* Add a RAID 1 write IU.
* Add RAID 1 write support (see the sketch below).
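
A hedged sketch of the mirror-group rotation implied by the new
per-device next_bypass_group field (the exact selection code is in the
diff; the loop shape here is an assumption for illustration):

	/*
	 * Hedged sketch, not verbatim from the patch: reads rotate across
	 * mirror groups via the new device->next_bypass_group counter,
	 * which replaces the old offload_to_mirror toggle.
	 */
	u32 group = device->next_bypass_group;

	rmd->map_index += group * rmd->data_disks_per_row; /* drive in this group */
	if (++group >= rmd->layout_map_count)
		group = 0;
	device->next_bypass_group = group; /* next read uses the next group */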

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h      |   37 +++++
 drivers/scsi/smartpqi/smartpqi_init.c |  235 +++++++++++++++++++++++----------
 2 files changed, 196 insertions(+), 76 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index e9844210c4a0..225ec6843c68 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -313,6 +313,36 @@ struct pqi_aio_path_request {
 		sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
 };
 
+#define PQI_RAID1_NVME_XFER_LIMIT	(32 * 1024)	/* 32 KiB */
+struct pqi_aio_r1_path_request {
+	struct pqi_iu_header header;
+	__le16	request_id;
+	__le16	volume_id;	/* ID of the RAID volume */
+	__le32	it_nexus_1;	/* IT nexus of the 1st drive in the RAID volume */
+	__le32	it_nexus_2;	/* IT nexus of the 2nd drive in the RAID volume */
+	__le32	it_nexus_3;	/* IT nexus of the 3rd drive in the RAID volume */
+	__le32	data_length;	/* total bytes to read/write */
+	u8	data_direction : 2;
+	u8	partial : 1;
+	u8	memory_type : 1;
+	u8	fence : 1;
+	u8	encryption_enable : 1;
+	u8	reserved : 2;
+	u8	task_attribute : 3;
+	u8	command_priority : 4;
+	u8	reserved2 : 1;
+	__le16	data_encryption_key_index;
+	u8	cdb[16];
+	__le16	error_index;
+	u8	num_sg_descriptors;
+	u8	cdb_length;
+	u8	num_drives;	/* number of drives in the RAID volume (2 or 3) */
+	u8	reserved3[3];
+	__le32	encrypt_tweak_lower;
+	__le32	encrypt_tweak_upper;
+	struct pqi_sg_descriptor sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
+};
+
 #define PQI_RAID56_XFER_LIMIT_4K	0x1000 /* 4Kib */
 #define PQI_RAID56_XFER_LIMIT_8K	0x2000 /* 8Kib */
 struct pqi_aio_r56_path_request {
@@ -520,6 +550,7 @@ struct pqi_raid_error_info {
 #define PQI_REQUEST_IU_AIO_PATH_IO			0x15
 #define PQI_REQUEST_IU_AIO_PATH_RAID5_IO		0x18
 #define PQI_REQUEST_IU_AIO_PATH_RAID6_IO		0x19
+#define PQI_REQUEST_IU_AIO_PATH_RAID1_IO		0x1A
 #define PQI_REQUEST_IU_GENERAL_ADMIN			0x60
 #define PQI_REQUEST_IU_REPORT_VENDOR_EVENT_CONFIG	0x72
 #define PQI_REQUEST_IU_SET_VENDOR_EVENT_CONFIG		0x73
@@ -972,14 +1003,12 @@ struct pqi_scsi_dev_raid_map_data {
 	u16	strip_size;
 	u32	first_group;
 	u32	last_group;
-	u32	current_group;
 	u32	map_row;
 	u32	aio_handle;
 	u64	disk_block;
 	u32	disk_block_cnt;
 	u8	cdb[16];
 	u8	cdb_length;
-	int	offload_to_mirror;
 
 	/* RAID1 specific */
 #define NUM_RAID1_MAP_ENTRIES 3
@@ -1040,8 +1069,7 @@ struct pqi_scsi_dev {
 	u16	phys_connector[8];
 	bool	raid_bypass_configured;	/* RAID bypass configured */
 	bool	raid_bypass_enabled;	/* RAID bypass enabled */
-	int	offload_to_mirror;	/* Send next RAID bypass request */
-					/* to mirror drive. */
+	u32	next_bypass_group;
 	struct raid_map *raid_map;	/* RAID bypass map */
 
 	struct pqi_sas_port *sas_port;
@@ -1247,6 +1275,7 @@ struct pqi_ctrl_info {
 	u8		soft_reset_handshake_supported : 1;
 	u8		raid_iu_timeout_supported: 1;
 	u8		tmf_iu_timeout_supported: 1;
+	u8		enable_r1_writes : 1;
 	u8		enable_r5_writes : 1;
 	u8		enable_r6_writes : 1;
 
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index c813cec10003..8da9031c9c0b 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -67,6 +67,10 @@ static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,
 	struct scsi_cmnd *scmd, u32 aio_handle, u8 *cdb,
 	unsigned int cdb_length, struct pqi_queue_group *queue_group,
 	struct pqi_encryption_info *encryption_info, bool raid_bypass);
+static int pqi_aio_submit_r1_write_io(struct pqi_ctrl_info *ctrl_info,
+	struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
+	struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
+	struct pqi_scsi_dev_raid_map_data *rmd);
 static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info *ctrl_info,
 	struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
 	struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
@@ -1717,7 +1721,7 @@ static void pqi_scsi_update_device(struct pqi_scsi_dev *existing_device,
 		sizeof(existing_device->box));
 	memcpy(existing_device->phys_connector, new_device->phys_connector,
 		sizeof(existing_device->phys_connector));
-	existing_device->offload_to_mirror = 0;
+	existing_device->next_bypass_group = 0;
 	kfree(existing_device->raid_map);
 	existing_device->raid_map = new_device->raid_map;
 	existing_device->raid_bypass_configured =
@@ -2250,7 +2254,10 @@ static bool pqi_aio_raid_level_supported(struct pqi_ctrl_info *ctrl_info,
 	case SA_RAID_0:
 		break;
 	case SA_RAID_1:
-		is_supported = false;
+		fallthrough;
+	case SA_RAID_ADM:
+		if (rmd->is_write && !ctrl_info->enable_r1_writes)
+			is_supported = false;
 		break;
 	case SA_RAID_5:
 		if (rmd->is_write && !ctrl_info->enable_r5_writes)
@@ -2260,10 +2267,6 @@ static bool pqi_aio_raid_level_supported(struct pqi_ctrl_info *ctrl_info,
 		if (rmd->is_write && !ctrl_info->enable_r6_writes)
 			is_supported = false;
 		break;
-	case SA_RAID_ADM:
-		if (rmd->is_write)
-			is_supported = false;
-		break;
 	default:
 		is_supported = false;
 	}
@@ -2385,64 +2388,6 @@ static int pci_get_aio_common_raid_map_values(struct pqi_ctrl_info *ctrl_info,
 	return 0;
 }
 
-static int pqi_calc_aio_raid_adm(struct pqi_scsi_dev_raid_map_data *rmd,
-				struct pqi_scsi_dev *device)
-{
-	/* RAID ADM */
-	/*
-	 * Handles N-way mirrors  (R1-ADM) and R10 with # of drives
-	 * divisible by 3.
-	 */
-	rmd->offload_to_mirror = device->offload_to_mirror;
-
-	if (rmd->offload_to_mirror == 0)  {
-		/* use physical disk in the first mirrored group. */
-		rmd->map_index %= rmd->data_disks_per_row;
-	} else {
-		do {
-			/*
-			 * Determine mirror group that map_index
-			 * indicates.
-			 */
-			rmd->current_group =
-				rmd->map_index / rmd->data_disks_per_row;
-
-			if (rmd->offload_to_mirror !=
-					rmd->current_group) {
-				if (rmd->current_group <
-					rmd->layout_map_count - 1) {
-					/*
-					 * Select raid index from
-					 * next group.
-					 */
-					rmd->map_index += rmd->data_disks_per_row;
-					rmd->current_group++;
-				} else {
-					/*
-					 * Select raid index from first
-					 * group.
-					 */
-					rmd->map_index %= rmd->data_disks_per_row;
-					rmd->current_group = 0;
-				}
-			}
-		} while (rmd->offload_to_mirror != rmd->current_group);
-	}
-
-	/* Set mirror group to use next time. */
-	rmd->offload_to_mirror =
-		(rmd->offload_to_mirror >= rmd->layout_map_count - 1) ?
-			0 : rmd->offload_to_mirror + 1;
-	device->offload_to_mirror = rmd->offload_to_mirror;
-	/*
-	 * Avoid direct use of device->offload_to_mirror within this
-	 * function since multiple threads might simultaneously
-	 * increment it beyond the range of device->layout_map_count -1.
-	 */
-
-	return 0;
-}
-
 static int pqi_calc_aio_r5_or_r6(struct pqi_scsi_dev_raid_map_data *rmd,
 				struct raid_map *raid_map)
 {
@@ -2577,12 +2522,34 @@ static void pqi_set_aio_cdb(struct pqi_scsi_dev_raid_map_data *rmd)
 	}
 }
 
+static void pqi_calc_aio_r1_nexus(struct raid_map *raid_map,
+				struct pqi_scsi_dev_raid_map_data *rmd)
+{
+	u32 index;
+	u32 group;
+
+	group = rmd->map_index / rmd->data_disks_per_row;
+
+	index = rmd->map_index - (group * rmd->data_disks_per_row);
+	rmd->it_nexus[0] = raid_map->disk_data[index].aio_handle;
+	index += rmd->data_disks_per_row;
+	rmd->it_nexus[1] = raid_map->disk_data[index].aio_handle;
+	if (rmd->layout_map_count > 2) {
+		index += rmd->data_disks_per_row;
+		rmd->it_nexus[2] = raid_map->disk_data[index].aio_handle;
+	}
+
+	rmd->num_it_nexus_entries = rmd->layout_map_count;
+}
+
 static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_scsi_dev *device, struct scsi_cmnd *scmd,
 	struct pqi_queue_group *queue_group)
 {
-	struct raid_map *raid_map;
 	int rc;
+	struct raid_map *raid_map;
+	u32 group;
+	u32 next_bypass_group;
 	struct pqi_encryption_info *encryption_info_ptr;
 	struct pqi_encryption_info encryption_info;
 	struct pqi_scsi_dev_raid_map_data rmd = {0};
@@ -2605,13 +2572,18 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
 	if (rc)
 		return PQI_RAID_BYPASS_INELIGIBLE;
 
-	/* RAID 1 */
-	if (device->raid_level == SA_RAID_1) {
-		if (device->offload_to_mirror)
-			rmd.map_index += rmd.data_disks_per_row;
-		device->offload_to_mirror = !device->offload_to_mirror;
-	} else if (device->raid_level == SA_RAID_ADM) {
-		rc = pqi_calc_aio_raid_adm(&rmd, device);
+	if (device->raid_level == SA_RAID_1 ||
+		device->raid_level == SA_RAID_ADM) {
+		if (rmd.is_write) {
+			pqi_calc_aio_r1_nexus(raid_map, &rmd);
+		} else {
+			group = device->next_bypass_group;
+			next_bypass_group = group + 1;
+			if (next_bypass_group >= rmd.layout_map_count)
+				next_bypass_group = 0;
+			device->next_bypass_group = next_bypass_group;
+			rmd.map_index += group * rmd.data_disks_per_row;
+		}
 	} else if ((device->raid_level == SA_RAID_5 ||
 		device->raid_level == SA_RAID_6) &&
 		(rmd.layout_map_count > 1 || rmd.is_write)) {
@@ -2655,6 +2627,10 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
 			return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
 				rmd.cdb, rmd.cdb_length, queue_group,
 				encryption_info_ptr, true);
+		case SA_RAID_1:
+		case SA_RAID_ADM:
+			return pqi_aio_submit_r1_write_io(ctrl_info, scmd, queue_group,
+				encryption_info_ptr, device, &rmd);
 		case SA_RAID_5:
 		case SA_RAID_6:
 			return pqi_aio_submit_r56_write_io(ctrl_info, scmd, queue_group,
@@ -4982,6 +4958,44 @@ static int pqi_build_raid_sg_list(struct pqi_ctrl_info *ctrl_info,
 	return 0;
 }
 
+static int pqi_build_aio_r1_sg_list(struct pqi_ctrl_info *ctrl_info,
+	struct pqi_aio_r1_path_request *request, struct scsi_cmnd *scmd,
+	struct pqi_io_request *io_request)
+{
+	u16 iu_length;
+	int sg_count;
+	bool chained;
+	unsigned int num_sg_in_iu;
+	struct scatterlist *sg;
+	struct pqi_sg_descriptor *sg_descriptor;
+
+	sg_count = scsi_dma_map(scmd);
+	if (sg_count < 0)
+		return sg_count;
+
+	iu_length = offsetof(struct pqi_aio_r1_path_request, sg_descriptors) -
+		PQI_REQUEST_HEADER_LENGTH;
+	num_sg_in_iu = 0;
+
+	if (sg_count == 0)
+		goto out;
+
+	sg = scsi_sglist(scmd);
+	sg_descriptor = request->sg_descriptors;
+
+	num_sg_in_iu = pqi_build_sg_list(sg_descriptor, sg, sg_count, io_request,
+		ctrl_info->max_sg_per_iu, &chained);
+
+	request->partial = chained;
+	iu_length += num_sg_in_iu * sizeof(*sg_descriptor);
+
+out:
+	put_unaligned_le16(iu_length, &request->header.iu_length);
+	request->num_sg_descriptors = num_sg_in_iu;
+
+	return 0;
+}
+
 static int pqi_build_aio_r56_sg_list(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_aio_r56_path_request *request, struct scsi_cmnd *scmd,
 	struct pqi_io_request *io_request)
@@ -5424,6 +5438,83 @@ static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,
 	return 0;
 }
 
+static int pqi_aio_submit_r1_write_io(struct pqi_ctrl_info *ctrl_info,
+	struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
+	struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
+	struct pqi_scsi_dev_raid_map_data *rmd)
+{
+	int rc;
+	struct pqi_io_request *io_request;
+	struct pqi_aio_r1_path_request *r1_request;
+
+	io_request = pqi_alloc_io_request(ctrl_info);
+	io_request->io_complete_callback = pqi_aio_io_complete;
+	io_request->scmd = scmd;
+	io_request->raid_bypass = true;
+
+	r1_request = io_request->iu;
+	memset(r1_request, 0, offsetof(struct pqi_aio_r1_path_request, sg_descriptors));
+
+	r1_request->header.iu_type = PQI_REQUEST_IU_AIO_PATH_RAID1_IO;
+
+	put_unaligned_le16(*(u16 *)device->scsi3addr & 0x3fff, &r1_request->volume_id);
+	r1_request->num_drives = rmd->num_it_nexus_entries;
+	put_unaligned_le32(rmd->it_nexus[0], &r1_request->it_nexus_1);
+	put_unaligned_le32(rmd->it_nexus[1], &r1_request->it_nexus_2);
+	if (rmd->num_it_nexus_entries == 3)
+		put_unaligned_le32(rmd->it_nexus[2], &r1_request->it_nexus_3);
+
+	put_unaligned_le32(scsi_bufflen(scmd), &r1_request->data_length);
+	r1_request->task_attribute = SOP_TASK_ATTRIBUTE_SIMPLE;
+	put_unaligned_le16(io_request->index, &r1_request->request_id);
+	r1_request->error_index = r1_request->request_id;
+	if (rmd->cdb_length > sizeof(r1_request->cdb))
+		rmd->cdb_length = sizeof(r1_request->cdb);
+	r1_request->cdb_length = rmd->cdb_length;
+	memcpy(r1_request->cdb, rmd->cdb, rmd->cdb_length);
+
+	switch (scmd->sc_data_direction) {
+	case DMA_TO_DEVICE:
+		r1_request->data_direction = SOP_READ_FLAG;
+		break;
+	case DMA_FROM_DEVICE:
+		r1_request->data_direction = SOP_WRITE_FLAG;
+		break;
+	case DMA_NONE:
+		r1_request->data_direction = SOP_NO_DIRECTION_FLAG;
+		break;
+	case DMA_BIDIRECTIONAL:
+		r1_request->data_direction = SOP_BIDIRECTIONAL;
+		break;
+	default:
+		dev_err(&ctrl_info->pci_dev->dev,
+			"unknown data direction: %d\n",
+			scmd->sc_data_direction);
+		break;
+	}
+
+	if (encryption_info) {
+		r1_request->encryption_enable = true;
+		put_unaligned_le16(encryption_info->data_encryption_key_index,
+				&r1_request->data_encryption_key_index);
+		put_unaligned_le32(encryption_info->encrypt_tweak_lower,
+				&r1_request->encrypt_tweak_lower);
+		put_unaligned_le32(encryption_info->encrypt_tweak_upper,
+				&r1_request->encrypt_tweak_upper);
+	}
+
+	rc = pqi_build_aio_r1_sg_list(ctrl_info, r1_request, scmd, io_request);
+	if (rc) {
+		pqi_free_io_request(io_request);
+		return SCSI_MLQUEUE_HOST_BUSY;
+	}
+
+	pqi_start_io(ctrl_info, queue_group, AIO_PATH, io_request);
+
+	return 0;
+}
+
 static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info *ctrl_info,
 	struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
 	struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,


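One detail of pqi_aio_submit_r1_write_io() above that is easy to
misread: DMA_TO_DEVICE maps to SOP_READ_FLAG and DMA_FROM_DEVICE to
SOP_WRITE_FLAG. The likely reading (an inference, not stated in the
patch) is that the SOP flags are expressed from the controller's
perspective: to service a host write, the controller reads the host
buffer. A stand-alone sketch of the mapping:

#include <stdio.h>

enum dma_dir { DMA_TO_DEVICE, DMA_FROM_DEVICE, DMA_NONE, DMA_BIDIRECTIONAL };

static const char *sop_flag(enum dma_dir dir)
{
	switch (dir) {
	case DMA_TO_DEVICE:	/* host write: controller reads host memory */
		return "SOP_READ_FLAG";
	case DMA_FROM_DEVICE:	/* host read: controller writes host memory */
		return "SOP_WRITE_FLAG";
	case DMA_NONE:
		return "SOP_NO_DIRECTION_FLAG";
	default:
		return "SOP_BIDIRECTIONAL";
	}
}

int main(void)
{
	printf("DMA_TO_DEVICE   -> %s\n", sop_flag(DMA_TO_DEVICE));
	printf("DMA_FROM_DEVICE -> %s\n", sop_flag(DMA_FROM_DEVICE));
	return 0;
}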

* [PATCH V3 06/25] smartpqi: add support for BMIC sense feature cmd and feature bits
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (4 preceding siblings ...)
  2020-12-10 20:34 ` [PATCH V3 05/25] smartpqi: add support for raid1 writes Don Brace
@ 2020-12-10 20:34 ` Don Brace
  2021-01-07 16:44   ` Martin Wilck
  2020-12-10 20:35 ` [PATCH V3 07/25] smartpqi: update AIO Sub Page 0x02 support Don Brace
                   ` (19 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:34 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Kevin Barnett <kevin.barnett@microchip.com>

* Determine which firmware features are supported via the
  BMIC sense feature command instead of the config table
  (see the validation sketch below).

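A stand-alone sketch of the response validation this patch performs:
the page and subpage codes must be echoed back in both the buffer
header and the page header, and the reported lengths must cover every
field the driver consumes. The structure below mirrors
bmic_sense_feature_buffer_header; native little-endian byte order is
assumed for brevity, and the sample length is illustrative (the driver
derives its minimums with offsetofend()):

#include <stdint.h>
#include <stdio.h>

#define IO_PAGE		0x08
#define AIO_SUBPAGE	0x02

struct feature_hdr {
	uint8_t  page_code;
	uint8_t  subpage_code;
	uint16_t buffer_length;	/* __le16 on the wire */
};

static int response_valid(const struct feature_hdr *h, uint16_t min_len)
{
	return h->page_code == IO_PAGE &&
		h->subpage_code == AIO_SUBPAGE &&
		h->buffer_length >= min_len;
}

int main(void)
{
	struct feature_hdr h = { IO_PAGE, AIO_SUBPAGE, 18 };

	printf("response %s\n", response_valid(&h, 18) ? "valid" : "rejected");
	return 0;
}
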
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h      |   77 +++++++-
 drivers/scsi/smartpqi/smartpqi_init.c |  328 +++++++++++++++++++++++++++++----
 2 files changed, 363 insertions(+), 42 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index 225ec6843c68..31281cddadfe 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -314,6 +314,7 @@ struct pqi_aio_path_request {
 };
 
 #define PQI_RAID1_NVME_XFER_LIMIT	(32 * 1024)	/* 32 KiB */
+
 struct pqi_aio_r1_path_request {
 	struct pqi_iu_header header;
 	__le16	request_id;
@@ -343,8 +344,10 @@ struct pqi_aio_r1_path_request {
 	struct pqi_sg_descriptor sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
 };
 
-#define PQI_RAID56_XFER_LIMIT_4K	0x1000 /* 4Kib */
-#define PQI_RAID56_XFER_LIMIT_8K	0x2000 /* 8Kib */
+#define PQI_DEFAULT_MAX_WRITE_RAID_5_6			(8 * 1024U)
+#define PQI_DEFAULT_MAX_TRANSFER_ENCRYPTED_SAS_SATA	(~0U)
+#define PQI_DEFAULT_MAX_TRANSFER_ENCRYPTED_NVME		(32 * 1024U)
+
 struct pqi_aio_r56_path_request {
 	struct pqi_iu_header header;
 	__le16	request_id;
@@ -355,7 +358,7 @@ struct pqi_aio_r56_path_request {
 	__le32	data_length;		/* total bytes to read/write */
 	u8	data_direction : 2;
 	u8	partial : 1;
-	u8	mem_type : 1;		/* 0b: PCIe, 1b: DDR */
+	u8	mem_type : 1;		/* 0 = PCIe, 1 = DDR */
 	u8	fence : 1;
 	u8	encryption_enable : 1;
 	u8	reserved : 2;
@@ -371,7 +374,7 @@ struct pqi_aio_r56_path_request {
 	u8	reserved2[3];
 	__le32	encrypt_tweak_lower;
 	__le32	encrypt_tweak_upper;
-	u8	row;			/* row = logical lba/blocks per row */
+	__le64	row;			/* row = logical LBA/blocks per row */
 	u8	reserved3[8];
 	struct pqi_sg_descriptor sg_descriptors[PQI_MAX_EMBEDDED_R56_SG_DESCRIPTORS];
 };
@@ -828,13 +831,27 @@ struct pqi_config_table_firmware_features {
 	u8	features_supported[];
 /*	u8	features_requested_by_host[]; */
 /*	u8	features_enabled[]; */
+/* The 2 fields below are only valid if the MAX_KNOWN_FEATURE bit is set. */
+/*	__le16	firmware_max_known_feature; */
+/*	__le16	host_max_known_feature; */
 };
 
 #define PQI_FIRMWARE_FEATURE_OFA			0
 #define PQI_FIRMWARE_FEATURE_SMP			1
+#define PQI_FIRMWARE_FEATURE_MAX_KNOWN_FEATURE		2
+#define PQI_FIRMWARE_FEATURE_RAID_0_READ_BYPASS		3
+#define PQI_FIRMWARE_FEATURE_RAID_1_READ_BYPASS		4
+#define PQI_FIRMWARE_FEATURE_RAID_5_READ_BYPASS		5
+#define PQI_FIRMWARE_FEATURE_RAID_6_READ_BYPASS		6
+#define PQI_FIRMWARE_FEATURE_RAID_0_WRITE_BYPASS	7
+#define PQI_FIRMWARE_FEATURE_RAID_1_WRITE_BYPASS	8
+#define PQI_FIRMWARE_FEATURE_RAID_5_WRITE_BYPASS	9
+#define PQI_FIRMWARE_FEATURE_RAID_6_WRITE_BYPASS	10
 #define PQI_FIRMWARE_FEATURE_SOFT_RESET_HANDSHAKE	11
+#define PQI_FIRMWARE_FEATURE_UNIQUE_SATA_WWN		12
 #define PQI_FIRMWARE_FEATURE_RAID_IU_TIMEOUT		13
 #define PQI_FIRMWARE_FEATURE_TMF_IU_TIMEOUT		14
+#define PQI_FIRMWARE_FEATURE_MAXIMUM			14
 
 struct pqi_config_table_debug {
 	struct pqi_config_table_section_header header;
@@ -1010,12 +1027,12 @@ struct pqi_scsi_dev_raid_map_data {
 	u8	cdb[16];
 	u8	cdb_length;
 
-	/* RAID1 specific */
+	/* RAID 1 specific */
 #define NUM_RAID1_MAP_ENTRIES 3
 	u32	num_it_nexus_entries;
 	u32	it_nexus[NUM_RAID1_MAP_ENTRIES];
 
-	/* RAID5 RAID6 specific */
+	/* RAID 5 / RAID 6 specific */
 	u32	p_parity_it_nexus; /* aio_handle */
 	u32	q_parity_it_nexus; /* aio_handle */
 	u8	xor_mult;
@@ -1071,6 +1088,7 @@ struct pqi_scsi_dev {
 	bool	raid_bypass_enabled;	/* RAID bypass enabled */
 	u32	next_bypass_group;
 	struct raid_map *raid_map;	/* RAID bypass map */
+	u32	max_transfer_encrypted;
 
 	struct pqi_sas_port *sas_port;
 	struct scsi_device *sdev;
@@ -1278,6 +1296,13 @@ struct pqi_ctrl_info {
 	u8		enable_r1_writes : 1;
 	u8		enable_r5_writes : 1;
 	u8		enable_r6_writes : 1;
+	u8		lv_drive_type_mix_valid : 1;
+
+	u8		ciss_report_log_flags;
+	u32		max_transfer_encrypted_sas_sata;
+	u32		max_transfer_encrypted_nvme;
+	u32		max_write_raid_5_6;
+
 
 	struct list_head scsi_device_list;
 	spinlock_t	scsi_device_list_lock;
@@ -1338,6 +1363,7 @@ enum pqi_ctrl_mode {
 #define BMIC_IDENTIFY_PHYSICAL_DEVICE		0x15
 #define BMIC_READ				0x26
 #define BMIC_WRITE				0x27
+#define BMIC_SENSE_FEATURE			0x61
 #define BMIC_SENSE_CONTROLLER_PARAMETERS	0x64
 #define BMIC_SENSE_SUBSYSTEM_INFORMATION	0x66
 #define BMIC_CSMI_PASSTHRU			0x68
@@ -1357,6 +1383,19 @@ enum pqi_ctrl_mode {
 	(((CISS_GET_LEVEL_2_BUS((lunid)) - 1) << 8) + \
 	CISS_GET_LEVEL_2_TARGET((lunid)))
 
+#define LV_GET_DRIVE_TYPE_MIX(lunid)		((lunid)[6])
+
+#define LV_DRIVE_TYPE_MIX_UNKNOWN		0
+#define LV_DRIVE_TYPE_MIX_NO_RESTRICTION	1
+#define LV_DRIVE_TYPE_MIX_SAS_HDD_ONLY		2
+#define LV_DRIVE_TYPE_MIX_SATA_HDD_ONLY		3
+#define LV_DRIVE_TYPE_MIX_SAS_OR_SATA_SSD_ONLY	4
+#define LV_DRIVE_TYPE_MIX_SAS_SSD_ONLY		5
+#define LV_DRIVE_TYPE_MIX_SATA_SSD_ONLY		6
+#define LV_DRIVE_TYPE_MIX_SAS_ONLY		7
+#define LV_DRIVE_TYPE_MIX_SATA_ONLY		8
+#define LV_DRIVE_TYPE_MIX_NVME_ONLY		9
+
 #define NO_TIMEOUT		((unsigned long) -1)
 
 #pragma pack(1)
@@ -1470,6 +1509,32 @@ struct bmic_identify_physical_device {
 	u8	padding_to_multiple_of_512[9];
 };
 
+#define BMIC_SENSE_FEATURE_IO_PAGE		0x8
+#define BMIC_SENSE_FEATURE_IO_PAGE_AIO_SUBPAGE	0x2
+
+struct bmic_sense_feature_buffer_header {
+	u8	page_code;
+	u8	subpage_code;
+	__le16	buffer_length;
+};
+
+struct bmic_sense_feature_page_header {
+	u8	page_code;
+	u8	subpage_code;
+	__le16	page_length;
+};
+
+struct bmic_sense_feature_io_page_aio_subpage {
+	struct bmic_sense_feature_page_header header;
+	u8	firmware_read_support;
+	u8	driver_read_support;
+	u8	firmware_write_support;
+	u8	driver_write_support;
+	__le16	max_transfer_encrypted_sas_sata;
+	__le16	max_transfer_encrypted_nvme;
+	__le16	max_write_raid_5_6;
+};
+
 struct bmic_smp_request {
 	u8	frame_type;
 	u8	function;
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 8da9031c9c0b..aa21c1cd2cac 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -506,7 +506,7 @@ static int pqi_build_raid_path_request(struct pqi_ctrl_info *ctrl_info,
 		if (cmd == CISS_REPORT_PHYS)
 			cdb[1] = CISS_REPORT_PHYS_FLAG_OTHER;
 		else
-			cdb[1] = CISS_REPORT_LOG_FLAG_UNIQUE_LUN_ID;
+			cdb[1] = ctrl_info->ciss_report_log_flags;
 		put_unaligned_be32(cdb_length, &cdb[6]);
 		break;
 	case CISS_GET_RAID_MAP:
@@ -527,6 +527,7 @@ static int pqi_build_raid_path_request(struct pqi_ctrl_info *ctrl_info,
 	case BMIC_IDENTIFY_CONTROLLER:
 	case BMIC_IDENTIFY_PHYSICAL_DEVICE:
 	case BMIC_SENSE_SUBSYSTEM_INFORMATION:
+	case BMIC_SENSE_FEATURE:
 		request->data_direction = SOP_READ_FLAG;
 		cdb[0] = BMIC_READ;
 		cdb[6] = cmd;
@@ -695,6 +696,97 @@ static int pqi_identify_physical_device(struct pqi_ctrl_info *ctrl_info,
 	return rc;
 }
 
+#pragma pack(1)
+
+struct bmic_sense_feature_buffer {
+	struct bmic_sense_feature_buffer_header header;
+	struct bmic_sense_feature_io_page_aio_subpage aio_subpage;
+};
+
+#pragma pack()
+
+#define MINIMUM_AIO_SUBPAGE_BUFFER_LENGTH	\
+	offsetofend(struct bmic_sense_feature_buffer, \
+		aio_subpage.max_write_raid_5_6)
+
+#define MINIMUM_AIO_SUBPAGE_LENGTH	\
+	(offsetofend(struct bmic_sense_feature_io_page_aio_subpage, \
+		max_write_raid_5_6) - \
+		sizeof_field(struct bmic_sense_feature_io_page_aio_subpage, header))
+
+static int pqi_get_advanced_raid_bypass_config(struct pqi_ctrl_info *ctrl_info)
+{
+	int rc;
+	enum dma_data_direction dir;
+	struct pqi_raid_path_request request;
+	struct bmic_sense_feature_buffer *buffer;
+
+	buffer = kmalloc(sizeof(*buffer), GFP_KERNEL);
+	if (!buffer)
+		return -ENOMEM;
+
+	rc = pqi_build_raid_path_request(ctrl_info, &request,
+		BMIC_SENSE_FEATURE, RAID_CTLR_LUNID, buffer,
+		sizeof(*buffer), 0, &dir);
+	if (rc)
+		goto error;
+
+	request.cdb[2] = BMIC_SENSE_FEATURE_IO_PAGE;
+	request.cdb[3] = BMIC_SENSE_FEATURE_IO_PAGE_AIO_SUBPAGE;
+
+	rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header,
+		0, NULL, NO_TIMEOUT);
+
+	pqi_pci_unmap(ctrl_info->pci_dev, request.sg_descriptors, 1, dir);
+
+	if (rc)
+		goto error;
+
+	if (buffer->header.page_code != BMIC_SENSE_FEATURE_IO_PAGE ||
+		buffer->header.subpage_code !=
+			BMIC_SENSE_FEATURE_IO_PAGE_AIO_SUBPAGE ||
+		get_unaligned_le16(&buffer->header.buffer_length) <
+			MINIMUM_AIO_SUBPAGE_BUFFER_LENGTH ||
+		buffer->aio_subpage.header.page_code !=
+			BMIC_SENSE_FEATURE_IO_PAGE ||
+		buffer->aio_subpage.header.subpage_code !=
+			BMIC_SENSE_FEATURE_IO_PAGE_AIO_SUBPAGE ||
+		get_unaligned_le16(&buffer->aio_subpage.header.page_length) <
+			MINIMUM_AIO_SUBPAGE_LENGTH) {
+		rc = -EINVAL;
+		goto error;
+	}
+
+	ctrl_info->max_transfer_encrypted_sas_sata =
+		get_unaligned_le16(
+			&buffer->aio_subpage.max_transfer_encrypted_sas_sata);
+	if (ctrl_info->max_transfer_encrypted_sas_sata)
+		ctrl_info->max_transfer_encrypted_sas_sata *= 1024;
+	else
+		ctrl_info->max_transfer_encrypted_sas_sata = ~0;
+
+	ctrl_info->max_transfer_encrypted_nvme =
+		get_unaligned_le16(
+			&buffer->aio_subpage.max_transfer_encrypted_nvme);
+	if (ctrl_info->max_transfer_encrypted_nvme)
+		ctrl_info->max_transfer_encrypted_nvme *= 1024;
+	else
+		ctrl_info->max_transfer_encrypted_nvme = ~0;
+
+	ctrl_info->max_write_raid_5_6 =
+		get_unaligned_le16(
+			&buffer->aio_subpage.max_write_raid_5_6);
+	if (ctrl_info->max_write_raid_5_6)
+		ctrl_info->max_write_raid_5_6 *= 1024;
+	else
+		ctrl_info->max_write_raid_5_6 = ~0;
+
+error:
+	kfree(buffer);
+
+	return rc;
+}
+
 static int pqi_flush_cache(struct pqi_ctrl_info *ctrl_info,
 	enum bmic_flush_cache_shutdown_event shutdown_event)
 {
@@ -1232,6 +1324,39 @@ static int pqi_get_raid_map(struct pqi_ctrl_info *ctrl_info,
 	return rc;
 }
 
+static void pqi_set_max_transfer_encrypted(struct pqi_ctrl_info *ctrl_info,
+	struct pqi_scsi_dev *device)
+{
+	if (!ctrl_info->lv_drive_type_mix_valid) {
+		device->max_transfer_encrypted = ~0;
+		return;
+	}
+
+	switch (LV_GET_DRIVE_TYPE_MIX(device->scsi3addr)) {
+	case LV_DRIVE_TYPE_MIX_SAS_HDD_ONLY:
+	case LV_DRIVE_TYPE_MIX_SATA_HDD_ONLY:
+	case LV_DRIVE_TYPE_MIX_SAS_OR_SATA_SSD_ONLY:
+	case LV_DRIVE_TYPE_MIX_SAS_SSD_ONLY:
+	case LV_DRIVE_TYPE_MIX_SATA_SSD_ONLY:
+	case LV_DRIVE_TYPE_MIX_SAS_ONLY:
+	case LV_DRIVE_TYPE_MIX_SATA_ONLY:
+		device->max_transfer_encrypted =
+			ctrl_info->max_transfer_encrypted_sas_sata;
+		break;
+	case LV_DRIVE_TYPE_MIX_NVME_ONLY:
+		device->max_transfer_encrypted =
+			ctrl_info->max_transfer_encrypted_nvme;
+		break;
+	case LV_DRIVE_TYPE_MIX_UNKNOWN:
+	case LV_DRIVE_TYPE_MIX_NO_RESTRICTION:
+	default:
+		device->max_transfer_encrypted =
+			min(ctrl_info->max_transfer_encrypted_sas_sata,
+				ctrl_info->max_transfer_encrypted_nvme);
+		break;
+	}
+}
+
 static void pqi_get_raid_bypass_status(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_scsi_dev *device)
 {
@@ -1257,8 +1382,12 @@ static void pqi_get_raid_bypass_status(struct pqi_ctrl_info *ctrl_info,
 		(bypass_status & RAID_BYPASS_CONFIGURED) != 0;
 	if (device->raid_bypass_configured &&
 		(bypass_status & RAID_BYPASS_ENABLED) &&
-		pqi_get_raid_map(ctrl_info, device) == 0)
+		pqi_get_raid_map(ctrl_info, device) == 0) {
 		device->raid_bypass_enabled = true;
+		if (get_unaligned_le16(&device->raid_map->flags) &
+			RAID_MAP_ENCRYPTION_ENABLED)
+			pqi_set_max_transfer_encrypted(ctrl_info, device);
+	}
 
 out:
 	kfree(buffer);
@@ -2028,6 +2157,10 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 		}
 	}
 
+	if (num_logicals &&
+		(logdev_list->header.flags & CISS_REPORT_LOG_FLAG_DRIVE_TYPE_MIX))
+		ctrl_info->lv_drive_type_mix_valid = true;
+
 	num_new_devices = num_physicals + num_logicals;
 
 	new_device_list = kmalloc_array(num_new_devices,
@@ -2260,15 +2393,18 @@ static bool pqi_aio_raid_level_supported(struct pqi_ctrl_info *ctrl_info,
 			is_supported = false;
 		break;
 	case SA_RAID_5:
-		if (rmd->is_write && !ctrl_info->enable_r5_writes)
+		if (rmd->is_write && (!ctrl_info->enable_r5_writes ||
+			rmd->data_length > ctrl_info->max_write_raid_5_6))
 			is_supported = false;
 		break;
 	case SA_RAID_6:
-		if (rmd->is_write && !ctrl_info->enable_r6_writes)
+		if (rmd->is_write && (!ctrl_info->enable_r6_writes ||
+			rmd->data_length > ctrl_info->max_write_raid_5_6))
 			is_supported = false;
 		break;
 	default:
 		is_supported = false;
+		break;
 	}
 
 	return is_supported;
@@ -2277,7 +2413,7 @@ static bool pqi_aio_raid_level_supported(struct pqi_ctrl_info *ctrl_info,
 #define PQI_RAID_BYPASS_INELIGIBLE	1
 
 static int pqi_get_aio_lba_and_block_count(struct scsi_cmnd *scmd,
-			struct pqi_scsi_dev_raid_map_data *rmd)
+	struct pqi_scsi_dev_raid_map_data *rmd)
 {
 	/* Check for valid opcode, get LBA and block count. */
 	switch (scmd->cmnd[0]) {
@@ -2323,8 +2459,7 @@ static int pqi_get_aio_lba_and_block_count(struct scsi_cmnd *scmd,
 }
 
 static int pci_get_aio_common_raid_map_values(struct pqi_ctrl_info *ctrl_info,
-					struct pqi_scsi_dev_raid_map_data *rmd,
-					struct raid_map *raid_map)
+	struct pqi_scsi_dev_raid_map_data *rmd, struct raid_map *raid_map)
 {
 #if BITS_PER_LONG == 32
 	u64 tmpdiv;
@@ -2339,7 +2474,7 @@ static int pci_get_aio_common_raid_map_values(struct pqi_ctrl_info *ctrl_info,
 		return PQI_RAID_BYPASS_INELIGIBLE;
 
 	rmd->data_disks_per_row =
-			get_unaligned_le16(&raid_map->data_disks_per_row);
+		get_unaligned_le16(&raid_map->data_disks_per_row);
 	rmd->strip_size = get_unaligned_le16(&raid_map->strip_size);
 	rmd->layout_map_count = get_unaligned_le16(&raid_map->layout_map_count);
 
@@ -2364,16 +2499,16 @@ static int pci_get_aio_common_raid_map_values(struct pqi_ctrl_info *ctrl_info,
 	rmd->first_row = rmd->first_block / rmd->blocks_per_row;
 	rmd->last_row = rmd->last_block / rmd->blocks_per_row;
 	rmd->first_row_offset = (u32)(rmd->first_block -
-				(rmd->first_row * rmd->blocks_per_row));
+		(rmd->first_row * rmd->blocks_per_row));
 	rmd->last_row_offset = (u32)(rmd->last_block - (rmd->last_row *
-				rmd->blocks_per_row));
+		rmd->blocks_per_row));
 	rmd->first_column = rmd->first_row_offset / rmd->strip_size;
 	rmd->last_column = rmd->last_row_offset / rmd->strip_size;
 #endif
 
 	/* If this isn't a single row/column then give to the controller. */
 	if (rmd->first_row != rmd->last_row ||
-			rmd->first_column != rmd->last_column)
+		rmd->first_column != rmd->last_column)
 		return PQI_RAID_BYPASS_INELIGIBLE;
 
 	/* Proceeding with driver mapping. */
@@ -2383,19 +2518,19 @@ static int pci_get_aio_common_raid_map_values(struct pqi_ctrl_info *ctrl_info,
 		raid_map->parity_rotation_shift)) %
 		get_unaligned_le16(&raid_map->row_cnt);
 	rmd->map_index = (rmd->map_row * rmd->total_disks_per_row) +
-			rmd->first_column;
+		rmd->first_column;
 
 	return 0;
 }
 
 static int pqi_calc_aio_r5_or_r6(struct pqi_scsi_dev_raid_map_data *rmd,
-				struct raid_map *raid_map)
+	struct raid_map *raid_map)
 {
 #if BITS_PER_LONG == 32
 	u64 tmpdiv;
 #endif
 	/* RAID 50/60 */
-	/* Verify first and last block are in same RAID group */
+	/* Verify first and last block are in same RAID group. */
 	rmd->stripesize = rmd->blocks_per_row * rmd->layout_map_count;
 #if BITS_PER_LONG == 32
 	tmpdiv = rmd->first_block;
@@ -2415,7 +2550,7 @@ static int pqi_calc_aio_r5_or_r6(struct pqi_scsi_dev_raid_map_data *rmd,
 	if (rmd->first_group != rmd->last_group)
 		return PQI_RAID_BYPASS_INELIGIBLE;
 
-	/* Verify request is in a single row of RAID 5/6 */
+	/* Verify request is in a single row of RAID 5/6. */
 #if BITS_PER_LONG == 32
 	tmpdiv = rmd->first_block;
 	do_div(tmpdiv, rmd->stripesize);
@@ -2432,7 +2567,7 @@ static int pqi_calc_aio_r5_or_r6(struct pqi_scsi_dev_raid_map_data *rmd,
 	if (rmd->r5or6_first_row != rmd->r5or6_last_row)
 		return PQI_RAID_BYPASS_INELIGIBLE;
 
-	/* Verify request is in a single column */
+	/* Verify request is in a single column. */
 #if BITS_PER_LONG == 32
 	tmpdiv = rmd->first_block;
 	rmd->first_row_offset = do_div(tmpdiv, rmd->stripesize);
@@ -2451,23 +2586,22 @@ static int pqi_calc_aio_r5_or_r6(struct pqi_scsi_dev_raid_map_data *rmd,
 	rmd->r5or6_last_column = tmpdiv;
 #else
 	rmd->first_row_offset = rmd->r5or6_first_row_offset =
-		(u32)((rmd->first_block %
-				rmd->stripesize) %
-				rmd->blocks_per_row);
+		(u32)((rmd->first_block % rmd->stripesize) %
+		rmd->blocks_per_row);
 
 	rmd->r5or6_last_row_offset =
 		(u32)((rmd->last_block % rmd->stripesize) %
 		rmd->blocks_per_row);
 
 	rmd->first_column =
-			rmd->r5or6_first_row_offset / rmd->strip_size;
+		rmd->r5or6_first_row_offset / rmd->strip_size;
 	rmd->r5or6_first_column = rmd->first_column;
 	rmd->r5or6_last_column = rmd->r5or6_last_row_offset / rmd->strip_size;
 #endif
 	if (rmd->r5or6_first_column != rmd->r5or6_last_column)
 		return PQI_RAID_BYPASS_INELIGIBLE;
 
-	/* Request is eligible */
+	/* Request is eligible. */
 	rmd->map_row =
 		((u32)(rmd->first_row >> raid_map->parity_rotation_shift)) %
 		get_unaligned_le16(&raid_map->row_cnt);
@@ -2523,7 +2657,7 @@ static void pqi_set_aio_cdb(struct pqi_scsi_dev_raid_map_data *rmd)
 }
 
 static void pqi_calc_aio_r1_nexus(struct raid_map *raid_map,
-				struct pqi_scsi_dev_raid_map_data *rmd)
+	struct pqi_scsi_dev_raid_map_data *rmd)
 {
 	u32 index;
 	u32 group;
@@ -2552,7 +2686,7 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
 	u32 next_bypass_group;
 	struct pqi_encryption_info *encryption_info_ptr;
 	struct pqi_encryption_info encryption_info;
-	struct pqi_scsi_dev_raid_map_data rmd = {0};
+	struct pqi_scsi_dev_raid_map_data rmd = { 0 };
 
 	rc = pqi_get_aio_lba_and_block_count(scmd, &rmd);
 	if (rc)
@@ -2613,7 +2747,9 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
 	pqi_set_aio_cdb(&rmd);
 
 	if (get_unaligned_le16(&raid_map->flags) &
-		RAID_MAP_ENCRYPTION_ENABLED) {
+			RAID_MAP_ENCRYPTION_ENABLED) {
+		if (rmd.data_length > device->max_transfer_encrypted)
+			return PQI_RAID_BYPASS_INELIGIBLE;
 		pqi_set_encryption_info(&encryption_info, raid_map,
 			rmd.first_block);
 		encryption_info_ptr = &encryption_info;
@@ -2623,10 +2759,6 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
 
 	if (rmd.is_write) {
 		switch (device->raid_level) {
-		case SA_RAID_0:
-			return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
-				rmd.cdb, rmd.cdb_length, queue_group,
-				encryption_info_ptr, true);
 		case SA_RAID_1:
 		case SA_RAID_ADM:
 			return pqi_aio_submit_r1_write_io(ctrl_info, scmd, queue_group,
@@ -2635,17 +2767,12 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
 		case SA_RAID_6:
 			return pqi_aio_submit_r56_write_io(ctrl_info, scmd, queue_group,
 					encryption_info_ptr, device, &rmd);
-		default:
-			return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
-				rmd.cdb, rmd.cdb_length, queue_group,
-				encryption_info_ptr, true);
 		}
-	} else {
-		return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
-			rmd.cdb, rmd.cdb_length, queue_group,
-			encryption_info_ptr, true);
 	}
 
+	return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
+		rmd.cdb, rmd.cdb_length, queue_group,
+		encryption_info_ptr, true);
 }
 
 #define PQI_STATUS_IDLE		0x0
@@ -7209,6 +7336,7 @@ static int pqi_enable_firmware_features(struct pqi_ctrl_info *ctrl_info,
 {
 	void *features_requested;
 	void __iomem *features_requested_iomem_addr;
+	void __iomem *host_max_known_feature_iomem_addr;
 
 	features_requested = firmware_features->features_supported +
 		le16_to_cpu(firmware_features->num_elements);
@@ -7219,6 +7347,16 @@ static int pqi_enable_firmware_features(struct pqi_ctrl_info *ctrl_info,
 	memcpy_toio(features_requested_iomem_addr, features_requested,
 		le16_to_cpu(firmware_features->num_elements));
 
+	if (pqi_is_firmware_feature_supported(firmware_features,
+		PQI_FIRMWARE_FEATURE_MAX_KNOWN_FEATURE)) {
+		host_max_known_feature_iomem_addr =
+			features_requested_iomem_addr +
+			(le16_to_cpu(firmware_features->num_elements) * 2) +
+			sizeof(__le16);
+		writew(PQI_FIRMWARE_FEATURE_MAXIMUM,
+			host_max_known_feature_iomem_addr);
+	}
+
 	return pqi_config_table_update(ctrl_info,
 		PQI_CONFIG_TABLE_SECTION_FIRMWARE_FEATURES,
 		PQI_CONFIG_TABLE_SECTION_FIRMWARE_FEATURES);
@@ -7256,6 +7394,15 @@ static void pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_firmware_feature *firmware_feature)
 {
 	switch (firmware_feature->feature_bit) {
+	case PQI_FIRMWARE_FEATURE_RAID_1_WRITE_BYPASS:
+		ctrl_info->enable_r1_writes = firmware_feature->enabled;
+		break;
+	case PQI_FIRMWARE_FEATURE_RAID_5_WRITE_BYPASS:
+		ctrl_info->enable_r5_writes = firmware_feature->enabled;
+		break;
+	case PQI_FIRMWARE_FEATURE_RAID_6_WRITE_BYPASS:
+		ctrl_info->enable_r6_writes = firmware_feature->enabled;
+		break;
 	case PQI_FIRMWARE_FEATURE_SOFT_RESET_HANDSHAKE:
 		ctrl_info->soft_reset_handshake_supported =
 			firmware_feature->enabled;
@@ -7293,6 +7440,51 @@ static struct pqi_firmware_feature pqi_firmware_features[] = {
 		.feature_bit = PQI_FIRMWARE_FEATURE_SMP,
 		.feature_status = pqi_firmware_feature_status,
 	},
+	{
+		.feature_name = "Maximum Known Feature",
+		.feature_bit = PQI_FIRMWARE_FEATURE_MAX_KNOWN_FEATURE,
+		.feature_status = pqi_firmware_feature_status,
+	},
+	{
+		.feature_name = "RAID 0 Read Bypass",
+		.feature_bit = PQI_FIRMWARE_FEATURE_RAID_0_READ_BYPASS,
+		.feature_status = pqi_firmware_feature_status,
+	},
+	{
+		.feature_name = "RAID 1 Read Bypass",
+		.feature_bit = PQI_FIRMWARE_FEATURE_RAID_1_READ_BYPASS,
+		.feature_status = pqi_firmware_feature_status,
+	},
+	{
+		.feature_name = "RAID 5 Read Bypass",
+		.feature_bit = PQI_FIRMWARE_FEATURE_RAID_5_READ_BYPASS,
+		.feature_status = pqi_firmware_feature_status,
+	},
+	{
+		.feature_name = "RAID 6 Read Bypass",
+		.feature_bit = PQI_FIRMWARE_FEATURE_RAID_6_READ_BYPASS,
+		.feature_status = pqi_firmware_feature_status,
+	},
+	{
+		.feature_name = "RAID 0 Write Bypass",
+		.feature_bit = PQI_FIRMWARE_FEATURE_RAID_0_WRITE_BYPASS,
+		.feature_status = pqi_firmware_feature_status,
+	},
+	{
+		.feature_name = "RAID 1 Write Bypass",
+		.feature_bit = PQI_FIRMWARE_FEATURE_RAID_1_WRITE_BYPASS,
+		.feature_status = pqi_ctrl_update_feature_flags,
+	},
+	{
+		.feature_name = "RAID 5 Write Bypass",
+		.feature_bit = PQI_FIRMWARE_FEATURE_RAID_5_WRITE_BYPASS,
+		.feature_status = pqi_ctrl_update_feature_flags,
+	},
+	{
+		.feature_name = "RAID 6 Write Bypass",
+		.feature_bit = PQI_FIRMWARE_FEATURE_RAID_6_WRITE_BYPASS,
+		.feature_status = pqi_ctrl_update_feature_flags,
+	},
 	{
 		.feature_name = "New Soft Reset Handshake",
 		.feature_bit = PQI_FIRMWARE_FEATURE_SOFT_RESET_HANDSHAKE,
@@ -7667,6 +7859,17 @@ static int pqi_ctrl_init(struct pqi_ctrl_info *ctrl_info)
 
 	pqi_start_heartbeat_timer(ctrl_info);
 
+	if (ctrl_info->enable_r5_writes || ctrl_info->enable_r6_writes) {
+		rc = pqi_get_advanced_raid_bypass_config(ctrl_info);
+		if (rc) {
+			dev_err(&ctrl_info->pci_dev->dev,
+				"error obtaining advanced RAID bypass configuration\n");
+			return rc;
+		}
+		ctrl_info->ciss_report_log_flags |=
+			CISS_REPORT_LOG_FLAG_DRIVE_TYPE_MIX;
+	}
+
 	rc = pqi_enable_events(ctrl_info);
 	if (rc) {
 		dev_err(&ctrl_info->pci_dev->dev,
@@ -7822,6 +8025,17 @@ static int pqi_ctrl_init_resume(struct pqi_ctrl_info *ctrl_info)
 
 	pqi_start_heartbeat_timer(ctrl_info);
 
+	if (ctrl_info->enable_r5_writes || ctrl_info->enable_r6_writes) {
+		rc = pqi_get_advanced_raid_bypass_config(ctrl_info);
+		if (rc) {
+			dev_err(&ctrl_info->pci_dev->dev,
+				"error obtaining advanced RAID bypass configuration\n");
+			return rc;
+		}
+		ctrl_info->ciss_report_log_flags |=
+			CISS_REPORT_LOG_FLAG_DRIVE_TYPE_MIX;
+	}
+
 	rc = pqi_enable_events(ctrl_info);
 	if (rc) {
 		dev_err(&ctrl_info->pci_dev->dev,
@@ -7985,6 +8199,13 @@ static struct pqi_ctrl_info *pqi_alloc_ctrl_info(int numa_node)
 	ctrl_info->irq_mode = IRQ_MODE_NONE;
 	ctrl_info->max_msix_vectors = PQI_MAX_MSIX_VECTORS;
 
+	ctrl_info->ciss_report_log_flags = CISS_REPORT_LOG_FLAG_UNIQUE_LUN_ID;
+	ctrl_info->max_transfer_encrypted_sas_sata =
+		PQI_DEFAULT_MAX_TRANSFER_ENCRYPTED_SAS_SATA;
+	ctrl_info->max_transfer_encrypted_nvme =
+		PQI_DEFAULT_MAX_TRANSFER_ENCRYPTED_NVME;
+	ctrl_info->max_write_raid_5_6 = PQI_DEFAULT_MAX_WRITE_RAID_5_6;
+
 	return ctrl_info;
 }
 
@@ -9396,6 +9617,41 @@ static void __attribute__((unused)) verify_structures(void)
 		current_queue_depth_limit) != 1796);
 	BUILD_BUG_ON(sizeof(struct bmic_identify_physical_device) != 2560);
 
+	BUILD_BUG_ON(sizeof(struct bmic_sense_feature_buffer_header) != 4);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_buffer_header,
+		page_code) != 0);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_buffer_header,
+		subpage_code) != 1);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_buffer_header,
+		buffer_length) != 2);
+
+	BUILD_BUG_ON(sizeof(struct bmic_sense_feature_page_header) != 4);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_page_header,
+		page_code) != 0);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_page_header,
+		subpage_code) != 1);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_page_header,
+		page_length) != 2);
+
+	BUILD_BUG_ON(sizeof(struct bmic_sense_feature_io_page_aio_subpage)
+		!= 14);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_io_page_aio_subpage,
+		header) != 0);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_io_page_aio_subpage,
+		firmware_read_support) != 4);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_io_page_aio_subpage,
+		driver_read_support) != 5);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_io_page_aio_subpage,
+		firmware_write_support) != 6);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_io_page_aio_subpage,
+		driver_write_support) != 7);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_io_page_aio_subpage,
+		max_transfer_encrypted_sas_sata) != 8);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_io_page_aio_subpage,
+		max_transfer_encrypted_nvme) != 10);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_io_page_aio_subpage,
+		max_write_raid_5_6) != 12);
+
 	BUILD_BUG_ON(PQI_ADMIN_IQ_NUM_ELEMENTS > 255);
 	BUILD_BUG_ON(PQI_ADMIN_OQ_NUM_ELEMENTS > 255);
 	BUILD_BUG_ON(PQI_ADMIN_IQ_ELEMENT_LENGTH %


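A stand-alone sketch of the offset arithmetic behind the
host_max_known_feature write in pqi_enable_firmware_features() above.
Per the structure comments, the variable-length tail is laid out as
features_supported[n], features_requested_by_host[n],
features_enabled[n], firmware_max_known_feature (__le16),
host_max_known_feature (__le16); starting from
features_requested_by_host, the driver therefore skips 2 * n bytes
plus one __le16. The element count below is illustrative:

#include <stddef.h>
#include <stdio.h>

int main(void)
{
	size_t n = 16;			/* num_elements (illustrative) */
	size_t features_requested = 0;	/* relative to features_requested_by_host */
	size_t host_max_known_feature = features_requested +
		(n * 2) + 2;		/* 2 == sizeof(__le16) */

	/* prints 34 for n == 16 */
	printf("host_max_known_feature at offset +%zu\n",
		host_max_known_feature);
	return 0;
}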

* [PATCH V3 07/25] smartpqi: update AIO Sub Page 0x02 support
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (5 preceding siblings ...)
  2020-12-10 20:34 ` [PATCH V3 06/25] smartpqi: add support for BMIC sense feature cmd and feature bits Don Brace
@ 2020-12-10 20:35 ` Don Brace
  2021-01-07 16:44   ` Martin Wilck
  2020-12-10 20:35 ` [PATCH V3 08/25] smartpqi: add support for long firmware version Don Brace
                   ` (18 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:35 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Kevin Barnett <kevin.barnett@microchip.com>

The specification for AIO Sub-Page (0x02) has changed slightly.
* Bring the driver into conformance with the spec
  (see the conversion sketch below).

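The three open-coded conversions from the previous patch are factored
into pqi_aio_limit_to_bytes(). A stand-alone copy of that logic
(endianness handling omitted): the subpage reports limits in KiB, with
0 meaning "no restriction", which becomes ~0:

#include <stdint.h>
#include <stdio.h>

static uint32_t aio_limit_to_bytes(uint16_t limit_kib)
{
	uint32_t bytes = limit_kib;

	if (bytes == 0)
		bytes = ~0U;	/* 0 means unrestricted */
	else
		bytes *= 1024;	/* KiB -> bytes */

	return bytes;
}

int main(void)
{
	printf("32 KiB -> %u bytes\n", aio_limit_to_bytes(32));	/* 32768 */
	printf("0      -> 0x%x (unrestricted)\n", aio_limit_to_bytes(0));
	return 0;
}
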
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h      |   12 ++++---
 drivers/scsi/smartpqi/smartpqi_init.c |   60 +++++++++++++++++++++------------
 2 files changed, 47 insertions(+), 25 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index 31281cddadfe..eb23c3cf59c0 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -1028,13 +1028,13 @@ struct pqi_scsi_dev_raid_map_data {
 	u8	cdb_length;
 
 	/* RAID 1 specific */
-#define NUM_RAID1_MAP_ENTRIES 3
+#define NUM_RAID1_MAP_ENTRIES	3
 	u32	num_it_nexus_entries;
 	u32	it_nexus[NUM_RAID1_MAP_ENTRIES];
 
 	/* RAID 5 / RAID 6 specific */
-	u32	p_parity_it_nexus; /* aio_handle */
-	u32	q_parity_it_nexus; /* aio_handle */
+	u32	p_parity_it_nexus;	/* aio_handle */
+	u32	q_parity_it_nexus;	/* aio_handle */
 	u8	xor_mult;
 	u64	row;
 	u64	stripe_lba;
@@ -1044,6 +1044,7 @@ struct pqi_scsi_dev_raid_map_data {
 
 #define RAID_CTLR_LUNID		"\0\0\0\0\0\0\0\0"
 
+
 struct pqi_scsi_dev {
 	int	devtype;		/* as reported by INQUIRY command */
 	u8	device_type;		/* as reported by */
@@ -1302,7 +1303,8 @@ struct pqi_ctrl_info {
 	u32		max_transfer_encrypted_sas_sata;
 	u32		max_transfer_encrypted_nvme;
 	u32		max_write_raid_5_6;
-
+	u32		max_write_raid_1_10_2drive;
+	u32		max_write_raid_1_10_3drive;
 
 	struct list_head scsi_device_list;
 	spinlock_t	scsi_device_list_lock;
@@ -1533,6 +1535,8 @@ struct bmic_sense_feature_io_page_aio_subpage {
 	__le16	max_transfer_encrypted_sas_sata;
 	__le16	max_transfer_encrypted_nvme;
 	__le16	max_write_raid_5_6;
+	__le16	max_write_raid_1_10_2drive;
+	__le16	max_write_raid_1_10_3drive;
 };
 
 struct bmic_smp_request {
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index aa21c1cd2cac..419887aa8ff3 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -696,6 +696,19 @@ static int pqi_identify_physical_device(struct pqi_ctrl_info *ctrl_info,
 	return rc;
 }
 
+static inline u32 pqi_aio_limit_to_bytes(__le16 *limit)
+{
+	u32 bytes;
+
+	bytes = get_unaligned_le16(limit);
+	if (bytes == 0)
+		bytes = ~0;
+	else
+		bytes *= 1024;
+
+	return bytes;
+}
+
 #pragma pack(1)
 
 struct bmic_sense_feature_buffer {
@@ -707,11 +720,11 @@ struct bmic_sense_feature_buffer {
 
 #define MINIMUM_AIO_SUBPAGE_BUFFER_LENGTH	\
 	offsetofend(struct bmic_sense_feature_buffer, \
-		aio_subpage.max_write_raid_5_6)
+		aio_subpage.max_write_raid_1_10_3drive)
 
 #define MINIMUM_AIO_SUBPAGE_LENGTH	\
 	(offsetofend(struct bmic_sense_feature_io_page_aio_subpage, \
-		max_write_raid_5_6) - \
+		max_write_raid_1_10_3drive) - \
 		sizeof_field(struct bmic_sense_feature_io_page_aio_subpage, header))
 
 static int pqi_get_advanced_raid_bypass_config(struct pqi_ctrl_info *ctrl_info)
@@ -753,33 +766,28 @@ static int pqi_get_advanced_raid_bypass_config(struct pqi_ctrl_info *ctrl_info)
 			BMIC_SENSE_FEATURE_IO_PAGE_AIO_SUBPAGE ||
 		get_unaligned_le16(&buffer->aio_subpage.header.page_length) <
 			MINIMUM_AIO_SUBPAGE_LENGTH) {
 		rc = -EINVAL;
 		goto error;
 	}
 
 	ctrl_info->max_transfer_encrypted_sas_sata =
-		get_unaligned_le16(
+		pqi_aio_limit_to_bytes(
 			&buffer->aio_subpage.max_transfer_encrypted_sas_sata);
-	if (ctrl_info->max_transfer_encrypted_sas_sata)
-		ctrl_info->max_transfer_encrypted_sas_sata *= 1024;
-	else
-		ctrl_info->max_transfer_encrypted_sas_sata = ~0;
 
 	ctrl_info->max_transfer_encrypted_nvme =
-		get_unaligned_le16(
+		pqi_aio_limit_to_bytes(
 			&buffer->aio_subpage.max_transfer_encrypted_nvme);
-	if (ctrl_info->max_transfer_encrypted_nvme)
-		ctrl_info->max_transfer_encrypted_nvme *= 1024;
-	else
-		ctrl_info->max_transfer_encrypted_nvme = ~0;
 
 	ctrl_info->max_write_raid_5_6 =
-		get_unaligned_le16(
+		pqi_aio_limit_to_bytes(
 			&buffer->aio_subpage.max_write_raid_5_6);
-	if (ctrl_info->max_write_raid_5_6)
-		ctrl_info->max_write_raid_5_6 *= 1024;
-	else
-		ctrl_info->max_write_raid_5_6 = ~0;
+
+	ctrl_info->max_write_raid_1_10_2drive =
+		pqi_aio_limit_to_bytes(
+			&buffer->aio_subpage.max_write_raid_1_10_2drive);
+
+	ctrl_info->max_write_raid_1_10_3drive =
+		pqi_aio_limit_to_bytes(
+			&buffer->aio_subpage.max_write_raid_1_10_3drive);
 
 error:
 	kfree(buffer);
@@ -2387,9 +2395,13 @@ static bool pqi_aio_raid_level_supported(struct pqi_ctrl_info *ctrl_info,
 	case SA_RAID_0:
 		break;
 	case SA_RAID_1:
-		fallthrough;
+		if (rmd->is_write && (!ctrl_info->enable_r1_writes ||
+			rmd->data_length > ctrl_info->max_write_raid_1_10_2drive))
+			is_supported = false;
+		break;
 	case SA_RAID_ADM:
-		if (rmd->is_write && !ctrl_info->enable_r1_writes)
+		if (rmd->is_write && (!ctrl_info->enable_r1_writes ||
+			rmd->data_length > ctrl_info->max_write_raid_1_10_3drive))
 			is_supported = false;
 		break;
 	case SA_RAID_5:
@@ -8205,6 +8217,8 @@ static struct pqi_ctrl_info *pqi_alloc_ctrl_info(int numa_node)
 	ctrl_info->max_transfer_encrypted_nvme =
 		PQI_DEFAULT_MAX_TRANSFER_ENCRYPTED_NVME;
 	ctrl_info->max_write_raid_5_6 = PQI_DEFAULT_MAX_WRITE_RAID_5_6;
+	ctrl_info->max_write_raid_1_10_2drive = ~0;
+	ctrl_info->max_write_raid_1_10_3drive = ~0;
 
 	return ctrl_info;
 }
@@ -9634,7 +9648,7 @@ static void __attribute__((unused)) verify_structures(void)
 		page_length) != 2);
 
 	BUILD_BUG_ON(sizeof(struct bmic_sense_feature_io_page_aio_subpage)
-		!= 14);
+		!= 18);
 	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_io_page_aio_subpage,
 		header) != 0);
 	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_io_page_aio_subpage,
@@ -9651,6 +9665,10 @@ static void __attribute__((unused)) verify_structures(void)
 		max_transfer_encrypted_nvme) != 10);
 	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_io_page_aio_subpage,
 		max_write_raid_5_6) != 12);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_io_page_aio_subpage,
+		max_write_raid_1_10_2drive) != 14);
+	BUILD_BUG_ON(offsetof(struct bmic_sense_feature_io_page_aio_subpage,
+		max_write_raid_1_10_3drive) != 16);
 
 	BUILD_BUG_ON(PQI_ADMIN_IQ_NUM_ELEMENTS > 255);
 	BUILD_BUG_ON(PQI_ADMIN_OQ_NUM_ELEMENTS > 255);


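For reference, the offsetofend()-based minimum macros above accept any
response that covers fields up through max_write_raid_1_10_3drive, so
firmware reporting a longer page still passes while anything shorter
is rejected. A stand-alone equivalent (the macro mirrors the kernel's
definition; the struct repeats the subpage layout with illustrative
stdint types):

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

struct aio_subpage {
	uint8_t  header[4];
	uint8_t  firmware_read_support;
	uint8_t  driver_read_support;
	uint8_t  firmware_write_support;
	uint8_t  driver_write_support;
	uint16_t max_transfer_encrypted_sas_sata;
	uint16_t max_transfer_encrypted_nvme;
	uint16_t max_write_raid_5_6;
	uint16_t max_write_raid_1_10_2drive;
	uint16_t max_write_raid_1_10_3drive;
};

int main(void)
{
	/* prints 18: everything up to and including the last field */
	printf("%zu\n", offsetofend(struct aio_subpage,
		max_write_raid_1_10_3drive));
	return 0;
}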

* [PATCH V3 08/25] smartpqi: add support for long firmware version
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (6 preceding siblings ...)
  2020-12-10 20:35 ` [PATCH V3 07/25] smartpqi: update AIO Sub Page 0x02 support Don Brace
@ 2020-12-10 20:35 ` Don Brace
  2021-01-07 16:45   ` Martin Wilck
  2020-12-10 20:35 ` [PATCH V3 09/25] smartpqi: align code with oob driver Don Brace
                   ` (17 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:35 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Kevin Barnett <kevin.barnett@microchip.com>

* Add support for the new "long" firmware version, which requires
  minor driver changes to expose (see the formatting example below).

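A stand-alone sketch of the two resulting formats: with the
long-version flag set, the 32-byte string is used as-is; otherwise the
short 4-byte version is suffixed with the build number, as before.
The version and build values below are illustrative:

#include <stdio.h>
#include <string.h>

int main(void)
{
	char firmware_version[32];
	const char firmware_version_short[4] = { '1', '.', '2', '9' };
	unsigned int firmware_build_number = 1234;

	memcpy(firmware_version, firmware_version_short,
		sizeof(firmware_version_short));
	firmware_version[sizeof(firmware_version_short)] = '\0';
	snprintf(firmware_version + strlen(firmware_version),
		sizeof(firmware_version) - strlen(firmware_version),
		"-%u", firmware_build_number);

	printf("short form: %s\n", firmware_version);	/* 1.29-1234 */
	return 0;
}
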
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h      |   14 ++++++++---
 drivers/scsi/smartpqi/smartpqi_init.c |   42 ++++++++++++++++++++++++---------
 2 files changed, 40 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index eb23c3cf59c0..f33244def944 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -1227,7 +1227,7 @@ struct pqi_event {
 struct pqi_ctrl_info {
 	unsigned int	ctrl_id;
 	struct pci_dev	*pci_dev;
-	char		firmware_version[11];
+	char		firmware_version[32];
 	char		serial_number[17];
 	char		model[17];
 	char		vendor[9];
@@ -1405,7 +1405,7 @@ enum pqi_ctrl_mode {
 struct bmic_identify_controller {
 	u8	configured_logical_drive_count;
 	__le32	configuration_signature;
-	u8	firmware_version[4];
+	u8	firmware_version_short[4];
 	u8	reserved[145];
 	__le16	extended_logical_unit_count;
 	u8	reserved1[34];
@@ -1413,11 +1413,17 @@ struct bmic_identify_controller {
 	u8	reserved2[8];
 	u8	vendor_id[8];
 	u8	product_id[16];
-	u8	reserved3[68];
+	u8	reserved3[62];
+	__le32	extra_controller_flags;
+	u8	reserved4[2];
 	u8	controller_mode;
-	u8	reserved4[32];
+	u8	spare_part_number[32];
+	u8	firmware_version_long[32];
 };
 
+/* constants for extra_controller_flags field of bmic_identify_controller */
+#define BMIC_IDENTIFY_EXTRA_FLAGS_LONG_FW_VERSION_SUPPORTED	0x20000000
+
 struct bmic_sense_subsystem_info {
 	u8	reserved[44];
 	u8	ctrl_serial_number[16];
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 419887aa8ff3..aa8b559e8907 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -998,14 +998,12 @@ static void pqi_update_time_worker(struct work_struct *work)
 		PQI_UPDATE_TIME_WORK_INTERVAL);
 }
 
-static inline void pqi_schedule_update_time_worker(
-	struct pqi_ctrl_info *ctrl_info)
+static inline void pqi_schedule_update_time_worker(struct pqi_ctrl_info *ctrl_info)
 {
 	schedule_delayed_work(&ctrl_info->update_time_work, 0);
 }
 
-static inline void pqi_cancel_update_time_worker(
-	struct pqi_ctrl_info *ctrl_info)
+static inline void pqi_cancel_update_time_worker(struct pqi_ctrl_info *ctrl_info)
 {
 	cancel_delayed_work_sync(&ctrl_info->update_time_work);
 }
@@ -7245,13 +7243,23 @@ static int pqi_get_ctrl_product_details(struct pqi_ctrl_info *ctrl_info)
 	if (rc)
 		goto out;
 
-	memcpy(ctrl_info->firmware_version, identify->firmware_version,
-		sizeof(identify->firmware_version));
-	ctrl_info->firmware_version[sizeof(identify->firmware_version)] = '\0';
-	snprintf(ctrl_info->firmware_version +
-		strlen(ctrl_info->firmware_version),
-		sizeof(ctrl_info->firmware_version),
-		"-%u", get_unaligned_le16(&identify->firmware_build_number));
+	if (get_unaligned_le32(&identify->extra_controller_flags) &
+		BMIC_IDENTIFY_EXTRA_FLAGS_LONG_FW_VERSION_SUPPORTED) {
+		memcpy(ctrl_info->firmware_version,
+			identify->firmware_version_long,
+			sizeof(identify->firmware_version_long));
+	} else {
+		memcpy(ctrl_info->firmware_version,
+			identify->firmware_version_short,
+			sizeof(identify->firmware_version_short));
+		ctrl_info->firmware_version
+			[sizeof(identify->firmware_version_short)] = '\0';
+		snprintf(ctrl_info->firmware_version +
+			strlen(ctrl_info->firmware_version),
+			sizeof(ctrl_info->firmware_version),
+			"-%u",
+			get_unaligned_le16(&identify->firmware_build_number));
+	}
 
 	memcpy(ctrl_info->model, identify->product_id,
 		sizeof(identify->product_id));
@@ -9607,13 +9615,23 @@ static void __attribute__((unused)) verify_structures(void)
 	BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
 		configuration_signature) != 1);
 	BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
-		firmware_version) != 5);
+		firmware_version_short) != 5);
 	BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
 		extended_logical_unit_count) != 154);
 	BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
 		firmware_build_number) != 190);
+	BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
+		vendor_id) != 200);
+	BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
+		product_id) != 208);
+	BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
+		extra_controller_flags) != 286);
 	BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
 		controller_mode) != 292);
+	BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
+		spare_part_number) != 293);
+	BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
+		firmware_version_long) != 325);
 
 	BUILD_BUG_ON(offsetof(struct bmic_identify_physical_device,
 		phys_bay_in_box) != 115);


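The new BUILD_BUG_ON() checks above pin the on-wire offsets of
bmic_identify_controller at compile time so the long-version fields
cannot silently drift. Outside the kernel the same technique is
available with C11 _Static_assert; a minimal sketch over a fragment of
the layout (offsets are local to this fragment, not the full
structure):

#include <stddef.h>
#include <stdint.h>

#pragma pack(1)
struct identify_tail {
	uint8_t controller_mode;		/* offset 0 in this fragment */
	uint8_t spare_part_number[32];		/* offset 1 */
	uint8_t firmware_version_long[32];	/* offset 33 */
};
#pragma pack()

_Static_assert(offsetof(struct identify_tail, spare_part_number) == 1,
	"identify layout drifted");
_Static_assert(offsetof(struct identify_tail, firmware_version_long) == 33,
	"identify layout drifted");

int main(void)
{
	return 0;
}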

* [PATCH V3 09/25] smartpqi: align code with oob driver
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (7 preceding siblings ...)
  2020-12-10 20:35 ` [PATCH V3 08/25] smartpqi: add support for long firmware version Don Brace
@ 2020-12-10 20:35 ` Don Brace
  2021-01-08  0:13   ` Martin Wilck
  2020-12-10 20:35 ` [PATCH V3 10/25] smartpqi: add stream detection Don Brace
                   ` (16 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:35 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Kevin Barnett <kevin.barnett@microchip.com>

* Non-functional changes.
* Reduce differences between the out-of-box (OOB) driver and the
  kernel.org driver.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h               |   54 +++---
 drivers/scsi/smartpqi/smartpqi_init.c          |  232 +++++++++---------------
 drivers/scsi/smartpqi/smartpqi_sas_transport.c |   10 +
 drivers/scsi/smartpqi/smartpqi_sis.c           |    4 
 4 files changed, 119 insertions(+), 181 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index f33244def944..a5e271dd2742 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -129,7 +129,7 @@ struct pqi_iu_header {
 	__le16	iu_length;	/* in bytes - does not include the length */
 				/* of this header */
 	__le16	response_queue_id;	/* specifies the OQ where the */
-					/*   response IU is to be delivered */
+					/* response IU is to be delivered */
 	u8	work_area[2];	/* reserved for driver use */
 };
 
@@ -281,8 +281,7 @@ struct pqi_raid_path_request {
 	u8	cdb[16];
 	u8	reserved6[12];
 	__le32	timeout;
-	struct pqi_sg_descriptor
-		sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
+	struct pqi_sg_descriptor sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
 };
 
 struct pqi_aio_path_request {
@@ -309,8 +308,7 @@ struct pqi_aio_path_request {
 	u8	cdb_length;
 	u8	lun_number[8];
 	u8	reserved4[4];
-	struct pqi_sg_descriptor
-		sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
+	struct pqi_sg_descriptor sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
 };
 
 #define PQI_RAID1_NVME_XFER_LIMIT	(32 * 1024)	/* 32 KiB */
@@ -421,7 +419,7 @@ struct pqi_event_config {
 
 #define PQI_EVENT_OFA_MEMORY_ALLOCATION	0x0
 #define PQI_EVENT_OFA_QUIESCE		0x1
-#define PQI_EVENT_OFA_CANCELLED		0x2
+#define PQI_EVENT_OFA_CANCELED		0x2
 
 struct pqi_event_response {
 	struct pqi_iu_header header;
@@ -726,7 +724,7 @@ struct pqi_admin_queues_aligned {
 struct pqi_admin_queues {
 	void		*iq_element_array;
 	void		*oq_element_array;
-	pqi_index_t	*iq_ci;
+	pqi_index_t __iomem *iq_ci;
 	pqi_index_t __iomem *oq_pi;
 	dma_addr_t	iq_element_array_bus_addr;
 	dma_addr_t	oq_element_array_bus_addr;
@@ -751,8 +749,8 @@ struct pqi_queue_group {
 	dma_addr_t	oq_element_array_bus_addr;
 	__le32 __iomem	*iq_pi[2];
 	pqi_index_t	iq_pi_copy[2];
-	pqi_index_t __iomem	*iq_ci[2];
-	pqi_index_t __iomem	*oq_pi;
+	pqi_index_t __iomem *iq_ci[2];
+	pqi_index_t __iomem *oq_pi;
 	dma_addr_t	iq_ci_bus_addr[2];
 	dma_addr_t	oq_pi_bus_addr;
 	__le32 __iomem	*oq_ci;
@@ -765,7 +763,7 @@ struct pqi_event_queue {
 	u16		oq_id;
 	u16		int_msg_num;
 	void		*oq_element_array;
-	pqi_index_t __iomem	*oq_pi;
+	pqi_index_t __iomem *oq_pi;
 	dma_addr_t	oq_element_array_bus_addr;
 	dma_addr_t	oq_pi_bus_addr;
 	__le32 __iomem	*oq_ci;
@@ -836,22 +834,22 @@ struct pqi_config_table_firmware_features {
 /*	__le16	host_max_known_feature; */
 };
 
-#define PQI_FIRMWARE_FEATURE_OFA			0
-#define PQI_FIRMWARE_FEATURE_SMP			1
-#define PQI_FIRMWARE_FEATURE_MAX_KNOWN_FEATURE		2
-#define PQI_FIRMWARE_FEATURE_RAID_0_READ_BYPASS		3
-#define PQI_FIRMWARE_FEATURE_RAID_1_READ_BYPASS		4
-#define PQI_FIRMWARE_FEATURE_RAID_5_READ_BYPASS		5
-#define PQI_FIRMWARE_FEATURE_RAID_6_READ_BYPASS		6
-#define PQI_FIRMWARE_FEATURE_RAID_0_WRITE_BYPASS	7
-#define PQI_FIRMWARE_FEATURE_RAID_1_WRITE_BYPASS	8
-#define PQI_FIRMWARE_FEATURE_RAID_5_WRITE_BYPASS	9
-#define PQI_FIRMWARE_FEATURE_RAID_6_WRITE_BYPASS	10
-#define PQI_FIRMWARE_FEATURE_SOFT_RESET_HANDSHAKE	11
-#define PQI_FIRMWARE_FEATURE_UNIQUE_SATA_WWN		12
-#define PQI_FIRMWARE_FEATURE_RAID_IU_TIMEOUT		13
-#define PQI_FIRMWARE_FEATURE_TMF_IU_TIMEOUT		14
-#define PQI_FIRMWARE_FEATURE_MAXIMUM			14
+#define PQI_FIRMWARE_FEATURE_OFA				0
+#define PQI_FIRMWARE_FEATURE_SMP				1
+#define PQI_FIRMWARE_FEATURE_MAX_KNOWN_FEATURE			2
+#define PQI_FIRMWARE_FEATURE_RAID_0_READ_BYPASS			3
+#define PQI_FIRMWARE_FEATURE_RAID_1_READ_BYPASS			4
+#define PQI_FIRMWARE_FEATURE_RAID_5_READ_BYPASS			5
+#define PQI_FIRMWARE_FEATURE_RAID_6_READ_BYPASS			6
+#define PQI_FIRMWARE_FEATURE_RAID_0_WRITE_BYPASS		7
+#define PQI_FIRMWARE_FEATURE_RAID_1_WRITE_BYPASS		8
+#define PQI_FIRMWARE_FEATURE_RAID_5_WRITE_BYPASS		9
+#define PQI_FIRMWARE_FEATURE_RAID_6_WRITE_BYPASS		10
+#define PQI_FIRMWARE_FEATURE_SOFT_RESET_HANDSHAKE		11
+#define PQI_FIRMWARE_FEATURE_UNIQUE_SATA_WWN			12
+#define PQI_FIRMWARE_FEATURE_RAID_IU_TIMEOUT			13
+#define PQI_FIRMWARE_FEATURE_TMF_IU_TIMEOUT			14
+#define PQI_FIRMWARE_FEATURE_MAXIMUM				14
 
 struct pqi_config_table_debug {
 	struct pqi_config_table_section_header header;
@@ -1292,8 +1290,8 @@ struct pqi_ctrl_info {
 	u8		pqi_mode_enabled : 1;
 	u8		pqi_reset_quiesce_supported : 1;
 	u8		soft_reset_handshake_supported : 1;
-	u8		raid_iu_timeout_supported: 1;
-	u8		tmf_iu_timeout_supported: 1;
+	u8		raid_iu_timeout_supported : 1;
+	u8		tmf_iu_timeout_supported : 1;
 	u8		enable_r1_writes : 1;
 	u8		enable_r5_writes : 1;
 	u8		enable_r6_writes : 1;
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index aa8b559e8907..fc8fafab480d 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -156,14 +156,12 @@ MODULE_PARM_DESC(lockup_action, "Action to take when controller locked up.\n"
 static int pqi_expose_ld_first;
 module_param_named(expose_ld_first,
 	pqi_expose_ld_first, int, 0644);
-MODULE_PARM_DESC(expose_ld_first,
-	"Expose logical drives before physical drives.");
+MODULE_PARM_DESC(expose_ld_first, "Expose logical drives before physical drives.");
 
 static int pqi_hide_vsep;
 module_param_named(hide_vsep,
 	pqi_hide_vsep, int, 0644);
-MODULE_PARM_DESC(hide_vsep,
-	"Hide the virtual SEP for direct attached drives.");
+MODULE_PARM_DESC(hide_vsep, "Hide the virtual SEP for direct attached drives.");
 
 static char *raid_levels[] = {
 	"RAID-0",
@@ -236,8 +234,7 @@ static inline bool pqi_is_hba_lunid(u8 *scsi3addr)
 	return pqi_scsi3addr_equal(scsi3addr, RAID_CTLR_LUNID);
 }
 
-static inline enum pqi_ctrl_mode pqi_get_ctrl_mode(
-	struct pqi_ctrl_info *ctrl_info)
+static inline enum pqi_ctrl_mode pqi_get_ctrl_mode(struct pqi_ctrl_info *ctrl_info)
 {
 	return sis_read_driver_scratch(ctrl_info);
 }
@@ -368,8 +365,8 @@ static inline bool pqi_ctrl_in_shutdown(struct pqi_ctrl_info *ctrl_info)
 	return ctrl_info->in_shutdown;
 }
 
-static inline void pqi_schedule_rescan_worker_with_delay(
-	struct pqi_ctrl_info *ctrl_info, unsigned long delay)
+static inline void pqi_schedule_rescan_worker_with_delay(struct pqi_ctrl_info *ctrl_info,
+	unsigned long delay)
 {
 	if (pqi_ctrl_offline(ctrl_info))
 		return;
@@ -386,8 +383,7 @@ static inline void pqi_schedule_rescan_worker(struct pqi_ctrl_info *ctrl_info)
 
 #define PQI_RESCAN_WORK_DELAY	(10 * PQI_HZ)
 
-static inline void pqi_schedule_rescan_worker_delayed(
-	struct pqi_ctrl_info *ctrl_info)
+static inline void pqi_schedule_rescan_worker_delayed(struct pqi_ctrl_info *ctrl_info)
 {
 	pqi_schedule_rescan_worker_with_delay(ctrl_info, PQI_RESCAN_WORK_DELAY);
 }
@@ -616,9 +612,8 @@ static int pqi_send_scsi_raid_request(struct pqi_ctrl_info *ctrl_info, u8 cmd,
 	struct pqi_raid_path_request request;
 	enum dma_data_direction dir;
 
-	rc = pqi_build_raid_path_request(ctrl_info, &request,
-		cmd, scsi3addr, buffer,
-		buffer_length, vpd_page, &dir);
+	rc = pqi_build_raid_path_request(ctrl_info, &request, cmd, scsi3addr,
+		buffer, buffer_length, vpd_page, &dir);
 	if (rc)
 		return rc;
 
@@ -738,17 +733,15 @@ static int pqi_get_advanced_raid_bypass_config(struct pqi_ctrl_info *ctrl_info)
 	if (!buffer)
 		return -ENOMEM;
 
-	rc = pqi_build_raid_path_request(ctrl_info, &request,
-		BMIC_SENSE_FEATURE, RAID_CTLR_LUNID, buffer,
-		sizeof(*buffer), 0, &dir);
+	rc = pqi_build_raid_path_request(ctrl_info, &request, BMIC_SENSE_FEATURE, RAID_CTLR_LUNID,
+		buffer, sizeof(*buffer), 0, &dir);
 	if (rc)
 		goto error;
 
 	request.cdb[2] = BMIC_SENSE_FEATURE_IO_PAGE;
 	request.cdb[3] = BMIC_SENSE_FEATURE_IO_PAGE_AIO_SUBPAGE;
 
-	rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header,
-		0, NULL, NO_TIMEOUT);
+	rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header, 0, NULL, NO_TIMEOUT);
 
 	pqi_pci_unmap(ctrl_info->pci_dev, request.sg_descriptors, 1, dir);
 
@@ -1008,15 +1001,13 @@ static inline void pqi_cancel_update_time_worker(struct pqi_ctrl_info *ctrl_info
 	cancel_delayed_work_sync(&ctrl_info->update_time_work);
 }
 
-static inline int pqi_report_luns(struct pqi_ctrl_info *ctrl_info, u8 cmd,
-	void *buffer, size_t buffer_length)
+static inline int pqi_report_luns(struct pqi_ctrl_info *ctrl_info, u8 cmd, void *buffer,
+	size_t buffer_length)
 {
-	return pqi_send_ctrl_raid_request(ctrl_info, cmd, buffer,
-		buffer_length);
+	return pqi_send_ctrl_raid_request(ctrl_info, cmd, buffer, buffer_length);
 }
 
-static int pqi_report_phys_logical_luns(struct pqi_ctrl_info *ctrl_info, u8 cmd,
-	void **buffer)
+static int pqi_report_phys_logical_luns(struct pqi_ctrl_info *ctrl_info, u8 cmd, void **buffer)
 {
 	int rc;
 	size_t lun_list_length;
@@ -1031,8 +1022,7 @@ static int pqi_report_phys_logical_luns(struct pqi_ctrl_info *ctrl_info, u8 cmd,
 		goto out;
 	}
 
-	rc = pqi_report_luns(ctrl_info, cmd, report_lun_header,
-		sizeof(*report_lun_header));
+	rc = pqi_report_luns(ctrl_info, cmd, report_lun_header, sizeof(*report_lun_header));
 	if (rc)
 		goto out;
 
@@ -1056,8 +1046,8 @@ static int pqi_report_phys_logical_luns(struct pqi_ctrl_info *ctrl_info, u8 cmd,
 	if (rc)
 		goto out;
 
-	new_lun_list_length = get_unaligned_be32(
-		&((struct report_lun_header *)lun_data)->list_length);
+	new_lun_list_length =
+		get_unaligned_be32(&((struct report_lun_header *)lun_data)->list_length);
 
 	if (new_lun_list_length > lun_list_length) {
 		lun_list_length = new_lun_list_length;
@@ -1078,15 +1068,12 @@ static int pqi_report_phys_logical_luns(struct pqi_ctrl_info *ctrl_info, u8 cmd,
 	return rc;
 }
 
-static inline int pqi_report_phys_luns(struct pqi_ctrl_info *ctrl_info,
-	void **buffer)
+static inline int pqi_report_phys_luns(struct pqi_ctrl_info *ctrl_info, void **buffer)
 {
-	return pqi_report_phys_logical_luns(ctrl_info, CISS_REPORT_PHYS,
-		buffer);
+	return pqi_report_phys_logical_luns(ctrl_info, CISS_REPORT_PHYS, buffer);
 }
 
-static inline int pqi_report_logical_luns(struct pqi_ctrl_info *ctrl_info,
-	void **buffer)
+static inline int pqi_report_logical_luns(struct pqi_ctrl_info *ctrl_info, void **buffer)
 {
 	return pqi_report_phys_logical_luns(ctrl_info, CISS_REPORT_LOG, buffer);
 }
@@ -1309,7 +1296,7 @@ static int pqi_get_raid_map(struct pqi_ctrl_info *ctrl_info,
 		if (get_unaligned_le32(&raid_map->structure_size)
 			!= raid_map_size) {
 			dev_warn(&ctrl_info->pci_dev->dev,
-				"Requested %d bytes, received %d bytes",
+				"requested %u bytes, received %u bytes\n",
 				raid_map_size,
 				get_unaligned_le32(&raid_map->structure_size));
 			goto error;
@@ -1666,8 +1653,7 @@ static int pqi_add_device(struct pqi_ctrl_info *ctrl_info,
 
 #define PQI_PENDING_IO_TIMEOUT_SECS	20
 
-static inline void pqi_remove_device(struct pqi_ctrl_info *ctrl_info,
-	struct pqi_scsi_dev *device)
+static inline void pqi_remove_device(struct pqi_ctrl_info *ctrl_info, struct pqi_scsi_dev *device)
 {
 	int rc;
 
@@ -1701,8 +1687,7 @@ static struct pqi_scsi_dev *pqi_find_scsi_dev(struct pqi_ctrl_info *ctrl_info,
 	return NULL;
 }
 
-static inline bool pqi_device_equal(struct pqi_scsi_dev *dev1,
-	struct pqi_scsi_dev *dev2)
+static inline bool pqi_device_equal(struct pqi_scsi_dev *dev1, struct pqi_scsi_dev *dev2)
 {
 	if (dev1->is_physical_device != dev2->is_physical_device)
 		return false;
@@ -1710,8 +1695,7 @@ static inline bool pqi_device_equal(struct pqi_scsi_dev *dev1,
 	if (dev1->is_physical_device)
 		return dev1->wwid == dev2->wwid;
 
-	return memcmp(dev1->volume_id, dev2->volume_id,
-		sizeof(dev1->volume_id)) == 0;
+	return memcmp(dev1->volume_id, dev2->volume_id, sizeof(dev1->volume_id)) == 0;
 }
 
 enum pqi_find_result {
@@ -1850,8 +1834,7 @@ static void pqi_scsi_update_device(struct pqi_scsi_dev *existing_device,
 	existing_device->bay = new_device->bay;
 	existing_device->box_index = new_device->box_index;
 	existing_device->phys_box_on_bus = new_device->phys_box_on_bus;
-	existing_device->phy_connected_dev_type =
-		new_device->phy_connected_dev_type;
+	existing_device->phy_connected_dev_type = new_device->phy_connected_dev_type;
 	memcpy(existing_device->box, new_device->box,
 		sizeof(existing_device->box));
 	memcpy(existing_device->phys_connector, new_device->phys_connector,
@@ -2054,7 +2037,7 @@ static inline bool pqi_is_supported_device(struct pqi_scsi_dev *device)
 	 */
 	if (device->device_type == SA_DEVICE_TYPE_CONTROLLER &&
 		!pqi_is_hba_lunid(device->scsi3addr))
-		return false;
+			return false;
 
 	return true;
 }
@@ -2087,8 +2070,7 @@ static inline bool pqi_is_device_with_sas_address(struct pqi_scsi_dev *device)
 
 static inline bool pqi_expose_device(struct pqi_scsi_dev *device)
 {
-	return !device->is_physical_device ||
-		!pqi_skip_device(device->scsi3addr);
+	return !device->is_physical_device || !pqi_skip_device(device->scsi3addr);
 }
 
 static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
@@ -2152,11 +2134,8 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 			for (i = num_physicals - 1; i >= 0; i--) {
 				phys_lun_ext_entry =
 						&physdev_list->lun_entries[i];
-				if (CISS_GET_DRIVE_NUMBER(
-					phys_lun_ext_entry->lunid) ==
-						PQI_VSEP_CISS_BTL) {
-					pqi_mask_device(
-						phys_lun_ext_entry->lunid);
+				if (CISS_GET_DRIVE_NUMBER(phys_lun_ext_entry->lunid) == PQI_VSEP_CISS_BTL) {
+					pqi_mask_device(phys_lun_ext_entry->lunid);
 					break;
 				}
 			}
@@ -2246,8 +2225,7 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 			if (device->is_physical_device)
 				dev_warn(&ctrl_info->pci_dev->dev,
 					"obtaining device info failed, skipping physical device %016llx\n",
-					get_unaligned_be64(
-						&phys_lun_ext_entry->wwid));
+					get_unaligned_be64(&phys_lun_ext_entry->wwid));
 			else
 				dev_warn(&ctrl_info->pci_dev->dev,
 					"obtaining device info failed, skipping logical device %08x%08x\n",
@@ -2264,9 +2242,9 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 			if ((phys_lun_ext_entry->device_flags &
 				CISS_REPORT_PHYS_DEV_FLAG_AIO_ENABLED) &&
 				phys_lun_ext_entry->aio_handle) {
-				device->aio_enabled = true;
-				device->aio_handle =
-					phys_lun_ext_entry->aio_handle;
+					device->aio_enabled = true;
+					device->aio_handle =
+						phys_lun_ext_entry->aio_handle;
 			}
 		} else {
 			memcpy(device->volume_id, log_lun_ext_entry->volume_id,
@@ -2756,12 +2734,10 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
 
 	pqi_set_aio_cdb(&rmd);
 
-	if (get_unaligned_le16(&raid_map->flags) &
-			RAID_MAP_ENCRYPTION_ENABLED) {
+	if (get_unaligned_le16(&raid_map->flags) & RAID_MAP_ENCRYPTION_ENABLED) {
 		if (rmd.data_length > device->max_transfer_encrypted)
 			return PQI_RAID_BYPASS_INELIGIBLE;
-		pqi_set_encryption_info(&encryption_info, raid_map,
-			rmd.first_block);
+		pqi_set_encryption_info(&encryption_info, raid_map, rmd.first_block);
 		encryption_info_ptr = &encryption_info;
 	} else {
 		encryption_info_ptr = NULL;
@@ -2776,7 +2752,7 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
 		case SA_RAID_5:
 		case SA_RAID_6:
 			return pqi_aio_submit_r56_write_io(ctrl_info, scmd, queue_group,
-					encryption_info_ptr, device, &rmd);
+				encryption_info_ptr, device, &rmd);
 		}
 	}
 
@@ -3162,8 +3138,7 @@ static int pqi_process_io_intr(struct pqi_ctrl_info *ctrl_info, struct pqi_queue
 		case PQI_RESPONSE_IU_VENDOR_GENERAL:
 			io_request->status =
 				get_unaligned_le16(
-				&((struct pqi_vendor_general_response *)
-					response)->status);
+				&((struct pqi_vendor_general_response *)response)->status);
 			break;
 		case PQI_RESPONSE_IU_TASK_MANAGEMENT:
 			io_request->status =
@@ -3375,7 +3350,7 @@ static void pqi_ofa_process_event(struct pqi_ctrl_info *ctrl_info,
 		pqi_ofa_setup_host_buffer(ctrl_info,
 			le32_to_cpu(event->ofa_bytes_requested));
 		pqi_ofa_host_memory_update(ctrl_info);
-	} else if (event_id == PQI_EVENT_OFA_CANCELLED) {
+	} else if (event_id == PQI_EVENT_OFA_CANCELED) {
 		pqi_ofa_free_host_buffer(ctrl_info);
 		pqi_acknowledge_event(ctrl_info, event);
 		dev_info(&ctrl_info->pci_dev->dev,
@@ -3425,8 +3400,7 @@ static void pqi_heartbeat_timer_handler(struct timer_list *t)
 {
 	int num_interrupts;
 	u32 heartbeat_count;
-	struct pqi_ctrl_info *ctrl_info = from_timer(ctrl_info, t,
-						     heartbeat_timer);
+	struct pqi_ctrl_info *ctrl_info = from_timer(ctrl_info, t, heartbeat_timer);
 
 	pqi_check_ctrl_health(ctrl_info);
 	if (pqi_ctrl_offline(ctrl_info))
@@ -3499,7 +3473,7 @@ static void pqi_ofa_capture_event_payload(struct pqi_event *event,
 		if (event_id == PQI_EVENT_OFA_MEMORY_ALLOCATION) {
 			event->ofa_bytes_requested =
 			response->data.ofa_memory_allocation.bytes_requested;
-		} else if (event_id == PQI_EVENT_OFA_CANCELLED) {
+		} else if (event_id == PQI_EVENT_OFA_CANCELED) {
 			event->ofa_cancel_reason =
 			response->data.ofa_cancelled.reason;
 		}
@@ -3641,8 +3615,7 @@ static inline bool pqi_is_valid_irq(struct pqi_ctrl_info *ctrl_info)
 		valid_irq = true;
 		break;
 	case IRQ_MODE_INTX:
-		intx_status =
-			readl(&ctrl_info->pqi_registers->legacy_intx_status);
+		intx_status = readl(&ctrl_info->pqi_registers->legacy_intx_status);
 		if (intx_status & PQI_LEGACY_INTX_PENDING)
 			valid_irq = true;
 		else
@@ -3963,7 +3936,8 @@ static int pqi_alloc_admin_queues(struct pqi_ctrl_info *ctrl_info)
 		&admin_queues_aligned->iq_element_array;
 	admin_queues->oq_element_array =
 		&admin_queues_aligned->oq_element_array;
-	admin_queues->iq_ci = &admin_queues_aligned->iq_ci;
+	admin_queues->iq_ci =
+		(pqi_index_t __iomem *)&admin_queues_aligned->iq_ci;
 	admin_queues->oq_pi =
 		(pqi_index_t __iomem *)&admin_queues_aligned->oq_pi;
 
@@ -3977,8 +3951,8 @@ static int pqi_alloc_admin_queues(struct pqi_ctrl_info *ctrl_info)
 		ctrl_info->admin_queue_memory_base);
 	admin_queues->iq_ci_bus_addr =
 		ctrl_info->admin_queue_memory_base_dma_handle +
-		((void *)admin_queues->iq_ci -
-		ctrl_info->admin_queue_memory_base);
+		((void __iomem *)admin_queues->iq_ci -
+		(void __iomem *)ctrl_info->admin_queue_memory_base);
 	admin_queues->oq_pi_bus_addr =
 		ctrl_info->admin_queue_memory_base_dma_handle +
 		((void __iomem *)admin_queues->oq_pi -
@@ -4014,6 +3988,7 @@ static int pqi_create_admin_queues(struct pqi_ctrl_info *ctrl_info)
 		(PQI_ADMIN_OQ_NUM_ELEMENTS << 8) |
 		(admin_queues->int_msg_num << 16);
 	writel(reg, &pqi_registers->admin_iq_num_elements);
+
 	writel(PQI_CREATE_ADMIN_QUEUE_PAIR,
 		&pqi_registers->function_and_status_code);
 
@@ -4310,8 +4285,7 @@ static int pqi_submit_raid_request_synchronous(struct pqi_ctrl_info *ctrl_info,
 	io_request->io_complete_callback = pqi_raid_synchronous_complete;
 	io_request->context = &wait;
 
-	pqi_start_io(ctrl_info,
-		&ctrl_info->queue_groups[PQI_DEFAULT_QUEUE_GROUP], RAID_PATH,
+	pqi_start_io(ctrl_info, &ctrl_info->queue_groups[PQI_DEFAULT_QUEUE_GROUP], RAID_PATH,
 		io_request);
 
 	pqi_ctrl_unbusy(ctrl_info);
@@ -4329,13 +4303,11 @@ static int pqi_submit_raid_request_synchronous(struct pqi_ctrl_info *ctrl_info,
 
 	if (error_info) {
 		if (io_request->error_info)
-			memcpy(error_info, io_request->error_info,
-				sizeof(*error_info));
+			memcpy(error_info, io_request->error_info, sizeof(*error_info));
 		else
 			memset(error_info, 0, sizeof(*error_info));
 	} else if (rc == 0 && io_request->error_info) {
-		rc = pqi_process_raid_io_error_synchronous(
-			io_request->error_info);
+		rc = pqi_process_raid_io_error_synchronous(io_request->error_info);
 	}
 
 	pqi_free_io_request(io_request);
@@ -4413,8 +4385,7 @@ static int pqi_report_device_capability(struct pqi_ctrl_info *ctrl_info)
 	if (rc)
 		goto out;
 
-	rc = pqi_submit_admin_request_synchronous(ctrl_info, &request,
-		&response);
+	rc = pqi_submit_admin_request_synchronous(ctrl_info, &request, &response);
 
 	pqi_pci_unmap(ctrl_info->pci_dev,
 		&request.data.report_device_capability.sg_descriptor, 1,
@@ -4763,7 +4734,7 @@ static int pqi_configure_events(struct pqi_ctrl_info *ctrl_info,
 		event_descriptor = &event_config->descriptors[i];
 		if (enable_events &&
 			pqi_is_supported_event(event_descriptor->event_type))
-			put_unaligned_le16(ctrl_info->event_queue.oq_id,
+				put_unaligned_le16(ctrl_info->event_queue.oq_id,
 					&event_descriptor->oq_id);
 		else
 			put_unaligned_le16(0, &event_descriptor->oq_id);
@@ -4838,7 +4809,6 @@ static void pqi_free_all_io_requests(struct pqi_ctrl_info *ctrl_info)
 
 static inline int pqi_alloc_error_buffer(struct pqi_ctrl_info *ctrl_info)
 {
-
 	ctrl_info->error_buffer = dma_alloc_coherent(&ctrl_info->pci_dev->dev,
 				     ctrl_info->error_buffer_length,
 				     &ctrl_info->error_buffer_dma_handle,
@@ -4858,9 +4828,8 @@ static int pqi_alloc_io_resources(struct pqi_ctrl_info *ctrl_info)
 	struct device *dev;
 	struct pqi_io_request *io_request;
 
-	ctrl_info->io_request_pool =
-		kcalloc(ctrl_info->max_io_slots,
-			sizeof(ctrl_info->io_request_pool[0]), GFP_KERNEL);
+	ctrl_info->io_request_pool = kcalloc(ctrl_info->max_io_slots,
+		sizeof(ctrl_info->io_request_pool[0]), GFP_KERNEL);
 
 	if (!ctrl_info->io_request_pool) {
 		dev_err(&ctrl_info->pci_dev->dev,
@@ -4873,8 +4842,7 @@ static int pqi_alloc_io_resources(struct pqi_ctrl_info *ctrl_info)
 	io_request = ctrl_info->io_request_pool;
 
 	for (i = 0; i < ctrl_info->max_io_slots; i++) {
-		io_request->iu =
-			kmalloc(ctrl_info->max_inbound_iu_length, GFP_KERNEL);
+		io_request->iu = kmalloc(ctrl_info->max_inbound_iu_length, GFP_KERNEL);
 
 		if (!io_request->iu) {
 			dev_err(&ctrl_info->pci_dev->dev,
@@ -4894,8 +4862,7 @@ static int pqi_alloc_io_resources(struct pqi_ctrl_info *ctrl_info)
 
 		io_request->index = i;
 		io_request->sg_chain_buffer = sg_chain_buffer;
-		io_request->sg_chain_buffer_dma_handle =
-			sg_chain_buffer_dma_handle;
+		io_request->sg_chain_buffer_dma_handle = sg_chain_buffer_dma_handle;
 		io_request++;
 	}
 
@@ -5010,8 +4977,8 @@ static void pqi_calculate_queue_resources(struct pqi_ctrl_info *ctrl_info)
 		PQI_MAX_EMBEDDED_R56_SG_DESCRIPTORS;
 }
 
-static inline void pqi_set_sg_descriptor(
-	struct pqi_sg_descriptor *sg_descriptor, struct scatterlist *sg)
+static inline void pqi_set_sg_descriptor(struct pqi_sg_descriptor *sg_descriptor,
+	struct scatterlist *sg)
 {
 	u64 address = (u64)sg_dma_address(sg);
 	unsigned int length = sg_dma_len(sg);
@@ -5233,16 +5200,14 @@ static int pqi_raid_submit_scsi_cmd_with_io_request(
 	io_request->scmd = scmd;
 
 	request = io_request->iu;
-	memset(request, 0,
-		offsetof(struct pqi_raid_path_request, sg_descriptors));
+	memset(request, 0, offsetof(struct pqi_raid_path_request, sg_descriptors));
 
 	request->header.iu_type = PQI_REQUEST_IU_RAID_PATH_IO;
 	put_unaligned_le32(scsi_bufflen(scmd), &request->buffer_length);
 	request->task_attribute = SOP_TASK_ATTRIBUTE_SIMPLE;
 	put_unaligned_le16(io_request->index, &request->request_id);
 	request->error_index = request->request_id;
-	memcpy(request->lun_number, device->scsi3addr,
-		sizeof(request->lun_number));
+	memcpy(request->lun_number, device->scsi3addr, sizeof(request->lun_number));
 
 	cdb_length = min_t(size_t, scmd->cmd_len, sizeof(request->cdb));
 	memcpy(request->cdb, scmd->cmnd, cdb_length);
@@ -5252,30 +5217,20 @@ static int pqi_raid_submit_scsi_cmd_with_io_request(
 	case 10:
 	case 12:
 	case 16:
-		/* No bytes in the Additional CDB bytes field */
-		request->additional_cdb_bytes_usage =
-			SOP_ADDITIONAL_CDB_BYTES_0;
+		request->additional_cdb_bytes_usage = SOP_ADDITIONAL_CDB_BYTES_0;
 		break;
 	case 20:
-		/* 4 bytes in the Additional cdb field */
-		request->additional_cdb_bytes_usage =
-			SOP_ADDITIONAL_CDB_BYTES_4;
+		request->additional_cdb_bytes_usage = SOP_ADDITIONAL_CDB_BYTES_4;
 		break;
 	case 24:
-		/* 8 bytes in the Additional cdb field */
-		request->additional_cdb_bytes_usage =
-			SOP_ADDITIONAL_CDB_BYTES_8;
+		request->additional_cdb_bytes_usage = SOP_ADDITIONAL_CDB_BYTES_8;
 		break;
 	case 28:
-		/* 12 bytes in the Additional cdb field */
-		request->additional_cdb_bytes_usage =
-			SOP_ADDITIONAL_CDB_BYTES_12;
+		request->additional_cdb_bytes_usage = SOP_ADDITIONAL_CDB_BYTES_12;
 		break;
 	case 32:
 	default:
-		/* 16 bytes in the Additional cdb field */
-		request->additional_cdb_bytes_usage =
-			SOP_ADDITIONAL_CDB_BYTES_16;
+		request->additional_cdb_bytes_usage = SOP_ADDITIONAL_CDB_BYTES_16;
 		break;
 	}
 
@@ -5520,8 +5475,7 @@ static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,
 	io_request->raid_bypass = raid_bypass;
 
 	request = io_request->iu;
-	memset(request, 0,
-		offsetof(struct pqi_raid_path_request, sg_descriptors));
+	memset(request, 0, offsetof(struct pqi_raid_path_request, sg_descriptors));
 
 	request->header.iu_type = PQI_REQUEST_IU_AIO_PATH_IO;
 	put_unaligned_le32(aio_handle, &request->nexus_id);
@@ -5579,7 +5533,6 @@ static  int pqi_aio_submit_r1_write_io(struct pqi_ctrl_info *ctrl_info,
 	struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
 	struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
 	struct pqi_scsi_dev_raid_map_data *rmd)
-
 {
 	int rc;
 	struct pqi_io_request *io_request;
@@ -5594,7 +5547,6 @@ static  int pqi_aio_submit_r1_write_io(struct pqi_ctrl_info *ctrl_info,
 	memset(r1_request, 0, offsetof(struct pqi_aio_r1_path_request, sg_descriptors));
 
 	r1_request->header.iu_type = PQI_REQUEST_IU_AIO_PATH_RAID1_IO;
-
 	put_unaligned_le16(*(u16 *)device->scsi3addr & 0x3fff, &r1_request->volume_id);
 	r1_request->num_drives = rmd->num_it_nexus_entries;
 	put_unaligned_le32(rmd->it_nexus[0], &r1_request->it_nexus_1);
@@ -5923,6 +5875,7 @@ static void pqi_fail_io_queued_for_device(struct pqi_ctrl_info *ctrl_info,
 			list_for_each_entry_safe(io_request, next,
 				&queue_group->request_list[path],
 				request_list_entry) {
+
 				scmd = io_request->scmd;
 				if (!scmd)
 					continue;
@@ -6116,8 +6069,7 @@ static int pqi_lun_reset(struct pqi_ctrl_info *ctrl_info,
 		put_unaligned_le16(PQI_LUN_RESET_TIMEOUT_SECS,
 					&request->timeout);
 
-	pqi_start_io(ctrl_info,
-		&ctrl_info->queue_groups[PQI_DEFAULT_QUEUE_GROUP], RAID_PATH,
+	pqi_start_io(ctrl_info, &ctrl_info->queue_groups[PQI_DEFAULT_QUEUE_GROUP], RAID_PATH,
 		io_request);
 
 	rc = pqi_wait_for_lun_reset_completion(ctrl_info, device, &wait);
@@ -6807,7 +6759,8 @@ static ssize_t pqi_unique_id_show(struct device *dev,
 	spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock, flags);
 
 	return snprintf(buffer, PAGE_SIZE,
-		"%02X%02X%02X%02X%02X%02X%02X%02X%02X%02X%02X%02X%02X%02X%02X%02X\n",
+		"%02X%02X%02X%02X%02X%02X%02X%02X"
+		"%02X%02X%02X%02X%02X%02X%02X%02X\n",
 		unique_id[0], unique_id[1], unique_id[2], unique_id[3],
 		unique_id[4], unique_id[5], unique_id[6], unique_id[7],
 		unique_id[8], unique_id[9], unique_id[10], unique_id[11],
@@ -7107,17 +7060,13 @@ static int pqi_register_scsi(struct pqi_ctrl_info *ctrl_info)
 
 	rc = scsi_add_host(shost, &ctrl_info->pci_dev->dev);
 	if (rc) {
-		dev_err(&ctrl_info->pci_dev->dev,
-			"scsi_add_host failed for controller %u\n",
-			ctrl_info->ctrl_id);
+		dev_err(&ctrl_info->pci_dev->dev, "scsi_add_host failed\n");
 		goto free_host;
 	}
 
 	rc = pqi_add_sas_host(shost, ctrl_info);
 	if (rc) {
-		dev_err(&ctrl_info->pci_dev->dev,
-			"add SAS host failed for controller %u\n",
-			ctrl_info->ctrl_id);
+		dev_err(&ctrl_info->pci_dev->dev, "add SAS host failed\n");
 		goto remove_host;
 	}
 
@@ -7187,8 +7136,7 @@ static int pqi_reset(struct pqi_ctrl_info *ctrl_info)
 		rc = sis_pqi_reset_quiesce(ctrl_info);
 		if (rc) {
 			dev_err(&ctrl_info->pci_dev->dev,
-				"PQI reset failed during quiesce with error %d\n",
-				rc);
+				"PQI reset failed during quiesce with error %d\n", rc);
 			return rc;
 		}
 	}
@@ -7428,12 +7376,10 @@ static void pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
 			firmware_feature->enabled;
 		break;
 	case PQI_FIRMWARE_FEATURE_RAID_IU_TIMEOUT:
-		ctrl_info->raid_iu_timeout_supported =
-			firmware_feature->enabled;
+		ctrl_info->raid_iu_timeout_supported = firmware_feature->enabled;
 		break;
 	case PQI_FIRMWARE_FEATURE_TMF_IU_TIMEOUT:
-		ctrl_info->tmf_iu_timeout_supported =
-			firmware_feature->enabled;
+		ctrl_info->tmf_iu_timeout_supported = firmware_feature->enabled;
 		break;
 	}
 
@@ -7578,7 +7524,7 @@ static void pqi_process_firmware_features(
 		if (pqi_is_firmware_feature_enabled(firmware_features,
 			firmware_features_iomem_addr,
 			pqi_firmware_features[i].feature_bit)) {
-			pqi_firmware_features[i].enabled = true;
+				pqi_firmware_features[i].enabled = true;
 		}
 		pqi_firmware_feature_update(ctrl_info,
 			&pqi_firmware_features[i]);
@@ -7628,21 +7574,18 @@ static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
 	 * Copy the config table contents from I/O memory space into the
 	 * temporary buffer.
 	 */
-	table_iomem_addr = ctrl_info->iomem_base +
-		ctrl_info->config_table_offset;
+	table_iomem_addr = ctrl_info->iomem_base + ctrl_info->config_table_offset;
 	memcpy_fromio(config_table, table_iomem_addr, table_length);
 
 	section_info.ctrl_info = ctrl_info;
-	section_offset =
-		get_unaligned_le32(&config_table->first_section_offset);
+	section_offset = get_unaligned_le32(&config_table->first_section_offset);
 
 	while (section_offset) {
 		section = (void *)config_table + section_offset;
 
 		section_info.section = section;
 		section_info.section_offset = section_offset;
-		section_info.section_iomem_addr =
-			table_iomem_addr + section_offset;
+		section_info.section_iomem_addr = table_iomem_addr + section_offset;
 
 		switch (get_unaligned_le16(&section->section_id)) {
 		case PQI_CONFIG_TABLE_SECTION_FIRMWARE_FEATURES:
@@ -7656,8 +7599,7 @@ static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
 				ctrl_info->heartbeat_counter =
 					table_iomem_addr +
 					section_offset +
-					offsetof(
-					struct pqi_config_table_heartbeat,
+					offsetof(struct pqi_config_table_heartbeat,
 						heartbeat_counter);
 			break;
 		case PQI_CONFIG_TABLE_SECTION_SOFT_RESET:
@@ -7669,8 +7611,7 @@ static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
 			break;
 		}
 
-		section_offset =
-			get_unaligned_le16(&section->next_section_offset);
+		section_offset = get_unaligned_le16(&section->next_section_offset);
 	}
 
 	kfree(config_table);
@@ -7769,12 +7710,12 @@ static int pqi_ctrl_init(struct pqi_ctrl_info *ctrl_info)
 	if (reset_devices) {
 		if (ctrl_info->max_outstanding_requests >
 			PQI_MAX_OUTSTANDING_REQUESTS_KDUMP)
-			ctrl_info->max_outstanding_requests =
+				ctrl_info->max_outstanding_requests =
 					PQI_MAX_OUTSTANDING_REQUESTS_KDUMP;
 	} else {
 		if (ctrl_info->max_outstanding_requests >
 			PQI_MAX_OUTSTANDING_REQUESTS)
-			ctrl_info->max_outstanding_requests =
+				ctrl_info->max_outstanding_requests =
 					PQI_MAX_OUTSTANDING_REQUESTS;
 	}
 
@@ -8091,8 +8032,7 @@ static int pqi_ctrl_init_resume(struct pqi_ctrl_info *ctrl_info)
 	return 0;
 }
 
-static inline int pqi_set_pcie_completion_timeout(struct pci_dev *pci_dev,
-	u16 timeout)
+static inline int pqi_set_pcie_completion_timeout(struct pci_dev *pci_dev, u16 timeout)
 {
 	int rc;
 
@@ -8344,8 +8284,8 @@ static int pqi_ofa_alloc_mem(struct pqi_ctrl_info *ctrl_info,
 			break;
 
 		mem_descriptor = &ofap->sg_descriptor[i];
-		put_unaligned_le64 ((u64) dma_handle, &mem_descriptor->address);
-		put_unaligned_le32 (chunk_size, &mem_descriptor->length);
+		put_unaligned_le64((u64)dma_handle, &mem_descriptor->address);
+		put_unaligned_le32(chunk_size, &mem_descriptor->length);
 	}
 
 	if (!size || size < total_size)
diff --git a/drivers/scsi/smartpqi/smartpqi_sas_transport.c b/drivers/scsi/smartpqi/smartpqi_sas_transport.c
index c9b00b3368d7..77923c6ec2c6 100644
--- a/drivers/scsi/smartpqi/smartpqi_sas_transport.c
+++ b/drivers/scsi/smartpqi/smartpqi_sas_transport.c
@@ -107,8 +107,7 @@ static int pqi_sas_port_add_rphy(struct pqi_sas_port *pqi_sas_port,
 
 static struct sas_rphy *pqi_sas_rphy_alloc(struct pqi_sas_port *pqi_sas_port)
 {
-	if (pqi_sas_port->device &&
-		pqi_sas_port->device->is_expander_smp_device)
+	if (pqi_sas_port->device && pqi_sas_port->device->is_expander_smp_device)
 		return sas_expander_alloc(pqi_sas_port->port,
 				SAS_FANOUT_EXPANDER_DEVICE);
 
@@ -161,7 +160,7 @@ static void pqi_free_sas_port(struct pqi_sas_port *pqi_sas_port)
 
 	list_for_each_entry_safe(pqi_sas_phy, next,
 		&pqi_sas_port->phy_list_head, phy_list_entry)
-		pqi_free_sas_phy(pqi_sas_phy);
+			pqi_free_sas_phy(pqi_sas_phy);
 
 	sas_port_delete(pqi_sas_port->port);
 	list_del(&pqi_sas_port->port_list_entry);
@@ -191,7 +190,7 @@ static void pqi_free_sas_node(struct pqi_sas_node *pqi_sas_node)
 
 	list_for_each_entry_safe(pqi_sas_port, next,
 		&pqi_sas_node->port_list_head, port_list_entry)
-		pqi_free_sas_port(pqi_sas_port);
+			pqi_free_sas_port(pqi_sas_port);
 
 	kfree(pqi_sas_node);
 }
@@ -498,7 +497,7 @@ static unsigned int pqi_build_sas_smp_handler_reply(
 
 	job->reply_len = le16_to_cpu(error_info->sense_data_length);
 	memcpy(job->reply, error_info->data,
-			le16_to_cpu(error_info->sense_data_length));
+		le16_to_cpu(error_info->sense_data_length));
 
 	return job->reply_payload.payload_len -
 		get_unaligned_le32(&error_info->data_in_transferred);
@@ -547,6 +546,7 @@ void pqi_sas_smp_handler(struct bsg_job *job, struct Scsi_Host *shost,
 		goto out;
 
 	reslen = pqi_build_sas_smp_handler_reply(smp_buf, job, &error_info);
+
 out:
 	bsg_job_done(job, rc, reslen);
 }
diff --git a/drivers/scsi/smartpqi/smartpqi_sis.c b/drivers/scsi/smartpqi/smartpqi_sis.c
index f0199bd87dd1..c954620628e0 100644
--- a/drivers/scsi/smartpqi/smartpqi_sis.c
+++ b/drivers/scsi/smartpqi/smartpqi_sis.c
@@ -71,7 +71,7 @@ struct sis_base_struct {
 						/* error response data */
 	__le32	error_buffer_element_length;	/* length of each PQI error */
 						/* response buffer element */
-						/*   in bytes */
+						/* in bytes */
 	__le32	error_buffer_num_elements;	/* total number of PQI error */
 						/* response buffers available */
 };
@@ -146,7 +146,7 @@ bool sis_is_firmware_running(struct pqi_ctrl_info *ctrl_info)
 bool sis_is_kernel_up(struct pqi_ctrl_info *ctrl_info)
 {
 	return readl(&ctrl_info->registers->sis_firmware_status) &
-				SIS_CTRL_KERNEL_UP;
+		SIS_CTRL_KERNEL_UP;
 }
 
 u32 sis_get_product_id(struct pqi_ctrl_info *ctrl_info)


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 10/25] smartpqi: add stream detection
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (8 preceding siblings ...)
  2020-12-10 20:35 ` [PATCH V3 09/25] smartpqi: align code with oob driver Don Brace
@ 2020-12-10 20:35 ` Don Brace
  2021-01-08  0:14   ` Martin Wilck
  2020-12-10 20:35 ` [PATCH V3 11/25] smartpqi: add host level stream detection enable Don Brace
                   ` (15 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:35 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

* Enhance performance by adding sequential stream detection
  for R5/R6 write requests (a distilled sketch of the
  heuristic follows below).
  * Reduce stripe lock contention with full-stripe write
    operations.
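
A distilled sketch of the heuristic (illustration only -- struct stream
and continues_stream() are hypothetical names; the fields mirror
struct pqi_stream_data in the diff below): each LUN keeps a small LRU
table of streams, and a write that begins at, or within one request
length of, a stream's next expected LBA is treated as sequential.

/* illustration only; the real code is pqi_is_parity_write_stream() */
struct stream {
	u64 next_lba;		/* first LBA expected after the last write */
	u32 last_accessed;	/* jiffies at last hit; 0 = unused slot */
};

/*
 * A write [first_block, first_block + block_cnt) continues this stream
 * if it starts no earlier than next_lba and no later than
 * next_lba + block_cnt.
 */
static bool continues_stream(struct stream *s, u64 first_block, u32 block_cnt)
{
	return s->next_lba != 0 &&
		first_block >= s->next_lba &&
		first_block <= s->next_lba + block_cnt;
}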

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h      |    8 +++
 drivers/scsi/smartpqi/smartpqi_init.c |   87 +++++++++++++++++++++++++++++++--
 2 files changed, 89 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index a5e271dd2742..343f06e44220 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -1042,6 +1042,12 @@ struct pqi_scsi_dev_raid_map_data {
 
 #define RAID_CTLR_LUNID		"\0\0\0\0\0\0\0\0"
 
+#define NUM_STREAMS_PER_LUN	8
+
+struct pqi_stream_data {
+	u64	next_lba;
+	u32	last_accessed;
+};
 
 struct pqi_scsi_dev {
 	int	devtype;		/* as reported by INQUIRY commmand */
@@ -1097,6 +1103,7 @@ struct pqi_scsi_dev {
 	struct list_head add_list_entry;
 	struct list_head delete_list_entry;
 
+	struct pqi_stream_data stream_data[NUM_STREAMS_PER_LUN];
 	atomic_t scsi_cmds_outstanding;
 	atomic_t raid_bypass_cnt;
 };
@@ -1296,6 +1303,7 @@ struct pqi_ctrl_info {
 	u8		enable_r5_writes : 1;
 	u8		enable_r6_writes : 1;
 	u8		lv_drive_type_mix_valid : 1;
+	u8		enable_stream_detection : 1;
 
 	u8		ciss_report_log_flags;
 	u32		max_transfer_encrypted_sas_sata;
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index fc8fafab480d..96383d047a88 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -5721,8 +5721,82 @@ void pqi_prep_for_scsi_done(struct scsi_cmnd *scmd)
 	atomic_dec(&device->scsi_cmds_outstanding);
 }
 
-static int pqi_scsi_queue_command(struct Scsi_Host *shost,
+static bool pqi_is_parity_write_stream(struct pqi_ctrl_info *ctrl_info,
 	struct scsi_cmnd *scmd)
+{
+	u32 oldest_jiffies;
+	u8 lru_index;
+	int i;
+	int rc;
+	struct pqi_scsi_dev *device;
+	struct pqi_stream_data *pqi_stream_data;
+	struct pqi_scsi_dev_raid_map_data rmd;
+
+	if (!ctrl_info->enable_stream_detection)
+		return false;
+
+	rc = pqi_get_aio_lba_and_block_count(scmd, &rmd);
+	if (rc)
+		return false;
+
+	/* Check writes only. */
+	if (!rmd.is_write)
+		return false;
+
+	device = scmd->device->hostdata;
+
+	/* Check for RAID 5/6 streams. */
+	if (device->raid_level != SA_RAID_5 && device->raid_level != SA_RAID_6)
+		return false;
+
+	/*
+	 * If controller does not support AIO RAID{5,6} writes, need to send
+	 * requests down non-AIO path.
+	 */
+	if ((device->raid_level == SA_RAID_5 && !ctrl_info->enable_r5_writes) ||
+		(device->raid_level == SA_RAID_6 && !ctrl_info->enable_r6_writes))
+		return true;
+
+	lru_index = 0;
+	oldest_jiffies = INT_MAX;
+	for (i = 0; i < NUM_STREAMS_PER_LUN; i++) {
+		pqi_stream_data = &device->stream_data[i];
+		/*
+		 * Check for adjacent request or request is within
+		 * the previous request.
+		 */
+		if ((pqi_stream_data->next_lba &&
+			rmd.first_block >= pqi_stream_data->next_lba) &&
+			rmd.first_block <= pqi_stream_data->next_lba +
+				rmd.block_cnt) {
+			pqi_stream_data->next_lba = rmd.first_block +
+				rmd.block_cnt;
+			pqi_stream_data->last_accessed = jiffies;
+			return true;
+		}
+
+		/* unused entry */
+		if (pqi_stream_data->last_accessed == 0) {
+			lru_index = i;
+			break;
+		}
+
+		/* Find entry with oldest last accessed time. */
+		if (pqi_stream_data->last_accessed <= oldest_jiffies) {
+			oldest_jiffies = pqi_stream_data->last_accessed;
+			lru_index = i;
+		}
+	}
+
+	/* Set LRU entry. */
+	pqi_stream_data = &device->stream_data[lru_index];
+	pqi_stream_data->last_accessed = jiffies;
+	pqi_stream_data->next_lba = rmd.first_block + rmd.block_cnt;
+
+	return false;
+}
+
+static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
 {
 	int rc;
 	struct pqi_ctrl_info *ctrl_info;
@@ -5768,11 +5842,12 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost,
 		raid_bypassed = false;
 		if (device->raid_bypass_enabled &&
 			!blk_rq_is_passthrough(scmd->request)) {
-			rc = pqi_raid_bypass_submit_scsi_cmd(ctrl_info, device,
-				scmd, queue_group);
-			if (rc == 0 || rc == SCSI_MLQUEUE_HOST_BUSY) {
-				raid_bypassed = true;
-				atomic_inc(&device->raid_bypass_cnt);
+			if (!pqi_is_parity_write_stream(ctrl_info, scmd)) {
+				rc = pqi_raid_bypass_submit_scsi_cmd(ctrl_info, device, scmd, queue_group);
+				if (rc == 0 || rc == SCSI_MLQUEUE_HOST_BUSY) {
+					raid_bypassed = true;
+					atomic_inc(&device->raid_bypass_cnt);
+				}
 			}
 		}
 		if (!raid_bypassed)


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 11/25] smartpqi: add host level stream detection enable
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (9 preceding siblings ...)
  2020-12-10 20:35 ` [PATCH V3 10/25] smartpqi: add stream detection Don Brace
@ 2020-12-10 20:35 ` Don Brace
  2021-01-08  0:13   ` Martin Wilck
  2020-12-10 20:35 ` [PATCH V3 12/25] smartpqi: enable support for NVMe encryption Don Brace
                   ` (14 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:35 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

* Allow R5/R6 stream detection to be enabled or disabled
  using the sysfs entry enable_stream_detection.

Example usage:

lsscsi
[2:2:0:0]    storage Adaptec  3258P-32i /e     0010
 ^
 |
 +---- NOTE: here host is host2

find /sys -name \*enable_stream\*
/sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection
/sys/devices/pci0000:5b/0000:5b:00.0/0000:5c:00.0/host3/scsi_host/host3/enable_stream_detection

Current stream detection:
cat /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection
1

Turn off stream detection:
echo 0 > /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection

Turn on stream detection:
echo 1 > /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection
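
Note: the store handler clamps any value greater than 0 to 1 (see the
hunk below), so for example:

echo 5 > /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection
cat /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection
1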

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c |   32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 96383d047a88..9a449bbc1898 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -6724,6 +6724,34 @@ static ssize_t pqi_lockup_action_store(struct device *dev,
 	return -EINVAL;
 }
 
+static ssize_t pqi_host_enable_stream_detection_show(struct device *dev,
+	struct device_attribute *attr, char *buffer)
+{
+	struct Scsi_Host *shost = class_to_shost(dev);
+	struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
+
+	return scnprintf(buffer, 10, "%hhx\n",
+			ctrl_info->enable_stream_detection);
+}
+
+static ssize_t pqi_host_enable_stream_detection_store(struct device *dev,
+	struct device_attribute *attr, const char *buffer, size_t count)
+{
+	struct Scsi_Host *shost = class_to_shost(dev);
+	struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
+	u8 set_stream_detection = 0;
+
+	if (kstrtou8(buffer, 0, &set_stream_detection))
+		return -EINVAL;
+
+	if (set_stream_detection > 0)
+		set_stream_detection = 1;
+
+	ctrl_info->enable_stream_detection = set_stream_detection;
+
+	return count;
+}
+
 static ssize_t pqi_host_enable_r5_writes_show(struct device *dev,
 	struct device_attribute *attr, char *buffer)
 {
@@ -6786,6 +6814,9 @@ static DEVICE_ATTR(vendor, 0444, pqi_vendor_show, NULL);
 static DEVICE_ATTR(rescan, 0200, NULL, pqi_host_rescan_store);
 static DEVICE_ATTR(lockup_action, 0644, pqi_lockup_action_show,
 	pqi_lockup_action_store);
+static DEVICE_ATTR(enable_stream_detection, 0644,
+	pqi_host_enable_stream_detection_show,
+	pqi_host_enable_stream_detection_store);
 static DEVICE_ATTR(enable_r5_writes, 0644,
 	pqi_host_enable_r5_writes_show, pqi_host_enable_r5_writes_store);
 static DEVICE_ATTR(enable_r6_writes, 0644,
@@ -6799,6 +6830,7 @@ static struct device_attribute *pqi_shost_attrs[] = {
 	&dev_attr_vendor,
 	&dev_attr_rescan,
 	&dev_attr_lockup_action,
+	&dev_attr_enable_stream_detection,
 	&dev_attr_enable_r5_writes,
 	&dev_attr_enable_r6_writes,
 	NULL


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 12/25] smartpqi: enable support for NVMe encryption
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (10 preceding siblings ...)
  2020-12-10 20:35 ` [PATCH V3 11/25] smartpqi: add host level stream detection enable Don Brace
@ 2020-12-10 20:35 ` Don Brace
  2021-01-08  0:14   ` Martin Wilck
  2020-12-10 20:35 ` [PATCH V3 13/25] smartpqi: disable write_same for nvme hba disks Don Brace
                   ` (13 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:35 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Kevin Barnett <kevin.barnett@microchip.com>

* Support a new FW feature bit that enables RAID bypass on
  encrypted logical volumes on NVMe (context snippet below).
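
For context, this snippet from the bypass path (patch 09 in this
series) shows the gating that encrypted bypass I/O already goes
through; with this feature bit negotiated, encrypted logical volumes
on NVMe become eligible for the same path:

/* from pqi_raid_bypass_submit_scsi_cmd() */
if (get_unaligned_le16(&raid_map->flags) & RAID_MAP_ENCRYPTION_ENABLED) {
	if (rmd.data_length > device->max_transfer_encrypted)
		return PQI_RAID_BYPASS_INELIGIBLE;
	pqi_set_encryption_info(&encryption_info, raid_map, rmd.first_block);
	encryption_info_ptr = &encryption_info;
}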

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h      |    3 ++-
 drivers/scsi/smartpqi/smartpqi_init.c |    5 +++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index 343f06e44220..976bfd8c5192 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -849,7 +849,8 @@ struct pqi_config_table_firmware_features {
 #define PQI_FIRMWARE_FEATURE_UNIQUE_SATA_WWN			12
 #define PQI_FIRMWARE_FEATURE_RAID_IU_TIMEOUT			13
 #define PQI_FIRMWARE_FEATURE_TMF_IU_TIMEOUT			14
-#define PQI_FIRMWARE_FEATURE_MAXIMUM				14
+#define PQI_FIRMWARE_FEATURE_RAID_BYPASS_ON_ENCRYPTED_NVME	15
+#define PQI_FIRMWARE_FEATURE_MAXIMUM				15
 
 struct pqi_config_table_debug {
 	struct pqi_config_table_section_header header;
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 9a449bbc1898..19b8dc9ea6ad 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -7573,6 +7573,11 @@ static struct pqi_firmware_feature pqi_firmware_features[] = {
 		.feature_bit = PQI_FIRMWARE_FEATURE_TMF_IU_TIMEOUT,
 		.feature_status = pqi_ctrl_update_feature_flags,
 	},
+	{
+		.feature_name = "RAID Bypass on encrypted logical volumes on NVMe",
+		.feature_bit = PQI_FIRMWARE_FEATURE_RAID_BYPASS_ON_ENCRYPTED_NVME,
+		.feature_status = pqi_firmware_feature_status,
+	},
 };
 
 static void pqi_process_firmware_features(


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 13/25] smartpqi: disable write_same for nvme hba disks
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (11 preceding siblings ...)
  2020-12-10 20:35 ` [PATCH V3 12/25] smartpqi: enable support for NVMe encryption Don Brace
@ 2020-12-10 20:35 ` Don Brace
  2021-01-08  0:13   ` Martin Wilck
  2020-12-10 20:35 ` [PATCH V3 14/25] smartpqi: fix driver synchronization issues Don Brace
                   ` (12 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:35 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Kevin Barnett <kevin.barnett@microchip.com>

* The controller does not support SCSI WRITE SAME
  for NVMe drives in HBA mode, so disable it for those
  devices (helper shown below for reference).
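
For reference, pqi_disable_write_same() is a one-line helper elsewhere
in smartpqi_init.c (not part of this diff) that sets the SCSI midlayer
flag suppressing WRITE SAME:

static inline void pqi_disable_write_same(struct scsi_device *sdev)
{
	sdev->no_write_same = 1;
}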

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 19b8dc9ea6ad..1eb677bc6c69 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -6280,10 +6280,13 @@ static int pqi_slave_alloc(struct scsi_device *sdev)
 			scsi_change_queue_depth(sdev,
 				device->advertised_queue_depth);
 		}
-		if (pqi_is_logical_device(device))
+		if (pqi_is_logical_device(device)) {
 			pqi_disable_write_same(sdev);
-		else
+		} else {
 			sdev->allow_restart = 1;
+			if (device->device_type == SA_DEVICE_TYPE_NVME)
+				pqi_disable_write_same(sdev);
+		}
 	}
 
 	spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock, flags);


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 14/25] smartpqi: fix driver synchronization issues
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (12 preceding siblings ...)
  2020-12-10 20:35 ` [PATCH V3 13/25] smartpqi: disable write_same for nvme hba disks Don Brace
@ 2020-12-10 20:35 ` Don Brace
  2021-01-07 23:32   ` Martin Wilck
  2020-12-10 20:35 ` [PATCH V3 15/25] smartpqi: fix host qdepth limit Don Brace
                   ` (11 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:35 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Kevin Barnett <kevin.barnett@microchip.com>

* synchronize: LUN resets, shutdowns, suspend, hibernate,
  OFA, and controller offline events.
* prevent I/O during the above conditions (the mutex-based
  serialization idiom is sketched below).
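
The serialization uses a mutex as a barrier: the OFA path holds
ofa_mutex for the duration of the event, and any code that must wait
for OFA to finish simply takes and releases the same mutex. Distilled
from the helpers added in the hunk below:

/* hold ofa_mutex across the entire OFA event */
static inline void pqi_ctrl_ofa_start(struct pqi_ctrl_info *ctrl_info)
{
	mutex_lock(&ctrl_info->ofa_mutex);
}

static inline void pqi_ctrl_ofa_done(struct pqi_ctrl_info *ctrl_info)
{
	mutex_unlock(&ctrl_info->ofa_mutex);
}

/* lock-then-unlock: blocks until any in-progress OFA completes */
static inline void pqi_wait_until_ofa_finished(struct pqi_ctrl_info *ctrl_info)
{
	mutex_lock(&ctrl_info->ofa_mutex);
	mutex_unlock(&ctrl_info->ofa_mutex);
}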

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h      |   48 -
 drivers/scsi/smartpqi/smartpqi_init.c | 1157 +++++++++++++--------------------
 2 files changed, 474 insertions(+), 731 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index 976bfd8c5192..0b94c755a74c 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -130,9 +130,12 @@ struct pqi_iu_header {
 				/* of this header */
 	__le16	response_queue_id;	/* specifies the OQ where the */
 					/* response IU is to be delivered */
-	u8	work_area[2];	/* reserved for driver use */
+	u16	driver_flags;	/* reserved for driver use */
 };
 
+/* manifest constants for pqi_iu_header.driver_flags */
+#define PQI_DRIVER_NONBLOCKABLE_REQUEST		0x1
+
 /*
  * According to the PQI spec, the IU header is only the first 4 bytes of our
  * pqi_iu_header structure.
@@ -508,10 +511,6 @@ struct pqi_vendor_general_response {
 #define PQI_OFA_SIGNATURE		"OFA_QRM"
 #define PQI_OFA_MAX_SG_DESCRIPTORS	64
 
-#define PQI_OFA_MEMORY_DESCRIPTOR_LENGTH \
-	(offsetof(struct pqi_ofa_memory, sg_descriptor) + \
-	(PQI_OFA_MAX_SG_DESCRIPTORS * sizeof(struct pqi_sg_descriptor)))
-
 struct pqi_ofa_memory {
 	__le64	signature;	/* "OFA_QRM" */
 	__le16	version;	/* version of this struct (1 = 1st version) */
@@ -519,7 +518,7 @@ struct pqi_ofa_memory {
 	__le32	bytes_allocated;	/* total allocated memory in bytes */
 	__le16	num_memory_descriptors;
 	u8	reserved1[2];
-	struct pqi_sg_descriptor sg_descriptor[1];
+	struct pqi_sg_descriptor sg_descriptor[PQI_OFA_MAX_SG_DESCRIPTORS];
 };
 
 struct pqi_aio_error_info {
@@ -850,7 +849,8 @@ struct pqi_config_table_firmware_features {
 #define PQI_FIRMWARE_FEATURE_RAID_IU_TIMEOUT			13
 #define PQI_FIRMWARE_FEATURE_TMF_IU_TIMEOUT			14
 #define PQI_FIRMWARE_FEATURE_RAID_BYPASS_ON_ENCRYPTED_NVME	15
-#define PQI_FIRMWARE_FEATURE_MAXIMUM				15
+#define PQI_FIRMWARE_FEATURE_UNIQUE_WWID_IN_REPORT_PHYS_LUN	16
+#define PQI_FIRMWARE_FEATURE_MAXIMUM				16
 
 struct pqi_config_table_debug {
 	struct pqi_config_table_section_header header;
@@ -1071,7 +1071,6 @@ struct pqi_scsi_dev {
 	u8	volume_offline : 1;
 	u8	rescan : 1;
 	bool	aio_enabled;		/* only valid for physical disks */
-	bool	in_reset;
 	bool	in_remove;
 	bool	device_offline;
 	u8	vendor[8];		/* bytes 8-15 of inquiry data */
@@ -1107,6 +1106,7 @@ struct pqi_scsi_dev {
 	struct pqi_stream_data stream_data[NUM_STREAMS_PER_LUN];
 	atomic_t scsi_cmds_outstanding;
 	atomic_t raid_bypass_cnt;
+	u8	page_83_identifier[16];
 };
 
 /* VPD inquiry pages */
@@ -1212,10 +1212,8 @@ struct pqi_io_request {
 struct pqi_event {
 	bool	pending;
 	u8	event_type;
-	__le16	event_id;
-	__le32	additional_event_id;
-	__le32	ofa_bytes_requested;
-	__le16	ofa_cancel_reason;
+	u16	event_id;
+	u32	additional_event_id;
 };
 
 #define PQI_RESERVED_IO_SLOTS_LUN_RESET			1
@@ -1287,12 +1285,9 @@ struct pqi_ctrl_info {
 
 	struct mutex	scan_mutex;
 	struct mutex	lun_reset_mutex;
-	struct mutex	ofa_mutex; /* serialize ofa */
 	bool		controller_online;
 	bool		block_requests;
-	bool		block_device_reset;
-	bool		in_ofa;
-	bool		in_shutdown;
+	bool		scan_blocked;
 	u8		inbound_spanning_supported : 1;
 	u8		outbound_spanning_supported : 1;
 	u8		pqi_mode_enabled : 1;
@@ -1300,6 +1295,7 @@ struct pqi_ctrl_info {
 	u8		soft_reset_handshake_supported : 1;
 	u8		raid_iu_timeout_supported : 1;
 	u8		tmf_iu_timeout_supported : 1;
+	u8		unique_wwid_in_report_phys_lun_supported : 1;
 	u8		enable_r1_writes : 1;
 	u8		enable_r5_writes : 1;
 	u8		enable_r6_writes : 1;
@@ -1341,14 +1337,14 @@ struct pqi_ctrl_info {
 	atomic_t	num_blocked_threads;
 	wait_queue_head_t block_requests_wait;
 
-	struct list_head raid_bypass_retry_list;
-	spinlock_t	raid_bypass_retry_list_lock;
-	struct work_struct raid_bypass_retry_work;
-
+	struct mutex	ofa_mutex;
 	struct pqi_ofa_memory *pqi_ofa_mem_virt_addr;
 	dma_addr_t	pqi_ofa_mem_dma_handle;
 	void		**pqi_ofa_chunk_virt_addr;
-	atomic_t	sync_cmds_outstanding;
+	struct work_struct ofa_memory_alloc_work;
+	struct work_struct ofa_quiesce_work;
+	u32		ofa_bytes_requested;
+	u16		ofa_cancel_reason;
 };
 
 enum pqi_ctrl_mode {
@@ -1619,16 +1615,6 @@ struct bmic_diag_options {
 
 #pragma pack()
 
-static inline void pqi_ctrl_busy(struct pqi_ctrl_info *ctrl_info)
-{
-	atomic_inc(&ctrl_info->num_busy_threads);
-}
-
-static inline void pqi_ctrl_unbusy(struct pqi_ctrl_info *ctrl_info)
-{
-	atomic_dec(&ctrl_info->num_busy_threads);
-}
-
 static inline struct pqi_ctrl_info *shost_to_hba(struct Scsi_Host *shost)
 {
 	void *hostdata = shost_priv(shost);
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 1eb677bc6c69..082b17e9bd80 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -45,6 +45,9 @@
 
 #define PQI_EXTRA_SGL_MEMORY	(12 * sizeof(struct pqi_sg_descriptor))
 
+#define PQI_POST_RESET_DELAY_SECS			5
+#define PQI_POST_OFA_RESET_DELAY_UPON_TIMEOUT_SECS	10
+
 MODULE_AUTHOR("Microsemi");
 MODULE_DESCRIPTION("Driver for Microsemi Smart Family Controller version "
 	DRIVER_VERSION);
@@ -54,7 +57,6 @@ MODULE_LICENSE("GPL");
 
 static void pqi_take_ctrl_offline(struct pqi_ctrl_info *ctrl_info);
 static void pqi_ctrl_offline_worker(struct work_struct *work);
-static void pqi_retry_raid_bypass_requests(struct pqi_ctrl_info *ctrl_info);
 static int pqi_scan_scsi_devices(struct pqi_ctrl_info *ctrl_info);
 static void pqi_scan_start(struct Scsi_Host *shost);
 static void pqi_start_io(struct pqi_ctrl_info *ctrl_info,
@@ -62,7 +64,7 @@ static void pqi_start_io(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_io_request *io_request);
 static int pqi_submit_raid_request_synchronous(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_iu_header *request, unsigned int flags,
-	struct pqi_raid_error_info *error_info, unsigned long timeout_msecs);
+	struct pqi_raid_error_info *error_info);
 static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,
 	struct scsi_cmnd *scmd, u32 aio_handle, u8 *cdb,
 	unsigned int cdb_length, struct pqi_queue_group *queue_group,
@@ -77,9 +79,8 @@ static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_scsi_dev_raid_map_data *rmd);
 static void pqi_ofa_ctrl_quiesce(struct pqi_ctrl_info *ctrl_info);
 static void pqi_ofa_ctrl_unquiesce(struct pqi_ctrl_info *ctrl_info);
-static int pqi_ofa_ctrl_restart(struct pqi_ctrl_info *ctrl_info);
-static void pqi_ofa_setup_host_buffer(struct pqi_ctrl_info *ctrl_info,
-	u32 bytes_requested);
+static int pqi_ofa_ctrl_restart(struct pqi_ctrl_info *ctrl_info, unsigned int delay_secs);
+static void pqi_ofa_setup_host_buffer(struct pqi_ctrl_info *ctrl_info);
 static void pqi_ofa_free_host_buffer(struct pqi_ctrl_info *ctrl_info);
 static int pqi_ofa_host_memory_update(struct pqi_ctrl_info *ctrl_info);
 static int pqi_device_wait_for_pending_io(struct pqi_ctrl_info *ctrl_info,
@@ -245,14 +246,66 @@ static inline void pqi_save_ctrl_mode(struct pqi_ctrl_info *ctrl_info,
 	sis_write_driver_scratch(ctrl_info, mode);
 }
 
+static inline void pqi_ctrl_block_scan(struct pqi_ctrl_info *ctrl_info)
+{
+	ctrl_info->scan_blocked = true;
+	mutex_lock(&ctrl_info->scan_mutex);
+}
+
+static inline void pqi_ctrl_unblock_scan(struct pqi_ctrl_info *ctrl_info)
+{
+	ctrl_info->scan_blocked = false;
+	mutex_unlock(&ctrl_info->scan_mutex);
+}
+
+static inline bool pqi_ctrl_scan_blocked(struct pqi_ctrl_info *ctrl_info)
+{
+	return ctrl_info->scan_blocked;
+}
+
 static inline void pqi_ctrl_block_device_reset(struct pqi_ctrl_info *ctrl_info)
 {
-	ctrl_info->block_device_reset = true;
+	mutex_lock(&ctrl_info->lun_reset_mutex);
+}
+
+static inline void pqi_ctrl_unblock_device_reset(struct pqi_ctrl_info *ctrl_info)
+{
+	mutex_unlock(&ctrl_info->lun_reset_mutex);
+}
+
+static inline void pqi_scsi_block_requests(struct pqi_ctrl_info *ctrl_info)
+{
+	struct Scsi_Host *shost;
+	unsigned int num_loops;
+	int msecs_sleep;
+
+	shost = ctrl_info->scsi_host;
+
+	scsi_block_requests(shost);
+
+	num_loops = 0;
+	msecs_sleep = 20;
+	while (scsi_host_busy(shost)) {
+		num_loops++;
+		if (num_loops == 10)
+			msecs_sleep = 500;
+		msleep(msecs_sleep);
+	}
+}
+
+static inline void pqi_scsi_unblock_requests(struct pqi_ctrl_info *ctrl_info)
+{
+	scsi_unblock_requests(ctrl_info->scsi_host);
+}
+
+static inline void pqi_ctrl_busy(struct pqi_ctrl_info *ctrl_info)
+{
+	atomic_inc(&ctrl_info->num_busy_threads);
 }
 
-static inline bool pqi_device_reset_blocked(struct pqi_ctrl_info *ctrl_info)
+static inline void pqi_ctrl_unbusy(struct pqi_ctrl_info *ctrl_info)
 {
-	return ctrl_info->block_device_reset;
+	atomic_dec(&ctrl_info->num_busy_threads);
 }
 
 static inline bool pqi_ctrl_blocked(struct pqi_ctrl_info *ctrl_info)
@@ -263,44 +316,23 @@ static inline bool pqi_ctrl_blocked(struct pqi_ctrl_info *ctrl_info)
 static inline void pqi_ctrl_block_requests(struct pqi_ctrl_info *ctrl_info)
 {
 	ctrl_info->block_requests = true;
-	scsi_block_requests(ctrl_info->scsi_host);
 }
 
 static inline void pqi_ctrl_unblock_requests(struct pqi_ctrl_info *ctrl_info)
 {
 	ctrl_info->block_requests = false;
 	wake_up_all(&ctrl_info->block_requests_wait);
-	pqi_retry_raid_bypass_requests(ctrl_info);
-	scsi_unblock_requests(ctrl_info->scsi_host);
 }
 
-static unsigned long pqi_wait_if_ctrl_blocked(struct pqi_ctrl_info *ctrl_info,
-	unsigned long timeout_msecs)
+static void pqi_wait_if_ctrl_blocked(struct pqi_ctrl_info *ctrl_info)
 {
-	unsigned long remaining_msecs;
-
 	if (!pqi_ctrl_blocked(ctrl_info))
-		return timeout_msecs;
+		return;
 
 	atomic_inc(&ctrl_info->num_blocked_threads);
-
-	if (timeout_msecs == NO_TIMEOUT) {
-		wait_event(ctrl_info->block_requests_wait,
-			!pqi_ctrl_blocked(ctrl_info));
-		remaining_msecs = timeout_msecs;
-	} else {
-		unsigned long remaining_jiffies;
-
-		remaining_jiffies =
-			wait_event_timeout(ctrl_info->block_requests_wait,
-				!pqi_ctrl_blocked(ctrl_info),
-				msecs_to_jiffies(timeout_msecs));
-		remaining_msecs = jiffies_to_msecs(remaining_jiffies);
-	}
-
+	wait_event(ctrl_info->block_requests_wait,
+		!pqi_ctrl_blocked(ctrl_info));
 	atomic_dec(&ctrl_info->num_blocked_threads);
-
-	return remaining_msecs;
 }
 
 static inline void pqi_ctrl_wait_until_quiesced(struct pqi_ctrl_info *ctrl_info)
@@ -315,34 +347,25 @@ static inline bool pqi_device_offline(struct pqi_scsi_dev *device)
 	return device->device_offline;
 }
 
-static inline void pqi_device_reset_start(struct pqi_scsi_dev *device)
-{
-	device->in_reset = true;
-}
-
-static inline void pqi_device_reset_done(struct pqi_scsi_dev *device)
-{
-	device->in_reset = false;
-}
-
-static inline bool pqi_device_in_reset(struct pqi_scsi_dev *device)
+static inline void pqi_ctrl_ofa_start(struct pqi_ctrl_info *ctrl_info)
 {
-	return device->in_reset;
+	mutex_lock(&ctrl_info->ofa_mutex);
 }
 
-static inline void pqi_ctrl_ofa_start(struct pqi_ctrl_info *ctrl_info)
+static inline void pqi_ctrl_ofa_done(struct pqi_ctrl_info *ctrl_info)
 {
-	ctrl_info->in_ofa = true;
+	mutex_unlock(&ctrl_info->ofa_mutex);
 }
 
-static inline void pqi_ctrl_ofa_done(struct pqi_ctrl_info *ctrl_info)
+static inline void pqi_wait_until_ofa_finished(struct pqi_ctrl_info *ctrl_info)
 {
-	ctrl_info->in_ofa = false;
+	mutex_lock(&ctrl_info->ofa_mutex);
+	mutex_unlock(&ctrl_info->ofa_mutex);
 }
 
-static inline bool pqi_ctrl_in_ofa(struct pqi_ctrl_info *ctrl_info)
+static inline bool pqi_ofa_in_progress(struct pqi_ctrl_info *ctrl_info)
 {
-	return ctrl_info->in_ofa;
+	return mutex_is_locked(&ctrl_info->ofa_mutex);
 }
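
The ofa_mutex now does triple duty: pqi_ctrl_ofa_start() holds it for the whole activation, pqi_wait_until_ofa_finished() waits by taking and immediately dropping it, and pqi_ofa_in_progress() peeks at it with mutex_is_locked(). A hedged user-space sketch of the lock/unlock barrier idiom, using pthreads in place of the kernel mutex API (pthreads has no direct mutex_is_locked() equivalent, so that part is omitted):

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t ofa_mutex = PTHREAD_MUTEX_INITIALIZER;

static void ofa_start(void)
{
	pthread_mutex_lock(&ofa_mutex);		/* held for the whole activation */
}

static void ofa_done(void)
{
	pthread_mutex_unlock(&ofa_mutex);
}

/* Barrier: returns only once the owner has dropped the mutex. */
static void wait_until_ofa_finished(void)
{
	pthread_mutex_lock(&ofa_mutex);		/* blocks while OFA is running */
	pthread_mutex_unlock(&ofa_mutex);	/* nothing to protect; we only waited */
}

int main(void)
{
	ofa_start();
	ofa_done();
	wait_until_ofa_finished();	/* returns at once: OFA already over */
	puts("OFA finished");
	return 0;
}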
 
 static inline void pqi_device_remove_start(struct pqi_scsi_dev *device)
@@ -355,14 +378,20 @@ static inline bool pqi_device_in_remove(struct pqi_scsi_dev *device)
 	return device->in_remove;
 }
 
-static inline void pqi_ctrl_shutdown_start(struct pqi_ctrl_info *ctrl_info)
+static inline int pqi_event_type_to_event_index(unsigned int event_type)
 {
-	ctrl_info->in_shutdown = true;
+	int index;
+
+	for (index = 0; index < ARRAY_SIZE(pqi_supported_event_types); index++)
+		if (event_type == pqi_supported_event_types[index])
+			return index;
+
+	return -1;
 }
 
-static inline bool pqi_ctrl_in_shutdown(struct pqi_ctrl_info *ctrl_info)
+static inline bool pqi_is_supported_event(unsigned int event_type)
 {
-	return ctrl_info->in_shutdown;
+	return pqi_event_type_to_event_index(event_type) != -1;
 }
 
 static inline void pqi_schedule_rescan_worker_with_delay(struct pqi_ctrl_info *ctrl_info,
@@ -370,8 +399,6 @@ static inline void pqi_schedule_rescan_worker_with_delay(struct pqi_ctrl_info *c
 {
 	if (pqi_ctrl_offline(ctrl_info))
 		return;
-	if (pqi_ctrl_in_ofa(ctrl_info))
-		return;
 
 	schedule_delayed_work(&ctrl_info->rescan_work, delay);
 }
@@ -408,22 +435,15 @@ static inline u32 pqi_read_heartbeat_counter(struct pqi_ctrl_info *ctrl_info)
 
 static inline u8 pqi_read_soft_reset_status(struct pqi_ctrl_info *ctrl_info)
 {
-	if (!ctrl_info->soft_reset_status)
-		return 0;
-
 	return readb(ctrl_info->soft_reset_status);
 }
 
-static inline void pqi_clear_soft_reset_status(struct pqi_ctrl_info *ctrl_info,
-	u8 clear)
+static inline void pqi_clear_soft_reset_status(struct pqi_ctrl_info *ctrl_info)
 {
 	u8 status;
 
-	if (!ctrl_info->soft_reset_status)
-		return;
-
 	status = pqi_read_soft_reset_status(ctrl_info);
-	status &= ~clear;
+	status &= ~PQI_SOFT_RESET_ABORT;
 	writeb(status, ctrl_info->soft_reset_status);
 }
 
@@ -512,6 +532,7 @@ static int pqi_build_raid_path_request(struct pqi_ctrl_info *ctrl_info,
 		put_unaligned_be32(cdb_length, &cdb[6]);
 		break;
 	case SA_FLUSH_CACHE:
+		request->header.driver_flags = PQI_DRIVER_NONBLOCKABLE_REQUEST;
 		request->data_direction = SOP_WRITE_FLAG;
 		cdb[0] = BMIC_WRITE;
 		cdb[6] = BMIC_FLUSH_CACHE;
@@ -606,7 +627,7 @@ static void pqi_free_io_request(struct pqi_io_request *io_request)
 
 static int pqi_send_scsi_raid_request(struct pqi_ctrl_info *ctrl_info, u8 cmd,
 	u8 *scsi3addr, void *buffer, size_t buffer_length, u16 vpd_page,
-	struct pqi_raid_error_info *error_info,	unsigned long timeout_msecs)
+	struct pqi_raid_error_info *error_info)
 {
 	int rc;
 	struct pqi_raid_path_request request;
@@ -618,7 +639,7 @@ static int pqi_send_scsi_raid_request(struct pqi_ctrl_info *ctrl_info, u8 cmd,
 		return rc;
 
 	rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header, 0,
-		error_info, timeout_msecs);
+		error_info);
 
 	pqi_pci_unmap(ctrl_info->pci_dev, request.sg_descriptors, 1, dir);
 
@@ -631,7 +652,7 @@ static inline int pqi_send_ctrl_raid_request(struct pqi_ctrl_info *ctrl_info,
 	u8 cmd, void *buffer, size_t buffer_length)
 {
 	return pqi_send_scsi_raid_request(ctrl_info, cmd, RAID_CTLR_LUNID,
-		buffer, buffer_length, 0, NULL, NO_TIMEOUT);
+		buffer, buffer_length, 0, NULL);
 }
 
 static inline int pqi_send_ctrl_raid_with_error(struct pqi_ctrl_info *ctrl_info,
@@ -639,7 +660,7 @@ static inline int pqi_send_ctrl_raid_with_error(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_raid_error_info *error_info)
 {
 	return pqi_send_scsi_raid_request(ctrl_info, cmd, RAID_CTLR_LUNID,
-		buffer, buffer_length, 0, error_info, NO_TIMEOUT);
+		buffer, buffer_length, 0, error_info);
 }
 
 static inline int pqi_identify_controller(struct pqi_ctrl_info *ctrl_info,
@@ -661,7 +682,7 @@ static inline int pqi_scsi_inquiry(struct pqi_ctrl_info *ctrl_info,
 	u8 *scsi3addr, u16 vpd_page, void *buffer, size_t buffer_length)
 {
 	return pqi_send_scsi_raid_request(ctrl_info, INQUIRY, scsi3addr,
-		buffer, buffer_length, vpd_page, NULL, NO_TIMEOUT);
+		buffer, buffer_length, vpd_page, NULL);
 }
 
 static int pqi_identify_physical_device(struct pqi_ctrl_info *ctrl_info,
@@ -683,8 +704,7 @@ static int pqi_identify_physical_device(struct pqi_ctrl_info *ctrl_info,
 	request.cdb[2] = (u8)bmic_device_index;
 	request.cdb[9] = (u8)(bmic_device_index >> 8);
 
-	rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header,
-		0, NULL, NO_TIMEOUT);
+	rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header, 0, NULL);
 
 	pqi_pci_unmap(ctrl_info->pci_dev, request.sg_descriptors, 1, dir);
 
@@ -741,7 +761,7 @@ static int pqi_get_advanced_raid_bypass_config(struct pqi_ctrl_info *ctrl_info)
 	request.cdb[2] = BMIC_SENSE_FEATURE_IO_PAGE;
 	request.cdb[3] = BMIC_SENSE_FEATURE_IO_PAGE_AIO_SUBPAGE;
 
-	rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header, 0, NULL, NO_TIMEOUT);
+	rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header, 0, NULL);
 
 	pqi_pci_unmap(ctrl_info->pci_dev, request.sg_descriptors, 1, dir);
 
@@ -794,13 +814,6 @@ static int pqi_flush_cache(struct pqi_ctrl_info *ctrl_info,
 	int rc;
 	struct bmic_flush_cache *flush_cache;
 
-	/*
-	 * Don't bother trying to flush the cache if the controller is
-	 * locked up.
-	 */
-	if (pqi_ctrl_offline(ctrl_info))
-		return -ENXIO;
-
 	flush_cache = kzalloc(sizeof(*flush_cache), GFP_KERNEL);
 	if (!flush_cache)
 		return -ENOMEM;
@@ -979,9 +992,6 @@ static void pqi_update_time_worker(struct work_struct *work)
 	ctrl_info = container_of(to_delayed_work(work), struct pqi_ctrl_info,
 		update_time_work);
 
-	if (pqi_ctrl_offline(ctrl_info))
-		return;
-
 	rc = pqi_write_current_time_to_host_wellness(ctrl_info);
 	if (rc)
 		dev_warn(&ctrl_info->pci_dev->dev,
@@ -1271,9 +1281,7 @@ static int pqi_get_raid_map(struct pqi_ctrl_info *ctrl_info,
 		return -ENOMEM;
 
 	rc = pqi_send_scsi_raid_request(ctrl_info, CISS_GET_RAID_MAP,
-		device->scsi3addr, raid_map, sizeof(*raid_map),
-		0, NULL, NO_TIMEOUT);
-
+		device->scsi3addr, raid_map, sizeof(*raid_map), 0, NULL);
 	if (rc)
 		goto error;
 
@@ -1288,8 +1296,7 @@ static int pqi_get_raid_map(struct pqi_ctrl_info *ctrl_info,
 			return -ENOMEM;
 
 		rc = pqi_send_scsi_raid_request(ctrl_info, CISS_GET_RAID_MAP,
-			device->scsi3addr, raid_map, raid_map_size,
-			0, NULL, NO_TIMEOUT);
+			device->scsi3addr, raid_map, raid_map_size, 0, NULL);
 		if (rc)
 			goto error;
 
@@ -1464,6 +1471,9 @@ static int pqi_get_physical_device_info(struct pqi_ctrl_info *ctrl_info,
 		sizeof(device->phys_connector));
 	device->bay = id_phys->phys_bay_in_box;
 
+	memcpy(&device->page_83_identifier, &id_phys->page_83_identifier,
+		sizeof(device->page_83_identifier));
+
 	return 0;
 }
 
@@ -1970,8 +1980,13 @@ static void pqi_update_device_list(struct pqi_ctrl_info *ctrl_info,
 
 	spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock, flags);
 
-	if (pqi_ctrl_in_ofa(ctrl_info))
-		pqi_ctrl_ofa_done(ctrl_info);
+	if (pqi_ofa_in_progress(ctrl_info)) {
+		list_for_each_entry_safe(device, next, &delete_list, delete_list_entry)
+			if (pqi_is_device_added(device))
+				pqi_device_remove_start(device);
+		pqi_ctrl_unblock_device_reset(ctrl_info);
+		pqi_scsi_unblock_requests(ctrl_info);
+	}
 
 	/* Remove all devices that have gone away. */
 	list_for_each_entry_safe(device, next, &delete_list, delete_list_entry) {
@@ -1993,19 +2008,14 @@ static void pqi_update_device_list(struct pqi_ctrl_info *ctrl_info,
 	 * Notify the SCSI ML if the queue depth of any existing device has
 	 * changed.
 	 */
-	list_for_each_entry(device, &ctrl_info->scsi_device_list,
-		scsi_device_list_entry) {
-		if (device->sdev) {
-			if (device->queue_depth !=
-				device->advertised_queue_depth) {
-				device->advertised_queue_depth = device->queue_depth;
-				scsi_change_queue_depth(device->sdev,
-					device->advertised_queue_depth);
-			}
-			if (device->rescan) {
-				scsi_rescan_device(&device->sdev->sdev_gendev);
-				device->rescan = false;
-			}
+	list_for_each_entry(device, &ctrl_info->scsi_device_list, scsi_device_list_entry) {
+		if (device->sdev && device->queue_depth != device->advertised_queue_depth) {
+			device->advertised_queue_depth = device->queue_depth;
+			scsi_change_queue_depth(device->sdev, device->advertised_queue_depth);
+		}
+		if (device->sdev && device->rescan) {
+			scsi_rescan_device(&device->sdev->sdev_gendev);
+			device->rescan = false;
 		}
 	}
 
@@ -2073,6 +2083,16 @@ static inline bool pqi_expose_device(struct pqi_scsi_dev *device)
 	return !device->is_physical_device || !pqi_skip_device(device->scsi3addr);
 }
 
+static inline void pqi_set_physical_device_wwid(struct pqi_ctrl_info *ctrl_info,
+	struct pqi_scsi_dev *device, struct report_phys_lun_extended_entry *phys_lun_ext_entry)
+{
+	if (ctrl_info->unique_wwid_in_report_phys_lun_supported ||
+		pqi_is_device_with_sas_address(device))
+		device->wwid = phys_lun_ext_entry->wwid;
+	else
+		device->wwid = cpu_to_be64(get_unaligned_be64(&device->page_83_identifier));
+}
+
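pqi_set_physical_device_wwid() prefers the WWID reported in the extended REPORT PHYSICAL LUNS entry when the firmware guarantees it is unique (or the device has a SAS address), and otherwise falls back to the device's VPD page 0x83 identifier, read big-endian and stored back as __be64 so both paths yield the same byte order. A small stand-alone illustration of that fallback, with the unaligned big-endian load open-coded instead of using get_unaligned_be64(); the identifier bytes are invented:

#include <stdint.h>
#include <stdio.h>

/* Open-coded big-endian load; mirrors what get_unaligned_be64() returns.
 * The driver then stores the value back as __be64, preserving the on-wire
 * byte order of the page 0x83 identifier. */
static uint64_t wwid_from_page83(const uint8_t id[8])
{
	uint64_t wwid = 0;
	int i;

	for (i = 0; i < 8; i++)
		wwid = (wwid << 8) | id[i];
	return wwid;
}

int main(void)
{
	/* illustrative 8-byte NAA identifier from VPD page 0x83 */
	const uint8_t page83[8] = { 0x50, 0x00, 0x39, 0x47, 0xde, 0xad, 0xbe, 0xef };

	printf("wwid = 0x%016llx\n", (unsigned long long)wwid_from_page83(page83));
	return 0;
}
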
 static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 {
 	int i;
@@ -2238,7 +2258,7 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 		pqi_assign_bus_target_lun(device);
 
 		if (device->is_physical_device) {
-			device->wwid = phys_lun_ext_entry->wwid;
+			pqi_set_physical_device_wwid(ctrl_info, device, phys_lun_ext_entry);
 			if ((phys_lun_ext_entry->device_flags &
 				CISS_REPORT_PHYS_DEV_FLAG_AIO_ENABLED) &&
 				phys_lun_ext_entry->aio_handle) {
@@ -2278,21 +2298,27 @@ static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 
 static int pqi_scan_scsi_devices(struct pqi_ctrl_info *ctrl_info)
 {
-	int rc = 0;
+	int rc;
+	int mutex_acquired;
 
 	if (pqi_ctrl_offline(ctrl_info))
 		return -ENXIO;
 
-	if (!mutex_trylock(&ctrl_info->scan_mutex)) {
+	mutex_acquired = mutex_trylock(&ctrl_info->scan_mutex);
+
+	if (!mutex_acquired) {
+		if (pqi_ctrl_scan_blocked(ctrl_info))
+			return -EBUSY;
 		pqi_schedule_rescan_worker_delayed(ctrl_info);
-		rc = -EINPROGRESS;
-	} else {
-		rc = pqi_update_scsi_devices(ctrl_info);
-		if (rc)
-			pqi_schedule_rescan_worker_delayed(ctrl_info);
-		mutex_unlock(&ctrl_info->scan_mutex);
+		return -EINPROGRESS;
 	}
 
+	rc = pqi_update_scsi_devices(ctrl_info);
+	if (rc && !pqi_ctrl_scan_blocked(ctrl_info))
+		pqi_schedule_rescan_worker_delayed(ctrl_info);
+
+	mutex_unlock(&ctrl_info->scan_mutex);
+
 	return rc;
 }
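
The rework makes the trylock policy of pqi_scan_scsi_devices() explicit: while pqi_ctrl_block_scan() holds scan_mutex with scan_blocked set, callers fail fast with -EBUSY; if the mutex is merely held by a concurrent scan, the scan is deferred instead. A hedged pthread sketch of the trylock-or-defer pattern (helper names are illustrative):

#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t scan_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool scan_blocked;	/* set by the blocker while it holds scan_mutex */

static void schedule_rescan_later(void)
{
	puts("rescan deferred");
}

static int update_devices(void)
{
	puts("scanning");
	return 0;
}

static int scan_devices(void)
{
	int rc;

	if (pthread_mutex_trylock(&scan_mutex) != 0) {
		if (scan_blocked)
			return -EBUSY;		/* administratively blocked */
		schedule_rescan_later();	/* just busy: retry later */
		return -EINPROGRESS;
	}

	rc = update_devices();
	if (rc && !scan_blocked)
		schedule_rescan_later();

	pthread_mutex_unlock(&scan_mutex);
	return rc;
}

int main(void)
{
	return scan_devices();
}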
 
@@ -2301,8 +2327,6 @@ static void pqi_scan_start(struct Scsi_Host *shost)
 	struct pqi_ctrl_info *ctrl_info;
 
 	ctrl_info = shost_to_hba(shost);
-	if (pqi_ctrl_in_ofa(ctrl_info))
-		return;
 
 	pqi_scan_scsi_devices(ctrl_info);
 }
@@ -2319,27 +2343,8 @@ static int pqi_scan_finished(struct Scsi_Host *shost,
 	return !mutex_is_locked(&ctrl_info->scan_mutex);
 }
 
-static void pqi_wait_until_scan_finished(struct pqi_ctrl_info *ctrl_info)
-{
-	mutex_lock(&ctrl_info->scan_mutex);
-	mutex_unlock(&ctrl_info->scan_mutex);
-}
-
-static void pqi_wait_until_lun_reset_finished(struct pqi_ctrl_info *ctrl_info)
-{
-	mutex_lock(&ctrl_info->lun_reset_mutex);
-	mutex_unlock(&ctrl_info->lun_reset_mutex);
-}
-
-static void pqi_wait_until_ofa_finished(struct pqi_ctrl_info *ctrl_info)
-{
-	mutex_lock(&ctrl_info->ofa_mutex);
-	mutex_unlock(&ctrl_info->ofa_mutex);
-}
-
-static inline void pqi_set_encryption_info(
-	struct pqi_encryption_info *encryption_info, struct raid_map *raid_map,
-	u64 first_block)
+static inline void pqi_set_encryption_info(struct pqi_encryption_info *encryption_info,
+	struct raid_map *raid_map, u64 first_block)
 {
 	u32 volume_blk_size;
 
@@ -3251,8 +3256,8 @@ static void pqi_acknowledge_event(struct pqi_ctrl_info *ctrl_info,
 	put_unaligned_le16(sizeof(request) - PQI_REQUEST_HEADER_LENGTH,
 		&request.header.iu_length);
 	request.event_type = event->event_type;
-	request.event_id = event->event_id;
-	request.additional_event_id = event->additional_event_id;
+	put_unaligned_le16(event->event_id, &request.event_id);
+	put_unaligned_le32(event->additional_event_id, &request.additional_event_id);
 
 	pqi_send_event_ack(ctrl_info, &request, sizeof(request));
 }
@@ -3263,8 +3268,8 @@ static void pqi_acknowledge_event(struct pqi_ctrl_info *ctrl_info,
 static enum pqi_soft_reset_status pqi_poll_for_soft_reset_status(
 	struct pqi_ctrl_info *ctrl_info)
 {
-	unsigned long timeout;
 	u8 status;
+	unsigned long timeout;
 
 	timeout = (PQI_SOFT_RESET_STATUS_TIMEOUT_SECS * PQI_HZ) + jiffies;
 
@@ -3276,120 +3281,169 @@ static enum pqi_soft_reset_status pqi_poll_for_soft_reset_status(
 		if (status & PQI_SOFT_RESET_ABORT)
 			return RESET_ABORT;
 
+		if (!sis_is_firmware_running(ctrl_info))
+			return RESET_NORESPONSE;
+
 		if (time_after(jiffies, timeout)) {
-			dev_err(&ctrl_info->pci_dev->dev,
+			dev_warn(&ctrl_info->pci_dev->dev,
 				"timed out waiting for soft reset status\n");
 			return RESET_TIMEDOUT;
 		}
 
-		if (!sis_is_firmware_running(ctrl_info))
-			return RESET_NORESPONSE;
-
 		ssleep(PQI_SOFT_RESET_STATUS_POLL_INTERVAL_SECS);
 	}
 }
 
-static void pqi_process_soft_reset(struct pqi_ctrl_info *ctrl_info,
-	enum pqi_soft_reset_status reset_status)
+static void pqi_process_soft_reset(struct pqi_ctrl_info *ctrl_info)
 {
 	int rc;
+	unsigned int delay_secs;
+	enum pqi_soft_reset_status reset_status;
+
+	if (ctrl_info->soft_reset_handshake_supported)
+		reset_status = pqi_poll_for_soft_reset_status(ctrl_info);
+	else
+		reset_status = RESET_INITIATE_FIRMWARE;
+
+	pqi_ofa_free_host_buffer(ctrl_info);
+
+	delay_secs = PQI_POST_RESET_DELAY_SECS;
 
 	switch (reset_status) {
-	case RESET_INITIATE_DRIVER:
 	case RESET_TIMEDOUT:
+		delay_secs = PQI_POST_OFA_RESET_DELAY_UPON_TIMEOUT_SECS;
+		fallthrough;
+	case RESET_INITIATE_DRIVER:
 		dev_info(&ctrl_info->pci_dev->dev,
-			"resetting controller %u\n", ctrl_info->ctrl_id);
+				"Online Firmware Activation: resetting controller\n");
 		sis_soft_reset(ctrl_info);
 		fallthrough;
 	case RESET_INITIATE_FIRMWARE:
-		rc = pqi_ofa_ctrl_restart(ctrl_info);
-		pqi_ofa_free_host_buffer(ctrl_info);
+		ctrl_info->pqi_mode_enabled = false;
+		pqi_save_ctrl_mode(ctrl_info, SIS_MODE);
+		rc = pqi_ofa_ctrl_restart(ctrl_info, delay_secs);
+		pqi_ctrl_ofa_done(ctrl_info);
 		dev_info(&ctrl_info->pci_dev->dev,
-			"Online Firmware Activation for controller %u: %s\n",
-			ctrl_info->ctrl_id, rc == 0 ? "SUCCESS" : "FAILED");
+				"Online Firmware Activation: %s\n",
+				rc == 0 ? "SUCCESS" : "FAILED");
 		break;
 	case RESET_ABORT:
-		pqi_ofa_ctrl_unquiesce(ctrl_info);
 		dev_info(&ctrl_info->pci_dev->dev,
-			"Online Firmware Activation for controller %u: %s\n",
-			ctrl_info->ctrl_id, "ABORTED");
+				"Online Firmware Activation ABORTED\n");
+		if (ctrl_info->soft_reset_handshake_supported)
+			pqi_clear_soft_reset_status(ctrl_info);
+		pqi_ctrl_ofa_done(ctrl_info);
+		pqi_ofa_ctrl_unquiesce(ctrl_info);
 		break;
 	case RESET_NORESPONSE:
-		pqi_ofa_free_host_buffer(ctrl_info);
+		fallthrough;
+	default:
+		dev_err(&ctrl_info->pci_dev->dev,
+			"unexpected Online Firmware Activation reset status: 0x%x\n",
+			reset_status);
+		pqi_ctrl_ofa_done(ctrl_info);
+		pqi_ofa_ctrl_unquiesce(ctrl_info);
 		pqi_take_ctrl_offline(ctrl_info);
 		break;
 	}
 }
 
-static void pqi_ofa_process_event(struct pqi_ctrl_info *ctrl_info,
-	struct pqi_event *event)
+static void pqi_ofa_memory_alloc_worker(struct work_struct *work)
 {
-	u16 event_id;
-	enum pqi_soft_reset_status status;
+	struct pqi_ctrl_info *ctrl_info;
 
-	event_id = get_unaligned_le16(&event->event_id);
+	ctrl_info = container_of(work, struct pqi_ctrl_info, ofa_memory_alloc_work);
 
-	mutex_lock(&ctrl_info->ofa_mutex);
+	pqi_ctrl_ofa_start(ctrl_info);
+	pqi_ofa_setup_host_buffer(ctrl_info);
+	pqi_ofa_host_memory_update(ctrl_info);
+}
 
-	if (event_id == PQI_EVENT_OFA_QUIESCE) {
-		dev_info(&ctrl_info->pci_dev->dev,
-			"Received Online Firmware Activation quiesce event for controller %u\n",
-			ctrl_info->ctrl_id);
-		pqi_ofa_ctrl_quiesce(ctrl_info);
-		pqi_acknowledge_event(ctrl_info, event);
-		if (ctrl_info->soft_reset_handshake_supported) {
-			status = pqi_poll_for_soft_reset_status(ctrl_info);
-			pqi_process_soft_reset(ctrl_info, status);
-		} else {
-			pqi_process_soft_reset(ctrl_info,
-					RESET_INITIATE_FIRMWARE);
-		}
+static void pqi_ofa_quiesce_worker(struct work_struct *work)
+{
+	struct pqi_ctrl_info *ctrl_info;
+	struct pqi_event *event;
 
-	} else if (event_id == PQI_EVENT_OFA_MEMORY_ALLOCATION) {
-		pqi_acknowledge_event(ctrl_info, event);
-		pqi_ofa_setup_host_buffer(ctrl_info,
-			le32_to_cpu(event->ofa_bytes_requested));
-		pqi_ofa_host_memory_update(ctrl_info);
-	} else if (event_id == PQI_EVENT_OFA_CANCELED) {
-		pqi_ofa_free_host_buffer(ctrl_info);
-		pqi_acknowledge_event(ctrl_info, event);
+	ctrl_info = container_of(work, struct pqi_ctrl_info, ofa_quiesce_work);
+
+	event = &ctrl_info->events[pqi_event_type_to_event_index(PQI_EVENT_TYPE_OFA)];
+
+	pqi_ofa_ctrl_quiesce(ctrl_info);
+	pqi_acknowledge_event(ctrl_info, event);
+	pqi_process_soft_reset(ctrl_info);
+}
+
+static bool pqi_ofa_process_event(struct pqi_ctrl_info *ctrl_info,
+	struct pqi_event *event)
+{
+	bool ack_event;
+
+	ack_event = true;
+
+	switch (event->event_id) {
+	case PQI_EVENT_OFA_MEMORY_ALLOCATION:
+		dev_info(&ctrl_info->pci_dev->dev,
+			"received Online Firmware Activation memory allocation request\n");
+		schedule_work(&ctrl_info->ofa_memory_alloc_work);
+		break;
+	case PQI_EVENT_OFA_QUIESCE:
 		dev_info(&ctrl_info->pci_dev->dev,
-			"Online Firmware Activation(%u) cancel reason : %u\n",
-			ctrl_info->ctrl_id, event->ofa_cancel_reason);
+			"received Online Firmware Activation quiesce request\n");
+		schedule_work(&ctrl_info->ofa_quiesce_work);
+		ack_event = false;
+		break;
+	case PQI_EVENT_OFA_CANCELED:
+		dev_info(&ctrl_info->pci_dev->dev,
+			"received Online Firmware Activation cancel request: reason: %u\n",
+			ctrl_info->ofa_cancel_reason);
+		pqi_ofa_free_host_buffer(ctrl_info);
+		pqi_ctrl_ofa_done(ctrl_info);
+		break;
+	default:
+		dev_err(&ctrl_info->pci_dev->dev,
+			"received unknown Online Firmware Activation request: event ID: %u\n",
+			event->event_id);
+		break;
 	}
 
-	mutex_unlock(&ctrl_info->ofa_mutex);
+	return ack_event;
 }
 
 static void pqi_event_worker(struct work_struct *work)
 {
 	unsigned int i;
+	bool rescan_needed;
 	struct pqi_ctrl_info *ctrl_info;
 	struct pqi_event *event;
+	bool ack_event;
 
 	ctrl_info = container_of(work, struct pqi_ctrl_info, event_work);
 
 	pqi_ctrl_busy(ctrl_info);
-	pqi_wait_if_ctrl_blocked(ctrl_info, NO_TIMEOUT);
+	pqi_wait_if_ctrl_blocked(ctrl_info);
 	if (pqi_ctrl_offline(ctrl_info))
 		goto out;
 
-	pqi_schedule_rescan_worker_delayed(ctrl_info);
-
+	rescan_needed = false;
 	event = ctrl_info->events;
 	for (i = 0; i < PQI_NUM_SUPPORTED_EVENTS; i++) {
 		if (event->pending) {
 			event->pending = false;
 			if (event->event_type == PQI_EVENT_TYPE_OFA) {
-				pqi_ctrl_unbusy(ctrl_info);
-				pqi_ofa_process_event(ctrl_info, event);
-				return;
+				ack_event = pqi_ofa_process_event(ctrl_info, event);
+			} else {
+				ack_event = true;
+				rescan_needed = true;
 			}
-			pqi_acknowledge_event(ctrl_info, event);
+			if (ack_event)
+				pqi_acknowledge_event(ctrl_info, event);
 		}
 		event++;
 	}
 
+	if (rescan_needed)
+		pqi_schedule_rescan_worker_delayed(ctrl_info);
+
 out:
 	pqi_ctrl_unbusy(ctrl_info);
 }
@@ -3446,37 +3500,18 @@ static inline void pqi_stop_heartbeat_timer(struct pqi_ctrl_info *ctrl_info)
 	del_timer_sync(&ctrl_info->heartbeat_timer);
 }
 
-static inline int pqi_event_type_to_event_index(unsigned int event_type)
-{
-	int index;
-
-	for (index = 0; index < ARRAY_SIZE(pqi_supported_event_types); index++)
-		if (event_type == pqi_supported_event_types[index])
-			return index;
-
-	return -1;
-}
-
-static inline bool pqi_is_supported_event(unsigned int event_type)
-{
-	return pqi_event_type_to_event_index(event_type) != -1;
-}
-
-static void pqi_ofa_capture_event_payload(struct pqi_event *event,
-	struct pqi_event_response *response)
+static void pqi_ofa_capture_event_payload(struct pqi_ctrl_info *ctrl_info,
+	struct pqi_event *event, struct pqi_event_response *response)
 {
-	u16 event_id;
-
-	event_id = get_unaligned_le16(&event->event_id);
-
-	if (event->event_type == PQI_EVENT_TYPE_OFA) {
-		if (event_id == PQI_EVENT_OFA_MEMORY_ALLOCATION) {
-			event->ofa_bytes_requested =
-			response->data.ofa_memory_allocation.bytes_requested;
-		} else if (event_id == PQI_EVENT_OFA_CANCELED) {
-			event->ofa_cancel_reason =
-			response->data.ofa_cancelled.reason;
-		}
+	switch (event->event_id) {
+	case PQI_EVENT_OFA_MEMORY_ALLOCATION:
+		ctrl_info->ofa_bytes_requested =
+			get_unaligned_le32(&response->data.ofa_memory_allocation.bytes_requested);
+		break;
+	case PQI_EVENT_OFA_CANCELED:
+		ctrl_info->ofa_cancel_reason =
+			get_unaligned_le16(&response->data.ofa_cancelled.reason);
+		break;
 	}
 }
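
Note the systematic switch to get_unaligned_le16()/get_unaligned_le32() for fields lifted out of controller-format structures: the wire layout is packed little-endian and need not match the host's alignment or byte order, so a direct member assignment would be wrong on big-endian or alignment-strict machines. A stand-alone rendering of the 32-bit accessor (the kernel's own lives in <asm/unaligned.h>):

#include <stdint.h>
#include <stdio.h>

/* Byte-wise load: safe at any address, fixed little-endian interpretation. */
static uint32_t get_unaligned_le32_demo(const void *p)
{
	const uint8_t *b = p;

	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
		((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

int main(void)
{
	/* bytes as a controller would lay out the value 65536 (0x00010000) */
	const uint8_t wire[4] = { 0x00, 0x00, 0x01, 0x00 };

	printf("bytes_requested = %u\n", get_unaligned_le32_demo(wire));
	return 0;
}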
 
@@ -3510,17 +3545,17 @@ static int pqi_process_event_intr(struct pqi_ctrl_info *ctrl_info)
 		num_events++;
 		response = event_queue->oq_element_array + (oq_ci * PQI_EVENT_OQ_ELEMENT_LENGTH);
 
-		event_index =
-			pqi_event_type_to_event_index(response->event_type);
+		event_index = pqi_event_type_to_event_index(response->event_type);
 
 		if (event_index >= 0 && response->request_acknowledge) {
 			event = &ctrl_info->events[event_index];
 			event->pending = true;
 			event->event_type = response->event_type;
-			event->event_id = response->event_id;
-			event->additional_event_id = response->additional_event_id;
+			event->event_id = get_unaligned_le16(&response->event_id);
+			event->additional_event_id =
+				get_unaligned_le32(&response->additional_event_id);
 			if (event->event_type == PQI_EVENT_TYPE_OFA)
-				pqi_ofa_capture_event_payload(event, response);
+				pqi_ofa_capture_event_payload(ctrl_info, event, response);
 		}
 
 		oq_ci = (oq_ci + 1) % PQI_NUM_EVENT_QUEUE_ELEMENTS;
@@ -3537,8 +3572,7 @@ static int pqi_process_event_intr(struct pqi_ctrl_info *ctrl_info)
 
 #define PQI_LEGACY_INTX_MASK	0x1
 
-static inline void pqi_configure_legacy_intx(struct pqi_ctrl_info *ctrl_info,
-	bool enable_intx)
+static inline void pqi_configure_legacy_intx(struct pqi_ctrl_info *ctrl_info, bool enable_intx)
 {
 	u32 intx_mask;
 	struct pqi_device_registers __iomem *pqi_registers;
@@ -4216,59 +4250,36 @@ static int pqi_process_raid_io_error_synchronous(
 	return rc;
 }
 
+static inline bool pqi_is_blockable_request(struct pqi_iu_header *request)
+{
+	return (request->driver_flags & PQI_DRIVER_NONBLOCKABLE_REQUEST) == 0;
+}
+
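pqi_is_blockable_request() lets individual requests opt out of the controller-blocked wait through PQI_DRIVER_NONBLOCKABLE_REQUEST in driver_flags; SA_FLUSH_CACHE is marked this way above so a shutdown-time cache flush is not stalled by the very quiesce that shutdown initiated. A hedged sketch of the gate (the flag value and struct layout are illustrative, not the driver's):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_NONBLOCKABLE_REQUEST 0x1	/* illustrative flag value */

struct demo_iu_header {
	uint8_t driver_flags;
};

static bool is_blockable_request(const struct demo_iu_header *h)
{
	return (h->driver_flags & DEMO_NONBLOCKABLE_REQUEST) == 0;
}

int main(void)
{
	struct demo_iu_header flush = { .driver_flags = DEMO_NONBLOCKABLE_REQUEST };
	struct demo_iu_header normal = { .driver_flags = 0 };
	bool ctrl_blocked = true;	/* e.g. an OFA quiesce is in progress */

	/* only blockable requests wait out a controller quiesce */
	printf("flush waits: %d\n", is_blockable_request(&flush) && ctrl_blocked);
	printf("normal I/O waits: %d\n", is_blockable_request(&normal) && ctrl_blocked);
	return 0;
}
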
 static int pqi_submit_raid_request_synchronous(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_iu_header *request, unsigned int flags,
-	struct pqi_raid_error_info *error_info, unsigned long timeout_msecs)
+	struct pqi_raid_error_info *error_info)
 {
 	int rc = 0;
 	struct pqi_io_request *io_request;
-	unsigned long start_jiffies;
-	unsigned long msecs_blocked;
 	size_t iu_length;
 	DECLARE_COMPLETION_ONSTACK(wait);
 
-	/*
-	 * Note that specifying PQI_SYNC_FLAGS_INTERRUPTABLE and a timeout value
-	 * are mutually exclusive.
-	 */
-
 	if (flags & PQI_SYNC_FLAGS_INTERRUPTABLE) {
 		if (down_interruptible(&ctrl_info->sync_request_sem))
 			return -ERESTARTSYS;
 	} else {
-		if (timeout_msecs == NO_TIMEOUT) {
-			down(&ctrl_info->sync_request_sem);
-		} else {
-			start_jiffies = jiffies;
-			if (down_timeout(&ctrl_info->sync_request_sem,
-				msecs_to_jiffies(timeout_msecs)))
-				return -ETIMEDOUT;
-			msecs_blocked =
-				jiffies_to_msecs(jiffies - start_jiffies);
-			if (msecs_blocked >= timeout_msecs) {
-				rc = -ETIMEDOUT;
-				goto out;
-			}
-			timeout_msecs -= msecs_blocked;
-		}
+		down(&ctrl_info->sync_request_sem);
 	}
 
 	pqi_ctrl_busy(ctrl_info);
-	timeout_msecs = pqi_wait_if_ctrl_blocked(ctrl_info, timeout_msecs);
-	if (timeout_msecs == 0) {
-		pqi_ctrl_unbusy(ctrl_info);
-		rc = -ETIMEDOUT;
-		goto out;
-	}
+	if (pqi_is_blockable_request(request))
+		pqi_wait_if_ctrl_blocked(ctrl_info);
 
 	if (pqi_ctrl_offline(ctrl_info)) {
-		pqi_ctrl_unbusy(ctrl_info);
 		rc = -ENXIO;
 		goto out;
 	}
 
-	atomic_inc(&ctrl_info->sync_cmds_outstanding);
-
 	io_request = pqi_alloc_io_request(ctrl_info);
 
 	put_unaligned_le16(io_request->index,
@@ -4288,18 +4299,7 @@ static int pqi_submit_raid_request_synchronous(struct pqi_ctrl_info *ctrl_info,
 	pqi_start_io(ctrl_info, &ctrl_info->queue_groups[PQI_DEFAULT_QUEUE_GROUP], RAID_PATH,
 		io_request);
 
-	pqi_ctrl_unbusy(ctrl_info);
-
-	if (timeout_msecs == NO_TIMEOUT) {
-		pqi_wait_for_completion_io(ctrl_info, &wait);
-	} else {
-		if (!wait_for_completion_io_timeout(&wait,
-			msecs_to_jiffies(timeout_msecs))) {
-			dev_warn(&ctrl_info->pci_dev->dev,
-				"command timed out\n");
-			rc = -ETIMEDOUT;
-		}
-	}
+	pqi_wait_for_completion_io(ctrl_info, &wait);
 
 	if (error_info) {
 		if (io_request->error_info)
@@ -4312,8 +4312,8 @@ static int pqi_submit_raid_request_synchronous(struct pqi_ctrl_info *ctrl_info,
 
 	pqi_free_io_request(io_request);
 
-	atomic_dec(&ctrl_info->sync_cmds_outstanding);
 out:
+	pqi_ctrl_unbusy(ctrl_info);
 	up(&ctrl_info->sync_request_sem);
 
 	return rc;
@@ -4350,8 +4350,7 @@ static int pqi_submit_admin_request_synchronous(
 	rc = pqi_poll_for_admin_response(ctrl_info, response);
 
 	if (rc == 0)
-		rc = pqi_validate_admin_response(response,
-			request->function_code);
+		rc = pqi_validate_admin_response(response, request->function_code);
 
 	return rc;
 }
@@ -4721,7 +4720,7 @@ static int pqi_configure_events(struct pqi_ctrl_info *ctrl_info,
 		goto out;
 
 	rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header,
-		0, NULL, NO_TIMEOUT);
+		0, NULL);
 
 	pqi_pci_unmap(ctrl_info->pci_dev,
 		request.data.report_event_configuration.sg_descriptors, 1,
@@ -4757,7 +4756,7 @@ static int pqi_configure_events(struct pqi_ctrl_info *ctrl_info,
 		goto out;
 
 	rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header, 0,
-		NULL, NO_TIMEOUT);
+		NULL);
 
 	pqi_pci_unmap(ctrl_info->pci_dev,
 		request.data.report_event_configuration.sg_descriptors, 1,
@@ -5277,12 +5276,6 @@ static inline int pqi_raid_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
 		device, scmd, queue_group);
 }
 
-static inline void pqi_schedule_bypass_retry(struct pqi_ctrl_info *ctrl_info)
-{
-	if (!pqi_ctrl_blocked(ctrl_info))
-		schedule_work(&ctrl_info->raid_bypass_retry_work);
-}
-
 static bool pqi_raid_bypass_retry_needed(struct pqi_io_request *io_request)
 {
 	struct scsi_cmnd *scmd;
@@ -5299,7 +5292,7 @@ static bool pqi_raid_bypass_retry_needed(struct pqi_io_request *io_request)
 		return false;
 
 	device = scmd->device->hostdata;
-	if (pqi_device_offline(device))
+	if (pqi_device_offline(device) || pqi_device_in_remove(device))
 		return false;
 
 	ctrl_info = shost_to_hba(scmd->device->host);
@@ -5309,155 +5302,26 @@ static bool pqi_raid_bypass_retry_needed(struct pqi_io_request *io_request)
 	return true;
 }
 
-static inline void pqi_add_to_raid_bypass_retry_list(
-	struct pqi_ctrl_info *ctrl_info,
-	struct pqi_io_request *io_request, bool at_head)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&ctrl_info->raid_bypass_retry_list_lock, flags);
-	if (at_head)
-		list_add(&io_request->request_list_entry,
-			&ctrl_info->raid_bypass_retry_list);
-	else
-		list_add_tail(&io_request->request_list_entry,
-			&ctrl_info->raid_bypass_retry_list);
-	spin_unlock_irqrestore(&ctrl_info->raid_bypass_retry_list_lock, flags);
-}
-
-static void pqi_queued_raid_bypass_complete(struct pqi_io_request *io_request,
+static void pqi_aio_io_complete(struct pqi_io_request *io_request,
 	void *context)
 {
 	struct scsi_cmnd *scmd;
 
 	scmd = io_request->scmd;
+	scsi_dma_unmap(scmd);
+	if (io_request->status == -EAGAIN ||
+		pqi_raid_bypass_retry_needed(io_request))
+		set_host_byte(scmd, DID_IMM_RETRY);
 	pqi_free_io_request(io_request);
 	pqi_scsi_done(scmd);
 }
 
-static void pqi_queue_raid_bypass_retry(struct pqi_io_request *io_request)
+static inline int pqi_aio_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
+	struct pqi_scsi_dev *device, struct scsi_cmnd *scmd,
+	struct pqi_queue_group *queue_group)
 {
-	struct scsi_cmnd *scmd;
-	struct pqi_ctrl_info *ctrl_info;
-
-	io_request->io_complete_callback = pqi_queued_raid_bypass_complete;
-	scmd = io_request->scmd;
-	scmd->result = 0;
-	ctrl_info = shost_to_hba(scmd->device->host);
-
-	pqi_add_to_raid_bypass_retry_list(ctrl_info, io_request, false);
-	pqi_schedule_bypass_retry(ctrl_info);
-}
-
-static int pqi_retry_raid_bypass(struct pqi_io_request *io_request)
-{
-	struct scsi_cmnd *scmd;
-	struct pqi_scsi_dev *device;
-	struct pqi_ctrl_info *ctrl_info;
-	struct pqi_queue_group *queue_group;
-
-	scmd = io_request->scmd;
-	device = scmd->device->hostdata;
-	if (pqi_device_in_reset(device)) {
-		pqi_free_io_request(io_request);
-		set_host_byte(scmd, DID_RESET);
-		pqi_scsi_done(scmd);
-		return 0;
-	}
-
-	ctrl_info = shost_to_hba(scmd->device->host);
-	queue_group = io_request->queue_group;
-
-	pqi_reinit_io_request(io_request);
-
-	return pqi_raid_submit_scsi_cmd_with_io_request(ctrl_info, io_request,
-		device, scmd, queue_group);
-}
-
-static inline struct pqi_io_request *pqi_next_queued_raid_bypass_request(
-	struct pqi_ctrl_info *ctrl_info)
-{
-	unsigned long flags;
-	struct pqi_io_request *io_request;
-
-	spin_lock_irqsave(&ctrl_info->raid_bypass_retry_list_lock, flags);
-	io_request = list_first_entry_or_null(
-		&ctrl_info->raid_bypass_retry_list,
-		struct pqi_io_request, request_list_entry);
-	if (io_request)
-		list_del(&io_request->request_list_entry);
-	spin_unlock_irqrestore(&ctrl_info->raid_bypass_retry_list_lock, flags);
-
-	return io_request;
-}
-
-static void pqi_retry_raid_bypass_requests(struct pqi_ctrl_info *ctrl_info)
-{
-	int rc;
-	struct pqi_io_request *io_request;
-
-	pqi_ctrl_busy(ctrl_info);
-
-	while (1) {
-		if (pqi_ctrl_blocked(ctrl_info))
-			break;
-		io_request = pqi_next_queued_raid_bypass_request(ctrl_info);
-		if (!io_request)
-			break;
-		rc = pqi_retry_raid_bypass(io_request);
-		if (rc) {
-			pqi_add_to_raid_bypass_retry_list(ctrl_info, io_request,
-				true);
-			pqi_schedule_bypass_retry(ctrl_info);
-			break;
-		}
-	}
-
-	pqi_ctrl_unbusy(ctrl_info);
-}
-
-static void pqi_raid_bypass_retry_worker(struct work_struct *work)
-{
-	struct pqi_ctrl_info *ctrl_info;
-
-	ctrl_info = container_of(work, struct pqi_ctrl_info,
-		raid_bypass_retry_work);
-	pqi_retry_raid_bypass_requests(ctrl_info);
-}
-
-static void pqi_clear_all_queued_raid_bypass_retries(
-	struct pqi_ctrl_info *ctrl_info)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&ctrl_info->raid_bypass_retry_list_lock, flags);
-	INIT_LIST_HEAD(&ctrl_info->raid_bypass_retry_list);
-	spin_unlock_irqrestore(&ctrl_info->raid_bypass_retry_list_lock, flags);
-}
-
-static void pqi_aio_io_complete(struct pqi_io_request *io_request,
-	void *context)
-{
-	struct scsi_cmnd *scmd;
-
-	scmd = io_request->scmd;
-	scsi_dma_unmap(scmd);
-	if (io_request->status == -EAGAIN)
-		set_host_byte(scmd, DID_IMM_RETRY);
-	else if (pqi_raid_bypass_retry_needed(io_request)) {
-		pqi_queue_raid_bypass_retry(io_request);
-		return;
-	}
-	pqi_free_io_request(io_request);
-	pqi_scsi_done(scmd);
-}
-
-static inline int pqi_aio_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
-	struct pqi_scsi_dev *device, struct scsi_cmnd *scmd,
-	struct pqi_queue_group *queue_group)
-{
-	return pqi_aio_submit_io(ctrl_info, scmd, device->aio_handle,
-		scmd->cmnd, scmd->cmd_len, queue_group, NULL, false);
+	return pqi_aio_submit_io(ctrl_info, scmd, device->aio_handle,
+		scmd->cmnd, scmd->cmd_len, queue_group, NULL, false);
 }
 
 static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,
@@ -5698,6 +5562,14 @@ static inline u16 pqi_get_hw_queue(struct pqi_ctrl_info *ctrl_info,
 	return hw_queue;
 }
 
+static inline bool pqi_is_bypass_eligible_request(struct scsi_cmnd *scmd)
+{
+	if (blk_rq_is_passthrough(scmd->request))
+		return false;
+
+	return scmd->retries == 0;
+}
+
 /*
  * This function gets called just before we hand the completed SCSI request
  * back to the SML.
@@ -5806,7 +5678,6 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
 	bool raid_bypassed;
 
 	device = scmd->device->hostdata;
-	ctrl_info = shost_to_hba(shost);
 
 	if (!device) {
 		set_host_byte(scmd, DID_NO_CONNECT);
@@ -5816,15 +5687,15 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
 
 	atomic_inc(&device->scsi_cmds_outstanding);
 
+	ctrl_info = shost_to_hba(shost);
+
 	if (pqi_ctrl_offline(ctrl_info) || pqi_device_in_remove(device)) {
 		set_host_byte(scmd, DID_NO_CONNECT);
 		pqi_scsi_done(scmd);
 		return 0;
 	}
 
-	pqi_ctrl_busy(ctrl_info);
-	if (pqi_ctrl_blocked(ctrl_info) || pqi_device_in_reset(device) ||
-	    pqi_ctrl_in_ofa(ctrl_info) || pqi_ctrl_in_shutdown(ctrl_info)) {
+	if (pqi_ctrl_blocked(ctrl_info)) {
 		rc = SCSI_MLQUEUE_HOST_BUSY;
 		goto out;
 	}
@@ -5841,13 +5712,12 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
 	if (pqi_is_logical_device(device)) {
 		raid_bypassed = false;
 		if (device->raid_bypass_enabled &&
-			!blk_rq_is_passthrough(scmd->request)) {
-			if (!pqi_is_parity_write_stream(ctrl_info, scmd)) {
-				rc = pqi_raid_bypass_submit_scsi_cmd(ctrl_info, device, scmd, queue_group);
-				if (rc == 0 || rc == SCSI_MLQUEUE_HOST_BUSY) {
-					raid_bypassed = true;
-					atomic_inc(&device->raid_bypass_cnt);
-				}
+			pqi_is_bypass_eligible_request(scmd) &&
+			!pqi_is_parity_write_stream(ctrl_info, scmd)) {
+			rc = pqi_raid_bypass_submit_scsi_cmd(ctrl_info, device, scmd, queue_group);
+			if (rc == 0 || rc == SCSI_MLQUEUE_HOST_BUSY) {
+				raid_bypassed = true;
+				atomic_inc(&device->raid_bypass_cnt);
 			}
 		}
 		if (!raid_bypassed)
@@ -5860,7 +5730,6 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
 	}
 
 out:
-	pqi_ctrl_unbusy(ctrl_info);
 	if (rc)
 		atomic_dec(&device->scsi_cmds_outstanding);
 
@@ -5970,100 +5839,22 @@ static void pqi_fail_io_queued_for_device(struct pqi_ctrl_info *ctrl_info,
 	}
 }
 
-static void pqi_fail_io_queued_for_all_devices(struct pqi_ctrl_info *ctrl_info)
-{
-	unsigned int i;
-	unsigned int path;
-	struct pqi_queue_group *queue_group;
-	unsigned long flags;
-	struct pqi_io_request *io_request;
-	struct pqi_io_request *next;
-	struct scsi_cmnd *scmd;
-
-	for (i = 0; i < ctrl_info->num_queue_groups; i++) {
-		queue_group = &ctrl_info->queue_groups[i];
-
-		for (path = 0; path < 2; path++) {
-			spin_lock_irqsave(&queue_group->submit_lock[path],
-						flags);
-
-			list_for_each_entry_safe(io_request, next,
-				&queue_group->request_list[path],
-				request_list_entry) {
-
-				scmd = io_request->scmd;
-				if (!scmd)
-					continue;
-
-				list_del(&io_request->request_list_entry);
-				set_host_byte(scmd, DID_RESET);
-				pqi_scsi_done(scmd);
-			}
-
-			spin_unlock_irqrestore(
-				&queue_group->submit_lock[path], flags);
-		}
-	}
-}
-
 static int pqi_device_wait_for_pending_io(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_scsi_dev *device, unsigned long timeout_secs)
 {
 	unsigned long timeout;
 
-	timeout = (timeout_secs * PQI_HZ) + jiffies;
-
-	while (atomic_read(&device->scsi_cmds_outstanding)) {
-		pqi_check_ctrl_health(ctrl_info);
-		if (pqi_ctrl_offline(ctrl_info))
-			return -ENXIO;
-		if (timeout_secs != NO_TIMEOUT) {
-			if (time_after(jiffies, timeout)) {
-				dev_err(&ctrl_info->pci_dev->dev,
-					"timed out waiting for pending IO\n");
-				return -ETIMEDOUT;
-			}
-		}
-		usleep_range(1000, 2000);
-	}
-
-	return 0;
-}
-
-static int pqi_ctrl_wait_for_pending_io(struct pqi_ctrl_info *ctrl_info,
-	unsigned long timeout_secs)
-{
-	bool io_pending;
-	unsigned long flags;
-	unsigned long timeout;
-	struct pqi_scsi_dev *device;
-
 	timeout = (timeout_secs * PQI_HZ) + jiffies;
-	while (1) {
-		io_pending = false;
-
-		spin_lock_irqsave(&ctrl_info->scsi_device_list_lock, flags);
-		list_for_each_entry(device, &ctrl_info->scsi_device_list,
-			scsi_device_list_entry) {
-			if (atomic_read(&device->scsi_cmds_outstanding)) {
-				io_pending = true;
-				break;
-			}
-		}
-		spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock,
-					flags);
-
-		if (!io_pending)
-			break;
 
+	while (atomic_read(&device->scsi_cmds_outstanding)) {
 		pqi_check_ctrl_health(ctrl_info);
 		if (pqi_ctrl_offline(ctrl_info))
 			return -ENXIO;
-
 		if (timeout_secs != NO_TIMEOUT) {
 			if (time_after(jiffies, timeout)) {
 				dev_err(&ctrl_info->pci_dev->dev,
-					"timed out waiting for pending IO\n");
+					"timed out waiting for pending I/O\n");
 				return -ETIMEDOUT;
 			}
 		}
@@ -6073,18 +5864,6 @@ static int pqi_ctrl_wait_for_pending_io(struct pqi_ctrl_info *ctrl_info,
 	return 0;
 }
 
-static int pqi_ctrl_wait_for_pending_sync_cmds(struct pqi_ctrl_info *ctrl_info)
-{
-	while (atomic_read(&ctrl_info->sync_cmds_outstanding)) {
-		pqi_check_ctrl_health(ctrl_info);
-		if (pqi_ctrl_offline(ctrl_info))
-			return -ENXIO;
-		usleep_range(1000, 2000);
-	}
-
-	return 0;
-}
-
 static void pqi_lun_reset_complete(struct pqi_io_request *io_request,
 	void *context)
 {
@@ -6156,13 +5935,11 @@ static int pqi_lun_reset(struct pqi_ctrl_info *ctrl_info,
 	return rc;
 }
 
-/* Performs a reset at the LUN level. */
-
 #define PQI_LUN_RESET_RETRIES			3
 #define PQI_LUN_RESET_RETRY_INTERVAL_MSECS	10000
 #define PQI_LUN_RESET_PENDING_IO_TIMEOUT_SECS	120
 
-static int _pqi_device_reset(struct pqi_ctrl_info *ctrl_info,
+static int pqi_lun_reset_with_retries(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_scsi_dev *device)
 {
 	int rc;
@@ -6188,23 +5965,15 @@ static int pqi_device_reset(struct pqi_ctrl_info *ctrl_info,
 {
 	int rc;
 
-	mutex_lock(&ctrl_info->lun_reset_mutex);
-
 	pqi_ctrl_block_requests(ctrl_info);
 	pqi_ctrl_wait_until_quiesced(ctrl_info);
 	pqi_fail_io_queued_for_device(ctrl_info, device);
 	rc = pqi_wait_until_inbound_queues_empty(ctrl_info);
-	pqi_device_reset_start(device);
-	pqi_ctrl_unblock_requests(ctrl_info);
-
 	if (rc)
 		rc = FAILED;
 	else
-		rc = _pqi_device_reset(ctrl_info, device);
-
-	pqi_device_reset_done(device);
-
-	mutex_unlock(&ctrl_info->lun_reset_mutex);
+		rc = pqi_lun_reset_with_retries(ctrl_info, device);
+	pqi_ctrl_unblock_requests(ctrl_info);
 
 	return rc;
 }
@@ -6220,29 +5989,25 @@ static int pqi_eh_device_reset_handler(struct scsi_cmnd *scmd)
 	ctrl_info = shost_to_hba(shost);
 	device = scmd->device->hostdata;
 
+	mutex_lock(&ctrl_info->lun_reset_mutex);
+
 	dev_err(&ctrl_info->pci_dev->dev,
 		"resetting scsi %d:%d:%d:%d\n",
 		shost->host_no, device->bus, device->target, device->lun);
 
 	pqi_check_ctrl_health(ctrl_info);
-	if (pqi_ctrl_offline(ctrl_info) ||
-		pqi_device_reset_blocked(ctrl_info)) {
+	if (pqi_ctrl_offline(ctrl_info))
 		rc = FAILED;
-		goto out;
-	}
-
-	pqi_wait_until_ofa_finished(ctrl_info);
-
-	atomic_inc(&ctrl_info->sync_cmds_outstanding);
-	rc = pqi_device_reset(ctrl_info, device);
-	atomic_dec(&ctrl_info->sync_cmds_outstanding);
+	else
+		rc = pqi_device_reset(ctrl_info, device);
 
-out:
 	dev_err(&ctrl_info->pci_dev->dev,
 		"reset of scsi %d:%d:%d:%d: %s\n",
 		shost->host_no, device->bus, device->target, device->lun,
 		rc == SUCCESS ? "SUCCESS" : "FAILED");
 
+	mutex_unlock(&ctrl_info->lun_reset_mutex);
+
 	return rc;
 }
 
@@ -6544,7 +6309,7 @@ static int pqi_passthru_ioctl(struct pqi_ctrl_info *ctrl_info, void __user *arg)
 		put_unaligned_le32(iocommand.Request.Timeout, &request.timeout);
 
 	rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header,
-		PQI_SYNC_FLAGS_INTERRUPTABLE, &pqi_error_info, NO_TIMEOUT);
+		PQI_SYNC_FLAGS_INTERRUPTABLE, &pqi_error_info);
 
 	if (iocommand.buf_size > 0)
 		pqi_pci_unmap(ctrl_info->pci_dev, request.sg_descriptors, 1,
@@ -6596,9 +6361,6 @@ static int pqi_ioctl(struct scsi_device *sdev, unsigned int cmd,
 
 	ctrl_info = shost_to_hba(sdev->host);
 
-	if (pqi_ctrl_in_ofa(ctrl_info) || pqi_ctrl_in_shutdown(ctrl_info))
-		return -EBUSY;
-
 	switch (cmd) {
 	case CCISS_DEREGDISK:
 	case CCISS_REGNEWDISK:
@@ -7145,9 +6907,7 @@ static int pqi_register_scsi(struct pqi_ctrl_info *ctrl_info)
 
 	shost = scsi_host_alloc(&pqi_driver_template, sizeof(ctrl_info));
 	if (!shost) {
-		dev_err(&ctrl_info->pci_dev->dev,
-			"scsi_host_alloc failed for controller %u\n",
-			ctrl_info->ctrl_id);
+		dev_err(&ctrl_info->pci_dev->dev, "scsi_host_alloc failed\n");
 		return -ENOMEM;
 	}
 
@@ -7405,7 +7165,7 @@ static int pqi_config_table_update(struct pqi_ctrl_info *ctrl_info,
 		&request.data.config_table_update.last_section);
 
 	return pqi_submit_raid_request_synchronous(ctrl_info, &request.header,
-		0, NULL, NO_TIMEOUT);
+		0, NULL);
 }
 
 static int pqi_enable_firmware_features(struct pqi_ctrl_info *ctrl_info,
@@ -7483,7 +7243,8 @@ static void pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
 		break;
 	case PQI_FIRMWARE_FEATURE_SOFT_RESET_HANDSHAKE:
 		ctrl_info->soft_reset_handshake_supported =
-			firmware_feature->enabled;
+			firmware_feature->enabled &&
+			ctrl_info->soft_reset_status;
 		break;
 	case PQI_FIRMWARE_FEATURE_RAID_IU_TIMEOUT:
 		ctrl_info->raid_iu_timeout_supported = firmware_feature->enabled;
@@ -7491,6 +7252,10 @@ static void pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
 	case PQI_FIRMWARE_FEATURE_TMF_IU_TIMEOUT:
 		ctrl_info->tmf_iu_timeout_supported = firmware_feature->enabled;
 		break;
+	case PQI_FIRMWARE_FEATURE_UNIQUE_WWID_IN_REPORT_PHYS_LUN:
+		ctrl_info->unique_wwid_in_report_phys_lun_supported =
+			firmware_feature->enabled;
+		break;
 	}
 
 	pqi_firmware_feature_status(ctrl_info, firmware_feature);
@@ -7581,6 +7346,11 @@ static struct pqi_firmware_feature pqi_firmware_features[] = {
 		.feature_bit = PQI_FIRMWARE_FEATURE_RAID_BYPASS_ON_ENCRYPTED_NVME,
 		.feature_status = pqi_firmware_feature_status,
 	},
+	{
+		.feature_name = "Unique WWID in Report Physical LUN",
+		.feature_bit = PQI_FIRMWARE_FEATURE_UNIQUE_WWID_IN_REPORT_PHYS_LUN,
+		.feature_status = pqi_ctrl_update_feature_flags,
+	},
 };
 
 static void pqi_process_firmware_features(
@@ -7665,14 +7435,34 @@ static void pqi_process_firmware_features_section(
 	mutex_unlock(&pqi_firmware_features_mutex);
 }
 
+/*
+ * Reset all controller settings that can be set during the processing of the
+ * PQI Configuration Table, so that reinitialization starts from a clean state.
+ */
+
+static void pqi_ctrl_reset_config(struct pqi_ctrl_info *ctrl_info)
+{
+	ctrl_info->heartbeat_counter = NULL;
+	ctrl_info->soft_reset_status = NULL;
+	ctrl_info->soft_reset_handshake_supported = false;
+	ctrl_info->enable_r1_writes = false;
+	ctrl_info->enable_r5_writes = false;
+	ctrl_info->enable_r6_writes = false;
+	ctrl_info->raid_iu_timeout_supported = false;
+	ctrl_info->tmf_iu_timeout_supported = false;
+	ctrl_info->unique_wwid_in_report_phys_lun_supported = false;
+}
+
 static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
 {
 	u32 table_length;
 	u32 section_offset;
+	bool firmware_feature_section_present;
 	void __iomem *table_iomem_addr;
 	struct pqi_config_table *config_table;
 	struct pqi_config_table_section_header *section;
 	struct pqi_config_table_section_info section_info;
+	struct pqi_config_table_section_info feature_section_info;
 
 	table_length = ctrl_info->config_table_length;
 	if (table_length == 0)
@@ -7692,6 +7482,7 @@ static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
 	table_iomem_addr = ctrl_info->iomem_base + ctrl_info->config_table_offset;
 	memcpy_fromio(config_table, table_iomem_addr, table_length);
 
+	firmware_feature_section_present = false;
 	section_info.ctrl_info = ctrl_info;
 	section_offset = get_unaligned_le32(&config_table->first_section_offset);
 
@@ -7704,7 +7495,8 @@ static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
 
 		switch (get_unaligned_le16(&section->section_id)) {
 		case PQI_CONFIG_TABLE_SECTION_FIRMWARE_FEATURES:
-			pqi_process_firmware_features_section(&section_info);
+			firmware_feature_section_present = true;
+			feature_section_info = section_info;
 			break;
 		case PQI_CONFIG_TABLE_SECTION_HEARTBEAT:
 			if (pqi_disable_heartbeat)
@@ -7722,13 +7514,21 @@ static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
 				table_iomem_addr +
 				section_offset +
 				offsetof(struct pqi_config_table_soft_reset,
-						soft_reset_status);
+					soft_reset_status);
 			break;
 		}
 
 		section_offset = get_unaligned_le16(&section->next_section_offset);
 	}
 
+	/*
+	 * We process the firmware feature section after all other sections
+	 * have been processed so that the feature bit callbacks can take
+	 * into account the settings configured by other sections.
+	 */
+	if (firmware_feature_section_present)
+		pqi_process_firmware_features_section(&feature_section_info);
+
 	kfree(config_table);
 
 	return 0;
@@ -7776,8 +7576,6 @@ static int pqi_force_sis_mode(struct pqi_ctrl_info *ctrl_info)
 	return pqi_revert_to_sis_mode(ctrl_info);
 }
 
-#define PQI_POST_RESET_DELAY_B4_MSGU_READY	5000
-
 static int pqi_ctrl_init(struct pqi_ctrl_info *ctrl_info)
 {
 	int rc;
@@ -7785,7 +7583,7 @@ static int pqi_ctrl_init(struct pqi_ctrl_info *ctrl_info)
 
 	if (reset_devices) {
 		sis_soft_reset(ctrl_info);
-		msleep(PQI_POST_RESET_DELAY_B4_MSGU_READY);
+		msleep(PQI_POST_RESET_DELAY_SECS * PQI_HZ);
 	} else {
 		rc = pqi_force_sis_mode(ctrl_info);
 		if (rc)
@@ -8095,6 +7893,8 @@ static int pqi_ctrl_init_resume(struct pqi_ctrl_info *ctrl_info)
 	ctrl_info->controller_online = true;
 	pqi_ctrl_unblock_requests(ctrl_info);
 
+	pqi_ctrl_reset_config(ctrl_info);
+
 	rc = pqi_process_config_table(ctrl_info);
 	if (rc)
 		return rc;
@@ -8140,7 +7940,8 @@ static int pqi_ctrl_init_resume(struct pqi_ctrl_info *ctrl_info)
 		return rc;
 	}
 
-	pqi_schedule_update_time_worker(ctrl_info);
+	if (pqi_ofa_in_progress(ctrl_info))
+		pqi_ctrl_unblock_scan(ctrl_info);
 
 	pqi_scan_scsi_devices(ctrl_info);
 
@@ -8253,7 +8054,6 @@ static struct pqi_ctrl_info *pqi_alloc_ctrl_info(int numa_node)
 
 	INIT_WORK(&ctrl_info->event_work, pqi_event_worker);
 	atomic_set(&ctrl_info->num_interrupts, 0);
-	atomic_set(&ctrl_info->sync_cmds_outstanding, 0);
 
 	INIT_DELAYED_WORK(&ctrl_info->rescan_work, pqi_rescan_worker);
 	INIT_DELAYED_WORK(&ctrl_info->update_time_work, pqi_update_time_worker);
@@ -8261,15 +8061,13 @@ static struct pqi_ctrl_info *pqi_alloc_ctrl_info(int numa_node)
 	timer_setup(&ctrl_info->heartbeat_timer, pqi_heartbeat_timer_handler, 0);
 	INIT_WORK(&ctrl_info->ctrl_offline_work, pqi_ctrl_offline_worker);
 
+	INIT_WORK(&ctrl_info->ofa_memory_alloc_work, pqi_ofa_memory_alloc_worker);
+	INIT_WORK(&ctrl_info->ofa_quiesce_work, pqi_ofa_quiesce_worker);
+
 	sema_init(&ctrl_info->sync_request_sem,
 		PQI_RESERVED_IO_SLOTS_SYNCHRONOUS_REQUESTS);
 	init_waitqueue_head(&ctrl_info->block_requests_wait);
 
-	INIT_LIST_HEAD(&ctrl_info->raid_bypass_retry_list);
-	spin_lock_init(&ctrl_info->raid_bypass_retry_list_lock);
-	INIT_WORK(&ctrl_info->raid_bypass_retry_work,
-		pqi_raid_bypass_retry_worker);
-
 	ctrl_info->ctrl_id = atomic_inc_return(&pqi_controller_count) - 1;
 	ctrl_info->irq_mode = IRQ_MODE_NONE;
 	ctrl_info->max_msix_vectors = PQI_MAX_MSIX_VECTORS;
@@ -8334,81 +8132,57 @@ static void pqi_remove_ctrl(struct pqi_ctrl_info *ctrl_info)
 
 static void pqi_ofa_ctrl_quiesce(struct pqi_ctrl_info *ctrl_info)
 {
-	pqi_cancel_update_time_worker(ctrl_info);
-	pqi_cancel_rescan_worker(ctrl_info);
-	pqi_wait_until_lun_reset_finished(ctrl_info);
-	pqi_wait_until_scan_finished(ctrl_info);
-	pqi_ctrl_ofa_start(ctrl_info);
+	pqi_ctrl_block_scan(ctrl_info);
+	pqi_scsi_block_requests(ctrl_info);
+	pqi_ctrl_block_device_reset(ctrl_info);
 	pqi_ctrl_block_requests(ctrl_info);
 	pqi_ctrl_wait_until_quiesced(ctrl_info);
-	pqi_ctrl_wait_for_pending_io(ctrl_info, PQI_PENDING_IO_TIMEOUT_SECS);
-	pqi_fail_io_queued_for_all_devices(ctrl_info);
-	pqi_wait_until_inbound_queues_empty(ctrl_info);
 	pqi_stop_heartbeat_timer(ctrl_info);
-	ctrl_info->pqi_mode_enabled = false;
-	pqi_save_ctrl_mode(ctrl_info, SIS_MODE);
 }
 
 static void pqi_ofa_ctrl_unquiesce(struct pqi_ctrl_info *ctrl_info)
 {
-	pqi_ofa_free_host_buffer(ctrl_info);
-	ctrl_info->pqi_mode_enabled = true;
-	pqi_save_ctrl_mode(ctrl_info, PQI_MODE);
-	ctrl_info->controller_online = true;
-	pqi_ctrl_unblock_requests(ctrl_info);
 	pqi_start_heartbeat_timer(ctrl_info);
-	pqi_schedule_update_time_worker(ctrl_info);
-	pqi_clear_soft_reset_status(ctrl_info,
-		PQI_SOFT_RESET_ABORT);
-	pqi_scan_scsi_devices(ctrl_info);
+	pqi_ctrl_unblock_requests(ctrl_info);
+	pqi_ctrl_unblock_device_reset(ctrl_info);
+	pqi_scsi_unblock_requests(ctrl_info);
+	pqi_ctrl_unblock_scan(ctrl_info);
 }
 
-static int pqi_ofa_alloc_mem(struct pqi_ctrl_info *ctrl_info,
-	u32 total_size, u32 chunk_size)
+static int pqi_ofa_alloc_mem(struct pqi_ctrl_info *ctrl_info, u32 total_size, u32 chunk_size)
 {
-	u32 sg_count;
-	u32 size;
 	int i;
-	struct pqi_sg_descriptor *mem_descriptor = NULL;
+	u32 sg_count;
 	struct device *dev;
 	struct pqi_ofa_memory *ofap;
-
-	dev = &ctrl_info->pci_dev->dev;
-
-	sg_count = (total_size + chunk_size - 1);
-	sg_count /= chunk_size;
+	struct pqi_sg_descriptor *mem_descriptor;
+	dma_addr_t dma_handle;
 
 	ofap = ctrl_info->pqi_ofa_mem_virt_addr;
 
-	if (sg_count*chunk_size < total_size)
+	sg_count = DIV_ROUND_UP(total_size, chunk_size);
+	if (sg_count == 0 || sg_count > PQI_OFA_MAX_SG_DESCRIPTORS)
 		goto out;
 
-	ctrl_info->pqi_ofa_chunk_virt_addr =
-				kcalloc(sg_count, sizeof(void *), GFP_KERNEL);
+	ctrl_info->pqi_ofa_chunk_virt_addr = kmalloc_array(sg_count, sizeof(void *), GFP_KERNEL);
 	if (!ctrl_info->pqi_ofa_chunk_virt_addr)
 		goto out;
 
-	for (size = 0, i = 0; size < total_size; size += chunk_size, i++) {
-		dma_addr_t dma_handle;
+	dev = &ctrl_info->pci_dev->dev;
 
+	for (i = 0; i < sg_count; i++) {
 		ctrl_info->pqi_ofa_chunk_virt_addr[i] =
-			dma_alloc_coherent(dev, chunk_size, &dma_handle,
-					   GFP_KERNEL);
-
+			dma_alloc_coherent(dev, chunk_size, &dma_handle, GFP_KERNEL);
 		if (!ctrl_info->pqi_ofa_chunk_virt_addr[i])
-			break;
-
+			goto out_free_chunks;
 		mem_descriptor = &ofap->sg_descriptor[i];
 		put_unaligned_le64((u64)dma_handle, &mem_descriptor->address);
 		put_unaligned_le32(chunk_size, &mem_descriptor->length);
 	}
 
-	if (!size || size < total_size)
-		goto out_free_chunks;
-
 	put_unaligned_le32(CISS_SG_LAST, &mem_descriptor->flags);
 	put_unaligned_le16(sg_count, &ofap->num_memory_descriptors);
-	put_unaligned_le32(size, &ofap->bytes_allocated);
+	put_unaligned_le32(sg_count * chunk_size, &ofap->bytes_allocated);
 
 	return 0;
 
@@ -8416,82 +8190,87 @@ static int pqi_ofa_alloc_mem(struct pqi_ctrl_info *ctrl_info,
 	while (--i >= 0) {
 		mem_descriptor = &ofap->sg_descriptor[i];
 		dma_free_coherent(dev, chunk_size,
-				ctrl_info->pqi_ofa_chunk_virt_addr[i],
-				get_unaligned_le64(&mem_descriptor->address));
+			ctrl_info->pqi_ofa_chunk_virt_addr[i],
+			get_unaligned_le64(&mem_descriptor->address));
 	}
 	kfree(ctrl_info->pqi_ofa_chunk_virt_addr);
 
 out:
-	put_unaligned_le32 (0, &ofap->bytes_allocated);
 	return -ENOMEM;
 }
 
 static int pqi_ofa_alloc_host_buffer(struct pqi_ctrl_info *ctrl_info)
 {
 	u32 total_size;
+	u32 chunk_size;
 	u32 min_chunk_size;
-	u32 chunk_sz;
 
-	total_size = le32_to_cpu(
-			ctrl_info->pqi_ofa_mem_virt_addr->bytes_allocated);
-	min_chunk_size = total_size / PQI_OFA_MAX_SG_DESCRIPTORS;
+	if (ctrl_info->ofa_bytes_requested == 0)
+		return 0;
 
-	for (chunk_sz = total_size; chunk_sz >= min_chunk_size; chunk_sz /= 2)
-		if (!pqi_ofa_alloc_mem(ctrl_info, total_size, chunk_sz))
+	total_size = PAGE_ALIGN(ctrl_info->ofa_bytes_requested);
+	min_chunk_size = DIV_ROUND_UP(total_size, PQI_OFA_MAX_SG_DESCRIPTORS);
+	min_chunk_size = PAGE_ALIGN(min_chunk_size);
+
+	for (chunk_size = total_size; chunk_size >= min_chunk_size;) {
+		if (pqi_ofa_alloc_mem(ctrl_info, total_size, chunk_size) == 0)
 			return 0;
+		chunk_size /= 2;
+		chunk_size = PAGE_ALIGN(chunk_size);
+	}
 
 	return -ENOMEM;
 }
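
The rewritten allocator is a back-off ladder: start with one chunk covering the whole request and halve the chunk size (re-aligned up to a page) on failure, stopping once a chunk would drop below total/PQI_OFA_MAX_SG_DESCRIPTORS, the point at which the descriptor list could no longer span the request. A user-space model of the size ladder; the constants are illustrative and the allocation itself is elided, since a real pass stops at the first pqi_ofa_alloc_mem() success:

#include <stdint.h>
#include <stdio.h>

#define DEMO_PAGE_SIZE	4096u
#define DEMO_MAX_SG	64u	/* stands in for PQI_OFA_MAX_SG_DESCRIPTORS */

static uint32_t page_align(uint32_t n)
{
	return (n + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);
}

int main(void)
{
	uint32_t total = page_align(10u * 1024 * 1024);	/* 10 MiB request */
	uint32_t min_chunk = page_align((total + DEMO_MAX_SG - 1) / DEMO_MAX_SG);
	uint32_t chunk;

	for (chunk = total; chunk >= min_chunk; chunk = page_align(chunk / 2)) {
		uint32_t sg_count = (total + chunk - 1) / chunk;

		printf("try %u chunk(s) of %u bytes\n", sg_count, chunk);
		/* the driver stops at the first successful allocation */
	}
	return 0;
}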
 
-static void pqi_ofa_setup_host_buffer(struct pqi_ctrl_info *ctrl_info,
-	u32 bytes_requested)
+static void pqi_ofa_setup_host_buffer(struct pqi_ctrl_info *ctrl_info)
 {
-	struct pqi_ofa_memory *pqi_ofa_memory;
 	struct device *dev;
+	struct pqi_ofa_memory *ofap;
 
 	dev = &ctrl_info->pci_dev->dev;
-	pqi_ofa_memory = dma_alloc_coherent(dev,
-					    PQI_OFA_MEMORY_DESCRIPTOR_LENGTH,
-					    &ctrl_info->pqi_ofa_mem_dma_handle,
-					    GFP_KERNEL);
 
-	if (!pqi_ofa_memory)
+	ofap = dma_alloc_coherent(dev, sizeof(*ofap),
+		&ctrl_info->pqi_ofa_mem_dma_handle, GFP_KERNEL);
+	if (!ofap)
 		return;
 
-	put_unaligned_le16(PQI_OFA_VERSION, &pqi_ofa_memory->version);
-	memcpy(&pqi_ofa_memory->signature, PQI_OFA_SIGNATURE,
-					sizeof(pqi_ofa_memory->signature));
-	pqi_ofa_memory->bytes_allocated = cpu_to_le32(bytes_requested);
-
-	ctrl_info->pqi_ofa_mem_virt_addr = pqi_ofa_memory;
+	ctrl_info->pqi_ofa_mem_virt_addr = ofap;
 
 	if (pqi_ofa_alloc_host_buffer(ctrl_info) < 0) {
-		dev_err(dev, "Failed to allocate host buffer of size = %u",
-			bytes_requested);
+		dev_err(dev,
+			"failed to allocate host buffer for Online Firmware Activation\n");
+		dma_free_coherent(dev, sizeof(*ofap), ofap, ctrl_info->pqi_ofa_mem_dma_handle);
+		ctrl_info->pqi_ofa_mem_virt_addr = NULL;
+		return;
 	}
 
-	return;
+	put_unaligned_le16(PQI_OFA_VERSION, &ofap->version);
+	memcpy(&ofap->signature, PQI_OFA_SIGNATURE, sizeof(ofap->signature));
 }
 
 static void pqi_ofa_free_host_buffer(struct pqi_ctrl_info *ctrl_info)
 {
-	int i;
-	struct pqi_sg_descriptor *mem_descriptor;
+	unsigned int i;
+	struct device *dev;
 	struct pqi_ofa_memory *ofap;
+	struct pqi_sg_descriptor *mem_descriptor;
+	unsigned int num_memory_descriptors;
 
 	ofap = ctrl_info->pqi_ofa_mem_virt_addr;
-
 	if (!ofap)
 		return;
 
-	if (!ofap->bytes_allocated)
+	dev = &ctrl_info->pci_dev->dev;
+
+	if (get_unaligned_le32(&ofap->bytes_allocated) == 0)
 		goto out;
 
 	mem_descriptor = ofap->sg_descriptor;
+	num_memory_descriptors =
+		get_unaligned_le16(&ofap->num_memory_descriptors);
 
-	for (i = 0; i < get_unaligned_le16(&ofap->num_memory_descriptors);
-		i++) {
-		dma_free_coherent(&ctrl_info->pci_dev->dev,
+	for (i = 0; i < num_memory_descriptors; i++) {
+		dma_free_coherent(dev,
 			get_unaligned_le32(&mem_descriptor[i].length),
 			ctrl_info->pqi_ofa_chunk_virt_addr[i],
 			get_unaligned_le64(&mem_descriptor[i].address));
@@ -8499,47 +8278,46 @@ static void pqi_ofa_free_host_buffer(struct pqi_ctrl_info *ctrl_info)
 	kfree(ctrl_info->pqi_ofa_chunk_virt_addr);
 
 out:
-	dma_free_coherent(&ctrl_info->pci_dev->dev,
-			PQI_OFA_MEMORY_DESCRIPTOR_LENGTH, ofap,
-			ctrl_info->pqi_ofa_mem_dma_handle);
+	dma_free_coherent(dev, sizeof(*ofap), ofap,
+		ctrl_info->pqi_ofa_mem_dma_handle);
 	ctrl_info->pqi_ofa_mem_virt_addr = NULL;
 }
 
 static int pqi_ofa_host_memory_update(struct pqi_ctrl_info *ctrl_info)
 {
+	u32 buffer_length;
 	struct pqi_vendor_general_request request;
-	size_t size;
 	struct pqi_ofa_memory *ofap;
 
 	memset(&request, 0, sizeof(request));
 
-	ofap = ctrl_info->pqi_ofa_mem_virt_addr;
-
 	request.header.iu_type = PQI_REQUEST_IU_VENDOR_GENERAL;
 	put_unaligned_le16(sizeof(request) - PQI_REQUEST_HEADER_LENGTH,
 		&request.header.iu_length);
 	put_unaligned_le16(PQI_VENDOR_GENERAL_HOST_MEMORY_UPDATE,
 		&request.function_code);
 
+	ofap = ctrl_info->pqi_ofa_mem_virt_addr;
+
 	if (ofap) {
-		size = offsetof(struct pqi_ofa_memory, sg_descriptor) +
+		buffer_length = offsetof(struct pqi_ofa_memory, sg_descriptor) +
 			get_unaligned_le16(&ofap->num_memory_descriptors) *
 			sizeof(struct pqi_sg_descriptor);
 
 		put_unaligned_le64((u64)ctrl_info->pqi_ofa_mem_dma_handle,
 			&request.data.ofa_memory_allocation.buffer_address);
-		put_unaligned_le32(size,
+		put_unaligned_le32(buffer_length,
 			&request.data.ofa_memory_allocation.buffer_length);
-
 	}
 
 	return pqi_submit_raid_request_synchronous(ctrl_info, &request.header,
-		0, NULL, NO_TIMEOUT);
+		0, NULL);
 }
 
-static int pqi_ofa_ctrl_restart(struct pqi_ctrl_info *ctrl_info)
+static int pqi_ofa_ctrl_restart(struct pqi_ctrl_info *ctrl_info, unsigned int delay_secs)
 {
-	msleep(PQI_POST_RESET_DELAY_B4_MSGU_READY);
+	ssleep(delay_secs);
+
 	return pqi_ctrl_init_resume(ctrl_info);
 }
 
@@ -8597,7 +8375,6 @@ static void pqi_take_ctrl_offline_deferred(struct pqi_ctrl_info *ctrl_info)
 	pqi_cancel_update_time_worker(ctrl_info);
 	pqi_ctrl_wait_until_quiesced(ctrl_info);
 	pqi_fail_all_outstanding_requests(ctrl_info);
-	pqi_clear_all_queued_raid_bypass_retries(ctrl_info);
 	pqi_ctrl_unblock_requests(ctrl_info);
 }
 
@@ -8730,24 +8507,12 @@ static void pqi_shutdown(struct pci_dev *pci_dev)
 		return;
 	}
 
-	pqi_disable_events(ctrl_info);
 	pqi_wait_until_ofa_finished(ctrl_info);
-	pqi_cancel_update_time_worker(ctrl_info);
-	pqi_cancel_rescan_worker(ctrl_info);
-	pqi_cancel_event_worker(ctrl_info);
-
-	pqi_ctrl_shutdown_start(ctrl_info);
-	pqi_ctrl_wait_until_quiesced(ctrl_info);
-
-	rc = pqi_ctrl_wait_for_pending_io(ctrl_info, NO_TIMEOUT);
-	if (rc) {
-		dev_err(&pci_dev->dev,
-			"wait for pending I/O failed\n");
-		return;
-	}
 
+	pqi_scsi_block_requests(ctrl_info);
 	pqi_ctrl_block_device_reset(ctrl_info);
-	pqi_wait_until_lun_reset_finished(ctrl_info);
+	pqi_ctrl_block_requests(ctrl_info);
+	pqi_ctrl_wait_until_quiesced(ctrl_info);
 
 	/*
 	 * Write all data in the controller's battery-backed cache to
@@ -8758,15 +8523,6 @@ static void pqi_shutdown(struct pci_dev *pci_dev)
 		dev_err(&pci_dev->dev,
 			"unable to flush controller cache\n");
 
-	pqi_ctrl_block_requests(ctrl_info);
-
-	rc = pqi_ctrl_wait_for_pending_sync_cmds(ctrl_info);
-	if (rc) {
-		dev_err(&pci_dev->dev,
-			"wait for pending sync cmds failed\n");
-		return;
-	}
-
 	pqi_crash_if_pending_command(ctrl_info);
 	pqi_reset(ctrl_info);
 }
@@ -8801,19 +8557,18 @@ static __maybe_unused int pqi_suspend(struct pci_dev *pci_dev, pm_message_t stat
 
 	ctrl_info = pci_get_drvdata(pci_dev);
 
-	pqi_disable_events(ctrl_info);
-	pqi_cancel_update_time_worker(ctrl_info);
-	pqi_cancel_rescan_worker(ctrl_info);
-	pqi_wait_until_scan_finished(ctrl_info);
-	pqi_wait_until_lun_reset_finished(ctrl_info);
 	pqi_wait_until_ofa_finished(ctrl_info);
-	pqi_flush_cache(ctrl_info, SUSPEND);
+
+	pqi_ctrl_block_scan(ctrl_info);
+	pqi_scsi_block_requests(ctrl_info);
+	pqi_ctrl_block_device_reset(ctrl_info);
 	pqi_ctrl_block_requests(ctrl_info);
 	pqi_ctrl_wait_until_quiesced(ctrl_info);
-	pqi_wait_until_inbound_queues_empty(ctrl_info);
-	pqi_ctrl_wait_for_pending_io(ctrl_info, NO_TIMEOUT);
+	pqi_flush_cache(ctrl_info, SUSPEND);
 	pqi_stop_heartbeat_timer(ctrl_info);
 
+	pqi_crash_if_pending_command(ctrl_info);
+
 	if (state.event == PM_EVENT_FREEZE)
 		return 0;
 
@@ -8846,8 +8601,10 @@ static __maybe_unused int pqi_resume(struct pci_dev *pci_dev)
 				pci_dev->irq, rc);
 			return rc;
 		}
-		pqi_start_heartbeat_timer(ctrl_info);
+		pqi_ctrl_unblock_device_reset(ctrl_info);
 		pqi_ctrl_unblock_requests(ctrl_info);
+		pqi_scsi_unblock_requests(ctrl_info);
+		pqi_ctrl_unblock_scan(ctrl_info);
 		return 0;
 	}
 
@@ -9288,7 +9045,7 @@ static void __attribute__((unused)) verify_structures(void)
 	BUILD_BUG_ON(offsetof(struct pqi_iu_header,
 		response_queue_id) != 0x4);
 	BUILD_BUG_ON(offsetof(struct pqi_iu_header,
-		work_area) != 0x6);
+		driver_flags) != 0x6);
 	BUILD_BUG_ON(sizeof(struct pqi_iu_header) != 0x8);
 
 	BUILD_BUG_ON(offsetof(struct pqi_aio_error_info,
@@ -9386,7 +9143,7 @@ static void __attribute__((unused)) verify_structures(void)
 	BUILD_BUG_ON(offsetof(struct pqi_general_admin_request,
 		header.iu_length) != 2);
 	BUILD_BUG_ON(offsetof(struct pqi_general_admin_request,
-		header.work_area) != 6);
+		header.driver_flags) != 6);
 	BUILD_BUG_ON(offsetof(struct pqi_general_admin_request,
 		request_id) != 8);
 	BUILD_BUG_ON(offsetof(struct pqi_general_admin_request,
@@ -9442,7 +9199,7 @@ static void __attribute__((unused)) verify_structures(void)
 	BUILD_BUG_ON(offsetof(struct pqi_general_admin_response,
 		header.iu_length) != 2);
 	BUILD_BUG_ON(offsetof(struct pqi_general_admin_response,
-		header.work_area) != 6);
+		header.driver_flags) != 6);
 	BUILD_BUG_ON(offsetof(struct pqi_general_admin_response,
 		request_id) != 8);
 	BUILD_BUG_ON(offsetof(struct pqi_general_admin_response,
@@ -9466,7 +9223,7 @@ static void __attribute__((unused)) verify_structures(void)
 	BUILD_BUG_ON(offsetof(struct pqi_raid_path_request,
 		header.response_queue_id) != 4);
 	BUILD_BUG_ON(offsetof(struct pqi_raid_path_request,
-		header.work_area) != 6);
+		header.driver_flags) != 6);
 	BUILD_BUG_ON(offsetof(struct pqi_raid_path_request,
 		request_id) != 8);
 	BUILD_BUG_ON(offsetof(struct pqi_raid_path_request,
@@ -9495,7 +9252,7 @@ static void __attribute__((unused)) verify_structures(void)
 	BUILD_BUG_ON(offsetof(struct pqi_aio_path_request,
 		header.response_queue_id) != 4);
 	BUILD_BUG_ON(offsetof(struct pqi_aio_path_request,
-		header.work_area) != 6);
+		header.driver_flags) != 6);
 	BUILD_BUG_ON(offsetof(struct pqi_aio_path_request,
 		request_id) != 8);
 	BUILD_BUG_ON(offsetof(struct pqi_aio_path_request,


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (13 preceding siblings ...)
  2020-12-10 20:35 ` [PATCH V3 14/25] smartpqi: fix driver synchronization issues Don Brace
@ 2020-12-10 20:35 ` Don Brace
  2020-12-14 17:54   ` Paul Menzel
  2020-12-10 20:35 ` [PATCH V3 16/25] smartpqi: convert snprintf to scnprintf Don Brace
                   ` (10 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:35 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>

* Correct the SCSI mid-layer sending more requests than the
  exposed host queue depth, which causes a firmware ASSERT.
  * Add a host-wide queue-depth counter (sketched below).
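
In outline, the gate added to pqi_scsi_queue_command() works as in
this sketch (simplified from the hunks below; the per-device counter
and error paths are omitted):

    /* Count every command against the host-wide limit. */
    if (atomic_inc_return(&ctrl_info->total_scmds_outstanding) >
        ctrl_info->scsi_ml_can_queue) {
        atomic_dec(&ctrl_info->total_scmds_outstanding);
        return SCSI_MLQUEUE_HOST_BUSY; /* mid-layer retries later */
    }
    /* ... submit the command; pqi_prep_for_scsi_done() decrements
       the counter on completion. */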

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h      |    2 ++
 drivers/scsi/smartpqi/smartpqi_init.c |   19 ++++++++++++++++---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index 0b94c755a74c..c3b103b15924 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -1345,6 +1345,8 @@ struct pqi_ctrl_info {
 	struct work_struct ofa_quiesce_work;
 	u32		ofa_bytes_requested;
 	u16		ofa_cancel_reason;
+
+	atomic_t	total_scmds_outstanding;
 };
 
 enum pqi_ctrl_mode {
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 082b17e9bd80..4e088f47d95f 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -5578,6 +5578,8 @@ static inline bool pqi_is_bypass_eligible_request(struct scsi_cmnd *scmd)
 void pqi_prep_for_scsi_done(struct scsi_cmnd *scmd)
 {
 	struct pqi_scsi_dev *device;
+	struct pqi_ctrl_info *ctrl_info;
+	struct Scsi_Host *shost;
 
 	if (!scmd->device) {
 		set_host_byte(scmd, DID_NO_CONNECT);
@@ -5590,7 +5592,11 @@ void pqi_prep_for_scsi_done(struct scsi_cmnd *scmd)
 		return;
 	}
 
+	shost = scmd->device->host;
+	ctrl_info = shost_to_hba(shost);
+
 	atomic_dec(&device->scsi_cmds_outstanding);
+	atomic_dec(&ctrl_info->total_scmds_outstanding);
 }
 
 static bool pqi_is_parity_write_stream(struct pqi_ctrl_info *ctrl_info,
@@ -5678,6 +5684,7 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
 	bool raid_bypassed;
 
 	device = scmd->device->hostdata;
+	ctrl_info = shost_to_hba(shost);
 
 	if (!device) {
 		set_host_byte(scmd, DID_NO_CONNECT);
@@ -5686,8 +5693,11 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
 	}
 
 	atomic_inc(&device->scsi_cmds_outstanding);
-
-	ctrl_info = shost_to_hba(shost);
+	if (atomic_inc_return(&ctrl_info->total_scmds_outstanding) >
+		ctrl_info->scsi_ml_can_queue) {
+		rc = SCSI_MLQUEUE_HOST_BUSY;
+		goto out;
+	}
 
 	if (pqi_ctrl_offline(ctrl_info) || pqi_device_in_remove(device)) {
 		set_host_byte(scmd, DID_NO_CONNECT);
@@ -5730,8 +5740,10 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
 	}
 
 out:
-	if (rc)
+	if (rc) {
 		atomic_dec(&device->scsi_cmds_outstanding);
+		atomic_dec(&ctrl_info->total_scmds_outstanding);
+	}
 
 	return rc;
 }
@@ -8054,6 +8066,7 @@ static struct pqi_ctrl_info *pqi_alloc_ctrl_info(int numa_node)
 
 	INIT_WORK(&ctrl_info->event_work, pqi_event_worker);
 	atomic_set(&ctrl_info->num_interrupts, 0);
+	atomic_set(&ctrl_info->total_scmds_outstanding, 0);
 
 	INIT_DELAYED_WORK(&ctrl_info->rescan_work, pqi_rescan_worker);
 	INIT_DELAYED_WORK(&ctrl_info->update_time_work, pqi_update_time_worker);


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 16/25] smartpqi: convert snprintf to scnprintf
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (14 preceding siblings ...)
  2020-12-10 20:35 ` [PATCH V3 15/25] smartpqi: fix host qdepth limit Don Brace
@ 2020-12-10 20:35 ` Don Brace
  2021-01-07 23:51   ` Martin Wilck
  2020-12-10 20:35 ` [PATCH V3 17/25] smartpqi: change timing of release of QRM memory during OFA Don Brace
                   ` (9 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:35 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Kevin Barnett <kevin.barnett@microchip.com>

The entire Linux kernel has been slowly migrating from snprintf
to scnprintf, so we are doing our part. This article explains
the rationale for this change:
    https://lwn.net/Articles/69419/
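
The difference matters only when output is truncated; a minimal
illustration (buffer size and string are made up):

    char buf[8];
    int n;

    n = snprintf(buf, sizeof(buf), "%s", "0123456789");
    /* n == 10: the length that *would* have been written, so using
       n to advance a cursor can run past the buffer. */

    n = scnprintf(buf, sizeof(buf), "%s", "0123456789");
    /* n == 7: the number of characters actually stored (excluding
       the trailing NUL), safe for cursor arithmetic. */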

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c |   23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 4e088f47d95f..456ea8732312 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -1750,7 +1750,7 @@ static void pqi_dev_info(struct pqi_ctrl_info *ctrl_info,
 	ssize_t count;
 	char buffer[PQI_DEV_INFO_BUFFER_LENGTH];
 
-	count = snprintf(buffer, PQI_DEV_INFO_BUFFER_LENGTH,
+	count = scnprintf(buffer, PQI_DEV_INFO_BUFFER_LENGTH,
 		"%d:%d:", ctrl_info->scsi_host->host_no, device->bus);
 
 	if (device->target_lun_valid)
@@ -6405,14 +6405,13 @@ static ssize_t pqi_firmware_version_show(struct device *dev,
 	shost = class_to_shost(dev);
 	ctrl_info = shost_to_hba(shost);
 
-	return snprintf(buffer, PAGE_SIZE, "%s\n", ctrl_info->firmware_version);
+	return scnprintf(buffer, PAGE_SIZE, "%s\n", ctrl_info->firmware_version);
 }
 
 static ssize_t pqi_driver_version_show(struct device *dev,
 	struct device_attribute *attr, char *buffer)
 {
-	return snprintf(buffer, PAGE_SIZE, "%s\n",
-			DRIVER_VERSION BUILD_TIMESTAMP);
+	return scnprintf(buffer, PAGE_SIZE, "%s\n", DRIVER_VERSION BUILD_TIMESTAMP);
 }
 
 static ssize_t pqi_serial_number_show(struct device *dev,
@@ -6424,7 +6423,7 @@ static ssize_t pqi_serial_number_show(struct device *dev,
 	shost = class_to_shost(dev);
 	ctrl_info = shost_to_hba(shost);
 
-	return snprintf(buffer, PAGE_SIZE, "%s\n", ctrl_info->serial_number);
+	return scnprintf(buffer, PAGE_SIZE, "%s\n", ctrl_info->serial_number);
 }
 
 static ssize_t pqi_model_show(struct device *dev,
@@ -6436,7 +6435,7 @@ static ssize_t pqi_model_show(struct device *dev,
 	shost = class_to_shost(dev);
 	ctrl_info = shost_to_hba(shost);
 
-	return snprintf(buffer, PAGE_SIZE, "%s\n", ctrl_info->model);
+	return scnprintf(buffer, PAGE_SIZE, "%s\n", ctrl_info->model);
 }
 
 static ssize_t pqi_vendor_show(struct device *dev,
@@ -6448,7 +6447,7 @@ static ssize_t pqi_vendor_show(struct device *dev,
 	shost = class_to_shost(dev);
 	ctrl_info = shost_to_hba(shost);
 
-	return snprintf(buffer, PAGE_SIZE, "%s\n", ctrl_info->vendor);
+	return scnprintf(buffer, PAGE_SIZE, "%s\n", ctrl_info->vendor);
 }
 
 static ssize_t pqi_host_rescan_store(struct device *dev,
@@ -6642,7 +6641,7 @@ static ssize_t pqi_unique_id_show(struct device *dev,
 
 	spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock, flags);
 
-	return snprintf(buffer, PAGE_SIZE,
+	return scnprintf(buffer, PAGE_SIZE,
 		"%02X%02X%02X%02X%02X%02X%02X%02X"
 		"%02X%02X%02X%02X%02X%02X%02X%02X\n",
 		unique_id[0], unique_id[1], unique_id[2], unique_id[3],
@@ -6675,7 +6674,7 @@ static ssize_t pqi_lunid_show(struct device *dev,
 
 	spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock, flags);
 
-	return snprintf(buffer, PAGE_SIZE, "0x%8phN\n", lunid);
+	return scnprintf(buffer, PAGE_SIZE, "0x%8phN\n", lunid);
 }
 
 #define MAX_PATHS	8
@@ -6787,7 +6786,7 @@ static ssize_t pqi_sas_address_show(struct device *dev,
 
 	spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock, flags);
 
-	return snprintf(buffer, PAGE_SIZE, "0x%016llx\n", sas_address);
+	return scnprintf(buffer, PAGE_SIZE, "0x%016llx\n", sas_address);
 }
 
 static ssize_t pqi_ssd_smart_path_enabled_show(struct device *dev,
@@ -6845,7 +6844,7 @@ static ssize_t pqi_raid_level_show(struct device *dev,
 
 	spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock, flags);
 
-	return snprintf(buffer, PAGE_SIZE, "%s\n", raid_level);
+	return scnprintf(buffer, PAGE_SIZE, "%s\n", raid_level);
 }
 
 static ssize_t pqi_raid_bypass_cnt_show(struct device *dev,
@@ -6872,7 +6871,7 @@ static ssize_t pqi_raid_bypass_cnt_show(struct device *dev,
 
 	spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock, flags);
 
-	return snprintf(buffer, PAGE_SIZE, "0x%x\n", raid_bypass_cnt);
+	return scnprintf(buffer, PAGE_SIZE, "0x%x\n", raid_bypass_cnt);
 }
 
 static DEVICE_ATTR(lunid, 0444, pqi_lunid_show, NULL);


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 17/25] smartpqi: change timing of release of QRM memory during OFA
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (15 preceding siblings ...)
  2020-12-10 20:35 ` [PATCH V3 16/25] smartpqi: convert snprintf to scnprintf Don Brace
@ 2020-12-10 20:35 ` Don Brace
  2021-01-08  0:14   ` Martin Wilck
  2020-12-10 20:36 ` [PATCH V3 18/25] smartpqi: return busy indication for IOCTLs when ofa is active Don Brace
                   ` (8 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:35 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Kevin Barnett <kevin.barnett@microchip.com>

* Release QRM memory (OFA buffer) on OFA error conditions.
* Without this change, the controller is left in a bad state, which
    can cause a kernel panic upon reboot after an unsuccessful OFA.
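
Condensed, the change moves the free from before the switch onto
every exit path of the soft-reset handling (a sketch of the resulting
shape, not the full function):

    switch (reset_status) {
    case RESET_INITIATE_DRIVER:
        /* ... restart the controller ... */
        pqi_ofa_free_host_buffer(ctrl_info);    /* moved here */
        pqi_ctrl_ofa_done(ctrl_info);
        break;
    case RESET_ABORT:
        pqi_ofa_free_host_buffer(ctrl_info);    /* added */
        pqi_ctrl_ofa_done(ctrl_info);
        pqi_ofa_ctrl_unquiesce(ctrl_info);
        break;
    default:    /* unexpected reset status */
        pqi_ofa_free_host_buffer(ctrl_info);    /* added */
        pqi_ctrl_ofa_done(ctrl_info);
        pqi_ofa_ctrl_unquiesce(ctrl_info);
        pqi_take_ctrl_offline(ctrl_info);
        break;
    }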

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 456ea8732312..552072812771 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -3305,8 +3305,6 @@ static void pqi_process_soft_reset(struct pqi_ctrl_info *ctrl_info)
 	else
 		reset_status = RESET_INITIATE_FIRMWARE;
 
-	pqi_ofa_free_host_buffer(ctrl_info);
-
 	delay_secs = PQI_POST_RESET_DELAY_SECS;
 
 	switch (reset_status) {
@@ -3322,6 +3320,7 @@ static void pqi_process_soft_reset(struct pqi_ctrl_info *ctrl_info)
 		ctrl_info->pqi_mode_enabled = false;
 		pqi_save_ctrl_mode(ctrl_info, SIS_MODE);
 		rc = pqi_ofa_ctrl_restart(ctrl_info, delay_secs);
+		pqi_ofa_free_host_buffer(ctrl_info);
 		pqi_ctrl_ofa_done(ctrl_info);
 		dev_info(&ctrl_info->pci_dev->dev,
 				"Online Firmware Activation: %s\n",
@@ -3332,6 +3331,7 @@ static void pqi_process_soft_reset(struct pqi_ctrl_info *ctrl_info)
 				"Online Firmware Activation ABORTED\n");
 		if (ctrl_info->soft_reset_handshake_supported)
 			pqi_clear_soft_reset_status(ctrl_info);
+		pqi_ofa_free_host_buffer(ctrl_info);
 		pqi_ctrl_ofa_done(ctrl_info);
 		pqi_ofa_ctrl_unquiesce(ctrl_info);
 		break;
@@ -3341,6 +3341,7 @@ static void pqi_process_soft_reset(struct pqi_ctrl_info *ctrl_info)
 		dev_err(&ctrl_info->pci_dev->dev,
 			"unexpected Online Firmware Activation reset status: 0x%x\n",
 			reset_status);
+		pqi_ofa_free_host_buffer(ctrl_info);
 		pqi_ctrl_ofa_done(ctrl_info);
 		pqi_ofa_ctrl_unquiesce(ctrl_info);
 		pqi_take_ctrl_offline(ctrl_info);


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 18/25] smartpqi: return busy indication for IOCTLs when ofa is active
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (16 preceding siblings ...)
  2020-12-10 20:35 ` [PATCH V3 17/25] smartpqi: change timing of release of QRM memory during OFA Don Brace
@ 2020-12-10 20:36 ` Don Brace
  2020-12-10 20:36 ` [PATCH V3 19/25] smartpqi: add phy id support for the physical drives Don Brace
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:36 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Kevin Barnett <kevin.barnett@microchip.com>

* Prevent kernel crashes when issuing ioctls during OFA.
* Before this fix, the driver returned a busy indication for
    pass-through IOCTLs throughout all stages of OFA; with this fix it
    does so only while the controller is actually blocked.
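
The fix reduces to a two-line guard near the top of
pqi_passthru_ioctl(), shown in context in the diff below:

    /* Reject pass-throughs only while OFA has the controller
       blocked, instead of during all OFA stages. */
    if (pqi_ofa_in_progress(ctrl_info) && pqi_ctrl_blocked(ctrl_info))
        return -EBUSY;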

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 552072812771..f6bc7d9850e0 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -6238,6 +6238,8 @@ static int pqi_passthru_ioctl(struct pqi_ctrl_info *ctrl_info, void __user *arg)
 
 	if (pqi_ctrl_offline(ctrl_info))
 		return -ENXIO;
+	if (pqi_ofa_in_progress(ctrl_info) && pqi_ctrl_blocked(ctrl_info))
+		return -EBUSY;
 	if (!arg)
 		return -EINVAL;
 	if (!capable(CAP_SYS_RAWIO))


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 19/25] smartpqi: add phy id support for the physical drives
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (17 preceding siblings ...)
  2020-12-10 20:36 ` [PATCH V3 18/25] smartpqi: return busy indication for IOCTLs when ofa is active Don Brace
@ 2020-12-10 20:36 ` Don Brace
  2021-01-08  0:03   ` Martin Wilck
  2020-12-10 20:36 ` [PATCH V3 20/25] smartpqi: update sas initiator_port_protocols and target_port_protocols Don Brace
                   ` (6 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:36 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Murthy Bhat <Murthy.Bhat@microchip.com>

* Display topology using PHY numbers.
* PHY (both local and remote) numbers corresponding to physical drives
    are read from BMIC_IDENTIFY_PHYSICAL_DEVICE.
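
The selection logic, extracted from the hunk below: use the
firmware-supplied PHY map when BMIC advertises it, otherwise flag the
PHY id as unknown.

    if ((id_phys->even_more_flags & PQI_DEVICE_PHY_MAP_SUPPORTED) &&
        id_phys->phy_count)
        device->phy_id =
            id_phys->phy_to_phy_map[device->active_path_index];
    else
        device->phy_id = 0xFF;  /* no PHY map: unknown PHY */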

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi.h               |    1 +
 drivers/scsi/smartpqi/smartpqi_init.c          |   10 ++++++++++
 drivers/scsi/smartpqi/smartpqi_sas_transport.c |    1 +
 3 files changed, 12 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
index c3b103b15924..8220957bc69b 100644
--- a/drivers/scsi/smartpqi/smartpqi.h
+++ b/drivers/scsi/smartpqi/smartpqi.h
@@ -1089,6 +1089,7 @@ struct pqi_scsi_dev {
 	u8	phy_connected_dev_type;
 	u8	box[8];
 	u16	phys_connector[8];
+	u8	phy_id;
 	bool	raid_bypass_configured;	/* RAID bypass configured */
 	bool	raid_bypass_enabled;	/* RAID bypass enabled */
 	u32	next_bypass_group;
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index f6bc7d9850e0..6b624413c8e6 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -1435,6 +1435,8 @@ static void pqi_get_volume_status(struct pqi_ctrl_info *ctrl_info,
 	device->volume_offline = volume_offline;
 }
 
+#define PQI_DEVICE_PHY_MAP_SUPPORTED	0x10
+
 static int pqi_get_physical_device_info(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_scsi_dev *device,
 	struct bmic_identify_physical_device *id_phys)
@@ -1474,6 +1476,13 @@ static int pqi_get_physical_device_info(struct pqi_ctrl_info *ctrl_info,
 	memcpy(&device->page_83_identifier, &id_phys->page_83_identifier,
 		sizeof(device->page_83_identifier));
 
+	if ((id_phys->even_more_flags & PQI_DEVICE_PHY_MAP_SUPPORTED) &&
+		id_phys->phy_count)
+		device->phy_id =
+			id_phys->phy_to_phy_map[device->active_path_index];
+	else
+		device->phy_id = 0xFF;
+
 	return 0;
 }
 
@@ -1840,6 +1849,7 @@ static void pqi_scsi_update_device(struct pqi_scsi_dev *existing_device,
 	existing_device->aio_handle = new_device->aio_handle;
 	existing_device->volume_status = new_device->volume_status;
 	existing_device->active_path_index = new_device->active_path_index;
+	existing_device->phy_id = new_device->phy_id;
 	existing_device->path_map = new_device->path_map;
 	existing_device->bay = new_device->bay;
 	existing_device->box_index = new_device->box_index;
diff --git a/drivers/scsi/smartpqi/smartpqi_sas_transport.c b/drivers/scsi/smartpqi/smartpqi_sas_transport.c
index 77923c6ec2c6..71e83d5fdd02 100644
--- a/drivers/scsi/smartpqi/smartpqi_sas_transport.c
+++ b/drivers/scsi/smartpqi/smartpqi_sas_transport.c
@@ -92,6 +92,7 @@ static int pqi_sas_port_add_rphy(struct pqi_sas_port *pqi_sas_port,
 
 	identify = &rphy->identify;
 	identify->sas_address = pqi_sas_port->sas_address;
+	identify->phy_identifier = pqi_sas_port->device->phy_id;
 
 	if (pqi_sas_port->device &&
 		pqi_sas_port->device->is_expander_smp_device) {


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 20/25] smartpqi: update sas initiator_port_protocols and target_port_protocols
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (18 preceding siblings ...)
  2020-12-10 20:36 ` [PATCH V3 19/25] smartpqi: add phy id support for the physical drives Don Brace
@ 2020-12-10 20:36 ` Don Brace
  2021-01-08  0:12   ` Martin Wilck
  2020-12-10 20:36 ` [PATCH V3 21/25] smartpqi: add additional logging for LUN resets Don Brace
                   ` (5 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:36 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Murthy Bhat <Murthy.Bhat@microchip.com>

* Export valid sas initiator_port_protocols and
  target_port_protocols to sysfs.
  * lsscsi now shows correct values.
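
Condensed from the hunk below, the reported target protocol now
follows the device type ("target" here is shorthand for the
identify->target_port_protocols assignment; the initiator side always
reports SAS_PROTOCOL_ALL):

    switch (device->device_type) {
    case SA_DEVICE_TYPE_SAS:
    case SA_DEVICE_TYPE_SES:
    case SA_DEVICE_TYPE_NVME:
        target = SAS_PROTOCOL_SSP;
        break;
    case SA_DEVICE_TYPE_EXPANDER_SMP:
        target = SAS_PROTOCOL_SMP;
        break;
    case SA_DEVICE_TYPE_SATA:
    default:
        target = SAS_PROTOCOL_STP;  /* the pre-set default */
        break;
    }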

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_sas_transport.c |   28 ++++++++++++++++--------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_sas_transport.c b/drivers/scsi/smartpqi/smartpqi_sas_transport.c
index 71e83d5fdd02..dd9b784792ef 100644
--- a/drivers/scsi/smartpqi/smartpqi_sas_transport.c
+++ b/drivers/scsi/smartpqi/smartpqi_sas_transport.c
@@ -65,8 +65,8 @@ static int pqi_sas_port_add_phy(struct pqi_sas_phy *pqi_sas_phy)
 	memset(identify, 0, sizeof(*identify));
 	identify->sas_address = pqi_sas_port->sas_address;
 	identify->device_type = SAS_END_DEVICE;
-	identify->initiator_port_protocols = SAS_PROTOCOL_STP;
-	identify->target_port_protocols = SAS_PROTOCOL_STP;
+	identify->initiator_port_protocols = SAS_PROTOCOL_ALL;
+	identify->target_port_protocols = SAS_PROTOCOL_ALL;
 	phy->minimum_linkrate_hw = SAS_LINK_RATE_UNKNOWN;
 	phy->maximum_linkrate_hw = SAS_LINK_RATE_UNKNOWN;
 	phy->minimum_linkrate = SAS_LINK_RATE_UNKNOWN;
@@ -94,13 +94,23 @@ static int pqi_sas_port_add_rphy(struct pqi_sas_port *pqi_sas_port,
 	identify->sas_address = pqi_sas_port->sas_address;
 	identify->phy_identifier = pqi_sas_port->device->phy_id;
 
-	if (pqi_sas_port->device &&
-		pqi_sas_port->device->is_expander_smp_device) {
-		identify->initiator_port_protocols = SAS_PROTOCOL_SMP;
-		identify->target_port_protocols = SAS_PROTOCOL_SMP;
-	} else {
-		identify->initiator_port_protocols = SAS_PROTOCOL_STP;
-		identify->target_port_protocols = SAS_PROTOCOL_STP;
+	identify->initiator_port_protocols = SAS_PROTOCOL_ALL;
+	identify->target_port_protocols = SAS_PROTOCOL_STP;
+
+	if (pqi_sas_port->device) {
+		switch (pqi_sas_port->device->device_type) {
+		case SA_DEVICE_TYPE_SAS:
+		case SA_DEVICE_TYPE_SES:
+		case SA_DEVICE_TYPE_NVME:
+			identify->target_port_protocols = SAS_PROTOCOL_SSP;
+			break;
+		case SA_DEVICE_TYPE_EXPANDER_SMP:
+			identify->target_port_protocols = SAS_PROTOCOL_SMP;
+			break;
+		case SA_DEVICE_TYPE_SATA:
+		default:
+			break;
+		}
 	}
 
 	return sas_rphy_add(rphy);


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 21/25] smartpqi: add additional logging for LUN resets
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (19 preceding siblings ...)
  2020-12-10 20:36 ` [PATCH V3 20/25] smartpqi: update sas initiator_port_protocols and target_port_protocols Don Brace
@ 2020-12-10 20:36 ` Don Brace
  2021-01-08  0:27   ` Martin Wilck
  2020-12-10 20:36 ` [PATCH V3 22/25] smartpqi: update enclosure identifier in sysfs Don Brace
                   ` (4 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:36 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Kevin Barnett <kevin.barnett@microchip.com>

* Add additional logging to help in debugging issues
  with LUN resets.
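
The hunks below share one pattern: poll for completion, warn every
few seconds, and re-arm the warning deadline after each message.
Reduced to a sketch (WARN_SECS and condition_met() are placeholders):

    unsigned long start_jiffies, warning_timeout;

    start_jiffies = jiffies;
    warning_timeout = (WARN_SECS * PQI_HZ) + start_jiffies;

    while (!condition_met()) {
        if (time_after(jiffies, warning_timeout)) {
            dev_warn(&ctrl_info->pci_dev->dev,
                "waiting %u seconds\n",
                jiffies_to_msecs(jiffies - start_jiffies) / 1000);
            warning_timeout = (WARN_SECS * PQI_HZ) + jiffies;
        }
        usleep_range(1000, 2000);
    }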

Reviewed-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c |  125 +++++++++++++++++++++++----------
 1 file changed, 89 insertions(+), 36 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 6b624413c8e6..1c51a59f1da6 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -84,7 +84,7 @@ static void pqi_ofa_setup_host_buffer(struct pqi_ctrl_info *ctrl_info);
 static void pqi_ofa_free_host_buffer(struct pqi_ctrl_info *ctrl_info);
 static int pqi_ofa_host_memory_update(struct pqi_ctrl_info *ctrl_info);
 static int pqi_device_wait_for_pending_io(struct pqi_ctrl_info *ctrl_info,
-	struct pqi_scsi_dev *device, unsigned long timeout_secs);
+	struct pqi_scsi_dev *device, unsigned long timeout_msecs);
 
 /* for flags argument to pqi_submit_raid_request_synchronous() */
 #define PQI_SYNC_FLAGS_INTERRUPTABLE	0x1
@@ -335,11 +335,34 @@ static void pqi_wait_if_ctrl_blocked(struct pqi_ctrl_info *ctrl_info)
 	atomic_dec(&ctrl_info->num_blocked_threads);
 }
 
+#define PQI_QUIESE_WARNING_TIMEOUT_SECS		10
+
 static inline void pqi_ctrl_wait_until_quiesced(struct pqi_ctrl_info *ctrl_info)
 {
+	unsigned long start_jiffies;
+	unsigned long warning_timeout;
+	bool displayed_warning;
+
+	displayed_warning = false;
+	start_jiffies = jiffies;
+	warning_timeout = (PQI_QUIESE_WARNING_TIMEOUT_SECS * PQI_HZ) + start_jiffies;
+
 	while (atomic_read(&ctrl_info->num_busy_threads) >
-		atomic_read(&ctrl_info->num_blocked_threads))
+		atomic_read(&ctrl_info->num_blocked_threads)) {
+		if (time_after(jiffies, warning_timeout)) {
+			dev_warn(&ctrl_info->pci_dev->dev,
+				"waiting %u seconds for driver activity to quiesce\n",
+				jiffies_to_msecs(jiffies - start_jiffies) / 1000);
+			displayed_warning = true;
+			warning_timeout = (PQI_QUIESE_WARNING_TIMEOUT_SECS * PQI_HZ) + jiffies;
+		}
 		usleep_range(1000, 2000);
+	}
+
+	if (displayed_warning)
+		dev_warn(&ctrl_info->pci_dev->dev,
+			"driver activity quiesced after waiting for %u seconds\n",
+			jiffies_to_msecs(jiffies - start_jiffies) / 1000);
 }
 
 static inline bool pqi_device_offline(struct pqi_scsi_dev *device)
@@ -1670,7 +1693,7 @@ static int pqi_add_device(struct pqi_ctrl_info *ctrl_info,
 	return rc;
 }
 
-#define PQI_PENDING_IO_TIMEOUT_SECS	20
+#define PQI_REMOVE_DEVICE_PENDING_IO_TIMEOUT_MSECS	(20 * 1000)
 
 static inline void pqi_remove_device(struct pqi_ctrl_info *ctrl_info, struct pqi_scsi_dev *device)
 {
@@ -1678,7 +1701,8 @@ static inline void pqi_remove_device(struct pqi_ctrl_info *ctrl_info, struct pqi
 
 	pqi_device_remove_start(device);
 
-	rc = pqi_device_wait_for_pending_io(ctrl_info, device, PQI_PENDING_IO_TIMEOUT_SECS);
+	rc = pqi_device_wait_for_pending_io(ctrl_info, device,
+		PQI_REMOVE_DEVICE_PENDING_IO_TIMEOUT_MSECS);
 	if (rc)
 		dev_err(&ctrl_info->pci_dev->dev,
 			"scsi %d:%d:%d:%d removing device with %d outstanding command(s)\n",
@@ -3070,7 +3094,7 @@ static void pqi_process_io_error(unsigned int iu_type,
 	}
 }
 
-static int pqi_interpret_task_management_response(
+static int pqi_interpret_task_management_response(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_task_management_response *response)
 {
 	int rc;
@@ -3088,6 +3112,10 @@ static int pqi_interpret_task_management_response(
 		break;
 	}
 
+	if (rc)
+		dev_err(&ctrl_info->pci_dev->dev,
+			"Task Management Function error: %d (response code: %u)\n", rc, response->response_code);
+
 	return rc;
 }
 
@@ -3156,9 +3184,8 @@ static int pqi_process_io_intr(struct pqi_ctrl_info *ctrl_info, struct pqi_queue
 				&((struct pqi_vendor_general_response *)response)->status);
 			break;
 		case PQI_RESPONSE_IU_TASK_MANAGEMENT:
-			io_request->status =
-				pqi_interpret_task_management_response(
-					(void *)response);
+			io_request->status = pqi_interpret_task_management_response(ctrl_info,
+				(void *)response);
 			break;
 		case PQI_RESPONSE_IU_AIO_PATH_DISABLED:
 			pqi_aio_path_disabled(io_request);
@@ -5862,24 +5889,37 @@ static void pqi_fail_io_queued_for_device(struct pqi_ctrl_info *ctrl_info,
 	}
 }
 
+#define PQI_PENDING_IO_WARNING_TIMEOUT_SECS	10
+
 static int pqi_device_wait_for_pending_io(struct pqi_ctrl_info *ctrl_info,
-	struct pqi_scsi_dev *device, unsigned long timeout_secs)
+	struct pqi_scsi_dev *device, unsigned long timeout_msecs)
 {
-	unsigned long timeout;
+	int cmds_outstanding;
+	unsigned long start_jiffies;
+	unsigned long warning_timeout;
+	unsigned long msecs_waiting;
 
+	start_jiffies = jiffies;
+	warning_timeout = (PQI_PENDING_IO_WARNING_TIMEOUT_SECS * PQI_HZ) + start_jiffies;
 
-	timeout = (timeout_secs * PQI_HZ) + jiffies;
-
-	while (atomic_read(&device->scsi_cmds_outstanding)) {
+	while ((cmds_outstanding = atomic_read(&device->scsi_cmds_outstanding)) > 0) {
 		pqi_check_ctrl_health(ctrl_info);
 		if (pqi_ctrl_offline(ctrl_info))
 			return -ENXIO;
-		if (timeout_secs != NO_TIMEOUT) {
-			if (time_after(jiffies, timeout)) {
-				dev_err(&ctrl_info->pci_dev->dev,
-					"timed out waiting for pending I/O\n");
-				return -ETIMEDOUT;
-			}
+		msecs_waiting = jiffies_to_msecs(jiffies - start_jiffies);
+		if (msecs_waiting > timeout_msecs) {
+			dev_err(&ctrl_info->pci_dev->dev,
+				"scsi %d:%d:%d:%d: timed out after %lu seconds waiting for %d outstanding command(s)\n",
+				ctrl_info->scsi_host->host_no, device->bus, device->target,
+				device->lun, msecs_waiting / 1000, cmds_outstanding);
+			return -ETIMEDOUT;
+		}
+		if (time_after(jiffies, warning_timeout)) {
+			dev_warn(&ctrl_info->pci_dev->dev,
+				"scsi %d:%d:%d:%d: waiting %lu seconds for %d outstanding command(s)\n",
+				ctrl_info->scsi_host->host_no, device->bus, device->target,
+				device->lun, msecs_waiting / 1000, cmds_outstanding);
+			warning_timeout = (PQI_PENDING_IO_WARNING_TIMEOUT_SECS * PQI_HZ) + jiffies;
 		}
 		usleep_range(1000, 2000);
 	}
@@ -5895,13 +5935,15 @@ static void pqi_lun_reset_complete(struct pqi_io_request *io_request,
 	complete(waiting);
 }
 
-#define PQI_LUN_RESET_TIMEOUT_SECS		30
 #define PQI_LUN_RESET_POLL_COMPLETION_SECS	10
 
 static int pqi_wait_for_lun_reset_completion(struct pqi_ctrl_info *ctrl_info,
 	struct pqi_scsi_dev *device, struct completion *wait)
 {
 	int rc;
+	unsigned int wait_secs;
+
+	wait_secs = 0;
 
 	while (1) {
 		if (wait_for_completion_io_timeout(wait,
@@ -5915,13 +5957,21 @@ static int pqi_wait_for_lun_reset_completion(struct pqi_ctrl_info *ctrl_info,
 			rc = -ENXIO;
 			break;
 		}
+
+		wait_secs += PQI_LUN_RESET_POLL_COMPLETION_SECS;
+
+		dev_warn(&ctrl_info->pci_dev->dev,
+			"scsi %d:%d:%d:%d: waiting %u seconds for LUN reset to complete\n",
+			ctrl_info->scsi_host->host_no, device->bus, device->target, device->lun,
+			wait_secs);
 	}
 
 	return rc;
 }
 
-static int pqi_lun_reset(struct pqi_ctrl_info *ctrl_info,
-	struct pqi_scsi_dev *device)
+#define PQI_LUN_RESET_FIRMWARE_TIMEOUT_SECS	30
+
+static int pqi_lun_reset(struct pqi_ctrl_info *ctrl_info, struct pqi_scsi_dev *device)
 {
 	int rc;
 	struct pqi_io_request *io_request;
@@ -5943,8 +5993,7 @@ static int pqi_lun_reset(struct pqi_ctrl_info *ctrl_info,
 		sizeof(request->lun_number));
 	request->task_management_function = SOP_TASK_MANAGEMENT_LUN_RESET;
 	if (ctrl_info->tmf_iu_timeout_supported)
-		put_unaligned_le16(PQI_LUN_RESET_TIMEOUT_SECS,
-					&request->timeout);
+		put_unaligned_le16(PQI_LUN_RESET_FIRMWARE_TIMEOUT_SECS, &request->timeout);
 
 	pqi_start_io(ctrl_info, &ctrl_info->queue_groups[PQI_DEFAULT_QUEUE_GROUP], RAID_PATH,
 		io_request);
@@ -5958,29 +6007,33 @@ static int pqi_lun_reset(struct pqi_ctrl_info *ctrl_info,
 	return rc;
 }
 
-#define PQI_LUN_RESET_RETRIES			3
-#define PQI_LUN_RESET_RETRY_INTERVAL_MSECS	10000
-#define PQI_LUN_RESET_PENDING_IO_TIMEOUT_SECS	120
+#define PQI_LUN_RESET_RETRIES				3
+#define PQI_LUN_RESET_RETRY_INTERVAL_MSECS		(10 * 1000)
+#define PQI_LUN_RESET_PENDING_IO_TIMEOUT_MSECS		(10 * 60 * 1000)
+#define PQI_LUN_RESET_FAILED_PENDING_IO_TIMEOUT_MSECS	(2 * 60 * 1000)
 
-static int pqi_lun_reset_with_retries(struct pqi_ctrl_info *ctrl_info,
-	struct pqi_scsi_dev *device)
+static int pqi_lun_reset_with_retries(struct pqi_ctrl_info *ctrl_info, struct pqi_scsi_dev *device)
 {
-	int rc;
+	int reset_rc;
+	int wait_rc;
 	unsigned int retries;
-	unsigned long timeout_secs;
+	unsigned long timeout_msecs;
 
 	for (retries = 0;;) {
-		rc = pqi_lun_reset(ctrl_info, device);
-		if (rc == 0 || ++retries > PQI_LUN_RESET_RETRIES)
+		reset_rc = pqi_lun_reset(ctrl_info, device);
+		if (reset_rc == 0 || ++retries > PQI_LUN_RESET_RETRIES)
 			break;
 		msleep(PQI_LUN_RESET_RETRY_INTERVAL_MSECS);
 	}
 
-	timeout_secs = rc ? PQI_LUN_RESET_PENDING_IO_TIMEOUT_SECS : NO_TIMEOUT;
+	timeout_msecs = reset_rc ? PQI_LUN_RESET_FAILED_PENDING_IO_TIMEOUT_MSECS :
+		PQI_LUN_RESET_PENDING_IO_TIMEOUT_MSECS;
 
-	rc |= pqi_device_wait_for_pending_io(ctrl_info, device, timeout_secs);
+	wait_rc = pqi_device_wait_for_pending_io(ctrl_info, device, timeout_msecs);
+	if (wait_rc && reset_rc == 0)
+		reset_rc = wait_rc;
 
-	return rc == 0 ? SUCCESS : FAILED;
+	return reset_rc == 0 ? SUCCESS : FAILED;
 }
 
 static int pqi_device_reset(struct pqi_ctrl_info *ctrl_info,


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 22/25] smartpqi: update enclosure identifier in sysfs
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (20 preceding siblings ...)
  2020-12-10 20:36 ` [PATCH V3 21/25] smartpqi: add additional logging for LUN resets Don Brace
@ 2020-12-10 20:36 ` Don Brace
  2021-01-08  0:30   ` Martin Wilck
  2020-12-10 20:36 ` [PATCH V3 23/25] smartpqi: correct system hangs when resuming from hibernation Don Brace
                   ` (3 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:36 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Murthy Bhat <Murthy.Bhat@microchip.com>

* Update enclosure identifier field corresponding to
  physical devices in lsscsi/sysfs.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 1c51a59f1da6..40ae82470d8c 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -1841,7 +1841,6 @@ static void pqi_dev_info(struct pqi_ctrl_info *ctrl_info,
 static void pqi_scsi_update_device(struct pqi_scsi_dev *existing_device,
 	struct pqi_scsi_dev *new_device)
 {
-	existing_device->devtype = new_device->devtype;
 	existing_device->device_type = new_device->device_type;
 	existing_device->bus = new_device->bus;
 	if (new_device->target_lun_valid) {


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 23/25] smartpqi: correct system hangs when resuming from hibernation
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (21 preceding siblings ...)
  2020-12-10 20:36 ` [PATCH V3 22/25] smartpqi: update enclosure identifier in sysfs Don Brace
@ 2020-12-10 20:36 ` Don Brace
  2021-01-08  0:34   ` Martin Wilck
  2020-12-10 20:36 ` [PATCH V3 24/25] smartpqi: add new pci ids Don Brace
                   ` (2 subsequent siblings)
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:36 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Kevin Barnett <kevin.barnett@microchip.com>

* Correct system hangs when resuming from hibernation after the
  first successful hibernation/resume cycle.
  * This is a rare condition involving OFA.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 40ae82470d8c..5ca265babaa2 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -8688,6 +8688,11 @@ static __maybe_unused int pqi_resume(struct pci_dev *pci_dev)
 	pci_set_power_state(pci_dev, PCI_D0);
 	pci_restore_state(pci_dev);
 
+	pqi_ctrl_unblock_device_reset(ctrl_info);
+	pqi_ctrl_unblock_requests(ctrl_info);
+	pqi_scsi_unblock_requests(ctrl_info);
+	pqi_ctrl_unblock_scan(ctrl_info);
+
 	return pqi_ctrl_init_resume(ctrl_info);
 }
 


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 24/25] smartpqi: add new pci ids
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (22 preceding siblings ...)
  2020-12-10 20:36 ` [PATCH V3 23/25] smartpqi: correct system hangs when resuming from hibernation Don Brace
@ 2020-12-10 20:36 ` Don Brace
  2021-01-08  0:35   ` Martin Wilck
  2020-12-10 20:36 ` [PATCH V3 25/25] smartpqi: update version to 2.1.6-005 Don Brace
  2020-12-21 14:31 ` [PATCH V3 00/25] smartpqi updates Donald Buczek
  25 siblings, 1 reply; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:36 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

From: Kevin Barnett <kevin.barnett@microchip.com>

* Add support for newer HW.

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c |  156 +++++++++++++++++++++++++++++++++
 1 file changed, 156 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 5ca265babaa2..a0501d09a8a3 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -8726,6 +8726,10 @@ static const struct pci_device_id pqi_pci_id_table[] = {
 		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
 			       0x152d, 0x8a37)
 	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       0x193d, 0x8460)
+	},
 	{
 		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
 			       0x193d, 0x1104)
@@ -8798,6 +8802,22 @@ static const struct pci_device_id pqi_pci_id_table[] = {
 		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
 			       0x1bd4, 0x004f)
 	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       0x1bd4, 0x0051)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       0x1bd4, 0x0052)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       0x1bd4, 0x0053)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       0x1bd4, 0x0054)
+	},
 	{
 		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
 			       0x19e5, 0xd227)
@@ -8958,6 +8978,122 @@ static const struct pci_device_id pqi_pci_id_table[] = {
 		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
 			       PCI_VENDOR_ID_ADAPTEC2, 0x1380)
 	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1400)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1402)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1410)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1411)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1412)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1420)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1430)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1440)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1441)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1450)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1452)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1460)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1461)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1462)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1470)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1471)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1472)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1480)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1490)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x1491)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x14a0)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x14a1)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x14b0)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x14b1)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x14c0)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x14c1)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x14d0)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x14e0)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_ADAPTEC2, 0x14f0)
+	},
 	{
 		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
 			       PCI_VENDOR_ID_ADVANTECH, 0x8312)
@@ -9022,6 +9158,10 @@ static const struct pci_device_id pqi_pci_id_table[] = {
 		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
 			       PCI_VENDOR_ID_HP, 0x1001)
 	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       PCI_VENDOR_ID_HP, 0x1002)
+	},
 	{
 		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
 			       PCI_VENDOR_ID_HP, 0x1100)
@@ -9030,6 +9170,22 @@ static const struct pci_device_id pqi_pci_id_table[] = {
 		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
 			       PCI_VENDOR_ID_HP, 0x1101)
 	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       0x1590, 0x0294)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       0x1590, 0x02db)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       0x1590, 0x02dc)
+	},
+	{
+		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
+			       0x1590, 0x032e)
+	},
 	{
 		PCI_DEVICE_SUB(PCI_VENDOR_ID_ADAPTEC2, 0x028f,
 			       0x1d8d, 0x0800)


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* [PATCH V3 25/25] smartpqi: update version to 2.1.6-005
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (23 preceding siblings ...)
  2020-12-10 20:36 ` [PATCH V3 24/25] smartpqi: add new pci ids Don Brace
@ 2020-12-10 20:36 ` Don Brace
  2020-12-21 14:31 ` [PATCH V3 00/25] smartpqi updates Donald Buczek
  25 siblings, 0 replies; 91+ messages in thread
From: Don Brace @ 2020-12-10 20:36 UTC (permalink / raw)
  To: Kevin.Barnett, scott.teel, Justin.Lindley, scott.benesh,
	gerry.morong, mahesh.rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

* Update version for tracking

Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Gerry Morong <gerry.morong@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
---
 drivers/scsi/smartpqi/smartpqi_init.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index a0501d09a8a3..89baa9b7023e 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -33,11 +33,11 @@
 #define BUILD_TIMESTAMP
 #endif
 
-#define DRIVER_VERSION		"1.2.16-012"
-#define DRIVER_MAJOR		1
-#define DRIVER_MINOR		2
-#define DRIVER_RELEASE		16
-#define DRIVER_REVISION		12
+#define DRIVER_VERSION		"2.1.6-005"
+#define DRIVER_MAJOR		2
+#define DRIVER_MINOR		1
+#define DRIVER_RELEASE		6
+#define DRIVER_REVISION		5
 
 #define DRIVER_NAME		"Microsemi PQI Driver (v" \
 				DRIVER_VERSION BUILD_TIMESTAMP ")"


^ permalink raw reply related	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2020-12-10 20:35 ` [PATCH V3 15/25] smartpqi: fix host qdepth limit Don Brace
@ 2020-12-14 17:54   ` Paul Menzel
  2020-12-15 20:23     ` Don.Brace
  0 siblings, 1 reply; 91+ messages in thread
From: Paul Menzel @ 2020-12-14 17:54 UTC (permalink / raw)
  To: Don Brace, Kevin Barnett, Scott Teel, Justin.Lindley,
	Scott Benesh, gerry.morong, Mahesh Rajashekhara, hch,
	joseph.szczypek, POSWALD, James E. J. Bottomley,
	Martin K. Petersen
  Cc: linux-scsi, it+linux-scsi, Donald Buczek, Greg KH

Dear Don, dear Mahesh,


Am 10.12.20 um 21:35 schrieb Don Brace:
> From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
> 
> * Correct scsi-mid-layer sending more requests than
>    exposed host Q depth causing firmware ASSERT issue.
>    * Add host Qdepth counter.

This supposedly fixes the regression between Linux 5.4 and 5.9, which we 
reported in [1].

     kernel: smartpqi 0000:89:00.0: controller is offline: status code 
0x6100c
     kernel: smartpqi 0000:89:00.0: controller offline

Thank you for looking into this issue and fixing it. We are going to 
test this.

For easily finding these things in the git history or the WWW, it would 
be great if these log messages could be included (in the future).

Also, that means that the regression is still present in Linux 5.10, 
released yesterday, and this commit does not apply to these versions.

Mahesh, do you have any idea what commit caused the regression and why 
the issue started to show up?

James, Martin, how are regressions handled for the SCSI subsystem?

Regarding the diff, personally, I find the commit message much too 
terse. `pqi_scsi_queue_command()` will return `SCSI_MLQUEUE_HOST_BUSY` 
for the case of too many requests. Will that be logged by Linux in some 
log level? In my opinion it points to a performance problem, and should 
be at least logged as a notice or warning.

Can `ctrl_info->scsi_ml_can_queue` be queried somehow maybe in the logs? 
`sudo find /sys -name queue` did not display anything interesting.

[1]: https://marc.info/?l=linux-scsi&m=160271263114829&w=2
      "Linux 5.9: smartpqi: controller is offline: status code 0x6100c"

> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>   drivers/scsi/smartpqi/smartpqi.h      |    2 ++
>   drivers/scsi/smartpqi/smartpqi_init.c |   19 ++++++++++++++++---
>   2 files changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
> index 0b94c755a74c..c3b103b15924 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -1345,6 +1345,8 @@ struct pqi_ctrl_info {
>   	struct work_struct ofa_quiesce_work;
>   	u32		ofa_bytes_requested;
>   	u16		ofa_cancel_reason;
> +
> +	atomic_t	total_scmds_outstanding;
>   };

What is the difference between the already existing

     atomic_t scsi_cmds_outstanding;

and the new counter?

     atomic_t	total_scmds_outstanding;

The names are quite similar, so different names or a comment might be 
useful.

>   
>   enum pqi_ctrl_mode {
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 082b17e9bd80..4e088f47d95f 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -5578,6 +5578,8 @@ static inline bool pqi_is_bypass_eligible_request(struct scsi_cmnd *scmd)
>   void pqi_prep_for_scsi_done(struct scsi_cmnd *scmd)
>   {
>   	struct pqi_scsi_dev *device;
> +	struct pqi_ctrl_info *ctrl_info;
> +	struct Scsi_Host *shost;
>   
>   	if (!scmd->device) {
>   		set_host_byte(scmd, DID_NO_CONNECT);
> @@ -5590,7 +5592,11 @@ void pqi_prep_for_scsi_done(struct scsi_cmnd *scmd)
>   		return;
>   	}
>   
> +	shost = scmd->device->host;

The function already has a variable `device`, which is assigned 
“hostdata” though:

     device = scmd->device->hostdata;

This confuses me. Maybe this should be cleaned up in a followup commit, 
and the variable device be reused above in the `shost` assignment.

> +	ctrl_info = shost_to_hba(shost);
> +
>   	atomic_dec(&device->scsi_cmds_outstanding);
> +	atomic_dec(&ctrl_info->total_scmds_outstanding);
>   }
>   
>   static bool pqi_is_parity_write_stream(struct pqi_ctrl_info *ctrl_info,
> @@ -5678,6 +5684,7 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
>   	bool raid_bypassed;
>   
>   	device = scmd->device->hostdata;
> +	ctrl_info = shost_to_hba(shost);
>   
>   	if (!device) {
>   		set_host_byte(scmd, DID_NO_CONNECT);
> @@ -5686,8 +5693,11 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
>   	}
>   
>   	atomic_inc(&device->scsi_cmds_outstanding);
> -
> -	ctrl_info = shost_to_hba(shost);

I believe style changes (re-ordering) in commits fixing regressions 
make them harder to backport.

> +	if (atomic_inc_return(&ctrl_info->total_scmds_outstanding) >
> +		ctrl_info->scsi_ml_can_queue) {
> +		rc = SCSI_MLQUEUE_HOST_BUSY;
> +		goto out;
> +	}
>   
>   	if (pqi_ctrl_offline(ctrl_info) || pqi_device_in_remove(device)) {
>   		set_host_byte(scmd, DID_NO_CONNECT);
> @@ -5730,8 +5740,10 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
>   	}
>   
>   out:
> -	if (rc)
> +	if (rc) {
>   		atomic_dec(&device->scsi_cmds_outstanding);
> +		atomic_dec(&ctrl_info->total_scmds_outstanding);
> +	}
>   
>   	return rc;
>   }
> @@ -8054,6 +8066,7 @@ static struct pqi_ctrl_info *pqi_alloc_ctrl_info(int numa_node)
>   
>   	INIT_WORK(&ctrl_info->event_work, pqi_event_worker);
>   	atomic_set(&ctrl_info->num_interrupts, 0);
> +	atomic_set(&ctrl_info->total_scmds_outstanding, 0);
>   
>   	INIT_DELAYED_WORK(&ctrl_info->rescan_work, pqi_rescan_worker);
>   	INIT_DELAYED_WORK(&ctrl_info->update_time_work, pqi_update_time_worker);


Kind regards,

Paul

^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2020-12-14 17:54   ` Paul Menzel
@ 2020-12-15 20:23     ` Don.Brace
  2021-01-07 23:43       ` Martin Wilck
  0 siblings, 1 reply; 91+ messages in thread
From: Don.Brace @ 2020-12-15 20:23 UTC (permalink / raw)
  To: pmenzel, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, joseph.szczypek, POSWALD,
	jejb, martin.petersen
  Cc: linux-scsi, it+linux-scsi, buczek, gregkh

Please see answers below. Hope this helps.

-----Original Message-----
From: Paul Menzel [mailto:pmenzel@molgen.mpg.de] 
Sent: Monday, December 14, 2020 11:54 AM

Dear Don, dear Mahesh,


Am 10.12.20 um 21:35 schrieb Don Brace:
> From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
>
> * Correct scsi-mid-layer sending more requests than
>    exposed host Q depth causing firmware ASSERT issue.
>    * Add host Qdepth counter.

This supposedly fixes the regression between Linux 5.4 and 5.9, which we reported in [1].

     kernel: smartpqi 0000:89:00.0: controller is offline: status code 0x6100c
     kernel: smartpqi 0000:89:00.0: controller offline

Thank you for looking into this issue and fixing it. We are going to test this.

For easily finding these things in the git history or the WWW, it would be great if these log messages could be included (in the future).
DON> Thanks for your suggestion. We'll add them next time.

Also, that means the regression is still present in Linux 5.10, released yesterday, and this commit does not apply to these versions.

DON> They have started 5.10-rc7 now, so possibly 5.11 or 5.12, depending on when all of the patches are applied. The patch in question is among 28 other patches.

Mahesh, do you have any idea, what commit caused the regression and why the issue started to show up?
DON> The smartpqi driver sets two scsi_host_template member fields: .can_queue and .nr_hw_queues. But we have not yet converted to host_tagset. So the queue_depth becomes nr_hw_queues * can_queue, which is more than the hw can support. That can be verified by looking at scsi_host.h.
        /*
         * In scsi-mq mode, the number of hardware queues supported by the LLD.
         *
         * Note: it is assumed that each hardware queue has a queue depth of
         * can_queue. In other words, the total queue depth per host
         * is nr_hw_queues * can_queue. However, for when host_tagset is set,
         * the total queue depth is can_queue.
         */

So, until we make this change, the queue_depth change prevents the above issue from happening.
Note: you will see better performance and more evenly distributed performance with this patch applied.
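
A sketch only (not part of this series; it assumes the host_tagset flag described in the scsi_host.h comment above, and num_queue_groups here stands in for the controller queue count): the eventual conversion could look like

    shost->can_queue = ctrl_info->scsi_ml_can_queue;
    shost->nr_hw_queues = ctrl_info->num_queue_groups;
    shost->host_tagset = 1;  /* one shared tag space: total depth == can_queue */

With host_tagset set, blk-mq sizes a single shared tag space to can_queue, which would make a driver-side counter unnecessary.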

James, Martin, how are regressions handled for the SCSI subsystem?

Regarding the diff, personally, I find the commit message much too terse. `pqi_scsi_queue_command()` will return `SCSI_MLQUEUE_HOST_BUSY` for the case of too many requests. Will that be logged by Linux at some log level? In my opinion it points to a performance problem and should be logged at least as a notice or warning.
DON> We could add a ratelimited print, but we did not want to interrupt the CPU for logging these messages.
Also, you should see better and more even performance.
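
If a notice is wanted, a ratelimited variant of the new check could look like this (a sketch only; the use of dev_warn_ratelimited() and the message text are illustrative, not part of the patch):

    if (atomic_inc_return(&ctrl_info->total_scmds_outstanding) >
            ctrl_info->scsi_ml_can_queue) {
        /* ratelimited so a busy host does not flood the kernel log */
        dev_warn_ratelimited(&ctrl_info->pci_dev->dev,
            "host queue depth exceeded, requeueing command\n");
        rc = SCSI_MLQUEUE_HOST_BUSY;
        goto out;
    }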

Can `ctrl_info->scsi_ml_can_queue` be queried somehow, maybe in the logs?
`sudo find /sys -name queue` did not display anything interesting.
All I find is /sys/class/scsi_host/host<X>/{cmd_per_lun, can_queue}, but not nr_hw_queues, although there is one queue for each CPU.
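
One way to expose it would be a read-only host attribute next to the existing ones; a hypothetical sketch (`pqi_scsi_ml_can_queue_show()` does not exist in the driver):

    static ssize_t pqi_scsi_ml_can_queue_show(struct device *dev,
        struct device_attribute *attr, char *buf)
    {
        struct Scsi_Host *shost = class_to_shost(dev);
        struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);

        return scnprintf(buf, PAGE_SIZE, "%u\n",
            (unsigned int)ctrl_info->scsi_ml_can_queue);
    }
    static DEVICE_ATTR(scsi_ml_can_queue, 0444, pqi_scsi_ml_can_queue_show, NULL);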

[1]: https://marc.info/?l=linux-scsi&m=160271263114829&w=2
      "Linux 5.9: smartpqi: controller is offline: status code 0x6100c"

> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>   drivers/scsi/smartpqi/smartpqi.h      |    2 ++
>   drivers/scsi/smartpqi/smartpqi_init.c |   19 ++++++++++++++++---
>   2 files changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/smartpqi/smartpqi.h 
> b/drivers/scsi/smartpqi/smartpqi.h
> index 0b94c755a74c..c3b103b15924 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -1345,6 +1345,8 @@ struct pqi_ctrl_info {
>       struct work_struct ofa_quiesce_work;
>       u32             ofa_bytes_requested;
>       u16             ofa_cancel_reason;
> +
> +     atomic_t        total_scmds_outstanding;
>   };

What is the difference between the already existing

     atomic_t scsi_cmds_outstanding;

and the new counter?

     atomic_t   total_scmds_outstanding;

The names are quite similar, so different names or a comment might be useful.
DON> total_scmds_outstanding tracks the queue_depth for the entire driver instance.
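
Reduced to the two fields in question (the trailing comments are explanatory, not from the header):

    struct pqi_scsi_dev {
        atomic_t scsi_cmds_outstanding;   /* in-flight commands per LUN/device */
    };

    struct pqi_ctrl_info {
        atomic_t total_scmds_outstanding; /* in-flight commands per controller */
    };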


>
>   enum pqi_ctrl_mode {
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c 
> b/drivers/scsi/smartpqi/smartpqi_init.c
> index 082b17e9bd80..4e088f47d95f 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -5578,6 +5578,8 @@ static inline bool pqi_is_bypass_eligible_request(struct scsi_cmnd *scmd)
>   void pqi_prep_for_scsi_done(struct scsi_cmnd *scmd)
>   {
>       struct pqi_scsi_dev *device;
> +     struct pqi_ctrl_info *ctrl_info;
> +     struct Scsi_Host *shost;
>
>       if (!scmd->device) {
>               set_host_byte(scmd, DID_NO_CONNECT); @@ -5590,7 +5592,11 
> @@ void pqi_prep_for_scsi_done(struct scsi_cmnd *scmd)
>               return;
>       }
>
> +     shost = scmd->device->host;

The function already has a variable `device`, which is assigned “hostdata” though:

     device = scmd->device->hostdata;

This confuses me. Maybe this should be cleaned up in a follow-up commit, and the variable `device` reused above in the `shost` assignment.
DON> host points back to the driver instance for our HW.
DON> hostdata is a driver-usable field that points back to our internal device pointer <LUN or HBA>.
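
That is, the three pointers used in the patch relate as follows:

    struct Scsi_Host *shost = scmd->device->host;          /* the HBA instance */
    struct pqi_scsi_dev *device = scmd->device->hostdata;  /* our internal device (LUN or HBA disk) */
    struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost); /* per-controller driver state */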


> +     ctrl_info = shost_to_hba(shost);
> +
>       atomic_dec(&device->scsi_cmds_outstanding);
> +     atomic_dec(&ctrl_info->total_scmds_outstanding);
>   }
>
>   static bool pqi_is_parity_write_stream(struct pqi_ctrl_info 
> *ctrl_info, @@ -5678,6 +5684,7 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
>       bool raid_bypassed;
>
>       device = scmd->device->hostdata;
> +     ctrl_info = shost_to_hba(shost);
>
>       if (!device) {
>               set_host_byte(scmd, DID_NO_CONNECT); @@ -5686,8 +5693,11 
> @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
>       }
>
>       atomic_inc(&device->scsi_cmds_outstanding);
> -
> -     ctrl_info = shost_to_hba(shost);

I believe style changes (re-ordering) in commits that fix regressions make them harder to backport.

> +     if (atomic_inc_return(&ctrl_info->total_scmds_outstanding) >
> +             ctrl_info->scsi_ml_can_queue) {
> +             rc = SCSI_MLQUEUE_HOST_BUSY;
> +             goto out;
> +     }
>
>       if (pqi_ctrl_offline(ctrl_info) || pqi_device_in_remove(device)) {
>               set_host_byte(scmd, DID_NO_CONNECT); @@ -5730,8 +5740,10 
> @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
>       }
>
>   out:
> -     if (rc)
> +     if (rc) {
>               atomic_dec(&device->scsi_cmds_outstanding);
> +             atomic_dec(&ctrl_info->total_scmds_outstanding);
> +     }
>
>       return rc;
>   }
> @@ -8054,6 +8066,7 @@ static struct pqi_ctrl_info 
> *pqi_alloc_ctrl_info(int numa_node)
>
>       INIT_WORK(&ctrl_info->event_work, pqi_event_worker);
>       atomic_set(&ctrl_info->num_interrupts, 0);
> +     atomic_set(&ctrl_info->total_scmds_outstanding, 0);
>
>       INIT_DELAYED_WORK(&ctrl_info->rescan_work, pqi_rescan_worker);
>       INIT_DELAYED_WORK(&ctrl_info->update_time_work, 
> pqi_update_time_worker);


Kind regards,

Paul

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 00/25] smartpqi updates
  2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
                   ` (24 preceding siblings ...)
  2020-12-10 20:36 ` [PATCH V3 25/25] smartpqi: update version to 2.1.6-005 Don Brace
@ 2020-12-21 14:31 ` Donald Buczek
       [not found]   ` <SN6PR11MB2848D8C9DF9856A2B7AA69ACE1C00@SN6PR11MB2848.namprd11.prod.outlook.com>
  25 siblings, 1 reply; 91+ messages in thread
From: Donald Buczek @ 2020-12-21 14:31 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi, it+linux

Dear Don,

just wanted to let you know that I've tested this series (plus the three Depends-on patches you mentioned) on top of Linux v5.10.1 with an Adaptec 1100-8e with fw 3.21.

After three hours of heavy operation (including raid scrubbing!) the driver seems to have lost some requests for the md0 member disks.

This is the static picture after all activity has ceased (each inflight file shows in-flight reads and writes):

     root:deadbird:/scratch/local/# for f in /sys/devices/virtual/block/md?/md/rd*/block/inflight;do echo $f: $(cat $f);done
     /sys/devices/virtual/block/md0/md/rd0/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd1/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd10/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd11/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd12/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd13/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd14/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd15/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd2/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd3/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd4/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd5/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd6/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd7/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd8/block/inflight: 1 0
     /sys/devices/virtual/block/md0/md/rd9/block/inflight: 1 0
     /sys/devices/virtual/block/md1/md/rd0/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd1/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd10/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd11/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd12/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd13/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd14/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd15/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd2/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd3/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd4/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd5/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd6/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd7/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd8/block/inflight: 0 0
     /sys/devices/virtual/block/md1/md/rd9/block/inflight: 0 0

Best
   Donald


-- 
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 00/25] smartpqi updates
       [not found]   ` <SN6PR11MB2848D8C9DF9856A2B7AA69ACE1C00@SN6PR11MB2848.namprd11.prod.outlook.com>
@ 2020-12-22 13:13     ` Donald Buczek
  2020-12-28 15:57       ` Don.Brace
  0 siblings, 1 reply; 91+ messages in thread
From: Donald Buczek @ 2020-12-22 13:13 UTC (permalink / raw)
  To: Don.Brace, Kevin.Barnett, Scott.Teel, Justin.Lindley,
	Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi, it+linux

On 22.12.20 00:30, Don.Brace@microchip.com wrote:
> Can you please post your hw configuration and the stress load that you used? Was it fio?

The test system is a Dell PowerEdge R730 with two 10-core Intel® Xeon® E5-2687W v3 processors and 200 GB memory.
The adapter is an Adaptec HBA 1100-8e, firmware 3.21.
Attached to it are two AIC J3016-01 enclosures with 16 8 TB disks each.
The disks of each JBOD are combined into a RAID6 software raid with XFS on it.
So I have two filesystems of ~100 TB (14 * 7.3 TB) each.

Unfortunately, for the time being, I was only able to reproduce this with a very complex load setup with both file system activity (two parallel `cp -a` runs over big directory trees on each filesystem) and raid scrubbing being switched on and off at the same time. I'm currently trying to trigger the issue with less complex setups.

I'm not sure at all whether this is really a problem of the smartpqi driver. It's just that the frozen inflight counters seem to hint in the direction of the block layer.

Donald


-- 
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433

^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 00/25] smartpqi updates
  2020-12-22 13:13     ` Donald Buczek
@ 2020-12-28 15:57       ` Don.Brace
  2020-12-28 19:25         ` Don.Brace
  0 siblings, 1 reply; 91+ messages in thread
From: Don.Brace @ 2020-12-28 15:57 UTC (permalink / raw)
  To: buczek, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi, it+linux


>>Thanks for sharing your HW setup.
>>I will also set up a similar system. I have two scripts that I run against the driver before I feel satisfied that it will hold up against extreme conditions. One script performs a list of I/O stress tests against all presented disks (LVs and HBAs): 1) mkfs {xfs, ext4}, 2) mount, 3) test using rsync, 4) fio using the file system, 5) umount, 6) fsck, 7) fio to the raw disk.

>>The other script continuously issues resets to all of the disks in parallel. Normally any issues will show up within 20 iterations of my scripts. I wait for 50K iterations before I'm happy.

>>I have not tried layering in the dm driver, but that will be added to my tests. There have been a few patches added to both the block layer and dm driver recently.

>>Thanks again,
>>Don.


^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 00/25] smartpqi updates
  2020-12-28 15:57       ` Don.Brace
@ 2020-12-28 19:25         ` Don.Brace
  2020-12-28 22:36           ` Donald Buczek
  0 siblings, 1 reply; 91+ messages in thread
From: Don.Brace @ 2020-12-28 19:25 UTC (permalink / raw)
  To: Don.Brace, buczek, Kevin.Barnett, Scott.Teel, Justin.Lindley,
	Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi, it+linux

Can you provide the base OS that you used to build the kernel.org kernel?

Thanks,
Don


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 00/25] smartpqi updates
  2020-12-28 19:25         ` Don.Brace
@ 2020-12-28 22:36           ` Donald Buczek
  0 siblings, 0 replies; 91+ messages in thread
From: Donald Buczek @ 2020-12-28 22:36 UTC (permalink / raw)
  To: Don.Brace, Kevin.Barnett, Scott.Teel, Justin.Lindley,
	Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi, it+linux

On 28.12.20 20:25, Don.Brace@microchip.com wrote:

> Can you provide the base OS that you used to build the kernel.org kernel?

GNU/Linux built from scratch, no distribution; gcc 7.5.0.

However, after more testing, I'm no longer convinced that the deadlock is caused by the block layer.

I've original posted this as an xfs bug "v5.10.1 xfs deadlock" ( https://lkml.org/lkml/2020/12/17/608 )

In that thread it was suggested that this might be caused by the block layer dropping I/Os. The non-zero inflight counters I observed seemed to confirm that, so I posted the same problem to linux-scsi and the relevant maintainers. However, since then I did a lot more tests, and while I am now able to reproduce the original deadlock, I am not able to reproduce the non-zero inflight counters and haven't seen them since.

It is quite clear now that the deadlock is in the XFS layer, and although I don't fully understand it, I can patch it away now.

I was very eager to test smartpqi 2.1.6-005, which you submitted for linux-next. As you know, we have severe problems with the in-tree 1.2.8-026 smartpqi driver, while the 2.1.8-005 OOT driver you provided us did work. However, smartpqi 2.1.6-005 on Linux 5.10 failed for us on the production system, too, and I couldn't continue the tests on this system, so we set up the test system to identify the (potential) problem.

Unfortunately, on this test system, two other problems got in our way, which could be related to the smartpqi driver, but probably aren't. The first one was "md_raid: mdX_raid6 looping after sync_action "check" to "idle" transition" ( https://lkml.org/lkml/2020/11/28/165 ). When I tried to isolate that problem, the xfs problem "v5.10.1 xfs deadlock" ( https://lkml.org/lkml/2020/12/17/608 ) brought itself into the foreground and needed to be resolved first. Because of that, my original goal of testing the smartpqi 2.1.6-005 driver and trying to reproduce the problem we've seen on the production system didn't make progress.

On the other hand, all this time smartpqi 2.1.6-005 was used on the test system under high simulated load and didn't fail on me. So I kind of tested it without bad results. I'd appreciate it if it went into the next Linux release anyway.

Best
   Donald


-- 
Donald Buczek
buczek@molgen.mpg.de
Tel: +49 30 8413 1433

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 01/25] smartpqi: add support for product id
  2020-12-10 20:34 ` [PATCH V3 01/25] smartpqi: add support for product id Don Brace
@ 2021-01-07 16:43   ` Martin Wilck
  0 siblings, 0 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-07 16:43 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:34 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>  drivers/scsi/smartpqi/smartpqi.h      |   11 ++++++++++-
>  drivers/scsi/smartpqi/smartpqi_init.c |   11 +++++++++--
>  drivers/scsi/smartpqi/smartpqi_sis.c  |    5 +++++
>  drivers/scsi/smartpqi/smartpqi_sis.h  |    1 +
>  4 files changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi.h
> b/drivers/scsi/smartpqi/smartpqi.h
> index 3e54590e6e92..7d3f956e949f 100644
> 
> [...]
> 
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -6259,8 +6259,8 @@ static DEVICE_ATTR(model, 0444, pqi_model_show,
> NULL);
>  static DEVICE_ATTR(serial_number, 0444, pqi_serial_number_show,
> NULL);
>  static DEVICE_ATTR(vendor, 0444, pqi_vendor_show, NULL);
>  static DEVICE_ATTR(rescan, 0200, NULL, pqi_host_rescan_store);
> -static DEVICE_ATTR(lockup_action, 0644,
> -       pqi_lockup_action_show, pqi_lockup_action_store);
> +static DEVICE_ATTR(lockup_action, 0644, pqi_lockup_action_show,
> +       pqi_lockup_action_store);

Nitpick: could you please avoid mixing real code changes with unrelated
whitespace edits? The same applies to several patches of this series.

Other than that:

Reviewed-by: Martin Wilck <mwilck@suse.com>





* Re: [PATCH V3 02/25] smartpqi: refactor aio submission code
  2020-12-10 20:34 ` [PATCH V3 02/25] smartpqi: refactor aio submission code Don Brace
@ 2021-01-07 16:43   ` Martin Wilck
  0 siblings, 0 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-07 16:43 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:34 -0600, Don Brace wrote:
> * No functional changes.
>   * Refactor aio submission code:
>     1. break up the function into smaller functions.
>     2. add common block of data to carry around
>        into newly added functions.
>     3. Prepare for new AIO functionality.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

I've got a few nitpicks, see below. But they should rather go
into a separate patch, so

Reviewed-by: Martin Wilck <mwilck@suse.com>

> ---
>  drivers/scsi/smartpqi/smartpqi.h      |   52 +++
>  drivers/scsi/smartpqi/smartpqi_init.c |  554 ++++++++++++++++++-----
> ----------
>  2 files changed, 360 insertions(+), 246 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi.h
> b/drivers/scsi/smartpqi/smartpqi.h
> index 7d3f956e949f..d486a2ec3045 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -908,6 +908,58 @@ struct raid_map {
>  
>  #pragma pack()
>  
> +struct pqi_scsi_dev_raid_map_data {
> +       bool    is_write;
> +       u8      raid_level;
> +       u32     map_index;
> +       u64     first_block;
> +       u64     last_block;
> +       u32     data_length;
> +       u32     block_cnt;
> +       u32     blocks_per_row;
> +       u64     first_row;
> +       u64     last_row;
> +       u32     first_row_offset;
> +       u32     last_row_offset;
> +       u32     first_column;
> +       u32     last_column;
> +       u64     r5or6_first_row;
> +       u64     r5or6_last_row;
> +       u32     r5or6_first_row_offset;
> +       u32     r5or6_last_row_offset;
> +       u32     r5or6_first_column;
> +       u32     r5or6_last_column;
> +       u16     data_disks_per_row;
> +       u32     total_disks_per_row;
> +       u16     layout_map_count;
> +       u32     stripesize;
> +       u16     strip_size;
> +       u32     first_group;
> +       u32     last_group;
> +       u32     current_group;
> +       u32     map_row;
> +       u32     aio_handle;
> +       u64     disk_block;
> +       u32     disk_block_cnt;
> +       u8      cdb[16];
> +       u8      cdb_length;
> +       int     offload_to_mirror;
> +
> +       /* RAID1 specific */
> +#define NUM_RAID1_MAP_ENTRIES 3
> +       u32     num_it_nexus_entries;
> +       u32     it_nexus[NUM_RAID1_MAP_ENTRIES];
> +
> +       /* RAID5 RAID6 specific */
> +       u32     p_parity_it_nexus; /* aio_handle */
> +       u32     q_parity_it_nexus; /* aio_handle */
> +       u8      xor_mult;
> +       u64     row;
> +       u64     stripe_lba;
> +       u32     p_index;
> +       u32     q_index;
> +};

There seem to be more RAID 5/6 specific fields above.
Have you considered using a union?
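
For instance, something along these lines might work (an untested
sketch that just regroups the fields from this patch into an anonymous
union; the shared fields above would stay as they are):

	union {
		struct {			/* RAID 1 */
			u32	num_it_nexus_entries;
			u32	it_nexus[NUM_RAID1_MAP_ENTRIES];
		};
		struct {			/* RAID 5/6 */
			u32	p_parity_it_nexus;	/* aio_handle */
			u32	q_parity_it_nexus;	/* aio_handle */
			u8	xor_mult;
			u64	row;
			u64	stripe_lba;
			u32	p_index;
			u32	q_index;
		};
	};

Only one variant is live per request, so this would also shrink the
struct a bit.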


> +
>  #define RAID_CTLR_LUNID                "\0\0\0\0\0\0\0\0"
>  
>  struct pqi_scsi_dev {
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c
> b/drivers/scsi/smartpqi/smartpqi_init.c
> index 68fc4327944e..2348b9f24d8c 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -2237,332 +2237,394 @@ static inline void pqi_set_encryption_info(
>   * Attempt to perform RAID bypass mapping for a logical volume I/O.
>   */
>  
> +static bool pqi_aio_raid_level_supported(struct
> pqi_scsi_dev_raid_map_data *rmd)
> +{
> +       bool is_supported = true;
> +
> +       switch (rmd->raid_level) {
> +       case SA_RAID_0:
> +               break;
> +       case SA_RAID_1:
> +               if (rmd->is_write)
> +                       is_supported = false;
> +               break;
> +       case SA_RAID_5:
> +               fallthrough;

I don't think "fallthrough" is necessary here.
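
Adjacent case labels can simply share one body; the annotation is only
needed when statements precede the next label. I.e. (sketch):

	case SA_RAID_5:
	case SA_RAID_6:
		if (rmd->is_write)
			is_supported = false;
		break;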

> +       case SA_RAID_6:
> +               if (rmd->is_write)
> +                       is_supported = false;
> +               break;
> +       case SA_RAID_ADM:
> +               if (rmd->is_write)
> +                       is_supported = false;
> +               break;
> +       default:
> +               is_supported = false;
> +       }
> +
> +       return is_supported;
> +}
> +
>  #define PQI_RAID_BYPASS_INELIGIBLE     1
>  
> -static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info
> *ctrl_info,
> -       struct pqi_scsi_dev *device, struct scsi_cmnd *scmd,
> -       struct pqi_queue_group *queue_group)
> +static int pqi_get_aio_lba_and_block_count(struct scsi_cmnd *scmd,
> +                       struct pqi_scsi_dev_raid_map_data *rmd)
>  {
> -       struct raid_map *raid_map;
> -       bool is_write = false;
> -       u32 map_index;
> -       u64 first_block;
> -       u64 last_block;
> -       u32 block_cnt;
> -       u32 blocks_per_row;
> -       u64 first_row;
> -       u64 last_row;
> -       u32 first_row_offset;
> -       u32 last_row_offset;
> -       u32 first_column;
> -       u32 last_column;
> -       u64 r0_first_row;
> -       u64 r0_last_row;
> -       u32 r5or6_blocks_per_row;
> -       u64 r5or6_first_row;
> -       u64 r5or6_last_row;
> -       u32 r5or6_first_row_offset;
> -       u32 r5or6_last_row_offset;
> -       u32 r5or6_first_column;
> -       u32 r5or6_last_column;
> -       u16 data_disks_per_row;
> -       u32 total_disks_per_row;
> -       u16 layout_map_count;
> -       u32 stripesize;
> -       u16 strip_size;
> -       u32 first_group;
> -       u32 last_group;
> -       u32 current_group;
> -       u32 map_row;
> -       u32 aio_handle;
> -       u64 disk_block;
> -       u32 disk_block_cnt;
> -       u8 cdb[16];
> -       u8 cdb_length;
> -       int offload_to_mirror;
> -       struct pqi_encryption_info *encryption_info_ptr;
> -       struct pqi_encryption_info encryption_info;
> -#if BITS_PER_LONG == 32
> -       u64 tmpdiv;
> -#endif
> -
>         /* Check for valid opcode, get LBA and block count. */
>         switch (scmd->cmnd[0]) {
>         case WRITE_6:
> -               is_write = true;
> +               rmd->is_write = true;
>                 fallthrough;
>         case READ_6:
> -               first_block = (u64)(((scmd->cmnd[1] & 0x1f) << 16) |
> +               rmd->first_block = (u64)(((scmd->cmnd[1] & 0x1f) <<
> 16) |
>                         (scmd->cmnd[2] << 8) | scmd->cmnd[3]);
> -               block_cnt = (u32)scmd->cmnd[4];
> -               if (block_cnt == 0)
> -                       block_cnt = 256;
> +               rmd->block_cnt = (u32)scmd->cmnd[4];
> +               if (rmd->block_cnt == 0)
> +                       rmd->block_cnt = 256;
>                 break;
>         case WRITE_10:
> -               is_write = true;
> +               rmd->is_write = true;
>                 fallthrough;
>         case READ_10:
> -               first_block = (u64)get_unaligned_be32(&scmd-
> > cmnd[2]);
> -               block_cnt = (u32)get_unaligned_be16(&scmd->cmnd[7]);
> +               rmd->first_block = (u64)get_unaligned_be32(&scmd-
> > cmnd[2]);
> +               rmd->block_cnt = (u32)get_unaligned_be16(&scmd-
> > cmnd[7]);
>                 break;
>         case WRITE_12:
> -               is_write = true;
> +               rmd->is_write = true;
>                 fallthrough;
>         case READ_12:
> -               first_block = (u64)get_unaligned_be32(&scmd-
> > cmnd[2]);
> -               block_cnt = get_unaligned_be32(&scmd->cmnd[6]);
> +               rmd->first_block = (u64)get_unaligned_be32(&scmd-
> > cmnd[2]);
> +               rmd->block_cnt = get_unaligned_be32(&scmd->cmnd[6]);
>                 break;
>         case WRITE_16:
> -               is_write = true;
> +               rmd->is_write = true;
>                 fallthrough;
>         case READ_16:
> -               first_block = get_unaligned_be64(&scmd->cmnd[2]);
> -               block_cnt = get_unaligned_be32(&scmd->cmnd[10]);
> +               rmd->first_block = get_unaligned_be64(&scmd-
> > cmnd[2]);
> +               rmd->block_cnt = get_unaligned_be32(&scmd->cmnd[10]);
>                 break;
>         default:
>                 /* Process via normal I/O path. */
>                 return PQI_RAID_BYPASS_INELIGIBLE;
>         }
>  
> -       /* Check for write to non-RAID-0. */
> -       if (is_write && device->raid_level != SA_RAID_0)
> -               return PQI_RAID_BYPASS_INELIGIBLE;
> +       put_unaligned_le32(scsi_bufflen(scmd), &rmd->data_length);
>  
> -       if (unlikely(block_cnt == 0))
> -               return PQI_RAID_BYPASS_INELIGIBLE;
> +       return 0;
> +}
>  
> -       last_block = first_block + block_cnt - 1;
> -       raid_map = device->raid_map;
> +static int pci_get_aio_common_raid_map_values(struct pqi_ctrl_info
> *ctrl_info,
> +                                       struct
> pqi_scsi_dev_raid_map_data *rmd,
> +                                       struct raid_map *raid_map)
> +{
> +#if BITS_PER_LONG == 32
> +       u64 tmpdiv;
> +#endif
> +
> +       rmd->last_block = rmd->first_block + rmd->block_cnt - 1;
>  
>         /* Check for invalid block or wraparound. */
> -       if (last_block >= get_unaligned_le64(&raid_map-
> > volume_blk_cnt) ||
> -               last_block < first_block)
> +       if (rmd->last_block >=
> +               get_unaligned_le64(&raid_map->volume_blk_cnt) ||
> +               rmd->last_block < rmd->first_block)
>                 return PQI_RAID_BYPASS_INELIGIBLE;
>  
> -       data_disks_per_row = get_unaligned_le16(&raid_map-
> > data_disks_per_row);
> -       strip_size = get_unaligned_le16(&raid_map->strip_size);
> -       layout_map_count = get_unaligned_le16(&raid_map-
> > layout_map_count);
> +       rmd->data_disks_per_row =
> +                       get_unaligned_le16(&raid_map-
> > data_disks_per_row);
> +       rmd->strip_size = get_unaligned_le16(&raid_map->strip_size);
> +       rmd->layout_map_count = get_unaligned_le16(&raid_map-
> > layout_map_count);
>  
>         /* Calculate stripe information for the request. */
> -       blocks_per_row = data_disks_per_row * strip_size;
> +       rmd->blocks_per_row = rmd->data_disks_per_row * rmd-
> > strip_size;
>  #if BITS_PER_LONG == 32

Just wondering - why don't you use do_div() for 64 bit, too?
Same question for pqi_calc_aio_r5_or_r6() below.
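
do_div() works on 64-bit too (it compiles down to a plain division
there), so the #if blocks could collapse into a single path, roughly
(untested):

	u64 tmpdiv = rmd->first_block;

	/* do_div() leaves the quotient in its argument, returns the remainder */
	rmd->first_row_offset = (u32)do_div(tmpdiv, rmd->blocks_per_row);
	rmd->first_row = tmpdiv;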

> -       tmpdiv = first_block;
> -       do_div(tmpdiv, blocks_per_row);
> -       first_row = tmpdiv;
> -       tmpdiv = last_block;
> -       do_div(tmpdiv, blocks_per_row);
> -       last_row = tmpdiv;
> -       first_row_offset = (u32)(first_block - (first_row *
> blocks_per_row));
> -       last_row_offset = (u32)(last_block - (last_row *
> blocks_per_row));
> -       tmpdiv = first_row_offset;
> -       do_div(tmpdiv, strip_size);
> -       first_column = tmpdiv;
> -       tmpdiv = last_row_offset;
> -       do_div(tmpdiv, strip_size);
> -       last_column = tmpdiv;
> +       tmpdiv = rmd->first_block;
> +       do_div(tmpdiv, rmd->blocks_per_row);
> +       rmd->first_row = tmpdiv;
> +       tmpdiv = rmd->last_block;
> +       do_div(tmpdiv, rmd->blocks_per_row);
> +       rmd->last_row = tmpdiv;
> +       rmd->first_row_offset = (u32)(rmd->first_block - (rmd-
> > first_row * rmd->blocks_per_row));
> +       rmd->last_row_offset = (u32)(rmd->last_block - (rmd->last_row
> * rmd->blocks_per_row));
> +       tmpdiv = rmd->first_row_offset;
> +       do_div(tmpdiv, rmd->strip_size);
> +       rmd->first_column = tmpdiv;
> +       tmpdiv = rmd->last_row_offset;
> +       do_div(tmpdiv, rmd->strip_size);
> +       rmd->last_column = tmpdiv;
>  #else
> -       first_row = first_block / blocks_per_row;
> -       last_row = last_block / blocks_per_row;
> -       first_row_offset = (u32)(first_block - (first_row *
> blocks_per_row));
> -       last_row_offset = (u32)(last_block - (last_row *
> blocks_per_row));
> -       first_column = first_row_offset / strip_size;
> -       last_column = last_row_offset / strip_size;
> +       rmd->first_row = rmd->first_block / rmd->blocks_per_row;
> +       rmd->last_row = rmd->last_block / rmd->blocks_per_row;
> +       rmd->first_row_offset = (u32)(rmd->first_block -
> +                               (rmd->first_row * rmd-
> > blocks_per_row));
> +       rmd->last_row_offset = (u32)(rmd->last_block - (rmd->last_row
> *
> +                               rmd->blocks_per_row));
> +       rmd->first_column = rmd->first_row_offset / rmd->strip_size;
> +       rmd->last_column = rmd->last_row_offset / rmd->strip_size;
>  #endif
>  
>         /* If this isn't a single row/column then give to the
> controller. */
> -       if (first_row != last_row || first_column != last_column)
> +       if (rmd->first_row != rmd->last_row ||
> +                       rmd->first_column != rmd->last_column)
>                 return PQI_RAID_BYPASS_INELIGIBLE;

You could save a few cycles here by testing rows and columns
separately.
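
I.e. (sketch):

	if (rmd->first_row != rmd->last_row)
		return PQI_RAID_BYPASS_INELIGIBLE;
	if (rmd->first_column != rmd->last_column)
		return PQI_RAID_BYPASS_INELIGIBLE;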

>  
>         /* Proceeding with driver mapping. */
> -       total_disks_per_row = data_disks_per_row +
> +       rmd->total_disks_per_row = rmd->data_disks_per_row +
>                 get_unaligned_le16(&raid_map-
> > metadata_disks_per_row);
> -       map_row = ((u32)(first_row >> raid_map-
> > parity_rotation_shift)) %
> +       rmd->map_row = ((u32)(rmd->first_row >>
> +               raid_map->parity_rotation_shift)) %
>                 get_unaligned_le16(&raid_map->row_cnt);
> -       map_index = (map_row * total_disks_per_row) + first_column;
> +       rmd->map_index = (rmd->map_row * rmd->total_disks_per_row) +
> +                       rmd->first_column;
>  
> -       /* RAID 1 */
> -       if (device->raid_level == SA_RAID_1) {
> -               if (device->offload_to_mirror)
> -                       map_index += data_disks_per_row;
> -               device->offload_to_mirror = !device-
> > offload_to_mirror;
> -       } else if (device->raid_level == SA_RAID_ADM) {
> -               /* RAID ADM */
> -               /*
> -                * Handles N-way mirrors  (R1-ADM) and R10 with # of
> drives
> -                * divisible by 3.
> -                */
> -               offload_to_mirror = device->offload_to_mirror;
> -               if (offload_to_mirror == 0)  {
> -                       /* use physical disk in the first mirrored
> group. */
> -                       map_index %= data_disks_per_row;
> -               } else {
> -                       do {
> -                               /*
> -                                * Determine mirror group that
> map_index
> -                                * indicates.
> -                                */
> -                               current_group = map_index /
> data_disks_per_row;
> -
> -                               if (offload_to_mirror !=
> current_group) {
> -                                       if (current_group <
> -                                               layout_map_count - 1)
> {
> -                                               /*
> -                                                * Select raid index
> from
> -                                                * next group.
> -                                                */
> -                                               map_index +=
> data_disks_per_row;
> -                                               current_group++;
> -                                       } else {
> -                                               /*
> -                                                * Select raid index
> from first
> -                                                * group.
> -                                                */
> -                                               map_index %=
> data_disks_per_row;
> -                                               current_group = 0;
> -                                       }
> +       return 0;
> +}
> +
> +static int pqi_calc_aio_raid_adm(struct pqi_scsi_dev_raid_map_data
> *rmd,
> +                               struct pqi_scsi_dev *device)
> +{
> +       /* RAID ADM */
> +       /*
> +        * Handles N-way mirrors  (R1-ADM) and R10 with # of drives
> +        * divisible by 3.
> +        */
> +       rmd->offload_to_mirror = device->offload_to_mirror;
> +
> +       if (rmd->offload_to_mirror == 0)  {
> +               /* use physical disk in the first mirrored group. */
> +               rmd->map_index %= rmd->data_disks_per_row;
> +       } else {
> +               do {
> +                       /*
> +                        * Determine mirror group that map_index
> +                        * indicates.
> +                        */
> +                       rmd->current_group =
> +                               rmd->map_index / rmd-
> > data_disks_per_row;
> +
> +                       if (rmd->offload_to_mirror !=
> +                                       rmd->current_group) {
> +                               if (rmd->current_group <
> +                                       rmd->layout_map_count - 1) {
> +                                       /*
> +                                        * Select raid index from
> +                                        * next group.
> +                                        */
> +                                       rmd->map_index += rmd-
> > data_disks_per_row;
> +                                       rmd->current_group++;
> +                               } else {
> +                                       /*
> +                                        * Select raid index from
> first
> +                                        * group.
> +                                        */
> +                                       rmd->map_index %= rmd-
> > data_disks_per_row;
> +                                       rmd->current_group = 0;
>                                 }
> -                       } while (offload_to_mirror != current_group);
> -               }
> +                       }
> +               } while (rmd->offload_to_mirror != rmd-
> > current_group);
> +       }
>  
> -               /* Set mirror group to use next time. */
> -               offload_to_mirror =
> -                       (offload_to_mirror >= layout_map_count - 1) ?
> -                               0 : offload_to_mirror + 1;
> -               device->offload_to_mirror = offload_to_mirror;
> -               /*
> -                * Avoid direct use of device->offload_to_mirror
> within this
> -                * function since multiple threads might
> simultaneously
> -                * increment it beyond the range of device-
> > layout_map_count -1.
> -                */
> -       } else if ((device->raid_level == SA_RAID_5 ||
> -               device->raid_level == SA_RAID_6) && layout_map_count
> > 1) {
> -               /* RAID 50/60 */
> -               /* Verify first and last block are in same RAID group
> */
> -               r5or6_blocks_per_row = strip_size *
> data_disks_per_row;
> -               stripesize = r5or6_blocks_per_row * layout_map_count;
> +       /* Set mirror group to use next time. */
> +       rmd->offload_to_mirror =
> +               (rmd->offload_to_mirror >= rmd->layout_map_count - 1)
> ?
> +                       0 : rmd->offload_to_mirror + 1;
> +       device->offload_to_mirror = rmd->offload_to_mirror;
> +       /*
> +        * Avoid direct use of device->offload_to_mirror within this
> +        * function since multiple threads might simultaneously
> +        * increment it beyond the range of device->layout_map_count
> -1.
> +        */
> +
> +       return 0;
> +}
> +
> +static int pqi_calc_aio_r5_or_r6(struct pqi_scsi_dev_raid_map_data
> *rmd,
> +                               struct raid_map *raid_map)
> +{
> +#if BITS_PER_LONG == 32
> +       u64 tmpdiv;
> +#endif
> +       /* RAID 50/60 */
> +       /* Verify first and last block are in same RAID group */
> +       rmd->stripesize = rmd->blocks_per_row * rmd-
> > layout_map_count;
>  #if BITS_PER_LONG == 32
> -               tmpdiv = first_block;
> -               first_group = do_div(tmpdiv, stripesize);
> -               tmpdiv = first_group;
> -               do_div(tmpdiv, r5or6_blocks_per_row);
> -               first_group = tmpdiv;
> -               tmpdiv = last_block;
> -               last_group = do_div(tmpdiv, stripesize);
> -               tmpdiv = last_group;
> -               do_div(tmpdiv, r5or6_blocks_per_row);
> -               last_group = tmpdiv;
> +       tmpdiv = rmd->first_block;
> +       rmd->first_group = do_div(tmpdiv, rmd->stripesize);
> +       tmpdiv = rmd->first_group;
> +       do_div(tmpdiv, rmd->blocks_per_row);
> +       rmd->first_group = tmpdiv;
> +       tmpdiv = rmd->last_block;
> +       rmd->last_group = do_div(tmpdiv, rmd->stripesize);
> +       tmpdiv = rmd->last_group;
> +       do_div(tmpdiv, rmd->blocks_per_row);
> +       rmd->last_group = tmpdiv;
>  #else
> -               first_group = (first_block % stripesize) /
> r5or6_blocks_per_row;
> -               last_group = (last_block % stripesize) /
> r5or6_blocks_per_row;
> +       rmd->first_group = (rmd->first_block % rmd->stripesize) /
> rmd->blocks_per_row;
> +       rmd->last_group = (rmd->last_block % rmd->stripesize) / rmd-
> > blocks_per_row;
>  #endif
> -               if (first_group != last_group)
> -                       return PQI_RAID_BYPASS_INELIGIBLE;
> +       if (rmd->first_group != rmd->last_group)
> +               return PQI_RAID_BYPASS_INELIGIBLE;
>  
> -               /* Verify request is in a single row of RAID 5/6 */
> +       /* Verify request is in a single row of RAID 5/6 */
>  #if BITS_PER_LONG == 32
> -               tmpdiv = first_block;
> -               do_div(tmpdiv, stripesize);
> -               first_row = r5or6_first_row = r0_first_row = tmpdiv;
> -               tmpdiv = last_block;
> -               do_div(tmpdiv, stripesize);
> -               r5or6_last_row = r0_last_row = tmpdiv;
> +       tmpdiv = rmd->first_block;
> +       do_div(tmpdiv, rmd->stripesize);
> +       rmd->first_row = tmpdiv;
> +       rmd->r5or6_first_row = tmpdiv;
> +       tmpdiv = rmd->last_block;
> +       do_div(tmpdiv, rmd->stripesize);
> +       rmd->r5or6_last_row = tmpdiv;
>  #else
> -               first_row = r5or6_first_row = r0_first_row =
> -                       first_block / stripesize;
> -               r5or6_last_row = r0_last_row = last_block /
> stripesize;
> +       rmd->first_row = rmd->r5or6_first_row =
> +               rmd->first_block / rmd->stripesize;
> +       rmd->r5or6_last_row = rmd->last_block / rmd->stripesize;
>  #endif
> -               if (r5or6_first_row != r5or6_last_row)
> -                       return PQI_RAID_BYPASS_INELIGIBLE;
> +       if (rmd->r5or6_first_row != rmd->r5or6_last_row)
> +               return PQI_RAID_BYPASS_INELIGIBLE;
>  
> -               /* Verify request is in a single column */
> +       /* Verify request is in a single column */
>  #if BITS_PER_LONG == 32
> -               tmpdiv = first_block;
> -               first_row_offset = do_div(tmpdiv, stripesize);
> -               tmpdiv = first_row_offset;
> -               first_row_offset = (u32)do_div(tmpdiv,
> r5or6_blocks_per_row);
> -               r5or6_first_row_offset = first_row_offset;
> -               tmpdiv = last_block;
> -               r5or6_last_row_offset = do_div(tmpdiv, stripesize);
> -               tmpdiv = r5or6_last_row_offset;
> -               r5or6_last_row_offset = do_div(tmpdiv,
> r5or6_blocks_per_row);
> -               tmpdiv = r5or6_first_row_offset;
> -               do_div(tmpdiv, strip_size);
> -               first_column = r5or6_first_column = tmpdiv;
> -               tmpdiv = r5or6_last_row_offset;
> -               do_div(tmpdiv, strip_size);
> -               r5or6_last_column = tmpdiv;
> +       tmpdiv = rmd->first_block;
> +       rmd->first_row_offset = do_div(tmpdiv, rmd->stripesize);
> +       tmpdiv = rmd->first_row_offset;
> +       rmd->first_row_offset = (u32)do_div(tmpdiv, rmd-
> > blocks_per_row);
> +       rmd->r5or6_first_row_offset = rmd->first_row_offset;
> +       tmpdiv = rmd->last_block;
> +       rmd->r5or6_last_row_offset = do_div(tmpdiv, rmd->stripesize);
> +       tmpdiv = rmd->r5or6_last_row_offset;
> +       rmd->r5or6_last_row_offset = do_div(tmpdiv, rmd-
> > blocks_per_row);
> +       tmpdiv = rmd->r5or6_first_row_offset;
> +       do_div(tmpdiv, rmd->strip_size);
> +       rmd->first_column = rmd->r5or6_first_column = tmpdiv;
> +       tmpdiv = rmd->r5or6_last_row_offset;
> +       do_div(tmpdiv, rmd->strip_size);
> +       rmd->r5or6_last_column = tmpdiv;
>  #else
> -               first_row_offset = r5or6_first_row_offset =
> -                       (u32)((first_block % stripesize) %
> -                       r5or6_blocks_per_row);
> +       rmd->first_row_offset = rmd->r5or6_first_row_offset =
> +               (u32)((rmd->first_block %
> +                               rmd->stripesize) %
> +                               rmd->blocks_per_row);
> +
> +       rmd->r5or6_last_row_offset =
> +               (u32)((rmd->last_block % rmd->stripesize) %
> +               rmd->blocks_per_row);
> +
> +       rmd->first_column =
> +                       rmd->r5or6_first_row_offset / rmd-
> > strip_size;
> +       rmd->r5or6_first_column = rmd->first_column;
> +       rmd->r5or6_last_column = rmd->r5or6_last_row_offset / rmd-
> > strip_size;
> +#endif
> +       if (rmd->r5or6_first_column != rmd->r5or6_last_column)
> +               return PQI_RAID_BYPASS_INELIGIBLE;
> +
> +       /* Request is eligible */
> +       rmd->map_row =
> +               ((u32)(rmd->first_row >> raid_map-
> > parity_rotation_shift)) %
> +               get_unaligned_le16(&raid_map->row_cnt);
>  
> -               r5or6_last_row_offset =
> -                       (u32)((last_block % stripesize) %
> -                       r5or6_blocks_per_row);
> +       rmd->map_index = (rmd->first_group *
> +               (get_unaligned_le16(&raid_map->row_cnt) *
> +               rmd->total_disks_per_row)) +
> +               (rmd->map_row * rmd->total_disks_per_row) + rmd-
> > first_column;
>  
> -               first_column = r5or6_first_row_offset / strip_size;
> -               r5or6_first_column = first_column;
> -               r5or6_last_column = r5or6_last_row_offset /
> strip_size;
> -#endif
> -               if (r5or6_first_column != r5or6_last_column)
> -                       return PQI_RAID_BYPASS_INELIGIBLE;
> +       return 0;
> +}
> +
> +static void pqi_set_aio_cdb(struct pqi_scsi_dev_raid_map_data *rmd)
> +{
> +       /* Build the new CDB for the physical disk I/O. */
> +       if (rmd->disk_block > 0xffffffff) {
> +               rmd->cdb[0] = rmd->is_write ? WRITE_16 : READ_16;
> +               rmd->cdb[1] = 0;
> +               put_unaligned_be64(rmd->disk_block, &rmd->cdb[2]);
> +               put_unaligned_be32(rmd->disk_block_cnt, &rmd-
> > cdb[10]);
> +               rmd->cdb[14] = 0;
> +               rmd->cdb[15] = 0;
> +               rmd->cdb_length = 16;
> +       } else {
> +               rmd->cdb[0] = rmd->is_write ? WRITE_10 : READ_10;
> +               rmd->cdb[1] = 0;
> +               put_unaligned_be32((u32)rmd->disk_block, &rmd-
> > cdb[2]);
> +               rmd->cdb[6] = 0;
> +               put_unaligned_be16((u16)rmd->disk_block_cnt, &rmd-
> > cdb[7]);
> +               rmd->cdb[9] = 0;
> +               rmd->cdb_length = 10;
> +       }
> +}
> +
> +static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info
> *ctrl_info,
> +       struct pqi_scsi_dev *device, struct scsi_cmnd *scmd,
> +       struct pqi_queue_group *queue_group)
> +{
> +       struct raid_map *raid_map;
> +       int rc;
> +       struct pqi_encryption_info *encryption_info_ptr;
> +       struct pqi_encryption_info encryption_info;
> +       struct pqi_scsi_dev_raid_map_data rmd = {0};
> +
> +       rc = pqi_get_aio_lba_and_block_count(scmd, &rmd);
> +       if (rc)
> +               return PQI_RAID_BYPASS_INELIGIBLE;
> +
> +       rmd.raid_level = device->raid_level;
> +
> +       if (!pqi_aio_raid_level_supported(&rmd))
> +               return PQI_RAID_BYPASS_INELIGIBLE;
> +
> +       if (unlikely(rmd.block_cnt == 0))
> +               return PQI_RAID_BYPASS_INELIGIBLE;
> +
> +       raid_map = device->raid_map;
>  
> -               /* Request is eligible */
> -               map_row =
> -                       ((u32)(first_row >> raid_map-
> > parity_rotation_shift)) %
> -                       get_unaligned_le16(&raid_map->row_cnt);
> +       rc = pci_get_aio_common_raid_map_values(ctrl_info, &rmd,
> raid_map);
> +       if (rc)
> +               return PQI_RAID_BYPASS_INELIGIBLE;
>  
> -               map_index = (first_group *
> -                       (get_unaligned_le16(&raid_map->row_cnt) *
> -                       total_disks_per_row)) +
> -                       (map_row * total_disks_per_row) +
> first_column;
> +       /* RAID 1 */
> +       if (device->raid_level == SA_RAID_1) {
> +               if (device->offload_to_mirror)
> +                       rmd.map_index += rmd.data_disks_per_row;
> +               device->offload_to_mirror = !device-
> > offload_to_mirror;
> +       } else if (device->raid_level == SA_RAID_ADM) {
> +               rc = pqi_calc_aio_raid_adm(&rmd, device);

You don't use this return value. Actually, pqi_calc_aio_raid_adm()
could be a void function.
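
I.e. (sketch):

	static void pqi_calc_aio_raid_adm(struct pqi_scsi_dev_raid_map_data *rmd,
					struct pqi_scsi_dev *device)
	{
		/* ... body as above, minus the trailing "return 0;" ... */
	}

and the caller would just do

	pqi_calc_aio_raid_adm(&rmd, device);

without assigning rc.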





* Re: [PATCH V3 03/25] smartpqi: refactor build sg list code
  2020-12-10 20:34 ` [PATCH V3 03/25] smartpqi: refactor build sg list code Don Brace
@ 2021-01-07 16:43   ` Martin Wilck
  0 siblings, 0 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-07 16:43 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:34 -0600, Don Brace wrote:
> * No functional changes.
> * Factor out code common to all s/g list building.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Hint: readability would be improved if you could remove the
"goto out" statements in pqi_build_raid_sg_list() and
pqi_build_aio_sg_list(). That should go into a separate patch,
though.
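
E.g. the tail of pqi_build_aio_sg_list() could look roughly like this
(untested sketch, following the pattern used elsewhere in this series):

	num_sg_in_iu = 0;
	if (sg_count != 0) {
		sg = scsi_sglist(scmd);
		sg_descriptor = request->sg_descriptors;
		num_sg_in_iu = pqi_build_sg_list(sg_descriptor, sg, sg_count,
			io_request, ctrl_info->max_sg_per_iu, &chained);
		request->partial = chained;
		iu_length += num_sg_in_iu * sizeof(*sg_descriptor);
	}
	put_unaligned_le16(iu_length, &request->header.iu_length);
	request->num_sg_descriptors = num_sg_in_iu;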

Reviewed-by: Martin Wilck <mwilck@suse.com>





* Re: [PATCH V3 04/25] smartpqi: add support for raid5 and raid6 writes
  2020-12-10 20:34 ` [PATCH V3 04/25] smartpqi: add support for raid5 and raid6 writes Don Brace
@ 2021-01-07 16:44   ` Martin Wilck
  2021-01-08 22:56     ` Don.Brace
  0 siblings, 1 reply; 91+ messages in thread
From: Martin Wilck @ 2021-01-07 16:44 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:34 -0600, Don Brace wrote:
> * Add in new IU definition.
> * Add in support for raid5 and raid6 writes.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>  drivers/scsi/smartpqi/smartpqi.h      |   39 +++++
>  drivers/scsi/smartpqi/smartpqi_init.c |  247
> ++++++++++++++++++++++++++++++++-
>  2 files changed, 278 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi.h
> b/drivers/scsi/smartpqi/smartpqi.h
> index d486a2ec3045..e9844210c4a0 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -257,6 +257,7 @@ struct pqi_device_capability {
>  };
>  
>  #define PQI_MAX_EMBEDDED_SG_DESCRIPTORS                4
> +#define PQI_MAX_EMBEDDED_R56_SG_DESCRIPTORS    3
>  
>  struct pqi_raid_path_request {
>         struct pqi_iu_header header;
> @@ -312,6 +313,39 @@ struct pqi_aio_path_request {
>                 sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
>  };
>  
> +#define PQI_RAID56_XFER_LIMIT_4K       0x1000 /* 4Kib */
> +#define PQI_RAID56_XFER_LIMIT_8K       0x2000 /* 8Kib */

You don't seem to use these, and you'll remove them again in patch
06/25.

> +struct pqi_aio_r56_path_request {
> +       struct pqi_iu_header header;
> +       __le16  request_id;
> +       __le16  volume_id;              /* ID of the RAID volume */
> +       __le32  data_it_nexus;          /* IT nexus for the data
> drive */
> +       __le32  p_parity_it_nexus;      /* IT nexus for the P parity
> drive */
> +       __le32  q_parity_it_nexus;      /* IT nexus for the Q parity
> drive */
> +       __le32  data_length;            /* total bytes to read/write
> */
> +       u8      data_direction : 2;
> +       u8      partial : 1;
> +       u8      mem_type : 1;           /* 0b: PCIe, 1b: DDR */
> +       u8      fence : 1;
> +       u8      encryption_enable : 1;
> +       u8      reserved : 2;
> +       u8      task_attribute : 3;
> +       u8      command_priority : 4;
> +       u8      reserved1 : 1;
> +       __le16  data_encryption_key_index;
> +       u8      cdb[16];
> +       __le16  error_index;
> +       u8      num_sg_descriptors;
> +       u8      cdb_length;
> +       u8      xor_multiplier;
> +       u8      reserved2[3];
> +       __le32  encrypt_tweak_lower;
> +       __le32  encrypt_tweak_upper;
> +       u8      row;                    /* row = logical lba/blocks
> per row */
> +       u8      reserved3[8];
> +       struct pqi_sg_descriptor
> sg_descriptors[PQI_MAX_EMBEDDED_R56_SG_DESCRIPTORS];
> +};
> +
>  struct pqi_io_response {
>         struct pqi_iu_header header;
>         __le16  request_id;
> @@ -484,6 +518,8 @@ struct pqi_raid_error_info {
>  #define PQI_REQUEST_IU_TASK_MANAGEMENT                 0x13
>  #define PQI_REQUEST_IU_RAID_PATH_IO                    0x14
>  #define PQI_REQUEST_IU_AIO_PATH_IO                     0x15
> +#define PQI_REQUEST_IU_AIO_PATH_RAID5_IO               0x18
> +#define PQI_REQUEST_IU_AIO_PATH_RAID6_IO               0x19
>  #define PQI_REQUEST_IU_GENERAL_ADMIN                   0x60
>  #define PQI_REQUEST_IU_REPORT_VENDOR_EVENT_CONFIG      0x72
>  #define PQI_REQUEST_IU_SET_VENDOR_EVENT_CONFIG         0x73
> @@ -1179,6 +1215,7 @@ struct pqi_ctrl_info {
>         u16             max_inbound_iu_length_per_firmware;
>         u16             max_inbound_iu_length;
>         unsigned int    max_sg_per_iu;
> +       unsigned int    max_sg_per_r56_iu;
>         void            *admin_queue_memory_base;
>         u32             admin_queue_memory_length;
>         dma_addr_t      admin_queue_memory_base_dma_handle;
> @@ -1210,6 +1247,8 @@ struct pqi_ctrl_info {
>         u8              soft_reset_handshake_supported : 1;
>         u8              raid_iu_timeout_supported: 1;
>         u8              tmf_iu_timeout_supported: 1;
> +       u8              enable_r5_writes : 1;
> +       u8              enable_r6_writes : 1;
>  
>         struct list_head scsi_device_list;
>         spinlock_t      scsi_device_list_lock;
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c
> b/drivers/scsi/smartpqi/smartpqi_init.c
> index 6bcb037ae9d7..c813cec10003 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -67,6 +67,10 @@ static int pqi_aio_submit_io(struct pqi_ctrl_info
> *ctrl_info,
>         struct scsi_cmnd *scmd, u32 aio_handle, u8 *cdb,
>         unsigned int cdb_length, struct pqi_queue_group *queue_group,
>         struct pqi_encryption_info *encryption_info, bool
> raid_bypass);
> +static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info
> *ctrl_info,
> +       struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
> +       struct pqi_encryption_info *encryption_info, struct
> pqi_scsi_dev *device,
> +       struct pqi_scsi_dev_raid_map_data *rmd);
>  static void pqi_ofa_ctrl_quiesce(struct pqi_ctrl_info *ctrl_info);
>  static void pqi_ofa_ctrl_unquiesce(struct pqi_ctrl_info *ctrl_info);
>  static int pqi_ofa_ctrl_restart(struct pqi_ctrl_info *ctrl_info);
> @@ -2237,7 +2241,8 @@ static inline void pqi_set_encryption_info(
>   * Attempt to perform RAID bypass mapping for a logical volume I/O.
>   */
>  
> -static bool pqi_aio_raid_level_supported(struct
> pqi_scsi_dev_raid_map_data *rmd)
> +static bool pqi_aio_raid_level_supported(struct pqi_ctrl_info
> *ctrl_info,
> +       struct pqi_scsi_dev_raid_map_data *rmd)
>  {
>         bool is_supported = true;
>  
> @@ -2245,13 +2250,14 @@ static bool
> pqi_aio_raid_level_supported(struct pqi_scsi_dev_raid_map_data *rmd)
>         case SA_RAID_0:
>                 break;
>         case SA_RAID_1:
> -               if (rmd->is_write)
> -                       is_supported = false;
> +               is_supported = false;

You disable RAID1 READs with this patch. I can see you fix it again in
05/25, but it still looks wrong to break them here.
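
To keep the series bisectable, this patch could leave the RAID1 case as
patch 02 had it, e.g. (sketch):

	case SA_RAID_1:
		if (rmd->is_write)
			is_supported = false;	/* keep READs bypass-eligible */
		break;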

>                 break;
>         case SA_RAID_5:
> -               fallthrough;
> +               if (rmd->is_write && !ctrl_info->enable_r5_writes)
> +                       is_supported = false;
> +               break;
>         case SA_RAID_6:
> -               if (rmd->is_write)
> +               if (rmd->is_write && !ctrl_info->enable_r6_writes)
>                         is_supported = false;
>                 break;
>         case SA_RAID_ADM:
> @@ -2526,6 +2532,26 @@ static int pqi_calc_aio_r5_or_r6(struct
> pqi_scsi_dev_raid_map_data *rmd,
>                 rmd->total_disks_per_row)) +
>                 (rmd->map_row * rmd->total_disks_per_row) + rmd-
> > first_column;
>  
> +       if (rmd->is_write) {
> +               rmd->p_index = (rmd->map_row * rmd->total_disks_per_row) + rmd->data_disks_per_row;
> +               rmd->p_parity_it_nexus = raid_map->disk_data[rmd->p_index].aio_handle;

I suppose you have made sure rmd->p_index can't be larger than the
size of raid_map->disk_data. A comment explaining that would be helpful
for the reader though.
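
Something along these lines, perhaps (assuming the usual raid_map
invariants hold; RAID_MAP_MAX_ENTRIES is the size of disk_data[] in
struct raid_map):

	/*
	 * rmd->map_row < raid_map->row_cnt and data_disks_per_row <
	 * total_disks_per_row, so p_index (and q_index for RAID 6) stays
	 * below row_cnt * total_disks_per_row <= RAID_MAP_MAX_ENTRIES.
	 */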


> +               if (rmd->raid_level == SA_RAID_6) {
> +                       rmd->q_index = (rmd->map_row * rmd->total_disks_per_row) +
> +                               (rmd->data_disks_per_row + 1);
> +                       rmd->q_parity_it_nexus = raid_map->disk_data[rmd->q_index].aio_handle;
> +                       rmd->xor_mult = raid_map->disk_data[rmd->map_index].xor_mult[1];

See above.

> +               }
> +               if (rmd->blocks_per_row == 0)
> +                       return PQI_RAID_BYPASS_INELIGIBLE;
> +#if BITS_PER_LONG == 32
> +               tmpdiv = rmd->first_block;
> +               do_div(tmpdiv, rmd->blocks_per_row);
> +               rmd->row = tmpdiv;
> +#else
> +               rmd->row = rmd->first_block / rmd->blocks_per_row;
> +#endif

Why not always use do_div()?

> +       }
> +
>         return 0;
>  }
>  
> @@ -2567,7 +2593,7 @@ static int
> pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>  
>         rmd.raid_level = device->raid_level;
>  
> -       if (!pqi_aio_raid_level_supported(&rmd))
> +       if (!pqi_aio_raid_level_supported(ctrl_info, &rmd))
>                 return PQI_RAID_BYPASS_INELIGIBLE;
>  
>         if (unlikely(rmd.block_cnt == 0))
> @@ -2587,7 +2613,8 @@ static int
> pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>         } else if (device->raid_level == SA_RAID_ADM) {
>                 rc = pqi_calc_aio_raid_adm(&rmd, device);
>         } else if ((device->raid_level == SA_RAID_5 ||
> -               device->raid_level == SA_RAID_6) &&
> rmd.layout_map_count > 1) {
> +               device->raid_level == SA_RAID_6) &&
> +               (rmd.layout_map_count > 1 || rmd.is_write)) {
>                 rc = pqi_calc_aio_r5_or_r6(&rmd, raid_map);
>                 if (rc)
>                         return PQI_RAID_BYPASS_INELIGIBLE;
> @@ -2622,9 +2649,27 @@ static int
> pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>                 encryption_info_ptr = NULL;
>         }
>  
> -       return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
> +       if (rmd.is_write) {
> +               switch (device->raid_level) {
> +               case SA_RAID_0:
> +                       return pqi_aio_submit_io(ctrl_info, scmd,
> rmd.aio_handle,
>                                 rmd.cdb, rmd.cdb_length, queue_group,
>                                 encryption_info_ptr, true);
> +               case SA_RAID_5:
> +               case SA_RAID_6:
> +                       return pqi_aio_submit_r56_write_io(ctrl_info,
> scmd, queue_group,
> +                                       encryption_info_ptr, device,
> &rmd);
> +               default:
> +                       return pqi_aio_submit_io(ctrl_info, scmd,
> rmd.aio_handle,
> +                               rmd.cdb, rmd.cdb_length, queue_group,
> +                               encryption_info_ptr, true);
> +               }
> +       } else {
> +               return pqi_aio_submit_io(ctrl_info, scmd,
> rmd.aio_handle,
> +                       rmd.cdb, rmd.cdb_length, queue_group,
> +                       encryption_info_ptr, true);
> +       }
> +
>  }
>  
>  #define PQI_STATUS_IDLE                0x0
> @@ -4844,6 +4889,12 @@ static void
> pqi_calculate_queue_resources(struct pqi_ctrl_info *ctrl_info)
>                 PQI_OPERATIONAL_IQ_ELEMENT_LENGTH) /
>                 sizeof(struct pqi_sg_descriptor)) +
>                 PQI_MAX_EMBEDDED_SG_DESCRIPTORS;
> +
> +       ctrl_info->max_sg_per_r56_iu =
> +               ((ctrl_info->max_inbound_iu_length -
> +               PQI_OPERATIONAL_IQ_ELEMENT_LENGTH) /
> +               sizeof(struct pqi_sg_descriptor)) +
> +               PQI_MAX_EMBEDDED_R56_SG_DESCRIPTORS;
>  }
>  
>  static inline void pqi_set_sg_descriptor(
> @@ -4931,6 +4982,44 @@ static int pqi_build_raid_sg_list(struct
> pqi_ctrl_info *ctrl_info,
>         return 0;
>  }
>  
> +static int pqi_build_aio_r56_sg_list(struct pqi_ctrl_info
> *ctrl_info,
> +       struct pqi_aio_r56_path_request *request, struct scsi_cmnd
> *scmd,
> +       struct pqi_io_request *io_request)
> +{
> +       u16 iu_length;
> +       int sg_count;
> +       bool chained;
> +       unsigned int num_sg_in_iu;
> +       struct scatterlist *sg;
> +       struct pqi_sg_descriptor *sg_descriptor;
> +
> +       sg_count = scsi_dma_map(scmd);
> +       if (sg_count < 0)
> +               return sg_count;
> +
> +       iu_length = offsetof(struct pqi_aio_r56_path_request,
> sg_descriptors) -
> +               PQI_REQUEST_HEADER_LENGTH;
> +       num_sg_in_iu = 0;
> +
> +       if (sg_count == 0)
> +               goto out;

An if {} block would be more readable here.
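
I.e. (sketch of the same logic without the jump):

	if (sg_count != 0) {
		sg = scsi_sglist(scmd);
		sg_descriptor = request->sg_descriptors;

		num_sg_in_iu = pqi_build_sg_list(sg_descriptor, sg, sg_count,
			io_request, ctrl_info->max_sg_per_r56_iu, &chained);

		request->partial = chained;
		iu_length += num_sg_in_iu * sizeof(*sg_descriptor);
	}

	put_unaligned_le16(iu_length, &request->header.iu_length);
	request->num_sg_descriptors = num_sg_in_iu;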

> +
> +       sg = scsi_sglist(scmd);
> +       sg_descriptor = request->sg_descriptors;
> +
> +       num_sg_in_iu = pqi_build_sg_list(sg_descriptor, sg, sg_count,
> io_request,
> +               ctrl_info->max_sg_per_r56_iu, &chained);
> +
> +       request->partial = chained;
> +       iu_length += num_sg_in_iu * sizeof(*sg_descriptor);
> +
> +out:
> +       put_unaligned_le16(iu_length, &request->header.iu_length);
> +       request->num_sg_descriptors = num_sg_in_iu;
> +
> +       return 0;
> +}
> +
>  static int pqi_build_aio_sg_list(struct pqi_ctrl_info *ctrl_info,
>         struct pqi_aio_path_request *request, struct scsi_cmnd *scmd,
>         struct pqi_io_request *io_request)
> @@ -5335,6 +5424,88 @@ static int pqi_aio_submit_io(struct
> pqi_ctrl_info *ctrl_info,
>         return 0;
>  }
>  
> +static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info
> *ctrl_info,
> +       struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
> +       struct pqi_encryption_info *encryption_info, struct
> pqi_scsi_dev *device,
> +       struct pqi_scsi_dev_raid_map_data *rmd)
> +{
> +       int rc;
> +       struct pqi_io_request *io_request;
> +       struct pqi_aio_r56_path_request *r56_request;
> +
> +       io_request = pqi_alloc_io_request(ctrl_info);
> +       io_request->io_complete_callback = pqi_aio_io_complete;
> +       io_request->scmd = scmd;
> +       io_request->raid_bypass = true;
> +
> +       r56_request = io_request->iu;
> +       memset(r56_request, 0, offsetof(struct
> pqi_aio_r56_path_request, sg_descriptors));
> +
> +       if (device->raid_level == SA_RAID_5 || device->raid_level ==
> SA_RAID_51)
> +               r56_request->header.iu_type =
> PQI_REQUEST_IU_AIO_PATH_RAID5_IO;
> +       else
> +               r56_request->header.iu_type =
> PQI_REQUEST_IU_AIO_PATH_RAID6_IO;
> +
> +       put_unaligned_le16(*(u16 *)device->scsi3addr & 0x3fff,
> &r56_request->volume_id);
> +       put_unaligned_le32(rmd->aio_handle, &r56_request-
> > data_it_nexus);
> +       put_unaligned_le32(rmd->p_parity_it_nexus, &r56_request-
> > p_parity_it_nexus);
> +       if (rmd->raid_level == SA_RAID_6) {
> +               put_unaligned_le32(rmd->q_parity_it_nexus,
> &r56_request->q_parity_it_nexus);
> +               r56_request->xor_multiplier = rmd->xor_mult;
> +       }
> +       put_unaligned_le32(scsi_bufflen(scmd), &r56_request-
> > data_length);
> +       r56_request->task_attribute = SOP_TASK_ATTRIBUTE_SIMPLE;
> +       put_unaligned_le64(rmd->row, &r56_request->row);
> +
> +       put_unaligned_le16(io_request->index, &r56_request-
> > request_id);
> +       r56_request->error_index = r56_request->request_id;
> +
> +       if (rmd->cdb_length > sizeof(r56_request->cdb))
> +               rmd->cdb_length = sizeof(r56_request->cdb);
> +       r56_request->cdb_length = rmd->cdb_length;
> +       memcpy(r56_request->cdb, rmd->cdb, rmd->cdb_length);
> +
> +       switch (scmd->sc_data_direction) {
> +       case DMA_TO_DEVICE:
> +               r56_request->data_direction = SOP_READ_FLAG;
> +               break;

I wonder how sc_data_direction could be anything other than
DMA_TO_DEVICE here. AFAICS we only reach this code for WRITE commands.
Please add a comment.
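
E.g. (sketch):

	case DMA_TO_DEVICE:
		/*
		 * This function is only called for RAID bypass WRITEs, so
		 * DMA_TO_DEVICE is the only direction we expect to see here.
		 */
		r56_request->data_direction = SOP_READ_FLAG;
		break;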


> +       case DMA_FROM_DEVICE:
> +               r56_request->data_direction = SOP_WRITE_FLAG;
> +               break;
> +       case DMA_NONE:
> +               r56_request->data_direction = SOP_NO_DIRECTION_FLAG;
> +               break;
> +       case DMA_BIDIRECTIONAL:
> +               r56_request->data_direction = SOP_BIDIRECTIONAL;
> +               break;
> +       default:
> +               dev_err(&ctrl_info->pci_dev->dev,
> +                       "unknown data direction: %d\n",
> +                       scmd->sc_data_direction);
> +               break;
> +       }
> +
> +       if (encryption_info) {
> +               r56_request->encryption_enable = true;
> +               put_unaligned_le16(encryption_info-
> > data_encryption_key_index,
> +                               &r56_request-
> > data_encryption_key_index);
> +               put_unaligned_le32(encryption_info-
> > encrypt_tweak_lower,
> +                               &r56_request->encrypt_tweak_lower);
> +               put_unaligned_le32(encryption_info-
> > encrypt_tweak_upper,
> +                               &r56_request->encrypt_tweak_upper);
> +       }
> +
> +       rc = pqi_build_aio_r56_sg_list(ctrl_info, r56_request, scmd,
> io_request);
> +       if (rc) {
> +               pqi_free_io_request(io_request);
> +               return SCSI_MLQUEUE_HOST_BUSY;
> +       }
> +
> +       pqi_start_io(ctrl_info, queue_group, AIO_PATH, io_request);
> +
> +       return 0;
> +}
> +
>  static inline u16 pqi_get_hw_queue(struct pqi_ctrl_info *ctrl_info,
>         struct scsi_cmnd *scmd)
>  {
> @@ -6298,6 +6469,60 @@ static ssize_t pqi_lockup_action_store(struct
> device *dev,
>         return -EINVAL;
>  }
>  
> +static ssize_t pqi_host_enable_r5_writes_show(struct device *dev,
> +       struct device_attribute *attr, char *buffer)
> +{
> +       struct Scsi_Host *shost = class_to_shost(dev);
> +       struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
> +
> +       return scnprintf(buffer, 10, "%hhx\n", ctrl_info->enable_r5_writes);

"%hhx" is deprecated, see
https://lore.kernel.org/lkml/20190914015858.7c76e036@lwn.net/T/
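
Since a u8 is promoted to int through the varargs anyway, plain "%x"
(or "%u") would do, e.g. (sketch):

	return scnprintf(buffer, 10, "%x\n", ctrl_info->enable_r5_writes);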


> +}
> +
> +static ssize_t pqi_host_enable_r5_writes_store(struct device *dev,
> +       struct device_attribute *attr, const char *buffer, size_t
> count)
> +{
> +       struct Scsi_Host *shost = class_to_shost(dev);
> +       struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
> +       u8 set_r5_writes = 0;
> +
> +       if (kstrtou8(buffer, 0, &set_r5_writes))
> +               return -EINVAL;
> +
> +       if (set_r5_writes > 0)
> +               set_r5_writes = 1;
> +
> +       ctrl_info->enable_r5_writes = set_r5_writes;
> +
> +       return count;
> +}
> +
> +static ssize_t pqi_host_enable_r6_writes_show(struct device *dev,
> +       struct device_attribute *attr, char *buffer)
> +{
> +       struct Scsi_Host *shost = class_to_shost(dev);
> +       struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
> +
> +       return scnprintf(buffer, 10, "%hhx\n", ctrl_info->enable_r6_writes);

See above

> +}
> +
> +static ssize_t pqi_host_enable_r6_writes_store(struct device *dev,
> +       struct device_attribute *attr, const char *buffer, size_t
> count)
> +{
> +       struct Scsi_Host *shost = class_to_shost(dev);
> +       struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
> +       u8 set_r6_writes = 0;
> +
> +       if (kstrtou8(buffer, 0, &set_r6_writes))
> +               return -EINVAL;
> +
> +       if (set_r6_writes > 0)
> +               set_r6_writes = 1;
> +
> +       ctrl_info->enable_r6_writes = set_r6_writes;
> +
> +       return count;
> +}
> +
>  static DEVICE_ATTR(driver_version, 0444, pqi_driver_version_show,
> NULL);
>  static DEVICE_ATTR(firmware_version, 0444,
> pqi_firmware_version_show, NULL);
>  static DEVICE_ATTR(model, 0444, pqi_model_show, NULL);
> @@ -6306,6 +6531,10 @@ static DEVICE_ATTR(vendor, 0444,
> pqi_vendor_show, NULL);
>  static DEVICE_ATTR(rescan, 0200, NULL, pqi_host_rescan_store);
>  static DEVICE_ATTR(lockup_action, 0644, pqi_lockup_action_show,
>         pqi_lockup_action_store);
> +static DEVICE_ATTR(enable_r5_writes, 0644,
> +       pqi_host_enable_r5_writes_show,
> pqi_host_enable_r5_writes_store);
> +static DEVICE_ATTR(enable_r6_writes, 0644,
> +       pqi_host_enable_r6_writes_show,
> pqi_host_enable_r6_writes_store);
>  
>  static struct device_attribute *pqi_shost_attrs[] = {
>         &dev_attr_driver_version,
> @@ -6315,6 +6544,8 @@ static struct device_attribute
> *pqi_shost_attrs[] = {
>         &dev_attr_vendor,
>         &dev_attr_rescan,
>         &dev_attr_lockup_action,
> +       &dev_attr_enable_r5_writes,
> +       &dev_attr_enable_r6_writes,
>         NULL
>  };
>  
> 







* Re: [PATCH V3 05/25] smartpqi: add support for raid1 writes
  2020-12-10 20:34 ` [PATCH V3 05/25] smartpqi: add support for raid1 writes Don Brace
@ 2021-01-07 16:44   ` Martin Wilck
  2021-01-09 16:56     ` Don.Brace
  0 siblings, 1 reply; 91+ messages in thread
From: Martin Wilck @ 2021-01-07 16:44 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:34 -0600, Don Brace wrote:
> * Add raid1 write IU.
> * Add in raid1 write support.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>  drivers/scsi/smartpqi/smartpqi.h      |   37 +++++
>  drivers/scsi/smartpqi/smartpqi_init.c |  235
> +++++++++++++++++++++++----------
>  2 files changed, 196 insertions(+), 76 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi.h
> b/drivers/scsi/smartpqi/smartpqi.h
> index e9844210c4a0..225ec6843c68 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -313,6 +313,36 @@ struct pqi_aio_path_request {
>                 sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
>  };
>  
> +#define PQI_RAID1_NVME_XFER_LIMIT      (32 * 1024)     /* 32 KiB */
> +struct pqi_aio_r1_path_request {
> +       struct pqi_iu_header header;
> +       __le16  request_id;
> +       __le16  volume_id;      /* ID of the RAID volume */
> +       __le32  it_nexus_1;     /* IT nexus of the 1st drive in the
> RAID volume */
> +       __le32  it_nexus_2;     /* IT nexus of the 2nd drive in the
> RAID volume */
> +       __le32  it_nexus_3;     /* IT nexus of the 3rd drive in the
> RAID volume */
> +       __le32  data_length;    /* total bytes to read/write */
> +       u8      data_direction : 2;
> +       u8      partial : 1;
> +       u8      memory_type : 1;
> +       u8      fence : 1;
> +       u8      encryption_enable : 1;
> +       u8      reserved : 2;
> +       u8      task_attribute : 3;
> +       u8      command_priority : 4;
> +       u8      reserved2 : 1;
> +       __le16  data_encryption_key_index;
> +       u8      cdb[16];
> +       __le16  error_index;
> +       u8      num_sg_descriptors;
> +       u8      cdb_length;
> +       u8      num_drives;     /* number of drives in the RAID
> volume (2 or 3) */
> +       u8      reserved3[3];
> +       __le32  encrypt_tweak_lower;
> +       __le32  encrypt_tweak_upper;
> +       struct pqi_sg_descriptor
> sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
> +};
> +
>  #define PQI_RAID56_XFER_LIMIT_4K       0x1000 /* 4Kib */
>  #define PQI_RAID56_XFER_LIMIT_8K       0x2000 /* 8Kib */
>  struct pqi_aio_r56_path_request {
> @@ -520,6 +550,7 @@ struct pqi_raid_error_info {
>  #define PQI_REQUEST_IU_AIO_PATH_IO                     0x15
>  #define PQI_REQUEST_IU_AIO_PATH_RAID5_IO               0x18
>  #define PQI_REQUEST_IU_AIO_PATH_RAID6_IO               0x19
> +#define PQI_REQUEST_IU_AIO_PATH_RAID1_IO               0x1A
>  #define PQI_REQUEST_IU_GENERAL_ADMIN                   0x60
>  #define PQI_REQUEST_IU_REPORT_VENDOR_EVENT_CONFIG      0x72
>  #define PQI_REQUEST_IU_SET_VENDOR_EVENT_CONFIG         0x73
> @@ -972,14 +1003,12 @@ struct pqi_scsi_dev_raid_map_data {
>         u16     strip_size;
>         u32     first_group;
>         u32     last_group;
> -       u32     current_group;
>         u32     map_row;
>         u32     aio_handle;
>         u64     disk_block;
>         u32     disk_block_cnt;
>         u8      cdb[16];
>         u8      cdb_length;
> -       int     offload_to_mirror;
>  
>         /* RAID1 specific */
>  #define NUM_RAID1_MAP_ENTRIES 3
> @@ -1040,8 +1069,7 @@ struct pqi_scsi_dev {
>         u16     phys_connector[8];
>         bool    raid_bypass_configured; /* RAID bypass configured */
>         bool    raid_bypass_enabled;    /* RAID bypass enabled */
> -       int     offload_to_mirror;      /* Send next RAID bypass
> request */
> -                                       /* to mirror drive. */
> +       u32     next_bypass_group;
>         struct raid_map *raid_map;      /* RAID bypass map */
>  
>         struct pqi_sas_port *sas_port;
> @@ -1247,6 +1275,7 @@ struct pqi_ctrl_info {
>         u8              soft_reset_handshake_supported : 1;
>         u8              raid_iu_timeout_supported: 1;
>         u8              tmf_iu_timeout_supported: 1;
> +       u8              enable_r1_writes : 1;
>         u8              enable_r5_writes : 1;
>         u8              enable_r6_writes : 1;
>  
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index c813cec10003..8da9031c9c0b 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -67,6 +67,10 @@ static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,
>         struct scsi_cmnd *scmd, u32 aio_handle, u8 *cdb,
>         unsigned int cdb_length, struct pqi_queue_group *queue_group,
>         struct pqi_encryption_info *encryption_info, bool raid_bypass);
> +static  int pqi_aio_submit_r1_write_io(struct pqi_ctrl_info *ctrl_info,
> +       struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
> +       struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
> +       struct pqi_scsi_dev_raid_map_data *rmd);
>  static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info *ctrl_info,
>         struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
>         struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
> @@ -1717,7 +1721,7 @@ static void pqi_scsi_update_device(struct pqi_scsi_dev *existing_device,
>                 sizeof(existing_device->box));
>         memcpy(existing_device->phys_connector, new_device->phys_connector,
>                 sizeof(existing_device->phys_connector));
> -       existing_device->offload_to_mirror = 0;
> +       existing_device->next_bypass_group = 0;
>         kfree(existing_device->raid_map);
>         existing_device->raid_map = new_device->raid_map;
>         existing_device->raid_bypass_configured =
> @@ -2250,7 +2254,10 @@ static bool pqi_aio_raid_level_supported(struct pqi_ctrl_info *ctrl_info,
>         case SA_RAID_0:
>                 break;
>         case SA_RAID_1:
> -               is_supported = false;
> +               fallthrough;

Nit: fallthrough isn't necessary here.

> +       case SA_RAID_ADM:
> +               if (rmd->is_write && !ctrl_info->enable_r1_writes)
> +                       is_supported = false;
>                 break;
>         case SA_RAID_5:
>                 if (rmd->is_write && !ctrl_info->enable_r5_writes)
> @@ -2260,10 +2267,6 @@ static bool pqi_aio_raid_level_supported(struct pqi_ctrl_info *ctrl_info,
>                 if (rmd->is_write && !ctrl_info->enable_r6_writes)
>                         is_supported = false;
>                 break;
> -       case SA_RAID_ADM:
> -               if (rmd->is_write)
> -                       is_supported = false;
> -               break;
>         default:
>                 is_supported = false;
>         }
> @@ -2385,64 +2388,6 @@ static int pci_get_aio_common_raid_map_values(struct pqi_ctrl_info *ctrl_info,
>         return 0;
>  }
>  
> -static int pqi_calc_aio_raid_adm(struct pqi_scsi_dev_raid_map_data *rmd,
> -                               struct pqi_scsi_dev *device)
> -{
> -       /* RAID ADM */
> -       /*
> -        * Handles N-way mirrors  (R1-ADM) and R10 with # of drives
> -        * divisible by 3.
> -        */
> -       rmd->offload_to_mirror = device->offload_to_mirror;
> -
> -       if (rmd->offload_to_mirror == 0)  {
> -               /* use physical disk in the first mirrored group. */
> -               rmd->map_index %= rmd->data_disks_per_row;
> -       } else {
> -               do {
> -                       /*
> -                        * Determine mirror group that map_index
> -                        * indicates.
> -                        */
> -                       rmd->current_group =
> -                               rmd->map_index / rmd->data_disks_per_row;
> -
> -                       if (rmd->offload_to_mirror !=
> -                                       rmd->current_group) {
> -                               if (rmd->current_group <
> -                                       rmd->layout_map_count - 1) {
> -                                       /*
> -                                        * Select raid index from
> -                                        * next group.
> -                                        */
> -                                       rmd->map_index += rmd->data_disks_per_row;
> -                                       rmd->current_group++;
> -                               } else {
> -                                       /*
> -                                        * Select raid index from first
> -                                        * group.
> -                                        */
> -                                       rmd->map_index %= rmd->data_disks_per_row;
> -                                       rmd->current_group = 0;
> -                               }
> -                       }
> -               } while (rmd->offload_to_mirror != rmd->current_group);
> -       }
> -
> -       /* Set mirror group to use next time. */
> -       rmd->offload_to_mirror =
> -               (rmd->offload_to_mirror >= rmd->layout_map_count - 1) ?
> -                       0 : rmd->offload_to_mirror + 1;
> -       device->offload_to_mirror = rmd->offload_to_mirror;
> -       /*
> -        * Avoid direct use of device->offload_to_mirror within this
> -        * function since multiple threads might simultaneously
> -        * increment it beyond the range of device->layout_map_count -1.
> -        */
> -
> -       return 0;
> -}
> -
>  static int pqi_calc_aio_r5_or_r6(struct pqi_scsi_dev_raid_map_data *rmd,
>                                 struct raid_map *raid_map)
>  {
> @@ -2577,12 +2522,34 @@ static void pqi_set_aio_cdb(struct pqi_scsi_dev_raid_map_data *rmd)
>         }
>  }
>  
> +static void pqi_calc_aio_r1_nexus(struct raid_map *raid_map,
> +                               struct pqi_scsi_dev_raid_map_data *rmd)
> +{
> +       u32 index;
> +       u32 group;
> +
> +       group = rmd->map_index / rmd->data_disks_per_row;
> +
> +       index = rmd->map_index - (group * rmd->data_disks_per_row);
> +       rmd->it_nexus[0] = raid_map->disk_data[index].aio_handle;
> +       index += rmd->data_disks_per_row;
> +       rmd->it_nexus[1] = raid_map->disk_data[index].aio_handle;
> +       if (rmd->layout_map_count > 2) {
> +               index += rmd->data_disks_per_row;
> +               rmd->it_nexus[2] = raid_map->disk_data[index].aio_handle;
> +       }
> +
> +       rmd->num_it_nexus_entries = rmd->layout_map_count;
> +}
> +
>  static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>         struct pqi_scsi_dev *device, struct scsi_cmnd *scmd,
>         struct pqi_queue_group *queue_group)
>  {
> -       struct raid_map *raid_map;
>         int rc;
> +       struct raid_map *raid_map;
> +       u32 group;
> +       u32 next_bypass_group;
>         struct pqi_encryption_info *encryption_info_ptr;
>         struct pqi_encryption_info encryption_info;
>         struct pqi_scsi_dev_raid_map_data rmd = {0};
> @@ -2605,13 +2572,18 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>         if (rc)
>                 return PQI_RAID_BYPASS_INELIGIBLE;
>  
> -       /* RAID 1 */
> -       if (device->raid_level == SA_RAID_1) {
> -               if (device->offload_to_mirror)
> -                       rmd.map_index += rmd.data_disks_per_row;
> -               device->offload_to_mirror = !device->offload_to_mirror;
> -       } else if (device->raid_level == SA_RAID_ADM) {
> -               rc = pqi_calc_aio_raid_adm(&rmd, device);
> +       if (device->raid_level == SA_RAID_1 ||
> +               device->raid_level == SA_RAID_ADM) {
> +               if (rmd.is_write) {
> +                       pqi_calc_aio_r1_nexus(raid_map, &rmd);
> +               } else {
> +                       group = device->next_bypass_group;
> +                       next_bypass_group = group + 1;
> +                       if (next_bypass_group >= rmd.layout_map_count)
> +                               next_bypass_group = 0;
> +                       device->next_bypass_group = next_bypass_group;
> +                       rmd.map_index += group * rmd.data_disks_per_row;
> +               }
>         } else if ((device->raid_level == SA_RAID_5 ||
>                 device->raid_level == SA_RAID_6) &&
>                 (rmd.layout_map_count > 1 || rmd.is_write)) {
> @@ -2655,6 +2627,10 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>                         return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
>                                 rmd.cdb, rmd.cdb_length, queue_group,
>                                 encryption_info_ptr, true);
> +               case SA_RAID_1:
> +               case SA_RAID_ADM:
> +                       return pqi_aio_submit_r1_write_io(ctrl_info, scmd, queue_group,
> +                               encryption_info_ptr, device, &rmd);
>                 case SA_RAID_5:
>                 case SA_RAID_6:
>                         return pqi_aio_submit_r56_write_io(ctrl_info, scmd, queue_group,
> @@ -4982,6 +4958,44 @@ static int pqi_build_raid_sg_list(struct pqi_ctrl_info *ctrl_info,
>         return 0;
>  }
>  
> +static int pqi_build_aio_r1_sg_list(struct pqi_ctrl_info *ctrl_info,
> +       struct pqi_aio_r1_path_request *request, struct scsi_cmnd *scmd,
> +       struct pqi_io_request *io_request)
> +{
> +       u16 iu_length;
> +       int sg_count;
> +       bool chained;
> +       unsigned int num_sg_in_iu;
> +       struct scatterlist *sg;
> +       struct pqi_sg_descriptor *sg_descriptor;
> +
> +       sg_count = scsi_dma_map(scmd);
> +       if (sg_count < 0)
> +               return sg_count;
> +
> +       iu_length = offsetof(struct pqi_aio_r1_path_request, sg_descriptors) -
> +               PQI_REQUEST_HEADER_LENGTH;
> +       num_sg_in_iu = 0;
> +
> +       if (sg_count == 0)
> +               goto out;
> +
> +       sg = scsi_sglist(scmd);
> +       sg_descriptor = request->sg_descriptors;
> +
> +       num_sg_in_iu = pqi_build_sg_list(sg_descriptor, sg, sg_count, io_request,
> +               ctrl_info->max_sg_per_iu, &chained);
> +
> +       request->partial = chained;
> +       iu_length += num_sg_in_iu * sizeof(*sg_descriptor);
> +
> +out:
> +       put_unaligned_le16(iu_length, &request->header.iu_length);
> +       request->num_sg_descriptors = num_sg_in_iu;
> +
> +       return 0;
> +}
> +
>  static int pqi_build_aio_r56_sg_list(struct pqi_ctrl_info *ctrl_info,
>         struct pqi_aio_r56_path_request *request, struct scsi_cmnd *scmd,
>         struct pqi_io_request *io_request)
> @@ -5424,6 +5438,83 @@ static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,
>         return 0;
>  }
>  
> +static  int pqi_aio_submit_r1_write_io(struct pqi_ctrl_info *ctrl_info,
> +       struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
> +       struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
> +       struct pqi_scsi_dev_raid_map_data *rmd)
> +
> +{
> +       int rc;
> +       struct pqi_io_request *io_request;
> +       struct pqi_aio_r1_path_request *r1_request;
> +
> +       io_request = pqi_alloc_io_request(ctrl_info);
> +       io_request->io_complete_callback = pqi_aio_io_complete;
> +       io_request->scmd = scmd;
> +       io_request->raid_bypass = true;
> +
> +       r1_request = io_request->iu;
> +       memset(r1_request, 0, offsetof(struct pqi_aio_r1_path_request, sg_descriptors));
> +
> +       r1_request->header.iu_type = PQI_REQUEST_IU_AIO_PATH_RAID1_IO;
> +
> +       put_unaligned_le16(*(u16 *)device->scsi3addr & 0x3fff, &r1_request->volume_id);
> +       r1_request->num_drives = rmd->num_it_nexus_entries;
> +       put_unaligned_le32(rmd->it_nexus[0], &r1_request->it_nexus_1);
> +       put_unaligned_le32(rmd->it_nexus[1], &r1_request->it_nexus_2);
> +       if (rmd->num_it_nexus_entries == 3)
> +               put_unaligned_le32(rmd->it_nexus[2], &r1_request->it_nexus_3);
> +
> +       put_unaligned_le32(scsi_bufflen(scmd), &r1_request->data_length);
> +       r1_request->task_attribute = SOP_TASK_ATTRIBUTE_SIMPLE;
> +       put_unaligned_le16(io_request->index, &r1_request->request_id);
> +       r1_request->error_index = r1_request->request_id;
> +       if (rmd->cdb_length > sizeof(r1_request->cdb))
> +               rmd->cdb_length = sizeof(r1_request->cdb);
> +       r1_request->cdb_length = rmd->cdb_length;
> +       memcpy(r1_request->cdb, rmd->cdb, rmd->cdb_length);
> +
> +       switch (scmd->sc_data_direction) {
> +       case DMA_TO_DEVICE:
> +               r1_request->data_direction = SOP_READ_FLAG;
> +               break;

Same question as for the previous patch: how could anything other than
DMA_TO_DEVICE be possible here?

> +       case DMA_FROM_DEVICE:
> +               r1_request->data_direction = SOP_WRITE_FLAG;
> +               break;
> +       case DMA_NONE:
> +               r1_request->data_direction = SOP_NO_DIRECTION_FLAG;
> +               break;
> +       case DMA_BIDIRECTIONAL:
> +               r1_request->data_direction = SOP_BIDIRECTIONAL;
> +               break;
> +       default:
> +               dev_err(&ctrl_info->pci_dev->dev,
> +                       "unknown data direction: %d\n",
> +                       scmd->sc_data_direction);
> +               break;
> +       }
> +
> +       if (encryption_info) {
> +               r1_request->encryption_enable = true;
> +               put_unaligned_le16(encryption_info->data_encryption_key_index,
> +                               &r1_request->data_encryption_key_index);
> +               put_unaligned_le32(encryption_info->encrypt_tweak_lower,
> +                               &r1_request->encrypt_tweak_lower);
> +               put_unaligned_le32(encryption_info->encrypt_tweak_upper,
> +                               &r1_request->encrypt_tweak_upper);
> +       }
> +
> +       rc = pqi_build_aio_r1_sg_list(ctrl_info, r1_request, scmd, io_request);
> +       if (rc) {
> +               pqi_free_io_request(io_request);
> +               return SCSI_MLQUEUE_HOST_BUSY;
> +       }
> +
> +       pqi_start_io(ctrl_info, queue_group, AIO_PATH, io_request);
> +
> +       return 0;
> +}
> +
>  static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info *ctrl_info,
>         struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
>         struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
> 




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 06/25] smartpqi: add support for BMIC sense feature cmd and feature bits
  2020-12-10 20:34 ` [PATCH V3 06/25] smartpqi: add support for BMIC sense feature cmd and feature bits Don Brace
@ 2021-01-07 16:44   ` Martin Wilck
  2021-01-11 17:22     ` Don.Brace
  2021-01-22 16:45     ` Don.Brace
  0 siblings, 2 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-07 16:44 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:34 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> * Determine support for supported features from
>   BMIC sense feature command instead of config table.
> 
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>  drivers/scsi/smartpqi/smartpqi.h      |   77 +++++++-
>  drivers/scsi/smartpqi/smartpqi_init.c |  328 +++++++++++++++++++++++++++++----
>  2 files changed, 363 insertions(+), 42 deletions(-)
> 

In general: This patch contains a lot of whitespace, indentation, and
minor comment formatting changes which should rather go into a separate
patch IMHO. This one is big enough without them.

Further remarks below.

> [...]
> 
> @@ -2552,7 +2686,7 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>         u32 next_bypass_group;
>         struct pqi_encryption_info *encryption_info_ptr;
>         struct pqi_encryption_info encryption_info;
> -       struct pqi_scsi_dev_raid_map_data rmd = {0};
> +       struct pqi_scsi_dev_raid_map_data rmd = { 0 };
>  
>         rc = pqi_get_aio_lba_and_block_count(scmd, &rmd);
>         if (rc)
> @@ -2613,7 +2747,9 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>         pqi_set_aio_cdb(&rmd);
>  
>         if (get_unaligned_le16(&raid_map->flags) &
> -               RAID_MAP_ENCRYPTION_ENABLED) {
> +                       RAID_MAP_ENCRYPTION_ENABLED) {
> +               if (rmd.data_length > device->max_transfer_encrypted)
> +                       return PQI_RAID_BYPASS_INELIGIBLE;
>                 pqi_set_encryption_info(&encryption_info, raid_map,
>                         rmd.first_block);
>                 encryption_info_ptr = &encryption_info;
> @@ -2623,10 +2759,6 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>  

This hunk is fine, but AFAICS it doesn't belong here logically, it
should rather be part of patch 04 and 05.

>         if (rmd.is_write) {
>                 switch (device->raid_level) {
> -               case SA_RAID_0:
> -                       return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
> -                               rmd.cdb, rmd.cdb_length, queue_group,
> -                               encryption_info_ptr, true);
>                 case SA_RAID_1:
>                 case SA_RAID_ADM:
>                         return pqi_aio_submit_r1_write_io(ctrl_info, scmd, queue_group,
> @@ -2635,17 +2767,12 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>                 case SA_RAID_6:
>                         return pqi_aio_submit_r56_write_io(ctrl_info, scmd, queue_group,
>                                         encryption_info_ptr, device, &rmd);
> -               default:
> -                       return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
> -                               rmd.cdb, rmd.cdb_length, queue_group,
> -                               encryption_info_ptr, true);
>                 }
> -       } else {
> -               return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
> -                       rmd.cdb, rmd.cdb_length, queue_group,
> -                       encryption_info_ptr, true);
>         }
>  
> +       return pqi_aio_submit_io(ctrl_info, scmd, rmd.aio_handle,
> +               rmd.cdb, rmd.cdb_length, queue_group,
> +               encryption_info_ptr, true);
>  }
>  
>  #define PQI_STATUS_IDLE                0x0
> @@ -7209,6 +7336,7 @@ static int pqi_enable_firmware_features(struct pqi_ctrl_info *ctrl_info,
>  {
>         void *features_requested;
>         void __iomem *features_requested_iomem_addr;
> +       void __iomem *host_max_known_feature_iomem_addr;
>  
>         features_requested = firmware_features->features_supported +
>                 le16_to_cpu(firmware_features->num_elements);
> @@ -7219,6 +7347,16 @@ static int pqi_enable_firmware_features(struct pqi_ctrl_info *ctrl_info,
>         memcpy_toio(features_requested_iomem_addr, features_requested,
>                 le16_to_cpu(firmware_features->num_elements));
>  
> +       if (pqi_is_firmware_feature_supported(firmware_features,
> +               PQI_FIRMWARE_FEATURE_MAX_KNOWN_FEATURE)) {
> +               host_max_known_feature_iomem_addr =
> +                       features_requested_iomem_addr +
> +                       (le16_to_cpu(firmware_features->num_elements) * 2) +
> +                       sizeof(__le16);
> +               writew(PQI_FIRMWARE_FEATURE_MAXIMUM,
> +                       host_max_known_feature_iomem_addr);
> +       }
> +
>         return pqi_config_table_update(ctrl_info,
>                 PQI_CONFIG_TABLE_SECTION_FIRMWARE_FEATURES,
>                 PQI_CONFIG_TABLE_SECTION_FIRMWARE_FEATURES);
> @@ -7256,6 +7394,15 @@ static void pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
>         struct pqi_firmware_feature *firmware_feature)
>  {
>         switch (firmware_feature->feature_bit) {
> +       case PQI_FIRMWARE_FEATURE_RAID_1_WRITE_BYPASS:
> +               ctrl_info->enable_r1_writes = firmware_feature->enabled;
> +               break;
> +       case PQI_FIRMWARE_FEATURE_RAID_5_WRITE_BYPASS:
> +               ctrl_info->enable_r5_writes = firmware_feature->enabled;
> +               break;
> +       case PQI_FIRMWARE_FEATURE_RAID_6_WRITE_BYPASS:
> +               ctrl_info->enable_r6_writes = firmware_feature->enabled;
> +               break;
>         case PQI_FIRMWARE_FEATURE_SOFT_RESET_HANDSHAKE:
>                 ctrl_info->soft_reset_handshake_supported =
>                         firmware_feature->enabled;
> @@ -7293,6 +7440,51 @@ static struct pqi_firmware_feature pqi_firmware_features[] = {
>                 .feature_bit = PQI_FIRMWARE_FEATURE_SMP,
>                 .feature_status = pqi_firmware_feature_status,
>         },
> +       {
> +               .feature_name = "Maximum Known Feature",
> +               .feature_bit = PQI_FIRMWARE_FEATURE_MAX_KNOWN_FEATURE,
> +               .feature_status = pqi_firmware_feature_status,
> +       },
> +       {
> +               .feature_name = "RAID 0 Read Bypass",
> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_0_READ_BYPASS,
> +               .feature_status = pqi_firmware_feature_status,
> +       },
> +       {
> +               .feature_name = "RAID 1 Read Bypass",
> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_1_READ_BYPASS,
> +               .feature_status = pqi_firmware_feature_status,
> +       },
> +       {
> +               .feature_name = "RAID 5 Read Bypass",
> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_5_READ_BYPASS,
> +               .feature_status = pqi_firmware_feature_status,
> +       },
> +       {
> +               .feature_name = "RAID 6 Read Bypass",
> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_6_READ_BYPASS,
> +               .feature_status = pqi_firmware_feature_status,
> +       },
> +       {
> +               .feature_name = "RAID 0 Write Bypass",
> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_0_WRITE_BYPASS,
> +               .feature_status = pqi_firmware_feature_status,
> +       },
> +       {
> +               .feature_name = "RAID 1 Write Bypass",
> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_1_WRITE_BYPASS,
> +               .feature_status = pqi_ctrl_update_feature_flags,
> +       },
> +       {
> +               .feature_name = "RAID 5 Write Bypass",
> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_5_WRITE_BYPASS,
> +               .feature_status = pqi_ctrl_update_feature_flags,
> +       },
> +       {
> +               .feature_name = "RAID 6 Write Bypass",
> +               .feature_bit = PQI_FIRMWARE_FEATURE_RAID_6_WRITE_BYPASS,
> +               .feature_status = pqi_ctrl_update_feature_flags,
> +       },
>         {
>                 .feature_name = "New Soft Reset Handshake",
>                 .feature_bit = PQI_FIRMWARE_FEATURE_SOFT_RESET_HANDSHAKE,
> @@ -7667,6 +7859,17 @@ static int pqi_ctrl_init(struct pqi_ctrl_info *ctrl_info)
>  
>         pqi_start_heartbeat_timer(ctrl_info);
>  
> +       if (ctrl_info->enable_r5_writes || ctrl_info->enable_r6_writes) {
> +               rc = pqi_get_advanced_raid_bypass_config(ctrl_info);
> +               if (rc) {
> +                       dev_err(&ctrl_info->pci_dev->dev,
> +                               "error obtaining advanced RAID bypass configuration\n");
> +                       return rc;
> +               }
> +               ctrl_info->ciss_report_log_flags |=
> +                       CISS_REPORT_LOG_FLAG_DRIVE_TYPE_MIX;
> +       }
> +
>         rc = pqi_enable_events(ctrl_info);
>         if (rc) {
>                 dev_err(&ctrl_info->pci_dev->dev,
> @@ -7822,6 +8025,17 @@ static int pqi_ctrl_init_resume(struct pqi_ctrl_info *ctrl_info)
>  
>         pqi_start_heartbeat_timer(ctrl_info);
>  
> +       if (ctrl_info->enable_r5_writes || ctrl_info->enable_r6_writes) {
> +               rc = pqi_get_advanced_raid_bypass_config(ctrl_info);
> +               if (rc) {
> +                       dev_err(&ctrl_info->pci_dev->dev,
> +                               "error obtaining advanced RAID bypass configuration\n");
> +                       return rc;

Do you need to error out here? Can't you simply unset the
enable_rX_writes feature?
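
I.e. something like this (an untested sketch):

	if (ctrl_info->enable_r5_writes || ctrl_info->enable_r6_writes) {
		rc = pqi_get_advanced_raid_bypass_config(ctrl_info);
		if (rc) {
			dev_warn(&ctrl_info->pci_dev->dev,
				"advanced RAID bypass configuration not available, disabling RAID 5/6 write bypass\n");
			ctrl_info->enable_r5_writes = false;
			ctrl_info->enable_r6_writes = false;
		} else {
			ctrl_info->ciss_report_log_flags |=
				CISS_REPORT_LOG_FLAG_DRIVE_TYPE_MIX;
		}
	}

That way resume wouldn't fail just because the optional bypass
configuration can't be read.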


Regards
Martin




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 07/25] smartpqi: update AIO Sub Page 0x02 support
  2020-12-10 20:35 ` [PATCH V3 07/25] smartpqi: update AIO Sub Page 0x02 support Don Brace
@ 2021-01-07 16:44   ` Martin Wilck
  2021-01-11 20:53     ` Don.Brace
  0 siblings, 1 reply; 91+ messages in thread
From: Martin Wilck @ 2021-01-07 16:44 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> The specification for AIO Sub-Page (0x02) has changed slightly.
> * bring the driver into conformance with the spec.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>  drivers/scsi/smartpqi/smartpqi.h      |   12 ++++---
>  drivers/scsi/smartpqi/smartpqi_init.c |   60 +++++++++++++++++++++----------
>  2 files changed, 47 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
> index 31281cddadfe..eb23c3cf59c0 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -1028,13 +1028,13 @@ struct pqi_scsi_dev_raid_map_data {
>         u8      cdb_length;
>  
>         /* RAID 1 specific */
> -#define NUM_RAID1_MAP_ENTRIES 3
> +#define NUM_RAID1_MAP_ENTRIES  3
>         u32     num_it_nexus_entries;
>         u32     it_nexus[NUM_RAID1_MAP_ENTRIES];
>  
>         /* RAID 5 / RAID 6 specific */
> -       u32     p_parity_it_nexus; /* aio_handle */
> -       u32     q_parity_it_nexus; /* aio_handle */
> +       u32     p_parity_it_nexus;      /* aio_handle */
> +       u32     q_parity_it_nexus;      /* aio_handle */
>         u8      xor_mult;
>         u64     row;
>         u64     stripe_lba;
> @@ -1044,6 +1044,7 @@ struct pqi_scsi_dev_raid_map_data {
>  
>  #define RAID_CTLR_LUNID                "\0\0\0\0\0\0\0\0"
>  
> +
>  struct pqi_scsi_dev {
>         int     devtype;                /* as reported by INQUIRY command */
>         u8      device_type;            /* as reported by */
> @@ -1302,7 +1303,8 @@ struct pqi_ctrl_info {
>         u32             max_transfer_encrypted_sas_sata;
>         u32             max_transfer_encrypted_nvme;
>         u32             max_write_raid_5_6;
> -
> +       u32             max_write_raid_1_10_2drive;
> +       u32             max_write_raid_1_10_3drive;
>  
>         struct list_head scsi_device_list;
>         spinlock_t      scsi_device_list_lock;
> @@ -1533,6 +1535,8 @@ struct bmic_sense_feature_io_page_aio_subpage {
>         __le16  max_transfer_encrypted_sas_sata;
>         __le16  max_transfer_encrypted_nvme;
>         __le16  max_write_raid_5_6;
> +       __le16  max_write_raid_1_10_2drive;
> +       __le16  max_write_raid_1_10_3drive;
>  };
>  
>  struct bmic_smp_request {
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index aa21c1cd2cac..419887aa8ff3 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -696,6 +696,19 @@ static int pqi_identify_physical_device(struct pqi_ctrl_info *ctrl_info,
>         return rc;
>  }
>  
> +static inline u32 pqi_aio_limit_to_bytes(__le16 *limit)
> +{
> +       u32 bytes;
> +
> +       bytes = get_unaligned_le16(limit);
> +       if (bytes == 0)
> +               bytes = ~0;
> +       else
> +               bytes *= 1024;
> +
> +       return bytes;
> +}

Nice, but this function and its callers belong in patch 06/25.


> +
>  #pragma pack(1)
>  
>  struct bmic_sense_feature_buffer {
> @@ -707,11 +720,11 @@ struct bmic_sense_feature_buffer {
>  
>  #define MINIMUM_AIO_SUBPAGE_BUFFER_LENGTH      \
>         offsetofend(struct bmic_sense_feature_buffer, \
> -               aio_subpage.max_write_raid_5_6)
> +               aio_subpage.max_write_raid_1_10_3drive)
>  
>  #define MINIMUM_AIO_SUBPAGE_LENGTH     \
>         (offsetofend(struct bmic_sense_feature_io_page_aio_subpage, \
> -               max_write_raid_5_6) - \
> +               max_write_raid_1_10_3drive) - \
>                 sizeof_field(struct bmic_sense_feature_io_page_aio_subpage, header))
>  
>  static int pqi_get_advanced_raid_bypass_config(struct pqi_ctrl_info *ctrl_info)
> @@ -753,33 +766,28 @@ static int pqi_get_advanced_raid_bypass_config(struct pqi_ctrl_info *ctrl_info)
>                         BMIC_SENSE_FEATURE_IO_PAGE_AIO_SUBPAGE ||
>                 get_unaligned_le16(&buffer->aio_subpage.header.page_length) <
>                         MINIMUM_AIO_SUBPAGE_LENGTH) {
> -               rc = -EINVAL;

This should be changed in 06/25.

>                 goto error;
>         }
>  

Regards
Martin




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 08/25] smartpqi: add support for long firmware version
  2020-12-10 20:35 ` [PATCH V3 08/25] smartpqi: add support for long firmware version Don Brace
@ 2021-01-07 16:45   ` Martin Wilck
  2021-01-11 22:25     ` Don.Brace
  2021-01-22 20:01     ` Don.Brace
  0 siblings, 2 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-07 16:45 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> * Add support for new "long" firmware version which requires
>   minor driver changes to expose.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>  drivers/scsi/smartpqi/smartpqi.h      |   14 ++++++++---
>  drivers/scsi/smartpqi/smartpqi_init.c |   42 ++++++++++++++++++++++++---------
>  2 files changed, 40 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
> index eb23c3cf59c0..f33244def944 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -1227,7 +1227,7 @@ struct pqi_event {
>  struct pqi_ctrl_info {
>         unsigned int    ctrl_id;
>         struct pci_dev  *pci_dev;
> -       char            firmware_version[11];
> +       char            firmware_version[32];
>         char            serial_number[17];
>         char            model[17];
>         char            vendor[9];
> @@ -1405,7 +1405,7 @@ enum pqi_ctrl_mode {
>  struct bmic_identify_controller {
>         u8      configured_logical_drive_count;
>         __le32  configuration_signature;
> -       u8      firmware_version[4];
> +       u8      firmware_version_short[4];
>         u8      reserved[145];
>         __le16  extended_logical_unit_count;
>         u8      reserved1[34];
> @@ -1413,11 +1413,17 @@ struct bmic_identify_controller {
>         u8      reserved2[8];
>         u8      vendor_id[8];
>         u8      product_id[16];
> -       u8      reserved3[68];
> +       u8      reserved3[62];
> +       __le32  extra_controller_flags;
> +       u8      reserved4[2];
>         u8      controller_mode;
> -       u8      reserved4[32];
> +       u8      spare_part_number[32];
> +       u8      firmware_version_long[32];
>  };
>  
> +/* constants for extra_controller_flags field of bmic_identify_controller */
> +#define BMIC_IDENTIFY_EXTRA_FLAGS_LONG_FW_VERSION_SUPPORTED    0x20000000
> +
>  struct bmic_sense_subsystem_info {
>         u8      reserved[44];
>         u8      ctrl_serial_number[16];
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 419887aa8ff3..aa8b559e8907 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -998,14 +998,12 @@ static void pqi_update_time_worker(struct work_struct *work)
>                 PQI_UPDATE_TIME_WORK_INTERVAL);
>  }
>  
> -static inline void pqi_schedule_update_time_worker(
> -       struct pqi_ctrl_info *ctrl_info)
> +static inline void pqi_schedule_update_time_worker(struct pqi_ctrl_info *ctrl_info)
>  {
>         schedule_delayed_work(&ctrl_info->update_time_work, 0);
>  }
>  
> -static inline void pqi_cancel_update_time_worker(
> -       struct pqi_ctrl_info *ctrl_info)
> +static inline void pqi_cancel_update_time_worker(struct pqi_ctrl_info *ctrl_info)
>  {
>         cancel_delayed_work_sync(&ctrl_info->update_time_work);
>  }
> @@ -7245,13 +7243,23 @@ static int pqi_get_ctrl_product_details(struct pqi_ctrl_info *ctrl_info)
>         if (rc)
>                 goto out;
>  
> -       memcpy(ctrl_info->firmware_version, identify->firmware_version,
> -               sizeof(identify->firmware_version));
> -       ctrl_info->firmware_version[sizeof(identify->firmware_version)] = '\0';
> -       snprintf(ctrl_info->firmware_version +
> -               strlen(ctrl_info->firmware_version),
> -               sizeof(ctrl_info->firmware_version),
> -               "-%u", get_unaligned_le16(&identify->firmware_build_number));
> +       if (get_unaligned_le32(&identify->extra_controller_flags) &
> +               BMIC_IDENTIFY_EXTRA_FLAGS_LONG_FW_VERSION_SUPPORTED) {
> +               memcpy(ctrl_info->firmware_version,
> +                       identify->firmware_version_long,
> +                       sizeof(identify->firmware_version_long));
> +       } else {
> +               memcpy(ctrl_info->firmware_version,
> +                       identify->firmware_version_short,
> +                       sizeof(identify->firmware_version_short));
> +               ctrl_info->firmware_version
> +                       [sizeof(identify->firmware_version_short)] = '\0';
> +               snprintf(ctrl_info->firmware_version +
> +                       strlen(ctrl_info->firmware_version),
> +                       sizeof(ctrl_info->firmware_version),

This looks wrong. I suppose a real overflow can't happen, but the size
argument doesn't account for the bytes already written. Shouldn't it
rather be written like this?

snprintf(ctrl_info->firmware_version +
		sizeof(identify->firmware_version_short),
	sizeof(ctrl_info->firmware_version) -
		sizeof(identify->firmware_version_short),
	"-%u", ...)

> +                       "-%u",
> +                       get_unaligned_le16(&identify->firmware_build_number));
> +       }
>  
>         memcpy(ctrl_info->model, identify->product_id,
>                 sizeof(identify->product_id));
> @@ -9607,13 +9615,23 @@ static void __attribute__((unused)) verify_structures(void)
>         BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
>                 configuration_signature) != 1);
>         BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
> -               firmware_version) != 5);
> +               firmware_version_short) != 5);
>         BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
>                 extended_logical_unit_count) != 154);
>         BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
>                 firmware_build_number) != 190);
> +       BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
> +               vendor_id) != 200);
> +       BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
> +               product_id) != 208);
> +       BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
> +               extra_controller_flags) != 286);
>         BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
>                 controller_mode) != 292);
> +       BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
> +               spare_part_number) != 293);
> +       BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
> +               firmware_version_long) != 325);
>  
>         BUILD_BUG_ON(offsetof(struct bmic_identify_physical_device,
>                 phys_bay_in_box) != 115);
> 




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 14/25] smartpqi: fix driver synchronization issues
  2020-12-10 20:35 ` [PATCH V3 14/25] smartpqi: fix driver synchronization issues Don Brace
@ 2021-01-07 23:32   ` Martin Wilck
  2021-01-08  4:13     ` Martin K. Petersen
                       ` (2 more replies)
  0 siblings, 3 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-07 23:32 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> * synchronize: LUN resets, shutdowns, suspend, hibernate,
>   OFA, and controller offline events.
> * prevent I/O during the above conditions.

This description is too terse for a complex patch like this.

Could you please explain how this synchronization is supposed to work
on the different layers, and for the different code paths for different
types of IO events that are apparently not all handled equally wrt
blocking, and how the different flags and mutexes are supposed to
interact? I'd also appreciate some explanation what sort of "driver
synchronization issues" you have seen, and how exactly this patch is
supposed to fix them.

Please forgive me if I ask dumb questions or make dumb comments below,
I don't get the full picture of what you're trying to achieve.

The patch does not only address synchronization issues; it also changes
various other things that (given the size of the patch) should better
be handled elsewhere. I believe this patch could easily be split into
4 or more separate independent patches, which would ease review
considerably. I've added remarks below where I thought one or more
hunks could be separated out.

Thanks,
Martin

> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>  drivers/scsi/smartpqi/smartpqi.h      |   48 -
>  drivers/scsi/smartpqi/smartpqi_init.c | 1157 +++++++++++++--------------------
>  2 files changed, 474 insertions(+), 731 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi.h b/drivers/scsi/smartpqi/smartpqi.h
> index 976bfd8c5192..0b94c755a74c 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -130,9 +130,12 @@ struct pqi_iu_header {
>                                 /* of this header */
>         __le16  response_queue_id;      /* specifies the OQ where the */
>                                         /* response IU is to be delivered */
> -       u8      work_area[2];   /* reserved for driver use */
> +       u16     driver_flags;   /* reserved for driver use */
>  };
>  
> +/* manifest constants for pqi_iu_header.driver_flags */
> +#define PQI_DRIVER_NONBLOCKABLE_REQUEST                0x1
> +
>  /*
>   * According to the PQI spec, the IU header is only the first 4 bytes of our
>   * pqi_iu_header structure.
> @@ -508,10 +511,6 @@ struct pqi_vendor_general_response {
>  #define PQI_OFA_SIGNATURE              "OFA_QRM"
>  #define PQI_OFA_MAX_SG_DESCRIPTORS     64
>  
> -#define PQI_OFA_MEMORY_DESCRIPTOR_LENGTH \
> -       (offsetof(struct pqi_ofa_memory, sg_descriptor) + \
> -       (PQI_OFA_MAX_SG_DESCRIPTORS * sizeof(struct pqi_sg_descriptor)))
> -
>  struct pqi_ofa_memory {
>         __le64  signature;      /* "OFA_QRM" */
>         __le16  version;        /* version of this struct (1 = 1st version) */
> @@ -519,7 +518,7 @@ struct pqi_ofa_memory {
>         __le32  bytes_allocated;        /* total allocated memory in bytes */
>         __le16  num_memory_descriptors;
>         u8      reserved1[2];
> -       struct pqi_sg_descriptor sg_descriptor[1];
> +       struct pqi_sg_descriptor sg_descriptor[PQI_OFA_MAX_SG_DESCRIPTORS];
>  };
>  
>  struct pqi_aio_error_info {
> @@ -850,7 +849,8 @@ struct pqi_config_table_firmware_features {
>  #define PQI_FIRMWARE_FEATURE_RAID_IU_TIMEOUT                   13
>  #define PQI_FIRMWARE_FEATURE_TMF_IU_TIMEOUT                    14
>  #define PQI_FIRMWARE_FEATURE_RAID_BYPASS_ON_ENCRYPTED_NVME     15
> -#define PQI_FIRMWARE_FEATURE_MAXIMUM                           15
> +#define PQI_FIRMWARE_FEATURE_UNIQUE_WWID_IN_REPORT_PHYS_LUN    16
> +#define PQI_FIRMWARE_FEATURE_MAXIMUM                           16

What does the "unique WWID" feature have to do with synchronization
issues? This part should have gone into a separate patch.

>  
>  struct pqi_config_table_debug {
>         struct pqi_config_table_section_header header;
> @@ -1071,7 +1071,6 @@ struct pqi_scsi_dev {
>         u8      volume_offline : 1;
>         u8      rescan : 1;
>         bool    aio_enabled;            /* only valid for physical disks */
> -       bool    in_reset;
>         bool    in_remove;
>         bool    device_offline;
>         u8      vendor[8];              /* bytes 8-15 of inquiry data */
> @@ -1107,6 +1106,7 @@ struct pqi_scsi_dev {
>         struct pqi_stream_data stream_data[NUM_STREAMS_PER_LUN];
>         atomic_t scsi_cmds_outstanding;
>         atomic_t raid_bypass_cnt;
> +       u8      page_83_identifier[16];
>  };
>  
>  /* VPD inquiry pages */
> @@ -1212,10 +1212,8 @@ struct pqi_io_request {
>  struct pqi_event {
>         bool    pending;
>         u8      event_type;
> -       __le16  event_id;
> -       __le32  additional_event_id;
> -       __le32  ofa_bytes_requested;
> -       __le16  ofa_cancel_reason;
> +       u16     event_id;
> +       u32     additional_event_id;
>  };
>  
>  #define PQI_RESERVED_IO_SLOTS_LUN_RESET                        1
> @@ -1287,12 +1285,9 @@ struct pqi_ctrl_info {
>  
>         struct mutex    scan_mutex;
>         struct mutex    lun_reset_mutex;
> -       struct mutex    ofa_mutex; /* serialize ofa */
>         bool            controller_online;
>         bool            block_requests;
> -       bool            block_device_reset;
> -       bool            in_ofa;
> -       bool            in_shutdown;
> +       bool            scan_blocked;
>         u8              inbound_spanning_supported : 1;
>         u8              outbound_spanning_supported : 1;
>         u8              pqi_mode_enabled : 1;
> @@ -1300,6 +1295,7 @@ struct pqi_ctrl_info {
>         u8              soft_reset_handshake_supported : 1;
>         u8              raid_iu_timeout_supported : 1;
>         u8              tmf_iu_timeout_supported : 1;
> +       u8              unique_wwid_in_report_phys_lun_supported : 1;
>         u8              enable_r1_writes : 1;
>         u8              enable_r5_writes : 1;
>         u8              enable_r6_writes : 1;
> @@ -1341,14 +1337,14 @@ struct pqi_ctrl_info {
>         atomic_t        num_blocked_threads;
>         wait_queue_head_t block_requests_wait;
>  
> -       struct list_head raid_bypass_retry_list;
> -       spinlock_t      raid_bypass_retry_list_lock;
> -       struct work_struct raid_bypass_retry_work;
> -
> +       struct mutex    ofa_mutex;
>         struct pqi_ofa_memory *pqi_ofa_mem_virt_addr;
>         dma_addr_t      pqi_ofa_mem_dma_handle;
>         void            **pqi_ofa_chunk_virt_addr;
> -       atomic_t        sync_cmds_outstanding;
> +       struct work_struct ofa_memory_alloc_work;
> +       struct work_struct ofa_quiesce_work;
> +       u32             ofa_bytes_requested;
> +       u16             ofa_cancel_reason;
>  };
>  
>  enum pqi_ctrl_mode {
> @@ -1619,16 +1615,6 @@ struct bmic_diag_options {
>  
>  #pragma pack()
>  
> -static inline void pqi_ctrl_busy(struct pqi_ctrl_info *ctrl_info)
> -{
> -       atomic_inc(&ctrl_info->num_busy_threads);
> -}
> -
> -static inline void pqi_ctrl_unbusy(struct pqi_ctrl_info *ctrl_info)
> -{
> -       atomic_dec(&ctrl_info->num_busy_threads);
> -}
> -
>  static inline struct pqi_ctrl_info *shost_to_hba(struct Scsi_Host *shost)
>  {
>         void *hostdata = shost_priv(shost);
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 1eb677bc6c69..082b17e9bd80 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -45,6 +45,9 @@
>  
>  #define PQI_EXTRA_SGL_MEMORY   (12 * sizeof(struct pqi_sg_descriptor))
>  
> +#define PQI_POST_RESET_DELAY_SECS                      5
> +#define PQI_POST_OFA_RESET_DELAY_UPON_TIMEOUT_SECS     10
> +
>  MODULE_AUTHOR("Microsemi");
>  MODULE_DESCRIPTION("Driver for Microsemi Smart Family Controller
> version "
>         DRIVER_VERSION);
> @@ -54,7 +57,6 @@ MODULE_LICENSE("GPL");
>  
>  static void pqi_take_ctrl_offline(struct pqi_ctrl_info *ctrl_info);
>  static void pqi_ctrl_offline_worker(struct work_struct *work);
> -static void pqi_retry_raid_bypass_requests(struct pqi_ctrl_info *ctrl_info);
>  static int pqi_scan_scsi_devices(struct pqi_ctrl_info *ctrl_info);
>  static void pqi_scan_start(struct Scsi_Host *shost);
>  static void pqi_start_io(struct pqi_ctrl_info *ctrl_info,
> @@ -62,7 +64,7 @@ static void pqi_start_io(struct pqi_ctrl_info *ctrl_info,
>         struct pqi_io_request *io_request);
>  static int pqi_submit_raid_request_synchronous(struct pqi_ctrl_info *ctrl_info,
>         struct pqi_iu_header *request, unsigned int flags,
> -       struct pqi_raid_error_info *error_info, unsigned long timeout_msecs);
> +       struct pqi_raid_error_info *error_info);
>  static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,
>         struct scsi_cmnd *scmd, u32 aio_handle, u8 *cdb,
>         unsigned int cdb_length, struct pqi_queue_group *queue_group,
> @@ -77,9 +79,8 @@ static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info *ctrl_info,
>         struct pqi_scsi_dev_raid_map_data *rmd);
>  static void pqi_ofa_ctrl_quiesce(struct pqi_ctrl_info *ctrl_info);
>  static void pqi_ofa_ctrl_unquiesce(struct pqi_ctrl_info *ctrl_info);
> -static int pqi_ofa_ctrl_restart(struct pqi_ctrl_info *ctrl_info);
> -static void pqi_ofa_setup_host_buffer(struct pqi_ctrl_info *ctrl_info,
> -       u32 bytes_requested);
> +static int pqi_ofa_ctrl_restart(struct pqi_ctrl_info *ctrl_info, unsigned int delay_secs);
> +static void pqi_ofa_setup_host_buffer(struct pqi_ctrl_info *ctrl_info);
>  static void pqi_ofa_free_host_buffer(struct pqi_ctrl_info *ctrl_info);
>  static int pqi_ofa_host_memory_update(struct pqi_ctrl_info *ctrl_info);
>  static int pqi_device_wait_for_pending_io(struct pqi_ctrl_info *ctrl_info,
> @@ -245,14 +246,66 @@ static inline void pqi_save_ctrl_mode(struct pqi_ctrl_info *ctrl_info,
>         sis_write_driver_scratch(ctrl_info, mode);
>  }
>  
> +static inline void pqi_ctrl_block_scan(struct pqi_ctrl_info *ctrl_info)
> +{
> +       ctrl_info->scan_blocked = true;
> +       mutex_lock(&ctrl_info->scan_mutex);
> +}

What do you need scan_blocked for? Can't you simply use
mutex_is_locked(&ctrl_info->scan_mutex)?
OTOH, using a mutex for this kind of condition feels dangerous
to me, see the remark about ofa_mutex below.
Have you considered using a completion?

> +
> +static inline void pqi_ctrl_unblock_scan(struct pqi_ctrl_info *ctrl_info)
> +{
> +       ctrl_info->scan_blocked = false;
> +       mutex_unlock(&ctrl_info->scan_mutex);
> +}
> +
> +static inline bool pqi_ctrl_scan_blocked(struct pqi_ctrl_info *ctrl_info)
> +{
> +       return ctrl_info->scan_blocked;
> +}
> +
>  static inline void pqi_ctrl_block_device_reset(struct pqi_ctrl_info *ctrl_info)
>  {
> -       ctrl_info->block_device_reset = true;
> +       mutex_lock(&ctrl_info->lun_reset_mutex);
> +}
> +
> +static inline void pqi_ctrl_unblock_device_reset(struct pqi_ctrl_info *ctrl_info)
> +{
> +       mutex_unlock(&ctrl_info->lun_reset_mutex);
> +}
> +
> +static inline void pqi_scsi_block_requests(struct pqi_ctrl_info *ctrl_info)
> +{
> +       struct Scsi_Host *shost;
> +       unsigned int num_loops;
> +       int msecs_sleep;
> +
> +       shost = ctrl_info->scsi_host;
> +
> +       scsi_block_requests(shost);
> +
> +       num_loops = 0;
> +       msecs_sleep = 20;
> +       while (scsi_host_busy(shost)) {
> +               num_loops++;
> +               if (num_loops == 10)
> +                       msecs_sleep = 500;
> +               msleep(msecs_sleep);
> +       }
> +}

Waiting for !scsi_host_busy() here looks like a layering violation to
me. Can't you use wait_event{_timeout}() here and wait for the sum of
device->scsi_cmds_outstanding over all devices to become zero (waking
up the queue in pqi_prep_for_scsi_done())? You could use the
total_scmnds_outstanding count that you introduce in patch 15/25.
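
Something like this, perhaps (an untested sketch; it assumes the
total_scmnds_outstanding counter from patch 15/25 and an additional
wake_up() of block_requests_wait in pqi_prep_for_scsi_done()):

	static inline void pqi_scsi_block_requests(struct pqi_ctrl_info *ctrl_info)
	{
		scsi_block_requests(ctrl_info->scsi_host);
		wait_event(ctrl_info->block_requests_wait,
			atomic_read(&ctrl_info->total_scmnds_outstanding) == 0);
	}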

Also, how does this interact/interfere with scsi EH?

> +
> +static inline void pqi_scsi_unblock_requests(struct pqi_ctrl_info *ctrl_info)
> +{
> +       scsi_unblock_requests(ctrl_info->scsi_host);
> +}
> +
> +static inline void pqi_ctrl_busy(struct pqi_ctrl_info *ctrl_info)
> +{
> +       atomic_inc(&ctrl_info->num_busy_threads);
>  }
>  
> -static inline bool pqi_device_reset_blocked(struct pqi_ctrl_info *ctrl_info)
> +static inline void pqi_ctrl_unbusy(struct pqi_ctrl_info *ctrl_info)
>  {
> -       return ctrl_info->block_device_reset;
> +       atomic_dec(&ctrl_info->num_busy_threads);
>  }
>  
>  static inline bool pqi_ctrl_blocked(struct pqi_ctrl_info *ctrl_info)
> @@ -263,44 +316,23 @@ static inline bool pqi_ctrl_blocked(struct pqi_ctrl_info *ctrl_info)
>  static inline void pqi_ctrl_block_requests(struct pqi_ctrl_info *ctrl_info)
>  {
>         ctrl_info->block_requests = true;
> -       scsi_block_requests(ctrl_info->scsi_host);
>  }
>  
>  static inline void pqi_ctrl_unblock_requests(struct pqi_ctrl_info *ctrl_info)
>  {
>         ctrl_info->block_requests = false;
>         wake_up_all(&ctrl_info->block_requests_wait);
> -       pqi_retry_raid_bypass_requests(ctrl_info);
> -       scsi_unblock_requests(ctrl_info->scsi_host);
>  }
>  
> -static unsigned long pqi_wait_if_ctrl_blocked(struct pqi_ctrl_info *ctrl_info,
> -       unsigned long timeout_msecs)
> +static void pqi_wait_if_ctrl_blocked(struct pqi_ctrl_info *ctrl_info)
>  {
> -       unsigned long remaining_msecs;
> -
>         if (!pqi_ctrl_blocked(ctrl_info))
> -               return timeout_msecs;
> +               return;
>  
>         atomic_inc(&ctrl_info->num_blocked_threads);
> -
> -       if (timeout_msecs == NO_TIMEOUT) {
> -               wait_event(ctrl_info->block_requests_wait,
> -                       !pqi_ctrl_blocked(ctrl_info));
> -               remaining_msecs = timeout_msecs;
> -       } else {
> -               unsigned long remaining_jiffies;
> -
> -               remaining_jiffies =
> -                       wait_event_timeout(ctrl_info->block_requests_wait,
> -                               !pqi_ctrl_blocked(ctrl_info),
> -                               msecs_to_jiffies(timeout_msecs));
> -               remaining_msecs = jiffies_to_msecs(remaining_jiffies);
> -       }
> -
> +       wait_event(ctrl_info->block_requests_wait,
> +               !pqi_ctrl_blocked(ctrl_info));
>         atomic_dec(&ctrl_info->num_blocked_threads);
> -
> -       return remaining_msecs;
>  }
>  
>  static inline void pqi_ctrl_wait_until_quiesced(struct pqi_ctrl_info *ctrl_info)
> @@ -315,34 +347,25 @@ static inline bool pqi_device_offline(struct pqi_scsi_dev *device)
>         return device->device_offline;
>  }
>  
> -static inline void pqi_device_reset_start(struct pqi_scsi_dev *device)
> -{
> -       device->in_reset = true;
> -}
> -
> -static inline void pqi_device_reset_done(struct pqi_scsi_dev *device)
> -{
> -       device->in_reset = false;
> -}
> -
> -static inline bool pqi_device_in_reset(struct pqi_scsi_dev *device)
> +static inline void pqi_ctrl_ofa_start(struct pqi_ctrl_info *ctrl_info)
>  {
> -       return device->in_reset;
> +       mutex_lock(&ctrl_info->ofa_mutex);
>  }
>  
> -static inline void pqi_ctrl_ofa_start(struct pqi_ctrl_info *ctrl_info)
> +static inline void pqi_ctrl_ofa_done(struct pqi_ctrl_info *ctrl_info)
>  {
> -       ctrl_info->in_ofa = true;
> +       mutex_unlock(&ctrl_info->ofa_mutex);
>  }

pqi_ctrl_ofa_done() is called in several places. For me, it's non-
obvious whether ofa_mutex is guaranteed to be locked when this happens.
It would be an error to call mutex_unlock() if that's not the case.
Also, is it always guaranteed that "the context (task) that acquired
the lock also releases it"
(https://www.kernel.org/doc/html/latest/locking/locktypes.html)?
I feel that's rather not the case, as pqi_ctrl_ofa_start() is run from
a work queue, whereas pqi_ctrl_ofa_done() is not, afaics.

Have you considered using a completion?
Or can you add some explanatory comments?
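
E.g. (an untested sketch, with an invented field name): replace
ofa_mutex with a completion that any context may signal when OFA
finishes:

	/* OFA start, in the worker */
	reinit_completion(&ctrl_info->ofa_completion);

	/* OFA done, from any context; no ownership rule applies */
	complete_all(&ctrl_info->ofa_completion);

	/* in pqi_wait_until_ofa_finished() */
	wait_for_completion(&ctrl_info->ofa_completion);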

> -static inline void pqi_ctrl_ofa_done(struct pqi_ctrl_info *ctrl_info)
> +static inline void pqi_wait_until_ofa_finished(struct pqi_ctrl_info *ctrl_info)
>  {
> -       ctrl_info->in_ofa = false;
> +       mutex_lock(&ctrl_info->ofa_mutex);
> +       mutex_unlock(&ctrl_info->ofa_mutex);
>  }
>  
> -static inline bool pqi_ctrl_in_ofa(struct pqi_ctrl_info *ctrl_info)
> +static inline bool pqi_ofa_in_progress(struct pqi_ctrl_info *ctrl_info)
>  {
> -       return ctrl_info->in_ofa;
> +       return mutex_is_locked(&ctrl_info->ofa_mutex);
>  }
>  
>  static inline void pqi_device_remove_start(struct pqi_scsi_dev *device)
> @@ -355,14 +378,20 @@ static inline bool pqi_device_in_remove(struct pqi_scsi_dev *device)
>         return device->in_remove;
>  }
>  
> -static inline void pqi_ctrl_shutdown_start(struct pqi_ctrl_info *ctrl_info)
> +static inline int pqi_event_type_to_event_index(unsigned int event_type)
>  {
> -       ctrl_info->in_shutdown = true;
> +       int index;
> +
> +       for (index = 0; index < ARRAY_SIZE(pqi_supported_event_types); index++)
> +               if (event_type == pqi_supported_event_types[index])
> +                       return index;
> +
> +       return -1;
>  }
>  
> -static inline bool pqi_ctrl_in_shutdown(struct pqi_ctrl_info *ctrl_info)
> +static inline bool pqi_is_supported_event(unsigned int event_type)
>  {
> -       return ctrl_info->in_shutdown;
> +       return pqi_event_type_to_event_index(event_type) != -1;
>  }
>  
>  static inline void pqi_schedule_rescan_worker_with_delay(struct pqi_ctrl_info *ctrl_info,
> @@ -370,8 +399,6 @@ static inline void pqi_schedule_rescan_worker_with_delay(struct pqi_ctrl_info *c
>  {
>         if (pqi_ctrl_offline(ctrl_info))
>                 return;
> -       if (pqi_ctrl_in_ofa(ctrl_info))
> -               return;
>  
>         schedule_delayed_work(&ctrl_info->rescan_work, delay);
>  }
> @@ -408,22 +435,15 @@ static inline u32 pqi_read_heartbeat_counter(struct pqi_ctrl_info *ctrl_info)
>  
>  static inline u8 pqi_read_soft_reset_status(struct pqi_ctrl_info *ctrl_info)
>  {
> -       if (!ctrl_info->soft_reset_status)
> -               return 0;
> -
>         return readb(ctrl_info->soft_reset_status);
>  }

The new treatment of soft_reset_status is unrelated to the
synchronization issues mentioned in the patch description.

>  
> -static inline void pqi_clear_soft_reset_status(struct pqi_ctrl_info *ctrl_info,
> -       u8 clear)
> +static inline void pqi_clear_soft_reset_status(struct pqi_ctrl_info *ctrl_info)
>  {
>         u8 status;
>  
> -       if (!ctrl_info->soft_reset_status)
> -               return;
> -
>         status = pqi_read_soft_reset_status(ctrl_info);
> -       status &= ~clear;
> +       status &= ~PQI_SOFT_RESET_ABORT;
>         writeb(status, ctrl_info->soft_reset_status);
>  }
>  
> @@ -512,6 +532,7 @@ static int pqi_build_raid_path_request(struct pqi_ctrl_info *ctrl_info,
>                 put_unaligned_be32(cdb_length, &cdb[6]);
>                 break;
>         case SA_FLUSH_CACHE:
> +               request->header.driver_flags =
> PQI_DRIVER_NONBLOCKABLE_REQUEST;
>                 request->data_direction = SOP_WRITE_FLAG;
>                 cdb[0] = BMIC_WRITE;
>                 cdb[6] = BMIC_FLUSH_CACHE;
> @@ -606,7 +627,7 @@ static void pqi_free_io_request(struct
> pqi_io_request *io_request)
>  
>  static int pqi_send_scsi_raid_request(struct pqi_ctrl_info
> *ctrl_info, u8 cmd,
>         u8 *scsi3addr, void *buffer, size_t buffer_length, u16
> vpd_page,
> -       struct pqi_raid_error_info *error_info, unsigned long
> timeout_msecs)
> +       struct pqi_raid_error_info *error_info)
>  {
>         int rc;
>         struct pqi_raid_path_request request;
> @@ -618,7 +639,7 @@ static int pqi_send_scsi_raid_request(struct
> pqi_ctrl_info *ctrl_info, u8 cmd,
>                 return rc;
>  
>         rc = pqi_submit_raid_request_synchronous(ctrl_info,
> &request.header, 0,
> -               error_info, timeout_msecs);
> +               error_info);
>  
>         pqi_pci_unmap(ctrl_info->pci_dev, request.sg_descriptors, 1,
> dir);
>  
> @@ -631,7 +652,7 @@ static inline int
> pqi_send_ctrl_raid_request(struct pqi_ctrl_info *ctrl_info,
>         u8 cmd, void *buffer, size_t buffer_length)
>  {
>         return pqi_send_scsi_raid_request(ctrl_info, cmd,
> RAID_CTLR_LUNID,
> -               buffer, buffer_length, 0, NULL, NO_TIMEOUT);
> +               buffer, buffer_length, 0, NULL);
>  }
>  
>  static inline int pqi_send_ctrl_raid_with_error(struct pqi_ctrl_info
> *ctrl_info,
> @@ -639,7 +660,7 @@ static inline int
> pqi_send_ctrl_raid_with_error(struct pqi_ctrl_info *ctrl_info,
>         struct pqi_raid_error_info *error_info)
>  {
>         return pqi_send_scsi_raid_request(ctrl_info, cmd,
> RAID_CTLR_LUNID,
> -               buffer, buffer_length, 0, error_info, NO_TIMEOUT);
> +               buffer, buffer_length, 0, error_info);
>  }
>  
>  static inline int pqi_identify_controller(struct pqi_ctrl_info
> *ctrl_info,
> @@ -661,7 +682,7 @@ static inline int pqi_scsi_inquiry(struct
> pqi_ctrl_info *ctrl_info,
>         u8 *scsi3addr, u16 vpd_page, void *buffer, size_t
> buffer_length)
>  {
>         return pqi_send_scsi_raid_request(ctrl_info, INQUIRY,
> scsi3addr,
> -               buffer, buffer_length, vpd_page, NULL, NO_TIMEOUT);
> +               buffer, buffer_length, vpd_page, NULL);
>  }
>  
>  static int pqi_identify_physical_device(struct pqi_ctrl_info
> *ctrl_info,
> @@ -683,8 +704,7 @@ static int pqi_identify_physical_device(struct
> pqi_ctrl_info *ctrl_info,
>         request.cdb[2] = (u8)bmic_device_index;
>         request.cdb[9] = (u8)(bmic_device_index >> 8);
>  
> -       rc = pqi_submit_raid_request_synchronous(ctrl_info,
> &request.header,
> -               0, NULL, NO_TIMEOUT);
> +       rc = pqi_submit_raid_request_synchronous(ctrl_info,
> &request.header, 0, NULL);
>  
>         pqi_pci_unmap(ctrl_info->pci_dev, request.sg_descriptors, 1,
> dir);
>  
> @@ -741,7 +761,7 @@ static int
> pqi_get_advanced_raid_bypass_config(struct pqi_ctrl_info *ctrl_info)
>         request.cdb[2] = BMIC_SENSE_FEATURE_IO_PAGE;
>         request.cdb[3] = BMIC_SENSE_FEATURE_IO_PAGE_AIO_SUBPAGE;
>  
> -       rc = pqi_submit_raid_request_synchronous(ctrl_info,
> &request.header, 0, NULL, NO_TIMEOUT);
> +       rc = pqi_submit_raid_request_synchronous(ctrl_info,
> &request.header, 0, NULL);
>  
>         pqi_pci_unmap(ctrl_info->pci_dev, request.sg_descriptors, 1,
> dir);
>  
> @@ -794,13 +814,6 @@ static int pqi_flush_cache(struct pqi_ctrl_info
> *ctrl_info,
>         int rc;
>         struct bmic_flush_cache *flush_cache;
>  
> -       /*
> -        * Don't bother trying to flush the cache if the controller
> is
> -        * locked up.
> -        */
> -       if (pqi_ctrl_offline(ctrl_info))
> -               return -ENXIO;
> -
>         flush_cache = kzalloc(sizeof(*flush_cache), GFP_KERNEL);
>         if (!flush_cache)
>                 return -ENOMEM;
> @@ -979,9 +992,6 @@ static void pqi_update_time_worker(struct
> work_struct *work)
>         ctrl_info = container_of(to_delayed_work(work), struct
> pqi_ctrl_info,
>                 update_time_work);
>  
> -       if (pqi_ctrl_offline(ctrl_info))
> -               return;
> -
>         rc = pqi_write_current_time_to_host_wellness(ctrl_info);
>         if (rc)
>                 dev_warn(&ctrl_info->pci_dev->dev,
> @@ -1271,9 +1281,7 @@ static int pqi_get_raid_map(struct
> pqi_ctrl_info *ctrl_info,
>                 return -ENOMEM;
>  
>         rc = pqi_send_scsi_raid_request(ctrl_info, CISS_GET_RAID_MAP,
> -               device->scsi3addr, raid_map, sizeof(*raid_map),
> -               0, NULL, NO_TIMEOUT);
> -
> +               device->scsi3addr, raid_map, sizeof(*raid_map), 0,
> NULL);
>         if (rc)
>                 goto error;
>  
> @@ -1288,8 +1296,7 @@ static int pqi_get_raid_map(struct
> pqi_ctrl_info *ctrl_info,
>                         return -ENOMEM;
>  
>                 rc = pqi_send_scsi_raid_request(ctrl_info,
> CISS_GET_RAID_MAP,
> -                       device->scsi3addr, raid_map, raid_map_size,
> -                       0, NULL, NO_TIMEOUT);
> +                       device->scsi3addr, raid_map, raid_map_size,
> 0, NULL);
>                 if (rc)
>                         goto error;
>  
> @@ -1464,6 +1471,9 @@ static int pqi_get_physical_device_info(struct
> pqi_ctrl_info *ctrl_info,
>                 sizeof(device->phys_connector));
>         device->bay = id_phys->phys_bay_in_box;
>  
> +       memcpy(&device->page_83_identifier, &id_phys-
> >page_83_identifier,
> +               sizeof(device->page_83_identifier));
> +
>         return 0;
>  }
>  

This hunk belongs to the "unique wwid" part, see above.


> @@ -1970,8 +1980,13 @@ static void pqi_update_device_list(struct
> pqi_ctrl_info *ctrl_info,
>  
>         spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock,
> flags);
>  
> -       if (pqi_ctrl_in_ofa(ctrl_info))
> -               pqi_ctrl_ofa_done(ctrl_info);
> +       if (pqi_ofa_in_progress(ctrl_info)) {
> +               list_for_each_entry_safe(device, next, &delete_list,
> delete_list_entry)
> +                       if (pqi_is_device_added(device))
> +                               pqi_device_remove_start(device);
> +               pqi_ctrl_unblock_device_reset(ctrl_info);
> +               pqi_scsi_unblock_requests(ctrl_info);
> +       }

I don't understand the purpose of this code. pqi_device_remove_start()
will be called again a few lines below. Why do it twice? I suppose
it's related to the unblocking, but that deserves an explanation.
Also, why do you unblock requests while OFA is "in progress"?

>  
>         /* Remove all devices that have gone away. */
>         list_for_each_entry_safe(device, next, &delete_list,
> delete_list_entry) {
> @@ -1993,19 +2008,14 @@ static void pqi_update_device_list(struct
> pqi_ctrl_info *ctrl_info,

The following hunk is unrelated to synchronization.

>          * Notify the SCSI ML if the queue depth of any existing
> device has
>          * changed.
>          */
> -       list_for_each_entry(device, &ctrl_info->scsi_device_list,
> -               scsi_device_list_entry) {
> -               if (device->sdev) {
> -                       if (device->queue_depth !=
> -                               device->advertised_queue_depth) {
> -                               device->advertised_queue_depth =
> device->queue_depth;
> -                               scsi_change_queue_depth(device->sdev,
> -                                       device-
> >advertised_queue_depth);
> -                       }
> -                       if (device->rescan) {
> -                               scsi_rescan_device(&device->sdev-
> >sdev_gendev);
> -                               device->rescan = false;
> -                       }
> +       list_for_each_entry(device, &ctrl_info->scsi_device_list,
> scsi_device_list_entry) {
> +               if (device->sdev && device->queue_depth != device-
> >advertised_queue_depth) {
> +                       device->advertised_queue_depth = device-
> >queue_depth;
> +                       scsi_change_queue_depth(device->sdev, device-
> >advertised_queue_depth);
> +               }
> +               if (device->rescan) {
> +                       scsi_rescan_device(&device->sdev-
> >sdev_gendev);
> +                       device->rescan = false;
>                 }

You've taken the reference to device->sdev->sdev_gendev out of the if
(device->sdev) clause. Can you be certain that device->sdev is non-
NULL?
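
I.e. keep the rescan under the NULL check, roughly (sketch only):

        if (device->rescan && device->sdev) {
                scsi_rescan_device(&device->sdev->sdev_gendev);
                device->rescan = false;
        }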

>         }
>  
> @@ -2073,6 +2083,16 @@ static inline bool pqi_expose_device(struct
> pqi_scsi_dev *device)
>         return !device->is_physical_device ||
> !pqi_skip_device(device->scsi3addr);
>  }
>  

The following belongs to the "unique wwid" part.

> +static inline void pqi_set_physical_device_wwid(struct pqi_ctrl_info
> *ctrl_info,
> +       struct pqi_scsi_dev *device, struct
> report_phys_lun_extended_entry *phys_lun_ext_entry)
> +{
> +       if (ctrl_info->unique_wwid_in_report_phys_lun_supported ||
> +               pqi_is_device_with_sas_address(device))
> +               device->wwid = phys_lun_ext_entry->wwid;
> +       else
> +               device->wwid =
> cpu_to_be64(get_unaligned_be64(&device->page_83_identifier));
> +}
> +
>  static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
>  {
>         int i;
> @@ -2238,7 +2258,7 @@ static int pqi_update_scsi_devices(struct
> pqi_ctrl_info *ctrl_info)
>                 pqi_assign_bus_target_lun(device);
>  
>                 if (device->is_physical_device) {
> -                       device->wwid = phys_lun_ext_entry->wwid;
> +                       pqi_set_physical_device_wwid(ctrl_info,
> device, phys_lun_ext_entry);
>                         if ((phys_lun_ext_entry->device_flags &
>                                 CISS_REPORT_PHYS_DEV_FLAG_AIO_ENABLED
> ) &&
>                                 phys_lun_ext_entry->aio_handle) {
> @@ -2278,21 +2298,27 @@ static int pqi_update_scsi_devices(struct
> pqi_ctrl_info *ctrl_info)
>  
>  static int pqi_scan_scsi_devices(struct pqi_ctrl_info *ctrl_info)
>  {
> -       int rc = 0;
> +       int rc;
> +       int mutex_acquired;
>  
>         if (pqi_ctrl_offline(ctrl_info))
>                 return -ENXIO;
>  
> -       if (!mutex_trylock(&ctrl_info->scan_mutex)) {
> +       mutex_acquired = mutex_trylock(&ctrl_info->scan_mutex);
> +
> +       if (!mutex_acquired) {
> +               if (pqi_ctrl_scan_blocked(ctrl_info))
> +                       return -EBUSY;
>                 pqi_schedule_rescan_worker_delayed(ctrl_info);
> -               rc = -EINPROGRESS;
> -       } else {
> -               rc = pqi_update_scsi_devices(ctrl_info);
> -               if (rc)
> -
>                        pqi_schedule_rescan_worker_delayed(ctrl_info);
> -               mutex_unlock(&ctrl_info->scan_mutex);
> +               return -EINPROGRESS;
>         }
>  
> +       rc = pqi_update_scsi_devices(ctrl_info);
> +       if (rc && !pqi_ctrl_scan_blocked(ctrl_info))
> +               pqi_schedule_rescan_worker_delayed(ctrl_info);
> +
> +       mutex_unlock(&ctrl_info->scan_mutex);
> +
>         return rc;
>  }
>  
> @@ -2301,8 +2327,6 @@ static void pqi_scan_start(struct Scsi_Host
> *shost)
>         struct pqi_ctrl_info *ctrl_info;
>  
>         ctrl_info = shost_to_hba(shost);
> -       if (pqi_ctrl_in_ofa(ctrl_info))
> -               return;
>  
>         pqi_scan_scsi_devices(ctrl_info);
>  }
> @@ -2319,27 +2343,8 @@ static int pqi_scan_finished(struct Scsi_Host
> *shost,
>         return !mutex_is_locked(&ctrl_info->scan_mutex);
>  }
>  
> -static void pqi_wait_until_scan_finished(struct pqi_ctrl_info
> *ctrl_info)
> -{
> -       mutex_lock(&ctrl_info->scan_mutex);
> -       mutex_unlock(&ctrl_info->scan_mutex);
> -}
> -
> -static void pqi_wait_until_lun_reset_finished(struct pqi_ctrl_info
> *ctrl_info)
> -{
> -       mutex_lock(&ctrl_info->lun_reset_mutex);
> -       mutex_unlock(&ctrl_info->lun_reset_mutex);
> -}
> -
> -static void pqi_wait_until_ofa_finished(struct pqi_ctrl_info
> *ctrl_info)
> -{
> -       mutex_lock(&ctrl_info->ofa_mutex);
> -       mutex_unlock(&ctrl_info->ofa_mutex);
> -}

Here, again, I wonder if this mutex_lock()/mutex_unlock() approach is
optimal. Have you considered using completions? 

See above for rationale.

> -
> -static inline void pqi_set_encryption_info(
> -       struct pqi_encryption_info *encryption_info, struct raid_map
> *raid_map,
> -       u64 first_block)
> +static inline void pqi_set_encryption_info(struct
> pqi_encryption_info *encryption_info,
> +       struct raid_map *raid_map, u64 first_block)
>  {
>         u32 volume_blk_size;

This whitespace change doesn't belong here.

>  
> @@ -3251,8 +3256,8 @@ static void pqi_acknowledge_event(struct
> pqi_ctrl_info *ctrl_info,
>         put_unaligned_le16(sizeof(request) -
> PQI_REQUEST_HEADER_LENGTH,
>                 &request.header.iu_length);
>         request.event_type = event->event_type;
> -       request.event_id = event->event_id;
> -       request.additional_event_id = event->additional_event_id;
> +       put_unaligned_le16(event->event_id, &request.event_id);
> +       put_unaligned_le16(event->additional_event_id,
> &request.additional_event_id);

The different treatment of the event_id fields is unrelated to
synchronization, or am I missing something?

>  
>         pqi_send_event_ack(ctrl_info, &request, sizeof(request));
>  }
> @@ -3263,8 +3268,8 @@ static void pqi_acknowledge_event(struct
> pqi_ctrl_info *ctrl_info,
>  static enum pqi_soft_reset_status pqi_poll_for_soft_reset_status(
>         struct pqi_ctrl_info *ctrl_info)
>  {
> -       unsigned long timeout;
>         u8 status;
> +       unsigned long timeout;
>  
>         timeout = (PQI_SOFT_RESET_STATUS_TIMEOUT_SECS * PQI_HZ) +
> jiffies;
>  
> @@ -3276,120 +3281,169 @@ static enum pqi_soft_reset_status
> pqi_poll_for_soft_reset_status(
>                 if (status & PQI_SOFT_RESET_ABORT)
>                         return RESET_ABORT;
>  
> +               if (!sis_is_firmware_running(ctrl_info))
> +                       return RESET_NORESPONSE;
> +
>                 if (time_after(jiffies, timeout)) {
> -                       dev_err(&ctrl_info->pci_dev->dev,
> +                       dev_warn(&ctrl_info->pci_dev->dev,
>                                 "timed out waiting for soft reset
> status\n");
>                         return RESET_TIMEDOUT;
>                 }
>  
> -               if (!sis_is_firmware_running(ctrl_info))
> -                       return RESET_NORESPONSE;
> -
>                 ssleep(PQI_SOFT_RESET_STATUS_POLL_INTERVAL_SECS);
>         }
>  }
>  
> -static void pqi_process_soft_reset(struct pqi_ctrl_info *ctrl_info,
> -       enum pqi_soft_reset_status reset_status)
> +static void pqi_process_soft_reset(struct pqi_ctrl_info *ctrl_info)
>  {
>         int rc;
> +       unsigned int delay_secs;
> +       enum pqi_soft_reset_status reset_status;
> +
> +       if (ctrl_info->soft_reset_handshake_supported)
> +               reset_status =
> pqi_poll_for_soft_reset_status(ctrl_info);
> +       else
> +               reset_status = RESET_INITIATE_FIRMWARE;
> +
> +       pqi_ofa_free_host_buffer(ctrl_info);
> +
> +       delay_secs = PQI_POST_RESET_DELAY_SECS;
>  
>         switch (reset_status) {
> -       case RESET_INITIATE_DRIVER:
>         case RESET_TIMEDOUT:
> +               delay_secs =
> PQI_POST_OFA_RESET_DELAY_UPON_TIMEOUT_SECS;
> +               fallthrough;
> +       case RESET_INITIATE_DRIVER:
>                 dev_info(&ctrl_info->pci_dev->dev,
> -                       "resetting controller %u\n", ctrl_info-
> >ctrl_id);
> +                               "Online Firmware Activation:
> resetting controller\n");
>                 sis_soft_reset(ctrl_info);
>                 fallthrough;
>         case RESET_INITIATE_FIRMWARE:
> -               rc = pqi_ofa_ctrl_restart(ctrl_info);
> -               pqi_ofa_free_host_buffer(ctrl_info);
> +               ctrl_info->pqi_mode_enabled = false;
> +               pqi_save_ctrl_mode(ctrl_info, SIS_MODE);
> +               rc = pqi_ofa_ctrl_restart(ctrl_info, delay_secs);
> +               pqi_ctrl_ofa_done(ctrl_info);
>                 dev_info(&ctrl_info->pci_dev->dev,
> -                       "Online Firmware Activation for controller
> %u: %s\n",
> -                       ctrl_info->ctrl_id, rc == 0 ? "SUCCESS" :
> "FAILED");
> +                               "Online Firmware Activation: %s\n",
> +                               rc == 0 ? "SUCCESS" : "FAILED");
>                 break;
>         case RESET_ABORT:
> -               pqi_ofa_ctrl_unquiesce(ctrl_info);
>                 dev_info(&ctrl_info->pci_dev->dev,
> -                       "Online Firmware Activation for controller
> %u: %s\n",
> -                       ctrl_info->ctrl_id, "ABORTED");
> +                               "Online Firmware Activation
> ABORTED\n");
> +               if (ctrl_info->soft_reset_handshake_supported)
> +                       pqi_clear_soft_reset_status(ctrl_info);
> +               pqi_ctrl_ofa_done(ctrl_info);
> +               pqi_ofa_ctrl_unquiesce(ctrl_info);
>                 break;
>         case RESET_NORESPONSE:
> -               pqi_ofa_free_host_buffer(ctrl_info);
> +               fallthrough;
> +       default:
> +               dev_err(&ctrl_info->pci_dev->dev,
> +                       "unexpected Online Firmware Activation reset
> status: 0x%x\n",
> +                       reset_status);
> +               pqi_ctrl_ofa_done(ctrl_info);
> +               pqi_ofa_ctrl_unquiesce(ctrl_info);
>                 pqi_take_ctrl_offline(ctrl_info);
>                 break;
>         }
>  }
>  
> -static void pqi_ofa_process_event(struct pqi_ctrl_info *ctrl_info,
> -       struct pqi_event *event)
> +static void pqi_ofa_memory_alloc_worker(struct work_struct *work)

Moving the ofa handling into work queues seems to be a key aspect of
this patch. The patch description should mention how this will
improve synchronization. Naïve thinking suggests that making these
calls asynchronous could aggravate synchronization issues.

Repeating myself, I feel that completions would be the best way to
synchronize with these work items.
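
To make the ordering explicit, the driver could synchronize with the
work items directly, e.g. (untested):

        /* wait for any in-flight OFA work before proceeding */
        flush_work(&ctrl_info->ofa_memory_alloc_work);
        flush_work(&ctrl_info->ofa_quiesce_work);

or, with a completion as sketched above, the work items would call
complete_all() when they finish and waiters would block in
wait_for_completion().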

>  {
> -       u16 event_id;
> -       enum pqi_soft_reset_status status;
> +       struct pqi_ctrl_info *ctrl_info;
>  
> -       event_id = get_unaligned_le16(&event->event_id);
> +       ctrl_info = container_of(work, struct pqi_ctrl_info,
> ofa_memory_alloc_work);
>  
> -       mutex_lock(&ctrl_info->ofa_mutex);
> +       pqi_ctrl_ofa_start(ctrl_info);
> +       pqi_ofa_setup_host_buffer(ctrl_info);
> +       pqi_ofa_host_memory_update(ctrl_info);
> +}
>  
> -       if (event_id == PQI_EVENT_OFA_QUIESCE) {
> -               dev_info(&ctrl_info->pci_dev->dev,
> -                       "Received Online Firmware Activation quiesce
> event for controller %u\n",
> -                       ctrl_info->ctrl_id);
> -               pqi_ofa_ctrl_quiesce(ctrl_info);
> -               pqi_acknowledge_event(ctrl_info, event);
> -               if (ctrl_info->soft_reset_handshake_supported) {
> -                       status =
> pqi_poll_for_soft_reset_status(ctrl_info);
> -                       pqi_process_soft_reset(ctrl_info, status);
> -               } else {
> -                       pqi_process_soft_reset(ctrl_info,
> -                                       RESET_INITIATE_FIRMWARE);
> -               }
> +static void pqi_ofa_quiesce_worker(struct work_struct *work)
> +{
> +       struct pqi_ctrl_info *ctrl_info;
> +       struct pqi_event *event;
>  
> -       } else if (event_id == PQI_EVENT_OFA_MEMORY_ALLOCATION) {
> -               pqi_acknowledge_event(ctrl_info, event);
> -               pqi_ofa_setup_host_buffer(ctrl_info,
> -                       le32_to_cpu(event->ofa_bytes_requested));
> -               pqi_ofa_host_memory_update(ctrl_info);
> -       } else if (event_id == PQI_EVENT_OFA_CANCELED) {
> -               pqi_ofa_free_host_buffer(ctrl_info);
> -               pqi_acknowledge_event(ctrl_info, event);
> +       ctrl_info = container_of(work, struct pqi_ctrl_info,
> ofa_quiesce_work);
> +
> +       event = &ctrl_info-
> >events[pqi_event_type_to_event_index(PQI_EVENT_TYPE_OFA)];
> +
> +       pqi_ofa_ctrl_quiesce(ctrl_info);
> +       pqi_acknowledge_event(ctrl_info, event);
> +       pqi_process_soft_reset(ctrl_info);
> +}
> +
> +static bool pqi_ofa_process_event(struct pqi_ctrl_info *ctrl_info,
> +       struct pqi_event *event)
> +{
> +       bool ack_event;
> +
> +       ack_event = true;
> +
> +       switch (event->event_id) {
> +       case PQI_EVENT_OFA_MEMORY_ALLOCATION:
> +               dev_info(&ctrl_info->pci_dev->dev,
> +                       "received Online Firmware Activation memory
> allocation request\n");
> +               schedule_work(&ctrl_info->ofa_memory_alloc_work);
> +               break;
> +       case PQI_EVENT_OFA_QUIESCE:
>                 dev_info(&ctrl_info->pci_dev->dev,
> -                       "Online Firmware Activation(%u) cancel reason
> : %u\n",
> -                       ctrl_info->ctrl_id, event-
> >ofa_cancel_reason);
> +                       "received Online Firmware Activation quiesce
> request\n");
> +               schedule_work(&ctrl_info->ofa_quiesce_work);
> +               ack_event = false;
> +               break;
> +       case PQI_EVENT_OFA_CANCELED:
> +               dev_info(&ctrl_info->pci_dev->dev,
> +                       "received Online Firmware Activation cancel
> request: reason: %u\n",
> +                       ctrl_info->ofa_cancel_reason);
> +               pqi_ofa_free_host_buffer(ctrl_info);
> +               pqi_ctrl_ofa_done(ctrl_info);
> +               break;
> +       default:
> +               dev_err(&ctrl_info->pci_dev->dev,
> +                       "received unknown Online Firmware Activation
> request: event ID: %u\n",
> +                       event->event_id);
> +               break;
>         }
>  
> -       mutex_unlock(&ctrl_info->ofa_mutex);
> +       return ack_event;
>  }
>  
>  static void pqi_event_worker(struct work_struct *work)
>  {
>         unsigned int i;
> +       bool rescan_needed;
>         struct pqi_ctrl_info *ctrl_info;
>         struct pqi_event *event;
> +       bool ack_event;
>  
>         ctrl_info = container_of(work, struct pqi_ctrl_info,
> event_work);
>  
>         pqi_ctrl_busy(ctrl_info);
> -       pqi_wait_if_ctrl_blocked(ctrl_info, NO_TIMEOUT);
> +       pqi_wait_if_ctrl_blocked(ctrl_info);
>         if (pqi_ctrl_offline(ctrl_info))
>                 goto out;
>  
> -       pqi_schedule_rescan_worker_delayed(ctrl_info);
> -
> +       rescan_needed = false;
>         event = ctrl_info->events;
>         for (i = 0; i < PQI_NUM_SUPPORTED_EVENTS; i++) {
>                 if (event->pending) {
>                         event->pending = false;
>                         if (event->event_type == PQI_EVENT_TYPE_OFA)
> {
> -                               pqi_ctrl_unbusy(ctrl_info);
> -                               pqi_ofa_process_event(ctrl_info,
> event);
> -                               return;
> +                               ack_event =
> pqi_ofa_process_event(ctrl_info, event);
> +                       } else {
> +                               ack_event = true;
> +                               rescan_needed = true;
>                         }
> -                       pqi_acknowledge_event(ctrl_info, event);
> +                       if (ack_event)
> +                               pqi_acknowledge_event(ctrl_info,
> event);
>                 }
>                 event++;
>         }
>  
> +       if (rescan_needed)
> +               pqi_schedule_rescan_worker_delayed(ctrl_info);
> +
>  out:
>         pqi_ctrl_unbusy(ctrl_info);
>  }
> @@ -3446,37 +3500,18 @@ static inline void
> pqi_stop_heartbeat_timer(struct pqi_ctrl_info *ctrl_info)
>         del_timer_sync(&ctrl_info->heartbeat_timer);
>  }
>  
> -static inline int pqi_event_type_to_event_index(unsigned int
> event_type)
> -{
> -       int index;
> -
> -       for (index = 0; index <
> ARRAY_SIZE(pqi_supported_event_types); index++)
> -               if (event_type == pqi_supported_event_types[index])
> -                       return index;
> -
> -       return -1;
> -}
> -
> -static inline bool pqi_is_supported_event(unsigned int event_type)
> -{
> -       return pqi_event_type_to_event_index(event_type) != -1;
> -}
> -
> -static void pqi_ofa_capture_event_payload(struct pqi_event *event,
> -       struct pqi_event_response *response)
> +static void pqi_ofa_capture_event_payload(struct pqi_ctrl_info
> *ctrl_info,
> +       struct pqi_event *event, struct pqi_event_response *response)
>  {
> -       u16 event_id;
> -
> -       event_id = get_unaligned_le16(&event->event_id);
> -
> -       if (event->event_type == PQI_EVENT_TYPE_OFA) {
> -               if (event_id == PQI_EVENT_OFA_MEMORY_ALLOCATION) {
> -                       event->ofa_bytes_requested =
> -                       response-
> >data.ofa_memory_allocation.bytes_requested;
> -               } else if (event_id == PQI_EVENT_OFA_CANCELED) {
> -                       event->ofa_cancel_reason =
> -                       response->data.ofa_cancelled.reason;
> -               }
> +       switch (event->event_id) {
> +       case PQI_EVENT_OFA_MEMORY_ALLOCATION:
> +               ctrl_info->ofa_bytes_requested =
> +                       get_unaligned_le32(&response-
> >data.ofa_memory_allocation.bytes_requested);
> +               break;
> +       case PQI_EVENT_OFA_CANCELED:
> +               ctrl_info->ofa_cancel_reason =
> +                       get_unaligned_le16(&response-
> >data.ofa_cancelled.reason);
> +               break;
>         }
>  }
>  
> @@ -3510,17 +3545,17 @@ static int pqi_process_event_intr(struct
> pqi_ctrl_info *ctrl_info)
>                 num_events++;
>                 response = event_queue->oq_element_array + (oq_ci *
> PQI_EVENT_OQ_ELEMENT_LENGTH);
>  
> -               event_index =
> -                       pqi_event_type_to_event_index(response-
> >event_type);
> +               event_index = pqi_event_type_to_event_index(response-
> >event_type);
>  
>                 if (event_index >= 0 && response-
> >request_acknowledge) {
>                         event = &ctrl_info->events[event_index];
>                         event->pending = true;
>                         event->event_type = response->event_type;
> -                       event->event_id = response->event_id;
> -                       event->additional_event_id = response-
> >additional_event_id;
> +                       event->event_id =
> get_unaligned_le16(&response->event_id);
> +                       event->additional_event_id =
> +                               get_unaligned_le32(&response-
> >additional_event_id);
>                         if (event->event_type == PQI_EVENT_TYPE_OFA)
> -                               pqi_ofa_capture_event_payload(event,
> response);
> +                               pqi_ofa_capture_event_payload(ctrl_in
> fo, event, response);
>                 }
>  
>                 oq_ci = (oq_ci + 1) % PQI_NUM_EVENT_QUEUE_ELEMENTS;
> @@ -3537,8 +3572,7 @@ static int pqi_process_event_intr(struct
> pqi_ctrl_info *ctrl_info)
>  
>  #define PQI_LEGACY_INTX_MASK   0x1
>  
> -static inline void pqi_configure_legacy_intx(struct pqi_ctrl_info
> *ctrl_info,
> -       bool enable_intx)
> +static inline void pqi_configure_legacy_intx(struct pqi_ctrl_info
> *ctrl_info, bool enable_intx)

another whitespace hunk

>  {
>         u32 intx_mask;
>         struct pqi_device_registers __iomem *pqi_registers;
> @@ -4216,59 +4250,36 @@ static int
> pqi_process_raid_io_error_synchronous(
>         return rc;
>  }
>  
> +static inline bool pqi_is_blockable_request(struct pqi_iu_header
> *request)
> +{
> +       return (request->driver_flags &
> PQI_DRIVER_NONBLOCKABLE_REQUEST) == 0;
> +}
> +
>  static int pqi_submit_raid_request_synchronous(struct pqi_ctrl_info
> *ctrl_info,
>         struct pqi_iu_header *request, unsigned int flags,
> -       struct pqi_raid_error_info *error_info, unsigned long
> timeout_msecs)
> +       struct pqi_raid_error_info *error_info)

The removal of the timeout_msecs argument to this function could be
a separate patch in its own right.

>  {
>         int rc = 0;
>         struct pqi_io_request *io_request;
> -       unsigned long start_jiffies;
> -       unsigned long msecs_blocked;
>         size_t iu_length;
>         DECLARE_COMPLETION_ONSTACK(wait);
>  
> -       /*
> -        * Note that specifying PQI_SYNC_FLAGS_INTERRUPTABLE and a
> timeout value
> -        * are mutually exclusive.
> -        */
> -
>         if (flags & PQI_SYNC_FLAGS_INTERRUPTABLE) {
>                 if (down_interruptible(&ctrl_info->sync_request_sem))
>                         return -ERESTARTSYS;
>         } else {
> -               if (timeout_msecs == NO_TIMEOUT) {
> -                       down(&ctrl_info->sync_request_sem);
> -               } else {
> -                       start_jiffies = jiffies;
> -                       if (down_timeout(&ctrl_info-
> >sync_request_sem,
> -                               msecs_to_jiffies(timeout_msecs)))
> -                               return -ETIMEDOUT;
> -                       msecs_blocked =
> -                               jiffies_to_msecs(jiffies -
> start_jiffies);
> -                       if (msecs_blocked >= timeout_msecs) {
> -                               rc = -ETIMEDOUT;
> -                               goto out;
> -                       }
> -                       timeout_msecs -= msecs_blocked;
> -               }
> +               down(&ctrl_info->sync_request_sem);
>         }
>  
>         pqi_ctrl_busy(ctrl_info);
> -       timeout_msecs = pqi_wait_if_ctrl_blocked(ctrl_info,
> timeout_msecs);
> -       if (timeout_msecs == 0) {
> -               pqi_ctrl_unbusy(ctrl_info);
> -               rc = -ETIMEDOUT;
> -               goto out;
> -       }
> +       if (pqi_is_blockable_request(request))
> +               pqi_wait_if_ctrl_blocked(ctrl_info);

You wait here after taking the semaphore - is that intended? Why?

>  
>         if (pqi_ctrl_offline(ctrl_info)) {

Should you test this before waiting, perhaps?
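
I.e. something like (sketch only):

        if (pqi_ctrl_offline(ctrl_info)) {
                rc = -ENXIO;
                goto out;
        }

        if (pqi_is_blockable_request(request))
                pqi_wait_if_ctrl_blocked(ctrl_info);

so that an offline controller is detected before blocking.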

> -               pqi_ctrl_unbusy(ctrl_info);
>                 rc = -ENXIO;
>                 goto out;
>         }
>  
> -       atomic_inc(&ctrl_info->sync_cmds_outstanding);
> -
>         io_request = pqi_alloc_io_request(ctrl_info);
>  
>         put_unaligned_le16(io_request->index,
> @@ -4288,18 +4299,7 @@ static int
> pqi_submit_raid_request_synchronous(struct pqi_ctrl_info *ctrl_info,
>         pqi_start_io(ctrl_info, &ctrl_info-
> >queue_groups[PQI_DEFAULT_QUEUE_GROUP], RAID_PATH,
>                 io_request);
>  
> -       pqi_ctrl_unbusy(ctrl_info);
> -
> -       if (timeout_msecs == NO_TIMEOUT) {
> -               pqi_wait_for_completion_io(ctrl_info, &wait);
> -       } else {
> -               if (!wait_for_completion_io_timeout(&wait,
> -                       msecs_to_jiffies(timeout_msecs))) {
> -                       dev_warn(&ctrl_info->pci_dev->dev,
> -                               "command timed out\n");
> -                       rc = -ETIMEDOUT;
> -               }
> -       }
> +       pqi_wait_for_completion_io(ctrl_info, &wait);
>  
>         if (error_info) {
>                 if (io_request->error_info)
> @@ -4312,8 +4312,8 @@ static int
> pqi_submit_raid_request_synchronous(struct pqi_ctrl_info *ctrl_info,
>  
>         pqi_free_io_request(io_request);
>  
> -       atomic_dec(&ctrl_info->sync_cmds_outstanding);
>  out:
> +       pqi_ctrl_unbusy(ctrl_info);
>         up(&ctrl_info->sync_request_sem);
>  
>         return rc;
> @@ -4350,8 +4350,7 @@ static int
> pqi_submit_admin_request_synchronous(
>         rc = pqi_poll_for_admin_response(ctrl_info, response);
>  
>         if (rc == 0)
> -               rc = pqi_validate_admin_response(response,
> -                       request->function_code);
> +               rc = pqi_validate_admin_response(response, request-
> >function_code);
>  
>         return rc;
>  }
> @@ -4721,7 +4720,7 @@ static int pqi_configure_events(struct
> pqi_ctrl_info *ctrl_info,
>                 goto out;
>  
>         rc = pqi_submit_raid_request_synchronous(ctrl_info,
> &request.header,
> -               0, NULL, NO_TIMEOUT);
> +               0, NULL);
>  
>         pqi_pci_unmap(ctrl_info->pci_dev,
>                 request.data.report_event_configuration.sg_descriptor
> s, 1,
> @@ -4757,7 +4756,7 @@ static int pqi_configure_events(struct
> pqi_ctrl_info *ctrl_info,
>                 goto out;
>  
>         rc = pqi_submit_raid_request_synchronous(ctrl_info,
> &request.header, 0,
> -               NULL, NO_TIMEOUT);
> +               NULL);
>  
>         pqi_pci_unmap(ctrl_info->pci_dev,
>                 request.data.report_event_configuration.sg_descriptor
> s, 1,
> @@ -5277,12 +5276,6 @@ static inline int
> pqi_raid_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>                 device, scmd, queue_group);
>  }
>  

Below here, a new section starts that refactors the treatment of bypass
retries. I don't see how this is related to the synchronization issues
mentioned in the patch description.


> -static inline void pqi_schedule_bypass_retry(struct pqi_ctrl_info
> *ctrl_info)
> -{
> -       if (!pqi_ctrl_blocked(ctrl_info))
> -               schedule_work(&ctrl_info->raid_bypass_retry_work);
> -}
> -
>  static bool pqi_raid_bypass_retry_needed(struct pqi_io_request
> *io_request)
>  {
>         struct scsi_cmnd *scmd;
> @@ -5299,7 +5292,7 @@ static bool pqi_raid_bypass_retry_needed(struct
> pqi_io_request *io_request)
>                 return false;
>  
>         device = scmd->device->hostdata;
> -       if (pqi_device_offline(device))
> +       if (pqi_device_offline(device) ||
> pqi_device_in_remove(device))
>                 return false;
>  
>         ctrl_info = shost_to_hba(scmd->device->host);
> @@ -5309,155 +5302,26 @@ static bool
> pqi_raid_bypass_retry_needed(struct pqi_io_request *io_request)
>         return true;
>  }
>  
> -static inline void pqi_add_to_raid_bypass_retry_list(
> -       struct pqi_ctrl_info *ctrl_info,
> -       struct pqi_io_request *io_request, bool at_head)
> -{
> -       unsigned long flags;
> -
> -       spin_lock_irqsave(&ctrl_info->raid_bypass_retry_list_lock,
> flags);
> -       if (at_head)
> -               list_add(&io_request->request_list_entry,
> -                       &ctrl_info->raid_bypass_retry_list);
> -       else
> -               list_add_tail(&io_request->request_list_entry,
> -                       &ctrl_info->raid_bypass_retry_list);
> -       spin_unlock_irqrestore(&ctrl_info-
> >raid_bypass_retry_list_lock, flags);
> -}
> -
> -static void pqi_queued_raid_bypass_complete(struct pqi_io_request
> *io_request,
> +static void pqi_aio_io_complete(struct pqi_io_request *io_request,
>         void *context)
>  {
>         struct scsi_cmnd *scmd;
>  
>         scmd = io_request->scmd;
> +       scsi_dma_unmap(scmd);
> +       if (io_request->status == -EAGAIN ||
> +               pqi_raid_bypass_retry_needed(io_request))
> +                       set_host_byte(scmd, DID_IMM_RETRY);
>         pqi_free_io_request(io_request);
>         pqi_scsi_done(scmd);
>  }
>  
> -static void pqi_queue_raid_bypass_retry(struct pqi_io_request
> *io_request)
> +static inline int pqi_aio_submit_scsi_cmd(struct pqi_ctrl_info
> *ctrl_info,
> +       struct pqi_scsi_dev *device, struct scsi_cmnd *scmd,
> +       struct pqi_queue_group *queue_group)
>  {
> -       struct scsi_cmnd *scmd;
> -       struct pqi_ctrl_info *ctrl_info;
> -
> -       io_request->io_complete_callback =
> pqi_queued_raid_bypass_complete;
> -       scmd = io_request->scmd;
> -       scmd->result = 0;
> -       ctrl_info = shost_to_hba(scmd->device->host);
> -
> -       pqi_add_to_raid_bypass_retry_list(ctrl_info, io_request,
> false);
> -       pqi_schedule_bypass_retry(ctrl_info);
> -}
> -
> -static int pqi_retry_raid_bypass(struct pqi_io_request *io_request)
> -{
> -       struct scsi_cmnd *scmd;
> -       struct pqi_scsi_dev *device;
> -       struct pqi_ctrl_info *ctrl_info;
> -       struct pqi_queue_group *queue_group;
> -
> -       scmd = io_request->scmd;
> -       device = scmd->device->hostdata;
> -       if (pqi_device_in_reset(device)) {
> -               pqi_free_io_request(io_request);
> -               set_host_byte(scmd, DID_RESET);
> -               pqi_scsi_done(scmd);
> -               return 0;
> -       }
> -
> -       ctrl_info = shost_to_hba(scmd->device->host);
> -       queue_group = io_request->queue_group;
> -
> -       pqi_reinit_io_request(io_request);
> -
> -       return pqi_raid_submit_scsi_cmd_with_io_request(ctrl_info,
> io_request,
> -               device, scmd, queue_group);
> -}
> -
> -static inline struct pqi_io_request
> *pqi_next_queued_raid_bypass_request(
> -       struct pqi_ctrl_info *ctrl_info)
> -{
> -       unsigned long flags;
> -       struct pqi_io_request *io_request;
> -
> -       spin_lock_irqsave(&ctrl_info->raid_bypass_retry_list_lock,
> flags);
> -       io_request = list_first_entry_or_null(
> -               &ctrl_info->raid_bypass_retry_list,
> -               struct pqi_io_request, request_list_entry);
> -       if (io_request)
> -               list_del(&io_request->request_list_entry);
> -       spin_unlock_irqrestore(&ctrl_info-
> >raid_bypass_retry_list_lock, flags);
> -
> -       return io_request;
> -}
> -
> -static void pqi_retry_raid_bypass_requests(struct pqi_ctrl_info
> *ctrl_info)
> -{
> -       int rc;
> -       struct pqi_io_request *io_request;
> -
> -       pqi_ctrl_busy(ctrl_info);
> -
> -       while (1) {
> -               if (pqi_ctrl_blocked(ctrl_info))
> -                       break;
> -               io_request =
> pqi_next_queued_raid_bypass_request(ctrl_info);
> -               if (!io_request)
> -                       break;
> -               rc = pqi_retry_raid_bypass(io_request);
> -               if (rc) {
> -                       pqi_add_to_raid_bypass_retry_list(ctrl_info,
> io_request,
> -                               true);
> -                       pqi_schedule_bypass_retry(ctrl_info);
> -                       break;
> -               }
> -       }
> -
> -       pqi_ctrl_unbusy(ctrl_info);
> -}
> -
> -static void pqi_raid_bypass_retry_worker(struct work_struct *work)
> -{
> -       struct pqi_ctrl_info *ctrl_info;
> -
> -       ctrl_info = container_of(work, struct pqi_ctrl_info,
> -               raid_bypass_retry_work);
> -       pqi_retry_raid_bypass_requests(ctrl_info);
> -}
> -
> -static void pqi_clear_all_queued_raid_bypass_retries(
> -       struct pqi_ctrl_info *ctrl_info)
> -{
> -       unsigned long flags;
> -
> -       spin_lock_irqsave(&ctrl_info->raid_bypass_retry_list_lock,
> flags);
> -       INIT_LIST_HEAD(&ctrl_info->raid_bypass_retry_list);
> -       spin_unlock_irqrestore(&ctrl_info-
> >raid_bypass_retry_list_lock, flags);
> -}
> -
> -static void pqi_aio_io_complete(struct pqi_io_request *io_request,
> -       void *context)
> -{
> -       struct scsi_cmnd *scmd;
> -
> -       scmd = io_request->scmd;
> -       scsi_dma_unmap(scmd);
> -       if (io_request->status == -EAGAIN)
> -               set_host_byte(scmd, DID_IMM_RETRY);
> -       else if (pqi_raid_bypass_retry_needed(io_request)) {
> -               pqi_queue_raid_bypass_retry(io_request);
> -               return;
> -       }
> -       pqi_free_io_request(io_request);
> -       pqi_scsi_done(scmd);
> -}
> -
> -static inline int pqi_aio_submit_scsi_cmd(struct pqi_ctrl_info
> *ctrl_info,
> -       struct pqi_scsi_dev *device, struct scsi_cmnd *scmd,
> -       struct pqi_queue_group *queue_group)
> -{
> -       return pqi_aio_submit_io(ctrl_info, scmd, device->aio_handle,
> -               scmd->cmnd, scmd->cmd_len, queue_group, NULL, false);
> +       return pqi_aio_submit_io(ctrl_info, scmd, device->aio_handle,
> +               scmd->cmnd, scmd->cmd_len, queue_group, NULL, false);
>  }
>  
>  static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,
> @@ -5698,6 +5562,14 @@ static inline u16 pqi_get_hw_queue(struct
> pqi_ctrl_info *ctrl_info,
>         return hw_queue;
>  }
>  
> +static inline bool pqi_is_bypass_eligible_request(struct scsi_cmnd
> *scmd)
> +{
> +       if (blk_rq_is_passthrough(scmd->request))
> +               return false;
> +
> +       return scmd->retries == 0;
> +}
> +

Nice, but this fits better into (or next to) 10/25 IMO.

>  /*
>   * This function gets called just before we hand the completed SCSI
> request
>   * back to the SML.
> @@ -5806,7 +5678,6 @@ static int pqi_scsi_queue_command(struct
> Scsi_Host *shost, struct scsi_cmnd *scm
>         bool raid_bypassed;
>  
>         device = scmd->device->hostdata;
> -       ctrl_info = shost_to_hba(shost);
>  
>         if (!device) {
>                 set_host_byte(scmd, DID_NO_CONNECT);
> @@ -5816,15 +5687,15 @@ static int pqi_scsi_queue_command(struct
> Scsi_Host *shost, struct scsi_cmnd *scm
>  
>         atomic_inc(&device->scsi_cmds_outstanding);
>  
> +       ctrl_info = shost_to_hba(shost);
> +
>         if (pqi_ctrl_offline(ctrl_info) ||
> pqi_device_in_remove(device)) {
>                 set_host_byte(scmd, DID_NO_CONNECT);
>                 pqi_scsi_done(scmd);
>                 return 0;
>         }
>  
> -       pqi_ctrl_busy(ctrl_info);
> -       if (pqi_ctrl_blocked(ctrl_info) ||
> pqi_device_in_reset(device) ||
> -           pqi_ctrl_in_ofa(ctrl_info) ||
> pqi_ctrl_in_shutdown(ctrl_info)) {
> +       if (pqi_ctrl_blocked(ctrl_info)) {
>                 rc = SCSI_MLQUEUE_HOST_BUSY;
>                 goto out;
>         }
> @@ -5841,13 +5712,12 @@ static int pqi_scsi_queue_command(struct
> Scsi_Host *shost, struct scsi_cmnd *scm
>         if (pqi_is_logical_device(device)) {
>                 raid_bypassed = false;
>                 if (device->raid_bypass_enabled &&
> -                       !blk_rq_is_passthrough(scmd->request)) {
> -                       if (!pqi_is_parity_write_stream(ctrl_info,
> scmd)) {
> -                               rc =
> pqi_raid_bypass_submit_scsi_cmd(ctrl_info, device, scmd,
> queue_group);
> -                               if (rc == 0 || rc ==
> SCSI_MLQUEUE_HOST_BUSY) {
> -                                       raid_bypassed = true;
> -                                       atomic_inc(&device-
> >raid_bypass_cnt);
> -                               }
> +                       pqi_is_bypass_eligible_request(scmd) &&
> +                       !pqi_is_parity_write_stream(ctrl_info, scmd))
> {
> +                       rc =
> pqi_raid_bypass_submit_scsi_cmd(ctrl_info, device, scmd,
> queue_group);
> +                       if (rc == 0 || rc == SCSI_MLQUEUE_HOST_BUSY)
> {
> +                               raid_bypassed = true;
> +                               atomic_inc(&device->raid_bypass_cnt);
>                         }
>                 }
>                 if (!raid_bypassed)
> @@ -5860,7 +5730,6 @@ static int pqi_scsi_queue_command(struct
> Scsi_Host *shost, struct scsi_cmnd *scm
>         }
>  
>  out:
> -       pqi_ctrl_unbusy(ctrl_info);
>         if (rc)
>                 atomic_dec(&device->scsi_cmds_outstanding);
>  
> @@ -5970,100 +5839,22 @@ static void
> pqi_fail_io_queued_for_device(struct pqi_ctrl_info *ctrl_info,
>         }
>  }
>  
> -static void pqi_fail_io_queued_for_all_devices(struct pqi_ctrl_info
> *ctrl_info)
> -{
> -       unsigned int i;
> -       unsigned int path;
> -       struct pqi_queue_group *queue_group;
> -       unsigned long flags;
> -       struct pqi_io_request *io_request;
> -       struct pqi_io_request *next;
> -       struct scsi_cmnd *scmd;
> -
> -       for (i = 0; i < ctrl_info->num_queue_groups; i++) {
> -               queue_group = &ctrl_info->queue_groups[i];
> -
> -               for (path = 0; path < 2; path++) {
> -                       spin_lock_irqsave(&queue_group-
> >submit_lock[path],
> -                                               flags);
> -
> -                       list_for_each_entry_safe(io_request, next,
> -                               &queue_group->request_list[path],
> -                               request_list_entry) {
> -
> -                               scmd = io_request->scmd;
> -                               if (!scmd)
> -                                       continue;
> -
> -                               list_del(&io_request-
> >request_list_entry);
> -                               set_host_byte(scmd, DID_RESET);
> -                               pqi_scsi_done(scmd);
> -                       }
> -
> -                       spin_unlock_irqrestore(
> -                               &queue_group->submit_lock[path],
> flags);
> -               }
> -       }
> -}
> -
>  static int pqi_device_wait_for_pending_io(struct pqi_ctrl_info
> *ctrl_info,
>         struct pqi_scsi_dev *device, unsigned long timeout_secs)
>  {
>         unsigned long timeout;
>  
> -       timeout = (timeout_secs * PQI_HZ) + jiffies;
> -
> -       while (atomic_read(&device->scsi_cmds_outstanding)) {
> -               pqi_check_ctrl_health(ctrl_info);
> -               if (pqi_ctrl_offline(ctrl_info))
> -                       return -ENXIO;
> -               if (timeout_secs != NO_TIMEOUT) {
> -                       if (time_after(jiffies, timeout)) {
> -                               dev_err(&ctrl_info->pci_dev->dev,
> -                                       "timed out waiting for
> pending IO\n");
> -                               return -ETIMEDOUT;
> -                       }
> -               }
> -               usleep_range(1000, 2000);
> -       }
> -
> -       return 0;
> -}
> -
> -static int pqi_ctrl_wait_for_pending_io(struct pqi_ctrl_info
> *ctrl_info,
> -       unsigned long timeout_secs)
> -{
> -       bool io_pending;
> -       unsigned long flags;
> -       unsigned long timeout;
> -       struct pqi_scsi_dev *device;
>  
>         timeout = (timeout_secs * PQI_HZ) + jiffies;
> -       while (1) {
> -               io_pending = false;
> -
> -               spin_lock_irqsave(&ctrl_info->scsi_device_list_lock,
> flags);
> -               list_for_each_entry(device, &ctrl_info-
> >scsi_device_list,
> -                       scsi_device_list_entry) {
> -                       if (atomic_read(&device-
> >scsi_cmds_outstanding)) {
> -                               io_pending = true;
> -                               break;
> -                       }
> -               }
> -               spin_unlock_irqrestore(&ctrl_info-
> >scsi_device_list_lock,
> -                                       flags);
> -
> -               if (!io_pending)
> -                       break;
>  
> +       while (atomic_read(&device->scsi_cmds_outstanding)) {
>                 pqi_check_ctrl_health(ctrl_info);
>                 if (pqi_ctrl_offline(ctrl_info))
>                         return -ENXIO;
> -
>                 if (timeout_secs != NO_TIMEOUT) {
>                         if (time_after(jiffies, timeout)) {
>                                 dev_err(&ctrl_info->pci_dev->dev,
> -                                       "timed out waiting for
> pending IO\n");
> +                                       "timed out waiting for
> pending I/O\n");
>                                 return -ETIMEDOUT;
>                         }
>                 }

Like I said above (wrt pqi_scsi_block_requests), have you considered
using wait_event_timeout() here?
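
For reference, roughly (untested; assumes a wait queue head, say
ctrl_info->io_wait, which would be woken from pqi_scsi_done()):

        long remaining;

        remaining = wait_event_timeout(ctrl_info->io_wait,
                atomic_read(&device->scsi_cmds_outstanding) == 0 ||
                pqi_ctrl_offline(ctrl_info),
                timeout_secs * PQI_HZ);
        if (pqi_ctrl_offline(ctrl_info))
                return -ENXIO;
        if (!remaining)
                return -ETIMEDOUT;

That would avoid the usleep_range() polling loop (the NO_TIMEOUT case
would need separate handling, e.g. plain wait_event()).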


> @@ -6073,18 +5864,6 @@ static int pqi_ctrl_wait_for_pending_io(struct
> pqi_ctrl_info *ctrl_info,
>         return 0;
>  }
>  
> -static int pqi_ctrl_wait_for_pending_sync_cmds(struct pqi_ctrl_info
> *ctrl_info)
> -{
> -       while (atomic_read(&ctrl_info->sync_cmds_outstanding)) {
> -               pqi_check_ctrl_health(ctrl_info);
> -               if (pqi_ctrl_offline(ctrl_info))
> -                       return -ENXIO;
> -               usleep_range(1000, 2000);
> -       }
> -
> -       return 0;
> -}
> -
>  static void pqi_lun_reset_complete(struct pqi_io_request
> *io_request,
>         void *context)
>  {
> @@ -6156,13 +5935,11 @@ static int pqi_lun_reset(struct pqi_ctrl_info
> *ctrl_info,
>         return rc;
>  }
>  
> -/* Performs a reset at the LUN level. */
> -
>  #define PQI_LUN_RESET_RETRIES                  3
>  #define PQI_LUN_RESET_RETRY_INTERVAL_MSECS     10000
>  #define PQI_LUN_RESET_PENDING_IO_TIMEOUT_SECS  120
>  
> -static int _pqi_device_reset(struct pqi_ctrl_info *ctrl_info,
> +static int pqi_lun_reset_with_retries(struct pqi_ctrl_info
> *ctrl_info,
>         struct pqi_scsi_dev *device)
>  {
>         int rc;
> @@ -6188,23 +5965,15 @@ static int pqi_device_reset(struct
> pqi_ctrl_info *ctrl_info,
>  {
>         int rc;
>  
> -       mutex_lock(&ctrl_info->lun_reset_mutex);
> -
>         pqi_ctrl_block_requests(ctrl_info);
>         pqi_ctrl_wait_until_quiesced(ctrl_info);
>         pqi_fail_io_queued_for_device(ctrl_info, device);
>         rc = pqi_wait_until_inbound_queues_empty(ctrl_info);
> -       pqi_device_reset_start(device);
> -       pqi_ctrl_unblock_requests(ctrl_info);
> -
>         if (rc)
>                 rc = FAILED;
>         else
> -               rc = _pqi_device_reset(ctrl_info, device);
> -
> -       pqi_device_reset_done(device);
> -
> -       mutex_unlock(&ctrl_info->lun_reset_mutex);
> +               rc = pqi_lun_reset_with_retries(ctrl_info, device);
> +       pqi_ctrl_unblock_requests(ctrl_info);
>  
>         return rc;
>  }
> @@ -6220,29 +5989,25 @@ static int pqi_eh_device_reset_handler(struct
> scsi_cmnd *scmd)
>         ctrl_info = shost_to_hba(shost);
>         device = scmd->device->hostdata;
>  
> +       mutex_lock(&ctrl_info->lun_reset_mutex);
> +
>         dev_err(&ctrl_info->pci_dev->dev,
>                 "resetting scsi %d:%d:%d:%d\n",
>                 shost->host_no, device->bus, device->target, device-
> >lun);
>  
>         pqi_check_ctrl_health(ctrl_info);
> -       if (pqi_ctrl_offline(ctrl_info) ||
> -               pqi_device_reset_blocked(ctrl_info)) {
> +       if (pqi_ctrl_offline(ctrl_info))
>                 rc = FAILED;
> -               goto out;
> -       }
> -
> -       pqi_wait_until_ofa_finished(ctrl_info);
> -
> -       atomic_inc(&ctrl_info->sync_cmds_outstanding);
> -       rc = pqi_device_reset(ctrl_info, device);
> -       atomic_dec(&ctrl_info->sync_cmds_outstanding);
> +       else
> +               rc = pqi_device_reset(ctrl_info, device);
>  
> -out:
>         dev_err(&ctrl_info->pci_dev->dev,
>                 "reset of scsi %d:%d:%d:%d: %s\n",
>                 shost->host_no, device->bus, device->target, device-
> >lun,
>                 rc == SUCCESS ? "SUCCESS" : "FAILED");
>  
> +       mutex_unlock(&ctrl_info->lun_reset_mutex);
> +
>         return rc;
>  }
>  
> @@ -6544,7 +6309,7 @@ static int pqi_passthru_ioctl(struct
> pqi_ctrl_info *ctrl_info, void __user *arg)
>                 put_unaligned_le32(iocommand.Request.Timeout,
> &request.timeout);
>  
>         rc = pqi_submit_raid_request_synchronous(ctrl_info,
> &request.header,
> -               PQI_SYNC_FLAGS_INTERRUPTABLE, &pqi_error_info,
> NO_TIMEOUT);
> +               PQI_SYNC_FLAGS_INTERRUPTABLE, &pqi_error_info);
>  
>         if (iocommand.buf_size > 0)
>                 pqi_pci_unmap(ctrl_info->pci_dev,
> request.sg_descriptors, 1,
> @@ -6596,9 +6361,6 @@ static int pqi_ioctl(struct scsi_device *sdev,
> unsigned int cmd,
>  
>         ctrl_info = shost_to_hba(sdev->host);
>  
> -       if (pqi_ctrl_in_ofa(ctrl_info) ||
> pqi_ctrl_in_shutdown(ctrl_info))
> -               return -EBUSY;
> -
>         switch (cmd) {
>         case CCISS_DEREGDISK:
>         case CCISS_REGNEWDISK:
> @@ -7145,9 +6907,7 @@ static int pqi_register_scsi(struct
> pqi_ctrl_info *ctrl_info)
>  
>         shost = scsi_host_alloc(&pqi_driver_template,
> sizeof(ctrl_info));
>         if (!shost) {
> -               dev_err(&ctrl_info->pci_dev->dev,
> -                       "scsi_host_alloc failed for controller %u\n",
> -                       ctrl_info->ctrl_id);
> +               dev_err(&ctrl_info->pci_dev->dev, "scsi_host_alloc
> failed\n");
>                 return -ENOMEM;
>         }
>  
> @@ -7405,7 +7165,7 @@ static int pqi_config_table_update(struct
> pqi_ctrl_info *ctrl_info,
>                 &request.data.config_table_update.last_section);
>  
>         return pqi_submit_raid_request_synchronous(ctrl_info,
> &request.header,
> -               0, NULL, NO_TIMEOUT);
> +               0, NULL);
>  }
>  
>  static int pqi_enable_firmware_features(struct pqi_ctrl_info
> *ctrl_info,
> @@ -7483,7 +7243,8 @@ static void
> pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
>                 break;
>         case PQI_FIRMWARE_FEATURE_SOFT_RESET_HANDSHAKE:
>                 ctrl_info->soft_reset_handshake_supported =
> -                       firmware_feature->enabled;
> +                       firmware_feature->enabled &&
> +                       ctrl_info->soft_reset_status;

Should you use readb(ctrl_info->soft_reset_status) here, like above?

>                 break;
>         case PQI_FIRMWARE_FEATURE_RAID_IU_TIMEOUT:
>                 ctrl_info->raid_iu_timeout_supported =
> firmware_feature->enabled;
> @@ -7491,6 +7252,10 @@ static void
> pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
>         case PQI_FIRMWARE_FEATURE_TMF_IU_TIMEOUT:
>                 ctrl_info->tmf_iu_timeout_supported =
> firmware_feature->enabled;
>                 break;
> +       case PQI_FIRMWARE_FEATURE_UNIQUE_WWID_IN_REPORT_PHYS_LUN:
> +               ctrl_info->unique_wwid_in_report_phys_lun_supported =
> +                       firmware_feature->enabled;
> +               break;
>         }
>  
>         pqi_firmware_feature_status(ctrl_info, firmware_feature);
> @@ -7581,6 +7346,11 @@ static struct pqi_firmware_feature
> pqi_firmware_features[] = {
>                 .feature_bit =
> PQI_FIRMWARE_FEATURE_RAID_BYPASS_ON_ENCRYPTED_NVME,
>                 .feature_status = pqi_firmware_feature_status,
>         },
> +       {
> +               .feature_name = "Unique WWID in Report Physical LUN",
> +               .feature_bit =
> PQI_FIRMWARE_FEATURE_UNIQUE_WWID_IN_REPORT_PHYS_LUN,
> +               .feature_status = pqi_ctrl_update_feature_flags,
> +       },
>  };
>  
>  static void pqi_process_firmware_features(
> @@ -7665,14 +7435,34 @@ static void
> pqi_process_firmware_features_section(
>         mutex_unlock(&pqi_firmware_features_mutex);
>  }
>  

The hunks below look like yet another independent change.
(Handling of firmware_feature_section_present).

> +/*
> + * Reset all controller settings that can be initialized during the
> processing
> + * of the PQI Configuration Table.
> + */
> +
> +static void pqi_ctrl_reset_config(struct pqi_ctrl_info *ctrl_info)
> +{
> +       ctrl_info->heartbeat_counter = NULL;
> +       ctrl_info->soft_reset_status = NULL;
> +       ctrl_info->soft_reset_handshake_supported = false;
> +       ctrl_info->enable_r1_writes = false;
> +       ctrl_info->enable_r5_writes = false;
> +       ctrl_info->enable_r6_writes = false;
> +       ctrl_info->raid_iu_timeout_supported = false;
> +       ctrl_info->tmf_iu_timeout_supported = false;
> +       ctrl_info->unique_wwid_in_report_phys_lun_supported = false;
> +}
> +
>  static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
>  {
>         u32 table_length;
>         u32 section_offset;
> +       bool firmware_feature_section_present;
>         void __iomem *table_iomem_addr;
>         struct pqi_config_table *config_table;
>         struct pqi_config_table_section_header *section;
>         struct pqi_config_table_section_info section_info;
> +       struct pqi_config_table_section_info feature_section_info;
>  
>         table_length = ctrl_info->config_table_length;
>         if (table_length == 0)
> @@ -7692,6 +7482,7 @@ static int pqi_process_config_table(struct
> pqi_ctrl_info *ctrl_info)
>         table_iomem_addr = ctrl_info->iomem_base + ctrl_info-
> >config_table_offset;
>         memcpy_fromio(config_table, table_iomem_addr, table_length);
>  
> +       firmware_feature_section_present = false;
>         section_info.ctrl_info = ctrl_info;
>         section_offset = get_unaligned_le32(&config_table-
> >first_section_offset);
>  
> @@ -7704,7 +7495,8 @@ static int pqi_process_config_table(struct
> pqi_ctrl_info *ctrl_info)
>  
>                 switch (get_unaligned_le16(&section->section_id)) {
>                 case PQI_CONFIG_TABLE_SECTION_FIRMWARE_FEATURES:
> -
>                        pqi_process_firmware_features_section(&section_
> info);
> +                       firmware_feature_section_present = true;
> +                       feature_section_info = section_info;
>                         break;
>                 case PQI_CONFIG_TABLE_SECTION_HEARTBEAT:
>                         if (pqi_disable_heartbeat)
> @@ -7722,13 +7514,21 @@ static int pqi_process_config_table(struct
> pqi_ctrl_info *ctrl_info)
>                                 table_iomem_addr +
>                                 section_offset +
>                                 offsetof(struct pqi_config_table_soft_reset,
> -                                               soft_reset_status);
> +                                       soft_reset_status);
>                         break;
>                 }
>  
>                 section_offset = get_unaligned_le16(&section->next_section_offset);
>         }
>  
> +       /*
> +        * We process the firmware feature section after all other sections
> +        * have been processed so that the feature bit callbacks can take
> +        * into account the settings configured by other sections.
> +        */
> +       if (firmware_feature_section_present)
> +               pqi_process_firmware_features_section(&feature_section_info);
> +
>         kfree(config_table);
>  
>         return 0;
> @@ -7776,8 +7576,6 @@ static int pqi_force_sis_mode(struct
> pqi_ctrl_info *ctrl_info)
>         return pqi_revert_to_sis_mode(ctrl_info);
>  }
>  
> -#define PQI_POST_RESET_DELAY_B4_MSGU_READY     5000
> -
>  static int pqi_ctrl_init(struct pqi_ctrl_info *ctrl_info)
>  {
>         int rc;
> @@ -7785,7 +7583,7 @@ static int pqi_ctrl_init(struct pqi_ctrl_info
> *ctrl_info)
>  
>         if (reset_devices) {
>                 sis_soft_reset(ctrl_info);
> -               msleep(PQI_POST_RESET_DELAY_B4_MSGU_READY);
> +               msleep(PQI_POST_RESET_DELAY_SECS * PQI_HZ);
>         } else {
>                 rc = pqi_force_sis_mode(ctrl_info);
>                 if (rc)
> @@ -8095,6 +7893,8 @@ static int pqi_ctrl_init_resume(struct
> pqi_ctrl_info *ctrl_info)
>         ctrl_info->controller_online = true;
>         pqi_ctrl_unblock_requests(ctrl_info);
>  
> +       pqi_ctrl_reset_config(ctrl_info);
> +
>         rc = pqi_process_config_table(ctrl_info);
>         if (rc)
>                 return rc;
> @@ -8140,7 +7940,8 @@ static int pqi_ctrl_init_resume(struct
> pqi_ctrl_info *ctrl_info)
>                 return rc;
>         }
>  
> -       pqi_schedule_update_time_worker(ctrl_info);
> +       if (pqi_ofa_in_progress(ctrl_info))
> +               pqi_ctrl_unblock_scan(ctrl_info);
>  
>         pqi_scan_scsi_devices(ctrl_info);
>  
> @@ -8253,7 +8054,6 @@ static struct pqi_ctrl_info
> *pqi_alloc_ctrl_info(int numa_node)
>  
>         INIT_WORK(&ctrl_info->event_work, pqi_event_worker);
>         atomic_set(&ctrl_info->num_interrupts, 0);
> -       atomic_set(&ctrl_info->sync_cmds_outstanding, 0);
>  
>         INIT_DELAYED_WORK(&ctrl_info->rescan_work, pqi_rescan_worker);
>         INIT_DELAYED_WORK(&ctrl_info->update_time_work, pqi_update_time_worker);
> @@ -8261,15 +8061,13 @@ static struct pqi_ctrl_info
> *pqi_alloc_ctrl_info(int numa_node)
>         timer_setup(&ctrl_info->heartbeat_timer, pqi_heartbeat_timer_handler, 0);
>         INIT_WORK(&ctrl_info->ctrl_offline_work, pqi_ctrl_offline_worker);
>  
> +       INIT_WORK(&ctrl_info->ofa_memory_alloc_work, pqi_ofa_memory_alloc_worker);
> +       INIT_WORK(&ctrl_info->ofa_quiesce_work, pqi_ofa_quiesce_worker);
> +
>         sema_init(&ctrl_info->sync_request_sem,
>                 PQI_RESERVED_IO_SLOTS_SYNCHRONOUS_REQUESTS);
>         init_waitqueue_head(&ctrl_info->block_requests_wait);
>  
> -       INIT_LIST_HEAD(&ctrl_info->raid_bypass_retry_list);
> -       spin_lock_init(&ctrl_info->raid_bypass_retry_list_lock);
> -       INIT_WORK(&ctrl_info->raid_bypass_retry_work,
> -               pqi_raid_bypass_retry_worker);
> -
>         ctrl_info->ctrl_id = atomic_inc_return(&pqi_controller_count) - 1;
>         ctrl_info->irq_mode = IRQ_MODE_NONE;
>         ctrl_info->max_msix_vectors = PQI_MAX_MSIX_VECTORS;
> @@ -8334,81 +8132,57 @@ static void pqi_remove_ctrl(struct
> pqi_ctrl_info *ctrl_info)
>  
>  static void pqi_ofa_ctrl_quiesce(struct pqi_ctrl_info *ctrl_info)
>  {
> -       pqi_cancel_update_time_worker(ctrl_info);
> -       pqi_cancel_rescan_worker(ctrl_info);
> -       pqi_wait_until_lun_reset_finished(ctrl_info);
> -       pqi_wait_until_scan_finished(ctrl_info);
> -       pqi_ctrl_ofa_start(ctrl_info);
> +       pqi_ctrl_block_scan(ctrl_info);
> +       pqi_scsi_block_requests(ctrl_info);
> +       pqi_ctrl_block_device_reset(ctrl_info);
>         pqi_ctrl_block_requests(ctrl_info);
>         pqi_ctrl_wait_until_quiesced(ctrl_info);
> -       pqi_ctrl_wait_for_pending_io(ctrl_info,
> PQI_PENDING_IO_TIMEOUT_SECS);
> -       pqi_fail_io_queued_for_all_devices(ctrl_info);
> -       pqi_wait_until_inbound_queues_empty(ctrl_info);
>         pqi_stop_heartbeat_timer(ctrl_info);
> -       ctrl_info->pqi_mode_enabled = false;
> -       pqi_save_ctrl_mode(ctrl_info, SIS_MODE);
>  }
>  
>  static void pqi_ofa_ctrl_unquiesce(struct pqi_ctrl_info *ctrl_info)
>  {
> -       pqi_ofa_free_host_buffer(ctrl_info);
> -       ctrl_info->pqi_mode_enabled = true;
> -       pqi_save_ctrl_mode(ctrl_info, PQI_MODE);
> -       ctrl_info->controller_online = true;
> -       pqi_ctrl_unblock_requests(ctrl_info);
>         pqi_start_heartbeat_timer(ctrl_info);
> -       pqi_schedule_update_time_worker(ctrl_info);
> -       pqi_clear_soft_reset_status(ctrl_info,
> -               PQI_SOFT_RESET_ABORT);
> -       pqi_scan_scsi_devices(ctrl_info);
> +       pqi_ctrl_unblock_requests(ctrl_info);
> +       pqi_ctrl_unblock_device_reset(ctrl_info);
> +       pqi_scsi_unblock_requests(ctrl_info);
> +       pqi_ctrl_unblock_scan(ctrl_info);
>  }
>  
> -static int pqi_ofa_alloc_mem(struct pqi_ctrl_info *ctrl_info,
> -       u32 total_size, u32 chunk_size)
> +static int pqi_ofa_alloc_mem(struct pqi_ctrl_info *ctrl_info, u32
> total_size, u32 chunk_size)
>  {
> -       u32 sg_count;
> -       u32 size;
>         int i;
> -       struct pqi_sg_descriptor *mem_descriptor = NULL;
> +       u32 sg_count;
>         struct device *dev;
>         struct pqi_ofa_memory *ofap;
> -
> -       dev = &ctrl_info->pci_dev->dev;
> -
> -       sg_count = (total_size + chunk_size - 1);
> -       sg_count /= chunk_size;
> +       struct pqi_sg_descriptor *mem_descriptor;
> +       dma_addr_t dma_handle;
>  
>         ofap = ctrl_info->pqi_ofa_mem_virt_addr;
>  
> -       if (sg_count*chunk_size < total_size)
> +       sg_count = DIV_ROUND_UP(total_size, chunk_size);
> +       if (sg_count == 0 || sg_count > PQI_OFA_MAX_SG_DESCRIPTORS)
>                 goto out;
>  
> -       ctrl_info->pqi_ofa_chunk_virt_addr =
> -                               kcalloc(sg_count, sizeof(void *), GFP_KERNEL);
> +       ctrl_info->pqi_ofa_chunk_virt_addr = kmalloc_array(sg_count, sizeof(void *), GFP_KERNEL);
>         if (!ctrl_info->pqi_ofa_chunk_virt_addr)
>                 goto out;
>  
> -       for (size = 0, i = 0; size < total_size; size += chunk_size, i++) {
> -               dma_addr_t dma_handle;
> +       dev = &ctrl_info->pci_dev->dev;
>  
> +       for (i = 0; i < sg_count; i++) {
>                 ctrl_info->pqi_ofa_chunk_virt_addr[i] =
> -                       dma_alloc_coherent(dev, chunk_size, &dma_handle,
> -                                          GFP_KERNEL);
> -
> +                       dma_alloc_coherent(dev, chunk_size, &dma_handle, GFP_KERNEL);
>                 if (!ctrl_info->pqi_ofa_chunk_virt_addr[i])
> -                       break;
> -
> +                       goto out_free_chunks;
>                 mem_descriptor = &ofap->sg_descriptor[i];
>                 put_unaligned_le64((u64)dma_handle, &mem_descriptor->address);
>                 put_unaligned_le32(chunk_size, &mem_descriptor->length);
>         }
>  
> -       if (!size || size < total_size)
> -               goto out_free_chunks;
> -
>         put_unaligned_le32(CISS_SG_LAST, &mem_descriptor->flags);
>         put_unaligned_le16(sg_count, &ofap->num_memory_descriptors);
> -       put_unaligned_le32(size, &ofap->bytes_allocated);
> +       put_unaligned_le32(sg_count * chunk_size, &ofap->bytes_allocated);
>  
>         return 0;
>  
> @@ -8416,82 +8190,87 @@ static int pqi_ofa_alloc_mem(struct
> pqi_ctrl_info *ctrl_info,
>         while (--i >= 0) {
>                 mem_descriptor = &ofap->sg_descriptor[i];
>                 dma_free_coherent(dev, chunk_size,
> -                               ctrl_info->pqi_ofa_chunk_virt_addr[i],
> -                               get_unaligned_le64(&mem_descriptor->address));
> +                       ctrl_info->pqi_ofa_chunk_virt_addr[i],
> +                       get_unaligned_le64(&mem_descriptor->address));
>         }
>         kfree(ctrl_info->pqi_ofa_chunk_virt_addr);
>  
>  out:
> -       put_unaligned_le32 (0, &ofap->bytes_allocated);
>         return -ENOMEM;
>  }
>  
>  static int pqi_ofa_alloc_host_buffer(struct pqi_ctrl_info
> *ctrl_info)
>  {
>         u32 total_size;
> +       u32 chunk_size;
>         u32 min_chunk_size;
> -       u32 chunk_sz;
>  
> -       total_size = le32_to_cpu(
> -                       ctrl_info->pqi_ofa_mem_virt_addr->bytes_allocated);
> -       min_chunk_size = total_size / PQI_OFA_MAX_SG_DESCRIPTORS;
> +       if (ctrl_info->ofa_bytes_requested == 0)
> +               return 0;
>  
> -       for (chunk_sz = total_size; chunk_sz >= min_chunk_size; chunk_sz /= 2)
> -               if (!pqi_ofa_alloc_mem(ctrl_info, total_size, chunk_sz))
> +       total_size = PAGE_ALIGN(ctrl_info->ofa_bytes_requested);
> +       min_chunk_size = DIV_ROUND_UP(total_size, PQI_OFA_MAX_SG_DESCRIPTORS);
> +       min_chunk_size = PAGE_ALIGN(min_chunk_size);
> +
> +       for (chunk_size = total_size; chunk_size >= min_chunk_size;) {
> +               if (pqi_ofa_alloc_mem(ctrl_info, total_size, chunk_size) == 0)
>                         return 0;
> +               chunk_size /= 2;
> +               chunk_size = PAGE_ALIGN(chunk_size);
> +       }
>  
>         return -ENOMEM;
>  }
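
A worked example of the halving loop above (hypothetical numbers, 4 KiB
pages): with ofa_bytes_requested = 1 MiB and PQI_OFA_MAX_SG_DESCRIPTORS
= 64, min_chunk_size becomes PAGE_ALIGN(1 MiB / 64) = 16 KiB, so the
loop tries chunk sizes of 1 MiB, 512 KiB, ... down to 16 KiB until
pqi_ofa_alloc_mem() succeeds -- trading fewer, larger coherent
allocations for more, smaller ones when memory is fragmented.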
>  
> -static void pqi_ofa_setup_host_buffer(struct pqi_ctrl_info
> *ctrl_info,
> -       u32 bytes_requested)
> +static void pqi_ofa_setup_host_buffer(struct pqi_ctrl_info
> *ctrl_info)
>  {
> -       struct pqi_ofa_memory *pqi_ofa_memory;
>         struct device *dev;
> +       struct pqi_ofa_memory *ofap;
>  
>         dev = &ctrl_info->pci_dev->dev;
> -       pqi_ofa_memory = dma_alloc_coherent(dev,
> -                                           PQI_OFA_MEMORY_DESCRIPTOR_LENGTH,
> -                                           &ctrl_info->pqi_ofa_mem_dma_handle,
> -                                           GFP_KERNEL);
>  
> -       if (!pqi_ofa_memory)
> +       ofap = dma_alloc_coherent(dev, sizeof(*ofap),
> +               &ctrl_info->pqi_ofa_mem_dma_handle, GFP_KERNEL);
> +       if (!ofap)
>                 return;
>  
> -       put_unaligned_le16(PQI_OFA_VERSION, &pqi_ofa_memory->version);
> -       memcpy(&pqi_ofa_memory->signature, PQI_OFA_SIGNATURE,
> -                                       sizeof(pqi_ofa_memory->signature));
> -       pqi_ofa_memory->bytes_allocated = cpu_to_le32(bytes_requested);
> -
> -       ctrl_info->pqi_ofa_mem_virt_addr = pqi_ofa_memory;
> +       ctrl_info->pqi_ofa_mem_virt_addr = ofap;
>  
>         if (pqi_ofa_alloc_host_buffer(ctrl_info) < 0) {
> -               dev_err(dev, "Failed to allocate host buffer of size
> = %u",
> -                       bytes_requested);
> +               dev_err(dev,
> +                       "failed to allocate host buffer for Online
> Firmware Activation\n");
> +               dma_free_coherent(dev, sizeof(*ofap), ofap,
> ctrl_info->pqi_ofa_mem_dma_handle);
> +               ctrl_info->pqi_ofa_mem_virt_addr = NULL;
> +               return;
>         }
>  
> -       return;
> +       put_unaligned_le16(PQI_OFA_VERSION, &ofap->version);
> +       memcpy(&ofap->signature, PQI_OFA_SIGNATURE, sizeof(ofap->signature));
>  }
>  
>  static void pqi_ofa_free_host_buffer(struct pqi_ctrl_info
> *ctrl_info)
>  {
> -       int i;
> -       struct pqi_sg_descriptor *mem_descriptor;
> +       unsigned int i;
> +       struct device *dev;
>         struct pqi_ofa_memory *ofap;
> +       struct pqi_sg_descriptor *mem_descriptor;
> +       unsigned int num_memory_descriptors;
>  
>         ofap = ctrl_info->pqi_ofa_mem_virt_addr;
> -
>         if (!ofap)
>                 return;
>  
> -       if (!ofap->bytes_allocated)
> +       dev = &ctrl_info->pci_dev->dev;
> +
> +       if (get_unaligned_le32(&ofap->bytes_allocated) == 0)
>                 goto out;
>  
>         mem_descriptor = ofap->sg_descriptor;
> +       num_memory_descriptors =
> +               get_unaligned_le16(&ofap->num_memory_descriptors);
>  
> -       for (i = 0; i < get_unaligned_le16(&ofap->num_memory_descriptors);
> -               i++) {
> -               dma_free_coherent(&ctrl_info->pci_dev->dev,
> +       for (i = 0; i < num_memory_descriptors; i++) {
> +               dma_free_coherent(dev,
>                         get_unaligned_le32(&mem_descriptor[i].length),
>                         ctrl_info->pqi_ofa_chunk_virt_addr[i],
>                         get_unaligned_le64(&mem_descriptor[i].address));
> @@ -8499,47 +8278,46 @@ static void pqi_ofa_free_host_buffer(struct
> pqi_ctrl_info *ctrl_info)
>         kfree(ctrl_info->pqi_ofa_chunk_virt_addr);
>  
>  out:
> -       dma_free_coherent(&ctrl_info->pci_dev->dev,
> -                       PQI_OFA_MEMORY_DESCRIPTOR_LENGTH, ofap,
> -                       ctrl_info->pqi_ofa_mem_dma_handle);
> +       dma_free_coherent(dev, sizeof(*ofap), ofap,
> +               ctrl_info->pqi_ofa_mem_dma_handle);
>         ctrl_info->pqi_ofa_mem_virt_addr = NULL;
>  }
>  
>  static int pqi_ofa_host_memory_update(struct pqi_ctrl_info
> *ctrl_info)
>  {
> +       u32 buffer_length;
>         struct pqi_vendor_general_request request;
> -       size_t size;
>         struct pqi_ofa_memory *ofap;
>  
>         memset(&request, 0, sizeof(request));
>  
> -       ofap = ctrl_info->pqi_ofa_mem_virt_addr;
> -
>         request.header.iu_type = PQI_REQUEST_IU_VENDOR_GENERAL;
>         put_unaligned_le16(sizeof(request) - PQI_REQUEST_HEADER_LENGTH,
>                 &request.header.iu_length);
>         put_unaligned_le16(PQI_VENDOR_GENERAL_HOST_MEMORY_UPDATE,
>                 &request.function_code);
>  
> +       ofap = ctrl_info->pqi_ofa_mem_virt_addr;
> +
>         if (ofap) {
> -               size = offsetof(struct pqi_ofa_memory, sg_descriptor) +
> +               buffer_length = offsetof(struct pqi_ofa_memory, sg_descriptor) +
>                         get_unaligned_le16(&ofap->num_memory_descriptors) *
>                         sizeof(struct pqi_sg_descriptor);
>  
>                 put_unaligned_le64((u64)ctrl_info->pqi_ofa_mem_dma_handle,
>                         &request.data.ofa_memory_allocation.buffer_address);
> -               put_unaligned_le32(size,
> +               put_unaligned_le32(buffer_length,
>                         &request.data.ofa_memory_allocation.buffer_length);
> -
>         }
>  
>         return pqi_submit_raid_request_synchronous(ctrl_info, &request.header,
> -               0, NULL, NO_TIMEOUT);
> +               0, NULL);
>  }
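
(For scale: buffer_length above is the fixed header up to sg_descriptor
plus one descriptor per chunk, so assuming a 16-byte struct
pqi_sg_descriptor and 8 chunks, the controller is told about
offsetof(struct pqi_ofa_memory, sg_descriptor) + 8 * 16 bytes.)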
>  
> -static int pqi_ofa_ctrl_restart(struct pqi_ctrl_info *ctrl_info)
> +static int pqi_ofa_ctrl_restart(struct pqi_ctrl_info *ctrl_info, unsigned int delay_secs)
>  {
> -       msleep(PQI_POST_RESET_DELAY_B4_MSGU_READY);
> +       ssleep(delay_secs);
> +
>         return pqi_ctrl_init_resume(ctrl_info);
>  }
>  
> @@ -8597,7 +8375,6 @@ static void
> pqi_take_ctrl_offline_deferred(struct pqi_ctrl_info *ctrl_info)
>         pqi_cancel_update_time_worker(ctrl_info);
>         pqi_ctrl_wait_until_quiesced(ctrl_info);
>         pqi_fail_all_outstanding_requests(ctrl_info);
> -       pqi_clear_all_queued_raid_bypass_retries(ctrl_info);
>         pqi_ctrl_unblock_requests(ctrl_info);
>  }
>  
> @@ -8730,24 +8507,12 @@ static void pqi_shutdown(struct pci_dev
> *pci_dev)
>                 return;
>         }
>  
> -       pqi_disable_events(ctrl_info);
>         pqi_wait_until_ofa_finished(ctrl_info);
> -       pqi_cancel_update_time_worker(ctrl_info);
> -       pqi_cancel_rescan_worker(ctrl_info);
> -       pqi_cancel_event_worker(ctrl_info);
> -
> -       pqi_ctrl_shutdown_start(ctrl_info);
> -       pqi_ctrl_wait_until_quiesced(ctrl_info);
> -
> -       rc = pqi_ctrl_wait_for_pending_io(ctrl_info, NO_TIMEOUT);
> -       if (rc) {
> -               dev_err(&pci_dev->dev,
> -                       "wait for pending I/O failed\n");
> -               return;
> -       }
>  
> +       pqi_scsi_block_requests(ctrl_info);
>         pqi_ctrl_block_device_reset(ctrl_info);
> -       pqi_wait_until_lun_reset_finished(ctrl_info);
> +       pqi_ctrl_block_requests(ctrl_info);
> +       pqi_ctrl_wait_until_quiesced(ctrl_info);
>  
>         /*
>          * Write all data in the controller's battery-backed cache to
> @@ -8758,15 +8523,6 @@ static void pqi_shutdown(struct pci_dev
> *pci_dev)
>                 dev_err(&pci_dev->dev,
>                         "unable to flush controller cache\n");
>  
> -       pqi_ctrl_block_requests(ctrl_info);
> -
> -       rc = pqi_ctrl_wait_for_pending_sync_cmds(ctrl_info);
> -       if (rc) {
> -               dev_err(&pci_dev->dev,
> -                       "wait for pending sync cmds failed\n");
> -               return;
> -       }
> -
>         pqi_crash_if_pending_command(ctrl_info);
>         pqi_reset(ctrl_info);
>  }
> @@ -8801,19 +8557,18 @@ static __maybe_unused int pqi_suspend(struct
> pci_dev *pci_dev, pm_message_t stat
>  
>         ctrl_info = pci_get_drvdata(pci_dev);
>  
> -       pqi_disable_events(ctrl_info);
> -       pqi_cancel_update_time_worker(ctrl_info);
> -       pqi_cancel_rescan_worker(ctrl_info);
> -       pqi_wait_until_scan_finished(ctrl_info);
> -       pqi_wait_until_lun_reset_finished(ctrl_info);
>         pqi_wait_until_ofa_finished(ctrl_info);
> -       pqi_flush_cache(ctrl_info, SUSPEND);
> +
> +       pqi_ctrl_block_scan(ctrl_info);
> +       pqi_scsi_block_requests(ctrl_info);
> +       pqi_ctrl_block_device_reset(ctrl_info);
>         pqi_ctrl_block_requests(ctrl_info);
>         pqi_ctrl_wait_until_quiesced(ctrl_info);
> -       pqi_wait_until_inbound_queues_empty(ctrl_info);
> -       pqi_ctrl_wait_for_pending_io(ctrl_info, NO_TIMEOUT);
> +       pqi_flush_cache(ctrl_info, SUSPEND);
>         pqi_stop_heartbeat_timer(ctrl_info);
>  
> +       pqi_crash_if_pending_command(ctrl_info);
> +
>         if (state.event == PM_EVENT_FREEZE)
>                 return 0;
>  
> @@ -8846,8 +8601,10 @@ static __maybe_unused int pqi_resume(struct
> pci_dev *pci_dev)
>                                 pci_dev->irq, rc);
>                         return rc;
>                 }
> -               pqi_start_heartbeat_timer(ctrl_info);
> +               pqi_ctrl_unblock_device_reset(ctrl_info);
>                 pqi_ctrl_unblock_requests(ctrl_info);
> +               pqi_scsi_unblock_requests(ctrl_info);
> +               pqi_ctrl_unblock_scan(ctrl_info);
>                 return 0;
>         }
>  
> @@ -9288,7 +9045,7 @@ static void __attribute__((unused))
> verify_structures(void)
>         BUILD_BUG_ON(offsetof(struct pqi_iu_header,
>                 response_queue_id) != 0x4);
>         BUILD_BUG_ON(offsetof(struct pqi_iu_header,
> -               work_area) != 0x6);
> +               driver_flags) != 0x6);
>         BUILD_BUG_ON(sizeof(struct pqi_iu_header) != 0x8);
>  
>         BUILD_BUG_ON(offsetof(struct pqi_aio_error_info,
> @@ -9386,7 +9143,7 @@ static void __attribute__((unused))
> verify_structures(void)
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_request,
>                 header.iu_length) != 2);
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_request,
> -               header.work_area) != 6);
> +               header.driver_flags) != 6);
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_request,
>                 request_id) != 8);
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_request,
> @@ -9442,7 +9199,7 @@ static void __attribute__((unused))
> verify_structures(void)
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_response,
>                 header.iu_length) != 2);
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_response,
> -               header.work_area) != 6);
> +               header.driver_flags) != 6);
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_response,
>                 request_id) != 8);
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_response,
> @@ -9466,7 +9223,7 @@ static void __attribute__((unused))
> verify_structures(void)
>         BUILD_BUG_ON(offsetof(struct pqi_raid_path_request,
>                 header.response_queue_id) != 4);
>         BUILD_BUG_ON(offsetof(struct pqi_raid_path_request,
> -               header.work_area) != 6);
> +               header.driver_flags) != 6);
>         BUILD_BUG_ON(offsetof(struct pqi_raid_path_request,
>                 request_id) != 8);
>         BUILD_BUG_ON(offsetof(struct pqi_raid_path_request,
> @@ -9495,7 +9252,7 @@ static void __attribute__((unused))
> verify_structures(void)
>         BUILD_BUG_ON(offsetof(struct pqi_aio_path_request,
>                 header.response_queue_id) != 4);
>         BUILD_BUG_ON(offsetof(struct pqi_aio_path_request,
> -               header.work_area) != 6);
> +               header.driver_flags) != 6);
>         BUILD_BUG_ON(offsetof(struct pqi_aio_path_request,
>                 request_id) != 8);
>         BUILD_BUG_ON(offsetof(struct pqi_aio_path_request,
> 




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2020-12-15 20:23     ` Don.Brace
@ 2021-01-07 23:43       ` Martin Wilck
  2021-01-15 21:17         ` Don.Brace
  0 siblings, 1 reply; 91+ messages in thread
From: Martin Wilck @ 2021-01-07 23:43 UTC (permalink / raw)
  To: Don.Brace, pmenzel, Kevin.Barnett, Scott.Teel, Justin.Lindley,
	Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara, hch,
	joseph.szczypek, POSWALD, jejb, martin.petersen
  Cc: linux-scsi, it+linux-scsi, buczek, gregkh

On Tue, 2020-12-15 at 20:23 +0000, Don.Brace@microchip.com wrote:
> Please see answers below. Hope this helps.
> 
> -----Original Message-----
> From: Paul Menzel [mailto:pmenzel@molgen.mpg.de] 
> Sent: Monday, December 14, 2020 11:54 AM
> To: Don Brace - C33706 <Don.Brace@microchip.com>; Kevin Barnett -
> C33748 <Kevin.Barnett@microchip.com>; Scott Teel - C33730 <
> Scott.Teel@microchip.com>; Justin Lindley - C33718 <
> Justin.Lindley@microchip.com>; Scott Benesh - C33703 <
> Scott.Benesh@microchip.com>; Gerry Morong - C33720 <
> Gerry.Morong@microchip.com>; Mahesh Rajashekhara - I30583 <
> Mahesh.Rajashekhara@microchip.com>; hch@infradead.org; 
> joseph.szczypek@hpe.com; POSWALD@suse.com; James E. J. Bottomley <
> jejb@linux.ibm.com>; Martin K. Petersen <martin.petersen@oracle.com>
> Cc: linux-scsi@vger.kernel.org; it+linux-scsi@molgen.mpg.de; Donald
> Buczek <buczek@molgen.mpg.de>; Greg KH <gregkh@linuxfoundation.org>
> Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
> 
> EXTERNAL EMAIL: Do not click links or open attachments unless you
> know the content is safe
> 
> Dear Don, dear Mahesh,
> 
> 
> Am 10.12.20 um 21:35 schrieb Don Brace:
> > From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
> > 
> > * Correct the SCSI mid-layer sending more requests than the
> >    exposed host queue depth, causing a firmware ASSERT issue.
> >    * Add a host queue-depth counter.
> 
> This supposedly fixes the regression between Linux 5.4 and 5.9, which
> we reported in [1].
> 
>      kernel: smartpqi 0000:89:00.0: controller is offline: status
> code 0x6100c
>      kernel: smartpqi 0000:89:00.0: controller offline
> 
> Thank you for looking into this issue and fixing it. We are going to
> test this.
> 
> For easily finding these things in the git history or the WWW, it
> would be great if these log messages could be included (in the
> future).
> DON> Thanks for your suggestion. We'll add them next time.
> 
> Also, that means, that the regression is still present in Linux 5.10,
> released yesterday, and this commit does not apply to these versions.
> 
> DON> They have started 5.10-rc7 now, so possibly 5.11 or 5.12,
> depending on when all of the patches are applied. The patch in
> question is among 28 other patches.
> 
> Mahesh, do you have any idea, what commit caused the regression and
> why the issue started to show up?
> DON> The smartpqi driver sets two scsi_host_template member fields:
> .can_queue and .nr_hw_queues. But we have not yet converted to
> host_tagset. So the queue_depth becomes nr_hw_queues * can_queue,
> which is more than the hw can support. That can be verified by
> looking at scsi_host.h.
>         /*
>          * In scsi-mq mode, the number of hardware queues supported
> by the LLD.
>          *
>          * Note: it is assumed that each hardware queue has a queue
> depth of
>          * can_queue. In other words, the total queue depth per host
>          * is nr_hw_queues * can_queue. However, for when host_tagset
> is set,
>          * the total queue depth is can_queue.
>          */
> 
> So, until we make this change, the queue_depth change prevents the
> above issue from happening.

can_queue and nr_hw_queues have been set like this as long as the
driver existed. Why did Paul observe a regression with 5.9?

And why can't you simply set can_queue to 
(ctrl_info->scsi_ml_can_queue / nr_hw_queues)?
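
For illustration, the arithmetic at stake (standalone sketch; the
numbers are made-up stand-ins for ctrl_info->scsi_ml_can_queue and the
MSI-X queue count, not values from the driver):

    #include <stdio.h>

    int main(void)
    {
            int scsi_ml_can_queue = 1024;   /* assumed controller-wide limit */
            int nr_hw_queues = 8;           /* assumed number of HW queues */

            /* Without host_tagset, the midlayer multiplies the two
             * fields, so the effective depth can exceed the HW limit: */
            printf("effective depth: %d\n", nr_hw_queues * scsi_ml_can_queue);

            /* Dividing up front keeps the product within the limit: */
            int can_queue = scsi_ml_can_queue / nr_hw_queues;
            printf("bounded depth:   %d\n", nr_hw_queues * can_queue);
            return 0;
    }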

Regards,
Martin



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 16/25] smartpqi: convert snprintf to scnprintf
  2020-12-10 20:35 ` [PATCH V3 16/25] smartpqi: convert snprintf to scnprintf Don Brace
@ 2021-01-07 23:51   ` Martin Wilck
  0 siblings, 0 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-07 23:51 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> The entire Linux kernel has been slowly migrating from snprintf
> to scnprintf, so we are doing our part. This article explains
> the rationale for this change:
> >     https://lwn.net/Articles/69419/
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

AFAICS, none of the changed snprintf() invocations could possibly
overflow their target buffers, so this isn't necessary. Anyway, 

Reviewed-by: Martin Wilck <mwilck@suse.com>
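
For readers unfamiliar with the distinction, a standalone userspace
sketch (the kernel-only scnprintf() differs from snprintf() exactly in
its return value):

    #include <stdio.h>

    int main(void)
    {
            char buf[8];

            /* snprintf() returns the length that *would* have been
             * written (11 here), not the truncated length, so code
             * that advances a cursor by the return value can step
             * past the buffer. scnprintf() returns the number of
             * bytes actually stored, keeping such arithmetic in
             * bounds. */
            int n = snprintf(buf, sizeof(buf), "hello world");
            printf("returned %d, buffer holds \"%s\"\n", n, buf);
            return 0;
    }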



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 19/25] smartpqi: add phy id support for the physical drives
  2020-12-10 20:36 ` [PATCH V3 19/25] smartpqi: add phy id support for the physical drives Don Brace
@ 2021-01-08  0:03   ` Martin Wilck
  0 siblings, 0 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-08  0:03 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:36 -0600, Don Brace wrote:
> From: Murthy Bhat <Murthy.Bhat@microchip.com>
> 
> * Display topology using PHY numbers.
> * PHY(both local and remote) numbers corresponding to physical drives
>     are read from BMIC_IDENTIFY_PHYSICAL_DEVICE.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Reviewed-by: Martin Wilck <mwilck@suse.com>



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 20/25] smartpqi: update sas initiator_port_protocols and target_port_protocols
  2020-12-10 20:36 ` [PATCH V3 20/25] smartpqi: update sas initiator_port_protocols and target_port_protocols Don Brace
@ 2021-01-08  0:12   ` Martin Wilck
  0 siblings, 0 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-08  0:12 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:36 -0600, Don Brace wrote:
> From: Murthy Bhat <Murthy.Bhat@microchip.com>
> 
> * Export valid sas initiator_port_protocols and
>   target_port_protocols to sysfs.
>   * lsscsi now shows correct values.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Reviewed-by: Martin Wilck <mwilck@suse.com>



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 09/25] smartpqi: align code with oob driver
  2020-12-10 20:35 ` [PATCH V3 09/25] smartpqi: align code with oob driver Don Brace
@ 2021-01-08  0:13   ` Martin Wilck
  0 siblings, 0 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-08  0:13 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> * Non-functional changes.
> * Reduce differences between out-of-box driver and
>   kernel.org driver.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Reviewed-by: Martin Wilck <mwilck@suse.com>




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 11/25] smartpqi: add host level stream detection enable
  2020-12-10 20:35 ` [PATCH V3 11/25] smartpqi: add host level stream detection enable Don Brace
@ 2021-01-08  0:13   ` Martin Wilck
  2021-01-12 20:28     ` Don.Brace
  0 siblings, 1 reply; 91+ messages in thread
From: Martin Wilck @ 2021-01-08  0:13 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> * Allow R5/R6 stream detection to be disabled/enabled
>   using the sysfs entry enable_stream_detection.
> 
> Example usage:
> 
> lsscsi
> [2:2:0:0]    storage Adaptec  3258P-32i /e     0010
>  ^
>  |
>  +---- NOTE: here host is host2
> 
> find /sys -name \*enable_stream\*
> /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection
> /sys/devices/pci0000:5b/0000:5b:00.0/0000:5c:00.0/host3/scsi_host/host3/enable_stream_detection
> 
> Current stream detection:
> cat /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection
> 1
> 
> Turn off stream detection:
> echo 0 > /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection
> 
> Turn on stream detection:
> echo 1 > /sys/devices/pci0000:36/0000:36:00.0/0000:37:00.0/0000:38:00.0/0000:39:00.0/host2/scsi_host/host2/enable_stream_detection
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Nitpick below, but

Reviewed-by: Martin Wilck <mwilck@suse.com>

> ---
>  drivers/scsi/smartpqi/smartpqi_init.c |   32
> ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c
> b/drivers/scsi/smartpqi/smartpqi_init.c
> index 96383d047a88..9a449bbc1898 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -6724,6 +6724,34 @@ static ssize_t pqi_lockup_action_store(struct
> device *dev,
>         return -EINVAL;
>  }
>  
> +static ssize_t pqi_host_enable_stream_detection_show(struct device *dev,
> +       struct device_attribute *attr, char *buffer)
> +{
> +       struct Scsi_Host *shost = class_to_shost(dev);
> +       struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
> +
> +       return scnprintf(buffer, 10, "%hhx\n",
> +                       ctrl_info->enable_stream_detection);

Nitpick: As noted before, %hhx is discouraged.
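
A minimal alternative, since variadic arguments are promoted to int
anyway (a sketch, not necessarily the final driver code):

        return scnprintf(buffer, 10, "%x\n",
                        ctrl_info->enable_stream_detection);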

Regards,
Martin



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 13/25] smartpqi: disable write_same for nvme hba disks
  2020-12-10 20:35 ` [PATCH V3 13/25] smartpqi: disable write_same for nvme hba disks Don Brace
@ 2021-01-08  0:13   ` Martin Wilck
  0 siblings, 0 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-08  0:13 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> * Controllers do not support SCSI WRITE SAME
>   for NVMe drives in HBA mode.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Reviewed-by: Martin Wilck <mwilck@suse.com>



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 10/25] smartpqi: add stream detection
  2020-12-10 20:35 ` [PATCH V3 10/25] smartpqi: add stream detection Don Brace
@ 2021-01-08  0:14   ` Martin Wilck
  2021-01-15 21:58     ` Don.Brace
  0 siblings, 1 reply; 91+ messages in thread
From: Martin Wilck @ 2021-01-08  0:14 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> * Enhance performance by adding sequential stream detection
>   for R5/R6 sequential write requests.
>   * Reduce stripe lock contention with full-stripe write
>     operations.

I suppose that "stripe lock" is used by the firmware? Could you
elaborate a bit more how this technique improves performance?
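
(For background: on parity RAID, a sub-stripe write is a
read-modify-write -- on a hypothetical 3+1 RAID 5, updating a single
strip costs two reads (old data, old parity) plus two writes, whereas a
write covering the full stripe lets parity be computed from the new
data alone: four writes, no reads, and a single exclusive stripe lock
instead of one per partial update.)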

> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>  drivers/scsi/smartpqi/smartpqi.h      |    8 +++
>  drivers/scsi/smartpqi/smartpqi_init.c |   87
> +++++++++++++++++++++++++++++++--
>  2 files changed, 89 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi.h
> b/drivers/scsi/smartpqi/smartpqi.h
> index a5e271dd2742..343f06e44220 100644
> --- a/drivers/scsi/smartpqi/smartpqi.h
> +++ b/drivers/scsi/smartpqi/smartpqi.h
> @@ -1042,6 +1042,12 @@ struct pqi_scsi_dev_raid_map_data {
>  
>  #define RAID_CTLR_LUNID                "\0\0\0\0\0\0\0\0"
>  
> +#define NUM_STREAMS_PER_LUN    8
> +
> +struct pqi_stream_data {
> +       u64     next_lba;
> +       u32     last_accessed;
> +};
>  
>  struct pqi_scsi_dev {
>         int     devtype;                /* as reported by INQUIRY command */
> @@ -1097,6 +1103,7 @@ struct pqi_scsi_dev {
>         struct list_head add_list_entry;
>         struct list_head delete_list_entry;
>  
> +       struct pqi_stream_data stream_data[NUM_STREAMS_PER_LUN];
>         atomic_t scsi_cmds_outstanding;
>         atomic_t raid_bypass_cnt;
>  };
> @@ -1296,6 +1303,7 @@ struct pqi_ctrl_info {
>         u8              enable_r5_writes : 1;
>         u8              enable_r6_writes : 1;
>         u8              lv_drive_type_mix_valid : 1;
> +       u8              enable_stream_detection : 1;
>  
>         u8              ciss_report_log_flags;
>         u32             max_transfer_encrypted_sas_sata;
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c
> b/drivers/scsi/smartpqi/smartpqi_init.c
> index fc8fafab480d..96383d047a88 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -5721,8 +5721,82 @@ void pqi_prep_for_scsi_done(struct scsi_cmnd
> *scmd)
>         atomic_dec(&device->scsi_cmds_outstanding);
>  }
>  
> -static int pqi_scsi_queue_command(struct Scsi_Host *shost,
> +static bool pqi_is_parity_write_stream(struct pqi_ctrl_info *ctrl_info,
>         struct scsi_cmnd *scmd)
> +{
> +       u32 oldest_jiffies;
> +       u8 lru_index;
> +       int i;
> +       int rc;
> +       struct pqi_scsi_dev *device;
> +       struct pqi_stream_data *pqi_stream_data;
> +       struct pqi_scsi_dev_raid_map_data rmd;
> +
> +       if (!ctrl_info->enable_stream_detection)
> +               return false;
> +
> +       rc = pqi_get_aio_lba_and_block_count(scmd, &rmd);
> +       if (rc)
> +               return false;
> +
> +       /* Check writes only. */
> +       if (!rmd.is_write)
> +               return false;
> +
> +       device = scmd->device->hostdata;
> +
> +       /* Check for RAID 5/6 streams. */
> +       if (device->raid_level != SA_RAID_5 && device->raid_level != SA_RAID_6)
> +               return false;
> +
> +       /*
> +        * If controller does not support AIO RAID{5,6} writes, need to send
> +        * requests down non-AIO path.
> +        */
> +       if ((device->raid_level == SA_RAID_5 && !ctrl_info->enable_r5_writes) ||
> +               (device->raid_level == SA_RAID_6 && !ctrl_info->enable_r6_writes))
> +               return true;
> +
> +       lru_index = 0;
> +       oldest_jiffies = INT_MAX;
> +       for (i = 0; i < NUM_STREAMS_PER_LUN; i++) {
> +               pqi_stream_data = &device->stream_data[i];
> +               /*
> +                * Check for adjacent request or request is within
> +                * the previous request.
> +                */
> +               if ((pqi_stream_data->next_lba &&
> +                       rmd.first_block >= pqi_stream_data->next_lba) &&
> +                       rmd.first_block <= pqi_stream_data->next_lba +
> +                               rmd.block_cnt) {

Here you seem to assume that the previous write had the same block_cnt.
What's the justification for that?
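
(Concretely: with next_lba == 1000 and a new 8-block write arriving at
LBA 1004, the test accepts 1000 <= 1004 <= 1000 + 8 -- i.e. the
acceptance window is sized by the *new* request's block_cnt, not the
previous one's.)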

> +                       pqi_stream_data->next_lba = rmd.first_block +
> +                               rmd.block_cnt;
> +                       pqi_stream_data->last_accessed = jiffies;
> +                       return true;
> +               }
> +
> +               /* unused entry */
> +               if (pqi_stream_data->last_accessed == 0) {
> +                       lru_index = i;
> +                       break;
> +               }
> +
> +               /* Find entry with oldest last accessed time. */
> +               if (pqi_stream_data->last_accessed <= oldest_jiffies) {
> +                       oldest_jiffies = pqi_stream_data->last_accessed;
> +                       lru_index = i;
> +               }
> +       }
> +
> +       /* Set LRU entry. */
> +       pqi_stream_data = &device->stream_data[lru_index];
> +       pqi_stream_data->last_accessed = jiffies;
> +       pqi_stream_data->next_lba = rmd.first_block + rmd.block_cnt;
> +
> +       return false;
> +}
> +
> +static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
>  {
>         int rc;
>         struct pqi_ctrl_info *ctrl_info;
> @@ -5768,11 +5842,12 @@ static int pqi_scsi_queue_command(struct
> Scsi_Host *shost,
>                 raid_bypassed = false;
>                 if (device->raid_bypass_enabled &&
>                         !blk_rq_is_passthrough(scmd->request)) {
> -                       rc = pqi_raid_bypass_submit_scsi_cmd(ctrl_info, device,
> -                               scmd, queue_group);
> -                       if (rc == 0 || rc == SCSI_MLQUEUE_HOST_BUSY) {
> -                               raid_bypassed = true;
> -                               atomic_inc(&device->raid_bypass_cnt);
> +                       if (!pqi_is_parity_write_stream(ctrl_info, scmd)) {
> +                               rc = pqi_raid_bypass_submit_scsi_cmd(ctrl_info, device, scmd, queue_group);
> +                               if (rc == 0 || rc == SCSI_MLQUEUE_HOST_BUSY) {
> +                                       raid_bypassed = true;
> +                                       atomic_inc(&device->raid_bypass_cnt);
> +                               }
>                         }
>                 }
>                 if (!raid_bypassed)
> 




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 12/25] smartpqi: enable support for NVMe encryption
  2020-12-10 20:35 ` [PATCH V3 12/25] smartpqi: enable support for NVMe encryption Don Brace
@ 2021-01-08  0:14   ` Martin Wilck
  0 siblings, 0 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-08  0:14 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> * Support new FW feature bit that enables
>   NVMe encryption.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Reviewed-by: Martin Wilck <mwilck@suse.com>




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 17/25] smartpqi: change timing of release of QRM memory during OFA
  2020-12-10 20:35 ` [PATCH V3 17/25] smartpqi: change timing of release of QRM memory during OFA Don Brace
@ 2021-01-08  0:14   ` Martin Wilck
  2021-01-27 17:46     ` Don.Brace
  0 siblings, 1 reply; 91+ messages in thread
From: Martin Wilck @ 2021-01-08  0:14 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> * Release QRM memory (OFA buffer) on OFA error conditions.
> * Controller is left in a bad state which can cause a kernel panic
>     upon reboot after an unsuccessful OFA.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

I don't understand how the patch description relates to the actual
change. With the patch, the buffers are released just like before,
only some instructions later. So apparently, without this patch, the
OFA memory had been released prematurely?

Anyway,

Reviewed-by: Martin Wilck <mwilck@suse.com>




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 21/25] smartpqi: add additional logging for LUN resets
  2020-12-10 20:36 ` [PATCH V3 21/25] smartpqi: add additional logging for LUN resets Don Brace
@ 2021-01-08  0:27   ` Martin Wilck
  2021-01-25 17:09     ` Don.Brace
  0 siblings, 1 reply; 91+ messages in thread
From: Martin Wilck @ 2021-01-08  0:27 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:36 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> * Add additional logging to help in debugging issues
>   with LUN resets.
> 
> Reviewed-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

The patch description is not complete, as the patch also changes
some timings. Two remarks below.

Cheers,
Martin

> ---
>  drivers/scsi/smartpqi/smartpqi_init.c |  125
> +++++++++++++++++++++++----------
>  1 file changed, 89 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c
> b/drivers/scsi/smartpqi/smartpqi_init.c
> index 6b624413c8e6..1c51a59f1da6 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -84,7 +84,7 @@ static void pqi_ofa_setup_host_buffer(struct
> pqi_ctrl_info *ctrl_info);
>  static void pqi_ofa_free_host_buffer(struct pqi_ctrl_info
> *ctrl_info);
>  static int pqi_ofa_host_memory_update(struct pqi_ctrl_info
> *ctrl_info);
>  static int pqi_device_wait_for_pending_io(struct pqi_ctrl_info
> *ctrl_info,
> -       struct pqi_scsi_dev *device, unsigned long timeout_secs);
> +       struct pqi_scsi_dev *device, unsigned long timeout_msecs);
>  
>  /* for flags argument to pqi_submit_raid_request_synchronous() */
>  #define PQI_SYNC_FLAGS_INTERRUPTABLE   0x1
> @@ -335,11 +335,34 @@ static void pqi_wait_if_ctrl_blocked(struct
> pqi_ctrl_info *ctrl_info)
>         atomic_dec(&ctrl_info->num_blocked_threads);
>  }
>  
> +#define PQI_QUIESE_WARNING_TIMEOUT_SECS                10

Did you mean QUIESCE ?

> +
>  static inline void pqi_ctrl_wait_until_quiesced(struct pqi_ctrl_info *ctrl_info)
>  {
> +       unsigned long start_jiffies;
> +       unsigned long warning_timeout;
> +       bool displayed_warning;
> +
> +       displayed_warning = false;
> +       start_jiffies = jiffies;
> +       warning_timeout = (PQI_QUIESE_WARNING_TIMEOUT_SECS * PQI_HZ) + start_jiffies;
> +
>         while (atomic_read(&ctrl_info->num_busy_threads) >
> -               atomic_read(&ctrl_info->num_blocked_threads))
> +               atomic_read(&ctrl_info->num_blocked_threads)) {
> +               if (time_after(jiffies, warning_timeout)) {
> +                       dev_warn(&ctrl_info->pci_dev->dev,
> +                               "waiting %u seconds for driver
> activity to quiesce\n",
> +                               jiffies_to_msecs(jiffies -
> start_jiffies) / 1000);
> +                       displayed_warning = true;
> +                       warning_timeout = (PQI_QUIESE_WARNING_TIMEOUT_SECS * PQI_HZ) + jiffies;
> +               }
>                 usleep_range(1000, 2000);
> +       }
> +
> +       if (displayed_warning)
> +               dev_warn(&ctrl_info->pci_dev->dev,
> +                       "driver activity quiesced after waiting for
> %u seconds\n",
> +                       jiffies_to_msecs(jiffies - start_jiffies) /
> 1000);
>  }
>  
>  static inline bool pqi_device_offline(struct pqi_scsi_dev *device)
> @@ -1670,7 +1693,7 @@ static int pqi_add_device(struct pqi_ctrl_info
> *ctrl_info,
>         return rc;
>  }
>  
> -#define PQI_PENDING_IO_TIMEOUT_SECS    20
> +#define PQI_REMOVE_DEVICE_PENDING_IO_TIMEOUT_MSECS     (20 * 1000)
>  
>  static inline void pqi_remove_device(struct pqi_ctrl_info *ctrl_info, struct pqi_scsi_dev *device)
>  {
> @@ -1678,7 +1701,8 @@ static inline void pqi_remove_device(struct
> pqi_ctrl_info *ctrl_info, struct pqi
>  
>         pqi_device_remove_start(device);
>  
> -       rc = pqi_device_wait_for_pending_io(ctrl_info, device, PQI_PENDING_IO_TIMEOUT_SECS);
> +       rc = pqi_device_wait_for_pending_io(ctrl_info, device,
> +               PQI_REMOVE_DEVICE_PENDING_IO_TIMEOUT_MSECS);
>         if (rc)
>                 dev_err(&ctrl_info->pci_dev->dev,
>                         "scsi %d:%d:%d:%d removing device with %d
> outstanding command(s)\n",
> @@ -3070,7 +3094,7 @@ static void pqi_process_io_error(unsigned int
> iu_type,
>         }
>  }
>  
> -static int pqi_interpret_task_management_response(
> +static int pqi_interpret_task_management_response(struct pqi_ctrl_info *ctrl_info,
>         struct pqi_task_management_response *response)
>  {
>         int rc;
> @@ -3088,6 +3112,10 @@ static int
> pqi_interpret_task_management_response(
>                 break;
>         }
>  
> +       if (rc)
> +               dev_err(&ctrl_info->pci_dev->dev,
> +                       "Task Management Function error: %d (response
> code: %u)\n", rc, response->response_code);
> +
>         return rc;
>  }
>  
> @@ -3156,9 +3184,8 @@ static int pqi_process_io_intr(struct
> pqi_ctrl_info *ctrl_info, struct pqi_queue
>                                 &((struct pqi_vendor_general_response *)response)->status);
>                         break;
>                 case PQI_RESPONSE_IU_TASK_MANAGEMENT:
> -                       io_request->status =
> -                               pqi_interpret_task_management_response(
> -                                       (void *)response);
> +                       io_request->status = pqi_interpret_task_management_response(ctrl_info,
> +                               (void *)response);
>                         break;
>                 case PQI_RESPONSE_IU_AIO_PATH_DISABLED:
>                         pqi_aio_path_disabled(io_request);
> @@ -5862,24 +5889,37 @@ static void
> pqi_fail_io_queued_for_device(struct pqi_ctrl_info *ctrl_info,
>         }
>  }
>  
> +#define PQI_PENDING_IO_WARNING_TIMEOUT_SECS    10
> +
>  static int pqi_device_wait_for_pending_io(struct pqi_ctrl_info
> *ctrl_info,
> -       struct pqi_scsi_dev *device, unsigned long timeout_secs)
> +       struct pqi_scsi_dev *device, unsigned long timeout_msecs)
>  {
> -       unsigned long timeout;
> +       int cmds_outstanding;
> +       unsigned long start_jiffies;
> +       unsigned long warning_timeout;
> +       unsigned long msecs_waiting;
>  
> +       start_jiffies = jiffies;
> +       warning_timeout = (PQI_PENDING_IO_WARNING_TIMEOUT_SECS * PQI_HZ) + start_jiffies;
>  
> -       timeout = (timeout_secs * PQI_HZ) + jiffies;
> -
> -       while (atomic_read(&device->scsi_cmds_outstanding)) {
> +       while ((cmds_outstanding = atomic_read(&device->scsi_cmds_outstanding)) > 0) {
>                 pqi_check_ctrl_health(ctrl_info);
>                 if (pqi_ctrl_offline(ctrl_info))
>                         return -ENXIO;
> -               if (timeout_secs != NO_TIMEOUT) {
> -                       if (time_after(jiffies, timeout)) {
> -                               dev_err(&ctrl_info->pci_dev->dev,
> -                                       "timed out waiting for
> pending I/O\n");
> -                               return -ETIMEDOUT;
> -                       }
> +               msecs_waiting = jiffies_to_msecs(jiffies - start_jiffies);
> +               if (msecs_waiting > timeout_msecs) {
> +                       dev_err(&ctrl_info->pci_dev->dev,
> +                               "scsi %d:%d:%d:%d: timed out after
> %lu seconds waiting for %d outstanding command(s)\n",
> +                               ctrl_info->scsi_host->host_no,
> device->bus, device->target,
> +                               device->lun, msecs_waiting / 1000,
> cmds_outstanding);
> +                       return -ETIMEDOUT;
> +               }
> +               if (time_after(jiffies, warning_timeout)) {
> +                       dev_warn(&ctrl_info->pci_dev->dev,
> +                               "scsi %d:%d:%d:%d: waiting %lu
> seconds for %d outstanding command(s)\n",
> +                               ctrl_info->scsi_host->host_no,
> device->bus, device->target,
> +                               device->lun, msecs_waiting / 1000,
> cmds_outstanding);
> +                       warning_timeout =
> (PQI_PENDING_IO_WARNING_TIMEOUT_SECS * PQI_HZ) + jiffies;
>                 }
>                 usleep_range(1000, 2000);
>         }
> @@ -5895,13 +5935,15 @@ static void pqi_lun_reset_complete(struct
> pqi_io_request *io_request,
>         complete(waiting);
>  }
>  
> -#define PQI_LUN_RESET_TIMEOUT_SECS             30
>  #define PQI_LUN_RESET_POLL_COMPLETION_SECS     10
>  
>  static int pqi_wait_for_lun_reset_completion(struct pqi_ctrl_info
> *ctrl_info,
>         struct pqi_scsi_dev *device, struct completion *wait)
>  {
>         int rc;
> +       unsigned int wait_secs;
> +
> +       wait_secs = 0;
>  
>         while (1) {
>                 if (wait_for_completion_io_timeout(wait,
> @@ -5915,13 +5957,21 @@ static int
> pqi_wait_for_lun_reset_completion(struct pqi_ctrl_info *ctrl_info,
>                         rc = -ENXIO;
>                         break;
>                 }
> +
> +               wait_secs += PQI_LUN_RESET_POLL_COMPLETION_SECS;
> +
> +               dev_warn(&ctrl_info->pci_dev->dev,
> +                       "scsi %d:%d:%d:%d: waiting %u seconds for LUN
> reset to complete\n",
> +                       ctrl_info->scsi_host->host_no, device->bus,
> device->target, device->lun,
> +                       wait_secs);
>         }
>  
>         return rc;
>  }
>  
> -static int pqi_lun_reset(struct pqi_ctrl_info *ctrl_info,
> -       struct pqi_scsi_dev *device)
> +#define PQI_LUN_RESET_FIRMWARE_TIMEOUT_SECS    30
> +
> +static int pqi_lun_reset(struct pqi_ctrl_info *ctrl_info, struct pqi_scsi_dev *device)
>  {
>         int rc;
>         struct pqi_io_request *io_request;
> @@ -5943,8 +5993,7 @@ static int pqi_lun_reset(struct pqi_ctrl_info
> *ctrl_info,
>                 sizeof(request->lun_number));
>         request->task_management_function =
> SOP_TASK_MANAGEMENT_LUN_RESET;
>         if (ctrl_info->tmf_iu_timeout_supported)
> -               put_unaligned_le16(PQI_LUN_RESET_TIMEOUT_SECS,
> -                                       &request->timeout);
> +               put_unaligned_le16(PQI_LUN_RESET_FIRMWARE_TIMEOUT_SECS, &request->timeout);
>  
>         pqi_start_io(ctrl_info, &ctrl_info->queue_groups[PQI_DEFAULT_QUEUE_GROUP], RAID_PATH,
>                 io_request);
> @@ -5958,29 +6007,33 @@ static int pqi_lun_reset(struct pqi_ctrl_info
> *ctrl_info,
>         return rc;
>  }
>  
> -#define PQI_LUN_RESET_RETRIES                  3
> -#define PQI_LUN_RESET_RETRY_INTERVAL_MSECS     10000
> -#define PQI_LUN_RESET_PENDING_IO_TIMEOUT_SECS  120
> +#define PQI_LUN_RESET_RETRIES                          3
> +#define PQI_LUN_RESET_RETRY_INTERVAL_MSECS             (10 * 1000)
> +#define PQI_LUN_RESET_PENDING_IO_TIMEOUT_MSECS         (10 * 60 * 1000)

10 minutes? Isn't that a bit much?

> +#define PQI_LUN_RESET_FAILED_PENDING_IO_TIMEOUT_MSECS  (2 * 60 * 1000)

Why wait less long after a failure?



>  
> -static int pqi_lun_reset_with_retries(struct pqi_ctrl_info *ctrl_info,
> -       struct pqi_scsi_dev *device)
> +static int pqi_lun_reset_with_retries(struct pqi_ctrl_info *ctrl_info, struct pqi_scsi_dev *device)
>  {
> -       int rc;
> +       int reset_rc;
> +       int wait_rc;
>         unsigned int retries;
> -       unsigned long timeout_secs;
> +       unsigned long timeout_msecs;
>  
>         for (retries = 0;;) {
> -               rc = pqi_lun_reset(ctrl_info, device);
> -               if (rc == 0 || ++retries > PQI_LUN_RESET_RETRIES)
> +               reset_rc = pqi_lun_reset(ctrl_info, device);
> +               if (reset_rc == 0 || ++retries > PQI_LUN_RESET_RETRIES)
>                         break;
>                 msleep(PQI_LUN_RESET_RETRY_INTERVAL_MSECS);
>         }
>  
> -       timeout_secs = rc ? PQI_LUN_RESET_PENDING_IO_TIMEOUT_SECS : NO_TIMEOUT;
> +       timeout_msecs = reset_rc ? PQI_LUN_RESET_FAILED_PENDING_IO_TIMEOUT_MSECS :
> +               PQI_LUN_RESET_PENDING_IO_TIMEOUT_MSECS;
>  
> -       rc |= pqi_device_wait_for_pending_io(ctrl_info, device, timeout_secs);
> +       wait_rc = pqi_device_wait_for_pending_io(ctrl_info, device, timeout_msecs);
> +       if (wait_rc && reset_rc == 0)
> +               reset_rc = wait_rc;
>  
> -       return rc == 0 ? SUCCESS : FAILED;
> +       return reset_rc == 0 ? SUCCESS : FAILED;
>  }
>  
>  static int pqi_device_reset(struct pqi_ctrl_info *ctrl_info,
> 



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 22/25] smartpqi: update enclosure identifier in sysf
  2020-12-10 20:36 ` [PATCH V3 22/25] smartpqi: update enclosure identifier in sysf Don Brace
@ 2021-01-08  0:30   ` Martin Wilck
  2021-01-25 17:13     ` Don.Brace
  0 siblings, 1 reply; 91+ messages in thread
From: Martin Wilck @ 2021-01-08  0:30 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:36 -0600, Don Brace wrote:
> From: Murthy Bhat <Murthy.Bhat@microchip.com>
> 
> * Update enclosure identifier field corresponding to
>   physical devices in lsscsi/sysfs.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>  drivers/scsi/smartpqi/smartpqi_init.c |    1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 1c51a59f1da6..40ae82470d8c 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -1841,7 +1841,6 @@ static void pqi_dev_info(struct pqi_ctrl_info *ctrl_info,
>  static void pqi_scsi_update_device(struct pqi_scsi_dev *existing_device,
>         struct pqi_scsi_dev *new_device)
>  {
> -       existing_device->devtype = new_device->devtype;
>         existing_device->device_type = new_device->device_type;
>         existing_device->bus = new_device->bus;
>         if (new_device->target_lun_valid) {
> 

I don't get this. Why was it wrong to update the devtype field?




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 23/25] smartpqi: correct system hangs when resuming from hibernation
  2020-12-10 20:36 ` [PATCH V3 23/25] smartpqi: correct system hangs when resuming from hibernation Don Brace
@ 2021-01-08  0:34   ` Martin Wilck
  2021-01-27 17:39     ` Don.Brace
  0 siblings, 1 reply; 91+ messages in thread
From: Martin Wilck @ 2021-01-08  0:34 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:36 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> * Correct system hangs when resuming from hibernation after
>   first successful hibernation/resume cycle.
>   * Rare condition involving OFA.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>
> ---
>  drivers/scsi/smartpqi/smartpqi_init.c |    5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 40ae82470d8c..5ca265babaa2 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -8688,6 +8688,11 @@ static __maybe_unused int pqi_resume(struct pci_dev *pci_dev)
>         pci_set_power_state(pci_dev, PCI_D0);
>         pci_restore_state(pci_dev);
>  
> +       pqi_ctrl_unblock_device_reset(ctrl_info);
> +       pqi_ctrl_unblock_requests(ctrl_info);
> +       pqi_scsi_unblock_requests(ctrl_info);
> +       pqi_ctrl_unblock_scan(ctrl_info);
> +
>         return pqi_ctrl_init_resume(ctrl_info);
>  }

Like I said in my comments on 14/25:

pqi_ctrl_unblock_scan() and pqi_ctrl_unblock_device_reset() expand
to mutex_unlock(). Unlocking an already-unlocked mutex is wrong, and
a mutex has to be unlocked by the task that owns the lock. How
can you be sure that these conditions are met here?
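Just to illustrate what I mean: a completion-based scheme (only a
sketch; the "scan_unblocked" member and the helper names are mine,
not from the driver) would have neither problem, since complete_all()
is idempotent and has no ownership requirement:

	/* assumed new member, set up with init_completion() and
	 * initially completed at probe time */
	struct completion scan_unblocked;

	static inline void pqi_ctrl_block_scan(struct pqi_ctrl_info *ctrl_info)
	{
		reinit_completion(&ctrl_info->scan_unblocked);
	}

	static inline void pqi_ctrl_unblock_scan(struct pqi_ctrl_info *ctrl_info)
	{
		/* safe from any task, and safe to call twice */
		complete_all(&ctrl_info->scan_unblocked);
	}

	static inline void pqi_wait_until_scan_unblocked(struct pqi_ctrl_info *ctrl_info)
	{
		wait_for_completion(&ctrl_info->scan_unblocked);
	}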

Regards
Martin





^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 24/25] smartpqi: add new pci ids
  2020-12-10 20:36 ` [PATCH V3 24/25] smartpqi: add new pci ids Don Brace
@ 2021-01-08  0:35   ` Martin Wilck
  0 siblings, 0 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-08  0:35 UTC (permalink / raw)
  To: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Thu, 2020-12-10 at 14:36 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
> 
> * Add support for newer HW.
> 
> Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
> Reviewed-by: Scott Teel <scott.teel@microchip.com>
> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
> Signed-off-by: Don Brace <don.brace@microchip.com>

Looks good (I haven't verified the IDs).

Acked-by: Martin Wilck <mwilck@suse.com>



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 14/25] smartpqi: fix driver synchronization issues
  2021-01-07 23:32   ` Martin Wilck
@ 2021-01-08  4:13     ` Martin K. Petersen
  2021-01-15 21:13     ` Don.Brace
  2021-01-27 23:01     ` Don.Brace
  2 siblings, 0 replies; 91+ messages in thread
From: Martin K. Petersen @ 2021-01-08  4:13 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Don Brace, Kevin.Barnett, scott.teel, Justin.Lindley,
	scott.benesh, gerry.morong, mahesh.rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD, linux-scsi


Martin,

>> * synchronize: LUN resets, shutdowns, suspend, hibernate,
>>   OFA, and controller offline events.
>> * prevent I/O during the above conditions.
>
> This description is too terse for a complex patch like this.

That's a recurring problem with pretty much every patch in this
series. Big changes warrant detailed commit descriptions. Bullet lists
are woefully inadequate.

Microchip: Please read the "Describe your changes" in
Documentation/process/submitting-patches.rst. I also suggest you inspect
the commit history for other drivers in the tree to get an idea how
commit messages should be written.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 04/25] smartpqi: add support for raid5 and raid6 writes
  2021-01-07 16:44   ` Martin Wilck
@ 2021-01-08 22:56     ` Don.Brace
  2021-01-13 10:26       ` Martin Wilck
  0 siblings, 1 reply; 91+ messages in thread
From: Don.Brace @ 2021-01-08 22:56 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

Subject: Re: [PATCH V3 04/25] smartpqi: add support for raid5 and raid6 writes

>  struct pqi_raid_path_request {
>         struct pqi_iu_header header;
> @@ -312,6 +313,39 @@ struct pqi_aio_path_request {
>                 sg_descriptors[PQI_MAX_EMBEDDED_SG_DESCRIPTORS];
>  };
>
> +#define PQI_RAID56_XFER_LIMIT_4K       0x1000 /* 4Kib */
> +#define PQI_RAID56_XFER_LIMIT_8K       0x2000 /* 8Kib */

You don't seem to use these, and you'll remove them again in patch 06/25.

Don: Removed these definitions.

> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
> index 6bcb037ae9d7..c813cec10003 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -2245,13 +2250,14 @@ static bool pqi_aio_raid_level_supported(struct pqi_scsi_dev_raid_map_data *rmd)
>         case SA_RAID_0:
>                 break;
>         case SA_RAID_1:
> -               if (rmd->is_write)
> -                       is_supported = false;
> +               is_supported = false;

You disable RAID1 READs with this patch. I can see you fix it again in 05/25, still it looks wrong.

Don: Corrected

>                 break;
>         case SA_RAID_5:
> -               fallthrough;
> +               if (rmd->is_write && !ctrl_info->enable_r5_writes)
> +                       is_supported = false;
> +               break;
>         case SA_RAID_6:
> -               if (rmd->is_write)
> +               if (rmd->is_write && !ctrl_info->enable_r6_writes)
>                         is_supported = false;
>                 break;
>         case SA_RAID_ADM:
> @@ -2526,6 +2532,26 @@ static int pqi_calc_aio_r5_or_r6(struct pqi_scsi_dev_raid_map_data *rmd,
>                 rmd->total_disks_per_row)) +
>                 (rmd->map_row * rmd->total_disks_per_row) + rmd->first_column;
>
> +       if (rmd->is_write) {
> +               rmd->p_index = (rmd->map_row * rmd->total_disks_per_row) + rmd->data_disks_per_row;
> +               rmd->p_parity_it_nexus = raid_map->disk_data[rmd->p_index].aio_handle;

I suppose you have made sure rmd->p_index can't be larger than the size of raid_map->disk_data. A comment explaining that would be helpful for the reader though.

Don: Added a comment for p_index.

> +               if (rmd->raid_level == SA_RAID_6) {
> +                       rmd->q_index = (rmd->map_row * rmd->total_disks_per_row) +
> +                               (rmd->data_disks_per_row + 1);
> +                       rmd->q_parity_it_nexus = raid_map->disk_data[rmd->q_index].aio_handle;
> +                       rmd->xor_mult = raid_map->disk_data[rmd->map_index].xor_mult[1];

See above.

Don: Comment updated to include q_index.

> +               }
> +               if (rmd->blocks_per_row == 0)
> +                       return PQI_RAID_BYPASS_INELIGIBLE;
> +#if BITS_PER_LONG == 32
> +               tmpdiv = rmd->first_block;
> +               do_div(tmpdiv, rmd->blocks_per_row);
> +               rmd->row = tmpdiv;
> +#else
> +               rmd->row = rmd->first_block / rmd->blocks_per_row; 
> +#endif

Why not always use do_div()?

Don: I had removed the BITS_PER_LONG check in an attempt to clean up the code, but forgot we still need to support 32bit, so I just re-added the BITS_PER_LONG HUNKS. These HUNKS were there before I refactored the code, so they predate me. Any chance I can leave this in? It's been through a lot of regression testing already...

> @@ -4844,6 +4889,12 @@ static void pqi_calculate_queue_resources(struct pqi_ctrl_info *ctrl_info)
>                 PQI_OPERATIONAL_IQ_ELEMENT_LENGTH) /
>                 sizeof(struct pqi_sg_descriptor)) +
>                 PQI_MAX_EMBEDDED_SG_DESCRIPTORS;
> +
> +       ctrl_info->max_sg_per_r56_iu =
> +               ((ctrl_info->max_inbound_iu_length -
> +               PQI_OPERATIONAL_IQ_ELEMENT_LENGTH) /
> +               sizeof(struct pqi_sg_descriptor)) +
> +               PQI_MAX_EMBEDDED_R56_SG_DESCRIPTORS;
>  }
>
>  static inline void pqi_set_sg_descriptor(
> @@ -4931,6 +4982,44 @@ static int pqi_build_raid_sg_list(struct pqi_ctrl_info *ctrl_info,
>         return 0;
>  }
>
> +static int pqi_build_aio_r56_sg_list(struct pqi_ctrl_info *ctrl_info,
> +       struct pqi_aio_r56_path_request *request, struct scsi_cmnd *scmd,
> +       struct pqi_io_request *io_request)
> +{
> +       u16 iu_length;
> +       int sg_count;
> +       bool chained;
> +       unsigned int num_sg_in_iu;
> +       struct scatterlist *sg;
> +       struct pqi_sg_descriptor *sg_descriptor;
> +
> +       sg_count = scsi_dma_map(scmd);
> +       if (sg_count < 0)
> +               return sg_count;
> +
> +       iu_length = offsetof(struct pqi_aio_r56_path_request, sg_descriptors) -
> +               PQI_REQUEST_HEADER_LENGTH;
> +       num_sg_in_iu = 0;
> +
> +       if (sg_count == 0)
> +               goto out;

An if {} block would be better readable here.
Don> done.

>  }
>
> +static int pqi_aio_submit_r56_write_io(struct pqi_ctrl_info *ctrl_info,
> +       struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
> +       struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
> +       struct pqi_scsi_dev_raid_map_data *rmd)
> +{
> +
> +       switch (scmd->sc_data_direction) {
> +       case DMA_TO_DEVICE:
> +               r56_request->data_direction = SOP_READ_FLAG;
> +               break;

I wonder how it would be possible that sc_data_direction is anything else but DMA_TO_DEVICE here. AFAICS we will only reach this code for WRITE commands. Add a comment, please.

Don: Great observation, removed switch block and added a comment. Set direction to write.
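The V4 code (wording may still change before posting) looks roughly like:

	/*
	 * The direction is always write; the controller reads the
	 * data buffers from host memory for a write request.
	 */
	r56_request->data_direction = SOP_READ_FLAG;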

> +static ssize_t pqi_host_enable_r5_writes_show(struct device *dev,
> +       struct device_attribute *attr, char *buffer)
> +{
> +       struct Scsi_Host *shost = class_to_shost(dev);
> +       struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
> +
> +       return scnprintf(buffer, 10, "%hhx\n", ctrl_info->enable_r5_writes);

"%hhx" is deprecated, see
https://lore.kernel.org/lkml/20190914015858.7c76e036@lwn.net/T/

Don: done
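i.e. roughly this change (V4 not posted yet):

-       return scnprintf(buffer, 10, "%hhx\n", ctrl_info->enable_r5_writes);
+       return scnprintf(buffer, 10, "%x\n", ctrl_info->enable_r5_writes);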

> +static ssize_t pqi_host_enable_r6_writes_show(struct device *dev,
> +       struct device_attribute *attr, char *buffer)
> +{
> +       struct Scsi_Host *shost = class_to_shost(dev);
> +       struct pqi_ctrl_info *ctrl_info = shost_to_hba(shost);
> +
> +       return scnprintf(buffer, 10, "%hhx\n", ctrl_info->enable_r6_writes);

See above

Don: done

Don: Thanks for all of your great effort on this patch. I'll upload a V4 with updates to this patch and the rest of your reviews soon.

Thanks,
Don Brace 



^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 05/25] smartpqi: add support for raid1 writes
  2021-01-07 16:44   ` Martin Wilck
@ 2021-01-09 16:56     ` Don.Brace
  0 siblings, 0 replies; 91+ messages in thread
From: Don.Brace @ 2021-01-09 16:56 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

Subject: Re: [PATCH V3 05/25] smartpqi: add support for raid1 writes
> @@ static bool pqi_aio_raid_level_supported(struct pqi_ctrl_info 
> *ctrl_info,
>         case SA_RAID_0:
>                 break;
>         case SA_RAID_1:
> -               is_supported = false;
> +               fallthrough;

Nit: fallthrough isn't necessary here.
Don: removed fallthrough
>
> +static  int pqi_aio_submit_r1_write_io(struct pqi_ctrl_info *ctrl_info,
> +       struct scsi_cmnd *scmd, struct pqi_queue_group *queue_group,
> +       struct pqi_encryption_info *encryption_info, struct pqi_scsi_dev *device,
> +       struct pqi_scsi_dev_raid_map_data *rmd)
> +
> +       switch (scmd->sc_data_direction) {
> +       case DMA_TO_DEVICE:
> +               r1_request->data_direction = SOP_READ_FLAG;
> +               break;

Same question as for previous patch, how would anything else than DMA_TO_DEVICE be possible here?

Don: changed direction to write, added comment.

Thank you, Martin, for your review. I'll upload a V4 after I complete the other reviews.

Don Brace 



^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 06/25] smartpqi: add support for BMIC sense feature cmd and feature bits
  2021-01-07 16:44   ` Martin Wilck
@ 2021-01-11 17:22     ` Don.Brace
  2021-01-22 16:45     ` Don.Brace
  1 sibling, 0 replies; 91+ messages in thread
From: Don.Brace @ 2021-01-11 17:22 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

Subject: Re: [PATCH V3 06/25] smartpqi: add support for BMIC sense feature cmd and feature bits


In general: This patch contains a lot of whitespace, indentation, and minor comment formatting changes which should rather go into a separate patch IMHO. This one is big enough without them.

Don: Moved formatting changes into patch smartpqi-align-code-with-oob-driver

Further remarks below.


> [...]
>
> @@ -2552,7 +2686,7 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>         u32 next_bypass_group;
>         struct pqi_encryption_info *encryption_info_ptr;
>         struct pqi_encryption_info encryption_info;
> -       struct pqi_scsi_dev_raid_map_data rmd = {0};
> +       struct pqi_scsi_dev_raid_map_data rmd = { 0 };
>
>
>         if (get_unaligned_le16(&raid_map->flags) &
> -               RAID_MAP_ENCRYPTION_ENABLED) {
> +                       RAID_MAP_ENCRYPTION_ENABLED) {
> +               if (rmd.data_length > device->max_transfer_encrypted)
> +                       return PQI_RAID_BYPASS_INELIGIBLE;
>                 pqi_set_encryption_info(&encryption_info, raid_map,
>                         rmd.first_block);
>                 encryption_info_ptr = &encryption_info;
> @@ -2623,10 +2759,6 @@ static int pqi_raid_bypass_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>

This hunk is fine, but AFAICS it doesn't belong here logically; it should rather be part of patches 04 and 05.

Don: The patch adds max_transfer_encrypted field as part of new feature support. We would like to leave the update in this patch.


> @@ static int pqi_ctrl_init_resume(struct pqi_ctrl_info *ctrl_info)
>
>         pqi_start_heartbeat_timer(ctrl_info);
>
> +       if (ctrl_info->enable_r5_writes || ctrl_info->enable_r6_writes) {
> +               rc = pqi_get_advanced_raid_bypass_config(ctrl_info);
> +               if (rc) {
> +                       dev_err(&ctrl_info->pci_dev->dev,
> +                               "error obtaining advanced RAID bypass
> configuration\n");
> +                       return rc;

Do you need to error out here ? Can't you simply unset the enable_rX_writes feature?

Don: This function should never fail, so a failure indicates a serious problem. But we're considering some changes in that area that we may push up at a later date.

Regards
Martin




^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 07/25] smartpqi: update AIO Sub Page 0x02 support
  2021-01-07 16:44   ` Martin Wilck
@ 2021-01-11 20:53     ` Don.Brace
  0 siblings, 0 replies; 91+ messages in thread
From: Don.Brace @ 2021-01-11 20:53 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 
Subject: Re: [PATCH V3 07/25] smartpqi: update AIO Sub Page 0x02 support

On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
>
> The specification for AIO Sub-Page (0x02) has changed slightly.
> * bring the driver into conformance with the spec.
>
>
> +static inline u32 pqi_aio_limit_to_bytes(__le16 *limit)
> +{
> +       u32 bytes;
> +
> +       bytes = get_unaligned_le16(limit);
> +       if (bytes == 0)
> +               bytes = ~0;
> +       else
> +               bytes *= 1024;
> +
> +       return bytes;
> +}

Nice, but this function and it's callers belong into patch 06/25.

Don: 
      * Squashed smartpqi-update-AIO-Sub-Page-0x02-support
      * Moved formatting HUNK for pqi_scsi_dev_raid_map_data into
        smartpqi-refactor-aio-submission-code
      * Moved structure pqi_aio_r56_path_request formatting HUNKS into
        smartpqi-add-support-for-raid5-and-raid6-writes
      * Moved remaining formatting HUNKs into
        smartpqi-align-code-with-oob-driver

Thanks for all of your attention to detail,
Don

> +
>  #pragma pack(1)
>
>
>  static int pqi_get_advanced_raid_bypass_config(struct pqi_ctrl_info *ctrl_info)
> @@ -753,33 +766,28 @@ static int pqi_get_advanced_raid_bypass_config(struct pqi_ctrl_info *ctrl_info)
>                         BMIC_SENSE_FEATURE_IO_PAGE_AIO_SUBPAGE ||
>                 get_unaligned_le16(&buffer->aio_subpage.header.page_length) <
>                         MINIMUM_AIO_SUBPAGE_LENGTH) {
> -               rc = -EINVAL;

This should be changed in 06/25.

>                 goto error;
>         }
>

Regards
Martin




^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 08/25] smartpqi: add support for long firmware version
  2021-01-07 16:45   ` Martin Wilck
@ 2021-01-11 22:25     ` Don.Brace
  2021-01-22 20:01     ` Don.Brace
  1 sibling, 0 replies; 91+ messages in thread
From: Don.Brace @ 2021-01-11 22:25 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 
Subject: Re: [PATCH V3 08/25] smartpqi: add support for long firmware version

> @@ -1405,7 +1405,7 @@ enum pqi_ctrl_mode {
>  struct bmic_identify_controller {
>         u8      configured_logical_drive_count;
>         __le32  configuration_signature;
> -       u8      firmware_version[4];
> +       u8      firmware_version_short[4];
>         u8      reserved[145];
>         __le16  extended_logical_unit_count;
>         u8      reserved1[34];
> @@ -1413,11 +1413,17 @@ struct bmic_identify_controller {
>         u8      reserved2[8];
>         u8      vendor_id[8];
>         u8      product_id[16];
> -       u8      reserved3[68];
> +       u8      reserved3[62];
> +       __le32  extra_controller_flags;
> +       u8      reserved4[2];
>         u8      controller_mode;
> -       u8      reserved4[32];
> +       u8      spare_part_number[32];
> +       u8      firmware_version_long[32];
>  };
>
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> pqi_get_ctrl_product_details(struct pqi_ctrl_info *ctrl_info)
>         if (rc)
>                 goto out;
>
> +       if (get_unaligned_le32(&identify->extra_controller_flags) &
> +               BMIC_IDENTIFY_EXTRA_FLAGS_LONG_FW_VERSION_SUPPORTED) {
> +               memcpy(ctrl_info->firmware_version,
> +                       identify->firmware_version_long,
> +                       sizeof(identify->firmware_version_long));
> +       } else {
> +               memcpy(ctrl_info->firmware_version,
> +                       identify->firmware_version_short,
> +                       sizeof(identify->firmware_version_short));
> +               ctrl_info->firmware_version
> +                       [sizeof(identify->firmware_version_short)] = '\0';
> +               snprintf(ctrl_info->firmware_version +
> +                       strlen(ctrl_info->firmware_version),
> +                       sizeof(ctrl_info->firmware_version),

This looks wrong. I suppose a real overflow can't happen, but shouldn't it rather be written like this?

snprintf(ctrl_info->firmware_version +
		sizeof(identify->firmware_version_short),
	sizeof(ctrl_info->firmware_version) -
		sizeof(identify->firmware_version_short),
	"-%u", ...)

> +                       "-%u",
> +                       get_unaligned_le16(&identify->firmware_build_number));


Don: Agreed. Updated. 
Thanks for your review.

^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 11/25] smartpqi: add host level stream detection enable
  2021-01-08  0:13   ` Martin Wilck
@ 2021-01-12 20:28     ` Don.Brace
  0 siblings, 0 replies; 91+ messages in thread
From: Don.Brace @ 2021-01-12 20:28 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi


From: Martin Wilck [mailto:mwilck@suse.com] 
Subject: Re: [PATCH V3 11/25] smartpqi: add host level stream detection enable


Nitpick: As noted before, %hhx is discouraged.

Regards,
Martin

Don: Changed to %x
Thanks for your hard work.
Don


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 04/25] smartpqi: add support for raid5 and raid6 writes
  2021-01-08 22:56     ` Don.Brace
@ 2021-01-13 10:26       ` Martin Wilck
  0 siblings, 0 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-13 10:26 UTC (permalink / raw)
  To: Don.Brace, Kevin.Barnett, Scott.Teel, Justin.Lindley,
	Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Fri, 2021-01-08 at 22:56 +0000, Don.Brace@microchip.com wrote:
> 
> > +               }
> > +               if (rmd->blocks_per_row == 0)
> > +                       return PQI_RAID_BYPASS_INELIGIBLE; #if 
> > +BITS_PER_LONG == 32
> > +               tmpdiv = rmd->first_block;
> > +               do_div(tmpdiv, rmd->blocks_per_row);
> > +               rmd->row = tmpdiv;
> > +#else
> > +               rmd->row = rmd->first_block / rmd->blocks_per_row; 
> > +#endif
> 
> Why not always use do_div()?
> 
> Don: I had removed the BITS_PER_LONG check, was an attempt to clean
> up the code, but forgot we still need to support 32bit and I just re-
> added BITS_PER_LONG HUNKS. These HUNKS were there before I refactored
> the code so it predates me. Any chance I can leave this in? It's been
> through a lot of regression testing already...

My suggestion was to rather do the opposite, use the 32bit code (with
do_div()) for both 32bit and 64bit. AFAIK, this would work just fine 
(but not vice-versa). 
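I.e. something like this (untested sketch), for both architectures:

	u64 tmpdiv;

	if (rmd->blocks_per_row == 0)
		return PQI_RAID_BYPASS_INELIGIBLE;
	tmpdiv = rmd->first_block;
	/* do_div() leaves the quotient in tmpdiv */
	do_div(tmpdiv, rmd->blocks_per_row);
	rmd->row = tmpdiv;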

You can leave this in. It was just a suggestion how to improve
readability. Perhaps consider cleaning it up sometime later.

Regards,
Martin



^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 14/25] smartpqi: fix driver synchronization issues
  2021-01-07 23:32   ` Martin Wilck
  2021-01-08  4:13     ` Martin K. Petersen
@ 2021-01-15 21:13     ` Don.Brace
  2021-01-27 23:01     ` Don.Brace
  2 siblings, 0 replies; 91+ messages in thread
From: Don.Brace @ 2021-01-15 21:13 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 
Subject: Re: [PATCH V3 14/25] smartpqi: fix driver synchronization issues

On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
>
> * synchronize: LUN resets, shutdowns, suspend, hibernate,
>   OFA, and controller offline events.
> * prevent I/O during the above conditions.

This description is too terse for a complex patch like this.

Could you please explain how this synchronization is supposed to work
on the different layers, and for the different code paths for different
types of IO events that are apparently not all handled equally wrt
blocking, and how the different flags and mutexes are supposed to
interact? I'd also appreciate some explanation what sort of "driver
synchronization issues" you have seen, and how exactly this patch is
supposed to fix them.

Please forgive me if I ask dumb questions or make dumb comments below;
I don't get the full picture of what you're trying to achieve.

The patch does not only address synchronization issues; it also changes
various other things that (given the size of the patch) should better
be handled elsewhere. I believe this patch could easily be split into
4 or more separate independent patches, which would ease review
considerably. I've added remarks below where I thought one or more
hunks could be separated out.

Thanks,
Martin

Don: I refactored this patch into 10 patches:
+ smartpqi-remove-timeouts-from-internal-cmds
+ smartpqi-add-support-for-wwid
+ smartpqi-update-event-handler
+ smartpqi-update-soft-reset
+ smartpqi-update-device-resets
+ smartpqi-update-suspend-and-resume
+ smartpqi-update-raid-bypass-handling
+ smartpqi-update-ofa-management
+ smartpqi-update-device-scan-operations
+ smartpqi-fix-driver-synchronization-issues

I will post them after some internal review in V4. I may make some changes to the patch titles...
I'll answer more questions that you have asked in another e-mail.


> @@ -245,14 +246,66 @@ static inline void pqi_save_ctrl_mode(struct pqi_ctrl_info *ctrl_info,
>         sis_write_driver_scratch(ctrl_info, mode);
>  }
>
> +static inline void pqi_ctrl_block_scan(struct pqi_ctrl_info *ctrl_info)
> +{
> +       ctrl_info->scan_blocked = true;
> +       mutex_lock(&ctrl_info->scan_mutex);
> +}

What do you need scan_blocked for? Can't you simply use
mutex_is_locked(&ctrl_info->scan_mutex)?
OTOH, using a mutex for this kind of condition feels dangerous
to me, see remark about ofa_mutex() below.
Have you considered using a completion for this?

> +
> +static inline void pqi_ctrl_unblock_scan(struct pqi_ctrl_info *ctrl_info)
> +{
> +       ctrl_info->scan_blocked = false;
> +       mutex_unlock(&ctrl_info->scan_mutex);
> +}
> +
> +static inline bool pqi_ctrl_scan_blocked(struct pqi_ctrl_info *ctrl_info)
> +{
> +       return ctrl_info->scan_blocked;
> +}
> +
>  static inline void pqi_ctrl_block_device_reset(struct pqi_ctrl_info *ctrl_info)
>  {
> -       ctrl_info->block_device_reset = true;
> +       mutex_lock(&ctrl_info->lun_reset_mutex);
> +}
> +
> +static inline void pqi_ctrl_unblock_device_reset(struct pqi_ctrl_info *ctrl_info)
> +{
> +       mutex_unlock(&ctrl_info->lun_reset_mutex);
> +}
> +
> +static inline void pqi_scsi_block_requests(struct pqi_ctrl_info *ctrl_info)
> +{
> +       struct Scsi_Host *shost;
> +       unsigned int num_loops;
> +       int msecs_sleep;
> +
> +       shost = ctrl_info->scsi_host;
> +
> +       scsi_block_requests(shost);
> +
> +       num_loops = 0;
> +       msecs_sleep = 20;
> +       while (scsi_host_busy(shost)) {
> +               num_loops++;
> +               if (num_loops == 10)
> +                       msecs_sleep = 500;
> +               msleep(msecs_sleep);
> +       }
> +}

Waiting for !scsi_host_busy() here looks like a layering violation to
me. Can't you use wait_event{_timeout}() here and wait for the sum of
device->scsi_cmds_outstanding over all devices to become zero (waking
up the queue in pqi_prep_for_scsi_done())? You could use the
total_scmnds_outstanding count that you introduce in patch 15/25.
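Roughly like this (sketch; "quiesce_wait" would be a new waitqueue,
woken wherever the outstanding count drops to zero):

	scsi_block_requests(shost);
	wait_event(ctrl_info->quiesce_wait,
		   atomic_read(&ctrl_info->total_scmnds_outstanding) == 0);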

Also, how does this interact/interfere with scsi EH?

> +
> +static inline void pqi_scsi_unblock_requests(struct pqi_ctrl_info *ctrl_info)
> +{
> +       scsi_unblock_requests(ctrl_info->scsi_host);
> +}
> +
> +static inline void pqi_ctrl_busy(struct pqi_ctrl_info *ctrl_info)
> +{
> +       atomic_inc(&ctrl_info->num_busy_threads);
>  }
>
> -static inline bool pqi_device_reset_blocked(struct pqi_ctrl_info *ctrl_info)
> +static inline void pqi_ctrl_unbusy(struct pqi_ctrl_info *ctrl_info)
>  {
> -       return ctrl_info->block_device_reset;
> +       atomic_dec(&ctrl_info->num_busy_threads);
>  }
>
>  static inline bool pqi_ctrl_blocked(struct pqi_ctrl_info *ctrl_info)
> @@ -263,44 +316,23 @@ static inline bool pqi_ctrl_blocked(struct pqi_ctrl_info *ctrl_info)
>  static inline void pqi_ctrl_block_requests(struct pqi_ctrl_info *ctrl_info)
>  {
>         ctrl_info->block_requests = true;
> -       scsi_block_requests(ctrl_info->scsi_host);
>  }
>
>  static inline void pqi_ctrl_unblock_requests(struct pqi_ctrl_info *ctrl_info)
>  {
>         ctrl_info->block_requests = false;
>         wake_up_all(&ctrl_info->block_requests_wait);
> -       pqi_retry_raid_bypass_requests(ctrl_info);
> -       scsi_unblock_requests(ctrl_info->scsi_host);
>  }
>
> -static unsigned long pqi_wait_if_ctrl_blocked(struct pqi_ctrl_info *ctrl_info,
> -       unsigned long timeout_msecs)
> +static void pqi_wait_if_ctrl_blocked(struct pqi_ctrl_info *ctrl_info)
>  {
> -       unsigned long remaining_msecs;
> -
>         if (!pqi_ctrl_blocked(ctrl_info))
> -               return timeout_msecs;
> +               return;
>
>         atomic_inc(&ctrl_info->num_blocked_threads);
> -
> -       if (timeout_msecs == NO_TIMEOUT) {
> -               wait_event(ctrl_info->block_requests_wait,
> -                       !pqi_ctrl_blocked(ctrl_info));
> -               remaining_msecs = timeout_msecs;
> -       } else {
> -               unsigned long remaining_jiffies;
> -
> -               remaining_jiffies =
> -                       wait_event_timeout(ctrl_info->block_requests_wait,
> -                               !pqi_ctrl_blocked(ctrl_info),
> -                               msecs_to_jiffies(timeout_msecs));
> -               remaining_msecs = jiffies_to_msecs(remaining_jiffies);
> -       }
> -
> +       wait_event(ctrl_info->block_requests_wait,
> +               !pqi_ctrl_blocked(ctrl_info));
>         atomic_dec(&ctrl_info->num_blocked_threads);
> -
> -       return remaining_msecs;
>  }
>
>  static inline void pqi_ctrl_wait_until_quiesced(struct pqi_ctrl_info *ctrl_info)
> @@ -315,34 +347,25 @@ static inline bool pqi_device_offline(struct pqi_scsi_dev *device)
>         return device->device_offline;
>  }
>
> -static inline void pqi_device_reset_start(struct pqi_scsi_dev *device)
> -{
> -       device->in_reset = true;
> -}
> -
> -static inline void pqi_device_reset_done(struct pqi_scsi_dev *device)
> -{
> -       device->in_reset = false;
> -}
> -
> -static inline bool pqi_device_in_reset(struct pqi_scsi_dev *device)
> +static inline void pqi_ctrl_ofa_start(struct pqi_ctrl_info *ctrl_info)
>  {
> -       return device->in_reset;
> +       mutex_lock(&ctrl_info->ofa_mutex);
>  }
>
> -static inline void pqi_ctrl_ofa_start(struct pqi_ctrl_info *ctrl_info)
> +static inline void pqi_ctrl_ofa_done(struct pqi_ctrl_info *ctrl_info)
>  {
> -       ctrl_info->in_ofa = true;
> +       mutex_unlock(&ctrl_info->ofa_mutex);
>  }

pqi_ctrl_ofa_done() is called in several places. For me, it's non-
obvious whether ofa_mutex is guaranteed to be locked when this happens.
It would be an error to call mutex_unlock() if that's not the case.
Also, is it always guaranteed that "The context (task) that acquired
the lock also releases it"
(https://www.kernel.org/doc/html/latest/locking/locktypes.html)?
I feel that's rather not the case, as pqi_ctrl_ofa_start() is run from
a work queue, whereas pqi_ctrl_ofa_done() is not, afaics.

Have you considered using a completion?
Or can you add some explanatory comments?

> -static inline void pqi_ctrl_ofa_done(struct pqi_ctrl_info *ctrl_info)
> +static inline void pqi_wait_until_ofa_finished(struct pqi_ctrl_info *ctrl_info)
>  {
> -       ctrl_info->in_ofa = false;
> +       mutex_lock(&ctrl_info->ofa_mutex);
> +       mutex_unlock(&ctrl_info->ofa_mutex);
>  }
>
> -static inline bool pqi_ctrl_in_ofa(struct pqi_ctrl_info *ctrl_info)
> +static inline bool pqi_ofa_in_progress(struct pqi_ctrl_info *ctrl_info)
>  {
> -       return ctrl_info->in_ofa;
> +       return mutex_is_locked(&ctrl_info->ofa_mutex);
>  }
>
>  static inline void pqi_device_remove_start(struct pqi_scsi_dev *device)
> @@ -355,14 +378,20 @@ static inline bool pqi_device_in_remove(struct pqi_scsi_dev *device)
>         return device->in_remove;
>  }
>
> -static inline void pqi_ctrl_shutdown_start(struct pqi_ctrl_info
> *ctrl_info)
> +static inline int pqi_event_type_to_event_index(unsigned int
> event_type)
>  {
> -       ctrl_info->in_shutdown = true;
> +       int index;
> +
> +       for (index = 0; index <
> ARRAY_SIZE(pqi_supported_event_types); index++)
> +               if (event_type == pqi_supported_event_types[index])
> +                       return index;
> +
> +       return -1;
>  }
>
> -static inline bool pqi_ctrl_in_shutdown(struct pqi_ctrl_info *ctrl_info)
> +static inline bool pqi_is_supported_event(unsigned int event_type)
>  {
> -       return ctrl_info->in_shutdown;
> +       return pqi_event_type_to_event_index(event_type) != -1;
>  }
>
>  static inline void pqi_schedule_rescan_worker_with_delay(struct pqi_ctrl_info *ctrl_info,
> @@ -370,8 +399,6 @@ static inline void pqi_schedule_rescan_worker_with_delay(struct pqi_ctrl_info *c
>  {
>         if (pqi_ctrl_offline(ctrl_info))
>                 return;
> -       if (pqi_ctrl_in_ofa(ctrl_info))
> -               return;
>
>         schedule_delayed_work(&ctrl_info->rescan_work, delay);
>  }
> @@ -408,22 +435,15 @@ static inline u32 pqi_read_heartbeat_counter(struct pqi_ctrl_info *ctrl_info)
>
>  static inline u8 pqi_read_soft_reset_status(struct pqi_ctrl_info *ctrl_info)
>  {
> -       if (!ctrl_info->soft_reset_status)
> -               return 0;
> -
>         return readb(ctrl_info->soft_reset_status);
>  }

The new treatment of soft_reset_status is unrelated to the
synchronization issues mentioned in the patch description.

>
> -static inline void pqi_clear_soft_reset_status(struct pqi_ctrl_info *ctrl_info,
> -       u8 clear)
> +static inline void pqi_clear_soft_reset_status(struct pqi_ctrl_info *ctrl_info)
>  {
>         u8 status;
>
> -       if (!ctrl_info->soft_reset_status)
> -               return;
> -
>         status = pqi_read_soft_reset_status(ctrl_info);
> -       status &= ~clear;
> +       status &= ~PQI_SOFT_RESET_ABORT;
>         writeb(status, ctrl_info->soft_reset_status);
>  }
>
> @@ -512,6 +532,7 @@ static int pqi_build_raid_path_request(struct
> pqi_ctrl_info *ctrl_info,
>                 put_unaligned_be32(cdb_length, &cdb[6]);
>                 break;
>         case SA_FLUSH_CACHE:
> +               request->header.driver_flags =
> PQI_DRIVER_NONBLOCKABLE_REQUEST;
>                 request->data_direction = SOP_WRITE_FLAG;
>                 cdb[0] = BMIC_WRITE;
>                 cdb[6] = BMIC_FLUSH_CACHE;
> @@ -606,7 +627,7 @@ static void pqi_free_io_request(struct
> pqi_io_request *io_request)
>
>  static int pqi_send_scsi_raid_request(struct pqi_ctrl_info
> *ctrl_info, u8 cmd,
>         u8 *scsi3addr, void *buffer, size_t buffer_length, u16
> vpd_page,
> -       struct pqi_raid_error_info *error_info, unsigned long
> timeout_msecs)
> +       struct pqi_raid_error_info *error_info)
>  {
>         int rc;
>         struct pqi_raid_path_request request;
> @@ -618,7 +639,7 @@ static int pqi_send_scsi_raid_request(struct
> pqi_ctrl_info *ctrl_info, u8 cmd,
>                 return rc;
>
>         rc = pqi_submit_raid_request_synchronous(ctrl_info,
> &request.header, 0,
> -               error_info, timeout_msecs);
> +               error_info);
>
>         pqi_pci_unmap(ctrl_info->pci_dev, request.sg_descriptors, 1,
> dir);
>
> @@ -631,7 +652,7 @@ static inline int
> pqi_send_ctrl_raid_request(struct pqi_ctrl_info *ctrl_info,
>         u8 cmd, void *buffer, size_t buffer_length)
>  {
>         return pqi_send_scsi_raid_request(ctrl_info, cmd,
> RAID_CTLR_LUNID,
> -               buffer, buffer_length, 0, NULL, NO_TIMEOUT);
> +               buffer, buffer_length, 0, NULL);
>  }
>
>  static inline int pqi_send_ctrl_raid_with_error(struct pqi_ctrl_info
> *ctrl_info,
> @@ -639,7 +660,7 @@ static inline int
> pqi_send_ctrl_raid_with_error(struct pqi_ctrl_info *ctrl_info,
>         struct pqi_raid_error_info *error_info)
>  {
>         return pqi_send_scsi_raid_request(ctrl_info, cmd,
> RAID_CTLR_LUNID,
> -               buffer, buffer_length, 0, error_info, NO_TIMEOUT);
> +               buffer, buffer_length, 0, error_info);
>  }
>
>  static inline int pqi_identify_controller(struct pqi_ctrl_info
> *ctrl_info,
> @@ -661,7 +682,7 @@ static inline int pqi_scsi_inquiry(struct
> pqi_ctrl_info *ctrl_info,
>         u8 *scsi3addr, u16 vpd_page, void *buffer, size_t
> buffer_length)
>  {
>         return pqi_send_scsi_raid_request(ctrl_info, INQUIRY,
> scsi3addr,
> -               buffer, buffer_length, vpd_page, NULL, NO_TIMEOUT);
> +               buffer, buffer_length, vpd_page, NULL);
>  }
>
>  static int pqi_identify_physical_device(struct pqi_ctrl_info
> *ctrl_info,
> @@ -683,8 +704,7 @@ static int pqi_identify_physical_device(struct
> pqi_ctrl_info *ctrl_info,
>         request.cdb[2] = (u8)bmic_device_index;
>         request.cdb[9] = (u8)(bmic_device_index >> 8);
>
> -       rc = pqi_submit_raid_request_synchronous(ctrl_info,
> &request.header,
> -               0, NULL, NO_TIMEOUT);
> +       rc = pqi_submit_raid_request_synchronous(ctrl_info,
> &request.header, 0, NULL);
>
>         pqi_pci_unmap(ctrl_info->pci_dev, request.sg_descriptors, 1,
> dir);
>
> @@ -741,7 +761,7 @@ static int
> pqi_get_advanced_raid_bypass_config(struct pqi_ctrl_info *ctrl_info)
>         request.cdb[2] = BMIC_SENSE_FEATURE_IO_PAGE;
>         request.cdb[3] = BMIC_SENSE_FEATURE_IO_PAGE_AIO_SUBPAGE;
>
> -       rc = pqi_submit_raid_request_synchronous(ctrl_info,
> &request.header, 0, NULL, NO_TIMEOUT);
> +       rc = pqi_submit_raid_request_synchronous(ctrl_info,
> &request.header, 0, NULL);
>
>         pqi_pci_unmap(ctrl_info->pci_dev, request.sg_descriptors, 1,
> dir);
>
> @@ -794,13 +814,6 @@ static int pqi_flush_cache(struct pqi_ctrl_info
> *ctrl_info,
>         int rc;
>         struct bmic_flush_cache *flush_cache;
>
> -       /*
> -        * Don't bother trying to flush the cache if the controller
> is
> -        * locked up.
> -        */
> -       if (pqi_ctrl_offline(ctrl_info))
> -               return -ENXIO;
> -
>         flush_cache = kzalloc(sizeof(*flush_cache), GFP_KERNEL);
>         if (!flush_cache)
>                 return -ENOMEM;
> @@ -979,9 +992,6 @@ static void pqi_update_time_worker(struct
> work_struct *work)
>         ctrl_info = container_of(to_delayed_work(work), struct
> pqi_ctrl_info,
>                 update_time_work);
>
> -       if (pqi_ctrl_offline(ctrl_info))
> -               return;
> -
>         rc = pqi_write_current_time_to_host_wellness(ctrl_info);
>         if (rc)
>                 dev_warn(&ctrl_info->pci_dev->dev,
> @@ -1271,9 +1281,7 @@ static int pqi_get_raid_map(struct
> pqi_ctrl_info *ctrl_info,
>                 return -ENOMEM;
>
>         rc = pqi_send_scsi_raid_request(ctrl_info, CISS_GET_RAID_MAP,
> -               device->scsi3addr, raid_map, sizeof(*raid_map),
> -               0, NULL, NO_TIMEOUT);
> -
> +               device->scsi3addr, raid_map, sizeof(*raid_map), 0,
> NULL);
>         if (rc)
>                 goto error;
>
> @@ -1288,8 +1296,7 @@ static int pqi_get_raid_map(struct
> pqi_ctrl_info *ctrl_info,
>                         return -ENOMEM;
>
>                 rc = pqi_send_scsi_raid_request(ctrl_info,
> CISS_GET_RAID_MAP,
> -                       device->scsi3addr, raid_map, raid_map_size,
> -                       0, NULL, NO_TIMEOUT);
> +                       device->scsi3addr, raid_map, raid_map_size,
> 0, NULL);
>                 if (rc)
>                         goto error;
>
> @@ -1464,6 +1471,9 @@ static int pqi_get_physical_device_info(struct
> pqi_ctrl_info *ctrl_info,
>                 sizeof(device->phys_connector));
>         device->bay = id_phys->phys_bay_in_box;
>
> +       memcpy(&device->page_83_identifier, &id_phys->page_83_identifier,
> +               sizeof(device->page_83_identifier));
> +
>         return 0;
>  }
>

This hunk belongs to the "unique wwid" part, see above.


> @@ -1970,8 +1980,13 @@ static void pqi_update_device_list(struct
> pqi_ctrl_info *ctrl_info,
>
>         spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock,
> flags);
>
> -       if (pqi_ctrl_in_ofa(ctrl_info))
> -               pqi_ctrl_ofa_done(ctrl_info);
> +       if (pqi_ofa_in_progress(ctrl_info)) {
> +               list_for_each_entry_safe(device, next, &delete_list, delete_list_entry)
> +                       if (pqi_is_device_added(device))
> +                               pqi_device_remove_start(device);
> +               pqi_ctrl_unblock_device_reset(ctrl_info);
> +               pqi_scsi_unblock_requests(ctrl_info);
> +       }

I don't understand the purpose of this code. pqi_device_remove_start()
will be called again a few lines below. Why do it twice? I suppose
it's related to the unblocking, but that deserves an explanation.
Also, why do you unblock requests while OFA is "in progress"?

>
>         /* Remove all devices that have gone away. */
>         list_for_each_entry_safe(device, next, &delete_list, delete_list_entry) {
> @@ -1993,19 +2008,14 @@ static void pqi_update_device_list(struct
> pqi_ctrl_info *ctrl_info,

The following hunk is unrelated to synchronization.

>          * Notify the SCSI ML if the queue depth of any existing
> device has
>          * changed.
>          */
> -       list_for_each_entry(device, &ctrl_info->scsi_device_list,
> -               scsi_device_list_entry) {
> -               if (device->sdev) {
> -                       if (device->queue_depth !=
> -                               device->advertised_queue_depth) {
> -                               device->advertised_queue_depth =
> device->queue_depth;
> -                               scsi_change_queue_depth(device->sdev,
> -                                       device-
> >advertised_queue_depth);
> -                       }
> -                       if (device->rescan) {
> -                               scsi_rescan_device(&device->sdev-
> >sdev_gendev);
> -                               device->rescan = false;
> -                       }
> +       list_for_each_entry(device, &ctrl_info->scsi_device_list, scsi_device_list_entry) {
> +               if (device->sdev && device->queue_depth != device->advertised_queue_depth) {
> +                       device->advertised_queue_depth = device->queue_depth;
> +                       scsi_change_queue_depth(device->sdev, device->advertised_queue_depth);
> +               }
> +               if (device->rescan) {
> +                       scsi_rescan_device(&device->sdev->sdev_gendev);
> +                       device->rescan = false;
>                 }

You've taken the reference to device->sdev->sdev_gendev out of the if
(device->sdev) clause. Can you be certain that device->sdev is non-
NULL?
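E.g. (sketch):

	if (device->rescan) {
		if (device->sdev)
			scsi_rescan_device(&device->sdev->sdev_gendev);
		device->rescan = false;
	}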

>         }
>
> @@ -2073,6 +2083,16 @@ static inline bool pqi_expose_device(struct
> pqi_scsi_dev *device)
>         return !device->is_physical_device ||
> !pqi_skip_device(device->scsi3addr);
>  }
>

The following belongs to the "unique wwid" part.

> +static inline void pqi_set_physical_device_wwid(struct pqi_ctrl_info *ctrl_info,
> +       struct pqi_scsi_dev *device, struct report_phys_lun_extended_entry *phys_lun_ext_entry)
> +{
> +       if (ctrl_info->unique_wwid_in_report_phys_lun_supported ||
> +               pqi_is_device_with_sas_address(device))
> +               device->wwid = phys_lun_ext_entry->wwid;
> +       else
> +               device->wwid = cpu_to_be64(get_unaligned_be64(&device->page_83_identifier));
> +}
> +
>  static int pqi_update_scsi_devices(struct pqi_ctrl_info *ctrl_info)
>  {
>         int i;
> @@ -2238,7 +2258,7 @@ static int pqi_update_scsi_devices(struct
> pqi_ctrl_info *ctrl_info)
>                 pqi_assign_bus_target_lun(device);
>
>                 if (device->is_physical_device) {
> -                       device->wwid = phys_lun_ext_entry->wwid;
> +                       pqi_set_physical_device_wwid(ctrl_info, device, phys_lun_ext_entry);
>                         if ((phys_lun_ext_entry->device_flags &
>                                 CISS_REPORT_PHYS_DEV_FLAG_AIO_ENABLED
> ) &&
>                                 phys_lun_ext_entry->aio_handle) {
> @@ -2278,21 +2298,27 @@ static int pqi_update_scsi_devices(struct
> pqi_ctrl_info *ctrl_info)
>
>  static int pqi_scan_scsi_devices(struct pqi_ctrl_info *ctrl_info)
>  {
> -       int rc = 0;
> +       int rc;
> +       int mutex_acquired;
>
>         if (pqi_ctrl_offline(ctrl_info))
>                 return -ENXIO;
>
> -       if (!mutex_trylock(&ctrl_info->scan_mutex)) {
> +       mutex_acquired = mutex_trylock(&ctrl_info->scan_mutex);
> +
> +       if (!mutex_acquired) {
> +               if (pqi_ctrl_scan_blocked(ctrl_info))
> +                       return -EBUSY;
>                 pqi_schedule_rescan_worker_delayed(ctrl_info);
> -               rc = -EINPROGRESS;
> -       } else {
> -               rc = pqi_update_scsi_devices(ctrl_info);
> -               if (rc)
> -
>                        pqi_schedule_rescan_worker_delayed(ctrl_info);
> -               mutex_unlock(&ctrl_info->scan_mutex);
> +               return -EINPROGRESS;
>         }
>
> +       rc = pqi_update_scsi_devices(ctrl_info);
> +       if (rc && !pqi_ctrl_scan_blocked(ctrl_info))
> +               pqi_schedule_rescan_worker_delayed(ctrl_info);
> +
> +       mutex_unlock(&ctrl_info->scan_mutex);
> +
>         return rc;
>  }
>
> @@ -2301,8 +2327,6 @@ static void pqi_scan_start(struct Scsi_Host
> *shost)
>         struct pqi_ctrl_info *ctrl_info;
>
>         ctrl_info = shost_to_hba(shost);
> -       if (pqi_ctrl_in_ofa(ctrl_info))
> -               return;
>
>         pqi_scan_scsi_devices(ctrl_info);
>  }
> @@ -2319,27 +2343,8 @@ static int pqi_scan_finished(struct Scsi_Host
> *shost,
>         return !mutex_is_locked(&ctrl_info->scan_mutex);
>  }
>
> -static void pqi_wait_until_scan_finished(struct pqi_ctrl_info
> *ctrl_info)
> -{
> -       mutex_lock(&ctrl_info->scan_mutex);
> -       mutex_unlock(&ctrl_info->scan_mutex);
> -}
> -
> -static void pqi_wait_until_lun_reset_finished(struct pqi_ctrl_info
> *ctrl_info)
> -{
> -       mutex_lock(&ctrl_info->lun_reset_mutex);
> -       mutex_unlock(&ctrl_info->lun_reset_mutex);
> -}
> -
> -static void pqi_wait_until_ofa_finished(struct pqi_ctrl_info
> *ctrl_info)
> -{
> -       mutex_lock(&ctrl_info->ofa_mutex);
> -       mutex_unlock(&ctrl_info->ofa_mutex);
> -}

Here, again, I wonder if this mutex_lock()/mutex_unlock() approach is
optimal. Have you considered using completions?

See above for rationale.

> -
> -static inline void pqi_set_encryption_info(
> -       struct pqi_encryption_info *encryption_info, struct raid_map
> *raid_map,
> -       u64 first_block)
> +static inline void pqi_set_encryption_info(struct
> pqi_encryption_info *encryption_info,
> +       struct raid_map *raid_map, u64 first_block)
>  {
>         u32 volume_blk_size;

This whitespace change doesn't belong here.

>
> @@ -3251,8 +3256,8 @@ static void pqi_acknowledge_event(struct
> pqi_ctrl_info *ctrl_info,
>         put_unaligned_le16(sizeof(request) -
> PQI_REQUEST_HEADER_LENGTH,
>                 &request.header.iu_length);
>         request.event_type = event->event_type;
> -       request.event_id = event->event_id;
> -       request.additional_event_id = event->additional_event_id;
> +       put_unaligned_le16(event->event_id, &request.event_id);
> +       put_unaligned_le16(event->additional_event_id,
> &request.additional_event_id);

The different treatment of the event_id fields is unrelated to
synchronization, or am I missing something?

>
>         pqi_send_event_ack(ctrl_info, &request, sizeof(request));
>  }
> @@ -3263,8 +3268,8 @@ static void pqi_acknowledge_event(struct
> pqi_ctrl_info *ctrl_info,
>  static enum pqi_soft_reset_status pqi_poll_for_soft_reset_status(
>         struct pqi_ctrl_info *ctrl_info)
>  {
> -       unsigned long timeout;
>         u8 status;
> +       unsigned long timeout;
>
>         timeout = (PQI_SOFT_RESET_STATUS_TIMEOUT_SECS * PQI_HZ) +
> jiffies;
>
> @@ -3276,120 +3281,169 @@ static enum pqi_soft_reset_status
> pqi_poll_for_soft_reset_status(
>                 if (status & PQI_SOFT_RESET_ABORT)
>                         return RESET_ABORT;
>
> +               if (!sis_is_firmware_running(ctrl_info))
> +                       return RESET_NORESPONSE;
> +
>                 if (time_after(jiffies, timeout)) {
> -                       dev_err(&ctrl_info->pci_dev->dev,
> +                       dev_warn(&ctrl_info->pci_dev->dev,
>                                 "timed out waiting for soft reset
> status\n");
>                         return RESET_TIMEDOUT;
>                 }
>
> -               if (!sis_is_firmware_running(ctrl_info))
> -                       return RESET_NORESPONSE;
> -
>                 ssleep(PQI_SOFT_RESET_STATUS_POLL_INTERVAL_SECS);
>         }
>  }
>
> -static void pqi_process_soft_reset(struct pqi_ctrl_info *ctrl_info,
> -       enum pqi_soft_reset_status reset_status)
> +static void pqi_process_soft_reset(struct pqi_ctrl_info *ctrl_info)
>  {
>         int rc;
> +       unsigned int delay_secs;
> +       enum pqi_soft_reset_status reset_status;
> +
> +       if (ctrl_info->soft_reset_handshake_supported)
> +               reset_status =
> pqi_poll_for_soft_reset_status(ctrl_info);
> +       else
> +               reset_status = RESET_INITIATE_FIRMWARE;
> +
> +       pqi_ofa_free_host_buffer(ctrl_info);
> +
> +       delay_secs = PQI_POST_RESET_DELAY_SECS;
>
>         switch (reset_status) {
> -       case RESET_INITIATE_DRIVER:
>         case RESET_TIMEDOUT:
> +               delay_secs =
> PQI_POST_OFA_RESET_DELAY_UPON_TIMEOUT_SECS;
> +               fallthrough;
> +       case RESET_INITIATE_DRIVER:
>                 dev_info(&ctrl_info->pci_dev->dev,
> -                       "resetting controller %u\n", ctrl_info-
> >ctrl_id);
> +                               "Online Firmware Activation:
> resetting controller\n");
>                 sis_soft_reset(ctrl_info);
>                 fallthrough;
>         case RESET_INITIATE_FIRMWARE:
> -               rc = pqi_ofa_ctrl_restart(ctrl_info);
> -               pqi_ofa_free_host_buffer(ctrl_info);
> +               ctrl_info->pqi_mode_enabled = false;
> +               pqi_save_ctrl_mode(ctrl_info, SIS_MODE);
> +               rc = pqi_ofa_ctrl_restart(ctrl_info, delay_secs);
> +               pqi_ctrl_ofa_done(ctrl_info);
>                 dev_info(&ctrl_info->pci_dev->dev,
> -                       "Online Firmware Activation for controller
> %u: %s\n",
> -                       ctrl_info->ctrl_id, rc == 0 ? "SUCCESS" :
> "FAILED");
> +                               "Online Firmware Activation: %s\n",
> +                               rc == 0 ? "SUCCESS" : "FAILED");
>                 break;
>         case RESET_ABORT:
> -               pqi_ofa_ctrl_unquiesce(ctrl_info);
>                 dev_info(&ctrl_info->pci_dev->dev,
> -                       "Online Firmware Activation for controller
> %u: %s\n",
> -                       ctrl_info->ctrl_id, "ABORTED");
> +                               "Online Firmware Activation
> ABORTED\n");
> +               if (ctrl_info->soft_reset_handshake_supported)
> +                       pqi_clear_soft_reset_status(ctrl_info);
> +               pqi_ctrl_ofa_done(ctrl_info);
> +               pqi_ofa_ctrl_unquiesce(ctrl_info);
>                 break;
>         case RESET_NORESPONSE:
> -               pqi_ofa_free_host_buffer(ctrl_info);
> +               fallthrough;
> +       default:
> +               dev_err(&ctrl_info->pci_dev->dev,
> +                       "unexpected Online Firmware Activation reset
> status: 0x%x\n",
> +                       reset_status);
> +               pqi_ctrl_ofa_done(ctrl_info);
> +               pqi_ofa_ctrl_unquiesce(ctrl_info);
>                 pqi_take_ctrl_offline(ctrl_info);
>                 break;
>         }
>  }
>
> -static void pqi_ofa_process_event(struct pqi_ctrl_info *ctrl_info,
> -       struct pqi_event *event)
> +static void pqi_ofa_memory_alloc_worker(struct work_struct *work)

Moving the ofa handling into work queues seems to be a key aspect of
this patch. The patch description should mention how this will
improve synchronization. Naïve thinking suggests that making these
calls asynchronous could aggravate synchronization issues.

Repeating myself, I feel that completions would be the best way to
synchronize with these work items.

>  {
> -       u16 event_id;
> -       enum pqi_soft_reset_status status;
> +       struct pqi_ctrl_info *ctrl_info;
>
> -       event_id = get_unaligned_le16(&event->event_id);
> +       ctrl_info = container_of(work, struct pqi_ctrl_info,
> ofa_memory_alloc_work);
>
> -       mutex_lock(&ctrl_info->ofa_mutex);
> +       pqi_ctrl_ofa_start(ctrl_info);
> +       pqi_ofa_setup_host_buffer(ctrl_info);
> +       pqi_ofa_host_memory_update(ctrl_info);
> +}
>
> -       if (event_id == PQI_EVENT_OFA_QUIESCE) {
> -               dev_info(&ctrl_info->pci_dev->dev,
> -                       "Received Online Firmware Activation quiesce
> event for controller %u\n",
> -                       ctrl_info->ctrl_id);
> -               pqi_ofa_ctrl_quiesce(ctrl_info);
> -               pqi_acknowledge_event(ctrl_info, event);
> -               if (ctrl_info->soft_reset_handshake_supported) {
> -                       status =
> pqi_poll_for_soft_reset_status(ctrl_info);
> -                       pqi_process_soft_reset(ctrl_info, status);
> -               } else {
> -                       pqi_process_soft_reset(ctrl_info,
> -                                       RESET_INITIATE_FIRMWARE);
> -               }
> +static void pqi_ofa_quiesce_worker(struct work_struct *work)
> +{
> +       struct pqi_ctrl_info *ctrl_info;
> +       struct pqi_event *event;
>
> -       } else if (event_id == PQI_EVENT_OFA_MEMORY_ALLOCATION) {
> -               pqi_acknowledge_event(ctrl_info, event);
> -               pqi_ofa_setup_host_buffer(ctrl_info,
> -                       le32_to_cpu(event->ofa_bytes_requested));
> -               pqi_ofa_host_memory_update(ctrl_info);
> -       } else if (event_id == PQI_EVENT_OFA_CANCELED) {
> -               pqi_ofa_free_host_buffer(ctrl_info);
> -               pqi_acknowledge_event(ctrl_info, event);
> +       ctrl_info = container_of(work, struct pqi_ctrl_info, ofa_quiesce_work);
> +
> +       event = &ctrl_info->events[pqi_event_type_to_event_index(PQI_EVENT_TYPE_OFA)];
> +
> +       pqi_ofa_ctrl_quiesce(ctrl_info);
> +       pqi_acknowledge_event(ctrl_info, event);
> +       pqi_process_soft_reset(ctrl_info);
> +}
> +
> +static bool pqi_ofa_process_event(struct pqi_ctrl_info *ctrl_info,
> +       struct pqi_event *event)
> +{
> +       bool ack_event;
> +
> +       ack_event = true;
> +
> +       switch (event->event_id) {
> +       case PQI_EVENT_OFA_MEMORY_ALLOCATION:
> +               dev_info(&ctrl_info->pci_dev->dev,
> +                       "received Online Firmware Activation memory allocation request\n");
> +               schedule_work(&ctrl_info->ofa_memory_alloc_work);
> +               break;
> +       case PQI_EVENT_OFA_QUIESCE:
>                 dev_info(&ctrl_info->pci_dev->dev,
> -                       "Online Firmware Activation(%u) cancel reason : %u\n",
> -                       ctrl_info->ctrl_id, event->ofa_cancel_reason);
> +                       "received Online Firmware Activation quiesce request\n");
> +               schedule_work(&ctrl_info->ofa_quiesce_work);
> +               ack_event = false;
> +               break;
> +       case PQI_EVENT_OFA_CANCELED:
> +               dev_info(&ctrl_info->pci_dev->dev,
> +                       "received Online Firmware Activation cancel request: reason: %u\n",
> +                       ctrl_info->ofa_cancel_reason);
> +               pqi_ofa_free_host_buffer(ctrl_info);
> +               pqi_ctrl_ofa_done(ctrl_info);
> +               break;
> +       default:
> +               dev_err(&ctrl_info->pci_dev->dev,
> +                       "received unknown Online Firmware Activation request: event ID: %u\n",
> +                       event->event_id);
> +               break;
>         }
>
> -       mutex_unlock(&ctrl_info->ofa_mutex);
> +       return ack_event;
>  }
>
>  static void pqi_event_worker(struct work_struct *work)
>  {
>         unsigned int i;
> +       bool rescan_needed;
>         struct pqi_ctrl_info *ctrl_info;
>         struct pqi_event *event;
> +       bool ack_event;
>
>         ctrl_info = container_of(work, struct pqi_ctrl_info, event_work);
>
>         pqi_ctrl_busy(ctrl_info);
> -       pqi_wait_if_ctrl_blocked(ctrl_info, NO_TIMEOUT);
> +       pqi_wait_if_ctrl_blocked(ctrl_info);
>         if (pqi_ctrl_offline(ctrl_info))
>                 goto out;
>
> -       pqi_schedule_rescan_worker_delayed(ctrl_info);
> -
> +       rescan_needed = false;
>         event = ctrl_info->events;
>         for (i = 0; i < PQI_NUM_SUPPORTED_EVENTS; i++) {
>                 if (event->pending) {
>                         event->pending = false;
>                         if (event->event_type == PQI_EVENT_TYPE_OFA) {
> -                               pqi_ctrl_unbusy(ctrl_info);
> -                               pqi_ofa_process_event(ctrl_info, event);
> -                               return;
> +                               ack_event = pqi_ofa_process_event(ctrl_info, event);
> +                       } else {
> +                               ack_event = true;
> +                               rescan_needed = true;
>                         }
> -                       pqi_acknowledge_event(ctrl_info, event);
> +                       if (ack_event)
> +                               pqi_acknowledge_event(ctrl_info, event);
>                 }
>                 event++;
>         }
>
> +       if (rescan_needed)
> +               pqi_schedule_rescan_worker_delayed(ctrl_info);
> +
>  out:
>         pqi_ctrl_unbusy(ctrl_info);
>  }
> @@ -3446,37 +3500,18 @@ static inline void pqi_stop_heartbeat_timer(struct pqi_ctrl_info *ctrl_info)
>         del_timer_sync(&ctrl_info->heartbeat_timer);
>  }
>
> -static inline int pqi_event_type_to_event_index(unsigned int event_type)
> -{
> -       int index;
> -
> -       for (index = 0; index < ARRAY_SIZE(pqi_supported_event_types); index++)
> -               if (event_type == pqi_supported_event_types[index])
> -                       return index;
> -
> -       return -1;
> -}
> -
> -static inline bool pqi_is_supported_event(unsigned int event_type)
> -{
> -       return pqi_event_type_to_event_index(event_type) != -1;
> -}
> -
> -static void pqi_ofa_capture_event_payload(struct pqi_event *event,
> -       struct pqi_event_response *response)
> +static void pqi_ofa_capture_event_payload(struct pqi_ctrl_info *ctrl_info,
> +       struct pqi_event *event, struct pqi_event_response *response)
>  {
> -       u16 event_id;
> -
> -       event_id = get_unaligned_le16(&event->event_id);
> -
> -       if (event->event_type == PQI_EVENT_TYPE_OFA) {
> -               if (event_id == PQI_EVENT_OFA_MEMORY_ALLOCATION) {
> -                       event->ofa_bytes_requested =
> -                       response->data.ofa_memory_allocation.bytes_requested;
> -               } else if (event_id == PQI_EVENT_OFA_CANCELED) {
> -                       event->ofa_cancel_reason =
> -                       response->data.ofa_cancelled.reason;
> -               }
> +       switch (event->event_id) {
> +       case PQI_EVENT_OFA_MEMORY_ALLOCATION:
> +               ctrl_info->ofa_bytes_requested =
> +                       get_unaligned_le32(&response->data.ofa_memory_allocation.bytes_requested);
> +               break;
> +       case PQI_EVENT_OFA_CANCELED:
> +               ctrl_info->ofa_cancel_reason =
> +                       get_unaligned_le16(&response->data.ofa_cancelled.reason);
> +               break;
>         }
>  }
>
> @@ -3510,17 +3545,17 @@ static int pqi_process_event_intr(struct pqi_ctrl_info *ctrl_info)
>                 num_events++;
>                 response = event_queue->oq_element_array + (oq_ci *
> PQI_EVENT_OQ_ELEMENT_LENGTH);
>
> -               event_index =
> -                       pqi_event_type_to_event_index(response->event_type);
> +               event_index = pqi_event_type_to_event_index(response->event_type);
>
>                 if (event_index >= 0 && response->request_acknowledge) {
>                         event = &ctrl_info->events[event_index];
>                         event->pending = true;
>                         event->event_type = response->event_type;
> -                       event->event_id = response->event_id;
> -                       event->additional_event_id = response->additional_event_id;
> +                       event->event_id = get_unaligned_le16(&response->event_id);
> +                       event->additional_event_id =
> +                               get_unaligned_le32(&response->additional_event_id);
>                         if (event->event_type == PQI_EVENT_TYPE_OFA)
> -                               pqi_ofa_capture_event_payload(event, response);
> +                               pqi_ofa_capture_event_payload(ctrl_info, event, response);
>                 }
>
>                 oq_ci = (oq_ci + 1) % PQI_NUM_EVENT_QUEUE_ELEMENTS;
> @@ -3537,8 +3572,7 @@ static int pqi_process_event_intr(struct pqi_ctrl_info *ctrl_info)
>
>  #define PQI_LEGACY_INTX_MASK   0x1
>
> -static inline void pqi_configure_legacy_intx(struct pqi_ctrl_info *ctrl_info,
> -       bool enable_intx)
> +static inline void pqi_configure_legacy_intx(struct pqi_ctrl_info *ctrl_info, bool enable_intx)

another whitespace hunk

>  {
>         u32 intx_mask;
>         struct pqi_device_registers __iomem *pqi_registers;
> @@ -4216,59 +4250,36 @@ static int pqi_process_raid_io_error_synchronous(
>         return rc;
>  }
>
> +static inline bool pqi_is_blockable_request(struct pqi_iu_header *request)
> +{
> +       return (request->driver_flags & PQI_DRIVER_NONBLOCKABLE_REQUEST) == 0;
> +}
> +
>  static int pqi_submit_raid_request_synchronous(struct pqi_ctrl_info *ctrl_info,
>         struct pqi_iu_header *request, unsigned int flags,
> -       struct pqi_raid_error_info *error_info, unsigned long timeout_msecs)
> +       struct pqi_raid_error_info *error_info)

The removal of the timeout_msecs argument to this function could be
a separate patch in its own right.

>  {
>         int rc = 0;
>         struct pqi_io_request *io_request;
> -       unsigned long start_jiffies;
> -       unsigned long msecs_blocked;
>         size_t iu_length;
>         DECLARE_COMPLETION_ONSTACK(wait);
>
> -       /*
> -        * Note that specifying PQI_SYNC_FLAGS_INTERRUPTABLE and a timeout value
> -        * are mutually exclusive.
> -        */
> -
>         if (flags & PQI_SYNC_FLAGS_INTERRUPTABLE) {
>                 if (down_interruptible(&ctrl_info->sync_request_sem))
>                         return -ERESTARTSYS;
>         } else {
> -               if (timeout_msecs == NO_TIMEOUT) {
> -                       down(&ctrl_info->sync_request_sem);
> -               } else {
> -                       start_jiffies = jiffies;
> -                       if (down_timeout(&ctrl_info->sync_request_sem,
> -                               msecs_to_jiffies(timeout_msecs)))
> -                               return -ETIMEDOUT;
> -                       msecs_blocked =
> -                               jiffies_to_msecs(jiffies - start_jiffies);
> -                       if (msecs_blocked >= timeout_msecs) {
> -                               rc = -ETIMEDOUT;
> -                               goto out;
> -                       }
> -                       timeout_msecs -= msecs_blocked;
> -               }
> +               down(&ctrl_info->sync_request_sem);
>         }
>
>         pqi_ctrl_busy(ctrl_info);
> -       timeout_msecs = pqi_wait_if_ctrl_blocked(ctrl_info, timeout_msecs);
> -       if (timeout_msecs == 0) {
> -               pqi_ctrl_unbusy(ctrl_info);
> -               rc = -ETIMEDOUT;
> -               goto out;
> -       }
> +       if (pqi_is_blockable_request(request))
> +               pqi_wait_if_ctrl_blocked(ctrl_info);

You wait here after taking the semaphore - is that intended? Why?

>
>         if (pqi_ctrl_offline(ctrl_info)) {

Should you test this before waiting, perhaps?
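
To spell out the ordering I have in mind — i.e. checking for an
offline controller before blocking (untested sketch, just
illustrating the two questions above):

	pqi_ctrl_busy(ctrl_info);

	if (pqi_ctrl_offline(ctrl_info)) {
		rc = -ENXIO;
		goto out;
	}

	if (pqi_is_blockable_request(request))
		pqi_wait_if_ctrl_blocked(ctrl_info);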

> -               pqi_ctrl_unbusy(ctrl_info);
>                 rc = -ENXIO;
>                 goto out;
>         }
>
> -       atomic_inc(&ctrl_info->sync_cmds_outstanding);
> -
>         io_request = pqi_alloc_io_request(ctrl_info);
>
>         put_unaligned_le16(io_request->index,
> @@ -4288,18 +4299,7 @@ static int pqi_submit_raid_request_synchronous(struct pqi_ctrl_info *ctrl_info,
>         pqi_start_io(ctrl_info, &ctrl_info->queue_groups[PQI_DEFAULT_QUEUE_GROUP], RAID_PATH,
>                 io_request);
>
> -       pqi_ctrl_unbusy(ctrl_info);
> -
> -       if (timeout_msecs == NO_TIMEOUT) {
> -               pqi_wait_for_completion_io(ctrl_info, &wait);
> -       } else {
> -               if (!wait_for_completion_io_timeout(&wait,
> -                       msecs_to_jiffies(timeout_msecs))) {
> -                       dev_warn(&ctrl_info->pci_dev->dev,
> -                               "command timed out\n");
> -                       rc = -ETIMEDOUT;
> -               }
> -       }
> +       pqi_wait_for_completion_io(ctrl_info, &wait);
>
>         if (error_info) {
>                 if (io_request->error_info)
> @@ -4312,8 +4312,8 @@ static int pqi_submit_raid_request_synchronous(struct pqi_ctrl_info *ctrl_info,
>
>         pqi_free_io_request(io_request);
>
> -       atomic_dec(&ctrl_info->sync_cmds_outstanding);
>  out:
> +       pqi_ctrl_unbusy(ctrl_info);
>         up(&ctrl_info->sync_request_sem);
>
>         return rc;
> @@ -4350,8 +4350,7 @@ static int pqi_submit_admin_request_synchronous(
>         rc = pqi_poll_for_admin_response(ctrl_info, response);
>
>         if (rc == 0)
> -               rc = pqi_validate_admin_response(response,
> -                       request->function_code);
> +               rc = pqi_validate_admin_response(response, request->function_code);
>
>         return rc;
>  }
> @@ -4721,7 +4720,7 @@ static int pqi_configure_events(struct pqi_ctrl_info *ctrl_info,
>                 goto out;
>
>         rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header,
> -               0, NULL, NO_TIMEOUT);
> +               0, NULL);
>
>         pqi_pci_unmap(ctrl_info->pci_dev,
>                 request.data.report_event_configuration.sg_descriptors, 1,
> @@ -4757,7 +4756,7 @@ static int pqi_configure_events(struct pqi_ctrl_info *ctrl_info,
>                 goto out;
>
>         rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header, 0,
> -               NULL, NO_TIMEOUT);
> +               NULL);
>
>         pqi_pci_unmap(ctrl_info->pci_dev,
>                 request.data.report_event_configuration.sg_descriptors, 1,
> @@ -5277,12 +5276,6 @@ static inline int pqi_raid_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>                 device, scmd, queue_group);
>  }
>

Below here, a new section starts that refactors the treatment of bypass
retries. I don't see how this is related to the synchronization issues
mentioned in the patch description.


> -static inline void pqi_schedule_bypass_retry(struct pqi_ctrl_info *ctrl_info)
> -{
> -       if (!pqi_ctrl_blocked(ctrl_info))
> -               schedule_work(&ctrl_info->raid_bypass_retry_work);
> -}
> -
>  static bool pqi_raid_bypass_retry_needed(struct pqi_io_request *io_request)
>  {
>         struct scsi_cmnd *scmd;
> @@ -5299,7 +5292,7 @@ static bool pqi_raid_bypass_retry_needed(struct pqi_io_request *io_request)
>                 return false;
>
>         device = scmd->device->hostdata;
> -       if (pqi_device_offline(device))
> +       if (pqi_device_offline(device) || pqi_device_in_remove(device))
>                 return false;
>
>         ctrl_info = shost_to_hba(scmd->device->host);
> @@ -5309,155 +5302,26 @@ static bool pqi_raid_bypass_retry_needed(struct pqi_io_request *io_request)
>         return true;
>  }
>
> -static inline void pqi_add_to_raid_bypass_retry_list(
> -       struct pqi_ctrl_info *ctrl_info,
> -       struct pqi_io_request *io_request, bool at_head)
> -{
> -       unsigned long flags;
> -
> -       spin_lock_irqsave(&ctrl_info->raid_bypass_retry_list_lock, flags);
> -       if (at_head)
> -               list_add(&io_request->request_list_entry,
> -                       &ctrl_info->raid_bypass_retry_list);
> -       else
> -               list_add_tail(&io_request->request_list_entry,
> -                       &ctrl_info->raid_bypass_retry_list);
> -       spin_unlock_irqrestore(&ctrl_info->raid_bypass_retry_list_lock, flags);
> -}
> -
> -static void pqi_queued_raid_bypass_complete(struct pqi_io_request *io_request,
> +static void pqi_aio_io_complete(struct pqi_io_request *io_request,
>         void *context)
>  {
>         struct scsi_cmnd *scmd;
>
>         scmd = io_request->scmd;
> +       scsi_dma_unmap(scmd);
> +       if (io_request->status == -EAGAIN ||
> +               pqi_raid_bypass_retry_needed(io_request))
> +                       set_host_byte(scmd, DID_IMM_RETRY);
>         pqi_free_io_request(io_request);
>         pqi_scsi_done(scmd);
>  }
>
> -static void pqi_queue_raid_bypass_retry(struct pqi_io_request *io_request)
> +static inline int pqi_aio_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
> +       struct pqi_scsi_dev *device, struct scsi_cmnd *scmd,
> +       struct pqi_queue_group *queue_group)
>  {
> -       struct scsi_cmnd *scmd;
> -       struct pqi_ctrl_info *ctrl_info;
> -
> -       io_request->io_complete_callback = pqi_queued_raid_bypass_complete;
> -       scmd = io_request->scmd;
> -       scmd->result = 0;
> -       ctrl_info = shost_to_hba(scmd->device->host);
> -
> -       pqi_add_to_raid_bypass_retry_list(ctrl_info, io_request, false);
> -       pqi_schedule_bypass_retry(ctrl_info);
> -}
> -
> -static int pqi_retry_raid_bypass(struct pqi_io_request *io_request)
> -{
> -       struct scsi_cmnd *scmd;
> -       struct pqi_scsi_dev *device;
> -       struct pqi_ctrl_info *ctrl_info;
> -       struct pqi_queue_group *queue_group;
> -
> -       scmd = io_request->scmd;
> -       device = scmd->device->hostdata;
> -       if (pqi_device_in_reset(device)) {
> -               pqi_free_io_request(io_request);
> -               set_host_byte(scmd, DID_RESET);
> -               pqi_scsi_done(scmd);
> -               return 0;
> -       }
> -
> -       ctrl_info = shost_to_hba(scmd->device->host);
> -       queue_group = io_request->queue_group;
> -
> -       pqi_reinit_io_request(io_request);
> -
> -       return pqi_raid_submit_scsi_cmd_with_io_request(ctrl_info, io_request,
> -               device, scmd, queue_group);
> -}
> -
> -static inline struct pqi_io_request *pqi_next_queued_raid_bypass_request(
> -       struct pqi_ctrl_info *ctrl_info)
> -{
> -       unsigned long flags;
> -       struct pqi_io_request *io_request;
> -
> -       spin_lock_irqsave(&ctrl_info->raid_bypass_retry_list_lock, flags);
> -       io_request = list_first_entry_or_null(
> -               &ctrl_info->raid_bypass_retry_list,
> -               struct pqi_io_request, request_list_entry);
> -       if (io_request)
> -               list_del(&io_request->request_list_entry);
> -       spin_unlock_irqrestore(&ctrl_info->raid_bypass_retry_list_lock, flags);
> -
> -       return io_request;
> -}
> -
> -static void pqi_retry_raid_bypass_requests(struct pqi_ctrl_info *ctrl_info)
> -{
> -       int rc;
> -       struct pqi_io_request *io_request;
> -
> -       pqi_ctrl_busy(ctrl_info);
> -
> -       while (1) {
> -               if (pqi_ctrl_blocked(ctrl_info))
> -                       break;
> -               io_request = pqi_next_queued_raid_bypass_request(ctrl_info);
> -               if (!io_request)
> -                       break;
> -               rc = pqi_retry_raid_bypass(io_request);
> -               if (rc) {
> -                       pqi_add_to_raid_bypass_retry_list(ctrl_info, io_request,
> -                               true);
> -                       pqi_schedule_bypass_retry(ctrl_info);
> -                       break;
> -               }
> -       }
> -
> -       pqi_ctrl_unbusy(ctrl_info);
> -}
> -
> -static void pqi_raid_bypass_retry_worker(struct work_struct *work)
> -{
> -       struct pqi_ctrl_info *ctrl_info;
> -
> -       ctrl_info = container_of(work, struct pqi_ctrl_info,
> -               raid_bypass_retry_work);
> -       pqi_retry_raid_bypass_requests(ctrl_info);
> -}
> -
> -static void pqi_clear_all_queued_raid_bypass_retries(
> -       struct pqi_ctrl_info *ctrl_info)
> -{
> -       unsigned long flags;
> -
> -       spin_lock_irqsave(&ctrl_info->raid_bypass_retry_list_lock, flags);
> -       INIT_LIST_HEAD(&ctrl_info->raid_bypass_retry_list);
> -       spin_unlock_irqrestore(&ctrl_info->raid_bypass_retry_list_lock, flags);
> -}
> -
> -static void pqi_aio_io_complete(struct pqi_io_request *io_request,
> -       void *context)
> -{
> -       struct scsi_cmnd *scmd;
> -
> -       scmd = io_request->scmd;
> -       scsi_dma_unmap(scmd);
> -       if (io_request->status == -EAGAIN)
> -               set_host_byte(scmd, DID_IMM_RETRY);
> -       else if (pqi_raid_bypass_retry_needed(io_request)) {
> -               pqi_queue_raid_bypass_retry(io_request);
> -               return;
> -       }
> -       pqi_free_io_request(io_request);
> -       pqi_scsi_done(scmd);
> -}
> -
> -static inline int pqi_aio_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
> -       struct pqi_scsi_dev *device, struct scsi_cmnd *scmd,
> -       struct pqi_queue_group *queue_group)
> -{
> -       return pqi_aio_submit_io(ctrl_info, scmd, device->aio_handle,
> -               scmd->cmnd, scmd->cmd_len, queue_group, NULL, false);
> +       return pqi_aio_submit_io(ctrl_info, scmd, device->aio_handle,
> +               scmd->cmnd, scmd->cmd_len, queue_group, NULL, false);
>  }
>
>  static int pqi_aio_submit_io(struct pqi_ctrl_info *ctrl_info,
> @@ -5698,6 +5562,14 @@ static inline u16 pqi_get_hw_queue(struct pqi_ctrl_info *ctrl_info,
>         return hw_queue;
>  }
>
> +static inline bool pqi_is_bypass_eligible_request(struct scsi_cmnd *scmd)
> +{
> +       if (blk_rq_is_passthrough(scmd->request))
> +               return false;
> +
> +       return scmd->retries == 0;
> +}
> +

Nice, but this fits better into (or next to) 10/25 IMO.

>  /*
>   * This function gets called just before we hand the completed SCSI request
>   * back to the SML.
> @@ -5806,7 +5678,6 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
>         bool raid_bypassed;
>
>         device = scmd->device->hostdata;
> -       ctrl_info = shost_to_hba(shost);
>
>         if (!device) {
>                 set_host_byte(scmd, DID_NO_CONNECT);
> @@ -5816,15 +5687,15 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
>
>         atomic_inc(&device->scsi_cmds_outstanding);
>
> +       ctrl_info = shost_to_hba(shost);
> +
>         if (pqi_ctrl_offline(ctrl_info) || pqi_device_in_remove(device)) {
>                 set_host_byte(scmd, DID_NO_CONNECT);
>                 pqi_scsi_done(scmd);
>                 return 0;
>         }
>
> -       pqi_ctrl_busy(ctrl_info);
> -       if (pqi_ctrl_blocked(ctrl_info) || pqi_device_in_reset(device) ||
> -           pqi_ctrl_in_ofa(ctrl_info) || pqi_ctrl_in_shutdown(ctrl_info)) {
> +       if (pqi_ctrl_blocked(ctrl_info)) {
>                 rc = SCSI_MLQUEUE_HOST_BUSY;
>                 goto out;
>         }
> @@ -5841,13 +5712,12 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
>         if (pqi_is_logical_device(device)) {
>                 raid_bypassed = false;
>                 if (device->raid_bypass_enabled &&
> -                       !blk_rq_is_passthrough(scmd->request)) {
> -                       if (!pqi_is_parity_write_stream(ctrl_info, scmd)) {
> -                               rc = pqi_raid_bypass_submit_scsi_cmd(ctrl_info, device, scmd, queue_group);
> -                               if (rc == 0 || rc == SCSI_MLQUEUE_HOST_BUSY) {
> -                                       raid_bypassed = true;
> -                                       atomic_inc(&device->raid_bypass_cnt);
> -                               }
> +                       pqi_is_bypass_eligible_request(scmd) &&
> +                       !pqi_is_parity_write_stream(ctrl_info, scmd)) {
> +                       rc = pqi_raid_bypass_submit_scsi_cmd(ctrl_info, device, scmd, queue_group);
> +                       if (rc == 0 || rc == SCSI_MLQUEUE_HOST_BUSY) {
> +                               raid_bypassed = true;
> +                               atomic_inc(&device->raid_bypass_cnt);
>                         }
>                 }
>                 if (!raid_bypassed)
> @@ -5860,7 +5730,6 @@ static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scm
>         }
>
>  out:
> -       pqi_ctrl_unbusy(ctrl_info);
>         if (rc)
>                 atomic_dec(&device->scsi_cmds_outstanding);
>
> @@ -5970,100 +5839,22 @@ static void pqi_fail_io_queued_for_device(struct pqi_ctrl_info *ctrl_info,
>         }
>  }
>
> -static void pqi_fail_io_queued_for_all_devices(struct pqi_ctrl_info *ctrl_info)
> -{
> -       unsigned int i;
> -       unsigned int path;
> -       struct pqi_queue_group *queue_group;
> -       unsigned long flags;
> -       struct pqi_io_request *io_request;
> -       struct pqi_io_request *next;
> -       struct scsi_cmnd *scmd;
> -
> -       for (i = 0; i < ctrl_info->num_queue_groups; i++) {
> -               queue_group = &ctrl_info->queue_groups[i];
> -
> -               for (path = 0; path < 2; path++) {
> -                       spin_lock_irqsave(&queue_group->submit_lock[path],
> -                                               flags);
> -
> -                       list_for_each_entry_safe(io_request, next,
> -                               &queue_group->request_list[path],
> -                               request_list_entry) {
> -
> -                               scmd = io_request->scmd;
> -                               if (!scmd)
> -                                       continue;
> -
> -                               list_del(&io_request->request_list_entry);
> -                               set_host_byte(scmd, DID_RESET);
> -                               pqi_scsi_done(scmd);
> -                       }
> -
> -                       spin_unlock_irqrestore(
> -                               &queue_group->submit_lock[path], flags);
> -               }
> -       }
> -}
> -
>  static int pqi_device_wait_for_pending_io(struct pqi_ctrl_info *ctrl_info,
>         struct pqi_scsi_dev *device, unsigned long timeout_secs)
>  {
>         unsigned long timeout;
>
> -       timeout = (timeout_secs * PQI_HZ) + jiffies;
> -
> -       while (atomic_read(&device->scsi_cmds_outstanding)) {
> -               pqi_check_ctrl_health(ctrl_info);
> -               if (pqi_ctrl_offline(ctrl_info))
> -                       return -ENXIO;
> -               if (timeout_secs != NO_TIMEOUT) {
> -                       if (time_after(jiffies, timeout)) {
> -                               dev_err(&ctrl_info->pci_dev->dev,
> -                                       "timed out waiting for
> pending IO\n");
> -                               return -ETIMEDOUT;
> -                       }
> -               }
> -               usleep_range(1000, 2000);
> -       }
> -
> -       return 0;
> -}
> -
> -static int pqi_ctrl_wait_for_pending_io(struct pqi_ctrl_info *ctrl_info,
> -       unsigned long timeout_secs)
> -{
> -       bool io_pending;
> -       unsigned long flags;
> -       unsigned long timeout;
> -       struct pqi_scsi_dev *device;
>
>         timeout = (timeout_secs * PQI_HZ) + jiffies;
> -       while (1) {
> -               io_pending = false;
> -
> -               spin_lock_irqsave(&ctrl_info->scsi_device_list_lock, flags);
> -               list_for_each_entry(device, &ctrl_info->scsi_device_list,
> -                       scsi_device_list_entry) {
> -                       if (atomic_read(&device->scsi_cmds_outstanding)) {
> -                               io_pending = true;
> -                               break;
> -                       }
> -               }
> -               spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock,
> -                                       flags);
> -
> -               if (!io_pending)
> -                       break;
>
> +       while (atomic_read(&device->scsi_cmds_outstanding)) {
>                 pqi_check_ctrl_health(ctrl_info);
>                 if (pqi_ctrl_offline(ctrl_info))
>                         return -ENXIO;
> -
>                 if (timeout_secs != NO_TIMEOUT) {
>                         if (time_after(jiffies, timeout)) {
>                                 dev_err(&ctrl_info->pci_dev->dev,
> -                                       "timed out waiting for
> pending IO\n");
> +                                       "timed out waiting for
> pending I/O\n");
>                                 return -ETIMEDOUT;
>                         }
>                 }

Like I said above (wrt pqi_scsi_block_requests), have you considered
using wait_event_timeout() here?
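
For reference, a wait_event_timeout()-based variant might look
roughly like this (untested sketch; the device->wait_queue waitqueue
is hypothetical and would need a wake_up() wherever
scsi_cmds_outstanding is decremented):

	if (!wait_event_timeout(device->wait_queue,
			atomic_read(&device->scsi_cmds_outstanding) == 0 ||
			pqi_ctrl_offline(ctrl_info),
			timeout_secs * PQI_HZ))
		return -ETIMEDOUT;
	if (pqi_ctrl_offline(ctrl_info))
		return -ENXIO;

That would get rid of the 1-2 ms polling loop entirely.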


> @@ -6073,18 +5864,6 @@ static int pqi_ctrl_wait_for_pending_io(struct pqi_ctrl_info *ctrl_info,
>         return 0;
>  }
>
> -static int pqi_ctrl_wait_for_pending_sync_cmds(struct pqi_ctrl_info *ctrl_info)
> -{
> -       while (atomic_read(&ctrl_info->sync_cmds_outstanding)) {
> -               pqi_check_ctrl_health(ctrl_info);
> -               if (pqi_ctrl_offline(ctrl_info))
> -                       return -ENXIO;
> -               usleep_range(1000, 2000);
> -       }
> -
> -       return 0;
> -}
> -
>  static void pqi_lun_reset_complete(struct pqi_io_request *io_request,
>         void *context)
>  {
> @@ -6156,13 +5935,11 @@ static int pqi_lun_reset(struct pqi_ctrl_info *ctrl_info,
>         return rc;
>  }
>
> -/* Performs a reset at the LUN level. */
> -
>  #define PQI_LUN_RESET_RETRIES                  3
>  #define PQI_LUN_RESET_RETRY_INTERVAL_MSECS     10000
>  #define PQI_LUN_RESET_PENDING_IO_TIMEOUT_SECS  120
>
> -static int _pqi_device_reset(struct pqi_ctrl_info *ctrl_info,
> +static int pqi_lun_reset_with_retries(struct pqi_ctrl_info *ctrl_info,
>         struct pqi_scsi_dev *device)
>  {
>         int rc;
> @@ -6188,23 +5965,15 @@ static int pqi_device_reset(struct pqi_ctrl_info *ctrl_info,
>  {
>         int rc;
>
> -       mutex_lock(&ctrl_info->lun_reset_mutex);
> -
>         pqi_ctrl_block_requests(ctrl_info);
>         pqi_ctrl_wait_until_quiesced(ctrl_info);
>         pqi_fail_io_queued_for_device(ctrl_info, device);
>         rc = pqi_wait_until_inbound_queues_empty(ctrl_info);
> -       pqi_device_reset_start(device);
> -       pqi_ctrl_unblock_requests(ctrl_info);
> -
>         if (rc)
>                 rc = FAILED;
>         else
> -               rc = _pqi_device_reset(ctrl_info, device);
> -
> -       pqi_device_reset_done(device);
> -
> -       mutex_unlock(&ctrl_info->lun_reset_mutex);
> +               rc = pqi_lun_reset_with_retries(ctrl_info, device);
> +       pqi_ctrl_unblock_requests(ctrl_info);
>
>         return rc;
>  }
> @@ -6220,29 +5989,25 @@ static int pqi_eh_device_reset_handler(struct scsi_cmnd *scmd)
>         ctrl_info = shost_to_hba(shost);
>         device = scmd->device->hostdata;
>
> +       mutex_lock(&ctrl_info->lun_reset_mutex);
> +
>         dev_err(&ctrl_info->pci_dev->dev,
>                 "resetting scsi %d:%d:%d:%d\n",
>                 shost->host_no, device->bus, device->target, device->lun);
>
>         pqi_check_ctrl_health(ctrl_info);
> -       if (pqi_ctrl_offline(ctrl_info) ||
> -               pqi_device_reset_blocked(ctrl_info)) {
> +       if (pqi_ctrl_offline(ctrl_info))
>                 rc = FAILED;
> -               goto out;
> -       }
> -
> -       pqi_wait_until_ofa_finished(ctrl_info);
> -
> -       atomic_inc(&ctrl_info->sync_cmds_outstanding);
> -       rc = pqi_device_reset(ctrl_info, device);
> -       atomic_dec(&ctrl_info->sync_cmds_outstanding);
> +       else
> +               rc = pqi_device_reset(ctrl_info, device);
>
> -out:
>         dev_err(&ctrl_info->pci_dev->dev,
>                 "reset of scsi %d:%d:%d:%d: %s\n",
>                 shost->host_no, device->bus, device->target, device->lun,
>                 rc == SUCCESS ? "SUCCESS" : "FAILED");
>
> +       mutex_unlock(&ctrl_info->lun_reset_mutex);
> +
>         return rc;
>  }
>
> @@ -6544,7 +6309,7 @@ static int pqi_passthru_ioctl(struct pqi_ctrl_info *ctrl_info, void __user *arg)
>                 put_unaligned_le32(iocommand.Request.Timeout, &request.timeout);
>
>         rc = pqi_submit_raid_request_synchronous(ctrl_info, &request.header,
> -               PQI_SYNC_FLAGS_INTERRUPTABLE, &pqi_error_info, NO_TIMEOUT);
> +               PQI_SYNC_FLAGS_INTERRUPTABLE, &pqi_error_info);
>
>         if (iocommand.buf_size > 0)
>                 pqi_pci_unmap(ctrl_info->pci_dev, request.sg_descriptors, 1,
> @@ -6596,9 +6361,6 @@ static int pqi_ioctl(struct scsi_device *sdev, unsigned int cmd,
>
>         ctrl_info = shost_to_hba(sdev->host);
>
> -       if (pqi_ctrl_in_ofa(ctrl_info) || pqi_ctrl_in_shutdown(ctrl_info))
> -               return -EBUSY;
> -
>         switch (cmd) {
>         case CCISS_DEREGDISK:
>         case CCISS_REGNEWDISK:
> @@ -7145,9 +6907,7 @@ static int pqi_register_scsi(struct pqi_ctrl_info *ctrl_info)
>
>         shost = scsi_host_alloc(&pqi_driver_template, sizeof(ctrl_info));
>         if (!shost) {
> -               dev_err(&ctrl_info->pci_dev->dev,
> -                       "scsi_host_alloc failed for controller %u\n",
> -                       ctrl_info->ctrl_id);
> +               dev_err(&ctrl_info->pci_dev->dev, "scsi_host_alloc failed\n");
>                 return -ENOMEM;
>         }
>
> @@ -7405,7 +7165,7 @@ static int pqi_config_table_update(struct pqi_ctrl_info *ctrl_info,
>                 &request.data.config_table_update.last_section);
>
>         return pqi_submit_raid_request_synchronous(ctrl_info, &request.header,
> -               0, NULL, NO_TIMEOUT);
> +               0, NULL);
>  }
>
>  static int pqi_enable_firmware_features(struct pqi_ctrl_info *ctrl_info,
> @@ -7483,7 +7243,8 @@ static void pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
>                 break;
>         case PQI_FIRMWARE_FEATURE_SOFT_RESET_HANDSHAKE:
>                 ctrl_info->soft_reset_handshake_supported =
> -                       firmware_feature->enabled;
> +                       firmware_feature->enabled &&
> +                       ctrl_info->soft_reset_status;

Should you use readb(ctrl_info->soft_reset_status) here, like above?
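
I.e., if the intent is to check the register contents rather than
just whether the register has been mapped, something like this
(sketch only; I may be misreading the intent):

	ctrl_info->soft_reset_handshake_supported =
		firmware_feature->enabled &&
		ctrl_info->soft_reset_status &&
		readb(ctrl_info->soft_reset_status);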

>                 break;
>         case PQI_FIRMWARE_FEATURE_RAID_IU_TIMEOUT:
>                 ctrl_info->raid_iu_timeout_supported = firmware_feature->enabled;
> @@ -7491,6 +7252,10 @@ static void pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
>         case PQI_FIRMWARE_FEATURE_TMF_IU_TIMEOUT:
>                 ctrl_info->tmf_iu_timeout_supported = firmware_feature->enabled;
>                 break;
> +       case PQI_FIRMWARE_FEATURE_UNIQUE_WWID_IN_REPORT_PHYS_LUN:
> +               ctrl_info->unique_wwid_in_report_phys_lun_supported =
> +                       firmware_feature->enabled;
> +               break;
>         }
>
>         pqi_firmware_feature_status(ctrl_info, firmware_feature);
> @@ -7581,6 +7346,11 @@ static struct pqi_firmware_feature pqi_firmware_features[] = {
>                 .feature_bit = PQI_FIRMWARE_FEATURE_RAID_BYPASS_ON_ENCRYPTED_NVME,
>                 .feature_status = pqi_firmware_feature_status,
>         },
> +       {
> +               .feature_name = "Unique WWID in Report Physical LUN",
> +               .feature_bit = PQI_FIRMWARE_FEATURE_UNIQUE_WWID_IN_REPORT_PHYS_LUN,
> +               .feature_status = pqi_ctrl_update_feature_flags,
> +       },
>  };
>
>  static void pqi_process_firmware_features(
> @@ -7665,14 +7435,34 @@ static void pqi_process_firmware_features_section(
>         mutex_unlock(&pqi_firmware_features_mutex);
>  }
>

The hunks below look like yet another independent change.
(Handling of firmware_feature_section_present).

> +/*
> + * Reset all controller settings that can be initialized during the
> processing
> + * of the PQI Configuration Table.
> + */
> +
> +static void pqi_ctrl_reset_config(struct pqi_ctrl_info *ctrl_info)
> +{
> +       ctrl_info->heartbeat_counter = NULL;
> +       ctrl_info->soft_reset_status = NULL;
> +       ctrl_info->soft_reset_handshake_supported = false;
> +       ctrl_info->enable_r1_writes = false;
> +       ctrl_info->enable_r5_writes = false;
> +       ctrl_info->enable_r6_writes = false;
> +       ctrl_info->raid_iu_timeout_supported = false;
> +       ctrl_info->tmf_iu_timeout_supported = false;
> +       ctrl_info->unique_wwid_in_report_phys_lun_supported = false;
> +}
> +
>  static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
>  {
>         u32 table_length;
>         u32 section_offset;
> +       bool firmware_feature_section_present;
>         void __iomem *table_iomem_addr;
>         struct pqi_config_table *config_table;
>         struct pqi_config_table_section_header *section;
>         struct pqi_config_table_section_info section_info;
> +       struct pqi_config_table_section_info feature_section_info;
>
>         table_length = ctrl_info->config_table_length;
>         if (table_length == 0)
> @@ -7692,6 +7482,7 @@ static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
>         table_iomem_addr = ctrl_info->iomem_base + ctrl_info->config_table_offset;
>         memcpy_fromio(config_table, table_iomem_addr, table_length);
>
> +       firmware_feature_section_present = false;
>         section_info.ctrl_info = ctrl_info;
>         section_offset = get_unaligned_le32(&config_table->first_section_offset);
>
> @@ -7704,7 +7495,8 @@ static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
>
>                 switch (get_unaligned_le16(&section->section_id)) {
>                 case PQI_CONFIG_TABLE_SECTION_FIRMWARE_FEATURES:
> -                       pqi_process_firmware_features_section(&section_info);
> +                       firmware_feature_section_present = true;
> +                       feature_section_info = section_info;
>                         break;
>                 case PQI_CONFIG_TABLE_SECTION_HEARTBEAT:
>                         if (pqi_disable_heartbeat)
> @@ -7722,13 +7514,21 @@ static int pqi_process_config_table(struct pqi_ctrl_info *ctrl_info)
>                                 table_iomem_addr +
>                                 section_offset +
>                                 offsetof(struct pqi_config_table_soft_reset,
> -                                               soft_reset_status);
> +                                       soft_reset_status);
>                         break;
>                 }
>
>                 section_offset = get_unaligned_le16(&section->next_section_offset);
>         }
>
> +       /*
> +        * We process the firmware feature section after all other sections
> +        * have been processed so that the feature bit callbacks can take
> +        * into account the settings configured by other sections.
> +        */
> +       if (firmware_feature_section_present)
> +               pqi_process_firmware_features_section(&feature_section_info);
> +
>         kfree(config_table);
>
>         return 0;
> @@ -7776,8 +7576,6 @@ static int pqi_force_sis_mode(struct pqi_ctrl_info *ctrl_info)
>         return pqi_revert_to_sis_mode(ctrl_info);
>  }
>
> -#define PQI_POST_RESET_DELAY_B4_MSGU_READY     5000
> -
>  static int pqi_ctrl_init(struct pqi_ctrl_info *ctrl_info)
>  {
>         int rc;
> @@ -7785,7 +7583,7 @@ static int pqi_ctrl_init(struct pqi_ctrl_info *ctrl_info)
>
>         if (reset_devices) {
>                 sis_soft_reset(ctrl_info);
> -               msleep(PQI_POST_RESET_DELAY_B4_MSGU_READY);
> +               msleep(PQI_POST_RESET_DELAY_SECS * PQI_HZ);
>         } else {
>                 rc = pqi_force_sis_mode(ctrl_info);
>                 if (rc)
> @@ -8095,6 +7893,8 @@ static int pqi_ctrl_init_resume(struct pqi_ctrl_info *ctrl_info)
>         ctrl_info->controller_online = true;
>         pqi_ctrl_unblock_requests(ctrl_info);
>
> +       pqi_ctrl_reset_config(ctrl_info);
> +
>         rc = pqi_process_config_table(ctrl_info);
>         if (rc)
>                 return rc;
> @@ -8140,7 +7940,8 @@ static int pqi_ctrl_init_resume(struct pqi_ctrl_info *ctrl_info)
>                 return rc;
>         }
>
> -       pqi_schedule_update_time_worker(ctrl_info);
> +       if (pqi_ofa_in_progress(ctrl_info))
> +               pqi_ctrl_unblock_scan(ctrl_info);
>
>         pqi_scan_scsi_devices(ctrl_info);
>
> @@ -8253,7 +8054,6 @@ static struct pqi_ctrl_info *pqi_alloc_ctrl_info(int numa_node)
>
>         INIT_WORK(&ctrl_info->event_work, pqi_event_worker);
>         atomic_set(&ctrl_info->num_interrupts, 0);
> -       atomic_set(&ctrl_info->sync_cmds_outstanding, 0);
>
>         INIT_DELAYED_WORK(&ctrl_info->rescan_work, pqi_rescan_worker);
>         INIT_DELAYED_WORK(&ctrl_info->update_time_work, pqi_update_time_worker);
> @@ -8261,15 +8061,13 @@ static struct pqi_ctrl_info *pqi_alloc_ctrl_info(int numa_node)
>         timer_setup(&ctrl_info->heartbeat_timer, pqi_heartbeat_timer_handler, 0);
>         INIT_WORK(&ctrl_info->ctrl_offline_work, pqi_ctrl_offline_worker);
>
> +       INIT_WORK(&ctrl_info->ofa_memory_alloc_work, pqi_ofa_memory_alloc_worker);
> +       INIT_WORK(&ctrl_info->ofa_quiesce_work, pqi_ofa_quiesce_worker);
> +
>         sema_init(&ctrl_info->sync_request_sem,
>                 PQI_RESERVED_IO_SLOTS_SYNCHRONOUS_REQUESTS);
>         init_waitqueue_head(&ctrl_info->block_requests_wait);
>
> -       INIT_LIST_HEAD(&ctrl_info->raid_bypass_retry_list);
> -       spin_lock_init(&ctrl_info->raid_bypass_retry_list_lock);
> -       INIT_WORK(&ctrl_info->raid_bypass_retry_work,
> -               pqi_raid_bypass_retry_worker);
> -
>         ctrl_info->ctrl_id = atomic_inc_return(&pqi_controller_count) - 1;
>         ctrl_info->irq_mode = IRQ_MODE_NONE;
>         ctrl_info->max_msix_vectors = PQI_MAX_MSIX_VECTORS;
> @@ -8334,81 +8132,57 @@ static void pqi_remove_ctrl(struct pqi_ctrl_info *ctrl_info)
>
>  static void pqi_ofa_ctrl_quiesce(struct pqi_ctrl_info *ctrl_info)
>  {
> -       pqi_cancel_update_time_worker(ctrl_info);
> -       pqi_cancel_rescan_worker(ctrl_info);
> -       pqi_wait_until_lun_reset_finished(ctrl_info);
> -       pqi_wait_until_scan_finished(ctrl_info);
> -       pqi_ctrl_ofa_start(ctrl_info);
> +       pqi_ctrl_block_scan(ctrl_info);
> +       pqi_scsi_block_requests(ctrl_info);
> +       pqi_ctrl_block_device_reset(ctrl_info);
>         pqi_ctrl_block_requests(ctrl_info);
>         pqi_ctrl_wait_until_quiesced(ctrl_info);
> -       pqi_ctrl_wait_for_pending_io(ctrl_info, PQI_PENDING_IO_TIMEOUT_SECS);
> -       pqi_fail_io_queued_for_all_devices(ctrl_info);
> -       pqi_wait_until_inbound_queues_empty(ctrl_info);
>         pqi_stop_heartbeat_timer(ctrl_info);
> -       ctrl_info->pqi_mode_enabled = false;
> -       pqi_save_ctrl_mode(ctrl_info, SIS_MODE);
>  }
>
>  static void pqi_ofa_ctrl_unquiesce(struct pqi_ctrl_info *ctrl_info)
>  {
> -       pqi_ofa_free_host_buffer(ctrl_info);
> -       ctrl_info->pqi_mode_enabled = true;
> -       pqi_save_ctrl_mode(ctrl_info, PQI_MODE);
> -       ctrl_info->controller_online = true;
> -       pqi_ctrl_unblock_requests(ctrl_info);
>         pqi_start_heartbeat_timer(ctrl_info);
> -       pqi_schedule_update_time_worker(ctrl_info);
> -       pqi_clear_soft_reset_status(ctrl_info,
> -               PQI_SOFT_RESET_ABORT);
> -       pqi_scan_scsi_devices(ctrl_info);
> +       pqi_ctrl_unblock_requests(ctrl_info);
> +       pqi_ctrl_unblock_device_reset(ctrl_info);
> +       pqi_scsi_unblock_requests(ctrl_info);
> +       pqi_ctrl_unblock_scan(ctrl_info);
>  }
>
> -static int pqi_ofa_alloc_mem(struct pqi_ctrl_info *ctrl_info,
> -       u32 total_size, u32 chunk_size)
> +static int pqi_ofa_alloc_mem(struct pqi_ctrl_info *ctrl_info, u32 total_size, u32 chunk_size)
>  {
> -       u32 sg_count;
> -       u32 size;
>         int i;
> -       struct pqi_sg_descriptor *mem_descriptor = NULL;
> +       u32 sg_count;
>         struct device *dev;
>         struct pqi_ofa_memory *ofap;
> -
> -       dev = &ctrl_info->pci_dev->dev;
> -
> -       sg_count = (total_size + chunk_size - 1);
> -       sg_count /= chunk_size;
> +       struct pqi_sg_descriptor *mem_descriptor;
> +       dma_addr_t dma_handle;
>
>         ofap = ctrl_info->pqi_ofa_mem_virt_addr;
>
> -       if (sg_count*chunk_size < total_size)
> +       sg_count = DIV_ROUND_UP(total_size, chunk_size);
> +       if (sg_count == 0 || sg_count > PQI_OFA_MAX_SG_DESCRIPTORS)
>                 goto out;
>
> -       ctrl_info->pqi_ofa_chunk_virt_addr =
> -                               kcalloc(sg_count, sizeof(void *), GFP_KERNEL);
> +       ctrl_info->pqi_ofa_chunk_virt_addr = kmalloc_array(sg_count, sizeof(void *), GFP_KERNEL);
>         if (!ctrl_info->pqi_ofa_chunk_virt_addr)
>                 goto out;
>
> -       for (size = 0, i = 0; size < total_size; size += chunk_size, i++) {
> -               dma_addr_t dma_handle;
> +       dev = &ctrl_info->pci_dev->dev;
>
> +       for (i = 0; i < sg_count; i++) {
>                 ctrl_info->pqi_ofa_chunk_virt_addr[i] =
> -                       dma_alloc_coherent(dev, chunk_size, &dma_handle,
> -                                          GFP_KERNEL);
> -
> +                       dma_alloc_coherent(dev, chunk_size, &dma_handle, GFP_KERNEL);
>                 if (!ctrl_info->pqi_ofa_chunk_virt_addr[i])
> -                       break;
> -
> +                       goto out_free_chunks;
>                 mem_descriptor = &ofap->sg_descriptor[i];
>                 put_unaligned_le64((u64)dma_handle, &mem_descriptor->address);
>                 put_unaligned_le32(chunk_size, &mem_descriptor->length);
>         }
>
> -       if (!size || size < total_size)
> -               goto out_free_chunks;
> -
>         put_unaligned_le32(CISS_SG_LAST, &mem_descriptor->flags);
>         put_unaligned_le16(sg_count, &ofap->num_memory_descriptors);
> -       put_unaligned_le32(size, &ofap->bytes_allocated);
> +       put_unaligned_le32(sg_count * chunk_size, &ofap->bytes_allocated);
>
>         return 0;
>
> @@ -8416,82 +8190,87 @@ static int pqi_ofa_alloc_mem(struct pqi_ctrl_info *ctrl_info,
>         while (--i >= 0) {
>                 mem_descriptor = &ofap->sg_descriptor[i];
>                 dma_free_coherent(dev, chunk_size,
> -                               ctrl_info->pqi_ofa_chunk_virt_addr[i],
> -                               get_unaligned_le64(&mem_descriptor->address));
> +                       ctrl_info->pqi_ofa_chunk_virt_addr[i],
> +                       get_unaligned_le64(&mem_descriptor->address));
>         }
>         kfree(ctrl_info->pqi_ofa_chunk_virt_addr);
>
>  out:
> -       put_unaligned_le32 (0, &ofap->bytes_allocated);
>         return -ENOMEM;
>  }
>
>  static int pqi_ofa_alloc_host_buffer(struct pqi_ctrl_info *ctrl_info)
>  {
>         u32 total_size;
> +       u32 chunk_size;
>         u32 min_chunk_size;
> -       u32 chunk_sz;
>
> -       total_size = le32_to_cpu(
> -                       ctrl_info->pqi_ofa_mem_virt_addr->bytes_allocated);
> -       min_chunk_size = total_size / PQI_OFA_MAX_SG_DESCRIPTORS;
> +       if (ctrl_info->ofa_bytes_requested == 0)
> +               return 0;
>
> -       for (chunk_sz = total_size; chunk_sz >= min_chunk_size; chunk_sz /= 2)
> -               if (!pqi_ofa_alloc_mem(ctrl_info, total_size, chunk_sz))
> +       total_size = PAGE_ALIGN(ctrl_info->ofa_bytes_requested);
> +       min_chunk_size = DIV_ROUND_UP(total_size, PQI_OFA_MAX_SG_DESCRIPTORS);
> +       min_chunk_size = PAGE_ALIGN(min_chunk_size);
> +
> +       for (chunk_size = total_size; chunk_size >= min_chunk_size;) {
> +               if (pqi_ofa_alloc_mem(ctrl_info, total_size, chunk_size) == 0)
>                         return 0;
> +               chunk_size /= 2;
> +               chunk_size = PAGE_ALIGN(chunk_size);
> +       }
>
>         return -ENOMEM;
>  }
>
> -static void pqi_ofa_setup_host_buffer(struct pqi_ctrl_info *ctrl_info,
> -       u32 bytes_requested)
> +static void pqi_ofa_setup_host_buffer(struct pqi_ctrl_info *ctrl_info)
>  {
> -       struct pqi_ofa_memory *pqi_ofa_memory;
>         struct device *dev;
> +       struct pqi_ofa_memory *ofap;
>
>         dev = &ctrl_info->pci_dev->dev;
> -       pqi_ofa_memory = dma_alloc_coherent(dev,
> -                                           PQI_OFA_MEMORY_DESCRIPTOR_LENGTH,
> -                                           &ctrl_info->pqi_ofa_mem_dma_handle,
> -                                           GFP_KERNEL);
>
> -       if (!pqi_ofa_memory)
> +       ofap = dma_alloc_coherent(dev, sizeof(*ofap),
> +               &ctrl_info->pqi_ofa_mem_dma_handle, GFP_KERNEL);
> +       if (!ofap)
>                 return;
>
> -       put_unaligned_le16(PQI_OFA_VERSION, &pqi_ofa_memory->version);
> -       memcpy(&pqi_ofa_memory->signature, PQI_OFA_SIGNATURE,
> -                                       sizeof(pqi_ofa_memory->signature));
> -       pqi_ofa_memory->bytes_allocated = cpu_to_le32(bytes_requested);
> -
> -       ctrl_info->pqi_ofa_mem_virt_addr = pqi_ofa_memory;
> +       ctrl_info->pqi_ofa_mem_virt_addr = ofap;
>
>         if (pqi_ofa_alloc_host_buffer(ctrl_info) < 0) {
> -               dev_err(dev, "Failed to allocate host buffer of size
> = %u",
> -                       bytes_requested);
> +               dev_err(dev,
> +                       "failed to allocate host buffer for Online
> Firmware Activation\n");
> +               dma_free_coherent(dev, sizeof(*ofap), ofap,
> ctrl_info->pqi_ofa_mem_dma_handle);
> +               ctrl_info->pqi_ofa_mem_virt_addr = NULL;
> +               return;
>         }
>
> -       return;
> +       put_unaligned_le16(PQI_OFA_VERSION, &ofap->version);
> +       memcpy(&ofap->signature, PQI_OFA_SIGNATURE, sizeof(ofap->signature));
>  }
>
>  static void pqi_ofa_free_host_buffer(struct pqi_ctrl_info *ctrl_info)
>  {
> -       int i;
> -       struct pqi_sg_descriptor *mem_descriptor;
> +       unsigned int i;
> +       struct device *dev;
>         struct pqi_ofa_memory *ofap;
> +       struct pqi_sg_descriptor *mem_descriptor;
> +       unsigned int num_memory_descriptors;
>
>         ofap = ctrl_info->pqi_ofa_mem_virt_addr;
> -
>         if (!ofap)
>                 return;
>
> -       if (!ofap->bytes_allocated)
> +       dev = &ctrl_info->pci_dev->dev;
> +
> +       if (get_unaligned_le32(&ofap->bytes_allocated) == 0)
>                 goto out;
>
>         mem_descriptor = ofap->sg_descriptor;
> +       num_memory_descriptors =
> +               get_unaligned_le16(&ofap->num_memory_descriptors);
>
> -       for (i = 0; i < get_unaligned_le16(&ofap->num_memory_descriptors);
> -               i++) {
> -               dma_free_coherent(&ctrl_info->pci_dev->dev,
> +       for (i = 0; i < num_memory_descriptors; i++) {
> +               dma_free_coherent(dev,
>                         get_unaligned_le32(&mem_descriptor[i].length),
>                         ctrl_info->pqi_ofa_chunk_virt_addr[i],
>                         get_unaligned_le64(&mem_descriptor[i].address));
> @@ -8499,47 +8278,46 @@ static void pqi_ofa_free_host_buffer(struct pqi_ctrl_info *ctrl_info)
>         kfree(ctrl_info->pqi_ofa_chunk_virt_addr);
>
>  out:
> -       dma_free_coherent(&ctrl_info->pci_dev->dev,
> -                       PQI_OFA_MEMORY_DESCRIPTOR_LENGTH, ofap,
> -                       ctrl_info->pqi_ofa_mem_dma_handle);
> +       dma_free_coherent(dev, sizeof(*ofap), ofap,
> +               ctrl_info->pqi_ofa_mem_dma_handle);
>         ctrl_info->pqi_ofa_mem_virt_addr = NULL;
>  }
>
>  static int pqi_ofa_host_memory_update(struct pqi_ctrl_info *ctrl_info)
>  {
> +       u32 buffer_length;
>         struct pqi_vendor_general_request request;
> -       size_t size;
>         struct pqi_ofa_memory *ofap;
>
>         memset(&request, 0, sizeof(request));
>
> -       ofap = ctrl_info->pqi_ofa_mem_virt_addr;
> -
>         request.header.iu_type = PQI_REQUEST_IU_VENDOR_GENERAL;
>         put_unaligned_le16(sizeof(request) - PQI_REQUEST_HEADER_LENGTH,
>                 &request.header.iu_length);
>         put_unaligned_le16(PQI_VENDOR_GENERAL_HOST_MEMORY_UPDATE,
>                 &request.function_code);
>
> +       ofap = ctrl_info->pqi_ofa_mem_virt_addr;
> +
>         if (ofap) {
> -               size = offsetof(struct pqi_ofa_memory, sg_descriptor) +
> +               buffer_length = offsetof(struct pqi_ofa_memory, sg_descriptor) +
>                         get_unaligned_le16(&ofap->num_memory_descriptors) *
>                         sizeof(struct pqi_sg_descriptor);
>
>                 put_unaligned_le64((u64)ctrl_info->pqi_ofa_mem_dma_handle,
>                         &request.data.ofa_memory_allocation.buffer_address);
> -               put_unaligned_le32(size,
> +               put_unaligned_le32(buffer_length,
>                         &request.data.ofa_memory_allocation.buffer_length);
> -
>         }
>
>         return pqi_submit_raid_request_synchronous(ctrl_info, &request.header,
> -               0, NULL, NO_TIMEOUT);
> +               0, NULL);
>  }
>
> -static int pqi_ofa_ctrl_restart(struct pqi_ctrl_info *ctrl_info)
> +static int pqi_ofa_ctrl_restart(struct pqi_ctrl_info *ctrl_info, unsigned int delay_secs)
>  {
> -       msleep(PQI_POST_RESET_DELAY_B4_MSGU_READY);
> +       ssleep(delay_secs);
> +
>         return pqi_ctrl_init_resume(ctrl_info);
>  }
>
> @@ -8597,7 +8375,6 @@ static void pqi_take_ctrl_offline_deferred(struct pqi_ctrl_info *ctrl_info)
>         pqi_cancel_update_time_worker(ctrl_info);
>         pqi_ctrl_wait_until_quiesced(ctrl_info);
>         pqi_fail_all_outstanding_requests(ctrl_info);
> -       pqi_clear_all_queued_raid_bypass_retries(ctrl_info);
>         pqi_ctrl_unblock_requests(ctrl_info);
>  }
>
> @@ -8730,24 +8507,12 @@ static void pqi_shutdown(struct pci_dev *pci_dev)
>                 return;
>         }
>
> -       pqi_disable_events(ctrl_info);
>         pqi_wait_until_ofa_finished(ctrl_info);
> -       pqi_cancel_update_time_worker(ctrl_info);
> -       pqi_cancel_rescan_worker(ctrl_info);
> -       pqi_cancel_event_worker(ctrl_info);
> -
> -       pqi_ctrl_shutdown_start(ctrl_info);
> -       pqi_ctrl_wait_until_quiesced(ctrl_info);
> -
> -       rc = pqi_ctrl_wait_for_pending_io(ctrl_info, NO_TIMEOUT);
> -       if (rc) {
> -               dev_err(&pci_dev->dev,
> -                       "wait for pending I/O failed\n");
> -               return;
> -       }
>
> +       pqi_scsi_block_requests(ctrl_info);
>         pqi_ctrl_block_device_reset(ctrl_info);
> -       pqi_wait_until_lun_reset_finished(ctrl_info);
> +       pqi_ctrl_block_requests(ctrl_info);
> +       pqi_ctrl_wait_until_quiesced(ctrl_info);
>
>         /*
>          * Write all data in the controller's battery-backed cache to
> @@ -8758,15 +8523,6 @@ static void pqi_shutdown(struct pci_dev *pci_dev)
>                 dev_err(&pci_dev->dev,
>                         "unable to flush controller cache\n");
>
> -       pqi_ctrl_block_requests(ctrl_info);
> -
> -       rc = pqi_ctrl_wait_for_pending_sync_cmds(ctrl_info);
> -       if (rc) {
> -               dev_err(&pci_dev->dev,
> -                       "wait for pending sync cmds failed\n");
> -               return;
> -       }
> -
>         pqi_crash_if_pending_command(ctrl_info);
>         pqi_reset(ctrl_info);
>  }
> @@ -8801,19 +8557,18 @@ static __maybe_unused int pqi_suspend(struct pci_dev *pci_dev, pm_message_t stat
>
>         ctrl_info = pci_get_drvdata(pci_dev);
>
> -       pqi_disable_events(ctrl_info);
> -       pqi_cancel_update_time_worker(ctrl_info);
> -       pqi_cancel_rescan_worker(ctrl_info);
> -       pqi_wait_until_scan_finished(ctrl_info);
> -       pqi_wait_until_lun_reset_finished(ctrl_info);
>         pqi_wait_until_ofa_finished(ctrl_info);
> -       pqi_flush_cache(ctrl_info, SUSPEND);
> +
> +       pqi_ctrl_block_scan(ctrl_info);
> +       pqi_scsi_block_requests(ctrl_info);
> +       pqi_ctrl_block_device_reset(ctrl_info);
>         pqi_ctrl_block_requests(ctrl_info);
>         pqi_ctrl_wait_until_quiesced(ctrl_info);
> -       pqi_wait_until_inbound_queues_empty(ctrl_info);
> -       pqi_ctrl_wait_for_pending_io(ctrl_info, NO_TIMEOUT);
> +       pqi_flush_cache(ctrl_info, SUSPEND);
>         pqi_stop_heartbeat_timer(ctrl_info);
>
> +       pqi_crash_if_pending_command(ctrl_info);
> +
>         if (state.event == PM_EVENT_FREEZE)
>                 return 0;
>
> @@ -8846,8 +8601,10 @@ static __maybe_unused int pqi_resume(struct pci_dev *pci_dev)
>                                 pci_dev->irq, rc);
>                         return rc;
>                 }
> -               pqi_start_heartbeat_timer(ctrl_info);
> +               pqi_ctrl_unblock_device_reset(ctrl_info);
>                 pqi_ctrl_unblock_requests(ctrl_info);
> +               pqi_scsi_unblock_requests(ctrl_info);
> +               pqi_ctrl_unblock_scan(ctrl_info);
>                 return 0;
>         }
>
> @@ -9288,7 +9045,7 @@ static void __attribute__((unused)) verify_structures(void)
>         BUILD_BUG_ON(offsetof(struct pqi_iu_header,
>                 response_queue_id) != 0x4);
>         BUILD_BUG_ON(offsetof(struct pqi_iu_header,
> -               work_area) != 0x6);
> +               driver_flags) != 0x6);
>         BUILD_BUG_ON(sizeof(struct pqi_iu_header) != 0x8);
>
>         BUILD_BUG_ON(offsetof(struct pqi_aio_error_info,
> @@ -9386,7 +9143,7 @@ static void __attribute__((unused)) verify_structures(void)
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_request,
>                 header.iu_length) != 2);
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_request,
> -               header.work_area) != 6);
> +               header.driver_flags) != 6);
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_request,
>                 request_id) != 8);
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_request,
> @@ -9442,7 +9199,7 @@ static void __attribute__((unused)) verify_structures(void)
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_response,
>                 header.iu_length) != 2);
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_response,
> -               header.work_area) != 6);
> +               header.driver_flags) != 6);
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_response,
>                 request_id) != 8);
>         BUILD_BUG_ON(offsetof(struct pqi_general_admin_response,
> @@ -9466,7 +9223,7 @@ static void __attribute__((unused))
> verify_structures(void)
>         BUILD_BUG_ON(offsetof(struct pqi_raid_path_request,
>                 header.response_queue_id) != 4);
>         BUILD_BUG_ON(offsetof(struct pqi_raid_path_request,
> -               header.work_area) != 6);
> +               header.driver_flags) != 6);
>         BUILD_BUG_ON(offsetof(struct pqi_raid_path_request,
>                 request_id) != 8);
>         BUILD_BUG_ON(offsetof(struct pqi_raid_path_request,
> @@ -9495,7 +9252,7 @@ static void __attribute__((unused))
> verify_structures(void)
>         BUILD_BUG_ON(offsetof(struct pqi_aio_path_request,
>                 header.response_queue_id) != 4);
>         BUILD_BUG_ON(offsetof(struct pqi_aio_path_request,
> -               header.work_area) != 6);
> +               header.driver_flags) != 6);
>         BUILD_BUG_ON(offsetof(struct pqi_aio_path_request,
>                 request_id) != 8);
>         BUILD_BUG_ON(offsetof(struct pqi_aio_path_request,
>




^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2021-01-07 23:43       ` Martin Wilck
@ 2021-01-15 21:17         ` Don.Brace
  2021-01-19 10:33           ` John Garry
  0 siblings, 1 reply; 91+ messages in thread
From: Don.Brace @ 2021-01-15 21:17 UTC (permalink / raw)
  To: mwilck, pmenzel, Kevin.Barnett, Scott.Teel, Justin.Lindley,
	Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara, hch,
	joseph.szczypek, POSWALD, jejb, martin.petersen
  Cc: linux-scsi, it+linux-scsi, buczek, gregkh

-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 
Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit

On Tue, 2020-12-15 at 20:23 +0000, Don.Brace@microchip.com wrote:
> Please see answers below. Hope this helps.
>
> -----Original Message-----
> From: Paul Menzel [mailto:pmenzel@molgen.mpg.de]
> Sent: Monday, December 14, 2020 11:54 AM
> To: Don Brace - C33706 <Don.Brace@microchip.com>; Kevin Barnett -
> C33748 <Kevin.Barnett@microchip.com>; Scott Teel - C33730 < 
> Scott.Teel@microchip.com>; Justin Lindley - C33718 < 
> Justin.Lindley@microchip.com>; Scott Benesh - C33703 < 
> Scott.Benesh@microchip.com>; Gerry Morong - C33720 < 
> Gerry.Morong@microchip.com>; Mahesh Rajashekhara - I30583 < 
> Mahesh.Rajashekhara@microchip.com>; hch@infradead.org; 
> joseph.szczypek@hpe.com; POSWALD@suse.com; James E. J. Bottomley < 
> jejb@linux.ibm.com>; Martin K. Petersen <martin.petersen@oracle.com>
> Cc: linux-scsi@vger.kernel.org; it+linux-scsi@molgen.mpg.de; Donald 
> Buczek <buczek@molgen.mpg.de>; Greg KH <gregkh@linuxfoundation.org>
> Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
>
> Dear Don, dear Mahesh,
>
>
> On 10.12.20 at 21:35, Don Brace wrote:
> > From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
> >
> > * Correct scsi-mid-layer sending more requests than
> >    exposed host Q depth causing firmware ASSERT issue.
> >    * Add host Qdepth counter.
>
> This supposedly fixes the regression between Linux 5.4 and 5.9, which 
> we reported in [1].
>
>      kernel: smartpqi 0000:89:00.0: controller is offline: status code 
> 0x6100c
>      kernel: smartpqi 0000:89:00.0: controller offline
>
> Thank you for looking into this issue and fixing it. We are going to 
> test this.
>
> For easily finding these things in the git history or the WWW, it 
> would be great if these log messages could be included (in the 
> future).
> DON> Thanks for your suggestion. We'll add them next time.
>
> Also, that means, that the regression is still present in Linux 5.10, 
> released yesterday, and this commit does not apply to these versions.
>
> DON> They have started 5.10-RC7 now. So possibly 5.11 or 5.12
> depending when all of the patches are applied. The patch in question 
> is among 28 other patches.
>
> Mahesh, do you have any idea, what commit caused the regression and 
> why the issue started to show up?
> DON> The smartpqi driver sets two scsi_host_template member fields:
> .can_queue and .nr_hw_queues. But we have not yet converted to 
> host_tagset. So the queue_depth becomes nr_hw_queues * can_queue, 
> which is more than the hw can support. That can be verified by looking 
> at scsi_host.h.
>         /*
>          * In scsi-mq mode, the number of hardware queues supported by 
> the LLD.
>          *
>          * Note: it is assumed that each hardware queue has a queue 
> depth of
>          * can_queue. In other words, the total queue depth per host
>          * is nr_hw_queues * can_queue. However, for when host_tagset 
> is set,
>          * the total queue depth is can_queue.
>          */
>
> So, until we make this change, the queue_depth change prevents the 
> above issue from happening.
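
(For illustration with made-up numbers: a controller that supports 1024
outstanding commands, exposed with can_queue = 1024 and nr_hw_queues = 16,
lets the block layer issue up to 16 * 1024 = 16384 commands, 16 times
what the firmware can actually handle.)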

can_queue and nr_hw_queues have been set like this for as long as the driver has existed. Why did Paul observe a regression with 5.9?

And why can't you simply set can_queue to (ctrl_info->scsi_ml_can_queue / nr_hw_queues)?

Don: I did this in an internal patch, but this patch seemed to work the best for our driver. HBA performance remained steady when running benchmarks.

Regards,
Martin



^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 10/25] smartpqi: add stream detection
  2021-01-08  0:14   ` Martin Wilck
@ 2021-01-15 21:58     ` Don.Brace
  0 siblings, 0 replies; 91+ messages in thread
From: Don.Brace @ 2021-01-15 21:58 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 
Sent: Thursday, January 7, 2021 6:14 PM
To: Don Brace - C33706 <Don.Brace@microchip.com>; Kevin Barnett - C33748 <Kevin.Barnett@microchip.com>; Scott Teel - C33730 <Scott.Teel@microchip.com>; Justin Lindley - C33718 <Justin.Lindley@microchip.com>; Scott Benesh - C33703 <Scott.Benesh@microchip.com>; Gerry Morong - C33720 <Gerry.Morong@microchip.com>; Mahesh Rajashekhara - I30583 <Mahesh.Rajashekhara@microchip.com>; hch@infradead.org; jejb@linux.vnet.ibm.com; joseph.szczypek@hpe.com; POSWALD@suse.com
Cc: linux-scsi@vger.kernel.org
Subject: Re: [PATCH V3 10/25] smartpqi: add stream detection

On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> * Enhance performance by adding sequential stream detection
>   for R5/R6 sequential write requests.
>   * Reduce stripe lock contention with full-stripe write
>     operations.

I suppose that "stripe lock" is used by the firmware? Could you elaborate a bit more how this technique improves performance?

> +       /*
> +        * If controller does not support AIO RAID{5,6} writes, need
> to send
> +        * requests down non-AIO path.
> +        */
> +       if ((device->raid_level == SA_RAID_5 && !ctrl_info-
> >enable_r5_writes) ||
> +               (device->raid_level == SA_RAID_6 && !ctrl_info-
> >enable_r6_writes))
> +               return true;
> +
> +       lru_index = 0;
> +       oldest_jiffies = INT_MAX;
> +       for (i = 0; i < NUM_STREAMS_PER_LUN; i++) {
> +               pqi_stream_data = &device->stream_data[i];
> +               /*
> +                * Check for adjacent request or request is within
> +                * the previous request.
> +                */
> +               if ((pqi_stream_data->next_lba &&
> +                       rmd.first_block >= pqi_stream_data->next_lba)
> &&
> +                       rmd.first_block <= pqi_stream_data->next_lba
> +
> +                               rmd.block_cnt) {

Here you seem to assume that the previous write had the same block_cnt.
What's the justification for that?

Don:
There is a maximum request size for RAID 5/RAID 6 write requests. So we are assuming that if a sequential stream is detected, the stream is composed of similar request sizes. In fact, for coalescing, the LBAs need to be contiguous or nearly contiguous; otherwise the RAID engine will not wait for a full stripe.

I have updated the patch description accordingly.
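
(A worked example of the adjacency test with made-up numbers: if a
stream entry has next_lba = 1000 and a new request arrives with
first_block = 1004 and block_cnt = 8, then 1000 <= 1004 <= 1000 + 8
holds, so the request is treated as part of the same stream and
next_lba advances to 1004 + 8 = 1012; a request starting at
first_block = 2000 fails the check and instead claims the
least-recently-used stream slot.)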

> +                       pqi_stream_data->next_lba = rmd.first_block +
> +                               rmd.block_cnt;
> +                       pqi_stream_data->last_accessed = jiffies;
> +                       return true;
> +               }
> +
> +               /* unused entry */
> +               if (pqi_stream_data->last_accessed == 0) {
> +                       lru_index = i;
> +                       break;
> +               }
> +
> +               /* Find entry with oldest last accessed time. */
> +               if (pqi_stream_data->last_accessed <= oldest_jiffies)
> {
> +                       oldest_jiffies = pqi_stream_data-
> >last_accessed;
> +                       lru_index = i;
> +               }
> +       }
> +
> +       /* Set LRU entry. */
> +       pqi_stream_data = &device->stream_data[lru_index];
> +       pqi_stream_data->last_accessed = jiffies;
> +       pqi_stream_data->next_lba = rmd.first_block + rmd.block_cnt;
> +
> +       return false;
> +}
> +
> +static int pqi_scsi_queue_command(struct Scsi_Host *shost, struct
> scsi_cmnd *scmd)
>  {
>         int rc;
>         struct pqi_ctrl_info *ctrl_info; @@ -5768,11 +5842,12 @@ 
> static int pqi_scsi_queue_command(struct Scsi_Host *shost,
>                 raid_bypassed = false;
>                 if (device->raid_bypass_enabled &&
>                         !blk_rq_is_passthrough(scmd->request)) {
> -                       rc =
> pqi_raid_bypass_submit_scsi_cmd(ctrl_info, device,
> -                               scmd, queue_group);
> -                       if (rc == 0 || rc == SCSI_MLQUEUE_HOST_BUSY)
> {
> -                               raid_bypassed = true;
> -                               atomic_inc(&device->raid_bypass_cnt);
> +                       if (!pqi_is_parity_write_stream(ctrl_info,
> scmd)) {
> +                               rc =
> pqi_raid_bypass_submit_scsi_cmd(ctrl_info, device, scmd, queue_group);
> +                               if (rc == 0 || rc ==
> SCSI_MLQUEUE_HOST_BUSY) {
> +                                       raid_bypassed = true;
> +                                       atomic_inc(&device-
> >raid_bypass_cnt);
> +                               }
>                         }
>                 }
>                 if (!raid_bypassed)
>




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2021-01-15 21:17         ` Don.Brace
@ 2021-01-19 10:33           ` John Garry
  2021-01-19 14:12             ` Martin Wilck
  2021-02-10 15:27             ` Don.Brace
  0 siblings, 2 replies; 91+ messages in thread
From: John Garry @ 2021-01-19 10:33 UTC (permalink / raw)
  To: Don.Brace, mwilck, pmenzel, Kevin.Barnett, Scott.Teel,
	Justin.Lindley, Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara,
	hch, joseph.szczypek, POSWALD, jejb, martin.petersen
  Cc: linux-scsi, it+linux-scsi, buczek, gregkh, Ming Lei

>>
>> On 10.12.20 at 21:35, Don Brace wrote:
>>> From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
>>>
>>> * Correct scsi-mid-layer sending more requests than
>>>     exposed host Q depth causing firmware ASSERT issue.
>>>     * Add host Qdepth counter.
>>
>> This supposedly fixes the regression between Linux 5.4 and 5.9, which
>> we reported in [1].
>>
>>       kernel: smartpqi 0000:89:00.0: controller is offline: status code
>> 0x6100c
>>       kernel: smartpqi 0000:89:00.0: controller offline
>>
>> Thank you for looking into this issue and fixing it. We are going to
>> test this.
>>
>> For easily finding these things in the git history or the WWW, it
>> would be great if these log messages could be included (in the
>> future).
>> DON> Thanks for your suggestion. We'll add them next time.
>>
>> Also, that means, that the regression is still present in Linux 5.10,
>> released yesterday, and this commit does not apply to these versions.
>>
>> DON> They have started 5.10-RC7 now. So possibly 5.11 or 5.12
>> depending when all of the patches are applied. The patch in question
>> is among 28 other patches.
>>
>> Mahesh, do you have any idea, what commit caused the regression and
>> why the issue started to show up?
>> DON> The smartpqi driver sets two scsi_host_template member fields:
>> .can_queue and .nr_hw_queues. But we have not yet converted to
>> host_tagset. So the queue_depth becomes nr_hw_queues * can_queue,
>> which is more than the hw can support. That can be verified by looking
>> at scsi_host.h.
>>          /*
>>           * In scsi-mq mode, the number of hardware queues supported by
>> the LLD.
>>           *
>>           * Note: it is assumed that each hardware queue has a queue
>> depth of
>>           * can_queue. In other words, the total queue depth per host
>>           * is nr_hw_queues * can_queue. However, for when host_tagset
>> is set,
>>           * the total queue depth is can_queue.
>>           */
>>
>> So, until we make this change, the queue_depth change prevents the
>> above issue from happening.
> 
> can_queue and nr_hw_queues have been set like this for as long as the driver has existed. Why did Paul observe a regression with 5.9?
> 
> And why can't you simply set can_queue to (ctrl_info->scsi_ml_can_queue / nr_hw_queues)?
> 
> Don: I did this in an internal patch, but this patch seemed to work the best for our driver. HBA performance remained steady when running benchmarks.
> 

I guess that this is a fallout from commit 6eb045e092ef ("scsi:
  core: avoid host-wide host_busy counter for scsi_mq"). But that commit 
is correct.

If .can_queue is set to (ctrl_info->scsi_ml_can_queue / nr_hw_queues), 
then blk-mq can send each hw queue only (ctrl_info->scsi_ml_can_queue / 
nr_hw_queues) commands, while it should be possible to send 
ctrl_info->scsi_ml_can_queue commands.

I think that this can alternatively be solved by setting the .host_tagset flag.
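
For illustration, a rough sketch of what that could look like in a
driver's probe path; the names and numbers here are assumptions, not
the actual smartpqi code:

        /*
         * With host_tagset set, blk-mq treats can_queue as the total
         * queue depth shared across all hw queues instead of a
         * per-hw-queue depth.
         */
        shost->nr_hw_queues = num_msix_vectors;  /* hypothetical */
        shost->can_queue = max_outstanding_cmds; /* hypothetical */
        shost->host_tagset = 1;
        rc = scsi_add_host(shost, &pci_dev->dev);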

Thanks,
John



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2021-01-19 10:33           ` John Garry
@ 2021-01-19 14:12             ` Martin Wilck
  2021-01-19 17:43               ` Paul Menzel
  2021-01-20 16:42               ` Donald Buczek
  2021-02-10 15:27             ` Don.Brace
  1 sibling, 2 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-19 14:12 UTC (permalink / raw)
  To: John Garry, Don.Brace, pmenzel, Kevin.Barnett, Scott.Teel,
	Justin.Lindley, Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara,
	hch, joseph.szczypek, POSWALD, jejb, martin.petersen,
	Paul Menzel, Ming Lei
  Cc: linux-scsi, it+linux-scsi, buczek, gregkh

On Tue, 2021-01-19 at 10:33 +0000, John Garry wrote:
> > > 
> > > On 10.12.20 at 21:35, Don Brace wrote:
> > > > From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
> > > > 
> > > > * Correct scsi-mid-layer sending more requests than
> > > >     exposed host Q depth causing firmware ASSERT issue.
> > > >     * Add host Qdepth counter.
> > > 
> > > This supposedly fixes the regression between Linux 5.4 and 5.9,
> > > which
> > > we reported in [1].
> > > 
> > >       kernel: smartpqi 0000:89:00.0: controller is offline:
> > > status code
> > > 0x6100c
> > >       kernel: smartpqi 0000:89:00.0: controller offline
> > > 
> > > Thank you for looking into this issue and fixing it. We are going
> > > to
> > > test this.
> > > 
> > > For easily finding these things in the git history or the WWW, it
> > > would be great if these log messages could be included (in the
> > > future).
> > > DON> Thanks for your suggestion. We'll add them next time.
> > > 
> > > Also, that means, that the regression is still present in Linux
> > > 5.10,
> > > released yesterday, and this commit does not apply to these
> > > versions.
> > > 
> > > DON> They have started 5.10-RC7 now. So possibly 5.11 or 5.12
> > > depending when all of the patches are applied. The patch in
> > > question
> > > is among 28 other patches.
> > > 
> > > Mahesh, do you have any idea, what commit caused the regression
> > > and
> > > why the issue started to show up?
> > > DON> The smartpqi driver sets two scsi_host_template member
> > > fields:
> > > .can_queue and .nr_hw_queues. But we have not yet converted to
> > > host_tagset. So the queue_depth becomes nr_hw_queues * can_queue,
> > > which is more than the hw can support. That can be verified by
> > > looking
> > > at scsi_host.h.
> > >          /*
> > >           * In scsi-mq mode, the number of hardware queues
> > > supported by
> > > the LLD.
> > >           *
> > >           * Note: it is assumed that each hardware queue has a
> > > queue
> > > depth of
> > >           * can_queue. In other words, the total queue depth per
> > > host
> > >           * is nr_hw_queues * can_queue. However, for when
> > > host_tagset
> > > is set,
> > >           * the total queue depth is can_queue.
> > >           */
> > > 
> > > So, until we make this change, the queue_depth change prevents
> > > the
> > > above issue from happening.
> > 
> > can_queue and nr_hw_queues have been set like this for as long as
> > the driver has existed. Why did Paul observe a regression with 5.9?
> > 
> > And why can't you simply set can_queue to (ctrl_info-
> > >scsi_ml_can_queue / nr_hw_queues)?
> > 
> > Don: I did this in an internal patch, but this patch seemed to work
> > the best for our driver. HBA performance remained steady when
> > running benchmarks.

That was a stupid suggestion on my part. Sorry.

> I guess that this is a fallout from commit 6eb045e092ef ("scsi:
>   core: avoid host-wide host_busy counter for scsi_mq"). But that
> commit 
> is correct.

It would be good if someone (Paul?) could verify whether that commit
actually caused the regression they saw.

Looking at that 6eb045e092ef, I notice this hunk:

 
-       busy = atomic_inc_return(&shost->host_busy) - 1;
        if (atomic_read(&shost->host_blocked) > 0) {
-               if (busy)
+               if (scsi_host_busy(shost) > 0)
                        goto starved;

Before 6eb045e092ef, the busy count was incremented with a memory
barrier before looking at "host_blocked". The new code does this instead:

@ -1403,6 +1400,8 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
                spin_unlock_irq(shost->host_lock);
        }
 
+       __set_bit(SCMD_STATE_INFLIGHT, &cmd->state);
+

but it happens *after* the "host_blocked" check. Could that perhaps
have caused the regression?
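
Condensed, the ordering difference looks roughly like this (a
paraphrase of the two hunks above, not the literal kernel code):

        /* before 6eb045e092ef: count ourselves busy first */
        busy = atomic_inc_return(&shost->host_busy) - 1; /* full barrier */
        if (atomic_read(&shost->host_blocked) > 0 && busy)
                goto starved;

        /* after: check host_blocked first ... */
        if (atomic_read(&shost->host_blocked) > 0 &&
            scsi_host_busy(shost) > 0)
                goto starved;
        /* ... and only later become visible as busy */
        __set_bit(SCMD_STATE_INFLIGHT, &cmd->state);

In the window between the check and the __set_bit(), this command is
not yet counted by other CPUs calling scsi_host_busy().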

Thanks
Martin


^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2021-01-19 14:12             ` Martin Wilck
@ 2021-01-19 17:43               ` Paul Menzel
  2021-01-20 16:42               ` Donald Buczek
  1 sibling, 0 replies; 91+ messages in thread
From: Paul Menzel @ 2021-01-19 17:43 UTC (permalink / raw)
  To: Martin Wilck, John Garry, Don.Brace, Kevin.Barnett, Scott.Teel,
	Justin.Lindley, Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara,
	hch, joseph.szczypek, POSWALD, jejb, martin.petersen, Ming Lei
  Cc: linux-scsi, it+linux-scsi, buczek, gregkh

Dear Martin, dear John, dear Don, dear Linux folks,


On 19.01.21 at 15:12, Martin Wilck wrote:
> On Tue, 2021-01-19 at 10:33 +0000, John Garry wrote:
>>>>
>>>> On 10.12.20 at 21:35, Don Brace wrote:
>>>>> From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
>>>>>
>>>>> * Correct scsi-mid-layer sending more requests than
>>>>>      exposed host Q depth causing firmware ASSERT issue.
>>>>>      * Add host Qdepth counter.
>>>>
>>>> This supposedly fixes the regression between Linux 5.4 and 5.9,
>>>> which we reported in [1].
>>>>
>>>>        kernel: smartpqi 0000:89:00.0: controller is offline: status code 0x6100c
>>>>        kernel: smartpqi 0000:89:00.0: controller offline
>>>>
>>>> Thank you for looking into this issue and fixing it. We are going
>>>> to test this.
>>>>
>>>> For easily finding these things in the git history or the WWW, it
>>>> would be great if these log messages could be included (in the
>>>> future).
>>>> DON> Thanks for your suggestion. We'll add them next time.
>>>>
>>>> Also, that means, that the regression is still present in Linux
>>>> 5.10, released yesterday, and this commit does not apply to these
>>>> versions.
>>>>
>>>> DON> They have started 5.10-RC7 now. So possibly 5.11 or 5.12
>>>> depending when all of the patches are applied. The patch in
>>>> question is among 28 other patches.
>>>>
>>>> Mahesh, do you have any idea, what commit caused the regression
>>>> and why the issue started to show up?
>>>> DON> The smartpqi driver sets two scsi_host_template member
>>>> fields:
>>>> .can_queue and .nr_hw_queues. But we have not yet converted to
>>>> host_tagset. So the queue_depth becomes nr_hw_queues * can_queue,
>>>> which is more than the hw can support. That can be verified by
>>>> looking at scsi_host.h.
>>>>           /*
>>>>            * In scsi-mq mode, the number of hardware queues supported by the LLD.
>>>>            *
>>>>            * Note: it is assumed that each hardware queue has a queue depth of
>>>>            * can_queue. In other words, the total queue depth per host
>>>>            * is nr_hw_queues * can_queue. However, for when host_tagset is set,
>>>>            * the total queue depth is can_queue.
>>>>            */
>>>>
>>>> So, until we make this change, the queue_depth change prevents
>>>> the above issue from happening.
>>>
>>> can_queue and nr_hw_queues have been set like this for as long as
>>> the driver has existed. Why did Paul observe a regression with 5.9?
>>>
>>> And why can't you simply set can_queue to (ctrl_info-
>>>> scsi_ml_can_queue / nr_hw_queues)?
>>>
>>> Don: I did this in an internal patch, but this patch seemed to work
>>> the best for our driver. HBA performance remained steady when
>>> running benchmarks.
> 
> That was a stupid suggestion on my part. Sorry.
> 
>> I guess that this is a fallout from commit 6eb045e092ef ("scsi:
>>    core: avoid host-wide host_busy counter for scsi_mq"). But that
>> commit is correct.

John, thank you very much for taking the time to point this out. The 
commit first showed up in Linux 5.5-rc1. (The host template flag 
`host_tagset` was introduced in Linux 5.10-rc1.)

> It would be good if someone (Paul?) could verify whether that commit
> actually caused the regression they saw.
> 
> Looking at that 6eb045e092ef, I notice this hunk:
>   
> -       busy = atomic_inc_return(&shost->host_busy) - 1;
>          if (atomic_read(&shost->host_blocked) > 0) {
> -               if (busy)
> +               if (scsi_host_busy(shost) > 0)
>                          goto starved;
> 
> Before 6eb045e092ef, the busy count was incremented with a memory
> barrier before looking at "host_blocked". The new code does this instead:
> 
> @ -1403,6 +1400,8 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
>                  spin_unlock_irq(shost->host_lock);
>          }
>   
> +       __set_bit(SCMD_STATE_INFLIGHT, &cmd->state);
> +
> 
> but it happens *after* the "host_blocked" check. Could that perhaps
> have caused the regression?

As we only have production systems with this issue, and Don wrote that 
the Microchip team was able to reproduce the issue, it'd be great if Don 
and his team could test whether commit 6eb045e092ef introduced the 
regression.

Also, we still need a path forward for fixing this in the Linux 5.10 
series. Because the issue has dragged on for so long, the 5.9 series 
has already reached end of life.


Kind regards,

Paul

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2021-01-19 14:12             ` Martin Wilck
  2021-01-19 17:43               ` Paul Menzel
@ 2021-01-20 16:42               ` Donald Buczek
  2021-01-20 17:03                 ` Don.Brace
  2021-01-20 18:35                 ` Martin Wilck
  1 sibling, 2 replies; 91+ messages in thread
From: Donald Buczek @ 2021-01-20 16:42 UTC (permalink / raw)
  To: Martin Wilck, John Garry, Don.Brace, pmenzel, Kevin.Barnett,
	Scott.Teel, Justin.Lindley, Scott.Benesh, Gerry.Morong,
	Mahesh.Rajashekhara, hch, joseph.szczypek, POSWALD, jejb,
	martin.petersen, Ming Lei
  Cc: linux-scsi, it+linux-scsi, gregkh

On 19.01.21 15:12, Martin Wilck wrote:
> On Tue, 2021-01-19 at 10:33 +0000, John Garry wrote:
>>>>
>>>> On 10.12.20 at 21:35, Don Brace wrote:
>>>>> From: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
>>>>>
>>>>> * Correct scsi-mid-layer sending more requests than
>>>>>      exposed host Q depth causing firmware ASSERT issue.
>>>>>      * Add host Qdepth counter.
>>>>
>>>> This supposedly fixes the regression between Linux 5.4 and 5.9,
>>>> which
>>>> we reported in [1].
>>>>
>>>>        kernel: smartpqi 0000:89:00.0: controller is offline:
>>>> status code
>>>> 0x6100c
>>>>        kernel: smartpqi 0000:89:00.0: controller offline
>>>>
>>>> Thank you for looking into this issue and fixing it. We are going
>>>> to
>>>> test this.
>>>>
>>>> For easily finding these things in the git history or the WWW, it
>>>> would be great if these log messages could be included (in the
>>>> future).
>>>> DON> Thanks for your suggestion. We'll add them next time.
>>>>
>>>> Also, that means, that the regression is still present in Linux
>>>> 5.10,
>>>> released yesterday, and this commit does not apply to these
>>>> versions.
>>>>
>>>> DON> They have started 5.10-RC7 now. So possibly 5.11 or 5.12
>>>> depending when all of the patches are applied. The patch in
>>>> question
>>>> is among 28 other patches.
>>>>
>>>> Mahesh, do you have any idea, what commit caused the regression
>>>> and
>>>> why the issue started to show up?
>>>> DON> The smartpqi driver sets two scsi_host_template member
>>>> fields:
>>>> .can_queue and .nr_hw_queues. But we have not yet converted to
>>>> host_tagset. So the queue_depth becomes nr_hw_queues * can_queue,
>>>> which is more than the hw can support. That can be verified by
>>>> looking
>>>> at scsi_host.h.
>>>>           /*
>>>>            * In scsi-mq mode, the number of hardware queues
>>>> supported by
>>>> the LLD.
>>>>            *
>>>>            * Note: it is assumed that each hardware queue has a
>>>> queue
>>>> depth of
>>>>            * can_queue. In other words, the total queue depth per
>>>> host
>>>>            * is nr_hw_queues * can_queue. However, for when
>>>> host_tagset
>>>> is set,
>>>>            * the total queue depth is can_queue.
>>>>            */
>>>>
>>>> So, until we make this change, the queue_depth change prevents
>>>> the
>>>> above issue from happening.
>>>
>>> can_queue and nr_hw_queues have been set like this for as long as
>>> the driver has existed. Why did Paul observe a regression with 5.9?
>>>
>>> And why can't you simply set can_queue to (ctrl_info-
>>>> scsi_ml_can_queue / nr_hw_queues)?
>>>
>>> Don: I did this in an internal patch, but this patch seemed to work
>>> the best for our driver. HBA performance remained steady when
>>> running benchmarks.
> 
> That was a stupid suggestion on my part. Sorry.
> 
>> I guess that this is a fallout from commit 6eb045e092ef ("scsi:
>>    core: avoid host-wide host_busy counter for scsi_mq"). But that
>> commit
>> is correct.
> 
> It would be good if someone (Paul?) could verify whether that commit
> actually caused the regression they saw.

We can reliably trigger the issue with a certain load pattern on certain hardware.

I've compiled 6eb045e092ef and got (as with other affected kernels) "controller is offline: status code 0x6100c" after 15 minutes of the test load.
I've compiled 6eb045e092ef^ and the load has been running for 3 1/2 hours now.

So you hit it.

> Looking at that 6eb045e092ef, I notice this hunk:
> 
>   
> -       busy = atomic_inc_return(&shost->host_busy) - 1;
>          if (atomic_read(&shost->host_blocked) > 0) {
> -               if (busy)
> +               if (scsi_host_busy(shost) > 0)
>                          goto starved;
> 
> Before 6eb045e092ef, the busy count was incremented with a memory
> barrier before looking at "host_blocked". The new code does this instead:
> 
> @ -1403,6 +1400,8 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
>                  spin_unlock_irq(shost->host_lock);
>          }
>   
> +       __set_bit(SCMD_STATE_INFLIGHT, &cmd->state);
> +
> 
> but it happens *after* the "host_blocked" check. Could that perhaps
> have caused the regression?

I'm not into this and can't comment on that. But if you need me to test any patch for verification, I can certainly do that.

Best
   Donald

>
> 
> Thanks
> Martin
> 

^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2021-01-20 16:42               ` Donald Buczek
@ 2021-01-20 17:03                 ` Don.Brace
  2021-01-20 18:35                 ` Martin Wilck
  1 sibling, 0 replies; 91+ messages in thread
From: Don.Brace @ 2021-01-20 17:03 UTC (permalink / raw)
  To: buczek, mwilck, john.garry, pmenzel, Kevin.Barnett, Scott.Teel,
	Justin.Lindley, Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara,
	hch, joseph.szczypek, POSWALD, jejb, martin.petersen, ming.lei
  Cc: linux-scsi, it+linux-scsi, gregkh

-----Original Message-----
From: Donald Buczek [mailto:buczek@molgen.mpg.de] 
Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit

>
> It would be good if someone (Paul?) could verify whether that commit 
> actually caused the regression they saw.

We can reliably trigger the issue with a certain load pattern on certain hardware.

I've compiled 6eb045e092ef and got (as with other affected kernels) "controller is offline: status code 0x6100c" after 15 minutes of the test load.
I've compiled 6eb045e092ef^ and the load has been running for 3 1/2 hours now.

So you hit it.

Don: Good news, I was starting my own testing.
Thanks for your help.

> Looking at that 6eb045e092ef, I notice this hunk:
>
>
> -       busy = atomic_inc_return(&shost->host_busy) - 1;
>          if (atomic_read(&shost->host_blocked) > 0) {
> -               if (busy)
> +               if (scsi_host_busy(shost) > 0)
>                          goto starved;
>
> Before 6eb045e092ef, the busy count was incremented with a memory 
> barrier before looking at "host_blocked". The new code does this instead:
>
> @ -1403,6 +1400,8 @@ static inline int scsi_host_queue_ready(struct request_queue *q,
>                  spin_unlock_irq(shost->host_lock);
>          }
>
> +       __set_bit(SCMD_STATE_INFLIGHT, &cmd->state);
> +
>
> but it happens *after* the "host_blocked" check. Could that perhaps 
> have caused the regression?

I'm not into this and can't comment on that. But if you need me to test any patch for verification, I can certainly do that.

Best
   Donald

>
>
> Thanks
> Martin
>

^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2021-01-20 16:42               ` Donald Buczek
  2021-01-20 17:03                 ` Don.Brace
@ 2021-01-20 18:35                 ` Martin Wilck
  1 sibling, 0 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-20 18:35 UTC (permalink / raw)
  To: Donald Buczek, John Garry, Don.Brace, pmenzel, Kevin.Barnett,
	Scott.Teel, Justin.Lindley, Scott.Benesh, Gerry.Morong,
	Mahesh.Rajashekhara, hch, joseph.szczypek, POSWALD, jejb,
	martin.petersen, Ming Lei
  Cc: linux-scsi, it+linux-scsi, gregkh

On Wed, 2021-01-20 at 17:42 +0100, Donald Buczek wrote:
> 
> I'm not into this and can't comment on that. But if you need me to
> test any patch for verification, I can certainly do that.

I'll send an *experimental* patch.

Thanks
Martin



^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 06/25] smartpqi: add support for BMIC sense feature cmd and feature bits
  2021-01-07 16:44   ` Martin Wilck
  2021-01-11 17:22     ` Don.Brace
@ 2021-01-22 16:45     ` Don.Brace
  2021-01-22 19:04       ` Martin Wilck
  1 sibling, 1 reply; 91+ messages in thread
From: Don.Brace @ 2021-01-22 16:45 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 
Subject: Re: [PATCH V3 06/25] smartpqi: add support for BMIC sense feature cmd and feature bits


In general: This patch contains a lot of whitespace, indentation, and minor comment formatting changes which should rather go into a separate patch IMHO. This one is big enough without them.

Don: Moved formatting HUNKs to smartpqi-align-code-with-oob-driver.

Further remarks below.

> @@ static int pqi_ctrl_init_resume(struct pqi_ctrl_info *ctrl_info)
>
>         pqi_start_heartbeat_timer(ctrl_info);
>
> +       if (ctrl_info->enable_r5_writes || ctrl_info-
> >enable_r6_writes) {
> +               rc = pqi_get_advanced_raid_bypass_config(ctrl_info);
> +               if (rc) {
> +                       dev_err(&ctrl_info->pci_dev->dev,
> +                               "error obtaining advanced RAID bypass
> configuration\n");
> +                       return rc;

Do you need to error out here? Can't you simply unset the enable_rX_writes feature?

Don: If the call to pqi_get_advanced_raid_bypass_config fails, then there are most likely some serious issues, so we abandon the initialization process.


Regards
Martin




^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 06/25] smartpqi: add support for BMIC sense feature cmd and feature bits
  2021-01-22 16:45     ` Don.Brace
@ 2021-01-22 19:04       ` Martin Wilck
  0 siblings, 0 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-22 19:04 UTC (permalink / raw)
  To: Don.Brace, Kevin.Barnett, Scott.Teel, Justin.Lindley,
	Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Fri, 2021-01-22 at 16:45 +0000, Don.Brace@microchip.com wrote:
> 
> > @@ static int pqi_ctrl_init_resume(struct pqi_ctrl_info *ctrl_info)
> > 
> >         pqi_start_heartbeat_timer(ctrl_info);
> > 
> > +       if (ctrl_info->enable_r5_writes || ctrl_info-
> > > enable_r6_writes) {
> > +               rc =
> > pqi_get_advanced_raid_bypass_config(ctrl_info);
> > +               if (rc) {
> > +                       dev_err(&ctrl_info->pci_dev->dev,
> > +                               "error obtaining advanced RAID
> > bypass
> > configuration\n");
> > +                       return rc;
> 
> Do you need to error out here? Can't you simply unset the
> enable_rX_writes feature?
> 
> Don: If the call to pqi_get_advanced_raid_bypass_config fails, then
> there are most likely some serious issues, so we abandon the
> initialization process.

Ok, understood. A reader who isn't fully familiar with the HW
properties (like myself) may think that "advanced_raid_bypass"
is an optional performance-related feature and that the controller
could be operational without it. If that's not the case, fine.

Martin



^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 08/25] smartpqi: add support for long firmware version
  2021-01-07 16:45   ` Martin Wilck
  2021-01-11 22:25     ` Don.Brace
@ 2021-01-22 20:01     ` Don.Brace
  1 sibling, 0 replies; 91+ messages in thread
From: Don.Brace @ 2021-01-22 20:01 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 
Subject: Re: [PATCH V3 08/25] smartpqi: add support for long firmware version

> +               snprintf(ctrl_info->firmware_version +
> +                       strlen(ctrl_info->firmware_version),
> +                       sizeof(ctrl_info->firmware_version),

This looks wrong. I suppose a real overflow can't happen, but shouldn't it rather be written like this?

snprintf(ctrl_info->firmware_version +
		sizeof(identify->firmware_version_short),
	sizeof(ctrl_info->firmware_version)
		- sizeof(identify->firmware_version_short),
	"-%u", ...)

Don: I updated the code to match your suggestion.
Thank you for your review,
Don
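
As a generic illustration of the underlying rule (a plain C sketch with
hypothetical names, not the driver code): when appending at an offset,
the size argument to snprintf() must be the space remaining after the
offset, not the total buffer size.

        #include <stdio.h>
        #include <string.h>

        /* hypothetical helper: append "-<build>" to a version string */
        static void append_build(char *buf, size_t size, unsigned int build)
        {
                size_t off = strlen(buf);

                /*
                 * Remaining space is size - off; passing the full size
                 * here would let snprintf() write past the end of buf
                 * whenever off > 0.
                 */
                snprintf(buf + off, size - off, "-%u", build);
        }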

> +                       "-%u",
> +                       get_unaligned_le16(&identify-
> > firmware_build_number));
> +       }
>
>         memcpy(ctrl_info->model, identify->product_id,
>                 sizeof(identify->product_id)); @@ -9607,13 +9615,23 @@ 
> static void __attribute__((unused))
> verify_structures(void)
>         BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
>                 configuration_signature) != 1);
>         BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
> -               firmware_version) != 5);
> +               firmware_version_short) != 5);
>         BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
>                 extended_logical_unit_count) != 154);
>         BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
>                 firmware_build_number) != 190);
> +       BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
> +               vendor_id) != 200);
> +       BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
> +               product_id) != 208);
> +       BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
> +               extra_controller_flags) != 286);
>         BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
>                 controller_mode) != 292);
> +       BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
> +               spare_part_number) != 293);
> +       BUILD_BUG_ON(offsetof(struct bmic_identify_controller,
> +               firmware_version_long) != 325);
>
>         BUILD_BUG_ON(offsetof(struct bmic_identify_physical_device,
>                 phys_bay_in_box) != 115);
>




^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 21/25] smartpqi: add additional logging for LUN resets
  2021-01-08  0:27   ` Martin Wilck
@ 2021-01-25 17:09     ` Don.Brace
  0 siblings, 0 replies; 91+ messages in thread
From: Don.Brace @ 2021-01-25 17:09 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi


From: Martin Wilck [mailto:mwilck@suse.com] 
Subject: Re: [PATCH V3 21/25] smartpqi: add additional logging for LUN resets


The patch description is not complete, as the patch also changes some timings. Two remarks below.

Cheers,
Martin

> ---
>  drivers/scsi/smartpqi/smartpqi_init.c |  125
> +++++++++++++++++++++++----------
>  1 file changed, 89 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/scsi/smartpqi/smartpqi_init.c
> b/drivers/scsi/smartpqi/smartpqi_init.c
> index 6b624413c8e6..1c51a59f1da6 100644
> --- a/drivers/scsi/smartpqi/smartpqi_init.c
> +++ b/drivers/scsi/smartpqi/smartpqi_init.c
> @@ -84,7 +84,7 @@ static void pqi_ofa_setup_host_buffer(struct 
> pqi_ctrl_info *ctrl_info);  static void 
> pqi_ofa_free_host_buffer(struct pqi_ctrl_info *ctrl_info);  static int 
> pqi_ofa_host_memory_update(struct pqi_ctrl_info *ctrl_info);  static 
> int pqi_device_wait_for_pending_io(struct pqi_ctrl_info *ctrl_info,
> -       struct pqi_scsi_dev *device, unsigned long timeout_secs);
> +       struct pqi_scsi_dev *device, unsigned long timeout_msecs);
>
>  /* for flags argument to pqi_submit_raid_request_synchronous() */
>  #define PQI_SYNC_FLAGS_INTERRUPTABLE   0x1
> @@ -335,11 +335,34 @@ static void pqi_wait_if_ctrl_blocked(struct
> pqi_ctrl_info *ctrl_info)
>         atomic_dec(&ctrl_info->num_blocked_threads);
>  }
>
> +#define PQI_QUIESE_WARNING_TIMEOUT_SECS                10

Did you mean QUIESCE?

Don: Yes, corrected. Thank you.

>
>         pqi_start_io(ctrl_info, &ctrl_info-
> >queue_groups[PQI_DEFAULT_QUEUE_GROUP], RAID_PATH,
>                 io_request);
> @@ -5958,29 +6007,33 @@ static int pqi_lun_reset(struct pqi_ctrl_info
> *ctrl_info,
>         return rc;
>  }
>
> -#define PQI_LUN_RESET_RETRIES                  3
> -#define PQI_LUN_RESET_RETRY_INTERVAL_MSECS     10000
> -#define PQI_LUN_RESET_PENDING_IO_TIMEOUT_SECS  120
> +#define PQI_LUN_RESET_RETRIES                          3
> +#define PQI_LUN_RESET_RETRY_INTERVAL_MSECS             (10 * 1000)
> +#define PQI_LUN_RESET_PENDING_IO_TIMEOUT_MSECS         (10 * 60 *
> 1000)

10 minutes? Isn't that a bit much?

Don: 10 minutes seems to hold true for our worst-case scenarios.

> +#define PQI_LUN_RESET_FAILED_PENDING_IO_TIMEOUT_MSECS  (2 * 60 *
> 1000)

Why wait less long after a failure?

Don: If the reset TMF fails, the driver waits two more minutes for I/O to be flushed out. After this timeout, it returns a failure if the I/O has not been flushed. In many tests, the I/O eventually returns.




^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 22/25] smartpqi: update enclosure identifier in sysf
  2021-01-08  0:30   ` Martin Wilck
@ 2021-01-25 17:13     ` Don.Brace
  2021-01-25 19:44       ` Martin Wilck
  0 siblings, 1 reply; 91+ messages in thread
From: Don.Brace @ 2021-01-25 17:13 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi


From: Martin Wilck [mailto:mwilck@suse.com] 
Subject: Re: [PATCH V3 22/25] smartpqi: update enclosure identifier in sysf

> @@ -1841,7 +1841,6 @@ static void pqi_dev_info(struct pqi_ctrl_info 
> *ctrl_info,  static void pqi_scsi_update_device(struct pqi_scsi_dev 
> *existing_device,
>         struct pqi_scsi_dev *new_device)  {
> -       existing_device->devtype = new_device->devtype;
>         existing_device->device_type = new_device->device_type;
>         existing_device->bus = new_device->bus;
>         if (new_device->target_lun_valid) {
>

I don't get this. Why was it wrong to update the devtype field?

Don: From the patch author...
If we don't remove that statement, the following issue will crop up.

During initial device enumeration, the devtype attribute of the device (in this case an enclosure device) is filled in slave_configure.
But whenever a rescan occurs, the firmware returns zero for this field, and the valid devtype is overwritten by zero when device attributes are updated in pqi_scsi_update_device. Because of this, lsscsi output shows wrong values.



^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 22/25] smartpqi: update enclosure identifier in sysf
  2021-01-25 17:13     ` Don.Brace
@ 2021-01-25 19:44       ` Martin Wilck
  2021-01-25 20:36         ` Don.Brace
  0 siblings, 1 reply; 91+ messages in thread
From: Martin Wilck @ 2021-01-25 19:44 UTC (permalink / raw)
  To: Don.Brace, Kevin.Barnett, Scott.Teel, Justin.Lindley,
	Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Mon, 2021-01-25 at 17:13 +0000, Don.Brace@microchip.com wrote:
> 
> From: Martin Wilck [mailto:mwilck@suse.com] 
> Subject: Re: [PATCH V3 22/25] smartpqi: update enclosure identifier
> in sysf
> 
> > @@ -1841,7 +1841,6 @@ static void pqi_dev_info(struct pqi_ctrl_info
> > *ctrl_info,  static void pqi_scsi_update_device(struct pqi_scsi_dev
> > *existing_device,
> >         struct pqi_scsi_dev *new_device)  {
> > -       existing_device->devtype = new_device->devtype;
> >         existing_device->device_type = new_device->device_type;
> >         existing_device->bus = new_device->bus;
> >         if (new_device->target_lun_valid) {
> > 
> 
> I don't get this. Why was it wrong to update the devtype field?
> 
> Don: From the patch author...
> If we don't remove that statement, the following issue will crop up.
> 
> During initial device enumeration, the devtype attribute of the
> device (in this case an enclosure device) is filled in
> slave_configure.
> But whenever a rescan occurs, the firmware returns zero for this
> field, and the valid devtype is overwritten by zero when device
> attributes are updated in pqi_scsi_update_device. Because of this,
> lsscsi output shows wrong values.

Thanks. It would be very helpful for reviewers to add comments in
cases like this.

Regards,
Martin

> 
> 



^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 22/25] smartpqi: update enclosure identifier in sysf
  2021-01-25 19:44       ` Martin Wilck
@ 2021-01-25 20:36         ` Don.Brace
  0 siblings, 0 replies; 91+ messages in thread
From: Don.Brace @ 2021-01-25 20:36 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 
Subject: Re: [PATCH V3 22/25] smartpqi: update enclosure identifier in sysf

> From: Martin Wilck [mailto:mwilck@suse.com]
> Subject: Re: [PATCH V3 22/25] smartpqi: update enclosure identifier in 
> sysf
>
> > @@ -1841,7 +1841,6 @@ static void pqi_dev_info(struct pqi_ctrl_info 
> > *ctrl_info,  static void pqi_scsi_update_device(struct pqi_scsi_dev 
> > *existing_device,
> >         struct pqi_scsi_dev *new_device)  {
> > -       existing_device->devtype = new_device->devtype;
> >         existing_device->device_type = new_device->device_type;
> >         existing_device->bus = new_device->bus;
> >         if (new_device->target_lun_valid) {
> >
>
> I don't get this. Why was it wrong to update the devtype field?
>
> Don: From the patch author...
> If we don't remove that statement, the following issue will crop up.
>
> During initial device enumeration, the devtype attribute of the
> device (in this case an enclosure device) is filled in slave_configure.
> But whenever a rescan occurs, the firmware returns zero for this
> field, and the valid devtype is overwritten by zero when device
> attributes are updated in pqi_scsi_update_device. Because of this,
> lsscsi output shows wrong values.

Thanks. It would be very helpful for reviewers to add comments in cases like this.

Regards,
Martin

Don: I updated the patch description accordingly.
Thanks for all your hard work.
>
>



^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 23/25] smartpqi: correct system hangs when resuming from hibernation
  2021-01-08  0:34   ` Martin Wilck
@ 2021-01-27 17:39     ` Don.Brace
  2021-01-27 17:45       ` Martin Wilck
  0 siblings, 1 reply; 91+ messages in thread
From: Don.Brace @ 2021-01-27 17:39 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 
Subject: Re: [PATCH V3 23/25] smartpqi: correct system hangs when resuming from hibernation

> @@ -8688,6 +8688,11 @@ static __maybe_unused int pqi_resume(struct 
> pci_dev *pci_dev)
>         pci_set_power_state(pci_dev, PCI_D0);
>         pci_restore_state(pci_dev);
>
> +       pqi_ctrl_unblock_device_reset(ctrl_info);
> +       pqi_ctrl_unblock_requests(ctrl_info);
> +       pqi_scsi_unblock_requests(ctrl_info);
> +       pqi_ctrl_unblock_scan(ctrl_info);
> +
>         return pqi_ctrl_init_resume(ctrl_info);  }

Like I said in my comments on 14/25:

pqi_ctrl_unblock_scan() and pqi_ctrl_unblock_device_reset() expand to mutex_unlock(). Unlocking an already-unlocked mutex is wrong, and a mutex has to be unlocked by the task that owns the lock. How can you be sure that these conditions are met here?

Don: I updated this patch to:
@@ -8661,9 +8661,17 @@ static __maybe_unused int pqi_resume(struct pci_dev *pci_dev)
                return 0;
        }
 
+       pqi_ctrl_block_device_reset(ctrl_info);
+       pqi_ctrl_block_scan(ctrl_info);
+
        pci_set_power_state(pci_dev, PCI_D0);
        pci_restore_state(pci_dev);
 
+       pqi_ctrl_unblock_device_reset(ctrl_info);
+       pqi_ctrl_unblock_requests(ctrl_info);
+       pqi_scsi_unblock_requests(ctrl_info);
+       pqi_ctrl_unblock_scan(ctrl_info);
+
        return pqi_ctrl_init_resume(ctrl_info);
 }
Don: So the mutexes are set and unset in the same task. I updated patch 14 accordingly, but I'll reply in that patch also. Is there a specific driver that initiates suspend/resume? Like ACPI? Or some other PCI driver?

Thanks for your hard work on these patches
Don





^ permalink raw reply	[flat|nested] 91+ messages in thread

* Re: [PATCH V3 23/25] smartpqi: correct system hangs when resuming from hibernation
  2021-01-27 17:39     ` Don.Brace
@ 2021-01-27 17:45       ` Martin Wilck
  0 siblings, 0 replies; 91+ messages in thread
From: Martin Wilck @ 2021-01-27 17:45 UTC (permalink / raw)
  To: Don.Brace, Kevin.Barnett, Scott.Teel, Justin.Lindley,
	Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara, hch, jejb,
	joseph.szczypek, POSWALD
  Cc: linux-scsi

On Wed, 2021-01-27 at 17:39 +0000, Don.Brace@microchip.com wrote:
> -----Original Message-----
> From: Martin Wilck [mailto:mwilck@suse.com] 
> Subject: Re: [PATCH V3 23/25] smartpqi: correct system hangs when
> resuming from hibernation
> 
> > @@ -8688,6 +8688,11 @@ static __maybe_unused int pqi_resume(struct 
> > pci_dev *pci_dev)
> >         pci_set_power_state(pci_dev, PCI_D0);
> >         pci_restore_state(pci_dev);
> > 
> > +       pqi_ctrl_unblock_device_reset(ctrl_info);
> > +       pqi_ctrl_unblock_requests(ctrl_info);
> > +       pqi_scsi_unblock_requests(ctrl_info);
> > +       pqi_ctrl_unblock_scan(ctrl_info);
> > +
> >         return pqi_ctrl_init_resume(ctrl_info);  }
> 
> Like I said in my comments on 14/25:
> 
> pqi_ctrl_unblock_scan() and pqi_ctrl_unblock_device_reset() expand to
> mutex_unlock(). Unlocking an already-unlocked mutex is wrong, and a
> mutex has to be unlocked by the task that owns the lock. How can you
> be sure that these conditions are met here?
> 
> Don: I updated this patch to:
> @@ -8661,9 +8661,17 @@ static __maybe_unused int pqi_resume(struct
> pci_dev *pci_dev)
>                 return 0;
>         }
>  
> +       pqi_ctrl_block_device_reset(ctrl_info);
> +       pqi_ctrl_block_scan(ctrl_info);
> +
>         pci_set_power_state(pci_dev, PCI_D0);
>         pci_restore_state(pci_dev);
>  
> +       pqi_ctrl_unblock_device_reset(ctrl_info);
> +       pqi_ctrl_unblock_requests(ctrl_info);
> +       pqi_scsi_unblock_requests(ctrl_info);
> +       pqi_ctrl_unblock_scan(ctrl_info);
> +
>         return pqi_ctrl_init_resume(ctrl_info);
>  }
> Don: So the mutexes are set and unset in the same task.

Yes, that looks much better to me.

>  I updated the other patch 14 accordingly, but I'll reply in that
> patch also. Is there a specific driver that initiates suspend/resume?
> Like acpi? Or some other pci driver?

I'm no expert on suspend/resume. I think it's platform-dependent; you
shouldn't make any specific assumptions about what the platform actually
does.

Regards
Martin




^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 17/25] smartpqi: change timing of release of QRM memory during OFA
  2021-01-08  0:14   ` Martin Wilck
@ 2021-01-27 17:46     ` Don.Brace
  0 siblings, 0 replies; 91+ messages in thread
From: Don.Brace @ 2021-01-27 17:46 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 
Subject: Re: [PATCH V3 17/25] smartpqi: change timing of release of QRM memory during OFA


I don't understand how the patch description relates to the actual change. With the patch, the buffers are released just like before, only some instructions later. So apparently, without this patch, the OFA memory had been released prematurely?
Don: Yes. So when I broke up patch smartpqi-fix-driver-synchronization-issues, I ended up squashing this patch into a new patch called smartpqi-update-ofa-management. Thanks for your review.

Anyway,

Reviewed-by: Martin Wilck <mwilck@suse.com>




^ permalink raw reply	[flat|nested] 91+ messages in thread

* RE: [PATCH V3 14/25] smartpqi: fix driver synchronization issues
  2021-01-07 23:32   ` Martin Wilck
  2021-01-08  4:13     ` Martin K. Petersen
  2021-01-15 21:13     ` Don.Brace
@ 2021-01-27 23:01     ` Don.Brace
       [not found]       ` <c1e6b199f5ccda5ccec5223dfcbd1fba22171c86.camel@suse.com>
  2 siblings, 1 reply; 91+ messages in thread
From: Don.Brace @ 2021-01-27 23:01 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi

-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 
Subject: Re: [PATCH V3 14/25] smartpqi: fix driver synchronization issues

On Thu, 2020-12-10 at 14:35 -0600, Don Brace wrote:
> From: Kevin Barnett <kevin.barnett@microchip.com>
>
> * synchronize: LUN resets, shutdowns, suspend, hibernate,
>   OFA, and controller offline events.
> * prevent I/O during the above conditions.

This description is too terse for a complex patch like this.


The patch does not only address synchronization issues; it also changes
various other things that (given the size of the patch) should better
be handled elsewhere. I believe this patch could easily be split into
4 or more separate independent patches, which would ease review
considerably. I've added remarks below where I thought one or more
hunks could be separated out.

Don: I'll start answering questions. The overall answer is that I split this patch into 10 patches:
+ smartpqi-remove-timeouts-from-internal-cmds
+ smartpqi-add-support-for-wwid
+ smartpqi-update-event-handler
+ smartpqi-update-soft-reset-management-for-OFA
+ smartpqi-synchronize-device-resets-with-mutex
+ smartpqi-update-suspend-resume-and-shutdown
+ smartpqi-update-raid-bypass-handling
+ smartpqi-update-ofa-management
+ smartpqi-update-device-scan-operations
+ smartpqi-fix-driver-synchronization-issues

Don: I'll answer the questions below and give the name of the new patch that each question belongs to.

Thanks for your hard work. It really helps a lot.

>  #define PQI_FIRMWARE_FEATURE_RAID_BYPASS_ON_ENCRYPTED_NVME     15
> -#define PQI_FIRMWARE_FEATURE_MAXIMUM                           15
> +#define PQI_FIRMWARE_FEATURE_UNIQUE_WWID_IN_REPORT_PHYS_LUN    16
> +#define PQI_FIRMWARE_FEATURE_MAXIMUM                           16

What does the "unique WWID" feature have to do with synchronization
issues? This part should have gone into a separate patch.

DON: Correct, I moved all of the corresponding WWID HUNKs into patch:
smartpqi-add-support-for-wwid

> +static inline void pqi_ctrl_block_scan(struct pqi_ctrl_info
> *ctrl_info)
> +{
> +       ctrl_info->scan_blocked = true;
> +       mutex_lock(&ctrl_info->scan_mutex);
> +}

What do you need scan_blocked for? Can't you simply use
mutex_is_locked(&ctrl_info->scan_mutex)?
OTOH, using a mutex for this kind of condition feels dangerous
to me, see remark about ofa_mutex() below.
Have you considered using a completion for this?

Don: This mutex is used for application-initiated REGNEWD, driver initialization, controller restart after Online Firmware Activation (OFA), and resume after suspend.

I believe that all of these operations run in their own threads and can be paused briefly to avoid updating any driver state. A wait_for_completion can trigger a hung-task stack trace if the wait period is over 120 seconds. The author's intent was to provide the simplest mechanism to pause these operations.

I moved this functionality into patch 
smartpqi-update-suspend-resume-and-shutdown. 
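
For reference, the variant Martin suggests would look roughly like this
(a sketch only; whether it is safe depends on who can hold scan_mutex
at the time the state is queried):

        static inline bool pqi_ctrl_scan_blocked(struct pqi_ctrl_info *ctrl_info)
        {
                return mutex_is_locked(&ctrl_info->scan_mutex);
        }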

> +
> +static inline void pqi_ctrl_unblock_scan(struct pqi_ctrl_info
> *ctrl_info)
> +{
> +       ctrl_info->scan_blocked = false;
> +       mutex_unlock(&ctrl_info->scan_mutex);
> +}
> +
> +static inline bool pqi_ctrl_scan_blocked(struct pqi_ctrl_info
> *ctrl_info)
> +{
> +       return ctrl_info->scan_blocked;
> +}
> +
>  static inline void pqi_ctrl_block_device_reset(struct pqi_ctrl_info
> *ctrl_info)
>  {
> -       ctrl_info->block_device_reset = true;
> +       mutex_lock(&ctrl_info->lun_reset_mutex);
> +}
> +
> +static inline void pqi_ctrl_unblock_device_reset(struct pqi_ctrl_info *ctrl_info)
> +{
> +       mutex_unlock(&ctrl_info->lun_reset_mutex);
> +}
> +
> +static inline void pqi_scsi_block_requests(struct pqi_ctrl_info *ctrl_info)
> +{
> +       struct Scsi_Host *shost;
> +       unsigned int num_loops;
> +       int msecs_sleep;
> +
> +       shost = ctrl_info->scsi_host;
> +
> +       scsi_block_requests(shost);
> +
> +       num_loops = 0;
> +       msecs_sleep = 20;
> +       while (scsi_host_busy(shost)) {
> +               num_loops++;
> +               if (num_loops == 10)
> +                       msecs_sleep = 500;
> +               msleep(msecs_sleep);
> +       }
> +}

Waiting for !scsi_host_busy() here looks like a layering violation to
me. Can't you use wait_event{_timeout}() here and wait for the sum of
device->scsi_cmds_outstanding over all devices to become zero (waking
up the queue in pqi_prep_for_scsi_done())? You could use the
total_scmnds_outstanding count that you introduce in patch 15/25.
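
A rough sketch of that wait_event() idea (block_requests_wait is a
hypothetical wait queue head that would be added to struct pqi_ctrl_info
and initialized with init_waitqueue_head()):

static void pqi_scsi_block_requests(struct pqi_ctrl_info *ctrl_info)
{
        scsi_block_requests(ctrl_info->scsi_host);
        wait_event(ctrl_info->block_requests_wait,
                atomic_read(&ctrl_info->total_scmnds_outstanding) == 0);
}

/* ...with a matching wake-up where a command completes, e.g. in
 * pqi_prep_for_scsi_done(): */
if (atomic_dec_return(&ctrl_info->total_scmnds_outstanding) == 0)
        wake_up(&ctrl_info->block_requests_wait);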

Also, how does this interact/interfere with scsi EH?

Don: I moved this HUNK into 
smartpqi-update-suspend-resume-and-shutdown
The function pqi_scsi_block_requests is called during Online Firmware Activation (OFA), shutdown, and suspend. I believe that this HUNK was in reaction to the test team issuing continuous device resets during these operations; each operation needs to block any new I/O requests and EH operations during this time period.

Since I broke up the larger patch, I may be able to move the smartpqi_fix_host_qdepth_limit patch before the 10 patches created from the refactoring. I would like to get V4 up for re-evaluation and eliminate as many outstanding reviews as possible before I tackle this one.


>
> -static inline void pqi_ctrl_ofa_start(struct pqi_ctrl_info *ctrl_info)
> +static inline void pqi_ctrl_ofa_done(struct pqi_ctrl_info *ctrl_info)
>  {
> -       ctrl_info->in_ofa = true;
> +       mutex_unlock(&ctrl_info->ofa_mutex);
>  }

pqi_ctrl_ofa_done() is called in several places. For me, it's non-
obvious whether ofa_mutex is guaranteed to be locked when this happens.
It would be an error to call mutex_unlock() if that's not the case.
Also, is it always guaranteed that "The context (task) that acquired
the lock also releases it"
(https://www.kernel.org/doc/html/latest/locking/locktypes.html)?
I feel that's rather not the case, as pqi_ctrl_ofa_start() is run from
a work queue, whereas pqi_ctrl_ofa_done() is not, afaics.

Have you considered using a completion?
Or can you add some explanatory comments?

Don:
I moved this into smartpqi-update-ofa-management and hope this will look
cleaner and clearer. The corresponding mutex_lock() is in
pqi_ofa_memory_alloc_worker.
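
For reference, the completion-based alternative could look roughly like
this (sketch only; ofa_done is a hypothetical field set up once with
init_completion(), and complete() may be called from any task, which
avoids the cross-task mutex_unlock() concern):

static inline void pqi_ctrl_ofa_start(struct pqi_ctrl_info *ctrl_info)
{
        reinit_completion(&ctrl_info->ofa_done);
}

static inline void pqi_ctrl_ofa_done(struct pqi_ctrl_info *ctrl_info)
{
        complete_all(&ctrl_info->ofa_done);
}

static void pqi_wait_until_ofa_finished(struct pqi_ctrl_info *ctrl_info)
{
        wait_for_completion(&ctrl_info->ofa_done);
}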


>  static inline u8 pqi_read_soft_reset_status(struct pqi_ctrl_info *ctrl_info)
>  {
> -       if (!ctrl_info->soft_reset_status)
> -               return 0;
> -
>         return readb(ctrl_info->soft_reset_status);
>  }

The new treatment of soft_reset_status is unrelated to the
synchronization issues mentioned in the patch description.

Don: I moved the soft_reset operations into patch
 smartpqi-update-soft-reset-management-for-OFA
Hopefully this clears things up.

> @@ -1464,6 +1471,9 @@ static int pqi_get_physical_device_info(struct
> pqi_ctrl_info *ctrl_info,
>                 sizeof(device->phys_connector));
>         device->bay = id_phys->phys_bay_in_box;
>
> +       memcpy(&device->page_83_identifier, &id_phys->page_83_identifier,
> +               sizeof(device->page_83_identifier));
> +
>         return 0;
>  }
>

This hunk belongs to the "unique wwid" part, see above.
Don: moved HUNK into smartpqi-add-support-for-wwid


> @@ -1970,8 +1980,13 @@ static void pqi_update_device_list(struct
> pqi_ctrl_info *ctrl_info,
>
>         spin_unlock_irqrestore(&ctrl_info->scsi_device_list_lock, flags);
>
> -       if (pqi_ctrl_in_ofa(ctrl_info))
> -               pqi_ctrl_ofa_done(ctrl_info);
> +       if (pqi_ofa_in_progress(ctrl_info)) {
> +               list_for_each_entry_safe(device, next, &delete_list, delete_list_entry)
> +                       if (pqi_is_device_added(device))
> +                               pqi_device_remove_start(device);
> +               pqi_ctrl_unblock_device_reset(ctrl_info);
> +               pqi_scsi_unblock_requests(ctrl_info);
> +       }

I don't understand the purpose of this code. pqi_device_remove_start()
will be called again a few lines below. Why do it twice? I suppose
it's related to the unblocking, but that deserves an explanation.
Also, why do you unblock requests while OFA is "in progress"?

Don: I moved this code into patch smartpqi-update-ofa-management
The author's intent was to block any new incoming requests, and retries to existing devices that will be deleted, while OFA is in progress, and to allow any pending resets to complete.
I'll add a comment; see the sketch below.
Thanks for all of your really hard work on this patch review.
It means a lot.
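
A possible shape for that comment (sketch, paraphrasing the explanation
above):

        if (pqi_ofa_in_progress(ctrl_info)) {
                /*
                 * While OFA is in progress, new requests and retries to
                 * devices that are about to be deleted must stay blocked;
                 * flag those devices as being removed, then unblock device
                 * resets and I/O so that anything already pending can
                 * complete before the devices are actually removed.
                 */
                list_for_each_entry_safe(device, next, &delete_list,
                                delete_list_entry)
                        if (pqi_is_device_added(device))
                                pqi_device_remove_start(device);
                pqi_ctrl_unblock_device_reset(ctrl_info);
                pqi_scsi_unblock_requests(ctrl_info);
        }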

>
>         /* Remove all devices that have gone away. */
>         list_for_each_entry_safe(device, next, &delete_list, delete_list_entry) {
> @@ -1993,19 +2008,14 @@ static void pqi_update_device_list(struct
> pqi_ctrl_info *ctrl_info,

The following hunk is unrelated to synchronization.

Don: Moved this formatting HUNK into smartpqi-align-code-with-oob-driver

>          * Notify the SCSI ML if the queue depth of any existing device has
>          * changed.
>          */
> -       list_for_each_entry(device, &ctrl_info->scsi_device_list,
> -               scsi_device_list_entry) {
> -               if (device->sdev) {
> -                       if (device->queue_depth !=
> -                               device->advertised_queue_depth) {
> -                               device->advertised_queue_depth = device->queue_depth;
> -                               scsi_change_queue_depth(device->sdev,
> -                                       device->advertised_queue_depth);
> -                       }
> -                       if (device->rescan) {
> -                               scsi_rescan_device(&device->sdev->sdev_gendev);
> -                               device->rescan = false;
> -                       }
> +       list_for_each_entry(device, &ctrl_info->scsi_device_list, scsi_device_list_entry) {
> +               if (device->sdev && device->queue_depth != device->advertised_queue_depth) {
> +                       device->advertised_queue_depth = device->queue_depth;
> +                       scsi_change_queue_depth(device->sdev, device->advertised_queue_depth);
> +               }
> +               if (device->rescan) {
> +                       scsi_rescan_device(&device->sdev->sdev_gendev);
> +                       device->rescan = false;
>                 }

You've taken the reference to device->sdev->sdev_gendev out of the if
(device->sdev) clause. Can you be certain that device->sdev is non-
NULL?

Don: No. Corrected this HUNK in smartpqi-align-code-with-oob-driver.
Thank you.
>         }
>
> @@ -2073,6 +2083,16 @@ static inline bool pqi_expose_device(struct
> pqi_scsi_dev *device)
>         return !device->is_physical_device || !pqi_skip_device(device->scsi3addr);
>  }
>

The following belongs to the "unique wwid" part.

Don: Moved WWID HUNKs into smartpqi-add-support-for-wwid

>
>  static int pqi_scan_scsi_devices(struct pqi_ctrl_info *ctrl_info)
>  {
> -       int rc = 0;
> +       int rc;
> +       int mutex_acquired;
>
>         if (pqi_ctrl_offline(ctrl_info))
>                 return -ENXIO;
>
> -       if (!mutex_trylock(&ctrl_info->scan_mutex)) {
> +       mutex_acquired = mutex_trylock(&ctrl_info->scan_mutex);
> +
> +       if (!mutex_acquired) {
> +               if (pqi_ctrl_scan_blocked(ctrl_info))
> +                       return -EBUSY;
>                 pqi_schedule_rescan_worker_delayed(ctrl_info);
> -               rc = -EINPROGRESS;
> -       } else {
> -               rc = pqi_update_scsi_devices(ctrl_info);
> -               if (rc)
> -                       pqi_schedule_rescan_worker_delayed(ctrl_info);
> -               mutex_unlock(&ctrl_info->scan_mutex);
> +               return -EINPROGRESS;
>         }
>
> +       rc = pqi_update_scsi_devices(ctrl_info);
> +       if (rc && !pqi_ctrl_scan_blocked(ctrl_info))
> +               pqi_schedule_rescan_worker_delayed(ctrl_info);
> +
> +       mutex_unlock(&ctrl_info->scan_mutex);
> +
>         return rc;
>  }
>
> @@ -2301,8 +2327,6 @@ static void pqi_scan_start(struct Scsi_Host
> *shost)
>         struct pqi_ctrl_info *ctrl_info;
>
>         ctrl_info = shost_to_hba(shost);
> -       if (pqi_ctrl_in_ofa(ctrl_info))
> -               return;
>
>         pqi_scan_scsi_devices(ctrl_info);
>  }
> @@ -2319,27 +2343,8 @@ static int pqi_scan_finished(struct Scsi_Host
> *shost,
>         return !mutex_is_locked(&ctrl_info->scan_mutex);
>  }
>
> -static void pqi_wait_until_scan_finished(struct pqi_ctrl_info *ctrl_info)
> -{
> -       mutex_lock(&ctrl_info->scan_mutex);
> -       mutex_unlock(&ctrl_info->scan_mutex);
> -}
> -
> -static void pqi_wait_until_lun_reset_finished(struct pqi_ctrl_info *ctrl_info)
> -{
> -       mutex_lock(&ctrl_info->lun_reset_mutex);
> -       mutex_unlock(&ctrl_info->lun_reset_mutex);
> -}
> -
> -static void pqi_wait_until_ofa_finished(struct pqi_ctrl_info *ctrl_info)
> -{
> -       mutex_lock(&ctrl_info->ofa_mutex);
> -       mutex_unlock(&ctrl_info->ofa_mutex);
> -}

Here, again, I wonder if this mutex_lock()/mutex_unlock() approach is
optimal. Have you considered using completions?

See above for rationale.

Don: The author's intent was to block threads that are initiated from applications. I moved these HUNKs into patch
 smartpqi-update-device-scan-operations. 

> -
> -static inline void pqi_set_encryption_info(
> -       struct pqi_encryption_info *encryption_info, struct raid_map *raid_map,
> -       u64 first_block)
> +static inline void pqi_set_encryption_info(struct pqi_encryption_info *encryption_info,
> +       struct raid_map *raid_map, u64 first_block)
>  {
>         u32 volume_blk_size;

This whitespace change doesn't belong here.

Don: moved to smartpqi-align-code-with-oob-driver

>
> @@ -3251,8 +3256,8 @@ static void pqi_acknowledge_event(struct
> pqi_ctrl_info *ctrl_info,
>         put_unaligned_le16(sizeof(request) - PQI_REQUEST_HEADER_LENGTH,
>                 &request.header.iu_length);
>         request.event_type = event->event_type;
> -       request.event_id = event->event_id;
> -       request.additional_event_id = event->additional_event_id;
> +       put_unaligned_le16(event->event_id, &request.event_id);
> +       put_unaligned_le16(event->additional_event_id, &request.additional_event_id);

The different treatment of the event_id fields is unrelated to
synchronization, or am I missing something?

Don: Moved event HUNKS to patch smartpqi-update-event-handler

>
>         pqi_send_event_ack(ctrl_info, &request, sizeof(request));
>  }
> @@ -3263,8 +3268,8 @@ static void pqi_acknowledge_event(struct
> pqi_ctrl_info *ctrl_info,
>  static enum pqi_soft_reset_status pqi_poll_for_soft_reset_status(
>         struct pqi_ctrl_info *ctrl_info)
>  {
> -       unsigned long timeout;
>         u8 status;
> +       unsigned long timeout;
>
>         timeout = (PQI_SOFT_RESET_STATUS_TIMEOUT_SECS * PQI_HZ) + jiffies;
>
>
> -static void pqi_ofa_process_event(struct pqi_ctrl_info *ctrl_info,
> -       struct pqi_event *event)
> +static void pqi_ofa_memory_alloc_worker(struct work_struct *work)

Moving the ofa handling into work queues seems to be a key aspect of
this patch. The patch description should mention how this will
improve synchronization. Naïve thinking suggests that making these
calls asynchronous could aggravate synchronization issues.

Repeating myself, I feel that completions would be the best way to
synchronize with these work items.

Don: I moved these HUNKs into smartpqi-update-ofa-management.
The author likes to use mutexes to synchronize threads. I'll investigate your suggested synchronization further. Perhaps V4 will help clarify.

> @@ -3537,8 +3572,7 @@ static int pqi_process_event_intr(struct
> pqi_ctrl_info *ctrl_info)
>
>  #define PQI_LEGACY_INTX_MASK   0x1
>
> -static inline void pqi_configure_legacy_intx(struct pqi_ctrl_info *ctrl_info,
> -       bool enable_intx)
> +static inline void pqi_configure_legacy_intx(struct pqi_ctrl_info *ctrl_info, bool enable_intx)

another whitespace hunk

Don: Moved into patch smartpqi-align-code-with-oob-driver
>  {
>         u32 intx_mask;
>         struct pqi_device_registers __iomem *pqi_registers;
> @@ -4216,59 +4250,36 @@ static int
> pqi_process_raid_io_error_synchronous(
>         return rc;
>  }
>
> +static inline bool pqi_is_blockable_request(struct pqi_iu_header *request)
> +{
> +       return (request->driver_flags & PQI_DRIVER_NONBLOCKABLE_REQUEST) == 0;
> +}
> +
>  static int pqi_submit_raid_request_synchronous(struct pqi_ctrl_info *ctrl_info,
>         struct pqi_iu_header *request, unsigned int flags,
> -       struct pqi_raid_error_info *error_info, unsigned long timeout_msecs)
> +       struct pqi_raid_error_info *error_info)

The removal of the timeout_msecs argument to this function could be
a separate patch in its own right.

Don: Moved into new patch smartpqi-remove-timeouts-from-internal-cmds

>         if (flags & PQI_SYNC_FLAGS_INTERRUPTABLE) {
>                 if (down_interruptible(&ctrl_info->sync_request_sem))
>                         return -ERESTARTSYS;
>         } else {
> -               if (timeout_msecs == NO_TIMEOUT) {
> -                       down(&ctrl_info->sync_request_sem);
> -               } else {
> -                       start_jiffies = jiffies;
> -                       if (down_timeout(&ctrl_info->sync_request_sem,
> -                               msecs_to_jiffies(timeout_msecs)))
> -                               return -ETIMEDOUT;
> -                       msecs_blocked =
> -                               jiffies_to_msecs(jiffies - start_jiffies);
> -                       if (msecs_blocked >= timeout_msecs) {
> -                               rc = -ETIMEDOUT;
> -                               goto out;
> -                       }
> -                       timeout_msecs -= msecs_blocked;
> -               }
> +               down(&ctrl_info->sync_request_sem);
>         }
>
>         pqi_ctrl_busy(ctrl_info);
> -       timeout_msecs = pqi_wait_if_ctrl_blocked(ctrl_info, timeout_msecs);
> -       if (timeout_msecs == 0) {
> -               pqi_ctrl_unbusy(ctrl_info);
> -               rc = -ETIMEDOUT;
> -               goto out;
> -       }
> +       if (pqi_is_blockable_request(request))
> +               pqi_wait_if_ctrl_blocked(ctrl_info);

You wait here after taking the semaphore - is that intended? Why?
Don: The author's intent was to prevent any new driver-initiated requests while administrative operations are in progress (such as updating reply queues, config table changes, OFA memory changes, ...).

I'll add a comment (see the sketch below) to the new patch smartpqi-remove-timeouts-from-internal-cmds.
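
One possible shape for that comment (sketch):

        pqi_ctrl_busy(ctrl_info);
        /*
         * Blockable internal requests must not be issued while the
         * controller is quiesced for an administrative operation
         * (reply queue updates, config table changes, OFA memory
         * changes, ...); wait here until the block is lifted.
         */
        if (pqi_is_blockable_request(request))
                pqi_wait_if_ctrl_blocked(ctrl_info);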

>
>         if (pqi_ctrl_offline(ctrl_info)) {

Should you test this before waiting, perhaps?
Don: There is a separate thread that will notice that the controller has gone offline. The intent was to see if there was an issue created during the administrative update.

> @@ -5277,12 +5276,6 @@ static inline int
> pqi_raid_submit_scsi_cmd(struct pqi_ctrl_info *ctrl_info,
>                 device, scmd, queue_group);
>  }
>

Below here, a new section starts that refactors the treatment of bypass
retries. I don't see how this is related to the synchronization issues
mentioned in the patch description.

Don: Moved bypass retry HUNKs into patch
 smartpqi-update-raid-bypass-handling


> @@ -5698,6 +5562,14 @@ static inline u16 pqi_get_hw_queue(struct
> pqi_ctrl_info *ctrl_info,
>         return hw_queue;
>  }
>
> +static inline bool pqi_is_bypass_eligible_request(struct scsi_cmnd *scmd)
> +{
> +       if (blk_rq_is_passthrough(scmd->request))
> +               return false;
> +
> +       return scmd->retries == 0;
> +}
> +

Nice, but this fits better into (or next to) 10/25 IMO.
Don: For now moved into smartpqi-update-raid-bypass-handling
Thanks for your hard work.


> +       while (atomic_read(&device->scsi_cmds_outstanding)) {
>                 pqi_check_ctrl_health(ctrl_info);
>                 if (pqi_ctrl_offline(ctrl_info))
>                         return -ENXIO;
> -
>                 if (timeout_secs != NO_TIMEOUT) {
>                         if (time_after(jiffies, timeout)) {
>                                 dev_err(&ctrl_info->pci_dev->dev,
> -                                       "timed out waiting for pending IO\n");
> +                                       "timed out waiting for pending I/O\n");
>                                 return -ETIMEDOUT;
>                         }
>                 }

Like I said above (wrt pqi_scsi_block_requests), have you considered
using wait_event_timeout() here?
Don: I will start an investigation. However, this may take a subsequent patch.
Thanks 
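
For illustration, the polling loop could become something like this
(sketch; cmds_wait is a hypothetical wait queue head that would be woken
wherever scsi_cmds_outstanding is decremented, and the periodic
pqi_check_ctrl_health() call from the original loop would still need to
be folded in, e.g. by waiting in shorter intervals):

        if (!wait_event_timeout(device->cmds_wait,
                        atomic_read(&device->scsi_cmds_outstanding) == 0,
                        timeout_secs * HZ)) {
                dev_err(&ctrl_info->pci_dev->dev,
                        "timed out waiting for pending I/O\n");
                return -ETIMEDOUT;
        }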

> @@ -7483,7 +7243,8 @@ static void
> pqi_ctrl_update_feature_flags(struct pqi_ctrl_info *ctrl_info,
>                 break;
>         case PQI_FIRMWARE_FEATURE_SOFT_RESET_HANDSHAKE:
>                 ctrl_info->soft_reset_handshake_supported =
> -                       firmware_feature->enabled;
> +                       firmware_feature->enabled &&
> +                       ctrl_info->soft_reset_status;

Should you use readb(ctrl_info->soft_reset_status) here, like above?

Don: Yes. Changed in new patch
 smartpqi-update-soft-reset-management-for-OFA
I guess sparse did not care since it's a u8? It's still __iomem...
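
The corrected check would then presumably go through the existing helper
(sketch):

        case PQI_FIRMWARE_FEATURE_SOFT_RESET_HANDSHAKE:
                ctrl_info->soft_reset_handshake_supported =
                        firmware_feature->enabled &&
                        pqi_read_soft_reset_status(ctrl_info);
                break;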

>
>  static void pqi_process_firmware_features(
> @@ -7665,14 +7435,34 @@ static void
> pqi_process_firmware_features_section(
>         mutex_unlock(&pqi_firmware_features_mutex);
>  }
>

The hunks below look like yet another independent change.
(Handling of firmware_feature_section_present).

Don: squashed into patch 
 smartpqi-add-support-for-BMIC-sense-feature-cmd-and-feature-bits


* RE: [PATCH V3 14/25] smartpqi: fix driver synchronization issues
       [not found]       ` <c1e6b199f5ccda5ccec5223dfcbd1fba22171c86.camel@suse.com>
@ 2021-02-01 22:47         ` Don.Brace
  0 siblings, 0 replies; 91+ messages in thread
From: Don.Brace @ 2021-02-01 22:47 UTC (permalink / raw)
  To: mwilck, Kevin.Barnett, Scott.Teel, Justin.Lindley, Scott.Benesh,
	Gerry.Morong, Mahesh.Rajashekhara, hch, jejb, joseph.szczypek,
	POSWALD
  Cc: linux-scsi, john.garry

-----Original Message-----
From: Martin Wilck [mailto:mwilck@suse.com] 
Subject: Re: [PATCH V3 14/25] smartpqi: fix driver synchronization issues


I would still like to figure out if "smartpqi_fix_host_qdepth_limit" is really necessary. It duplicates functionality that's already in the block and SCSI layers. I tend to think that John Garry is right, and that all that is needed to fix the queue overflow would be setting host_tagset to 1. Someone would need to verify that, though. If yes, it'd be preferable to "smartpqi_fix_host_qdepth_limit", and we'd need to figure out a different fix for the above.

Don: I have been testing with setting host_tagset = 1. It seems to be working; see the sketch below.
I answered John Garry's e-mail. Thanks for your suggestion. However, I want to test this some more before calling it good.
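
For reference, the change being tested boils down to one line in the host
template (sketch; the other fields are elided here):

        static struct scsi_host_template pqi_driver_template = {
                .module = THIS_MODULE,
                .name = "smartpqi",
                /* ... remaining callbacks unchanged ... */
                .host_tagset = 1,       /* one host-wide blk-mq tag space,
                                           bounded by can_queue */
        };

With host_tagset set, blk-mq itself limits the number of in-flight
commands to can_queue across all hardware queues, so the driver no longer
needs its own qdepth accounting.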
Thanks,
Don Brace 


Regards
Martin


* RE: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2021-01-19 10:33           ` John Garry
  2021-01-19 14:12             ` Martin Wilck
@ 2021-02-10 15:27             ` Don.Brace
  2021-02-10 15:42               ` John Garry
  1 sibling, 1 reply; 91+ messages in thread
From: Don.Brace @ 2021-02-10 15:27 UTC (permalink / raw)
  To: john.garry, mwilck, pmenzel, Kevin.Barnett, Scott.Teel,
	Justin.Lindley, Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara,
	hch, joseph.szczypek, POSWALD, jejb, martin.petersen
  Cc: linux-scsi, it+linux-scsi, buczek, gregkh, ming.lei

-----Original Message-----
From: John Garry [mailto:john.garry@huawei.com] 
Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit


I think that this can alternatively be solved by setting .host_tagset flag.

Thanks,
John

Don: John, can I add a Suggested-by tag for you for my new patch smartpqi-use-host-wide-tagspace?



* Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2021-02-10 15:27             ` Don.Brace
@ 2021-02-10 15:42               ` John Garry
  2021-02-10 16:29                 ` Don.Brace
  0 siblings, 1 reply; 91+ messages in thread
From: John Garry @ 2021-02-10 15:42 UTC (permalink / raw)
  To: Don.Brace, mwilck, pmenzel, Kevin.Barnett, Scott.Teel,
	Justin.Lindley, Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara,
	hch, joseph.szczypek, POSWALD, jejb, martin.petersen
  Cc: linux-scsi, it+linux-scsi, buczek, gregkh, ming.lei

On 10/02/2021 15:27, Don.Brace@microchip.com wrote:
> -----Original Message-----
> From: John Garry [mailto:john.garry@huawei.com]
> Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
> 
> 
> I think that this can alternatively be solved by setting .host_tagset flag.
> 
> Thanks,
> John
> 
> Don: John, can I add a Suggested-by tag for you for my new patch smartpqi-use-host-wide-tagspace?

I don't mind. And I think that Ming had the same idea.

Thanks,
John


* RE: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2021-02-10 15:42               ` John Garry
@ 2021-02-10 16:29                 ` Don.Brace
  2021-03-29 21:15                   ` Paul Menzel
  0 siblings, 1 reply; 91+ messages in thread
From: Don.Brace @ 2021-02-10 16:29 UTC (permalink / raw)
  To: john.garry, mwilck, pmenzel, Kevin.Barnett, Scott.Teel,
	Justin.Lindley, Scott.Benesh, Gerry.Morong, Mahesh.Rajashekhara,
	hch, joseph.szczypek, POSWALD, jejb, martin.petersen
  Cc: linux-scsi, it+linux-scsi, buczek, gregkh, ming.lei

-----Original Message-----
From: John Garry [mailto:john.garry@huawei.com] 
Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit

>
>
> I think that this can alternatively be solved by setting .host_tagset flag.
>
> Thanks,
> John
>
> Don: John, can I add a Suggested-by tag for you for my new patch smartpqi-use-host-wide-tagspace?

I don't mind. And I think that Ming had the same idea.

Thanks,
John

Don: Thanks for reminding me. Ming, can I add your Suggested-by tag?



* Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2021-02-10 16:29                 ` Don.Brace
@ 2021-03-29 21:15                   ` Paul Menzel
  2021-03-29 21:16                     ` Paul Menzel
  0 siblings, 1 reply; 91+ messages in thread
From: Paul Menzel @ 2021-03-29 21:15 UTC (permalink / raw)
  To: Don.Brace, john.garry, mwilck, pmenzel, Kevin.Barnett,
	Scott.Teel, Justin.Lindley, Scott.Benesh, Gerry.Morong,
	Mahesh.Rajashekhara, hch, joseph.szczypek, POSWALD, jejb,
	martin.petersen
  Cc: linux-scsi, it+linux-scsi, buczek, gregkh, ming.lei

Dear Linüx folks,


Am 10.02.21 um 17:29 schrieb Don.Brace@microchip.com:
> -----Original Message-----
> From: John Garry [mailto:john.garry@huawei.com]
> Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
> 
>> I think that this can alternatively be solved by setting .host_tagset flag.
>>
>> Thanks,
>> John
>>
>> Don: John, can I add a Suggested-by tag for you for my new patch smartpqi-use-host-wide-tagspace?
> 
> I don't mind. And I think that Ming had the same idea.

> Don: Thanks for reminding me. Ming, can I add your Suggested-by tag?

It looks like iterations 4 and 5 of the patch series have been posted
in the meantime [1][2]. Unfortunately, the reporters and discussion
participants were not Cc'd. Linux upstream has been broken since version 5.5.


Kind regards,

Paul


[1]: 
https://lore.kernel.org/linux-scsi/161540568064.19430.11157730901022265360.stgit@brunhilda/
[2]: 
https://lore.kernel.org/linux-scsi/161549045434.25025.17473629602756431540.stgit@brunhilda/


* Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2021-03-29 21:15                   ` Paul Menzel
@ 2021-03-29 21:16                     ` Paul Menzel
  2021-03-30 14:37                       ` Donald Buczek
  0 siblings, 1 reply; 91+ messages in thread
From: Paul Menzel @ 2021-03-29 21:16 UTC (permalink / raw)
  To: Don.Brace, john.garry, mwilck, pmenzel, Kevin.Barnett,
	Scott.Teel, Justin.Lindley, Scott.Benesh, Gerry.Morong,
	Mahesh.Rajashekhara, hch, joseph.szczypek, POSWALD, jejb,
	martin.petersen
  Cc: linux-scsi, it+linux-scsi, buczek, gregkh, ming.lei

[Resent from correct address.]

Am 29.03.21 um 23:15 schrieb Paul Menzel:
> Dear Linüx folks,
> 
> 
> Am 10.02.21 um 17:29 schrieb Don.Brace@microchip.com:
>> -----Original Message-----
>> From: John Garry [mailto:john.garry@huawei.com]
>> Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
>>
>>> I think that this can alternatively be solved by setting .host_tagset 
>>> flag.
>>>
>>> Thanks,
>>> John
>>>
>>> Don: John, can I add a Suggested-by tag for you for my new patch 
>>> smartpqi-use-host-wide-tagspace?
>>
>> I don't mind. And I think that Ming had the same idea.
> 
>> Don: Thanks for reminding me. Ming, can I add your Suggested-by tag?
> 
> It looks like, iterations 4 and 5 of the patch series have been posted 
> in the meantime. Unfortunately without the reporters and discussion 
> participants in Cc. Linux upstream is still broken since version 5.5.
> 
> 
> Kind regards,
> 
> Paul
> 
> 
> [1]: https://lore.kernel.org/linux-scsi/161540568064.19430.11157730901022265360.stgit@brunhilda/
> [2]: https://lore.kernel.org/linux-scsi/161549045434.25025.17473629602756431540.stgit@brunhilda/


* Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
  2021-03-29 21:16                     ` Paul Menzel
@ 2021-03-30 14:37                       ` Donald Buczek
  0 siblings, 0 replies; 91+ messages in thread
From: Donald Buczek @ 2021-03-30 14:37 UTC (permalink / raw)
  To: Paul Menzel, Don.Brace, john.garry, mwilck, Kevin.Barnett,
	Scott.Teel, Justin.Lindley, Scott.Benesh, Gerry.Morong,
	Mahesh.Rajashekhara, hch, joseph.szczypek, POSWALD, jejb,
	martin.petersen
  Cc: linux-scsi, it+linux-scsi, gregkh, ming.lei

Dear Paul,

On 29.03.21 23:16, Paul Menzel wrote:
> [Resent from correct address.]
> 
> Am 29.03.21 um 23:15 schrieb Paul Menzel:
>> Dear Linüx folks,
>>
>>
>> Am 10.02.21 um 17:29 schrieb Don.Brace@microchip.com:
>>> -----Original Message-----
>>> From: John Garry [mailto:john.garry@huawei.com]
>>> Subject: Re: [PATCH V3 15/25] smartpqi: fix host qdepth limit
>>>
>>>> I think that this can alternatively be solved by setting .host_tagset flag.
>>>>
>>>> Thanks,
>>>> John
>>>>
>>>> Don: John, can I add a Suggested-by tag for you for my new patch smartpqi-use-host-wide-tagspace?
>>>
>>> I don't mind. And I think that Ming had the same idea.
>>
>>> Don: Thanks for reminding me. Ming, can I add your Suggested-by tag?
>>
>> It looks like, iterations 4 and 5 of the patch series have been posted in the meantime. Unfortunately without the reporters and discussion participants in Cc. Linux upstream is still broken since version 5.5.

When "smartpqi: use host wide tagspace" [1] goes into mainline, we can submit it to stable, if nobody else does. This fixes the original problem and we got a patch with the same code change running in our 5.10 kernels.

Best
   Donald

[1]: https://lore.kernel.org/linux-scsi/161549369787.25025.8975999483518581619.stgit@brunhilda/

>>
>>
>> Kind regards,
>>
>> Paul
>>
>>
>> [1]: https://lore.kernel.org/linux-scsi/161540568064.19430.11157730901022265360.stgit@brunhilda/
>> [2]: https://lore.kernel.org/linux-scsi/161549045434.25025.17473629602756431540.stgit@brunhilda/


end of thread

Thread overview: 91+ messages
2020-12-10 20:34 [PATCH V3 00/25] smartpqi updates Don Brace
2020-12-10 20:34 ` [PATCH V3 01/25] smartpqi: add support for product id Don Brace
2021-01-07 16:43   ` Martin Wilck
2020-12-10 20:34 ` [PATCH V3 02/25] smartpqi: refactor aio submission code Don Brace
2021-01-07 16:43   ` Martin Wilck
2020-12-10 20:34 ` [PATCH V3 03/25] smartpqi: refactor build sg list code Don Brace
2021-01-07 16:43   ` Martin Wilck
2020-12-10 20:34 ` [PATCH V3 04/25] smartpqi: add support for raid5 and raid6 writes Don Brace
2021-01-07 16:44   ` Martin Wilck
2021-01-08 22:56     ` Don.Brace
2021-01-13 10:26       ` Martin Wilck
2020-12-10 20:34 ` [PATCH V3 05/25] smartpqi: add support for raid1 writes Don Brace
2021-01-07 16:44   ` Martin Wilck
2021-01-09 16:56     ` Don.Brace
2020-12-10 20:34 ` [PATCH V3 06/25] smartpqi: add support for BMIC sense feature cmd and feature bits Don Brace
2021-01-07 16:44   ` Martin Wilck
2021-01-11 17:22     ` Don.Brace
2021-01-22 16:45     ` Don.Brace
2021-01-22 19:04       ` Martin Wilck
2020-12-10 20:35 ` [PATCH V3 07/25] smartpqi: update AIO Sub Page 0x02 support Don Brace
2021-01-07 16:44   ` Martin Wilck
2021-01-11 20:53     ` Don.Brace
2020-12-10 20:35 ` [PATCH V3 08/25] smartpqi: add support for long firmware version Don Brace
2021-01-07 16:45   ` Martin Wilck
2021-01-11 22:25     ` Don.Brace
2021-01-22 20:01     ` Don.Brace
2020-12-10 20:35 ` [PATCH V3 09/25] smartpqi: align code with oob driver Don Brace
2021-01-08  0:13   ` Martin Wilck
2020-12-10 20:35 ` [PATCH V3 10/25] smartpqi: add stream detection Don Brace
2021-01-08  0:14   ` Martin Wilck
2021-01-15 21:58     ` Don.Brace
2020-12-10 20:35 ` [PATCH V3 11/25] smartpqi: add host level stream detection enable Don Brace
2021-01-08  0:13   ` Martin Wilck
2021-01-12 20:28     ` Don.Brace
2020-12-10 20:35 ` [PATCH V3 12/25] smartpqi: enable support for NVMe encryption Don Brace
2021-01-08  0:14   ` Martin Wilck
2020-12-10 20:35 ` [PATCH V3 13/25] smartpqi: disable write_same for nvme hba disks Don Brace
2021-01-08  0:13   ` Martin Wilck
2020-12-10 20:35 ` [PATCH V3 14/25] smartpqi: fix driver synchronization issues Don Brace
2021-01-07 23:32   ` Martin Wilck
2021-01-08  4:13     ` Martin K. Petersen
2021-01-15 21:13     ` Don.Brace
2021-01-27 23:01     ` Don.Brace
     [not found]       ` <c1e6b199f5ccda5ccec5223dfcbd1fba22171c86.camel@suse.com>
2021-02-01 22:47         ` Don.Brace
2020-12-10 20:35 ` [PATCH V3 15/25] smartpqi: fix host qdepth limit Don Brace
2020-12-14 17:54   ` Paul Menzel
2020-12-15 20:23     ` Don.Brace
2021-01-07 23:43       ` Martin Wilck
2021-01-15 21:17         ` Don.Brace
2021-01-19 10:33           ` John Garry
2021-01-19 14:12             ` Martin Wilck
2021-01-19 17:43               ` Paul Menzel
2021-01-20 16:42               ` Donald Buczek
2021-01-20 17:03                 ` Don.Brace
2021-01-20 18:35                 ` Martin Wilck
2021-02-10 15:27             ` Don.Brace
2021-02-10 15:42               ` John Garry
2021-02-10 16:29                 ` Don.Brace
2021-03-29 21:15                   ` Paul Menzel
2021-03-29 21:16                     ` Paul Menzel
2021-03-30 14:37                       ` Donald Buczek
2020-12-10 20:35 ` [PATCH V3 16/25] smartpqi: convert snprintf to scnprintf Don Brace
2021-01-07 23:51   ` Martin Wilck
2020-12-10 20:35 ` [PATCH V3 17/25] smartpqi: change timing of release of QRM memory during OFA Don Brace
2021-01-08  0:14   ` Martin Wilck
2021-01-27 17:46     ` Don.Brace
2020-12-10 20:36 ` [PATCH V3 18/25] smartpqi: return busy indication for IOCTLs when ofa is active Don Brace
2020-12-10 20:36 ` [PATCH V3 19/25] smartpqi: add phy id support for the physical drives Don Brace
2021-01-08  0:03   ` Martin Wilck
2020-12-10 20:36 ` [PATCH V3 20/25] smartpqi: update sas initiator_port_protocols and target_port_protocols Don Brace
2021-01-08  0:12   ` Martin Wilck
2020-12-10 20:36 ` [PATCH V3 21/25] smartpqi: add additional logging for LUN resets Don Brace
2021-01-08  0:27   ` Martin Wilck
2021-01-25 17:09     ` Don.Brace
2020-12-10 20:36 ` [PATCH V3 22/25] smartpqi: update enclosure identifier in sysf Don Brace
2021-01-08  0:30   ` Martin Wilck
2021-01-25 17:13     ` Don.Brace
2021-01-25 19:44       ` Martin Wilck
2021-01-25 20:36         ` Don.Brace
2020-12-10 20:36 ` [PATCH V3 23/25] smartpqi: correct system hangs when resuming from hibernation Don Brace
2021-01-08  0:34   ` Martin Wilck
2021-01-27 17:39     ` Don.Brace
2021-01-27 17:45       ` Martin Wilck
2020-12-10 20:36 ` [PATCH V3 24/25] smartpqi: add new pci ids Don Brace
2021-01-08  0:35   ` Martin Wilck
2020-12-10 20:36 ` [PATCH V3 25/25] smartpqi: update version to 2.1.6-005 Don Brace
2020-12-21 14:31 ` [PATCH V3 00/25] smartpqi updates Donald Buczek
     [not found]   ` <SN6PR11MB2848D8C9DF9856A2B7AA69ACE1C00@SN6PR11MB2848.namprd11.prod.outlook.com>
2020-12-22 13:13     ` Donald Buczek
2020-12-28 15:57       ` Don.Brace
2020-12-28 19:25         ` Don.Brace
2020-12-28 22:36           ` Donald Buczek
