[PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1

linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1
@ 2020-01-14 11:21 Anand Lodnoor
  2020-01-14 11:21 ` [PATCH v2 01/11] megaraid_sas: Reset adapter if FW is not in READY state after device resume Anand Lodnoor
                   ` (11 more replies)
  0 siblings, 12 replies; 24+ messages in thread
From: Anand Lodnoor @ 2020-01-14 11:21 UTC (permalink / raw)
  To: linux-scsi
  Cc: kashyap.desai, sumit.saxena, kiran-kumar.kasturi, sankar.patra,
	sasikumar.pc, shivasharan.srikanteshwara, chandrakanth.patil,
	Anand Lodnoor

This patchset contains few enhancements and fixes in megaraid_sas driver.

v2:
    - Fixed couple of typos and indentation issues as pointed out by Martin K.
    - Added proper commit descriptions where ever necessary.

Anand Lodnoor (11):
  megaraid_sas: Reset adapter if FW is not in READY state after device
    resume
  megaraid_sas: Set no_write_same only for Virtual Disk
  megaraid_sas: Update optimal queue depth for SAS and NVMe devices
  megaraid_sas: Do not kill host bus adapter, if adapter is already dead
  megaraid_sas: Do not kill HBA if JBOD Seqence map or RAID map is
    disabled
  megaraid_sas: Do not set HBA Operational if FW is not in operational
    state
  megaraid_sas: Re-Define enum DCMD_RETURN_STATUS
  megaraid_sas: Do not initiate OCR if controller is not in ready state
  megaraid_sas: Limit the number of retries for the IOCTLs causing
    firmware fault
  megaraid_sas: Use Block layer API to check SCSI device in-flight IO
    requests
  megaraid_sas: Update driver version to 07.713.01.00-rc1

 drivers/scsi/megaraid/megaraid_sas.h        |  17 ++--
 drivers/scsi/megaraid/megaraid_sas_base.c   |  95 ++++++++++++++------
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 134 +++++++++++++++++++++-------
 drivers/scsi/megaraid/megaraid_sas_fusion.h |  18 +++-
 4 files changed, 197 insertions(+), 67 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v2 01/11] megaraid_sas: Reset adapter if FW is not in READY state after device resume
  2020-01-14 11:21 [PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1 Anand Lodnoor
@ 2020-01-14 11:21 ` Anand Lodnoor
  2020-01-14 11:21 ` [PATCH v2 02/11] megaraid_sas: Set no_write_same only for Virtual Disk Anand Lodnoor
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Anand Lodnoor @ 2020-01-14 11:21 UTC (permalink / raw)
  To: linux-scsi
  Cc: kashyap.desai, sumit.saxena, kiran-kumar.kasturi, sankar.patra,
	sasikumar.pc, shivasharan.srikanteshwara, chandrakanth.patil,
	Anand Lodnoor

After device resume we expect the firmware to be in READY state.
Transition to READY might fail due to unhandled exceptions,such
as an internal error or a hardware failure.Retry initiating chip
reset and wait for the controller to come to ready state.

Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas_base.c | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index a4bc814..96b893f 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -7593,6 +7593,7 @@ static void megasas_shutdown_controller(struct megasas_instance *instance,
 	struct Scsi_Host *host;
 	struct megasas_instance *instance;
 	int irq_flags = PCI_IRQ_LEGACY;
+	u32 status_reg;
 
 	instance = pci_get_drvdata(pdev);
 
@@ -7620,9 +7621,35 @@ static void megasas_shutdown_controller(struct megasas_instance *instance,
 	/*
 	 * We expect the FW state to be READY
 	 */
-	if (megasas_transition_to_ready(instance, 0))
-		goto fail_ready_state;
 
+	if (megasas_transition_to_ready(instance, 0)) {
+		dev_info(&instance->pdev->dev,
+			 "Failed to transition controller to ready from %s!\n",
+			 __func__);
+		if (instance->adapter_type != MFI_SERIES) {
+			status_reg =
+			instance->instancet->read_fw_status_reg(instance);
+			if (!(status_reg & MFI_RESET_ADAPTER) ||
+				((megasas_adp_reset_wait_for_ready
+				(instance, true, 0)) == FAILED))
+				goto fail_ready_state;
+		} else {
+			atomic_set(&instance->fw_reset_no_pci_access, 1);
+			instance->instancet->adp_reset
+				(instance, instance->reg_set);
+			atomic_set(&instance->fw_reset_no_pci_access, 0);
+
+			/* waiting for about 30 seconds before retry */
+			ssleep(30);
+
+			if (megasas_transition_to_ready(instance, 0))
+				goto fail_ready_state;
+		}
+
+		dev_info(&instance->pdev->dev,
+			 "FW restarted successfully from %s!\n",
+			 __func__);
+	}
 	if (megasas_set_dma_mask(instance))
 		goto fail_set_dma_mask;
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 02/11] megaraid_sas: Set no_write_same only for Virtual Disk
  2020-01-14 11:21 [PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1 Anand Lodnoor
  2020-01-14 11:21 ` [PATCH v2 01/11] megaraid_sas: Reset adapter if FW is not in READY state after device resume Anand Lodnoor
@ 2020-01-14 11:21 ` Anand Lodnoor
  2020-01-14 11:21 ` [PATCH v2 03/11] megaraid_sas: Update optimal queue depth for SAS and NVMe devices Anand Lodnoor
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Anand Lodnoor @ 2020-01-14 11:21 UTC (permalink / raw)
  To: linux-scsi
  Cc: kashyap.desai, sumit.saxena, kiran-kumar.kasturi, sankar.patra,
	sasikumar.pc, shivasharan.srikanteshwara, chandrakanth.patil,
	Anand Lodnoor

Disable WRITE_SAME (no_write_same) for Virtual Disks only.
For System PDs and EPDs (Enhanced PDs), WRITE_SAME need not be disabled by
default.

Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas_base.c   |  5 ++++-
 drivers/scsi/megaraid/megaraid_sas_fusion.h | 17 ++++++++++++++---
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 96b893f..8b9ecee 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -1887,6 +1887,10 @@ void megasas_set_dynamic_target_properties(struct scsi_device *sdev,
 
 		mr_device_priv_data->is_tm_capable =
 			raid->capability.tmCapable;
+
+		if (!raid->flags.isEPD)
+			sdev->no_write_same = 1;
+
 	} else if (instance->use_seqnum_jbod_fp) {
 		pd_index = (sdev->channel * MEGASAS_MAX_DEV_PER_CHANNEL) +
 			sdev->id;
@@ -3416,7 +3420,6 @@ static int megasas_reset_target(struct scsi_cmnd *scmd)
 	.bios_param = megasas_bios_param,
 	.change_queue_depth = scsi_change_queue_depth,
 	.max_segment_size = 0xffffffff,
-	.no_write_same = 1,
 };
 
 /**
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h b/drivers/scsi/megaraid/megaraid_sas_fusion.h
index c013c80..8358b68 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
@@ -864,9 +864,20 @@ struct MR_LD_RAID {
 	u8	regTypeReqOnRead;
 	__le16     seqNum;
 
-	struct {
-		u32 ldSyncRequired:1;
-		u32 reserved:31;
+struct {
+#ifndef MFI_BIG_ENDIAN
+	u32 ldSyncRequired:1;
+	u32 regTypeReqOnReadIsValid:1;
+	u32 isEPD:1;
+	u32 enableSLDOnAllRWIOs:1;
+	u32 reserved:28;
+#else
+	u32 reserved:28;
+	u32 enableSLDOnAllRWIOs:1;
+	u32 isEPD:1;
+	u32 regTypeReqOnReadIsValid:1;
+	u32 ldSyncRequired:1;
+#endif
 	} flags;
 
 	u8	LUN[8]; /* 0x24 8 byte LUN field used for SCSI IO's */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 03/11] megaraid_sas: Update optimal queue depth for SAS and NVMe devices
  2020-01-14 11:21 [PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1 Anand Lodnoor
  2020-01-14 11:21 ` [PATCH v2 01/11] megaraid_sas: Reset adapter if FW is not in READY state after device resume Anand Lodnoor
  2020-01-14 11:21 ` [PATCH v2 02/11] megaraid_sas: Set no_write_same only for Virtual Disk Anand Lodnoor
@ 2020-01-14 11:21 ` Anand Lodnoor
  2020-01-14 11:21 ` [PATCH v2 04/11] megaraid_sas: Do not kill host bus adapter, if adapter is already dead Anand Lodnoor
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Anand Lodnoor @ 2020-01-14 11:21 UTC (permalink / raw)
  To: linux-scsi
  Cc: kashyap.desai, sumit.saxena, kiran-kumar.kasturi, sankar.patra,
	sasikumar.pc, shivasharan.srikanteshwara, chandrakanth.patil,
	Anand Lodnoor

Ideally, optimal queue depth will be provided by firmware.
The driver defines will be used as a fallback mechanism, in case the FW
assisted QD is not supported.
The driver defined values provide optimal queue depth for most of the
drives and the workloads, as is learned from the firmware assisted QD
results.

Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h
index bd81840..ddfbe6f 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -2233,9 +2233,9 @@ enum MR_PD_TYPE {
 
 /* JBOD Queue depth definitions */
 #define MEGASAS_SATA_QD	32
-#define MEGASAS_SAS_QD	64
+#define MEGASAS_SAS_QD 256
 #define MEGASAS_DEFAULT_PD_QD	64
-#define MEGASAS_NVME_QD		32
+#define MEGASAS_NVME_QD        64
 
 #define MR_DEFAULT_NVME_PAGE_SIZE	4096
 #define MR_DEFAULT_NVME_PAGE_SHIFT	12
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 04/11] megaraid_sas: Do not kill host bus adapter, if adapter is already dead
  2020-01-14 11:21 [PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1 Anand Lodnoor
                   ` (2 preceding siblings ...)
  2020-01-14 11:21 ` [PATCH v2 03/11] megaraid_sas: Update optimal queue depth for SAS and NVMe devices Anand Lodnoor
@ 2020-01-14 11:21 ` Anand Lodnoor
  2020-01-14 11:21 ` [PATCH v2 05/11] megaraid_sas: Do not kill HBA if JBOD Seqence map or RAID map is disabled Anand Lodnoor
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Anand Lodnoor @ 2020-01-14 11:21 UTC (permalink / raw)
  To: linux-scsi
  Cc: kashyap.desai, sumit.saxena, kiran-kumar.kasturi, sankar.patra,
	sasikumar.pc, shivasharan.srikanteshwara, chandrakanth.patil,
	Anand Lodnoor

Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas_base.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 8b9ecee..26c5119 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -2154,6 +2154,12 @@ static void megasas_complete_outstanding_ioctls(struct megasas_instance *instanc
 
 void megaraid_sas_kill_hba(struct megasas_instance *instance)
 {
+	if (atomic_read(&instance->adprecovery) == MEGASAS_HW_CRITICAL_ERROR) {
+		dev_warn(&instance->pdev->dev,
+			 "Adapter already dead, skipping kill HBA\n");
+		return;
+	}
+
 	/* Set critical error to block I/O & ioctls in case caller didn't */
 	atomic_set(&instance->adprecovery, MEGASAS_HW_CRITICAL_ERROR);
 	/* Wait 1 second to ensure IO or ioctls in build have posted */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 05/11] megaraid_sas: Do not kill HBA if JBOD Seqence map or RAID map is disabled
  2020-01-14 11:21 [PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1 Anand Lodnoor
                   ` (3 preceding siblings ...)
  2020-01-14 11:21 ` [PATCH v2 04/11] megaraid_sas: Do not kill host bus adapter, if adapter is already dead Anand Lodnoor
@ 2020-01-14 11:21 ` Anand Lodnoor
  2020-01-14 11:21 ` [PATCH v2 06/11] megaraid_sas: Do not set HBA Operational if FW is not in operational state Anand Lodnoor
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Anand Lodnoor @ 2020-01-14 11:21 UTC (permalink / raw)
  To: linux-scsi
  Cc: kashyap.desai, sumit.saxena, kiran-kumar.kasturi, sankar.patra,
	sasikumar.pc, shivasharan.srikanteshwara, chandrakanth.patil,
	Anand Lodnoor

At the time of firmware initialization, if JBOD map or RAID map is not available,
driver can function without these features in a limited functionality mode.

Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index e301458..ef20234 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -1312,7 +1312,9 @@ static int megasas_create_sg_sense_fusion(struct megasas_instance *instance)
 	}
 
 	if (ret == DCMD_TIMEOUT)
-		megaraid_sas_kill_hba(instance);
+		dev_warn(&instance->pdev->dev,
+			 "%s DCMD timed out, continue without JBOD sequence map\n",
+			 __func__);
 
 	if (ret == DCMD_SUCCESS)
 		instance->pd_seq_map_id++;
@@ -1394,7 +1396,9 @@ static int megasas_create_sg_sense_fusion(struct megasas_instance *instance)
 		ret = megasas_issue_polled(instance, cmd);
 
 	if (ret == DCMD_TIMEOUT)
-		megaraid_sas_kill_hba(instance);
+		dev_warn(&instance->pdev->dev,
+			 "%s DCMD timed out, RAID map is disabled\n",
+			 __func__);
 
 	megasas_return_cmd(instance, cmd);
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 06/11] megaraid_sas: Do not set HBA Operational if FW is not in operational state
  2020-01-14 11:21 [PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1 Anand Lodnoor
                   ` (4 preceding siblings ...)
  2020-01-14 11:21 ` [PATCH v2 05/11] megaraid_sas: Do not kill HBA if JBOD Seqence map or RAID map is disabled Anand Lodnoor
@ 2020-01-14 11:21 ` Anand Lodnoor
  2020-01-14 11:21 ` [PATCH v2 07/11] megaraid_sas: Re-Define enum DCMD_RETURN_STATUS Anand Lodnoor
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Anand Lodnoor @ 2020-01-14 11:21 UTC (permalink / raw)
  To: linux-scsi
  Cc: kashyap.desai, sumit.saxena, kiran-kumar.kasturi, sankar.patra,
	sasikumar.pc, shivasharan.srikanteshwara, chandrakanth.patil,
	Anand Lodnoor

After issuing a adapter reset, driver blindly used to set adprecovery flag
to OPERATIONAL state.
Add a check to see if the FW is operational before setting the flag
and marking reset adapter successful.

Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index ef20234..6860fd2 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -4991,6 +4991,15 @@ int megasas_reset_fusion(struct Scsi_Host *shost, int reason)
 				megasas_set_dynamic_target_properties(sdev, is_target_prop);
 			}
 
+			status_reg = instance->instancet->read_fw_status_reg
+					(instance);
+			abs_state = status_reg & MFI_STATE_MASK;
+			if (abs_state != MFI_STATE_OPERATIONAL) {
+				dev_info(&instance->pdev->dev,
+					 "Adapter is not OPERATIONAL, state 0x%x for scsi:%d\n",
+					 abs_state, instance->host->host_no);
+				goto out;
+			}
 			atomic_set(&instance->adprecovery, MEGASAS_HBA_OPERATIONAL);
 
 			dev_info(&instance->pdev->dev,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 07/11] megaraid_sas: Re-Define enum DCMD_RETURN_STATUS
  2020-01-14 11:21 [PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1 Anand Lodnoor
                   ` (5 preceding siblings ...)
  2020-01-14 11:21 ` [PATCH v2 06/11] megaraid_sas: Do not set HBA Operational if FW is not in operational state Anand Lodnoor
@ 2020-01-14 11:21 ` Anand Lodnoor
  2020-01-14 11:21 ` [PATCH v2 08/11] megaraid_sas: Do not initiate OCR if controller is not in ready state Anand Lodnoor
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Anand Lodnoor @ 2020-01-14 11:21 UTC (permalink / raw)
  To: linux-scsi
  Cc: kashyap.desai, sumit.saxena, kiran-kumar.kasturi, sankar.patra,
	sasikumar.pc, shivasharan.srikanteshwara, chandrakanth.patil,
	Anand Lodnoor

DCMD_INIT is introduced to indicate the initial DCMD status,
which was earlier set to MFI status.
DCMD_BUSY indicates the resource is busy or locked.

Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas.h      |  9 +++---
 drivers/scsi/megaraid/megaraid_sas_base.c | 50 ++++++++++++++++++-------------
 2 files changed, 34 insertions(+), 25 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h
index ddfbe6f..25afa81 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -2640,10 +2640,11 @@ enum MEGASAS_OCR_CAUSE {
 };
 
 enum DCMD_RETURN_STATUS {
-	DCMD_SUCCESS		= 0,
-	DCMD_TIMEOUT		= 1,
-	DCMD_FAILED		= 2,
-	DCMD_NOT_FIRED		= 3,
+	DCMD_SUCCESS    = 0x00,
+	DCMD_TIMEOUT    = 0x01,
+	DCMD_FAILED     = 0x02,
+	DCMD_BUSY       = 0x03,
+	DCMD_INIT       = 0xff,
 };
 
 u8
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 26c5119..8bee5629 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -1099,7 +1099,7 @@ struct megasas_cmd *megasas_get_cmd(struct megasas_instance
 	if (atomic_read(&instance->adprecovery) == MEGASAS_HW_CRITICAL_ERROR) {
 		dev_err(&instance->pdev->dev, "Failed from %s %d\n",
 			__func__, __LINE__);
-		return DCMD_NOT_FIRED;
+		return DCMD_INIT;
 	}
 
 	instance->instancet->issue_dcmd(instance, cmd);
@@ -1123,19 +1123,19 @@ struct megasas_cmd *megasas_get_cmd(struct megasas_instance
 			  struct megasas_cmd *cmd, int timeout)
 {
 	int ret = 0;
-	cmd->cmd_status_drv = MFI_STAT_INVALID_STATUS;
+	cmd->cmd_status_drv = DCMD_INIT;
 
 	if (atomic_read(&instance->adprecovery) == MEGASAS_HW_CRITICAL_ERROR) {
 		dev_err(&instance->pdev->dev, "Failed from %s %d\n",
 			__func__, __LINE__);
-		return DCMD_NOT_FIRED;
+		return DCMD_INIT;
 	}
 
 	instance->instancet->issue_dcmd(instance, cmd);
 
 	if (timeout) {
 		ret = wait_event_timeout(instance->int_cmd_wait_q,
-				cmd->cmd_status_drv != MFI_STAT_INVALID_STATUS, timeout * HZ);
+		cmd->cmd_status_drv != DCMD_INIT, timeout * HZ);
 		if (!ret) {
 			dev_err(&instance->pdev->dev,
 				"DCMD(opcode: 0x%x) is timed out, func:%s\n",
@@ -1144,10 +1144,9 @@ struct megasas_cmd *megasas_get_cmd(struct megasas_instance
 		}
 	} else
 		wait_event(instance->int_cmd_wait_q,
-				cmd->cmd_status_drv != MFI_STAT_INVALID_STATUS);
+				cmd->cmd_status_drv != DCMD_INIT);
 
-	return (cmd->cmd_status_drv == MFI_STAT_OK) ?
-		DCMD_SUCCESS : DCMD_FAILED;
+	return cmd->cmd_status_drv;
 }
 
 /**
@@ -1190,19 +1189,19 @@ struct megasas_cmd *megasas_get_cmd(struct megasas_instance
 		cpu_to_le32(upper_32_bits(cmd_to_abort->frame_phys_addr));
 
 	cmd->sync_cmd = 1;
-	cmd->cmd_status_drv = MFI_STAT_INVALID_STATUS;
+	cmd->cmd_status_drv = DCMD_INIT;
 
 	if (atomic_read(&instance->adprecovery) == MEGASAS_HW_CRITICAL_ERROR) {
 		dev_err(&instance->pdev->dev, "Failed from %s %d\n",
 			__func__, __LINE__);
-		return DCMD_NOT_FIRED;
+		return DCMD_INIT;
 	}
 
 	instance->instancet->issue_dcmd(instance, cmd);
 
 	if (timeout) {
 		ret = wait_event_timeout(instance->abort_cmd_wait_q,
-				cmd->cmd_status_drv != MFI_STAT_INVALID_STATUS, timeout * HZ);
+		cmd->cmd_status_drv != DCMD_INIT, timeout * HZ);
 		if (!ret) {
 			opcode = cmd_to_abort->frame->dcmd.opcode;
 			dev_err(&instance->pdev->dev,
@@ -1212,13 +1211,12 @@ struct megasas_cmd *megasas_get_cmd(struct megasas_instance
 		}
 	} else
 		wait_event(instance->abort_cmd_wait_q,
-				cmd->cmd_status_drv != MFI_STAT_INVALID_STATUS);
+		cmd->cmd_status_drv != DCMD_INIT);
 
 	cmd->sync_cmd = 0;
 
 	megasas_return_cmd(instance, cmd);
-	return (cmd->cmd_status_drv == MFI_STAT_OK) ?
-		DCMD_SUCCESS : DCMD_FAILED;
+	return cmd->cmd_status_drv;
 }
 
 /**
@@ -2736,7 +2734,7 @@ static int megasas_wait_for_outstanding(struct megasas_instance *instance)
 						"reset queue\n",
 						reset_cmd);
 
-				reset_cmd->cmd_status_drv = MFI_STAT_INVALID_STATUS;
+				reset_cmd->cmd_status_drv = DCMD_INIT;
 				instance->instancet->fire_cmd(instance,
 						reset_cmd->frame_phys_addr,
 						0, instance->reg_set);
@@ -3441,7 +3439,11 @@ static int megasas_reset_target(struct scsi_cmnd *scmd)
 megasas_complete_int_cmd(struct megasas_instance *instance,
 			 struct megasas_cmd *cmd)
 {
-	cmd->cmd_status_drv = cmd->frame->io.cmd_status;
+	if (cmd->cmd_status_drv == DCMD_INIT)
+		cmd->cmd_status_drv =
+		(cmd->frame->io.cmd_status == MFI_STAT_OK) ?
+		DCMD_SUCCESS : DCMD_FAILED;
+
 	wake_up(&instance->int_cmd_wait_q);
 }
 
@@ -3460,7 +3462,7 @@ static int megasas_reset_target(struct scsi_cmnd *scmd)
 {
 	if (cmd->sync_cmd) {
 		cmd->sync_cmd = 0;
-		cmd->cmd_status_drv = 0;
+		cmd->cmd_status_drv = DCMD_SUCCESS;
 		wake_up(&instance->abort_cmd_wait_q);
 	}
 }
@@ -3736,7 +3738,7 @@ static int megasas_reset_target(struct scsi_cmnd *scmd)
 			dev_notice(&instance->pdev->dev, "%p synchronous cmd"
 						"on the internal reset queue,"
 						"issue it again.\n", cmd);
-			cmd->cmd_status_drv = MFI_STAT_INVALID_STATUS;
+			cmd->cmd_status_drv = DCMD_INIT;
 			instance->instancet->fire_cmd(instance,
 							cmd->frame_phys_addr,
 							0, instance->reg_set);
@@ -8072,6 +8074,7 @@ static int megasas_set_crash_dump_params_ioctl(struct megasas_cmd *cmd)
 	dma_addr_t sense_handle;
 	unsigned long *sense_ptr;
 	u32 opcode = 0;
+	int ret = DCMD_SUCCESS;
 
 	memset(kbuff_arr, 0, sizeof(kbuff_arr));
 
@@ -8212,13 +8215,18 @@ static int megasas_set_crash_dump_params_ioctl(struct megasas_cmd *cmd)
 	 * cmd to the SCSI mid-layer
 	 */
 	cmd->sync_cmd = 1;
-	if (megasas_issue_blocked_cmd(instance, cmd, 0) == DCMD_NOT_FIRED) {
+
+	ret = megasas_issue_blocked_cmd(instance, cmd, 0);
+	switch (ret) {
+	case DCMD_INIT:
+	case DCMD_BUSY:
 		cmd->sync_cmd = 0;
 		dev_err(&instance->pdev->dev,
 			"return -EBUSY from %s %d cmd 0x%x opcode 0x%x cmd->cmd_status_drv 0x%x\n",
-			__func__, __LINE__, cmd->frame->hdr.cmd, opcode,
-			cmd->cmd_status_drv);
-		return -EBUSY;
+			 __func__, __LINE__, cmd->frame->hdr.cmd, opcode,
+			 cmd->cmd_status_drv);
+			error = -EBUSY;
+			goto out;
 	}
 
 	cmd->sync_cmd = 0;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 08/11] megaraid_sas: Do not initiate OCR if controller is not in ready state
  2020-01-14 11:21 [PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1 Anand Lodnoor
                   ` (6 preceding siblings ...)
  2020-01-14 11:21 ` [PATCH v2 07/11] megaraid_sas: Re-Define enum DCMD_RETURN_STATUS Anand Lodnoor
@ 2020-01-14 11:21 ` Anand Lodnoor
  2020-01-14 11:21 ` [PATCH v2 09/11] megaraid_sas: Limit the number of retries for the IOCTLs causing firmware fault Anand Lodnoor
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Anand Lodnoor @ 2020-01-14 11:21 UTC (permalink / raw)
  To: linux-scsi
  Cc: kashyap.desai, sumit.saxena, kiran-kumar.kasturi, sankar.patra,
	sasikumar.pc, shivasharan.srikanteshwara, chandrakanth.patil,
	Anand Lodnoor, stable

Driver initiates OCR if a DCMD command times out. But there is a
deadlock if the driver attempts to invoke another OCR before the
mutex lock (reset_mutex) is released from the previous session of OCR.

This patch takes care of the above scenario using new flag
MEGASAS_FUSION_OCR_NOT_POSSIBLE to indicate if OCR is possible.

Cc: stable@vger.kernel.org
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas_base.c   | 3 ++-
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 3 ++-
 drivers/scsi/megaraid/megaraid_sas_fusion.h | 1 +
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 8bee5629..792db75 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -4403,7 +4403,8 @@ int megasas_alloc_cmds(struct megasas_instance *instance)
 	if (instance->adapter_type == MFI_SERIES)
 		return KILL_ADAPTER;
 	else if (instance->unload ||
-			test_bit(MEGASAS_FUSION_IN_RESET, &instance->reset_flags))
+			test_bit(MEGASAS_FUSION_OCR_NOT_POSSIBLE,
+				 &instance->reset_flags))
 		return IGNORE_TIMEOUT;
 	else
 		return INITIATE_OCR;
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index 6860fd2..8b6cc1b 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -4851,6 +4851,7 @@ int megasas_reset_fusion(struct Scsi_Host *shost, int reason)
 	if (instance->requestorId && !instance->skip_heartbeat_timer_del)
 		del_timer_sync(&instance->sriov_heartbeat_timer);
 	set_bit(MEGASAS_FUSION_IN_RESET, &instance->reset_flags);
+	set_bit(MEGASAS_FUSION_OCR_NOT_POSSIBLE, &instance->reset_flags);
 	atomic_set(&instance->adprecovery, MEGASAS_ADPRESET_SM_POLLING);
 	instance->instancet->disable_intr(instance);
 	megasas_sync_irqs((unsigned long)instance);
@@ -5059,7 +5060,7 @@ int megasas_reset_fusion(struct Scsi_Host *shost, int reason)
 	instance->skip_heartbeat_timer_del = 1;
 	retval = FAILED;
 out:
-	clear_bit(MEGASAS_FUSION_IN_RESET, &instance->reset_flags);
+	clear_bit(MEGASAS_FUSION_OCR_NOT_POSSIBLE, &instance->reset_flags);
 	mutex_unlock(&instance->reset_mutex);
 	return retval;
 }
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h b/drivers/scsi/megaraid/megaraid_sas_fusion.h
index 8358b68..d57ecc7 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
@@ -89,6 +89,7 @@ enum MR_RAID_FLAGS_IO_SUB_TYPE {
 
 #define MEGASAS_FP_CMD_LEN	16
 #define MEGASAS_FUSION_IN_RESET 0
+#define MEGASAS_FUSION_OCR_NOT_POSSIBLE 1
 #define RAID_1_PEER_CMDS 2
 #define JBOD_MAPS_COUNT	2
 #define MEGASAS_REDUCE_QD_COUNT 64
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 09/11] megaraid_sas: Limit the number of retries for the IOCTLs causing firmware fault
  2020-01-14 11:21 [PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1 Anand Lodnoor
                   ` (7 preceding siblings ...)
  2020-01-14 11:21 ` [PATCH v2 08/11] megaraid_sas: Do not initiate OCR if controller is not in ready state Anand Lodnoor
@ 2020-01-14 11:21 ` Anand Lodnoor
  2020-01-14 11:21 ` [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests Anand Lodnoor
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Anand Lodnoor @ 2020-01-14 11:21 UTC (permalink / raw)
  To: linux-scsi
  Cc: kashyap.desai, sumit.saxena, kiran-kumar.kasturi, sankar.patra,
	sasikumar.pc, shivasharan.srikanteshwara, chandrakanth.patil,
	Anand Lodnoor

IOCTLs causing firmware fault may end up in failed controller resets and
finally killing the adapter.

This patch fixes this problem as stated below:
In OCR sequence, driver will attempt refiring pended IOCTLs upto two times.
If first two attempts fail, then in third attempt driver will return pended
IOCTLs with EBUSY status to application. These changes are done to ensure
if any of pended IOCTLs is causing firmware fault and resulting into OCR
failure, then in last attempt of OCR driver will refrain firing it to
firmware and saving adapter from being killed due to faulty IOCTL.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 58 +++++++++++++++++++++++++++--
 1 file changed, 55 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index 8b6cc1b..0bdd477 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -4223,7 +4223,8 @@ void  megasas_reset_reply_desc(struct megasas_instance *instance)
  * megasas_refire_mgmt_cmd :	Re-fire management commands
  * @instance:				Controller's soft instance
 */
-static void megasas_refire_mgmt_cmd(struct megasas_instance *instance)
+void megasas_refire_mgmt_cmd(struct megasas_instance *instance,
+			     bool return_ioctl)
 {
 	int j;
 	struct megasas_cmd_fusion *cmd_fusion;
@@ -4287,6 +4288,16 @@ static void megasas_refire_mgmt_cmd(struct megasas_instance *instance)
 			break;
 		}
 
+		if (return_ioctl && cmd_mfi->sync_cmd &&
+		    cmd_mfi->frame->hdr.cmd != MFI_CMD_ABORT) {
+			dev_err(&instance->pdev->dev,
+				"return -EBUSY from %s %d cmd 0x%x opcode 0x%x\n",
+				__func__, __LINE__, cmd_mfi->frame->hdr.cmd,
+				le32_to_cpu(cmd_mfi->frame->dcmd.opcode));
+			cmd_mfi->cmd_status_drv = DCMD_BUSY;
+			result = COMPLETE_CMD;
+		}
+
 		switch (result) {
 		case REFIRE_CMD:
 			megasas_fire_cmd_fusion(instance, req_desc);
@@ -4302,6 +4313,37 @@ static void megasas_refire_mgmt_cmd(struct megasas_instance *instance)
 }
 
 /*
+ * megasas_return_polled_cmds: Return polled mode commands back to the pool
+ *			       before initiating an OCR.
+ * @instance:                  Controller's soft instance
+ */
+static void
+megasas_return_polled_cmds(struct megasas_instance *instance)
+{
+	int i;
+	struct megasas_cmd_fusion *cmd_fusion;
+	struct fusion_context *fusion;
+	struct megasas_cmd *cmd_mfi;
+
+	fusion = instance->ctrl_context;
+
+	for (i = instance->max_scsi_cmds; i < instance->max_fw_cmds; i++) {
+		cmd_fusion = fusion->cmd_list[i];
+		cmd_mfi = instance->cmd_list[cmd_fusion->sync_cmd_idx];
+
+		if (cmd_mfi->flags & DRV_DCMD_POLLED_MODE) {
+			if (megasas_dbg_lvl & OCR_DEBUG)
+				dev_info(&instance->pdev->dev,
+					 "%s %d return cmd 0x%x opcode 0x%x\n",
+					 __func__, __LINE__, cmd_mfi->frame->hdr.cmd,
+					 le32_to_cpu(cmd_mfi->frame->dcmd.opcode));
+			cmd_mfi->flags &= ~DRV_DCMD_POLLED_MODE;
+			megasas_return_cmd(instance, cmd_mfi);
+		}
+	}
+}
+
+/*
  * megasas_track_scsiio : Track SCSI IOs outstanding to a SCSI device
  * @instance: per adapter struct
  * @channel: the channel assigned by the OS
@@ -4956,7 +4998,9 @@ int megasas_reset_fusion(struct Scsi_Host *shost, int reason)
 				goto kill_hba;
 			}
 
-			megasas_refire_mgmt_cmd(instance);
+			megasas_refire_mgmt_cmd(instance,
+						(i == (MEGASAS_FUSION_MAX_RESET_TRIES - 1)
+							? 1 : 0));
 
 			/* Reset load balance info */
 			if (fusion->load_balance_info)
@@ -4964,8 +5008,16 @@ int megasas_reset_fusion(struct Scsi_Host *shost, int reason)
 				       (sizeof(struct LD_LOAD_BALANCE_INFO) *
 				       MAX_LOGICAL_DRIVES_EXT));
 
-			if (!megasas_get_map_info(instance))
+			if (!megasas_get_map_info(instance)) {
 				megasas_sync_map_info(instance);
+			} else {
+				/*
+				 * Return pending polled mode cmds before
+				 * retrying OCR
+				 */
+				megasas_return_polled_cmds(instance);
+				continue;
+			}
 
 			megasas_setup_jbod_map(instance);
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests
  2020-01-14 11:21 [PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1 Anand Lodnoor
                   ` (8 preceding siblings ...)
  2020-01-14 11:21 ` [PATCH v2 09/11] megaraid_sas: Limit the number of retries for the IOCTLs causing firmware fault Anand Lodnoor
@ 2020-01-14 11:21 ` Anand Lodnoor
  2020-01-16 12:30   ` Hannes Reinecke
  2020-01-14 11:21 ` [PATCH v2 11/11] megaraid_sas: Update driver version to 07.713.01.00-rc1 Anand Lodnoor
  2020-01-16  4:21 ` [PATCH v2 00/11] megaraid_sas: driver updates " Martin K. Petersen
  11 siblings, 1 reply; 24+ messages in thread
From: Anand Lodnoor @ 2020-01-14 11:21 UTC (permalink / raw)
  To: linux-scsi
  Cc: kashyap.desai, sumit.saxena, kiran-kumar.kasturi, sankar.patra,
	sasikumar.pc, shivasharan.srikanteshwara, chandrakanth.patil,
	Anand Lodnoor

Remove usage of device_busy counter from driver. Instead of
device_busy counter now driver uses 'nr_active' counter of
request_queue to get the number of inflight request for a
LUN.

Link : https://patchwork.kernel.org/patch/11249297/
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 56 ++++++++++++++++-------------
 1 file changed, 31 insertions(+), 25 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index 0bdd477..f3b36fd 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -364,6 +364,35 @@ inline void megasas_return_cmd_fusion(struct megasas_instance *instance,
 		instance->max_fw_cmds = instance->max_fw_cmds-1;
 	}
 }
+
+static inline void
+megasas_get_msix_index(struct megasas_instance *instance,
+		       struct scsi_cmnd *scmd,
+		       struct megasas_cmd_fusion *cmd,
+		       u8 data_arms)
+{
+	int sdev_busy;
+
+	/* nr_hw_queue = 1 for MegaRAID */
+	struct blk_mq_hw_ctx *hctx =
+		scmd->device->request_queue->queue_hw_ctx[0];
+
+	sdev_busy = atomic_read(&hctx->nr_active);
+
+	if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
+	    sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH))
+		cmd->request_desc->SCSIIO.MSIxIndex =
+			mega_mod64((atomic64_add_return(1, &instance->high_iops_outstanding) /
+					MR_HIGH_IOPS_BATCH_COUNT), instance->low_latency_index_start);
+	else if (instance->msix_load_balance)
+		cmd->request_desc->SCSIIO.MSIxIndex =
+			(mega_mod64(atomic64_add_return(1, &instance->total_io_count),
+				instance->msix_vectors));
+	else
+		cmd->request_desc->SCSIIO.MSIxIndex =
+			instance->reply_map[raw_smp_processor_id()];
+}
+
 /**
  * megasas_free_cmds_fusion -	Free all the cmds in the free cmd pool
  * @instance:		Adapter soft state
@@ -2829,19 +2858,7 @@ static void megasas_stream_detect(struct megasas_instance *instance,
 			fp_possible = (io_info.fpOkForIo > 0) ? true : false;
 	}
 
-	if ((instance->perf_mode == MR_BALANCED_PERF_MODE) &&
-		atomic_read(&scp->device->device_busy) >
-		(io_info.data_arms * MR_DEVICE_HIGH_IOPS_DEPTH))
-		cmd->request_desc->SCSIIO.MSIxIndex =
-			mega_mod64((atomic64_add_return(1, &instance->high_iops_outstanding) /
-				MR_HIGH_IOPS_BATCH_COUNT), instance->low_latency_index_start);
-	else if (instance->msix_load_balance)
-		cmd->request_desc->SCSIIO.MSIxIndex =
-			(mega_mod64(atomic64_add_return(1, &instance->total_io_count),
-				    instance->msix_vectors));
-	else
-		cmd->request_desc->SCSIIO.MSIxIndex =
-			instance->reply_map[raw_smp_processor_id()];
+	megasas_get_msix_index(instance, scp, cmd, io_info.data_arms);
 
 	if (instance->adapter_type >= VENTURA_SERIES) {
 		/* FP for Optimal raid level 1.
@@ -3162,18 +3179,7 @@ static void megasas_build_ld_nonrw_fusion(struct megasas_instance *instance,
 
 	cmd->request_desc->SCSIIO.DevHandle = io_request->DevHandle;
 
-	if ((instance->perf_mode == MR_BALANCED_PERF_MODE) &&
-		atomic_read(&scmd->device->device_busy) > MR_DEVICE_HIGH_IOPS_DEPTH)
-		cmd->request_desc->SCSIIO.MSIxIndex =
-			mega_mod64((atomic64_add_return(1, &instance->high_iops_outstanding) /
-				MR_HIGH_IOPS_BATCH_COUNT), instance->low_latency_index_start);
-	else if (instance->msix_load_balance)
-		cmd->request_desc->SCSIIO.MSIxIndex =
-			(mega_mod64(atomic64_add_return(1, &instance->total_io_count),
-				    instance->msix_vectors));
-	else
-		cmd->request_desc->SCSIIO.MSIxIndex =
-			instance->reply_map[raw_smp_processor_id()];
+	megasas_get_msix_index(instance, scmd, cmd, 1);
 
 	if (!fp_possible) {
 		/* system pd firmware path */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 11/11] megaraid_sas: Update driver version to 07.713.01.00-rc1
  2020-01-14 11:21 [PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1 Anand Lodnoor
                   ` (9 preceding siblings ...)
  2020-01-14 11:21 ` [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests Anand Lodnoor
@ 2020-01-14 11:21 ` Anand Lodnoor
  2020-01-16  4:21 ` [PATCH v2 00/11] megaraid_sas: driver updates " Martin K. Petersen
  11 siblings, 0 replies; 24+ messages in thread
From: Anand Lodnoor @ 2020-01-14 11:21 UTC (permalink / raw)
  To: linux-scsi
  Cc: kashyap.desai, sumit.saxena, kiran-kumar.kasturi, sankar.patra,
	sasikumar.pc, shivasharan.srikanteshwara, chandrakanth.patil,
	Anand Lodnoor

Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h
index 25afa81..83d8c4c 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -21,8 +21,8 @@
 /*
  * MegaRAID SAS Driver meta data
  */
-#define MEGASAS_VERSION				"07.710.50.00-rc1"
-#define MEGASAS_RELDATE				"June 28, 2019"
+#define MEGASAS_VERSION				"07.713.01.00-rc1"
+#define MEGASAS_RELDATE				"Dec 27, 2019"
 
 #define MEGASAS_MSIX_NAME_LEN			32
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1
  2020-01-14 11:21 [PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1 Anand Lodnoor
                   ` (10 preceding siblings ...)
  2020-01-14 11:21 ` [PATCH v2 11/11] megaraid_sas: Update driver version to 07.713.01.00-rc1 Anand Lodnoor
@ 2020-01-16  4:21 ` Martin K. Petersen
  11 siblings, 0 replies; 24+ messages in thread
From: Martin K. Petersen @ 2020-01-16  4:21 UTC (permalink / raw)
  To: Anand Lodnoor
  Cc: linux-scsi, kashyap.desai, sumit.saxena, kiran-kumar.kasturi,
	sankar.patra, sasikumar.pc, shivasharan.srikanteshwara,
	chandrakanth.patil


Anand,

> This patchset contains few enhancements and fixes in megaraid_sas driver.
>
> v2:
>     - Fixed couple of typos and indentation issues as pointed out by Martin K.
>     - Added proper commit descriptions where ever necessary.

Applied to 5.6/scsi-queue, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests
  2020-01-14 11:21 ` [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests Anand Lodnoor
@ 2020-01-16 12:30   ` Hannes Reinecke
  2020-01-17 11:19     ` Anand Lodnoor
  0 siblings, 1 reply; 24+ messages in thread
From: Hannes Reinecke @ 2020-01-16 12:30 UTC (permalink / raw)
  To: Anand Lodnoor, linux-scsi
  Cc: kashyap.desai, sumit.saxena, kiran-kumar.kasturi, sankar.patra,
	sasikumar.pc, shivasharan.srikanteshwara, chandrakanth.patil

On 1/14/20 12:21 PM, Anand Lodnoor wrote:
> Remove usage of device_busy counter from driver. Instead of
> device_busy counter now driver uses 'nr_active' counter of
> request_queue to get the number of inflight request for a
> LUN.
> 
> Link : https://patchwork.kernel.org/patch/11249297/
> Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
> Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
> ---
>  drivers/scsi/megaraid/megaraid_sas_fusion.c | 56 ++++++++++++++++-------------
>  1 file changed, 31 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> index 0bdd477..f3b36fd 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> @@ -364,6 +364,35 @@ inline void megasas_return_cmd_fusion(struct megasas_instance *instance,
>  		instance->max_fw_cmds = instance->max_fw_cmds-1;
>  	}
>  }
> +
> +static inline void
> +megasas_get_msix_index(struct megasas_instance *instance,
> +		       struct scsi_cmnd *scmd,
> +		       struct megasas_cmd_fusion *cmd,
> +		       u8 data_arms)
> +{
> +	int sdev_busy;
> +
> +	/* nr_hw_queue = 1 for MegaRAID */
> +	struct blk_mq_hw_ctx *hctx =
> +		scmd->device->request_queue->queue_hw_ctx[0];
> +
While this might be true it would be better to use the hctx from the
request itself:

struct blk_mq_hw_ctx *hctx = scmd->request->mq_hctx;

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      Teamlead Storage & Networking
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests
  2020-01-16 12:30   ` Hannes Reinecke
@ 2020-01-17 11:19     ` Anand Lodnoor
  2020-02-26 16:19       ` John Garry
  0 siblings, 1 reply; 24+ messages in thread
From: Anand Lodnoor @ 2020-01-17 11:19 UTC (permalink / raw)
  To: Hannes Reinecke, linux-scsi
  Cc: Kashyap Desai, Sumit Saxena, Kiran Kumar Kasturi, Sankar Patra,
	Sasikumar PC, Shivasharan Srikanteshwara, Chandrakanth Patil

Hannes,
               Thank you for pointing it out. Will incorporate the suggested
changes in the upcoming patches.

Thanks & Regards,
Anand R.L

-----Original Message-----
From: Hannes Reinecke [mailto:hare@suse.de]
Sent: Thursday, January 16, 2020 6:01 PM
To: Anand Lodnoor <anand.lodnoor@broadcom.com>; linux-scsi@vger.kernel.org
Cc: kashyap.desai@broadcom.com; sumit.saxena@broadcom.com;
kiran-kumar.kasturi@broadcom.com; sankar.patra@broadcom.com;
sasikumar.pc@broadcom.com; shivasharan.srikanteshwara@broadcom.com;
chandrakanth.patil@broadcom.com
Subject: Re: [PATCH v2 10/11] megaraid_sas: Use Block layer API to check
SCSI device in-flight IO requests

On 1/14/20 12:21 PM, Anand Lodnoor wrote:
> Remove usage of device_busy counter from driver. Instead of
> device_busy counter now driver uses 'nr_active' counter of
> request_queue to get the number of inflight request for a LUN.
>
> Link : https://patchwork.kernel.org/patch/11249297/
> Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
> Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
> ---
>  drivers/scsi/megaraid/megaraid_sas_fusion.c | 56
> ++++++++++++++++-------------
>  1 file changed, 31 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> index 0bdd477..f3b36fd 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> @@ -364,6 +364,35 @@ inline void megasas_return_cmd_fusion(struct
> megasas_instance *instance,
>  		instance->max_fw_cmds = instance->max_fw_cmds-1;
>  	}
>  }
> +
> +static inline void
> +megasas_get_msix_index(struct megasas_instance *instance,
> +		       struct scsi_cmnd *scmd,
> +		       struct megasas_cmd_fusion *cmd,
> +		       u8 data_arms)
> +{
> +	int sdev_busy;
> +
> +	/* nr_hw_queue = 1 for MegaRAID */
> +	struct blk_mq_hw_ctx *hctx =
> +		scmd->device->request_queue->queue_hw_ctx[0];
> +
While this might be true it would be better to use the hctx from the request
itself:

struct blk_mq_hw_ctx *hctx = scmd->request->mq_hctx;

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      Teamlead Storage & Networking
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg HRB
36809 (AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests
  2020-01-17 11:19     ` Anand Lodnoor
@ 2020-02-26 16:19       ` John Garry
       [not found]         ` <CAAO+jF-P3MkB2mo6pmYH1ihjRGpfjkkgXZg9dAZ29nYmU6T2=A@mail.gmail.com>
  0 siblings, 1 reply; 24+ messages in thread
From: John Garry @ 2020-02-26 16:19 UTC (permalink / raw)
  To: Anand Lodnoor, Hannes Reinecke, linux-scsi
  Cc: Kashyap Desai, Sumit Saxena, Kiran Kumar Kasturi, Sankar Patra,
	Sasikumar PC, Shivasharan Srikanteshwara, Chandrakanth Patil,
	Ming Lei

On 17/01/2020 11:19, Anand Lodnoor wrote:
> Hannes,
>                 Thank you for pointing it out. Will incorporate the suggested
> changes in the upcoming patches.

It doesn't look like this suggested change was incorporated in the end.

So I am rebasing series 
https://lore.kernel.org/linux-block/20191202153914.84722-1-hare@suse.de/, 
and this patch conflicts. But I did think that this was a strange 
change, apart from that.

> Thanks & Regards,
> Anand R.L
> 
> -----Original Message-----
> From: Hannes Reinecke [mailto:hare@suse.de]
> Sent: Thursday, January 16, 2020 6:01 PM
> To: Anand Lodnoor<anand.lodnoor@broadcom.com>;linux-scsi@vger.kernel.org
> Cc:kashyap.desai@broadcom.com;sumit.saxena@broadcom.com;
> kiran-kumar.kasturi@broadcom.com;sankar.patra@broadcom.com;
> sasikumar.pc@broadcom.com;shivasharan.srikanteshwara@broadcom.com;
> chandrakanth.patil@broadcom.com
> Subject: Re: [PATCH v2 10/11] megaraid_sas: Use Block layer API to check
> SCSI device in-flight IO requests
> 
> On 1/14/20 12:21 PM, Anand Lodnoor wrote:
>> Remove usage of device_busy counter from driver. Instead of
>> device_busy counter now driver uses 'nr_active' counter of
>> request_queue to get the number of inflight request for a LUN.

Is blk_mq_hw_ctx.nr_active really the same as scsi_device.device_busy?

Thanks,
John

>>
>> Link :https://patchwork.kernel.org/patch/11249297/
>> Signed-off-by: Chandrakanth Patil<chandrakanth.patil@broadcom.com>
>> Signed-off-by: Anand Lodnoor<anand.lodnoor@broadcom.com>
>> ---
>>   drivers/scsi/megaraid/megaraid_sas_fusion.c | 56
>> ++++++++++++++++-------------
>>   1 file changed, 31 insertions(+), 25 deletions(-)
>>
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> index 0bdd477..f3b36fd 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> @@ -364,6 +364,35 @@ inline void megasas_return_cmd_fusion(struct
>> megasas_instance *instance,
>>   		instance->max_fw_cmds = instance->max_fw_cmds-1;
>>   	}
>>   }
>> +
>> +static inline void
>> +megasas_get_msix_index(struct megasas_instance *instance,
>> +		       struct scsi_cmnd *scmd,
>> +		       struct megasas_cmd_fusion *cmd,
>> +		       u8 data_arms)
>> +{
>> +	int sdev_busy;
>> +
>> +	/* nr_hw_queue = 1 for MegaRAID */
>> +	struct blk_mq_hw_ctx *hctx =
>> +		scmd->device->request_queue->queue_hw_ctx[0];
>> +
> While this might be true it would be better to use the hctx from the request
> itself:
> 
> struct blk_mq_hw_ctx *hctx = scmd->request->mq_hctx;
> 
> Cheers,
> 
> Hannes
> -- Dr. Hannes Reinecke Teamlead Storage & Networking hare@suse.de	+49 
> 911 74053 688 SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 
> Nürnberg HRB 36809 (AG Nürnberg), GF: Felix Imendörffer


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests
       [not found]         ` <CAAO+jF-P3MkB2mo6pmYH1ihjRGpfjkkgXZg9dAZ29nYmU6T2=A@mail.gmail.com>
@ 2020-02-27 12:32           ` John Garry
  2020-03-02  9:17             ` Hannes Reinecke
  0 siblings, 1 reply; 24+ messages in thread
From: John Garry @ 2020-02-27 12:32 UTC (permalink / raw)
  To: Anand Lodnoor
  Cc: Hannes Reinecke, linux-scsi, Kashyap Desai, Sumit Saxena,
	Kiran Kumar Kasturi, Sankar Patra, Sasikumar PC,
	Shivasharan Srikanteshwara, Chandrakanth Patil, Ming Lei,
	Bart Van Assche, Martin K . Petersen

> 
>     Is blk_mq_hw_ctx.nr_active really the same as scsi_device.device_busy?
> 
> *Both of them are not the same but it serves our purpose to get the 
> number of outstanding io requests. Please refer below link for more 
> details:*
> 
> https://lore.kernel.org/linux-scsi/20191105002334.GA11436@ming.t460p/

Thanks for the pointer, but there did not seem to be a conclusion there: 
https://lore.kernel.org/linux-scsi/20191105002334.GA11436@ming.t460p/

Anyway, if we move to exposing multiple HW queues in the megaraid SAS 
driver:

  host->nr_hw_queues = instance->msix_vectors -
                       instance->low_latency_index_start;

Then hctx->nr_active will no longer be the total active requests per 
host, but rather per hctx.

In addition, hctx->nr_active will no longer be properly maintained, as 
it would be based on the hctx HW queue actually being used by the LLDD 
for that request, which is not always true now. That is because in 
megasas_get_msix_index() a judgement may be made to use a low-latency HW 
queue instead:

static inline void
megasas_get_msix_index(struct megasas_instance *instance,
		       struct scsi_cmnd *scmd,
		       struct megasas_cmd_fusion *cmd,
		       u8 data_arms)
{
...

sdev_busy = atomic_read(&hctx->nr_active);

if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
     sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH))
	cmd->request_desc->SCSIIO.MSIxIndex =
			mega_mod64(...),
	else if (instance->msix_load_balance)
		cmd->request_desc->SCSIIO.MSIxIndex =
			(mega_mod64(...),
				instance->msix_vectors));

Will this make a difference? I am not sure. Maybe, on this basis, 
magaraid sas is not a good candidate to change to expose multiple queues.

Ignoring that for a moment, since we no longer keep a host busy count, 
and I figure that we don't want to back to using 
scsi_device.device_busy, is the judgement of hctx->nr_active ok to use 
to decide whether to use these performance queues?

Thanks,
John

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests
  2020-02-27 12:32           ` John Garry
@ 2020-03-02  9:17             ` Hannes Reinecke
  2020-03-02  9:51               ` John Garry
  0 siblings, 1 reply; 24+ messages in thread
From: Hannes Reinecke @ 2020-03-02  9:17 UTC (permalink / raw)
  To: John Garry, Anand Lodnoor
  Cc: linux-scsi, Kashyap Desai, Sumit Saxena, Kiran Kumar Kasturi,
	Sankar Patra, Sasikumar PC, Shivasharan Srikanteshwara,
	Chandrakanth Patil, Ming Lei, Bart Van Assche,
	Martin K . Petersen

On 2/27/20 1:32 PM, John Garry wrote:
>>
>>     Is blk_mq_hw_ctx.nr_active really the same as 
>> scsi_device.device_busy?
>>
>> *Both of them are not the same but it serves our purpose to get the 
>> number of outstanding io requests. Please refer below link for more 
>> details:*
>>
>> https://lore.kernel.org/linux-scsi/20191105002334.GA11436@ming.t460p/
> 
> Thanks for the pointer, but there did not seem to be a conclusion there: 
> https://lore.kernel.org/linux-scsi/20191105002334.GA11436@ming.t460p/
> 
> Anyway, if we move to exposing multiple HW queues in the megaraid SAS 
> driver:
> 
>   host->nr_hw_queues = instance->msix_vectors -
>                        instance->low_latency_index_start;
> 
> Then hctx->nr_active will no longer be the total active requests per 
> host, but rather per hctx.
> 
> In addition, hctx->nr_active will no longer be properly maintained, as 
> it would be based on the hctx HW queue actually being used by the LLDD 
> for that request, which is not always true now. That is because in 
> megasas_get_msix_index() a judgement may be made to use a low-latency HW 
> queue instead:
> 
> static inline void
> megasas_get_msix_index(struct megasas_instance *instance,
>                 struct scsi_cmnd *scmd,
>                 struct megasas_cmd_fusion *cmd,
>                 u8 data_arms)
> {
> ...
> 
> sdev_busy = atomic_read(&hctx->nr_active);
> 
> if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
>      sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH))
>      cmd->request_desc->SCSIIO.MSIxIndex =
>              mega_mod64(...),
>      else if (instance->msix_load_balance)
>          cmd->request_desc->SCSIIO.MSIxIndex =
>              (mega_mod64(...),
>                  instance->msix_vectors));
> 
> Will this make a difference? I am not sure. Maybe, on this basis, 
> magaraid sas is not a good candidate to change to expose multiple queues.
> 
> Ignoring that for a moment, since we no longer keep a host busy count, 
> and I figure that we don't want to back to using 
> scsi_device.device_busy, is the judgement of hctx->nr_active ok to use 
> to decide whether to use these performance queues?
> 
Personally, I wonder if the current implementation of high-IOPs queues 
makes sense with multiqueue.
Thing is, the current high-IOPs queue mechanism of shifting I/O to 
another internal queue doesn't align nicely with the blk-mq architecture.
What we _do_ have, though, is a 'poll' queue mechanism, allowing to 
separate out one (or several) queues for polling, to allow for .. 
indeed, high-IOPs.
So it would be interesting to figure out if we don't get similar 
performance by using the 'poll' queue implementation from blk-mq instead 
of the current one.

Which would also have the benefit that we could support the io_uring 
interface natively with megaraid_sas, which I think would be a benefit 
on its own.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests
  2020-03-02  9:17             ` Hannes Reinecke
@ 2020-03-02  9:51               ` John Garry
  2020-03-02 18:37                 ` Sumit Saxena
  0 siblings, 1 reply; 24+ messages in thread
From: John Garry @ 2020-03-02  9:51 UTC (permalink / raw)
  To: Hannes Reinecke, Anand Lodnoor
  Cc: linux-scsi, Kashyap Desai, Sumit Saxena, Kiran Kumar Kasturi,
	Sankar Patra, Sasikumar PC, Shivasharan Srikanteshwara,
	Chandrakanth Patil, Ming Lei, Bart Van Assche,
	Martin K . Petersen


>> static inline void
>> megasas_get_msix_index(struct megasas_instance *instance,
>>                 struct scsi_cmnd *scmd,
>>                 struct megasas_cmd_fusion *cmd,
>>                 u8 data_arms)
>> {
>> ...
>>
>> sdev_busy = atomic_read(&hctx->nr_active);
>>
>> if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
>>      sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH))
>>      cmd->request_desc->SCSIIO.MSIxIndex =
>>              mega_mod64(...),
>>      else if (instance->msix_load_balance)
>>          cmd->request_desc->SCSIIO.MSIxIndex =
>>              (mega_mod64(...),
>>                  instance->msix_vectors));
>>
>> Will this make a difference? I am not sure. Maybe, on this basis, 
>> magaraid sas is not a good candidate to change to expose multiple queues.
>>
>> Ignoring that for a moment, since we no longer keep a host busy count, 
>> and I figure that we don't want to back to using 
>> scsi_device.device_busy, is the judgement of hctx->nr_active ok to use 
>> to decide whether to use these performance queues?
>>
> Personally, I wonder if the current implementation of high-IOPs queues 
> makes sense with multiqueue. > Thing is, the current high-IOPs queue mechanism of shifting I/O to
> another internal queue doesn't align nicely with the blk-mq architecture.

Right, we should not be hiding HW queues from blk-mq like this. This 
breaks the symmetry. Maybe we can move this functionality to blk-mq, but 
I doubt that this is a common use case.

> What we _do_ have, though, is a 'poll' queue mechanism, allowing to 
> separate out one (or several) queues for polling, to allow for .. 
> indeed, high-IOPs.

Any examples or references for this?

> So it would be interesting to figure out if we don't get similar 
> performance by using the 'poll' queue implementation from blk-mq instead 
> of the current one.

I thought that this driver/or mpt3sas already used a polling mode.

And for these low-latency queues, I figure that the issue is not just 
polling vs interrupt, but indeed how fast the HW queue can consume SQEs. 
A HW queue may only be able to consume at a limited rate - that's why we 
segregate.

As an aside, that is actually an issue for blk-mq. For 1 to many HW 
queue-to-CPU mapping, limiting many CPUs a single queue can limit IOPs 
since HW queues can only consume at a limited rate.

> 
> Which would also have the benefit that we could support the io_uring 
> interface natively with megaraid_sas, which I think would be a benefit 
> on its own.
> 

thanks,
John


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests
  2020-03-02  9:51               ` John Garry
@ 2020-03-02 18:37                 ` Sumit Saxena
  2020-03-03 11:53                   ` John Garry
  0 siblings, 1 reply; 24+ messages in thread
From: Sumit Saxena @ 2020-03-02 18:37 UTC (permalink / raw)
  To: John Garry
  Cc: Hannes Reinecke, Anand Lodnoor, Linux SCSI List, Kashyap Desai,
	Kiran Kumar Kasturi, Sankar Patra, Sasikumar PC,
	Shivasharan Srikanteshwara, Chandrakanth Patil, Ming Lei,
	Bart Van Assche, Martin K . Petersen

On Mon, Mar 2, 2020 at 3:21 PM John Garry <john.garry@huawei.com> wrote:
>
>
> >> static inline void
> >> megasas_get_msix_index(struct megasas_instance *instance,
> >>                 struct scsi_cmnd *scmd,
> >>                 struct megasas_cmd_fusion *cmd,
> >>                 u8 data_arms)
> >> {
> >> ...
> >>
> >> sdev_busy = atomic_read(&hctx->nr_active);
> >>
> >> if (instance->perf_mode == MR_BALANCED_PERF_MODE &&
> >>      sdev_busy > (data_arms * MR_DEVICE_HIGH_IOPS_DEPTH))
> >>      cmd->request_desc->SCSIIO.MSIxIndex =
> >>              mega_mod64(...),
> >>      else if (instance->msix_load_balance)
> >>          cmd->request_desc->SCSIIO.MSIxIndex =
> >>              (mega_mod64(...),
> >>                  instance->msix_vectors));
> >>
> >> Will this make a difference? I am not sure. Maybe, on this basis,
> >> magaraid sas is not a good candidate to change to expose multiple queues.
> >>
> >> Ignoring that for a moment, since we no longer keep a host busy count,
> >> and I figure that we don't want to back to using
> >> scsi_device.device_busy, is the judgement of hctx->nr_active ok to use
> >> to decide whether to use these performance queues?
> >>
> > Personally, I wonder if the current implementation of high-IOPs queues
> > makes sense with multiqueue. > Thing is, the current high-IOPs queue mechanism of shifting I/O to
> > another internal queue doesn't align nicely with the blk-mq architecture.
>
> Right, we should not be hiding HW queues from blk-mq like this. This
> breaks the symmetry. Maybe we can move this functionality to blk-mq, but
> I doubt that this is a common use case.
We added this concept of extra queues for megraid_sas latest gen of controllers
for performance reasons. Here is some background-
https://lore.kernel.org/lkml/20180829084618.GA24765@ming.t460p/t/
We worked with the community to have such interface for managed(for
low latency queues) and non-managed(High IOPs queues)
interrupts co-existence.

>
> > What we _do_ have, though, is a 'poll' queue mechanism, allowing to
> > separate out one (or several) queues for polling, to allow for ..
> > indeed, high-IOPs.
>
> Any examples or references for this?
>
> > So it would be interesting to figure out if we don't get similar
> > performance by using the 'poll' queue implementation from blk-mq instead
> > of the current one.
>
> I thought that this driver/or mpt3sas already used a polling mode.
>
> And for these low-latency queues, I figure that the issue is not just
> polling vs interrupt, but indeed how fast the HW queue can consume SQEs.
> A HW queue may only be able to consume at a limited rate - that's why we
> segregate.
Yes, there is no polling in any of HW queues. High IOPs queues have
interrupt coalescing enabled whereas
low latency queues does not have interrupt coalescing. megaraid_sas
driver would choose which set of queues
among these two has to be used depending on workload. For latency
oriented workload, driver would use low
latency queues and for IOPs profile, driver would use High IOPs queues.
>
> As an aside, that is actually an issue for blk-mq. For 1 to many HW
> queue-to-CPU mapping, limiting many CPUs a single queue can limit IOPs
> since HW queues can only consume at a limited rate.
We were able to achieve performance target for MegaRAID latest gen
controller with this model of few set
 of HW queues mapped to local numa CPUs and low latency queues has one
to one mapping to CPUs.
This is default behavior of queues segregation in megaraid_sas driver
to satisfy our IOPs and latency requirements altogether.
However we have given module parameter- "perf_mode" to tune queues
behavior. i.e turning on/off interrupt
coalescing on all HW queues where this one to many queues to CPU
mapping would not happen.

Thanks,
Sumit
>
> >
> > Which would also have the benefit that we could support the io_uring
> > interface natively with megaraid_sas, which I think would be a benefit
> > on its own.
> >
>
> thanks,
> John
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests
  2020-03-02 18:37                 ` Sumit Saxena
@ 2020-03-03 11:53                   ` John Garry
  2020-03-03 11:56                     ` Hannes Reinecke
  2020-03-03 17:04                     ` Sumit Saxena
  0 siblings, 2 replies; 24+ messages in thread
From: John Garry @ 2020-03-03 11:53 UTC (permalink / raw)
  To: Sumit Saxena
  Cc: Hannes Reinecke, Anand Lodnoor, Linux SCSI List, Kashyap Desai,
	Kiran Kumar Kasturi, Sankar Patra, Sasikumar PC,
	Shivasharan Srikanteshwara, Chandrakanth Patil, Ming Lei,
	Bart Van Assche, Martin K . Petersen

>> And for these low-latency queues, I figure that the issue is not just
>> polling vs interrupt, but indeed how fast the HW queue can consume SQEs.
>> A HW queue may only be able to consume at a limited rate - that's why we
>> segregate.
> Yes, there is no polling in any of HW queues. High IOPs queues have
> interrupt coalescing enabled whereas
> low latency queues does not have interrupt coalescing. megaraid_sas
> driver would choose which set of queues	
> among these two has to be used depending on workload. For latency
> oriented workload, driver would use low
> latency queues and for IOPs profile, driver would use High IOPs queues.
>>
>> As an aside, that is actually an issue for blk-mq. For 1 to many HW
>> queue-to-CPU mapping, limiting many CPUs a single queue can limit IOPs
>> since HW queues can only consume at a limited rate.
> We were able to achieve performance target for MegaRAID latest gen
> controller with this model of few set
>   of HW queues mapped to local numa CPUs and low latency queues has one
> to one mapping to CPUs.
> This is default behavior of queues segregation in megaraid_sas driver
> to satisfy our IOPs and latency requirements altogether.
> However we have given module parameter- "perf_mode" to tune queues
> behavior. i.e turning on/off interrupt
> coalescing on all HW queues where this one to many queues to CPU
> mapping would not happen.

Hi Sumit,

OK, I have a rough idea of the concept. And again I'd say megaraid sas 
may not be a good candidate to expose > 1 HW queues, as we hide HW 
queues and don't maintain the symmetry with blk-mq layer.

Indeed, I do not even expect a performance increase in exposing > 1 HW 
queues since the driver already uses the reply map + managed interrupts.

The main reason for that change in some drivers - apart from losing the 
duplicated ugliness of the reply map - is to leverage the blk-mq feature 
to drain a hctx for CPU hotplug [0] - is this something which megaraid 
sas is vulnerable to and would benefit from?

Thanks,
John

[0] 
https://lore.kernel.org/linux-block/20200115114409.28895-1-ming.lei@redhat.com/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests
  2020-03-03 11:53                   ` John Garry
@ 2020-03-03 11:56                     ` Hannes Reinecke
  2020-03-03 17:04                     ` Sumit Saxena
  1 sibling, 0 replies; 24+ messages in thread
From: Hannes Reinecke @ 2020-03-03 11:56 UTC (permalink / raw)
  To: John Garry, Sumit Saxena
  Cc: Anand Lodnoor, Linux SCSI List, Kashyap Desai,
	Kiran Kumar Kasturi, Sankar Patra, Sasikumar PC,
	Shivasharan Srikanteshwara, Chandrakanth Patil, Ming Lei,
	Bart Van Assche, Martin K . Petersen

On 3/3/20 12:53 PM, John Garry wrote:
>>> And for these low-latency queues, I figure that the issue is not just
>>> polling vs interrupt, but indeed how fast the HW queue can consume SQEs.
>>> A HW queue may only be able to consume at a limited rate - that's why we
>>> segregate.
>> Yes, there is no polling in any of HW queues. High IOPs queues have
>> interrupt coalescing enabled whereas
>> low latency queues does not have interrupt coalescing. megaraid_sas
>> driver would choose which set of queues   
>> among these two has to be used depending on workload. For latency
>> oriented workload, driver would use low
>> latency queues and for IOPs profile, driver would use High IOPs queues.
>>>
>>> As an aside, that is actually an issue for blk-mq. For 1 to many HW
>>> queue-to-CPU mapping, limiting many CPUs a single queue can limit IOPs
>>> since HW queues can only consume at a limited rate.
>> We were able to achieve performance target for MegaRAID latest gen
>> controller with this model of few set
>>   of HW queues mapped to local numa CPUs and low latency queues has one
>> to one mapping to CPUs.
>> This is default behavior of queues segregation in megaraid_sas driver
>> to satisfy our IOPs and latency requirements altogether.
>> However we have given module parameter- "perf_mode" to tune queues
>> behavior. i.e turning on/off interrupt
>> coalescing on all HW queues where this one to many queues to CPU
>> mapping would not happen.
> 
> Hi Sumit,
> 
> OK, I have a rough idea of the concept. And again I'd say megaraid sas
> may not be a good candidate to expose > 1 HW queues, as we hide HW
> queues and don't maintain the symmetry with blk-mq layer.
> 
> Indeed, I do not even expect a performance increase in exposing > 1 HW
> queues since the driver already uses the reply map + managed interrupts.
> 
> The main reason for that change in some drivers - apart from losing the
> duplicated ugliness of the reply map - is to leverage the blk-mq feature
> to drain a hctx for CPU hotplug [0] - is this something which megaraid
> sas is vulnerable to and would benefit from?
> 
I would guess so.
Megaraid_sas (much like mpt3sas) has a mailbox interface, requiring you
to just write the address of the command into it.
The command itself carries the information about which MSIx interrupt
firmware should post completions on, so if the cpu serving that
interrupt is gone we're in trouble as we'll never see the completion.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests
  2020-03-03 11:53                   ` John Garry
  2020-03-03 11:56                     ` Hannes Reinecke
@ 2020-03-03 17:04                     ` Sumit Saxena
  2020-03-04  9:39                       ` John Garry
  1 sibling, 1 reply; 24+ messages in thread
From: Sumit Saxena @ 2020-03-03 17:04 UTC (permalink / raw)
  To: John Garry
  Cc: Hannes Reinecke, Anand Lodnoor, Linux SCSI List, Kashyap Desai,
	Kiran Kumar Kasturi, Sankar Patra, Sasikumar PC,
	Shivasharan Srikanteshwara, Chandrakanth Patil, Ming Lei,
	Bart Van Assche, Martin K . Petersen

On Tue, Mar 3, 2020 at 5:23 PM John Garry <john.garry@huawei.com> wrote:
>
> >> And for these low-latency queues, I figure that the issue is not just
> >> polling vs interrupt, but indeed how fast the HW queue can consume SQEs.
> >> A HW queue may only be able to consume at a limited rate - that's why we
> >> segregate.
> > Yes, there is no polling in any of HW queues. High IOPs queues have
> > interrupt coalescing enabled whereas
> > low latency queues does not have interrupt coalescing. megaraid_sas
> > driver would choose which set of queues
> > among these two has to be used depending on workload. For latency
> > oriented workload, driver would use low
> > latency queues and for IOPs profile, driver would use High IOPs queues.
> >>
> >> As an aside, that is actually an issue for blk-mq. For 1 to many HW
> >> queue-to-CPU mapping, limiting many CPUs a single queue can limit IOPs
> >> since HW queues can only consume at a limited rate.
> > We were able to achieve performance target for MegaRAID latest gen
> > controller with this model of few set
> >   of HW queues mapped to local numa CPUs and low latency queues has one
> > to one mapping to CPUs.
> > This is default behavior of queues segregation in megaraid_sas driver
> > to satisfy our IOPs and latency requirements altogether.
> > However we have given module parameter- "perf_mode" to tune queues
> > behavior. i.e turning on/off interrupt
> > coalescing on all HW queues where this one to many queues to CPU
> > mapping would not happen.
>
> Hi Sumit,
Hi John,
>
> OK, I have a rough idea of the concept. And again I'd say megaraid sas
> may not be a good candidate to expose > 1 HW queues, as we hide HW
> queues and don't maintain the symmetry with blk-mq layer.
Sorry, my last response was not very clear. I was referring to reply
queues as HW queues
not submission queues. I agree with you, since megaraid_sas HW has
single submission
queue so >1 HW queue would not help to improve performance. Testing
done by us on shared
tagset patches worked by you/Hannes was to ensure no performance drop
from single HW
submission queue based driver.
>
> Indeed, I do not even expect a performance increase in exposing > 1 HW
> queues since the driver already uses the reply map + managed interrupts.
>
> The main reason for that change in some drivers - apart from losing the
> duplicated ugliness of the reply map - is to leverage the blk-mq feature
> to drain a hctx for CPU hotplug [0] - is this something which megaraid
> sas is vulnerable to and would benefit from?
"megaraid_sas" driver would be benefited with draining of IO
completions directed to
hotplugged(offlined) CPU. With current driver IO completion would
hang, if CPU on which IO is to be
completed goes offline.

Thanks,
Sumit
>
> Thanks,
> John
>
> [0]
> https://lore.kernel.org/linux-block/20200115114409.28895-1-ming.lei@redhat.com/

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests
  2020-03-03 17:04                     ` Sumit Saxena
@ 2020-03-04  9:39                       ` John Garry
  0 siblings, 0 replies; 24+ messages in thread
From: John Garry @ 2020-03-04  9:39 UTC (permalink / raw)
  To: Sumit Saxena
  Cc: Hannes Reinecke, Anand Lodnoor, Linux SCSI List, Kashyap Desai,
	Kiran Kumar Kasturi, Sankar Patra, Sasikumar PC,
	Shivasharan Srikanteshwara, Chandrakanth Patil, Ming Lei,
	Bart Van Assche, Martin K . Petersen

>>
>> OK, I have a rough idea of the concept. And again I'd say megaraid sas
>> may not be a good candidate to expose > 1 HW queues, as we hide HW
>> queues and don't maintain the symmetry with blk-mq layer.
> Sorry, my last response was not very clear. I was referring to reply
> queues as HW queues
> not submission queues. I agree with you, since megaraid_sas HW has
> single submission
> queue so >1 HW queue would not help to improve performance. Testing
> done by us on shared
> tagset patches worked by you/Hannes was to ensure no performance drop
> from single HW
> submission queue based driver.

OK, but I still have concern with this. That's your choice.

>>
>> Indeed, I do not even expect a performance increase in exposing > 1 HW
>> queues since the driver already uses the reply map + managed interrupts.
>>
>> The main reason for that change in some drivers - apart from losing the
>> duplicated ugliness of the reply map - is to leverage the blk-mq feature
>> to drain a hctx for CPU hotplug [0] - is this something which megaraid
>> sas is vulnerable to and would benefit from?
> "megaraid_sas" driver would be benefited with draining of IO
> completions directed to
> hotplugged(offlined) CPU. With current driver IO completion would
> hang, if CPU on which IO is to be
> completed goes offline.

But that feature will only work for the queues which you expose. For the 
low-latency queues, there would be no draining*. However, the 
low-latency interrupts are not managed; as such, I think that their 
interrupts would migrate when their cpumask goes offline, rather than 
being shutdown, so not vulnerable to this problem.

* In principle, since you can submit the scsi request on different hw 
queue than expected from blk-mq perspective, when we offline the cpu 
which blk-mq set to submit on, blk-mq may actually wait for requests to 
complete on these low-latency queues and addition to the HW queue which 
blk-mq thought that the request would be submitted on - again, not 
ideal, and may cause problems.

Thanks,
John

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2020-03-04  9:39 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-14 11:21 [PATCH v2 00/11] megaraid_sas: driver updates to 07.713.01.00-rc1 Anand Lodnoor
2020-01-14 11:21 ` [PATCH v2 01/11] megaraid_sas: Reset adapter if FW is not in READY state after device resume Anand Lodnoor
2020-01-14 11:21 ` [PATCH v2 02/11] megaraid_sas: Set no_write_same only for Virtual Disk Anand Lodnoor
2020-01-14 11:21 ` [PATCH v2 03/11] megaraid_sas: Update optimal queue depth for SAS and NVMe devices Anand Lodnoor
2020-01-14 11:21 ` [PATCH v2 04/11] megaraid_sas: Do not kill host bus adapter, if adapter is already dead Anand Lodnoor
2020-01-14 11:21 ` [PATCH v2 05/11] megaraid_sas: Do not kill HBA if JBOD Seqence map or RAID map is disabled Anand Lodnoor
2020-01-14 11:21 ` [PATCH v2 06/11] megaraid_sas: Do not set HBA Operational if FW is not in operational state Anand Lodnoor
2020-01-14 11:21 ` [PATCH v2 07/11] megaraid_sas: Re-Define enum DCMD_RETURN_STATUS Anand Lodnoor
2020-01-14 11:21 ` [PATCH v2 08/11] megaraid_sas: Do not initiate OCR if controller is not in ready state Anand Lodnoor
2020-01-14 11:21 ` [PATCH v2 09/11] megaraid_sas: Limit the number of retries for the IOCTLs causing firmware fault Anand Lodnoor
2020-01-14 11:21 ` [PATCH v2 10/11] megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests Anand Lodnoor
2020-01-16 12:30   ` Hannes Reinecke
2020-01-17 11:19     ` Anand Lodnoor
2020-02-26 16:19       ` John Garry
     [not found]         ` <CAAO+jF-P3MkB2mo6pmYH1ihjRGpfjkkgXZg9dAZ29nYmU6T2=A@mail.gmail.com>
2020-02-27 12:32           ` John Garry
2020-03-02  9:17             ` Hannes Reinecke
2020-03-02  9:51               ` John Garry
2020-03-02 18:37                 ` Sumit Saxena
2020-03-03 11:53                   ` John Garry
2020-03-03 11:56                     ` Hannes Reinecke
2020-03-03 17:04                     ` Sumit Saxena
2020-03-04  9:39                       ` John Garry
2020-01-14 11:21 ` [PATCH v2 11/11] megaraid_sas: Update driver version to 07.713.01.00-rc1 Anand Lodnoor
2020-01-16  4:21 ` [PATCH v2 00/11] megaraid_sas: driver updates " Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).