All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] megaraid_sas: Updates for scsi-next
@ 2016-10-17 10:24 Sumit Saxena
  2016-10-17 10:24 ` [PATCH 1/7] megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for 30secs before reset Sumit Saxena
                   ` (6 more replies)
  0 siblings, 7 replies; 49+ messages in thread
From: Sumit Saxena @ 2016-10-17 10:24 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, thenzl, jejb, kashyap.desai, Sumit Saxena

Sumit Saxena (7):
  megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for
    30secs before reset
  megaraid_sas: Send correct PhysArm to FW for R1 VD downgrade
  megaraid_sas: Do not fire DCMDs during PCI shutdown/detach
  megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  megaraid_sas: driver version upgrade
  MAINTAINERS: Update megaraid maintainers list
  megaraid_sas: Do not set MPI2_TYPE_CUDA for JBOD FP path for FW which
    does not support JBOD sequence map

 MAINTAINERS                                 | 10 ++---
 drivers/scsi/megaraid/megaraid_sas.h        |  7 +++-
 drivers/scsi/megaraid/megaraid_sas_base.c   | 60 +++++++++++++++++++++++++----
 drivers/scsi/megaraid/megaraid_sas_fp.c     |  6 ++-
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 21 ++++++----
 drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 +
 6 files changed, 82 insertions(+), 24 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH 1/7] megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for 30secs before reset
  2016-10-17 10:24 [PATCH 0/7] megaraid_sas: Updates for scsi-next Sumit Saxena
@ 2016-10-17 10:24 ` Sumit Saxena
  2016-10-17 11:29     ` Hannes Reinecke
  2016-10-17 12:43   ` Tomas Henzl
  2016-10-17 10:24 ` [PATCH 2/7] megaraid_sas: Send correct PhysArm to FW for R1 VD downgrade Sumit Saxena
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 49+ messages in thread
From: Sumit Saxena @ 2016-10-17 10:24 UTC (permalink / raw)
  To: linux-scsi
  Cc: martin.petersen, thenzl, jejb, kashyap.desai, Sumit Saxena,
	stable, Kiran Kumar Kasturi

For SRIOV enabled firmware, if there is a OCR(online controller reset)
possibility driver set the convert flag to 1, which is not happening if
there are outstanding commands even after 180 seconds.
As driver does not set convert flag to 1 and still making the OCR to run,
VF(Virtual function) driver is directly writing on to the register
instead of waiting for 30 seconds. Setting convert flag to 1 will cause
VF driver will wait for 30 secs before going for reset.

CC: stable@vger.kernel.org
Signed-off-by: Kiran Kumar Kasturi <kiran-kumar.kasturi@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index 52d8bbf..61be7ed 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -2823,6 +2823,7 @@ int megasas_wait_for_outstanding_fusion(struct megasas_instance *instance,
 		dev_err(&instance->pdev->dev, "pending commands remain after waiting, "
 		       "will reset adapter scsi%d.\n",
 		       instance->host->host_no);
+		*convert = 1;
 		retval = 1;
 	}
 out:
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 2/7] megaraid_sas: Send correct PhysArm to FW for R1 VD downgrade
  2016-10-17 10:24 [PATCH 0/7] megaraid_sas: Updates for scsi-next Sumit Saxena
  2016-10-17 10:24 ` [PATCH 1/7] megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for 30secs before reset Sumit Saxena
@ 2016-10-17 10:24 ` Sumit Saxena
  2016-10-17 11:29   ` Hannes Reinecke
  2016-10-17 12:50   ` Tomas Henzl
  2016-10-17 10:24 ` [PATCH 3/7] megaraid_sas: Do not fire DCMDs during PCI shutdown/detach Sumit Saxena
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 49+ messages in thread
From: Sumit Saxena @ 2016-10-17 10:24 UTC (permalink / raw)
  To: linux-scsi
  Cc: martin.petersen, thenzl, jejb, kashyap.desai, Sumit Saxena,
	Kiran Kumar Kasturi

This patch fixes the issue of wrong PhysArm was sent to firmware for R1
VD downgrade.

Signed-off-by: Kiran Kumar Kasturi <kiran-kumar.kasturi@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas_fp.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_fp.c b/drivers/scsi/megaraid/megaraid_sas_fp.c
index e413113..f237d00 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fp.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fp.c
@@ -782,7 +782,8 @@ static u8 mr_spanset_get_phy_params(struct megasas_instance *instance, u32 ld,
 			(raid->regTypeReqOnRead != REGION_TYPE_UNUSED))))
 			pRAID_Context->regLockFlags = REGION_TYPE_EXCLUSIVE;
 		else if (raid->level == 1) {
-			pd = MR_ArPdGet(arRef, physArm + 1, map);
+			physArm = physArm + 1;
+			pd = MR_ArPdGet(arRef, physArm, map);
 			if (pd != MR_PD_INVALID)
 				*pDevHandle = MR_PdDevHandleGet(pd, map);
 		}
@@ -879,7 +880,8 @@ u8 MR_GetPhyParams(struct megasas_instance *instance, u32 ld, u64 stripRow,
 			pRAID_Context->regLockFlags = REGION_TYPE_EXCLUSIVE;
 		else if (raid->level == 1) {
 			/* Get alternate Pd. */
-			pd = MR_ArPdGet(arRef, physArm + 1, map);
+			physArm = physArm + 1;
+			pd = MR_ArPdGet(arRef, physArm, map);
 			if (pd != MR_PD_INVALID)
 				/* Get dev handle from Pd */
 				*pDevHandle = MR_PdDevHandleGet(pd, map);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 3/7] megaraid_sas: Do not fire DCMDs during PCI shutdown/detach
  2016-10-17 10:24 [PATCH 0/7] megaraid_sas: Updates for scsi-next Sumit Saxena
  2016-10-17 10:24 ` [PATCH 1/7] megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for 30secs before reset Sumit Saxena
  2016-10-17 10:24 ` [PATCH 2/7] megaraid_sas: Send correct PhysArm to FW for R1 VD downgrade Sumit Saxena
@ 2016-10-17 10:24 ` Sumit Saxena
  2016-10-17 11:31   ` Hannes Reinecke
  2016-10-17 10:24 ` [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware Sumit Saxena
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 49+ messages in thread
From: Sumit Saxena @ 2016-10-17 10:24 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, thenzl, jejb, kashyap.desai, Sumit Saxena

This patch addresses the issue of driver firing DCMDs in PCI
shutdown/detach path irrespective of firmware state.
Driver will check for whether firmware is operational state or not
before firing DCMDs. If firmware is in unrecoverbale
state or does not become operational within specfied time, driver will
skip firing DCMDs.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas_base.c   | 46 +++++++++++++++++++++++++++++
 drivers/scsi/megaraid/megaraid_sas_fusion.c |  9 ++++--
 2 files changed, 52 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 9ff57de..092387f 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -6248,6 +6248,32 @@ fail_reenable_msix:
 #define megasas_resume	NULL
 #endif
 
+static inline int
+megasas_wait_for_adapter_operational(struct megasas_instance *instance)
+{
+	int wait_time = MEGASAS_RESET_WAIT_TIME * 2;
+	int i;
+
+	for (i = 0; i < wait_time; i++) {
+		if (atomic_read(&instance->adprecovery)
+			== MEGASAS_HBA_OPERATIONAL)
+			break;
+
+		if (!(i % MEGASAS_RESET_NOTICE_INTERVAL))
+			dev_notice(&instance->pdev->dev, "waiting for controller reset to finish\n");
+
+		msleep(1000);
+	}
+
+	if (atomic_read(&instance->adprecovery) != MEGASAS_HBA_OPERATIONAL) {
+		dev_info(&instance->pdev->dev, "%s timed out while waiting for HBA to recover.\n",
+			__func__);
+		return 1;
+	}
+
+	return 0;
+}
+
 /**
  * megasas_detach_one -	PCI hot"un"plug entry point
  * @pdev:		PCI device structure
@@ -6272,9 +6298,18 @@ static void megasas_detach_one(struct pci_dev *pdev)
 	if (instance->fw_crash_state != UNAVAILABLE)
 		megasas_free_host_crash_buffer(instance);
 	scsi_remove_host(instance->host);
+
+	msleep(1000);
+	if ((atomic_read(&instance->adprecovery) == MEGASAS_HW_CRITICAL_ERROR)
+		|| ((atomic_read(&instance->adprecovery)
+			!= MEGASAS_HBA_OPERATIONAL) &&
+		megasas_wait_for_adapter_operational(instance)))
+		goto skip_firing_dcmds;
+
 	megasas_flush_cache(instance);
 	megasas_shutdown_controller(instance, MR_DCMD_CTRL_SHUTDOWN);
 
+skip_firing_dcmds:
 	/* cancel the delayed work if this work still in queue*/
 	if (instance->ev != NULL) {
 		struct megasas_aen_event *ev = instance->ev;
@@ -6388,8 +6423,19 @@ static void megasas_shutdown(struct pci_dev *pdev)
 	struct megasas_instance *instance = pci_get_drvdata(pdev);
 
 	instance->unload = 1;
+
+	msleep(1000);
+
+	if ((atomic_read(&instance->adprecovery) == MEGASAS_HW_CRITICAL_ERROR)
+		|| ((atomic_read(&instance->adprecovery)
+			!= MEGASAS_HBA_OPERATIONAL)
+		&& megasas_wait_for_adapter_operational(instance)))
+		goto skip_firing_dcmds;
+
 	megasas_flush_cache(instance);
 	megasas_shutdown_controller(instance, MR_DCMD_CTRL_SHUTDOWN);
+
+skip_firing_dcmds:
 	instance->instancet->disable_intr(instance);
 	megasas_destroy_irqs(instance);
 
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index 61be7ed..2159f6a 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -2463,12 +2463,15 @@ irqreturn_t megasas_isr_fusion(int irq, void *devp)
 			/* Start collecting crash, if DMA bit is done */
 			if ((fw_state == MFI_STATE_FAULT) && dma_state)
 				schedule_work(&instance->crash_init);
-			else if (fw_state == MFI_STATE_FAULT)
-				schedule_work(&instance->work_init);
+			else if (fw_state == MFI_STATE_FAULT) {
+				if (instance->unload == 0)
+					schedule_work(&instance->work_init);
+			}
 		} else if (fw_state == MFI_STATE_FAULT) {
 			dev_warn(&instance->pdev->dev, "Iop2SysDoorbellInt"
 			       "for scsi%d\n", instance->host->host_no);
-			schedule_work(&instance->work_init);
+			if (instance->unload == 0)
+				schedule_work(&instance->work_init);
 		}
 	}
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 10:24 [PATCH 0/7] megaraid_sas: Updates for scsi-next Sumit Saxena
                   ` (2 preceding siblings ...)
  2016-10-17 10:24 ` [PATCH 3/7] megaraid_sas: Do not fire DCMDs during PCI shutdown/detach Sumit Saxena
@ 2016-10-17 10:24 ` Sumit Saxena
  2016-10-17 11:34   ` Hannes Reinecke
  2016-10-17 13:13   ` Tomas Henzl
  2016-10-17 10:24 ` [PATCH 5/7] megaraid_sas: driver version upgrade Sumit Saxena
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 49+ messages in thread
From: Sumit Saxena @ 2016-10-17 10:24 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, thenzl, jejb, kashyap.desai, Sumit Saxena

megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI layer
without sending it to firmware as firmware takes care of flushing cache.
This patch will change the driver behavior wrt SYNCHRONIZE_CACHE handling.
If underlying firmware has support to handle the SYNCHORNIZE_CACHE, driver
will send it for firmware otherwise complete it back to SCSI layer with
SUCCESS immediately.
If  Firmware  handle SYNCHORNIZE_CACHE for both VD and JBOD
"canHandleSyncCache" bit in scratch pad register(offset 0x00B4) will be
set.

This behavior can be controlled via module parameter and user can fallback
to old behavior of returning SYNCHRONIZE_CACHE by driver only without
sending it to firmware.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
 drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++--------
 drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
 drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
 4 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h
index ca86c88..43fd14f 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
 #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
 #define MR_MAX_MSIX_REG_ARRAY                   16
 #define MR_RDPQ_MODE_OFFSET			0X00800000
+#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X01000000
+
 /*
 * register set for both 1068 and 1078 controllers
 * structure extended for 1078 registers
@@ -2140,6 +2142,7 @@ struct megasas_instance {
 	u8 is_imr;
 	u8 is_rdpq;
 	bool dev_handle;
+	bool fw_sync_cache_support;
 };
 struct MR_LD_VF_MAP {
 	u32 size;
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 092387f..a4a8e2f 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -104,6 +104,10 @@ unsigned int scmd_timeout = MEGASAS_DEFAULT_CMD_TIMEOUT;
 module_param(scmd_timeout, int, S_IRUGO);
 MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10-90s), default 90s. See megasas_reset_timer.");
 
+bool block_sync_cache;
+module_param(block_sync_cache, bool, S_IRUGO);
+MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver Default: 0(Send it to firmware)");
+
 MODULE_LICENSE("GPL");
 MODULE_VERSION(MEGASAS_VERSION);
 MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
@@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
 		goto out_done;
 	}
 
-	switch (scmd->cmnd[0]) {
-	case SYNCHRONIZE_CACHE:
-		/*
-		 * FW takes care of flush cache on its own
-		 * No need to send it down
-		 */
+	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
+		(!instance->fw_sync_cache_support)) {
 		scmd->result = DID_OK << 16;
 		goto out_done;
-	default:
-		break;
 	}
 
 	return instance->instancet->build_and_issue_cmd(instance, scmd);
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index 2159f6a..8237580 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct megasas_instance *instance)
 		ret = 1;
 		goto fail_fw_init;
 	}
+	if (!block_sync_cache)
+		instance->fw_sync_cache_support = (scratch_pad_2 &
+			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 : 0;
 
 	IOCInitMessage =
 	  dma_alloc_coherent(&instance->pdev->dev,
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h b/drivers/scsi/megaraid/megaraid_sas_fusion.h
index e3bee04..2857154 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
@@ -72,6 +72,8 @@
 #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET   (0x0000030C)
 #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x0000006C)
 
+extern bool block_sync_cache;
+
 /*
  * Raid context flags
  */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 5/7] megaraid_sas: driver version upgrade
  2016-10-17 10:24 [PATCH 0/7] megaraid_sas: Updates for scsi-next Sumit Saxena
                   ` (3 preceding siblings ...)
  2016-10-17 10:24 ` [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware Sumit Saxena
@ 2016-10-17 10:24 ` Sumit Saxena
  2016-10-17 11:35   ` Hannes Reinecke
  2016-10-17 10:24 ` [PATCH 6/7] MAINTAINERS: Update megaraid maintainers list Sumit Saxena
  2016-10-17 10:24 ` [PATCH 7/7] megaraid_sas: Do not set MPI2_TYPE_CUDA for JBOD FP path for FW which does not support JBOD sequence map Sumit Saxena
  6 siblings, 1 reply; 49+ messages in thread
From: Sumit Saxena @ 2016-10-17 10:24 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, thenzl, jejb, kashyap.desai, Sumit Saxena

Upgrade driver version.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h
index 43fd14f..1d4de90 100644
--- a/drivers/scsi/megaraid/megaraid_sas.h
+++ b/drivers/scsi/megaraid/megaraid_sas.h
@@ -35,8 +35,8 @@
 /*
  * MegaRAID SAS Driver meta data
  */
-#define MEGASAS_VERSION				"06.811.02.00-rc1"
-#define MEGASAS_RELDATE				"April 12, 2016"
+#define MEGASAS_VERSION				"06.812.06.00-rc1"
+#define MEGASAS_RELDATE				"August 22, 2016"
 
 /*
  * Device IDs
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 6/7] MAINTAINERS: Update megaraid maintainers list
  2016-10-17 10:24 [PATCH 0/7] megaraid_sas: Updates for scsi-next Sumit Saxena
                   ` (4 preceding siblings ...)
  2016-10-17 10:24 ` [PATCH 5/7] megaraid_sas: driver version upgrade Sumit Saxena
@ 2016-10-17 10:24 ` Sumit Saxena
  2016-10-17 11:37   ` Hannes Reinecke
  2016-10-17 10:24 ` [PATCH 7/7] megaraid_sas: Do not set MPI2_TYPE_CUDA for JBOD FP path for FW which does not support JBOD sequence map Sumit Saxena
  6 siblings, 1 reply; 49+ messages in thread
From: Sumit Saxena @ 2016-10-17 10:24 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, thenzl, jejb, kashyap.desai, Sumit Saxena

Update MEGARAID drivers maintainers list.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
---
 MAINTAINERS | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index f0ee7a6..8b9117f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7612,12 +7612,12 @@ S:	Maintained
 F:	drivers/net/wireless/mediatek/mt7601u/
 
 MEGARAID SCSI/SAS DRIVERS
-M:	Kashyap Desai <kashyap.desai@avagotech.com>
-M:	Sumit Saxena <sumit.saxena@avagotech.com>
-M:	Uday Lingala <uday.lingala@avagotech.com>
-L:	megaraidlinux.pdl@avagotech.com
+M:	Kashyap Desai <kashyap.desai@broadcom.com>
+M:	Sumit Saxena <sumit.saxena@broadcom.com>
+M:	Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
+L:	megaraidlinux.pdl@broadcom.com
 L:	linux-scsi@vger.kernel.org
-W:	http://www.lsi.com
+W:	http://www.avagotech.com/support/
 S:	Maintained
 F:	Documentation/scsi/megaraid.txt
 F:	drivers/scsi/megaraid.*
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 7/7] megaraid_sas: Do not set MPI2_TYPE_CUDA for JBOD FP path for FW which does not support JBOD sequence map
  2016-10-17 10:24 [PATCH 0/7] megaraid_sas: Updates for scsi-next Sumit Saxena
                   ` (5 preceding siblings ...)
  2016-10-17 10:24 ` [PATCH 6/7] MAINTAINERS: Update megaraid maintainers list Sumit Saxena
@ 2016-10-17 10:24 ` Sumit Saxena
  2016-10-17 11:37     ` Hannes Reinecke
  2016-10-17 13:23   ` Tomas Henzl
  6 siblings, 2 replies; 49+ messages in thread
From: Sumit Saxena @ 2016-10-17 10:24 UTC (permalink / raw)
  To: linux-scsi
  Cc: martin.petersen, thenzl, jejb, kashyap.desai, Sumit Saxena, stable

Do not set MPI2_TYPE_CUDA for JBOD fastpath IOs for firmware which does
not support JBOD sequence map.

CC: stable@vger.kernel.org
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
---
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index 8237580..25b7720 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -2003,6 +2003,8 @@ megasas_build_syspd_fusion(struct megasas_instance *instance,
 		io_request->DevHandle = pd_sync->seq[pd_index].devHandle;
 		pRAID_Context->regLockFlags |=
 			(MR_RL_FLAGS_SEQ_NUM_ENABLE|MR_RL_FLAGS_GRANT_DESTINATION_CUDA);
+		pRAID_Context->Type = MPI2_TYPE_CUDA;
+		pRAID_Context->nseg = 0x1;
 	} else if (fusion->fast_path_io) {
 		pRAID_Context->VirtualDiskTgtId = cpu_to_le16(device_id);
 		pRAID_Context->configSeqNum = 0;
@@ -2038,12 +2040,10 @@ megasas_build_syspd_fusion(struct megasas_instance *instance,
 		pRAID_Context->timeoutValue =
 			cpu_to_le16((os_timeout_value > timeout_limit) ?
 			timeout_limit : os_timeout_value);
-		if (fusion->adapter_type == INVADER_SERIES) {
-			pRAID_Context->Type = MPI2_TYPE_CUDA;
-			pRAID_Context->nseg = 0x1;
+		if (fusion->adapter_type == INVADER_SERIES)
 			io_request->IoFlags |=
 				cpu_to_le16(MPI25_SAS_DEVICE0_FLAGS_ENABLED_FAST_PATH);
-		}
+
 		cmd->request_desc->SCSIIO.RequestFlags =
 			(MPI2_REQ_DESCRIPT_FLAGS_FP_IO <<
 				MEGASAS_REQ_DESCRIPT_FLAGS_TYPE_SHIFT);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 1/7] megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for 30secs before reset
  2016-10-17 10:24 ` [PATCH 1/7] megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for 30secs before reset Sumit Saxena
@ 2016-10-17 11:29     ` Hannes Reinecke
  2016-10-17 12:43   ` Tomas Henzl
  1 sibling, 0 replies; 49+ messages in thread
From: Hannes Reinecke @ 2016-10-17 11:29 UTC (permalink / raw)
  To: Sumit Saxena, linux-scsi
  Cc: martin.petersen, thenzl, jejb, kashyap.desai, stable,
	Kiran Kumar Kasturi

On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> For SRIOV enabled firmware, if there is a OCR(online controller reset)
> possibility driver set the convert flag to 1, which is not happening if
> there are outstanding commands even after 180 seconds.
> As driver does not set convert flag to 1 and still making the OCR to run,
> VF(Virtual function) driver is directly writing on to the register
> instead of waiting for 30 seconds. Setting convert flag to 1 will cause
> VF driver will wait for 30 secs before going for reset.
> 
> CC: stable@vger.kernel.org
> Signed-off-by: Kiran Kumar Kasturi <kiran-kumar.kasturi@broadcom.com>
> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> ---
>  drivers/scsi/megaraid/megaraid_sas_fusion.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> index 52d8bbf..61be7ed 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> @@ -2823,6 +2823,7 @@ int megasas_wait_for_outstanding_fusion(struct megasas_instance *instance,
>  		dev_err(&instance->pdev->dev, "pending commands remain after waiting, "
>  		       "will reset adapter scsi%d.\n",
>  		       instance->host->host_no);
> +		*convert = 1;
>  		retval = 1;
>  	}
>  out:
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: F. Imend�rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N�rnberg)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 1/7] megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for 30secs before reset
@ 2016-10-17 11:29     ` Hannes Reinecke
  0 siblings, 0 replies; 49+ messages in thread
From: Hannes Reinecke @ 2016-10-17 11:29 UTC (permalink / raw)
  To: Sumit Saxena, linux-scsi
  Cc: martin.petersen, thenzl, jejb, kashyap.desai, stable,
	Kiran Kumar Kasturi

On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> For SRIOV enabled firmware, if there is a OCR(online controller reset)
> possibility driver set the convert flag to 1, which is not happening if
> there are outstanding commands even after 180 seconds.
> As driver does not set convert flag to 1 and still making the OCR to run,
> VF(Virtual function) driver is directly writing on to the register
> instead of waiting for 30 seconds. Setting convert flag to 1 will cause
> VF driver will wait for 30 secs before going for reset.
> 
> CC: stable@vger.kernel.org
> Signed-off-by: Kiran Kumar Kasturi <kiran-kumar.kasturi@broadcom.com>
> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> ---
>  drivers/scsi/megaraid/megaraid_sas_fusion.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> index 52d8bbf..61be7ed 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> @@ -2823,6 +2823,7 @@ int megasas_wait_for_outstanding_fusion(struct megasas_instance *instance,
>  		dev_err(&instance->pdev->dev, "pending commands remain after waiting, "
>  		       "will reset adapter scsi%d.\n",
>  		       instance->host->host_no);
> +		*convert = 1;
>  		retval = 1;
>  	}
>  out:
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/7] megaraid_sas: Send correct PhysArm to FW for R1 VD downgrade
  2016-10-17 10:24 ` [PATCH 2/7] megaraid_sas: Send correct PhysArm to FW for R1 VD downgrade Sumit Saxena
@ 2016-10-17 11:29   ` Hannes Reinecke
  2016-10-17 12:50   ` Tomas Henzl
  1 sibling, 0 replies; 49+ messages in thread
From: Hannes Reinecke @ 2016-10-17 11:29 UTC (permalink / raw)
  To: Sumit Saxena, linux-scsi
  Cc: martin.petersen, thenzl, jejb, kashyap.desai, Kiran Kumar Kasturi

On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> This patch fixes the issue of wrong PhysArm was sent to firmware for R1
> VD downgrade.
> 
> Signed-off-by: Kiran Kumar Kasturi <kiran-kumar.kasturi@broadcom.com>
> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> ---
>  drivers/scsi/megaraid/megaraid_sas_fp.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fp.c b/drivers/scsi/megaraid/megaraid_sas_fp.c
> index e413113..f237d00 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fp.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fp.c
> @@ -782,7 +782,8 @@ static u8 mr_spanset_get_phy_params(struct megasas_instance *instance, u32 ld,
>  			(raid->regTypeReqOnRead != REGION_TYPE_UNUSED))))
>  			pRAID_Context->regLockFlags = REGION_TYPE_EXCLUSIVE;
>  		else if (raid->level == 1) {
> -			pd = MR_ArPdGet(arRef, physArm + 1, map);
> +			physArm = physArm + 1;
> +			pd = MR_ArPdGet(arRef, physArm, map);
>  			if (pd != MR_PD_INVALID)
>  				*pDevHandle = MR_PdDevHandleGet(pd, map);
>  		}
> @@ -879,7 +880,8 @@ u8 MR_GetPhyParams(struct megasas_instance *instance, u32 ld, u64 stripRow,
>  			pRAID_Context->regLockFlags = REGION_TYPE_EXCLUSIVE;
>  		else if (raid->level == 1) {
>  			/* Get alternate Pd. */
> -			pd = MR_ArPdGet(arRef, physArm + 1, map);
> +			physArm = physArm + 1;
> +			pd = MR_ArPdGet(arRef, physArm, map);
>  			if (pd != MR_PD_INVALID)
>  				/* Get dev handle from Pd */
>  				*pDevHandle = MR_PdDevHandleGet(pd, map);
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/7] megaraid_sas: Do not fire DCMDs during PCI shutdown/detach
  2016-10-17 10:24 ` [PATCH 3/7] megaraid_sas: Do not fire DCMDs during PCI shutdown/detach Sumit Saxena
@ 2016-10-17 11:31   ` Hannes Reinecke
  2016-10-17 12:19     ` Sumit Saxena
  0 siblings, 1 reply; 49+ messages in thread
From: Hannes Reinecke @ 2016-10-17 11:31 UTC (permalink / raw)
  To: Sumit Saxena, linux-scsi; +Cc: martin.petersen, thenzl, jejb, kashyap.desai

On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> This patch addresses the issue of driver firing DCMDs in PCI
> shutdown/detach path irrespective of firmware state.
> Driver will check for whether firmware is operational state or not
> before firing DCMDs. If firmware is in unrecoverbale
> state or does not become operational within specfied time, driver will
> skip firing DCMDs.
> 
> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> ---
>  drivers/scsi/megaraid/megaraid_sas_base.c   | 46 +++++++++++++++++++++++++++++
>  drivers/scsi/megaraid/megaraid_sas_fusion.c |  9 ++++--
>  2 files changed, 52 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
> index 9ff57de..092387f 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> @@ -6248,6 +6248,32 @@ fail_reenable_msix:
>  #define megasas_resume	NULL
>  #endif
>  
> +static inline int
> +megasas_wait_for_adapter_operational(struct megasas_instance *instance)
> +{
> +	int wait_time = MEGASAS_RESET_WAIT_TIME * 2;
> +	int i;
> +
> +	for (i = 0; i < wait_time; i++) {
> +		if (atomic_read(&instance->adprecovery)
> +			== MEGASAS_HBA_OPERATIONAL)
> +			break;
> +
> +		if (!(i % MEGASAS_RESET_NOTICE_INTERVAL))
> +			dev_notice(&instance->pdev->dev, "waiting for controller reset to finish\n");
> +
> +		msleep(1000);
> +	}
> +
> +	if (atomic_read(&instance->adprecovery) != MEGASAS_HBA_OPERATIONAL) {
> +		dev_info(&instance->pdev->dev, "%s timed out while waiting for HBA to recover.\n",
> +			__func__);
> +		return 1;
> +	}
> +
> +	return 0;
> +}
> +
>  /**
>   * megasas_detach_one -	PCI hot"un"plug entry point
>   * @pdev:		PCI device structure
> @@ -6272,9 +6298,18 @@ static void megasas_detach_one(struct pci_dev *pdev)
>  	if (instance->fw_crash_state != UNAVAILABLE)
>  		megasas_free_host_crash_buffer(instance);
>  	scsi_remove_host(instance->host);
> +
> +	msleep(1000);
> +	if ((atomic_read(&instance->adprecovery) == MEGASAS_HW_CRITICAL_ERROR)
> +		|| ((atomic_read(&instance->adprecovery)
> +			!= MEGASAS_HBA_OPERATIONAL) &&
> +		megasas_wait_for_adapter_operational(instance)))
> +		goto skip_firing_dcmds;
> +
>  	megasas_flush_cache(instance);
>  	megasas_shutdown_controller(instance, MR_DCMD_CTRL_SHUTDOWN);
>  
> +skip_firing_dcmds:
>  	/* cancel the delayed work if this work still in queue*/
>  	if (instance->ev != NULL) {
>  		struct megasas_aen_event *ev = instance->ev;
Why not move the 'msleep' and 'atomic_read' into the call to
megasas_wait_for_adapter_operational?
Plus I'm not sure if the unconditional 'msleep' is a good idea here;
maybe one should read 'adprecovery' first and _then_ call msleep if
required?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 10:24 ` [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware Sumit Saxena
@ 2016-10-17 11:34   ` Hannes Reinecke
  2016-10-17 12:13     ` Sumit Saxena
  2016-10-17 13:01     ` Ric Wheeler
  2016-10-17 13:13   ` Tomas Henzl
  1 sibling, 2 replies; 49+ messages in thread
From: Hannes Reinecke @ 2016-10-17 11:34 UTC (permalink / raw)
  To: Sumit Saxena, linux-scsi; +Cc: martin.petersen, thenzl, jejb, kashyap.desai

On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI layer
> without sending it to firmware as firmware takes care of flushing cache.
> This patch will change the driver behavior wrt SYNCHRONIZE_CACHE handling.
> If underlying firmware has support to handle the SYNCHORNIZE_CACHE, driver
> will send it for firmware otherwise complete it back to SCSI layer with
> SUCCESS immediately.
> If  Firmware  handle SYNCHORNIZE_CACHE for both VD and JBOD
> "canHandleSyncCache" bit in scratch pad register(offset 0x00B4) will be
> set.
> 
> This behavior can be controlled via module parameter and user can fallback
> to old behavior of returning SYNCHRONIZE_CACHE by driver only without
> sending it to firmware.
> 
> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
> ---
>  drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
>  drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++--------
>  drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
>  drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
>  4 files changed, 14 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h
> index ca86c88..43fd14f 100644
> --- a/drivers/scsi/megaraid/megaraid_sas.h
> +++ b/drivers/scsi/megaraid/megaraid_sas.h
> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>  #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
>  #define MR_MAX_MSIX_REG_ARRAY                   16
>  #define MR_RDPQ_MODE_OFFSET			0X00800000
> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X01000000
> +
>  /*
>  * register set for both 1068 and 1078 controllers
>  * structure extended for 1078 registers
> @@ -2140,6 +2142,7 @@ struct megasas_instance {
>  	u8 is_imr;
>  	u8 is_rdpq;
>  	bool dev_handle;
> +	bool fw_sync_cache_support;
>  };
>  struct MR_LD_VF_MAP {
>  	u32 size;
> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
> index 092387f..a4a8e2f 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> @@ -104,6 +104,10 @@ unsigned int scmd_timeout = MEGASAS_DEFAULT_CMD_TIMEOUT;
>  module_param(scmd_timeout, int, S_IRUGO);
>  MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10-90s), default 90s. See megasas_reset_timer.");
>  
> +bool block_sync_cache;
> +module_param(block_sync_cache, bool, S_IRUGO);
> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver Default: 0(Send it to firmware)");
> +
>  MODULE_LICENSE("GPL");
>  MODULE_VERSION(MEGASAS_VERSION);
>  MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
>  		goto out_done;
>  	}
>  
> -	switch (scmd->cmnd[0]) {
> -	case SYNCHRONIZE_CACHE:
> -		/*
> -		 * FW takes care of flush cache on its own
> -		 * No need to send it down
> -		 */
> +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
> +		(!instance->fw_sync_cache_support)) {
>  		scmd->result = DID_OK << 16;
>  		goto out_done;
> -	default:
> -		break;
>  	}
>  
>  	return instance->instancet->build_and_issue_cmd(instance, scmd);
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> index 2159f6a..8237580 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct megasas_instance *instance)
>  		ret = 1;
>  		goto fail_fw_init;
>  	}
> +	if (!block_sync_cache)
> +		instance->fw_sync_cache_support = (scratch_pad_2 &
> +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 : 0;
>  
>  	IOCInitMessage =
>  	  dma_alloc_coherent(&instance->pdev->dev,
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> index e3bee04..2857154 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> @@ -72,6 +72,8 @@
>  #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET   (0x0000030C)
>  #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x0000006C)
>  
> +extern bool block_sync_cache;
> +
>  /*
>   * Raid context flags
>   */
> 
Be extra careful with that.

SYNCHRONIZE_CACHE has (potentially) a global scope, as there might be an
array-wide cache, and a cache flush would affect the entire cache.
Linux is using SYNCHRONIZE_CACHE as a per device setting, ie it assumes
that the effects of a cache flush is restricted to the device in question.

If this is _not_ the case I'd rather not enable it.
Have you checked that enabling this functionality doesn't have negative
performance impact?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 5/7] megaraid_sas: driver version upgrade
  2016-10-17 10:24 ` [PATCH 5/7] megaraid_sas: driver version upgrade Sumit Saxena
@ 2016-10-17 11:35   ` Hannes Reinecke
  2016-10-17 12:20     ` Sumit Saxena
  0 siblings, 1 reply; 49+ messages in thread
From: Hannes Reinecke @ 2016-10-17 11:35 UTC (permalink / raw)
  To: Sumit Saxena, linux-scsi; +Cc: martin.petersen, thenzl, jejb, kashyap.desai

On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> Upgrade driver version.
> 
> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> ---
>  drivers/scsi/megaraid/megaraid_sas.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h
> index 43fd14f..1d4de90 100644
> --- a/drivers/scsi/megaraid/megaraid_sas.h
> +++ b/drivers/scsi/megaraid/megaraid_sas.h
> @@ -35,8 +35,8 @@
>  /*
>   * MegaRAID SAS Driver meta data
>   */
> -#define MEGASAS_VERSION				"06.811.02.00-rc1"
> -#define MEGASAS_RELDATE				"April 12, 2016"
> +#define MEGASAS_VERSION				"06.812.06.00-rc1"
> +#define MEGASAS_RELDATE				"August 22, 2016"
>  
>  /*
>   * Device IDs
> 
Please move this patch to the end of the series.
Otherwise it's impossible to tell if the following patches should be
part of the new version or not.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 6/7] MAINTAINERS: Update megaraid maintainers list
  2016-10-17 10:24 ` [PATCH 6/7] MAINTAINERS: Update megaraid maintainers list Sumit Saxena
@ 2016-10-17 11:37   ` Hannes Reinecke
  0 siblings, 0 replies; 49+ messages in thread
From: Hannes Reinecke @ 2016-10-17 11:37 UTC (permalink / raw)
  To: Sumit Saxena, linux-scsi; +Cc: martin.petersen, thenzl, jejb, kashyap.desai

On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> Update MEGARAID drivers maintainers list.
> 
> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> ---
>  MAINTAINERS | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index f0ee7a6..8b9117f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -7612,12 +7612,12 @@ S:	Maintained
>  F:	drivers/net/wireless/mediatek/mt7601u/
>  
>  MEGARAID SCSI/SAS DRIVERS
> -M:	Kashyap Desai <kashyap.desai@avagotech.com>
> -M:	Sumit Saxena <sumit.saxena@avagotech.com>
> -M:	Uday Lingala <uday.lingala@avagotech.com>
> -L:	megaraidlinux.pdl@avagotech.com
> +M:	Kashyap Desai <kashyap.desai@broadcom.com>
> +M:	Sumit Saxena <sumit.saxena@broadcom.com>
> +M:	Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
> +L:	megaraidlinux.pdl@broadcom.com
>  L:	linux-scsi@vger.kernel.org
> -W:	http://www.lsi.com
> +W:	http://www.avagotech.com/support/
>  S:	Maintained
>  F:	Documentation/scsi/megaraid.txt
>  F:	drivers/scsi/megaraid.*
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 7/7] megaraid_sas: Do not set MPI2_TYPE_CUDA for JBOD FP path for FW which does not support JBOD sequence map
  2016-10-17 10:24 ` [PATCH 7/7] megaraid_sas: Do not set MPI2_TYPE_CUDA for JBOD FP path for FW which does not support JBOD sequence map Sumit Saxena
@ 2016-10-17 11:37     ` Hannes Reinecke
  2016-10-17 13:23   ` Tomas Henzl
  1 sibling, 0 replies; 49+ messages in thread
From: Hannes Reinecke @ 2016-10-17 11:37 UTC (permalink / raw)
  To: Sumit Saxena, linux-scsi
  Cc: martin.petersen, thenzl, jejb, kashyap.desai, stable

On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> Do not set MPI2_TYPE_CUDA for JBOD fastpath IOs for firmware which does
> not support JBOD sequence map.
> 
> CC: stable@vger.kernel.org
> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
> ---
>  drivers/scsi/megaraid/megaraid_sas_fusion.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> index 8237580..25b7720 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> @@ -2003,6 +2003,8 @@ megasas_build_syspd_fusion(struct megasas_instance *instance,
>  		io_request->DevHandle = pd_sync->seq[pd_index].devHandle;
>  		pRAID_Context->regLockFlags |=
>  			(MR_RL_FLAGS_SEQ_NUM_ENABLE|MR_RL_FLAGS_GRANT_DESTINATION_CUDA);
> +		pRAID_Context->Type = MPI2_TYPE_CUDA;
> +		pRAID_Context->nseg = 0x1;
>  	} else if (fusion->fast_path_io) {
>  		pRAID_Context->VirtualDiskTgtId = cpu_to_le16(device_id);
>  		pRAID_Context->configSeqNum = 0;
> @@ -2038,12 +2040,10 @@ megasas_build_syspd_fusion(struct megasas_instance *instance,
>  		pRAID_Context->timeoutValue =
>  			cpu_to_le16((os_timeout_value > timeout_limit) ?
>  			timeout_limit : os_timeout_value);
> -		if (fusion->adapter_type == INVADER_SERIES) {
> -			pRAID_Context->Type = MPI2_TYPE_CUDA;
> -			pRAID_Context->nseg = 0x1;
> +		if (fusion->adapter_type == INVADER_SERIES)
>  			io_request->IoFlags |=
>  				cpu_to_le16(MPI25_SAS_DEVICE0_FLAGS_ENABLED_FAST_PATH);
> -		}
> +
>  		cmd->request_desc->SCSIIO.RequestFlags =
>  			(MPI2_REQ_DESCRIPT_FLAGS_FP_IO <<
>  				MEGASAS_REQ_DESCRIPT_FLAGS_TYPE_SHIFT);
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: F. Imend�rffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG N�rnberg)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 7/7] megaraid_sas: Do not set MPI2_TYPE_CUDA for JBOD FP path for FW which does not support JBOD sequence map
@ 2016-10-17 11:37     ` Hannes Reinecke
  0 siblings, 0 replies; 49+ messages in thread
From: Hannes Reinecke @ 2016-10-17 11:37 UTC (permalink / raw)
  To: Sumit Saxena, linux-scsi
  Cc: martin.petersen, thenzl, jejb, kashyap.desai, stable

On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> Do not set MPI2_TYPE_CUDA for JBOD fastpath IOs for firmware which does
> not support JBOD sequence map.
> 
> CC: stable@vger.kernel.org
> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
> ---
>  drivers/scsi/megaraid/megaraid_sas_fusion.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> index 8237580..25b7720 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> @@ -2003,6 +2003,8 @@ megasas_build_syspd_fusion(struct megasas_instance *instance,
>  		io_request->DevHandle = pd_sync->seq[pd_index].devHandle;
>  		pRAID_Context->regLockFlags |=
>  			(MR_RL_FLAGS_SEQ_NUM_ENABLE|MR_RL_FLAGS_GRANT_DESTINATION_CUDA);
> +		pRAID_Context->Type = MPI2_TYPE_CUDA;
> +		pRAID_Context->nseg = 0x1;
>  	} else if (fusion->fast_path_io) {
>  		pRAID_Context->VirtualDiskTgtId = cpu_to_le16(device_id);
>  		pRAID_Context->configSeqNum = 0;
> @@ -2038,12 +2040,10 @@ megasas_build_syspd_fusion(struct megasas_instance *instance,
>  		pRAID_Context->timeoutValue =
>  			cpu_to_le16((os_timeout_value > timeout_limit) ?
>  			timeout_limit : os_timeout_value);
> -		if (fusion->adapter_type == INVADER_SERIES) {
> -			pRAID_Context->Type = MPI2_TYPE_CUDA;
> -			pRAID_Context->nseg = 0x1;
> +		if (fusion->adapter_type == INVADER_SERIES)
>  			io_request->IoFlags |=
>  				cpu_to_le16(MPI25_SAS_DEVICE0_FLAGS_ENABLED_FAST_PATH);
> -		}
> +
>  		cmd->request_desc->SCSIIO.RequestFlags =
>  			(MPI2_REQ_DESCRIPT_FLAGS_FP_IO <<
>  				MEGASAS_REQ_DESCRIPT_FLAGS_TYPE_SHIFT);
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* RE: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 11:34   ` Hannes Reinecke
@ 2016-10-17 12:13     ` Sumit Saxena
  2016-10-17 13:01     ` Ric Wheeler
  1 sibling, 0 replies; 49+ messages in thread
From: Sumit Saxena @ 2016-10-17 12:13 UTC (permalink / raw)
  To: Hannes Reinecke, linux-scsi; +Cc: martin.petersen, thenzl, jejb, Kashyap Desai

>-----Original Message-----
>From: Hannes Reinecke [mailto:hare@suse.de]
>Sent: Monday, October 17, 2016 5:04 PM
>To: Sumit Saxena; linux-scsi@vger.kernel.org
>Cc: martin.petersen@oracle.com; thenzl@redhat.com;
jejb@linux.vnet.ibm.com;
>kashyap.desai@broadcom.com
>Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to
>firmware
>
>On 10/17/2016 12:24 PM, Sumit Saxena wrote:
>> megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI
>> layer without sending it to firmware as firmware takes care of flushing
cache.
>> This patch will change the driver behavior wrt SYNCHRONIZE_CACHE
handling.
>> If underlying firmware has support to handle the SYNCHORNIZE_CACHE,
>> driver will send it for firmware otherwise complete it back to SCSI
>> layer with SUCCESS immediately.
>> If  Firmware  handle SYNCHORNIZE_CACHE for both VD and JBOD
>> "canHandleSyncCache" bit in scratch pad register(offset 0x00B4) will
>> be set.
>>
>> This behavior can be controlled via module parameter and user can
>> fallback to old behavior of returning SYNCHRONIZE_CACHE by driver only
>> without sending it to firmware.
>>
>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
>> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
>> ---
>>  drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
>>  drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++--------
>>  drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
>> drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
>>  4 files changed, 14 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>> b/drivers/scsi/megaraid/megaraid_sas.h
>> index ca86c88..43fd14f 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>>  #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
>>  #define MR_MAX_MSIX_REG_ARRAY                   16
>>  #define MR_RDPQ_MODE_OFFSET			0X00800000
>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X01000000
>> +
>>  /*
>>  * register set for both 1068 and 1078 controllers
>>  * structure extended for 1078 registers @@ -2140,6 +2142,7 @@ struct
>> megasas_instance {
>>  	u8 is_imr;
>>  	u8 is_rdpq;
>>  	bool dev_handle;
>> +	bool fw_sync_cache_support;
>>  };
>>  struct MR_LD_VF_MAP {
>>  	u32 size;
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>> index 092387f..a4a8e2f 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
>> MEGASAS_DEFAULT_CMD_TIMEOUT;  module_param(scmd_timeout, int,
>> S_IRUGO);  MODULE_PARM_DESC(scmd_timeout, "scsi command timeout
>> (10-90s), default 90s. See megasas_reset_timer.");
>>
>> +bool block_sync_cache;
>> +module_param(block_sync_cache, bool, S_IRUGO);
>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver
>> +Default: 0(Send it to firmware)");
>> +
>>  MODULE_LICENSE("GPL");
>>  MODULE_VERSION(MEGASAS_VERSION);
>>  MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host
>*shost, struct scsi_cmnd *scmd)
>>  		goto out_done;
>>  	}
>>
>> -	switch (scmd->cmnd[0]) {
>> -	case SYNCHRONIZE_CACHE:
>> -		/*
>> -		 * FW takes care of flush cache on its own
>> -		 * No need to send it down
>> -		 */
>> +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
>> +		(!instance->fw_sync_cache_support)) {
>>  		scmd->result = DID_OK << 16;
>>  		goto out_done;
>> -	default:
>> -		break;
>>  	}
>>
>>  	return instance->instancet->build_and_issue_cmd(instance, scmd);
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> index 2159f6a..8237580 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct megasas_instance
>*instance)
>>  		ret = 1;
>>  		goto fail_fw_init;
>>  	}
>> +	if (!block_sync_cache)
>> +		instance->fw_sync_cache_support = (scratch_pad_2 &
>> +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 : 0;
>>
>>  	IOCInitMessage =
>>  	  dma_alloc_coherent(&instance->pdev->dev,
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>> index e3bee04..2857154 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>> @@ -72,6 +72,8 @@
>>  #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET   (0x0000030C)
>>  #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x0000006C)
>>
>> +extern bool block_sync_cache;
>> +
>>  /*
>>   * Raid context flags
>>   */
>>
>Be extra careful with that.
>
>SYNCHRONIZE_CACHE has (potentially) a global scope, as there might be an
>array-wide cache, and a cache flush would affect the entire cache.
>Linux is using SYNCHRONIZE_CACHE as a per device setting, ie it assumes
that the
>effects of a cache flush is restricted to the device in question.

Yes, cache flush will be done per device only.
>
>If this is _not_ the case I'd rather not enable it.
>Have you checked that enabling this functionality doesn't have negative
>performance impact?

Yes, there will be negative performance impact on wrokloads where
SYNCHRONIZE_CACHE is fired very frequently.
We have provided a module parameter to fallback to older behavior, if
anyone sees negative performance impact because of this
and there is external power to drive.
This change is done for deployment where there is no external power to
drive and abrupt shutdown may cause data loss
as cache may not be flushed to disk.

Thanks,
Sumit

>
>Cheers,
>
>Hannes
>--
>Dr. Hannes Reinecke		   Teamlead Storage & Networking
>hare@suse.de			               +49 911 74053 688
>SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
>GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB
21284 (AG
>Nürnberg)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* RE: [PATCH 3/7] megaraid_sas: Do not fire DCMDs during PCI shutdown/detach
  2016-10-17 11:31   ` Hannes Reinecke
@ 2016-10-17 12:19     ` Sumit Saxena
  0 siblings, 0 replies; 49+ messages in thread
From: Sumit Saxena @ 2016-10-17 12:19 UTC (permalink / raw)
  To: Hannes Reinecke, linux-scsi; +Cc: martin.petersen, thenzl, jejb, Kashyap Desai

>-----Original Message-----
>From: Hannes Reinecke [mailto:hare@suse.de]
>Sent: Monday, October 17, 2016 5:01 PM
>To: Sumit Saxena; linux-scsi@vger.kernel.org
>Cc: martin.petersen@oracle.com; thenzl@redhat.com;
jejb@linux.vnet.ibm.com;
>kashyap.desai@broadcom.com
>Subject: Re: [PATCH 3/7] megaraid_sas: Do not fire DCMDs during PCI
>shutdown/detach
>
>On 10/17/2016 12:24 PM, Sumit Saxena wrote:
>> This patch addresses the issue of driver firing DCMDs in PCI
>> shutdown/detach path irrespective of firmware state.
>> Driver will check for whether firmware is operational state or not
>> before firing DCMDs. If firmware is in unrecoverbale state or does not
>> become operational within specfied time, driver will skip firing
>> DCMDs.
>>
>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
>> ---
>>  drivers/scsi/megaraid/megaraid_sas_base.c   | 46
>+++++++++++++++++++++++++++++
>>  drivers/scsi/megaraid/megaraid_sas_fusion.c |  9 ++++--
>>  2 files changed, 52 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>> index 9ff57de..092387f 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>> @@ -6248,6 +6248,32 @@ fail_reenable_msix:
>>  #define megasas_resume	NULL
>>  #endif
>>
>> +static inline int
>> +megasas_wait_for_adapter_operational(struct megasas_instance
>> +*instance) {
>> +	int wait_time = MEGASAS_RESET_WAIT_TIME * 2;
>> +	int i;
>> +
>> +	for (i = 0; i < wait_time; i++) {
>> +		if (atomic_read(&instance->adprecovery)
>> +			== MEGASAS_HBA_OPERATIONAL)
>> +			break;
>> +
>> +		if (!(i % MEGASAS_RESET_NOTICE_INTERVAL))
>> +			dev_notice(&instance->pdev->dev, "waiting for
>controller reset to
>> +finish\n");
>> +
>> +		msleep(1000);
>> +	}
>> +
>> +	if (atomic_read(&instance->adprecovery) !=
>MEGASAS_HBA_OPERATIONAL) {
>> +		dev_info(&instance->pdev->dev, "%s timed out while waiting
for
>HBA to recover.\n",
>> +			__func__);
>> +		return 1;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>  /**
>>   * megasas_detach_one -	PCI hot"un"plug entry point
>>   * @pdev:		PCI device structure
>> @@ -6272,9 +6298,18 @@ static void megasas_detach_one(struct pci_dev
>*pdev)
>>  	if (instance->fw_crash_state != UNAVAILABLE)
>>  		megasas_free_host_crash_buffer(instance);
>>  	scsi_remove_host(instance->host);
>> +
>> +	msleep(1000);
>> +	if ((atomic_read(&instance->adprecovery) ==
>MEGASAS_HW_CRITICAL_ERROR)
>> +		|| ((atomic_read(&instance->adprecovery)
>> +			!= MEGASAS_HBA_OPERATIONAL) &&
>> +		megasas_wait_for_adapter_operational(instance)))
>> +		goto skip_firing_dcmds;
>> +
>>  	megasas_flush_cache(instance);
>>  	megasas_shutdown_controller(instance,
>MR_DCMD_CTRL_SHUTDOWN);
>>
>> +skip_firing_dcmds:
>>  	/* cancel the delayed work if this work still in queue*/
>>  	if (instance->ev != NULL) {
>>  		struct megasas_aen_event *ev = instance->ev;
>Why not move the 'msleep' and 'atomic_read' into the call to
>megasas_wait_for_adapter_operational?
I will rectify this in next version of this patch.

>Plus I'm not sure if the unconditional 'msleep' is a good idea here;
maybe one
>should read 'adprecovery' first and _then_ call msleep if required?
Agreed.. I will cleanup this and resend the patch.

Thanks,
Sumit
>
>Cheers,
>
>Hannes
>--
>Dr. Hannes Reinecke		   Teamlead Storage & Networking
>hare@suse.de			               +49 911 74053 688
>SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
>GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB
21284 (AG
>Nürnberg)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* RE: [PATCH 5/7] megaraid_sas: driver version upgrade
  2016-10-17 11:35   ` Hannes Reinecke
@ 2016-10-17 12:20     ` Sumit Saxena
  0 siblings, 0 replies; 49+ messages in thread
From: Sumit Saxena @ 2016-10-17 12:20 UTC (permalink / raw)
  To: Hannes Reinecke, linux-scsi; +Cc: martin.petersen, thenzl, jejb, Kashyap Desai

>-----Original Message-----
>From: Hannes Reinecke [mailto:hare@suse.de]
>Sent: Monday, October 17, 2016 5:05 PM
>To: Sumit Saxena; linux-scsi@vger.kernel.org
>Cc: martin.petersen@oracle.com; thenzl@redhat.com;
jejb@linux.vnet.ibm.com;
>kashyap.desai@broadcom.com
>Subject: Re: [PATCH 5/7] megaraid_sas: driver version upgrade
>
>On 10/17/2016 12:24 PM, Sumit Saxena wrote:
>> Upgrade driver version.
>>
>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
>> ---
>>  drivers/scsi/megaraid/megaraid_sas.h | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>> b/drivers/scsi/megaraid/megaraid_sas.h
>> index 43fd14f..1d4de90 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>> @@ -35,8 +35,8 @@
>>  /*
>>   * MegaRAID SAS Driver meta data
>>   */
>> -#define MEGASAS_VERSION				"06.811.02.00-rc1"
>> -#define MEGASAS_RELDATE				"April 12, 2016"
>> +#define MEGASAS_VERSION				"06.812.06.00-rc1"
>> +#define MEGASAS_RELDATE				"August 22, 2016"
>>
>>  /*
>>   * Device IDs
>>
>Please move this patch to the end of the series.
>Otherwise it's impossible to tell if the following patches should be part
of the new
>version or not.

Sure..I will rectify and resend the patches.
>
>Cheers,
>
>Hannes
>--
>Dr. Hannes Reinecke		   Teamlead Storage & Networking
>hare@suse.de			               +49 911 74053 688
>SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
>GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB
21284 (AG
>Nürnberg)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 1/7] megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for 30secs before reset
  2016-10-17 10:24 ` [PATCH 1/7] megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for 30secs before reset Sumit Saxena
  2016-10-17 11:29     ` Hannes Reinecke
@ 2016-10-17 12:43   ` Tomas Henzl
  1 sibling, 0 replies; 49+ messages in thread
From: Tomas Henzl @ 2016-10-17 12:43 UTC (permalink / raw)
  To: Sumit Saxena, linux-scsi
  Cc: martin.petersen, jejb, kashyap.desai, stable, Kiran Kumar Kasturi

On 17.10.2016 12:24, Sumit Saxena wrote:
> For SRIOV enabled firmware, if there is a OCR(online controller reset)
> possibility driver set the convert flag to 1, which is not happening if
> there are outstanding commands even after 180 seconds.
> As driver does not set convert flag to 1 and still making the OCR to run,
> VF(Virtual function) driver is directly writing on to the register
> instead of waiting for 30 seconds. Setting convert flag to 1 will cause
> VF driver will wait for 30 secs before going for reset.
>
> CC: stable@vger.kernel.org
> Signed-off-by: Kiran Kumar Kasturi <kiran-kumar.kasturi@broadcom.com>
> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>

Reviewed-by: Tomas Henzl <thenzl@redhat.com>

Tomas


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 2/7] megaraid_sas: Send correct PhysArm to FW for R1 VD downgrade
  2016-10-17 10:24 ` [PATCH 2/7] megaraid_sas: Send correct PhysArm to FW for R1 VD downgrade Sumit Saxena
  2016-10-17 11:29   ` Hannes Reinecke
@ 2016-10-17 12:50   ` Tomas Henzl
  1 sibling, 0 replies; 49+ messages in thread
From: Tomas Henzl @ 2016-10-17 12:50 UTC (permalink / raw)
  To: Sumit Saxena, linux-scsi
  Cc: martin.petersen, jejb, kashyap.desai, Kiran Kumar Kasturi

On 17.10.2016 12:24, Sumit Saxena wrote:
> This patch fixes the issue of wrong PhysArm was sent to firmware for R1
> VD downgrade.
>
> Signed-off-by: Kiran Kumar Kasturi <kiran-kumar.kasturi@broadcom.com>
> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>

Reviewed-by: Tomas Henzl <thenzl@redhat.com>

Tomas


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 11:34   ` Hannes Reinecke
  2016-10-17 12:13     ` Sumit Saxena
@ 2016-10-17 13:01     ` Ric Wheeler
  2016-10-17 13:31       ` Sumit Saxena
                         ` (2 more replies)
  1 sibling, 3 replies; 49+ messages in thread
From: Ric Wheeler @ 2016-10-17 13:01 UTC (permalink / raw)
  To: Hannes Reinecke, Sumit Saxena, linux-scsi
  Cc: martin.petersen, thenzl, jejb, kashyap.desai, Christoph Hellwig,
	Martin K. Petersen, Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe

On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
> On 10/17/2016 12:24 PM, Sumit Saxena wrote:
>> megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI layer
>> without sending it to firmware as firmware takes care of flushing cache.
>> This patch will change the driver behavior wrt SYNCHRONIZE_CACHE handling.
>> If underlying firmware has support to handle the SYNCHORNIZE_CACHE, driver
>> will send it for firmware otherwise complete it back to SCSI layer with
>> SUCCESS immediately.
>> If  Firmware  handle SYNCHORNIZE_CACHE for both VD and JBOD
>> "canHandleSyncCache" bit in scratch pad register(offset 0x00B4) will be
>> set.
>>
>> This behavior can be controlled via module parameter and user can fallback
>> to old behavior of returning SYNCHRONIZE_CACHE by driver only without
>> sending it to firmware.
>>
>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
>> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
>> ---
>>   drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
>>   drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++--------
>>   drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
>>   drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
>>   4 files changed, 14 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h
>> index ca86c88..43fd14f 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>>   #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
>>   #define MR_MAX_MSIX_REG_ARRAY                   16
>>   #define MR_RDPQ_MODE_OFFSET			0X00800000
>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X01000000
>> +
>>   /*
>>   * register set for both 1068 and 1078 controllers
>>   * structure extended for 1078 registers
>> @@ -2140,6 +2142,7 @@ struct megasas_instance {
>>   	u8 is_imr;
>>   	u8 is_rdpq;
>>   	bool dev_handle;
>> +	bool fw_sync_cache_support;
>>   };
>>   struct MR_LD_VF_MAP {
>>   	u32 size;
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
>> index 092387f..a4a8e2f 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout = MEGASAS_DEFAULT_CMD_TIMEOUT;
>>   module_param(scmd_timeout, int, S_IRUGO);
>>   MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10-90s), default 90s. See megasas_reset_timer.");
>>   
>> +bool block_sync_cache;
>> +module_param(block_sync_cache, bool, S_IRUGO);
>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver Default: 0(Send it to firmware)");
>> +
>>   MODULE_LICENSE("GPL");
>>   MODULE_VERSION(MEGASAS_VERSION);
>>   MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
>>   		goto out_done;
>>   	}
>>   
>> -	switch (scmd->cmnd[0]) {
>> -	case SYNCHRONIZE_CACHE:
>> -		/*
>> -		 * FW takes care of flush cache on its own
>> -		 * No need to send it down
>> -		 */
>> +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
>> +		(!instance->fw_sync_cache_support)) {
>>   		scmd->result = DID_OK << 16;
>>   		goto out_done;
>> -	default:
>> -		break;
>>   	}
>>   
>>   	return instance->instancet->build_and_issue_cmd(instance, scmd);
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> index 2159f6a..8237580 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct megasas_instance *instance)
>>   		ret = 1;
>>   		goto fail_fw_init;
>>   	}
>> +	if (!block_sync_cache)
>> +		instance->fw_sync_cache_support = (scratch_pad_2 &
>> +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 : 0;
>>   
>>   	IOCInitMessage =
>>   	  dma_alloc_coherent(&instance->pdev->dev,
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>> index e3bee04..2857154 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>> @@ -72,6 +72,8 @@
>>   #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET   (0x0000030C)
>>   #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x0000006C)
>>   
>> +extern bool block_sync_cache;
>> +
>>   /*
>>    * Raid context flags
>>    */
>>
> Be extra careful with that.
>
> SYNCHRONIZE_CACHE has (potentially) a global scope, as there might be an
> array-wide cache, and a cache flush would affect the entire cache.
> Linux is using SYNCHRONIZE_CACHE as a per device setting, ie it assumes
> that the effects of a cache flush is restricted to the device in question.
>
> If this is _not_ the case I'd rather not enable it.
> Have you checked that enabling this functionality doesn't have negative
> performance impact?
>
> Cheers,
>
> Hannes

This must go in - without this fix, there is no data integrity for any file system.

In effect, this driver by default has been throwing away SYNCHRONIZE_CACHE 
commands even when acting in JBOD/non-RAID mode.

Of course, actually doing a SYNCHRONIZE_CACHE to drives will be slower than 
throwing it away, but this is not optional.

We really need to have some ways to validate that our IO stack is properly and 
safely configured.

I would love to see a couple of things:

* having T10 & T13 report the existence of a volatile write cache - this is 
different than WCE set, some devices have a write cache and are battery/flash 
backed.

* having a robust tool to test over power failure/disconnect that our 
assumptions are true - any write is durable after a SYNCHRONIZE_CACHE or 
CACHE_FLUSH_EXT is sent and ack'ed

Regards,

Ric



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 10:24 ` [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware Sumit Saxena
  2016-10-17 11:34   ` Hannes Reinecke
@ 2016-10-17 13:13   ` Tomas Henzl
  2016-10-17 13:28     ` Sumit Saxena
  1 sibling, 1 reply; 49+ messages in thread
From: Tomas Henzl @ 2016-10-17 13:13 UTC (permalink / raw)
  To: Sumit Saxena, linux-scsi; +Cc: martin.petersen, jejb, kashyap.desai

On 17.10.2016 12:24, Sumit Saxena wrote:
> megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI layer
> without sending it to firmware as firmware takes care of flushing cache.
> This patch will change the driver behavior wrt SYNCHRONIZE_CACHE handling.
> If underlying firmware has support to handle the SYNCHORNIZE_CACHE, driver
> will send it for firmware otherwise complete it back to SCSI layer with
> SUCCESS immediately.
> If  Firmware  handle SYNCHORNIZE_CACHE for both VD and JBOD
> "canHandleSyncCache" bit in scratch pad register(offset 0x00B4) will be
> set.
>
> This behavior can be controlled via module parameter and user can fallback
> to old behavior of returning SYNCHRONIZE_CACHE by driver only without
> sending it to firmware.
>
> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
> ---
>  drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
>  drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++--------
>  drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
>  drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
>  4 files changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h
> index ca86c88..43fd14f 100644
> --- a/drivers/scsi/megaraid/megaraid_sas.h
> +++ b/drivers/scsi/megaraid/megaraid_sas.h
> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>  #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
>  #define MR_MAX_MSIX_REG_ARRAY                   16
>  #define MR_RDPQ_MODE_OFFSET			0X00800000
> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X01000000
> +
>  /*
>  * register set for both 1068 and 1078 controllers
>  * structure extended for 1078 registers
> @@ -2140,6 +2142,7 @@ struct megasas_instance {
>  	u8 is_imr;
>  	u8 is_rdpq;
>  	bool dev_handle;
> +	bool fw_sync_cache_support;
>  };
>  struct MR_LD_VF_MAP {
>  	u32 size;
> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
> index 092387f..a4a8e2f 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> @@ -104,6 +104,10 @@ unsigned int scmd_timeout = MEGASAS_DEFAULT_CMD_TIMEOUT;
>  module_param(scmd_timeout, int, S_IRUGO);
>  MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10-90s), default 90s. See megasas_reset_timer.");
>  
> +bool block_sync_cache;
> +module_param(block_sync_cache, bool, S_IRUGO);
> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver Default: 0(Send it to firmware)");
> +
>  MODULE_LICENSE("GPL");
>  MODULE_VERSION(MEGASAS_VERSION);
>  MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
>  		goto out_done;
>  	}
>  
> -	switch (scmd->cmnd[0]) {
> -	case SYNCHRONIZE_CACHE:
> -		/*
> -		 * FW takes care of flush cache on its own
> -		 * No need to send it down
> -		 */
> +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
> +		(!instance->fw_sync_cache_support)) {

Maybe I'm wrong, but isn't the logic inverted?
when fw_sync_cache_support is true  it means that there is a flash cache or a battery
and we don't have to send the command down ?

tomash

>  		scmd->result = DID_OK << 16;
>  		goto out_done;
> -	default:
> -		break;
>  	}
>  
>  	return instance->instancet->build_and_issue_cmd(instance, scmd);
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> index 2159f6a..8237580 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct megasas_instance *instance)
>  		ret = 1;
>  		goto fail_fw_init;
>  	}
> +	if (!block_sync_cache)
> +		instance->fw_sync_cache_support = (scratch_pad_2 &
> +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 : 0;
>  
>  	IOCInitMessage =
>  	  dma_alloc_coherent(&instance->pdev->dev,
> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> index e3bee04..2857154 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> @@ -72,6 +72,8 @@
>  #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET   (0x0000030C)
>  #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x0000006C)
>  
> +extern bool block_sync_cache;
> +
>  /*
>   * Raid context flags
>   */



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 7/7] megaraid_sas: Do not set MPI2_TYPE_CUDA for JBOD FP path for FW which does not support JBOD sequence map
  2016-10-17 10:24 ` [PATCH 7/7] megaraid_sas: Do not set MPI2_TYPE_CUDA for JBOD FP path for FW which does not support JBOD sequence map Sumit Saxena
  2016-10-17 11:37     ` Hannes Reinecke
@ 2016-10-17 13:23   ` Tomas Henzl
  1 sibling, 0 replies; 49+ messages in thread
From: Tomas Henzl @ 2016-10-17 13:23 UTC (permalink / raw)
  To: Sumit Saxena, linux-scsi; +Cc: martin.petersen, jejb, kashyap.desai, stable

On 17.10.2016 12:24, Sumit Saxena wrote:
> Do not set MPI2_TYPE_CUDA for JBOD fastpath IOs for firmware which does
> not support JBOD sequence map.
>
> CC: stable@vger.kernel.org
> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>

Reviewed-by: Tomas Henzl <thenzl@redhat.com>

Tomas


^ permalink raw reply	[flat|nested] 49+ messages in thread

* RE: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 13:13   ` Tomas Henzl
@ 2016-10-17 13:28     ` Sumit Saxena
  2016-10-17 13:57       ` Tomas Henzl
  0 siblings, 1 reply; 49+ messages in thread
From: Sumit Saxena @ 2016-10-17 13:28 UTC (permalink / raw)
  To: Tomas Henzl, linux-scsi; +Cc: martin.petersen, jejb, Kashyap Desai

>-----Original Message-----
>From: Tomas Henzl [mailto:thenzl@redhat.com]
>Sent: Monday, October 17, 2016 6:44 PM
>To: Sumit Saxena; linux-scsi@vger.kernel.org
>Cc: martin.petersen@oracle.com; jejb@linux.vnet.ibm.com;
>kashyap.desai@broadcom.com
>Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to
>firmware
>
>On 17.10.2016 12:24, Sumit Saxena wrote:
>> megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI
>> layer without sending it to firmware as firmware takes care of flushing
cache.
>> This patch will change the driver behavior wrt SYNCHRONIZE_CACHE
handling.
>> If underlying firmware has support to handle the SYNCHORNIZE_CACHE,
>> driver will send it for firmware otherwise complete it back to SCSI
>> layer with SUCCESS immediately.
>> If  Firmware  handle SYNCHORNIZE_CACHE for both VD and JBOD
>> "canHandleSyncCache" bit in scratch pad register(offset 0x00B4) will
>> be set.
>>
>> This behavior can be controlled via module parameter and user can
>> fallback to old behavior of returning SYNCHRONIZE_CACHE by driver only
>> without sending it to firmware.
>>
>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
>> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
>> ---
>>  drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
>>  drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++--------
>>  drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
>> drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
>>  4 files changed, 14 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>> b/drivers/scsi/megaraid/megaraid_sas.h
>> index ca86c88..43fd14f 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>>  #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
>>  #define MR_MAX_MSIX_REG_ARRAY                   16
>>  #define MR_RDPQ_MODE_OFFSET			0X00800000
>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X01000000
>> +
>>  /*
>>  * register set for both 1068 and 1078 controllers
>>  * structure extended for 1078 registers @@ -2140,6 +2142,7 @@ struct
>> megasas_instance {
>>  	u8 is_imr;
>>  	u8 is_rdpq;
>>  	bool dev_handle;
>> +	bool fw_sync_cache_support;
>>  };
>>  struct MR_LD_VF_MAP {
>>  	u32 size;
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>> index 092387f..a4a8e2f 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
>> MEGASAS_DEFAULT_CMD_TIMEOUT;  module_param(scmd_timeout, int,
>> S_IRUGO);  MODULE_PARM_DESC(scmd_timeout, "scsi command timeout
>> (10-90s), default 90s. See megasas_reset_timer.");
>>
>> +bool block_sync_cache;
>> +module_param(block_sync_cache, bool, S_IRUGO);
>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver
>> +Default: 0(Send it to firmware)");
>> +
>>  MODULE_LICENSE("GPL");
>>  MODULE_VERSION(MEGASAS_VERSION);
>>  MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host
>*shost, struct scsi_cmnd *scmd)
>>  		goto out_done;
>>  	}
>>
>> -	switch (scmd->cmnd[0]) {
>> -	case SYNCHRONIZE_CACHE:
>> -		/*
>> -		 * FW takes care of flush cache on its own
>> -		 * No need to send it down
>> -		 */
>> +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
>> +		(!instance->fw_sync_cache_support)) {
>
>Maybe I'm wrong, but isn't the logic inverted?
>when fw_sync_cache_support is true  it means that there is a flash cache
or a
>battery and we don't have to send the command down ?
>

If "instance->fw_sync_cache_support" is set to true, it means driver can
send SYNCHRONIZE_CACHE to firmware(firmware has infrastructure to support
SYNCHRONIZE_CACHE).
If "instance->fw_sync_cache_support" is set to false, it means FW does not
support to handle SYNCHRONIZE_CACHE and FW will flush cache during
shutdown.

>tomash
>
>>  		scmd->result = DID_OK << 16;
>>  		goto out_done;
>> -	default:
>> -		break;
>>  	}
>>
>>  	return instance->instancet->build_and_issue_cmd(instance, scmd);
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> index 2159f6a..8237580 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct megasas_instance
>*instance)
>>  		ret = 1;
>>  		goto fail_fw_init;
>>  	}
>> +	if (!block_sync_cache)
>> +		instance->fw_sync_cache_support = (scratch_pad_2 &
>> +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 : 0;
>>
>>  	IOCInitMessage =
>>  	  dma_alloc_coherent(&instance->pdev->dev,
>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>> index e3bee04..2857154 100644
>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>> @@ -72,6 +72,8 @@
>>  #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET   (0x0000030C)
>>  #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x0000006C)
>>
>> +extern bool block_sync_cache;
>> +
>>  /*
>>   * Raid context flags
>>   */
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* RE: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 13:01     ` Ric Wheeler
@ 2016-10-17 13:31       ` Sumit Saxena
  2016-10-17 15:55       ` Christoph Hellwig
  2016-10-17 16:20       ` James Bottomley
  2 siblings, 0 replies; 49+ messages in thread
From: Sumit Saxena @ 2016-10-17 13:31 UTC (permalink / raw)
  To: Ric Wheeler, Hannes Reinecke, linux-scsi
  Cc: martin.petersen, thenzl, jejb, Kashyap Desai, Christoph Hellwig,
	Martin K. Petersen, Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe

>-----Original Message-----
>From: Ric Wheeler [mailto:rwheeler@redhat.com]
>Sent: Monday, October 17, 2016 6:31 PM
>To: Hannes Reinecke; Sumit Saxena; linux-scsi@vger.kernel.org
>Cc: martin.petersen@oracle.com; thenzl@redhat.com;
jejb@linux.vnet.ibm.com;
>kashyap.desai@broadcom.com; Christoph Hellwig; Martin K. Petersen; Jeff
>Moyer; Gris Ge; Ewan Milne; Jens Axboe
>Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to
>firmware
>
>On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
>> On 10/17/2016 12:24 PM, Sumit Saxena wrote:
>>> megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI
>>> layer without sending it to firmware as firmware takes care of
flushing cache.
>>> This patch will change the driver behavior wrt SYNCHRONIZE_CACHE
handling.
>>> If underlying firmware has support to handle the SYNCHORNIZE_CACHE,
>>> driver will send it for firmware otherwise complete it back to SCSI
>>> layer with SUCCESS immediately.
>>> If  Firmware  handle SYNCHORNIZE_CACHE for both VD and JBOD
>>> "canHandleSyncCache" bit in scratch pad register(offset 0x00B4) will
>>> be set.
>>>
>>> This behavior can be controlled via module parameter and user can
>>> fallback to old behavior of returning SYNCHRONIZE_CACHE by driver
>>> only without sending it to firmware.
>>>
>>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
>>> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
>>> ---
>>>   drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
>>>   drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++--------
>>>   drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
>>>   drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
>>>   4 files changed, 14 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>>> b/drivers/scsi/megaraid/megaraid_sas.h
>>> index ca86c88..43fd14f 100644
>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>>>   #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
>>>   #define MR_MAX_MSIX_REG_ARRAY                   16
>>>   #define MR_RDPQ_MODE_OFFSET			0X00800000
>>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X01000000
>>> +
>>>   /*
>>>   * register set for both 1068 and 1078 controllers
>>>   * structure extended for 1078 registers @@ -2140,6 +2142,7 @@
>>> struct megasas_instance {
>>>   	u8 is_imr;
>>>   	u8 is_rdpq;
>>>   	bool dev_handle;
>>> +	bool fw_sync_cache_support;
>>>   };
>>>   struct MR_LD_VF_MAP {
>>>   	u32 size;
>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>>> index 092387f..a4a8e2f 100644
>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
>MEGASAS_DEFAULT_CMD_TIMEOUT;
>>>   module_param(scmd_timeout, int, S_IRUGO);
>>>   MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10-90s),
>>> default 90s. See megasas_reset_timer.");
>>>
>>> +bool block_sync_cache;
>>> +module_param(block_sync_cache, bool, S_IRUGO);
>>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver
>>> +Default: 0(Send it to firmware)");
>>> +
>>>   MODULE_LICENSE("GPL");
>>>   MODULE_VERSION(MEGASAS_VERSION);
>>>   MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
>>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host
>*shost, struct scsi_cmnd *scmd)
>>>   		goto out_done;
>>>   	}
>>>
>>> -	switch (scmd->cmnd[0]) {
>>> -	case SYNCHRONIZE_CACHE:
>>> -		/*
>>> -		 * FW takes care of flush cache on its own
>>> -		 * No need to send it down
>>> -		 */
>>> +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
>>> +		(!instance->fw_sync_cache_support)) {
>>>   		scmd->result = DID_OK << 16;
>>>   		goto out_done;
>>> -	default:
>>> -		break;
>>>   	}
>>>
>>>   	return instance->instancet->build_and_issue_cmd(instance, scmd);
>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>> index 2159f6a..8237580 100644
>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>> @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct megasas_instance
>*instance)
>>>   		ret = 1;
>>>   		goto fail_fw_init;
>>>   	}
>>> +	if (!block_sync_cache)
>>> +		instance->fw_sync_cache_support = (scratch_pad_2 &
>>> +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 : 0;
>>>
>>>   	IOCInitMessage =
>>>   	  dma_alloc_coherent(&instance->pdev->dev,
>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>> index e3bee04..2857154 100644
>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>> @@ -72,6 +72,8 @@
>>>   #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET   (0x0000030C)
>>>   #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x0000006C)
>>>
>>> +extern bool block_sync_cache;
>>> +
>>>   /*
>>>    * Raid context flags
>>>    */
>>>
>> Be extra careful with that.
>>
>> SYNCHRONIZE_CACHE has (potentially) a global scope, as there might be
>> an array-wide cache, and a cache flush would affect the entire cache.
>> Linux is using SYNCHRONIZE_CACHE as a per device setting, ie it
>> assumes that the effects of a cache flush is restricted to the device
in question.
>>
>> If this is _not_ the case I'd rather not enable it.
>> Have you checked that enabling this functionality doesn't have
>> negative performance impact?
>>
>> Cheers,
>>
>> Hannes
>
>This must go in - without this fix, there is no data integrity for any
file system.
>
>In effect, this driver by default has been throwing away
SYNCHRONIZE_CACHE
>commands even when acting in JBOD/non-RAID mode.

For JBOD mode, we are planning to send SYNCHRONIZE_CACHE unconditionally
for all generation
Controllers and FW. Since there is single driver for all generation
controllers so we are doing some
Testing that opening SYNCHRONIZE_CACHE should not break any controller.
I will be sending follow up patch for the same as soon as I am done.

>
>Of course, actually doing a SYNCHRONIZE_CACHE to drives will be slower
than
>throwing it away, but this is not optional.
>
>We really need to have some ways to validate that our IO stack is
properly and
>safely configured.
>
>I would love to see a couple of things:
>
>* having T10 & T13 report the existence of a volatile write cache - this
is different
>than WCE set, some devices have a write cache and are battery/flash
backed.
>
>* having a robust tool to test over power failure/disconnect that our
assumptions
>are true - any write is durable after a SYNCHRONIZE_CACHE or
>CACHE_FLUSH_EXT is sent and ack'ed
>
>Regards,
>
>Ric
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 13:28     ` Sumit Saxena
@ 2016-10-17 13:57       ` Tomas Henzl
  2016-10-17 14:25         ` Sumit Saxena
  2016-10-18 13:07         ` Ric Wheeler
  0 siblings, 2 replies; 49+ messages in thread
From: Tomas Henzl @ 2016-10-17 13:57 UTC (permalink / raw)
  To: Sumit Saxena, linux-scsi; +Cc: martin.petersen, jejb, Kashyap Desai

On 17.10.2016 15:28, Sumit Saxena wrote:
>> -----Original Message-----
>> From: Tomas Henzl [mailto:thenzl@redhat.com]
>> Sent: Monday, October 17, 2016 6:44 PM
>> To: Sumit Saxena; linux-scsi@vger.kernel.org
>> Cc: martin.petersen@oracle.com; jejb@linux.vnet.ibm.com;
>> kashyap.desai@broadcom.com
>> Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to
>> firmware
>>
>> On 17.10.2016 12:24, Sumit Saxena wrote:
>>> megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI
>>> layer without sending it to firmware as firmware takes care of flushing
> cache.
>>> This patch will change the driver behavior wrt SYNCHRONIZE_CACHE
> handling.
>>> If underlying firmware has support to handle the SYNCHORNIZE_CACHE,
>>> driver will send it for firmware otherwise complete it back to SCSI
>>> layer with SUCCESS immediately.
>>> If  Firmware  handle SYNCHORNIZE_CACHE for both VD and JBOD
>>> "canHandleSyncCache" bit in scratch pad register(offset 0x00B4) will
>>> be set.
>>>
>>> This behavior can be controlled via module parameter and user can
>>> fallback to old behavior of returning SYNCHRONIZE_CACHE by driver only
>>> without sending it to firmware.
>>>
>>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
>>> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
>>> ---
>>>  drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
>>>  drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++--------
>>>  drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
>>> drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
>>>  4 files changed, 14 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>>> b/drivers/scsi/megaraid/megaraid_sas.h
>>> index ca86c88..43fd14f 100644
>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>>>  #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
>>>  #define MR_MAX_MSIX_REG_ARRAY                   16
>>>  #define MR_RDPQ_MODE_OFFSET			0X00800000
>>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X01000000
>>> +
>>>  /*
>>>  * register set for both 1068 and 1078 controllers
>>>  * structure extended for 1078 registers @@ -2140,6 +2142,7 @@ struct
>>> megasas_instance {
>>>  	u8 is_imr;
>>>  	u8 is_rdpq;
>>>  	bool dev_handle;
>>> +	bool fw_sync_cache_support;
>>>  };
>>>  struct MR_LD_VF_MAP {
>>>  	u32 size;
>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>>> index 092387f..a4a8e2f 100644
>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
>>> MEGASAS_DEFAULT_CMD_TIMEOUT;  module_param(scmd_timeout, int,
>>> S_IRUGO);  MODULE_PARM_DESC(scmd_timeout, "scsi command timeout
>>> (10-90s), default 90s. See megasas_reset_timer.");
>>>
>>> +bool block_sync_cache;
>>> +module_param(block_sync_cache, bool, S_IRUGO);
>>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver
>>> +Default: 0(Send it to firmware)");
>>> +
>>>  MODULE_LICENSE("GPL");
>>>  MODULE_VERSION(MEGASAS_VERSION);
>>>  MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
>>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host
>> *shost, struct scsi_cmnd *scmd)
>>>  		goto out_done;
>>>  	}
>>>
>>> -	switch (scmd->cmnd[0]) {
>>> -	case SYNCHRONIZE_CACHE:
>>> -		/*
>>> -		 * FW takes care of flush cache on its own
>>> -		 * No need to send it down
>>> -		 */
>>> +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
>>> +		(!instance->fw_sync_cache_support)) {
>> Maybe I'm wrong, but isn't the logic inverted?
>> when fw_sync_cache_support is true  it means that there is a flash cache
> or a
>> battery and we don't have to send the command down ?
>>
> If "instance->fw_sync_cache_support" is set to true, it means driver can
> send SYNCHRONIZE_CACHE to firmware(firmware has infrastructure to support
> SYNCHRONIZE_CACHE).
> If "instance->fw_sync_cache_support" is set to false, it means FW does not
> support to handle SYNCHRONIZE_CACHE and FW will flush cache during
> shutdown.

Thanks, in that case it is correct.

>>> +	if (!block_sync_cache)
>>> +		instance->fw_sync_cache_support = (scratch_pad_2 &
>>> +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 : 0;

Please add a warning or log the state of the synchronise cache command
on the controller.


>>>  	IOCInitMessage =
>>>  	  dma_alloc_coherent(&instance->pdev->dev,
>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>> index e3bee04..2857154 100644
>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>> @@ -72,6 +72,8 @@
>>>  #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET   (0x0000030C)
>>>  #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x0000006C)
>>>
>>> +extern bool block_sync_cache;
>>> +
>>>  /*
>>>   * Raid context flags
>>>   */
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



^ permalink raw reply	[flat|nested] 49+ messages in thread

* RE: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 13:57       ` Tomas Henzl
@ 2016-10-17 14:25         ` Sumit Saxena
  2016-10-18 13:07         ` Ric Wheeler
  1 sibling, 0 replies; 49+ messages in thread
From: Sumit Saxena @ 2016-10-17 14:25 UTC (permalink / raw)
  To: Tomas Henzl, linux-scsi; +Cc: martin.petersen, jejb, Kashyap Desai

>-----Original Message-----
>From: Tomas Henzl [mailto:thenzl@redhat.com]
>Sent: Monday, October 17, 2016 7:27 PM
>To: Sumit Saxena; linux-scsi@vger.kernel.org
>Cc: martin.petersen@oracle.com; jejb@linux.vnet.ibm.com; Kashyap Desai
>Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to
>firmware
>
>On 17.10.2016 15:28, Sumit Saxena wrote:
>>> -----Original Message-----
>>> From: Tomas Henzl [mailto:thenzl@redhat.com]
>>> Sent: Monday, October 17, 2016 6:44 PM
>>> To: Sumit Saxena; linux-scsi@vger.kernel.org
>>> Cc: martin.petersen@oracle.com; jejb@linux.vnet.ibm.com;
>>> kashyap.desai@broadcom.com
>>> Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE
>command
>>> to firmware
>>>
>>> On 17.10.2016 12:24, Sumit Saxena wrote:
>>>> megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI
>>>> layer without sending it to firmware as firmware takes care of
>>>> flushing
>> cache.
>>>> This patch will change the driver behavior wrt SYNCHRONIZE_CACHE
>> handling.
>>>> If underlying firmware has support to handle the SYNCHORNIZE_CACHE,
>>>> driver will send it for firmware otherwise complete it back to SCSI
>>>> layer with SUCCESS immediately.
>>>> If  Firmware  handle SYNCHORNIZE_CACHE for both VD and JBOD
>>>> "canHandleSyncCache" bit in scratch pad register(offset 0x00B4) will
>>>> be set.
>>>>
>>>> This behavior can be controlled via module parameter and user can
>>>> fallback to old behavior of returning SYNCHRONIZE_CACHE by driver
>>>> only without sending it to firmware.
>>>>
>>>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
>>>> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
>>>> ---
>>>>  drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
>>>>  drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++--------
>>>>  drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
>>>> drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
>>>>  4 files changed, 14 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>>>> b/drivers/scsi/megaraid/megaraid_sas.h
>>>> index ca86c88..43fd14f 100644
>>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>>>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>>>>  #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
>>>>  #define MR_MAX_MSIX_REG_ARRAY                   16
>>>>  #define MR_RDPQ_MODE_OFFSET			0X00800000
>>>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X01000000
>>>> +
>>>>  /*
>>>>  * register set for both 1068 and 1078 controllers
>>>>  * structure extended for 1078 registers @@ -2140,6 +2142,7 @@
>>>> struct megasas_instance {
>>>>  	u8 is_imr;
>>>>  	u8 is_rdpq;
>>>>  	bool dev_handle;
>>>> +	bool fw_sync_cache_support;
>>>>  };
>>>>  struct MR_LD_VF_MAP {
>>>>  	u32 size;
>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> index 092387f..a4a8e2f 100644
>>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
>>>> MEGASAS_DEFAULT_CMD_TIMEOUT;  module_param(scmd_timeout, int,
>>>> S_IRUGO);  MODULE_PARM_DESC(scmd_timeout, "scsi command timeout
>>>> (10-90s), default 90s. See megasas_reset_timer.");
>>>>
>>>> +bool block_sync_cache;
>>>> +module_param(block_sync_cache, bool, S_IRUGO);
>>>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver
>>>> +Default: 0(Send it to firmware)");
>>>> +
>>>>  MODULE_LICENSE("GPL");
>>>>  MODULE_VERSION(MEGASAS_VERSION);
>>>>  MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
>>>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host
>>> *shost, struct scsi_cmnd *scmd)
>>>>  		goto out_done;
>>>>  	}
>>>>
>>>> -	switch (scmd->cmnd[0]) {
>>>> -	case SYNCHRONIZE_CACHE:
>>>> -		/*
>>>> -		 * FW takes care of flush cache on its own
>>>> -		 * No need to send it down
>>>> -		 */
>>>> +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
>>>> +		(!instance->fw_sync_cache_support)) {
>>> Maybe I'm wrong, but isn't the logic inverted?
>>> when fw_sync_cache_support is true  it means that there is a flash
>>> cache
>> or a
>>> battery and we don't have to send the command down ?
>>>
>> If "instance->fw_sync_cache_support" is set to true, it means driver
>> can send SYNCHRONIZE_CACHE to firmware(firmware has infrastructure to
>> support SYNCHRONIZE_CACHE).
>> If "instance->fw_sync_cache_support" is set to false, it means FW does
>> not support to handle SYNCHRONIZE_CACHE and FW will flush cache during
>> shutdown.
>
>Thanks, in that case it is correct.
>
>>>> +	if (!block_sync_cache)
>>>> +		instance->fw_sync_cache_support = (scratch_pad_2 &
>>>> +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 : 0;
>
>Please add a warning or log the state of the synchronise cache command on
>the
>controller.

Ok. thanks for pointing it out. I will add a print for the same.
>
>
>>>>  	IOCInitMessage =
>>>>  	  dma_alloc_coherent(&instance->pdev->dev,
>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>> index e3bee04..2857154 100644
>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>> @@ -72,6 +72,8 @@
>>>>  #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET   (0x0000030C)
>>>>  #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x0000006C)
>>>>
>>>> +extern bool block_sync_cache;
>>>> +
>>>>  /*
>>>>   * Raid context flags
>>>>   */
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-scsi"
>> in the body of a message to majordomo@vger.kernel.org More majordomo
>> info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 13:01     ` Ric Wheeler
  2016-10-17 13:31       ` Sumit Saxena
@ 2016-10-17 15:55       ` Christoph Hellwig
  2016-10-17 16:08         ` Ric Wheeler
  2016-10-17 16:20       ` James Bottomley
  2 siblings, 1 reply; 49+ messages in thread
From: Christoph Hellwig @ 2016-10-17 15:55 UTC (permalink / raw)
  To: Ric Wheeler
  Cc: Hannes Reinecke, Sumit Saxena, linux-scsi, martin.petersen,
	thenzl, jejb, kashyap.desai, Christoph Hellwig,
	Martin K. Petersen, Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe

On Mon, Oct 17, 2016 at 09:01:29AM -0400, Ric Wheeler wrote:
> This must go in - without this fix, there is no data integrity for any file system.

megaraid always had odd ideas on cache flushing, and this might be
a opportunity to write down all the assumptions and document them.

> In effect, this driver by default has been throwing away SYNCHRONIZE_CACHE
> commands even when acting in JBOD/non-RAID mode.

That would explain some issues we've seen with megaraid hardware, but
it seems a bit too shocking to be true.

Looking over the patch I disagree with the module option - we must do
the right thing by default, which is sending SYNCHRONIZE_CACHE commands
if the WCE bit is set.  If there are controllers where this is harmful
for RAID mode and we can't fix the firmware in time we'll need to make
special exceptions for this case in the driver based on the PCI ID
and knowing what we talk to instead of leaving it to the user.

> * having T10 & T13 report the existence of a volatile write cache - this is
> different than WCE set, some devices have a write cache and are
> battery/flash backed.

T10 is pretty clear now the WCE should only be set for a non-voltile
cache.  For a while they had odd NV bits to allow flushing a
non-volatile cache, but in the latest revisions all that is gone.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 15:55       ` Christoph Hellwig
@ 2016-10-17 16:08         ` Ric Wheeler
  2016-10-17 16:23           ` Ewan D. Milne
  0 siblings, 1 reply; 49+ messages in thread
From: Ric Wheeler @ 2016-10-17 16:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Hannes Reinecke, Sumit Saxena, linux-scsi, martin.petersen,
	thenzl, jejb, kashyap.desai, Martin K. Petersen, Jeff Moyer,
	Gris Ge, Ewan Milne, Jens Axboe

On 10/17/2016 11:55 AM, Christoph Hellwig wrote:
> On Mon, Oct 17, 2016 at 09:01:29AM -0400, Ric Wheeler wrote:
>> This must go in - without this fix, there is no data integrity for any file system.
> megaraid always had odd ideas on cache flushing, and this might be
> a opportunity to write down all the assumptions and document them.
>
>> In effect, this driver by default has been throwing away SYNCHRONIZE_CACHE
>> commands even when acting in JBOD/non-RAID mode.
> That would explain some issues we've seen with megaraid hardware, but
> it seems a bit too shocking to be true.
>
> Looking over the patch I disagree with the module option - we must do
> the right thing by default, which is sending SYNCHRONIZE_CACHE commands
> if the WCE bit is set.  If there are controllers where this is harmful
> for RAID mode and we can't fix the firmware in time we'll need to make
> special exceptions for this case in the driver based on the PCI ID
> and knowing what we talk to instead of leaving it to the user.

I do agree - having users be able to disable this easily is asking for trouble. 
It will be slower, but slower because the cache flush actually is effective.

>
>> * having T10 & T13 report the existence of a volatile write cache - this is
>> different than WCE set, some devices have a write cache and are
>> battery/flash backed.
> T10 is pretty clear now the WCE should only be set for a non-voltile
> cache.  For a while they had odd NV bits to allow flushing a
> non-volatile cache, but in the latest revisions all that is gone.

If T10 has clarity on this, then the actual fix would be to have the driver 
advertise WCE enabled only for the pass through/non-RAID luns (assuming the 
drive's write cache was enabled) and then we would leave WCE disabled for the 
targets that the firmware handles cache destaging for.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 13:01     ` Ric Wheeler
  2016-10-17 13:31       ` Sumit Saxena
  2016-10-17 15:55       ` Christoph Hellwig
@ 2016-10-17 16:20       ` James Bottomley
  2016-10-17 16:27         ` Ric Wheeler
  2016-10-17 16:29         ` Ric Wheeler
  2 siblings, 2 replies; 49+ messages in thread
From: James Bottomley @ 2016-10-17 16:20 UTC (permalink / raw)
  To: Ric Wheeler, Hannes Reinecke, Sumit Saxena, linux-scsi
  Cc: martin.petersen, thenzl, kashyap.desai, Christoph Hellwig,
	Martin K. Petersen, Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe

On Mon, 2016-10-17 at 09:01 -0400, Ric Wheeler wrote:
> On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
> > On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> > > megaraid_sas driver returns SYNCHRONIZE_CACHE command back to 
> > > SCSI layer without sending it to firmware as firmware takes care 
> > > of flushing cache.  This patch will change the driver behavior 
> > > wrt SYNCHRONIZE_CACHE handling. If underlying firmware has 
> > > support to handle the SYNCHORNIZE_CACHE, driver will send it for 
> > > firmware otherwise complete it back to SCSI layer with SUCCESS 
> > > immediately.  If  Firmware  handle SYNCHORNIZE_CACHE for both VD 
> > > and JBOD "canHandleSyncCache" bit in scratch pad register(offset 
> > > 0x00B4) will be set.
> > > 
> > > This behavior can be controlled via module parameter and user can 
> > > fallback to old behavior of returning SYNCHRONIZE_CACHE by driver 
> > > only without sending it to firmware.
> > > 
> > > Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> > > Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
> > > ---
> > >   drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
> > >   drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++--------
> > >   drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
> > >   drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
> > >   4 files changed, 14 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/drivers/scsi/megaraid/megaraid_sas.h
> > > b/drivers/scsi/megaraid/megaraid_sas.h
> > > index ca86c88..43fd14f 100644
> > > --- a/drivers/scsi/megaraid/megaraid_sas.h
> > > +++ b/drivers/scsi/megaraid/megaraid_sas.h
> > > @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
> > >   #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
> > >   #define MR_MAX_MSIX_REG_ARRAY                   16
> > >   #define MR_RDPQ_MODE_OFFSET			0X00800000
> > > +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X0100000
> > > 0
> > > +
> > >   /*
> > >   * register set for both 1068 and 1078 controllers
> > >   * structure extended for 1078 registers
> > > @@ -2140,6 +2142,7 @@ struct megasas_instance {
> > >   	u8 is_imr;
> > >   	u8 is_rdpq;
> > >   	bool dev_handle;
> > > +	bool fw_sync_cache_support;
> > >   };
> > >   struct MR_LD_VF_MAP {
> > >   	u32 size;
> > > diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
> > > b/drivers/scsi/megaraid/megaraid_sas_base.c
> > > index 092387f..a4a8e2f 100644
> > > --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> > > +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> > > @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
> > > MEGASAS_DEFAULT_CMD_TIMEOUT;
> > >   module_param(scmd_timeout, int, S_IRUGO);
> > >   MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10-90s),
> > > default 90s. See megasas_reset_timer.");
> > >   
> > > +bool block_sync_cache;
> > > +module_param(block_sync_cache, bool, S_IRUGO);
> > > +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver
> > > Default: 0(Send it to firmware)");
> > > +
> > >   MODULE_LICENSE("GPL");
> > >   MODULE_VERSION(MEGASAS_VERSION);
> > >   MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
> > > @@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host
> > > *shost, struct scsi_cmnd *scmd)
> > >   		goto out_done;
> > >   	}
> > >   
> > > -	switch (scmd->cmnd[0]) {
> > > -	case SYNCHRONIZE_CACHE:
> > > -		/*
> > > -		 * FW takes care of flush cache on its own
> > > -		 * No need to send it down
> > > -		 */
> > > +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
> > > +		(!instance->fw_sync_cache_support)) {
> > >   		scmd->result = DID_OK << 16;
> > >   		goto out_done;
> > > -	default:
> > > -		break;
> > >   	}
> > >   
> > >   	return instance->instancet
> > > ->build_and_issue_cmd(instance, scmd);
> > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > index 2159f6a..8237580 100644
> > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct
> > > megasas_instance *instance)
> > >   		ret = 1;
> > >   		goto fail_fw_init;
> > >   	}
> > > +	if (!block_sync_cache)
> > > +		instance->fw_sync_cache_support = (scratch_pad_2
> > > &
> > > +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 :
> > > 0;
> > >   
> > >   	IOCInitMessage =
> > >   	  dma_alloc_coherent(&instance->pdev->dev,
> > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > index e3bee04..2857154 100644
> > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > @@ -72,6 +72,8 @@
> > >   #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET   (0x0000030C)
> > >   #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x0000006C)
> > >   
> > > +extern bool block_sync_cache;
> > > +
> > >   /*
> > >    * Raid context flags
> > >    */
> > > 
> > Be extra careful with that.
> > 
> > SYNCHRONIZE_CACHE has (potentially) a global scope, as there might 
> > be an array-wide cache, and a cache flush would affect the entire 
> > cache. Linux is using SYNCHRONIZE_CACHE as a per device setting, ie 
> > it assumes that the effects of a cache flush is restricted to the 
> > device in question.
> > 
> > If this is _not_ the case I'd rather not enable it.
> > Have you checked that enabling this functionality doesn't have 
> > negative performance impact?
> > 
> > Cheers,
> > 
> > Hannes
> 
> This must go in - without this fix, there is no data integrity for
> any file system.

That's not what I get from the change log.  What it says to me is that
the caches are currently firmware managed.  Barring firmware bugs, that
means that we currently don't have any integrity issues.

> In effect, this driver by default has been throwing away 
> SYNCHRONIZE_CACHE commands even when acting in JBOD/non-RAID mode.

That's not what the changelog says.  It says the cache flushing has
been managed by the firmware only.  Meaning the firmware decides when
to flush the cache.  Barring firmware bugs, this should mean that
integrity is preserved in either mode.

> Of course, actually doing a SYNCHRONIZE_CACHE to drives will be 
> slower than throwing it away, but this is not optional.

What Hannes means is that we need to know that turning over cache
management to Linux is a) safe and b) not a performance regression. 
 Given that there aren't any integrity issues, that's a reasonable
request.

> We really need to have some ways to validate that our IO stack is 
> properly and safely configured.
> 
> I would love to see a couple of things:
> 
> * having T10 & T13 report the existence of a volatile write cache - 
> this is different than WCE set, some devices have a write cache and 
> are battery/flash backed.

That's the non-volatile cache log page.

James


> * having a robust tool to test over power failure/disconnect that our
> assumptions are true - any write is durable after a SYNCHRONIZE_CACHE 
> or CACHE_FLUSH_EXT is sent and ack'ed
> 
> Regards,
> 
> Ric
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi"
> in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 16:08         ` Ric Wheeler
@ 2016-10-17 16:23           ` Ewan D. Milne
  0 siblings, 0 replies; 49+ messages in thread
From: Ewan D. Milne @ 2016-10-17 16:23 UTC (permalink / raw)
  To: Ric Wheeler
  Cc: Christoph Hellwig, Hannes Reinecke, Sumit Saxena, linux-scsi,
	martin.petersen, thenzl, jejb, kashyap.desai, Martin K. Petersen,
	Jeff Moyer, Gris Ge, Jens Axboe

On Mon, 2016-10-17 at 12:08 -0400, Ric Wheeler wrote:
> On 10/17/2016 11:55 AM, Christoph Hellwig wrote:
> > On Mon, Oct 17, 2016 at 09:01:29AM -0400, Ric Wheeler wrote:
> >> This must go in - without this fix, there is no data integrity for any file system.
> > megaraid always had odd ideas on cache flushing, and this might be
> > a opportunity to write down all the assumptions and document them.
> >
> >> In effect, this driver by default has been throwing away SYNCHRONIZE_CACHE
> >> commands even when acting in JBOD/non-RAID mode.
> > That would explain some issues we've seen with megaraid hardware, but
> > it seems a bit too shocking to be true.
> >
> > Looking over the patch I disagree with the module option - we must do
> > the right thing by default, which is sending SYNCHRONIZE_CACHE commands
> > if the WCE bit is set.  If there are controllers where this is harmful
> > for RAID mode and we can't fix the firmware in time we'll need to make
> > special exceptions for this case in the driver based on the PCI ID
> > and knowing what we talk to instead of leaving it to the user.
> 
> I do agree - having users be able to disable this easily is asking for trouble. 
> It will be slower, but slower because the cache flush actually is effective.

Absolutely.  It is a bad idea to allow users to disable correct behavior
in order to achieve better performance, you will always get people who
will do this and then pay the price eventually when they get corruption.

> 
> >
> >> * having T10 & T13 report the existence of a volatile write cache - this is
> >> different than WCE set, some devices have a write cache and are
> >> battery/flash backed.
> > T10 is pretty clear now the WCE should only be set for a non-voltile
> > cache.  For a while they had odd NV bits to allow flushing a
> > non-volatile cache, but in the latest revisions all that is gone.
> 
> If T10 has clarity on this, then the actual fix would be to have the driver 
> advertise WCE enabled only for the pass through/non-RAID luns (assuming the 
> drive's write cache was enabled) and then we would leave WCE disabled for the 
> targets that the firmware handles cache destaging for.
> 
> 



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 16:20       ` James Bottomley
@ 2016-10-17 16:27         ` Ric Wheeler
  2016-10-17 17:19           ` James Bottomley
  2016-10-17 16:29         ` Ric Wheeler
  1 sibling, 1 reply; 49+ messages in thread
From: Ric Wheeler @ 2016-10-17 16:27 UTC (permalink / raw)
  To: James Bottomley, Hannes Reinecke, Sumit Saxena, linux-scsi
  Cc: martin.petersen, thenzl, kashyap.desai, Christoph Hellwig,
	Martin K. Petersen, Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe

On 10/17/2016 12:20 PM, James Bottomley wrote:
> On Mon, 2016-10-17 at 09:01 -0400, Ric Wheeler wrote:
>> On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
>>> On 10/17/2016 12:24 PM, Sumit Saxena wrote:
>>>> megaraid_sas driver returns SYNCHRONIZE_CACHE command back to
>>>> SCSI layer without sending it to firmware as firmware takes care
>>>> of flushing cache.  This patch will change the driver behavior
>>>> wrt SYNCHRONIZE_CACHE handling. If underlying firmware has
>>>> support to handle the SYNCHORNIZE_CACHE, driver will send it for
>>>> firmware otherwise complete it back to SCSI layer with SUCCESS
>>>> immediately.  If  Firmware  handle SYNCHORNIZE_CACHE for both VD
>>>> and JBOD "canHandleSyncCache" bit in scratch pad register(offset
>>>> 0x00B4) will be set.
>>>>
>>>> This behavior can be controlled via module parameter and user can
>>>> fallback to old behavior of returning SYNCHRONIZE_CACHE by driver
>>>> only without sending it to firmware.
>>>>
>>>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
>>>> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
>>>> ---
>>>>    drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
>>>>    drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++--------
>>>>    drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
>>>>    drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
>>>>    4 files changed, 14 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>>>> b/drivers/scsi/megaraid/megaraid_sas.h
>>>> index ca86c88..43fd14f 100644
>>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>>>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>>>>    #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
>>>>    #define MR_MAX_MSIX_REG_ARRAY                   16
>>>>    #define MR_RDPQ_MODE_OFFSET			0X00800000
>>>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X0100000
>>>> 0
>>>> +
>>>>    /*
>>>>    * register set for both 1068 and 1078 controllers
>>>>    * structure extended for 1078 registers
>>>> @@ -2140,6 +2142,7 @@ struct megasas_instance {
>>>>    	u8 is_imr;
>>>>    	u8 is_rdpq;
>>>>    	bool dev_handle;
>>>> +	bool fw_sync_cache_support;
>>>>    };
>>>>    struct MR_LD_VF_MAP {
>>>>    	u32 size;
>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> index 092387f..a4a8e2f 100644
>>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
>>>> MEGASAS_DEFAULT_CMD_TIMEOUT;
>>>>    module_param(scmd_timeout, int, S_IRUGO);
>>>>    MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10-90s),
>>>> default 90s. See megasas_reset_timer.");
>>>>    
>>>> +bool block_sync_cache;
>>>> +module_param(block_sync_cache, bool, S_IRUGO);
>>>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver
>>>> Default: 0(Send it to firmware)");
>>>> +
>>>>    MODULE_LICENSE("GPL");
>>>>    MODULE_VERSION(MEGASAS_VERSION);
>>>>    MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
>>>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host
>>>> *shost, struct scsi_cmnd *scmd)
>>>>    		goto out_done;
>>>>    	}
>>>>    
>>>> -	switch (scmd->cmnd[0]) {
>>>> -	case SYNCHRONIZE_CACHE:
>>>> -		/*
>>>> -		 * FW takes care of flush cache on its own
>>>> -		 * No need to send it down
>>>> -		 */
>>>> +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
>>>> +		(!instance->fw_sync_cache_support)) {
>>>>    		scmd->result = DID_OK << 16;
>>>>    		goto out_done;
>>>> -	default:
>>>> -		break;
>>>>    	}
>>>>    
>>>>    	return instance->instancet
>>>> ->build_and_issue_cmd(instance, scmd);
>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>> index 2159f6a..8237580 100644
>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>> @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct
>>>> megasas_instance *instance)
>>>>    		ret = 1;
>>>>    		goto fail_fw_init;
>>>>    	}
>>>> +	if (!block_sync_cache)
>>>> +		instance->fw_sync_cache_support = (scratch_pad_2
>>>> &
>>>> +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 :
>>>> 0;
>>>>    
>>>>    	IOCInitMessage =
>>>>    	  dma_alloc_coherent(&instance->pdev->dev,
>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>> index e3bee04..2857154 100644
>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>> @@ -72,6 +72,8 @@
>>>>    #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET   (0x0000030C)
>>>>    #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x0000006C)
>>>>    
>>>> +extern bool block_sync_cache;
>>>> +
>>>>    /*
>>>>     * Raid context flags
>>>>     */
>>>>
>>> Be extra careful with that.
>>>
>>> SYNCHRONIZE_CACHE has (potentially) a global scope, as there might
>>> be an array-wide cache, and a cache flush would affect the entire
>>> cache. Linux is using SYNCHRONIZE_CACHE as a per device setting, ie
>>> it assumes that the effects of a cache flush is restricted to the
>>> device in question.
>>>
>>> If this is _not_ the case I'd rather not enable it.
>>> Have you checked that enabling this functionality doesn't have
>>> negative performance impact?
>>>
>>> Cheers,
>>>
>>> Hannes
>> This must go in - without this fix, there is no data integrity for
>> any file system.
> That's not what I get from the change log.  What it says to me is that
> the caches are currently firmware managed.  Barring firmware bugs, that
> means that we currently don't have any integrity issues.

Your understanding (or the change log) is incorrect.

The issue here is for devices that are non-RAID or pass through - this is a real 
issue and has been seen in practice.

Ric

>
>> In effect, this driver by default has been throwing away
>> SYNCHRONIZE_CACHE commands even when acting in JBOD/non-RAID mode.
> That's not what the changelog says.  It says the cache flushing has
> been managed by the firmware only.  Meaning the firmware decides when
> to flush the cache.  Barring firmware bugs, this should mean that
> integrity is preserved in either mode.
>
>> Of course, actually doing a SYNCHRONIZE_CACHE to drives will be
>> slower than throwing it away, but this is not optional.
> What Hannes means is that we need to know that turning over cache
> management to Linux is a) safe and b) not a performance regression.
>   Given that there aren't any integrity issues, that's a reasonable
> request.
>
>> We really need to have some ways to validate that our IO stack is
>> properly and safely configured.
>>
>> I would love to see a couple of things:
>>
>> * having T10 & T13 report the existence of a volatile write cache -
>> this is different than WCE set, some devices have a write cache and
>> are battery/flash backed.
> That's the non-volatile cache log page.
>
> James
>
>
>> * having a robust tool to test over power failure/disconnect that our
>> assumptions are true - any write is durable after a SYNCHRONIZE_CACHE
>> or CACHE_FLUSH_EXT is sent and ack'ed
>>
>> Regards,
>>
>> Ric
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-scsi"
>> in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 16:20       ` James Bottomley
  2016-10-17 16:27         ` Ric Wheeler
@ 2016-10-17 16:29         ` Ric Wheeler
  1 sibling, 0 replies; 49+ messages in thread
From: Ric Wheeler @ 2016-10-17 16:29 UTC (permalink / raw)
  To: James Bottomley, Ric Wheeler, Hannes Reinecke, Sumit Saxena, linux-scsi
  Cc: martin.petersen, thenzl, kashyap.desai, Christoph Hellwig,
	Martin K. Petersen, Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe

On 10/17/2016 12:20 PM, James Bottomley wrote:
>> We really need to have some ways to validate that our IO stack is
>> >properly and safely configured.
>> >
>> >I would love to see a couple of things:
>> >
>> >* having T10 & T13 report the existence of a volatile write cache -
>> >this is different than WCE set, some devices have a write cache and
>> >are battery/flash backed.
> That's the non-volatile cache log page.
>
> James
>
>
That is not how the vendors implement this unfortunately. I dug into this 
non-volatile cache log page with several disk vendors last year and they were 
consistently not setting this to reflect that state.

Ric



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 16:27         ` Ric Wheeler
@ 2016-10-17 17:19           ` James Bottomley
  2016-10-17 17:30             ` Ric Wheeler
  2016-10-17 17:36             ` Kashyap Desai
  0 siblings, 2 replies; 49+ messages in thread
From: James Bottomley @ 2016-10-17 17:19 UTC (permalink / raw)
  To: Ric Wheeler, Hannes Reinecke, Sumit Saxena, linux-scsi
  Cc: martin.petersen, thenzl, kashyap.desai, Christoph Hellwig,
	Martin K. Petersen, Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe

On Mon, 2016-10-17 at 12:27 -0400, Ric Wheeler wrote:
> On 10/17/2016 12:20 PM, James Bottomley wrote:
> > On Mon, 2016-10-17 at 09:01 -0400, Ric Wheeler wrote:
> > > On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
> > > > On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> > > > > megaraid_sas driver returns SYNCHRONIZE_CACHE command back to
> > > > > SCSI layer without sending it to firmware as firmware takes
> > > > > care
> > > > > of flushing cache.  This patch will change the driver
> > > > > behavior
> > > > > wrt SYNCHRONIZE_CACHE handling. If underlying firmware has
> > > > > support to handle the SYNCHORNIZE_CACHE, driver will send it
> > > > > for
> > > > > firmware otherwise complete it back to SCSI layer with
> > > > > SUCCESS
> > > > > immediately.  If  Firmware  handle SYNCHORNIZE_CACHE for both
> > > > > VD
> > > > > and JBOD "canHandleSyncCache" bit in scratch pad
> > > > > register(offset
> > > > > 0x00B4) will be set.
> > > > > 
> > > > > This behavior can be controlled via module parameter and user
> > > > > can
> > > > > fallback to old behavior of returning SYNCHRONIZE_CACHE by
> > > > > driver
> > > > > only without sending it to firmware.
> > > > > 
> > > > > Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> > > > > Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
> > > > > ---
> > > > >    drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
> > > > >    drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++---
> > > > > -----
> > > > >    drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
> > > > >    drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
> > > > >    4 files changed, 14 insertions(+), 8 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas.h
> > > > > b/drivers/scsi/megaraid/megaraid_sas.h
> > > > > index ca86c88..43fd14f 100644
> > > > > --- a/drivers/scsi/megaraid/megaraid_sas.h
> > > > > +++ b/drivers/scsi/megaraid/megaraid_sas.h
> > > > > @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
> > > > >    #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
> > > > >    #define MR_MAX_MSIX_REG_ARRAY                   16
> > > > >    #define MR_RDPQ_MODE_OFFSET			0X00800
> > > > > 000
> > > > > +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X010
> > > > > 0000
> > > > > 0
> > > > > +
> > > > >    /*
> > > > >    * register set for both 1068 and 1078 controllers
> > > > >    * structure extended for 1078 registers
> > > > > @@ -2140,6 +2142,7 @@ struct megasas_instance {
> > > > >    	u8 is_imr;
> > > > >    	u8 is_rdpq;
> > > > >    	bool dev_handle;
> > > > > +	bool fw_sync_cache_support;
> > > > >    };
> > > > >    struct MR_LD_VF_MAP {
> > > > >    	u32 size;
> > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
> > > > > b/drivers/scsi/megaraid/megaraid_sas_base.c
> > > > > index 092387f..a4a8e2f 100644
> > > > > --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> > > > > +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> > > > > @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
> > > > > MEGASAS_DEFAULT_CMD_TIMEOUT;
> > > > >    module_param(scmd_timeout, int, S_IRUGO);
> > > > >    MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10
> > > > > -90s),
> > > > > default 90s. See megasas_reset_timer.");
> > > > >    
> > > > > +bool block_sync_cache;
> > > > > +module_param(block_sync_cache, bool, S_IRUGO);
> > > > > +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by
> > > > > driver
> > > > > Default: 0(Send it to firmware)");
> > > > > +
> > > > >    MODULE_LICENSE("GPL");
> > > > >    MODULE_VERSION(MEGASAS_VERSION);
> > > > >    MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
> > > > > @@ -1700,16 +1704,10 @@ megasas_queue_command(struct
> > > > > Scsi_Host
> > > > > *shost, struct scsi_cmnd *scmd)
> > > > >    		goto out_done;
> > > > >    	}
> > > > >    
> > > > > -	switch (scmd->cmnd[0]) {
> > > > > -	case SYNCHRONIZE_CACHE:
> > > > > -		/*
> > > > > -		 * FW takes care of flush cache on its own
> > > > > -		 * No need to send it down
> > > > > -		 */
> > > > > +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
> > > > > +		(!instance->fw_sync_cache_support)) {
> > > > >    		scmd->result = DID_OK << 16;
> > > > >    		goto out_done;
> > > > > -	default:
> > > > > -		break;
> > > > >    	}
> > > > >    
> > > > >    	return instance->instancet
> > > > > ->build_and_issue_cmd(instance, scmd);
> > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > > > b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > > > index 2159f6a..8237580 100644
> > > > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > > > @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct
> > > > > megasas_instance *instance)
> > > > >    		ret = 1;
> > > > >    		goto fail_fw_init;
> > > > >    	}
> > > > > +	if (!block_sync_cache)
> > > > > +		instance->fw_sync_cache_support =
> > > > > (scratch_pad_2
> > > > > &
> > > > > +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1
> > > > > :
> > > > > 0;
> > > > >    
> > > > >    	IOCInitMessage =
> > > > >    	  dma_alloc_coherent(&instance->pdev->dev,
> > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > > > b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > > > index e3bee04..2857154 100644
> > > > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > > > @@ -72,6 +72,8 @@
> > > > >    #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET  
> > > > >  (0x0000030C)
> > > > >    #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x000000
> > > > > 6C)
> > > > >    
> > > > > +extern bool block_sync_cache;
> > > > > +
> > > > >    /*
> > > > >     * Raid context flags
> > > > >     */
> > > > > 
> > > > Be extra careful with that.
> > > > 
> > > > SYNCHRONIZE_CACHE has (potentially) a global scope, as there
> > > > might
> > > > be an array-wide cache, and a cache flush would affect the
> > > > entire
> > > > cache. Linux is using SYNCHRONIZE_CACHE as a per device
> > > > setting, ie
> > > > it assumes that the effects of a cache flush is restricted to
> > > > the
> > > > device in question.
> > > > 
> > > > If this is _not_ the case I'd rather not enable it.
> > > > Have you checked that enabling this functionality doesn't have
> > > > negative performance impact?
> > > > 
> > > > Cheers,
> > > > 
> > > > Hannes
> > > This must go in - without this fix, there is no data integrity
> > > for
> > > any file system.
> > That's not what I get from the change log.  What it says to me is 
> > that the caches are currently firmware managed.  Barring firmware 
> > bugs, that means that we currently don't have any integrity issues.
> 
> Your understanding (or the change log) is incorrect.

There's no current way in English of construing "as firmware takes care
of flushing cache" to mean its inverse, so the changelog needs updating
to explain that firmware does *not* take care of cache flushing, so
effectively nothing does and we'll need a cc to stable if the problem
is that nothing takes care of flushing the drive caches.

James


> The issue here is for devices that are non-RAID or pass through - 
> this is a real issue and has been seen in practice.




^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 17:19           ` James Bottomley
@ 2016-10-17 17:30             ` Ric Wheeler
  2016-10-17 17:36             ` Kashyap Desai
  1 sibling, 0 replies; 49+ messages in thread
From: Ric Wheeler @ 2016-10-17 17:30 UTC (permalink / raw)
  To: James Bottomley, Hannes Reinecke, Sumit Saxena, linux-scsi
  Cc: martin.petersen, thenzl, kashyap.desai, Christoph Hellwig,
	Martin K. Petersen, Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe

On 10/17/2016 01:19 PM, James Bottomley wrote:
>>> That's not what I get from the change log.  What it says to me is
>>> > >that the caches are currently firmware managed.  Barring firmware
>>> > >bugs, that means that we currently don't have any integrity issues.
>> >
>> >Your understanding (or the change log) is incorrect.
> There's no current way in English of construing "as firmware takes care
> of flushing cache" to mean its inverse, so the changelog needs updating
> to explain that firmware does*not*  take care of cache flushing, so
> effectively nothing does and we'll need a cc to stable if the problem
> is that nothing takes care of flushing the drive caches.
>
> James
>
>
I agree that the changelog should be fixed, but the code itself is clearly 
broken in how is discards synchronize_cache commands. I assume Sumit will update 
that for use.

The specific situation is this:

* when using devices in RAID mode, the firmware does handle cache flushes (I 
assume it disables the back end disk write caches and handles its HBA write 
cache as you would expect).

* when using devices in non-RAID mode or pass through mode where the backend 
disks have write cache enabled, dropping the "SYNCHRONIZE_CACHE" commands in the 
driver is *never* correct for any disk with a volatile write cache

* there is no standard way to distinguish devices that have WCE set and have 
non-volatile write caches, so we have to assume the worst.

Regards,

Ric



^ permalink raw reply	[flat|nested] 49+ messages in thread

* RE: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 17:19           ` James Bottomley
  2016-10-17 17:30             ` Ric Wheeler
@ 2016-10-17 17:36             ` Kashyap Desai
  2016-10-17 17:51               ` James Bottomley
  1 sibling, 1 reply; 49+ messages in thread
From: Kashyap Desai @ 2016-10-17 17:36 UTC (permalink / raw)
  To: James Bottomley, Ric Wheeler, Hannes Reinecke, Sumit Saxena, linux-scsi
  Cc: martin.petersen, thenzl, Christoph Hellwig, Martin K. Petersen,
	Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe

> -----Original Message-----
> From: James Bottomley [mailto:jejb@linux.vnet.ibm.com]
> Sent: Monday, October 17, 2016 10:50 PM
> To: Ric Wheeler; Hannes Reinecke; Sumit Saxena; linux-scsi@vger.kernel.org
> Cc: martin.petersen@oracle.com; thenzl@redhat.com;
> kashyap.desai@broadcom.com; Christoph Hellwig; Martin K. Petersen; Jeff
> Moyer; Gris Ge; Ewan Milne; Jens Axboe
> Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command
> to firmware
>
> On Mon, 2016-10-17 at 12:27 -0400, Ric Wheeler wrote:
> > On 10/17/2016 12:20 PM, James Bottomley wrote:
> > > On Mon, 2016-10-17 at 09:01 -0400, Ric Wheeler wrote:
> > > > On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
> > > > > On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> > > > > > megaraid_sas driver returns SYNCHRONIZE_CACHE command back to
> > > > > > SCSI layer without sending it to firmware as firmware takes
> > > > > > care of flushing cache.  This patch will change the driver
> > > > > > behavior wrt SYNCHRONIZE_CACHE handling. If underlying
> > > > > > firmware has support to handle the SYNCHORNIZE_CACHE, driver
> > > > > > will send it for firmware otherwise complete it back to SCSI
> > > > > > layer with SUCCESS immediately.  If  Firmware  handle
> > > > > > SYNCHORNIZE_CACHE for both VD and JBOD "canHandleSyncCache"
> > > > > > bit in scratch pad register(offset
> > > > > > 0x00B4) will be set.
> > > > > >
> > > > > > This behavior can be controlled via module parameter and user
> > > > > > can fallback to old behavior of returning SYNCHRONIZE_CACHE by
> > > > > > driver only without sending it to firmware.
> > > > > >
> > > > > > Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> > > > > > Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
> > > > > > ---
> > > > > >    drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
> > > > > >    drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++---
> > > > > > -----
> > > > > >    drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
> > > > > >    drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
> > > > > >    4 files changed, 14 insertions(+), 8 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas.h
> > > > > > b/drivers/scsi/megaraid/megaraid_sas.h
> > > > > > index ca86c88..43fd14f 100644
> > > > > > --- a/drivers/scsi/megaraid/megaraid_sas.h
> > > > > > +++ b/drivers/scsi/megaraid/megaraid_sas.h
> > > > > > @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
> > > > > >    #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
> > > > > >    #define MR_MAX_MSIX_REG_ARRAY                   16
> > > > > >    #define MR_RDPQ_MODE_OFFSET			0X00800
> > > > > > 000
> > > > > > +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X010
> > > > > > 0000
> > > > > > 0
> > > > > > +
> > > > > >    /*
> > > > > >    * register set for both 1068 and 1078 controllers
> > > > > >    * structure extended for 1078 registers @@ -2140,6 +2142,7
> > > > > > @@ struct megasas_instance {
> > > > > >    	u8 is_imr;
> > > > > >    	u8 is_rdpq;
> > > > > >    	bool dev_handle;
> > > > > > +	bool fw_sync_cache_support;
> > > > > >    };
> > > > > >    struct MR_LD_VF_MAP {
> > > > > >    	u32 size;
> > > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
> > > > > > b/drivers/scsi/megaraid/megaraid_sas_base.c
> > > > > > index 092387f..a4a8e2f 100644
> > > > > > --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> > > > > > +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> > > > > > @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
> > > > > > MEGASAS_DEFAULT_CMD_TIMEOUT;
> > > > > >    module_param(scmd_timeout, int, S_IRUGO);
> > > > > >    MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10
> > > > > > -90s), default 90s. See megasas_reset_timer.");
> > > > > >
> > > > > > +bool block_sync_cache;
> > > > > > +module_param(block_sync_cache, bool, S_IRUGO);
> > > > > > +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by
> > > > > > driver
> > > > > > Default: 0(Send it to firmware)");
> > > > > > +
> > > > > >    MODULE_LICENSE("GPL");
> > > > > >    MODULE_VERSION(MEGASAS_VERSION);
> > > > > >    MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
> > > > > > @@ -1700,16 +1704,10 @@ megasas_queue_command(struct
> Scsi_Host
> > > > > > *shost, struct scsi_cmnd *scmd)
> > > > > >    		goto out_done;
> > > > > >    	}
> > > > > >
> > > > > > -	switch (scmd->cmnd[0]) {
> > > > > > -	case SYNCHRONIZE_CACHE:
> > > > > > -		/*
> > > > > > -		 * FW takes care of flush cache on its own
> > > > > > -		 * No need to send it down
> > > > > > -		 */
> > > > > > +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
> > > > > > +		(!instance->fw_sync_cache_support)) {
> > > > > >    		scmd->result = DID_OK << 16;
> > > > > >    		goto out_done;
> > > > > > -	default:
> > > > > > -		break;
> > > > > >    	}
> > > > > >
> > > > > >    	return instance->instancet
> > > > > > ->build_and_issue_cmd(instance, scmd);
> > > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > > > > b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > > > > index 2159f6a..8237580 100644
> > > > > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > > > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > > > > @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct
> > > > > > megasas_instance *instance)
> > > > > >    		ret = 1;
> > > > > >    		goto fail_fw_init;
> > > > > >    	}
> > > > > > +	if (!block_sync_cache)
> > > > > > +		instance->fw_sync_cache_support =
> > > > > > (scratch_pad_2
> > > > > > &
> > > > > > +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1
> > > > > > :
> > > > > > 0;
> > > > > >
> > > > > >    	IOCInitMessage =
> > > > > >    	  dma_alloc_coherent(&instance->pdev->dev,
> > > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > > > > b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > > > > index e3bee04..2857154 100644
> > > > > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > > > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > > > > @@ -72,6 +72,8 @@
> > > > > >    #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET
> > > > > >  (0x0000030C)
> > > > > >    #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x000000
> > > > > > 6C)
> > > > > >
> > > > > > +extern bool block_sync_cache;
> > > > > > +
> > > > > >    /*
> > > > > >     * Raid context flags
> > > > > >     */
> > > > > >
> > > > > Be extra careful with that.
> > > > >
> > > > > SYNCHRONIZE_CACHE has (potentially) a global scope, as there
> > > > > might be an array-wide cache, and a cache flush would affect the
> > > > > entire cache. Linux is using SYNCHRONIZE_CACHE as a per device
> > > > > setting, ie it assumes that the effects of a cache flush is
> > > > > restricted to the device in question.
> > > > >
> > > > > If this is _not_ the case I'd rather not enable it.
> > > > > Have you checked that enabling this functionality doesn't have
> > > > > negative performance impact?
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Hannes
> > > > This must go in - without this fix, there is no data integrity for
> > > > any file system.
> > > That's not what I get from the change log.  What it says to me is
> > > that the caches are currently firmware managed.  Barring firmware
> > > bugs, that means that we currently don't have any integrity issues.
> >
> > Your understanding (or the change log) is incorrect.
>
> There's no current way in English of construing "as firmware takes care of
> flushing cache" to mean its inverse, so the changelog needs updating to
> explain
> that firmware does *not* take care of cache flushing, so effectively
> nothing
> does and we'll need a cc to stable if the problem is that nothing takes
> care of
> flushing the drive caches.
>
> James

Sorry for confusion. More accurate to say  -

Current megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI
layer without sending it down to firmware as firmware supposed to takes care
of flushing cache for all Virtual Disk at the time of system
reboot/shutdown.
Because of above FW rule driver coded wrongly and it added bug that
SYNCHRONIZE_CACHE is not passed to FW for JBOD as well.  (MR Firmware wanted
OS to manage SYNCHRONIZE_CACHE and pass the same to the Firmware. ).
We figure out that even for VD, why driver should send back SYNC_CACHE
command...let's have the same in FW (giving all control in FW)

New behavior described -

IF application requests SYNCH_CACHE
 IF any FW which supports new API bit called canHandleSyncCache
  Driver sends SYNCH_CACHE command to the FW
  IF 'VirtualDisk'
      FW does not send SYNCH_CACHE to drives
      FW returns SUCCESS
ELSE IF 'JBOD'
     FW sends SYNCH_CACHE to drive
     FW obtains status from drive and returns same status back to driver
  ENDIF

Fixing this is robust if FW and driver changes are inline. See below for
more detail.

Case - 1
This patch is attempt to fix one level problem where Virtual Disks are not
managed correctly in MR FW. There are some MR FW (E.a Gen2 and Gen2.5 FW),
which set WCE bit for Virtual Disk but FW is not reply correct for
SYNCH_CACHE.
 This was handled in past and carry forward (in driver + FW ) to all new
generation products as well. We tried to collect all possible details why it
was handled such way to provide better driver fix for this issue( mainly to
avoid this FW checks and module
  parameters etc..). But now it looks like to make generic fix  (Device ID
based etc..)is even risky, so went with this protective approach.  It is
very risky to find out which Device ID and FW has capability to manage
SYNC_CACHE, so we managed to figure out which are the new FW using FW
capability bit.

E.g Liberator/Thunderbolt MR FW SYNC_CACHE is write Unsupported command
(this is a firmware bug) and immediately it will be failed to SML. This will
cause File system to go Read-only.

Case -2
One more thing which we are trying and known driver can send "SYNC_CACHE"
unconditionally to all generation of FW.  For this we are doing more unit
testing on various controllers/FW and planning to provide another patch to
fix JBOD path for any FW. (It will not be based on FW capability/module
parameter).

If we remove module parameter, we can ask customer to disable WCE on drive
to get similar impact.

` Kashyap
>
>
> > The issue here is for devices that are non-RAID or pass through - this
> > is a real issue and has been seen in practice.
>
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 17:36             ` Kashyap Desai
@ 2016-10-17 17:51               ` James Bottomley
  2016-10-18 13:56                 ` Tomas Henzl
  2016-10-18 18:47                 ` Sumit Saxena
  0 siblings, 2 replies; 49+ messages in thread
From: James Bottomley @ 2016-10-17 17:51 UTC (permalink / raw)
  To: Kashyap Desai, Ric Wheeler, Hannes Reinecke, Sumit Saxena, linux-scsi
  Cc: martin.petersen, thenzl, Christoph Hellwig, Martin K. Petersen,
	Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe

On Mon, 2016-10-17 at 23:06 +0530, Kashyap Desai wrote:
> > -----Original Message-----
> > From: James Bottomley [mailto:jejb@linux.vnet.ibm.com]
> > Sent: Monday, October 17, 2016 10:50 PM
> > To: Ric Wheeler; Hannes Reinecke; Sumit Saxena; 
> > linux-scsi@vger.kernel.org
> > Cc: martin.petersen@oracle.com; thenzl@redhat.com;
> > kashyap.desai@broadcom.com; Christoph Hellwig; Martin K. Petersen;
> > Jeff
> > Moyer; Gris Ge; Ewan Milne; Jens Axboe
> > Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE
> > command
> > to firmware
> > 
> > On Mon, 2016-10-17 at 12:27 -0400, Ric Wheeler wrote:
> > > On 10/17/2016 12:20 PM, James Bottomley wrote:
> > > > On Mon, 2016-10-17 at 09:01 -0400, Ric Wheeler wrote:
> > > > > On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
> > > > > > On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> > > > > > > megaraid_sas driver returns SYNCHRONIZE_CACHE command
> > > > > > > back to
> > > > > > > SCSI layer without sending it to firmware as firmware
> > > > > > > takes
> > > > > > > care of flushing cache.  This patch will change the
> > > > > > > driver
> > > > > > > behavior wrt SYNCHRONIZE_CACHE handling. If underlying
> > > > > > > firmware has support to handle the SYNCHORNIZE_CACHE,
> > > > > > > driver
> > > > > > > will send it for firmware otherwise complete it back to
> > > > > > > SCSI
> > > > > > > layer with SUCCESS immediately.  If  Firmware  handle
> > > > > > > SYNCHORNIZE_CACHE for both VD and JBOD
> > > > > > > "canHandleSyncCache"
> > > > > > > bit in scratch pad register(offset
> > > > > > > 0x00B4) will be set.
> > > > > > > 
> > > > > > > This behavior can be controlled via module parameter and
> > > > > > > user
> > > > > > > can fallback to old behavior of returning
> > > > > > > SYNCHRONIZE_CACHE by
> > > > > > > driver only without sending it to firmware.
> > > > > > > 
> > > > > > > Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> > > > > > > Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
> > > > > > > ---
> > > > > > >    drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
> > > > > > >    drivers/scsi/megaraid/megaraid_sas_base.c   | 14
> > > > > > > ++++++---
> > > > > > > -----
> > > > > > >    drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
> > > > > > >    drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
> > > > > > >    4 files changed, 14 insertions(+), 8 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas.h
> > > > > > > b/drivers/scsi/megaraid/megaraid_sas.h
> > > > > > > index ca86c88..43fd14f 100644
> > > > > > > --- a/drivers/scsi/megaraid/megaraid_sas.h
> > > > > > > +++ b/drivers/scsi/megaraid/megaraid_sas.h
> > > > > > > @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
> > > > > > >    #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
> > > > > > >    #define MR_MAX_MSIX_REG_ARRAY                   16
> > > > > > >    #define MR_RDPQ_MODE_OFFSET			0X0
> > > > > > > 0800
> > > > > > > 000
> > > > > > > +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0
> > > > > > > X010
> > > > > > > 0000
> > > > > > > 0
> > > > > > > +
> > > > > > >    /*
> > > > > > >    * register set for both 1068 and 1078 controllers
> > > > > > >    * structure extended for 1078 registers @@ -2140,6
> > > > > > > +2142,7
> > > > > > > @@ struct megasas_instance {
> > > > > > >    	u8 is_imr;
> > > > > > >    	u8 is_rdpq;
> > > > > > >    	bool dev_handle;
> > > > > > > +	bool fw_sync_cache_support;
> > > > > > >    };
> > > > > > >    struct MR_LD_VF_MAP {
> > > > > > >    	u32 size;
> > > > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
> > > > > > > b/drivers/scsi/megaraid/megaraid_sas_base.c
> > > > > > > index 092387f..a4a8e2f 100644
> > > > > > > --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> > > > > > > +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> > > > > > > @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
> > > > > > > MEGASAS_DEFAULT_CMD_TIMEOUT;
> > > > > > >    module_param(scmd_timeout, int, S_IRUGO);
> > > > > > >    MODULE_PARM_DESC(scmd_timeout, "scsi command timeout
> > > > > > > (10
> > > > > > > -90s), default 90s. See megasas_reset_timer.");
> > > > > > > 
> > > > > > > +bool block_sync_cache;
> > > > > > > +module_param(block_sync_cache, bool, S_IRUGO);
> > > > > > > +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by
> > > > > > > driver
> > > > > > > Default: 0(Send it to firmware)");
> > > > > > > +
> > > > > > >    MODULE_LICENSE("GPL");
> > > > > > >    MODULE_VERSION(MEGASAS_VERSION);
> > > > > > >    MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
> > > > > > > @@ -1700,16 +1704,10 @@ megasas_queue_command(struct
> > Scsi_Host
> > > > > > > *shost, struct scsi_cmnd *scmd)
> > > > > > >    		goto out_done;
> > > > > > >    	}
> > > > > > > 
> > > > > > > -	switch (scmd->cmnd[0]) {
> > > > > > > -	case SYNCHRONIZE_CACHE:
> > > > > > > -		/*
> > > > > > > -		 * FW takes care of flush cache on its
> > > > > > > own
> > > > > > > -		 * No need to send it down
> > > > > > > -		 */
> > > > > > > +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
> > > > > > > +		(!instance->fw_sync_cache_support)) {
> > > > > > >    		scmd->result = DID_OK << 16;
> > > > > > >    		goto out_done;
> > > > > > > -	default:
> > > > > > > -		break;
> > > > > > >    	}
> > > > > > > 
> > > > > > >    	return instance->instancet
> > > > > > > ->build_and_issue_cmd(instance, scmd);
> > > > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > > > > > b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > > > > > index 2159f6a..8237580 100644
> > > > > > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > > > > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > > > > > @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct
> > > > > > > megasas_instance *instance)
> > > > > > >    		ret = 1;
> > > > > > >    		goto fail_fw_init;
> > > > > > >    	}
> > > > > > > +	if (!block_sync_cache)
> > > > > > > +		instance->fw_sync_cache_support =
> > > > > > > (scratch_pad_2
> > > > > > > &
> > > > > > > +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET)
> > > > > > > ? 1
> > > > > > > :
> > > > > > > 0;
> > > > > > > 
> > > > > > >    	IOCInitMessage =
> > > > > > >    	  dma_alloc_coherent(&instance->pdev->dev,
> > > > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > > > > > b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > > > > > index e3bee04..2857154 100644
> > > > > > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > > > > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> > > > > > > @@ -72,6 +72,8 @@
> > > > > > >    #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET
> > > > > > >  (0x0000030C)
> > > > > > >    #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x00
> > > > > > > 0000
> > > > > > > 6C)
> > > > > > > 
> > > > > > > +extern bool block_sync_cache;
> > > > > > > +
> > > > > > >    /*
> > > > > > >     * Raid context flags
> > > > > > >     */
> > > > > > > 
> > > > > > Be extra careful with that.
> > > > > > 
> > > > > > SYNCHRONIZE_CACHE has (potentially) a global scope, as
> > > > > > there
> > > > > > might be an array-wide cache, and a cache flush would
> > > > > > affect the
> > > > > > entire cache. Linux is using SYNCHRONIZE_CACHE as a per
> > > > > > device
> > > > > > setting, ie it assumes that the effects of a cache flush is
> > > > > > restricted to the device in question.
> > > > > > 
> > > > > > If this is _not_ the case I'd rather not enable it.
> > > > > > Have you checked that enabling this functionality doesn't
> > > > > > have
> > > > > > negative performance impact?
> > > > > > 
> > > > > > Cheers,
> > > > > > 
> > > > > > Hannes
> > > > > This must go in - without this fix, there is no data
> > > > > integrity for
> > > > > any file system.
> > > > That's not what I get from the change log.  What it says to me
> > > > is
> > > > that the caches are currently firmware managed.  Barring
> > > > firmware
> > > > bugs, that means that we currently don't have any integrity
> > > > issues.
> > > 
> > > Your understanding (or the change log) is incorrect.
> > 
> > There's no current way in English of construing "as firmware takes
> > care of
> > flushing cache" to mean its inverse, so the changelog needs
> > updating to
> > explain
> > that firmware does *not* take care of cache flushing, so
> > effectively
> > nothing
> > does and we'll need a cc to stable if the problem is that nothing
> > takes
> > care of
> > flushing the drive caches.
> > 
> > James
> 
> Sorry for confusion. More accurate to say  -
> 
> Current megaraid_sas driver returns SYNCHRONIZE_CACHE command back to 
> SCSI layer without sending it down to firmware as firmware supposed 
> to takes care of flushing cache for all Virtual Disk at the time of 
> system reboot/shutdown. Because of above FW rule driver coded wrongly 
> and it added bug that SYNCHRONIZE_CACHE is not passed to FW for JBOD 
> as well.  (MR Firmware wanted OS to manage SYNCHRONIZE_CACHE and pass 
> the same to the Firmware. ). We figure out that even for VD, why 
> driver should send back SYNC_CACHE command...let's have the same in
> FW (giving all control in FW)
> 
> New behavior described -
> 
> IF application requests SYNCH_CACHE
>  IF any FW which supports new API bit called canHandleSyncCache
>   Driver sends SYNCH_CACHE command to the FW
>   IF 'VirtualDisk'
>       FW does not send SYNCH_CACHE to drives
>       FW returns SUCCESS
> ELSE IF 'JBOD'
>      FW sends SYNCH_CACHE to drive
>      FW obtains status from drive and returns same status back to
> driver
>   ENDIF
> 
> Fixing this is robust if FW and driver changes are inline. See below 
> for more detail.
> 
> Case - 1
> This patch is attempt to fix one level problem where Virtual Disks 
> are not managed correctly in MR FW. There are some MR FW (E.a Gen2 
> and Gen2.5 FW), which set WCE bit for Virtual Disk but FW is not 
> reply correct for SYNCH_CACHE.  This was handled in past and carry 
> forward (in driver + FW ) to all new generation products as well. We 
> tried to collect all possible details why it was handled such way to 
> provide better driver fix for this issue(mainly to avoid this FW 
> checks and module  parameters etc..). But now it looks like to make 
> generic fix (Device ID based etc..)is even risky, so went with this 
> protective approach.  It is very risky to find out which Device ID 
> and FW has capability to manage SYNC_CACHE, so we managed to figure 
> out which are the new FW using FW capability bit.
> 
> E.g Liberator/Thunderbolt MR FW SYNC_CACHE is write Unsupported 
> command (this is a firmware bug) and immediately it will be failed to 
> SML. This will cause File system to go Read-only.

That's a better description ... what you're saying is that the patch
doesn't actually fix the bug Ric is worrying about.

> Case -2
> One more thing which we are trying and known driver can send 
> "SYNC_CACHE" unconditionally to all generation of FW.  For this we 
> are doing more unit testing on various controllers/FW and planning to 
> provide another patch to fix JBOD path for any FW. (It will not be 
> based on FW capability/module parameter).

OK, but we really need this ASAP.  As Ric said, data integrity is at
risk until this is fixed.  Can you also reference the commit that
introduced the problem so we know how far to do the sable backports?

> If we remove module parameter, we can ask customer to disable WCE on
> drive to get similar impact.

Thanks,

James


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 13:57       ` Tomas Henzl
  2016-10-17 14:25         ` Sumit Saxena
@ 2016-10-18 13:07         ` Ric Wheeler
  2016-10-18 13:22           ` Sumit Saxena
  1 sibling, 1 reply; 49+ messages in thread
From: Ric Wheeler @ 2016-10-18 13:07 UTC (permalink / raw)
  To: Tomas Henzl, Sumit Saxena, linux-scsi
  Cc: martin.petersen, jejb, Kashyap Desai

On 10/17/2016 09:57 AM, Tomas Henzl wrote:
> On 17.10.2016 15:28, Sumit Saxena wrote:
>>> -----Original Message-----
>>> From: Tomas Henzl [mailto:thenzl@redhat.com]
>>> Sent: Monday, October 17, 2016 6:44 PM
>>> To: Sumit Saxena; linux-scsi@vger.kernel.org
>>> Cc: martin.petersen@oracle.com; jejb@linux.vnet.ibm.com;
>>> kashyap.desai@broadcom.com
>>> Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to
>>> firmware
>>>
>>> On 17.10.2016 12:24, Sumit Saxena wrote:
>>>> megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI
>>>> layer without sending it to firmware as firmware takes care of flushing
>> cache.
>>>> This patch will change the driver behavior wrt SYNCHRONIZE_CACHE
>> handling.
>>>> If underlying firmware has support to handle the SYNCHORNIZE_CACHE,
>>>> driver will send it for firmware otherwise complete it back to SCSI
>>>> layer with SUCCESS immediately.
>>>> If  Firmware  handle SYNCHORNIZE_CACHE for both VD and JBOD
>>>> "canHandleSyncCache" bit in scratch pad register(offset 0x00B4) will
>>>> be set.
>>>>
>>>> This behavior can be controlled via module parameter and user can
>>>> fallback to old behavior of returning SYNCHRONIZE_CACHE by driver only
>>>> without sending it to firmware.
>>>>
>>>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
>>>> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
>>>> ---
>>>>   drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
>>>>   drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++--------
>>>>   drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
>>>> drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
>>>>   4 files changed, 14 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>>>> b/drivers/scsi/megaraid/megaraid_sas.h
>>>> index ca86c88..43fd14f 100644
>>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>>>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>>>>   #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
>>>>   #define MR_MAX_MSIX_REG_ARRAY                   16
>>>>   #define MR_RDPQ_MODE_OFFSET			0X00800000
>>>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X01000000
>>>> +
>>>>   /*
>>>>   * register set for both 1068 and 1078 controllers
>>>>   * structure extended for 1078 registers @@ -2140,6 +2142,7 @@ struct
>>>> megasas_instance {
>>>>   	u8 is_imr;
>>>>   	u8 is_rdpq;
>>>>   	bool dev_handle;
>>>> +	bool fw_sync_cache_support;
>>>>   };
>>>>   struct MR_LD_VF_MAP {
>>>>   	u32 size;
>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> index 092387f..a4a8e2f 100644
>>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
>>>> MEGASAS_DEFAULT_CMD_TIMEOUT;  module_param(scmd_timeout, int,
>>>> S_IRUGO);  MODULE_PARM_DESC(scmd_timeout, "scsi command timeout
>>>> (10-90s), default 90s. See megasas_reset_timer.");
>>>>
>>>> +bool block_sync_cache;
>>>> +module_param(block_sync_cache, bool, S_IRUGO);
>>>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver
>>>> +Default: 0(Send it to firmware)");
>>>> +
>>>>   MODULE_LICENSE("GPL");
>>>>   MODULE_VERSION(MEGASAS_VERSION);
>>>>   MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
>>>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host
>>> *shost, struct scsi_cmnd *scmd)
>>>>   		goto out_done;
>>>>   	}
>>>>
>>>> -	switch (scmd->cmnd[0]) {
>>>> -	case SYNCHRONIZE_CACHE:
>>>> -		/*
>>>> -		 * FW takes care of flush cache on its own
>>>> -		 * No need to send it down
>>>> -		 */
>>>> +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
>>>> +		(!instance->fw_sync_cache_support)) {
>>> Maybe I'm wrong, but isn't the logic inverted?
>>> when fw_sync_cache_support is true  it means that there is a flash cache
>> or a
>>> battery and we don't have to send the command down ?
>>>
>> If "instance->fw_sync_cache_support" is set to true, it means driver can
>> send SYNCHRONIZE_CACHE to firmware(firmware has infrastructure to support
>> SYNCHRONIZE_CACHE).
>> If "instance->fw_sync_cache_support" is set to false, it means FW does not
>> support to handle SYNCHRONIZE_CACHE and FW will flush cache during
>> shutdown.
> Thanks, in that case it is correct.
>
>>>> +	if (!block_sync_cache)
>>>> +		instance->fw_sync_cache_support = (scratch_pad_2 &
>>>> +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 : 0;
> Please add a warning or log the state of the synchronise cache command
> on the controller.

Any logging should be limited to once per device - say when the device is 
opened. Note that synchronize_cache commands are extremely common, we don't want 
to fill the log with this.

thanks!

Ric

>
>
>>>>   	IOCInitMessage =
>>>>   	  dma_alloc_coherent(&instance->pdev->dev,
>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>> index e3bee04..2857154 100644
>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>> @@ -72,6 +72,8 @@
>>>>   #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET   (0x0000030C)
>>>>   #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x0000006C)
>>>>
>>>> +extern bool block_sync_cache;
>>>> +
>>>>   /*
>>>>    * Raid context flags
>>>>    */


^ permalink raw reply	[flat|nested] 49+ messages in thread

* RE: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-18 13:07         ` Ric Wheeler
@ 2016-10-18 13:22           ` Sumit Saxena
  0 siblings, 0 replies; 49+ messages in thread
From: Sumit Saxena @ 2016-10-18 13:22 UTC (permalink / raw)
  To: Ric Wheeler, Tomas Henzl, linux-scsi; +Cc: martin.petersen, jejb, Kashyap Desai

>-----Original Message-----
>From: Ric Wheeler [mailto:ricwheeler@gmail.com]
>Sent: Tuesday, October 18, 2016 6:38 PM
>To: Tomas Henzl; Sumit Saxena; linux-scsi@vger.kernel.org
>Cc: martin.petersen@oracle.com; jejb@linux.vnet.ibm.com; Kashyap Desai
>Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to
>firmware
>
>On 10/17/2016 09:57 AM, Tomas Henzl wrote:
>> On 17.10.2016 15:28, Sumit Saxena wrote:
>>>> -----Original Message-----
>>>> From: Tomas Henzl [mailto:thenzl@redhat.com]
>>>> Sent: Monday, October 17, 2016 6:44 PM
>>>> To: Sumit Saxena; linux-scsi@vger.kernel.org
>>>> Cc: martin.petersen@oracle.com; jejb@linux.vnet.ibm.com;
>>>> kashyap.desai@broadcom.com
>>>> Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE
>>>> command to firmware
>>>>
>>>> On 17.10.2016 12:24, Sumit Saxena wrote:
>>>>> megaraid_sas driver returns SYNCHRONIZE_CACHE command back to SCSI
>>>>> layer without sending it to firmware as firmware takes care of
>>>>> flushing
>>> cache.
>>>>> This patch will change the driver behavior wrt SYNCHRONIZE_CACHE
>>> handling.
>>>>> If underlying firmware has support to handle the SYNCHORNIZE_CACHE,
>>>>> driver will send it for firmware otherwise complete it back to SCSI
>>>>> layer with SUCCESS immediately.
>>>>> If  Firmware  handle SYNCHORNIZE_CACHE for both VD and JBOD
>>>>> "canHandleSyncCache" bit in scratch pad register(offset 0x00B4)
>>>>> will be set.
>>>>>
>>>>> This behavior can be controlled via module parameter and user can
>>>>> fallback to old behavior of returning SYNCHRONIZE_CACHE by driver
>>>>> only without sending it to firmware.
>>>>>
>>>>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
>>>>> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
>>>>> ---
>>>>>   drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
>>>>>   drivers/scsi/megaraid/megaraid_sas_base.c   | 14 ++++++--------
>>>>>   drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
>>>>> drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
>>>>>   4 files changed, 14 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>>>>> b/drivers/scsi/megaraid/megaraid_sas.h
>>>>> index ca86c88..43fd14f 100644
>>>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>>>>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>>>>>   #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
>>>>>   #define MR_MAX_MSIX_REG_ARRAY                   16
>>>>>   #define MR_RDPQ_MODE_OFFSET			0X00800000
>>>>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0X01000000
>>>>> +
>>>>>   /*
>>>>>   * register set for both 1068 and 1078 controllers
>>>>>   * structure extended for 1078 registers @@ -2140,6 +2142,7 @@
>>>>> struct megasas_instance {
>>>>>   	u8 is_imr;
>>>>>   	u8 is_rdpq;
>>>>>   	bool dev_handle;
>>>>> +	bool fw_sync_cache_support;
>>>>>   };
>>>>>   struct MR_LD_VF_MAP {
>>>>>   	u32 size;
>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>> index 092387f..a4a8e2f 100644
>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
>>>>> MEGASAS_DEFAULT_CMD_TIMEOUT;  module_param(scmd_timeout, int,
>>>>> S_IRUGO);  MODULE_PARM_DESC(scmd_timeout, "scsi command timeout
>>>>> (10-90s), default 90s. See megasas_reset_timer.");
>>>>>
>>>>> +bool block_sync_cache;
>>>>> +module_param(block_sync_cache, bool, S_IRUGO);
>>>>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by driver
>>>>> +Default: 0(Send it to firmware)");
>>>>> +
>>>>>   MODULE_LICENSE("GPL");
>>>>>   MODULE_VERSION(MEGASAS_VERSION);
>>>>>   MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
>>>>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct Scsi_Host
>>>> *shost, struct scsi_cmnd *scmd)
>>>>>   		goto out_done;
>>>>>   	}
>>>>>
>>>>> -	switch (scmd->cmnd[0]) {
>>>>> -	case SYNCHRONIZE_CACHE:
>>>>> -		/*
>>>>> -		 * FW takes care of flush cache on its own
>>>>> -		 * No need to send it down
>>>>> -		 */
>>>>> +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
>>>>> +		(!instance->fw_sync_cache_support)) {
>>>> Maybe I'm wrong, but isn't the logic inverted?
>>>> when fw_sync_cache_support is true  it means that there is a flash
>>>> cache
>>> or a
>>>> battery and we don't have to send the command down ?
>>>>
>>> If "instance->fw_sync_cache_support" is set to true, it means driver
>>> can send SYNCHRONIZE_CACHE to firmware(firmware has infrastructure to
>>> support SYNCHRONIZE_CACHE).
>>> If "instance->fw_sync_cache_support" is set to false, it means FW
>>> does not support to handle SYNCHRONIZE_CACHE and FW will flush cache
>>> during shutdown.
>> Thanks, in that case it is correct.
>>
>>>>> +	if (!block_sync_cache)
>>>>> +		instance->fw_sync_cache_support = (scratch_pad_2 &
>>>>> +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET) ? 1 : 0;
>> Please add a warning or log the state of the synchronise cache command
>> on the controller.
>
>Any logging should be limited to once per device - say when the device is
>opened.
>Note that synchronize_cache commands are extremely common, we don't want
>to fill the log with this.

This print is not per device but per MegaRAID controller. This print will
state the controller firmware's capability whether
SYNCHRONIZE_CACHE coming from OS for RAID volumes can be handled by firmware
or not.

For non RAID(JBODs), SYNCHRONIZE_CACHE will be always sent to all MegaRAID
firmware irrespective of firmware capability.

Thanks,
Sumit
>
>thanks!
>
>Ric
>
>>
>>
>>>>>   	IOCInitMessage =
>>>>>   	  dma_alloc_coherent(&instance->pdev->dev,
>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>> index e3bee04..2857154 100644
>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>> @@ -72,6 +72,8 @@
>>>>>   #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET   (0x0000030C)
>>>>>   #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x0000006C)
>>>>>
>>>>> +extern bool block_sync_cache;
>>>>> +
>>>>>   /*
>>>>>    * Raid context flags
>>>>>    */

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 17:51               ` James Bottomley
@ 2016-10-18 13:56                 ` Tomas Henzl
  2016-10-19  9:50                   ` Ching Huang
  2016-10-18 18:47                 ` Sumit Saxena
  1 sibling, 1 reply; 49+ messages in thread
From: Tomas Henzl @ 2016-10-18 13:56 UTC (permalink / raw)
  To: James Bottomley, Kashyap Desai, Ric Wheeler, Hannes Reinecke,
	Sumit Saxena, linux-scsi
  Cc: martin.petersen, Christoph Hellwig, Martin K. Petersen,
	Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe,
	黃清隆,
	RaghavaAditya.Renukunta

Hi,

similar suspicious code path can be found in the queuecommand functions in other drivers too
these are -
pmcraid.c
arcmsr_hba.c
cc-ing maintainers - 
(but both drivers seem to be unmaintained for a while -
I've added Ching for arcmsr and Raghava for pmcraid)

please read this thread and verify that your driver and firmware is safe.

It is expected that a raid card controls the disk write cache (switches it off)
but this might not be true for all modes of operation a raid adapter handles
- pass through/non-RAID etc. In such case when disk write cache is enabled
and your driver skips the SYNCHRONIZE_CACHE command a FS corruption
is very much possible during unexpected power loss or even a clean shutdown.

tomash

On 17.10.2016 19:51, James Bottomley wrote:
> On Mon, 2016-10-17 at 23:06 +0530, Kashyap Desai wrote:
>>> -----Original Message-----
>>> From: James Bottomley [mailto:jejb@linux.vnet.ibm.com]
>>> Sent: Monday, October 17, 2016 10:50 PM
>>> To: Ric Wheeler; Hannes Reinecke; Sumit Saxena; 
>>> linux-scsi@vger.kernel.org
>>> Cc: martin.petersen@oracle.com; thenzl@redhat.com;
>>> kashyap.desai@broadcom.com; Christoph Hellwig; Martin K. Petersen;
>>> Jeff
>>> Moyer; Gris Ge; Ewan Milne; Jens Axboe
>>> Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE
>>> command
>>> to firmware
>>>
>>> On Mon, 2016-10-17 at 12:27 -0400, Ric Wheeler wrote:
>>>> On 10/17/2016 12:20 PM, James Bottomley wrote:
>>>>> On Mon, 2016-10-17 at 09:01 -0400, Ric Wheeler wrote:
>>>>>> On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
>>>>>>> On 10/17/2016 12:24 PM, Sumit Saxena wrote:
>>>>>>>> megaraid_sas driver returns SYNCHRONIZE_CACHE command
>>>>>>>> back to
>>>>>>>> SCSI layer without sending it to firmware as firmware
>>>>>>>> takes
>>>>>>>> care of flushing cache.  This patch will change the
>>>>>>>> driver
>>>>>>>> behavior wrt SYNCHRONIZE_CACHE handling. If underlying
>>>>>>>> firmware has support to handle the SYNCHORNIZE_CACHE,
>>>>>>>> driver
>>>>>>>> will send it for firmware otherwise complete it back to
>>>>>>>> SCSI
>>>>>>>> layer with SUCCESS immediately.  If  Firmware  handle
>>>>>>>> SYNCHORNIZE_CACHE for both VD and JBOD
>>>>>>>> "canHandleSyncCache"
>>>>>>>> bit in scratch pad register(offset
>>>>>>>> 0x00B4) will be set.
>>>>>>>>
>>>>>>>> This behavior can be controlled via module parameter and
>>>>>>>> user
>>>>>>>> can fallback to old behavior of returning
>>>>>>>> SYNCHRONIZE_CACHE by
>>>>>>>> driver only without sending it to firmware.
>>>>>>>>
>>>>>>>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
>>>>>>>> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
>>>>>>>> ---
>>>>>>>>    drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
>>>>>>>>    drivers/scsi/megaraid/megaraid_sas_base.c   | 14
>>>>>>>> ++++++---
>>>>>>>> -----
>>>>>>>>    drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
>>>>>>>>    drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
>>>>>>>>    4 files changed, 14 insertions(+), 8 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>> index ca86c88..43fd14f 100644
>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>>>>>>>>    #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
>>>>>>>>    #define MR_MAX_MSIX_REG_ARRAY                   16
>>>>>>>>    #define MR_RDPQ_MODE_OFFSET			0X0
>>>>>>>> 0800
>>>>>>>> 000
>>>>>>>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0
>>>>>>>> X010
>>>>>>>> 0000
>>>>>>>> 0
>>>>>>>> +
>>>>>>>>    /*
>>>>>>>>    * register set for both 1068 and 1078 controllers
>>>>>>>>    * structure extended for 1078 registers @@ -2140,6
>>>>>>>> +2142,7
>>>>>>>> @@ struct megasas_instance {
>>>>>>>>    	u8 is_imr;
>>>>>>>>    	u8 is_rdpq;
>>>>>>>>    	bool dev_handle;
>>>>>>>> +	bool fw_sync_cache_support;
>>>>>>>>    };
>>>>>>>>    struct MR_LD_VF_MAP {
>>>>>>>>    	u32 size;
>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>> index 092387f..a4a8e2f 100644
>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
>>>>>>>> MEGASAS_DEFAULT_CMD_TIMEOUT;
>>>>>>>>    module_param(scmd_timeout, int, S_IRUGO);
>>>>>>>>    MODULE_PARM_DESC(scmd_timeout, "scsi command timeout
>>>>>>>> (10
>>>>>>>> -90s), default 90s. See megasas_reset_timer.");
>>>>>>>>
>>>>>>>> +bool block_sync_cache;
>>>>>>>> +module_param(block_sync_cache, bool, S_IRUGO);
>>>>>>>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by
>>>>>>>> driver
>>>>>>>> Default: 0(Send it to firmware)");
>>>>>>>> +
>>>>>>>>    MODULE_LICENSE("GPL");
>>>>>>>>    MODULE_VERSION(MEGASAS_VERSION);
>>>>>>>>    MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
>>>>>>>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct
>>> Scsi_Host
>>>>>>>> *shost, struct scsi_cmnd *scmd)
>>>>>>>>    		goto out_done;
>>>>>>>>    	}
>>>>>>>>
>>>>>>>> -	switch (scmd->cmnd[0]) {
>>>>>>>> -	case SYNCHRONIZE_CACHE:
>>>>>>>> -		/*
>>>>>>>> -		 * FW takes care of flush cache on its
>>>>>>>> own
>>>>>>>> -		 * No need to send it down
>>>>>>>> -		 */
>>>>>>>> +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
>>>>>>>> +		(!instance->fw_sync_cache_support)) {
>>>>>>>>    		scmd->result = DID_OK << 16;
>>>>>>>>    		goto out_done;
>>>>>>>> -	default:
>>>>>>>> -		break;
>>>>>>>>    	}
>>>>>>>>
>>>>>>>>    	return instance->instancet
>>>>>>>> ->build_and_issue_cmd(instance, scmd);
>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>> index 2159f6a..8237580 100644
>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>> @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct
>>>>>>>> megasas_instance *instance)
>>>>>>>>    		ret = 1;
>>>>>>>>    		goto fail_fw_init;
>>>>>>>>    	}
>>>>>>>> +	if (!block_sync_cache)
>>>>>>>> +		instance->fw_sync_cache_support =
>>>>>>>> (scratch_pad_2
>>>>>>>> &
>>>>>>>> +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET)
>>>>>>>> ? 1
>>>>>>>> :
>>>>>>>> 0;
>>>>>>>>
>>>>>>>>    	IOCInitMessage =
>>>>>>>>    	  dma_alloc_coherent(&instance->pdev->dev,
>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>> index e3bee04..2857154 100644
>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>> @@ -72,6 +72,8 @@
>>>>>>>>    #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET
>>>>>>>>  (0x0000030C)
>>>>>>>>    #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x00
>>>>>>>> 0000
>>>>>>>> 6C)
>>>>>>>>
>>>>>>>> +extern bool block_sync_cache;
>>>>>>>> +
>>>>>>>>    /*
>>>>>>>>     * Raid context flags
>>>>>>>>     */
>>>>>>>>
>>>>>>> Be extra careful with that.
>>>>>>>
>>>>>>> SYNCHRONIZE_CACHE has (potentially) a global scope, as
>>>>>>> there
>>>>>>> might be an array-wide cache, and a cache flush would
>>>>>>> affect the
>>>>>>> entire cache. Linux is using SYNCHRONIZE_CACHE as a per
>>>>>>> device
>>>>>>> setting, ie it assumes that the effects of a cache flush is
>>>>>>> restricted to the device in question.
>>>>>>>
>>>>>>> If this is _not_ the case I'd rather not enable it.
>>>>>>> Have you checked that enabling this functionality doesn't
>>>>>>> have
>>>>>>> negative performance impact?
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Hannes
>>>>>> This must go in - without this fix, there is no data
>>>>>> integrity for
>>>>>> any file system.
>>>>> That's not what I get from the change log.  What it says to me
>>>>> is
>>>>> that the caches are currently firmware managed.  Barring
>>>>> firmware
>>>>> bugs, that means that we currently don't have any integrity
>>>>> issues.
>>>> Your understanding (or the change log) is incorrect.
>>> There's no current way in English of construing "as firmware takes
>>> care of
>>> flushing cache" to mean its inverse, so the changelog needs
>>> updating to
>>> explain
>>> that firmware does *not* take care of cache flushing, so
>>> effectively
>>> nothing
>>> does and we'll need a cc to stable if the problem is that nothing
>>> takes
>>> care of
>>> flushing the drive caches.
>>>
>>> James
>> Sorry for confusion. More accurate to say  -
>>
>> Current megaraid_sas driver returns SYNCHRONIZE_CACHE command back to 
>> SCSI layer without sending it down to firmware as firmware supposed 
>> to takes care of flushing cache for all Virtual Disk at the time of 
>> system reboot/shutdown. Because of above FW rule driver coded wrongly 
>> and it added bug that SYNCHRONIZE_CACHE is not passed to FW for JBOD 
>> as well.  (MR Firmware wanted OS to manage SYNCHRONIZE_CACHE and pass 
>> the same to the Firmware. ). We figure out that even for VD, why 
>> driver should send back SYNC_CACHE command...let's have the same in
>> FW (giving all control in FW)
>>
>> New behavior described -
>>
>> IF application requests SYNCH_CACHE
>>  IF any FW which supports new API bit called canHandleSyncCache
>>   Driver sends SYNCH_CACHE command to the FW
>>   IF 'VirtualDisk'
>>       FW does not send SYNCH_CACHE to drives
>>       FW returns SUCCESS
>> ELSE IF 'JBOD'
>>      FW sends SYNCH_CACHE to drive
>>      FW obtains status from drive and returns same status back to
>> driver
>>   ENDIF
>>
>> Fixing this is robust if FW and driver changes are inline. See below 
>> for more detail.
>>
>> Case - 1
>> This patch is attempt to fix one level problem where Virtual Disks 
>> are not managed correctly in MR FW. There are some MR FW (E.a Gen2 
>> and Gen2.5 FW), which set WCE bit for Virtual Disk but FW is not 
>> reply correct for SYNCH_CACHE.  This was handled in past and carry 
>> forward (in driver + FW ) to all new generation products as well. We 
>> tried to collect all possible details why it was handled such way to 
>> provide better driver fix for this issue(mainly to avoid this FW 
>> checks and module  parameters etc..). But now it looks like to make 
>> generic fix (Device ID based etc..)is even risky, so went with this 
>> protective approach.  It is very risky to find out which Device ID 
>> and FW has capability to manage SYNC_CACHE, so we managed to figure 
>> out which are the new FW using FW capability bit.
>>
>> E.g Liberator/Thunderbolt MR FW SYNC_CACHE is write Unsupported 
>> command (this is a firmware bug) and immediately it will be failed to 
>> SML. This will cause File system to go Read-only.
> That's a better description ... what you're saying is that the patch
> doesn't actually fix the bug Ric is worrying about.
>
>> Case -2
>> One more thing which we are trying and known driver can send 
>> "SYNC_CACHE" unconditionally to all generation of FW.  For this we 
>> are doing more unit testing on various controllers/FW and planning to 
>> provide another patch to fix JBOD path for any FW. (It will not be 
>> based on FW capability/module parameter).
> OK, but we really need this ASAP.  As Ric said, data integrity is at
> risk until this is fixed.  Can you also reference the commit that
> introduced the problem so we know how far to do the sable backports?
>
>> If we remove module parameter, we can ask customer to disable WCE on
>> drive to get similar impact.
> Thanks,
>
> James
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



^ permalink raw reply	[flat|nested] 49+ messages in thread

* RE: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-17 17:51               ` James Bottomley
  2016-10-18 13:56                 ` Tomas Henzl
@ 2016-10-18 18:47                 ` Sumit Saxena
  1 sibling, 0 replies; 49+ messages in thread
From: Sumit Saxena @ 2016-10-18 18:47 UTC (permalink / raw)
  To: James Bottomley, Kashyap Desai, Ric Wheeler, Hannes Reinecke, linux-scsi
  Cc: martin.petersen, thenzl, Christoph Hellwig, Martin K. Petersen,
	Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe

>-----Original Message-----
>From: James Bottomley [mailto:jejb@linux.vnet.ibm.com]
>Sent: Monday, October 17, 2016 11:22 PM
>To: Kashyap Desai; Ric Wheeler; Hannes Reinecke; Sumit Saxena; linux-
>scsi@vger.kernel.org
>Cc: martin.petersen@oracle.com; thenzl@redhat.com; Christoph Hellwig;
>Martin
>K. Petersen; Jeff Moyer; Gris Ge; Ewan Milne; Jens Axboe
>Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to
>firmware
>
>On Mon, 2016-10-17 at 23:06 +0530, Kashyap Desai wrote:
>> > -----Original Message-----
>> > From: James Bottomley [mailto:jejb@linux.vnet.ibm.com]
>> > Sent: Monday, October 17, 2016 10:50 PM
>> > To: Ric Wheeler; Hannes Reinecke; Sumit Saxena;
>> > linux-scsi@vger.kernel.org
>> > Cc: martin.petersen@oracle.com; thenzl@redhat.com;
>> > kashyap.desai@broadcom.com; Christoph Hellwig; Martin K. Petersen;
>> > Jeff Moyer; Gris Ge; Ewan Milne; Jens Axboe
>> > Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE
>> > command to firmware
>> >
>> > On Mon, 2016-10-17 at 12:27 -0400, Ric Wheeler wrote:
>> > > On 10/17/2016 12:20 PM, James Bottomley wrote:
>> > > > On Mon, 2016-10-17 at 09:01 -0400, Ric Wheeler wrote:
>> > > > > On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
>> > > > > > On 10/17/2016 12:24 PM, Sumit Saxena wrote:
>> > > > > > > megaraid_sas driver returns SYNCHRONIZE_CACHE command back
>> > > > > > > to SCSI layer without sending it to firmware as firmware
>> > > > > > > takes care of flushing cache.  This patch will change the
>> > > > > > > driver behavior wrt SYNCHRONIZE_CACHE handling. If
>> > > > > > > underlying firmware has support to handle the
>> > > > > > > SYNCHORNIZE_CACHE, driver will send it for firmware
>> > > > > > > otherwise complete it back to SCSI layer with SUCCESS
>> > > > > > > immediately.  If  Firmware  handle SYNCHORNIZE_CACHE for
>> > > > > > > both VD and JBOD "canHandleSyncCache"
>> > > > > > > bit in scratch pad register(offset
>> > > > > > > 0x00B4) will be set.
>> > > > > > >
>> > > > > > > This behavior can be controlled via module parameter and
>> > > > > > > user can fallback to old behavior of returning
>> > > > > > > SYNCHRONIZE_CACHE by driver only without sending it to
>> > > > > > > firmware.
>> > > > > > >
>> > > > > > > Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
>> > > > > > > Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
>> > > > > > > ---
>> > > > > > >    drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
>> > > > > > >    drivers/scsi/megaraid/megaraid_sas_base.c   | 14
>> > > > > > > ++++++---
>> > > > > > > -----
>> > > > > > >    drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
>> > > > > > >    drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
>> > > > > > >    4 files changed, 14 insertions(+), 8 deletions(-)
>> > > > > > >
>> > > > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>> > > > > > > b/drivers/scsi/megaraid/megaraid_sas.h
>> > > > > > > index ca86c88..43fd14f 100644
>> > > > > > > --- a/drivers/scsi/megaraid/megaraid_sas.h
>> > > > > > > +++ b/drivers/scsi/megaraid/megaraid_sas.h
>> > > > > > > @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>> > > > > > >    #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
>> > > > > > >    #define MR_MAX_MSIX_REG_ARRAY                   16
>> > > > > > >    #define MR_RDPQ_MODE_OFFSET			0X0
>> > > > > > > 0800
>> > > > > > > 000
>> > > > > > > +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0
>> > > > > > > X010
>> > > > > > > 0000
>> > > > > > > 0
>> > > > > > > +
>> > > > > > >    /*
>> > > > > > >    * register set for both 1068 and 1078 controllers
>> > > > > > >    * structure extended for 1078 registers @@ -2140,6
>> > > > > > > +2142,7
>> > > > > > > @@ struct megasas_instance {
>> > > > > > >    	u8 is_imr;
>> > > > > > >    	u8 is_rdpq;
>> > > > > > >    	bool dev_handle;
>> > > > > > > +	bool fw_sync_cache_support;
>> > > > > > >    };
>> > > > > > >    struct MR_LD_VF_MAP {
>> > > > > > >    	u32 size;
>> > > > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>> > > > > > > b/drivers/scsi/megaraid/megaraid_sas_base.c
>> > > > > > > index 092387f..a4a8e2f 100644
>> > > > > > > --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>> > > > > > > +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>> > > > > > > @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
>> > > > > > > MEGASAS_DEFAULT_CMD_TIMEOUT;
>> > > > > > >    module_param(scmd_timeout, int, S_IRUGO);
>> > > > > > >    MODULE_PARM_DESC(scmd_timeout, "scsi command timeout
>> > > > > > > (10
>> > > > > > > -90s), default 90s. See megasas_reset_timer.");
>> > > > > > >
>> > > > > > > +bool block_sync_cache;
>> > > > > > > +module_param(block_sync_cache, bool, S_IRUGO);
>> > > > > > > +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by
>> > > > > > > driver
>> > > > > > > Default: 0(Send it to firmware)");
>> > > > > > > +
>> > > > > > >    MODULE_LICENSE("GPL");
>> > > > > > >    MODULE_VERSION(MEGASAS_VERSION);
>> > > > > > >    MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
>> > > > > > > @@ -1700,16 +1704,10 @@ megasas_queue_command(struct
>> > Scsi_Host
>> > > > > > > *shost, struct scsi_cmnd *scmd)
>> > > > > > >    		goto out_done;
>> > > > > > >    	}
>> > > > > > >
>> > > > > > > -	switch (scmd->cmnd[0]) {
>> > > > > > > -	case SYNCHRONIZE_CACHE:
>> > > > > > > -		/*
>> > > > > > > -		 * FW takes care of flush cache on its
>> > > > > > > own
>> > > > > > > -		 * No need to send it down
>> > > > > > > -		 */
>> > > > > > > +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
>> > > > > > > +		(!instance->fw_sync_cache_support)) {
>> > > > > > >    		scmd->result = DID_OK << 16;
>> > > > > > >    		goto out_done;
>> > > > > > > -	default:
>> > > > > > > -		break;
>> > > > > > >    	}
>> > > > > > >
>> > > > > > >    	return instance->instancet
>> > > > > > > ->build_and_issue_cmd(instance, scmd);
>> > > > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> > > > > > > b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> > > > > > > index 2159f6a..8237580 100644
>> > > > > > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> > > > > > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>> > > > > > > @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct
>> > > > > > > megasas_instance *instance)
>> > > > > > >    		ret = 1;
>> > > > > > >    		goto fail_fw_init;
>> > > > > > >    	}
>> > > > > > > +	if (!block_sync_cache)
>> > > > > > > +		instance->fw_sync_cache_support =
>> > > > > > > (scratch_pad_2
>> > > > > > > &
>> > > > > > > +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET)
>> > > > > > > ? 1
>> > > > > > > :
>> > > > > > > 0;
>> > > > > > >
>> > > > > > >    	IOCInitMessage =
>> > > > > > >    	  dma_alloc_coherent(&instance->pdev->dev,
>> > > > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>> > > > > > > b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>> > > > > > > index e3bee04..2857154 100644
>> > > > > > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>> > > > > > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>> > > > > > > @@ -72,6 +72,8 @@
>> > > > > > >    #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET
>> > > > > > >  (0x0000030C)
>> > > > > > >    #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x00
>> > > > > > > 0000
>> > > > > > > 6C)
>> > > > > > >
>> > > > > > > +extern bool block_sync_cache;
>> > > > > > > +
>> > > > > > >    /*
>> > > > > > >     * Raid context flags
>> > > > > > >     */
>> > > > > > >
>> > > > > > Be extra careful with that.
>> > > > > >
>> > > > > > SYNCHRONIZE_CACHE has (potentially) a global scope, as there
>> > > > > > might be an array-wide cache, and a cache flush would affect
>> > > > > > the entire cache. Linux is using SYNCHRONIZE_CACHE as a per
>> > > > > > device setting, ie it assumes that the effects of a cache
>> > > > > > flush is restricted to the device in question.
>> > > > > >
>> > > > > > If this is _not_ the case I'd rather not enable it.
>> > > > > > Have you checked that enabling this functionality doesn't
>> > > > > > have negative performance impact?
>> > > > > >
>> > > > > > Cheers,
>> > > > > >
>> > > > > > Hannes
>> > > > > This must go in - without this fix, there is no data integrity
>> > > > > for any file system.
>> > > > That's not what I get from the change log.  What it says to me
>> > > > is that the caches are currently firmware managed.  Barring
>> > > > firmware bugs, that means that we currently don't have any
>> > > > integrity issues.
>> > >
>> > > Your understanding (or the change log) is incorrect.
>> >
>> > There's no current way in English of construing "as firmware takes
>> > care of flushing cache" to mean its inverse, so the changelog needs
>> > updating to explain that firmware does *not* take care of cache
>> > flushing, so effectively nothing does and we'll need a cc to stable
>> > if the problem is that nothing takes care of flushing the drive
>> > caches.
>> >
>> > James
>>
>> Sorry for confusion. More accurate to say  -
>>
>> Current megaraid_sas driver returns SYNCHRONIZE_CACHE command back to
>> SCSI layer without sending it down to firmware as firmware supposed to
>> takes care of flushing cache for all Virtual Disk at the time of
>> system reboot/shutdown. Because of above FW rule driver coded wrongly
>> and it added bug that SYNCHRONIZE_CACHE is not passed to FW for JBOD
>> as well.  (MR Firmware wanted OS to manage SYNCHRONIZE_CACHE and pass
>> the same to the Firmware. ). We figure out that even for VD, why
>> driver should send back SYNC_CACHE command...let's have the same in FW
>> (giving all control in FW)
>>
>> New behavior described -
>>
>> IF application requests SYNCH_CACHE
>>  IF any FW which supports new API bit called canHandleSyncCache
>>   Driver sends SYNCH_CACHE command to the FW
>>   IF 'VirtualDisk'
>>       FW does not send SYNCH_CACHE to drives
>>       FW returns SUCCESS
>> ELSE IF 'JBOD'
>>      FW sends SYNCH_CACHE to drive
>>      FW obtains status from drive and returns same status back to
>> driver
>>   ENDIF
>>
>> Fixing this is robust if FW and driver changes are inline. See below
>> for more detail.
>>
>> Case - 1
>> This patch is attempt to fix one level problem where Virtual Disks are
>> not managed correctly in MR FW. There are some MR FW (E.a Gen2 and
>> Gen2.5 FW), which set WCE bit for Virtual Disk but FW is not reply
>> correct for SYNCH_CACHE.  This was handled in past and carry forward
>> (in driver + FW ) to all new generation products as well. We tried to
>> collect all possible details why it was handled such way to provide
>> better driver fix for this issue(mainly to avoid this FW checks and
>> module  parameters etc..). But now it looks like to make generic fix
>> (Device ID based etc..)is even risky, so went with this protective
>> approach.  It is very risky to find out which Device ID and FW has
>> capability to manage SYNC_CACHE, so we managed to figure out which are
>> the new FW using FW capability bit.
>>
>> E.g Liberator/Thunderbolt MR FW SYNC_CACHE is write Unsupported
>> command (this is a firmware bug) and immediately it will be failed to
>> SML. This will cause File system to go Read-only.
>
>That's a better description ... what you're saying is that the patch
>doesn't actually
>fix the bug Ric is worrying about.
>
>> Case -2
>> One more thing which we are trying and known driver can send
>> "SYNC_CACHE" unconditionally to all generation of FW.  For this we are
>> doing more unit testing on various controllers/FW and planning to
>> provide another patch to fix JBOD path for any FW. (It will not be
>> based on FW capability/module parameter).
>
>OK, but we really need this ASAP.  As Ric said, data integrity is at risk
>until this is
>fixed.  Can you also reference the commit that introduced the problem so we
>know how far to do the sable backports?


Commit- " 02b01e0 [SCSI] megaraid_sas: return sync cache call with success"
added the code
in driver to return SYNCHRONIZE_CACHE without sending it to firmware long
back in 2007.
This was then added for RAID volumes as then supported controller firmwares
did not have
support for SYNCHRONIZE_CACHE. This was mistakenly carried forward for non
RAID(JBODs)
which was wrong. I will be sending a consolidated patch which will address
all issues pertaining to SYNCHRONIZE_CACHE for RAID volumes and non
RAID(JBOD) disks.

>
>> If we remove module parameter, we can ask customer to disable WCE on
>> drive to get similar impact.
>
>Thanks,
>
>James

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-18 13:56                 ` Tomas Henzl
@ 2016-10-19  9:50                   ` Ching Huang
  2016-10-19 12:55                     ` Tomas Henzl
  0 siblings, 1 reply; 49+ messages in thread
From: Ching Huang @ 2016-10-19  9:50 UTC (permalink / raw)
  To: Tomas Henzl
  Cc: James Bottomley, Kashyap Desai, Ric Wheeler, Hannes Reinecke,
	Sumit Saxena, linux-scsi, martin.petersen, Christoph Hellwig,
	Martin K. Petersen, Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe,
	RaghavaAditya.Renukunta

Hi Tomas,

SCSI command checking in queuecommand function of arcmsr can be removed safely. 
Now driver can pass all scsi command to controller firmware.

diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
--- a/drivers/scsi/arcmsr/arcmsr_hba.c	2016-10-19 16:18:45.000000000 +0800
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c	2016-10-19 16:31:59.665524699 +0800
@@ -2623,13 +2623,6 @@ static int arcmsr_queue_command_lck(stru
 	cmd->scsi_done = done;
 	cmd->host_scribble = NULL;
 	cmd->result = 0;
-	if ((scsicmd == SYNCHRONIZE_CACHE) ||(scsicmd == SEND_DIAGNOSTIC)){
-		if(acb->devstate[target][lun] == ARECA_RAID_GONE) {
-    			cmd->result = (DID_NO_CONNECT << 16);
-		}
-		cmd->scsi_done(cmd);
-		return 0;
-	}
 	if (target == 16) {
 		/* virtual device for iop message transfer */
 		arcmsr_handle_virtual_command(acb, cmd);

Thanks
Ching
On Tue, 2016-10-18 at 15:56 +0200, Tomas Henzl wrote:
> Hi,
> 
> similar suspicious code path can be found in the queuecommand functions in other drivers too
> these are -
> pmcraid.c
> arcmsr_hba.c
> cc-ing maintainers - 
> (but both drivers seem to be unmaintained for a while -
> I've added Ching for arcmsr and Raghava for pmcraid)
> 
> please read this thread and verify that your driver and firmware is safe.
> 
> It is expected that a raid card controls the disk write cache (switches it off)
> but this might not be true for all modes of operation a raid adapter handles
> - pass through/non-RAID etc. In such case when disk write cache is enabled
> and your driver skips the SYNCHRONIZE_CACHE command a FS corruption
> is very much possible during unexpected power loss or even a clean shutdown.
> 
> tomash
> 
> On 17.10.2016 19:51, James Bottomley wrote:
> > On Mon, 2016-10-17 at 23:06 +0530, Kashyap Desai wrote:
> >>> -----Original Message-----
> >>> From: James Bottomley [mailto:jejb@linux.vnet.ibm.com]
> >>> Sent: Monday, October 17, 2016 10:50 PM
> >>> To: Ric Wheeler; Hannes Reinecke; Sumit Saxena; 
> >>> linux-scsi@vger.kernel.org
> >>> Cc: martin.petersen@oracle.com; thenzl@redhat.com;
> >>> kashyap.desai@broadcom.com; Christoph Hellwig; Martin K. Petersen;
> >>> Jeff
> >>> Moyer; Gris Ge; Ewan Milne; Jens Axboe
> >>> Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE
> >>> command
> >>> to firmware
> >>>
> >>> On Mon, 2016-10-17 at 12:27 -0400, Ric Wheeler wrote:
> >>>> On 10/17/2016 12:20 PM, James Bottomley wrote:
> >>>>> On Mon, 2016-10-17 at 09:01 -0400, Ric Wheeler wrote:
> >>>>>> On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
> >>>>>>> On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> >>>>>>>> megaraid_sas driver returns SYNCHRONIZE_CACHE command
> >>>>>>>> back to
> >>>>>>>> SCSI layer without sending it to firmware as firmware
> >>>>>>>> takes
> >>>>>>>> care of flushing cache.  This patch will change the
> >>>>>>>> driver
> >>>>>>>> behavior wrt SYNCHRONIZE_CACHE handling. If underlying
> >>>>>>>> firmware has support to handle the SYNCHORNIZE_CACHE,
> >>>>>>>> driver
> >>>>>>>> will send it for firmware otherwise complete it back to
> >>>>>>>> SCSI
> >>>>>>>> layer with SUCCESS immediately.  If  Firmware  handle
> >>>>>>>> SYNCHORNIZE_CACHE for both VD and JBOD
> >>>>>>>> "canHandleSyncCache"
> >>>>>>>> bit in scratch pad register(offset
> >>>>>>>> 0x00B4) will be set.
> >>>>>>>>
> >>>>>>>> This behavior can be controlled via module parameter and
> >>>>>>>> user
> >>>>>>>> can fallback to old behavior of returning
> >>>>>>>> SYNCHRONIZE_CACHE by
> >>>>>>>> driver only without sending it to firmware.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> >>>>>>>> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
> >>>>>>>> ---
> >>>>>>>>    drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
> >>>>>>>>    drivers/scsi/megaraid/megaraid_sas_base.c   | 14
> >>>>>>>> ++++++---
> >>>>>>>> -----
> >>>>>>>>    drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
> >>>>>>>>    drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
> >>>>>>>>    4 files changed, 14 insertions(+), 8 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
> >>>>>>>> b/drivers/scsi/megaraid/megaraid_sas.h
> >>>>>>>> index ca86c88..43fd14f 100644
> >>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
> >>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
> >>>>>>>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
> >>>>>>>>    #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
> >>>>>>>>    #define MR_MAX_MSIX_REG_ARRAY                   16
> >>>>>>>>    #define MR_RDPQ_MODE_OFFSET			0X0
> >>>>>>>> 0800
> >>>>>>>> 000
> >>>>>>>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0
> >>>>>>>> X010
> >>>>>>>> 0000
> >>>>>>>> 0
> >>>>>>>> +
> >>>>>>>>    /*
> >>>>>>>>    * register set for both 1068 and 1078 controllers
> >>>>>>>>    * structure extended for 1078 registers @@ -2140,6
> >>>>>>>> +2142,7
> >>>>>>>> @@ struct megasas_instance {
> >>>>>>>>    	u8 is_imr;
> >>>>>>>>    	u8 is_rdpq;
> >>>>>>>>    	bool dev_handle;
> >>>>>>>> +	bool fw_sync_cache_support;
> >>>>>>>>    };
> >>>>>>>>    struct MR_LD_VF_MAP {
> >>>>>>>>    	u32 size;
> >>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
> >>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
> >>>>>>>> index 092387f..a4a8e2f 100644
> >>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> >>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> >>>>>>>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
> >>>>>>>> MEGASAS_DEFAULT_CMD_TIMEOUT;
> >>>>>>>>    module_param(scmd_timeout, int, S_IRUGO);
> >>>>>>>>    MODULE_PARM_DESC(scmd_timeout, "scsi command timeout
> >>>>>>>> (10
> >>>>>>>> -90s), default 90s. See megasas_reset_timer.");
> >>>>>>>>
> >>>>>>>> +bool block_sync_cache;
> >>>>>>>> +module_param(block_sync_cache, bool, S_IRUGO);
> >>>>>>>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by
> >>>>>>>> driver
> >>>>>>>> Default: 0(Send it to firmware)");
> >>>>>>>> +
> >>>>>>>>    MODULE_LICENSE("GPL");
> >>>>>>>>    MODULE_VERSION(MEGASAS_VERSION);
> >>>>>>>>    MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
> >>>>>>>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct
> >>> Scsi_Host
> >>>>>>>> *shost, struct scsi_cmnd *scmd)
> >>>>>>>>    		goto out_done;
> >>>>>>>>    	}
> >>>>>>>>
> >>>>>>>> -	switch (scmd->cmnd[0]) {
> >>>>>>>> -	case SYNCHRONIZE_CACHE:
> >>>>>>>> -		/*
> >>>>>>>> -		 * FW takes care of flush cache on its
> >>>>>>>> own
> >>>>>>>> -		 * No need to send it down
> >>>>>>>> -		 */
> >>>>>>>> +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
> >>>>>>>> +		(!instance->fw_sync_cache_support)) {
> >>>>>>>>    		scmd->result = DID_OK << 16;
> >>>>>>>>    		goto out_done;
> >>>>>>>> -	default:
> >>>>>>>> -		break;
> >>>>>>>>    	}
> >>>>>>>>
> >>>>>>>>    	return instance->instancet
> >>>>>>>> ->build_and_issue_cmd(instance, scmd);
> >>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> >>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> >>>>>>>> index 2159f6a..8237580 100644
> >>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> >>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> >>>>>>>> @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct
> >>>>>>>> megasas_instance *instance)
> >>>>>>>>    		ret = 1;
> >>>>>>>>    		goto fail_fw_init;
> >>>>>>>>    	}
> >>>>>>>> +	if (!block_sync_cache)
> >>>>>>>> +		instance->fw_sync_cache_support =
> >>>>>>>> (scratch_pad_2
> >>>>>>>> &
> >>>>>>>> +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET)
> >>>>>>>> ? 1
> >>>>>>>> :
> >>>>>>>> 0;
> >>>>>>>>
> >>>>>>>>    	IOCInitMessage =
> >>>>>>>>    	  dma_alloc_coherent(&instance->pdev->dev,
> >>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> >>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> >>>>>>>> index e3bee04..2857154 100644
> >>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> >>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> >>>>>>>> @@ -72,6 +72,8 @@
> >>>>>>>>    #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET
> >>>>>>>>  (0x0000030C)
> >>>>>>>>    #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x00
> >>>>>>>> 0000
> >>>>>>>> 6C)
> >>>>>>>>
> >>>>>>>> +extern bool block_sync_cache;
> >>>>>>>> +
> >>>>>>>>    /*
> >>>>>>>>     * Raid context flags
> >>>>>>>>     */
> >>>>>>>>
> >>>>>>> Be extra careful with that.
> >>>>>>>
> >>>>>>> SYNCHRONIZE_CACHE has (potentially) a global scope, as
> >>>>>>> there
> >>>>>>> might be an array-wide cache, and a cache flush would
> >>>>>>> affect the
> >>>>>>> entire cache. Linux is using SYNCHRONIZE_CACHE as a per
> >>>>>>> device
> >>>>>>> setting, ie it assumes that the effects of a cache flush is
> >>>>>>> restricted to the device in question.
> >>>>>>>
> >>>>>>> If this is _not_ the case I'd rather not enable it.
> >>>>>>> Have you checked that enabling this functionality doesn't
> >>>>>>> have
> >>>>>>> negative performance impact?
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>>
> >>>>>>> Hannes
> >>>>>> This must go in - without this fix, there is no data
> >>>>>> integrity for
> >>>>>> any file system.
> >>>>> That's not what I get from the change log.  What it says to me
> >>>>> is
> >>>>> that the caches are currently firmware managed.  Barring
> >>>>> firmware
> >>>>> bugs, that means that we currently don't have any integrity
> >>>>> issues.
> >>>> Your understanding (or the change log) is incorrect.
> >>> There's no current way in English of construing "as firmware takes
> >>> care of
> >>> flushing cache" to mean its inverse, so the changelog needs
> >>> updating to
> >>> explain
> >>> that firmware does *not* take care of cache flushing, so
> >>> effectively
> >>> nothing
> >>> does and we'll need a cc to stable if the problem is that nothing
> >>> takes
> >>> care of
> >>> flushing the drive caches.
> >>>
> >>> James
> >> Sorry for confusion. More accurate to say  -
> >>
> >> Current megaraid_sas driver returns SYNCHRONIZE_CACHE command back to 
> >> SCSI layer without sending it down to firmware as firmware supposed 
> >> to takes care of flushing cache for all Virtual Disk at the time of 
> >> system reboot/shutdown. Because of above FW rule driver coded wrongly 
> >> and it added bug that SYNCHRONIZE_CACHE is not passed to FW for JBOD 
> >> as well.  (MR Firmware wanted OS to manage SYNCHRONIZE_CACHE and pass 
> >> the same to the Firmware. ). We figure out that even for VD, why 
> >> driver should send back SYNC_CACHE command...let's have the same in
> >> FW (giving all control in FW)
> >>
> >> New behavior described -
> >>
> >> IF application requests SYNCH_CACHE
> >>  IF any FW which supports new API bit called canHandleSyncCache
> >>   Driver sends SYNCH_CACHE command to the FW
> >>   IF 'VirtualDisk'
> >>       FW does not send SYNCH_CACHE to drives
> >>       FW returns SUCCESS
> >> ELSE IF 'JBOD'
> >>      FW sends SYNCH_CACHE to drive
> >>      FW obtains status from drive and returns same status back to
> >> driver
> >>   ENDIF
> >>
> >> Fixing this is robust if FW and driver changes are inline. See below 
> >> for more detail.
> >>
> >> Case - 1
> >> This patch is attempt to fix one level problem where Virtual Disks 
> >> are not managed correctly in MR FW. There are some MR FW (E.a Gen2 
> >> and Gen2.5 FW), which set WCE bit for Virtual Disk but FW is not 
> >> reply correct for SYNCH_CACHE.  This was handled in past and carry 
> >> forward (in driver + FW ) to all new generation products as well. We 
> >> tried to collect all possible details why it was handled such way to 
> >> provide better driver fix for this issue(mainly to avoid this FW 
> >> checks and module  parameters etc..). But now it looks like to make 
> >> generic fix (Device ID based etc..)is even risky, so went with this 
> >> protective approach.  It is very risky to find out which Device ID 
> >> and FW has capability to manage SYNC_CACHE, so we managed to figure 
> >> out which are the new FW using FW capability bit.
> >>
> >> E.g Liberator/Thunderbolt MR FW SYNC_CACHE is write Unsupported 
> >> command (this is a firmware bug) and immediately it will be failed to 
> >> SML. This will cause File system to go Read-only.
> > That's a better description ... what you're saying is that the patch
> > doesn't actually fix the bug Ric is worrying about.
> >
> >> Case -2
> >> One more thing which we are trying and known driver can send 
> >> "SYNC_CACHE" unconditionally to all generation of FW.  For this we 
> >> are doing more unit testing on various controllers/FW and planning to 
> >> provide another patch to fix JBOD path for any FW. (It will not be 
> >> based on FW capability/module parameter).
> > OK, but we really need this ASAP.  As Ric said, data integrity is at
> > risk until this is fixed.  Can you also reference the commit that
> > introduced the problem so we know how far to do the sable backports?
> >
> >> If we remove module parameter, we can ask customer to disable WCE on
> >> drive to get similar impact.
> > Thanks,
> >
> > James
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-19  9:50                   ` Ching Huang
@ 2016-10-19 12:55                     ` Tomas Henzl
  2016-10-19 18:07                       ` Raghava Aditya Renukunta
  2016-10-25  2:02                       ` Martin K. Petersen
  0 siblings, 2 replies; 49+ messages in thread
From: Tomas Henzl @ 2016-10-19 12:55 UTC (permalink / raw)
  To: Ching Huang
  Cc: James Bottomley, Kashyap Desai, Ric Wheeler, Hannes Reinecke,
	Sumit Saxena, linux-scsi, martin.petersen, Christoph Hellwig,
	Martin K. Petersen, Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe,
	RaghavaAditya.Renukunta

On 19.10.2016 11:50, Ching Huang wrote:
> Hi Tomas,
>
> SCSI command checking in queuecommand function of arcmsr can be removed safely. 
> Now driver can pass all scsi command to controller firmware.

thanks for your quick reply. Is this (passing the SYNCHRONIZE_CACHE command
to firmware) safe for all arcmsr cards, even the oldest?

Please start with your patch a new thread and add your 'Signed-off-by:' line.

>
> diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
> --- a/drivers/scsi/arcmsr/arcmsr_hba.c	2016-10-19 16:18:45.000000000 +0800
> +++ b/drivers/scsi/arcmsr/arcmsr_hba.c	2016-10-19 16:31:59.665524699 +0800
> @@ -2623,13 +2623,6 @@ static int arcmsr_queue_command_lck(stru
>  	cmd->scsi_done = done;
>  	cmd->host_scribble = NULL;
>  	cmd->result = 0;
> -	if ((scsicmd == SYNCHRONIZE_CACHE) ||(scsicmd == SEND_DIAGNOSTIC)){
> -		if(acb->devstate[target][lun] == ARECA_RAID_GONE) {
> -    			cmd->result = (DID_NO_CONNECT << 16);
> -		}
> -		cmd->scsi_done(cmd);
> -		return 0;
> -	}
>  	if (target == 16) {
>  		/* virtual device for iop message transfer */
>  		arcmsr_handle_virtual_command(acb, cmd);
>
> Thanks
> Ching
> On Tue, 2016-10-18 at 15:56 +0200, Tomas Henzl wrote:
>> Hi,
>>
>> similar suspicious code path can be found in the queuecommand functions in other drivers too
>> these are -
>> pmcraid.c
>> arcmsr_hba.c
>> cc-ing maintainers - 
>> (but both drivers seem to be unmaintained for a while -
>> I've added Ching for arcmsr and Raghava for pmcraid)
>>
>> please read this thread and verify that your driver and firmware is safe.
>>
>> It is expected that a raid card controls the disk write cache (switches it off)
>> but this might not be true for all modes of operation a raid adapter handles
>> - pass through/non-RAID etc. In such case when disk write cache is enabled
>> and your driver skips the SYNCHRONIZE_CACHE command a FS corruption
>> is very much possible during unexpected power loss or even a clean shutdown.
>>
>> tomash
>>
>> On 17.10.2016 19:51, James Bottomley wrote:
>>> On Mon, 2016-10-17 at 23:06 +0530, Kashyap Desai wrote:
>>>>> -----Original Message-----
>>>>> From: James Bottomley [mailto:jejb@linux.vnet.ibm.com]
>>>>> Sent: Monday, October 17, 2016 10:50 PM
>>>>> To: Ric Wheeler; Hannes Reinecke; Sumit Saxena; 
>>>>> linux-scsi@vger.kernel.org
>>>>> Cc: martin.petersen@oracle.com; thenzl@redhat.com;
>>>>> kashyap.desai@broadcom.com; Christoph Hellwig; Martin K. Petersen;
>>>>> Jeff
>>>>> Moyer; Gris Ge; Ewan Milne; Jens Axboe
>>>>> Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE
>>>>> command
>>>>> to firmware
>>>>>
>>>>> On Mon, 2016-10-17 at 12:27 -0400, Ric Wheeler wrote:
>>>>>> On 10/17/2016 12:20 PM, James Bottomley wrote:
>>>>>>> On Mon, 2016-10-17 at 09:01 -0400, Ric Wheeler wrote:
>>>>>>>> On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
>>>>>>>>> On 10/17/2016 12:24 PM, Sumit Saxena wrote:
>>>>>>>>>> megaraid_sas driver returns SYNCHRONIZE_CACHE command
>>>>>>>>>> back to
>>>>>>>>>> SCSI layer without sending it to firmware as firmware
>>>>>>>>>> takes
>>>>>>>>>> care of flushing cache.  This patch will change the
>>>>>>>>>> driver
>>>>>>>>>> behavior wrt SYNCHRONIZE_CACHE handling. If underlying
>>>>>>>>>> firmware has support to handle the SYNCHORNIZE_CACHE,
>>>>>>>>>> driver
>>>>>>>>>> will send it for firmware otherwise complete it back to
>>>>>>>>>> SCSI
>>>>>>>>>> layer with SUCCESS immediately.  If  Firmware  handle
>>>>>>>>>> SYNCHORNIZE_CACHE for both VD and JBOD
>>>>>>>>>> "canHandleSyncCache"
>>>>>>>>>> bit in scratch pad register(offset
>>>>>>>>>> 0x00B4) will be set.
>>>>>>>>>>
>>>>>>>>>> This behavior can be controlled via module parameter and
>>>>>>>>>> user
>>>>>>>>>> can fallback to old behavior of returning
>>>>>>>>>> SYNCHRONIZE_CACHE by
>>>>>>>>>> driver only without sending it to firmware.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
>>>>>>>>>> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
>>>>>>>>>> ---
>>>>>>>>>>    drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
>>>>>>>>>>    drivers/scsi/megaraid/megaraid_sas_base.c   | 14
>>>>>>>>>> ++++++---
>>>>>>>>>> -----
>>>>>>>>>>    drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
>>>>>>>>>>    drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
>>>>>>>>>>    4 files changed, 14 insertions(+), 8 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>>>> index ca86c88..43fd14f 100644
>>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>>>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>>>>>>>>>>    #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
>>>>>>>>>>    #define MR_MAX_MSIX_REG_ARRAY                   16
>>>>>>>>>>    #define MR_RDPQ_MODE_OFFSET			0X0
>>>>>>>>>> 0800
>>>>>>>>>> 000
>>>>>>>>>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET		0
>>>>>>>>>> X010
>>>>>>>>>> 0000
>>>>>>>>>> 0
>>>>>>>>>> +
>>>>>>>>>>    /*
>>>>>>>>>>    * register set for both 1068 and 1078 controllers
>>>>>>>>>>    * structure extended for 1078 registers @@ -2140,6
>>>>>>>>>> +2142,7
>>>>>>>>>> @@ struct megasas_instance {
>>>>>>>>>>    	u8 is_imr;
>>>>>>>>>>    	u8 is_rdpq;
>>>>>>>>>>    	bool dev_handle;
>>>>>>>>>> +	bool fw_sync_cache_support;
>>>>>>>>>>    };
>>>>>>>>>>    struct MR_LD_VF_MAP {
>>>>>>>>>>    	u32 size;
>>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>>>> index 092387f..a4a8e2f 100644
>>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>>>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
>>>>>>>>>> MEGASAS_DEFAULT_CMD_TIMEOUT;
>>>>>>>>>>    module_param(scmd_timeout, int, S_IRUGO);
>>>>>>>>>>    MODULE_PARM_DESC(scmd_timeout, "scsi command timeout
>>>>>>>>>> (10
>>>>>>>>>> -90s), default 90s. See megasas_reset_timer.");
>>>>>>>>>>
>>>>>>>>>> +bool block_sync_cache;
>>>>>>>>>> +module_param(block_sync_cache, bool, S_IRUGO);
>>>>>>>>>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE by
>>>>>>>>>> driver
>>>>>>>>>> Default: 0(Send it to firmware)");
>>>>>>>>>> +
>>>>>>>>>>    MODULE_LICENSE("GPL");
>>>>>>>>>>    MODULE_VERSION(MEGASAS_VERSION);
>>>>>>>>>>    MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
>>>>>>>>>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct
>>>>> Scsi_Host
>>>>>>>>>> *shost, struct scsi_cmnd *scmd)
>>>>>>>>>>    		goto out_done;
>>>>>>>>>>    	}
>>>>>>>>>>
>>>>>>>>>> -	switch (scmd->cmnd[0]) {
>>>>>>>>>> -	case SYNCHRONIZE_CACHE:
>>>>>>>>>> -		/*
>>>>>>>>>> -		 * FW takes care of flush cache on its
>>>>>>>>>> own
>>>>>>>>>> -		 * No need to send it down
>>>>>>>>>> -		 */
>>>>>>>>>> +	if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
>>>>>>>>>> +		(!instance->fw_sync_cache_support)) {
>>>>>>>>>>    		scmd->result = DID_OK << 16;
>>>>>>>>>>    		goto out_done;
>>>>>>>>>> -	default:
>>>>>>>>>> -		break;
>>>>>>>>>>    	}
>>>>>>>>>>
>>>>>>>>>>    	return instance->instancet
>>>>>>>>>> ->build_and_issue_cmd(instance, scmd);
>>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>>>> index 2159f6a..8237580 100644
>>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>>>> @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct
>>>>>>>>>> megasas_instance *instance)
>>>>>>>>>>    		ret = 1;
>>>>>>>>>>    		goto fail_fw_init;
>>>>>>>>>>    	}
>>>>>>>>>> +	if (!block_sync_cache)
>>>>>>>>>> +		instance->fw_sync_cache_support =
>>>>>>>>>> (scratch_pad_2
>>>>>>>>>> &
>>>>>>>>>> +			MR_CAN_HANDLE_SYNC_CACHE_OFFSET)
>>>>>>>>>> ? 1
>>>>>>>>>> :
>>>>>>>>>> 0;
>>>>>>>>>>
>>>>>>>>>>    	IOCInitMessage =
>>>>>>>>>>    	  dma_alloc_coherent(&instance->pdev->dev,
>>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>>>> index e3bee04..2857154 100644
>>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>>>> @@ -72,6 +72,8 @@
>>>>>>>>>>    #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET
>>>>>>>>>>  (0x0000030C)
>>>>>>>>>>    #define MPI2_REPLY_POST_HOST_INDEX_OFFSET	(0x00
>>>>>>>>>> 0000
>>>>>>>>>> 6C)
>>>>>>>>>>
>>>>>>>>>> +extern bool block_sync_cache;
>>>>>>>>>> +
>>>>>>>>>>    /*
>>>>>>>>>>     * Raid context flags
>>>>>>>>>>     */
>>>>>>>>>>
>>>>>>>>> Be extra careful with that.
>>>>>>>>>
>>>>>>>>> SYNCHRONIZE_CACHE has (potentially) a global scope, as
>>>>>>>>> there
>>>>>>>>> might be an array-wide cache, and a cache flush would
>>>>>>>>> affect the
>>>>>>>>> entire cache. Linux is using SYNCHRONIZE_CACHE as a per
>>>>>>>>> device
>>>>>>>>> setting, ie it assumes that the effects of a cache flush is
>>>>>>>>> restricted to the device in question.
>>>>>>>>>
>>>>>>>>> If this is _not_ the case I'd rather not enable it.
>>>>>>>>> Have you checked that enabling this functionality doesn't
>>>>>>>>> have
>>>>>>>>> negative performance impact?
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> Hannes
>>>>>>>> This must go in - without this fix, there is no data
>>>>>>>> integrity for
>>>>>>>> any file system.
>>>>>>> That's not what I get from the change log.  What it says to me
>>>>>>> is
>>>>>>> that the caches are currently firmware managed.  Barring
>>>>>>> firmware
>>>>>>> bugs, that means that we currently don't have any integrity
>>>>>>> issues.
>>>>>> Your understanding (or the change log) is incorrect.
>>>>> There's no current way in English of construing "as firmware takes
>>>>> care of
>>>>> flushing cache" to mean its inverse, so the changelog needs
>>>>> updating to
>>>>> explain
>>>>> that firmware does *not* take care of cache flushing, so
>>>>> effectively
>>>>> nothing
>>>>> does and we'll need a cc to stable if the problem is that nothing
>>>>> takes
>>>>> care of
>>>>> flushing the drive caches.
>>>>>
>>>>> James
>>>> Sorry for confusion. More accurate to say  -
>>>>
>>>> Current megaraid_sas driver returns SYNCHRONIZE_CACHE command back to 
>>>> SCSI layer without sending it down to firmware as firmware supposed 
>>>> to takes care of flushing cache for all Virtual Disk at the time of 
>>>> system reboot/shutdown. Because of above FW rule driver coded wrongly 
>>>> and it added bug that SYNCHRONIZE_CACHE is not passed to FW for JBOD 
>>>> as well.  (MR Firmware wanted OS to manage SYNCHRONIZE_CACHE and pass 
>>>> the same to the Firmware. ). We figure out that even for VD, why 
>>>> driver should send back SYNC_CACHE command...let's have the same in
>>>> FW (giving all control in FW)
>>>>
>>>> New behavior described -
>>>>
>>>> IF application requests SYNCH_CACHE
>>>>  IF any FW which supports new API bit called canHandleSyncCache
>>>>   Driver sends SYNCH_CACHE command to the FW
>>>>   IF 'VirtualDisk'
>>>>       FW does not send SYNCH_CACHE to drives
>>>>       FW returns SUCCESS
>>>> ELSE IF 'JBOD'
>>>>      FW sends SYNCH_CACHE to drive
>>>>      FW obtains status from drive and returns same status back to
>>>> driver
>>>>   ENDIF
>>>>
>>>> Fixing this is robust if FW and driver changes are inline. See below 
>>>> for more detail.
>>>>
>>>> Case - 1
>>>> This patch is attempt to fix one level problem where Virtual Disks 
>>>> are not managed correctly in MR FW. There are some MR FW (E.a Gen2 
>>>> and Gen2.5 FW), which set WCE bit for Virtual Disk but FW is not 
>>>> reply correct for SYNCH_CACHE.  This was handled in past and carry 
>>>> forward (in driver + FW ) to all new generation products as well. We 
>>>> tried to collect all possible details why it was handled such way to 
>>>> provide better driver fix for this issue(mainly to avoid this FW 
>>>> checks and module  parameters etc..). But now it looks like to make 
>>>> generic fix (Device ID based etc..)is even risky, so went with this 
>>>> protective approach.  It is very risky to find out which Device ID 
>>>> and FW has capability to manage SYNC_CACHE, so we managed to figure 
>>>> out which are the new FW using FW capability bit.
>>>>
>>>> E.g Liberator/Thunderbolt MR FW SYNC_CACHE is write Unsupported 
>>>> command (this is a firmware bug) and immediately it will be failed to 
>>>> SML. This will cause File system to go Read-only.
>>> That's a better description ... what you're saying is that the patch
>>> doesn't actually fix the bug Ric is worrying about.
>>>
>>>> Case -2
>>>> One more thing which we are trying and known driver can send 
>>>> "SYNC_CACHE" unconditionally to all generation of FW.  For this we 
>>>> are doing more unit testing on various controllers/FW and planning to 
>>>> provide another patch to fix JBOD path for any FW. (It will not be 
>>>> based on FW capability/module parameter).
>>> OK, but we really need this ASAP.  As Ric said, data integrity is at
>>> risk until this is fixed.  Can you also reference the commit that
>>> introduced the problem so we know how far to do the sable backports?
>>>
>>>> If we remove module parameter, we can ask customer to disable WCE on
>>>> drive to get similar impact.
>>> Thanks,
>>>
>>> James
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>


^ permalink raw reply	[flat|nested] 49+ messages in thread

* RE: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-19 12:55                     ` Tomas Henzl
@ 2016-10-19 18:07                       ` Raghava Aditya Renukunta
  2016-10-21 16:30                         ` Tomas Henzl
  2016-10-25  2:02                       ` Martin K. Petersen
  1 sibling, 1 reply; 49+ messages in thread
From: Raghava Aditya Renukunta @ 2016-10-19 18:07 UTC (permalink / raw)
  To: Tomas Henzl, Ching Huang
  Cc: James Bottomley, Kashyap Desai, Ric Wheeler, Hannes Reinecke,
	Sumit Saxena, linux-scsi, martin.petersen, Christoph Hellwig,
	Martin K. Petersen, Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe

Hi Tomas,

> -----Original Message-----
> From: Tomas Henzl [mailto:thenzl@redhat.com]
> Sent: Wednesday, October 19, 2016 5:56 AM
> To: Ching Huang
> Cc: James Bottomley; Kashyap Desai; Ric Wheeler; Hannes Reinecke; Sumit
> Saxena; linux-scsi@vger.kernel.org; martin.petersen@oracle.com; Christoph
> Hellwig; Martin K. Petersen; Jeff Moyer; Gris Ge; Ewan Milne; Jens Axboe;
> Raghava Aditya Renukunta
> Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE
> command to firmware
> 
> EXTERNAL EMAIL
> 
> 
> On 19.10.2016 11:50, Ching Huang wrote:
> > Hi Tomas,
> >
> > SCSI command checking in queuecommand function of arcmsr can be
> removed safely.
> > Now driver can pass all scsi command to controller firmware.
> 
> thanks for your quick reply. Is this (passing the SYNCHRONIZE_CACHE
> command
> to firmware) safe for all arcmsr cards, even the oldest?
> 
> Please start with your patch a new thread and add your 'Signed-off-by:' line.
> 
> >
> > diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c
> b/drivers/scsi/arcmsr/arcmsr_hba.c
> > --- a/drivers/scsi/arcmsr/arcmsr_hba.c        2016-10-19 16:18:45.000000000
> +0800
> > +++ b/drivers/scsi/arcmsr/arcmsr_hba.c        2016-10-19 16:31:59.665524699
> +0800
> > @@ -2623,13 +2623,6 @@ static int arcmsr_queue_command_lck(stru
> >       cmd->scsi_done = done;
> >       cmd->host_scribble = NULL;
> >       cmd->result = 0;
> > -     if ((scsicmd == SYNCHRONIZE_CACHE) ||(scsicmd ==
> SEND_DIAGNOSTIC)){
> > -             if(acb->devstate[target][lun] == ARECA_RAID_GONE) {
> > -                     cmd->result = (DID_NO_CONNECT << 16);
> > -             }
> > -             cmd->scsi_done(cmd);
> > -             return 0;
> > -     }
> >       if (target == 16) {
> >               /* virtual device for iop message transfer */
> >               arcmsr_handle_virtual_command(acb, cmd);
> >
> > Thanks
> > Ching
> > On Tue, 2016-10-18 at 15:56 +0200, Tomas Henzl wrote:
> >> Hi,
> >>
> >> similar suspicious code path can be found in the queuecommand
> functions in other drivers too
> >> these are -
> >> pmcraid.c
> >> arcmsr_hba.c
> >> cc-ing maintainers -
> >> (but both drivers seem to be unmaintained for a while -
> >> I've added Ching for arcmsr and Raghava for pmcraid)

We do not support this card  anymore nor do we have the hardware to test
it out, but let me  try and procure the hardware for testing although 
chances are very slim.

Regards,
Raghava Aditya

> >> please read this thread and verify that your driver and firmware is safe.
> >>
> >> It is expected that a raid card controls the disk write cache (switches it off)
> >> but this might not be true for all modes of operation a raid adapter
> handles
> >> - pass through/non-RAID etc. In such case when disk write cache is
> enabled
> >> and your driver skips the SYNCHRONIZE_CACHE command a FS corruption
> >> is very much possible during unexpected power loss or even a clean
> shutdown.
> >>
> >> tomash
> >>
> >> On 17.10.2016 19:51, James Bottomley wrote:
> >>> On Mon, 2016-10-17 at 23:06 +0530, Kashyap Desai wrote:
> >>>>> -----Original Message-----
> >>>>> From: James Bottomley [mailto:jejb@linux.vnet.ibm.com]
> >>>>> Sent: Monday, October 17, 2016 10:50 PM
> >>>>> To: Ric Wheeler; Hannes Reinecke; Sumit Saxena;
> >>>>> linux-scsi@vger.kernel.org
> >>>>> Cc: martin.petersen@oracle.com; thenzl@redhat.com;
> >>>>> kashyap.desai@broadcom.com; Christoph Hellwig; Martin K.
> Petersen;
> >>>>> Jeff
> >>>>> Moyer; Gris Ge; Ewan Milne; Jens Axboe
> >>>>> Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE
> >>>>> command
> >>>>> to firmware
> >>>>>
> >>>>> On Mon, 2016-10-17 at 12:27 -0400, Ric Wheeler wrote:
> >>>>>> On 10/17/2016 12:20 PM, James Bottomley wrote:
> >>>>>>> On Mon, 2016-10-17 at 09:01 -0400, Ric Wheeler wrote:
> >>>>>>>> On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
> >>>>>>>>> On 10/17/2016 12:24 PM, Sumit Saxena wrote:
> >>>>>>>>>> megaraid_sas driver returns SYNCHRONIZE_CACHE command
> >>>>>>>>>> back to
> >>>>>>>>>> SCSI layer without sending it to firmware as firmware
> >>>>>>>>>> takes
> >>>>>>>>>> care of flushing cache.  This patch will change the
> >>>>>>>>>> driver
> >>>>>>>>>> behavior wrt SYNCHRONIZE_CACHE handling. If underlying
> >>>>>>>>>> firmware has support to handle the SYNCHORNIZE_CACHE,
> >>>>>>>>>> driver
> >>>>>>>>>> will send it for firmware otherwise complete it back to
> >>>>>>>>>> SCSI
> >>>>>>>>>> layer with SUCCESS immediately.  If  Firmware  handle
> >>>>>>>>>> SYNCHORNIZE_CACHE for both VD and JBOD
> >>>>>>>>>> "canHandleSyncCache"
> >>>>>>>>>> bit in scratch pad register(offset
> >>>>>>>>>> 0x00B4) will be set.
> >>>>>>>>>>
> >>>>>>>>>> This behavior can be controlled via module parameter and
> >>>>>>>>>> user
> >>>>>>>>>> can fallback to old behavior of returning
> >>>>>>>>>> SYNCHRONIZE_CACHE by
> >>>>>>>>>> driver only without sending it to firmware.
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> >>>>>>>>>> Signed-off-by: Kashyap Desai
> <kashyap.desai@broadcom.com>
> >>>>>>>>>> ---
> >>>>>>>>>>    drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
> >>>>>>>>>>    drivers/scsi/megaraid/megaraid_sas_base.c   | 14
> >>>>>>>>>> ++++++---
> >>>>>>>>>> -----
> >>>>>>>>>>    drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
> >>>>>>>>>>    drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
> >>>>>>>>>>    4 files changed, 14 insertions(+), 8 deletions(-)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
> >>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas.h
> >>>>>>>>>> index ca86c88..43fd14f 100644
> >>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
> >>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
> >>>>>>>>>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
> >>>>>>>>>>    #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
> >>>>>>>>>>    #define MR_MAX_MSIX_REG_ARRAY                   16
> >>>>>>>>>>    #define MR_RDPQ_MODE_OFFSET                       0X0
> >>>>>>>>>> 0800
> >>>>>>>>>> 000
> >>>>>>>>>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET             0
> >>>>>>>>>> X010
> >>>>>>>>>> 0000
> >>>>>>>>>> 0
> >>>>>>>>>> +
> >>>>>>>>>>    /*
> >>>>>>>>>>    * register set for both 1068 and 1078 controllers
> >>>>>>>>>>    * structure extended for 1078 registers @@ -2140,6
> >>>>>>>>>> +2142,7
> >>>>>>>>>> @@ struct megasas_instance {
> >>>>>>>>>>      u8 is_imr;
> >>>>>>>>>>      u8 is_rdpq;
> >>>>>>>>>>      bool dev_handle;
> >>>>>>>>>> +    bool fw_sync_cache_support;
> >>>>>>>>>>    };
> >>>>>>>>>>    struct MR_LD_VF_MAP {
> >>>>>>>>>>      u32 size;
> >>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
> >>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
> >>>>>>>>>> index 092387f..a4a8e2f 100644
> >>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> >>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> >>>>>>>>>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
> >>>>>>>>>> MEGASAS_DEFAULT_CMD_TIMEOUT;
> >>>>>>>>>>    module_param(scmd_timeout, int, S_IRUGO);
> >>>>>>>>>>    MODULE_PARM_DESC(scmd_timeout, "scsi command
> timeout
> >>>>>>>>>> (10
> >>>>>>>>>> -90s), default 90s. See megasas_reset_timer.");
> >>>>>>>>>>
> >>>>>>>>>> +bool block_sync_cache;
> >>>>>>>>>> +module_param(block_sync_cache, bool, S_IRUGO);
> >>>>>>>>>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE
> by
> >>>>>>>>>> driver
> >>>>>>>>>> Default: 0(Send it to firmware)");
> >>>>>>>>>> +
> >>>>>>>>>>    MODULE_LICENSE("GPL");
> >>>>>>>>>>    MODULE_VERSION(MEGASAS_VERSION);
> >>>>>>>>>>    MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
> >>>>>>>>>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct
> >>>>> Scsi_Host
> >>>>>>>>>> *shost, struct scsi_cmnd *scmd)
> >>>>>>>>>>              goto out_done;
> >>>>>>>>>>      }
> >>>>>>>>>>
> >>>>>>>>>> -    switch (scmd->cmnd[0]) {
> >>>>>>>>>> -    case SYNCHRONIZE_CACHE:
> >>>>>>>>>> -            /*
> >>>>>>>>>> -             * FW takes care of flush cache on its
> >>>>>>>>>> own
> >>>>>>>>>> -             * No need to send it down
> >>>>>>>>>> -             */
> >>>>>>>>>> +    if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
> >>>>>>>>>> +            (!instance->fw_sync_cache_support)) {
> >>>>>>>>>>              scmd->result = DID_OK << 16;
> >>>>>>>>>>              goto out_done;
> >>>>>>>>>> -    default:
> >>>>>>>>>> -            break;
> >>>>>>>>>>      }
> >>>>>>>>>>
> >>>>>>>>>>      return instance->instancet
> >>>>>>>>>> ->build_and_issue_cmd(instance, scmd);
> >>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> >>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> >>>>>>>>>> index 2159f6a..8237580 100644
> >>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> >>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> >>>>>>>>>> @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct
> >>>>>>>>>> megasas_instance *instance)
> >>>>>>>>>>              ret = 1;
> >>>>>>>>>>              goto fail_fw_init;
> >>>>>>>>>>      }
> >>>>>>>>>> +    if (!block_sync_cache)
> >>>>>>>>>> +            instance->fw_sync_cache_support =
> >>>>>>>>>> (scratch_pad_2
> >>>>>>>>>> &
> >>>>>>>>>> +                    MR_CAN_HANDLE_SYNC_CACHE_OFFSET)
> >>>>>>>>>> ? 1
> >>>>>>>>>> :
> >>>>>>>>>> 0;
> >>>>>>>>>>
> >>>>>>>>>>      IOCInitMessage =
> >>>>>>>>>>        dma_alloc_coherent(&instance->pdev->dev,
> >>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> >>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> >>>>>>>>>> index e3bee04..2857154 100644
> >>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
> >>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
> >>>>>>>>>> @@ -72,6 +72,8 @@
> >>>>>>>>>>    #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET
> >>>>>>>>>>  (0x0000030C)
> >>>>>>>>>>    #define MPI2_REPLY_POST_HOST_INDEX_OFFSET (0x00
> >>>>>>>>>> 0000
> >>>>>>>>>> 6C)
> >>>>>>>>>>
> >>>>>>>>>> +extern bool block_sync_cache;
> >>>>>>>>>> +
> >>>>>>>>>>    /*
> >>>>>>>>>>     * Raid context flags
> >>>>>>>>>>     */
> >>>>>>>>>>
> >>>>>>>>> Be extra careful with that.
> >>>>>>>>>
> >>>>>>>>> SYNCHRONIZE_CACHE has (potentially) a global scope, as
> >>>>>>>>> there
> >>>>>>>>> might be an array-wide cache, and a cache flush would
> >>>>>>>>> affect the
> >>>>>>>>> entire cache. Linux is using SYNCHRONIZE_CACHE as a per
> >>>>>>>>> device
> >>>>>>>>> setting, ie it assumes that the effects of a cache flush is
> >>>>>>>>> restricted to the device in question.
> >>>>>>>>>
> >>>>>>>>> If this is _not_ the case I'd rather not enable it.
> >>>>>>>>> Have you checked that enabling this functionality doesn't
> >>>>>>>>> have
> >>>>>>>>> negative performance impact?
> >>>>>>>>>
> >>>>>>>>> Cheers,
> >>>>>>>>>
> >>>>>>>>> Hannes
> >>>>>>>> This must go in - without this fix, there is no data
> >>>>>>>> integrity for
> >>>>>>>> any file system.
> >>>>>>> That's not what I get from the change log.  What it says to me
> >>>>>>> is
> >>>>>>> that the caches are currently firmware managed.  Barring
> >>>>>>> firmware
> >>>>>>> bugs, that means that we currently don't have any integrity
> >>>>>>> issues.
> >>>>>> Your understanding (or the change log) is incorrect.
> >>>>> There's no current way in English of construing "as firmware takes
> >>>>> care of
> >>>>> flushing cache" to mean its inverse, so the changelog needs
> >>>>> updating to
> >>>>> explain
> >>>>> that firmware does *not* take care of cache flushing, so
> >>>>> effectively
> >>>>> nothing
> >>>>> does and we'll need a cc to stable if the problem is that nothing
> >>>>> takes
> >>>>> care of
> >>>>> flushing the drive caches.
> >>>>>
> >>>>> James
> >>>> Sorry for confusion. More accurate to say  -
> >>>>
> >>>> Current megaraid_sas driver returns SYNCHRONIZE_CACHE command
> back to
> >>>> SCSI layer without sending it down to firmware as firmware supposed
> >>>> to takes care of flushing cache for all Virtual Disk at the time of
> >>>> system reboot/shutdown. Because of above FW rule driver coded
> wrongly
> >>>> and it added bug that SYNCHRONIZE_CACHE is not passed to FW for
> JBOD
> >>>> as well.  (MR Firmware wanted OS to manage SYNCHRONIZE_CACHE
> and pass
> >>>> the same to the Firmware. ). We figure out that even for VD, why
> >>>> driver should send back SYNC_CACHE command...let's have the same
> in
> >>>> FW (giving all control in FW)
> >>>>
> >>>> New behavior described -
> >>>>
> >>>> IF application requests SYNCH_CACHE
> >>>>  IF any FW which supports new API bit called canHandleSyncCache
> >>>>   Driver sends SYNCH_CACHE command to the FW
> >>>>   IF 'VirtualDisk'
> >>>>       FW does not send SYNCH_CACHE to drives
> >>>>       FW returns SUCCESS
> >>>> ELSE IF 'JBOD'
> >>>>      FW sends SYNCH_CACHE to drive
> >>>>      FW obtains status from drive and returns same status back to
> >>>> driver
> >>>>   ENDIF
> >>>>
> >>>> Fixing this is robust if FW and driver changes are inline. See below
> >>>> for more detail.
> >>>>
> >>>> Case - 1
> >>>> This patch is attempt to fix one level problem where Virtual Disks
> >>>> are not managed correctly in MR FW. There are some MR FW (E.a Gen2
> >>>> and Gen2.5 FW), which set WCE bit for Virtual Disk but FW is not
> >>>> reply correct for SYNCH_CACHE.  This was handled in past and carry
> >>>> forward (in driver + FW ) to all new generation products as well. We
> >>>> tried to collect all possible details why it was handled such way to
> >>>> provide better driver fix for this issue(mainly to avoid this FW
> >>>> checks and module  parameters etc..). But now it looks like to make
> >>>> generic fix (Device ID based etc..)is even risky, so went with this
> >>>> protective approach.  It is very risky to find out which Device ID
> >>>> and FW has capability to manage SYNC_CACHE, so we managed to
> figure
> >>>> out which are the new FW using FW capability bit.
> >>>>
> >>>> E.g Liberator/Thunderbolt MR FW SYNC_CACHE is write Unsupported
> >>>> command (this is a firmware bug) and immediately it will be failed to
> >>>> SML. This will cause File system to go Read-only.
> >>> That's a better description ... what you're saying is that the patch
> >>> doesn't actually fix the bug Ric is worrying about.
> >>>
> >>>> Case -2
> >>>> One more thing which we are trying and known driver can send
> >>>> "SYNC_CACHE" unconditionally to all generation of FW.  For this we
> >>>> are doing more unit testing on various controllers/FW and planning to
> >>>> provide another patch to fix JBOD path for any FW. (It will not be
> >>>> based on FW capability/module parameter).
> >>> OK, but we really need this ASAP.  As Ric said, data integrity is at
> >>> risk until this is fixed.  Can you also reference the commit that
> >>> introduced the problem so we know how far to do the sable backports?
> >>>
> >>>> If we remove module parameter, we can ask customer to disable WCE
> on
> >>>> drive to get similar impact.
> >>> Thanks,
> >>>
> >>> James
> >>>
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> >>> the body of a message to majordomo@vger.kernel.org
> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>
> >


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-19 18:07                       ` Raghava Aditya Renukunta
@ 2016-10-21 16:30                         ` Tomas Henzl
  0 siblings, 0 replies; 49+ messages in thread
From: Tomas Henzl @ 2016-10-21 16:30 UTC (permalink / raw)
  To: Raghava Aditya Renukunta, Ching Huang
  Cc: James Bottomley, Kashyap Desai, Ric Wheeler, Hannes Reinecke,
	Sumit Saxena, linux-scsi, martin.petersen, Christoph Hellwig,
	Martin K. Petersen, Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe

On 19.10.2016 20:07, Raghava Aditya Renukunta wrote:
> Hi Tomas,
>
>> -----Original Message-----
>> From: Tomas Henzl [mailto:thenzl@redhat.com]
>> Sent: Wednesday, October 19, 2016 5:56 AM
>> To: Ching Huang
>> Cc: James Bottomley; Kashyap Desai; Ric Wheeler; Hannes Reinecke; Sumit
>> Saxena; linux-scsi@vger.kernel.org; martin.petersen@oracle.com; Christoph
>> Hellwig; Martin K. Petersen; Jeff Moyer; Gris Ge; Ewan Milne; Jens Axboe;
>> Raghava Aditya Renukunta
>> Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE
>> command to firmware
>>
>> EXTERNAL EMAIL
>>
>>
>> On 19.10.2016 11:50, Ching Huang wrote:
>>> Hi Tomas,
>>>
>>> SCSI command checking in queuecommand function of arcmsr can be
>> removed safely.
>>> Now driver can pass all scsi command to controller firmware.
>> thanks for your quick reply. Is this (passing the SYNCHRONIZE_CACHE
>> command
>> to firmware) safe for all arcmsr cards, even the oldest?
>>
>> Please start with your patch a new thread and add your 'Signed-off-by:' line.
>>
>>> diff -uprN a/drivers/scsi/arcmsr/arcmsr_hba.c
>> b/drivers/scsi/arcmsr/arcmsr_hba.c
>>> --- a/drivers/scsi/arcmsr/arcmsr_hba.c        2016-10-19 16:18:45.000000000
>> +0800
>>> +++ b/drivers/scsi/arcmsr/arcmsr_hba.c        2016-10-19 16:31:59.665524699
>> +0800
>>> @@ -2623,13 +2623,6 @@ static int arcmsr_queue_command_lck(stru
>>>       cmd->scsi_done = done;
>>>       cmd->host_scribble = NULL;
>>>       cmd->result = 0;
>>> -     if ((scsicmd == SYNCHRONIZE_CACHE) ||(scsicmd ==
>> SEND_DIAGNOSTIC)){
>>> -             if(acb->devstate[target][lun] == ARECA_RAID_GONE) {
>>> -                     cmd->result = (DID_NO_CONNECT << 16);
>>> -             }
>>> -             cmd->scsi_done(cmd);
>>> -             return 0;
>>> -     }
>>>       if (target == 16) {
>>>               /* virtual device for iop message transfer */
>>>               arcmsr_handle_virtual_command(acb, cmd);
>>>
>>> Thanks
>>> Ching
>>> On Tue, 2016-10-18 at 15:56 +0200, Tomas Henzl wrote:
>>>> Hi,
>>>>
>>>> similar suspicious code path can be found in the queuecommand
>> functions in other drivers too
>>>> these are -
>>>> pmcraid.c
>>>> arcmsr_hba.c
>>>> cc-ing maintainers -
>>>> (but both drivers seem to be unmaintained for a while -
>>>> I've added Ching for arcmsr and Raghava for pmcraid)
> We do not support this card  anymore nor do we have the hardware to test
> it out, but let me  try and procure the hardware for testing although 
> chances are very slim.

Thanks for looking into this even though the driver is no more supported.


>
> Regards,
> Raghava Aditya
>
>>>> please read this thread and verify that your driver and firmware is safe.
>>>>
>>>> It is expected that a raid card controls the disk write cache (switches it off)
>>>> but this might not be true for all modes of operation a raid adapter
>> handles
>>>> - pass through/non-RAID etc. In such case when disk write cache is
>> enabled
>>>> and your driver skips the SYNCHRONIZE_CACHE command a FS corruption
>>>> is very much possible during unexpected power loss or even a clean
>> shutdown.
>>>> tomash
>>>>
>>>> On 17.10.2016 19:51, James Bottomley wrote:
>>>>> On Mon, 2016-10-17 at 23:06 +0530, Kashyap Desai wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: James Bottomley [mailto:jejb@linux.vnet.ibm.com]
>>>>>>> Sent: Monday, October 17, 2016 10:50 PM
>>>>>>> To: Ric Wheeler; Hannes Reinecke; Sumit Saxena;
>>>>>>> linux-scsi@vger.kernel.org
>>>>>>> Cc: martin.petersen@oracle.com; thenzl@redhat.com;
>>>>>>> kashyap.desai@broadcom.com; Christoph Hellwig; Martin K.
>> Petersen;
>>>>>>> Jeff
>>>>>>> Moyer; Gris Ge; Ewan Milne; Jens Axboe
>>>>>>> Subject: Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE
>>>>>>> command
>>>>>>> to firmware
>>>>>>>
>>>>>>> On Mon, 2016-10-17 at 12:27 -0400, Ric Wheeler wrote:
>>>>>>>> On 10/17/2016 12:20 PM, James Bottomley wrote:
>>>>>>>>> On Mon, 2016-10-17 at 09:01 -0400, Ric Wheeler wrote:
>>>>>>>>>> On 10/17/2016 07:34 AM, Hannes Reinecke wrote:
>>>>>>>>>>> On 10/17/2016 12:24 PM, Sumit Saxena wrote:
>>>>>>>>>>>> megaraid_sas driver returns SYNCHRONIZE_CACHE command
>>>>>>>>>>>> back to
>>>>>>>>>>>> SCSI layer without sending it to firmware as firmware
>>>>>>>>>>>> takes
>>>>>>>>>>>> care of flushing cache.  This patch will change the
>>>>>>>>>>>> driver
>>>>>>>>>>>> behavior wrt SYNCHRONIZE_CACHE handling. If underlying
>>>>>>>>>>>> firmware has support to handle the SYNCHORNIZE_CACHE,
>>>>>>>>>>>> driver
>>>>>>>>>>>> will send it for firmware otherwise complete it back to
>>>>>>>>>>>> SCSI
>>>>>>>>>>>> layer with SUCCESS immediately.  If  Firmware  handle
>>>>>>>>>>>> SYNCHORNIZE_CACHE for both VD and JBOD
>>>>>>>>>>>> "canHandleSyncCache"
>>>>>>>>>>>> bit in scratch pad register(offset
>>>>>>>>>>>> 0x00B4) will be set.
>>>>>>>>>>>>
>>>>>>>>>>>> This behavior can be controlled via module parameter and
>>>>>>>>>>>> user
>>>>>>>>>>>> can fallback to old behavior of returning
>>>>>>>>>>>> SYNCHRONIZE_CACHE by
>>>>>>>>>>>> driver only without sending it to firmware.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
>>>>>>>>>>>> Signed-off-by: Kashyap Desai
>> <kashyap.desai@broadcom.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>>    drivers/scsi/megaraid/megaraid_sas.h        |  3 +++
>>>>>>>>>>>>    drivers/scsi/megaraid/megaraid_sas_base.c   | 14
>>>>>>>>>>>> ++++++---
>>>>>>>>>>>> -----
>>>>>>>>>>>>    drivers/scsi/megaraid/megaraid_sas_fusion.c |  3 +++
>>>>>>>>>>>>    drivers/scsi/megaraid/megaraid_sas_fusion.h |  2 ++
>>>>>>>>>>>>    4 files changed, 14 insertions(+), 8 deletions(-)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>>>>>> index ca86c88..43fd14f 100644
>>>>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>>>>>>>>>>>> @@ -1429,6 +1429,8 @@ enum FW_BOOT_CONTEXT {
>>>>>>>>>>>>    #define MR_MAX_REPLY_QUEUES_EXT_OFFSET_SHIFT    14
>>>>>>>>>>>>    #define MR_MAX_MSIX_REG_ARRAY                   16
>>>>>>>>>>>>    #define MR_RDPQ_MODE_OFFSET                       0X0
>>>>>>>>>>>> 0800
>>>>>>>>>>>> 000
>>>>>>>>>>>> +#define MR_CAN_HANDLE_SYNC_CACHE_OFFSET             0
>>>>>>>>>>>> X010
>>>>>>>>>>>> 0000
>>>>>>>>>>>> 0
>>>>>>>>>>>> +
>>>>>>>>>>>>    /*
>>>>>>>>>>>>    * register set for both 1068 and 1078 controllers
>>>>>>>>>>>>    * structure extended for 1078 registers @@ -2140,6
>>>>>>>>>>>> +2142,7
>>>>>>>>>>>> @@ struct megasas_instance {
>>>>>>>>>>>>      u8 is_imr;
>>>>>>>>>>>>      u8 is_rdpq;
>>>>>>>>>>>>      bool dev_handle;
>>>>>>>>>>>> +    bool fw_sync_cache_support;
>>>>>>>>>>>>    };
>>>>>>>>>>>>    struct MR_LD_VF_MAP {
>>>>>>>>>>>>      u32 size;
>>>>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>>>>>> index 092387f..a4a8e2f 100644
>>>>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>>>>>>>>>>>> @@ -104,6 +104,10 @@ unsigned int scmd_timeout =
>>>>>>>>>>>> MEGASAS_DEFAULT_CMD_TIMEOUT;
>>>>>>>>>>>>    module_param(scmd_timeout, int, S_IRUGO);
>>>>>>>>>>>>    MODULE_PARM_DESC(scmd_timeout, "scsi command
>> timeout
>>>>>>>>>>>> (10
>>>>>>>>>>>> -90s), default 90s. See megasas_reset_timer.");
>>>>>>>>>>>>
>>>>>>>>>>>> +bool block_sync_cache;
>>>>>>>>>>>> +module_param(block_sync_cache, bool, S_IRUGO);
>>>>>>>>>>>> +MODULE_PARM_DESC(block_sync_cache, "Block SYNC CACHE
>> by
>>>>>>>>>>>> driver
>>>>>>>>>>>> Default: 0(Send it to firmware)");
>>>>>>>>>>>> +
>>>>>>>>>>>>    MODULE_LICENSE("GPL");
>>>>>>>>>>>>    MODULE_VERSION(MEGASAS_VERSION);
>>>>>>>>>>>>    MODULE_AUTHOR("megaraidlinux.pdl@avagotech.com");
>>>>>>>>>>>> @@ -1700,16 +1704,10 @@ megasas_queue_command(struct
>>>>>>> Scsi_Host
>>>>>>>>>>>> *shost, struct scsi_cmnd *scmd)
>>>>>>>>>>>>              goto out_done;
>>>>>>>>>>>>      }
>>>>>>>>>>>>
>>>>>>>>>>>> -    switch (scmd->cmnd[0]) {
>>>>>>>>>>>> -    case SYNCHRONIZE_CACHE:
>>>>>>>>>>>> -            /*
>>>>>>>>>>>> -             * FW takes care of flush cache on its
>>>>>>>>>>>> own
>>>>>>>>>>>> -             * No need to send it down
>>>>>>>>>>>> -             */
>>>>>>>>>>>> +    if ((scmd->cmnd[0] == SYNCHRONIZE_CACHE) &&
>>>>>>>>>>>> +            (!instance->fw_sync_cache_support)) {
>>>>>>>>>>>>              scmd->result = DID_OK << 16;
>>>>>>>>>>>>              goto out_done;
>>>>>>>>>>>> -    default:
>>>>>>>>>>>> -            break;
>>>>>>>>>>>>      }
>>>>>>>>>>>>
>>>>>>>>>>>>      return instance->instancet
>>>>>>>>>>>> ->build_and_issue_cmd(instance, scmd);
>>>>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>>>>>> index 2159f6a..8237580 100644
>>>>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>>>>>>>>>>> @@ -747,6 +747,9 @@ megasas_ioc_init_fusion(struct
>>>>>>>>>>>> megasas_instance *instance)
>>>>>>>>>>>>              ret = 1;
>>>>>>>>>>>>              goto fail_fw_init;
>>>>>>>>>>>>      }
>>>>>>>>>>>> +    if (!block_sync_cache)
>>>>>>>>>>>> +            instance->fw_sync_cache_support =
>>>>>>>>>>>> (scratch_pad_2
>>>>>>>>>>>> &
>>>>>>>>>>>> +                    MR_CAN_HANDLE_SYNC_CACHE_OFFSET)
>>>>>>>>>>>> ? 1
>>>>>>>>>>>> :
>>>>>>>>>>>> 0;
>>>>>>>>>>>>
>>>>>>>>>>>>      IOCInitMessage =
>>>>>>>>>>>>        dma_alloc_coherent(&instance->pdev->dev,
>>>>>>>>>>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>>>>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>>>>>> index e3bee04..2857154 100644
>>>>>>>>>>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>>>>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.h
>>>>>>>>>>>> @@ -72,6 +72,8 @@
>>>>>>>>>>>>    #define MPI2_SUP_REPLY_POST_HOST_INDEX_OFFSET
>>>>>>>>>>>>  (0x0000030C)
>>>>>>>>>>>>    #define MPI2_REPLY_POST_HOST_INDEX_OFFSET (0x00
>>>>>>>>>>>> 0000
>>>>>>>>>>>> 6C)
>>>>>>>>>>>>
>>>>>>>>>>>> +extern bool block_sync_cache;
>>>>>>>>>>>> +
>>>>>>>>>>>>    /*
>>>>>>>>>>>>     * Raid context flags
>>>>>>>>>>>>     */
>>>>>>>>>>>>
>>>>>>>>>>> Be extra careful with that.
>>>>>>>>>>>
>>>>>>>>>>> SYNCHRONIZE_CACHE has (potentially) a global scope, as
>>>>>>>>>>> there
>>>>>>>>>>> might be an array-wide cache, and a cache flush would
>>>>>>>>>>> affect the
>>>>>>>>>>> entire cache. Linux is using SYNCHRONIZE_CACHE as a per
>>>>>>>>>>> device
>>>>>>>>>>> setting, ie it assumes that the effects of a cache flush is
>>>>>>>>>>> restricted to the device in question.
>>>>>>>>>>>
>>>>>>>>>>> If this is _not_ the case I'd rather not enable it.
>>>>>>>>>>> Have you checked that enabling this functionality doesn't
>>>>>>>>>>> have
>>>>>>>>>>> negative performance impact?
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>>
>>>>>>>>>>> Hannes
>>>>>>>>>> This must go in - without this fix, there is no data
>>>>>>>>>> integrity for
>>>>>>>>>> any file system.
>>>>>>>>> That's not what I get from the change log.  What it says to me
>>>>>>>>> is
>>>>>>>>> that the caches are currently firmware managed.  Barring
>>>>>>>>> firmware
>>>>>>>>> bugs, that means that we currently don't have any integrity
>>>>>>>>> issues.
>>>>>>>> Your understanding (or the change log) is incorrect.
>>>>>>> There's no current way in English of construing "as firmware takes
>>>>>>> care of
>>>>>>> flushing cache" to mean its inverse, so the changelog needs
>>>>>>> updating to
>>>>>>> explain
>>>>>>> that firmware does *not* take care of cache flushing, so
>>>>>>> effectively
>>>>>>> nothing
>>>>>>> does and we'll need a cc to stable if the problem is that nothing
>>>>>>> takes
>>>>>>> care of
>>>>>>> flushing the drive caches.
>>>>>>>
>>>>>>> James
>>>>>> Sorry for confusion. More accurate to say  -
>>>>>>
>>>>>> Current megaraid_sas driver returns SYNCHRONIZE_CACHE command
>> back to
>>>>>> SCSI layer without sending it down to firmware as firmware supposed
>>>>>> to takes care of flushing cache for all Virtual Disk at the time of
>>>>>> system reboot/shutdown. Because of above FW rule driver coded
>> wrongly
>>>>>> and it added bug that SYNCHRONIZE_CACHE is not passed to FW for
>> JBOD
>>>>>> as well.  (MR Firmware wanted OS to manage SYNCHRONIZE_CACHE
>> and pass
>>>>>> the same to the Firmware. ). We figure out that even for VD, why
>>>>>> driver should send back SYNC_CACHE command...let's have the same
>> in
>>>>>> FW (giving all control in FW)
>>>>>>
>>>>>> New behavior described -
>>>>>>
>>>>>> IF application requests SYNCH_CACHE
>>>>>>  IF any FW which supports new API bit called canHandleSyncCache
>>>>>>   Driver sends SYNCH_CACHE command to the FW
>>>>>>   IF 'VirtualDisk'
>>>>>>       FW does not send SYNCH_CACHE to drives
>>>>>>       FW returns SUCCESS
>>>>>> ELSE IF 'JBOD'
>>>>>>      FW sends SYNCH_CACHE to drive
>>>>>>      FW obtains status from drive and returns same status back to
>>>>>> driver
>>>>>>   ENDIF
>>>>>>
>>>>>> Fixing this is robust if FW and driver changes are inline. See below
>>>>>> for more detail.
>>>>>>
>>>>>> Case - 1
>>>>>> This patch is attempt to fix one level problem where Virtual Disks
>>>>>> are not managed correctly in MR FW. There are some MR FW (E.a Gen2
>>>>>> and Gen2.5 FW), which set WCE bit for Virtual Disk but FW is not
>>>>>> reply correct for SYNCH_CACHE.  This was handled in past and carry
>>>>>> forward (in driver + FW ) to all new generation products as well. We
>>>>>> tried to collect all possible details why it was handled such way to
>>>>>> provide better driver fix for this issue(mainly to avoid this FW
>>>>>> checks and module  parameters etc..). But now it looks like to make
>>>>>> generic fix (Device ID based etc..)is even risky, so went with this
>>>>>> protective approach.  It is very risky to find out which Device ID
>>>>>> and FW has capability to manage SYNC_CACHE, so we managed to
>> figure
>>>>>> out which are the new FW using FW capability bit.
>>>>>>
>>>>>> E.g Liberator/Thunderbolt MR FW SYNC_CACHE is write Unsupported
>>>>>> command (this is a firmware bug) and immediately it will be failed to
>>>>>> SML. This will cause File system to go Read-only.
>>>>> That's a better description ... what you're saying is that the patch
>>>>> doesn't actually fix the bug Ric is worrying about.
>>>>>
>>>>>> Case -2
>>>>>> One more thing which we are trying and known driver can send
>>>>>> "SYNC_CACHE" unconditionally to all generation of FW.  For this we
>>>>>> are doing more unit testing on various controllers/FW and planning to
>>>>>> provide another patch to fix JBOD path for any FW. (It will not be
>>>>>> based on FW capability/module parameter).
>>>>> OK, but we really need this ASAP.  As Ric said, data integrity is at
>>>>> risk until this is fixed.  Can you also reference the commit that
>>>>> introduced the problem so we know how far to do the sable backports?
>>>>>
>>>>>> If we remove module parameter, we can ask customer to disable WCE
>> on
>>>>>> drive to get similar impact.
>>>>> Thanks,
>>>>>
>>>>> James
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> N�����r��y���b�X��ǧv�^�)޺{.n�+����{���"�{ay�\x1d
ʇڙ�,j\a��f���h���z�\x1e
�w���\f
���j:+v���w�j�m����\a����zZ+�����ݢj"��!tml=



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
  2016-10-19 12:55                     ` Tomas Henzl
  2016-10-19 18:07                       ` Raghava Aditya Renukunta
@ 2016-10-25  2:02                       ` Martin K. Petersen
       [not found]                         ` <CAKiknE_MM88eVON1g_qy7wbb2fkxdAs6O0SRSzN-RbQqSvx1eA@mail.gmail.com>
  1 sibling, 1 reply; 49+ messages in thread
From: Martin K. Petersen @ 2016-10-25  2:02 UTC (permalink / raw)
  To: Tomas Henzl
  Cc: Ching Huang, James Bottomley, Kashyap Desai, Ric Wheeler,
	Hannes Reinecke, Sumit Saxena, linux-scsi, martin.petersen,
	Christoph Hellwig, Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe,
	RaghavaAditya.Renukunta

>>>>> "Tomas" == Tomas Henzl <thenzl@redhat.com> writes:

>> SCSI command checking in queuecommand function of arcmsr can be
>> removed safely.  Now driver can pass all scsi command to controller
>> firmware.

Tomas> Is this (passing the SYNCHRONIZE_CACHE command to firmware) safe
Tomas> for all arcmsr cards, even the oldest?

Ching?

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware
       [not found]                         ` <CAKiknE_MM88eVON1g_qy7wbb2fkxdAs6O0SRSzN-RbQqSvx1eA@mail.gmail.com>
@ 2016-10-27  2:07                           ` Martin K. Petersen
  0 siblings, 0 replies; 49+ messages in thread
From: Martin K. Petersen @ 2016-10-27  2:07 UTC (permalink / raw)
  To: 黃清隆
  Cc: Tomas Henzl, James Bottomley, Kashyap Desai, Ric Wheeler,
	Hannes Reinecke, Sumit Saxena, linux-scsi, Martin K. Petersen,
	Christoph Hellwig, Jeff Moyer, Gris Ge, Ewan Milne, Jens Axboe,
	RaghavaAditya.Renukunta

>>>>> "ching" == 黃清隆  <ching2048@areca.com.tw> writes:

ching,

ching> Yes. Passing SYNCHRONIZE_CACHE command to firmware is safe for
ching> all Areca Raid controllers, even the oldest.

I applied your patch to 4.9/scsi-fixes.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2016-10-27  2:07 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-17 10:24 [PATCH 0/7] megaraid_sas: Updates for scsi-next Sumit Saxena
2016-10-17 10:24 ` [PATCH 1/7] megaraid_sas: For SRIOV enabled firmware, ensure VF driver waits for 30secs before reset Sumit Saxena
2016-10-17 11:29   ` Hannes Reinecke
2016-10-17 11:29     ` Hannes Reinecke
2016-10-17 12:43   ` Tomas Henzl
2016-10-17 10:24 ` [PATCH 2/7] megaraid_sas: Send correct PhysArm to FW for R1 VD downgrade Sumit Saxena
2016-10-17 11:29   ` Hannes Reinecke
2016-10-17 12:50   ` Tomas Henzl
2016-10-17 10:24 ` [PATCH 3/7] megaraid_sas: Do not fire DCMDs during PCI shutdown/detach Sumit Saxena
2016-10-17 11:31   ` Hannes Reinecke
2016-10-17 12:19     ` Sumit Saxena
2016-10-17 10:24 ` [PATCH 4/7] megaraid_sas: Send SYNCHRONIZE_CACHE command to firmware Sumit Saxena
2016-10-17 11:34   ` Hannes Reinecke
2016-10-17 12:13     ` Sumit Saxena
2016-10-17 13:01     ` Ric Wheeler
2016-10-17 13:31       ` Sumit Saxena
2016-10-17 15:55       ` Christoph Hellwig
2016-10-17 16:08         ` Ric Wheeler
2016-10-17 16:23           ` Ewan D. Milne
2016-10-17 16:20       ` James Bottomley
2016-10-17 16:27         ` Ric Wheeler
2016-10-17 17:19           ` James Bottomley
2016-10-17 17:30             ` Ric Wheeler
2016-10-17 17:36             ` Kashyap Desai
2016-10-17 17:51               ` James Bottomley
2016-10-18 13:56                 ` Tomas Henzl
2016-10-19  9:50                   ` Ching Huang
2016-10-19 12:55                     ` Tomas Henzl
2016-10-19 18:07                       ` Raghava Aditya Renukunta
2016-10-21 16:30                         ` Tomas Henzl
2016-10-25  2:02                       ` Martin K. Petersen
     [not found]                         ` <CAKiknE_MM88eVON1g_qy7wbb2fkxdAs6O0SRSzN-RbQqSvx1eA@mail.gmail.com>
2016-10-27  2:07                           ` Martin K. Petersen
2016-10-18 18:47                 ` Sumit Saxena
2016-10-17 16:29         ` Ric Wheeler
2016-10-17 13:13   ` Tomas Henzl
2016-10-17 13:28     ` Sumit Saxena
2016-10-17 13:57       ` Tomas Henzl
2016-10-17 14:25         ` Sumit Saxena
2016-10-18 13:07         ` Ric Wheeler
2016-10-18 13:22           ` Sumit Saxena
2016-10-17 10:24 ` [PATCH 5/7] megaraid_sas: driver version upgrade Sumit Saxena
2016-10-17 11:35   ` Hannes Reinecke
2016-10-17 12:20     ` Sumit Saxena
2016-10-17 10:24 ` [PATCH 6/7] MAINTAINERS: Update megaraid maintainers list Sumit Saxena
2016-10-17 11:37   ` Hannes Reinecke
2016-10-17 10:24 ` [PATCH 7/7] megaraid_sas: Do not set MPI2_TYPE_CUDA for JBOD FP path for FW which does not support JBOD sequence map Sumit Saxena
2016-10-17 11:37   ` Hannes Reinecke
2016-10-17 11:37     ` Hannes Reinecke
2016-10-17 13:23   ` Tomas Henzl

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.