* [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination.
@ 2016-10-30 12:43 ` Andrey Grodzovsky
  0 siblings, 0 replies; 31+ messages in thread
From: Andrey Grodzovsky @ 2016-10-30 12:43 UTC (permalink / raw)
  To: MPT-FusionLinux.pdl
  Cc: Andrey Grodzovsky, linux-scsi, Sathya Prakash, Chaitra P B,
	Suganath Prabu Subramani, Sreekanth Reddy, Hannes Reinecke,
	stable

Problem:
This is a workaround for a bug with LSI Fusion MPT SAS2 when
performing secure erase. Because the operation takes a very long time,
commands issued during the erase time out and trigger execution of the
abort hook. Even though the abort hook is called only for the specific
command that timed out, this halts the entire device (scsi_state
terminated) and prematurely terminates the secure erase.

Fix:
Set the device state to busy while the erase is in progress, rejecting
any incoming commands until the erase is done. The device is blocked
anyway during this time and cannot execute any other command.
More data and logs can be found here:
https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view

Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
Cc: <linux-scsi@vger.kernel.org>
Cc: Sathya Prakash <sathya.prakash@broadcom.com>
Cc: Chaitra P B <chaitra.basappa@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: <stable@vger.kernel.org>
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 5a97e32..5542dd02 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3500,6 +3500,23 @@ _scsih_eedp_error_handling(struct scsi_cmnd *scmd, u16 ioc_status)
 	    SAM_STAT_CHECK_CONDITION;
 }
 
+/**
+ * This is a workaround for a bug with LSI Fusion MPT SAS2 when
+ * performing secure erase. Due to the very long time the operation
+ * takes, commands issued during the erase will time out and trigger
+ * execution of the abort hook. This leads to device reset and premature
+ * termination of the secure erase.
+ */
+static inline bool disk_erase_command(struct scsi_cmnd *scmd)
+{
+	/*
+	 * Identify secure erase command according to
+	 * ATA/ATAPI Command Set (ATA8-ACS) p.202
+	 */
+	return scmd->cmd_len == 16 && scmd->cmnd[14] == 0xf4;
+}
+
+
 
 /**
  * _scsih_qcmd - main scsi request entry point
@@ -3528,6 +3545,14 @@ _scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
 		scsi_print_command(scmd);
 #endif
 
+	/*
+	 * Lock the device for any subsequent command until the secure
+	 * erase command is done.
+	 */
+	if (disk_erase_command(scmd))
+		scsi_internal_device_block(scmd->device);
+
+
 	sas_device_priv_data = scmd->device->hostdata;
 	if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
 		scmd->result = DID_NO_CONNECT << 16;
@@ -4062,6 +4087,11 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
 	if (scmd == NULL)
 		return 1;
 
+	/* Secure erase is done. Unlock the device */
+	if (disk_erase_command(scmd))
+		scsi_internal_device_unblock(scmd->device, SDEV_RUNNING);
+
+
 	mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
 
 	if (mpi_reply == NULL) {
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 31+ messages in thread
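
The check in the patch above looks only at the CDB length and byte 14, so any 16-byte CDB that happens to carry 0xf4 in byte 14 would match. A minimal user-space sketch of a stricter test follows, assuming the SAT layout in which ATA PASS-THROUGH (16) uses opcode 0x85 with the ATA command in byte 14, and ATA PASS-THROUGH (12) uses opcode 0xa1 with the ATA command in byte 9. The helper name `cdb_is_secure_erase` and the macro names are hypothetical and are not part of the posted patch or the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* SCSI opcodes for ATA PASS-THROUGH, as defined by SAT. */
#define ATA_PASS_THROUGH_12 0xa1
#define ATA_PASS_THROUGH_16 0x85

/* ATA command code for SECURITY ERASE UNIT. */
#define ATA_CMD_SECURITY_ERASE_UNIT 0xf4

/*
 * Hypothetical helper, not part of the patch: return true if the CDB
 * is an ATA PASS-THROUGH that carries SECURITY ERASE UNIT.
 */
static bool cdb_is_secure_erase(const uint8_t *cdb, size_t len)
{
	assert(cdb != NULL);

	/* ATA PASS-THROUGH (16): the ATA command is in byte 14. */
	if (len == 16 && cdb[0] == ATA_PASS_THROUGH_16)
		return cdb[14] == ATA_CMD_SECURITY_ERASE_UNIT;

	/* ATA PASS-THROUGH (12): the ATA command is in byte 9. */
	if (len == 12 && cdb[0] == ATA_PASS_THROUGH_12)
		return cdb[9] == ATA_CMD_SECURITY_ERASE_UNIT;

	return false;
}
```

Checking the opcode first keeps unrelated 16-byte commands from being treated as an erase; whether the driver needs that extra strictness is a judgment call for the maintainers.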

* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination.
  2016-10-30 12:43 ` Andrey Grodzovsky
@ 2016-10-30 18:43   ` Hannes Reinecke
  -1 siblings, 0 replies; 31+ messages in thread
From: Hannes Reinecke @ 2016-10-30 18:43 UTC (permalink / raw)
  To: Andrey Grodzovsky, MPT-FusionLinux.pdl
  Cc: linux-scsi, Sathya Prakash, Chaitra P B,
	Suganath Prabu Subramani, Sreekanth Reddy, stable

[-- Attachment #1: Type: text/plain, Size: 1680 bytes --]

On 10/30/2016 01:43 PM, Andrey Grodzovsky wrote:
> Problem:
> This is a workaround for a bug with LSI Fusion MPT SAS2 when
> performing secure erase. Because the operation takes a very long time,
> commands issued during the erase time out and trigger execution of the
> abort hook. Even though the abort hook is called only for the specific
> command that timed out, this halts the entire device (scsi_state
> terminated) and prematurely terminates the secure erase.
>
Actually, it is _not_ the erase command which times out, it's the 
successive commands which time out, as the controller is unable to 
process them while erase is running.
I suspect a bug in the SAT-layer from the mpt3sas firmware, which simply 
does not return 'busy' for additional commands when erase is in progress.
That being said, this issue was obscured prior to implementing 
asynchronous aborts, as originally a timeout would be invoking SCSI EH, 
which would wait for all outstanding commands to complete.
So by the time SCSI EH was invoked the erase command was already 
completed, allowing for a successful retry of the failing command.
With asynchronous aborts we don't have this option, as the abort will 
succeed, but the command cannot be retried as the original erase command 
is still running.

In the light of the above I guess we need something like the attached
patch. I'm not utterly proud of it, but I guess it's the best we can do
for the moment.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

[-- Attachment #2: 0001-mpt3sas-hack-disable-concurrent-commands-for-ATA_16-.patch --]
[-- Type: text/x-patch, Size: 1832 bytes --]

>From 1556746987c3b4c1a1a4705625280b1136554f89 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@suse.de>
Date: Sun, 30 Oct 2016 14:24:44 +0100
Subject: [PATCH] mpt3sas: hack: disable concurrent commands for ATA_16/ATA_12

There's a bug in the mpt3sas driver/firmware: it does not return
BUSY while it is busy processing a request (e.g. 'erase') and cannot
respond to other commands. Hence these commands will time out
and eventually start the error handler.
This patch blocks request processing whenever an ATA_12 or
ATA_16 command is received, thereby avoiding this problem.

Signed-off-by: Hannes Reinecke <hare@suse.com>
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 97987e7..18b9f09 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -4096,6 +4096,13 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
 	    sas_device_priv_data->block)
 		return SCSI_MLQUEUE_DEVICE_BUSY;
 
+	/*
+	 * Hack: block the device for any ATA_12/ATA_16 command
+	 */
+	if (scmd->cmnd[0] == 0xa1 || scmd->cmnd[0] == 0x85) {
+		sas_device_priv_data = scmd->device->hostdata;
+		_scsih_internal_device_block(scmd->device, sas_device_priv_data);
+	}
 	if (scmd->sc_data_direction == DMA_FROM_DEVICE)
 		mpi_control = MPI2_SCSIIO_CONTROL_READ;
 	else if (scmd->sc_data_direction == DMA_TO_DEVICE)
@@ -4835,6 +4842,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
 
  out:
 
+	if (scmd->cmnd[0] == 0xa1 || scmd->cmnd[0] == 0x85) {
+		sas_device_priv_data = scmd->device->hostdata;
+		_scsih_internal_device_unblock(scmd->device, sas_device_priv_data);
+	}
 	scsi_dma_unmap(scmd);
 
 	scmd->scsi_done(scmd);
-- 
2.6.6


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
  2016-10-30 18:43   ` Hannes Reinecke
@ 2016-11-02  0:09     ` Andrey Grodzovsky
  -1 siblings, 0 replies; 31+ messages in thread
From: Andrey Grodzovsky @ 2016-11-02  0:09 UTC (permalink / raw)
  To: MPT-FusionLinux.pdl
  Cc: Andrey Grodzovsky, linux-scsi, Sathya Prakash, Chaitra P B,
	Suganath Prabu Subramani, Sreekanth Reddy, Hannes Reinecke,
	stable

Problem:
This is a workaround for a bug with LSI Fusion MPT SAS2 when
performing secure erase. Because the operation takes a very long time,
commands issued during the erase time out and trigger execution of the
abort hook. Even though the abort hook is called only for the specific
command that timed out, this halts the entire device (scsi_state
terminated) and prematurely terminates the secure erase.

Fix:
Set the device state to busy while the erase is in progress, rejecting
any incoming commands until the erase is done. The device is blocked
anyway during this time and cannot execute any other command.
More data and logs can be found here:
https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view

v2: Update according to example patch by Hannes Reinecke to apply
the blocking logic to any ATA 12/16 command.

Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
Cc: <linux-scsi@vger.kernel.org>
Cc: Sathya Prakash <sathya.prakash@broadcom.com>
Cc: Chaitra P B <chaitra.basappa@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: <stable@vger.kernel.org>
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 5a97e32..43ab0cc 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3500,6 +3500,20 @@ _scsih_eedp_error_handling(struct scsi_cmnd *scmd, u16 ioc_status)
 	    SAM_STAT_CHECK_CONDITION;
 }
 
+/**
+ * This is a workaround for a bug with LSI Fusion MPT SAS2 when
+ * performing secure erase. Due to the very long time the operation
+ * takes, commands issued during the erase will time out and trigger
+ * execution of the abort hook. This leads to device reset and premature
+ * termination of the secure erase.
+ *
+ */
+static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
+{
+	return (scmd->cmnd[0] == 0xa1 || scmd->cmnd[0] == 0x85);
+}
+
+
 
 /**
  * _scsih_qcmd - main scsi request entry point
@@ -3528,6 +3542,14 @@ _scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
 		scsi_print_command(scmd);
 #endif
 
+	/*
+	 * Lock the device for any subsequent command until
+	 * this command is done.
+	 */
+	if (ata_12_16_cmd(scmd))
+		scsi_internal_device_block(scmd->device);
+
+
 	sas_device_priv_data = scmd->device->hostdata;
 	if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
 		scmd->result = DID_NO_CONNECT << 16;
@@ -4062,6 +4084,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
 	if (scmd == NULL)
 		return 1;
 
+	if (ata_12_16_cmd(scmd))
+		scsi_internal_device_unblock(scmd->device, SDEV_RUNNING);
+
+
 	mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
 
 	if (mpi_reply == NULL) {
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
  2016-11-02  0:09     ` Andrey Grodzovsky
@ 2016-11-02  2:07       ` Hannes Reinecke
  -1 siblings, 0 replies; 31+ messages in thread
From: Hannes Reinecke @ 2016-11-02  2:07 UTC (permalink / raw)
  To: Andrey Grodzovsky, MPT-FusionLinux.pdl
  Cc: linux-scsi, Sathya Prakash, Chaitra P B,
	Suganath Prabu Subramani, Sreekanth Reddy, stable

On 11/02/2016 01:09 AM, Andrey Grodzovsky wrote:
> Problem:
> This is a workaround for a bug with LSI Fusion MPT SAS2 when
> performing secure erase. Because the operation takes a very long time,
> commands issued during the erase time out and trigger execution of the
> abort hook. Even though the abort hook is called only for the specific
> command that timed out, this halts the entire device (scsi_state
> terminated) and prematurely terminates the secure erase.
>
> Fix:
> Set the device state to busy while the erase is in progress, rejecting
> any incoming commands until the erase is done. The device is blocked
> anyway during this time and cannot execute any other command.
> More data and logs can be found here:
> https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view
>
> v2: Update according to example patch by Hannes Reinecke to apply
> the blocking logic to any ATA 12/16 command.
>
> Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
> Cc: <linux-scsi@vger.kernel.org>
> Cc: Sathya Prakash <sathya.prakash@broadcom.com>
> Cc: Chaitra P B <chaitra.basappa@broadcom.com>
> Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
> Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: <stable@vger.kernel.org>
> ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
>
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 5a97e32..43ab0cc 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -3500,6 +3500,20 @@ _scsih_eedp_error_handling(struct scsi_cmnd *scmd, u16 ioc_status)
>  	    SAM_STAT_CHECK_CONDITION;
>  }
>
> +/**
> + * This is a workaround for a bug with LSI Fusion MPT SAS2 when
> + * performing secure erase. Due to the very long time the operation
> + * takes, commands issued during the erase will time out and trigger
> + * execution of the abort hook. This leads to device reset and premature
> + * termination of the secure erase.
> + *
> + */
> +static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
> +{
> +   return (scmd->cmnd[0] == 0xa1 || scmd->cmnd[0] == 0x85);
> +}
> +
> +
>
>  /**
>   * _scsih_qcmd - main scsi request entry point
> @@ -3528,6 +3542,14 @@ _scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
>  		scsi_print_command(scmd);
>  #endif
>
> +   /**
> +	* Lock the device for any subsequent command until
> +	* command is done.
> +	*/
> +	if (ata_12_16_cmd(scmd))
> +		scsi_internal_device_block(scmd->device);
> +
> +
>  	sas_device_priv_data = scmd->device->hostdata;
>  	if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
>  		scmd->result = DID_NO_CONNECT << 16;
> @@ -4062,6 +4084,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>  	if (scmd == NULL)
>  		return 1;
>
> +	if (ata_12_16_cmd(scmd))
> +		scsi_internal_device_unblock(scmd->device, SDEV_RUNNING);
> +
> +
>  	mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
>
>  	if (mpi_reply == NULL) {
>
Yeah, it's ugly, but I can't think of a better solution for the moment.
Thanks for debugging this.

Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
  2016-11-02  2:07       ` Hannes Reinecke
  (?)
@ 2016-11-02 10:05       ` Sreekanth Reddy
       [not found]         ` <CAJphD_qrQftfCOn_uzXCfdX=Xv9BYvVQ60AZ4DR2rc3gfXQa_Q@mail.gmail.com>
  -1 siblings, 1 reply; 31+ messages in thread
From: Sreekanth Reddy @ 2016-11-02 10:05 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: PDL-MPT-FUSIONLINUX, linux-scsi, Hannes Reinecke, Sathya Prakash,
	Chaitra P B, Suganath Prabu Subramani, stable

On Wed, Nov 2, 2016 at 7:37 AM, Hannes Reinecke <hare@suse.de> wrote:
> On 11/02/2016 01:09 AM, Andrey Grodzovsky wrote:
>>
>> Problem:
>> This is a workaround for a bug with LSI Fusion MPT SAS2 when
>> performing secure erase. Because the operation takes a very long time,
>> commands issued during the erase time out and trigger execution of the
>> abort hook. Even though the abort hook is called only for the specific
>> command that timed out, this halts the entire device (scsi_state
>> terminated) and prematurely terminates the secure erase.
>>
>> Fix:
>> Set the device state to busy while the erase is in progress, rejecting
>> any incoming commands until the erase is done. The device is blocked
>> anyway during this time and cannot execute any other command.
>> More data and logs can be found here:
>> https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view
>>
>> v2: Update according to example patch by Hannes Reinecke to apply
>> the blocking logic to any ATA 12/16 command.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
>> Cc: <linux-scsi@vger.kernel.org>
>> Cc: Sathya Prakash <sathya.prakash@broadcom.com>
>> Cc: Chaitra P B <chaitra.basappa@broadcom.com>
>> Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
>> Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
>> Cc: Hannes Reinecke <hare@suse.de>
>> Cc: <stable@vger.kernel.org>
>> ---
>>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 26 ++++++++++++++++++++++++++
>>  1 file changed, 26 insertions(+)
>>
>> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> index 5a97e32..43ab0cc 100644
>> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> @@ -3500,6 +3500,20 @@ _scsih_eedp_error_handling(struct scsi_cmnd *scmd,
>> u16 ioc_status)
>>             SAM_STAT_CHECK_CONDITION;
>>  }
>>
>> +/**
>> + * This is a work around for a bug with LSI Fusion MPT SAS2 when
>> + * pefroming secure erase. Due to the verly long time the operation
>> + * takes commands issued during the erase will time out and will trigger
>> + * execution of abort hook. This leads to device reset and premature
>> + * termination of the secured erase.
>> + *
>> + */
>> +static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
>> +{
>> +   return (scmd->cmnd[0] == 0xa1 || scmd->cmnd[0] == 0x85);
>> +}
>> +
>> +
>>
>>  /**
>>   * _scsih_qcmd - main scsi request entry point
>> @@ -3528,6 +3542,14 @@ _scsih_qcmd(struct Scsi_Host *shost, struct
>> scsi_cmnd *scmd)
>>                 scsi_print_command(scmd);
>>  #endif
>>
>> +   /**
>> +       * Lock the device for any subsequent command until
>> +       * command is done.
>> +       */
>> +       if (ata_12_16_cmd(scmd))
>> +               scsi_internal_device_block(scmd->device);
>> +
>> +
>>         sas_device_priv_data = scmd->device->hostdata;
>>         if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
>>                 scmd->result = DID_NO_CONNECT << 16;
>> @@ -4062,6 +4084,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16
>> smid, u8 msix_index, u32 reply)
>>         if (scmd == NULL)
>>                 return 1;
>>
>> +       if (ata_12_16_cmd(scmd))
>> +               scsi_internal_device_unblock(scmd->device, SDEV_RUNNING);
>> +
>> +
>>         mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
>>
>>         if (mpi_reply == NULL) {
>>
> Yeah, it's ugly, but I can't think of a better solution for the moment.
> Thanks for debugging this.

May I know the result of the same test case when the SATA drive is
connected to the on-board SATA controller?

If this is assumed to be an HBA firmware issue, then it should be fixed
in the firmware, not in the driver. Have you tried the latest HBA
firmware image? If it still occurs, is it possible for you to share
the firmware logs?

I think a service request has been raised with Broadcom for this issue.
Through that service request, our support people can help you collect
the firmware logs and provide an analysis of them.

Thanks,
Sreekanth

>
> Reviewed-by: Hannes Reinecke <hare@suse.com>
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke                   zSeries & Storage
> hare@suse.de                          +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
       [not found]           ` <30w645ulbhlofxrk1h4a9q3s.1478144944778@email.android.com>
@ 2016-11-04 12:45             ` Sreekanth Reddy
  2016-11-04 14:51               ` Hannes Reinecke
  0 siblings, 1 reply; 31+ messages in thread
From: Sreekanth Reddy @ 2016-11-04 12:45 UTC (permalink / raw)
  To: Igor Rybak, Andrey Grodzovsky
  Cc: Ezra Kohavi, PDL-MPT-FUSIONLINUX, linux-scsi, Hannes Reinecke,
	Sathya Prakash, Chaitra P B, Suganath Prabu Subramani, stable

Hi All,

For the last two days, I have been working with my firmware team to get
the required info on this issue. Here is my firmware team's response:

"For ATA PASSTHROUGH commands, the IOC SATL will not check for the
opcode and will direct it to the drive. So even though ATA PASSTHROUGH
has ATA erase to the drive, IOC SATL FW will not know that and as a
general logic for all ATA PASSTHROUGH commands, IOC FW will pend the
upcoming IOs until the previous ATA PASSTHROUGH completes. This is as
per the SAT specification for SAS controllers and we can't compare it
with the on-board SATA controllers, which have a full-fledged SATA
implementation".

So this is expected behavior from our HBA firmware, i.e. it will
pend the subsequent commands if any ATA PASSTHROUGH command is in
progress. So there is no issue with the FW.

Today I tried the same test case on my local setup, i.e. I issued a
secure erase command using the hdparm utility and observed the same
issue on the 4.2.3-300.fc23.x86_64 kernel.

Then, after searching for this issue, I found that some people
recommend enabling the 'CONFIG_IDE_TASK_IOCTL' Kconfig flag. I had a
compiled 4.4.0 kernel, so I enabled CONFIG_IDE_TASK_IOCTL, recompiled
the 4.4.0 kernel and booted into it. Then I tried the same test case
and did not observe this issue; the secure erase operation completed
successfully.

So, can you please try once with CONFIG_IDE_TASK_IOCTL enabled?

Thanks,
Sreekanth






On Thu, Nov 3, 2016 at 9:19 AM, Igor Rybak <igor@media-clone.com> wrote:
> Hi,
>
> We tried the latest LSI firmware 20.0.0.7, also collected logs by the
> Broadcom script and emailed to their tech support already.
>
> Thanks,
>
> Igor Rybak
> VP Engineering
> MediaClone Inc
>
>
> -------- Original message --------
> From: Andrey Grodzovsky <andrey2805@gmail.com>
> Date: 11/2/16 9:31 PM (GMT+05:30)
> To: Sreekanth Reddy <sreekanth.reddy@broadcom.com>, Igor Rybak
> <igor@media-clone.com>, Ezra Kohavi <ezra@media-clone.com>
> Cc: PDL-MPT-FUSIONLINUX <MPT-FusionLinux.pdl@broadcom.com>,
> linux-scsi@vger.kernel.org, Hannes Reinecke <hare@suse.de>, Sathya Prakash
> <sathya.prakash@broadcom.com>, Chaitra P B <chaitra.basappa@broadcom.com>,
> Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>,
> stable@vger.kernel.org
> Subject: Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination
> (v2)
>
>
>
> On Wed, Nov 2, 2016 at 6:05 AM, Sreekanth Reddy
> <sreekanth.reddy@broadcom.com> wrote:
>>
>> On Wed, Nov 2, 2016 at 7:37 AM, Hannes Reinecke <hare@suse.de> wrote:
>> > On 11/02/2016 01:09 AM, Andrey Grodzovsky wrote:
>> >>
>> >> Problem:
>> >> This is a work around for a bug with LSI Fusion MPT SAS2 when
>> >> pefroming secure erase. Due to the very long time the operation
>> >> takes commands issued during the erase will time out and will trigger
>> >> execution of abort hook. Even though the abort hook is called for
>> >> the specifc command which timed out this leads to entire device halt
>> >> (scsi_state terminated) and premature termination of the secured erase.
>> >>
>> >> Fix:
>> >> Set device state to busy while erase in progress to reject any incoming
>> >> commands until the erase is done. The device is blocked any way during
>> >> this time and cannot execute any other command.
>> >> More data and logs can be found here -
>> >> https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view
>> >>
>> >> v2: Update according to example patch by Hannes Reinecke to apply
>> >> the blocking logic to any ATA 12/16 command.
>> >>
>> >> Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
>> >> Cc: <linux-scsi@vger.kernel.org>
>> >> Cc: Sathya Prakash <sathya.prakash@broadcom.com>
>> >> Cc: Chaitra P B <chaitra.basappa@broadcom.com>
>> >> Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
>> >> Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
>> >> Cc: Hannes Reinecke <hare@suse.de>
>> >> Cc: <stable@vger.kernel.org>
>> >> ---
>> >>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 26 ++++++++++++++++++++++++++
>> >>  1 file changed, 26 insertions(+)
>> >>
>> >> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> >> b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> >> index 5a97e32..43ab0cc 100644
>> >> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> >> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> >> @@ -3500,6 +3500,20 @@ _scsih_eedp_error_handling(struct scsi_cmnd
>> >> *scmd,
>> >> u16 ioc_status)
>> >>             SAM_STAT_CHECK_CONDITION;
>> >>  }
>> >>
>> >> +/**
>> >> + * This is a work around for a bug with LSI Fusion MPT SAS2 when
>> >> + * pefroming secure erase. Due to the verly long time the operation
>> >> + * takes commands issued during the erase will time out and will
>> >> trigger
>> >> + * execution of abort hook. This leads to device reset and premature
>> >> + * termination of the secured erase.
>> >> + *
>> >> + */
>> >> +static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
>> >> +{
>> >> +   return (scmd->cmnd[0] == 0xa1 || scmd->cmnd[0] == 0x85);
>> >> +}
>> >> +
>> >> +
>> >>
>> >>  /**
>> >>   * _scsih_qcmd - main scsi request entry point
>> >> @@ -3528,6 +3542,14 @@ _scsih_qcmd(struct Scsi_Host *shost, struct
>> >> scsi_cmnd *scmd)
>> >>                 scsi_print_command(scmd);
>> >>  #endif
>> >>
>> >> +   /**
>> >> +       * Lock the device for any subsequent command until
>> >> +       * command is done.
>> >> +       */
>> >> +       if (ata_12_16_cmd(scmd))
>> >> +               scsi_internal_device_block(scmd->device);
>> >> +
>> >> +
>> >>         sas_device_priv_data = scmd->device->hostdata;
>> >>         if (!sas_device_priv_data || !sas_device_priv_data->sas_target)
>> >> {
>> >>                 scmd->result = DID_NO_CONNECT << 16;
>> >> @@ -4062,6 +4084,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16
>> >> smid, u8 msix_index, u32 reply)
>> >>         if (scmd == NULL)
>> >>                 return 1;
>> >>
>> >> +       if (ata_12_16_cmd(scmd))
>> >> +               scsi_internal_device_unblock(scmd->device,
>> >> SDEV_RUNNING);
>> >> +
>> >> +
>> >>         mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
>> >>
>> >>         if (mpi_reply == NULL) {
>> >>
>> > Yeah, it's ugly, but I can't think of a better solution for the moment.
>> > Thanks for debugging this.
>>
>> May I known the result of same test case if the SATA drive is
>> connected to on-bord SATA?
>
>
> + Igor and Ezra from media-clone who originally reported the problem.
>
> With on board controller no problems were observed.
>>
>>
>> If it is assumed to be HBA firmware issue then it should be fixed in
>> the Firmware not in the driver. Have you tried with the latest HBA
>> Firmware image?  if it still occurs then is it possible for you to
>> share the firmware logs?
>
>
> Igor, Ezra - can you do it please ?
>>
>>
>> I think that service request has raised for this issue with Broadcom,
>> in this service request our support people can help you in collecting
>> the firmware logs and can provide the analysis of those firmware logs.
>
>
> Same as above.
>
> Thanks,
> Andrey
>>
>>
>> Thanks,
>> Sreekanth
>>
>> >
>> > Reviewed-by: Hannes Reinecke <hare@suse.com>
>> >
>> > Cheers,
>> >
>> > Hannes
>> > --
>> > Dr. Hannes Reinecke                   zSeries & Storage
>> > hare@suse.de                          +49 911 74053 688
>> > SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
>> > GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
>
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
  2016-11-04 12:45             ` Sreekanth Reddy
@ 2016-11-04 14:51               ` Hannes Reinecke
  2016-11-04 16:35                   ` Martin K. Petersen
  2016-11-05 13:17                 ` Andrey Grodzovsky
  0 siblings, 2 replies; 31+ messages in thread
From: Hannes Reinecke @ 2016-11-04 14:51 UTC (permalink / raw)
  To: Sreekanth Reddy, Igor Rybak, Andrey Grodzovsky
  Cc: Ezra Kohavi, PDL-MPT-FUSIONLINUX, linux-scsi, Sathya Prakash,
	Chaitra P B, Suganath Prabu Subramani, stable

On 11/04/2016 01:45 PM, Sreekanth Reddy wrote:
> Hi All,
>
> From last two days, I was working with my firmware team to get the
> required info over this issue. Here is my firmware team response
>
> "For ATA PASSTHROUGH commands, the IOC SATL will not check for the
> opcode and will direct it to the drive. So even though ATA PASSTHOUGH
> has ATA erase to the drive, IOC SATL FW will not know that and as a
> general logic for all ATA PASSTHOGH commands, IOC FW will pend the
> upcoming IOs untill the previous ATA PASSTHORUGH completes. This is as
> per the SAT specification for SAS controllers and we can't compare it
> with the SATA controllers in the on board that have full fledge SATA
> implementation".
>
> So this is an expected behavior from our HBA firmware. i.e. it will
> pend the subsequent commands if any ATA PASSTHROUGH command is going
> on. So their is no issue with the FW.
>
But is there a way to figure out if the firmware / SATL layer is busy 
processing requests?

With 'real' ATA HBAs this issue doesn't occur, as the ATA erase command
is a non-queued command, and hence the next command automatically has to
wait for the erase command to complete.
But this wait happens as the ATA HBA returns 'BUSY', and the Linux I/O
stack will then reset the timeout for all consecutive commands.

With mpt3sas _all_ commands are queued, so if there is a long-running 
I/O command all other commands already in the queue will time out.

Which is at least a very awkward behaviour.

Checking with SAT-3 (section 6.2.4: Commands the SATL queues internally) 
the implemented behaviour is standards conformant, although the standard 
also allows for returning 'TASK SET FULL' or 'BUSY' in these cases.
Doing so would nicely solve this issue.

> Today I have tried the same test case on my local setup. i.e. I have
> issued a secure erase command using hdparm utility and observed the
> same issue on 4.2.3-300.fc23.x86_64 kernel.
>
> Then after browsing over this issue, I found that some people are
> recommending to enable 'CONFIG_IDE_TASK_IOCTL' Kconfig flag. I had a
> compiled 4.4.0 kernel, so I have enabled this CONFIG_IDE_TASK_IOCTL
> and recompiled this 4.4.0 kernel and booted in to this kernel. Then I
> tried same test case and I haven't observed this issue and secure
> erase operation was completed successfully.
>
> So, can you please try once with CONFIG_IDE_TASK_IOCTL enabled.
>
Errm.
CONFIG_IDE_TASK_IOCTL is for the old IDE subsystem, which isn't in use 
here. So this option does not make a difference when using mpt3sas, as 
this is a 'real' SCSI driver which never calls out into any of these 
subsystems.

I would be _VERY_ much surprised if that would make a difference.

The reason why this behaviour did go unnoticed with older kernels was 
that a command timeout would trigger SCSI EH to engage, and that in turn 
required all outstanding commands to complete.
So by the time SCSI EH started the ERASE command was complete, and a 
retry of the timed-out commands would work.
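
The timing argument above can be illustrated with a minimal userspace C
model. It is only a sketch under stated assumptions: the names
follow_up_times_out, POLICY_QUEUE and POLICY_REPORT_BUSY are hypothetical,
the follow-up command's own service time is ignored, and a busy response
is assumed to lead to a later retry with a fresh timer, as described for
'real' ATA HBAs.

```c
#include <stdbool.h>

/* Hypothetical policies for a command that arrives while the device is
 * still occupied by a long-running, non-queued command. */
enum busy_policy {
	POLICY_QUEUE,		/* accept and hold it; its timer keeps running */
	POLICY_REPORT_BUSY,	/* reject with busy; the submitter retries later */
};

/*
 * Returns true if a follow-up command submitted at submit_s (seconds)
 * would hit its timeout while the device stays occupied until
 * long_done_s. Illustrative model only, not driver code.
 */
static bool follow_up_times_out(enum busy_policy policy,
				unsigned int submit_s,
				unsigned int long_done_s,
				unsigned int timeout_s)
{
	if (submit_s >= long_done_s)
		return false;	/* device is already idle again */
	if (policy == POLICY_REPORT_BUSY)
		return false;	/* never accepted, so no timer runs to expiry */
	/* Queued: the command waits until the long command finishes. */
	return long_done_s - submit_s > timeout_s;
}
```

Taking, as an example, a 30 second timeout and a two hour erase, a
command queued 10 seconds into the erase times out in this model, while
the same command under the busy policy does not.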

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
  2016-11-04 14:51               ` Hannes Reinecke
@ 2016-11-04 16:35                   ` Martin K. Petersen
  2016-11-05 13:17                 ` Andrey Grodzovsky
  1 sibling, 0 replies; 31+ messages in thread
From: Martin K. Petersen @ 2016-11-04 16:35 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Sreekanth Reddy, Igor Rybak, Andrey Grodzovsky, Ezra Kohavi,
	PDL-MPT-FUSIONLINUX, linux-scsi, Sathya Prakash, Chaitra P B,
	Suganath Prabu Subramani, stable

>>>>> "Hannes" == Hannes Reinecke <hare@suse.de> writes:

Hannes> Checking with SAT-3 (section 6.2.4: Commands the SATL queues
Hannes> internally) the implemented behaviour is standards conformant,
Hannes> although the standard also allows for returning 'TASK SET FULL'
Hannes> or 'BUSY' in these cases.  Doing so would nicely solve this
Hannes> issue.

I agree with Hannes that it would be appropriate for the SATL to report
busy when it makes a non-queued command queueable.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
  2016-11-04 14:51               ` Hannes Reinecke
  2016-11-04 16:35                   ` Martin K. Petersen
@ 2016-11-05 13:17                 ` Andrey Grodzovsky
  2016-11-10 12:07                   ` Sreekanth Reddy
  1 sibling, 1 reply; 31+ messages in thread
From: Andrey Grodzovsky @ 2016-11-05 13:17 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Sreekanth Reddy, Igor Rybak, Ezra Kohavi, PDL-MPT-FUSIONLINUX,
	linux-scsi, Sathya Prakash, Chaitra P B,
	Suganath Prabu Subramani, stable

On Fri, Nov 4, 2016 at 10:51 AM, Hannes Reinecke <hare@suse.de> wrote:
> On 11/04/2016 01:45 PM, Sreekanth Reddy wrote:
>>
>> Hi All,
>>
>> From last two days, I was working with my firmware team to get the
>> required info over this issue. Here is my firmware team response
>>
>> "For ATA PASSTHROUGH commands, the IOC SATL will not check for the
>> opcode and will direct it to the drive. So even though ATA PASSTHOUGH
>> has ATA erase to the drive, IOC SATL FW will not know that and as a
>> general logic for all ATA PASSTHOGH commands, IOC FW will pend the
>> upcoming IOs untill the previous ATA PASSTHORUGH completes. This is as
>> per the SAT specification for SAS controllers and we can't compare it
>> with the SATA controllers in the on board that have full fledge SATA
>> implementation".
>>
>> So this is an expected behavior from our HBA firmware. i.e. it will
>> pend the subsequent commands if any ATA PASSTHROUGH command is going
>> on. So their is no issue with the FW.
>>
> But is there a way to figure out if the firmware / SATL layer is busy
> processing requests?
>
> With 'real' ATA HBAs these issue doesn't occur, as the ATA erase command is
> a non-queued command, and hence the next command automatically has to wait
> for the erase command to complete.
> But this wait happens as the ATA HBA returns 'BUSY', and the linux I/O stack
> will then reset the timeout for all consecutive commands.
>
> With mpt3sas _all_ commands are queued, so if there is a long-running I/O
> command all other commands already in the queue will time out.
>
> Which is at least a very awkward behaviour.
>
> Checking with SAT-3 (section 6.2.4: Commands the SATL queues internally) the
> implemented behaviour is standards conformant, although the standard also
> allows for returning 'TASK SET FULL' or 'BUSY' in these cases.
> Doing so would nicely solve this issue.
>
>> Today I have tried the same test case on my local setup. i.e. I have
>> issued a secure erase command using hdparm utility and observed the
>> same issue on 4.2.3-300.fc23.x86_64 kernel.
>>
>> Then after browsing over this issue, I found that some people are
>> recommending to enable 'CONFIG_IDE_TASK_IOCTL' Kconfig flag. I had a
>> compiled 4.4.0 kernel, so I have enabled this CONFIG_IDE_TASK_IOCTL
>> and recompiled this 4.4.0 kernel and booted in to this kernel. Then I
>> tried same test case and I haven't observed this issue and secure
>> erase operation was completed successfully.
>>
>> So, can you please try once with CONFIG_IDE_TASK_IOCTL enabled.
>>
> Errm.
> CONFIG_IDE_TASK_IOCTL is for the old IDE subsystem, which isn't in use here.
> So this option does not make a difference when using mpt3sas, as this is a
> 'real' SCSI driver which never calls out into any of these subsystems.
>
> I would be _VERY_ much surprised if that would make a difference.
>
> The reason why this behaviour did go unnoticed with older kernels was that a
> command timeout would trigger SCSI EH to engage, and that in turn required
> all outstanding commands to complete.
> So by the time SCSI EH started the ERASE command was complete, and a retry
> of the timed-out commands would work.

Indeed, when retesting with CONFIG_IDE_TASK_IOCTL=y and reverting the
fix, the bug is back.

Thanks,
Andrey
>
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke                   zSeries & Storage
> hare@suse.de                          +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
  2016-11-05 13:17                 ` Andrey Grodzovsky
@ 2016-11-10 12:07                   ` Sreekanth Reddy
  2016-11-10 13:42                       ` Andrey Grodzovsky
  0 siblings, 1 reply; 31+ messages in thread
From: Sreekanth Reddy @ 2016-11-10 12:07 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Hannes Reinecke, Igor Rybak, Ezra Kohavi, PDL-MPT-FUSIONLINUX,
	linux-scsi, Sathya Prakash, Chaitra P B,
	Suganath Prabu Subramani, stable

On Sat, Nov 5, 2016 at 6:47 PM, Andrey Grodzovsky <andrey2805@gmail.com> wrote:
> On Fri, Nov 4, 2016 at 10:51 AM, Hannes Reinecke <hare@suse.de> wrote:
>> On 11/04/2016 01:45 PM, Sreekanth Reddy wrote:
>>>
>>> Hi All,
>>>
>>> From last two days, I was working with my firmware team to get the
>>> required info over this issue. Here is my firmware team response
>>>
>>> "For ATA PASSTHROUGH commands, the IOC SATL will not check for the
>>> opcode and will direct it to the drive. So even though ATA PASSTHOUGH
>>> has ATA erase to the drive, IOC SATL FW will not know that and as a
>>> general logic for all ATA PASSTHOGH commands, IOC FW will pend the
>>> upcoming IOs untill the previous ATA PASSTHORUGH completes. This is as
>>> per the SAT specification for SAS controllers and we can't compare it
>>> with the SATA controllers in the on board that have full fledge SATA
>>> implementation".
>>>
>>> So this is an expected behavior from our HBA firmware. i.e. it will
>>> pend the subsequent commands if any ATA PASSTHROUGH command is going
>>> on. So their is no issue with the FW.
>>>
>> But is there a way to figure out if the firmware / SATL layer is busy
>> processing requests?
>>
>> With 'real' ATA HBAs these issue doesn't occur, as the ATA erase command is
>> a non-queued command, and hence the next command automatically has to wait
>> for the erase command to complete.
>> But this wait happens as the ATA HBA returns 'BUSY', and the linux I/O stack
>> will then reset the timeout for all consecutive commands.
>>
>> With mpt3sas _all_ commands are queued, so if there is a long-running I/O
>> command all other commands already in the queue will time out.
>>
>> Which is at least a very awkward behaviour.
>>
>> Checking with SAT-3 (section 6.2.4: Commands the SATL queues internally) the
>> implemented behaviour is standards conformant, although the standard also
>> allows for returning 'TASK SET FULL' or 'BUSY' in these cases.
>> Doing so would nicely solve this issue.
>>
>>> Today I have tried the same test case on my local setup. i.e. I have
>>> issued a secure erase command using hdparm utility and observed the
>>> same issue on 4.2.3-300.fc23.x86_64 kernel.
>>>
>>> Then after browsing over this issue, I found that some people are
>>> recommending to enable 'CONFIG_IDE_TASK_IOCTL' Kconfig flag. I had a
>>> compiled 4.4.0 kernel, so I have enabled this CONFIG_IDE_TASK_IOCTL
>>> and recompiled this 4.4.0 kernel and booted in to this kernel. Then I
>>> tried same test case and I haven't observed this issue and secure
>>> erase operation was completed successfully.
>>>
>>> So, can you please try once with CONFIG_IDE_TASK_IOCTL enabled.
>>>
>> Errm.
>> CONFIG_IDE_TASK_IOCTL is for the old IDE subsystem, which isn't in use here.
>> So this option does not make a difference when using mpt3sas, as this is a
>> 'real' SCSI driver which never calls out into any of these subsystems.
>>
>> I would be _VERY_ much surprised if that would make a difference.
>>
>> The reason why this behaviour did go unnoticed with older kernels was that a
>> command timeout would trigger SCSI EH to engage, and that in turn required
>> all outstanding commands to complete.
>> So by the time SCSI EH started the ERASE command was complete, and a retry
>> of the timed-out commands would work.
>
> Indeed, when retesting with CONFIG_IDE_TASK_IOCTL=y and. reverting the
> fix the bug is back.
>
> Thanks,
> Andrey

Hi Andrey,

We are fine with this patch, with the few changes below:

1. Please remove the comment below. It is not a bug in the firmware;
it is designed like that.

/* This is a work around for a bug with LSI Fusion MPT SAS2 when
* pefroming secure erase. Due to the verly long time the operation
* takes commands issued during the erase will time out and will trigger
* execution of abort hook. This leads to device reset and premature
* termination of the secured erase.
*/

2. Use the SCSI command opcode definitions instead of raw values, so replace the line below

return (scmd->cmnd[0] == 0xa1 || scmd->cmnd[0] == 0x85);
as
return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);

3. Please correct the alignment of the comment below:

 /**
     * Lock the device for any subsequent command until
     * command is done.
     */

Thanks,
Sreekanth

>>
>>
>> Cheers,
>>
>> Hannes
>> --
>> Dr. Hannes Reinecke                   zSeries & Storage
>> hare@suse.de                          +49 911 74053 688
>> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
>> GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v3)
  2016-11-10 12:07                   ` Sreekanth Reddy
@ 2016-11-10 13:42                       ` Andrey Grodzovsky
  0 siblings, 0 replies; 31+ messages in thread
From: Andrey Grodzovsky @ 2016-11-10 13:42 UTC (permalink / raw)
  To: MPT-FusionLinux.pdl
  Cc: igor, ezra, Andrey Grodzovsky, linux-scsi, Sathya Prakash,
	Chaitra P B, Suganath Prabu Subramani, Sreekanth Reddy,
	Hannes Reinecke, stable

Problem:
This is a workaround for a bug with LSI Fusion MPT SAS2 when
performing secure erase. Due to the very long time the operation
takes, commands issued during the erase will time out and will trigger
execution of the abort hook. Even though the abort hook is called for
the specific command which timed out, this leads to an entire device halt
(scsi_state terminated) and premature termination of the secure erase.

Fix:
Set device state to busy while the erase is in progress to reject any
incoming commands until the erase is done. The device is blocked anyway
during this time and cannot execute any other command.
More data and logs can be found here -
https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view

v2: Update according to example patch by Hannes Reinecke to apply
the blocking logic to any ATA 12/16 command.

v3: Use SCSI command opcode definitions instead of raw values and
correct indentation.

Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
Cc: <linux-scsi@vger.kernel.org>
Cc: Sathya Prakash <sathya.prakash@broadcom.com>
Cc: Chaitra P B <chaitra.basappa@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: <stable@vger.kernel.org>
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 5a97e32..320f16c 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3500,6 +3500,10 @@ _scsih_eedp_error_handling(struct scsi_cmnd *scmd, u16 ioc_status)
 	    SAM_STAT_CHECK_CONDITION;
 }
 
+static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
+{
+	return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);
+}
 
 /**
  * _scsih_qcmd - main scsi request entry point
@@ -3528,6 +3532,14 @@ _scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
 		scsi_print_command(scmd);
 #endif
 
+	/*
+	 * Lock the device for any subsequent command until
+	 * command is done.
+	 */
+	if (ata_12_16_cmd(scmd))
+		scsi_internal_device_block(scmd->device);
+
+
 	sas_device_priv_data = scmd->device->hostdata;
 	if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
 		scmd->result = DID_NO_CONNECT << 16;
@@ -4062,6 +4074,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
 	if (scmd == NULL)
 		return 1;
 
+	if (ata_12_16_cmd(scmd))
+		scsi_internal_device_unblock(scmd->device, SDEV_RUNNING);
+
+
 	mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
 
 	if (mpi_reply == NULL) {
-- 
2.1.4
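
The block/unblock pairing in the patch above can be modeled in userspace
as a rough sketch. Everything prefixed with model_ is hypothetical and
only mimics the idea: the submit path marks the device blocked when it
sees an ATA pass-through opcode, further commands are held back while the
device is blocked, and the completion path for that opcode clears the
block. The opcode values are those of the kernel's ATA_12 and ATA_16
definitions; the real scsi_internal_device_block() and
scsi_internal_device_unblock() calls do more than flip a flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define ATA_12 0xa1	/* ATA PASS-THROUGH (12) */
#define ATA_16 0x85	/* ATA PASS-THROUGH (16) */

/* Hypothetical stand-in for the per-device state the mid-layer keeps. */
struct model_device {
	bool blocked;
};

enum model_result { MODEL_DISPATCHED, MODEL_HELD };

static bool model_is_ata_passthrough(uint8_t opcode)
{
	return opcode == ATA_12 || opcode == ATA_16;
}

/* Submit path: hold everything while blocked; block on pass-through. */
static enum model_result model_submit(struct model_device *dev, uint8_t opcode)
{
	if (dev->blocked)
		return MODEL_HELD;	/* not sent, so no timer starts for it */
	if (model_is_ata_passthrough(opcode))
		dev->blocked = true;	/* nothing else goes out until it completes */
	return MODEL_DISPATCHED;
}

/* Completion path: a finished pass-through command lifts the block. */
static void model_complete(struct model_device *dev, uint8_t opcode)
{
	if (model_is_ata_passthrough(opcode))
		dev->blocked = false;
}
```

In this model a READ(10) (opcode 0x28) submitted after an ATA_16 command
is held until model_complete() runs for the ATA_16 command, after which
it is dispatched normally.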


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v3)
@ 2016-11-10 13:42                       ` Andrey Grodzovsky
  0 siblings, 0 replies; 31+ messages in thread
From: Andrey Grodzovsky @ 2016-11-10 13:42 UTC (permalink / raw)
  To: MPT-FusionLinux.pdl
  Cc: igor, ezra, Andrey Grodzovsky, linux-scsi, Sathya Prakash,
	Chaitra P B, Suganath Prabu Subramani, Sreekanth Reddy,
	Hannes Reinecke, stable

Problem:
This is a work around for a bug with LSI Fusion MPT SAS2 when
pefroming secure erase. Due to the very long time the operation
takes commands issued during the erase will time out and will trigger
execution of abort hook. Even though the abort hook is called for
the specifc command which timed out this leads to entire device halt
(scsi_state terminated) and premature termination of the secured erase.

Fix:
Set device state to busy while erase in progress to reject any incoming
commands until the erase is done. The device is blocked any way during
this time and cannot execute any other command.
More data and logs can be found here -
https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view

v2: Update according to example patch by Hannes Reinecke to apply
the blocking logic to any ATA 12/16 command.

v3: Use SCSI commands opcodes definitions instead of value and
correct identation.

Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
Cc: <linux-scsi@vger.kernel.org>
Cc: Sathya Prakash <sathya.prakash@broadcom.com>
Cc: Chaitra P B <chaitra.basappa@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: <stable@vger.kernel.org>
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 5a97e32..320f16c 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3500,6 +3500,10 @@ _scsih_eedp_error_handling(struct scsi_cmnd *scmd, u16 ioc_status)
 	    SAM_STAT_CHECK_CONDITION;
 }
 
+static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
+{
+   return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);
+}
 
 /**
  * _scsih_qcmd - main scsi request entry point
@@ -3528,6 +3532,14 @@ _scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
 		scsi_print_command(scmd);
 #endif
 
+   	/**
+	* Lock the device for any subsequent command until
+	* command is done.
+	*/
+	if (ata_12_16_cmd(scmd))
+		scsi_internal_device_block(scmd->device);
+
+
 	sas_device_priv_data = scmd->device->hostdata;
 	if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
 		scmd->result = DID_NO_CONNECT << 16;
@@ -4062,6 +4074,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
 	if (scmd == NULL)
 		return 1;
 
+	if (ata_12_16_cmd(scmd))
+		scsi_internal_device_unblock(scmd->device, SDEV_RUNNING);
+
+
 	mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
 
 	if (mpi_reply == NULL) {
-- 
2.1.4

* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v3)
  2016-11-10 13:42                       ` Andrey Grodzovsky
  (?)
@ 2016-11-10 13:54                       ` Greg KH
  2016-11-10 14:35                           ` Andrey Grodzovsky
  -1 siblings, 1 reply; 31+ messages in thread
From: Greg KH @ 2016-11-10 13:54 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: MPT-FusionLinux.pdl, igor, ezra, linux-scsi, Sathya Prakash,
	Chaitra P B, Suganath Prabu Subramani, Sreekanth Reddy,
	Hannes Reinecke, stable

On Thu, Nov 10, 2016 at 08:42:52AM -0500, Andrey Grodzovsky wrote:
> Problem:
> This is a work around for a bug with LSI Fusion MPT SAS2 when
> performing secure erase. Due to the very long time the operation
> takes, commands issued during the erase will time out and will trigger
> execution of the abort hook. Even though the abort hook is called for
> the specific command which timed out, this leads to an entire device halt
> (scsi_state terminated) and premature termination of the secure erase.
> 
> Fix:
> Set the device state to busy while the erase is in progress to reject any
> incoming commands until the erase is done. The device is blocked anyway
> during this time and cannot execute any other command.
> More data and logs can be found here -
> https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view
> 
> v2: Update according to example patch by Hannes Reinecke to apply
> the blocking logic to any ATA 12/16 command.
> 
> v3: Use SCSI command opcode definitions instead of values and
> correct indentation.
> 
> Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
> Cc: <linux-scsi@vger.kernel.org>
> Cc: Sathya Prakash <sathya.prakash@broadcom.com>
> Cc: Chaitra P B <chaitra.basappa@broadcom.com>
> Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
> Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: <stable@vger.kernel.org>
> ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 5a97e32..320f16c 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -3500,6 +3500,10 @@ _scsih_eedp_error_handling(struct scsi_cmnd *scmd, u16 ioc_status)
>  	    SAM_STAT_CHECK_CONDITION;
>  }
>  
> +static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
> +{
> +   return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);
> +}

Please always run your patches through checkpatch.pl so you don't get a
grumpy maintainer emailing you and telling you to run your patches
through checkpatch.pl...


* [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v4)
  2016-11-10 13:54                       ` Greg KH
@ 2016-11-10 14:35                           ` Andrey Grodzovsky
  0 siblings, 0 replies; 31+ messages in thread
From: Andrey Grodzovsky @ 2016-11-10 14:35 UTC (permalink / raw)
  To: MPT-FusionLinux.pdl
  Cc: igor, ezra, Andrey Grodzovsky, linux-scsi, Sathya Prakash,
	Chaitra P B, Suganath Prabu Subramani, Sreekanth Reddy,
	Hannes Reinecke, stable

Problem:
This is a work around for a bug with LSI Fusion MPT SAS2 when
performing secure erase. Due to the very long time the operation
takes, commands issued during the erase will time out and will trigger
execution of the abort hook. Even though the abort hook is called for
the specific command which timed out, this leads to an entire device halt
(scsi_state terminated) and premature termination of the secure erase.

Fix:
Set the device state to busy while the erase is in progress to reject any
incoming commands until the erase is done. The device is blocked anyway
during this time and cannot execute any other command.
More data and logs can be found here -
https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view

v2: Update according to example patch by Hannes Reinecke to apply
the blocking logic to any ATA 12/16 command.

v3: Use SCSI command opcode definitions instead of values and
correct indentation.

v4: Fix checkpatch errors and warnings.

Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
Cc: <linux-scsi@vger.kernel.org>
Cc: Sathya Prakash <sathya.prakash@broadcom.com>
Cc: Chaitra P B <chaitra.basappa@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: <stable@vger.kernel.org>
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 5a97e32..c032319 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3500,6 +3500,10 @@ _scsih_eedp_error_handling(struct scsi_cmnd *scmd, u16 ioc_status)
 	    SAM_STAT_CHECK_CONDITION;
 }
 
+static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
+{
+	return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);
+}
 
 /**
  * _scsih_qcmd - main scsi request entry point
@@ -3528,6 +3532,14 @@ _scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
 		scsi_print_command(scmd);
 #endif
 
+	/**
+	* Lock the device for any subsequent command until
+	* command is done.
+	*/
+	if (ata_12_16_cmd(scmd))
+		scsi_internal_device_block(scmd->device);
+
+
 	sas_device_priv_data = scmd->device->hostdata;
 	if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
 		scmd->result = DID_NO_CONNECT << 16;
@@ -4062,6 +4074,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
 	if (scmd == NULL)
 		return 1;
 
+	if (ata_12_16_cmd(scmd))
+		scsi_internal_device_unblock(scmd->device, SDEV_RUNNING);
+
+
 	mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
 
 	if (mpi_reply == NULL) {
-- 
2.1.4
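
As a minimal sketch of the logic in the hunks above, the pairing of
block-on-queue and unblock-on-completion can be modeled in plain user-space C.
The model_device and model_cmd types and the model_queue()/model_complete()
names are hypothetical stand-ins for struct scsi_device, struct scsi_cmnd,
_scsih_qcmd() and _scsih_io_done(); only the opcode test mirrors the patch,
with ATA_12 and ATA_16 given the values the kernel's SCSI headers define
(0xa1 and 0x85). Nothing here touches real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Opcode values as defined by the kernel's SCSI protocol headers. */
#define ATA_12 0xa1
#define ATA_16 0x85

/* Hypothetical stand-ins for struct scsi_device and struct scsi_cmnd. */
struct model_device {
	bool blocked;
};

struct model_cmd {
	const uint8_t *cdb;          /* command descriptor block, cdb[0] is the opcode */
	struct model_device *dev;
};

/* Same test as the helper added by the patch. */
static inline bool ata_12_16_cmd(const struct model_cmd *cmd)
{
	return cmd->cdb[0] == ATA_12 || cmd->cdb[0] == ATA_16;
}

/* Models the hunk added to _scsih_qcmd(): block on ATA pass-through. */
static inline void model_queue(struct model_cmd *cmd)
{
	assert(cmd && cmd->cdb && cmd->dev);
	if (ata_12_16_cmd(cmd))
		cmd->dev->blocked = true;   /* scsi_internal_device_block() in the patch */
}

/* Models the hunk added to _scsih_io_done(): unblock on completion. */
static inline void model_complete(struct model_cmd *cmd)
{
	assert(cmd && cmd->cdb && cmd->dev);
	if (ata_12_16_cmd(cmd))
		cmd->dev->blocked = false;  /* scsi_internal_device_unblock(..., SDEV_RUNNING) */
}
```

Queuing an ATA_16 command through this model marks the device blocked, and
completing that same command clears the flag; any other opcode leaves the
flag untouched.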


* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v4)
  2016-11-10 14:35                           ` Andrey Grodzovsky
  (?)
@ 2016-11-11  4:38                           ` Sreekanth Reddy
  2018-04-23 18:28                             ` Igor Rybak
  -1 siblings, 1 reply; 31+ messages in thread
From: Sreekanth Reddy @ 2016-11-11  4:38 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: PDL-MPT-FUSIONLINUX, Igor Rybak, Ezra Kohavi, linux-scsi,
	Sathya Prakash, Chaitra P B, Suganath Prabu Subramani,
	Hannes Reinecke, stable

On Thu, Nov 10, 2016 at 8:05 PM, Andrey Grodzovsky <andrey2805@gmail.com> wrote:
> Problem:
> This is a work around for a bug with LSI Fusion MPT SAS2 when
> performing secure erase. Due to the very long time the operation
> takes, commands issued during the erase will time out and will trigger
> execution of the abort hook. Even though the abort hook is called for
> the specific command which timed out, this leads to an entire device halt
> (scsi_state terminated) and premature termination of the secure erase.
>
> Fix:
> Set the device state to busy while the erase is in progress to reject any
> incoming commands until the erase is done. The device is blocked anyway
> during this time and cannot execute any other command.
> More data and logs can be found here -
> https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view
>
> v2: Update according to example patch by Hannes Reinecke to apply
> the blocking logic to any ATA 12/16 command.
>
> v3: Use SCSI command opcode definitions instead of values and
> correct indentation.
>
> v4: Fix checkpatch errors and warnings.
>
> Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
> Cc: <linux-scsi@vger.kernel.org>
> Cc: Sathya Prakash <sathya.prakash@broadcom.com>
> Cc: Chaitra P B <chaitra.basappa@broadcom.com>
> Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
> Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: <stable@vger.kernel.org>

Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>

> ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 5a97e32..c032319 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -3500,6 +3500,10 @@ _scsih_eedp_error_handling(struct scsi_cmnd *scmd, u16 ioc_status)
>             SAM_STAT_CHECK_CONDITION;
>  }
>
> +static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
> +{
> +       return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);
> +}
>
>  /**
>   * _scsih_qcmd - main scsi request entry point
> @@ -3528,6 +3532,14 @@ _scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
>                 scsi_print_command(scmd);
>  #endif
>
> +       /**
> +       * Lock the device for any subsequent command until
> +       * command is done.
> +       */
> +       if (ata_12_16_cmd(scmd))
> +               scsi_internal_device_block(scmd->device);
> +
> +
>         sas_device_priv_data = scmd->device->hostdata;
>         if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
>                 scmd->result = DID_NO_CONNECT << 16;
> @@ -4062,6 +4074,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>         if (scmd == NULL)
>                 return 1;
>
> +       if (ata_12_16_cmd(scmd))
> +               scsi_internal_device_unblock(scmd->device, SDEV_RUNNING);
> +
> +
>         mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
>
>         if (mpi_reply == NULL) {
> --
> 2.1.4
>

* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v4)
  2016-11-10 14:35                           ` Andrey Grodzovsky
@ 2016-11-12 15:29                             ` Martin K. Petersen
  -1 siblings, 0 replies; 31+ messages in thread
From: Martin K. Petersen @ 2016-11-12 15:29 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: MPT-FusionLinux.pdl, igor, ezra, linux-scsi, Sathya Prakash,
	Chaitra P B, Suganath Prabu Subramani, Sreekanth Reddy,
	Hannes Reinecke, stable

>>>>> "Andrey" == Andrey Grodzovsky <andrey2805@gmail.com> writes:

Andrey,

Andrey> Problem: This is a work around for a bug with LSI Fusion MPT
Andrey> SAS2 when performing secure erase. Due to the very long time the
Andrey> operation takes, commands issued during the erase will time out
Andrey> and will trigger execution of the abort hook. Even though the abort
Andrey> hook is called for the specific command which timed out, this
Andrey> leads to an entire device halt (scsi_state terminated) and
Andrey> premature termination of the secure erase.

This patch didn't apply to the SCSI tree. I merged it into
4.9/scsi-fixes by hand.

Also, please check Documentation/SubmittingPatches for future
submissions. Patch version goes inside the [PATCH foo/bar] brackets and
patch changelog entries below "---" separator.

Thanks!
Martin

-- 
Martin K. Petersen	Oracle Linux Engineering

* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v4)
  2016-11-12 15:29                             ` Martin K. Petersen
  (?)
@ 2016-11-12 16:36                             ` Andrey Grodzovsky
  2016-11-14 23:30                               ` Martin K. Petersen
  -1 siblings, 1 reply; 31+ messages in thread
From: Andrey Grodzovsky @ 2016-11-12 16:36 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: PDL-MPT-FUSIONLINUX, Igor Rybak, Ezra Kohavi, linux-scsi,
	Sathya Prakash, Chaitra P B, Suganath Prabu Subramani,
	Sreekanth Reddy, Hannes Reinecke, stable

On Sat, Nov 12, 2016 at 10:29 AM, Martin K. Petersen
<martin.petersen@oracle.com> wrote:
>>>>>> "Andrey" == Andrey Grodzovsky <andrey2805@gmail.com> writes:
>
> Andrey,
>
> Andrey> Problem: This is a work around for a bug with LSI Fusion MPT
> Andrey> SAS2 when performing secure erase. Due to the very long time the
> Andrey> operation takes, commands issued during the erase will time out
> Andrey> and will trigger execution of the abort hook. Even though the abort
> Andrey> hook is called for the specific command which timed out, this
> Andrey> leads to an entire device halt (scsi_state terminated) and
> Andrey> premature termination of the secure erase.
>
> This patch didn't apply to the SCSI tree. I merged it into
> 4.9/scsi-fixes by hand.

Sorry about that, and thanks. Next time I will work off of the latest tree.
Regarding older code where there is still a separate mpt2sas driver, should
a separate patch be done, or will this fix be ported there?

Thanks,
Andrey
>
> Also, please check Documentation/SubmittingPatches for future
> submissions. Patch version goes inside the [PATCH foo/bar] brackets and
> patch changelog entries below "---" separator.
>
> Thanks!
> Martin
>
> --
> Martin K. Petersen      Oracle Linux Engineering

* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v4)
  2016-11-12 16:36                             ` Andrey Grodzovsky
@ 2016-11-14 23:30                               ` Martin K. Petersen
  2016-11-17  1:15                                 ` [PATCH] [SCSI] mpt2sas: Fix secure erase premature termination Andrey Grodzovsky
  0 siblings, 1 reply; 31+ messages in thread
From: Martin K. Petersen @ 2016-11-14 23:30 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: Martin K. Petersen, PDL-MPT-FUSIONLINUX, Igor Rybak, Ezra Kohavi,
	linux-scsi, Sathya Prakash, Chaitra P B,
	Suganath Prabu Subramani, Sreekanth Reddy, Hannes Reinecke,
	stable

>>>>> "Andrey" == Andrey Grodzovsky <andrey2805@gmail.com> writes:

Andrey,

Andrey> Regarding older code where there is still a separate mpt2sas
Andrey> driver, should a separate patch be done, or will this fix be
Andrey> ported there?

Feel free to submit an mpt2sas patch to the pre-4.4 stable trees.

-- 
Martin K. Petersen	Oracle Linux Engineering

* [PATCH] [SCSI] mpt2sas: Fix secure erase premature termination
  2016-11-14 23:30                               ` Martin K. Petersen
@ 2016-11-17  1:15                                 ` Andrey Grodzovsky
  2016-11-17  7:11                                   ` Greg KH
  0 siblings, 1 reply; 31+ messages in thread
From: Andrey Grodzovsky @ 2016-11-17  1:15 UTC (permalink / raw)
  To: stable
  Cc: Andrey Grodzovsky, Sreekanth Reddy, Hannes Reinecke,
	PDL-MPT-FUSIONLINUX, Martin K. Petersen

Problem:
This is a work around for a bug with LSI Fusion MPT SAS2 when
performing secure erase. Due to the very long time the operation
takes, commands issued during the erase will time out and will trigger
execution of the abort hook. Even though the abort hook is called for
the specific command which timed out, this leads to an entire device halt
(scsi_state terminated) and premature termination of the secure erase.

Fix:
Set the device state to busy while the erase is in progress to reject any
incoming commands until the erase is done. The device is blocked anyway
during this time and cannot execute any other command.
More data and logs can be found here -
https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view

P.S.
This is a backport of the same fix for the mpt3sas driver, intended
for pre-4.4 stable trees.

Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: PDL-MPT-FUSIONLINUX <MPT-FusionLinux.pdl@broadcom.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
---
 drivers/scsi/mpt2sas/mpt2sas_scsih.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 3f26147..988c1da 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -3884,6 +3884,11 @@ _scsih_setup_direct_io(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
 	_scsih_scsi_direct_io_set(ioc, smid, 1);
 }
 
+static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
+{
+	return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);
+}
+
 /**
  * _scsih_qcmd - main scsi request entry point
  * @scmd: pointer to scsi command object
@@ -3906,6 +3911,13 @@ _scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
 	u32 mpi_control;
 	u16 smid;
 
+	/**
+	* Lock the device for any subsequent command until
+	* command is done.
+	*/
+	if (ata_12_16_cmd(scmd))
+		scsi_internal_device_block(scmd->device);
+
 	sas_device_priv_data = scmd->device->hostdata;
 	if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
 		scmd->result = DID_NO_CONNECT << 16;
@@ -4447,6 +4459,9 @@ _scsih_io_done(struct MPT2SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
 	if (scmd == NULL)
 		return 1;
 
+	if (ata_12_16_cmd(scmd))
+		scsi_internal_device_unblock(scmd->device, SDEV_RUNNING);
+
 	mpi_request = mpt2sas_base_get_msg_frame(ioc, smid);
 
 	if (mpi_reply == NULL) {
-- 
2.1.4


* Re: [PATCH] [SCSI] mpt2sas: Fix secure erase premature termination
  2016-11-17  1:15                                 ` [PATCH] [SCSI] mpt2sas: Fix secure erase premature termination Andrey Grodzovsky
@ 2016-11-17  7:11                                   ` Greg KH
  0 siblings, 0 replies; 31+ messages in thread
From: Greg KH @ 2016-11-17  7:11 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: stable, Sreekanth Reddy, Hannes Reinecke, PDL-MPT-FUSIONLINUX,
	Martin K. Petersen

On Wed, Nov 16, 2016 at 08:15:08PM -0500, Andrey Grodzovsky wrote:
> Problem:
> This is a work around for a bug with LSI Fusion MPT SAS2 when
> performing secure erase. Due to the very long time the operation
> takes, commands issued during the erase will time out and will trigger
> execution of the abort hook. Even though the abort hook is called for
> the specific command which timed out, this leads to an entire device halt
> (scsi_state terminated) and premature termination of the secure erase.
> 
> Fix:
> Set the device state to busy while the erase is in progress to reject any
> incoming commands until the erase is done. The device is blocked anyway
> during this time and cannot execute any other command.
> More data and logs can be found here -
> https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view
> 
> P.S.
> This is a backport of the same fix for the mpt3sas driver, intended
> for pre-4.4 stable trees.

What is "the same fix"?  What is the git commit id in Linus's tree for
this?

thanks,

greg k-h

* RE: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v4)
  2016-11-11  4:38                           ` Sreekanth Reddy
@ 2018-04-23 18:28                             ` Igor Rybak
  2018-04-24  7:25                               ` Greg KH
  0 siblings, 1 reply; 31+ messages in thread
From: Igor Rybak @ 2018-04-23 18:28 UTC (permalink / raw)
  To: Sreekanth Reddy, Andrey Grodzovsky
  Cc: PDL-MPT-FUSIONLINUX, Ezra Kohavi, linux-scsi, Sathya Prakash,
	Chaitra P B, Suganath Prabu Subramani, Hannes Reinecke, stable

Hi,

We are running kernel 4.4.0-22 and the patch below does not seem to be present in the mpt3sas driver. Can you please confirm?
As a reminder, the patch relates to the ATA Security Erase command, which requires a very long timeout (100 minutes or more) while the drive retains a busy status. The driver should not try to send other commands or reset the drive during that time.

Thanks,

Igor Rybak
CTO
MediaClone Inc
6900 Canby Ave Ste 107
Reseda, CA 91335
USA
+1-818-654-6286
________________________________________
From: Sreekanth Reddy [sreekanth.reddy@broadcom.com]
Sent: Thursday, November 10, 2016 8:38 PM
To: Andrey Grodzovsky
Cc: PDL-MPT-FUSIONLINUX; Igor Rybak; Ezra Kohavi; linux-scsi@vger.kernel.org; Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; Hannes Reinecke; stable@vger.kernel.org
Subject: Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v4)

On Thu, Nov 10, 2016 at 8:05 PM, Andrey Grodzovsky <andrey2805@gmail.com> wrote:
> Problem:
> This is a work around for a bug with LSI Fusion MPT SAS2 when
> performing secure erase. Due to the very long time the operation
> takes, commands issued during the erase will time out and will trigger
> execution of the abort hook. Even though the abort hook is called for
> the specific command which timed out, this leads to an entire device halt
> (scsi_state terminated) and premature termination of the secure erase.
>
> Fix:
> Set the device state to busy while the erase is in progress to reject any
> incoming commands until the erase is done. The device is blocked anyway
> during this time and cannot execute any other command.
> More data and logs can be found here -
> https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view
>
> v2: Update according to example patch by Hannes Reinecke to apply
> the blocking logic to any ATA 12/16 command.
>
> v3: Use SCSI command opcode definitions instead of values and
> correct indentation.
>
> v4: Fix checkpatch errors and warnings.
>
> Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
> Cc: <linux-scsi@vger.kernel.org>
> Cc: Sathya Prakash <sathya.prakash@broadcom.com>
> Cc: Chaitra P B <chaitra.basappa@broadcom.com>
> Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
> Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
> Cc: Hannes Reinecke <hare@suse.de>
> Cc: <stable@vger.kernel.org>

Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>

> ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 5a97e32..c032319 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -3500,6 +3500,10 @@ _scsih_eedp_error_handling(struct scsi_cmnd *scmd, u16 ioc_status)
>             SAM_STAT_CHECK_CONDITION;
>  }
>
> +static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
> +{
> +       return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);
> +}
>
>  /**
>   * _scsih_qcmd - main scsi request entry point
> @@ -3528,6 +3532,14 @@ _scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
>                 scsi_print_command(scmd);
>  #endif
>
> +       /**
> +       * Lock the device for any subsequent command until
> +       * command is done.
> +       */
> +       if (ata_12_16_cmd(scmd))
> +               scsi_internal_device_block(scmd->device);
> +
> +
>         sas_device_priv_data = scmd->device->hostdata;
>         if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
>                 scmd->result = DID_NO_CONNECT << 16;
> @@ -4062,6 +4074,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>         if (scmd == NULL)
>                 return 1;
>
> +       if (ata_12_16_cmd(scmd))
> +               scsi_internal_device_unblock(scmd->device, SDEV_RUNNING);
> +
> +
>         mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
>
>         if (mpi_reply == NULL) {
> --
> 2.1.4
>

* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v4)
  2018-04-23 18:28                             ` Igor Rybak
@ 2018-04-24  7:25                               ` Greg KH
  0 siblings, 0 replies; 31+ messages in thread
From: Greg KH @ 2018-04-24  7:25 UTC (permalink / raw)
  To: Igor Rybak
  Cc: Sreekanth Reddy, Andrey Grodzovsky, PDL-MPT-FUSIONLINUX,
	Ezra Kohavi, linux-scsi, Sathya Prakash, Chaitra P B,
	Suganath Prabu Subramani, Hannes Reinecke, stable

On Mon, Apr 23, 2018 at 06:28:03PM +0000, Igor Rybak wrote:
> Hi,
> 
> We are running kernel 4.4.0-22 and the patch below does not seem to be present in the mpt3sas driver. Can you please confirm?

Please update your kernel, this patch was in the 4.4.36 kernel release
which came out December 2, 2016, well over a full year ago.

thanks,

greg k-h

* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
  2016-11-04 16:35                   ` Martin K. Petersen
  (?)
@ 2018-04-24  9:09                   ` Steffen Maier
  2018-04-24 12:33                     ` Hannes Reinecke
  -1 siblings, 1 reply; 31+ messages in thread
From: Steffen Maier @ 2018-04-24  9:09 UTC (permalink / raw)
  To: Martin K. Petersen, Hannes Reinecke
  Cc: Sreekanth Reddy, Igor Rybak, Andrey Grodzovsky, Ezra Kohavi,
	PDL-MPT-FUSIONLINUX, linux-scsi, Sathya Prakash, Chaitra P B,
	Suganath Prabu Subramani, stable


On 11/04/2016 05:35 PM, Martin K. Petersen wrote:
>>>>>> "Hannes" == Hannes Reinecke <hare@suse.de> writes:
> 
> Hannes> Checking with SAT-3 (section 6.2.4: Commands the SATL queues
> Hannes> internally) the implemented behaviour is standards conformant,
> Hannes> although the standard also allows for returning 'TASK SET FULL'
> Hannes> or 'BUSY' in these cases.  Doing so would nicely solve this
> Hannes> issue.
> 
> I agree with Hannes that it would be appropriate for the SATL to report
> busy when it makes a non-queued command queueable.

Wouldn't this potentially still cause problems if the secure erase takes
longer than max_retries * scmd_tmo, i.e. the command timing out by
default after 180 seconds, as in
https://www.spinics.net/lists/linux-block/msg24837.html ?

The fix approach here seems to also handle this gracefully.

-- 
Mit freundlichen Grüßen / Kind regards
Steffen Maier

Linux on z Systems Development

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

* Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)
  2018-04-24  9:09                   ` Steffen Maier
@ 2018-04-24 12:33                     ` Hannes Reinecke
  0 siblings, 0 replies; 31+ messages in thread
From: Hannes Reinecke @ 2018-04-24 12:33 UTC (permalink / raw)
  To: Steffen Maier, Martin K. Petersen
  Cc: Sreekanth Reddy, Igor Rybak, Andrey Grodzovsky, Ezra Kohavi,
	PDL-MPT-FUSIONLINUX, linux-scsi, Sathya Prakash, Chaitra P B,
	Suganath Prabu Subramani, stable

On 04/24/2018 11:09 AM, Steffen Maier wrote:
> 
> On 11/04/2016 05:35 PM, Martin K. Petersen wrote:
>>>>>>> "Hannes" == Hannes Reinecke <hare@suse.de> writes:
>>
>> Hannes> Checking with SAT-3 (section 6.2.4: Commands the SATL queues
>> Hannes> internally) the implemented behaviour is standards conformant,
>> Hannes> although the standard also allows for returning 'TASK SET FULL'
>> Hannes> or 'BUSY' in these cases.  Doing so would nicely solve this
>> Hannes> issue.
>>
>> I agree with Hannes that it would be appropriate for the SATL to report
>> busy when it makes a non-queued command queueable.
> 
> Wouldn't this potentially still cause problems if the secure erase takes
> longer than max_retries * scmd_tmo? I.e. the command times out, by
> default after 180 seconds, as in
> https://www.spinics.net/lists/linux-block/msg24837.html.
> 
> The approach taken by this fix also seems to handle that case gracefully.
> 
Well, yes, of course the command will be terminated once it times out.
But secure erase is typically invoked from userspace via sg ioctls, and
it is the responsibility of the application to set an appropriate timeout.
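
As a minimal sketch of what setting the timeout from the application can
look like, assuming the erase command is submitted through the SG_IO
ioctl: the timeout field of struct sg_io_hdr is given in milliseconds.
The helper names (erase_timeout_ms, prepare_sg_io, submit_sg_io) are
hypothetical and only serve as illustration; building the actual erase
CDB is left to the caller.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

/* Hypothetical helper: convert seconds to the millisecond value that
 * sg_io_hdr.timeout expects, saturating instead of overflowing. */
unsigned int erase_timeout_ms(unsigned int seconds)
{
	if (seconds > UINT_MAX / 1000u)
		return UINT_MAX;
	return seconds * 1000u;
}

/* Hypothetical helper: fill an sg_io_hdr for a caller-supplied CDB.
 * The CDB contents (e.g. an ATA pass-through erase) are not built here.
 * Assumption: any payload is sent to the device (data-out), as is the
 * case for erase-style commands. */
void prepare_sg_io(struct sg_io_hdr *hdr,
		   unsigned char *cdb, unsigned char cdb_len,
		   void *data, unsigned int data_len,
		   unsigned char *sense, unsigned char sense_len,
		   unsigned int timeout_s)
{
	assert(hdr != NULL && cdb != NULL);

	memset(hdr, 0, sizeof(*hdr));
	hdr->interface_id = 'S';	/* required by the sg v3 interface */
	hdr->dxfer_direction = data_len ? SG_DXFER_TO_DEV : SG_DXFER_NONE;
	hdr->cmdp = cdb;
	hdr->cmd_len = cdb_len;
	hdr->dxferp = data;
	hdr->dxfer_len = data_len;
	hdr->sbp = sense;
	hdr->mx_sb_len = sense_len;
	/* The application, not the kernel default, decides how long the
	 * long-running command may take. */
	hdr->timeout = erase_timeout_ms(timeout_s);
}

/* Submit the prepared request on an already opened device node.
 * Returns the ioctl result: 0 on success, -1 with errno set on error. */
int submit_sg_io(int fd, struct sg_io_hdr *hdr)
{
	return ioctl(fd, SG_IO, hdr);
}
```

A caller would build its erase CDB, call prepare_sg_io(&hdr, cdb,
cdb_len, data, data_len, sense, sizeof(sense), 7200) and then
submit_sg_io(fd, &hdr) on an open device node. The 7200 seconds are only
a placeholder; a real tool would presumably derive the value from the
drive's own erase time estimate.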

Cheers,

Hannes


Thread overview: 31+ messages
2016-10-30 12:43 [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination Andrey Grodzovsky
2016-10-30 12:43 ` Andrey Grodzovsky
2016-10-30 18:43 ` Hannes Reinecke
2016-10-30 18:43   ` Hannes Reinecke
2016-11-02  0:09   ` [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2) Andrey Grodzovsky
2016-11-02  0:09     ` Andrey Grodzovsky
2016-11-02  2:07     ` Hannes Reinecke
2016-11-02  2:07       ` Hannes Reinecke
2016-11-02 10:05       ` Sreekanth Reddy
     [not found]         ` <CAJphD_qrQftfCOn_uzXCfdX=Xv9BYvVQ60AZ4DR2rc3gfXQa_Q@mail.gmail.com>
     [not found]           ` <30w645ulbhlofxrk1h4a9q3s.1478144944778@email.android.com>
2016-11-04 12:45             ` Sreekanth Reddy
2016-11-04 14:51               ` Hannes Reinecke
2016-11-04 16:35                 ` Martin K. Petersen
2016-11-04 16:35                   ` Martin K. Petersen
2018-04-24  9:09                   ` Steffen Maier
2018-04-24 12:33                     ` Hannes Reinecke
2016-11-05 13:17                 ` Andrey Grodzovsky
2016-11-10 12:07                   ` Sreekanth Reddy
2016-11-10 13:42                     ` [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v3) Andrey Grodzovsky
2016-11-10 13:42                       ` Andrey Grodzovsky
2016-11-10 13:54                       ` Greg KH
2016-11-10 14:35                         ` [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v4) Andrey Grodzovsky
2016-11-10 14:35                           ` Andrey Grodzovsky
2016-11-11  4:38                           ` Sreekanth Reddy
2018-04-23 18:28                             ` Igor Rybak
2018-04-24  7:25                               ` Greg KH
2016-11-12 15:29                           ` Martin K. Petersen
2016-11-12 15:29                             ` Martin K. Petersen
2016-11-12 16:36                             ` Andrey Grodzovsky
2016-11-14 23:30                               ` Martin K. Petersen
2016-11-17  1:15                                 ` [PATCH] [SCSI] mpt2sas: Fix secure erase premature termination Andrey Grodzovsky
2016-11-17  7:11                                   ` Greg KH
