All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Improve error handling
@ 2020-09-10  7:39 Damien Le Moal
  2020-09-10  7:39 ` [PATCH 1/3] scsi: Cleanup scsi_noretry_cmd() Damien Le Moal
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Damien Le Moal @ 2020-09-10  7:39 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen

A small series to improve command error hadling.

The first patch is a simple code cleanup. The second patch updates
asc/ascq sense codes list. Finally, the third patch adds zone resource
errors handling for zoned block deives to return BLK_STS_DEV_RESOURCE,
similarly to what the NVMe driver does for ZNS devices.

Damien Le Moal (3):
  scsi: Cleanup scsi_noretry_cmd()
  scsi: update additional sense codes list
  scsi: handle zone resources errors

 drivers/scsi/scsi_error.c  |  4 +--
 drivers/scsi/scsi_lib.c    | 12 +++++++++
 drivers/scsi/sense_codes.h | 54 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 67 insertions(+), 3 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/3] scsi: Cleanup scsi_noretry_cmd()
  2020-09-10  7:39 [PATCH 0/3] Improve error handling Damien Le Moal
@ 2020-09-10  7:39 ` Damien Le Moal
  2020-09-10 17:47   ` Bart Van Assche
  2020-09-10  7:39 ` [PATCH 2/3] scsi: update additional sense codes list Damien Le Moal
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Damien Le Moal @ 2020-09-10  7:39 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen

No need for else after return.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/scsi/scsi_error.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 927b1e641842..5f3726abed78 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1755,8 +1755,8 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)
 	if (scmd->request->cmd_flags & REQ_FAILFAST_DEV ||
 	    blk_rq_is_passthrough(scmd->request))
 		return 1;
-	else
-		return 0;
+
+	return 0;
 }
 
 /**
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/3] scsi: update additional sense codes list
  2020-09-10  7:39 [PATCH 0/3] Improve error handling Damien Le Moal
  2020-09-10  7:39 ` [PATCH 1/3] scsi: Cleanup scsi_noretry_cmd() Damien Le Moal
@ 2020-09-10  7:39 ` Damien Le Moal
  2020-09-10  7:39 ` [PATCH 3/3] scsi: handle zone resources errors Damien Le Moal
  2020-09-10 14:11 ` [PATCH 0/3] Improve error handling Himanshu Madhani
  3 siblings, 0 replies; 10+ messages in thread
From: Damien Le Moal @ 2020-09-10  7:39 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen

Add missing Additional Sense Codes listed in
http://www.t10.org/lists/asc-num.txt.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/scsi/sense_codes.h | 54 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 53 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/sense_codes.h b/drivers/scsi/sense_codes.h
index 201a536688de..805d4c13d070 100644
--- a/drivers/scsi/sense_codes.h
+++ b/drivers/scsi/sense_codes.h
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 /*
  * The canonical list of T10 Additional Sense Codes is available at:
- * http://www.t10.org/lists/asc-num.txt [most recent: 20141221]
+ * http://www.t10.org/lists/asc-num.txt [most recent: 20200817]
  */
 
 SENSE_CODE(0x0000, "No additional sense information")
@@ -29,6 +29,7 @@ SENSE_CODE(0x001E, "Conflicting SA creation request")
 SENSE_CODE(0x001F, "Logical unit transitioning to another power condition")
 SENSE_CODE(0x0020, "Extended copy information available")
 SENSE_CODE(0x0021, "Atomic command aborted due to ACA")
+SENSE_CODE(0x0022, "Deferred microcode is pending")
 
 SENSE_CODE(0x0100, "No index/sector signal")
 
@@ -72,6 +73,9 @@ SENSE_CODE(0x041F, "Logical unit not ready, microcode download required")
 SENSE_CODE(0x0420, "Logical unit not ready, logical unit reset required")
 SENSE_CODE(0x0421, "Logical unit not ready, hard reset required")
 SENSE_CODE(0x0422, "Logical unit not ready, power cycle required")
+SENSE_CODE(0x0423, "Logical unit not ready, affiliation required")
+SENSE_CODE(0x0424, "Depopulation in progress")
+SENSE_CODE(0x0425, "Depopulation restoration in progress")
 
 SENSE_CODE(0x0500, "Logical unit does not respond to selection")
 
@@ -104,6 +108,17 @@ SENSE_CODE(0x0B06, "Warning - non-volatile cache now volatile")
 SENSE_CODE(0x0B07, "Warning - degraded power to non-volatile cache")
 SENSE_CODE(0x0B08, "Warning - power loss expected")
 SENSE_CODE(0x0B09, "Warning - device statistics notification active")
+SENSE_CODE(0x0B0A, "Warning - high critical temperature limit exceeded")
+SENSE_CODE(0x0B0B, "Warning - low critical temperature limit exceeded")
+SENSE_CODE(0x0B0C, "Warning - high operating temperature limit exceeded")
+SENSE_CODE(0x0B0D, "Warning - low operating temperature limit exceeded")
+SENSE_CODE(0x0B0E, "Warning - high critical humidity limit exceeded")
+SENSE_CODE(0x0B0F, "Warning - low critical humidity limit exceeded")
+SENSE_CODE(0x0B10, "Warning - high operating humidity limit exceeded")
+SENSE_CODE(0x0B11, "Warning - low operating humidity limit exceeded")
+SENSE_CODE(0x0B12, "Warning - microcode security at risk")
+SENSE_CODE(0x0B13, "Warning - microcode digital signature validation failure")
+SENSE_CODE(0x0B14, "Warning - physical element status change")
 
 SENSE_CODE(0x0C00, "Write error")
 SENSE_CODE(0x0C01, "Write error - recovered with auto reallocation")
@@ -122,6 +137,8 @@ SENSE_CODE(0x0C0D, "Write error - not enough unsolicited data")
 SENSE_CODE(0x0C0E, "Multiple write errors")
 SENSE_CODE(0x0C0F, "Defects in error window")
 SENSE_CODE(0x0C10, "Incomplete multiple atomic write operations")
+SENSE_CODE(0x0C11, "Write error - recovery scan needed")
+SENSE_CODE(0x0C12, "Write error - insufficient zone resources")
 
 SENSE_CODE(0x0D00, "Error detected by third party temporary initiator")
 SENSE_CODE(0x0D01, "Third party device failure")
@@ -242,6 +259,9 @@ SENSE_CODE(0x2009, "Access denied - invalid LU identifier")
 SENSE_CODE(0x200A, "Access denied - invalid proxy token")
 SENSE_CODE(0x200B, "Access denied - ACL LUN conflict")
 SENSE_CODE(0x200C, "Illegal command when not in append-only mode")
+SENSE_CODE(0x200D, "Not an administrative logical unit")
+SENSE_CODE(0x200E, "Not a subsidiary logical unit")
+SENSE_CODE(0x200F, "Not a conglomerate logical unit")
 
 SENSE_CODE(0x2100, "Logical block address out of range")
 SENSE_CODE(0x2101, "Invalid element address")
@@ -251,6 +271,8 @@ SENSE_CODE(0x2104, "Unaligned write command")
 SENSE_CODE(0x2105, "Write boundary violation")
 SENSE_CODE(0x2106, "Attempt to read invalid data")
 SENSE_CODE(0x2107, "Read boundary violation")
+SENSE_CODE(0x2108, "Misaligned write command")
+SENSE_CODE(0x2109, "Attempt to access gap zone")
 
 SENSE_CODE(0x2200, "Illegal function (use 20 00, 24 00, or 26 00)")
 
@@ -275,6 +297,7 @@ SENSE_CODE(0x2405, "Security working key frozen")
 SENSE_CODE(0x2406, "Nonce not unique")
 SENSE_CODE(0x2407, "Nonce timestamp out of range")
 SENSE_CODE(0x2408, "Invalid XCDB")
+SENSE_CODE(0x2409, "Invalid fast format")
 
 SENSE_CODE(0x2500, "Logical unit not supported")
 
@@ -297,6 +320,10 @@ SENSE_CODE(0x260F, "Invalid data-out buffer integrity check value")
 SENSE_CODE(0x2610, "Data decryption key fail limit reached")
 SENSE_CODE(0x2611, "Incomplete key-associated data set")
 SENSE_CODE(0x2612, "Vendor specific key reference not found")
+SENSE_CODE(0x2613, "Application tag mode page is invalid")
+SENSE_CODE(0x2614, "Tape stream mirroring prevented")
+SENSE_CODE(0x2615, "Copy source or copy destination not authorized")
+SENSE_CODE(0x2616, "Fast copy not possible")
 
 SENSE_CODE(0x2700, "Write protected")
 SENSE_CODE(0x2701, "Hardware write protected")
@@ -342,6 +369,7 @@ SENSE_CODE(0x2A12, "Data encryption parameters changed by vendor specific event"
 SENSE_CODE(0x2A13, "Data encryption key instance counter has changed")
 SENSE_CODE(0x2A14, "SA creation capabilities data has changed")
 SENSE_CODE(0x2A15, "Medium removal prevention preempted")
+SENSE_CODE(0x2A16, "Zone reset write pointer recommended")
 
 SENSE_CODE(0x2B00, "Copy cannot execute since host cannot disconnect")
 
@@ -360,6 +388,11 @@ SENSE_CODE(0x2C0B, "Not reserved")
 SENSE_CODE(0x2C0C, "Orwrite generation does not match")
 SENSE_CODE(0x2C0D, "Reset write pointer not allowed")
 SENSE_CODE(0x2C0E, "Zone is offline")
+SENSE_CODE(0x2C0F, "Stream not open")
+SENSE_CODE(0x2C10, "Unwritten data in zone")
+SENSE_CODE(0x2C11, "Descriptor format sense data required")
+SENSE_CODE(0x2C12, "Zone is inactive")
+SENSE_CODE(0x2C13, "Well known logical unit access required")
 
 SENSE_CODE(0x2D00, "Overwrite error on update in place")
 
@@ -395,6 +428,8 @@ SENSE_CODE(0x3100, "Medium format corrupted")
 SENSE_CODE(0x3101, "Format command failed")
 SENSE_CODE(0x3102, "Zoned formatting failed due to spare linking")
 SENSE_CODE(0x3103, "Sanitize command failed")
+SENSE_CODE(0x3104, "Depopulation failed")
+SENSE_CODE(0x3105, "Depopulation restoration failed")
 
 SENSE_CODE(0x3200, "No defect spare location available")
 SENSE_CODE(0x3201, "Defect list update failure")
@@ -419,6 +454,7 @@ SENSE_CODE(0x3802, "Esn - power management class event")
 SENSE_CODE(0x3804, "Esn - media class event")
 SENSE_CODE(0x3806, "Esn - device busy class event")
 SENSE_CODE(0x3807, "Thin Provisioning soft threshold reached")
+SENSE_CODE(0x3808, "Depopulation interrupted")
 
 SENSE_CODE(0x3900, "Saving parameters not supported")
 
@@ -456,6 +492,7 @@ SENSE_CODE(0x3B19, "Element enabled")
 SENSE_CODE(0x3B1A, "Data transfer device removed")
 SENSE_CODE(0x3B1B, "Data transfer device inserted")
 SENSE_CODE(0x3B1C, "Too many logical objects on partition to support operation")
+SENSE_CODE(0x3B20, "Element static information changed")
 
 SENSE_CODE(0x3D00, "Invalid bits in identify message")
 
@@ -488,6 +525,11 @@ SENSE_CODE(0x3F13, "iSCSI IP address removed")
 SENSE_CODE(0x3F14, "iSCSI IP address changed")
 SENSE_CODE(0x3F15, "Inspect referrals sense descriptors")
 SENSE_CODE(0x3F16, "Microcode has been changed without reset")
+SENSE_CODE(0x3F17, "Zone transition to full")
+SENSE_CODE(0x3F18, "Bind completed")
+SENSE_CODE(0x3F19, "Bind redirected")
+SENSE_CODE(0x3F1A, "Subsidiary binding changed")
+
 /*
  *	SENSE_CODE(0x40NN, "Ram failure")
  *	SENSE_CODE(0x40NN, "Diagnostic failure on component nn")
@@ -589,6 +631,9 @@ SENSE_CODE(0x550B, "Insufficient power for operation")
 SENSE_CODE(0x550C, "Insufficient resources to create rod")
 SENSE_CODE(0x550D, "Insufficient resources to create rod token")
 SENSE_CODE(0x550E, "Insufficient zone resources")
+SENSE_CODE(0x550F, "Insufficient zone resources to complete write")
+SENSE_CODE(0x5510, "Maximum number of streams open")
+SENSE_CODE(0x5511, "Insufficient resources to bind")
 
 SENSE_CODE(0x5700, "Unable to recover table-of-contents")
 
@@ -692,6 +737,7 @@ SENSE_CODE(0x5D69, "Firmware impending failure throughput performance")
 SENSE_CODE(0x5D6A, "Firmware impending failure seek time performance")
 SENSE_CODE(0x5D6B, "Firmware impending failure spin-up retry count")
 SENSE_CODE(0x5D6C, "Firmware impending failure drive calibration retry count")
+SENSE_CODE(0x5D73, "Media impending failure endurance limit met")
 SENSE_CODE(0x5DFF, "Failure prediction threshold exceeded (false)")
 
 SENSE_CODE(0x5E00, "Low power condition on")
@@ -744,6 +790,8 @@ SENSE_CODE(0x6708, "Assign failure occurred")
 SENSE_CODE(0x6709, "Multiply assigned logical unit")
 SENSE_CODE(0x670A, "Set target port groups command failed")
 SENSE_CODE(0x670B, "ATA device feature not enabled")
+SENSE_CODE(0x670C, "Command rejected")
+SENSE_CODE(0x670D, "Explicit bind not allowed")
 
 SENSE_CODE(0x6800, "Logical unit not configured")
 SENSE_CODE(0x6801, "Subsidiary logical unit not configured")
@@ -772,6 +820,10 @@ SENSE_CODE(0x6F04, "Media region code is mismatched to logical unit region")
 SENSE_CODE(0x6F05, "Drive region must be permanent/region reset count error")
 SENSE_CODE(0x6F06, "Insufficient block count for binding nonce recording")
 SENSE_CODE(0x6F07, "Conflict in binding nonce recording")
+SENSE_CODE(0x6F08, "Insufficient permission")
+SENSE_CODE(0x6F09, "Invalid drive-host pairing server")
+SENSE_CODE(0x6F0A, "Drive-host pairing suspended")
+
 /*
  *	SENSE_CODE(0x70NN, "Decompression exception short algorithm id of nn")
  */
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/3] scsi: handle zone resources errors
  2020-09-10  7:39 [PATCH 0/3] Improve error handling Damien Le Moal
  2020-09-10  7:39 ` [PATCH 1/3] scsi: Cleanup scsi_noretry_cmd() Damien Le Moal
  2020-09-10  7:39 ` [PATCH 2/3] scsi: update additional sense codes list Damien Le Moal
@ 2020-09-10  7:39 ` Damien Le Moal
  2020-09-10  7:45   ` Damien Le Moal
  2020-09-10 17:53   ` Christoph Hellwig
  2020-09-10 14:11 ` [PATCH 0/3] Improve error handling Himanshu Madhani
  3 siblings, 2 replies; 10+ messages in thread
From: Damien Le Moal @ 2020-09-10  7:39 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen

ZBC or ZAC disks that have a limit on the number of open zones may fail
a zone open command or a write to a zone that is not already implicitly
or explicitly open if the total number of open zones is already at the
maximum allowed.

For these operations, instead of returning the generic BLK_STS_IOERR,
return BLK_STS_DEV_RESOURCE which is returned as -EBUSY to the I/O
issuer, allowing the device user to act appropriately on these
relatively benign zone resource errors.

With this change the NVMe (ZNS) and sd drivers both return the same
error code for zone resource errors, facilitating the implementation of
IO error handling by the user with a common code base for both device
types.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/scsi/scsi_lib.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 7c6dd6f75190..7eb4a80c3bbb 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -758,6 +758,18 @@ static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result)
 			/* See SSC3rXX or current. */
 			action = ACTION_FAIL;
 			break;
+		case DATA_PROTECT:
+			sdev_printk(KERN_INFO, cmd->device,
+				    "asc/ascq = 0x%02x 0x%02x\n",
+				    sshdr.asc, sshdr.ascq);
+			action = ACTION_FAIL;
+			if ((sshdr.asc == 0x0C && sshdr.ascq == 0x12) ||
+			    (sshdr.asc == 0x55 &&
+			     (sshdr.ascq == 0x0E || sshdr.ascq == 0x0F))) {
+				/* Insufficient zone resources */
+				blk_stat = BLK_STS_DEV_RESOURCE;
+			}
+			break;
 		default:
 			action = ACTION_FAIL;
 			break;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] scsi: handle zone resources errors
  2020-09-10  7:39 ` [PATCH 3/3] scsi: handle zone resources errors Damien Le Moal
@ 2020-09-10  7:45   ` Damien Le Moal
  2020-09-10 17:53   ` Christoph Hellwig
  1 sibling, 0 replies; 10+ messages in thread
From: Damien Le Moal @ 2020-09-10  7:45 UTC (permalink / raw)
  To: linux-scsi, Martin K . Petersen

On 2020/09/10 16:40, Damien Le Moal wrote:
> ZBC or ZAC disks that have a limit on the number of open zones may fail
> a zone open command or a write to a zone that is not already implicitly
> or explicitly open if the total number of open zones is already at the
> maximum allowed.
> 
> For these operations, instead of returning the generic BLK_STS_IOERR,
> return BLK_STS_DEV_RESOURCE which is returned as -EBUSY to the I/O
> issuer, allowing the device user to act appropriately on these
> relatively benign zone resource errors.
> 
> With this change the NVMe (ZNS) and sd drivers both return the same
> error code for zone resource errors, facilitating the implementation of
> IO error handling by the user with a common code base for both device
> types.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> ---
>  drivers/scsi/scsi_lib.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 7c6dd6f75190..7eb4a80c3bbb 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -758,6 +758,18 @@ static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result)
>  			/* See SSC3rXX or current. */
>  			action = ACTION_FAIL;
>  			break;
> +		case DATA_PROTECT:
> +			sdev_printk(KERN_INFO, cmd->device,
> +				    "asc/ascq = 0x%02x 0x%02x\n",
> +				    sshdr.asc, sshdr.ascq);

Oops... Forgot to remove my debug message. Re-sending without it.

> +			action = ACTION_FAIL;
> +			if ((sshdr.asc == 0x0C && sshdr.ascq == 0x12) ||
> +			    (sshdr.asc == 0x55 &&
> +			     (sshdr.ascq == 0x0E || sshdr.ascq == 0x0F))) {
> +				/* Insufficient zone resources */
> +				blk_stat = BLK_STS_DEV_RESOURCE;
> +			}
> +			break;
>  		default:
>  			action = ACTION_FAIL;
>  			break;
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/3] Improve error handling
  2020-09-10  7:39 [PATCH 0/3] Improve error handling Damien Le Moal
                   ` (2 preceding siblings ...)
  2020-09-10  7:39 ` [PATCH 3/3] scsi: handle zone resources errors Damien Le Moal
@ 2020-09-10 14:11 ` Himanshu Madhani
  3 siblings, 0 replies; 10+ messages in thread
From: Himanshu Madhani @ 2020-09-10 14:11 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-scsi, Martin K . Petersen



> On Sep 10, 2020, at 2:39 AM, Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
> 
> A small series to improve command error hadling.
> 
> The first patch is a simple code cleanup. The second patch updates
> asc/ascq sense codes list. Finally, the third patch adds zone resource
> errors handling for zoned block deives to return BLK_STS_DEV_RESOURCE,
> similarly to what the NVMe driver does for ZNS devices.
> 
> Damien Le Moal (3):
>  scsi: Cleanup scsi_noretry_cmd()
>  scsi: update additional sense codes list
>  scsi: handle zone resources errors
> 
> drivers/scsi/scsi_error.c  |  4 +--
> drivers/scsi/scsi_lib.c    | 12 +++++++++
> drivers/scsi/sense_codes.h | 54 +++++++++++++++++++++++++++++++++++++-
> 3 files changed, 67 insertions(+), 3 deletions(-)
> 
> -- 
> 2.26.2
> 

For the Series with updated Patch 3. 

Looks Good.

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

--
Himanshu Madhani	 Oracle Linux Engineering


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] scsi: Cleanup scsi_noretry_cmd()
  2020-09-10  7:39 ` [PATCH 1/3] scsi: Cleanup scsi_noretry_cmd() Damien Le Moal
@ 2020-09-10 17:47   ` Bart Van Assche
  2020-09-10 22:07     ` Damien Le Moal
  0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2020-09-10 17:47 UTC (permalink / raw)
  To: Damien Le Moal, linux-scsi, Martin K . Petersen

On 2020-09-10 00:39, Damien Le Moal wrote:
> No need for else after return.
> 
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> ---
>  drivers/scsi/scsi_error.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 927b1e641842..5f3726abed78 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -1755,8 +1755,8 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)
>  	if (scmd->request->cmd_flags & REQ_FAILFAST_DEV ||
>  	    blk_rq_is_passthrough(scmd->request))
>  		return 1;
> -	else
> -		return 0;
> +
> +	return 0;
>  }

Is this patch useful? Is it necessary? Or is it just code churn?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] scsi: handle zone resources errors
  2020-09-10  7:39 ` [PATCH 3/3] scsi: handle zone resources errors Damien Le Moal
  2020-09-10  7:45   ` Damien Le Moal
@ 2020-09-10 17:53   ` Christoph Hellwig
  2020-09-10 22:16     ` Damien Le Moal
  1 sibling, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2020-09-10 17:53 UTC (permalink / raw)
  To: Damien Le Moal; +Cc: linux-scsi, Martin K . Petersen

On Thu, Sep 10, 2020 at 04:39:52PM +0900, Damien Le Moal wrote:
> +		case DATA_PROTECT:
> +			sdev_printk(KERN_INFO, cmd->device,
> +				    "asc/ascq = 0x%02x 0x%02x\n",
> +				    sshdr.asc, sshdr.ascq);
> +			action = ACTION_FAIL;
> +			if ((sshdr.asc == 0x0C && sshdr.ascq == 0x12) ||
> +			    (sshdr.asc == 0x55 &&
> +			     (sshdr.ascq == 0x0E || sshdr.ascq == 0x0F))) {
> +				/* Insufficient zone resources */
> +				blk_stat = BLK_STS_DEV_RESOURCE;

BLK_STS_DEV_RESOURCE is a magic error code leading to a retry on the
particular request_queue once it isn't busy any more.  Please don't
abuse it for random other conditions.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] scsi: Cleanup scsi_noretry_cmd()
  2020-09-10 17:47   ` Bart Van Assche
@ 2020-09-10 22:07     ` Damien Le Moal
  0 siblings, 0 replies; 10+ messages in thread
From: Damien Le Moal @ 2020-09-10 22:07 UTC (permalink / raw)
  To: Bart Van Assche, linux-scsi, Martin K . Petersen

On 2020/09/11 2:48, Bart Van Assche wrote:
> On 2020-09-10 00:39, Damien Le Moal wrote:
>> No need for else after return.
>>
>> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
>> ---
>>  drivers/scsi/scsi_error.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
>> index 927b1e641842..5f3726abed78 100644
>> --- a/drivers/scsi/scsi_error.c
>> +++ b/drivers/scsi/scsi_error.c
>> @@ -1755,8 +1755,8 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)
>>  	if (scmd->request->cmd_flags & REQ_FAILFAST_DEV ||
>>  	    blk_rq_is_passthrough(scmd->request))
>>  		return 1;
>> -	else
>> -		return 0;
>> +
>> +	return 0;
>>  }
> 
> Is this patch useful? Is it necessary? Or is it just code churn?

Sure, you may consider it code churn, but I prefer the more positive view that
it improves code style. And looking at it again, I think the proper change
should actually be:

	return scmd->request->cmd_flags & REQ_FAILFAST_DEV ||
		blk_rq_is_passthrough(scmd->request);

A lot cleaner.

But no strong feelings. If you really do not like the change, I will drop it.

Best regards.


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] scsi: handle zone resources errors
  2020-09-10 17:53   ` Christoph Hellwig
@ 2020-09-10 22:16     ` Damien Le Moal
  0 siblings, 0 replies; 10+ messages in thread
From: Damien Le Moal @ 2020-09-10 22:16 UTC (permalink / raw)
  To: hch; +Cc: linux-scsi, Martin K . Petersen

On 2020/09/11 2:54, Christoph Hellwig wrote:
> On Thu, Sep 10, 2020 at 04:39:52PM +0900, Damien Le Moal wrote:
>> +		case DATA_PROTECT:
>> +			sdev_printk(KERN_INFO, cmd->device,
>> +				    "asc/ascq = 0x%02x 0x%02x\n",
>> +				    sshdr.asc, sshdr.ascq);
>> +			action = ACTION_FAIL;
>> +			if ((sshdr.asc == 0x0C && sshdr.ascq == 0x12) ||
>> +			    (sshdr.asc == 0x55 &&
>> +			     (sshdr.ascq == 0x0E || sshdr.ascq == 0x0F))) {
>> +				/* Insufficient zone resources */
>> +				blk_stat = BLK_STS_DEV_RESOURCE;
> 
> BLK_STS_DEV_RESOURCE is a magic error code leading to a retry on the
> particular request_queue once it isn't busy any more.  Please don't
> abuse it for random other conditions.

Yes, but that is for the submission path, isn't it ? This change is in the
completion path and action is set to ACTION_FAIL, so the request is terminated
right away without any retry (tested). More importantly, this leads to the block
layer returning -EBUSY which allows the user to differentiate this
temporary/trivial error from the potentially more serious -EIO.

Keith sent a patch for NVMe ZNS doing something similar, which will result in
the block layer returning -EBUSY for zone resource errors. I would like to unify
scsi and nvme behavior for these recoverable zone resource errors.

So should we define a new BLK_STS_BUSYERR status to differentiate from the
default BLK_STS_IOERR and not overload BLK_STS_DEV_RESOURCE (or
BLK_STS_ZONE_RESOURCE) ?

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-09-10 22:16 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-10  7:39 [PATCH 0/3] Improve error handling Damien Le Moal
2020-09-10  7:39 ` [PATCH 1/3] scsi: Cleanup scsi_noretry_cmd() Damien Le Moal
2020-09-10 17:47   ` Bart Van Assche
2020-09-10 22:07     ` Damien Le Moal
2020-09-10  7:39 ` [PATCH 2/3] scsi: update additional sense codes list Damien Le Moal
2020-09-10  7:39 ` [PATCH 3/3] scsi: handle zone resources errors Damien Le Moal
2020-09-10  7:45   ` Damien Le Moal
2020-09-10 17:53   ` Christoph Hellwig
2020-09-10 22:16     ` Damien Le Moal
2020-09-10 14:11 ` [PATCH 0/3] Improve error handling Himanshu Madhani

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.