All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/10] scsi: Fix internal host code use
@ 2022-08-12  1:00 Mike Christie
  2022-08-12  1:00 ` [PATCH v2 01/10] scsi: xen: Drop use of internal host codes Mike Christie
                   ` (10 more replies)
  0 siblings, 11 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley

The following patches made over Martin's 5.20 staging branch fix an issue
where we intended the host codes:

DID_TARGET_FAILURE
DID_NEXUS_FAILURE
DID_ALLOC_FAILURE
DID_MEDIUM_ERROR

to be internal to scsi-ml and used to tell the completion path what type
of error it was without having to re-parse the sense, host codes and
status. But at some point drivers started using them and the driver
writers never updated scsi-ml, so we now have the bugs:

1. scsi_result_to_blk_status clears those codes, so they are not
propagated upwards. SG IO/passthrough users will then not see an error
and think a command was successful.

2. The SCSI error handler runs because scsi_decide_disposition has no
case statements for them and we return FAILED.

This patchset converts the drivers to stop using these codes, and then
moves them to scsi_priv.h in a new error byte so they can only be used
by scsi-ml.

v2:
- Rename SPACE_ALLOC to NOSPC
- Fix up email subject formatting
- Drop part of xen patch that removed zen definitions.




^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 01/10] scsi: xen: Drop use of internal host codes
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-15  6:12   ` Juergen Gross
  2022-08-12  1:00 ` [PATCH v2 02/10] scsi: storvsc: Drop DID_TARGET_FAILURE use Mike Christie
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

The error codes:

DID_TARGET_FAILURE
DID_NEXUS_FAILURE
DID_ALLOC_FAILURE
DID_MEDIUM_ERROR

are internal to the SCSI layer. Drivers must not use them because:

1. They are not propagated upwards, so SG IO/passthrough users will not
see an error and think a command was successful.

xen-scsiback will never see this error and should not try to send it.

2. There is no handling for them in scsi_decide_disposition so if
xen-scsifront were to return the error to scsi-ml then it kicks off the
error handler which is definitely not what we want.

This patch remove the use from xen-scsifront/back.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/xen-scsifront.c |  8 --------
 drivers/xen/xen-scsiback.c   | 12 ------------
 2 files changed, 20 deletions(-)

diff --git a/drivers/scsi/xen-scsifront.c b/drivers/scsi/xen-scsifront.c
index 51afc66e839d..66b316d173b0 100644
--- a/drivers/scsi/xen-scsifront.c
+++ b/drivers/scsi/xen-scsifront.c
@@ -289,14 +289,6 @@ static unsigned int scsifront_host_byte(int32_t rslt)
 		return DID_TRANSPORT_DISRUPTED;
 	case XEN_VSCSIIF_RSLT_HOST_TRANSPORT_FAILFAST:
 		return DID_TRANSPORT_FAILFAST;
-	case XEN_VSCSIIF_RSLT_HOST_TARGET_FAILURE:
-		return DID_TARGET_FAILURE;
-	case XEN_VSCSIIF_RSLT_HOST_NEXUS_FAILURE:
-		return DID_NEXUS_FAILURE;
-	case XEN_VSCSIIF_RSLT_HOST_ALLOC_FAILURE:
-		return DID_ALLOC_FAILURE;
-	case XEN_VSCSIIF_RSLT_HOST_MEDIUM_ERROR:
-		return DID_MEDIUM_ERROR;
 	case XEN_VSCSIIF_RSLT_HOST_TRANSPORT_MARGINAL:
 		return DID_TRANSPORT_MARGINAL;
 	default:
diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c
index 7a0c93acc2c5..e98c88a960d8 100644
--- a/drivers/xen/xen-scsiback.c
+++ b/drivers/xen/xen-scsiback.c
@@ -333,18 +333,6 @@ static int32_t scsiback_result(int32_t result)
 	case DID_TRANSPORT_FAILFAST:
 		host_status = XEN_VSCSIIF_RSLT_HOST_TRANSPORT_FAILFAST;
 		break;
-	case DID_TARGET_FAILURE:
-		host_status = XEN_VSCSIIF_RSLT_HOST_TARGET_FAILURE;
-		break;
-	case DID_NEXUS_FAILURE:
-		host_status = XEN_VSCSIIF_RSLT_HOST_NEXUS_FAILURE;
-		break;
-	case DID_ALLOC_FAILURE:
-		host_status = XEN_VSCSIIF_RSLT_HOST_ALLOC_FAILURE;
-		break;
-	case DID_MEDIUM_ERROR:
-		host_status = XEN_VSCSIIF_RSLT_HOST_MEDIUM_ERROR;
-		break;
 	case DID_TRANSPORT_MARGINAL:
 		host_status = XEN_VSCSIIF_RSLT_HOST_TRANSPORT_MARGINAL;
 		break;
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 02/10] scsi: storvsc: Drop DID_TARGET_FAILURE use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
  2022-08-12  1:00 ` [PATCH v2 01/10] scsi: xen: Drop use of internal host codes Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-12  1:00 ` [PATCH v2 03/10] scsi: uas: " Mike Christie
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
because:

1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.

2. There is no handling for them in scsi_decide_disposition so it results
in the scsi eh running.

It looks like the driver wanted a hard failure so this swaps it with
DID_BAD_TARGET.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/storvsc_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index fe000da11332..25c44c87c972 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1029,7 +1029,7 @@ static void storvsc_handle_error(struct vmscsi_request *vm_srb,
 	 */
 	wrk = kmalloc(sizeof(struct storvsc_scan_work), GFP_ATOMIC);
 	if (!wrk) {
-		set_host_byte(scmnd, DID_TARGET_FAILURE);
+		set_host_byte(scmnd, DID_BAD_TARGET);
 		return;
 	}
 
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 03/10] scsi: uas: Drop DID_TARGET_FAILURE use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
  2022-08-12  1:00 ` [PATCH v2 01/10] scsi: xen: Drop use of internal host codes Mike Christie
  2022-08-12  1:00 ` [PATCH v2 02/10] scsi: storvsc: Drop DID_TARGET_FAILURE use Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-12  1:00 ` [PATCH v2 04/10] scsi: virtio_scsi: " Mike Christie
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
because:

1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.

2. There is no handling for them in scsi_decide_disposition so it results
in the scsi eh running.

It looks like the driver wanted a hard failure so this swaps it with
DID_BAD_TARGET which gives us that behavior and the error looks like it's
for a case where the target did not support a TMF we wanted to use (maybe
not a bad target but disappointing so close enough).

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/usb/storage/uas.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index 84dc270f6f73..de3836412bf3 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -283,7 +283,7 @@ static bool uas_evaluate_response_iu(struct response_iu *riu, struct scsi_cmnd *
 		set_host_byte(cmnd, DID_OK);
 		break;
 	case RC_TMF_NOT_SUPPORTED:
-		set_host_byte(cmnd, DID_TARGET_FAILURE);
+		set_host_byte(cmnd, DID_BAD_TARGET);
 		break;
 	default:
 		uas_log_cmd_state(cmnd, "response iu", response_code);
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 04/10] scsi: virtio_scsi: Drop DID_TARGET_FAILURE use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (2 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 03/10] scsi: uas: " Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-12  1:00 ` [PATCH v2 05/10] scsi: virtio_scsi: Drop DID_NEXUS_FAILURE use Mike Christie
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
because:

1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.

2. There is no handling for them in scsi_decide_disposition so it results
in the scsi eh running.

It looks like virtio_scsi gets this when something like qemu returns
VIRTIO_SCSI_S_TARGET_FAILURE. It looks like qemu returns that error code
if a host OS returns it, but this shouldn't happen for linux since we
never propagate that error to userspace.

This has us use DID_BAD_TARGET in case some other virt layer is returning
it. In that case we will still get a hard error like before and it conveys
something unexpected happened.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/virtio_scsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 578c4b6d0f7d..112d8c3962b0 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -141,7 +141,7 @@ static void virtscsi_complete_cmd(struct virtio_scsi *vscsi, void *buf)
 		set_host_byte(sc, DID_TRANSPORT_DISRUPTED);
 		break;
 	case VIRTIO_SCSI_S_TARGET_FAILURE:
-		set_host_byte(sc, DID_TARGET_FAILURE);
+		set_host_byte(sc, DID_BAD_TARGET);
 		break;
 	case VIRTIO_SCSI_S_NEXUS_FAILURE:
 		set_host_byte(sc, DID_NEXUS_FAILURE);
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 05/10] scsi: virtio_scsi: Drop DID_NEXUS_FAILURE use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (3 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 04/10] scsi: virtio_scsi: " Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-12  1:00 ` [PATCH v2 06/10] scsi: qla2xxx: Drop DID_TARGET_FAILURE use Mike Christie
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

DID_NEXUS_FAILURE is internal to the SCSI layer. Drivers must not use it
because:

1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.

2. There is no handling for them in scsi_decide_disposition so it results
in the scsi eh running.

It looks like virtio_scsi gets this when something like qemu returns
VIRTIO_SCSI_S_NEXUS_FAILURE. It looks like qemu returns that error code
if host OS returns DID_NEXUS_FAILURE (qemu's internal
SCSI_HOST_RESERVATION_ERROR maps to DID_NEXUS_FAILURE). This shouldn't
happen for linux since we don't propagate that error code to userspace.

This has us convert VIRTIO_SCSI_S_NEXUS_FAILURE to a
SAM_STAT_RESERVATION_CONFLICT in case some other virt layer is returning
it. In that case we will still get the reservation confict failure we
expect.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/virtio_scsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 112d8c3962b0..00cf6743db8c 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -144,7 +144,7 @@ static void virtscsi_complete_cmd(struct virtio_scsi *vscsi, void *buf)
 		set_host_byte(sc, DID_BAD_TARGET);
 		break;
 	case VIRTIO_SCSI_S_NEXUS_FAILURE:
-		set_host_byte(sc, DID_NEXUS_FAILURE);
+		set_status_byte(sc, SAM_STAT_RESERVATION_CONFLICT);
 		break;
 	default:
 		scmd_printk(KERN_WARNING, sc, "Unknown response %d",
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 06/10] scsi: qla2xxx: Drop DID_TARGET_FAILURE use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (4 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 05/10] scsi: virtio_scsi: Drop DID_NEXUS_FAILURE use Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-19 19:42   ` Himanshu Madhani
  2022-08-12  1:00 ` [PATCH v2 07/10] scsi: cxlflash: Drop DID_ALLOC_FAILURE use Mike Christie
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
because:

1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.

2. There is no handling for them in scsi_decide_disposition so it results
in the scsi eh running.

This has qla2xxx use DID_NO_CONNECT because it looks like we hit this
error when we can't find a port. It will give us the same hard error
behavior and it seems to match the error where we can't find the
endpoint.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/qla2xxx/qla_edif.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_edif.c b/drivers/scsi/qla2xxx/qla_edif.c
index 400a8b6f3982..00ccc41cef14 100644
--- a/drivers/scsi/qla2xxx/qla_edif.c
+++ b/drivers/scsi/qla2xxx/qla_edif.c
@@ -1551,7 +1551,7 @@ qla24xx_sadb_update(struct bsg_job *bsg_job)
 		ql_dbg(ql_dbg_edif, vha, 0x70a3, "Failed to find port= %06x\n",
 		    sa_frame.port_id.b24);
 		rval = -EINVAL;
-		SET_DID_STATUS(bsg_reply->result, DID_TARGET_FAILURE);
+		SET_DID_STATUS(bsg_reply->result, DID_NO_CONNECT);
 		goto done;
 	}
 
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 07/10] scsi: cxlflash: Drop DID_ALLOC_FAILURE use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (5 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 06/10] scsi: qla2xxx: Drop DID_TARGET_FAILURE use Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-12  1:00 ` [PATCH v2 08/10] scsi: Add error codes for internal scsi-ml use Mike Christie
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

DID_ALLOC_FAILURE is internal to the SCSI layer. Drivers must not use it
because:

1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.

2. There is no handling for them in scsi_decide_disposition so it results
in the scsi eh running.

By the code comment, it looks like the driver wanted a retryable error
code, so this has it use DID_ERROR.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/cxlflash/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index e7be95ee7d64..cd1324ec742d 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -132,7 +132,7 @@ static void process_cmd_err(struct afu_cmd *cmd, struct scsi_cmnd *scp)
 			break;
 		case SISL_AFU_RC_OUT_OF_DATA_BUFS:
 			/* Retry */
-			scp->result = (DID_ALLOC_FAILURE << 16);
+			scp->result = (DID_ERROR << 16);
 			break;
 		default:
 			scp->result = (DID_ERROR << 16);
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 08/10] scsi: Add error codes for internal scsi-ml use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (6 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 07/10] scsi: cxlflash: Drop DID_ALLOC_FAILURE use Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-12  1:00 ` [PATCH v2 09/10] scsi: Convert scsi_decide_disposition to use SCSIML_STAT Mike Christie
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie, Bart Van Assche

If a driver returns:

DID_TARGET_FAILURE
DID_NEXUS_FAILURE
DID_ALLOC_FAILURE
DID_MEDIUM_ERROR

we hit a couple bugs:

1. The SCSI error handler runs because scsi_decide_disposition has no
case statements for them and we return FAILED.

2. For SG IO the userspace app gets a success status instead of failed,
because scsi_result_to_blk_status clears those errors.

This patch adds a new internal error code byte for use by scsi-ml. It
will be used instead of the above error codes, so we don't have to play
that clearing the host code game in scsi_result_to_blk_status and
drivers cannot accidentally use them.

The next patch will then remove the internal users of the above codes and
convert us to use the new ones.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_error.c |  5 +++++
 drivers/scsi/scsi_lib.c   | 22 ++++++++++++++++++++++
 drivers/scsi/scsi_priv.h  | 11 +++++++++++
 3 files changed, 38 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 448748e3fba5..d09b9ba1518c 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -514,6 +514,11 @@ static void scsi_report_sense(struct scsi_device *sdev,
 	}
 }
 
+static inline void set_scsi_ml_byte(struct scsi_cmnd *cmd, u8 status)
+{
+	cmd->result = (cmd->result & 0xffff00ff) | (status << 8);
+}
+
 /**
  * scsi_check_sense - Examine scsi cmd sense
  * @scmd:	Cmd to have sense checked.
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 4dbd29ab1dcc..92b8c050697e 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -576,6 +576,11 @@ static bool scsi_end_request(struct request *req, blk_status_t error,
 	return false;
 }
 
+static inline u8 get_scsi_ml_byte(int result)
+{
+	return (result >> 8) & 0xff;
+}
+
 /**
  * scsi_result_to_blk_status - translate a SCSI result code into blk_status_t
  * @cmd:	SCSI command
@@ -586,6 +591,23 @@ static bool scsi_end_request(struct request *req, blk_status_t error,
  */
 static blk_status_t scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result)
 {
+	/*
+	 * Check the scsi-ml byte first in case we converted a host or status
+	 * byte.
+	 */
+	switch (get_scsi_ml_byte(result)) {
+	case SCSIML_STAT_OK:
+		break;
+	case SCSIML_STAT_RESV_CONFLICT:
+		return BLK_STS_NEXUS;
+	case SCSIML_STAT_NOSPC:
+		return BLK_STS_NOSPC;
+	case SCSIML_STAT_MED_ERROR:
+		return BLK_STS_MEDIUM;
+	case SCSIML_STAT_TGT_FAILURE:
+		return BLK_STS_TARGET;
+	}
+
 	switch (host_byte(result)) {
 	case DID_OK:
 		if (scsi_status_is_good(result))
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 429663bd78ec..2b9e0559ddcb 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -18,6 +18,17 @@ struct scsi_nl_hdr;
 
 #define SCSI_CMD_RETRIES_NO_LIMIT -1
 
+/*
+ * Error codes used by scsi-ml internally. These must not be used by drivers.
+ */
+enum scsi_ml_status {
+	SCSIML_STAT_OK			= 0x00,
+	SCSIML_STAT_RESV_CONFLICT	= 0x01,	/* Reservation conflict */
+	SCSIML_STAT_NOSPC		= 0x02,	/* Space allocation on the dev failed */
+	SCSIML_STAT_MED_ERROR		= 0x03,	/* Medium error */
+	SCSIML_STAT_TGT_FAILURE		= 0x04,	/* Permanent target failure */
+};
+
 /*
  * Scsi Error Handler Flags
  */
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 09/10] scsi: Convert scsi_decide_disposition to use SCSIML_STAT
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (7 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 08/10] scsi: Add error codes for internal scsi-ml use Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-12  1:00 ` [PATCH v2 10/10] scsi: Remove useless host error codes Mike Christie
  2022-09-07  2:14 ` [PATCH v2 00/10] scsi: Fix internal host code use Martin K. Petersen
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie, Bart Van Assche

Don't use:

DID_TARGET_FAILURE
DID_NEXUS_FAILURE
DID_ALLOC_FAILURE
DID_MEDIUM_ERROR

Instead use the scsi-ml internal values.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_error.c | 12 ++++++------
 drivers/scsi/scsi_lib.c   | 24 +++++-------------------
 2 files changed, 11 insertions(+), 25 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index d09b9ba1518c..b5fa2aad05f9 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -649,7 +649,7 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd)
 	case DATA_PROTECT:
 		if (sshdr.asc == 0x27 && sshdr.ascq == 0x07) {
 			/* Thin provisioning hard threshold reached */
-			set_host_byte(scmd, DID_ALLOC_FAILURE);
+			set_scsi_ml_byte(scmd, SCSIML_STAT_NOSPC);
 			return SUCCESS;
 		}
 		fallthrough;
@@ -657,14 +657,14 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd)
 	case VOLUME_OVERFLOW:
 	case MISCOMPARE:
 	case BLANK_CHECK:
-		set_host_byte(scmd, DID_TARGET_FAILURE);
+		set_scsi_ml_byte(scmd, SCSIML_STAT_TGT_FAILURE);
 		return SUCCESS;
 
 	case MEDIUM_ERROR:
 		if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */
 		    sshdr.asc == 0x13 || /* AMNF DATA FIELD */
 		    sshdr.asc == 0x14) { /* RECORD NOT FOUND */
-			set_host_byte(scmd, DID_MEDIUM_ERROR);
+			set_scsi_ml_byte(scmd, SCSIML_STAT_MED_ERROR);
 			return SUCCESS;
 		}
 		return NEEDS_RETRY;
@@ -673,7 +673,7 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd)
 		if (scmd->device->retry_hwerror)
 			return ADD_TO_MLQUEUE;
 		else
-			set_host_byte(scmd, DID_TARGET_FAILURE);
+			set_scsi_ml_byte(scmd, SCSIML_STAT_TGT_FAILURE);
 		fallthrough;
 
 	case ILLEGAL_REQUEST:
@@ -683,7 +683,7 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd)
 		    sshdr.asc == 0x24 || /* Invalid field in cdb */
 		    sshdr.asc == 0x26 || /* Parameter value invalid */
 		    sshdr.asc == 0x27) { /* Write protected */
-			set_host_byte(scmd, DID_TARGET_FAILURE);
+			set_scsi_ml_byte(scmd, SCSIML_STAT_TGT_FAILURE);
 		}
 		return SUCCESS;
 
@@ -1988,7 +1988,7 @@ enum scsi_disposition scsi_decide_disposition(struct scsi_cmnd *scmd)
 	case SAM_STAT_RESERVATION_CONFLICT:
 		sdev_printk(KERN_INFO, scmd->device,
 			    "reservation conflict\n");
-		set_host_byte(scmd, DID_NEXUS_FAILURE);
+		set_scsi_ml_byte(scmd, SCSIML_STAT_RESV_CONFLICT);
 		return SUCCESS; /* causes immediate i/o error */
 	}
 	return FAILED;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 92b8c050697e..473d9403f0c1 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -583,13 +583,11 @@ static inline u8 get_scsi_ml_byte(int result)
 
 /**
  * scsi_result_to_blk_status - translate a SCSI result code into blk_status_t
- * @cmd:	SCSI command
  * @result:	scsi error code
  *
- * Translate a SCSI result code into a blk_status_t value. May reset the host
- * byte of @cmd->result.
+ * Translate a SCSI result code into a blk_status_t value.
  */
-static blk_status_t scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result)
+static blk_status_t scsi_result_to_blk_status(int result)
 {
 	/*
 	 * Check the scsi-ml byte first in case we converted a host or status
@@ -616,18 +614,6 @@ static blk_status_t scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result)
 	case DID_TRANSPORT_FAILFAST:
 	case DID_TRANSPORT_MARGINAL:
 		return BLK_STS_TRANSPORT;
-	case DID_TARGET_FAILURE:
-		set_host_byte(cmd, DID_OK);
-		return BLK_STS_TARGET;
-	case DID_NEXUS_FAILURE:
-		set_host_byte(cmd, DID_OK);
-		return BLK_STS_NEXUS;
-	case DID_ALLOC_FAILURE:
-		set_host_byte(cmd, DID_OK);
-		return BLK_STS_NOSPC;
-	case DID_MEDIUM_ERROR:
-		set_host_byte(cmd, DID_OK);
-		return BLK_STS_MEDIUM;
 	default:
 		return BLK_STS_IOERR;
 	}
@@ -715,7 +701,7 @@ static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result)
 	if (sense_valid)
 		sense_current = !scsi_sense_is_deferred(&sshdr);
 
-	blk_stat = scsi_result_to_blk_status(cmd, result);
+	blk_stat = scsi_result_to_blk_status(result);
 
 	if (host_byte(result) == DID_RESET) {
 		/* Third party bus reset or reset for error recovery
@@ -893,14 +879,14 @@ static int scsi_io_completion_nz_result(struct scsi_cmnd *cmd, int result,
 					     SCSI_SENSE_BUFFERSIZE);
 		}
 		if (sense_current)
-			*blk_statp = scsi_result_to_blk_status(cmd, result);
+			*blk_statp = scsi_result_to_blk_status(result);
 	} else if (blk_rq_bytes(req) == 0 && sense_current) {
 		/*
 		 * Flush commands do not transfers any data, and thus cannot use
 		 * good_bytes != blk_rq_bytes(req) as the signal for an error.
 		 * This sets *blk_statp explicitly for the problem case.
 		 */
-		*blk_statp = scsi_result_to_blk_status(cmd, result);
+		*blk_statp = scsi_result_to_blk_status(result);
 	}
 	/*
 	 * Recovered errors need reporting, but they're always treated as
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 10/10] scsi: Remove useless host error codes
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (8 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 09/10] scsi: Convert scsi_decide_disposition to use SCSIML_STAT Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-09-07  2:14 ` [PATCH v2 00/10] scsi: Fix internal host code use Martin K. Petersen
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie, Bart Van Assche

The host codes that were supposed to only be used for internal use are
now not used, so remove them.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
---
 include/scsi/scsi_status.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/scsi/scsi_status.h b/include/scsi/scsi_status.h
index 31d30cee1869..9cb85262de64 100644
--- a/include/scsi/scsi_status.h
+++ b/include/scsi/scsi_status.h
@@ -62,12 +62,12 @@ enum scsi_host_status {
 					 * recover the link. Transport class will
 					 * retry or fail IO */
 	DID_TRANSPORT_FAILFAST = 0x0f, /* Transport class fastfailed the io */
-	DID_TARGET_FAILURE = 0x10, /* Permanent target failure, do not retry on
-				    * other paths */
-	DID_NEXUS_FAILURE = 0x11,  /* Permanent nexus failure, retry on other
-				    * paths might yield different results */
-	DID_ALLOC_FAILURE = 0x12,  /* Space allocation on the device failed */
-	DID_MEDIUM_ERROR = 0x13,  /* Medium error */
+	/*
+	 * We used to have DID_TARGET_FAILURE, DID_NEXUS_FAILURE,
+	 * DID_ALLOC_FAILURE and DID_MEDIUM_ERROR at 0x10 - 0x13. For compat
+	 * with userspace apps that parse the host byte for SG IO, we leave
+	 * that block of codes unused and start at 0x14 below.
+	 */
 	DID_TRANSPORT_MARGINAL = 0x14, /* Transport marginal errors */
 };
 
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 01/10] scsi: xen: Drop use of internal host codes
  2022-08-12  1:00 ` [PATCH v2 01/10] scsi: xen: Drop use of internal host codes Mike Christie
@ 2022-08-15  6:12   ` Juergen Gross
  0 siblings, 0 replies; 14+ messages in thread
From: Juergen Gross @ 2022-08-15  6:12 UTC (permalink / raw)
  To: Mike Christie, njavali, pbonzini, jasowang, mst, stefanha,
	oneukum, mrochs, ukrishn, martin.petersen, linux-scsi,
	james.bottomley


[-- Attachment #1.1.1: Type: text/plain, Size: 819 bytes --]

On 12.08.22 03:00, Mike Christie wrote:
> The error codes:
> 
> DID_TARGET_FAILURE
> DID_NEXUS_FAILURE
> DID_ALLOC_FAILURE
> DID_MEDIUM_ERROR
> 
> are internal to the SCSI layer. Drivers must not use them because:
> 
> 1. They are not propagated upwards, so SG IO/passthrough users will not
> see an error and think a command was successful.
> 
> xen-scsiback will never see this error and should not try to send it.
> 
> 2. There is no handling for them in scsi_decide_disposition so if
> xen-scsifront were to return the error to scsi-ml then it kicks off the
> error handler which is definitely not what we want.
> 
> This patch remove the use from xen-scsifront/back.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3149 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 06/10] scsi: qla2xxx: Drop DID_TARGET_FAILURE use
  2022-08-12  1:00 ` [PATCH v2 06/10] scsi: qla2xxx: Drop DID_TARGET_FAILURE use Mike Christie
@ 2022-08-19 19:42   ` Himanshu Madhani
  0 siblings, 0 replies; 14+ messages in thread
From: Himanshu Madhani @ 2022-08-19 19:42 UTC (permalink / raw)
  To: Michael Christie
  Cc: jgross, Nilesh Javali, pbonzini, jasowang, Michael S. Tsirkin,
	Stefan Hajnoczi, Oliver Neukum, mrochs, ukrishn, Martin Petersen,
	linux-scsi, james.bottomley



> On Aug 11, 2022, at 6:00 PM, Mike Christie <michael.christie@oracle.com> wrote:
> 
> DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
> because:
> 
> 1. It's not propagated upwards, so SG IO/passthrough users will not see an
> error and think a command was successful.
> 
> 2. There is no handling for them in scsi_decide_disposition so it results
> in the scsi eh running.
> 
> This has qla2xxx use DID_NO_CONNECT because it looks like we hit this
> error when we can't find a port. It will give us the same hard error
> behavior and it seems to match the error where we can't find the
> endpoint.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
> drivers/scsi/qla2xxx/qla_edif.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_edif.c b/drivers/scsi/qla2xxx/qla_edif.c
> index 400a8b6f3982..00ccc41cef14 100644
> --- a/drivers/scsi/qla2xxx/qla_edif.c
> +++ b/drivers/scsi/qla2xxx/qla_edif.c
> @@ -1551,7 +1551,7 @@ qla24xx_sadb_update(struct bsg_job *bsg_job)
> 		ql_dbg(ql_dbg_edif, vha, 0x70a3, "Failed to find port= %06x\n",
> 		    sa_frame.port_id.b24);
> 		rval = -EINVAL;
> -		SET_DID_STATUS(bsg_reply->result, DID_TARGET_FAILURE);
> +		SET_DID_STATUS(bsg_reply->result, DID_NO_CONNECT);
> 		goto done;
> 	}
> 
> -- 
> 2.18.2
> 

FWIW. Looks Good.

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

-- 
Himanshu Madhani	Oracle Linux Engineering


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 00/10] scsi: Fix internal host code use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (9 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 10/10] scsi: Remove useless host error codes Mike Christie
@ 2022-09-07  2:14 ` Martin K. Petersen
  10 siblings, 0 replies; 14+ messages in thread
From: Martin K. Petersen @ 2022-09-07  2:14 UTC (permalink / raw)
  To: Mike Christie
  Cc: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley


Mike,

> The following patches made over Martin's 5.20 staging branch fix an
> issue where we intended the host codes:

Applied to 6.1/scsi-staging, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-09-07  2:15 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
2022-08-12  1:00 ` [PATCH v2 01/10] scsi: xen: Drop use of internal host codes Mike Christie
2022-08-15  6:12   ` Juergen Gross
2022-08-12  1:00 ` [PATCH v2 02/10] scsi: storvsc: Drop DID_TARGET_FAILURE use Mike Christie
2022-08-12  1:00 ` [PATCH v2 03/10] scsi: uas: " Mike Christie
2022-08-12  1:00 ` [PATCH v2 04/10] scsi: virtio_scsi: " Mike Christie
2022-08-12  1:00 ` [PATCH v2 05/10] scsi: virtio_scsi: Drop DID_NEXUS_FAILURE use Mike Christie
2022-08-12  1:00 ` [PATCH v2 06/10] scsi: qla2xxx: Drop DID_TARGET_FAILURE use Mike Christie
2022-08-19 19:42   ` Himanshu Madhani
2022-08-12  1:00 ` [PATCH v2 07/10] scsi: cxlflash: Drop DID_ALLOC_FAILURE use Mike Christie
2022-08-12  1:00 ` [PATCH v2 08/10] scsi: Add error codes for internal scsi-ml use Mike Christie
2022-08-12  1:00 ` [PATCH v2 09/10] scsi: Convert scsi_decide_disposition to use SCSIML_STAT Mike Christie
2022-08-12  1:00 ` [PATCH v2 10/10] scsi: Remove useless host error codes Mike Christie
2022-09-07  2:14 ` [PATCH v2 00/10] scsi: Fix internal host code use Martin K. Petersen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.