linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/10] scsi: Fix internal host code use
@ 2022-08-12  1:00 Mike Christie
  2022-08-12  1:00 ` [PATCH v2 01/10] scsi: xen: Drop use of internal host codes Mike Christie
                   ` (10 more replies)
  0 siblings, 11 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley

The following patches made over Martin's 5.20 staging branch fix an issue
where we intended the host codes:

DID_TARGET_FAILURE
DID_NEXUS_FAILURE
DID_ALLOC_FAILURE
DID_MEDIUM_ERROR

to be internal to scsi-ml and used to tell the completion path what type
of error it was without having to re-parse the sense, host codes and
status. But at some point drivers started using them and the driver
writers never updated scsi-ml, so we now have the bugs:

1. scsi_result_to_blk_status clears those codes, so they are not
propagated upwards. SG IO/passthrough users will then not see an error
and think a command was successful.

2. The SCSI error handler runs because scsi_decide_disposition has no
case statements for them and we return FAILED.

This patchset converts the drivers to stop using these codes, and then
moves them to scsi_priv.h in a new error byte so they can only be used
by scsi-ml.

v2:
- Rename SPACE_ALLOC to NOSPC
- Fix up email subject formatting
- Drop part of xen patch that removed zen definitions.




^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 01/10] scsi: xen: Drop use of internal host codes
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-15  6:12   ` Juergen Gross
  2022-08-12  1:00 ` [PATCH v2 02/10] scsi: storvsc: Drop DID_TARGET_FAILURE use Mike Christie
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

The error codes:

DID_TARGET_FAILURE
DID_NEXUS_FAILURE
DID_ALLOC_FAILURE
DID_MEDIUM_ERROR

are internal to the SCSI layer. Drivers must not use them because:

1. They are not propagated upwards, so SG IO/passthrough users will not
see an error and think a command was successful.

xen-scsiback will never see this error and should not try to send it.

2. There is no handling for them in scsi_decide_disposition so if
xen-scsifront were to return the error to scsi-ml then it kicks off the
error handler which is definitely not what we want.

This patch remove the use from xen-scsifront/back.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/xen-scsifront.c |  8 --------
 drivers/xen/xen-scsiback.c   | 12 ------------
 2 files changed, 20 deletions(-)

diff --git a/drivers/scsi/xen-scsifront.c b/drivers/scsi/xen-scsifront.c
index 51afc66e839d..66b316d173b0 100644
--- a/drivers/scsi/xen-scsifront.c
+++ b/drivers/scsi/xen-scsifront.c
@@ -289,14 +289,6 @@ static unsigned int scsifront_host_byte(int32_t rslt)
 		return DID_TRANSPORT_DISRUPTED;
 	case XEN_VSCSIIF_RSLT_HOST_TRANSPORT_FAILFAST:
 		return DID_TRANSPORT_FAILFAST;
-	case XEN_VSCSIIF_RSLT_HOST_TARGET_FAILURE:
-		return DID_TARGET_FAILURE;
-	case XEN_VSCSIIF_RSLT_HOST_NEXUS_FAILURE:
-		return DID_NEXUS_FAILURE;
-	case XEN_VSCSIIF_RSLT_HOST_ALLOC_FAILURE:
-		return DID_ALLOC_FAILURE;
-	case XEN_VSCSIIF_RSLT_HOST_MEDIUM_ERROR:
-		return DID_MEDIUM_ERROR;
 	case XEN_VSCSIIF_RSLT_HOST_TRANSPORT_MARGINAL:
 		return DID_TRANSPORT_MARGINAL;
 	default:
diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c
index 7a0c93acc2c5..e98c88a960d8 100644
--- a/drivers/xen/xen-scsiback.c
+++ b/drivers/xen/xen-scsiback.c
@@ -333,18 +333,6 @@ static int32_t scsiback_result(int32_t result)
 	case DID_TRANSPORT_FAILFAST:
 		host_status = XEN_VSCSIIF_RSLT_HOST_TRANSPORT_FAILFAST;
 		break;
-	case DID_TARGET_FAILURE:
-		host_status = XEN_VSCSIIF_RSLT_HOST_TARGET_FAILURE;
-		break;
-	case DID_NEXUS_FAILURE:
-		host_status = XEN_VSCSIIF_RSLT_HOST_NEXUS_FAILURE;
-		break;
-	case DID_ALLOC_FAILURE:
-		host_status = XEN_VSCSIIF_RSLT_HOST_ALLOC_FAILURE;
-		break;
-	case DID_MEDIUM_ERROR:
-		host_status = XEN_VSCSIIF_RSLT_HOST_MEDIUM_ERROR;
-		break;
 	case DID_TRANSPORT_MARGINAL:
 		host_status = XEN_VSCSIIF_RSLT_HOST_TRANSPORT_MARGINAL;
 		break;
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 02/10] scsi: storvsc: Drop DID_TARGET_FAILURE use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
  2022-08-12  1:00 ` [PATCH v2 01/10] scsi: xen: Drop use of internal host codes Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-12  1:00 ` [PATCH v2 03/10] scsi: uas: " Mike Christie
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
because:

1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.

2. There is no handling for them in scsi_decide_disposition so it results
in the scsi eh running.

It looks like the driver wanted a hard failure so this swaps it with
DID_BAD_TARGET.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/storvsc_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index fe000da11332..25c44c87c972 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1029,7 +1029,7 @@ static void storvsc_handle_error(struct vmscsi_request *vm_srb,
 	 */
 	wrk = kmalloc(sizeof(struct storvsc_scan_work), GFP_ATOMIC);
 	if (!wrk) {
-		set_host_byte(scmnd, DID_TARGET_FAILURE);
+		set_host_byte(scmnd, DID_BAD_TARGET);
 		return;
 	}
 
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 03/10] scsi: uas: Drop DID_TARGET_FAILURE use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
  2022-08-12  1:00 ` [PATCH v2 01/10] scsi: xen: Drop use of internal host codes Mike Christie
  2022-08-12  1:00 ` [PATCH v2 02/10] scsi: storvsc: Drop DID_TARGET_FAILURE use Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-12  1:00 ` [PATCH v2 04/10] scsi: virtio_scsi: " Mike Christie
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
because:

1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.

2. There is no handling for them in scsi_decide_disposition so it results
in the scsi eh running.

It looks like the driver wanted a hard failure so this swaps it with
DID_BAD_TARGET which gives us that behavior and the error looks like it's
for a case where the target did not support a TMF we wanted to use (maybe
not a bad target but disappointing so close enough).

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/usb/storage/uas.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index 84dc270f6f73..de3836412bf3 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -283,7 +283,7 @@ static bool uas_evaluate_response_iu(struct response_iu *riu, struct scsi_cmnd *
 		set_host_byte(cmnd, DID_OK);
 		break;
 	case RC_TMF_NOT_SUPPORTED:
-		set_host_byte(cmnd, DID_TARGET_FAILURE);
+		set_host_byte(cmnd, DID_BAD_TARGET);
 		break;
 	default:
 		uas_log_cmd_state(cmnd, "response iu", response_code);
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 04/10] scsi: virtio_scsi: Drop DID_TARGET_FAILURE use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (2 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 03/10] scsi: uas: " Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-12  1:00 ` [PATCH v2 05/10] scsi: virtio_scsi: Drop DID_NEXUS_FAILURE use Mike Christie
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
because:

1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.

2. There is no handling for them in scsi_decide_disposition so it results
in the scsi eh running.

It looks like virtio_scsi gets this when something like qemu returns
VIRTIO_SCSI_S_TARGET_FAILURE. It looks like qemu returns that error code
if a host OS returns it, but this shouldn't happen for linux since we
never propagate that error to userspace.

This has us use DID_BAD_TARGET in case some other virt layer is returning
it. In that case we will still get a hard error like before and it conveys
something unexpected happened.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/virtio_scsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 578c4b6d0f7d..112d8c3962b0 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -141,7 +141,7 @@ static void virtscsi_complete_cmd(struct virtio_scsi *vscsi, void *buf)
 		set_host_byte(sc, DID_TRANSPORT_DISRUPTED);
 		break;
 	case VIRTIO_SCSI_S_TARGET_FAILURE:
-		set_host_byte(sc, DID_TARGET_FAILURE);
+		set_host_byte(sc, DID_BAD_TARGET);
 		break;
 	case VIRTIO_SCSI_S_NEXUS_FAILURE:
 		set_host_byte(sc, DID_NEXUS_FAILURE);
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 05/10] scsi: virtio_scsi: Drop DID_NEXUS_FAILURE use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (3 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 04/10] scsi: virtio_scsi: " Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-12  1:00 ` [PATCH v2 06/10] scsi: qla2xxx: Drop DID_TARGET_FAILURE use Mike Christie
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

DID_NEXUS_FAILURE is internal to the SCSI layer. Drivers must not use it
because:

1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.

2. There is no handling for them in scsi_decide_disposition so it results
in the scsi eh running.

It looks like virtio_scsi gets this when something like qemu returns
VIRTIO_SCSI_S_NEXUS_FAILURE. It looks like qemu returns that error code
if host OS returns DID_NEXUS_FAILURE (qemu's internal
SCSI_HOST_RESERVATION_ERROR maps to DID_NEXUS_FAILURE). This shouldn't
happen for linux since we don't propagate that error code to userspace.

This has us convert VIRTIO_SCSI_S_NEXUS_FAILURE to a
SAM_STAT_RESERVATION_CONFLICT in case some other virt layer is returning
it. In that case we will still get the reservation confict failure we
expect.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/virtio_scsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 112d8c3962b0..00cf6743db8c 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -144,7 +144,7 @@ static void virtscsi_complete_cmd(struct virtio_scsi *vscsi, void *buf)
 		set_host_byte(sc, DID_BAD_TARGET);
 		break;
 	case VIRTIO_SCSI_S_NEXUS_FAILURE:
-		set_host_byte(sc, DID_NEXUS_FAILURE);
+		set_status_byte(sc, SAM_STAT_RESERVATION_CONFLICT);
 		break;
 	default:
 		scmd_printk(KERN_WARNING, sc, "Unknown response %d",
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 06/10] scsi: qla2xxx: Drop DID_TARGET_FAILURE use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (4 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 05/10] scsi: virtio_scsi: Drop DID_NEXUS_FAILURE use Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-19 19:42   ` Himanshu Madhani
  2022-08-12  1:00 ` [PATCH v2 07/10] scsi: cxlflash: Drop DID_ALLOC_FAILURE use Mike Christie
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
because:

1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.

2. There is no handling for them in scsi_decide_disposition so it results
in the scsi eh running.

This has qla2xxx use DID_NO_CONNECT because it looks like we hit this
error when we can't find a port. It will give us the same hard error
behavior and it seems to match the error where we can't find the
endpoint.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/qla2xxx/qla_edif.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_edif.c b/drivers/scsi/qla2xxx/qla_edif.c
index 400a8b6f3982..00ccc41cef14 100644
--- a/drivers/scsi/qla2xxx/qla_edif.c
+++ b/drivers/scsi/qla2xxx/qla_edif.c
@@ -1551,7 +1551,7 @@ qla24xx_sadb_update(struct bsg_job *bsg_job)
 		ql_dbg(ql_dbg_edif, vha, 0x70a3, "Failed to find port= %06x\n",
 		    sa_frame.port_id.b24);
 		rval = -EINVAL;
-		SET_DID_STATUS(bsg_reply->result, DID_TARGET_FAILURE);
+		SET_DID_STATUS(bsg_reply->result, DID_NO_CONNECT);
 		goto done;
 	}
 
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 07/10] scsi: cxlflash: Drop DID_ALLOC_FAILURE use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (5 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 06/10] scsi: qla2xxx: Drop DID_TARGET_FAILURE use Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-12  1:00 ` [PATCH v2 08/10] scsi: Add error codes for internal scsi-ml use Mike Christie
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie

DID_ALLOC_FAILURE is internal to the SCSI layer. Drivers must not use it
because:

1. It's not propagated upwards, so SG IO/passthrough users will not see an
error and think a command was successful.

2. There is no handling for them in scsi_decide_disposition so it results
in the scsi eh running.

By the code comment, it looks like the driver wanted a retryable error
code, so this has it use DID_ERROR.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/cxlflash/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/cxlflash/main.c b/drivers/scsi/cxlflash/main.c
index e7be95ee7d64..cd1324ec742d 100644
--- a/drivers/scsi/cxlflash/main.c
+++ b/drivers/scsi/cxlflash/main.c
@@ -132,7 +132,7 @@ static void process_cmd_err(struct afu_cmd *cmd, struct scsi_cmnd *scp)
 			break;
 		case SISL_AFU_RC_OUT_OF_DATA_BUFS:
 			/* Retry */
-			scp->result = (DID_ALLOC_FAILURE << 16);
+			scp->result = (DID_ERROR << 16);
 			break;
 		default:
 			scp->result = (DID_ERROR << 16);
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 08/10] scsi: Add error codes for internal scsi-ml use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (6 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 07/10] scsi: cxlflash: Drop DID_ALLOC_FAILURE use Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-12  1:00 ` [PATCH v2 09/10] scsi: Convert scsi_decide_disposition to use SCSIML_STAT Mike Christie
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie, Bart Van Assche

If a driver returns:

DID_TARGET_FAILURE
DID_NEXUS_FAILURE
DID_ALLOC_FAILURE
DID_MEDIUM_ERROR

we hit a couple bugs:

1. The SCSI error handler runs because scsi_decide_disposition has no
case statements for them and we return FAILED.

2. For SG IO the userspace app gets a success status instead of failed,
because scsi_result_to_blk_status clears those errors.

This patch adds a new internal error code byte for use by scsi-ml. It
will be used instead of the above error codes, so we don't have to play
that clearing the host code game in scsi_result_to_blk_status and
drivers cannot accidentally use them.

The next patch will then remove the internal users of the above codes and
convert us to use the new ones.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_error.c |  5 +++++
 drivers/scsi/scsi_lib.c   | 22 ++++++++++++++++++++++
 drivers/scsi/scsi_priv.h  | 11 +++++++++++
 3 files changed, 38 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 448748e3fba5..d09b9ba1518c 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -514,6 +514,11 @@ static void scsi_report_sense(struct scsi_device *sdev,
 	}
 }
 
+static inline void set_scsi_ml_byte(struct scsi_cmnd *cmd, u8 status)
+{
+	cmd->result = (cmd->result & 0xffff00ff) | (status << 8);
+}
+
 /**
  * scsi_check_sense - Examine scsi cmd sense
  * @scmd:	Cmd to have sense checked.
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 4dbd29ab1dcc..92b8c050697e 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -576,6 +576,11 @@ static bool scsi_end_request(struct request *req, blk_status_t error,
 	return false;
 }
 
+static inline u8 get_scsi_ml_byte(int result)
+{
+	return (result >> 8) & 0xff;
+}
+
 /**
  * scsi_result_to_blk_status - translate a SCSI result code into blk_status_t
  * @cmd:	SCSI command
@@ -586,6 +591,23 @@ static bool scsi_end_request(struct request *req, blk_status_t error,
  */
 static blk_status_t scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result)
 {
+	/*
+	 * Check the scsi-ml byte first in case we converted a host or status
+	 * byte.
+	 */
+	switch (get_scsi_ml_byte(result)) {
+	case SCSIML_STAT_OK:
+		break;
+	case SCSIML_STAT_RESV_CONFLICT:
+		return BLK_STS_NEXUS;
+	case SCSIML_STAT_NOSPC:
+		return BLK_STS_NOSPC;
+	case SCSIML_STAT_MED_ERROR:
+		return BLK_STS_MEDIUM;
+	case SCSIML_STAT_TGT_FAILURE:
+		return BLK_STS_TARGET;
+	}
+
 	switch (host_byte(result)) {
 	case DID_OK:
 		if (scsi_status_is_good(result))
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 429663bd78ec..2b9e0559ddcb 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -18,6 +18,17 @@ struct scsi_nl_hdr;
 
 #define SCSI_CMD_RETRIES_NO_LIMIT -1
 
+/*
+ * Error codes used by scsi-ml internally. These must not be used by drivers.
+ */
+enum scsi_ml_status {
+	SCSIML_STAT_OK			= 0x00,
+	SCSIML_STAT_RESV_CONFLICT	= 0x01,	/* Reservation conflict */
+	SCSIML_STAT_NOSPC		= 0x02,	/* Space allocation on the dev failed */
+	SCSIML_STAT_MED_ERROR		= 0x03,	/* Medium error */
+	SCSIML_STAT_TGT_FAILURE		= 0x04,	/* Permanent target failure */
+};
+
 /*
  * Scsi Error Handler Flags
  */
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 09/10] scsi: Convert scsi_decide_disposition to use SCSIML_STAT
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (7 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 08/10] scsi: Add error codes for internal scsi-ml use Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-08-12  1:00 ` [PATCH v2 10/10] scsi: Remove useless host error codes Mike Christie
  2022-09-07  2:14 ` [PATCH v2 00/10] scsi: Fix internal host code use Martin K. Petersen
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie, Bart Van Assche

Don't use:

DID_TARGET_FAILURE
DID_NEXUS_FAILURE
DID_ALLOC_FAILURE
DID_MEDIUM_ERROR

Instead use the scsi-ml internal values.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_error.c | 12 ++++++------
 drivers/scsi/scsi_lib.c   | 24 +++++-------------------
 2 files changed, 11 insertions(+), 25 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index d09b9ba1518c..b5fa2aad05f9 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -649,7 +649,7 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd)
 	case DATA_PROTECT:
 		if (sshdr.asc == 0x27 && sshdr.ascq == 0x07) {
 			/* Thin provisioning hard threshold reached */
-			set_host_byte(scmd, DID_ALLOC_FAILURE);
+			set_scsi_ml_byte(scmd, SCSIML_STAT_NOSPC);
 			return SUCCESS;
 		}
 		fallthrough;
@@ -657,14 +657,14 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd)
 	case VOLUME_OVERFLOW:
 	case MISCOMPARE:
 	case BLANK_CHECK:
-		set_host_byte(scmd, DID_TARGET_FAILURE);
+		set_scsi_ml_byte(scmd, SCSIML_STAT_TGT_FAILURE);
 		return SUCCESS;
 
 	case MEDIUM_ERROR:
 		if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */
 		    sshdr.asc == 0x13 || /* AMNF DATA FIELD */
 		    sshdr.asc == 0x14) { /* RECORD NOT FOUND */
-			set_host_byte(scmd, DID_MEDIUM_ERROR);
+			set_scsi_ml_byte(scmd, SCSIML_STAT_MED_ERROR);
 			return SUCCESS;
 		}
 		return NEEDS_RETRY;
@@ -673,7 +673,7 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd)
 		if (scmd->device->retry_hwerror)
 			return ADD_TO_MLQUEUE;
 		else
-			set_host_byte(scmd, DID_TARGET_FAILURE);
+			set_scsi_ml_byte(scmd, SCSIML_STAT_TGT_FAILURE);
 		fallthrough;
 
 	case ILLEGAL_REQUEST:
@@ -683,7 +683,7 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd)
 		    sshdr.asc == 0x24 || /* Invalid field in cdb */
 		    sshdr.asc == 0x26 || /* Parameter value invalid */
 		    sshdr.asc == 0x27) { /* Write protected */
-			set_host_byte(scmd, DID_TARGET_FAILURE);
+			set_scsi_ml_byte(scmd, SCSIML_STAT_TGT_FAILURE);
 		}
 		return SUCCESS;
 
@@ -1988,7 +1988,7 @@ enum scsi_disposition scsi_decide_disposition(struct scsi_cmnd *scmd)
 	case SAM_STAT_RESERVATION_CONFLICT:
 		sdev_printk(KERN_INFO, scmd->device,
 			    "reservation conflict\n");
-		set_host_byte(scmd, DID_NEXUS_FAILURE);
+		set_scsi_ml_byte(scmd, SCSIML_STAT_RESV_CONFLICT);
 		return SUCCESS; /* causes immediate i/o error */
 	}
 	return FAILED;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 92b8c050697e..473d9403f0c1 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -583,13 +583,11 @@ static inline u8 get_scsi_ml_byte(int result)
 
 /**
  * scsi_result_to_blk_status - translate a SCSI result code into blk_status_t
- * @cmd:	SCSI command
  * @result:	scsi error code
  *
- * Translate a SCSI result code into a blk_status_t value. May reset the host
- * byte of @cmd->result.
+ * Translate a SCSI result code into a blk_status_t value.
  */
-static blk_status_t scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result)
+static blk_status_t scsi_result_to_blk_status(int result)
 {
 	/*
 	 * Check the scsi-ml byte first in case we converted a host or status
@@ -616,18 +614,6 @@ static blk_status_t scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result)
 	case DID_TRANSPORT_FAILFAST:
 	case DID_TRANSPORT_MARGINAL:
 		return BLK_STS_TRANSPORT;
-	case DID_TARGET_FAILURE:
-		set_host_byte(cmd, DID_OK);
-		return BLK_STS_TARGET;
-	case DID_NEXUS_FAILURE:
-		set_host_byte(cmd, DID_OK);
-		return BLK_STS_NEXUS;
-	case DID_ALLOC_FAILURE:
-		set_host_byte(cmd, DID_OK);
-		return BLK_STS_NOSPC;
-	case DID_MEDIUM_ERROR:
-		set_host_byte(cmd, DID_OK);
-		return BLK_STS_MEDIUM;
 	default:
 		return BLK_STS_IOERR;
 	}
@@ -715,7 +701,7 @@ static void scsi_io_completion_action(struct scsi_cmnd *cmd, int result)
 	if (sense_valid)
 		sense_current = !scsi_sense_is_deferred(&sshdr);
 
-	blk_stat = scsi_result_to_blk_status(cmd, result);
+	blk_stat = scsi_result_to_blk_status(result);
 
 	if (host_byte(result) == DID_RESET) {
 		/* Third party bus reset or reset for error recovery
@@ -893,14 +879,14 @@ static int scsi_io_completion_nz_result(struct scsi_cmnd *cmd, int result,
 					     SCSI_SENSE_BUFFERSIZE);
 		}
 		if (sense_current)
-			*blk_statp = scsi_result_to_blk_status(cmd, result);
+			*blk_statp = scsi_result_to_blk_status(result);
 	} else if (blk_rq_bytes(req) == 0 && sense_current) {
 		/*
 		 * Flush commands do not transfers any data, and thus cannot use
 		 * good_bytes != blk_rq_bytes(req) as the signal for an error.
 		 * This sets *blk_statp explicitly for the problem case.
 		 */
-		*blk_statp = scsi_result_to_blk_status(cmd, result);
+		*blk_statp = scsi_result_to_blk_status(result);
 	}
 	/*
 	 * Recovered errors need reporting, but they're always treated as
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 10/10] scsi: Remove useless host error codes
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (8 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 09/10] scsi: Convert scsi_decide_disposition to use SCSIML_STAT Mike Christie
@ 2022-08-12  1:00 ` Mike Christie
  2022-09-07  2:14 ` [PATCH v2 00/10] scsi: Fix internal host code use Martin K. Petersen
  10 siblings, 0 replies; 14+ messages in thread
From: Mike Christie @ 2022-08-12  1:00 UTC (permalink / raw)
  To: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley
  Cc: Mike Christie, Bart Van Assche

The host codes that were supposed to only be used for internal use are
now not used, so remove them.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
---
 include/scsi/scsi_status.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/scsi/scsi_status.h b/include/scsi/scsi_status.h
index 31d30cee1869..9cb85262de64 100644
--- a/include/scsi/scsi_status.h
+++ b/include/scsi/scsi_status.h
@@ -62,12 +62,12 @@ enum scsi_host_status {
 					 * recover the link. Transport class will
 					 * retry or fail IO */
 	DID_TRANSPORT_FAILFAST = 0x0f, /* Transport class fastfailed the io */
-	DID_TARGET_FAILURE = 0x10, /* Permanent target failure, do not retry on
-				    * other paths */
-	DID_NEXUS_FAILURE = 0x11,  /* Permanent nexus failure, retry on other
-				    * paths might yield different results */
-	DID_ALLOC_FAILURE = 0x12,  /* Space allocation on the device failed */
-	DID_MEDIUM_ERROR = 0x13,  /* Medium error */
+	/*
+	 * We used to have DID_TARGET_FAILURE, DID_NEXUS_FAILURE,
+	 * DID_ALLOC_FAILURE and DID_MEDIUM_ERROR at 0x10 - 0x13. For compat
+	 * with userspace apps that parse the host byte for SG IO, we leave
+	 * that block of codes unused and start at 0x14 below.
+	 */
 	DID_TRANSPORT_MARGINAL = 0x14, /* Transport marginal errors */
 };
 
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 01/10] scsi: xen: Drop use of internal host codes
  2022-08-12  1:00 ` [PATCH v2 01/10] scsi: xen: Drop use of internal host codes Mike Christie
@ 2022-08-15  6:12   ` Juergen Gross
  0 siblings, 0 replies; 14+ messages in thread
From: Juergen Gross @ 2022-08-15  6:12 UTC (permalink / raw)
  To: Mike Christie, njavali, pbonzini, jasowang, mst, stefanha,
	oneukum, mrochs, ukrishn, martin.petersen, linux-scsi,
	james.bottomley


[-- Attachment #1.1.1: Type: text/plain, Size: 819 bytes --]

On 12.08.22 03:00, Mike Christie wrote:
> The error codes:
> 
> DID_TARGET_FAILURE
> DID_NEXUS_FAILURE
> DID_ALLOC_FAILURE
> DID_MEDIUM_ERROR
> 
> are internal to the SCSI layer. Drivers must not use them because:
> 
> 1. They are not propagated upwards, so SG IO/passthrough users will not
> see an error and think a command was successful.
> 
> xen-scsiback will never see this error and should not try to send it.
> 
> 2. There is no handling for them in scsi_decide_disposition so if
> xen-scsifront were to return the error to scsi-ml then it kicks off the
> error handler which is definitely not what we want.
> 
> This patch remove the use from xen-scsifront/back.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3149 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 06/10] scsi: qla2xxx: Drop DID_TARGET_FAILURE use
  2022-08-12  1:00 ` [PATCH v2 06/10] scsi: qla2xxx: Drop DID_TARGET_FAILURE use Mike Christie
@ 2022-08-19 19:42   ` Himanshu Madhani
  0 siblings, 0 replies; 14+ messages in thread
From: Himanshu Madhani @ 2022-08-19 19:42 UTC (permalink / raw)
  To: Michael Christie
  Cc: jgross, Nilesh Javali, pbonzini, jasowang, Michael S. Tsirkin,
	Stefan Hajnoczi, Oliver Neukum, mrochs, ukrishn, Martin Petersen,
	linux-scsi, james.bottomley



> On Aug 11, 2022, at 6:00 PM, Mike Christie <michael.christie@oracle.com> wrote:
> 
> DID_TARGET_FAILURE is internal to the SCSI layer. Drivers must not use it
> because:
> 
> 1. It's not propagated upwards, so SG IO/passthrough users will not see an
> error and think a command was successful.
> 
> 2. There is no handling for them in scsi_decide_disposition so it results
> in the scsi eh running.
> 
> This has qla2xxx use DID_NO_CONNECT because it looks like we hit this
> error when we can't find a port. It will give us the same hard error
> behavior and it seems to match the error where we can't find the
> endpoint.
> 
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
> drivers/scsi/qla2xxx/qla_edif.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/qla2xxx/qla_edif.c b/drivers/scsi/qla2xxx/qla_edif.c
> index 400a8b6f3982..00ccc41cef14 100644
> --- a/drivers/scsi/qla2xxx/qla_edif.c
> +++ b/drivers/scsi/qla2xxx/qla_edif.c
> @@ -1551,7 +1551,7 @@ qla24xx_sadb_update(struct bsg_job *bsg_job)
> 		ql_dbg(ql_dbg_edif, vha, 0x70a3, "Failed to find port= %06x\n",
> 		    sa_frame.port_id.b24);
> 		rval = -EINVAL;
> -		SET_DID_STATUS(bsg_reply->result, DID_TARGET_FAILURE);
> +		SET_DID_STATUS(bsg_reply->result, DID_NO_CONNECT);
> 		goto done;
> 	}
> 
> -- 
> 2.18.2
> 

FWIW. Looks Good.

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

-- 
Himanshu Madhani	Oracle Linux Engineering


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 00/10] scsi: Fix internal host code use
  2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
                   ` (9 preceding siblings ...)
  2022-08-12  1:00 ` [PATCH v2 10/10] scsi: Remove useless host error codes Mike Christie
@ 2022-09-07  2:14 ` Martin K. Petersen
  10 siblings, 0 replies; 14+ messages in thread
From: Martin K. Petersen @ 2022-09-07  2:14 UTC (permalink / raw)
  To: Mike Christie
  Cc: jgross, njavali, pbonzini, jasowang, mst, stefanha, oneukum,
	mrochs, ukrishn, martin.petersen, linux-scsi, james.bottomley


Mike,

> The following patches made over Martin's 5.20 staging branch fix an
> issue where we intended the host codes:

Applied to 6.1/scsi-staging, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-09-07  2:15 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-12  1:00 [PATCH v2 00/10] scsi: Fix internal host code use Mike Christie
2022-08-12  1:00 ` [PATCH v2 01/10] scsi: xen: Drop use of internal host codes Mike Christie
2022-08-15  6:12   ` Juergen Gross
2022-08-12  1:00 ` [PATCH v2 02/10] scsi: storvsc: Drop DID_TARGET_FAILURE use Mike Christie
2022-08-12  1:00 ` [PATCH v2 03/10] scsi: uas: " Mike Christie
2022-08-12  1:00 ` [PATCH v2 04/10] scsi: virtio_scsi: " Mike Christie
2022-08-12  1:00 ` [PATCH v2 05/10] scsi: virtio_scsi: Drop DID_NEXUS_FAILURE use Mike Christie
2022-08-12  1:00 ` [PATCH v2 06/10] scsi: qla2xxx: Drop DID_TARGET_FAILURE use Mike Christie
2022-08-19 19:42   ` Himanshu Madhani
2022-08-12  1:00 ` [PATCH v2 07/10] scsi: cxlflash: Drop DID_ALLOC_FAILURE use Mike Christie
2022-08-12  1:00 ` [PATCH v2 08/10] scsi: Add error codes for internal scsi-ml use Mike Christie
2022-08-12  1:00 ` [PATCH v2 09/10] scsi: Convert scsi_decide_disposition to use SCSIML_STAT Mike Christie
2022-08-12  1:00 ` [PATCH v2 10/10] scsi: Remove useless host error codes Mike Christie
2022-09-07  2:14 ` [PATCH v2 00/10] scsi: Fix internal host code use Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).