All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] scsi: Erroneous handling of sense status in SCSI EH commands
@ 2011-01-21  8:07 Hannes Reinecke
  2011-01-21 20:49 ` Mike Christie
  2011-02-12 16:30 ` James Bottomley
  0 siblings, 2 replies; 10+ messages in thread
From: Hannes Reinecke @ 2011-01-21  8:07 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, Hannes Reinecke

Consider a scenario where a SCSI EH command like TUR fails with a NOT READY
status - MANUAL INTERVENTION REQUIRED (02/04/03). As evident in the ASC/ASCQ
description, manual intervention is required in this case i.e. the target wants
TUR to fail here so that the host administrator can take corrective action.

But this particular ASC/ASCQ is not handled in the scsi_check_sense, which
ultimately causes scsi_eh_tur to erroneously report a SUCCESS (device ready)
instead of the actual FAILURE state (device not ready).

This patch converts scsi_check_sense() to return SOFT_ERROR in
cases where an error has been signalled via the sense code, but
we should not wake the error handler. This error code will be
reverted back to SUCCESS for normal command handling, but
scsi_eh_completed_normally() will convert it to FAILED.

Reported-by: Martin George <marting@netapp.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 45c7564..e86e62e 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -223,7 +223,7 @@ static inline void scsi_eh_prt_fail_stats(struct Scsi_Host *shost,
  * @scmd:	Cmd to have sense checked.
  *
  * Return value:
- * 	SUCCESS or FAILED or NEEDS_RETRY
+ * 	SUCCESS or FAILED or NEEDS_RETRY or SOFT_ERROR
  *
  * Notes:
  *	When a deferred error is detected the current command has
@@ -278,7 +278,7 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 
 	case ABORTED_COMMAND:
 		if (sshdr.asc == 0x10) /* DIF */
-			return SUCCESS;
+			return SOFT_ERROR;
 
 		return NEEDS_RETRY;
 	case NOT_READY:
@@ -324,19 +324,19 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 		 * Pass the UA upwards for a determination in the completion
 		 * functions.
 		 */
-		return SUCCESS;
+		return SOFT_ERROR;
 
 		/* these three are not supported */
 	case COPY_ABORTED:
 	case VOLUME_OVERFLOW:
 	case MISCOMPARE:
-		return SUCCESS;
+		return SOFT_ERROR;
 
 	case MEDIUM_ERROR:
 		if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */
 		    sshdr.asc == 0x13 || /* AMNF DATA FIELD */
 		    sshdr.asc == 0x14) { /* RECORD NOT FOUND */
-			return SUCCESS;
+			return SOFT_ERROR;
 		}
 		return NEEDS_RETRY;
 
@@ -344,13 +344,13 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 		if (scmd->device->retry_hwerror)
 			return ADD_TO_MLQUEUE;
 		else
-			return SUCCESS;
+			return SOFT_ERROR;
 
 	case ILLEGAL_REQUEST:
 	case BLANK_CHECK:
 	case DATA_PROTECT:
 	default:
-		return SUCCESS;
+		return SOFT_ERROR;
 	}
 }
 
@@ -1469,10 +1469,15 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		rtn = scsi_check_sense(scmd);
 		if (rtn == NEEDS_RETRY)
 			goto maybe_retry;
-		/* if rtn == FAILED, we have no sense information;
+		/* If rtn == FAILED, we have no sense information;
 		 * returning FAILED will wake the error handler thread
 		 * to collect the sense and redo the decide
-		 * disposition */
+		 * disposition.
+		 * If rtn == SOFT_ERROR an error has occurred, but
+		 * we do _not_ need to wake the error handler.
+		 * So return SUCCESS here. */
+		if (rtn == SOFT_ERROR)
+			rtn = SUCCESS;
 		return rtn;
 	case CONDITION_GOOD:
 	case INTERMEDIATE_GOOD:

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: Erroneous handling of sense status in SCSI EH commands
  2011-01-21  8:07 [PATCH] scsi: Erroneous handling of sense status in SCSI EH commands Hannes Reinecke
@ 2011-01-21 20:49 ` Mike Christie
  2011-01-24  7:59   ` Hannes Reinecke
  2011-02-12 16:30 ` James Bottomley
  1 sibling, 1 reply; 10+ messages in thread
From: Mike Christie @ 2011-01-21 20:49 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: James Bottomley, linux-scsi

On 01/21/2011 02:07 AM, Hannes Reinecke wrote:
> Consider a scenario where a SCSI EH command like TUR fails with a NOT READY
> status - MANUAL INTERVENTION REQUIRED (02/04/03). As evident in the ASC/ASCQ
> description, manual intervention is required in this case i.e. the target wants
> TUR to fail here so that the host administrator can take corrective action.
>
> But this particular ASC/ASCQ is not handled in the scsi_check_sense, which
> ultimately causes scsi_eh_tur to erroneously report a SUCCESS (device ready)
> instead of the actual FAILURE state (device not ready).
>
> This patch converts scsi_check_sense() to return SOFT_ERROR in
> cases where an error has been signalled via the sense code, but
> we should not wake the error handler. This error code will be
> reverted back to SUCCESS for normal command handling, but
> scsi_eh_completed_normally() will convert it to FAILED.
>
> Reported-by: Martin George<marting@netapp.com>
> Signed-off-by: Hannes Reinecke<hare@suse.de>
>
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 45c7564..e86e62e 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -223,7 +223,7 @@ static inline void scsi_eh_prt_fail_stats(struct Scsi_Host *shost,
>    * @scmd:	Cmd to have sense checked.
>    *
>    * Return value:
> - * 	SUCCESS or FAILED or NEEDS_RETRY
> + * 	SUCCESS or FAILED or NEEDS_RETRY or SOFT_ERROR
>    *
>    * Notes:
>    *	When a deferred error is detected the current command has
> @@ -278,7 +278,7 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
>
>   	case ABORTED_COMMAND:
>   		if (sshdr.asc == 0x10) /* DIF */
> -			return SUCCESS;
> +			return SOFT_ERROR;
>
>   		return NEEDS_RETRY;
>   	case NOT_READY:
> @@ -324,19 +324,19 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
>   		 * Pass the UA upwards for a determination in the completion
>   		 * functions.
>   		 */
> -		return SUCCESS;
> +		return SOFT_ERROR;
>


For normal IO execution the UA, not ready, and illegal request handling 
in scsi_io_completion would determine that some of these are retryable 
conditions, but with the patch in the scsi_error.cs path we will return 
soft error and fail. Also a lot of these that were retryable were 
returned as success in the past, but with the patch are being failed.

Martin made me the same bugzilla :) I was thinking we needed to make 
check_sense smarter or move some of the scsi_io_completion code to some 
lib helpers so scsi_error.c could call them.

And does scsi_eh_stu need to be covnerted to check for soft error and 
failed.

Also if we did this patch, then I think we also have to convert the 
device handlers.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: Erroneous handling of sense status in SCSI EH commands
  2011-01-21 20:49 ` Mike Christie
@ 2011-01-24  7:59   ` Hannes Reinecke
  2011-01-24 20:17     ` Mike Christie
  0 siblings, 1 reply; 10+ messages in thread
From: Hannes Reinecke @ 2011-01-24  7:59 UTC (permalink / raw)
  To: Mike Christie; +Cc: James Bottomley, linux-scsi

On 01/21/2011 09:49 PM, Mike Christie wrote:
> On 01/21/2011 02:07 AM, Hannes Reinecke wrote:
>> Consider a scenario where a SCSI EH command like TUR fails with a NOT
>> READY
>> status - MANUAL INTERVENTION REQUIRED (02/04/03). As evident in the
>> ASC/ASCQ
>> description, manual intervention is required in this case i.e. the
>> target wants
>> TUR to fail here so that the host administrator can take corrective
>> action.
>>
>> But this particular ASC/ASCQ is not handled in the scsi_check_sense,
>> which
>> ultimately causes scsi_eh_tur to erroneously report a SUCCESS (device
>> ready)
>> instead of the actual FAILURE state (device not ready).
>>
>> This patch converts scsi_check_sense() to return SOFT_ERROR in
>> cases where an error has been signalled via the sense code, but
>> we should not wake the error handler. This error code will be
>> reverted back to SUCCESS for normal command handling, but
>> scsi_eh_completed_normally() will convert it to FAILED.
>>
>> Reported-by: Martin George<marting@netapp.com>
>> Signed-off-by: Hannes Reinecke<hare@suse.de>
>>
>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
>> index 45c7564..e86e62e 100644
>> --- a/drivers/scsi/scsi_error.c
>> +++ b/drivers/scsi/scsi_error.c
>> @@ -223,7 +223,7 @@ static inline void scsi_eh_prt_fail_stats(struct
>> Scsi_Host *shost,
>>    * @scmd:    Cmd to have sense checked.
>>    *
>>    * Return value:
>> - *     SUCCESS or FAILED or NEEDS_RETRY
>> + *     SUCCESS or FAILED or NEEDS_RETRY or SOFT_ERROR
>>    *
>>    * Notes:
>>    *    When a deferred error is detected the current command has
>> @@ -278,7 +278,7 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
>>
>>       case ABORTED_COMMAND:
>>           if (sshdr.asc == 0x10) /* DIF */
>> -            return SUCCESS;
>> +            return SOFT_ERROR;
>>
>>           return NEEDS_RETRY;
>>       case NOT_READY:
>> @@ -324,19 +324,19 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
>>            * Pass the UA upwards for a determination in the completion
>>            * functions.
>>            */
>> -        return SUCCESS;
>> +        return SOFT_ERROR;
>>
> 
> 
> For normal IO execution the UA, not ready, and illegal request handling
> in scsi_io_completion would determine that some of these are retryable
> conditions, but with the patch in the scsi_error.cs path we will return
> soft error and fail. Also a lot of these that were retryable were
> returned as success in the past, but with the patch are being failed.
> 
No. scsi_check_sense() is not called from scsi_io_completion; that
just calls scsi_normalize_sense().
scsi_check_sense() is only called from

- scsi_eh_completed_normally()
- scsi_eh_stu()
- scsi_decide_disposition()

All of which have been checked against this patch.
scsi_io_completion() has it's own set of checks for sense codes
which I don't modify here.

> Martin made me the same bugzilla :) I was thinking we needed to make
> check_sense smarter or move some of the scsi_io_completion code to some
> lib helpers so scsi_error.c could call them.
> 
> And does scsi_eh_stu need to be covnerted to check for soft error and
> failed.
> 
No. scsi_eh_stu() needs to check if a START STOP UNIT command needs
to be send. The trigger for this is that scsi_check_sense() returns
FAILED:

		list_for_each_entry(scmd, work_q, eh_entry)
			if (scmd->device == sdev && SCSI_SENSE_VALID(scmd) &&
			    scsi_check_sense(scmd) == FAILED ) {
				stu_scmd = scmd;
				break;
			}

		if (!stu_scmd)
			continue;

so this logic hasn't changed.

> Also if we did this patch, then I think we also have to convert the
> device handlers.
Not necessarily. This patch just allows check_sense() to return
an additional error code; the device handlers are not forced to
use them. If they don't things will work like before.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: Erroneous handling of sense status in SCSI EH commands
  2011-01-24  7:59   ` Hannes Reinecke
@ 2011-01-24 20:17     ` Mike Christie
  2011-01-25  5:18       ` Mike Christie
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Christie @ 2011-01-24 20:17 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: James Bottomley, linux-scsi

On 01/24/2011 01:59 AM, Hannes Reinecke wrote:
> On 01/21/2011 09:49 PM, Mike Christie wrote:
>> On 01/21/2011 02:07 AM, Hannes Reinecke wrote:
>>> Consider a scenario where a SCSI EH command like TUR fails with a NOT
>>> READY
>>> status - MANUAL INTERVENTION REQUIRED (02/04/03). As evident in the
>>> ASC/ASCQ
>>> description, manual intervention is required in this case i.e. the
>>> target wants
>>> TUR to fail here so that the host administrator can take corrective
>>> action.
>>>
>>> But this particular ASC/ASCQ is not handled in the scsi_check_sense,
>>> which
>>> ultimately causes scsi_eh_tur to erroneously report a SUCCESS (device
>>> ready)
>>> instead of the actual FAILURE state (device not ready).
>>>
>>> This patch converts scsi_check_sense() to return SOFT_ERROR in
>>> cases where an error has been signalled via the sense code, but
>>> we should not wake the error handler. This error code will be
>>> reverted back to SUCCESS for normal command handling, but
>>> scsi_eh_completed_normally() will convert it to FAILED.
>>>
>>> Reported-by: Martin George<marting@netapp.com>
>>> Signed-off-by: Hannes Reinecke<hare@suse.de>
>>>
>>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
>>> index 45c7564..e86e62e 100644
>>> --- a/drivers/scsi/scsi_error.c
>>> +++ b/drivers/scsi/scsi_error.c
>>> @@ -223,7 +223,7 @@ static inline void scsi_eh_prt_fail_stats(struct
>>> Scsi_Host *shost,
>>>     * @scmd:    Cmd to have sense checked.
>>>     *
>>>     * Return value:
>>> - *     SUCCESS or FAILED or NEEDS_RETRY
>>> + *     SUCCESS or FAILED or NEEDS_RETRY or SOFT_ERROR
>>>     *
>>>     * Notes:
>>>     *    When a deferred error is detected the current command has
>>> @@ -278,7 +278,7 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
>>>
>>>        case ABORTED_COMMAND:
>>>            if (sshdr.asc == 0x10) /* DIF */
>>> -            return SUCCESS;
>>> +            return SOFT_ERROR;
>>>
>>>            return NEEDS_RETRY;
>>>        case NOT_READY:
>>> @@ -324,19 +324,19 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
>>>             * Pass the UA upwards for a determination in the completion
>>>             * functions.
>>>             */
>>> -        return SUCCESS;
>>> +        return SOFT_ERROR;
>>>
>>
>>
>> For normal IO execution the UA, not ready, and illegal request handling
>> in scsi_io_completion would determine that some of these are retryable
>> conditions, but with the patch in the scsi_error.cs path we will return
>> soft error and fail. Also a lot of these that were retryable were
>> returned as success in the past, but with the patch are being failed.
>>
> No. scsi_check_sense() is not called from scsi_io_completion; that

I did not say that it was. It is not directly called by 
scsi_io_completion, but for the normal IO completion path 
scsi_check_sense is called by scsi_decide_disposition and eventually 
scsi_io_completion is then called once we go through the ULD handling.

> just calls scsi_normalize_sense().
> scsi_check_sense() is only called from
>
> - scsi_eh_completed_normally()
> - scsi_eh_stu()
> - scsi_decide_disposition()
>
> All of which have been checked against this patch.
> scsi_io_completion() has it's own set of checks for sense codes
> which I don't modify here.

I am saying I think we need the extra checks from scsi_io_completion or 
we need something like it in the scsi eh path, or the scsi eh is going 
to fail when it could have succeeded. For normal IO, 
scsi_decide_disposition would call scsi_check_sense, and 
scsi_check_sense would return SUCCESS on some UA or not ready and 
illegal requests. Then later scsi_io_completion would look at the 
asc/ascq in more depth and retry in a lot of situations.

With your patch all those IOs that would be determined by 
scsi_io_completion to be retryable are now returned as SOFT_ERROR in 
scsi_check_sense and so scsi_eh_completed_normally will see SOFT_ERROR 
and fail them.

For example, with your patch if for the 
scsi_send_eh_cmnd/scsi_eh_completed_normally/scsi_check_sense path we 
got 02/04/01 (not ready - becomming ready), scsi_send_eh_cmnd would see 
SOFT_ERROR and fail the scsi eh. And, I am saying don't we want retry 
this like we would if we were going through the normal IO path, so we do 
not offline devices on this type of failure?


>
>> Martin made me the same bugzilla :) I was thinking we needed to make
>> check_sense smarter or move some of the scsi_io_completion code to some
>> lib helpers so scsi_error.c could call them.
>>
>> And does scsi_eh_stu need to be covnerted to check for soft error and
>> failed.
>>
> No. scsi_eh_stu() needs to check if a START STOP UNIT command needs
> to be send. The trigger for this is that scsi_check_sense() returns
> FAILED:
>

Ah yeah, my mistake.

>
>> Also if we did this patch, then I think we also have to convert the
>> device handlers.
> Not necessarily. This patch just allows check_sense() to return
> an additional error code; the device handlers are not forced to
> use them. If they don't things will work like before.
>

Maybe I should have written, don't we want to fix them too. For example 
if alua_check_sense got 02/04/12 it returns SUCCESS, and expects the IO 
to be failed, right? With your patch the 
scsi_send_eh_cmnd/scsi_eh_completed_normally/scsi_check_sense path will 
still see the SUCCESS and propagate that upwards and you hit the problem 
where the scsi eh should have failed the eh but let it succeed.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: Erroneous handling of sense status in SCSI EH commands
  2011-01-24 20:17     ` Mike Christie
@ 2011-01-25  5:18       ` Mike Christie
  2011-01-25  5:34         ` Mike Christie
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Christie @ 2011-01-25  5:18 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: James Bottomley, linux-scsi

On 01/24/2011 02:17 PM, Mike Christie wrote:
> For example, with your patch if for the
> scsi_send_eh_cmnd/scsi_eh_completed_normally/scsi_check_sense path we
> got 02/04/01 (not ready - becomming ready), scsi_send_eh_cmnd would see
> SOFT_ERROR and fail the scsi eh. And, I am saying don't we want retry
> this like we would if we were going through the normal IO path, so we do
> not offline devices on this type of failure?
>

Just some corrections/clarifications. The device, target and bus reset 
handling are ok, but just host reset handling seems like the problem (at 
least the problem I was hitting I think). We do not call 
__scsi_report_device_reset for host reset handling, but some LLD 
eh_host_reset_handler implementations execute operations that can cause 
us to get something like a UA for the scsi eh TUR. For your patch do you 
also want to call __scsi_report_device_reset for the host reset handling 
for these drivers then?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: Erroneous handling of sense status in SCSI EH commands
  2011-01-25  5:18       ` Mike Christie
@ 2011-01-25  5:34         ` Mike Christie
  2011-01-25 15:38           ` James Bottomley
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Christie @ 2011-01-25  5:34 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: James Bottomley, linux-scsi

On 01/24/2011 11:18 PM, Mike Christie wrote:
> On 01/24/2011 02:17 PM, Mike Christie wrote:
>> For example, with your patch if for the
>> scsi_send_eh_cmnd/scsi_eh_completed_normally/scsi_check_sense path we
>> got 02/04/01 (not ready - becomming ready), scsi_send_eh_cmnd would see
>> SOFT_ERROR and fail the scsi eh. And, I am saying don't we want retry
>> this like we would if we were going through the normal IO path, so we do
>> not offline devices on this type of failure?
>>
>
> Just some corrections/clarifications. The device, target and bus reset
> handling are ok, but just host reset handling seems like the problem (at
> least the problem I was hitting I think). We do not call
> __scsi_report_device_reset for host reset handling, but some LLD
> eh_host_reset_handler implementations execute operations that can cause
> us to get something like a UA for the scsi eh TUR. For your patch do you
> also want to call __scsi_report_device_reset for the host reset handling
> for these drivers then?

Ignore that. I see it is getting called for the host reset code. I am 
not sure what happened when I was testing it.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: Erroneous handling of sense status in SCSI EH commands
  2011-01-25  5:34         ` Mike Christie
@ 2011-01-25 15:38           ` James Bottomley
  2011-01-25 20:19             ` Mike Christie
  0 siblings, 1 reply; 10+ messages in thread
From: James Bottomley @ 2011-01-25 15:38 UTC (permalink / raw)
  To: Mike Christie; +Cc: Hannes Reinecke, linux-scsi

On Mon, 2011-01-24 at 23:34 -0600, Mike Christie wrote:
> On 01/24/2011 11:18 PM, Mike Christie wrote:
> > On 01/24/2011 02:17 PM, Mike Christie wrote:
> >> For example, with your patch if for the
> >> scsi_send_eh_cmnd/scsi_eh_completed_normally/scsi_check_sense path we
> >> got 02/04/01 (not ready - becomming ready), scsi_send_eh_cmnd would see
> >> SOFT_ERROR and fail the scsi eh. And, I am saying don't we want retry
> >> this like we would if we were going through the normal IO path, so we do
> >> not offline devices on this type of failure?
> >>
> >
> > Just some corrections/clarifications. The device, target and bus reset
> > handling are ok, but just host reset handling seems like the problem (at
> > least the problem I was hitting I think). We do not call
> > __scsi_report_device_reset for host reset handling, but some LLD
> > eh_host_reset_handler implementations execute operations that can cause
> > us to get something like a UA for the scsi eh TUR. For your patch do you
> > also want to call __scsi_report_device_reset for the host reset handling
> > for these drivers then?
> 
> Ignore that. I see it is getting called for the host reset code. I am 
> not sure what happened when I was testing it.

Just so I'm clear ... that's you withdrawing all objections to this
patch, so I can put it in scsi-misc for testing?

James



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: Erroneous handling of sense status in SCSI EH commands
  2011-01-25 15:38           ` James Bottomley
@ 2011-01-25 20:19             ` Mike Christie
  0 siblings, 0 replies; 10+ messages in thread
From: Mike Christie @ 2011-01-25 20:19 UTC (permalink / raw)
  To: James Bottomley; +Cc: Hannes Reinecke, linux-scsi

On 01/25/2011 09:38 AM, James Bottomley wrote:
>> Ignore that. I see it is getting called for the host reset code. I am
>> not sure what happened when I was testing it.
>
> Just so I'm clear ... that's you withdrawing all objections to this
> patch, so I can put it in scsi-misc for testing?
>

Yeah.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: Erroneous handling of sense status in SCSI EH commands
  2011-01-21  8:07 [PATCH] scsi: Erroneous handling of sense status in SCSI EH commands Hannes Reinecke
  2011-01-21 20:49 ` Mike Christie
@ 2011-02-12 16:30 ` James Bottomley
  2011-02-14  7:26   ` Hannes Reinecke
  1 sibling, 1 reply; 10+ messages in thread
From: James Bottomley @ 2011-02-12 16:30 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: linux-scsi

On Fri, 2011-01-21 at 09:07 +0100, Hannes Reinecke wrote:
> Consider a scenario where a SCSI EH command like TUR fails with a NOT READY
> status - MANUAL INTERVENTION REQUIRED (02/04/03). As evident in the ASC/ASCQ
> description, manual intervention is required in this case i.e. the target wants
> TUR to fail here so that the host administrator can take corrective action.
> 
> But this particular ASC/ASCQ is not handled in the scsi_check_sense, which
> ultimately causes scsi_eh_tur to erroneously report a SUCCESS (device ready)
> instead of the actual FAILURE state (device not ready).
> 
> This patch converts scsi_check_sense() to return SOFT_ERROR in
> cases where an error has been signalled via the sense code, but
> we should not wake the error handler. This error code will be
> reverted back to SUCCESS for normal command handling, but
> scsi_eh_completed_normally() will convert it to FAILED.

I can't reconcile this patch with the detailed error handling ones: you
change a lot of call sites to TARGET_ERROR in that patch and SOFT_ERROR
in this one.  Since the detailed error handling seems the more
important, I'll drop this one ... can you work out a reconciliation,
please?

Thanks,

James



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] scsi: Erroneous handling of sense status in SCSI EH commands
  2011-02-12 16:30 ` James Bottomley
@ 2011-02-14  7:26   ` Hannes Reinecke
  0 siblings, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2011-02-14  7:26 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi

On 02/12/2011 05:30 PM, James Bottomley wrote:
> On Fri, 2011-01-21 at 09:07 +0100, Hannes Reinecke wrote:
>> Consider a scenario where a SCSI EH command like TUR fails with a NOT READY
>> status - MANUAL INTERVENTION REQUIRED (02/04/03). As evident in the ASC/ASCQ
>> description, manual intervention is required in this case i.e. the target wants
>> TUR to fail here so that the host administrator can take corrective action.
>>
>> But this particular ASC/ASCQ is not handled in the scsi_check_sense, which
>> ultimately causes scsi_eh_tur to erroneously report a SUCCESS (device ready)
>> instead of the actual FAILURE state (device not ready).
>>
>> This patch converts scsi_check_sense() to return SOFT_ERROR in
>> cases where an error has been signalled via the sense code, but
>> we should not wake the error handler. This error code will be
>> reverted back to SUCCESS for normal command handling, but
>> scsi_eh_completed_normally() will convert it to FAILED.
> 
> I can't reconcile this patch with the detailed error handling ones: you
> change a lot of call sites to TARGET_ERROR in that patch and SOFT_ERROR
> in this one.  Since the detailed error handling seems the more
> important, I'll drop this one ... can you work out a reconciliation,
> please?
> 
Yes. I'm in discussion with NetApp (as the original reporter)
about details about this patch anyway.

I'll be sending an update.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2011-02-14  7:17 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-21  8:07 [PATCH] scsi: Erroneous handling of sense status in SCSI EH commands Hannes Reinecke
2011-01-21 20:49 ` Mike Christie
2011-01-24  7:59   ` Hannes Reinecke
2011-01-24 20:17     ` Mike Christie
2011-01-25  5:18       ` Mike Christie
2011-01-25  5:34         ` Mike Christie
2011-01-25 15:38           ` James Bottomley
2011-01-25 20:19             ` Mike Christie
2011-02-12 16:30 ` James Bottomley
2011-02-14  7:26   ` Hannes Reinecke

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.