All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Retry commands with UNIT_ATTENTION sense codes
@ 2010-05-04 14:48 Hannes Reinecke
  2010-05-04 16:26 ` James Bottomley
  0 siblings, 1 reply; 9+ messages in thread
From: Hannes Reinecke @ 2010-05-04 14:48 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi


We have to enable retries for UNIT_ATTENTION sense codes, as a
command might've been injected from the block layer (eg barrier
requests). And for those no error recovers takes place, so we
need to make sure that the SCSI midlayer corrects any transient
errors.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/scsi_error.c |    8 +-------
 1 files changed, 1 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index d45c69c..663a1ad 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -290,19 +290,13 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 			return NEEDS_RETRY;
 		}
 		/*
-		 * if the device is in the process of becoming ready, we 
-		 * should retry.
-		 */
-		if ((sshdr.asc == 0x04) && (sshdr.ascq == 0x01))
-			return NEEDS_RETRY;
-		/*
 		 * if the device is not started, we need to wake
 		 * the error handler to start the motor
 		 */
 		if (scmd->device->allow_restart &&
 		    (sshdr.asc == 0x04) && (sshdr.ascq == 0x02))
 			return FAILED;
-		return SUCCESS;
+		return NEEDS_RETRY;
 
 		/* these three are not supported */
 	case COPY_ABORTED:
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Retry commands with UNIT_ATTENTION sense codes
  2010-05-04 14:48 [PATCH] Retry commands with UNIT_ATTENTION sense codes Hannes Reinecke
@ 2010-05-04 16:26 ` James Bottomley
  2010-05-04 20:15   ` Mike Christie
  0 siblings, 1 reply; 9+ messages in thread
From: James Bottomley @ 2010-05-04 16:26 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: linux-scsi

On Tue, 2010-05-04 at 16:48 +0200, Hannes Reinecke wrote:
> We have to enable retries for UNIT_ATTENTION sense codes, as a
> command might've been injected from the block layer (eg barrier
> requests). And for those no error recovers takes place, so we
> need to make sure that the SCSI midlayer corrects any transient
> errors.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  drivers/scsi/scsi_error.c |    8 +-------
>  1 files changed, 1 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index d45c69c..663a1ad 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -290,19 +290,13 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
>  			return NEEDS_RETRY;
>  		}
>  		/*
> -		 * if the device is in the process of becoming ready, we 
> -		 * should retry.
> -		 */
> -		if ((sshdr.asc == 0x04) && (sshdr.ascq == 0x01))
> -			return NEEDS_RETRY;
> -		/*
>  		 * if the device is not started, we need to wake
>  		 * the error handler to start the motor
>  		 */
>  		if (scmd->device->allow_restart &&
>  		    (sshdr.asc == 0x04) && (sshdr.ascq == 0x02))
>  			return FAILED;
> -		return SUCCESS;
> +		return NEEDS_RETRY;
>  
>  		/* these three are not supported */
>  	case COPY_ABORTED:

The other patch is fine, but I don't think this is necessary.  The
reason is that even returning SUCCESS here, we go straight into
scsi_finish_command() (which passes it up to the driver handler) and
then scsi_io_completion().  There's a catch for UNIT_ATTENTION in
scsi_io_completion which will still cause a retry provided the driver
handler hasn't done any alterations, so I think we'd still get the retry
with only your other patch.

James



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Retry commands with UNIT_ATTENTION sense codes
  2010-05-04 16:26 ` James Bottomley
@ 2010-05-04 20:15   ` Mike Christie
  2010-05-04 20:27     ` Mike Christie
  0 siblings, 1 reply; 9+ messages in thread
From: Mike Christie @ 2010-05-04 20:15 UTC (permalink / raw)
  To: James Bottomley; +Cc: Hannes Reinecke, linux-scsi

On 05/04/2010 11:26 AM, James Bottomley wrote:
> The other patch is fine, but I don't think this is necessary.  The
> reason is that even returning SUCCESS here, we go straight into
> scsi_finish_command() (which passes it up to the driver handler) and
> then scsi_io_completion().  There's a catch for UNIT_ATTENTION in
> scsi_io_completion

The request is sent as a REQ_TYPE_BLOCK_PC (this flag is set for the 
request in sd_prepare_flush), and scsi_io_completion's blk_pc_request 
check() that returns the request upwards is before the UNIT_ATTENTION 
check one so we never hit the UNIT_ATTENTION check.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Retry commands with UNIT_ATTENTION sense codes
  2010-05-04 20:15   ` Mike Christie
@ 2010-05-04 20:27     ` Mike Christie
  2010-05-04 20:30       ` James Bottomley
  0 siblings, 1 reply; 9+ messages in thread
From: Mike Christie @ 2010-05-04 20:27 UTC (permalink / raw)
  To: James Bottomley; +Cc: Hannes Reinecke, linux-scsi

On 05/04/2010 03:15 PM, Mike Christie wrote:
> On 05/04/2010 11:26 AM, James Bottomley wrote:
>> The other patch is fine, but I don't think this is necessary. The
>> reason is that even returning SUCCESS here, we go straight into
>> scsi_finish_command() (which passes it up to the driver handler) and
>> then scsi_io_completion(). There's a catch for UNIT_ATTENTION in
>> scsi_io_completion
>
> The request is sent as a REQ_TYPE_BLOCK_PC (this flag is set for the
> request in sd_prepare_flush), and scsi_io_completion's blk_pc_request
> check() that returns the request upwards is before the UNIT_ATTENTION

I was looking at the wrong source.

scsi_finish_command checks for REQ_TYPE_BLOCK_PC and sets good_bytes to 
scsi_bufflen, so when scsi_io_completion calls scsi_end_request, it 
fails the request before we can get to the UNIT_ATTENTION.




> check one so we never hit the UNIT_ATTENTION check.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Retry commands with UNIT_ATTENTION sense codes
  2010-05-04 20:27     ` Mike Christie
@ 2010-05-04 20:30       ` James Bottomley
  2010-05-04 20:51         ` James Bottomley
  2010-05-05  6:26         ` Hannes Reinecke
  0 siblings, 2 replies; 9+ messages in thread
From: James Bottomley @ 2010-05-04 20:30 UTC (permalink / raw)
  To: Mike Christie; +Cc: Hannes Reinecke, linux-scsi

On Tue, 2010-05-04 at 15:27 -0500, Mike Christie wrote:
> On 05/04/2010 03:15 PM, Mike Christie wrote:
> > On 05/04/2010 11:26 AM, James Bottomley wrote:
> >> The other patch is fine, but I don't think this is necessary. The
> >> reason is that even returning SUCCESS here, we go straight into
> >> scsi_finish_command() (which passes it up to the driver handler) and
> >> then scsi_io_completion(). There's a catch for UNIT_ATTENTION in
> >> scsi_io_completion
> >
> > The request is sent as a REQ_TYPE_BLOCK_PC (this flag is set for the
> > request in sd_prepare_flush), and scsi_io_completion's blk_pc_request
> > check() that returns the request upwards is before the UNIT_ATTENTION
> 
> I was looking at the wrong source.
> 
> scsi_finish_command checks for REQ_TYPE_BLOCK_PC and sets good_bytes to 
> scsi_bufflen, so when scsi_io_completion calls scsi_end_request, it 
> fails the request before we can get to the UNIT_ATTENTION.

Ah, yes, missed that.

However there are other problems then.  We can't just eat all unit
attentions on the BLOCK_PC path because some of the user programs want
to see them (CD burners for one).  I know the patch allows some to
proceed upwards, but it's risky making all except device not started do
a retry.

I'm a bit stumped on this one ... the intention of BLOCK_PC is that the
caller simply retries if they get a unit attention (which is what all
SCSI code does).  The block code doesn't want to look at the sense data
but at the error return.  I suppose we could make UNIT_ATTENTION
translate to -EAGAIN and have block do the right thing?

James



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Retry commands with UNIT_ATTENTION sense codes
  2010-05-04 20:30       ` James Bottomley
@ 2010-05-04 20:51         ` James Bottomley
  2010-05-05  6:26         ` Hannes Reinecke
  1 sibling, 0 replies; 9+ messages in thread
From: James Bottomley @ 2010-05-04 20:51 UTC (permalink / raw)
  To: Mike Christie; +Cc: Hannes Reinecke, linux-scsi

On Tue, 2010-05-04 at 16:30 -0400, James Bottomley wrote:
> On Tue, 2010-05-04 at 15:27 -0500, Mike Christie wrote:
> > On 05/04/2010 03:15 PM, Mike Christie wrote:
> > > On 05/04/2010 11:26 AM, James Bottomley wrote:
> > >> The other patch is fine, but I don't think this is necessary. The
> > >> reason is that even returning SUCCESS here, we go straight into
> > >> scsi_finish_command() (which passes it up to the driver handler) and
> > >> then scsi_io_completion(). There's a catch for UNIT_ATTENTION in
> > >> scsi_io_completion
> > >
> > > The request is sent as a REQ_TYPE_BLOCK_PC (this flag is set for the
> > > request in sd_prepare_flush), and scsi_io_completion's blk_pc_request
> > > check() that returns the request upwards is before the UNIT_ATTENTION
> > 
> > I was looking at the wrong source.
> > 
> > scsi_finish_command checks for REQ_TYPE_BLOCK_PC and sets good_bytes to 
> > scsi_bufflen, so when scsi_io_completion calls scsi_end_request, it 
> > fails the request before we can get to the UNIT_ATTENTION.
> 
> Ah, yes, missed that.
> 
> However there are other problems then.  We can't just eat all unit
> attentions on the BLOCK_PC path because some of the user programs want
> to see them (CD burners for one).  I know the patch allows some to
> proceed upwards, but it's risky making all except device not started do
> a retry.
> 
> I'm a bit stumped on this one ... the intention of BLOCK_PC is that the
> caller simply retries if they get a unit attention (which is what all
> SCSI code does).  The block code doesn't want to look at the sense data
> but at the error return.  I suppose we could make UNIT_ATTENTION
> translate to -EAGAIN and have block do the right thing?

I tried this, it gets very messy in block, so it looks to be a hack too
far.

What about this then ... it only retries UAs for barrier commands (thus
preserving standard semantics for SG_IO requests)?

James

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index d45c69c..7ad53fa 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -302,7 +302,20 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 		if (scmd->device->allow_restart &&
 		    (sshdr.asc == 0x04) && (sshdr.ascq == 0x02))
 			return FAILED;
-		return SUCCESS;
+
+		if (blk_barrier_rq(scmd->request))
+			/*
+			 * barrier requests should always retry on UA
+			 * otherwise block will get a spurious error
+			 */
+			return NEEDS_RETRY;
+		else
+			/*
+			 * for normal (non barrier) commands, pass the
+			 * UA upwards for a determination in the
+			 * completion functions
+			 */
+			return SUCCESS;
 
 		/* these three are not supported */
 	case COPY_ABORTED:



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Retry commands with UNIT_ATTENTION sense codes
  2010-05-04 20:30       ` James Bottomley
  2010-05-04 20:51         ` James Bottomley
@ 2010-05-05  6:26         ` Hannes Reinecke
  2010-05-05 13:19           ` James Bottomley
  1 sibling, 1 reply; 9+ messages in thread
From: Hannes Reinecke @ 2010-05-05  6:26 UTC (permalink / raw)
  To: James Bottomley; +Cc: Mike Christie, linux-scsi

James Bottomley wrote:
> On Tue, 2010-05-04 at 15:27 -0500, Mike Christie wrote:
>> On 05/04/2010 03:15 PM, Mike Christie wrote:
>>> On 05/04/2010 11:26 AM, James Bottomley wrote:
>>>> The other patch is fine, but I don't think this is necessary. The
>>>> reason is that even returning SUCCESS here, we go straight into
>>>> scsi_finish_command() (which passes it up to the driver handler) and
>>>> then scsi_io_completion(). There's a catch for UNIT_ATTENTION in
>>>> scsi_io_completion
>>> The request is sent as a REQ_TYPE_BLOCK_PC (this flag is set for the
>>> request in sd_prepare_flush), and scsi_io_completion's blk_pc_request
>>> check() that returns the request upwards is before the UNIT_ATTENTION
>> I was looking at the wrong source.
>>
>> scsi_finish_command checks for REQ_TYPE_BLOCK_PC and sets good_bytes to 
>> scsi_bufflen, so when scsi_io_completion calls scsi_end_request, it 
>> fails the request before we can get to the UNIT_ATTENTION.
> 
> Ah, yes, missed that.
> 
> However there are other problems then.  We can't just eat all unit
> attentions on the BLOCK_PC path because some of the user programs want
> to see them (CD burners for one).  I know the patch allows some to
> proceed upwards, but it's risky making all except device not started do
> a retry.
> 
> I'm a bit stumped on this one ... the intention of BLOCK_PC is that the
> caller simply retries if they get a unit attention (which is what all
> SCSI code does).  The block code doesn't want to look at the sense data
> but at the error return.  I suppose we could make UNIT_ATTENTION
> translate to -EAGAIN and have block do the right thing?
> 
The intention of BLOCK_PC is ok, but the point here is the barrier
request isn't really a BLOCK_PC request. The caller precisely
does _not_ want to look the sense code but rather have the SCSI
layer do all the necessary things.

Why can't we just mark the barrier request as something else than
BLOCK_PC, eg REQ_TYPE_SPECIAL?
Then we'd avoid these pitfalls and everything should work as expected, right?

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index de6c603..b63f84e 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1038,7 +1038,7 @@ static int sd_sync_cache(struct scsi_disk *sdkp)
 
 static void sd_prepare_flush(struct request_queue *q, struct request *rq)
 {
-       rq->cmd_type = REQ_TYPE_BLOCK_PC;
+       rq->cmd_type = REQ_TYPE_SPECIAL;
        rq->timeout = SD_TIMEOUT;
        rq->retries = SD_MAX_RETRIES;
        rq->cmd[0] = SYNCHRONIZE_CACHE;

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Retry commands with UNIT_ATTENTION sense codes
  2010-05-05  6:26         ` Hannes Reinecke
@ 2010-05-05 13:19           ` James Bottomley
  2010-05-05 13:28             ` Christoph Hellwig
  0 siblings, 1 reply; 9+ messages in thread
From: James Bottomley @ 2010-05-05 13:19 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: Mike Christie, linux-scsi, Jens Axboe

On Wed, 2010-05-05 at 08:26 +0200, Hannes Reinecke wrote:
> James Bottomley wrote:
> > On Tue, 2010-05-04 at 15:27 -0500, Mike Christie wrote:
> >> On 05/04/2010 03:15 PM, Mike Christie wrote:
> >>> On 05/04/2010 11:26 AM, James Bottomley wrote:
> >>>> The other patch is fine, but I don't think this is necessary. The
> >>>> reason is that even returning SUCCESS here, we go straight into
> >>>> scsi_finish_command() (which passes it up to the driver handler) and
> >>>> then scsi_io_completion(). There's a catch for UNIT_ATTENTION in
> >>>> scsi_io_completion
> >>> The request is sent as a REQ_TYPE_BLOCK_PC (this flag is set for the
> >>> request in sd_prepare_flush), and scsi_io_completion's blk_pc_request
> >>> check() that returns the request upwards is before the UNIT_ATTENTION
> >> I was looking at the wrong source.
> >>
> >> scsi_finish_command checks for REQ_TYPE_BLOCK_PC and sets good_bytes to 
> >> scsi_bufflen, so when scsi_io_completion calls scsi_end_request, it 
> >> fails the request before we can get to the UNIT_ATTENTION.
> > 
> > Ah, yes, missed that.
> > 
> > However there are other problems then.  We can't just eat all unit
> > attentions on the BLOCK_PC path because some of the user programs want
> > to see them (CD burners for one).  I know the patch allows some to
> > proceed upwards, but it's risky making all except device not started do
> > a retry.
> > 
> > I'm a bit stumped on this one ... the intention of BLOCK_PC is that the
> > caller simply retries if they get a unit attention (which is what all
> > SCSI code does).  The block code doesn't want to look at the sense data
> > but at the error return.  I suppose we could make UNIT_ATTENTION
> > translate to -EAGAIN and have block do the right thing?
> > 
> The intention of BLOCK_PC is ok, but the point here is the barrier
> request isn't really a BLOCK_PC request. The caller precisely
> does _not_ want to look the sense code but rather have the SCSI
> layer do all the necessary things.
> 
> Why can't we just mark the barrier request as something else than
> BLOCK_PC, eg REQ_TYPE_SPECIAL?
> Then we'd avoid these pitfalls and everything should work as expected, right?

Unfortunately, not.  A long time ago we used to use SPECIAL for prepared
commands, but we stripped that path out.  If you look at the prep
functions today, they only accept either REQ_TYPE_FS or
REQ_TYPE_BLOCK_PC ... they kill everything else, so we'd have to do
surgery on all the input paths to get this to work.

Looking at all of this, I'm really unhappy with the way barriers are
working in SCSI ... using BLOCK_PC for them was a big mistake, since
block can't handle the errors and BLOCK_PC by its very nature requests
that all errors, however trivially fixable, be returned.

It looks like we have precisely the same problems with discard too,
except there at least we'll ignore the error.

Someone needs to take a good look at fixing this ... or at least just
making discard and barrier use the same submit paths, so we can fix it
up in the lower layers.

In the meantime, I think the SUCCESS/NEEDS_RETRY fiddle in check
condition is still the easiest hack.

James



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Retry commands with UNIT_ATTENTION sense codes
  2010-05-05 13:19           ` James Bottomley
@ 2010-05-05 13:28             ` Christoph Hellwig
  0 siblings, 0 replies; 9+ messages in thread
From: Christoph Hellwig @ 2010-05-05 13:28 UTC (permalink / raw)
  To: James Bottomley; +Cc: Hannes Reinecke, Mike Christie, linux-scsi, Jens Axboe

On Wed, May 05, 2010 at 09:19:53AM -0400, James Bottomley wrote:
> Unfortunately, not.  A long time ago we used to use SPECIAL for prepared
> commands, but we stripped that path out.  If you look at the prep
> functions today, they only accept either REQ_TYPE_FS or
> REQ_TYPE_BLOCK_PC ... they kill everything else, so we'd have to do
> surgery on all the input paths to get this to work.
> 
> Looking at all of this, I'm really unhappy with the way barriers are
> working in SCSI ... using BLOCK_PC for them was a big mistake, since
> block can't handle the errors and BLOCK_PC by its very nature requests
> that all errors, however trivially fixable, be returned.
> 
> It looks like we have precisely the same problems with discard too,
> except there at least we'll ignore the error.
> 
> Someone needs to take a good look at fixing this ... or at least just
> making discard and barrier use the same submit paths, so we can fix it
> up in the lower layers.

Both barriers and dicards are FS requests as far as the block layer is
concerned.  We turn them into BLOCK_PC requests very late in the game,
either in the prepare_flush function for barriers, or directly inside
prep_fn for discard.  I tried to handle discard requests as FS all
the way through, but the various different ways in which we account for
a request length didn't play nicely with that.

Handling cache flushes as FS requests all the way through should be
easier than that.


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2010-05-05 13:28 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-05-04 14:48 [PATCH] Retry commands with UNIT_ATTENTION sense codes Hannes Reinecke
2010-05-04 16:26 ` James Bottomley
2010-05-04 20:15   ` Mike Christie
2010-05-04 20:27     ` Mike Christie
2010-05-04 20:30       ` James Bottomley
2010-05-04 20:51         ` James Bottomley
2010-05-05  6:26         ` Hannes Reinecke
2010-05-05 13:19           ` James Bottomley
2010-05-05 13:28             ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.