* Investigating potential flaw in scsi error handling
@ 2008-02-09 21:59 Elias Oltmanns
  2008-02-09 23:30 ` James Bottomley
  0 siblings, 1 reply; 8+ messages in thread
From: Elias Oltmanns @ 2008-02-09 21:59 UTC (permalink / raw)
  To: linux-scsi; +Cc: Tejun Heo

Hi there,

I'm experiencing system lockups with 2.6.24 which I believe to be
related to scsi error handling. Actually, I have patched the mainline
kernel with a disk shock protection patch [1] and in my case it is indeed
the shock protection mechanism that triggers the lockups. However, some
rather lengthy investigations have led me to the conclusion that this
additional patch is just the means to reproduce the error condition
fairly reliably rather than the origin of the problem.

The problem has only become apparent since Tejun's commit
31cc23b34913bc173680bdc87af79e551bf8cc0d. More precisely, libata now
sets max_host_blocked and max_device_blocked to 1 for all ATA devices.
Various tests I've conducted so far have led me to the conclusion that
a non zero return code from scsi_dispatch_command is sufficient to
trigger the problem I'm seeing provided that max_host_blocked and
max_device_blocked are set to 1.
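
Just to illustrate what I mean, the effect of that commit boils down to
something like the following in the libata SCSI glue (a rough sketch from
memory rather than the exact code; the hook name is only representative):

#include <scsi/scsi_device.h>
#include <scsi/scsi_host.h>

static int example_ata_slave_configure(struct scsi_device *sdev)
{
        /* allow only a single deferral before the queue is rerun */
        sdev->max_device_blocked = 1;
        sdev->host->max_host_blocked = 1;
        return 0;
}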

Unfortunately, I'm a bit at a loss as to how I should proceed to find
the culprit. I can reliably reproduce the problem using the disk shock
protection patch in order to cause non zero return values from
scsi_dispatch_command. How can I find out where in the error handling of
this condition things might go wrong?

Most likely you will need further information to help me solve this
issue but perhaps you can already come up with some suggestions and tell
me what else you'd like to know.

Thanks in advance,

Elias

[1] http://article.gmane.org/gmane.linux.drivers.hdaps.devel/1094


PS: Since the disk shock protection patch is mainly concerned with an
ATA-specific feature, I'm currently working on implementing it in
libata rather than in the scsi midlayer. This doesn't change anything
with regard to the problem I've described above, but it has confirmed my
suspicion that it must be the return code from scsi_dispatch_command
that triggers the system freeze.


* Re: Investigating potential flaw in scsi error handling
  2008-02-09 21:59 Investigating potential flaw in scsi error handling Elias Oltmanns
@ 2008-02-09 23:30 ` James Bottomley
  2008-02-10 12:54   ` Elias Oltmanns
  0 siblings, 1 reply; 8+ messages in thread
From: James Bottomley @ 2008-02-09 23:30 UTC (permalink / raw)
  To: Elias Oltmanns; +Cc: linux-scsi, Tejun Heo

On Sat, 2008-02-09 at 22:59 +0100, Elias Oltmanns wrote:
> Hi there,
> 
> I'm experiencing system lockups with 2.6.24 which I believe to be
> related to scsi error handling. Actually, I have patched the mainline
> kernel with a disk shock protection patch [1] and in my case it is indeed
> the shock protection mechanism that triggers the lockups. However, some
> rather lengthy investigations have led me to the conclusion that this
> additional patch is just the means to reproduce the error condition
> fairly reliably rather than the origin of the problem.
> 
> The problem has only become apparent since Tejun's commit
> 31cc23b34913bc173680bdc87af79e551bf8cc0d. More precisely, libata now
> sets max_host_blocked and max_device_blocked to 1 for all ATA devices.
> Various tests I've conducted so far have led me to the conclusion that
> a non zero return code from scsi_dispatch_command is sufficient to
> trigger the problem I'm seeing provided that max_host_blocked and
> max_device_blocked are set to 1.

There's nothing inherently incorrect with setting max_device_blocked to
1, but it is suboptimal: it means that for a single-queue device,
returning a wait causes an immediate reissue.

> Unfortunately, I'm a bit at a loss as to how I should proceed to find
> the culprit. I can reliably reproduce the problem using the disk shock
> protection patch in order to cause non zero return values from
> scsi_dispatch_command. How can I find out where in the error handling of
> this condition things might go wrong?
> 
> Most likely you will need further information to help me solve this
> issue but perhaps you can already come up with some suggestions and tell
> me what else you'd like to know.

Well, in the first case I'm not sure why you refer to a non-zero return
from scsi_dispatch_command(), since that's an internal API; the non-zero
return should come from ->queuecommand().
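
For reference, a defer from the LLD looks roughly like this with the
current queuecommand prototype (sketch only; example_hw_busy() and
example_start_io() are made-up placeholders for whatever the driver
actually checks and does):

#include <scsi/scsi.h>
#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_device.h>

static int example_queuecommand(struct scsi_cmnd *cmd,
                                void (*done)(struct scsi_cmnd *))
{
        if (example_hw_busy(cmd->device))
                /* the midlayer requeues the command and blocks the queue */
                return SCSI_MLQUEUE_DEVICE_BUSY;

        example_start_io(cmd, done);
        return 0;
}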

However, if you've patched scsi_dispatch_command() I'd guess that would
be the problem.

James




* Re: Investigating potential flaw in scsi error handling
  2008-02-09 23:30 ` James Bottomley
@ 2008-02-10 12:54   ` Elias Oltmanns
  2008-02-10 13:02     ` [PATCH] Make sure that scsi_request_fn() isn't called recursively forever Elias Oltmanns
  2008-02-10 14:22     ` Investigating potential flaw in scsi error handling James Bottomley
  0 siblings, 2 replies; 8+ messages in thread
From: Elias Oltmanns @ 2008-02-10 12:54 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, Tejun Heo

James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> On Sat, 2008-02-09 at 22:59 +0100, Elias Oltmanns wrote:
>> Hi there,
>> 
>> I'm experiencing system lockups with 2.6.24 which I believe to be
>> related to scsi error handling. Actually, I have patched the mainline
>> kernel with a disk shock protection patch [1] and in my case it is indeed
>> the shock protection mechanism that triggers the lockups. However, some
>> rather lengthy investigations have led me to the conclusion that this
>> additional patch is just the means to reproduce the error condition
>> fairly reliably rather than the origin of the problem.
>> 
>> The problem has only become apparent since Tejun's commit
>> 31cc23b34913bc173680bdc87af79e551bf8cc0d. More precisely, libata now
>> sets max_host_blocked and max_device_blocked to 1 for all ATA devices.
>> Various tests I've conducted so far have led me to the conclusion that
>> a non zero return code from scsi_dispatch_command is sufficient to
>> trigger the problem I'm seeing provided that max_host_blocked and
>> max_device_blocked are set to 1.
>
> There's nothing inherently incorrect with setting max_device_blocked to
> 1, but it is suboptimal: it means that for a single-queue device,
> returning a wait causes an immediate reissue.

Thanks for rubbing that in again. It should have been clear to me all
along but I've only just realised the consequences and found the
problem, I think. We are, in fact, faced with a situation where the
->request_fn() is being called recursively forever.

Consider this: The ->request_fn() of a single queue device is called
which in turn calls scsi_dispatch_cmd(). Assume that the device is
either in SDEV_BLOCK state or ->queuecommand() returns
SCSI_MLQUEUE_DEVICE_BUSY for some reason. In either case
scsi_queue_insert() will be called. Eventually, blk_run_queue() will be
called with the same device queue not plugged yet. This way we directly
reenter q->request_fn(). Now, remember that libata sets
sdev->max_device_blocked to 1. Consequently, the function
scsi_dev_queue_ready() will immediately give a positive response and we
go ahead calling scsi_dispatch_cmd() again. Note that at this stage the
lld will not have had a chance yet to clear the SDEV_BLOCK state or the
condition that caused the SCSI_MLQUEUE_DEVICE_BUSY return code from
->queuecommand(). Hence the infinite recursion. A similar recursion can
also occur due to a SCSI_MLQUEUE_HOST_BUSY response from
->queuecommand().
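
As a call chain, the loop looks roughly like this (simplified, with
intermediate helpers omitted):

scsi_request_fn()
  -> scsi_dev_queue_ready()      /* max_device_blocked == 1: never holds us back */
  -> scsi_dispatch_cmd()         /* SDEV_BLOCK or ->queuecommand() busy */
       -> scsi_queue_insert()
            -> ... -> blk_run_queue()      /* queue still not plugged */
                        -> scsi_request_fn()   /* and so on, forever */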

Unless I have overlooked some unwanted implications, please consider
applying the patch that I'm going to send you as a follow up to this
email.

Regards,

Elias


* [PATCH] Make sure that scsi_request_fn() isn't called recursively forever
  2008-02-10 12:54   ` Elias Oltmanns
@ 2008-02-10 13:02     ` Elias Oltmanns
  2008-02-10 14:22     ` Investigating potential flaw in scsi error handling James Bottomley
  1 sibling, 0 replies; 8+ messages in thread
From: Elias Oltmanns @ 2008-02-10 13:02 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, Tejun Heo

Currently, scsi_dev_queue_ready() and scsi_host_queue_ready() decrease the
device_blocked or host_blocked counter respectively *before* they determine
the right return value. If the device can't accept a request for some
reason and max_host_blocked or max_device_blocked has been set to 1, this
may lead to scsi_request_fn() being called recursively without giving the
low level driver a chance to unjam the device.

This patch applies to 2.6.24. Please include it in the stable updates as
well.

Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
---

 drivers/scsi/scsi_lib.c |   20 ++++++++------------
 1 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index a9ac5b1..7513bed 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1357,14 +1357,12 @@ static inline int scsi_dev_queue_ready(s
 		/*
 		 * unblock after device_blocked iterates to zero
 		 */
-		if (--sdev->device_blocked == 0) {
+		if (sdev->device_blocked-- == 1)
 			SCSI_LOG_MLQUEUE(3,
 				   sdev_printk(KERN_INFO, sdev,
-				   "unblocking device at zero depth\n"));
-		} else {
-			blk_plug_device(q);
-			return 0;
-		}
+				   "device will be unblocked next time\n"));
+		blk_plug_device(q);
+		return 0;
 	}
 	if (sdev->device_blocked)
 		return 0;
@@ -1389,14 +1387,12 @@ static inline int scsi_host_queue_ready(
 		/*
 		 * unblock after host_blocked iterates to zero
 		 */
-		if (--shost->host_blocked == 0) {
+		if (shost->host_blocked-- == 1)
 			SCSI_LOG_MLQUEUE(3,
-				printk("scsi%d unblocking host at zero depth\n",
+				printk("scsi%d host will be unblocked next time\n",
 					shost->host_no));
-		} else {
-			blk_plug_device(q);
-			return 0;
-		}
+		blk_plug_device(q);
+		return 0;
 	}
 	if ((shost->can_queue > 0 && shost->host_busy >= shost->can_queue) ||
 	    shost->host_blocked || shost->host_self_blocked) {
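
With max_device_blocked set to 1 and this patch applied, the sequence after
a deferral becomes (schematically):

        defer                   -> device_blocked = 1
        immediate queue rerun   -> 1 becomes 0, queue plugged, return 0
        later rerun (unplug
        timer or a completion)  -> device_blocked == 0, dispatch

i.e. the command may of course be deferred again if the condition persists,
but the retry now goes through the block layer's plugging instead of
reentering scsi_request_fn() immediately.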


* Re: Investigating potential flaw in scsi error handling
  2008-02-10 12:54   ` Elias Oltmanns
  2008-02-10 13:02     ` [PATCH] Make sure that scsi_request_fn() isn't called recursively forever Elias Oltmanns
@ 2008-02-10 14:22     ` James Bottomley
  2008-02-10 15:29       ` Elias Oltmanns
  1 sibling, 1 reply; 8+ messages in thread
From: James Bottomley @ 2008-02-10 14:22 UTC (permalink / raw)
  To: Elias Oltmanns; +Cc: linux-scsi, Tejun Heo


On Sun, 2008-02-10 at 13:54 +0100, Elias Oltmanns wrote:
> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > On Sat, 2008-02-09 at 22:59 +0100, Elias Oltmanns wrote:
> >> Hi there,
> >> 
> >> I'm experiencing system lockups with 2.6.24 which I believe to be
> >> related to scsi error handling. Actually, I have patched the mainline
> >> kernel with a disk shock protection patch [1] and in my case it is indeed
> >> the shock protection mechanism that triggers the lockups. However, some
> >> rather lengthy investigations have led me to the conclusion that this
> >> additional patch is just the means to reproduce the error condition
> >> fairly reliably rather than the origin of the problem.
> >> 
> >> The problem has only become apparent since Tejun's commit
> >> 31cc23b34913bc173680bdc87af79e551bf8cc0d. More precisely, libata now
> >> sets max_host_blocked and max_device_blocked to 1 for all ATA devices.
> >> Various tests I've conducted so far have led me to the conclusion that
> >> a non zero return code from scsi_dispatch_command is sufficient to
> >> trigger the problem I'm seeing provided that max_host_blocked and
> >> max_device_blocked are set to 1.
> >
> > There's nothing inherently incorrect with setting max_device_blocked to
> > 1, but it is suboptimal: it means that for a single-queue device,
> > returning a wait causes an immediate reissue.
> 
> Thanks for rubbing that in again. It should have been clear to me all
> along but I've only just realised the consequences and found the
> problem, I think. We are, in fact, faced with a situation where the
> ->request_fn() is being called recursively forever.

This happens with a max_device_blocked of 1 if there's a programmatic
defer (non-zero) return at zero outstanding commands.  If you want to
set a max blocked of one, you need to ensure that doesn't happen.
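
To spell that out against the current scsi_dev_queue_ready() logic (as I
read 2.6.24; the counter is reloaded to max_device_blocked on every defer):

        max_device_blocked == 2:  defer -> device_blocked = 2
                                  immediate rerun: 2 -> 1, plug queue, return 0
                                  later rerun:     1 -> 0, dispatch

        max_device_blocked == 1:  defer -> device_blocked = 1
                                  immediate rerun: 1 -> 0, dispatch again at once

so with a max of one the only protection is the driver never deferring in
that situation.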

> Consider this: The ->request_fn() of a single queue device is called
> which in turn calls scsi_dispatch_cmd(). Assume that the device is
> either in SDEV_BLOCK state or ->queuecommand() returns
> SCSI_MLQUEUE_DEVICE_BUSY for some reason. In either case
> scsi_queue_insert() will be called. Eventually, blk_run_queue() will be
> called with the same device queue not plugged yet. This way we directly
> reenter q->request_fn(). Now, remember that libata sets
> sdev->max_device_blocked to 1. Consequently, the function
> scsi_dev_queue_ready() will immediately give a positive response and we
> go ahead calling scsi_dispatch_cmd() again. Note that at this stage the
> lld will not have had a chance yet to clear the SDEV_BLOCK state or the
> condition that caused the SCSI_MLQUEUE_DEVICE_BUSY return code from
> ->queuecommand(). Hence the infinite recursion. A similar recursion can
> also occur due to a SCSI_MLQUEUE_HOST_BUSY response from
> ->queuecommand().
> 
> Unless I have overlooked some unwanted implications, please consider
> applying the patch that I'm going to send you as a follow up to this
> email.

No.  We have a fix for this, it's called setting max_device_blocked to 2
or greater.  All your patch does is make this seem to be the case, plus
it eliminates the instant reissue case for drivers with queuecommands
that do obey all the rules.

If you can prove that IDE doesn't obey the rules (no defer returns) then
ask them to revert their setting of device_max_blocked to 1; if they do,
then you have to come up with an alternative for your patch.

James



* Re: Investigating potential flaw in scsi error handling
  2008-02-10 14:22     ` Investigating potential flaw in scsi error handling James Bottomley
@ 2008-02-10 15:29       ` Elias Oltmanns
  2008-02-10 15:44         ` James Bottomley
  0 siblings, 1 reply; 8+ messages in thread
From: Elias Oltmanns @ 2008-02-10 15:29 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, Tejun Heo

James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> On Sun, 2008-02-10 at 13:54 +0100, Elias Oltmanns wrote:
[...]
>> Consider this: The ->request_fn() of a single queue device is called
>> which in turn calls scsi_dispatch_cmd(). Assume that the device is
>> either in SDEV_BLOCK state or ->queuecommand() returns
>> SCSI_MLQUEUE_DEVICE_BUSY for some reason. In either case
>> scsi_queue_insert() will be called. Eventually, blk_run_queue() will be
>> called with the same device queue not plugged yet. This way we directly
>> reenter q->request_fn(). Now, remember that libata sets
>> sdev->max_device_blocked to 1. Consequently, the function
>> scsi_dev_queue_ready() will immediately give a positive response and we
>> go ahead calling scsi_dispatch_cmd() again. Note that at this stage the
>> lld will not have had a chance yet to clear the SDEV_BLOCK state or the
>> condition that caused the SCSI_MLQUEUE_DEVICE_BUSY return code from
>> ->queuecommand(). Hence the infinite recursion. A similar recursion can
>> also occur due to a SCSI_MLQUEUE_HOST_BUSY response from
>> ->queuecommand().
>> 
>> Unless I have overlooked some unwanted implications, please consider
>> applying the patch that I'm going to send you as a follow up to this
>> email.
>
> > No.  We have a fix for this, it's called setting max_device_blocked to 2
> or greater.  All your patch does is make this seem to be the case, plus
> it eliminates the instant reissue case for drivers with queuecommands
> that do obey all the rules.
>
> If you can prove that IDE doesn't obey the rules (no defer returns)

In fact, I can prove that the scsi midlayer itself doesn't exactly comply
with this rule by design. The comment explaining the SDEV_BLOCK state in
scsi_device.h suggests that the low level driver is supposed to control
whether a device is switched to or from SDEV_BLOCK. However, with
max_device_blocked set to 1 we have an infinite loop where the low level
driver never even gets called, since scsi_dispatch_cmd will requeue the
request instantly.
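
For comparison, the way the midlayer apparently expects an LLD to drive
that state is via its own helpers, roughly like this (sketch; error
handling omitted):

        /* LLD decides the device must not get any commands for now */
        scsi_internal_device_block(sdev);     /* -> SDEV_BLOCK, queue stopped */

        /* ... later, when the device can take commands again ... */
        scsi_internal_device_unblock(sdev);   /* -> SDEV_RUNNING, queue restarted */

In other words, the transitions are supposed to be under LLD control.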

IDE doesn't obey the rule either but this can be fixed easily. So, what
about SDEV_BLOCK?

Regards,

Elias


* Re: Investigating potential flaw in scsi error handling
  2008-02-10 15:29       ` Elias Oltmanns
@ 2008-02-10 15:44         ` James Bottomley
  2008-02-10 16:04           ` Elias Oltmanns
  0 siblings, 1 reply; 8+ messages in thread
From: James Bottomley @ 2008-02-10 15:44 UTC (permalink / raw)
  To: Elias Oltmanns; +Cc: linux-scsi, Tejun Heo

On Sun, 2008-02-10 at 16:29 +0100, Elias Oltmanns wrote:
> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > On Sun, 2008-02-10 at 13:54 +0100, Elias Oltmanns wrote:
> [...]
> >> Consider this: The ->request_fn() of a single queue device is called
> >> which in turn calls scsi_dispatch_cmd(). Assume that the device is
> >> either in SDEV_BLOCK state or ->queuecommand() returns
> >> SCSI_MLQUEUE_DEVICE_BUSY for some reason. In either case
> >> scsi_queue_insert() will be called. Eventually, blk_run_queue() will be
> >> called with the same device queue not plugged yet. This way we directly
> >> reenter q->request_fn(). Now, remember that libata sets
> >> sdev->max_device_blocked to 1. Consequently, the function
> >> scsi_dev_queue_ready() will immediately give a positive response and we
> >> go ahead calling scsi_dispatch_cmd() again. Note that at this stage the
> >> lld will not have had a chance yet to clear the SDEV_BLOCK state or the
> >> condition that caused the SCSI_MLQUEUE_DEVICE_BUSY return code from
> >> ->queuecommand(). Hence the infinite recursion. A similar recursion can
> >> also occur due to a SCSI_MLQUEUE_HOST_BUSY response from
> >> ->queuecommand().
> >> 
> >> Unless I have overlooked some unwanted implications, please consider
> >> applying the patch that I'm going to send you as a follow up to this
> >> email.
> >
> > > No.  We have a fix for this, it's called setting max_device_blocked to 2
> > or greater.  All your patch does is make this seem to be the case, plus
> > it eliminates the instant reissue case for drivers with queuecommands
> > that do obey all the rules.
> >
> > If you can prove that IDE doesn't obey the rules (no defer returns)
> 
> In fact, I can prove that the scsi midlayer itself doesn't exactly comply
> with this rule by design. The comment explaining the SDEV_BLOCK state in
> scsi_device.h suggests that the low level driver is supposed to control
> whether a device is switched to or from SDEV_BLOCK. However, with
> max_device_blocked set to 1 we have an infinite loop where the low level
> driver never even gets called, since scsi_dispatch_cmd will requeue the
> request instantly.
> 
> IDE doesn't obey the rule either but this can be fixed easily. So, what
> about SDEV_BLOCK?

Well, nothing.  It's an API an HBA may not use (per previous rule) if it
wants to set max_device_blocked to 1.

James




* Re: Investigating potential flaw in scsi error handling
  2008-02-10 15:44         ` James Bottomley
@ 2008-02-10 16:04           ` Elias Oltmanns
  0 siblings, 0 replies; 8+ messages in thread
From: Elias Oltmanns @ 2008-02-10 16:04 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, Tejun Heo

James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> On Sun, 2008-02-10 at 16:29 +0100, Elias Oltmanns wrote:
>> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
[...]
>> > No.  We have a fix for this, it's called setting max_device_blocked to 2
>> > or greater.  All your patch does is make this seem to be the case, plus
>> > it eliminates the instant reissue case for drivers with queuecommands
>> > that do obey all the rules.
>> >
>> > If you can prove that IDE doesn't obey the rules (no defer returns)
>> 
>> In fact, I can prove that the scsi midlayer itself doesn't exactly comply
>> with this rule by design. The comment explaining the SDEV_BLOCK state in
>> scsi_device.h suggests that the low level driver is supposed to control
>> whether a device is switched to or from SDEV_BLOCK. However, with
>> max_device_blocked set to 1 we have an infinite loop where the low level
>> driver never even gets called, since scsi_dispatch_cmd will requeue the
>> request instantly.
>> 
>> IDE doesn't obey the rule either but this can be fixed easily. So, what
>> about SDEV_BLOCK?
>
> Well, nothing.  It's an API an HBA may not use (per previous rule) if it
> wants to set max_device_blocked to 1.

Right. Thanks for the clarification.

Regards,

Elias

