linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] oops in sd_shutdown
       [not found] <Pine.LNX.4.53.0308111426570.16008@thevillage.soulcatcher>
@ 2003-08-11 22:28 ` Andries Brouwer
  2003-08-12  1:13   ` Jeff Woods
  2003-08-12  6:53   ` Christoph Hellwig
  0 siblings, 2 replies; 10+ messages in thread
From: Andries Brouwer @ 2003-08-11 22:28 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-usb-devel, linux-kernel

I see an Oops in the SCSI code, caused by the fact that sdkp is NULL
in sd_shutdown. "How can that be?", you will ask - dev->driver_data was set
in sd_probe. But in my case sd_probe never finished. An insmod usb-storage
hangs forever, or at least for more than six hours, giving ample opportunity
to observe this race between sd_probe and sd_shutdown.
(Of course sd_probe hangs in sd_revalidate disk.)

Perhaps the obvious test is a good idea.
Locking seems meaningless - sd_probe will never finish.

Andries

[Probably the init of usb_storage should start probing the devices in separate
threads, in parallel, and return immediately.]

The obvious patch (with whitespace damage)

diff -u --recursive --new-file -X /linux/dontdiff a/drivers/scsi/sd.c b/drivers/scsi/sd.c
--- a/drivers/scsi/sd.c Mon Jul 28 05:39:31 2003
+++ b/drivers/scsi/sd.c Tue Aug 12 01:24:51 2003
@@ -1351,10 +1351,14 @@
 static void sd_shutdown(struct device *dev)
 {
        struct scsi_device *sdp = to_scsi_device(dev);
-       struct scsi_disk *sdkp = dev_get_drvdata(dev);
+       struct scsi_disk *sdkp;
        struct scsi_request *sreq;
        int retries, res;
 
+       sdkp = dev_get_drvdata(dev);
+       if (!sdkp)
+               return;         /* this can happen */
+
        if (!sdp->online || !sdkp->WCE)
                return;
 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] oops in sd_shutdown
  2003-08-11 22:28 ` [PATCH] oops in sd_shutdown Andries Brouwer
@ 2003-08-12  1:13   ` Jeff Woods
  2003-08-12  2:49     ` Andries Brouwer
  2003-08-12  6:53   ` Christoph Hellwig
  1 sibling, 1 reply; 10+ messages in thread
From: Jeff Woods @ 2003-08-12  1:13 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: linux-scsi, linux-usb-devel, linux-kernel

At +0200 12:28 AM 8/12/2003, Andries Brouwer wrote (in part):
>The obvious patch (with whitespace damage)
>
>diff -u --recursive --new-file -X /linux/dontdiff a/drivers/scsi/sd.c 
>b/drivers/scsi/sd.c
>--- a/drivers/scsi/sd.c Mon Jul 28 05:39:31 2003
>+++ b/drivers/scsi/sd.c Tue Aug 12 01:24:51 2003
>@@ -1351,10 +1351,14 @@
>  static void sd_shutdown(struct device *dev)
>  {
>         struct scsi_device *sdp = to_scsi_device(dev);
>-       struct scsi_disk *sdkp = dev_get_drvdata(dev);
>+       struct scsi_disk *sdkp;
>         struct scsi_request *sreq;
>         int retries, res;
>
>+       sdkp = dev_get_drvdata(dev);
>+       if (!sdkp)
>+               return;         /* this can happen */
>+
>         if (!sdp->online || !sdkp->WCE)
>                 return;


Looking only at the above code snippet, I'd suggest something more like:
~~~~~~~~~~~~~~~~~~
diff -u --recursive --new-file -X /linux/dontdiff a/drivers/scsi/sd.c 
b/drivers/scsi/sd.c
--- a/drivers/scsi/sd.c Mon Jul 28 05:39:31 2003
+++ b/drivers/scsi/sd.c Tue Aug 12 01:24:51 2003
@@ -1351,10 +1351,14 @@
  static void sd_shutdown(struct device *dev)
  {
         struct scsi_device *sdp = to_scsi_device(dev);
         struct scsi_disk *sdkp = dev_get_drvdata(dev);
         struct scsi_request *sreq;
         int retries, res;

-       if (!sdp->online || !sdkp->WCE)
+       if (!sdp || !sdp->online || !sdkp || !sdkp->WCE)
                 return;
~~~~~~~~~~~~~~~~~~
If sdp can *never* be NULL *and* this is a performance-critical code-path 
then perhaps it makes sense to leave off the "!sdp || " which I added to 
the logic, but it seems a very small cost to pay for the insurance checking 
in sd_shutdown().

--
Jeff Woods <kazrak+kernel@cesmail.net> 



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] oops in sd_shutdown
  2003-08-12  1:13   ` Jeff Woods
@ 2003-08-12  2:49     ` Andries Brouwer
  2003-08-12  6:35       ` Jeff Woods
  0 siblings, 1 reply; 10+ messages in thread
From: Andries Brouwer @ 2003-08-12  2:49 UTC (permalink / raw)
  To: Jeff Woods; +Cc: Andries Brouwer, linux-scsi, linux-usb-devel, linux-kernel

On Mon, Aug 11, 2003 at 06:13:50PM -0700, Jeff Woods wrote:

> Looking only at the above code snippet, I'd suggest something more like:

> +       if (!sdp || 


This is not meaningful.

A general kind of convention is that a pointer will be NULL either
by mistake, when it is uninitialized, or on purpose, when no object
is present or no action (other than the default) needs to be performed.

But that general idea is broken by container_of(), which just subtracts
a constant. So, one should check before subtracting that the pointer
is non-NULL. Checking afterwards is meaningless.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] oops in sd_shutdown
  2003-08-12  2:49     ` Andries Brouwer
@ 2003-08-12  6:35       ` Jeff Woods
  2003-08-12 21:36         ` [linux-usb-devel] " Greg KH
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Woods @ 2003-08-12  6:35 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: linux-scsi, linux-usb-devel, linux-kernel

At +0200 04:49 AM 8/12/2003, Andries Brouwer wrote:
>On Mon, Aug 11, 2003 at 06:13:50PM -0700, Jeff Woods wrote:
>
>>Looking only at the above code snippet, I'd suggest something more like:
>
>>+       if (!sdp ||
>
>This is not meaningful.

How is it not meaningful?  The next action in the expression is to 
dereference the pointer and if it has a NULL value then I expect the 
dereference to fail.  [But I am a complete newbie with respect to Linux 
kernel and driver code so perhaps my understanding is in error.  If so, 
please enlighten me.]

>A general kind of convention is that a pointer will be NULL either by 
>mistake, when it is uninitialized, or on purpose, when no object is 
>present or no action (other than the default) needs to be performed.

Of course.  But in this case, the next action the code will attempt is to 
dereference that pointer which I expect would fail if it's NULL.  If you're 
telling me the pointer cannot be NULL, then fine [which I tried to indicate 
was a possibility in my first email in this thread], but if the pointer 
might *ever* be NULL (and dereferencing a NULL pointer in this context is 
as bad as it usually is) then there is no point in proceeding and returning 
from the function immediately seems like a reasonable thing to do in case 
of a shutdown function.  (I can see possible value in additionally 
reporting an error or warning somehow if feasible, but that's not germane 
to whether checking the pointers for NULL is a prudent action.

>But that general idea is broken by container_of(), which just subtracts a 
>constant. So, one should check before subtracting that the pointer is 
>non-NULL. Checking afterwards is meaningless.

As I tried to indicate in the opening statement I have not looked at any 
other code than what you included in the patch diff beginning this thread 
so I don't see any reference to anything that indicates some function 
called container_of() [which sounds like it might be some C++ routine... 
and I was under the impression this code is C rather than C++].  The diff 
includes the beginning of the function including initialization of both the 
sdp and sdkp pointers.  One bug the patch fixes is the implicit dereference 
of sdkp in the original version of the "if" statement I suggest be 
modified.  My version of the patch (1) reduces the number of lines changed, 
(2) results in fewer lines, (3) improves the transparency of the code, and 
(4) additionally checks for a (perhaps unlikely or even  improbable) 
potential unanticipated runtime error, all of which makes me believe it is 
an improvement.

--
Jeff Woods <kazrak+kernel@cesmail.net> 



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] oops in sd_shutdown
  2003-08-11 22:28 ` [PATCH] oops in sd_shutdown Andries Brouwer
  2003-08-12  1:13   ` Jeff Woods
@ 2003-08-12  6:53   ` Christoph Hellwig
  2003-08-12  8:51     ` Andries Brouwer
  2003-08-12 21:35     ` [linux-usb-devel] " Greg KH
  1 sibling, 2 replies; 10+ messages in thread
From: Christoph Hellwig @ 2003-08-12  6:53 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: linux-scsi, linux-usb-devel, linux-kernel

On Tue, Aug 12, 2003 at 12:28:44AM +0200, Andries Brouwer wrote:
> I see an Oops in the SCSI code, caused by the fact that sdkp is NULL
> in sd_shutdown. "How can that be?", you will ask - dev->driver_data was set
> in sd_probe. But in my case sd_probe never finished. An insmod usb-storage
> hangs forever, or at least for more than six hours, giving ample opportunity
> to observe this race between sd_probe and sd_shutdown.
> (Of course sd_probe hangs in sd_revalidate disk.)

Well, this same problem could show upb in any other driver.  Could
you instead send a patch to Pat that the driver model never calls
the shutdown method for a driver that hasn't finished ->probe?


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] oops in sd_shutdown
  2003-08-12  6:53   ` Christoph Hellwig
@ 2003-08-12  8:51     ` Andries Brouwer
  2003-08-12 21:35     ` [linux-usb-devel] " Greg KH
  1 sibling, 0 replies; 10+ messages in thread
From: Andries Brouwer @ 2003-08-12  8:51 UTC (permalink / raw)
  To: Christoph Hellwig, linux-scsi, linux-usb-devel, linux-kernel

On Tue, Aug 12, 2003 at 07:53:53AM +0100, Christoph Hellwig wrote:

> > I see an Oops in the SCSI code, caused by the fact that sdkp is NULL
> > in sd_shutdown. "How can that be?", you will ask - dev->driver_data was set
> > in sd_probe. But in my case sd_probe never finished. An insmod usb-storage
> > hangs forever, or at least for more than six hours, giving ample opportunity
> > to observe this race between sd_probe and sd_shutdown.
> > (Of course sd_probe hangs in sd_revalidate disk.)
> 
> Well, this same problem could show upb in any other driver.  Could
> you instead send a patch to Pat that the driver model never calls
> the shutdown method for a driver that hasn't finished ->probe?

Yes, that is the next stage. But it takes a few hours instead of a
few seconds.



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] oops in sd_shutdown
  2003-08-12  6:53   ` Christoph Hellwig
  2003-08-12  8:51     ` Andries Brouwer
@ 2003-08-12 21:35     ` Greg KH
  2003-08-13  0:24       ` Mike Anderson
  1 sibling, 1 reply; 10+ messages in thread
From: Greg KH @ 2003-08-12 21:35 UTC (permalink / raw)
  To: Christoph Hellwig, Andries Brouwer, linux-scsi, linux-usb-devel,
	linux-kernel

On Tue, Aug 12, 2003 at 07:53:53AM +0100, Christoph Hellwig wrote:
> On Tue, Aug 12, 2003 at 12:28:44AM +0200, Andries Brouwer wrote:
> > I see an Oops in the SCSI code, caused by the fact that sdkp is NULL
> > in sd_shutdown. "How can that be?", you will ask - dev->driver_data was set
> > in sd_probe. But in my case sd_probe never finished. An insmod usb-storage
> > hangs forever, or at least for more than six hours, giving ample opportunity
> > to observe this race between sd_probe and sd_shutdown.
> > (Of course sd_probe hangs in sd_revalidate disk.)
> 
> Well, this same problem could show upb in any other driver.  Could
> you instead send a patch to Pat that the driver model never calls
> the shutdown method for a driver that hasn't finished ->probe?

I think it already will not do that due to taking the bus->subsys.rwsem
before calling either probe() or remove().

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] oops in sd_shutdown
  2003-08-12  6:35       ` Jeff Woods
@ 2003-08-12 21:36         ` Greg KH
  0 siblings, 0 replies; 10+ messages in thread
From: Greg KH @ 2003-08-12 21:36 UTC (permalink / raw)
  To: Jeff Woods; +Cc: Andries Brouwer, linux-scsi, linux-usb-devel, linux-kernel

On Mon, Aug 11, 2003 at 11:35:26PM -0700, Jeff Woods wrote:
> At +0200 04:49 AM 8/12/2003, Andries Brouwer wrote:
> >On Mon, Aug 11, 2003 at 06:13:50PM -0700, Jeff Woods wrote:
> >
> >>Looking only at the above code snippet, I'd suggest something more like:
> >
> >>+       if (!sdp ||
> >
> >This is not meaningful.
> 
> How is it not meaningful?  The next action in the expression is to 
> dereference the pointer and if it has a NULL value then I expect the 
> dereference to fail.  [But I am a complete newbie with respect to Linux 
> kernel and driver code so perhaps my understanding is in error.  If so, 
> please enlighten me.]

sdp can not be NULL in this case.  That is why it is not meaningful to
try to check for it.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [linux-usb-devel] Re: [PATCH] oops in sd_shutdown
  2003-08-12 21:35     ` [linux-usb-devel] " Greg KH
@ 2003-08-13  0:24       ` Mike Anderson
  2003-08-13 10:46         ` Andries Brouwer
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Anderson @ 2003-08-13  0:24 UTC (permalink / raw)
  To: Greg KH
  Cc: Christoph Hellwig, Andries Brouwer, linux-scsi, linux-usb-devel,
	linux-kernel

Greg KH [greg@kroah.com] wrote:
> On Tue, Aug 12, 2003 at 07:53:53AM +0100, Christoph Hellwig wrote:
> > On Tue, Aug 12, 2003 at 12:28:44AM +0200, Andries Brouwer wrote:
> > > I see an Oops in the SCSI code, caused by the fact that sdkp is NULL
> > > in sd_shutdown. "How can that be?", you will ask - dev->driver_data was set
> > > in sd_probe. But in my case sd_probe never finished. An insmod usb-storage
> > > hangs forever, or at least for more than six hours, giving ample opportunity
> > > to observe this race between sd_probe and sd_shutdown.
> > > (Of course sd_probe hangs in sd_revalidate disk.)
> > 
> > Well, this same problem could show upb in any other driver.  Could
> > you instead send a patch to Pat that the driver model never calls
> > the shutdown method for a driver that hasn't finished ->probe?
> 
> I think it already will not do that due to taking the bus->subsys.rwsem
> before calling either probe() or remove().
> 

Is the shutdown being called directly? The shutdown call is protected by
a different rwsem. Depending on the call graph setting dev->driver on
return of probe may provide a solution. I have not looked at all probe
routines to understand if this would cause any bad side effects.

Andries,
	Can you send the oops output?

-andmike
--
Michael Anderson
andmike@us.ibm.com


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] oops in sd_shutdown
  2003-08-13  0:24       ` Mike Anderson
@ 2003-08-13 10:46         ` Andries Brouwer
  0 siblings, 0 replies; 10+ messages in thread
From: Andries Brouwer @ 2003-08-13 10:46 UTC (permalink / raw)
  To: Greg KH, Christoph Hellwig, linux-scsi, linux-usb-devel, linux-kernel

On Tue, Aug 12, 2003 at 05:24:50PM -0700, Mike Anderson wrote:

> > > Well, this same problem could show upb in any other driver.  Could
> > > you instead send a patch to Pat that the driver model never calls
> > > the shutdown method for a driver that hasn't finished ->probe?
> > 
> > I think it already will not do that due to taking the bus->subsys.rwsem
> > before calling either probe() or remove().
> 
> Is the shutdown being called directly? The shutdown call is protected by
> a different rwsem. Depending on the call graph setting dev->driver on
> return of probe may provide a solution.

Yes, that is precisely what I had considered doing.

> I have not looked at all probe
> routines to understand if this would cause any bad side effects.
> 
> Andries,
> 	Can you send the oops output?

top of stack was reported as (process reboot):

sd_shutdown + 0x22/0x110 NULL deref (namely, sdkp)
i8042_notify_sys
device_shutdown
sys_reboot
do_clock_nanosleep


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2003-08-13 10:46 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <Pine.LNX.4.53.0308111426570.16008@thevillage.soulcatcher>
2003-08-11 22:28 ` [PATCH] oops in sd_shutdown Andries Brouwer
2003-08-12  1:13   ` Jeff Woods
2003-08-12  2:49     ` Andries Brouwer
2003-08-12  6:35       ` Jeff Woods
2003-08-12 21:36         ` [linux-usb-devel] " Greg KH
2003-08-12  6:53   ` Christoph Hellwig
2003-08-12  8:51     ` Andries Brouwer
2003-08-12 21:35     ` [linux-usb-devel] " Greg KH
2003-08-13  0:24       ` Mike Anderson
2003-08-13 10:46         ` Andries Brouwer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).