* [PATCH] oops in sd_shutdown [not found] <Pine.LNX.4.53.0308111426570.16008@thevillage.soulcatcher> @ 2003-08-11 22:28 ` Andries Brouwer 2003-08-12 1:13 ` Jeff Woods 2003-08-12 6:53 ` Christoph Hellwig 0 siblings, 2 replies; 10+ messages in thread From: Andries Brouwer @ 2003-08-11 22:28 UTC (permalink / raw) To: linux-scsi; +Cc: linux-usb-devel, linux-kernel I see an Oops in the SCSI code, caused by the fact that sdkp is NULL in sd_shutdown. "How can that be?", you will ask - dev->driver_data was set in sd_probe. But in my case sd_probe never finished. An insmod usb-storage hangs forever, or at least for more than six hours, giving ample opportunity to observe this race between sd_probe and sd_shutdown. (Of course sd_probe hangs in sd_revalidate disk.) Perhaps the obvious test is a good idea. Locking seems meaningless - sd_probe will never finish. Andries [Probably the init of usb_storage should start probing the devices in separate threads, in parallel, and return immediately.] The obvious patch (with whitespace damage) diff -u --recursive --new-file -X /linux/dontdiff a/drivers/scsi/sd.c b/drivers/scsi/sd.c --- a/drivers/scsi/sd.c Mon Jul 28 05:39:31 2003 +++ b/drivers/scsi/sd.c Tue Aug 12 01:24:51 2003 @@ -1351,10 +1351,14 @@ static void sd_shutdown(struct device *dev) { struct scsi_device *sdp = to_scsi_device(dev); - struct scsi_disk *sdkp = dev_get_drvdata(dev); + struct scsi_disk *sdkp; struct scsi_request *sreq; int retries, res; + sdkp = dev_get_drvdata(dev); + if (!sdkp) + return; /* this can happen */ + if (!sdp->online || !sdkp->WCE) return; ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] oops in sd_shutdown 2003-08-11 22:28 ` [PATCH] oops in sd_shutdown Andries Brouwer @ 2003-08-12 1:13 ` Jeff Woods 2003-08-12 2:49 ` Andries Brouwer 2003-08-12 6:53 ` Christoph Hellwig 1 sibling, 1 reply; 10+ messages in thread From: Jeff Woods @ 2003-08-12 1:13 UTC (permalink / raw) To: Andries Brouwer; +Cc: linux-scsi, linux-usb-devel, linux-kernel At +0200 12:28 AM 8/12/2003, Andries Brouwer wrote (in part): >The obvious patch (with whitespace damage) > >diff -u --recursive --new-file -X /linux/dontdiff a/drivers/scsi/sd.c >b/drivers/scsi/sd.c >--- a/drivers/scsi/sd.c Mon Jul 28 05:39:31 2003 >+++ b/drivers/scsi/sd.c Tue Aug 12 01:24:51 2003 >@@ -1351,10 +1351,14 @@ > static void sd_shutdown(struct device *dev) > { > struct scsi_device *sdp = to_scsi_device(dev); >- struct scsi_disk *sdkp = dev_get_drvdata(dev); >+ struct scsi_disk *sdkp; > struct scsi_request *sreq; > int retries, res; > >+ sdkp = dev_get_drvdata(dev); >+ if (!sdkp) >+ return; /* this can happen */ >+ > if (!sdp->online || !sdkp->WCE) > return; Looking only at the above code snippet, I'd suggest something more like: ~~~~~~~~~~~~~~~~~~ diff -u --recursive --new-file -X /linux/dontdiff a/drivers/scsi/sd.c b/drivers/scsi/sd.c --- a/drivers/scsi/sd.c Mon Jul 28 05:39:31 2003 +++ b/drivers/scsi/sd.c Tue Aug 12 01:24:51 2003 @@ -1351,10 +1351,14 @@ static void sd_shutdown(struct device *dev) { struct scsi_device *sdp = to_scsi_device(dev); struct scsi_disk *sdkp = dev_get_drvdata(dev); struct scsi_request *sreq; int retries, res; - if (!sdp->online || !sdkp->WCE) + if (!sdp || !sdp->online || !sdkp || !sdkp->WCE) return; ~~~~~~~~~~~~~~~~~~ If sdp can *never* be NULL *and* this is a performance-critical code-path then perhaps it makes sense to leave off the "!sdp || " which I added to the logic, but it seems a very small cost to pay for the insurance checking in sd_shutdown(). -- Jeff Woods <kazrak+kernel@cesmail.net> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] oops in sd_shutdown 2003-08-12 1:13 ` Jeff Woods @ 2003-08-12 2:49 ` Andries Brouwer 2003-08-12 6:35 ` Jeff Woods 0 siblings, 1 reply; 10+ messages in thread From: Andries Brouwer @ 2003-08-12 2:49 UTC (permalink / raw) To: Jeff Woods; +Cc: Andries Brouwer, linux-scsi, linux-usb-devel, linux-kernel On Mon, Aug 11, 2003 at 06:13:50PM -0700, Jeff Woods wrote: > Looking only at the above code snippet, I'd suggest something more like: > + if (!sdp || This is not meaningful. A general kind of convention is that a pointer will be NULL either by mistake, when it is uninitialized, or on purpose, when no object is present or no action (other than the default) needs to be performed. But that general idea is broken by container_of(), which just subtracts a constant. So, one should check before subtracting that the pointer is non-NULL. Checking afterwards is meaningless. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] oops in sd_shutdown 2003-08-12 2:49 ` Andries Brouwer @ 2003-08-12 6:35 ` Jeff Woods 2003-08-12 21:36 ` [linux-usb-devel] " Greg KH 0 siblings, 1 reply; 10+ messages in thread From: Jeff Woods @ 2003-08-12 6:35 UTC (permalink / raw) To: Andries Brouwer; +Cc: linux-scsi, linux-usb-devel, linux-kernel At +0200 04:49 AM 8/12/2003, Andries Brouwer wrote: >On Mon, Aug 11, 2003 at 06:13:50PM -0700, Jeff Woods wrote: > >>Looking only at the above code snippet, I'd suggest something more like: > >>+ if (!sdp || > >This is not meaningful. How is it not meaningful? The next action in the expression is to dereference the pointer and if it has a NULL value then I expect the dereference to fail. [But I am a complete newbie with respect to Linux kernel and driver code so perhaps my understanding is in error. If so, please enlighten me.] >A general kind of convention is that a pointer will be NULL either by >mistake, when it is uninitialized, or on purpose, when no object is >present or no action (other than the default) needs to be performed. Of course. But in this case, the next action the code will attempt is to dereference that pointer which I expect would fail if it's NULL. If you're telling me the pointer cannot be NULL, then fine [which I tried to indicate was a possibility in my first email in this thread], but if the pointer might *ever* be NULL (and dereferencing a NULL pointer in this context is as bad as it usually is) then there is no point in proceeding and returning from the function immediately seems like a reasonable thing to do in case of a shutdown function. (I can see possible value in additionally reporting an error or warning somehow if feasible, but that's not germane to whether checking the pointers for NULL is a prudent action. >But that general idea is broken by container_of(), which just subtracts a >constant. So, one should check before subtracting that the pointer is >non-NULL. Checking afterwards is meaningless. As I tried to indicate in the opening statement I have not looked at any other code than what you included in the patch diff beginning this thread so I don't see any reference to anything that indicates some function called container_of() [which sounds like it might be some C++ routine... and I was under the impression this code is C rather than C++]. The diff includes the beginning of the function including initialization of both the sdp and sdkp pointers. One bug the patch fixes is the implicit dereference of sdkp in the original version of the "if" statement I suggest be modified. My version of the patch (1) reduces the number of lines changed, (2) results in fewer lines, (3) improves the transparency of the code, and (4) additionally checks for a (perhaps unlikely or even improbable) potential unanticipated runtime error, all of which makes me believe it is an improvement. -- Jeff Woods <kazrak+kernel@cesmail.net> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] oops in sd_shutdown 2003-08-12 6:35 ` Jeff Woods @ 2003-08-12 21:36 ` Greg KH 0 siblings, 0 replies; 10+ messages in thread From: Greg KH @ 2003-08-12 21:36 UTC (permalink / raw) To: Jeff Woods; +Cc: Andries Brouwer, linux-scsi, linux-usb-devel, linux-kernel On Mon, Aug 11, 2003 at 11:35:26PM -0700, Jeff Woods wrote: > At +0200 04:49 AM 8/12/2003, Andries Brouwer wrote: > >On Mon, Aug 11, 2003 at 06:13:50PM -0700, Jeff Woods wrote: > > > >>Looking only at the above code snippet, I'd suggest something more like: > > > >>+ if (!sdp || > > > >This is not meaningful. > > How is it not meaningful? The next action in the expression is to > dereference the pointer and if it has a NULL value then I expect the > dereference to fail. [But I am a complete newbie with respect to Linux > kernel and driver code so perhaps my understanding is in error. If so, > please enlighten me.] sdp can not be NULL in this case. That is why it is not meaningful to try to check for it. thanks, greg k-h ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] oops in sd_shutdown 2003-08-11 22:28 ` [PATCH] oops in sd_shutdown Andries Brouwer 2003-08-12 1:13 ` Jeff Woods @ 2003-08-12 6:53 ` Christoph Hellwig 2003-08-12 8:51 ` Andries Brouwer 2003-08-12 21:35 ` [linux-usb-devel] " Greg KH 1 sibling, 2 replies; 10+ messages in thread From: Christoph Hellwig @ 2003-08-12 6:53 UTC (permalink / raw) To: Andries Brouwer; +Cc: linux-scsi, linux-usb-devel, linux-kernel On Tue, Aug 12, 2003 at 12:28:44AM +0200, Andries Brouwer wrote: > I see an Oops in the SCSI code, caused by the fact that sdkp is NULL > in sd_shutdown. "How can that be?", you will ask - dev->driver_data was set > in sd_probe. But in my case sd_probe never finished. An insmod usb-storage > hangs forever, or at least for more than six hours, giving ample opportunity > to observe this race between sd_probe and sd_shutdown. > (Of course sd_probe hangs in sd_revalidate disk.) Well, this same problem could show upb in any other driver. Could you instead send a patch to Pat that the driver model never calls the shutdown method for a driver that hasn't finished ->probe? ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] oops in sd_shutdown 2003-08-12 6:53 ` Christoph Hellwig @ 2003-08-12 8:51 ` Andries Brouwer 2003-08-12 21:35 ` [linux-usb-devel] " Greg KH 1 sibling, 0 replies; 10+ messages in thread From: Andries Brouwer @ 2003-08-12 8:51 UTC (permalink / raw) To: Christoph Hellwig, linux-scsi, linux-usb-devel, linux-kernel On Tue, Aug 12, 2003 at 07:53:53AM +0100, Christoph Hellwig wrote: > > I see an Oops in the SCSI code, caused by the fact that sdkp is NULL > > in sd_shutdown. "How can that be?", you will ask - dev->driver_data was set > > in sd_probe. But in my case sd_probe never finished. An insmod usb-storage > > hangs forever, or at least for more than six hours, giving ample opportunity > > to observe this race between sd_probe and sd_shutdown. > > (Of course sd_probe hangs in sd_revalidate disk.) > > Well, this same problem could show upb in any other driver. Could > you instead send a patch to Pat that the driver model never calls > the shutdown method for a driver that hasn't finished ->probe? Yes, that is the next stage. But it takes a few hours instead of a few seconds. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] oops in sd_shutdown 2003-08-12 6:53 ` Christoph Hellwig 2003-08-12 8:51 ` Andries Brouwer @ 2003-08-12 21:35 ` Greg KH 2003-08-13 0:24 ` Mike Anderson 1 sibling, 1 reply; 10+ messages in thread From: Greg KH @ 2003-08-12 21:35 UTC (permalink / raw) To: Christoph Hellwig, Andries Brouwer, linux-scsi, linux-usb-devel, linux-kernel On Tue, Aug 12, 2003 at 07:53:53AM +0100, Christoph Hellwig wrote: > On Tue, Aug 12, 2003 at 12:28:44AM +0200, Andries Brouwer wrote: > > I see an Oops in the SCSI code, caused by the fact that sdkp is NULL > > in sd_shutdown. "How can that be?", you will ask - dev->driver_data was set > > in sd_probe. But in my case sd_probe never finished. An insmod usb-storage > > hangs forever, or at least for more than six hours, giving ample opportunity > > to observe this race between sd_probe and sd_shutdown. > > (Of course sd_probe hangs in sd_revalidate disk.) > > Well, this same problem could show upb in any other driver. Could > you instead send a patch to Pat that the driver model never calls > the shutdown method for a driver that hasn't finished ->probe? I think it already will not do that due to taking the bus->subsys.rwsem before calling either probe() or remove(). thanks, greg k-h ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] oops in sd_shutdown 2003-08-12 21:35 ` [linux-usb-devel] " Greg KH @ 2003-08-13 0:24 ` Mike Anderson 2003-08-13 10:46 ` Andries Brouwer 0 siblings, 1 reply; 10+ messages in thread From: Mike Anderson @ 2003-08-13 0:24 UTC (permalink / raw) To: Greg KH Cc: Christoph Hellwig, Andries Brouwer, linux-scsi, linux-usb-devel, linux-kernel Greg KH [greg@kroah.com] wrote: > On Tue, Aug 12, 2003 at 07:53:53AM +0100, Christoph Hellwig wrote: > > On Tue, Aug 12, 2003 at 12:28:44AM +0200, Andries Brouwer wrote: > > > I see an Oops in the SCSI code, caused by the fact that sdkp is NULL > > > in sd_shutdown. "How can that be?", you will ask - dev->driver_data was set > > > in sd_probe. But in my case sd_probe never finished. An insmod usb-storage > > > hangs forever, or at least for more than six hours, giving ample opportunity > > > to observe this race between sd_probe and sd_shutdown. > > > (Of course sd_probe hangs in sd_revalidate disk.) > > > > Well, this same problem could show upb in any other driver. Could > > you instead send a patch to Pat that the driver model never calls > > the shutdown method for a driver that hasn't finished ->probe? > > I think it already will not do that due to taking the bus->subsys.rwsem > before calling either probe() or remove(). > Is the shutdown being called directly? The shutdown call is protected by a different rwsem. Depending on the call graph setting dev->driver on return of probe may provide a solution. I have not looked at all probe routines to understand if this would cause any bad side effects. Andries, Can you send the oops output? -andmike -- Michael Anderson andmike@us.ibm.com ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] oops in sd_shutdown 2003-08-13 0:24 ` Mike Anderson @ 2003-08-13 10:46 ` Andries Brouwer 0 siblings, 0 replies; 10+ messages in thread From: Andries Brouwer @ 2003-08-13 10:46 UTC (permalink / raw) To: Greg KH, Christoph Hellwig, linux-scsi, linux-usb-devel, linux-kernel On Tue, Aug 12, 2003 at 05:24:50PM -0700, Mike Anderson wrote: > > > Well, this same problem could show upb in any other driver. Could > > > you instead send a patch to Pat that the driver model never calls > > > the shutdown method for a driver that hasn't finished ->probe? > > > > I think it already will not do that due to taking the bus->subsys.rwsem > > before calling either probe() or remove(). > > Is the shutdown being called directly? The shutdown call is protected by > a different rwsem. Depending on the call graph setting dev->driver on > return of probe may provide a solution. Yes, that is precisely what I had considered doing. > I have not looked at all probe > routines to understand if this would cause any bad side effects. > > Andries, > Can you send the oops output? top of stack was reported as (process reboot): sd_shutdown + 0x22/0x110 NULL deref (namely, sdkp) i8042_notify_sys device_shutdown sys_reboot do_clock_nanosleep ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2003-08-13 10:46 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <Pine.LNX.4.53.0308111426570.16008@thevillage.soulcatcher> 2003-08-11 22:28 ` [PATCH] oops in sd_shutdown Andries Brouwer 2003-08-12 1:13 ` Jeff Woods 2003-08-12 2:49 ` Andries Brouwer 2003-08-12 6:35 ` Jeff Woods 2003-08-12 21:36 ` [linux-usb-devel] " Greg KH 2003-08-12 6:53 ` Christoph Hellwig 2003-08-12 8:51 ` Andries Brouwer 2003-08-12 21:35 ` [linux-usb-devel] " Greg KH 2003-08-13 0:24 ` Mike Anderson 2003-08-13 10:46 ` Andries Brouwer
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).