linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* BUG in: Driver core: convert block from raw kobjects to core devices
@ 2007-10-18 19:23 Alan Stern
  2007-10-18 19:27 ` Greg KH
  2007-10-18 19:27 ` Kay Sievers
  0 siblings, 2 replies; 16+ messages in thread
From: Alan Stern @ 2007-10-18 19:23 UTC (permalink / raw)
  To: Kay Sievers, Greg KH; +Cc: Kernel development list

This patch (as1004) fixes a refcounting bug in the development version
of the block-device core.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>

---

Kay, you have got to start testing your patches better!  Finding and
fixing refcount errors is _not_ one of my favorite ways to pass the
time.  For example, you could see what happens when you insert and
unplug a USB flash disk a few times.

Greg, you'll probably want to just fold this in with Kay's 
block-device.patch.

Alan Stern


Index: usb-2.6/fs/partitions/check.c
===================================================================
--- usb-2.6.orig/fs/partitions/check.c
+++ usb-2.6/fs/partitions/check.c
@@ -516,5 +516,4 @@ void del_gendisk(struct gendisk *disk)
 	sysfs_remove_link(&block_depr, disk->dev.bus_id);
 #endif
 	device_del(&disk->dev);
-	put_device(&disk->dev);
 }


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: BUG in: Driver core: convert block from raw kobjects to core devices
  2007-10-18 19:23 BUG in: Driver core: convert block from raw kobjects to core devices Alan Stern
  2007-10-18 19:27 ` Greg KH
@ 2007-10-18 19:27 ` Kay Sievers
  2007-10-18 20:08   ` Alan Stern
  1 sibling, 1 reply; 16+ messages in thread
From: Kay Sievers @ 2007-10-18 19:27 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, Kernel development list

On Thu, 2007-10-18 at 15:23 -0400, Alan Stern wrote:
> This patch (as1004) fixes a refcounting bug in the development version
> of the block-device core.
> 
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> 
> ---
> 
> Kay, you have got to start testing your patches better!

That leaves references around for SCSI target devices. There must be a
bug somewhere else, if the patch isn't correct.

> Finding and
> fixing refcount errors is _not_ one of my favorite ways to pass the
> time.  For example, you could see what happens when you insert and
> unplug a USB flash disk a few times.

What do you see with the original version?

Kay


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: BUG in: Driver core: convert block from raw kobjects to core devices
  2007-10-18 19:23 BUG in: Driver core: convert block from raw kobjects to core devices Alan Stern
@ 2007-10-18 19:27 ` Greg KH
  2007-10-18 19:27 ` Kay Sievers
  1 sibling, 0 replies; 16+ messages in thread
From: Greg KH @ 2007-10-18 19:27 UTC (permalink / raw)
  To: Alan Stern; +Cc: Kay Sievers, Kernel development list

On Thu, Oct 18, 2007 at 03:23:49PM -0400, Alan Stern wrote:
> This patch (as1004) fixes a refcounting bug in the development version
> of the block-device core.
> 
> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> 
> ---
> 
> Kay, you have got to start testing your patches better!  Finding and
> fixing refcount errors is _not_ one of my favorite ways to pass the
> time.  For example, you could see what happens when you insert and
> unplug a USB flash disk a few times.
> 
> Greg, you'll probably want to just fold this in with Kay's 
> block-device.patch.
> 
> Alan Stern
> 
> 
> Index: usb-2.6/fs/partitions/check.c
> ===================================================================
> --- usb-2.6.orig/fs/partitions/check.c
> +++ usb-2.6/fs/partitions/check.c
> @@ -516,5 +516,4 @@ void del_gendisk(struct gendisk *disk)
>  	sysfs_remove_link(&block_depr, disk->dev.bus_id);
>  #endif
>  	device_del(&disk->dev);
> -	put_device(&disk->dev);
>  }

Are we sure this is correct?  Kay and I spent a lot of time (well, Kay
did all the work) a few weeks ago trying to track down a bug in this
area, and this was the solution from what I remember.

Kay, thoughts?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: BUG in: Driver core: convert block from raw kobjects to core devices
  2007-10-18 19:27 ` Kay Sievers
@ 2007-10-18 20:08   ` Alan Stern
  2007-10-19  1:27     ` Kay Sievers
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2007-10-18 20:08 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Greg KH, Kernel development list

On Thu, 18 Oct 2007, Kay Sievers wrote:

> On Thu, 2007-10-18 at 15:23 -0400, Alan Stern wrote:
> > This patch (as1004) fixes a refcounting bug in the development version
> > of the block-device core.
> > 
> > Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> > 
> > ---
> > 
> > Kay, you have got to start testing your patches better!
> 
> That leaves references around for SCSI target devices. There must be a
> bug somewhere else, if the patch isn't correct.
> 
> > Finding and
> > fixing refcount errors is _not_ one of my favorite ways to pass the
> > time.  For example, you could see what happens when you insert and
> > unplug a USB flash disk a few times.
> 
> What do you see with the original version?

Note that a USB drive is treated as a SCSI device.

With the original code, I see the following sequence of events when 
add_disk() is first called.  Values in parentheses are 
atomic_read(disk->dev.kobj.kref.refcount) after each stage runs:

	Entry to add_disk		(1)
	Call to register_disk
	device_add			(3)
	CONFIG_SYSFS_DEPRECATED is not set
	Call disk_sysfs_add_subdirs
	add disk->holder_dir		(4)
	add disk->slave_dir		(5)
	Return to register_disk
	get_capacity			(5)
	bdget_disk			(5)
	blkdev_get (partitions)		(8)
	blkdev_put			(7)
	Return to add_disk
	blk_register_queue		(9)

You can see how many references each stage takes.  Now here's the
equivalent list for del_gendisk():

	Entry to del_gendisk		(9)
	invalidate_ and delete_partition loop	(7)
	invalidate_partition 0		(7)
	Call unlink_gendisk
	blk_unregister_queue		(5)
	Return to del_gendisk
	unregister disk->holder_dir	(4)
	unregister disk->slave_dir	(3)
	CONFIG_SYSFS_DEPRECATED is not set
	device_del			(1)
	put_device			(0) -- oops!

Matching things up we have:

	device_add/device_del		2 refs
	reg/unreg subdirs		2 refs
	subpartitions			2 refs
	reg/unreg block queue		2 refs

This accounts for everything in del_gendisk except the final
put_device.  Evidently it doesn't belong there.  There's no matching 
get_device in add_disk or register_disk.

Alan Stern


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: BUG in: Driver core: convert block from raw kobjects to core devices
  2007-10-18 20:08   ` Alan Stern
@ 2007-10-19  1:27     ` Kay Sievers
  2007-10-19 14:09       ` Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Kay Sievers @ 2007-10-19  1:27 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, Kernel development list


On Thu, 2007-10-18 at 16:08 -0400, Alan Stern wrote:
> On Thu, 18 Oct 2007, Kay Sievers wrote:
> 
> > On Thu, 2007-10-18 at 15:23 -0400, Alan Stern wrote:
> > > This patch (as1004) fixes a refcounting bug in the development version
> > > of the block-device core.
> > > 
> > > Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
> > > 
> > > ---
> > > 
> > > Kay, you have got to start testing your patches better!
> > 
> > That leaves references around for SCSI target devices. There must be a
> > bug somewhere else, if the patch isn't correct.
> > 
> > > Finding and
> > > fixing refcount errors is _not_ one of my favorite ways to pass the
> > > time.  For example, you could see what happens when you insert and
> > > unplug a USB flash disk a few times.
> > 
> > What do you see with the original version?
> 
> Note that a USB drive is treated as a SCSI device.
> 
> With the original code, I see the following sequence of events when 
> add_disk() is first called.  Values in parentheses are 
> atomic_read(disk->dev.kobj.kref.refcount) after each stage runs:
> 
> 	Entry to add_disk		(1)
> 	Call to register_disk
> 	device_add			(3)
> 	CONFIG_SYSFS_DEPRECATED is not set
> 	Call disk_sysfs_add_subdirs
> 	add disk->holder_dir		(4)
> 	add disk->slave_dir		(5)
> 	Return to register_disk
> 	get_capacity			(5)
> 	bdget_disk			(5)
> 	blkdev_get (partitions)		(8)
> 	blkdev_put			(7)
> 	Return to add_disk
> 	blk_register_queue		(9)
> 
> You can see how many references each stage takes.  Now here's the
> equivalent list for del_gendisk():
> 
> 	Entry to del_gendisk		(9)
> 	invalidate_ and delete_partition loop	(7)
> 	invalidate_partition 0		(7)
> 	Call unlink_gendisk
> 	blk_unregister_queue		(5)
> 	Return to del_gendisk
> 	unregister disk->holder_dir	(4)
> 	unregister disk->slave_dir	(3)
> 	CONFIG_SYSFS_DEPRECATED is not set
> 	device_del			(1)
> 	put_device			(0) -- oops!
> 
> Matching things up we have:
> 
> 	device_add/device_del		2 refs
> 	reg/unreg subdirs		2 refs
> 	subpartitions			2 refs
> 	reg/unreg block queue		2 refs
> 
> This accounts for everything in del_gendisk except the final
> put_device.  Evidently it doesn't belong there.  There's no matching 
> get_device in add_disk or register_disk.

Hmm, do you have kobject debugging enabled? Do you ever see something
like: "kobject sdb: cleaning up" when you remove the put_device()?

Kay


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: BUG in: Driver core: convert block from raw kobjects to core devices
  2007-10-19  1:27     ` Kay Sievers
@ 2007-10-19 14:09       ` Alan Stern
  2007-10-19 14:15         ` Kay Sievers
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2007-10-19 14:09 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Greg KH, Kernel development list

On Fri, 19 Oct 2007, Kay Sievers wrote:

> > This accounts for everything in del_gendisk except the final
> > put_device.  Evidently it doesn't belong there.  There's no matching 
> > get_device in add_disk or register_disk.
> 
> Hmm, do you have kobject debugging enabled? Do you ever see something
> like: "kobject sdb: cleaning up" when you remove the put_device()?

I didn't enable kobject debugging, but I did put a printk statement in
drivers/scsi/sd.c:scsi_disk_release(), which is the release routine for
the scsi_disk structure.  It does the final put_disk() call -- or at 
least, this is _supposed_ to be the final call.

With my patch, just before the call to put_disk the value of 
disk->dev.kobj.kref.refcount is 1.  Without my patch, the value is 
garbage (probably a slab poison value, but I printed it in decimal 
rather than hex so I can't be sure).

Don't you have a USB storage device?  It should be easy for you to test 
this on your own system.

Alan Stern


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: BUG in: Driver core: convert block from raw kobjects to core devices
  2007-10-19 14:09       ` Alan Stern
@ 2007-10-19 14:15         ` Kay Sievers
  2007-10-19 17:11           ` Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Kay Sievers @ 2007-10-19 14:15 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, Kernel development list


On Fri, 2007-10-19 at 10:09 -0400, Alan Stern wrote:
> On Fri, 19 Oct 2007, Kay Sievers wrote:
> 
> > > This accounts for everything in del_gendisk except the final
> > > put_device.  Evidently it doesn't belong there.  There's no matching 
> > > get_device in add_disk or register_disk.
> > 
> > Hmm, do you have kobject debugging enabled? Do you ever see something
> > like: "kobject sdb: cleaning up" when you remove the put_device()?
> 
> I didn't enable kobject debugging, but I did put a printk statement in
> drivers/scsi/sd.c:scsi_disk_release(), which is the release routine for
> the scsi_disk structure.  It does the final put_disk() call -- or at 
> least, this is _supposed_ to be the final call.
> 
> With my patch, just before the call to put_disk the value of 
> disk->dev.kobj.kref.refcount is 1.  Without my patch, the value is 
> garbage (probably a slab poison value, but I printed it in decimal 
> rather than hex so I can't be sure).
> 
> Don't you have a USB storage device?  It should be easy for you to test 
> this on your own system.

Sure, I have, and tried a lot of times, and all seemed correct here with
the final put. I don't say that it's the right fix, but without it, the
disk device object is never released here, it only gets removed from
sysfs.

Kay


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: BUG in: Driver core: convert block from raw kobjects to core devices
  2007-10-19 14:15         ` Kay Sievers
@ 2007-10-19 17:11           ` Alan Stern
  2007-10-19 23:06             ` Kay Sievers
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2007-10-19 17:11 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Greg KH, Kernel development list

[-- Attachment #1: Type: TEXT/PLAIN, Size: 8123 bytes --]

On Fri, 19 Oct 2007, Kay Sievers wrote:

> > Don't you have a USB storage device?  It should be easy for you to test 
> > this on your own system.
> 
> Sure, I have, and tried a lot of times, and all seemed correct here with
> the final put. I don't say that it's the right fix, but without it, the
> disk device object is never released here, it only gets removed from
> sysfs.

Well, attached is a testing patch.  It should apply to 2.6.23 together 
with gregkh-all-2.6.23.  Here's the output on my system using your 
original code:

Plug in USB drive:

[   75.555077] usb-storage: device found at 3
[   75.558052] scsi 0:0:0:0: Direct-Access     UDISK    PDU01_1G 65G2.0  0.00 PQ: 0 ANSI: 2
[   75.568248] usb-storage: device scan complete
[   75.917139] sd 0:0:0:0: [sda] 1967616 512-byte hardware sectors (1007 MB)
[   75.918463] sd 0:0:0:0: [sda] Write Protect is off
[   75.919011] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
[   75.919263] sd 0:0:0:0: [sda] Assuming drive cache: write through
[   75.919397] sd_probe: call add_disk
[   75.919615] start of register_disk: 1
[   75.920402] after device_add: 3
[   75.920692] after disk_sysfs_add_subdirs: 5
[   75.921173] after get_capacity: 5
[   75.921461] after bdget_disk: 5
[   75.927996] sd 0:0:0:0: [sda] 1967616 512-byte hardware sectors (1007 MB)
[   75.929405] sd 0:0:0:0: [sda] Write Protect is off
[   75.929801] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
[   75.929960] sd 0:0:0:0: [sda] Assuming drive cache: write through
[   75.930202]  sda: sda1
[   75.966112] after blkdev_get: 8
[   75.966648] after blkdev_put: 7
[   75.968997] end of register_disk: 7
[   75.969327] after blk_register_queue: 9
[   75.969823] sd 0:0:0:0: [sda] Attached SCSI removable disk
[   76.448192] sd 0:0:0:0: Attached scsi generic sg0 type 0

Unplug USB drive:

[   92.852078] usb 6-4: USB disconnect, address 3
[   92.853389] usb 6-4: unregistering device
[   92.853507] usb 6-4: usb_disable_device nuking all URBs
[   92.853783] usb 6-4: unregistering interface 6-4:1.0
[   92.863556] sd_remove: call del_gendisk
[   92.864043] start of del_gendisk: 9
[   92.866551] after deleting subpartitions: 7
[   92.866880] after invalidate_partition(0): 7
[   92.867469] after unlink_gendisk: 5
[   92.867738] after kobject_unregister subdirs: 3
[   92.869450] after device_del: 1
[   92.869741] after put_device: 1802201963
[   92.870267] BUG: unable to handle kernel paging request at virtual address 6b6b6b6f
[   92.870494]  printing eip:
[   92.870547] d09639d2
[   92.870597] *pde = 00000000
[   92.870747] Oops: 0000 [#1]
[   92.870797] PREEMPT SMP 
[   92.870915] last sysfs file: /block/sda/sda1/dev
[   92.870968] Modules linked in: sg sd_mod usb_storage scsi_mod fan container battery ac evdev thermal button processor ohci_hcd e100 mii ehci_hcd uhci_hcd pcspkr usbcore
[   92.871749] CPU:    0
[   92.871751] EIP:    0060:[<d09639d2>]    Not tainted VLI
[   92.871753] EFLAGS: 00010202   (2.6.23 #7)
[   92.871905] EIP is at __scsi_disk_get+0xd/0x2a [sd_mod]
[   92.871958] eax: 6b6b6b6b   ebx: 6b6b6b6b   ecx: 00000000   edx: 00000008
[   92.872014] esi: cd7359a0   edi: cd7359a8   ebp: cf6edd6c   esp: cf6edd68
[   92.872069] ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068
[   92.872124] Process khubd (pid: 970, ti=cf6ec000 task=cf6f0a90 task.ti=cf6ec000)
[   92.872179] Stack: 00000000 cf6edd78 d0963a13 cfb32910 cf6edd9c d0963bfd cf6edd9c c0194e69 
[   92.872555]        c032351e 6b6b6b6b cfb32910 cd7359a0 cd7359a8 cf6eddb4 d0964096 d09659fa 
[   92.872932]        cfb32910 d09670a4 00000292 cf6eddc4 c022303e cfb32910 cfb32910 cf6eddd0 
[   92.873310] Call Trace:
[   92.873400]  [<c01050fd>] show_trace_log_lvl+0x1a/0x2f
[   92.873496]  [<c01051af>] show_stack_log_lvl+0x9d/0xa5
[   92.873584]  [<c01053a4>] show_registers+0x1ed/0x32c
[   92.873672]  [<c01055fa>] die+0x117/0x236
[   92.873758]  [<c02ac046>] do_page_fault+0x4c0/0x591
[   92.873849]  [<c02aa69a>] error_code+0x72/0x78
[   92.873936]  [<d0963a13>] scsi_disk_get_from_dev+0x24/0x35 [sd_mod]
[   92.874029]  [<d0963bfd>] sd_shutdown+0xe/0xfd [sd_mod]
[   92.874122]  [<d0964096>] sd_remove+0x40/0x77 [sd_mod]
[   92.874213]  [<c022303e>] __device_release_driver+0x74/0x90
[   92.874306]  [<c022342c>] device_release_driver+0x30/0x47
[   92.874394]  [<c02229c3>] bus_remove_device+0x73/0x82
[   92.874482]  [<c02211ac>] device_del+0x206/0x27c
[   92.874569]  [<d0988b4e>] __scsi_remove_device+0x3c/0x6d [scsi_mod]
[   92.874678]  [<d0986509>] scsi_forget_host+0x30/0x4f [scsi_mod]
[   92.874781]  [<d0981543>] scsi_remove_host+0x6a/0xdf [scsi_mod]
[   92.874882]  [<d099f7b1>] quiesce_and_remove_host+0xa5/0xb5 [usb_storage]
[   92.874983]  [<d099f893>] storage_disconnect+0x11/0x1b [usb_storage]
[   92.875079]  [<d0847a52>] usb_unbind_interface+0x47/0x99 [usbcore]
[   92.875191]  [<c022303e>] __device_release_driver+0x74/0x90
[   92.875280]  [<c022342c>] device_release_driver+0x30/0x47
[   92.875369]  [<c02229c3>] bus_remove_device+0x73/0x82
[   92.875456]  [<c02211ac>] device_del+0x206/0x27c
[   92.875543]  [<d0844fee>] usb_disable_device+0xe2/0x145 [usbcore]
[   92.875648]  [<d08407e6>] usb_disconnect+0xce/0x167 [usbcore]
[   92.875751]  [<d0841b97>] hub_thread+0x646/0xea7 [usbcore]
[   92.875852]  [<c0131c9e>] kthread+0x3b/0x61
[   92.875940]  [<c0104d53>] kernel_thread_helper+0x7/0x10
[   92.876028]  =======================
[   92.876077] Code: e8 e1 6a 86 ef 8b 43 04 05 b0 00 00 00 e8 e6 d1 8b ef 89 d8 e8 45 e6 7f ef 58 5b 5e 5d c3 55 89 e5 53 8b 40 38 85 c0 74 18 89 c3 <8b> 40 04 e8 fb c6 01 00 85 c0 75 0a 8d 43 08 e8 79 fd 8b ef eb 
[   92.878418] EIP: [<d09639d2>] __scsi_disk_get+0xd/0x2a [sd_mod] SS:ESP 0068:cf6edd68


And here is the output with the put_device() call removed (the 
plug-in part is the same as before):

Plug in USB drive:

[   74.683198] usb-storage: device found at 3
[   74.685229] scsi 0:0:0:0: Direct-Access     UDISK    PDU01_1G 65G2.0  0.00 PQ: 0 ANSI: 2
[   74.692196] usb-storage: device scan complete
[   75.018341] sd 0:0:0:0: [sda] 1967616 512-byte hardware sectors (1007 MB)
[   75.019465] sd 0:0:0:0: [sda] Write Protect is off
[   75.019624] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
[   75.019745] sd 0:0:0:0: [sda] Assuming drive cache: write through
[   75.019872] sd_probe: call add_disk
[   75.019986] start of register_disk: 1
[   75.020234] after device_add: 3
[   75.020387] after disk_sysfs_add_subdirs: 5
[   75.020519] after get_capacity: 5
[   75.020672] after bdget_disk: 5
[   75.023582] sd 0:0:0:0: [sda] 1967616 512-byte hardware sectors (1007 MB)
[   75.024637] sd 0:0:0:0: [sda] Write Protect is off
[   75.024905] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
[   75.025115] sd 0:0:0:0: [sda] Assuming drive cache: write through
[   75.025264]  sda: sda1
[   75.054655] after blkdev_get: 8
[   75.054924] after blkdev_put: 7
[   75.057048] end of register_disk: 7
[   75.057628] after blk_register_queue: 9
[   75.057871] sd 0:0:0:0: [sda] Attached SCSI removable disk
[   75.464875] sd 0:0:0:0: Attached scsi generic sg0 type 0

Unplug USB drive:

[   85.266412] usb 4-4: USB disconnect, address 3
[   85.266719] usb 4-4: unregistering device
[   85.266831] usb 4-4: usb_disable_device nuking all URBs
[   85.266961] usb 4-4: unregistering interface 4-4:1.0
[   85.271229] sd_remove: call del_gendisk
[   85.275622] start of del_gendisk: 9
[   85.276204] after deleting subpartitions: 7
[   85.276336] after invalidate_partition(0): 7
[   85.276512] after unlink_gendisk: 5
[   85.276644] after kobject_unregister subdirs: 3
[   85.276966] after device_del: 1
[   85.277095] scsi_disk_release: call put_disk
[   85.292153] usb 4-4:1.0: uevent
[   85.292468] usb 4-4:1.0: uevent
[   85.293458] usb 4-4: uevent

As you can see, this definitely proves that the final put is done at 
the right time.

See what the patch shows on your system.  One extra thing to check for: 
When you plug in a USB drive, the SCSI core starts up an error-handler 
process for it.  When you unplug the drive, the error-handler task 
should exit.  If it doesn't then there's an extra reference somewhere 
that never got released.

Alan Stern

[-- Attachment #2: test patch --]
[-- Type: TEXT/plain, Size: 5256 bytes --]

Index: usb-2.6/block/genhd.c
===================================================================
--- usb-2.6.orig/block/genhd.c
+++ usb-2.6/block/genhd.c
@@ -17,6 +17,10 @@
 #include <linux/buffer_head.h>
 #include <linux/mutex.h>
 
+extern int alantest;
+#define alanp(msg, disk)   if (alantest) printk(KERN_INFO msg ": %d\n", \
+			atomic_read(&disk->dev.kobj.kref.refcount))
+
 extern struct class block_class;
 extern struct device_type disk_type;
 extern struct device_type part_type;
@@ -185,6 +189,7 @@ void add_disk(struct gendisk *disk)
 			    disk->minors, NULL, exact_match, exact_lock, disk);
 	register_disk(disk);
 	blk_register_queue(disk);
+	alanp("after blk_register_queue", disk);
 }
 
 EXPORT_SYMBOL(add_disk);
Index: usb-2.6/drivers/scsi/sd.c
===================================================================
--- usb-2.6.orig/drivers/scsi/sd.c
+++ usb-2.6/drivers/scsi/sd.c
@@ -62,6 +62,8 @@
 
 #include "scsi_logging.h"
 
+extern int alantest;
+
 MODULE_AUTHOR("Eric Youngdale");
 MODULE_DESCRIPTION("SCSI disk (sd) driver");
 MODULE_LICENSE("GPL");
@@ -1677,7 +1679,10 @@ static int sd_probe(struct device *dev)
 		gd->flags |= GENHD_FL_REMOVABLE;
 
 	dev_set_drvdata(dev, sdkp);
+	printk(KERN_INFO "sd_probe: call add_disk\n");
+	alantest = 1;
 	add_disk(gd);
+	alantest = 0;
 
 	sd_printk(KERN_NOTICE, sdkp, "Attached SCSI %sdisk\n",
 		  sdp->removable ? "removable " : "");
@@ -1708,6 +1713,8 @@ static int sd_remove(struct device *dev)
 	struct scsi_disk *sdkp = dev_get_drvdata(dev);
 
 	class_device_del(&sdkp->cdev);
+	printk(KERN_INFO "sd_remove: call del_gendisk\n");
+	alantest = 1;
 	del_gendisk(sdkp->disk);
 	sd_shutdown(dev);
 
@@ -1716,6 +1723,7 @@ static int sd_remove(struct device *dev)
 	class_device_put(&sdkp->cdev);
 	mutex_unlock(&sd_ref_mutex);
 
+	alantest = 0;
 	return 0;
 }
 
@@ -1738,6 +1746,7 @@ static void scsi_disk_release(struct cla
 	spin_unlock(&sd_index_lock);
 
 	disk->private_data = NULL;
+	printk(KERN_INFO "scsi_disk_release: call put_disk\n");
 	put_disk(disk);
 	put_device(&sdkp->device->sdev_gendev);
 
Index: usb-2.6/fs/partitions/check.c
===================================================================
--- usb-2.6.orig/fs/partitions/check.c
+++ usb-2.6/fs/partitions/check.c
@@ -36,6 +36,11 @@
 #include "karma.h"
 #include "sysv68.h"
 
+int alantest;
+EXPORT_SYMBOL(alantest);
+#define alanp(msg, disk)   if (alantest) printk(KERN_INFO msg ": %d\n", \
+			atomic_read(&disk->dev.kobj.kref.refcount))
+
 extern struct kobject block_depr;
 extern struct class block_class;
 
@@ -378,6 +383,7 @@ void register_disk(struct gendisk *disk)
 	struct hd_struct *p;
 	int err;
 
+	alanp("start of register_disk", disk);
 	disk->dev.parent = disk->driverfs_dev;
 	disk->dev.devt = MKDEV(disk->major, disk->first_minor);
 
@@ -392,6 +398,7 @@ void register_disk(struct gendisk *disk)
 
 	if (device_add(&disk->dev))
 		return;
+	alanp("after device_add", disk);
 #ifndef CONFIG_SYSFS_DEPRECATED
 	err = sysfs_create_link(&block_depr, &disk->dev.kobj,
 				kobject_name(&disk->dev.kobj));
@@ -399,8 +406,10 @@ void register_disk(struct gendisk *disk)
 		device_del(&disk->dev);
 		return;
 	}
+	alanp("after sysfs_create_link", disk);
 #endif
  	disk_sysfs_add_subdirs(disk);
+	alanp("after disk_sysfs_add_subdirs", disk);
 
 	/* No minors to use for partitions */
 	if (disk->minors == 1)
@@ -409,16 +418,20 @@ void register_disk(struct gendisk *disk)
 	/* No such device (e.g., media were just removed) */
 	if (!get_capacity(disk))
 		goto exit;
+	alanp("after get_capacity", disk);
 
 	bdev = bdget_disk(disk, 0);
 	if (!bdev)
 		goto exit;
+	alanp("after bdget_disk", disk);
 
 	bdev->bd_invalidated = 1;
 	err = blkdev_get(bdev, FMODE_READ, 0);
 	if (err < 0)
 		goto exit;
+	alanp("after blkdev_get", disk);
 	blkdev_put(bdev);
+	alanp("after blkdev_put", disk);
 
 exit:
 	/* announce disk after possible partitions are created */
@@ -432,6 +445,7 @@ exit:
 			continue;
 		kobject_uevent(&p->dev.kobj, KOBJ_ADD);
 	}
+	alanp("end of register_disk", disk);
 }
 
 int rescan_partitions(struct gendisk *disk, struct block_device *bdev)
@@ -497,24 +511,34 @@ void del_gendisk(struct gendisk *disk)
 {
 	int p;
 
+	alanp("start of del_gendisk", disk);
+
 	/* invalidate stuff */
 	for (p = disk->minors - 1; p > 0; p--) {
 		invalidate_partition(disk, p);
 		delete_partition(disk, p);
 	}
+	alanp("after deleting subpartitions", disk);
 	invalidate_partition(disk, 0);
+	alanp("after invalidate_partition(0)", disk);
+
 	disk->capacity = 0;
 	disk->flags &= ~GENHD_FL_UP;
 	unlink_gendisk(disk);
+	alanp("after unlink_gendisk", disk);
 	disk_stat_set_all(disk, 0);
 	disk->stamp = 0;
 
 	kobject_unregister(disk->holder_dir);
 	kobject_unregister(disk->slave_dir);
+	alanp("after kobject_unregister subdirs", disk);
 	disk->driverfs_dev = NULL;
 #ifndef CONFIG_SYSFS_DEPRECATED
 	sysfs_remove_link(&block_depr, disk->dev.bus_id);
+	alanp("after sysfs_remove_link", disk);
 #endif
 	device_del(&disk->dev);
+	alanp("after device_del", disk);
 	put_device(&disk->dev);
+	alanp("after put_device", disk);
 }

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: BUG in: Driver core: convert block from raw kobjects to core devices
  2007-10-19 17:11           ` Alan Stern
@ 2007-10-19 23:06             ` Kay Sievers
  2007-10-21  1:33               ` Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Kay Sievers @ 2007-10-19 23:06 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, Kernel development list

On Fri, 2007-10-19 at 13:11 -0400, Alan Stern wrote:
> On Fri, 19 Oct 2007, Kay Sievers wrote:

> > > Don't you have a USB storage device?  It should be easy for you to
> test 
> > > this on your own system.
> > 
> > Sure, I have, and tried a lot of times, and all seemed correct here
> with
> > the final put. I don't say that it's the right fix, but without it,
> the
> > disk device object is never released here, it only gets removed from
> > sysfs.
> 
> Well, attached is a testing patch.  It should apply to 2.6.23
> together 
> with gregkh-all-2.6.23.  Here's the output on my system using your 
> original code:
> 
...

> As you can see, this definitely proves that the final put is done at 
> the right time.
> 
> See what the patch shows on your system.  One extra thing to check for: 
> When you plug in a USB drive, the SCSI core starts up an error-handler 
> process for it.  When you unplug the drive, the error-handler task 
> should exit.  If it doesn't then there's an extra reference somewhere 
> that never got released.

Here is what I see, the error handler hangs without the final put and
the kobject never gets cleaned up. Note the missing:
  kobject sdb: cleaning up

What is your CONFIG_SYSFS_DEPRECATED option? I have it unset, and that
may be the difference in the behavior if you have it set.

Thanks,
Kay


With device_put() - add sequence:
  sd_probe: call add_disk
  start of register_disk: 1
  kobject block: registering. parent: 9:0:0:0, set: <NULL>
  unset subsytem caused the event to drop!
  kobject sdb: registering. parent: block, set: devices
  kobject filter function caused the event to drop!
  after device_add: 3
  after sysfs_create_link: 3
  kobject holders: registering. parent: sdb, set: <NULL>
  kobject filter function caused the event to drop!
  kobject slaves: registering. parent: sdb, set: <NULL>
  kobject filter function caused the event to drop!
  after disk_sysfs_add_subdirs: 5
  after get_capacity: 5
  after bdget_disk: 5
  sd 9:0:0:0: [sdb] 127840 512-byte hardware sectors (65 MB)
  sd 9:0:0:0: [sdb] Write Protect is off
  sd 9:0:0:0: [sdb] Mode Sense: 45 00 00 08
  sd 9:0:0:0: [sdb] Assuming drive cache: write through
   sdb: sdb1
  kobject sdb1: registering. parent: sdb, set: devices
  kobject filter function caused the event to drop!
  kobject holders: registering. parent: sdb1, set: <NULL>
  kobject filter function caused the event to drop!
  after blkdev_get: 8
  after blkdev_put: 7
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0/block/sdb'
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0/block/sdb/sdb1'
  end of register_disk: 7
  kobject queue: registering. parent: sdb, set: <NULL>
  kobject filter function caused the event to drop!
  kobject iosched: registering. parent: queue, set: <NULL>
  kobject filter function caused the event to drop!
  after blk_register_queue: 9
  sd 9:0:0:0: [sdb] Attached SCSI removable disk
  kobject 9:0:0:0: registering. parent: scsi_device, set: class_obj
  fill_kobj_path: path = '/class/scsi_device/9:0:0:0'
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0'
  kobject sg2: registering. parent: scsi_generic, set: class_obj
  fill_kobj_path: path = '/class/scsi_generic/sg2'
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0'
  sd 9:0:0:0: Attached scsi generic sg2 type 0
  kobject 9:0:0:0: registering. parent: bsg, set: class_obj
  fill_kobj_path: path = '/class/bsg/9:0:0:0'
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0'
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject target9:0:1: registering. parent: host9, set: devices
  kobject filter function caused the event to drop!
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject filter function caused the event to drop!
  kobject target9:0:1: cleaning up
  kobject target9:0:2: registering. parent: host9, set: devices
  kobject filter function caused the event to drop!
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject filter function caused the event to drop!
  kobject target9:0:2: cleaning up
  kobject target9:0:3: registering. parent: host9, set: devices
  kobject filter function caused the event to drop!
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject filter function caused the event to drop!
  kobject target9:0:3: cleaning up
  kobject target9:0:4: registering. parent: host9, set: devices
  kobject filter function caused the event to drop!
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject filter function caused the event to drop!
  kobject target9:0:4: cleaning up
  kobject target9:0:5: registering. parent: host9, set: devices
  kobject filter function caused the event to drop!
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject filter function caused the event to drop!
  kobject target9:0:5: cleaning up
  kobject target9:0:6: registering. parent: host9, set: devices
  kobject filter function caused the event to drop!
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject filter function caused the event to drop!
  kobject target9:0:6: cleaning up
  kobject target9:0:7: registering. parent: host9, set: devices
  kobject filter function caused the event to drop!
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject filter function caused the event to drop!
  kobject target9:0:7: cleaning up
  usb-storage: device scan complete
  
With device_put() - eh_9 created:
     23 ?        S<     0:00  \_ [scsi_eh_0]
     24 ?        S<     0:00  \_ [scsi_eh_1]
     25 ?        S<     0:00  \_ [scsi_eh_2]
     26 ?        S<     0:00  \_ [scsi_eh_3]
     27 ?        S<     0:00  \_ [scsi_eh_4]
     28 ?        S<     0:00  \_ [scsi_eh_5]
   2149 ?        S<     0:00  \_ [scsi_eh_9]
   2182 tty1     S+     0:00      \_ grep eh

  
With device_put() - remove sequence:
  sd_remove: call del_gendisk
  start of del_gendisk: 9
  kobject holders: unregistering
  kobject filter function caused the event to drop!
  kobject holders: cleaning up
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0/block/sdb/sdb1'
  kobject sdb1: cleaning up
  after deleting subpartitions: 7
  after invalidate_partition(0): 7
  kobject filter function caused the event to drop!
  kobject filter function caused the event to drop!
  after unlink_gendisk: 6
  kobject holders: unregistering
  kobject filter function caused the event to drop!
  kobject holders: cleaning up
  kobject slaves: unregistering
  kobject filter function caused the event to drop!
  kobject slaves: cleaning up
  after kobject_unregister subdirs: 4
  after sysfs_remove_link: 4
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0/block/sdb'
  after device_del: 2
  after put_device: 1
  kobject 9:0:0:0: cleaning up
  scsi_disk_release: call put_disk
  kobject sdb: cleaning up
  kobject block: cleaning up
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0'
  kobject 9:0:0:0: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject filter function caused the event to drop!
  kobject target9:0:0: cleaning up
  fill_kobj_path: path = '/class/scsi_host/host9'
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9'
  kobject host9: cleaning up
  kobject filter function caused the event to drop!
  kobject host9: cleaning up
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0'
  kobject 1-2:1.0: cleaning up
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/usb_endpoint/usbdev1.6_ep00'
  kobject usbdev1.6_ep00: cleaning up
  kobject usb_endpoint: cleaning up
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2'
  kobject 1-2: cleaning up

With device_put() - eh_9 removed:
     23 ?        S<     0:00  \_ [scsi_eh_0]
     24 ?        S<     0:00  \_ [scsi_eh_1]
     25 ?        S<     0:00  \_ [scsi_eh_2]
     26 ?        S<     0:00  \_ [scsi_eh_3]
     27 ?        S<     0:00  \_ [scsi_eh_4]
     28 ?        S<     0:00  \_ [scsi_eh_5]
   2199 tty1     S+     0:00      \_ grep eh
  
  

Without device_put() - add sequence:
  sd_probe: call add_disk
  start of register_disk: 1
  kobject block: registering. parent: 8:0:0:0, set: <NULL>
  unset subsytem caused the event to drop!
  kobject sdb: registering. parent: block, set: devices
  kobject filter function caused the event to drop!
  after device_add: 3
  after sysfs_create_link: 3
  kobject holders: registering. parent: sdb, set: <NULL>
  kobject filter function caused the event to drop!
  kobject slaves: registering. parent: sdb, set: <NULL>
  kobject filter function caused the event to drop!
  after disk_sysfs_add_subdirs: 5
  after get_capacity: 5
  after bdget_disk: 5
  sd 8:0:0:0: [sdb] 127840 512-byte hardware sectors (65 MB)
  sd 8:0:0:0: [sdb] Write Protect is off
  sd 8:0:0:0: [sdb] Mode Sense: 45 00 00 08
  sd 8:0:0:0: [sdb] Assuming drive cache: write through
   sdb: sdb1
  kobject sdb1: registering. parent: sdb, set: devices
  kobject filter function caused the event to drop!
  kobject holders: registering. parent: sdb1, set: <NULL>
  kobject filter function caused the event to drop!
  after blkdev_get: 8
  after blkdev_put: 7
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host8/target8:0:0/8:0:0:0/block/sdb'
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host8/target8:0:0/8:0:0:0/block/sdb/sdb1'
  end of register_disk: 7
  kobject queue: registering. parent: sdb, set: <NULL>
  kobject filter function caused the event to drop!
  kobject iosched: registering. parent: queue, set: <NULL>
  kobject filter function caused the event to drop!
  after blk_register_queue: 9
  sd 8:0:0:0: [sdb] Attached SCSI removable disk
  kobject 8:0:0:0: registering. parent: scsi_device, set: class_obj
  fill_kobj_path: path = '/class/scsi_device/8:0:0:0'
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host8/target8:0:0/8:0:0:0'
  kobject sg2: registering. parent: scsi_generic, set: class_obj
  fill_kobj_path: path = '/class/scsi_generic/sg2'
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host8/target8:0:0/8:0:0:0'
  sd 8:0:0:0: Attached scsi generic sg2 type 0
  kobject 8:0:0:0: registering. parent: bsg, set: class_obj
  fill_kobj_path: path = '/class/bsg/8:0:0:0'
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host8/target8:0:0/8:0:0:0'
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject target8:0:1: registering. parent: host8, set: devices
  kobject filter function caused the event to drop!
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject filter function caused the event to drop!
  kobject target8:0:1: cleaning up
  kobject target8:0:2: registering. parent: host8, set: devices
  kobject filter function caused the event to drop!
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject filter function caused the event to drop!
  kobject target8:0:2: cleaning up
  kobject target8:0:3: registering. parent: host8, set: devices
  kobject filter function caused the event to drop!
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject filter function caused the event to drop!
  kobject target8:0:3: cleaning up
  kobject target8:0:4: registering. parent: host8, set: devices
  kobject filter function caused the event to drop!
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject filter function caused the event to drop!
  kobject target8:0:4: cleaning up
  kobject target8:0:5: registering. parent: host8, set: devices
  kobject filter function caused the event to drop!
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject filter function caused the event to drop!
  kobject target8:0:5: cleaning up
  kobject target8:0:6: registering. parent: host8, set: devices
  kobject filter function caused the event to drop!
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject filter function caused the event to drop!
  kobject target8:0:6: cleaning up
  kobject target8:0:7: registering. parent: host8, set: devices
  kobject filter function caused the event to drop!
  kobject <NULL>: cleaning up
  kobject iosched: cleaning up
  kobject queue: cleaning up
  kobject filter function caused the event to drop!
  kobject target8:0:7: cleaning up
  usb-storage: device scan complete
  
Without device_put() - eh_8 created:
     23 ?        S<     0:00  \_ [scsi_eh_0]
     24 ?        S<     0:00  \_ [scsi_eh_1]
     25 ?        S<     0:00  \_ [scsi_eh_2]
     26 ?        S<     0:00  \_ [scsi_eh_3]
     27 ?        S<     0:00  \_ [scsi_eh_4]
     28 ?        S<     0:00  \_ [scsi_eh_5]
   1965 ?        S<     0:00  \_ [scsi_eh_6]
   2014 ?        S<     0:00  \_ [scsi_eh_7]
   2066 ?        S<     0:00  \_ [scsi_eh_8]
   2098 tty1     S+     0:00      \_ grep eh
  
  
Without device_put() - remove sequence:
  sd_remove: call del_gendisk
  start of del_gendisk: 9
  kobject holders: unregistering
  kobject filter function caused the event to drop!
  kobject holders: cleaning up
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host8/target8:0:0/8:0:0:0/block/sdb/sdb1'
  kobject sdb1: cleaning up
  after deleting subpartitions: 7
  after invalidate_partition(0): 7
  kobject filter function caused the event to drop!
  kobject filter function caused the event to drop!
  after unlink_gendisk: 6
  kobject holders: unregistering
  kobject filter function caused the event to drop!
  kobject holders: cleaning up
  kobject slaves: unregistering
  kobject filter function caused the event to drop!
  kobject slaves: cleaning up
  after kobject_unregister subdirs: 4
  after sysfs_remove_link: 4
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host8/target8:0:0/8:0:0:0/block/sdb'
  after device_del: 2
  after put_device: 2
  kobject 8:0:0:0: cleaning up
  scsi_disk_release: call put_disk
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host8/target8:0:0/8:0:0:0'
  fill_kobj_path: path = '/class/scsi_host/host8'
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host8'
  kobject host8: cleaning up
  kobject filter function caused the event to drop!
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0'
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/usb_endpoint/usbdev1.4_ep00'
  kobject usbdev1.4_ep00: cleaning up
  kobject usb_endpoint: cleaning up
  fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2'
  
Without device_put() - eh_8 hanging around:
     23 ?        S<     0:00  \_ [scsi_eh_0]
     24 ?        S<     0:00  \_ [scsi_eh_1]
     25 ?        S<     0:00  \_ [scsi_eh_2]
     26 ?        S<     0:00  \_ [scsi_eh_3]
     27 ?        S<     0:00  \_ [scsi_eh_4]
     28 ?        S<     0:00  \_ [scsi_eh_5]
   1965 ?        S<     0:00  \_ [scsi_eh_6]
   2014 ?        S<     0:00  \_ [scsi_eh_7]
   2066 ?        S<     0:00  \_ [scsi_eh_8]
   2118 tty1     S+     0:00      \_ grep eh
  



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: BUG in: Driver core: convert block from raw kobjects to core devices
  2007-10-19 23:06             ` Kay Sievers
@ 2007-10-21  1:33               ` Alan Stern
  2007-10-21 19:03                 ` Kay Sievers
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2007-10-21  1:33 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Greg KH, Kernel development list

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2357 bytes --]

On Sat, 20 Oct 2007, Kay Sievers wrote:

> Here is what I see, the error handler hangs without the final put and
> the kobject never gets cleaned up. Note the missing:
>   kobject sdb: cleaning up
> 
> What is your CONFIG_SYSFS_DEPRECATED option? I have it unset, and that
> may be the difference in the behavior if you have it set.

It's unset in my config also.  No, the difference lies somewhere else.
The plug-in parts are the same, but we differ in the remove parts.  On 
your system, unlink_gendisk ends up dropping only one reference instead 
of two.  This suggests that something strange is happening to the 
request_queue on your machine.

Can you try running the attached patch (without the previous patch)?  
It traces the various release routines.  The idea is that both the
scsi_disk and the scsi_device hold references to the request_queue,
and these references get dropped in their respective release routines.

Just for the record, here's what happens with this patch on my system,
without the put_device call in del_gendisk (the patch comments it out).  
I plugged in a USB drive and let things settle down.  Then unplugging
the drive gave this:

[  457.916995] usb 6-4: USB disconnect, address 3
[  457.918459] usb 6-4: unregistering device
[  457.920570] usb 6-4: usb_disable_device nuking all URBs
[  457.920859] usb 6-4: unregistering interface 6-4:1.0
[  457.958990] disk_release: kobj cedcda50

The line above refers to the /dev/sda1 partition.  Ignore it.

[  458.012317] del_gendisk sda, kobj ce8be990, queue cd9b2000, refcount before put_device 2

2 references: one from the scsi_disk and one from the request_queue.

[  458.013133] scsi_disk_release: disk sda, kobj ce8be990, refcount before put_disk 2
[  458.032420] scsi_device_dev_release: rq cd9b2000

These lines show where the two references to the request_queue get 
dropped.  As a result the queue is released, causing the gendisk to be 
released as well:

[  458.032766] blk_release_queue: rq cd9b2000, parent ce8be990
[  458.032973] disk_release: kobj ce8be990
[  458.051641] usb 6-4:1.0: uevent
[  458.052001] usb 6-4:1.0: uevent
[  458.068665] usb 6-4: uevent

If you don't get a similar sequence of events, it must indicate that 
something on your system continues to hold an outstanding reference.  
Maybe an automounter program, or something like that.

Alan Stern

[-- Attachment #2: test patch 2 --]
[-- Type: TEXT/plain, Size: 2546 bytes --]

Index: 2.6.23/block/genhd.c
===================================================================
--- 2.6.23.orig/block/genhd.c
+++ 2.6.23/block/genhd.c
@@ -496,6 +496,7 @@ static void disk_release(struct device *
 {
 	struct gendisk *disk = dev_to_disk(dev);
 
+	printk(KERN_INFO "disk_release: kobj %p\n", &disk->dev.kobj);
 	kfree(disk->random);
 	kfree(disk->part);
 	free_disk_stats(disk);
Index: 2.6.23/block/ll_rw_blk.c
===================================================================
--- 2.6.23.orig/block/ll_rw_blk.c
+++ 2.6.23/block/ll_rw_blk.c
@@ -1781,6 +1781,8 @@ static void blk_release_queue(struct kob
 		container_of(kobj, struct request_queue, kobj);
 	struct request_list *rl = &q->rq;
 
+	printk(KERN_INFO "blk_release_queue: rq %p, parent %p\n",
+			q, q->kobj.parent);
 	blk_sync_queue(q);
 
 	if (rl->rq_pool)
Index: 2.6.23/drivers/scsi/scsi_sysfs.c
===================================================================
--- 2.6.23.orig/drivers/scsi/scsi_sysfs.c
+++ 2.6.23/drivers/scsi/scsi_sysfs.c
@@ -238,6 +238,8 @@ static void scsi_device_dev_release_user
 	list_del(&sdev->starved_entry);
 	spin_unlock_irqrestore(sdev->host->host_lock, flags);
 
+	printk(KERN_INFO "scsi_device_dev_release: rq %p\n",
+			sdev->request_queue);
 	if (sdev->request_queue) {
 		sdev->request_queue->queuedata = NULL;
 		/* user context needed to free queue */
Index: 2.6.23/fs/partitions/check.c
===================================================================
--- 2.6.23.orig/fs/partitions/check.c
+++ 2.6.23/fs/partitions/check.c
@@ -516,5 +516,9 @@ void del_gendisk(struct gendisk *disk)
 	sysfs_remove_link(&block_depr, disk->dev.bus_id);
 #endif
 	device_del(&disk->dev);
-	put_device(&disk->dev);
+	printk(KERN_INFO "del_gendisk %s, kobj %p, queue %p, "
+			"refcount before put_device %d\n",
+			disk->dev.bus_id, &disk->dev.kobj, disk->queue,
+			atomic_read(&disk->dev.kobj.kref.refcount));
+//	put_device(&disk->dev);
 }
Index: 2.6.23/drivers/scsi/sd.c
===================================================================
--- 2.6.23.orig/drivers/scsi/sd.c
+++ 2.6.23/drivers/scsi/sd.c
@@ -1738,6 +1738,10 @@ static void scsi_disk_release(struct cla
 	spin_unlock(&sd_index_lock);
 
 	disk->private_data = NULL;
+	printk(KERN_INFO "scsi_disk_release: disk %s, kobj %p, "
+			"refcount before put_disk %d\n",
+			disk->dev.bus_id, &disk->dev.kobj,
+			atomic_read(&disk->dev.kobj.kref.refcount));
 	put_disk(disk);
 	put_device(&sdkp->device->sdev_gendev);
 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: BUG in: Driver core: convert block from raw kobjects to core devices
  2007-10-21  1:33               ` Alan Stern
@ 2007-10-21 19:03                 ` Kay Sievers
  2007-10-22  0:26                   ` Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Kay Sievers @ 2007-10-21 19:03 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, Kernel development list

On Sat, 2007-10-20 at 21:33 -0400, Alan Stern wrote: 
> On Sat, 20 Oct 2007, Kay Sievers wrote:
> 
> > Here is what I see, the error handler hangs without the final put and
> > the kobject never gets cleaned up. Note the missing:
> >   kobject sdb: cleaning up
> > 
> > What is your CONFIG_SYSFS_DEPRECATED option? I have it unset, and that
> > may be the difference in the behavior if you have it set.
> 
> It's unset in my config also.  No, the difference lies somewhere else.
> The plug-in parts are the same, but we differ in the remove parts.  On 
> your system, unlink_gendisk ends up dropping only one reference instead 
> of two.  This suggests that something strange is happening to the 
> request_queue on your machine.
> 
> Can you try running the attached patch (without the previous patch)?  
> It traces the various release routines.  The idea is that both the
> scsi_disk and the scsi_device hold references to the request_queue,
> and these references get dropped in their respective release routines.

Here is the output. I used a device with a partition now, like your
setup, but it's still the same difference to the behavior you see.
What's your event sequence?

Here is the output of "udevmonitor -k":

UEVENT[1192992135.385043] add      /devices/pci0000:00/0000:00:1d.7/usb2/2-2 (usb)
UEVENT[1192992135.385088] add      /devices/pci0000:00/0000:00:1d.7/usb2/2-2/usb_endpoint/usbdev2.7_ep00 (usb_endpoint)
UEVENT[1192992135.385244] add      /devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0 (usb)
UEVENT[1192992135.385761] add      /class/scsi_host/host8 (scsi_host)
UEVENT[1192992135.386199] add      /devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0/usb_endpoint/usbdev2.7_ep02 (usb_endpoint)
UEVENT[1192992135.386212] add      /devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0/usb_endpoint/usbdev2.7_ep86 (usb_endpoint)
UEVENT[1192992135.386220] add      /devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0/usb_endpoint/usbdev2.7_ep81 (usb_endpoint)
UEVENT[1192992140.387147] add      /devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0/host8/target8:0:0/8:0:0:0 (scsi)
UEVENT[1192992140.387181] add      /class/scsi_disk/8:0:0:0 (scsi_disk)
UEVENT[1192992140.441760] add      /devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0/host8/target8:0:0/8:0:0:0/block/sdb (block)
UEVENT[1192992140.441776] add      /devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0/host8/target8:0:0/8:0:0:0/block/sdb/sdb1 (block)
UEVENT[1192992140.441785] add      /class/scsi_device/8:0:0:0 (scsi_device)
UEVENT[1192992140.441792] add      /class/scsi_generic/sg2 (scsi_generic)
UEVENT[1192992140.441800] add      /class/bsg/8:0:0:0 (bsg)

UEVENT[1192992143.163514] remove   /devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0/usb_endpoint/usbdev2.7_ep02 (usb_endpoint)
UEVENT[1192992143.163529] remove   /devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0/usb_endpoint/usbdev2.7_ep86 (usb_endpoint)
UEVENT[1192992143.163537] remove   /devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0/usb_endpoint/usbdev2.7_ep81 (usb_endpoint)
UEVENT[1192992143.163544] remove   /class/bsg/8:0:0:0 (bsg)
UEVENT[1192992143.163551] remove   /class/scsi_generic/sg2 (scsi_generic)
UEVENT[1192992143.163559] remove   /class/scsi_device/8:0:0:0 (scsi_device)
UEVENT[1192992143.163846] remove   /class/scsi_disk/8:0:0:0 (scsi_disk)
UEVENT[1192992143.163858] remove   /devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0/host8/target8:0:0/8:0:0:0/block/sdb/sdb1 (block)
UEVENT[1192992143.163866] remove   /devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0/host8/target8:0:0/8:0:0:0/block/sdb (block)
UEVENT[1192992143.163873] remove   /devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0/host8/target8:0:0/8:0:0:0 (scsi)
UEVENT[1192992143.163881] remove   /class/scsi_host/host8 (scsi_host)
UEVENT[1192992143.163889] remove   /devices/pci0000:00/0000:00:1d.7/usb2/2-2/2-2:1.0 (usb)
UEVENT[1192992143.163996] remove   /devices/pci0000:00/0000:00:1d.7/usb2/2-2/usb_endpoint/usbdev2.7_ep00 (usb_endpoint)
UEVENT[1192992143.164290] remove   /devices/pci0000:00/0000:00:1d.7/usb2/2-2 (usb)

> Just for the record, here's what happens with this patch on my system,
> without the put_device call in del_gendisk (the patch comments it out).  
> I plugged in a USB drive and let things settle down.  Then unplugging
> the drive gave this:
> 
> [  457.916995] usb 6-4: USB disconnect, address 3
> [  457.918459] usb 6-4: unregistering device
> [  457.920570] usb 6-4: usb_disable_device nuking all URBs
> [  457.920859] usb 6-4: unregistering interface 6-4:1.0
> [  457.958990] disk_release: kobj cedcda50
> 
> The line above refers to the /dev/sda1 partition.  Ignore it.
> 
> [  458.012317] del_gendisk sda, kobj ce8be990, queue cd9b2000, refcount before put_device 2
> 
> 2 references: one from the scsi_disk and one from the request_queue.
> 
> [  458.013133] scsi_disk_release: disk sda, kobj ce8be990, refcount before put_disk 2
> [  458.032420] scsi_device_dev_release: rq cd9b2000

Hmm, I still don't see this without the final put.

> These lines show where the two references to the request_queue get 
> dropped.  As a result the queue is released, causing the gendisk to be 
> released as well:
> 
> [  458.032766] blk_release_queue: rq cd9b2000, parent ce8be990
> [  458.032973] disk_release: kobj ce8be990
> [  458.051641] usb 6-4:1.0: uevent
> [  458.052001] usb 6-4:1.0: uevent
> [  458.068665] usb 6-4: uevent
> 
> If you don't get a similar sequence of events, it must indicate that 
> something on your system continues to hold an outstanding reference.  
> Maybe an automounter program, or something like that.

I killed all userspace while catching that output.


kobject <NULL>: cleaning up
usb 1-2: new full speed USB device using uhci_hcd and address 4
kobject 1-2: registering. parent: usb1, set: devices
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2'
kobject usb_endpoint: registering. parent: 1-2, set: <NULL>
kobject_uevent_env
unset subsystem caused the event to drop!
kobject usbdev1.4_ep00: registering. parent: usb_endpoint, set: devices
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/usb_endpoint/usbdev1.4_ep00'
usb 1-2: configuration #1 chosen from 1 choice
kobject 1-2:1.0: registering. parent: 1-2, set: devices
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0'
scsi9 : SCSI emulation for USB Mass Storage devices
kobject host9: registering. parent: 1-2:1.0, set: devices
kobject_uevent_env
kobject filter function caused the event to drop!
kobject host9: registering. parent: scsi_host, set: class_obj
kobject_uevent_env
fill_kobj_path: path = '/class/scsi_host/host9'
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9'
kobject usb_endpoint: registering. parent: 1-2:1.0, set: <NULL>
kobject_uevent_env
unset subsystem caused the event to drop!
kobject usbdev1.4_ep82: registering. parent: usb_endpoint, set: devices
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/usb_endpoint/usbdev1.4_ep82'
kobject usbdev1.4_ep01: registering. parent: usb_endpoint, set: devices
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/usb_endpoint/usbdev1.4_ep01'
usb 1-2: new device found, idVendor=08ec, idProduct=0011
usb 1-2: new device strings: Mfr=1, Product=2, SerialNumber=3
usb 1-2: Product: USB 2.0 Memory Key
usb 1-2: Manufacturer: IBM
usb 1-2: SerialNumber: 0218B301030027E8
usb-storage: device found at 4
usb-storage: waiting for device to settle before scanning
kobject target9:0:0: registering. parent: host9, set: devices
kobject_uevent_env
kobject filter function caused the event to drop!
scsi 9:0:0:0: Direct-Access     IBM      Memory Key       3.04 PQ: 0 ANSI: 0 CCS
kobject 9:0:0:0: registering. parent: target9:0:0, set: devices
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0'
kobject 9:0:0:0: registering. parent: scsi_disk, set: class_obj
kobject_uevent_env
fill_kobj_path: path = '/class/scsi_disk/9:0:0:0'
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0'
sd 9:0:0:0: [sdb] 127840 512-byte hardware sectors (65 MB)
sd 9:0:0:0: [sdb] Write Protect is off
sd 9:0:0:0: [sdb] Mode Sense: 45 00 00 08
sd 9:0:0:0: [sdb] Assuming drive cache: write through
kobject block: registering. parent: 9:0:0:0, set: <NULL>
kobject_uevent_env
unset subsystem caused the event to drop!
kobject sdb: registering. parent: block, set: devices
kobject_uevent_env
kobject filter function caused the event to drop!
kobject holders: registering. parent: sdb, set: <NULL>
kobject_uevent_env
kobject filter function caused the event to drop!
kobject slaves: registering. parent: sdb, set: <NULL>
kobject_uevent_env
kobject filter function caused the event to drop!
sd 9:0:0:0: [sdb] 127840 512-byte hardware sectors (65 MB)
sd 9:0:0:0: [sdb] Write Protect is off
sd 9:0:0:0: [sdb] Mode Sense: 45 00 00 08
sd 9:0:0:0: [sdb] Assuming drive cache: write through
 sdb: sdb1
kobject sdb1: registering. parent: sdb, set: devices
kobject_uevent_env
kobject filter function caused the event to drop!
kobject holders: registering. parent: sdb1, set: <NULL>
kobject_uevent_env
kobject filter function caused the event to drop!
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0/block/sdb'
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0/block/sdb/sdb1'
kobject queue: registering. parent: sdb, set: <NULL>
kobject_uevent_env
kobject filter function caused the event to drop!
kobject iosched: registering. parent: queue, set: <NULL>
kobject_uevent_env
kobject filter function caused the event to drop!
sd 9:0:0:0: [sdb] Attached SCSI removable disk
kobject 9:0:0:0: registering. parent: scsi_device, set: class_obj
kobject_uevent_env
fill_kobj_path: path = '/class/scsi_device/9:0:0:0'
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0'
kobject sg2: registering. parent: scsi_generic, set: class_obj
kobject_uevent_env
fill_kobj_path: path = '/class/scsi_generic/sg2'
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0'
sd 9:0:0:0: Attached scsi generic sg2 type 0
kobject 9:0:0:0: registering. parent: bsg, set: class_obj
kobject_uevent_env
fill_kobj_path: path = '/class/bsg/9:0:0:0'
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0'
kobject <NULL>: cleaning up
scsi_device_dev_release: rq ffff810005518690
kobject iosched: cleaning up
kobject queue: cleaning up
blk_release_queue: rq ffff810005518690, parent 0000000000000000
kobject target9:0:1: registering. parent: host9, set: devices
kobject_uevent_env
kobject filter function caused the event to drop!
kobject <NULL>: cleaning up
scsi_device_dev_release: rq ffff810005518690
kobject iosched: cleaning up
kobject queue: cleaning up
blk_release_queue: rq ffff810005518690, parent 0000000000000000
kobject_uevent_env
kobject filter function caused the event to drop!
kobject target9:0:1: cleaning up
kobject target9:0:2: registering. parent: host9, set: devices
kobject_uevent_env
kobject filter function caused the event to drop!
kobject <NULL>: cleaning up
scsi_device_dev_release: rq ffff810005518690
kobject iosched: cleaning up
kobject queue: cleaning up
blk_release_queue: rq ffff810005518690, parent 0000000000000000
kobject_uevent_env
kobject filter function caused the event to drop!
kobject target9:0:2: cleaning up
kobject target9:0:3: registering. parent: host9, set: devices
kobject_uevent_env
kobject filter function caused the event to drop!
kobject <NULL>: cleaning up
scsi_device_dev_release: rq ffff810005518690
kobject iosched: cleaning up
kobject queue: cleaning up
blk_release_queue: rq ffff810005518690, parent 0000000000000000
kobject_uevent_env
kobject filter function caused the event to drop!
kobject target9:0:3: cleaning up
kobject target9:0:4: registering. parent: host9, set: devices
kobject_uevent_env
kobject filter function caused the event to drop!
kobject <NULL>: cleaning up
scsi_device_dev_release: rq ffff810005518690
kobject iosched: cleaning up
kobject queue: cleaning up
blk_release_queue: rq ffff810005518690, parent 0000000000000000
kobject_uevent_env
kobject filter function caused the event to drop!
kobject target9:0:4: cleaning up
kobject target9:0:5: registering. parent: host9, set: devices
kobject_uevent_env
kobject filter function caused the event to drop!
kobject <NULL>: cleaning up
scsi_device_dev_release: rq ffff810005518690
kobject iosched: cleaning up
kobject queue: cleaning up
blk_release_queue: rq ffff810005518690, parent 0000000000000000
kobject_uevent_env
kobject filter function caused the event to drop!
kobject target9:0:5: cleaning up
kobject target9:0:6: registering. parent: host9, set: devices
kobject_uevent_env
kobject filter function caused the event to drop!
kobject <NULL>: cleaning up
scsi_device_dev_release: rq ffff810005518690
kobject iosched: cleaning up
kobject queue: cleaning up
blk_release_queue: rq ffff810005518690, parent 0000000000000000
kobject_uevent_env
kobject filter function caused the event to drop!
kobject target9:0:6: cleaning up
kobject target9:0:7: registering. parent: host9, set: devices
kobject_uevent_env
kobject filter function caused the event to drop!
kobject <NULL>: cleaning up
scsi_device_dev_release: rq ffff810005518690
kobject iosched: cleaning up
kobject queue: cleaning up
blk_release_queue: rq ffff810005518690, parent 0000000000000000
kobject_uevent_env
kobject filter function caused the event to drop!
kobject target9:0:7: cleaning up
usb-storage: device scan complete

usb 1-2: USB disconnect, address 4
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/usb_endpoint/usbdev1.4_ep82'
kobject usbdev1.4_ep82: cleaning up
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/usb_endpoint/usbdev1.4_ep01'
kobject usbdev1.4_ep01: cleaning up
kobject usb_endpoint: cleaning up
kobject_uevent_env
fill_kobj_path: path = '/class/bsg/9:0:0:0'
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0'
kobject 9:0:0:0: cleaning up
kobject_uevent_env
fill_kobj_path: path = '/class/scsi_generic/sg2'
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0'
kobject sg2: cleaning up
kobject <NULL>: cleaning up
kobject <NULL>: cleaning up
disk_release: kobj ffff810003bcbd58
kobject_uevent_env
fill_kobj_path: path = '/class/scsi_device/9:0:0:0'
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0'
kobject 9:0:0:0: cleaning up
kobject_uevent_env
fill_kobj_path: path = '/class/scsi_disk/9:0:0:0'
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0'
kobject holders: unregistering
kobject_uevent_env
kobject filter function caused the event to drop!
kobject holders: cleaning up
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0/block/sdb/sdb1'
kobject sdb1: cleaning up
kobject_uevent_env
kobject filter function caused the event to drop!
kobject_uevent_env
kobject filter function caused the event to drop!
kobject holders: unregistering
kobject_uevent_env
kobject filter function caused the event to drop!
kobject holders: cleaning up
kobject slaves: unregistering
kobject_uevent_env
kobject filter function caused the event to drop!
kobject slaves: cleaning up
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0/block/sdb'
del_gendisk sdb, kobj ffff810004825158, queue ffff810005518000, refcount before put_device 2
kobject 9:0:0:0: cleaning up
scsi_disk_release: disk sdb, kobj ffff810004825158, refcount before put_disk 2
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9/target9:0:0/9:0:0:0'
kobject_uevent_env
fill_kobj_path: path = '/class/scsi_host/host9'
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0/host9'
kobject host9: cleaning up
kobject_uevent_env
kobject filter function caused the event to drop!
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/1-2:1.0'
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2/usb_endpoint/usbdev1.4_ep00'
kobject usbdev1.4_ep00: cleaning up
kobject usb_endpoint: cleaning up
kobject_uevent_env
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1d.0/usb1/1-2'


Thanks,
Kay


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: BUG in: Driver core: convert block from raw kobjects to core devices
  2007-10-21 19:03                 ` Kay Sievers
@ 2007-10-22  0:26                   ` Alan Stern
  2007-10-23  0:01                     ` Kay Sievers
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2007-10-22  0:26 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Greg KH, Kernel development list

On Sun, 21 Oct 2007, Kay Sievers wrote:

> > [  458.013133] scsi_disk_release: disk sda, kobj ce8be990, refcount before put_disk 2
> > [  458.032420] scsi_device_dev_release: rq cd9b2000
> 
> Hmm, I still don't see this without the final put.

Yes, I see your point.  Suppose you try doing the exact same thing
again, but this time un-comment the put_device() call so that the
scsi_device does get released.  I predict that the log will show your
request_queue drops its reference to the gendisk structure _after_ the
gendisk has been released.  (If necessary I could send a patch with a
printk at the crucial spot.)  That would prove something is going wrong
on your system.

(BTW you don't need to include the log for when you plug in the USB
drive; all that matters is what happens when you unplug it.)

Alan Stern


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: BUG in: Driver core: convert block from raw kobjects to core devices
  2007-10-22  0:26                   ` Alan Stern
@ 2007-10-23  0:01                     ` Kay Sievers
  2007-10-23  4:14                       ` Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Kay Sievers @ 2007-10-23  0:01 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, Kernel development list

On Sun, 2007-10-21 at 20:26 -0400, Alan Stern wrote:
> On Sun, 21 Oct 2007, Kay Sievers wrote:
> 
> > > [  458.013133] scsi_disk_release: disk sda, kobj ce8be990, refcount before put_disk 2
> > > [  458.032420] scsi_device_dev_release: rq cd9b2000
> > 
> > Hmm, I still don't see this without the final put.
> 
> Yes, I see your point.  Suppose you try doing the exact same thing
> again, but this time un-comment the put_device() call so that the
> scsi_device does get released.  I predict that the log will show your
> request_queue drops its reference to the gendisk structure _after_ the
> gendisk has been released.  (If necessary I could send a patch with a
> printk at the crucial spot.)  That would prove something is going wrong
> on your system.

There is definitely something wrong, I tried all sorts of options now,
and a second machine, and I can never get the behavior you see. I even
booted with init=/bin/sh.
But true, looking at the kobject debugging for loop devices, and usb
storage driven by the ub driver, all looks fine without the additional
put.

There must be something going wrong with the block patch in conjunction
with the crazy SCSI release logic. Can you send me your .config? Just
for a check, maybe you have some option, enabled/disabled that changes
the behavior, and possibly brings us closer to find the bug.

Thanks,
Kay


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: BUG in: Driver core: convert block from raw kobjects to core devices
  2007-10-23  0:01                     ` Kay Sievers
@ 2007-10-23  4:14                       ` Alan Stern
  2007-10-23 11:27                         ` Kay Sievers
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2007-10-23  4:14 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Greg KH, Kernel development list

On Tue, 23 Oct 2007, Kay Sievers wrote:

> There is definitely something wrong, I tried all sorts of options now,
> and a second machine, and I can never get the behavior you see. I even
> booted with init=/bin/sh.
> But true, looking at the kobject debugging for loop devices, and usb
> storage driven by the ub driver, all looks fine without the additional
> put.

Yes; I haven't been able to figure out why we get different results.

> There must be something going wrong with the block patch in conjunction
> with the crazy SCSI release logic.

I don't know -- there's nothing obviously wrong with the block patch 
except the extra put_device.  But you're right that the SCSI logic is 
crazy.  The scsi_device is the parent of the gendisk, which is the 
parent of the request_queue.  But the scsi_device holds a reference to 
the request_queue, which is dropped in the scsi_device's release 
routine!  That doesn't make much sense to me, and it is complicated by 
the fact that deleting a kobject doesn't drop the kobject's reference 
to its parent -- only releasing the kobject does.

In fact, for my development work I normally use a patch which makes 
kobjects behave better: They do drop the reference to their parent when 
they are deleted.  This actually is nothing more than a reversion of a 
patch added several years ago to try and cover up some of the 
weaknesses of the SCSI stack!  Now that the SCSI stack is in better 
shape, maybe my patch should be included in the mainstream kernel.  The 
patch is below; see what you think.

> Can you send me your .config? Just
> for a check, maybe you have some option, enabled/disabled that changes
> the behavior, and possibly brings us closer to find the bug.

I'll send it to you off-list.

Unfortunately I don't have much time to work on debugging this right
now; I'm on vacation this week.

Alan Stern


Index: usb-2.6/lib/kobject.c
===================================================================
--- usb-2.6.orig/lib/kobject.c
+++ usb-2.6/lib/kobject.c
@@ -206,12 +206,16 @@ void kobject_init(struct kobject * kobj)
 
 static void unlink(struct kobject * kobj)
 {
+	struct kobject *parent = kobj->parent;
+
 	if (kobj->kset) {
 		spin_lock(&kobj->kset->list_lock);
 		list_del_init(&kobj->entry);
 		spin_unlock(&kobj->kset->list_lock);
 	}
+	kobj->parent = NULL;
 	kobject_put(kobj);
+	kobject_put(parent);
 }
 
 /**
@@ -255,7 +259,6 @@ int kobject_add(struct kobject * kobj)
 	if (error) {
 		/* unlink does the kobject_put() for us */
 		unlink(kobj);
-		kobject_put(parent);
 
 		/* be noisy on error issues */
 		if (error == -EEXIST)
@@ -498,7 +501,6 @@ void kobject_cleanup(struct kobject * ko
 {
 	struct kobj_type * t = get_ktype(kobj);
 	struct kset * s = kobj->kset;
-	struct kobject * parent = kobj->parent;
 	const char *name = kobj->k_name;
 
 	pr_debug("kobject %s: cleaning up\n",kobject_name(kobj));
@@ -515,7 +517,6 @@ void kobject_cleanup(struct kobject * ko
 	}
 	if (s)
 		kset_put(s);
-	kobject_put(parent);
 }
 
 static void kobject_release(struct kref *kref)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: BUG in: Driver core: convert block from raw kobjects to core devices
  2007-10-23  4:14                       ` Alan Stern
@ 2007-10-23 11:27                         ` Kay Sievers
  2007-10-24 18:01                           ` Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Kay Sievers @ 2007-10-23 11:27 UTC (permalink / raw)
  To: Alan Stern; +Cc: Greg KH, Kernel development list, Hannes Reinecke

[-- Attachment #1: Type: text/plain, Size: 2451 bytes --]

On Tue, 2007-10-23 at 00:14 -0400, Alan Stern wrote:
> On Tue, 23 Oct 2007, Kay Sievers wrote:
> 
> > There is definitely something wrong, I tried all sorts of options now,
> > and a second machine, and I can never get the behavior you see. I even
> > booted with init=/bin/sh.
> > But true, looking at the kobject debugging for loop devices, and usb
> > storage driven by the ub driver, all looks fine without the additional
> > put.
> 
> Yes; I haven't been able to figure out why we get different results.
> 
> > There must be something going wrong with the block patch in conjunction
> > with the crazy SCSI release logic.
> 
> I don't know -- there's nothing obviously wrong with the block patch 
> except the extra put_device.  But you're right that the SCSI logic is 
> crazy.  The scsi_device is the parent of the gendisk, which is the 
> parent of the request_queue.  But the scsi_device holds a reference to 
> the request_queue, which is dropped in the scsi_device's release 
> routine!

That's the thing. We see a circular reference when we use the SCSI LUN
as the parent.
The sysfs relationship is: scsi_device -> genhd -> queue, while the
queue holds a ref to to the blockdev (parent), the blockdev to the
scsi_device (parent) and the scsi_devices to the queue (SCSI). All
waiting for their release functions to be called, to release the other
refs.

The scsi_device needs to drop the queue reference while the device is
removed, not when it's data is released. Hannes came up with the
attached patch, which seems to work fine here.

> That doesn't make much sense to me, and it is complicated by 
> the fact that deleting a kobject doesn't drop the kobject's reference 
> to its parent -- only releasing the kobject does.

Right, that makes things very complicated.

> In fact, for my development work I normally use a patch which makes 
> kobjects behave better: They do drop the reference to their parent when 
> they are deleted.  This actually is nothing more than a reversion of a 
> patch added several years ago to try and cover up some of the 
> weaknesses of the SCSI stack!  Now that the SCSI stack is in better 
> shape, maybe my patch should be included in the mainstream kernel.  The 
> patch is below; see what you think.

Sounds good to me, to disconnect a dead object from its parent when it's
deleted. We only need to protect for "use after free" but not lock the
parent, I guess. We should give it a try.

Thanks,
Kay

[-- Attachment #2: scsi-sysfs-drop-queue-ref-on-remove.patch --]
[-- Type: text/x-patch, Size: 1184 bytes --]

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index daed37d..577bbf3 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -280,15 +280,6 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	list_del(&sdev->starved_entry);
 	spin_unlock_irqrestore(sdev->host->host_lock, flags);
 
-	if (sdev->request_queue) {
-		sdev->request_queue->queuedata = NULL;
-		/* user context needed to free queue */
-		scsi_free_queue(sdev->request_queue);
-		/* temporary expedient, try to catch use of queue lock
-		 * after free of sdev */
-		sdev->request_queue = NULL;
-	}
-
 	scsi_target_reap(scsi_target(sdev));
 
 	kfree(sdev->inquiry);
@@ -799,6 +790,16 @@ void __scsi_remove_device(struct scsi_device *sdev)
 	if (sdev->host->hostt->slave_destroy)
 		sdev->host->hostt->slave_destroy(sdev);
 	transport_destroy_device(dev);
+
+	if (sdev->request_queue) {
+		sdev->request_queue->queuedata = NULL;
+		/* user context needed to free queue */
+		scsi_free_queue(sdev->request_queue);
+		/* temporary expedient, try to catch use of queue lock
+		 * after free of sdev */
+		sdev->request_queue = NULL;
+	}
+
 	put_device(dev);
 }
 

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: BUG in: Driver core: convert block from raw kobjects to core devices
  2007-10-23 11:27                         ` Kay Sievers
@ 2007-10-24 18:01                           ` Alan Stern
  0 siblings, 0 replies; 16+ messages in thread
From: Alan Stern @ 2007-10-24 18:01 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Greg KH, Kernel development list, Hannes Reinecke

On Tue, 23 Oct 2007, Kay Sievers wrote:

> > Yes; I haven't been able to figure out why we get different results.
> > 
> > > There must be something going wrong with the block patch in conjunction
> > > with the crazy SCSI release logic.

I found out why it works on my system -- which still leaves open the 
question of why it fails on yours.

Although the gendisk device is a child of the scsi_device, nevertheless 
disk->kobj is not a child of the scsi_device's embedded kobject.  
Instead it is a child of the static (!) block_depr kobject, which is 
defined in block/genhd.c.  Hence the disk's single reference to the 
scsi_device is indeed dropped when the disk is unregistered, which 
breaks the loop of circular references.

It works this way because I have CONFIG_SYSFS_DEPRECATED set; perhaps
you don't.  If you compare the two versions of get_device_parent() in
drivers/base/core.c you'll see the difference (it's the dev_type ==
&disk_type case).

Alan Stern


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2007-10-24 18:01 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-10-18 19:23 BUG in: Driver core: convert block from raw kobjects to core devices Alan Stern
2007-10-18 19:27 ` Greg KH
2007-10-18 19:27 ` Kay Sievers
2007-10-18 20:08   ` Alan Stern
2007-10-19  1:27     ` Kay Sievers
2007-10-19 14:09       ` Alan Stern
2007-10-19 14:15         ` Kay Sievers
2007-10-19 17:11           ` Alan Stern
2007-10-19 23:06             ` Kay Sievers
2007-10-21  1:33               ` Alan Stern
2007-10-21 19:03                 ` Kay Sievers
2007-10-22  0:26                   ` Alan Stern
2007-10-23  0:01                     ` Kay Sievers
2007-10-23  4:14                       ` Alan Stern
2007-10-23 11:27                         ` Kay Sievers
2007-10-24 18:01                           ` Alan Stern

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).