All of lore.kernel.org
 help / color / mirror / Atom feed
* status of oops in sd_revalidate_disk?
@ 2011-12-14 17:59 Axel Theilmann
  2011-12-21 14:12 ` Huajun Li
  0 siblings, 1 reply; 16+ messages in thread
From: Axel Theilmann @ 2011-12-14 17:59 UTC (permalink / raw)
  To: linux-scsi; +Cc: huajun.li.lee, jbottomley, stefanr



hi,

two weeks ago Huajun Li posted a patch for a kernel oops, subject [PATCH]
SCSI/sd: Fix NULL dereference in sd_revalidate_disk".

The patch was discussed but considered "clearly wrong". The bug shows up for
us in kernel 3.1.4 quite often when unplugging usb sticks and it seems a few
other people have the same problem:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/859199
https://bugzilla.redhat.com/show_bug.cgi?id=754518
https://bugzilla.novell.com/show_bug.cgi?id=722350
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=649735

Can anyone of you maybe give me any status update on that bug?

Thanks

tty, axel


-- 
Dipl.-Inform. Axel Theilmann             theilmann@pre-sense.de
                               Tel.  (+49) 040  / 244 2407 - 13
                               Fax   (+49) 040  / 244 2407 - 24
                               Mobil (+49) 0151 / 116 18 664
PRESENSE Technologies GmbH         Sachsenstr. 5, 20097 Hamburg
                                         USt-IdNr.: DE263765024
Geschäftsführer/Managing Directors       AG Hamburg, HRB 107844
Till Dörges           Jürgen Sander              Axel Theilmann
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: status of oops in sd_revalidate_disk?
  2011-12-14 17:59 status of oops in sd_revalidate_disk? Axel Theilmann
@ 2011-12-21 14:12 ` Huajun Li
  2011-12-25 20:58   ` Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?) Stefan Richter
  0 siblings, 1 reply; 16+ messages in thread
From: Huajun Li @ 2011-12-21 14:12 UTC (permalink / raw)
  To: Axel Theilmann; +Cc: linux-scsi, jbottomley, stefanr, huajun.li.lee

在 2011年12月15日 上午1:59,Axel Theilmann <theilmann@pre-sense.de> :
>
>
> hi,
>
> two weeks ago Huajun Li posted a patch for a kernel oops, subject [PATCH]
> SCSI/sd: Fix NULL dereference in sd_revalidate_disk".
>
> The patch was discussed but considered "clearly wrong". The bug shows up for
> us in kernel 3.1.4 quite often when unplugging usb sticks and it seems a few
> other people have the same problem:
>
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/859199
> https://bugzilla.redhat.com/show_bug.cgi?id=754518
> https://bugzilla.novell.com/show_bug.cgi?id=722350
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=649735
>
> Can anyone of you maybe give me any status update on that bug?

It's quiet on this thread. :)

As I mentioned at http://marc.info/?l=linux-scsi&m=132266428023443&w=2
, sd_open() may return -ENOMEDIUM, then rescan_partitions() and
sd_revalidate_disk() will be triggered in turn, the latter will find
sdkp is NULL. The patch can fix the issue _If_ there is no other
dependency.

Any comments?

Thanks,
--Huajun
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?)
  2011-12-21 14:12 ` Huajun Li
@ 2011-12-25 20:58   ` Stefan Richter
  2011-12-27 10:21       ` Axel Theilmann
  2012-02-14 11:34     ` Jun'ichi Nomura
  0 siblings, 2 replies; 16+ messages in thread
From: Stefan Richter @ 2011-12-25 20:58 UTC (permalink / raw)
  To: linux-scsi; +Cc: Huajun Li, Axel Theilmann, jbottomley, linux-kernel

Hi,

as far as I remember, all Linux releases in 2011 have been broken WRT hot
removal of block devices; some more severely, some less.  Various patches
for this went in over the year, but if they fixed anything, they always
uncovered the next lingering unplug related bug.  The presumed first Linux
release in 2012 will be broken too, again in an easy to trigger way.  Here
is a quick test:
  - Start grip or any other program for CD-ROM access.
  - Unplug CD-ROM drive.
  - Have the program issue an ioctl, e.g. poll for medium presence.

With a little bit of bad luck, udisks-daemon or in older distros hald
should hit the bug too.  Under kernel 3.1 I typically just got processes
hanging in unkillable sleep.  With kernel 3.2-rc7 I get an instant kernel
panic.

First I tested a FireWire drive and got the first log which is included
below, instantly in two attempts.  Then I made two attempts with a USB
CD-ROM which did not oops immediately at device removal but when I then
hit the eject button in the still open grip.  This consistently produced
the second log at the end of this post.

First test with 1394 CD-ROM:
-----------------------------------------------------------------
  - attach CD-ROM drive
-----------------------------------------------------------------
scsi4 : SBP-2 IEEE-1394
firewire_sbp2 fw1.0: logged in to LUN 0000 (0 retries)
scsi 4:0:0:0: CD-ROM		TEAC	 CD-W28E	  1.1A PQ: 0 ANSI: 0 CCS
sr1: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray
sr 4:0:0:0: Attached scsi CD-ROM sr1
-----------------------------------------------------------------
  - start grip
  - detach CD-ROM drive
-----------------------------------------------------------------
sr 4:0:0:0: Attached scsi generic sg2 type 5
scsi 4:0:0:0: killing request
BUG: unable to handle kernel NULL pointer dereference at 000003f0
IP: [<c11bc19f>] scsi_prep_state_check+0x6/0x68
*pde = 00000000 
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: firewire_sbp2 firewire_ohci firewire_core netconsole snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss nfs lockd sunrpc i2c_i801 applesmc sr_mod rtc sg input_polldev cdrom snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_pcm snd_timer snd sky2 snd_page_alloc

Pid: 2832, comm: grip Not tainted 3.2.0-rc7 #1 Apple Computer, Inc. Macmini1,1/Mac-F4208EC8
EIP: 0060:[<c11bc19f>] EFLAGS: 00010046 CPU: 0
EIP is at scsi_prep_state_check+0x6/0x68
EAX: 00000000 EBX: f33f3f18 ECX: 00000000 EDX: f33f3f18
ESI: f4815a68 EDI: 00000000 EBP: f160bc14 ESP: f160bc10
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process grip (pid: 2832, ti=f160a000 task=f5d48760 task.ti=f160a000)
Stack:
 f33f3f18 f160bc2c c11bc8b1 f160bc3c f33f3f18 f4815a68 f33f3f18 f160bc3c
 c11bc9a5 f33f3f18 f4815a68 f160bc50 c10efad5 00000000 f33f3f18 f4815a68
 f160bc78 c11bd09f f4815db0 f33f3f18 00000001 f33f3f18 f4815a68 f4815a68

Call Trace:
 [<c11bc8b1>] scsi_setup_blk_pc_cmnd+0x12/0xe7
 [<c11bc9a5>] scsi_prep_fn+0x1f/0x2e
 [<c10efad5>] blk_peek_request+0x98/0x168
 [<c11bd09f>] scsi_request_fn+0x23/0x3b5
 [<c10ed9d6>] __blk_run_queue+0x14/0x16
 [<c10f25d5>] blk_execute_rq_nowait+0x7d/0x98
 [<c10f2697>] blk_execute_rq+0xa7/0xe8
 [<c10f2530>] ? blk_rq_map_user+0x1b7/0x1b7
 [<c10f8c81>] ? changed_ioprio+0x70/0x70
 [<c10ed700>] ? elv_set_request+0x12/0x20
 [<c10eeebd>] ? get_request+0x21e/0x2bb
 [<c11bcad2>] scsi_execute+0xc4/0x10a
 [<c11bcb6c>] scsi_execute_req+0x54/0x81
 [<c11bcbea>] scsi_test_unit_ready+0x51/0xb2
 [<f828248b>] sr_drive_status+0x33/0xd5 [sr_mod]
 [<f81f7860>] cdrom_ioctl+0x6a9/0xb31 [cdrom]
 [<c1279f36>] ? mutex_lock_nested+0x26c/0x2b0
 [<c10231e5>] ? get_parent_ip+0xb/0x31
 [<c1023287>] ? sub_preempt_count+0x7c/0x89
 [<c1279f5f>] ? mutex_lock_nested+0x295/0x2b0
 [<f82815f1>] ? sr_block_ioctl+0x2e/0x9a [sr_mod]
 [<f8281612>] sr_block_ioctl+0x4f/0x9a [sr_mod]
 [<f82815c3>] ? sr_block_check_events+0x13/0x13 [sr_mod]
 [<c10f39ee>] __blkdev_driver_ioctl+0x22/0x2e
 [<c10f42f5>] blkdev_ioctl+0x66d/0x68c
 [<c104bf7e>] ? __lock_acquire+0x62e/0x14bb
 [<c10b1861>] block_ioctl+0x32/0x3a
 [<c10b1861>] ? block_ioctl+0x32/0x3a
 [<c10b182f>] ? bd_set_size+0x67/0x67
 [<c109bfd5>] do_vfs_ioctl+0x481/0x4b7
 [<c1090993>] ? fget_light+0x4c/0xd0
 [<c109c039>] sys_ioctl+0x2e/0x49
 [<c127bb50>] sysenter_do_call+0x12/0x36
Code: 55 dc 8b 42 04 8b 80 68 03 00 00 e8 65 f0 0b 00 8b 43 64 e8 bf e9 0b 00 e9 59 ff ff ff 83 c4 1c 5b 5e 5f c9 c3 55 31 c9 89 e5 53 98 f0 03 00 00 83 fb 02 74 50 8d 4b fc 83 f9 04 77 3f ff 24
EIP: [<c11bc19f>] scsi_prep_state_check+0x6/0x68 SS:ESP 0068:f160bc10
CR2: 00000000000003f0
---[ end trace fba59fe8183510a7 ]---
note: grip[2832] exited with preempt_count 1
BUG: sleeping function called from invalid context at mm/memory.c:3905
in_atomic(): 1, irqs_disabled(): 0, pid: 2832, name: grip
INFO: lockdep is turned off.
Pid: 2832, comm: grip Tainted: G      D      3.2.0-rc7 #1
Call Trace:
 [<c1020b11>] __might_sleep+0xdb/0xe2
 [<c107cb36>] might_fault+0x22/0x7c
 [<c10503d0>] exit_robust_list+0x24/0x112
 [<c127b3f0>] ? restore_all+0xf/0xf
 [<c102721c>] mm_release+0x21/0xad
 [<c102a72f>] exit_mm+0x18/0xe7
 [<c127831f>] ? printk+0xf/0x18
 [<c102ba91>] do_exit+0x193/0x574
 [<c1004ac9>] oops_end+0x75/0x7c
 [<c101a8fc>] no_context+0x10e/0x118
 [<c104bf7e>] ? __lock_acquire+0x62e/0x14bb
 [<c101a9fa>] __bad_area_nosemaphore+0xf4/0xfc
 [<c101acd2>] ? vmalloc_sync_all+0x101/0x101
 [<c101aa0f>] bad_area_nosemaphore+0xd/0x10
 [<c101ae2d>] do_page_fault+0x15b/0x352
 [<c1090993>] ? fget_light+0x4c/0xd0
 [<c101acd2>] ? vmalloc_sync_all+0x101/0x101
 [<c127b8e7>] error_code+0x5f/0x64
 [<c101acd2>] ? vmalloc_sync_all+0x101/0x101
 [<c11bc19f>] ? scsi_prep_state_check+0x6/0x68
 [<c11bc8b1>] scsi_setup_blk_pc_cmnd+0x12/0xe7
 [<c11bc9a5>] scsi_prep_fn+0x1f/0x2e
 [<c10efad5>] blk_peek_request+0x98/0x168
 [<c11bd09f>] scsi_request_fn+0x23/0x3b5
 [<c10ed9d6>] __blk_run_queue+0x14/0x16
 [<c10f25d5>] blk_execute_rq_nowait+0x7d/0x98
 [<c10f2697>] blk_execute_rq+0xa7/0xe8
 [<c10f2530>] ? blk_rq_map_user+0x1b7/0x1b7
 [<c10f8c81>] ? changed_ioprio+0x70/0x70
 [<c10ed700>] ? elv_set_request+0x12/0x20
 [<c10eeebd>] ? get_request+0x21e/0x2bb
 [<c11bcad2>] scsi_execute+0xc4/0x10a
 [<c11bcb6c>] scsi_execute_req+0x54/0x81
 [<c11bcbea>] scsi_test_unit_ready+0x51/0xb2
 [<f828248b>] sr_drive_status+0x33/0xd5 [sr_mod]
 [<f81f7860>] cdrom_ioctl+0x6a9/0xb31 [cdrom]
 [<c1279f36>] ? mutex_lock_nested+0x26c/0x2b0
 [<c10231e5>] ? get_parent_ip+0xb/0x31
 [<c1023287>] ? sub_preempt_count+0x7c/0x89
 [<c1279f5f>] ? mutex_lock_nested+0x295/0x2b0
 [<f82815f1>] ? sr_block_ioctl+0x2e/0x9a [sr_mod]
 [<f8281612>] sr_block_ioctl+0x4f/0x9a [sr_mod]
 [<f82815c3>] ? sr_block_check_events+0x13/0x13 [sr_mod]
 [<c10f39ee>] __blkdev_driver_ioctl+0x22/0x2e
 [<c10f42f5>] blkdev_ioctl+0x66d/0x68c
 [<c104bf7e>] ? __lock_acquire+0x62e/0x14bb
 [<c10b1861>] block_ioctl+0x32/0x3a
 [<c10b1861>] ? block_ioctl+0x32/0x3a
 [<c10b182f>] ? bd_set_size+0x67/0x67
 [<c109bfd5>] do_vfs_ioctl+0x481/0x4b7
 [<c1090993>] ? fget_light+0x4c/0xd0
 [<c109c039>] sys_ioctl+0x2e/0x49
 [<c127bb50>] sysenter_do_call+0x12/0x36


Second test with USB CD-ROM:
-----------------------------------------------------------------
  - attach CD-ROM drive
-----------------------------------------------------------------
scsi4 : usb-storage 1-5:1.0
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
scsi 4:0:0:0: CD-ROM		PLEXTOR  DVDR	PX-716A   1.08 PQ: 0 ANSI: 0 CCS
sr1: scsi3-mmc drive: 40x/40x writer cd/rw xa/form2 cdda tray
sr 4:0:0:0: Attached scsi CD-ROM sr1
sr 4:0:0:0: Attached scsi generic sg2 type 5
-----------------------------------------------------------------
  - start grip
  - detach CD-ROM drive
-----------------------------------------------------------------
usb 1-5: USB disconnect, device number 7
-----------------------------------------------------------------
  - hit grip's eject button
-----------------------------------------------------------------
BUG: unable to handle kernel NULL pointer dereference at 00000024
IP: [<c10f636a>] __blk_send_generic.clone.9+0x21/0x70
*pde = 00000000
Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: usb_storage netconsole snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss nfs lockd sunrpc sr_mod i2c_i801 sg cdrom applesmc input_polldev rtc snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_pcm snd_timer snd sky2 snd_page_alloc

Pid: 2845, comm: grip Not tainted 3.2.0-rc7 #1 Apple Computer, Inc. Macmini1,1/Mac-F4208EC8
EIP: 0060:[<c10f636a>] EFLAGS: 00010246 CPU: 1
EIP is at __blk_send_generic.clone.9+0x21/0x70
EAX: 00000000 EBX: 00000000 ECX: 00000006 EDX: c10ef095
ESI: f3f3ca68 EDI: f1d55bf0 EBP: f1625d78 ESP: f1625d68
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process grip (pid: 2845, ti=f1624000 task=f3043760 task.ti=f1624000)
Stack:
 00000003 f3f3ca68 fffffffa 00000005 f1625e18 c10f6711 c150ede4 0000005d
 f1d55bf0 00000000 f3043ac4 f1625e08 00000046 00000282 f1625dc8 f1625df4
 00000000 f1625db8 f829b200 00000000 c150ede4 00000001 0000022d f3043702

Call Trace:
 [<c10f6711>] scsi_cmd_ioctl+0x358/0x373
 [<f829b200>] ? sr_packet+0x1a/0x3b [sr_mod]
 [<f81f71dd>] cdrom_ioctl+0x26/0xb31 [cdrom]
 [<c1279f36>] ? mutex_lock_nested+0x26c/0x2b0
 [<c10231e5>] ? get_parent_ip+0xb/0x31
 [<c1023287>] ? sub_preempt_count+0x7c/0x89
 [<c1279f5f>] ? mutex_lock_nested+0x295/0x2b0
 [<f829b5f1>] ? sr_block_ioctl+0x2e/0x9a [sr_mod]
 [<f829b612>] sr_block_ioctl+0x4f/0x9a [sr_mod]
 [<f829b5c3>] ? sr_block_check_events+0x13/0x13 [sr_mod]
 [<c10f39ee>] __blkdev_driver_ioctl+0x22/0x2e
 [<c10f42f5>] blkdev_ioctl+0x66d/0x68c
 [<c104bf7e>] ? __lock_acquire+0x62e/0x14bb
 [<c10b1861>] block_ioctl+0x32/0x3a
 [<c10b1861>] ? block_ioctl+0x32/0x3a
 [<c10b182f>] ? bd_set_size+0x67/0x67
 [<c109bfd5>] do_vfs_ioctl+0x481/0x4b7
 [<c1090993>] ? fget_light+0x4c/0xd0
 [<c109c039>] sys_ioctl+0x2e/0x49
 [<c127bb50>] sysenter_do_call+0x12/0x36
 [<c1270000>] ? pcibios_scan_specific_bus+0x43/0x72
Code: 8d 65 f4 89 f0 5b 5e 5f c9 c3 55 89 e5 57 89 d7 56 ba 01 00 00 00 53 89 c6 83 ec 04 89 4d f0 b9 10 00 00 00 e8 d8 8c ff ff 89 c3 40 24 02 00 00 00 c7 80 c0 00 00 00 60 ea 00 00 89 d9 8b 80
EIP: [<c10f636a>] __blk_send_generic.clone.9+0x21/0x70 SS:ESP 0068:f1625d68
CR2: 0000000000000024
---[ end trace 41f5b857579a5ae9 ]---

-- 
Stefan Richter
-=====-==-== ==-- ==--=
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?)
  2011-12-25 20:58   ` Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?) Stefan Richter
@ 2011-12-27 10:21       ` Axel Theilmann
  2012-02-14 11:34     ` Jun'ichi Nomura
  1 sibling, 0 replies; 16+ messages in thread
From: Axel Theilmann @ 2011-12-27 10:21 UTC (permalink / raw)
  To: Stefan Richter; +Cc: linux-scsi, Huajun Li, jbottomley, linux-kernel

On 12/25/2011 09:58 PM, Stefan Richter wrote:

moin,

> as far as I remember, all Linux releases in 2011 have been broken WRT hot
> removal of block devices; some more severely, some less.  Various patches
> for this went in over the year, but if they fixed anything, they always
> uncovered the next lingering unplug related bug.  The presumed first Linux

so now there are 2 known NULL-pointer problems in the cd-rom code and one in
the scsi-disk code.

Would a complete fix for this issue be a question of locating all the
possible NULL-pointers and fixing them or do you think that the hotplug
problem has to be fixed on a more "fundamental" level?

Even if there is a more fundamental problem below that has to be fixed, it
would still be nice to get in fixes for the dereferences that are currently
known to keep peoples systems from crashing.

We built a kernel with Huajun's patch included and will do some tests to see
if the problem goes away (and no others show up).


> With a little bit of bad luck, udisks-daemon or in older distros hald
> should hit the bug too.  Under kernel 3.1 I typically just got processes
> hanging in unkillable sleep.  With kernel 3.2-rc7 I get an instant kernel
> panic.

Yes, udisks is what probably triggers the bug for us. People removing USB
media before udisks is finished initializing the medium. With kernel 3.1.4
we get instant kernel panics as well.


tty, axel



-- 
Dipl.-Inform. Axel Theilmann             theilmann@pre-sense.de
                               Tel.  (+49) 040  / 244 2407 - 13
                               Fax   (+49) 040  / 244 2407 - 24
                               Mobil (+49) 0151 / 116 18 664
PRESENSE Technologies GmbH         Sachsenstr. 5, 20097 Hamburg
                                         USt-IdNr.: DE263765024
Geschäftsführer/Managing Directors       AG Hamburg, HRB 107844
Till Dörges           Jürgen Sander              Axel Theilmann

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?)
@ 2011-12-27 10:21       ` Axel Theilmann
  0 siblings, 0 replies; 16+ messages in thread
From: Axel Theilmann @ 2011-12-27 10:21 UTC (permalink / raw)
  To: Stefan Richter; +Cc: linux-scsi, Huajun Li, jbottomley, linux-kernel

On 12/25/2011 09:58 PM, Stefan Richter wrote:

moin,

> as far as I remember, all Linux releases in 2011 have been broken WRT hot
> removal of block devices; some more severely, some less.  Various patches
> for this went in over the year, but if they fixed anything, they always
> uncovered the next lingering unplug related bug.  The presumed first Linux

so now there are 2 known NULL-pointer problems in the cd-rom code and one in
the scsi-disk code.

Would a complete fix for this issue be a question of locating all the
possible NULL-pointers and fixing them or do you think that the hotplug
problem has to be fixed on a more "fundamental" level?

Even if there is a more fundamental problem below that has to be fixed, it
would still be nice to get in fixes for the dereferences that are currently
known to keep peoples systems from crashing.

We built a kernel with Huajun's patch included and will do some tests to see
if the problem goes away (and no others show up).


> With a little bit of bad luck, udisks-daemon or in older distros hald
> should hit the bug too.  Under kernel 3.1 I typically just got processes
> hanging in unkillable sleep.  With kernel 3.2-rc7 I get an instant kernel
> panic.

Yes, udisks is what probably triggers the bug for us. People removing USB
media before udisks is finished initializing the medium. With kernel 3.1.4
we get instant kernel panics as well.


tty, axel



-- 
Dipl.-Inform. Axel Theilmann             theilmann@pre-sense.de
                               Tel.  (+49) 040  / 244 2407 - 13
                               Fax   (+49) 040  / 244 2407 - 24
                               Mobil (+49) 0151 / 116 18 664
PRESENSE Technologies GmbH         Sachsenstr. 5, 20097 Hamburg
                                         USt-IdNr.: DE263765024
Geschäftsführer/Managing Directors       AG Hamburg, HRB 107844
Till Dörges           Jürgen Sander              Axel Theilmann
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?)
  2011-12-27 10:21       ` Axel Theilmann
  (?)
@ 2011-12-27 13:40       ` Stefan Richter
  -1 siblings, 0 replies; 16+ messages in thread
From: Stefan Richter @ 2011-12-27 13:40 UTC (permalink / raw)
  To: Axel Theilmann; +Cc: linux-scsi, Huajun Li, jbottomley, linux-kernel

On Dec 27 Axel Theilmann wrote:
> On 12/25/2011 09:58 PM, Stefan Richter wrote:
[...
>>> 在 2011年12月15日 上午1:59,Axel Theilmann <theilmann@pre-sense.de> :
>>>> two weeks ago Huajun Li posted a patch for a kernel oops, subject
>>>> [PATCH] SCSI/sd: Fix NULL dereference in sd_revalidate_disk".
>>>> 
>>>> The patch was discussed but considered "clearly wrong". The bug shows
>>>> up for us in kernel 3.1.4 quite often when unplugging usb sticks and
>>>> it seems a few other people have the same problem:
>>>> 
>>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/859199
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=754518
>>>> https://bugzilla.novell.com/show_bug.cgi?id=722350
>>>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=649735
>>>> 
>>>> Can anyone of you maybe give me any status update on that bug?
...]
> > as far as I remember, all Linux releases in 2011 have been broken WRT hot
> > removal of block devices; some more severely, some less.  Various patches
> > for this went in over the year, but if they fixed anything, they always
> > uncovered the next lingering unplug related bug.  The presumed first Linux
> 
> so now there are 2 known NULL-pointer problems in the cd-rom code and one in
> the scsi-disk code.

The two CD-ROM related traces which I posted seem to indicate a bug
between block layer's and SCSI core lifetime managements, rather than in
the cd-rom code particularly.  When I get the time, I will try the
"1. open(), 2. remove device, 3. ioctl()" sequence on an sd_mod device
instead of an sr_mod one and see where this goes.

> Would a complete fix for this issue be a question of locating all the
> possible NULL-pointers and fixing them or do you think that the hotplug
> problem has to be fixed on a more "fundamental" level?

I don't know what my two traces tell us what particularly is broken and
where to attack the problem.

In case of the sd_revalidate_disk oops from the thread which I highjacked
(which refers to "[PATCH] SCSI/sd: Fix NULL dereference in
sd_revalidate_disk", http://thread.gmane.org/gmane.linux.scsi/71174), the
trouble is that nobody came up with an answer to James' question on how it
could happen in the first place that sd_revalidate_disk(disk) could be
called on a disk that leads to a NULL scsi_disk.  In turn, this presumably
means among else that the answer to my earlier question --- what prevents
the scsi_disk to go invalid slightly after that newly added NULL pointer
check --- cannot be answered yet either.

However, I do think that the pitiful state of block device unplugging
throughout circa a whole year indicates a fundamental problem indeed.  But
I am only familiar with one of the SCSI transport layer drivers, not with
the kernel layers above, so what do I know.

> Even if there is a more fundamental problem below that has to be fixed, it
> would still be nice to get in fixes for the dereferences that are currently
> known to keep peoples systems from crashing.
> 
> We built a kernel with Huajun's patch included and will do some tests to see
> if the problem goes away (and no others show up).

AFAIU it is not clear whether this patch actually prevents dereference of
an invalid sdkp or only makes it considerably more unlikely.

In either case, since there is apparently an underlying issue that this
patch does not address, it is a judgment call whether such a patch is
allowed into a kernel --- distributor kernel or mainline kernel.  If
somebody takes it, then at least a FIXME comment should be put there that
sd_revalidate_disk is supposed to rely on an always valid sdkp.

> > With a little bit of bad luck, udisks-daemon or in older distros hald
> > should hit the bug too.  Under kernel 3.1 I typically just got processes
> > hanging in unkillable sleep.  With kernel 3.2-rc7 I get an instant kernel
> > panic.
> 
> Yes, udisks is what probably triggers the bug for us. People removing USB
> media before udisks is finished initializing the medium. With kernel 3.1.4
> we get instant kernel panics as well.
> 
> tty, axel

Sounds like both "your" and "my" bug occur at the end of the sequence

    1. open(), 2. remove device, 3. ioctl() or whatever

though perhaps with the extra twist in your case that this has to happen
before the device bring-up was entirely finished...?  In my CD-ROM related
case the bug is not timing-sensitive at all; it always happens with above
sequence.
-- 
Stefan Richter
-=====-==-== ==-- ==-==
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?)
  2011-12-25 20:58   ` Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?) Stefan Richter
  2011-12-27 10:21       ` Axel Theilmann
@ 2012-02-14 11:34     ` Jun'ichi Nomura
  2012-02-14 11:54       ` Bart Van Assche
  2012-03-13 18:10       ` Bart Van Assche
  1 sibling, 2 replies; 16+ messages in thread
From: Jun'ichi Nomura @ 2012-02-14 11:34 UTC (permalink / raw)
  To: Stefan Richter, jbottomley
  Cc: linux-scsi, Huajun Li, Axel Theilmann, linux-kernel

On 12/26/11 05:58, Stefan Richter wrote:
> First I tested a FireWire drive and got the first log which is included
> below, instantly in two attempts.  Then I made two attempts with a USB
> CD-ROM which did not oops immediately at device removal but when I then
> hit the eject button in the still open grip.  This consistently produced
> the second log at the end of this post.
> 
> First test with 1394 CD-ROM:
> -----------------------------------------------------------------
>   - attach CD-ROM drive
> -----------------------------------------------------------------
> scsi4 : SBP-2 IEEE-1394
> firewire_sbp2 fw1.0: logged in to LUN 0000 (0 retries)
> scsi 4:0:0:0: CD-ROM		TEAC	 CD-W28E	  1.1A PQ: 0 ANSI: 0 CCS
> sr1: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray
> sr 4:0:0:0: Attached scsi CD-ROM sr1
> -----------------------------------------------------------------
>   - start grip
>   - detach CD-ROM drive
> -----------------------------------------------------------------
> sr 4:0:0:0: Attached scsi generic sg2 type 5
> scsi 4:0:0:0: killing request
> BUG: unable to handle kernel NULL pointer dereference at 000003f0
> IP: [<c11bc19f>] scsi_prep_state_check+0x6/0x68
> *pde = 00000000 
> Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> Modules linked in: firewire_sbp2 firewire_ohci firewire_core netconsole snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss nfs lockd sunrpc i2c_i801 applesmc sr_mod rtc sg input_polldev cdrom snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_pcm snd_timer snd sky2 snd_page_alloc
> 
> Pid: 2832, comm: grip Not tainted 3.2.0-rc7 #1 Apple Computer, Inc. Macmini1,1/Mac-F4208EC8
> EIP: 0060:[<c11bc19f>] EFLAGS: 00010046 CPU: 0
> EIP is at scsi_prep_state_check+0x6/0x68
> EAX: 00000000 EBX: f33f3f18 ECX: 00000000 EDX: f33f3f18
> ESI: f4815a68 EDI: 00000000 EBP: f160bc14 ESP: f160bc10
>  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> Process grip (pid: 2832, ti=f160a000 task=f5d48760 task.ti=f160a000)
> Stack:
>  f33f3f18 f160bc2c c11bc8b1 f160bc3c f33f3f18 f4815a68 f33f3f18 f160bc3c
>  c11bc9a5 f33f3f18 f4815a68 f160bc50 c10efad5 00000000 f33f3f18 f4815a68
>  f160bc78 c11bd09f f4815db0 f33f3f18 00000001 f33f3f18 f4815a68 f4815a68
> 
> Call Trace:
>  [<c11bc8b1>] scsi_setup_blk_pc_cmnd+0x12/0xe7
>  [<c11bc9a5>] scsi_prep_fn+0x1f/0x2e
>  [<c10efad5>] blk_peek_request+0x98/0x168
>  [<c11bd09f>] scsi_request_fn+0x23/0x3b5
>  [<c10ed9d6>] __blk_run_queue+0x14/0x16
>  [<c10f25d5>] blk_execute_rq_nowait+0x7d/0x98
>  [<c10f2697>] blk_execute_rq+0xa7/0xe8
>  [<c10f2530>] ? blk_rq_map_user+0x1b7/0x1b7
>  [<c10f8c81>] ? changed_ioprio+0x70/0x70
>  [<c10ed700>] ? elv_set_request+0x12/0x20
>  [<c10eeebd>] ? get_request+0x21e/0x2bb
>  [<c11bcad2>] scsi_execute+0xc4/0x10a
>  [<c11bcb6c>] scsi_execute_req+0x54/0x81
>  [<c11bcbea>] scsi_test_unit_ready+0x51/0xb2
>  [<f828248b>] sr_drive_status+0x33/0xd5 [sr_mod]
>  [<f81f7860>] cdrom_ioctl+0x6a9/0xb31 [cdrom]
>  [<c1279f36>] ? mutex_lock_nested+0x26c/0x2b0
>  [<c10231e5>] ? get_parent_ip+0xb/0x31
>  [<c1023287>] ? sub_preempt_count+0x7c/0x89
>  [<c1279f5f>] ? mutex_lock_nested+0x295/0x2b0
>  [<f82815f1>] ? sr_block_ioctl+0x2e/0x9a [sr_mod]
>  [<f8281612>] sr_block_ioctl+0x4f/0x9a [sr_mod]
>  [<f82815c3>] ? sr_block_check_events+0x13/0x13 [sr_mod]
>  [<c10f39ee>] __blkdev_driver_ioctl+0x22/0x2e
>  [<c10f42f5>] blkdev_ioctl+0x66d/0x68c
>  [<c104bf7e>] ? __lock_acquire+0x62e/0x14bb
>  [<c10b1861>] block_ioctl+0x32/0x3a
>  [<c10b1861>] ? block_ioctl+0x32/0x3a
>  [<c10b182f>] ? bd_set_size+0x67/0x67
>  [<c109bfd5>] do_vfs_ioctl+0x481/0x4b7
>  [<c1090993>] ? fget_light+0x4c/0xd0
>  [<c109c039>] sys_ioctl+0x2e/0x49
>  [<c127bb50>] sysenter_do_call+0x12/0x36

While scsi_device is propery refcounted object,
q->queuedata is set to NULL by scsi_remove_device() asynchronously.
So every reader of scsi_device's q->queuedata should always check it.

Though I don't have a machine to actually test it,
the following patch should remove such places.

Index: linux-3.3/drivers/scsi/scsi_lib.c
===================================================================
--- linux-3.3.orig/drivers/scsi/scsi_lib.c	2012-02-01 06:31:54.000000000 +0900
+++ linux-3.3/drivers/scsi/scsi_lib.c	2012-02-14 19:12:57.641216402 +0900
@@ -1214,10 +1214,8 @@ int scsi_prep_state_check(struct scsi_de
 }
 EXPORT_SYMBOL(scsi_prep_state_check);
 
-int scsi_prep_return(struct request_queue *q, struct request *req, int ret)
+int scsi_prep_return(struct request_queue *q, struct scsi_device *sdev, struct request *req, int ret)
 {
-	struct scsi_device *sdev = q->queuedata;
-
 	switch (ret) {
 	case BLKPREP_KILL:
 		req->errors = DID_NO_CONNECT << 16;
@@ -1251,9 +1249,11 @@ int scsi_prep_fn(struct request_queue *q
 	struct scsi_device *sdev = q->queuedata;
 	int ret = BLKPREP_KILL;
 
+	if (!sdev)
+		return ret;
 	if (req->cmd_type == REQ_TYPE_BLOCK_PC)
 		ret = scsi_setup_blk_pc_cmnd(sdev, req);
-	return scsi_prep_return(q, req, ret);
+	return scsi_prep_return(q, sdev, req, ret);
 }
 EXPORT_SYMBOL(scsi_prep_fn);
 
Index: linux-3.3/drivers/scsi/sd.c
===================================================================
--- linux-3.3.orig/drivers/scsi/sd.c	2012-02-13 13:02:14.000000000 +0900
+++ linux-3.3/drivers/scsi/sd.c	2012-02-14 19:15:18.224212107 +0900
@@ -653,6 +653,9 @@ static int sd_prep_fn(struct request_que
 	int ret, host_dif;
 	unsigned char protect;
 
+	if (!sdp)
+		return BLKPREP_KILL;
+
 	/*
 	 * Discard request come in as REQ_TYPE_FS but we turn them into
 	 * block PC requests to make life easier.
@@ -905,7 +908,7 @@ static int sd_prep_fn(struct request_que
 	 */
 	ret = BLKPREP_OK;
  out:
-	return scsi_prep_return(q, rq, ret);
+	return scsi_prep_return(q, sdp, rq, ret);
 }
 
 /**
Index: linux-3.3/drivers/scsi/sr.c
===================================================================
--- linux-3.3.orig/drivers/scsi/sr.c	2012-02-01 06:31:54.000000000 +0900
+++ linux-3.3/drivers/scsi/sr.c	2012-02-14 19:15:59.001211143 +0900
@@ -372,6 +372,9 @@ static int sr_prep_fn(struct request_que
 	struct scsi_device *sdp = q->queuedata;
 	int ret;
 
+	if (!sdp)
+		return BLKPREP_KILL;
+
 	if (rq->cmd_type == REQ_TYPE_BLOCK_PC) {
 		ret = scsi_setup_blk_pc_cmnd(sdp, rq);
 		goto out;
@@ -503,7 +506,7 @@ static int sr_prep_fn(struct request_que
 	 */
 	ret = BLKPREP_OK;
  out:
-	return scsi_prep_return(q, rq, ret);
+	return scsi_prep_return(q, sdp, rq, ret);
 }
 
 static int sr_block_open(struct block_device *bdev, fmode_t mode)
Index: linux-3.3/include/scsi/scsi_driver.h
===================================================================
--- linux-3.3.orig/include/scsi/scsi_driver.h	2012-02-01 06:31:54.000000000 +0900
+++ linux-3.3/include/scsi/scsi_driver.h	2012-02-14 19:16:47.478209843 +0900
@@ -31,7 +31,7 @@ extern int scsi_register_interface(struc
 int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req);
 int scsi_setup_fs_cmnd(struct scsi_device *sdev, struct request *req);
 int scsi_prep_state_check(struct scsi_device *sdev, struct request *req);
-int scsi_prep_return(struct request_queue *q, struct request *req, int ret);
+int scsi_prep_return(struct request_queue *q, struct scsi_device *sdev, struct request *req, int ret);
 int scsi_prep_fn(struct request_queue *, struct request *);
 
 #endif /* _SCSI_SCSI_DRIVER_H */

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?)
  2012-02-14 11:34     ` Jun'ichi Nomura
@ 2012-02-14 11:54       ` Bart Van Assche
  2012-02-14 13:38         ` Stefan Richter
  2012-03-13 18:10       ` Bart Van Assche
  1 sibling, 1 reply; 16+ messages in thread
From: Bart Van Assche @ 2012-02-14 11:54 UTC (permalink / raw)
  To: Jun'ichi Nomura
  Cc: Stefan Richter, jbottomley, linux-scsi, Huajun Li,
	Axel Theilmann, linux-kernel

On Tue, Feb 14, 2012 at 12:34 PM, Jun'ichi Nomura
<j-nomura@ce.jp.nec.com> wrote:
> While scsi_device is propery refcounted object,
> q->queuedata is set to NULL by scsi_remove_device() asynchronously.
> So every reader of scsi_device's q->queuedata should always check it.

As far as I can see this patch narrows the race window but doesn't fix
the race. At least sd_prep_fn() still reads queuedata and if I'm not
mistaken that read races with scsi_remove_device(). Has it been
considered to modify scsi_remove_device() and scsi_request_fn() such
that device removal is communicated from the former to the latter in
another way than by clearing queuedata ?

Bart.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?)
  2012-02-14 11:54       ` Bart Van Assche
@ 2012-02-14 13:38         ` Stefan Richter
  2012-02-14 19:11           ` Bart Van Assche
  0 siblings, 1 reply; 16+ messages in thread
From: Stefan Richter @ 2012-02-14 13:38 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jun'ichi Nomura, jbottomley, linux-scsi, Huajun Li,
	Axel Theilmann, linux-kernel

On Feb 14 Bart Van Assche wrote:
> On Tue, Feb 14, 2012 at 12:34 PM, Jun'ichi Nomura
> <j-nomura@ce.jp.nec.com> wrote:
> > While scsi_device is propery refcounted object,
> > q->queuedata is set to NULL by scsi_remove_device() asynchronously.
> > So every reader of scsi_device's q->queuedata should always check it.
> 
> As far as I can see this patch narrows the race window but doesn't fix
> the race. At least sd_prep_fn() still reads queuedata and if I'm not
> mistaken that read races with scsi_remove_device(). Has it been
> considered to modify scsi_remove_device() and scsi_request_fn() such
> that device removal is communicated from the former to the latter in
> another way than by clearing queuedata ?

Or asked differently, *what* is supposed to serialize the ->queuedata
accesses?

(If it is the BKL -- well, some bleeding edge kernel versions lack it,
sources say.)
-- 
Stefan Richter
-=====-===-- --=- -===-
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Re: Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?)
  2012-02-14 13:38         ` Stefan Richter
@ 2012-02-14 19:11           ` Bart Van Assche
  2012-02-15  2:26             ` Jun'ichi Nomura
  0 siblings, 1 reply; 16+ messages in thread
From: Bart Van Assche @ 2012-02-14 19:11 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Jun'ichi Nomura, jbottomley, linux-scsi, Huajun Li,
	Axel Theilmann, linux-kernel

On Feb 14 Stefan Richter wrote:
> Or asked differently, what is supposed to serialize the ->queuedata
> accesses?

How about avoiding to modify q->queuedata before scsi_free_queue() has been
invoked, e.g. as follows ? Note: the kfree() call in scsi_host_dev_release() probably
should be postponed until the last put on the queue has occurred.

>From 4d29b62c686a718d34d3b3f634306376b40dbdb6 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bvanassche@acm.org>
Date: Tue, 14 Feb 2012 13:52:21 +0100
Subject: [PATCH] scsi-queuedata-null-deref-fix

---
 drivers/scsi/hosts.c      |    4 ++--
 drivers/scsi/scsi_lib.c   |   13 +++----------
 drivers/scsi/scsi_sysfs.c |    3 ---
 3 files changed, 5 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 351dc0b..5f34620 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -296,9 +296,9 @@ static void scsi_host_dev_release(struct device *dev)
 		destroy_workqueue(shost->work_q);
 	q = shost->uspace_req_q;
 	if (q) {
-		kfree(q->queuedata);
-		q->queuedata = NULL;
+		void *qd = q->queuedata;
 		scsi_free_queue(q);
+		kfree(qd);
 	}
 
 	scsi_destroy_command_freelist(shost);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f85cfa6..7c40303 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1491,7 +1491,9 @@ static void scsi_request_fn(struct request_queue *q)
 	struct scsi_cmnd *cmd;
 	struct request *req;
 
-	if (!sdev) {
+	BUG_ON(!sdev);
+
+	if (test_bit(QUEUE_FLAG_DEAD, &q->queue_flags)) {
 		while ((req = blk_peek_request(q)) != NULL)
 			scsi_kill_request(req, q);
 		return;
@@ -1700,15 +1702,6 @@ struct request_queue *scsi_alloc_queue(struct scsi_device *sdev)
 
 void scsi_free_queue(struct request_queue *q)
 {
-	unsigned long flags;
-
-	WARN_ON(q->queuedata);
-
-	/* cause scsi_request_fn() to kill all non-finished requests */
-	spin_lock_irqsave(q->queue_lock, flags);
-	q->request_fn(q);
-	spin_unlock_irqrestore(q->queue_lock, flags);
-
 	blk_cleanup_queue(q);
 }
 
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index ea5658d..61dd107 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -976,9 +976,6 @@ void __scsi_remove_device(struct scsi_device *sdev)
 		sdev->host->hostt->slave_destroy(sdev);
 	transport_destroy_device(dev);
 
-	/* cause the request function to reject all I/O requests */
-	sdev->request_queue->queuedata = NULL;
-
 	/* Freeing the queue signals to block that we're done */
 	scsi_free_queue(sdev->request_queue);
 	put_device(dev);
-- 
1.7.7



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?)
  2012-02-14 19:11           ` Bart Van Assche
@ 2012-02-15  2:26             ` Jun'ichi Nomura
  2012-02-15 11:29               ` Bart Van Assche
  0 siblings, 1 reply; 16+ messages in thread
From: Jun'ichi Nomura @ 2012-02-15  2:26 UTC (permalink / raw)
  To: Bart Van Assche, Stefan Richter
  Cc: jbottomley, linux-scsi, Huajun Li, Axel Theilmann, linux-kernel

Hi,

Thank you for comments.

On 02/15/12 04:11, Bart Van Assche wrote:
> On Feb 14 Stefan Richter wrote:
>> Or asked differently, what is supposed to serialize the ->queuedata
>> accesses?

AFAIU the racy check is ok because 
sdev does not disappear as far as the device is opened.

The oopses in scsi_prep_fn happened just because they used queuedata
without checking.

> 
> How about avoiding to modify q->queuedata before scsi_free_queue() has been
> invoked, e.g. as follows ?

I think this patch is good, too.
Since QUEUE_FLAG_DEAD is also racy, it is not much different
from queuedata check, IMO.
Maybe you want additional blk_queue_dead() check in prep_fn
to keep the current behaviour.

> Note: the kfree() call in scsi_host_dev_release() probably
> should be postponed until the last put on the queue has occurred.
> 
>>From 4d29b62c686a718d34d3b3f634306376b40dbdb6 Mon Sep 17 00:00:00 2001
> From: Bart Van Assche <bvanassche@acm.org>
> Date: Tue, 14 Feb 2012 13:52:21 +0100
> Subject: [PATCH] scsi-queuedata-null-deref-fix
> 
> ---
>  drivers/scsi/hosts.c      |    4 ++--
>  drivers/scsi/scsi_lib.c   |   13 +++----------
>  drivers/scsi/scsi_sysfs.c |    3 ---
>  3 files changed, 5 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index 351dc0b..5f34620 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -296,9 +296,9 @@ static void scsi_host_dev_release(struct device *dev)
>  		destroy_workqueue(shost->work_q);
>  	q = shost->uspace_req_q;
>  	if (q) {
> -		kfree(q->queuedata);
> -		q->queuedata = NULL;
> +		void *qd = q->queuedata;
>  		scsi_free_queue(q);
> +		kfree(qd);
>  	}
>  
>  	scsi_destroy_command_freelist(shost);
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index f85cfa6..7c40303 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1491,7 +1491,9 @@ static void scsi_request_fn(struct request_queue *q)
>  	struct scsi_cmnd *cmd;
>  	struct request *req;
>  
> -	if (!sdev) {
> +	BUG_ON(!sdev);
> +
> +	if (test_bit(QUEUE_FLAG_DEAD, &q->queue_flags)) {
>  		while ((req = blk_peek_request(q)) != NULL)
>  			scsi_kill_request(req, q);
>  		return;
> @@ -1700,15 +1702,6 @@ struct request_queue *scsi_alloc_queue(struct scsi_device *sdev)
>  
>  void scsi_free_queue(struct request_queue *q)
>  {
> -	unsigned long flags;
> -
> -	WARN_ON(q->queuedata);
> -
> -	/* cause scsi_request_fn() to kill all non-finished requests */
> -	spin_lock_irqsave(q->queue_lock, flags);
> -	q->request_fn(q);
> -	spin_unlock_irqrestore(q->queue_lock, flags);
> -
>  	blk_cleanup_queue(q);
>  }
>  
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index ea5658d..61dd107 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -976,9 +976,6 @@ void __scsi_remove_device(struct scsi_device *sdev)
>  		sdev->host->hostt->slave_destroy(sdev);
>  	transport_destroy_device(dev);
>  
> -	/* cause the request function to reject all I/O requests */
> -	sdev->request_queue->queuedata = NULL;
> -
>  	/* Freeing the queue signals to block that we're done */
>  	scsi_free_queue(sdev->request_queue);
>  	put_device(dev);

-- 
Jun'ichi Nomura, NEC Corporation

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?)
  2012-02-15  2:26             ` Jun'ichi Nomura
@ 2012-02-15 11:29               ` Bart Van Assche
  2012-02-16  1:04                 ` Jun'ichi Nomura
  0 siblings, 1 reply; 16+ messages in thread
From: Bart Van Assche @ 2012-02-15 11:29 UTC (permalink / raw)
  To: Jun'ichi Nomura
  Cc: Stefan Richter, jbottomley, linux-scsi, Huajun Li,
	Axel Theilmann, linux-kernel

On Wed, Feb 15, 2012 at 3:26 AM, Jun'ichi Nomura <j-nomura@ce.jp.nec.com> wrote:
> I think this patch is good, too.
> Since QUEUE_FLAG_DEAD is also racy, it is not much different
> from queuedata check, IMO.

Are you sure ? As far as I know the block layer takes care of
synchronizing queue flag modifications against request_fn invocations.

Bart.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?)
  2012-02-15 11:29               ` Bart Van Assche
@ 2012-02-16  1:04                 ` Jun'ichi Nomura
  0 siblings, 0 replies; 16+ messages in thread
From: Jun'ichi Nomura @ 2012-02-16  1:04 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Stefan Richter, jbottomley, linux-scsi, Huajun Li,
	Axel Theilmann, linux-kernel

Hi,

On 02/15/12 20:29, Bart Van Assche wrote:
> On Wed, Feb 15, 2012 at 3:26 AM, Jun'ichi Nomura <j-nomura@ce.jp.nec.com> wrote:
>> I think this patch is good, too.
>> Since QUEUE_FLAG_DEAD is also racy, it is not much different
>> from queuedata check, IMO.
> 
> Are you sure ? As far as I know the block layer takes care of
> synchronizing queue flag modifications against request_fn invocations.

QUEUE_FLAG_DEAD is set outside of queue_lock. So it's racy.
However, it's single directional change and the race is benign
if the driver, who sets QUEUE_FLAG_DEAD, is ready to reject
requests.
q->queuedata is same with that regard.

-- 
Jun'ichi Nomura, NEC Corporation

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?)
  2012-02-14 11:34     ` Jun'ichi Nomura
  2012-02-14 11:54       ` Bart Van Assche
@ 2012-03-13 18:10       ` Bart Van Assche
  2012-03-16  8:58         ` Jun'ichi Nomura
  1 sibling, 1 reply; 16+ messages in thread
From: Bart Van Assche @ 2012-03-13 18:10 UTC (permalink / raw)
  To: Jun'ichi Nomura
  Cc: Stefan Richter, jbottomley, linux-scsi, Huajun Li,
	Axel Theilmann, linux-kernel

On 02/14/12 11:34, Jun'ichi Nomura wrote:
> While scsi_device is propery refcounted object,
> q->queuedata is set to NULL by scsi_remove_device() asynchronously.
> So every reader of scsi_device's q->queuedata should always check it.
>
> Though I don't have a machine to actually test it,
> the following patch should remove such places.
>
> Index: linux-3.3/drivers/scsi/scsi_lib.c
> ===================================================================
> --- linux-3.3.orig/drivers/scsi/scsi_lib.c	2012-02-01 06:31:54.000000000 +0900
> +++ linux-3.3/drivers/scsi/scsi_lib.c	2012-02-14 19:12:57.641216402 +0900
> @@ -1214,10 +1214,8 @@ int scsi_prep_state_check(struct scsi_de
>  }
>  EXPORT_SYMBOL(scsi_prep_state_check);
>  
> -int scsi_prep_return(struct request_queue *q, struct request *req, int ret)
> +int scsi_prep_return(struct request_queue *q, struct scsi_device *sdev, struct request *req, int ret)
>  {
> -	struct scsi_device *sdev = q->queuedata;
> -
>  	switch (ret) {
>  	case BLKPREP_KILL:
>  		req->errors = DID_NO_CONNECT << 16;
> @@ -1251,9 +1249,11 @@ int scsi_prep_fn(struct request_queue *q
>  	struct scsi_device *sdev = q->queuedata;
>  	int ret = BLKPREP_KILL;
>  
> +	if (!sdev)
> +		return ret;
>  	if (req->cmd_type == REQ_TYPE_BLOCK_PC)
>  		ret = scsi_setup_blk_pc_cmnd(sdev, req);
> -	return scsi_prep_return(q, req, ret);
> +	return scsi_prep_return(q, sdev, req, ret);
>  }
>  EXPORT_SYMBOL(scsi_prep_fn);
>  
> Index: linux-3.3/drivers/scsi/sd.c
> ===================================================================
> --- linux-3.3.orig/drivers/scsi/sd.c	2012-02-13 13:02:14.000000000 +0900
> +++ linux-3.3/drivers/scsi/sd.c	2012-02-14 19:15:18.224212107 +0900
> @@ -653,6 +653,9 @@ static int sd_prep_fn(struct request_que
>  	int ret, host_dif;
>  	unsigned char protect;
>  
> +	if (!sdp)
> +		return BLKPREP_KILL;
> +
>  	/*
>  	 * Discard request come in as REQ_TYPE_FS but we turn them into
>  	 * block PC requests to make life easier.
> @@ -905,7 +908,7 @@ static int sd_prep_fn(struct request_que
>  	 */
>  	ret = BLKPREP_OK;
>   out:
> -	return scsi_prep_return(q, rq, ret);
> +	return scsi_prep_return(q, sdp, rq, ret);
>  }
>  
>  /**
> Index: linux-3.3/drivers/scsi/sr.c
> ===================================================================
> --- linux-3.3.orig/drivers/scsi/sr.c	2012-02-01 06:31:54.000000000 +0900
> +++ linux-3.3/drivers/scsi/sr.c	2012-02-14 19:15:59.001211143 +0900
> @@ -372,6 +372,9 @@ static int sr_prep_fn(struct request_que
>  	struct scsi_device *sdp = q->queuedata;
>  	int ret;
>  
> +	if (!sdp)
> +		return BLKPREP_KILL;
> +
>  	if (rq->cmd_type == REQ_TYPE_BLOCK_PC) {
>  		ret = scsi_setup_blk_pc_cmnd(sdp, rq);
>  		goto out;
> @@ -503,7 +506,7 @@ static int sr_prep_fn(struct request_que
>  	 */
>  	ret = BLKPREP_OK;
>   out:
> -	return scsi_prep_return(q, rq, ret);
> +	return scsi_prep_return(q, sdp, rq, ret);
>  }
>  
>  static int sr_block_open(struct block_device *bdev, fmode_t mode)
> Index: linux-3.3/include/scsi/scsi_driver.h
> ===================================================================
> --- linux-3.3.orig/include/scsi/scsi_driver.h	2012-02-01 06:31:54.000000000 +0900
> +++ linux-3.3/include/scsi/scsi_driver.h	2012-02-14 19:16:47.478209843 +0900
> @@ -31,7 +31,7 @@ extern int scsi_register_interface(struc
>  int scsi_setup_blk_pc_cmnd(struct scsi_device *sdev, struct request *req);
>  int scsi_setup_fs_cmnd(struct scsi_device *sdev, struct request *req);
>  int scsi_prep_state_check(struct scsi_device *sdev, struct request *req);
> -int scsi_prep_return(struct request_queue *q, struct request *req, int ret);
> +int scsi_prep_return(struct request_queue *q, struct scsi_device *sdev, struct request *req, int ret);
>  int scsi_prep_fn(struct request_queue *, struct request *);
>  
>  #endif /* _SCSI_SCSI_DRIVER_H */

Now that I've had some more time to think about this: has anyone
considered to hold a reference on the SCSI host instead of the SCSI
device as long as sd_probe_async() is active ? If sd_prep_fn() can ever
see a NULL queuedata pointer then that means that
scsi_host_dev_release() can get invoked while sd_prep_fn() is running.
That doesn't look correct to me.

Bart.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?)
  2012-03-13 18:10       ` Bart Van Assche
@ 2012-03-16  8:58         ` Jun'ichi Nomura
  2012-03-16 18:53           ` Bart Van Assche
  0 siblings, 1 reply; 16+ messages in thread
From: Jun'ichi Nomura @ 2012-03-16  8:58 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Stefan Richter, jbottomley, linux-scsi, Huajun Li,
	Axel Theilmann, linux-kernel

Hi,

On 03/14/12 03:10, Bart Van Assche wrote:
> Now that I've had some more time to think about this: has anyone
> considered to hold a reference on the SCSI host instead of the SCSI
> device as long as sd_probe_async() is active ? If sd_prep_fn() can ever
> see a NULL queuedata pointer then that means that
> scsi_host_dev_release() can get invoked while sd_prep_fn() is running.

Holding a host reference does not help, I think.
It does not stop __scsi_remove_device() setting NULL
to sdev's q->queuedata.

So, while there might be another race between sd_probe_async
and scsi_host_remove, I believe your "[PATCH] Fix device
removal NULL pointer dereference" still makes sense.

> That doesn't look correct to me.

-- 
Jun'ichi Nomura, NEC Corporation

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?)
  2012-03-16  8:58         ` Jun'ichi Nomura
@ 2012-03-16 18:53           ` Bart Van Assche
  0 siblings, 0 replies; 16+ messages in thread
From: Bart Van Assche @ 2012-03-16 18:53 UTC (permalink / raw)
  To: Jun'ichi Nomura
  Cc: Stefan Richter, jbottomley, linux-scsi, Huajun Li,
	Axel Theilmann, linux-kernel

On 03/16/12 08:58, Jun'ichi Nomura wrote:
> So, while there might be another race between sd_probe_async
> and scsi_host_remove, I believe your "[PATCH] Fix device
> removal NULL pointer dereference" still makes sense.

You are right, I have been too harsh for my own patch. I'll resend it.

Bart.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2012-03-16 18:54 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-12-14 17:59 status of oops in sd_revalidate_disk? Axel Theilmann
2011-12-21 14:12 ` Huajun Li
2011-12-25 20:58   ` Yet another hot unplug NULL pointer dereference (was Re: status of oops in sd_revalidate_disk?) Stefan Richter
2011-12-27 10:21     ` Axel Theilmann
2011-12-27 10:21       ` Axel Theilmann
2011-12-27 13:40       ` Stefan Richter
2012-02-14 11:34     ` Jun'ichi Nomura
2012-02-14 11:54       ` Bart Van Assche
2012-02-14 13:38         ` Stefan Richter
2012-02-14 19:11           ` Bart Van Assche
2012-02-15  2:26             ` Jun'ichi Nomura
2012-02-15 11:29               ` Bart Van Assche
2012-02-16  1:04                 ` Jun'ichi Nomura
2012-03-13 18:10       ` Bart Van Assche
2012-03-16  8:58         ` Jun'ichi Nomura
2012-03-16 18:53           ` Bart Van Assche

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.