* Kernel crashing on eject SD card @ 2012-02-08 0:19 Naveen Goswamy 2012-02-12 21:08 ` Stefan Richter 0 siblings, 1 reply; 17+ messages in thread From: Naveen Goswamy @ 2012-02-08 0:19 UTC (permalink / raw) To: linux-kernel The details are here: https://bugs.gentoo.org/show_bug.cgi?id=402433 Any ideas? Is this a known issue? Thanks, Naveen ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-02-08 0:19 Kernel crashing on eject SD card Naveen Goswamy @ 2012-02-12 21:08 ` Stefan Richter 2012-02-12 21:20 ` Stefan Richter 0 siblings, 1 reply; 17+ messages in thread From: Stefan Richter @ 2012-02-12 21:08 UTC (permalink / raw) To: Naveen Goswamy; +Cc: linux-kernel, linux-scsi On Feb 07 Naveen Goswamy wrote: > The details are here: > > > https://bugs.gentoo.org/show_bug.cgi?id=402433 > > Any ideas? Is this a known issue? It has been reported repeatedly, AFAICT without any progress so far. http://marc.info/?l=linux-scsi&m=132388619710052 It's the old story; an udev helper opens a block device while it is being torn down apparently. Block subsystem soils itself. Here is the kernel log from Naveen's report at the Gentoo bug tracker, for the convenience of the list subscribers: Feb 6 10:51:09 speedy kernel: ieee80211 phy0: wl0: brcms_c_d11hdrs_mac80211: txop exceeded phylen 130/256 dur 1546/1504 Feb 6 10:54:39 speedy kernel: usb 1-1.6: new high-speed USB device number 4 using ehci_hcd Feb 6 10:54:39 speedy kernel: usb 1-1.6: New USB device found, idVendor=0bda, idProduct=0159 Feb 6 10:54:39 speedy kernel: usb 1-1.6: New USB device strings: Mfr=1, Product=2, SerialNumber=3 Feb 6 10:54:39 speedy kernel: usb 1-1.6: Product: USB2.0-CRW Feb 6 10:54:39 speedy kernel: usb 1-1.6: Manufacturer: Generic Feb 6 10:54:39 speedy kernel: usb 1-1.6: SerialNumber: 20071114173400000 Feb 6 10:54:39 speedy kernel: scsi7 : usb-storage 1-1.6:1.0 Feb 6 10:54:40 speedy kernel: scsi 7:0:0:0: Direct-Access Generic- Multi-Card 1.00 PQ: 0 ANSI: 0 CCS Feb 6 10:54:40 speedy kernel: sd 7:0:0:0: Attached scsi generic sg2 type 0 Feb 6 10:54:41 speedy kernel: sd 7:0:0:0: [sdb] 3862528 512-byte logical blocks: (1.97 GB/1.84 GiB) Feb 6 10:54:41 speedy kernel: sd 7:0:0:0: [sdb] Write Protect is off Feb 6 10:54:41 speedy kernel: sd 7:0:0:0: [sdb] Mode Sense: 03 00 00 00 Feb 6 10:54:41 speedy kernel: sd 7:0:0:0: [sdb] No Caching mode page present Feb 6 10:54:41 speedy kernel: sd 7:0:0:0: [sdb] Assuming drive cache: write through Feb 6 10:54:41 speedy kernel: sd 7:0:0:0: [sdb] No Caching mode page present Feb 6 10:54:41 speedy kernel: sd 7:0:0:0: [sdb] Assuming drive cache: write through Feb 6 10:54:41 speedy kernel: sdb: sdb1 Feb 6 10:54:41 speedy kernel: sd 7:0:0:0: [sdb] No Caching mode page present Feb 6 10:54:41 speedy kernel: sd 7:0:0:0: [sdb] Assuming drive cache: write through Feb 6 10:54:41 speedy kernel: sd 7:0:0:0: [sdb] Attached SCSI removable disk Feb 6 10:58:08 speedy kernel: usb 1-1.6: USB disconnect, device number 4 Feb 6 10:58:08 speedy kernel: scsi 7:0:0:0: killing request Feb 6 10:58:08 speedy kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 Feb 6 10:58:08 speedy kernel: IP: [<ffffffff8135b1c7>] sd_revalidate_disk+0x1a/0x16ee Feb 6 10:58:08 speedy kernel: PGD 2209b0067 PUD 2209b1067 PMD 0 Feb 6 10:58:08 speedy kernel: Oops: 0000 [#1] SMP Feb 6 10:58:08 speedy kernel: CPU 3 Feb 6 10:58:08 speedy kernel: Modules linked in: aes_x86_64 aes_generic ipt_REJECT iptable_mangle iptable_nat nf_nat iptable_filter ip_tables ipv6 dm_mod vboxnetadp(O) vboxnetflt(O) vboxdrv(O) uvcvideo videodev v4l2_compat_ioctl32 usb_storage arc4 brcmsmac snd_hda_codec_hdmi snd_hda_codec_idt mac80211 snd_hda_intel brcmutil snd_hda_codec dell_wmi ehci_hcd r8169 usbcore snd_pcm cfg80211 snd_timer dcdbas firmware_class snd usb_common sparse_keymap soundcore rtc rfkill sg wmi snd_page_alloc crc8 cordic Feb 6 10:58:08 speedy kernel: Feb 6 10:58:08 speedy kernel: Pid: 2434, comm: udisks-daemon Tainted: G O 3.2.1-gentoo-r2_MINE_V01 #1 Dell Inc. Vostro 3400/07MJFM Feb 6 10:58:08 speedy kernel: RIP: 0010:[<ffffffff8135b1c7>] [<ffffffff8135b1c7>] sd_revalidate_disk+0x1a/0x16ee Feb 6 10:58:08 speedy kernel: RSP: 0018:ffff8802209cfb08 EFLAGS: 00010292 Feb 6 10:58:08 speedy kernel: RAX: ffffffff8135b1ad RBX: 0000000000000000 RCX: 0000000000000002 Feb 6 10:58:08 speedy kernel: RDX: 0000000000000002 RSI: 0000000800000000 RDI: ffff880231668800 Feb 6 10:58:08 speedy kernel: RBP: ffff880231668800 R08: ffff8802315447a0 R09: ffffffff81852e48 Feb 6 10:58:08 speedy kernel: R10: 0000000000000000 R11: 0000000000011e00 R12: ffff880231668800 Feb 6 10:58:08 speedy kernel: R13: ffff88021746acd8 R14: 0000000000000000 R15: ffff88021746acc0 Feb 6 10:58:08 speedy kernel: FS: 00007fc6dd3ab700(0000) GS:ffff88023bd80000(0000) knlGS:0000000000000000 Feb 6 10:58:08 speedy kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Feb 6 10:58:08 speedy kernel: CR2: 0000000000000008 CR3: 00000002209af000 CR4: 00000000000006e0 Feb 6 10:58:08 speedy kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Feb 6 10:58:08 speedy kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Feb 6 10:58:08 speedy kernel: Process udisks-daemon (pid: 2434, threadinfo ffff8802209ce000, task ffff88023aac4110) Feb 6 10:58:08 speedy kernel: Stack: Feb 6 10:58:08 speedy kernel: ffffffff8103461a ffff880231668848 0000000000000000 ffff880231668800 Feb 6 10:58:08 speedy kernel: ffff88021746acd8 000000000000001d ffff88021746acc0 ffffffff810a0ec9 Feb 6 10:58:08 speedy kernel: ffff88021746acc0 ffff880231668800 0000000000000000 ffff88021746ad98 Feb 6 10:58:08 speedy kernel: Call Trace: Feb 6 10:58:08 speedy kernel: [<ffffffff8103461a>] ? try_to_wake_up+0x200/0x200 Feb 6 10:58:08 speedy kernel: [<ffffffff810a0ec9>] ? get_super+0x1a/0x95 Feb 6 10:58:08 speedy kernel: [<ffffffff810b23d8>] ? iput+0x2b/0x17e Feb 6 10:58:08 speedy kernel: [<ffffffff810eb3ce>] ? rescan_partitions+0xac/0x446 Feb 6 10:58:08 speedy kernel: [<ffffffff810c5410>] ? __blkdev_get+0x162/0x33f Feb 6 10:58:08 speedy kernel: [<ffffffff810c588b>] ? blkdev_get+0x29e/0x29e Feb 6 10:58:08 speedy kernel: [<ffffffff810c57ad>] ? blkdev_get+0x1c0/0x29e Feb 6 10:58:08 speedy kernel: [<ffffffff810c588b>] ? blkdev_get+0x29e/0x29e Feb 6 10:58:08 speedy kernel: [<ffffffff8109e03b>] ? __dentry_open.clone.14+0x16b/0x294 Feb 6 10:58:08 speedy kernel: [<ffffffff810aaacb>] ? do_last.clone.34+0x64e/0x662 Feb 6 10:58:08 speedy kernel: [<ffffffff810aabe1>] ? path_openat+0xcb/0x354 Feb 6 10:58:08 speedy kernel: [<ffffffff8133e3fc>] ? scsi_set_medium_removal+0x46/0x6b Feb 6 10:58:08 speedy kernel: [<ffffffff8102c3b7>] ? ttwu_do_wakeup+0x11/0x86 Feb 6 10:58:08 speedy kernel: [<ffffffff810aaf45>] ? do_filp_open+0x2c/0x72 Feb 6 10:58:08 speedy kernel: [<ffffffff810b3fde>] ? alloc_fd+0x69/0x10f Feb 6 10:58:08 speedy kernel: [<ffffffff8109ed2e>] ? do_sys_open+0x101/0x18f Feb 6 10:58:08 speedy kernel: [<ffffffff81482b52>] ? system_call_fastpath+0x16/0x1b Feb 6 10:58:08 speedy kernel: Code: ff ff 48 83 c4 68 5b 5d 41 5c 41 5d 41 5e 41 5f c3 41 57 41 56 41 55 41 54 55 53 48 83 ec 78 48 8b 9f 50 02 00 00 48 89 7c 24 48 <48> 8b 43 08 48 89 44 24 28 8b 05 1a e2 7e 00 c1 e8 15 83 e0 07 Feb 6 10:58:08 speedy kernel: RIP [<ffffffff8135b1c7>] sd_revalidate_disk+0x1a/0x16ee Feb 6 10:58:08 speedy kernel: RSP <ffff8802209cfb08> Feb 6 10:58:08 speedy kernel: CR2: 0000000000000008 Feb 6 10:58:09 speedy kernel: ---[ end trace 2cb4da56c38cb030 ]--- -- Stefan Richter -=====-===-- --=- -==-- http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-02-12 21:08 ` Stefan Richter @ 2012-02-12 21:20 ` Stefan Richter 2012-02-13 1:46 ` Naveen Goswamy 2012-02-13 2:18 ` Dave Jones 0 siblings, 2 replies; 17+ messages in thread From: Stefan Richter @ 2012-02-12 21:20 UTC (permalink / raw) To: Stefan Richter; +Cc: Naveen Goswamy, linux-kernel, linux-scsi On Feb 12 Stefan Richter wrote: > Modules linked in: [...] vboxnetadp(O) vboxnetflt(O) vboxdrv(O) [...] Oh, could you try without virtualbox? The debian bug report hints that kernel 3.2 /without virtualbox drivers/ seems to behave itself. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=649735 -- Stefan Richter -=====-===-- --=- -==-- http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-02-12 21:20 ` Stefan Richter @ 2012-02-13 1:46 ` Naveen Goswamy 2012-02-13 2:18 ` Dave Jones 1 sibling, 0 replies; 17+ messages in thread From: Naveen Goswamy @ 2012-02-13 1:46 UTC (permalink / raw) To: Stefan Richter; +Cc: linux-kernel, linux-scsi I confirm that it behaves properly when virtualbox drivers are not loaded. Cheers, Naveen Quoting Stefan Richter <stefanr@s5r6.in-berlin.de>: > On Feb 12 Stefan Richter wrote: > > Modules linked in: [...] vboxnetadp(O) vboxnetflt(O) vboxdrv(O) [...] > > Oh, could you try without virtualbox? > > The debian bug report hints that kernel 3.2 /without virtualbox drivers/ > seems to behave itself. > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=649735 > -- > Stefan Richter > -=====-===-- --=- -==-- > http://arcgraph.de/sr/ > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-02-12 21:20 ` Stefan Richter 2012-02-13 1:46 ` Naveen Goswamy @ 2012-02-13 2:18 ` Dave Jones 2012-02-13 17:40 ` Naveen Goswamy 1 sibling, 1 reply; 17+ messages in thread From: Dave Jones @ 2012-02-13 2:18 UTC (permalink / raw) To: Stefan Richter; +Cc: Naveen Goswamy, linux-kernel, linux-scsi On Sun, Feb 12, 2012 at 10:20:27PM +0100, Stefan Richter wrote: > On Feb 12 Stefan Richter wrote: > > Modules linked in: [...] vboxnetadp(O) vboxnetflt(O) vboxdrv(O) [...] > > Oh, could you try without virtualbox? > > The debian bug report hints that kernel 3.2 /without virtualbox drivers/ > seems to behave itself. > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=649735 We've seen this a bunch of times in Fedora too. Here's a report we've been duping similar bugs against https://bugzilla.redhat.com/show_bug.cgi?id=754518 Some of them are using vbox/vmware, but there's a few in there that haven't used either, so I think that might be a red herring. Dave ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-02-13 2:18 ` Dave Jones @ 2012-02-13 17:40 ` Naveen Goswamy 2012-02-14 11:14 ` Jun'ichi Nomura 0 siblings, 1 reply; 17+ messages in thread From: Naveen Goswamy @ 2012-02-13 17:40 UTC (permalink / raw) To: Dave Jones; +Cc: Stefan Richter, linux-kernel, linux-scsi Quoting Dave Jones <davej@redhat.com>: > Some of them are using vbox/vmware, but there's a few in there that > haven't used either, so I think that might be a red herring. You are correct Dave. There was a red-herring. I managed to experience the same crash and burn again today, without vbox drivers. Here are the logs. Feb 13 08:50:53 speedy kernel: scsi 6:0:0:0: killing request Feb 13 08:50:53 speedy kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 Feb 13 08:50:53 speedy kernel: IP: [<ffffffff8135b798>] sd_revalidate_disk+0x1a/0x16ee Feb 13 08:50:53 speedy kernel: PGD 223493067 PUD 2234de067 PMD 0 Feb 13 08:50:53 speedy kernel: Oops: 0000 [#1] SMP Feb 13 08:50:53 speedy kernel: CPU 2 Feb 13 08:50:53 speedy kernel: Modules linked in: aes_x86_64 aes_generic ipt_REJECT iptable_mangle iptable_nat nf_nat iptable_filter ip_tables ipv6 dm_mod uvcvideo videodev v4l2_compat_ioctl32 usb_storage arc4 brcmsmac snd_hda_codec_hdmi snd_hda_codec_idt mac80211 brcmutil snd_hda_intel snd_hda_codec cfg80211 r8169 rfkill snd_pcm snd_timer dell_wmi snd sparse_keymap ehci_hcd wmi firmware_class dcdbas crc8 soundcore rtc usbcore snd_page_alloc sg cordic usb_common Feb 13 08:50:53 speedy kernel: Feb 13 08:50:53 speedy kernel: Pid: 2721, comm: udisks-daemon Not tainted 3.2.5-gentoo_MINE_V00 #1 Dell Inc. Vostro 3400/07MJFM Feb 13 08:50:53 speedy kernel: RIP: 0010:[<ffffffff8135b798>] [<ffffffff8135b798>] sd_revalidate_disk+0x1a/0x16ee Feb 13 08:50:53 speedy kernel: RSP: 0018:ffff8802234ddb08 EFLAGS: 00010292 Feb 13 08:50:53 speedy kernel: RAX: ffffffff8135b77e RBX: 0000000000000000 RCX: 0000000000000002 Feb 13 08:50:53 speedy kernel: RDX: 0000000000000002 RSI: 0000000800000000 RDI: ffff880231599000 Feb 13 08:50:53 speedy kernel: RBP: ffff880231599000 R08: ffff88023ab4f9a0 R09: ffffffff81852ec8 Feb 13 08:50:53 speedy kernel: R10: 0000000000000002 R11: 0000000000011e00 R12: ffff880231599000 Feb 13 08:50:53 speedy kernel: R13: ffff880232322698 R14: 0000000000000000 R15: ffff880232322680 Feb 13 08:50:53 speedy kernel: FS: 00007f7666c6b700(0000) GS:ffff88023bd00000(0000) knlGS:0000000000000000 Feb 13 08:50:53 speedy kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Feb 13 08:50:53 speedy kernel: CR2: 0000000000000008 CR3: 0000000223492000 CR4: 00000000000006e0 Feb 13 08:50:53 speedy kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Feb 13 08:50:53 speedy kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Feb 13 08:50:53 speedy kernel: Process udisks-daemon (pid: 2721, threadinfo ffff8802234dc000, task ffff880230f76920) Feb 13 08:50:53 speedy kernel: Stack: Feb 13 08:50:53 speedy kernel: ffffffff8103468a ffff880231599048 0000000000000000 ffff880231599000 Feb 13 08:50:53 speedy kernel: ffff880232322698 000000000000001d ffff880232322680 ffffffff810a0f35 Feb 13 08:50:53 speedy kernel: ffff880232322680 ffff880231599000 0000000000000000 ffff880232322758 Feb 13 08:50:53 speedy kernel: Call Trace: Feb 13 08:50:53 speedy kernel: [<ffffffff8103468a>] ? try_to_wake_up+0x200/0x200 Feb 13 08:50:53 speedy kernel: [<ffffffff810a0f35>] ? get_super+0x1a/0x95 Feb 13 08:50:53 speedy kernel: [<ffffffff810b2460>] ? iput+0x2b/0x17e Feb 13 08:50:53 speedy kernel: [<ffffffff810eb4b6>] ? rescan_partitions+0xac/0x446 Feb 13 08:50:53 speedy kernel: [<ffffffff810c5498>] ? __blkdev_get+0x162/0x33f Feb 13 08:50:53 speedy kernel: [<ffffffff810c5913>] ? blkdev_get+0x29e/0x29e Feb 13 08:50:53 speedy kernel: [<ffffffff810c5835>] ? blkdev_get+0x1c0/0x29e Feb 13 08:50:53 speedy kernel: [<ffffffff810c5913>] ? blkdev_get+0x29e/0x29e Feb 13 08:50:53 speedy kernel: [<ffffffff8109e0a7>] ? __dentry_open.clone.14+0x16b/0x294 Feb 13 08:50:53 speedy kernel: [<ffffffff810aab37>] ? do_last.clone.34+0x64e/0x662 Feb 13 08:50:53 speedy kernel: [<ffffffff810aac4d>] ? path_openat+0xcb/0x354 Feb 13 08:50:53 speedy kernel: [<ffffffff8133e9b0>] ? scsi_set_medium_removal+0x46/0x6b Feb 13 08:50:53 speedy kernel: [<ffffffff810aafb1>] ? do_filp_open+0x2c/0x72 Feb 13 08:50:53 speedy kernel: [<ffffffff810b4066>] ? alloc_fd+0x69/0x10f Feb 13 08:50:53 speedy kernel: [<ffffffff8109ed9a>] ? do_sys_open+0x101/0x18f Feb 13 08:50:53 speedy kernel: [<ffffffff81483292>] ? system_call_fastpath+0x16/0x1b Feb 13 08:50:53 speedy kernel: Code: ff ff 48 83 c4 68 5b 5d 41 5c 41 5d 41 5e 41 5f c3 41 57 41 56 41 55 41 54 55 53 48 83 ec 78 48 8b 9f 50 02 00 00 48 89 7c 24 48 <48> 8b 43 08 48 89 44 24 28 8b 05 49 dc 7e 00 c1 e8 15 83 e0 07 Feb 13 08:50:53 speedy kernel: RIP [<ffffffff8135b798>] sd_revalidate_disk+0x1a/0x16ee Feb 13 08:50:53 speedy kernel: RSP <ffff8802234ddb08> Feb 13 08:50:53 speedy kernel: CR2: 0000000000000008 Feb 13 08:50:53 speedy kernel: ---[ end trace 0370d79d444e26e5 ]--- ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-02-13 17:40 ` Naveen Goswamy @ 2012-02-14 11:14 ` Jun'ichi Nomura 2012-02-14 13:31 ` Stefan Richter 2012-02-14 16:28 ` Tejun Heo 0 siblings, 2 replies; 17+ messages in thread From: Jun'ichi Nomura @ 2012-02-14 11:14 UTC (permalink / raw) To: Naveen Goswamy, Jens Axboe, Tejun Heo, James Bottomley Cc: Stefan Richter, Dave Jones, linux-kernel, linux-scsi On 02/14/12 02:40, Naveen Goswamy wrote: > Feb 13 08:50:53 speedy kernel: scsi 6:0:0:0: killing request > Feb 13 08:50:53 speedy kernel: BUG: unable to handle kernel NULL pointer > dereference at 0000000000000008 > Feb 13 08:50:53 speedy kernel: IP: [<ffffffff8135b798>] > sd_revalidate_disk+0x1a/0x16ee > Feb 13 08:50:53 speedy kernel: PGD 223493067 PUD 2234de067 PMD 0 > Feb 13 08:50:53 speedy kernel: Oops: 0000 [#1] SMP > Feb 13 08:50:53 speedy kernel: CPU 2 > Feb 13 08:50:53 speedy kernel: Modules linked in: aes_x86_64 aes_generic > ipt_REJECT iptable_mangle iptable_nat nf_nat iptable_filter ip_tables ipv6 > dm_mod uvcvideo videodev v4l2_compat_ioctl32 usb_storage arc4 brcmsmac > snd_hda_codec_hdmi snd_hda_codec_idt mac80211 brcmutil snd_hda_intel > snd_hda_codec cfg80211 r8169 rfkill snd_pcm snd_timer dell_wmi snd > sparse_keymap ehci_hcd wmi firmware_class dcdbas crc8 soundcore rtc usbcore > snd_page_alloc sg cordic usb_common > Feb 13 08:50:53 speedy kernel: > Feb 13 08:50:53 speedy kernel: Pid: 2721, comm: udisks-daemon Not tainted > 3.2.5-gentoo_MINE_V00 #1 Dell Inc. Vostro 3400/07MJFM > Feb 13 08:50:53 speedy kernel: RIP: 0010:[<ffffffff8135b798>] > [<ffffffff8135b798>] sd_revalidate_disk+0x1a/0x16ee > Feb 13 08:50:53 speedy kernel: RSP: 0018:ffff8802234ddb08 EFLAGS: 00010292 > Feb 13 08:50:53 speedy kernel: RAX: ffffffff8135b77e RBX: 0000000000000000 RCX: > 0000000000000002 > Feb 13 08:50:53 speedy kernel: RDX: 0000000000000002 RSI: 0000000800000000 RDI: > ffff880231599000 > Feb 13 08:50:53 speedy kernel: RBP: ffff880231599000 R08: ffff88023ab4f9a0 R09: > ffffffff81852ec8 > Feb 13 08:50:53 speedy kernel: R10: 0000000000000002 R11: 0000000000011e00 R12: > ffff880231599000 > Feb 13 08:50:53 speedy kernel: R13: ffff880232322698 R14: 0000000000000000 R15: > ffff880232322680 > Feb 13 08:50:53 speedy kernel: FS: 00007f7666c6b700(0000) > GS:ffff88023bd00000(0000) knlGS:0000000000000000 > Feb 13 08:50:53 speedy kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > Feb 13 08:50:53 speedy kernel: CR2: 0000000000000008 CR3: 0000000223492000 CR4: > 00000000000006e0 > Feb 13 08:50:53 speedy kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > Feb 13 08:50:53 speedy kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: > 0000000000000400 > Feb 13 08:50:53 speedy kernel: Process udisks-daemon (pid: 2721, threadinfo > ffff8802234dc000, task ffff880230f76920) > Feb 13 08:50:53 speedy kernel: Stack: > Feb 13 08:50:53 speedy kernel: ffffffff8103468a ffff880231599048 > 0000000000000000 ffff880231599000 > Feb 13 08:50:53 speedy kernel: ffff880232322698 000000000000001d > ffff880232322680 ffffffff810a0f35 > Feb 13 08:50:53 speedy kernel: ffff880232322680 ffff880231599000 > 0000000000000000 ffff880232322758 > Feb 13 08:50:53 speedy kernel: Call Trace: > Feb 13 08:50:53 speedy kernel: [<ffffffff8103468a>] ? try_to_wake_up+0x200/0x200 > Feb 13 08:50:53 speedy kernel: [<ffffffff810a0f35>] ? get_super+0x1a/0x95 > Feb 13 08:50:53 speedy kernel: [<ffffffff810b2460>] ? iput+0x2b/0x17e > Feb 13 08:50:53 speedy kernel: [<ffffffff810eb4b6>] ? > rescan_partitions+0xac/0x446 > Feb 13 08:50:53 speedy kernel: [<ffffffff810c5498>] ? __blkdev_get+0x162/0x33f > Feb 13 08:50:53 speedy kernel: [<ffffffff810c5913>] ? blkdev_get+0x29e/0x29e > Feb 13 08:50:53 speedy kernel: [<ffffffff810c5835>] ? blkdev_get+0x1c0/0x29e > Feb 13 08:50:53 speedy kernel: [<ffffffff810c5913>] ? blkdev_get+0x29e/0x29e > Feb 13 08:50:53 speedy kernel: [<ffffffff8109e0a7>] ? > __dentry_open.clone.14+0x16b/0x294 > Feb 13 08:50:53 speedy kernel: [<ffffffff810aab37>] ? > do_last.clone.34+0x64e/0x662 > Feb 13 08:50:53 speedy kernel: [<ffffffff810aac4d>] ? path_openat+0xcb/0x354 > Feb 13 08:50:53 speedy kernel: [<ffffffff8133e9b0>] ? > scsi_set_medium_removal+0x46/0x6b > Feb 13 08:50:53 speedy kernel: [<ffffffff810aafb1>] ? do_filp_open+0x2c/0x72 > Feb 13 08:50:53 speedy kernel: [<ffffffff810b4066>] ? alloc_fd+0x69/0x10f > Feb 13 08:50:53 speedy kernel: [<ffffffff8109ed9a>] ? do_sys_open+0x101/0x18f > Feb 13 08:50:53 speedy kernel: [<ffffffff81483292>] ? > system_call_fastpath+0x16/0x1b > Feb 13 08:50:53 speedy kernel: Code: ff ff 48 83 c4 68 5b 5d 41 5c 41 5d 41 5e > 41 5f c3 41 57 41 56 41 55 41 54 55 53 48 83 ec 78 48 8b 9f 50 02 00 00 48 89 > 7c 24 48 <48> 8b 43 08 48 89 44 24 28 8b 05 49 dc 7e 00 c1 e8 15 83 e0 07 > Feb 13 08:50:53 speedy kernel: RIP [<ffffffff8135b798>] > sd_revalidate_disk+0x1a/0x16ee > Feb 13 08:50:53 speedy kernel: RSP <ffff8802234ddb08> > Feb 13 08:50:53 speedy kernel: CR2: 0000000000000008 > Feb 13 08:50:53 speedy kernel: ---[ end trace 0370d79d444e26e5 ]--- According to the comments by Huajun Li: http://www.spinics.net/lists/linux-scsi/msg55698.html The following commit has changed __blkdev_get() to end up calling sd_revalidate_disk() without getting a refcount of scsi_device: commit 1196f8b814f32cd04df334abf47648c2a9fd8324 Author: Tejun Heo <tj@kernel.org> Date: Thu Apr 21 20:54:45 2011 +0200 block: rescan partitions on invalidated devices on -ENOMEDIA too that could lead to oops like this: process A process B ---------------------------------------------- sys_open __blkdev_get sd_open returns -ENOMEDIUM scsi_remove_device <scsi_device torn down> rescan_partitions sd_revalidate_disk <oops> Should "revalidate_disk" of block_device_operations work without successful open()? If so, sd_revalidate_disk() (and possibly other drivers) needs to be fixed. (e.g. use scsi_disk_get/put by itself) If not, __blkdev_get() or rescan_partision() should avoid calling "revalidate_disk" for -ENOMEDIUM case. Thanks, -- Jun'ichi Nomura, NEC Corporation ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-02-14 11:14 ` Jun'ichi Nomura @ 2012-02-14 13:31 ` Stefan Richter 2012-02-14 16:28 ` Tejun Heo 1 sibling, 0 replies; 17+ messages in thread From: Stefan Richter @ 2012-02-14 13:31 UTC (permalink / raw) To: Jun'ichi Nomura Cc: Naveen Goswamy, Jens Axboe, Tejun Heo, James Bottomley, Dave Jones, linux-kernel, linux-scsi On Feb 14 Jun'ichi Nomura wrote: > According to the comments by Huajun Li: > http://www.spinics.net/lists/linux-scsi/msg55698.html > > The following commit has changed __blkdev_get() to end up calling > sd_revalidate_disk() without getting a refcount of scsi_device: > > commit 1196f8b814f32cd04df334abf47648c2a9fd8324 > Author: Tejun Heo <tj@kernel.org> > Date: Thu Apr 21 20:54:45 2011 +0200 > > block: rescan partitions on invalidated devices on -ENOMEDIA too > > that could lead to oops like this: > > process A process B > ---------------------------------------------- > sys_open > __blkdev_get > sd_open > returns -ENOMEDIUM > scsi_remove_device > <scsi_device torn down> > rescan_partitions > sd_revalidate_disk > <oops> > > Should "revalidate_disk" of block_device_operations work > without successful open()? > > If so, sd_revalidate_disk() (and possibly other drivers) needs to be > fixed. (e.g. use scsi_disk_get/put by itself) > > If not, __blkdev_get() or rescan_partision() should avoid calling > "revalidate_disk" for -ENOMEDIUM case. It may very well be that not only sd_revalidate_disk is affected. I have yet to check whether the "open -> unplug -> ioctl -> oops" bug from http://www.spinics.net/lists/linux-scsi/msg56254.html (a) happens under 3.3-rc still (was reported against 3.2-rc7), (b) affects sd devices too (was reported against sr devices). -- Stefan Richter -=====-===-- --=- -===- http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-02-14 11:14 ` Jun'ichi Nomura 2012-02-14 13:31 ` Stefan Richter @ 2012-02-14 16:28 ` Tejun Heo 2012-02-15 2:56 ` Jun'ichi Nomura 1 sibling, 1 reply; 17+ messages in thread From: Tejun Heo @ 2012-02-14 16:28 UTC (permalink / raw) To: Jun'ichi Nomura Cc: Naveen Goswamy, Jens Axboe, James Bottomley, Stefan Richter, Dave Jones, linux-kernel, linux-scsi Hello, On Tue, Feb 14, 2012 at 08:14:40PM +0900, Jun'ichi Nomura wrote: > The following commit has changed __blkdev_get() to end up calling > sd_revalidate_disk() without getting a refcount of scsi_device: > > commit 1196f8b814f32cd04df334abf47648c2a9fd8324 > Author: Tejun Heo <tj@kernel.org> > Date: Thu Apr 21 20:54:45 2011 +0200 > > block: rescan partitions on invalidated devices on -ENOMEDIA too > > that could lead to oops like this: > > process A process B > ---------------------------------------------- > sys_open > __blkdev_get > sd_open > returns -ENOMEDIUM > scsi_remove_device > <scsi_device torn down> > rescan_partitions > sd_revalidate_disk > <oops> > > Should "revalidate_disk" of block_device_operations work > without successful open()? > > If so, sd_revalidate_disk() (and possibly other drivers) needs to be > fixed. (e.g. use scsi_disk_get/put by itself) > > If not, __blkdev_get() or rescan_partision() should avoid calling > "revalidate_disk" for -ENOMEDIUM case. Hmmm... right, that's a problem. Missed rescan_partitions() calling into driver. What we should probably do is separating out invalidation & partition shoot down into a separate function, say trucate_disk(), and call that on -ENOMEDIUM instead of rescan_partitions(). All that's necessary is killing the partition devices (and maybe zapping device size to zero). Any one interested in trying it? Thanks. -- tejun ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-02-14 16:28 ` Tejun Heo @ 2012-02-15 2:56 ` Jun'ichi Nomura 2012-02-15 17:26 ` Tejun Heo 0 siblings, 1 reply; 17+ messages in thread From: Jun'ichi Nomura @ 2012-02-15 2:56 UTC (permalink / raw) To: Tejun Heo, Naveen Goswamy Cc: Jens Axboe, James Bottomley, Stefan Richter, Dave Jones, linux-kernel, linux-scsi Hi, Thank you for the comments. On 02/15/12 01:28, Tejun Heo wrote: >> that could lead to oops like this: >> >> process A process B >> ---------------------------------------------- >> sys_open >> __blkdev_get >> sd_open >> returns -ENOMEDIUM >> scsi_remove_device >> <scsi_device torn down> >> rescan_partitions >> sd_revalidate_disk >> <oops> >> >> Should "revalidate_disk" of block_device_operations work >> without successful open()? >> >> If so, sd_revalidate_disk() (and possibly other drivers) needs to be >> fixed. (e.g. use scsi_disk_get/put by itself) >> >> If not, __blkdev_get() or rescan_partision() should avoid calling >> "revalidate_disk" for -ENOMEDIUM case. > > Hmmm... right, that's a problem. Missed rescan_partitions() calling > into driver. What we should probably do is separating out > invalidation & partition shoot down into a separate function, say > trucate_disk(), and call that on -ENOMEDIUM instead of > rescan_partitions(). All that's necessary is killing the partition > devices (and maybe zapping device size to zero). Any one interested > in trying it? How about this? If the patch looks ok, I appreciate if somebody with removable media could test the followings: - the oops in sd_revalidate_disk() should not occur: http://marc.info/?l=linux-scsi&m=132388619710052 - the problem reported here should not be re-introduced: https://bugzilla.kernel.org/show_bug.cgi?id=13029 Index: linux-3.3/block/partition-generic.c =================================================================== --- linux-3.3.orig/block/partition-generic.c 2012-02-15 09:00:25.147293790 +0900 +++ linux-3.3/block/partition-generic.c 2012-02-15 11:31:33.835554974 +0900 @@ -389,17 +389,11 @@ static bool disk_unlock_native_capacity( } } -int rescan_partitions(struct gendisk *disk, struct block_device *bdev) +static int drop_partitions(struct gendisk *disk, struct block_device *bdev) { - struct parsed_partitions *state = NULL; struct disk_part_iter piter; struct hd_struct *part; - int p, highest, res; -rescan: - if (state && !IS_ERR(state)) { - kfree(state); - state = NULL; - } + int res; if (bdev->bd_part_count) return -EBUSY; @@ -412,6 +406,24 @@ rescan: delete_partition(disk, part->partno); disk_part_iter_exit(&piter); + return 0; +} + +int rescan_partitions(struct gendisk *disk, struct block_device *bdev) +{ + struct parsed_partitions *state = NULL; + struct hd_struct *part; + int p, highest, res; +rescan: + if (state && !IS_ERR(state)) { + kfree(state); + state = NULL; + } + + res = drop_partitions(disk, bdev); + if (res) + return res; + if (disk->fops->revalidate_disk) disk->fops->revalidate_disk(disk); check_disk_size_change(disk, bdev); @@ -515,6 +527,22 @@ rescan: return 0; } +int invalidate_partitions(struct gendisk *disk, struct block_device *bdev) +{ + int res; + + res = drop_partitions(disk, bdev); + if (res) + return res; + + check_disk_size_change(disk, bdev); + bdev->bd_invalidated = 0; + /* tell userspace that the media / partition table may have changed */ + kobject_uevent(&disk_to_dev(disk)->kobj, KOBJ_CHANGE); + + return 0; +} + unsigned char *read_dev_sector(struct block_device *bdev, sector_t n, Sector *p) { struct address_space *mapping = bdev->bd_inode->i_mapping; Index: linux-3.3/include/linux/genhd.h =================================================================== --- linux-3.3.orig/include/linux/genhd.h 2012-02-09 12:21:53.000000000 +0900 +++ linux-3.3/include/linux/genhd.h 2012-02-15 11:18:59.661594629 +0900 @@ -596,6 +596,7 @@ extern char *disk_name (struct gendisk * extern int disk_expand_part_tbl(struct gendisk *disk, int target); extern int rescan_partitions(struct gendisk *disk, struct block_device *bdev); +extern int invalidate_partitions(struct gendisk *disk, struct block_device *bdev); extern struct hd_struct * __must_check add_partition(struct gendisk *disk, int partno, sector_t start, sector_t len, int flags, Index: linux-3.3/fs/block_dev.c =================================================================== --- linux-3.3.orig/fs/block_dev.c 2012-02-09 12:21:53.000000000 +0900 +++ linux-3.3/fs/block_dev.c 2012-02-15 11:34:48.800549266 +0900 @@ -1183,8 +1183,12 @@ static int __blkdev_get(struct block_dev * The latter is necessary to prevent ghost * partitions on a removed medium. */ - if (bdev->bd_invalidated && (!ret || ret == -ENOMEDIUM)) - rescan_partitions(disk, bdev); + if (bdev->bd_invalidated) { + if (!ret) + rescan_partitions(disk, bdev); + else if (ret == -ENOMEDIUM) + invalidate_partitions(disk, bdev); + } if (ret) goto out_clear; } else { @@ -1214,8 +1218,12 @@ static int __blkdev_get(struct block_dev if (bdev->bd_disk->fops->open) ret = bdev->bd_disk->fops->open(bdev, mode); /* the same as first opener case, read comment there */ - if (bdev->bd_invalidated && (!ret || ret == -ENOMEDIUM)) - rescan_partitions(bdev->bd_disk, bdev); + if (bdev->bd_invalidated) { + if (!ret) + rescan_partitions(bdev->bd_disk, bdev); + else if (ret == -ENOMEDIUM) + invalidate_partitions(bdev->bd_disk, bdev); + } if (ret) goto out_unlock_bdev; } ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-02-15 2:56 ` Jun'ichi Nomura @ 2012-02-15 17:26 ` Tejun Heo 2012-02-16 1:26 ` Jun'ichi Nomura 0 siblings, 1 reply; 17+ messages in thread From: Tejun Heo @ 2012-02-15 17:26 UTC (permalink / raw) To: Jun'ichi Nomura Cc: Naveen Goswamy, Jens Axboe, James Bottomley, Stefan Richter, Dave Jones, linux-kernel, linux-scsi Hello, This seems like the right approach to me, but.. On Wed, Feb 15, 2012 at 11:56:19AM +0900, Jun'ichi Nomura wrote: > +int invalidate_partitions(struct gendisk *disk, struct block_device *bdev) > +{ > + int res; > + > + res = drop_partitions(disk, bdev); > + if (res) > + return res; > + Hmmm... shouldn't we have set_capacity(disk, 0) here? > + check_disk_size_change(disk, bdev); > + bdev->bd_invalidated = 0; > + /* tell userspace that the media / partition table may have changed */ > + kobject_uevent(&disk_to_dev(disk)->kobj, KOBJ_CHANGE); Also, we really shouldn't be generating KOBJ_CHANGE after every -ENOMEDIUM open. This can easily lead to infinite loop. We should generate this iff we actually dropped partitions && modified the size. Thanks. -- tejun ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-02-15 17:26 ` Tejun Heo @ 2012-02-16 1:26 ` Jun'ichi Nomura 2012-02-16 16:36 ` Tejun Heo 2012-03-01 18:58 ` Luis Henriques 0 siblings, 2 replies; 17+ messages in thread From: Jun'ichi Nomura @ 2012-02-16 1:26 UTC (permalink / raw) To: Tejun Heo Cc: Naveen Goswamy, Jens Axboe, James Bottomley, Stefan Richter, Dave Jones, linux-kernel, linux-scsi Hi, Thank you for review and comments. On 02/16/12 02:26, Tejun Heo wrote: > On Wed, Feb 15, 2012 at 11:56:19AM +0900, Jun'ichi Nomura wrote: >> +int invalidate_partitions(struct gendisk *disk, struct block_device *bdev) >> +{ >> + int res; >> + >> + res = drop_partitions(disk, bdev); >> + if (res) >> + return res; >> + > > Hmmm... shouldn't we have set_capacity(disk, 0) here? Added. I wasn't sure whether I should leave it to drivers. But it seems capacity 0 for ENOMEDIUM device is reasonable. >> + check_disk_size_change(disk, bdev); >> + bdev->bd_invalidated = 0; >> + /* tell userspace that the media / partition table may have changed */ >> + kobject_uevent(&disk_to_dev(disk)->kobj, KOBJ_CHANGE); > > Also, we really shouldn't be generating KOBJ_CHANGE after every > -ENOMEDIUM open. This can easily lead to infinite loop. We should > generate this iff we actually dropped partitions && modified the size. invalidate_partitions() is called only when bd_invalidated is set. So KOBJ_CHANGE is not raised for every ENOMEDIUM open. I put it explicit in the function to make it safer for possible misuse. How about this? --------------------------------------------------------- Do not call drivers when invalidating partitions for -ENOMEDIUM When a scsi driver returns -ENOMEDIUM for open(), __blkdev_get() calls rescan_partitions(), which ends up calling sd_revalidate_disk() without getting a refcount of scsi_device. That could lead to oops like this: process A process B ---------------------------------------------- sys_open __blkdev_get sd_open returns -ENOMEDIUM scsi_remove_device <scsi_device torn down> rescan_partitions sd_revalidate_disk <oops> Oopses are reported here: http://marc.info/?l=linux-scsi&m=132388619710052 This patch separates the partition invalidation from rescan_partitions() and use it for -ENOMEDIUM case. Index: linux-3.3/block/partition-generic.c =================================================================== --- linux-3.3.orig/block/partition-generic.c 2012-02-15 09:00:25.147293790 +0900 +++ linux-3.3/block/partition-generic.c 2012-02-16 10:48:22.257680685 +0900 @@ -389,17 +389,11 @@ static bool disk_unlock_native_capacity( } } -int rescan_partitions(struct gendisk *disk, struct block_device *bdev) +static int drop_partitions(struct gendisk *disk, struct block_device *bdev) { - struct parsed_partitions *state = NULL; struct disk_part_iter piter; struct hd_struct *part; - int p, highest, res; -rescan: - if (state && !IS_ERR(state)) { - kfree(state); - state = NULL; - } + int res; if (bdev->bd_part_count) return -EBUSY; @@ -412,6 +406,24 @@ rescan: delete_partition(disk, part->partno); disk_part_iter_exit(&piter); + return 0; +} + +int rescan_partitions(struct gendisk *disk, struct block_device *bdev) +{ + struct parsed_partitions *state = NULL; + struct hd_struct *part; + int p, highest, res; +rescan: + if (state && !IS_ERR(state)) { + kfree(state); + state = NULL; + } + + res = drop_partitions(disk, bdev); + if (res) + return res; + if (disk->fops->revalidate_disk) disk->fops->revalidate_disk(disk); check_disk_size_change(disk, bdev); @@ -515,6 +527,26 @@ rescan: return 0; } +int invalidate_partitions(struct gendisk *disk, struct block_device *bdev) +{ + int res; + + if (!bdev->bd_invalidated) + return 0; + + res = drop_partitions(disk, bdev); + if (res) + return res; + + set_capacity(disk, 0); + check_disk_size_change(disk, bdev); + bdev->bd_invalidated = 0; + /* tell userspace that the media / partition table may have changed */ + kobject_uevent(&disk_to_dev(disk)->kobj, KOBJ_CHANGE); + + return 0; +} + unsigned char *read_dev_sector(struct block_device *bdev, sector_t n, Sector *p) { struct address_space *mapping = bdev->bd_inode->i_mapping; Index: linux-3.3/include/linux/genhd.h =================================================================== --- linux-3.3.orig/include/linux/genhd.h 2012-02-09 12:21:53.000000000 +0900 +++ linux-3.3/include/linux/genhd.h 2012-02-16 10:47:43.783681813 +0900 @@ -596,6 +596,7 @@ extern char *disk_name (struct gendisk * extern int disk_expand_part_tbl(struct gendisk *disk, int target); extern int rescan_partitions(struct gendisk *disk, struct block_device *bdev); +extern int invalidate_partitions(struct gendisk *disk, struct block_device *bdev); extern struct hd_struct * __must_check add_partition(struct gendisk *disk, int partno, sector_t start, sector_t len, int flags, Index: linux-3.3/fs/block_dev.c =================================================================== --- linux-3.3.orig/fs/block_dev.c 2012-02-09 12:21:53.000000000 +0900 +++ linux-3.3/fs/block_dev.c 2012-02-16 10:47:52.602681441 +0900 @@ -1183,8 +1183,12 @@ static int __blkdev_get(struct block_dev * The latter is necessary to prevent ghost * partitions on a removed medium. */ - if (bdev->bd_invalidated && (!ret || ret == -ENOMEDIUM)) - rescan_partitions(disk, bdev); + if (bdev->bd_invalidated) { + if (!ret) + rescan_partitions(disk, bdev); + else if (ret == -ENOMEDIUM) + invalidate_partitions(disk, bdev); + } if (ret) goto out_clear; } else { @@ -1214,8 +1218,12 @@ static int __blkdev_get(struct block_dev if (bdev->bd_disk->fops->open) ret = bdev->bd_disk->fops->open(bdev, mode); /* the same as first opener case, read comment there */ - if (bdev->bd_invalidated && (!ret || ret == -ENOMEDIUM)) - rescan_partitions(bdev->bd_disk, bdev); + if (bdev->bd_invalidated) { + if (!ret) + rescan_partitions(bdev->bd_disk, bdev); + else if (ret == -ENOMEDIUM) + invalidate_partitions(bdev->bd_disk, bdev); + } if (ret) goto out_unlock_bdev; } ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-02-16 1:26 ` Jun'ichi Nomura @ 2012-02-16 16:36 ` Tejun Heo 2012-03-01 18:58 ` Luis Henriques 1 sibling, 0 replies; 17+ messages in thread From: Tejun Heo @ 2012-02-16 16:36 UTC (permalink / raw) To: Jun'ichi Nomura Cc: Naveen Goswamy, Jens Axboe, James Bottomley, Stefan Richter, Dave Jones, linux-kernel, linux-scsi Hello, On Thu, Feb 16, 2012 at 10:26:38AM +0900, Jun'ichi Nomura wrote: > >> +int invalidate_partitions(struct gendisk *disk, struct block_device *bdev) > >> +{ > >> + int res; > >> + > >> + res = drop_partitions(disk, bdev); > >> + if (res) > >> + return res; > >> + > > > > Hmmm... shouldn't we have set_capacity(disk, 0) here? > > Added. > I wasn't sure whether I should leave it to drivers. The problem is that we shouldn't call into drivers without first opening the device, so.... > But it seems capacity 0 for ENOMEDIUM device is reasonable. Yeah, I *think* it should be okay. > >> + check_disk_size_change(disk, bdev); > >> + bdev->bd_invalidated = 0; > >> + /* tell userspace that the media / partition table may have changed */ > >> + kobject_uevent(&disk_to_dev(disk)->kobj, KOBJ_CHANGE); > > > > Also, we really shouldn't be generating KOBJ_CHANGE after every > > -ENOMEDIUM open. This can easily lead to infinite loop. We should > > generate this iff we actually dropped partitions && modified the size. > > invalidate_partitions() is called only when bd_invalidated is set. > So KOBJ_CHANGE is not raised for every ENOMEDIUM open. Ah, okay. > I put it explicit in the function to make it safer for > possible misuse. > > How about this? > > --------------------------------------------------------- > Do not call drivers when invalidating partitions for -ENOMEDIUM > > When a scsi driver returns -ENOMEDIUM for open(), > __blkdev_get() calls rescan_partitions(), which ends up calling > sd_revalidate_disk() without getting a refcount of scsi_device. > > That could lead to oops like this: > > process A process B > ---------------------------------------------- > sys_open > __blkdev_get > sd_open > returns -ENOMEDIUM > scsi_remove_device > <scsi_device torn down> > rescan_partitions > sd_revalidate_disk > <oops> > > Oopses are reported here: > http://marc.info/?l=linux-scsi&m=132388619710052 > > This patch separates the partition invalidation from rescan_partitions() > and use it for -ENOMEDIUM case. Yeah, this looks good to me. Thank you. -- tejun ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-02-16 1:26 ` Jun'ichi Nomura 2012-02-16 16:36 ` Tejun Heo @ 2012-03-01 18:58 ` Luis Henriques 2012-03-02 0:12 ` Jun'ichi Nomura 1 sibling, 1 reply; 17+ messages in thread From: Luis Henriques @ 2012-03-01 18:58 UTC (permalink / raw) To: Jun'ichi Nomura Cc: Tejun Heo, Naveen Goswamy, Jens Axboe, James Bottomley, Stefan Richter, Dave Jones, linux-kernel, linux-scsi Hi, On Thu, Feb 16, 2012 at 10:26:38AM +0900, Jun'ichi Nomura wrote: > Hi, > > Thank you for review and comments. > > On 02/16/12 02:26, Tejun Heo wrote: > > On Wed, Feb 15, 2012 at 11:56:19AM +0900, Jun'ichi Nomura wrote: > >> +int invalidate_partitions(struct gendisk *disk, struct block_device *bdev) > >> +{ > >> + int res; > >> + > >> + res = drop_partitions(disk, bdev); > >> + if (res) > >> + return res; > >> + > > > > Hmmm... shouldn't we have set_capacity(disk, 0) here? > > Added. > I wasn't sure whether I should leave it to drivers. > But it seems capacity 0 for ENOMEDIUM device is reasonable. > > >> + check_disk_size_change(disk, bdev); > >> + bdev->bd_invalidated = 0; > >> + /* tell userspace that the media / partition table may have changed */ > >> + kobject_uevent(&disk_to_dev(disk)->kobj, KOBJ_CHANGE); > > > > Also, we really shouldn't be generating KOBJ_CHANGE after every > > -ENOMEDIUM open. This can easily lead to infinite loop. We should > > generate this iff we actually dropped partitions && modified the size. > > invalidate_partitions() is called only when bd_invalidated is set. > So KOBJ_CHANGE is not raised for every ENOMEDIUM open. > > I put it explicit in the function to make it safer for > possible misuse. > > How about this? Are there any updates on this fix? I was wondering if any progress has been made and if this patch has any chances of hitting mainline soon. I have executed a quick test and it seems to solve the problem (or, at least, I am not able to reproduce the oops anymore). Cheers, -- Luis > --------------------------------------------------------- > Do not call drivers when invalidating partitions for -ENOMEDIUM > > When a scsi driver returns -ENOMEDIUM for open(), > __blkdev_get() calls rescan_partitions(), which ends up calling > sd_revalidate_disk() without getting a refcount of scsi_device. > > That could lead to oops like this: > > process A process B > ---------------------------------------------- > sys_open > __blkdev_get > sd_open > returns -ENOMEDIUM > scsi_remove_device > <scsi_device torn down> > rescan_partitions > sd_revalidate_disk > <oops> > > Oopses are reported here: > http://marc.info/?l=linux-scsi&m=132388619710052 > > This patch separates the partition invalidation from rescan_partitions() > and use it for -ENOMEDIUM case. > > Index: linux-3.3/block/partition-generic.c > =================================================================== > --- linux-3.3.orig/block/partition-generic.c 2012-02-15 09:00:25.147293790 +0900 > +++ linux-3.3/block/partition-generic.c 2012-02-16 10:48:22.257680685 +0900 > @@ -389,17 +389,11 @@ static bool disk_unlock_native_capacity( > } > } > > -int rescan_partitions(struct gendisk *disk, struct block_device *bdev) > +static int drop_partitions(struct gendisk *disk, struct block_device *bdev) > { > - struct parsed_partitions *state = NULL; > struct disk_part_iter piter; > struct hd_struct *part; > - int p, highest, res; > -rescan: > - if (state && !IS_ERR(state)) { > - kfree(state); > - state = NULL; > - } > + int res; > > if (bdev->bd_part_count) > return -EBUSY; > @@ -412,6 +406,24 @@ rescan: > delete_partition(disk, part->partno); > disk_part_iter_exit(&piter); > > + return 0; > +} > + > +int rescan_partitions(struct gendisk *disk, struct block_device *bdev) > +{ > + struct parsed_partitions *state = NULL; > + struct hd_struct *part; > + int p, highest, res; > +rescan: > + if (state && !IS_ERR(state)) { > + kfree(state); > + state = NULL; > + } > + > + res = drop_partitions(disk, bdev); > + if (res) > + return res; > + > if (disk->fops->revalidate_disk) > disk->fops->revalidate_disk(disk); > check_disk_size_change(disk, bdev); > @@ -515,6 +527,26 @@ rescan: > return 0; > } > > +int invalidate_partitions(struct gendisk *disk, struct block_device *bdev) > +{ > + int res; > + > + if (!bdev->bd_invalidated) > + return 0; > + > + res = drop_partitions(disk, bdev); > + if (res) > + return res; > + > + set_capacity(disk, 0); > + check_disk_size_change(disk, bdev); > + bdev->bd_invalidated = 0; > + /* tell userspace that the media / partition table may have changed */ > + kobject_uevent(&disk_to_dev(disk)->kobj, KOBJ_CHANGE); > + > + return 0; > +} > + > unsigned char *read_dev_sector(struct block_device *bdev, sector_t n, Sector *p) > { > struct address_space *mapping = bdev->bd_inode->i_mapping; > Index: linux-3.3/include/linux/genhd.h > =================================================================== > --- linux-3.3.orig/include/linux/genhd.h 2012-02-09 12:21:53.000000000 +0900 > +++ linux-3.3/include/linux/genhd.h 2012-02-16 10:47:43.783681813 +0900 > @@ -596,6 +596,7 @@ extern char *disk_name (struct gendisk * > > extern int disk_expand_part_tbl(struct gendisk *disk, int target); > extern int rescan_partitions(struct gendisk *disk, struct block_device *bdev); > +extern int invalidate_partitions(struct gendisk *disk, struct block_device *bdev); > extern struct hd_struct * __must_check add_partition(struct gendisk *disk, > int partno, sector_t start, > sector_t len, int flags, > Index: linux-3.3/fs/block_dev.c > =================================================================== > --- linux-3.3.orig/fs/block_dev.c 2012-02-09 12:21:53.000000000 +0900 > +++ linux-3.3/fs/block_dev.c 2012-02-16 10:47:52.602681441 +0900 > @@ -1183,8 +1183,12 @@ static int __blkdev_get(struct block_dev > * The latter is necessary to prevent ghost > * partitions on a removed medium. > */ > - if (bdev->bd_invalidated && (!ret || ret == -ENOMEDIUM)) > - rescan_partitions(disk, bdev); > + if (bdev->bd_invalidated) { > + if (!ret) > + rescan_partitions(disk, bdev); > + else if (ret == -ENOMEDIUM) > + invalidate_partitions(disk, bdev); > + } > if (ret) > goto out_clear; > } else { > @@ -1214,8 +1218,12 @@ static int __blkdev_get(struct block_dev > if (bdev->bd_disk->fops->open) > ret = bdev->bd_disk->fops->open(bdev, mode); > /* the same as first opener case, read comment there */ > - if (bdev->bd_invalidated && (!ret || ret == -ENOMEDIUM)) > - rescan_partitions(bdev->bd_disk, bdev); > + if (bdev->bd_invalidated) { > + if (!ret) > + rescan_partitions(bdev->bd_disk, bdev); > + else if (ret == -ENOMEDIUM) > + invalidate_partitions(bdev->bd_disk, bdev); > + } > if (ret) > goto out_unlock_bdev; > } > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-03-01 18:58 ` Luis Henriques @ 2012-03-02 0:12 ` Jun'ichi Nomura 2012-03-02 9:35 ` Luis Henriques 2012-03-02 9:41 ` Jens Axboe 0 siblings, 2 replies; 17+ messages in thread From: Jun'ichi Nomura @ 2012-03-02 0:12 UTC (permalink / raw) To: Luis Henriques, Jens Axboe Cc: Tejun Heo, Naveen Goswamy, James Bottomley, Stefan Richter, Dave Jones, linux-kernel, linux-scsi Hi, On 03/02/12 03:58, Luis Henriques wrote: > Are there any updates on this fix? I was wondering if any progress has been > made and if this patch has any chances of hitting mainline soon. > > I have executed a quick test and it seems to solve the problem (or, at least, I > am not able to reproduce the oops anymore). Thank you for testing and feedback. The patch is posted here and waiting for Jens to pick up: https://lkml.org/lkml/2012/2/21/443 [PATCH] Fix NULL pointer dereference in sd_revalidate_disk -- Jun'ichi Nomura, NEC Corporation ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-03-02 0:12 ` Jun'ichi Nomura @ 2012-03-02 9:35 ` Luis Henriques 2012-03-02 9:41 ` Jens Axboe 1 sibling, 0 replies; 17+ messages in thread From: Luis Henriques @ 2012-03-02 9:35 UTC (permalink / raw) To: Jun'ichi Nomura Cc: Jens Axboe, Tejun Heo, Naveen Goswamy, James Bottomley, Stefan Richter, Dave Jones, linux-kernel, linux-scsi On Fri, Mar 02, 2012 at 09:12:02AM +0900, Jun'ichi Nomura wrote: > Hi, > > On 03/02/12 03:58, Luis Henriques wrote: > > Are there any updates on this fix? I was wondering if any progress has been > > made and if this patch has any chances of hitting mainline soon. > > > > I have executed a quick test and it seems to solve the problem (or, at least, I > > am not able to reproduce the oops anymore). > > Thank you for testing and feedback. > > The patch is posted here and waiting for Jens to pick up: > https://lkml.org/lkml/2012/2/21/443 > [PATCH] Fix NULL pointer dereference in sd_revalidate_disk Thanks for pointing me to the post. I don't know how I missed that update. I'll take a look at it and re-test it. Cheers, -- Luis ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Kernel crashing on eject SD card 2012-03-02 0:12 ` Jun'ichi Nomura 2012-03-02 9:35 ` Luis Henriques @ 2012-03-02 9:41 ` Jens Axboe 1 sibling, 0 replies; 17+ messages in thread From: Jens Axboe @ 2012-03-02 9:41 UTC (permalink / raw) To: Jun'ichi Nomura Cc: Luis Henriques, Tejun Heo, Naveen Goswamy, James Bottomley, Stefan Richter, Dave Jones, linux-kernel, linux-scsi On 03/02/2012 01:12 AM, Jun'ichi Nomura wrote: > Hi, > > On 03/02/12 03:58, Luis Henriques wrote: >> Are there any updates on this fix? I was wondering if any progress has been >> made and if this patch has any chances of hitting mainline soon. >> >> I have executed a quick test and it seems to solve the problem (or, at least, I >> am not able to reproduce the oops anymore). > > Thank you for testing and feedback. > > The patch is posted here and waiting for Jens to pick up: > https://lkml.org/lkml/2012/2/21/443 > [PATCH] Fix NULL pointer dereference in sd_revalidate_disk It's queued up, I'll send off patches to Linus today. -- Jens Axboe ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2012-03-02 9:42 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-02-08 0:19 Kernel crashing on eject SD card Naveen Goswamy 2012-02-12 21:08 ` Stefan Richter 2012-02-12 21:20 ` Stefan Richter 2012-02-13 1:46 ` Naveen Goswamy 2012-02-13 2:18 ` Dave Jones 2012-02-13 17:40 ` Naveen Goswamy 2012-02-14 11:14 ` Jun'ichi Nomura 2012-02-14 13:31 ` Stefan Richter 2012-02-14 16:28 ` Tejun Heo 2012-02-15 2:56 ` Jun'ichi Nomura 2012-02-15 17:26 ` Tejun Heo 2012-02-16 1:26 ` Jun'ichi Nomura 2012-02-16 16:36 ` Tejun Heo 2012-03-01 18:58 ` Luis Henriques 2012-03-02 0:12 ` Jun'ichi Nomura 2012-03-02 9:35 ` Luis Henriques 2012-03-02 9:41 ` Jens Axboe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).