* Manual driver binding and unbinding broken for SCSI
@ 2017-02-18 0:30 Omar Sandoval
2017-02-18 0:43 ` James Bottomley
0 siblings, 1 reply; 5+ messages in thread
From: Omar Sandoval @ 2017-02-18 0:30 UTC
To: Dan Williams, Jan Kara, James Bottomley, Martin K. Petersen
Cc: Jens Axboe, linux-scsi, linux-block
Hi, everyone,
As per $SUBJECT, I can cause a crash on v4.10-rc8, Jens' block/for-next,
and Jan's bdi branch [1] by doing this:
# lsscsi
[0:0:0:0] disk QEMU QEMU HARDDISK 2.5+ /dev/sda
# echo 0:0:0:0 > /sys/bus/scsi/drivers/sd/unbind
# echo 0:0:0:0 > /sys/bus/scsi/drivers/sd/bind
The resulting trace looks like this:
[ 19.347924] kobject (ffff8800791ea0b8): tried to init an initialized object, something is seriously wrong.
[ 19.349781] CPU: 1 PID: 84 Comm: kworker/u8:1 Not tainted 4.10.0-rc7-00210-g53f39eeaa263 #34
[ 19.350686] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-20161122_114906-anatol 04/01/2014
[ 19.350920] Workqueue: events_unbound async_run_entry_fn
[ 19.350920] Call Trace:
[ 19.350920] dump_stack+0x63/0x83
[ 19.350920] kobject_init+0x77/0x90
[ 19.350920] blk_mq_register_dev+0x40/0x130
[ 19.350920] blk_register_queue+0xb6/0x190
[ 19.350920] device_add_disk+0x1ec/0x4b0
[ 19.350920] sd_probe_async+0x10d/0x1c0 [sd_mod]
[ 19.350920] async_run_entry_fn+0x48/0x150
[ 19.350920] process_one_work+0x1d0/0x480
[ 19.350920] worker_thread+0x48/0x4e0
[ 19.350920] kthread+0x101/0x140
[ 19.350920] ? process_one_work+0x480/0x480
[ 19.350920] ? kthread_create_on_node+0x60/0x60
[ 19.350920] ret_from_fork+0x2c/0x40
Additionally, on v4.10-rc8, but not on block/for-next or Jan's branch,
doing this:
# echo 0:0:0:0 > /sys/bus/scsi/drivers/sd/unbind
# modprobe scsi_debug
Causes this trace:
[ 18.876096] ------------[ cut here ]------------
[ 18.877057] WARNING: CPU: 1 PID: 90 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80
[ 18.878270] sysfs: cannot create duplicate filename '/devices/virtual/bdi/8:0'
[ 18.879435] Modules linked in: scsi_debug btrfs xor raid6_pq sd_mod virtio_scsi scsi_mod nvme nvme_core virtio_net
[ 18.881118] CPU: 1 PID: 90 Comm: kworker/u8:2 Not tainted 4.10.0-rc8 #34
[ 18.882114] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-20161122_114906-anatol 04/01/2014
[ 18.883872] Workqueue: events_unbound async_run_entry_fn
[ 18.884408] Call Trace:
[ 18.884408] dump_stack+0x63/0x83
[ 18.884408] __warn+0xcb/0xf0
[ 18.884408] warn_slowpath_fmt+0x5f/0x80
[ 18.884408] ? kernfs_path_from_node+0x4f/0x60
[ 18.884408] sysfs_warn_dup+0x62/0x80
[ 18.884408] sysfs_create_dir_ns+0x77/0x90
[ 18.884408] kobject_add_internal+0xbe/0x350
[ 18.884408] kobject_add+0x75/0xd0
[ 18.884408] device_add+0x121/0x680
[ 18.884408] device_create_groups_vargs+0xe0/0xf0
[ 18.884408] device_create_vargs+0x1c/0x20
[ 18.884408] bdi_register+0x90/0x1b0
[ 18.884408] ? sd_revalidate_disk+0x34a/0x1d00 [sd_mod]
[ 18.884408] bdi_register_owner+0x36/0x60
[ 18.884408] device_add_disk+0x165/0x4a0
[ 18.884408] ? update_autosuspend+0x51/0x60
[ 18.884408] ? __pm_runtime_use_autosuspend+0x5c/0x70
[ 18.884408] sd_probe_async+0x10d/0x1c0 [sd_mod]
[ 18.884408] async_run_entry_fn+0x4a/0x170
[ 18.884408] process_one_work+0x165/0x430
[ 18.884408] worker_thread+0x4e/0x490
[ 18.884408] kthread+0x101/0x140
[ 18.884408] ? process_one_work+0x430/0x430
[ 18.884408] ? kthread_create_on_node+0x60/0x60
[ 18.884408] ret_from_fork+0x2c/0x40
[ 18.913090] ---[ end trace f43b051485c2a749 ]---
On all three kernels, it looks like the bdi sysfs entry hangs around
after the block device has already been removed:
┌[root@silver ~]
└# lsblk /dev/sda
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 16G 0 disk
┌[root@silver ~]
└# ls -al /sys/devices/virtual/bdi
total 0
drwxr-xr-x 6 root root 0 Feb 17 16:19 .
drwxr-xr-x 13 root root 0 Feb 17 16:19 ..
drwxr-xr-x 3 root root 0 Feb 17 16:19 254:0
drwxr-xr-x 3 root root 0 Feb 17 16:19 259:0
drwxr-xr-x 3 root root 0 Feb 17 16:19 8:0
drwxr-xr-x 3 root root 0 Feb 17 16:19 9p-1
┌[root@silver ~]
└# echo 0:0:0:0 > /sys/bus/scsi/drivers/sd/unbind
┌[root@silver ~]
└# ls -al /sys/devices/virtual/bdi
total 0
drwxr-xr-x 6 root root 0 Feb 17 16:19 .
drwxr-xr-x 13 root root 0 Feb 17 16:19 ..
drwxr-xr-x 3 root root 0 Feb 17 16:19 254:0
drwxr-xr-x 3 root root 0 Feb 17 16:19 259:0
drwxr-xr-x 3 root root 0 Feb 17 16:19 8:0
drwxr-xr-x 3 root root 0 Feb 17 16:19 9p-1
┌[root@silver ~]
└# lsblk /dev/sda
lsblk: /dev/sda: not a block device
Any ideas here?
1: https://git.kernel.org/cgit/linux/kernel/git/jack/linux-fs.git/tree/?h=bdi
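The stale-entry check from the transcript above can be scripted. This is a minimal sketch, not part of the original report: the function name stale_bdi is hypothetical, and it uses the presence of /sys/block/NAME as a stand-in for the lsblk check, which is an assumption. Both directories can be overridden so the check can be pointed at a scratch tree.

```shell
#!/bin/sh
# stale_bdi MAJ:MIN NAME [BDI_DIR] [BLOCK_DIR]   (hypothetical helper)
#
# Prints "stale" and returns 0 if BDI_DIR/MAJ:MIN still exists while
# BLOCK_DIR/NAME is gone; prints "ok" and returns 1 otherwise.
# The defaults are the bdi path from the report and /sys/block; using
# /sys/block instead of lsblk is an assumption of this sketch.
stale_bdi() {
    devno=$1
    name=$2
    bdi_dir=${3:-/sys/devices/virtual/bdi}
    block_dir=${4:-/sys/block}

    if [ -d "$bdi_dir/$devno" ] && [ ! -e "$block_dir/$name" ]; then
        echo "stale"
        return 0
    fi
    echo "ok"
    return 1
}
```

After the unbind shown above, running `stale_bdi 8:0 sda` would be expected to report the leftover entry on an affected kernel.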
* Re: Manual driver binding and unbinding broken for SCSI
2017-02-18 0:30 Manual driver binding and unbinding broken for SCSI Omar Sandoval
@ 2017-02-18 0:43 ` James Bottomley
2017-02-20 2:19 ` Omar Sandoval
0 siblings, 1 reply; 5+ messages in thread
From: James Bottomley @ 2017-02-18 0:43 UTC
To: Omar Sandoval, Dan Williams, Jan Kara, Martin K. Petersen
Cc: Jens Axboe, linux-scsi, linux-block
On Fri, 2017-02-17 at 16:30 -0800, Omar Sandoval wrote:
> Hi, everyone,
>
> As per $SUBJECT, I can cause a crash on v4.10-rc8, Jens' block/for-next,
> and Jan's bdi branch [1] by doing this:
>
> # lsscsi
> [0:0:0:0] disk QEMU QEMU HARDDISK 2.5+ /dev/sda
> # echo 0:0:0:0 > /sys/bus/scsi/drivers/sd/unbind
> # echo 0:0:0:0 > /sys/bus/scsi/drivers/sd/bind
>
> The resulting trace looks like this:
>
> [ 19.347924] kobject (ffff8800791ea0b8): tried to init an initialized object, something is seriously wrong.
> [ 19.349781] CPU: 1 PID: 84 Comm: kworker/u8:1 Not tainted 4.10.0-rc7-00210-g53f39eeaa263 #34
> [ 19.350686] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-20161122_114906-anatol 04/01/2014
> [ 19.350920] Workqueue: events_unbound async_run_entry_fn
> [ 19.350920] Call Trace:
> [ 19.350920] dump_stack+0x63/0x83
> [ 19.350920] kobject_init+0x77/0x90
> [ 19.350920] blk_mq_register_dev+0x40/0x130
> [ 19.350920] blk_register_queue+0xb6/0x190
> [ 19.350920] device_add_disk+0x1ec/0x4b0
> [ 19.350920] sd_probe_async+0x10d/0x1c0 [sd_mod]
> [ 19.350920] async_run_entry_fn+0x48/0x150
> [ 19.350920] process_one_work+0x1d0/0x480
> [ 19.350920] worker_thread+0x48/0x4e0
> [ 19.350920] kthread+0x101/0x140
> [ 19.350920] ? process_one_work+0x480/0x480
> [ 19.350920] ? kthread_create_on_node+0x60/0x60
> [ 19.350920] ret_from_fork+0x2c/0x40
>
> Additionally, on v4.10-rc8, but not on block/for-next or Jan's branch,
> doing this:
>
> # echo 0:0:0:0 > /sys/bus/scsi/drivers/sd/unbind
> # modprobe scsi_debug
>
> Causes this trace:
>
> [ 18.876096] ------------[ cut here ]------------
> [ 18.877057] WARNING: CPU: 1 PID: 90 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80
> [ 18.878270] sysfs: cannot create duplicate filename '/devices/virtual/bdi/8:0'
> [ 18.879435] Modules linked in: scsi_debug btrfs xor raid6_pq sd_mod virtio_scsi scsi_mod nvme nvme_core virtio_net
> [ 18.881118] CPU: 1 PID: 90 Comm: kworker/u8:2 Not tainted 4.10.0-rc8 #34
> [ 18.882114] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-20161122_114906-anatol 04/01/2014
> [ 18.883872] Workqueue: events_unbound async_run_entry_fn
> [ 18.884408] Call Trace:
> [ 18.884408] dump_stack+0x63/0x83
> [ 18.884408] __warn+0xcb/0xf0
> [ 18.884408] warn_slowpath_fmt+0x5f/0x80
> [ 18.884408] ? kernfs_path_from_node+0x4f/0x60
> [ 18.884408] sysfs_warn_dup+0x62/0x80
> [ 18.884408] sysfs_create_dir_ns+0x77/0x90
> [ 18.884408] kobject_add_internal+0xbe/0x350
> [ 18.884408] kobject_add+0x75/0xd0
> [ 18.884408] device_add+0x121/0x680
> [ 18.884408] device_create_groups_vargs+0xe0/0xf0
> [ 18.884408] device_create_vargs+0x1c/0x20
> [ 18.884408] bdi_register+0x90/0x1b0
> [ 18.884408] ? sd_revalidate_disk+0x34a/0x1d00 [sd_mod]
> [ 18.884408] bdi_register_owner+0x36/0x60
> [ 18.884408] device_add_disk+0x165/0x4a0
> [ 18.884408] ? update_autosuspend+0x51/0x60
> [ 18.884408] ? __pm_runtime_use_autosuspend+0x5c/0x70
> [ 18.884408] sd_probe_async+0x10d/0x1c0 [sd_mod]
> [ 18.884408] async_run_entry_fn+0x4a/0x170
> [ 18.884408] process_one_work+0x165/0x430
> [ 18.884408] worker_thread+0x4e/0x490
> [ 18.884408] kthread+0x101/0x140
> [ 18.884408] ? process_one_work+0x430/0x430
> [ 18.884408] ? kthread_create_on_node+0x60/0x60
> [ 18.884408] ret_from_fork+0x2c/0x40
> [ 18.913090] ---[ end trace f43b051485c2a749 ]---
>
> On all three kernels, it looks like the bdi sysfs entry hangs around
> after the block device has already been removed:
This seems to be related to a 0day test we got on the block tree,
details here:
http://marc.info/?t=148624068800001
I root caused the above to something not being released when it should
be, so it looks like you have the same problem. It seems to be a
recent commit in the block tree, so could you bisect it since you have
a nice reproducer?
Thanks,
James
* Re: Manual driver binding and unbinding broken for SCSI
2017-02-18 0:43 ` James Bottomley
@ 2017-02-20 2:19 ` Omar Sandoval
2017-02-21 17:14 ` Jan Kara
0 siblings, 1 reply; 5+ messages in thread
From: Omar Sandoval @ 2017-02-20 2:19 UTC
To: James Bottomley
Cc: Dan Williams, Jan Kara, Martin K. Petersen, Jens Axboe,
linux-scsi, linux-block
On Fri, Feb 17, 2017 at 04:43:56PM -0800, James Bottomley wrote:
> This seems to be related to a 0day test we got on the block tree,
> details here:
>
> http://marc.info/?t=148624068800001
>
> I root caused the above to something not being released when it should
> be, so it looks like you have the same problem. It seems to be a
> recent commit in the block tree, so could you bisect it since you have
> a nice reproducer?
These appear to actually be two separate issues.
The crash on unbind followed by bind only happens with scsi-mq. It has
reproduced since at least 4.0.
The crash on unbind followed by a new device coming up happens both with
and without scsi-mq. The earliest version I was able to check for that
was 4.6, which did reproduce.
I'll see if I can get some more info on these two issues separately.
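Since the first issue depends on scsi-mq, it helps to record which mode a test kernel is in. The sketch below assumes a kernel of this era, where scsi_mod exposes the use_blk_mq module parameter in sysfs as Y or N. The function name scsi_mq_state is hypothetical, and the parameter path can be overridden.

```shell
#!/bin/sh
# scsi_mq_state [PARAM_FILE]   (hypothetical helper)
#
# Prints "mq" if the parameter file reads Y, "legacy" if it reads N,
# and "unknown" if the file is missing or holds anything else.
# The default path assumes scsi_mod exposes use_blk_mq on this kernel.
scsi_mq_state() {
    param=${1:-/sys/module/scsi_mod/parameters/use_blk_mq}

    if [ ! -r "$param" ]; then
        echo "unknown"
        return
    fi

    case $(cat "$param") in
        Y*) echo "mq" ;;
        N*) echo "legacy" ;;
        *)  echo "unknown" ;;
    esac
}
```

Calling `scsi_mq_state` before each reproduction attempt makes it easy to note in a report whether a given trace came from the scsi-mq or the legacy path.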
* Re: Manual driver binding and unbinding broken for SCSI
2017-02-20 2:19 ` Omar Sandoval
@ 2017-02-21 17:14 ` Jan Kara
2017-02-22 10:21 ` Ming Lei
0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2017-02-21 17:14 UTC
To: Omar Sandoval
Cc: James Bottomley, Dan Williams, Jan Kara, Martin K. Petersen,
Jens Axboe, linux-scsi, linux-block
On Sun 19-02-17 18:19:58, Omar Sandoval wrote:
> On Fri, Feb 17, 2017 at 04:43:56PM -0800, James Bottomley wrote:
> > This seems to be related to a 0day test we got on the block tree,
> > details here:
> >
> > http://marc.info/?t=148624068800001
> >
> > I root caused the above to something not being released when it should
> > be, so it looks like you have the same problem. It seems to be a
> > recent commit in the block tree, so could you bisect it since you have
> > a nice reproducer?
>
> These appear to actually be two separate issues.
>
> The unbind followed by bind crash only happens with scsi-mq. It reproes
> since at least 4.0.
>
> The unbind followed by a new device coming up crash happens both with
> and without scsi-mq. The earliest version I was able to check for that
> was 4.6, which did reproduce.
>
> I'll see if I can get some more info on these two issues separately.
Actually, the second issue is only a warning, right? And if I understand the
issue correctly, it should be fixed by either Dan's patches in linux-block
or my patch 4 in the series, which matches your test results. So that is
dealt with. I have no idea about the first issue, though.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: Manual driver binding and unbinding broken for SCSI
2017-02-21 17:14 ` Jan Kara
@ 2017-02-22 10:21 ` Ming Lei
0 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2017-02-22 10:21 UTC
To: Jan Kara
Cc: Omar Sandoval, James Bottomley, Dan Williams, Martin K. Petersen,
Jens Axboe, Linux SCSI List, linux-block
On Wed, Feb 22, 2017 at 1:14 AM, Jan Kara <jack@suse.cz> wrote:
> On Sun 19-02-17 18:19:58, Omar Sandoval wrote:
>> On Fri, Feb 17, 2017 at 04:43:56PM -0800, James Bottomley wrote:
>> > This seems to be related to a 0day test we got on the block tree,
>> > details here:
>> >
>> > http://marc.info/?t=148624068800001
>> >
>> > I root caused the above to something not being released when it should
>> > be, so it looks like you have the same problem. It seems to be a
>> > recent commit in the block tree, so could you bisect it since you have
>> > a nice reproducer?
>>
>> These appear to actually be two separate issues.
>>
>> The unbind followed by bind crash only happens with scsi-mq. It reproes
>> since at least 4.0.
>>
>> The unbind followed by a new device coming up crash happens both with
>> and without scsi-mq. The earliest version I was able to check for that
>> was 4.6, which did reproduce.
>>
>> I'll see if I can get some more info on these two issues separately.
>
> Actually, the second issue is only a warning right? And if I understand the
> issue correctly, it should be fixed by either Dan's patches in linux-block
> or my patch 4 in the series which matches your test results. So that is
> dealt with. I have no idea about the first issue though.
It looks like the first one is an old issue in blk-mq, and I have sent a patchset
to address it:
http://marc.info/?l=linux-kernel&m=148775847517071&w=2
Omar, feel free to give it a test.
thanks,
Ming Lei