All of lore.kernel.org
 help / color / mirror / Atom feed
* virt_blk BUG: sleeping function called from invalid context
@ 2014-06-27 11:57 Josh Boyer
  2014-06-29  8:26   ` Michael S. Tsirkin
  0 siblings, 1 reply; 22+ messages in thread
From: Josh Boyer @ 2014-06-27 11:57 UTC (permalink / raw)
  To: Rusty Russell, Michael S. Tsirkin
  Cc: virtualization, Linux-Kernel@Vger. Kernel. Org, Brian Lane

Hi All,

We've had a report[1] of the virt_blk driver causing a lot of spew
because it's calling a sleeping function from an invalid context.  The
backtrace is below.  This is with kernel v3.16-rc2-69-gd91d66e88ea9.

The reporter is on CC and can give you relevant details.

josh

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1113805

[drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:02.0 on minor 0
 virtio-pci 0000:00:05.0: irq 40 for MSI/MSI-X
 virtio-pci 0000:00:05.0: irq 41 for MSI/MSI-X
  vda: vda1 vda2
 virtio-pci 0000:00:06.0: irq 42 for MSI/MSI-X
 virtio-pci 0000:00:06.0: irq 43 for MSI/MSI-X
  vdb: vdb1
 tsc: Refined TSC clocksource calibration: 3392.169 MHz
 md: bind<vdb1>
 mdadm (350) used greatest stack depth: 12384 bytes left
 md: bind<vda2>
 md: raid1 personality registered for level 1
 md/raid1:md127: active with 2 out of 2 mirrors
 created bitmap (1 pages) for device md127
 md127: bitmap initialized from disk: read 1 pages, set 0 of 153 bits
 md127: detected capacity change from 0 to 10203693056
  md127: unknown partition table
 systemd-cryptsetup[358]: Set cipher aes, mode xts-plain64, key size
512 bits for device
/dev/disk/by-uuid/6972a564-542d-498b-b3a0-7d71c2e966a2.
  md127: unknown partition table
  md127: unknown partition table
 dracut-initqueue[296]: Scanning devices dm-0  for LVM logical volumes
fedora-foo1/root fedora-foo1/swap fedora-foo1/root fedora-foo1/swap
 dracut-initqueue[296]: inactive '/dev/fedora-foo1/root' [7.42 GiB] inherit
 dracut-initqueue[296]: inactive '/dev/fedora-foo1/swap' [2.00 GiB] inherit
 systemd-fsck[662]: /dev/mapper/fedora--foo1-root: clean, 22215/486720
files, 222526/1944576 blocks
 BUG: sleeping function called from invalid context at
kernel/locking/mutex.c:586
 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
 2 locks held by swapper/1/0:
  #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>]
virtblk_done+0x42/0xe0 [virtio_blk]
  #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at:
[<ffffffff81633718>] bitmap_endwrite+0x68/0x240
 irq event stamp: 33518
 hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
 hardirqs last disabled at (33516): [<ffffffff818122ed>]
common_interrupt+0x6d/0x72
 softirqs last  enabled at (33518): [<ffffffff810a1272>]
_local_bh_enable+0x22/0x50
 softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80
 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000000 f90db13964f4ee05 ffff88007d403b80 ffffffff81807b4c
  0000000000000000 ffff88007d403ba8 ffffffff810d4f14 0000000000000000
  0000000000441800 ffff880078fa1780 ffff88007d403c38 ffffffff8180caf2
 Call Trace:
  <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
  [<ffffffff810d4f14>] __might_sleep+0x184/0x240
  [<ffffffff8180caf2>] mutex_lock_nested+0x42/0x440
  [<ffffffff810e1de6>] ? local_clock+0x16/0x30
  [<ffffffff810fc23f>] ? lock_release_holdtime.part.28+0xf/0x200
  [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
  [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
  [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
  [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
  [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
  [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
  [<ffffffff813b46c4>] blk_update_request+0x94/0x420
  [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
  [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
  [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
  [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
  [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
  [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
  [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
  [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
  [<ffffffff81119436>] handle_edge_irq+0x66/0x130
  [<ffffffff8101c3e4>] handle_irq+0x84/0x150
  [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
  [<ffffffff818122f2>] common_interrupt+0x72/0x72
  <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
  [<ffffffff81025454>] default_idle+0x24/0x230
  [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
  [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
  [<ffffffff8104df1b>] start_secondary+0x25b/0x300

 =================================
 [ INFO: inconsistent lock state ]
 3.16.0-0.rc2.git2.1.fc21.x86_64 #1 Not tainted
 ---------------------------------
 inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
 swapper/1/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
  (kernfs_mutex){?.+.+.}, at: [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
 {HARDIRQ-ON-W} state was registered at:
   [<ffffffff811004b2>] __lock_acquire+0x942/0x1ca0
   [<ffffffff811020f4>] lock_acquire+0xa4/0x1d0
   [<ffffffff8180cb35>] mutex_lock_nested+0x85/0x440
   [<ffffffff812d6d6f>] kernfs_activate+0x1f/0xf0
   [<ffffffff812d7140>] kernfs_create_root+0xf0/0x110
   [<ffffffff821f1ddc>] sysfs_init+0x13/0x51
   [<ffffffff821ef5af>] mnt_init+0x107/0x231
   [<ffffffff821ef213>] vfs_caches_init+0x98/0x106
   [<ffffffff821bdfb0>] start_kernel+0x44f/0x4bc
   [<ffffffff821bd4d7>] x86_64_start_reservations+0x2a/0x2c
   [<ffffffff821bd626>] x86_64_start_kernel+0x14d/0x170
 irq event stamp: 33518
 hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
 hardirqs last disabled at (33516): [<ffffffff818122ed>]
common_interrupt+0x6d/0x72
 softirqs last  enabled at (33518): [<ffffffff810a1272>]
_local_bh_enable+0x22/0x50
 softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80

                                              other info that might
help us debug this:
  Possible unsafe locking scenario:
        CPU0
        ----
   lock(kernfs_mutex);
   <Interrupt>
     lock(kernfs_mutex);

                                               *** DEADLOCK ***
 2 locks held by swapper/1/0:
  #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>]
virtblk_done+0x42/0xe0 [virtio_blk]
  #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at:
[<ffffffff81633718>] bitmap_endwrite+0x68/0x240

                                              stack backtrace:
 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000000 f90db13964f4ee05 ffff88007d4039d0 ffffffff81807b4c
  ffff88007bcb19d0 ffff88007d403a20 ffffffff81805358 0000000000000000
  0000000000000000 0000000000000001 ffff88007bcb25a8 ffff88007bcb19d0
 Call Trace:
  <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
  [<ffffffff81805358>] print_usage_bug+0x1f0/0x205
  [<ffffffff810ff450>] mark_lock+0x610/0x6d0
  [<ffffffff810fe520>] ? print_shortest_lock_dependencies+0x1d0/0x1d0
  [<ffffffff81100614>] __lock_acquire+0xaa4/0x1ca0
  [<ffffffff8101dc4d>] ? show_trace_log_lvl+0x4d/0x60
  [<ffffffff8101c8ad>] ? show_stack_log_lvl+0xad/0x1b0
  [<ffffffff811020f4>] lock_acquire+0xa4/0x1d0
  [<ffffffff812d76a0>] ? kernfs_notify+0x90/0x150
  [<ffffffff8180cb35>] mutex_lock_nested+0x85/0x440
  [<ffffffff812d76a0>] ? kernfs_notify+0x90/0x150
  [<ffffffff810fc23f>] ? lock_release_holdtime.part.28+0xf/0x200
  [<ffffffff812d76a0>] ? kernfs_notify+0x90/0x150
  [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
  [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
  [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
  [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
  [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
  [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
  [<ffffffff813b46c4>] blk_update_request+0x94/0x420
  [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
  [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
  [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
  [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
  [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
  [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
  [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
  [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
  [<ffffffff81119436>] handle_edge_irq+0x66/0x130
  [<ffffffff8101c3e4>] handle_irq+0x84/0x150
  [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
  [<ffffffff818122f2>] common_interrupt+0x72/0x72
  <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
  [<ffffffff81025454>] default_idle+0x24/0x230
  [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
  [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
  [<ffffffff8104df1b>] start_secondary+0x25b/0x30

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: virt_blk BUG: sleeping function called from invalid context
  2014-06-27 11:57 virt_blk BUG: sleeping function called from invalid context Josh Boyer
@ 2014-06-29  8:26   ` Michael S. Tsirkin
  0 siblings, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2014-06-29  8:26 UTC (permalink / raw)
  To: Josh Boyer
  Cc: Rusty Russell, virtualization, Linux-Kernel@Vger. Kernel. Org,
	Brian Lane, Jens Axboe, Christoph Hellwig

On Fri, Jun 27, 2014 at 07:57:38AM -0400, Josh Boyer wrote:
> Hi All,
> 
> We've had a report[1] of the virt_blk driver causing a lot of spew
> because it's calling a sleeping function from an invalid context.  The
> backtrace is below.  This is with kernel v3.16-rc2-69-gd91d66e88ea9.

Hi Jens, pls see below - it looks like the call to blk_mq_end_io
from IRQ context is causing the issue.
IIUC you switched virtio to this from __blk_end_request_all in

commit 1cf7e9c68fe84248174e998922b39e508375e7c1
    virtio_blk: blk-mq support

Is this always safe?
I note that at least one other driver is doing this:
drivers/block/mtip32xx/mtip32xx.c

Thanks!

> The reporter is on CC and can give you relevant details.
> 
> josh
> 
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1113805
> 
> [drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:02.0 on minor 0
>  virtio-pci 0000:00:05.0: irq 40 for MSI/MSI-X
>  virtio-pci 0000:00:05.0: irq 41 for MSI/MSI-X
>   vda: vda1 vda2
>  virtio-pci 0000:00:06.0: irq 42 for MSI/MSI-X
>  virtio-pci 0000:00:06.0: irq 43 for MSI/MSI-X
>   vdb: vdb1
>  tsc: Refined TSC clocksource calibration: 3392.169 MHz
>  md: bind<vdb1>
>  mdadm (350) used greatest stack depth: 12384 bytes left
>  md: bind<vda2>
>  md: raid1 personality registered for level 1
>  md/raid1:md127: active with 2 out of 2 mirrors
>  created bitmap (1 pages) for device md127
>  md127: bitmap initialized from disk: read 1 pages, set 0 of 153 bits
>  md127: detected capacity change from 0 to 10203693056
>   md127: unknown partition table
>  systemd-cryptsetup[358]: Set cipher aes, mode xts-plain64, key size
> 512 bits for device
> /dev/disk/by-uuid/6972a564-542d-498b-b3a0-7d71c2e966a2.
>   md127: unknown partition table
>   md127: unknown partition table
>  dracut-initqueue[296]: Scanning devices dm-0  for LVM logical volumes
> fedora-foo1/root fedora-foo1/swap fedora-foo1/root fedora-foo1/swap
>  dracut-initqueue[296]: inactive '/dev/fedora-foo1/root' [7.42 GiB] inherit
>  dracut-initqueue[296]: inactive '/dev/fedora-foo1/swap' [2.00 GiB] inherit
>  systemd-fsck[662]: /dev/mapper/fedora--foo1-root: clean, 22215/486720
> files, 222526/1944576 blocks
>  BUG: sleeping function called from invalid context at
> kernel/locking/mutex.c:586
>  in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
>  2 locks held by swapper/1/0:
>   #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>]
> virtblk_done+0x42/0xe0 [virtio_blk]
>   #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at:
> [<ffffffff81633718>] bitmap_endwrite+0x68/0x240
>  irq event stamp: 33518
>  hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
>  hardirqs last disabled at (33516): [<ffffffff818122ed>]
> common_interrupt+0x6d/0x72
>  softirqs last  enabled at (33518): [<ffffffff810a1272>]
> _local_bh_enable+0x22/0x50
>  softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80
>  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
>  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>   0000000000000000 f90db13964f4ee05 ffff88007d403b80 ffffffff81807b4c
>   0000000000000000 ffff88007d403ba8 ffffffff810d4f14 0000000000000000
>   0000000000441800 ffff880078fa1780 ffff88007d403c38 ffffffff8180caf2
>  Call Trace:
>   <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
>   [<ffffffff810d4f14>] __might_sleep+0x184/0x240
>   [<ffffffff8180caf2>] mutex_lock_nested+0x42/0x440
>   [<ffffffff810e1de6>] ? local_clock+0x16/0x30
>   [<ffffffff810fc23f>] ? lock_release_holdtime.part.28+0xf/0x200
>   [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
>   [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
>   [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
>   [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
>   [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
>   [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
>   [<ffffffff813b46c4>] blk_update_request+0x94/0x420
>   [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
>   [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
>   [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
>   [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
>   [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
>   [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
>   [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
>   [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
>   [<ffffffff81119436>] handle_edge_irq+0x66/0x130
>   [<ffffffff8101c3e4>] handle_irq+0x84/0x150
>   [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
>   [<ffffffff818122f2>] common_interrupt+0x72/0x72
>   <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
>   [<ffffffff81025454>] default_idle+0x24/0x230
>   [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
>   [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
>   [<ffffffff8104df1b>] start_secondary+0x25b/0x300
> 
>  =================================
>  [ INFO: inconsistent lock state ]
>  3.16.0-0.rc2.git2.1.fc21.x86_64 #1 Not tainted
>  ---------------------------------
>  inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
>  swapper/1/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
>   (kernfs_mutex){?.+.+.}, at: [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
>  {HARDIRQ-ON-W} state was registered at:
>    [<ffffffff811004b2>] __lock_acquire+0x942/0x1ca0
>    [<ffffffff811020f4>] lock_acquire+0xa4/0x1d0
>    [<ffffffff8180cb35>] mutex_lock_nested+0x85/0x440
>    [<ffffffff812d6d6f>] kernfs_activate+0x1f/0xf0
>    [<ffffffff812d7140>] kernfs_create_root+0xf0/0x110
>    [<ffffffff821f1ddc>] sysfs_init+0x13/0x51
>    [<ffffffff821ef5af>] mnt_init+0x107/0x231
>    [<ffffffff821ef213>] vfs_caches_init+0x98/0x106
>    [<ffffffff821bdfb0>] start_kernel+0x44f/0x4bc
>    [<ffffffff821bd4d7>] x86_64_start_reservations+0x2a/0x2c
>    [<ffffffff821bd626>] x86_64_start_kernel+0x14d/0x170
>  irq event stamp: 33518
>  hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
>  hardirqs last disabled at (33516): [<ffffffff818122ed>]
> common_interrupt+0x6d/0x72
>  softirqs last  enabled at (33518): [<ffffffff810a1272>]
> _local_bh_enable+0x22/0x50
>  softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80
> 
>                                               other info that might
> help us debug this:
>   Possible unsafe locking scenario:
>         CPU0
>         ----
>    lock(kernfs_mutex);
>    <Interrupt>
>      lock(kernfs_mutex);
> 
>                                                *** DEADLOCK ***
>  2 locks held by swapper/1/0:
>   #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>]
> virtblk_done+0x42/0xe0 [virtio_blk]
>   #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at:
> [<ffffffff81633718>] bitmap_endwrite+0x68/0x240
> 
>                                               stack backtrace:
>  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
>  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>   0000000000000000 f90db13964f4ee05 ffff88007d4039d0 ffffffff81807b4c
>   ffff88007bcb19d0 ffff88007d403a20 ffffffff81805358 0000000000000000
>   0000000000000000 0000000000000001 ffff88007bcb25a8 ffff88007bcb19d0
>  Call Trace:
>   <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
>   [<ffffffff81805358>] print_usage_bug+0x1f0/0x205
>   [<ffffffff810ff450>] mark_lock+0x610/0x6d0
>   [<ffffffff810fe520>] ? print_shortest_lock_dependencies+0x1d0/0x1d0
>   [<ffffffff81100614>] __lock_acquire+0xaa4/0x1ca0
>   [<ffffffff8101dc4d>] ? show_trace_log_lvl+0x4d/0x60
>   [<ffffffff8101c8ad>] ? show_stack_log_lvl+0xad/0x1b0
>   [<ffffffff811020f4>] lock_acquire+0xa4/0x1d0
>   [<ffffffff812d76a0>] ? kernfs_notify+0x90/0x150
>   [<ffffffff8180cb35>] mutex_lock_nested+0x85/0x440
>   [<ffffffff812d76a0>] ? kernfs_notify+0x90/0x150
>   [<ffffffff810fc23f>] ? lock_release_holdtime.part.28+0xf/0x200
>   [<ffffffff812d76a0>] ? kernfs_notify+0x90/0x150
>   [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
>   [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
>   [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
>   [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
>   [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
>   [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
>   [<ffffffff813b46c4>] blk_update_request+0x94/0x420
>   [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
>   [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
>   [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
>   [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
>   [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
>   [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
>   [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
>   [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
>   [<ffffffff81119436>] handle_edge_irq+0x66/0x130
>   [<ffffffff8101c3e4>] handle_irq+0x84/0x150
>   [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
>   [<ffffffff818122f2>] common_interrupt+0x72/0x72
>   <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
>   [<ffffffff81025454>] default_idle+0x24/0x230
>   [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
>   [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
>   [<ffffffff8104df1b>] start_secondary+0x25b/0x30

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: virt_blk BUG: sleeping function called from invalid context
@ 2014-06-29  8:26   ` Michael S. Tsirkin
  0 siblings, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2014-06-29  8:26 UTC (permalink / raw)
  To: Josh Boyer
  Cc: Jens Axboe, Linux-Kernel@Vger. Kernel. Org, virtualization,
	Brian Lane, Christoph Hellwig

On Fri, Jun 27, 2014 at 07:57:38AM -0400, Josh Boyer wrote:
> Hi All,
> 
> We've had a report[1] of the virt_blk driver causing a lot of spew
> because it's calling a sleeping function from an invalid context.  The
> backtrace is below.  This is with kernel v3.16-rc2-69-gd91d66e88ea9.

Hi Jens, pls see below - it looks like the call to blk_mq_end_io
from IRQ context is causing the issue.
IIUC you switched virtio to this from __blk_end_request_all in

commit 1cf7e9c68fe84248174e998922b39e508375e7c1
    virtio_blk: blk-mq support

Is this always safe?
I note that at least one other driver is doing this:
drivers/block/mtip32xx/mtip32xx.c

Thanks!

> The reporter is on CC and can give you relevant details.
> 
> josh
> 
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1113805
> 
> [drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:02.0 on minor 0
>  virtio-pci 0000:00:05.0: irq 40 for MSI/MSI-X
>  virtio-pci 0000:00:05.0: irq 41 for MSI/MSI-X
>   vda: vda1 vda2
>  virtio-pci 0000:00:06.0: irq 42 for MSI/MSI-X
>  virtio-pci 0000:00:06.0: irq 43 for MSI/MSI-X
>   vdb: vdb1
>  tsc: Refined TSC clocksource calibration: 3392.169 MHz
>  md: bind<vdb1>
>  mdadm (350) used greatest stack depth: 12384 bytes left
>  md: bind<vda2>
>  md: raid1 personality registered for level 1
>  md/raid1:md127: active with 2 out of 2 mirrors
>  created bitmap (1 pages) for device md127
>  md127: bitmap initialized from disk: read 1 pages, set 0 of 153 bits
>  md127: detected capacity change from 0 to 10203693056
>   md127: unknown partition table
>  systemd-cryptsetup[358]: Set cipher aes, mode xts-plain64, key size
> 512 bits for device
> /dev/disk/by-uuid/6972a564-542d-498b-b3a0-7d71c2e966a2.
>   md127: unknown partition table
>   md127: unknown partition table
>  dracut-initqueue[296]: Scanning devices dm-0  for LVM logical volumes
> fedora-foo1/root fedora-foo1/swap fedora-foo1/root fedora-foo1/swap
>  dracut-initqueue[296]: inactive '/dev/fedora-foo1/root' [7.42 GiB] inherit
>  dracut-initqueue[296]: inactive '/dev/fedora-foo1/swap' [2.00 GiB] inherit
>  systemd-fsck[662]: /dev/mapper/fedora--foo1-root: clean, 22215/486720
> files, 222526/1944576 blocks
>  BUG: sleeping function called from invalid context at
> kernel/locking/mutex.c:586
>  in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
>  2 locks held by swapper/1/0:
>   #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>]
> virtblk_done+0x42/0xe0 [virtio_blk]
>   #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at:
> [<ffffffff81633718>] bitmap_endwrite+0x68/0x240
>  irq event stamp: 33518
>  hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
>  hardirqs last disabled at (33516): [<ffffffff818122ed>]
> common_interrupt+0x6d/0x72
>  softirqs last  enabled at (33518): [<ffffffff810a1272>]
> _local_bh_enable+0x22/0x50
>  softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80
>  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
>  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>   0000000000000000 f90db13964f4ee05 ffff88007d403b80 ffffffff81807b4c
>   0000000000000000 ffff88007d403ba8 ffffffff810d4f14 0000000000000000
>   0000000000441800 ffff880078fa1780 ffff88007d403c38 ffffffff8180caf2
>  Call Trace:
>   <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
>   [<ffffffff810d4f14>] __might_sleep+0x184/0x240
>   [<ffffffff8180caf2>] mutex_lock_nested+0x42/0x440
>   [<ffffffff810e1de6>] ? local_clock+0x16/0x30
>   [<ffffffff810fc23f>] ? lock_release_holdtime.part.28+0xf/0x200
>   [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
>   [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
>   [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
>   [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
>   [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
>   [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
>   [<ffffffff813b46c4>] blk_update_request+0x94/0x420
>   [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
>   [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
>   [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
>   [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
>   [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
>   [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
>   [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
>   [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
>   [<ffffffff81119436>] handle_edge_irq+0x66/0x130
>   [<ffffffff8101c3e4>] handle_irq+0x84/0x150
>   [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
>   [<ffffffff818122f2>] common_interrupt+0x72/0x72
>   <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
>   [<ffffffff81025454>] default_idle+0x24/0x230
>   [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
>   [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
>   [<ffffffff8104df1b>] start_secondary+0x25b/0x300
> 
>  =================================
>  [ INFO: inconsistent lock state ]
>  3.16.0-0.rc2.git2.1.fc21.x86_64 #1 Not tainted
>  ---------------------------------
>  inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
>  swapper/1/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
>   (kernfs_mutex){?.+.+.}, at: [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
>  {HARDIRQ-ON-W} state was registered at:
>    [<ffffffff811004b2>] __lock_acquire+0x942/0x1ca0
>    [<ffffffff811020f4>] lock_acquire+0xa4/0x1d0
>    [<ffffffff8180cb35>] mutex_lock_nested+0x85/0x440
>    [<ffffffff812d6d6f>] kernfs_activate+0x1f/0xf0
>    [<ffffffff812d7140>] kernfs_create_root+0xf0/0x110
>    [<ffffffff821f1ddc>] sysfs_init+0x13/0x51
>    [<ffffffff821ef5af>] mnt_init+0x107/0x231
>    [<ffffffff821ef213>] vfs_caches_init+0x98/0x106
>    [<ffffffff821bdfb0>] start_kernel+0x44f/0x4bc
>    [<ffffffff821bd4d7>] x86_64_start_reservations+0x2a/0x2c
>    [<ffffffff821bd626>] x86_64_start_kernel+0x14d/0x170
>  irq event stamp: 33518
>  hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
>  hardirqs last disabled at (33516): [<ffffffff818122ed>]
> common_interrupt+0x6d/0x72
>  softirqs last  enabled at (33518): [<ffffffff810a1272>]
> _local_bh_enable+0x22/0x50
>  softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80
> 
>                                               other info that might
> help us debug this:
>   Possible unsafe locking scenario:
>         CPU0
>         ----
>    lock(kernfs_mutex);
>    <Interrupt>
>      lock(kernfs_mutex);
> 
>                                                *** DEADLOCK ***
>  2 locks held by swapper/1/0:
>   #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>]
> virtblk_done+0x42/0xe0 [virtio_blk]
>   #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at:
> [<ffffffff81633718>] bitmap_endwrite+0x68/0x240
> 
>                                               stack backtrace:
>  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
>  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>   0000000000000000 f90db13964f4ee05 ffff88007d4039d0 ffffffff81807b4c
>   ffff88007bcb19d0 ffff88007d403a20 ffffffff81805358 0000000000000000
>   0000000000000000 0000000000000001 ffff88007bcb25a8 ffff88007bcb19d0
>  Call Trace:
>   <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
>   [<ffffffff81805358>] print_usage_bug+0x1f0/0x205
>   [<ffffffff810ff450>] mark_lock+0x610/0x6d0
>   [<ffffffff810fe520>] ? print_shortest_lock_dependencies+0x1d0/0x1d0
>   [<ffffffff81100614>] __lock_acquire+0xaa4/0x1ca0
>   [<ffffffff8101dc4d>] ? show_trace_log_lvl+0x4d/0x60
>   [<ffffffff8101c8ad>] ? show_stack_log_lvl+0xad/0x1b0
>   [<ffffffff811020f4>] lock_acquire+0xa4/0x1d0
>   [<ffffffff812d76a0>] ? kernfs_notify+0x90/0x150
>   [<ffffffff8180cb35>] mutex_lock_nested+0x85/0x440
>   [<ffffffff812d76a0>] ? kernfs_notify+0x90/0x150
>   [<ffffffff810fc23f>] ? lock_release_holdtime.part.28+0xf/0x200
>   [<ffffffff812d76a0>] ? kernfs_notify+0x90/0x150
>   [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
>   [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
>   [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
>   [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
>   [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
>   [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
>   [<ffffffff813b46c4>] blk_update_request+0x94/0x420
>   [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
>   [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
>   [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
>   [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
>   [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
>   [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
>   [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
>   [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
>   [<ffffffff81119436>] handle_edge_irq+0x66/0x130
>   [<ffffffff8101c3e4>] handle_irq+0x84/0x150
>   [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
>   [<ffffffff818122f2>] common_interrupt+0x72/0x72
>   <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
>   [<ffffffff81025454>] default_idle+0x24/0x230
>   [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
>   [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
>   [<ffffffff8104df1b>] start_secondary+0x25b/0x30

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: virt_blk BUG: sleeping function called from invalid context
  2014-06-29  8:26   ` Michael S. Tsirkin
  (?)
@ 2014-06-29 19:32   ` Christoph Hellwig
  2014-06-29 20:47       ` Michael S. Tsirkin
  -1 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2014-06-29 19:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Josh Boyer, Rusty Russell, virtualization,
	Linux-Kernel@Vger. Kernel. Org, Brian Lane, Jens Axboe,
	Christoph Hellwig

On Sun, Jun 29, 2014 at 11:26:37AM +0300, Michael S. Tsirkin wrote:
> On Fri, Jun 27, 2014 at 07:57:38AM -0400, Josh Boyer wrote:
> > Hi All,
> > 
> > We've had a report[1] of the virt_blk driver causing a lot of spew
> > because it's calling a sleeping function from an invalid context.  The
> > backtrace is below.  This is with kernel v3.16-rc2-69-gd91d66e88ea9.
> 
> Hi Jens, pls see below - it looks like the call to blk_mq_end_io
> from IRQ context is causing the issue.
> IIUC you switched virtio to this from __blk_end_request_all in
> 
> commit 1cf7e9c68fe84248174e998922b39e508375e7c1
>     virtio_blk: blk-mq support
> 
> Is this always safe?
> I note that at least one other driver is doing this:
> drivers/block/mtip32xx/mtip32xx.c

Just like __blk_end_request_all blk_mq_end_io is supposed to be called
from irq context.  The problem is that the MD bio end_io handler is calling
a sleeping function.  Not sure if that's a bug in MD though given the
kernfs symbols in the all trace and the recent churn in that area.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: virt_blk BUG: sleeping function called from invalid context
  2014-06-29  8:26   ` Michael S. Tsirkin
  (?)
  (?)
@ 2014-06-29 19:32   ` Christoph Hellwig
  -1 siblings, 0 replies; 22+ messages in thread
From: Christoph Hellwig @ 2014-06-29 19:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jens Axboe, Josh Boyer, Linux-Kernel@Vger. Kernel. Org,
	virtualization, Brian Lane, Christoph Hellwig

On Sun, Jun 29, 2014 at 11:26:37AM +0300, Michael S. Tsirkin wrote:
> On Fri, Jun 27, 2014 at 07:57:38AM -0400, Josh Boyer wrote:
> > Hi All,
> > 
> > We've had a report[1] of the virt_blk driver causing a lot of spew
> > because it's calling a sleeping function from an invalid context.  The
> > backtrace is below.  This is with kernel v3.16-rc2-69-gd91d66e88ea9.
> 
> Hi Jens, pls see below - it looks like the call to blk_mq_end_io
> from IRQ context is causing the issue.
> IIUC you switched virtio to this from __blk_end_request_all in
> 
> commit 1cf7e9c68fe84248174e998922b39e508375e7c1
>     virtio_blk: blk-mq support
> 
> Is this always safe?
> I note that at least one other driver is doing this:
> drivers/block/mtip32xx/mtip32xx.c

Just like __blk_end_request_all blk_mq_end_io is supposed to be called
from irq context.  The problem is that the MD bio end_io handler is calling
a sleeping function.  Not sure if that's a bug in MD though given the
kernfs symbols in the all trace and the recent churn in that area.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: virt_blk BUG: sleeping function called from invalid context
  2014-06-29 19:32   ` Christoph Hellwig
@ 2014-06-29 20:47       ` Michael S. Tsirkin
  0 siblings, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2014-06-29 20:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Josh Boyer, Rusty Russell, virtualization,
	Linux-Kernel@Vger. Kernel. Org, Brian Lane, Jens Axboe,
	Tejun Heo, John McCutchan, Robert Love, Eric Paris,
	Greg Kroah-Hartman

On Sun, Jun 29, 2014 at 09:32:22PM +0200, Christoph Hellwig wrote:
> On Sun, Jun 29, 2014 at 11:26:37AM +0300, Michael S. Tsirkin wrote:
> > On Fri, Jun 27, 2014 at 07:57:38AM -0400, Josh Boyer wrote:
> > > Hi All,
> > > 
> > > We've had a report[1] of the virt_blk driver causing a lot of spew
> > > because it's calling a sleeping function from an invalid context.  The
> > > backtrace is below.  This is with kernel v3.16-rc2-69-gd91d66e88ea9.
> > 
> > Hi Jens, pls see below - it looks like the call to blk_mq_end_io
> > from IRQ context is causing the issue.
> > IIUC you switched virtio to this from __blk_end_request_all in
> > 
> > commit 1cf7e9c68fe84248174e998922b39e508375e7c1
> >     virtio_blk: blk-mq support
> > 
> > Is this always safe?
> > I note that at least one other driver is doing this:
> > drivers/block/mtip32xx/mtip32xx.c
> 
> Just like __blk_end_request_all blk_mq_end_io is supposed to be called
> from irq context.  The problem is that the MD bio end_io handler is calling
> a sleeping function.  Not sure if that's a bug in MD though given the
> kernfs symbols in the all trace and the recent churn in that area.

My understanding is this:

bitmap_endwrite -> calls sysfs_notify_dirent_safe under spinlock
 -> calls kernfs_notify which takes a mutex.

So I am guessing it is this commit:

commit d911d98748018f7c8facc035ba39c30f5cce6f9c
Author: Tejun Heo <tj@kernel.org>
Date:   Wed Apr 9 11:07:31 2014 -0400

    kernfs: make kernfs_notify() trigger inotify events too

Tejun, what do you think?

Josh, Brian, could you try reverting that commit to see if it helps?

-- 
MST

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: virt_blk BUG: sleeping function called from invalid context
@ 2014-06-29 20:47       ` Michael S. Tsirkin
  0 siblings, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2014-06-29 20:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Josh Boyer, Robert Love, Greg Kroah-Hartman,
	Linux-Kernel@Vger. Kernel. Org, virtualization, Tejun Heo,
	Eric Paris, Brian Lane, John McCutchan

On Sun, Jun 29, 2014 at 09:32:22PM +0200, Christoph Hellwig wrote:
> On Sun, Jun 29, 2014 at 11:26:37AM +0300, Michael S. Tsirkin wrote:
> > On Fri, Jun 27, 2014 at 07:57:38AM -0400, Josh Boyer wrote:
> > > Hi All,
> > > 
> > > We've had a report[1] of the virt_blk driver causing a lot of spew
> > > because it's calling a sleeping function from an invalid context.  The
> > > backtrace is below.  This is with kernel v3.16-rc2-69-gd91d66e88ea9.
> > 
> > Hi Jens, pls see below - it looks like the call to blk_mq_end_io
> > from IRQ context is causing the issue.
> > IIUC you switched virtio to this from __blk_end_request_all in
> > 
> > commit 1cf7e9c68fe84248174e998922b39e508375e7c1
> >     virtio_blk: blk-mq support
> > 
> > Is this always safe?
> > I note that at least one other driver is doing this:
> > drivers/block/mtip32xx/mtip32xx.c
> 
> Just like __blk_end_request_all blk_mq_end_io is supposed to be called
> from irq context.  The problem is that the MD bio end_io handler is calling
> a sleeping function.  Not sure if that's a bug in MD though given the
> kernfs symbols in the all trace and the recent churn in that area.

My understanding is this:

bitmap_endwrite -> calls sysfs_notify_dirent_safe under spinlock
 -> calls kernfs_notify which takes a mutex.

So I am guessing it is this commit:

commit d911d98748018f7c8facc035ba39c30f5cce6f9c
Author: Tejun Heo <tj@kernel.org>
Date:   Wed Apr 9 11:07:31 2014 -0400

    kernfs: make kernfs_notify() trigger inotify events too

Tejun, what do you think?

Josh, Brian, could you try reverting that commit to see if it helps?

-- 
MST

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: virt_blk BUG: sleeping function called from invalid context
  2014-06-29 20:47       ` Michael S. Tsirkin
@ 2014-06-29 20:55         ` Jens Axboe
  -1 siblings, 0 replies; 22+ messages in thread
From: Jens Axboe @ 2014-06-29 20:55 UTC (permalink / raw)
  To: Michael S. Tsirkin, Christoph Hellwig
  Cc: Josh Boyer, Rusty Russell, virtualization,
	Linux-Kernel@Vger. Kernel. Org, Brian Lane, Tejun Heo,
	John McCutchan, Robert Love, Eric Paris, Greg Kroah-Hartman

On 06/29/2014 02:47 PM, Michael S. Tsirkin wrote:
> On Sun, Jun 29, 2014 at 09:32:22PM +0200, Christoph Hellwig wrote:
>> On Sun, Jun 29, 2014 at 11:26:37AM +0300, Michael S. Tsirkin wrote:
>>> On Fri, Jun 27, 2014 at 07:57:38AM -0400, Josh Boyer wrote:
>>>> Hi All,
>>>>
>>>> We've had a report[1] of the virt_blk driver causing a lot of spew
>>>> because it's calling a sleeping function from an invalid context.  The
>>>> backtrace is below.  This is with kernel v3.16-rc2-69-gd91d66e88ea9.
>>>
>>> Hi Jens, pls see below - it looks like the call to blk_mq_end_io
>>> from IRQ context is causing the issue.
>>> IIUC you switched virtio to this from __blk_end_request_all in
>>>
>>> commit 1cf7e9c68fe84248174e998922b39e508375e7c1
>>>     virtio_blk: blk-mq support
>>>
>>> Is this always safe?
>>> I note that at least one other driver is doing this:
>>> drivers/block/mtip32xx/mtip32xx.c
>>
>> Just like __blk_end_request_all blk_mq_end_io is supposed to be called
>> from irq context.  The problem is that the MD bio end_io handler is calling
>> a sleeping function.  Not sure if that's a bug in MD though given the
>> kernfs symbols in the all trace and the recent churn in that area.
> 
> My understanding is this:
> 
> bitmap_endwrite -> calls sysfs_notify_dirent_safe under spinlock
>  -> calls kernfs_notify which takes a mutex.
> 
> So I am guessing it is this commit:
> 
> commit d911d98748018f7c8facc035ba39c30f5cce6f9c
> Author: Tejun Heo <tj@kernel.org>
> Date:   Wed Apr 9 11:07:31 2014 -0400
> 
>     kernfs: make kernfs_notify() trigger inotify events too
> 
> Tejun, what do you think?
> 
> Josh, Brian, could you try reverting that commit to see if it helps?

That definitely be a bug. If you need to block off ->bi_end_io(), just
must punt to a worker thread.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: virt_blk BUG: sleeping function called from invalid context
@ 2014-06-29 20:55         ` Jens Axboe
  0 siblings, 0 replies; 22+ messages in thread
From: Jens Axboe @ 2014-06-29 20:55 UTC (permalink / raw)
  To: Michael S. Tsirkin, Christoph Hellwig
  Cc: Josh Boyer, Robert Love, Greg Kroah-Hartman,
	Linux-Kernel@Vger. Kernel. Org, virtualization, Tejun Heo,
	Eric Paris, Brian Lane, John McCutchan

On 06/29/2014 02:47 PM, Michael S. Tsirkin wrote:
> On Sun, Jun 29, 2014 at 09:32:22PM +0200, Christoph Hellwig wrote:
>> On Sun, Jun 29, 2014 at 11:26:37AM +0300, Michael S. Tsirkin wrote:
>>> On Fri, Jun 27, 2014 at 07:57:38AM -0400, Josh Boyer wrote:
>>>> Hi All,
>>>>
>>>> We've had a report[1] of the virt_blk driver causing a lot of spew
>>>> because it's calling a sleeping function from an invalid context.  The
>>>> backtrace is below.  This is with kernel v3.16-rc2-69-gd91d66e88ea9.
>>>
>>> Hi Jens, pls see below - it looks like the call to blk_mq_end_io
>>> from IRQ context is causing the issue.
>>> IIUC you switched virtio to this from __blk_end_request_all in
>>>
>>> commit 1cf7e9c68fe84248174e998922b39e508375e7c1
>>>     virtio_blk: blk-mq support
>>>
>>> Is this always safe?
>>> I note that at least one other driver is doing this:
>>> drivers/block/mtip32xx/mtip32xx.c
>>
>> Just like __blk_end_request_all blk_mq_end_io is supposed to be called
>> from irq context.  The problem is that the MD bio end_io handler is calling
>> a sleeping function.  Not sure if that's a bug in MD though given the
>> kernfs symbols in the all trace and the recent churn in that area.
> 
> My understanding is this:
> 
> bitmap_endwrite -> calls sysfs_notify_dirent_safe under spinlock
>  -> calls kernfs_notify which takes a mutex.
> 
> So I am guessing it is this commit:
> 
> commit d911d98748018f7c8facc035ba39c30f5cce6f9c
> Author: Tejun Heo <tj@kernel.org>
> Date:   Wed Apr 9 11:07:31 2014 -0400
> 
>     kernfs: make kernfs_notify() trigger inotify events too
> 
> Tejun, what do you think?
> 
> Josh, Brian, could you try reverting that commit to see if it helps?

That definitely be a bug. If you need to block off ->bi_end_io(), just
must punt to a worker thread.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: virt_blk BUG: sleeping function called from invalid context
  2014-06-29 20:55         ` Jens Axboe
@ 2014-06-30 20:17           ` Tejun Heo
  -1 siblings, 0 replies; 22+ messages in thread
From: Tejun Heo @ 2014-06-30 20:17 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Michael S. Tsirkin, Christoph Hellwig, Josh Boyer, Rusty Russell,
	virtualization, Linux-Kernel@Vger. Kernel. Org, Brian Lane,
	John McCutchan, Robert Love, Eric Paris, Greg Kroah-Hartman

On Sun, Jun 29, 2014 at 02:55:36PM -0600, Jens Axboe wrote:
> > commit d911d98748018f7c8facc035ba39c30f5cce6f9c
> > Author: Tejun Heo <tj@kernel.org>
> > Date:   Wed Apr 9 11:07:31 2014 -0400
> > 
> >     kernfs: make kernfs_notify() trigger inotify events too
> > 
> > Tejun, what do you think?
> > 
> > Josh, Brian, could you try reverting that commit to see if it helps?
> 
> That definitely be a bug. If you need to block off ->bi_end_io(), just
> must punt to a worker thread.

Ugh... didn't realize we had users triggering sysfs notifications from
an atomic context.  fsnotify support requires sleepable context.
Guess I'll have to punt it to a work item. :(

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: virt_blk BUG: sleeping function called from invalid context
@ 2014-06-30 20:17           ` Tejun Heo
  0 siblings, 0 replies; 22+ messages in thread
From: Tejun Heo @ 2014-06-30 20:17 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Josh Boyer, Michael S. Tsirkin, John McCutchan,
	Greg Kroah-Hartman, Linux-Kernel@Vger. Kernel. Org,
	virtualization, Robert Love, Eric Paris, Brian Lane,
	Christoph Hellwig

On Sun, Jun 29, 2014 at 02:55:36PM -0600, Jens Axboe wrote:
> > commit d911d98748018f7c8facc035ba39c30f5cce6f9c
> > Author: Tejun Heo <tj@kernel.org>
> > Date:   Wed Apr 9 11:07:31 2014 -0400
> > 
> >     kernfs: make kernfs_notify() trigger inotify events too
> > 
> > Tejun, what do you think?
> > 
> > Josh, Brian, could you try reverting that commit to see if it helps?
> 
> That definitely be a bug. If you need to block off ->bi_end_io(), just
> must punt to a worker thread.

Ugh... didn't realize we had users triggering sysfs notifications from
an atomic context.  fsnotify support requires sleepable context.
Guess I'll have to punt it to a work item. :(

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH driver-core-linus] kernfs: kernfs_notify() must be useable from non-sleepable contexts
  2014-06-30 20:17           ` Tejun Heo
@ 2014-07-01 20:41             ` Tejun Heo
  -1 siblings, 0 replies; 22+ messages in thread
From: Tejun Heo @ 2014-07-01 20:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Jens Axboe, Michael S. Tsirkin, Christoph Hellwig, Josh Boyer,
	Rusty Russell, virtualization, Linux-Kernel@Vger. Kernel. Org,
	Brian Lane, John McCutchan, Robert Love, Eric Paris

d911d9874801 ("kernfs: make kernfs_notify() trigger inotify events
too") added fsnotify triggering to kernfs_notify() which requires a
sleepable context.  There are already existing users of
kernfs_notify() which invoke it from an atomic context and in general
it's silly to require a sleepable context for triggering a
notification.

The following is an invalid context bug triggerd by md invoking
sysfs_notify() from IO completion path.

 BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586
 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
 2 locks held by swapper/1/0:
  #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>] virtblk_done+0x42/0xe0 [virtio_blk]
  #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at: [<ffffffff81633718>] bitmap_endwrite+0x68/0x240
 irq event stamp: 33518
 hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
 hardirqs last disabled at (33516): [<ffffffff818122ed>] common_interrupt+0x6d/0x72
 softirqs last  enabled at (33518): [<ffffffff810a1272>] _local_bh_enable+0x22/0x50
 softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80
 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000000 f90db13964f4ee05 ffff88007d403b80 ffffffff81807b4c
  0000000000000000 ffff88007d403ba8 ffffffff810d4f14 0000000000000000
  0000000000441800 ffff880078fa1780 ffff88007d403c38 ffffffff8180caf2
 Call Trace:
  <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
  [<ffffffff810d4f14>] __might_sleep+0x184/0x240
  [<ffffffff8180caf2>] mutex_lock_nested+0x42/0x440
  [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
  [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
  [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
  [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
  [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
  [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
  [<ffffffff813b46c4>] blk_update_request+0x94/0x420
  [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
  [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
  [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
  [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
  [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
  [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
  [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
  [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
  [<ffffffff81119436>] handle_edge_irq+0x66/0x130
  [<ffffffff8101c3e4>] handle_irq+0x84/0x150
  [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
  [<ffffffff818122f2>] common_interrupt+0x72/0x72
  <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
  [<ffffffff81025454>] default_idle+0x24/0x230
  [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
  [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
  [<ffffffff8104df1b>] start_secondary+0x25b/0x300

This patch fixes it by punting the notification delivery through a
work item.  This ends up adding an extra pointer to kernfs_elem_attr
enlarging kernfs_node by a pointer, which is not ideal but not a very
big deal either.  If this turns out to be an actual issue, we can move
kernfs_elem_attr->size to kernfs_node->iattr later.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
---
 fs/kernfs/file.c       |   69 +++++++++++++++++++++++++++++++++++++++----------
 include/linux/kernfs.h |    1 
 2 files changed, 56 insertions(+), 14 deletions(-)

--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -39,6 +39,19 @@ struct kernfs_open_node {
 	struct list_head	files; /* goes through kernfs_open_file.list */
 };
 
+/*
+ * kernfs_notify() may be called from any context and bounces notifications
+ * through a work item.  To minimize space overhead in kernfs_node, the
+ * pending queue is implemented as a singly linked list of kernfs_nodes.
+ * The list is terminated with the self pointer so that whether a
+ * kernfs_node is on the list or not can be determined by testing the next
+ * pointer for NULL.
+ */
+#define KERNFS_NOTIFY_EOL			((void *)&kernfs_notify_list)
+
+static DEFINE_SPINLOCK(kernfs_notify_lock);
+static struct kernfs_node *kernfs_notify_list = KERNFS_NOTIFY_EOL;
+
 static struct kernfs_open_file *kernfs_of(struct file *file)
 {
 	return ((struct seq_file *)file->private_data)->private;
@@ -783,24 +796,25 @@ static unsigned int kernfs_fop_poll(stru
 	return DEFAULT_POLLMASK|POLLERR|POLLPRI;
 }
 
-/**
- * kernfs_notify - notify a kernfs file
- * @kn: file to notify
- *
- * Notify @kn such that poll(2) on @kn wakes up.
- */
-void kernfs_notify(struct kernfs_node *kn)
+static void kernfs_notify_workfn(struct work_struct *work)
 {
-	struct kernfs_root *root = kernfs_root(kn);
+	struct kernfs_node *kn;
 	struct kernfs_open_node *on;
 	struct kernfs_super_info *info;
-	unsigned long flags;
-
-	if (WARN_ON(kernfs_type(kn) != KERNFS_FILE))
+repeat:
+	/* pop one off the notify_list */
+	spin_lock_irq(&kernfs_notify_lock);
+	kn = kernfs_notify_list;
+	if (kn == KERNFS_NOTIFY_EOL) {
+		spin_unlock_irq(&kernfs_notify_lock);
 		return;
+	}
+	kernfs_notify_list = kn->attr.notify_next;
+	kn->attr.notify_next = NULL;
+	spin_unlock_irq(&kernfs_notify_lock);
 
 	/* kick poll */
-	spin_lock_irqsave(&kernfs_open_node_lock, flags);
+	spin_lock_irq(&kernfs_open_node_lock);
 
 	on = kn->attr.open;
 	if (on) {
@@ -808,12 +822,12 @@ void kernfs_notify(struct kernfs_node *k
 		wake_up_interruptible(&on->poll);
 	}
 
-	spin_unlock_irqrestore(&kernfs_open_node_lock, flags);
+	spin_unlock_irq(&kernfs_open_node_lock);
 
 	/* kick fsnotify */
 	mutex_lock(&kernfs_mutex);
 
-	list_for_each_entry(info, &root->supers, node) {
+	list_for_each_entry(info, &kernfs_root(kn)->supers, node) {
 		struct inode *inode;
 		struct dentry *dentry;
 
@@ -833,6 +847,33 @@ void kernfs_notify(struct kernfs_node *k
 	}
 
 	mutex_unlock(&kernfs_mutex);
+	kernfs_put(kn);
+	goto repeat;
+}
+
+/**
+ * kernfs_notify - notify a kernfs file
+ * @kn: file to notify
+ *
+ * Notify @kn such that poll(2) on @kn wakes up.  Maybe be called from any
+ * context.
+ */
+void kernfs_notify(struct kernfs_node *kn)
+{
+	static DECLARE_WORK(kernfs_notify_work, kernfs_notify_workfn);
+	unsigned long flags;
+
+	if (WARN_ON(kernfs_type(kn) != KERNFS_FILE))
+		return;
+
+	spin_lock_irqsave(&kernfs_notify_lock, flags);
+	if (!kn->attr.notify_next) {
+		kernfs_get(kn);
+		kn->attr.notify_next = kernfs_notify_list;
+		kernfs_notify_list = kn;
+		schedule_work(&kernfs_notify_work);
+	}
+	spin_unlock_irqrestore(&kernfs_notify_lock, flags);
 }
 EXPORT_SYMBOL_GPL(kernfs_notify);
 
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -91,6 +91,7 @@ struct kernfs_elem_attr {
 	const struct kernfs_ops	*ops;
 	struct kernfs_open_node	*open;
 	loff_t			size;
+	struct kernfs_node	*notify_next;	/* for kernfs_notify() */
 };
 
 /*

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH driver-core-linus] kernfs: kernfs_notify() must be useable from non-sleepable contexts
@ 2014-07-01 20:41             ` Tejun Heo
  0 siblings, 0 replies; 22+ messages in thread
From: Tejun Heo @ 2014-07-01 20:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Jens Axboe, Josh Boyer, Michael S. Tsirkin, John McCutchan,
	Linux-Kernel@Vger. Kernel. Org, virtualization, Robert Love,
	Eric Paris, Brian Lane, Christoph Hellwig

d911d9874801 ("kernfs: make kernfs_notify() trigger inotify events
too") added fsnotify triggering to kernfs_notify() which requires a
sleepable context.  There are already existing users of
kernfs_notify() which invoke it from an atomic context and in general
it's silly to require a sleepable context for triggering a
notification.

The following is an invalid context bug triggerd by md invoking
sysfs_notify() from IO completion path.

 BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586
 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
 2 locks held by swapper/1/0:
  #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>] virtblk_done+0x42/0xe0 [virtio_blk]
  #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at: [<ffffffff81633718>] bitmap_endwrite+0x68/0x240
 irq event stamp: 33518
 hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
 hardirqs last disabled at (33516): [<ffffffff818122ed>] common_interrupt+0x6d/0x72
 softirqs last  enabled at (33518): [<ffffffff810a1272>] _local_bh_enable+0x22/0x50
 softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80
 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000000 f90db13964f4ee05 ffff88007d403b80 ffffffff81807b4c
  0000000000000000 ffff88007d403ba8 ffffffff810d4f14 0000000000000000
  0000000000441800 ffff880078fa1780 ffff88007d403c38 ffffffff8180caf2
 Call Trace:
  <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
  [<ffffffff810d4f14>] __might_sleep+0x184/0x240
  [<ffffffff8180caf2>] mutex_lock_nested+0x42/0x440
  [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
  [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
  [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
  [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
  [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
  [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
  [<ffffffff813b46c4>] blk_update_request+0x94/0x420
  [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
  [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
  [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
  [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
  [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
  [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
  [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
  [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
  [<ffffffff81119436>] handle_edge_irq+0x66/0x130
  [<ffffffff8101c3e4>] handle_irq+0x84/0x150
  [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
  [<ffffffff818122f2>] common_interrupt+0x72/0x72
  <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
  [<ffffffff81025454>] default_idle+0x24/0x230
  [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
  [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
  [<ffffffff8104df1b>] start_secondary+0x25b/0x300

This patch fixes it by punting the notification delivery through a
work item.  This ends up adding an extra pointer to kernfs_elem_attr
enlarging kernfs_node by a pointer, which is not ideal but not a very
big deal either.  If this turns out to be an actual issue, we can move
kernfs_elem_attr->size to kernfs_node->iattr later.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
---
 fs/kernfs/file.c       |   69 +++++++++++++++++++++++++++++++++++++++----------
 include/linux/kernfs.h |    1 
 2 files changed, 56 insertions(+), 14 deletions(-)

--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -39,6 +39,19 @@ struct kernfs_open_node {
 	struct list_head	files; /* goes through kernfs_open_file.list */
 };
 
+/*
+ * kernfs_notify() may be called from any context and bounces notifications
+ * through a work item.  To minimize space overhead in kernfs_node, the
+ * pending queue is implemented as a singly linked list of kernfs_nodes.
+ * The list is terminated with the self pointer so that whether a
+ * kernfs_node is on the list or not can be determined by testing the next
+ * pointer for NULL.
+ */
+#define KERNFS_NOTIFY_EOL			((void *)&kernfs_notify_list)
+
+static DEFINE_SPINLOCK(kernfs_notify_lock);
+static struct kernfs_node *kernfs_notify_list = KERNFS_NOTIFY_EOL;
+
 static struct kernfs_open_file *kernfs_of(struct file *file)
 {
 	return ((struct seq_file *)file->private_data)->private;
@@ -783,24 +796,25 @@ static unsigned int kernfs_fop_poll(stru
 	return DEFAULT_POLLMASK|POLLERR|POLLPRI;
 }
 
-/**
- * kernfs_notify - notify a kernfs file
- * @kn: file to notify
- *
- * Notify @kn such that poll(2) on @kn wakes up.
- */
-void kernfs_notify(struct kernfs_node *kn)
+static void kernfs_notify_workfn(struct work_struct *work)
 {
-	struct kernfs_root *root = kernfs_root(kn);
+	struct kernfs_node *kn;
 	struct kernfs_open_node *on;
 	struct kernfs_super_info *info;
-	unsigned long flags;
-
-	if (WARN_ON(kernfs_type(kn) != KERNFS_FILE))
+repeat:
+	/* pop one off the notify_list */
+	spin_lock_irq(&kernfs_notify_lock);
+	kn = kernfs_notify_list;
+	if (kn == KERNFS_NOTIFY_EOL) {
+		spin_unlock_irq(&kernfs_notify_lock);
 		return;
+	}
+	kernfs_notify_list = kn->attr.notify_next;
+	kn->attr.notify_next = NULL;
+	spin_unlock_irq(&kernfs_notify_lock);
 
 	/* kick poll */
-	spin_lock_irqsave(&kernfs_open_node_lock, flags);
+	spin_lock_irq(&kernfs_open_node_lock);
 
 	on = kn->attr.open;
 	if (on) {
@@ -808,12 +822,12 @@ void kernfs_notify(struct kernfs_node *k
 		wake_up_interruptible(&on->poll);
 	}
 
-	spin_unlock_irqrestore(&kernfs_open_node_lock, flags);
+	spin_unlock_irq(&kernfs_open_node_lock);
 
 	/* kick fsnotify */
 	mutex_lock(&kernfs_mutex);
 
-	list_for_each_entry(info, &root->supers, node) {
+	list_for_each_entry(info, &kernfs_root(kn)->supers, node) {
 		struct inode *inode;
 		struct dentry *dentry;
 
@@ -833,6 +847,33 @@ void kernfs_notify(struct kernfs_node *k
 	}
 
 	mutex_unlock(&kernfs_mutex);
+	kernfs_put(kn);
+	goto repeat;
+}
+
+/**
+ * kernfs_notify - notify a kernfs file
+ * @kn: file to notify
+ *
+ * Notify @kn such that poll(2) on @kn wakes up.  Maybe be called from any
+ * context.
+ */
+void kernfs_notify(struct kernfs_node *kn)
+{
+	static DECLARE_WORK(kernfs_notify_work, kernfs_notify_workfn);
+	unsigned long flags;
+
+	if (WARN_ON(kernfs_type(kn) != KERNFS_FILE))
+		return;
+
+	spin_lock_irqsave(&kernfs_notify_lock, flags);
+	if (!kn->attr.notify_next) {
+		kernfs_get(kn);
+		kn->attr.notify_next = kernfs_notify_list;
+		kernfs_notify_list = kn;
+		schedule_work(&kernfs_notify_work);
+	}
+	spin_unlock_irqrestore(&kernfs_notify_lock, flags);
 }
 EXPORT_SYMBOL_GPL(kernfs_notify);
 
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -91,6 +91,7 @@ struct kernfs_elem_attr {
 	const struct kernfs_ops	*ops;
 	struct kernfs_open_node	*open;
 	loff_t			size;
+	struct kernfs_node	*notify_next;	/* for kernfs_notify() */
 };
 
 /*

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH driver-core-linus] kernfs: kernfs_notify() must be useable from non-sleepable contexts
  2014-07-01 20:41             ` Tejun Heo
@ 2014-07-01 20:51               ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2014-07-01 20:51 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, Michael S. Tsirkin, Christoph Hellwig, Josh Boyer,
	Rusty Russell, virtualization, Linux-Kernel@Vger. Kernel. Org,
	Brian Lane, John McCutchan, Robert Love, Eric Paris

On Tue, Jul 01, 2014 at 04:41:03PM -0400, Tejun Heo wrote:
> d911d9874801 ("kernfs: make kernfs_notify() trigger inotify events
> too") added fsnotify triggering to kernfs_notify() which requires a
> sleepable context.  There are already existing users of
> kernfs_notify() which invoke it from an atomic context and in general
> it's silly to require a sleepable context for triggering a
> notification.
> 
> The following is an invalid context bug triggerd by md invoking
> sysfs_notify() from IO completion path.
> 
>  BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586
>  in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
>  2 locks held by swapper/1/0:
>   #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>] virtblk_done+0x42/0xe0 [virtio_blk]
>   #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at: [<ffffffff81633718>] bitmap_endwrite+0x68/0x240
>  irq event stamp: 33518
>  hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
>  hardirqs last disabled at (33516): [<ffffffff818122ed>] common_interrupt+0x6d/0x72
>  softirqs last  enabled at (33518): [<ffffffff810a1272>] _local_bh_enable+0x22/0x50
>  softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80
>  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
>  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>   0000000000000000 f90db13964f4ee05 ffff88007d403b80 ffffffff81807b4c
>   0000000000000000 ffff88007d403ba8 ffffffff810d4f14 0000000000000000
>   0000000000441800 ffff880078fa1780 ffff88007d403c38 ffffffff8180caf2
>  Call Trace:
>   <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
>   [<ffffffff810d4f14>] __might_sleep+0x184/0x240
>   [<ffffffff8180caf2>] mutex_lock_nested+0x42/0x440
>   [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
>   [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
>   [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
>   [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
>   [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
>   [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
>   [<ffffffff813b46c4>] blk_update_request+0x94/0x420
>   [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
>   [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
>   [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
>   [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
>   [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
>   [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
>   [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
>   [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
>   [<ffffffff81119436>] handle_edge_irq+0x66/0x130
>   [<ffffffff8101c3e4>] handle_irq+0x84/0x150
>   [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
>   [<ffffffff818122f2>] common_interrupt+0x72/0x72
>   <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
>   [<ffffffff81025454>] default_idle+0x24/0x230
>   [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
>   [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
>   [<ffffffff8104df1b>] start_secondary+0x25b/0x300
> 
> This patch fixes it by punting the notification delivery through a
> work item.  This ends up adding an extra pointer to kernfs_elem_attr
> enlarging kernfs_node by a pointer, which is not ideal but not a very
> big deal either.  If this turns out to be an actual issue, we can move
> kernfs_elem_attr->size to kernfs_node->iattr later.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> ---
>  fs/kernfs/file.c       |   69 +++++++++++++++++++++++++++++++++++++++----------
>  include/linux/kernfs.h |    1 
>  2 files changed, 56 insertions(+), 14 deletions(-)

Looks good to me, do you want to take this with your other kernfs
patches for 3.16-final?  Or if you don't have that, I can take it
through my tree, it's your choice, either is fine for me.

If you want it in your tree, feel free to add:
	Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH driver-core-linus] kernfs: kernfs_notify() must be useable from non-sleepable contexts
@ 2014-07-01 20:51               ` Greg Kroah-Hartman
  0 siblings, 0 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2014-07-01 20:51 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, Josh Boyer, Michael S. Tsirkin, John McCutchan,
	Linux-Kernel@Vger. Kernel. Org, virtualization, Robert Love,
	Eric Paris, Brian Lane, Christoph Hellwig

On Tue, Jul 01, 2014 at 04:41:03PM -0400, Tejun Heo wrote:
> d911d9874801 ("kernfs: make kernfs_notify() trigger inotify events
> too") added fsnotify triggering to kernfs_notify() which requires a
> sleepable context.  There are already existing users of
> kernfs_notify() which invoke it from an atomic context and in general
> it's silly to require a sleepable context for triggering a
> notification.
> 
> The following is an invalid context bug triggerd by md invoking
> sysfs_notify() from IO completion path.
> 
>  BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586
>  in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
>  2 locks held by swapper/1/0:
>   #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>] virtblk_done+0x42/0xe0 [virtio_blk]
>   #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at: [<ffffffff81633718>] bitmap_endwrite+0x68/0x240
>  irq event stamp: 33518
>  hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
>  hardirqs last disabled at (33516): [<ffffffff818122ed>] common_interrupt+0x6d/0x72
>  softirqs last  enabled at (33518): [<ffffffff810a1272>] _local_bh_enable+0x22/0x50
>  softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80
>  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
>  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>   0000000000000000 f90db13964f4ee05 ffff88007d403b80 ffffffff81807b4c
>   0000000000000000 ffff88007d403ba8 ffffffff810d4f14 0000000000000000
>   0000000000441800 ffff880078fa1780 ffff88007d403c38 ffffffff8180caf2
>  Call Trace:
>   <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
>   [<ffffffff810d4f14>] __might_sleep+0x184/0x240
>   [<ffffffff8180caf2>] mutex_lock_nested+0x42/0x440
>   [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
>   [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
>   [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
>   [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
>   [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
>   [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
>   [<ffffffff813b46c4>] blk_update_request+0x94/0x420
>   [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
>   [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
>   [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
>   [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
>   [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
>   [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
>   [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
>   [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
>   [<ffffffff81119436>] handle_edge_irq+0x66/0x130
>   [<ffffffff8101c3e4>] handle_irq+0x84/0x150
>   [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
>   [<ffffffff818122f2>] common_interrupt+0x72/0x72
>   <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
>   [<ffffffff81025454>] default_idle+0x24/0x230
>   [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
>   [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
>   [<ffffffff8104df1b>] start_secondary+0x25b/0x300
> 
> This patch fixes it by punting the notification delivery through a
> work item.  This ends up adding an extra pointer to kernfs_elem_attr
> enlarging kernfs_node by a pointer, which is not ideal but not a very
> big deal either.  If this turns out to be an actual issue, we can move
> kernfs_elem_attr->size to kernfs_node->iattr later.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> ---
>  fs/kernfs/file.c       |   69 +++++++++++++++++++++++++++++++++++++++----------
>  include/linux/kernfs.h |    1 
>  2 files changed, 56 insertions(+), 14 deletions(-)

Looks good to me, do you want to take this with your other kernfs
patches for 3.16-final?  Or if you don't have that, I can take it
through my tree, it's your choice, either is fine for me.

If you want it in your tree, feel free to add:
	Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH driver-core-linus] kernfs: kernfs_notify() must be useable from non-sleepable contexts
  2014-07-01 20:41             ` Tejun Heo
@ 2014-07-02  5:53               ` Michael S. Tsirkin
  -1 siblings, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2014-07-02  5:53 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Greg Kroah-Hartman, Jens Axboe, Christoph Hellwig, Josh Boyer,
	Rusty Russell, virtualization, Linux-Kernel@Vger. Kernel. Org,
	Brian Lane, John McCutchan, Robert Love, Eric Paris

On Tue, Jul 01, 2014 at 04:41:03PM -0400, Tejun Heo wrote:
> d911d9874801 ("kernfs: make kernfs_notify() trigger inotify events
> too") added fsnotify triggering to kernfs_notify() which requires a
> sleepable context.  There are already existing users of
> kernfs_notify() which invoke it from an atomic context and in general
> it's silly to require a sleepable context for triggering a
> notification.
> 
> The following is an invalid context bug triggerd by md invoking
> sysfs_notify() from IO completion path.
> 
>  BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586
>  in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
>  2 locks held by swapper/1/0:
>   #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>] virtblk_done+0x42/0xe0 [virtio_blk]
>   #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at: [<ffffffff81633718>] bitmap_endwrite+0x68/0x240
>  irq event stamp: 33518
>  hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
>  hardirqs last disabled at (33516): [<ffffffff818122ed>] common_interrupt+0x6d/0x72
>  softirqs last  enabled at (33518): [<ffffffff810a1272>] _local_bh_enable+0x22/0x50
>  softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80
>  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
>  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>   0000000000000000 f90db13964f4ee05 ffff88007d403b80 ffffffff81807b4c
>   0000000000000000 ffff88007d403ba8 ffffffff810d4f14 0000000000000000
>   0000000000441800 ffff880078fa1780 ffff88007d403c38 ffffffff8180caf2
>  Call Trace:
>   <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
>   [<ffffffff810d4f14>] __might_sleep+0x184/0x240
>   [<ffffffff8180caf2>] mutex_lock_nested+0x42/0x440
>   [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
>   [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
>   [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
>   [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
>   [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
>   [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
>   [<ffffffff813b46c4>] blk_update_request+0x94/0x420
>   [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
>   [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
>   [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
>   [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
>   [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
>   [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
>   [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
>   [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
>   [<ffffffff81119436>] handle_edge_irq+0x66/0x130
>   [<ffffffff8101c3e4>] handle_irq+0x84/0x150
>   [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
>   [<ffffffff818122f2>] common_interrupt+0x72/0x72
>   <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
>   [<ffffffff81025454>] default_idle+0x24/0x230
>   [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
>   [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
>   [<ffffffff8104df1b>] start_secondary+0x25b/0x300
> 
> This patch fixes it by punting the notification delivery through a
> work item.  This ends up adding an extra pointer to kernfs_elem_attr
> enlarging kernfs_node by a pointer, which is not ideal but not a very
> big deal either.  If this turns out to be an actual issue, we can move
> kernfs_elem_attr->size to kernfs_node->iattr later.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Jens Axboe <axboe@kernel.dk>

FWIW

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  fs/kernfs/file.c       |   69 +++++++++++++++++++++++++++++++++++++++----------
>  include/linux/kernfs.h |    1 
>  2 files changed, 56 insertions(+), 14 deletions(-)
> 
> --- a/fs/kernfs/file.c
> +++ b/fs/kernfs/file.c
> @@ -39,6 +39,19 @@ struct kernfs_open_node {
>  	struct list_head	files; /* goes through kernfs_open_file.list */
>  };
>  
> +/*
> + * kernfs_notify() may be called from any context and bounces notifications
> + * through a work item.  To minimize space overhead in kernfs_node, the
> + * pending queue is implemented as a singly linked list of kernfs_nodes.
> + * The list is terminated with the self pointer so that whether a
> + * kernfs_node is on the list or not can be determined by testing the next
> + * pointer for NULL.
> + */
> +#define KERNFS_NOTIFY_EOL			((void *)&kernfs_notify_list)
> +
> +static DEFINE_SPINLOCK(kernfs_notify_lock);
> +static struct kernfs_node *kernfs_notify_list = KERNFS_NOTIFY_EOL;
> +
>  static struct kernfs_open_file *kernfs_of(struct file *file)
>  {
>  	return ((struct seq_file *)file->private_data)->private;
> @@ -783,24 +796,25 @@ static unsigned int kernfs_fop_poll(stru
>  	return DEFAULT_POLLMASK|POLLERR|POLLPRI;
>  }
>  
> -/**
> - * kernfs_notify - notify a kernfs file
> - * @kn: file to notify
> - *
> - * Notify @kn such that poll(2) on @kn wakes up.
> - */
> -void kernfs_notify(struct kernfs_node *kn)
> +static void kernfs_notify_workfn(struct work_struct *work)
>  {
> -	struct kernfs_root *root = kernfs_root(kn);
> +	struct kernfs_node *kn;
>  	struct kernfs_open_node *on;
>  	struct kernfs_super_info *info;
> -	unsigned long flags;
> -
> -	if (WARN_ON(kernfs_type(kn) != KERNFS_FILE))
> +repeat:
> +	/* pop one off the notify_list */
> +	spin_lock_irq(&kernfs_notify_lock);
> +	kn = kernfs_notify_list;
> +	if (kn == KERNFS_NOTIFY_EOL) {
> +		spin_unlock_irq(&kernfs_notify_lock);
>  		return;
> +	}
> +	kernfs_notify_list = kn->attr.notify_next;
> +	kn->attr.notify_next = NULL;
> +	spin_unlock_irq(&kernfs_notify_lock);
>  
>  	/* kick poll */
> -	spin_lock_irqsave(&kernfs_open_node_lock, flags);
> +	spin_lock_irq(&kernfs_open_node_lock);
>  
>  	on = kn->attr.open;
>  	if (on) {
> @@ -808,12 +822,12 @@ void kernfs_notify(struct kernfs_node *k
>  		wake_up_interruptible(&on->poll);
>  	}
>  
> -	spin_unlock_irqrestore(&kernfs_open_node_lock, flags);
> +	spin_unlock_irq(&kernfs_open_node_lock);
>  
>  	/* kick fsnotify */
>  	mutex_lock(&kernfs_mutex);
>  
> -	list_for_each_entry(info, &root->supers, node) {
> +	list_for_each_entry(info, &kernfs_root(kn)->supers, node) {
>  		struct inode *inode;
>  		struct dentry *dentry;
>  
> @@ -833,6 +847,33 @@ void kernfs_notify(struct kernfs_node *k
>  	}
>  
>  	mutex_unlock(&kernfs_mutex);
> +	kernfs_put(kn);
> +	goto repeat;
> +}
> +
> +/**
> + * kernfs_notify - notify a kernfs file
> + * @kn: file to notify
> + *
> + * Notify @kn such that poll(2) on @kn wakes up.  Maybe be called from any
> + * context.
> + */
> +void kernfs_notify(struct kernfs_node *kn)
> +{
> +	static DECLARE_WORK(kernfs_notify_work, kernfs_notify_workfn);
> +	unsigned long flags;
> +
> +	if (WARN_ON(kernfs_type(kn) != KERNFS_FILE))
> +		return;
> +
> +	spin_lock_irqsave(&kernfs_notify_lock, flags);
> +	if (!kn->attr.notify_next) {
> +		kernfs_get(kn);
> +		kn->attr.notify_next = kernfs_notify_list;
> +		kernfs_notify_list = kn;
> +		schedule_work(&kernfs_notify_work);
> +	}
> +	spin_unlock_irqrestore(&kernfs_notify_lock, flags);
>  }
>  EXPORT_SYMBOL_GPL(kernfs_notify);
>  
> --- a/include/linux/kernfs.h
> +++ b/include/linux/kernfs.h
> @@ -91,6 +91,7 @@ struct kernfs_elem_attr {
>  	const struct kernfs_ops	*ops;
>  	struct kernfs_open_node	*open;
>  	loff_t			size;
> +	struct kernfs_node	*notify_next;	/* for kernfs_notify() */
>  };
>  
>  /*

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH driver-core-linus] kernfs: kernfs_notify() must be useable from non-sleepable contexts
@ 2014-07-02  5:53               ` Michael S. Tsirkin
  0 siblings, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2014-07-02  5:53 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, Josh Boyer, Robert Love, John McCutchan,
	Greg Kroah-Hartman, Linux-Kernel@Vger. Kernel. Org,
	virtualization, Eric Paris, Brian Lane, Christoph Hellwig

On Tue, Jul 01, 2014 at 04:41:03PM -0400, Tejun Heo wrote:
> d911d9874801 ("kernfs: make kernfs_notify() trigger inotify events
> too") added fsnotify triggering to kernfs_notify() which requires a
> sleepable context.  There are already existing users of
> kernfs_notify() which invoke it from an atomic context and in general
> it's silly to require a sleepable context for triggering a
> notification.
> 
> The following is an invalid context bug triggerd by md invoking
> sysfs_notify() from IO completion path.
> 
>  BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586
>  in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
>  2 locks held by swapper/1/0:
>   #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>] virtblk_done+0x42/0xe0 [virtio_blk]
>   #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at: [<ffffffff81633718>] bitmap_endwrite+0x68/0x240
>  irq event stamp: 33518
>  hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
>  hardirqs last disabled at (33516): [<ffffffff818122ed>] common_interrupt+0x6d/0x72
>  softirqs last  enabled at (33518): [<ffffffff810a1272>] _local_bh_enable+0x22/0x50
>  softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80
>  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
>  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>   0000000000000000 f90db13964f4ee05 ffff88007d403b80 ffffffff81807b4c
>   0000000000000000 ffff88007d403ba8 ffffffff810d4f14 0000000000000000
>   0000000000441800 ffff880078fa1780 ffff88007d403c38 ffffffff8180caf2
>  Call Trace:
>   <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
>   [<ffffffff810d4f14>] __might_sleep+0x184/0x240
>   [<ffffffff8180caf2>] mutex_lock_nested+0x42/0x440
>   [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
>   [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
>   [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
>   [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
>   [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
>   [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
>   [<ffffffff813b46c4>] blk_update_request+0x94/0x420
>   [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
>   [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
>   [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
>   [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
>   [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
>   [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
>   [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
>   [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
>   [<ffffffff81119436>] handle_edge_irq+0x66/0x130
>   [<ffffffff8101c3e4>] handle_irq+0x84/0x150
>   [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
>   [<ffffffff818122f2>] common_interrupt+0x72/0x72
>   <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
>   [<ffffffff81025454>] default_idle+0x24/0x230
>   [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
>   [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
>   [<ffffffff8104df1b>] start_secondary+0x25b/0x300
> 
> This patch fixes it by punting the notification delivery through a
> work item.  This ends up adding an extra pointer to kernfs_elem_attr
> enlarging kernfs_node by a pointer, which is not ideal but not a very
> big deal either.  If this turns out to be an actual issue, we can move
> kernfs_elem_attr->size to kernfs_node->iattr later.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Jens Axboe <axboe@kernel.dk>

FWIW

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  fs/kernfs/file.c       |   69 +++++++++++++++++++++++++++++++++++++++----------
>  include/linux/kernfs.h |    1 
>  2 files changed, 56 insertions(+), 14 deletions(-)
> 
> --- a/fs/kernfs/file.c
> +++ b/fs/kernfs/file.c
> @@ -39,6 +39,19 @@ struct kernfs_open_node {
>  	struct list_head	files; /* goes through kernfs_open_file.list */
>  };
>  
> +/*
> + * kernfs_notify() may be called from any context and bounces notifications
> + * through a work item.  To minimize space overhead in kernfs_node, the
> + * pending queue is implemented as a singly linked list of kernfs_nodes.
> + * The list is terminated with the self pointer so that whether a
> + * kernfs_node is on the list or not can be determined by testing the next
> + * pointer for NULL.
> + */
> +#define KERNFS_NOTIFY_EOL			((void *)&kernfs_notify_list)
> +
> +static DEFINE_SPINLOCK(kernfs_notify_lock);
> +static struct kernfs_node *kernfs_notify_list = KERNFS_NOTIFY_EOL;
> +
>  static struct kernfs_open_file *kernfs_of(struct file *file)
>  {
>  	return ((struct seq_file *)file->private_data)->private;
> @@ -783,24 +796,25 @@ static unsigned int kernfs_fop_poll(stru
>  	return DEFAULT_POLLMASK|POLLERR|POLLPRI;
>  }
>  
> -/**
> - * kernfs_notify - notify a kernfs file
> - * @kn: file to notify
> - *
> - * Notify @kn such that poll(2) on @kn wakes up.
> - */
> -void kernfs_notify(struct kernfs_node *kn)
> +static void kernfs_notify_workfn(struct work_struct *work)
>  {
> -	struct kernfs_root *root = kernfs_root(kn);
> +	struct kernfs_node *kn;
>  	struct kernfs_open_node *on;
>  	struct kernfs_super_info *info;
> -	unsigned long flags;
> -
> -	if (WARN_ON(kernfs_type(kn) != KERNFS_FILE))
> +repeat:
> +	/* pop one off the notify_list */
> +	spin_lock_irq(&kernfs_notify_lock);
> +	kn = kernfs_notify_list;
> +	if (kn == KERNFS_NOTIFY_EOL) {
> +		spin_unlock_irq(&kernfs_notify_lock);
>  		return;
> +	}
> +	kernfs_notify_list = kn->attr.notify_next;
> +	kn->attr.notify_next = NULL;
> +	spin_unlock_irq(&kernfs_notify_lock);
>  
>  	/* kick poll */
> -	spin_lock_irqsave(&kernfs_open_node_lock, flags);
> +	spin_lock_irq(&kernfs_open_node_lock);
>  
>  	on = kn->attr.open;
>  	if (on) {
> @@ -808,12 +822,12 @@ void kernfs_notify(struct kernfs_node *k
>  		wake_up_interruptible(&on->poll);
>  	}
>  
> -	spin_unlock_irqrestore(&kernfs_open_node_lock, flags);
> +	spin_unlock_irq(&kernfs_open_node_lock);
>  
>  	/* kick fsnotify */
>  	mutex_lock(&kernfs_mutex);
>  
> -	list_for_each_entry(info, &root->supers, node) {
> +	list_for_each_entry(info, &kernfs_root(kn)->supers, node) {
>  		struct inode *inode;
>  		struct dentry *dentry;
>  
> @@ -833,6 +847,33 @@ void kernfs_notify(struct kernfs_node *k
>  	}
>  
>  	mutex_unlock(&kernfs_mutex);
> +	kernfs_put(kn);
> +	goto repeat;
> +}
> +
> +/**
> + * kernfs_notify - notify a kernfs file
> + * @kn: file to notify
> + *
> + * Notify @kn such that poll(2) on @kn wakes up.  Maybe be called from any
> + * context.
> + */
> +void kernfs_notify(struct kernfs_node *kn)
> +{
> +	static DECLARE_WORK(kernfs_notify_work, kernfs_notify_workfn);
> +	unsigned long flags;
> +
> +	if (WARN_ON(kernfs_type(kn) != KERNFS_FILE))
> +		return;
> +
> +	spin_lock_irqsave(&kernfs_notify_lock, flags);
> +	if (!kn->attr.notify_next) {
> +		kernfs_get(kn);
> +		kn->attr.notify_next = kernfs_notify_list;
> +		kernfs_notify_list = kn;
> +		schedule_work(&kernfs_notify_work);
> +	}
> +	spin_unlock_irqrestore(&kernfs_notify_lock, flags);
>  }
>  EXPORT_SYMBOL_GPL(kernfs_notify);
>  
> --- a/include/linux/kernfs.h
> +++ b/include/linux/kernfs.h
> @@ -91,6 +91,7 @@ struct kernfs_elem_attr {
>  	const struct kernfs_ops	*ops;
>  	struct kernfs_open_node	*open;
>  	loff_t			size;
> +	struct kernfs_node	*notify_next;	/* for kernfs_notify() */
>  };
>  
>  /*

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH driver-core-linus] kernfs: kernfs_notify() must be useable from non-sleepable contexts
  2014-07-01 20:51               ` Greg Kroah-Hartman
@ 2014-07-02 14:14                 ` Tejun Heo
  -1 siblings, 0 replies; 22+ messages in thread
From: Tejun Heo @ 2014-07-02 14:14 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Jens Axboe, Michael S. Tsirkin, Christoph Hellwig, Josh Boyer,
	Rusty Russell, virtualization, Linux-Kernel@Vger. Kernel. Org,
	Brian Lane, John McCutchan, Robert Love, Eric Paris

Hello,

On Tue, Jul 01, 2014 at 01:51:48PM -0700, Greg Kroah-Hartman wrote:
> Looks good to me, do you want to take this with your other kernfs
> patches for 3.16-final?  Or if you don't have that, I can take it
> through my tree, it's your choice, either is fine for me.
> 
> If you want it in your tree, feel free to add:
> 	Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

The other kernfs changes are cgroup specific and in
cgroup/for-3.16-fixes.  I think this one fits better in the driver
core tree.  Can you please route it?

Thanks!

-- 
tejun

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH driver-core-linus] kernfs: kernfs_notify() must be useable from non-sleepable contexts
@ 2014-07-02 14:14                 ` Tejun Heo
  0 siblings, 0 replies; 22+ messages in thread
From: Tejun Heo @ 2014-07-02 14:14 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Jens Axboe, Josh Boyer, Michael S. Tsirkin, John McCutchan,
	Linux-Kernel@Vger. Kernel. Org, virtualization, Robert Love,
	Eric Paris, Brian Lane, Christoph Hellwig

Hello,

On Tue, Jul 01, 2014 at 01:51:48PM -0700, Greg Kroah-Hartman wrote:
> Looks good to me, do you want to take this with your other kernfs
> patches for 3.16-final?  Or if you don't have that, I can take it
> through my tree, it's your choice, either is fine for me.
> 
> If you want it in your tree, feel free to add:
> 	Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

The other kernfs changes are cgroup specific and in
cgroup/for-3.16-fixes.  I think this one fits better in the driver
core tree.  Can you please route it?

Thanks!

-- 
tejun

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH driver-core-linus] kernfs: kernfs_notify() must be useable from non-sleepable contexts
  2014-07-02 14:14                 ` Tejun Heo
@ 2014-07-02 16:31                   ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2014-07-02 16:31 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, Michael S. Tsirkin, Christoph Hellwig, Josh Boyer,
	Rusty Russell, virtualization, Linux-Kernel@Vger. Kernel. Org,
	Brian Lane, John McCutchan, Robert Love, Eric Paris

On Wed, Jul 02, 2014 at 10:14:16AM -0400, Tejun Heo wrote:
> Hello,
> 
> On Tue, Jul 01, 2014 at 01:51:48PM -0700, Greg Kroah-Hartman wrote:
> > Looks good to me, do you want to take this with your other kernfs
> > patches for 3.16-final?  Or if you don't have that, I can take it
> > through my tree, it's your choice, either is fine for me.
> > 
> > If you want it in your tree, feel free to add:
> > 	Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> The other kernfs changes are cgroup specific and in
> cgroup/for-3.16-fixes.  I think this one fits better in the driver
> core tree.  Can you please route it?

Sure, will queue it up now.

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH driver-core-linus] kernfs: kernfs_notify() must be useable from non-sleepable contexts
@ 2014-07-02 16:31                   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2014-07-02 16:31 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, Josh Boyer, Michael S. Tsirkin, John McCutchan,
	Linux-Kernel@Vger. Kernel. Org, virtualization, Robert Love,
	Eric Paris, Brian Lane, Christoph Hellwig

On Wed, Jul 02, 2014 at 10:14:16AM -0400, Tejun Heo wrote:
> Hello,
> 
> On Tue, Jul 01, 2014 at 01:51:48PM -0700, Greg Kroah-Hartman wrote:
> > Looks good to me, do you want to take this with your other kernfs
> > patches for 3.16-final?  Or if you don't have that, I can take it
> > through my tree, it's your choice, either is fine for me.
> > 
> > If you want it in your tree, feel free to add:
> > 	Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> The other kernfs changes are cgroup specific and in
> cgroup/for-3.16-fixes.  I think this one fits better in the driver
> core tree.  Can you please route it?

Sure, will queue it up now.

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* virt_blk BUG: sleeping function called from invalid context
@ 2014-06-27 11:57 Josh Boyer
  0 siblings, 0 replies; 22+ messages in thread
From: Josh Boyer @ 2014-06-27 11:57 UTC (permalink / raw)
  To: Rusty Russell, Michael S. Tsirkin
  Cc: Brian Lane, Linux-Kernel@Vger. Kernel. Org, virtualization

Hi All,

We've had a report[1] of the virt_blk driver causing a lot of spew
because it's calling a sleeping function from an invalid context.  The
backtrace is below.  This is with kernel v3.16-rc2-69-gd91d66e88ea9.

The reporter is on CC and can give you relevant details.

josh

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1113805

[drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:02.0 on minor 0
 virtio-pci 0000:00:05.0: irq 40 for MSI/MSI-X
 virtio-pci 0000:00:05.0: irq 41 for MSI/MSI-X
  vda: vda1 vda2
 virtio-pci 0000:00:06.0: irq 42 for MSI/MSI-X
 virtio-pci 0000:00:06.0: irq 43 for MSI/MSI-X
  vdb: vdb1
 tsc: Refined TSC clocksource calibration: 3392.169 MHz
 md: bind<vdb1>
 mdadm (350) used greatest stack depth: 12384 bytes left
 md: bind<vda2>
 md: raid1 personality registered for level 1
 md/raid1:md127: active with 2 out of 2 mirrors
 created bitmap (1 pages) for device md127
 md127: bitmap initialized from disk: read 1 pages, set 0 of 153 bits
 md127: detected capacity change from 0 to 10203693056
  md127: unknown partition table
 systemd-cryptsetup[358]: Set cipher aes, mode xts-plain64, key size
512 bits for device
/dev/disk/by-uuid/6972a564-542d-498b-b3a0-7d71c2e966a2.
  md127: unknown partition table
  md127: unknown partition table
 dracut-initqueue[296]: Scanning devices dm-0  for LVM logical volumes
fedora-foo1/root fedora-foo1/swap fedora-foo1/root fedora-foo1/swap
 dracut-initqueue[296]: inactive '/dev/fedora-foo1/root' [7.42 GiB] inherit
 dracut-initqueue[296]: inactive '/dev/fedora-foo1/swap' [2.00 GiB] inherit
 systemd-fsck[662]: /dev/mapper/fedora--foo1-root: clean, 22215/486720
files, 222526/1944576 blocks
 BUG: sleeping function called from invalid context at
kernel/locking/mutex.c:586
 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
 2 locks held by swapper/1/0:
  #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>]
virtblk_done+0x42/0xe0 [virtio_blk]
  #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at:
[<ffffffff81633718>] bitmap_endwrite+0x68/0x240
 irq event stamp: 33518
 hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
 hardirqs last disabled at (33516): [<ffffffff818122ed>]
common_interrupt+0x6d/0x72
 softirqs last  enabled at (33518): [<ffffffff810a1272>]
_local_bh_enable+0x22/0x50
 softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80
 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000000 f90db13964f4ee05 ffff88007d403b80 ffffffff81807b4c
  0000000000000000 ffff88007d403ba8 ffffffff810d4f14 0000000000000000
  0000000000441800 ffff880078fa1780 ffff88007d403c38 ffffffff8180caf2
 Call Trace:
  <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
  [<ffffffff810d4f14>] __might_sleep+0x184/0x240
  [<ffffffff8180caf2>] mutex_lock_nested+0x42/0x440
  [<ffffffff810e1de6>] ? local_clock+0x16/0x30
  [<ffffffff810fc23f>] ? lock_release_holdtime.part.28+0xf/0x200
  [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
  [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
  [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
  [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
  [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
  [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
  [<ffffffff813b46c4>] blk_update_request+0x94/0x420
  [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
  [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
  [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
  [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
  [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
  [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
  [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
  [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
  [<ffffffff81119436>] handle_edge_irq+0x66/0x130
  [<ffffffff8101c3e4>] handle_irq+0x84/0x150
  [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
  [<ffffffff818122f2>] common_interrupt+0x72/0x72
  <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
  [<ffffffff81025454>] default_idle+0x24/0x230
  [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
  [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
  [<ffffffff8104df1b>] start_secondary+0x25b/0x300

 =================================
 [ INFO: inconsistent lock state ]
 3.16.0-0.rc2.git2.1.fc21.x86_64 #1 Not tainted
 ---------------------------------
 inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
 swapper/1/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
  (kernfs_mutex){?.+.+.}, at: [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
 {HARDIRQ-ON-W} state was registered at:
   [<ffffffff811004b2>] __lock_acquire+0x942/0x1ca0
   [<ffffffff811020f4>] lock_acquire+0xa4/0x1d0
   [<ffffffff8180cb35>] mutex_lock_nested+0x85/0x440
   [<ffffffff812d6d6f>] kernfs_activate+0x1f/0xf0
   [<ffffffff812d7140>] kernfs_create_root+0xf0/0x110
   [<ffffffff821f1ddc>] sysfs_init+0x13/0x51
   [<ffffffff821ef5af>] mnt_init+0x107/0x231
   [<ffffffff821ef213>] vfs_caches_init+0x98/0x106
   [<ffffffff821bdfb0>] start_kernel+0x44f/0x4bc
   [<ffffffff821bd4d7>] x86_64_start_reservations+0x2a/0x2c
   [<ffffffff821bd626>] x86_64_start_kernel+0x14d/0x170
 irq event stamp: 33518
 hardirqs last  enabled at (33515): [<ffffffff8102544f>] default_idle+0x1f/0x230
 hardirqs last disabled at (33516): [<ffffffff818122ed>]
common_interrupt+0x6d/0x72
 softirqs last  enabled at (33518): [<ffffffff810a1272>]
_local_bh_enable+0x22/0x50
 softirqs last disabled at (33517): [<ffffffff810a29e0>] irq_enter+0x60/0x80

                                              other info that might
help us debug this:
  Possible unsafe locking scenario:
        CPU0
        ----
   lock(kernfs_mutex);
   <Interrupt>
     lock(kernfs_mutex);

                                               *** DEADLOCK ***
 2 locks held by swapper/1/0:
  #0:  (&(&vblk->vq_lock)->rlock){-.-...}, at: [<ffffffffa0039042>]
virtblk_done+0x42/0xe0 [virtio_blk]
  #1:  (&(&bitmap->counts.lock)->rlock){-.....}, at:
[<ffffffff81633718>] bitmap_endwrite+0x68/0x240

                                              stack backtrace:
 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.16.0-0.rc2.git2.1.fc21.x86_64 #1
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000000 f90db13964f4ee05 ffff88007d4039d0 ffffffff81807b4c
  ffff88007bcb19d0 ffff88007d403a20 ffffffff81805358 0000000000000000
  0000000000000000 0000000000000001 ffff88007bcb25a8 ffff88007bcb19d0
 Call Trace:
  <IRQ>  [<ffffffff81807b4c>] dump_stack+0x4d/0x66
  [<ffffffff81805358>] print_usage_bug+0x1f0/0x205
  [<ffffffff810ff450>] mark_lock+0x610/0x6d0
  [<ffffffff810fe520>] ? print_shortest_lock_dependencies+0x1d0/0x1d0
  [<ffffffff81100614>] __lock_acquire+0xaa4/0x1ca0
  [<ffffffff8101dc4d>] ? show_trace_log_lvl+0x4d/0x60
  [<ffffffff8101c8ad>] ? show_stack_log_lvl+0xad/0x1b0
  [<ffffffff811020f4>] lock_acquire+0xa4/0x1d0
  [<ffffffff812d76a0>] ? kernfs_notify+0x90/0x150
  [<ffffffff8180cb35>] mutex_lock_nested+0x85/0x440
  [<ffffffff812d76a0>] ? kernfs_notify+0x90/0x150
  [<ffffffff810fc23f>] ? lock_release_holdtime.part.28+0xf/0x200
  [<ffffffff812d76a0>] ? kernfs_notify+0x90/0x150
  [<ffffffff812d76a0>] kernfs_notify+0x90/0x150
  [<ffffffff8163377c>] bitmap_endwrite+0xcc/0x240
  [<ffffffffa00de863>] close_write+0x93/0xb0 [raid1]
  [<ffffffffa00df029>] r1_bio_write_done+0x29/0x50 [raid1]
  [<ffffffffa00e0474>] raid1_end_write_request+0xe4/0x260 [raid1]
  [<ffffffff813acb8b>] bio_endio+0x6b/0xa0
  [<ffffffff813b46c4>] blk_update_request+0x94/0x420
  [<ffffffff813bf0ea>] blk_mq_end_io+0x1a/0x70
  [<ffffffffa00392c2>] virtblk_request_done+0x32/0x80 [virtio_blk]
  [<ffffffff813c0648>] __blk_mq_complete_request+0x88/0x120
  [<ffffffff813c070a>] blk_mq_complete_request+0x2a/0x30
  [<ffffffffa0039066>] virtblk_done+0x66/0xe0 [virtio_blk]
  [<ffffffffa002535a>] vring_interrupt+0x3a/0xa0 [virtio_ring]
  [<ffffffff81116177>] handle_irq_event_percpu+0x77/0x340
  [<ffffffff8111647d>] handle_irq_event+0x3d/0x60
  [<ffffffff81119436>] handle_edge_irq+0x66/0x130
  [<ffffffff8101c3e4>] handle_irq+0x84/0x150
  [<ffffffff818146ad>] do_IRQ+0x4d/0xe0
  [<ffffffff818122f2>] common_interrupt+0x72/0x72
  <EOI>  [<ffffffff8105f706>] ? native_safe_halt+0x6/0x10
  [<ffffffff81025454>] default_idle+0x24/0x230
  [<ffffffff81025f9f>] arch_cpu_idle+0xf/0x20
  [<ffffffff810f5adc>] cpu_startup_entry+0x37c/0x7b0
  [<ffffffff8104df1b>] start_secondary+0x25b/0x30

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2014-07-02 16:32 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-27 11:57 virt_blk BUG: sleeping function called from invalid context Josh Boyer
2014-06-29  8:26 ` Michael S. Tsirkin
2014-06-29  8:26   ` Michael S. Tsirkin
2014-06-29 19:32   ` Christoph Hellwig
2014-06-29 20:47     ` Michael S. Tsirkin
2014-06-29 20:47       ` Michael S. Tsirkin
2014-06-29 20:55       ` Jens Axboe
2014-06-29 20:55         ` Jens Axboe
2014-06-30 20:17         ` Tejun Heo
2014-06-30 20:17           ` Tejun Heo
2014-07-01 20:41           ` [PATCH driver-core-linus] kernfs: kernfs_notify() must be useable from non-sleepable contexts Tejun Heo
2014-07-01 20:41             ` Tejun Heo
2014-07-01 20:51             ` Greg Kroah-Hartman
2014-07-01 20:51               ` Greg Kroah-Hartman
2014-07-02 14:14               ` Tejun Heo
2014-07-02 14:14                 ` Tejun Heo
2014-07-02 16:31                 ` Greg Kroah-Hartman
2014-07-02 16:31                   ` Greg Kroah-Hartman
2014-07-02  5:53             ` Michael S. Tsirkin
2014-07-02  5:53               ` Michael S. Tsirkin
2014-06-29 19:32   ` virt_blk BUG: sleeping function called from invalid context Christoph Hellwig
  -- strict thread matches above, loose matches on Subject: below --
2014-06-27 11:57 Josh Boyer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.