linux-kernel.vger.kernel.org archive mirror
* possible deadlock in io_poll_double_wake (2)
@ 2020-09-29  2:28 syzbot
  2021-02-28  0:42 ` syzbot
  0 siblings, 1 reply; 15+ messages in thread
From: syzbot @ 2020-09-29  2:28 UTC
  To: axboe, io-uring, linux-fsdevel, linux-kernel, syzkaller-bugs, viro

Hello,

syzbot found the following issue on:

HEAD commit:    d1d2220c Add linux-next specific files for 20200924
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=14cd5cd3900000
kernel config:  https://syzkaller.appspot.com/x/.config?x=254e028a642027c
dashboard link: https://syzkaller.appspot.com/bug?extid=28abd693db9e92c160d8
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13ba3881900000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+28abd693db9e92c160d8@syzkaller.appspotmail.com

============================================
WARNING: possible recursive locking detected
5.9.0-rc6-next-20200924-syzkaller #0 Not tainted
--------------------------------------------
kworker/0:1/12 is trying to acquire lock:
ffff88808d998130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
ffff88808d998130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake+0x156/0x510 fs/io_uring.c:4855

but task is already holding lock:
ffff888093f4c130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:122

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&runtime->sleep);
  lock(&runtime->sleep);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

6 locks held by kworker/0:1/12:
 #0: ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x821/0x15a0 kernel/workqueue.c:2240
 #1: ffffc90000d2fda8 ((linkwatch_work).work){+.+.}-{0:0}, at: process_one_work+0x854/0x15a0 kernel/workqueue.c:2244
 #2: ffffffff8b6c5648 (rtnl_mutex){+.+.}-{3:3}, at: linkwatch_event+0xb/0x60 net/core/link_watch.c:250
 #3: ffffffff8a553d40 (rcu_read_lock){....}-{1:2}, at: ib_device_get_by_netdev+0x0/0x4f0 drivers/infiniband/core/device.c:2550
 #4: ffff888214d31908 (&group->lock){..-.}-{2:2}, at: _snd_pcm_stream_lock_irqsave+0x9f/0xd0 sound/core/pcm_native.c:170
 #5: ffff888093f4c130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:122

stack backtrace:
CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.9.0-rc6-next-20200924-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events linkwatch_event
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x198/0x1fb lib/dump_stack.c:118
 print_deadlock_bug kernel/locking/lockdep.c:2714 [inline]
 check_deadlock kernel/locking/lockdep.c:2755 [inline]
 validate_chain kernel/locking/lockdep.c:3546 [inline]
 __lock_acquire.cold+0x12e/0x3ad kernel/locking/lockdep.c:4796
 lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5398
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:354 [inline]
 io_poll_double_wake+0x156/0x510 fs/io_uring.c:4855
 __wake_up_common+0x147/0x650 kernel/sched/wait.c:93
 __wake_up_common_lock+0xd0/0x130 kernel/sched/wait.c:123
 snd_pcm_update_state+0x46a/0x540 sound/core/pcm_lib.c:203
 snd_pcm_update_hw_ptr0+0xa71/0x1a50 sound/core/pcm_lib.c:464
 snd_pcm_period_elapsed+0x160/0x250 sound/core/pcm_lib.c:1805
 dummy_hrtimer_callback+0x94/0x1b0 sound/drivers/dummy.c:378
 __run_hrtimer kernel/time/hrtimer.c:1524 [inline]
 __hrtimer_run_queues+0x693/0xea0 kernel/time/hrtimer.c:1588
 hrtimer_run_softirq+0x17b/0x360 kernel/time/hrtimer.c:1605
 __do_softirq+0x203/0xab6 kernel/softirq.c:298
 asm_call_on_stack+0xf/0x20 arch/x86/entry/entry_64.S:786
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
 do_softirq_own_stack+0x9d/0xd0 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:393 [inline]
 __irq_exit_rcu kernel/softirq.c:423 [inline]
 irq_exit_rcu+0x235/0x280 kernel/softirq.c:435
 sysvec_apic_timer_interrupt+0x51/0xf0 arch/x86/kernel/apic/apic.c:1091
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:631
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:653 [inline]
RIP: 0010:lock_acquire+0x27b/0xaa0 kernel/locking/lockdep.c:5401
Code: 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 0f 85 d2 06 00 00 48 83 3d 89 bc e1 08 00 0f 84 2d 05 00 00 48 8b 3c 24 57 9d <0f> 1f 44 00 00 48 b8 00 00 00 00 00 fc ff df 48 01 c3 48 c7 03 00
RSP: 0018:ffffc90000d2f8c8 EFLAGS: 00000282
RAX: 1ffffffff1479e35 RBX: 1ffff920001a5f1c RCX: 00000000f1571e19
RDX: dffffc0000000000 RSI: 0000000000000001 RDI: 0000000000000282
RBP: ffff8880a969a300 R08: 0000000000000000 R09: ffffffff8d71a9e7
R10: fffffbfff1ae353c R11: 0000000000000000 R12: 0000000000000002
R13: 0000000000000000 R14: ffffffff8a553d40 R15: 0000000000000000
 rcu_lock_acquire include/linux/rcupdate.h:253 [inline]
 rcu_read_lock include/linux/rcupdate.h:642 [inline]
 ib_device_get_by_netdev+0x9a/0x4f0 drivers/infiniband/core/device.c:2248
 rxe_get_dev_from_net drivers/infiniband/sw/rxe/rxe.h:76 [inline]
 rxe_notify+0x8b/0x1c0 drivers/infiniband/sw/rxe/rxe_net.c:566
 notifier_call_chain+0xb5/0x200 kernel/notifier.c:83
 call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:2034
 netdev_state_change net/core/dev.c:1464 [inline]
 netdev_state_change+0x100/0x130 net/core/dev.c:1457
 linkwatch_do_dev+0x13f/0x180 net/core/link_watch.c:167
 __linkwatch_run_queue+0x1ea/0x630 net/core/link_watch.c:212
 linkwatch_event+0x4a/0x60 net/core/link_watch.c:251
 process_one_work+0x933/0x15a0 kernel/workqueue.c:2269
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
 kthread+0x3af/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches


* Re: possible deadlock in io_poll_double_wake (2)
  2020-09-29  2:28 possible deadlock in io_poll_double_wake (2) syzbot
@ 2021-02-28  0:42 ` syzbot
  2021-02-28 23:08   ` Jens Axboe
  0 siblings, 1 reply; 15+ messages in thread
From: syzbot @ 2021-02-28  0:42 UTC
  To: asml.silence, axboe, io-uring, linux-fsdevel, linux-kernel,
	syzkaller-bugs, viro

syzbot has found a reproducer for the following issue on:

HEAD commit:    5695e516 Merge tag 'io_uring-worker.v3-2021-02-25' of git:..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=114e3866d00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=8c76dad0946df1f3
dashboard link: https://syzkaller.appspot.com/bug?extid=28abd693db9e92c160d8
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=122ed9b6d00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14d5a292d00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+28abd693db9e92c160d8@syzkaller.appspotmail.com

============================================
WARNING: possible recursive locking detected
5.11.0-syzkaller #0 Not tainted
--------------------------------------------
swapper/1/0 is trying to acquire lock:
ffff88801b2b1130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
ffff88801b2b1130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4960

but task is already holding lock:
ffff88801b2b3130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&runtime->sleep);
  lock(&runtime->sleep);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

2 locks held by swapper/1/0:
 #0: ffff888147474908 (&group->lock){..-.}-{2:2}, at: _snd_pcm_stream_lock_irqsave+0x9f/0xd0 sound/core/pcm_native.c:170
 #1: ffff88801b2b3130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137

stack backtrace:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.11.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0xfa/0x151 lib/dump_stack.c:120
 print_deadlock_bug kernel/locking/lockdep.c:2829 [inline]
 check_deadlock kernel/locking/lockdep.c:2872 [inline]
 validate_chain kernel/locking/lockdep.c:3661 [inline]
 __lock_acquire.cold+0x14c/0x3b4 kernel/locking/lockdep.c:4900
 lock_acquire kernel/locking/lockdep.c:5510 [inline]
 lock_acquire+0x1ab/0x730 kernel/locking/lockdep.c:5475
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:354 [inline]
 io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4960
 __wake_up_common+0x147/0x650 kernel/sched/wait.c:108
 __wake_up_common_lock+0xd0/0x130 kernel/sched/wait.c:138
 snd_pcm_update_state+0x46a/0x540 sound/core/pcm_lib.c:203
 snd_pcm_update_hw_ptr0+0xa75/0x1a50 sound/core/pcm_lib.c:464
 snd_pcm_period_elapsed+0x160/0x250 sound/core/pcm_lib.c:1805
 dummy_hrtimer_callback+0x94/0x1b0 sound/drivers/dummy.c:378
 __run_hrtimer kernel/time/hrtimer.c:1519 [inline]
 __hrtimer_run_queues+0x609/0xe40 kernel/time/hrtimer.c:1583
 hrtimer_run_softirq+0x17b/0x360 kernel/time/hrtimer.c:1600
 __do_softirq+0x29b/0x9f6 kernel/softirq.c:345
 invoke_softirq kernel/softirq.c:221 [inline]
 __irq_exit_rcu kernel/softirq.c:422 [inline]
 irq_exit_rcu+0x134/0x200 kernel/softirq.c:434
 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1100
 </IRQ>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632
RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline]
RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:70 [inline]
RIP: 0010:arch_irqs_disabled arch/x86/include/asm/irqflags.h:137 [inline]
RIP: 0010:acpi_safe_halt drivers/acpi/processor_idle.c:111 [inline]
RIP: 0010:acpi_idle_do_entry+0x1c9/0x250 drivers/acpi/processor_idle.c:516
Code: dd 38 6e f8 84 db 75 ac e8 54 32 6e f8 e8 0f 1c 74 f8 e9 0c 00 00 00 e8 45 32 6e f8 0f 00 2d 4e 4a c5 00 e8 39 32 6e f8 fb f4 <9c> 5b 81 e3 00 02 00 00 fa 31 ff 48 89 de e8 14 3a 6e f8 48 85 db
RSP: 0018:ffffc90000d47d18 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8880115c3780 RSI: ffffffff89052537 RDI: 0000000000000000
RBP: ffff888141127064 R08: 0000000000000001 R09: 0000000000000001
R10: ffffffff81794168 R11: 0000000000000000 R12: 0000000000000001
R13: ffff888141127000 R14: ffff888141127064 R15: ffff888143331804
 acpi_idle_enter+0x361/0x500 drivers/acpi/processor_idle.c:647
 cpuidle_enter_state+0x1b1/0xc80 drivers/cpuidle/cpuidle.c:237
 cpuidle_enter+0x4a/0xa0 drivers/cpuidle/cpuidle.c:351
 call_cpuidle kernel/sched/idle.c:158 [inline]
 cpuidle_idle_call kernel/sched/idle.c:239 [inline]
 do_idle+0x3e1/0x590 kernel/sched/idle.c:300
 cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:397
 start_secondary+0x274/0x350 arch/x86/kernel/smpboot.c:272
 secondary_startup_64_no_verify+0xb0/0xbb



* Re: possible deadlock in io_poll_double_wake (2)
  2021-02-28  0:42 ` syzbot
@ 2021-02-28 23:08   ` Jens Axboe
  2021-03-01  2:08     ` Reply: " Zhang, Qiang
  2021-03-01  4:18     ` syzbot
  0 siblings, 2 replies; 15+ messages in thread
From: Jens Axboe @ 2021-02-28 23:08 UTC
  To: syzbot, asml.silence, io-uring, linux-fsdevel, linux-kernel,
	syzkaller-bugs, viro

On 2/27/21 5:42 PM, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
> 
> [...]
> 
> ============================================
> WARNING: possible recursive locking detected
> 5.11.0-syzkaller #0 Not tainted
> --------------------------------------------
> swapper/1/0 is trying to acquire lock:
> ffff88801b2b1130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
> ffff88801b2b1130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4960
> 
> but task is already holding lock:
> ffff88801b2b3130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137
> 
> [...]

This looks very odd; the only thing I can think of is someone calling
poll_wait() twice with different entries but for the same
waitqueue head.
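
To make that hypothesis concrete, here is a minimal userspace sketch in
C. It is not kernel code: every toy_* name is hypothetical, the "lock"
only records whether it is held, and the assumption that the wake
callback takes the lock of the sibling entry's waitqueue head is
inferred from the backtrace, not from reading fs/io_uring.c. It shows
why two entries queued on the same head would self-deadlock when the
callback runs with that head's lock already held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a spinlock: it records ownership instead of spinning. */
struct toy_lock {
	bool held;
};

/* Returns false where a real, non-recursive spinlock would spin forever. */
static bool toy_lock_acquire(struct toy_lock *lock)
{
	if (lock->held)
		return false;
	lock->held = true;
	return true;
}

static void toy_lock_release(struct toy_lock *lock)
{
	assert(lock->held);
	lock->held = false;
}

#define TOY_MAX_ENTRIES 4

struct toy_wq_entry;

/* Hypothetical, simplified waitqueue head: one lock plus its entries. */
struct toy_wq_head {
	struct toy_lock lock;
	struct toy_wq_entry *entries[TOY_MAX_ENTRIES];
	size_t nr_entries;
};

/* One poll entry; other_head is the head its sibling entry is queued on. */
struct toy_wq_entry {
	struct toy_wq_head *other_head;
	bool would_deadlock;
};

static void toy_add_wait(struct toy_wq_head *head, struct toy_wq_entry *entry,
			 struct toy_wq_head *other_head)
{
	assert(head->nr_entries < TOY_MAX_ENTRIES);
	entry->other_head = other_head;
	entry->would_deadlock = false;
	head->entries[head->nr_entries++] = entry;
}

/* Models the wake callback taking the sibling entry's head lock. */
static void toy_double_wake(struct toy_wq_entry *entry)
{
	if (!toy_lock_acquire(&entry->other_head->lock)) {
		entry->would_deadlock = true;	/* lock is already held by us */
		return;
	}
	toy_lock_release(&entry->other_head->lock);
}

/* Models a locked wake-up: callbacks run with the head lock held. */
static void toy_wake_up(struct toy_wq_head *head)
{
	bool locked = toy_lock_acquire(&head->lock);

	assert(locked);
	(void)locked;
	for (size_t i = 0; i < head->nr_entries; i++)
		toy_double_wake(head->entries[i]);
	toy_lock_release(&head->lock);
}
```

In this model, an entry whose other_head is a different head wakes
cleanly, while an entry whose other_head is the head being woken ends up
with would_deadlock set.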

#syz test: git://git.kernel.dk/linux-block syzbot-test

-- 
Jens Axboe



* Reply: possible deadlock in io_poll_double_wake (2)
  2021-02-28 23:08   ` Jens Axboe
@ 2021-03-01  2:08     ` Zhang, Qiang
  2021-03-01  2:48       ` Jens Axboe
  2021-03-01  4:18     ` syzbot
  1 sibling, 1 reply; 15+ messages in thread
From: Zhang, Qiang @ 2021-03-01  2:08 UTC
  To: Jens Axboe, syzbot, asml.silence, io-uring, linux-fsdevel,
	linux-kernel, syzkaller-bugs, viro



________________________________________
From: Jens Axboe <axboe@kernel.dk>
Sent: March 1, 2021 7:08
To: syzbot; asml.silence@gmail.com; io-uring@vger.kernel.org; linux-fsdevel@vger.kernel.org; linux-kernel@vger.kernel.org; syzkaller-bugs@googlegroups.com; viro@zeniv.linux.org.uk
Subject: Re: possible deadlock in io_poll_double_wake (2)

On 2/27/21 5:42 PM, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
>
> [...]
>
> ============================================
> WARNING: possible recursive locking detected
> 5.11.0-syzkaller #0 Not tainted
> --------------------------------------------
> swapper/1/0 is trying to acquire lock:
> ffff88801b2b1130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
> ffff88801b2b1130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4960
>
> but task is already holding lock:
> ffff88801b2b3130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137
>
> [...]

> This looks very odd, only thing I can think of is someone doing
> poll_wait() twice with different entries but for the same
> waitqueue head.
>

Hello Jens Axboe,

poll_wait() is called twice here, once for each substream's waitqueue
head 'runtime->sleep', in sound/core/oss/pcm_oss.c:

static __poll_t snd_pcm_oss_poll(struct file *file, poll_table * wait) {
...........
        if (psubstream != NULL) {
                struct snd_pcm_runtime *runtime = psubstream->runtime;
                poll_wait(file, &runtime->sleep, wait);
                snd_pcm_stream_lock_irq(psubstream);
                if (runtime->status->state != SNDRV_PCM_STATE_DRAINING &&
                    (runtime->status->state != SNDRV_PCM_STATE_RUNNING ||
                     snd_pcm_oss_playback_ready(psubstream)))
                        mask |= EPOLLOUT | EPOLLWRNORM;
                snd_pcm_stream_unlock_irq(psubstream);
        }
        if (csubstream != NULL) {
                struct snd_pcm_runtime *runtime = csubstream->runtime;
                snd_pcm_state_t ostate;
                poll_wait(file, &runtime->sleep, wait);
                snd_pcm_stream_lock_irq(csubstream);
..........
}

I don't know whether any other drivers use the same pattern. Could a check for this case be added in io_poll_double_wake()?
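
The two poll_wait() calls above register on two different lock
instances (the report shows different addresses for the held and the
wanted lock), but both are printed as '&runtime->sleep', i.e. they
share one lock class. As a rough illustration of why a class-based
checker complains about that, here is a small userspace sketch in C. It
is a toy model under stated assumptions, not lockdep itself: all toy_*
names are hypothetical, and the rule "same class key and same subclass
already held means possible recursion" is a simplification of the real
checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All locks initialised at one site share a class key (toy version). */
struct toy_lock {
	const char *class_key;
};

struct toy_held {
	const struct toy_lock *lock;
	unsigned int subclass;
};

#define TOY_MAX_HELD 8

/* Per-task stack of currently held locks. */
struct toy_task {
	struct toy_held held[TOY_MAX_HELD];
	size_t depth;
};

/*
 * Records the acquisition and returns true when it looks like recursive
 * locking: a lock with the same class key and subclass is already held.
 * The check ignores whether the two lock instances are actually distinct.
 */
static bool toy_acquire(struct toy_task *task, const struct toy_lock *lock,
			unsigned int subclass)
{
	bool recursive = false;

	for (size_t i = 0; i < task->depth; i++) {
		if (task->held[i].lock->class_key == lock->class_key &&
		    task->held[i].subclass == subclass)
			recursive = true;
	}
	assert(task->depth < TOY_MAX_HELD);
	task->held[task->depth].lock = lock;
	task->held[task->depth].subclass = subclass;
	task->depth++;
	return recursive;
}

static void toy_release(struct toy_task *task)
{
	assert(task->depth > 0);
	task->depth--;
}

/* One key for every 'runtime->sleep' lock, as in the report. */
static const char toy_runtime_sleep_key[] = "&runtime->sleep";

/*
 * Models the wake-up path holding the playback runtime's lock while the
 * callback takes the capture runtime's lock with the given subclass.
 */
static bool toy_double_wake_flagged(unsigned int inner_subclass)
{
	struct toy_lock playback = { toy_runtime_sleep_key };
	struct toy_lock capture = { toy_runtime_sleep_key };
	struct toy_task task = { .depth = 0 };
	bool flagged;

	toy_acquire(&task, &playback, 0);
	flagged = toy_acquire(&task, &capture, inner_subclass);
	toy_release(&task);
	toy_release(&task);
	return flagged;
}
```

With subclass 0 for both acquisitions the toy checker flags the second
one even though the instances differ, which is the situation the "May be
due to missing lock nesting notation" hint describes. With a distinct
subclass for the inner acquisition it stays quiet. Whether a nesting
annotation, a check in io_poll_double_wake(), or something else is the
right fix is not something this sketch can answer.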

>#syz test: git://git.kernel.dk/linux-block syzbot-test
>
>--
>Jens Axboe



* Re: Reply: possible deadlock in io_poll_double_wake (2)
  2021-03-01  2:08     ` Reply: " Zhang, Qiang
@ 2021-03-01  2:48       ` Jens Axboe
  0 siblings, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2021-03-01  2:48 UTC
  To: Zhang, Qiang, syzbot, asml.silence, io-uring, linux-fsdevel,
	linux-kernel, syzkaller-bugs, viro

On 2/28/21 7:08 PM, Zhang, Qiang wrote:
> 
> 
> ________________________________________
> From: Jens Axboe <axboe@kernel.dk>
> Sent: March 1, 2021 7:08
> To: syzbot; asml.silence@gmail.com; io-uring@vger.kernel.org; linux-fsdevel@vger.kernel.org; linux-kernel@vger.kernel.org; syzkaller-bugs@googlegroups.com; viro@zeniv.linux.org.uk
> Subject: Re: possible deadlock in io_poll_double_wake (2)
> 
> On 2/27/21 5:42 PM, syzbot wrote:
>> syzbot has found a reproducer for the following issue on:
>>
>> [...]
>>
>> ============================================
>> WARNING: possible recursive locking detected
>> 5.11.0-syzkaller #0 Not tainted
>> --------------------------------------------
>> swapper/1/0 is trying to acquire lock:
>> ffff88801b2b1130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
>> ffff88801b2b1130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4960
>>
>> but task is already holding lock:
>> ffff88801b2b3130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137
>>
>> [...]
> 
>> This looks very odd, only thing I can think of is someone doing
>> poll_wait() twice with different entries but for the same
>> waitqueue head.
>>
> 
> Hello  Jens Axboe
> 
> here poll_wait() twice in waitqueue head 'runtime->sleep'
> in sound/core/oss/pcm_oss.c
> 
> static __poll_t snd_pcm_oss_poll(struct file *file, poll_table * wait) {
> ...........
>         if (psubstream != NULL) {
>                 struct snd_pcm_runtime *runtime = psubstream->runtime;
>                 poll_wait(file, &runtime->sleep, wait);
>                 snd_pcm_stream_lock_irq(psubstream);
>                 if (runtime->status->state != SNDRV_PCM_STATE_DRAINING &&
>                     (runtime->status->state != SNDRV_PCM_STATE_RUNNING ||
>                      snd_pcm_oss_playback_ready(psubstream)))
>                         mask |= EPOLLOUT | EPOLLWRNORM;
>                 snd_pcm_stream_unlock_irq(psubstream);
>         }
>         if (csubstream != NULL) {
>                 struct snd_pcm_runtime *runtime = csubstream->runtime;
>                 snd_pcm_state_t ostate;
>                 poll_wait(file, &runtime->sleep, wait);
>                 snd_pcm_stream_lock_irq(csubstream);
> ..........
> }
> 
> I don't know whether any other drivers do the same thing. Could a check be added in io_poll_double_wake()?

Right, that's what my post-email investigation led to as well, hence I queued
this one up:

https://git.kernel.dk/cgit/linux-block/commit/?h=io_uring-worker.v4&id=4a0a6fd611f5109bcfab4a95db836bb27131e3be

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in io_poll_double_wake (2)
  2021-02-28 23:08   ` Jens Axboe
  2021-03-01  2:08     ` Re: " Zhang, Qiang
@ 2021-03-01  4:18     ` syzbot
  2021-03-01 15:27       ` Jens Axboe
  2021-03-02 17:20       ` Jens Axboe
  1 sibling, 2 replies; 15+ messages in thread
From: syzbot @ 2021-03-01  4:18 UTC (permalink / raw)
  To: asml.silence, axboe, io-uring, linux-fsdevel, linux-kernel,
	syzkaller-bugs, viro

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in io_poll_double_wake

============================================
WARNING: possible recursive locking detected
5.11.0-syzkaller #0 Not tainted
--------------------------------------------
syz-executor.0/10241 is trying to acquire lock:
ffff888012e09130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
ffff888012e09130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4921

but task is already holding lock:
ffff888013b00130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&runtime->sleep);
  lock(&runtime->sleep);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

4 locks held by syz-executor.0/10241:
 #0: ffff88801b1ce128 (&ctx->uring_lock){+.+.}-{3:3}, at: __do_sys_io_uring_enter+0x1155/0x22b0 fs/io_uring.c:9139
 #1: ffffffff8b574460 (rcu_read_lock){....}-{1:2}, at: file_ctx security/apparmor/include/file.h:33 [inline]
 #1: ffffffff8b574460 (rcu_read_lock){....}-{1:2}, at: aa_file_perm+0x119/0x1100 security/apparmor/file.c:607
 #2: ffff888020842108 (&group->lock){..-.}-{2:2}, at: _snd_pcm_stream_lock_irqsave+0x9f/0xd0 sound/core/pcm_native.c:170
 #3: ffff888013b00130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137

stack backtrace:
CPU: 1 PID: 10241 Comm: syz-executor.0 Not tainted 5.11.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0xfa/0x151 lib/dump_stack.c:120
 print_deadlock_bug kernel/locking/lockdep.c:2829 [inline]
 check_deadlock kernel/locking/lockdep.c:2872 [inline]
 validate_chain kernel/locking/lockdep.c:3661 [inline]
 __lock_acquire.cold+0x14c/0x3b4 kernel/locking/lockdep.c:4900
 lock_acquire kernel/locking/lockdep.c:5510 [inline]
 lock_acquire+0x1ab/0x730 kernel/locking/lockdep.c:5475
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:354 [inline]
 io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4921
 __wake_up_common+0x147/0x650 kernel/sched/wait.c:108
 __wake_up_common_lock+0xd0/0x130 kernel/sched/wait.c:138
 snd_pcm_update_state+0x46a/0x540 sound/core/pcm_lib.c:203
 snd_pcm_update_hw_ptr0+0xa75/0x1a50 sound/core/pcm_lib.c:464
 snd_pcm_period_elapsed+0x160/0x250 sound/core/pcm_lib.c:1805
 dummy_hrtimer_callback+0x94/0x1b0 sound/drivers/dummy.c:378
 __run_hrtimer kernel/time/hrtimer.c:1519 [inline]
 __hrtimer_run_queues+0x609/0xe40 kernel/time/hrtimer.c:1583
 hrtimer_run_softirq+0x17b/0x360 kernel/time/hrtimer.c:1600
 __do_softirq+0x29b/0x9f6 kernel/softirq.c:343
 asm_call_irq_on_stack+0xf/0x20
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
 do_softirq_own_stack+0xaa/0xd0 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:226 [inline]
 __irq_exit_rcu kernel/softirq.c:420 [inline]
 irq_exit_rcu+0x134/0x200 kernel/softirq.c:432
 sysvec_apic_timer_interrupt+0x4d/0x100 arch/x86/kernel/apic/apic.c:1100
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:635
RIP: 0010:lock_acquire+0x1e4/0x730 kernel/locking/lockdep.c:5478
Code: 07 b8 ff ff ff ff 65 0f c1 05 e8 b3 a8 7e 83 f8 01 0f 85 d9 03 00 00 48 83 7c 24 08 00 74 01 fb 48 b8 00 00 00 00 00 fc ff df <48> 01 c3 48 c7 03 00 00 00 00 48 c7 43 08 00 00 00 00 48 8b 84 24
RSP: 0018:ffffc9000b0e7468 EFLAGS: 00000206
RAX: dffffc0000000000 RBX: 1ffff9200161ce8f RCX: 00000000013c4b5b
RDX: 1ffff110030b54a9 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff8f0a77a7
R10: fffffbfff1e14ef4 R11: 0000000000000000 R12: 0000000000000002
R13: ffffffff8b574460 R14: 0000000000000000 R15: 0000000000000000
 rcu_lock_acquire include/linux/rcupdate.h:267 [inline]
 rcu_read_lock include/linux/rcupdate.h:656 [inline]
 aa_file_perm+0x14d/0x1100 security/apparmor/file.c:609
 common_file_perm security/apparmor/lsm.c:466 [inline]
 apparmor_file_permission+0x163/0x4e0 security/apparmor/lsm.c:480
 security_file_permission+0x56/0x560 security/security.c:1456
 rw_verify_area+0x115/0x350 fs/read_write.c:400
 io_read+0x267/0xaf0 fs/io_uring.c:3235
 io_issue_sqe+0x2e1/0x59d0 fs/io_uring.c:5937
 __io_queue_sqe+0x18c/0xc40 fs/io_uring.c:6204
 io_queue_sqe+0x60d/0xf60 fs/io_uring.c:6257
 io_submit_sqe fs/io_uring.c:6421 [inline]
 io_submit_sqes+0x519a/0x6320 fs/io_uring.c:6535
 __do_sys_io_uring_enter+0x1161/0x22b0 fs/io_uring.c:9140
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x465ef9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fd5184e3188 EFLAGS: 00000246 ORIG_RAX: 00000000000001aa
RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000465ef9
RDX: 0000000000000000 RSI: 0000000000002039 RDI: 0000000000000004
RBP: 00000000004bcd1c R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60
R13: 0000000000a9fb1f R14: 00007fd5184e3300 R15: 0000000000022000


Tested on:

commit:         d5c6caec io_uring: test patch for double wake syzbot issue
git tree:       git://git.kernel.dk/linux-block syzbot-test
console output: https://syzkaller.appspot.com/x/log.txt?x=133f4f82d00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=e348dbdef26bb725
dashboard link: https://syzkaller.appspot.com/bug?extid=28abd693db9e92c160d8
compiler:       


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in io_poll_double_wake (2)
  2021-03-01  4:18     ` syzbot
@ 2021-03-01 15:27       ` Jens Axboe
  2021-03-01 15:55         ` syzbot
  2021-03-02 17:20       ` Jens Axboe
  1 sibling, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2021-03-01 15:27 UTC (permalink / raw)
  To: syzbot, asml.silence, io-uring, linux-fsdevel, linux-kernel,
	syzkaller-bugs, viro

#syz test: git://git.kernel.dk/linux-block syzbot-test

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in io_poll_double_wake (2)
  2021-03-01 15:27       ` Jens Axboe
@ 2021-03-01 15:55         ` syzbot
  0 siblings, 0 replies; 15+ messages in thread
From: syzbot @ 2021-03-01 15:55 UTC (permalink / raw)
  To: asml.silence, axboe, io-uring, linux-fsdevel, linux-kernel,
	syzkaller-bugs, viro

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in io_poll_double_wake

============================================
WARNING: possible recursive locking detected
5.11.0-syzkaller #0 Not tainted
--------------------------------------------
syz-executor.3/8853 is trying to acquire lock:
ffff88802cfbd130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
ffff88802cfbd130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4921

but task is already holding lock:
ffff888018ac2130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&runtime->sleep);
  lock(&runtime->sleep);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

5 locks held by syz-executor.3/8853:
 #0: ffffffff8b63e390 (dup_mmap_sem){.+.+}-{0:0}, at: dup_mmap kernel/fork.c:479 [inline]
 #0: ffffffff8b63e390 (dup_mmap_sem){.+.+}-{0:0}, at: dup_mm+0x108/0x1380 kernel/fork.c:1360
 #1: ffff8880249cb958 (&mm->mmap_lock#2){++++}-{3:3}, at: mmap_write_lock_killable include/linux/mmap_lock.h:87 [inline]
 #1: ffff8880249cb958 (&mm->mmap_lock#2){++++}-{3:3}, at: dup_mmap kernel/fork.c:480 [inline]
 #1: ffff8880249cb958 (&mm->mmap_lock#2){++++}-{3:3}, at: dup_mm+0x12e/0x1380 kernel/fork.c:1360
 #2: ffff88802b225558 (&mm->mmap_lock/1){+.+.}-{3:3}, at: mmap_write_lock_nested include/linux/mmap_lock.h:78 [inline]
 #2: ffff88802b225558 (&mm->mmap_lock/1){+.+.}-{3:3}, at: dup_mmap kernel/fork.c:489 [inline]
 #2: ffff88802b225558 (&mm->mmap_lock/1){+.+.}-{3:3}, at: dup_mm+0x18a/0x1380 kernel/fork.c:1360
 #3: ffff888020f6c908 (&group->lock){..-.}-{2:2}, at: _snd_pcm_stream_lock_irqsave+0x9f/0xd0 sound/core/pcm_native.c:170
 #4: ffff888018ac2130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137

stack backtrace:
CPU: 1 PID: 8853 Comm: syz-executor.3 Not tainted 5.11.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0xfa/0x151 lib/dump_stack.c:120
 print_deadlock_bug kernel/locking/lockdep.c:2829 [inline]
 check_deadlock kernel/locking/lockdep.c:2872 [inline]
 validate_chain kernel/locking/lockdep.c:3661 [inline]
 __lock_acquire.cold+0x14c/0x3b4 kernel/locking/lockdep.c:4900
 lock_acquire kernel/locking/lockdep.c:5510 [inline]
 lock_acquire+0x1ab/0x730 kernel/locking/lockdep.c:5475
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:354 [inline]
 io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4921
 __wake_up_common+0x147/0x650 kernel/sched/wait.c:108
 __wake_up_common_lock+0xd0/0x130 kernel/sched/wait.c:138
 snd_pcm_update_state+0x46a/0x540 sound/core/pcm_lib.c:203
 snd_pcm_update_hw_ptr0+0xa75/0x1a50 sound/core/pcm_lib.c:464
 snd_pcm_period_elapsed+0x160/0x250 sound/core/pcm_lib.c:1805
 dummy_hrtimer_callback+0x94/0x1b0 sound/drivers/dummy.c:378
 __run_hrtimer kernel/time/hrtimer.c:1519 [inline]
 __hrtimer_run_queues+0x609/0xe40 kernel/time/hrtimer.c:1583
 hrtimer_run_softirq+0x17b/0x360 kernel/time/hrtimer.c:1600
 __do_softirq+0x29b/0x9f6 kernel/softirq.c:343
 asm_call_irq_on_stack+0xf/0x20
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
 do_softirq_own_stack+0xaa/0xd0 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:226 [inline]
 __irq_exit_rcu kernel/softirq.c:420 [inline]
 irq_exit_rcu+0x134/0x200 kernel/softirq.c:432
 sysvec_apic_timer_interrupt+0x4d/0x100 arch/x86/kernel/apic/apic.c:1100
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:635
RIP: 0010:__sanitizer_cov_trace_const_cmp8+0x0/0x70 kernel/kcov.c:290
Code: fe 72 22 44 89 c6 48 83 c2 01 48 89 4c 38 f0 48 c7 44 38 e0 05 00 00 00 48 89 74 38 e8 4e 89 54 c8 20 48 89 10 c3 0f 1f 40 00 <49> 89 f8 bf 03 00 00 00 4c 8b 14 24 48 89 f1 65 48 8b 34 25 00 f0
RSP: 0018:ffffc90001a6f808 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffffc90001a6fa00 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000004 R08: 0000000000000000 R09: ffff888022c75ea3
R10: ffffed100458ebd4 R11: 0000000000000000 R12: ffff88802b225800
R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000000
 copy_pte_range mm/memory.c:1018 [inline]
 copy_pmd_range mm/memory.c:1070 [inline]
 copy_pud_range mm/memory.c:1107 [inline]
 copy_p4d_range mm/memory.c:1131 [inline]
 copy_page_range+0x127f/0x3fb0 mm/memory.c:1204
 dup_mmap kernel/fork.c:594 [inline]
 dup_mm+0x9ed/0x1380 kernel/fork.c:1360
 copy_mm kernel/fork.c:1416 [inline]
 copy_process+0x2a4c/0x6fd0 kernel/fork.c:2097
 kernel_clone+0xe7/0xab0 kernel/fork.c:2462
 __do_sys_clone+0xc8/0x110 kernel/fork.c:2579
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4644eb
Code: ed 0f 85 60 01 00 00 64 4c 8b 0c 25 10 00 00 00 45 31 c0 4d 8d 91 d0 02 00 00 31 d2 31 f6 bf 11 00 20 01 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 89 00 00 00 41 89 c5 85 c0 0f 85 90 00 00
RSP: 002b:0000000000a9fd50 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004644eb
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
RBP: 0000000000000001 R08: 0000000000000000 R09: 00000000026a7400
R10: 00000000026a76d0 R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000a9fe40


Tested on:

commit:         d5c6caec io_uring: test patch for double wake syzbot issue
git tree:       git://git.kernel.dk/linux-block syzbot-test
console output: https://syzkaller.appspot.com/x/log.txt?x=15129782d00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=e348dbdef26bb725
dashboard link: https://syzkaller.appspot.com/bug?extid=28abd693db9e92c160d8
compiler:       


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in io_poll_double_wake (2)
  2021-03-01  4:18     ` syzbot
  2021-03-01 15:27       ` Jens Axboe
@ 2021-03-02 17:20       ` Jens Axboe
  2021-03-02 18:59         ` syzbot
                           ` (2 more replies)
  1 sibling, 3 replies; 15+ messages in thread
From: Jens Axboe @ 2021-03-02 17:20 UTC (permalink / raw)
  To: syzbot, asml.silence, io-uring, linux-fsdevel, linux-kernel,
	syzkaller-bugs, viro

On 2/28/21 9:18 PM, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> possible deadlock in io_poll_double_wake
> 
> ============================================
> WARNING: possible recursive locking detected
> 5.11.0-syzkaller #0 Not tainted
> --------------------------------------------
> syz-executor.0/10241 is trying to acquire lock:
> ffff888012e09130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
> ffff888012e09130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4921
> 
> but task is already holding lock:
> ffff888013b00130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137
> 
> other info that might help us debug this:
>  Possible unsafe locking scenario:
> 
>        CPU0
>        ----
>   lock(&runtime->sleep);
>   lock(&runtime->sleep);
> 
>  *** DEADLOCK ***
> 
>  May be due to missing lock nesting notation

Since the fix is in, yet this keeps failing (and I didn't get it), I looked
closer at this report. While the names of the locks are the same, they are
really two different locks. So let's try this...
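
To illustrate the point about same-named but distinct locks, here is a minimal userspace sketch of a class-based recursion check. It is not lockdep itself: `model_lock`, `held_stack` and `model_acquire` are hypothetical names, and the model ignores read locks and nest_lock handling. The assumption it encodes is that the check compares lock classes (name plus subclass) rather than instance addresses, which is why two different `&runtime->sleep` locks can still produce a "possible recursive locking" report.

```c
#include <assert.h>
#include <string.h>

#define MAX_HELD 8

/* Hypothetical stand-in for a lock instance as a class-based checker sees it. */
struct model_lock {
	const char *class_name;	/* shared by all locks initialised at one site */
};

/* Hypothetical per-task stack of currently held locks. */
struct held_stack {
	const struct model_lock *lock[MAX_HELD];
	int subclass[MAX_HELD];
	int depth;
};

/*
 * Record an acquisition. Returns 1 if a lock of the same class and
 * subclass is already held (this is what would be reported), else 0.
 * Instance addresses are deliberately not compared.
 */
static int model_acquire(struct held_stack *hs, const struct model_lock *lock,
			 int subclass)
{
	int warn = 0;
	int i;

	for (i = 0; i < hs->depth; i++) {
		if (strcmp(hs->lock[i]->class_name, lock->class_name) == 0 &&
		    hs->subclass[i] == subclass)
			warn = 1;
	}

	assert(hs->depth < MAX_HELD);
	hs->lock[hs->depth] = lock;
	hs->subclass[hs->depth] = subclass;
	hs->depth++;
	return warn;
}
```

In this model, taking two distinct instances named "&runtime->sleep" with subclass 0 is flagged, while taking the second one with a different subclass is not. In the kernel, a different subclass is what spin_lock_nested() with SINGLE_DEPTH_NESTING expresses; whether that annotation is the right change for this report is not established at this point in the thread.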

#syz test: git://git.kernel.dk/linux-block syzbot-test

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in io_poll_double_wake (2)
  2021-03-02 17:20       ` Jens Axboe
@ 2021-03-02 18:59         ` syzbot
  2021-03-03  4:01           ` Jens Axboe
  2021-03-03 12:15         ` Re: " Zhang, Qiang
       [not found]         ` <20210303065231.1589-1-hdanton@sina.com>
  2 siblings, 1 reply; 15+ messages in thread
From: syzbot @ 2021-03-02 18:59 UTC (permalink / raw)
  To: asml.silence, axboe, io-uring, linux-fsdevel, linux-kernel,
	syzkaller-bugs, viro

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in io_poll_double_wake

============================================
WARNING: possible recursive locking detected
5.12.0-rc1-syzkaller #0 Not tainted
--------------------------------------------
syz-executor.4/10454 is trying to acquire lock:
ffff8880343cc130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
ffff8880343cc130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4925

but task is already holding lock:
ffff888034e3b130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&runtime->sleep);
  lock(&runtime->sleep);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

4 locks held by syz-executor.4/10454:
 #0: ffff888018cc8128 (&ctx->uring_lock){+.+.}-{3:3}, at: __do_sys_io_uring_enter+0x1146/0x2200 fs/io_uring.c:9113
 #1: ffff888021692440 (&runtime->oss.params_lock){+.+.}-{3:3}, at: snd_pcm_oss_change_params sound/core/oss/pcm_oss.c:1087 [inline]
 #1: ffff888021692440 (&runtime->oss.params_lock){+.+.}-{3:3}, at: snd_pcm_oss_make_ready+0xc7/0x1b0 sound/core/oss/pcm_oss.c:1149
 #2: ffff888020273908 (&group->lock){..-.}-{2:2}, at: _snd_pcm_stream_lock_irqsave+0x9f/0xd0 sound/core/pcm_native.c:170
 #3: ffff888034e3b130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137

stack backtrace:
CPU: 0 PID: 10454 Comm: syz-executor.4 Not tainted 5.12.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0xfa/0x151 lib/dump_stack.c:120
 print_deadlock_bug kernel/locking/lockdep.c:2829 [inline]
 check_deadlock kernel/locking/lockdep.c:2872 [inline]
 validate_chain kernel/locking/lockdep.c:3661 [inline]
 __lock_acquire.cold+0x14c/0x3b4 kernel/locking/lockdep.c:4900
 lock_acquire kernel/locking/lockdep.c:5510 [inline]
 lock_acquire+0x1ab/0x730 kernel/locking/lockdep.c:5475
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:354 [inline]
 io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4925
 __wake_up_common+0x147/0x650 kernel/sched/wait.c:108
 __wake_up_common_lock+0xd0/0x130 kernel/sched/wait.c:138
 snd_pcm_update_state+0x46a/0x540 sound/core/pcm_lib.c:203
 snd_pcm_update_hw_ptr0+0xa75/0x1a50 sound/core/pcm_lib.c:464
 snd_pcm_period_elapsed+0x160/0x250 sound/core/pcm_lib.c:1805
 dummy_hrtimer_callback+0x94/0x1b0 sound/drivers/dummy.c:378
 __run_hrtimer kernel/time/hrtimer.c:1519 [inline]
 __hrtimer_run_queues+0x609/0xe40 kernel/time/hrtimer.c:1583
 hrtimer_run_softirq+0x17b/0x360 kernel/time/hrtimer.c:1600
 __do_softirq+0x29b/0x9f6 kernel/softirq.c:345
 invoke_softirq kernel/softirq.c:221 [inline]
 __irq_exit_rcu kernel/softirq.c:422 [inline]
 irq_exit_rcu+0x134/0x200 kernel/softirq.c:434
 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1100
 </IRQ>
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632
RIP: 0010:unwind_next_frame+0xde0/0x2000 arch/x86/kernel/unwind_orc.c:611
Code: 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e 83 0f 00 00 41 3b 2f 0f 84 c1 05 00 00 <bf> 01 00 00 00 e8 16 95 1b 00 b8 01 00 00 00 65 8b 15 ca a8 cf 7e
RSP: 0018:ffffc9000b447168 EFLAGS: 00000287
RAX: ffffc9000b448001 RBX: 1ffff92001688e35 RCX: 1ffff92001688e01
RDX: ffffc9000b447ae8 RSI: ffffc9000b447ab0 RDI: ffffc9000b447250
RBP: ffffc9000b447ae0 R08: ffffffff8dac0810 R09: 0000000000000001
R10: 0000000000084087 R11: 0000000000000001 R12: ffffc9000b440000
R13: ffffc9000b447275 R14: ffffc9000b447290 R15: ffffc9000b447240
 arch_stack_walk+0x7d/0xe0 arch/x86/kernel/stacktrace.c:25
 stack_trace_save+0x8c/0xc0 kernel/stacktrace.c:121
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track+0x1c/0x30 mm/kasan/common.c:46
 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:357
 ____kasan_slab_free mm/kasan/common.c:360 [inline]
 ____kasan_slab_free mm/kasan/common.c:325 [inline]
 __kasan_slab_free+0xf5/0x130 mm/kasan/common.c:367
 kasan_slab_free include/linux/kasan.h:199 [inline]
 slab_free_hook mm/slub.c:1562 [inline]
 slab_free_freelist_hook+0x72/0x1b0 mm/slub.c:1600
 slab_free mm/slub.c:3161 [inline]
 kfree+0xe5/0x7b0 mm/slub.c:4213
 snd_pcm_hw_param_near.constprop.0+0x7b0/0x8f0 sound/core/oss/pcm_oss.c:438
 snd_pcm_oss_change_params_locked+0x18c6/0x39a0 sound/core/oss/pcm_oss.c:936
 snd_pcm_oss_change_params sound/core/oss/pcm_oss.c:1090 [inline]
 snd_pcm_oss_make_ready+0xe7/0x1b0 sound/core/oss/pcm_oss.c:1149
 snd_pcm_oss_set_trigger.isra.0+0x30f/0x6e0 sound/core/oss/pcm_oss.c:2057
 snd_pcm_oss_poll+0x661/0xb10 sound/core/oss/pcm_oss.c:2841
 vfs_poll include/linux/poll.h:90 [inline]
 __io_arm_poll_handler+0x354/0xa20 fs/io_uring.c:5073
 io_arm_poll_handler fs/io_uring.c:5142 [inline]
 __io_queue_sqe+0x6ef/0xc40 fs/io_uring.c:6213
 io_queue_sqe+0x60d/0xf60 fs/io_uring.c:6259
 io_submit_sqe fs/io_uring.c:6423 [inline]
 io_submit_sqes+0x519a/0x6320 fs/io_uring.c:6537
 __do_sys_io_uring_enter+0x1152/0x2200 fs/io_uring.c:9114
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x465ef9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f818e00e188 EFLAGS: 00000246 ORIG_RAX: 00000000000001aa
RAX: ffffffffffffffda RBX: 000000000056c008 RCX: 0000000000465ef9
RDX: 0000000000000000 RSI: 0000000000002039 RDI: 0000000000000004
RBP: 00000000004bcd1c R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056c008
R13: 0000000000a9fb1f R14: 00007f818e00e300 R15: 0000000000022000


Tested on:

commit:         c9387501 sound: name fiddling
git tree:       git://git.kernel.dk/linux-block syzbot-test
console output: https://syzkaller.appspot.com/x/log.txt?x=16a51856d00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=fa0e4e0c3e0cf6e0
dashboard link: https://syzkaller.appspot.com/bug?extid=28abd693db9e92c160d8
compiler:       


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in io_poll_double_wake (2)
  2021-03-02 18:59         ` syzbot
@ 2021-03-03  4:01           ` Jens Axboe
  2021-03-03 11:36             ` syzbot
  0 siblings, 1 reply; 15+ messages in thread
From: Jens Axboe @ 2021-03-03  4:01 UTC (permalink / raw)
  To: syzbot, asml.silence, io-uring, linux-fsdevel, linux-kernel,
	syzkaller-bugs, viro

On 3/2/21 11:59 AM, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> possible deadlock in io_poll_double_wake
> 
> ============================================
> WARNING: possible recursive locking detected
> 5.12.0-rc1-syzkaller #0 Not tainted
> --------------------------------------------
> syz-executor.4/10454 is trying to acquire lock:
> ffff8880343cc130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
> ffff8880343cc130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4925
> 
> but task is already holding lock:
> ffff888034e3b130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137
> 
> other info that might help us debug this:
>  Possible unsafe locking scenario:
> 
>        CPU0
>        ----
>   lock(&runtime->sleep);
>   lock(&runtime->sleep);

This still makes no sense to me - the naming is the same, but the address of the
waitqueue_head is not (which is what matters). Unless I'm missing something obvious here.

Anyway, added some debug printks, so let's try again.
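
A userspace sketch of the situation being debugged, under assumptions read off the traces above: a wake-up runs with one waitqueue head's lock held and calls a callback that takes the lock of the request's other head. All names here (`model_head`, `model_req`, `model_wake_up`, `heads_differ`) are hypothetical, and the lock is a plain flag so that a self-deadlock shows up as an error return instead of a hang. This is not the debug patch that was tested.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a waitqueue head; 'locked' models its spinlock. */
struct model_head {
	int locked;
};

/* Hypothetical request queued on two heads (first poll entry and double entry). */
struct model_req {
	struct model_head *poll_head;
	struct model_head *dpoll_head;
};

/* Returns 0 on success, -1 if already held (a real spinlock would spin forever). */
static int model_lock(struct model_head *h)
{
	if (h->locked)
		return -1;
	h->locked = 1;
	return 0;
}

static void model_unlock(struct model_head *h)
{
	h->locked = 0;
}

/* Callback for the double entry: takes the first entry's head lock. */
static int model_double_wake(struct model_req *req)
{
	if (model_lock(req->poll_head) != 0)
		return -1;	/* same head as the caller: self-deadlock */
	/* the first entry would be removed from its queue here */
	model_unlock(req->poll_head);
	return 0;
}

/* Wake-up on the double entry's head: holds that lock across the callback. */
static int model_wake_up(struct model_req *req)
{
	int ret;

	if (model_lock(req->dpoll_head) != 0)
		return -1;
	ret = model_double_wake(req);
	model_unlock(req->dpoll_head);
	return ret;
}

/* The kind of comparison a debug message could report. */
static int heads_differ(const struct model_req *req)
{
	return req->poll_head != req->dpoll_head;
}
```

With two distinct heads, model_wake_up() returns 0 and both flags end up clear; only when both pointers refer to the same head does the inner lock attempt fail. That matches the reasoning that the head address, not the lock name, decides whether the nested acquisition can actually deadlock.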

#syz test: git://git.kernel.dk/linux-block syzbot-test

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: possible deadlock in io_poll_double_wake (2)
  2021-03-03  4:01           ` Jens Axboe
@ 2021-03-03 11:36             ` syzbot
  0 siblings, 0 replies; 15+ messages in thread
From: syzbot @ 2021-03-03 11:36 UTC (permalink / raw)
  To: asml.silence, axboe, io-uring, linux-fsdevel, linux-kernel,
	syzkaller-bugs, viro

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in io_poll_double_wake

poll and dpoll head different
============================================
WARNING: possible recursive locking detected
5.12.0-rc1-syzkaller #0 Not tainted
--------------------------------------------
kworker/1:3/8637 is trying to acquire lock:
ffff888040471130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
ffff888040471130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake.cold+0x115/0x4e0 fs/io_uring.c:4931

but task is already holding lock:
ffff888040473130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&runtime->sleep);
  lock(&runtime->sleep);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

5 locks held by kworker/1:3/8637:
 #0: ffff888020d60938 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888020d60938 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff888020d60938 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff888020d60938 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
 #0: ffff888020d60938 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
 #0: ffff888020d60938 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: process_one_work+0x871/0x1600 kernel/workqueue.c:2246
 #1: ffffc900027bfda8 ((work_completion)(&(&ifa->dad_work)->work)){+.+.}-{0:0}, at: process_one_work+0x8a5/0x1600 kernel/workqueue.c:2250
 #2: ffffffff8ce7d028 (rtnl_mutex){+.+.}-{3:3}, at: addrconf_dad_work+0xa3/0x12b0 net/ipv6/addrconf.c:4031
 #3: ffff8880209d8908 (&group->lock){..-.}-{2:2}, at: _snd_pcm_stream_lock_irqsave+0x9f/0xd0 sound/core/pcm_native.c:170
 #4: ffff888040473130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137

stack backtrace:
CPU: 1 PID: 8637 Comm: kworker/1:3 Not tainted 5.12.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: ipv6_addrconf addrconf_dad_work
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0xfa/0x151 lib/dump_stack.c:120
 print_deadlock_bug kernel/locking/lockdep.c:2829 [inline]
 check_deadlock kernel/locking/lockdep.c:2872 [inline]
 validate_chain kernel/locking/lockdep.c:3661 [inline]
 __lock_acquire.cold+0x14c/0x3b4 kernel/locking/lockdep.c:4900
 lock_acquire kernel/locking/lockdep.c:5510 [inline]
 lock_acquire+0x1ab/0x730 kernel/locking/lockdep.c:5475
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:354 [inline]
 io_poll_double_wake.cold+0x115/0x4e0 fs/io_uring.c:4931
 __wake_up_common+0x147/0x650 kernel/sched/wait.c:108
 __wake_up_common_lock+0xd0/0x130 kernel/sched/wait.c:138
 snd_pcm_update_state+0x46a/0x540 sound/core/pcm_lib.c:203
 snd_pcm_update_hw_ptr0+0xa75/0x1a50 sound/core/pcm_lib.c:464
 snd_pcm_period_elapsed+0x160/0x250 sound/core/pcm_lib.c:1805
 dummy_hrtimer_callback+0x94/0x1b0 sound/drivers/dummy.c:378
 __run_hrtimer kernel/time/hrtimer.c:1519 [inline]
 __hrtimer_run_queues+0x609/0xe40 kernel/time/hrtimer.c:1583
 hrtimer_run_softirq+0x17b/0x360 kernel/time/hrtimer.c:1600
 __do_softirq+0x29b/0x9f6 kernel/softirq.c:345
 do_softirq.part.0+0xc8/0x110 kernel/softirq.c:248
 </IRQ>
 do_softirq kernel/softirq.c:240 [inline]
 __local_bh_enable_ip+0x102/0x120 kernel/softirq.c:198
 mld_send_initial_cr.part.0+0xf4/0x150 net/ipv6/mcast.c:2094
 mld_send_initial_cr net/ipv6/mcast.c:1191 [inline]
 ipv6_mc_dad_complete+0x1bb/0x6b0 net/ipv6/mcast.c:2103
 addrconf_dad_completed+0x94d/0xc70 net/ipv6/addrconf.c:4175
 addrconf_dad_work+0x79f/0x12b0 net/ipv6/addrconf.c:4105
 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
poll and dpoll head different
[the same line repeats 80 more times]


Tested on:

commit:         44a23ff1 io_uring: debug messages
git tree:       git://git.kernel.dk/linux-block syzbot-test
console output: https://syzkaller.appspot.com/x/log.txt?x=1790cb92d00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=fa0e4e0c3e0cf6e0
dashboard link: https://syzkaller.appspot.com/bug?extid=28abd693db9e92c160d8
compiler:       



* Re: possible deadlock in io_poll_double_wake (2)
  2021-03-02 17:20       ` Jens Axboe
  2021-03-02 18:59         ` syzbot
@ 2021-03-03 12:15         ` Zhang, Qiang
  2021-03-03 12:45           ` Zhang, Qiang
       [not found]         ` <20210303065231.1589-1-hdanton@sina.com>
  2 siblings, 1 reply; 15+ messages in thread
From: Zhang, Qiang @ 2021-03-03 12:15 UTC (permalink / raw)
  To: Jens Axboe, syzbot, asml.silence, io-uring, linux-fsdevel,
	linux-kernel, syzkaller-bugs, viro



________________________________________
From: Jens Axboe <axboe@kernel.dk>
Sent: March 3, 2021 1:20
To: syzbot; asml.silence@gmail.com; io-uring@vger.kernel.org; linux-fsdevel@vger.kernel.org; linux-kernel@vger.kernel.org; syzkaller-bugs@googlegroups.com; viro@zeniv.linux.org.uk
Subject: Re: possible deadlock in io_poll_double_wake (2)

[Please note: This e-mail is from an EXTERNAL e-mail address]

On 2/28/21 9:18 PM, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> possible deadlock in io_poll_double_wake
>
> ============================================
> WARNING: possible recursive locking detected
> 5.11.0-syzkaller #0 Not tainted
> --------------------------------------------
> syz-executor.0/10241 is trying to acquire lock:
> ffff888012e09130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
> ffff888012e09130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4921
>
> but task is already holding lock:
> ffff888013b00130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>        ----
>   lock(&runtime->sleep);
>   lock(&runtime->sleep);
>
>  *** DEADLOCK ***
>
>  May be due to missing lock nesting notation
>
>Since the fix is in yet this keeps failing (and I didn't get it), I looked
>closer at this report. While the names of the locks are the same, they are
>really two different locks. So let's try this...

Hello Jens Axboe

Sorry, I provided the wrong information before.
I'm not very familiar with io_uring. Before we call vfs_poll again, should we set 'poll->head = NULL'?

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 42b675939582..cae605c14510 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4824,7 +4824,7 @@ static bool io_poll_rewait(struct io_kiocb *req, struct io_poll_iocb *poll)
 
        if (!req->result && !READ_ONCE(poll->canceled)) {
                struct poll_table_struct pt = { ._key = poll->events };
-
+               poll->head = NULL;
                req->result = vfs_poll(req->file, &pt) & poll->events;
        }

 

Thanks
Qiang

>
>#syz test: git://git.kernel.dk/linux-block syzbot-test
>
>--
>Jens Axboe



* Re: possible deadlock in io_poll_double_wake (2)
  2021-03-03 12:15         ` Re: " Zhang, Qiang
@ 2021-03-03 12:45           ` Zhang, Qiang
  0 siblings, 0 replies; 15+ messages in thread
From: Zhang, Qiang @ 2021-03-03 12:45 UTC (permalink / raw)
  To: Jens Axboe, syzbot, asml.silence, io-uring, linux-fsdevel,
	linux-kernel, syzkaller-bugs, viro



________________________________________
From: Zhang, Qiang <Qiang.Zhang@windriver.com>
Sent: March 3, 2021 20:15
To: Jens Axboe; syzbot; asml.silence@gmail.com; io-uring@vger.kernel.org; linux-fsdevel@vger.kernel.org; linux-kernel@vger.kernel.org; syzkaller-bugs@googlegroups.com; viro@zeniv.linux.org.uk
Subject: Re: possible deadlock in io_poll_double_wake (2)



________________________________________
From: Jens Axboe <axboe@kernel.dk>
Sent: March 3, 2021 1:20
To: syzbot; asml.silence@gmail.com; io-uring@vger.kernel.org; linux-fsdevel@vger.kernel.org; linux-kernel@vger.kernel.org; syzkaller-bugs@googlegroups.com; viro@zeniv.linux.org.uk
Subject: Re: possible deadlock in io_poll_double_wake (2)

[Please note: This e-mail is from an EXTERNAL e-mail address]

On 2/28/21 9:18 PM, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> possible deadlock in io_poll_double_wake
>
> ============================================
> WARNING: possible recursive locking detected
> 5.11.0-syzkaller #0 Not tainted
> --------------------------------------------
> syz-executor.0/10241 is trying to acquire lock:
> ffff888012e09130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
> ffff888012e09130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4921
>
> but task is already holding lock:
> ffff888013b00130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>        ----
>   lock(&runtime->sleep);
>   lock(&runtime->sleep);
>
>  *** DEADLOCK ***
>
>  May be due to missing lock nesting notation
>
>Since the fix is in yet this keeps failing (and I didn't get it), I looked
>closer at this report. While the names of the locks are the same, they are
>really two different locks. So let's try this...

>Hello Jens Axboe

Sorry for the noise, please ignore this information.

>Sorry, I provided the wrong information before.
>I'm not very familiar with io_uring. Before we call vfs_poll again, should we set 'poll->head = NULL'?
>
>diff --git a/fs/io_uring.c b/fs/io_uring.c
>index 42b675939582..cae605c14510 100644
>--- a/fs/io_uring.c
>+++ b/fs/io_uring.c
>@@ -4824,7 +4824,7 @@ static bool io_poll_rewait(struct io_kiocb *req, struct io_poll_iocb *poll)
>
>        if (!req->result && !READ_ONCE(poll->canceled)) {
>                struct poll_table_struct pt = { ._key = poll->events };
>-
>+               poll->head = NULL;
>                req->result = vfs_poll(req->file, &pt) & poll->events;
>        }



>Thanks
>Qiang

>
>#syz test: git://git.kernel.dk/linux-block syzbot-test
>
>--
>Jens Axboe



* Re: possible deadlock in io_poll_double_wake (2)
       [not found]         ` <20210303065231.1589-1-hdanton@sina.com>
@ 2021-03-03 13:39           ` Jens Axboe
  0 siblings, 0 replies; 15+ messages in thread
From: Jens Axboe @ 2021-03-03 13:39 UTC (permalink / raw)
  To: Hillf Danton, syzbot; +Cc: asml.silence, io-uring, linux-kernel, syzkaller-bugs

On 3/2/21 11:52 PM, Hillf Danton wrote:
> Tue, 02 Mar 2021 10:59:05 -0800
>> Hello,
>>
>> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
>> possible deadlock in io_poll_double_wake
>>
>> ============================================
>> WARNING: possible recursive locking detected
>> 5.12.0-rc1-syzkaller #0 Not tainted
>> --------------------------------------------
>> syz-executor.4/10454 is trying to acquire lock:
>> ffff8880343cc130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
>> ffff8880343cc130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4925
>>
>> but task is already holding lock:
>> ffff888034e3b130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137
>>
>> other info that might help us debug this:
>>  Possible unsafe locking scenario:
>>
>>        CPU0
>>        ----
>>   lock(&runtime->sleep);
>>   lock(&runtime->sleep);
>>
>>  *** DEADLOCK ***
>>
>>  May be due to missing lock nesting notation
>>
>> 4 locks held by syz-executor.4/10454:
>>  #0: ffff888018cc8128 (&ctx->uring_lock){+.+.}-{3:3}, at: __do_sys_io_uring_enter+0x1146/0x2200 fs/io_uring.c:9113
>>  #1: ffff888021692440 (&runtime->oss.params_lock){+.+.}-{3:3}, at: snd_pcm_oss_change_params sound/core/oss/pcm_oss.c:1087 [inline]
>>  #1: ffff888021692440 (&runtime->oss.params_lock){+.+.}-{3:3}, at: snd_pcm_oss_make_ready+0xc7/0x1b0 sound/core/oss/pcm_oss.c:1149
>>  #2: ffff888020273908 (&group->lock){..-.}-{2:2}, at: _snd_pcm_stream_lock_irqsave+0x9f/0xd0 sound/core/pcm_native.c:170
>>  #3: ffff888034e3b130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137
>>
>> stack backtrace:
>> CPU: 0 PID: 10454 Comm: syz-executor.4 Not tainted 5.12.0-rc1-syzkaller #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>> Call Trace:
>>  <IRQ>
>>  __dump_stack lib/dump_stack.c:79 [inline]
>>  dump_stack+0xfa/0x151 lib/dump_stack.c:120
>>  print_deadlock_bug kernel/locking/lockdep.c:2829 [inline]
>>  check_deadlock kernel/locking/lockdep.c:2872 [inline]
>>  validate_chain kernel/locking/lockdep.c:3661 [inline]
>>  __lock_acquire.cold+0x14c/0x3b4 kernel/locking/lockdep.c:4900
>>  lock_acquire kernel/locking/lockdep.c:5510 [inline]
>>  lock_acquire+0x1ab/0x730 kernel/locking/lockdep.c:5475
>>  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>>  _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
>>  spin_lock include/linux/spinlock.h:354 [inline]
>>  io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4925
>>  __wake_up_common+0x147/0x650 kernel/sched/wait.c:108
>>  __wake_up_common_lock+0xd0/0x130 kernel/sched/wait.c:138
>>  snd_pcm_update_state+0x46a/0x540 sound/core/pcm_lib.c:203
>>  snd_pcm_update_hw_ptr0+0xa75/0x1a50 sound/core/pcm_lib.c:464
>>  snd_pcm_period_elapsed+0x160/0x250 sound/core/pcm_lib.c:1805
>>  dummy_hrtimer_callback+0x94/0x1b0 sound/drivers/dummy.c:378
>>  __run_hrtimer kernel/time/hrtimer.c:1519 [inline]
>>  __hrtimer_run_queues+0x609/0xe40 kernel/time/hrtimer.c:1583
>>  hrtimer_run_softirq+0x17b/0x360 kernel/time/hrtimer.c:1600
>>  __do_softirq+0x29b/0x9f6 kernel/softirq.c:345
>>  invoke_softirq kernel/softirq.c:221 [inline]
>>  __irq_exit_rcu kernel/softirq.c:422 [inline]
>>  irq_exit_rcu+0x134/0x200 kernel/softirq.c:434
>>  sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1100
>>  </IRQ>
>>  asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632
>> RIP: 0010:unwind_next_frame+0xde0/0x2000 arch/x86/kernel/unwind_orc.c:611
>> Code: 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e 83 0f 00 00 41 3b 2f 0f 84 c1 05 00 00 <bf> 01 00 00 00 e8 16 95 1b 00 b8 01 00 00 00 65 8b 15 ca a8 cf 7e
>> RSP: 0018:ffffc9000b447168 EFLAGS: 00000287
>> RAX: ffffc9000b448001 RBX: 1ffff92001688e35 RCX: 1ffff92001688e01
>> RDX: ffffc9000b447ae8 RSI: ffffc9000b447ab0 RDI: ffffc9000b447250
>> RBP: ffffc9000b447ae0 R08: ffffffff8dac0810 R09: 0000000000000001
>> R10: 0000000000084087 R11: 0000000000000001 R12: ffffc9000b440000
>> R13: ffffc9000b447275 R14: ffffc9000b447290 R15: ffffc9000b447240
>>  arch_stack_walk+0x7d/0xe0 arch/x86/kernel/stacktrace.c:25
>>  stack_trace_save+0x8c/0xc0 kernel/stacktrace.c:121
>>  kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
>>  kasan_set_track+0x1c/0x30 mm/kasan/common.c:46
>>  kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:357
>>  ____kasan_slab_free mm/kasan/common.c:360 [inline]
>>  ____kasan_slab_free mm/kasan/common.c:325 [inline]
>>  __kasan_slab_free+0xf5/0x130 mm/kasan/common.c:367
>>  kasan_slab_free include/linux/kasan.h:199 [inline]
>>  slab_free_hook mm/slub.c:1562 [inline]
>>  slab_free_freelist_hook+0x72/0x1b0 mm/slub.c:1600
>>  slab_free mm/slub.c:3161 [inline]
>>  kfree+0xe5/0x7b0 mm/slub.c:4213
>>  snd_pcm_hw_param_near.constprop.0+0x7b0/0x8f0 sound/core/oss/pcm_oss.c:438
>>  snd_pcm_oss_change_params_locked+0x18c6/0x39a0 sound/core/oss/pcm_oss.c:936
>>  snd_pcm_oss_change_params sound/core/oss/pcm_oss.c:1090 [inline]
>>  snd_pcm_oss_make_ready+0xe7/0x1b0 sound/core/oss/pcm_oss.c:1149
>>  snd_pcm_oss_set_trigger.isra.0+0x30f/0x6e0 sound/core/oss/pcm_oss.c:2057
>>  snd_pcm_oss_poll+0x661/0xb10 sound/core/oss/pcm_oss.c:2841
>>  vfs_poll include/linux/poll.h:90 [inline]
>>  __io_arm_poll_handler+0x354/0xa20 fs/io_uring.c:5073
>>  io_arm_poll_handler fs/io_uring.c:5142 [inline]
>>  __io_queue_sqe+0x6ef/0xc40 fs/io_uring.c:6213
>>  io_queue_sqe+0x60d/0xf60 fs/io_uring.c:6259
>>  io_submit_sqe fs/io_uring.c:6423 [inline]
>>  io_submit_sqes+0x519a/0x6320 fs/io_uring.c:6537
>>  __do_sys_io_uring_enter+0x1152/0x2200 fs/io_uring.c:9114
>>  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>> RIP: 0033:0x465ef9
>> Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
>> RSP: 002b:00007f818e00e188 EFLAGS: 00000246 ORIG_RAX: 00000000000001aa
>> RAX: ffffffffffffffda RBX: 000000000056c008 RCX: 0000000000465ef9
>> RDX: 0000000000000000 RSI: 0000000000002039 RDI: 0000000000000004
>> RBP: 00000000004bcd1c R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056c008
>> R13: 0000000000a9fb1f R14: 00007f818e00e300 R15: 0000000000022000
>>
>>
>> Tested on:
>>
>> commit:         c9387501 sound: name fiddling
>> git tree:       git://git.kernel.dk/linux-block syzbot-test
>> console output: https://syzkaller.appspot.com/x/log.txt?x=16a51856d00000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=fa0e4e0c3e0cf6e0
>> dashboard link: https://syzkaller.appspot.com/bug?extid=28abd693db9e92c160d8
>> compiler:       
>>
> 
> Work around the recursive lock before adding a fix to io_poll_get_single().
> 
> --- x/fs/io_uring.c
> +++ y/fs/io_uring.c
> @@ -4945,6 +4945,7 @@ static int io_poll_double_wake(struct wa
>  			       int sync, void *key)
>  {
>  	struct io_kiocb *req = wait->private;
> +	struct io_poll_iocb *self = container_of(wait, struct io_poll_iocb, wait); 
>  	struct io_poll_iocb *poll = io_poll_get_single(req);
>  	__poll_t mask = key_to_poll(key);
>  
> @@ -4954,7 +4955,7 @@ static int io_poll_double_wake(struct wa
>  
>  	list_del_init(&wait->entry);
>  
> -	if (poll && poll->head) {
> +	if (poll && poll->head && poll->head != self->head) {
>  		bool done;
>  
>  		spin_lock(&poll->head->lock);

The trace and the recent test show that they are different; this
case is already caught when we arm the double poll handling:

https://git.kernel.dk/cgit/linux-block/commit/?h=io_uring-5.12&id=9e27652c987541aa7cc062e59343e321fff539ae

I don't think there's a real issue here, I'm just poking to see why
syzbot/lockdep thinks there is.

-- 
Jens Axboe
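
A minimal userspace sketch of the distinction discussed above: two wait
queue heads can carry the same lock name (class) and still be different
lock instances, so taking the second one while holding the first is not a
self-deadlock. Everything here is an assumption made for illustration:
`wq_head`, `poll_entry`, `double_wake` and `demo_double_wake` are
hypothetical names, pthread mutexes stand in for the kernel spinlocks, and
the pointer comparison only mirrors the `poll->head != self->head` check
quoted from the proposed patch. It is not the code that was merged.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a wait queue head: one lock per instance. */
struct wq_head {
	pthread_mutex_t lock;
};

/* Hypothetical stand-in for a poll entry: remembers the head it waits on. */
struct poll_entry {
	struct wq_head *head;
};

/*
 * Modelled on a wake callback: the caller already holds self->head->lock.
 * Returns true if the other entry's head lock could be taken (and was
 * released again), false if it was skipped or unavailable.
 */
static bool double_wake(struct poll_entry *self, struct poll_entry *other)
{
	if (other == NULL || other->head == NULL)
		return false;

	/* Same instance: locking it again would deadlock on ourselves. */
	if (other->head == self->head)
		return false;

	/* Different instance, even if the "name" matches: safe to nest. */
	if (pthread_mutex_trylock(&other->head->lock) != 0)
		return false;

	pthread_mutex_unlock(&other->head->lock);
	return true;
}

/* Runs one wake with the two entries either sharing a head or not. */
static bool demo_double_wake(bool share_head)
{
	struct wq_head a, b;
	struct poll_entry self, other;
	bool woke;

	pthread_mutex_init(&a.lock, NULL);
	pthread_mutex_init(&b.lock, NULL);
	self.head = &a;
	other.head = share_head ? &a : &b;

	pthread_mutex_lock(&self.head->lock);
	woke = double_wake(&self, &other);
	pthread_mutex_unlock(&self.head->lock);

	pthread_mutex_destroy(&a.lock);
	pthread_mutex_destroy(&b.lock);
	return woke;
}
```

Calling `demo_double_wake(false)` models the case in the trace, where the
two heads have different addresses and the nested acquisition goes through;
`demo_double_wake(true)` models the shared-head case that the comparison is
meant to filter out.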



Thread overview: 15+ messages
2020-09-29  2:28 possible deadlock in io_poll_double_wake (2) syzbot
2021-02-28  0:42 ` syzbot
2021-02-28 23:08   ` Jens Axboe
2021-03-01  2:08     ` Re: " Zhang, Qiang
2021-03-01  2:48       ` Jens Axboe
2021-03-01  4:18     ` syzbot
2021-03-01 15:27       ` Jens Axboe
2021-03-01 15:55         ` syzbot
2021-03-02 17:20       ` Jens Axboe
2021-03-02 18:59         ` syzbot
2021-03-03  4:01           ` Jens Axboe
2021-03-03 11:36             ` syzbot
2021-03-03 12:15         ` Re: " Zhang, Qiang
2021-03-03 12:45           ` Zhang, Qiang
     [not found]         ` <20210303065231.1589-1-hdanton@sina.com>
2021-03-03 13:39           ` Jens Axboe
