Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
* possible deadlock in io_submit_one (2)
@ 2019-08-18 16:18 syzbot
       [not found] ` <20190822233529.4176-1-ebiggers@kernel.org>
  0 siblings, 1 reply; 7+ messages in thread
From: syzbot @ 2019-08-18 16:18 UTC (permalink / raw)
  To: bcrl, linux-aio, linux-fsdevel, linux-kernel, syzkaller-bugs, viro

Hello,

syzbot found the following crash on:

HEAD commit:    17da61ae Add linux-next specific files for 20190814
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=127712e2600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=4733704ca85aaa66
dashboard link: https://syzkaller.appspot.com/bug?extid=af05535bb79520f95431
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+af05535bb79520f95431@syzkaller.appspotmail.com

=====================================================
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
5.3.0-rc4-next-20190814 #66 Not tainted
-----------------------------------------------------
syz-executor.1/25024 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
ffff8880a374bbe8 (&fiq->waitq){+.+.}, at: spin_lock  
include/linux/spinlock.h:338 [inline]
ffff8880a374bbe8 (&fiq->waitq){+.+.}, at: aio_poll fs/aio.c:1748 [inline]
ffff8880a374bbe8 (&fiq->waitq){+.+.}, at: __io_submit_one fs/aio.c:1822  
[inline]
ffff8880a374bbe8 (&fiq->waitq){+.+.}, at: io_submit_one+0xefa/0x2ef0  
fs/aio.c:1859

and this task is already holding:
ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq  
include/linux/spinlock.h:363 [inline]
ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll  
fs/aio.c:1746 [inline]
ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one  
fs/aio.c:1822 [inline]
ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at:  
io_submit_one+0xeb5/0x2ef0 fs/aio.c:1859
which would create a new lock dependency:
  (&(&ctx->ctx_lock)->rlock){..-.} -> (&fiq->waitq){+.+.}

but this new dependency connects a SOFTIRQ-irq-safe lock:
  (&(&ctx->ctx_lock)->rlock){..-.}

... which became SOFTIRQ-irq-safe at:
   lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487
   __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
   _raw_spin_lock_irq+0x60/0x80 kernel/locking/spinlock.c:167
   spin_lock_irq include/linux/spinlock.h:363 [inline]
   free_ioctx_users+0x2d/0x490 fs/aio.c:618
   percpu_ref_put_many include/linux/percpu-refcount.h:293 [inline]
   percpu_ref_put include/linux/percpu-refcount.h:309 [inline]
   percpu_ref_call_confirm_rcu lib/percpu-refcount.c:130 [inline]
   percpu_ref_switch_to_atomic_rcu+0x4c0/0x570 lib/percpu-refcount.c:165
   __rcu_reclaim kernel/rcu/rcu.h:222 [inline]
   rcu_do_batch kernel/rcu/tree.c:2157 [inline]
   rcu_core+0x581/0x1560 kernel/rcu/tree.c:2377
   rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2386
   __do_softirq+0x262/0x98c kernel/softirq.c:292
   invoke_softirq kernel/softirq.c:373 [inline]
   irq_exit+0x19b/0x1e0 kernel/softirq.c:413
   exiting_irq arch/x86/include/asm/apic.h:536 [inline]
   smp_apic_timer_interrupt+0x1a3/0x610 arch/x86/kernel/apic/apic.c:1095
   apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830
   constant_test_bit arch/x86/include/asm/bitops.h:207 [inline]
   test_bit include/asm-generic/bitops-instrumented.h:238 [inline]
   test_ti_thread_flag include/linux/thread_info.h:84 [inline]
   need_resched include/linux/sched.h:1827 [inline]
   mutex_spin_on_owner+0x7b/0x330 kernel/locking/mutex.c:568
   mutex_optimistic_spin kernel/locking/mutex.c:673 [inline]
   __mutex_lock_common kernel/locking/mutex.c:959 [inline]
   __mutex_lock+0x330/0x13c0 kernel/locking/mutex.c:1103
   mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1118
   rtnl_lock+0x17/0x20 net/core/rtnetlink.c:72
   addrconf_dad_work+0xad/0x1150 net/ipv6/addrconf.c:4032
   process_one_work+0x9af/0x1740 kernel/workqueue.c:2269
   worker_thread+0x98/0xe40 kernel/workqueue.c:2415
   kthread+0x361/0x430 kernel/kthread.c:255
   ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

to a SOFTIRQ-irq-unsafe lock:
  (&fiq->waitq){+.+.}

... which became SOFTIRQ-irq-unsafe at:
...
   lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487
   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
   _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
   spin_lock include/linux/spinlock.h:338 [inline]
   flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415
   fuse_request_queue_background+0x2f8/0x5a0 fs/fuse/dev.c:676
   fuse_request_send_background+0x58/0x110 fs/fuse/dev.c:687
   cuse_send_init fs/fuse/cuse.c:459 [inline]
   cuse_channel_open+0x5ba/0x830 fs/fuse/cuse.c:519
   misc_open+0x395/0x4c0 drivers/char/misc.c:141
   chrdev_open+0x245/0x6b0 fs/char_dev.c:414
   do_dentry_open+0x4df/0x1250 fs/open.c:797
   vfs_open+0xa0/0xd0 fs/open.c:914
   do_last fs/namei.c:3416 [inline]
   path_openat+0x10e9/0x4630 fs/namei.c:3533
   do_filp_open+0x1a1/0x280 fs/namei.c:3563
   do_sys_open+0x3fe/0x5d0 fs/open.c:1097
   __do_sys_openat fs/open.c:1124 [inline]
   __se_sys_openat fs/open.c:1118 [inline]
   __x64_sys_openat+0x9d/0x100 fs/open.c:1118
   do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

  Possible interrupt unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&fiq->waitq);
                                local_irq_disable();
                                lock(&(&ctx->ctx_lock)->rlock);
                                lock(&fiq->waitq);
   <Interrupt>
     lock(&(&ctx->ctx_lock)->rlock);

  *** DEADLOCK ***

1 lock held by syz-executor.1/25024:
  #0: ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq  
include/linux/spinlock.h:363 [inline]
  #0: ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll  
fs/aio.c:1746 [inline]
  #0: ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one  
fs/aio.c:1822 [inline]
  #0: ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at:  
io_submit_one+0xeb5/0x2ef0 fs/aio.c:1859

the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
-> (&(&ctx->ctx_lock)->rlock){..-.} {
    IN-SOFTIRQ-W at:
                     lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487
                     __raw_spin_lock_irq  
include/linux/spinlock_api_smp.h:128 [inline]
                     _raw_spin_lock_irq+0x60/0x80  
kernel/locking/spinlock.c:167
                     spin_lock_irq include/linux/spinlock.h:363 [inline]
                     free_ioctx_users+0x2d/0x490 fs/aio.c:618
                     percpu_ref_put_many include/linux/percpu-refcount.h:293  
[inline]
                     percpu_ref_put include/linux/percpu-refcount.h:309  
[inline]
                     percpu_ref_call_confirm_rcu lib/percpu-refcount.c:130  
[inline]
                     percpu_ref_switch_to_atomic_rcu+0x4c0/0x570  
lib/percpu-refcount.c:165
                     __rcu_reclaim kernel/rcu/rcu.h:222 [inline]
                     rcu_do_batch kernel/rcu/tree.c:2157 [inline]
                     rcu_core+0x581/0x1560 kernel/rcu/tree.c:2377
                     rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2386
                     __do_softirq+0x262/0x98c kernel/softirq.c:292
                     invoke_softirq kernel/softirq.c:373 [inline]
                     irq_exit+0x19b/0x1e0 kernel/softirq.c:413
                     exiting_irq arch/x86/include/asm/apic.h:536 [inline]
                     smp_apic_timer_interrupt+0x1a3/0x610  
arch/x86/kernel/apic/apic.c:1095
                     apic_timer_interrupt+0xf/0x20  
arch/x86/entry/entry_64.S:830
                     constant_test_bit arch/x86/include/asm/bitops.h:207  
[inline]
                     test_bit include/asm-generic/bitops-instrumented.h:238  
[inline]
                     test_ti_thread_flag include/linux/thread_info.h:84  
[inline]
                     need_resched include/linux/sched.h:1827 [inline]
                     mutex_spin_on_owner+0x7b/0x330  
kernel/locking/mutex.c:568
                     mutex_optimistic_spin kernel/locking/mutex.c:673  
[inline]
                     __mutex_lock_common kernel/locking/mutex.c:959 [inline]
                     __mutex_lock+0x330/0x13c0 kernel/locking/mutex.c:1103
                     mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1118
                     rtnl_lock+0x17/0x20 net/core/rtnetlink.c:72
                     addrconf_dad_work+0xad/0x1150 net/ipv6/addrconf.c:4032
                     process_one_work+0x9af/0x1740 kernel/workqueue.c:2269
                     worker_thread+0x98/0xe40 kernel/workqueue.c:2415
                     kthread+0x361/0x430 kernel/kthread.c:255
                     ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
    INITIAL USE at:
                    lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487
                    __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128  
[inline]
                    _raw_spin_lock_irq+0x60/0x80  
kernel/locking/spinlock.c:167
                    spin_lock_irq include/linux/spinlock.h:363 [inline]
                    free_ioctx_users+0x2d/0x490 fs/aio.c:618
                    percpu_ref_put_many include/linux/percpu-refcount.h:293  
[inline]
                    percpu_ref_put include/linux/percpu-refcount.h:309  
[inline]
                    percpu_ref_call_confirm_rcu lib/percpu-refcount.c:130  
[inline]
                    percpu_ref_switch_to_atomic_rcu+0x4c0/0x570  
lib/percpu-refcount.c:165
                    __rcu_reclaim kernel/rcu/rcu.h:222 [inline]
                    rcu_do_batch kernel/rcu/tree.c:2157 [inline]
                    rcu_core+0x581/0x1560 kernel/rcu/tree.c:2377
                    rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2386
                    __do_softirq+0x262/0x98c kernel/softirq.c:292
                    invoke_softirq kernel/softirq.c:373 [inline]
                    irq_exit+0x19b/0x1e0 kernel/softirq.c:413
                    exiting_irq arch/x86/include/asm/apic.h:536 [inline]
                    smp_apic_timer_interrupt+0x1a3/0x610  
arch/x86/kernel/apic/apic.c:1095
                    apic_timer_interrupt+0xf/0x20  
arch/x86/entry/entry_64.S:830
                    constant_test_bit arch/x86/include/asm/bitops.h:207  
[inline]
                    test_bit include/asm-generic/bitops-instrumented.h:238  
[inline]
                    test_ti_thread_flag include/linux/thread_info.h:84  
[inline]
                    need_resched include/linux/sched.h:1827 [inline]
                    mutex_spin_on_owner+0x7b/0x330 kernel/locking/mutex.c:568
                    mutex_optimistic_spin kernel/locking/mutex.c:673 [inline]
                    __mutex_lock_common kernel/locking/mutex.c:959 [inline]
                    __mutex_lock+0x330/0x13c0 kernel/locking/mutex.c:1103
                    mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1118
                    rtnl_lock+0x17/0x20 net/core/rtnetlink.c:72
                    addrconf_dad_work+0xad/0x1150 net/ipv6/addrconf.c:4032
                    process_one_work+0x9af/0x1740 kernel/workqueue.c:2269
                    worker_thread+0x98/0xe40 kernel/workqueue.c:2415
                    kthread+0x361/0x430 kernel/kthread.c:255
                    ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
  }
  ... key      at: [<ffffffff8ab00360>] __key.54228+0x0/0x40
  ... acquired at:
    lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487
    __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
    _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
    spin_lock include/linux/spinlock.h:338 [inline]
    aio_poll fs/aio.c:1748 [inline]
    __io_submit_one fs/aio.c:1822 [inline]
    io_submit_one+0xefa/0x2ef0 fs/aio.c:1859
    __do_sys_io_submit fs/aio.c:1918 [inline]
    __se_sys_io_submit fs/aio.c:1888 [inline]
    __x64_sys_io_submit+0x1bd/0x570 fs/aio.c:1888
    do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe


the dependencies between the lock to be acquired
  and SOFTIRQ-irq-unsafe lock:
-> (&fiq->waitq){+.+.} {
    HARDIRQ-ON-W at:
                     lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487
                     __raw_spin_lock include/linux/spinlock_api_smp.h:142  
[inline]
                     _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
                     spin_lock include/linux/spinlock.h:338 [inline]
                     flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415
                     fuse_request_queue_background+0x2f8/0x5a0  
fs/fuse/dev.c:676
                     fuse_request_send_background+0x58/0x110  
fs/fuse/dev.c:687
                     cuse_send_init fs/fuse/cuse.c:459 [inline]
                     cuse_channel_open+0x5ba/0x830 fs/fuse/cuse.c:519
                     misc_open+0x395/0x4c0 drivers/char/misc.c:141
                     chrdev_open+0x245/0x6b0 fs/char_dev.c:414
                     do_dentry_open+0x4df/0x1250 fs/open.c:797
                     vfs_open+0xa0/0xd0 fs/open.c:914
                     do_last fs/namei.c:3416 [inline]
                     path_openat+0x10e9/0x4630 fs/namei.c:3533
                     do_filp_open+0x1a1/0x280 fs/namei.c:3563
                     do_sys_open+0x3fe/0x5d0 fs/open.c:1097
                     __do_sys_openat fs/open.c:1124 [inline]
                     __se_sys_openat fs/open.c:1118 [inline]
                     __x64_sys_openat+0x9d/0x100 fs/open.c:1118
                     do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
                     entry_SYSCALL_64_after_hwframe+0x49/0xbe
    SOFTIRQ-ON-W at:
                     lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487
                     __raw_spin_lock include/linux/spinlock_api_smp.h:142  
[inline]
                     _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
                     spin_lock include/linux/spinlock.h:338 [inline]
                     flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415
                     fuse_request_queue_background+0x2f8/0x5a0  
fs/fuse/dev.c:676
                     fuse_request_send_background+0x58/0x110  
fs/fuse/dev.c:687
                     cuse_send_init fs/fuse/cuse.c:459 [inline]
                     cuse_channel_open+0x5ba/0x830 fs/fuse/cuse.c:519
                     misc_open+0x395/0x4c0 drivers/char/misc.c:141
                     chrdev_open+0x245/0x6b0 fs/char_dev.c:414
                     do_dentry_open+0x4df/0x1250 fs/open.c:797
                     vfs_open+0xa0/0xd0 fs/open.c:914
                     do_last fs/namei.c:3416 [inline]
                     path_openat+0x10e9/0x4630 fs/namei.c:3533
                     do_filp_open+0x1a1/0x280 fs/namei.c:3563
                     do_sys_open+0x3fe/0x5d0 fs/open.c:1097
                     __do_sys_openat fs/open.c:1124 [inline]
                     __se_sys_openat fs/open.c:1118 [inline]
                     __x64_sys_openat+0x9d/0x100 fs/open.c:1118
                     do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
                     entry_SYSCALL_64_after_hwframe+0x49/0xbe
    INITIAL USE at:
                    lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487
                    __raw_spin_lock include/linux/spinlock_api_smp.h:142  
[inline]
                    _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
                    spin_lock include/linux/spinlock.h:338 [inline]
                    flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415
                    fuse_request_queue_background+0x2f8/0x5a0  
fs/fuse/dev.c:676
                    fuse_request_send_background+0x58/0x110 fs/fuse/dev.c:687
                    cuse_send_init fs/fuse/cuse.c:459 [inline]
                    cuse_channel_open+0x5ba/0x830 fs/fuse/cuse.c:519
                    misc_open+0x395/0x4c0 drivers/char/misc.c:141
                    chrdev_open+0x245/0x6b0 fs/char_dev.c:414
                    do_dentry_open+0x4df/0x1250 fs/open.c:797
                    vfs_open+0xa0/0xd0 fs/open.c:914
                    do_last fs/namei.c:3416 [inline]
                    path_openat+0x10e9/0x4630 fs/namei.c:3533
                    do_filp_open+0x1a1/0x280 fs/namei.c:3563
                    do_sys_open+0x3fe/0x5d0 fs/open.c:1097
                    __do_sys_openat fs/open.c:1124 [inline]
                    __se_sys_openat fs/open.c:1118 [inline]
                    __x64_sys_openat+0x9d/0x100 fs/open.c:1118
                    do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
                    entry_SYSCALL_64_after_hwframe+0x49/0xbe
  }
  ... key      at: [<ffffffff8ab9a180>] __key.45697+0x0/0x40
  ... acquired at:
    lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487
    __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
    _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
    spin_lock include/linux/spinlock.h:338 [inline]
    aio_poll fs/aio.c:1748 [inline]
    __io_submit_one fs/aio.c:1822 [inline]
    io_submit_one+0xefa/0x2ef0 fs/aio.c:1859
    __do_sys_io_submit fs/aio.c:1918 [inline]
    __se_sys_io_submit fs/aio.c:1888 [inline]
    __x64_sys_io_submit+0x1bd/0x570 fs/aio.c:1888
    do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe


stack backtrace:
CPU: 0 PID: 25024 Comm: syz-executor.1 Not tainted 5.3.0-rc4-next-20190814  
#66
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
  print_bad_irq_dependency kernel/locking/lockdep.c:2095 [inline]
  check_irq_usage.cold+0x586/0x6fe kernel/locking/lockdep.c:2293
  check_prev_add kernel/locking/lockdep.c:2480 [inline]
  check_prevs_add kernel/locking/lockdep.c:2581 [inline]
  validate_chain kernel/locking/lockdep.c:2971 [inline]
  __lock_acquire+0x25d0/0x4e70 kernel/locking/lockdep.c:3955
  lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487
  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
  _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
  spin_lock include/linux/spinlock.h:338 [inline]
  aio_poll fs/aio.c:1748 [inline]
  __io_submit_one fs/aio.c:1822 [inline]
  io_submit_one+0xefa/0x2ef0 fs/aio.c:1859
  __do_sys_io_submit fs/aio.c:1918 [inline]
  __se_sys_io_submit fs/aio.c:1888 [inline]
  __x64_sys_io_submit+0x1bd/0x570 fs/aio.c:1888
  do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x459829
Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f9e07853c78 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000459829
RDX: 0000000020000440 RSI: 0000000000000001 RDI: 00007f9e07833000
RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9e078546d4
R13: 00000000004c0c19 R14: 00000000004d3c40 R15: 00000000ffffffff


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] fuse: disable irqs for fuse_iqueue::waitq.lock
       [not found] ` <20190822233529.4176-1-ebiggers@kernel.org>
@ 2019-09-03  7:31   ` Miklos Szeredi
  2019-09-03 13:39     ` Eric Biggers
  0 siblings, 1 reply; 7+ messages in thread
From: Miklos Szeredi @ 2019-09-03  7:31 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fsdevel, linux-aio, Benjamin LaHaise, syzkaller-bugs,
	stable, Christoph Hellwig

On Fri, Aug 23, 2019 at 1:35 AM Eric Biggers <ebiggers@kernel.org> wrote:
>
> From: Eric Biggers <ebiggers@google.com>
>
> When IOCB_CMD_POLL is used on the FUSE device, aio_poll() disables IRQs
> and takes kioctx::ctx_lock, then fuse_iqueue::waitq.lock.

Not in -linus.

Which tree was this reproduced with?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] fuse: disable irqs for fuse_iqueue::waitq.lock
  2019-09-03  7:31   ` [PATCH] fuse: disable irqs for fuse_iqueue::waitq.lock Miklos Szeredi
@ 2019-09-03 13:39     ` Eric Biggers
  2019-09-04 14:29       ` Miklos Szeredi
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Biggers @ 2019-09-03 13:39 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-fsdevel, linux-aio, Benjamin LaHaise, syzkaller-bugs,
	stable, Christoph Hellwig

On Tue, Sep 03, 2019 at 09:31:29AM +0200, Miklos Szeredi wrote:
> On Fri, Aug 23, 2019 at 1:35 AM Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > From: Eric Biggers <ebiggers@google.com>
> >
> > When IOCB_CMD_POLL is used on the FUSE device, aio_poll() disables IRQs
> > and takes kioctx::ctx_lock, then fuse_iqueue::waitq.lock.
> 
> Not in -linus.
> 
> Which tree was this reproduced with?
> 
> Thanks,
> Miklos

Linus's tree.  Here's the full symbolized output on v5.3-rc7:

=====================================================
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
5.3.0-rc7 #1 Not tainted
-----------------------------------------------------
syz_fuse/150 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
00000000c2701b26 (&fiq->waitq){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
00000000c2701b26 (&fiq->waitq){+.+.}, at: aio_poll fs/aio.c:1751 [inline]
00000000c2701b26 (&fiq->waitq){+.+.}, at: __io_submit_one.constprop.0+0x203/0x5b0 fs/aio.c:1825

and this task is already holding:
00000000f4c02e81 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:363 [inline]
00000000f4c02e81 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1749 [inline]
00000000f4c02e81 (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one.constprop.0+0x1f4/0x5b0 fs/aio.c:1825
which would create a new lock dependency:
 (&(&ctx->ctx_lock)->rlock){..-.} -> (&fiq->waitq){+.+.}

but this new dependency connects a SOFTIRQ-irq-safe lock:
 (&(&ctx->ctx_lock)->rlock){..-.}

... which became SOFTIRQ-irq-safe at:
  lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412
  __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
  _raw_spin_lock_irq+0x32/0x50 kernel/locking/spinlock.c:167
  spin_lock_irq include/linux/spinlock.h:363 [inline]
  free_ioctx_users+0x25/0x190 fs/aio.c:618
  percpu_ref_put_many include/linux/percpu-refcount.h:293 [inline]
  percpu_ref_put include/linux/percpu-refcount.h:309 [inline]
  percpu_ref_call_confirm_rcu lib/percpu-refcount.c:130 [inline]
  percpu_ref_switch_to_atomic_rcu+0x202/0x210 lib/percpu-refcount.c:165
  __rcu_reclaim kernel/rcu/rcu.h:222 [inline]
  rcu_do_batch+0x2ae/0x890 kernel/rcu/tree.c:2114
  rcu_core+0x13a/0x370 kernel/rcu/tree.c:2314
  rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2323
  __do_softirq+0xbe/0x400 kernel/softirq.c:292
  invoke_softirq kernel/softirq.c:373 [inline]
  irq_exit+0x8f/0xc0 kernel/softirq.c:413
  exiting_irq arch/x86/include/asm/apic.h:537 [inline]
  smp_apic_timer_interrupt+0x8e/0x210 arch/x86/kernel/apic/apic.c:1133
  apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830
  native_safe_halt arch/x86/include/asm/irqflags.h:60 [inline]
  arch_safe_halt arch/x86/include/asm/irqflags.h:103 [inline]
  default_idle+0x29/0x160 arch/x86/kernel/process.c:580
  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:571
  default_idle_call+0x1e/0x30 kernel/sched/idle.c:94
  cpuidle_idle_call kernel/sched/idle.c:154 [inline]
  do_idle+0x1f0/0x220 kernel/sched/idle.c:263
  cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:354
  rest_init+0x18a/0x24d init/main.c:451
  arch_call_rest_init+0x9/0xc
  start_kernel+0x530/0x54e init/main.c:785
  x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:472
  x86_64_start_kernel+0x6d/0x71 arch/x86/kernel/head64.c:453
  secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241

to a SOFTIRQ-irq-unsafe lock:
 (&fiq->waitq){+.+.}

... which became SOFTIRQ-irq-unsafe at:
...
  lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412
  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
  _raw_spin_lock+0x2b/0x40 kernel/locking/spinlock.c:151
  spin_lock include/linux/spinlock.h:338 [inline]
  flush_bg_queue+0x91/0xf0 fs/fuse/dev.c:415
  fuse_request_queue_background+0xd1/0x140 fs/fuse/dev.c:676
  fuse_request_send_background+0x27/0x60 fs/fuse/dev.c:687
  fuse_send_init fs/fuse/inode.c:986 [inline]
  fuse_fill_super+0x656/0x681 fs/fuse/inode.c:1211
  mount_nodev+0x42/0x90 fs/super.c:1329
  fuse_mount+0x13/0x20 fs/fuse/inode.c:1236
  legacy_get_tree+0x2c/0x50 fs/fs_context.c:661
  vfs_get_tree+0x22/0xc0 fs/super.c:1413
  do_new_mount fs/namespace.c:2791 [inline]
  do_mount+0x7e3/0xa60 fs/namespace.c:3111
  ksys_mount+0x7d/0xd0 fs/namespace.c:3320
  __do_sys_mount fs/namespace.c:3334 [inline]
  __se_sys_mount fs/namespace.c:3331 [inline]
  __x64_sys_mount+0x20/0x30 fs/namespace.c:3331
  do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:296
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&fiq->waitq);
                               local_irq_disable();
                               lock(&(&ctx->ctx_lock)->rlock);
                               lock(&fiq->waitq);
  <Interrupt>
    lock(&(&ctx->ctx_lock)->rlock);

 *** DEADLOCK ***

1 lock held by syz_fuse/150:
 #0: 00000000f4c02e81 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:363 [inline]
 #0: 00000000f4c02e81 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1749 [inline]
 #0: 00000000f4c02e81 (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one.constprop.0+0x1f4/0x5b0 fs/aio.c:1825

the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
-> (&(&ctx->ctx_lock)->rlock){..-.} {
   IN-SOFTIRQ-W at:
                    lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412
                    __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
                    _raw_spin_lock_irq+0x32/0x50 kernel/locking/spinlock.c:167
                    spin_lock_irq include/linux/spinlock.h:363 [inline]
                    free_ioctx_users+0x25/0x190 fs/aio.c:618
                    percpu_ref_put_many include/linux/percpu-refcount.h:293 [inline]
                    percpu_ref_put include/linux/percpu-refcount.h:309 [inline]
                    percpu_ref_call_confirm_rcu lib/percpu-refcount.c:130 [inline]
                    percpu_ref_switch_to_atomic_rcu+0x202/0x210 lib/percpu-refcount.c:165
                    __rcu_reclaim kernel/rcu/rcu.h:222 [inline]
                    rcu_do_batch+0x2ae/0x890 kernel/rcu/tree.c:2114
                    rcu_core+0x13a/0x370 kernel/rcu/tree.c:2314
                    rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2323
                    __do_softirq+0xbe/0x400 kernel/softirq.c:292
                    invoke_softirq kernel/softirq.c:373 [inline]
                    irq_exit+0x8f/0xc0 kernel/softirq.c:413
                    exiting_irq arch/x86/include/asm/apic.h:537 [inline]
                    smp_apic_timer_interrupt+0x8e/0x210 arch/x86/kernel/apic/apic.c:1133
                    apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830
                    native_safe_halt arch/x86/include/asm/irqflags.h:60 [inline]
                    arch_safe_halt arch/x86/include/asm/irqflags.h:103 [inline]
                    default_idle+0x29/0x160 arch/x86/kernel/process.c:580
                    arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:571
                    default_idle_call+0x1e/0x30 kernel/sched/idle.c:94
                    cpuidle_idle_call kernel/sched/idle.c:154 [inline]
                    do_idle+0x1f0/0x220 kernel/sched/idle.c:263
                    cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:354
                    rest_init+0x18a/0x24d init/main.c:451
                    arch_call_rest_init+0x9/0xc
                    start_kernel+0x530/0x54e init/main.c:785
                    x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:472
                    x86_64_start_kernel+0x6d/0x71 arch/x86/kernel/head64.c:453
                    secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241
   INITIAL USE at:
                   lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412
                   __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
                   _raw_spin_lock_irq+0x32/0x50 kernel/locking/spinlock.c:167
                   spin_lock_irq include/linux/spinlock.h:363 [inline]
                   free_ioctx_users+0x25/0x190 fs/aio.c:618
                   percpu_ref_put_many include/linux/percpu-refcount.h:293 [inline]
                   percpu_ref_put include/linux/percpu-refcount.h:309 [inline]
                   percpu_ref_call_confirm_rcu lib/percpu-refcount.c:130 [inline]
                   percpu_ref_switch_to_atomic_rcu+0x202/0x210 lib/percpu-refcount.c:165
                   __rcu_reclaim kernel/rcu/rcu.h:222 [inline]
                   rcu_do_batch+0x2ae/0x890 kernel/rcu/tree.c:2114
                   rcu_core+0x13a/0x370 kernel/rcu/tree.c:2314
                   rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2323
                   __do_softirq+0xbe/0x400 kernel/softirq.c:292
                   invoke_softirq kernel/softirq.c:373 [inline]
                   irq_exit+0x8f/0xc0 kernel/softirq.c:413
                   exiting_irq arch/x86/include/asm/apic.h:537 [inline]
                   smp_apic_timer_interrupt+0x8e/0x210 arch/x86/kernel/apic/apic.c:1133
                   apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830
                   native_safe_halt arch/x86/include/asm/irqflags.h:60 [inline]
                   arch_safe_halt arch/x86/include/asm/irqflags.h:103 [inline]
                   default_idle+0x29/0x160 arch/x86/kernel/process.c:580
                   arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:571
                   default_idle_call+0x1e/0x30 kernel/sched/idle.c:94
                   cpuidle_idle_call kernel/sched/idle.c:154 [inline]
                   do_idle+0x1f0/0x220 kernel/sched/idle.c:263
                   cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:354
                   rest_init+0x18a/0x24d init/main.c:451
                   arch_call_rest_init+0x9/0xc
                   start_kernel+0x530/0x54e init/main.c:785
                   x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:472
                   x86_64_start_kernel+0x6d/0x71 arch/x86/kernel/head64.c:453
                   secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241
 }
 ... key      at: [<ffffffff82860670>] __key.50377+0x0/0x10
 ... acquired at:
   check_prev_add+0xa7/0x950 kernel/locking/lockdep.c:2409
   check_prevs_add kernel/locking/lockdep.c:2507 [inline]
   validate_chain+0x483/0x870 kernel/locking/lockdep.c:2897
   __lock_acquire+0x447/0x7d0 kernel/locking/lockdep.c:3880
   lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412
   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
   _raw_spin_lock+0x2b/0x40 kernel/locking/spinlock.c:151
   spin_lock include/linux/spinlock.h:338 [inline]
   aio_poll fs/aio.c:1751 [inline]
   __io_submit_one.constprop.0+0x203/0x5b0 fs/aio.c:1825
   io_submit_one+0x146/0x5b0 fs/aio.c:1862
   __do_sys_io_submit fs/aio.c:1921 [inline]
   __se_sys_io_submit+0x8e/0x270 fs/aio.c:1891
   __x64_sys_io_submit+0x15/0x20 fs/aio.c:1891
   do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:296
   entry_SYSCALL_64_after_hwframe+0x49/0xbe


the dependencies between the lock to be acquired
 and SOFTIRQ-irq-unsafe lock:
-> (&fiq->waitq){+.+.} {
   HARDIRQ-ON-W at:
                    lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412
                    __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
                    _raw_spin_lock+0x2b/0x40 kernel/locking/spinlock.c:151
                    spin_lock include/linux/spinlock.h:338 [inline]
                    flush_bg_queue+0x91/0xf0 fs/fuse/dev.c:415
                    fuse_request_queue_background+0xd1/0x140 fs/fuse/dev.c:676
                    fuse_request_send_background+0x27/0x60 fs/fuse/dev.c:687
                    fuse_send_init fs/fuse/inode.c:986 [inline]
                    fuse_fill_super+0x656/0x681 fs/fuse/inode.c:1211
                    mount_nodev+0x42/0x90 fs/super.c:1329
                    fuse_mount+0x13/0x20 fs/fuse/inode.c:1236
                    legacy_get_tree+0x2c/0x50 fs/fs_context.c:661
                    vfs_get_tree+0x22/0xc0 fs/super.c:1413
                    do_new_mount fs/namespace.c:2791 [inline]
                    do_mount+0x7e3/0xa60 fs/namespace.c:3111
                    ksys_mount+0x7d/0xd0 fs/namespace.c:3320
                    __do_sys_mount fs/namespace.c:3334 [inline]
                    __se_sys_mount fs/namespace.c:3331 [inline]
                    __x64_sys_mount+0x20/0x30 fs/namespace.c:3331
                    do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:296
                    entry_SYSCALL_64_after_hwframe+0x49/0xbe
   SOFTIRQ-ON-W at:
                    lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412
                    __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
                    _raw_spin_lock+0x2b/0x40 kernel/locking/spinlock.c:151
                    spin_lock include/linux/spinlock.h:338 [inline]
                    flush_bg_queue+0x91/0xf0 fs/fuse/dev.c:415
                    fuse_request_queue_background+0xd1/0x140 fs/fuse/dev.c:676
                    fuse_request_send_background+0x27/0x60 fs/fuse/dev.c:687
                    fuse_send_init fs/fuse/inode.c:986 [inline]
                    fuse_fill_super+0x656/0x681 fs/fuse/inode.c:1211
                    mount_nodev+0x42/0x90 fs/super.c:1329
                    fuse_mount+0x13/0x20 fs/fuse/inode.c:1236
                    legacy_get_tree+0x2c/0x50 fs/fs_context.c:661
                    vfs_get_tree+0x22/0xc0 fs/super.c:1413
                    do_new_mount fs/namespace.c:2791 [inline]
                    do_mount+0x7e3/0xa60 fs/namespace.c:3111
                    ksys_mount+0x7d/0xd0 fs/namespace.c:3320
                    __do_sys_mount fs/namespace.c:3334 [inline]
                    __se_sys_mount fs/namespace.c:3331 [inline]
                    __x64_sys_mount+0x20/0x30 fs/namespace.c:3331
                    do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:296
                    entry_SYSCALL_64_after_hwframe+0x49/0xbe
   INITIAL USE at:
                   lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412
                   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
                   _raw_spin_lock+0x2b/0x40 kernel/locking/spinlock.c:151
                   spin_lock include/linux/spinlock.h:338 [inline]
                   flush_bg_queue+0x91/0xf0 fs/fuse/dev.c:415
                   fuse_request_queue_background+0xd1/0x140 fs/fuse/dev.c:676
                   fuse_request_send_background+0x27/0x60 fs/fuse/dev.c:687
                   fuse_send_init fs/fuse/inode.c:986 [inline]
                   fuse_fill_super+0x656/0x681 fs/fuse/inode.c:1211
                   mount_nodev+0x42/0x90 fs/super.c:1329
                   fuse_mount+0x13/0x20 fs/fuse/inode.c:1236
                   legacy_get_tree+0x2c/0x50 fs/fs_context.c:661
                   vfs_get_tree+0x22/0xc0 fs/super.c:1413
                   do_new_mount fs/namespace.c:2791 [inline]
                   do_mount+0x7e3/0xa60 fs/namespace.c:3111
                   ksys_mount+0x7d/0xd0 fs/namespace.c:3320
                   __do_sys_mount fs/namespace.c:3334 [inline]
                   __se_sys_mount fs/namespace.c:3331 [inline]
                   __x64_sys_mount+0x20/0x30 fs/namespace.c:3331
                   do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:296
                   entry_SYSCALL_64_after_hwframe+0x49/0xbe
 }
 ... key      at: [<ffffffff82863ea0>] __key.40870+0x0/0x10
 ... acquired at:
   check_prev_add+0xa7/0x950 kernel/locking/lockdep.c:2409
   check_prevs_add kernel/locking/lockdep.c:2507 [inline]
   validate_chain+0x483/0x870 kernel/locking/lockdep.c:2897
   __lock_acquire+0x447/0x7d0 kernel/locking/lockdep.c:3880
   lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412
   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
   _raw_spin_lock+0x2b/0x40 kernel/locking/spinlock.c:151
   spin_lock include/linux/spinlock.h:338 [inline]
   aio_poll fs/aio.c:1751 [inline]
   __io_submit_one.constprop.0+0x203/0x5b0 fs/aio.c:1825
   io_submit_one+0x146/0x5b0 fs/aio.c:1862
   __do_sys_io_submit fs/aio.c:1921 [inline]
   __se_sys_io_submit+0x8e/0x270 fs/aio.c:1891
   __x64_sys_io_submit+0x15/0x20 fs/aio.c:1891
   do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:296
   entry_SYSCALL_64_after_hwframe+0x49/0xbe


stack backtrace:
CPU: 1 PID: 150 Comm: syz_fuse Not tainted 5.3.0-rc7 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x70/0x9a lib/dump_stack.c:113
 print_bad_irq_dependency.cold+0x380/0x3d8 kernel/locking/lockdep.c:2025
 check_irq_usage+0x337/0x400 kernel/locking/lockdep.c:2223
 check_prev_add+0xa7/0x950 kernel/locking/lockdep.c:2409
 check_prevs_add kernel/locking/lockdep.c:2507 [inline]
 validate_chain+0x483/0x870 kernel/locking/lockdep.c:2897
 __lock_acquire+0x447/0x7d0 kernel/locking/lockdep.c:3880
 lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412
 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
 _raw_spin_lock+0x2b/0x40 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:338 [inline]
 aio_poll fs/aio.c:1751 [inline]
 __io_submit_one.constprop.0+0x203/0x5b0 fs/aio.c:1825
 io_submit_one+0x146/0x5b0 fs/aio.c:1862
 __do_sys_io_submit fs/aio.c:1921 [inline]
 __se_sys_io_submit+0x8e/0x270 fs/aio.c:1891
 __x64_sys_io_submit+0x15/0x20 fs/aio.c:1891
 do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:296
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f1bbde7ec6d
Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f3 51 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffffa3a70e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1
RAX: ffffffffffffffda RBX: 00007ffffa3a7140 RCX: 00007f1bbde7ec6d
RDX: 00007ffffa3a70f8 RSI: 0000000000000001 RDI: 00007f1bbe27d000
RBP: 0000559337c29290 R08: 0000000000000000 R09: 00007f1bbe27d000
R10: 0000000000000029 R11: 0000000000000246 R12: 0000559337c29190
R13: 00007ffffa3a72b0 R14: 0000000000000000 R15: 0000000000000000

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] fuse: disable irqs for fuse_iqueue::waitq.lock
  2019-09-03 13:39     ` Eric Biggers
@ 2019-09-04 14:29       ` Miklos Szeredi
  2019-09-06  4:43         ` Eric Biggers
  0 siblings, 1 reply; 7+ messages in thread
From: Miklos Szeredi @ 2019-09-04 14:29 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fsdevel, linux-aio, Benjamin LaHaise, syzkaller-bugs,
	stable, Christoph Hellwig

On Tue, Sep 3, 2019 at 3:39 PM Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Tue, Sep 03, 2019 at 09:31:29AM +0200, Miklos Szeredi wrote:
> > On Fri, Aug 23, 2019 at 1:35 AM Eric Biggers <ebiggers@kernel.org> wrote:
> > >
> > > From: Eric Biggers <ebiggers@google.com>
> > >
> > > When IOCB_CMD_POLL is used on the FUSE device, aio_poll() disables IRQs
> > > and takes kioctx::ctx_lock, then fuse_iqueue::waitq.lock.
> >
> > Not in -linus.
> >
> > Which tree was this reproduced with?
> >
> > Thanks,
> > Miklos
>
> Linus's tree.  Here's the full symbolized output on v5.3-rc7:

Okay.

TBH, I find the fix disgusting. It's confusing to sprinke code that
has absolutely nothing to do with interrupts with spin_lock_irq()
calls.

I think the lock/unlock calls should at least be done with a helper
with a comment explaining why disabling interrupts is needed (though I
have not managed to understand why aio needs to actually mess with the
waitq lock...)

Probably a better fix would be to just use a separate spinlock to
avoid the need to disable interrupts in cases where it's not
necessary.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] fuse: disable irqs for fuse_iqueue::waitq.lock
  2019-09-04 14:29       ` Miklos Szeredi
@ 2019-09-06  4:43         ` Eric Biggers
  2019-09-06  6:58           ` Miklos Szeredi
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Biggers @ 2019-09-06  4:43 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-fsdevel, linux-aio, Benjamin LaHaise, syzkaller-bugs,
	stable, Christoph Hellwig

On Wed, Sep 04, 2019 at 04:29:03PM +0200, Miklos Szeredi wrote:
> On Tue, Sep 3, 2019 at 3:39 PM Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > On Tue, Sep 03, 2019 at 09:31:29AM +0200, Miklos Szeredi wrote:
> > > On Fri, Aug 23, 2019 at 1:35 AM Eric Biggers <ebiggers@kernel.org> wrote:
> > > >
> > > > From: Eric Biggers <ebiggers@google.com>
> > > >
> > > > When IOCB_CMD_POLL is used on the FUSE device, aio_poll() disables IRQs
> > > > and takes kioctx::ctx_lock, then fuse_iqueue::waitq.lock.
> > >
> > > Not in -linus.
> > >
> > > Which tree was this reproduced with?
> > >
> > > Thanks,
> > > Miklos
> >
> > Linus's tree.  Here's the full symbolized output on v5.3-rc7:
> 
> Okay.
> 
> TBH, I find the fix disgusting. It's confusing to sprinke code that
> has absolutely nothing to do with interrupts with spin_lock_irq()
> calls.
> 
> I think the lock/unlock calls should at least be done with a helper
> with a comment explaining why disabling interrupts is needed (though I
> have not managed to understand why aio needs to actually mess with the
> waitq lock...)

The aio code is doing a poll(), so it needs to use the wait queue.

> 
> Probably a better fix would be to just use a separate spinlock to
> avoid the need to disable interrupts in cases where it's not
> necessary.

Well, the below is what a separate lock would look like.  Note that it actually
still disables IRQs in some places; it's just hidden away in the nested
spin_lock_irqsave() in wake_up().  Likewise, adding something to the fuse_iqueue
then requires taking 2 spin locks -- one explicit, and one hidden in wake_up().

Is this the solution you'd prefer?

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index ea8237513dfa..1d9d555f2699 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -377,7 +377,7 @@ static void queue_request(struct fuse_iqueue *fiq, struct fuse_req *req)
 	req->in.h.len = sizeof(struct fuse_in_header) +
 		len_args(req->in.numargs, (struct fuse_arg *) req->in.args);
 	list_add_tail(&req->list, &fiq->pending);
-	wake_up_locked(&fiq->waitq);
+	wake_up(&fiq->waitq);
 	kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
 }
 
@@ -389,16 +389,16 @@ void fuse_queue_forget(struct fuse_conn *fc, struct fuse_forget_link *forget,
 	forget->forget_one.nodeid = nodeid;
 	forget->forget_one.nlookup = nlookup;
 
-	spin_lock(&fiq->waitq.lock);
+	spin_lock(&fiq->lock);
 	if (fiq->connected) {
 		fiq->forget_list_tail->next = forget;
 		fiq->forget_list_tail = forget;
-		wake_up_locked(&fiq->waitq);
+		wake_up(&fiq->waitq);
 		kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
 	} else {
 		kfree(forget);
 	}
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 }
 
 static void flush_bg_queue(struct fuse_conn *fc)
@@ -412,10 +412,10 @@ static void flush_bg_queue(struct fuse_conn *fc)
 		req = list_first_entry(&fc->bg_queue, struct fuse_req, list);
 		list_del(&req->list);
 		fc->active_background++;
-		spin_lock(&fiq->waitq.lock);
+		spin_lock(&fiq->lock);
 		req->in.h.unique = fuse_get_unique(fiq);
 		queue_request(fiq, req);
-		spin_unlock(&fiq->waitq.lock);
+		spin_unlock(&fiq->lock);
 	}
 }
 
@@ -439,9 +439,9 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req)
 	 * smp_mb() from queue_interrupt().
 	 */
 	if (!list_empty(&req->intr_entry)) {
-		spin_lock(&fiq->waitq.lock);
+		spin_lock(&fiq->lock);
 		list_del_init(&req->intr_entry);
-		spin_unlock(&fiq->waitq.lock);
+		spin_unlock(&fiq->lock);
 	}
 	WARN_ON(test_bit(FR_PENDING, &req->flags));
 	WARN_ON(test_bit(FR_SENT, &req->flags));
@@ -483,10 +483,10 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req)
 
 static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
 {
-	spin_lock(&fiq->waitq.lock);
+	spin_lock(&fiq->lock);
 	/* Check for we've sent request to interrupt this req */
 	if (unlikely(!test_bit(FR_INTERRUPTED, &req->flags))) {
-		spin_unlock(&fiq->waitq.lock);
+		spin_unlock(&fiq->lock);
 		return -EINVAL;
 	}
 
@@ -499,13 +499,13 @@ static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
 		smp_mb();
 		if (test_bit(FR_FINISHED, &req->flags)) {
 			list_del_init(&req->intr_entry);
-			spin_unlock(&fiq->waitq.lock);
+			spin_unlock(&fiq->lock);
 			return 0;
 		}
-		wake_up_locked(&fiq->waitq);
+		wake_up(&fiq->waitq);
 		kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
 	}
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 	return 0;
 }
 
@@ -535,16 +535,16 @@ static void request_wait_answer(struct fuse_conn *fc, struct fuse_req *req)
 		if (!err)
 			return;
 
-		spin_lock(&fiq->waitq.lock);
+		spin_lock(&fiq->lock);
 		/* Request is not yet in userspace, bail out */
 		if (test_bit(FR_PENDING, &req->flags)) {
 			list_del(&req->list);
-			spin_unlock(&fiq->waitq.lock);
+			spin_unlock(&fiq->lock);;
 			__fuse_put_request(req);
 			req->out.h.error = -EINTR;
 			return;
 		}
-		spin_unlock(&fiq->waitq.lock);
+		spin_unlock(&fiq->lock);
 	}
 
 	/*
@@ -559,9 +559,9 @@ static void __fuse_request_send(struct fuse_conn *fc, struct fuse_req *req)
 	struct fuse_iqueue *fiq = &fc->iq;
 
 	BUG_ON(test_bit(FR_BACKGROUND, &req->flags));
-	spin_lock(&fiq->waitq.lock);
+	spin_lock_irq(&fiq->lock);
 	if (!fiq->connected) {
-		spin_unlock(&fiq->waitq.lock);
+		spin_unlock(&fiq->lock);
 		req->out.h.error = -ENOTCONN;
 	} else {
 		req->in.h.unique = fuse_get_unique(fiq);
@@ -569,7 +569,7 @@ static void __fuse_request_send(struct fuse_conn *fc, struct fuse_req *req)
 		/* acquire extra reference, since request is still needed
 		   after request_end() */
 		__fuse_get_request(req);
-		spin_unlock(&fiq->waitq.lock);
+		spin_unlock(&fiq->lock);
 
 		request_wait_answer(fc, req);
 		/* Pairs with smp_wmb() in request_end() */
@@ -700,12 +700,12 @@ static int fuse_request_send_notify_reply(struct fuse_conn *fc,
 
 	__clear_bit(FR_ISREPLY, &req->flags);
 	req->in.h.unique = unique;
-	spin_lock(&fiq->waitq.lock);
+	spin_lock(&fiq->lock);
 	if (fiq->connected) {
 		queue_request(fiq, req);
 		err = 0;
 	}
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 
 	return err;
 }
@@ -1149,12 +1149,12 @@ static int request_pending(struct fuse_iqueue *fiq)
  * Unlike other requests this is assembled on demand, without a need
  * to allocate a separate fuse_req structure.
  *
- * Called with fiq->waitq.lock held, releases it
+ * Called with fiq->lock held, releases it
  */
 static int fuse_read_interrupt(struct fuse_iqueue *fiq,
 			       struct fuse_copy_state *cs,
 			       size_t nbytes, struct fuse_req *req)
-__releases(fiq->waitq.lock)
+__releases(fiq->lock)
 {
 	struct fuse_in_header ih;
 	struct fuse_interrupt_in arg;
@@ -1169,7 +1169,7 @@ __releases(fiq->waitq.lock)
 	ih.unique = (req->in.h.unique | FUSE_INT_REQ_BIT);
 	arg.unique = req->in.h.unique;
 
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 	if (nbytes < reqsize)
 		return -EINVAL;
 
@@ -1206,7 +1206,7 @@ static struct fuse_forget_link *dequeue_forget(struct fuse_iqueue *fiq,
 static int fuse_read_single_forget(struct fuse_iqueue *fiq,
 				   struct fuse_copy_state *cs,
 				   size_t nbytes)
-__releases(fiq->waitq.lock)
+__releases(fiq->lock)
 {
 	int err;
 	struct fuse_forget_link *forget = dequeue_forget(fiq, 1, NULL);
@@ -1220,7 +1220,7 @@ __releases(fiq->waitq.lock)
 		.len = sizeof(ih) + sizeof(arg),
 	};
 
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 	kfree(forget);
 	if (nbytes < ih.len)
 		return -EINVAL;
@@ -1238,7 +1238,7 @@ __releases(fiq->waitq.lock)
 
 static int fuse_read_batch_forget(struct fuse_iqueue *fiq,
 				   struct fuse_copy_state *cs, size_t nbytes)
-__releases(fiq->waitq.lock)
+__releases(fiq->lock)
 {
 	int err;
 	unsigned max_forgets;
@@ -1252,13 +1252,13 @@ __releases(fiq->waitq.lock)
 	};
 
 	if (nbytes < ih.len) {
-		spin_unlock(&fiq->waitq.lock);
+		spin_unlock(&fiq->lock);
 		return -EINVAL;
 	}
 
 	max_forgets = (nbytes - ih.len) / sizeof(struct fuse_forget_one);
 	head = dequeue_forget(fiq, max_forgets, &count);
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 
 	arg.count = count;
 	ih.len += count * sizeof(struct fuse_forget_one);
@@ -1288,7 +1288,7 @@ __releases(fiq->waitq.lock)
 static int fuse_read_forget(struct fuse_conn *fc, struct fuse_iqueue *fiq,
 			    struct fuse_copy_state *cs,
 			    size_t nbytes)
-__releases(fiq->waitq.lock)
+__releases(fiq->lock)
 {
 	if (fc->minor < 16 || fiq->forget_list_head.next->next == NULL)
 		return fuse_read_single_forget(fiq, cs, nbytes);
@@ -1318,16 +1318,19 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
 	unsigned int hash;
 
  restart:
-	spin_lock(&fiq->waitq.lock);
-	err = -EAGAIN;
-	if ((file->f_flags & O_NONBLOCK) && fiq->connected &&
-	    !request_pending(fiq))
-		goto err_unlock;
+	for (;;) {
+		spin_lock(&fiq->lock);
+		if (!fiq->connected || request_pending(fiq))
+			break;
+		spin_unlock(&fiq->lock);
 
-	err = wait_event_interruptible_exclusive_locked(fiq->waitq,
+		if (file->f_flags & O_NONBLOCK)
+			return -EAGAIN;
+		err = wait_event_interruptible_exclusive(fiq->waitq,
 				!fiq->connected || request_pending(fiq));
-	if (err)
-		goto err_unlock;
+		if (err)
+			return err;
+	}
 
 	if (!fiq->connected) {
 		err = fc->aborted ? -ECONNABORTED : -ENODEV;
@@ -1351,7 +1354,7 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
 	req = list_entry(fiq->pending.next, struct fuse_req, list);
 	clear_bit(FR_PENDING, &req->flags);
 	list_del_init(&req->list);
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 
 	in = &req->in;
 	reqsize = in->h.len;
@@ -1409,7 +1412,7 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
 	return err;
 
  err_unlock:
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 	return err;
 }
 
@@ -2121,12 +2124,12 @@ static __poll_t fuse_dev_poll(struct file *file, poll_table *wait)
 	fiq = &fud->fc->iq;
 	poll_wait(file, &fiq->waitq, wait);
 
-	spin_lock(&fiq->waitq.lock);
+	spin_lock(&fiq->lock);
 	if (!fiq->connected)
 		mask = EPOLLERR;
 	else if (request_pending(fiq))
 		mask |= EPOLLIN | EPOLLRDNORM;
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 
 	return mask;
 }
@@ -2221,15 +2224,15 @@ void fuse_abort_conn(struct fuse_conn *fc)
 		flush_bg_queue(fc);
 		spin_unlock(&fc->bg_lock);
 
-		spin_lock(&fiq->waitq.lock);
+		spin_lock(&fiq->lock);
 		fiq->connected = 0;
 		list_for_each_entry(req, &fiq->pending, list)
 			clear_bit(FR_PENDING, &req->flags);
 		list_splice_tail_init(&fiq->pending, &to_end);
 		while (forget_pending(fiq))
 			kfree(dequeue_forget(fiq, 1, NULL));
-		wake_up_all_locked(&fiq->waitq);
-		spin_unlock(&fiq->waitq.lock);
+		wake_up_all(&fiq->waitq);
+		spin_unlock(&fiq->lock);
 		kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
 		end_polls(fc);
 		wake_up_all(&fc->blocked_waitq);
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 24dbca777775..5de7fd0fd6ef 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -450,6 +450,9 @@ struct fuse_iqueue {
 	/** Connection established */
 	unsigned connected;
 
+	/** Lock protecting accessess to members of this structure */
+	spinlock_t lock;
+
 	/** Readers of the connection are waiting on this */
 	wait_queue_head_t waitq;
 
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 4bb885b0f032..987877860c01 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -582,6 +582,7 @@ static int fuse_show_options(struct seq_file *m, struct dentry *root)
 static void fuse_iqueue_init(struct fuse_iqueue *fiq)
 {
 	memset(fiq, 0, sizeof(struct fuse_iqueue));
+	spin_lock_init(&fiq->lock);
 	init_waitqueue_head(&fiq->waitq);
 	INIT_LIST_HEAD(&fiq->pending);
 	INIT_LIST_HEAD(&fiq->interrupts);
-- 
2.23.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] fuse: disable irqs for fuse_iqueue::waitq.lock
  2019-09-06  4:43         ` Eric Biggers
@ 2019-09-06  6:58           ` Miklos Szeredi
  2019-09-09  3:15             ` [PATCH v2] fuse: fix deadlock with aio poll and fuse_iqueue::waitq.lock Eric Biggers
  0 siblings, 1 reply; 7+ messages in thread
From: Miklos Szeredi @ 2019-09-06  6:58 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fsdevel, linux-aio, Benjamin LaHaise, syzkaller-bugs,
	stable, Christoph Hellwig

On Fri, Sep 6, 2019 at 6:43 AM Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Wed, Sep 04, 2019 at 04:29:03PM +0200, Miklos Szeredi wrote:

> > TBH, I find the fix disgusting. It's confusing to sprinke code that
> > has absolutely nothing to do with interrupts with spin_lock_irq()
> > calls.
> >
> > I think the lock/unlock calls should at least be done with a helper
> > with a comment explaining why disabling interrupts is needed (though I
> > have not managed to understand why aio needs to actually mess with the
> > waitq lock...)
>
> The aio code is doing a poll(), so it needs to use the wait queue.

Doesn't explain why the irq disabled nested locking is needed in
aio_poll().  poll/select manage to do that without messing with waitq
internals.   How is aio poll different?

> >
> > Probably a better fix would be to just use a separate spinlock to
> > avoid the need to disable interrupts in cases where it's not
> > necessary.
>
> Well, the below is what a separate lock would look like.  Note that it actually
> still disables IRQs in some places; it's just hidden away in the nested
> spin_lock_irqsave() in wake_up().  Likewise, adding something to the fuse_iqueue
> then requires taking 2 spin locks -- one explicit, and one hidden in wake_up().

Right, that's exactly why the waitq lock was used.

> Is this the solution you'd prefer?

I'd actually prefer if aio was fixed.   But I guess that's not
realistic, so yes, the below patch looks okay.  If fiq->lock is in the
same cacheline as fiq->waitq then it shouldn't make a difference.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2] fuse: fix deadlock with aio poll and fuse_iqueue::waitq.lock
  2019-09-06  6:58           ` Miklos Szeredi
@ 2019-09-09  3:15             ` Eric Biggers
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Biggers @ 2019-09-09  3:15 UTC (permalink / raw)
  To: Miklos Szeredi, linux-fsdevel
  Cc: linux-aio, Benjamin LaHaise, syzkaller-bugs, Christoph Hellwig

From: Eric Biggers <ebiggers@google.com>

When IOCB_CMD_POLL is used on the FUSE device, aio_poll() disables IRQs
and takes kioctx::ctx_lock, then fuse_iqueue::waitq.lock.

This may have to wait for fuse_iqueue::waitq.lock to be released by one
of many places that take it with IRQs enabled.  Since the IRQ handler
may take kioctx::ctx_lock, lockdep reports that a deadlock is possible.

Fix it by protecting the state of struct fuse_iqueue with a separate
spinlock, and only accessing fuse_iqueue::waitq using the versions of
the waitqueue functions which do IRQ-safe locking internally.

Reproducer:

	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/mount.h>
	#include <sys/stat.h>
	#include <sys/syscall.h>
	#include <unistd.h>
	#include <linux/aio_abi.h>

	int main()
	{
		char opts[128];
		int fd = open("/dev/fuse", O_RDWR);
		aio_context_t ctx = 0;
		struct iocb cb = { .aio_lio_opcode = IOCB_CMD_POLL, .aio_fildes = fd };
		struct iocb *cbp = &cb;

		sprintf(opts, "fd=%d,rootmode=040000,user_id=0,group_id=0", fd);
		mkdir("mnt", 0700);
		mount("foo",  "mnt", "fuse", 0, opts);
		syscall(__NR_io_setup, 1, &ctx);
		syscall(__NR_io_submit, ctx, 1, &cbp);
	}

Beginning of lockdep output:

	=====================================================
	WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
	5.3.0-rc5 #9 Not tainted
	-----------------------------------------------------
	syz_fuse/135 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
	000000003590ceda (&fiq->waitq){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
	000000003590ceda (&fiq->waitq){+.+.}, at: aio_poll fs/aio.c:1751 [inline]
	000000003590ceda (&fiq->waitq){+.+.}, at: __io_submit_one.constprop.0+0x203/0x5b0 fs/aio.c:1825

	and this task is already holding:
	0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:363 [inline]
	0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1749 [inline]
	0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one.constprop.0+0x1f4/0x5b0 fs/aio.c:1825
	which would create a new lock dependency:
	 (&(&ctx->ctx_lock)->rlock){..-.} -> (&fiq->waitq){+.+.}

	but this new dependency connects a SOFTIRQ-irq-safe lock:
	 (&(&ctx->ctx_lock)->rlock){..-.}

	[...]

Reported-by: syzbot+af05535bb79520f95431@syzkaller.appspotmail.com
Reported-by: syzbot+d86c4426a01f60feddc7@syzkaller.appspotmail.com
Fixes: bfe4037e722e ("aio: implement IOCB_CMD_POLL")
Cc: <stable@vger.kernel.org> # v4.19+
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 fs/fuse/dev.c    | 93 +++++++++++++++++++++++++-----------------------
 fs/fuse/fuse_i.h |  3 ++
 fs/fuse/inode.c  |  1 +
 3 files changed, 52 insertions(+), 45 deletions(-)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index ea8237513dfa..186468fba82e 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -377,7 +377,7 @@ static void queue_request(struct fuse_iqueue *fiq, struct fuse_req *req)
 	req->in.h.len = sizeof(struct fuse_in_header) +
 		len_args(req->in.numargs, (struct fuse_arg *) req->in.args);
 	list_add_tail(&req->list, &fiq->pending);
-	wake_up_locked(&fiq->waitq);
+	wake_up(&fiq->waitq);
 	kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
 }
 
@@ -389,16 +389,16 @@ void fuse_queue_forget(struct fuse_conn *fc, struct fuse_forget_link *forget,
 	forget->forget_one.nodeid = nodeid;
 	forget->forget_one.nlookup = nlookup;
 
-	spin_lock(&fiq->waitq.lock);
+	spin_lock(&fiq->lock);
 	if (fiq->connected) {
 		fiq->forget_list_tail->next = forget;
 		fiq->forget_list_tail = forget;
-		wake_up_locked(&fiq->waitq);
+		wake_up(&fiq->waitq);
 		kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
 	} else {
 		kfree(forget);
 	}
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 }
 
 static void flush_bg_queue(struct fuse_conn *fc)
@@ -412,10 +412,10 @@ static void flush_bg_queue(struct fuse_conn *fc)
 		req = list_first_entry(&fc->bg_queue, struct fuse_req, list);
 		list_del(&req->list);
 		fc->active_background++;
-		spin_lock(&fiq->waitq.lock);
+		spin_lock(&fiq->lock);
 		req->in.h.unique = fuse_get_unique(fiq);
 		queue_request(fiq, req);
-		spin_unlock(&fiq->waitq.lock);
+		spin_unlock(&fiq->lock);
 	}
 }
 
@@ -439,9 +439,9 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req)
 	 * smp_mb() from queue_interrupt().
 	 */
 	if (!list_empty(&req->intr_entry)) {
-		spin_lock(&fiq->waitq.lock);
+		spin_lock(&fiq->lock);
 		list_del_init(&req->intr_entry);
-		spin_unlock(&fiq->waitq.lock);
+		spin_unlock(&fiq->lock);
 	}
 	WARN_ON(test_bit(FR_PENDING, &req->flags));
 	WARN_ON(test_bit(FR_SENT, &req->flags));
@@ -483,10 +483,10 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req)
 
 static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
 {
-	spin_lock(&fiq->waitq.lock);
+	spin_lock(&fiq->lock);
 	/* Check for we've sent request to interrupt this req */
 	if (unlikely(!test_bit(FR_INTERRUPTED, &req->flags))) {
-		spin_unlock(&fiq->waitq.lock);
+		spin_unlock(&fiq->lock);
 		return -EINVAL;
 	}
 
@@ -499,13 +499,13 @@ static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
 		smp_mb();
 		if (test_bit(FR_FINISHED, &req->flags)) {
 			list_del_init(&req->intr_entry);
-			spin_unlock(&fiq->waitq.lock);
+			spin_unlock(&fiq->lock);
 			return 0;
 		}
-		wake_up_locked(&fiq->waitq);
+		wake_up(&fiq->waitq);
 		kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
 	}
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 	return 0;
 }
 
@@ -535,16 +535,16 @@ static void request_wait_answer(struct fuse_conn *fc, struct fuse_req *req)
 		if (!err)
 			return;
 
-		spin_lock(&fiq->waitq.lock);
+		spin_lock(&fiq->lock);
 		/* Request is not yet in userspace, bail out */
 		if (test_bit(FR_PENDING, &req->flags)) {
 			list_del(&req->list);
-			spin_unlock(&fiq->waitq.lock);
+			spin_unlock(&fiq->lock);
 			__fuse_put_request(req);
 			req->out.h.error = -EINTR;
 			return;
 		}
-		spin_unlock(&fiq->waitq.lock);
+		spin_unlock(&fiq->lock);
 	}
 
 	/*
@@ -559,9 +559,9 @@ static void __fuse_request_send(struct fuse_conn *fc, struct fuse_req *req)
 	struct fuse_iqueue *fiq = &fc->iq;
 
 	BUG_ON(test_bit(FR_BACKGROUND, &req->flags));
-	spin_lock(&fiq->waitq.lock);
+	spin_lock(&fiq->lock);
 	if (!fiq->connected) {
-		spin_unlock(&fiq->waitq.lock);
+		spin_unlock(&fiq->lock);
 		req->out.h.error = -ENOTCONN;
 	} else {
 		req->in.h.unique = fuse_get_unique(fiq);
@@ -569,7 +569,7 @@ static void __fuse_request_send(struct fuse_conn *fc, struct fuse_req *req)
 		/* acquire extra reference, since request is still needed
 		   after request_end() */
 		__fuse_get_request(req);
-		spin_unlock(&fiq->waitq.lock);
+		spin_unlock(&fiq->lock);
 
 		request_wait_answer(fc, req);
 		/* Pairs with smp_wmb() in request_end() */
@@ -700,12 +700,12 @@ static int fuse_request_send_notify_reply(struct fuse_conn *fc,
 
 	__clear_bit(FR_ISREPLY, &req->flags);
 	req->in.h.unique = unique;
-	spin_lock(&fiq->waitq.lock);
+	spin_lock(&fiq->lock);
 	if (fiq->connected) {
 		queue_request(fiq, req);
 		err = 0;
 	}
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 
 	return err;
 }
@@ -1149,12 +1149,12 @@ static int request_pending(struct fuse_iqueue *fiq)
  * Unlike other requests this is assembled on demand, without a need
  * to allocate a separate fuse_req structure.
  *
- * Called with fiq->waitq.lock held, releases it
+ * Called with fiq->lock held, releases it
  */
 static int fuse_read_interrupt(struct fuse_iqueue *fiq,
 			       struct fuse_copy_state *cs,
 			       size_t nbytes, struct fuse_req *req)
-__releases(fiq->waitq.lock)
+__releases(fiq->lock)
 {
 	struct fuse_in_header ih;
 	struct fuse_interrupt_in arg;
@@ -1169,7 +1169,7 @@ __releases(fiq->waitq.lock)
 	ih.unique = (req->in.h.unique | FUSE_INT_REQ_BIT);
 	arg.unique = req->in.h.unique;
 
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 	if (nbytes < reqsize)
 		return -EINVAL;
 
@@ -1206,7 +1206,7 @@ static struct fuse_forget_link *dequeue_forget(struct fuse_iqueue *fiq,
 static int fuse_read_single_forget(struct fuse_iqueue *fiq,
 				   struct fuse_copy_state *cs,
 				   size_t nbytes)
-__releases(fiq->waitq.lock)
+__releases(fiq->lock)
 {
 	int err;
 	struct fuse_forget_link *forget = dequeue_forget(fiq, 1, NULL);
@@ -1220,7 +1220,7 @@ __releases(fiq->waitq.lock)
 		.len = sizeof(ih) + sizeof(arg),
 	};
 
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 	kfree(forget);
 	if (nbytes < ih.len)
 		return -EINVAL;
@@ -1238,7 +1238,7 @@ __releases(fiq->waitq.lock)
 
 static int fuse_read_batch_forget(struct fuse_iqueue *fiq,
 				   struct fuse_copy_state *cs, size_t nbytes)
-__releases(fiq->waitq.lock)
+__releases(fiq->lock)
 {
 	int err;
 	unsigned max_forgets;
@@ -1252,13 +1252,13 @@ __releases(fiq->waitq.lock)
 	};
 
 	if (nbytes < ih.len) {
-		spin_unlock(&fiq->waitq.lock);
+		spin_unlock(&fiq->lock);
 		return -EINVAL;
 	}
 
 	max_forgets = (nbytes - ih.len) / sizeof(struct fuse_forget_one);
 	head = dequeue_forget(fiq, max_forgets, &count);
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 
 	arg.count = count;
 	ih.len += count * sizeof(struct fuse_forget_one);
@@ -1288,7 +1288,7 @@ __releases(fiq->waitq.lock)
 static int fuse_read_forget(struct fuse_conn *fc, struct fuse_iqueue *fiq,
 			    struct fuse_copy_state *cs,
 			    size_t nbytes)
-__releases(fiq->waitq.lock)
+__releases(fiq->lock)
 {
 	if (fc->minor < 16 || fiq->forget_list_head.next->next == NULL)
 		return fuse_read_single_forget(fiq, cs, nbytes);
@@ -1318,16 +1318,19 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
 	unsigned int hash;
 
  restart:
-	spin_lock(&fiq->waitq.lock);
-	err = -EAGAIN;
-	if ((file->f_flags & O_NONBLOCK) && fiq->connected &&
-	    !request_pending(fiq))
-		goto err_unlock;
+	for (;;) {
+		spin_lock(&fiq->lock);
+		if (!fiq->connected || request_pending(fiq))
+			break;
+		spin_unlock(&fiq->lock);
 
-	err = wait_event_interruptible_exclusive_locked(fiq->waitq,
+		if (file->f_flags & O_NONBLOCK)
+			return -EAGAIN;
+		err = wait_event_interruptible_exclusive(fiq->waitq,
 				!fiq->connected || request_pending(fiq));
-	if (err)
-		goto err_unlock;
+		if (err)
+			return err;
+	}
 
 	if (!fiq->connected) {
 		err = fc->aborted ? -ECONNABORTED : -ENODEV;
@@ -1351,7 +1354,7 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
 	req = list_entry(fiq->pending.next, struct fuse_req, list);
 	clear_bit(FR_PENDING, &req->flags);
 	list_del_init(&req->list);
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 
 	in = &req->in;
 	reqsize = in->h.len;
@@ -1409,7 +1412,7 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
 	return err;
 
  err_unlock:
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 	return err;
 }
 
@@ -2121,12 +2124,12 @@ static __poll_t fuse_dev_poll(struct file *file, poll_table *wait)
 	fiq = &fud->fc->iq;
 	poll_wait(file, &fiq->waitq, wait);
 
-	spin_lock(&fiq->waitq.lock);
+	spin_lock(&fiq->lock);
 	if (!fiq->connected)
 		mask = EPOLLERR;
 	else if (request_pending(fiq))
 		mask |= EPOLLIN | EPOLLRDNORM;
-	spin_unlock(&fiq->waitq.lock);
+	spin_unlock(&fiq->lock);
 
 	return mask;
 }
@@ -2221,15 +2224,15 @@ void fuse_abort_conn(struct fuse_conn *fc)
 		flush_bg_queue(fc);
 		spin_unlock(&fc->bg_lock);
 
-		spin_lock(&fiq->waitq.lock);
+		spin_lock(&fiq->lock);
 		fiq->connected = 0;
 		list_for_each_entry(req, &fiq->pending, list)
 			clear_bit(FR_PENDING, &req->flags);
 		list_splice_tail_init(&fiq->pending, &to_end);
 		while (forget_pending(fiq))
 			kfree(dequeue_forget(fiq, 1, NULL));
-		wake_up_all_locked(&fiq->waitq);
-		spin_unlock(&fiq->waitq.lock);
+		wake_up_all(&fiq->waitq);
+		spin_unlock(&fiq->lock);
 		kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
 		end_polls(fc);
 		wake_up_all(&fc->blocked_waitq);
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 24dbca777775..89bdc41e0d86 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -450,6 +450,9 @@ struct fuse_iqueue {
 	/** Connection established */
 	unsigned connected;
 
+	/** Lock protecting accesses to members of this structure */
+	spinlock_t lock;
+
 	/** Readers of the connection are waiting on this */
 	wait_queue_head_t waitq;
 
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 4bb885b0f032..987877860c01 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -582,6 +582,7 @@ static int fuse_show_options(struct seq_file *m, struct dentry *root)
 static void fuse_iqueue_init(struct fuse_iqueue *fiq)
 {
 	memset(fiq, 0, sizeof(struct fuse_iqueue));
+	spin_lock_init(&fiq->lock);
 	init_waitqueue_head(&fiq->waitq);
 	INIT_LIST_HEAD(&fiq->pending);
 	INIT_LIST_HEAD(&fiq->interrupts);
-- 
2.23.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, back to index

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-18 16:18 possible deadlock in io_submit_one (2) syzbot
     [not found] ` <20190822233529.4176-1-ebiggers@kernel.org>
2019-09-03  7:31   ` [PATCH] fuse: disable irqs for fuse_iqueue::waitq.lock Miklos Szeredi
2019-09-03 13:39     ` Eric Biggers
2019-09-04 14:29       ` Miklos Szeredi
2019-09-06  4:43         ` Eric Biggers
2019-09-06  6:58           ` Miklos Szeredi
2019-09-09  3:15             ` [PATCH v2] fuse: fix deadlock with aio poll and fuse_iqueue::waitq.lock Eric Biggers

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org linux-fsdevel@archiver.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox