* possible deadlock in io_submit_one (2) @ 2019-08-18 16:18 syzbot [not found] ` <20190822233529.4176-1-ebiggers@kernel.org> 0 siblings, 1 reply; 7+ messages in thread From: syzbot @ 2019-08-18 16:18 UTC (permalink / raw) To: bcrl, linux-aio, linux-fsdevel, linux-kernel, syzkaller-bugs, viro Hello, syzbot found the following crash on: HEAD commit: 17da61ae Add linux-next specific files for 20190814 git tree: linux-next console output: https://syzkaller.appspot.com/x/log.txt?x=127712e2600000 kernel config: https://syzkaller.appspot.com/x/.config?x=4733704ca85aaa66 dashboard link: https://syzkaller.appspot.com/bug?extid=af05535bb79520f95431 compiler: gcc (GCC) 9.0.0 20181231 (experimental) Unfortunately, I don't have any reproducer for this crash yet. IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+af05535bb79520f95431@syzkaller.appspotmail.com ===================================================== WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected 5.3.0-rc4-next-20190814 #66 Not tainted ----------------------------------------------------- syz-executor.1/25024 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: ffff8880a374bbe8 (&fiq->waitq){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline] ffff8880a374bbe8 (&fiq->waitq){+.+.}, at: aio_poll fs/aio.c:1748 [inline] ffff8880a374bbe8 (&fiq->waitq){+.+.}, at: __io_submit_one fs/aio.c:1822 [inline] ffff8880a374bbe8 (&fiq->waitq){+.+.}, at: io_submit_one+0xefa/0x2ef0 fs/aio.c:1859 and this task is already holding: ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:363 [inline] ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1746 [inline] ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one fs/aio.c:1822 [inline] ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at: io_submit_one+0xeb5/0x2ef0 fs/aio.c:1859 which would create a new lock dependency: (&(&ctx->ctx_lock)->rlock){..-.} 
-> (&fiq->waitq){+.+.} but this new dependency connects a SOFTIRQ-irq-safe lock: (&(&ctx->ctx_lock)->rlock){..-.} ... which became SOFTIRQ-irq-safe at: lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487 __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline] _raw_spin_lock_irq+0x60/0x80 kernel/locking/spinlock.c:167 spin_lock_irq include/linux/spinlock.h:363 [inline] free_ioctx_users+0x2d/0x490 fs/aio.c:618 percpu_ref_put_many include/linux/percpu-refcount.h:293 [inline] percpu_ref_put include/linux/percpu-refcount.h:309 [inline] percpu_ref_call_confirm_rcu lib/percpu-refcount.c:130 [inline] percpu_ref_switch_to_atomic_rcu+0x4c0/0x570 lib/percpu-refcount.c:165 __rcu_reclaim kernel/rcu/rcu.h:222 [inline] rcu_do_batch kernel/rcu/tree.c:2157 [inline] rcu_core+0x581/0x1560 kernel/rcu/tree.c:2377 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2386 __do_softirq+0x262/0x98c kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0x19b/0x1e0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:536 [inline] smp_apic_timer_interrupt+0x1a3/0x610 arch/x86/kernel/apic/apic.c:1095 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830 constant_test_bit arch/x86/include/asm/bitops.h:207 [inline] test_bit include/asm-generic/bitops-instrumented.h:238 [inline] test_ti_thread_flag include/linux/thread_info.h:84 [inline] need_resched include/linux/sched.h:1827 [inline] mutex_spin_on_owner+0x7b/0x330 kernel/locking/mutex.c:568 mutex_optimistic_spin kernel/locking/mutex.c:673 [inline] __mutex_lock_common kernel/locking/mutex.c:959 [inline] __mutex_lock+0x330/0x13c0 kernel/locking/mutex.c:1103 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1118 rtnl_lock+0x17/0x20 net/core/rtnetlink.c:72 addrconf_dad_work+0xad/0x1150 net/ipv6/addrconf.c:4032 process_one_work+0x9af/0x1740 kernel/workqueue.c:2269 worker_thread+0x98/0xe40 kernel/workqueue.c:2415 kthread+0x361/0x430 kernel/kthread.c:255 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352 to a 
SOFTIRQ-irq-unsafe lock: (&fiq->waitq){+.+.} ... which became SOFTIRQ-irq-unsafe at: ... lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415 fuse_request_queue_background+0x2f8/0x5a0 fs/fuse/dev.c:676 fuse_request_send_background+0x58/0x110 fs/fuse/dev.c:687 cuse_send_init fs/fuse/cuse.c:459 [inline] cuse_channel_open+0x5ba/0x830 fs/fuse/cuse.c:519 misc_open+0x395/0x4c0 drivers/char/misc.c:141 chrdev_open+0x245/0x6b0 fs/char_dev.c:414 do_dentry_open+0x4df/0x1250 fs/open.c:797 vfs_open+0xa0/0xd0 fs/open.c:914 do_last fs/namei.c:3416 [inline] path_openat+0x10e9/0x4630 fs/namei.c:3533 do_filp_open+0x1a1/0x280 fs/namei.c:3563 do_sys_open+0x3fe/0x5d0 fs/open.c:1097 __do_sys_openat fs/open.c:1124 [inline] __se_sys_openat fs/open.c:1118 [inline] __x64_sys_openat+0x9d/0x100 fs/open.c:1118 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&fiq->waitq); local_irq_disable(); lock(&(&ctx->ctx_lock)->rlock); lock(&fiq->waitq); <Interrupt> lock(&(&ctx->ctx_lock)->rlock); *** DEADLOCK *** 1 lock held by syz-executor.1/25024: #0: ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:363 [inline] #0: ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1746 [inline] #0: ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one fs/aio.c:1822 [inline] #0: ffff88806001fc58 (&(&ctx->ctx_lock)->rlock){..-.}, at: io_submit_one+0xeb5/0x2ef0 fs/aio.c:1859 the dependencies between SOFTIRQ-irq-safe lock and the holding lock: -> (&(&ctx->ctx_lock)->rlock){..-.} { IN-SOFTIRQ-W at: lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487 __raw_spin_lock_irq 
include/linux/spinlock_api_smp.h:128 [inline] _raw_spin_lock_irq+0x60/0x80 kernel/locking/spinlock.c:167 spin_lock_irq include/linux/spinlock.h:363 [inline] free_ioctx_users+0x2d/0x490 fs/aio.c:618 percpu_ref_put_many include/linux/percpu-refcount.h:293 [inline] percpu_ref_put include/linux/percpu-refcount.h:309 [inline] percpu_ref_call_confirm_rcu lib/percpu-refcount.c:130 [inline] percpu_ref_switch_to_atomic_rcu+0x4c0/0x570 lib/percpu-refcount.c:165 __rcu_reclaim kernel/rcu/rcu.h:222 [inline] rcu_do_batch kernel/rcu/tree.c:2157 [inline] rcu_core+0x581/0x1560 kernel/rcu/tree.c:2377 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2386 __do_softirq+0x262/0x98c kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0x19b/0x1e0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:536 [inline] smp_apic_timer_interrupt+0x1a3/0x610 arch/x86/kernel/apic/apic.c:1095 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830 constant_test_bit arch/x86/include/asm/bitops.h:207 [inline] test_bit include/asm-generic/bitops-instrumented.h:238 [inline] test_ti_thread_flag include/linux/thread_info.h:84 [inline] need_resched include/linux/sched.h:1827 [inline] mutex_spin_on_owner+0x7b/0x330 kernel/locking/mutex.c:568 mutex_optimistic_spin kernel/locking/mutex.c:673 [inline] __mutex_lock_common kernel/locking/mutex.c:959 [inline] __mutex_lock+0x330/0x13c0 kernel/locking/mutex.c:1103 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1118 rtnl_lock+0x17/0x20 net/core/rtnetlink.c:72 addrconf_dad_work+0xad/0x1150 net/ipv6/addrconf.c:4032 process_one_work+0x9af/0x1740 kernel/workqueue.c:2269 worker_thread+0x98/0xe40 kernel/workqueue.c:2415 kthread+0x361/0x430 kernel/kthread.c:255 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352 INITIAL USE at: lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487 __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline] _raw_spin_lock_irq+0x60/0x80 kernel/locking/spinlock.c:167 spin_lock_irq 
include/linux/spinlock.h:363 [inline] free_ioctx_users+0x2d/0x490 fs/aio.c:618 percpu_ref_put_many include/linux/percpu-refcount.h:293 [inline] percpu_ref_put include/linux/percpu-refcount.h:309 [inline] percpu_ref_call_confirm_rcu lib/percpu-refcount.c:130 [inline] percpu_ref_switch_to_atomic_rcu+0x4c0/0x570 lib/percpu-refcount.c:165 __rcu_reclaim kernel/rcu/rcu.h:222 [inline] rcu_do_batch kernel/rcu/tree.c:2157 [inline] rcu_core+0x581/0x1560 kernel/rcu/tree.c:2377 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2386 __do_softirq+0x262/0x98c kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0x19b/0x1e0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:536 [inline] smp_apic_timer_interrupt+0x1a3/0x610 arch/x86/kernel/apic/apic.c:1095 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830 constant_test_bit arch/x86/include/asm/bitops.h:207 [inline] test_bit include/asm-generic/bitops-instrumented.h:238 [inline] test_ti_thread_flag include/linux/thread_info.h:84 [inline] need_resched include/linux/sched.h:1827 [inline] mutex_spin_on_owner+0x7b/0x330 kernel/locking/mutex.c:568 mutex_optimistic_spin kernel/locking/mutex.c:673 [inline] __mutex_lock_common kernel/locking/mutex.c:959 [inline] __mutex_lock+0x330/0x13c0 kernel/locking/mutex.c:1103 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1118 rtnl_lock+0x17/0x20 net/core/rtnetlink.c:72 addrconf_dad_work+0xad/0x1150 net/ipv6/addrconf.c:4032 process_one_work+0x9af/0x1740 kernel/workqueue.c:2269 worker_thread+0x98/0xe40 kernel/workqueue.c:2415 kthread+0x361/0x430 kernel/kthread.c:255 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352 } ... key at: [<ffffffff8ab00360>] __key.54228+0x0/0x40 ... 
acquired at: lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] aio_poll fs/aio.c:1748 [inline] __io_submit_one fs/aio.c:1822 [inline] io_submit_one+0xefa/0x2ef0 fs/aio.c:1859 __do_sys_io_submit fs/aio.c:1918 [inline] __se_sys_io_submit fs/aio.c:1888 [inline] __x64_sys_io_submit+0x1bd/0x570 fs/aio.c:1888 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe the dependencies between the lock to be acquired and SOFTIRQ-irq-unsafe lock: -> (&fiq->waitq){+.+.} { HARDIRQ-ON-W at: lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415 fuse_request_queue_background+0x2f8/0x5a0 fs/fuse/dev.c:676 fuse_request_send_background+0x58/0x110 fs/fuse/dev.c:687 cuse_send_init fs/fuse/cuse.c:459 [inline] cuse_channel_open+0x5ba/0x830 fs/fuse/cuse.c:519 misc_open+0x395/0x4c0 drivers/char/misc.c:141 chrdev_open+0x245/0x6b0 fs/char_dev.c:414 do_dentry_open+0x4df/0x1250 fs/open.c:797 vfs_open+0xa0/0xd0 fs/open.c:914 do_last fs/namei.c:3416 [inline] path_openat+0x10e9/0x4630 fs/namei.c:3533 do_filp_open+0x1a1/0x280 fs/namei.c:3563 do_sys_open+0x3fe/0x5d0 fs/open.c:1097 __do_sys_openat fs/open.c:1124 [inline] __se_sys_openat fs/open.c:1118 [inline] __x64_sys_openat+0x9d/0x100 fs/open.c:1118 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe SOFTIRQ-ON-W at: lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415 
fuse_request_queue_background+0x2f8/0x5a0 fs/fuse/dev.c:676 fuse_request_send_background+0x58/0x110 fs/fuse/dev.c:687 cuse_send_init fs/fuse/cuse.c:459 [inline] cuse_channel_open+0x5ba/0x830 fs/fuse/cuse.c:519 misc_open+0x395/0x4c0 drivers/char/misc.c:141 chrdev_open+0x245/0x6b0 fs/char_dev.c:414 do_dentry_open+0x4df/0x1250 fs/open.c:797 vfs_open+0xa0/0xd0 fs/open.c:914 do_last fs/namei.c:3416 [inline] path_openat+0x10e9/0x4630 fs/namei.c:3533 do_filp_open+0x1a1/0x280 fs/namei.c:3563 do_sys_open+0x3fe/0x5d0 fs/open.c:1097 __do_sys_openat fs/open.c:1124 [inline] __se_sys_openat fs/open.c:1118 [inline] __x64_sys_openat+0x9d/0x100 fs/open.c:1118 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe INITIAL USE at: lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415 fuse_request_queue_background+0x2f8/0x5a0 fs/fuse/dev.c:676 fuse_request_send_background+0x58/0x110 fs/fuse/dev.c:687 cuse_send_init fs/fuse/cuse.c:459 [inline] cuse_channel_open+0x5ba/0x830 fs/fuse/cuse.c:519 misc_open+0x395/0x4c0 drivers/char/misc.c:141 chrdev_open+0x245/0x6b0 fs/char_dev.c:414 do_dentry_open+0x4df/0x1250 fs/open.c:797 vfs_open+0xa0/0xd0 fs/open.c:914 do_last fs/namei.c:3416 [inline] path_openat+0x10e9/0x4630 fs/namei.c:3533 do_filp_open+0x1a1/0x280 fs/namei.c:3563 do_sys_open+0x3fe/0x5d0 fs/open.c:1097 __do_sys_openat fs/open.c:1124 [inline] __se_sys_openat fs/open.c:1118 [inline] __x64_sys_openat+0x9d/0x100 fs/open.c:1118 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe } ... key at: [<ffffffff8ab9a180>] __key.45697+0x0/0x40 ... 
acquired at: lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] aio_poll fs/aio.c:1748 [inline] __io_submit_one fs/aio.c:1822 [inline] io_submit_one+0xefa/0x2ef0 fs/aio.c:1859 __do_sys_io_submit fs/aio.c:1918 [inline] __se_sys_io_submit fs/aio.c:1888 [inline] __x64_sys_io_submit+0x1bd/0x570 fs/aio.c:1888 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe stack backtrace: CPU: 0 PID: 25024 Comm: syz-executor.1 Not tainted 5.3.0-rc4-next-20190814 #66 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 print_bad_irq_dependency kernel/locking/lockdep.c:2095 [inline] check_irq_usage.cold+0x586/0x6fe kernel/locking/lockdep.c:2293 check_prev_add kernel/locking/lockdep.c:2480 [inline] check_prevs_add kernel/locking/lockdep.c:2581 [inline] validate_chain kernel/locking/lockdep.c:2971 [inline] __lock_acquire+0x25d0/0x4e70 kernel/locking/lockdep.c:3955 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4487 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] aio_poll fs/aio.c:1748 [inline] __io_submit_one fs/aio.c:1822 [inline] io_submit_one+0xefa/0x2ef0 fs/aio.c:1859 __do_sys_io_submit fs/aio.c:1918 [inline] __se_sys_io_submit fs/aio.c:1888 [inline] __x64_sys_io_submit+0x1bd/0x570 fs/aio.c:1888 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x459829 Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 
002b:00007f9e07853c78 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000459829 RDX: 0000000020000440 RSI: 0000000000000001 RDI: 00007f9e07833000 RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9e078546d4 R13: 00000000004c0c19 R14: 00000000004d3c40 R15: 00000000ffffffff --- This bug is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this bug report. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <20190822233529.4176-1-ebiggers@kernel.org>]
* Re: [PATCH] fuse: disable irqs for fuse_iqueue::waitq.lock
  [not found] ` <20190822233529.4176-1-ebiggers@kernel.org>
@ 2019-09-03  7:31 ` Miklos Szeredi
  2019-09-03 13:39   ` Eric Biggers
  0 siblings, 1 reply; 7+ messages in thread
From: Miklos Szeredi @ 2019-09-03  7:31 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-fsdevel, linux-aio, Benjamin LaHaise, syzkaller-bugs,
	stable, Christoph Hellwig

On Fri, Aug 23, 2019 at 1:35 AM Eric Biggers <ebiggers@kernel.org> wrote:
>
> From: Eric Biggers <ebiggers@google.com>
>
> When IOCB_CMD_POLL is used on the FUSE device, aio_poll() disables IRQs
> and takes kioctx::ctx_lock, then fuse_iqueue::waitq.lock.

Not in -linus.

Which tree was this reproduced with?

Thanks,
Miklos
* Re: [PATCH] fuse: disable irqs for fuse_iqueue::waitq.lock 2019-09-03 7:31 ` [PATCH] fuse: disable irqs for fuse_iqueue::waitq.lock Miklos Szeredi @ 2019-09-03 13:39 ` Eric Biggers 2019-09-04 14:29 ` Miklos Szeredi 0 siblings, 1 reply; 7+ messages in thread From: Eric Biggers @ 2019-09-03 13:39 UTC (permalink / raw) To: Miklos Szeredi Cc: linux-fsdevel, linux-aio, Benjamin LaHaise, syzkaller-bugs, stable, Christoph Hellwig On Tue, Sep 03, 2019 at 09:31:29AM +0200, Miklos Szeredi wrote: > On Fri, Aug 23, 2019 at 1:35 AM Eric Biggers <ebiggers@kernel.org> wrote: > > > > From: Eric Biggers <ebiggers@google.com> > > > > When IOCB_CMD_POLL is used on the FUSE device, aio_poll() disables IRQs > > and takes kioctx::ctx_lock, then fuse_iqueue::waitq.lock. > > Not in -linus. > > Which tree was this reproduced with? > > Thanks, > Miklos Linus's tree. Here's the full symbolized output on v5.3-rc7: ===================================================== WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected 5.3.0-rc7 #1 Not tainted ----------------------------------------------------- syz_fuse/150 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: 00000000c2701b26 (&fiq->waitq){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline] 00000000c2701b26 (&fiq->waitq){+.+.}, at: aio_poll fs/aio.c:1751 [inline] 00000000c2701b26 (&fiq->waitq){+.+.}, at: __io_submit_one.constprop.0+0x203/0x5b0 fs/aio.c:1825 and this task is already holding: 00000000f4c02e81 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:363 [inline] 00000000f4c02e81 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1749 [inline] 00000000f4c02e81 (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one.constprop.0+0x1f4/0x5b0 fs/aio.c:1825 which would create a new lock dependency: (&(&ctx->ctx_lock)->rlock){..-.} -> (&fiq->waitq){+.+.} but this new dependency connects a SOFTIRQ-irq-safe lock: (&(&ctx->ctx_lock)->rlock){..-.} ... 
which became SOFTIRQ-irq-safe at: lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412 __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline] _raw_spin_lock_irq+0x32/0x50 kernel/locking/spinlock.c:167 spin_lock_irq include/linux/spinlock.h:363 [inline] free_ioctx_users+0x25/0x190 fs/aio.c:618 percpu_ref_put_many include/linux/percpu-refcount.h:293 [inline] percpu_ref_put include/linux/percpu-refcount.h:309 [inline] percpu_ref_call_confirm_rcu lib/percpu-refcount.c:130 [inline] percpu_ref_switch_to_atomic_rcu+0x202/0x210 lib/percpu-refcount.c:165 __rcu_reclaim kernel/rcu/rcu.h:222 [inline] rcu_do_batch+0x2ae/0x890 kernel/rcu/tree.c:2114 rcu_core+0x13a/0x370 kernel/rcu/tree.c:2314 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2323 __do_softirq+0xbe/0x400 kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0x8f/0xc0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:537 [inline] smp_apic_timer_interrupt+0x8e/0x210 arch/x86/kernel/apic/apic.c:1133 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830 native_safe_halt arch/x86/include/asm/irqflags.h:60 [inline] arch_safe_halt arch/x86/include/asm/irqflags.h:103 [inline] default_idle+0x29/0x160 arch/x86/kernel/process.c:580 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:571 default_idle_call+0x1e/0x30 kernel/sched/idle.c:94 cpuidle_idle_call kernel/sched/idle.c:154 [inline] do_idle+0x1f0/0x220 kernel/sched/idle.c:263 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:354 rest_init+0x18a/0x24d init/main.c:451 arch_call_rest_init+0x9/0xc start_kernel+0x530/0x54e init/main.c:785 x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:472 x86_64_start_kernel+0x6d/0x71 arch/x86/kernel/head64.c:453 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241 to a SOFTIRQ-irq-unsafe lock: (&fiq->waitq){+.+.} ... which became SOFTIRQ-irq-unsafe at: ... 
lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2b/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] flush_bg_queue+0x91/0xf0 fs/fuse/dev.c:415 fuse_request_queue_background+0xd1/0x140 fs/fuse/dev.c:676 fuse_request_send_background+0x27/0x60 fs/fuse/dev.c:687 fuse_send_init fs/fuse/inode.c:986 [inline] fuse_fill_super+0x656/0x681 fs/fuse/inode.c:1211 mount_nodev+0x42/0x90 fs/super.c:1329 fuse_mount+0x13/0x20 fs/fuse/inode.c:1236 legacy_get_tree+0x2c/0x50 fs/fs_context.c:661 vfs_get_tree+0x22/0xc0 fs/super.c:1413 do_new_mount fs/namespace.c:2791 [inline] do_mount+0x7e3/0xa60 fs/namespace.c:3111 ksys_mount+0x7d/0xd0 fs/namespace.c:3320 __do_sys_mount fs/namespace.c:3334 [inline] __se_sys_mount fs/namespace.c:3331 [inline] __x64_sys_mount+0x20/0x30 fs/namespace.c:3331 do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&fiq->waitq); local_irq_disable(); lock(&(&ctx->ctx_lock)->rlock); lock(&fiq->waitq); <Interrupt> lock(&(&ctx->ctx_lock)->rlock); *** DEADLOCK *** 1 lock held by syz_fuse/150: #0: 00000000f4c02e81 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:363 [inline] #0: 00000000f4c02e81 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1749 [inline] #0: 00000000f4c02e81 (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one.constprop.0+0x1f4/0x5b0 fs/aio.c:1825 the dependencies between SOFTIRQ-irq-safe lock and the holding lock: -> (&(&ctx->ctx_lock)->rlock){..-.} { IN-SOFTIRQ-W at: lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412 __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline] _raw_spin_lock_irq+0x32/0x50 kernel/locking/spinlock.c:167 spin_lock_irq include/linux/spinlock.h:363 [inline] free_ioctx_users+0x25/0x190 fs/aio.c:618 
percpu_ref_put_many include/linux/percpu-refcount.h:293 [inline] percpu_ref_put include/linux/percpu-refcount.h:309 [inline] percpu_ref_call_confirm_rcu lib/percpu-refcount.c:130 [inline] percpu_ref_switch_to_atomic_rcu+0x202/0x210 lib/percpu-refcount.c:165 __rcu_reclaim kernel/rcu/rcu.h:222 [inline] rcu_do_batch+0x2ae/0x890 kernel/rcu/tree.c:2114 rcu_core+0x13a/0x370 kernel/rcu/tree.c:2314 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2323 __do_softirq+0xbe/0x400 kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0x8f/0xc0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:537 [inline] smp_apic_timer_interrupt+0x8e/0x210 arch/x86/kernel/apic/apic.c:1133 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830 native_safe_halt arch/x86/include/asm/irqflags.h:60 [inline] arch_safe_halt arch/x86/include/asm/irqflags.h:103 [inline] default_idle+0x29/0x160 arch/x86/kernel/process.c:580 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:571 default_idle_call+0x1e/0x30 kernel/sched/idle.c:94 cpuidle_idle_call kernel/sched/idle.c:154 [inline] do_idle+0x1f0/0x220 kernel/sched/idle.c:263 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:354 rest_init+0x18a/0x24d init/main.c:451 arch_call_rest_init+0x9/0xc start_kernel+0x530/0x54e init/main.c:785 x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:472 x86_64_start_kernel+0x6d/0x71 arch/x86/kernel/head64.c:453 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241 INITIAL USE at: lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412 __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline] _raw_spin_lock_irq+0x32/0x50 kernel/locking/spinlock.c:167 spin_lock_irq include/linux/spinlock.h:363 [inline] free_ioctx_users+0x25/0x190 fs/aio.c:618 percpu_ref_put_many include/linux/percpu-refcount.h:293 [inline] percpu_ref_put include/linux/percpu-refcount.h:309 [inline] percpu_ref_call_confirm_rcu lib/percpu-refcount.c:130 [inline] percpu_ref_switch_to_atomic_rcu+0x202/0x210 
lib/percpu-refcount.c:165 __rcu_reclaim kernel/rcu/rcu.h:222 [inline] rcu_do_batch+0x2ae/0x890 kernel/rcu/tree.c:2114 rcu_core+0x13a/0x370 kernel/rcu/tree.c:2314 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2323 __do_softirq+0xbe/0x400 kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0x8f/0xc0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:537 [inline] smp_apic_timer_interrupt+0x8e/0x210 arch/x86/kernel/apic/apic.c:1133 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830 native_safe_halt arch/x86/include/asm/irqflags.h:60 [inline] arch_safe_halt arch/x86/include/asm/irqflags.h:103 [inline] default_idle+0x29/0x160 arch/x86/kernel/process.c:580 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:571 default_idle_call+0x1e/0x30 kernel/sched/idle.c:94 cpuidle_idle_call kernel/sched/idle.c:154 [inline] do_idle+0x1f0/0x220 kernel/sched/idle.c:263 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:354 rest_init+0x18a/0x24d init/main.c:451 arch_call_rest_init+0x9/0xc start_kernel+0x530/0x54e init/main.c:785 x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:472 x86_64_start_kernel+0x6d/0x71 arch/x86/kernel/head64.c:453 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241 } ... key at: [<ffffffff82860670>] __key.50377+0x0/0x10 ... 
acquired at: check_prev_add+0xa7/0x950 kernel/locking/lockdep.c:2409 check_prevs_add kernel/locking/lockdep.c:2507 [inline] validate_chain+0x483/0x870 kernel/locking/lockdep.c:2897 __lock_acquire+0x447/0x7d0 kernel/locking/lockdep.c:3880 lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2b/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] aio_poll fs/aio.c:1751 [inline] __io_submit_one.constprop.0+0x203/0x5b0 fs/aio.c:1825 io_submit_one+0x146/0x5b0 fs/aio.c:1862 __do_sys_io_submit fs/aio.c:1921 [inline] __se_sys_io_submit+0x8e/0x270 fs/aio.c:1891 __x64_sys_io_submit+0x15/0x20 fs/aio.c:1891 do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe the dependencies between the lock to be acquired and SOFTIRQ-irq-unsafe lock: -> (&fiq->waitq){+.+.} { HARDIRQ-ON-W at: lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2b/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] flush_bg_queue+0x91/0xf0 fs/fuse/dev.c:415 fuse_request_queue_background+0xd1/0x140 fs/fuse/dev.c:676 fuse_request_send_background+0x27/0x60 fs/fuse/dev.c:687 fuse_send_init fs/fuse/inode.c:986 [inline] fuse_fill_super+0x656/0x681 fs/fuse/inode.c:1211 mount_nodev+0x42/0x90 fs/super.c:1329 fuse_mount+0x13/0x20 fs/fuse/inode.c:1236 legacy_get_tree+0x2c/0x50 fs/fs_context.c:661 vfs_get_tree+0x22/0xc0 fs/super.c:1413 do_new_mount fs/namespace.c:2791 [inline] do_mount+0x7e3/0xa60 fs/namespace.c:3111 ksys_mount+0x7d/0xd0 fs/namespace.c:3320 __do_sys_mount fs/namespace.c:3334 [inline] __se_sys_mount fs/namespace.c:3331 [inline] __x64_sys_mount+0x20/0x30 fs/namespace.c:3331 do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe SOFTIRQ-ON-W at: lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412 
__raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2b/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] flush_bg_queue+0x91/0xf0 fs/fuse/dev.c:415 fuse_request_queue_background+0xd1/0x140 fs/fuse/dev.c:676 fuse_request_send_background+0x27/0x60 fs/fuse/dev.c:687 fuse_send_init fs/fuse/inode.c:986 [inline] fuse_fill_super+0x656/0x681 fs/fuse/inode.c:1211 mount_nodev+0x42/0x90 fs/super.c:1329 fuse_mount+0x13/0x20 fs/fuse/inode.c:1236 legacy_get_tree+0x2c/0x50 fs/fs_context.c:661 vfs_get_tree+0x22/0xc0 fs/super.c:1413 do_new_mount fs/namespace.c:2791 [inline] do_mount+0x7e3/0xa60 fs/namespace.c:3111 ksys_mount+0x7d/0xd0 fs/namespace.c:3320 __do_sys_mount fs/namespace.c:3334 [inline] __se_sys_mount fs/namespace.c:3331 [inline] __x64_sys_mount+0x20/0x30 fs/namespace.c:3331 do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe INITIAL USE at: lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2b/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] flush_bg_queue+0x91/0xf0 fs/fuse/dev.c:415 fuse_request_queue_background+0xd1/0x140 fs/fuse/dev.c:676 fuse_request_send_background+0x27/0x60 fs/fuse/dev.c:687 fuse_send_init fs/fuse/inode.c:986 [inline] fuse_fill_super+0x656/0x681 fs/fuse/inode.c:1211 mount_nodev+0x42/0x90 fs/super.c:1329 fuse_mount+0x13/0x20 fs/fuse/inode.c:1236 legacy_get_tree+0x2c/0x50 fs/fs_context.c:661 vfs_get_tree+0x22/0xc0 fs/super.c:1413 do_new_mount fs/namespace.c:2791 [inline] do_mount+0x7e3/0xa60 fs/namespace.c:3111 ksys_mount+0x7d/0xd0 fs/namespace.c:3320 __do_sys_mount fs/namespace.c:3334 [inline] __se_sys_mount fs/namespace.c:3331 [inline] __x64_sys_mount+0x20/0x30 fs/namespace.c:3331 do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe } ... key at: [<ffffffff82863ea0>] __key.40870+0x0/0x10 ... 
acquired at: check_prev_add+0xa7/0x950 kernel/locking/lockdep.c:2409 check_prevs_add kernel/locking/lockdep.c:2507 [inline] validate_chain+0x483/0x870 kernel/locking/lockdep.c:2897 __lock_acquire+0x447/0x7d0 kernel/locking/lockdep.c:3880 lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2b/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] aio_poll fs/aio.c:1751 [inline] __io_submit_one.constprop.0+0x203/0x5b0 fs/aio.c:1825 io_submit_one+0x146/0x5b0 fs/aio.c:1862 __do_sys_io_submit fs/aio.c:1921 [inline] __se_sys_io_submit+0x8e/0x270 fs/aio.c:1891 __x64_sys_io_submit+0x15/0x20 fs/aio.c:1891 do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe stack backtrace: CPU: 1 PID: 150 Comm: syz_fuse Not tainted 5.3.0-rc7 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x70/0x9a lib/dump_stack.c:113 print_bad_irq_dependency.cold+0x380/0x3d8 kernel/locking/lockdep.c:2025 check_irq_usage+0x337/0x400 kernel/locking/lockdep.c:2223 check_prev_add+0xa7/0x950 kernel/locking/lockdep.c:2409 check_prevs_add kernel/locking/lockdep.c:2507 [inline] validate_chain+0x483/0x870 kernel/locking/lockdep.c:2897 __lock_acquire+0x447/0x7d0 kernel/locking/lockdep.c:3880 lock_acquire+0x99/0x180 kernel/locking/lockdep.c:4412 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2b/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:338 [inline] aio_poll fs/aio.c:1751 [inline] __io_submit_one.constprop.0+0x203/0x5b0 fs/aio.c:1825 io_submit_one+0x146/0x5b0 fs/aio.c:1862 __do_sys_io_submit fs/aio.c:1921 [inline] __se_sys_io_submit+0x8e/0x270 fs/aio.c:1891 __x64_sys_io_submit+0x15/0x20 fs/aio.c:1891 do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:296 
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f1bbde7ec6d
Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f3 51 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffffa3a70e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1
RAX: ffffffffffffffda RBX: 00007ffffa3a7140 RCX: 00007f1bbde7ec6d
RDX: 00007ffffa3a70f8 RSI: 0000000000000001 RDI: 00007f1bbe27d000
RBP: 0000559337c29290 R08: 0000000000000000 R09: 00007f1bbe27d000
R10: 0000000000000029 R11: 0000000000000246 R12: 0000559337c29190
R13: 00007ffffa3a72b0 R14: 0000000000000000 R15: 0000000000000000
* Re: [PATCH] fuse: disable irqs for fuse_iqueue::waitq.lock
  2019-09-03 13:39 ` Eric Biggers
@ 2019-09-04 14:29 ` Miklos Szeredi
  2019-09-06  4:43   ` Eric Biggers
  0 siblings, 1 reply; 7+ messages in thread
From: Miklos Szeredi @ 2019-09-04 14:29 UTC (permalink / raw)
To: Eric Biggers
Cc: linux-fsdevel, linux-aio, Benjamin LaHaise, syzkaller-bugs, stable, Christoph Hellwig

On Tue, Sep 3, 2019 at 3:39 PM Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Tue, Sep 03, 2019 at 09:31:29AM +0200, Miklos Szeredi wrote:
> > On Fri, Aug 23, 2019 at 1:35 AM Eric Biggers <ebiggers@kernel.org> wrote:
> > >
> > > From: Eric Biggers <ebiggers@google.com>
> > >
> > > When IOCB_CMD_POLL is used on the FUSE device, aio_poll() disables IRQs
> > > and takes kioctx::ctx_lock, then fuse_iqueue::waitq.lock.
> >
> > Not in -linus.
> >
> > Which tree was this reproduced with?
> >
> > Thanks,
> > Miklos
>
> Linus's tree.  Here's the full symbolized output on v5.3-rc7:

Okay.

TBH, I find the fix disgusting.  It's confusing to sprinkle code that
has absolutely nothing to do with interrupts with spin_lock_irq()
calls.

I think the lock/unlock calls should at least be done with a helper
with a comment explaining why disabling interrupts is needed (though I
have not managed to understand why aio needs to actually mess with the
waitq lock...)

Probably a better fix would be to just use a separate spinlock to
avoid the need to disable interrupts in cases where it's not
necessary.

Thanks,
Miklos
* Re: [PATCH] fuse: disable irqs for fuse_iqueue::waitq.lock
  2019-09-04 14:29 ` Miklos Szeredi
@ 2019-09-06  4:43 ` Eric Biggers
  2019-09-06  6:58   ` Miklos Szeredi
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Biggers @ 2019-09-06 4:43 UTC (permalink / raw)
To: Miklos Szeredi
Cc: linux-fsdevel, linux-aio, Benjamin LaHaise, syzkaller-bugs, stable, Christoph Hellwig

On Wed, Sep 04, 2019 at 04:29:03PM +0200, Miklos Szeredi wrote:
> On Tue, Sep 3, 2019 at 3:39 PM Eric Biggers <ebiggers@kernel.org> wrote:
> >
> > On Tue, Sep 03, 2019 at 09:31:29AM +0200, Miklos Szeredi wrote:
> > > On Fri, Aug 23, 2019 at 1:35 AM Eric Biggers <ebiggers@kernel.org> wrote:
> > > >
> > > > From: Eric Biggers <ebiggers@google.com>
> > > >
> > > > When IOCB_CMD_POLL is used on the FUSE device, aio_poll() disables IRQs
> > > > and takes kioctx::ctx_lock, then fuse_iqueue::waitq.lock.
> > >
> > > Not in -linus.
> > >
> > > Which tree was this reproduced with?
> > >
> > > Thanks,
> > > Miklos
> >
> > Linus's tree.  Here's the full symbolized output on v5.3-rc7:
>
> Okay.
>
> TBH, I find the fix disgusting.  It's confusing to sprinkle code that
> has absolutely nothing to do with interrupts with spin_lock_irq()
> calls.
>
> I think the lock/unlock calls should at least be done with a helper
> with a comment explaining why disabling interrupts is needed (though I
> have not managed to understand why aio needs to actually mess with the
> waitq lock...)

The aio code is doing a poll(), so it needs to use the wait queue.

> Probably a better fix would be to just use a separate spinlock to
> avoid the need to disable interrupts in cases where it's not
> necessary.

Well, the below is what a separate lock would look like.  Note that it
actually still disables IRQs in some places; it's just hidden away in
the nested spin_lock_irqsave() in wake_up().  Likewise, adding something
to the fuse_iqueue then requires taking 2 spin locks -- one explicit,
and one hidden in wake_up().
Is this the solution you'd prefer? diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index ea8237513dfa..1d9d555f2699 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -377,7 +377,7 @@ static void queue_request(struct fuse_iqueue *fiq, struct fuse_req *req) req->in.h.len = sizeof(struct fuse_in_header) + len_args(req->in.numargs, (struct fuse_arg *) req->in.args); list_add_tail(&req->list, &fiq->pending); - wake_up_locked(&fiq->waitq); + wake_up(&fiq->waitq); kill_fasync(&fiq->fasync, SIGIO, POLL_IN); } @@ -389,16 +389,16 @@ void fuse_queue_forget(struct fuse_conn *fc, struct fuse_forget_link *forget, forget->forget_one.nodeid = nodeid; forget->forget_one.nlookup = nlookup; - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); if (fiq->connected) { fiq->forget_list_tail->next = forget; fiq->forget_list_tail = forget; - wake_up_locked(&fiq->waitq); + wake_up(&fiq->waitq); kill_fasync(&fiq->fasync, SIGIO, POLL_IN); } else { kfree(forget); } - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); } static void flush_bg_queue(struct fuse_conn *fc) @@ -412,10 +412,10 @@ static void flush_bg_queue(struct fuse_conn *fc) req = list_first_entry(&fc->bg_queue, struct fuse_req, list); list_del(&req->list); fc->active_background++; - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); req->in.h.unique = fuse_get_unique(fiq); queue_request(fiq, req); - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); } } @@ -439,9 +439,9 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req) * smp_mb() from queue_interrupt(). 
*/ if (!list_empty(&req->intr_entry)) { - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); list_del_init(&req->intr_entry); - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); } WARN_ON(test_bit(FR_PENDING, &req->flags)); WARN_ON(test_bit(FR_SENT, &req->flags)); @@ -483,10 +483,10 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req) static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req) { - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); /* Check for we've sent request to interrupt this req */ if (unlikely(!test_bit(FR_INTERRUPTED, &req->flags))) { - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); return -EINVAL; } @@ -499,13 +499,13 @@ static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req) smp_mb(); if (test_bit(FR_FINISHED, &req->flags)) { list_del_init(&req->intr_entry); - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); return 0; } - wake_up_locked(&fiq->waitq); + wake_up(&fiq->waitq); kill_fasync(&fiq->fasync, SIGIO, POLL_IN); } - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); return 0; } @@ -535,16 +535,16 @@ static void request_wait_answer(struct fuse_conn *fc, struct fuse_req *req) if (!err) return; - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); /* Request is not yet in userspace, bail out */ if (test_bit(FR_PENDING, &req->flags)) { list_del(&req->list); - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock);; __fuse_put_request(req); req->out.h.error = -EINTR; return; } - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); } /* @@ -559,9 +559,9 @@ static void __fuse_request_send(struct fuse_conn *fc, struct fuse_req *req) struct fuse_iqueue *fiq = &fc->iq; BUG_ON(test_bit(FR_BACKGROUND, &req->flags)); - spin_lock(&fiq->waitq.lock); + spin_lock_irq(&fiq->lock); if (!fiq->connected) { - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); req->out.h.error = -ENOTCONN; } else { req->in.h.unique = fuse_get_unique(fiq); @@ -569,7 
+569,7 @@ static void __fuse_request_send(struct fuse_conn *fc, struct fuse_req *req) /* acquire extra reference, since request is still needed after request_end() */ __fuse_get_request(req); - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); request_wait_answer(fc, req); /* Pairs with smp_wmb() in request_end() */ @@ -700,12 +700,12 @@ static int fuse_request_send_notify_reply(struct fuse_conn *fc, __clear_bit(FR_ISREPLY, &req->flags); req->in.h.unique = unique; - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); if (fiq->connected) { queue_request(fiq, req); err = 0; } - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); return err; } @@ -1149,12 +1149,12 @@ static int request_pending(struct fuse_iqueue *fiq) * Unlike other requests this is assembled on demand, without a need * to allocate a separate fuse_req structure. * - * Called with fiq->waitq.lock held, releases it + * Called with fiq->lock held, releases it */ static int fuse_read_interrupt(struct fuse_iqueue *fiq, struct fuse_copy_state *cs, size_t nbytes, struct fuse_req *req) -__releases(fiq->waitq.lock) +__releases(fiq->lock) { struct fuse_in_header ih; struct fuse_interrupt_in arg; @@ -1169,7 +1169,7 @@ __releases(fiq->waitq.lock) ih.unique = (req->in.h.unique | FUSE_INT_REQ_BIT); arg.unique = req->in.h.unique; - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); if (nbytes < reqsize) return -EINVAL; @@ -1206,7 +1206,7 @@ static struct fuse_forget_link *dequeue_forget(struct fuse_iqueue *fiq, static int fuse_read_single_forget(struct fuse_iqueue *fiq, struct fuse_copy_state *cs, size_t nbytes) -__releases(fiq->waitq.lock) +__releases(fiq->lock) { int err; struct fuse_forget_link *forget = dequeue_forget(fiq, 1, NULL); @@ -1220,7 +1220,7 @@ __releases(fiq->waitq.lock) .len = sizeof(ih) + sizeof(arg), }; - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); kfree(forget); if (nbytes < ih.len) return -EINVAL; @@ -1238,7 +1238,7 @@ __releases(fiq->waitq.lock) static int 
fuse_read_batch_forget(struct fuse_iqueue *fiq, struct fuse_copy_state *cs, size_t nbytes) -__releases(fiq->waitq.lock) +__releases(fiq->lock) { int err; unsigned max_forgets; @@ -1252,13 +1252,13 @@ __releases(fiq->waitq.lock) }; if (nbytes < ih.len) { - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); return -EINVAL; } max_forgets = (nbytes - ih.len) / sizeof(struct fuse_forget_one); head = dequeue_forget(fiq, max_forgets, &count); - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); arg.count = count; ih.len += count * sizeof(struct fuse_forget_one); @@ -1288,7 +1288,7 @@ __releases(fiq->waitq.lock) static int fuse_read_forget(struct fuse_conn *fc, struct fuse_iqueue *fiq, struct fuse_copy_state *cs, size_t nbytes) -__releases(fiq->waitq.lock) +__releases(fiq->lock) { if (fc->minor < 16 || fiq->forget_list_head.next->next == NULL) return fuse_read_single_forget(fiq, cs, nbytes); @@ -1318,16 +1318,19 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file, unsigned int hash; restart: - spin_lock(&fiq->waitq.lock); - err = -EAGAIN; - if ((file->f_flags & O_NONBLOCK) && fiq->connected && - !request_pending(fiq)) - goto err_unlock; + for (;;) { + spin_lock(&fiq->lock); + if (!fiq->connected || request_pending(fiq)) + break; + spin_unlock(&fiq->lock); - err = wait_event_interruptible_exclusive_locked(fiq->waitq, + if (file->f_flags & O_NONBLOCK) + return -EAGAIN; + err = wait_event_interruptible_exclusive(fiq->waitq, !fiq->connected || request_pending(fiq)); - if (err) - goto err_unlock; + if (err) + return err; + } if (!fiq->connected) { err = fc->aborted ? 
-ECONNABORTED : -ENODEV; @@ -1351,7 +1354,7 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file, req = list_entry(fiq->pending.next, struct fuse_req, list); clear_bit(FR_PENDING, &req->flags); list_del_init(&req->list); - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); in = &req->in; reqsize = in->h.len; @@ -1409,7 +1412,7 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file, return err; err_unlock: - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); return err; } @@ -2121,12 +2124,12 @@ static __poll_t fuse_dev_poll(struct file *file, poll_table *wait) fiq = &fud->fc->iq; poll_wait(file, &fiq->waitq, wait); - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); if (!fiq->connected) mask = EPOLLERR; else if (request_pending(fiq)) mask |= EPOLLIN | EPOLLRDNORM; - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); return mask; } @@ -2221,15 +2224,15 @@ void fuse_abort_conn(struct fuse_conn *fc) flush_bg_queue(fc); spin_unlock(&fc->bg_lock); - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); fiq->connected = 0; list_for_each_entry(req, &fiq->pending, list) clear_bit(FR_PENDING, &req->flags); list_splice_tail_init(&fiq->pending, &to_end); while (forget_pending(fiq)) kfree(dequeue_forget(fiq, 1, NULL)); - wake_up_all_locked(&fiq->waitq); - spin_unlock(&fiq->waitq.lock); + wake_up_all(&fiq->waitq); + spin_unlock(&fiq->lock); kill_fasync(&fiq->fasync, SIGIO, POLL_IN); end_polls(fc); wake_up_all(&fc->blocked_waitq); diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index 24dbca777775..5de7fd0fd6ef 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -450,6 +450,9 @@ struct fuse_iqueue { /** Connection established */ unsigned connected; + /** Lock protecting accessess to members of this structure */ + spinlock_t lock; + /** Readers of the connection are waiting on this */ wait_queue_head_t waitq; diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index 4bb885b0f032..987877860c01 100644 --- 
a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -582,6 +582,7 @@ static int fuse_show_options(struct seq_file *m, struct dentry *root) static void fuse_iqueue_init(struct fuse_iqueue *fiq) { memset(fiq, 0, sizeof(struct fuse_iqueue)); + spin_lock_init(&fiq->lock); init_waitqueue_head(&fiq->waitq); INIT_LIST_HEAD(&fiq->pending); INIT_LIST_HEAD(&fiq->interrupts); -- 2.23.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] fuse: disable irqs for fuse_iqueue::waitq.lock
  2019-09-06  4:43 ` Eric Biggers
@ 2019-09-06  6:58 ` Miklos Szeredi
  2019-09-09  3:15   ` [PATCH v2] fuse: fix deadlock with aio poll and fuse_iqueue::waitq.lock Eric Biggers
  0 siblings, 1 reply; 7+ messages in thread
From: Miklos Szeredi @ 2019-09-06 6:58 UTC (permalink / raw)
To: Eric Biggers
Cc: linux-fsdevel, linux-aio, Benjamin LaHaise, syzkaller-bugs, stable, Christoph Hellwig

On Fri, Sep 6, 2019 at 6:43 AM Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Wed, Sep 04, 2019 at 04:29:03PM +0200, Miklos Szeredi wrote:
> > TBH, I find the fix disgusting.  It's confusing to sprinkle code that
> > has absolutely nothing to do with interrupts with spin_lock_irq()
> > calls.
> >
> > I think the lock/unlock calls should at least be done with a helper
> > with a comment explaining why disabling interrupts is needed (though I
> > have not managed to understand why aio needs to actually mess with the
> > waitq lock...)
>
> The aio code is doing a poll(), so it needs to use the wait queue.

Doesn't explain why the irq-disabled nested locking is needed in
aio_poll().  poll/select manage to do that without messing with waitq
internals.  How is aio poll different?

> > Probably a better fix would be to just use a separate spinlock to
> > avoid the need to disable interrupts in cases where it's not
> > necessary.
>
> Well, the below is what a separate lock would look like.  Note that it actually
> still disables IRQs in some places; it's just hidden away in the nested
> spin_lock_irqsave() in wake_up().  Likewise, adding something to the fuse_iqueue
> then requires taking 2 spin locks -- one explicit, and one hidden in wake_up().

Right, that's exactly why the waitq lock was used.

> Is this the solution you'd prefer?

I'd actually prefer if aio was fixed.  But I guess that's not
realistic, so yes, the below patch looks okay.  If fiq->lock is in the
same cacheline as fiq->waitq then it shouldn't make a difference.
Thanks,
Miklos
* [PATCH v2] fuse: fix deadlock with aio poll and fuse_iqueue::waitq.lock
  2019-09-06  6:58 ` Miklos Szeredi
@ 2019-09-09  3:15 ` Eric Biggers
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Biggers @ 2019-09-09 3:15 UTC (permalink / raw)
To: Miklos Szeredi, linux-fsdevel
Cc: linux-aio, Benjamin LaHaise, syzkaller-bugs, Christoph Hellwig

From: Eric Biggers <ebiggers@google.com>

When IOCB_CMD_POLL is used on the FUSE device, aio_poll() disables IRQs
and takes kioctx::ctx_lock, then fuse_iqueue::waitq.lock.

This may have to wait for fuse_iqueue::waitq.lock to be released by one
of many places that take it with IRQs enabled.  Since the IRQ handler
may take kioctx::ctx_lock, lockdep reports that a deadlock is possible.

Fix it by protecting the state of struct fuse_iqueue with a separate
spinlock, and only accessing fuse_iqueue::waitq using the versions of
the waitqueue functions which do IRQ-safe locking internally.

Reproducer:

	#include <fcntl.h>
	#include <stdio.h>
	#include <sys/mount.h>
	#include <sys/stat.h>
	#include <sys/syscall.h>
	#include <unistd.h>
	#include <linux/aio_abi.h>

	int main()
	{
		char opts[128];
		int fd = open("/dev/fuse", O_RDWR);
		aio_context_t ctx = 0;
		struct iocb cb = { .aio_lio_opcode = IOCB_CMD_POLL, .aio_fildes = fd };
		struct iocb *cbp = &cb;

		sprintf(opts, "fd=%d,rootmode=040000,user_id=0,group_id=0", fd);
		mkdir("mnt", 0700);
		mount("foo", "mnt", "fuse", 0, opts);
		syscall(__NR_io_setup, 1, &ctx);
		syscall(__NR_io_submit, ctx, 1, &cbp);
	}

Beginning of lockdep output:

	=====================================================
	WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
	5.3.0-rc5 #9 Not tainted
	-----------------------------------------------------
	syz_fuse/135 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
	000000003590ceda (&fiq->waitq){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
	000000003590ceda (&fiq->waitq){+.+.}, at: aio_poll fs/aio.c:1751 [inline]
	000000003590ceda (&fiq->waitq){+.+.}, at:
__io_submit_one.constprop.0+0x203/0x5b0 fs/aio.c:1825 and this task is already holding: 0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:363 [inline] 0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1749 [inline] 0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one.constprop.0+0x1f4/0x5b0 fs/aio.c:1825 which would create a new lock dependency: (&(&ctx->ctx_lock)->rlock){..-.} -> (&fiq->waitq){+.+.} but this new dependency connects a SOFTIRQ-irq-safe lock: (&(&ctx->ctx_lock)->rlock){..-.} [...] Reported-by: syzbot+af05535bb79520f95431@syzkaller.appspotmail.com Reported-by: syzbot+d86c4426a01f60feddc7@syzkaller.appspotmail.com Fixes: bfe4037e722e ("aio: implement IOCB_CMD_POLL") Cc: <stable@vger.kernel.org> # v4.19+ Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Biggers <ebiggers@google.com> --- fs/fuse/dev.c | 93 +++++++++++++++++++++++++----------------------- fs/fuse/fuse_i.h | 3 ++ fs/fuse/inode.c | 1 + 3 files changed, 52 insertions(+), 45 deletions(-) diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index ea8237513dfa..186468fba82e 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -377,7 +377,7 @@ static void queue_request(struct fuse_iqueue *fiq, struct fuse_req *req) req->in.h.len = sizeof(struct fuse_in_header) + len_args(req->in.numargs, (struct fuse_arg *) req->in.args); list_add_tail(&req->list, &fiq->pending); - wake_up_locked(&fiq->waitq); + wake_up(&fiq->waitq); kill_fasync(&fiq->fasync, SIGIO, POLL_IN); } @@ -389,16 +389,16 @@ void fuse_queue_forget(struct fuse_conn *fc, struct fuse_forget_link *forget, forget->forget_one.nodeid = nodeid; forget->forget_one.nlookup = nlookup; - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); if (fiq->connected) { fiq->forget_list_tail->next = forget; fiq->forget_list_tail = forget; - wake_up_locked(&fiq->waitq); + wake_up(&fiq->waitq); kill_fasync(&fiq->fasync, SIGIO, POLL_IN); } else { kfree(forget); } - 
spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); } static void flush_bg_queue(struct fuse_conn *fc) @@ -412,10 +412,10 @@ static void flush_bg_queue(struct fuse_conn *fc) req = list_first_entry(&fc->bg_queue, struct fuse_req, list); list_del(&req->list); fc->active_background++; - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); req->in.h.unique = fuse_get_unique(fiq); queue_request(fiq, req); - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); } } @@ -439,9 +439,9 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req) * smp_mb() from queue_interrupt(). */ if (!list_empty(&req->intr_entry)) { - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); list_del_init(&req->intr_entry); - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); } WARN_ON(test_bit(FR_PENDING, &req->flags)); WARN_ON(test_bit(FR_SENT, &req->flags)); @@ -483,10 +483,10 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req) static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req) { - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); /* Check for we've sent request to interrupt this req */ if (unlikely(!test_bit(FR_INTERRUPTED, &req->flags))) { - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); return -EINVAL; } @@ -499,13 +499,13 @@ static int queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req) smp_mb(); if (test_bit(FR_FINISHED, &req->flags)) { list_del_init(&req->intr_entry); - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); return 0; } - wake_up_locked(&fiq->waitq); + wake_up(&fiq->waitq); kill_fasync(&fiq->fasync, SIGIO, POLL_IN); } - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); return 0; } @@ -535,16 +535,16 @@ static void request_wait_answer(struct fuse_conn *fc, struct fuse_req *req) if (!err) return; - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); /* Request is not yet in userspace, bail out */ if (test_bit(FR_PENDING, &req->flags)) { list_del(&req->list); - 
spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); __fuse_put_request(req); req->out.h.error = -EINTR; return; } - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); } /* @@ -559,9 +559,9 @@ static void __fuse_request_send(struct fuse_conn *fc, struct fuse_req *req) struct fuse_iqueue *fiq = &fc->iq; BUG_ON(test_bit(FR_BACKGROUND, &req->flags)); - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); if (!fiq->connected) { - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); req->out.h.error = -ENOTCONN; } else { req->in.h.unique = fuse_get_unique(fiq); @@ -569,7 +569,7 @@ static void __fuse_request_send(struct fuse_conn *fc, struct fuse_req *req) /* acquire extra reference, since request is still needed after request_end() */ __fuse_get_request(req); - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); request_wait_answer(fc, req); /* Pairs with smp_wmb() in request_end() */ @@ -700,12 +700,12 @@ static int fuse_request_send_notify_reply(struct fuse_conn *fc, __clear_bit(FR_ISREPLY, &req->flags); req->in.h.unique = unique; - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); if (fiq->connected) { queue_request(fiq, req); err = 0; } - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); return err; } @@ -1149,12 +1149,12 @@ static int request_pending(struct fuse_iqueue *fiq) * Unlike other requests this is assembled on demand, without a need * to allocate a separate fuse_req structure. 
* - * Called with fiq->waitq.lock held, releases it + * Called with fiq->lock held, releases it */ static int fuse_read_interrupt(struct fuse_iqueue *fiq, struct fuse_copy_state *cs, size_t nbytes, struct fuse_req *req) -__releases(fiq->waitq.lock) +__releases(fiq->lock) { struct fuse_in_header ih; struct fuse_interrupt_in arg; @@ -1169,7 +1169,7 @@ __releases(fiq->waitq.lock) ih.unique = (req->in.h.unique | FUSE_INT_REQ_BIT); arg.unique = req->in.h.unique; - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); if (nbytes < reqsize) return -EINVAL; @@ -1206,7 +1206,7 @@ static struct fuse_forget_link *dequeue_forget(struct fuse_iqueue *fiq, static int fuse_read_single_forget(struct fuse_iqueue *fiq, struct fuse_copy_state *cs, size_t nbytes) -__releases(fiq->waitq.lock) +__releases(fiq->lock) { int err; struct fuse_forget_link *forget = dequeue_forget(fiq, 1, NULL); @@ -1220,7 +1220,7 @@ __releases(fiq->waitq.lock) .len = sizeof(ih) + sizeof(arg), }; - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); kfree(forget); if (nbytes < ih.len) return -EINVAL; @@ -1238,7 +1238,7 @@ __releases(fiq->waitq.lock) static int fuse_read_batch_forget(struct fuse_iqueue *fiq, struct fuse_copy_state *cs, size_t nbytes) -__releases(fiq->waitq.lock) +__releases(fiq->lock) { int err; unsigned max_forgets; @@ -1252,13 +1252,13 @@ __releases(fiq->waitq.lock) }; if (nbytes < ih.len) { - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); return -EINVAL; } max_forgets = (nbytes - ih.len) / sizeof(struct fuse_forget_one); head = dequeue_forget(fiq, max_forgets, &count); - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); arg.count = count; ih.len += count * sizeof(struct fuse_forget_one); @@ -1288,7 +1288,7 @@ __releases(fiq->waitq.lock) static int fuse_read_forget(struct fuse_conn *fc, struct fuse_iqueue *fiq, struct fuse_copy_state *cs, size_t nbytes) -__releases(fiq->waitq.lock) +__releases(fiq->lock) { if (fc->minor < 16 || fiq->forget_list_head.next->next == 
NULL) return fuse_read_single_forget(fiq, cs, nbytes); @@ -1318,16 +1318,19 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file, unsigned int hash; restart: - spin_lock(&fiq->waitq.lock); - err = -EAGAIN; - if ((file->f_flags & O_NONBLOCK) && fiq->connected && - !request_pending(fiq)) - goto err_unlock; + for (;;) { + spin_lock(&fiq->lock); + if (!fiq->connected || request_pending(fiq)) + break; + spin_unlock(&fiq->lock); - err = wait_event_interruptible_exclusive_locked(fiq->waitq, + if (file->f_flags & O_NONBLOCK) + return -EAGAIN; + err = wait_event_interruptible_exclusive(fiq->waitq, !fiq->connected || request_pending(fiq)); - if (err) - goto err_unlock; + if (err) + return err; + } if (!fiq->connected) { err = fc->aborted ? -ECONNABORTED : -ENODEV; @@ -1351,7 +1354,7 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file, req = list_entry(fiq->pending.next, struct fuse_req, list); clear_bit(FR_PENDING, &req->flags); list_del_init(&req->list); - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); in = &req->in; reqsize = in->h.len; @@ -1409,7 +1412,7 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file, return err; err_unlock: - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); return err; } @@ -2121,12 +2124,12 @@ static __poll_t fuse_dev_poll(struct file *file, poll_table *wait) fiq = &fud->fc->iq; poll_wait(file, &fiq->waitq, wait); - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); if (!fiq->connected) mask = EPOLLERR; else if (request_pending(fiq)) mask |= EPOLLIN | EPOLLRDNORM; - spin_unlock(&fiq->waitq.lock); + spin_unlock(&fiq->lock); return mask; } @@ -2221,15 +2224,15 @@ void fuse_abort_conn(struct fuse_conn *fc) flush_bg_queue(fc); spin_unlock(&fc->bg_lock); - spin_lock(&fiq->waitq.lock); + spin_lock(&fiq->lock); fiq->connected = 0; list_for_each_entry(req, &fiq->pending, list) clear_bit(FR_PENDING, &req->flags); list_splice_tail_init(&fiq->pending, &to_end); while 
(forget_pending(fiq)) kfree(dequeue_forget(fiq, 1, NULL)); - wake_up_all_locked(&fiq->waitq); - spin_unlock(&fiq->waitq.lock); + wake_up_all(&fiq->waitq); + spin_unlock(&fiq->lock); kill_fasync(&fiq->fasync, SIGIO, POLL_IN); end_polls(fc); wake_up_all(&fc->blocked_waitq); diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index 24dbca777775..89bdc41e0d86 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -450,6 +450,9 @@ struct fuse_iqueue { /** Connection established */ unsigned connected; + /** Lock protecting accesses to members of this structure */ + spinlock_t lock; + /** Readers of the connection are waiting on this */ wait_queue_head_t waitq; diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index 4bb885b0f032..987877860c01 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -582,6 +582,7 @@ static int fuse_show_options(struct seq_file *m, struct dentry *root) static void fuse_iqueue_init(struct fuse_iqueue *fiq) { memset(fiq, 0, sizeof(struct fuse_iqueue)); + spin_lock_init(&fiq->lock); init_waitqueue_head(&fiq->waitq); INIT_LIST_HEAD(&fiq->pending); INIT_LIST_HEAD(&fiq->interrupts); -- 2.23.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2019-09-09  3:17 UTC | newest]

Thread overview: 7+ messages
2019-08-18 16:18 possible deadlock in io_submit_one (2) syzbot
     [not found] ` <20190822233529.4176-1-ebiggers@kernel.org>
2019-09-03  7:31   ` [PATCH] fuse: disable irqs for fuse_iqueue::waitq.lock Miklos Szeredi
2019-09-03 13:39     ` Eric Biggers
2019-09-04 14:29       ` Miklos Szeredi
2019-09-06  4:43         ` Eric Biggers
2019-09-06  6:58           ` Miklos Szeredi
2019-09-09  3:15             ` [PATCH v2] fuse: fix deadlock with aio poll and fuse_iqueue::waitq.lock Eric Biggers