From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.0 required=3.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3B2BDC10F12 for ; Mon, 15 Apr 2019 16:49:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id EBC722077C for ; Mon, 15 Apr 2019 16:49:32 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20150623.gappssmtp.com header.i=@kernel-dk.20150623.gappssmtp.com header.b="iD7RiUb6" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727738AbfDOQtb (ORCPT ); Mon, 15 Apr 2019 12:49:31 -0400 Received: from mail-it1-f196.google.com ([209.85.166.196]:56209 "EHLO mail-it1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727604AbfDOQt3 (ORCPT ); Mon, 15 Apr 2019 12:49:29 -0400 Received: by mail-it1-f196.google.com with SMTP id y134so28097726itc.5 for ; Mon, 15 Apr 2019 09:49:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-language:content-transfer-encoding; bh=zPfgrP+xMGw2jpeJZAt0Vf1te85+6udsGgr2e/tUZvk=; b=iD7RiUb66BNtjmp0f5xXv6gywfXIZPZUW0du5mq78eAFjsNUlF6i/oRG3O60BW6gfh bCZ1rp8UIVNMbLtGLsPOfX8uiewS2E579Ely88/5eMTQHi/bL8ZOnu5ax8nmKuEAOXoC WL7/nEXFarHSKdmlzUa6yxMAcLgh5wZZc9wjdjzaFhlbh9dUyYBbyiZhpE5kSBId7EIM pUlWjnnc/hqyo6nohvdVNCxwFg+hoWQHIbAyol95ahNjOysd3yjy4ExO/CThBf5P+5c+ nkNcWFy/kctIOKZlO0EuWUceYruKtO8qNWa5dG1Kca4yb4UT5ZNa/YJECow54nF0El6B KTRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=zPfgrP+xMGw2jpeJZAt0Vf1te85+6udsGgr2e/tUZvk=; b=kzhoq8VfmGIvYflvGm+/obxk2t36FVole2wDJn7kp3rs1jYDQ1wcmoQKPDCGKZ11CI BzjNab+EunvgY1mFQLehGfvynJtlH/3xMmEOhWFjtmi6R6AIhQu6vZoz8J83pC9JVs8l r+k5JGXfOJ/PdRG1o2xwTlxrwwDyWtvI5yct7BXqELN4fZ4CiX9D6kD2YWYWi2ycwx4m zU+qSdGPUMFJvVwCwvxX1TrC2bIS51gzrw8YOr+T5JbINMASX7bIwL23ndreZX0yvj1x 58HpriI/j5fEy4Dqfl/BKDdT50eE3+yXYopFXBJJnBYXjcFhyy0pfxxW+8IqwKDogEcx jYbg== X-Gm-Message-State: APjAAAVKpM7W/KvzGRhPjE4ztfXPea3JbwY4Ulp+K+SsExq/gtFXlaHG KbVDranI16fTnRUTXElaUaCBlnr9RRZTcA== X-Google-Smtp-Source: APXvYqyUwUQTh3kPrNpu5wsDhzJ388E+zYD+z5THBnUw7KntWxmU99ke6YzNWHZKVNpLGhMMZXEzyw== X-Received: by 2002:a24:1342:: with SMTP id 63mr26075175itz.69.1555346968015; Mon, 15 Apr 2019 09:49:28 -0700 (PDT) Received: from [192.168.1.158] ([216.160.245.98]) by smtp.gmail.com with ESMTPSA id u66sm19127979ioe.74.2019.04.15.09.49.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 15 Apr 2019 09:49:26 -0700 (PDT) Subject: Re: INFO: task hung in __io_uring_register To: syzbot , linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com, viro@zeniv.linux.org.uk References: <0000000000003393e6058691f589@google.com> From: Jens Axboe Message-ID: <0ed0d88e-268b-fc1c-0273-2f4063cf75a1@kernel.dk> Date: Mon, 15 Apr 2019 10:49:25 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 MIME-Version: 1.0 In-Reply-To: <0000000000003393e6058691f589@google.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 4/15/19 7:52 AM, syzbot wrote: > syzbot has found a reproducer for the following crash on: > > HEAD commit: dc4060a5 Linux 5.1-rc5 > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=17eacddd200000 > kernel config: https://syzkaller.appspot.com/x/.config?x=856fc6d0fbbeede9 > dashboard link: https://syzkaller.appspot.com/bug?extid=16dc03452dee970a0c3e > compiler: gcc (GCC) 9.0.0 20181231 (experimental) > userspace arch: i386 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14eb47ed200000 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=112175dd200000 > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: syzbot+16dc03452dee970a0c3e@syzkaller.appspotmail.com > > INFO: task syz-executor043:8083 blocked for more than 143 seconds. > Not tainted 5.1.0-rc5 #68 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > syz-executor043 D29912 8083 8063 0x20020004 > Call Trace: > context_switch kernel/sched/core.c:2877 [inline] > __schedule+0x813/0x1cc0 kernel/sched/core.c:3518 > schedule+0x92/0x180 kernel/sched/core.c:3562 > schedule_timeout+0x8ca/0xfd0 kernel/time/timer.c:1779 > do_wait_for_common kernel/sched/completion.c:83 [inline] > __wait_for_common kernel/sched/completion.c:104 [inline] > wait_for_common kernel/sched/completion.c:115 [inline] > wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136 > __io_uring_register+0xb6/0x1fd0 fs/io_uring.c:2929 > __do_sys_io_uring_register fs/io_uring.c:2979 [inline] > __se_sys_io_uring_register fs/io_uring.c:2961 [inline] > __ia32_sys_io_uring_register+0x193/0x1f0 fs/io_uring.c:2961 > do_syscall_32_irqs_on arch/x86/entry/common.c:326 [inline] > do_fast_syscall_32+0x281/0xc98 arch/x86/entry/common.c:397 > entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139 > RIP: 0023:0xf7f06869 > Code: Bad RIP value. > RSP: 002b:00000000f7f021ec EFLAGS: 00000296 ORIG_RAX: 00000000000001ab > RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000000001 > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000014a > RBP: 00000000200000c0 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 > R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > INFO: task syz-executor043:8084 blocked for more than 143 seconds. > Not tainted 5.1.0-rc5 #68 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > syz-executor043 D30320 8084 8063 0x20020004 > Call Trace: > context_switch kernel/sched/core.c:2877 [inline] > __schedule+0x813/0x1cc0 kernel/sched/core.c:3518 > schedule+0x92/0x180 kernel/sched/core.c:3562 > schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:3620 > __mutex_lock_common kernel/locking/mutex.c:1002 [inline] > __mutex_lock+0x726/0x1310 kernel/locking/mutex.c:1072 > mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1087 > __do_sys_io_uring_enter fs/io_uring.c:2678 [inline] > __se_sys_io_uring_enter fs/io_uring.c:2637 [inline] > __ia32_sys_io_uring_enter+0x67f/0xac0 fs/io_uring.c:2637 > do_syscall_32_irqs_on arch/x86/entry/common.c:326 [inline] > do_fast_syscall_32+0x281/0xc98 arch/x86/entry/common.c:397 > entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139 > RIP: 0023:0xf7f06869 > Code: Bad RIP value. > RSP: 002b:00000000f7ee11ec EFLAGS: 00000292 ORIG_RAX: 00000000000001aa > RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000010005 > RDX: 0000000000000002 RSI: 0000000000000003 RDI: 0000000000000000 > RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 > R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > > Showing all locks held in the system: > 1 lock held by khungtaskd/1042: > #0: 00000000fbdbf33e (rcu_read_lock){....}, at: > debug_show_all_locks+0x5f/0x27e kernel/locking/lockdep.c:5056 > 2 locks held by rsyslogd/7952: > #0: 000000001bb2c5a7 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0xee/0x110 > fs/file.c:801 > #1: 000000007392ce44 (fs_reclaim){+.+.}, at: is_bpf_text_address+0x0/0x170 > kernel/bpf/core.c:697 > 2 locks held by getty/8041: > #0: 00000000e7d67802 (&tty->ldisc_sem){++++}, at: > ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:341 > #1: 0000000026520570 (&ldata->atomic_read_lock){+.+.}, at: > n_tty_read+0x232/0x1b70 drivers/tty/n_tty.c:2156 > 2 locks held by getty/8042: > #0: 000000004426012c (&tty->ldisc_sem){++++}, at: > ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:341 > #1: 0000000004c02a9e (&ldata->atomic_read_lock){+.+.}, at: > n_tty_read+0x232/0x1b70 drivers/tty/n_tty.c:2156 > 2 locks held by getty/8043: > #0: 00000000e0209856 (&tty->ldisc_sem){++++}, at: > ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:341 > #1: 000000009f921bda (&ldata->atomic_read_lock){+.+.}, at: > n_tty_read+0x232/0x1b70 drivers/tty/n_tty.c:2156 > 2 locks held by getty/8044: > #0: 0000000000de2ea5 (&tty->ldisc_sem){++++}, at: > ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:341 > #1: 000000002674bf52 (&ldata->atomic_read_lock){+.+.}, at: > n_tty_read+0x232/0x1b70 drivers/tty/n_tty.c:2156 > 2 locks held by getty/8045: > #0: 00000000abbba1ef (&tty->ldisc_sem){++++}, at: > ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:341 > #1: 00000000ef32475e (&ldata->atomic_read_lock){+.+.}, at: > n_tty_read+0x232/0x1b70 drivers/tty/n_tty.c:2156 > 2 locks held by getty/8046: > #0: 000000008d742c83 (&tty->ldisc_sem){++++}, at: > ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:341 > #1: 00000000fa255b8a (&ldata->atomic_read_lock){+.+.}, at: > n_tty_read+0x232/0x1b70 drivers/tty/n_tty.c:2156 > 2 locks held by getty/8047: > #0: 00000000d3d56b2b (&tty->ldisc_sem){++++}, at: > ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:341 > #1: 0000000015242045 (&ldata->atomic_read_lock){+.+.}, at: > n_tty_read+0x232/0x1b70 drivers/tty/n_tty.c:2156 > 1 lock held by syz-executor043/8083: > #0: 000000005524ed18 (&ctx->uring_lock){+.+.}, at: > __do_sys_io_uring_register fs/io_uring.c:2978 [inline] > #0: 000000005524ed18 (&ctx->uring_lock){+.+.}, at: > __se_sys_io_uring_register fs/io_uring.c:2961 [inline] > #0: 000000005524ed18 (&ctx->uring_lock){+.+.}, at: > __ia32_sys_io_uring_register+0x182/0x1f0 fs/io_uring.c:2961 > 1 lock held by syz-executor043/8084: > #0: 000000005524ed18 (&ctx->uring_lock){+.+.}, at: __do_sys_io_uring_enter > fs/io_uring.c:2678 [inline] > #0: 000000005524ed18 (&ctx->uring_lock){+.+.}, at: __se_sys_io_uring_enter > fs/io_uring.c:2637 [inline] > #0: 000000005524ed18 (&ctx->uring_lock){+.+.}, at: > __ia32_sys_io_uring_enter+0x67f/0xac0 fs/io_uring.c:2637 > > ============================================= > > NMI backtrace for cpu 1 > CPU: 1 PID: 1042 Comm: khungtaskd Not tainted 5.1.0-rc5 #68 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > Call Trace: > __dump_stack lib/dump_stack.c:77 [inline] > dump_stack+0x172/0x1f0 lib/dump_stack.c:113 > nmi_cpu_backtrace.cold+0x63/0xa4 lib/nmi_backtrace.c:101 > nmi_trigger_cpumask_backtrace+0x1be/0x236 lib/nmi_backtrace.c:62 > arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38 > trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline] > check_hung_uninterruptible_tasks kernel/hung_task.c:204 [inline] > watchdog+0x9b7/0xec0 kernel/hung_task.c:288 > kthread+0x357/0x430 kernel/kthread.c:253 > ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352 > Sending NMI from CPU 1 to CPUs 0: This is a deadlock due to io_uring_register() waiting for existing users to exit. But it does so with the io_uring mutex held, and existing callers to io_uring_enter() may need that mutex to make progress. Hence we're deadlocked - one side holding the mutex and waiting for users to exit, the other side waiting on the mutex to make progress and then exit. This should fix it. diff --git a/fs/io_uring.c b/fs/io_uring.c index f4ddb9d23241..b35300e4c9a7 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -2929,11 +2929,23 @@ SYSCALL_DEFINE2(io_uring_setup, u32, entries, static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode, void __user *arg, unsigned nr_args) + __releases(ctx->uring_lock) + __acquires(ctx->uring_lock) { int ret; percpu_ref_kill(&ctx->refs); + + /* + * Drop uring mutex before waiting for references to exit. If another + * thread is currently inside io_uring_enter() it might need to grab + * the uring_lock to make progress. If we hold it here across the drain + * wait, then we can deadlock. It's safe to drop the mutex here, since + * no new references will come in after we've killed the percpu ref. + */ + mutex_unlock(&ctx->uring_lock); wait_for_completion(&ctx->ctx_done); + mutex_lock(&ctx->uring_lock); switch (opcode) { case IORING_REGISTER_BUFFERS: -- Jens Axboe