From: Pavel Begunkov <asml.silence@gmail.com>
To: Dan Carpenter <dan.carpenter@oracle.com>,
Hillf Danton <hdanton@sina.com>
Cc: syzbot <syzbot+66243bb7126c410cefe6@syzkaller.appspotmail.com>,
axboe@kernel.dk, io-uring@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
syzkaller-bugs@googlegroups.com, viro@zeniv.linux.org.uk
Subject: Re: INFO: rcu detected stall in io_uring_release
Date: Mon, 20 Apr 2020 15:57:11 +0300
Message-ID: <98a6f295-c7b4-390b-c618-b5f0043f4c1a@gmail.com>
In-Reply-To: <20200420114719.GA2659@kadam>
On 4/20/2020 2:47 PM, Dan Carpenter wrote:
> On Sun, Apr 19, 2020 at 12:06:26PM +0800, Hillf Danton wrote:
>>
>> Sat, 18 Apr 2020 11:59:13 -0700
>>>
>>> syzbot found the following crash on:
>>>
>>> HEAD commit: 8f3d9f35 Linux 5.7-rc1
>>> git tree: upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=115720c3e00000
>>> kernel config: https://syzkaller.appspot.com/x/.config?x=5d351a1019ed81a2
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=66243bb7126c410cefe6
>>> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
>>>
>>> Unfortunately, I don't have any reproducer for this crash yet.
>>>
>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>> Reported-by: syzbot+66243bb7126c410cefe6@syzkaller.appspotmail.com
>>>
>>> rcu: INFO: rcu_preempt self-detected stall on CPU
>>> rcu: 0-....: (10500 ticks this GP) idle=57e/1/0x4000000000000002 softirq=44329/44329 fqs=5245
>>> (t=10502 jiffies g=79401 q=2096)
>>> NMI backtrace for cpu 0
>>> CPU: 0 PID: 23184 Comm: syz-executor.5 Not tainted 5.7.0-rc1-syzkaller #0
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>> Call Trace:
>>> <IRQ>
>>> __dump_stack lib/dump_stack.c:77 [inline]
>>> dump_stack+0x188/0x20d lib/dump_stack.c:118
>>> nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
>>> nmi_trigger_cpumask_backtrace+0x231/0x27e lib/nmi_backtrace.c:62
>>> trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
>>> rcu_dump_cpu_stacks+0x19b/0x1e5 kernel/rcu/tree_stall.h:254
>>> print_cpu_stall kernel/rcu/tree_stall.h:475 [inline]
>>> check_cpu_stall kernel/rcu/tree_stall.h:549 [inline]
>>> rcu_pending kernel/rcu/tree.c:3225 [inline]
>>> rcu_sched_clock_irq.cold+0x55d/0xcfa kernel/rcu/tree.c:2296
>>> update_process_times+0x25/0x60 kernel/time/timer.c:1727
>>> tick_sched_handle+0x9b/0x180 kernel/time/tick-sched.c:176
>>> tick_sched_timer+0x4e/0x140 kernel/time/tick-sched.c:1320
>>> __run_hrtimer kernel/time/hrtimer.c:1520 [inline]
>>> __hrtimer_run_queues+0x5ca/0xed0 kernel/time/hrtimer.c:1584
>>> hrtimer_interrupt+0x312/0x770 kernel/time/hrtimer.c:1646
>>> local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1113 [inline]
>>> smp_apic_timer_interrupt+0x15b/0x600 arch/x86/kernel/apic/apic.c:1138
>>> apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
>>> </IRQ>
>>> RIP: 0010:io_ring_ctx_wait_and_kill+0x98/0x5a0 fs/io_uring.c:7301
>>> Code: 01 00 00 4d 89 f4 48 b8 00 00 00 00 00 fc ff df 4c 89 ed 49 c1 ec 03 48 c1 ed 03 49 01 c4 48 01 c5 eb 1c e8 3a ea 9d ff f3 90 <41> 80 3c 24 00 0f 85 53 04 00 00 48 83 bb 10 01 00 00 00 74 21 e8
>>> RSP: 0018:ffffc9000897fdf0 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13
>>> RAX: ffff888024082080 RBX: ffff88808df8e000 RCX: 1ffff9200112ffab
>>> RDX: 0000000000000000 RSI: ffffffff81d549c6 RDI: ffff88808df8e300
>>> RBP: ffffed1011bf1c2c R08: 0000000000000001 R09: ffffed1011bf1c61
>>> R10: ffff88808df8e307 R11: ffffed1011bf1c60 R12: ffffed1011bf1c22
>>> R13: ffff88808df8e160 R14: ffff88808df8e110 R15: ffffffff81d54ed0
>>> io_uring_release+0x3e/0x50 fs/io_uring.c:7324
>>> __fput+0x33e/0x880 fs/file_table.c:280
>>> task_work_run+0xf4/0x1b0 kernel/task_work.c:123
>>> tracehook_notify_resume include/linux/tracehook.h:188 [inline]
>>> exit_to_usermode_loop+0x2fa/0x360 arch/x86/entry/common.c:165
>>> prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
>>> syscall_return_slowpath arch/x86/entry/common.c:279 [inline]
>>> do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305
>>> entry_SYSCALL_64_after_hwframe+0x49/0xb3
>>
>> Make io ring ctx's percpu_ref balanced.
>>
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -5904,6 +5904,7 @@ static int io_submit_sqes(struct io_ring
>> fail_req:
>> io_cqring_add_event(req, err);
>> io_double_put_req(req);
>> + --submitted;
>> break;
>> }
>
>
> fs/io_uring.c
> 5880 for (i = 0; i < nr; i++) {
> 5881 const struct io_uring_sqe *sqe;
> 5882 struct io_kiocb *req;
> 5883 int err;
> 5884
> 5885 sqe = io_get_sqe(ctx);
> 5886 if (unlikely(!sqe)) {
> 5887 io_consume_sqe(ctx);
> 5888 break;
> 5889 }
> 5890 req = io_alloc_req(ctx, statep);
> 5891 if (unlikely(!req)) {
> 5892 if (!submitted)
> 5893 submitted = -EAGAIN;
> 5894 break;
> 5895 }
> 5896
> 5897 err = io_init_req(ctx, req, sqe, statep, async);
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> On the success path io_init_req() takes some references like:
>
> get_cred(req->work.creds);
If a req has got into io_init_req(), then it will be put at some point
with io_put_req(). io_req_work_drop_env(), called from there, will clean
up req->work.creds.
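
A minimal userspace sketch of that lifetime rule, not the kernel code:
every sketch_* name is made up for illustration, and the initial count
of two request refs is an assumption chosen to match the
io_double_put_req() call on the failure path. It only shows that a
reference taken during init is released by the final put, whether or
not a later step failed.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a refcounted credential object (hypothetical). */
struct sketch_cred {
	int usage;
};

/* Stand-in for a request that pins its creds while it is alive. */
struct sketch_req {
	int refs;
	struct sketch_cred *creds;
};

/* Init takes a cred reference; the request starts with two refs (assumed). */
void sketch_init_req(struct sketch_req *req, struct sketch_cred *cred)
{
	cred->usage++;
	req->creds = cred;
	req->refs = 2;
}

/* Cleanup that runs once, when the last request ref goes away. */
void sketch_drop_env(struct sketch_req *req)
{
	if (req->creds) {
		req->creds->usage--;
		req->creds = NULL;
	}
}

/* Drop one request ref; the final put releases what init acquired. */
void sketch_put_req(struct sketch_req *req)
{
	assert(req->refs > 0);
	if (--req->refs == 0)
		sketch_drop_env(req);
}

/* Failure path: drop both refs at once, so cleanup runs right away. */
void sketch_double_put_req(struct sketch_req *req)
{
	sketch_put_req(req);
	sketch_put_req(req);
}
```

In this model a request that fails right after init still ends up
with its cred reference released, because the failure path goes
through the same final put.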
>
> That one is probably buggy and should be put if the call to:
>
> return io_req_set_file(state, req, fd, sqe_flags);
>
> fails... But io_req_set_file() takes some other references if it
> succeeds like percpu_ref_get(req->fixed_file_refs); and it's not clear
> that those are released if io_submit_sqe() fails.
The same should happen with req->fixed_file_refs, though I don't
remember the details.
>
> 5898 io_consume_sqe(ctx);
> 5899 /* will complete beyond this point, count as submitted */
> 5900 submitted++;
Regarding the "--submitted" patch: we take 1 ctx->refs per request, which
is put in io_put_req(). So after a request passes the line above (5900),
its ref will eventually be dropped in io_put_req() and friends.
It's a bit more peculiar because io_submit_sqes() batch-takes N refs
first, and then puts the unused ones back at the end.
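
A rough userspace model of that accounting, not the kernel code: a
plain counter stands in for ctx->refs, every request is put immediately
for simplicity, and all model_* names plus the decrement_on_fail switch
are invented here to mimic the proposed "--submitted" change. Under
those assumptions it suggests why decrementing on the failure path
would return one ref too many.

```c
#include <assert.h>

/* A plain counter stands in for the per-cpu ctx->refs (model only). */
struct model_ctx {
	long refs;
};

/* Models the final put of a request: drops the one ctx ref it owns. */
void model_put_req(struct model_ctx *ctx)
{
	ctx->refs--;
}

/*
 * Models the submit loop. Takes nr refs up front, counts a request as
 * submitted once it is past init, and returns the unused refs at the end.
 * fail_at is the index of the request that fails (-1 for none).
 * decrement_on_fail mimics the proposed "--submitted" change.
 */
int model_submit(struct model_ctx *ctx, int nr, int fail_at,
		 int decrement_on_fail)
{
	int submitted = 0;
	int i;

	assert(nr >= 0);
	ctx->refs += nr;		/* batch-take N refs */

	for (i = 0; i < nr; i++) {
		submitted++;		/* this request now owns one ref */
		model_put_req(ctx);	/* success and failure both end in a put */
		if (i == fail_at) {
			if (decrement_on_fail)
				--submitted;
			break;
		}
	}

	ctx->refs -= nr - submitted;	/* put the unused refs back */
	return submitted;
}
```

With nr = 4 and a failure at index 1, the model ends balanced at 0
refs without the decrement, and at -1 with it, because the failed
request's ref is dropped once by its put and once more as "unused".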
> 5901
> 5902 if (unlikely(err)) {
> 5903 fail_req:
> 5904 io_cqring_add_event(req, err);
> 5905 io_double_put_req(req);
> 5906 break;
> 5907 }
> 5908
> 5909 trace_io_uring_submit_sqe(ctx, req->opcode, req->user_data,
> 5910 true, async);
> 5911 err = io_submit_sqe(req, sqe, statep, &link);
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> here
>
> 5912 if (err)
> 5913 goto fail_req;
> 5914 }
>
> regards,
> dan carpenter
>
--
Pavel Begunkov