From: Pavel Begunkov <asml.silence@gmail.com>
To: syzbot <syzbot+27d62ee6f256b186883e@syzkaller.appspotmail.com>,
	axboe@kernel.dk, io-uring@vger.kernel.org,
	linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] INFO: task hung in io_wqe_worker
Date: Thu, 28 Oct 2021 21:32:23 +0100
Message-ID: <2b0d6d98-b6f6-e1b1-1ea8-3126f41ec0ce@gmail.com>
In-Reply-To: <27280d59-88ff-7eeb-1e43-eb9bd23df761@gmail.com>

On 10/22/21 14:57, Pavel Begunkov wrote:
> On 10/22/21 14:49, Pavel Begunkov wrote:
>> On 10/22/21 05:38, syzbot wrote:
>>> Hello,
>>>
>>> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
>>> INFO: task hung in io_wqe_worker
>>>
>>> INFO: task iou-wrk-9392:9401 blocked for more than 143 seconds.
>>>        Not tainted 5.15.0-rc2-syzkaller #0
>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> task:iou-wrk-9392    state:D stack:27952 pid: 9401 ppid:  7038 flags:0x00004004
>>> Call Trace:
>>>   context_switch kernel/sched/core.c:4940 [inline]
>>>   __schedule+0xb44/0x5960 kernel/sched/core.c:6287
>>>   schedule+0xd3/0x270 kernel/sched/core.c:6366
>>>   schedule_timeout+0x1db/0x2a0 kernel/time/timer.c:1857
>>>   do_wait_for_common kernel/sched/completion.c:85 [inline]
>>>   __wait_for_common kernel/sched/completion.c:106 [inline]
>>>   wait_for_common kernel/sched/completion.c:117 [inline]
>>>   wait_for_completion+0x176/0x280 kernel/sched/completion.c:138
>>>   io_worker_exit fs/io-wq.c:183 [inline]
>>>   io_wqe_worker+0x66d/0xc40 fs/io-wq.c:597
>>>   ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

#syz test: https://github.com/isilence/linux.git syz_coredump



>>
>> Easily reproducible; it's stuck in
>>
>> static void io_worker_exit(struct io_worker *worker)
>> {
>>      ...
>>      wait_for_completion(&worker->ref_done);
>>      ...
>> }
>>
>> The reference belongs to a create_worker_cb() task_work item. It's expected
>> to either be executed or cancelled by io_wq_exit_workers(), but the owner
>> task never goes through __io_uring_cancel() (called in do_exit()) and so never
>> reaches io_wq_exit_workers().
>>
>> Following the owner task, cat /proc/<pid>/stack:
>>
>> [<0>] do_coredump+0x1d0/0x10e0
>> [<0>] get_signal+0x4a3/0x960
>> [<0>] arch_do_signal_or_restart+0xc3/0x6d0
>> [<0>] exit_to_user_mode_prepare+0x10e/0x190
>> [<0>] irqentry_exit_to_user_mode+0x9/0x20
>> [<0>] irqentry_exit+0x36/0x40
>> [<0>] exc_page_fault+0x95/0x190
>> [<0>] asm_exc_page_fault+0x1e/0x30
>>
>> (gdb) l *(do_coredump+0x1d0-5)
>> 0xffffffff81343ccb is in do_coredump (fs/coredump.c:469).
>> 464
>> 465             if (core_waiters > 0) {
>> 466                     struct core_thread *ptr;
>> 467
>> 468                     freezer_do_not_count();
>> 469                     wait_for_completion(&core_state->startup);
>> 470                     freezer_count();
>>
>> I can't say anything more at the moment, as I'm not familiar with the coredump code.
> 
> A simple hack that allows task work items to be executed from there
> works around the problem:
> 
> 
> diff --git a/fs/coredump.c b/fs/coredump.c
> index 3224dee44d30..f6f9dfb02296 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -466,7 +466,8 @@ static int coredump_wait(int exit_code, struct core_state *core_state)
>           struct core_thread *ptr;
> 
>           freezer_do_not_count();
> -        wait_for_completion(&core_state->startup);
> +        while (wait_for_completion_interruptible(&core_state->startup))
> +            tracehook_notify_signal();
>           freezer_count();
>           /*
>            * Wait for all the threads to become inactive, so that
> 
> 
> 
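
The circular wait described above, and why running task work from the
coredump wait breaks it, can be sketched as a toy single-threaded model
in userspace C. This is not kernel code: every toy_* name is hypothetical,
and the model assumes that core_state->startup completes only once the
worker has finished exiting, which the analysis above suggests but does
not state outright.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; all toy_* names are hypothetical, not kernel APIs. */
struct toy_worker {
	int refs;      /* 1 for the worker itself + 1 held by a queued task_work item */
	bool ref_done; /* stands in for the worker->ref_done completion */
};

struct toy_owner {
	struct toy_worker *pending; /* queued create_worker_cb()-like item, or NULL */
	bool startup_done;          /* stands in for core_state->startup */
};

/* Drop one reference; the last put "completes" ref_done. */
static void toy_worker_put(struct toy_worker *w)
{
	assert(w->refs > 0);
	if (--w->refs == 0)
		w->ref_done = true;
}

/* The owner runs its queued task work, releasing the reference it pins. */
static void toy_run_task_work(struct toy_owner *o)
{
	if (o->pending) {
		toy_worker_put(o->pending);
		o->pending = NULL;
	}
}

/*
 * Model of the owner waiting in the coredump path.
 * Returns true if the wait finishes, false if the model deadlocks.
 */
static bool toy_coredump_wait(struct toy_owner *o, struct toy_worker *w,
			      bool run_task_work)
{
	/* The worker enters its exit path and drops its own reference. */
	toy_worker_put(w);

	/* Patched variant: the interrupted wait lets the owner run task work. */
	if (run_task_work)
		toy_run_task_work(o);

	/* Assumption: startup completes only after the worker has exited. */
	o->startup_done = w->ref_done;
	return o->startup_done;
}
```

With refs = 2 and one pending item, toy_coredump_wait(&o, &w, false)
models the hang: the worker waits on the last reference while the owner
waits on the worker. Passing true models the hack in the diff, where the
owner drops the reference itself and both sides make progress.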

-- 
Pavel Begunkov

Thread overview: 7+ messages
2021-10-21 21:10 [syzbot] INFO: task hung in io_wqe_worker syzbot
2021-10-21 23:47 ` Pavel Begunkov
2021-10-22  4:38   ` syzbot
2021-10-22 13:49     ` Pavel Begunkov
2021-10-22 13:57       ` Pavel Begunkov
2021-10-28 20:32         ` Pavel Begunkov [this message]
2021-10-28 22:35           ` syzbot
