From: Pavel Begunkov <asml.silence@gmail.com>
To: Pengfei Xu <pengfei.xu@intel.com>, linux-kernel@vger.kernel.org
Cc: heng.su@intel.com, axboe@kernel.dk
Subject: Re: [Syzkaller & bisect] There is "io_ring_exit_work" related Call Trace in v5.2-rc5 kernel
Date: Wed, 1 Feb 2023 15:52:47 +0000 [thread overview]
Message-ID: <15161e5f-fe26-23e9-1d0a-ebd579d64d9f@gmail.com> (raw)
In-Reply-To: <812a7e2d-14d6-78cd-5c6c-ce508fc7fc0d@gmail.com>
On 1/28/23 14:49, Pavel Begunkov wrote:
> On 1/28/23 08:44, Pengfei Xu wrote:
>> Hi Pavel Begunkov and kernel expert,
>>
>> Greeting!
>>
>> There is an "io_ring_exit_work"-related Call Trace in the v6.2-rc5 kernel
>> in a guest on a Sapphire Rapids server.
>
> Thanks for the report, we'll take a look
Not reproducible for me. Apparently, the repro creates a normal ring
and immediately closes it, then io_ring_exit_work() hangs waiting for
the task to execute task_work.
It reminds me of a bug we had before, where a task stopped for
debugging would block task_work execution. Maybe it's some signal
interaction, or something specific to the VM and/or syzkaller magic.
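For illustration, the reproducer behaviour described above (create a plain ring, close it immediately) can be sketched as follows. This is a reconstruction, not the actual syzkaller program; the raw-syscall approach, the syscall number, and the buffer size are my assumptions:

```python
# Hedged sketch of what the repro appears to do, per the description
# above: create a plain io_uring and immediately close its fd, leaving
# ring teardown to the kernel's io_ring_exit_work().
# Assumptions: __NR_io_uring_setup == 425 (the unified syscall table),
# and sizeof(struct io_uring_params) == 120 bytes.
import ctypes
import os

libc = ctypes.CDLL(None, use_errno=True)
NR_IO_URING_SETUP = 425                      # __NR_io_uring_setup
params = ctypes.create_string_buffer(120)    # zeroed io_uring_params

# io_uring_setup(2): 4 SQ entries, default flags (all params zero)
fd = libc.syscall(NR_IO_URING_SETUP, 4, params)
if fd < 0:
    # Kernel without io_uring, or a seccomp-filtered environment
    print("io_uring_setup failed:", os.strerror(ctypes.get_errno()))
else:
    # Dropping the last reference kicks off ring teardown; the hang in
    # the report is io_ring_exit_work() waiting on task_work afterwards.
    os.close(fd)
    print("ring created and closed")
```

Closing the only fd is what hands cleanup to the exit work; the report's hang is in that cleanup path, not in the setup or close calls themselves.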
>> INFO: task kworker/u4:2:32 blocked for more than 147 seconds.
>> [ 300.212512] Not tainted 6.2.0-rc5-2241ab53cbb5 #1
>> [ 300.213087] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [ 300.213921] task:kworker/u4:2 state:D stack:0 pid:32 ppid:2 flags:0x00004000
>> [ 300.214849] Workqueue: events_unbound io_ring_exit_work
>> [ 300.215506] Call Trace:
>> [ 300.215789] <TASK>
>> [ 300.216051] __schedule+0x385/0xb00
>> [ 300.216467] ? write_comp_data+0x2f/0x90
>> [ 300.216949] ? wait_for_completion+0x7b/0x180
>> [ 300.217445] schedule+0x5b/0xe0
>> [ 300.217836] schedule_timeout+0x561/0x650
>> [ 300.218301] ? _raw_spin_unlock_irq+0x2b/0x60
>> [ 300.218816] ? wait_for_completion+0x7b/0x180
>> [ 300.219341] ? __this_cpu_preempt_check+0x20/0x30
>> [ 300.219893] ? lockdep_hardirqs_on+0x8a/0x110
>> [ 300.220384] ? _raw_spin_unlock_irq+0x2b/0x60
>> [ 300.220881] ? wait_for_completion+0x9e/0x180
>> [ 300.221382] ? trace_hardirqs_on+0x3f/0x100
>> [ 300.221868] ? wait_for_completion+0x7b/0x180
>> [ 300.222379] wait_for_completion+0xa6/0x180
>> [ 300.222897] io_ring_exit_work+0x2f7/0x747
>> [ 300.223383] ? __pfx_io_tctx_exit_cb+0x10/0x10
>> [ 300.223947] process_one_work+0x3b1/0x960
>> [ 300.224446] worker_thread+0x52/0x660
>> [ 300.224884] ? __pfx_worker_thread+0x10/0x10
>> [ 300.225376] kthread+0x161/0x1a0
>> [ 300.225782] ? __pfx_kthread+0x10/0x10
>> [ 300.226225] ret_from_fork+0x29/0x50
>> [ 300.226688] </TASK>
>> [ 300.227791]
>> [ 300.227791] Showing all locks held in the system:
>> [ 300.228501] 1 lock held by rcu_tasks_kthre/11:
>> [ 300.229003] #0: ffffffff83d617b0 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x2d/0x3d0
>> [ 300.230098] 1 lock held by rcu_tasks_rude_/12:
>> [ 300.230601] #0: ffffffff83d61530 (rcu_tasks_rude.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x2d/0x3d0
>> [ 300.231733] 1 lock held by rcu_tasks_trace/13:
>> [ 300.232244] #0: ffffffff83d61270 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x2d/0x3d0
>> [ 300.233377] 1 lock held by khungtaskd/29:
>> [ 300.233842] #0: ffffffff83d621c0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x27/0x18c
>> [ 300.234866] 2 locks held by kworker/u4:2/32:
>> [ 300.235351] #0: ff11000006045d38 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x2ff/0x960
>> [ 300.236502] #1: ffa0000000117e60 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x303/0x960
>> [ 300.237696]
>> [ 300.237882] =============================================
>> [ 300.237882]
>>
>> Bisected and found the first bad commit is:
>> eebd2e37e662617a6b8041db75205f0a262ce870
>> io_uring: don't take task ring-file notes
>>
>> But reverting the above commit eebd2e37e on top of the v6.2-rc5 kernel
>> still left the kernel failing, so it is only a suspected commit for the
>> above problem.
>>
>> The syzkaller reproducer code, bisect info, kconfig, and v6.2-rc5 dmesg
>> are attached.
>>
>> All detailed info is in link:
>> https://github.com/xupengfe/syzkaller_logs/tree/main/230124_055801_io_ring_exit_work
>>
>> I hope it's helpful.
>>
>> Thanks!
>> BR.
>
--
Pavel Begunkov
Thread overview: 8+ messages
2023-01-28 8:44 [Syzkaller & bisect] There is "io_ring_exit_work" related Call Trace in v5.2-rc5 kernel Pengfei Xu
2023-01-28 8:52 ` [Syzkaller & bisect] There is "io_ring_exit_work" related Call Trace in v6.2-rc5 kernel Pengfei Xu
2023-01-28 14:49 ` [Syzkaller & bisect] There is "io_ring_exit_work" related Call Trace in v5.2-rc5 kernel Pavel Begunkov
2023-02-01 15:52 ` Pavel Begunkov [this message]
2023-02-02 8:56 ` Pengfei Xu
2023-02-02 19:44 ` Jens Axboe
2023-02-02 20:37 ` Jens Axboe
2023-02-03 2:08 ` Pengfei Xu