From: Jens Axboe <axboe@kernel.dk>
To: Zhengyuan Liu <liuzhengyuan@kylinos.cn>
Cc: linux-block@vger.kernel.org, liuyun01@kylinos.cn
Subject: Re: [RFC PATCH] io_uring: add a memory barrier before atomic_read
Date: Thu, 18 Jul 2019 09:41:06 -0600 [thread overview]
Message-ID: <9b8f3de8-48c9-35e3-d985-00bad339b74d@kernel.dk> (raw)
In-Reply-To: <1563453840-19778-1-git-send-email-liuzhengyuan@kylinos.cn>
On 7/18/19 6:44 AM, Zhengyuan Liu wrote:
> There is a hang issue when using fio to do some basic testing. The issue can
> be easily reproduced with the script below:
>
> while true
> do
> fio --ioengine=io_uring -rw=write -bs=4k -numjobs=1 \
> -size=1G -iodepth=64 -name=uring --filename=/dev/zero
> done
>
> After several minutes, maybe more, fio blocks in
> io_uring_enter->io_cqring_wait waiting for previously committed sqes to
> complete, and it can't return to userspace until we send it a SIGTERM. After
> receiving SIGTERM, fio then hangs in io_ring_ctx_wait_and_kill with a
> backtrace like this:
>
> [54133.243816] Call Trace:
> [54133.243842] __schedule+0x3a0/0x790
> [54133.243868] schedule+0x38/0xa0
> [54133.243880] schedule_timeout+0x218/0x3b0
> [54133.243891] ? sched_clock+0x9/0x10
> [54133.243903] ? wait_for_completion+0xa3/0x130
> [54133.243916] ? _raw_spin_unlock_irq+0x2c/0x40
> [54133.243930] ? trace_hardirqs_on+0x3f/0xe0
> [54133.243951] wait_for_completion+0xab/0x130
> [54133.243962] ? wake_up_q+0x70/0x70
> [54133.243984] io_ring_ctx_wait_and_kill+0xa0/0x1d0
> [54133.243998] io_uring_release+0x20/0x30
> [54133.244008] __fput+0xcf/0x270
> [54133.244029] ____fput+0xe/0x10
> [54133.244040] task_work_run+0x7f/0xa0
> [54133.244056] do_exit+0x305/0xc40
> [54133.244067] ? get_signal+0x13b/0xbd0
> [54133.244088] do_group_exit+0x50/0xd0
> [54133.244103] get_signal+0x18d/0xbd0
> [54133.244112] ? _raw_spin_unlock_irqrestore+0x36/0x60
> [54133.244142] do_signal+0x34/0x720
> [54133.244171] ? exit_to_usermode_loop+0x7e/0x130
> [54133.244190] exit_to_usermode_loop+0xc0/0x130
> [54133.244209] do_syscall_64+0x16b/0x1d0
> [54133.244221] entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> The reason is that we added a req to ctx->pending_async at the very end, but
> it never got a chance to be processed. How could this happen?
>
>     fio#cpu0                                   wq#cpu1
>
>  io_add_to_prev_work                    io_sq_wq_submit_work
>
>   atomic_read()      <<< 1
>
>                                          atomic_dec_return()  << 1->0
>                                          list_empty();        <<< true;
>
>   list_add_tail()
>   atomic_read()      << 0 or 1?
>
> As stated in atomic_ops.rst, atomic_read does not guarantee that the runtime
> initialization by any other thread is visible yet, so we must take care of
> that with a proper implicit or explicit memory barrier.
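
To make the race above concrete, here is a heavily simplified C sketch of the
two paths, loosely modelled on the v5.2-era fs/io_uring.c. The struct layout,
function names and error handling are pared down for illustration and are not
the exact kernel code; they correspond to io_add_to_prev_work() and the tail
of io_sq_wq_submit_work():

#include <linux/atomic.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/types.h>

struct async_list {
	spinlock_t		lock;
	atomic_t		cnt;	/* "a worker is still running" count */
	struct list_head	list;	/* requests handed over to that worker */
};

struct io_kiocb {
	struct list_head	list;
};

/* Submitter side (fio, cpu0): try to hand the request to the live worker. */
static bool add_to_prev_work(struct async_list *alist, struct io_kiocb *req)
{
	bool ret = true;

	if (!atomic_read(&alist->cnt))		/* <<< may still read 1 */
		return false;

	spin_lock(&alist->lock);
	list_add_tail(&req->list, &alist->list);
	/*
	 * Without a barrier before this read, cpu0 may observe the stale
	 * value 1 even though cpu1 has already dropped cnt to 0 and found
	 * the list empty; the request is then left queued with no worker
	 * left to run it.
	 */
	if (!atomic_read(&alist->cnt)) {	/* << 0 or 1? */
		list_del_init(&req->list);
		ret = false;			/* caller queues new work instead */
	}
	spin_unlock(&alist->lock);
	return ret;
}

/* Worker side (wq, cpu1): drop the count, then sweep up late arrivals. */
static void sq_wq_submit_work_tail(struct async_list *alist)
{
	if (!atomic_dec_return(&alist->cnt) &&	/* << 1->0 */
	    !list_empty(&alist->list)) {	/* <<< may still see "empty" */
		/* splice the list and keep processing requests ... */
		return;
	}
	/* worker exits; anything added after this point is stranded */
}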

Thanks for looking at this and finding this issue, it does look like a problem.
But I'm not sure about the fix. Shouldn't we just need an smp_mb__after_atomic()
on the atomic_dec_return() side of things? Like the below.

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5ec06e5ba0be..3c2a6f88a6b0 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1881,6 +1881,7 @@ static void io_sq_wq_submit_work(struct work_struct *work)
 	 */
 	if (async_list) {
 		ret = atomic_dec_return(&async_list->cnt);
+		smp_mb__after_atomic();
 		while (!ret && !list_empty(&async_list->list)) {
 			spin_lock(&async_list->lock);
 			atomic_inc(&async_list->cnt);
@@ -1894,6 +1895,7 @@ static void io_sq_wq_submit_work(struct work_struct *work)
 				goto restart;
 			}
 			ret = atomic_dec_return(&async_list->cnt);
+			smp_mb__after_atomic();
 		}
 	}
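
As an aside, the ordering both sides rely on here is the classic
store-buffering pattern. A minimal sketch (illustrative only, not the actual
kernel code), where the submitter-side barrier is what the RFC's subject
suggests and the worker-side one is the smp_mb__after_atomic() proposed above:

	submitter (cpu0)                     worker (cpu1)
	----------------                     -------------
	list_add_tail(&req->list, ...);      atomic_dec_return(&list->cnt);
	smp_mb();                            smp_mb__after_atomic();
	atomic_read(&list->cnt);             list_empty(&list->list);

With a full barrier on each side, the outcome where cpu0 still reads the old
cnt value AND cpu1 still sees an empty list is forbidden: either the worker
sees the newly added request and processes it, or the submitter sees cnt == 0,
removes the request again and falls back to queueing new work.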
--
Jens Axboe
Thread overview: 7+ messages
2019-07-18 12:44 [RFC PATCH] io_uring: add a memory barrier before atomic_read Zhengyuan Liu
2019-07-18 15:41 ` Jens Axboe [this message]
2019-07-18 16:43 ` Jens Axboe
2019-07-19 0:44 ` Jackie Liu
[not found] ` <5d3114d7.1c69fb81.fc097.122eSMTPIN_ADDED_BROKEN@mx.google.com>
2019-07-19 14:51 ` Jens Axboe
2019-07-20 15:14 ` Zhengyuan Liu
2019-07-20 15:19 ` Jens Axboe