* [PATCH 5.12] io_uring: do ctx sqd ejection in a clear context
@ 2021-03-23 10:52 Pavel Begunkov
From: Pavel Begunkov @ 2021-03-23 10:52 UTC
  To: Jens Axboe, io-uring; +Cc: syzbot+e3a3f84f5cecf61f0583

WARNING: CPU: 1 PID: 27907 at fs/io_uring.c:7147 io_sq_thread_park+0xb5/0xd0 fs/io_uring.c:7147
CPU: 1 PID: 27907 Comm: iou-sqp-27905 Not tainted 5.12.0-rc4-syzkaller #0
RIP: 0010:io_sq_thread_park+0xb5/0xd0 fs/io_uring.c:7147
Call Trace:
 io_ring_ctx_wait_and_kill+0x214/0x700 fs/io_uring.c:8619
 io_uring_release+0x3e/0x50 fs/io_uring.c:8646
 __fput+0x288/0x920 fs/file_table.c:280
 task_work_run+0xdd/0x1a0 kernel/task_work.c:140
 io_run_task_work fs/io_uring.c:2238 [inline]
 io_run_task_work fs/io_uring.c:2228 [inline]
 io_uring_try_cancel_requests+0x8ec/0xc60 fs/io_uring.c:8770
 io_uring_cancel_sqpoll+0x1cf/0x290 fs/io_uring.c:8974
 io_sqpoll_cancel_cb+0x87/0xb0 fs/io_uring.c:8907
 io_run_task_work_head+0x58/0xb0 fs/io_uring.c:1961
 io_sq_thread+0x3e2/0x18d0 fs/io_uring.c:6763
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

It may happen that the last ctx ref is killed in
io_uring_cancel_sqpoll(), so the fput callback (i.e. io_uring_release())
is enqueued through task_work and run by that same cancellation. As this
is deeply nested, we can't park or take sqd->lock there, because its
state is unclear. So don't eject the ctx from the sqd list in
io_ring_ctx_wait_and_kill(); do it in a clear context in
io_ring_exit_work() instead.

Reported-by: syzbot+e3a3f84f5cecf61f0583@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index f3ae83a2d7bc..8c5789b96dbb 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -8564,6 +8564,14 @@ static void io_ring_exit_work(struct work_struct *work)
 	struct io_tctx_node *node;
 	int ret;
 
+	/* prevent SQPOLL from submitting new requests */
+	if (ctx->sq_data) {
+		io_sq_thread_park(ctx->sq_data);
+		list_del_init(&ctx->sqd_list);
+		io_sqd_update_thread_idle(ctx->sq_data);
+		io_sq_thread_unpark(ctx->sq_data);
+	}
+
 	/*
 	 * If we're doing polled IO and end up having requests being
 	 * submitted async (out-of-line), then completions can come in while
@@ -8615,14 +8623,6 @@ static void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
 		io_unregister_personality(ctx, index);
 	mutex_unlock(&ctx->uring_lock);
 
-	/* prevent SQPOLL from submitting new requests */
-	if (ctx->sq_data) {
-		io_sq_thread_park(ctx->sq_data);
-		list_del_init(&ctx->sqd_list);
-		io_sqd_update_thread_idle(ctx->sq_data);
-		io_sq_thread_unpark(ctx->sq_data);
-	}
-
 	io_kill_timeouts(ctx, NULL, NULL);
 	io_poll_remove_all(ctx, NULL, NULL);
 
-- 
2.24.0
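
The reordering in the patch above can be illustrated with a small
userspace model. This is only a sketch, not kernel code: every name in it
(model_park(), model_eject(), run_model(), the task ids) is hypothetical,
and it assumes the warning fires because parking is requested by the
SQPOLL thread itself, which is what the "Comm: iou-sqp-27905" line and the
io_sq_thread() frame in the trace suggest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the patch; none of this is kernel API. */
struct sqd_model {
	int owner;	/* id of the task acting as the SQPOLL thread */
	int nr_ctx;	/* number of contexts attached to this sqd */
};

struct ctx_model {
	struct sqd_model *sqd;
	bool on_sqd_list;
};

static int current_task;		/* stands in for `current` */
static int park_violations;		/* counts "parked from SQPOLL itself" */
static struct ctx_model *pending_exit;	/* stands in for queued exit work */

/* Assumption: parking is only valid when requested by another task. */
static void model_park(struct sqd_model *sqd)
{
	if (sqd->owner == current_task)
		park_violations++;
}

/* Park, unlink the ctx from the sqd list, then (implicitly) unpark. */
static void model_eject(struct ctx_model *ctx)
{
	if (!ctx->sqd)
		return;
	model_park(ctx->sqd);
	if (ctx->on_sqd_list) {
		ctx->on_sqd_list = false;
		ctx->sqd->nr_ctx--;
	}
}

/* Before the patch: the release path ejects directly. */
static void release_old(struct ctx_model *ctx)
{
	model_eject(ctx);
}

/* After the patch: the release path only queues the exit work. */
static void release_new(struct ctx_model *ctx)
{
	pending_exit = ctx;
}

/* The exit work runs later on a different task and does the ejection. */
static void exit_work(int worker_task)
{
	current_task = worker_task;
	if (pending_exit) {
		model_eject(pending_exit);
		pending_exit = NULL;
	}
}

/* Returns the number of invalid park attempts for the chosen ordering. */
int run_model(bool deferred)
{
	struct sqd_model sqd = { .owner = 1, .nr_ctx = 1 };
	struct ctx_model ctx = { .sqd = &sqd, .on_sqd_list = true };

	park_violations = 0;
	pending_exit = NULL;
	current_task = 1;	/* release runs as task_work on the SQPOLL task */

	if (deferred) {
		release_new(&ctx);
		exit_work(2);	/* a separate worker task */
	} else {
		release_old(&ctx);
	}

	/* Either way the ctx ends up off the sqd list. */
	assert(!ctx.on_sqd_list && sqd.nr_ctx == 0);
	return park_violations;
}
```

In this model run_model(false) reports one invalid park attempt and
run_model(true) reports none, while the ctx is removed from the list in
both cases. Only the task that performs the removal changes.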



* Re: [PATCH 5.12] io_uring: do ctx sqd ejection in a clear context
@ 2021-03-24 12:55 ` Jens Axboe
From: Jens Axboe @ 2021-03-24 12:55 UTC
  To: Pavel Begunkov, io-uring; +Cc: syzbot+e3a3f84f5cecf61f0583

On 3/23/21 4:52 AM, Pavel Begunkov wrote:
> WARNING: CPU: 1 PID: 27907 at fs/io_uring.c:7147 io_sq_thread_park+0xb5/0xd0 fs/io_uring.c:7147
> CPU: 1 PID: 27907 Comm: iou-sqp-27905 Not tainted 5.12.0-rc4-syzkaller #0
> RIP: 0010:io_sq_thread_park+0xb5/0xd0 fs/io_uring.c:7147
> Call Trace:
>  io_ring_ctx_wait_and_kill+0x214/0x700 fs/io_uring.c:8619
>  io_uring_release+0x3e/0x50 fs/io_uring.c:8646
>  __fput+0x288/0x920 fs/file_table.c:280
>  task_work_run+0xdd/0x1a0 kernel/task_work.c:140
>  io_run_task_work fs/io_uring.c:2238 [inline]
>  io_run_task_work fs/io_uring.c:2228 [inline]
>  io_uring_try_cancel_requests+0x8ec/0xc60 fs/io_uring.c:8770
>  io_uring_cancel_sqpoll+0x1cf/0x290 fs/io_uring.c:8974
>  io_sqpoll_cancel_cb+0x87/0xb0 fs/io_uring.c:8907
>  io_run_task_work_head+0x58/0xb0 fs/io_uring.c:1961
>  io_sq_thread+0x3e2/0x18d0 fs/io_uring.c:6763
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> 
> It may happen that the last ctx ref is killed in
> io_uring_cancel_sqpoll(), so the fput callback (i.e. io_uring_release())
> is enqueued through task_work and run by that same cancellation. As this
> is deeply nested, we can't park or take sqd->lock there, because its
> state is unclear. So don't eject the ctx from the sqd list in
> io_ring_ctx_wait_and_kill(); do it in a clear context in
> io_ring_exit_work() instead.

Applied, thanks.

-- 
Jens Axboe

