io-uring.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] io-wq: fix cancellation on create-worker failure
@ 2021-09-08  9:09 Pavel Begunkov
  2021-09-08 12:34 ` Jens Axboe
  0 siblings, 1 reply; 2+ messages in thread
From: Pavel Begunkov @ 2021-09-08  9:09 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: Hao Sun

WARNING: CPU: 0 PID: 10392 at fs/io_uring.c:1151 req_ref_put_and_test
fs/io_uring.c:1151 [inline]
WARNING: CPU: 0 PID: 10392 at fs/io_uring.c:1151 req_ref_put_and_test
fs/io_uring.c:1146 [inline]
WARNING: CPU: 0 PID: 10392 at fs/io_uring.c:1151
io_req_complete_post+0xf5b/0x1190 fs/io_uring.c:1794
Modules linked in:
Call Trace:
 tctx_task_work+0x1e5/0x570 fs/io_uring.c:2158
 task_work_run+0xe0/0x1a0 kernel/task_work.c:164
 tracehook_notify_signal include/linux/tracehook.h:212 [inline]
 handle_signal_work kernel/entry/common.c:146 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
 exit_to_user_mode_prepare+0x232/0x2a0 kernel/entry/common.c:209
 __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
 syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:302
 do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x44/0xae

When io_wqe_enqueue() -> io_wqe_create_worker() fails, we can't just
call io_run_cancel() to clean up the request, it's already enqueued via
io_wqe_insert_work() and will be executed either by some other worker
during cancellation (e.g. in io_wq_put_and_exit()).

Reported-by: Hao Sun <sunhao.th@gmail.com>
Fixes: 3146cba99aa28 ("io-wq: make worker creation resilient against signals")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io-wq.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index d80e4a735677..35e7ee26f7ea 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -832,6 +832,11 @@ static void io_wqe_insert_work(struct io_wqe *wqe, struct io_wq_work *work)
 	wq_list_add_after(&work->list, &tail->list, &acct->work_list);
 }
 
+static bool io_wq_work_match_item(struct io_wq_work *work, void *data)
+{
+	return work == data;
+}
+
 static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
 {
 	struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
@@ -844,7 +849,6 @@ static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
 	 */
 	if (test_bit(IO_WQ_BIT_EXIT, &wqe->wq->state) ||
 	    (work->flags & IO_WQ_WORK_CANCEL)) {
-run_cancel:
 		io_run_cancel(work, wqe);
 		return;
 	}
@@ -864,15 +868,22 @@ static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
 		bool did_create;
 
 		did_create = io_wqe_create_worker(wqe, acct);
-		if (unlikely(!did_create)) {
-			raw_spin_lock(&wqe->lock);
-			/* fatal condition, failed to create the first worker */
-			if (!acct->nr_workers) {
-				raw_spin_unlock(&wqe->lock);
-				goto run_cancel;
-			}
-			raw_spin_unlock(&wqe->lock);
+		if (likely(did_create))
+			return;
+
+		raw_spin_lock(&wqe->lock);
+		/* fatal condition, failed to create the first worker */
+		if (!acct->nr_workers) {
+			struct io_cb_cancel_data match = {
+				.fn		= io_wq_work_match_item,
+				.data		= work,
+				.cancel_all	= false,
+			};
+
+			if (io_acct_cancel_pending_work(wqe, acct, &match))
+				raw_spin_lock(&wqe->lock);
 		}
+		raw_spin_unlock(&wqe->lock);
 	}
 }
 
-- 
2.33.0


^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [PATCH] io-wq: fix cancellation on create-worker failure
  2021-09-08  9:09 [PATCH] io-wq: fix cancellation on create-worker failure Pavel Begunkov
@ 2021-09-08 12:34 ` Jens Axboe
  0 siblings, 0 replies; 2+ messages in thread
From: Jens Axboe @ 2021-09-08 12:34 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: Hao Sun

On 9/8/21 3:09 AM, Pavel Begunkov wrote:
> WARNING: CPU: 0 PID: 10392 at fs/io_uring.c:1151 req_ref_put_and_test
> fs/io_uring.c:1151 [inline]
> WARNING: CPU: 0 PID: 10392 at fs/io_uring.c:1151 req_ref_put_and_test
> fs/io_uring.c:1146 [inline]
> WARNING: CPU: 0 PID: 10392 at fs/io_uring.c:1151
> io_req_complete_post+0xf5b/0x1190 fs/io_uring.c:1794
> Modules linked in:
> Call Trace:
>  tctx_task_work+0x1e5/0x570 fs/io_uring.c:2158
>  task_work_run+0xe0/0x1a0 kernel/task_work.c:164
>  tracehook_notify_signal include/linux/tracehook.h:212 [inline]
>  handle_signal_work kernel/entry/common.c:146 [inline]
>  exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
>  exit_to_user_mode_prepare+0x232/0x2a0 kernel/entry/common.c:209
>  __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
>  syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:302
>  do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> When io_wqe_enqueue() -> io_wqe_create_worker() fails, we can't just
> call io_run_cancel() to clean up the request, it's already enqueued via
> io_wqe_insert_work() and will be executed either by some other worker
> during cancellation (e.g. in io_wq_put_and_exit()).

Oops yes, that looks better. Thanks, applied.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-09-08 12:34 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-08  9:09 [PATCH] io-wq: fix cancellation on create-worker failure Pavel Begunkov
2021-09-08 12:34 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).