[PATCH] io_uring: don't do flush cancel under inflight_lock
From: Jens Axboe @ 2019-11-11  3:35 UTC
  To: io-uring

We can't safely cancel under the inflight lock. If the work hasn't been
started yet, then io_wq_cancel_work() simply marks the work as cancelled
and invokes the work handler inline, from the canceling context. But if
completing that work needs to grab the inflight lock, because the
request holds user files, then we'll deadlock trying to finish the
work, as we already hold that lock.
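
A condensed sketch of the old, deadlocking shape, annotated; the
completion-side detail is a simplification of the real call chain:

	spin_lock_irq(&ctx->inflight_lock);
	list_for_each_entry(req, &ctx->inflight_list, inflight_entry) {
		if (req->work.files == files) {
			/*
			 * If the work hasn't started, io_wq_cancel_work()
			 * runs the handler inline. Completing a request
			 * that holds user files re-takes inflight_lock to
			 * unlink it from inflight_list: deadlock, since we
			 * still hold that lock right here.
			 */
			ret = io_wq_cancel_work(ctx->io_wq, &req->work);
			break;
		}
	}
	spin_unlock_irq(&ctx->inflight_lock);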

Instead, grab a reference to the request if its refcount isn't already
zero. If it is zero, then we know the request is going through
completion anyway, and we can safely ignore it. If it's not, then we
can drop the lock and attempt the cancellation from there.
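
Both halves of that refcount handshake, sketched; the completion side
shown here is an assumption modeled on io_put_req(), not a verbatim
copy of it:

	/* cancel side, under inflight_lock */
	if (!refcount_inc_not_zero(&req->refs))
		continue;	/* hit zero: completion already owns req */
	cancel_req = req;	/* ref held, safe to use after unlock */

	/* completion side, elsewhere (assumed shape of io_put_req()) */
	if (refcount_dec_and_test(&req->refs))
		__io_free_req(req);	/* last ref gone, request is freed */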

This also fixes a missing finish_wait() at the end of
io_uring_cancel_files().
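
For reference, a condensed view of the waitqueue pattern the function
relies on; finish_wait() restores the task state and unlinks the wait
entry, which is why the missing call mattered. The wakeup side is
assumed to be a wake_up() on ctx->inflight_wait at request completion:

	DEFINE_WAIT(wait);

	while (!list_empty_careful(&ctx->inflight_list)) {
		...
		prepare_to_wait(&ctx->inflight_wait, &wait,
				TASK_UNINTERRUPTIBLE);
		...
		schedule();	/* wait for a matching request to complete */
	}
	finish_wait(&ctx->inflight_wait, &wait);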

Signed-off-by: Jens Axboe <axboe@kernel.dk>

---

diff --git a/fs/io_uring.c b/fs/io_uring.c
index cc38f872dbf0..bb81cc11d287 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4276,38 +4276,38 @@ static int io_uring_release(struct inode *inode, struct file *file)
 static void io_uring_cancel_files(struct io_ring_ctx *ctx,
 				  struct files_struct *files)
 {
 	struct io_kiocb *req;
 	DEFINE_WAIT(wait);
 
 	while (!list_empty_careful(&ctx->inflight_list)) {
-		enum io_wq_cancel ret = IO_WQ_CANCEL_NOTFOUND;
+		struct io_kiocb *cancel_req = NULL;
 
 		spin_lock_irq(&ctx->inflight_lock);
 		list_for_each_entry(req, &ctx->inflight_list, inflight_entry) {
-			if (req->work.files == files) {
-				ret = io_wq_cancel_work(ctx->io_wq, &req->work);
-				break;
-			}
+			if (req->work.files != files)
+				continue;
+			/* req is being completed, ignore */
+			if (!refcount_inc_not_zero(&req->refs))
+				continue;
+			cancel_req = req;
+			break;
 		}
-		if (ret == IO_WQ_CANCEL_RUNNING)
+		if (cancel_req)
 			prepare_to_wait(&ctx->inflight_wait, &wait,
-					TASK_UNINTERRUPTIBLE);
-
+						TASK_UNINTERRUPTIBLE);
 		spin_unlock_irq(&ctx->inflight_lock);
 
-		/*
-		 * We need to keep going until we get NOTFOUND. We only cancel
-		 * one work at the time.
-		 *
-		 * If we get CANCEL_RUNNING, then wait for a work to complete
-		 * before continuing.
-		 */
-		if (ret == IO_WQ_CANCEL_OK)
-			continue;
-		else if (ret != IO_WQ_CANCEL_RUNNING)
+		if (cancel_req) {
+			io_wq_cancel_work(ctx->io_wq, &cancel_req->work);
+			io_put_req(cancel_req);
+		}
+
+		/* We need to keep going until we don't find a matching req */
+		if (!cancel_req)
 			break;
 		schedule();
 	}
+	finish_wait(&ctx->inflight_wait, &wait);
 }
 
 static int io_uring_flush(struct file *file, void *data)

-- 
Jens Axboe

