IO-Uring Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/8] iopoll and task_work fixes
@ 2020-06-30 12:20 Pavel Begunkov
  2020-06-30 12:20 ` [PATCH 1/8] io_uring: fix io_fail_links() locking Pavel Begunkov
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Pavel Begunkov @ 2020-06-30 12:20 UTC (permalink / raw)
  To: Jens Axboe, io-uring

[1-3] are iopoll fixes, where a bug in [1] I unfortenatuly added
yesterday. [4-6] are task_work related.

Tell me, if something from it is needed for 5.8

Pavel Begunkov (8):
  io_uring: fix io_fail_links() locking
  io_uring: fix commit_cqring() locking in iopoll
  io_uring: fix ignoring eventfd in iopoll
  io_uring: fix missing ->mm on exit
  io_uring: don't fail iopoll requeue without ->mm
  io_uring: fix NULL mm in io_poll_task_func()
  io_uring: simplify io_async_task_func()
  io_uring: optimise io_req_find_next() fast check

 fs/io_uring.c | 79 +++++++++++++++++++++------------------------------
 1 file changed, 33 insertions(+), 46 deletions(-)

-- 
2.24.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/8] io_uring: fix io_fail_links() locking
  2020-06-30 12:20 [PATCH 0/8] iopoll and task_work fixes Pavel Begunkov
@ 2020-06-30 12:20 ` Pavel Begunkov
  2020-06-30 14:38   ` Jens Axboe
  2020-06-30 12:20 ` [PATCH 2/8] io_uring: fix commit_cqring() locking in iopoll Pavel Begunkov
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Pavel Begunkov @ 2020-06-30 12:20 UTC (permalink / raw)
  To: Jens Axboe, io-uring

86b71d0daee05 ("io_uring: deduplicate freeing linked timeouts")
actually fixed one bug, where io_fail_links() doesn't consider
REQ_F_COMP_LOCKED, but added another -- io_cqring_fill_event()
without any locking

Return locking back there and do it right with REQ_F_COMP_LOCKED
check.

Fixes: 86b71d0daee05 ("io_uring: deduplicate freeing linked timeouts")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5928404acb17..a1ea41b7b811 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1620,6 +1620,10 @@ static struct io_kiocb *io_req_link_next(struct io_kiocb *req)
 static void io_fail_links(struct io_kiocb *req)
 {
 	struct io_ring_ctx *ctx = req->ctx;
+	unsigned long flags = 0;
+
+	if (!(req->flags & REQ_F_COMP_LOCKED))
+		spin_lock_irqsave(&ctx->completion_lock, flags);
 
 	while (!list_empty(&req->link_list)) {
 		struct io_kiocb *link = list_first_entry(&req->link_list,
@@ -1634,6 +1638,8 @@ static void io_fail_links(struct io_kiocb *req)
 	}
 
 	io_commit_cqring(ctx);
+	if (!(req->flags & REQ_F_COMP_LOCKED))
+		spin_unlock_irqrestore(&ctx->completion_lock, flags);
 	io_cqring_ev_posted(ctx);
 }
 
-- 
2.24.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 2/8] io_uring: fix commit_cqring() locking in iopoll
  2020-06-30 12:20 [PATCH 0/8] iopoll and task_work fixes Pavel Begunkov
  2020-06-30 12:20 ` [PATCH 1/8] io_uring: fix io_fail_links() locking Pavel Begunkov
@ 2020-06-30 12:20 ` Pavel Begunkov
  2020-06-30 14:04   ` Jens Axboe
  2020-06-30 12:20 ` [PATCH 3/8] io_uring: fix ignoring eventfd " Pavel Begunkov
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Pavel Begunkov @ 2020-06-30 12:20 UTC (permalink / raw)
  To: Jens Axboe, io-uring

Don't call io_commit_cqring() without holding the completion spinlock
in io_iopoll_complete(), it can race, e.g. with async request failing.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index a1ea41b7b811..96fcdd189ac0 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1923,7 +1923,10 @@ static void io_iopoll_complete(struct io_ring_ctx *ctx, unsigned int *nr_events,
 			io_req_free_batch(&rb, req);
 	}
 
+	spin_lock_irq(&ctx->completion_lock);
 	io_commit_cqring(ctx);
+	spin_unlock_irq(&ctx->completion_lock);
+
 	if (ctx->flags & IORING_SETUP_SQPOLL)
 		io_cqring_ev_posted(ctx);
 	io_req_free_batch_finish(ctx, &rb);
-- 
2.24.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 3/8] io_uring: fix ignoring eventfd in iopoll
  2020-06-30 12:20 [PATCH 0/8] iopoll and task_work fixes Pavel Begunkov
  2020-06-30 12:20 ` [PATCH 1/8] io_uring: fix io_fail_links() locking Pavel Begunkov
  2020-06-30 12:20 ` [PATCH 2/8] io_uring: fix commit_cqring() locking in iopoll Pavel Begunkov
@ 2020-06-30 12:20 ` Pavel Begunkov
  2020-06-30 12:20 ` [PATCH 4/8] io_uring: fix missing ->mm on exit Pavel Begunkov
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Pavel Begunkov @ 2020-06-30 12:20 UTC (permalink / raw)
  To: Jens Axboe, io-uring

io_cqring_ev_posted() also signals eventfd, and nothing prohibits
having eventfd with IOPOLL. The problem is that it will be ignored
in io_iopoll_complete() in non IORING_SETUP_SQPOLL case.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 96fcdd189ac0..776f593a5bf3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1926,9 +1926,7 @@ static void io_iopoll_complete(struct io_ring_ctx *ctx, unsigned int *nr_events,
 	spin_lock_irq(&ctx->completion_lock);
 	io_commit_cqring(ctx);
 	spin_unlock_irq(&ctx->completion_lock);
-
-	if (ctx->flags & IORING_SETUP_SQPOLL)
-		io_cqring_ev_posted(ctx);
+	io_cqring_ev_posted(ctx);
 	io_req_free_batch_finish(ctx, &rb);
 
 	if (!list_empty(&again))
-- 
2.24.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 4/8] io_uring: fix missing ->mm on exit
  2020-06-30 12:20 [PATCH 0/8] iopoll and task_work fixes Pavel Begunkov
                   ` (2 preceding siblings ...)
  2020-06-30 12:20 ` [PATCH 3/8] io_uring: fix ignoring eventfd " Pavel Begunkov
@ 2020-06-30 12:20 ` Pavel Begunkov
  2020-06-30 12:20 ` [PATCH 5/8] io_uring: don't fail iopoll requeue without ->mm Pavel Begunkov
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Pavel Begunkov @ 2020-06-30 12:20 UTC (permalink / raw)
  To: Jens Axboe, io-uring

There is a fancy bug, where exiting user task may not have ->mm,
that makes task_works to try to do kthread_use_mm(ctx->sqo_mm).

Don't do that if sqo_mm is NULL.

[  290.460558] WARNING: CPU: 6 PID: 150933 at kernel/kthread.c:1238
	kthread_use_mm+0xf3/0x110
[  290.460579] CPU: 6 PID: 150933 Comm: read-write2 Tainted: G
	I E     5.8.0-rc2-00066-g9b21720607cf #531
[  290.460580] RIP: 0010:kthread_use_mm+0xf3/0x110
...
[  290.460584] Call Trace:
[  290.460584]  __io_sq_thread_acquire_mm.isra.0.part.0+0x25/0x30
[  290.460584]  __io_req_task_submit+0x64/0x80
[  290.460584]  io_req_task_submit+0x15/0x20
[  290.460585]  task_work_run+0x67/0xa0
[  290.460585]  do_exit+0x35d/0xb70
[  290.460585]  do_group_exit+0x43/0xa0
[  290.460585]  get_signal+0x140/0x900
[  290.460586]  do_signal+0x37/0x780
[  290.460586]  __prepare_exit_to_usermode+0x126/0x1c0
[  290.460586]  __syscall_return_slowpath+0x3b/0x1c0
[  290.460587]  do_syscall_64+0x5f/0xa0
[  290.460587]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

following with faults.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 776f593a5bf3..c7986c27272e 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -958,7 +958,7 @@ static void io_sq_thread_drop_mm(struct io_ring_ctx *ctx)
 static int __io_sq_thread_acquire_mm(struct io_ring_ctx *ctx)
 {
 	if (!current->mm) {
-		if (unlikely(!mmget_not_zero(ctx->sqo_mm)))
+		if (unlikely(!ctx->sqo_mm || !mmget_not_zero(ctx->sqo_mm)))
 			return -EFAULT;
 		kthread_use_mm(ctx->sqo_mm);
 	}
@@ -7212,10 +7212,10 @@ static int io_sq_offload_start(struct io_ring_ctx *ctx,
 {
 	int ret;
 
-	mmgrab(current->mm);
-	ctx->sqo_mm = current->mm;
-
 	if (ctx->flags & IORING_SETUP_SQPOLL) {
+		mmgrab(current->mm);
+		ctx->sqo_mm = current->mm;
+
 		ret = -EPERM;
 		if (!capable(CAP_SYS_ADMIN))
 			goto err;
@@ -7259,8 +7259,10 @@ static int io_sq_offload_start(struct io_ring_ctx *ctx,
 	return 0;
 err:
 	io_finish_async(ctx);
-	mmdrop(ctx->sqo_mm);
-	ctx->sqo_mm = NULL;
+	if (ctx->sqo_mm) {
+		mmdrop(ctx->sqo_mm);
+		ctx->sqo_mm = NULL;
+	}
 	return ret;
 }
 
-- 
2.24.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 5/8] io_uring: don't fail iopoll requeue without ->mm
  2020-06-30 12:20 [PATCH 0/8] iopoll and task_work fixes Pavel Begunkov
                   ` (3 preceding siblings ...)
  2020-06-30 12:20 ` [PATCH 4/8] io_uring: fix missing ->mm on exit Pavel Begunkov
@ 2020-06-30 12:20 ` Pavel Begunkov
  2020-06-30 12:20 ` [PATCH 6/8] io_uring: fix NULL mm in io_poll_task_func() Pavel Begunkov
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Pavel Begunkov @ 2020-06-30 12:20 UTC (permalink / raw)
  To: Jens Axboe, io-uring

Actually, io_iopoll_queue() may have NULL ->mm, that's if SQ thread
didn't grabbed mm before doing iopoll. Don't fail reqs there, as after
recent changes it won't be punted directly but rather through task_work.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index c7986c27272e..589cc157e29c 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1881,9 +1881,7 @@ static void io_iopoll_queue(struct list_head *again)
 	do {
 		req = list_first_entry(again, struct io_kiocb, list);
 		list_del(&req->list);
-
-		/* should have ->mm unless io_uring is dying, kill reqs then */
-		if (unlikely(!current->mm) || !io_rw_reissue(req, -EAGAIN))
+		if (!io_rw_reissue(req, -EAGAIN))
 			io_complete_rw_common(&req->rw.kiocb, -EAGAIN, NULL);
 	} while (!list_empty(again));
 }
-- 
2.24.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 6/8] io_uring: fix NULL mm in io_poll_task_func()
  2020-06-30 12:20 [PATCH 0/8] iopoll and task_work fixes Pavel Begunkov
                   ` (4 preceding siblings ...)
  2020-06-30 12:20 ` [PATCH 5/8] io_uring: don't fail iopoll requeue without ->mm Pavel Begunkov
@ 2020-06-30 12:20 ` Pavel Begunkov
  2020-06-30 12:20 ` [PATCH 7/8] io_uring: simplify io_async_task_func() Pavel Begunkov
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Pavel Begunkov @ 2020-06-30 12:20 UTC (permalink / raw)
  To: Jens Axboe, io-uring

io_poll_task_func() hand-coded link submission forgetting to set
TASK_RUNNING, acquire mm, etc. Call existing helper for that,
i.e. __io_req_task_submit().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 589cc157e29c..57c194de9165 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4492,13 +4492,8 @@ static void io_poll_task_func(struct callback_head *cb)
 	struct io_kiocb *nxt = NULL;
 
 	io_poll_task_handler(req, &nxt);
-	if (nxt) {
-		struct io_ring_ctx *ctx = nxt->ctx;
-
-		mutex_lock(&ctx->uring_lock);
-		__io_queue_sqe(nxt, NULL, NULL);
-		mutex_unlock(&ctx->uring_lock);
-	}
+	if (nxt)
+		__io_req_task_submit(nxt);
 }
 
 static int io_poll_double_wake(struct wait_queue_entry *wait, unsigned mode,
-- 
2.24.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 7/8] io_uring: simplify io_async_task_func()
  2020-06-30 12:20 [PATCH 0/8] iopoll and task_work fixes Pavel Begunkov
                   ` (5 preceding siblings ...)
  2020-06-30 12:20 ` [PATCH 6/8] io_uring: fix NULL mm in io_poll_task_func() Pavel Begunkov
@ 2020-06-30 12:20 ` Pavel Begunkov
  2020-06-30 12:20 ` [PATCH 8/8] io_uring: optimise io_req_find_next() fast check Pavel Begunkov
  2020-06-30 12:39 ` [PATCH 0/8] iopoll and task_work fixes Pavel Begunkov
  8 siblings, 0 replies; 17+ messages in thread
From: Pavel Begunkov @ 2020-06-30 12:20 UTC (permalink / raw)
  To: Jens Axboe, io-uring

Greatly simplify io_async_task_func() removing duplicated functionality
of __io_req_task_submit(). This do one extra spin lock/unlock for
cancelled poll case, but that shouldn't happen often.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 29 +++++------------------------
 1 file changed, 5 insertions(+), 24 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 57c194de9165..68dcc29c5dc5 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4582,7 +4582,6 @@ static void io_async_task_func(struct callback_head *cb)
 	struct io_kiocb *req = container_of(cb, struct io_kiocb, task_work);
 	struct async_poll *apoll = req->apoll;
 	struct io_ring_ctx *ctx = req->ctx;
-	bool canceled = false;
 
 	trace_io_uring_task_run(req->ctx, req->opcode, req->user_data);
 
@@ -4592,15 +4591,8 @@ static void io_async_task_func(struct callback_head *cb)
 	}
 
 	/* If req is still hashed, it cannot have been canceled. Don't check. */
-	if (hash_hashed(&req->hash_node)) {
+	if (hash_hashed(&req->hash_node))
 		hash_del(&req->hash_node);
-	} else {
-		canceled = READ_ONCE(apoll->poll.canceled);
-		if (canceled) {
-			io_cqring_fill_event(req, -ECANCELED);
-			io_commit_cqring(ctx);
-		}
-	}
 
 	spin_unlock_irq(&ctx->completion_lock);
 
@@ -4609,21 +4601,10 @@ static void io_async_task_func(struct callback_head *cb)
 		memcpy(&req->work, &apoll->work, sizeof(req->work));
 	kfree(apoll);
 
-	if (!canceled) {
-		__set_current_state(TASK_RUNNING);
-		if (io_sq_thread_acquire_mm(ctx, req)) {
-			io_cqring_add_event(req, -EFAULT, 0);
-			goto end_req;
-		}
-		mutex_lock(&ctx->uring_lock);
-		__io_queue_sqe(req, NULL, NULL);
-		mutex_unlock(&ctx->uring_lock);
-	} else {
-		io_cqring_ev_posted(ctx);
-end_req:
-		req_set_fail_links(req);
-		io_double_put_req(req);
-	}
+	if (!READ_ONCE(apoll->poll.canceled))
+		__io_req_task_submit(req);
+	else
+		__io_req_task_cancel(req, -ECANCELED);
 }
 
 static int io_async_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
-- 
2.24.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 8/8] io_uring: optimise io_req_find_next() fast check
  2020-06-30 12:20 [PATCH 0/8] iopoll and task_work fixes Pavel Begunkov
                   ` (6 preceding siblings ...)
  2020-06-30 12:20 ` [PATCH 7/8] io_uring: simplify io_async_task_func() Pavel Begunkov
@ 2020-06-30 12:20 ` Pavel Begunkov
  2020-06-30 12:39 ` [PATCH 0/8] iopoll and task_work fixes Pavel Begunkov
  8 siblings, 0 replies; 17+ messages in thread
From: Pavel Begunkov @ 2020-06-30 12:20 UTC (permalink / raw)
  To: Jens Axboe, io-uring

gcc 9.2.0 compiles io_req_find_next() as a separate function leaving
the first REQ_F_LINK_HEAD fast check not inlined. Help it by splitting
out the check from the function.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 68dcc29c5dc5..8cac266b4674 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1643,12 +1643,9 @@ static void io_fail_links(struct io_kiocb *req)
 	io_cqring_ev_posted(ctx);
 }
 
-static struct io_kiocb *io_req_find_next(struct io_kiocb *req)
+static struct io_kiocb *__io_req_find_next(struct io_kiocb *req)
 {
-	if (likely(!(req->flags & REQ_F_LINK_HEAD)))
-		return NULL;
 	req->flags &= ~REQ_F_LINK_HEAD;
-
 	if (req->flags & REQ_F_LINK_TIMEOUT)
 		io_kill_linked_timeout(req);
 
@@ -1664,6 +1661,13 @@ static struct io_kiocb *io_req_find_next(struct io_kiocb *req)
 	return NULL;
 }
 
+static struct io_kiocb *io_req_find_next(struct io_kiocb *req)
+{
+	if (likely(!(req->flags & REQ_F_LINK_HEAD)))
+		return NULL;
+	return __io_req_find_next(req);
+}
+
 static void __io_req_task_cancel(struct io_kiocb *req, int error)
 {
 	struct io_ring_ctx *ctx = req->ctx;
-- 
2.24.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/8] iopoll and task_work fixes
  2020-06-30 12:20 [PATCH 0/8] iopoll and task_work fixes Pavel Begunkov
                   ` (7 preceding siblings ...)
  2020-06-30 12:20 ` [PATCH 8/8] io_uring: optimise io_req_find_next() fast check Pavel Begunkov
@ 2020-06-30 12:39 ` Pavel Begunkov
  8 siblings, 0 replies; 17+ messages in thread
From: Pavel Begunkov @ 2020-06-30 12:39 UTC (permalink / raw)
  To: Jens Axboe, io-uring

On 30/06/2020 15:20, Pavel Begunkov wrote:
> [1-3] are iopoll fixes, where a bug in [1] I unfortenatuly added
> yesterday. [4-6] are task_work related.

I think there are even more bugs. I'll just leave it here, if somebody
wants to take a look.

- I have a hunch that linked timeouts are broken, probably all these
re-submits cause double io_queue_linked_timeout().

- select-buf may be if not leaking, then not reported to a user
in some cases. Worth to check.

- timeout-overflow test failed for me sometime ago. I think that's bulk
completion related. I sent a patch for a very similar bug, but it got
lost.


> Tell me, if something from it is needed for 5.8
> 
> Pavel Begunkov (8):
>   io_uring: fix io_fail_links() locking
>   io_uring: fix commit_cqring() locking in iopoll
>   io_uring: fix ignoring eventfd in iopoll
>   io_uring: fix missing ->mm on exit
>   io_uring: don't fail iopoll requeue without ->mm
>   io_uring: fix NULL mm in io_poll_task_func()
>   io_uring: simplify io_async_task_func()
>   io_uring: optimise io_req_find_next() fast check
> 
>  fs/io_uring.c | 79 +++++++++++++++++++++------------------------------
>  1 file changed, 33 insertions(+), 46 deletions(-)
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/8] io_uring: fix commit_cqring() locking in iopoll
  2020-06-30 12:20 ` [PATCH 2/8] io_uring: fix commit_cqring() locking in iopoll Pavel Begunkov
@ 2020-06-30 14:04   ` Jens Axboe
  2020-06-30 14:36     ` Pavel Begunkov
  0 siblings, 1 reply; 17+ messages in thread
From: Jens Axboe @ 2020-06-30 14:04 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 6/30/20 6:20 AM, Pavel Begunkov wrote:
> Don't call io_commit_cqring() without holding the completion spinlock
> in io_iopoll_complete(), it can race, e.g. with async request failing.

Can you be more specific?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/8] io_uring: fix commit_cqring() locking in iopoll
  2020-06-30 14:04   ` Jens Axboe
@ 2020-06-30 14:36     ` Pavel Begunkov
  2020-06-30 14:46       ` Jens Axboe
  0 siblings, 1 reply; 17+ messages in thread
From: Pavel Begunkov @ 2020-06-30 14:36 UTC (permalink / raw)
  To: Jens Axboe, io-uring

On 30/06/2020 17:04, Jens Axboe wrote:
> On 6/30/20 6:20 AM, Pavel Begunkov wrote:
>> Don't call io_commit_cqring() without holding the completion spinlock
>> in io_iopoll_complete(), it can race, e.g. with async request failing.
> 
> Can you be more specific?

io_iopoll_complete()
	-> io_req_free_batch()
		-> io_queue_next()
			-> io_req_task_queue()
				-> task_work_add()

if this task_work_add() fails, it will be redirected to io-wq manager task
to do io_req_task_cancel() -> commit_cqring().


And probably something similar will happen if a request currently in io-wq
is retried with
io_rw_should_retry() -> io_async_buf_func() -> task_work_add()


I'll resend patch/series later with a better description. 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/8] io_uring: fix io_fail_links() locking
  2020-06-30 12:20 ` [PATCH 1/8] io_uring: fix io_fail_links() locking Pavel Begunkov
@ 2020-06-30 14:38   ` Jens Axboe
  2020-06-30 14:45     ` Pavel Begunkov
  0 siblings, 1 reply; 17+ messages in thread
From: Jens Axboe @ 2020-06-30 14:38 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 6/30/20 6:20 AM, Pavel Begunkov wrote:
> 86b71d0daee05 ("io_uring: deduplicate freeing linked timeouts")
> actually fixed one bug, where io_fail_links() doesn't consider
> REQ_F_COMP_LOCKED, but added another -- io_cqring_fill_event()
> without any locking
> 
> Return locking back there and do it right with REQ_F_COMP_LOCKED
> check.

Something like the below is much better, and it also makes it so that
the static analyzers don't throw a fit. I'm going to fold this into
the original.


diff --git a/fs/io_uring.c b/fs/io_uring.c
index e1410ff31892..a0aea78162a6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1600,7 +1600,7 @@ static void io_req_link_next(struct io_kiocb *req, struct io_kiocb **nxtptr)
 /*
  * Called if REQ_F_LINK_HEAD is set, and we fail the head request
  */
-static void io_fail_links(struct io_kiocb *req)
+static void __io_fail_links(struct io_kiocb *req)
 {
 	struct io_ring_ctx *ctx = req->ctx;
 
@@ -1620,6 +1620,23 @@ static void io_fail_links(struct io_kiocb *req)
 	io_cqring_ev_posted(ctx);
 }
 
+static void io_fail_links(struct io_kiocb *req)
+{
+	struct io_ring_ctx *ctx = req->ctx;
+
+	if (!(req->flags & REQ_F_COMP_LOCKED)) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&ctx->completion_lock, flags);
+		__io_fail_links(req);
+		spin_unlock_irqrestore(&ctx->completion_lock, flags);
+	} else {
+		__io_fail_links(req);
+	}
+
+	io_cqring_ev_posted(ctx);
+}
+
 static void io_req_find_next(struct io_kiocb *req, struct io_kiocb **nxt)
 {
 	if (likely(!(req->flags & REQ_F_LINK_HEAD)))

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/8] io_uring: fix io_fail_links() locking
  2020-06-30 14:38   ` Jens Axboe
@ 2020-06-30 14:45     ` Pavel Begunkov
  0 siblings, 0 replies; 17+ messages in thread
From: Pavel Begunkov @ 2020-06-30 14:45 UTC (permalink / raw)
  To: Jens Axboe, io-uring

On 30/06/2020 17:38, Jens Axboe wrote:
> On 6/30/20 6:20 AM, Pavel Begunkov wrote:
>> 86b71d0daee05 ("io_uring: deduplicate freeing linked timeouts")
>> actually fixed one bug, where io_fail_links() doesn't consider
>> REQ_F_COMP_LOCKED, but added another -- io_cqring_fill_event()
>> without any locking
>>
>> Return locking back there and do it right with REQ_F_COMP_LOCKED
>> check.
> 
> Something like the below is much better, and it also makes it so that
> the static analyzers don't throw a fit. I'm going to fold this into
> the original.

Good point about analyzers, even uninitialised flags there is a warning.
This pattern usually prevents inlining, but don't care about
io_fail_links().

> 
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index e1410ff31892..a0aea78162a6 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -1600,7 +1600,7 @@ static void io_req_link_next(struct io_kiocb *req, struct io_kiocb **nxtptr)
>  /*
>   * Called if REQ_F_LINK_HEAD is set, and we fail the head request
>   */
> -static void io_fail_links(struct io_kiocb *req)
> +static void __io_fail_links(struct io_kiocb *req)
>  {
>  	struct io_ring_ctx *ctx = req->ctx;
>  
> @@ -1620,6 +1620,23 @@ static void io_fail_links(struct io_kiocb *req)
>  	io_cqring_ev_posted(ctx);
>  }
>  
> +static void io_fail_links(struct io_kiocb *req)
> +{
> +	struct io_ring_ctx *ctx = req->ctx;
> +
> +	if (!(req->flags & REQ_F_COMP_LOCKED)) {
> +		unsigned long flags;
> +
> +		spin_lock_irqsave(&ctx->completion_lock, flags);
> +		__io_fail_links(req);
> +		spin_unlock_irqrestore(&ctx->completion_lock, flags);
> +	} else {
> +		__io_fail_links(req);
> +	}
> +
> +	io_cqring_ev_posted(ctx);
> +}
> +
>  static void io_req_find_next(struct io_kiocb *req, struct io_kiocb **nxt)
>  {
>  	if (likely(!(req->flags & REQ_F_LINK_HEAD)))
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/8] io_uring: fix commit_cqring() locking in iopoll
  2020-06-30 14:36     ` Pavel Begunkov
@ 2020-06-30 14:46       ` Jens Axboe
  2020-06-30 15:00         ` Pavel Begunkov
  0 siblings, 1 reply; 17+ messages in thread
From: Jens Axboe @ 2020-06-30 14:46 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 6/30/20 8:36 AM, Pavel Begunkov wrote:
> On 30/06/2020 17:04, Jens Axboe wrote:
>> On 6/30/20 6:20 AM, Pavel Begunkov wrote:
>>> Don't call io_commit_cqring() without holding the completion spinlock
>>> in io_iopoll_complete(), it can race, e.g. with async request failing.
>>
>> Can you be more specific?
> 
> io_iopoll_complete()
> 	-> io_req_free_batch()
> 		-> io_queue_next()
> 			-> io_req_task_queue()
> 				-> task_work_add()
> 
> if this task_work_add() fails, it will be redirected to io-wq manager task
> to do io_req_task_cancel() -> commit_cqring().
> 
> 
> And probably something similar will happen if a request currently in io-wq
> is retried with
> io_rw_should_retry() -> io_async_buf_func() -> task_work_add()

For the IOPOLL setup, it should be protected by the ring mutex. Adding
the irq disable + lock for polling would be a nasty performance hit.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/8] io_uring: fix commit_cqring() locking in iopoll
  2020-06-30 14:46       ` Jens Axboe
@ 2020-06-30 15:00         ` Pavel Begunkov
  2020-06-30 15:33           ` Jens Axboe
  0 siblings, 1 reply; 17+ messages in thread
From: Pavel Begunkov @ 2020-06-30 15:00 UTC (permalink / raw)
  To: Jens Axboe, io-uring

On 30/06/2020 17:46, Jens Axboe wrote:
> On 6/30/20 8:36 AM, Pavel Begunkov wrote:
>> On 30/06/2020 17:04, Jens Axboe wrote:
>>> On 6/30/20 6:20 AM, Pavel Begunkov wrote:
>>>> Don't call io_commit_cqring() without holding the completion spinlock
>>>> in io_iopoll_complete(), it can race, e.g. with async request failing.
>>>
>>> Can you be more specific?
>>
>> io_iopoll_complete()
>> 	-> io_req_free_batch()
>> 		-> io_queue_next()
>> 			-> io_req_task_queue()
>> 				-> task_work_add()
>>
>> if this task_work_add() fails, it will be redirected to io-wq manager task
>> to do io_req_task_cancel() -> commit_cqring().
>>
>>
>> And probably something similar will happen if a request currently in io-wq
>> is retried with
>> io_rw_should_retry() -> io_async_buf_func() -> task_work_add()
> 
> For the IOPOLL setup, it should be protected by the ring mutex. Adding
> the irq disable + lock for polling would be a nasty performance hit.

Ok. For what it worth, it would be easier to accidentally screw in the future.
I'll try it out.


BTW, if you're going to cherry-pick patches from this series, just push them
into the branch. I'll check git.kernel.dk later and resend what's left.

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/8] io_uring: fix commit_cqring() locking in iopoll
  2020-06-30 15:00         ` Pavel Begunkov
@ 2020-06-30 15:33           ` Jens Axboe
  0 siblings, 0 replies; 17+ messages in thread
From: Jens Axboe @ 2020-06-30 15:33 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 6/30/20 9:00 AM, Pavel Begunkov wrote:
> On 30/06/2020 17:46, Jens Axboe wrote:
>> On 6/30/20 8:36 AM, Pavel Begunkov wrote:
>>> On 30/06/2020 17:04, Jens Axboe wrote:
>>>> On 6/30/20 6:20 AM, Pavel Begunkov wrote:
>>>>> Don't call io_commit_cqring() without holding the completion spinlock
>>>>> in io_iopoll_complete(), it can race, e.g. with async request failing.
>>>>
>>>> Can you be more specific?
>>>
>>> io_iopoll_complete()
>>> 	-> io_req_free_batch()
>>> 		-> io_queue_next()
>>> 			-> io_req_task_queue()
>>> 				-> task_work_add()
>>>
>>> if this task_work_add() fails, it will be redirected to io-wq manager task
>>> to do io_req_task_cancel() -> commit_cqring().
>>>
>>>
>>> And probably something similar will happen if a request currently in io-wq
>>> is retried with
>>> io_rw_should_retry() -> io_async_buf_func() -> task_work_add()
>>
>> For the IOPOLL setup, it should be protected by the ring mutex. Adding
>> the irq disable + lock for polling would be a nasty performance hit.
> 
> Ok. For what it worth, it would be easier to accidentally screw in the future.
> I'll try it out.
> 
> 
> BTW, if you're going to cherry-pick patches from this series, just push them
> into the branch. I'll check git.kernel.dk later and resend what's left.

Yep done, I'll push it out.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, back to index

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-30 12:20 [PATCH 0/8] iopoll and task_work fixes Pavel Begunkov
2020-06-30 12:20 ` [PATCH 1/8] io_uring: fix io_fail_links() locking Pavel Begunkov
2020-06-30 14:38   ` Jens Axboe
2020-06-30 14:45     ` Pavel Begunkov
2020-06-30 12:20 ` [PATCH 2/8] io_uring: fix commit_cqring() locking in iopoll Pavel Begunkov
2020-06-30 14:04   ` Jens Axboe
2020-06-30 14:36     ` Pavel Begunkov
2020-06-30 14:46       ` Jens Axboe
2020-06-30 15:00         ` Pavel Begunkov
2020-06-30 15:33           ` Jens Axboe
2020-06-30 12:20 ` [PATCH 3/8] io_uring: fix ignoring eventfd " Pavel Begunkov
2020-06-30 12:20 ` [PATCH 4/8] io_uring: fix missing ->mm on exit Pavel Begunkov
2020-06-30 12:20 ` [PATCH 5/8] io_uring: don't fail iopoll requeue without ->mm Pavel Begunkov
2020-06-30 12:20 ` [PATCH 6/8] io_uring: fix NULL mm in io_poll_task_func() Pavel Begunkov
2020-06-30 12:20 ` [PATCH 7/8] io_uring: simplify io_async_task_func() Pavel Begunkov
2020-06-30 12:20 ` [PATCH 8/8] io_uring: optimise io_req_find_next() fast check Pavel Begunkov
2020-06-30 12:39 ` [PATCH 0/8] iopoll and task_work fixes Pavel Begunkov

IO-Uring Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/io-uring/0 io-uring/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 io-uring io-uring/ https://lore.kernel.org/io-uring \
		io-uring@vger.kernel.org
	public-inbox-index io-uring

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.io-uring


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git