* [PATCH 0/5] another round of small tinkering
From: Pavel Begunkov @ 2021-02-12  3:23 UTC
  To: Jens Axboe, io-uring

Some other small improvements with a negative diffstat.
Patches 2-3 address the acquire_mm_files overhead.

Pavel Begunkov (5):
  io_uring: take compl state from submit state
  io_uring: optimise out unlikely link queue
  io_uring: optimise SQPOLL mm/files grabbing
  io_uring: don't duplicate io_req_task_queue()
  io_uring: save ctx put/get for task_work submit

 fs/io_uring.c | 103 +++++++++++++++++++-------------------------------
 1 file changed, 39 insertions(+), 64 deletions(-)

-- 
2.24.0



* [PATCH 1/5] io_uring: take compl state from submit state
From: Pavel Begunkov @ 2021-02-12  3:23 UTC
  To: Jens Axboe, io-uring

Completion and submission states are now coupled together, so it's odd
to get one from an argument and the other from the ctx. Do it
consistently for io_req_free_batch(). It's also faster, as we already
have @state cached in registers.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 7b1979624320..8c2613bf54d3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2417,13 +2417,10 @@ static void io_req_free_batch(struct req_batch *rb, struct io_kiocb *req,
 	rb->ctx_refs++;
 
 	io_dismantle_req(req);
-	if (state->free_reqs != ARRAY_SIZE(state->reqs)) {
+	if (state->free_reqs != ARRAY_SIZE(state->reqs))
 		state->reqs[state->free_reqs++] = req;
-	} else {
-		struct io_comp_state *cs = &req->ctx->submit_state.comp;
-
-		list_add(&req->compl.list, &cs->free_list);
-	}
+	else
+		list_add(&req->compl.list, &state->comp.free_list);
 }
 
 static void io_submit_flush_completions(struct io_comp_state *cs,
-- 
2.24.0
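
A note on the pattern in this hunk: freed requests are recycled through
a two-tier cache, with a fixed-size array as the fast tier and a linked
list taking the overflow. A minimal userspace sketch of the idea, with
hypothetical type and field names standing in for the io_uring
internals:

#define CACHE_SIZE 32

struct node {
	struct node *next;		/* links overflow entries */
};

struct cache {
	unsigned int nr_free;		/* slots used in the array tier */
	struct node *reqs[CACHE_SIZE];	/* fast tier: no pointer chasing */
	struct node *free_list;		/* slow tier: unbounded list */
};

/* Return a freed object to the cache: array first, list on overflow. */
static void cache_put(struct cache *c, struct node *n)
{
	if (c->nr_free != CACHE_SIZE) {
		c->reqs[c->nr_free++] = n;
	} else {
		n->next = c->free_list;
		c->free_list = n;
	}
}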



* [PATCH 2/5] io_uring: optimise out unlikely link queue
From: Pavel Begunkov @ 2021-02-12  3:23 UTC
  To: Jens Axboe, io-uring

__io_queue_sqe() tries to issue as many requests of a link as it can,
using io_put_req_find_next() to extract the next one, targeting
inline-completed requests. Now that __io_queue_sqe() is always used
together with struct io_comp_state, this next-propagation is left only
a small window, and only for async reqs, which doesn't justify its
existence.

Remove it and make __io_queue_sqe() issue only the head request. This
simplifies the code and will allow further optimisations.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 42 ++++++++++--------------------------------
 1 file changed, 10 insertions(+), 32 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 8c2613bf54d3..26d1080217e5 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6563,26 +6563,20 @@ static struct io_kiocb *io_prep_linked_timeout(struct io_kiocb *req)
 
 static void __io_queue_sqe(struct io_kiocb *req)
 {
-	struct io_kiocb *linked_timeout;
+	struct io_kiocb *linked_timeout = io_prep_linked_timeout(req);
 	const struct cred *old_creds = NULL;
 	int ret;
 
-again:
-	linked_timeout = io_prep_linked_timeout(req);
-
 	if ((req->flags & REQ_F_WORK_INITIALIZED) &&
 	    (req->work.flags & IO_WQ_WORK_CREDS) &&
-	    req->work.identity->creds != current_cred()) {
-		if (old_creds)
-			revert_creds(old_creds);
-		if (old_creds == req->work.identity->creds)
-			old_creds = NULL; /* restored original creds */
-		else
-			old_creds = override_creds(req->work.identity->creds);
-	}
+	    req->work.identity->creds != current_cred())
+		old_creds = override_creds(req->work.identity->creds);
 
 	ret = io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_COMPLETE_DEFER);
 
+	if (old_creds)
+		revert_creds(old_creds);
+
 	/*
 	 * We async punt it if the file wasn't marked NOWAIT, or if the file
 	 * doesn't support non-blocking read/write attempts
@@ -6595,9 +6589,6 @@ static void __io_queue_sqe(struct io_kiocb *req)
 			 */
 			io_queue_async_work(req);
 		}
-
-		if (linked_timeout)
-			io_queue_linked_timeout(linked_timeout);
 	} else if (likely(!ret)) {
 		/* drop submission reference */
 		if (req->flags & REQ_F_COMPLETE_INLINE) {
@@ -6605,31 +6596,18 @@ static void __io_queue_sqe(struct io_kiocb *req)
 			struct io_comp_state *cs = &ctx->submit_state.comp;
 
 			cs->reqs[cs->nr++] = req;
-			if (cs->nr == IO_COMPL_BATCH)
+			if (cs->nr == ARRAY_SIZE(cs->reqs))
 				io_submit_flush_completions(cs, ctx);
-			req = NULL;
 		} else {
-			req = io_put_req_find_next(req);
-		}
-
-		if (linked_timeout)
-			io_queue_linked_timeout(linked_timeout);
-
-		if (req) {
-			if (!(req->flags & REQ_F_FORCE_ASYNC))
-				goto again;
-			io_queue_async_work(req);
+			io_put_req(req);
 		}
 	} else {
-		/* un-prep timeout, so it'll be killed as any other linked */
-		req->flags &= ~REQ_F_LINK_TIMEOUT;
 		req_set_fail_links(req);
 		io_put_req(req);
 		io_req_complete(req, ret);
 	}
-
-	if (old_creds)
-		revert_creds(old_creds);
+	if (linked_timeout)
+		io_queue_linked_timeout(linked_timeout);
 }
 
 static void io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-- 
2.24.0
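
With the retry loop gone, the credential handling collapses into a
plain save/override/restore bracket around a single issue attempt.
Schematically (a sketch of the resulting shape, not the verbatim kernel
code; req_needs_other_creds() and do_issue() are invented stand-ins):

static void issue_one(struct io_kiocb *req)
{
	const struct cred *old_creds = NULL;

	/* switch identity only if the request carries different creds */
	if (req_needs_other_creds(req))
		old_creds = override_creds(req->work.identity->creds);

	do_issue(req);

	/* restore the original identity before any follow-up work */
	if (old_creds)
		revert_creds(old_creds);
}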



* [PATCH 3/5] io_uring: optimise SQPOLL mm/files grabbing
From: Pavel Begunkov @ 2021-02-12  3:23 UTC
  To: Jens Axboe, io-uring

There are two reasons for this change. The first is to optimise
io_sq_thread_acquire_mm_files() for the non-SQPOLL case, which
currently does too many checks and function calls in the hot path,
e.g. in io_init_req().

The second is to not grab mm/files when they are not needed. As
__io_queue_sqe() now issues only one request at a time, we can reuse
io_sq_thread_acquire_mm_files() instead of unconditionally acquiring
mm/files.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 26d1080217e5..813d1ccd7a69 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1145,9 +1145,6 @@ static void io_sq_thread_drop_mm_files(void)
 
 static int __io_sq_thread_acquire_files(struct io_ring_ctx *ctx)
 {
-	if (current->flags & PF_EXITING)
-		return -EFAULT;
-
 	if (!current->files) {
 		struct files_struct *files;
 		struct nsproxy *nsproxy;
@@ -1175,15 +1172,9 @@ static int __io_sq_thread_acquire_mm(struct io_ring_ctx *ctx)
 {
 	struct mm_struct *mm;
 
-	if (current->flags & PF_EXITING)
-		return -EFAULT;
 	if (current->mm)
 		return 0;
 
-	/* Should never happen */
-	if (unlikely(!(ctx->flags & IORING_SETUP_SQPOLL)))
-		return -EFAULT;
-
 	task_lock(ctx->sqo_task);
 	mm = ctx->sqo_task->mm;
 	if (unlikely(!mm || !mmget_not_zero(mm)))
@@ -1198,8 +1189,8 @@ static int __io_sq_thread_acquire_mm(struct io_ring_ctx *ctx)
 	return -EFAULT;
 }
 
-static int io_sq_thread_acquire_mm_files(struct io_ring_ctx *ctx,
-					 struct io_kiocb *req)
+static int __io_sq_thread_acquire_mm_files(struct io_ring_ctx *ctx,
+					   struct io_kiocb *req)
 {
 	const struct io_op_def *def = &io_op_defs[req->opcode];
 	int ret;
@@ -1219,6 +1210,16 @@ static int io_sq_thread_acquire_mm_files(struct io_ring_ctx *ctx,
 	return 0;
 }
 
+static inline int io_sq_thread_acquire_mm_files(struct io_ring_ctx *ctx,
+						struct io_kiocb *req)
+{
+	if (unlikely(current->flags & PF_EXITING))
+		return -EFAULT;
+	if (!(ctx->flags & IORING_SETUP_SQPOLL))
+		return 0;
+	return __io_sq_thread_acquire_mm_files(ctx, req);
+}
+
 static void io_sq_thread_associate_blkcg(struct io_ring_ctx *ctx,
 					 struct cgroup_subsys_state **cur_css)
 
@@ -2336,9 +2337,7 @@ static void __io_req_task_submit(struct io_kiocb *req)
 	struct io_ring_ctx *ctx = req->ctx;
 
 	mutex_lock(&ctx->uring_lock);
-	if (!ctx->sqo_dead &&
-	    !__io_sq_thread_acquire_mm(ctx) &&
-	    !__io_sq_thread_acquire_files(ctx))
+	if (!ctx->sqo_dead && !io_sq_thread_acquire_mm_files(ctx, req))
 		__io_queue_sqe(req);
 	else
 		__io_req_task_cancel(req, -EFAULT);
-- 
2.24.0
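
The fix follows the usual fast-path/slow-path split: a small inline
wrapper performs the cheap checks that settle the common case, and only
SQPOLL rings pay for the out-of-line call. The idiom in isolation, as a
generic sketch with invented type and flag names:

/* out of line: the rare, expensive path (actually grabbing mm/files) */
static int __acquire_resources(struct ring_ctx *ctx, struct request *req);

/*
 * inline fast path: for non-SQPOLL rings this is two cheap tests and
 * an early return, which is what keeps io_init_req() lean
 */
static inline int acquire_resources(struct ring_ctx *ctx, struct request *req)
{
	if (unlikely(current->flags & PF_EXITING))
		return -EFAULT;
	if (!(ctx->flags & SETUP_SQPOLL))
		return 0;		/* common case: nothing to grab */
	return __acquire_resources(ctx, req);
}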



* [PATCH 4/5] io_uring: don't duplicate io_req_task_queue()
From: Pavel Begunkov @ 2021-02-12  3:23 UTC
  To: Jens Axboe, io-uring

Don't hand-code io_req_task_queue() inside io_async_buf_func(); just
call it.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 813d1ccd7a69..5c0b1a7dba80 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3494,7 +3494,6 @@ static int io_async_buf_func(struct wait_queue_entry *wait, unsigned mode,
 	struct wait_page_queue *wpq;
 	struct io_kiocb *req = wait->private;
 	struct wait_page_key *key = arg;
-	int ret;
 
 	wpq = container_of(wait, struct wait_page_queue, wait);
 
@@ -3504,14 +3503,9 @@ static int io_async_buf_func(struct wait_queue_entry *wait, unsigned mode,
 	req->rw.kiocb.ki_flags &= ~IOCB_WAITQ;
 	list_del_init(&wait->entry);
 
-	req->task_work.func = io_req_task_submit;
-	percpu_ref_get(&req->ctx->refs);
-
 	/* submit ref gets dropped, acquire a new one */
 	refcount_inc(&req->refs);
-	ret = io_req_task_work_add(req);
-	if (unlikely(ret))
-		io_req_task_work_add_fallback(req, io_req_task_cancel);
+	io_req_task_queue(req);
 	return 1;
 }
 
-- 
2.24.0
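
For reference, the helper now being called in place of the open-coded
sequence: going by this hunk and the one in patch 5 below,
io_req_task_queue() at this point in the series amounts to the
following (simplified sketch, not the verbatim kernel code):

static void io_req_task_queue(struct io_kiocb *req)
{
	int ret;

	req->task_work.func = io_req_task_submit;
	percpu_ref_get(&req->ctx->refs);	/* pin ctx until the work runs */

	ret = io_req_task_work_add(req);
	if (unlikely(ret))			/* task exiting: fall back */
		io_req_task_work_add_fallback(req, io_req_task_cancel);
}

Patch 5 then drops the unconditional ctx ref get from this helper.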



* [PATCH 5/5] io_uring: save ctx put/get for task_work submit
From: Pavel Begunkov @ 2021-02-12  3:23 UTC
  To: Jens Axboe, io-uring

Do a little trick in io_ring_ctx_free(): briefly take uring_lock, which
waits for everyone currently holding it. That way we can skip pinning
the ctx with ctx->refs for __io_req_task_submit(), which runs and drops
its refs/reqs while holding the lock.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5c0b1a7dba80..87f2f8e660e8 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2336,6 +2336,7 @@ static void __io_req_task_submit(struct io_kiocb *req)
 {
 	struct io_ring_ctx *ctx = req->ctx;
 
+	/* ctx stays valid until unlock, even if we drop all our ctx->refs */
 	mutex_lock(&ctx->uring_lock);
 	if (!ctx->sqo_dead && !io_sq_thread_acquire_mm_files(ctx, req))
 		__io_queue_sqe(req);
@@ -2347,10 +2348,8 @@ static void __io_req_task_submit(struct io_kiocb *req)
 static void io_req_task_submit(struct callback_head *cb)
 {
 	struct io_kiocb *req = container_of(cb, struct io_kiocb, task_work);
-	struct io_ring_ctx *ctx = req->ctx;
 
 	__io_req_task_submit(req);
-	percpu_ref_put(&ctx->refs);
 }
 
 static void io_req_task_queue(struct io_kiocb *req)
@@ -2358,11 +2357,11 @@ static void io_req_task_queue(struct io_kiocb *req)
 	int ret;
 
 	req->task_work.func = io_req_task_submit;
-	percpu_ref_get(&req->ctx->refs);
-
 	ret = io_req_task_work_add(req);
-	if (unlikely(ret))
+	if (unlikely(ret)) {
+		percpu_ref_get(&req->ctx->refs);
 		io_req_task_work_add_fallback(req, io_req_task_cancel);
+	}
 }
 
 static inline void io_queue_next(struct io_kiocb *req)
@@ -8707,6 +8706,14 @@ static void io_ring_ctx_free(struct io_ring_ctx *ctx)
 {
 	struct io_submit_state *submit_state = &ctx->submit_state;
 
+	/*
+	 * Some may use the context even when all refs and requests have been put,
+	 * and they are free to do so while still holding uring_lock, see
+	 * __io_req_task_submit(). Wait for them to finish.
+	 */
+	mutex_lock(&ctx->uring_lock);
+	mutex_unlock(&ctx->uring_lock);
+
 	io_finish_async(ctx);
 	io_sqe_buffers_unregister(ctx);
 
-- 
2.24.0
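
The empty lock/unlock pair here is a quiescence barrier: mutex_lock()
cannot return until every current holder of uring_lock has dropped it,
so once we unlock, nobody who relied on the old ctx->refs guarantee is
still inside. The idiom in isolation, as a userspace sketch using
pthreads:

#include <pthread.h>

/*
 * Wait for every thread currently inside a critical section on `lock`
 * to leave it. Only valid when later acquirers can no longer reach the
 * state being torn down (here: all ctx refs are already gone).
 */
static void quiesce(pthread_mutex_t *lock)
{
	pthread_mutex_lock(lock);	/* blocks until current holders exit */
	pthread_mutex_unlock(lock);	/* empty critical section on purpose */
}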



* Re: [PATCH 0/5] another round of small tinkering
From: Jens Axboe @ 2021-02-12 12:44 UTC
  To: Pavel Begunkov, io-uring

On 2/11/21 8:23 PM, Pavel Begunkov wrote:
> Some other small improvements with a negative diffstat.
> Patches 2-3 address the acquire_mm_files overhead.
> 
> Pavel Begunkov (5):
>   io_uring: take compl state from submit state
>   io_uring: optimise out unlikely link queue
>   io_uring: optimise SQPOLL mm/files grabbing
>   io_uring: don't duplicate io_req_task_queue()
>   io_uring: save ctx put/get for task_work submit
> 
>  fs/io_uring.c | 103 +++++++++++++++++++-------------------------------
>  1 file changed, 39 insertions(+), 64 deletions(-)

LGTM, applied, thanks.

-- 
Jens Axboe

