* [PATCH 0/4] a bunch of changes for submmission path
@ 2019-12-30 18:24 Pavel Begunkov
  2019-12-30 18:24 ` [PATCH 1/4] io_uring: clamp to_submit in io_submit_sqes() Pavel Begunkov
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Pavel Begunkov @ 2019-12-30 18:24 UTC (permalink / raw)
  To: Jens Axboe, io-uring

This is mostly about batching smp_load_acquire() in io_get_sqring()
with other minor changes.

Pavel Begunkov (4):
  io_uring: clamp to_submit in io_submit_sqes()
  io_uring: optimise head checks in io_get_sqring()
  io_uring: optimise commit_sqring() for common case
  io_uring: remove extra io_wq_current_is_worker()

 fs/io_uring.c | 32 ++++++++++++--------------------
 1 file changed, 12 insertions(+), 20 deletions(-)

-- 
2.24.0



* [PATCH 1/4] io_uring: clamp to_submit in io_submit_sqes()
From: Pavel Begunkov @ 2019-12-30 18:24 UTC (permalink / raw)
  To: Jens Axboe, io-uring

Make io_submit_sqes() clamp @to_submit itself. This removes duplicated
code from the callers and prepares for the following changes.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index ee860cfad780..4105c0e591c7 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4582,6 +4582,8 @@ static int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr,
 			return -EBUSY;
 	}
 
+	nr = min(nr, ctx->sq_entries);
+
 	if (!percpu_ref_tryget_many(&ctx->refs, nr))
 		return -EAGAIN;
 
@@ -4756,7 +4758,6 @@ static int io_sq_thread(void *data)
 			ctx->rings->sq_flags &= ~IORING_SQ_NEED_WAKEUP;
 		}
 
-		to_submit = min(to_submit, ctx->sq_entries);
 		mutex_lock(&ctx->uring_lock);
 		ret = io_submit_sqes(ctx, to_submit, NULL, -1, &cur_mm, true);
 		mutex_unlock(&ctx->uring_lock);
@@ -6094,7 +6095,6 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 	} else if (to_submit) {
 		struct mm_struct *cur_mm;
 
-		to_submit = min(to_submit, ctx->sq_entries);
 		mutex_lock(&ctx->uring_lock);
 		/* already have mm, so io_submit_sqes() won't try to grab it */
 		cur_mm = ctx->sqo_mm;
-- 
2.24.0
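
Annotation: the point of this patch is that the clamp lives in the callee,
so every caller gets it for free. Below is a minimal userspace sketch of
that idea, not the kernel code. The names ring_model, min_uint() and
model_submit() are hypothetical stand-ins for struct io_ring_ctx, min()
and io_submit_sqes().

```c
#include <assert.h>

/* Hypothetical stand-in for struct io_ring_ctx; only the field used here. */
struct ring_model {
	unsigned int sq_entries;	/* size of the submission ring */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Model of the submit routine after the patch: it clamps the request
 * count itself, so callers may pass any value. Returns the number of
 * entries it would attempt to submit.
 */
static unsigned int model_submit(const struct ring_model *ctx, unsigned int nr)
{
	assert(ctx);
	nr = min_uint(nr, ctx->sq_entries);
	return nr;
}
```

With an 8-entry ring, a caller asking for 100 submissions is limited to 8,
while a caller asking for 3 is left alone.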



* [PATCH 2/4] io_uring: optimise head checks in io_get_sqring()
From: Pavel Begunkov @ 2019-12-30 18:24 UTC (permalink / raw)
  To: Jens Axboe, io-uring

A user may ask to submit more than there is in the ring, and then
io_uring will submit as much as it can. However, in the last iteration
it will allocate an io_kiocb and immediately free it. It could do
better and adjust @to_submit to what is in the ring.

And since the ring's head is already checked here, there is no need to
do it in the loop, issuing an smp_load_acquire() barrier on every
iteration.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 4105c0e591c7..05d07974a5b3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4527,7 +4527,6 @@ static void io_commit_sqring(struct io_ring_ctx *ctx)
 static bool io_get_sqring(struct io_ring_ctx *ctx, struct io_kiocb *req,
 			  const struct io_uring_sqe **sqe_ptr)
 {
-	struct io_rings *rings = ctx->rings;
 	u32 *sq_array = ctx->sq_array;
 	unsigned head;
 
@@ -4539,12 +4538,7 @@ static bool io_get_sqring(struct io_ring_ctx *ctx, struct io_kiocb *req,
 	 * 2) allows the kernel side to track the head on its own, even
 	 *    though the application is the one updating it.
 	 */
-	head = ctx->cached_sq_head;
-	/* make sure SQ entry isn't read before tail */
-	if (unlikely(head == smp_load_acquire(&rings->sq.tail)))
-		return false;
-
-	head = READ_ONCE(sq_array[head & ctx->sq_mask]);
+	head = READ_ONCE(sq_array[ctx->cached_sq_head & ctx->sq_mask]);
 	if (likely(head < ctx->sq_entries)) {
 		/*
 		 * All io need record the previous position, if LINK vs DARIN,
@@ -4562,7 +4556,7 @@ static bool io_get_sqring(struct io_ring_ctx *ctx, struct io_kiocb *req,
 	/* drop invalid entries */
 	ctx->cached_sq_head++;
 	ctx->cached_sq_dropped++;
-	WRITE_ONCE(rings->sq_dropped, ctx->cached_sq_dropped);
+	WRITE_ONCE(ctx->rings->sq_dropped, ctx->cached_sq_dropped);
 	return false;
 }
 
@@ -4582,7 +4576,8 @@ static int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr,
 			return -EBUSY;
 	}
 
-	nr = min(nr, ctx->sq_entries);
+	/* make sure SQ entry isn't read before tail */
+	nr = min3(nr, ctx->sq_entries, io_sqring_entries(ctx));
 
 	if (!percpu_ref_tryget_many(&ctx->refs, nr))
 		return -EAGAIN;
-- 
2.24.0
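
Annotation: a rough userspace model of the batching described above,
written with C11 atomics instead of the kernel's smp_load_acquire(). It
assumes io_sqring_entries() amounts to "acquire-load the tail, subtract
the cached head". All names here (sq_model, sq_pending(), sq_consume(),
RING_ENTRIES) are hypothetical and only illustrate the pattern.

```c
#include <assert.h>
#include <stdatomic.h>

#define RING_ENTRIES 8u	/* hypothetical ring size, must be a power of two */

struct sq_model {
	_Atomic unsigned int tail;		/* published by the producer */
	unsigned int cached_head;		/* private to the consumer */
	unsigned int array[RING_ENTRIES];	/* ring payload */
};

static void sq_init(struct sq_model *sq)
{
	assert(sq);
	atomic_init(&sq->tail, 0);
	sq->cached_head = 0;
}

/* The single acquire load: how many entries are visible right now. */
static unsigned int sq_pending(struct sq_model *sq)
{
	unsigned int tail = atomic_load_explicit(&sq->tail,
						 memory_order_acquire);

	/* unsigned subtraction stays correct across wraparound */
	return tail - sq->cached_head;
}

static unsigned int min3_uint(unsigned int a, unsigned int b, unsigned int c)
{
	unsigned int m = a < b ? a : b;

	return m < c ? m : c;
}

/* Consume up to nr entries, adding them to *sum; returns the count taken. */
static unsigned int sq_consume(struct sq_model *sq, unsigned int nr,
			       unsigned int *sum)
{
	unsigned int i;

	/* clamp once, up front, instead of checking the tail per entry */
	nr = min3_uint(nr, RING_ENTRIES, sq_pending(sq));

	for (i = 0; i < nr; i++) {
		/* plain reads: the acquire load above already ordered them */
		*sum += sq->array[sq->cached_head & (RING_ENTRIES - 1)];
		sq->cached_head++;
	}
	return nr;
}
```

If the producer has published three entries and the consumer asks for 100,
the loop runs exactly three times after one acquire load, and no request is
set up only to be thrown away on a fourth iteration.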



* [PATCH 3/4] io_uring: optimise commit_sqring() for common case
From: Pavel Begunkov @ 2019-12-30 18:24 UTC (permalink / raw)
  To: Jens Axboe, io-uring

It should be pretty rare to not submit anything when there is
something in the ring. No need to keep a heuristic for this case.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 05d07974a5b3..642aca3f2d1f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4506,14 +4506,12 @@ static void io_commit_sqring(struct io_ring_ctx *ctx)
 {
 	struct io_rings *rings = ctx->rings;
 
-	if (ctx->cached_sq_head != READ_ONCE(rings->sq.head)) {
-		/*
-		 * Ensure any loads from the SQEs are done at this point,
-		 * since once we write the new head, the application could
-		 * write new data to them.
-		 */
-		smp_store_release(&rings->sq.head, ctx->cached_sq_head);
-	}
+	/*
+	 * Ensure any loads from the SQEs are done at this point,
+	 * since once we write the new head, the application could
+	 * write new data to them.
+	 */
+	smp_store_release(&rings->sq.head, ctx->cached_sq_head);
 }
 
 /*
-- 
2.24.0
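
Annotation: a small userspace sketch of the unconditional head publish,
using a C11 release store in place of smp_store_release(). The types and
function names (head_model, commit_head(), producer_read_head()) are
hypothetical. The sketch only shows that re-publishing an unchanged head
is harmless, which is why the extra comparison can go.

```c
#include <assert.h>
#include <stdatomic.h>

struct head_model {
	_Atomic unsigned int head;	/* shared; the producer reads it */
	unsigned int cached_head;	/* consumer-private running head */
};

static void head_init(struct head_model *m)
{
	assert(m);
	atomic_init(&m->head, 0);
	m->cached_head = 0;
}

/*
 * Publish the consumer's progress. The release ordering makes sure all
 * earlier reads of ring entries complete before the producer can see
 * the new head and start overwriting those entries.
 */
static void commit_head(struct head_model *m)
{
	atomic_store_explicit(&m->head, m->cached_head, memory_order_release);
}

/* Producer side: pairs with the release store above. */
static unsigned int producer_read_head(struct head_model *m)
{
	return atomic_load_explicit(&m->head, memory_order_acquire);
}
```

Calling commit_head() twice without consuming anything in between stores
the same value again. That costs one store in the rare no-progress case
and saves a load plus a branch in the common one.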



* [PATCH 4/4] io_uring: remove extra io_wq_current_is_worker()
From: Pavel Begunkov @ 2019-12-30 18:24 UTC (permalink / raw)
  To: Jens Axboe, io-uring

io_wq workers use io_issue_sqe() to forward sqes and never
io_queue_sqe(). Remove the extra check for io_wq_current_is_worker().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 642aca3f2d1f..ef0308126fac 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4376,8 +4376,7 @@ static void io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 			req_set_fail_links(req);
 			io_double_put_req(req);
 		}
-	} else if ((req->flags & REQ_F_FORCE_ASYNC) &&
-		   !io_wq_current_is_worker()) {
+	} else if (req->flags & REQ_F_FORCE_ASYNC) {
 		/*
 		 * Never try inline submit of IOSQE_ASYNC is set, go straight
 		 * to async execution.
-- 
2.24.0



* Re: [PATCH 0/4] a bunch of changes for submmission path
From: Jens Axboe @ 2019-12-30 22:24 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring

On 12/30/19 11:24 AM, Pavel Begunkov wrote:
> This is mostly about batching smp_load_acquire() in io_get_sqring()
> with other minor changes.

These all look good to me, applied.

-- 
Jens Axboe

