io-uring.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH liburing 0/3] __io_uring_get_cqe() fix/optimization
@ 2020-05-19 21:52 Bijan Mottahedeh
  2020-05-19 21:52 ` [PATCH liburing 1/3] preseve wait_nr if SETUP_IOPOLL is set Bijan Mottahedeh
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Bijan Mottahedeh @ 2020-05-19 21:52 UTC (permalink / raw)
  To: axboe; +Cc: io-uring

This patch set and a corresponding kernel patch set are fixes and
optimizations resulting from running unit test 500f9fbadef8-test.

- Patch 1 is a fix to the test hanging when it runs on a non-mq queue.

The patch preserves the value of wait_nr if SETUP_IOPOLL is set
since otherwise __sys_io_uring_enter() could never be called
__io_uring_peek_cqe() could never find new completions.

With this patch applied, two problems were hit in the kernel as described
in the kernel patch set, which caused 500f9fbadef8-test to fail and
to hang.  With all three patches, 500f9fbadef8-test either passes
successfully or skips the test gracefully with the following message:

Polling not supported in current dir, test skipped

- Patch 2 is an optimization for io_uring_enter() system calls.

If we want to wait for completions (wait_nr > 0), account for the
completion we might fetch with __io_uring_peek_cqe().  For example,
with wait_nr=1 and submit=0, there is no need to call io_uring_enter()
if the peek call finds a completion.

Below are the perf results for 500f9fbadef8-test without/with the fix:

perf stat -e syscalls:sys_enter_io_uring_enter 500f9fbadef8-test

12,289     syscalls:sys_enter_io_uring_enter
8,193      syscalls:sys_enter_io_uring_enter

- Patch 3 is a cleanup with no functional changes.

Since we always have

io_uring_wait_cqe_nr()
-> __io_uring_get_cqe()
   -> __io_uring_peek_cqe()

remove the direct call from io_uring_wait_cqe_nr() to __io_uring_peek_cqe().

After the removal, __io_uring_peek_cqe() is called only from
__io_uring_get_cqe() so move the two routines together(). Without the
move, compilation fails with a 'defined but not used' error.

Bijan Mottahedeh (3):
  preseve wait_nr if SETUP_IOPOLL is set
  update wait_nr to account for completed event
  remove duplicate call to __io_uring_peek_cqe()

 src/include/liburing.h | 32 --------------------------------
 src/queue.c            | 38 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 37 insertions(+), 33 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH liburing 1/3] preseve wait_nr if SETUP_IOPOLL is set
  2020-05-19 21:52 [PATCH liburing 0/3] __io_uring_get_cqe() fix/optimization Bijan Mottahedeh
@ 2020-05-19 21:52 ` Bijan Mottahedeh
  2020-05-19 21:52 ` [PATCH liburing 2/3] update wait_nr to account for completed event Bijan Mottahedeh
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Bijan Mottahedeh @ 2020-05-19 21:52 UTC (permalink / raw)
  To: axboe; +Cc: io-uring

When SETUP_IOPOLL is set, __sys_io_uring_enter() must be called to reap
new completions but the call won't be made if both wait_nr and submit
are zero, so preserve wait_nr.

Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
---
 src/queue.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/queue.c b/src/queue.c
index 14a0777..c7473c0 100644
--- a/src/queue.c
+++ b/src/queue.c
@@ -60,7 +60,14 @@ int __io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_ptr,
 			err = -errno;
 		} else if (ret == (int)submit) {
 			submit = 0;
-			wait_nr = 0;
+			/*
+			 * When SETUP_IOPOLL is set, __sys_io_uring enter()
+			 * must be called to reap new completions but the call
+			 * won't be made if both wait_nr and submit are zero
+			 * so preserve wait_nr.
+			 */
+			if (!(ring->flags & IORING_SETUP_IOPOLL))
+				wait_nr = 0;
 		} else {
 			submit -= ret;
 		}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH liburing 2/3] update wait_nr to account for completed event
  2020-05-19 21:52 [PATCH liburing 0/3] __io_uring_get_cqe() fix/optimization Bijan Mottahedeh
  2020-05-19 21:52 ` [PATCH liburing 1/3] preseve wait_nr if SETUP_IOPOLL is set Bijan Mottahedeh
@ 2020-05-19 21:52 ` Bijan Mottahedeh
  2020-05-19 21:52 ` [PATCH liburing 3/3] remove duplicate call to __io_uring_peek_cqe() Bijan Mottahedeh
  2020-05-19 22:12 ` [PATCH liburing 0/3] __io_uring_get_cqe() fix/optimization Jens Axboe
  3 siblings, 0 replies; 5+ messages in thread
From: Bijan Mottahedeh @ 2020-05-19 21:52 UTC (permalink / raw)
  To: axboe; +Cc: io-uring

If there are events to wait for, then account for one already complete.

Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
---
 src/queue.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/queue.c b/src/queue.c
index c7473c0..da2f405 100644
--- a/src/queue.c
+++ b/src/queue.c
@@ -49,6 +49,8 @@ int __io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_ptr,
 			err = -EAGAIN;
 			break;
 		}
+		if (wait_nr && cqe)
+			wait_nr--;
 		if (wait_nr)
 			flags = IORING_ENTER_GETEVENTS;
 		if (submit)
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH liburing 3/3] remove duplicate call to __io_uring_peek_cqe()
  2020-05-19 21:52 [PATCH liburing 0/3] __io_uring_get_cqe() fix/optimization Bijan Mottahedeh
  2020-05-19 21:52 ` [PATCH liburing 1/3] preseve wait_nr if SETUP_IOPOLL is set Bijan Mottahedeh
  2020-05-19 21:52 ` [PATCH liburing 2/3] update wait_nr to account for completed event Bijan Mottahedeh
@ 2020-05-19 21:52 ` Bijan Mottahedeh
  2020-05-19 22:12 ` [PATCH liburing 0/3] __io_uring_get_cqe() fix/optimization Jens Axboe
  3 siblings, 0 replies; 5+ messages in thread
From: Bijan Mottahedeh @ 2020-05-19 21:52 UTC (permalink / raw)
  To: axboe; +Cc: io-uring

Remove the __io_uring_peek_cqe() call from io_uring_wait_cqe_nr()
since the former is unconditionally called from io_uring_wait_cqe_nr()
-> __io_uring_get_cqe(). Also move __io_uring_get_cqe() together with
__io_uring_get_cqe() since that's now the only caller.

Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
---
 src/include/liburing.h | 32 --------------------------------
 src/queue.c            | 27 +++++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 32 deletions(-)

diff --git a/src/include/liburing.h b/src/include/liburing.h
index dd85f7b..4311325 100644
--- a/src/include/liburing.h
+++ b/src/include/liburing.h
@@ -444,32 +444,6 @@ static inline unsigned io_uring_cq_ready(struct io_uring *ring)
 	return io_uring_smp_load_acquire(ring->cq.ktail) - *ring->cq.khead;
 }
 
-static int __io_uring_peek_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_ptr)
-{
-	struct io_uring_cqe *cqe;
-	unsigned head;
-	int err = 0;
-
-	do {
-		io_uring_for_each_cqe(ring, head, cqe)
-			break;
-		if (cqe) {
-			if (cqe->user_data == LIBURING_UDATA_TIMEOUT) {
-				if (cqe->res < 0)
-					err = cqe->res;
-				io_uring_cq_advance(ring, 1);
-				if (!err)
-					continue;
-				cqe = NULL;
-			}
-		}
-		break;
-	} while (1);
-
-	*cqe_ptr = cqe;
-	return err;
-}
-
 /*
  * Return an IO completion, waiting for 'wait_nr' completions if one isn't
  * readily available. Returns 0 with cqe_ptr filled in on success, -errno on
@@ -479,12 +453,6 @@ static inline int io_uring_wait_cqe_nr(struct io_uring *ring,
 				      struct io_uring_cqe **cqe_ptr,
 				      unsigned wait_nr)
 {
-	int err;
-
-	err = __io_uring_peek_cqe(ring, cqe_ptr);
-	if (err || *cqe_ptr)
-		return err;
-
 	return __io_uring_get_cqe(ring, cqe_ptr, 0, wait_nr, NULL);
 }
 
diff --git a/src/queue.c b/src/queue.c
index da2f405..3db52bd 100644
--- a/src/queue.c
+++ b/src/queue.c
@@ -32,6 +32,33 @@ static inline bool sq_ring_needs_enter(struct io_uring *ring,
 	return false;
 }
 
+static int __io_uring_peek_cqe(struct io_uring *ring,
+			       struct io_uring_cqe **cqe_ptr)
+{
+	struct io_uring_cqe *cqe;
+	unsigned head;
+	int err = 0;
+
+	do {
+		io_uring_for_each_cqe(ring, head, cqe)
+			break;
+		if (cqe) {
+			if (cqe->user_data == LIBURING_UDATA_TIMEOUT) {
+				if (cqe->res < 0)
+					err = cqe->res;
+				io_uring_cq_advance(ring, 1);
+				if (!err)
+					continue;
+				cqe = NULL;
+			}
+		}
+		break;
+	} while (1);
+
+	*cqe_ptr = cqe;
+	return err;
+}
+
 int __io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_ptr,
 		       unsigned submit, unsigned wait_nr, sigset_t *sigmask)
 {
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH liburing 0/3] __io_uring_get_cqe() fix/optimization
  2020-05-19 21:52 [PATCH liburing 0/3] __io_uring_get_cqe() fix/optimization Bijan Mottahedeh
                   ` (2 preceding siblings ...)
  2020-05-19 21:52 ` [PATCH liburing 3/3] remove duplicate call to __io_uring_peek_cqe() Bijan Mottahedeh
@ 2020-05-19 22:12 ` Jens Axboe
  3 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2020-05-19 22:12 UTC (permalink / raw)
  To: Bijan Mottahedeh; +Cc: io-uring

On 5/19/20 3:52 PM, Bijan Mottahedeh wrote:
> This patch set and a corresponding kernel patch set are fixes and
> optimizations resulting from running unit test 500f9fbadef8-test.
> 
> - Patch 1 is a fix to the test hanging when it runs on a non-mq queue.
> 
> The patch preserves the value of wait_nr if SETUP_IOPOLL is set
> since otherwise __sys_io_uring_enter() could never be called
> __io_uring_peek_cqe() could never find new completions.
> 
> With this patch applied, two problems were hit in the kernel as described
> in the kernel patch set, which caused 500f9fbadef8-test to fail and
> to hang.  With all three patches, 500f9fbadef8-test either passes
> successfully or skips the test gracefully with the following message:
> 
> Polling not supported in current dir, test skipped
> 
> - Patch 2 is an optimization for io_uring_enter() system calls.
> 
> If we want to wait for completions (wait_nr > 0), account for the
> completion we might fetch with __io_uring_peek_cqe().  For example,
> with wait_nr=1 and submit=0, there is no need to call io_uring_enter()
> if the peek call finds a completion.
> 
> Below are the perf results for 500f9fbadef8-test without/with the fix:
> 
> perf stat -e syscalls:sys_enter_io_uring_enter 500f9fbadef8-test
> 
> 12,289     syscalls:sys_enter_io_uring_enter
> 8,193      syscalls:sys_enter_io_uring_enter
> 
> - Patch 3 is a cleanup with no functional changes.
> 
> Since we always have
> 
> io_uring_wait_cqe_nr()
> -> __io_uring_get_cqe()
>    -> __io_uring_peek_cqe()
> 
> remove the direct call from io_uring_wait_cqe_nr() to __io_uring_peek_cqe().
> 
> After the removal, __io_uring_peek_cqe() is called only from
> __io_uring_get_cqe() so move the two routines together(). Without the
> move, compilation fails with a 'defined but not used' error.

LGTM, thanks! I've applied them.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-05-19 22:12 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-19 21:52 [PATCH liburing 0/3] __io_uring_get_cqe() fix/optimization Bijan Mottahedeh
2020-05-19 21:52 ` [PATCH liburing 1/3] preseve wait_nr if SETUP_IOPOLL is set Bijan Mottahedeh
2020-05-19 21:52 ` [PATCH liburing 2/3] update wait_nr to account for completed event Bijan Mottahedeh
2020-05-19 21:52 ` [PATCH liburing 3/3] remove duplicate call to __io_uring_peek_cqe() Bijan Mottahedeh
2020-05-19 22:12 ` [PATCH liburing 0/3] __io_uring_get_cqe() fix/optimization Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).