* [PATCH] io_uring: io_uring_enter(2) don't poll while SETUP_IOPOLL|SETUP_SQPOLL enabled
From: Xiaoguang Wang @ 2020-03-11  1:26 UTC
  To: io-uring; +Cc: axboe, joseph.qi

When SETUP_IOPOLL and SETUP_SQPOLL are both enabled, applications don't
need to poll for I/O completion events themselves; they can rely on
io_sq_thread to do the polling, which reduces CPU usage and uring_lock
contention.

I modified the fio io_uring engine slightly to evaluate the performance:
static int fio_ioring_getevents(struct thread_data *td, unsigned int min,
                        continue;
                }

-               if (!o->sqpoll_thread) {
+               if (o->sqpoll_thread && o->hipri) {
                        r = io_uring_enter(ld, 0, actual_min,
                                                IORING_ENTER_GETEVENTS);
                        if (r < 0) {

and ran:

  fio -name=fiotest -filename=/dev/nvme0n1 -iodepth=$depth -thread \
      -rw=read -ioengine=io_uring -hipri=1 -sqthread_poll=1 -direct=1 \
      -bs=4k -size=10G -numjobs=1 -time_based -runtime=120

Original code
--------------------------------------------------------------------
iodepth       |        4 |        8 |       16 |       32 |       64
bw            | 1133MB/s | 1519MB/s | 2090MB/s | 2710MB/s | 3012MB/s
fio cpu usage |     100% |     100% |     100% |     100% |     100%
--------------------------------------------------------------------

With patch
--------------------------------------------------------------------
iodepth       |        4 |        8 |       16 |       32 |       64
bw            | 1196MB/s | 1721MB/s | 2351MB/s | 2977MB/s | 3357MB/s
fio cpu usage |    63.8% |    74.4% |    81.1% |    83.7% |    82.4%
--------------------------------------------------------------------
bw improve    |     5.5% |    13.2% |    12.3% |     9.8% |    11.5%
--------------------------------------------------------------------

From the test results above, bandwidth improves by 5.5% to 13.2%, and
the fio process's CPU usage drops considerably. Note that this does not
reduce io_sq_thread's CPU usage: when SETUP_IOPOLL and SETUP_SQPOLL are
both enabled, io_sq_thread always runs at 100% CPU. I think this patch
will benefit applications that often call io_uring_wait_cqe() or similar
helpers from liburing.

Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
---
 fs/io_uring.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 6a595c1..9f56723 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1574,6 +1574,8 @@ static void io_iopoll_complete(struct io_ring_ctx *ctx, unsigned int *nr_events,
 	}
 
 	io_commit_cqring(ctx);
+	if (ctx->flags & IORING_SETUP_SQPOLL)
+		io_cqring_ev_posted(ctx);
 	io_free_req_many(ctx, &rb);
 }
 
@@ -6637,7 +6639,14 @@ static unsigned long io_uring_nommu_get_unmapped_area(struct file *file,
 
 		min_complete = min(min_complete, ctx->cq_entries);
 
-		if (ctx->flags & IORING_SETUP_IOPOLL) {
+		/*
+		 * When SETUP_IOPOLL and SETUP_SQPOLL are both enabled, user
+		 * space applications don't need to do io completion events
+		 * polling again, they can rely on io_sq_thread to do polling
+		 * work, which can reduce cpu usage and uring_lock contention.
+		 */
+		if (ctx->flags & IORING_SETUP_IOPOLL &&
+		    !(ctx->flags & IORING_SETUP_SQPOLL)) {
 			ret = io_iopoll_check(ctx, &nr_events, min_complete);
 		} else {
 			ret = io_cqring_wait(ctx, min_complete, sig, sigsz);
-- 
1.8.3.1



* Re: [PATCH] io_uring: io_uring_enter(2) don't poll while SETUP_IOPOLL|SETUP_SQPOLL enabled
From: Jens Axboe @ 2020-03-11 13:06 UTC
  To: Xiaoguang Wang, io-uring; +Cc: joseph.qi

On 3/10/20 7:26 PM, Xiaoguang Wang wrote:
> When SETUP_IOPOLL and SETUP_SQPOLL are both enabled, applications don't
> need to poll for I/O completion events themselves; they can rely on
> io_sq_thread to do the polling, which reduces CPU usage and uring_lock
> contention.
> 
> I modified the fio io_uring engine slightly to evaluate the performance:
> static int fio_ioring_getevents(struct thread_data *td, unsigned int min,
>                         continue;
>                 }
> 
> -               if (!o->sqpoll_thread) {
> +               if (o->sqpoll_thread && o->hipri) {
>                         r = io_uring_enter(ld, 0, actual_min,
>                                                 IORING_ENTER_GETEVENTS);
>                         if (r < 0) {
> 
> and ran:
>
>   fio -name=fiotest -filename=/dev/nvme0n1 -iodepth=$depth -thread \
>       -rw=read -ioengine=io_uring -hipri=1 -sqthread_poll=1 -direct=1 \
>       -bs=4k -size=10G -numjobs=1 -time_based -runtime=120
> 
> Original code
> --------------------------------------------------------------------
> iodepth       |        4 |        8 |       16 |       32 |       64
> bw            | 1133MB/s | 1519MB/s | 2090MB/s | 2710MB/s | 3012MB/s
> fio cpu usage |     100% |     100% |     100% |     100% |     100%
> --------------------------------------------------------------------
> 
> With patch
> --------------------------------------------------------------------
> iodepth       |        4 |        8 |       16 |       32 |       64
> bw            | 1196MB/s | 1721MB/s | 2351MB/s | 2977MB/s | 3357MB/s
> fio cpu usage |    63.8% |    74.4% |    81.1% |    83.7% |    82.4%
> --------------------------------------------------------------------
> bw improve    |     5.5% |    13.2% |    12.3% |     9.8% |    11.5%
> --------------------------------------------------------------------
> 
> From the test results above, bandwidth improves by 5.5% to 13.2%, and
> the fio process's CPU usage drops considerably. Note that this does not
> reduce io_sq_thread's CPU usage: when SETUP_IOPOLL and SETUP_SQPOLL are
> both enabled, io_sq_thread always runs at 100% CPU. I think this patch
> will benefit applications that often call io_uring_wait_cqe() or similar
> helpers from liburing.

I think this looks reasonable, and true to the spirit of how polling
should work when SQPOLL is used. I'll apply this for 5.7, thanks.

-- 
Jens Axboe

