io-uring.vger.kernel.org archive mirror
* [PATCHSET 0/9] io_uring: use polled async retry
@ 2020-02-20 20:31 Jens Axboe
  2020-02-20 20:31 ` [PATCH 1/9] io_uring: consider any io_read/write -EAGAIN as final Jens Axboe
                   ` (8 more replies)
  0 siblings, 9 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 20:31 UTC (permalink / raw)
  To: io-uring; +Cc: glauber, peterz, asml.silence

We currently need to go async (meaning punt to a worker thread helper)
when we complete a poll request, if we have a linked request after it.
This isn't as fast as it could be. Similarly, if we try to read from
a socket (or similar) and we get -EAGAIN, we punt to an async worker
thread.

This clearly isn't optimal, both in terms of latency and system
resources.

This patchset attempts to rectify that by revamping the poll setup
infrastructure, and using that same infrastructure to handle async IO
on file types that support polling for availability of data and/or
space. The end result is a lot faster than it was before: in an echo
server example, I see about a 4x improvement in throughput for the
single-client case.

Just as important, this also means that an application can simply issue
an IORING_OP_RECV or IORING_OP_RECVMSG and have it complete when data
is available. It's no longer needed (or useful) to use a poll link
prior to the receive. Once data becomes available, it is read
immediately. Honestly, this almost feels like magic! This can completely
replace setups that currently use epoll to poll for data availability,
and then need to issue a receive after that. Just one system call for
the whole operation. This isn't specific to receive; that is just an
example. The send side works the same way.
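
As a rough illustration, here is a minimal liburing-based sketch of the
single-syscall receive (the connected socket and buffer setup are assumed
and are not part of this series):

#include <liburing.h>

/*
 * Sketch: the recv completes once data arrives, with no separate poll
 * link and no epoll round trip.
 */
static int recv_once(struct io_uring *ring, int sockfd, void *buf, size_t len)
{
        struct io_uring_sqe *sqe;
        struct io_uring_cqe *cqe;
        int ret;

        sqe = io_uring_get_sqe(ring);
        if (!sqe)
                return -EBUSY;
        io_uring_prep_recv(sqe, sockfd, buf, len, 0);

        /* one submission covers both "wait for data" and "read it" */
        ret = io_uring_submit(ring);
        if (ret < 0)
                return ret;

        ret = io_uring_wait_cqe(ring, &cqe);
        if (ret < 0)
                return ret;

        ret = cqe->res;         /* bytes received, or -errno */
        io_uring_cqe_seen(ring, cqe);
        return ret;
}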

This is accomplished by adding a per-task sched_work handler. The work
queued there is automatically run when a task is scheduled in or out.
When a poll request completes (either an explicit one, or one just armed
on behalf of a request that would otherwise block), the bottom half side
of the work is queued as sched_work and the task is woken.
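
For reference, the wakeup side boils down to queueing the completion on
the owning task and waking it. Condensed from patches 6 and 7 below
(tracing and the event-match check omitted), so treat it as a sketch
rather than the exact code:

static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode,
                        int sync, void *key)
{
        struct io_poll_iocb *poll = wait->private;
        struct io_kiocb *req = container_of(poll, struct io_kiocb, poll);

        list_del_init(&poll->wait.entry);

        /* hand the completion to the task that owns the request... */
        req->result = key_to_poll(key);
        init_task_work(&req->sched_work, io_poll_task_func);
        sched_work_add(req->task, &req->sched_work);

        /* ...and wake it so the sched_work hook runs the bottom half */
        wake_up_process(req->task);
        return 1;
}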

This patchset passes my test suite, but I'd be hugely surprised if there
aren't a few corner cases that still need fixing.

-- 
Jens Axboe





* [PATCH 1/9] io_uring: consider any io_read/write -EAGAIN as final
  2020-02-20 20:31 [PATCHSET 0/9] io_uring: use polled async retry Jens Axboe
@ 2020-02-20 20:31 ` Jens Axboe
  2020-02-20 20:31 ` [PATCH 2/9] io_uring: io_accept() should hold on to submit reference on retry Jens Axboe
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 20:31 UTC (permalink / raw)
  To: io-uring; +Cc: glauber, peterz, asml.silence, Jens Axboe

If the -EAGAIN happens because of a static condition, then a poll
or later retry won't fix it. We must retry the request from a blocking
context. Play it safe and ensure that any -EAGAIN condition from read
or write is retried from async context.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 6e249aa97ba3..bd3a39b0f4ee 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2255,10 +2255,8 @@ static int io_read(struct io_kiocb *req, struct io_kiocb **nxt,
 	 * If the file doesn't support async, mark it as REQ_F_MUST_PUNT so
 	 * we know to async punt it even if it was opened O_NONBLOCK
 	 */
-	if (force_nonblock && !io_file_supports_async(req->file)) {
-		req->flags |= REQ_F_MUST_PUNT;
+	if (force_nonblock && !io_file_supports_async(req->file))
 		goto copy_iov;
-	}
 
 	iov_count = iov_iter_count(&iter);
 	ret = rw_verify_area(READ, req->file, &kiocb->ki_pos, iov_count);
@@ -2279,6 +2277,8 @@ static int io_read(struct io_kiocb *req, struct io_kiocb **nxt,
 						inline_vecs, &iter);
 			if (ret)
 				goto out_free;
+			/* any defer here is final, must blocking retry */
+			req->flags |= REQ_F_MUST_PUNT;
 			return -EAGAIN;
 		}
 	}
@@ -2344,10 +2344,8 @@ static int io_write(struct io_kiocb *req, struct io_kiocb **nxt,
 	 * If the file doesn't support async, mark it as REQ_F_MUST_PUNT so
 	 * we know to async punt it even if it was opened O_NONBLOCK
 	 */
-	if (force_nonblock && !io_file_supports_async(req->file)) {
-		req->flags |= REQ_F_MUST_PUNT;
+	if (force_nonblock && !io_file_supports_async(req->file))
 		goto copy_iov;
-	}
 
 	/* file path doesn't support NOWAIT for non-direct_IO */
 	if (force_nonblock && !(kiocb->ki_flags & IOCB_DIRECT) &&
@@ -2392,6 +2390,8 @@ static int io_write(struct io_kiocb *req, struct io_kiocb **nxt,
 						inline_vecs, &iter);
 			if (ret)
 				goto out_free;
+			/* any defer here is final, must blocking retry */
+			req->flags |= REQ_F_MUST_PUNT;
 			return -EAGAIN;
 		}
 	}
-- 
2.25.1



* [PATCH 2/9] io_uring: io_accept() should hold on to submit reference on retry
  2020-02-20 20:31 [PATCHSET 0/9] io_uring: use polled async retry Jens Axboe
  2020-02-20 20:31 ` [PATCH 1/9] io_uring: consider any io_read/write -EAGAIN as final Jens Axboe
@ 2020-02-20 20:31 ` Jens Axboe
  2020-02-20 20:31 ` [PATCH 3/9] sched: move io-wq/workqueue worker sched in/out into helpers Jens Axboe
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 20:31 UTC (permalink / raw)
  To: io-uring; +Cc: glauber, peterz, asml.silence, Jens Axboe

Don't drop the reference early; hang on to it and let the caller drop
it. This makes it behave more like "regular" requests.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index bd3a39b0f4ee..c3fe2022e343 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3353,6 +3353,8 @@ static void io_accept_finish(struct io_wq_work **workptr)
 	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
 	struct io_kiocb *nxt = NULL;
 
+	io_put_req(req);
+
 	if (io_req_cancelled(req))
 		return;
 	__io_accept(req, &nxt, false);
@@ -3370,7 +3372,6 @@ static int io_accept(struct io_kiocb *req, struct io_kiocb **nxt,
 	ret = __io_accept(req, nxt, force_nonblock);
 	if (ret == -EAGAIN && force_nonblock) {
 		req->work.func = io_accept_finish;
-		io_put_req(req);
 		return -EAGAIN;
 	}
 	return 0;
-- 
2.25.1



* [PATCH 3/9] sched: move io-wq/workqueue worker sched in/out into helpers
  2020-02-20 20:31 [PATCHSET 0/9] io_uring: use polled async retry Jens Axboe
  2020-02-20 20:31 ` [PATCH 1/9] io_uring: consider any io_read/write -EAGAIN as final Jens Axboe
  2020-02-20 20:31 ` [PATCH 2/9] io_uring: io_accept() should hold on to submit reference on retry Jens Axboe
@ 2020-02-20 20:31 ` Jens Axboe
  2020-02-20 20:31 ` [PATCH 4/9] task_work_run: don't take ->pi_lock unconditionally Jens Axboe
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 20:31 UTC (permalink / raw)
  To: io-uring; +Cc: glauber, peterz, asml.silence, Jens Axboe

We already have sched_update_worker(), which calls the "I woke up" handler
for io-wq and workqueue threads; rename it to sched_in_update(). The code
that is called when the threads are going to sleep is moved into a
matching helper, sched_out_update(), so that it mirrors the schedule-in
side.

No functional changes in this patch.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 kernel/sched/core.c | 35 ++++++++++++++++++++---------------
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1a9983da4408..c7bab13f9caa 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4102,11 +4102,8 @@ void __noreturn do_task_dead(void)
 		cpu_relax();
 }
 
-static inline void sched_submit_work(struct task_struct *tsk)
+static void sched_out_update(struct task_struct *tsk)
 {
-	if (!tsk->state)
-		return;
-
 	/*
 	 * If a worker went to sleep, notify and ask workqueue whether
 	 * it wants to wake up a task to maintain concurrency.
@@ -4122,6 +4119,24 @@ static inline void sched_submit_work(struct task_struct *tsk)
 			io_wq_worker_sleeping(tsk);
 		preempt_enable_no_resched();
 	}
+}
+
+static void sched_in_update(struct task_struct *tsk)
+{
+	if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
+		if (tsk->flags & PF_WQ_WORKER)
+			wq_worker_running(tsk);
+		else
+			io_wq_worker_running(tsk);
+	}
+}
+
+static inline void sched_submit_work(struct task_struct *tsk)
+{
+	if (!tsk->state)
+		return;
+
+	sched_out_update(tsk);
 
 	if (tsk_is_pi_blocked(tsk))
 		return;
@@ -4134,16 +4149,6 @@ static inline void sched_submit_work(struct task_struct *tsk)
 		blk_schedule_flush_plug(tsk);
 }
 
-static void sched_update_worker(struct task_struct *tsk)
-{
-	if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
-		if (tsk->flags & PF_WQ_WORKER)
-			wq_worker_running(tsk);
-		else
-			io_wq_worker_running(tsk);
-	}
-}
-
 asmlinkage __visible void __sched schedule(void)
 {
 	struct task_struct *tsk = current;
@@ -4154,7 +4159,7 @@ asmlinkage __visible void __sched schedule(void)
 		__schedule(false);
 		sched_preempt_enable_no_resched();
 	} while (need_resched());
-	sched_update_worker(tsk);
+	sched_in_update(tsk);
 }
 EXPORT_SYMBOL(schedule);
 
-- 
2.25.1



* [PATCH 4/9] task_work_run: don't take ->pi_lock unconditionally
  2020-02-20 20:31 [PATCHSET 0/9] io_uring: use polled async retry Jens Axboe
                   ` (2 preceding siblings ...)
  2020-02-20 20:31 ` [PATCH 3/9] sched: move io-wq/workqueue worker sched in/out into helpers Jens Axboe
@ 2020-02-20 20:31 ` Jens Axboe
  2020-02-20 20:31 ` [PATCH 5/9] kernel: abstract out task work helpers Jens Axboe
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 20:31 UTC (permalink / raw)
  To: io-uring; +Cc: glauber, peterz, asml.silence, Oleg Nesterov, Jens Axboe

From: Oleg Nesterov <oleg@redhat.com>

As Peter pointed out, task_work_run() can avoid ->pi_lock and cmpxchg()
if task->task_works == NULL && !PF_EXITING.

And in fact the only reason why task_work_run() needs ->pi_lock is
the possible race with task_work_cancel(); we can optimize this code
and make the locking clearer.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 kernel/task_work.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/kernel/task_work.c b/kernel/task_work.c
index 0fef395662a6..825f28259a19 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -97,16 +97,26 @@ void task_work_run(void)
 		 * work->func() can do task_work_add(), do not set
 		 * work_exited unless the list is empty.
 		 */
-		raw_spin_lock_irq(&task->pi_lock);
 		do {
+			head = NULL;
 			work = READ_ONCE(task->task_works);
-			head = !work && (task->flags & PF_EXITING) ?
-				&work_exited : NULL;
+			if (!work) {
+				if (task->flags & PF_EXITING)
+					head = &work_exited;
+				else
+					break;
+			}
 		} while (cmpxchg(&task->task_works, work, head) != work);
-		raw_spin_unlock_irq(&task->pi_lock);
 
 		if (!work)
 			break;
+		/*
+		 * Synchronize with task_work_cancel(). It can not remove
+		 * the first entry == work, cmpxchg(task_works) must fail.
+		 * But it can remove another entry from the ->next list.
+		 */
+		raw_spin_lock_irq(&task->pi_lock);
+		raw_spin_unlock_irq(&task->pi_lock);
 
 		do {
 			next = work->next;
-- 
2.25.1



* [PATCH 5/9] kernel: abstract out task work helpers
  2020-02-20 20:31 [PATCHSET 0/9] io_uring: use polled async retry Jens Axboe
                   ` (3 preceding siblings ...)
  2020-02-20 20:31 ` [PATCH 4/9] task_work_run: don't take ->pi_lock unconditionally Jens Axboe
@ 2020-02-20 20:31 ` Jens Axboe
  2020-02-20 21:07   ` Peter Zijlstra
  2020-02-20 20:31 ` [PATCH 6/9] sched: add a sched_work list Jens Axboe
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 20:31 UTC (permalink / raw)
  To: io-uring; +Cc: glauber, peterz, asml.silence, Jens Axboe

This is in preparation for adding a matching sched_work list.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 kernel/task_work.c | 88 +++++++++++++++++++++++++++++-----------------
 1 file changed, 56 insertions(+), 32 deletions(-)

diff --git a/kernel/task_work.c b/kernel/task_work.c
index 825f28259a19..3445421266e7 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -5,6 +5,22 @@
 
 static struct callback_head work_exited; /* all we need is ->next == NULL */
 
+static int __task_work_add(struct task_struct *task,
+			   struct callback_head **headptr,
+			   struct callback_head *work)
+{
+	struct callback_head *head;
+
+	do {
+		head = READ_ONCE(*headptr);
+		if (unlikely(head == &work_exited))
+			return -ESRCH;
+		work->next = head;
+	} while (cmpxchg(headptr, head, work) != head);
+
+	return 0;
+}
+
 /**
  * task_work_add - ask the @task to execute @work->func()
  * @task: the task which should run the callback
@@ -27,39 +43,25 @@ static struct callback_head work_exited; /* all we need is ->next == NULL */
 int
 task_work_add(struct task_struct *task, struct callback_head *work, bool notify)
 {
-	struct callback_head *head;
+	int ret;
 
-	do {
-		head = READ_ONCE(task->task_works);
-		if (unlikely(head == &work_exited))
-			return -ESRCH;
-		work->next = head;
-	} while (cmpxchg(&task->task_works, head, work) != head);
+	ret = __task_work_add(task, &task->task_works, work);
 
 	if (notify)
 		set_notify_resume(task);
-	return 0;
+
+	return ret;
 }
 
-/**
- * task_work_cancel - cancel a pending work added by task_work_add()
- * @task: the task which should execute the work
- * @func: identifies the work to remove
- *
- * Find the last queued pending work with ->func == @func and remove
- * it from queue.
- *
- * RETURNS:
- * The found work or NULL if not found.
- */
-struct callback_head *
-task_work_cancel(struct task_struct *task, task_work_func_t func)
+static struct callback_head *__task_work_cancel(struct task_struct *task,
+						struct callback_head **headptr,
+						task_work_func_t func)
 {
-	struct callback_head **pprev = &task->task_works;
+	struct callback_head **pprev = headptr;
 	struct callback_head *work;
 	unsigned long flags;
 
-	if (likely(!task->task_works))
+	if (likely(!(*headptr)))
 		return NULL;
 	/*
 	 * If cmpxchg() fails we continue without updating pprev.
@@ -80,16 +82,25 @@ task_work_cancel(struct task_struct *task, task_work_func_t func)
 }
 
 /**
- * task_work_run - execute the works added by task_work_add()
+ * task_work_cancel - cancel a pending work added by task_work_add()
+ * @task: the task which should execute the work
+ * @func: identifies the work to remove
  *
- * Flush the pending works. Should be used by the core kernel code.
- * Called before the task returns to the user-mode or stops, or when
- * it exits. In the latter case task_work_add() can no longer add the
- * new work after task_work_run() returns.
+ * Find the last queued pending work with ->func == @func and remove
+ * it from queue.
+ *
+ * RETURNS:
+ * The found work or NULL if not found.
  */
-void task_work_run(void)
+struct callback_head *
+task_work_cancel(struct task_struct *task, task_work_func_t func)
+{
+	return __task_work_cancel(task, &task->task_works, func);
+}
+
+static void __task_work_run(struct task_struct *task,
+			    struct callback_head **headptr)
 {
-	struct task_struct *task = current;
 	struct callback_head *work, *head, *next;
 
 	for (;;) {
@@ -99,14 +110,14 @@ void task_work_run(void)
 		 */
 		do {
 			head = NULL;
-			work = READ_ONCE(task->task_works);
+			work = READ_ONCE(*headptr);
 			if (!work) {
 				if (task->flags & PF_EXITING)
 					head = &work_exited;
 				else
 					break;
 			}
-		} while (cmpxchg(&task->task_works, work, head) != work);
+		} while (cmpxchg(headptr, work, head) != work);
 
 		if (!work)
 			break;
@@ -126,3 +137,16 @@ void task_work_run(void)
 		} while (work);
 	}
 }
+
+/**
+ * task_work_run - execute the works added by task_work_add()
+ *
+ * Flush the pending works. Should be used by the core kernel code.
+ * Called before the task returns to the user-mode or stops, or when
+ * it exits. In the latter case task_work_add() can no longer add the
+ * new work after task_work_run() returns.
+ */
+void task_work_run(void)
+{
+	__task_work_run(current, &current->task_works);
+}
-- 
2.25.1



* [PATCH 6/9] sched: add a sched_work list
  2020-02-20 20:31 [PATCHSET 0/9] io_uring: use polled async retry Jens Axboe
                   ` (4 preceding siblings ...)
  2020-02-20 20:31 ` [PATCH 5/9] kernel: abstract out task work helpers Jens Axboe
@ 2020-02-20 20:31 ` Jens Axboe
  2020-02-20 21:17   ` Peter Zijlstra
  2020-02-20 20:31 ` [PATCH 7/9] io_uring: add per-task callback handler Jens Axboe
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 20:31 UTC (permalink / raw)
  To: io-uring; +Cc: glauber, peterz, asml.silence, Jens Axboe

This is similar to the task_works, and uses the same infrastructure, but
the sched_work list is run when the task is being scheduled in or out.

The intended use case here is for core code to be able to add work
that should be automatically run by the task, without the task needing
to do anything. The work is added from outside of the task; one example
would be waitqueue handlers, or anything else that is invoked out-of-band
from the task itself.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 include/linux/sched.h     |  4 ++-
 include/linux/task_work.h |  5 ++++
 kernel/sched/core.c       | 16 ++++++++--
 kernel/task_work.c        | 62 ++++++++++++++++++++++++++++++++++++---
 4 files changed, 80 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 04278493bf15..da15112c1140 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -648,6 +648,7 @@ struct task_struct {
 	/* Per task flags (PF_*), defined further below: */
 	unsigned int			flags;
 	unsigned int			ptrace;
+	int				on_rq;
 
 #ifdef CONFIG_SMP
 	struct llist_node		wake_entry;
@@ -670,13 +671,14 @@ struct task_struct {
 	int				recent_used_cpu;
 	int				wake_cpu;
 #endif
-	int				on_rq;
 
 	int				prio;
 	int				static_prio;
 	int				normal_prio;
 	unsigned int			rt_priority;
 
+	struct callback_head		*sched_work;
+
 	const struct sched_class	*sched_class;
 	struct sched_entity		se;
 	struct sched_rt_entity		rt;
diff --git a/include/linux/task_work.h b/include/linux/task_work.h
index bd9a6a91c097..e0c56f461df6 100644
--- a/include/linux/task_work.h
+++ b/include/linux/task_work.h
@@ -17,9 +17,14 @@ int task_work_add(struct task_struct *task, struct callback_head *twork, bool);
 struct callback_head *task_work_cancel(struct task_struct *, task_work_func_t);
 void task_work_run(void);
 
+int sched_work_add(struct task_struct *task, struct callback_head *work);
+struct callback_head *sched_work_cancel(struct task_struct *, task_work_func_t);
+void sched_work_run(void);
+
 static inline void exit_task_work(struct task_struct *task)
 {
 	task_work_run();
+	sched_work_run();
 }
 
 #endif	/* _LINUX_TASK_WORK_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c7bab13f9caa..9e0f754e0630 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2678,6 +2678,7 @@ int wake_up_state(struct task_struct *p, unsigned int state)
 static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
 {
 	p->on_rq			= 0;
+	p->sched_work			= NULL;
 
 	p->se.on_rq			= 0;
 	p->se.exec_start		= 0;
@@ -4102,8 +4103,13 @@ void __noreturn do_task_dead(void)
 		cpu_relax();
 }
 
-static void sched_out_update(struct task_struct *tsk)
+static bool sched_out_update(struct task_struct *tsk)
 {
+	if (unlikely(tsk->sched_work)) {
+		sched_work_run();
+		return true;
+	}
+
 	/*
 	 * If a worker went to sleep, notify and ask workqueue whether
 	 * it wants to wake up a task to maintain concurrency.
@@ -4119,6 +4125,8 @@ static void sched_out_update(struct task_struct *tsk)
 			io_wq_worker_sleeping(tsk);
 		preempt_enable_no_resched();
 	}
+
+	return false;
 }
 
 static void sched_in_update(struct task_struct *tsk)
@@ -4129,6 +4137,8 @@ static void sched_in_update(struct task_struct *tsk)
 		else
 			io_wq_worker_running(tsk);
 	}
+	if (unlikely(tsk->sched_work))
+		sched_work_run();
 }
 
 static inline void sched_submit_work(struct task_struct *tsk)
@@ -4136,7 +4146,9 @@ static inline void sched_submit_work(struct task_struct *tsk)
 	if (!tsk->state)
 		return;
 
-	sched_out_update(tsk);
+	/* if we processed work, we could be runnable again. check. */
+	if (sched_out_update(tsk) && !tsk->state)
+		return;
 
 	if (tsk_is_pi_blocked(tsk))
 		return;
diff --git a/kernel/task_work.c b/kernel/task_work.c
index 3445421266e7..ba62485d5b3d 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -3,7 +3,14 @@
 #include <linux/task_work.h>
 #include <linux/tracehook.h>
 
-static struct callback_head work_exited; /* all we need is ->next == NULL */
+static void task_exit_func(struct callback_head *head)
+{
+}
+
+static struct callback_head work_exited = {
+	.next	= NULL,
+	.func	= task_exit_func,
+};
 
 static int __task_work_add(struct task_struct *task,
 			   struct callback_head **headptr,
@@ -53,6 +60,28 @@ task_work_add(struct task_struct *task, struct callback_head *work, bool notify)
 	return ret;
 }
 
+/**
+ * sched_work_add - ask the @task to execute @work->func()
+ * @task: the task which should run the callback
+ * @work: the callback to run
+ * @notify: send the notification if true
+ *
+ * Queue @work for sched_work_run() below.
+ * Fails if the @task is exiting/exited and thus it can't process this @work.
+ * Otherwise @work->func() will be called when the @task is either scheduled
+ * in or out.
+ *
+ * Note: there is no ordering guarantee on works queued here.
+ *
+ * RETURNS:
+ * 0 if succeeds or -ESRCH.
+ */
+int
+sched_work_add(struct task_struct *task, struct callback_head *work)
+{
+	return __task_work_add(task, &task->sched_work, work);
+}
+
 static struct callback_head *__task_work_cancel(struct task_struct *task,
 						struct callback_head **headptr,
 						task_work_func_t func)
@@ -98,10 +127,27 @@ task_work_cancel(struct task_struct *task, task_work_func_t func)
 	return __task_work_cancel(task, &task->task_works, func);
 }
 
-static void __task_work_run(struct task_struct *task,
-			    struct callback_head **headptr)
+/**
+ * sched_work_cancel - cancel a pending work added by sched_work_add()
+ * @task: the task which should execute the work
+ * @func: identifies the work to remove
+ *
+ * Find the last queued pending work with ->func == @func and remove
+ * it from queue.
+ *
+ * RETURNS:
+ * The found work or NULL if not found.
+ */
+struct callback_head *
+sched_work_cancel(struct task_struct *task, task_work_func_t func)
+{
+	return __task_work_cancel(task, &task->sched_work, func);
+}
+
+static void __task_work_run(struct callback_head **headptr)
 {
 	struct callback_head *work, *head, *next;
+	struct task_struct *task = current;
 
 	for (;;) {
 		/*
@@ -148,5 +194,13 @@ static void __task_work_run(struct task_struct *task,
  */
 void task_work_run(void)
 {
-	__task_work_run(current, &current->task_works);
+	__task_work_run(&current->task_works);
+}
+
+/**
+ * sched_work_run - execute the works added by sched_work_add()
+ */
+void sched_work_run()
+{
+	__task_work_run(&current->sched_work);
 }
-- 
2.25.1



* [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 20:31 [PATCHSET 0/9] io_uring: use polled async retry Jens Axboe
                   ` (5 preceding siblings ...)
  2020-02-20 20:31 ` [PATCH 6/9] sched: add a sched_work list Jens Axboe
@ 2020-02-20 20:31 ` Jens Axboe
  2020-02-20 22:02   ` Jann Horn
  2020-02-21 13:51   ` Pavel Begunkov
  2020-02-20 20:31 ` [PATCH 8/9] io_uring: mark requests that we can do poll async in io_op_defs Jens Axboe
  2020-02-20 20:31 ` [PATCH 9/9] io_uring: use poll driven retry for files that support it Jens Axboe
  8 siblings, 2 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 20:31 UTC (permalink / raw)
  To: io-uring; +Cc: glauber, peterz, asml.silence, Jens Axboe

For poll requests, it's not uncommon to link a read (or write) after
the poll to execute immediately after the file is marked as ready.
Since the poll completion is called inside the waitqueue wake up handler,
we have to punt that linked request to async context. This slows down
the processing, and actually means it's faster to not use a link for this
use case.

We also run into problems if the completion_lock is contended, as we're
using a different lock ordering than the issue side. Hence we have to
do a trylock for completion and, if that fails, go async. Poll removal
needs to go async as well, for the same reason.

eventfd notification needs special casing as well, to avoid stack-blowing
recursion or deadlocks.

These are all deficiencies that were inherited from the aio poll
implementation, but I think we can do better. When a poll completes,
simply queue it up in the task poll list. When the task completes the
list, we can run dependent links inline as well. This means we never
have to go async, and we can remove a bunch of code associated with
that, and optimizations to try and make that run faster. The diffstat
speaks for itself.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 189 ++++++++++++++------------------------------------
 1 file changed, 53 insertions(+), 136 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index c3fe2022e343..5991bcc24387 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -76,6 +76,7 @@
 #include <linux/fadvise.h>
 #include <linux/eventpoll.h>
 #include <linux/fs_struct.h>
+#include <linux/task_work.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/io_uring.h>
@@ -295,7 +296,6 @@ struct io_ring_ctx {
 
 	struct {
 		spinlock_t		completion_lock;
-		struct llist_head	poll_llist;
 
 		/*
 		 * ->poll_list is protected by the ctx->uring_lock for
@@ -552,10 +552,6 @@ struct io_kiocb {
 	};
 
 	struct io_async_ctx		*io;
-	/*
-	 * llist_node is only used for poll deferred completions
-	 */
-	struct llist_node		llist_node;
 	bool				in_async;
 	bool				needs_fixed_file;
 	u8				opcode;
@@ -574,7 +570,17 @@ struct io_kiocb {
 
 	struct list_head	inflight_entry;
 
-	struct io_wq_work	work;
+	union {
+		/*
+		 * Only commands that never go async can use the below fields,
+		 * obviously. Right now only IORING_OP_POLL_ADD uses them.
+		 */
+		struct {
+			struct task_struct	*task;
+			struct callback_head	sched_work;
+		};
+		struct io_wq_work	work;
+	};
 };
 
 #define IO_PLUG_THRESHOLD		2
@@ -764,6 +770,8 @@ static int __io_sqe_files_update(struct io_ring_ctx *ctx,
 static int io_grab_files(struct io_kiocb *req);
 static void io_ring_file_ref_flush(struct fixed_file_data *data);
 static void io_cleanup_req(struct io_kiocb *req);
+static void __io_queue_sqe(struct io_kiocb *req,
+			   const struct io_uring_sqe *sqe);
 
 static struct kmem_cache *req_cachep;
 
@@ -834,7 +842,6 @@ static struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 	mutex_init(&ctx->uring_lock);
 	init_waitqueue_head(&ctx->wait);
 	spin_lock_init(&ctx->completion_lock);
-	init_llist_head(&ctx->poll_llist);
 	INIT_LIST_HEAD(&ctx->poll_list);
 	INIT_LIST_HEAD(&ctx->defer_list);
 	INIT_LIST_HEAD(&ctx->timeout_list);
@@ -1056,24 +1063,19 @@ static inline bool io_should_trigger_evfd(struct io_ring_ctx *ctx)
 		return false;
 	if (!ctx->eventfd_async)
 		return true;
-	return io_wq_current_is_worker() || in_interrupt();
+	return io_wq_current_is_worker();
 }
 
-static void __io_cqring_ev_posted(struct io_ring_ctx *ctx, bool trigger_ev)
+static void io_cqring_ev_posted(struct io_ring_ctx *ctx)
 {
 	if (waitqueue_active(&ctx->wait))
 		wake_up(&ctx->wait);
 	if (waitqueue_active(&ctx->sqo_wait))
 		wake_up(&ctx->sqo_wait);
-	if (trigger_ev)
+	if (io_should_trigger_evfd(ctx))
 		eventfd_signal(ctx->cq_ev_fd, 1);
 }
 
-static void io_cqring_ev_posted(struct io_ring_ctx *ctx)
-{
-	__io_cqring_ev_posted(ctx, io_should_trigger_evfd(ctx));
-}
-
 /* Returns true if there are no backlogged entries after the flush */
 static bool io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
 {
@@ -3450,18 +3452,27 @@ static int io_connect(struct io_kiocb *req, struct io_kiocb **nxt,
 #endif
 }
 
-static void io_poll_remove_one(struct io_kiocb *req)
+static bool io_poll_remove_one(struct io_kiocb *req)
 {
 	struct io_poll_iocb *poll = &req->poll;
+	bool do_complete = false;
 
 	spin_lock(&poll->head->lock);
 	WRITE_ONCE(poll->canceled, true);
 	if (!list_empty(&poll->wait.entry)) {
 		list_del_init(&poll->wait.entry);
-		io_queue_async_work(req);
+		do_complete = true;
 	}
 	spin_unlock(&poll->head->lock);
 	hash_del(&req->hash_node);
+	if (do_complete) {
+		io_cqring_fill_event(req, -ECANCELED);
+		io_commit_cqring(req->ctx);
+		req->flags |= REQ_F_COMP_LOCKED;
+		io_put_req(req);
+	}
+
+	return do_complete;
 }
 
 static void io_poll_remove_all(struct io_ring_ctx *ctx)
@@ -3479,6 +3490,8 @@ static void io_poll_remove_all(struct io_ring_ctx *ctx)
 			io_poll_remove_one(req);
 	}
 	spin_unlock_irq(&ctx->completion_lock);
+
+	io_cqring_ev_posted(ctx);
 }
 
 static int io_poll_cancel(struct io_ring_ctx *ctx, __u64 sqe_addr)
@@ -3488,10 +3501,11 @@ static int io_poll_cancel(struct io_ring_ctx *ctx, __u64 sqe_addr)
 
 	list = &ctx->cancel_hash[hash_long(sqe_addr, ctx->cancel_hash_bits)];
 	hlist_for_each_entry(req, list, hash_node) {
-		if (sqe_addr == req->user_data) {
-			io_poll_remove_one(req);
+		if (sqe_addr != req->user_data)
+			continue;
+		if (io_poll_remove_one(req))
 			return 0;
-		}
+		return -EALREADY;
 	}
 
 	return -ENOENT;
@@ -3544,92 +3558,28 @@ static void io_poll_complete(struct io_kiocb *req, __poll_t mask, int error)
 	io_commit_cqring(ctx);
 }
 
-static void io_poll_complete_work(struct io_wq_work **workptr)
+static void io_poll_task_handler(struct io_kiocb *req, struct io_kiocb **nxt)
 {
-	struct io_wq_work *work = *workptr;
-	struct io_kiocb *req = container_of(work, struct io_kiocb, work);
-	struct io_poll_iocb *poll = &req->poll;
-	struct poll_table_struct pt = { ._key = poll->events };
 	struct io_ring_ctx *ctx = req->ctx;
-	struct io_kiocb *nxt = NULL;
-	__poll_t mask = 0;
-	int ret = 0;
 
-	if (work->flags & IO_WQ_WORK_CANCEL) {
-		WRITE_ONCE(poll->canceled, true);
-		ret = -ECANCELED;
-	} else if (READ_ONCE(poll->canceled)) {
-		ret = -ECANCELED;
-	}
-
-	if (ret != -ECANCELED)
-		mask = vfs_poll(poll->file, &pt) & poll->events;
-
-	/*
-	 * Note that ->ki_cancel callers also delete iocb from active_reqs after
-	 * calling ->ki_cancel.  We need the ctx_lock roundtrip here to
-	 * synchronize with them.  In the cancellation case the list_del_init
-	 * itself is not actually needed, but harmless so we keep it in to
-	 * avoid further branches in the fast path.
-	 */
 	spin_lock_irq(&ctx->completion_lock);
-	if (!mask && ret != -ECANCELED) {
-		add_wait_queue(poll->head, &poll->wait);
-		spin_unlock_irq(&ctx->completion_lock);
-		return;
-	}
 	hash_del(&req->hash_node);
-	io_poll_complete(req, mask, ret);
-	spin_unlock_irq(&ctx->completion_lock);
-
-	io_cqring_ev_posted(ctx);
-
-	if (ret < 0)
-		req_set_fail_links(req);
-	io_put_req_find_next(req, &nxt);
-	if (nxt)
-		io_wq_assign_next(workptr, nxt);
-}
-
-static void __io_poll_flush(struct io_ring_ctx *ctx, struct llist_node *nodes)
-{
-	struct io_kiocb *req, *tmp;
-	struct req_batch rb;
-
-	rb.to_free = rb.need_iter = 0;
-	spin_lock_irq(&ctx->completion_lock);
-	llist_for_each_entry_safe(req, tmp, nodes, llist_node) {
-		hash_del(&req->hash_node);
-		io_poll_complete(req, req->result, 0);
-
-		if (refcount_dec_and_test(&req->refs) &&
-		    !io_req_multi_free(&rb, req)) {
-			req->flags |= REQ_F_COMP_LOCKED;
-			io_free_req(req);
-		}
-	}
+	io_poll_complete(req, req->result, 0);
+	req->flags |= REQ_F_COMP_LOCKED;
+	io_put_req_find_next(req, nxt);
 	spin_unlock_irq(&ctx->completion_lock);
 
 	io_cqring_ev_posted(ctx);
-	io_free_req_many(ctx, &rb);
-}
-
-static void io_poll_flush(struct io_wq_work **workptr)
-{
-	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
-	struct llist_node *nodes;
-
-	nodes = llist_del_all(&req->ctx->poll_llist);
-	if (nodes)
-		__io_poll_flush(req->ctx, nodes);
 }
 
-static void io_poll_trigger_evfd(struct io_wq_work **workptr)
+static void io_poll_task_func(struct callback_head *cb)
 {
-	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
+	struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
+	struct io_kiocb *nxt = NULL;
 
-	eventfd_signal(req->ctx->cq_ev_fd, 1);
-	io_put_req(req);
+	io_poll_task_handler(req, &nxt);
+	if (nxt)
+		__io_queue_sqe(nxt, NULL);
 }
 
 static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
@@ -3637,8 +3587,8 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 {
 	struct io_poll_iocb *poll = wait->private;
 	struct io_kiocb *req = container_of(poll, struct io_kiocb, poll);
-	struct io_ring_ctx *ctx = req->ctx;
 	__poll_t mask = key_to_poll(key);
+	struct task_struct *tsk;
 
 	/* for instances that support it check for an event match first: */
 	if (mask && !(mask & poll->events))
@@ -3646,46 +3596,11 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 
 	list_del_init(&poll->wait.entry);
 
-	/*
-	 * Run completion inline if we can. We're using trylock here because
-	 * we are violating the completion_lock -> poll wq lock ordering.
-	 * If we have a link timeout we're going to need the completion_lock
-	 * for finalizing the request, mark us as having grabbed that already.
-	 */
-	if (mask) {
-		unsigned long flags;
-
-		if (llist_empty(&ctx->poll_llist) &&
-		    spin_trylock_irqsave(&ctx->completion_lock, flags)) {
-			bool trigger_ev;
-
-			hash_del(&req->hash_node);
-			io_poll_complete(req, mask, 0);
-
-			trigger_ev = io_should_trigger_evfd(ctx);
-			if (trigger_ev && eventfd_signal_count()) {
-				trigger_ev = false;
-				req->work.func = io_poll_trigger_evfd;
-			} else {
-				req->flags |= REQ_F_COMP_LOCKED;
-				io_put_req(req);
-				req = NULL;
-			}
-			spin_unlock_irqrestore(&ctx->completion_lock, flags);
-			__io_cqring_ev_posted(ctx, trigger_ev);
-		} else {
-			req->result = mask;
-			req->llist_node.next = NULL;
-			/* if the list wasn't empty, we're done */
-			if (!llist_add(&req->llist_node, &ctx->poll_llist))
-				req = NULL;
-			else
-				req->work.func = io_poll_flush;
-		}
-	}
-	if (req)
-		io_queue_async_work(req);
-
+	tsk = req->task;
+	req->result = mask;
+	init_task_work(&req->sched_work, io_poll_task_func);
+	sched_work_add(tsk, &req->sched_work);
+	wake_up_process(tsk);
 	return 1;
 }
 
@@ -3733,6 +3648,9 @@ static int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe
 
 	events = READ_ONCE(sqe->poll_events);
 	poll->events = demangle_poll(events) | EPOLLERR | EPOLLHUP;
+
+	/* task will wait for requests on exit, don't need a ref */
+	req->task = current;
 	return 0;
 }
 
@@ -3744,7 +3662,6 @@ static int io_poll_add(struct io_kiocb *req, struct io_kiocb **nxt)
 	bool cancel = false;
 	__poll_t mask;
 
-	INIT_IO_WORK(&req->work, io_poll_complete_work);
 	INIT_HLIST_NODE(&req->hash_node);
 
 	poll->head = NULL;
-- 
2.25.1



* [PATCH 8/9] io_uring: mark requests that we can do poll async in io_op_defs
  2020-02-20 20:31 [PATCHSET 0/9] io_uring: use polled async retry Jens Axboe
                   ` (6 preceding siblings ...)
  2020-02-20 20:31 ` [PATCH 7/9] io_uring: add per-task callback handler Jens Axboe
@ 2020-02-20 20:31 ` Jens Axboe
  2020-02-20 20:31 ` [PATCH 9/9] io_uring: use poll driven retry for files that support it Jens Axboe
  8 siblings, 0 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 20:31 UTC (permalink / raw)
  To: io-uring; +Cc: glauber, peterz, asml.silence, Jens Axboe

Add pollin/pollout fields to the io_op_defs request table, and mark the
commands that we can safely poll for.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5991bcc24387..ca96e0206132 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -624,6 +624,9 @@ struct io_op_def {
 	unsigned		file_table : 1;
 	/* needs ->fs */
 	unsigned		needs_fs : 1;
+	/* set if opcode supports polled "wait" */
+	unsigned		pollin : 1;
+	unsigned		pollout : 1;
 };
 
 static const struct io_op_def io_op_defs[] = {
@@ -633,6 +636,7 @@ static const struct io_op_def io_op_defs[] = {
 		.needs_mm		= 1,
 		.needs_file		= 1,
 		.unbound_nonreg_file	= 1,
+		.pollin			= 1,
 	},
 	[IORING_OP_WRITEV] = {
 		.async_ctx		= 1,
@@ -640,6 +644,7 @@ static const struct io_op_def io_op_defs[] = {
 		.needs_file		= 1,
 		.hash_reg_file		= 1,
 		.unbound_nonreg_file	= 1,
+		.pollout		= 1,
 	},
 	[IORING_OP_FSYNC] = {
 		.needs_file		= 1,
@@ -647,11 +652,13 @@ static const struct io_op_def io_op_defs[] = {
 	[IORING_OP_READ_FIXED] = {
 		.needs_file		= 1,
 		.unbound_nonreg_file	= 1,
+		.pollin			= 1,
 	},
 	[IORING_OP_WRITE_FIXED] = {
 		.needs_file		= 1,
 		.hash_reg_file		= 1,
 		.unbound_nonreg_file	= 1,
+		.pollout		= 1,
 	},
 	[IORING_OP_POLL_ADD] = {
 		.needs_file		= 1,
@@ -667,6 +674,7 @@ static const struct io_op_def io_op_defs[] = {
 		.needs_file		= 1,
 		.unbound_nonreg_file	= 1,
 		.needs_fs		= 1,
+		.pollout		= 1,
 	},
 	[IORING_OP_RECVMSG] = {
 		.async_ctx		= 1,
@@ -674,6 +682,7 @@ static const struct io_op_def io_op_defs[] = {
 		.needs_file		= 1,
 		.unbound_nonreg_file	= 1,
 		.needs_fs		= 1,
+		.pollin			= 1,
 	},
 	[IORING_OP_TIMEOUT] = {
 		.async_ctx		= 1,
@@ -685,6 +694,7 @@ static const struct io_op_def io_op_defs[] = {
 		.needs_file		= 1,
 		.unbound_nonreg_file	= 1,
 		.file_table		= 1,
+		.pollin			= 1,
 	},
 	[IORING_OP_ASYNC_CANCEL] = {},
 	[IORING_OP_LINK_TIMEOUT] = {
@@ -696,6 +706,7 @@ static const struct io_op_def io_op_defs[] = {
 		.needs_mm		= 1,
 		.needs_file		= 1,
 		.unbound_nonreg_file	= 1,
+		.pollout		= 1,
 	},
 	[IORING_OP_FALLOCATE] = {
 		.needs_file		= 1,
@@ -724,11 +735,13 @@ static const struct io_op_def io_op_defs[] = {
 		.needs_mm		= 1,
 		.needs_file		= 1,
 		.unbound_nonreg_file	= 1,
+		.pollin			= 1,
 	},
 	[IORING_OP_WRITE] = {
 		.needs_mm		= 1,
 		.needs_file		= 1,
 		.unbound_nonreg_file	= 1,
+		.pollout		= 1,
 	},
 	[IORING_OP_FADVISE] = {
 		.needs_file		= 1,
@@ -740,11 +753,13 @@ static const struct io_op_def io_op_defs[] = {
 		.needs_mm		= 1,
 		.needs_file		= 1,
 		.unbound_nonreg_file	= 1,
+		.pollout		= 1,
 	},
 	[IORING_OP_RECV] = {
 		.needs_mm		= 1,
 		.needs_file		= 1,
 		.unbound_nonreg_file	= 1,
+		.pollin			= 1,
 	},
 	[IORING_OP_OPENAT2] = {
 		.needs_file		= 1,
-- 
2.25.1



* [PATCH 9/9] io_uring: use poll driven retry for files that support it
  2020-02-20 20:31 [PATCHSET 0/9] io_uring: use polled async retry Jens Axboe
                   ` (7 preceding siblings ...)
  2020-02-20 20:31 ` [PATCH 8/9] io_uring: mark requests that we can do poll async in io_op_defs Jens Axboe
@ 2020-02-20 20:31 ` Jens Axboe
  8 siblings, 0 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 20:31 UTC (permalink / raw)
  To: io-uring; +Cc: glauber, peterz, asml.silence, Jens Axboe

Currently io_uring tries any request in a non-blocking manner, if it can,
and then retries from a worker thread if we got -EAGAIN. Now that we have
a new and fancy poll based retry backend, use that to retry requests if
the file supports it.

This means that, for example, an IORING_OP_RECVMSG on a socket no longer
requires an async thread to complete the IO. If we get -EAGAIN reading
from the socket in a non-blocking manner, we arm a poll handler for
notification on when the socket becomes readable. When it does, the
pending read is executed directly by the task again, through the io_uring
scheduler handlers.

Note that this is very much a work-in-progress, but it does pass the full
test suite. Notable missing features:

- Need to double check the req->apoll lifetime.

- Probably a lot I don't quite recall right now...

It does work for the basic read/write, send/recv, etc testing I've
tried as well.

The feature is marked with IORING_FEAT_FAST_POLL, meaning that async
pollable IO is fast, and that poll<link>other_op is fast as well.
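
For completeness, a liburing-based sketch of how an application could
probe for the new feature at ring setup time (the 64-entry ring size is
arbitrary):

#include <stdbool.h>
#include <liburing.h>

/* Returns true if the kernel advertises poll driven retry. */
static bool ring_has_fast_poll(struct io_uring *ring)
{
        struct io_uring_params p = { };

        if (io_uring_queue_init_params(64, ring, &p) < 0)
                return false;

        return p.features & IORING_FEAT_FAST_POLL;
}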

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                   | 398 +++++++++++++++++++++++++-------
 include/trace/events/io_uring.h |  80 +++++++
 include/uapi/linux/io_uring.h   |   1 +
 3 files changed, 397 insertions(+), 82 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index ca96e0206132..39939b4935ca 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -482,6 +482,8 @@ enum {
 	REQ_F_COMP_LOCKED_BIT,
 	REQ_F_NEED_CLEANUP_BIT,
 	REQ_F_OVERFLOW_BIT,
+	REQ_F_WORK_BIT,
+	REQ_F_POLLED_BIT,
 };
 
 enum {
@@ -524,6 +526,15 @@ enum {
 	REQ_F_NEED_CLEANUP	= BIT(REQ_F_NEED_CLEANUP_BIT),
 	/* in overflow list */
 	REQ_F_OVERFLOW		= BIT(REQ_F_OVERFLOW_BIT),
+	/* ->work is valid */
+	REQ_F_WORK		= BIT(REQ_F_WORK_BIT),
+	/* already went through poll handler */
+	REQ_F_POLLED		= BIT(REQ_F_POLLED_BIT),
+};
+
+struct async_poll {
+	struct io_poll_iocb	poll;
+	struct io_wq_work	work;
 };
 
 /*
@@ -557,10 +568,7 @@ struct io_kiocb {
 	u8				opcode;
 
 	struct io_ring_ctx	*ctx;
-	union {
-		struct list_head	list;
-		struct hlist_node	hash_node;
-	};
+	struct list_head	list;
 	struct list_head	link_list;
 	unsigned int		flags;
 	refcount_t		refs;
@@ -570,14 +578,17 @@ struct io_kiocb {
 
 	struct list_head	inflight_entry;
 
+	struct task_struct	*task;
+
 	union {
 		/*
 		 * Only commands that never go async can use the below fields,
 		 * obviously. Right now only IORING_OP_POLL_ADD uses them.
 		 */
 		struct {
-			struct task_struct	*task;
 			struct callback_head	sched_work;
+			struct hlist_node	hash_node;
+			struct async_poll	*apoll;
 		};
 		struct io_wq_work	work;
 	};
@@ -953,10 +964,13 @@ static inline void io_req_work_grab_env(struct io_kiocb *req,
 	}
 	if (!req->work.task_pid)
 		req->work.task_pid = task_pid_vnr(current);
+	req->flags |= REQ_F_WORK;
 }
 
 static inline void io_req_work_drop_env(struct io_kiocb *req)
 {
+	if (!(req->flags & REQ_F_WORK))
+		return;
 	if (req->work.mm) {
 		mmdrop(req->work.mm);
 		req->work.mm = NULL;
@@ -3467,9 +3481,199 @@ static int io_connect(struct io_kiocb *req, struct io_kiocb **nxt,
 #endif
 }
 
-static bool io_poll_remove_one(struct io_kiocb *req)
+struct io_poll_table {
+	struct poll_table_struct pt;
+	struct io_kiocb *req;
+	int error;
+};
+
+static void __io_queue_proc(struct io_poll_iocb *poll, struct io_poll_table *pt,
+			    struct wait_queue_head *head)
+{
+	if (unlikely(poll->head)) {
+		pt->error = -EINVAL;
+		return;
+	}
+
+	pt->error = 0;
+	poll->head = head;
+	add_wait_queue(head, &poll->wait);
+}
+
+static void io_async_queue_proc(struct file *file, struct wait_queue_head *head,
+			       struct poll_table_struct *p)
+{
+	struct io_poll_table *pt = container_of(p, struct io_poll_table, pt);
+
+	__io_queue_proc(&pt->req->apoll->poll, pt, head);
+}
+
+static int __io_async_wake(struct io_kiocb *req, struct io_poll_iocb *poll,
+			   __poll_t mask, task_work_func_t func)
+{
+	struct task_struct *tsk;
+
+	trace_io_uring_task_add(req->ctx, req->opcode, req->user_data, mask);
+
+	/* for instances that support it check for an event match first: */
+	if (mask && !(mask & poll->events))
+		return 0;
+
+	list_del_init(&poll->wait.entry);
+
+	tsk = req->task;
+	req->result = mask;
+	init_task_work(&req->sched_work, func);
+	sched_work_add(tsk, &req->sched_work);
+	wake_up_process(tsk);
+	return 1;
+}
+
+static void io_async_task_func(struct callback_head *cb)
+{
+	struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
+	void *to_free;
+
+	if (hash_hashed(&req->hash_node)) {
+		struct io_ring_ctx *ctx = req->ctx;
+
+		spin_lock_irq(&ctx->completion_lock);
+		hash_del(&req->hash_node);
+		spin_unlock_irq(&ctx->completion_lock);
+	}
+
+	to_free = req->apoll;
+	WARN_ON_ONCE(!list_empty(&req->apoll->poll.wait.entry));
+
+	__set_current_state(TASK_RUNNING);
+	__io_queue_sqe(req, NULL);
+
+	kfree(to_free);
+}
+
+static int io_async_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
+			void *key)
+{
+	struct io_kiocb *req = wait->private;
+	struct io_poll_iocb *poll = &req->apoll->poll;
+
+	trace_io_uring_poll_wake(req->ctx, req->opcode, req->user_data,
+					key_to_poll(key));
+
+	return __io_async_wake(req, poll, key_to_poll(key), io_async_task_func);
+}
+
+static void io_poll_req_insert(struct io_kiocb *req)
+{
+	struct io_ring_ctx *ctx = req->ctx;
+	struct hlist_head *list;
+
+	list = &ctx->cancel_hash[hash_long(req->user_data, ctx->cancel_hash_bits)];
+	hlist_add_head(&req->hash_node, list);
+}
+
+static __poll_t __io_arm_poll_handler(struct io_kiocb *req,
+				      struct io_poll_iocb *poll,
+				      struct io_poll_table *ipt, __poll_t mask,
+				      wait_queue_func_t wake_func, void *priv)
+	__acquires(&ctx->completion_lock)
+{
+	struct io_ring_ctx *ctx = req->ctx;
+	bool cancel = false;
+
+	poll->file = req->file;
+	poll->head = NULL;
+	poll->done = poll->canceled = false;
+	poll->events = mask;
+
+	ipt->pt._key = mask;
+	ipt->req = req;
+	ipt->error = -EINVAL;
+
+	INIT_LIST_HEAD(&poll->wait.entry);
+	init_waitqueue_func_entry(&poll->wait, wake_func);
+	poll->wait.private = priv;
+
+	mask = vfs_poll(req->file, &ipt->pt) & poll->events;
+
+	spin_lock_irq(&ctx->completion_lock);
+	if (likely(poll->head)) {
+		spin_lock(&poll->head->lock);
+		if (unlikely(list_empty(&poll->wait.entry))) {
+			if (ipt->error)
+				cancel = true;
+			ipt->error = 0;
+			mask = 0;
+		}
+		if (mask || ipt->error)
+			list_del_init(&poll->wait.entry);
+		else if (cancel)
+			WRITE_ONCE(poll->canceled, true);
+		else if (!poll->done) /* actually waiting for an event */
+			io_poll_req_insert(req);
+		spin_unlock(&poll->head->lock);
+	}
+
+	return mask;
+}
+
+static bool io_arm_poll_handler(struct io_kiocb *req)
+{
+	const struct io_op_def *def = &io_op_defs[req->opcode];
+	struct io_ring_ctx *ctx = req->ctx;
+	struct async_poll *apoll;
+	struct io_poll_table ipt;
+	__poll_t mask, ret;
+
+	if (!req->file || !file_can_poll(req->file))
+		return false;
+	if (req->flags & (REQ_F_MUST_PUNT | REQ_F_WORK))
+		return false;
+	if (req->flags & REQ_F_POLLED) {
+		memcpy(&req->work, &req->apoll->work, sizeof(req->work));
+		return false;
+	}
+	if (!def->pollin && !def->pollout)
+		return false;
+
+	apoll = kmalloc(sizeof(*apoll), GFP_ATOMIC);
+	if (unlikely(!apoll))
+		return false;
+
+	req->flags |= REQ_F_POLLED;
+	memcpy(&apoll->work, &req->work, sizeof(req->work));
+
+	req->task = current;
+	req->apoll = apoll;
+	INIT_HLIST_NODE(&req->hash_node);
+
+	if (def->pollin)
+		mask = POLLIN | POLLRDNORM;
+	if (def->pollout)
+		mask |= POLLOUT | POLLWRNORM;
+	mask |= POLLERR | POLLPRI;
+
+	ipt.pt._qproc = io_async_queue_proc;
+
+	ret = __io_arm_poll_handler(req, &apoll->poll, &ipt, mask,
+					io_async_wake, req);
+	if (ret) {
+		ipt.error = 0;
+		apoll->poll.done = true;
+		spin_unlock_irq(&ctx->completion_lock);
+		memcpy(&req->work, &apoll->work, sizeof(req->work));
+		kfree(apoll);
+		return false;
+	}
+	spin_unlock_irq(&ctx->completion_lock);
+	trace_io_uring_poll_arm(ctx, req->opcode, req->user_data, mask,
+					apoll->poll.events);
+	return true;
+}
+
+static bool __io_poll_remove_one(struct io_kiocb *req,
+				 struct io_poll_iocb *poll)
 {
-	struct io_poll_iocb *poll = &req->poll;
 	bool do_complete = false;
 
 	spin_lock(&poll->head->lock);
@@ -3479,7 +3683,24 @@ static bool io_poll_remove_one(struct io_kiocb *req)
 		do_complete = true;
 	}
 	spin_unlock(&poll->head->lock);
+	return do_complete;
+}
+
+static bool io_poll_remove_one(struct io_kiocb *req)
+{
+	bool do_complete;
+
+	if (req->opcode == IORING_OP_POLL_ADD) {
+		do_complete = __io_poll_remove_one(req, &req->poll);
+	} else {
+		/* non-poll requests have submit ref still */
+		do_complete = __io_poll_remove_one(req, &req->apoll->poll);
+		if (do_complete)
+			io_put_req(req);
+	}
+
 	hash_del(&req->hash_node);
+
 	if (do_complete) {
 		io_cqring_fill_event(req, -ECANCELED);
 		io_commit_cqring(req->ctx);
@@ -3602,51 +3823,16 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 {
 	struct io_poll_iocb *poll = wait->private;
 	struct io_kiocb *req = container_of(poll, struct io_kiocb, poll);
-	__poll_t mask = key_to_poll(key);
-	struct task_struct *tsk;
-
-	/* for instances that support it check for an event match first: */
-	if (mask && !(mask & poll->events))
-		return 0;
-
-	list_del_init(&poll->wait.entry);
 
-	tsk = req->task;
-	req->result = mask;
-	init_task_work(&req->sched_work, io_poll_task_func);
-	sched_work_add(tsk, &req->sched_work);
-	wake_up_process(tsk);
-	return 1;
+	return __io_async_wake(req, poll, key_to_poll(key), io_poll_task_func);
 }
 
-struct io_poll_table {
-	struct poll_table_struct pt;
-	struct io_kiocb *req;
-	int error;
-};
-
 static void io_poll_queue_proc(struct file *file, struct wait_queue_head *head,
 			       struct poll_table_struct *p)
 {
 	struct io_poll_table *pt = container_of(p, struct io_poll_table, pt);
 
-	if (unlikely(pt->req->poll.head)) {
-		pt->error = -EINVAL;
-		return;
-	}
-
-	pt->error = 0;
-	pt->req->poll.head = head;
-	add_wait_queue(head, &pt->req->poll.wait);
-}
-
-static void io_poll_req_insert(struct io_kiocb *req)
-{
-	struct io_ring_ctx *ctx = req->ctx;
-	struct hlist_head *list;
-
-	list = &ctx->cancel_hash[hash_long(req->user_data, ctx->cancel_hash_bits)];
-	hlist_add_head(&req->hash_node, list);
+	__io_queue_proc(&pt->req->poll, pt, head);
 }
 
 static int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
@@ -3674,46 +3860,15 @@ static int io_poll_add(struct io_kiocb *req, struct io_kiocb **nxt)
 	struct io_poll_iocb *poll = &req->poll;
 	struct io_ring_ctx *ctx = req->ctx;
 	struct io_poll_table ipt;
-	bool cancel = false;
 	__poll_t mask;
 
 	INIT_HLIST_NODE(&req->hash_node);
-
-	poll->head = NULL;
-	poll->done = false;
-	poll->canceled = false;
-
-	ipt.pt._qproc = io_poll_queue_proc;
-	ipt.pt._key = poll->events;
-	ipt.req = req;
-	ipt.error = -EINVAL; /* same as no support for IOCB_CMD_POLL */
-
-	/* initialized the list so that we can do list_empty checks */
-	INIT_LIST_HEAD(&poll->wait.entry);
-	init_waitqueue_func_entry(&poll->wait, io_poll_wake);
-	poll->wait.private = poll;
-
 	INIT_LIST_HEAD(&req->list);
+	ipt.pt._qproc = io_poll_queue_proc;
 
-	mask = vfs_poll(poll->file, &ipt.pt) & poll->events;
+	mask = __io_arm_poll_handler(req, &req->poll, &ipt, poll->events,
+					io_poll_wake, &req->poll);
 
-	spin_lock_irq(&ctx->completion_lock);
-	if (likely(poll->head)) {
-		spin_lock(&poll->head->lock);
-		if (unlikely(list_empty(&poll->wait.entry))) {
-			if (ipt.error)
-				cancel = true;
-			ipt.error = 0;
-			mask = 0;
-		}
-		if (mask || ipt.error)
-			list_del_init(&poll->wait.entry);
-		else if (cancel)
-			WRITE_ONCE(poll->canceled, true);
-		else if (!poll->done) /* actually waiting for an event */
-			io_poll_req_insert(req);
-		spin_unlock(&poll->head->lock);
-	}
 	if (mask) { /* no async, we'd stolen it */
 		ipt.error = 0;
 		io_poll_complete(req, mask, 0);
@@ -4660,6 +4815,11 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	 */
 	if (ret == -EAGAIN && (!(req->flags & REQ_F_NOWAIT) ||
 	    (req->flags & REQ_F_MUST_PUNT))) {
+		if (io_arm_poll_handler(req)) {
+			if (linked_timeout)
+				io_put_req(linked_timeout);
+			goto done_req;
+		}
 punt:
 		if (io_op_defs[req->opcode].file_table) {
 			ret = io_grab_files(req);
@@ -5199,8 +5359,13 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
 	struct io_rings *rings = ctx->rings;
 	int ret = 0;
 
-	if (io_cqring_events(ctx, false) >= min_events)
-		return 0;
+	do {
+		if (io_cqring_events(ctx, false) >= min_events)
+			return 0;
+		if (!current->sched_work)
+			break;
+		sched_work_run();
+	} while (1);
 
 	if (sig) {
 #ifdef CONFIG_COMPAT
@@ -6435,6 +6600,62 @@ static void io_uring_cancel_files(struct io_ring_ctx *ctx,
 	finish_wait(&ctx->inflight_wait, &wait);
 }
 
+static void __io_uring_cancel_task(struct task_struct *tsk,
+				   task_work_func_t func,
+				   void (*cancel)(struct io_kiocb *))
+{
+	struct callback_head *head;
+
+	while ((head = sched_work_cancel(tsk, func)) != NULL) {
+		struct io_kiocb *req;
+
+		req = container_of(head, struct io_kiocb, sched_work);
+		cancel(req);
+	}
+}
+
+static void async_cancel(struct io_kiocb *req)
+{
+	struct io_ring_ctx *ctx = req->ctx;
+	void *to_free;
+
+	spin_lock_irq(&ctx->completion_lock);
+	hash_del(&req->hash_node);
+	io_cqring_fill_event(req, -ECANCELED);
+	io_commit_cqring(ctx);
+	req->flags |= REQ_F_COMP_LOCKED;
+	to_free = req->apoll;
+	io_double_put_req(req);
+	spin_unlock_irq(&ctx->completion_lock);
+
+	kfree(to_free);
+	io_cqring_ev_posted(ctx);
+}
+
+static void io_uring_cancel_task_async(struct task_struct *tsk)
+{
+	__io_uring_cancel_task(tsk, io_async_task_func, async_cancel);
+}
+
+static void poll_cancel(struct io_kiocb *req)
+{
+	struct io_ring_ctx *ctx = req->ctx;
+
+	spin_lock_irq(&ctx->completion_lock);
+	hash_del(&req->hash_node);
+	io_poll_complete(req, -ECANCELED, 0);
+	req->flags |= REQ_F_COMP_LOCKED;
+	io_put_req(req);
+	spin_unlock_irq(&ctx->completion_lock);
+
+	io_cqring_ev_posted(ctx);
+}
+
+static void io_uring_cancel_task_poll(struct task_struct *tsk)
+{
+	__io_uring_cancel_task(tsk, io_poll_task_func, poll_cancel);
+}
+
 static int io_uring_flush(struct file *file, void *data)
 {
 	struct io_ring_ctx *ctx = file->private_data;
@@ -6444,8 +6665,11 @@ static int io_uring_flush(struct file *file, void *data)
 	/*
 	 * If the task is going away, cancel work it may have pending
 	 */
-	if (fatal_signal_pending(current) || (current->flags & PF_EXITING))
+	if (fatal_signal_pending(current) || (current->flags & PF_EXITING)) {
+		io_uring_cancel_task_poll(current);
+		io_uring_cancel_task_async(current);
 		io_wq_cancel_pid(ctx->io_wq, task_pid_vnr(current));
+	}
 
 	return 0;
 }
@@ -6650,6 +6874,16 @@ static void __io_uring_show_fdinfo(struct io_ring_ctx *ctx, struct seq_file *m)
 		seq_printf(m, "Personalities:\n");
 		idr_for_each(&ctx->personality_idr, io_uring_show_cred, m);
 	}
+	seq_printf(m, "Inflight:\n");
+	spin_lock_irq(&ctx->completion_lock);
+	for (i = 0; i < (1U << ctx->cancel_hash_bits); i++) {
+		struct hlist_head *list = &ctx->cancel_hash[i];
+		struct io_kiocb *req;
+
+		hlist_for_each_entry(req, list, hash_node)
+			seq_printf(m, "  req=%lx, op=%d, tsk list=%d\n", (long) req, req->opcode, req->task->sched_work != NULL);
+	}
+	spin_unlock_irq(&ctx->completion_lock);
 	mutex_unlock(&ctx->uring_lock);
 }
 
@@ -6863,7 +7097,7 @@ static int io_uring_create(unsigned entries, struct io_uring_params *p)
 
 	p->features = IORING_FEAT_SINGLE_MMAP | IORING_FEAT_NODROP |
 			IORING_FEAT_SUBMIT_STABLE | IORING_FEAT_RW_CUR_POS |
-			IORING_FEAT_CUR_PERSONALITY;
+			IORING_FEAT_CUR_PERSONALITY | IORING_FEAT_FAST_POLL;
 	trace_io_uring_create(ret, ctx, p->sq_entries, p->cq_entries, p->flags);
 	return ret;
 err:
diff --git a/include/trace/events/io_uring.h b/include/trace/events/io_uring.h
index 27bd9e4f927b..433e02b3ffb7 100644
--- a/include/trace/events/io_uring.h
+++ b/include/trace/events/io_uring.h
@@ -357,6 +357,86 @@ TRACE_EVENT(io_uring_submit_sqe,
 			  __entry->force_nonblock, __entry->sq_thread)
 );
 
+TRACE_EVENT(io_uring_poll_arm,
+
+	TP_PROTO(void *ctx, u8 opcode, u64 user_data, int mask, int events),
+
+	TP_ARGS(ctx, opcode, user_data, mask, events),
+
+	TP_STRUCT__entry (
+		__field(  void *,	ctx		)
+		__field(  u8,		opcode		)
+		__field(  u64,		user_data	)
+		__field(  int,		mask		)
+		__field(  int,		events		)
+	),
+
+	TP_fast_assign(
+		__entry->ctx		= ctx;
+		__entry->opcode		= opcode;
+		__entry->user_data	= user_data;
+		__entry->mask		= mask;
+		__entry->events		= events;
+	),
+
+	TP_printk("ring %p, op %d, data 0x%llx, mask 0x%x, events 0x%x",
+			  __entry->ctx, __entry->opcode,
+			  (unsigned long long) __entry->user_data,
+			  __entry->mask, __entry->events)
+);
+
+TRACE_EVENT(io_uring_poll_wake,
+
+	TP_PROTO(void *ctx, u8 opcode, u64 user_data, int mask),
+
+	TP_ARGS(ctx, opcode, user_data, mask),
+
+	TP_STRUCT__entry (
+		__field(  void *,	ctx		)
+		__field(  u8,		opcode		)
+		__field(  u64,		user_data	)
+		__field(  int,		mask		)
+	),
+
+	TP_fast_assign(
+		__entry->ctx		= ctx;
+		__entry->opcode		= opcode;
+		__entry->user_data	= user_data;
+		__entry->mask		= mask;
+	),
+
+	TP_printk("ring %p, op %d, data 0x%llx, mask 0x%x",
+			  __entry->ctx, __entry->opcode,
+			  (unsigned long long) __entry->user_data,
+			  __entry->mask)
+);
+
+TRACE_EVENT(io_uring_task_add,
+
+	TP_PROTO(void *ctx, u8 opcode, u64 user_data, int mask),
+
+	TP_ARGS(ctx, opcode, user_data, mask),
+
+	TP_STRUCT__entry (
+		__field(  void *,	ctx		)
+		__field(  u8,		opcode		)
+		__field(  u64,		user_data	)
+		__field(  int,		mask		)
+	),
+
+	TP_fast_assign(
+		__entry->ctx		= ctx;
+		__entry->opcode		= opcode;
+		__entry->user_data	= user_data;
+		__entry->mask		= mask;
+	),
+
+	TP_printk("ring %p, op %d, data 0x%llx, mask %x",
+			  __entry->ctx, __entry->opcode,
+			  (unsigned long long) __entry->user_data,
+			  __entry->mask)
+);
+
 #endif /* _TRACE_IO_URING_H */
 
 /* This part must be outside protection */
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 3f7961c1c243..653865554691 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -204,6 +204,7 @@ struct io_uring_params {
 #define IORING_FEAT_SUBMIT_STABLE	(1U << 2)
 #define IORING_FEAT_RW_CUR_POS		(1U << 3)
 #define IORING_FEAT_CUR_PERSONALITY	(1U << 4)
+#define IORING_FEAT_FAST_POLL		(1U << 5)
 
 /*
  * io_uring_register(2) opcodes and arguments
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH 5/9] kernel: abstract out task work helpers
  2020-02-20 20:31 ` [PATCH 5/9] kernel: abstract out task work helpers Jens Axboe
@ 2020-02-20 21:07   ` Peter Zijlstra
  2020-02-20 21:08     ` Jens Axboe
  0 siblings, 1 reply; 53+ messages in thread
From: Peter Zijlstra @ 2020-02-20 21:07 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, glauber, asml.silence

On Thu, Feb 20, 2020 at 01:31:47PM -0700, Jens Axboe wrote:

> @@ -27,39 +43,25 @@ static struct callback_head work_exited; /* all we need is ->next == NULL */
>  int
>  task_work_add(struct task_struct *task, struct callback_head *work, bool notify)
>  {
> -	struct callback_head *head;
> +	int ret;
>  
> -	do {
> -		head = READ_ONCE(task->task_works);
> -		if (unlikely(head == &work_exited))
> -			return -ESRCH;
> -		work->next = head;
> -	} while (cmpxchg(&task->task_works, head, work) != head);
> +	ret = __task_work_add(task, &task->task_works, work);
>  
>  	if (notify)

	if (!ret && notify)

>  		set_notify_resume(task);
> -	return 0;
> +
> +	return ret;
>  }

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 5/9] kernel: abstract out task work helpers
  2020-02-20 21:07   ` Peter Zijlstra
@ 2020-02-20 21:08     ` Jens Axboe
  0 siblings, 0 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 21:08 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: io-uring, glauber, asml.silence

On 2/20/20 2:07 PM, Peter Zijlstra wrote:
> On Thu, Feb 20, 2020 at 01:31:47PM -0700, Jens Axboe wrote:
> 
>> @@ -27,39 +43,25 @@ static struct callback_head work_exited; /* all we need is ->next == NULL */
>>  int
>>  task_work_add(struct task_struct *task, struct callback_head *work, bool notify)
>>  {
>> -	struct callback_head *head;
>> +	int ret;
>>  
>> -	do {
>> -		head = READ_ONCE(task->task_works);
>> -		if (unlikely(head == &work_exited))
>> -			return -ESRCH;
>> -		work->next = head;
>> -	} while (cmpxchg(&task->task_works, head, work) != head);
>> +	ret = __task_work_add(task, &task->task_works, work);
>>  
>>  	if (notify)
> 
> 	if (!ret && notify)

Good catch, thanks! Fixed up.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 6/9] sched: add a sched_work list
  2020-02-20 20:31 ` [PATCH 6/9] sched: add a sched_work list Jens Axboe
@ 2020-02-20 21:17   ` Peter Zijlstra
  2020-02-20 21:53     ` Jens Axboe
  0 siblings, 1 reply; 53+ messages in thread
From: Peter Zijlstra @ 2020-02-20 21:17 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, glauber, asml.silence

On Thu, Feb 20, 2020 at 01:31:48PM -0700, Jens Axboe wrote:
> This is similar to the task_works, and uses the same infrastructure, but
> the sched_work list is run when the task is being scheduled in or out.
> 
> The intended use case here is for core code to be able to add work
> that should be automatically run by the task, without the task needing
> to do anything. This is done outside of the task, one example would be
> from waitqueue handlers, or anything else that is invoked out-of-band
> from the task itself.
> 


> diff --git a/kernel/task_work.c b/kernel/task_work.c
> index 3445421266e7..ba62485d5b3d 100644
> --- a/kernel/task_work.c
> +++ b/kernel/task_work.c
> @@ -3,7 +3,14 @@
>  #include <linux/task_work.h>
>  #include <linux/tracehook.h>
>  
> -static struct callback_head work_exited; /* all we need is ->next == NULL */
> +static void task_exit_func(struct callback_head *head)
> +{
> +}
> +
> +static struct callback_head work_exited = {
> +	.next	= NULL,
> +	.func	= task_exit_func,
> +};

Do we really need this? It seems to suggest we're trying to execute
work_exited, which would be an error.

Doing so would be the result of calling sched_work_run() after
exit_task_work(). I suppose that's actually possible.. the problem is
that that would reset sched_work to NULL and re-allow queueing works,
which would then leak.

I'll look at it in more detail tomorrow, I'm tired...


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 6/9] sched: add a sched_work list
  2020-02-20 21:17   ` Peter Zijlstra
@ 2020-02-20 21:53     ` Jens Axboe
  2020-02-20 22:02       ` Jens Axboe
  0 siblings, 1 reply; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 21:53 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: io-uring, glauber, asml.silence

On 2/20/20 2:17 PM, Peter Zijlstra wrote:
> On Thu, Feb 20, 2020 at 01:31:48PM -0700, Jens Axboe wrote:
>> This is similar to the task_works, and uses the same infrastructure, but
>> the sched_work list is run when the task is being scheduled in or out.
>>
>> The intended use case here is for core code to be able to add work
>> that should be automatically run by the task, without the task needing
>> to do anything. This is done outside of the task, one example would be
>> from waitqueue handlers, or anything else that is invoked out-of-band
>> from the task itself.
>>
> 
> 
>> diff --git a/kernel/task_work.c b/kernel/task_work.c
>> index 3445421266e7..ba62485d5b3d 100644
>> --- a/kernel/task_work.c
>> +++ b/kernel/task_work.c
>> @@ -3,7 +3,14 @@
>>  #include <linux/task_work.h>
>>  #include <linux/tracehook.h>
>>  
>> -static struct callback_head work_exited; /* all we need is ->next == NULL */
>> +static void task_exit_func(struct callback_head *head)
>> +{
>> +}
>> +
>> +static struct callback_head work_exited = {
>> +	.next	= NULL,
>> +	.func	= task_exit_func,
>> +};
> 
> Do we really need this? It seems to suggest we're trying to execute
> work_exited, which would be an error.
> 
> Doing so would be the result of calling sched_work_run() after
> exit_task_work(). I suppose that's actually possible.. the problem is
> that that would reset sched_work to NULL and re-allow queueing works,
> which would then leak.
> 
> I'll look at it in more detail tomorrow, I'm tired...

Let me try and instrument it, I definitely hit it on task exit
but might have been induced by some earlier bugs.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 6/9] sched: add a sched_work list
  2020-02-20 21:53     ` Jens Axboe
@ 2020-02-20 22:02       ` Jens Axboe
  0 siblings, 0 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 22:02 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: io-uring, glauber, asml.silence

On 2/20/20 2:53 PM, Jens Axboe wrote:
> On 2/20/20 2:17 PM, Peter Zijlstra wrote:
>> On Thu, Feb 20, 2020 at 01:31:48PM -0700, Jens Axboe wrote:
>>> This is similar to the task_works, and uses the same infrastructure, but
>>> the sched_work list is run when the task is being scheduled in or out.
>>>
>>> The intended use case here is for core code to be able to add work
>>> that should be automatically run by the task, without the task needing
>>> to do anything. This is done outside of the task, one example would be
>>> from waitqueue handlers, or anything else that is invoked out-of-band
>>> from the task itself.
>>>
>>
>>
>>> diff --git a/kernel/task_work.c b/kernel/task_work.c
>>> index 3445421266e7..ba62485d5b3d 100644
>>> --- a/kernel/task_work.c
>>> +++ b/kernel/task_work.c
>>> @@ -3,7 +3,14 @@
>>>  #include <linux/task_work.h>
>>>  #include <linux/tracehook.h>
>>>  
>>> -static struct callback_head work_exited; /* all we need is ->next == NULL */
>>> +static void task_exit_func(struct callback_head *head)
>>> +{
>>> +}
>>> +
>>> +static struct callback_head work_exited = {
>>> +	.next	= NULL,
>>> +	.func	= task_exit_func,
>>> +};
>>
>> Do we really need this? It seems to suggest we're trying to execute
>> work_exited, which would be an error.
>>
>> Doing so would be the result of calling sched_work_run() after
>> exit_task_work(). I suppose that's actually possible.. the problem is
>> that that would reset sched_work to NULL and re-allow queueing works,
>> which would then leak.
>>
>> I'll look at it in more detail tomorrow, I'm tired...
> 
> Let me try and instrument it, I definitely hit it on task exit
> but might have been induced by some earlier bugs.

I suspect I hit this before we added sched_work_run() to
exit_task_work(). I re-ran all the testing, and it definitely doesn't
trigger now.

So I'll remove it for now.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 20:31 ` [PATCH 7/9] io_uring: add per-task callback handler Jens Axboe
@ 2020-02-20 22:02   ` Jann Horn
  2020-02-20 22:14     ` Jens Axboe
                       ` (2 more replies)
  2020-02-21 13:51   ` Pavel Begunkov
  1 sibling, 3 replies; 53+ messages in thread
From: Jann Horn @ 2020-02-20 22:02 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On Thu, Feb 20, 2020 at 9:32 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> For poll requests, it's not uncommon to link a read (or write) after
> the poll to execute immediately after the file is marked as ready.
> Since the poll completion is called inside the waitqueue wake up handler,
> we have to punt that linked request to async context. This slows down
> the processing, and actually means it's faster to not use a link for this
> use case.
>
> We also run into problems if the completion_lock is contended, as we're
> doing a different lock ordering than the issue side is. Hence we have
> to do trylock for completion, and if that fails, go async. Poll removal
> needs to go async as well, for the same reason.
>
> eventfd notification needs special case as well, to avoid stack blowing
> recursion or deadlocks.
>
> These are all deficiencies that were inherited from the aio poll
> implementation, but I think we can do better. When a poll completes,
> simply queue it up in the task poll list. When the task completes the
> list, we can run dependent links inline as well. This means we never
> have to go async, and we can remove a bunch of code associated with
> that, and optimizations to try and make that run faster. The diffstat
> speaks for itself.
[...]
> -static void io_poll_trigger_evfd(struct io_wq_work **workptr)
> +static void io_poll_task_func(struct callback_head *cb)
>  {
> -       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
> +       struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
> +       struct io_kiocb *nxt = NULL;
>
[...]
> +       io_poll_task_handler(req, &nxt);
> +       if (nxt)
> +               __io_queue_sqe(nxt, NULL);

This can now get here from anywhere that calls schedule(), right?
Which means that this might almost double the required kernel stack
size, if one codepath exists that calls schedule() while near the
bottom of the stack and another codepath exists that goes from here
through the VFS and again uses a big amount of stack space? This is a
somewhat ugly suggestion, but I wonder whether it'd make sense to
check whether we've consumed over 25% of stack space, or something
like that, and if so, directly punt the request.

Also, can we recursively hit this point? Even if __io_queue_sqe()
doesn't *want* to block, the code it calls into might still block on a
mutex or something like that, at which point the mutex code would call
into schedule(), which would then again hit sched_out_update() and get
here, right? As far as I can tell, this could cause unbounded
recursion.

(On modern kernels with CONFIG_VMAP_STACK=y, running out of stack
space on a task stack is "just" a plain kernel oops instead of nasty
memory corruption, but we still should really try to avoid it.)

>  }
[...]
> @@ -3646,46 +3596,11 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
>
>         list_del_init(&poll->wait.entry);
>
[...]
> +       tsk = req->task;
> +       req->result = mask;
> +       init_task_work(&req->sched_work, io_poll_task_func);
> +       sched_work_add(tsk, &req->sched_work);

Doesn't this have to check the return value?

> +       wake_up_process(tsk);
>         return 1;
>  }
>
> @@ -3733,6 +3648,9 @@ static int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe
>
>         events = READ_ONCE(sqe->poll_events);
>         poll->events = demangle_poll(events) | EPOLLERR | EPOLLHUP;
> +
> +       /* task will wait for requests on exit, don't need a ref */
> +       req->task = current;

Can we get here in SQPOLL mode?

>         return 0;
>  }

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 22:02   ` Jann Horn
@ 2020-02-20 22:14     ` Jens Axboe
  2020-02-20 22:18       ` Jens Axboe
                         ` (2 more replies)
  2020-02-20 22:56     ` Jann Horn
  2020-02-21 10:47     ` Peter Zijlstra
  2 siblings, 3 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 22:14 UTC (permalink / raw)
  To: Jann Horn; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On 2/20/20 3:02 PM, Jann Horn wrote:
> On Thu, Feb 20, 2020 at 9:32 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> For poll requests, it's not uncommon to link a read (or write) after
>> the poll to execute immediately after the file is marked as ready.
>> Since the poll completion is called inside the waitqueue wake up handler,
>> we have to punt that linked request to async context. This slows down
>> the processing, and actually means it's faster to not use a link for this
>> use case.
>>
>> We also run into problems if the completion_lock is contended, as we're
>> doing a different lock ordering than the issue side is. Hence we have
>> to do trylock for completion, and if that fails, go async. Poll removal
>> needs to go async as well, for the same reason.
>>
>> eventfd notification needs special case as well, to avoid stack blowing
>> recursion or deadlocks.
>>
>> These are all deficiencies that were inherited from the aio poll
>> implementation, but I think we can do better. When a poll completes,
>> simply queue it up in the task poll list. When the task completes the
>> list, we can run dependent links inline as well. This means we never
>> have to go async, and we can remove a bunch of code associated with
>> that, and optimizations to try and make that run faster. The diffstat
>> speaks for itself.
> [...]
>> -static void io_poll_trigger_evfd(struct io_wq_work **workptr)
>> +static void io_poll_task_func(struct callback_head *cb)
>>  {
>> -       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
>> +       struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
>> +       struct io_kiocb *nxt = NULL;
>>
> [...]
>> +       io_poll_task_handler(req, &nxt);
>> +       if (nxt)
>> +               __io_queue_sqe(nxt, NULL);
> 
> This can now get here from anywhere that calls schedule(), right?
> Which means that this might almost double the required kernel stack
> size, if one codepath exists that calls schedule() while near the
> bottom of the stack and another codepath exists that goes from here
> through the VFS and again uses a big amount of stack space? This is a
> somewhat ugly suggestion, but I wonder whether it'd make sense to
> check whether we've consumed over 25% of stack space, or something
> like that, and if so, directly punt the request.

Right, it'll increase the stack usage. I'm not against adding some
safeguard that punts if we're too deep in, though I'd have to look at
how to even do that... Looks like stack_not_used(), though it's not
clear to me how efficient that is?

> Also, can we recursively hit this point? Even if __io_queue_sqe()
> doesn't *want* to block, the code it calls into might still block on a
> mutex or something like that, at which point the mutex code would call
> into schedule(), which would then again hit sched_out_update() and get
> here, right? As far as I can tell, this could cause unbounded
> recursion.

The sched_work items are pruned before being run, so that can't happen.
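
By pruned I mean the runner detaches the whole list before executing
anything. Roughly this pattern (sketch only; sched_work_run_list() is a
made-up name, not the real implementation):

	static void sched_work_run_list(struct task_struct *tsk)
	{
		struct callback_head *work, *next;

		/* sketch of the detach-then-run pattern, not the actual code */
		/* detach the entire list up front; anything queued after
		 * this point is handled by a later run, not re-entered here */
		work = xchg(&tsk->sched_work, NULL);
		while (work) {
			next = work->next;
			work->func(work);
			work = next;
		}
	}

so the items being run are already off the list by the time we call into
__io_queue_sqe().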

> (On modern kernels with CONFIG_VMAP_STACK=y, running out of stack
> space on a task stack is "just" a plain kernel oops instead of nasty
> memory corruption, but we still should really try to avoid it.)

Certainly!

>> @@ -3646,46 +3596,11 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
>>
>>         list_del_init(&poll->wait.entry);
>>
> [...]
>> +       tsk = req->task;
>> +       req->result = mask;
>> +       init_task_work(&req->sched_work, io_poll_task_func);
>> +       sched_work_add(tsk, &req->sched_work);
> 
> Doesn't this have to check the return value?

Trying to think if we can get here with TASK_EXITING, but probably safer
to just handle it in any case. I'll add that.

>> +       wake_up_process(tsk);
>>         return 1;
>>  }
>>
>> @@ -3733,6 +3648,9 @@ static int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe
>>
>>         events = READ_ONCE(sqe->poll_events);
>>         poll->events = demangle_poll(events) | EPOLLERR | EPOLLHUP;
>> +
>> +       /* task will wait for requests on exit, don't need a ref */
>> +       req->task = current;
> 
> Can we get here in SQPOLL mode?

We can, this and the async poll arm should just revert to the old
behavior for SQPOLL. I'll make that change.

Thanks for taking a look!

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 22:14     ` Jens Axboe
@ 2020-02-20 22:18       ` Jens Axboe
  2020-02-20 22:25         ` Jann Horn
  2020-02-20 22:23       ` Jens Axboe
  2020-02-20 22:23       ` Jann Horn
  2 siblings, 1 reply; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 22:18 UTC (permalink / raw)
  To: Jann Horn; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On 2/20/20 3:14 PM, Jens Axboe wrote:
>>> @@ -3733,6 +3648,9 @@ static int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe
>>>
>>>         events = READ_ONCE(sqe->poll_events);
>>>         poll->events = demangle_poll(events) | EPOLLERR | EPOLLHUP;
>>> +
>>> +       /* task will wait for requests on exit, don't need a ref */
>>> +       req->task = current;
>>
>> Can we get here in SQPOLL mode?
> 
> We can, this and the async poll arm should just revert to the old
> behavior for SQPOLL. I'll make that change.

Actually, I think that should work fine, are you seeing a reason it
should not?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 22:14     ` Jens Axboe
  2020-02-20 22:18       ` Jens Axboe
@ 2020-02-20 22:23       ` Jens Axboe
  2020-02-20 22:38         ` Jann Horn
  2020-02-20 22:23       ` Jann Horn
  2 siblings, 1 reply; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 22:23 UTC (permalink / raw)
  To: Jann Horn; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On 2/20/20 3:14 PM, Jens Axboe wrote:
>>> @@ -3646,46 +3596,11 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
>>>
>>>         list_del_init(&poll->wait.entry);
>>>
>> [...]
>>> +       tsk = req->task;
>>> +       req->result = mask;
>>> +       init_task_work(&req->sched_work, io_poll_task_func);
>>> +       sched_work_add(tsk, &req->sched_work);
>>
>> Doesn't this have to check the return value?
> 
> Trying to think if we can get here with TASK_EXITING, but probably safer
> to just handle it in any case. I'll add that.

Double checked this one, and I think it's good as-is, but needs a
comment. If the sched_work_add() fails, then the work item is still in
the poll hash on the ctx. That work is canceled on exit.
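
IOW, just a comment above the call site, something like:

	/*
	 * If sched_work_add() fails because the task is exiting, the
	 * request is still hashed on the ctx and gets canceled along
	 * with the rest of the pending work when the task goes away.
	 */
	sched_work_add(tsk, &req->sched_work);
	wake_up_process(tsk);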

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 22:14     ` Jens Axboe
  2020-02-20 22:18       ` Jens Axboe
  2020-02-20 22:23       ` Jens Axboe
@ 2020-02-20 22:23       ` Jann Horn
  2020-02-20 23:00         ` Jens Axboe
  2 siblings, 1 reply; 53+ messages in thread
From: Jann Horn @ 2020-02-20 22:23 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On Thu, Feb 20, 2020 at 11:14 PM Jens Axboe <axboe@kernel.dk> wrote:
> On 2/20/20 3:02 PM, Jann Horn wrote:
> > On Thu, Feb 20, 2020 at 9:32 PM Jens Axboe <axboe@kernel.dk> wrote:
> >>
> >> For poll requests, it's not uncommon to link a read (or write) after
> >> the poll to execute immediately after the file is marked as ready.
> >> Since the poll completion is called inside the waitqueue wake up handler,
> >> we have to punt that linked request to async context. This slows down
> >> the processing, and actually means it's faster to not use a link for this
> >> use case.
> >>
> >> We also run into problems if the completion_lock is contended, as we're
> >> doing a different lock ordering than the issue side is. Hence we have
> >> to do trylock for completion, and if that fails, go async. Poll removal
> >> needs to go async as well, for the same reason.
> >>
> >> eventfd notification needs special case as well, to avoid stack blowing
> >> recursion or deadlocks.
> >>
> >> These are all deficiencies that were inherited from the aio poll
> >> implementation, but I think we can do better. When a poll completes,
> >> simply queue it up in the task poll list. When the task completes the
> >> list, we can run dependent links inline as well. This means we never
> >> have to go async, and we can remove a bunch of code associated with
> >> that, and optimizations to try and make that run faster. The diffstat
> >> speaks for itself.
> > [...]
> >> -static void io_poll_trigger_evfd(struct io_wq_work **workptr)
> >> +static void io_poll_task_func(struct callback_head *cb)
> >>  {
> >> -       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
> >> +       struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
> >> +       struct io_kiocb *nxt = NULL;
> >>
> > [...]
> >> +       io_poll_task_handler(req, &nxt);
> >> +       if (nxt)
> >> +               __io_queue_sqe(nxt, NULL);
> >
> > This can now get here from anywhere that calls schedule(), right?
> > Which means that this might almost double the required kernel stack
> > size, if one codepath exists that calls schedule() while near the
> > bottom of the stack and another codepath exists that goes from here
> > through the VFS and again uses a big amount of stack space? This is a
> > somewhat ugly suggestion, but I wonder whether it'd make sense to
> > check whether we've consumed over 25% of stack space, or something
> > like that, and if so, directly punt the request.
>
> Right, it'll increase the stack usage. I'm not against adding some
> safeguard that punts if we're too deep in, though I'd have to look at
> how to even do that... Looks like stack_not_used(), though it's not
> clear to me how efficient that is?

No, I don't think you want to do that... at least on X86-64, I think
something vaguely like this should do the job:

unsigned long cur_stack = (unsigned long)__builtin_frame_address(0);
unsigned long begin = (unsigned long)task_stack_page(task);
unsigned long end   = (unsigned long)task_stack_page(task) + THREAD_SIZE;
if (cur_stack < begin || cur_stack >= end ||
    cur_stack < begin + THREAD_SIZE*3/4)
  [bailout]

But since stacks grow in different directions depending on the
architecture and so on, it might have to be an arch-specific thing...
I'm not sure.
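
The bailout itself could presumably just reuse the normal async punt at
the spot that currently does __io_queue_sqe(nxt, NULL), something like
this (sketch; stack_too_deep() and io_queue_async_work() as the punt
path are stand-ins for whatever is actually available there):

	/* stack_too_deep() is a stand-in for whatever check we settle on */
	if (stack_too_deep(current)) {
		/* too deep to run this inline, hand it off to the workers */
		io_queue_async_work(nxt);
		return;
	}
	__io_queue_sqe(nxt, NULL);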

> > Also, can we recursively hit this point? Even if __io_queue_sqe()
> > doesn't *want* to block, the code it calls into might still block on a
> > mutex or something like that, at which point the mutex code would call
> > into schedule(), which would then again hit sched_out_update() and get
> > here, right? As far as I can tell, this could cause unbounded
> > recursion.
>
> The sched_work items are pruned before being run, so that can't happen.

And is it impossible for new ones to be added in the meantime if a
second poll operation completes in the background just when we're
entering __io_queue_sqe()?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 22:18       ` Jens Axboe
@ 2020-02-20 22:25         ` Jann Horn
  0 siblings, 0 replies; 53+ messages in thread
From: Jann Horn @ 2020-02-20 22:25 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On Thu, Feb 20, 2020 at 11:18 PM Jens Axboe <axboe@kernel.dk> wrote:
> On 2/20/20 3:14 PM, Jens Axboe wrote:
> >>> @@ -3733,6 +3648,9 @@ static int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe
> >>>
> >>>         events = READ_ONCE(sqe->poll_events);
> >>>         poll->events = demangle_poll(events) | EPOLLERR | EPOLLHUP;
> >>> +
> >>> +       /* task will wait for requests on exit, don't need a ref */
> >>> +       req->task = current;
> >>
> >> Can we get here in SQPOLL mode?
> >
> > We can, this and the async poll arm should just revert to the old
> > behavior for SQPOLL. I'll make that change.
>
> Actually, I think that should work fine, are you seeing a reason it
> should not?

Hm, no, I guess that might work.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 22:23       ` Jens Axboe
@ 2020-02-20 22:38         ` Jann Horn
  2020-02-20 22:56           ` Jens Axboe
  0 siblings, 1 reply; 53+ messages in thread
From: Jann Horn @ 2020-02-20 22:38 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On Thu, Feb 20, 2020 at 11:23 PM Jens Axboe <axboe@kernel.dk> wrote:
> On 2/20/20 3:14 PM, Jens Axboe wrote:
> >>> @@ -3646,46 +3596,11 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
> >>>
> >>>         list_del_init(&poll->wait.entry);
> >>>
> >> [...]
> >>> +       tsk = req->task;
> >>> +       req->result = mask;
> >>> +       init_task_work(&req->sched_work, io_poll_task_func);
> >>> +       sched_work_add(tsk, &req->sched_work);
> >>
> >> Doesn't this have to check the return value?
> >
> > Trying to think if we can get here with TASK_EXITING, but probably safer
> > to just handle it in any case. I'll add that.
>
> Double checked this one, and I think it's good as-is, but needs a
> comment. If the sched_work_add() fails, then the work item is still in
> the poll hash on the ctx. That work is canceled on exit.

You mean via io_poll_remove_all()? That doesn't happen when a thread
dies, right?

As far as I can tell, the following might happen:

1. process with threads A and B set up uring
2. thread B submits chained requests poll->read
3. thread A waits for request completion
4. thread B dies
5. poll waitqueue is notified, data is ready

Even if there isn't a memory leak, you'd still want the read request
to execute at some point so that thread A can see the result, right?

And actually, in this scenario, wouldn't the req->task be a dangling
pointer, since you're not holding a reference? Or is there some magic
callback from do_exit() to io_uring that I missed? There is a comment
"/* task will wait for requests on exit, don't need a ref */", but I
don't see how that works...

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 22:38         ` Jann Horn
@ 2020-02-20 22:56           ` Jens Axboe
  2020-02-20 22:58             ` Jann Horn
  0 siblings, 1 reply; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 22:56 UTC (permalink / raw)
  To: Jann Horn; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On 2/20/20 3:38 PM, Jann Horn wrote:
> On Thu, Feb 20, 2020 at 11:23 PM Jens Axboe <axboe@kernel.dk> wrote:
>> On 2/20/20 3:14 PM, Jens Axboe wrote:
>>>>> @@ -3646,46 +3596,11 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
>>>>>
>>>>>         list_del_init(&poll->wait.entry);
>>>>>
>>>> [...]
>>>>> +       tsk = req->task;
>>>>> +       req->result = mask;
>>>>> +       init_task_work(&req->sched_work, io_poll_task_func);
>>>>> +       sched_work_add(tsk, &req->sched_work);
>>>>
>>>> Doesn't this have to check the return value?
>>>
>>> Trying to think if we can get here with TASK_EXITING, but probably safer
>>> to just handle it in any case. I'll add that.
>>
>> Double checked this one, and I think it's good as-is, but needs a
>> comment. If the sched_work_add() fails, then the work item is still in
>> the poll hash on the ctx. That work is canceled on exit.
> 
> You mean via io_poll_remove_all()? That doesn't happen when a thread
> dies, right?

Off of io_uring_flush, we do:

if (fatal_signal_pending(current) || (current->flags & PF_EXITING)) {
	io_uring_cancel_task_poll(current);
	io_uring_cancel_task_async(current);
	io_wq_cancel_pid(ctx->io_wq, task_pid_vnr(current));
}

to cancel _anything_ that the task has pending.

> As far as I can tell, the following might happen:
> 
> 1. process with threads A and B set up uring
> 2. thread B submits chained requests poll->read
> 3. thread A waits for request completion
> 4. thread B dies
> 5. poll waitqueue is notified, data is ready

Unless I'm mistaken, when B dies, the requests from #2 will be canceled.

> Even if there isn't a memory leak, you'd still want the read request
> to execute at some point so that thread A can see the result, right?

It just needs to complete, if the task is going away, then a cancelation
is fine too.

> And actually, in this scenario, wouldn't the req->task be a dangling
> pointer, since you're not holding a reference? Or is there some magic
> callback from do_exit() to io_uring that I missed? There is a comment
> "/* task will wait for requests on exit, don't need a ref */", but I
> don't see how that works...

That'd only be the case if we didn't cancel requests when it dies. I'll
double check if that's 100% the case.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 22:02   ` Jann Horn
  2020-02-20 22:14     ` Jens Axboe
@ 2020-02-20 22:56     ` Jann Horn
  2020-02-21 10:47     ` Peter Zijlstra
  2 siblings, 0 replies; 53+ messages in thread
From: Jann Horn @ 2020-02-20 22:56 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On Thu, Feb 20, 2020 at 11:02 PM Jann Horn <jannh@google.com> wrote:
> On Thu, Feb 20, 2020 at 9:32 PM Jens Axboe <axboe@kernel.dk> wrote:
> >
> > For poll requests, it's not uncommon to link a read (or write) after
> > the poll to execute immediately after the file is marked as ready.
> > Since the poll completion is called inside the waitqueue wake up handler,
> > we have to punt that linked request to async context. This slows down
> > the processing, and actually means it's faster to not use a link for this
> > use case.
> >
> > We also run into problems if the completion_lock is contended, as we're
> > doing a different lock ordering than the issue side is. Hence we have
> > to do trylock for completion, and if that fails, go async. Poll removal
> > needs to go async as well, for the same reason.
> >
> > eventfd notification needs special case as well, to avoid stack blowing
> > recursion or deadlocks.
> >
> > These are all deficiencies that were inherited from the aio poll
> > implementation, but I think we can do better. When a poll completes,
> > simply queue it up in the task poll list. When the task completes the
> > list, we can run dependent links inline as well. This means we never
> > have to go async, and we can remove a bunch of code associated with
> > that, and optimizations to try and make that run faster. The diffstat
> > speaks for itself.
> [...]
> > -static void io_poll_trigger_evfd(struct io_wq_work **workptr)
> > +static void io_poll_task_func(struct callback_head *cb)
> >  {
> > -       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
> > +       struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
> > +       struct io_kiocb *nxt = NULL;
> >
> [...]
> > +       io_poll_task_handler(req, &nxt);
> > +       if (nxt)
> > +               __io_queue_sqe(nxt, NULL);
>
> This can now get here from anywhere that calls schedule(), right?
> Which means that this might almost double the required kernel stack
> size, if one codepath exists that calls schedule() while near the
> bottom of the stack and another codepath exists that goes from here
> through the VFS and again uses a big amount of stack space?

Oh, I think this also implies that any mutex reachable via any of the
nonblocking uring ops nests inside any mutex under which we happen to
schedule(), right? I wonder whether that's going to cause deadlocks...

For example, FUSE's ->read_iter() can call fuse_direct_io(), which can
call inode_lock() and then call fuse_sync_writes() under the inode
lock, which can wait_event(), which can schedule(); and if uring then
from schedule() calls ->read_iter() again, you could reach
inode_lock() on the same inode again, causing a deadlock, I think?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 22:56           ` Jens Axboe
@ 2020-02-20 22:58             ` Jann Horn
  2020-02-20 23:02               ` Jens Axboe
  0 siblings, 1 reply; 53+ messages in thread
From: Jann Horn @ 2020-02-20 22:58 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On Thu, Feb 20, 2020 at 11:56 PM Jens Axboe <axboe@kernel.dk> wrote:
> On 2/20/20 3:38 PM, Jann Horn wrote:
> > On Thu, Feb 20, 2020 at 11:23 PM Jens Axboe <axboe@kernel.dk> wrote:
> >> On 2/20/20 3:14 PM, Jens Axboe wrote:
> >>>>> @@ -3646,46 +3596,11 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
> >>>>>
> >>>>>         list_del_init(&poll->wait.entry);
> >>>>>
> >>>> [...]
> >>>>> +       tsk = req->task;
> >>>>> +       req->result = mask;
> >>>>> +       init_task_work(&req->sched_work, io_poll_task_func);
> >>>>> +       sched_work_add(tsk, &req->sched_work);
> >>>>
> >>>> Doesn't this have to check the return value?
> >>>
> >>> Trying to think if we can get here with TASK_EXITING, but probably safer
> >>> to just handle it in any case. I'll add that.
> >>
> >> Double checked this one, and I think it's good as-is, but needs a
> >> comment. If the sched_work_add() fails, then the work item is still in
> >> the poll hash on the ctx. That work is canceled on exit.
> >
> > You mean via io_poll_remove_all()? That doesn't happen when a thread
> > dies, right?
>
> Off of io_uring_flush, we do:
>
> if (fatal_signal_pending(current) || (current->flags & PF_EXITING)) {
>         io_uring_cancel_task_poll(current);
>         io_uring_cancel_task_async(current);
>         io_wq_cancel_pid(ctx->io_wq, task_pid_vnr(current));
> }
>
> to cancel _anything_ that the task has pending.

->flush() is only for when the uring instance is dropped from a file
descriptor table; threads typically share their file descriptor
tables, and therefore won't ->flush() until the last one dies.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 22:23       ` Jann Horn
@ 2020-02-20 23:00         ` Jens Axboe
  2020-02-20 23:12           ` Jann Horn
  0 siblings, 1 reply; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 23:00 UTC (permalink / raw)
  To: Jann Horn; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On 2/20/20 3:23 PM, Jann Horn wrote:
> On Thu, Feb 20, 2020 at 11:14 PM Jens Axboe <axboe@kernel.dk> wrote:
>> On 2/20/20 3:02 PM, Jann Horn wrote:
>>> On Thu, Feb 20, 2020 at 9:32 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>> For poll requests, it's not uncommon to link a read (or write) after
>>>> the poll to execute immediately after the file is marked as ready.
>>>> Since the poll completion is called inside the waitqueue wake up handler,
>>>> we have to punt that linked request to async context. This slows down
>>>> the processing, and actually means it's faster to not use a link for this
>>>> use case.
>>>>
>>>> We also run into problems if the completion_lock is contended, as we're
>>>> doing a different lock ordering than the issue side is. Hence we have
>>>> to do trylock for completion, and if that fails, go async. Poll removal
>>>> needs to go async as well, for the same reason.
>>>>
>>>> eventfd notification needs special case as well, to avoid stack blowing
>>>> recursion or deadlocks.
>>>>
>>>> These are all deficiencies that were inherited from the aio poll
>>>> implementation, but I think we can do better. When a poll completes,
>>>> simply queue it up in the task poll list. When the task completes the
>>>> list, we can run dependent links inline as well. This means we never
>>>> have to go async, and we can remove a bunch of code associated with
>>>> that, and optimizations to try and make that run faster. The diffstat
>>>> speaks for itself.
>>> [...]
>>>> -static void io_poll_trigger_evfd(struct io_wq_work **workptr)
>>>> +static void io_poll_task_func(struct callback_head *cb)
>>>>  {
>>>> -       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
>>>> +       struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
>>>> +       struct io_kiocb *nxt = NULL;
>>>>
>>> [...]
>>>> +       io_poll_task_handler(req, &nxt);
>>>> +       if (nxt)
>>>> +               __io_queue_sqe(nxt, NULL);
>>>
>>> This can now get here from anywhere that calls schedule(), right?
>>> Which means that this might almost double the required kernel stack
>>> size, if one codepath exists that calls schedule() while near the
>>> bottom of the stack and another codepath exists that goes from here
>>> through the VFS and again uses a big amount of stack space? This is a
>>> somewhat ugly suggestion, but I wonder whether it'd make sense to
>>> check whether we've consumed over 25% of stack space, or something
>>> like that, and if so, directly punt the request.
>>
>> Right, it'll increase the stack usage. I'm not against adding some
>> safeguard that punts if we're too deep in, though I'd have to look at
>> how to even do that... Looks like stack_not_used(), though it's not
>> clear to me how efficient that is?
> 
> No, I don't think you want to do that... at least on X86-64, I think
> something vaguely like this should do the job:
> 
> unsigned long cur_stack = (unsigned long)__builtin_frame_address(0);
> unsigned long begin = (unsigned long)task_stack_page(task);
> unsigned long end   = (unsigned long)task_stack_page(task) + THREAD_SIZE;
> if (cur_stack < begin || cur_stack >= end ||
>     cur_stack < begin + THREAD_SIZE*3/4)
>   [bailout]
> 
> But since stacks grow in different directions depending on the
> architecture and so on, it might have to be an arch-specific thing...
> I'm not sure.

Yeah, that's fun... Probably a good first attempt is to wire up
an x86-64 variant that works, and base a fallback on stack_not_used()
for archs that don't provide their own. Hopefully that'll get rectified
as time progresses.
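
Something along these lines is what I have in mind (untested sketch, and
io_stack_too_deep() is a made-up name):

	/* made-up helper, x86-64 only: true if more than 25% of the stack is used */
	static bool io_stack_too_deep(void)
	{
		unsigned long sp = (unsigned long)__builtin_frame_address(0);
		unsigned long start = (unsigned long)task_stack_page(current);

		/* the stack grows down from start + THREAD_SIZE towards start */
		return sp < start + 3 * THREAD_SIZE / 4;
	}

with other archs either providing their own variant, or something built
on top of what stack_not_used() does.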

>>> Also, can we recursively hit this point? Even if __io_queue_sqe()
>>> doesn't *want* to block, the code it calls into might still block on a
>>> mutex or something like that, at which point the mutex code would call
>>> into schedule(), which would then again hit sched_out_update() and get
>>> here, right? As far as I can tell, this could cause unbounded
>>> recursion.
>>
>> The sched_work items are pruned before being run, so that can't happen.
> 
> And is it impossible for new ones to be added in the meantime if a
> second poll operation completes in the background just when we're
> entering __io_queue_sqe()?

True, that can happen.

I wonder, if we just prevent the recursion, whether we can ignore most
of it. E.g. never process the sched_work list if we're not at the top
level, so to speak.
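
Roughly (untested; both ->in_sched_work and __sched_work_run() are made
up here, standing in for the real list processing):

	void sched_work_run(void)
	{
		struct task_struct *tsk = current;

		/* made-up flag: don't re-enter from a nested schedule() */
		if (tsk->in_sched_work)
			return;
		tsk->in_sched_work = true;
		__sched_work_run(tsk);	/* stand-in for the actual list run */
		tsk->in_sched_work = false;
	}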

This should also prevent the deadlock that you mentioned with FUSE
in the next email that just rolled in.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 22:58             ` Jann Horn
@ 2020-02-20 23:02               ` Jens Axboe
  0 siblings, 0 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 23:02 UTC (permalink / raw)
  To: Jann Horn; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On 2/20/20 3:58 PM, Jann Horn wrote:
> On Thu, Feb 20, 2020 at 11:56 PM Jens Axboe <axboe@kernel.dk> wrote:
>> On 2/20/20 3:38 PM, Jann Horn wrote:
>>> On Thu, Feb 20, 2020 at 11:23 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>> On 2/20/20 3:14 PM, Jens Axboe wrote:
>>>>>>> @@ -3646,46 +3596,11 @@ static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
>>>>>>>
>>>>>>>         list_del_init(&poll->wait.entry);
>>>>>>>
>>>>>> [...]
>>>>>>> +       tsk = req->task;
>>>>>>> +       req->result = mask;
>>>>>>> +       init_task_work(&req->sched_work, io_poll_task_func);
>>>>>>> +       sched_work_add(tsk, &req->sched_work);
>>>>>>
>>>>>> Doesn't this have to check the return value?
>>>>>
>>>>> Trying to think if we can get here with TASK_EXITING, but probably safer
>>>>> to just handle it in any case. I'll add that.
>>>>
>>>> Double checked this one, and I think it's good as-is, but needs a
>>>> comment. If the sched_work_add() fails, then the work item is still in
>>>> the poll hash on the ctx. That work is canceled on exit.
>>>
>>> You mean via io_poll_remove_all()? That doesn't happen when a thread
>>> dies, right?
>>
>> Off of io_uring_flush, we do:
>>
>> if (fatal_signal_pending(current) || (current->flags & PF_EXITING)) {
>>         io_uring_cancel_task_poll(current);
>>         io_uring_cancel_task_async(current);
>>         io_wq_cancel_pid(ctx->io_wq, task_pid_vnr(current));
>> }
>>
>> to cancel _anything_ that the task has pending.
> 
> ->flush() is only for when the uring instance is dropped from a file
> descriptor table; threads typically share their file descriptor
> tables, and therefore won't ->flush() until the last one dies.

True, then I guess I'll need some other notifier for that particular
case. Might be able to use sched_work_run() for that, since we know
that'll definitely get called when the task exits.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 23:00         ` Jens Axboe
@ 2020-02-20 23:12           ` Jann Horn
  2020-02-20 23:22             ` Jens Axboe
  0 siblings, 1 reply; 53+ messages in thread
From: Jann Horn @ 2020-02-20 23:12 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On Fri, Feb 21, 2020 at 12:00 AM Jens Axboe <axboe@kernel.dk> wrote:
> On 2/20/20 3:23 PM, Jann Horn wrote:
> > On Thu, Feb 20, 2020 at 11:14 PM Jens Axboe <axboe@kernel.dk> wrote:
> >> On 2/20/20 3:02 PM, Jann Horn wrote:
> >>> On Thu, Feb 20, 2020 at 9:32 PM Jens Axboe <axboe@kernel.dk> wrote:
> >>>> For poll requests, it's not uncommon to link a read (or write) after
> >>>> the poll to execute immediately after the file is marked as ready.
> >>>> Since the poll completion is called inside the waitqueue wake up handler,
> >>>> we have to punt that linked request to async context. This slows down
> >>>> the processing, and actually means it's faster to not use a link for this
> >>>> use case.
[...]
> >>>> -static void io_poll_trigger_evfd(struct io_wq_work **workptr)
> >>>> +static void io_poll_task_func(struct callback_head *cb)
> >>>>  {
> >>>> -       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
> >>>> +       struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
> >>>> +       struct io_kiocb *nxt = NULL;
> >>>>
> >>> [...]
> >>>> +       io_poll_task_handler(req, &nxt);
> >>>> +       if (nxt)
> >>>> +               __io_queue_sqe(nxt, NULL);
> >>>
> >>> This can now get here from anywhere that calls schedule(), right?
> >>> Which means that this might almost double the required kernel stack
> >>> size, if one codepath exists that calls schedule() while near the
> >>> bottom of the stack and another codepath exists that goes from here
> >>> through the VFS and again uses a big amount of stack space? This is a
> >>> somewhat ugly suggestion, but I wonder whether it'd make sense to
> >>> check whether we've consumed over 25% of stack space, or something
> >>> like that, and if so, directly punt the request.
[...]
> >>> Also, can we recursively hit this point? Even if __io_queue_sqe()
> >>> doesn't *want* to block, the code it calls into might still block on a
> >>> mutex or something like that, at which point the mutex code would call
> >>> into schedule(), which would then again hit sched_out_update() and get
> >>> here, right? As far as I can tell, this could cause unbounded
> >>> recursion.
> >>
> >> The sched_work items are pruned before being run, so that can't happen.
> >
> > And is it impossible for new ones to be added in the meantime if a
> > second poll operation completes in the background just when we're
> > entering __io_queue_sqe()?
>
> True, that can happen.
>
> I wonder, if we just prevent the recursion, whether we can ignore most
> of it. E.g. never process the sched_work list if we're not at the top
> level, so to speak.
>
> This should also prevent the deadlock that you mentioned with FUSE
> in the next email that just rolled in.

But there the first ->read_iter could be from outside io_uring. So you
don't just have to worry about nesting inside an already-running uring
work; you also have to worry about nesting inside more or less
anything else that might be holding mutexes. So I think you'd pretty
much have to whitelist known-safe schedule() callers, or something
like that.

Taking a step back: Do you know why this whole approach brings the
kind of performance benefit you mentioned in the cover letter? 4x is a
lot... Is it that expensive to take a trip through the scheduler?
I wonder whether the performance numbers for the echo test would
change if you commented out io_worker_spin_for_work()...

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 23:12           ` Jann Horn
@ 2020-02-20 23:22             ` Jens Axboe
  2020-02-21  1:29               ` Jann Horn
  0 siblings, 1 reply; 53+ messages in thread
From: Jens Axboe @ 2020-02-20 23:22 UTC (permalink / raw)
  To: Jann Horn; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On 2/20/20 4:12 PM, Jann Horn wrote:
> On Fri, Feb 21, 2020 at 12:00 AM Jens Axboe <axboe@kernel.dk> wrote:
>> On 2/20/20 3:23 PM, Jann Horn wrote:
>>> On Thu, Feb 20, 2020 at 11:14 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>> On 2/20/20 3:02 PM, Jann Horn wrote:
>>>>> On Thu, Feb 20, 2020 at 9:32 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>> For poll requests, it's not uncommon to link a read (or write) after
>>>>>> the poll to execute immediately after the file is marked as ready.
>>>>>> Since the poll completion is called inside the waitqueue wake up handler,
>>>>>> we have to punt that linked request to async context. This slows down
>>>>>> the processing, and actually means it's faster to not use a link for this
>>>>>> use case.
> [...]
>>>>>> -static void io_poll_trigger_evfd(struct io_wq_work **workptr)
>>>>>> +static void io_poll_task_func(struct callback_head *cb)
>>>>>>  {
>>>>>> -       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
>>>>>> +       struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
>>>>>> +       struct io_kiocb *nxt = NULL;
>>>>>>
>>>>> [...]
>>>>>> +       io_poll_task_handler(req, &nxt);
>>>>>> +       if (nxt)
>>>>>> +               __io_queue_sqe(nxt, NULL);
>>>>>
>>>>> This can now get here from anywhere that calls schedule(), right?
>>>>> Which means that this might almost double the required kernel stack
>>>>> size, if one codepath exists that calls schedule() while near the
>>>>> bottom of the stack and another codepath exists that goes from here
>>>>> through the VFS and again uses a big amount of stack space? This is a
>>>>> somewhat ugly suggestion, but I wonder whether it'd make sense to
>>>>> check whether we've consumed over 25% of stack space, or something
>>>>> like that, and if so, directly punt the request.
> [...]
>>>>> Also, can we recursively hit this point? Even if __io_queue_sqe()
>>>>> doesn't *want* to block, the code it calls into might still block on a
>>>>> mutex or something like that, at which point the mutex code would call
>>>>> into schedule(), which would then again hit sched_out_update() and get
>>>>> here, right? As far as I can tell, this could cause unbounded
>>>>> recursion.
>>>>
>>>> The sched_work items are pruned before being run, so that can't happen.
>>>
>>> And is it impossible for new ones to be added in the meantime if a
>>> second poll operation completes in the background just when we're
>>> entering __io_queue_sqe()?
>>
>> True, that can happen.
>>
>> I wonder, if we just prevent the recursion, whether we can ignore most
>> of it. E.g. never process the sched_work list if we're not at the top
>> level, so to speak.
>>
>> This should also prevent the deadlock that you mentioned with FUSE
>> in the next email that just rolled in.
> 
> But there the first ->read_iter could be from outside io_uring. So you
> don't just have to worry about nesting inside an already-running uring
> work; you also have to worry about nesting inside more or less
> anything else that might be holding mutexes. So I think you'd pretty
> much have to whitelist known-safe schedule() callers, or something
> like that.

I'll see if I can come up with something for that. Ideally any issue
with IOCB_NOWAIT set should be honored, and trylock etc should be used.
But I don't think we can fully rely on that, we need something a bit
more solid...

> Taking a step back: Do you know why this whole approach brings the
> kind of performance benefit you mentioned in the cover letter? 4x is a
> lot... Is it that expensive to take a trip through the scheduler?
> I wonder whether the performance numbers for the echo test would
> change if you commented out io_worker_spin_for_work()...

If anything, I expect the spin removal to make it worse. There's really
no magic there on why it's faster, if you offload work to a thread that
is essentially sync, then you're going to take a huge hit in
performance. It's the difference between:

1) Queue work with thread, wake up thread
2) Thread wakes, starts work, goes to sleep.
3) Data available, thread is woken, does work
4) Thread signals completion of work

versus just completing the work when it's ready and not having any
switches to a worker thread at all. As the cover letter mentions, the
single client case is a huge win, and that is of course the biggest win
because everything is idle. If the thread doing the offload can be kept
running, the gains become smaller as we're not paying those wake/sleep
penalties anymore.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 23:22             ` Jens Axboe
@ 2020-02-21  1:29               ` Jann Horn
  2020-02-21 17:32                 ` Jens Axboe
  0 siblings, 1 reply; 53+ messages in thread
From: Jann Horn @ 2020-02-21  1:29 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On Fri, Feb 21, 2020 at 12:22 AM Jens Axboe <axboe@kernel.dk> wrote:
> On 2/20/20 4:12 PM, Jann Horn wrote:
> > On Fri, Feb 21, 2020 at 12:00 AM Jens Axboe <axboe@kernel.dk> wrote:
> >> On 2/20/20 3:23 PM, Jann Horn wrote:
> >>> On Thu, Feb 20, 2020 at 11:14 PM Jens Axboe <axboe@kernel.dk> wrote:
> >>>> On 2/20/20 3:02 PM, Jann Horn wrote:
> >>>>> On Thu, Feb 20, 2020 at 9:32 PM Jens Axboe <axboe@kernel.dk> wrote:
> >>>>>> For poll requests, it's not uncommon to link a read (or write) after
> >>>>>> the poll to execute immediately after the file is marked as ready.
> >>>>>> Since the poll completion is called inside the waitqueue wake up handler,
> >>>>>> we have to punt that linked request to async context. This slows down
> >>>>>> the processing, and actually means it's faster to not use a link for this
> >>>>>> use case.
> > [...]
> >>>>>> -static void io_poll_trigger_evfd(struct io_wq_work **workptr)
> >>>>>> +static void io_poll_task_func(struct callback_head *cb)
> >>>>>>  {
> >>>>>> -       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
> >>>>>> +       struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
> >>>>>> +       struct io_kiocb *nxt = NULL;
> >>>>>>
> >>>>> [...]
> >>>>>> +       io_poll_task_handler(req, &nxt);
> >>>>>> +       if (nxt)
> >>>>>> +               __io_queue_sqe(nxt, NULL);
> >>>>>
> >>>>> This can now get here from anywhere that calls schedule(), right?
> >>>>> Which means that this might almost double the required kernel stack
> >>>>> size, if one codepath exists that calls schedule() while near the
> >>>>> bottom of the stack and another codepath exists that goes from here
> >>>>> through the VFS and again uses a big amount of stack space? This is a
> >>>>> somewhat ugly suggestion, but I wonder whether it'd make sense to
> >>>>> check whether we've consumed over 25% of stack space, or something
> >>>>> like that, and if so, directly punt the request.
> > [...]
> >>>>> Also, can we recursively hit this point? Even if __io_queue_sqe()
> >>>>> doesn't *want* to block, the code it calls into might still block on a
> >>>>> mutex or something like that, at which point the mutex code would call
> >>>>> into schedule(), which would then again hit sched_out_update() and get
> >>>>> here, right? As far as I can tell, this could cause unbounded
> >>>>> recursion.
> >>>>
> >>>> The sched_work items are pruned before being run, so that can't happen.
> >>>
> >>> And is it impossible for new ones to be added in the meantime if a
> >>> second poll operation completes in the background just when we're
> >>> entering __io_queue_sqe()?
> >>
> >> True, that can happen.
> >>
> >> I wonder, if we just prevent the recursion, whether we can ignore most
> >> of it. E.g. never process the sched_work list if we're not at the top
> >> level, so to speak.
> >>
> >> This should also prevent the deadlock that you mentioned with FUSE
> >> in the next email that just rolled in.
> >
> > But there the first ->read_iter could be from outside io_uring. So you
> > don't just have to worry about nesting inside an already-running uring
> > work; you also have to worry about nesting inside more or less
> > anything else that might be holding mutexes. So I think you'd pretty
> > much have to whitelist known-safe schedule() callers, or something
> > like that.
>
> I'll see if I can come up with something for that. Ideally any issue
> with IOCB_NOWAIT set should be honored, and trylock etc should be used.

Are you sure? For example, an IO operation typically copies data to
userspace, which can take pagefaults. And those should be handled
synchronously even with IOCB_NOWAIT set, right? And the page fault
code can block on mutexes (like the mmap_sem) or even wait for a
blocking filesystem operation (via file mappings) or for userspace
(via userfaultfd or FUSE mappings).

> But I don't think we can fully rely on that, we need something a bit
> more solid...
>
> > Taking a step back: Do you know why this whole approach brings the
> > kind of performance benefit you mentioned in the cover letter? 4x is a
> > lot... Is it that expensive to take a trip through the scheduler?
> > I wonder whether the performance numbers for the echo test would
> > change if you commented out io_worker_spin_for_work()...
>
> If anything, I expect the spin removal to make it worse. There's really
> no magic there on why it's faster, if you offload work to a thread that
> is essentially sync, then you're going to take a huge hit in
> performance. It's the difference between:
>
> 1) Queue work with thread, wake up thread
> 2) Thread wakes, starts work, goes to sleep.

If we go to sleep here, then the other side hasn't yet sent us
anything, so up to this point, it shouldn't have any impact on the
measured throughput, right?

> 3) Data available, thread is woken, does work

This is the same in the other case: Data is available, the
application's thread is woken and does the work.

> 4) Thread signals completion of work

And this is also basically the same, except that in the worker-thread
case, we have to go through the scheduler to reach userspace, while
with this patch series, we can signal "work is completed" and return
to userspace without an extra trip through the scheduler.

I could imagine this optimization having some performance benefit, but
I'm still sceptical about it buying a 4x benefit without some more
complicated reason behind it.

> versus just completing the work when it's ready and not having any
> switches to a worker thread at all. As the cover letter mentions, the
> single client case is a huge win, and that is of course the biggest win
> because everything is idle. If the thread doing the offload can be kept
> running, the gains become smaller as we're not paying those wake/sleep
> penalties anymore.

I'd really like to see what the scheduler behavior looks like here,
for this single-client echo test. I can imagine three cases (which I
guess are probably going to be mixed because the scheduler moves tasks
around; but I don't actually know much about how the scheduler works,
so my guesses are probably not very helpful):

Case 1: Both the worker and the userspace task are on the same CPU. In
this case, the worker will waste something on the order of 10000
cycles for every message while userspace is runnable, unless the
scheduler decides that the worker has spent so much time on the CPU
that it should relinquish it to the userspace task. (You test for
need_resched() in the busyloop, but AFAIK that just asks the scheduler
whether it wants you to get off the CPU right now, not whether there
are any other runnable tasks on the CPU at the moment.)
Case 2: The worker and the userspace task are on different *physical*
cores and don't share L1D and L2. This will cause a performance
penalty due to constant cacheline bouncing.
Case 3: The worker and the userspace task are on hyperthreads, so they
share L1D and L2 and can run concurrently. Compared to the other two
cases, this would probably work best, but I'm not sure whether the
scheduler is smart enough to specifically target this behavior? (And
if you're running inside a VM, or on a system without hyperthreading,
this isn't even an option.)

So I wonder what things look like if you force the worker and the
userspace task to run on the same CPU without any idle polling in the
worker; or how it looks when you pin the worker and the userspace task
on hyperthreads. And if that does make a difference, it might be worth
considering whether the interaction between io_uring and the scheduler
could be optimized.
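
Something like this on the userspace side would do for that experiment -
just a sketch, with the io-wq worker's pid hard-coded as a placeholder
(it would have to be looked up via /proc, and the kernel may or may not
allow changing the worker's affinity at all):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

static int pin_to_cpu(pid_t pid, int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(pid, sizeof(set), &set);
}

int main(void)
{
	pid_t io_wq_worker = 1234;	/* placeholder: pid of the io-wq worker */

	/* put the echo server task and the worker on the same CPU */
	if (pin_to_cpu(0, 0) || pin_to_cpu(io_wq_worker, 0))
		perror("sched_setaffinity");

	/* ... run the echo benchmark from here ... */
	return 0;
}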

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 22:02   ` Jann Horn
  2020-02-20 22:14     ` Jens Axboe
  2020-02-20 22:56     ` Jann Horn
@ 2020-02-21 10:47     ` Peter Zijlstra
  2020-02-21 14:49       ` Jens Axboe
  2 siblings, 1 reply; 53+ messages in thread
From: Peter Zijlstra @ 2020-02-21 10:47 UTC (permalink / raw)
  To: Jann Horn; +Cc: Jens Axboe, io-uring, Glauber Costa, Pavel Begunkov

On Thu, Feb 20, 2020 at 11:02:16PM +0100, Jann Horn wrote:
> On Thu, Feb 20, 2020 at 9:32 PM Jens Axboe <axboe@kernel.dk> wrote:
> >
> > For poll requests, it's not uncommon to link a read (or write) after
> > the poll to execute immediately after the file is marked as ready.
> > Since the poll completion is called inside the waitqueue wake up handler,
> > we have to punt that linked request to async context. This slows down
> > the processing, and actually means it's faster to not use a link for this
> > use case.
> >
> > We also run into problems if the completion_lock is contended, as we're
> > doing a different lock ordering than the issue side is. Hence we have
> > to do trylock for completion, and if that fails, go async. Poll removal
> > needs to go async as well, for the same reason.
> >
> > eventfd notification needs special case as well, to avoid stack blowing
> > recursion or deadlocks.
> >
> > These are all deficiencies that were inherited from the aio poll
> > implementation, but I think we can do better. When a poll completes,
> > simply queue it up in the task poll list. When the task completes the
> > list, we can run dependent links inline as well. This means we never
> > have to go async, and we can remove a bunch of code associated with
> > that, and optimizations to try and make that run faster. The diffstat
> > speaks for itself.
> [...]
> > -static void io_poll_trigger_evfd(struct io_wq_work **workptr)
> > +static void io_poll_task_func(struct callback_head *cb)
> >  {
> > -       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
> > +       struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
> > +       struct io_kiocb *nxt = NULL;
> >
> [...]
> > +       io_poll_task_handler(req, &nxt);
> > +       if (nxt)
> > +               __io_queue_sqe(nxt, NULL);
> 
> This can now get here from anywhere that calls schedule(), right?
> Which means that this might almost double the required kernel stack
> size, if one codepath exists that calls schedule() while near the
> bottom of the stack and another codepath exists that goes from here
> through the VFS and again uses a big amount of stack space? This is a
> somewhat ugly suggestion, but I wonder whether it'd make sense to
> check whether we've consumed over 25% of stack space, or something
> like that, and if so, directly punt the request.

I'm still completely confused as to how io_uring works, and consequently
the ramifications of all this.

But I thought I understood that these sched_work things were only
queued on tasks that were stuck waiting on POLL (or its io_uring
equivalent). Earlier patches were explicitly running things from
io_cqring_wait(), which might have given me this impression.

The above seems to suggest this is not the case. Which then does indeed
lead to all the worries expressed by Jann. All sorts of nasty nesting is
possible with this.

Can someone please spell this out for me?

Afaict the req->tsk=current thing is set for whoever happens to run
io_poll_add_prep(), which is either a sys_io_uring_enter() or an io-wq
thread.

But I'm then unsure what happens to that thread afterwards.

Jens, what exactly is the benefit of running this on every random
schedule() vs in io_cqring_wait()? Or even, since io_cqring_wait() is
the very last thing the syscall does, task_work?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-20 20:31 ` [PATCH 7/9] io_uring: add per-task callback handler Jens Axboe
  2020-02-20 22:02   ` Jann Horn
@ 2020-02-21 13:51   ` Pavel Begunkov
  2020-02-21 14:50     ` Jens Axboe
  1 sibling, 1 reply; 53+ messages in thread
From: Pavel Begunkov @ 2020-02-21 13:51 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: glauber, peterz, Jann Horn

On 20/02/2020 23:31, Jens Axboe wrote:
> For poll requests, it's not uncommon to link a read (or write) after
> the poll to execute immediately after the file is marked as ready.
> Since the poll completion is called inside the waitqueue wake up handler,
> we have to punt that linked request to async context. This slows down
> the processing, and actually means it's faster to not use a link for this
> use case.
> 
> We also run into problems if the completion_lock is contended, as we're
> doing a different lock ordering than the issue side is. Hence we have
> to do trylock for completion, and if that fails, go async. Poll removal
> needs to go async as well, for the same reason.
> 
> eventfd notification needs special case as well, to avoid stack blowing
> recursion or deadlocks.
> 
> These are all deficiencies that were inherited from the aio poll
> implementation, but I think we can do better. When a poll completes,
> simply queue it up in the task poll list. When the task completes the
> list, we can run dependent links inline as well. This means we never
> have to go async, and we can remove a bunch of code associated with
> that, and optimizations to try and make that run faster. The diffstat
> speaks for itself.

So, it piggybacks request execution onto a random task that happens to complete
a poll. Did I get it right?

I can't find where it sets the right mm, creds, etc., or why it has them already.

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-21 10:47     ` Peter Zijlstra
@ 2020-02-21 14:49       ` Jens Axboe
  2020-02-21 15:02         ` Jann Horn
  2020-02-21 16:23         ` Peter Zijlstra
  0 siblings, 2 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-21 14:49 UTC (permalink / raw)
  To: Peter Zijlstra, Jann Horn; +Cc: io-uring, Glauber Costa, Pavel Begunkov

On 2/21/20 3:47 AM, Peter Zijlstra wrote:
> On Thu, Feb 20, 2020 at 11:02:16PM +0100, Jann Horn wrote:
>> On Thu, Feb 20, 2020 at 9:32 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>> For poll requests, it's not uncommon to link a read (or write) after
>>> the poll to execute immediately after the file is marked as ready.
>>> Since the poll completion is called inside the waitqueue wake up handler,
>>> we have to punt that linked request to async context. This slows down
>>> the processing, and actually means it's faster to not use a link for this
>>> use case.
>>>
>>> We also run into problems if the completion_lock is contended, as we're
>>> doing a different lock ordering than the issue side is. Hence we have
>>> to do trylock for completion, and if that fails, go async. Poll removal
>>> needs to go async as well, for the same reason.
>>>
>>> eventfd notification needs special case as well, to avoid stack blowing
>>> recursion or deadlocks.
>>>
>>> These are all deficiencies that were inherited from the aio poll
>>> implementation, but I think we can do better. When a poll completes,
>>> simply queue it up in the task poll list. When the task completes the
>>> list, we can run dependent links inline as well. This means we never
>>> have to go async, and we can remove a bunch of code associated with
>>> that, and optimizations to try and make that run faster. The diffstat
>>> speaks for itself.
>> [...]
>>> -static void io_poll_trigger_evfd(struct io_wq_work **workptr)
>>> +static void io_poll_task_func(struct callback_head *cb)
>>>  {
>>> -       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
>>> +       struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
>>> +       struct io_kiocb *nxt = NULL;
>>>
>> [...]
>>> +       io_poll_task_handler(req, &nxt);
>>> +       if (nxt)
>>> +               __io_queue_sqe(nxt, NULL);
>>
>> This can now get here from anywhere that calls schedule(), right?
>> Which means that this might almost double the required kernel stack
>> size, if one codepath exists that calls schedule() while near the
>> bottom of the stack and another codepath exists that goes from here
>> through the VFS and again uses a big amount of stack space? This is a
>> somewhat ugly suggestion, but I wonder whether it'd make sense to
>> check whether we've consumed over 25% of stack space, or something
>> like that, and if so, directly punt the request.
> 
> I'm still completely confused as to how io_uring works, and concequently
> the ramifications of all this.
> 
> But I thought to understand that these sched_work things were only
> queued on tasks that were stuck waiting on POLL (or it's io_uring
> equivalent). Earlier patches were explicitly running things from
> io_cqring_wait(), which might have given me this impression.

No, that is correct.

> The above seems to suggest this is not the case. Which then does indeed
> lead to all the worries expressed by Jann. All sorts of nasty nesting is
> possible with this.
> 
> Can someone please spell this out for me?

Let me try with an example - the tl;dr is that a task wants to e.g. read
from a socket, so it issues an io_uring recv(). We always do these
non-blocking; there's no data there, so the task gets -EAGAIN on the
attempt. What would happen in the previous code is the task would then
offload the recv() to a worker thread, and the worker thread would
block waiting on the receive. This is sub-optimal, in that it both
requires a thread offload and has a thread alive waiting for that data
to come in.

This, instead, arms a poll handler for the task. When we get notified of
data availability, we queue a work item that will then perform the
recv(). This is what is offloaded to the sched_work list currently.
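
Roughly, the flow looks like the sketch below. The helper names that
aren't visible elsewhere in this thread are made up for illustration,
and the argument lists are simplified:

static void io_recv_sketch(struct io_kiocb *req)
{
	/* 1) try the receive non-blocking first */
	int ret = io_do_recv_nonblock(req);		/* made-up helper */

	if (ret != -EAGAIN) {
		io_complete_req(req, ret);		/* made-up helper */
		return;
	}

	/*
	 * 2) No data yet: instead of punting to an io-wq thread, arm a
	 *    poll handler (__io_arm_poll_handler() in this series,
	 *    arguments simplified) whose waitqueue wake function
	 *    essentially does:
	 *
	 *		sched_work_add(req->task, &req->sched_work);
	 *		wake_up_process(req->task);
	 *
	 * 3) The originating task then runs the queued callback and
	 *    re-issues the recv() now that data is there.
	 */
	__io_arm_poll_handler(req);
}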

> Afaict the req->tsk=current thing is set for whomever happens to run
> io_poll_add_prep(), which is either a sys_io_uring_enter() or an io-wq
> thread afaict.
> 
> But I'm then unsure what happens to that thread afterwards.
> 
> Jens, what exactly is the benefit of running this on every random
> schedule() vs in io_cqring_wait() ? Or even, since io_cqring_wait() is
> the very last thing the syscall does, task_work.

I took a step back and I think we can just use the task work, which
makes this a lot less complicated in terms of locking and schedule
state. Ran some quick testing with the below and it works for me.

I'm going to re-spin based on this and just dump the sched_work
addition.


diff --git a/fs/io_uring.c b/fs/io_uring.c
index 81aa3959f326..413ac86d7882 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3529,7 +3529,7 @@ static int __io_async_wake(struct io_kiocb *req, struct io_poll_iocb *poll,
 	 * the exit check will ultimately cancel these work items. Hence we
 	 * don't need to check here and handle it specifically.
 	 */
-	sched_work_add(tsk, &req->sched_work);
+	task_work_add(tsk, &req->sched_work, true);
 	wake_up_process(tsk);
 	return 1;
 }
@@ -5367,9 +5367,9 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
 	do {
 		if (io_cqring_events(ctx, false) >= min_events)
 			return 0;
-		if (!current->sched_work)
+		if (!current->task_works)
 			break;
-		sched_work_run();
+		task_work_run();
 	} while (1);
 
 	if (sig) {
@@ -5392,6 +5392,12 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
 						TASK_INTERRUPTIBLE);
 		if (io_should_wake(&iowq, false))
 			break;
+		if (current->task_works) {
+			task_work_run();
+			if (io_should_wake(&iowq, false))
+				break;
+			continue;
+		}
 		schedule();
 		if (signal_pending(current)) {
 			ret = -EINTR;
@@ -6611,7 +6617,7 @@ static void __io_uring_cancel_task(struct task_struct *tsk,
 {
 	struct callback_head *head;
 
-	while ((head = sched_work_cancel(tsk, func)) != NULL) {
+	while ((head = task_work_cancel(tsk, func)) != NULL) {
 		struct io_kiocb *req;
 
 		req = container_of(head, struct io_kiocb, sched_work);
@@ -6886,7 +6892,7 @@ static void __io_uring_show_fdinfo(struct io_ring_ctx *ctx, struct seq_file *m)
 		struct io_kiocb *req;
 
 		hlist_for_each_entry(req, list, hash_node)
-			seq_printf(m, "  req=%lx, op=%d, tsk list=%d\n", (long) req, req->opcode, req->task->sched_work != NULL);
+			seq_printf(m, "  req=%lx, op=%d, tsk list=%d\n", (long) req, req->opcode, req->task->task_works != NULL);
 	}
 	spin_unlock_irq(&ctx->completion_lock);
 	mutex_unlock(&ctx->uring_lock);
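
For reference, my reading of the task_work interface in this era's tree,
roughly (struct callback_head itself lives in linux/types.h; double-check
the details against the actual tree):

struct callback_head {
	struct callback_head *next;
	void (*func)(struct callback_head *head);
};

typedef void (*task_work_func_t)(struct callback_head *);

void init_task_work(struct callback_head *twork, task_work_func_t func);
int task_work_add(struct task_struct *task, struct callback_head *twork,
		  bool notify);
struct callback_head *task_work_cancel(struct task_struct *task,
				       task_work_func_t func);
void task_work_run(void);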

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-21 13:51   ` Pavel Begunkov
@ 2020-02-21 14:50     ` Jens Axboe
  2020-02-21 18:30       ` Pavel Begunkov
  0 siblings, 1 reply; 53+ messages in thread
From: Jens Axboe @ 2020-02-21 14:50 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: glauber, peterz, Jann Horn

On 2/21/20 6:51 AM, Pavel Begunkov wrote:
> On 20/02/2020 23:31, Jens Axboe wrote:
>> For poll requests, it's not uncommon to link a read (or write) after
>> the poll to execute immediately after the file is marked as ready.
>> Since the poll completion is called inside the waitqueue wake up handler,
>> we have to punt that linked request to async context. This slows down
>> the processing, and actually means it's faster to not use a link for this
>> use case.
>>
>> We also run into problems if the completion_lock is contended, as we're
>> doing a different lock ordering than the issue side is. Hence we have
>> to do trylock for completion, and if that fails, go async. Poll removal
>> needs to go async as well, for the same reason.
>>
>> eventfd notification needs special case as well, to avoid stack blowing
>> recursion or deadlocks.
>>
>> These are all deficiencies that were inherited from the aio poll
>> implementation, but I think we can do better. When a poll completes,
>> simply queue it up in the task poll list. When the task completes the
>> list, we can run dependent links inline as well. This means we never
>> have to go async, and we can remove a bunch of code associated with
>> that, and optimizations to try and make that run faster. The diffstat
>> speaks for itself.
> 
> So, it piggybacks request execution onto a random task, that happens
> to complete a poll. Did I get it right?
> 
> I can't find where it setting right mm, creds, etc., or why it have
> them already.

Not a random task, the very task that initially tried to do the receive
(or whatever the operation may be). Hence there's no need to set
mm/creds/whatever, we're still running in the context of the original
task once we retry the operation after the poll signals readiness.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-21 14:49       ` Jens Axboe
@ 2020-02-21 15:02         ` Jann Horn
  2020-02-21 16:12           ` Peter Zijlstra
  2020-02-21 16:23         ` Peter Zijlstra
  1 sibling, 1 reply; 53+ messages in thread
From: Jann Horn @ 2020-02-21 15:02 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Peter Zijlstra, io-uring, Glauber Costa, Pavel Begunkov

On Fri, Feb 21, 2020 at 3:49 PM Jens Axboe <axboe@kernel.dk> wrote:
> On 2/21/20 3:47 AM, Peter Zijlstra wrote:
> > On Thu, Feb 20, 2020 at 11:02:16PM +0100, Jann Horn wrote:
> >> On Thu, Feb 20, 2020 at 9:32 PM Jens Axboe <axboe@kernel.dk> wrote:
> >>>
> >>> For poll requests, it's not uncommon to link a read (or write) after
> >>> the poll to execute immediately after the file is marked as ready.
> >>> Since the poll completion is called inside the waitqueue wake up handler,
> >>> we have to punt that linked request to async context. This slows down
> >>> the processing, and actually means it's faster to not use a link for this
> >>> use case.
> >>>
> >>> We also run into problems if the completion_lock is contended, as we're
> >>> doing a different lock ordering than the issue side is. Hence we have
> >>> to do trylock for completion, and if that fails, go async. Poll removal
> >>> needs to go async as well, for the same reason.
> >>>
> >>> eventfd notification needs special case as well, to avoid stack blowing
> >>> recursion or deadlocks.
> >>>
> >>> These are all deficiencies that were inherited from the aio poll
> >>> implementation, but I think we can do better. When a poll completes,
> >>> simply queue it up in the task poll list. When the task completes the
> >>> list, we can run dependent links inline as well. This means we never
> >>> have to go async, and we can remove a bunch of code associated with
> >>> that, and optimizations to try and make that run faster. The diffstat
> >>> speaks for itself.
> >> [...]
> >>> -static void io_poll_trigger_evfd(struct io_wq_work **workptr)
> >>> +static void io_poll_task_func(struct callback_head *cb)
> >>>  {
> >>> -       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
> >>> +       struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
> >>> +       struct io_kiocb *nxt = NULL;
> >>>
> >> [...]
> >>> +       io_poll_task_handler(req, &nxt);
> >>> +       if (nxt)
> >>> +               __io_queue_sqe(nxt, NULL);
> >>
> >> This can now get here from anywhere that calls schedule(), right?
> >> Which means that this might almost double the required kernel stack
> >> size, if one codepath exists that calls schedule() while near the
> >> bottom of the stack and another codepath exists that goes from here
> >> through the VFS and again uses a big amount of stack space? This is a
> >> somewhat ugly suggestion, but I wonder whether it'd make sense to
> >> check whether we've consumed over 25% of stack space, or something
> >> like that, and if so, directly punt the request.
> >
> > I'm still completely confused as to how io_uring works, and concequently
> > the ramifications of all this.
> >
> > But I thought to understand that these sched_work things were only
> > queued on tasks that were stuck waiting on POLL (or it's io_uring
> > equivalent). Earlier patches were explicitly running things from
> > io_cqring_wait(), which might have given me this impression.
>
> No, that is correct.

Really? I was pretty sure that io_uring does not force the calling
thread to block in order for the io_uring operations to continue; and
isn't that the whole point?

I think that when Peter says "stuck waiting on POLL", he really means
"blocked in the context of sys_io_uring_enter() and can't go
anywhere"; while I think you interpret it as "has pending POLL work
queued up in the background and may decide to wait for it in
sys_io_uring_enter(), but might also be doing anything else".

> > The above seems to suggest this is not the case. Which then does indeed
> > lead to all the worries expressed by Jann. All sorts of nasty nesting is
> > possible with this.
> >
> > Can someone please spell this out for me?
>
> Let me try with an example - the tldr is that a task wants to eg read
> from a socket, it issues a io_uring recv() for example. We always do
> these non-blocking, there's no data there, the task gets -EAGAIN on the
> attempt. What would happen in the previous code is the task would then
> offload the recv() to a worker thread, and the worker thread would
> block waiting on the receive. This is sub-optimal, in that it both
> requires a thread offload and has a thread alive waiting for that data
> to come in.
>
> This, instead, arms a poll handler for the task.

And then returns to userspace, which can do whatever it wants, right?

> When we get notified of
> data availability, we queue a work item that will the perform the
> recv(). This is what is offloaded to the sched_work list currently.
>
> > Afaict the req->tsk=current thing is set for whomever happens to run
> > io_poll_add_prep(), which is either a sys_io_uring_enter() or an io-wq
> > thread afaict.
> >
> > But I'm then unsure what happens to that thread afterwards.
> >
> > Jens, what exactly is the benefit of running this on every random
> > schedule() vs in io_cqring_wait() ? Or even, since io_cqring_wait() is
> > the very last thing the syscall does, task_work.
>
> I took a step back and I think we can just use the task work, which
> makes this a lot less complicated in terms of locking and schedule
> state. Ran some quick testing with the below and it works for me.
>
> I'm going to re-spin based on this and just dump the sched_work
> addition.

Task work only runs on transitions between userspace and kernel, more
or less, right? I guess that means that you'd have to wake up the
ctx->cq_wait queue so that anyone who might e.g. be waiting for the
ring with select() or poll() or epoll or whatever gets woken up and
returns out of the polling syscall? Essentially an intentional
spurious notification to force the task to pick up work. And the
interaction with eventfds would then be a bit weird, I think, since
you may have to signal completion on an eventfd in order to get the
task to pick up the work; but then the work may block, and then stuff
is a bit weird.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-21 15:02         ` Jann Horn
@ 2020-02-21 16:12           ` Peter Zijlstra
  0 siblings, 0 replies; 53+ messages in thread
From: Peter Zijlstra @ 2020-02-21 16:12 UTC (permalink / raw)
  To: Jann Horn; +Cc: Jens Axboe, io-uring, Glauber Costa, Pavel Begunkov

On Fri, Feb 21, 2020 at 04:02:36PM +0100, Jann Horn wrote:
> On Fri, Feb 21, 2020 at 3:49 PM Jens Axboe <axboe@kernel.dk> wrote:
> > On 2/21/20 3:47 AM, Peter Zijlstra wrote:

> > > But I thought to understand that these sched_work things were only
> > > queued on tasks that were stuck waiting on POLL (or it's io_uring
> > > equivalent). Earlier patches were explicitly running things from
> > > io_cqring_wait(), which might have given me this impression.
> >
> > No, that is correct.
> 
> Really? I was pretty sure that io_uring does not force the calling
> thread to block on the io_uring operations to continue; and isn't that
> the whole point?
> 
> I think that when Peter says "stuck waiting on POLL", he really means
> "blocked in the context of sys_io_uring_enter() and can't go
> anywhere";

Exactly.

> while I think you interpret it as "has pending POLL work
> queued up in the background and may decide to wait for it in
> sys_io_uring_enter(), but might also be doing anything else".

In which case it can hit schedule() at some random point before it gets
to io_cqring_wait().

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-21 14:49       ` Jens Axboe
  2020-02-21 15:02         ` Jann Horn
@ 2020-02-21 16:23         ` Peter Zijlstra
  2020-02-21 20:13           ` Jens Axboe
  1 sibling, 1 reply; 53+ messages in thread
From: Peter Zijlstra @ 2020-02-21 16:23 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jann Horn, io-uring, Glauber Costa, Pavel Begunkov

On Fri, Feb 21, 2020 at 06:49:16AM -0800, Jens Axboe wrote:

> > Jens, what exactly is the benefit of running this on every random
> > schedule() vs in io_cqring_wait() ? Or even, since io_cqring_wait() is
> > the very last thing the syscall does, task_work.
> 
> I took a step back and I think we can just use the task work, which
> makes this a lot less complicated in terms of locking and schedule
> state. Ran some quick testing with the below and it works for me.
> 
> I'm going to re-spin based on this and just dump the sched_work
> addition.

Awesome, simpler is better.

> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 81aa3959f326..413ac86d7882 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -3529,7 +3529,7 @@ static int __io_async_wake(struct io_kiocb *req, struct io_poll_iocb *poll,
>  	 * the exit check will ultimately cancel these work items. Hence we
>  	 * don't need to check here and handle it specifically.
>  	 */
> -	sched_work_add(tsk, &req->sched_work);
> +	task_work_add(tsk, &req->sched_work, true);
>  	wake_up_process(tsk);
>  	return 1;
>  }
> @@ -5367,9 +5367,9 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
>  	do {
>  		if (io_cqring_events(ctx, false) >= min_events)
>  			return 0;
> -		if (!current->sched_work)
> +		if (!current->task_works)
>  			break;
> -		sched_work_run();
> +		task_work_run();
>  	} while (1);
>  
>  	if (sig) {
> @@ -5392,6 +5392,12 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
>  						TASK_INTERRUPTIBLE);
>  		if (io_should_wake(&iowq, false))
>  			break;
> +		if (current->task_works) {
> +			task_work_run();
> +			if (io_should_wake(&iowq, false))
> +				break;
> +			continue;
> +		}

		if (current->task_works)
			task_work_run();
		if (io_should_wake(&iowq, false))
			break;

doesn't work?

>  		schedule();
>  		if (signal_pending(current)) {
>  			ret = -EINTR;


Anyway, we need to be careful about the context where we call
task_work_run(), but afaict doing it here should be fine.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-21  1:29               ` Jann Horn
@ 2020-02-21 17:32                 ` Jens Axboe
  2020-02-21 19:24                   ` Jann Horn
  0 siblings, 1 reply; 53+ messages in thread
From: Jens Axboe @ 2020-02-21 17:32 UTC (permalink / raw)
  To: Jann Horn; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On 2/20/20 6:29 PM, Jann Horn wrote:
> On Fri, Feb 21, 2020 at 12:22 AM Jens Axboe <axboe@kernel.dk> wrote:
>> On 2/20/20 4:12 PM, Jann Horn wrote:
>>> On Fri, Feb 21, 2020 at 12:00 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>> On 2/20/20 3:23 PM, Jann Horn wrote:
>>>>> On Thu, Feb 20, 2020 at 11:14 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>> On 2/20/20 3:02 PM, Jann Horn wrote:
>>>>>>> On Thu, Feb 20, 2020 at 9:32 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>>>> For poll requests, it's not uncommon to link a read (or write) after
>>>>>>>> the poll to execute immediately after the file is marked as ready.
>>>>>>>> Since the poll completion is called inside the waitqueue wake up handler,
>>>>>>>> we have to punt that linked request to async context. This slows down
>>>>>>>> the processing, and actually means it's faster to not use a link for this
>>>>>>>> use case.
>>> [...]
>>>>>>>> -static void io_poll_trigger_evfd(struct io_wq_work **workptr)
>>>>>>>> +static void io_poll_task_func(struct callback_head *cb)
>>>>>>>>  {
>>>>>>>> -       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
>>>>>>>> +       struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
>>>>>>>> +       struct io_kiocb *nxt = NULL;
>>>>>>>>
>>>>>>> [...]
>>>>>>>> +       io_poll_task_handler(req, &nxt);
>>>>>>>> +       if (nxt)
>>>>>>>> +               __io_queue_sqe(nxt, NULL);
>>>>>>>
>>>>>>> This can now get here from anywhere that calls schedule(), right?
>>>>>>> Which means that this might almost double the required kernel stack
>>>>>>> size, if one codepath exists that calls schedule() while near the
>>>>>>> bottom of the stack and another codepath exists that goes from here
>>>>>>> through the VFS and again uses a big amount of stack space? This is a
>>>>>>> somewhat ugly suggestion, but I wonder whether it'd make sense to
>>>>>>> check whether we've consumed over 25% of stack space, or something
>>>>>>> like that, and if so, directly punt the request.
>>> [...]
>>>>>>> Also, can we recursively hit this point? Even if __io_queue_sqe()
>>>>>>> doesn't *want* to block, the code it calls into might still block on a
>>>>>>> mutex or something like that, at which point the mutex code would call
>>>>>>> into schedule(), which would then again hit sched_out_update() and get
>>>>>>> here, right? As far as I can tell, this could cause unbounded
>>>>>>> recursion.
>>>>>>
>>>>>> The sched_work items are pruned before being run, so that can't happen.
>>>>>
>>>>> And is it impossible for new ones to be added in the meantime if a
>>>>> second poll operation completes in the background just when we're
>>>>> entering __io_queue_sqe()?
>>>>
>>>> True, that can happen.
>>>>
>>>> I wonder if we just prevent the recursion whether we can ignore most
>>>> of it. Eg never process the sched_work list if we're not at the top
>>>> level, so to speak.
>>>>
>>>> This should also prevent the deadlock that you mentioned with FUSE
>>>> in the next email that just rolled in.
>>>
>>> But there the first ->read_iter could be from outside io_uring. So you
>>> don't just have to worry about nesting inside an already-running uring
>>> work; you also have to worry about nesting inside more or less
>>> anything else that might be holding mutexes. So I think you'd pretty
>>> much have to whitelist known-safe schedule() callers, or something
>>> like that.
>>
>> I'll see if I can come up with something for that. Ideally any issue
>> with IOCB_NOWAIT set should be honored, and trylock etc should be used.
> 
> Are you sure? For example, an IO operation typically copies data to
> userspace, which can take pagefaults. And those should be handled
> synchronously even with IOCB_NOWAIT set, right? And the page fault
> code can block on mutexes (like the mmap_sem) or even wait for a
> blocking filesystem operation (via file mappings) or for userspace
> (via userfaultfd or FUSE mappings).

Yeah that's a good point. The more I think about it, the less I think
the scheduler-invoked callback is going to work. We need to be able to
manage the context in which we are called; see the later messages on the
task_work usage instead.

>> But I don't think we can fully rely on that, we need something a bit
>> more solid...
>>
>>> Taking a step back: Do you know why this whole approach brings the
>>> kind of performance benefit you mentioned in the cover letter? 4x is a
>>> lot... Is it that expensive to take a trip through the scheduler?
>>> I wonder whether the performance numbers for the echo test would
>>> change if you commented out io_worker_spin_for_work()...
>>
>> If anything, I expect the spin removal to make it worse. There's really
>> no magic there on why it's faster, if you offload work to a thread that
>> is essentially sync, then you're going to take a huge hit in
>> performance. It's the difference between:
>>
>> 1) Queue work with thread, wake up thread
>> 2) Thread wakes, starts work, goes to sleep.
> 
> If we go to sleep here, then the other side hasn't yet sent us
> anything, so up to this point, it shouldn't have any impact on the
> measured throughput, right?
> 
>> 3) Data available, thread is woken, does work
> 
> This is the same in the other case: Data is available, the
> application's thread is woken and does the work.
> 
>> 4) Thread signals completion of work
> 
> And this is also basically the same, except that in the worker-thread
> case, we have to go through the scheduler to reach userspace, while
> with this patch series, we can signal "work is completed" and return
> to userspace without an extra trip through the scheduler.

There's a big difference between:

- Task needs to do work, task goes to sleep on it, task is woken

and

- Task needs to do work, task passes work to thread. Task goes to sleep.
  Thread wakes up, tries to do work, goes to sleep. Thread is woken,
  does work, notifies task. Task is woken up.

If you've ever worked with any sort of thread pool (userspace or otherwise),
this is painful, and particularly so when you're only keeping one
work item in flight. That kind of pipeline is rife with bubbles. If we
can have multiple items in flight, then we start to gain ground due to
the parallelism.

> I could imagine this optimization having some performance benefit, but
> I'm still sceptical about it buying a 4x benefit without some more
> complicated reason behind it.

I just re-ran the testing, this time on top of the current tree, where
instead of doing the task/sched_work_add() we simply queue for async.
This should be an even better case than before, since hopefully the
thread will not need to go to sleep to process the work; it'll complete
without blocking. For an echo test setup over a socket, this approach
yields about 45-48K requests per second. This, btw, is with the io-wq
spin removed. Using the callback method where the task itself does the
work, I get 175K-180K requests per second.
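
In per-request terms that's roughly:

	45-48K reqs/sec   ->  ~21-22 usec per request   (offload to io-wq)
	175-180K reqs/sec ->  ~5.6-5.7 usec per request (task runs the retry itself)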

>> versus just completing the work when it's ready and not having any
>> switches to a worker thread at all. As the cover letter mentions, the
>> single client case is a huge win, and that is of course the biggest win
>> because everything is idle. If the thread doing the offload can be kept
>> running, the gains become smaller as we're not paying those wake/sleep
>> penalties anymore.
> 
> I'd really like to see what the scheduler behavior looks like here,
> for this single-client echo test. I can imagine three cases (which I
> guess are probably going to be mixed because the scheduler moves tasks
> around; but I don't actually know much about how the scheduler works,
> so my guesses are probably not very helpful):
> 
> Case 1: Both the worker and the userspace task are on the same CPU. In
> this case, the worker will waste something on the order of 10000
> cycles for every message while userspace is runnable, unless the
> scheduler decides that the worker has spent so much time on the CPU
> that it should relinquish it to the userspace task. (You test for
> need_resched() in the busyloop, but AFAIK that just asks the scheduler
> whether it wants you to get off the CPU right now, not whether there
> are any other runnable tasks on the CPU at the moment.)
> Case 2: The worker and the userspace task are on different *physical*
> cores and don't share L1D and L2. This will cause a performance
> penalty due to constant cacheline bouncing.
> Case 3: The worker and the userspace task are on hyperthreads, so they
> share L1D and L2 and can run concurrently. Compared to the other two
> cases, this would probably work best, but I'm not sure whether the
> scheduler is smart enough to specifically target this behavior? (And
> if you're running inside a VM, or on a system without hyperthreading,
> this isn't even an option.)
> 
> So I wonder what things look like if you force the worker and the
> userspace task to run on the same CPU without any idle polling in the
> worker; or how it looks when you pin the worker and the userspace task
> on hyperthreads. And if that does make a difference, it might be worth
> considering whether the interaction between io_uring and the scheduler
> could be optimized.

We can probably get closer with smarter placement, but it's never going
to be anywhere near as fast as when the task itself does the work. For
the echo example, ideally you want the server and client on the same
CPU. And since it's a benchmark, both of them soak up the core; in my
testing I see about 55-60% backend CPU and 40-40% from the client. I
could affinitize the async thread to the same CPU, but we're just going
to be stealing cycles at that point.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-21 14:50     ` Jens Axboe
@ 2020-02-21 18:30       ` Pavel Begunkov
  2020-02-21 19:10         ` Jens Axboe
  0 siblings, 1 reply; 53+ messages in thread
From: Pavel Begunkov @ 2020-02-21 18:30 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: glauber, peterz, Jann Horn

On 21/02/2020 17:50, Jens Axboe wrote:
> On 2/21/20 6:51 AM, Pavel Begunkov wrote:
>> On 20/02/2020 23:31, Jens Axboe wrote:
>>> For poll requests, it's not uncommon to link a read (or write) after
>>> the poll to execute immediately after the file is marked as ready.
>>> Since the poll completion is called inside the waitqueue wake up handler,
>>> we have to punt that linked request to async context. This slows down
>>> the processing, and actually means it's faster to not use a link for this
>>> use case.
>>>
>>> We also run into problems if the completion_lock is contended, as we're
>>> doing a different lock ordering than the issue side is. Hence we have
>>> to do trylock for completion, and if that fails, go async. Poll removal
>>> needs to go async as well, for the same reason.
>>>
>>> eventfd notification needs special case as well, to avoid stack blowing
>>> recursion or deadlocks.
>>>
>>> These are all deficiencies that were inherited from the aio poll
>>> implementation, but I think we can do better. When a poll completes,
>>> simply queue it up in the task poll list. When the task completes the
>>> list, we can run dependent links inline as well. This means we never
>>> have to go async, and we can remove a bunch of code associated with
>>> that, and optimizations to try and make that run faster. The diffstat
>>> speaks for itself.
>>
>> So, it piggybacks request execution onto a random task, that happens
>> to complete a poll. Did I get it right?
>>
>> I can't find where it setting right mm, creds, etc., or why it have
>> them already.
> 
> Not a random task, the very task that initially tried to do the receive
> (or whatever the operation may be). Hence there's no need to set
> mm/creds/whatever, we're still running in the context of the original
> task once we retry the operation after the poll signals readiness.

Got it. Then the retry may happen later, after we've returned from
__io_arm_poll_handler() and io_uring_enter(). By that time io_submit_sqes()
will already have restored the creds (i.e. the personality stuff) on the way back.
This might be a problem.

BTW, is it by design that all requests of a link use the personality creds
specified in the head's sqe?

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-21 18:30       ` Pavel Begunkov
@ 2020-02-21 19:10         ` Jens Axboe
  2020-02-21 19:22           ` Pavel Begunkov
  2020-02-23  6:00           ` Jens Axboe
  0 siblings, 2 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-21 19:10 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: glauber, peterz, Jann Horn

On 2/21/20 11:30 AM, Pavel Begunkov wrote:
> On 21/02/2020 17:50, Jens Axboe wrote:
>> On 2/21/20 6:51 AM, Pavel Begunkov wrote:
>>> On 20/02/2020 23:31, Jens Axboe wrote:
>>>> For poll requests, it's not uncommon to link a read (or write) after
>>>> the poll to execute immediately after the file is marked as ready.
>>>> Since the poll completion is called inside the waitqueue wake up handler,
>>>> we have to punt that linked request to async context. This slows down
>>>> the processing, and actually means it's faster to not use a link for this
>>>> use case.
>>>>
>>>> We also run into problems if the completion_lock is contended, as we're
>>>> doing a different lock ordering than the issue side is. Hence we have
>>>> to do trylock for completion, and if that fails, go async. Poll removal
>>>> needs to go async as well, for the same reason.
>>>>
>>>> eventfd notification needs special case as well, to avoid stack blowing
>>>> recursion or deadlocks.
>>>>
>>>> These are all deficiencies that were inherited from the aio poll
>>>> implementation, but I think we can do better. When a poll completes,
>>>> simply queue it up in the task poll list. When the task completes the
>>>> list, we can run dependent links inline as well. This means we never
>>>> have to go async, and we can remove a bunch of code associated with
>>>> that, and optimizations to try and make that run faster. The diffstat
>>>> speaks for itself.
>>>
>>> So, it piggybacks request execution onto a random task, that happens
>>> to complete a poll. Did I get it right?
>>>
>>> I can't find where it setting right mm, creds, etc., or why it have
>>> them already.
>>
>> Not a random task, the very task that initially tried to do the receive
>> (or whatever the operation may be). Hence there's no need to set
>> mm/creds/whatever, we're still running in the context of the original
>> task once we retry the operation after the poll signals readiness.
> 
> Got it. Then, it may happen in the future after returning from
> __io_arm_poll_handler() and io_uring_enter(). And by that time io_submit_sqes()
> should have already restored creds (i.e. personality stuff) on the way back.
> This might be a problem.

Not sure I follow, can you elaborate? Just to be sure, the requests that
go through the poll handler will go through __io_queue_sqe() again. Oh I
guess your point is that that is one level below where we normally
assign the creds.

> BTW, Is it by design, that all requests of a link use personality creds
> specified in the head's sqe?

No, I think that's more by accident. We should make sure they use the
specified creds, regardless of the issue time. Care to clean that up?
Would probably help get it right for the poll case, too.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-21 19:10         ` Jens Axboe
@ 2020-02-21 19:22           ` Pavel Begunkov
  2020-02-23  6:00           ` Jens Axboe
  1 sibling, 0 replies; 53+ messages in thread
From: Pavel Begunkov @ 2020-02-21 19:22 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: glauber, peterz, Jann Horn

On 21/02/2020 22:10, Jens Axboe wrote:
> On 2/21/20 11:30 AM, Pavel Begunkov wrote:
>> On 21/02/2020 17:50, Jens Axboe wrote:
>>> On 2/21/20 6:51 AM, Pavel Begunkov wrote:
>>>> On 20/02/2020 23:31, Jens Axboe wrote:
>>>>> For poll requests, it's not uncommon to link a read (or write) after
>>>>> the poll to execute immediately after the file is marked as ready.
>>>>> Since the poll completion is called inside the waitqueue wake up handler,
>>>>> we have to punt that linked request to async context. This slows down
>>>>> the processing, and actually means it's faster to not use a link for this
>>>>> use case.
>>>>>
>>>>> We also run into problems if the completion_lock is contended, as we're
>>>>> doing a different lock ordering than the issue side is. Hence we have
>>>>> to do trylock for completion, and if that fails, go async. Poll removal
>>>>> needs to go async as well, for the same reason.
>>>>>
>>>>> eventfd notification needs special case as well, to avoid stack blowing
>>>>> recursion or deadlocks.
>>>>>
>>>>> These are all deficiencies that were inherited from the aio poll
>>>>> implementation, but I think we can do better. When a poll completes,
>>>>> simply queue it up in the task poll list. When the task completes the
>>>>> list, we can run dependent links inline as well. This means we never
>>>>> have to go async, and we can remove a bunch of code associated with
>>>>> that, and optimizations to try and make that run faster. The diffstat
>>>>> speaks for itself.
>>>>
>>>> So, it piggybacks request execution onto a random task, that happens
>>>> to complete a poll. Did I get it right?
>>>>
>>>> I can't find where it setting right mm, creds, etc., or why it have
>>>> them already.
>>>
>>> Not a random task, the very task that initially tried to do the receive
>>> (or whatever the operation may be). Hence there's no need to set
>>> mm/creds/whatever, we're still running in the context of the original
>>> task once we retry the operation after the poll signals readiness.
>>
>> Got it. Then, it may happen in the future after returning from
>> __io_arm_poll_handler() and io_uring_enter(). And by that time io_submit_sqes()
>> should have already restored creds (i.e. personality stuff) on the way back.
>> This might be a problem.
> 
> Not sure I follow, can you elaborate? Just to be sure, the requests that
> go through the poll handler will go through __io_queue_sqe() again. Oh I
> guess your point is that that is one level below where we normally
> assign the creds.

Yeah, exactly. Poll handler won't do the personality dancing, as it doesn't go
through io_submit_sqes().
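
Presumably the retry path would have to save/restore them itself, e.g.
(just a sketch around the retry; the req->work.creds field name is my
assumption from the personality support, not taken from this series):

	const struct cred *old_creds = NULL;

	/* switch to the request's personality creds, if it has any */
	if (req->work.creds && req->work.creds != current_cred())
		old_creds = override_creds(req->work.creds);

	__io_queue_sqe(req, NULL);

	if (old_creds)
		revert_creds(old_creds);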

> 
>> BTW, Is it by design, that all requests of a link use personality creds
>> specified in the head's sqe?
> 
> No, I think that's more by accident. We should make sure they use the
> specified creds, regardless of the issue time. Care to clean that up?
> Would probably help get it right for the poll case, too.

Ok, I'll prepare

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-21 17:32                 ` Jens Axboe
@ 2020-02-21 19:24                   ` Jann Horn
  2020-02-21 20:18                     ` Jens Axboe
  0 siblings, 1 reply; 53+ messages in thread
From: Jann Horn @ 2020-02-21 19:24 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On Fri, Feb 21, 2020 at 6:32 PM Jens Axboe <axboe@kernel.dk> wrote:
> On 2/20/20 6:29 PM, Jann Horn wrote:
> > On Fri, Feb 21, 2020 at 12:22 AM Jens Axboe <axboe@kernel.dk> wrote:
> >> On 2/20/20 4:12 PM, Jann Horn wrote:
> >>> On Fri, Feb 21, 2020 at 12:00 AM Jens Axboe <axboe@kernel.dk> wrote:
> >>>> On 2/20/20 3:23 PM, Jann Horn wrote:
> >>>>> On Thu, Feb 20, 2020 at 11:14 PM Jens Axboe <axboe@kernel.dk> wrote:
> >>>>>> On 2/20/20 3:02 PM, Jann Horn wrote:
> >>>>>>> On Thu, Feb 20, 2020 at 9:32 PM Jens Axboe <axboe@kernel.dk> wrote:
> >>>>>>>> For poll requests, it's not uncommon to link a read (or write) after
> >>>>>>>> the poll to execute immediately after the file is marked as ready.
> >>>>>>>> Since the poll completion is called inside the waitqueue wake up handler,
> >>>>>>>> we have to punt that linked request to async context. This slows down
> >>>>>>>> the processing, and actually means it's faster to not use a link for this
> >>>>>>>> use case.
> >>> [...]
> >>>>>>>> -static void io_poll_trigger_evfd(struct io_wq_work **workptr)
> >>>>>>>> +static void io_poll_task_func(struct callback_head *cb)
> >>>>>>>>  {
> >>>>>>>> -       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
> >>>>>>>> +       struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
> >>>>>>>> +       struct io_kiocb *nxt = NULL;
> >>>>>>>>
> >>>>>>> [...]
> >>>>>>>> +       io_poll_task_handler(req, &nxt);
> >>>>>>>> +       if (nxt)
> >>>>>>>> +               __io_queue_sqe(nxt, NULL);
> >>>>>>>
> >>>>>>> This can now get here from anywhere that calls schedule(), right?
> >>>>>>> Which means that this might almost double the required kernel stack
> >>>>>>> size, if one codepath exists that calls schedule() while near the
> >>>>>>> bottom of the stack and another codepath exists that goes from here
> >>>>>>> through the VFS and again uses a big amount of stack space? This is a
> >>>>>>> somewhat ugly suggestion, but I wonder whether it'd make sense to
> >>>>>>> check whether we've consumed over 25% of stack space, or something
> >>>>>>> like that, and if so, directly punt the request.
> >>> [...]
> >>>>>>> Also, can we recursively hit this point? Even if __io_queue_sqe()
> >>>>>>> doesn't *want* to block, the code it calls into might still block on a
> >>>>>>> mutex or something like that, at which point the mutex code would call
> >>>>>>> into schedule(), which would then again hit sched_out_update() and get
> >>>>>>> here, right? As far as I can tell, this could cause unbounded
> >>>>>>> recursion.
> >>>>>>
> >>>>>> The sched_work items are pruned before being run, so that can't happen.
> >>>>>
> >>>>> And is it impossible for new ones to be added in the meantime if a
> >>>>> second poll operation completes in the background just when we're
> >>>>> entering __io_queue_sqe()?
> >>>>
> >>>> True, that can happen.
> >>>>
> >>>> I wonder if we just prevent the recursion whether we can ignore most
> >>>> of it. Eg never process the sched_work list if we're not at the top
> >>>> level, so to speak.
> >>>>
> >>>> This should also prevent the deadlock that you mentioned with FUSE
> >>>> in the next email that just rolled in.
> >>>
> >>> But there the first ->read_iter could be from outside io_uring. So you
> >>> don't just have to worry about nesting inside an already-running uring
> >>> work; you also have to worry about nesting inside more or less
> >>> anything else that might be holding mutexes. So I think you'd pretty
> >>> much have to whitelist known-safe schedule() callers, or something
> >>> like that.
> >>
> >> I'll see if I can come up with something for that. Ideally any issue
> >> with IOCB_NOWAIT set should be honored, and trylock etc should be used.
> >
> > Are you sure? For example, an IO operation typically copies data to
> > userspace, which can take pagefaults. And those should be handled
> > synchronously even with IOCB_NOWAIT set, right? And the page fault
> > code can block on mutexes (like the mmap_sem) or even wait for a
> > blocking filesystem operation (via file mappings) or for userspace
> > (via userfaultfd or FUSE mappings).
>
> Yeah that's a good point. The more I think about it, the less I think
> the scheduler invoked callback is going to work. We need to be able to
> manage the context of when we are called, see later messages on the
> task_work usage instead.
>
> >> But I don't think we can fully rely on that, we need something a bit
> >> more solid...
> >>
> >>> Taking a step back: Do you know why this whole approach brings the
> >>> kind of performance benefit you mentioned in the cover letter? 4x is a
> >>> lot... Is it that expensive to take a trip through the scheduler?
> >>> I wonder whether the performance numbers for the echo test would
> >>> change if you commented out io_worker_spin_for_work()...
> >>
> >> If anything, I expect the spin removal to make it worse. There's really
> >> no magic there on why it's faster, if you offload work to a thread that
> >> is essentially sync, then you're going to take a huge hit in
> >> performance. It's the difference between:
> >>
> >> 1) Queue work with thread, wake up thread
> >> 2) Thread wakes, starts work, goes to sleep.
> >
> > If we go to sleep here, then the other side hasn't yet sent us
> > anything, so up to this point, it shouldn't have any impact on the
> > measured throughput, right?
> >
> >> 3) Data available, thread is woken, does work
> >
> > This is the same in the other case: Data is available, the
> > application's thread is woken and does the work.
> >
> >> 4) Thread signals completion of work
> >
> > And this is also basically the same, except that in the worker-thread
> > case, we have to go through the scheduler to reach userspace, while
> > with this patch series, we can signal "work is completed" and return
> > to userspace without an extra trip through the scheduler.
>
> There's a big difference between:
>
> - Task needs to do work, task goes to sleep on it, task is woken
>
> and
>
> - Task needs to do work, task passes work to thread. Task goes to sleep.
>   Thread wakes up, tries to do work, goes to sleep. Thread is woken,
>   does work, notifies task. Task is woken up.
>
> If you've ever done any sort of thread poll (userspace or otherwise),
> this is painful, and particularly so when you're only keeping one
> work item in flight. That kind of pipeline is rife with bubbles. If we
> can have multiple items in flight, then we start to gain ground due to
> the parallelism.
>
> > I could imagine this optimization having some performance benefit, but
> > I'm still sceptical about it buying a 4x benefit without some more
> > complicated reason behind it.
>
> I just re-ran the testing, this time on top of the current tree, where
> instead of doing the task/sched_work_add() we simply queue for async.
> This should be an even better case than before, since hopefully the
> thread will not need to go to sleep to process the work, it'll complete
> without blocking. For an echo test setup over a socket, this approach
> yields about 45-48K requests per second. This, btw, is with the io-wq
> spin removed. Using the callback method where the task itself does the
> work, 175K-180K requests per second.

Huh. So that's like, what, somewhere on the order of 7.6 microseconds
or somewhere around 15000 cycles overhead for shoving a request
completion event from worker context over to a task, assuming that
you're running at something around 2GHz? Well, I guess that's a little
more than twice as much time as it takes to switch from one blocked
thread to another via eventfd (including overhead from syscall and CPU
mitigations and stuff), so I guess it's not completely unreasonable...
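
(Rough arithmetic behind that estimate: (1/48K - 1/180K) / 2 is about
(20.8 - 5.6) / 2 ~= 7.6 usec per request, if each echo round trip counts
as a recv+send pair - that split is an assumption - and 7.6 usec * 2 GHz
~= 15,200 cycles.)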

Anyway, I'll stop nagging about this since it sounds like you're going
to implement this in a less unorthodox way now. ^^

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-21 16:23         ` Peter Zijlstra
@ 2020-02-21 20:13           ` Jens Axboe
  0 siblings, 0 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-21 20:13 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Jann Horn, io-uring, Glauber Costa, Pavel Begunkov

On 2/21/20 9:23 AM, Peter Zijlstra wrote:
> On Fri, Feb 21, 2020 at 06:49:16AM -0800, Jens Axboe wrote:
> 
>>> Jens, what exactly is the benefit of running this on every random
>>> schedule() vs in io_cqring_wait() ? Or even, since io_cqring_wait() is
>>> the very last thing the syscall does, task_work.
>>
>> I took a step back and I think we can just use the task work, which
>> makes this a lot less complicated in terms of locking and schedule
>> state. Ran some quick testing with the below and it works for me.
>>
>> I'm going to re-spin based on this and just dump the sched_work
>> addition.
> 
> Awesome, simpler is better.

Agree!

>> @@ -5392,6 +5392,12 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
>>  						TASK_INTERRUPTIBLE);
>>  		if (io_should_wake(&iowq, false))
>>  			break;
>> +		if (current->task_works) {
>> +			task_work_run();
>> +			if (io_should_wake(&iowq, false))
>> +				break;
>> +			continue;
>> +		}
> 
> 		if (current->task_works)
> 			task_work_run();
> 		if (io_should_wake(&iowq, false))
> 			break;
> 
> doesn't work?

Yeah it totally does, I'll make that change. Not sure what I was
thinking there.
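
Roughly, the resulting wait loop body would then read something like
this (just a sketch against the hunk quoted above; the surrounding loop
shape and the prepare_to_wait_exclusive() call are assumed unchanged):

	do {
		prepare_to_wait_exclusive(&ctx->wait, &iowq.wq,
						TASK_INTERRUPTIBLE);
		/* run pending task_work before deciding whether to sleep */
		if (current->task_works)
			task_work_run();
		if (io_should_wake(&iowq, false))
			break;
		schedule();
		if (signal_pending(current)) {
			ret = -EINTR;
			break;
		}
	} while (1);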

>>  		schedule();
>>  		if (signal_pending(current)) {
>>  			ret = -EINTR;
> 
> 
> Anyway, we need to be careful about the context where we call
> task_work_run(), but afaict doing it here should be fine.

Right, that's the main win over the sched in/out approach. On clean
entry to a system call we should be fine, I added it for
io_uring_enter() as well. We're not holding anything at that point.
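
Roughly, at the top of io_uring_enter() that amounts to the below
(sketch only, the placement is what matters, before we grab any ring
references):

	/* clean syscall entry, nothing held: drain pending task_work */
	if (current->task_works)
		task_work_run();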

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-21 19:24                   ` Jann Horn
@ 2020-02-21 20:18                     ` Jens Axboe
  0 siblings, 0 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-21 20:18 UTC (permalink / raw)
  To: Jann Horn; +Cc: io-uring, Glauber Costa, Peter Zijlstra, Pavel Begunkov

On 2/21/20 12:24 PM, Jann Horn wrote:
> On Fri, Feb 21, 2020 at 6:32 PM Jens Axboe <axboe@kernel.dk> wrote:
>> On 2/20/20 6:29 PM, Jann Horn wrote:
>>> On Fri, Feb 21, 2020 at 12:22 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>> On 2/20/20 4:12 PM, Jann Horn wrote:
>>>>> On Fri, Feb 21, 2020 at 12:00 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>> On 2/20/20 3:23 PM, Jann Horn wrote:
>>>>>>> On Thu, Feb 20, 2020 at 11:14 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>>>> On 2/20/20 3:02 PM, Jann Horn wrote:
>>>>>>>>> On Thu, Feb 20, 2020 at 9:32 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>>>>>> For poll requests, it's not uncommon to link a read (or write) after
>>>>>>>>>> the poll to execute immediately after the file is marked as ready.
>>>>>>>>>> Since the poll completion is called inside the waitqueue wake up handler,
>>>>>>>>>> we have to punt that linked request to async context. This slows down
>>>>>>>>>> the processing, and actually means it's faster to not use a link for this
>>>>>>>>>> use case.
>>>>> [...]
>>>>>>>>>> -static void io_poll_trigger_evfd(struct io_wq_work **workptr)
>>>>>>>>>> +static void io_poll_task_func(struct callback_head *cb)
>>>>>>>>>>  {
>>>>>>>>>> -       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
>>>>>>>>>> +       struct io_kiocb *req = container_of(cb, struct io_kiocb, sched_work);
>>>>>>>>>> +       struct io_kiocb *nxt = NULL;
>>>>>>>>>>
>>>>>>>>> [...]
>>>>>>>>>> +       io_poll_task_handler(req, &nxt);
>>>>>>>>>> +       if (nxt)
>>>>>>>>>> +               __io_queue_sqe(nxt, NULL);
>>>>>>>>>
>>>>>>>>> This can now get here from anywhere that calls schedule(), right?
>>>>>>>>> Which means that this might almost double the required kernel stack
>>>>>>>>> size, if one codepath exists that calls schedule() while near the
>>>>>>>>> bottom of the stack and another codepath exists that goes from here
>>>>>>>>> through the VFS and again uses a big amount of stack space? This is a
>>>>>>>>> somewhat ugly suggestion, but I wonder whether it'd make sense to
>>>>>>>>> check whether we've consumed over 25% of stack space, or something
>>>>>>>>> like that, and if so, directly punt the request.
>>>>> [...]
>>>>>>>>> Also, can we recursively hit this point? Even if __io_queue_sqe()
>>>>>>>>> doesn't *want* to block, the code it calls into might still block on a
>>>>>>>>> mutex or something like that, at which point the mutex code would call
>>>>>>>>> into schedule(), which would then again hit sched_out_update() and get
>>>>>>>>> here, right? As far as I can tell, this could cause unbounded
>>>>>>>>> recursion.
>>>>>>>>
>>>>>>>> The sched_work items are pruned before being run, so that can't happen.
>>>>>>>
>>>>>>> And is it impossible for new ones to be added in the meantime if a
>>>>>>> second poll operation completes in the background just when we're
>>>>>>> entering __io_queue_sqe()?
>>>>>>
>>>>>> True, that can happen.
>>>>>>
>>>>>> I wonder, if we just prevent the recursion, whether we can ignore most
>>>>>> of it. E.g. never process the sched_work list if we're not at the top
>>>>>> level, so to speak.
>>>>>>
>>>>>> This should also prevent the deadlock that you mentioned with FUSE
>>>>>> in the next email that just rolled in.
>>>>>
>>>>> But there the first ->read_iter could be from outside io_uring. So you
>>>>> don't just have to worry about nesting inside an already-running uring
>>>>> work; you also have to worry about nesting inside more or less
>>>>> anything else that might be holding mutexes. So I think you'd pretty
>>>>> much have to whitelist known-safe schedule() callers, or something
>>>>> like that.
>>>>
>>>> I'll see if I can come up with something for that. Ideally any issue
>>>> with IOCB_NOWAIT set should be honored, and trylock etc should be used.
>>>
>>> Are you sure? For example, an IO operation typically copies data to
>>> userspace, which can take pagefaults. And those should be handled
>>> synchronously even with IOCB_NOWAIT set, right? And the page fault
>>> code can block on mutexes (like the mmap_sem) or even wait for a
>>> blocking filesystem operation (via file mappings) or for userspace
>>> (via userfaultfd or FUSE mappings).
>>
>> Yeah that's a good point. The more I think about it, the less I think
>> the scheduler invoked callback is going to work. We need to be able to
>> manage the context of when we are called, see later messages on the
>> task_work usage instead.
>>
>>>> But I don't think we can fully rely on that, we need something a bit
>>>> more solid...
>>>>
>>>>> Taking a step back: Do you know why this whole approach brings the
>>>>> kind of performance benefit you mentioned in the cover letter? 4x is a
>>>>> lot... Is it that expensive to take a trip through the scheduler?
>>>>> I wonder whether the performance numbers for the echo test would
>>>>> change if you commented out io_worker_spin_for_work()...
>>>>
>>>> If anything, I expect the spin removal to make it worse. There's really
>>>> no magic there on why it's faster, if you offload work to a thread that
>>>> is essentially sync, then you're going to take a huge hit in
>>>> performance. It's the difference between:
>>>>
>>>> 1) Queue work with thread, wake up thread
>>>> 2) Thread wakes, starts work, goes to sleep.
>>>
>>> If we go to sleep here, then the other side hasn't yet sent us
>>> anything, so up to this point, it shouldn't have any impact on the
>>> measured throughput, right?
>>>
>>>> 3) Data available, thread is woken, does work
>>>
>>> This is the same in the other case: Data is available, the
>>> application's thread is woken and does the work.
>>>
>>>> 4) Thread signals completion of work
>>>
>>> And this is also basically the same, except that in the worker-thread
>>> case, we have to go through the scheduler to reach userspace, while
>>> with this patch series, we can signal "work is completed" and return
>>> to userspace without an extra trip through the scheduler.
>>
>> There's a big difference between:
>>
>> - Task needs to do work, task goes to sleep on it, task is woken
>>
>> and
>>
>> - Task needs to do work, task passes work to thread. Task goes to sleep.
>>   Thread wakes up, tries to do work, goes to sleep. Thread is woken,
>>   does work, notifies task. Task is woken up.
>>
>> If you've ever done any sort of thread pool (userspace or otherwise),
>> this is painful, and particularly so when you're only keeping one
>> work item in flight. That kind of pipeline is rife with bubbles. If we
>> can have multiple items in flight, then we start to gain ground due to
>> the parallelism.
>>
>>> I could imagine this optimization having some performance benefit, but
>>> I'm still sceptical about it buying a 4x benefit without some more
>>> complicated reason behind it.
>>
>> I just re-ran the testing, this time on top of the current tree, where
>> instead of doing the task/sched_work_add() we simply queue for async.
>> This should be an even better case than before, since hopefully the
>> thread will not need to go to sleep to process the work, it'll complete
>> without blocking. For an echo test setup over a socket, this approach
>> yields about 45-48K requests per second. This, btw, is with the io-wq
>> spin removed. Using the callback method where the task itself does the
>> work, 175K-180K requests per second.
> 
> Huh. So that's like, what, somewhere on the order of 7.6 microseconds
> or somewhere around 15000 cycles overhead for shoving a request
> completion event from worker context over to a task, assuming that
> you're running at something around 2GHz? Well, I guess that's a little
> more than twice as much time as it takes to switch from one blocked
> thread to another via eventfd (including overhead from syscall and CPU
> mitigations and stuff), so I guess it's not completely unreasonable...

This is on my laptop, running the kernel in kvm for testing. So it's not
a beefy setup:

Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz

> Anyway, I'll stop nagging about this since it sounds like you're going
> to implement this in a less unorthodox way now. ^^

I'll post the updated series later today, processing off ->task_works
instead. I do agree that this is much saner than trying to entangle task
state on schedule() entry/exit, and it seems to work just as well in my
testing.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-21 19:10         ` Jens Axboe
  2020-02-21 19:22           ` Pavel Begunkov
@ 2020-02-23  6:00           ` Jens Axboe
  2020-02-23  6:26             ` Jens Axboe
  1 sibling, 1 reply; 53+ messages in thread
From: Jens Axboe @ 2020-02-23  6:00 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: glauber, peterz, Jann Horn

On 2/21/20 12:10 PM, Jens Axboe wrote:
>> Got it. Then, it may happen in the future after returning from
>> __io_arm_poll_handler() and io_uring_enter(). And by that time io_submit_sqes()
>> should have already restored creds (i.e. personality stuff) on the way back.
>> This might be a problem.
> 
> Not sure I follow, can you elaborate? Just to be sure, the requests that
> go through the poll handler will go through __io_queue_sqe() again. Oh I
> guess your point is that that is one level below where we normally
> assign the creds.

Fixed this one.

>> BTW, Is it by design, that all requests of a link use personality creds
>> specified in the head's sqe?
> 
> No, I think that's more by accident. We should make sure they use the
> specified creds, regardless of the issue time. Care to clean that up?
> Would probably help get it right for the poll case, too.

Took a look at this, and I think you're wrong. Every iteration of
io_submit_sqe() will lookup the right creds, and assign them to the
current task in case we're going to issue it. In the case of a link
where we already have the head, then we grab the current work
environment. This means assigning req->work.creds from
get_current_cred(), if not set, and these are the credentials we looked
up already.
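
In sketch form, the grab-env side of that is just the below (not the
full helper, and the helper name may differ):

	/* remember the submitting task's creds, taking a reference */
	if (!req->work.creds)
		req->work.creds = get_current_cred();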

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-23  6:00           ` Jens Axboe
@ 2020-02-23  6:26             ` Jens Axboe
  2020-02-23 11:02               ` Pavel Begunkov
  0 siblings, 1 reply; 53+ messages in thread
From: Jens Axboe @ 2020-02-23  6:26 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: glauber, peterz, Jann Horn

On 2/22/20 11:00 PM, Jens Axboe wrote:
> On 2/21/20 12:10 PM, Jens Axboe wrote:
>>> Got it. Then, it may happen in the future after returning from
>>> __io_arm_poll_handler() and io_uring_enter(). And by that time io_submit_sqes()
>>> should have already restored creds (i.e. personality stuff) on the way back.
>>> This might be a problem.
>>
>> Not sure I follow, can you elaborate? Just to be sure, the requests that
>> go through the poll handler will go through __io_queue_sqe() again. Oh I
>> guess your point is that that is one level below where we normally
>> assign the creds.
> 
> Fixed this one.
> 
>>> BTW, Is it by design, that all requests of a link use personality creds
>>> specified in the head's sqe?
>>
>> No, I think that's more by accident. We should make sure they use the
>> specified creds, regardless of the issue time. Care to clean that up?
>> Would probably help get it right for the poll case, too.
> 
> Took a look at this, and I think you're wrong. Every iteration of
> io_submit_sqe() will lookup the right creds, and assign them to the
> current task in case we're going to issue it. In the case of a link
> where we already have the head, then we grab the current work
> environment. This means assigning req->work.creds from
> get_current_cred(), if not set, and these are the credentials we looked
> up already.

What does look wrong is that we don't restore the right credentials for
queuing the head, so basically the opposite problem. Something like the
below should fix that.


commit b94ddeebd4d068d9205b319179974e09da2591fd
Author: Jens Axboe <axboe@kernel.dk>
Date:   Sat Feb 22 23:22:19 2020 -0700

    io_uring: handle multiple personalities in link chains
    
    If we have a chain of requests and they don't all use the same
    credentials, then the head of the chain will be issued with the
    credentials of the tail of the chain.
    
    Ensure __io_queue_sqe() overrides the credentials, if they are different.
    
    Fixes: 75c6a03904e0 ("io_uring: support using a registered personality for commands")
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

diff --git a/fs/io_uring.c b/fs/io_uring.c
index de650df9ac53..59024e4757d6 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4705,11 +4705,18 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_kiocb *linked_timeout;
 	struct io_kiocb *nxt = NULL;
+	const struct cred *old_creds = NULL;
 	int ret;
 
 again:
 	linked_timeout = io_prep_linked_timeout(req);
 
+	if (req->work.creds && req->work.creds != get_current_cred()) {
+		if (old_creds)
+			revert_creds(old_creds);
+		old_creds = override_creds(req->work.creds);
+	}
+
 	ret = io_issue_sqe(req, sqe, &nxt, true);
 
 	/*
@@ -4759,6 +4766,8 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 			goto punt;
 		goto again;
 	}
+	if (old_creds)
+		revert_creds(old_creds);
 }
 
 static void io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-23  6:26             ` Jens Axboe
@ 2020-02-23 11:02               ` Pavel Begunkov
  2020-02-23 14:49                 ` Jens Axboe
  0 siblings, 1 reply; 53+ messages in thread
From: Pavel Begunkov @ 2020-02-23 11:02 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: glauber, peterz, Jann Horn

On 23/02/2020 09:26, Jens Axboe wrote:
> On 2/22/20 11:00 PM, Jens Axboe wrote:
>> On 2/21/20 12:10 PM, Jens Axboe wrote:
>>>> Got it. Then, it may happen in the future after returning from
>>>> __io_arm_poll_handler() and io_uring_enter(). And by that time io_submit_sqes()
>>>> should have already restored creds (i.e. personality stuff) on the way back.
>>>> This might be a problem.
>>>
>>> Not sure I follow, can you elaborate? Just to be sure, the requests that
>>> go through the poll handler will go through __io_queue_sqe() again. Oh I
>>> guess your point is that that is one level below where we normally
>>> assign the creds.
>>
>> Fixed this one.

Looking at

io_async_task_func() {
	...
	/* ensure req->work.creds is valid for __io_queue_sqe() */
	req->work.creds = apoll->work.creds;
}

It copies creds, but doesn't touch the rest of the req->work fields. And if you
have
one, you most probably got all of them in *grab_env(). Are you sure it doesn't
leak, e.g. mmgrab()'ed mm?


>>
>>>> BTW, Is it by design, that all requests of a link use personality creds
>>>> specified in the head's sqe?
>>>
>>> No, I think that's more by accident. We should make sure they use the
>>> specified creds, regardless of the issue time. Care to clean that up?
>>> Would probably help get it right for the poll case, too.
>>
>> Took a look at this, and I think you're wrong. Every iteration of
>> io_submit_sqe() will lookup the right creds, and assign them to the
>> current task in case we're going to issue it. In the case of a link
>> where we already have the head, then we grab the current work
>> environment. This means assigning req->work.creds from
>> get_current_cred(), if not set, and these are the credentials we looked
>> up already.

Yeah, I've spotted that there was something wrong, but never looked into it properly.

> 
> What does look wrong is that we don't restore the right credentials for
> queuing the head, so basically the opposite problem. Something like the
> below should fix that.
> index de650df9ac53..59024e4757d6 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -4705,11 +4705,18 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  {
>  	struct io_kiocb *linked_timeout;
>  	struct io_kiocb *nxt = NULL;
> +	const struct cred *old_creds = NULL;
>  	int ret;
>  
>  again:
>  	linked_timeout = io_prep_linked_timeout(req);
>  
> +	if (req->work.creds && req->work.creds != get_current_cred()) {

get_current_cred() gets a ref.
See my attempt below, it fixes miscount, and should work better for cases
changing back to initial creds (i.e. personality 0)
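
To spell out the difference (sketch, per the cred API):

	const struct cred *a = current_cred();	    /* no reference taken */
	const struct cred *b = get_current_cred();  /* takes a reference... */

	put_cred(b);				    /* ...which must be dropped */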

Anyway, creds handling is too scattered across the code, and this does a lot of
useless refcounting and bouncing. It's better to find it a better place in the
near future.

> +		if (old_creds)
> +			revert_creds(old_creds);
> +		old_creds = override_creds(req->work.creds);
> +	}
> +
>  	ret = io_issue_sqe(req, sqe, &nxt, true);
>  
>  	/*
> @@ -4759,6 +4766,8 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  			goto punt;
>  		goto again;
>  	}
> +	if (old_creds)
> +		revert_creds(old_creds);
>  }
>  
>  static void io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
> 

diff --git a/fs/io_uring.c b/fs/io_uring.c
index de650df9ac53..dc06298abb37 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4705,11 +4705,21 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_kiocb *linked_timeout;
 	struct io_kiocb *nxt = NULL;
+	const struct cred *old_creds = NULL;
 	int ret;

 again:
 	linked_timeout = io_prep_linked_timeout(req);

+	if (req->work.creds && req->work.creds != current_cred()) {
+		if (old_creds)
+			revert_creds(old_creds);
+		if (old_creds == req->work.creds)
+			old_creds = NULL; /* restored original creds */
+		else
+			old_creds = override_creds(req->work.creds);
+	}
+
 	ret = io_issue_sqe(req, sqe, &nxt, true);

 	/*
@@ -4759,6 +4769,8 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 			goto punt;
 		goto again;
 	}
+	if (old_creds)
+		revert_creds(old_creds);
 }

 static void io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)

-- 
Pavel Begunkov

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-23 11:02               ` Pavel Begunkov
@ 2020-02-23 14:49                 ` Jens Axboe
  2020-02-23 14:58                   ` Jens Axboe
  2020-02-23 17:55                   ` Pavel Begunkov
  0 siblings, 2 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-23 14:49 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: glauber, peterz, Jann Horn

On 2/23/20 4:02 AM, Pavel Begunkov wrote:
> On 23/02/2020 09:26, Jens Axboe wrote:
>> On 2/22/20 11:00 PM, Jens Axboe wrote:
>>> On 2/21/20 12:10 PM, Jens Axboe wrote:
>>>>> Got it. Then, it may happen in the future after returning from
>>>>> __io_arm_poll_handler() and io_uring_enter(). And by that time io_submit_sqes()
>>>>> should have already restored creds (i.e. personality stuff) on the way back.
>>>>> This might be a problem.
>>>>
>>>> Not sure I follow, can you elaborate? Just to be sure, the requests that
>>>> go through the poll handler will go through __io_queue_sqe() again. Oh I
>>>> guess your point is that that is one level below where we normally
>>>> assign the creds.
>>>
>>> Fixed this one.
> 
> Looking at
> 
> io_async_task_func() {
> 	...
> 	/* ensure req->work.creds is valid for __io_queue_sqe() */
> 	req->work.creds = apoll->work.creds;
> }
> 
> It copies creds, but doesn't touch the rest of the req->work fields. And if you
> have
> one, you most probably got all of them in *grab_env(). Are you sure it doesn't
> leak, e.g. mmgrab()'ed mm?

You're looking at a version that only existed for about 20 min, had to
check I pushed it out. But ce21471abe0fef is the current one, it does
a full memcpy() of it.
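
I.e. something along these lines (sketch, the actual commit may differ
in the details):

	/* carry over the whole prepared work item, not just ->creds, so
	 * any grabbed state (mm, files, creds) moves along with it */
	memcpy(&req->work, &apoll->work, sizeof(req->work));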

>>>>> BTW, Is it by design, that all requests of a link use personality creds
>>>>> specified in the head's sqe?
>>>>
>>>> No, I think that's more by accident. We should make sure they use the
>>>> specified creds, regardless of the issue time. Care to clean that up?
>>>> Would probably help get it right for the poll case, too.
>>>
>>> Took a look at this, and I think you're wrong. Every iteration of
>>> io_submit_sqe() will lookup the right creds, and assign them to the
>>> current task in case we're going to issue it. In the case of a link
>>> where we already have the head, then we grab the current work
>>> environment. This means assigning req->work.creds from
>>> get_current_cred(), if not set, and these are the credentials we looked
>>> up already.
> 
> Yeah, I've spotted that there was something wrong, but never looked into it properly.

And thanks for that!

>> What does look wrong is that we don't restore the right credentials for
>> queuing the head, so basically the opposite problem. Something like the
>> below should fix that.
>> index de650df9ac53..59024e4757d6 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -4705,11 +4705,18 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>>  {
>>  	struct io_kiocb *linked_timeout;
>>  	struct io_kiocb *nxt = NULL;
>> +	const struct cred *old_creds = NULL;
>>  	int ret;
>>  
>>  again:
>>  	linked_timeout = io_prep_linked_timeout(req);
>>  
>> +	if (req->work.creds && req->work.creds != get_current_cred()) {
> 
> get_current_cred() gets a ref.

Oops yes

> See my attempt below, it fixes miscount, and should work better for
> cases changing back to initial creds (i.e. personality 0)

Thanks, I'll fold this in, if you don't mind.

> Anyway, creds handling is too scattered across the code, and this does a
> lot of useless refcounting and bouncing. It's better to find it a
> better place in the near future.

I think a good cleanup on top of this would be to move the personality
lookup to io_req_defer_prep(), and kill it from io_submit_sqe(). Now
__io_issue_sqe() does the right thing, and it'll just fall out nicely
with that as far as I can tell.

Care to send a patch for that?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-23 14:49                 ` Jens Axboe
@ 2020-02-23 14:58                   ` Jens Axboe
  2020-02-23 15:07                     ` Jens Axboe
  2020-02-23 17:55                   ` Pavel Begunkov
  1 sibling, 1 reply; 53+ messages in thread
From: Jens Axboe @ 2020-02-23 14:58 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: glauber, peterz, Jann Horn

On 2/23/20 7:49 AM, Jens Axboe wrote:
>> Anyway, creds handling is too scattered across the code, and this does a
>> lot of useless refcounting and bouncing. It's better to find it a
>> better place in the near future.
> 
> I think a good cleanup on top of this would be to move the personality
> lookup to io_req_defer_prep(), and kill it from io_submit_sqe(). Now
> __io_issue_sqe() does the right thing, and it'll just fall out nicely
> with that as far as I can tell.
> 
> Care to send a patch for that?

Since we also need it for non-deferral, how about just leaving the
lookup in there and removing the assignment? That means we only do that
juggling in one spot, which makes more sense. I think this should just
be folded into the previous patch.


diff --git a/fs/io_uring.c b/fs/io_uring.c
index cead1a0602b4..b5422613c7b1 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4923,7 +4923,6 @@ static inline void io_queue_link_head(struct io_kiocb *req)
 static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			  struct io_submit_state *state, struct io_kiocb **link)
 {
-	const struct cred *old_creds = NULL;
 	struct io_ring_ctx *ctx = req->ctx;
 	unsigned int sqe_flags;
 	int ret, id;
@@ -4938,14 +4937,11 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 
 	id = READ_ONCE(sqe->personality);
 	if (id) {
-		const struct cred *personality_creds;
-
-		personality_creds = idr_find(&ctx->personality_idr, id);
-		if (unlikely(!personality_creds)) {
+		req->work.creds = idr_find(&ctx->personality_idr, id);
+		if (unlikely(!req->work.creds)) {
 			ret = -EINVAL;
 			goto err_req;
 		}
-		old_creds = override_creds(personality_creds);
 	}
 
 	/* same numerical values with corresponding REQ_F_*, safe to copy */
@@ -4957,8 +4953,6 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 err_req:
 		io_cqring_add_event(req, ret);
 		io_double_put_req(req);
-		if (old_creds)
-			revert_creds(old_creds);
 		return false;
 	}
 
@@ -5019,8 +5013,6 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		}
 	}
 
-	if (old_creds)
-		revert_creds(old_creds);
 	return true;
 }
 

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-23 14:58                   ` Jens Axboe
@ 2020-02-23 15:07                     ` Jens Axboe
  2020-02-23 18:04                       ` Pavel Begunkov
  0 siblings, 1 reply; 53+ messages in thread
From: Jens Axboe @ 2020-02-23 15:07 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: glauber, peterz, Jann Horn

On 2/23/20 7:58 AM, Jens Axboe wrote:
> On 2/23/20 7:49 AM, Jens Axboe wrote:
>>> Anyway, creds handling is too scattered across the code, and this does a
>>> lot of useless refcounting and bouncing. It's better to find it a
>>> better place in the near future.
>>
>> I think a good cleanup on top of this would be to move the personality
>> lookup to io_req_defer_prep(), and kill it from io_submit_sqe(). Now
>> __io_issue_sqe() does the right thing, and it'll just fall out nicely
>> with that as far as I can tell.
>>
>> Care to send a patch for that?
> 
> Since we also need it for non-deferral, how about just leaving the
> lookup in there and removing the assignment? That means we only do that
> juggling in one spot, which makes more sense. I think this should just
> be folded into the previous patch.

Tested, we need a ref grab on the creds when assigning since the ref is
dropped at the other end.


diff --git a/fs/io_uring.c b/fs/io_uring.c
index cead1a0602b4..d83f113f22fd 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4923,7 +4923,6 @@ static inline void io_queue_link_head(struct io_kiocb *req)
 static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			  struct io_submit_state *state, struct io_kiocb **link)
 {
-	const struct cred *old_creds = NULL;
 	struct io_ring_ctx *ctx = req->ctx;
 	unsigned int sqe_flags;
 	int ret, id;
@@ -4938,14 +4937,12 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 
 	id = READ_ONCE(sqe->personality);
 	if (id) {
-		const struct cred *personality_creds;
-
-		personality_creds = idr_find(&ctx->personality_idr, id);
-		if (unlikely(!personality_creds)) {
+		req->work.creds = idr_find(&ctx->personality_idr, id);
+		if (unlikely(!req->work.creds)) {
 			ret = -EINVAL;
 			goto err_req;
 		}
-		old_creds = override_creds(personality_creds);
+		get_cred(req->work.creds);
 	}
 
 	/* same numerical values with corresponding REQ_F_*, safe to copy */
@@ -4957,8 +4954,6 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 err_req:
 		io_cqring_add_event(req, ret);
 		io_double_put_req(req);
-		if (old_creds)
-			revert_creds(old_creds);
 		return false;
 	}
 
@@ -5019,8 +5014,6 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		}
 	}
 
-	if (old_creds)
-		revert_creds(old_creds);
 	return true;
 }
 

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-23 14:49                 ` Jens Axboe
  2020-02-23 14:58                   ` Jens Axboe
@ 2020-02-23 17:55                   ` Pavel Begunkov
  1 sibling, 0 replies; 53+ messages in thread
From: Pavel Begunkov @ 2020-02-23 17:55 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: glauber, peterz, Jann Horn


On 23/02/2020 17:49, Jens Axboe wrote:
> On 2/23/20 4:02 AM, Pavel Begunkov wrote:
>> Looking at
>>
>> io_async_task_func() {
>> 	...
>> 	/* ensure req->work.creds is valid for __io_queue_sqe() */
>> 	req->work.creds = apoll->work.creds;
>> }
>>
>> It copies creds, but doesn't touch the rest of the req->work fields. And if you
>> have
>> one, you most probably got all of them in *grab_env(). Are you sure it doesn't
>> leak, e.g. mmgrab()'ed mm?
> 
> You're looking at a version that only existed for about 20 min, had to
> check I pushed it out. But ce21471abe0fef is the current one, it does
> a full memcpy() of it.

Lucky me, great then

> 
> Thanks, I'll fold this in, if you don't mind.

Sure

-- 
Pavel Begunkov


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-23 15:07                     ` Jens Axboe
@ 2020-02-23 18:04                       ` Pavel Begunkov
  2020-02-23 18:06                         ` Jens Axboe
  0 siblings, 1 reply; 53+ messages in thread
From: Pavel Begunkov @ 2020-02-23 18:04 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: glauber, peterz, Jann Horn

On 23/02/2020 18:07, Jens Axboe wrote:
> On 2/23/20 7:58 AM, Jens Axboe wrote:
>> On 2/23/20 7:49 AM, Jens Axboe wrote:
>>>> Anyway, creds handling is too scattered across the code, and this does a
>>>> lot of useless refcounting and bouncing. It's better to find it a
>>>> better place in the near future.
>>>
>>> I think a good cleanup on top of this would be to move the personality
>>> lookup to io_req_defer_prep(), and kill it from io_submit_sqe(). Now
>>> __io_issue_sqe() does the right thing, and it'll just fall out nicely
>>> with that as far as I can tell.
>>>
>>> Care to send a patch for that?
>>
>> Since we also need it for non-deferral, how about just leaving the
>> lookup in there and removing the assignment? That means we only do that
>> juggling in one spot, which makes more sense. I think this should just
>> be folded into the previous patch.
> 
> Tested, we need a ref grab on the creds when assigning since the ref is
> dropped at the other end.

Nice, this looks much better.

> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index cead1a0602b4..d83f113f22fd 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -4923,7 +4923,6 @@ static inline void io_queue_link_head(struct io_kiocb *req)
>  static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
>  			  struct io_submit_state *state, struct io_kiocb **link)
>  {
> -	const struct cred *old_creds = NULL;
>  	struct io_ring_ctx *ctx = req->ctx;
>  	unsigned int sqe_flags;
>  	int ret, id;
> @@ -4938,14 +4937,12 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
>  
>  	id = READ_ONCE(sqe->personality);
>  	if (id) {
> -		const struct cred *personality_creds;
> -
> -		personality_creds = idr_find(&ctx->personality_idr, id);
> -		if (unlikely(!personality_creds)) {
> +		req->work.creds = idr_find(&ctx->personality_idr, id);
> +		if (unlikely(!req->work.creds)) {
>  			ret = -EINVAL;
>  			goto err_req;
>  		}
> -		old_creds = override_creds(personality_creds);
> +		get_cred(req->work.creds);
>  	}
>  
>  	/* same numerical values with corresponding REQ_F_*, safe to copy */
> @@ -4957,8 +4954,6 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
>  err_req:
>  		io_cqring_add_event(req, ret);
>  		io_double_put_req(req);
> -		if (old_creds)
> -			revert_creds(old_creds);
>  		return false;
>  	}
>  
> @@ -5019,8 +5014,6 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
>  		}
>  	}
>  
> -	if (old_creds)
> -		revert_creds(old_creds);
>  	return true;
>  }
>  
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 7/9] io_uring: add per-task callback handler
  2020-02-23 18:04                       ` Pavel Begunkov
@ 2020-02-23 18:06                         ` Jens Axboe
  0 siblings, 0 replies; 53+ messages in thread
From: Jens Axboe @ 2020-02-23 18:06 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: glauber, peterz, Jann Horn

On 2/23/20 11:04 AM, Pavel Begunkov wrote:
> On 23/02/2020 18:07, Jens Axboe wrote:
>> On 2/23/20 7:58 AM, Jens Axboe wrote:
>>> On 2/23/20 7:49 AM, Jens Axboe wrote:
>>>>> Anyway, creds handling is too scattered across the code, and this does a
>>>>> lot of useless refcounting and bouncing. It's better to find it a
>>>>> better place in the near future.
>>>>
>>>> I think a good cleanup on top of this would be to move the personality
>>>> lookup to io_req_defer_prep(), and kill it from io_submit_sqe(). Now
>>>> __io_issue_sqe() does the right thing, and it'll just fall out nicely
>>>> with that as far as I can tell.
>>>>
>>>> Care to send a patch for that?
>>>
>>> Since we also need it for non-deferral, how about just leaving the
>>> lookup in there and removing the assignment? That means we only do that
>>> juggling in one spot, which makes more sense. I think this should just
>>> be folded into the previous patch.
>>
>> Tested, we need a ref grab on the creds when assigning since the ref is
>> dropped at the other end.
> 
> Nice, this looks much better.

Glad you agree, here's the final folded in:


commit 6494e0bd77a5b339e0585c65792e1f829f2a4812
Author: Jens Axboe <axboe@kernel.dk>
Date:   Sat Feb 22 23:22:19 2020 -0700

    io_uring: handle multiple personalities in link chains
    
    If we have a chain of requests and they don't all use the same
    credentials, then the head of the chain will be issued with the
    credentials of the tail of the chain.
    
    Ensure __io_queue_sqe() overrides the credentials, if they are different.
    
    Once we do that, we can clean up the creds handling as well, by only
    having io_submit_sqe() do the lookup of a personality. It doesn't need
    to assign it, since __io_queue_sqe() now always does the right thing.
    
    Fixes: 75c6a03904e0 ("io_uring: support using a registered personality for commands")
    Reported-by: Pavel Begunkov <asml.silence@gmail.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>

diff --git a/fs/io_uring.c b/fs/io_uring.c
index de650df9ac53..7d0be264527d 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4705,11 +4705,21 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_kiocb *linked_timeout;
 	struct io_kiocb *nxt = NULL;
+	const struct cred *old_creds = NULL;
 	int ret;
 
 again:
 	linked_timeout = io_prep_linked_timeout(req);
 
+	if (req->work.creds && req->work.creds != current_cred()) {
+		if (old_creds)
+			revert_creds(old_creds);
+		if (old_creds == req->work.creds)
+			old_creds = NULL; /* restored original creds */
+		else
+			old_creds = override_creds(req->work.creds);
+	}
+
 	ret = io_issue_sqe(req, sqe, &nxt, true);
 
 	/*
@@ -4759,6 +4769,8 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 			goto punt;
 		goto again;
 	}
+	if (old_creds)
+		revert_creds(old_creds);
 }
 
 static void io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
@@ -4803,7 +4815,6 @@ static inline void io_queue_link_head(struct io_kiocb *req)
 static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			  struct io_submit_state *state, struct io_kiocb **link)
 {
-	const struct cred *old_creds = NULL;
 	struct io_ring_ctx *ctx = req->ctx;
 	unsigned int sqe_flags;
 	int ret, id;
@@ -4818,14 +4829,12 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 
 	id = READ_ONCE(sqe->personality);
 	if (id) {
-		const struct cred *personality_creds;
-
-		personality_creds = idr_find(&ctx->personality_idr, id);
-		if (unlikely(!personality_creds)) {
+		req->work.creds = idr_find(&ctx->personality_idr, id);
+		if (unlikely(!req->work.creds)) {
 			ret = -EINVAL;
 			goto err_req;
 		}
-		old_creds = override_creds(personality_creds);
+		get_cred(req->work.creds);
 	}
 
 	/* same numerical values with corresponding REQ_F_*, safe to copy */
@@ -4837,8 +4846,6 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 err_req:
 		io_cqring_add_event(req, ret);
 		io_double_put_req(req);
-		if (old_creds)
-			revert_creds(old_creds);
 		return false;
 	}
 
@@ -4899,8 +4906,6 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		}
 	}
 
-	if (old_creds)
-		revert_creds(old_creds);
 	return true;
 }
 

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2020-02-23 18:06 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-20 20:31 [PATCHSET 0/9] io_uring: use polled async retry Jens Axboe
2020-02-20 20:31 ` [PATCH 1/9] io_uring: consider any io_read/write -EAGAIN as final Jens Axboe
2020-02-20 20:31 ` [PATCH 2/9] io_uring: io_accept() should hold on to submit reference on retry Jens Axboe
2020-02-20 20:31 ` [PATCH 3/9] sched: move io-wq/workqueue worker sched in/out into helpers Jens Axboe
2020-02-20 20:31 ` [PATCH 4/9] task_work_run: don't take ->pi_lock unconditionally Jens Axboe
2020-02-20 20:31 ` [PATCH 5/9] kernel: abstract out task work helpers Jens Axboe
2020-02-20 21:07   ` Peter Zijlstra
2020-02-20 21:08     ` Jens Axboe
2020-02-20 20:31 ` [PATCH 6/9] sched: add a sched_work list Jens Axboe
2020-02-20 21:17   ` Peter Zijlstra
2020-02-20 21:53     ` Jens Axboe
2020-02-20 22:02       ` Jens Axboe
2020-02-20 20:31 ` [PATCH 7/9] io_uring: add per-task callback handler Jens Axboe
2020-02-20 22:02   ` Jann Horn
2020-02-20 22:14     ` Jens Axboe
2020-02-20 22:18       ` Jens Axboe
2020-02-20 22:25         ` Jann Horn
2020-02-20 22:23       ` Jens Axboe
2020-02-20 22:38         ` Jann Horn
2020-02-20 22:56           ` Jens Axboe
2020-02-20 22:58             ` Jann Horn
2020-02-20 23:02               ` Jens Axboe
2020-02-20 22:23       ` Jann Horn
2020-02-20 23:00         ` Jens Axboe
2020-02-20 23:12           ` Jann Horn
2020-02-20 23:22             ` Jens Axboe
2020-02-21  1:29               ` Jann Horn
2020-02-21 17:32                 ` Jens Axboe
2020-02-21 19:24                   ` Jann Horn
2020-02-21 20:18                     ` Jens Axboe
2020-02-20 22:56     ` Jann Horn
2020-02-21 10:47     ` Peter Zijlstra
2020-02-21 14:49       ` Jens Axboe
2020-02-21 15:02         ` Jann Horn
2020-02-21 16:12           ` Peter Zijlstra
2020-02-21 16:23         ` Peter Zijlstra
2020-02-21 20:13           ` Jens Axboe
2020-02-21 13:51   ` Pavel Begunkov
2020-02-21 14:50     ` Jens Axboe
2020-02-21 18:30       ` Pavel Begunkov
2020-02-21 19:10         ` Jens Axboe
2020-02-21 19:22           ` Pavel Begunkov
2020-02-23  6:00           ` Jens Axboe
2020-02-23  6:26             ` Jens Axboe
2020-02-23 11:02               ` Pavel Begunkov
2020-02-23 14:49                 ` Jens Axboe
2020-02-23 14:58                   ` Jens Axboe
2020-02-23 15:07                     ` Jens Axboe
2020-02-23 18:04                       ` Pavel Begunkov
2020-02-23 18:06                         ` Jens Axboe
2020-02-23 17:55                   ` Pavel Begunkov
2020-02-20 20:31 ` [PATCH 8/9] io_uring: mark requests that we can do poll async in io_op_defs Jens Axboe
2020-02-20 20:31 ` [PATCH 9/9] io_uring: use poll driven retry for files that support it Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).