* [PATCH 0/7] aio-posix: polling scalability improvements
@ 2020-03-05 17:07 Stefan Hajnoczi
  2020-03-05 17:08 ` [PATCH 1/7] aio-posix: completely stop polling when disabled Stefan Hajnoczi
                   ` (7 more replies)
  0 siblings, 8 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2020-03-05 17:07 UTC
  To: qemu-devel
  Cc: Fam Zheng, Kevin Wolf, qemu-block, Max Reitz, Stefan Hajnoczi,
	Paolo Bonzini

A guest with 100 virtio-blk-pci,num-queues=32 devices only reaches 10k IOPS
while a guest with a single device reaches 105k IOPS
(rw=randread,bs=4k,iodepth=1,ioengine=libaio).

The bottleneck is that aio_poll() userspace polling iterates over all
AioHandlers to invoke their ->io_poll() callbacks.  All AioHandlers are polled
even if only one of them was recently active.  Therefore a guest with many
disks is slower than a guest with a single disk even when the workload only
accesses a single disk.

This patch series solves this scalability problem so that IOPS is unaffected by
the number of devices.  The trick is to poll only AioHandlers that were
recently active so that userspace polling scales well.
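The idea of polling only recently active handlers can be sketched roughly as
follows.  This is a minimal illustration, not the code from the series: the
Handler and PollSet types, the function names, the flag_poll() sample callback
and the 10 ms idle timeout are all hypothetical.  Handlers sit on a separate
"poll set" list while they are active and are unlinked once they have made no
progress for longer than the timeout, so the cost of one polling pass depends
on the number of active handlers rather than the total number of handlers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handler record; this is NOT QEMU's AioHandler. */
typedef struct Handler Handler;
struct Handler {
    bool (*io_poll)(void *opaque);  /* returns true if progress was made */
    void *opaque;
    int64_t last_active_ns;         /* last time io_poll() made progress */
    Handler *next_active;           /* link in the poll set */
};

/* Only handlers on this list are polled from userspace. */
typedef struct {
    Handler *head;
} PollSet;

/* Illustrative value, not taken from the patch series. */
#define IDLE_TIMEOUT_NS (10 * 1000 * 1000LL)

/* Add a handler that just became active.  Caller must not add it twice. */
static void poll_set_add(PollSet *s, Handler *h, int64_t now_ns)
{
    h->last_active_ns = now_ns;
    h->next_active = s->head;
    s->head = h;
}

/*
 * Run one polling pass over the poll set only.  Handlers that made no
 * progress for longer than IDLE_TIMEOUT_NS are unlinked; in a real event
 * loop they would fall back to fd monitoring.  Returns true if any handler
 * made progress.
 */
static bool poll_set_run(PollSet *s, int64_t now_ns)
{
    bool progress = false;
    Handler **pp = &s->head;

    while (*pp) {
        Handler *h = *pp;

        if (h->io_poll(h->opaque)) {
            h->last_active_ns = now_ns;
            progress = true;
        } else if (now_ns - h->last_active_ns > IDLE_TIMEOUT_NS) {
            *pp = h->next_active;   /* unlink the idle handler */
            h->next_active = NULL;
            continue;
        }
        pp = &h->next_active;
    }
    return progress;
}

/* Sample callback for experiments: reports progress if *opaque is true. */
static bool flag_poll(void *opaque)
{
    return *(bool *)opaque;
}
```

Removing an idle handler from the poll set is only safe if something else
still watches its file descriptor, which is where the choice of fd monitoring
implementation comes in.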

Unfortunately it's not possible to accomplish this with the existing epoll(7)
fd monitoring implementation.  This patch series adds a Linux io_uring fd
monitoring implementation.  The critical feature is that io_uring can check the
readiness of file descriptors through userspace polling.  This makes it
possible to safely poll a subset of AioHandlers from userspace without risk of
starving the other AioHandlers.
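To illustrate why a shared completion ring allows readiness checks without a
system call, here is a simplified model.  It is an assumption-laden sketch:
ModelCqRing and the model_cq_*() functions are hypothetical and do not
reproduce the real io_uring memory layout or the liburing API.  The point is
only that when the producer (the kernel, in real io_uring) publishes
completions by advancing a tail index in memory shared with the application,
the application can learn whether any monitored fd became ready by comparing
head and tail.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of a completion ring's indices; NOT the io_uring ABI. */
typedef struct {
    _Atomic uint32_t head;  /* advanced by the consumer (the application) */
    _Atomic uint32_t tail;  /* advanced by the producer (the kernel, in real io_uring) */
} ModelCqRing;

static void model_cq_init(ModelCqRing *cq)
{
    atomic_init(&cq->head, 0);
    atomic_init(&cq->tail, 0);
}

/* Number of completions waiting to be consumed; plain memory reads only. */
static uint32_t model_cq_ready(ModelCqRing *cq)
{
    uint32_t tail = atomic_load_explicit(&cq->tail, memory_order_acquire);
    uint32_t head = atomic_load_explicit(&cq->head, memory_order_relaxed);

    return tail - head;  /* unsigned arithmetic handles index wraparound */
}

/* True if at least one monitored fd has reported readiness. */
static bool model_fds_ready(ModelCqRing *cq)
{
    return model_cq_ready(cq) != 0;
}

/* Stand-in for the producer publishing one completion. */
static void model_cq_produce(ModelCqRing *cq)
{
    atomic_fetch_add_explicit(&cq->tail, 1, memory_order_release);
}

/* Consumer marks one completion as handled. */
static void model_cq_consume(ModelCqRing *cq)
{
    atomic_fetch_add_explicit(&cq->head, 1, memory_order_release);
}
```

A polling loop could call something like model_fds_ready() on every iteration
alongside the ->io_poll() callbacks of the active handlers, so handlers that
are no longer polled directly are still noticed when their fd becomes ready.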

Stefan Hajnoczi (7):
  aio-posix: completely stop polling when disabled
  aio-posix: move RCU_READ_LOCK() into run_poll_handlers()
  aio-posix: extract ppoll(2) and epoll(7) fd monitoring
  aio-posix: simplify FDMonOps->update() prototype
  aio-posix: add io_uring fd monitoring implementation
  aio-posix: support userspace polling of fd monitoring
  aio-posix: remove idle poll handlers to improve scalability

 MAINTAINERS           |   2 +
 configure             |   5 +
 include/block/aio.h   |  70 ++++++-
 util/Makefile.objs    |   3 +
 util/aio-posix.c      | 449 ++++++++++++++----------------------------
 util/aio-posix.h      |  81 ++++++++
 util/fdmon-epoll.c    | 155 +++++++++++++++
 util/fdmon-io_uring.c | 332 +++++++++++++++++++++++++++++++
 util/fdmon-poll.c     | 107 ++++++++++
 util/trace-events     |   2 +
 10 files changed, 898 insertions(+), 308 deletions(-)
 create mode 100644 util/aio-posix.h
 create mode 100644 util/fdmon-epoll.c
 create mode 100644 util/fdmon-io_uring.c
 create mode 100644 util/fdmon-poll.c

-- 
2.24.1


Thread overview: 15+ messages
2020-03-05 17:07 [PATCH 0/7] aio-posix: polling scalability improvements Stefan Hajnoczi
2020-03-05 17:08 ` [PATCH 1/7] aio-posix: completely stop polling when disabled Stefan Hajnoczi
2020-03-05 17:08 ` [PATCH 2/7] aio-posix: move RCU_READ_LOCK() into run_poll_handlers() Stefan Hajnoczi
2020-03-05 17:15   ` Paolo Bonzini
2020-03-06 13:43     ` Stefan Hajnoczi
2020-03-05 17:08 ` [PATCH 3/7] aio-posix: extract ppoll(2) and epoll(7) fd monitoring Stefan Hajnoczi
2020-03-05 17:08 ` [PATCH 4/7] aio-posix: simplify FDMonOps->update() prototype Stefan Hajnoczi
2020-03-05 17:08 ` [PATCH 5/7] aio-posix: add io_uring fd monitoring implementation Stefan Hajnoczi
2020-03-05 17:08 ` [PATCH 6/7] aio-posix: support userspace polling of fd monitoring Stefan Hajnoczi
2020-03-05 17:08 ` [PATCH 7/7] aio-posix: remove idle poll handlers to improve scalability Stefan Hajnoczi
2020-03-05 17:28   ` Paolo Bonzini
2020-03-06 13:50     ` Stefan Hajnoczi
2020-03-06 14:17       ` Paolo Bonzini
2020-03-09 16:37         ` Stefan Hajnoczi
2020-03-09 16:47 ` [PATCH 0/7] aio-posix: polling scalability improvements Stefan Hajnoczi
