* [Qemu-devel] [RFC PATCH 00/40] Sneak peek of virtio and dataplane changes for 2.6
From: Paolo Bonzini @ 2015-11-24 18:00 UTC
  To: qemu-devel, qemu-block; +Cc: mlin, famz, ming.lei, stefanha, mst

This large series is basically all that I would like to get into 2.6.
It is a combination of several pieces of work on dataplane and
multithreaded block layer.

It's also a large part of why I would like someone else to look at
miscellaneous patches for a while (in case you've missed that).  I
can foresee that following the reviews is going to be a huge time drain.

With it I can get ~1300 Kiops on 8 disks (which I achieve with 2 iothreads
and 5 VCPUs).  The bulk of the improvement actually comes from the first
8 patches, but the rest of the series is what prepares for what's next
to come in QEMU 2.7 and later, such as a multiqueue block layer.

It's tedious to review, with some pretty large patches (3, 32, 33, 35).
That's how you attract reviewers, isn't it?  I would like to get the
first virtio and the first block layer part in very soon after 2.6
development starts.

I've split it into four parts, the first two touching virtio mostly,
while the last two are for the block layer.

Because it's large, I've CCed people only on the cover letter.

This work is available at github.com/bonzini/qemu.git, branch dataplane.

A. "LEAN" VIRTQUEUEELEMENT
--------------------------

Patches 1 to 8 modify VirtQueueElement so that the space for
scatter/gather lists is allocated dynamically rather than being
fixed to 4K.  VirtQueueElement becomes a sort of "superclass", and
the scatter/gather elements are placed in the same malloc block,
which is laid out like

	VirtQueueElement
	other fields ("subclass" fields)
	in_addr[]
	out_addr[]
	in_sg[]
	out_sg[]
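
As an illustration of the layout above, here is a minimal sketch of a
single-block allocation.  It is not the code from the series: ElemSketch,
SgSketch, ReqSketch and elem_alloc_sketch are hypothetical names, and the
field set is reduced to what the layout needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scatter/gather entry (stands in for struct iovec). */
typedef struct SgSketch {
    void *base;
    size_t len;
} SgSketch;

/* Hypothetical common header, playing the role of the "superclass". */
typedef struct ElemSketch {
    unsigned in_num;
    unsigned out_num;
    uint64_t *in_addr;
    uint64_t *out_addr;
    SgSketch *in_sg;
    SgSketch *out_sg;
} ElemSketch;

/* Hypothetical device request: the header comes first, then "subclass" fields. */
typedef struct ReqSketch {
    ElemSketch elem;
    int status;
} ReqSketch;

/* Round ofs up to a multiple of align (align must be a power of two). */
static size_t round_up_sketch(size_t ofs, size_t align)
{
    return (ofs + align - 1) & ~(align - 1);
}

/*
 * Allocate one zeroed block of at least sz bytes for the device struct,
 * followed by in_addr[], out_addr[], in_sg[] and out_sg[], and point the
 * header at the trailing arrays.  Returns NULL on allocation failure.
 */
static void *elem_alloc_sketch(size_t sz, unsigned out_num, unsigned in_num)
{
    assert(sz >= sizeof(ElemSketch));

    size_t in_addr_ofs = round_up_sketch(sz, _Alignof(uint64_t));
    size_t out_addr_ofs = in_addr_ofs + in_num * sizeof(uint64_t);
    size_t in_sg_ofs = round_up_sketch(out_addr_ofs + out_num * sizeof(uint64_t),
                                       _Alignof(SgSketch));
    size_t out_sg_ofs = in_sg_ofs + in_num * sizeof(SgSketch);
    size_t total = out_sg_ofs + out_num * sizeof(SgSketch);

    char *block = calloc(1, total);
    if (!block) {
        return NULL;
    }

    ElemSketch *elem = (ElemSketch *)block;
    elem->in_num = in_num;
    elem->out_num = out_num;
    elem->in_addr = (uint64_t *)(block + in_addr_ofs);
    elem->out_addr = (uint64_t *)(block + out_addr_ofs);
    elem->in_sg = (SgSketch *)(block + in_sg_ofs);
    elem->out_sg = (SgSketch *)(block + out_sg_ofs);
    return block;
}
```

A device would call elem_alloc_sketch(sizeof(ReqSketch), out_num, in_num)
and release everything with a single free(), so the block is only as large
as the descriptor chain actually requires.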

This can provide a large speedup (from 1.3x to 2.3x) with many disks,
because the fixed-size VirtQueueElement is currently 48K.  All virtio
devices have to be changed (patch 3).  I chose to do it all in a single
patch because the changes are well isolated within each device anyway.

The main issue here is that VirtQueueElement was haphazardly shoveled
straight into the migration stream (in host endianness). :(  Patch 5
straightens this out, but at the cost of breaking backwards migration
because it now writes the VirtQueueElement in big endian, consistent
with other migration streams.
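
To make the endianness point concrete, here is a minimal sketch of writing
a structure one field at a time in big endian, so that the byte stream does
not depend on the host.  WireElemSketch and the helper names are
hypothetical, and the three fields are only an example; the actual stream
format is whatever patch 5 defines.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the fields that would go on the wire. */
typedef struct WireElemSketch {
    uint32_t index;
    uint32_t in_num;
    uint32_t out_num;
} WireElemSketch;

/* Store v at buf in big endian; returns the number of bytes written. */
static size_t put_be32_sketch(uint8_t *buf, uint32_t v)
{
    buf[0] = (uint8_t)(v >> 24);
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)v;
    return 4;
}

/* Load a big-endian 32-bit value from buf. */
static uint32_t get_be32_sketch(const uint8_t *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}

/* Serialize field by field; buf must have room for 12 bytes. */
static size_t elem_put_sketch(uint8_t *buf, const WireElemSketch *e)
{
    size_t n = 0;
    n += put_be32_sketch(buf + n, e->index);
    n += put_be32_sketch(buf + n, e->in_num);
    n += put_be32_sketch(buf + n, e->out_num);
    return n;
}

/* Deserialize in the same field order; returns the bytes consumed. */
static size_t elem_get_sketch(const uint8_t *buf, WireElemSketch *e)
{
    e->index = get_be32_sketch(buf);
    e->in_num = get_be32_sketch(buf + 4);
    e->out_num = get_be32_sketch(buf + 8);
    return 12;
}
```

Copying the raw struct with memcpy would instead emit host byte order and
padding, which is why a stream written that way cannot be read by the
field-at-a-time reader and backwards migration breaks.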

This is the least tested part of the series.  I nevertheless put it
first because it's the one that is most complicated to rebase, and
I want to get rid of it as fast as possible.  Reviewing the general
approach is welcome anyway.

	Status: virtio-input, virtio-gpu and migration not tested at all

B. REMOVING VRING.C
-------------------

This part is patches 9 to 16.  It removes the duplicate dataplane-specific
implementation of virtio in favor of the regular one that is already
used for non-dataplane.  While the dataplane implementation is slightly
more optimized, I chose to keep the other one to avoid another "touch
all virtio devices" series.

Patch 10 alone brings performance mostly on par between the two.
The remaining 7-8% can be recovered mostly by getting rid of tiny
address_space_* operations and keeping the rings always mapped.  Note that
the rest of this big series does bring a little performance improvement,
and already makes up for the lost performance.

This part has a dependency on patches that are not part of this series
(and do not exist yet), which make it possible to write the dirty
bitmap outside the BQL.  The dirty bitmap is not yet thread-safe because,
while it is read and written with atomic operations, it may be resized
when there is a memory hotplug operation.  There are plans to fix this
using RCU.
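
For readers unfamiliar with the pattern, here is a minimal sketch of a
bitmap that is read and written with atomic operations, using C11
<stdatomic.h> rather than QEMU's own atomics.  The function names are
hypothetical.  Note what the sketch does not solve: the caller passes a
raw pointer to the bitmap, so nothing stops another thread from resizing
or freeing it, which is exactly the hotplug problem described above.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/* Atomically mark one bit (for example, one page) as dirty. */
static void dirty_set_sketch(_Atomic unsigned long *map, size_t bit)
{
    unsigned long mask = 1UL << (bit % BITS_PER_WORD_SKETCH);
    atomic_fetch_or(&map[bit / BITS_PER_WORD_SKETCH], mask);
}

/* Atomically clear one bit and report whether it was set before. */
static bool dirty_test_and_clear_sketch(_Atomic unsigned long *map, size_t bit)
{
    unsigned long mask = 1UL << (bit % BITS_PER_WORD_SKETCH);
    unsigned long old = atomic_fetch_and(&map[bit / BITS_PER_WORD_SKETCH],
                                         ~mask);
    return (old & mask) != 0;
}
```

Individual bit updates are safe from any thread, but the lifetime of the
array itself still needs outside protection (the BQL today, RCU in the
plan mentioned above).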

Nevertheless, this doesn't block part C.

	Status: ready, but depends on the missing dirty bitmap support


C. FINE-GRAINED AIO_POLL CRITICAL SECTIONS
------------------------------------------

This part is patches 17 to 28.  It starts pushing aio_context_acquire down
into aio_poll.  This part is more or less independent from A and B,
and it ends with aio_poll calling aio_context_acquire/release around
every callback.

To do this, this part introduces a thread-safe variant of the common
"walking_xxx++/walking_xxx--" idiom already found in several places
in aio*.c and async.c.
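
The idea behind a thread-safe "walking" counter can be sketched as follows.
This is a deliberately simplified version that takes a mutex on every
operation; it is not the QemuLockCnt from the series (see docs/lockcnt.txt
for that), and the LockCntSketch names are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>

typedef struct LockCntSketch {
    pthread_mutex_t mutex;
    unsigned count;          /* number of active walkers, under mutex */
} LockCntSketch;

static void lockcnt_init_sketch(LockCntSketch *lc)
{
    pthread_mutex_init(&lc->mutex, NULL);
    lc->count = 0;
}

/* "walking_xxx++": call before starting to traverse the list. */
static void lockcnt_inc_sketch(LockCntSketch *lc)
{
    pthread_mutex_lock(&lc->mutex);
    lc->count++;
    pthread_mutex_unlock(&lc->mutex);
}

/*
 * "walking_xxx--": call after traversing the list.  Returns true, with
 * the mutex still held, if the caller was the last walker; the caller
 * may then free the nodes that were only marked as deleted, and must
 * call lockcnt_unlock_sketch() afterwards.  Returns false otherwise.
 */
static bool lockcnt_dec_and_lock_sketch(LockCntSketch *lc)
{
    pthread_mutex_lock(&lc->mutex);
    if (--lc->count == 0) {
        return true;
    }
    pthread_mutex_unlock(&lc->mutex);
    return false;
}

static void lockcnt_unlock_sketch(LockCntSketch *lc)
{
    pthread_mutex_unlock(&lc->mutex);
}
```

A thread that wants to remove a node would take the mutex, free the node
right away if count is zero, and otherwise only mark it as deleted so that
the last walker reaps it.  New walkers block in lockcnt_inc_sketch() while
the reaping is in progress.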

	Status: ready, except that I haven't tested quorum enough

D. FINE-GRAINED BLOCK LAYER CRITICAL SECTIONS
---------------------------------------------

This part is patches 29 to 40.  It explicitly acquires the AioContext in all
callbacks that need it (file descriptors, bottom halves, timers, AIO)
rather than in aio_poll.  This is the first step towards breaking
AioContext into many small locks, and hence the last prerequisite for
a real multiqueue QEMU block layer.
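
The shape of this change can be sketched with a plain pthread recursive
mutex standing in for the AioContext lock.  All names below (CtxSketch,
io_done_cb_sketch, dispatch_sketch, ...) are hypothetical and none of this
is code from the series.

```c
#define _XOPEN_SOURCE 700   /* for PTHREAD_MUTEX_RECURSIVE */
#include <pthread.h>

/* Hypothetical context: a recursive lock plus some state it protects. */
typedef struct CtxSketch {
    pthread_mutex_t lock;
    int completed;
} CtxSketch;

static int ctx_init_sketch(CtxSketch *ctx)
{
    pthread_mutexattr_t attr;
    int ret;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    ret = pthread_mutex_init(&ctx->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    ctx->completed = 0;
    return ret;
}

static void ctx_acquire_sketch(CtxSketch *ctx)
{
    pthread_mutex_lock(&ctx->lock);
}

static void ctx_release_sketch(CtxSketch *ctx)
{
    pthread_mutex_unlock(&ctx->lock);
}

/* After the change: the callback takes the lock itself. */
static void io_done_cb_sketch(void *opaque)
{
    CtxSketch *ctx = opaque;

    ctx_acquire_sketch(ctx);
    ctx->completed++;
    ctx_release_sketch(ctx);
}

/* After the change: the event loop dispatches without holding the lock. */
static void dispatch_sketch(void (*cb)(void *), void *opaque)
{
    cb(opaque);
}
```

Because the lock is recursive, a callback written this way still works
when a caller already holds the context.  Once every callback locks for
itself, each acquire/release pair can later be narrowed to a smaller lock
without touching the event loop.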

This has the biggest patches and, unlike patch 3, they are very hard
to split further.

At the end, starting with patch 37, a few patches make some small
optimizations to aio_poll that are now possible, and the last one makes
virtio-scsi dataplane _almost_ thread-safe.

	Status: ready

If you've read this far and didn't get bored, you're more than qualified
as a reviewer. :)

Paolo

Paolo Bonzini (40):
  9pfs: allocate pdus with g_malloc/g_free
  virtio: move VirtQueueElement at the beginning of the structs
  virtio: move allocation to virtqueue_pop/vring_pop
  virtio: introduce qemu_get/put_virtqueue_element
  virtio: read/write the VirtQueueElement a field at a time
  virtio: introduce virtqueue_alloc_element
  virtio: slim down allocation of VirtQueueElements
  vring: slim down allocation of VirtQueueElements
  vring: make vring_enable_notification return void
  virtio: combine the read of a descriptor
  virtio: add AioContext-specific function for host notifiers
  virtio: export vring_notify as virtio_should_notify
  virtio-blk: fix "disabled data plane" mode
  virtio-blk: do not use vring in dataplane
  virtio-scsi: do not use vring in dataplane
  vring: remove
  iothread: release AioContext around aio_poll
  qemu-thread: introduce QemuRecMutex
  aio: convert from RFifoLock to QemuRecMutex
  aio: rename bh_lock to list_lock
  qemu-thread: introduce QemuLockCnt
  aio: make ctx->list_lock a QemuLockCnt, subsuming ctx->walking_bh
  qemu-thread: optimize QemuLockCnt with futexes on Linux
  aio: tweak walking in dispatch phase
  aio-posix: remove walking_handlers, protecting AioHandler list with list_lock
  aio-win32: remove walking_handlers, protecting AioHandler list with list_lock
  aio: document locking
  aio: push aio_context_acquire/release down to dispatching
  quorum: use atomics for rewrite_count
  quorum: split quorum_fifo_aio_cb from quorum_aio_cb
  qed: introduce qed_aio_start_io and qed_aio_next_io_cb
  block: explicitly acquire aiocontext in callbacks that need it
  block: explicitly acquire aiocontext in bottom halves that need it
  block: explicitly acquire aiocontext in timers that need it
  block: explicitly acquire aiocontext in aio callbacks that need it
  aio: update locking documentation
  async: optimize aio_bh_poll
  aio-posix: partially inline aio_dispatch into aio_poll
  async: remove unnecessary inc/dec pairs
  dma-helpers: avoid lock inversion with AioContext

 aio-posix.c                                   | 108 +++---
 aio-win32.c                                   | 111 +++---
 async.c                                       |  76 ++--
 block/blkverify.c                             |   6 +-
 block/curl.c                                  |  43 ++-
 block/gluster.c                               |   2 +
 block/io.c                                    |   7 +
 block/iscsi.c                                 |  10 +
 block/linux-aio.c                             |  14 +-
 block/mirror.c                                |  12 +-
 block/nbd-client.c                            |  14 +-
 block/nfs.c                                   |  10 +
 block/qed-cluster.c                           |   2 +
 block/qed-table.c                             |  12 +-
 block/qed.c                                   | 112 ++++--
 block/qed.h                                   |   3 +
 block/quorum.c                                |  60 +--
 block/sheepdog.c                              |  29 +-
 block/ssh.c                                   |  47 ++-
 block/throttle-groups.c                       |   2 +
 block/win32-aio.c                             |   8 +-
 dma-helpers.c                                 |  27 +-
 docs/lockcnt.txt                              | 342 +++++++++++++++++
 docs/multiple-iothreads.txt                   |  95 ++++-
 hw/9pfs/virtio-9p-device.c                    |   7 +-
 hw/9pfs/virtio-9p.c                           |  25 +-
 hw/9pfs/virtio-9p.h                           |   4 +-
 hw/block/dataplane/virtio-blk.c               | 131 +------
 hw/block/dataplane/virtio-blk.h               |   1 +
 hw/block/virtio-blk.c                         |  92 ++---
 hw/char/virtio-serial-bus.c                   |  78 ++--
 hw/display/virtio-gpu.c                       |  25 +-
 hw/input/virtio-input.c                       |  24 +-
 hw/net/virtio-net.c                           |  69 ++--
 hw/scsi/scsi-bus.c                            |   2 +
 hw/scsi/scsi-disk.c                           |  18 +
 hw/scsi/scsi-generic.c                        |  20 +-
 hw/scsi/virtio-scsi-dataplane.c               | 197 ++--------
 hw/scsi/virtio-scsi.c                         |  82 ++--
 hw/virtio/Makefile.objs                       |   1 -
 hw/virtio/dataplane/Makefile.objs             |   1 -
 hw/virtio/dataplane/vring.c                   | 526 --------------------------
 hw/virtio/virtio-balloon.c                    |  22 +-
 hw/virtio/virtio-rng.c                        |  10 +-
 hw/virtio/virtio.c                            | 323 +++++++++++-----
 include/block/aio.h                           |  38 +-
 include/hw/virtio/dataplane/vring-accessors.h |  75 ----
 include/hw/virtio/dataplane/vring.h           |  51 ---
 include/hw/virtio/virtio-balloon.h            |   2 +-
 include/hw/virtio/virtio-blk.h                |   9 +-
 include/hw/virtio/virtio-net.h                |   2 +-
 include/hw/virtio/virtio-scsi.h               |  36 +-
 include/hw/virtio/virtio-serial.h             |   2 +-
 include/hw/virtio/virtio.h                    |  16 +-
 include/qemu/futex.h                          |  36 ++
 include/qemu/rfifolock.h                      |  54 ---
 include/qemu/thread-posix.h                   |   6 +
 include/qemu/thread-win32.h                   |  10 +
 include/qemu/thread.h                         |  23 ++
 iothread.c                                    |  11 +-
 nbd.c                                         |   4 +
 tests/.gitignore                              |   1 -
 tests/Makefile                                |   2 -
 tests/test-aio.c                              |  19 +-
 tests/test-rfifolock.c                        |  91 -----
 thread-pool.c                                 |  14 +-
 trace-events                                  |  13 +-
 util/Makefile.objs                            |   2 +-
 util/lockcnt.c                                | 404 ++++++++++++++++++++
 util/qemu-coroutine-sleep.c                   |   5 +
 util/qemu-thread-posix.c                      |  38 +-
 util/qemu-thread-win32.c                      |  25 ++
 util/rfifolock.c                              |  78 ----
 73 files changed, 2000 insertions(+), 1877 deletions(-)
 create mode 100644 docs/lockcnt.txt
 delete mode 100644 hw/virtio/dataplane/Makefile.objs
 delete mode 100644 hw/virtio/dataplane/vring.c
 delete mode 100644 include/hw/virtio/dataplane/vring-accessors.h
 delete mode 100644 include/hw/virtio/dataplane/vring.h
 create mode 100644 include/qemu/futex.h
 delete mode 100644 include/qemu/rfifolock.h
 delete mode 100644 tests/test-rfifolock.c
 create mode 100644 util/lockcnt.c
 delete mode 100644 util/rfifolock.c

-- 
1.8.3.1

Thread overview: 56+ messages
2015-11-24 18:00 [Qemu-devel] [RFC PATCH 00/40] Sneak peek of virtio and dataplane changes for 2.6 Paolo Bonzini
2015-11-24 18:00 ` [Qemu-devel] [PATCH 01/40] 9pfs: allocate pdus with g_malloc/g_free Paolo Bonzini
2015-11-30  2:27   ` Fam Zheng
2015-11-30  2:33     ` Fam Zheng
2015-11-30 16:35   ` Greg Kurz
2015-11-24 18:00 ` [Qemu-devel] [PATCH 02/40] virtio: move VirtQueueElement at the beginning of the structs Paolo Bonzini
2015-11-24 18:00 ` [Qemu-devel] [PATCH 03/40] virtio: move allocation to virtqueue_pop/vring_pop Paolo Bonzini
2015-11-30  3:00   ` Fam Zheng
2015-11-24 18:00 ` [Qemu-devel] [PATCH 04/40] virtio: introduce qemu_get/put_virtqueue_element Paolo Bonzini
2015-11-24 18:00 ` [Qemu-devel] [PATCH 05/40] virtio: read/write the VirtQueueElement a field at a time Paolo Bonzini
2015-11-30  9:47   ` Fam Zheng
2015-11-30 10:37     ` Paolo Bonzini
2015-11-24 18:00 ` [Qemu-devel] [PATCH 06/40] virtio: introduce virtqueue_alloc_element Paolo Bonzini
2015-11-24 18:00 ` [Qemu-devel] [PATCH 07/40] virtio: slim down allocation of VirtQueueElements Paolo Bonzini
2015-11-30  3:24   ` Fam Zheng
2015-11-30  8:36     ` Paolo Bonzini
2015-11-24 18:00 ` [Qemu-devel] [PATCH 08/40] vring: " Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 09/40] vring: make vring_enable_notification return void Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 10/40] virtio: combine the read of a descriptor Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 11/40] virtio: add AioContext-specific function for host notifiers Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 12/40] virtio: export vring_notify as virtio_should_notify Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 13/40] virtio-blk: fix "disabled data plane" mode Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 14/40] virtio-blk: do not use vring in dataplane Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 15/40] virtio-scsi: " Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 16/40] vring: remove Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 17/40] iothread: release AioContext around aio_poll Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 18/40] qemu-thread: introduce QemuRecMutex Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 19/40] aio: convert from RFifoLock to QemuRecMutex Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 20/40] aio: rename bh_lock to list_lock Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 21/40] qemu-thread: introduce QemuLockCnt Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 22/40] aio: make ctx->list_lock a QemuLockCnt, subsuming ctx->walking_bh Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 23/40] qemu-thread: optimize QemuLockCnt with futexes on Linux Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 24/40] aio: tweak walking in dispatch phase Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 25/40] aio-posix: remove walking_handlers, protecting AioHandler list with list_lock Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 26/40] aio-win32: " Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 27/40] aio: document locking Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 28/40] aio: push aio_context_acquire/release down to dispatching Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 29/40] quorum: use atomics for rewrite_count Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 30/40] quorum: split quorum_fifo_aio_cb from quorum_aio_cb Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 31/40] qed: introduce qed_aio_start_io and qed_aio_next_io_cb Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 32/40] block: explicitly acquire aiocontext in callbacks that need it Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 33/40] block: explicitly acquire aiocontext in bottom halves " Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 34/40] block: explicitly acquire aiocontext in timers " Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 35/40] block: explicitly acquire aiocontext in aio callbacks " Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 36/40] aio: update locking documentation Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 37/40] async: optimize aio_bh_poll Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 38/40] aio-posix: partially inline aio_dispatch into aio_poll Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 39/40] async: remove unnecessary inc/dec pairs Paolo Bonzini
2015-11-24 18:01 ` [Qemu-devel] [PATCH 40/40] dma-helpers: avoid lock inversion with AioContext Paolo Bonzini
2015-11-26  9:36 ` [Qemu-devel] [RFC PATCH 00/40] Sneak peek of virtio and dataplane changes for 2.6 Christian Borntraeger
2015-11-26  9:41   ` Christian Borntraeger
2015-11-26 10:39   ` Paolo Bonzini
2015-12-09 20:35     ` Paolo Bonzini
2015-12-16 12:54       ` Christian Borntraeger
2015-12-16 14:40         ` Christian Borntraeger
2015-12-16 17:42         ` Paolo Bonzini
