All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [Qemu-devel] [PATCH 08/22] iscsi: implement .bdrv_detach/attach_aio_context()
@ 2014-05-01 22:47 Peter Lieven
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Lieven @ 2014-05-01 22:47 UTC (permalink / raw)
  To: pl
  Cc: Kevin Wolf, Ronnie Sahlberg, Shergill, Gurinder, qemu-devel,
	Stefan Hajnoczi, Paolo Bonzini, Vinod, Chegu

Stefan Hajnoczi wrote:
> Drop the assumption that we're using the main AioContext for Linux
> AIO.  Convert qemu_aio_set_fd_handler() to aio_set_fd_handler() and
> timer_new_ms() to aio_timer_new().
>
> The .bdrv_detach/attach_aio_context() interfaces also need to be
> implemented to move the fd and timer from the old to the new AioContext.
>
> Cc: Peter Lieven <pl@kamp.de>
> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/iscsi.c | 79
> +++++++++++++++++++++++++++++++++++++++++------------------
>  1 file changed, 55 insertions(+), 24 deletions(-)
>
> diff --git a/block/iscsi.c b/block/iscsi.c
> index a30202b..81e3ebd 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -47,6 +47,7 @@
>
>  typedef struct IscsiLun {
>      struct iscsi_context *iscsi;
> +    AioContext *aio_context;
>      int lun;
>      enum scsi_inquiry_peripheral_device_type type;
>      int block_size;
> @@ -69,6 +70,7 @@ typedef struct IscsiTask {
>      struct scsi_task *task;
>      Coroutine *co;
>      QEMUBH *bh;
> +    AioContext *aio_context;
>  } IscsiTask;
>
>  typedef struct IscsiAIOCB {
> @@ -120,7 +122,7 @@ iscsi_schedule_bh(IscsiAIOCB *acb)
>      if (acb->bh) {
>          return;
>      }
> -    acb->bh = qemu_bh_new(iscsi_bh_cb, acb);
> +    acb->bh = aio_bh_new(acb->iscsilun->aio_context, iscsi_bh_cb, acb);
>      qemu_bh_schedule(acb->bh);
>  }
>
> @@ -156,7 +158,7 @@ iscsi_co_generic_cb(struct iscsi_context *iscsi, int
> status,
>
>  out:
>      if (iTask->co) {
> -        iTask->bh = qemu_bh_new(iscsi_co_generic_bh_cb, iTask);
> +        iTask->bh = aio_bh_new(iTask->aio_context,
> iscsi_co_generic_bh_cb, iTask);
>          qemu_bh_schedule(iTask->bh);
>      }
>  }
> @@ -164,8 +166,9 @@ out:
>  static void iscsi_co_init_iscsitask(IscsiLun *iscsilun, struct IscsiTask
> *iTask)
>  {
>      *iTask = (struct IscsiTask) {
> -        .co         = qemu_coroutine_self(),
> -        .retries    = ISCSI_CMD_RETRIES,
> +        .co             = qemu_coroutine_self(),%

^ permalink raw reply	[flat|nested] 9+ messages in thread
* [Qemu-devel] [PATCH 00/22] dataplane: use QEMU block layer
@ 2014-05-01 14:54 Stefan Hajnoczi
  2014-05-01 14:54 ` [Qemu-devel] [PATCH 08/22] iscsi: implement .bdrv_detach/attach_aio_context() Stefan Hajnoczi
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2014-05-01 14:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Paolo Bonzini, Shergill, Gurinder, Vinod, Chegu,
	Stefan Hajnoczi

This patch series switches virtio-blk data-plane from a custom Linux AIO
request queue to the QEMU block layer.  The previous "raw files only"
limitation is lifted.  All image formats and protocols can now be used with
virtio-blk data-plane.

How to review this series
-------------------------
I CCed the maintainer of each block driver that I modified.  You probably don't
need to review the entire series, just your patch.

>From now on fd handlers, timers, BHs, and event loop wait must explicitly use
BlockDriverState's AioContext instead of the main loop.  Use
bdrv_get_aio_context(bs) to get the AioContext.  The following function calls
need to be converted:

 * qemu_aio_set_fd_handler() -> aio_set_fd_handler()
 * timer_new*() -> aio_timer_new()
 * qemu_bh_new() -> aio_bh_new()
 * qemu_aio_wait() -> aio_poll(aio_context, true)

For simple block drivers this modification suffices and it is now safe to use
outside the QEMU global mutex.

Block drivers that keep fd handlers, timers, or BHs registered when requests
have been drained need a little bit more work.  Examples of this are network
block drivers with keepalive timers, like iSCSI.

This series adds a new bdrv_set_aio_context(bs, aio_context) function that
moves a BlockDriverState into a new AioContext.  This function calls the block
driver's optional .bdrv_detach_aio_context() and .bdrv_attach_aio_context()
functions.  Implement detach/attach to move the fd handlers, timers, or BHs to
the new AioContext.

Finally, block drivers that manage their own child nodes also need to
implement detach/attach because the generic block layer doesn't know about
their children.  Both ->file and ->backing_hd are automatically taken care of
but blkverify, quorum, and VMDK need to manually propagate detach/attach to
their children.

I have audited and modified all block drivers.  Block driver maintainers,
please check I did it correctly and didn't break your code.

Background
----------
The block layer is currently tied to the QEMU main loop for fd handlers, timer
callbacks, and BHs.  This means that even on hosts with many cores, parts of
block I/O processing happen in one thread and depend on the QEMU global mutex.

virtio-blk data-plane has shown that 1,000,000 IOPS is achievable if we use
additional threads that are not under the QEMU global mutex.

It is necessary to make the QEMU block layer aware that there may be more than
one event loop.  This way BlockDriverState can be used from a thread without
contention on the QEMU global mutex.

This series builds on the aio_context_acquire/release() interface that allows a
thread to temporarily grab an AioContext.  We add bdrv_set_aio_context(bs,
aio_context) for changing which AioContext a BlockDriverState uses.

The final patches convert virtio-blk data-plane to use the QEMU block layer and
let the BlockDriverState run in the IOThread AioContext.

What's next?
------------
I have already made block I/O throttling work in another AioContext and will
send the series out next week.

In order to keep this series reviewable, I'm holding back those patches for
now.  One could say, "throttling" them.

Thank you, thank you, I'll be here all night!

Stefan Hajnoczi (22):
  block: use BlockDriverState AioContext
  block: acquire AioContext in bdrv_close_all()
  block: add bdrv_set_aio_context()
  blkdebug: use BlockDriverState's AioContext
  blkverify: implement .bdrv_detach/attach_aio_context()
  curl: implement .bdrv_detach/attach_aio_context()
  gluster: use BlockDriverState's AioContext
  iscsi: implement .bdrv_detach/attach_aio_context()
  nbd: implement .bdrv_detach/attach_aio_context()
  nfs: implement .bdrv_detach/attach_aio_context()
  qed: use BlockDriverState's AioContext
  quorum: implement .bdrv_detach/attach_aio_context()
  block/raw-posix: implement .bdrv_detach/attach_aio_context()
  block/linux-aio: fix memory and fd leak
  rbd: use BlockDriverState's AioContext
  sheepdog: implement .bdrv_detach/attach_aio_context()
  ssh: use BlockDriverState's AioContext
  vmdk: implement .bdrv_detach/attach_aio_context()
  dataplane: use the QEMU block layer for I/O
  dataplane: delete IOQueue since it is no longer used
  dataplane: implement async flush
  raw-posix: drop raw_get_aio_fd() since it is no longer used

 block.c                          |  88 +++++++++++++--
 block/blkdebug.c                 |   2 +-
 block/blkverify.c                |  47 +++++---
 block/curl.c                     | 194 +++++++++++++++++++-------------
 block/gluster.c                  |   7 +-
 block/iscsi.c                    |  79 +++++++++----
 block/linux-aio.c                |  24 +++-
 block/nbd-client.c               |  24 +++-
 block/nbd-client.h               |   4 +
 block/nbd.c                      |  87 +++++++++------
 block/nfs.c                      |  80 ++++++++++----
 block/qed-table.c                |   8 +-
 block/qed.c                      |  35 +++++-
 block/quorum.c                   |  48 ++++++--
 block/raw-aio.h                  |   3 +
 block/raw-posix.c                |  82 ++++++++------
 block/rbd.c                      |   5 +-
 block/sheepdog.c                 | 118 +++++++++++++-------
 block/ssh.c                      |  36 +++---
 block/vmdk.c                     |  23 ++++
 hw/block/dataplane/Makefile.objs |   2 +-
 hw/block/dataplane/ioq.c         | 117 --------------------
 hw/block/dataplane/ioq.h         |  57 ----------
 hw/block/dataplane/virtio-blk.c  | 233 +++++++++++++++------------------------
 include/block/block.h            |  20 ++--
 include/block/block_int.h        |  36 ++++++
 26 files changed, 829 insertions(+), 630 deletions(-)
 delete mode 100644 hw/block/dataplane/ioq.c
 delete mode 100644 hw/block/dataplane/ioq.h

-- 
1.9.0

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-05-08 15:45 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-01 22:47 [Qemu-devel] [PATCH 08/22] iscsi: implement .bdrv_detach/attach_aio_context() Peter Lieven
  -- strict thread matches above, loose matches on Subject: below --
2014-05-01 14:54 [Qemu-devel] [PATCH 00/22] dataplane: use QEMU block layer Stefan Hajnoczi
2014-05-01 14:54 ` [Qemu-devel] [PATCH 08/22] iscsi: implement .bdrv_detach/attach_aio_context() Stefan Hajnoczi
2014-05-01 22:39   ` Peter Lieven
2014-05-07 10:07     ` Stefan Hajnoczi
2014-05-07 10:29       ` Paolo Bonzini
2014-05-07 14:09         ` Peter Lieven
2014-05-08 11:33           ` Stefan Hajnoczi
2014-05-08 14:52             ` ronnie sahlberg
2014-05-08 15:45               ` Peter Lieven

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.