* [Qemu-devel] [PATCH v1 0/3] linux-aio: introduce submit I/O at batch
@ 2014-06-30 15:47 Ming Lei
  2014-06-30 15:47 ` [Qemu-devel] [PATCH v1 1/3] block: block: introduce bdrv_io_plug() and bdrv_io_unplug() Ming Lei
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Ming Lei @ 2014-06-30 15:47 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, Michael S. Tsirkin

Hi,

Commit 580b6b2aa2 (dataplane: use the QEMU block layer for I/O)
introduced a ~40% throughput regression on virtio-blk dataplane;
one of the causes is that batched I/O submission was removed.

This patchset reintroduces that mechanism in the block layer; at
least linux-aio can benefit from it.

With these patches, throughput on virtio-blk dataplane improves
significantly; see the data in the commit log of patch 3/3.

It should be possible to apply the batch mechanism to other devices
(such as virtio-scsi) too.
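
For reference, the intended usage pattern in a device's request
handler looks roughly like this (just a sketch; the request helpers
are placeholders, see patch 3/3 for the real code):

    bdrv_io_plug(bs);
    while (request_pending()) {
        /* with linux-aio, requests are only queued here */
        handle_one_request();    /* ends up in bdrv_aio_readv/writev() */
    }
    /* submit everything queued above in a single io_submit() */
    bdrv_io_unplug(bs);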

V1:
	- move the I/O queuing code into linux-aio.c, as suggested by Paolo


Thanks,
--
Ming Lei

* [Qemu-devel] [PATCH v1 1/3] block: block: introduce bdrv_io_plug() and bdrv_io_unplug()
  2014-06-30 15:47 [Qemu-devel] [PATCH v1 0/3] linux-aio: introduce submit I/O at batch Ming Lei
@ 2014-06-30 15:47 ` Ming Lei
  2014-06-30 16:10   ` Paolo Bonzini
  2014-06-30 15:47 ` [Qemu-devel] [PATCH v1 2/3] linux-aio: implement io plug and unplug Ming Lei
  2014-06-30 15:47 ` [Qemu-devel] [PATCH v1 3/3] dataplane: submit I/O at batch Ming Lei
  2 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2014-06-30 15:47 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi
  Cc: Kevin Wolf, Ming Lei, Ming Lei, Fam Zheng, Michael S. Tsirkin

From: Ming Lei <tom.leiming@gmail.com>

This patch introduces these two APIs so that the following
patches can queue I/O requests and submit them in a batch
to improve I/O performance.

Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 block.c                   |   22 ++++++++++++++++++++++
 include/block/block.h     |    3 +++
 include/block/block_int.h |    4 ++++
 3 files changed, 29 insertions(+)

diff --git a/block.c b/block.c
index 217f523..2b4ec5b 100644
--- a/block.c
+++ b/block.c
@@ -1910,6 +1910,7 @@ void bdrv_drain_all(void)
             bool bs_busy;
 
             aio_context_acquire(aio_context);
+            bdrv_io_unplug(bs);
             bdrv_start_throttled_reqs(bs);
             bs_busy = bdrv_requests_pending(bs);
             bs_busy |= aio_poll(aio_context, bs_busy);
@@ -5774,3 +5775,24 @@ bool bdrv_is_first_non_filter(BlockDriverState *candidate)
 
     return false;
 }
+
+void bdrv_io_plug(BlockDriverState *bs)
+{
+    BlockDriver *drv = bs->drv;
+    if (drv && drv->bdrv_io_plug) {
+        drv->bdrv_io_plug(bs);
+    } else if (bs->file) {
+        bdrv_io_plug(bs->file);
+    }
+}
+
+int bdrv_io_unplug(BlockDriverState *bs)
+{
+    BlockDriver *drv = bs->drv;
+    if (drv && drv->bdrv_io_unplug) {
+        return drv->bdrv_io_unplug(bs);
+    } else if (bs->file) {
+        return bdrv_io_unplug(bs->file);
+    }
+    return 0;
+}
diff --git a/include/block/block.h b/include/block/block.h
index d0baf4f..ccced61 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -578,4 +578,7 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs);
  */
 void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context);
 
+void bdrv_io_plug(BlockDriverState *bs);
+int bdrv_io_unplug(BlockDriverState *bs);
+
 #endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 715c761..b5412fa 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -257,6 +257,10 @@ struct BlockDriver {
     void (*bdrv_attach_aio_context)(BlockDriverState *bs,
                                     AioContext *new_context);
 
+    /* io queue for linux-aio */
+    void (*bdrv_io_plug)(BlockDriverState *bs);
+    int (*bdrv_io_unplug)(BlockDriverState *bs);
+
     QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
1.7.9.5

* [Qemu-devel] [PATCH v1 2/3] linux-aio: implement io plug and unplug
  2014-06-30 15:47 [Qemu-devel] [PATCH v1 0/3] linux-aio: introduce submit I/O at batch Ming Lei
  2014-06-30 15:47 ` [Qemu-devel] [PATCH v1 1/3] block: block: introduce bdrv_io_plug() and bdrv_io_unplug() Ming Lei
@ 2014-06-30 15:47 ` Ming Lei
  2014-06-30 17:31   ` Paolo Bonzini
  2014-06-30 15:47 ` [Qemu-devel] [PATCH v1 3/3] dataplane: submit I/O at batch Ming Lei
  2 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2014-06-30 15:47 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi
  Cc: Kevin Wolf, Ming Lei, Ming Lei, Fam Zheng, Michael S. Tsirkin

From: Ming Lei <tom.leiming@gmail.com>

This patch implements the .bdrv_io_plug and .bdrv_io_unplug
callbacks for the linux-aio block driver, so that batched
I/O submission is supported with linux-aio.

Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 block/linux-aio.c |   85 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 block/raw-aio.h   |    2 ++
 block/raw-posix.c |   31 +++++++++++++++++++
 3 files changed, 116 insertions(+), 2 deletions(-)

diff --git a/block/linux-aio.c b/block/linux-aio.c
index f0a2c08..7794162 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -25,6 +25,8 @@
  */
 #define MAX_EVENTS 128
 
+#define MAX_QUEUED_IO  128
+
 struct qemu_laiocb {
     BlockDriverAIOCB common;
     struct qemu_laio_state *ctx;
@@ -36,11 +38,84 @@ struct qemu_laiocb {
     QLIST_ENTRY(qemu_laiocb) node;
 };
 
+struct laio_queue {
+    struct iocb *iocbs[MAX_QUEUED_IO];
+    bool plugged;
+    unsigned int size;
+    unsigned int idx;
+};
+
 struct qemu_laio_state {
     io_context_t ctx;
     EventNotifier e;
+
+    /* io queue for submit at batch */
+    struct laio_queue io_q;
 };
 
+static void ioq_init(struct laio_queue *io_q)
+{
+    io_q->size = MAX_QUEUED_IO;
+    io_q->idx = 0;
+    io_q->plugged = false;
+}
+
+static int ioq_submit(struct qemu_laio_state *s)
+{
+    int ret, i, len = s->io_q.idx;
+
+    while (1) {
+        ret = io_submit(s->ctx, len, s->io_q.iocbs);
+        if (ret != -EAGAIN)
+            break;
+    }
+
+    /* empty io queue */
+    s->io_q.idx = 0;
+
+    if (ret >= 0)
+      return 0;
+
+    for (i = 0; i < len; i++) {
+        struct qemu_laiocb *laiocb =
+            container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb);
+        qemu_aio_release(laiocb);
+    }
+    return ret;
+}
+
+static void ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
+{
+    unsigned int idx = s->io_q.idx;
+
+    s->io_q.iocbs[idx++] = iocb;
+    s->io_q.idx = idx;
+
+    /* submit immediately if queue is full */
+    if (idx == s->io_q.size)
+        ioq_submit(s);
+}
+
+void laio_io_plug(BlockDriverState *bs, void *aio_ctx)
+{
+    struct qemu_laio_state *s = aio_ctx;
+
+    s->io_q.plugged = true;
+}
+
+int laio_io_unplug(BlockDriverState *bs, void *aio_ctx)
+{
+    struct qemu_laio_state *s = aio_ctx;
+    int ret = 0;
+
+    if (s->io_q.idx > 0) {
+        ret = ioq_submit(s);
+    }
+    s->io_q.plugged = false;
+
+    return ret;
+}
+
 static inline ssize_t io_event_ret(struct io_event *ev)
 {
     return (ssize_t)(((uint64_t)ev->res2 << 32) | ev->res);
@@ -168,8 +243,12 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
     }
     io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
 
-    if (io_submit(s->ctx, 1, &iocbs) < 0)
-        goto out_free_aiocb;
+    if (!s->io_q.plugged) {
+        if (io_submit(s->ctx, 1, &iocbs) < 0)
+            goto out_free_aiocb;
+    } else {
+        ioq_enqueue(s, iocbs);
+    }
     return &laiocb->common;
 
 out_free_aiocb:
@@ -204,6 +283,8 @@ void *laio_init(void)
         goto out_close_efd;
     }
 
+    ioq_init(&s->io_q);
+
     return s;
 
 out_close_efd:
diff --git a/block/raw-aio.h b/block/raw-aio.h
index 8cf084e..ed47c3d 100644
--- a/block/raw-aio.h
+++ b/block/raw-aio.h
@@ -40,6 +40,8 @@ BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
         BlockDriverCompletionFunc *cb, void *opaque, int type);
 void laio_detach_aio_context(void *s, AioContext *old_context);
 void laio_attach_aio_context(void *s, AioContext *new_context);
+void laio_io_plug(BlockDriverState *bs, void *aio_ctx);
+int laio_io_unplug(BlockDriverState *bs, void *aio_ctx);
 #endif
 
 #ifdef _WIN32
diff --git a/block/raw-posix.c b/block/raw-posix.c
index dacf4fb..ef64a6d 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -1054,6 +1054,27 @@ static BlockDriverAIOCB *raw_aio_submit(BlockDriverState *bs,
                        cb, opaque, type);
 }
 
+static void raw_aio_plug(BlockDriverState *bs)
+{
+#ifdef CONFIG_LINUX_AIO
+    BDRVRawState *s = bs->opaque;
+    if (s->use_aio) {
+        laio_io_plug(bs, s->aio_ctx);
+    }
+#endif
+}
+
+static int raw_aio_unplug(BlockDriverState *bs)
+{
+#ifdef CONFIG_LINUX_AIO
+    BDRVRawState *s = bs->opaque;
+    if (s->use_aio) {
+        return laio_io_unplug(bs, s->aio_ctx);
+    }
+#endif
+    return 0;
+}
+
 static BlockDriverAIOCB *raw_aio_readv(BlockDriverState *bs,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockDriverCompletionFunc *cb, void *opaque)
@@ -1503,6 +1524,8 @@ static BlockDriver bdrv_file = {
     .bdrv_aio_flush = raw_aio_flush,
     .bdrv_aio_discard = raw_aio_discard,
     .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
 
     .bdrv_truncate = raw_truncate,
     .bdrv_getlength = raw_getlength,
@@ -1902,6 +1925,8 @@ static BlockDriver bdrv_host_device = {
     .bdrv_aio_flush	= raw_aio_flush,
     .bdrv_aio_discard   = hdev_aio_discard,
     .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
 
     .bdrv_truncate      = raw_truncate,
     .bdrv_getlength	= raw_getlength,
@@ -2047,6 +2072,8 @@ static BlockDriver bdrv_host_floppy = {
     .bdrv_aio_writev    = raw_aio_writev,
     .bdrv_aio_flush	= raw_aio_flush,
     .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
 
     .bdrv_truncate      = raw_truncate,
     .bdrv_getlength      = raw_getlength,
@@ -2175,6 +2202,8 @@ static BlockDriver bdrv_host_cdrom = {
     .bdrv_aio_writev    = raw_aio_writev,
     .bdrv_aio_flush	= raw_aio_flush,
     .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
 
     .bdrv_truncate      = raw_truncate,
     .bdrv_getlength      = raw_getlength,
@@ -2309,6 +2338,8 @@ static BlockDriver bdrv_host_cdrom = {
     .bdrv_aio_writev    = raw_aio_writev,
     .bdrv_aio_flush	= raw_aio_flush,
     .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
 
     .bdrv_truncate      = raw_truncate,
     .bdrv_getlength      = raw_getlength,
-- 
1.7.9.5

* [Qemu-devel] [PATCH v1 3/3] dataplane: submit I/O at batch
  2014-06-30 15:47 [Qemu-devel] [PATCH v1 0/3] linux-aio: introduce submit I/O at batch Ming Lei
  2014-06-30 15:47 ` [Qemu-devel] [PATCH v1 1/3] block: block: introduce bdrv_io_plug() and bdrv_io_unplug() Ming Lei
  2014-06-30 15:47 ` [Qemu-devel] [PATCH v1 2/3] linux-aio: implement io plug and unplug Ming Lei
@ 2014-06-30 15:47 ` Ming Lei
  2014-06-30 16:11   ` Paolo Bonzini
  2 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2014-06-30 15:47 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel, Paolo Bonzini, Stefan Hajnoczi
  Cc: Kevin Wolf, Ming Lei, Fam Zheng, Michael S. Tsirkin

Before commit 580b6b2aa2 (dataplane: use the QEMU block
layer for I/O), virtio-blk dataplane submitted block I/O
in batches.

Commit 580b6b2aa2 replaced the custom Linux AIO
implementation (including I/O batching) with the QEMU block
layer, but it caused a ~40% throughput regression on
virtio-blk; removing batched I/O submission is one of the
causes.

This patch applies the newly introduced bdrv_io_plug() and
bdrv_io_unplug() interfaces to support batched I/O
submission in the QEMU block layer; in my test, the change
improves throughput by ~30% with 'aio=native'.

My fio test script:

	[global]
	direct=1
	size=4G
	bsrange=4k-4k
	timeout=40
	numjobs=4
	ioengine=libaio
	iodepth=64
	filename=/dev/vdc
	group_reporting=1

	[f]
	rw=randread

Results on one of my small machines (host: x86_64, 2 cores, 4 threads; guest: 4 cores):
	- qemu master: 59K IOPS
	- qemu master with these patches: 81K IOPS
	- 2.0.0 release (dataplane using custom Linux AIO): 104K IOPS

Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 hw/block/dataplane/virtio-blk.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index c10b7b7..8fefcce 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -289,6 +289,7 @@ static void handle_notify(EventNotifier *e)
     int ret;
 
     event_notifier_test_and_clear(&s->host_notifier);
+    bdrv_io_plug(s->blk->conf.bs);
     for (;;) {
         /* Disable guest->host notifies to avoid unnecessary vmexits */
         vring_disable_notification(s->vdev, &s->vring);
@@ -322,6 +323,7 @@ static void handle_notify(EventNotifier *e)
             break;
         }
     }
+    bdrv_io_unplug(s->blk->conf.bs);
 }
 
 /* Context: QEMU global mutex held */
-- 
1.7.9.5

* Re: [Qemu-devel] [PATCH v1 1/3] block: block: introduce bdrv_io_plug() and bdrv_io_unplug()
  2014-06-30 15:47 ` [Qemu-devel] [PATCH v1 1/3] block: block: introduce bdrv_io_plug() and bdrv_io_unplug() Ming Lei
@ 2014-06-30 16:10   ` Paolo Bonzini
  2014-06-30 16:15     ` Ming Lei
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2014-06-30 16:10 UTC (permalink / raw)
  To: Ming Lei, Peter Maydell, qemu-devel, Stefan Hajnoczi
  Cc: Kevin Wolf, Ming Lei, Fam Zheng, Michael S. Tsirkin

Il 30/06/2014 17:47, Ming Lei ha scritto:
> [...]
> +int bdrv_io_unplug(BlockDriverState *bs)
> +{
> +    BlockDriver *drv = bs->drv;
> +    if (drv && drv->bdrv_io_unplug) {
> +        return drv->bdrv_io_unplug(bs);
> +    } else if (bs->file) {
> +        return bdrv_io_unplug(bs->file);
> +    }
> +    return 0;

I think this should return void (and that's how you use it in patch 3 
indeed).  If you fix this you can add my Reviewed-by tag.

Paolo

* Re: [Qemu-devel] [PATCH v1 3/3] dataplane: submit I/O at batch
  2014-06-30 15:47 ` [Qemu-devel] [PATCH v1 3/3] dataplane: submit I/O at batch Ming Lei
@ 2014-06-30 16:11   ` Paolo Bonzini
  0 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2014-06-30 16:11 UTC (permalink / raw)
  To: Ming Lei, Peter Maydell, qemu-devel, Stefan Hajnoczi
  Cc: Kevin Wolf, Fam Zheng, Michael S. Tsirkin

Il 30/06/2014 17:47, Ming Lei ha scritto:
> [...]
> diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
> index c10b7b7..8fefcce 100644
> --- a/hw/block/dataplane/virtio-blk.c
> +++ b/hw/block/dataplane/virtio-blk.c
> @@ -289,6 +289,7 @@ static void handle_notify(EventNotifier *e)
>      int ret;
>
>      event_notifier_test_and_clear(&s->host_notifier);
> +    bdrv_io_plug(s->blk->conf.bs);
>      for (;;) {
>          /* Disable guest->host notifies to avoid unnecessary vmexits */
>          vring_disable_notification(s->vdev, &s->vring);
> @@ -322,6 +323,7 @@ static void handle_notify(EventNotifier *e)
>              break;
>          }
>      }
> +    bdrv_io_unplug(s->blk->conf.bs);
>  }
>
>  /* Context: QEMU global mutex held */
>

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

but please add it to the non-dataplane path as well, and to virtio-scsi.
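
For the non-dataplane path I mean something like this in
virtio_blk_handle_output (a sketch from memory; the exact helper
names may differ):

    bdrv_io_plug(s->bs);
    while ((req = virtio_blk_get_request(s))) {
        virtio_blk_handle_request(req, &mrb);
    }
    virtio_submit_multiwrite(s->bs, &mrb);
    bdrv_io_unplug(s->bs);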

Paolo

* Re: [Qemu-devel] [PATCH v1 1/3] block: block: introduce bdrv_io_plug() and bdrv_io_unplug()
  2014-06-30 16:10   ` Paolo Bonzini
@ 2014-06-30 16:15     ` Ming Lei
  2014-06-30 16:18       ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2014-06-30 16:15 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin,
	qemu-devel, Stefan Hajnoczi

On Tue, Jul 1, 2014 at 12:10 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 30/06/2014 17:47, Ming Lei ha scritto:
>
>> [...]
>> +int bdrv_io_unplug(BlockDriverState *bs)
>> +{
>> +    BlockDriver *drv = bs->drv;
>> +    if (drv && drv->bdrv_io_unplug) {
>> +        return drv->bdrv_io_unplug(bs);
>> +    } else if (bs->file) {
>> +        return bdrv_io_unplug(bs->file);
>> +    }
>> +    return 0;
>
>
> I think this should return void (and that's how you use it in patch 3
> indeed).  If you fix this you can add my Reviewed-by tag.

It can be used to trace how many I/Os are submitted in a batch;
otherwise the device can't know this information at all.

Thanks,
--
Ming Lei

* Re: [Qemu-devel] [PATCH v1 1/3] block: block: introduce bdrv_io_plug() and bdrv_io_unplug()
  2014-06-30 16:15     ` Ming Lei
@ 2014-06-30 16:18       ` Paolo Bonzini
  2014-06-30 16:29         ` Ming Lei
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2014-06-30 16:18 UTC (permalink / raw)
  To: Ming Lei
  Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin,
	qemu-devel, Stefan Hajnoczi

Il 30/06/2014 18:15, Ming Lei ha scritto:
>>> >> +int bdrv_io_unplug(BlockDriverState *bs)
>>> >> +{
>>> >> +    BlockDriver *drv = bs->drv;
>>> >> +    if (drv && drv->bdrv_io_unplug) {
>>> >> +        return drv->bdrv_io_unplug(bs);
>>> >> +    } else if (bs->file) {
>>> >> +        return bdrv_io_unplug(bs->file);
>>> >> +    }
>>> >> +    return 0;
>> >
>> >
>> > I think this should return void (and that's how you use it in patch 3
>> > indeed).  If you fix this you can add my Reviewed-by tag.
> It can be used to trace how many I/Os are submitted in a batch;
> otherwise the device can't know this information at all.

Having a return value, however, suggests that bdrv_io_unplug() can fail,
so this should be documented.  For now, I'd prefer to keep it simple.
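
That is, simply (sketch):

    void bdrv_io_unplug(BlockDriverState *bs)
    {
        BlockDriver *drv = bs->drv;
        if (drv && drv->bdrv_io_unplug) {
            drv->bdrv_io_unplug(bs);
        } else if (bs->file) {
            bdrv_io_unplug(bs->file);
        }
    }

with the BlockDriver callback's return type changed to void as well.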

Paolo

* Re: [Qemu-devel] [PATCH v1 1/3] block: block: introduce bdrv_io_plug() and bdrv_io_unplug()
  2014-06-30 16:18       ` Paolo Bonzini
@ 2014-06-30 16:29         ` Ming Lei
  0 siblings, 0 replies; 12+ messages in thread
From: Ming Lei @ 2014-06-30 16:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin,
	qemu-devel, Stefan Hajnoczi

On Tue, Jul 1, 2014 at 12:18 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 30/06/2014 18:15, Ming Lei ha scritto:
>
>>>> >> +int bdrv_io_unplug(BlockDriverState *bs)
>>>> >> +{
>>>> >> +    BlockDriver *drv = bs->drv;
>>>> >> +    if (drv && drv->bdrv_io_unplug) {
>>>> >> +        return drv->bdrv_io_unplug(bs);
>>>> >> +    } else if (bs->file) {
>>>> >> +        return bdrv_io_unplug(bs->file);
>>>> >> +    }
>>>> >> +    return 0;
>>>
>>> >
>>> >
>>> > I think this should return void (and that's how you use it in patch 3
>>> > indeed).  If you fix this you can add my Reviewed-by tag.
>>
>> It can be used to trace how many I/Os are submitted in a batch;
>> otherwise the device can't know this information at all.
>
>
> Having a return value, however, suggests that bdrv_io_unplug() can fail,
> so this should be documented.  For now, I'd prefer to keep it simple.

Fair enough, will change it to void in v2.

Thanks,
--
Ming Lei

* Re: [Qemu-devel] [PATCH v1 2/3] linux-aio: implement io plug and unplug
  2014-06-30 15:47 ` [Qemu-devel] [PATCH v1 2/3] linux-aio: implement io plug and unplug Ming Lei
@ 2014-06-30 17:31   ` Paolo Bonzini
  2014-07-01  1:05     ` Ming Lei
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2014-06-30 17:31 UTC (permalink / raw)
  To: Ming Lei, Peter Maydell, qemu-devel, Stefan Hajnoczi
  Cc: Kevin Wolf, Ming Lei, Fam Zheng, Michael S. Tsirkin

Il 30/06/2014 17:47, Ming Lei ha scritto:
> [...]
> +static int ioq_submit(struct qemu_laio_state *s)
> +{
> +    int ret, i, len = s->io_q.idx;
> +
> +    while (1) {
> +        ret = io_submit(s->ctx, len, s->io_q.iocbs);
> +        if (ret != -EAGAIN)
> +            break;

Busy waiting is not acceptable here (it can be unbounded if, for 
example, an NFS server is on the other side of a network partition). 
You have to add a bottom half to qemu_laio_state that calls ioq_submit, 
and schedule it after calling io_getevents.

If the bottom half is already scheduled and the queue is full, I guess 
there's no other choice than returning -EAGAIN from ioq_enqueue and 
ultimately to the guest.
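
Roughly like this (an untested sketch; it assumes a new QEMUBH field
in qemu_laio_state):

    static void ioq_submit_bh_cb(void *opaque)
    {
        struct qemu_laio_state *s = opaque;

        if (s->io_q.idx > 0) {
            ioq_submit(s);
        }
    }

    /* in laio_attach_aio_context(): */
    s->submit_bh = aio_bh_new(new_context, ioq_submit_bh_cb, s);

    /* in ioq_submit(), instead of the while (1) loop: */
    ret = io_submit(s->ctx, len, s->io_q.iocbs);
    if (ret == -EAGAIN) {
        /* leave the iocbs queued and retry after the next io_getevents */
        qemu_bh_schedule(s->submit_bh);
        return 0;
    }
    s->io_q.idx = 0;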

Paolo

* Re: [Qemu-devel] [PATCH v1 2/3] linux-aio: implement io plug and unplug
  2014-06-30 17:31   ` Paolo Bonzini
@ 2014-07-01  1:05     ` Ming Lei
  2014-07-01  6:16       ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2014-07-01  1:05 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin,
	qemu-devel, Stefan Hajnoczi

On Tue, Jul 1, 2014 at 1:31 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 30/06/2014 17:47, Ming Lei ha scritto:
>
>> [...]
>> +static int ioq_submit(struct qemu_laio_state *s)
>> +{
>> +    int ret, i, len = s->io_q.idx;
>> +
>> +    while (1) {
>> +        ret = io_submit(s->ctx, len, s->io_q.iocbs);
>> +        if (ret != -EAGAIN)
>> +            break;
>
>
> Busy waiting is not acceptable here (it can be unbounded if, for example, an
> NFS server is on the other side of a network partition). You have to add a
> bottom half to qemu_laio_state that calls ioq_submit, and schedule it after
> calling io_getevents.
>
> If the bottom half is already scheduled and the queue is full, I guess
> there's no other choice than returning -EAGAIN from ioq_enqueue and
> ultimately to the guest.

That is a bit complicated. As you mentioned, we are close to the
2.1 release; could we keep it simple and return failure to the guest
after retrying several times? Actually, the previous dataplane handled
this with exit(-1), which is absolutely unfriendly.

Also, other failures should be handled this way too.

If there is no objection, I will update it in v2.

Thanks,
--
Ming Lei

* Re: [Qemu-devel] [PATCH v1 2/3] linux-aio: implement io plug and unplug
  2014-07-01  1:05     ` Ming Lei
@ 2014-07-01  6:16       ` Paolo Bonzini
  0 siblings, 0 replies; 12+ messages in thread
From: Paolo Bonzini @ 2014-07-01  6:16 UTC (permalink / raw)
  To: Ming Lei
  Cc: Kevin Wolf, Peter Maydell, Fam Zheng, Michael S. Tsirkin,
	qemu-devel, Stefan Hajnoczi

Il 01/07/2014 03:05, Ming Lei ha scritto:
>> > Busy waiting is not acceptable here (it can be unbounded if, for example, an
>> > NFS server is on the other side of a network partition). You have to add a
>> > bottom half to qemu_laio_state that calls ioq_submit, and schedule it after
>> > calling io_getevents.
>> >
>> > If the bottom half is already scheduled and the queue is full, I guess
>> > there's no other choice than returning -EAGAIN from ioq_enqueue and
>> > ultimately to the guest.
> That is a bit complicated. As you mentioned, we are close to the
> 2.1 release; could we keep it simple and return failure to the guest
> after retrying several times? Actually, the previous dataplane handled
> this with exit(-1), which is absolutely unfriendly.

Indeed, I'd say do not even bother retrying several times. :)

Paolo
