* [PATCH for-6.1? v2 0/3] linux-aio: limit the batch size to reduce queue latency
@ 2021-07-21  9:42 Stefano Garzarella
  2021-07-21  9:42 ` [PATCH for-6.1? v2 1/3] iothread: generalize iothread_set_param/iothread_get_param Stefano Garzarella
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Stefano Garzarella @ 2021-07-21  9:42 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Fam Zheng, Kevin Wolf, Daniel P. Berrangé,
	Eduardo Habkost, qemu-block, Stefan Weil, Markus Armbruster,
	Max Reitz, Paolo Bonzini, Eric Blake, Dr. David Alan Gilbert

Since this fixes a performance regression, it would be good to include it in
6.1-rc1 if possible. The changes should be uncontroversial.

v1: https://lists.gnu.org/archive/html/qemu-devel/2021-07/msg01526.html
v2:
  - s/bacth/batch/ [stefanha]
  - limit the batch with the number of available events [stefanha]
  - rebased on master
  - re-run benchmarks

Commit 2558cb8dd4 ("linux-aio: increasing MAX_EVENTS to a larger hardcoded
value") changed MAX_EVENTS from 128 to 1024 to increase the number of
in-flight requests. But this change also increased the potential maximum
batch size to 1024 elements.

The problem is noticeable when we have many requests in flight and multiple
queues attached to the same AIO context: in this case we can build very
large batches. With a single queue, instead, the batch is limited, because
io_submit(2) is called when the queue is unplugged and no other queues can
be plugged.
In practice, io_submit(2) is called only when there are no more queues
plugged in or when the AIO queue fills up (MAX_EVENTS = 1024).
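
The relevant submit condition in block/linux-aio.c, before and after this
series (the change itself is in patch 3 below):

    /* before: submit only when no queue is plugged or the ring is full */
    if (!s->io_q.blocked &&
        (!s->io_q.plugged ||
         s->io_q.in_flight + s->io_q.in_queue >= MAX_EVENTS)) {
        ioq_submit(s);
    }

    /* after: also submit once the batch limit (max_batch) is reached */
    if (!s->io_q.blocked &&
        (!s->io_q.plugged ||
         s->io_q.in_queue >= max_batch)) {
        ioq_submit(s);
    }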

This series limits the batch size (the number of requests submitted to the
kernel through a single io_submit(2) call) in the Linux AIO backend, and
adds a new `aio-max-batch` parameter to IOThread to allow tuning it.
If `aio-max-batch` is equal to 0 (the default), the AIO engine uses its own
default maximum batch size.
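
For example, the limit can be set when the IOThread is created and, as
documented in patch 2, changed at run time with `qom-set` (the
`/objects/iothread0` path is an assumption for an IOThread created with
`-object iothread,id=iothread0`):

    -object iothread,id=iothread0,aio-max-batch=32

    (qemu) qom-set /objects/iothread0 aio-max-batch 32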

I ran some benchmarks to choose 32 as the default batch value for Linux AIO.
Below are the kIOPS measured with fio running in the guest (average over 3 runs):

                   |   master  |           with this series applied            |
                   |143c2e04328| maxbatch=8|maxbatch=16|maxbatch=32|maxbatch=64|
          # queues | 1q  | 4qs | 1q  | 4qs | 1q  | 4qs | 1q  | 4qs | 1q  | 4qs |
-- randread tests -|-----------------------------------------------------------|
bs=4k iodepth=1    | 200 | 195 | 181 | 208 | 200 | 203 | 206 | 212 | 200 | 204 |
bs=4k iodepth=8    | 269 | 231 | 256 | 244 | 255 | 260 | 266 | 268 | 270 | 250 |
bs=4k iodepth=64   | 230 | 198 | 262 | 265 | 261 | 253 | 260 | 273 | 253 | 263 |
bs=4k iodepth=128  | 217 | 181 | 261 | 253 | 249 | 276 | 250 | 278 | 255 | 278 |
bs=16k iodepth=1   | 130 | 130 | 130 | 130 | 130 | 130 | 137 | 130 | 130 | 130 |
bs=16k iodepth=8   | 130 | 131 | 130 | 131 | 130 | 130 | 137 | 131 | 131 | 130 |
bs=16k iodepth=64  | 130 | 102 | 131 | 128 | 131 | 128 | 137 | 140 | 130 | 128 |
bs=16k iodepth=128 | 131 | 100 | 130 | 128 | 131 | 129 | 137 | 141 | 130 | 129 |

1q  = virtio-blk device with a single queue
4qs = virtio-blk device with multiple queues (one queue per vCPU - 4)

I reported only the most significant tests, but I also ran others to
make sure there were no regressions; the full report is here:
https://docs.google.com/spreadsheets/d/11X3_5FJu7pnMTlf4ZatRDvsnU9K3EPj6Mn3aJIsE4tI

Test environment:
- Disk: Intel Corporation NVMe Datacenter SSD [Optane]
- CPU: Intel(R) Xeon(R) Silver 4214 CPU @ 2.20GHz
- QEMU: qemu-system-x86_64 -machine q35,accel=kvm -smp 4 -m 4096 \
          ... \
          -object iothread,id=iothread0,aio-max-batch=${MAX_BATCH} \
          -device virtio-blk-pci,iothread=iothread0,num-queues=${NUM_QUEUES}

- benchmark: fio --ioengine=libaio --thread --group_reporting \
                 --number_ios=200000 --direct=1 --filename=/dev/vdb \
                 --rw=${TEST} --bs=${BS} --iodepth=${IODEPTH} --numjobs=16

Next steps:
 - benchmark io_uring and use `aio-max-batch` there as well
 - make MAX_EVENTS configurable by adding a new `aio-max-events` parameter

Thanks,
Stefano

Stefano Garzarella (3):
  iothread: generalize iothread_set_param/iothread_get_param
  iothread: add aio-max-batch parameter
  linux-aio: limit the batch size using `aio-max-batch` parameter

 qapi/misc.json            |  6 ++-
 qapi/qom.json             |  7 +++-
 include/block/aio.h       | 12 ++++++
 include/sysemu/iothread.h |  3 ++
 block/linux-aio.c         |  9 ++++-
 iothread.c                | 82 ++++++++++++++++++++++++++++++++++-----
 monitor/hmp-cmds.c        |  2 +
 util/aio-posix.c          | 12 ++++++
 util/aio-win32.c          |  5 +++
 util/async.c              |  2 +
 qemu-options.hx           |  8 +++-
 11 files changed, 134 insertions(+), 14 deletions(-)

-- 
2.31.1




* [PATCH for-6.1? v2 1/3] iothread: generalize iothread_set_param/iothread_get_param
  2021-07-21  9:42 [PATCH for-6.1? v2 0/3] linux-aio: limit the batch size to reduce queue latency Stefano Garzarella
@ 2021-07-21  9:42 ` Stefano Garzarella
  2021-07-21  9:42 ` [PATCH for-6.1? v2 2/3] iothread: add aio-max-batch parameter Stefano Garzarella
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Stefano Garzarella @ 2021-07-21  9:42 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Fam Zheng, Kevin Wolf, Daniel P. Berrangé,
	Eduardo Habkost, qemu-block, Stefan Weil, Markus Armbruster,
	Max Reitz, Paolo Bonzini, Eric Blake, Dr. David Alan Gilbert

These changes prepare for the next patches, where we add a new
parameter not related to the poll mechanism.

Let's add two new generic functions (iothread_set_param and
iothread_get_param) that we use to set and get IOThread
parameters.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
 iothread.c | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/iothread.c b/iothread.c
index 2c5ccd7367..103679a16b 100644
--- a/iothread.c
+++ b/iothread.c
@@ -213,7 +213,7 @@ static PollParamInfo poll_shrink_info = {
     "poll-shrink", offsetof(IOThread, poll_shrink),
 };
 
-static void iothread_get_poll_param(Object *obj, Visitor *v,
+static void iothread_get_param(Object *obj, Visitor *v,
         const char *name, void *opaque, Error **errp)
 {
     IOThread *iothread = IOTHREAD(obj);
@@ -223,7 +223,7 @@ static void iothread_get_poll_param(Object *obj, Visitor *v,
     visit_type_int64(v, name, field, errp);
 }
 
-static void iothread_set_poll_param(Object *obj, Visitor *v,
+static bool iothread_set_param(Object *obj, Visitor *v,
         const char *name, void *opaque, Error **errp)
 {
     IOThread *iothread = IOTHREAD(obj);
@@ -232,17 +232,36 @@ static void iothread_set_poll_param(Object *obj, Visitor *v,
     int64_t value;
 
     if (!visit_type_int64(v, name, &value, errp)) {
-        return;
+        return false;
     }
 
     if (value < 0) {
         error_setg(errp, "%s value must be in range [0, %" PRId64 "]",
                    info->name, INT64_MAX);
-        return;
+        return false;
     }
 
     *field = value;
 
+    return true;
+}
+
+static void iothread_get_poll_param(Object *obj, Visitor *v,
+        const char *name, void *opaque, Error **errp)
+{
+
+    iothread_get_param(obj, v, name, opaque, errp);
+}
+
+static void iothread_set_poll_param(Object *obj, Visitor *v,
+        const char *name, void *opaque, Error **errp)
+{
+    IOThread *iothread = IOTHREAD(obj);
+
+    if (!iothread_set_param(obj, v, name, opaque, errp)) {
+        return;
+    }
+
     if (iothread->ctx) {
         aio_context_set_poll_params(iothread->ctx,
                                     iothread->poll_max_ns,
-- 
2.31.1




* [PATCH for-6.1? v2 2/3] iothread: add aio-max-batch parameter
  2021-07-21  9:42 [PATCH for-6.1? v2 0/3] linux-aio: limit the batch size to reduce queue latency Stefano Garzarella
  2021-07-21  9:42 ` [PATCH for-6.1? v2 1/3] iothread: generalize iothread_set_param/iothread_get_param Stefano Garzarella
@ 2021-07-21  9:42 ` Stefano Garzarella
  2021-07-21  9:42 ` [PATCH for-6.1? v2 3/3] linux-aio: limit the batch size using `aio-max-batch` parameter Stefano Garzarella
  2021-07-21 13:13 ` [PATCH for-6.1? v2 0/3] linux-aio: limit the batch size to reduce queue latency Stefan Hajnoczi
  3 siblings, 0 replies; 5+ messages in thread
From: Stefano Garzarella @ 2021-07-21  9:42 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Fam Zheng, Kevin Wolf, Daniel P. Berrangé,
	Eduardo Habkost, qemu-block, Stefan Weil, Markus Armbruster,
	Max Reitz, Paolo Bonzini, Eric Blake, Dr. David Alan Gilbert

The `aio-max-batch` parameter will be propagated to the AIO engines
and used to control the maximum number of queued requests.

When the number of queued requests reaches `aio-max-batch`, the engine
invokes the system call to forward them to the kernel.

This parameter allows us to cap the batch size, reducing the latency
that requests can accumulate while queued in the AIO engine.

If `aio-max-batch` is equal to 0 (the default), the AIO engine uses
its own default maximum batch size.
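
The new value is also reported by `query-iothreads` and HMP's
`info iothreads`; an illustrative QMP exchange (the values shown are
assumptions):

    { "execute": "query-iothreads" }
    { "return": [ { "id": "iothread0", "thread-id": 12345,
                    "poll-max-ns": 32768, "poll-grow": 0,
                    "poll-shrink": 0, "aio-max-batch": 32 } ] }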

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---

Notes:
    v2:
    - s/bacth/batch/ [stefanha]

 qapi/misc.json            |  6 ++++-
 qapi/qom.json             |  7 ++++-
 include/block/aio.h       | 12 +++++++++
 include/sysemu/iothread.h |  3 +++
 iothread.c                | 55 +++++++++++++++++++++++++++++++++++----
 monitor/hmp-cmds.c        |  2 ++
 util/aio-posix.c          | 12 +++++++++
 util/aio-win32.c          |  5 ++++
 util/async.c              |  2 ++
 qemu-options.hx           |  8 ++++--
 10 files changed, 103 insertions(+), 9 deletions(-)

diff --git a/qapi/misc.json b/qapi/misc.json
index 156f98203e..5c2ca3b556 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -86,6 +86,9 @@
 # @poll-shrink: how many ns will be removed from polling time, 0 means that
 #               it's not configured (since 2.9)
 #
+# @aio-max-batch: maximum number of requests in a batch for the AIO engine,
+#                 0 means that the engine will use its default (since 6.1)
+#
 # Since: 2.0
 ##
 { 'struct': 'IOThreadInfo',
@@ -93,7 +96,8 @@
            'thread-id': 'int',
            'poll-max-ns': 'int',
            'poll-grow': 'int',
-           'poll-shrink': 'int' } }
+           'poll-shrink': 'int',
+           'aio-max-batch': 'int' } }
 
 ##
 # @query-iothreads:
diff --git a/qapi/qom.json b/qapi/qom.json
index 652be317b8..6d5f4a88e6 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -516,12 +516,17 @@
 #               algorithm detects it is spending too long polling without
 #               encountering events. 0 selects a default behaviour (default: 0)
 #
+# @aio-max-batch: maximum number of requests in a batch for the AIO engine,
+#                 0 means that the engine will use its default
+#                 (default:0, since 6.1)
+#
 # Since: 2.0
 ##
 { 'struct': 'IothreadProperties',
   'data': { '*poll-max-ns': 'int',
             '*poll-grow': 'int',
-            '*poll-shrink': 'int' } }
+            '*poll-shrink': 'int',
+            '*aio-max-batch': 'int' } }
 
 ##
 # @MemoryBackendProperties:
diff --git a/include/block/aio.h b/include/block/aio.h
index 807edce9b5..47fbe9d81f 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -232,6 +232,9 @@ struct AioContext {
     int64_t poll_grow;      /* polling time growth factor */
     int64_t poll_shrink;    /* polling time shrink factor */
 
+    /* AIO engine parameters */
+    int64_t aio_max_batch;  /* maximum number of requests in a batch */
+
     /*
      * List of handlers participating in userspace polling.  Protected by
      * ctx->list_lock.  Iterated and modified mostly by the event loop thread
@@ -755,4 +758,13 @@ void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
                                  int64_t grow, int64_t shrink,
                                  Error **errp);
 
+/**
+ * aio_context_set_aio_params:
+ * @ctx: the aio context
+ * @max_batch: maximum number of requests in a batch, 0 means that the
+ *             engine will use its default
+ */
+void aio_context_set_aio_params(AioContext *ctx, int64_t max_batch,
+                                Error **errp);
+
 #endif
diff --git a/include/sysemu/iothread.h b/include/sysemu/iothread.h
index f177142f16..7f714bd136 100644
--- a/include/sysemu/iothread.h
+++ b/include/sysemu/iothread.h
@@ -37,6 +37,9 @@ struct IOThread {
     int64_t poll_max_ns;
     int64_t poll_grow;
     int64_t poll_shrink;
+
+    /* AioContext AIO engine parameters */
+    int64_t aio_max_batch;
 };
 typedef struct IOThread IOThread;
 
diff --git a/iothread.c b/iothread.c
index 103679a16b..ddbbde61f7 100644
--- a/iothread.c
+++ b/iothread.c
@@ -152,6 +152,24 @@ static void iothread_init_gcontext(IOThread *iothread)
     iothread->main_loop = g_main_loop_new(iothread->worker_context, TRUE);
 }
 
+static void iothread_set_aio_context_params(IOThread *iothread, Error **errp)
+{
+    ERRP_GUARD();
+
+    aio_context_set_poll_params(iothread->ctx,
+                                iothread->poll_max_ns,
+                                iothread->poll_grow,
+                                iothread->poll_shrink,
+                                errp);
+    if (*errp) {
+        return;
+    }
+
+    aio_context_set_aio_params(iothread->ctx,
+                               iothread->aio_max_batch,
+                               errp);
+}
+
 static void iothread_complete(UserCreatable *obj, Error **errp)
 {
     Error *local_error = NULL;
@@ -171,11 +189,7 @@ static void iothread_complete(UserCreatable *obj, Error **errp)
      */
     iothread_init_gcontext(iothread);
 
-    aio_context_set_poll_params(iothread->ctx,
-                                iothread->poll_max_ns,
-                                iothread->poll_grow,
-                                iothread->poll_shrink,
-                                &local_error);
+    iothread_set_aio_context_params(iothread, &local_error);
     if (local_error) {
         error_propagate(errp, local_error);
         aio_context_unref(iothread->ctx);
@@ -212,6 +226,9 @@ static PollParamInfo poll_grow_info = {
 static PollParamInfo poll_shrink_info = {
     "poll-shrink", offsetof(IOThread, poll_shrink),
 };
+static PollParamInfo aio_max_batch_info = {
+    "aio-max-batch", offsetof(IOThread, aio_max_batch),
+};
 
 static void iothread_get_param(Object *obj, Visitor *v,
         const char *name, void *opaque, Error **errp)
@@ -271,6 +288,29 @@ static void iothread_set_poll_param(Object *obj, Visitor *v,
     }
 }
 
+static void iothread_get_aio_param(Object *obj, Visitor *v,
+        const char *name, void *opaque, Error **errp)
+{
+
+    iothread_get_param(obj, v, name, opaque, errp);
+}
+
+static void iothread_set_aio_param(Object *obj, Visitor *v,
+        const char *name, void *opaque, Error **errp)
+{
+    IOThread *iothread = IOTHREAD(obj);
+
+    if (!iothread_set_param(obj, v, name, opaque, errp)) {
+        return;
+    }
+
+    if (iothread->ctx) {
+        aio_context_set_aio_params(iothread->ctx,
+                                   iothread->aio_max_batch,
+                                   errp);
+    }
+}
+
 static void iothread_class_init(ObjectClass *klass, void *class_data)
 {
     UserCreatableClass *ucc = USER_CREATABLE_CLASS(klass);
@@ -288,6 +328,10 @@ static void iothread_class_init(ObjectClass *klass, void *class_data)
                               iothread_get_poll_param,
                               iothread_set_poll_param,
                               NULL, &poll_shrink_info);
+    object_class_property_add(klass, "aio-max-batch", "int",
+                              iothread_get_aio_param,
+                              iothread_set_aio_param,
+                              NULL, &aio_max_batch_info);
 }
 
 static const TypeInfo iothread_info = {
@@ -337,6 +381,7 @@ static int query_one_iothread(Object *object, void *opaque)
     info->poll_max_ns = iothread->poll_max_ns;
     info->poll_grow = iothread->poll_grow;
     info->poll_shrink = iothread->poll_shrink;
+    info->aio_max_batch = iothread->aio_max_batch;
 
     QAPI_LIST_APPEND(*tail, info);
     return 0;
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 0942027208..e00255f7ee 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1893,6 +1893,8 @@ void hmp_info_iothreads(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "  poll-max-ns=%" PRId64 "\n", value->poll_max_ns);
         monitor_printf(mon, "  poll-grow=%" PRId64 "\n", value->poll_grow);
         monitor_printf(mon, "  poll-shrink=%" PRId64 "\n", value->poll_shrink);
+        monitor_printf(mon, "  aio-max-batch=%" PRId64 "\n",
+                       value->aio_max_batch);
     }
 
     qapi_free_IOThreadInfoList(info_list);
diff --git a/util/aio-posix.c b/util/aio-posix.c
index 30f5354b1e..2b86777e91 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -716,3 +716,15 @@ void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
 
     aio_notify(ctx);
 }
+
+void aio_context_set_aio_params(AioContext *ctx, int64_t max_batch,
+                                Error **errp)
+{
+    /*
+     * No thread synchronization here, it doesn't matter if an incorrect value
+     * is used once.
+     */
+    ctx->aio_max_batch = max_batch;
+
+    aio_notify(ctx);
+}
diff --git a/util/aio-win32.c b/util/aio-win32.c
index 168717b51b..d5b09a1193 100644
--- a/util/aio-win32.c
+++ b/util/aio-win32.c
@@ -440,3 +440,8 @@ void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
         error_setg(errp, "AioContext polling is not implemented on Windows");
     }
 }
+
+void aio_context_set_aio_params(AioContext *ctx, int64_t max_batch,
+                                Error **errp)
+{
+}
diff --git a/util/async.c b/util/async.c
index 9a41591319..6f6717a34b 100644
--- a/util/async.c
+++ b/util/async.c
@@ -554,6 +554,8 @@ AioContext *aio_context_new(Error **errp)
     ctx->poll_grow = 0;
     ctx->poll_shrink = 0;
 
+    ctx->aio_max_batch = 0;
+
     return ctx;
 fail:
     g_source_destroy(&ctx->source);
diff --git a/qemu-options.hx b/qemu-options.hx
index 0c9ddc0274..99ed5ec5f1 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -5301,7 +5301,7 @@ SRST
 
             CN=laptop.example.com,O=Example Home,L=London,ST=London,C=GB
 
-    ``-object iothread,id=id,poll-max-ns=poll-max-ns,poll-grow=poll-grow,poll-shrink=poll-shrink``
+    ``-object iothread,id=id,poll-max-ns=poll-max-ns,poll-grow=poll-grow,poll-shrink=poll-shrink,aio-max-batch=aio-max-batch``
         Creates a dedicated event loop thread that devices can be
         assigned to. This is known as an IOThread. By default device
         emulation happens in vCPU threads or the main event loop thread.
@@ -5337,7 +5337,11 @@ SRST
         the polling time when the algorithm detects it is spending too
         long polling without encountering events.
 
-        The polling parameters can be modified at run-time using the
+        The ``aio-max-batch`` parameter is the maximum number of requests
+        in a batch for the AIO engine, 0 means that the engine will use
+        its default.
+
+        The IOThread parameters can be modified at run-time using the
         ``qom-set`` command (where ``iothread1`` is the IOThread's
         ``id``):
 
-- 
2.31.1




* [PATCH for-6.1? v2 3/3] linux-aio: limit the batch size using `aio-max-batch` parameter
  2021-07-21  9:42 [PATCH for-6.1? v2 0/3] linux-aio: limit the batch size to reduce queue latency Stefano Garzarella
  2021-07-21  9:42 ` [PATCH for-6.1? v2 1/3] iothread: generalize iothread_set_param/iothread_get_param Stefano Garzarella
  2021-07-21  9:42 ` [PATCH for-6.1? v2 2/3] iothread: add aio-max-batch parameter Stefano Garzarella
@ 2021-07-21  9:42 ` Stefano Garzarella
  2021-07-21 13:13 ` [PATCH for-6.1? v2 0/3] linux-aio: limit the batch size to reduce queue latency Stefan Hajnoczi
  3 siblings, 0 replies; 5+ messages in thread
From: Stefano Garzarella @ 2021-07-21  9:42 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel
  Cc: Fam Zheng, Kevin Wolf, Daniel P. Berrangé,
	Eduardo Habkost, qemu-block, Stefan Weil, Markus Armbruster,
	Max Reitz, Paolo Bonzini, Eric Blake, Dr. David Alan Gilbert

When there are multiple queues attached to the same AIO context,
some requests may experience high latency, since in the worst case
the AIO engine queue is only flushed when it is full (MAX_EVENTS) or
when no more queues are plugged.

Commit 2558cb8dd4 ("linux-aio: increasing MAX_EVENTS to a larger
hardcoded value") changed MAX_EVENTS from 128 to 1024 to increase
the number of in-flight requests. But this change also increased
the potential maximum batch size to 1024 elements.

When there is a single queue attached to the AIO context, the issue
is mitigated by laio_io_unplug(), which flushes the queue every time
it is invoked, since no other queues can be plugged.

Let's use the new `aio-max-batch` IOThread parameter to mitigate
this issue, limiting the number of requests in a batch.

We also define a default value (32): it was chosen by running some
benchmarks and represents a good tradeoff between the latency a
request accumulates while queued and the cost of the io_submit(2)
system call.
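
For example, with the default configuration and 1000 requests already
in flight (illustrative numbers, following the hunk below):

    /* aio_max_batch == 0, so DEFAULT_MAX_BATCH (32) applies */
    max_batch = s->aio_context->aio_max_batch ?: DEFAULT_MAX_BATCH;
    /* clamp to the free event slots: MIN_NON_ZERO(1024 - 1000, 32) = 24 */
    max_batch = MIN_NON_ZERO(MAX_EVENTS - s->io_q.in_flight, max_batch);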

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---

Notes:
    v2:
    - limit the batch with the number of available events [stefanha]

 block/linux-aio.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/block/linux-aio.c b/block/linux-aio.c
index 3c0527c2bf..0dab507b71 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -28,6 +28,9 @@
  */
 #define MAX_EVENTS 1024
 
+/* Maximum number of requests in a batch. (default value) */
+#define DEFAULT_MAX_BATCH 32
+
 struct qemu_laiocb {
     Coroutine *co;
     LinuxAioState *ctx;
@@ -351,6 +354,10 @@ static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
     LinuxAioState *s = laiocb->ctx;
     struct iocb *iocbs = &laiocb->iocb;
     QEMUIOVector *qiov = laiocb->qiov;
+    int64_t max_batch = s->aio_context->aio_max_batch ?: DEFAULT_MAX_BATCH;
+
+    /* limit the batch with the number of available events */
+    max_batch = MIN_NON_ZERO(MAX_EVENTS - s->io_q.in_flight, max_batch);
 
     switch (type) {
     case QEMU_AIO_WRITE:
@@ -371,7 +378,7 @@ static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
     s->io_q.in_queue++;
     if (!s->io_q.blocked &&
         (!s->io_q.plugged ||
-         s->io_q.in_flight + s->io_q.in_queue >= MAX_EVENTS)) {
+         s->io_q.in_queue >= max_batch)) {
         ioq_submit(s);
     }
 
-- 
2.31.1




* Re: [PATCH for-6.1? v2 0/3] linux-aio: limit the batch size to reduce queue latency
  2021-07-21  9:42 [PATCH for-6.1? v2 0/3] linux-aio: limit the batch size to reduce queue latency Stefano Garzarella
                   ` (2 preceding siblings ...)
  2021-07-21  9:42 ` [PATCH for-6.1? v2 3/3] linux-aio: limit the batch size using `aio-max-batch` parameter Stefano Garzarella
@ 2021-07-21 13:13 ` Stefan Hajnoczi
  3 siblings, 0 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2021-07-21 13:13 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: Fam Zheng, Kevin Wolf, Daniel P. Berrangé,
	Eduardo Habkost, qemu-block, qemu-devel, Stefan Weil,
	Markus Armbruster, Max Reitz, Paolo Bonzini, Eric Blake,
	Dr. David Alan Gilbert

On Wed, Jul 21, 2021 at 11:42:08AM +0200, Stefano Garzarella wrote:
> [...]

Thanks, applied to my master tree:
https://gitlab.com/stefanha/qemu/commits/master

Stefan


