All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/6] block: Make raw_co_get_allocated_file_size asynchronous
@ 2023-05-23 21:38 Fabiano Rosas
  2023-05-23 21:38 ` [RFC PATCH 1/6] block: Remove bdrv_query_block_node_info Fabiano Rosas
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: Fabiano Rosas @ 2023-05-23 21:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Hanna Reitz, Markus Armbruster,
	João Silva, Lin Ma, Claudio Fontana, Dario Faggioli,
	Eric Blake

As discussed in another thread [1], here's an RFC addressing a VCPU
softlockup encountered when issuing QMP commands that target a disk
placed on NFS.

Since QMP commands happen with the qemu_global_mutex locked, any
command that takes too long to finish will block other threads waiting
to take the global mutex. One such thread could be a VCPU thread going
out of the guest to handle IO.

This is the case when issuing the QMP command query-block, which
eventually calls raw_co_get_allocated_file_size(). This function makes
an 'fstat' call that has been observed to take a long time (seconds)
over NFS.

NFS latency issues aside, we can improve the situation by not blocking
VCPU threads while the command is running.

Move the 'fstat' call into the thread-pool and make the necessary
adaptations to ensure raw_co_get_allocated_file_size runs in a
coroutine in the block driver aio_context.

1- Question about QMP and BQL
https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg03141.html

CI run: https://gitlab.com/farosas/qemu/-/pipelines/876583685

Fabiano Rosas (3):
  block: Remove bdrv_query_block_node_info
  block: Mark bdrv_co_get_allocated_file_size() as mixed
  block: Allow bdrv_get_allocated_file_size to run in bdrv context

João Silva (1):
  block: Add a thread-pool version of fstat

Lin Ma (2):
  Convert query-block/info_block to coroutine
  Convert query-block/info_block to coroutine

 block/file-posix.c             | 40 ++++++++++++++++++++++++--
 block/monitor/block-hmp-cmds.c |  2 +-
 block/qapi.c                   | 51 +++++++++++++++-------------------
 blockdev.c                     |  6 ++--
 hmp-commands-info.hx           |  1 +
 include/block/block-hmp-cmds.h |  2 +-
 include/block/block-io.h       |  2 +-
 include/block/qapi.h           |  3 --
 include/block/raw-aio.h        |  4 ++-
 qapi/block-core.json           |  5 ++--
 10 files changed, 72 insertions(+), 44 deletions(-)

-- 
2.35.3



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH 1/6] block: Remove bdrv_query_block_node_info
  2023-05-23 21:38 [RFC PATCH 0/6] block: Make raw_co_get_allocated_file_size asynchronous Fabiano Rosas
@ 2023-05-23 21:38 ` Fabiano Rosas
  2023-05-24  8:31   ` Claudio Fontana
  2023-05-23 21:38 ` [RFC PATCH 2/6] block: Mark bdrv_co_get_allocated_file_size() as mixed Fabiano Rosas
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Fabiano Rosas @ 2023-05-23 21:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Hanna Reitz, Markus Armbruster,
	João Silva, Lin Ma, Claudio Fontana, Dario Faggioli,
	Eric Blake

The last call site of this function has been removed by commit
c04d0ab026 ("qemu-img: Let info print block graph").

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 block/qapi.c         | 27 ---------------------------
 include/block/qapi.h |  3 ---
 2 files changed, 30 deletions(-)

diff --git a/block/qapi.c b/block/qapi.c
index f34f95e0ef..79bf80c503 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -309,33 +309,6 @@ out:
     aio_context_release(bdrv_get_aio_context(bs));
 }
 
-/**
- * bdrv_query_block_node_info:
- * @bs: block node to examine
- * @p_info: location to store node information
- * @errp: location to store error information
- *
- * Store image information about @bs in @p_info.
- *
- * @p_info will be set only on success. On error, store error in @errp.
- */
-void bdrv_query_block_node_info(BlockDriverState *bs,
-                                BlockNodeInfo **p_info,
-                                Error **errp)
-{
-    BlockNodeInfo *info;
-    ERRP_GUARD();
-
-    info = g_new0(BlockNodeInfo, 1);
-    bdrv_do_query_node_info(bs, info, errp);
-    if (*errp) {
-        qapi_free_BlockNodeInfo(info);
-        return;
-    }
-
-    *p_info = info;
-}
-
 /**
  * bdrv_query_image_info:
  * @bs: block node to examine
diff --git a/include/block/qapi.h b/include/block/qapi.h
index 18d48ddb70..8663971c58 100644
--- a/include/block/qapi.h
+++ b/include/block/qapi.h
@@ -36,9 +36,6 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
 int bdrv_query_snapshot_info_list(BlockDriverState *bs,
                                   SnapshotInfoList **p_list,
                                   Error **errp);
-void bdrv_query_block_node_info(BlockDriverState *bs,
-                                BlockNodeInfo **p_info,
-                                Error **errp);
 void bdrv_query_image_info(BlockDriverState *bs,
                            ImageInfo **p_info,
                            bool flat,
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC PATCH 2/6] block: Mark bdrv_co_get_allocated_file_size() as mixed
  2023-05-23 21:38 [RFC PATCH 0/6] block: Make raw_co_get_allocated_file_size asynchronous Fabiano Rosas
  2023-05-23 21:38 ` [RFC PATCH 1/6] block: Remove bdrv_query_block_node_info Fabiano Rosas
@ 2023-05-23 21:38 ` Fabiano Rosas
  2023-05-25 14:17   ` Eric Blake
  2023-05-26  9:12   ` Kevin Wolf
  2023-05-23 21:39 ` [RFC PATCH 3/6] Convert query-block/info_block to coroutine Fabiano Rosas
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 23+ messages in thread
From: Fabiano Rosas @ 2023-05-23 21:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Hanna Reitz, Markus Armbruster,
	João Silva, Lin Ma, Claudio Fontana, Dario Faggioli,
	Eric Blake

Some callers of this function are about to be converted to use
coroutines, so allow it to be executed both inside and outside a
coroutine.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 include/block/block-io.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/block/block-io.h b/include/block/block-io.h
index a27e471a87..c1f96faca5 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -87,7 +87,7 @@ int64_t co_wrapper_mixed_bdrv_rdlock bdrv_getlength(BlockDriverState *bs);
 int64_t coroutine_fn GRAPH_RDLOCK
 bdrv_co_get_allocated_file_size(BlockDriverState *bs);
 
-int64_t co_wrapper_bdrv_rdlock
+int64_t co_wrapper_mixed_bdrv_rdlock
 bdrv_get_allocated_file_size(BlockDriverState *bs);
 
 BlockMeasureInfo *bdrv_measure(BlockDriver *drv, QemuOpts *opts,
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC PATCH 3/6] Convert query-block/info_block to coroutine
  2023-05-23 21:38 [RFC PATCH 0/6] block: Make raw_co_get_allocated_file_size asynchronous Fabiano Rosas
  2023-05-23 21:38 ` [RFC PATCH 1/6] block: Remove bdrv_query_block_node_info Fabiano Rosas
  2023-05-23 21:38 ` [RFC PATCH 2/6] block: Mark bdrv_co_get_allocated_file_size() as mixed Fabiano Rosas
@ 2023-05-23 21:39 ` Fabiano Rosas
  2023-05-24  8:48   ` Claudio Fontana
  2023-05-25 14:26   ` Eric Blake
  2023-05-23 21:39 ` [RFC PATCH 4/6] " Fabiano Rosas
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 23+ messages in thread
From: Fabiano Rosas @ 2023-05-23 21:39 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Hanna Reitz, Markus Armbruster,
	João Silva, Lin Ma, Claudio Fontana, Dario Faggioli,
	Eric Blake, Dr. David Alan Gilbert

From: Lin Ma <lma@suse.com>

Sometimes the query-block performs time-consuming I/O(say waiting for
the fstat of NFS complete), So let's make this QMP handler runs in a
coroutine.

The following patch moves the fstat() into a thread pool.

Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 block/monitor/block-hmp-cmds.c | 2 +-
 block/qapi.c                   | 2 +-
 hmp-commands-info.hx           | 1 +
 include/block/block-hmp-cmds.h | 2 +-
 qapi/block-core.json           | 2 +-
 5 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index ca2599de44..4b704010bc 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -738,7 +738,7 @@ static void print_block_info(Monitor *mon, BlockInfo *info,
     }
 }
 
-void hmp_info_block(Monitor *mon, const QDict *qdict)
+void coroutine_fn hmp_info_block(Monitor *mon, const QDict *qdict)
 {
     BlockInfoList *block_list, *info;
     BlockDeviceInfoList *blockdev_list, *blockdev;
diff --git a/block/qapi.c b/block/qapi.c
index 79bf80c503..ae6cd1c2ff 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -667,7 +667,7 @@ bdrv_query_bds_stats(BlockDriverState *bs, bool blk_level)
     return s;
 }
 
-BlockInfoList *qmp_query_block(Error **errp)
+BlockInfoList *coroutine_fn qmp_query_block(Error **errp)
 {
     BlockInfoList *head = NULL, **p_next = &head;
     BlockBackend *blk;
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 47d63d26db..996895f417 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -65,6 +65,7 @@ ERST
         .help       = "show info of one block device or all block devices "
                       "(-n: show named nodes; -v: show details)",
         .cmd        = hmp_info_block,
+        .coroutine  = true,
     },
 
 SRST
diff --git a/include/block/block-hmp-cmds.h b/include/block/block-hmp-cmds.h
index 71113cd7ef..6d9152318f 100644
--- a/include/block/block-hmp-cmds.h
+++ b/include/block/block-hmp-cmds.h
@@ -48,7 +48,7 @@ void hmp_eject(Monitor *mon, const QDict *qdict);
 
 void hmp_qemu_io(Monitor *mon, const QDict *qdict);
 
-void hmp_info_block(Monitor *mon, const QDict *qdict);
+void coroutine_fn hmp_info_block(Monitor *mon, const QDict *qdict);
 void hmp_info_blockstats(Monitor *mon, const QDict *qdict);
 void hmp_info_block_jobs(Monitor *mon, const QDict *qdict);
 void hmp_info_snapshots(Monitor *mon, const QDict *qdict);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 98d9116dae..da95fe456c 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -838,7 +838,7 @@
 #    }
 ##
 { 'command': 'query-block', 'returns': ['BlockInfo'],
-  'allow-preconfig': true }
+  'allow-preconfig': true, 'coroutine': true }
 
 ##
 # @BlockDeviceTimedStats:
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC PATCH 4/6] Convert query-block/info_block to coroutine
  2023-05-23 21:38 [RFC PATCH 0/6] block: Make raw_co_get_allocated_file_size asynchronous Fabiano Rosas
                   ` (2 preceding siblings ...)
  2023-05-23 21:39 ` [RFC PATCH 3/6] Convert query-block/info_block to coroutine Fabiano Rosas
@ 2023-05-23 21:39 ` Fabiano Rosas
  2023-05-24  4:23   ` Lin Ma
  2023-05-24  8:49   ` Claudio Fontana
  2023-05-23 21:39 ` [RFC PATCH 5/6] block: Allow bdrv_get_allocated_file_size to run in bdrv context Fabiano Rosas
  2023-05-23 21:39 ` [RFC PATCH 6/6] block: Add a thread-pool version of fstat Fabiano Rosas
  5 siblings, 2 replies; 23+ messages in thread
From: Fabiano Rosas @ 2023-05-23 21:39 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Hanna Reitz, Markus Armbruster,
	João Silva, Lin Ma, Claudio Fontana, Dario Faggioli,
	Eric Blake

From: Lin Ma <lma@suse.com>

Sometimes the query-block performs time-consuming I/O(say waiting for
the fstat of NFS complete), So let's make this QMP handler runs in a
coroutine.

The following patch moves the fstat() into a thread pool.

Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 blockdev.c           | 6 +++---
 qapi/block-core.json | 3 ++-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 5d56b79df4..6412548662 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2804,9 +2804,9 @@ void qmp_drive_backup(DriveBackup *backup, Error **errp)
     blockdev_do_action(&action, errp);
 }
 
-BlockDeviceInfoList *qmp_query_named_block_nodes(bool has_flat,
-                                                 bool flat,
-                                                 Error **errp)
+BlockDeviceInfoList *coroutine_fn qmp_query_named_block_nodes(bool has_flat,
+                                                              bool flat,
+                                                              Error **errp)
 {
     bool return_flat = has_flat && flat;
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index da95fe456c..0559c38412 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1972,7 +1972,8 @@
 { 'command': 'query-named-block-nodes',
   'returns': [ 'BlockDeviceInfo' ],
   'data': { '*flat': 'bool' },
-  'allow-preconfig': true }
+  'allow-preconfig': true,
+  'coroutine': true}
 
 ##
 # @XDbgBlockGraphNodeType:
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC PATCH 5/6] block: Allow bdrv_get_allocated_file_size to run in bdrv context
  2023-05-23 21:38 [RFC PATCH 0/6] block: Make raw_co_get_allocated_file_size asynchronous Fabiano Rosas
                   ` (3 preceding siblings ...)
  2023-05-23 21:39 ` [RFC PATCH 4/6] " Fabiano Rosas
@ 2023-05-23 21:39 ` Fabiano Rosas
  2023-05-26  9:28   ` Kevin Wolf
  2023-05-23 21:39 ` [RFC PATCH 6/6] block: Add a thread-pool version of fstat Fabiano Rosas
  5 siblings, 1 reply; 23+ messages in thread
From: Fabiano Rosas @ 2023-05-23 21:39 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Hanna Reitz, Markus Armbruster,
	João Silva, Lin Ma, Claudio Fontana, Dario Faggioli,
	Eric Blake

We're about to move calls to 'fstat' into the thread-pool to avoid
blocking VCPU threads should the system call take too long.

To achieve that we first need to make sure none of its callers is
holding the aio_context lock, otherwise yielding before scheduling the
aiocb handler would result in a deadlock when the qemu_global_mutex is
released and another thread tries to acquire the aio_context.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 block/qapi.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/block/qapi.c b/block/qapi.c
index ae6cd1c2ff..cd197abf1f 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -222,6 +222,26 @@ int bdrv_query_snapshot_info_list(BlockDriverState *bs,
     return 0;
 }
 
+static int64_t bdrv_get_actual_size(BlockDriverState *bs)
+{
+    int64_t size;
+    AioContext *old_ctx = NULL;
+
+    if (qemu_in_coroutine()) {
+        aio_context_release(bdrv_get_aio_context(bs));
+        old_ctx = bdrv_co_enter(bs);
+    }
+
+    size = bdrv_get_allocated_file_size(bs);
+
+    if (qemu_in_coroutine() && old_ctx) {
+        bdrv_co_leave(bs, old_ctx);
+        aio_context_acquire(bdrv_get_aio_context(bs));
+    }
+
+    return size;
+}
+
 /**
  * Helper function for other query info functions.  Store information about @bs
  * in @info, setting @errp on error.
@@ -250,7 +270,7 @@ static void bdrv_do_query_node_info(BlockDriverState *bs,
     info->filename        = g_strdup(bs->filename);
     info->format          = g_strdup(bdrv_get_format_name(bs));
     info->virtual_size    = size;
-    info->actual_size     = bdrv_get_allocated_file_size(bs);
+    info->actual_size     = bdrv_get_actual_size(bs);
     info->has_actual_size = info->actual_size >= 0;
     if (bs->encrypted) {
         info->encrypted = true;
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC PATCH 6/6] block: Add a thread-pool version of fstat
  2023-05-23 21:38 [RFC PATCH 0/6] block: Make raw_co_get_allocated_file_size asynchronous Fabiano Rosas
                   ` (4 preceding siblings ...)
  2023-05-23 21:39 ` [RFC PATCH 5/6] block: Allow bdrv_get_allocated_file_size to run in bdrv context Fabiano Rosas
@ 2023-05-23 21:39 ` Fabiano Rosas
  2023-05-24 15:56   ` Claudio Fontana
  2023-05-25 15:45   ` Eric Blake
  5 siblings, 2 replies; 23+ messages in thread
From: Fabiano Rosas @ 2023-05-23 21:39 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Hanna Reitz, Markus Armbruster,
	João Silva, Lin Ma, Claudio Fontana, Dario Faggioli,
	Eric Blake

From: João Silva <jsilva@suse.de>

The fstat call can take a long time to finish when running over
NFS. Add a version of it that runs in the thread pool.

Adapt one of its users, raw_co_get_allocated_file size to use the new
version. That function is called via QMP under the qemu_global_mutex
so it has a large chance of blocking VCPU threads in case it takes too
long to finish.

Signed-off-by: João Silva <jsilva@suse.de>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 block/file-posix.c      | 40 +++++++++++++++++++++++++++++++++++++---
 include/block/raw-aio.h |  4 +++-
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 0ab158efba..0a29a6cc43 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -227,6 +227,9 @@ typedef struct RawPosixAIOData {
         struct {
             unsigned long op;
         } zone_mgmt;
+        struct {
+            struct stat *st;
+        } fstat;
     };
 } RawPosixAIOData;
 
@@ -2644,6 +2647,34 @@ static void raw_close(BlockDriverState *bs)
     }
 }
 
+static int handle_aiocb_fstat(void *opaque)
+{
+    RawPosixAIOData *aiocb = opaque;
+
+    if (fstat(aiocb->aio_fildes, aiocb->fstat.st) < 0) {
+        return -errno;
+    }
+
+    return 0;
+}
+
+static int coroutine_fn raw_co_fstat(BlockDriverState *bs, struct stat *st)
+{
+    BDRVRawState *s = bs->opaque;
+    RawPosixAIOData acb;
+
+    acb = (RawPosixAIOData) {
+        .bs             = bs,
+        .aio_fildes     = s->fd,
+        .aio_type       = QEMU_AIO_FSTAT,
+        .fstat          = {
+            .st = st,
+        },
+    };
+
+    return raw_thread_pool_submit(handle_aiocb_fstat, &acb);
+}
+
 /**
  * Truncates the given regular file @fd to @offset and, when growing, fills the
  * new space according to @prealloc.
@@ -2883,11 +2914,14 @@ static int64_t coroutine_fn raw_co_getlength(BlockDriverState *bs)
 static int64_t coroutine_fn raw_co_get_allocated_file_size(BlockDriverState *bs)
 {
     struct stat st;
-    BDRVRawState *s = bs->opaque;
+    int ret;
 
-    if (fstat(s->fd, &st) < 0) {
-        return -errno;
+    ret = raw_co_fstat(bs, &st);
+
+    if (ret) {
+        return ret;
     }
+
     return (int64_t)st.st_blocks * 512;
 }
 
diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h
index 0fe85ade77..7dc6d24e21 100644
--- a/include/block/raw-aio.h
+++ b/include/block/raw-aio.h
@@ -31,6 +31,7 @@
 #define QEMU_AIO_ZONE_REPORT  0x0100
 #define QEMU_AIO_ZONE_MGMT    0x0200
 #define QEMU_AIO_ZONE_APPEND  0x0400
+#define QEMU_AIO_FSTAT        0x0800
 #define QEMU_AIO_TYPE_MASK \
         (QEMU_AIO_READ | \
          QEMU_AIO_WRITE | \
@@ -42,7 +43,8 @@
          QEMU_AIO_TRUNCATE | \
          QEMU_AIO_ZONE_REPORT | \
          QEMU_AIO_ZONE_MGMT | \
-         QEMU_AIO_ZONE_APPEND)
+         QEMU_AIO_ZONE_APPEND | \
+         QEMU_AIO_FSTAT)
 
 /* AIO flags */
 #define QEMU_AIO_MISALIGNED   0x1000
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 4/6] Convert query-block/info_block to coroutine
  2023-05-23 21:39 ` [RFC PATCH 4/6] " Fabiano Rosas
@ 2023-05-24  4:23   ` Lin Ma
  2023-05-24 12:30     ` Fabiano Rosas
  2023-05-24  8:49   ` Claudio Fontana
  1 sibling, 1 reply; 23+ messages in thread
From: Lin Ma @ 2023-05-24  4:23 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel
  Cc: qemu-block, Kevin Wolf, Hanna Reitz, Markus Armbruster,
	João Silva, Claudio Fontana, Dario Faggioli, Eric Blake

The commit title/message are duplicated to previous one, Here should
use "query-named-block-nodes" instead.

Lin

________________________________________
From: Fabiano Rosas <farosas@suse.de>
Sent: Wednesday, May 24, 2023 5:39 AM
To: qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org; Kevin Wolf; Hanna Reitz; Markus Armbruster; João Silva; Lin Ma; Claudio Fontana; Dario Faggioli; Eric Blake
Subject: [RFC PATCH 4/6] Convert query-block/info_block to coroutine

From: Lin Ma <lma@suse.com>

Sometimes the query-block performs time-consuming I/O(say waiting for
the fstat of NFS complete), So let's make this QMP handler runs in a
coroutine.

The following patch moves the fstat() into a thread pool.

Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 blockdev.c           | 6 +++---
 qapi/block-core.json | 3 ++-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 5d56b79df4..6412548662 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2804,9 +2804,9 @@ void qmp_drive_backup(DriveBackup *backup, Error **errp)
     blockdev_do_action(&action, errp);
 }

-BlockDeviceInfoList *qmp_query_named_block_nodes(bool has_flat,
-                                                 bool flat,
-                                                 Error **errp)
+BlockDeviceInfoList *coroutine_fn qmp_query_named_block_nodes(bool has_flat,
+                                                              bool flat,
+                                                              Error **errp)
 {
     bool return_flat = has_flat && flat;

diff --git a/qapi/block-core.json b/qapi/block-core.json
index da95fe456c..0559c38412 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1972,7 +1972,8 @@
 { 'command': 'query-named-block-nodes',
   'returns': [ 'BlockDeviceInfo' ],
   'data': { '*flat': 'bool' },
-  'allow-preconfig': true }
+  'allow-preconfig': true,
+  'coroutine': true}

 ##
 # @XDbgBlockGraphNodeType:
--
2.35.3



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 1/6] block: Remove bdrv_query_block_node_info
  2023-05-23 21:38 ` [RFC PATCH 1/6] block: Remove bdrv_query_block_node_info Fabiano Rosas
@ 2023-05-24  8:31   ` Claudio Fontana
  0 siblings, 0 replies; 23+ messages in thread
From: Claudio Fontana @ 2023-05-24  8:31 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel
  Cc: qemu-block, Kevin Wolf, Hanna Reitz, Markus Armbruster,
	João Silva, Lin Ma, Dario Faggioli, Eric Blake

On 5/23/23 23:38, Fabiano Rosas wrote:
> The last call site of this function has been removed by commit
> c04d0ab026 ("qemu-img: Let info print block graph").
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  block/qapi.c         | 27 ---------------------------
>  include/block/qapi.h |  3 ---
>  2 files changed, 30 deletions(-)
> 
> diff --git a/block/qapi.c b/block/qapi.c
> index f34f95e0ef..79bf80c503 100644
> --- a/block/qapi.c
> +++ b/block/qapi.c
> @@ -309,33 +309,6 @@ out:
>      aio_context_release(bdrv_get_aio_context(bs));
>  }
>  
> -/**
> - * bdrv_query_block_node_info:
> - * @bs: block node to examine
> - * @p_info: location to store node information
> - * @errp: location to store error information
> - *
> - * Store image information about @bs in @p_info.
> - *
> - * @p_info will be set only on success. On error, store error in @errp.
> - */
> -void bdrv_query_block_node_info(BlockDriverState *bs,
> -                                BlockNodeInfo **p_info,
> -                                Error **errp)
> -{
> -    BlockNodeInfo *info;
> -    ERRP_GUARD();
> -
> -    info = g_new0(BlockNodeInfo, 1);
> -    bdrv_do_query_node_info(bs, info, errp);
> -    if (*errp) {
> -        qapi_free_BlockNodeInfo(info);
> -        return;
> -    }
> -
> -    *p_info = info;
> -}
> -
>  /**
>   * bdrv_query_image_info:
>   * @bs: block node to examine
> diff --git a/include/block/qapi.h b/include/block/qapi.h
> index 18d48ddb70..8663971c58 100644
> --- a/include/block/qapi.h
> +++ b/include/block/qapi.h
> @@ -36,9 +36,6 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
>  int bdrv_query_snapshot_info_list(BlockDriverState *bs,
>                                    SnapshotInfoList **p_list,
>                                    Error **errp);
> -void bdrv_query_block_node_info(BlockDriverState *bs,
> -                                BlockNodeInfo **p_info,
> -                                Error **errp);
>  void bdrv_query_image_info(BlockDriverState *bs,
>                             ImageInfo **p_info,
>                             bool flat,

Reviewed-by: Claudio Fontana <cfontana@suse.de>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 3/6] Convert query-block/info_block to coroutine
  2023-05-23 21:39 ` [RFC PATCH 3/6] Convert query-block/info_block to coroutine Fabiano Rosas
@ 2023-05-24  8:48   ` Claudio Fontana
  2023-05-25 14:26   ` Eric Blake
  1 sibling, 0 replies; 23+ messages in thread
From: Claudio Fontana @ 2023-05-24  8:48 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel
  Cc: qemu-block, Kevin Wolf, Hanna Reitz, Markus Armbruster,
	João Silva, Lin Ma, Dario Faggioli, Eric Blake,
	Dr. David Alan Gilbert

On 5/23/23 23:39, Fabiano Rosas wrote:
> From: Lin Ma <lma@suse.com>
> 
> Sometimes the query-block performs time-consuming I/O(say waiting for
> the fstat of NFS complete), So let's make this QMP handler runs in a
> coroutine.
> 
> The following patch moves the fstat() into a thread pool.


Hi, this message talks about QMP and query-block only and the subject talks about info_block as well.
We need to clarify this.

Also, lets make it clear that one is a QMP command, and the other is an HMP command.

In any case I would say:

"Convert QMP command query-block to use a coroutine", and in case we also need info_block, an additional patch could take care of the HMP command with a subject like:

"Convert HMP command info_block to use a coroutine".

Ciao,

Claudio


> 
> Signed-off-by: Lin Ma <lma@suse.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  block/monitor/block-hmp-cmds.c | 2 +-
>  block/qapi.c                   | 2 +-
>  hmp-commands-info.hx           | 1 +
>  include/block/block-hmp-cmds.h | 2 +-
>  qapi/block-core.json           | 2 +-
>  5 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
> index ca2599de44..4b704010bc 100644
> --- a/block/monitor/block-hmp-cmds.c
> +++ b/block/monitor/block-hmp-cmds.c
> @@ -738,7 +738,7 @@ static void print_block_info(Monitor *mon, BlockInfo *info,
>      }
>  }
>  
> -void hmp_info_block(Monitor *mon, const QDict *qdict)
> +void coroutine_fn hmp_info_block(Monitor *mon, const QDict *qdict)
>  {
>      BlockInfoList *block_list, *info;
>      BlockDeviceInfoList *blockdev_list, *blockdev;
> diff --git a/block/qapi.c b/block/qapi.c
> index 79bf80c503..ae6cd1c2ff 100644
> --- a/block/qapi.c
> +++ b/block/qapi.c
> @@ -667,7 +667,7 @@ bdrv_query_bds_stats(BlockDriverState *bs, bool blk_level)
>      return s;
>  }
>  
> -BlockInfoList *qmp_query_block(Error **errp)
> +BlockInfoList *coroutine_fn qmp_query_block(Error **errp)
>  {
>      BlockInfoList *head = NULL, **p_next = &head;
>      BlockBackend *blk;
> diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
> index 47d63d26db..996895f417 100644
> --- a/hmp-commands-info.hx
> +++ b/hmp-commands-info.hx
> @@ -65,6 +65,7 @@ ERST
>          .help       = "show info of one block device or all block devices "
>                        "(-n: show named nodes; -v: show details)",
>          .cmd        = hmp_info_block,
> +        .coroutine  = true,
>      },
>  
>  SRST
> diff --git a/include/block/block-hmp-cmds.h b/include/block/block-hmp-cmds.h
> index 71113cd7ef..6d9152318f 100644
> --- a/include/block/block-hmp-cmds.h
> +++ b/include/block/block-hmp-cmds.h
> @@ -48,7 +48,7 @@ void hmp_eject(Monitor *mon, const QDict *qdict);
>  
>  void hmp_qemu_io(Monitor *mon, const QDict *qdict);
>  
> -void hmp_info_block(Monitor *mon, const QDict *qdict);
> +void coroutine_fn hmp_info_block(Monitor *mon, const QDict *qdict);
>  void hmp_info_blockstats(Monitor *mon, const QDict *qdict);
>  void hmp_info_block_jobs(Monitor *mon, const QDict *qdict);
>  void hmp_info_snapshots(Monitor *mon, const QDict *qdict);
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 98d9116dae..da95fe456c 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -838,7 +838,7 @@
>  #    }
>  ##
>  { 'command': 'query-block', 'returns': ['BlockInfo'],
> -  'allow-preconfig': true }
> +  'allow-preconfig': true, 'coroutine': true }
>  
>  ##
>  # @BlockDeviceTimedStats:



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 4/6] Convert query-block/info_block to coroutine
  2023-05-23 21:39 ` [RFC PATCH 4/6] " Fabiano Rosas
  2023-05-24  4:23   ` Lin Ma
@ 2023-05-24  8:49   ` Claudio Fontana
  2023-05-24  9:24     ` Lin Ma
  1 sibling, 1 reply; 23+ messages in thread
From: Claudio Fontana @ 2023-05-24  8:49 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel
  Cc: qemu-block, Kevin Wolf, Hanna Reitz, Markus Armbruster,
	João Silva, Lin Ma, Dario Faggioli, Eric Blake

On 5/23/23 23:39, Fabiano Rosas wrote:
> From: Lin Ma <lma@suse.com>
> 
> Sometimes the query-block performs time-consuming I/O(say waiting for
> the fstat of NFS complete), So let's make this QMP handler runs in a
> coroutine.
> 
> The following patch moves the fstat() into a thread pool.
> 
> Signed-off-by: Lin Ma <lma@suse.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

Apart from the wrong subject,

why is this change not including the update to:

hmp-commands-info.hx

like the previous one?

Thanks,

C

> ---
>  blockdev.c           | 6 +++---
>  qapi/block-core.json | 3 ++-
>  2 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 5d56b79df4..6412548662 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -2804,9 +2804,9 @@ void qmp_drive_backup(DriveBackup *backup, Error **errp)
>      blockdev_do_action(&action, errp);
>  }
>  
> -BlockDeviceInfoList *qmp_query_named_block_nodes(bool has_flat,
> -                                                 bool flat,
> -                                                 Error **errp)
> +BlockDeviceInfoList *coroutine_fn qmp_query_named_block_nodes(bool has_flat,
> +                                                              bool flat,
> +                                                              Error **errp)
>  {
>      bool return_flat = has_flat && flat;
>  
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index da95fe456c..0559c38412 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1972,7 +1972,8 @@
>  { 'command': 'query-named-block-nodes',
>    'returns': [ 'BlockDeviceInfo' ],
>    'data': { '*flat': 'bool' },
> -  'allow-preconfig': true }
> +  'allow-preconfig': true,
> +  'coroutine': true}
>  
>  ##
>  # @XDbgBlockGraphNodeType:



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 4/6] Convert query-block/info_block to coroutine
  2023-05-24  8:49   ` Claudio Fontana
@ 2023-05-24  9:24     ` Lin Ma
  2023-05-24 15:57       ` Claudio Fontana
  0 siblings, 1 reply; 23+ messages in thread
From: Lin Ma @ 2023-05-24  9:24 UTC (permalink / raw)
  To: Claudio Fontana, Fabiano Rosas, qemu-devel
  Cc: qemu-block, Kevin Wolf, Hanna Reitz, Markus Armbruster,
	João Silva, Dario Faggioli, Eric Blake

The query-named-block-nodes is only availabe for qmp, not support hmp yet.

Lin

________________________________________
From: Claudio Fontana <cfontana@suse.de>
Sent: Wednesday, May 24, 2023 4:49 PM
To: Fabiano Rosas; qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org; Kevin Wolf; Hanna Reitz; Markus Armbruster; João Silva; Lin Ma; Dario Faggioli; Eric Blake
Subject: Re: [RFC PATCH 4/6] Convert query-block/info_block to coroutine

On 5/23/23 23:39, Fabiano Rosas wrote:
> From: Lin Ma <lma@suse.com>
>
> Sometimes the query-block performs time-consuming I/O(say waiting for
> the fstat of NFS complete), So let's make this QMP handler runs in a
> coroutine.
>
> The following patch moves the fstat() into a thread pool.
>
> Signed-off-by: Lin Ma <lma@suse.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

Apart from the wrong subject,

why is this change not including the update to:

hmp-commands-info.hx

like the previous one?

Thanks,

C

> ---
>  blockdev.c           | 6 +++---
>  qapi/block-core.json | 3 ++-
>  2 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/blockdev.c b/blockdev.c
> index 5d56b79df4..6412548662 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -2804,9 +2804,9 @@ void qmp_drive_backup(DriveBackup *backup, Error **errp)
>      blockdev_do_action(&action, errp);
>  }
>
> -BlockDeviceInfoList *qmp_query_named_block_nodes(bool has_flat,
> -                                                 bool flat,
> -                                                 Error **errp)
> +BlockDeviceInfoList *coroutine_fn qmp_query_named_block_nodes(bool has_flat,
> +                                                              bool flat,
> +                                                              Error **errp)
>  {
>      bool return_flat = has_flat && flat;
>
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index da95fe456c..0559c38412 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1972,7 +1972,8 @@
>  { 'command': 'query-named-block-nodes',
>    'returns': [ 'BlockDeviceInfo' ],
>    'data': { '*flat': 'bool' },
> -  'allow-preconfig': true }
> +  'allow-preconfig': true,
> +  'coroutine': true}
>
>  ##
>  # @XDbgBlockGraphNodeType:



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 4/6] Convert query-block/info_block to coroutine
  2023-05-24  4:23   ` Lin Ma
@ 2023-05-24 12:30     ` Fabiano Rosas
  0 siblings, 0 replies; 23+ messages in thread
From: Fabiano Rosas @ 2023-05-24 12:30 UTC (permalink / raw)
  To: Lin Ma, qemu-devel
  Cc: qemu-block, Kevin Wolf, Hanna Reitz, Markus Armbruster,
	João Silva, Claudio Fontana, Dario Faggioli, Eric Blake

Lin Ma <LMa@suse.com> writes:

> The commit title/message are duplicated to previous one, Here should
> use "query-named-block-nodes" instead.
>
> Lin
>

Ugh, what a blunder, they're even nicely aligned in the git log. I'll
fix it in the next version.

Thanks!



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 6/6] block: Add a thread-pool version of fstat
  2023-05-23 21:39 ` [RFC PATCH 6/6] block: Add a thread-pool version of fstat Fabiano Rosas
@ 2023-05-24 15:56   ` Claudio Fontana
  2023-05-25 15:45   ` Eric Blake
  1 sibling, 0 replies; 23+ messages in thread
From: Claudio Fontana @ 2023-05-24 15:56 UTC (permalink / raw)
  To: Fabiano Rosas, qemu-devel
  Cc: qemu-block, Kevin Wolf, Hanna Reitz, Markus Armbruster,
	João Silva, Lin Ma, Dario Faggioli, Eric Blake

On 5/23/23 23:39, Fabiano Rosas wrote:
> From: João Silva <jsilva@suse.de>
> 
> The fstat call can take a long time to finish when running over
> NFS. Add a version of it that runs in the thread pool.
> 
> Adapt one of its users, raw_co_get_allocated_file size to use the new
> version. That function is called via QMP under the qemu_global_mutex
> so it has a large chance of blocking VCPU threads in case it takes too
> long to finish.
> 
> Signed-off-by: João Silva <jsilva@suse.de>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  block/file-posix.c      | 40 +++++++++++++++++++++++++++++++++++++---
>  include/block/raw-aio.h |  4 +++-
>  2 files changed, 40 insertions(+), 4 deletions(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 0ab158efba..0a29a6cc43 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -227,6 +227,9 @@ typedef struct RawPosixAIOData {
>          struct {
>              unsigned long op;
>          } zone_mgmt;
> +        struct {
> +            struct stat *st;
> +        } fstat;
>      };
>  } RawPosixAIOData;
>  
> @@ -2644,6 +2647,34 @@ static void raw_close(BlockDriverState *bs)
>      }
>  }
>  
> +static int handle_aiocb_fstat(void *opaque)
> +{
> +    RawPosixAIOData *aiocb = opaque;
> +
> +    if (fstat(aiocb->aio_fildes, aiocb->fstat.st) < 0) {
> +        return -errno;
> +    }
> +
> +    return 0;
> +}
> +
> +static int coroutine_fn raw_co_fstat(BlockDriverState *bs, struct stat *st)
> +{
> +    BDRVRawState *s = bs->opaque;
> +    RawPosixAIOData acb;
> +
> +    acb = (RawPosixAIOData) {
> +        .bs             = bs,
> +        .aio_fildes     = s->fd,
> +        .aio_type       = QEMU_AIO_FSTAT,
> +        .fstat          = {
> +            .st = st,
> +        },
> +    };
> +
> +    return raw_thread_pool_submit(handle_aiocb_fstat, &acb);
> +}
> +
>  /**
>   * Truncates the given regular file @fd to @offset and, when growing, fills the
>   * new space according to @prealloc.
> @@ -2883,11 +2914,14 @@ static int64_t coroutine_fn raw_co_getlength(BlockDriverState *bs)
>  static int64_t coroutine_fn raw_co_get_allocated_file_size(BlockDriverState *bs)
>  {
>      struct stat st;
> -    BDRVRawState *s = bs->opaque;
> +    int ret;
>  
> -    if (fstat(s->fd, &st) < 0) {
> -        return -errno;
> +    ret = raw_co_fstat(bs, &st);
> +
> +    if (ret) {
> +        return ret;
>      }
> +
>      return (int64_t)st.st_blocks * 512;
>  }
>  
> diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h
> index 0fe85ade77..7dc6d24e21 100644
> --- a/include/block/raw-aio.h
> +++ b/include/block/raw-aio.h
> @@ -31,6 +31,7 @@
>  #define QEMU_AIO_ZONE_REPORT  0x0100
>  #define QEMU_AIO_ZONE_MGMT    0x0200
>  #define QEMU_AIO_ZONE_APPEND  0x0400
> +#define QEMU_AIO_FSTAT        0x0800
>  #define QEMU_AIO_TYPE_MASK \
>          (QEMU_AIO_READ | \
>           QEMU_AIO_WRITE | \
> @@ -42,7 +43,8 @@
>           QEMU_AIO_TRUNCATE | \
>           QEMU_AIO_ZONE_REPORT | \
>           QEMU_AIO_ZONE_MGMT | \
> -         QEMU_AIO_ZONE_APPEND)
> +         QEMU_AIO_ZONE_APPEND | \
> +         QEMU_AIO_FSTAT)
>  
>  /* AIO flags */
>  #define QEMU_AIO_MISALIGNED   0x1000


Reviewed-by: Claudio Fontana <cfontana@suse.de>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 4/6] Convert query-block/info_block to coroutine
  2023-05-24  9:24     ` Lin Ma
@ 2023-05-24 15:57       ` Claudio Fontana
  0 siblings, 0 replies; 23+ messages in thread
From: Claudio Fontana @ 2023-05-24 15:57 UTC (permalink / raw)
  To: Lin Ma, Fabiano Rosas, qemu-devel
  Cc: qemu-block, Kevin Wolf, Hanna Reitz, Markus Armbruster,
	João Silva, Dario Faggioli, Eric Blake

On 5/24/23 11:24, Lin Ma wrote:
> The query-named-block-nodes is only availabe for qmp, not support hmp yet.
> 
> Lin

Ok, makes sense.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 2/6] block: Mark bdrv_co_get_allocated_file_size() as mixed
  2023-05-23 21:38 ` [RFC PATCH 2/6] block: Mark bdrv_co_get_allocated_file_size() as mixed Fabiano Rosas
@ 2023-05-25 14:17   ` Eric Blake
  2023-05-26  9:12   ` Kevin Wolf
  1 sibling, 0 replies; 23+ messages in thread
From: Eric Blake @ 2023-05-25 14:17 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: qemu-devel, qemu-block, Kevin Wolf, Hanna Reitz,
	Markus Armbruster, João Silva, Lin Ma, Claudio Fontana,
	Dario Faggioli

On Tue, May 23, 2023 at 06:38:59PM -0300, Fabiano Rosas wrote:
> Some callers of this function are about to be converted to use
> coroutines, so allow it to be executed both inside and outside a
> coroutine.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  include/block/block-io.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

> 
> diff --git a/include/block/block-io.h b/include/block/block-io.h
> index a27e471a87..c1f96faca5 100644
> --- a/include/block/block-io.h
> +++ b/include/block/block-io.h
> @@ -87,7 +87,7 @@ int64_t co_wrapper_mixed_bdrv_rdlock bdrv_getlength(BlockDriverState *bs);
>  int64_t coroutine_fn GRAPH_RDLOCK
>  bdrv_co_get_allocated_file_size(BlockDriverState *bs);
>  
> -int64_t co_wrapper_bdrv_rdlock
> +int64_t co_wrapper_mixed_bdrv_rdlock
>  bdrv_get_allocated_file_size(BlockDriverState *bs);
>  
>  BlockMeasureInfo *bdrv_measure(BlockDriver *drv, QemuOpts *opts,
> -- 
> 2.35.3
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 3/6] Convert query-block/info_block to coroutine
  2023-05-23 21:39 ` [RFC PATCH 3/6] Convert query-block/info_block to coroutine Fabiano Rosas
  2023-05-24  8:48   ` Claudio Fontana
@ 2023-05-25 14:26   ` Eric Blake
  2023-05-26 14:05     ` Fabiano Rosas
  1 sibling, 1 reply; 23+ messages in thread
From: Eric Blake @ 2023-05-25 14:26 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: qemu-devel, qemu-block, Kevin Wolf, Hanna Reitz,
	Markus Armbruster, João Silva, Lin Ma, Claudio Fontana,
	Dario Faggioli, Dr. David Alan Gilbert

On Tue, May 23, 2023 at 06:39:00PM -0300, Fabiano Rosas wrote:
> From: Lin Ma <lma@suse.com>
> 
> Sometimes the query-block performs time-consuming I/O(say waiting for
> the fstat of NFS complete), So let's make this QMP handler runs in a
> coroutine.

Grammar suggestions:

Sometimes the query-block command performs time-consuming I/O (say
waiting for the fstat of NFS to complete), so let's make this QMP
handler run in a coroutine.

> 
> The following patch moves the fstat() into a thread pool.
> 
> Signed-off-by: Lin Ma <lma@suse.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

> ---
> +++ b/qapi/block-core.json
> @@ -838,7 +838,7 @@
>  #    }
>  ##
>  { 'command': 'query-block', 'returns': ['BlockInfo'],
> -  'allow-preconfig': true }
> +  'allow-preconfig': true, 'coroutine': true }

Rereading docs/devel/qapi-code-gen.rst:

| Coroutine safety can be hard to prove, similar to thread safety.  Common
| pitfalls are:
| 
| - The global mutex isn't held across ``qemu_coroutine_yield()``, so
|   operations that used to assume that they execute atomically may have
|   to be more careful to protect against changes in the global state.
| 
| - Nested event loops (``AIO_WAIT_WHILE()`` etc.) are problematic in
|   coroutine context and can easily lead to deadlocks.  They should be
|   replaced by yielding and reentering the coroutine when the condition
|   becomes false.
| 
| Since the command handler may assume coroutine context, any callers
| other than the QMP dispatcher must also call it in coroutine context.
| In particular, HMP commands calling such a QMP command handler must be
| marked ``.coroutine = true`` in hmp-commands.hx.

I agree that you also need to change the HMP handler, but the commit
message didn't mention that other than in the subject line.  The
commit message could also do a better job at explaining how you have
audited that merely adding the coroutine marker is safe (ie. that all
of the calls made by query_block are already coroutine safe).  While
the intent behind this patch is on the right track, I'm not sure if I
have enough information to say whether it is safe, or if there are
other lurking problems we will need to fix first.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 6/6] block: Add a thread-pool version of fstat
  2023-05-23 21:39 ` [RFC PATCH 6/6] block: Add a thread-pool version of fstat Fabiano Rosas
  2023-05-24 15:56   ` Claudio Fontana
@ 2023-05-25 15:45   ` Eric Blake
  2023-05-26 14:20     ` Fabiano Rosas
  1 sibling, 1 reply; 23+ messages in thread
From: Eric Blake @ 2023-05-25 15:45 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: qemu-devel, qemu-block, Kevin Wolf, Hanna Reitz,
	Markus Armbruster, João Silva, Lin Ma, Claudio Fontana,
	Dario Faggioli

On Tue, May 23, 2023 at 06:39:03PM -0300, Fabiano Rosas wrote:
> From: João Silva <jsilva@suse.de>
> 
> The fstat call can take a long time to finish when running over
> NFS. Add a version of it that runs in the thread pool.
> 
> Adapt one of its users, raw_co_get_allocated_file size to use the new
> version. That function is called via QMP under the qemu_global_mutex
> so it has a large chance of blocking VCPU threads in case it takes too
> long to finish.
> 
> Signed-off-by: João Silva <jsilva@suse.de>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  block/file-posix.c      | 40 +++++++++++++++++++++++++++++++++++++---
>  include/block/raw-aio.h |  4 +++-
>  2 files changed, 40 insertions(+), 4 deletions(-)

Should this change occur earlier in the series, before calling
commands are marked with QAPI coroutine flags?  Otherwise, you have a
bisection bug, where something marked coroutine can end up hanging
when it calls a blocking syscall in the wrong context without the help
of this patch offloading the syscall into a helper thread.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 2/6] block: Mark bdrv_co_get_allocated_file_size() as mixed
  2023-05-23 21:38 ` [RFC PATCH 2/6] block: Mark bdrv_co_get_allocated_file_size() as mixed Fabiano Rosas
  2023-05-25 14:17   ` Eric Blake
@ 2023-05-26  9:12   ` Kevin Wolf
  1 sibling, 0 replies; 23+ messages in thread
From: Kevin Wolf @ 2023-05-26  9:12 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: qemu-devel, qemu-block, Hanna Reitz, Markus Armbruster,
	João Silva, Lin Ma, Claudio Fontana, Dario Faggioli,
	Eric Blake

Am 23.05.2023 um 23:38 hat Fabiano Rosas geschrieben:
> Some callers of this function are about to be converted to use
> coroutines, so allow it to be executed both inside and outside a
> coroutine.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

This is not a sufficient justification for introducing a new mixed
function (we want to get rid of them, not add new ones).

You need to explain why the new coroutine callers can't directly call
bdrv_co_get_allocated_file_size() instead of going through the wrapper.
This is usually only the case if you have a function that doesn't know
whether it runs in coroutine context or not. Functions that you
explicitly convert to coroutine_fn know for sure.

>  include/block/block-io.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/block/block-io.h b/include/block/block-io.h
> index a27e471a87..c1f96faca5 100644
> --- a/include/block/block-io.h
> +++ b/include/block/block-io.h
> @@ -87,7 +87,7 @@ int64_t co_wrapper_mixed_bdrv_rdlock bdrv_getlength(BlockDriverState *bs);
>  int64_t coroutine_fn GRAPH_RDLOCK
>  bdrv_co_get_allocated_file_size(BlockDriverState *bs);
>  
> -int64_t co_wrapper_bdrv_rdlock
> +int64_t co_wrapper_mixed_bdrv_rdlock
>  bdrv_get_allocated_file_size(BlockDriverState *bs);

You're changing bdrv_get_allocated_file_size() (which is the
function you really mean), but the subject line talks about
bdrv_co_get_allocated_file_size().

Kevin



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 5/6] block: Allow bdrv_get_allocated_file_size to run in bdrv context
  2023-05-23 21:39 ` [RFC PATCH 5/6] block: Allow bdrv_get_allocated_file_size to run in bdrv context Fabiano Rosas
@ 2023-05-26  9:28   ` Kevin Wolf
  2023-05-29 17:47     ` Fabiano Rosas
  0 siblings, 1 reply; 23+ messages in thread
From: Kevin Wolf @ 2023-05-26  9:28 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: qemu-devel, qemu-block, Hanna Reitz, Markus Armbruster,
	João Silva, Lin Ma, Claudio Fontana, Dario Faggioli,
	Eric Blake

Am 23.05.2023 um 23:39 hat Fabiano Rosas geschrieben:
> We're about to move calls to 'fstat' into the thread-pool to avoid
> blocking VCPU threads should the system call take too long.
> 
> To achieve that we first need to make sure none of its callers is
> holding the aio_context lock, otherwise yielding before scheduling the
> aiocb handler would result in a deadlock when the qemu_global_mutex is
> released and another thread tries to acquire the aio_context.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  block/qapi.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/block/qapi.c b/block/qapi.c
> index ae6cd1c2ff..cd197abf1f 100644
> --- a/block/qapi.c
> +++ b/block/qapi.c
> @@ -222,6 +222,26 @@ int bdrv_query_snapshot_info_list(BlockDriverState *bs,
>      return 0;
>  }
>  
> +static int64_t bdrv_get_actual_size(BlockDriverState *bs)
> +{
> +    int64_t size;
> +    AioContext *old_ctx = NULL;
> +
> +    if (qemu_in_coroutine()) {

Hm. Why can't we make sure that it always runs in a coroutine?

Callers:

* bdrv_query_block_node_info(). This functions seems to be completely
  unused, we can remove it.

* bdrv_query_image_info(). Called by bdrv_block_device_info() and itself.
  bdrv_block_device_info() could become a co_wrapper after swapping the
  first two parameters so that it runs in the AioContext of @bs.

* bdrv_query_block_graph_info(). Only called by qemu-img. Could become a
  co_wrapper_bdrv_rdlock.

> +        aio_context_release(bdrv_get_aio_context(bs));
> +        old_ctx = bdrv_co_enter(bs);

I think this is the wrong function to do this. The caller should already
make sure that it's in the right AioContext.

> +    }
> +
> +    size = bdrv_get_allocated_file_size(bs);
> +
> +    if (qemu_in_coroutine() && old_ctx) {
> +        bdrv_co_leave(bs, old_ctx);
> +        aio_context_acquire(bdrv_get_aio_context(bs));
> +    }
> +
> +    return size;
> +}

Kevin



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 3/6] Convert query-block/info_block to coroutine
  2023-05-25 14:26   ` Eric Blake
@ 2023-05-26 14:05     ` Fabiano Rosas
  0 siblings, 0 replies; 23+ messages in thread
From: Fabiano Rosas @ 2023-05-26 14:05 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, qemu-block, Kevin Wolf, Hanna Reitz,
	Markus Armbruster, João Silva, Lin Ma, Claudio Fontana,
	Dario Faggioli, Dr. David Alan Gilbert

Eric Blake <eblake@redhat.com> writes:

> On Tue, May 23, 2023 at 06:39:00PM -0300, Fabiano Rosas wrote:
>> From: Lin Ma <lma@suse.com>
>> 
>> Sometimes the query-block performs time-consuming I/O(say waiting for
>> the fstat of NFS complete), So let's make this QMP handler runs in a
>> coroutine.
>
> Grammar suggestions:
>
> Sometimes the query-block command performs time-consuming I/O (say
> waiting for the fstat of NFS to complete), so let's make this QMP
> handler run in a coroutine.
>

Thanks!

>> 
>> The following patch moves the fstat() into a thread pool.
>> 
>> Signed-off-by: Lin Ma <lma@suse.com>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>
>> ---
>> +++ b/qapi/block-core.json
>> @@ -838,7 +838,7 @@
>>  #    }
>>  ##
>>  { 'command': 'query-block', 'returns': ['BlockInfo'],
>> -  'allow-preconfig': true }
>> +  'allow-preconfig': true, 'coroutine': true }
>
> Rereading docs/devel/qapi-code-gen.rst:
>
> | Coroutine safety can be hard to prove, similar to thread safety.  Common
> | pitfalls are:
> | 
> | - The global mutex isn't held across ``qemu_coroutine_yield()``, so
> |   operations that used to assume that they execute atomically may have
> |   to be more careful to protect against changes in the global state.
> | 
> | - Nested event loops (``AIO_WAIT_WHILE()`` etc.) are problematic in
> |   coroutine context and can easily lead to deadlocks.  They should be
> |   replaced by yielding and reentering the coroutine when the condition
> |   becomes false.
> | 
> | Since the command handler may assume coroutine context, any callers
> | other than the QMP dispatcher must also call it in coroutine context.
> | In particular, HMP commands calling such a QMP command handler must be
> | marked ``.coroutine = true`` in hmp-commands.hx.
>
> I agree that you also need to change the HMP handler, but the commit
> message didn't mention that other than in the subject line.  The

Ok, I'll update the message for v2.

> commit message could also do a better job at explaining how you have
> audited that merely adding the coroutine marker is safe (ie. that all
> of the calls made by query_block are already coroutine safe).  While
> the intent behind this patch is on the right track, I'm not sure if I
> have enough information to say whether it is safe, or if there are
> other lurking problems we will need to fix first.

Fair point, I've been trusting the tests too much. A closer code audit
is necessary indeed.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 6/6] block: Add a thread-pool version of fstat
  2023-05-25 15:45   ` Eric Blake
@ 2023-05-26 14:20     ` Fabiano Rosas
  0 siblings, 0 replies; 23+ messages in thread
From: Fabiano Rosas @ 2023-05-26 14:20 UTC (permalink / raw)
  To: Eric Blake
  Cc: qemu-devel, qemu-block, Kevin Wolf, Hanna Reitz,
	Markus Armbruster, João Silva, Lin Ma, Claudio Fontana,
	Dario Faggioli

Eric Blake <eblake@redhat.com> writes:

> On Tue, May 23, 2023 at 06:39:03PM -0300, Fabiano Rosas wrote:
>> From: João Silva <jsilva@suse.de>
>> 
>> The fstat call can take a long time to finish when running over
>> NFS. Add a version of it that runs in the thread pool.
>> 
>> Adapt one of its users, raw_co_get_allocated_file size to use the new
>> version. That function is called via QMP under the qemu_global_mutex
>> so it has a large chance of blocking VCPU threads in case it takes too
>> long to finish.
>> 
>> Signed-off-by: João Silva <jsilva@suse.de>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>>  block/file-posix.c      | 40 +++++++++++++++++++++++++++++++++++++---
>>  include/block/raw-aio.h |  4 +++-
>>  2 files changed, 40 insertions(+), 4 deletions(-)
>
> Should this change occur earlier in the series, before calling
> commands are marked with QAPI coroutine flags?  Otherwise, you have a
> bisection bug, where something marked coroutine can end up hanging
> when it calls a blocking syscall in the wrong context without the help
> of this patch offloading the syscall into a helper thread.

Hmm, I'm not sure. To submit the work to the thread pool we need to be
in a coroutine already. If the syscall blocks for too long we'd be
trading blocking the coroutine vs. blocking a vcpu thread anyway.

I have tested each patch to avoid bisection issues, but maybe it would
be warranted to merge both parts into a single patch. Or arrange them in
some other way... I'll experiment with it.



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 5/6] block: Allow bdrv_get_allocated_file_size to run in bdrv context
  2023-05-26  9:28   ` Kevin Wolf
@ 2023-05-29 17:47     ` Fabiano Rosas
  0 siblings, 0 replies; 23+ messages in thread
From: Fabiano Rosas @ 2023-05-29 17:47 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-devel, qemu-block, Hanna Reitz, Markus Armbruster,
	João Silva, Lin Ma, Claudio Fontana, Dario Faggioli,
	Eric Blake

Kevin Wolf <kwolf@redhat.com> writes:

> Am 23.05.2023 um 23:39 hat Fabiano Rosas geschrieben:
>> We're about to move calls to 'fstat' into the thread-pool to avoid
>> blocking VCPU threads should the system call take too long.
>> 
>> To achieve that we first need to make sure none of its callers is
>> holding the aio_context lock, otherwise yielding before scheduling the
>> aiocb handler would result in a deadlock when the qemu_global_mutex is
>> released and another thread tries to acquire the aio_context.
>> 
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>>  block/qapi.c | 22 +++++++++++++++++++++-
>>  1 file changed, 21 insertions(+), 1 deletion(-)
>> 
>> diff --git a/block/qapi.c b/block/qapi.c
>> index ae6cd1c2ff..cd197abf1f 100644
>> --- a/block/qapi.c
>> +++ b/block/qapi.c
>> @@ -222,6 +222,26 @@ int bdrv_query_snapshot_info_list(BlockDriverState *bs,
>>      return 0;
>>  }
>>  
>> +static int64_t bdrv_get_actual_size(BlockDriverState *bs)
>> +{
>> +    int64_t size;
>> +    AioContext *old_ctx = NULL;
>> +
>> +    if (qemu_in_coroutine()) {
>
> Hm. Why can't we make sure that it always runs in a coroutine?
>
> Callers:
>
> * bdrv_query_block_node_info(). This functions seems to be completely
>   unused, we can remove it.
>

Ok, I'm already removing it in patch 1.

> * bdrv_query_image_info(). Called by bdrv_block_device_info() and itself.
>   bdrv_block_device_info() could become a co_wrapper after swapping the
>   first two parameters so that it runs in the AioContext of @bs.
>

We cannot have bdrv_block_device_info() as co_wrapper because that would
create its own coroutine and yielding from that would merely have us
waiting at bdrv_poll_co. So it doesn't solve the blocking issue.

What would work is to make bdrv_block_device_info() a coroutine_fn
(without a wrapper). Its two callers, qmp_query_block() and
qmp_query_named_block_nodes() are already being moved into the handler
coroutine in this series, so it would be mostly a matter of adding the
type annotation.

> * bdrv_query_block_graph_info(). Only called by qemu-img. Could become a
>   co_wrapper_bdrv_rdlock.
>

This one works ok.

>> +        aio_context_release(bdrv_get_aio_context(bs));
>> +        old_ctx = bdrv_co_enter(bs);
>
> I think this is the wrong function to do this. The caller should already
> make sure that it's in the right AioContext.
>

The issue here is that bdrv_do_query_node_info() calls
bdrv_co_get_allocated_file_size(), which is the function we want to make
async and therefore needs to run outside of aio_context_acquire/release.
However, bdrv_do_query_node_info() also calls bdrv_refresh_filename(),
which is GLOBAL_STATE_CODE and therefore wants to be in the main thread
context. So entering the bs AioContext at the caller doesn't work when
giving the device its own iothread.

>> +    }
>> +
>> +    size = bdrv_get_allocated_file_size(bs);
>> +
>> +    if (qemu_in_coroutine() && old_ctx) {
>> +        bdrv_co_leave(bs, old_ctx);
>> +        aio_context_acquire(bdrv_get_aio_context(bs));
>> +    }
>> +
>> +    return size;
>> +}
>
> Kevin


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2023-05-29 17:48 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-23 21:38 [RFC PATCH 0/6] block: Make raw_co_get_allocated_file_size asynchronous Fabiano Rosas
2023-05-23 21:38 ` [RFC PATCH 1/6] block: Remove bdrv_query_block_node_info Fabiano Rosas
2023-05-24  8:31   ` Claudio Fontana
2023-05-23 21:38 ` [RFC PATCH 2/6] block: Mark bdrv_co_get_allocated_file_size() as mixed Fabiano Rosas
2023-05-25 14:17   ` Eric Blake
2023-05-26  9:12   ` Kevin Wolf
2023-05-23 21:39 ` [RFC PATCH 3/6] Convert query-block/info_block to coroutine Fabiano Rosas
2023-05-24  8:48   ` Claudio Fontana
2023-05-25 14:26   ` Eric Blake
2023-05-26 14:05     ` Fabiano Rosas
2023-05-23 21:39 ` [RFC PATCH 4/6] " Fabiano Rosas
2023-05-24  4:23   ` Lin Ma
2023-05-24 12:30     ` Fabiano Rosas
2023-05-24  8:49   ` Claudio Fontana
2023-05-24  9:24     ` Lin Ma
2023-05-24 15:57       ` Claudio Fontana
2023-05-23 21:39 ` [RFC PATCH 5/6] block: Allow bdrv_get_allocated_file_size to run in bdrv context Fabiano Rosas
2023-05-26  9:28   ` Kevin Wolf
2023-05-29 17:47     ` Fabiano Rosas
2023-05-23 21:39 ` [RFC PATCH 6/6] block: Add a thread-pool version of fstat Fabiano Rosas
2023-05-24 15:56   ` Claudio Fontana
2023-05-25 15:45   ` Eric Blake
2023-05-26 14:20     ` Fabiano Rosas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.