* [PATCH v4 0/4] qmp: Optionally run handlers in coroutines
@ 2020-01-21 18:11 Kevin Wolf
  2020-01-21 18:11 ` [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands Kevin Wolf
                   ` (4 more replies)
  0 siblings, 5 replies; 27+ messages in thread
From: Kevin Wolf @ 2020-01-21 18:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, armbru, marcandre.lureau, stefanha

Some QMP command handlers can block the main loop for a relatively long
time, for example because they perform some I/O. This is quite nasty.
Allowing such handlers to run in a coroutine where they can yield (and
therefore release the BQL) while waiting for an event such as I/O
completion solves the problem.
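
For illustration, a handler that takes advantage of this could look
roughly like the following minimal sketch. The 'example-flush' command
and qmp_example_flush() are made up for this cover letter; only the
block layer and error handling calls are existing ones:

    #include "qemu/osdep.h"
    #include "qapi/error.h"
    #include "qemu/coroutine.h"
    #include "sysemu/block-backend.h"

    /* Handler for a hypothetical command declared with 'coroutine': true */
    void coroutine_fn qmp_example_flush(const char *device, Error **errp)
    {
        BlockBackend *blk = blk_by_name(device);
        int ret;

        if (!blk) {
            error_setg(errp, "Device '%s' not found", device);
            return;
        }

        /* Yields while the request is in flight instead of blocking the
         * main loop in a nested event loop */
        ret = blk_co_flush(blk);
        if (ret < 0) {
            error_setg_errno(errp, -ret, "Flush failed");
        }
    }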

This series adds the infrastructure to allow this and switches
block_resize to run in a coroutine as a first example.

This is an alternative solution to Marc-André's "monitor: add
asynchronous command type" series.

v4:
- Forbid 'oob': true, 'coroutine': true [Markus]
- Removed Python type hints [Markus]
- Introduced separate bool qmp_dispatcher_co_shutdown to make it clearer
  how a shutdown request is signalled to the dispatcher [Markus]
- Allow using aio_poll() with iohandler_ctx and use that instead of
  aio_bh_poll() [Markus]
- Removed coroutine_fn from qmp_block_resize() again because at least
  one caller (HMP) calls it outside of coroutine context [Markus]
- Tried to make the synchronisation between dispatcher and the monitor
  thread clearer, and fixed a race condition
- Improved documentation and comments

v3:
- Fix race between monitor thread and dispatcher that could schedule the
  dispatcher coroutine twice if a second request comes in before the
  dispatcher can wake up [Patchew]

v2:
- Fix typo in a commit message [Eric]
- Use hyphen instead of underscore for the test command [Eric]
- Mark qmp_block_resize() as coroutine_fn [Stefan]

Kevin Wolf (4):
  qapi: Add a 'coroutine' flag for commands
  vl: Initialise main loop earlier
  qmp: Move dispatcher to a coroutine
  block: Mark 'block_resize' as coroutine

 qapi/block-core.json                    |   3 +-
 docs/devel/qapi-code-gen.txt            |  10 +++
 include/qapi/qmp/dispatch.h             |   2 +
 monitor/monitor-internal.h              |   6 +-
 monitor/monitor.c                       |  33 +++++--
 monitor/qmp.c                           | 110 +++++++++++++++++-------
 qapi/qmp-dispatch.c                     |  44 +++++++++-
 qapi/qmp-registry.c                     |   3 +
 tests/test-qmp-cmds.c                   |   4 +
 util/aio-posix.c                        |   7 +-
 vl.c                                    |  10 +--
 scripts/qapi/commands.py                |  10 ++-
 scripts/qapi/doc.py                     |   2 +-
 scripts/qapi/expr.py                    |   7 +-
 scripts/qapi/introspect.py              |   2 +-
 scripts/qapi/schema.py                  |   9 +-
 tests/qapi-schema/test-qapi.py          |   7 +-
 tests/Makefile.include                  |   1 +
 tests/qapi-schema/oob-coroutine.err     |   2 +
 tests/qapi-schema/oob-coroutine.json    |   2 +
 tests/qapi-schema/oob-coroutine.out     |   0
 tests/qapi-schema/qapi-schema-test.json |   1 +
 tests/qapi-schema/qapi-schema-test.out  |   2 +
 23 files changed, 216 insertions(+), 61 deletions(-)
 create mode 100644 tests/qapi-schema/oob-coroutine.err
 create mode 100644 tests/qapi-schema/oob-coroutine.json
 create mode 100644 tests/qapi-schema/oob-coroutine.out

-- 
2.20.1




* [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands
  2020-01-21 18:11 [PATCH v4 0/4] qmp: Optionally run handlers in coroutines Kevin Wolf
@ 2020-01-21 18:11 ` Kevin Wolf
  2020-01-22  6:32   ` Markus Armbruster
  2020-02-17  7:08   ` Markus Armbruster
  2020-01-21 18:11 ` [PATCH v4 2/4] vl: Initialise main loop earlier Kevin Wolf
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 27+ messages in thread
From: Kevin Wolf @ 2020-01-21 18:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, armbru, marcandre.lureau, stefanha

This patch adds a new 'coroutine' flag to QMP command definitions that
tells the QMP dispatcher that the command handler is safe to be run in a
coroutine.

The documentation of the new flag pretends that this flag is already
used as intended, which it isn't yet after this patch. We'll implement
this in another patch in this series.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 docs/devel/qapi-code-gen.txt            | 10 ++++++++++
 include/qapi/qmp/dispatch.h             |  1 +
 tests/test-qmp-cmds.c                   |  4 ++++
 scripts/qapi/commands.py                | 10 +++++++---
 scripts/qapi/doc.py                     |  2 +-
 scripts/qapi/expr.py                    |  7 +++++--
 scripts/qapi/introspect.py              |  2 +-
 scripts/qapi/schema.py                  |  9 ++++++---
 tests/qapi-schema/test-qapi.py          |  7 ++++---
 tests/Makefile.include                  |  1 +
 tests/qapi-schema/oob-coroutine.err     |  2 ++
 tests/qapi-schema/oob-coroutine.json    |  2 ++
 tests/qapi-schema/oob-coroutine.out     |  0
 tests/qapi-schema/qapi-schema-test.json |  1 +
 tests/qapi-schema/qapi-schema-test.out  |  2 ++
 15 files changed, 47 insertions(+), 13 deletions(-)
 create mode 100644 tests/qapi-schema/oob-coroutine.err
 create mode 100644 tests/qapi-schema/oob-coroutine.json
 create mode 100644 tests/qapi-schema/oob-coroutine.out

diff --git a/docs/devel/qapi-code-gen.txt b/docs/devel/qapi-code-gen.txt
index 59d6973e1e..9b65cd3ab3 100644
--- a/docs/devel/qapi-code-gen.txt
+++ b/docs/devel/qapi-code-gen.txt
@@ -457,6 +457,7 @@ Syntax:
                 '*gen': false,
                 '*allow-oob': true,
                 '*allow-preconfig': true,
+                '*coroutine': true,
                 '*if': COND,
                 '*features': FEATURES }
 
@@ -581,6 +582,15 @@ before the machine is built.  It defaults to false.  For example:
 QMP is available before the machine is built only when QEMU was
 started with --preconfig.
 
+Member 'coroutine' tells the QMP dispatcher whether the command handler
+is safe to be run in a coroutine.  It defaults to false.  If it is true,
+the command handler is called from coroutine context and may yield while
+waiting for an external event (such as I/O completion) in order to avoid
+blocking the guest and other background operations.
+
+It is an error to specify both 'coroutine': true and 'allow-oob': true
+for a command.
+
 The optional 'if' member specifies a conditional.  See "Configuring
 the schema" below for more on this.
 
diff --git a/include/qapi/qmp/dispatch.h b/include/qapi/qmp/dispatch.h
index 9aa426a398..d6ce9efc8e 100644
--- a/include/qapi/qmp/dispatch.h
+++ b/include/qapi/qmp/dispatch.h
@@ -24,6 +24,7 @@ typedef enum QmpCommandOptions
     QCO_NO_SUCCESS_RESP       =  (1U << 0),
     QCO_ALLOW_OOB             =  (1U << 1),
     QCO_ALLOW_PRECONFIG       =  (1U << 2),
+    QCO_COROUTINE             =  (1U << 3),
 } QmpCommandOptions;
 
 typedef struct QmpCommand
diff --git a/tests/test-qmp-cmds.c b/tests/test-qmp-cmds.c
index 79507d9e54..6359cc28c7 100644
--- a/tests/test-qmp-cmds.c
+++ b/tests/test-qmp-cmds.c
@@ -35,6 +35,10 @@ void qmp_cmd_success_response(Error **errp)
 {
 }
 
+void qmp_coroutine_cmd(Error **errp)
+{
+}
+
 Empty2 *qmp_user_def_cmd0(Error **errp)
 {
     return g_new0(Empty2, 1);
diff --git a/scripts/qapi/commands.py b/scripts/qapi/commands.py
index afa55b055c..f2f2f8948d 100644
--- a/scripts/qapi/commands.py
+++ b/scripts/qapi/commands.py
@@ -194,7 +194,8 @@ out:
     return ret
 
 
-def gen_register_command(name, success_response, allow_oob, allow_preconfig):
+def gen_register_command(name, success_response, allow_oob, allow_preconfig,
+                         coroutine):
     options = []
 
     if not success_response:
@@ -203,6 +204,8 @@ def gen_register_command(name, success_response, allow_oob, allow_preconfig):
         options += ['QCO_ALLOW_OOB']
     if allow_preconfig:
         options += ['QCO_ALLOW_PRECONFIG']
+    if coroutine:
+        options += ['QCO_COROUTINE']
 
     if not options:
         options = ['QCO_NO_OPTIONS']
@@ -285,7 +288,7 @@ void %(c_prefix)sqmp_init_marshal(QmpCommandList *cmds);
 
     def visit_command(self, name, info, ifcond, arg_type, ret_type, gen,
                       success_response, boxed, allow_oob, allow_preconfig,
-                      features):
+                      coroutine, features):
         if not gen:
             return
         # FIXME: If T is a user-defined type, the user is responsible
@@ -303,7 +306,8 @@ void %(c_prefix)sqmp_init_marshal(QmpCommandList *cmds);
             self._genh.add(gen_marshal_decl(name))
             self._genc.add(gen_marshal(name, arg_type, boxed, ret_type))
             self._regy.add(gen_register_command(name, success_response,
-                                                allow_oob, allow_preconfig))
+                                                allow_oob, allow_preconfig,
+                                                coroutine))
 
 
 def gen_commands(schema, output_dir, prefix):
diff --git a/scripts/qapi/doc.py b/scripts/qapi/doc.py
index 6f1c17f71f..8b6978c81e 100644
--- a/scripts/qapi/doc.py
+++ b/scripts/qapi/doc.py
@@ -265,7 +265,7 @@ class QAPISchemaGenDocVisitor(QAPISchemaVisitor):
 
     def visit_command(self, name, info, ifcond, arg_type, ret_type, gen,
                       success_response, boxed, allow_oob, allow_preconfig,
-                      features):
+                      coroutine, features):
         doc = self.cur_doc
         self._gen.add(texi_msg('Command', doc, ifcond,
                                texi_arguments(doc,
diff --git a/scripts/qapi/expr.py b/scripts/qapi/expr.py
index d7a289eded..769c54ebfe 100644
--- a/scripts/qapi/expr.py
+++ b/scripts/qapi/expr.py
@@ -89,10 +89,13 @@ def check_flags(expr, info):
         if key in expr and expr[key] is not False:
             raise QAPISemError(
                 info, "flag '%s' may only use false value" % key)
-    for key in ['boxed', 'allow-oob', 'allow-preconfig']:
+    for key in ['boxed', 'allow-oob', 'allow-preconfig', 'coroutine']:
         if key in expr and expr[key] is not True:
             raise QAPISemError(
                 info, "flag '%s' may only use true value" % key)
+    if 'allow-oob' in expr and 'coroutine' in expr:
+        raise QAPISemError(info, "flags 'allow-oob' and 'coroutine' "
+                                 "are incompatible")
 
 
 def check_if(expr, info, source):
@@ -344,7 +347,7 @@ def check_exprs(exprs):
                        ['command'],
                        ['data', 'returns', 'boxed', 'if', 'features',
                         'gen', 'success-response', 'allow-oob',
-                        'allow-preconfig'])
+                        'allow-preconfig', 'coroutine'])
             normalize_members(expr.get('data'))
             check_command(expr, info)
         elif meta == 'event':
diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
index b3a463dd8b..8a296a69d6 100644
--- a/scripts/qapi/introspect.py
+++ b/scripts/qapi/introspect.py
@@ -212,7 +212,7 @@ const QLitObject %(c_name)s = %(c_string)s;
 
     def visit_command(self, name, info, ifcond, arg_type, ret_type, gen,
                       success_response, boxed, allow_oob, allow_preconfig,
-                      features):
+                      coroutine, features):
         arg_type = arg_type or self._schema.the_empty_object_type
         ret_type = ret_type or self._schema.the_empty_object_type
         obj = {'arg-type': self._use_type(arg_type),
diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
index 0bfc5256fb..87d76b59d3 100644
--- a/scripts/qapi/schema.py
+++ b/scripts/qapi/schema.py
@@ -128,7 +128,7 @@ class QAPISchemaVisitor(object):
 
     def visit_command(self, name, info, ifcond, arg_type, ret_type, gen,
                       success_response, boxed, allow_oob, allow_preconfig,
-                      features):
+                      coroutine, features):
         pass
 
     def visit_event(self, name, info, ifcond, arg_type, boxed):
@@ -690,7 +690,7 @@ class QAPISchemaCommand(QAPISchemaEntity):
 
     def __init__(self, name, info, doc, ifcond, arg_type, ret_type,
                  gen, success_response, boxed, allow_oob, allow_preconfig,
-                 features):
+                 coroutine, features):
         QAPISchemaEntity.__init__(self, name, info, doc, ifcond, features)
         assert not arg_type or isinstance(arg_type, str)
         assert not ret_type or isinstance(ret_type, str)
@@ -703,6 +703,7 @@ class QAPISchemaCommand(QAPISchemaEntity):
         self.boxed = boxed
         self.allow_oob = allow_oob
         self.allow_preconfig = allow_preconfig
+        self.coroutine = coroutine
 
     def check(self, schema):
         QAPISchemaEntity.check(self, schema)
@@ -745,7 +746,7 @@ class QAPISchemaCommand(QAPISchemaEntity):
                               self.arg_type, self.ret_type,
                               self.gen, self.success_response,
                               self.boxed, self.allow_oob,
-                              self.allow_preconfig,
+                              self.allow_preconfig, self.coroutine,
                               self.features)
 
 
@@ -1043,6 +1044,7 @@ class QAPISchema(object):
         boxed = expr.get('boxed', False)
         allow_oob = expr.get('allow-oob', False)
         allow_preconfig = expr.get('allow-preconfig', False)
+        coroutine = expr.get('coroutine', False)
         ifcond = expr.get('if')
         features = expr.get('features', [])
         if isinstance(data, OrderedDict):
@@ -1054,6 +1056,7 @@ class QAPISchema(object):
         self._def_entity(QAPISchemaCommand(name, info, doc, ifcond, data, rets,
                                            gen, success_response,
                                            boxed, allow_oob, allow_preconfig,
+                                           coroutine,
                                            self._make_features(features, info)))
 
     def _def_event(self, expr, info, doc):
diff --git a/tests/qapi-schema/test-qapi.py b/tests/qapi-schema/test-qapi.py
index bad14edb47..7a8e65188d 100755
--- a/tests/qapi-schema/test-qapi.py
+++ b/tests/qapi-schema/test-qapi.py
@@ -70,12 +70,13 @@ class QAPISchemaTestVisitor(QAPISchemaVisitor):
 
     def visit_command(self, name, info, ifcond, arg_type, ret_type, gen,
                       success_response, boxed, allow_oob, allow_preconfig,
-                      features):
+                      coroutine, features):
         print('command %s %s -> %s'
               % (name, arg_type and arg_type.name,
                  ret_type and ret_type.name))
-        print('    gen=%s success_response=%s boxed=%s oob=%s preconfig=%s'
-              % (gen, success_response, boxed, allow_oob, allow_preconfig))
+        print('    gen=%s success_response=%s boxed=%s oob=%s preconfig=%s%s'
+              % (gen, success_response, boxed, allow_oob, allow_preconfig,
+                 " coroutine=True" if coroutine else ""))
         self._print_if(ifcond)
         self._print_features(features)
 
diff --git a/tests/Makefile.include b/tests/Makefile.include
index c6827ce8c2..d111389c03 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -286,6 +286,7 @@ qapi-schema += missing-type.json
 qapi-schema += nested-struct-data.json
 qapi-schema += nested-struct-data-invalid-dict.json
 qapi-schema += non-objects.json
+qapi-schema += oob-coroutine.json
 qapi-schema += oob-test.json
 qapi-schema += allow-preconfig-test.json
 qapi-schema += pragma-doc-required-crap.json
diff --git a/tests/qapi-schema/oob-coroutine.err b/tests/qapi-schema/oob-coroutine.err
new file mode 100644
index 0000000000..c01a4992bd
--- /dev/null
+++ b/tests/qapi-schema/oob-coroutine.err
@@ -0,0 +1,2 @@
+oob-coroutine.json: In command 'oob-command-1':
+oob-coroutine.json:2: flags 'allow-oob' and 'coroutine' are incompatible
diff --git a/tests/qapi-schema/oob-coroutine.json b/tests/qapi-schema/oob-coroutine.json
new file mode 100644
index 0000000000..0f67663bcd
--- /dev/null
+++ b/tests/qapi-schema/oob-coroutine.json
@@ -0,0 +1,2 @@
+# Check that incompatible flags allow-oob and coroutine are rejected
+{ 'command': 'oob-command-1', 'allow-oob': true, 'coroutine': true }
diff --git a/tests/qapi-schema/oob-coroutine.out b/tests/qapi-schema/oob-coroutine.out
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/tests/qapi-schema/qapi-schema-test.json b/tests/qapi-schema/qapi-schema-test.json
index 9abf175fe0..1a850fe171 100644
--- a/tests/qapi-schema/qapi-schema-test.json
+++ b/tests/qapi-schema/qapi-schema-test.json
@@ -147,6 +147,7 @@
   'returns': 'UserDefTwo' }
 
 { 'command': 'cmd-success-response', 'data': {}, 'success-response': false }
+{ 'command': 'coroutine-cmd', 'data': {}, 'coroutine': true }
 
 # Returning a non-dictionary requires a name from the whitelist
 { 'command': 'guest-get-time', 'data': {'a': 'int', '*b': 'int' },
diff --git a/tests/qapi-schema/qapi-schema-test.out b/tests/qapi-schema/qapi-schema-test.out
index 9bd3c4a490..fdc349991a 100644
--- a/tests/qapi-schema/qapi-schema-test.out
+++ b/tests/qapi-schema/qapi-schema-test.out
@@ -203,6 +203,8 @@ command user_def_cmd2 q_obj_user_def_cmd2-arg -> UserDefTwo
     gen=True success_response=True boxed=False oob=False preconfig=False
 command cmd-success-response None -> None
     gen=True success_response=False boxed=False oob=False preconfig=False
+command coroutine-cmd None -> None
+    gen=True success_response=True boxed=False oob=False preconfig=False coroutine=True
 object q_obj_guest-get-time-arg
     member a: int optional=False
     member b: int optional=True
-- 
2.20.1




* [PATCH v4 2/4] vl: Initialise main loop earlier
  2020-01-21 18:11 [PATCH v4 0/4] qmp: Optionally run handlers in coroutines Kevin Wolf
  2020-01-21 18:11 ` [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands Kevin Wolf
@ 2020-01-21 18:11 ` Kevin Wolf
  2020-01-21 18:11 ` [PATCH v4 3/4] qmp: Move dispatcher to a coroutine Kevin Wolf
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 27+ messages in thread
From: Kevin Wolf @ 2020-01-21 18:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, armbru, marcandre.lureau, stefanha

We want to be able to use qemu_aio_context in the monitor
initialisation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
---
 vl.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/vl.c b/vl.c
index 751401214c..8064fef8b3 100644
--- a/vl.c
+++ b/vl.c
@@ -2909,6 +2909,11 @@ int main(int argc, char **argv, char **envp)
     runstate_init();
     precopy_infrastructure_init();
     postcopy_infrastructure_init();
+
+    if (qemu_init_main_loop(&main_loop_err)) {
+        error_report_err(main_loop_err);
+        exit(1);
+    }
     monitor_init_globals();
 
     if (qcrypto_init(&err) < 0) {
@@ -3823,11 +3828,6 @@ int main(int argc, char **argv, char **envp)
     qemu_unlink_pidfile_notifier.notify = qemu_unlink_pidfile;
     qemu_add_exit_notifier(&qemu_unlink_pidfile_notifier);
 
-    if (qemu_init_main_loop(&main_loop_err)) {
-        error_report_err(main_loop_err);
-        exit(1);
-    }
-
 #ifdef CONFIG_SECCOMP
     olist = qemu_find_opts_err("sandbox", NULL);
     if (olist) {
-- 
2.20.1




* [PATCH v4 3/4] qmp: Move dispatcher to a coroutine
  2020-01-21 18:11 [PATCH v4 0/4] qmp: Optionally run handlers in coroutines Kevin Wolf
  2020-01-21 18:11 ` [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands Kevin Wolf
  2020-01-21 18:11 ` [PATCH v4 2/4] vl: Initialise main loop earlier Kevin Wolf
@ 2020-01-21 18:11 ` Kevin Wolf
  2020-02-17 11:08   ` Markus Armbruster
  2020-01-21 18:11 ` [PATCH v4 4/4] block: Mark 'block_resize' as coroutine Kevin Wolf
  2020-02-05 14:00 ` [PATCH v4 0/4] qmp: Optionally run handlers in coroutines Kevin Wolf
  4 siblings, 1 reply; 27+ messages in thread
From: Kevin Wolf @ 2020-01-21 18:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, armbru, marcandre.lureau, stefanha

This moves the QMP dispatcher to a coroutine and runs all QMP command
handlers that declare 'coroutine': true in coroutine context so they
can avoid blocking the main loop while doing I/O or waiting for other
events.

For commands that are not declared safe to run in a coroutine, the
dispatcher drops out of coroutine context by calling the QMP command
handler from a bottom half.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/qapi/qmp/dispatch.h |   1 +
 monitor/monitor-internal.h  |   6 +-
 monitor/monitor.c           |  33 ++++++++---
 monitor/qmp.c               | 110 ++++++++++++++++++++++++++----------
 qapi/qmp-dispatch.c         |  44 ++++++++++++++-
 qapi/qmp-registry.c         |   3 +
 util/aio-posix.c            |   7 ++-
 7 files changed, 162 insertions(+), 42 deletions(-)

diff --git a/include/qapi/qmp/dispatch.h b/include/qapi/qmp/dispatch.h
index d6ce9efc8e..6812e49b5f 100644
--- a/include/qapi/qmp/dispatch.h
+++ b/include/qapi/qmp/dispatch.h
@@ -30,6 +30,7 @@ typedef enum QmpCommandOptions
 typedef struct QmpCommand
 {
     const char *name;
+    /* Runs in coroutine context if QCO_COROUTINE is set */
     QmpCommandFunc *fn;
     QmpCommandOptions options;
     QTAILQ_ENTRY(QmpCommand) node;
diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
index d78f5ca190..f180d03368 100644
--- a/monitor/monitor-internal.h
+++ b/monitor/monitor-internal.h
@@ -154,7 +154,9 @@ static inline bool monitor_is_qmp(const Monitor *mon)
 
 typedef QTAILQ_HEAD(MonitorList, Monitor) MonitorList;
 extern IOThread *mon_iothread;
-extern QEMUBH *qmp_dispatcher_bh;
+extern Coroutine *qmp_dispatcher_co;
+extern bool qmp_dispatcher_co_shutdown;
+extern bool qmp_dispatcher_co_busy;
 extern QmpCommandList qmp_commands, qmp_cap_negotiation_commands;
 extern QemuMutex monitor_lock;
 extern MonitorList mon_list;
@@ -172,7 +174,7 @@ void monitor_fdsets_cleanup(void);
 
 void qmp_send_response(MonitorQMP *mon, const QDict *rsp);
 void monitor_data_destroy_qmp(MonitorQMP *mon);
-void monitor_qmp_bh_dispatcher(void *data);
+void coroutine_fn monitor_qmp_dispatcher_co(void *data);
 
 int get_monitor_def(int64_t *pval, const char *name);
 void help_cmd(Monitor *mon, const char *name);
diff --git a/monitor/monitor.c b/monitor/monitor.c
index 12898b6448..e753fa435d 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -53,8 +53,18 @@ typedef struct {
 /* Shared monitor I/O thread */
 IOThread *mon_iothread;
 
-/* Bottom half to dispatch the requests received from I/O thread */
-QEMUBH *qmp_dispatcher_bh;
+/* Coroutine to dispatch the requests received from I/O thread */
+Coroutine *qmp_dispatcher_co;
+
+/* Set to true when the dispatcher coroutine should terminate */
+bool qmp_dispatcher_co_shutdown;
+
+/*
+ * true if the coroutine is active and processing requests. The coroutine may
+ * only be woken up externally (e.g. from the monitor thread) after changing
+ * qmp_dispatcher_co_busy from false to true (e.g. using atomic_xchg).
+ */
+bool qmp_dispatcher_co_busy;
 
 /* Protects mon_list, monitor_qapi_event_state, monitor_destroyed.  */
 QemuMutex monitor_lock;
@@ -579,9 +589,16 @@ void monitor_cleanup(void)
     }
     qemu_mutex_unlock(&monitor_lock);
 
-    /* QEMUBHs needs to be deleted before destroying the I/O thread */
-    qemu_bh_delete(qmp_dispatcher_bh);
-    qmp_dispatcher_bh = NULL;
+    /* The dispatcher needs to stop before destroying the I/O thread */
+    qmp_dispatcher_co_shutdown = true;
+    if (!atomic_xchg(&qmp_dispatcher_co_busy, true)) {
+        aio_co_wake(qmp_dispatcher_co);
+    }
+
+    AIO_WAIT_WHILE(qemu_get_aio_context(),
+                   (aio_poll(iohandler_get_aio_context(), false),
+                    atomic_mb_read(&qmp_dispatcher_co_busy)));
+
     if (mon_iothread) {
         iothread_destroy(mon_iothread);
         mon_iothread = NULL;
@@ -604,9 +621,9 @@ void monitor_init_globals_core(void)
      * have commands assuming that context.  It would be nice to get
      * rid of those assumptions.
      */
-    qmp_dispatcher_bh = aio_bh_new(iohandler_get_aio_context(),
-                                   monitor_qmp_bh_dispatcher,
-                                   NULL);
+    qmp_dispatcher_co = qemu_coroutine_create(monitor_qmp_dispatcher_co, NULL);
+    atomic_mb_set(&qmp_dispatcher_co_busy, true);
+    aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
 }
 
 QemuOptsList qemu_mon_opts = {
diff --git a/monitor/qmp.c b/monitor/qmp.c
index 54c06ba824..9444de9fcf 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -133,6 +133,10 @@ static void monitor_qmp_respond(MonitorQMP *mon, QDict *rsp)
     }
 }
 
+/*
+ * Runs outside of coroutine context for OOB commands, but in coroutine context
+ * for everything else.
+ */
 static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
 {
     Monitor *old_mon;
@@ -211,43 +215,87 @@ static QMPRequest *monitor_qmp_requests_pop_any_with_lock(void)
     return req_obj;
 }
 
-void monitor_qmp_bh_dispatcher(void *data)
+void coroutine_fn monitor_qmp_dispatcher_co(void *data)
 {
-    QMPRequest *req_obj = monitor_qmp_requests_pop_any_with_lock();
+    QMPRequest *req_obj = NULL;
     QDict *rsp;
     bool need_resume;
     MonitorQMP *mon;
 
-    if (!req_obj) {
-        return;
-    }
+    while (true) {
+        assert(atomic_mb_read(&qmp_dispatcher_co_busy) == true);
+
+        /* Mark the dispatcher as not busy already here so that we don't miss
+         * any new requests coming in the middle of our processing. */
+        atomic_mb_set(&qmp_dispatcher_co_busy, false);
+
+        while (!(req_obj = monitor_qmp_requests_pop_any_with_lock())) {
+            /* Wait to be reentered from handle_qmp_command, or terminate if
+             * qmp_dispatcher_co_shutdown is true*/
+            if (!qmp_dispatcher_co_shutdown) {
+                qemu_coroutine_yield();
+
+                /* busy must be set to true again by whoever rescheduled us to
+                 * avoid double scheduling */
+                assert(atomic_xchg(&qmp_dispatcher_co_busy, false) == true);
+            }
+
+            /* qmp_dispatcher_co_shutdown may have changed if we yielded and
+             * were reentered from monitor_cleanup() */
+            if (qmp_dispatcher_co_shutdown) {
+                return;
+            }
+        }
 
-    mon = req_obj->mon;
-    /*  qmp_oob_enabled() might change after "qmp_capabilities" */
-    need_resume = !qmp_oob_enabled(mon) ||
-        mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
-    qemu_mutex_unlock(&mon->qmp_queue_lock);
-    if (req_obj->req) {
-        QDict *qdict = qobject_to(QDict, req_obj->req);
-        QObject *id = qdict ? qdict_get(qdict, "id") : NULL;
-        trace_monitor_qmp_cmd_in_band(qobject_get_try_str(id) ?: "");
-        monitor_qmp_dispatch(mon, req_obj->req);
-    } else {
-        assert(req_obj->err);
-        rsp = qmp_error_response(req_obj->err);
-        req_obj->err = NULL;
-        monitor_qmp_respond(mon, rsp);
-        qobject_unref(rsp);
-    }
+        if (atomic_xchg(&qmp_dispatcher_co_busy, true) == true) {
+            /* Someone rescheduled us (probably because a new requests came
+             * in), but we didn't actually yield. Do that now, only to be
+             * immediately reentered and removed from the list of scheduled
+             * coroutines. */
+            qemu_coroutine_yield();
+        }
 
-    if (need_resume) {
-        /* Pairs with the monitor_suspend() in handle_qmp_command() */
-        monitor_resume(&mon->common);
-    }
-    qmp_request_free(req_obj);
+        /*
+         * Move the coroutine from iohandler_ctx to qemu_aio_context for
+         * executing the command handler so that it can make progress if it
+         * involves an AIO_WAIT_WHILE().
+         */
+        aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
+        qemu_coroutine_yield();
+
+        mon = req_obj->mon;
+        /*  qmp_oob_enabled() might change after "qmp_capabilities" */
+        need_resume = !qmp_oob_enabled(mon) ||
+            mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
+        qemu_mutex_unlock(&mon->qmp_queue_lock);
+        if (req_obj->req) {
+            QDict *qdict = qobject_to(QDict, req_obj->req);
+            QObject *id = qdict ? qdict_get(qdict, "id") : NULL;
+            trace_monitor_qmp_cmd_in_band(qobject_get_try_str(id) ?: "");
+            monitor_qmp_dispatch(mon, req_obj->req);
+        } else {
+            assert(req_obj->err);
+            rsp = qmp_error_response(req_obj->err);
+            req_obj->err = NULL;
+            monitor_qmp_respond(mon, rsp);
+            qobject_unref(rsp);
+        }
+
+        if (need_resume) {
+            /* Pairs with the monitor_suspend() in handle_qmp_command() */
+            monitor_resume(&mon->common);
+        }
+        qmp_request_free(req_obj);
 
-    /* Reschedule instead of looping so the main loop stays responsive */
-    qemu_bh_schedule(qmp_dispatcher_bh);
+        /*
+         * Yield and reschedule so the main loop stays responsive.
+         *
+         * Move back to iohandler_ctx so that nested event loops for
+         * qemu_aio_context don't start new monitor commands.
+         */
+        aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
+        qemu_coroutine_yield();
+    }
 }
 
 static void handle_qmp_command(void *opaque, QObject *req, Error *err)
@@ -308,7 +356,9 @@ static void handle_qmp_command(void *opaque, QObject *req, Error *err)
     qemu_mutex_unlock(&mon->qmp_queue_lock);
 
     /* Kick the dispatcher routine */
-    qemu_bh_schedule(qmp_dispatcher_bh);
+    if (!atomic_xchg(&qmp_dispatcher_co_busy, true)) {
+        aio_co_wake(qmp_dispatcher_co);
+    }
 }
 
 static void monitor_qmp_read(void *opaque, const uint8_t *buf, int size)
diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index bc264b3c9b..eef09d15bc 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -12,12 +12,16 @@
  */
 
 #include "qemu/osdep.h"
+
+#include "block/aio.h"
 #include "qapi/error.h"
 #include "qapi/qmp/dispatch.h"
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qjson.h"
 #include "sysemu/runstate.h"
 #include "qapi/qmp/qbool.h"
+#include "qemu/coroutine.h"
+#include "qemu/main-loop.h"
 
 static QDict *qmp_dispatch_check_obj(const QObject *request, bool allow_oob,
                                      Error **errp)
@@ -75,6 +79,25 @@ static QDict *qmp_dispatch_check_obj(const QObject *request, bool allow_oob,
     return dict;
 }
 
+typedef struct QmpDispatchBH {
+    QmpCommand *cmd;
+    QDict *args;
+    QObject **ret;
+    Error **errp;
+    Coroutine *co;
+} QmpDispatchBH;
+
+static void do_qmp_dispatch_bh(void *opaque)
+{
+    QmpDispatchBH *data = opaque;
+    data->cmd->fn(data->args, data->ret, data->errp);
+    aio_co_wake(data->co);
+}
+
+/*
+ * Runs outside of coroutine context for OOB commands, but in coroutine context
+ * for everything else.
+ */
 static QObject *do_qmp_dispatch(QmpCommandList *cmds, QObject *request,
                                 bool allow_oob, Error **errp)
 {
@@ -129,7 +152,22 @@ static QObject *do_qmp_dispatch(QmpCommandList *cmds, QObject *request,
         qobject_ref(args);
     }
 
-    cmd->fn(args, &ret, &local_err);
+    assert(!(oob && qemu_in_coroutine()));
+    if ((cmd->options & QCO_COROUTINE) || !qemu_in_coroutine()) {
+        cmd->fn(args, &ret, &local_err);
+    } else {
+        /* Must drop out of coroutine context for this one */
+        QmpDispatchBH data = {
+            .cmd    = cmd,
+            .args   = args,
+            .ret    = &ret,
+            .errp   = &local_err,
+            .co     = qemu_coroutine_self(),
+        };
+        aio_bh_schedule_oneshot(qemu_get_aio_context(), do_qmp_dispatch_bh,
+                                &data);
+        qemu_coroutine_yield();
+    }
     if (local_err) {
         error_propagate(errp, local_err);
     } else if (cmd->options & QCO_NO_SUCCESS_RESP) {
@@ -164,6 +202,10 @@ bool qmp_is_oob(const QDict *dict)
         && !qdict_haskey(dict, "execute");
 }
 
+/*
+ * Runs outside of coroutine context for OOB commands, but in coroutine context
+ * for everything else.
+ */
 QDict *qmp_dispatch(QmpCommandList *cmds, QObject *request,
                     bool allow_oob)
 {
diff --git a/qapi/qmp-registry.c b/qapi/qmp-registry.c
index ca00f74795..3d896aedd8 100644
--- a/qapi/qmp-registry.c
+++ b/qapi/qmp-registry.c
@@ -20,6 +20,9 @@ void qmp_register_command(QmpCommandList *cmds, const char *name,
 {
     QmpCommand *cmd = g_malloc0(sizeof(*cmd));
 
+    /* QCO_COROUTINE and QCO_ALLOW_OOB are incompatible */
+    assert(!((options & QCO_COROUTINE) && (options & QCO_ALLOW_OOB)));
+
     cmd->name = name;
     cmd->fn = fn;
     cmd->enabled = true;
diff --git a/util/aio-posix.c b/util/aio-posix.c
index a4977f538e..223de08b91 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -15,6 +15,7 @@
 
 #include "qemu/osdep.h"
 #include "block/block.h"
+#include "qemu/main-loop.h"
 #include "qemu/rcu_queue.h"
 #include "qemu/sockets.h"
 #include "qemu/cutils.h"
@@ -616,7 +617,11 @@ bool aio_poll(AioContext *ctx, bool blocking)
     int64_t timeout;
     int64_t start = 0;
 
-    assert(in_aio_context_home_thread(ctx));
+    /* aio_poll() may only be called in the AioContext's thread. iohandler_ctx
+     * is special in that it runs in the main thread, but that thread's context
+     * is qemu_aio_context. */
+    assert(in_aio_context_home_thread(ctx == iohandler_get_aio_context() ?
+                                      qemu_get_aio_context() : ctx));
 
     /* aio_notify can avoid the expensive event_notifier_set if
      * everything (file descriptors, bottom halves, timers) will
-- 
2.20.1




* [PATCH v4 4/4] block: Mark 'block_resize' as coroutine
  2020-01-21 18:11 [PATCH v4 0/4] qmp: Optionally run handlers in coroutines Kevin Wolf
                   ` (2 preceding siblings ...)
  2020-01-21 18:11 ` [PATCH v4 3/4] qmp: Move dispatcher to a coroutine Kevin Wolf
@ 2020-01-21 18:11 ` Kevin Wolf
  2020-02-05 14:00 ` [PATCH v4 0/4] qmp: Optionally run handlers in coroutines Kevin Wolf
  4 siblings, 0 replies; 27+ messages in thread
From: Kevin Wolf @ 2020-01-21 18:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, armbru, marcandre.lureau, stefanha

block_resize is safe to run in a coroutine, and it does some I/O that
could potentially take quite some time, so use it as an example for the
new 'coroutine': true annotation in the QAPI schema.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
---
 qapi/block-core.json | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 7ff5e5edaf..1dbb2a9901 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1341,7 +1341,8 @@
 { 'command': 'block_resize',
   'data': { '*device': 'str',
             '*node-name': 'str',
-            'size': 'int' } }
+            'size': 'int' },
+  'coroutine': true }
 
 ##
 # @NewImageMode:
-- 
2.20.1




* Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands
  2020-01-21 18:11 ` [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands Kevin Wolf
@ 2020-01-22  6:32   ` Markus Armbruster
  2020-01-22 10:10     ` Kevin Wolf
  2020-02-17  7:08   ` Markus Armbruster
  1 sibling, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2020-01-22  6:32 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: stefanha, marcandre.lureau, qemu-devel, qemu-block

Kevin Wolf <kwolf@redhat.com> writes:

> This patch adds a new 'coroutine' flag to QMP command definitions that
> tells the QMP dispatcher that the command handler is safe to be run in a
> coroutine.

I'm afraid I missed this question in my review of v3: when is a handler
*not* safe to be run in a coroutine?

> The documentation of the new flag pretends that this flag is already
> used as intended, which it isn't yet after this patch. We'll implement
> this in another patch in this series.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>




* Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands
  2020-01-22  6:32   ` Markus Armbruster
@ 2020-01-22 10:10     ` Kevin Wolf
  2020-01-22 12:15       ` Markus Armbruster
  2020-03-05 15:30       ` Markus Armbruster
  0 siblings, 2 replies; 27+ messages in thread
From: Kevin Wolf @ 2020-01-22 10:10 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: stefanha, marcandre.lureau, qemu-devel, qemu-block

Am 22.01.2020 um 07:32 hat Markus Armbruster geschrieben:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > This patch adds a new 'coroutine' flag to QMP command definitions that
> > tells the QMP dispatcher that the command handler is safe to be run in a
> > coroutine.
> 
> I'm afraid I missed this question in my review of v3: when is a handler
> *not* safe to be run in a coroutine?

That's a hard one to answer fully.

Basically, I think the biggest problem is with calling functions that
change their behaviour if run in a coroutine compared to running them
outside of coroutine context. In most cases the differences like having
a nested event loop instead of yielding are just fine, but they are
still subtly different.

I know this is vague, but I can assure you that problematic cases exist.
I hit one of them with my initial hack that just moved everything into a
coroutine. It was related to graph modifications and bdrv_drain and
resulted in a hang. For the specifics, I would have to try and reproduce
the problem again.

Kevin




* Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands
  2020-01-22 10:10     ` Kevin Wolf
@ 2020-01-22 12:15       ` Markus Armbruster
  2020-01-22 14:35         ` Kevin Wolf
  2020-03-05 15:30       ` Markus Armbruster
  1 sibling, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2020-01-22 12:15 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, marcandre.lureau, qemu-devel, stefanha

Kevin Wolf <kwolf@redhat.com> writes:

> Am 22.01.2020 um 07:32 hat Markus Armbruster geschrieben:
>> Kevin Wolf <kwolf@redhat.com> writes:
>> 
>> > This patch adds a new 'coroutine' flag to QMP command definitions that
>> > tells the QMP dispatcher that the command handler is safe to be run in a
>> > coroutine.
>> 
>> I'm afraid I missed this question in my review of v3: when is a handler
>> *not* safe to be run in a coroutine?
>
> That's a hard one to answer fully.

You're welcome ;)

> Basically, I think the biggest problem is with calling functions that
> change their behaviour if run in a coroutine compared to running them
> outside of coroutine context. In most cases the differences like having
> a nested event loop instead of yielding are just fine, but they are
> still subtly different.
>
> I know this is vague, but I can assure you that problematic cases exist.
> I hit one of them with my initial hack that just moved everything into a
> coroutine. It was related to graph modifications and bdrv_drain and
> resulted in a hang. For the specifics, I would have to try and reproduce
> the problem again.

Interesting.

Is coroutine-incompatible command handler code necessary or accidental?

By "necessary" I mean there are (and likely always will be) commands
that need to do stuff that cannot or should not be done in coroutine
context.

"Accidental" is the opposite: coroutine-incompatibility can be regarded
as a fixable flaw.

How widespread is coroutine-incompatibility?  Common, or just a few
commands?

If coroutine-incompatibility is accidental, then your code to drop out
of coroutine context can be regarded as a temporary work-around.  Such
work-arounds receive a modest extra ugliness & complexity budget.

If coroutine-incompatibility is rare, we'll eventually want "mark the
known-bad ones with 'coroutine': false" instead of "mark the known-good
ones with 'coroutine': true".  I'm okay with marking the known-good ones
initially, and flipping only later.

Inability to recognize coroutine-incompatibility by inspection worries
me.  Can you think of ways to identify causes other than testing things
to death?




* Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands
  2020-01-22 12:15       ` Markus Armbruster
@ 2020-01-22 14:35         ` Kevin Wolf
  0 siblings, 0 replies; 27+ messages in thread
From: Kevin Wolf @ 2020-01-22 14:35 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: qemu-block, marcandre.lureau, qemu-devel, stefanha

Am 22.01.2020 um 13:15 hat Markus Armbruster geschrieben:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > Am 22.01.2020 um 07:32 hat Markus Armbruster geschrieben:
> >> Kevin Wolf <kwolf@redhat.com> writes:
> >> 
> >> > This patch adds a new 'coroutine' flag to QMP command definitions that
> >> > tells the QMP dispatcher that the command handler is safe to be run in a
> >> > coroutine.
> >> 
> >> I'm afraid I missed this question in my review of v3: when is a handler
> >> *not* safe to be run in a coroutine?
> >
> > That's a hard one to answer fully.
> 
> You're welcome ;)
> 
> > Basically, I think the biggest problem is with calling functions that
> > change their behaviour if run in a coroutine compared to running them
> > outside of coroutine context. In most cases the differences like having
> > a nested event loop instead of yielding are just fine, but they are
> > still subtly different.
> >
> > I know this is vague, but I can assure you that problematic cases exist.
> > I hit one of them with my initial hack that just moved everything into a
> > coroutine. It was related to graph modifications and bdrv_drain and
> > resulted in a hang. For the specifics, I would have to try and reproduce
> > the problem again.
> 
> Interesting.
> 
> Is coroutine-incompatible command handler code necessary or accidental?
> 
> By "necessary" I mean there are (and likely always will be) commands
> that need to do stuff that cannot or should not be done in coroutine
> context.
> 
> "Accidental" is the opposite: coroutine-incompatibility can be regarded
> as a fixable flaw.

I expect it's mostly accidental, but maybe not all of it.

bdrv_drain() is an example of a function that has a problem with running
inside a coroutine: If you schedule another coroutine to run, it will
only run after the currently active coroutine yields. However,
bdrv_drain() doesn't yield, but runs a nested event loop, so this can
lead to deadlocks.

For this reason, bdrv_drain() has code to drop out of coroutine context,
so that you can call it from a coroutine anyway:

    /* Calling bdrv_drain() from a BH ensures the current coroutine yields and
     * other coroutines run if they were queued by aio_co_enter(). */

Now, after reading this, I think the problem in the QMP handler I
observed was that it called some parts of the drain code without
dropping out of coroutine context first. It may well have been the
exact deadlock described by this comment.

(Even in bdrv_drain() it might not be strictly necessary. Maybe it's
possible to redesign bdrv_drain() so that it doesn't have a nested event
loop, but can somehow rely on the normal main loop. Touching it is
scary, though, so I'd rather not unless we have a good reason.)
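
For reference, the pattern for dropping out of coroutine context is
pretty small. Roughly like this (a sketch; DropOutData and
run_outside_coroutine() are made-up names, but it is essentially what
do_qmp_dispatch_bh() in patch 3 of this series does):

    #include "qemu/osdep.h"
    #include "block/aio.h"
    #include "qemu/coroutine.h"

    typedef struct DropOutData {
        void (*fn)(void *opaque);   /* work that must not run in a coroutine */
        void *opaque;
        Coroutine *co;
    } DropOutData;

    static void drop_out_bh(void *opaque)
    {
        DropOutData *data = opaque;

        data->fn(data->opaque);     /* runs outside of coroutine context */
        aio_co_wake(data->co);      /* reenter the coroutine that yielded */
    }

    static void coroutine_fn run_outside_coroutine(void (*fn)(void *),
                                                   void *opaque)
    {
        DropOutData data = {
            .fn     = fn,
            .opaque = opaque,
            .co     = qemu_coroutine_self(),
        };

        aio_bh_schedule_oneshot(qemu_get_aio_context(), drop_out_bh, &data);
        qemu_coroutine_yield();     /* woken up again by drop_out_bh() */
    }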

> How widespread is coroutine-incompatibility?  Common, or just a few
> commands?

Answering this would require a massive code audit. Way out of scope for
this series. This is the reason why I decided to just make it optional
instead of trying to find and fix the cases that would need a fix if we
enabled coroutine handlers unconditionally.

> If coroutine-incompatibility is accidental, then your code to drop out
> of coroutine context can be regarded as a temporary work-around.  Such
> work-arounds receive a modest extra ugliness & complexity budget.
> 
> If coroutine-incompatibility is rare, we'll eventually want "mark the
> known-bad ones with 'coroutine': false" instead of "mark the known-good
> ones with 'coroutine': true".  I'm okay with marking the known-good ones
> initially, and flipping only later.
> 
> Inability to recognize coroutine-incompatibility by inspection worries
> me.  Can you think of ways to identify causes other than testing things
> to death?

I think it is possible to recognise it by inspection; it would just be
a massive effort. Maybe a way to start would be adding something like a
non_coroutine_fn marker to functions that we know don't want to be
called from coroutine context, and propagating it to their callers.
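
Something like this, mirroring how coroutine_fn is already defined (just
an idea, nothing like it exists yet):

    /* Purely a documentation/static-analysis marker, like coroutine_fn */
    #define non_coroutine_fn

    /* Example annotation: bdrv_drain() runs a nested event loop */
    void non_coroutine_fn bdrv_drain(BlockDriverState *bs);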

I guess I would start by looking for nested event loops (AIO_WAIT_WHILE
or BDRV_POLL_WHILE), and then you would have to decide for each whether
it could run into problems. I can't promise this will cover everything,
but it would at least cover the case known from bdrv_drain.

Kevin




* Re: [PATCH v4 0/4] qmp: Optionally run handlers in coroutines
  2020-01-21 18:11 [PATCH v4 0/4] qmp: Optionally run handlers in coroutines Kevin Wolf
                   ` (3 preceding siblings ...)
  2020-01-21 18:11 ` [PATCH v4 4/4] block: Mark 'block_resize' as coroutine Kevin Wolf
@ 2020-02-05 14:00 ` Kevin Wolf
  2020-02-12 11:40   ` Kevin Wolf
  4 siblings, 1 reply; 27+ messages in thread
From: Kevin Wolf @ 2020-02-05 14:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, marcandre.lureau, armbru, stefanha

Am 21.01.2020 um 19:11 hat Kevin Wolf geschrieben:
> Some QMP command handlers can block the main loop for a relatively long
> time, for example because they perform some I/O. This is quite nasty.
> Allowing such handlers to run in a coroutine where they can yield (and
> therefore release the BQL) while waiting for an event such as I/O
> completion solves the problem.
> 
> This series adds the infrastructure to allow this and switches
> block_resize to run in a coroutine as a first example.
> 
> This is an alternative solution to Marc-André's "monitor: add
> asynchronous command type" series.

Ping?

Kevin




* Re: [PATCH v4 0/4] qmp: Optionally run handlers in coroutines
  2020-02-05 14:00 ` [PATCH v4 0/4] qmp: Optionally run handlers in coroutines Kevin Wolf
@ 2020-02-12 11:40   ` Kevin Wolf
  0 siblings, 0 replies; 27+ messages in thread
From: Kevin Wolf @ 2020-02-12 11:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: stefanha, marcandre.lureau, armbru, qemu-block

Am 05.02.2020 um 15:00 hat Kevin Wolf geschrieben:
> Am 21.01.2020 um 19:11 hat Kevin Wolf geschrieben:
> > Some QMP command handlers can block the main loop for a relatively long
> > time, for example because they perform some I/O. This is quite nasty.
> > Allowing such handlers to run in a coroutine where they can yield (and
> > therefore release the BQL) while waiting for an event such as I/O
> > completion solves the problem.
> > 
> > This series adds the infrastructure to allow this and switches
> > block_resize to run in a coroutine as a first example.
> > 
> > This is an alternative solution to Marc-André's "monitor: add
> > asynchronous command type" series.
> 
> Ping?

Ping.

Kevin




* Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands
  2020-01-21 18:11 ` [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands Kevin Wolf
  2020-01-22  6:32   ` Markus Armbruster
@ 2020-02-17  7:08   ` Markus Armbruster
  1 sibling, 0 replies; 27+ messages in thread
From: Markus Armbruster @ 2020-02-17  7:08 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: stefanha, marcandre.lureau, qemu-devel, qemu-block

Kevin Wolf <kwolf@redhat.com> writes:

> This patch adds a new 'coroutine' flag to QMP command definitions that
> tells the QMP dispatcher that the command handler is safe to be run in a
> coroutine.
>
> The documentation of the new flag pretends that this flag is already
> used as intended, which it isn't yet after this patch. We'll implement
> this in another patch in this series.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
>  docs/devel/qapi-code-gen.txt            | 10 ++++++++++
>  include/qapi/qmp/dispatch.h             |  1 +
>  tests/test-qmp-cmds.c                   |  4 ++++
>  scripts/qapi/commands.py                | 10 +++++++---
>  scripts/qapi/doc.py                     |  2 +-
>  scripts/qapi/expr.py                    |  7 +++++--
>  scripts/qapi/introspect.py              |  2 +-
>  scripts/qapi/schema.py                  |  9 ++++++---
>  tests/qapi-schema/test-qapi.py          |  7 ++++---
>  tests/Makefile.include                  |  1 +
>  tests/qapi-schema/oob-coroutine.err     |  2 ++
>  tests/qapi-schema/oob-coroutine.json    |  2 ++
>  tests/qapi-schema/oob-coroutine.out     |  0
>  tests/qapi-schema/qapi-schema-test.json |  1 +
>  tests/qapi-schema/qapi-schema-test.out  |  2 ++
>  15 files changed, 47 insertions(+), 13 deletions(-)
>  create mode 100644 tests/qapi-schema/oob-coroutine.err
>  create mode 100644 tests/qapi-schema/oob-coroutine.json
>  create mode 100644 tests/qapi-schema/oob-coroutine.out
>
> diff --git a/docs/devel/qapi-code-gen.txt b/docs/devel/qapi-code-gen.txt
> index 59d6973e1e..9b65cd3ab3 100644
> --- a/docs/devel/qapi-code-gen.txt
> +++ b/docs/devel/qapi-code-gen.txt
> @@ -457,6 +457,7 @@ Syntax:
>                  '*gen': false,
>                  '*allow-oob': true,
>                  '*allow-preconfig': true,
> +                '*coroutine': true,
>                  '*if': COND,
>                  '*features': FEATURES }
>  
> @@ -581,6 +582,15 @@ before the machine is built.  It defaults to false.  For example:
>  QMP is available before the machine is built only when QEMU was
>  started with --preconfig.
>  
> +Member 'coroutine' tells the QMP dispatcher whether the command handler
> +is safe to be run in a coroutine.  It defaults to false.  If it is true,
> +the command handler is called from coroutine context and may yield while
> +waiting for an external event (such as I/O completion) in order to avoid
> +blocking the guest and other background operations.
> +
> +It is an error to specify both 'coroutine': true and 'allow-oob': true
> +for a command.
> +
>  The optional 'if' member specifies a conditional.  See "Configuring
>  the schema" below for more on this.
>  
> diff --git a/include/qapi/qmp/dispatch.h b/include/qapi/qmp/dispatch.h
> index 9aa426a398..d6ce9efc8e 100644
> --- a/include/qapi/qmp/dispatch.h
> +++ b/include/qapi/qmp/dispatch.h
> @@ -24,6 +24,7 @@ typedef enum QmpCommandOptions
>      QCO_NO_SUCCESS_RESP       =  (1U << 0),
>      QCO_ALLOW_OOB             =  (1U << 1),
>      QCO_ALLOW_PRECONFIG       =  (1U << 2),
> +    QCO_COROUTINE             =  (1U << 3),
>  } QmpCommandOptions;
>  
>  typedef struct QmpCommand
> diff --git a/tests/test-qmp-cmds.c b/tests/test-qmp-cmds.c
> index 79507d9e54..6359cc28c7 100644
> --- a/tests/test-qmp-cmds.c
> +++ b/tests/test-qmp-cmds.c
> @@ -35,6 +35,10 @@ void qmp_cmd_success_response(Error **errp)
>  {
>  }
>  
> +void qmp_coroutine_cmd(Error **errp)
> +{
> +}
> +
>  Empty2 *qmp_user_def_cmd0(Error **errp)
>  {
>      return g_new0(Empty2, 1);
> diff --git a/scripts/qapi/commands.py b/scripts/qapi/commands.py
> index afa55b055c..f2f2f8948d 100644
> --- a/scripts/qapi/commands.py
> +++ b/scripts/qapi/commands.py
> @@ -194,7 +194,8 @@ out:
>      return ret
>  
>  
> -def gen_register_command(name, success_response, allow_oob, allow_preconfig):
> +def gen_register_command(name, success_response, allow_oob, allow_preconfig,
> +                         coroutine):
>      options = []
>  
>      if not success_response:
> @@ -203,6 +204,8 @@ def gen_register_command(name, success_response, allow_oob, allow_preconfig):
>          options += ['QCO_ALLOW_OOB']
>      if allow_preconfig:
>          options += ['QCO_ALLOW_PRECONFIG']
> +    if coroutine:
> +        options += ['QCO_COROUTINE']
>  
>      if not options:
>          options = ['QCO_NO_OPTIONS']
> @@ -285,7 +288,7 @@ void %(c_prefix)sqmp_init_marshal(QmpCommandList *cmds);
>  
>      def visit_command(self, name, info, ifcond, arg_type, ret_type, gen,
>                        success_response, boxed, allow_oob, allow_preconfig,
> -                      features):
> +                      coroutine, features):
>          if not gen:
>              return
>          # FIXME: If T is a user-defined type, the user is responsible
> @@ -303,7 +306,8 @@ void %(c_prefix)sqmp_init_marshal(QmpCommandList *cmds);
>              self._genh.add(gen_marshal_decl(name))
>              self._genc.add(gen_marshal(name, arg_type, boxed, ret_type))
>              self._regy.add(gen_register_command(name, success_response,
> -                                                allow_oob, allow_preconfig))
> +                                                allow_oob, allow_preconfig,
> +                                                coroutine))
>  
>  
>  def gen_commands(schema, output_dir, prefix):
> diff --git a/scripts/qapi/doc.py b/scripts/qapi/doc.py
> index 6f1c17f71f..8b6978c81e 100644
> --- a/scripts/qapi/doc.py
> +++ b/scripts/qapi/doc.py
> @@ -265,7 +265,7 @@ class QAPISchemaGenDocVisitor(QAPISchemaVisitor):
>  
>      def visit_command(self, name, info, ifcond, arg_type, ret_type, gen,
>                        success_response, boxed, allow_oob, allow_preconfig,
> -                      features):
> +                      coroutine, features):
>          doc = self.cur_doc
>          self._gen.add(texi_msg('Command', doc, ifcond,
>                                 texi_arguments(doc,
> diff --git a/scripts/qapi/expr.py b/scripts/qapi/expr.py
> index d7a289eded..769c54ebfe 100644
> --- a/scripts/qapi/expr.py
> +++ b/scripts/qapi/expr.py
> @@ -89,10 +89,13 @@ def check_flags(expr, info):
>          if key in expr and expr[key] is not False:
>              raise QAPISemError(
>                  info, "flag '%s' may only use false value" % key)
> -    for key in ['boxed', 'allow-oob', 'allow-preconfig']:
> +    for key in ['boxed', 'allow-oob', 'allow-preconfig', 'coroutine']:
>          if key in expr and expr[key] is not True:
>              raise QAPISemError(
>                  info, "flag '%s' may only use true value" % key)
> +    if 'allow-oob' in expr and 'coroutine' in expr:
> +        raise QAPISemError(info, "flags 'allow-oob' and 'coroutine' "
> +                                 "are incompatible")

Only because we didn't implement coroutine context for exec-oob, for
want of a use case.  I think we should note that somehow, to remind our
future selves that the two aren't actually fundamentally incompatible.
Could put the reminder into the error message, like "'coroutine': true
isn't implemented with 'allow-oob': true".  A comment would do as well.

A similar reminder in qapi-code-gen.txt wouldn't hurt.

>  
>  
>  def check_if(expr, info, source):
> @@ -344,7 +347,7 @@ def check_exprs(exprs):
>                         ['command'],
>                         ['data', 'returns', 'boxed', 'if', 'features',
>                          'gen', 'success-response', 'allow-oob',
> -                        'allow-preconfig'])
> +                        'allow-preconfig', 'coroutine'])
>              normalize_members(expr.get('data'))
>              check_command(expr, info)
>          elif meta == 'event':
> diff --git a/scripts/qapi/introspect.py b/scripts/qapi/introspect.py
> index b3a463dd8b..8a296a69d6 100644
> --- a/scripts/qapi/introspect.py
> +++ b/scripts/qapi/introspect.py
> @@ -212,7 +212,7 @@ const QLitObject %(c_name)s = %(c_string)s;
>  
>      def visit_command(self, name, info, ifcond, arg_type, ret_type, gen,
>                        success_response, boxed, allow_oob, allow_preconfig,
> -                      features):
> +                      coroutine, features):
>          arg_type = arg_type or self._schema.the_empty_object_type
>          ret_type = ret_type or self._schema.the_empty_object_type
>          obj = {'arg-type': self._use_type(arg_type),
> diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
> index 0bfc5256fb..87d76b59d3 100644
> --- a/scripts/qapi/schema.py
> +++ b/scripts/qapi/schema.py
> @@ -128,7 +128,7 @@ class QAPISchemaVisitor(object):
>  
>      def visit_command(self, name, info, ifcond, arg_type, ret_type, gen,
>                        success_response, boxed, allow_oob, allow_preconfig,
> -                      features):
> +                      coroutine, features):
>          pass
>  
>      def visit_event(self, name, info, ifcond, arg_type, boxed):
> @@ -690,7 +690,7 @@ class QAPISchemaCommand(QAPISchemaEntity):
>  
>      def __init__(self, name, info, doc, ifcond, arg_type, ret_type,
>                   gen, success_response, boxed, allow_oob, allow_preconfig,
> -                 features):
> +                 coroutine, features):
>          QAPISchemaEntity.__init__(self, name, info, doc, ifcond, features)
>          assert not arg_type or isinstance(arg_type, str)
>          assert not ret_type or isinstance(ret_type, str)
> @@ -703,6 +703,7 @@ class QAPISchemaCommand(QAPISchemaEntity):
>          self.boxed = boxed
>          self.allow_oob = allow_oob
>          self.allow_preconfig = allow_preconfig
> +        self.coroutine = coroutine
>  
>      def check(self, schema):
>          QAPISchemaEntity.check(self, schema)
> @@ -745,7 +746,7 @@ class QAPISchemaCommand(QAPISchemaEntity):
>                                self.arg_type, self.ret_type,
>                                self.gen, self.success_response,
>                                self.boxed, self.allow_oob,
> -                              self.allow_preconfig,
> +                              self.allow_preconfig, self.coroutine,
>                                self.features)
>  
>  
> @@ -1043,6 +1044,7 @@ class QAPISchema(object):
>          boxed = expr.get('boxed', False)
>          allow_oob = expr.get('allow-oob', False)
>          allow_preconfig = expr.get('allow-preconfig', False)
> +        coroutine = expr.get('coroutine', False)
>          ifcond = expr.get('if')
>          features = expr.get('features', [])
>          if isinstance(data, OrderedDict):
> @@ -1054,6 +1056,7 @@ class QAPISchema(object):
>          self._def_entity(QAPISchemaCommand(name, info, doc, ifcond, data, rets,
>                                             gen, success_response,
>                                             boxed, allow_oob, allow_preconfig,
> +                                           coroutine,
>                                             self._make_features(features, info)))
>  
>      def _def_event(self, expr, info, doc):
> diff --git a/tests/qapi-schema/test-qapi.py b/tests/qapi-schema/test-qapi.py
> index bad14edb47..7a8e65188d 100755
> --- a/tests/qapi-schema/test-qapi.py
> +++ b/tests/qapi-schema/test-qapi.py
> @@ -70,12 +70,13 @@ class QAPISchemaTestVisitor(QAPISchemaVisitor):
>  
>      def visit_command(self, name, info, ifcond, arg_type, ret_type, gen,
>                        success_response, boxed, allow_oob, allow_preconfig,
> -                      features):
> +                      coroutine, features):
>          print('command %s %s -> %s'
>                % (name, arg_type and arg_type.name,
>                   ret_type and ret_type.name))
> -        print('    gen=%s success_response=%s boxed=%s oob=%s preconfig=%s'
> -              % (gen, success_response, boxed, allow_oob, allow_preconfig))
> +        print('    gen=%s success_response=%s boxed=%s oob=%s preconfig=%s%s'
> +              % (gen, success_response, boxed, allow_oob, allow_preconfig,
> +                 " coroutine=True" if coroutine else ""))
>          self._print_if(ifcond)
>          self._print_features(features)
>  
> diff --git a/tests/Makefile.include b/tests/Makefile.include
> index c6827ce8c2..d111389c03 100644
> --- a/tests/Makefile.include
> +++ b/tests/Makefile.include
> @@ -286,6 +286,7 @@ qapi-schema += missing-type.json
>  qapi-schema += nested-struct-data.json
>  qapi-schema += nested-struct-data-invalid-dict.json
>  qapi-schema += non-objects.json
> +qapi-schema += oob-coroutine.json
>  qapi-schema += oob-test.json
>  qapi-schema += allow-preconfig-test.json
>  qapi-schema += pragma-doc-required-crap.json
> diff --git a/tests/qapi-schema/oob-coroutine.err b/tests/qapi-schema/oob-coroutine.err
> new file mode 100644
> index 0000000000..c01a4992bd
> --- /dev/null
> +++ b/tests/qapi-schema/oob-coroutine.err
> @@ -0,0 +1,2 @@
> +oob-coroutine.json: In command 'oob-command-1':
> +oob-coroutine.json:2: flags 'allow-oob' and 'coroutine' are incompatible
> diff --git a/tests/qapi-schema/oob-coroutine.json b/tests/qapi-schema/oob-coroutine.json
> new file mode 100644
> index 0000000000..0f67663bcd
> --- /dev/null
> +++ b/tests/qapi-schema/oob-coroutine.json
> @@ -0,0 +1,2 @@
> +# Check that incompatible flags allow-oob and coroutine are rejected
> +{ 'command': 'oob-command-1', 'allow-oob': true, 'coroutine': true }
> diff --git a/tests/qapi-schema/oob-coroutine.out b/tests/qapi-schema/oob-coroutine.out
> new file mode 100644
> index 0000000000..e69de29bb2

You remembered to cover the error case.  Appreciated!

> diff --git a/tests/qapi-schema/qapi-schema-test.json b/tests/qapi-schema/qapi-schema-test.json
> index 9abf175fe0..1a850fe171 100644
> --- a/tests/qapi-schema/qapi-schema-test.json
> +++ b/tests/qapi-schema/qapi-schema-test.json
> @@ -147,6 +147,7 @@
>    'returns': 'UserDefTwo' }
>  
>  { 'command': 'cmd-success-response', 'data': {}, 'success-response': false }
> +{ 'command': 'coroutine-cmd', 'data': {}, 'coroutine': true }
>  
>  # Returning a non-dictionary requires a name from the whitelist
>  { 'command': 'guest-get-time', 'data': {'a': 'int', '*b': 'int' },
> diff --git a/tests/qapi-schema/qapi-schema-test.out b/tests/qapi-schema/qapi-schema-test.out
> index 9bd3c4a490..fdc349991a 100644
> --- a/tests/qapi-schema/qapi-schema-test.out
> +++ b/tests/qapi-schema/qapi-schema-test.out
> @@ -203,6 +203,8 @@ command user_def_cmd2 q_obj_user_def_cmd2-arg -> UserDefTwo
>      gen=True success_response=True boxed=False oob=False preconfig=False
>  command cmd-success-response None -> None
>      gen=True success_response=False boxed=False oob=False preconfig=False
> +command coroutine-cmd None -> None
> +    gen=True success_response=True boxed=False oob=False preconfig=False coroutine=True
>  object q_obj_guest-get-time-arg
>      member a: int optional=False
>      member b: int optional=True

Preferably with the "not implemented" status clarified:
Reviewed-by: Markus Armbruster <armbru@redhat.com>



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine
  2020-01-21 18:11 ` [PATCH v4 3/4] qmp: Move dispatcher to a coroutine Kevin Wolf
@ 2020-02-17 11:08   ` Markus Armbruster
  2020-02-17 12:34     ` Kevin Wolf
  0 siblings, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2020-02-17 11:08 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: stefanha, marcandre.lureau, qemu-devel, qemu-block

This is the hairy one.  Please bear with me while I try to grok it.

Kevin Wolf <kwolf@redhat.com> writes:

> This moves the QMP dispatcher to a coroutine and runs all QMP command
> handlers that declare 'coroutine': true in coroutine context so they
> can avoid blocking the main loop while doing I/O or waiting for other
> events.
>
> For commands that are not declared safe to run in a coroutine, the
> dispatcher drops out of coroutine context by calling the QMP command
> handler from a bottom half.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/qapi/qmp/dispatch.h |   1 +
>  monitor/monitor-internal.h  |   6 +-
>  monitor/monitor.c           |  33 ++++++++---
>  monitor/qmp.c               | 110 ++++++++++++++++++++++++++----------
>  qapi/qmp-dispatch.c         |  44 ++++++++++++++-
>  qapi/qmp-registry.c         |   3 +
>  util/aio-posix.c            |   7 ++-
>  7 files changed, 162 insertions(+), 42 deletions(-)
>
> diff --git a/include/qapi/qmp/dispatch.h b/include/qapi/qmp/dispatch.h
> index d6ce9efc8e..6812e49b5f 100644
> --- a/include/qapi/qmp/dispatch.h
> +++ b/include/qapi/qmp/dispatch.h
> @@ -30,6 +30,7 @@ typedef enum QmpCommandOptions
>  typedef struct QmpCommand
>  {
>      const char *name;
> +    /* Runs in coroutine context if QCO_COROUTINE is set */
>      QmpCommandFunc *fn;
>      QmpCommandOptions options;
>      QTAILQ_ENTRY(QmpCommand) node;
> diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
> index d78f5ca190..f180d03368 100644
> --- a/monitor/monitor-internal.h
> +++ b/monitor/monitor-internal.h
> @@ -154,7 +154,9 @@ static inline bool monitor_is_qmp(const Monitor *mon)
>  
>  typedef QTAILQ_HEAD(MonitorList, Monitor) MonitorList;
>  extern IOThread *mon_iothread;
> -extern QEMUBH *qmp_dispatcher_bh;
> +extern Coroutine *qmp_dispatcher_co;
> +extern bool qmp_dispatcher_co_shutdown;
> +extern bool qmp_dispatcher_co_busy;
>  extern QmpCommandList qmp_commands, qmp_cap_negotiation_commands;
>  extern QemuMutex monitor_lock;
>  extern MonitorList mon_list;
> @@ -172,7 +174,7 @@ void monitor_fdsets_cleanup(void);
>  
>  void qmp_send_response(MonitorQMP *mon, const QDict *rsp);
>  void monitor_data_destroy_qmp(MonitorQMP *mon);
> -void monitor_qmp_bh_dispatcher(void *data);
> +void coroutine_fn monitor_qmp_dispatcher_co(void *data);
>  
>  int get_monitor_def(int64_t *pval, const char *name);
>  void help_cmd(Monitor *mon, const char *name);
> diff --git a/monitor/monitor.c b/monitor/monitor.c
> index 12898b6448..e753fa435d 100644
> --- a/monitor/monitor.c
> +++ b/monitor/monitor.c
> @@ -53,8 +53,18 @@ typedef struct {
>  /* Shared monitor I/O thread */
>  IOThread *mon_iothread;
>  
> -/* Bottom half to dispatch the requests received from I/O thread */
> -QEMUBH *qmp_dispatcher_bh;
> +/* Coroutine to dispatch the requests received from I/O thread */
> +Coroutine *qmp_dispatcher_co;
> +
> +/* Set to true when the dispatcher coroutine should terminate */
> +bool qmp_dispatcher_co_shutdown;
> +
> +/*
> + * true if the coroutine is active and processing requests. The coroutine may
> + * only be woken up externally (e.g. from the monitor thread) after changing
> + * qmp_dispatcher_co_busy from false to true (e.g. using atomic_xchg).
> + */

I'm not sure what you mean by "externally".

Also mention how it changes from true to false?

Note to self: monitor_qmp_dispatcher_co() checks busy is true on resume.

Nitpick: wrap around column 70, two spaces between sentences for
consistency with other comments in this file, please.

> +bool qmp_dispatcher_co_busy;
>  
>  /* Protects mon_list, monitor_qapi_event_state, monitor_destroyed.  */
>  QemuMutex monitor_lock;
> @@ -579,9 +589,16 @@ void monitor_cleanup(void)

monitor_cleanup() runs in the main thread.

Coroutine qmp_dispatcher_co also runs in the main thread, right?

>      }
>      qemu_mutex_unlock(&monitor_lock);
>  
> -    /* QEMUBHs needs to be deleted before destroying the I/O thread */
> -    qemu_bh_delete(qmp_dispatcher_bh);
> -    qmp_dispatcher_bh = NULL;
> +    /* The dispatcher needs to stop before destroying the I/O thread */
> +    qmp_dispatcher_co_shutdown = true;

The coroutine switch ensures qmp_dispatcher_co sees this write, so no
need for a barrier.  Correct?

> +    if (!atomic_xchg(&qmp_dispatcher_co_busy, true)) {

Why do we need atomic?  I figure it's because qmp_dispatcher_co_busy is
accessed from multiple threads (main thread and mon_iothread), unlike
qmp_dispatcher_co_shutdown.

What kind of atomic?  I'm asking because you use sequentially consistent
atomic_xchg() together with the weaker atomic_mb_set() and
atomic_mb_read().

> +        aio_co_wake(qmp_dispatcher_co);
> +    }
> +
> +    AIO_WAIT_WHILE(qemu_get_aio_context(),
> +                   (aio_poll(iohandler_get_aio_context(), false),
> +                    atomic_mb_read(&qmp_dispatcher_co_busy)));

This waits for qmp_dispatcher_co_busy to become false again.  While
waiting, pending AIO work is given a chance to progress, as long as it
doesn't block.

The only places that change qmp_dispatcher_co_busy to false (in
monitor_qmp_dispatcher_co()) return without yielding when
qmp_dispatcher_co_shutdown, terminating the coroutine.  Correct?

Ignorant question: what AIO work may be pending, and why do we want it
to make progress?

I have to admit the I/O context magic is still voodoo to me.  Learning
opportunity, I guess :)

Since v3, you switched from aio_bh_poll() to aio_poll().  Good:
aio_poll() is intended for general use, while aio_bh_poll() is not.  But
what happened to your "I would have called aio_poll(), but it's
forbidden for iohandler_ctx"?  Oh, you've hidden an improvement to
aio_poll() at the very end of this patch!

You also wrote

    Much of this complication comes from the fact that the monitor runs in
    iohandler_ctx, which is not the main AioContext of the main loop thread
    (or any thread). This makes waiting for something in this AioContext
    rather complicated because nothing will poll that AioContext if I don't
    do it in the loop condition.

Should we explain this complication in a comment somewhere?  Hmm, there
is one further down:

  +        /*
  +         * Move the coroutine from iohandler_ctx to qemu_aio_context for
  +         * executing the command handler so that it can make progress if it
  +         * involves an AIO_WAIT_WHILE().
  +         */

Assumes working knowledge of iohandler_ctx, which I don't quite have,
yet.  I found this comment

   /*
    * Functions to operate on the I/O handler AioContext.
    * This context runs on top of main loop. We can't reuse qemu_aio_context
    * because iohandlers mustn't be polled by aio_poll(qemu_aio_context).
    */
   static AioContext *iohandler_ctx;

and docs/devel/multiple-iothreads.txt.  I guess I better study it.

> +
>      if (mon_iothread) {
>          iothread_destroy(mon_iothread);
>          mon_iothread = NULL;
> @@ -604,9 +621,9 @@ void monitor_init_globals_core(void)
>       * have commands assuming that context.  It would be nice to get
>       * rid of those assumptions.
>       */
> -    qmp_dispatcher_bh = aio_bh_new(iohandler_get_aio_context(),
> -                                   monitor_qmp_bh_dispatcher,
> -                                   NULL);
> +    qmp_dispatcher_co = qemu_coroutine_create(monitor_qmp_dispatcher_co, NULL);
> +    atomic_mb_set(&qmp_dispatcher_co_busy, true);
> +    aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);

In review of v3, you explained why you didn't use qemu_coroutine_enter()
here, even though it's simpler:

    Because the old code didn't run the BH right away. Should it? We're
    pretty early in the initialisation of QEMU, but it should work as long
    as we're allowed to call monitor_qmp_requests_pop_any_with_lock()
    already.

The old code creates, but does not schedule the bottom half here.  It
gets scheduled only in handle_qmp_command().

The new code appears to schedule the coroutine here.  I'm confused :)

Regarding calling monitor_qmp_requests_pop_any_with_lock(): it needs
@monitor_lock and @mon_list to be valid.  We just initialized
@monitor_lock, and @mon_list is empty.
monitor_qmp_requests_pop_any_with_lock() should be safe and return null.
monitor_qmp_dispatcher_co() should also be safe and yield without doing
work.

Can we exploit that to make things a bit simpler?  Separate patch would
be fine with me.

>  }
>  
>  QemuOptsList qemu_mon_opts = {
> diff --git a/monitor/qmp.c b/monitor/qmp.c
> index 54c06ba824..9444de9fcf 100644
> --- a/monitor/qmp.c
> +++ b/monitor/qmp.c
> @@ -133,6 +133,10 @@ static void monitor_qmp_respond(MonitorQMP *mon, QDict *rsp)
>      }
>  }
>  
> +/*
> + * Runs outside of coroutine context for OOB commands, but in coroutine context
> + * for everything else.
> + */

Nitpick: wrap around column 70, please.

Note to self: the precondition is asserted in do_qmp_dispatch() below.
Asserting here is impractical, because we don't know whether this is an
OOB command.

>  static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
>  {
>      Monitor *old_mon;
> @@ -211,43 +215,87 @@ static QMPRequest *monitor_qmp_requests_pop_any_with_lock(void)
>      return req_obj;
>  }
>  
> -void monitor_qmp_bh_dispatcher(void *data)
> +void coroutine_fn monitor_qmp_dispatcher_co(void *data)
>  {
> -    QMPRequest *req_obj = monitor_qmp_requests_pop_any_with_lock();
> +    QMPRequest *req_obj = NULL;
>      QDict *rsp;
>      bool need_resume;
>      MonitorQMP *mon;
>  
> -    if (!req_obj) {
> -        return;
> -    }
> +    while (true) {
> +        assert(atomic_mb_read(&qmp_dispatcher_co_busy) == true);

Read and assert, then ...

> +
> +        /* Mark the dispatcher as not busy already here so that we don't miss
> +         * any new requests coming in the middle of our processing. */
> +        atomic_mb_set(&qmp_dispatcher_co_busy, false);

... set.  Would exchange, then assert be cleaner?

The assertion checks qmp_dispatcher_co_busy is set on coroutine enter.
It pairs with the atomic_mb_set() in monitor_init_globals_core().

Wing the comment, please, and wrap around column 70.  More of the same
below.

Hmm, qmp_dispatcher_co_busy is false while the coroutine busily runs.  I
figure its actual purpose is something like "if false, you need to wake
it up to ensure it processes additional requests".  Correct?

> +
> +        while (!(req_obj = monitor_qmp_requests_pop_any_with_lock())) {
> +            /* Wait to be reentered from handle_qmp_command, or terminate if
> +             * qmp_dispatcher_co_shutdown is true*/

Yes, these are the two places that wake this coroutine.

Hmm, there's a third aio_co_wake() in do_qmp_dispatch_bh().  But that
one resumes the yield in do_qmp_dispatch().  Correct?

Space before */, please.

Would this

               /*
                * No more requests to process.  Wait until
                * handle_qmp_command() pushes more, or monitor_cleanup()
                * requests shutdown.
                */

be clearer?

> +            if (!qmp_dispatcher_co_shutdown) {
> +                qemu_coroutine_yield();

Nothing to do, go to sleep.

> +
> +                /* busy must be set to true again by whoever rescheduled us to
> +                 * avoid double scheduling */
> +                assert(atomic_xchg(&qmp_dispatcher_co_busy, false) == true);

The assertion checks the coroutine's resume set busy as it should.  It
pairs with the atomic_xchg() in handle_qmp_command() and
monitor_cleanup().

> +            }
> +
> +            /* qmp_dispatcher_co_shutdown may have changed if we yielded and
> +             * were reentered from monitor_cleanup() */
> +            if (qmp_dispatcher_co_shutdown) {
> +                return;
> +            }
> +        }
>  

We got a request in @req.

> -    mon = req_obj->mon;
> -    /*  qmp_oob_enabled() might change after "qmp_capabilities" */
> -    need_resume = !qmp_oob_enabled(mon) ||
> -        mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
> -    qemu_mutex_unlock(&mon->qmp_queue_lock);
> -    if (req_obj->req) {
> -        QDict *qdict = qobject_to(QDict, req_obj->req);
> -        QObject *id = qdict ? qdict_get(qdict, "id") : NULL;
> -        trace_monitor_qmp_cmd_in_band(qobject_get_try_str(id) ?: "");
> -        monitor_qmp_dispatch(mon, req_obj->req);
> -    } else {
> -        assert(req_obj->err);
> -        rsp = qmp_error_response(req_obj->err);
> -        req_obj->err = NULL;
> -        monitor_qmp_respond(mon, rsp);
> -        qobject_unref(rsp);
> -    }
> +        if (atomic_xchg(&qmp_dispatcher_co_busy, true) == true) {
> +            /* Someone rescheduled us (probably because a new requests came
> +             * in), but we didn't actually yield. Do that now, only to be
> +             * immediately reentered and removed from the list of scheduled
> +             * coroutines. */
> +            qemu_coroutine_yield();
> +        }
>  
> -    if (need_resume) {
> -        /* Pairs with the monitor_suspend() in handle_qmp_command() */
> -        monitor_resume(&mon->common);
> -    }
> -    qmp_request_free(req_obj);
> +        /*
> +         * Move the coroutine from iohandler_ctx to qemu_aio_context for
> +         * executing the command handler so that it can make progress if it
> +         * involves an AIO_WAIT_WHILE().
> +         */
> +        aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
> +        qemu_coroutine_yield();
> +
> +        mon = req_obj->mon;
> +        /*  qmp_oob_enabled() might change after "qmp_capabilities" */
> +        need_resume = !qmp_oob_enabled(mon) ||
> +            mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
> +        qemu_mutex_unlock(&mon->qmp_queue_lock);
> +        if (req_obj->req) {
> +            QDict *qdict = qobject_to(QDict, req_obj->req);
> +            QObject *id = qdict ? qdict_get(qdict, "id") : NULL;
> +            trace_monitor_qmp_cmd_in_band(qobject_get_try_str(id) ?: "");
> +            monitor_qmp_dispatch(mon, req_obj->req);
> +        } else {
> +            assert(req_obj->err);
> +            rsp = qmp_error_response(req_obj->err);
> +            req_obj->err = NULL;
> +            monitor_qmp_respond(mon, rsp);
> +            qobject_unref(rsp);
> +        }
> +
> +        if (need_resume) {
> +            /* Pairs with the monitor_suspend() in handle_qmp_command() */
> +            monitor_resume(&mon->common);
> +        }
> +        qmp_request_free(req_obj);
>  
> -    /* Reschedule instead of looping so the main loop stays responsive */
> -    qemu_bh_schedule(qmp_dispatcher_bh);
> +        /*
> +         * Yield and reschedule so the main loop stays responsive.
> +         *
> +         * Move back to iohandler_ctx so that nested event loops for
> +         * qemu_aio_context don't start new monitor commands.
> +         */
> +        aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
> +        qemu_coroutine_yield();
> +    }
>  }
>  

Easier to review with diff -w:

  +        if (atomic_xchg(&qmp_dispatcher_co_busy, true) == true) {
  +            /* Someone rescheduled us (probably because a new requests came
  +             * in), but we didn't actually yield. Do that now, only to be
  +             * immediately reentered and removed from the list of scheduled
  +             * coroutines. */
  +            qemu_coroutine_yield();
  +        }

This part I understand.

  +
  +        /*
  +         * Move the coroutine from iohandler_ctx to qemu_aio_context for
  +         * executing the command handler so that it can make progress if it
  +         * involves an AIO_WAIT_WHILE().
  +         */
  +        aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
  +        qemu_coroutine_yield();

More I/O context voodoo.  I'll get there.

           mon = req_obj->mon;
           /*  qmp_oob_enabled() might change after "qmp_capabilities" */
  @@ -246,8 +287,15 @@ void monitor_qmp_bh_dispatcher(void *data)
           }
           qmp_request_free(req_obj);

  -    /* Reschedule instead of looping so the main loop stays responsive */
  -    qemu_bh_schedule(qmp_dispatcher_bh);
  +        /*
  +         * Yield and reschedule so the main loop stays responsive.
  +         *
  +         * Move back to iohandler_ctx so that nested event loops for
  +         * qemu_aio_context don't start new monitor commands.

Can you explain this sentence for dummies?

  +         */
  +        aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
  +        qemu_coroutine_yield();
  +    }
   }

>  static void handle_qmp_command(void *opaque, QObject *req, Error *err)
> @@ -308,7 +356,9 @@ static void handle_qmp_command(void *opaque, QObject *req, Error *err)
>      qemu_mutex_unlock(&mon->qmp_queue_lock);
>  
>      /* Kick the dispatcher routine */
> -    qemu_bh_schedule(qmp_dispatcher_bh);
> +    if (!atomic_xchg(&qmp_dispatcher_co_busy, true)) {
> +        aio_co_wake(qmp_dispatcher_co);
> +    }
>  }
>  
>  static void monitor_qmp_read(void *opaque, const uint8_t *buf, int size)
> diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
> index bc264b3c9b..eef09d15bc 100644
> --- a/qapi/qmp-dispatch.c
> +++ b/qapi/qmp-dispatch.c
> @@ -12,12 +12,16 @@
>   */
>  
>  #include "qemu/osdep.h"
> +
> +#include "block/aio.h"
>  #include "qapi/error.h"
>  #include "qapi/qmp/dispatch.h"
>  #include "qapi/qmp/qdict.h"
>  #include "qapi/qmp/qjson.h"
>  #include "sysemu/runstate.h"
>  #include "qapi/qmp/qbool.h"
> +#include "qemu/coroutine.h"
> +#include "qemu/main-loop.h"
>  
>  static QDict *qmp_dispatch_check_obj(const QObject *request, bool allow_oob,
>                                       Error **errp)
> @@ -75,6 +79,25 @@ static QDict *qmp_dispatch_check_obj(const QObject *request, bool allow_oob,
>      return dict;
>  }
>  
> +typedef struct QmpDispatchBH {
> +    QmpCommand *cmd;
> +    QDict *args;
> +    QObject **ret;
> +    Error **errp;
> +    Coroutine *co;
> +} QmpDispatchBH;
> +
> +static void do_qmp_dispatch_bh(void *opaque)
> +{
> +    QmpDispatchBH *data = opaque;
> +    data->cmd->fn(data->args, data->ret, data->errp);
> +    aio_co_wake(data->co);
> +}
> +
> +/*
> + * Runs outside of coroutine context for OOB commands, but in coroutine context
> + * for everything else.
> + */
>  static QObject *do_qmp_dispatch(QmpCommandList *cmds, QObject *request,
>                                  bool allow_oob, Error **errp)
>  {
> @@ -129,7 +152,22 @@ static QObject *do_qmp_dispatch(QmpCommandList *cmds, QObject *request,
>          qobject_ref(args);
>      }
>  
> -    cmd->fn(args, &ret, &local_err);
> +    assert(!(oob && qemu_in_coroutine()));
> +    if ((cmd->options & QCO_COROUTINE) || !qemu_in_coroutine()) {
> +        cmd->fn(args, &ret, &local_err);
> +    } else {
> +        /* Must drop out of coroutine context for this one */
> +        QmpDispatchBH data = {
> +            .cmd    = cmd,
> +            .args   = args,
> +            .ret    = &ret,
> +            .errp   = &local_err,
> +            .co     = qemu_coroutine_self(),
> +        };
> +        aio_bh_schedule_oneshot(qemu_get_aio_context(), do_qmp_dispatch_bh,
> +                                &data);
> +        qemu_coroutine_yield();
> +    }
>      if (local_err) {
>          error_propagate(errp, local_err);
>      } else if (cmd->options & QCO_NO_SUCCESS_RESP) {
> @@ -164,6 +202,10 @@ bool qmp_is_oob(const QDict *dict)
>          && !qdict_haskey(dict, "execute");
>  }
>  
> +/*
> + * Runs outside of coroutine context for OOB commands, but in coroutine context
> + * for everything else.
> + */
>  QDict *qmp_dispatch(QmpCommandList *cmds, QObject *request,
>                      bool allow_oob)
>  {
> diff --git a/qapi/qmp-registry.c b/qapi/qmp-registry.c
> index ca00f74795..3d896aedd8 100644
> --- a/qapi/qmp-registry.c
> +++ b/qapi/qmp-registry.c
> @@ -20,6 +20,9 @@ void qmp_register_command(QmpCommandList *cmds, const char *name,
>  {
>      QmpCommand *cmd = g_malloc0(sizeof(*cmd));
>  
> +    /* QCO_COROUTINE and QCO_ALLOW_OOB are incompatible */

Only because we don't implement coroutine context for exec-oob, for want
of a use case.  See also my review of PATCH 01.  No need to spell this
out every time, but I think spelling it out at least once wouldn't hurt.

> +    assert(!((options & QCO_COROUTINE) && (options & QCO_ALLOW_OOB)));
> +
>      cmd->name = name;
>      cmd->fn = fn;
>      cmd->enabled = true;
> diff --git a/util/aio-posix.c b/util/aio-posix.c
> index a4977f538e..223de08b91 100644
> --- a/util/aio-posix.c
> +++ b/util/aio-posix.c
> @@ -15,6 +15,7 @@
>  
>  #include "qemu/osdep.h"
>  #include "block/block.h"
> +#include "qemu/main-loop.h"
>  #include "qemu/rcu_queue.h"
>  #include "qemu/sockets.h"
>  #include "qemu/cutils.h"
> @@ -616,7 +617,11 @@ bool aio_poll(AioContext *ctx, bool blocking)
>      int64_t timeout;
>      int64_t start = 0;
>  
> -    assert(in_aio_context_home_thread(ctx));
> +    /* aio_poll() may only be called in the AioContext's thread. iohandler_ctx
> +     * is special in that it runs in the main thread, but that thread's context
> +     * is qemu_aio_context. */
> +    assert(in_aio_context_home_thread(ctx == iohandler_get_aio_context() ?
> +                                      qemu_get_aio_context() : ctx));
>  
>      /* aio_notify can avoid the expensive event_notifier_set if
>       * everything (file descriptors, bottom halves, timers) will

This is the aio_poll() improvement.


I very much appreciate the effort you put into making this easier to
understand and maintain.  Thank you!



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine
  2020-02-17 11:08   ` Markus Armbruster
@ 2020-02-17 12:34     ` Kevin Wolf
  2020-02-18 14:12       ` Markus Armbruster
  0 siblings, 1 reply; 27+ messages in thread
From: Kevin Wolf @ 2020-02-17 12:34 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: stefanha, marcandre.lureau, qemu-devel, qemu-block

Am 17.02.2020 um 12:08 hat Markus Armbruster geschrieben:
> This is the hairy one.  Please bear with me while I try to grok it.
> 
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > This moves the QMP dispatcher to a coroutine and runs all QMP command
> > handlers that declare 'coroutine': true in coroutine context so they
> > can avoid blocking the main loop while doing I/O or waiting for other
> > events.
> >
> > For commands that are not declared safe to run in a coroutine, the
> > dispatcher drops out of coroutine context by calling the QMP command
> > handler from a bottom half.
> >
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  include/qapi/qmp/dispatch.h |   1 +
> >  monitor/monitor-internal.h  |   6 +-
> >  monitor/monitor.c           |  33 ++++++++---
> >  monitor/qmp.c               | 110 ++++++++++++++++++++++++++----------
> >  qapi/qmp-dispatch.c         |  44 ++++++++++++++-
> >  qapi/qmp-registry.c         |   3 +
> >  util/aio-posix.c            |   7 ++-
> >  7 files changed, 162 insertions(+), 42 deletions(-)
> >
> > diff --git a/include/qapi/qmp/dispatch.h b/include/qapi/qmp/dispatch.h
> > index d6ce9efc8e..6812e49b5f 100644
> > --- a/include/qapi/qmp/dispatch.h
> > +++ b/include/qapi/qmp/dispatch.h
> > @@ -30,6 +30,7 @@ typedef enum QmpCommandOptions
> >  typedef struct QmpCommand
> >  {
> >      const char *name;
> > +    /* Runs in coroutine context if QCO_COROUTINE is set */
> >      QmpCommandFunc *fn;
> >      QmpCommandOptions options;
> >      QTAILQ_ENTRY(QmpCommand) node;
> > diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
> > index d78f5ca190..f180d03368 100644
> > --- a/monitor/monitor-internal.h
> > +++ b/monitor/monitor-internal.h
> > @@ -154,7 +154,9 @@ static inline bool monitor_is_qmp(const Monitor *mon)
> >  
> >  typedef QTAILQ_HEAD(MonitorList, Monitor) MonitorList;
> >  extern IOThread *mon_iothread;
> > -extern QEMUBH *qmp_dispatcher_bh;
> > +extern Coroutine *qmp_dispatcher_co;
> > +extern bool qmp_dispatcher_co_shutdown;
> > +extern bool qmp_dispatcher_co_busy;
> >  extern QmpCommandList qmp_commands, qmp_cap_negotiation_commands;
> >  extern QemuMutex monitor_lock;
> >  extern MonitorList mon_list;
> > @@ -172,7 +174,7 @@ void monitor_fdsets_cleanup(void);
> >  
> >  void qmp_send_response(MonitorQMP *mon, const QDict *rsp);
> >  void monitor_data_destroy_qmp(MonitorQMP *mon);
> > -void monitor_qmp_bh_dispatcher(void *data);
> > +void coroutine_fn monitor_qmp_dispatcher_co(void *data);
> >  
> >  int get_monitor_def(int64_t *pval, const char *name);
> >  void help_cmd(Monitor *mon, const char *name);
> > diff --git a/monitor/monitor.c b/monitor/monitor.c
> > index 12898b6448..e753fa435d 100644
> > --- a/monitor/monitor.c
> > +++ b/monitor/monitor.c
> > @@ -53,8 +53,18 @@ typedef struct {
> >  /* Shared monitor I/O thread */
> >  IOThread *mon_iothread;
> >  
> > -/* Bottom half to dispatch the requests received from I/O thread */
> > -QEMUBH *qmp_dispatcher_bh;
> > +/* Coroutine to dispatch the requests received from I/O thread */
> > +Coroutine *qmp_dispatcher_co;
> > +
> > +/* Set to true when the dispatcher coroutine should terminate */
> > +bool qmp_dispatcher_co_shutdown;
> > +
> > +/*
> > + * true if the coroutine is active and processing requests. The coroutine may
> > + * only be woken up externally (e.g. from the monitor thread) after changing
> > + * qmp_dispatcher_co_busy from false to true (e.g. using atomic_xchg).
> > + */
> 
> I'm not sure what you mean by "externally".

By anyone outside the coroutine itself. Maybe just dropping the word
"externally" avoids the confusion because it's an implementation detail
that the coroutine can schedule itself while it is marked busy.

> Also mention how it changes from true to false?

> Something like: "The coroutine will automatically change it back to false
after processing all pending requests"?

> Note to self: monitor_qmp_dispatcher_co() checks busy is true on resume.
> 
> Nitpick: wrap around column 70, two spaces between sentences for
> consistency with other comments in this file, please.

Any specific reason why comments (but not code) in this file use a
different text width than everything else in QEMU? My editor is set to
use 80 characters to conform to CODING_STYLE.rst.

> > +bool qmp_dispatcher_co_busy;
> >  
> >  /* Protects mon_list, monitor_qapi_event_state, monitor_destroyed.  */
> >  QemuMutex monitor_lock;
> > @@ -579,9 +589,16 @@ void monitor_cleanup(void)
> 
> monitor_cleanup() runs in the main thread.
> 
> Coroutine qmp_dispatcher_co also runs in the main thread, right?

Yes.

> >      }
> >      qemu_mutex_unlock(&monitor_lock);
> >  
> > -    /* QEMUBHs needs to be deleted before destroying the I/O thread */
> > -    qemu_bh_delete(qmp_dispatcher_bh);
> > -    qmp_dispatcher_bh = NULL;
> > +    /* The dispatcher needs to stop before destroying the I/O thread */
> > +    qmp_dispatcher_co_shutdown = true;
> 
> The coroutine switch ensures qmp_dispatcher_co sees this write, so no
> need for a barrier.  Correct?

Both run in the same thread anyway, so barriers wouldn't make a
difference.

> > +    if (!atomic_xchg(&qmp_dispatcher_co_busy, true)) {
> 
> Why do we need atomic?  I figure it's because qmp_dispatcher_co_busy is
> accessed from multiple threads (main thread and mon_iothread), unlike
> qmp_dispatcher_co_shutdown.

Yes, it's for synchronisation with the monitor thread. A coroutine may
not be scheduled twice at the same time. This is essentially all that
we're protecting against with qmp_dispatcher_co_busy. (See the
documentation for qmp_dispatcher_co_busy above.)

> What kind of atomic?  I'm asking because you use sequentially consistent
> atomic_xchg() together with the weaker atomic_mb_set() and
> atomic_mb_read().

atomic_mb_set/read() contain a barrier, which avoids reordering around
them. What makes them weaker than sequentially consistent atomic
operations is, quoting docs/devel/atomics.txt:

    However, and this is the important difference between
    atomic_mb_read/atomic_mb_set and sequential consistency, it is important
    for both threads to access the same volatile variable.  It is not the
    case that everything visible to thread A when it writes volatile field f
    becomes visible to thread B after it reads volatile field g. The store
    and load have to "match" (i.e., be performed on the same volatile
    field) to achieve the right semantics.

This condition is fulfilled, both threads communicate only through
qmp_dispatcher_co_busy.
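
Boiled down to the accesses that actually matter (this is copied from
the patch rather than new code, only condensed), the two sides look
like this:

    /* mon_iothread, handle_qmp_command(): only the thread that flips
     * busy from false to true may wake the coroutine */
    if (!atomic_xchg(&qmp_dispatcher_co_busy, true)) {
        aio_co_wake(qmp_dispatcher_co);
    }

    /* main thread, monitor_qmp_dispatcher_co(): mark ourselves not
     * busy again before looking for more requests */
    atomic_mb_set(&qmp_dispatcher_co_busy, false);

Store and load always hit the same variable, which is exactly the
condition quoted above.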

> > +        aio_co_wake(qmp_dispatcher_co);
> > +    }
> > +
> > +    AIO_WAIT_WHILE(qemu_get_aio_context(),
> > +                   (aio_poll(iohandler_get_aio_context(), false),
> > +                    atomic_mb_read(&qmp_dispatcher_co_busy)));
> 
> This waits for qmp_dispatcher_co_busy to become false again.  While
> waiting, pending AIO work is given a chance to progress, as long as it
> doesn't block.
> 
> The only places that change qmp_dispatcher_co_busy to false (in
> monitor_qmp_dispatcher_co()) return without yielding when
> qmp_dispatcher_co_shutdown, terminating the coroutine.  Correct?

Correct.

> Ignorant question: what AIO work may be pending, and why do we want it
> to make progress?

This pending work specifically contains running the monitor dispatcher
coroutine. Without the aio_poll(), the coroutine code wouldn't get a
chance to run, so we would never complete.

Note that AIO_WAIT_WHILE() automatically polls the (main) AioContext of
the thread, but the main thread is irregular in that it has two
AioContexts. The monitor dispatcher happens to live in the iohandler
context, which is not polled by default, so we need to do that manually.
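
Spelled out with comments (the same code as in the patch, just
annotated):

    AIO_WAIT_WHILE(qemu_get_aio_context(),
                   /* give the dispatcher coroutine in iohandler_ctx a
                    * chance to run, because AIO_WAIT_WHILE() itself
                    * only polls qemu_aio_context... */
                   (aio_poll(iohandler_get_aio_context(), false),
                    /* ...and keep waiting until it has terminated */
                    atomic_mb_read(&qmp_dispatcher_co_busy)));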

> I have to admit the I/O context magic is still voodoo to me.  Learning
> opportunity, I guess :)
> 
> Since v3, you switched from aio_bh_poll() to aio_poll().  Good:
> aio_poll() is intended for general use, while aio_bh_poll() is not.  But
> what happened to your "I would have called aio_poll(), but it's
> forbidden for iohandler_ctx"?  Oh, you've hidden an improvement to
> aio_poll() at the very end of this patch!
> 
> You also wrote
> 
>     Much of this complication comes from the fact that the monitor runs in
>     iohandler_ctx, which is not the main AioContext of the main loop thread
>     (or any thread). This makes waiting for something in this AioContext
>     rather complicated because nothing will poll that AioContext if I don't
>     do it in the loop condition.
> 
> Should we explain this complication in a comment somewhere?  Hmm, there
> is one further down:
> 
>   +        /*
>   +         * Move the coroutine from iohandler_ctx to qemu_aio_context for
>   +         * executing the command handler so that it can make progress if it
>   +         * involves an AIO_WAIT_WHILE().
>   +         */
> 
> Assumes working knowledge of iohandler_ctx, which I don't quite have,
> yet.  I found this comment
> 
>    /*
>     * Functions to operate on the I/O handler AioContext.
>     * This context runs on top of main loop. We can't reuse qemu_aio_context
>     * because iohandlers mustn't be polled by aio_poll(qemu_aio_context).
>     */
>    static AioContext *iohandler_ctx;
> 
> and docs/devel/multiple-iothreads.txt.  I guess I better study it.

I'm not sure myself how much of this is actually still true, but not
being an expert on iohandler_ctx myself, I decided to leave things in
the same AioContext where they were before this series.

> > +
> >      if (mon_iothread) {
> >          iothread_destroy(mon_iothread);
> >          mon_iothread = NULL;
> > @@ -604,9 +621,9 @@ void monitor_init_globals_core(void)
> >       * have commands assuming that context.  It would be nice to get
> >       * rid of those assumptions.
> >       */
> > -    qmp_dispatcher_bh = aio_bh_new(iohandler_get_aio_context(),
> > -                                   monitor_qmp_bh_dispatcher,
> > -                                   NULL);
> > +    qmp_dispatcher_co = qemu_coroutine_create(monitor_qmp_dispatcher_co, NULL);
> > +    atomic_mb_set(&qmp_dispatcher_co_busy, true);
> > +    aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
> 
> In review of v3, you explained why you didn't use qemu_coroutine_enter()
> here, even though it's simpler:
> 
>     Because the old code didn't run the BH right away. Should it? We're
>     pretty early in the initialisation of QEMU, but it should work as long
>     as we're allowed to call monitor_qmp_requests_pop_any_with_lock()
>     already.
> 
> The old code creates, but does not schedule the bottom half here.  It
> gets scheduled only in handle_qmp_command().
> 
> The new code appears to schedule the coroutine here.  I'm confused :)

What will happen here is essentially that we schedule a BH that enters
the coroutine. This runs the first part of monitor_qmp_dispatcher_co()
until it yields because there are no requests pending yet.

monitor_qmp_requests_pop_any_with_lock() is the only thing that is run
before handle_qmp_command() wakes up the coroutine (or monitor_cleanup
if we never get any request).
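
Written down as a rough startup timeline (just my understanding, not
code from the patch):

    /*
     * monitor_init_globals_core():
     *     qmp_dispatcher_co = qemu_coroutine_create(...);
     *     atomic_mb_set(&qmp_dispatcher_co_busy, true);
     *     aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
     *         -> only queues a BH in iohandler_ctx, nothing runs yet
     *
     * first main loop iteration:
     *     the BH enters monitor_qmp_dispatcher_co()
     *         -> monitor_qmp_requests_pop_any_with_lock() returns NULL
     *            (mon_list is still empty)
     *         -> sets busy = false and yields
     *
     * later:
     *     handle_qmp_command() flips busy back to true and wakes it up
     */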

> Regarding calling monitor_qmp_requests_pop_any_with_lock(): it needs
> @monitor_lock and @mon_list to be valid.  We just initialized
> @monitor_lock, and @mon_list is empty.
> monitor_qmp_requests_pop_any_with_lock() should be safe and return null.
> monitor_qmp_dispatcher_co() should also be safe and yield without doing
> work.
> 
> Can we exploit that to make things a bit simpler?  Separate patch would
> be fine with me.

If this is true, we could replace this line:

    aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);

with the following one:

    qemu_aio_coroutine_enter(iohandler_get_aio_context(), qmp_dispatcher_co);

I'm not sure that this is any simpler.

> >  }
> >  
> >  QemuOptsList qemu_mon_opts = {
> > diff --git a/monitor/qmp.c b/monitor/qmp.c
> > index 54c06ba824..9444de9fcf 100644
> > --- a/monitor/qmp.c
> > +++ b/monitor/qmp.c
> > @@ -133,6 +133,10 @@ static void monitor_qmp_respond(MonitorQMP *mon, QDict *rsp)
> >      }
> >  }
> >  
> > +/*
> > + * Runs outside of coroutine context for OOB commands, but in coroutine context
> > + * for everything else.
> > + */
> 
> Nitpick: wrap around column 70, please.
> 
> Note to self: the precondition is asserted in do_qmp_dispatch() below.
> Asserting here is impractical, because we don't know whether this is an
> OOB command.
> 
> >  static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
> >  {
> >      Monitor *old_mon;
> > @@ -211,43 +215,87 @@ static QMPRequest *monitor_qmp_requests_pop_any_with_lock(void)
> >      return req_obj;
> >  }
> >  
> > -void monitor_qmp_bh_dispatcher(void *data)
> > +void coroutine_fn monitor_qmp_dispatcher_co(void *data)
> >  {
> > -    QMPRequest *req_obj = monitor_qmp_requests_pop_any_with_lock();
> > +    QMPRequest *req_obj = NULL;
> >      QDict *rsp;
> >      bool need_resume;
> >      MonitorQMP *mon;
> >  
> > -    if (!req_obj) {
> > -        return;
> > -    }
> > +    while (true) {
> > +        assert(atomic_mb_read(&qmp_dispatcher_co_busy) == true);
> 
> Read and assert, then ...
> 
> > +
> > +        /* Mark the dispatcher as not busy already here so that we don't miss
> > +         * any new requests coming in the middle of our processing. */
> > +        atomic_mb_set(&qmp_dispatcher_co_busy, false);
> 
> ... set.  Would exchange, then assert be cleaner?

Then you would ask me why the exchange has to be atomic. :-)

More practically, I would need a temporary variable so that I don't get
code with side effects in assert() (which may be compiled out with
NDEBUG). The temporary variable would never be read outside the assert
and would be unused with NDEBUG.

So possible, yes, cleaner I'm not sure.
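
For completeness, the exchange-then-assert variant would have to look
something like this (hypothetical, not what the patch does):

    bool was_busy = atomic_xchg(&qmp_dispatcher_co_busy, false);
    assert(was_busy);
    (void)was_busy;    /* only read by the assert, unused with NDEBUG */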

> The assertion checks qmp_dispatcher_co_busy is set on coroutine enter.
> It pairs with the atomic_mb_set() in monitor_init_globals_core().
> 
> Wing the comment, please, and wrap around column 70.  More of the same
> below.
> 
> Hmm, qmp_dispatcher_co_busy is false while the coroutine busily runs.  I
> figure its actual purpose is something like "if false, you need to wake
> it up to ensure it processes additional requests".  Correct?

Yes. I understood "busy" in the sense of "executing a monitor command",
but of course the purpose of the variable is to know whether you need to
schedule the coroutine or whether you must not schedule it.

> > +
> > +        while (!(req_obj = monitor_qmp_requests_pop_any_with_lock())) {
> > +            /* Wait to be reentered from handle_qmp_command, or terminate if
> > +             * qmp_dispatcher_co_shutdown is true*/
> 
> Yes, these are the two places that wake this coroutine.
> 
> Hmm, there's a third aio_co_wake() in do_qmp_dispatch_bh().  But that
> one resumes the yield in do_qmp_dispatch().  Correct?

Yes, do_qmp_dispatch_bh is scheduled immediately before yielding, so
that yield is where the coroutine will be resumed.
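
In other words, the handoff for a non-coroutine handler goes roughly
like this (my reading of the patch, condensed into comments):

    /*
     * coroutine, do_qmp_dispatch():
     *     aio_bh_schedule_oneshot(qemu_get_aio_context(),
     *                             do_qmp_dispatch_bh, &data);
     *     qemu_coroutine_yield();         <-- suspends right here
     *
     * main loop, do_qmp_dispatch_bh():
     *     data->cmd->fn(data->args, data->ret, data->errp);
     *     aio_co_wake(data->co);          <-- resumes that same yield
     */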

> Space before */, please.
> 
> Would this
> 
>                /*
>                 * No more requests to process.  Wait until
>                 * handle_qmp_command() pushes more, or monitor_cleanup()
>                 * requests shutdown.
>                 */
> 
> be clearer?

I think I prefer explicitly mentioning that these callers not only push
requests or request shutdown, but that they actively cause this
coroutine to be reentered.

Matter of taste. You're the maintainer, so your taste wins.

> > +            if (!qmp_dispatcher_co_shutdown) {
> > +                qemu_coroutine_yield();
> 
> Nothing to do, go to sleep.
> 
> > +
> > +                /* busy must be set to true again by whoever rescheduled us to
> > +                 * avoid double scheduling */
> > +                assert(atomic_xchg(&qmp_dispatcher_co_busy, false) == true);
> 
> The assertion checks the coroutine's resume set busy as it should.  It
> pairs with the atomic_xchg() in handle_qmp_command() and
> monitor_cleanup().

Correct.

> > +            }
> > +
> > +            /* qmp_dispatcher_co_shutdown may have changed if we yielded and
> > +             * were reentered from monitor_cleanup() */
> > +            if (qmp_dispatcher_co_shutdown) {
> > +                return;
> > +            }
> > +        }
> >  
> 
> We got a request in @req.
> 
> > -    mon = req_obj->mon;
> > -    /*  qmp_oob_enabled() might change after "qmp_capabilities" */
> > -    need_resume = !qmp_oob_enabled(mon) ||
> > -        mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
> > -    qemu_mutex_unlock(&mon->qmp_queue_lock);
> > -    if (req_obj->req) {
> > -        QDict *qdict = qobject_to(QDict, req_obj->req);
> > -        QObject *id = qdict ? qdict_get(qdict, "id") : NULL;
> > -        trace_monitor_qmp_cmd_in_band(qobject_get_try_str(id) ?: "");
> > -        monitor_qmp_dispatch(mon, req_obj->req);
> > -    } else {
> > -        assert(req_obj->err);
> > -        rsp = qmp_error_response(req_obj->err);
> > -        req_obj->err = NULL;
> > -        monitor_qmp_respond(mon, rsp);
> > -        qobject_unref(rsp);
> > -    }
> > +        if (atomic_xchg(&qmp_dispatcher_co_busy, true) == true) {
> > +            /* Someone rescheduled us (probably because a new requests came
> > +             * in), but we didn't actually yield. Do that now, only to be
> > +             * immediately reentered and removed from the list of scheduled
> > +             * coroutines. */
> > +            qemu_coroutine_yield();
> > +        }
> >  
> > -    if (need_resume) {
> > -        /* Pairs with the monitor_suspend() in handle_qmp_command() */
> > -        monitor_resume(&mon->common);
> > -    }
> > -    qmp_request_free(req_obj);
> > +        /*
> > +         * Move the coroutine from iohandler_ctx to qemu_aio_context for
> > +         * executing the command handler so that it can make progress if it
> > +         * involves an AIO_WAIT_WHILE().
> > +         */
> > +        aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
> > +        qemu_coroutine_yield();
> > +
> > +        mon = req_obj->mon;
> > +        /*  qmp_oob_enabled() might change after "qmp_capabilities" */
> > +        need_resume = !qmp_oob_enabled(mon) ||
> > +            mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
> > +        qemu_mutex_unlock(&mon->qmp_queue_lock);
> > +        if (req_obj->req) {
> > +            QDict *qdict = qobject_to(QDict, req_obj->req);
> > +            QObject *id = qdict ? qdict_get(qdict, "id") : NULL;
> > +            trace_monitor_qmp_cmd_in_band(qobject_get_try_str(id) ?: "");
> > +            monitor_qmp_dispatch(mon, req_obj->req);
> > +        } else {
> > +            assert(req_obj->err);
> > +            rsp = qmp_error_response(req_obj->err);
> > +            req_obj->err = NULL;
> > +            monitor_qmp_respond(mon, rsp);
> > +            qobject_unref(rsp);
> > +        }
> > +
> > +        if (need_resume) {
> > +            /* Pairs with the monitor_suspend() in handle_qmp_command() */
> > +            monitor_resume(&mon->common);
> > +        }
> > +        qmp_request_free(req_obj);
> >  
> > -    /* Reschedule instead of looping so the main loop stays responsive */
> > -    qemu_bh_schedule(qmp_dispatcher_bh);
> > +        /*
> > +         * Yield and reschedule so the main loop stays responsive.
> > +         *
> > +         * Move back to iohandler_ctx so that nested event loops for
> > +         * qemu_aio_context don't start new monitor commands.
> > +         */
> > +        aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
> > +        qemu_coroutine_yield();
> > +    }
> >  }
> >  
> 
> Easier to review with diff -w:
> 
>   +        if (atomic_xchg(&qmp_dispatcher_co_busy, true) == true) {
>   +            /* Someone rescheduled us (probably because a new requests came
>   +             * in), but we didn't actually yield. Do that now, only to be
>   +             * immediately reentered and removed from the list of scheduled
>   +             * coroutines. */
>   +            qemu_coroutine_yield();
>   +        }
> 
> This part I understand.
> 
>   +
>   +        /*
>   +         * Move the coroutine from iohandler_ctx to qemu_aio_context for
>   +         * executing the command handler so that it can make progress if it
>   +         * involves an AIO_WAIT_WHILE().
>   +         */
>   +        aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
>   +        qemu_coroutine_yield();
> 
> More I/O context voodoo.  I'll get there.
> 
>            mon = req_obj->mon;
>            /*  qmp_oob_enabled() might change after "qmp_capabilities" */
>   @@ -246,8 +287,15 @@ void monitor_qmp_bh_dispatcher(void *data)
>            }
>            qmp_request_free(req_obj);
> 
>   -    /* Reschedule instead of looping so the main loop stays responsive */
>   -    qemu_bh_schedule(qmp_dispatcher_bh);
>   +        /*
>   +         * Yield and reschedule so the main loop stays responsive.
>   +         *
>   +         * Move back to iohandler_ctx so that nested event loops for
>   +         * qemu_aio_context don't start new monitor commands.
> 
> Can you explain this sentence for dummies?

Nested event loops (i.e. AIO_WAIT_WHILE) poll qemu_aio_context, so if we
are scheduled there, the next iteration of the monitor dispatcher loop
could start from a nested event loop. If we are scheduled in
iohandler_ctx, then only the actual main loop will reenter the coroutine
and nested event loops ignore it.

I'm not sure if or why this is actually important, but this matches
scheduling the dispatcher BH in iohandler_ctx in the code before this
patch.

If we didn't do this, we could end up starting monitor requests in more
places than before, and who knows what that would mean.
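
A hypothetical example of the situation I want to avoid
(some_blocking_condition() is made up, only to illustrate the nested
loop):

    /* somewhere inside the command handler that is currently running */
    AIO_WAIT_WHILE(qemu_get_aio_context(), some_blocking_condition());

AIO_WAIT_WHILE() spins a nested event loop that polls qemu_aio_context.
If the dispatcher coroutine were rescheduled there, that nested loop
could reenter it and start the next monitor command while the current
handler is still in the middle of its work.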

>   +         */
>   +        aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
>   +        qemu_coroutine_yield();
>   +    }
>    }
> 
> >  static void handle_qmp_command(void *opaque, QObject *req, Error *err)

Kevin



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine
  2020-02-17 12:34     ` Kevin Wolf
@ 2020-02-18 14:12       ` Markus Armbruster
  2020-02-18 15:29         ` Kevin Wolf
  0 siblings, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2020-02-18 14:12 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, marcandre.lureau, qemu-devel, stefanha

Never been closer...

Kevin Wolf <kwolf@redhat.com> writes:

> Am 17.02.2020 um 12:08 hat Markus Armbruster geschrieben:
>> This is the hairy one.  Please bear with me while I try to grok it.
>> 
>> Kevin Wolf <kwolf@redhat.com> writes:
>> 
>> > This moves the QMP dispatcher to a coroutine and runs all QMP command
>> > handlers that declare 'coroutine': true in coroutine context so they
>> > can avoid blocking the main loop while doing I/O or waiting for other
>> > events.
>> >
>> > For commands that are not declared safe to run in a coroutine, the
>> > dispatcher drops out of coroutine context by calling the QMP command
>> > handler from a bottom half.
>> >
>> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>> > ---
>> >  include/qapi/qmp/dispatch.h |   1 +
>> >  monitor/monitor-internal.h  |   6 +-
>> >  monitor/monitor.c           |  33 ++++++++---
>> >  monitor/qmp.c               | 110 ++++++++++++++++++++++++++----------
>> >  qapi/qmp-dispatch.c         |  44 ++++++++++++++-
>> >  qapi/qmp-registry.c         |   3 +
>> >  util/aio-posix.c            |   7 ++-
>> >  7 files changed, 162 insertions(+), 42 deletions(-)
>> >
>> > diff --git a/include/qapi/qmp/dispatch.h b/include/qapi/qmp/dispatch.h
>> > index d6ce9efc8e..6812e49b5f 100644
>> > --- a/include/qapi/qmp/dispatch.h
>> > +++ b/include/qapi/qmp/dispatch.h
>> > @@ -30,6 +30,7 @@ typedef enum QmpCommandOptions
>> >  typedef struct QmpCommand
>> >  {
>> >      const char *name;
>> > +    /* Runs in coroutine context if QCO_COROUTINE is set */
>> >      QmpCommandFunc *fn;
>> >      QmpCommandOptions options;
>> >      QTAILQ_ENTRY(QmpCommand) node;
>> > diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
>> > index d78f5ca190..f180d03368 100644
>> > --- a/monitor/monitor-internal.h
>> > +++ b/monitor/monitor-internal.h
>> > @@ -154,7 +154,9 @@ static inline bool monitor_is_qmp(const Monitor *mon)
>> >  
>> >  typedef QTAILQ_HEAD(MonitorList, Monitor) MonitorList;
>> >  extern IOThread *mon_iothread;
>> > -extern QEMUBH *qmp_dispatcher_bh;
>> > +extern Coroutine *qmp_dispatcher_co;
>> > +extern bool qmp_dispatcher_co_shutdown;
>> > +extern bool qmp_dispatcher_co_busy;
>> >  extern QmpCommandList qmp_commands, qmp_cap_negotiation_commands;
>> >  extern QemuMutex monitor_lock;
>> >  extern MonitorList mon_list;
>> > @@ -172,7 +174,7 @@ void monitor_fdsets_cleanup(void);
>> >  
>> >  void qmp_send_response(MonitorQMP *mon, const QDict *rsp);
>> >  void monitor_data_destroy_qmp(MonitorQMP *mon);
>> > -void monitor_qmp_bh_dispatcher(void *data);
>> > +void coroutine_fn monitor_qmp_dispatcher_co(void *data);
>> >  
>> >  int get_monitor_def(int64_t *pval, const char *name);
>> >  void help_cmd(Monitor *mon, const char *name);
>> > diff --git a/monitor/monitor.c b/monitor/monitor.c
>> > index 12898b6448..e753fa435d 100644
>> > --- a/monitor/monitor.c
>> > +++ b/monitor/monitor.c
>> > @@ -53,8 +53,18 @@ typedef struct {
>> >  /* Shared monitor I/O thread */
>> >  IOThread *mon_iothread;
>> >  
>> > -/* Bottom half to dispatch the requests received from I/O thread */
>> > -QEMUBH *qmp_dispatcher_bh;
>> > +/* Coroutine to dispatch the requests received from I/O thread */
>> > +Coroutine *qmp_dispatcher_co;
>> > +
>> > +/* Set to true when the dispatcher coroutine should terminate */
>> > +bool qmp_dispatcher_co_shutdown;
>> > +
>> > +/*
>> > + * true if the coroutine is active and processing requests. The coroutine may
>> > + * only be woken up externally (e.g. from the monitor thread) after changing
>> > + * qmp_dispatcher_co_busy from false to true (e.g. using atomic_xchg).
>> > + */
>> 
>> I'm not sure what you mean by "externally".
>
> By anyone outside the coroutine itself. Maybe just dropping the word
> "externally" avoids the confusion because it's an implementation detail
> that the coroutine can schedule itself while it is marked busy.

Let's do that.  For me, a coroutine scheduling itself is not covered by
"woken up".

>> Also mention how it changes from true to false?
>
> Somethin like: "The coroutine will automatically change it back to false
> after processing all pending requests"?

Works for me.

More below.

>> Note to self: monitor_qmp_dispatcher_co() checks busy is true on resume.
>> 
>> Nitpick: wrap around column 70, two spaces between sentences for
>> consistency with other comments in this file, please.
>
> Any specific reason why comments (but not code) in this file use a
> different text width than everything else in QEMU? My editor is set to
> use 80 characters to conform to CODING_STYLE.rst.

Legibility.  Humans tend to have trouble following long lines with their
eyes (I sure do).  Typographic manuals suggest limiting columns to
roughly 60 characters for exactly that reason[*].

Code is special.  It's typically indented, and long identifiers push it
further to the right, function arguments in particular.  We compromised
at 80 columns.

Block comments are not code.  They are typically not indented much.
This one isn't indented at all.  Line length without the comment
decoration is way above 60.

>> > +bool qmp_dispatcher_co_busy;
>> >  
>> >  /* Protects mon_list, monitor_qapi_event_state, monitor_destroyed.  */
>> >  QemuMutex monitor_lock;
>> > @@ -579,9 +589,16 @@ void monitor_cleanup(void)
>> 
>> monitor_cleanup() runs in the main thread.
>> 
>> Coroutine qmp_dispatcher_co also runs in the main thread, right?
>
> Yes.
>
>> >      }
>> >      qemu_mutex_unlock(&monitor_lock);
>> >  
>> > -    /* QEMUBHs needs to be deleted before destroying the I/O thread */
>> > -    qemu_bh_delete(qmp_dispatcher_bh);
>> > -    qmp_dispatcher_bh = NULL;
>> > +    /* The dispatcher needs to stop before destroying the I/O thread */
>> > +    qmp_dispatcher_co_shutdown = true;
>> 
>> The coroutine switch ensures qmp_dispatcher_co sees this write, so no
>> need for a barrier.  Correct?
>
> Both run in the same thread anyway, so barriers wouldn't make a
> difference.

Possibly a compiler barrier.

>> > +    if (!atomic_xchg(&qmp_dispatcher_co_busy, true)) {
>> 
>> Why do we need atomic?  I figure it's because qmp_dispatcher_co_busy is
>> accessed from multiple threads (main thread and mon_iothread), unlike
>> qmp_dispatcher_co_shutdown.
>
> Yes, it's for synchronisation with the monitor thread. A coroutine may
> not be scheduled twice at the same time. This is essentially all that
> we're protecting against with qmp_dispatcher_co_busy.

Aha.

When I see synchronization (locks, atomics, ...), I often wonder what is
being synchronized to protect what.  I like comments explaining that.

>                                                       (See the
> documentation for qmp_dispatcher_co_busy above.)

"Not scheduled twice at the same time" is what we want.
qmp_dispatcher_co_busy's doc comment states the invariant that ensures
we get what we want.  That's useful.  Spelling out what we want would
probably also be useful.  More on that below.

>> What kind of atomic?  I'm asking because you use sequentially consistent
>> atomic_xchg() together with the weaker atomic_mb_set() and
>> atomic_mb_read().
>
> atomic_mb_set/read() contain a barrier, which avoids reordering around
> them. What makes them weaker than sequentially consistent atomic
> operations is, quoting docs/devel/atomics.txt:
>
>     However, and this is the important difference between
>     atomic_mb_read/atomic_mb_set and sequential consistency, it is important
>     for both threads to access the same volatile variable.  It is not the
>     case that everything visible to thread A when it writes volatile field f
>     becomes visible to thread B after it reads volatile field g. The store
>     and load have to "match" (i.e., be performed on the same volatile
>     field) to achieve the right semantics.
>
> This condition is fulfilled, both threads communicate only through
> qmp_dispatcher_co_busy.

If a weaker atomic_mb_xchg() existed, we could use it.  But it doesn't.
Okay.

>> > +        aio_co_wake(qmp_dispatcher_co);
>> > +    }
>> > +
>> > +    AIO_WAIT_WHILE(qemu_get_aio_context(),
>> > +                   (aio_poll(iohandler_get_aio_context(), false),
>> > +                    atomic_mb_read(&qmp_dispatcher_co_busy)));
>> 
>> This waits for qmp_dispatcher_co_busy to become false again.  While
>> waiting, pending AIO work is given a chance to progress, as long as it
>> doesn't block.
>> 
>> The only places that change qmp_dispatcher_co_busy to false (in
>> monitor_qmp_dispatcher_co()) return without yielding when
>> qmp_dispatcher_co_shutdown, terminating the coroutine.  Correct?
>
> Correct.
>
>> Ignorant question: what AIO work may be pending, and why do we want it
>> to make progress?
>
> This pending work specifically contains running the monitor dispatcher
> coroutine. Without the aio_poll(), the coroutine code wouldn't get a
> chance to run, so we would never complete.
>
> Note that AIO_WAIT_WHILE() automatically polls the (main) AioContext of
> the thread, but the main thread is irregular in that it has two
> AioContexts. The monitor dispatcher happens to live in the iohandler
> context, which is not polled by default, so we need to do that manually.

I'd like a comment about that.  A reminder should suffice, no need to
explain AioContext from basic principles.
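
Perhaps something along these lines right above the wait (just a
sketch, wording is yours to pick):

    /*
     * The dispatcher coroutine lives in iohandler_ctx, which
     * AIO_WAIT_WHILE() does not poll by itself, so poll it explicitly
     * in the loop condition.
     */
    AIO_WAIT_WHILE(qemu_get_aio_context(),
                   (aio_poll(iohandler_get_aio_context(), false),
                    atomic_mb_read(&qmp_dispatcher_co_busy)));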

>> I have to admit the I/O context magic is still voodoo to me.  Leaning
>> opportunity, I guess :)
>> 
>> Since v3, you switched from aio_bh_poll() to aio_poll().  Good:
>> aio_poll() is intended for general use, while aio_bh_poll() is not.  But
>> what happened to your "I would have called aio_poll(), but it's
>> forbidden for iohandler_ctx"?  Oh, you've hidden an improvement to
>> aio_poll() at the very end of this patch!
>> 
>> You also wrote
>> 
>>     Much of this complication comes from the fact that the monitor runs in
>>     iohandler_ctx, which is not the main AioContext of the main loop thread
>>     (or any thread). This makes waiting for something in this AioContext
>>     rather complicated because nothing will poll that AioContext if I don't
>>     do it in the loop condition.
>> 
>> Should we explain this complication in a comment somewhere?  Hmm, there
>> is one further down:
>> 
>>   +        /*
>>   +         * Move the coroutine from iohandler_ctx to qemu_aio_context for
>>   +         * executing the command handler so that it can make progress if it
>>   +         * involves an AIO_WAIT_WHILE().
>>   +         */
>> 
>> Assumes working knowledge of iohandler_ctx, which I don't quite have,
>> yet.  I found this comment
>> 
>>    /*
>>     * Functions to operate on the I/O handler AioContext.
>>     * This context runs on top of main loop. We can't reuse qemu_aio_context
>>     * because iohandlers mustn't be polled by aio_poll(qemu_aio_context).
>>     */
>>    static AioContext *iohandler_ctx;
>> 
>> and docs/devel/multiple-iothreads.txt.  I guess I better study it.
>
> I'm not sure how much of this is actually still true, but not
> being an expert on iohandler_ctx myself, I decided to leave things in
> the same AioContext where they were before this series.

That's fair.

>> > +
>> >      if (mon_iothread) {
>> >          iothread_destroy(mon_iothread);
>> >          mon_iothread = NULL;
>> > @@ -604,9 +621,9 @@ void monitor_init_globals_core(void)
>> >       * have commands assuming that context.  It would be nice to get
>> >       * rid of those assumptions.
>> >       */
>> > -    qmp_dispatcher_bh = aio_bh_new(iohandler_get_aio_context(),
>> > -                                   monitor_qmp_bh_dispatcher,
>> > -                                   NULL);
>> > +    qmp_dispatcher_co = qemu_coroutine_create(monitor_qmp_dispatcher_co, NULL);
>> > +    atomic_mb_set(&qmp_dispatcher_co_busy, true);
>> > +    aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
>> 
>> In review of v3, you explained why you didn't use qemu_coroutine_enter()
>> here, even though it's simpler:
>> 
>>     Because the old code didn't run the BH right away. Should it? We're
>>     pretty early in the initialisation of QEMU, but it should work as long
>>     as we're allowed to call monitor_qmp_requests_pop_any_with_lock()
>>     already.
>> 
>> The old code creates, but does not schedule the bottom half here.  It
>> gets scheduled only in handle_qmp_command().
>> 
>> The new code appears to schedule the coroutine here.  I'm confused :)
>
> What will happen here is essentially that we schedule a BH that enters
> the coroutine. This runs the first part of monitor_qmp_dispatcher_co()
> until it yields because there are no requests pending yet.
>
> monitor_qmp_requests_pop_any_with_lock() is the only thing that is run
> before handle_qmp_command() wakes up the coroutine (or monitor_cleanup
> if we never get any request).
>
>> Regarding calling monitor_qmp_requests_pop_any_with_lock(): it needs
>> @monitor_lock and @mon_list to be valid.  We just initialized
>> @monitor_lock, and @mon_list is empty.
>> monitor_qmp_requests_pop_any_with_lock() should be safe and return null.
>> monitor_qmp_dispatcher_co() should also be safe and yield without doing
>> work.
>> 
>> Can we exploit that to make things a bit simpler?  Separate patch would
>> be fine with me.
>
> If this is true, we could replace this line:
>
>     aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
>
> with the following one:
>
>     qemu_aio_coroutine_enter(iohandler_get_aio_context(), qmp_dispatcher_co);
>
> I'm not sure that this is any simpler.

Naive question: what's the difference between "scheduling", "entering",
and "waking up" a coroutine?

qemu_coroutine_enter() and qemu_aio_coroutine_enter() are in
coroutine.h.

aio_co_schedule(), aio_co_wake() and aio_co_enter() are in aio.h.

qemu_coroutine_enter() calls qemu_aio_coroutine_enter().

aio_co_wake() calls aio_co_enter() calls qemu_aio_coroutine_enter().

aio_co_enter() seems to be independent.

aio.h seems to be layered on top of coroutine.h.  Should I prefer using
aio.h to coroutine.h?

>> >  }
>> >  
>> >  QemuOptsList qemu_mon_opts = {
>> > diff --git a/monitor/qmp.c b/monitor/qmp.c
>> > index 54c06ba824..9444de9fcf 100644
>> > --- a/monitor/qmp.c
>> > +++ b/monitor/qmp.c
>> > @@ -133,6 +133,10 @@ static void monitor_qmp_respond(MonitorQMP *mon, QDict *rsp)
>> >      }
>> >  }
>> >  
>> > +/*
>> > + * Runs outside of coroutine context for OOB commands, but in coroutine context
>> > + * for everything else.
>> > + */
>> 
>> Nitpick: wrap around column 70, please.
>> 
>> Note to self: the precondition is asserted in do_qmp_dispatch() below.
>> Asserting here is impractical, because we don't know whether this is an
>> OOB command.
>> 
>> >  static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
>> >  {
>> >      Monitor *old_mon;
>> > @@ -211,43 +215,87 @@ static QMPRequest *monitor_qmp_requests_pop_any_with_lock(void)
>> >      return req_obj;
>> >  }
>> >  
>> > -void monitor_qmp_bh_dispatcher(void *data)
>> > +void coroutine_fn monitor_qmp_dispatcher_co(void *data)
>> >  {
>> > -    QMPRequest *req_obj = monitor_qmp_requests_pop_any_with_lock();
>> > +    QMPRequest *req_obj = NULL;
>> >      QDict *rsp;
>> >      bool need_resume;
>> >      MonitorQMP *mon;
>> >  
>> > -    if (!req_obj) {
>> > -        return;
>> > -    }
>> > +    while (true) {
>> > +        assert(atomic_mb_read(&qmp_dispatcher_co_busy) == true);
>> 
>> Read and assert, then ...
>> 
>> > +
>> > +        /* Mark the dispatcher as not busy already here so that we don't miss
>> > +         * any new requests coming in the middle of our processing. */
>> > +        atomic_mb_set(&qmp_dispatcher_co_busy, false);
>> 
>> ... set.  Would exchange, then assert be cleaner?
>
> Then you would ask me why the exchange has to be atomic. :-)

Possibly :)

> More practically, I would need a temporary variable so that I don't get
> code with side effects in assert() (which may be compiled out with
> NDEBUG). The temporary variable would never be read outside the assert
> and would be unused with NDEBUG.
>
> So possible, yes, cleaner I'm not sure.

I asked because the patch made me wonder whether qmp_dispatcher_co_busy
could change between the read and the set.

>> The assertion checks qmp_dispatcher_co_busy is set on coroutine enter.
>> It pairs with the atomic_mb_set() in monitor_init_globals_core().
>> 
>> Wing the comment, please, and wrap around column 70.  More of the same
>> below.
>> 
>> Hmm, qmp_dispatcher_co_busy is false while the coroutine busily runs.  I
>> figure its actual purpose is something like "if false, you need to wake
>> it up to ensure it processes additional requests".  Correct?
>
> Yes. I understood "busy" in the sense of "executing a monitor command",
> but of course the purpose of the variable is to know whether you need to
> schedule the coroutine or whether you must not schedule it.

I could try to find an identifier that more clearly expresses the
purpose, but since "busy" is pleasantly short, let me try to write a
clearer variable comment instead:

  /*
   * Is coroutine @qmp_dispatcher_co busy processing requests?
   * If true, additional requests may be pushed onto mon->qmp_requests,
   * and qmp_dispatcher_co_shutdown may be set without further ado.
   * If false, you also have to set @qmp_dispatcher_co_busy to true and
   * wake up @qmp_dispatcher_co.
   * @qmp_dispatcher_co will reset it to false before it yields.
   * Access must be atomic for thread-safety.
   * The purpose of all this is to ensure the coroutine is scheduled at
   * most once at any time.
   */

We could make the "you also have to" part simpler: put it in a function,
so it has a name.
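
For example (a sketch only; the name qmp_dispatcher_co_wake() is made
up, pick whatever fits):

    /*
     * Wake up the dispatcher coroutine unless it is busy already.
     * Callers must push their request (or set qmp_dispatcher_co_shutdown)
     * before calling this.
     */
    static void qmp_dispatcher_co_wake(void)
    {
        if (!atomic_xchg(&qmp_dispatcher_co_busy, true)) {
            aio_co_wake(qmp_dispatcher_co);
        }
    }

handle_qmp_command() and monitor_cleanup() would then simply call it.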

>> > +
>> > +        while (!(req_obj = monitor_qmp_requests_pop_any_with_lock())) {
>> > +            /* Wait to be reentered from handle_qmp_command, or terminate if
>> > +             * qmp_dispatcher_co_shutdown is true*/
>> 
>> Yes, these are the two places that wake this coroutine.
>> 
>> Hmm, there's a third aio_co_wake() in do_qmp_dispatch_bh().  But that
>> one resumes the yield in do_qmp_dispatch().  Correct?
>
> Yes, do_qmp_dispatch_bh is scheduled immediately before yielding, so
> that yield is where the coroutine will be resumed.
>
>> Space before */, please.
>> 
>> Would this
>> 
>>                /*
>>                 * No more requests to process.  Wait until
>>                 * handle_qmp_command() pushes more, or monitor_cleanup()
>>                 * requests shutdown.
>>                 */
>> 
>> be clearer?
>
> I think I prefer explicitly mentioning that these callers not only push
> requests or request shutdown, but that they actively cause this
> coroutine to be reentered.

Can do:

                  /*
                   * No more requests to process.  Wait to be reentered when
                   * handle_qmp_command() pushes more, or monitor_cleanup()
                   * requests shutdown.
                   */

> Matter of taste. You're the maintainer, so your taste wins.

Apart from differences in taste, there are differences in viewpoint,
which can also lead to different comment needs.  Ignoring your advice
would be foolish.

>> > +            if (!qmp_dispatcher_co_shutdown) {
>> > +                qemu_coroutine_yield();
>> 
>> Nothing to do, go to sleep.
>> 
>> > +
>> > +                /* busy must be set to true again by whoever rescheduled us to
>> > +                 * avoid double scheduling */
>> > +                assert(atomic_xchg(&qmp_dispatcher_co_busy, false) == true);
>> 
>> The assertion checks the coroutine's resume set busy as it should.  It
>> pairs with the atomic_xchg() in handle_qmp_command() and
>> monitor_cleanup().
>
> Correct.
>
>> > +            }
>> > +
>> > +            /* qmp_dispatcher_co_shutdown may have changed if we yielded and
>> > +             * were reentered from monitor_cleanup() */
>> > +            if (qmp_dispatcher_co_shutdown) {
>> > +                return;
>> > +            }
>> > +        }
>> >  
>> 
>> We got a request in @req.
>> 
>> > -    mon = req_obj->mon;
>> > -    /*  qmp_oob_enabled() might change after "qmp_capabilities" */
>> > -    need_resume = !qmp_oob_enabled(mon) ||
>> > -        mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
>> > -    qemu_mutex_unlock(&mon->qmp_queue_lock);
>> > -    if (req_obj->req) {
>> > -        QDict *qdict = qobject_to(QDict, req_obj->req);
>> > -        QObject *id = qdict ? qdict_get(qdict, "id") : NULL;
>> > -        trace_monitor_qmp_cmd_in_band(qobject_get_try_str(id) ?: "");
>> > -        monitor_qmp_dispatch(mon, req_obj->req);
>> > -    } else {
>> > -        assert(req_obj->err);
>> > -        rsp = qmp_error_response(req_obj->err);
>> > -        req_obj->err = NULL;
>> > -        monitor_qmp_respond(mon, rsp);
>> > -        qobject_unref(rsp);
>> > -    }
>> > +        if (atomic_xchg(&qmp_dispatcher_co_busy, true) == true) {
>> > +            /* Someone rescheduled us (probably because a new requests came
>> > +             * in), but we didn't actually yield. Do that now, only to be
>> > +             * immediately reentered and removed from the list of scheduled
>> > +             * coroutines. */
>> > +            qemu_coroutine_yield();
>> > +        }
>> >  
>> > -    if (need_resume) {
>> > -        /* Pairs with the monitor_suspend() in handle_qmp_command() */
>> > -        monitor_resume(&mon->common);
>> > -    }
>> > -    qmp_request_free(req_obj);
>> > +        /*
>> > +         * Move the coroutine from iohandler_ctx to qemu_aio_context for
>> > +         * executing the command handler so that it can make progress if it
>> > +         * involves an AIO_WAIT_WHILE().
>> > +         */
>> > +        aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
>> > +        qemu_coroutine_yield();
>> > +
>> > +        mon = req_obj->mon;
>> > +        /*  qmp_oob_enabled() might change after "qmp_capabilities" */
>> > +        need_resume = !qmp_oob_enabled(mon) ||
>> > +            mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
>> > +        qemu_mutex_unlock(&mon->qmp_queue_lock);
>> > +        if (req_obj->req) {
>> > +            QDict *qdict = qobject_to(QDict, req_obj->req);
>> > +            QObject *id = qdict ? qdict_get(qdict, "id") : NULL;
>> > +            trace_monitor_qmp_cmd_in_band(qobject_get_try_str(id) ?: "");
>> > +            monitor_qmp_dispatch(mon, req_obj->req);
>> > +        } else {
>> > +            assert(req_obj->err);
>> > +            rsp = qmp_error_response(req_obj->err);
>> > +            req_obj->err = NULL;
>> > +            monitor_qmp_respond(mon, rsp);
>> > +            qobject_unref(rsp);
>> > +        }
>> > +
>> > +        if (need_resume) {
>> > +            /* Pairs with the monitor_suspend() in handle_qmp_command() */
>> > +            monitor_resume(&mon->common);
>> > +        }
>> > +        qmp_request_free(req_obj);
>> >  
>> > -    /* Reschedule instead of looping so the main loop stays responsive */
>> > -    qemu_bh_schedule(qmp_dispatcher_bh);
>> > +        /*
>> > +         * Yield and reschedule so the main loop stays responsive.
>> > +         *
>> > +         * Move back to iohandler_ctx so that nested event loops for
>> > +         * qemu_aio_context don't start new monitor commands.
>> > +         */
>> > +        aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
>> > +        qemu_coroutine_yield();
>> > +    }
>> >  }
>> >  
>> 
>> Easier to review with diff -w:
>> 
>>   +        if (atomic_xchg(&qmp_dispatcher_co_busy, true) == true) {
>>   +            /* Someone rescheduled us (probably because a new requests came
>>   +             * in), but we didn't actually yield. Do that now, only to be
>>   +             * immediately reentered and removed from the list of scheduled
>>   +             * coroutines. */
>>   +            qemu_coroutine_yield();
>>   +        }
>> 
>> This part I understand.
>> 
>>   +
>>   +        /*
>>   +         * Move the coroutine from iohandler_ctx to qemu_aio_context for
>>   +         * executing the command handler so that it can make progress if it
>>   +         * involves an AIO_WAIT_WHILE().
>>   +         */
>>   +        aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
>>   +        qemu_coroutine_yield();
>> 
>> More I/O context voodoo.  I'll get there.
>> 
>>            mon = req_obj->mon;
>>            /*  qmp_oob_enabled() might change after "qmp_capabilities" */
>>   @@ -246,8 +287,15 @@ void monitor_qmp_bh_dispatcher(void *data)
>>            }
>>            qmp_request_free(req_obj);
>> 
>>   -    /* Reschedule instead of looping so the main loop stays responsive */
>>   -    qemu_bh_schedule(qmp_dispatcher_bh);
>>   +        /*
>>   +         * Yield and reschedule so the main loop stays responsive.
>>   +         *
>>   +         * Move back to iohandler_ctx so that nested event loops for
>>   +         * qemu_aio_context don't start new monitor commands.
>> 
>> Can you explain this sentence for dummies?
>
> Nested event loops (i.e. AIO_WAIT_WHILE) poll qemu_aio_context, so if we
> are scheduled there, the next iteration of the monitor dispatcher loop
> could start from a nested event loop. If we are scheduled in
> iohandler_ctx, then only the actual main loop will reenter the coroutine
> and nested event loops ignore it.
>
> I'm not sure if or why this is actually important, but this matches
> scheduling the dispatcher BH in iohandler_ctx in the code before this
> patch.
>
> If we didn't do this, we could end up starting monitor requests in more
> places than before, and who knows what that would mean.

Let me say it in my own words, to make sure I got it.  I'm going to
ignore special cases like "not using I/O thread" and exec-oob.

QMP monitor I/O happens in mon_iothread, in iohandler_ctx (I think).
This pushes requests onto the monitor's qmp_requests queue.

Before this patch, the dispatcher runs in a bottom half in the main
thread, in qemu_aio_context.

The patch moves it to a coroutine running in the main thread.  It runs
in iohandler_ctx, but switches to qemu_aio_context for executing command
handlers.

We want to keep command handlers running in qemu_aio_context, as before
this patch.

We want to run the rest in iohandler_ctx to ensure dispatching happens
only in the main loop, as before this patch.

Correct?

>>   +         */
>>   +        aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
>>   +        qemu_coroutine_yield();
>>   +    }
>>    }
>> 
>> >  static void handle_qmp_command(void *opaque, QObject *req, Error *err)
>> @@ -308,7 +356,9 @@ static void handle_qmp_command(void *opaque, QObject *req, Error *err)
>>      qemu_mutex_unlock(&mon->qmp_queue_lock);
>>  
>>      /* Kick the dispatcher routine */
>> -    qemu_bh_schedule(qmp_dispatcher_bh);
>> +    if (!atomic_xchg(&qmp_dispatcher_co_busy, true)) {
>> +        aio_co_wake(qmp_dispatcher_co);
>> +    }
>>  }
>>  
>>  static void monitor_qmp_read(void *opaque, const uint8_t *buf, int size)

[*] https://en.wikipedia.org/wiki/Column_(typography)#Typographic_style




* Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine
  2020-02-18 14:12       ` Markus Armbruster
@ 2020-02-18 15:29         ` Kevin Wolf
  2020-02-19  9:03           ` Markus Armbruster
  0 siblings, 1 reply; 27+ messages in thread
From: Kevin Wolf @ 2020-02-18 15:29 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: qemu-block, marcandre.lureau, qemu-devel, stefanha

Am 18.02.2020 um 15:12 hat Markus Armbruster geschrieben:
> >> Regarding calling monitor_qmp_requests_pop_any_with_lock(): it needs
> >> @monitor_lock and @mon_list to be valid.  We just initialized
> >> @monitor_lock, and @mon_list is empty.
> >> monitor_qmp_requests_pop_any_with_lock() should be safe and return null.
> >> monitor_qmp_dispatcher_co() should also be safe and yield without doing
> >> work.
> >> 
> >> Can we exploit that to make things a bit simpler?  Separate patch would
> >> be fine with me.
> >
> > If this is true, we could replace this line:
> >
> >     aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
> >
> > with the following one:
> >
> >     qemu_aio_coroutine_enter(iohandler_get_aio_context(), qmp_dispatcher_co);
> >
> > I'm not sure that this is any simpler.
> 
> Naive question: what's the difference between "scheduling", "entering",
> and "waking up" a coroutine?
> 
> qemu_coroutine_enter() and qemu_aio_coroutine_enter() are in
> coroutine.h.

These are the low-level functions that just enter the coroutine (i.e. do
a stack switch and jump to coroutine code) immediately when they are
called. They must be called in the right thread with the right
AioContext locks held. (What "right" means depends on the code run in
the coroutine.)

> aio_co_schedule(), aio_co_wake() and aio_co_enter() are in aio.h.

aio_co_schedule() never enters the coroutine directly. It only adds it
to the list of scheduled coroutines and schedules a BH in the target
AioContext. This BH in turn will actually enter the coroutine.

aio_co_enter() enters the coroutine immediately if it's being called
outside of coroutine context and in the right thread for the given
AioContext. If it's in the right thread, but in coroutine context then
it will queue the given coroutine so that it runs whenever the current
coroutine yields. If it's in the wrong thread, it uses aio_co_schedule()
to get it run in the right thread.

aio_co_wake() takes just the coroutine as a parameter and calls
aio_co_enter() with the context in which the coroutine was last run.

All of these functions make sure that the AioContext lock is taken when
the coroutine is entered and that they run in the right thread.
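
Roughly, in code (paraphrasing the behaviour described above, not
quoting the actual implementation):

    void aio_co_enter(AioContext *ctx, Coroutine *co)
    {
        if (ctx != qemu_get_current_aio_context()) {
            /* Wrong thread: defer to a BH in the target context */
            aio_co_schedule(ctx, co);
        } else if (qemu_in_coroutine()) {
            /* Right thread, but coroutine context: queue @co so it
             * runs when the current coroutine yields */
            QSIMPLEQ_INSERT_TAIL(&qemu_coroutine_self()->co_queue_wakeup,
                                 co, co_queue_next);
        } else {
            /* Right thread, outside coroutine context: enter now */
            aio_context_acquire(ctx);
            qemu_aio_coroutine_enter(ctx, co);
            aio_context_release(ctx);
        }
    }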

> qemu_coroutine_enter() calls qemu_aio_coroutine_enter().
> 
> aio_co_wake() calls aio_co_enter() calls qemu_aio_coroutine_enter().
> 
> aio_co_enter() seems to be independent.

It's not. It calls either aio_co_schedule() or
qemu_aio_coroutine_enter(), or inserts the coroutine in a queue that is
processed in qemu_aio_coroutine_enter().

> aio.h seems to be layered on top of coroutine.h.  Should I prefer using
> aio.h to coroutine.h?

In the common case, using the aio.h functions is preferable because they
just do "the right thing" without requiring as much thinking as the
low-level functions.

> >> >  }
> >> >  
> >> >  QemuOptsList qemu_mon_opts = {
> >> > diff --git a/monitor/qmp.c b/monitor/qmp.c
> >> > index 54c06ba824..9444de9fcf 100644
> >> > --- a/monitor/qmp.c
> >> > +++ b/monitor/qmp.c
> >> > @@ -133,6 +133,10 @@ static void monitor_qmp_respond(MonitorQMP *mon, QDict *rsp)
> >> >      }
> >> >  }
> >> >  
> >> > +/*
> >> > + * Runs outside of coroutine context for OOB commands, but in coroutine context
> >> > + * for everything else.
> >> > + */
> >> 
> >> Nitpick: wrap around column 70, please.
> >> 
> >> Note to self: the precondition is asserted in do_qmp_dispatch() below.
> >> Asserting here is impractical, because we don't know whether this is an
> >> OOB command.
> >> 
> >> >  static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
> >> >  {
> >> >      Monitor *old_mon;
> >> > @@ -211,43 +215,87 @@ static QMPRequest *monitor_qmp_requests_pop_any_with_lock(void)
> >> >      return req_obj;
> >> >  }
> >> >  
> >> > -void monitor_qmp_bh_dispatcher(void *data)
> >> > +void coroutine_fn monitor_qmp_dispatcher_co(void *data)
> >> >  {
> >> > -    QMPRequest *req_obj = monitor_qmp_requests_pop_any_with_lock();
> >> > +    QMPRequest *req_obj = NULL;
> >> >      QDict *rsp;
> >> >      bool need_resume;
> >> >      MonitorQMP *mon;
> >> >  
> >> > -    if (!req_obj) {
> >> > -        return;
> >> > -    }
> >> > +    while (true) {
> >> > +        assert(atomic_mb_read(&qmp_dispatcher_co_busy) == true);
> >> 
> >> Read and assert, then ...
> >> 
> >> > +
> >> > +        /* Mark the dispatcher as not busy already here so that we don't miss
> >> > +         * any new requests coming in the middle of our processing. */
> >> > +        atomic_mb_set(&qmp_dispatcher_co_busy, false);
> >> 
> >> ... set.  Would exchange, then assert be cleaner?
> >
> > Then you would ask me why the exchange has to be atomic. :-)
> 
> Possibly :)
> 
> > More practically, I would need a temporary variable so that I don't get
> > code with side effects in assert() (which may be compiled out with
> > NDEBUG). The temporary variable would never be read outside the assert
> > and would be unused with NDEBUG.
> >
> > So possible, yes, cleaner I'm not sure.
> 
> I asked because the patch made me wonder whether qmp_dispatcher_co_busy
> could change between the read and the set.

Ah. No, this function is the only one that does a transition from true
to false. Everyone else only transitions from false to true (or leaves
the value unchanged as true).
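
In code, the two sides of the protocol look like this (both fragments
taken from the patch):

    /* Dispatcher coroutine: the only true -> false transition */
    atomic_mb_set(&qmp_dispatcher_co_busy, false);

    /* Everyone else (handle_qmp_command(), monitor_cleanup()):
     * false -> true, or leave it unchanged as true */
    if (!atomic_xchg(&qmp_dispatcher_co_busy, true)) {
        aio_co_wake(qmp_dispatcher_co);
    }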

> >>   @@ -246,8 +287,15 @@ void monitor_qmp_bh_dispatcher(void *data)
> >>            }
> >>            qmp_request_free(req_obj);
> >> 
> >>   -    /* Reschedule instead of looping so the main loop stays responsive */
> >>   -    qemu_bh_schedule(qmp_dispatcher_bh);
> >>   +        /*
> >>   +         * Yield and reschedule so the main loop stays responsive.
> >>   +         *
> >>   +         * Move back to iohandler_ctx so that nested event loops for
> >>   +         * qemu_aio_context don't start new monitor commands.
> >> 
> >> Can you explain this sentence for dummies?
> >
> > Nested event loops (i.e. AIO_WAIT_WHILE) poll qemu_aio_context, so if we
> > are scheduled there, the next iteration of the monitor dispatcher loop
> > could start from a nested event loop. If we are scheduled in
> > iohandler_ctx, then only the actual main loop will reenter the coroutine
> > and nested event loops ignore it.
> >
> > I'm not sure if or why this is actually important, but this matches
> > scheduling the dispatcher BH in iohandler_ctx in the code before this
> > patch.
> >
> > If we didn't do this, we could end up starting monitor requests in more
> > places than before, and who knows what that would mean.
> 
> Let me say it in my own words, to make sure I got it.  I'm going to
> ignore special cases like "not using I/O thread" and exec-oob.
> 
> QMP monitor I/O happens in mon_iothread, in iohandler_ctx (I think).
> This pushes requests onto the monitor's qmp_requests queue.

mon_iothread (like all iothreads) has a separate AioContext, which
doesn't have a name, but can be accessed with
iothread_get_aio_context(mon_iothread).

> Before this patch, the dispatcher runs in a bottom half in the main
> thread, in qemu_aio_context.

The dispatcher BH is what is scheduled in iohandler_ctx. This means that
the BH is run from the main loop, but not from nested event loops.

> The patch moves it to a coroutine running in the main thread.  It runs
> in iohandler_ctx, but switches to qemu_aio_context for executing command
> handlers.
> 
> We want to keep command handlers running in qemu_aio_context, as before
> this patch.

"Running in AioContext X" is kind of a sloppy term, and I'm afraid it
might be more confusing than helpful here. What this means is that the
code is run while holding the lock of the AioContext, and it registers
its BHs, callbacks etc. in that AioContext so it will be called from the
event loop of the respective thread.

Before this patch, command handlers can't really use callbacks without a
nested event loop because they are synchronous. The BQL is held, which
is equivalent to holding the qemu_aio_context lock. But other than that,
all of the definition of "running in an AioContext" doesn't really apply.

Now with coroutines, you assign them an AioContext, which is the context
in which BHs reentering the coroutine will be scheduled, e.g. from
explicit aio_co_schedule(), after qemu_co_sleep_ns() or after waiting
for a lock like a CoMutex.

qemu_aio_context is what most (if not all) of the existing QMP handlers
use when they run a nested event loop, so running the coroutine in this
context while calling the handlers will make the transition the easiest.
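
That is why the dispatcher hops between the two contexts around the
handler call, roughly (simplified excerpt from the patch):

    /* Hand over to qemu_aio_context before running the handler ... */
    aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
    qemu_coroutine_yield();

    monitor_qmp_dispatch(mon, req_obj->req);

    /* ... and move back to iohandler_ctx afterwards */
    aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
    qemu_coroutine_yield();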

> We want to run the rest in iohandler_ctx to ensure dispatching happens
> only in the main loop, as before this patch.
> 
> Correct?

This part is correct.

Kevin




* Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine
  2020-02-18 15:29         ` Kevin Wolf
@ 2020-02-19  9:03           ` Markus Armbruster
  2020-02-19 10:22             ` Kevin Wolf
  0 siblings, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2020-02-19  9:03 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: stefanha, marcandre.lureau, qemu-devel, qemu-block

Kevin Wolf <kwolf@redhat.com> writes:

> Am 18.02.2020 um 15:12 hat Markus Armbruster geschrieben:
>> >> Regarding calling monitor_qmp_requests_pop_any_with_lock(): it needs
>> >> @monitor_lock and @mon_list to be valid.  We just initialized
>> >> @monitor_lock, and @mon_list is empty.
>> >> monitor_qmp_requests_pop_any_with_lock() should be safe and return null.
>> >> monitor_qmp_dispatcher_co() should also be safe and yield without doing
>> >> work.
>> >> 
>> >> Can we exploit that to make things a bit simpler?  Separate patch would
>> >> be fine with me.
>> >
>> > If this is true, we could replace this line:
>> >
>> >     aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
>> >
>> > with the following one:
>> >
>> >     qemu_aio_coroutine_enter(iohandler_get_aio_context(), qmp_dispatcher_co);
>> >
>> > I'm not sure that this is any simpler.
>> 
>> Naive question: what's the difference between "scheduling", "entering",
>> and "waking up" a coroutine?
>> 
>> qemu_coroutine_enter() and qemu_aio_coroutine_enter() are in
>> coroutine.h.
>
> These are the low-level functions that just enter the coroutine (i.e. do
> a stack switch and jump to coroutine code) immediately when they are
> called. They must be called in the right thread with the right
> AioContext locks held. (What "right" means depends on the code run in
> the coroutine.)

I see; low level building blocks.

>> aio_co_schedule(), aio_co_wake() and aio_co_enter() are in aio.h.
>
> aio_co_schedule() never enters the coroutine directly. It only adds it
> to the list of scheduled coroutines and schedules a BH in the target
> AioContext. This BH in turn will actually enter the coroutine.

Makes sense.

> aio_co_enter() enters the coroutine immediately if it's being called
> outside of coroutine context and in the right thread for the given
> AioContext. If it's in the right thread, but in coroutine context then
> it will queue the given coroutine so that it runs whenever the current
> coroutine yields. If it's in the wrong thread, it uses aio_co_schedule()
> to get it run in the right thread.

Uff, sounds complicated.  Lots of magic.

> aio_co_wake() takes just the coroutine as a parameter and calls
> aio_co_enter() with the context in which the coroutine was last run.

Complicated by extension.

> All of these functions make sure that the AioContext lock is taken when
> the coroutine is entered and that they run in the right thread.
>
>> qemu_coroutine_enter() calls qemu_aio_coroutine_enter().
>> 
>> aio_co_wake() calls aio_co_enter() calls qemu_aio_coroutine_enter().
>> 
>> aio_co_enter() seems to be independent.
>
> It's not. It calls either aio_co_schedule() or
> qemu_aio_coroutine_enter(), or inserts the coroutine in a queue that is
> processed in qemu_aio_coroutine_enter().

I was blind.

>> aio.h seems to be layered on top of coroutine.h.  Should I prefer using
>> aio.h to coroutine.h?
>
> In the common case, using the aio.h functions is preferable because they
> just do "the right thing" without requiring as much thinking as the
> low-level functions.

Got it.

>> >> >  }
>> >> >  
>> >> >  QemuOptsList qemu_mon_opts = {
>> >> > diff --git a/monitor/qmp.c b/monitor/qmp.c
>> >> > index 54c06ba824..9444de9fcf 100644
>> >> > --- a/monitor/qmp.c
>> >> > +++ b/monitor/qmp.c
>> >> > @@ -133,6 +133,10 @@ static void monitor_qmp_respond(MonitorQMP *mon, QDict *rsp)
>> >> >      }
>> >> >  }
>> >> >  
>> >> > +/*
>> >> > + * Runs outside of coroutine context for OOB commands, but in coroutine context
>> >> > + * for everything else.
>> >> > + */
>> >> 
>> >> Nitpick: wrap around column 70, please.
>> >> 
>> >> Note to self: the precondition is asserted in do_qmp_dispatch() below.
>> >> Asserting here is impractical, because we don't know whether this is an
>> >> OOB command.
>> >> 
>> >> >  static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
>> >> >  {
>> >> >      Monitor *old_mon;
>> >> > @@ -211,43 +215,87 @@ static QMPRequest *monitor_qmp_requests_pop_any_with_lock(void)
>> >> >      return req_obj;
>> >> >  }
>> >> >  
>> >> > -void monitor_qmp_bh_dispatcher(void *data)
>> >> > +void coroutine_fn monitor_qmp_dispatcher_co(void *data)
>> >> >  {
>> >> > -    QMPRequest *req_obj = monitor_qmp_requests_pop_any_with_lock();
>> >> > +    QMPRequest *req_obj = NULL;
>> >> >      QDict *rsp;
>> >> >      bool need_resume;
>> >> >      MonitorQMP *mon;
>> >> >  
>> >> > -    if (!req_obj) {
>> >> > -        return;
>> >> > -    }
>> >> > +    while (true) {
>> >> > +        assert(atomic_mb_read(&qmp_dispatcher_co_busy) == true);
>> >> 
>> >> Read and assert, then ...
>> >> 
>> >> > +
>> >> > +        /* Mark the dispatcher as not busy already here so that we don't miss
>> >> > +         * any new requests coming in the middle of our processing. */
>> >> > +        atomic_mb_set(&qmp_dispatcher_co_busy, false);
>> >> 
>> >> ... set.  Would exchange, then assert be cleaner?
>> >
>> > Then you would ask me why the exchange has to be atomic. :-)
>> 
>> Possibly :)
>> 
>> > More practically, I would need a temporary variable so that I don't get
>> > code with side effects in assert() (which may be compiled out with
>> > NDEBUG). The temporary variable would never be read outside the assert
>> > and would be unused with NDEBUG.
>> >
>> > So possible, yes, cleaner I'm not sure.
>> 
>> I asked because the patch made me wonder whether qmp_dispatcher_co_busy
>> could change between the read and the set.
>
> Ah. No, this function is the only one that does a transition from true
> to false. Everyone else only transitions from false to true (or leaves
> the value unchanged as true).

True, but it's a non-local argument.  Anyway, not a big deal.

>> >>   @@ -246,8 +287,15 @@ void monitor_qmp_bh_dispatcher(void *data)
>> >>            }
>> >>            qmp_request_free(req_obj);
>> >> 
>> >>   -    /* Reschedule instead of looping so the main loop stays responsive */
>> >>   -    qemu_bh_schedule(qmp_dispatcher_bh);
>> >>   +        /*
>> >>   +         * Yield and reschedule so the main loop stays responsive.
>> >>   +         *
>> >>   +         * Move back to iohandler_ctx so that nested event loops for
>> >>   +         * qemu_aio_context don't start new monitor commands.
>> >> 
>> >> Can you explain this sentence for dummies?
>> >
>> > Nested event loops (i.e. AIO_WAIT_WHILE) poll qemu_aio_context, so if we
>> > are scheduled there, the next iteration of the monitor dispatcher loop
>> > could start from a nested event loop. If we are scheduled in
>> > iohandler_ctx, then only the actual main loop will reenter the coroutine
>> > and nested event loops ignore it.
>> >
>> > I'm not sure if or why this is actually important, but this matches
>> > scheduling the dispatcher BH in iohandler_ctx in the code before this
>> > patch.
>> >
>> > If we didn't do this, we could end up starting monitor requests in more
>> > places than before, and who knows what that would mean.
>> 
>> Let me say it in my own words, to make sure I got it.  I'm going to
>> ignore special cases like "not using I/O thread" and exec-oob.
>> 
>> QMP monitor I/O happens in mon_iothread, in iohandler_ctx (I think).
>> This pushes requests onto the monitor's qmp_requests queue.
>
> mon_iothread (like all iothreads) has a separate AioContext, which
> doesn't have a name, but can be accessed with
> iothread_get_aio_context(mon_iothread).

Got it.

@iohandler_ctx and @qemu_aio_context both belong to the main loop.

@qemu_aio_context is for general "run in the main loop" use.  It is
polled both in the actual main loop and event loops nested in it.  I
figure "both" to keep things responsive.

@iohandler_ctx is for "I/O handlers" (whatever those are).  It is polled
only in the actual main loop.  I figure this means "I/O handlers" may
get delayed while a nested event loop runs.  But better starve a bit
than run in unexpected places.

>> Before this patch, the dispatcher runs in a bottom half in the main
>> thread, in qemu_aio_context.
>
> The dispatcher BH is what is scheduled in iohandler_ctx. This means that
> the BH is run from the main loop, but not from nested event loops.

Got it.

>> The patch moves it to a coroutine running in the main thread.  It runs
>> in iohandler_ctx, but switches to qemu_aio_context for executing command
>> handlers.
>> 
>> We want to keep command handlers running in qemu_aio_context, as before
>> this patch.
>
> "Running in AioContext X" is kind of a sloppy term, and I'm afraid it
> might be more confusing than helpful here. What this means is that the
> code is run while holding the lock of the AioContext, and it registers
> its BHs, callbacks etc. in that AioContext so it will be called from the
> event loop of the respective thread.
>
> Before this patch, command handlers can't really use callbacks without a
> nested event loop because they are synchronous.

I figure an example for such a nested event loop is BDRV_POLL_WHILE() in
bdrv_truncate(), which runs in the command handler for block_resize.
True?

>                                                 The BQL is held, which
> is equivalent to holding the qemu_aio_context lock.

Why is it equivalent?  Are they always taken together?

>                                                     But other than that,
> all of the definition of "running in an AioContext" doesn't really apply.
>
> Now with coroutines, you assign them an AioContext, which is the context
> in which BHs reentering the coroutine will be scheduled, e.g. from
> explicit aio_co_schedule(), after qemu_co_sleep_ns() or after waiting
> for a lock like a CoMutex.
>
> qemu_aio_context is what most (if not all) of the existing QMP handlers
> use when they run a nested event loop,

bdrv_truncate() appears to use bdrv_get_aio_context(), which is the
BlockDriverState's AioContext if set, else @qemu_aio_context.

>                                        so running the coroutine in this
> context while calling the handlers will make the transition the easiest.

In the case of block_resize: running it in a coroutine makes
bdrv_truncate() use that coroutine instead of creating one together with
a nested event loop.  "That coroutine" uses @qemu_aio_context.  So does
the nested event loop, *except* when the BlockDriverState has an
AioContext.

I have no idea when a BlockDriverState has an AioContext.  Can you help?

>> We want to run the rest in iohandler_ctx to ensure dispatching happens
>> only in the main loop, as before this patch.
>> 
>> Correct?
>
> This part is correct.

Alright, let me try again, so you can point out the next level of my
ignorance :)

QMP monitor I/O happens in I/O thread @mon_iothread, in the I/O thread's
AioContext.  The I/O thread's event loop polls monitor I/O.  The read
handler pushes requests onto the monitor's qmp_requests queue.

The dispatcher pops requests off these queues, and runs command
handlers for them.

Before this patch, the dispatcher runs in a bottom half in the main
thread with the BQL[*] held, in @iohandler_ctx context.  The latter
ensures dispatch happens only from the main loop, not nested event
loops.

The command handlers don't care for the AioContext; they're synchronous.
If they need AIO internally, they have to set up a nested event loop.
When they do, they pick the AioContext explicitly, so they still don't
rely on the current context.

The patch moves the dispatcher to a coroutine running in the main
thread.  It runs in iohandler_ctx, but switches to qemu_aio_context for
executing command handlers.

The dispatcher keeps running in @iohandler_ctx.  Good.

The command handlers now run in @qemu_aio_context.  Doesn't matter
because "command handlers don't care for the AioContext".  Except when a
command handler has a fast path like bdrv_truncate() has.  For those,
@iohandler_ctx is generally not what we want.  @qemu_aio_context is.
"Generally", since every one needs to be inspected for AioContext
shenanigans.


[*] @qemu_global_mutex, a.k.a. "iothread mutex"; pity anyone who needs
to figure this out from the source code by himself




* Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine
  2020-02-19  9:03           ` Markus Armbruster
@ 2020-02-19 10:22             ` Kevin Wolf
  2020-02-19 14:21               ` Markus Armbruster
  0 siblings, 1 reply; 27+ messages in thread
From: Kevin Wolf @ 2020-02-19 10:22 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: stefanha, marcandre.lureau, qemu-devel, qemu-block

Am 19.02.2020 um 10:03 hat Markus Armbruster geschrieben:
> >> >>   @@ -246,8 +287,15 @@ void monitor_qmp_bh_dispatcher(void *data)
> >> >>            }
> >> >>            qmp_request_free(req_obj);
> >> >> 
> >> >>   -    /* Reschedule instead of looping so the main loop stays responsive */
> >> >>   -    qemu_bh_schedule(qmp_dispatcher_bh);
> >> >>   +        /*
> >> >>   +         * Yield and reschedule so the main loop stays responsive.
> >> >>   +         *
> >> >>   +         * Move back to iohandler_ctx so that nested event loops for
> >> >>   +         * qemu_aio_context don't start new monitor commands.
> >> >> 
> >> >> Can you explain this sentence for dummies?
> >> >
> >> > Nested event loops (i.e. AIO_WAIT_WHILE) poll qemu_aio_context, so if we
> >> > are scheduled there, the next iteration of the monitor dispatcher loop
> >> > could start from a nested event loop. If we are scheduled in
> >> > iohandler_ctx, then only the actual main loop will reenter the coroutine
> >> > and nested event loops ignore it.
> >> >
> >> > I'm not sure if or why this is actually important, but this matches
> >> > scheduling the dispatcher BH in iohandler_ctx in the code before this
> >> > patch.
> >> >
> >> > If we didn't do this, we could end up starting monitor requests in more
> >> > places than before, and who knows what that would mean.
> >> 
> >> Let me say it in my own words, to make sure I got it.  I'm going to
> >> ignore special cases like "not using I/O thread" and exec-oob.
> >> 
> >> QMP monitor I/O happens in mon_iothread, in iohandler_ctx (I think).
> >> This pushes requests onto the monitor's qmp_requests queue.
> >
> > mon_iothread (like all iothreads) has a separate AioContext, which
> > doesn't have a name, but can be accessed with
> > iothread_get_aio_context(mon_iothread).
> 
> Got it.
> 
> @iohandler_ctx and @qemu_aio_context both belong to the main loop.
> 
> @qemu_aio_context is for general "run in the main loop" use.  It is
> polled both in the actual main loop and event loops nested in it.  I
> figure "both" to keep things responsive.
> 
> @iohandler_ctx is for "I/O handlers" (whatever those are).  It is polled
> only in the actual main loop.  I figure this means "I/O handlers" may
> get delayed while a nested event loop runs.  But better starve a bit
> than run in unexpected places.
> 
> >> Before this patch, the dispatcher runs in a bottom half in the main
> >> thread, in qemu_aio_context.
> >
> > The dispatcher BH is what is scheduled in iohandler_ctx. This means that
> > the BH is run from the main loop, but not from nested event loops.
> 
> Got it.
> 
> >> The patch moves it to a coroutine running in the main thread.  It runs
> >> in iohandler_ctx, but switches to qemu_aio_context for executing command
> >> handlers.
> >> 
> >> We want to keep command handlers running in qemu_aio_context, as before
> >> this patch.
> >
> > "Running in AioContext X" is kind of a sloppy term, and I'm afraid it
> > might be more confusing than helpful here. What this means is that the
> > code is run while holding the lock of the AioContext, and it registers
> > its BHs, callbacks etc. in that AioContext so it will be called from the
> > event loop of the respective thread.
> >
> > Before this patch, command handlers can't really use callbacks without a
> > nested event loop because they are synchronous.
> 
> I figure an example for such a nested event loop is BDRV_POLL_WHILE() in
> bdrv_truncate(), which runs in the command handler for block_resize.
> True?

Yes. I think most nested event loops are in the block layer, and they
use the BDRV_POLL_WHILE() or AIO_WAIT_WHILE() macros today.

> >                                                 The BQL is held, which
> > is equivalent to holding the qemu_aio_context lock.
> 
> Why is it equivalent?  Are they always taken together?

I guess ideally they would be. In practice, we neglect the
qemu_aio_context lock and rely on the BQL for everything in the main
thread. So maybe I should say the BQL replaces the AioContext lock for
the main context rather than being equivalent.

> >                                                     But other than that,
> > all of the definition of "running in an AioContext" doesn't really apply.
> >
> > Now with coroutines, you assign them an AioContext, which is the context
> > in which BHs reentering the coroutine will be scheduled, e.g. from
> > explicit aio_co_schedule(), after qemu_co_sleep_ns() or after waiting
> > for a lock like a CoMutex.
> >
> > qemu_aio_context is what most (if not all) of the existing QMP handlers
> > use when they run a nested event loop,
> 
> bdrv_truncate() appears to use bdrv_get_aio_context(), which is the
> BlockDriverState's AioContext if set, else @qemu_aio_context.

Right, the BDRV_POLL_WHILE() macro allows waiting for something in a
different AioContext from the main context. This works because of the
aio_wait_kick() in bdrv_truncate_co_entry(), which schedules a BH in the
main context, and this is what BDRV_POLL_WHILE() is waiting for.

So the command handler runs in the main context (and thread), whereas
the bdrv_co_truncate() runs in the AioContext (and thread) of the BDS.
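
Sketching the relevant part of bdrv_truncate() from memory (simplified,
not the exact code; tco stands for the struct passed to the coroutine
entry):

    if (qemu_in_coroutine()) {
        /* Fast path: already in coroutine context, run the entry directly */
        bdrv_truncate_co_entry(&tco);
    } else {
        Coroutine *co = qemu_coroutine_create(bdrv_truncate_co_entry, &tco);
        bdrv_coroutine_enter(bs, co);    /* enters in the BDS AioContext */
        BDRV_POLL_WHILE(bs, tco.ret == NOT_DONE);
    }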

> >                                        so running the coroutine in this
> > context while calling the handlers will make the transition the easiest.
> 
> In the case of block_resize: running it in a coroutine makes
> bdrv_truncate() use that coroutine instead of creating one together with
> a nested event loop.  "That coroutine" uses @qemu_aio_context.  So does
> the nested event loop, *except* when the BlockDriverState has an
> AioContext.
> 
> I have no idea when a BlockDriverState has an AioContext.  Can you help?

When it's used with a dataplane device, i.e. is tied to a separate
iothread.

The fact that the coroutine fastpath (not only in bdrv_truncate(), but
also everywhere else) assumes that we're already in the right AioContext
might be a bug.

If this is the only problem in v5, please consider applying only patches
1-3 (with the infrastructure) so that Marc-André's screendump patch gets
its dependency resolved, and I'll send a separate series for converting
block_resize that fixes this problem (probably by explicitly
rescheduling the coroutine into the BDS context and back in the
coroutine path).
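
Roughly like this in the coroutine path of the handler (an untested
sketch, ignoring error handling; bs stands for the node to resize):

    AioContext *old_ctx = qemu_get_current_aio_context();

    /* Move into the AioContext of the block node for the actual work */
    aio_co_schedule(bdrv_get_aio_context(bs), qemu_coroutine_self());
    qemu_coroutine_yield();

    /* ... do the actual blk_truncate() work here ... */

    /* Move back so we return in the context the dispatcher expects */
    aio_co_schedule(old_ctx, qemu_coroutine_self());
    qemu_coroutine_yield();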

> >> We want to run the rest in iohandler_ctx to ensure dispatching happens
> >> only in the main loop, as before this patch.
> >> 
> >> Correct?
> >
> > This part is correct.
> 
> Alright, let me try again, so you can point out the next level of my
> ignorance :)
> 
> QMP monitor I/O happens in I/O thread @mon_iothread, in the I/O thread's
> AioContext.  The I/O thread's event loop polls monitor I/O.  The read
> handler pushes requests onto the monitor's qmp_requests queue.
> 
> The dispatcher pops requests off these queues, and runs command
> handlers for them.
> 
> Before this patch, the dispatcher runs in a bottom half in the main
> thread with the BQL[*] held, in @iohandler_ctx context.  The latter
> ensures dispatch happens only from the main loop, not nested event
> loops.
> 
> The command handlers don't care for the AioContext; they're synchronous.
> If they need AIO internally, they have to set up a nested event loop.
> When they do, they pick the AioContext explicitly, so they still don't
> rely on the current context.

They do rely on being run in the main thread, but the exact context
doesn't matter. (In theory, running them in the thread of the block
node's AioContext would be fine, too, but the monitor can't know what
that is.)

See the AIO_WAIT_WHILE() documentation for details.

> The patch moves the dispatcher to a coroutine running in the main
> thread.  It runs in iohandler_ctx, but switches to qemu_aio_context for
> executing command handlers.
> 
> The dispatcher keeps running in @iohandler_ctx.  Good.
> 
> The command handlers now run in @qemu_aio_context.  Doesn't matter
> because "command handlers don't care for the AioContext".  Except when a
> command handler has a fast path like bdrv_truncate() has.  For those,
> @iohandler_ctx is generally not what we want.  @qemu_aio_context is.
> "Generally", since every one needs to be inspected for AioContext
> shenanigans.

Except for the bug that the specific code in bdrv_truncate() wants
to run the coroutine in the BDS AioContext, which may be different from
qemu_aio_context...

But I think the theory is right, for those commands that don't
potentially schedule a coroutine in a different AioContext.

Kevin




* Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine
  2020-02-19 10:22             ` Kevin Wolf
@ 2020-02-19 14:21               ` Markus Armbruster
  2020-02-19 14:26                 ` Markus Armbruster
  2020-02-19 15:27                 ` Kevin Wolf
  0 siblings, 2 replies; 27+ messages in thread
From: Markus Armbruster @ 2020-02-19 14:21 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, marcandre.lureau, qemu-devel, stefanha

Kevin Wolf <kwolf@redhat.com> writes:

> Am 19.02.2020 um 10:03 hat Markus Armbruster geschrieben:
>> >> >>   @@ -246,8 +287,15 @@ void monitor_qmp_bh_dispatcher(void *data)
>> >> >>            }
>> >> >>            qmp_request_free(req_obj);
>> >> >> 
>> >> >>   -    /* Reschedule instead of looping so the main loop stays responsive */
>> >> >>   -    qemu_bh_schedule(qmp_dispatcher_bh);
>> >> >>   +        /*
>> >> >>   +         * Yield and reschedule so the main loop stays responsive.
>> >> >>   +         *
>> >> >>   +         * Move back to iohandler_ctx so that nested event loops for
>> >> >>   +         * qemu_aio_context don't start new monitor commands.
>> >> >> 
>> >> >> Can you explain this sentence for dummies?
>> >> >
>> >> > Nested event loops (i.e. AIO_WAIT_WHILE) poll qemu_aio_context, so if we
>> >> > are scheduled there, the next iteration of the monitor dispatcher loop
>> >> > could start from a nested event loop. If we are scheduled in
>> >> > iohandler_ctx, then only the actual main loop will reenter the coroutine
>> >> > and nested event loops ignore it.
>> >> >
>> >> > I'm not sure if or why this is actually important, but this matches
>> >> > scheduling the dispatcher BH in iohandler_ctx in the code before this
>> >> > patch.
>> >> >
>> >> > If we didn't do this, we could end up starting monitor requests in more
>> >> > places than before, and who knows what that would mean.
>> >> 
>> >> Let me say it in my own words, to make sure I got it.  I'm going to
>> >> ignore special cases like "not using I/O thread" and exec-oob.
>> >> 
>> >> QMP monitor I/O happens in mon_iothread, in iohandler_ctx (I think).
>> >> This pushes requests onto the monitor's qmp_requests queue.
>> >
>> > mon_iothread (like all iothreads) has a separate AioContext, which
>> > doesn't have a name, but can be accessed with
>> > iothread_get_aio_context(mon_iothread).
>> 
>> Got it.
>> 
>> @iohandler_ctx and @qemu_aio_context both belong to the main loop.
>> 
>> @qemu_aio_context is for general "run in the main loop" use.  It is
>> polled both in the actual main loop and event loops nested in it.  I
>> figure "both" to keep things responsive.
>> 
>> @iohandler_ctx is for "I/O handlers" (whatever those are).  It is polled
>> only in the actual main loop.  I figure this means "I/O handlers" may
>> get delayed while a nested event loop runs.  But better starve a bit
>> than run in unexpected places.
>> 
>> >> Before this patch, the dispatcher runs in a bottom half in the main
>> >> thread, in qemu_aio_context.
>> >
>> > The dispatcher BH is what is scheduled in iohandler_ctx. This means that
>> > the BH is run from the main loop, but not from nested event loops.
>> 
>> Got it.
>> 
>> >> The patch moves it to a coroutine running in the main thread.  It runs
>> >> in iohandler_ctx, but switches to qemu_aio_context for executing command
>> >> handlers.
>> >> 
>> >> We want to keep command handlers running in qemu_aio_context, as before
>> >> this patch.
>> >
>> > "Running in AioContext X" is kind of a sloppy term, and I'm afraid it
>> > might be more confusing than helpful here. What this means is that the
>> > code is run while holding the lock of the AioContext, and it registers
>> > its BHs, callbacks etc. in that AioContext so it will be called from the
>> > event loop of the respective thread.
>> >
>> > Before this patch, command handlers can't really use callbacks without a
>> > nested event loop because they are synchronous.
>> 
>> I figure an example for such a nested event loop is BDRV_POLL_WHILE() in
>> bdrv_truncate(), which runs in the command handler for block_resize.
>> True?
>
> Yes. I think most nested event loops are in the block layer, and they
> use the BDRV_POLL_WHILE() or AIO_WAIT_WHILE() macros today.
>
>> >                                                 The BQL is held, which
>> > is equivalent to holding the qemu_aio_context lock.
>> 
>> Why is it equivalent?  Are they always taken together?
>
> I guess ideally they would be. In practice, we neglect the
> qemu_aio_context lock and rely on the BQL for everything in the main
> thread. So maybe I should say the BQL replaces the AioContext lock for
> the main context rather than being equivalent.

Sounds a bit sloppy.  It works...

>> >                                                     But other than that,
>> > all of the definition of "running in an AioContext" doesn't really apply.
>> >
>> > Now with coroutines, you assign them an AioContext, which is the context
>> > in which BHs reentering the coroutine will be scheduled, e.g. from
>> > explicit aio_co_schedule(), after qemu_co_sleep_ns() or after waiting
>> > for a lock like a CoMutex.
>> >
>> > qemu_aio_context is what most (if not all) of the existing QMP handlers
>> > use when they run a nested event loop,
>> 
>> bdrv_truncate() appears to use bdrv_get_aio_context(), which is the
>> BlockDriverState's AioContext if set, else @qemu_aio_context.
>
> Right, the BDRV_POLL_WHILE() macro allows waiting for something in a
> different AioContext from the main context. This works because of the
> aio_wait_kick() in bdrv_truncate_co_entry(), which schedules a BH in the
> main context, and this is what BDRV_POLL_WHILE() is waiting for.

Hairy...
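
For my own notes, the shape of that path is roughly this (paraphrased
from memory, not the exact code; the argument lists and the little
on-stack struct carrying .ret differ in detail):

    if (qemu_in_coroutine()) {
        /* Fast path: call bdrv_co_truncate() directly, tacitly assuming
         * we are already in the right AioContext. */
    } else {
        /* tco: on-stack struct carrying the arguments and .ret */
        Coroutine *co = qemu_coroutine_create(bdrv_truncate_co_entry, &tco);
        bdrv_coroutine_enter(bs, co);              /* enter in the BDS AioContext */
        BDRV_POLL_WHILE(bs, tco.ret == NOT_DONE);  /* nested loop in the main
                                                    * context, woken up by that
                                                    * aio_wait_kick() */
    }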

> So the command handler runs in the main context (and thread), whereas
> the bdrv_co_truncate() runs in the AioContext (and thread) of the BDS.
>
>> >                                        so running the coroutine in this
>> > context while calling the handlers will make the transition the easiest.
>> 
>> In the case of block_resize: running it in a coroutine makes
>> bdrv_truncate() use that coroutine instead of creating one together with
>> a nested event loop.  "That coroutine" uses @qemu_aio_context.  So does
>> the nested event loop, *except* when the BlockDriverState has an
>> AioContext.
>> 
>> I have no idea when a BlockDriverState has an AioContext.  Can you help?
>
> When it's used with a dataplane device, i.e. it is tied to a separate
> iothread.

I see.

> The fact that the coroutine fastpath (not only in bdrv_truncate(), but
> also everywhere else) assumes that we're already in the right AioContext
> might be a bug.
>
> If this is the only problem in v5, please consider applying only patches
> 1-3 (with the infrastructure) so that Marc-André's screendump patch gets
> its dependency resolved, and I'll send a separate series for converting
> block_resize that fixes this problem (probably by explicitly
> rescheduling the coroutine into the BDS context and back in the
> coroutine path).

Can do.
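
If I understand the plan, "rescheduling into the BDS context and back"
would be something like this in the coroutine path (just a sketch using
the generic primitives; the actual series may use a dedicated helper):

    AioContext *ctx = bdrv_get_aio_context(bs);

    aio_co_schedule(ctx, qemu_coroutine_self());
    qemu_coroutine_yield();            /* now running in the BDS AioContext */

    /* ... do the bdrv_co_truncate() work here ... */

    aio_co_schedule(qemu_get_aio_context(), qemu_coroutine_self());
    qemu_coroutine_yield();            /* back in the main context */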

>> >> We want to run the rest in iohandler_ctx to ensure dispatching happens
>> >> only in the main loop, as before this patch.
>> >> 
>> >> Correct?
>> >
>> > This part is correct.
>> 
>> Alright, let me try again, so you can point out the next level of my
>> ignorance :)
>> 
>> QMP monitor I/O happens in I/O thread @mon_iothread, in the I/O thread's
>> AioContext.  The I/O thread's event loop polls monitor I/O.  The read
>> handler pushes requests onto the monitor's qmp_requests queue.
>> 
>> The dispatcher pops requests off these queues, and runs command
>> handlers for them.
>> 
>> Before this patch, the dispatcher runs in a bottom half in the main
>> thread with the BQL[*] held, in @iohandler_ctx context.  The latter
>> ensures dispatch happens only from the main loop, not nested event
>> loops.
>> 
>> The command handlers don't care for the AioContext; they're synchronous.
>> If they need AIO internally, they have to set up a nested event loop.
>> When they do, they pick the AioContext explicitly, so they still don't
>> rely on the current context.
>
> They do rely on being run in the main thread, but the exact context
> doesn't matter. (In theory, running them in the thread of the block
> node's AioContext would be fine, too, but the monitor can't know what
> that is.)

Yes.  Command handlers have always run in the main thread, and handler
code may rely on that.

> See the AIO_WAIT_WHILE() documentation for details.
>
>> The patch moves the dispatcher to a coroutine running in the main
>> thread.  It runs in iohandler_ctx, but switches to qemu_aio_context for
>> executing command handlers.
>> 
>> The dispatcher keeps running in @iohandler_ctx.  Good.
>> 
>> The command handlers now run in @qemu_aio_context.  Doesn't matter
>> because "command handlers don't care for the AioContext".  Except when a
>> command handler has a fast path like bdrv_truncate() has.  For those,
>> @iohandler_ctx is generally not what we want.  @qemu_aio_context is.
>> "Generally", since every one needs to be inspected for AioContext
>> shenanigans.
>
> Except for the bug that the specific code in bdrv_truncate() wants
> to run the coroutine in the BDS AioContext, which may be different from
> qemu_aio_context...
>
> But I think the theory is right, for those commands that don't
> potentially schedule a coroutine in a different AioContext.

Let's review how switching to @qemu_aio_context for executing command
handlers affects command handlers:

* Handler is completely synchronous: the switch has no effect.

* Handler has a nested event loop

  - with a fast path for "running in coroutine context", and

    + the fast path tacitly assumes @qemu_aio_context: the switch makes
      it work without change

    + the fast path tacitly assumes anything else: the switch is
      useless, the fast path doesn't work without further changes

  - without such a fast path (not sure this case exists): the switch is
    useless, the nested event loop works anyway

* Handler has only the fast path, i.e. it assumes it runs in coroutine
  context (this case cannot exist now, but this series enables it), and

  - the handler wants @qemu_aio_context: the switch saves it the trouble
    to switch itself

  - the handler wants anything else: the switch is useless

Whether the switch is actually useful often enough to be worthwhile I
can't say.  But because @qemu_aio_context is the "normal" context for
"normal" code running in the main loop, switching feels kind of right to
me.
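
Just so we mean the same thing by "the switch", my mental model of one
iteration of the dispatcher is roughly this (a sketch, not the patch's
exact code; the pop helper's name is made up):

    /* running in iohandler_ctx */
    QMPRequest *req = monitor_qmp_requests_pop_any();   /* made-up name */

    /* switch to qemu_aio_context for the handler */
    aio_co_schedule(qemu_get_aio_context(), qmp_dispatcher_co);
    qemu_coroutine_yield();

    monitor_qmp_dispatch(req->mon, req->req);  /* may yield if 'coroutine': true */

    /* switch back to iohandler_ctx before looking at the next request */
    aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
    qemu_coroutine_yield();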

I think we need to talk about AioContext in qapi-code-gen.txt.  PATCH 1
now adds

  Member 'coroutine' tells the QMP dispatcher whether the command handler
  is safe to be run in a coroutine.  It defaults to false.  If it is true,
  the command handler is called from coroutine context and may yield while
  waiting for an external event (such as I/O completion) in order to avoid
  blocking the guest and other background operations.
  
What about "is called from a coroutine running in the main loop with
@qemu_aio_context, and may yield"?

Likewise for the commit message of PATCH 3:

  This moves the QMP dispatcher to a coroutine and runs all QMP command
  handlers that declare 'coroutine': true in coroutine context so they
  can avoid blocking the main loop while doing I/O or waiting for other
  events.

"calls all ... in a coroutine running in the main loop with
@qemu_aio_context, so they can".

Speaking of PATCH 1:

  It is an error to specify both 'coroutine': true and 'allow-oob': true
  for a command.  (We don't currently have a use case for both together and
  without a use case, it's not entirely clear what the semantics should be.)

Let's lose the parenthesis around the last sentence.

If you agree with my proposed tweaks, and nothing else comes up, I can
try to do them in my tree.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine
  2020-02-19 14:21               ` Markus Armbruster
@ 2020-02-19 14:26                 ` Markus Armbruster
  2020-02-19 15:27                 ` Kevin Wolf
  1 sibling, 0 replies; 27+ messages in thread
From: Markus Armbruster @ 2020-02-19 14:26 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, marcandre.lureau, qemu-devel, stefanha

Markus Armbruster <armbru@redhat.com> writes:

[...]
> If you agree with my proposed tweaks, and nothing else comes up, I can
> try to do them in my tree.

I'll tweak your v5, of course.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine
  2020-02-19 14:21               ` Markus Armbruster
  2020-02-19 14:26                 ` Markus Armbruster
@ 2020-02-19 15:27                 ` Kevin Wolf
  1 sibling, 0 replies; 27+ messages in thread
From: Kevin Wolf @ 2020-02-19 15:27 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: qemu-block, marcandre.lureau, qemu-devel, stefanha

On 19.02.2020 at 15:21, Markus Armbruster wrote:
> I think we need to talk about AioContext in qapi-code-gen.txt.  PATCH 1
> now adds
> 
>   Member 'coroutine' tells the QMP dispatcher whether the command handler
>   is safe to be run in a coroutine.  It defaults to false.  If it is true,
>   the command handler is called from coroutine context and may yield while
>   waiting for an external event (such as I/O completion) in order to avoid
>   blocking the guest and other background operations.
>   
> What about "is called from a coroutine running in the main loop with
> @qemu_aio_context, and may yield"?
> 
> Likewise for the commit message of PATCH 3:
> 
>   This moves the QMP dispatcher to a coroutine and runs all QMP command
>   handlers that declare 'coroutine': true in coroutine context so they
>   can avoid blocking the main loop while doing I/O or waiting for other
>   events.
> 
> "calls all ... in a coroutine running in the main loop with
> @qemu_aio_context, so they can".
> 
> Speaking of PATCH 1:
> 
>   It is an error to specify both 'coroutine': true and 'allow-oob': true
>   for a command.  (We don't currently have a use case for both together and
>   without a use case, it's not entirely clear what the semantics should be.)
> 
> Let's lose the parenthesis around the last sentence.
> 
> If you agree with my proposed tweaks, and nothing else comes up, I can
> try to do them in my tree.

Works for me.

In the v5 thread, dropping patch 2 came up. I think it may not be needed
any more in the current version and 'make check' passes without it.

Kevin



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands
  2020-01-22 10:10     ` Kevin Wolf
  2020-01-22 12:15       ` Markus Armbruster
@ 2020-03-05 15:30       ` Markus Armbruster
  2020-03-05 16:06         ` Kevin Wolf
  1 sibling, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2020-03-05 15:30 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, marcandre.lureau, qemu-devel, stefanha

Kevin Wolf <kwolf@redhat.com> writes:

> On 22.01.2020 at 07:32, Markus Armbruster wrote:
>> Kevin Wolf <kwolf@redhat.com> writes:
>> 
>> > This patch adds a new 'coroutine' flag to QMP command definitions that
>> > tells the QMP dispatcher that the command handler is safe to be run in a
>> > coroutine.
>> 
>> I'm afraid I missed this question in my review of v3: when is a handler
>> *not* safe to be run in a coroutine?
>
> That's a hard one to answer fully.
>
> Basically, I think the biggest problem is with calling functions that
> change their behaviour if run in a coroutine compared to running them
> outside of coroutine context. In most cases the differences like having
> a nested event loop instead of yielding are just fine, but they are
> still subtly different.
>
> I know this is vague, but I can assure you that problematic cases exist.
> I hit one of them with my initial hack that just moved everything into a
> coroutine. It was related to graph modifications and bdrv_drain and
> resulted in a hang. For the specifics, I would have to try and reproduce
> the problem again.

I'm afraid it's even more complicated.

Monitors (HMP and QMP) run in the main loop.  Before this patch, HMP and
QMP commands run start to finish, one after the other.

After this patch, QMP commands may elect to yield.  QMP commands still
run one after the other (the shared dispatcher ensures this even when we
have multiple QMP monitors).

However, *HMP* commands can now run interleaved with a coroutine-enabled
QMP command, i.e. between a yield and its re-enter.

Consider an HMP screendump running in the middle of a coroutine-enabled
QMP screendump, using Marc-André's patch.  As far as I can tell, it
works, because:

1. HMP does not run in a coroutine.  If it did, and both dumps dumped
the same @con, then it would overwrite con->screendump_co.  If we ever
decide to make HMP coroutine-capable so it doesn't block the main loop,
this will become unsafe in a nasty way.

2. graphic_hw_update() is safe to call even while another
graphic_hw_update() runs.  qxl_render_update() appears to ensure that
with the help of qxl->ssd.lock.

3. There is no 3[*].

My point is: this is a non-trivial argument.  Whether a QMP command
handler is safe to run in a coroutine depends on how it interacts with
all the stuff that can run interleaved with it.  Typically includes
itself via its HMP buddy.


[*] I'm less confident in 3. than in 1. and 2.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands
  2020-03-05 15:30       ` Markus Armbruster
@ 2020-03-05 16:06         ` Kevin Wolf
  2020-03-06  7:25           ` Markus Armbruster
  0 siblings, 1 reply; 27+ messages in thread
From: Kevin Wolf @ 2020-03-05 16:06 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: qemu-block, marcandre.lureau, qemu-devel, stefanha

On 05.03.2020 at 16:30, Markus Armbruster wrote:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > On 22.01.2020 at 07:32, Markus Armbruster wrote:
> >> Kevin Wolf <kwolf@redhat.com> writes:
> >> 
> >> > This patch adds a new 'coroutine' flag to QMP command definitions that
> >> > tells the QMP dispatcher that the command handler is safe to be run in a
> >> > coroutine.
> >> 
> >> I'm afraid I missed this question in my review of v3: when is a handler
> >> *not* safe to be run in a coroutine?
> >
> > That's a hard one to answer fully.
> >
> > Basically, I think the biggest problem is with calling functions that
> > change their behaviour if run in a coroutine compared to running them
> > outside of coroutine context. In most cases the differences like having
> > a nested event loop instead of yielding are just fine, but they are
> > still subtly different.
> >
> > I know this is vague, but I can assure you that problematic cases exist.
> > I hit one of them with my initial hack that just moved everything into a
> > coroutine. It was related to graph modifications and bdrv_drain and
> > resulted in a hang. For the specifics, I would have to try and reproduce
> > the problem again.
> 
> I'm afraid it's even more complicated.
> 
> Monitors (HMP and QMP) run in the main loop.  Before this patch, HMP and
> QMP commands run start to finish, one after the other.
> 
> After this patch, QMP commands may elect to yield.  QMP commands still
> run one after the other (the shared dispatcher ensures this even when we
> have multiple QMP monitors).
> 
> However, *HMP* commands can now run interleaved with a coroutine-enabled
> QMP command, i.e. between a yield and its re-enter.

I guess that's right. :-(

> Consider an HMP screendump running in the middle of a coroutine-enabled
> QMP screendump, using Marc-André's patch.  As far as I can tell, it
> works, because:
> 
> 1. HMP does not run in a coroutine.  If it did, and both dumps dumped
> the same @con, then it would overwrite con->screendump_co.  If we ever
> decide to make HMP coroutine-capable so it doesn't block the main loop,
> this will become unsafe in a nasty way.

At the same time, switching HMP to coroutines would give us an easy way
to fix the problem: Just use a CoMutex so that HMP and QMP commands
never run in parallel. Unless we actually _want_ to run both at the same
time for some commands, but I don't see why.
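
Minimal sketch of what I mean (hypothetical names; assumes both
dispatchers are coroutines in the main thread and the mutex got a
qemu_co_mutex_init() during setup):

    static CoMutex monitor_dispatch_lock;   /* hypothetical */

    void coroutine_fn monitor_dispatch_one(Monitor *mon)
    {
        qemu_co_mutex_lock(&monitor_dispatch_lock);
        /* run one HMP or QMP command; it may yield, but no other monitor
         * command can start before we unlock */
        qemu_co_mutex_unlock(&monitor_dispatch_lock);
    }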

While we don't have this, maybe it's worth considering if there is
another simple way to achieve the same thing. Could QMP just suspend all
HMP monitors while it's executing a command? At first sight it seems
that this would work only for "interactive" monitors.

> 2. graphic_hw_update() is safe to call even while another
> graphic_hw_update() runs.  qxl_render_update() appears to ensure that
> with the help of qxl->ssd.lock.
> 
> 3. There is no 3[*].
> 
> My point is: this is a non-trivial argument.  Whether a QMP command
> handler is safe to run in a coroutine depends on how it interacts with
> all the stuff that can run interleaved with it.  Typically includes
> itself via its HMP buddy.

If the handler doesn't yield, it's still trivial. So I think my original
statement that with the existing QMP handlers, the problem is with code
> that behaves differently between coroutine and non-coroutine calls, is
still true because that is the only code that could possibly yield with
existing QMP command handlers.

Of course, you're right that handlers that actually can yield need to be
careful that they can deal with whatever happens until they are
reentered. And that seems to include HMP handlers.

Kevin



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands
  2020-03-05 16:06         ` Kevin Wolf
@ 2020-03-06  7:25           ` Markus Armbruster
  2020-03-06  9:52             ` Kevin Wolf
  0 siblings, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2020-03-06  7:25 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-block, Philippe Mathieu-Daudé,
	qemu-devel, David Alan Gilbert, marcandre.lureau, stefanha,
	Alex Bennée

Kevin Wolf <kwolf@redhat.com> writes:

> On 05.03.2020 at 16:30, Markus Armbruster wrote:
>> Kevin Wolf <kwolf@redhat.com> writes:
>> 
>> > On 22.01.2020 at 07:32, Markus Armbruster wrote:
>> >> Kevin Wolf <kwolf@redhat.com> writes:
>> >> 
>> >> > This patch adds a new 'coroutine' flag to QMP command definitions that
>> >> > tells the QMP dispatcher that the command handler is safe to be run in a
>> >> > coroutine.
>> >> 
>> >> I'm afraid I missed this question in my review of v3: when is a handler
>> >> *not* safe to be run in a coroutine?
>> >
>> > That's a hard one to answer fully.
>> >
>> > Basically, I think the biggest problem is with calling functions that
>> > change their behaviour if run in a coroutine compared to running them
>> > outside of coroutine context. In most cases the differences like having
>> > a nested event loop instead of yielding are just fine, but they are
>> > still subtly different.
>> >
>> > I know this is vague, but I can assure you that problematic cases exist.
>> > I hit one of them with my initial hack that just moved everything into a
>> > coroutine. It was related to graph modifications and bdrv_drain and
>> > resulted in a hang. For the specifics, I would have to try and reproduce
>> > the problem again.
>> 
>> I'm afraid it's even more complicated.
>> 
>> Monitors (HMP and QMP) run in the main loop.  Before this patch, HMP and
>> QMP commands run start to finish, one after the other.
>> 
>> After this patch, QMP commands may elect to yield.  QMP commands still
>> run one after the other (the shared dispatcher ensures this even when we
>> have multiple QMP monitors).
>> 
>> However, *HMP* commands can now run interleaved with a coroutine-enabled
>> QMP command, i.e. between a yield and its re-enter.
>
> I guess that's right. :-(
>
>> Consider an HMP screendump running in the middle of a coroutine-enabled
>> QMP screendump, using Marc-André's patch.  As far as I can tell, it
>> works, because:
>> 
>> 1. HMP does not run in a coroutine.  If it did, and both dumps dumped
>> the same @con, then it would overwrite con->screndump_co.  If we ever
>> decide to make HMP coroutine-capable so it doesn't block the main loop,
>> this will become unsafe in a nasty way.
>
> At the same time, switching HMP to coroutines would give us an easy way
> to fix the problem: Just use a CoMutex so that HMP and QMP commands
> never run in parallel. Unless we actually _want_ to run both at the same
> time for some commands, but I don't see why.

As long as running QMP commands from *all* monitors one after the other
is good enough, I can't see why running HMP commands interleaved is
worth the risk.

> While we don't have this, maybe it's worth considering if there is
> another simple way to achieve the same thing. Could QMP just suspend all
> HMP monitors while it's executing a command? At first sight it seems
> that this would work only for "interactive" monitors.

I believe the non-"interactive" HMP code is used only by gdbstub.c.

I keep forgetting our GDB server stub creates an "HMP" monitor.
Possibly because I don't understand why.  Alex, Philippe, can you
enlighten me?

Aside: I figure the distribution of work between monitor_init(),
monitor_init_hmp() and monitor_init_qmp() could be improved.

>> 2. graphic_hw_update() is safe to call even while another
>> graphic_hw_update() runs.  qxl_render_update() appears to ensure that
>> with the help of qxl->ssd.lock.
>> 
>> 3. There is no 3[*].
>> 
>> My point is: this is a non-trivial argument.  Whether a QMP command
>> handler is safe to run in a coroutine depends on how it interacts with
>> all the stuff that can run interleaved with it.  Typically includes
>> itself via its HMP buddy.
>
> If the handler doesn't yield, it's still trivial. So I think my original
> statement that with the existing QMP handlers, the problem is with code
> that behaves differently between coroutine and non-coroutine calls, is
> still true because that is the only code that could possibly yield with
> existing QMP command handlers.
>
> Of course, you're right that handlers that actually can yield need to be
> careful that they can deal with whatever happens until they are
> reentered. And that seems to include HMP handlers.
>
> Kevin



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands
  2020-03-06  7:25           ` Markus Armbruster
@ 2020-03-06  9:52             ` Kevin Wolf
  2020-03-06 12:38               ` Markus Armbruster
  0 siblings, 1 reply; 27+ messages in thread
From: Kevin Wolf @ 2020-03-06  9:52 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: qemu-block, Philippe Mathieu-Daudé,
	qemu-devel, David Alan Gilbert, marcandre.lureau, stefanha,
	Alex Bennée

On 06.03.2020 at 08:25, Markus Armbruster wrote:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > On 05.03.2020 at 16:30, Markus Armbruster wrote:
> >> Kevin Wolf <kwolf@redhat.com> writes:
> >> 
> >> > On 22.01.2020 at 07:32, Markus Armbruster wrote:
> >> >> Kevin Wolf <kwolf@redhat.com> writes:
> >> >> 
> >> >> > This patch adds a new 'coroutine' flag to QMP command definitions that
> >> >> > tells the QMP dispatcher that the command handler is safe to be run in a
> >> >> > coroutine.
> >> >> 
> >> >> I'm afraid I missed this question in my review of v3: when is a handler
> >> >> *not* safe to be run in a coroutine?
> >> >
> >> > That's a hard one to answer fully.
> >> >
> >> > Basically, I think the biggest problem is with calling functions that
> >> > change their behaviour if run in a coroutine compared to running them
> >> > outside of coroutine context. In most cases the differences like having
> >> > a nested event loop instead of yielding are just fine, but they are
> >> > still subtly different.
> >> >
> >> > I know this is vague, but I can assure you that problematic cases exist.
> >> > I hit one of them with my initial hack that just moved everything into a
> >> > coroutine. It was related to graph modifications and bdrv_drain and
> >> > resulted in a hang. For the specifics, I would have to try and reproduce
> >> > the problem again.
> >> 
> >> I'm afraid it's even more complicated.
> >> 
> >> Monitors (HMP and QMP) run in the main loop.  Before this patch, HMP and
> >> QMP commands run start to finish, one after the other.
> >> 
> >> After this patch, QMP commands may elect to yield.  QMP commands still
> >> run one after the other (the shared dispatcher ensures this even when we
> >> have multiple QMP monitors).
> >> 
> >> However, *HMP* commands can now run interleaved with a coroutine-enabled
> >> QMP command, i.e. between a yield and its re-enter.
> >
> > I guess that's right. :-(
> >
> >> Consider an HMP screendump running in the middle of a coroutine-enabled
> >> QMP screendump, using Marc-André's patch.  As far as I can tell, it
> >> works, because:
> >> 
> >> 1. HMP does not run in a coroutine.  If it did, and both dumps dumped
> >> the same @con, then it would overwrite con->screendump_co.  If we ever
> >> decide to make HMP coroutine-capable so it doesn't block the main loop,
> >> this will become unsafe in a nasty way.
> >
> > At the same time, switching HMP to coroutines would give us an easy way
> > to fix the problem: Just use a CoMutex so that HMP and QMP commands
> > never run in parallel. Unless we actually _want_ to run both at the same
> > time for some commands, but I don't see why.
> 
> As long as running QMP commands from *all* monitors one after the other
> is good enough, I can't see why running HMP commands interleaved is
> worth the risk.

There is one exception, actually: Obviously, human-monitor-command must
allow HMP commands to run in parallel.
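
(That's because the QMP handler itself invokes the HMP machinery,
roughly like this -- paraphrased, not the exact code:

    MonitorHMP hmp;   /* string-backed monitor, no chardev */
    monitor_data_init(&hmp.common, false, true, false);
    handle_hmp_command(&hmp, command_line);

so any "HMP waits while a QMP command runs" rule needs an exception for
this path.)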

> > While we don't have this, maybe it's worth considering if there is
> > another simple way to achieve the same thing. Could QMP just suspend all
> > HMP monitors while it's executing a command? At first sight it seems
> > that this would work only for "interactive" monitors.
> 
> I believe the non-"interactive" HMP code is used only by gdbstub.c.

monitor_init_hmp() is called from (based on my block branch):

* monitor_init(): This is interactive.
* qemu_chr_new_noreplay(): Interactive, too.
* gdbserver_start(): Non-interactive.

There is also qmp_human_monitor_command(), which manually creates a
MonitorHMP without going through monitor_init_hmp(). It does call
monitor_data_init(), though. There are no additional callers of
monitor_data_init() and I think I would have added it to every creation
of a Monitor object when I did the QMP/HMP split of the struct.

So GDB and human-monitor-command should be the two non-interactive
cases.

> I keep forgetting our GDB server stub creates an "HMP" monitor.
> Possibly because I don't understand why.  Alex, Philippe, can you
> enlighten me?

I think you can send HMP commands from within gdb with it:

(gdb) tar rem:1234
Remote debugging using :1234
[...]
(gdb) monitor info block
ide1-cd0: [not inserted]
    Attached to:      /machine/unattached/device[23]
    Removable device: not locked, tray closed

floppy0: [not inserted]
    Attached to:      /machine/unattached/device[16]
    Removable device: not locked, tray closed

sd0: [not inserted]
    Removable device: not locked, tray closed

So we do want to stop it from processing requests while a QMP command is
running.

Kevin



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands
  2020-03-06  9:52             ` Kevin Wolf
@ 2020-03-06 12:38               ` Markus Armbruster
  2020-03-06 14:26                 ` Kevin Wolf
  0 siblings, 1 reply; 27+ messages in thread
From: Markus Armbruster @ 2020-03-06 12:38 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-block, Alex Bennée, qemu-devel, David Alan Gilbert,
	marcandre.lureau, stefanha, Philippe Mathieu-Daudé

Kevin Wolf <kwolf@redhat.com> writes:

> On 06.03.2020 at 08:25, Markus Armbruster wrote:
>> Kevin Wolf <kwolf@redhat.com> writes:
>> 
>> > On 05.03.2020 at 16:30, Markus Armbruster wrote:
>> >> Kevin Wolf <kwolf@redhat.com> writes:
>> >> 
>> >> > On 22.01.2020 at 07:32, Markus Armbruster wrote:
>> >> >> Kevin Wolf <kwolf@redhat.com> writes:
>> >> >> 
>> >> >> > This patch adds a new 'coroutine' flag to QMP command definitions that
>> >> >> > tells the QMP dispatcher that the command handler is safe to be run in a
>> >> >> > coroutine.
>> >> >> 
>> >> >> I'm afraid I missed this question in my review of v3: when is a handler
>> >> >> *not* safe to be run in a coroutine?
>> >> >
>> >> > That's a hard one to answer fully.
>> >> >
>> >> > Basically, I think the biggest problem is with calling functions that
>> >> > change their behaviour if run in a coroutine compared to running them
>> >> > outside of coroutine context. In most cases the differences like having
>> >> > a nested event loop instead of yielding are just fine, but they are
>> >> > still subtly different.
>> >> >
>> >> > I know this is vague, but I can assure you that problematic cases exist.
>> >> > I hit one of them with my initial hack that just moved everything into a
>> >> > coroutine. It was related to graph modifications and bdrv_drain and
>> >> > resulted in a hang. For the specifics, I would have to try and reproduce
>> >> > the problem again.
>> >> 
>> >> I'm afraid it's even more complicated.
>> >> 
>> >> Monitors (HMP and QMP) run in the main loop.  Before this patch, HMP and
>> >> QMP commands run start to finish, one after the other.
>> >> 
>> >> After this patch, QMP commands may elect to yield.  QMP commands still
>> >> run one after the other (the shared dispatcher ensures this even when we
>> >> have multiple QMP monitors).
>> >> 
>> >> However, *HMP* commands can now run interleaved with a coroutine-enabled
>> >> QMP command, i.e. between a yield and its re-enter.
>> >
>> > I guess that's right. :-(
>> >
>> >> Consider an HMP screendump running in the middle of a coroutine-enabled
>> >> QMP screendump, using Marc-André's patch.  As far as I can tell, it
>> >> works, because:
>> >> 
>> >> 1. HMP does not run in a coroutine.  If it did, and both dumps dumped
>> >> the same @con, then it would overwrite con->screendump_co.  If we ever
>> >> decide to make HMP coroutine-capable so it doesn't block the main loop,
>> >> this will become unsafe in a nasty way.
>> >
>> > At the same time, switching HMP to coroutines would give us an easy way
>> > to fix the problem: Just use a CoMutex so that HMP and QMP commands
>> > never run in parallel. Unless we actually _want_ to run both at the same
>> > time for some commands, but I don't see why.
>> 
>> As long as running QMP commands from *all* monitors one after the other
>> is good enough, I can't see why running HMP commands interleaved is
>> worth the risk.
>
> There is one exception, actually: Obviously, human-monitor-command must
> allow HMP commands to run in parallel.

Yes.

>> > While we don't have this, maybe it's worth considering if there is
>> > another simple way to achieve the same thing. Could QMP just suspend all
>> > HMP monitors while it's executing a command? At first sight it seems
>> > that this would work only for "interactive" monitors.
>> 
>> I believe the non-"interactive" HMP code is used only by gdbstub.c.
>
> monitor_init_hmp() is called from (based on my block branch):
>
> * monitor_init(): This is interactive.
> * qemu_chr_new_noreplay(): Interactive, too.
> * gdbserver_start(): Non-interactive.
>
> There is also qmp_human_monitor_command(), which manually creates a
> MonitorHMP without going through monitor_init_hmp(). It does call
> monitor_data_init(), though. There are no additional callers of
> monitor_data_init() and I think I would have added it to every creation
> of a Monitor object when I did the QMP/HMP split of the struct.
>
> So GDB and human-monitor-command should be the two non-interactive
> cases.

Yes.

>> I keep forgetting our GDB server stub creates an "HMP" monitor.
>> Possibly because I don't understand why.  Alex, Philippe, can you
>> enlighten me?
>
> I think you can send HMP commands from within gdb with it:
>
> (gdb) tar rem:1234
> Remote debugging using :1234
> [...]
> (gdb) monitor info block
> ide1-cd0: [not inserted]
>     Attached to:      /machine/unattached/device[23]
>     Removable device: not locked, tray closed
>
> floppy0: [not inserted]
>     Attached to:      /machine/unattached/device[16]
>     Removable device: not locked, tray closed
>
> sd0: [not inserted]
>     Removable device: not locked, tray closed

Argh!

Any idea where we define GDB command "monitor"?

> So we do want stop it from processing requests while a QMP command is
> running.

Then a slow QMP command can interfere with debugging.

Perhaps we can synchronize just the "monitor" command.



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands
  2020-03-06 12:38               ` Markus Armbruster
@ 2020-03-06 14:26                 ` Kevin Wolf
  0 siblings, 0 replies; 27+ messages in thread
From: Kevin Wolf @ 2020-03-06 14:26 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: qemu-block, Alex Bennée, qemu-devel, David Alan Gilbert,
	marcandre.lureau, stefanha, Philippe Mathieu-Daudé

On 06.03.2020 at 13:38, Markus Armbruster wrote:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > On 06.03.2020 at 08:25, Markus Armbruster wrote:
> >> Kevin Wolf <kwolf@redhat.com> writes:
> >> 
> >> > On 05.03.2020 at 16:30, Markus Armbruster wrote:
> >> >> Kevin Wolf <kwolf@redhat.com> writes:
> >> >> 
> >> >> > On 22.01.2020 at 07:32, Markus Armbruster wrote:
> >> >> >> Kevin Wolf <kwolf@redhat.com> writes:
> >> >> >> 
> >> >> >> > This patch adds a new 'coroutine' flag to QMP command definitions that
> >> >> >> > tells the QMP dispatcher that the command handler is safe to be run in a
> >> >> >> > coroutine.
> >> >> >> 
> >> >> >> I'm afraid I missed this question in my review of v3: when is a handler
> >> >> >> *not* safe to be run in a coroutine?
> >> >> >
> >> >> > That's a hard one to answer fully.
> >> >> >
> >> >> > Basically, I think the biggest problem is with calling functions that
> >> >> > change their behaviour if run in a coroutine compared to running them
> >> >> > outside of coroutine context. In most cases the differences like having
> >> >> > a nested event loop instead of yielding are just fine, but they are
> >> >> > still subtly different.
> >> >> >
> >> >> > I know this is vague, but I can assure you that problematic cases exist.
> >> >> > I hit one of them with my initial hack that just moved everything into a
> >> >> > coroutine. It was related to graph modifications and bdrv_drain and
> >> >> > resulted in a hang. For the specifics, I would have to try and reproduce
> >> >> > the problem again.
> >> >> 
> >> >> I'm afraid it's even more complicated.
> >> >> 
> >> >> Monitors (HMP and QMP) run in the main loop.  Before this patch, HMP and
> >> >> QMP commands run start to finish, one after the other.
> >> >> 
> >> >> After this patch, QMP commands may elect to yield.  QMP commands still
> >> >> run one after the other (the shared dispatcher ensures this even when we
> >> >> have multiple QMP monitors).
> >> >> 
> >> >> However, *HMP* commands can now run interleaved with a coroutine-enabled
> >> >> QMP command, i.e. between a yield and its re-enter.
> >> >
> >> > I guess that's right. :-(
> >> >
> >> >> Consider an HMP screendump running in the middle of a coroutine-enabled
> >> >> QMP screendump, using Marc-André's patch.  As far as I can tell, it
> >> >> works, because:
> >> >> 
> >> >> 1. HMP does not run in a coroutine.  If it did, and both dumps dumped
> >> >> the same @con, then it would overwrite con->screendump_co.  If we ever
> >> >> decide to make HMP coroutine-capable so it doesn't block the main loop,
> >> >> this will become unsafe in a nasty way.
> >> >
> >> > At the same time, switching HMP to coroutines would give us an easy way
> >> > to fix the problem: Just use a CoMutex so that HMP and QMP commands
> >> > never run in parallel. Unless we actually _want_ to run both at the same
> >> > time for some commands, but I don't see why.
> >> 
> >> As long as running QMP commands from *all* monitors one after the other
> >> is good enough, I can't see why running HMP commands interleaved is
> >> worth the risk.
> >
> > There is one exception, actually: Obviously, human-monitor-command must
> > allow HMP commands to run in parallel.
> 
> Yes.
> 
> >> > While we don't have this, maybe it's worth considering if there is
> >> > another simple way to achieve the same thing. Could QMP just suspend all
> >> > HMP monitors while it's executing a command? At first sight it seems
> >> > that this would work only for "interactive" monitors.
> >> 
> >> I believe the non-"interactive" HMP code is used only by gdbstub.c.
> >
> > monitor_init_hmp() is called from (based on my block branch):
> >
> > * monitor_init(): This is interactive.
> > * qemu_chr_new_noreplay(): Interactive, too.
> > * gdbserver_start(): Non-interactive.
> >
> > There is also qmp_human_monitor_command(), which manually creates a
> > MonitorHMP without going through monitor_init_hmp(). It does call
> > monitor_data_init(), though. There are no additional callers of
> > monitor_data_init() and I think I would have added it to every creation
> > of a Monitor object when I did the QMP/HMP split of the struct.
> >
> > So GDB and human-monitor-command should be the two non-interactive
> > cases.
> 
> Yes.
> 
> >> I keep forgetting our GDB server stub creates an "HMP" monitor.
> >> Possibly because I don't understand why.  Alex, Philippe, can you
> >> enlighten me?
> >
> > I think you can send HMP commands from within gdb with it:
> >
> > (gdb) tar rem:1234
> > Remote debugging using :1234
> > [...]
> > (gdb) monitor info block
> > ide1-cd0: [not inserted]
> >     Attached to:      /machine/unattached/device[23]
> >     Removable device: not locked, tray closed
> >
> > floppy0: [not inserted]
> >     Attached to:      /machine/unattached/device[16]
> >     Removable device: not locked, tray closed
> >
> > sd0: [not inserted]
> >     Removable device: not locked, tray closed
> 
> Argh!
> 
> Any idea where we define GDB command "monitor"?

Just following the s->mon_chr that is assigned in gdbserver_start(), it
looks like handle_query_rcmd() sends the HMP command to the monitor.

gdb_gen_query_table has a function pointer to handle_query_rcmd() for
the gdb protocol command "Rcmd", and I think this is what gdb will send
for the "monitor" command.

> > So we do want to stop it from processing requests while a QMP command is
> > running.
> 
> Then a slow QMP command can interfere with debugging.
> 
> Perhaps we can synchronize just the "monitor" command.

I didn't mean stopping the processing of gdb commands, only of HMP
commands submitted via gdb (which will probably still block gdb while
it's waiting for a response, but only if a "monitor" command was given).

This is probably still not trivial because so far we have no buffering
involved anywhere and handle_query_rcmd() (or probably the whole gdbstub
command processing) is synchronous. Maybe run a nested event loop until
the QMP command is done or something.
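
Something along these lines in the Rcmd handling, with an entirely
made-up predicate for "the QMP dispatcher is still busy":

    /* Before forwarding the "monitor" command to the HMP monitor, run a
     * nested event loop until the current QMP command has finished. */
    while (qmp_dispatcher_busy()) {              /* made-up predicate */
        aio_poll(qemu_get_aio_context(), true);
    }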

Kevin



^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2020-03-06 14:28 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-21 18:11 [PATCH v4 0/4] qmp: Optionally run handlers in coroutines Kevin Wolf
2020-01-21 18:11 ` [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands Kevin Wolf
2020-01-22  6:32   ` Markus Armbruster
2020-01-22 10:10     ` Kevin Wolf
2020-01-22 12:15       ` Markus Armbruster
2020-01-22 14:35         ` Kevin Wolf
2020-03-05 15:30       ` Markus Armbruster
2020-03-05 16:06         ` Kevin Wolf
2020-03-06  7:25           ` Markus Armbruster
2020-03-06  9:52             ` Kevin Wolf
2020-03-06 12:38               ` Markus Armbruster
2020-03-06 14:26                 ` Kevin Wolf
2020-02-17  7:08   ` Markus Armbruster
2020-01-21 18:11 ` [PATCH v4 2/4] vl: Initialise main loop earlier Kevin Wolf
2020-01-21 18:11 ` [PATCH v4 3/4] qmp: Move dispatcher to a coroutine Kevin Wolf
2020-02-17 11:08   ` Markus Armbruster
2020-02-17 12:34     ` Kevin Wolf
2020-02-18 14:12       ` Markus Armbruster
2020-02-18 15:29         ` Kevin Wolf
2020-02-19  9:03           ` Markus Armbruster
2020-02-19 10:22             ` Kevin Wolf
2020-02-19 14:21               ` Markus Armbruster
2020-02-19 14:26                 ` Markus Armbruster
2020-02-19 15:27                 ` Kevin Wolf
2020-01-21 18:11 ` [PATCH v4 4/4] block: Mark 'block_resize' as coroutine Kevin Wolf
2020-02-05 14:00 ` [PATCH v4 0/4] qmp: Optionally run handlers in coroutines Kevin Wolf
2020-02-12 11:40   ` Kevin Wolf
