qemu-devel.nongnu.org archive mirror
* [PATCH] monitor/qmp: fix race with clients disconnecting early
@ 2021-08-23 10:11 Stefan Reiter
  2021-08-23 15:05 ` Philippe Mathieu-Daudé
  2021-08-25 15:54 ` Markus Armbruster
  0 siblings, 2 replies; 11+ messages in thread
From: Stefan Reiter @ 2021-08-23 10:11 UTC (permalink / raw)
  To: Dr. David Alan Gilbert, Markus Armbruster, Michael Roth,
	Paolo Bonzini, Kevin Wolf, Wolfgang Bumiller, Thomas Lamprecht
  Cc: qemu-devel

From: Stefan Reiter <stefan@pimaker.at>

The following sequence can produce a race condition that results in
responses meant for different clients being sent to the wrong one:

(QMP, no OOB)
1) client A connects
2) client A sends 'qmp_capabilities'
3) 'qmp_dispatch' runs in coroutine, schedules out to
   'do_qmp_dispatch_bh' and yields
4) client A disconnects (i.e. aborts, crashes, etc...)
5) client B connects
6) 'do_qmp_dispatch_bh' runs 'qmp_capabilities' and wakes calling coroutine
7) capabilities are now set and 'mon->commands' is set to '&qmp_commands'
8) 'qmp_dispatch' returns to 'monitor_qmp_dispatch'
9) success message is sent to client B *without it ever having sent
   'qmp_capabilities' itself*
9a) even if client B ignores it, it will now presumably send its own
   greeting, which will error because caps are already set

The fix proposed here uses an atomic, sequential connection number
stored in the MonitorQMP struct, which is incremented every time a new
client connects. Since it is not changed on CHR_EVENT_CLOSED, the
behaviour of allowing a client to disconnect only one side of the
connection is retained.

The connection_nr needs to be exposed outside of the monitor subsystem,
since qmp_dispatch lives in qapi code. It needs to be checked twice,
once for actually running the command in the BH (fixes 7/9a), and once for
sending back a response (fixes 9).

This satisfies my local reproducer: with multiple clients constantly looping
through connect, send the greeting, disconnect, other normally behaving
clients no longer crash from receiving unrelated responses.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---

This is another patch in the apparently never-ending series of fixes to
QMP-related race conditions. Our PVE users really seem to have a knack for
triggering these ;)

Here's one of the (several) forum threads where we tried to diagnose the issue:
https://forum.proxmox.com/threads/error-with-backup-when-backing-up-qmp-command-query-backup-failed-got-wrong-command-id.88017/

It manifested itself under load, where sometimes our monitor code (which uses
the 'id' system of QMP to track commands) would receive bogus responses, showing
up as "got wrong command id" errors in our logs.

I'm not sure this approach is the best fix - it seems a lot like a band-aid to
me - but it's the best I could come up with for now. Open to different ideas,
of course.

Note that with this patch, we no longer run an already scheduled command if its
client has disconnected early and a new one has connected in the meantime. This
seems like a slight behaviour change to me... On the other hand, I didn't want
to drag the connection number into qmp_capabilities() and special case that
even further.

I didn't look too deeply into 'QMP in iothread' yet; if that does what I think
it does, there might be two more potential races here:
* between 'do_qmp_dispatch_bh' running 'aio_co_wake' and 'qmp_dispatch' actually
  yielding (very unlikely though)
* plus a TOCTOU in 'do_qmp_dispatch_bh' with the atomic read of the
  connection_nr and actually running cmd->fn()

Thanks!


 include/monitor/monitor.h  |  1 +
 monitor/monitor-internal.h |  7 +++++++
 monitor/monitor.c          | 15 +++++++++++++++
 monitor/qmp.c              | 15 ++++++++++++++-
 qapi/qmp-dispatch.c        | 21 +++++++++++++++++----
 stubs/monitor-core.c       |  5 +++++
 6 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
index af3887bb71..ff6db1448b 100644
--- a/include/monitor/monitor.h
+++ b/include/monitor/monitor.h
@@ -16,6 +16,7 @@ extern QemuOptsList qemu_mon_opts;
 Monitor *monitor_cur(void);
 Monitor *monitor_set_cur(Coroutine *co, Monitor *mon);
 bool monitor_cur_is_qmp(void);
+int monitor_get_connection_nr(const Monitor *mon);
 
 void monitor_init_globals(void);
 void monitor_init_globals_core(void);
diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
index 9c3a09cb01..a92be8c3f7 100644
--- a/monitor/monitor-internal.h
+++ b/monitor/monitor-internal.h
@@ -144,6 +144,13 @@ typedef struct {
     QemuMutex qmp_queue_lock;
     /* Input queue that holds all the parsed QMP requests */
     GQueue *qmp_requests;
+
+    /*
+     * A sequential number that gets incremented on every new CHR_EVENT_OPENED.
+     * Used to avoid leftover responses in BHs from being sent to the wrong
+     * client. Access with atomics.
+     */
+    int connection_nr;
 } MonitorQMP;
 
 /**
diff --git a/monitor/monitor.c b/monitor/monitor.c
index 636bcc81c5..ee5ac26c37 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -136,6 +136,21 @@ bool monitor_cur_is_qmp(void)
     return cur_mon && monitor_is_qmp(cur_mon);
 }
 
+/**
+ * If @mon is a QMP monitor, return the connection_nr, otherwise -1.
+ */
+int monitor_get_connection_nr(const Monitor *mon)
+{
+    MonitorQMP *qmp_mon;
+
+    if (!monitor_is_qmp(mon)) {
+        return -1;
+    }
+
+    qmp_mon = container_of(mon, MonitorQMP, common);
+    return qatomic_read(&qmp_mon->connection_nr);
+}
+
 /**
  * Is @mon is using readline?
  * Note: not all HMP monitors use readline, e.g., gdbserver has a
diff --git a/monitor/qmp.c b/monitor/qmp.c
index 0ef7cebb78..3ec67e32d3 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -141,6 +141,8 @@ static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
     QDict *rsp;
     QDict *error;
 
+    int conn_nr_before = qatomic_read(&mon->connection_nr);
+
     rsp = qmp_dispatch(mon->commands, req, qmp_oob_enabled(mon),
                        &mon->common);
 
@@ -156,7 +158,17 @@ static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
         }
     }
 
-    monitor_qmp_respond(mon, rsp);
+    /*
+     * qmp_dispatch might have yielded and waited for a BH, in which case there
+     * is a chance a new client connected in the meantime - if this happened,
+     * the command will not have been executed, but we also need to ensure that
+     * we don't send back a corresponding response on a line that no longer
+     * belongs to this request.
+     */
+    if (conn_nr_before == qatomic_read(&mon->connection_nr)) {
+        monitor_qmp_respond(mon, rsp);
+    }
+
     qobject_unref(rsp);
 }
 
@@ -444,6 +456,7 @@ static void monitor_qmp_event(void *opaque, QEMUChrEvent event)
 
     switch (event) {
     case CHR_EVENT_OPENED:
+        qatomic_inc_fetch(&mon->connection_nr);
         mon->commands = &qmp_cap_negotiation_commands;
         monitor_qmp_caps_reset(mon);
         data = qmp_greeting(mon);
diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index 59600210ce..95602446eb 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -120,16 +120,28 @@ typedef struct QmpDispatchBH {
     QObject **ret;
     Error **errp;
     Coroutine *co;
+    int conn_nr;
 } QmpDispatchBH;
 
 static void do_qmp_dispatch_bh(void *opaque)
 {
     QmpDispatchBH *data = opaque;
 
-    assert(monitor_cur() == NULL);
-    monitor_set_cur(qemu_coroutine_self(), data->cur_mon);
-    data->cmd->fn(data->args, data->ret, data->errp);
-    monitor_set_cur(qemu_coroutine_self(), NULL);
+    /*
+     * A QMP monitor tracks its client with a connection number. If this
+     * changes during the scheduling delay of this BH, we must not execute the
+     * command. Otherwise a badly placed 'qmp_capabilities' might affect the
+     * connection state of a client it was never meant for.
+     */
+    if (data->conn_nr == monitor_get_connection_nr(data->cur_mon)) {
+        assert(monitor_cur() == NULL);
+        monitor_set_cur(qemu_coroutine_self(), data->cur_mon);
+        data->cmd->fn(data->args, data->ret, data->errp);
+        monitor_set_cur(qemu_coroutine_self(), NULL);
+    } else {
+        error_setg(data->errp, "active monitor connection changed");
+    }
+
     aio_co_wake(data->co);
 }
 
@@ -243,6 +255,7 @@ QDict *qmp_dispatch(const QmpCommandList *cmds, QObject *request,
             .ret        = &ret,
             .errp       = &err,
             .co         = qemu_coroutine_self(),
+            .conn_nr    = monitor_get_connection_nr(cur_mon),
         };
         aio_bh_schedule_oneshot(qemu_get_aio_context(), do_qmp_dispatch_bh,
                                 &data);
diff --git a/stubs/monitor-core.c b/stubs/monitor-core.c
index d058a2a00d..3290b58120 100644
--- a/stubs/monitor-core.c
+++ b/stubs/monitor-core.c
@@ -13,6 +13,11 @@ Monitor *monitor_set_cur(Coroutine *co, Monitor *mon)
     return NULL;
 }
 
+int monitor_get_connection_nr(const Monitor *mon)
+{
+    return -1;
+}
+
 void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)
 {
 }
-- 
2.30.2





* Re: [PATCH] monitor/qmp: fix race with clients disconnecting early
  2021-08-23 10:11 [PATCH] monitor/qmp: fix race with clients disconnecting early Stefan Reiter
@ 2021-08-23 15:05 ` Philippe Mathieu-Daudé
  2021-08-25 15:54 ` Markus Armbruster
  1 sibling, 0 replies; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-08-23 15:05 UTC (permalink / raw)
  To: Stefan Reiter, Dr. David Alan Gilbert, Markus Armbruster,
	Michael Roth, Paolo Bonzini, Kevin Wolf, Wolfgang Bumiller,
	Thomas Lamprecht
  Cc: Marc-André Lureau, qemu-devel

Cc'ing Marc-André

On 8/23/21 12:11 PM, Stefan Reiter wrote:
> From: Stefan Reiter <stefan@pimaker.at>
> 
> The following sequence can produce a race condition that results in
> responses meant for different clients being sent to the wrong one:
> 
> (QMP, no OOB)
> 1) client A connects
> 2) client A sends 'qmp_capabilities'
> 3) 'qmp_dispatch' runs in coroutine, schedules out to
>    'do_qmp_dispatch_bh' and yields
> 4) client A disconnects (i.e. aborts, crashes, etc...)
> 5) client B connects
> 6) 'do_qmp_dispatch_bh' runs 'qmp_capabilities' and wakes calling coroutine
> 7) capabilities are now set and 'mon->commands' is set to '&qmp_commands'
> 8) 'qmp_dispatch' returns to 'monitor_qmp_dispatch'
> 9) success message is sent to client B *without it ever having sent
>    'qmp_capabilities' itself*
> 9a) even if client B ignores it, it will now presumably send its own
>    greeting, which will error because caps are already set
> 
> The fix proposed here uses an atomic, sequential connection number
> stored in the MonitorQMP struct, which is incremented every time a new
> client connects. Since it is not changed on CHR_EVENT_CLOSED, the
> behaviour of allowing a client to disconnect only one side of the
> connection is retained.
> 
> The connection_nr needs to be exposed outside of the monitor subsystem,
> since qmp_dispatch lives in qapi code. It needs to be checked twice,
> once for actually running the command in the BH (fixes 7/9a), and once for
> sending back a response (fixes 9).
> 
> This satisfies my local reproducer - using multiple clients constantly
> looping to open a connection, send the greeting, then exiting no longer
> crashes other, normally behaving clients with unrelated responses.
> 
> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
> ---
> 
> This is another patch in the apparently never-ending series of fixes to
> QMP-related race conditions. Our PVE users really seem to have a knack for
> triggering these ;)
> 
> Here's one of the (several) forum threads where we tried to diagnose the issue:
> https://forum.proxmox.com/threads/error-with-backup-when-backing-up-qmp-command-query-backup-failed-got-wrong-command-id.88017/
> 
> It manifested itself under load, where sometimes our monitor code (which uses
> the 'id' system of QMP to track commands) would receive bogus responses, showing
> up as "got wrong command id" errors in our logs.
> 
> I'm not sure this approach is the best fix, it seems a lot like a band-aid to
> me, but it's the best I could come up with for now - open for different ideas of
> course.
> 
> Note that with this patch, we're no longer running a command that was scheduled
> after a client has disconnected early. This seems like a slight behaviour change
> to me... On the other hand, I didn't want to drag the connection number into
> qmp_capabilities() and special case that even further.
> 
> I didn't look too deeply into 'QMP in iothread' yet, if that does what I think it
> does there might be two more potential races here:
> * between 'do_qmp_dispatch_bh' running 'aio_co_wake' and 'qmp_dispatch' actually
>   yielding (very unlikely though)
> * plus a TOCTOU in 'do_qmp_dispatch_bh' with the atomic read of the
>   connection_nr and actually running cmd->fn()
> 
> Thanks!
> 
> 
>  include/monitor/monitor.h  |  1 +
>  monitor/monitor-internal.h |  7 +++++++
>  monitor/monitor.c          | 15 +++++++++++++++
>  monitor/qmp.c              | 15 ++++++++++++++-
>  qapi/qmp-dispatch.c        | 21 +++++++++++++++++----
>  stubs/monitor-core.c       |  5 +++++
>  6 files changed, 59 insertions(+), 5 deletions(-)
> 
> diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
> index af3887bb71..ff6db1448b 100644
> --- a/include/monitor/monitor.h
> +++ b/include/monitor/monitor.h
> @@ -16,6 +16,7 @@ extern QemuOptsList qemu_mon_opts;
>  Monitor *monitor_cur(void);
>  Monitor *monitor_set_cur(Coroutine *co, Monitor *mon);
>  bool monitor_cur_is_qmp(void);
> +int monitor_get_connection_nr(const Monitor *mon);
>  
>  void monitor_init_globals(void);
>  void monitor_init_globals_core(void);
> diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
> index 9c3a09cb01..a92be8c3f7 100644
> --- a/monitor/monitor-internal.h
> +++ b/monitor/monitor-internal.h
> @@ -144,6 +144,13 @@ typedef struct {
>      QemuMutex qmp_queue_lock;
>      /* Input queue that holds all the parsed QMP requests */
>      GQueue *qmp_requests;
> +
> +    /*
> +     * A sequential number that gets incremented on every new CHR_EVENT_OPENED.
> +     * Used to avoid leftover responses in BHs from being sent to the wrong
> +     * client. Access with atomics.
> +     */
> +    int connection_nr;
>  } MonitorQMP;
>  
>  /**
> diff --git a/monitor/monitor.c b/monitor/monitor.c
> index 636bcc81c5..ee5ac26c37 100644
> --- a/monitor/monitor.c
> +++ b/monitor/monitor.c
> @@ -136,6 +136,21 @@ bool monitor_cur_is_qmp(void)
>      return cur_mon && monitor_is_qmp(cur_mon);
>  }
>  
> +/**
> + * If @mon is a QMP monitor, return the connection_nr, otherwise -1.
> + */
> +int monitor_get_connection_nr(const Monitor *mon)
> +{
> +    MonitorQMP *qmp_mon;
> +
> +    if (!monitor_is_qmp(mon)) {
> +        return -1;
> +    }
> +
> +    qmp_mon = container_of(mon, MonitorQMP, common);
> +    return qatomic_read(&qmp_mon->connection_nr);
> +}
> +
>  /**
>   * Is @mon is using readline?
>   * Note: not all HMP monitors use readline, e.g., gdbserver has a
> diff --git a/monitor/qmp.c b/monitor/qmp.c
> index 0ef7cebb78..3ec67e32d3 100644
> --- a/monitor/qmp.c
> +++ b/monitor/qmp.c
> @@ -141,6 +141,8 @@ static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
>      QDict *rsp;
>      QDict *error;
>  
> +    int conn_nr_before = qatomic_read(&mon->connection_nr);
> +
>      rsp = qmp_dispatch(mon->commands, req, qmp_oob_enabled(mon),
>                         &mon->common);
>  
> @@ -156,7 +158,17 @@ static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
>          }
>      }
>  
> -    monitor_qmp_respond(mon, rsp);
> +    /*
> +     * qmp_dispatch might have yielded and waited for a BH, in which case there
> +     * is a chance a new client connected in the meantime - if this happened,
> +     * the command will not have been executed, but we also need to ensure that
> +     * we don't send back a corresponding response on a line that no longer
> +     * belongs to this request.
> +     */
> +    if (conn_nr_before == qatomic_read(&mon->connection_nr)) {
> +        monitor_qmp_respond(mon, rsp);
> +    }
> +
>      qobject_unref(rsp);
>  }
>  
> @@ -444,6 +456,7 @@ static void monitor_qmp_event(void *opaque, QEMUChrEvent event)
>  
>      switch (event) {
>      case CHR_EVENT_OPENED:
> +        qatomic_inc_fetch(&mon->connection_nr);
>          mon->commands = &qmp_cap_negotiation_commands;
>          monitor_qmp_caps_reset(mon);
>          data = qmp_greeting(mon);
> diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
> index 59600210ce..95602446eb 100644
> --- a/qapi/qmp-dispatch.c
> +++ b/qapi/qmp-dispatch.c
> @@ -120,16 +120,28 @@ typedef struct QmpDispatchBH {
>      QObject **ret;
>      Error **errp;
>      Coroutine *co;
> +    int conn_nr;
>  } QmpDispatchBH;
>  
>  static void do_qmp_dispatch_bh(void *opaque)
>  {
>      QmpDispatchBH *data = opaque;
>  
> -    assert(monitor_cur() == NULL);
> -    monitor_set_cur(qemu_coroutine_self(), data->cur_mon);
> -    data->cmd->fn(data->args, data->ret, data->errp);
> -    monitor_set_cur(qemu_coroutine_self(), NULL);
> +    /*
> +     * A QMP monitor tracks its client with a connection number. If this
> +     * changes during the scheduling delay of this BH, we must not execute the
> +     * command. Otherwise a badly placed 'qmp_capabilities' might affect the
> +     * connection state of a client it was never meant for.
> +     */
> +    if (data->conn_nr == monitor_get_connection_nr(data->cur_mon)) {
> +        assert(monitor_cur() == NULL);
> +        monitor_set_cur(qemu_coroutine_self(), data->cur_mon);
> +        data->cmd->fn(data->args, data->ret, data->errp);
> +        monitor_set_cur(qemu_coroutine_self(), NULL);
> +    } else {
> +        error_setg(data->errp, "active monitor connection changed");
> +    }
> +
>      aio_co_wake(data->co);
>  }
>  
> @@ -243,6 +255,7 @@ QDict *qmp_dispatch(const QmpCommandList *cmds, QObject *request,
>              .ret        = &ret,
>              .errp       = &err,
>              .co         = qemu_coroutine_self(),
> +            .conn_nr    = monitor_get_connection_nr(cur_mon),
>          };
>          aio_bh_schedule_oneshot(qemu_get_aio_context(), do_qmp_dispatch_bh,
>                                  &data);
> diff --git a/stubs/monitor-core.c b/stubs/monitor-core.c
> index d058a2a00d..3290b58120 100644
> --- a/stubs/monitor-core.c
> +++ b/stubs/monitor-core.c
> @@ -13,6 +13,11 @@ Monitor *monitor_set_cur(Coroutine *co, Monitor *mon)
>      return NULL;
>  }
>  
> +int monitor_get_connection_nr(const Monitor *mon)
> +{
> +    return -1;
> +}
> +
>  void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)
>  {
>  }
> 




* Re: [PATCH] monitor/qmp: fix race with clients disconnecting early
  2021-08-23 10:11 [PATCH] monitor/qmp: fix race with clients disconnecting early Stefan Reiter
  2021-08-23 15:05 ` Philippe Mathieu-Daudé
@ 2021-08-25 15:54 ` Markus Armbruster
  2021-08-26 13:50   ` Markus Armbruster
  1 sibling, 1 reply; 11+ messages in thread
From: Markus Armbruster @ 2021-08-25 15:54 UTC (permalink / raw)
  To: Stefan Reiter
  Cc: Kevin Wolf, Wolfgang Bumiller, Michael Roth,
	Dr. David Alan Gilbert, qemu-devel, Paolo Bonzini,
	Thomas Lamprecht

Stefan Reiter <s.reiter@proxmox.com> writes:

> From: Stefan Reiter <stefan@pimaker.at>
>
> The following sequence can produce a race condition that results in
> responses meant for different clients being sent to the wrong one:
>
> (QMP, no OOB)
> 1) client A connects
> 2) client A sends 'qmp_capabilities'
> 3) 'qmp_dispatch' runs in coroutine, schedules out to
>    'do_qmp_dispatch_bh' and yields
> 4) client A disconnects (i.e. aborts, crashes, etc...)
> 5) client B connects
> 6) 'do_qmp_dispatch_bh' runs 'qmp_capabilities' and wakes calling coroutine
> 7) capabilities are now set and 'mon->commands' is set to '&qmp_commands'
> 8) 'qmp_dispatch' returns to 'monitor_qmp_dispatch'
> 9) success message is sent to client B *without it ever having sent
>    'qmp_capabilities' itself*
> 9a) even if client B ignores it, it will now presumably send its own
>    greeting, which will error because caps are already set
>
> The fix proposed here uses an atomic, sequential connection number
> stored in the MonitorQMP struct, which is incremented every time a new
> client connects. Since it is not changed on CHR_EVENT_CLOSED, the
> behaviour of allowing a client to disconnect only one side of the
> connection is retained.
>
> The connection_nr needs to be exposed outside of the monitor subsystem,
> since qmp_dispatch lives in qapi code. It needs to be checked twice,
> once for actually running the command in the BH (fixes 7/9a), and once for
> sending back a response (fixes 9).
>
> This satisfies my local reproducer - using multiple clients constantly
> looping to open a connection, send the greeting, then exiting no longer
> crashes other, normally behaving clients with unrelated responses.
>
> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
> ---
>
> This is another patch in the apparently never-ending series of fixes to
> QMP-related race conditions. Our PVE users really seem to have a knack for
> triggering these ;)

Fortunately, you seem to have a knack for fixing them!

> Here's one of the (several) forum threads where we tried to diagnose the issue:
> https://forum.proxmox.com/threads/error-with-backup-when-backing-up-qmp-command-query-backup-failed-got-wrong-command-id.88017/
>
> It manifested itself under load, where sometimes our monitor code (which uses
> the 'id' system of QMP to track commands) would receive bogus responses, showing
> up as "got wrong command id" errors in our logs.

Let me re-explain the bug in my own words, to make sure I understand.

A QMP monitor normally runs in the monitor I/O thread.

A QMP monitor can serve only one client at a time.

It executes out-of-band commands right as it reads them.  In-band
commands are queued, and executed one after the other in the main loop.

Command output is buffered.  We write it out as fast as the character
device can take it.  If a write fails, we throw away the entire buffer
contents.

A client can disconnect at any time.  This throws away the queue.  An
in-band command may be executing in the main loop.  An out-of-band
command may be executing in the monitor's thread.

Such commands (if any) are not affected by the disconnect.  Their output
gets buffered, but write out fails, so it's thrown away.

*Except* when another client connects quickly enough.  Then we send it
output meant for the previous client.  This is wrong.  I suspect this
could even send invalid JSON.

Special case: if in-band command qmp_capabilities is executing when the
client disconnects, and another client connects before the command flips
the monitor from capabilities negotiation mode to command mode, that
client starts in the wrong mode.

> I'm not sure this approach is the best fix, it seems a lot like a band-aid to
> me, but it's the best I could come up with for now - open for different ideas of
> course.

I need to think through the ramifications.  Please ping me if you don't
hear from me this week.

> Note that with this patch, we're no longer running a command that was scheduled
> after a client has disconnected early. This seems like a slight behaviour change
> to me...

The commit message needs to spell this out.

>          On the other hand, I didn't want to drag the connection number into
> qmp_capabilities() and special case that even further.
>
> I didn't look too deeply into 'QMP in iothread' yet, if that does what I think it
> does there might be two more potential races here:
> * between 'do_qmp_dispatch_bh' running 'aio_co_wake' and 'qmp_dispatch' actually
>   yielding (very unlikely though)
> * plus a TOCTOU in 'do_qmp_dispatch_bh' with the atomic read of the
>   connection_nr and actually running cmd->fn()

We should discuss these in more detail.

> Thanks!

Thank *you*!



* Re: [PATCH] monitor/qmp: fix race with clients disconnecting early
  2021-08-25 15:54 ` Markus Armbruster
@ 2021-08-26 13:50   ` Markus Armbruster
  2021-08-26 15:15     ` Stefan Reiter
  0 siblings, 1 reply; 11+ messages in thread
From: Markus Armbruster @ 2021-08-26 13:50 UTC (permalink / raw)
  To: Stefan Reiter
  Cc: Kevin Wolf, Wolfgang Bumiller, Michael Roth,
	Dr. David Alan Gilbert, qemu-devel, Paolo Bonzini,
	Thomas Lamprecht

Markus Armbruster <armbru@redhat.com> writes:

[...]

> Let me re-explain the bug in my own words, to make sure I understand.
>
> A QMP monitor normally runs in the monitor I/O thread.
>
> A QMP monitor can serve only one client at a time.
>
> It executes out-of-band commands right as it reads them.  In-band
> commands are queued, and executed one after the other in the main loop.
>
> Command output is buffered.  We write it out as fast as the character
> device can take it.  If a write fails, we throw away the entire buffer
> contents.
>
> A client can disconnect at any time.  This throws away the queue.  An
> in-band command may be executing in the main loop.  An out-of-band
> command may be executing in the monitor's thread.
>
> Such commands (if any) are not affected by the disconnect.  Their output
> gets buffered, but write out fails, so it's thrown away.
>
> *Except* when another client connects quickly enough.  Then we send it
> output meant for the previous client.  This is wrong.  I suspect this
> could even send invalid JSON.
>
> Special case: if in-band command qmp_capabilities is executing when the
> client disconnects, and another client connects before the command flips
> the monitor from capabilities negotiation mode to command mode, that
> client starts in the wrong mode.

What the cases have in common: disconnect + connect in monitor I/O
thread and the command executing in the main thread change the same
monitor state.

You observed two issues: one involving the output buffer (new client
receives old client's output), and one involving monitor mode (new
client has its mode flipped by the old client's qmp_capabilities
command).

Any monitor state accessed by commands could cause issues.  Right now, I
can see only one more: file descriptors.  Cleaning them up on disconnect
could mess with the command.

[...]




* Re: [PATCH] monitor/qmp: fix race with clients disconnecting early
  2021-08-26 13:50   ` Markus Armbruster
@ 2021-08-26 15:15     ` Stefan Reiter
  2021-08-31 15:45       ` Markus Armbruster
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Reiter @ 2021-08-26 15:15 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Kevin Wolf, Wolfgang Bumiller, Michael Roth,
	Dr. David Alan Gilbert, qemu-devel, Paolo Bonzini,
	Thomas Lamprecht

On 26/08/2021 15:50, Markus Armbruster wrote:
> Markus Armbruster <armbru@redhat.com> writes:
> 
> [...]
> 
>> Let me re-explain the bug in my own words, to make sure I understand.
>>
>> A QMP monitor normally runs in the monitor I/O thread.
>>
>> A QMP monitor can serve only one client at a time.
>>
>> It executes out-of-band commands right as it reads them.  In-band
>> commands are queued, and executed one after the other in the main loop.
>>
>> Command output is buffered.  We write it out as fast as the character
>> device can take it.  If a write fails, we throw away the entire buffer
>> contents.
>>
>> A client can disconnect at any time.  This throws away the queue.  An
>> in-band command may be executing in the main loop.  An out-of-band
>> command may be executing in the monitor's thread.
>>
>> Such commands (if any) are not affected by the disconnect.  Their output
>> gets buffered, but write out fails, so it's thrown away.
>>
>> *Except* when another client connects quickly enough.  Then we send it
>> output meant for the previous client.  This is wrong.  I suspect this
>> could even send invalid JSON.
>>

I'm not sure this is the case. In all testing I have *never* encountered 
broken JSON or any other indication that partial output was received.

I think the issue is just the window between starting to execute the command 
in the BH and the new client connecting... can the CHR_EVENTs even be 
triggered while the main thread is busy with the BH?

>> Special case: if in-band command qmp_capabilities is executing when the
>> client disconnects, and another client connects before the command flips
>> the monitor from capabilities negotiation mode to command mode, that
>> client starts in the wrong mode.
> 
> What the cases have in common: disconnect + connect in monitor I/O
> thread and the command executing in the main thread change the same
> monitor state.
> 
> You observed two issues: one involving the output buffer (new client
> receives old client's output), and one involving monitor mode (new
> client has its mode flipped by the old client's qmp_capabilities
> command).
> 
> Any monitor state accessed by commands could cause issues.  Right now, I
> can see only one more: file descriptors.  Cleaning them up on disconnect
> could mess with the command.

Right, that would make sense, but also only an issue if the reconnect 
can happen in the middle of the command itself.

Maybe we can acquire some kind of lock during actual in-band QMP command 
execution?

> 
> [...]
> 
> 




* Re: [PATCH] monitor/qmp: fix race with clients disconnecting early
  2021-08-26 15:15     ` Stefan Reiter
@ 2021-08-31 15:45       ` Markus Armbruster
  2021-09-02 12:05         ` Markus Armbruster
  2021-09-02 12:45         ` Markus Armbruster
  0 siblings, 2 replies; 11+ messages in thread
From: Markus Armbruster @ 2021-08-31 15:45 UTC (permalink / raw)
  To: Stefan Reiter
  Cc: Kevin Wolf, Wolfgang Bumiller, Michael Roth,
	Dr. David Alan Gilbert, qemu-devel, Paolo Bonzini,
	Thomas Lamprecht

Stefan Reiter <s.reiter@proxmox.com> writes:

> On 26/08/2021 15:50, Markus Armbruster wrote:
>> Markus Armbruster <armbru@redhat.com> writes:
>> [...]
>> 
>>> Let me re-explain the bug in my own words, to make sure I understand.
>>>
>>> A QMP monitor normally runs in the monitor I/O thread.
>>>
>>> A QMP monitor can serve only one client at a time.
>>>
>>> It executes out-of-band commands right as it reads them.  In-band
>>> commands are queued, and executed one after the other in the main loop.
>>>
>>> Command output is buffered.  We write it out as fast as the character
>>> device can take it.  If a write fails, we throw away the entire buffer
>>> contents.
>>>
>>> A client can disconnect at any time.  This throws away the queue.  An
>>> in-band command may be executing in the main loop.  An out-of-band
>>> command may be executing in the monitor's thread.
>>>
>>> Such commands (if any) are not affected by the disconnect.  Their output
>>> gets buffered, but write out fails, so it's thrown away.
>>>
>>> *Except* when another client connects quickly enough.  Then we send it
>>> output meant for the previous client.  This is wrong.  I suspect this
>>> could even send invalid JSON.
>>>
>
> I'm not sure this is the case. In all testing I have *never*
> encountered broken JSON or any other indication that partial
> output was received.

We buffer monitor output without bounds, and try to write it out as
quickly as the client will take it.  I think short writes are possible
when the client is slow to read.  Such short writes can write partial
JSON expressions; the operating system doesn't know or care.  If the
client disconnects right then, the buffer starts with the remainder of
the JSON expression.  If the buffer is sent to the next client...

> I think the issue is just between starting to execute the command in
> the BH and the new client connecting... can the CHR_EVENTs even be
> triggered when the main thread is busy with the BH?

Good question.  We better find out.

>>> Special case: if in-band command qmp_capabilities is executing when the
>>> client disconnects, and another client connects before the command flips
>>> the monitor from capabilities negotiation mode to command mode, that
>>> client starts in the wrong mode.
>>
>> What the cases have in common: disconnect + connect in monitor I/O
>> thread and the command executing in the main thread change the same
>> monitor state.
>>
>> You observed two issues: one involving the output buffer (new client
>> receives old client's output), and one involving monitor mode (new
>> client has its mode flipped by the old client's qmp_capabilities
>> command).
>>
>> Any monitor state accessed by commands could cause issues.  Right now, I
>> can see only one more: file descriptors.  Cleaning them up on disconnect
>> could mess with the command.
>
> Right, that would make sense, but also only an issue if the reconnect
> can happen in the middle of the command itself.
>
> Maybe we can acquire some kind of lock during actual in-band QMP
> command execution?

Yes, the root cause is insufficient synchronization between an in-band
command running in the main loop and disconnect / connect running in the
monitor I/O thread, and synchronizing them properly feels like the best
chance for a complete and reliable fix.

Before the OOB work, everything was in the main thread.  I figure the
misbehavior you found was not possible then.

Synchronization so that disconnect is delayed until after the in-band
command in flight completes should get us back to that state.  But I'm
afraid it could regress the OOB feature.

The point of OOB commands is "exec-oob executes right away no matter
what".  To make that possible OOB-capable commands are severely
restricted in what they can do, and the client needs to limit the number
of commands in flight.

"Right away" should be possible even when the client has to connect
first.  I'm not sure that's actually the case now.  Delaying until after
in-band completes would definitely kill it, though.

I'm afraid the synchronization needs to be more involved.

>> [...]




* Re: [PATCH] monitor/qmp: fix race with clients disconnecting early
  2021-08-31 15:45       ` Markus Armbruster
@ 2021-09-02 12:05         ` Markus Armbruster
  2021-09-02 12:45         ` Markus Armbruster
  1 sibling, 0 replies; 11+ messages in thread
From: Markus Armbruster @ 2021-09-02 12:05 UTC (permalink / raw)
  To: Stefan Reiter
  Cc: Kevin Wolf, Wolfgang Bumiller, Michael Roth,
	Dr. David Alan Gilbert, qemu-devel, Paolo Bonzini,
	Thomas Lamprecht

Markus Armbruster <armbru@redhat.com> writes:

[...]

> The point of OOB commands is "exec-oob executes right away no matter
> what".  To make that possible OOB-capable commands are severely
> restricted in what they can do, and the client needs to limit the number
> of commands in flight.
>
> "Right away" should be possible even when the client has to connect
> first.  I'm not sure that's actually the case now.

It's not.  I consider this a bug.

[...]




* Re: [PATCH] monitor/qmp: fix race with clients disconnecting early
  2021-08-31 15:45       ` Markus Armbruster
  2021-09-02 12:05         ` Markus Armbruster
@ 2021-09-02 12:45         ` Markus Armbruster
  2021-10-12  9:27           ` Markus Armbruster
  1 sibling, 1 reply; 11+ messages in thread
From: Markus Armbruster @ 2021-09-02 12:45 UTC (permalink / raw)
  To: Stefan Reiter
  Cc: Kevin Wolf, Wolfgang Bumiller, Michael Roth,
	Dr. David Alan Gilbert, qemu-devel, Paolo Bonzini,
	Thomas Lamprecht

I've thought a bit more.

A monitor can serve a series of clients.

Back when all of the monitor ran in the main thread, we completely
finished serving the current client before we started serving the next
one (I think).  In other words, sessions did not overlap.

Since we moved parts of the monitor to the monitor I/O thread (merge
commit 4bdc24fa018), sessions can overlap, and this causes issues, as
you demonstrated.

Possible fixes:

1. Go back to "don't overlap" with suitable synchronization.

   I'm afraid this won't cut it, because exec-oob would have to wait in
   line behind reconnect.

   It currently waits for other reasons, but that feels fixable.  Going
   back to "don't overlap" would make it unfixable.

2. Make the lingering session not affect / be affected by the new
   session's state

   This is what your patch tries.  Every access of session state needs
   to be guarded like

        if (conn_nr_before == qatomic_read(&mon->connection_nr)) {
            access session state
        } else {
            ???
        }

   Issues:

   * We have to find and guard all existing accesses.  That's work.

   * We have to guard all future accesses.  More work, and easy to
     forget, which makes this fix rather brittle.

   * The fix is spread over many places.

   * We may run into cases where the ??? part gets complicated.
     Consider file descriptors.  The command in flight will have its
     monitor_get_fd() fail after disconnect.  Probably okay, because it
     can also fail for other reasons.  But we might run into cases where
     the ??? part creates new failure modes for us to handle.

3. Per-session state

   Move per-session state from monitor state into a separate object.

   Use reference counts to keep this object alive until both threads are
   done with the session.

   Monitor I/O thread executes monitor core and the out-of-band
   commands; it stops using the old session on disconnect, and starts
   using the new session on connect.

   Main thread executes in-band commands, and these use the session that
   submitted them.

   Do I make sense, or should I explain my idea in more detail?




* Re: [PATCH] monitor/qmp: fix race with clients disconnecting early
  2021-09-02 12:45         ` Markus Armbruster
@ 2021-10-12  9:27           ` Markus Armbruster
  2021-10-13 15:44             ` Stefan Reiter
  0 siblings, 1 reply; 11+ messages in thread
From: Markus Armbruster @ 2021-10-12  9:27 UTC (permalink / raw)
  To: Stefan Reiter
  Cc: Kevin Wolf, Wolfgang Bumiller, Michael Roth,
	Dr. David Alan Gilbert, qemu-devel, Paolo Bonzini,
	Thomas Lamprecht

Stefan, any thoughts on this?

Markus Armbruster <armbru@redhat.com> writes:

> I've thought a bit more.
>
> A monitor can serve a series of clients.
>
> Back when all of the monitor ran in the main thread, we completely
> finished serving the current client before we started serving the next
> one (I think).  In other words, sessions did not overlap.
>
> Since we moved parts of the monitor to the monitor I/O thread (merge
> commit 4bdc24fa018), sessions can overlap, and this causes issues, as
> you demonstrated.
>
> Possible fixes:
>
> 1. Go back to "don't overlap" with suitable synchronization.
>
>    I'm afraid this won't cut it, because exec-oob would have to wait in
>    line behind reconnect.
>
>    It currently waits for other reasons, but that feels fixable.  Going
>    back to "don't overlap" would make it unfixable.
>
> 2. Make the lingering session not affect / be affected by the new
>    session's state
>
>    This is what your patch tries.  Every access of session state needs
>    to be guarded like
>
>         if (conn_nr_before == qatomic_read(&mon->connection_nr)) {
>             access session state
>         } else {
>             ???
>         }
>
>    Issues:
>
>    * We have to find and guard all existing accesses.  That's work.
>
>    * We have to guard all future accesses.  More work, and easy to
>      forget, which makes this fix rather brittle.
>
>    * The fix is spread over many places.
>
>    * We may run into cases where the ??? part gets complicated.
>      Consider file descriptors.  The command in flight will have its
>      monitor_get_fd() fail after disconnect.  Probably okay, because it
>      can also fail for other reasons.  But we might run into cases where
>      the ??? part creates new failure modes for us to handle.
>
> 3. Per-session state
>
>    Move per-session state from monitor state into a separate object.
>
>    Use reference counts to keep this object alive until both threads are
>    done with the session.
>
>    Monitor I/O thread executes monitor core and the out-of-band
>    commands; it stops using the old session on disconnect, and starts
>    using the new session on connect.
>
>    Main thread executes in-band commands, and these use the session that
>    submitted them.
>
>    Do I make sense, or should I explain my idea in more detail?




* Re: [PATCH] monitor/qmp: fix race with clients disconnecting early
  2021-10-12  9:27           ` Markus Armbruster
@ 2021-10-13 15:44             ` Stefan Reiter
  2021-10-14  5:40               ` Markus Armbruster
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Reiter @ 2021-10-13 15:44 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Kevin Wolf, Wolfgang Bumiller, Michael Roth,
	Dr. David Alan Gilbert, qemu-devel, Paolo Bonzini,
	Thomas Lamprecht

On 10/12/21 11:27 AM, Markus Armbruster wrote:
> Stefan, any thoughts on this?
> 

Sorry, I didn't get to work on implementing this. Idea 3 does seem very
reasonable, though I suppose the question is what all should go into the
per-session state, and also how it is passed down.

We did roll out my initial patch to our users in the meantime and got
some positive feedback (that specific error disappeared); however, another
one (reported at the same time) still exists, so I was trying to diagnose
it - it might also turn out to be monitor-related and resolved by the more
thorough fix proposed here, but I'm unsure.

> Markus Armbruster <armbru@redhat.com> writes:
> 
>> I've thought a bit more.
>>
>> A monitor can serve a series of clients.
>>
>> Back when all of the monitor ran in the main thread, we completely
>> finished serving the current client before we started serving the next
>> one (I think).  In other words, sessions did not overlap.
>>
>> Since we moved parts of the monitor to the monitor I/O thread (merge
>> commit 4bdc24fa018), sessions can overlap, and this causes issues, as
>> you demonstrated.
>>
>> Possible fixes:
>>
>> 1. Go back to "don't overlap" with suitable synchronization.
>>
>>     I'm afraid this won't cut it, because exec-oob would have to wait in
>>     line behind reconnect.
>>
>>     It currently waits for other reasons, but that feels fixable.  Going
>>     back to "don't overlap" would make it unfixable.
>>
>> 2. Make the lingering session not affect / be affected by the new
>>     session's state
>>
>>     This is what your patch tries.  Every access of session state needs
>>     to be guarded like
>>
>>          if (conn_nr_before == qatomic_read(&mon->connection_nr)) {
>>              access session state
>>          } else {
>>              ???
>>          }
>>
>>     Issues:
>>
>>     * We have to find and guard all existing accesses.  That's work.
>>
>>     * We have to guard all future accesses.  More work, and easy to
>>       forget, which makes this fix rather brittle.
>>
>>     * The fix is spread over many places.
>>
>>     * We may run into cases where the ??? part gets complicated.
>>       Consider file descriptors.  The command in flight will have its
>>       monitor_get_fd() fail after disconnect.  Probably okay, because it
>>       can also fail for other reasons.  But we might run into cases where
>>       the ??? part creates new failure modes for us to handle.
>>
>> 3. Per-session state
>>
>>     Move per-session state from monitor state into a separate object.
>>
>>     Use reference counts to keep this object alive until both threads are
>>     done with the session.
>>
>>     Monitor I/O thread executes monitor core and the out-of-band
>>     commands; it stops using the old session on disconnect, and starts
>>     using the new session on connect.
>>
>>     Main thread executes in-band commands, and these use the session that
>>     submitted them.
>>
>>     Do I make sense, or should I explain my idea in more detail?
> 
> 




* Re: [PATCH] monitor/qmp: fix race with clients disconnecting early
  2021-10-13 15:44             ` Stefan Reiter
@ 2021-10-14  5:40               ` Markus Armbruster
  0 siblings, 0 replies; 11+ messages in thread
From: Markus Armbruster @ 2021-10-14  5:40 UTC (permalink / raw)
  To: Stefan Reiter
  Cc: Kevin Wolf, Wolfgang Bumiller, Michael Roth,
	Dr. David Alan Gilbert, qemu-devel, Paolo Bonzini,
	Thomas Lamprecht

Stefan Reiter <s.reiter@proxmox.com> writes:

> On 10/12/21 11:27 AM, Markus Armbruster wrote:
>> Stefan, any thoughts on this?
>> 
>
> Sorry, I didn't get to work on implementing this. Idea 3 does seem very
> reasonable, though I suppose the question is what all should go into the
> per-session state, and also how it is passed down.

Let's start with the state you have shown to be problematic.  To decide
what else to move from monitor state to session state, we'll want to
review monitor state one by one.  Can be done in review of patches
creating the session state.

Most users need the current session state.  So have struct MonitorQMP
hold a pointer to it.

To execute an in-band command in the main thread, we need the command's
session state.  I'd try adding a pointer to QMPRequest.

Then use reference counting to keep the session alive until all its
commands are processed.
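
Roughly, something like this - all names are made up, just to illustrate
the shape, not actual code:

#include <stdbool.h>
#include <glib.h>

typedef struct QMPSession {
    int refcnt;          /* one ref held by MonitorQMP, one per queued request */
    bool capab;          /* capability negotiation state moves in here */
    GString *outbuf;     /* output buffered for *this* client only */
    /* fds and whatever else turns out to be per-client */
} QMPSession;

/*
 * MonitorQMP grows a 'QMPSession *session' pointing at the current client's
 * session; the I/O thread swaps it on CHR_EVENT_OPENED and drops its
 * reference on CHR_EVENT_CLOSED.
 *
 * QMPRequest grows a 'QMPSession *session' recording which session queued
 * the command; the main thread executes the command against that session,
 * so a response for a dead session lands in that session's outbuf and is
 * simply never flushed to a later client.
 */

static inline void qmp_session_unref(QMPSession *s)
{
    /* needs to be atomic in the real thing, both threads unref */
    if (--s->refcnt == 0) {
        g_string_free(s->outbuf, true);
        g_free(s);
    }
}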

Makes sense?

> We did roll out my initial patch to our users in the meantime and got
> some positive feedback (that specific error disappeared), however another
> one (reported at the same time) still exists, so I was trying to diagnose
> - it might also turn out to be monitor related and resolved by the more
> thorough fix proposed here, but unsure.

That would be lovely.

Thanks!




end of thread, other threads:[~2021-10-14  7:37 UTC | newest]

Thread overview: 11+ messages
2021-08-23 10:11 [PATCH] monitor/qmp: fix race with clients disconnecting early Stefan Reiter
2021-08-23 15:05 ` Philippe Mathieu-Daudé
2021-08-25 15:54 ` Markus Armbruster
2021-08-26 13:50   ` Markus Armbruster
2021-08-26 15:15     ` Stefan Reiter
2021-08-31 15:45       ` Markus Armbruster
2021-09-02 12:05         ` Markus Armbruster
2021-09-02 12:45         ` Markus Armbruster
2021-10-12  9:27           ` Markus Armbruster
2021-10-13 15:44             ` Stefan Reiter
2021-10-14  5:40               ` Markus Armbruster
