From: Peter Xu <peterx@redhat.com>
To: qemu-devel@nongnu.org
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	"Daniel P . Berrange" <berrange@redhat.com>,
	Fam Zheng <famz@redhat.com>, Juan Quintela <quintela@redhat.com>,
	mdroth@linux.vnet.ibm.com, peterx@redhat.com,
	Eric Blake <eblake@redhat.com>,
	Laurent Vivier <lvivier@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: [Qemu-devel] [RFC v2 2/8] monitor: allow monitor to create thread to poll
Date: Wed, 23 Aug 2017 14:51:05 +0800
Message-ID: <1503471071-2233-3-git-send-email-peterx@redhat.com>
In-Reply-To: <1503471071-2233-1-git-send-email-peterx@redhat.com>

First, introduce Monitor.use_thread, and set it for monitors whose
backend chardev is not mux-typed.  The thread is dedicated to a
monitor, so mux-typed chardevs are not suitable (a mux chardev may
connect to, e.g., a serial device and the monitor together).

When use_thread is set, we create a standalone thread to poll the
monitor events, isolated from the main loop thread.  We still need to
take the BQL before dispatching a command, since some monitor commands
must not run without the protection of the BQL.  This gives us the
chance to avoid taking the BQL for some monitor commands in the future.
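
As a rough illustration of the "take the BQL only if the caller does
not already hold it" pattern described above, here is a minimal
standalone sketch using plain pthreads.  It is not QEMU code: big_lock,
big_lock_held and dispatch_command() are hypothetical stand-ins for the
BQL, qemu_mutex_iothread_locked() and the dispatch paths touched by
this patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for the BQL; every name in this sketch is hypothetical. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
/* Per-thread flag, mimicking what qemu_mutex_iothread_locked() reports. */
static _Thread_local bool big_lock_held;

static void big_lock_acquire(void)
{
    pthread_mutex_lock(&big_lock);
    big_lock_held = true;
}

static void big_lock_release(void)
{
    assert(big_lock_held);
    big_lock_held = false;
    pthread_mutex_unlock(&big_lock);
}

typedef void (*command_fn)(void *opaque);

/* Run a command under the lock, taking it only if the caller has not. */
static void dispatch_command(command_fn fn, void *opaque)
{
    bool take_lock = !big_lock_held;

    if (take_lock) {
        big_lock_acquire();
    }
    fn(opaque);
    if (take_lock) {
        big_lock_release();
    }
}

/* Example command: records whether it ran with the lock held. */
static void record_lock_state(void *opaque)
{
    *(bool *)opaque = big_lock_held;
}

/* Body of a dedicated monitor thread, which starts without the lock. */
static void *monitor_thread_fn(void *opaque)
{
    dispatch_command(record_lock_state, opaque);
    return NULL;
}
```

Either way the command runs with the lock held: a caller that already
owns the lock (like the main loop) keeps owning it afterwards, while a
dedicated thread takes and drops it around the command.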

* Why this change?

We need these per-monitor threads to make sure we have at least one
monitor that will never get stuck (that can always receive further
monitor commands).

* So when will monitors get stuck?  And how do they get stuck?

Now that we have postcopy and remote page faults, it is easy to get the
monitor stuck (which also means the main loop thread is stuck):

(1) Monitor deadlock on BQL

When postcopy is running on the destination VM, a vcpu thread can get
stuck at almost any time, as soon as it tries to access a guest page
that has not been copied yet.  When that happens, the vcpu thread may
be holding the BQL.  If the page fault is not handled quickly, the
monitors stop working, since they are trying to take the BQL.

If the page fault cannot be handled at all (one case is a paused
postcopy, when the network is temporarily down), the monitors will hang
forever.  Without this patch, that means the main loop hangs, and we
will never find a way to talk to the VM again.

(2) Monitor tries to run code on page-faulted vcpus

The HMP command "info cpus" is a good example: it tries to kick all the
vcpus and sync status from them.  However, if any vcpu is stuck at an
unhandled page fault, the sync can never complete, and the HMP command
hangs.  Again, this hangs the main loop thread as well.

After either (1) or (2), we can see the deadlock problem:

- On one hand, if the monitor hangs, we cannot do the postcopy
  recovery, because postcopy recovery needs the user to specify a new
  listening port on the destination monitor.

- On the other hand, if we cannot recover the paused postcopy, then
  page faults cannot be serviced, and the monitors may hang forever.

* How does this patch help?

- Firstly, each dedicated monitor (that is, each monitor whose backend
  chardev is used only by the monitor) gets its own thread, so even if
  the main loop thread hangs (which is always possible), this monitor
  thread may still survive.

- Not all monitor commands need the BQL.  We can take the BQL
  selectively (depending on which command we are running) to avoid
  waiting on a page-faulted vcpu thread that holds the BQL (this will
  be done in follow-up patches).
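
To make the second point concrete, here is a small standalone sketch of
a command table with a per-command flag that says whether the lock is
needed.  It is only an illustration under assumed names: the Command
struct, its without_lock field, the two command names and
run_command() are all hypothetical, and the real mechanism is left to
the follow-up patches.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the BQL; every name in this sketch is hypothetical. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
/* Counts how many dispatches actually took the lock. */
static int lock_acquisitions;

typedef struct Command {
    const char *name;
    bool without_lock;      /* true: handler is safe without the lock */
    int (*handler)(void);
} Command;

static int handle_info_status(void) { return 1; }
static int handle_migrate_incoming(void) { return 2; }

static const Command command_table[] = {
    { "info_status",      false, handle_info_status },
    { "migrate_incoming", true,  handle_migrate_incoming },
};

static const Command *find_command(const char *name)
{
    size_t n = sizeof(command_table) / sizeof(command_table[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(command_table[i].name, name) == 0) {
            return &command_table[i];
        }
    }
    return NULL;
}

/* Returns the handler's result, or -1 for an unknown command. */
static int run_command(const char *name)
{
    const Command *cmd = find_command(name);
    int ret;

    if (!cmd) {
        return -1;
    }
    assert(cmd->handler != NULL);
    if (!cmd->without_lock) {
        pthread_mutex_lock(&big_lock);
        lock_acquisitions++;
    }
    ret = cmd->handler();
    if (!cmd->without_lock) {
        pthread_mutex_unlock(&big_lock);
    }
    return ret;
}
```

With such a table, a command marked without_lock still completes while
some other party is sitting on the lock, which is the property the
recovery path needs.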

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 monitor.c           | 75 +++++++++++++++++++++++++++++++++++++++++++++++++----
 qapi/qmp-dispatch.c | 15 +++++++++++
 2 files changed, 85 insertions(+), 5 deletions(-)

diff --git a/monitor.c b/monitor.c
index 7c90df7..3d4ecff 100644
--- a/monitor.c
+++ b/monitor.c
@@ -36,6 +36,8 @@
 #include "net/net.h"
 #include "net/slirp.h"
 #include "chardev/char-fe.h"
+#include "chardev/char-mux.h"
+#include "chardev/char-io.h"
 #include "ui/qemu-spice.h"
 #include "sysemu/numa.h"
 #include "monitor/monitor.h"
@@ -190,6 +192,8 @@ struct Monitor {
     int flags;
     int suspend_cnt;
     bool skip_flush;
+    /* Whether the monitor wants to be polled in standalone thread */
+    bool use_thread;
 
     QemuMutex out_lock;
     QString *outbuf;
@@ -206,6 +210,11 @@ struct Monitor {
     mon_cmd_t *cmd_table;
     QLIST_HEAD(,mon_fd_t) fds;
     QLIST_ENTRY(Monitor) entry;
+
+    /* Only used when "use_thread" is used */
+    QemuThread mon_thread;
+    GMainContext *mon_context;
+    GMainLoop *mon_loop;
 };
 
 /* QMP checker flags */
@@ -568,7 +577,7 @@ static void monitor_qapi_event_init(void)
 
 static void handle_hmp_command(Monitor *mon, const char *cmdline);
 
-static void monitor_data_init(Monitor *mon, bool skip_flush)
+static void monitor_data_init(Monitor *mon, bool skip_flush, bool use_thread)
 {
     memset(mon, 0, sizeof(Monitor));
     qemu_mutex_init(&mon->out_lock);
@@ -576,10 +585,34 @@ static void monitor_data_init(Monitor *mon, bool skip_flush)
     /* Use *mon_cmds by default. */
     mon->cmd_table = mon_cmds;
     mon->skip_flush = skip_flush;
+    mon->use_thread = use_thread;
+    if (use_thread) {
+        /*
+         * For monitors that use isolated threads, they'll need their
+         * own GMainContext and GMainLoop.  Otherwise, these pointers
+         * will be NULL, which means the default context will be used.
+         */
+        mon->mon_context = g_main_context_new();
+        mon->mon_loop = g_main_loop_new(mon->mon_context, TRUE);
+    }
 }
 
 static void monitor_data_destroy(Monitor *mon)
 {
+    /* Destroy the thread first, if there is one */
+    if (mon->use_thread) {
+        /* Notify the per-monitor thread to quit. */
+        g_main_loop_quit(mon->mon_loop);
+        /*
+         * Make sure the context will get the quit message since it's
+         * in another thread.  Without this, it may not be able to
+         * respond to the quit message immediately.
+         */
+        g_main_context_wakeup(mon->mon_context);
+        qemu_thread_join(&mon->mon_thread);
+        g_main_loop_unref(mon->mon_loop);
+        g_main_context_unref(mon->mon_context);
+    }
     qemu_chr_fe_deinit(&mon->chr, false);
     if (monitor_is_qmp(mon)) {
         json_message_parser_destroy(&mon->qmp.parser);
@@ -595,7 +628,7 @@ char *qmp_human_monitor_command(const char *command_line, bool has_cpu_index,
     char *output = NULL;
     Monitor *old_mon, hmp;
 
-    monitor_data_init(&hmp, true);
+    monitor_data_init(&hmp, true, false);
 
     old_mon = cur_mon;
     cur_mon = &hmp;
@@ -3101,6 +3134,11 @@ static void handle_hmp_command(Monitor *mon, const char *cmdline)
 {
     QDict *qdict;
     const mon_cmd_t *cmd;
+    /*
+     * If we haven't taken the BQL (when called by per-monitor
+     * threads), we need to take care of the BQL on our own.
+     */
+    bool take_bql = !qemu_mutex_iothread_locked();
 
     trace_handle_hmp_command(mon, cmdline);
 
@@ -3116,7 +3154,16 @@ static void handle_hmp_command(Monitor *mon, const char *cmdline)
         return;
     }
 
+    if (take_bql) {
+        qemu_mutex_lock_iothread();
+    }
+
     cmd->cmd(mon, qdict);
+
+    if (take_bql) {
+        qemu_mutex_unlock_iothread();
+    }
+
     QDECREF(qdict);
 }
 
@@ -4086,6 +4133,15 @@ static void __attribute__((constructor)) monitor_lock_init(void)
     qemu_mutex_init(&monitor_lock);
 }
 
+static void *monitor_thread(void *data)
+{
+    Monitor *mon = data;
+
+    g_main_loop_run(mon->mon_loop);
+
+    return NULL;
+}
+
 void monitor_init(Chardev *chr, int flags)
 {
     static int is_first_init = 1;
@@ -4098,7 +4154,9 @@ void monitor_init(Chardev *chr, int flags)
     }
 
     mon = g_malloc(sizeof(*mon));
-    monitor_data_init(mon, false);
+
+    /* For non-mux typed monitors, we create dedicated threads. */
+    monitor_data_init(mon, false, !CHARDEV_IS_MUX(chr));
 
     qemu_chr_fe_init(&mon->chr, chr, &error_abort);
     mon->flags = flags;
@@ -4112,12 +4170,19 @@ void monitor_init(Chardev *chr, int flags)
 
     if (monitor_is_qmp(mon)) {
         qemu_chr_fe_set_handlers(&mon->chr, monitor_can_read, monitor_qmp_read,
-                                 monitor_qmp_event, NULL, mon, NULL, true);
+                                 monitor_qmp_event, NULL, mon,
+                                 mon->mon_context, true);
         qemu_chr_fe_set_echo(&mon->chr, true);
         json_message_parser_init(&mon->qmp.parser, handle_qmp_command);
     } else {
         qemu_chr_fe_set_handlers(&mon->chr, monitor_can_read, monitor_read,
-                                 monitor_event, NULL, mon, NULL, true);
+                                 monitor_event, NULL, mon,
+                                 mon->mon_context, true);
+    }
+
+    if (mon->use_thread) {
+        qemu_thread_create(&mon->mon_thread, chr->label, monitor_thread,
+                           mon, QEMU_THREAD_JOINABLE);
     }
 
     qemu_mutex_lock(&monitor_lock);
diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index 5ad36f8..3b6b224 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -19,6 +19,7 @@
 #include "qapi/qmp/qjson.h"
 #include "qapi-types.h"
 #include "qapi/qmp/qerror.h"
+#include "qemu/main-loop.h"
 
 static QDict *qmp_dispatch_check_obj(const QObject *request, Error **errp)
 {
@@ -75,6 +76,11 @@ static QObject *do_qmp_dispatch(QmpCommandList *cmds, QObject *request,
     QDict *args, *dict;
     QmpCommand *cmd;
     QObject *ret = NULL;
+    /*
+     * If we haven't taken the BQL (when called by per-monitor
+     * threads), we need to take care of the BQL on our own.
+     */
+    bool take_bql = !qemu_mutex_iothread_locked();
 
     dict = qmp_dispatch_check_obj(request, errp);
     if (!dict) {
@@ -101,7 +107,16 @@ static QObject *do_qmp_dispatch(QmpCommandList *cmds, QObject *request,
         QINCREF(args);
     }
 
+    if (take_bql) {
+        qemu_mutex_lock_iothread();
+    }
+
     cmd->fn(args, &ret, &local_err);
+
+    if (take_bql) {
+        qemu_mutex_unlock_iothread();
+    }
+
     if (local_err) {
         error_propagate(errp, local_err);
     } else if (cmd->options & QCO_NO_SUCCESS_RESP) {
-- 
2.7.4

