xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 00/11] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_*
@ 2018-11-12 16:49 Anthony PERARD
  2018-11-12 16:49 ` [PATCH v6 01/11] libxl: Enhance libxl__sendmsg_fds to deal with EINTR and EWOULDBLOCK Anthony PERARD
                   ` (10 more replies)
  0 siblings, 11 replies; 31+ messages in thread
From: Anthony PERARD @ 2018-11-12 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, Wei Liu

Changes in v6:
    Implementation of libxl__ev_qmp_* functions have been squashed to a single
    patch. And with that, a lot of changes in order to make it simpler to read
    the implementation, have better error reporting and a few bug fix.

    Checkout more detail changelog in the notes of each patch, as there is
    many.

Changes in v5:
    Plenty of patch have been applied.
    Other changes mostly are coding style and other typos.
    Some bug fixes.
    Details can be found in patch notes.

    I have left aside the change to cdrom_insert until I can found what to do
    with the userdata lock.

Changes in v4:
    Better API which meant a lot of other changes.

In order for libxl to be able to manage QEMU while it is restricted, a few
changes are needed. We need a new way to get a startup notification from QEMU
as xenstore may not be accessible when QEMU is ready. We also need to a
different way to have QEMU save it's state and to insert cdrom as a restricted
QEMU doesn't have access to the file system.

For both, we can use QMP, we can use it to query QEMU's status, and we can use
it to send a file descriptor through which QEMU can save its state, or it can
be a cdrom.

We take this opportunity to rewrite the QMP client, and this time been
asynchronous, the result is libxl__ev_qmp_*.

The plat de résistance in this patch series start with patch
"libxl: Design of an async API to issue QMP commands to QEMU"
which implement libxl__ev_qmp_* functions to turn the QMP client into
asynchronous mode.

This comes with changes that uses the new interface.
* "libxl: QEMU startup sync based on QMP"
  which can use QMP to find out when QEMU as started.
  this requires: "libxl_dm: Pre-open QMP socket for QEMU"
  But that only works with dm_restrict=1 as explain in the patch.
* "libxl: Re-implement domain_suspend_device_model using libxl__ev_qmp"
  Which rewrite libxl__qmp_save(), and adds the ability to have QEMU save
  its state to a file descriptor which libxl will have openned.

Patches series available in a git tag:
git fetch https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git tags/libxl-ev-qmp-v6
git checkout -b libxl-ev-qmp-v6 FETCH_HEAD

Cheers,

Anthony PERARD (11):
  libxl: Enhance libxl__sendmsg_fds to deal with EINTR and EWOULDBLOCK
  libxl_qmp: Separate QMP message generation from qmp_send_prepare
  libxl_qmp: Change qmp_qemu_check_version to compare version
  libxl: Design of an async API to issue QMP commands to QEMU
  libxl_qmp: Implementation of libxl__ev_qmp_*
  libxl_exec: Add libxl__spawn_initiate_failure
  libxl_dm: Pre-open QMP socket for QEMU
  libxl: QEMU startup sync based on QMP
  libxl_qmp: Store advertised QEMU version in libxl__ev_qmp
  libxl: Change libxl__domain_suspend_device_model() to be async
  libxl: Re-implement domain_suspend_device_model using libxl__ev_qmp

 tools/libxl/libxl_create.c      |  38 +-
 tools/libxl/libxl_dm.c          | 126 ++++-
 tools/libxl/libxl_dom_suspend.c |  37 +-
 tools/libxl/libxl_exec.c        |  11 +
 tools/libxl/libxl_internal.h    | 162 +++++-
 tools/libxl/libxl_qmp.c         | 925 ++++++++++++++++++++++++++++++--
 tools/libxl/libxl_types.idl     |   7 +
 tools/libxl/libxl_utils.c       |  19 +-
 8 files changed, 1222 insertions(+), 103 deletions(-)

-- 
Anthony PERARD


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v6 01/11] libxl: Enhance libxl__sendmsg_fds to deal with EINTR and EWOULDBLOCK
  2018-11-12 16:49 [PATCH v6 00/11] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_* Anthony PERARD
@ 2018-11-12 16:49 ` Anthony PERARD
  2018-11-12 16:58   ` Ian Jackson
  2018-11-12 16:49 ` [PATCH v6 02/11] libxl_qmp: Separate QMP message generation from qmp_send_prepare Anthony PERARD
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Anthony PERARD @ 2018-11-12 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, Wei Liu

This patch change the behavior of libxl__sendmsg_fds to retry sendmsg on
EINTR error.

This patch also allow a caller of libxl__sendmsg_fds to deal with
EWOULDBLOCK. It is best to only sent one byte when dealing with
non-blocking fds so a EWOULDBLOCK error would mean that the fds haven't
been sent yet.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---
 tools/libxl/libxl_internal.h |  3 ++-
 tools/libxl/libxl_utils.c    | 19 ++++++++++++++-----
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index e498435e16..ae5960d869 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1864,7 +1864,8 @@ _hidden void libxl__qmp_cleanup(libxl__gc *gc, uint32_t domid);
 _hidden int libxl__qmp_initializations(libxl__gc *gc, uint32_t domid,
                                        const libxl_domain_config *guest_config);
 
-/* on failure, logs */
+/* returns ERROR_NOT_READY on EWOULDBLOCK
+ * or logs on failure. */
 int libxl__sendmsg_fds(libxl__gc *gc, int carrier,
                        const void *data, size_t datalen,
                        int nfds, const int fds[], const char *what);
diff --git a/tools/libxl/libxl_utils.c b/tools/libxl/libxl_utils.c
index 5854717b11..5d20fd57c5 100644
--- a/tools/libxl/libxl_utils.c
+++ b/tools/libxl/libxl_utils.c
@@ -1088,11 +1088,20 @@ int libxl__sendmsg_fds(libxl__gc *gc, int carrier,
 
     msg.msg_controllen = cmsg->cmsg_len;
 
-    r = sendmsg(carrier, &msg, 0);
-    if (r < 0) {
-        LOGE(ERROR, "failed to send fd-carrying message (%s)", what);
-        return ERROR_FAIL;
-    }
+    while (1) {
+        r = sendmsg(carrier, &msg, 0);
+        if (r < 0) {
+            if (errno == EINTR)
+                continue;
+            if (errno == EWOULDBLOCK) {
+                assert(datalen == 1);
+                return ERROR_NOT_READY;
+            }
+            LOGE(ERROR, "failed to send fd-carrying message (%s)", what);
+            return ERROR_FAIL;
+        }
+        break;
+    };
 
     return 0;
 }
-- 
Anthony PERARD


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v6 02/11] libxl_qmp: Separate QMP message generation from qmp_send_prepare
  2018-11-12 16:49 [PATCH v6 00/11] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_* Anthony PERARD
  2018-11-12 16:49 ` [PATCH v6 01/11] libxl: Enhance libxl__sendmsg_fds to deal with EINTR and EWOULDBLOCK Anthony PERARD
@ 2018-11-12 16:49 ` Anthony PERARD
  2018-11-12 17:20   ` Ian Jackson
  2018-11-12 16:49 ` [PATCH v6 03/11] libxl_qmp: Change qmp_qemu_check_version to compare version Anthony PERARD
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Anthony PERARD @ 2018-11-12 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, Wei Liu

To be able to re-use qmp_prepare_qmp_cmd with libxl__ev_qmp.

Also, add the QMP end of command '\r\n' into the generated string.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---

Notes:
    v6:
        comment about ownership of buf
        use lib__sprintf instead of two strncpy
    v5:
        rename qmp_prepare_qmp_cmd to qmp_prepare_cmd
        fix coding style

 tools/libxl/libxl_qmp.c | 57 ++++++++++++++++++++++++++++-------------
 1 file changed, 39 insertions(+), 18 deletions(-)

diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index 6a5c997546..2dd54d5398 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -571,17 +571,16 @@ static int qmp_next(libxl__gc *gc, libxl__qmp_handler *qmp)
     return rc;
 }
 
-static char *qmp_send_prepare(libxl__gc *gc, libxl__qmp_handler *qmp,
-                              const char *cmd, libxl__json_object *args,
-                              qmp_callback_t callback, void *opaque,
-                              qmp_request_context *context)
+static char *qmp_prepare_cmd(libxl__gc *gc, const char *cmd,
+                             const libxl__json_object *args,
+                             int id, size_t *len_r)
 {
-    const unsigned char *buf = NULL;
-    char *ret = NULL;
-    libxl_yajl_length len = 0;
+    yajl_gen hand = NULL;
+    /* memory for 'buf' is owned by 'hand' */
+    const unsigned char *buf;
+    libxl_yajl_length len;
     yajl_gen_status s;
-    yajl_gen hand;
-    callback_id_pair *elm = NULL;
+    char *ret = NULL;
 
     hand = libxl_yajl_gen_alloc(NULL);
 
@@ -598,7 +597,7 @@ static char *qmp_send_prepare(libxl__gc *gc, libxl__qmp_handler *qmp,
     libxl__yajl_gen_asciiz(hand, "execute");
     libxl__yajl_gen_asciiz(hand, cmd);
     libxl__yajl_gen_asciiz(hand, "id");
-    yajl_gen_integer(hand, ++qmp->last_id_used);
+    yajl_gen_integer(hand, id);
     if (args) {
         libxl__yajl_gen_asciiz(hand, "arguments");
         libxl__json_object_to_yajl_gen(gc, hand, args);
@@ -608,6 +607,32 @@ static char *qmp_send_prepare(libxl__gc *gc, libxl__qmp_handler *qmp,
     s = yajl_gen_get_buf(hand, &buf, &len);
 
     if (s) {
+        goto out;
+    }
+
+    ret = libxl__sprintf(NOGC, "%*.*s\r\n", (int)len, (int)len, buf);
+    len += 2;
+
+    if (len_r)
+        *len_r = len;
+
+out:
+    yajl_gen_free(hand);
+    return ret;
+}
+
+static char *qmp_send_prepare(libxl__gc *gc, libxl__qmp_handler *qmp,
+                              const char *cmd, libxl__json_object *args,
+                              qmp_callback_t callback, void *opaque,
+                              qmp_request_context *context,
+                              size_t *len_r)
+{
+    char *buf;
+    callback_id_pair *elm;
+
+    buf = qmp_prepare_cmd(gc, cmd, args, ++qmp->last_id_used, NULL);
+
+    if (!buf) {
         LOGD(ERROR, qmp->domid, "Failed to generate a qmp command");
         goto out;
     }
@@ -623,13 +648,10 @@ static char *qmp_send_prepare(libxl__gc *gc, libxl__qmp_handler *qmp,
     elm->context = context;
     LIBXL_STAILQ_INSERT_TAIL(&qmp->callback_list, elm, next);
 
-    ret = libxl__strndup(gc, (const char*)buf, len);
-
     LOGD(DEBUG, qmp->domid, "next qmp command: '%s'", buf);
 
 out:
-    yajl_gen_free(hand);
-    return ret;
+    return buf;
 }
 
 static int qmp_send(libxl__qmp_handler *qmp,
@@ -641,7 +663,8 @@ static int qmp_send(libxl__qmp_handler *qmp,
     int rc = -1;
     GC_INIT(qmp->ctx);
 
-    buf = qmp_send_prepare(gc, qmp, cmd, args, callback, opaque, context);
+    buf = qmp_send_prepare(gc, qmp, cmd, args, callback, opaque, context,
+                           NULL);
 
     if (buf == NULL) {
         goto out;
@@ -650,12 +673,10 @@ static int qmp_send(libxl__qmp_handler *qmp,
     if (libxl_write_exactly(qmp->ctx, qmp->qmp_fd, buf, strlen(buf),
                             "QMP command", "QMP socket"))
         goto out;
-    if (libxl_write_exactly(qmp->ctx, qmp->qmp_fd, "\r\n", 2,
-                            "CRLF", "QMP socket"))
-        goto out;
 
     rc = qmp->last_id_used;
 out:
+    free(buf);
     GC_FREE;
     return rc;
 }
-- 
Anthony PERARD


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v6 03/11] libxl_qmp: Change qmp_qemu_check_version to compare version
  2018-11-12 16:49 [PATCH v6 00/11] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_* Anthony PERARD
  2018-11-12 16:49 ` [PATCH v6 01/11] libxl: Enhance libxl__sendmsg_fds to deal with EINTR and EWOULDBLOCK Anthony PERARD
  2018-11-12 16:49 ` [PATCH v6 02/11] libxl_qmp: Separate QMP message generation from qmp_send_prepare Anthony PERARD
@ 2018-11-12 16:49 ` Anthony PERARD
  2018-11-12 17:22   ` Ian Jackson
  2018-11-12 16:49 ` [PATCH v6 04/11] libxl: Design of an async API to issue QMP commands to QEMU Anthony PERARD
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Anthony PERARD @ 2018-11-12 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, Wei Liu

This patch makes the function simpler to read. It also add the ability
for a caller to tell if QEMU is newer or have the exact version.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---

Notes:
    v6:
        new patch

 tools/libxl/libxl_qmp.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index 2dd54d5398..43ae9a0374 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -392,13 +392,27 @@ static int qmp_handle_response(libxl__gc *gc, libxl__qmp_handler *qmp,
     return 0;
 }
 
-static bool qmp_qemu_check_version(libxl__qmp_handler *qmp, int major,
-                                   int minor, int micro)
+/*
+ * return values:
+ *   < 0  if qemu's version <  asked version
+ *   = 0  if qemu's version == asked version
+ *   > 0  if qemu's version >  asked version
+ */
+static int qmp_qemu_compare_version(libxl__qmp_handler *qmp, int major,
+                                    int minor, int micro)
 {
-    return qmp->version.major > major ||
-        (qmp->version.major == major &&
-            (qmp->version.minor > minor ||
-             (qmp->version.minor == minor && qmp->version.micro >= micro)));
+#define CHECK_VERSION(level) do { \
+    if (qmp->version.level > (level)) return +1; \
+    if (qmp->version.level < (level)) return -1; \
+} while (0)
+
+    CHECK_VERSION(major);
+    CHECK_VERSION(minor);
+    CHECK_VERSION(micro);
+
+#undef CHECK_VERSION
+
+    return 0;
 }
 
 /*
@@ -1020,7 +1034,7 @@ int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename, bool live)
 
     /* live parameter was added to QEMU 2.11. It signal QEMU that the save
      * operation is for a live migration rather that for taking a snapshot. */
-    if (qmp_qemu_check_version(qmp, 2, 11, 0))
+    if (qmp_qemu_compare_version(qmp, 2, 11, 0) >= 0)
         qmp_parameters_add_bool(gc, &args, "live", live);
 
     rc = qmp_synchronous_send(qmp, "xen-save-devices-state", args,
-- 
Anthony PERARD


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v6 04/11] libxl: Design of an async API to issue QMP commands to QEMU
  2018-11-12 16:49 [PATCH v6 00/11] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_* Anthony PERARD
                   ` (2 preceding siblings ...)
  2018-11-12 16:49 ` [PATCH v6 03/11] libxl_qmp: Change qmp_qemu_check_version to compare version Anthony PERARD
@ 2018-11-12 16:49 ` Anthony PERARD
  2018-11-12 17:29   ` Ian Jackson
  2018-11-12 16:49 ` [PATCH v6 05/11] libxl_qmp: Implementation of libxl__ev_qmp_* Anthony PERARD
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Anthony PERARD @ 2018-11-12 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, Wei Liu

All the functions will be implemented in later patches.

This patch includes the API that libxl can use to send QMP commands to
QEMU.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---

Notes:
    v6:
        use libxl_domid type for domid instead of plain uin32_t
        avoid the work "chained", rewrite the paragraph about sending one
            cmd after another
        Rewrite the comment about the callback, and explain that on error,
            the `ev` may be Idle or may still be Connected
        Change the carefd to a simple int (field cfd -> fd)
    
    v5:
        some changes in the comment

 tools/libxl/libxl_internal.h | 74 +++++++++++++++++++++++++++++++++++-
 1 file changed, 72 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index ae5960d869..f089524460 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -186,6 +186,8 @@ typedef struct libxl__ao libxl__ao;
 typedef struct libxl__aop_occurred libxl__aop_occurred;
 typedef struct libxl__osevent_hook_nexus libxl__osevent_hook_nexus;
 typedef struct libxl__osevent_hook_nexi libxl__osevent_hook_nexi;
+typedef struct libxl__json_object libxl__json_object;
+typedef struct libxl__carefd libxl__carefd;
 
 typedef struct libxl__domain_create_state libxl__domain_create_state;
 typedef void libxl__domain_create_cb(struct libxl__egc *egc,
@@ -349,6 +351,74 @@ struct libxl__ev_child {
     LIBXL_LIST_ENTRY(struct libxl__ev_child) entry;
 };
 
+/*
+ * QMP asynchronous calls
+ *
+ * This facility allows a command to be sent to QEMU, and the response
+ * to be handed to a callback function.
+ *
+ * Commands can be submited one after an other with the same
+ * connection (e.g. the result from the "add-fd" command need to be
+ * use in a follow-up command before disconnecting from QMP). A
+ * libxl__ev_qmp can be reused when the callback is been called in
+ * order to use the same connection.
+ *
+ * Only one connection at a time can be made to one QEMU, so avoid
+ * keeping a libxl__ev_qmp Connected for to long and call
+ * libxl__ev_qmp_dispose as soon as it is not needed anymore.
+ *
+ * Possible states of a libxl__ev_qmp:
+ *  Undefined
+ *    Might contain anything.
+ *  Idle
+ *    Struct contents are defined enough to pass to any
+ *    libxl__ev_qmp_* function.
+ *    The struct does not contain references to any allocated private
+ *    resources so can be thrown away.
+ *  Active
+ *    Currently waiting for the callback to be called.
+ *    _dispose must be called to reclaim resources.
+ *  Connected
+ *    Struct contain allocated ressources.
+ *    Calling _send() with this same ev will use the same QMP connection.
+ *    _dispose() must be called to reclaim resources.
+ *
+ * libxl__ev_qmp_init: Undefined/Idle -> Idle
+ *
+ * libxl__ev_qmp_send: Idle/Connected -> Active (on error: Idle)
+ *    Sends a command to QEMU.
+ *    callback will be called when a response is received or when an
+ *    error as occured.
+ *
+ * libxl__ev_qmp_dispose: Connected/Active/Idle -> Idle
+ *
+ * callback: When called: Active -> Connected (on error: Idle/Connected)
+ *    When called, ev is Connected and can be reused or disposed of.
+ *    On error, the callback is called with response == NULL and the
+ *    error code in rc. The new state of ev depending on the value of rc:
+ *    - rc == ERROR_QMP_*: This is an error associated with the cmd to
+ *      run, ev is Connected.
+ *    - otherwise: An other error happend, ev is now Idle.
+ *    The callback is only called once.
+ */
+typedef struct libxl__ev_qmp libxl__ev_qmp;
+typedef void libxl__ev_qmp_callback(libxl__egc *egc, libxl__ev_qmp *ev,
+                                    const libxl__json_object *response,
+                                    int rc);
+
+_hidden void libxl__ev_qmp_init(libxl__ev_qmp *ev);
+_hidden int libxl__ev_qmp_send(libxl__gc *gc, libxl__ev_qmp *ev,
+                               const char *cmd, libxl__json_object *args);
+_hidden void libxl__ev_qmp_dispose(libxl__gc *gc, libxl__ev_qmp *ev);
+
+struct libxl__ev_qmp {
+    /* caller should include this in their own struct */
+    /* caller must fill these in, and they must all remain valid */
+    libxl_domid domid;
+    libxl__ev_qmp_callback *callback;
+    int fd; /* set to send a fd with the command, -1 otherwise */
+};
+
 
 /*
  * evgen structures, which are the state we use for generating
@@ -1895,7 +1965,7 @@ typedef enum {
     JSON_ANY     = 255 /* this is a mask of all values above, adjust as needed */
 } libxl__json_node_type;
 
-typedef struct libxl__json_object {
+struct libxl__json_object {
     libxl__json_node_type type;
     union {
         bool b;
@@ -1908,7 +1978,7 @@ typedef struct libxl__json_object {
         flexarray_t *map;
     } u;
     struct libxl__json_object *parent;
-} libxl__json_object;
+};
 
 typedef int (*libxl__json_parse_callback)(libxl__gc *gc,
                                           libxl__json_object *o,
-- 
Anthony PERARD


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v6 05/11] libxl_qmp: Implementation of libxl__ev_qmp_*
  2018-11-12 16:49 [PATCH v6 00/11] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_* Anthony PERARD
                   ` (3 preceding siblings ...)
  2018-11-12 16:49 ` [PATCH v6 04/11] libxl: Design of an async API to issue QMP commands to QEMU Anthony PERARD
@ 2018-11-12 16:49 ` Anthony PERARD
  2018-11-13 11:33   ` Ian Jackson
  2018-11-22 19:04   ` Marek Marczykowski-Górecki
  2018-11-12 16:49 ` [PATCH v6 06/11] libxl_exec: Add libxl__spawn_initiate_failure Anthony PERARD
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 31+ messages in thread
From: Anthony PERARD @ 2018-11-12 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, Wei Liu

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---

Notes:
    v6:
        This is a squash of 7 commits on the previous version:
        - libxl_qmp: Connect to QMP socket
        - libxl_qmp: Implement fd callback and read data
        - libxl_qmp: Parse JSON input from QMP
        - libxl_qmp: Prepare the command to be sent
        - libxl_qmp: Handle write to QMP socket
        - libxl_qmp: Handle messages from QEMU
        - libxl_qmp: Respond to QMP greeting
    
        General rework of the implementation.
        Added more comment, with a description of allowed internal states.
        Check for EINPROGRESS after connect().
        Read until EWOULDBLOCK.
        Handle EWOULDBLOCK on write and sendmsg.
        Using memmem instead of strstr.
        Using memmove instead of having an offset in rx_buf.
        Rework buffer allocation
        Don't feed \r into json parser anymore
        Add a check for a maximum RX buffer size
        Added more error messages
        New error code ERROR_PROTOCOL_ERROR_QMP
        Rewrite conversion of QMP ErrorClass to libxl_error code
        Added helpers: qmp_ev_ensure_reading_writing, qmp_ev_set_state
        Split some functions, squash others
        Added ev->msg* to store generated user command as tx_buf is used during
            connection (for qmp_capabilities)
        Remove qmp_state_greeting
        Added qmp_state_waiting_reply
    
    v5:
        nits
        use a define instead of a static int for QMP_CAPABILITY_NEGOCIATION_MSGID
        use a switch in qmp_ev_callback_writable to check qmp_state
        Add a description of the different value of libxl__qmp_state enum.
        some cleanup
        remove read loop that only handled EINTR, simply return
        initialize len and s at declaration time
        remove old comment
        rename buf_fd to send_fd
        Adding default:abort() in qmp_ev_handle_message.
    
    v4:
        remove use of a linked list of receive buffer, and use realloc instead.
        simplification of the patch due to use of a single allocated space for the
        receive buffer.

 tools/libxl/libxl_internal.h |  35 ++
 tools/libxl/libxl_qmp.c      | 668 +++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_types.idl  |   6 +
 3 files changed, 709 insertions(+)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index f089524460..e02597a0cd 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -411,12 +411,47 @@ _hidden int libxl__ev_qmp_send(libxl__gc *gc, libxl__ev_qmp *ev,
                                const char *cmd, libxl__json_object *args);
 _hidden void libxl__ev_qmp_dispose(libxl__gc *gc, libxl__ev_qmp *ev);
 
+typedef enum {
+    /* initial state */
+    qmp_state_disconnected = 1,
+    /* connected to QMP socket, waiting for greeting message */
+    qmp_state_connecting,
+    /* qmp_capabilities command sent, waiting for reply */
+    qmp_state_capability_negotiation,
+    /* ready to send commands */
+    qmp_state_connected,
+    /* cmd sent, waiting for reply */
+    qmp_state_waiting_reply,
+} libxl__qmp_state;
+
 struct libxl__ev_qmp {
     /* caller should include this in their own struct */
     /* caller must fill these in, and they must all remain valid */
     libxl_domid domid;
     libxl__ev_qmp_callback *callback;
     int fd; /* set to send a fd with the command, -1 otherwise */
+
+    /*
+     * remaining fields are private to libxl_ev_qmp_*
+     */
+
+    int id;
+    libxl__carefd *qmp_cfd;
+    libxl__ev_fd qmp_efd;
+    libxl__qmp_state qmp_state;
+    /* receive buffer, with:
+     * rx_buf_size: current allocated size,
+     * rx_buf_used: actual data in the buffer */
+    char *rx_buf;
+    size_t rx_buf_size;
+    size_t rx_buf_used;
+    /* sending buffer */
+    char *tx_buf;
+    size_t tx_buf_len;
+    size_t tx_buf_off;
+    /* The message to send when ready */
+    char *msg;
+    size_t msg_len;
 };
 
 
diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index 43ae9a0374..895628066a 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -75,11 +75,18 @@
 #  define DEBUG_REPORT_RECEIVED(dom, buf, len) ((void)0)
 #endif
 
+#ifdef DEBUG_QMP_CLIENT
+#  define LOG_QMP(f, ...) LOGD(DEBUG, ev->domid, f, ##__VA_ARGS__)
+#else
+#  define LOG_QMP(f, ...)
+#endif
+
 /*
  * QMP types & constant
  */
 
 #define QMP_RECEIVE_BUFFER_SIZE 4096
+#define QMP_MAX_SIZE_RX_BUF MB(8)
 #define PCI_PT_QDEV_ID "pci-pt-%02x_%02x.%01x"
 
 /*
@@ -1315,6 +1322,667 @@ int libxl__qmp_initializations(libxl__gc *gc, uint32_t domid,
     return ret;
 }
 
+/* ------------ Implementation of libxl__ev_qmp ---------------- */
+
+/*
+ * Possible internal state compared to qmp_state:
+ *
+ * qmp_state  disconnected connecting capability   connected  waiting
+ *                                    _negotiation            _reply
+ * External   Idle         Active     Active       Connected  Active
+ * qmp_cfd    close        open       open         open       open
+ * qmp_efd    Idle         Active     Active       Active     Active
+ * id         reset        set        set          prev[1]    set
+ * rx_buf*    free         used       used         used       used
+ * tx_buf*    free         free       used/free    free       used/free
+ * msg*       free         set        set          free/set   free
+ *
+ * [1] id used on the previously sent command
+ *
+ * Possible buffers states:
+ * - receiving buffer:
+ *                    free   used
+ *   rx_buf           NULL   allocated
+ *   rx_buf_size      0      allocation size of `rx_buf`
+ *   rx_buf_off       0      <= rx_buf_size
+ * - transmitted buffer:
+ *                    free   used
+ *   tx_buf           NULL   contain data
+ *   tx_buf_len       0      size of data
+ *   tx_buf_off       0      <= tx_buf_len
+ * - queued user command:
+ *                    free  set
+ *   msg              NULL  contain data
+ *   msg_len          0     size of data
+ *
+ * - Allowed internal state transition:
+ * disconnected                         -> connecting
+ * connection                           -> capability_negotiation
+ * capability_negotiation/waiting_reply -> connected
+ * connected                            -> waiting_reply
+ * any                                  -> disconnected
+ */
+
+/* hard coded message ID used for capability negotiation ("xlq\0") */
+#define QMP_CAPABILITY_NEGOTIATION_MSGID 0x786c7100
+
+/* prototypes */
+
+static void qmp_ev_fd_callback(libxl__egc *egc, libxl__ev_fd *ev_fd,
+                               int fd, short events, short revents);
+static int qmp_ev_callback_writable(libxl__gc *gc,
+                                    libxl__ev_qmp *ev, int fd);
+static int qmp_ev_callback_readable(libxl__egc *egc,
+                                    libxl__ev_qmp *ev, int fd);
+static int qmp_ev_get_next_msg(libxl__egc *egc, libxl__ev_qmp *ev,
+                               libxl__json_object **o_r);
+static int qmp_ev_handle_message(libxl__egc *egc,
+                                 libxl__ev_qmp *ev,
+                                 const libxl__json_object *resp);
+
+/* helpers */
+
+static void qmp_ev_ensure_reading_writing(libxl__gc *gc, libxl__ev_qmp *ev)
+{
+    bool enable = false;
+    short events;
+
+    if (ev->tx_buf) {
+        enable = true;
+    } else {
+        enable = (ev->qmp_state == qmp_state_connected) && ev->msg;
+    }
+
+    if (enable)
+        events = ev->qmp_efd.events | POLLOUT;
+    else
+        events = ev->qmp_efd.events & ~POLLOUT;
+
+    libxl__ev_fd_modify(gc, &ev->qmp_efd, events);
+}
+
+static void qmp_ev_set_state(libxl__gc *gc, libxl__ev_qmp *ev,
+                             libxl__qmp_state new_state)
+{
+    libxl__qmp_state cur = ev->qmp_state;
+    switch (new_state) {
+    case qmp_state_disconnected:
+        break;
+    case qmp_state_connecting:
+        assert(cur == qmp_state_disconnected);
+        break;
+    case qmp_state_capability_negotiation:
+        assert(cur == qmp_state_connecting);
+        break;
+    case qmp_state_connected:
+        assert(cur == qmp_state_capability_negotiation ||
+               cur == qmp_state_waiting_reply);
+        break;
+    case qmp_state_waiting_reply:
+        assert(cur == qmp_state_connected);
+        break;
+    }
+
+    ev->qmp_state = new_state;
+}
+
+static int qmp_error_class_to_libxl_error_code(libxl__gc *gc,
+                                               const char *eclass)
+{
+    const libxl_enum_string_table *t = libxl_error_string_table;
+
+    /* compare "QMP_GENERIC_ERROR" from libxl_error to "GenericError"
+     * generated by the QMP server */
+
+    for ( ; t->s; t++) {
+            const char *s = eclass;
+            const char *se = t->s;
+        if (strncasecmp(t->s, "QMP_", 4))
+            continue;
+
+        /* skip "QMP_" */
+        se += 4;
+        while (*s && *se) {
+            /* skip underscores */
+            if (*se == '_') {
+                se++;
+                continue;
+            }
+            if (tolower(*s) != tolower(*se))
+                break;
+            s++, se++;
+        }
+        if (!*s && !*se)
+            return t->v;
+    }
+
+    return ERROR_UNKNOWN_QMP_ERROR;
+}
+
+static int qmp_ev_prepare_cmd(libxl__gc *gc,
+                              libxl__ev_qmp *ev,
+                              const char *cmd,
+                              const libxl__json_object *args)
+{
+    char *buf = NULL;
+    size_t len;
+
+    assert(!ev->tx_buf && !ev->tx_buf_len);
+    assert(!ev->msg && !ev->msg_len);
+
+    ev->id++;
+    buf = qmp_prepare_cmd(gc, cmd, args, ev->id, &len);
+    if (!buf) {
+        return ERROR_FAIL;
+    }
+
+    ev->msg = buf;
+    ev->msg_len = len;
+
+    return 0;
+}
+
+/* Setup connection */
+
+static int qmp_ev_connect(libxl__gc *gc, libxl__ev_qmp *ev)
+{
+    int fd;
+    int rc, r;
+    struct sockaddr_un un;
+    const char *qmp_socket_path;
+
+    if (ev->qmp_state != qmp_state_disconnected)
+        return 0;
+
+    qmp_socket_path = libxl__qemu_qmp_path(gc, ev->domid);
+
+    LOGD(DEBUG, ev->domid, "Connecting to %s", qmp_socket_path);
+
+    libxl__carefd_begin();
+    fd = socket(AF_UNIX, SOCK_STREAM, 0);
+    ev->qmp_cfd = libxl__carefd_opened(CTX, fd);
+    if (!ev->qmp_cfd) {
+        LOGED(ERROR, ev->domid, "socket() failed");
+        rc = ERROR_FAIL;
+        goto out;
+    }
+    rc = libxl_fd_set_nonblock(CTX, libxl__carefd_fd(ev->qmp_cfd), 1);
+    if (rc)
+        goto out;
+
+    rc = libxl__prepare_sockaddr_un(gc, &un, qmp_socket_path,
+                                    "QMP socket");
+    if (rc)
+        goto out;
+
+    r = connect(libxl__carefd_fd(ev->qmp_cfd),
+                (struct sockaddr *) &un, sizeof(un));
+    if (r && errno != EINPROGRESS) {
+        LOGED(ERROR, ev->domid, "Failed to connect to QMP socket %s",
+              qmp_socket_path);
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    rc = libxl__ev_fd_register(gc, &ev->qmp_efd, qmp_ev_fd_callback,
+                               libxl__carefd_fd(ev->qmp_cfd), POLLIN);
+    if (rc)
+        goto out;
+
+    qmp_ev_set_state(gc, ev, qmp_state_connecting);
+
+    return 0;
+
+out:
+    libxl__carefd_close(ev->qmp_cfd);
+    ev->qmp_cfd = NULL;
+    return rc;
+}
+
+/* QMP FD callbacks */
+
+static void qmp_ev_fd_callback(libxl__egc *egc, libxl__ev_fd *ev_fd,
+                               int fd, short events, short revents)
+{
+    EGC_GC;
+    int rc;
+    libxl__json_object *o = NULL;
+    libxl__ev_qmp *ev = CONTAINER_OF(ev_fd, *ev, qmp_efd);
+
+    if (revents & (POLLHUP)) {
+        LOGD(ERROR, ev->domid, "received POLLHUP from QMP socket");
+        rc = ERROR_PROTOCOL_ERROR_QMP;
+        goto out;
+    }
+    if (revents & ~(POLLIN|POLLOUT)) {
+        LOGD(ERROR, ev->domid,
+             "unexpected poll event 0x%x on QMP socket (expected POLLIN "
+             "and/or POLLOUT)",
+            revents);
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    if (revents & POLLOUT) {
+        rc = qmp_ev_callback_writable(gc, ev, fd);
+        if (rc)
+            goto out;
+    }
+
+    if (revents & POLLIN) {
+        rc = qmp_ev_callback_readable(egc, ev, fd);
+        if (rc)
+            goto out;
+
+        /* parse input */
+        while (1) {
+            /* parse rx buffer to find one json object */
+            rc = qmp_ev_get_next_msg(egc, ev, &o);
+            if (rc == ERROR_NOTFOUND) {
+                rc = 0;
+                break;
+            } else if (rc)
+                goto out;
+
+            /* Must be last and return when the user callback is called */
+            rc = qmp_ev_handle_message(egc, ev, o);
+            if (rc < 0)
+                goto out;
+            if (rc == 1) {
+                /* user callback has been called */
+                return;
+            }
+        }
+    }
+
+    qmp_ev_ensure_reading_writing(gc, ev);
+
+out:
+    if (rc) {
+        LOGD(ERROR, ev->domid,
+             "Error happend with the QMP connection to QEMU");
+
+        /* On error, deallocate all private ressources */
+        libxl__ev_qmp_dispose(gc, ev);
+
+        /* And tell libxl__ev_qmp user about the error */
+        ev->callback(egc, ev, NULL, rc); /* must be last */
+    }
+}
+
+static int qmp_ev_callback_writable(libxl__gc *gc,
+                                    libxl__ev_qmp *ev, int fd)
+{
+    int rc;
+    ssize_t r;
+
+    if (ev->qmp_state == qmp_state_connected) {
+        assert(!ev->tx_buf);
+        if (ev->msg) {
+            ev->tx_buf = ev->msg;
+            ev->tx_buf_len = ev->msg_len;
+            ev->tx_buf_off = 0;
+            ev->msg = NULL;
+            ev->msg_len = 0;
+            qmp_ev_set_state(gc, ev, qmp_state_waiting_reply);
+        }
+    }
+
+    if (!ev->tx_buf)
+        return 0;
+
+    LOG_QMP("sending: '%.*s'", (int)ev->tx_buf_len, ev->tx_buf);
+
+    /*
+     * We will send a file descriptor associated with a command on the
+     * first byte of this command.
+     */
+    if (ev->qmp_state == qmp_state_waiting_reply &&
+        ev->fd >= 0 &&
+        ev->tx_buf_off == 0) {
+
+        rc = libxl__sendmsg_fds(gc, fd, ev->tx_buf, 1,
+                                1, &ev->fd, "QMP socket");
+        /* Check for EWOULDBLOCK, and return to try again later */
+        if (rc == ERROR_NOT_READY)
+            return 0;
+        if (rc)
+            return rc;
+        ev->tx_buf_off++;
+    }
+
+    while (ev->tx_buf_off < ev->tx_buf_len) {
+        r = write(fd, ev->tx_buf + ev->tx_buf_off,
+                  ev->tx_buf_len - ev->tx_buf_off);
+        if (r < 0) {
+            if (errno == EINTR)
+                continue;
+            if (errno == EWOULDBLOCK)
+                break;
+            LOGED(ERROR, ev->domid, "failed to write to QMP socket");
+            return ERROR_FAIL;
+        }
+        ev->tx_buf_off += r;
+    }
+
+    if (ev->tx_buf_off == ev->tx_buf_len) {
+        free(ev->tx_buf);
+        ev->tx_buf = NULL;
+        ev->tx_buf_len = ev->tx_buf_off = 0;
+    }
+
+    return 0;
+}
+
+static int qmp_ev_callback_readable(libxl__egc *egc,
+                                    libxl__ev_qmp *ev, int fd)
+{
+    EGC_GC;
+
+    while (1) {
+        ssize_t r;
+
+        /* Check if the buffer still have space, or increase size */
+        if (ev->rx_buf_size - ev->rx_buf_used < QMP_RECEIVE_BUFFER_SIZE) {
+            ev->rx_buf_size = max(ev->rx_buf_size * 2,
+                               (size_t)QMP_RECEIVE_BUFFER_SIZE * 2);
+            assert(ev->rx_buf_size <= QMP_MAX_SIZE_RX_BUF);
+            if (ev->rx_buf_size > QMP_MAX_SIZE_RX_BUF) {
+                LOGD(ERROR, ev->domid,
+                     "QMP receive buffer is too big (%ld > %lld)",
+                     ev->rx_buf_size, QMP_MAX_SIZE_RX_BUF);
+                return ERROR_BUFFERFULL;
+            }
+            ev->rx_buf = libxl__realloc(NOGC, ev->rx_buf, ev->rx_buf_size);
+        }
+
+        r = read(fd, ev->rx_buf + ev->rx_buf_used,
+                 ev->rx_buf_size - ev->rx_buf_used);
+        if (r < 0) {
+            if (errno == EINTR) {
+                continue;
+            }
+            if (errno == EWOULDBLOCK) {
+                break;
+            }
+            LOGED(ERROR, ev->domid, "error reading QMP socket");
+            return ERROR_FAIL;
+        }
+
+        if (r == 0) {
+            LOGD(ERROR, ev->domid, "Unexpected EOF on QMP socket");
+            return ERROR_PROTOCOL_ERROR_QMP;
+        }
+
+        LOG_QMP("received %ldB: '%.*s'", r,
+                (int)r, ev->rx_buf + ev->rx_buf_used);
+
+        ev->rx_buf_used += r;
+        assert(ev->rx_buf_used <= ev->rx_buf_size);
+    }
+
+    return 0;
+}
+
+/* Handle messages received from QMP server */
+
+static int qmp_ev_get_next_msg(libxl__egc *egc, libxl__ev_qmp *ev,
+                               libxl__json_object **o_r)
+    /* Find a JSON object and store it in o_r.
+     * return ERROR_NOTFOUND if no object is found.
+     * `o_r` is allocated within `egc`.
+     */
+{
+    EGC_GC;
+    size_t len;
+    char *end = NULL;
+    libxl__json_object *o = NULL;
+
+    if (!ev->rx_buf_used)
+        return ERROR_NOTFOUND;
+
+    /* Search for the end of a QMP message: "\r\n" */
+    end = memmem(ev->rx_buf, ev->rx_buf_used, "\r\n", 2);
+    if (!end)
+        return ERROR_NOTFOUND;
+    len = (end - ev->rx_buf) + 2;
+
+    LOG_QMP("parsing %luB: '%.*s'", len, (int)len, ev->rx_buf);
+
+    /* Replace \r by \0 so that libxl__json_parse can use strlen */
+    ev->rx_buf[len - 2] = '\0';
+    o = libxl__json_parse(gc, ev->rx_buf);
+
+    if (!o) {
+        LOGD(ERROR, ev->domid, "Parse error");
+        return ERROR_PROTOCOL_ERROR_QMP;
+    }
+
+    ev->rx_buf_used -= len;
+    memmove(ev->rx_buf, ev->rx_buf + len, ev->rx_buf_used);
+
+    LOG_QMP("JSON object received: %s",
+            libxl__json_object_to_json(gc, o));
+
+    *o_r = o;
+
+    return 0;
+}
+
+static int qmp_ev_parse_error_messages(libxl__egc *egc,
+                                       libxl__ev_qmp *ev,
+                                       const libxl__json_object *resp)
+{
+    EGC_GC;
+    int rc;
+    const char *s;
+    const libxl__json_object *o;
+    const libxl__json_object *err;
+
+    /*
+     * { "error": { "class": string, "desc": string } }
+     */
+
+    err = libxl__json_map_get("error", resp, JSON_MAP);
+
+    o = libxl__json_map_get("class", err, JSON_STRING);
+    if (!o) {
+        LOGD(ERROR, ev->domid,
+             "Protocol error: missing \"class\" member in error message");
+        return ERROR_PROTOCOL_ERROR_QMP;
+    }
+    s = libxl__json_object_get_string(o);
+    if (s)
+        rc = qmp_error_class_to_libxl_error_code(gc, s);
+    else
+        rc = ERROR_PROTOCOL_ERROR_QMP;
+
+    o = libxl__json_map_get("desc", err, JSON_STRING);
+    if (!o) {
+        LOGD(ERROR, ev->domid,
+             "Protocol error: missing \"desc\" member in error message");
+        return ERROR_PROTOCOL_ERROR_QMP;
+    }
+    s = libxl__json_object_get_string(o);
+    if (s)
+        LOGD(ERROR, ev->domid, "%s", s);
+    else
+        LOGD(ERROR, ev->domid, "Received unexpected error: %s",
+             libxl__json_object_to_json(gc, resp));
+    return rc;
+}
+
+static int qmp_ev_handle_message(libxl__egc *egc,
+                                 libxl__ev_qmp *ev,
+                                 const libxl__json_object *resp)
+    /*
+     * This function will handle every messages sent by the QMP server.
+     * Return values:
+     *   < 0    libxl error code
+     *   0      success
+     *   1      success, but a user callback has been called,
+     *          `ev` should not be used anymore.
+     */
+{
+    EGC_GC;
+    int id;
+    int rc;
+    const libxl__json_object *o;
+    const libxl__json_object *response;
+    libxl__qmp_message_type type = qmp_response_type(resp);
+
+    switch (type) {
+    case LIBXL__QMP_MESSAGE_TYPE_QMP:
+        /* greeting message */
+
+        if (ev->qmp_state != qmp_state_connecting) {
+            LOGD(ERROR, ev->domid,
+                 "Unexpected greeting message received");
+            return ERROR_PROTOCOL_ERROR_QMP;
+        }
+
+        /* Prepare next message to send */
+        assert(!ev->tx_buf);
+        ev->tx_buf = qmp_prepare_cmd(gc, "qmp_capabilities", NULL,
+                              QMP_CAPABILITY_NEGOTIATION_MSGID,
+                              &ev->tx_buf_len);
+        ev->tx_buf_off = 0;
+        qmp_ev_set_state(gc, ev, qmp_state_capability_negotiation);
+
+        return 0;
+
+    case LIBXL__QMP_MESSAGE_TYPE_RETURN:
+    case LIBXL__QMP_MESSAGE_TYPE_ERROR:
+        /*
+         * Reply to a command (success/error) or server error
+         *
+         * In this cases, we are parsing two possibles responses:
+         * - success:
+         * { "return": json-value, "id": int }
+         * - error:
+         * { "error": { "class": string, "desc": string }, "id": int }
+         */
+
+        o = libxl__json_map_get("id", resp, JSON_INTEGER);
+        if (!o) {
+            /*
+             * If "id" isn't present, an error occur on the server before
+             * it has read the "id" provided by libxl.
+             */
+            qmp_ev_parse_error_messages(egc, ev, resp);
+            return ERROR_PROTOCOL_ERROR_QMP;
+        }
+
+        id = libxl__json_object_get_integer(o);
+
+        if (id == QMP_CAPABILITY_NEGOTIATION_MSGID) {
+            /* We have a response to our qmp_capabilities cmd */
+            if (ev->qmp_state != qmp_state_capability_negotiation ||
+                type != LIBXL__QMP_MESSAGE_TYPE_RETURN)
+                goto out_unknown_id;
+            qmp_ev_set_state(gc, ev, qmp_state_connected);
+            return 0;
+        }
+
+        if (ev->qmp_state == qmp_state_waiting_reply &&
+            id == ev->id) {
+            if (type == LIBXL__QMP_MESSAGE_TYPE_RETURN) {
+                response = libxl__json_map_get("return", resp, JSON_ANY);
+                rc = 0;
+            } else {
+                /* error message */
+                response = NULL;
+                rc = qmp_ev_parse_error_messages(egc, ev, resp);
+            }
+            qmp_ev_set_state(gc, ev, qmp_state_connected);
+            ev->callback(egc, ev, response, rc); /* must be last */
+            return 1;
+        }
+
+out_unknown_id:
+        LOGD(ERROR, ev->domid,
+             "Message from QEMU with unexpected id %d: %s",
+             id, libxl__json_object_to_json(gc, resp));
+        return ERROR_PROTOCOL_ERROR_QMP;
+
+    case LIBXL__QMP_MESSAGE_TYPE_EVENT:
+        /* Events are ignored */
+        return 0;
+
+    case LIBXL__QMP_MESSAGE_TYPE_INVALID:
+        LOGD(ERROR, ev->domid, "Unexpected message received: %s",
+             libxl__json_object_to_json(gc, resp));
+        return ERROR_PROTOCOL_ERROR_QMP;
+
+    default:
+        abort();
+    }
+
+    return 0;
+}
+
+/*
+ * libxl__ev_qmp_*
+ */
+
+void libxl__ev_qmp_init(libxl__ev_qmp *ev)
+{
+    ev->id = QMP_CAPABILITY_NEGOTIATION_MSGID + 1;
+
+    ev->qmp_cfd = NULL;
+    libxl__ev_fd_init(&ev->qmp_efd);
+    ev->qmp_state = qmp_state_disconnected;
+
+    ev->rx_buf = NULL;
+    ev->rx_buf_size = ev->rx_buf_used = 0;
+    ev->tx_buf = NULL;
+    ev->tx_buf_len = ev->tx_buf_off = 0;
+
+    ev->msg = NULL;
+    ev->msg_len = 0;
+}
+
+int libxl__ev_qmp_send(libxl__gc *gc, libxl__ev_qmp *ev,
+                       const char *cmd, libxl__json_object *args)
+{
+    int rc;
+
+    LOGD(DEBUG, ev->domid, " ev %p, cmd '%s'", ev, cmd);
+
+    assert(ev->qmp_state == qmp_state_disconnected ||
+           ev->qmp_state == qmp_state_connected);
+
+    /* Connect to QEMU if not already connected */
+    rc = qmp_ev_connect(gc, ev);
+    if (rc)
+        goto out;
+
+    rc = qmp_ev_prepare_cmd(gc, ev, cmd, args);
+    if (rc)
+        goto out;
+
+    qmp_ev_ensure_reading_writing(gc, ev);
+
+out:
+    if (rc)
+        libxl__ev_qmp_dispose(gc, ev);
+    return rc;
+}
+
+void libxl__ev_qmp_dispose(libxl__gc *gc, libxl__ev_qmp *ev)
+{
+    LOGD(DEBUG, ev->domid, " ev %p", ev);
+
+    free(ev->rx_buf);
+    free(ev->tx_buf);
+    free(ev->msg);
+
+    libxl__ev_fd_deregister(gc, &ev->qmp_efd);
+    libxl__carefd_close(ev->qmp_cfd);
+
+    libxl__ev_qmp_init(ev);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 3b8f967651..fec42b260c 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -69,6 +69,12 @@ libxl_error = Enumeration("error", [
     (-23, "NOTFOUND"),
     (-24, "DOMAIN_DESTROYED"), # Target domain ceased to exist during op
     (-25, "FEATURE_REMOVED"), # For functionality that has been removed
+    (-26, "PROTOCOL_ERROR_QMP"),
+    (-27, "UNKNOWN_QMP_ERROR"),
+    (-28, "QMP_GENERIC_ERROR"), # unspecified qmp error
+    (-29, "QMP_COMMAND_NOT_FOUND"), # the requested command has not been found
+    (-30, "QMP_DEVICE_NOT_ACTIVE"), # a device has failed to be become active
+    (-31, "QMP_DEVICE_NOT_FOUND"), # the requested device has not been found
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
-- 
Anthony PERARD


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v6 06/11] libxl_exec: Add libxl__spawn_initiate_failure
  2018-11-12 16:49 [PATCH v6 00/11] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_* Anthony PERARD
                   ` (4 preceding siblings ...)
  2018-11-12 16:49 ` [PATCH v6 05/11] libxl_qmp: Implementation of libxl__ev_qmp_* Anthony PERARD
@ 2018-11-12 16:49 ` Anthony PERARD
  2018-11-12 17:34   ` Ian Jackson
  2018-11-12 16:49 ` [PATCH v6 07/11] libxl_dm: Pre-open QMP socket for QEMU Anthony PERARD
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Anthony PERARD @ 2018-11-12 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, Wei Liu

This function can be used by user of libxl__spawn_* when they setup a
notification other than xenstore. The parent can already report success
via libxl__spawn_initiate_detach(), this new function can be used for
failure instead of waiting for the timeout.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---

Notes:
    v6:
        long line fix
        typos fixed
        fix leak of internal state into external doc
        if the function is call multiple times, set rc only the first time.
    
    v5:
        fix typos

 tools/libxl/libxl_exec.c     | 11 +++++++++++
 tools/libxl/libxl_internal.h | 23 ++++++++++++++++++++++-
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_exec.c b/tools/libxl/libxl_exec.c
index 02e6c917f0..bf87bc563c 100644
--- a/tools/libxl/libxl_exec.c
+++ b/tools/libxl/libxl_exec.c
@@ -373,6 +373,17 @@ void libxl__spawn_initiate_detach(libxl__gc *gc, libxl__spawn_state *ss)
     spawn_detach(gc, ss);
 }
 
+void libxl__spawn_initiate_failure(libxl__gc *gc, libxl__spawn_state *ss,
+                                   int rc)
+/* The spawn state must be Attached on entry and will be Attached Failed
+ * on return.  */
+{
+    assert(rc);
+    if (!ss->rc)
+        ss->rc = rc;
+    spawn_detach(gc, ss);
+}
+
 static void spawn_fail(libxl__egc *egc, libxl__spawn_state *ss, int rc)
 /* Caller must have logged.  Must be last thing in calling function,
  * as it may make the callback.  Precondition: Attached or Detaching. */
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index e02597a0cd..6c118ccb3b 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1552,7 +1552,8 @@ _hidden void libxl__spawn_init(libxl__spawn_state*);
  *
  * The inner child must soon exit or exec.  It must also soon exit or
  * notify the parent of its successful startup by writing to the
- * xenstore path xspath.
+ * xenstore path xspath OR via other means that the parent will have
+ * to set up.
  *
  * The user (in the parent) will be called back (confirm_cb) every
  * time that xenstore path is modified.
@@ -1608,6 +1609,26 @@ _hidden int libxl__spawn_spawn(libxl__egc *egc, libxl__spawn_state *spawn);
  */
 _hidden void libxl__spawn_initiate_detach(libxl__gc *gc, libxl__spawn_state*);
 
+/*
+ * libxl__spawn_initiate_failure - Propagate failure from the caller to the
+ * callee.
+ *
+ * Works by killing the intermediate process from spawn_spawn.
+ * After this function returns, a failure will be reported.
+ *
+ * This is not synchronous: there will be a further callback when
+ * the detach is complete.
+ *
+ * Logs errors.
+ *
+ * The spawn state must be Attached on entry and will remain Attached. It
+ * is possible for a spawn to fail for multiple reasons, for example
+ * call(s) to libxl__spawn_initiate_failure and also for some other reason.
+ * In that case the first rc value from any source will take precedence.
+ */
+_hidden void libxl__spawn_initiate_failure(libxl__gc *gc,
+                                           libxl__spawn_state *ss, int rc);
+
 /*
  * If successful, this should return 0.
  *
-- 
Anthony PERARD


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v6 07/11] libxl_dm: Pre-open QMP socket for QEMU
  2018-11-12 16:49 [PATCH v6 00/11] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_* Anthony PERARD
                   ` (5 preceding siblings ...)
  2018-11-12 16:49 ` [PATCH v6 06/11] libxl_exec: Add libxl__spawn_initiate_failure Anthony PERARD
@ 2018-11-12 16:49 ` Anthony PERARD
  2018-11-16 11:52   ` Ian Jackson
  2018-11-12 16:49 ` [PATCH v6 08/11] libxl: QEMU startup sync based on QMP Anthony PERARD
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Anthony PERARD @ 2018-11-12 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, Wei Liu

This patch move the creation of the QMP unix socket from QEMU to libxl.
But libxl doesn't rely on this yet.

When starting QEMU with dm_restrict=1, pre-open the QMP socket before
exec QEMU. That socket will be usefull to findout if QEMU is ready, and
pre-opening it means that libxl can connect to it without waiting for
QEMU to create it.

The pre-openning is conditionnal, based on the use of dm_restrict
because it is using a new command line option of QEMU, and dm_restrict
support in QEMU is newer.

-chardev socket,fd=X is available with QEMU 2.12, since commit:
> char: allow passing pre-opened socket file descriptor at startup
> 0935700f8544033ebbd41e1f13cd528f8a58d24d

dm_restrict is available in QEMU 3.0.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---

Notes:
    v6:
        move dm_monitor_fd into libxl_domain_build_info (or d_state)
        -> move the creation of the socket into libxl__spawn_local_dm
           instead of libxl__build_device_model_args
        Use libxl_domid type instead of int for libxl__pre_open_qmp_socket()
        Check function calls (bind and listen) return value in a separate statement.
        typo and other coding style issue fixes
    
    v5:
        use libxl__remove_file
        few changes in coding style
        remove stale includes (sys/socket, sys/un) which are now in libxl_internal.h
    
    v4:
        separate the logic to open a socket into a function.
        Use libxl__prepare_sockaddr_un() to check path size

 tools/libxl/libxl_create.c   |  3 ++
 tools/libxl/libxl_dm.c       | 71 ++++++++++++++++++++++++++++++++++--
 tools/libxl/libxl_internal.h |  1 +
 3 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index fa573344bc..fcbe36feba 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -539,6 +539,9 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
     libxl_domain_create_info *info = &d_config->c_info;
     libxl_domain_build_info *b_info = &d_config->b_info;
 
+    /* Attempt to initialise libxl__domain_build_state */
+    state->dm_monitor_fd = -1;
+
     uuid_string = libxl__uuid2string(gc, info->uuid);
     if (!uuid_string) {
         rc = ERROR_NOMEM;
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 9c47060473..3bf1e37894 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -910,6 +910,53 @@ static char *qemu_disk_ide_drive_string(libxl__gc *gc, const char *target_path,
     return drive;
 }
 
+static int libxl__pre_open_qmp_socket(libxl__gc *gc, libxl_domid domid,
+                                      int *fd_r)
+{
+    int rc, r;
+    int fd;
+    struct sockaddr_un un;
+    const char *path = libxl__qemu_qmp_path(gc, domid);
+
+    assert(fd_r != NULL);
+
+    fd = socket(AF_UNIX, SOCK_STREAM, 0);
+    if (fd < 0) {
+        LOGED(ERROR, domid, "socket() failed");
+        return ERROR_FAIL;
+    }
+
+    rc = libxl__prepare_sockaddr_un(gc, &un, path, "QEMU's QMP socket");
+    if (rc)
+        goto out;
+
+    rc = libxl__remove_file(gc, path);
+    if (rc)
+        goto out;
+
+    r = bind(fd, (struct sockaddr *) &un, sizeof(un));
+    if (r < 0) {
+        LOGED(ERROR, domid, "bind('%s') failed", path);
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    r = listen(fd, 1);
+    if (r < 0) {
+        LOGED(ERROR, domid, "listen() failed");
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    *fd_r = fd;
+    rc = 0;
+
+out:
+    if (rc && fd >= 0)
+        close(fd);
+    return rc;
+}
+
 static int libxl__build_device_model_args_new(libxl__gc *gc,
                                         const char *dm, int guest_domid,
                                         const libxl_domain_config *guest_config,
@@ -944,10 +991,16 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
                       GCSPRINTF("%d", guest_domid), NULL);
 
     flexarray_append(dm_args, "-chardev");
-    flexarray_append(dm_args,
-                     GCSPRINTF("socket,id=libxl-cmd,"
-                               "path=%s,server,nowait",
-                               libxl__qemu_qmp_path(gc, guest_domid)));
+    if (state->dm_monitor_fd >= 0) {
+        flexarray_append(dm_args,
+            GCSPRINTF("socket,id=libxl-cmd,fd=%d,server,nowait",
+                      state->dm_monitor_fd));
+    } else {
+        flexarray_append(dm_args,
+                         GCSPRINTF("socket,id=libxl-cmd,"
+                                   "path=%s,server,nowait",
+                                   libxl__qemu_qmp_path(gc, guest_domid)));
+    }
 
     flexarray_append(dm_args, "-no-shutdown");
     flexarray_append(dm_args, "-mon");
@@ -2000,6 +2053,7 @@ void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)
     if (ret)
         goto out;
 
+    d_state->dm_monitor_fd = -1;
     ret = libxl__build_device_model_args(gc, "stubdom-dm", guest_domid,
                                          guest_config, &args, NULL,
                                          d_state, NULL);
@@ -2303,6 +2357,14 @@ void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state *dmss)
         rc = ERROR_FAIL;
         goto out;
     }
+    if (b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN &&
+        libxl_defbool_val(b_info->dm_restrict)) {
+        /* If we have to use dm_restrict, QEMU needs to be new enough
+         * and will have the new interface where we can pre-open the
+         * QMP socket. */
+        rc = libxl__pre_open_qmp_socket(gc, domid, &state->dm_monitor_fd);
+        if (rc) goto out;
+    }
     rc = libxl__build_device_model_args(gc, dm, domid, guest_config,
                                           &args, &envs, state,
                                           &dm_state_fd);
@@ -2408,6 +2470,7 @@ out_close:
     if (logfile_w >= 0) close(logfile_w);
 out:
     if (dm_state_fd >= 0) close(dm_state_fd);
+    if (state->dm_monitor_fd >= 0) close(state->dm_monitor_fd);
     if (rc)
         device_model_spawn_outcome(egc, dmss, rc);
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 6c118ccb3b..b768d1b09f 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1233,6 +1233,7 @@ typedef struct {
     char *console_tty;
 
     char *saved_state;
+    int dm_monitor_fd;
 
     libxl__file_reference pv_kernel;
     libxl__file_reference pv_ramdisk;
-- 
Anthony PERARD


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v6 08/11] libxl: QEMU startup sync based on QMP
  2018-11-12 16:49 [PATCH v6 00/11] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_* Anthony PERARD
                   ` (6 preceding siblings ...)
  2018-11-12 16:49 ` [PATCH v6 07/11] libxl_dm: Pre-open QMP socket for QEMU Anthony PERARD
@ 2018-11-12 16:49 ` Anthony PERARD
  2018-11-16 12:14   ` Ian Jackson
  2018-11-12 16:49 ` [PATCH v6 09/11] libxl_qmp: Store advertised QEMU version in libxl__ev_qmp Anthony PERARD
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 31+ messages in thread
From: Anthony PERARD @ 2018-11-12 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, Wei Liu

This is only activated when dm_restrict=1, as explained in the previous
patch "libxl_dm: Pre-open QMP socket for QEMU"

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---

Notes:
    v6:
        invent ERROR_QEMU_API
        return better rc: ERROR_QEMU_API or ERROR_NOT_READY
        enhance log messages (debug and error)
    
    v5:
        removed empty success branch in device_model_qmp_cb()
        call libxl__ev_qmp_init() earlier in libxl__spawn_local_dm.
            otherwise the error path would use an uninitialised
            libxl__ev_qmp.
    
    v4:
        moved to libxl__dm_spawn_* from libxl__spawn_*

 tools/libxl/libxl_dm.c       | 55 ++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.h |  1 +
 tools/libxl/libxl_types.idl  |  1 +
 3 files changed, 57 insertions(+)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 3bf1e37894..2417abe651 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -2313,6 +2313,9 @@ static void device_model_startup_failed(libxl__egc *egc,
                                         int rc);
 static void device_model_detached(libxl__egc *egc,
                                   libxl__spawn_state *spawn);
+static void device_model_qmp_cb(libxl__egc *egc, libxl__ev_qmp *ev,
+                                const libxl__json_object *response,
+                                int rc);
 
 /* our "next step" function, called from those callbacks and elsewhere */
 static void device_model_spawn_outcome(libxl__egc *egc,
@@ -2343,6 +2346,8 @@ void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state *dmss)
     const char *dm;
     int dm_state_fd = -1;
 
+    libxl__ev_qmp_init(&dmss->qmp);
+
     if (libxl_defbool_val(b_info->device_model_stubdomain)) {
         abort();
     }
@@ -2450,6 +2455,16 @@ retry_transaction:
     spawn->failure_cb = device_model_startup_failed;
     spawn->detached_cb = device_model_detached;
 
+    if (state->dm_monitor_fd >= 0) {
+        /* There is a valid QMP socket available now,
+         * use it to find out when QEMU is ready */
+        dmss->qmp.callback = device_model_qmp_cb;
+        dmss->qmp.domid = domid;
+        dmss->qmp.fd = -1;
+        rc = libxl__ev_qmp_send(gc, &dmss->qmp, "query-status", NULL);
+        if (rc) goto out_close;
+    }
+
     rc = libxl__spawn_spawn(egc, spawn);
     if (rc < 0)
         goto out_close;
@@ -2524,6 +2539,44 @@ static void device_model_detached(libxl__egc *egc,
     device_model_spawn_outcome(egc, dmss, 0);
 }
 
+static void device_model_qmp_cb(libxl__egc *egc, libxl__ev_qmp *ev,
+                                const libxl__json_object *response,
+                                int rc)
+{
+    EGC_GC;
+    libxl__dm_spawn_state *dmss = CONTAINER_OF(ev, *dmss, qmp);
+    const libxl__json_object *o;
+    const char *status;
+
+    libxl__ev_qmp_dispose(gc, ev);
+
+    if (rc)
+        goto failed;
+
+    o = libxl__json_map_get("status", response, JSON_STRING);
+    if (!o) {
+        LOGD(ERROR, ev->domid,
+             "Missing \"status\" in response to \"query-status\"");
+        LOGD(DEBUG, ev->domid, ".. instead, got: %s",
+             libxl__json_object_to_json(gc, response));
+        rc = ERROR_QEMU_API;
+        goto failed;
+    }
+    status = libxl__json_object_get_string(o);
+    if (strcmp(status, "running")) {
+        LOGD(ERROR, ev->domid, "Unexpected QEMU status: %s", status);
+        rc = ERROR_NOT_READY;
+        goto failed;
+    }
+
+    libxl__spawn_initiate_detach(gc, &dmss->spawn);
+    return;
+
+failed:
+    LOGD(ERROR, ev->domid, "QEMU did not start properly, rc=%d", rc);
+    libxl__spawn_initiate_failure(gc, &dmss->spawn, rc);
+}
+
 static void device_model_spawn_outcome(libxl__egc *egc,
                                        libxl__dm_spawn_state *dmss,
                                        int rc)
@@ -2547,6 +2600,8 @@ static void device_model_spawn_outcome(libxl__egc *egc,
         }
     }
 
+    libxl__ev_qmp_dispose(gc, &dmss->qmp);
+
  out:
     dmss->callback(egc, dmss, rc);
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b768d1b09f..de3862c839 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3898,6 +3898,7 @@ struct libxl__dm_spawn_state {
     libxl_domain_config *guest_config;
     libxl__domain_build_state *build_state; /* relates to guest_domid */
     libxl__dm_spawn_cb *callback;
+    libxl__ev_qmp qmp;
 };
 
 _hidden void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state*);
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index fec42b260c..a0912718e0 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -75,6 +75,7 @@ libxl_error = Enumeration("error", [
     (-29, "QMP_COMMAND_NOT_FOUND"), # the requested command has not been found
     (-30, "QMP_DEVICE_NOT_ACTIVE"), # a device has failed to be become active
     (-31, "QMP_DEVICE_NOT_FOUND"), # the requested device has not been found
+    (-32, "QEMU_API"),
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
-- 
Anthony PERARD


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v6 09/11] libxl_qmp: Store advertised QEMU version in libxl__ev_qmp
  2018-11-12 16:49 [PATCH v6 00/11] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_* Anthony PERARD
                   ` (7 preceding siblings ...)
  2018-11-12 16:49 ` [PATCH v6 08/11] libxl: QEMU startup sync based on QMP Anthony PERARD
@ 2018-11-12 16:49 ` Anthony PERARD
  2018-11-16 12:17   ` Ian Jackson
  2018-11-12 16:49 ` [PATCH v6 10/11] libxl: Change libxl__domain_suspend_device_model() to be async Anthony PERARD
  2018-11-12 16:49 ` [PATCH v6 11/11] libxl: Re-implement domain_suspend_device_model using libxl__ev_qmp Anthony PERARD
  10 siblings, 1 reply; 31+ messages in thread
From: Anthony PERARD @ 2018-11-12 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, Wei Liu

This will be used in a later patch.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---

Notes:
    v6:
        new local macro GRAB_VERSION
        better definition of qemu_version field in libxl_internal.h
    
    v5:
        initialise qemu_version struct in libxl__ev_qmp_init

 tools/libxl/libxl_internal.h |  8 ++++++++
 tools/libxl/libxl_qmp.c      | 23 +++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index de3862c839..53814be9d7 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -431,6 +431,14 @@ struct libxl__ev_qmp {
     libxl__ev_qmp_callback *callback;
     int fd; /* set to send a fd with the command, -1 otherwise */
 
+    /* read-only when Connected
+     * and not to be accessed by the caller otherwise */
+    struct {
+        int major;
+        int minor;
+        int micro;
+    } qemu_version;
+
     /*
      * remaining fields are private to libxl_ev_qmp_*
      */
diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index 895628066a..5d3984e6a5 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -1841,6 +1841,25 @@ static int qmp_ev_handle_message(libxl__egc *egc,
             return ERROR_PROTOCOL_ERROR_QMP;
         }
 
+        /*
+         * Store advertised QEMU version
+         * { "QMP": { "version": {
+         *     "qemu": { "major": int, "minor": int, "micro": int } } } }
+         */
+        o = libxl__json_map_get("QMP", resp, JSON_MAP);
+        o = libxl__json_map_get("version", o, JSON_MAP);
+        o = libxl__json_map_get("qemu", o, JSON_MAP);
+#define GRAB_VERSION(level) \
+    ev->qemu_version.level = libxl__json_object_get_integer( \
+        libxl__json_map_get(#level, o, JSON_INTEGER));
+        GRAB_VERSION(major);
+        GRAB_VERSION(minor);
+        GRAB_VERSION(micro);
+#undef GRAB_VERSION
+        LOGD(DEBUG, ev->domid, "QEMU version: %d.%d.%d",
+             ev->qemu_version.major, ev->qemu_version.minor,
+             ev->qemu_version.micro);
+
         /* Prepare next message to send */
         assert(!ev->tx_buf);
         ev->tx_buf = qmp_prepare_cmd(gc, "qmp_capabilities", NULL,
@@ -1927,6 +1946,10 @@ out_unknown_id:
 
 void libxl__ev_qmp_init(libxl__ev_qmp *ev)
 {
+    ev->qemu_version.major = -1;
+    ev->qemu_version.minor = -1;
+    ev->qemu_version.micro = -1;
+
     ev->id = QMP_CAPABILITY_NEGOTIATION_MSGID + 1;
 
     ev->qmp_cfd = NULL;
-- 
Anthony PERARD


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v6 10/11] libxl: Change libxl__domain_suspend_device_model() to be async
  2018-11-12 16:49 [PATCH v6 00/11] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_* Anthony PERARD
                   ` (8 preceding siblings ...)
  2018-11-12 16:49 ` [PATCH v6 09/11] libxl_qmp: Store advertised QEMU version in libxl__ev_qmp Anthony PERARD
@ 2018-11-12 16:49 ` Anthony PERARD
  2018-11-12 16:49 ` [PATCH v6 11/11] libxl: Re-implement domain_suspend_device_model using libxl__ev_qmp Anthony PERARD
  10 siblings, 0 replies; 31+ messages in thread
From: Anthony PERARD @ 2018-11-12 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, Wei Liu

This create an extra step for the two call sites of the function.

libxl__domain_suspend_device_model() in this patch gets an extra error
variable (there is ret and rc), but ret goes away in the next patch.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>

---
libxl_domain_soft_reset() haven't been tested, as it doesn't appear to
possible to call the function from xl.
---

Notes:
    v6:
        fix multiple way to report errors,
        libxl__domain_suspend_device_model will now only report via
        callbacks, and return void
        add rc in libxl__domain_suspend_device_model (ret isn't a proper rc
        as libxl__qmp_save don't return one)

 tools/libxl/libxl_create.c      | 35 +++++++++++++++++++++-----------
 tools/libxl/libxl_dom_suspend.c | 36 ++++++++++++++++++++-------------
 tools/libxl/libxl_internal.h    |  7 +++++--
 3 files changed, 50 insertions(+), 28 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index fcbe36feba..47d95dda4a 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1759,6 +1759,9 @@ error:
     domcreate_complete(egc, &cdcs->dcs, rc);
 }
 
+static void soft_reset_dm_suspended(libxl__egc *egc,
+                                    libxl__domain_suspend_state *dsps,
+                                    int rc);
 static int do_domain_soft_reset(libxl_ctx *ctx,
                                 libxl_domain_config *d_config,
                                 uint32_t domid_soft_reset,
@@ -1841,11 +1844,24 @@ static int do_domain_soft_reset(libxl_ctx *ctx,
         goto out;
     }
 
-    rc = libxl__domain_suspend_device_model(gc, &dss->dsps);
-    if (rc) {
-        LOGD(ERROR, domid_soft_reset, "failed to suspend device model.");
-        goto out;
-    }
+    dss->dsps.ao = ao;
+    dss->dsps.callback_device_model_done = soft_reset_dm_suspended;
+    libxl__domain_suspend_device_model(egc, &dss->dsps); /* must be last */
+
+    return AO_INPROGRESS;
+
+ out:
+    return AO_CREATE_FAIL(rc);
+}
+
+static void soft_reset_dm_suspended(libxl__egc *egc,
+                                    libxl__domain_suspend_state *dsps,
+                                    int rc)
+{
+    STATE_AO_GC(dsps->ao);
+    libxl__domain_soft_reset_state *srs =
+        CONTAINER_OF(dsps, *srs, dss.dsps);
+    libxl__app_domain_create_state *cdcs = &srs->cdcs;
 
     /*
      * Ask all backends to disconnect by removing the domain from
@@ -1853,18 +1869,13 @@ static int do_domain_soft_reset(libxl_ctx *ctx,
      * xenstore again with probably different store/console/...
      * channels.
      */
-    xs_release_domain(ctx->xsh, cdcs->dcs.domid_soft_reset);
+    xs_release_domain(CTX->xsh, cdcs->dcs.domid_soft_reset);
 
     srs->dds.ao = ao;
-    srs->dds.domid = domid_soft_reset;
+    srs->dds.domid = cdcs->dcs.domid_soft_reset;
     srs->dds.callback = domain_soft_reset_cb;
     srs->dds.soft_reset = true;
     libxl__domain_destroy(egc, &srs->dds);
-
-    return AO_INPROGRESS;
-
- out:
-    return AO_CREATE_FAIL(rc);
 }
 
 static void domain_create_cb(libxl__egc *egc,
diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
index 1e904bae8a..f8ff5cf0c5 100644
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -68,10 +68,12 @@ out:
 
 /*----- callbacks, called by xc_domain_save -----*/
 
-int libxl__domain_suspend_device_model(libxl__gc *gc,
+void libxl__domain_suspend_device_model(libxl__egc *egc,
                                        libxl__domain_suspend_state *dsps)
 {
+    STATE_AO_GC(dsps->ao);
     int ret = 0;
+    int rc = 0;
     uint32_t const domid = dsps->domid;
     const char *const filename = dsps->dm_savefile;
 
@@ -83,18 +85,29 @@ int libxl__domain_suspend_device_model(libxl__gc *gc,
         break;
     }
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-        if (libxl__qmp_stop(gc, domid))
-            return ERROR_FAIL;
+        ret = libxl__qmp_stop(gc, domid);
+        if (ret) {
+            rc = ERROR_FAIL;
+            goto out;
+        }
         /* Save DM state into filename */
         ret = libxl__qmp_save(gc, domid, filename, dsps->live);
-        if (ret)
+        if (ret) {
+            rc = ERROR_FAIL;
             unlink(filename);
+            goto out;
+        }
         break;
     default:
-        return ERROR_INVAL;
+        rc = ERROR_INVAL;
+        goto out;
     }
 
-    return ret;
+out:
+    if (rc)
+        LOGD(ERROR, dsps->domid,
+             "failed to suspend device model, rc=%d", rc);
+    dsps->callback_device_model_done(egc, dsps, rc); /* must be last */
 }
 
 static void domain_suspend_common_wait_guest(libxl__egc *egc,
@@ -371,20 +384,15 @@ static void domain_suspend_common_guest_suspended(libxl__egc *egc,
                                          libxl__domain_suspend_state *dsps)
 {
     STATE_AO_GC(dsps->ao);
-    int rc;
 
     libxl__ev_evtchn_cancel(gc, &dsps->guest_evtchn);
     libxl__ev_xswatch_deregister(gc, &dsps->guest_watch);
     libxl__ev_time_deregister(gc, &dsps->guest_timeout);
 
     if (dsps->type == LIBXL_DOMAIN_TYPE_HVM) {
-        rc = libxl__domain_suspend_device_model(gc, dsps);
-        if (rc) {
-            LOGD(ERROR, dsps->domid,
-                 "libxl__domain_suspend_device_model failed ret=%d", rc);
-            domain_suspend_common_done(egc, dsps, rc);
-            return;
-        }
+        dsps->callback_device_model_done = domain_suspend_common_done;
+        libxl__domain_suspend_device_model(egc, dsps); /* must be last */
+        return;
     }
     domain_suspend_common_done(egc, dsps, 0);
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 53814be9d7..28af6d866c 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3414,6 +3414,8 @@ struct libxl__domain_suspend_state {
     libxl__ev_time guest_timeout;
 
     const char *dm_savefile;
+    void (*callback_device_model_done)(libxl__egc*,
+                              struct libxl__domain_suspend_state*, int rc);
     void (*callback_common_done)(libxl__egc*,
                                  struct libxl__domain_suspend_state*, int ok);
 };
@@ -4035,8 +4037,9 @@ static inline bool libxl__save_helper_inuse(const libxl__save_helper_state *shs)
     return libxl__ev_child_inuse(&shs->child);
 }
 
-/* Each time the dm needs to be saved, we must call suspend and then save */
-_hidden int libxl__domain_suspend_device_model(libxl__gc *gc,
+/* Each time the dm needs to be saved, we must call suspend and then save
+ * calls dsps->callback_device_model_done when done */
+_hidden void libxl__domain_suspend_device_model(libxl__egc *egc,
                                            libxl__domain_suspend_state *dsps);
 
 _hidden const char *libxl__device_model_savefile(libxl__gc *gc, uint32_t domid);
-- 
Anthony PERARD


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v6 11/11] libxl: Re-implement domain_suspend_device_model using libxl__ev_qmp
  2018-11-12 16:49 [PATCH v6 00/11] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_* Anthony PERARD
                   ` (9 preceding siblings ...)
  2018-11-12 16:49 ` [PATCH v6 10/11] libxl: Change libxl__domain_suspend_device_model() to be async Anthony PERARD
@ 2018-11-12 16:49 ` Anthony PERARD
  10 siblings, 0 replies; 31+ messages in thread
From: Anthony PERARD @ 2018-11-12 16:49 UTC (permalink / raw)
  To: xen-devel; +Cc: Anthony PERARD, Ian Jackson, Wei Liu

The re-implementation is done because we want to be able to send the
file description that QEMU can use to save its state. When QEMU is
restricted, it would not be able to write to a path.

This replace both libxl__qmp_stop() and libxl__qmp_save().

qmp_qemu_check_version() was only used by libxl__qmp_save(), so it is
replace by a version using libxl__ev_qmp instead.

Coding style fixed in libxl__domain_suspend_device_model() for the
return value.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---

Notes:
    v6:
        extract open call from libxl__carefd_opened to respect coding style
        libxl__qmp_suspend_save now always report success/error
            via dsps->callback_device_model_done
    
    v5:
        rename goto 'out' label to 'error', as it is use only for errors.
        re-add/keep the comment about the "live" parameter in dm_state_fd_ready
        use libxl__remove_file instead of plain unlink
    
    v4:
        This patch replace the patch "libxl_qmp: Have QEMU save its state to a file
        descriptor" from previous version of the serie.
        It uses libxl__ev_qmp instead.

 tools/libxl/libxl_dom_suspend.c |  19 +---
 tools/libxl/libxl_internal.h    |  10 +-
 tools/libxl/libxl_qmp.c         | 159 +++++++++++++++++++++++++-------
 3 files changed, 137 insertions(+), 51 deletions(-)

diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
index f8ff5cf0c5..d1af3a6573 100644
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -34,6 +34,7 @@ int libxl__domain_suspend_init(libxl__egc *egc,
     libxl__ev_evtchn_init(&dsps->guest_evtchn);
     libxl__ev_xswatch_init(&dsps->guest_watch);
     libxl__ev_time_init(&dsps->guest_timeout);
+    libxl__ev_qmp_init(&dsps->qmp);
 
     if (type == LIBXL_DOMAIN_TYPE_INVALID) goto out;
     dsps->type = type;
@@ -72,7 +73,6 @@ void libxl__domain_suspend_device_model(libxl__egc *egc,
                                        libxl__domain_suspend_state *dsps)
 {
     STATE_AO_GC(dsps->ao);
-    int ret = 0;
     int rc = 0;
     uint32_t const domid = dsps->domid;
     const char *const filename = dsps->dm_savefile;
@@ -85,19 +85,9 @@ void libxl__domain_suspend_device_model(libxl__egc *egc,
         break;
     }
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-        ret = libxl__qmp_stop(gc, domid);
-        if (ret) {
-            rc = ERROR_FAIL;
-            goto out;
-        }
-        /* Save DM state into filename */
-        ret = libxl__qmp_save(gc, domid, filename, dsps->live);
-        if (ret) {
-            rc = ERROR_FAIL;
-            unlink(filename);
-            goto out;
-        }
-        break;
+        /* calls dsps->callback_device_model_done when done */
+        libxl__qmp_suspend_save(egc, dsps); /* must be last */
+        return;
     default:
         rc = ERROR_INVAL;
         goto out;
@@ -406,6 +396,7 @@ static void domain_suspend_common_done(libxl__egc *egc,
     libxl__ev_evtchn_cancel(gc, &dsps->guest_evtchn);
     libxl__ev_xswatch_deregister(gc, &dsps->guest_watch);
     libxl__ev_time_deregister(gc, &dsps->guest_timeout);
+    libxl__ev_qmp_dispose(gc, &dsps->qmp);
     dsps->callback_common_done(egc, dsps, rc);
 }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 28af6d866c..6ec33f6eb6 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1947,13 +1947,8 @@ _hidden int libxl__qmp_pci_del(libxl__gc *gc, int domid,
                                libxl_device_pci *pcidev);
 /* Resume hvm domain */
 _hidden int libxl__qmp_system_wakeup(libxl__gc *gc, int domid);
-/* Suspend QEMU. */
-_hidden int libxl__qmp_stop(libxl__gc *gc, int domid);
 /* Resume QEMU. */
 _hidden int libxl__qmp_resume(libxl__gc *gc, int domid);
-/* Save current QEMU state into fd. */
-_hidden int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename,
-                            bool live);
 /* Load current QEMU state from file. */
 _hidden int libxl__qmp_restore(libxl__gc *gc, int domid, const char *filename);
 /* Set dirty bitmap logging status */
@@ -3412,6 +3407,7 @@ struct libxl__domain_suspend_state {
     libxl__xswait_state pvcontrol;
     libxl__ev_xswatch guest_watch;
     libxl__ev_time guest_timeout;
+    libxl__ev_qmp qmp;
 
     const char *dm_savefile;
     void (*callback_device_model_done)(libxl__egc*,
@@ -3423,6 +3419,10 @@ int libxl__domain_suspend_init(libxl__egc *egc,
                                libxl__domain_suspend_state *dsps,
                                libxl_domain_type type);
 
+/* calls dsps->callback_device_model_done when done */
+_hidden void libxl__qmp_suspend_save(libxl__egc *egc,
+                                     libxl__domain_suspend_state *dsps);
+
 struct libxl__domain_save_state {
     /* set by caller of libxl__domain_save */
     libxl__ao *ao;
diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index 5d3984e6a5..b7e011940f 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -405,12 +405,12 @@ static int qmp_handle_response(libxl__gc *gc, libxl__qmp_handler *qmp,
  *   = 0  if qemu's version == asked version
  *   > 0  if qemu's version >  asked version
  */
-static int qmp_qemu_compare_version(libxl__qmp_handler *qmp, int major,
-                                    int minor, int micro)
+static int qmp_ev_qemu_compare_version(libxl__ev_qmp *ev, int major,
+                                       int minor, int micro)
 {
 #define CHECK_VERSION(level) do { \
-    if (qmp->version.level > (level)) return +1; \
-    if (qmp->version.level < (level)) return -1; \
+    if (ev->qemu_version.level > (level)) return +1; \
+    if (ev->qemu_version.level < (level)) return -1; \
 } while (0)
 
     CHECK_VERSION(major);
@@ -1027,29 +1027,6 @@ int libxl__qmp_system_wakeup(libxl__gc *gc, int domid)
     return qmp_run_command(gc, domid, "system_wakeup", NULL, NULL, NULL);
 }
 
-int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename, bool live)
-{
-    libxl__json_object *args = NULL;
-    libxl__qmp_handler *qmp = NULL;
-    int rc;
-
-    qmp = libxl__qmp_initialize(gc, domid);
-    if (!qmp)
-        return ERROR_FAIL;
-
-    qmp_parameters_add_string(gc, &args, "filename", (char *)filename);
-
-    /* live parameter was added to QEMU 2.11. It signal QEMU that the save
-     * operation is for a live migration rather that for taking a snapshot. */
-    if (qmp_qemu_compare_version(qmp, 2, 11, 0) >= 0)
-        qmp_parameters_add_bool(gc, &args, "live", live);
-
-    rc = qmp_synchronous_send(qmp, "xen-save-devices-state", args,
-                              NULL, NULL, qmp->timeout);
-    libxl__qmp_close(qmp);
-    return rc;
-}
-
 int libxl__qmp_restore(libxl__gc *gc, int domid, const char *state_file)
 {
     libxl__json_object *args = NULL;
@@ -1078,11 +1055,6 @@ static int qmp_change(libxl__gc *gc, libxl__qmp_handler *qmp,
     return rc;
 }
 
-int libxl__qmp_stop(libxl__gc *gc, int domid)
-{
-    return qmp_run_command(gc, domid, "stop", NULL, NULL, NULL);
-}
-
 int libxl__qmp_resume(libxl__gc *gc, int domid)
 {
     return qmp_run_command(gc, domid, "cont", NULL, NULL, NULL);
@@ -1322,6 +1294,129 @@ int libxl__qmp_initializations(libxl__gc *gc, uint32_t domid,
     return ret;
 }
 
+
+/*
+ * Function using libxl__ev_qmp
+ */
+
+static void dm_stopped(libxl__egc *egc, libxl__ev_qmp *ev,
+                       const libxl__json_object *response, int rc);
+static void dm_state_fd_ready(libxl__egc *egc, libxl__ev_qmp *ev,
+                              const libxl__json_object *response, int rc);
+static void dm_state_saved(libxl__egc *egc, libxl__ev_qmp *ev,
+                           const libxl__json_object *response, int rc);
+
+/* calls dsps->callback_device_model_done when done */
+void libxl__qmp_suspend_save(libxl__egc *egc,
+                             libxl__domain_suspend_state *dsps)
+{
+    EGC_GC;
+    int rc;
+    libxl__ev_qmp *ev = &dsps->qmp;
+
+    ev->domid = dsps->domid;
+    ev->callback = dm_stopped;
+    ev->fd = -1;
+
+    rc = libxl__ev_qmp_send(gc, ev, "stop", NULL);
+    if (rc)
+        goto error;
+
+    return;
+
+error:
+    dsps->callback_device_model_done(egc, dsps, rc);
+}
+
+static void dm_stopped(libxl__egc *egc, libxl__ev_qmp *ev,
+                       const libxl__json_object *response, int rc)
+{
+    EGC_GC;
+    libxl__domain_suspend_state *dsps = CONTAINER_OF(ev, *dsps, qmp);
+    const char *const filename = dsps->dm_savefile;
+
+    if (rc)
+        goto error;
+
+    ev->fd = open(filename, O_WRONLY | O_CREAT, 0600);
+    if (ev->fd < 0) {
+        LOGED(ERROR, ev->domid,
+              "Failed to open file %s for QEMU", filename);
+        rc = ERROR_FAIL;
+        goto error;
+    }
+
+    ev->callback = dm_state_fd_ready;
+    rc = libxl__ev_qmp_send(gc, ev, "add-fd", NULL);
+    if (rc)
+        goto error;
+
+    return;
+
+error:
+    if (ev->fd >= 0) {
+        close(ev->fd);
+        unlink(filename);
+        ev->fd = -1;
+    }
+    dsps->callback_device_model_done(egc, dsps, rc);
+}
+
+static void dm_state_fd_ready(libxl__egc *egc, libxl__ev_qmp *ev,
+                              const libxl__json_object *response, int rc)
+{
+    EGC_GC;
+    int fdset;
+    const libxl__json_object *o;
+    libxl__json_object *args = NULL;
+    libxl__domain_suspend_state *dsps = CONTAINER_OF(ev, *dsps, qmp);
+
+    close(ev->fd);
+    ev->fd = -1;
+
+    if (rc)
+        goto error;
+
+    o = libxl__json_map_get("fdset-id", response, JSON_INTEGER);
+    if (!o) {
+        rc = ERROR_QEMU_API;
+        goto error;
+    }
+    fdset = libxl__json_object_get_integer(o);
+
+    ev->callback = dm_state_saved;
+
+    /* The `live` parameter was added to QEMU 2.11. It signals QEMU that
+     * the save operation is for a live migration rather than for taking a
+     * snapshot. */
+    if (qmp_ev_qemu_compare_version(ev, 2, 11, 0) >= 0)
+        qmp_parameters_add_bool(gc, &args, "live", dsps->live);
+    QMP_PARAMETERS_SPRINTF(&args, "filename", "/dev/fdset/%d", fdset);
+    rc = libxl__ev_qmp_send(gc, ev, "xen-save-devices-state", args);
+    if (rc)
+        goto error;
+
+    return;
+
+error:
+    if (rc)
+        libxl__remove_file(gc, dsps->dm_savefile);
+    dsps->callback_device_model_done(egc, dsps, rc);
+}
+
+static void dm_state_saved(libxl__egc *egc, libxl__ev_qmp *ev,
+                           const libxl__json_object *response, int rc)
+{
+    EGC_GC;
+    libxl__domain_suspend_state *dsps = CONTAINER_OF(ev, *dsps, qmp);
+
+    if (rc)
+        unlink(dsps->dm_savefile);
+
+    dsps->callback_device_model_done(egc, dsps, rc);
+}
+
+
 /* ------------ Implementation of libxl__ev_qmp ---------------- */
 
 /*
-- 
Anthony PERARD


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 01/11] libxl: Enhance libxl__sendmsg_fds to deal with EINTR and EWOULDBLOCK
  2018-11-12 16:49 ` [PATCH v6 01/11] libxl: Enhance libxl__sendmsg_fds to deal with EINTR and EWOULDBLOCK Anthony PERARD
@ 2018-11-12 16:58   ` Ian Jackson
  0 siblings, 0 replies; 31+ messages in thread
From: Ian Jackson @ 2018-11-12 16:58 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Wei Liu

Anthony PERARD writes ("[PATCH v6 01/11] libxl: Enhance libxl__sendmsg_fds to deal with EINTR and EWOULDBLOCK"):
> This patch change the behavior of libxl__sendmsg_fds to retry sendmsg on
> EINTR error.
> 
> This patch also allow a caller of libxl__sendmsg_fds to deal with
> EWOULDBLOCK. It is best to only sent one byte when dealing with
> non-blocking fds so a EWOULDBLOCK error would mean that the fds haven't
> been sent yet.

This restriction is more than "it is best" - violating it can result
in an assertion failure.  I think it should be documented here:

> -/* on failure, logs */
> +/* returns ERROR_NOT_READY on EWOULDBLOCK
> + * or logs on failure. */

I looked at the actual code:

> -    r = sendmsg(carrier, &msg, 0);
> -    if (r < 0) {
> -        LOGE(ERROR, "failed to send fd-carrying message (%s)", what);
> -        return ERROR_FAIL;
> -    }
> +    while (1) {
> +        r = sendmsg(carrier, &msg, 0);
> +        if (r < 0) {
> +            if (errno == EINTR)
> +                continue;
> +            if (errno == EWOULDBLOCK) {
> +                assert(datalen == 1);
> +                return ERROR_NOT_READY;
> +            }
> +            LOGE(ERROR, "failed to send fd-carrying message (%s)", what);
> +            return ERROR_FAIL;
> +        }
> +        break;
> +    };

The previous code didn't check that the return value was as expected.
So the function could never be safely called with r!=1 since sendmsg
might report a short write, and libxl__sendmsg_fds wouldn't notice.

Now you have the opportunity to fix this...

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 02/11] libxl_qmp: Separate QMP message generation from qmp_send_prepare
  2018-11-12 16:49 ` [PATCH v6 02/11] libxl_qmp: Separate QMP message generation from qmp_send_prepare Anthony PERARD
@ 2018-11-12 17:20   ` Ian Jackson
  2018-11-13 10:59     ` Anthony PERARD
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Jackson @ 2018-11-12 17:20 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Wei Liu

Thanks for the repost.  I feel I am going to make some comments which
could perhaps have been made earlier, so sorry for that:

Anthony PERARD writes ("[PATCH v6 02/11] libxl_qmp: Separate QMP message generation from qmp_send_prepare"):
>     v6:
>         comment about ownership of buf

This is good.  But:

> -static char *qmp_send_prepare(libxl__gc *gc, libxl__qmp_handler *qmp,
> -                              const char *cmd, libxl__json_object *args,
> -                              qmp_callback_t callback, void *opaque,
> -                              qmp_request_context *context)

Previously this function returned memory allocated from malloc, and
this was not documented.  I think it should be, for both this and for
qmp_prepare_cmd.  See the big libxl.h comment on "memory allocation".

> @@ -608,6 +607,32 @@ static char *qmp_send_prepare(libxl__gc *gc, libxl__qmp_handler *qmp,
>      s = yajl_gen_get_buf(hand, &buf, &len);
>  
>      if (s) {
> +        goto out;
> +    }

Should there be a log message here ?  If not, maybe you just meant
    if (s) goto out;
In libxl_json.c we find the pattern is usually:
    if (s != yajl_gen_status_ok) goto out;
but I guess we can hardcode the assumption that yajl_gen_status_ok==0.

> +    ret = libxl__sprintf(NOGC, "%*.*s\r\n", (int)len, (int)len, buf);
> +    len += 2;

Please use %n to get the length, instead.  This kind of ad-hoc
addition encodes the buffer usage information in two disjoint places
and can easily result in buffer length bugs when the code is later
modified.

>  static int qmp_send(libxl__qmp_handler *qmp,
> @@ -641,7 +663,8 @@ static int qmp_send(libxl__qmp_handler *qmp,
>      int rc = -1;
>      GC_INIT(qmp->ctx);
>  
> -    buf = qmp_send_prepare(gc, qmp, cmd, args, callback, opaque, context);
> +    buf = qmp_send_prepare(gc, qmp, cmd, args, callback, opaque, context,
> +                           NULL);
>  
>      if (buf == NULL) {
>          goto out;
> @@ -650,12 +673,10 @@ static int qmp_send(libxl__qmp_handler *qmp,
>      if (libxl_write_exactly(qmp->ctx, qmp->qmp_fd, buf, strlen(buf),
>                              "QMP command", "QMP socket"))
>          goto out;
> -    if (libxl_write_exactly(qmp->ctx, qmp->qmp_fd, "\r\n", 2,
> -                            "CRLF", "QMP socket"))
> -        goto out;
>  
>      rc = qmp->last_id_used;
>  out:
> +    free(buf);

This patch seems to be a mixture of four things:

 1. Changing the return value convention of qmp_send_prepare
    to expect the caller to free it.

 2. Changing qmp_send_prepare to include the \r\n

 3. Splitting qmp_prepare_cmd out from qmp_send_prepare

 4. Changing qmp_send_prepare to tell the caller the length

Overall I think there is supposed to be no functional change ?
This should be mentioned in the commit message.

I appreciate that these four things are small, perhaps too small to
split out, but they should all be mentioned in the commit message.

And then, the reasons for these changes are unstated.  AFAICT:

3 is wanted because we are going to have a new caller which wants to
handle the sending itself.  Fine.

2 is wanted because every caller will want this, so it should be done
in the common function.

1 is wanted because 2 demands it.

4 ???  The existing code uses strlen.  JSON can't contain nul bytes.
Why should the new caller not do similarly ?  If the use of strlen is
wrong why not change the old caller ?  (Maybe it is going away, in
which case that needs to be mentioned.)

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 03/11] libxl_qmp: Change qmp_qemu_check_version to compare version
  2018-11-12 16:49 ` [PATCH v6 03/11] libxl_qmp: Change qmp_qemu_check_version to compare version Anthony PERARD
@ 2018-11-12 17:22   ` Ian Jackson
  2018-11-22 12:24     ` Anthony PERARD
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Jackson @ 2018-11-12 17:22 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Wei Liu

Anthony PERARD writes ("[PATCH v6 03/11] libxl_qmp: Change qmp_qemu_check_version to compare version"):
> This patch makes the function simpler to read. It also add the ability
> for a caller to tell if QEMU is newer or have the exact version.

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 04/11] libxl: Design of an async API to issue QMP commands to QEMU
  2018-11-12 16:49 ` [PATCH v6 04/11] libxl: Design of an async API to issue QMP commands to QEMU Anthony PERARD
@ 2018-11-12 17:29   ` Ian Jackson
  0 siblings, 0 replies; 31+ messages in thread
From: Ian Jackson @ 2018-11-12 17:29 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Wei Liu

Anthony PERARD writes ("[PATCH v6 04/11] libxl: Design of an async API to issue QMP commands to QEMU"):
> All the functions will be implemented in later patches.
> 
> This patch includes the API that libxl can use to send QMP commands to
> QEMU.
...
>         Rewrite the comment about the callback, and explain that on error,
>             the `ev` may be Idle or may still be Connected

This is good, thanks.

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 06/11] libxl_exec: Add libxl__spawn_initiate_failure
  2018-11-12 16:49 ` [PATCH v6 06/11] libxl_exec: Add libxl__spawn_initiate_failure Anthony PERARD
@ 2018-11-12 17:34   ` Ian Jackson
  0 siblings, 0 replies; 31+ messages in thread
From: Ian Jackson @ 2018-11-12 17:34 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Wei Liu

Anthony PERARD writes ("[PATCH v6 06/11] libxl_exec: Add libxl__spawn_initiate_failure"):
> This function can be used by user of libxl__spawn_* when they setup a
> notification other than xenstore. The parent can already report success
> via libxl__spawn_initiate_detach(), this new function can be used for
> failure instead of waiting for the timeout.
...
> + * ... It
> + * is possible for a spawn to fail for multiple reasons, for example
> + * call(s) to libxl__spawn_initiate_failure and also for some other reason.
> + * In that case the first rc value from any source will take precedence.

But your patch does not make this true, because an rc value from
libxl__spawn_initiate_failure may be overwritten by a later call to
spawn_fail.

And the reason you have written this (latent, probably as far as the
application is currently concerned) bug is that:

> +void libxl__spawn_initiate_failure(libxl__gc *gc, libxl__spawn_state *ss,
> +                                   int rc)
> +/* The spawn state must be Attached on entry and will be Attached Failed
> + * on return.  */
> +{
> +    assert(rc);
> +    if (!ss->rc)
> +        ss->rc = rc;
> +    spawn_detach(gc, ss);
> +}

This is an open-coded copy of spawn_fail.  If you hadn't written a
copy of it, you would have changed the rc squashing in spawn_fail too.

I think libxl__spawn_initiate_failure and spawn_fail need to be two
names for the same function.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 02/11] libxl_qmp: Separate QMP message generation from qmp_send_prepare
  2018-11-12 17:20   ` Ian Jackson
@ 2018-11-13 10:59     ` Anthony PERARD
  2018-11-13 12:04       ` Ian Jackson
  0 siblings, 1 reply; 31+ messages in thread
From: Anthony PERARD @ 2018-11-13 10:59 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Wei Liu

On Mon, Nov 12, 2018 at 05:20:53PM +0000, Ian Jackson wrote:
> Thanks for the repost.  I feel I am going to make some comments which
> could perhaps have been made earlier, so sorry for that:
> 
> Anthony PERARD writes ("[PATCH v6 02/11] libxl_qmp: Separate QMP message generation from qmp_send_prepare"):
> > -static char *qmp_send_prepare(libxl__gc *gc, libxl__qmp_handler *qmp,
> > -                              const char *cmd, libxl__json_object *args,
> > -                              qmp_callback_t callback, void *opaque,
> > -                              qmp_request_context *context)
> 
> Previously this function returned memory allocated from malloc, and
> this was not documented.  I think it should be, for both this and for
> qmp_prepare_cmd.  See the big libxl.h comment on "memory allocation".

The memory allocated from malloc is new, before that it was allocated in
`gc` (aside from the `yajl_gen hand` which needs to be freed before the
function returns).

I've make the change to return a NOGC buffers because I don't know how
to have an allocation survive an `egc`.

Do you think I could call qmp_send_prepare with an `ao_gc` ? Not in this
patch, but later, in the context of libxl__ev_qmp which I think
shouldn't survive an AO.

> This patch seems to be a mixture of four things:
> 
>  1. Changing the return value convention of qmp_send_prepare
>     to expect the caller to free it.
> 
>  2. Changing qmp_send_prepare to include the \r\n
> 
>  3. Splitting qmp_prepare_cmd out from qmp_send_prepare
> 
>  4. Changing qmp_send_prepare to tell the caller the length
> 
> Overall I think there is supposed to be no functional change ?
> This should be mentioned in the commit message.
> 
> I appreciate that these four things are small, perhaps too small to
> split out, but they should all be mentioned in the commit message.
> 
> And then, the reasons for these changes are unstated.  AFAICT:
> 
> 3 is wanted because we are going to have a new caller which wants to
> handle the sending itself.  Fine.
> 
> 2 is wanted because every caller will want this, so it should be done
> in the common function.
> 
> 1 is wanted because 2 demands it.

I'll try to improve the patch description and awnser all those things.

> 4 ???  The existing code uses strlen.  JSON can't contain nul bytes.
> Why should the new caller not do similarly ?  If the use of strlen is
> wrong why not change the old caller ?  (Maybe it is going away, in
> which case that needs to be mentioned.)

I guess I can recalculate the lenght at the time when it will be needed
in a later patch.

-- 
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 05/11] libxl_qmp: Implementation of libxl__ev_qmp_*
  2018-11-12 16:49 ` [PATCH v6 05/11] libxl_qmp: Implementation of libxl__ev_qmp_* Anthony PERARD
@ 2018-11-13 11:33   ` Ian Jackson
  2018-11-22 19:04   ` Marek Marczykowski-Górecki
  1 sibling, 0 replies; 31+ messages in thread
From: Ian Jackson @ 2018-11-13 11:33 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Wei Liu

Anthony PERARD writes ("[PATCH v6 05/11] libxl_qmp: Implementation of libxl__ev_qmp_*"):
> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>

I started reviewing this in detail but I got bogged down in the main
implementation details because none of the internal functions like
qmp_ev_connect have doc comments saying what state they expect to find,
and what state they promise to leave.

I started trying to reverse engineer it, but I think this will be
both error prone and time consuming.  I think you have the answers to
this in your head but just didn't write it down.

Can you please respin just this patch, as a v6.1, with no changes
other than to add a comment next to the implementation of each
function (both the internal ones like qmp_ev_connect, and the external
ones like libxl__ev_qmp_send) saying what state(s) they accept and
produce (internal states, that is) ?

I guess that won't take you very long.

Eg

  static int qmp_ev_connect(libxl__gc *gc, libxl__ev_qmp *ev)
    /* disconnected/connecting/connected -> connecting */
  {

(if that is indeed true).

I think such comments would be useful even for external-facing
functions, because although they have to cope with any internal state
corresponding to any of the legal external states, the reader wants to
know what internal state results, not just what external state.  These
comments should be in libxl_qmp.c because they're part of the
implementation, not the API.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 02/11] libxl_qmp: Separate QMP message generation from qmp_send_prepare
  2018-11-13 10:59     ` Anthony PERARD
@ 2018-11-13 12:04       ` Ian Jackson
  0 siblings, 0 replies; 31+ messages in thread
From: Ian Jackson @ 2018-11-13 12:04 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Wei Liu

Anthony PERARD writes ("Re: [PATCH v6 02/11] libxl_qmp: Separate QMP message generation from qmp_send_prepare"):
> On Mon, Nov 12, 2018 at 05:20:53PM +0000, Ian Jackson wrote:
> > Previously this function returned memory allocated from malloc, and
> > this was not documented.  I think it should be, for both this and for
> > qmp_prepare_cmd.  See the big libxl.h comment on "memory allocation".
> 
> The memory allocated from malloc is new, before that it was allocated in
> `gc` (aside from the `yajl_gen hand` which needs to be freed before the
> function returns).

Right.  I was asking for this to be documented in a comment, as
required by the libxl error handling model.

> I've make the change to return a NOGC buffers because I don't know how
> to have an allocation survive an `egc`.

Indeed, an egc's allocation would be too short.

> Do you think I could call qmp_send_prepare with an `ao_gc` ? Not in this
> patch, but later, in the context of libxl__ev_qmp which I think
> shouldn't survive an AO.

Ah yes, I keep thinking that you could cache a libxl__ev_qmp in the
libxl_ctx but because qemu will only talk on one at once, you are
right.  So yes a libxl__ev_qmp should have a reference to an ao.

Then you can use the ao gc.

> > 1 is wanted because 2 demands it.
> 
> I'll try to improve the patch description and awnser all those things.

Thanks.

> > 4 ???  The existing code uses strlen.  JSON can't contain nul bytes.
> > Why should the new caller not do similarly ?  If the use of strlen is
> > wrong why not change the old caller ?  (Maybe it is going away, in
> > which case that needs to be mentioned.)
> 
> I guess I can recalculate the lenght at the time when it will be needed
> in a later patch.

Or something.  I just noticed that this was done differently in
different places.  It's probably more convenient to recalculate it
than it is to carry it about.

But I don't mind if you want to carry it about - but in that case I
think you should switch all the callers to using the length returned
by qmp_prepare_cmd.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 07/11] libxl_dm: Pre-open QMP socket for QEMU
  2018-11-12 16:49 ` [PATCH v6 07/11] libxl_dm: Pre-open QMP socket for QEMU Anthony PERARD
@ 2018-11-16 11:52   ` Ian Jackson
  2018-11-21 17:14     ` Anthony PERARD
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Jackson @ 2018-11-16 11:52 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Wei Liu

Thanks for this patch.  I have a feeling that I have already commented
(perhaps informally) on a few aspects of it but the message was not
marked `replied' in my MUA so I thought I would formally review it.
Apologies if my comments are, effectively, duplicates.


Anthony PERARD writes ("[PATCH v6 07/11] libxl_dm: Pre-open QMP socket for QEMU"):
> This patch move the creation of the QMP unix socket from QEMU to libxl.

             moves

> But libxl doesn't rely on this yet.
> 
> When starting QEMU with dm_restrict=1, pre-open the QMP socket before
> exec QEMU. That socket will be usefull to findout if QEMU is ready, and

                                 useful     find out

> pre-opening it means that libxl can connect to it without waiting for
> QEMU to create it.
> 
> The pre-openning is conditionnal, based on the use of dm_restrict

      pre-opening     conditional

> because it is using a new command line option of QEMU, and dm_restrict
> support in QEMU is newer.



> @@ -539,6 +539,9 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
>      libxl_domain_create_info *info = &d_config->c_info;
>      libxl_domain_build_info *b_info = &d_config->b_info;
>  
> +    /* Attempt to initialise libxl__domain_build_state */
> +    state->dm_monitor_fd = -1;

See my comments below.

> +static int libxl__pre_open_qmp_socket(libxl__gc *gc, libxl_domid domid,
> +                                      int *fd_r)
> +{
> +    int rc, r;
> +    int fd;
> +    struct sockaddr_un un;
> +    const char *path = libxl__qemu_qmp_path(gc, domid);
> +
> +    assert(fd_r != NULL);

This assertion is not really of any use since otherwise we'll
dereference it, and crash, in due course, anyway.

> +    r = listen(fd, 1);

What is the reasoning behind the choice of 1 for the listen queue
parameter ?

>  static int libxl__build_device_model_args_new(libxl__gc *gc,
>                                          const char *dm, int guest_domid,
>                                          const libxl_domain_config *guest_config,
> @@ -944,10 +991,16 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
>                        GCSPRINTF("%d", guest_domid), NULL);
>  
>      flexarray_append(dm_args, "-chardev");
> -    flexarray_append(dm_args,
> -                     GCSPRINTF("socket,id=libxl-cmd,"
> -                               "path=%s,server,nowait",
> -                               libxl__qemu_qmp_path(gc, guest_domid)));
> +    if (state->dm_monitor_fd >= 0) {
> +        flexarray_append(dm_args,
> +            GCSPRINTF("socket,id=libxl-cmd,fd=%d,server,nowait",
> +                      state->dm_monitor_fd));

I think I suggested (IRL perhaps, and perhaps after you posted this)
that you might add some asserts to the other
..._build_device_model_args_... functions.

> @@ -2000,6 +2053,7 @@ void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)
>      if (ret)
>          goto out;
>  
> +    d_state->dm_monitor_fd = -1;

This function calls libxl__domain_make which you have just changed to
also set this.  The situation is now very confusing, I think.

I think it would probably be better to introduce a new function to
initialise a libxl__domain_build_state, which is called every time one
comes into existence.  Probably as a separate patch.

> @@ -2408,6 +2470,7 @@ out_close:
>      if (logfile_w >= 0) close(logfile_w);
>  out:
>      if (dm_state_fd >= 0) close(dm_state_fd);
> +    if (state->dm_monitor_fd >= 0) close(state->dm_monitor_fd);

I think this cleanup of `state' needs to be split out.  There should
be a dispose function called everywhere a state ceases to exist.
See above about _init.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 08/11] libxl: QEMU startup sync based on QMP
  2018-11-12 16:49 ` [PATCH v6 08/11] libxl: QEMU startup sync based on QMP Anthony PERARD
@ 2018-11-16 12:14   ` Ian Jackson
  2018-11-21 16:49     ` Anthony PERARD
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Jackson @ 2018-11-16 12:14 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Wei Liu, Roger Pau Monné

Anthony PERARD writes ("[PATCH v6 08/11] libxl: QEMU startup sync based on QMP"):
> This is only activated when dm_restrict=1, as explained in the previous
> patch "libxl_dm: Pre-open QMP socket for QEMU"
...
> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.  I think I have spotted one DoS vulnerability (to qemu) and
one potential memory leak.

And some things which are anomalous but may or may not be bugs.

> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index b768d1b09f..de3862c839 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -3898,6 +3898,7 @@ struct libxl__dm_spawn_state {
       /* filled in by user, must remain valid: */
       uint32_t guest_domid; /* domain being served */
>      libxl_domain_config *guest_config;
>      libxl__domain_build_state *build_state; /* relates to guest_domid */
>      libxl__dm_spawn_cb *callback;
> +    libxl__ev_qmp qmp;
>  };

I added a couple more lines of context.  Now we can see that you are
adding qmp in the wrong place.  The qmp is private to
libxl__spawn_*_dm, isn't it ?

This is the private field which can be handled in an idempotent way.
The other private field is `libxl__spawn_state spawn', which can't be
done that way because a spawn cannot be simply disposed.

I think you should introduce and call common functions dmss_init and
dmss_dispose for the use of libxl__spawn_local_dm and
libxl__spawn_stub_dm, and the ev_qmp_init should be done there.

As it is, you neither initialise nor dispose qmp in the case of
libxl__spawn_stub_dm.  That is perhaps correct now but it is a
latent bug if someone starts using qmp in the stub dm case.

> @@ -2343,6 +2346,8 @@ void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state *dmss)
>      const char *dm;
>      int dm_state_fd = -1;
>  
> +    libxl__ev_qmp_init(&dmss->qmp);
> +
>      if (libxl_defbool_val(b_info->device_model_stubdomain)) {
>          abort();
>      }
> @@ -2450,6 +2455,16 @@ retry_transaction:
>      spawn->failure_cb = device_model_startup_failed;
>      spawn->detached_cb = device_model_detached;
>  
> +    if (state->dm_monitor_fd >= 0) {
> +        /* There is a valid QMP socket available now,
> +         * use it to find out when QEMU is ready */
> +        dmss->qmp.callback = device_model_qmp_cb;
> +        dmss->qmp.domid = domid;
> +        dmss->qmp.fd = -1;
> +        rc = libxl__ev_qmp_send(gc, &dmss->qmp, "query-status", NULL);
> +        if (rc) goto out_close;
> +    }

The documentation does not make it clear whether libxl__ev_qmp_send
might make the callback synchronously.  I think if it does you are at
risk of calling libxl__spawn_initiate_failure when the spawn has not
been started yet.

>      rc = libxl__spawn_spawn(egc, spawn);
>      if (rc < 0)
>          goto out_close;
> @@ -2524,6 +2539,44 @@ static void device_model_detached(libxl__egc *egc,
>      device_model_spawn_outcome(egc, dmss, 0);
>  }
>  
> +static void device_model_qmp_cb(libxl__egc *egc, libxl__ev_qmp *ev,
> +                                const libxl__json_object *response,
> +                                int rc)
> +{
> +    EGC_GC;
> +    libxl__dm_spawn_state *dmss = CONTAINER_OF(ev, *dmss, qmp);
> +    const libxl__json_object *o;
> +    const char *status;
> +
> +    libxl__ev_qmp_dispose(gc, ev);

That surely doesn't want to be here.  It should be (and I think, is)
disposed in the general teardown.  If I am wrong about that then I
have misunderstood the control flow, and the control flow may be
wrong.

> +    o = libxl__json_map_get("status", response, JSON_STRING);
> +    if (!o) {
> +        LOGD(ERROR, ev->domid,
> +             "Missing \"status\" in response to \"query-status\"");

If you used ` ' or ' ' here you could do away with the \.

> +        LOGD(DEBUG, ev->domid, ".. instead, got: %s",
> +             libxl__json_object_to_json(gc, response));

The doc comments for libxl__json_object_to_json don't say whether it
can fail.  So I assume it can, in which case you will pass NULL to %s
which is (sadly) nowadays illegal (although in practice probably
safe).

> +    status = libxl__json_object_get_string(o);
> +    if (strcmp(status, "running")) {

I think status can be NULL if o is not a string, and this is therefore
a DoS vulnerability against libxl exploitable by a depriv qemu.

> @@ -2547,6 +2600,8 @@ static void device_model_spawn_outcome(libxl__egc *egc,
>          }
>      }
>  
> +    libxl__ev_qmp_dispose(gc, &dmss->qmp);
> +
>   out:
>      dmss->callback(egc, dmss, rc);

Why is the dispose before out ?  I think this may be a memory leak (or
worse), perhaps exploitable somehow by qemu.

>  _hidden void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state*);
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index fec42b260c..a0912718e0 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -75,6 +75,7 @@ libxl_error = Enumeration("error", [
>      (-29, "QMP_COMMAND_NOT_FOUND"), # the requested command has not been found
>      (-30, "QMP_DEVICE_NOT_ACTIVE"), # a device has failed to be become active
>      (-31, "QMP_DEVICE_NOT_FOUND"), # the requested device has not been found
> +    (-32, "QEMU_API"),

Can we at least have a descriptive comment for this error code ?


Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 09/11] libxl_qmp: Store advertised QEMU version in libxl__ev_qmp
  2018-11-12 16:49 ` [PATCH v6 09/11] libxl_qmp: Store advertised QEMU version in libxl__ev_qmp Anthony PERARD
@ 2018-11-16 12:17   ` Ian Jackson
  0 siblings, 0 replies; 31+ messages in thread
From: Ian Jackson @ 2018-11-16 12:17 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Wei Liu

Anthony PERARD writes ("[PATCH v6 09/11] libxl_qmp: Store advertised QEMU version in libxl__ev_qmp"):
> This will be used in a later patch.

Thanks.  I have one comment about defensive programming in the macro,
and a couple of formatting nits.

> +        /*
> +         * Store advertised QEMU version
> +         * { "QMP": { "version": {
> +         *     "qemu": { "major": int, "minor": int, "micro": int } } } }
> +         */
> +        o = libxl__json_map_get("QMP", resp, JSON_MAP);
> +        o = libxl__json_map_get("version", o, JSON_MAP);
> +        o = libxl__json_map_get("qemu", o, JSON_MAP);
> +#define GRAB_VERSION(level) \
> +    ev->qemu_version.level = libxl__json_object_get_integer( \
> +        libxl__json_map_get(#level, o, JSON_INTEGER));
> +        GRAB_VERSION(major);

Your GRAB_VERSION macro ought to have a pair of ( ) around it, or the
do{...}while(0) trick.  I would prefer the indentation to be such that
the statement inside the macro is indented like the ones outside.

> +        GRAB_VERSION(minor);
> +        GRAB_VERSION(micro);
> +#undef GRAB_VERSION
> +        LOGD(DEBUG, ev->domid, "QEMU version: %d.%d.%d",
> +             ev->qemu_version.major, ev->qemu_version.minor,
> +             ev->qemu_version.micro);

A very minor point, but if you broke the line after `major,' this
would make the three identical things more similar-looking.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 08/11] libxl: QEMU startup sync based on QMP
  2018-11-16 12:14   ` Ian Jackson
@ 2018-11-21 16:49     ` Anthony PERARD
  2018-11-22 14:37       ` Anthony PERARD
  0 siblings, 1 reply; 31+ messages in thread
From: Anthony PERARD @ 2018-11-21 16:49 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Wei Liu, Roger Pau Monné

On Fri, Nov 16, 2018 at 12:14:43PM +0000, Ian Jackson wrote:
> Anthony PERARD writes ("[PATCH v6 08/11] libxl: QEMU startup sync based on QMP"):
> > This is only activated when dm_restrict=1, as explained in the previous
> > patch "libxl_dm: Pre-open QMP socket for QEMU"
> ...
> > Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> > Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Thanks.  I think I have spotted one DoS vulnerability (to qemu) and
> one potential memory leak.
> 
> And some things which are anomalous but may or may not be bugs.
> 
> > diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> > index b768d1b09f..de3862c839 100644
> > --- a/tools/libxl/libxl_internal.h
> > +++ b/tools/libxl/libxl_internal.h
> > @@ -3898,6 +3898,7 @@ struct libxl__dm_spawn_state {
>        /* filled in by user, must remain valid: */
>        uint32_t guest_domid; /* domain being served */
> >      libxl_domain_config *guest_config;
> >      libxl__domain_build_state *build_state; /* relates to guest_domid */
> >      libxl__dm_spawn_cb *callback;
> > +    libxl__ev_qmp qmp;
> >  };
> 
> I added a couple more lines of context.  Now we can see that you are
> adding qmp in the wrong place.  The qmp is private to
> libxl__spawn_*_dm, isn't it ?

Yes, I think it is.

> This is the private field which can be handled in an idempotent way.
> The other private field is `libxl__spawn_state spawn', which can't be
> done that way because a spawn cannot be simply disposed.
> 
> I think you should introduce and call common functions dmss_init and
> dmss_dispose for the use of libxl__spawn_local_dm and
> libxl__spawn_stub_dm, and the ev_qmp_init should be done there.

Will do. There seems to be libxl__spawn_qdisk_backend that would need
dmss_init as well.

> As it is, you neither initialise nor dispose qmp in the case of
> libxl__spawn_stub_dm.  That is perhaps correct now but it is a
> latent bug if someone starts using qmp in the stub dm case.
> 
> > @@ -2343,6 +2346,8 @@ void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state *dmss)
> >      const char *dm;
> >      int dm_state_fd = -1;
> >  
> > +    libxl__ev_qmp_init(&dmss->qmp);
> > +
> >      if (libxl_defbool_val(b_info->device_model_stubdomain)) {
> >          abort();
> >      }
> > @@ -2450,6 +2455,16 @@ retry_transaction:
> >      spawn->failure_cb = device_model_startup_failed;
> >      spawn->detached_cb = device_model_detached;
> >  
> > +    if (state->dm_monitor_fd >= 0) {
> > +        /* There is a valid QMP socket available now,
> > +         * use it to find out when QEMU is ready */
> > +        dmss->qmp.callback = device_model_qmp_cb;
> > +        dmss->qmp.domid = domid;
> > +        dmss->qmp.fd = -1;
> > +        rc = libxl__ev_qmp_send(gc, &dmss->qmp, "query-status", NULL);
> > +        if (rc) goto out_close;
> > +    }
> 
> The documentation does not make it clear whether libxl__ev_qmp_send
> might make the callback synchronously.  I think if it does you are at
> risk of calling libxl__spawn_initiate_failure when the spawn has not
> been started yet.

I'll fix the documentation to tell that libxl__ev_qmp_send will not call
the callback synchronously.

> >      rc = libxl__spawn_spawn(egc, spawn);
> >      if (rc < 0)
> >          goto out_close;
> > @@ -2524,6 +2539,44 @@ static void device_model_detached(libxl__egc *egc,
> >      device_model_spawn_outcome(egc, dmss, 0);
> >  }
> >  
> > +static void device_model_qmp_cb(libxl__egc *egc, libxl__ev_qmp *ev,
> > +                                const libxl__json_object *response,
> > +                                int rc)
> > +{
> > +    EGC_GC;
> > +    libxl__dm_spawn_state *dmss = CONTAINER_OF(ev, *dmss, qmp);
> > +    const libxl__json_object *o;
> > +    const char *status;
> > +
> > +    libxl__ev_qmp_dispose(gc, ev);
> 
> That surely doesn't want to be here.  It should be (and I think, is)
> disposed in the general teardown.  If I am wrong about that then I
> have misunderstood the control flow, and the control flow may be
> wrong.

That is documented in libxl__ev_qmp as to why _dispose is called here:

    Only one connection at a time can be made to one QEMU, so avoid
    keeping a libxl__ev_qmp Connected for to long and call
    libxl__ev_qmp_dispose as soon as it is not needed anymore.


> > +        LOGD(DEBUG, ev->domid, ".. instead, got: %s",
> > +             libxl__json_object_to_json(gc, response));
> 
> The doc comments for libxl__json_object_to_json don't say whether it
> can fail.  So I assume it can, in which case you will pass NULL to %s
> which is (sadly) nowadays illegal (although in practice probably
> safe).

I wounder what to do for this.
Maybe invent a JSON macro which would be:
    JSON(o) (libxl__json_object_to_json(gc, (o)) : ? "\"null\"")
    ("null" would actually be valid json)
Or do it without the macro, but there are plenty of other caller's of
libxl__json_object_to_json in libxl__ev_qmp implementation.

> > +    status = libxl__json_object_get_string(o);
> > +    if (strcmp(status, "running")) {
> 
> I think status can be NULL if o is not a string, and this is therefore
> a DoS vulnerability against libxl exploitable by a depriv qemu.

`o` is a string, libxl__json_map_get(,,JSON_STRING) calls makes sure of
that. Then `status` can't be NULL.

> > @@ -2547,6 +2600,8 @@ static void device_model_spawn_outcome(libxl__egc *egc,
> >          }
> >      }
> >  
> > +    libxl__ev_qmp_dispose(gc, &dmss->qmp);
> > +
> >   out:
> >      dmss->callback(egc, dmss, rc);
> 
> Why is the dispose before out ?  I think this may be a memory leak (or
> worse), perhaps exploitable somehow by qemu.

It's probably a mistake.

> >  _hidden void libxl__spawn_local_dm(libxl__egc *egc, libxl__dm_spawn_state*);
> > diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> > index fec42b260c..a0912718e0 100644
> > --- a/tools/libxl/libxl_types.idl
> > +++ b/tools/libxl/libxl_types.idl
> > @@ -75,6 +75,7 @@ libxl_error = Enumeration("error", [
> >      (-29, "QMP_COMMAND_NOT_FOUND"), # the requested command has not been found
> >      (-30, "QMP_DEVICE_NOT_ACTIVE"), # a device has failed to be become active
> >      (-31, "QMP_DEVICE_NOT_FOUND"), # the requested device has not been found
> > +    (-32, "QEMU_API"),
> 
> Can we at least have a descriptive comment for this error code ?

What about:
  QEMU's replies don't contains expected members

Thanks,

-- 
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 07/11] libxl_dm: Pre-open QMP socket for QEMU
  2018-11-16 11:52   ` Ian Jackson
@ 2018-11-21 17:14     ` Anthony PERARD
  0 siblings, 0 replies; 31+ messages in thread
From: Anthony PERARD @ 2018-11-21 17:14 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Wei Liu

On Fri, Nov 16, 2018 at 11:52:52AM +0000, Ian Jackson wrote:
> > +    r = listen(fd, 1);
> 
> What is the reasoning behind the choice of 1 for the listen queue
> parameter ?

I may have simply copy that from QEMU, or libvirt. They both call listen
with 1. It is probably to allow at least one connect() call on that
socket to succeed before QEMU start. I don't know if 0 is enough.

> >  static int libxl__build_device_model_args_new(libxl__gc *gc,
> >                                          const char *dm, int guest_domid,
> >                                          const libxl_domain_config *guest_config,
> > @@ -944,10 +991,16 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
> >                        GCSPRINTF("%d", guest_domid), NULL);
> >  
> >      flexarray_append(dm_args, "-chardev");
> > -    flexarray_append(dm_args,
> > -                     GCSPRINTF("socket,id=libxl-cmd,"
> > -                               "path=%s,server,nowait",
> > -                               libxl__qemu_qmp_path(gc, guest_domid)));
> > +    if (state->dm_monitor_fd >= 0) {
> > +        flexarray_append(dm_args,
> > +            GCSPRINTF("socket,id=libxl-cmd,fd=%d,server,nowait",
> > +                      state->dm_monitor_fd));
> 
> I think I suggested (IRL perhaps, and perhaps after you posted this)
> that you might add some asserts to the other
> ..._build_device_model_args_... functions.

Will do.

> > @@ -2000,6 +2053,7 @@ void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)
> >      if (ret)
> >          goto out;
> >  
> > +    d_state->dm_monitor_fd = -1;
> 
> This function calls libxl__domain_make which you have just changed to
> also set this.  The situation is now very confusing, I think.
> 
> I think it would probably be better to introduce a new function to
> initialise a libxl__domain_build_state, which is called every time one
> comes into existence.  Probably as a separate patch.

I'll attempt to do that.

> > @@ -2408,6 +2470,7 @@ out_close:
> >      if (logfile_w >= 0) close(logfile_w);
> >  out:
> >      if (dm_state_fd >= 0) close(dm_state_fd);
> > +    if (state->dm_monitor_fd >= 0) close(state->dm_monitor_fd);
> 
> I think this cleanup of `state' needs to be split out.  There should
> be a dispose function called everywhere a state ceases to exist.
> See above about _init.

Will do.

-- 
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 03/11] libxl_qmp: Change qmp_qemu_check_version to compare version
  2018-11-12 17:22   ` Ian Jackson
@ 2018-11-22 12:24     ` Anthony PERARD
  2018-11-22 12:27       ` Anthony PERARD
  0 siblings, 1 reply; 31+ messages in thread
From: Anthony PERARD @ 2018-11-22 12:24 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Wei Liu

On Mon, Nov 12, 2018 at 05:22:45PM +0000, Ian Jackson wrote:
> Anthony PERARD writes ("[PATCH v6 03/11] libxl_qmp: Change qmp_qemu_check_version to compare version"):
> > This patch makes the function simpler to read. It also add the ability
> > for a caller to tell if QEMU is newer or have the exact version.
> 
> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

I will add the following to the document to clarify when a callback can
be called (not synchronously).

+++ b/tools/libxl/libxl_internal.h
@@ -388,5 +388,6 @@ struct libxl__ev_child {
  * libxl__ev_qmp_send: Idle/Connected -> Active (on error: Idle)
  *    Sends a command to QEMU.
  *    callback will be called when a response is received or when an
  *    error as occured.
+ *    callback isn't called synchronously.
  *


-- 
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 03/11] libxl_qmp: Change qmp_qemu_check_version to compare version
  2018-11-22 12:24     ` Anthony PERARD
@ 2018-11-22 12:27       ` Anthony PERARD
  0 siblings, 0 replies; 31+ messages in thread
From: Anthony PERARD @ 2018-11-22 12:27 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Wei Liu

On Thu, Nov 22, 2018 at 12:24:29PM +0000, Anthony PERARD wrote:
> On Mon, Nov 12, 2018 at 05:22:45PM +0000, Ian Jackson wrote:
> > Anthony PERARD writes ("[PATCH v6 03/11] libxl_qmp: Change qmp_qemu_check_version to compare version"):
> > > This patch makes the function simpler to read. It also add the ability
> > > for a caller to tell if QEMU is newer or have the exact version.
> > 
> > Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
> 
> I will add the following to the document to clarify when a callback can
> be called (not synchronously).
> 
> +++ b/tools/libxl/libxl_internal.h
> @@ -388,5 +388,6 @@ struct libxl__ev_child {
>   * libxl__ev_qmp_send: Idle/Connected -> Active (on error: Idle)
>   *    Sends a command to QEMU.
>   *    callback will be called when a response is received or when an
>   *    error as occured.
> + *    callback isn't called synchronously.
>   *

That was meant for the next patch: "libxl: Design of an async API to
issue QMP commands to QEMU" ... (which is acked as well).

-- 
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 08/11] libxl: QEMU startup sync based on QMP
  2018-11-21 16:49     ` Anthony PERARD
@ 2018-11-22 14:37       ` Anthony PERARD
  2018-11-22 15:48         ` Ian Jackson
  0 siblings, 1 reply; 31+ messages in thread
From: Anthony PERARD @ 2018-11-22 14:37 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Wei Liu, Roger Pau Monné

On Wed, Nov 21, 2018 at 04:49:06PM +0000, Anthony PERARD wrote:
> On Fri, Nov 16, 2018 at 12:14:43PM +0000, Ian Jackson wrote:
> > Anthony PERARD writes ("[PATCH v6 08/11] libxl: QEMU startup sync based on QMP"):
> > > +        LOGD(DEBUG, ev->domid, ".. instead, got: %s",
> > > +             libxl__json_object_to_json(gc, response));
> > 
> > The doc comments for libxl__json_object_to_json don't say whether it
> > can fail.  So I assume it can, in which case you will pass NULL to %s
> > which is (sadly) nowadays illegal (although in practice probably
> > safe).
> 
> I wounder what to do for this.
> Maybe invent a JSON macro which would be:
>     JSON(o) (libxl__json_object_to_json(gc, (o)) : ? "\"null\"")
>     ("null" would actually be valid json)
> Or do it without the macro, but there are plenty of other caller's of
> libxl__json_object_to_json in libxl__ev_qmp implementation.

Or simply change libxl__json_object_to_json to always return a valid
json string. There are no user yet, so it is probably fine to make that
change.

What do you think?

Thanks,

-- 
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 08/11] libxl: QEMU startup sync based on QMP
  2018-11-22 14:37       ` Anthony PERARD
@ 2018-11-22 15:48         ` Ian Jackson
  0 siblings, 0 replies; 31+ messages in thread
From: Ian Jackson @ 2018-11-22 15:48 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Wei Liu, Roger Pau Monné

Anthony PERARD writes ("Re: [PATCH v6 08/11] libxl: QEMU startup sync based on QMP"):
> On Wed, Nov 21, 2018 at 04:49:06PM +0000, Anthony PERARD wrote:
> > I wounder what to do for this.
> > Maybe invent a JSON macro which would be:
> >     JSON(o) (libxl__json_object_to_json(gc, (o)) : ? "\"null\"")
> >     ("null" would actually be valid json)
> > Or do it without the macro, but there are plenty of other caller's of
> > libxl__json_object_to_json in libxl__ev_qmp implementation.
> 
> Or simply change libxl__json_object_to_json to always return a valid
> json string. There are no user yet, so it is probably fine to make that
> change.

I very much like your suggestion that there should be some simple
thing that always returns something suitable for printing in an error
message.

I'm not sure whether that thing should be libxl__json_object_to_json.
I don't like the idea that libxl__json_object_to_json should be able
to fail undetectably.

Nor do I like the idea that failures should be represented by a valid
parseable json string.

So I tentatively suggest either

  const char libxl__json_object_to_json_error[] = "<invalid-json-object>";

returning libxl__json_object_to_json_error on error, or

  #define JSON(o) (libxl__json_object_to_json(gc, (o)) \
                   : ? "<invalid-json-object>")

or some such.  I can't remember enough JS right now to be sure whether
`<invalid-json-object>' is a parse error but both json_pp and nodejs
seem to hate it well enough...

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 05/11] libxl_qmp: Implementation of libxl__ev_qmp_*
  2018-11-12 16:49 ` [PATCH v6 05/11] libxl_qmp: Implementation of libxl__ev_qmp_* Anthony PERARD
  2018-11-13 11:33   ` Ian Jackson
@ 2018-11-22 19:04   ` Marek Marczykowski-Górecki
  2018-11-23 11:09     ` Anthony PERARD
  1 sibling, 1 reply; 31+ messages in thread
From: Marek Marczykowski-Górecki @ 2018-11-22 19:04 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Ian Jackson, Wei Liu


[-- Attachment #1.1: Type: text/plain, Size: 4514 bytes --]

On Mon, Nov 12, 2018 at 04:49:24PM +0000, Anthony PERARD wrote:
(...)
> +/* QMP FD callbacks */
> +
> +static void qmp_ev_fd_callback(libxl__egc *egc, libxl__ev_fd *ev_fd,
> +                               int fd, short events, short revents)
> +{
> +    EGC_GC;
> +    int rc;
> +    libxl__json_object *o = NULL;
> +    libxl__ev_qmp *ev = CONTAINER_OF(ev_fd, *ev, qmp_efd);
> +
> +    if (revents & (POLLHUP)) {
> +        LOGD(ERROR, ev->domid, "received POLLHUP from QMP socket");
> +        rc = ERROR_PROTOCOL_ERROR_QMP;
> +        goto out;
> +    }
> +    if (revents & ~(POLLIN|POLLOUT)) {
> +        LOGD(ERROR, ev->domid,
> +             "unexpected poll event 0x%x on QMP socket (expected POLLIN "
> +             "and/or POLLOUT)",
> +            revents);
> +        rc = ERROR_FAIL;
> +        goto out;
> +    }
> +
> +    if (revents & POLLOUT) {
> +        rc = qmp_ev_callback_writable(gc, ev, fd);
> +        if (rc)
> +            goto out;
> +    }
> +
> +    if (revents & POLLIN) {
> +        rc = qmp_ev_callback_readable(egc, ev, fd);
> +        if (rc)
> +            goto out;
> +
> +        /* parse input */
> +        while (1) {
> +            /* parse rx buffer to find one json object */
> +            rc = qmp_ev_get_next_msg(egc, ev, &o);
> +            if (rc == ERROR_NOTFOUND) {
> +                rc = 0;
> +                break;
> +            } else if (rc)
> +                goto out;
> +
> +            /* Must be last and return when the user callback is called */
> +            rc = qmp_ev_handle_message(egc, ev, o);
> +            if (rc < 0)
> +                goto out;
> +            if (rc == 1) {
> +                /* user callback has been called */
> +                return;
> +            }
> +        }
> +    }
> +
> +    qmp_ev_ensure_reading_writing(gc, ev);
> +
> +out:
> +    if (rc) {
> +        LOGD(ERROR, ev->domid,
> +             "Error happend with the QMP connection to QEMU");
> +
> +        /* On error, deallocate all private ressources */
> +        libxl__ev_qmp_dispose(gc, ev);
> +
> +        /* And tell libxl__ev_qmp user about the error */
> +        ev->callback(egc, ev, NULL, rc); /* must be last */
> +    }
> +}
> +
(...)
> +static int qmp_ev_callback_readable(libxl__egc *egc,
> +                                    libxl__ev_qmp *ev, int fd)
> +{
> +    EGC_GC;
> +
> +    while (1) {
> +        ssize_t r;
> +
> +        /* Check if the buffer still have space, or increase size */
> +        if (ev->rx_buf_size - ev->rx_buf_used < QMP_RECEIVE_BUFFER_SIZE) {
> +            ev->rx_buf_size = max(ev->rx_buf_size * 2,
> +                               (size_t)QMP_RECEIVE_BUFFER_SIZE * 2);
> +            assert(ev->rx_buf_size <= QMP_MAX_SIZE_RX_BUF);
> +            if (ev->rx_buf_size > QMP_MAX_SIZE_RX_BUF) {
> +                LOGD(ERROR, ev->domid,
> +                     "QMP receive buffer is too big (%ld > %lld)",
> +                     ev->rx_buf_size, QMP_MAX_SIZE_RX_BUF);
> +                return ERROR_BUFFERFULL;

What if you receive multiple messages (events?), but actually a single
message do fit in a buffer? I think it would be better to stop receiving
the data in the case of buffer full, but try to find a valid message
before erroring out.

> +            }
> +            ev->rx_buf = libxl__realloc(NOGC, ev->rx_buf, ev->rx_buf_size);
> +        }
> +
> +        r = read(fd, ev->rx_buf + ev->rx_buf_used,
> +                 ev->rx_buf_size - ev->rx_buf_used);
> +        if (r < 0) {
> +            if (errno == EINTR) {
> +                continue;
> +            }
> +            if (errno == EWOULDBLOCK) {
> +                break;
> +            }
> +            LOGED(ERROR, ev->domid, "error reading QMP socket");
> +            return ERROR_FAIL;
> +        }
> +
> +        if (r == 0) {
> +            LOGD(ERROR, ev->domid, "Unexpected EOF on QMP socket");
> +            return ERROR_PROTOCOL_ERROR_QMP;
> +        }
> +
> +        LOG_QMP("received %ldB: '%.*s'", r,
> +                (int)r, ev->rx_buf + ev->rx_buf_used);
> +
> +        ev->rx_buf_used += r;
> +        assert(ev->rx_buf_used <= ev->rx_buf_size);
> +    }
> +
> +    return 0;
> +}

(...)

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 05/11] libxl_qmp: Implementation of libxl__ev_qmp_*
  2018-11-22 19:04   ` Marek Marczykowski-Górecki
@ 2018-11-23 11:09     ` Anthony PERARD
  0 siblings, 0 replies; 31+ messages in thread
From: Anthony PERARD @ 2018-11-23 11:09 UTC (permalink / raw)
  To: Marek Marczykowski-Górecki; +Cc: xen-devel, Ian Jackson, Wei Liu

On Thu, Nov 22, 2018 at 08:04:53PM +0100, Marek Marczykowski-Górecki wrote:
> On Mon, Nov 12, 2018 at 04:49:24PM +0000, Anthony PERARD wrote:
> > +static int qmp_ev_callback_readable(libxl__egc *egc,
> > +                                    libxl__ev_qmp *ev, int fd)
> > +{
> > +    EGC_GC;
> > +
> > +    while (1) {
> > +        ssize_t r;
> > +
> > +        /* Check if the buffer still have space, or increase size */
> > +        if (ev->rx_buf_size - ev->rx_buf_used < QMP_RECEIVE_BUFFER_SIZE) {
> > +            ev->rx_buf_size = max(ev->rx_buf_size * 2,
> > +                               (size_t)QMP_RECEIVE_BUFFER_SIZE * 2);
> > +            assert(ev->rx_buf_size <= QMP_MAX_SIZE_RX_BUF);
> > +            if (ev->rx_buf_size > QMP_MAX_SIZE_RX_BUF) {
> > +                LOGD(ERROR, ev->domid,
> > +                     "QMP receive buffer is too big (%ld > %lld)",
> > +                     ev->rx_buf_size, QMP_MAX_SIZE_RX_BUF);
> > +                return ERROR_BUFFERFULL;
> 
> What if you receive multiple messages (events?), but actually a single
> message do fit in a buffer? I think it would be better to stop receiving
> the data in the case of buffer full, but try to find a valid message
> before erroring out.

Yes, that have been taken care of in the v7 that I'm about to send.

The new algorithme is the following:

while true:
    parse(buffer), until no more messages can be found
    read(socket), once
    if EWOULDBLOCK, exit

But thanks for the review.

FYI, that have been talked about in a different thread (because I
mistyped the git-send-email command):
<20181115111510.11628-1-anthony.perard@citrix.com>

-- 
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2018-11-23 11:09 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-12 16:49 [PATCH v6 00/11] libxl: Enable save/restore/migration of a restricted QEMU + libxl__ev_qmp_* Anthony PERARD
2018-11-12 16:49 ` [PATCH v6 01/11] libxl: Enhance libxl__sendmsg_fds to deal with EINTR and EWOULDBLOCK Anthony PERARD
2018-11-12 16:58   ` Ian Jackson
2018-11-12 16:49 ` [PATCH v6 02/11] libxl_qmp: Separate QMP message generation from qmp_send_prepare Anthony PERARD
2018-11-12 17:20   ` Ian Jackson
2018-11-13 10:59     ` Anthony PERARD
2018-11-13 12:04       ` Ian Jackson
2018-11-12 16:49 ` [PATCH v6 03/11] libxl_qmp: Change qmp_qemu_check_version to compare version Anthony PERARD
2018-11-12 17:22   ` Ian Jackson
2018-11-22 12:24     ` Anthony PERARD
2018-11-22 12:27       ` Anthony PERARD
2018-11-12 16:49 ` [PATCH v6 04/11] libxl: Design of an async API to issue QMP commands to QEMU Anthony PERARD
2018-11-12 17:29   ` Ian Jackson
2018-11-12 16:49 ` [PATCH v6 05/11] libxl_qmp: Implementation of libxl__ev_qmp_* Anthony PERARD
2018-11-13 11:33   ` Ian Jackson
2018-11-22 19:04   ` Marek Marczykowski-Górecki
2018-11-23 11:09     ` Anthony PERARD
2018-11-12 16:49 ` [PATCH v6 06/11] libxl_exec: Add libxl__spawn_initiate_failure Anthony PERARD
2018-11-12 17:34   ` Ian Jackson
2018-11-12 16:49 ` [PATCH v6 07/11] libxl_dm: Pre-open QMP socket for QEMU Anthony PERARD
2018-11-16 11:52   ` Ian Jackson
2018-11-21 17:14     ` Anthony PERARD
2018-11-12 16:49 ` [PATCH v6 08/11] libxl: QEMU startup sync based on QMP Anthony PERARD
2018-11-16 12:14   ` Ian Jackson
2018-11-21 16:49     ` Anthony PERARD
2018-11-22 14:37       ` Anthony PERARD
2018-11-22 15:48         ` Ian Jackson
2018-11-12 16:49 ` [PATCH v6 09/11] libxl_qmp: Store advertised QEMU version in libxl__ev_qmp Anthony PERARD
2018-11-16 12:17   ` Ian Jackson
2018-11-12 16:49 ` [PATCH v6 10/11] libxl: Change libxl__domain_suspend_device_model() to be async Anthony PERARD
2018-11-12 16:49 ` [PATCH v6 11/11] libxl: Re-implement domain_suspend_device_model using libxl__ev_qmp Anthony PERARD

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).