All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 00/10] libxl: synchronise domain configuration
@ 2014-07-10 14:32 Wei Liu
  2014-07-10 14:32 ` [PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it Wei Liu
                   ` (9 more replies)
  0 siblings, 10 replies; 70+ messages in thread
From: Wei Liu @ 2014-07-10 14:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, ian.jackson, ian.campbell

About one month ago Ian Jackson and I had a long conversation on how this
feature should be implemented in libxl. I rewrote major parts two weeks ago
before leaving for vacation so this series doesn't inherit old version number
anymore.

Please note that I'm conducting tests on this series, it's not yet ready to get
merged. I have a long list of testcases I want to go through and I won't be
surprise if I miss one place or another. The purpose of this posting is to see
if people have any comment on the design or strucutre of code or arrangement of
patches.

Wei.

Wei Liu (10):
  libxl: libxl-json format and internal functions to get / set it
  libxl_internal: functions to lock / unlock domain configuration
  libxl: store a copy of vanilla domain configuration when creating
    domain
  libxl: separate device add/rm complete callbacks
  libxl: synchronise configuration when we hotplug a device
  libxl: synchronise configuration when we remove/destroy a device
  libxl: make libxl_cd_insert "eject" + "insert"
  libxl: introduce libxl_get_memory_static_max
  libxl: introduce libxl_retrieve_domain_configuration
  xl: use libxl_retrieve_domain_configuration and JSON format

 docs/man/xl.pod.1            |    3 +
 tools/libxl/libxl.c          |  506 +++++++++++++++++++++++++++++++++++++-----
 tools/libxl/libxl.h          |   26 +++
 tools/libxl/libxl_create.c   |   76 +++++++
 tools/libxl/libxl_dom.c      |   14 +-
 tools/libxl/libxl_internal.c |  132 +++++++++++
 tools/libxl/libxl_internal.h |  170 ++++++++++++++
 tools/libxl/libxl_pci.c      |   31 +++
 tools/libxl/xl_cmdimpl.c     |  121 ++++++----
 tools/libxl/xl_cmdtable.c    |    4 +-
 10 files changed, 976 insertions(+), 107 deletions(-)

-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it
  2014-07-10 14:32 [PATCH v1 00/10] libxl: synchronise domain configuration Wei Liu
@ 2014-07-10 14:32 ` Wei Liu
  2014-07-16 16:11   ` Ian Campbell
                     ` (2 more replies)
  2014-07-10 14:32 ` [PATCH v1 02/10] libxl_internal: functions to lock / unlock domain configuration Wei Liu
                   ` (8 subsequent siblings)
  9 siblings, 3 replies; 70+ messages in thread
From: Wei Liu @ 2014-07-10 14:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, ian.jackson, ian.campbell

Introduce a new format in libxl userdata store called "libxl-json". This
file format contains JSON version of libxl_domain_config, generated by
libxl.

Two internal functions to get and set libxl_domain_configuration
are also introduced.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl.h          |    2 ++
 tools/libxl/libxl_internal.c |   56 ++++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.h |   10 ++++++++
 3 files changed, 68 insertions(+)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index e6e0301..2dc6a51 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1182,6 +1182,8 @@ void libxl_cpuid_set(libxl_ctx *ctx, uint32_t domid,
  *  "xl"          domain config file in xl format, Unix line endings
  *  "libvirt-xml" domain config file in libvirt XML format.  See
  *                http://libvirt.org/formatdomain.html
+ *  "libxl-json"  libxl_domain_config object in JSON format, generated
+ *                by libxl
  *
  * libxl does not enforce the registration of userdata userids or the
  * semantics of the data.  For specifications of the data formats
diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c
index 81f8985..dc47177 100644
--- a/tools/libxl/libxl_internal.c
+++ b/tools/libxl/libxl_internal.c
@@ -381,6 +381,62 @@ out:
     return rc;
 }
 
+int libxl__get_domain_configuration(libxl__gc *gc, uint32_t domid,
+                                    libxl_domain_config *d_config)
+{
+    uint8_t *data = NULL;
+    int rc, len;
+
+    rc = libxl_userdata_retrieve(CTX, domid, "libxl-json", &data, &len);
+    if (rc) {
+        LOGEV(ERROR, rc,
+              "failed to retrieve domain configuration for domain %d", domid);
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    if (len == 0) {
+        LOGE(ERROR, "configuration data stream empty for domain %d", domid);
+        rc = ERROR_FAIL;
+        goto out;
+    }
+    rc = libxl_domain_config_from_json(CTX, d_config, (const char *)data);
+
+out:
+    free(data);
+    return rc;
+}
+
+int libxl__set_domain_configuration(libxl__gc *gc, uint32_t domid,
+                                    libxl_domain_config *d_config)
+{
+    char *d_config_json;
+    int rc;
+
+    d_config_json = libxl_domain_config_to_json(CTX, d_config);
+    if (!d_config_json) {
+        LOGE(ERROR, "failed to convert domain configuration to JSON for domain %d",
+             domid);
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    rc = libxl_userdata_store(CTX, domid, "libxl-json",
+                              (const uint8_t *)d_config_json,
+                              strlen(d_config_json) + 1 /* include '\0' */);
+    if (rc) {
+        LOGEV(ERROR, rc, "failed to store domain configuration for domain %d",
+              domid);
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+out:
+    free(d_config_json);
+    return rc;
+}
+
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index beb052e..ef2111b 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3219,6 +3219,16 @@ static inline int libxl__key_value_list_is_empty(libxl_key_value_list *pkvl)
 
 int libxl__cpuid_policy_is_empty(libxl_cpuid_policy_list *pl);
 
+/*
+ * Retrieve / store domain configuration from / to libxl private
+ * data store. The registry entry in libxl private data store
+ * is "libxl-json".
+ */
+int libxl__get_domain_configuration(libxl__gc *gc, uint32_t domid,
+                                    libxl_domain_config *d_config);
+int libxl__set_domain_configuration(libxl__gc *gc, uint32_t domid,
+                                    libxl_domain_config *d_config);
+
 #endif
 
 /*
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v1 02/10] libxl_internal: functions to lock / unlock domain configuration
  2014-07-10 14:32 [PATCH v1 00/10] libxl: synchronise domain configuration Wei Liu
  2014-07-10 14:32 ` [PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it Wei Liu
@ 2014-07-10 14:32 ` Wei Liu
  2014-07-16 16:15   ` Ian Campbell
  2014-07-24 18:24   ` Ian Jackson
  2014-07-10 14:32 ` [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain Wei Liu
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 70+ messages in thread
From: Wei Liu @ 2014-07-10 14:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, ian.jackson, ian.campbell

Simple file lock taken from xl to serialise access to "libxl-json" file.
If a thread cannot get hold of the lock it waits due to F_SETLKW.

In order to generate lock file name, rename userdata_path to
libxl__userdata_path and declare it in libxl_internal.h

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_dom.c      |   14 ++++----
 tools/libxl/libxl_internal.c |   76 ++++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.h |   11 ++++++
 3 files changed, 94 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 83eb29a..a41e8fd 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1762,9 +1762,9 @@ char *libxl__uuid2string(libxl__gc *gc, const libxl_uuid uuid)
     return GCSPRINTF(LIBXL_UUID_FMT, LIBXL_UUID_BYTES(uuid));
 }
 
-static const char *userdata_path(libxl__gc *gc, uint32_t domid,
-                                      const char *userdata_userid,
-                                      const char *wh)
+const char *libxl__userdata_path(libxl__gc *gc, uint32_t domid,
+                                 const char *userdata_userid,
+                                 const char *wh)
 {
     libxl_ctx *ctx = libxl__gc_owner(gc);
     char *uuid_string;
@@ -1799,7 +1799,7 @@ void libxl__userdata_destroyall(libxl__gc *gc, uint32_t domid)
     glob_t gl;
     int r, i;
 
-    pattern = userdata_path(gc, domid, "*", "?");
+    pattern = libxl__userdata_path(gc, domid, "*", "?");
     if (!pattern)
         goto out;
 
@@ -1830,7 +1830,7 @@ int libxl_userdata_store(libxl_ctx *ctx, uint32_t domid,
     int e, rc;
     int fd = -1;
 
-    filename = userdata_path(gc, domid, userdata_userid, "d");
+    filename = libxl__userdata_path(gc, domid, userdata_userid, "d");
     if (!filename) {
         rc = ERROR_NOMEM;
         goto out;
@@ -1841,7 +1841,7 @@ int libxl_userdata_store(libxl_ctx *ctx, uint32_t domid,
         goto out;
     }
 
-    newfilename = userdata_path(gc, domid, userdata_userid, "n");
+    newfilename = libxl__userdata_path(gc, domid, userdata_userid, "n");
     if (!newfilename) {
         rc = ERROR_NOMEM;
         goto out;
@@ -1892,7 +1892,7 @@ int libxl_userdata_retrieve(libxl_ctx *ctx, uint32_t domid,
     int datalen = 0;
     void *data = 0;
 
-    filename = userdata_path(gc, domid, userdata_userid, "d");
+    filename = libxl__userdata_path(gc, domid, userdata_userid, "d");
     if (!filename) {
         rc = ERROR_NOMEM;
         goto out;
diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c
index dc47177..9cc60e8 100644
--- a/tools/libxl/libxl_internal.c
+++ b/tools/libxl/libxl_internal.c
@@ -436,6 +436,82 @@ out:
     return rc;
 }
 
+int libxl__lock_domain_configuration(libxl__gc *gc, uint32_t domid,
+                                     int *fd_lock)
+{
+    int rc;
+    struct flock fl;
+    const char *lockfile;
+
+    if (*fd_lock >= 0)
+        return ERROR_INVAL;
+
+    lockfile = libxl__userdata_path(gc, domid, "libxl-json.lock", "d");
+    if (!lockfile)
+        return ERROR_FAIL;
+
+    *fd_lock = open(lockfile, O_WRONLY|O_CREAT, S_IWUSR);
+    if (*fd_lock < 0) {
+        LOGE(ERROR, "cannot open lockfile %s errno=%d\n", lockfile, errno);
+        return ERROR_FAIL;
+    }
+
+    if (fcntl(*fd_lock, F_SETFD, FD_CLOEXEC) < 0) {
+        close(*fd_lock);
+        LOGE(ERROR, "cannot set cloexec to lockfile %s errno=%d\n",
+             lockfile, errno);
+        return ERROR_FAIL;
+    }
+
+get_lock:
+    fl.l_type = F_WRLCK;
+    fl.l_whence = SEEK_SET;
+    fl.l_start = 0;
+    fl.l_len = 0;
+    rc = fcntl(*fd_lock, F_SETLKW, &fl);
+    if (rc < 0 && errno == EINTR)
+        goto get_lock;
+
+    if (rc < 0) {
+        LOGE(ERROR, "cannot acquire lock %s errno=%d\n", lockfile, errno);
+        rc = ERROR_FAIL;
+    } else
+        rc = 0;
+
+    return rc;
+}
+
+int libxl__unlock_domain_configuration(libxl__gc *gc, uint32_t domid,
+                                       int *fd_lock)
+{
+    int rc;
+    struct flock fl;
+
+    if (*fd_lock < 0)
+        return ERROR_INVAL;
+
+release_lock:
+    fl.l_type = F_UNLCK;
+    fl.l_whence = SEEK_SET;
+    fl.l_start = 0;
+    fl.l_len = 0;
+    rc = fcntl(*fd_lock, F_SETLKW, &fl);
+    if (rc < 0 && errno == EINTR)
+        goto release_lock;
+
+    if (rc < 0) {
+        const char *lockfile;
+        lockfile = libxl__userdata_path(gc, domid, "libxl-json.lock", "d");
+        LOGE(ERROR, "cannot release lock %s, errno=%d\n", lockfile, errno);
+        rc = ERROR_FAIL;
+    } else
+        rc = 0;
+
+    close(*fd_lock);
+    *fd_lock = -1;
+
+    return rc;
+}
 
 /*
  * Local variables:
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index ef2111b..e9d93ac 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -991,6 +991,9 @@ _hidden int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
                                      uint32_t size, void *data);
 _hidden int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid);
 
+_hidden const char *libxl__userdata_path(libxl__gc *gc, uint32_t domid,
+                                         const char *userdata_userid,
+                                         const char *wh);
 _hidden void libxl__userdata_destroyall(libxl__gc *gc, uint32_t domid);
 
 _hidden int libxl__domain_resume(libxl__gc *gc, uint32_t domid,
@@ -3228,6 +3231,14 @@ int libxl__get_domain_configuration(libxl__gc *gc, uint32_t domid,
                                     libxl_domain_config *d_config);
 int libxl__set_domain_configuration(libxl__gc *gc, uint32_t domid,
                                     libxl_domain_config *d_config);
+/*
+ * Lock / unlock domain configuration in libxl private data store.
+ * fd_lock contains the file descriptor pointing to the lock file.
+ */
+int libxl__lock_domain_configuration(libxl__gc *gc, uint32_t domid,
+                                     int *fd_lock);
+int libxl__unlock_domain_configuration(libxl__gc *gc, uint32_t domid,
+                                       int *fd_lock);
 
 #endif
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
  2014-07-10 14:32 [PATCH v1 00/10] libxl: synchronise domain configuration Wei Liu
  2014-07-10 14:32 ` [PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it Wei Liu
  2014-07-10 14:32 ` [PATCH v1 02/10] libxl_internal: functions to lock / unlock domain configuration Wei Liu
@ 2014-07-10 14:32 ` Wei Liu
  2014-07-16 16:18   ` Ian Campbell
  2014-07-24 18:52   ` Ian Jackson
  2014-07-10 14:32 ` [PATCH v1 04/10] libxl: separate device add/rm complete callbacks Wei Liu
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 70+ messages in thread
From: Wei Liu @ 2014-07-10 14:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, ian.jackson, ian.campbell

This patch utilises new "libxl-json" file and stores a copy of user
supplied domain configuration in JSON form. This copy will be used as
template.

There're two save operations in total. The template config is first
saved right after the domain gets its UUID and domain id. Then after the
domain creation succeeds, the relevant bits touched by libxl are
synchronised to the stored config.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_create.c   |   76 ++++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.h |   14 ++++++++
 2 files changed, 90 insertions(+)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 0686f96..35fa7b7 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -705,6 +705,7 @@ static void initiate_domain_create(libxl__egc *egc,
     int i, ret;
     size_t last_devid = -1;
     bool pod_enabled = false;
+    libxl_domain_config d_config_saved;
 
     /* convenience aliases */
     libxl_domain_config *const d_config = dcs->guest_config;
@@ -713,6 +714,8 @@ static void initiate_domain_create(libxl__egc *egc,
 
     domid = 0;
 
+    libxl_domain_config_init(&d_config_saved);
+
     if (d_config->c_info.ssid_label) {
         char *s = d_config->c_info.ssid_label;
         ret = libxl_flask_context_to_sid(ctx, s, strlen(s),
@@ -789,6 +792,8 @@ static void initiate_domain_create(libxl__egc *egc,
         goto error_out;
     }
 
+    libxl_domain_config_copy(ctx, &d_config_saved, d_config);
+
     ret = libxl__domain_create_info_setdefault(gc, &d_config->c_info);
     if (ret) goto error_out;
 
@@ -803,6 +808,15 @@ static void initiate_domain_create(libxl__egc *egc,
     dcs->guest_domid = domid;
     dcs->dmss.dm.guest_domid = 0; /* means we haven't spawned */
 
+    /* At this point we've got domid and UUID, store configuration */
+    ret = libxl__set_domain_configuration(gc, domid, &d_config_saved);
+    libxl_domain_config_dispose(&d_config_saved);
+    if (ret) {
+        LOGE(ERROR, "cannot store domain configuration: %d", ret);
+        ret = ERROR_FAIL;
+        goto error_out;
+    }
+
     ret = libxl__domain_build_info_setdefault(gc, &d_config->b_info);
     if (ret) goto error_out;
 
@@ -1344,6 +1358,62 @@ error_out:
     domcreate_complete(egc, dcs, ret);
 }
 
+static void update_domain_config(libxl__gc *gc,
+                                 libxl_domain_config *dst,
+                                 const libxl_domain_config *src)
+{
+    /* update network interface information */
+    int i;
+
+    for (i = 0; i < src->num_nics; i++)
+        libxl__update_config_nic(gc, &dst->nics[i], &src->nics[i]);
+
+    /* update vtpm information */
+    for (i = 0; i < src->num_vtpms; i++)
+        libxl__update_config_vtpm(gc, &dst->vtpms[i], &src->vtpms[i]);
+
+    /* update guest UUID */
+    libxl_uuid_copy(CTX, &dst->c_info.uuid, &src->c_info.uuid);
+
+    /* video ram */
+    dst->b_info.video_memkb = src->b_info.video_memkb;
+}
+
+/* update the saved domain configuration with a callback function,
+ * which is responsible to pull in useful fields from src.
+ */
+typedef void (update_callback)(libxl__gc *, libxl_domain_config *,
+                               const libxl_domain_config *);
+static int libxl__update_domain_configuration(libxl__gc *gc,
+                                              uint32_t domid,
+                                              update_callback callback,
+                                              const libxl_domain_config *src)
+{
+    libxl_domain_config d_config_saved;
+    int rc;
+
+    libxl_domain_config_init(&d_config_saved);
+
+    rc = libxl__get_domain_configuration(gc, domid, &d_config_saved);
+
+    if (rc) {
+        LOG(ERROR, "cannot get domain configuration: %d", rc);
+        goto out;
+    }
+
+    callback(gc, &d_config_saved, src);
+
+    rc = libxl__set_domain_configuration(gc, domid, &d_config_saved);
+
+    if (rc)
+        LOG(ERROR, "cannot set domain configuration: %d", rc);
+
+    libxl_domain_config_dispose(&d_config_saved);
+
+out:
+    return rc;
+}
+
 static void domcreate_complete(libxl__egc *egc,
                                libxl__domain_create_state *dcs,
                                int rc)
@@ -1354,6 +1424,12 @@ static void domcreate_complete(libxl__egc *egc,
     if (!rc && d_config->b_info.exec_ssidref)
         rc = xc_flask_relabel_domain(CTX->xch, dcs->guest_domid, d_config->b_info.exec_ssidref);
 
+    if (!rc) {
+        rc = libxl__update_domain_configuration(gc, dcs->guest_domid,
+                                                update_domain_config,
+                                                d_config);
+    }
+
     if (rc) {
         if (dcs->guest_domid) {
             dcs->dds.ao = ao;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index e9d93ac..72d21ad 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3240,6 +3240,20 @@ int libxl__lock_domain_configuration(libxl__gc *gc, uint32_t domid,
 int libxl__unlock_domain_configuration(libxl__gc *gc, uint32_t domid,
                                        int *fd_lock);
 
+static inline void libxl__update_config_nic(libxl__gc *gc,
+                                            libxl_device_nic *dst,
+                                            libxl_device_nic *src)
+{
+    libxl_mac_copy(CTX, &dst->mac, &src->mac);
+}
+
+static inline void libxl__update_config_vtpm(libxl__gc *gc,
+                                             libxl_device_vtpm *dst,
+                                             libxl_device_vtpm *src)
+{
+    libxl_uuid_copy(CTX, &dst->uuid, &src->uuid);
+}
+
 #endif
 
 /*
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v1 04/10] libxl: separate device add/rm complete callbacks
  2014-07-10 14:32 [PATCH v1 00/10] libxl: synchronise domain configuration Wei Liu
                   ` (2 preceding siblings ...)
  2014-07-10 14:32 ` [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain Wei Liu
@ 2014-07-10 14:32 ` Wei Liu
  2014-07-16 16:26   ` Ian Campbell
  2014-07-10 14:32 ` [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device Wei Liu
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-10 14:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, ian.jackson, ian.campbell

This is in preparation for device hotplug / unplug configuration
synchronisation. No functional change introduced. This change is
necessary because we need to do different things for add and remove.
Using a single callback won't meet our need anymore.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl.c |   48 ++++++++++++++++++++++++++++++++++++------------
 1 file changed, 36 insertions(+), 12 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index a9205d1..e1ee7bf 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1794,22 +1794,26 @@ out:
 }
 
 /******************************************************************************/
+#define DEVICE_AO_FAILED_MSG                                            \
+    do {                                                                \
+        if (aodev->dev) {                                               \
+            LOG(ERROR, "unable to %s %s with id %u",                    \
+            libxl__device_action_to_string(aodev->action),              \
+            libxl__device_kind_to_string(aodev->dev->kind),             \
+            aodev->dev->devid);                                         \
+        } else {                                                        \
+            LOG(ERROR, "unable to %s device",                           \
+                libxl__device_action_to_string(aodev->action));         \
+        }                                                               \
+    } while (0)
 
 /* generic callback for devices that only need to set ao_complete */
-static void device_addrm_aocomplete(libxl__egc *egc, libxl__ao_device *aodev)
+static void device_add_aocomplete(libxl__egc *egc, libxl__ao_device *aodev)
 {
     STATE_AO_GC(aodev->ao);
 
     if (aodev->rc) {
-        if (aodev->dev) {
-            LOG(ERROR, "unable to %s %s with id %u",
-                        libxl__device_action_to_string(aodev->action),
-                        libxl__device_kind_to_string(aodev->dev->kind),
-                        aodev->dev->devid);
-        } else {
-            LOG(ERROR, "unable to %s device",
-                       libxl__device_action_to_string(aodev->action));
-        }
+        DEVICE_AO_FAILED_MSG;
         goto out;
     }
 
@@ -1818,6 +1822,26 @@ out:
     return;
 }
 
+#define DEFINE_DEVICE_RM_AOCOMPLETE(type)                               \
+    static void device_##type##_rm_aocomplete(libxl__egc *egc,          \
+                                              libxl__ao_device *aodev)  \
+    {                                                                   \
+        STATE_AO_GC(aodev->ao);                                         \
+        if (aodev->rc) {                                                \
+            DEVICE_AO_FAILED_MSG;                                       \
+            goto out;                                                   \
+        }                                                               \
+                                                                        \
+    out:                                                                \
+        libxl__ao_complete(egc, ao, aodev->rc);                         \
+    }
+
+DEFINE_DEVICE_RM_AOCOMPLETE(vtpm);
+DEFINE_DEVICE_RM_AOCOMPLETE(nic);
+DEFINE_DEVICE_RM_AOCOMPLETE(disk);
+DEFINE_DEVICE_RM_AOCOMPLETE(vfb);
+DEFINE_DEVICE_RM_AOCOMPLETE(vkb);
+
 /* common function to get next device id */
 static int libxl__device_nextid(libxl__gc *gc, uint32_t domid, char *device)
 {
@@ -3565,7 +3589,7 @@ out:
         libxl__prepare_ao_device(ao, aodev);                            \
         aodev->action = LIBXL__DEVICE_ACTION_REMOVE;                    \
         aodev->dev = device;                                            \
-        aodev->callback = device_addrm_aocomplete;                      \
+        aodev->callback = device_##type##_rm_aocomplete;                \
         aodev->force = f;                                               \
         libxl__initiate_device_remove(egc, aodev);                      \
                                                                         \
@@ -3618,7 +3642,7 @@ DEFINE_DEVICE_REMOVE(vtpm, destroy, 1)
                                                                         \
         GCNEW(aodev);                                                   \
         libxl__prepare_ao_device(ao, aodev);                            \
-        aodev->callback = device_addrm_aocomplete;                      \
+        aodev->callback = device_add_aocomplete;                        \
         libxl__device_##type##_add(egc, domid, type, aodev);            \
                                                                         \
         return AO_INPROGRESS;                                           \
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device
  2014-07-10 14:32 [PATCH v1 00/10] libxl: synchronise domain configuration Wei Liu
                   ` (3 preceding siblings ...)
  2014-07-10 14:32 ` [PATCH v1 04/10] libxl: separate device add/rm complete callbacks Wei Liu
@ 2014-07-10 14:32 ` Wei Liu
  2014-07-16 16:48   ` Ian Campbell
  2014-07-25 16:06   ` Ian Jackson
  2014-07-10 14:32 ` [PATCH v1 06/10] libxl: synchronise configuration when we remove/destroy " Wei Liu
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 70+ messages in thread
From: Wei Liu @ 2014-07-10 14:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, ian.jackson, ian.campbell

We update JSON version first, then write to xenstore, so that we
maintain the following invariant: any device which is present in
xenstore has a corresponding entry in JSON.

The locking pattern is as followed:
 1. lock JSON domain config
 2. write JSON entry
 3. write xenstore entry for a device
 4. unlock JSON domain config

And we ensure in code that JSON entry is always written before the
xenstore entry is written. That is, if 2 fails, 3 will not be executed.
We also ensure that if 2 and 3 are invoked in a loop, the previous
added JSON entry is removed before writing new one.

The locking patten is designed like this, because when adding a disk,
the caller might specify get_vdev which will get called in the middle of
a xenstore transaction to determine the identifier of a disk. Other
devices don't have this property, but I make them follow the same
pattern for consistency.

As we don't have a JSON config file for libxl toolstack domain
(currently Dom0) we need to skip JSON manipulation for it.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl.c          |   96 ++++++++++++++++++++++++++++++++++++++--
 tools/libxl/libxl_internal.h |   99 ++++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_pci.c      |   18 ++++++++
 3 files changed, 210 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index e1ee7bf..5f9f94a 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1904,6 +1904,11 @@ void libxl__device_vtpm_add(libxl__egc *egc, uint32_t domid,
     flexarray_t *back;
     libxl__device *device;
     unsigned int rc;
+    libxl_device_vtpm vtpm_saved;
+    int lock = -1;
+
+    libxl_device_vtpm_init(&vtpm_saved);
+    libxl_device_vtpm_copy(CTX, &vtpm_saved, vtpm);
 
     rc = libxl__device_vtpm_setdefault(gc, vtpm);
     if (rc) goto out;
@@ -1917,6 +1922,8 @@ void libxl__device_vtpm_add(libxl__egc *egc, uint32_t domid,
             goto out;
         }
     }
+    vtpm_saved.devid = vtpm->devid;
+    libxl__update_config_vtpm(gc, &vtpm_saved, vtpm);
 
     GCNEW(device);
     rc = libxl__device_from_vtpm(gc, domid, vtpm, device);
@@ -1943,17 +1950,30 @@ void libxl__device_vtpm_add(libxl__egc *egc, uint32_t domid,
     flexarray_append(front, "handle");
     flexarray_append(front, GCSPRINTF("%d", vtpm->devid));
 
+    LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc);
+    if (rc) goto out;
+
+    DEVICE_ADD_JSON(vtpm, vtpms, num_vtpms, domid, &vtpm_saved,
+                    COMPARE_DEVID, rc);
+    if (rc) {
+        UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
+        goto out;
+    }
+
     libxl__device_generic_add(gc, XBT_NULL, device,
                               libxl__xs_kvs_of_flexarray(gc, back, back->count),
                               libxl__xs_kvs_of_flexarray(gc, front, front->count),
                               NULL);
 
+    UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
+
     aodev->dev = device;
     aodev->action = LIBXL__DEVICE_ACTION_ADD;
     libxl__wait_device_connection(egc, aodev);
 
     rc = 0;
 out:
+    libxl_device_vtpm_dispose(&vtpm_saved);
     aodev->rc = rc;
     if(rc) aodev->callback(egc, aodev);
     return;
@@ -2186,6 +2206,10 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
     int rc;
     libxl_ctx *ctx = gc->owner;
     xs_transaction_t t = XBT_NULL;
+    int lock = -1;
+    libxl_device_disk disk_saved;
+
+    libxl_device_disk_init(&disk_saved);
 
     libxl_domain_type type = libxl__domain_type(gc, domid);
     if (type == LIBXL_DOMAIN_TYPE_INVALID) {
@@ -2193,21 +2217,56 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
         goto out;
     }
 
+    libxl_device_disk_copy(CTX, &disk_saved, disk);
+
+    LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc);
+    if (rc) goto out;
+
     for (;;) {
         rc = libxl__xs_transaction_start(gc, &t);
-        if (rc) goto out;
+        if (rc) {
+            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
+            goto out;
+        }
+
+        /* This is a loop, if disk has been added to JSON before, we
+         * need to remove it, then add it later, because disk->vdev
+         * might change.
+         */
+        DEVICE_REMOVE_JSON(disk, disks, num_disks, domid, &disk_saved,
+                           COMPARE_DISK, rc);
+        if (rc) {
+            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
+            goto out;
+        }
 
         if (get_vdev) {
             assert(get_vdev_user);
             disk->vdev = get_vdev(gc, get_vdev_user, t);
             if (disk->vdev == NULL) {
                 rc = ERROR_FAIL;
+                UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
                 goto out;
             }
         }
 
+        if (strcmp(disk_saved.vdev, disk->vdev)) {
+            free(disk_saved.vdev);
+            disk_saved.vdev = libxl__strdup(NOGC, disk->vdev);
+        }
+
+        DEVICE_ADD_JSON(disk, disks, num_disks, domid, &disk_saved,
+                        COMPARE_DISK, rc);
+        if (rc) {
+            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
+            goto out;
+        }
+
         rc = libxl__device_disk_setdefault(gc, disk);
-        if (rc) goto out;
+        if (rc) {
+            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
+            goto out;
+        }
 
         front = flexarray_make(gc, 16, 1);
         back = flexarray_make(gc, 16, 1);
@@ -2217,6 +2276,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
         if (rc != 0) {
             LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Invalid or unsupported"
                    " virtual disk identifier %s", disk->vdev);
+            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
             goto out;
         }
 
@@ -2257,6 +2317,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
                     if (!dev) {
                         LOG(ERROR, "failed to get blktap devpath for %p\n",
                             disk->pdev_path);
+                        UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
                         rc = ERROR_FAIL;
                         goto out;
                     }
@@ -2281,6 +2342,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
             default:
                 LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "unrecognized disk backend type: %d\n", disk->backend);
                 rc = ERROR_INVAL;
+                UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
                 goto out;
         }
 
@@ -2343,9 +2405,14 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
 
         rc = libxl__xs_transaction_commit(gc, &t);
         if (!rc) break;
-        if (rc < 0) goto out;
+        if (rc < 0) {
+            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
+            goto out;
+        }
     }
 
+    UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
+
     aodev->dev = device;
     aodev->action = LIBXL__DEVICE_ACTION_ADD;
     libxl__wait_device_connection(egc, aodev);
@@ -2353,6 +2420,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
     rc = 0;
 
 out:
+    libxl_device_disk_dispose(&disk_saved);
     libxl__xs_transaction_abort(gc, &t);
     aodev->rc = rc;
     if (rc) aodev->callback(egc, aodev);
@@ -3009,6 +3077,11 @@ void libxl__device_nic_add(libxl__egc *egc, uint32_t domid,
     flexarray_t *back;
     libxl__device *device;
     unsigned int rc;
+    int lock = -1;
+    libxl_device_nic nic_saved;
+
+    libxl_device_nic_init(&nic_saved);
+    libxl_device_nic_copy(CTX, &nic_saved, nic);
 
     rc = libxl__device_nic_setdefault(gc, nic, domid);
     if (rc) goto out;
@@ -3023,6 +3096,9 @@ void libxl__device_nic_add(libxl__egc *egc, uint32_t domid,
         }
     }
 
+    nic_saved.devid = nic->devid;
+    libxl__update_config_nic(gc, &nic_saved, nic);
+
     GCNEW(device);
     rc = libxl__device_from_nic(gc, domid, nic, device);
     if ( rc != 0 ) goto out;
@@ -3079,17 +3155,31 @@ void libxl__device_nic_add(libxl__egc *egc, uint32_t domid,
     flexarray_append(front, "mac");
     flexarray_append(front, libxl__sprintf(gc,
                                     LIBXL_MAC_FMT, LIBXL_MAC_BYTES(nic->mac)));
+
+    LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc);
+    if (rc) goto out;
+
+    DEVICE_ADD_JSON(nic, nics, num_nics, domid, nic, COMPARE_DEVID, rc);
+
+    if (rc) {
+        UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
+        goto out;
+    }
+
     libxl__device_generic_add(gc, XBT_NULL, device,
                               libxl__xs_kvs_of_flexarray(gc, back, back->count),
                               libxl__xs_kvs_of_flexarray(gc, front, front->count),
                               NULL);
 
+    UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
+
     aodev->dev = device;
     aodev->action = LIBXL__DEVICE_ACTION_ADD;
     libxl__wait_device_connection(egc, aodev);
 
     rc = 0;
 out:
+    libxl_device_nic_dispose(&nic_saved);
     aodev->rc = rc;
     if (rc) aodev->callback(egc, aodev);
     return;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 72d21ad..f27a6e98 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3254,6 +3254,105 @@ static inline void libxl__update_config_vtpm(libxl__gc *gc,
     libxl_uuid_copy(CTX, &dst->uuid, &src->uuid);
 }
 
+/* Macro used to add the new device to JSON template */
+#define COMPARE_DEVID(a, b) ((a)->devid == (b)->devid)
+#define COMPARE_DISK(a, b) (!strcmp((a)->vdev, (b)->vdev))
+#define COMPARE_PCI(a, b) ((a)->func == (b)->func &&    \
+                           (a)->bus == (b)->bus &&      \
+                           (a)->dev == (b)->dev)
+
+#define LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc)                 \
+    do {                                                             \
+        if (domid != LIBXL_TOOLSTACK_DOMID)                          \
+            rc = libxl__lock_domain_configuration(gc, domid, &lock); \
+        else                                                         \
+            rc = 0;                                                  \
+    } while (0)
+
+#define UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock)                      \
+    do {                                                                \
+        if (domid != LIBXL_TOOLSTACK_DOMID)                             \
+            libxl__unlock_domain_configuration(gc, domid, &lock);       \
+    } while (0)
+
+#define DEVICE_ADD_JSON(type, ptr, cnt, domid, dev, compare, rc)        \
+    do {                                                                \
+        int x;                                                          \
+        libxl_domain_config d_config;                                   \
+        libxl_device_##type *p;                                         \
+                                                                        \
+        if (domid == LIBXL_TOOLSTACK_DOMID)                             \
+            break;                                                      \
+                                                                        \
+        libxl_domain_config_init(&d_config);                            \
+                                                                        \
+        rc = libxl__get_domain_configuration(gc, (domid), &d_config);   \
+        if (rc)                                                         \
+            goto add_done;                                              \
+                                                                        \
+        /* Check for duplicated device */                               \
+        for (x = 0; x < d_config.cnt; x++) {                            \
+            if (compare(&d_config.ptr[x], (dev))) {                     \
+                rc = 0;                                                 \
+                goto add_done;                                          \
+            }                                                           \
+        }                                                               \
+                                                                        \
+        d_config.ptr =                                                  \
+            libxl__realloc(gc, d_config.ptr,                            \
+                           (d_config.cnt + 1) *                         \
+                           sizeof(libxl_device_##type));                \
+        p = &d_config.ptr[d_config.cnt];                                \
+        d_config.cnt++;                                                 \
+        libxl_device_##type##_copy(CTX, p, (dev));                      \
+                                                                        \
+        rc = libxl__set_domain_configuration(gc, (domid), &d_config);   \
+                                                                        \
+    add_done:                                                           \
+        libxl_domain_config_dispose(&d_config);                         \
+    } while (0)
+
+#define DEVICE_REMOVE_JSON(type, ptr, cnt, domid, dev, compare, rc)     \
+    do {                                                                \
+        int i, j;                                                       \
+        libxl_device_##type *p = dev;                                   \
+        libxl_domain_config d_config;                                   \
+        bool removed = false;                                           \
+                                                                        \
+        if (domid == LIBXL_TOOLSTACK_DOMID)                             \
+            break;                                                      \
+                                                                        \
+        libxl_domain_config_init(&d_config);                            \
+                                                                        \
+        rc = libxl__get_domain_configuration(gc, (domid), &d_config);   \
+        if (rc)                                                         \
+            goto remove_done;                                           \
+                                                                        \
+        for (i = j = 0; i < d_config.cnt; i++) {                        \
+            if (!compare(&d_config.ptr[i], p)) {                        \
+                if (i != j) {                                           \
+                    libxl_device_##type##_dispose(&d_config.ptr[j]);    \
+                    d_config.ptr[j] = d_config.ptr[i];                  \
+                    removed = true;                                     \
+                }                                                       \
+                j++;                                                    \
+            }                                                           \
+        }                                                               \
+                                                                        \
+        if (!removed)  /* no matching entry found */                    \
+            goto remove_done;                                           \
+                                                                        \
+        d_config.ptr =                                                  \
+            libxl__realloc(NOGC, d_config.ptr,                          \
+                           j * sizeof(libxl_device_##type));            \
+        d_config.cnt = j;                                               \
+                                                                        \
+        rc = libxl__set_domain_configuration(gc, (domid), &d_config);   \
+                                                                        \
+    remove_done:                                                        \
+        libxl_domain_config_dispose(&d_config);                         \
+    } while (0)
+
 #endif
 
 /*
diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index 44d0453..d0d8f90 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -1034,6 +1034,10 @@ int libxl__device_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcide
     libxl_device_pci *assigned;
     int num_assigned, i, rc;
     int stubdomid = 0;
+    int lock = -1;
+    libxl_device_pci pcidev_saved;
+
+    libxl_device_pci_init(&pcidev_saved);
 
     if (libxl__domain_type(gc, domid) == LIBXL_DOMAIN_TYPE_HVM) {
         rc = xc_test_assign_device(ctx->xch, domid, pcidev_encode_bdf(pcidev));
@@ -1047,6 +1051,8 @@ int libxl__device_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcide
         }
     }
 
+    libxl_device_pci_copy(CTX, &pcidev_saved, pcidev);
+
     rc = libxl__device_pci_setdefault(gc, pcidev);
     if (rc) goto out;
 
@@ -1104,6 +1110,15 @@ int libxl__device_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcide
         pfunc_mask = (1 << pcidev->func);
     }
 
+    LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc);
+    if (rc) goto out;
+
+    DEVICE_ADD_JSON(pci, pcidevs, num_pcidevs, domid, pcidev, COMPARE_PCI, rc);
+    if (rc) {
+        UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
+        goto out;
+    }
+
     for(rc = 0, i = 7; i >= 0; --i) {
         if ( (1 << i) & pfunc_mask ) {
             if ( pcidev->vfunc_mask == pfunc_mask ) {
@@ -1121,7 +1136,10 @@ int libxl__device_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcide
         }
     }
 
+    UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
+
 out:
+    libxl_device_pci_dispose(&pcidev_saved);
     return rc;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v1 06/10] libxl: synchronise configuration when we remove/destroy a device
  2014-07-10 14:32 [PATCH v1 00/10] libxl: synchronise domain configuration Wei Liu
                   ` (4 preceding siblings ...)
  2014-07-10 14:32 ` [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device Wei Liu
@ 2014-07-10 14:32 ` Wei Liu
  2014-07-16 16:58   ` Ian Campbell
  2014-07-10 14:32 ` [PATCH v1 07/10] libxl: make libxl_cd_insert "eject" + "insert" Wei Liu
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-10 14:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, ian.jackson, ian.campbell

Xenstore is updated first, JSON is updated later, so that we maintain
the same invariant as previous commit: any device which is present in
xenstore has an entry in JSON.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl.c          |   25 +++++++++++++++++++------
 tools/libxl/libxl_internal.h |    2 ++
 tools/libxl/libxl_pci.c      |   13 +++++++++++++
 3 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 5f9f94a..58f07d2 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1822,7 +1822,7 @@ out:
     return;
 }
 
-#define DEFINE_DEVICE_RM_AOCOMPLETE(type)                               \
+#define DEFINE_DEVICE_RM_AOCOMPLETE(type, ptr, cnt, compare)            \
     static void device_##type##_rm_aocomplete(libxl__egc *egc,          \
                                               libxl__ao_device *aodev)  \
     {                                                                   \
@@ -1830,17 +1830,29 @@ out:
         if (aodev->rc) {                                                \
             DEVICE_AO_FAILED_MSG;                                       \
             goto out;                                                   \
+        } else {                                                        \
+            int lock = -1;                                              \
+                                                                        \
+            LOCK_DOMAIN_JSON_CONFIG(gc, aodev->dev->domid,              \
+                                    lock, aodev->rc);                   \
+            if (aodev->rc)                                              \
+                goto out;                                               \
+                                                                        \
+            DEVICE_REMOVE_JSON(type, ptr, cnt, aodev->dev->domid,       \
+                               aodev->pdev, compare, aodev->rc);        \
+                                                                        \
+            UNLOCK_DOMAIN_JSON_CONFIG(gc, aodev->dev->domid, lock);     \
         }                                                               \
                                                                         \
     out:                                                                \
         libxl__ao_complete(egc, ao, aodev->rc);                         \
     }
 
-DEFINE_DEVICE_RM_AOCOMPLETE(vtpm);
-DEFINE_DEVICE_RM_AOCOMPLETE(nic);
-DEFINE_DEVICE_RM_AOCOMPLETE(disk);
-DEFINE_DEVICE_RM_AOCOMPLETE(vfb);
-DEFINE_DEVICE_RM_AOCOMPLETE(vkb);
+DEFINE_DEVICE_RM_AOCOMPLETE(vtpm, vtpms, num_vtpms, COMPARE_DEVID);
+DEFINE_DEVICE_RM_AOCOMPLETE(nic, nics, num_nics, COMPARE_DEVID);
+DEFINE_DEVICE_RM_AOCOMPLETE(disk, disks, num_disks, COMPARE_DISK);
+DEFINE_DEVICE_RM_AOCOMPLETE(vfb, vfbs, num_vfbs, COMPARE_DEVID);
+DEFINE_DEVICE_RM_AOCOMPLETE(vkb, vkbs, num_vkbs, COMPARE_DEVID);
 
 /* common function to get next device id */
 static int libxl__device_nextid(libxl__gc *gc, uint32_t domid, char *device)
@@ -3677,6 +3689,7 @@ out:
                                                                         \
         GCNEW(aodev);                                                   \
         libxl__prepare_ao_device(ao, aodev);                            \
+        aodev->pdev = type;                                             \
         aodev->action = LIBXL__DEVICE_ACTION_REMOVE;                    \
         aodev->dev = device;                                            \
         aodev->callback = device_##type##_rm_aocomplete;                \
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index f27a6e98..b4f518a 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2131,6 +2131,8 @@ struct libxl__ao_device {
     int num_exec;
     /* for calling hotplug scripts */
     libxl__async_exec_state aes;
+    /* point to libxl_device_TYPE */
+    void *pdev;
 };
 
 /*
diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index d0d8f90..cab3a1b 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -1349,6 +1349,19 @@ int libxl_device_pci_remove(libxl_ctx *ctx, uint32_t domid,
 
     rc = libxl__device_pci_remove_common(gc, domid, pcidev, 0);
 
+    if (!rc) {
+        int lock = -1;
+
+        LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc);
+        if (rc)
+            goto out;
+
+        DEVICE_REMOVE_JSON(pci, pcidevs, num_pcidevs, domid,
+                           pcidev, COMPARE_PCI, rc);
+
+        UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
+    }
+out:
     libxl__ao_complete(egc, ao, rc);
     return AO_INPROGRESS;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v1 07/10] libxl: make libxl_cd_insert "eject" + "insert"
  2014-07-10 14:32 [PATCH v1 00/10] libxl: synchronise domain configuration Wei Liu
                   ` (5 preceding siblings ...)
  2014-07-10 14:32 ` [PATCH v1 06/10] libxl: synchronise configuration when we remove/destroy " Wei Liu
@ 2014-07-10 14:32 ` Wei Liu
  2014-07-17 10:44   ` Ian Campbell
  2014-07-10 14:32 ` [PATCH v1 08/10] libxl: introduce libxl_get_memory_static_max Wei Liu
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-10 14:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, ian.jackson, ian.campbell

A "cdrom insert" is always processed as "eject" + "insert", with JSON
config updated in between. So that we can know the correct state of
CDROM later when we try to retrieve domain configuration: if xenstore is
"empty", then CDROM is "empty"; otherwise use the information presented
in JSON.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl.c          |   48 ++++++++++++++++++++++++++++++++++++++++--
 tools/libxl/libxl_internal.h |   34 ++++++++++++++++++++++++++++++
 2 files changed, 80 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 58f07d2..69d94b1 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -2654,8 +2654,9 @@ int libxl_device_disk_getinfo(libxl_ctx *ctx, uint32_t domid,
     return 0;
 }
 
-int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk,
-                       const libxl_asyncop_how *ao_how)
+static int libxl__cdrom_insert(libxl_ctx *ctx, uint32_t domid,
+                               libxl_device_disk *disk,
+                               const libxl_asyncop_how *ao_how)
 {
     AO_CREATE(ctx, domid, ao_how);
     int num = 0, i;
@@ -2766,6 +2767,49 @@ out:
     return AO_INPROGRESS;
 }
 
+/*
+ * A "cdrom insert" is always processed as "eject" + "insert", with
+ * updating JSON in between. So that we can know the current state of
+ * CDROM later when we try to retrieve domain configuration: if
+ * xenstore is "empty", then CDROM is "empty"; otherwise use the image
+ * in JSON.
+ */
+int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk,
+                       const libxl_asyncop_how *ao_how)
+{
+    GC_INIT(ctx);
+    libxl_device_disk empty;
+    int rc;
+    int lock = -1;
+
+    libxl_device_disk_init(&empty);
+    empty.format = LIBXL_DISK_FORMAT_EMPTY;
+    empty.vdev = libxl__strdup(NOGC, disk->vdev);
+    empty.is_cdrom = 1;
+
+    rc = libxl__cdrom_insert(ctx, domid, &empty, NULL);
+    if (rc)
+        goto out;
+
+    LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc);
+    if (rc)
+        goto out;
+    DEVICE_UPDATE_JSON(disk, disks, num_disks, domid, disk,
+                       COMPARE_DISK, rc);
+    UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
+    if (rc)
+        goto out;
+
+    rc = libxl__cdrom_insert(ctx, domid, disk, ao_how);
+    if (rc)
+        goto out;
+
+out:
+    libxl_device_disk_dispose(&empty);
+    GC_FREE;
+    return rc;
+}
+
 /* libxl__alloc_vdev only works on the local domain, that is the domain
  * where the toolstack is running */
 static char * libxl__alloc_vdev(libxl__gc *gc, void *get_vdev_user,
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b4f518a..f8f2ba2 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3355,6 +3355,40 @@ static inline void libxl__update_config_vtpm(libxl__gc *gc,
         libxl_domain_config_dispose(&d_config);                         \
     } while (0)
 
+/* Search device list for device with the same identifier as "dev",
+ * update that device with the content pointed to by "dev".
+ */
+#define DEVICE_UPDATE_JSON(type, ptr, cnt, domid, dev, compare, rc)     \
+    do {                                                                \
+        int x;                                                          \
+        libxl_domain_config d_config;                                   \
+        bool updated = false;                                           \
+                                                                        \
+        if (domid == LIBXL_TOOLSTACK_DOMID)                             \
+            break;                                                      \
+                                                                        \
+        libxl_domain_config_init(&d_config);                            \
+                                                                        \
+        rc = libxl__get_domain_configuration(gc, domid, &d_config);     \
+        if (rc)                                                         \
+            goto update_done;                                           \
+                                                                        \
+        for (x = 0; x < d_config.cnt; x++) {                            \
+            if (compare(&d_config.ptr[x], (dev))) {                     \
+                libxl_device_##type##_dispose(&d_config.ptr[x]);        \
+                libxl_device_##type##_copy(CTX, &d_config.ptr[x],       \
+                                           (dev));                      \
+                updated = true;                                         \
+            }                                                           \
+        }                                                               \
+                                                                        \
+        if (updated)                                                    \
+            rc = libxl__set_domain_configuration(gc, domid, &d_config); \
+                                                                        \
+    update_done:                                                        \
+        libxl_domain_config_dispose(&d_config);                         \
+    } while (0)
+
 #endif
 
 /*
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v1 08/10] libxl: introduce libxl_get_memory_static_max
  2014-07-10 14:32 [PATCH v1 00/10] libxl: synchronise domain configuration Wei Liu
                   ` (6 preceding siblings ...)
  2014-07-10 14:32 ` [PATCH v1 07/10] libxl: make libxl_cd_insert "eject" + "insert" Wei Liu
@ 2014-07-10 14:32 ` Wei Liu
  2014-07-17 10:47   ` Ian Campbell
  2014-07-10 14:32 ` [PATCH v1 09/10] libxl: introduce libxl_retrieve_domain_configuration Wei Liu
  2014-07-10 14:32 ` [PATCH v1 10/10] xl: use libxl_retrieve_domain_configuration and JSON format Wei Liu
  9 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-10 14:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, ian.jackson, ian.campbell

... which returns the "static-max" knob of a domain. It will be used in
later patch to retrieve memory static-max value of a domain.

As libxl_get_memory_{target, static_max} have similar logic, a macro is
introduced to avoid code duplication. libxl__fill_dom0_memory_info is
adjusted as needed.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl.c |  101 ++++++++++++++++++++++++++++++++-------------------
 tools/libxl/libxl.h |    8 ++++
 2 files changed, 72 insertions(+), 37 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 69d94b1..49d7ef6 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -4187,7 +4187,8 @@ out:
     return rc;
 }
 
-static int libxl__fill_dom0_memory_info(libxl__gc *gc, uint32_t *target_memkb)
+static int libxl__fill_dom0_memory_info(libxl__gc *gc, uint32_t *target_memkb,
+                                        uint32_t *max_memkb)
 {
     int rc;
     libxl_dominfo info;
@@ -4221,6 +4222,16 @@ retry_transaction:
         }
     }
 
+    if (staticmax) {
+        *max_memkb = strtoul(staticmax, &endptr, 10);
+        if (*endptr != '\0') {
+            LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR,
+                    "invalid memory static-max %s from %s\n", staticmax, max_path);
+            rc = ERROR_FAIL;
+            goto out;
+        }
+    }
+
     rc = libxl_domain_info(ctx, &info, 0);
     if (rc < 0)
         goto out;
@@ -4267,12 +4278,12 @@ static int libxl__get_free_memory_slack(libxl__gc *gc, uint32_t *free_mem_slack)
     int rc;
     char *free_mem_slack_path = "/local/domain/0/memory/freemem-slack";
     char *free_mem_slack_s, *endptr;
-    uint32_t target_memkb;
+    uint32_t target_memkb, max_memkb;
 
 retry:
     free_mem_slack_s = libxl__xs_read(gc, XBT_NULL, free_mem_slack_path);
     if (!free_mem_slack_s) {
-        rc = libxl__fill_dom0_memory_info(gc, &target_memkb);
+        rc = libxl__fill_dom0_memory_info(gc, &target_memkb, &max_memkb);
         if (rc < 0)
             return rc;
         goto retry;
@@ -4295,6 +4306,7 @@ int libxl_set_memory_target(libxl_ctx *ctx, uint32_t domid,
     int rc = 1, abort_transaction = 0;
     uint32_t memorykb = 0, videoram = 0;
     uint32_t current_target_memkb = 0, new_target_memkb = 0;
+    uint32_t current_max_memkb = 0;
     char *memmax, *endptr, *videoram_s = NULL, *target = NULL;
     char *dompath = libxl__xs_get_dompath(gc, domid);
     xc_domaininfo_t info;
@@ -4310,7 +4322,8 @@ retry_transaction:
     if (!target && !domid) {
         if (!xs_transaction_end(ctx->xsh, t, 1))
             goto out_no_transaction;
-        rc = libxl__fill_dom0_memory_info(gc, &current_target_memkb);
+        rc = libxl__fill_dom0_memory_info(gc, &current_target_memkb,
+                                          &current_max_memkb);
         if (rc < 0)
             goto out_no_transaction;
         goto retry_transaction;
@@ -4425,41 +4438,55 @@ out_no_transaction:
     return rc;
 }
 
-int libxl_get_memory_target(libxl_ctx *ctx, uint32_t domid, uint32_t *out_target)
-{
-    GC_INIT(ctx);
-    int rc = 1;
-    char *target = NULL, *endptr = NULL;
-    char *dompath = libxl__xs_get_dompath(gc, domid);
-    uint32_t target_memkb;
-
-    target = libxl__xs_read(gc, XBT_NULL, libxl__sprintf(gc,
-                "%s/memory/target", dompath));
-    if (!target && !domid) {
-        rc = libxl__fill_dom0_memory_info(gc, &target_memkb);
-        if (rc < 0)
-            goto out;
-    } else if (!target) {
-        LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR,
-                "cannot get target memory info from %s/memory/target\n",
-                dompath);
-        goto out;
-    } else {
-        target_memkb = strtoul(target, &endptr, 10);
-        if (*endptr != '\0') {
-            LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR,
-                    "invalid memory target %s from %s/memory/target\n",
-                    target, dompath);
-            goto out;
-        }
+/* This macro generates:
+ *   libxl_get_memory_target
+ *   libxl_get_memory_static_max
+ */
+#define DEFINE_GET_MEMORY(name, node)                                   \
+    int libxl_get_memory_##name(libxl_ctx *ctx, uint32_t domid,         \
+                                uint32_t *out)                          \
+    {                                                                   \
+        GC_INIT(ctx);                                                   \
+        int rc = 1;                                                     \
+        char *val = NULL, *endptr = NULL;                               \
+        char *dompath = libxl__xs_get_dompath(gc, domid);               \
+        uint32_t result;                                                \
+        uint32_t target, static_max;                                    \
+                                                                        \
+        val = libxl__xs_read(gc, XBT_NULL,                              \
+                             libxl__sprintf(gc, "%s/memory/"node,       \
+                                            dompath));                  \
+        if (!val && !domid) {                                           \
+            rc = libxl__fill_dom0_memory_info(gc, &target,              \
+                                              &static_max);             \
+            if (rc < 0)                                                 \
+                goto out;                                               \
+            result = name;                                              \
+        } else if (!val) {                                              \
+            LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR,                     \
+                             "cannot not get memory info from %s/memory/"node"\n", \
+                             dompath);                                  \
+            goto out;                                                   \
+        } else {                                                        \
+            result = strtoul(val, &endptr, 10);                         \
+            if (*endptr != '\0') {                                      \
+                LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR,                 \
+                                 "invalid value %s from %s/memory/"node"\n", \
+                                 val, dompath);                         \
+                goto out;                                               \
+            }                                                           \
+        }                                                               \
+        *out = result;                                                  \
+        rc = 0;                                                         \
+                                                                        \
+    out:                                                                \
+        GC_FREE;                                                        \
+        return rc;                                                      \
     }
-    *out_target = target_memkb;
-    rc = 0;
 
-out:
-    GC_FREE;
-    return rc;
-}
+DEFINE_GET_MEMORY(target, "target")
+DEFINE_GET_MEMORY(static_max, "static-max")
+#undef DEFINE_GET_MEMORY
 
 int libxl_domain_need_memory(libxl_ctx *ctx, libxl_domain_build_info *b_info,
                              uint32_t *need_memkb)
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 2dc6a51..f9eb0f1 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -358,6 +358,13 @@ typedef struct libxl__ctx libxl_ctx;
 #endif
 #endif
 
+/* LIBXL_HAVE_GET_MEMORY_STATIC_MAX
+ *
+ * If this is defined we have an API called libxl_get_memory_static_max
+ * to return the memory "static-max" value from xenstore.
+ */
+#define LIBXL_HAVE_GET_MEMORY_STATIC_MAX 1
+
 /*
  * LIBXL_HAVE_BUILDINFO_VCPU_AFFINITY_ARRAYS
  *
@@ -886,6 +893,7 @@ int libxl_domain_core_dump(libxl_ctx *ctx, uint32_t domid,
 int libxl_domain_setmaxmem(libxl_ctx *ctx, uint32_t domid, uint32_t target_memkb);
 int libxl_set_memory_target(libxl_ctx *ctx, uint32_t domid, int32_t target_memkb, int relative, int enforce);
 int libxl_get_memory_target(libxl_ctx *ctx, uint32_t domid, uint32_t *out_target);
+int libxl_get_memory_static_max(libxl_ctx *ctx, uint32_t domid, uint32_t *out_max_memkb);
 
 
 /*
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v1 09/10] libxl: introduce libxl_retrieve_domain_configuration
  2014-07-10 14:32 [PATCH v1 00/10] libxl: synchronise domain configuration Wei Liu
                   ` (7 preceding siblings ...)
  2014-07-10 14:32 ` [PATCH v1 08/10] libxl: introduce libxl_get_memory_static_max Wei Liu
@ 2014-07-10 14:32 ` Wei Liu
  2014-07-17 10:59   ` Ian Campbell
  2014-07-10 14:32 ` [PATCH v1 10/10] xl: use libxl_retrieve_domain_configuration and JSON format Wei Liu
  9 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-10 14:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, ian.jackson, ian.campbell

Introduce a new public API to return domain configuration. This returned
configuration can be used to rebuild a domain.

Note that this configuration doesn't equal to the current state of the
domain. What it does is to use JSON version configuration as template
and pull in relevant information from xenstore.

With this approach we can preserve what user has provided in the
original configuration as well as valuable information from xenstore.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl.c |  200 +++++++++++++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl.h |   16 +++++
 2 files changed, 216 insertions(+)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 49d7ef6..cd5914c 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -6034,6 +6034,206 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src)
     for (i = 0; i < 6; i++)
         (*dst)[i] = (*src)[i];
 }
+
+int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
+                                        libxl_domain_config *d_config)
+{
+    GC_INIT(ctx);
+    int rc;
+    int fd_lock = -1;
+
+    if (!libxl_domid_valid_guest(domid)) {
+        rc = ERROR_INVAL;
+        goto out;
+    }
+
+    rc = libxl__lock_domain_configuration(gc, domid, &fd_lock);
+    if (rc) {
+        LOGE(ERROR, "fail to lock domain configuration for domain %d", domid);
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    rc = libxl__get_domain_configuration(gc, domid, d_config);
+    if (rc) {
+        LOGE(ERROR, "fail to get domain configuration for domain %d", domid);
+        rc = ERROR_FAIL;
+        goto out_unlock;
+    }
+
+    /* Domain name */
+    {
+        char *domname;
+        domname = libxl_domid_to_name(ctx, domid);
+        if (!domname) {
+            LOGE(ERROR, "fail to get domain name for domain %d", domid);
+            goto out_unlock;
+        }
+        free(d_config->c_info.name);
+        d_config->c_info.name = domname; /* steals allocation */
+    }
+
+    /* Domain UUID */
+    {
+        libxl_dominfo info;
+        rc = libxl_domain_info(ctx, &info, domid);
+        if (rc) {
+            LOGE(ERROR, "fail to get domain info for domain %d", domid);
+            goto out_unlock;
+        }
+        libxl_uuid_copy(ctx, &d_config->c_info.uuid, &info.uuid);
+    }
+
+    /* Memory limits:
+     *
+     * Currently there're three memory limits:
+     *  1. "target" in xenstore (originally memory= in config file)
+     *  2. "static-max" in xenstore (originally maxmem= in config file)
+     *  3. "max_memkb" in hypervisor
+     *
+     * The third one is not visible and currently managed by
+     * toolstack. In order to rebuild a domain we only need to have
+     * "target" and "static-max".
+     */
+    {
+        uint32_t memory;
+
+        /* "target" */
+        rc = libxl_get_memory_target(ctx, domid, &memory);
+        if (rc) {
+            LOGE(ERROR, "fail to get memory target for domain %d", domid);
+            goto out_unlock;
+        }
+        /* If the domain is HVM domain, target memory in xenstore is
+         * smaller than what user has asked for. The difference is
+         * video_memkb, so add it back. Otherwise domain rebuild will
+         * fail.
+         */
+        if (d_config->b_info.type == LIBXL_DOMAIN_TYPE_HVM)
+            d_config->b_info.target_memkb =
+                memory + d_config->b_info.video_memkb;
+
+        /* "static-max" */
+        rc = libxl_get_memory_static_max(ctx, domid, &memory);
+        if (rc) {
+            LOGE(ERROR, "fail to get memory static-max for domain %d", domid);
+            goto out_unlock;
+        }
+        d_config->b_info.max_memkb = memory;
+    }
+
+    /* Devices: disk, nic, vtpm, pcidev */
+
+    /* The MERGE macro implements following logic:
+     * 0. retrieve JSON
+     * 1. retrieve list of device from xenstore
+     * 2. use xenstore entries as primary reference and compare JSON
+     *    entries with them.
+     *    a. if a device is present in xenstore and in JSON, merge the
+     *       two views.
+     *    b. if a device is not present in xenstore but in JSON, delete
+     *       it from the result.
+     *    c. it's impossible to have an entry present in xenstore but
+     *       not in JSON, because we maintain an invariant that every
+     *       entry in xenstore must have a corresponding entry in JSON.
+     * 3. "merge" operates on "src" and "dst". "src" points to the
+     *    entry retrieved from xenstore while "dst" points to the entry
+     *    retrieve from JSON.
+     */
+#define MERGE(type, ptr, cnt, compare, merge)                           \
+    do {                                                                \
+        libxl_device_##type *p = NULL;                                  \
+        int i, j, num;                                                  \
+                                                                        \
+        p = libxl_device_##type##_list(CTX, domid, &num);               \
+        if (p == NULL) {                                                \
+            LOGE(DEBUG,                                                 \
+                 "no %s from xenstore for domain %d",                   \
+                 #type, domid);                                         \
+        }                                                               \
+                                                                        \
+        for (i = 0; i < d_config->cnt; i++) {                           \
+            libxl_device_##type *q = &d_config->ptr[i];                 \
+            for (j = 0; j < num; j++) {                                 \
+                if (compare(&p[j], q))                                  \
+                    break;                                              \
+            }                                                           \
+                                                                        \
+            if (j < num) {         /* found in xenstore */              \
+                libxl_device_##type *dst, *src;                         \
+                dst = q;                                                \
+                src = &p[j];                                            \
+                merge;                                                  \
+            } else {                /* not found in xenstore */         \
+                LOGE(WARN, "Device present in JSON but not in xenstore, ignored"); \
+                                                                        \
+                libxl_device_##type##_dispose(q);                       \
+                                                                        \
+                for (j = i; j < d_config->cnt - 1; j++)                 \
+                    memcpy(&d_config->ptr[j], &d_config->ptr[j+1],      \
+                           sizeof(libxl_device_##type));                \
+                                                                        \
+                d_config->ptr =                                         \
+                    libxl__realloc(NOGC, d_config->ptr,                 \
+                                   sizeof(libxl_device_##type) *        \
+                                   (d_config->cnt - 1));                \
+                                                                        \
+                /* rewind counters */                                   \
+                d_config->cnt--;                                        \
+                i--;                                                    \
+            }                                                           \
+        }                                                               \
+                                                                        \
+        for (i = 0; i < num; i++)                                       \
+            libxl_device_##type##_dispose(&p[i]);                       \
+        free(p);                                                        \
+    } while (0);
+
+    MERGE(nic, nics, num_nics, COMPARE_DEVID, {
+            libxl__update_config_nic(gc, dst, src);
+        });
+
+    MERGE(vtpm, vtpms, num_vtpms, COMPARE_DEVID, {
+            libxl__update_config_vtpm(gc, dst, src);
+        });
+
+    MERGE(pci, pcidevs, num_pcidevs, COMPARE_PCI, {});
+
+    /* Take care of removable device. We maintain invariant in the
+     * insert / remove operation so that:
+     * 1. if xenstore is "empty" while JSON is not, the result
+     *    is "empty"
+     * 2. if xenstore has a different media than JSON, use the
+     *    one in JSON
+     * 3. if xenstore and JSON have the same media, well, you
+     *    know the answer :-)
+     *
+     * Currently there is only one removable device -- CDROM.
+     * Look for libxl_cdrom_insert for reference.
+     */
+    MERGE(disk, disks, num_disks, COMPARE_DISK, {
+            if (src->removable) {
+                if (!src->pdev_path || *src->pdev_path == '\0') {
+                    /* 1, use "empty" */
+                    free(dst->pdev_path);
+                    dst->pdev_path = libxl__strdup(NOGC, "");
+                    dst->format = LIBXL_DISK_FORMAT_EMPTY;
+                } else {
+                    /* 2 and 3, use JSON, no need to touch anything */
+                    ;
+                }
+            }
+        });
+
+#undef MERGE
+
+out_unlock:
+    libxl__unlock_domain_configuration(gc, domid, &fd_lock);
+out:
+    GC_FREE;
+    return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index f9eb0f1..982f339 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -358,6 +358,14 @@ typedef struct libxl__ctx libxl_ctx;
 #endif
 #endif
 
+/* LIBXL_HAVE_RETRIEVE_DOMAIN_CONFIGURATION
+ *
+ * If this is defined we have libxl_retrieve_domain_configuration which
+ * returns the current configuration of a domain, which can be used to
+ * rebuild a domain.
+ */
+#define LIBXL_HAVE_RETRIEVE_DOMAIN_CONFIGURATION 1
+
 /* LIBXL_HAVE_GET_MEMORY_STATIC_MAX
  *
  * If this is defined we have an API called libxl_get_memory_static_max
@@ -834,6 +842,14 @@ int static inline libxl_domain_create_restore_0x040200(
 void libxl_domain_config_init(libxl_domain_config *d_config);
 void libxl_domain_config_dispose(libxl_domain_config *d_config);
 
+/*
+ * Retrieve domain configuration and filled it in d_config. The
+ * returned configuration can be used to rebuild a domain. It only
+ * works with DomU.
+ */
+int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
+                                        libxl_domain_config *d_config);
+
 int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd,
                          int flags, /* LIBXL_SUSPEND_* */
                          const libxl_asyncop_how *ao_how)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v1 10/10] xl: use libxl_retrieve_domain_configuration and JSON format
  2014-07-10 14:32 [PATCH v1 00/10] libxl: synchronise domain configuration Wei Liu
                   ` (8 preceding siblings ...)
  2014-07-10 14:32 ` [PATCH v1 09/10] libxl: introduce libxl_retrieve_domain_configuration Wei Liu
@ 2014-07-10 14:32 ` Wei Liu
  2014-07-17 11:13   ` Ian Campbell
  9 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-10 14:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Wei Liu, ian.jackson, ian.campbell

Before this change, xl stores domain configuration in "xl" format, which
is in fact a verbatim copy of user supplied domain config.

Now libxl provides a new API to retrieve domain configuration, switch to
that new API, store configuration in JSON format.

Tests done so far (xl.{new,old} denotes xl with{,out} "libxl-json"
support):

1. xl.new create then xl.new save, hexdump saved file: domain config
   saved in JSON format
2. xl.new create, xl.new save then xl.old restore: failed on
   mandatory flag check
3. xl.new create, xl.new save then xl.new restore: succeeded
4. xl.old create, xl.old save then xl.new restore: succeeded
5. xl.new create then local migrate, receiving end xl.new: succeeded
6. xl.old create then local migrate, receiving end xl.new: succeeded

Note that "xl" config is still supported and handled when restarting a
domain. "xl" config file takes precedence over "libxl-json" in that
case, so that user who uses "config-update" to store new config file
won't have regression. All other scenarios (migration, domain listing
etc.) now use the new API.

Lastly, print out warning when users invoke "config-update" to
discourage them from using this command.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 docs/man/xl.pod.1         |    3 ++
 tools/libxl/xl_cmdimpl.c  |  121 ++++++++++++++++++++++++++++-----------------
 tools/libxl/xl_cmdtable.c |    4 +-
 3 files changed, 82 insertions(+), 46 deletions(-)

diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index 30bd4bf..6b5baf4 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -187,6 +187,9 @@ immediate effect but will be applied when the guest is next
 restarted. This command is useful to ensure that runtime modifications
 made to the guest will be preserved when the guest is restarted.
 
+Libxl now has better capability to handle domain configuration, avoid
+using this command when possible.
+
 I<configfile> has to be an absolute path to a file.
 
 B<OPTIONS>
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 68df548..3ee490e 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -107,6 +107,8 @@ static const char migrate_report[]=
    *            from target to source
    */
 
+#define XL_MANDATORY_FLAG_JSON (1U << 0) /* config data is in JSON format */
+#define XL_MANDATORY_FLAG_ALL  (XL_MANDATORY_FLAG_JSON)
 struct save_file_header {
     char magic[32]; /* savefileheader_magic */
     /* All uint32_ts are in domain's byte order. */
@@ -1699,21 +1701,40 @@ skip_vfb:
 }
 
 static void reload_domain_config(uint32_t domid,
-                                 uint8_t **config_data, int *config_len)
+                                 libxl_domain_config *d_config)
 {
+    int rc;
     uint8_t *t_data;
     int ret, t_len;
+    libxl_domain_config d_config_new;
 
+    /* In case user has used "config-update" to store a new config
+     * file.
+     */
     ret = libxl_userdata_retrieve(ctx, domid, "xl", &t_data, &t_len);
-    if (ret) {
-        LOG("failed to retrieve guest configuration (rc=%d). "
-            "reusing old configuration", ret);
+    if (ret && errno != ENOENT) {
+        LOG("\"xl\" file found but failed to load\n");
+    }
+    if (t_len > 0) {
+        LOG("\"xl\" config file found, use it\n");
+        libxl_domain_config_dispose(d_config);
+        parse_config_data("<updated>", (const char *)t_data,
+                          t_len, d_config);
+        free(t_data);
         return;
     }
 
-    free(*config_data);
-    *config_data = t_data;
-    *config_len = t_len;
+    libxl_domain_config_init(&d_config_new);
+    rc = libxl_retrieve_domain_configuration(ctx, domid, &d_config_new);
+    if (rc) {
+        LOG("failed to retrieve guest configuration (rc=%d). "
+            "reusing old configuration", rc);
+        libxl_domain_config_dispose(&d_config_new);
+    } else {
+        libxl_domain_config_dispose(d_config);
+        /* Steal allocations */
+        memcpy(d_config, &d_config_new, sizeof(libxl_domain_config));
+    }
 }
 
 /* Returns 1 if domain should be restarted,
@@ -1721,7 +1742,6 @@ static void reload_domain_config(uint32_t domid,
  * Can update r_domid if domain is destroyed etc */
 static int handle_domain_death(uint32_t *r_domid,
                                libxl_event *event,
-                               uint8_t **config_data, int *config_len,
                                libxl_domain_config *d_config)
 
 {
@@ -1779,13 +1799,12 @@ static int handle_domain_death(uint32_t *r_domid,
         break;
 
     case LIBXL_ACTION_ON_SHUTDOWN_RESTART_RENAME:
-        reload_domain_config(*r_domid, config_data, config_len);
+        reload_domain_config(*r_domid, d_config);
         restart = 2;
         break;
 
     case LIBXL_ACTION_ON_SHUTDOWN_RESTART:
-        reload_domain_config(*r_domid, config_data, config_len);
-
+        reload_domain_config(*r_domid, d_config);
         restart = 1;
         /* fall-through */
     case LIBXL_ACTION_ON_SHUTDOWN_DESTROY:
@@ -1982,6 +2001,7 @@ static uint32_t create_domain(struct domain_create *dom_info)
     const char *config_source = NULL;
     const char *restore_source = NULL;
     int migrate_fd = dom_info->migrate_fd;
+    bool config_in_json;
 
     int i;
     int need_daemon = daemonize;
@@ -2036,7 +2056,7 @@ static uint32_t create_domain(struct domain_create *dom_info)
                 restore_source, hdr.mandatory_flags, hdr.optional_flags,
                 hdr.optional_data_len);
 
-        badflags = hdr.mandatory_flags & ~( 0 /* none understood yet */ );
+        badflags = hdr.mandatory_flags & ~XL_MANDATORY_FLAG_ALL;
         if (badflags) {
             fprintf(stderr, "Savefile has mandatory flag(s) 0x%"PRIx32" "
                     "which are not supported; need newer xl\n",
@@ -2064,7 +2084,9 @@ static uint32_t create_domain(struct domain_create *dom_info)
         optdata_here = optdata_begin;
 
         if (OPTDATA_LEFT) {
-            fprintf(stderr, " Savefile contains xl domain config\n");
+            fprintf(stderr, " Savefile contains xl domain config%s\n",
+                    !!(hdr.mandatory_flags & XL_MANDATORY_FLAG_JSON)
+                    ? " in JSON format" : "");
             WITH_OPTDATA(4, {
                 memcpy(u32buf.b, optdata_here, 4);
                 config_len = u32buf.u32;
@@ -2104,6 +2126,7 @@ static uint32_t create_domain(struct domain_create *dom_info)
                 extra_config);
         }
         config_source=config_file;
+        config_in_json = false;
     } else {
         if (!config_data) {
             fprintf(stderr, "Config file not specified and"
@@ -2111,12 +2134,18 @@ static uint32_t create_domain(struct domain_create *dom_info)
             return ERROR_INVAL;
         }
         config_source = "<saved>";
+        config_in_json = !!(hdr.mandatory_flags & XL_MANDATORY_FLAG_JSON);
     }
 
     if (!dom_info->quiet)
         printf("Parsing config from %s\n", config_source);
 
-    parse_config_data(config_source, config_data, config_len, &d_config);
+    if (config_in_json) {
+        libxl_domain_config_from_json(ctx, &d_config,
+                                      (const char *)config_data);
+    } else {
+        parse_config_data(config_source, config_data, config_len, &d_config);
+    }
 
     if (migrate_fd >= 0) {
         if (d_config.c_info.name) {
@@ -2184,14 +2213,6 @@ start:
     if ( ret )
         goto error_out;
 
-    ret = libxl_userdata_store(ctx, domid, "xl",
-                                    config_data, config_len);
-    if (ret) {
-        perror("cannot save config file");
-        ret = ERROR_FAIL;
-        goto error_out;
-    }
-
     release_lock();
 
     if (!paused)
@@ -2248,9 +2269,7 @@ start:
             LOG("Domain %d has shut down, reason code %d 0x%x", domid,
                 event->u.domain_shutdown.shutdown_reason,
                 event->u.domain_shutdown.shutdown_reason);
-            switch (handle_domain_death(&domid, event,
-                                        (uint8_t **)&config_data, &config_len,
-                                        &d_config)) {
+            switch (handle_domain_death(&domid, event, &d_config)) {
             case 2:
                 if (!preserve_domain(&domid, event, &d_config)) {
                     /* If we fail then exit leaving the old domain in place. */
@@ -2287,12 +2306,6 @@ start:
                     d_config.c_info.name = strdup(common_domname);
                 }
 
-                /* Reparse the configuration in case it has changed */
-                libxl_domain_config_dispose(&d_config);
-                libxl_domain_config_init(&d_config);
-                parse_config_data(config_source, config_data, config_len,
-                                  &d_config);
-
                 /*
                  * XXX FIXME: If this sleep is not there then domain
                  * re-creation fails sometimes.
@@ -3071,9 +3084,7 @@ static void list_domains_details(const libxl_dominfo *info, int nb_domain)
 {
     libxl_domain_config d_config;
 
-    char *config_source;
-    uint8_t *data;
-    int i, len, rc;
+    int i, rc;
 
     yajl_gen hand = NULL;
     yajl_gen_status s;
@@ -3094,22 +3105,18 @@ static void list_domains_details(const libxl_dominfo *info, int nb_domain)
         s = yajl_gen_status_ok;
 
     for (i = 0; i < nb_domain; i++) {
+        libxl_domain_config_init(&d_config);
         /* no detailed info available on dom0 */
         if (info[i].domid == 0)
             continue;
-        rc = libxl_userdata_retrieve(ctx, info[i].domid, "xl", &data, &len);
+        rc = libxl_retrieve_domain_configuration(ctx, info[i].domid, &d_config);
         if (rc)
             continue;
-        CHK_SYSCALL(asprintf(&config_source, "<domid %d data>", info[i].domid));
-        libxl_domain_config_init(&d_config);
-        parse_config_data(config_source, (char *)data, len, &d_config);
         if (default_output_format == OUTPUT_FORMAT_JSON)
             s = printf_info_one_json(hand, info[i].domid, &d_config);
         else
             printf_info_sexp(info[i].domid, &d_config);
         libxl_domain_config_dispose(&d_config);
-        free(data);
-        free(config_source);
         if (s != yajl_gen_status_ok)
             goto out;
     }
@@ -3294,22 +3301,41 @@ static void save_domain_core_begin(uint32_t domid,
                                    int *config_len_r)
 {
     int rc;
+    libxl_domain_config d_config;
+    char *config_c = 0;
 
     /* configuration file in optional data: */
 
+    libxl_domain_config_init(&d_config);
+
     if (override_config_file) {
         void *config_v = 0;
         rc = libxl_read_file_contents(ctx, override_config_file,
                                       &config_v, config_len_r);
-        *config_data_r = config_v;
+        if (rc) {
+            fprintf(stderr, "unable to read overridden config file\n");
+            exit(2);
+        }
+        parse_config_data(override_config_file, config_v, *config_len_r,
+                          &d_config);
+        free(config_v);
     } else {
-        rc = libxl_userdata_retrieve(ctx, domid, "xl",
-                                     config_data_r, config_len_r);
+        rc = libxl_retrieve_domain_configuration(ctx, domid, &d_config);
+        if (rc) {
+            fprintf(stderr, "unable to retrieve domain configuration\n");
+            exit(2);
+        }
     }
-    if (rc) {
-        fputs("Unable to get config file\n",stderr);
+
+    config_c = libxl_domain_config_to_json(ctx, &d_config);
+    if (!config_c) {
+        fprintf(stderr, "unable to parse overridden config file\n");
         exit(2);
     }
+    *config_data_r = (uint8_t *)config_c;
+    *config_len_r = strlen(config_c) + 1; /* including trailing '\0' */
+
+    libxl_domain_config_dispose(&d_config);
 }
 
 static void save_domain_core_writeconfig(int fd, const char *source,
@@ -3337,6 +3363,8 @@ static void save_domain_core_writeconfig(int fd, const char *source,
     u32buf.u32 = config_len;
     ADD_OPTDATA(u32buf.b,    4);
     ADD_OPTDATA(config_data, config_len);
+    if (config_len)
+        hdr.mandatory_flags |= XL_MANDATORY_FLAG_JSON;
 
     /* that's the optional data */
 
@@ -4346,6 +4374,9 @@ int main_config_update(int argc, char **argv)
         exit(1);
     }
 
+    fprintf(stderr, "WARN: libxl now has better capability to manage domain configuration, "
+            "avoid using this command when possible\n");
+
     domid = find_domain(argv[1]);
     argc--; argv++;
 
diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
index 4279b9f..9bc0bca 100644
--- a/tools/libxl/xl_cmdtable.c
+++ b/tools/libxl/xl_cmdtable.c
@@ -38,7 +38,9 @@ struct cmd_spec cmd_table[] = {
     { "config-update",
       &main_config_update, 1, 1,
       "Update a running domain's saved configuration, used when rebuilding "
-      "the domain after reboot",
+      "the domain after reboot."
+      "WARN: libxl now has better capability to manage domain configuration, "
+      "avoid using this command when possible",
       "<Domain> <ConfigFile> [options] [vars]",
       "-h                      Print this help.\n"
       "-f FILE, --defconfig=FILE\n                     Use the given configuration file.\n"
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it
  2014-07-10 14:32 ` [PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it Wei Liu
@ 2014-07-16 16:11   ` Ian Campbell
  2014-07-16 16:44     ` Wei Liu
  2014-07-24 18:09   ` Ian Jackson
  2014-07-24 18:29   ` Ian Jackson
  2 siblings, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-16 16:11 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:

> diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c
> index 81f8985..dc47177 100644
> --- a/tools/libxl/libxl_internal.c
> +++ b/tools/libxl/libxl_internal.c
> @@ -381,6 +381,62 @@ out:
>      return rc;
>  }
>  
> +int libxl__get_domain_configuration(libxl__gc *gc, uint32_t domid,
> +                                    libxl_domain_config *d_config)
> +{
> +    uint8_t *data = NULL;
> +    int rc, len;
> +
> +    rc = libxl_userdata_retrieve(CTX, domid, "libxl-json", &data, &len);
> +    if (rc) {
> +        LOGEV(ERROR, rc,
> +              "failed to retrieve domain configuration for domain %d", domid);
> +        rc = ERROR_FAIL;

rc here is already a valid libxl ERROR_* here, isn't it? No need to
override in that case. You have this several times in this series..

> +        goto out;
> +    }
> +
> +    if (len == 0) {
> +        LOGE(ERROR, "configuration data stream empty for domain %d", domid);
> +        rc = ERROR_FAIL;

This one on the other hand does need setting explicitly.

How could this happen though? Ian J is in favour of adding more error
codes instead of constantly reusing ERROR_FAIL. Here it would seem that
depending on whether/how this is supposed to happen either a
ERROR_INTERNAL_MELTDOWN (i.e. everythig has gone wrong) ish thing or
ERROR_NO_DOMAIN_CONFIG would be appropriate.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 02/10] libxl_internal: functions to lock / unlock domain configuration
  2014-07-10 14:32 ` [PATCH v1 02/10] libxl_internal: functions to lock / unlock domain configuration Wei Liu
@ 2014-07-16 16:15   ` Ian Campbell
  2014-07-16 16:44     ` Wei Liu
  2014-07-24 18:24   ` Ian Jackson
  1 sibling, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-16 16:15 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> Simple file lock taken from xl to serialise access to "libxl-json" file.

Do you mean libxl here?

> If a thread cannot get hold of the lock it waits due to F_SETLKW.

How is deadlock avoided here?

I think this sort of locking is safe against processes crashing with the
lock held (it gets released), so assuming I'm right its ok from that
PoV.

> In order to generate lock file name, rename userdata_path to
> libxl__userdata_path and declare it in libxl_internal.h

Given that you don't appear to be locking the userdata file itself that
seems like an odd place for a lock to me, why not XEN_LOCK_DIR?

Did you consider locking the actual file though? That would avoid this
lock being a bottleneck...

I'm going to leave all the fctrl wrangling to Ian to look at, much more
his area of expertise.

> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/libxl_dom.c      |   14 ++++----
>  tools/libxl/libxl_internal.c |   76 ++++++++++++++++++++++++++++++++++++++++++
>  tools/libxl/libxl_internal.h |   11 ++++++
>  3 files changed, 94 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index 83eb29a..a41e8fd 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -1762,9 +1762,9 @@ char *libxl__uuid2string(libxl__gc *gc, const libxl_uuid uuid)
>      return GCSPRINTF(LIBXL_UUID_FMT, LIBXL_UUID_BYTES(uuid));
>  }
>  
> -static const char *userdata_path(libxl__gc *gc, uint32_t domid,
> -                                      const char *userdata_userid,
> -                                      const char *wh)
> +const char *libxl__userdata_path(libxl__gc *gc, uint32_t domid,
> +                                 const char *userdata_userid,
> +                                 const char *wh)
>  {
>      libxl_ctx *ctx = libxl__gc_owner(gc);
>      char *uuid_string;
> @@ -1799,7 +1799,7 @@ void libxl__userdata_destroyall(libxl__gc *gc, uint32_t domid)
>      glob_t gl;
>      int r, i;
>  
> -    pattern = userdata_path(gc, domid, "*", "?");
> +    pattern = libxl__userdata_path(gc, domid, "*", "?");
>      if (!pattern)
>          goto out;
>  
> @@ -1830,7 +1830,7 @@ int libxl_userdata_store(libxl_ctx *ctx, uint32_t domid,
>      int e, rc;
>      int fd = -1;
>  
> -    filename = userdata_path(gc, domid, userdata_userid, "d");
> +    filename = libxl__userdata_path(gc, domid, userdata_userid, "d");
>      if (!filename) {
>          rc = ERROR_NOMEM;
>          goto out;
> @@ -1841,7 +1841,7 @@ int libxl_userdata_store(libxl_ctx *ctx, uint32_t domid,
>          goto out;
>      }
>  
> -    newfilename = userdata_path(gc, domid, userdata_userid, "n");
> +    newfilename = libxl__userdata_path(gc, domid, userdata_userid, "n");
>      if (!newfilename) {
>          rc = ERROR_NOMEM;
>          goto out;
> @@ -1892,7 +1892,7 @@ int libxl_userdata_retrieve(libxl_ctx *ctx, uint32_t domid,
>      int datalen = 0;
>      void *data = 0;
>  
> -    filename = userdata_path(gc, domid, userdata_userid, "d");
> +    filename = libxl__userdata_path(gc, domid, userdata_userid, "d");
>      if (!filename) {
>          rc = ERROR_NOMEM;
>          goto out;
> diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c
> index dc47177..9cc60e8 100644
> --- a/tools/libxl/libxl_internal.c
> +++ b/tools/libxl/libxl_internal.c
> @@ -436,6 +436,82 @@ out:
>      return rc;
>  }
>  
> +int libxl__lock_domain_configuration(libxl__gc *gc, uint32_t domid,
> +                                     int *fd_lock)
> +{
> +    int rc;
> +    struct flock fl;
> +    const char *lockfile;
> +
> +    if (*fd_lock >= 0)
> +        return ERROR_INVAL;
> +
> +    lockfile = libxl__userdata_path(gc, domid, "libxl-json.lock", "d");
> +    if (!lockfile)
> +        return ERROR_FAIL;
> +
> +    *fd_lock = open(lockfile, O_WRONLY|O_CREAT, S_IWUSR);
> +    if (*fd_lock < 0) {
> +        LOGE(ERROR, "cannot open lockfile %s errno=%d\n", lockfile, errno);
> +        return ERROR_FAIL;
> +    }
> +
> +    if (fcntl(*fd_lock, F_SETFD, FD_CLOEXEC) < 0) {
> +        close(*fd_lock);
> +        LOGE(ERROR, "cannot set cloexec to lockfile %s errno=%d\n",
> +             lockfile, errno);
> +        return ERROR_FAIL;
> +    }
> +
> +get_lock:
> +    fl.l_type = F_WRLCK;
> +    fl.l_whence = SEEK_SET;
> +    fl.l_start = 0;
> +    fl.l_len = 0;
> +    rc = fcntl(*fd_lock, F_SETLKW, &fl);
> +    if (rc < 0 && errno == EINTR)
> +        goto get_lock;
> +
> +    if (rc < 0) {
> +        LOGE(ERROR, "cannot acquire lock %s errno=%d\n", lockfile, errno);
> +        rc = ERROR_FAIL;
> +    } else
> +        rc = 0;
> +
> +    return rc;
> +}
> +
> +int libxl__unlock_domain_configuration(libxl__gc *gc, uint32_t domid,
> +                                       int *fd_lock)
> +{
> +    int rc;
> +    struct flock fl;
> +
> +    if (*fd_lock < 0)
> +        return ERROR_INVAL;
> +
> +release_lock:
> +    fl.l_type = F_UNLCK;
> +    fl.l_whence = SEEK_SET;
> +    fl.l_start = 0;
> +    fl.l_len = 0;
> +    rc = fcntl(*fd_lock, F_SETLKW, &fl);
> +    if (rc < 0 && errno == EINTR)
> +        goto release_lock;
> +
> +    if (rc < 0) {
> +        const char *lockfile;
> +        lockfile = libxl__userdata_path(gc, domid, "libxl-json.lock", "d");
> +        LOGE(ERROR, "cannot release lock %s, errno=%d\n", lockfile, errno);
> +        rc = ERROR_FAIL;
> +    } else
> +        rc = 0;
> +
> +    close(*fd_lock);
> +    *fd_lock = -1;
> +
> +    return rc;
> +}
>  
>  /*
>   * Local variables:
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index ef2111b..e9d93ac 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -991,6 +991,9 @@ _hidden int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
>                                       uint32_t size, void *data);
>  _hidden int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid);
>  
> +_hidden const char *libxl__userdata_path(libxl__gc *gc, uint32_t domid,
> +                                         const char *userdata_userid,
> +                                         const char *wh);
>  _hidden void libxl__userdata_destroyall(libxl__gc *gc, uint32_t domid);
>  
>  _hidden int libxl__domain_resume(libxl__gc *gc, uint32_t domid,
> @@ -3228,6 +3231,14 @@ int libxl__get_domain_configuration(libxl__gc *gc, uint32_t domid,
>                                      libxl_domain_config *d_config);
>  int libxl__set_domain_configuration(libxl__gc *gc, uint32_t domid,
>                                      libxl_domain_config *d_config);
> +/*
> + * Lock / unlock domain configuration in libxl private data store.
> + * fd_lock contains the file descriptor pointing to the lock file.
> + */
> +int libxl__lock_domain_configuration(libxl__gc *gc, uint32_t domid,
> +                                     int *fd_lock);
> +int libxl__unlock_domain_configuration(libxl__gc *gc, uint32_t domid,
> +                                       int *fd_lock);
>  
>  #endif
>  

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
  2014-07-10 14:32 ` [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain Wei Liu
@ 2014-07-16 16:18   ` Ian Campbell
  2014-07-16 16:47     ` Wei Liu
  2014-07-24 18:52   ` Ian Jackson
  1 sibling, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-16 16:18 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:

> +/* update the saved domain configuration with a callback function,
> + * which is responsible to pull in useful fields from src.
> + */
> +typedef void (update_callback)(libxl__gc *, libxl_domain_config *,
> +                               const libxl_domain_config *);
> +static int libxl__update_domain_configuration(libxl__gc *gc,
> +                                              uint32_t domid,
> +                                              update_callback callback,
> +                                              const libxl_domain_config *src)

You aren't using your shiny new locking functions here?

> +{
> +    libxl_domain_config d_config_saved;
> +    int rc;
> +
> +    libxl_domain_config_init(&d_config_saved);
> +
> +    rc = libxl__get_domain_configuration(gc, domid, &d_config_saved);
> +
> +    if (rc) {
> +        LOG(ERROR, "cannot get domain configuration: %d", rc);
> +        goto out;
> +    }
> +
> +    callback(gc, &d_config_saved, src);
> +
> +    rc = libxl__set_domain_configuration(gc, domid, &d_config_saved);
> +
> +    if (rc)
> +        LOG(ERROR, "cannot set domain configuration: %d", rc);
> +
> +    libxl_domain_config_dispose(&d_config_saved);
> +
> +out:
> +    return rc;
> +}
> +
>  static void domcreate_complete(libxl__egc *egc,
>                                 libxl__domain_create_state *dcs,
>                                 int rc)

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 04/10] libxl: separate device add/rm complete callbacks
  2014-07-10 14:32 ` [PATCH v1 04/10] libxl: separate device add/rm complete callbacks Wei Liu
@ 2014-07-16 16:26   ` Ian Campbell
  2014-07-16 16:48     ` Wei Liu
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-16 16:26 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> This is in preparation for device hotplug / unplug configuration
> synchronisation. No functional change introduced. This change is
> necessary because we need to do different things for add and remove.
> Using a single callback won't meet our need anymore.

Looking at the next two patches it seems like this is because on add you
do it at the start and on rm you do it at the end?

I did wonder if it was possible that you could continue to share the
same callback if you could add a flag, but the real reason for not
sharing is that the code you want to add to the remove case is device
type specific, so you need multiple rm callbacks. I misunderstood your
comment about a single callback to mean a single add+rm callback as
opposed to a single vs. multiple rm callbacks.

Given that the actual implementation looks ok to me.

> +DEFINE_DEVICE_RM_AOCOMPLETE(vtpm);
> +DEFINE_DEVICE_RM_AOCOMPLETE(nic);
> +DEFINE_DEVICE_RM_AOCOMPLETE(disk);
> +DEFINE_DEVICE_RM_AOCOMPLETE(vfb);
> +DEFINE_DEVICE_RM_AOCOMPLETE(vkb);

Elsewhere in this file there is:
/* disk */
DEFINE_DEVICE_REMOVE(disk, remove, 0)
DEFINE_DEVICE_REMOVE(disk, destroy, 1)
...

Perhaps add this call there too? Also that place has a comment with the
resulting function names to serve as grep fodder. That would be nice
here too.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it
  2014-07-16 16:11   ` Ian Campbell
@ 2014-07-16 16:44     ` Wei Liu
  0 siblings, 0 replies; 70+ messages in thread
From: Wei Liu @ 2014-07-16 16:44 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Wed, Jul 16, 2014 at 05:11:22PM +0100, Ian Campbell wrote:
> On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> 
> > diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c
> > index 81f8985..dc47177 100644
> > --- a/tools/libxl/libxl_internal.c
> > +++ b/tools/libxl/libxl_internal.c
> > @@ -381,6 +381,62 @@ out:
> >      return rc;
> >  }
> >  
> > +int libxl__get_domain_configuration(libxl__gc *gc, uint32_t domid,
> > +                                    libxl_domain_config *d_config)
> > +{
> > +    uint8_t *data = NULL;
> > +    int rc, len;
> > +
> > +    rc = libxl_userdata_retrieve(CTX, domid, "libxl-json", &data, &len);
> > +    if (rc) {
> > +        LOGEV(ERROR, rc,
> > +              "failed to retrieve domain configuration for domain %d", domid);
> > +        rc = ERROR_FAIL;
> 
> rc here is already a valid libxl ERROR_* here, isn't it? No need to
> override in that case. You have this several times in this series..
> 

OK.

> > +        goto out;
> > +    }
> > +
> > +    if (len == 0) {
> > +        LOGE(ERROR, "configuration data stream empty for domain %d", domid);
> > +        rc = ERROR_FAIL;
> 
> This one on the other hand does need setting explicitly.
> 
> How could this happen though? Ian J is in favour of adding more error

The file is accidently overwritten, failure on last write etc. Bad
things happen.

> codes instead of constantly reusing ERROR_FAIL. Here it would seem that
> depending on whether/how this is supposed to happen either a
> ERROR_INTERNAL_MELTDOWN (i.e. everythig has gone wrong) ish thing or
> ERROR_NO_DOMAIN_CONFIG would be appropriate.
> 

OK. Will figure out a new error code.

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 02/10] libxl_internal: functions to lock / unlock domain configuration
  2014-07-16 16:15   ` Ian Campbell
@ 2014-07-16 16:44     ` Wei Liu
  2014-07-17 11:29       ` Ian Campbell
  0 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-16 16:44 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Wed, Jul 16, 2014 at 05:15:41PM +0100, Ian Campbell wrote:
> On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> > Simple file lock taken from xl to serialise access to "libxl-json" file.
> 
> Do you mean libxl here?
> 

No. I did mean "xl". This implementation is "borrowed" from "xl".

> > If a thread cannot get hold of the lock it waits due to F_SETLKW.
> 
> How is deadlock avoided here?
> 

This patch only provides "primitive" to lock and unlock. Avoiding
deadlock is out of scope of this patch. That pretty much depends on the
users of this lock.

> I think this sort of locking is safe against processes crashing with the
> lock held (it gets released), so assuming I'm right its ok from that
> PoV.
> 

That's my thought as well.

> > In order to generate lock file name, rename userdata_path to
> > libxl__userdata_path and declare it in libxl_internal.h
> 
> Given that you don't appear to be locking the userdata file itself that
> seems like an odd place for a lock to me, why not XEN_LOCK_DIR?
> 

So that the lock file itself gets deleted by libxl automatically. Using
XEN_LOCK_DIR requires some extra care to delete the lock file. And if
libxl user crashes for any reason, those files become stale. Leaving
them in one location is easier to clean them up.

> Did you consider locking the actual file though? That would avoid this
> lock being a bottleneck...
> 

It's just that part of this series was taken from my older series where
some macro already had its own code to manipulate configuration file,
plumbing locking code (passing fd around) and those macros together
would make code look complete new. I was a bit worried I would introduce
too many changes that might confuse you maintainers. I can lock the file
itself if you prefer. 

But I don't quite follow how this lock can be a bottleneck. Locking the
actual file is different from locking a lock file? Isn't there one file
to be locked anyway?

Wei.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
  2014-07-16 16:18   ` Ian Campbell
@ 2014-07-16 16:47     ` Wei Liu
  2014-07-17 11:06       ` Ian Campbell
  0 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-16 16:47 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Wed, Jul 16, 2014 at 05:18:08PM +0100, Ian Campbell wrote:
> On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> 
> > +/* update the saved domain configuration with a callback function,
> > + * which is responsible to pull in useful fields from src.
> > + */
> > +typedef void (update_callback)(libxl__gc *, libxl_domain_config *,
> > +                               const libxl_domain_config *);
> > +static int libxl__update_domain_configuration(libxl__gc *gc,
> > +                                              uint32_t domid,
> > +                                              update_callback callback,
> > +                                              const libxl_domain_config *src)
> 
> You aren't using your shiny new locking functions here?
> 

We don't need locking here. The lock should already be held (if it needs
to be held).

Note, this is not async callback, this is just to make it possible to
replace the update function at will. Probably renaming "update_callback"
to "update" is better?

And for larger scope, when creating a domain, there shouldn't be any
other thread trying to access the configuration file (is there?), so
locking is not necessary.

Wei.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 04/10] libxl: separate device add/rm complete callbacks
  2014-07-16 16:26   ` Ian Campbell
@ 2014-07-16 16:48     ` Wei Liu
  0 siblings, 0 replies; 70+ messages in thread
From: Wei Liu @ 2014-07-16 16:48 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Wed, Jul 16, 2014 at 05:26:20PM +0100, Ian Campbell wrote:
> On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> > This is in preparation for device hotplug / unplug configuration
> > synchronisation. No functional change introduced. This change is
> > necessary because we need to do different things for add and remove.
> > Using a single callback won't meet our need anymore.
> 
> Looking at the next two patches it seems like this is because on add you
> do it at the start and on rm you do it at the end?
> 
> I did wonder if it was possible that you could continue to share the
> same callback if you could add a flag, but the real reason for not
> sharing is that the code you want to add to the remove case is device
> type specific, so you need multiple rm callbacks. I misunderstood your

Yes, RM is type specific, hence this change.

I shall write this in commit message, replacing "do different things" to
make it clearer.

> comment about a single callback to mean a single add+rm callback as
> opposed to a single vs. multiple rm callbacks.
> 
> Given that the actual implementation looks ok to me.
> 
> > +DEFINE_DEVICE_RM_AOCOMPLETE(vtpm);
> > +DEFINE_DEVICE_RM_AOCOMPLETE(nic);
> > +DEFINE_DEVICE_RM_AOCOMPLETE(disk);
> > +DEFINE_DEVICE_RM_AOCOMPLETE(vfb);
> > +DEFINE_DEVICE_RM_AOCOMPLETE(vkb);
> 
> Elsewhere in this file there is:
> /* disk */
> DEFINE_DEVICE_REMOVE(disk, remove, 0)
> DEFINE_DEVICE_REMOVE(disk, destroy, 1)
> ...
> 
> Perhaps add this call there too? Also that place has a comment with the
> resulting function names to serve as grep fodder. That would be nice
> here too.
> 

Will do.

Wei.

> Ian.
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device
  2014-07-10 14:32 ` [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device Wei Liu
@ 2014-07-16 16:48   ` Ian Campbell
  2014-07-16 17:12     ` Wei Liu
  2014-07-25 16:06   ` Ian Jackson
  1 sibling, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-16 16:48 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> We update JSON version first, then write to xenstore, so that we
> maintain the following invariant: any device which is present in
> xenstore has a corresponding entry in JSON.
> 
> The locking pattern is as followed:
>  1. lock JSON domain config
>  2. write JSON entry
>  3. write xenstore entry for a device
>  4. unlock JSON domain config
> 
> And we ensure in code that JSON entry is always written before the
> xenstore entry is written. That is, if 2 fails, 3 will not be executed.
> We also ensure that if 2 and 3 are invoked in a loop, the previous
> added JSON entry is removed before writing new one.
> 
> The locking patten is designed like this, because when adding a disk,

"pattern". Apparently a patten is a type of clog or shoe l-)

> the caller might specify get_vdev which will get called in the middle
> of a xenstore transaction to determine the identifier of a disk. Other
> devices don't have this property, but I make them follow the same
> pattern for consistency.

I don't know if it is helpful but AIUI the get_vdev there is purely to
handle the case where the disk is being attached to the toolstack domain
(e.g. for pygrub). I don't know if that simplifies anything for you.

> As we don't have a JSON config file for libxl toolstack domain
> (currently Dom0) we need to skip JSON manipulation for it.

How hard would it be to create a stub/stunt JSON for dom0 when we notice
it is missing? Or from e.g. xencommons perhaps?

> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> @@ -1917,6 +1922,8 @@ void libxl__device_vtpm_add(libxl__egc *egc, uint32_t domid,
>              goto out;
>          }
>      }
> +    vtpm_saved.devid = vtpm->devid;

Why isn't this in the update_config fn?

> +    libxl__update_config_vtpm(gc, &vtpm_saved, vtpm);
>  
>      GCNEW(device);
>      rc = libxl__device_from_vtpm(gc, domid, vtpm, device);
> @@ -1943,17 +1950,30 @@ void libxl__device_vtpm_add(libxl__egc *egc, uint32_t domid,
>      flexarray_append(front, "handle");
>      flexarray_append(front, GCSPRINTF("%d", vtpm->devid));
>  
> +    LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc);
> +    if (rc) goto out;

I can't see anything in this macro (or the unlock one) which
necessitates it being a macro rather than a helper function.

Couldn't you just do the LIBXL_TOOLSTACK_DOMID check in
libxl__lock_domain_configuration?

> +
> +    DEVICE_ADD_JSON(vtpm, vtpms, num_vtpms, domid, &vtpm_saved,

The second and third of these arguments could be derived from the first.

> @@ -2193,21 +2217,56 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
>          goto out;
>      }
>  
> +    libxl_device_disk_copy(CTX, &disk_saved, disk);
> +
> +    LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc);
> +    if (rc) goto out;
> +
>      for (;;) {
>          rc = libxl__xs_transaction_start(gc, &t);
> -        if (rc) goto out;
> +        if (rc) {
> +            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> +            goto out;
> +        }
> +
> +        /* This is a loop, if disk has been added to JSON before, we
> +         * need to remove it, then add it later, because disk->vdev
> +         * might change.

Has the old device been removed from xenstore at this point? Otherwise
aren't you breaking your invariant that a device in xenstore is always
described in the json.

> +         */
> +        DEVICE_REMOVE_JSON(disk, disks, num_disks, domid, &disk_saved,
> +                           COMPARE_DISK, rc);
> +        if (rc) {
> +            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> +            goto out;

I wonder if you can arrange the exit path such that this can happen
there and save all of these?

> +#define DEVICE_ADD_JSON(type, ptr, cnt, domid, dev, compare, rc)        \
> +    do {                                                                \
> +        int x;                                                          \
> +        libxl_domain_config d_config;                                   \
> +        libxl_device_##type *p;                                         \
> +                                                                        \
> +        if (domid == LIBXL_TOOLSTACK_DOMID)                             \
> +            break;                                                      \
> +                                                                        \
> +        libxl_domain_config_init(&d_config);                            \
> +                                                                        \
> +        rc = libxl__get_domain_configuration(gc, (domid), &d_config);   \
> +        if (rc)                                                         \
> +            goto add_done;                                              \
> +                                                                        \
> +        /* Check for duplicated device */                               \
> +        for (x = 0; x < d_config.cnt; x++) {                            \
> +            if (compare(&d_config.ptr[x], (dev))) {                     \

Compare just checks ->devid etc, right?

What if this device has the same devid but a different config?

That probably ought to be an error, have we already caught that or are
we relying on something later?

I suppose either way we don't want to add the device again now.

> +                rc = 0;                                                 \
> +                goto add_done;                                          \
> +            }                                                           \
> +        }                                                               \
> +                                                                        \
> +        d_config.ptr =                                                  \
> +            libxl__realloc(gc, d_config.ptr,                            \
> +                           (d_config.cnt + 1) *                         \
> +                           sizeof(libxl_device_##type));                \
> +        p = &d_config.ptr[d_config.cnt];                                \
> +        d_config.cnt++;                                                 \
> +        libxl_device_##type##_copy(CTX, p, (dev));                      \
> +                                                                        \
> +        rc = libxl__set_domain_configuration(gc, (domid), &d_config);   \

How close to being possible is it to do this as a proper helper function
which takes a size_t and a bunch of fn pointers with void where the type
would be?

> +                                                                        \
> +    add_done:                                                           \

You are going to have multiple of these labels in this source file. Not
sure quite what that means in C. The simple answer would be to involve
##type## in the name.

> +        libxl_domain_config_dispose(&d_config);                         \
> +    } while (0)
> +
> +#define DEVICE_REMOVE_JSON(type, ptr, cnt, domid, dev, compare, rc)     \

Same comments as on add.

> +    do {                                                                \
> +        int i, j;                                                       \
> +        libxl_device_##type *p = dev;                                   \
> +        libxl_domain_config d_config;                                   \
> +        bool removed = false;                                           \
> +                                                                        \
> +        if (domid == LIBXL_TOOLSTACK_DOMID)                             \
> +            break;                                                      \
> +                                                                        \
> +        libxl_domain_config_init(&d_config);                            \
> +                                                                        \
> +        rc = libxl__get_domain_configuration(gc, (domid), &d_config);   \
> +        if (rc)                                                         \
> +            goto remove_done;                                           \
> +                                                                        \
> +        for (i = j = 0; i < d_config.cnt; i++) {                        \
> +            if (!compare(&d_config.ptr[i], p)) {                        \
> +                if (i != j) {                                           \
> +                    libxl_device_##type##_dispose(&d_config.ptr[j]);    \
> +                    d_config.ptr[j] = d_config.ptr[i];                  \
> +                    removed = true;                                     \
> +                }                                                       \
> +                j++;                                                    \
> +            }                                                           \
> +        }                                                               \
> +                                                                        \
> +        if (!removed)  /* no matching entry found */                    \
> +            goto remove_done;                                           \
> +                                                                        \
> +        d_config.ptr =                                                  \
> +            libxl__realloc(NOGC, d_config.ptr,                          \
> +                           j * sizeof(libxl_device_##type));            \
> +        d_config.cnt = j;                                               \

The fact that the compare loop also shuffles everything down is worthy
of noting in a comment I think.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 06/10] libxl: synchronise configuration when we remove/destroy a device
  2014-07-10 14:32 ` [PATCH v1 06/10] libxl: synchronise configuration when we remove/destroy " Wei Liu
@ 2014-07-16 16:58   ` Ian Campbell
  0 siblings, 0 replies; 70+ messages in thread
From: Ian Campbell @ 2014-07-16 16:58 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> +DEFINE_DEVICE_RM_AOCOMPLETE(vtpm, vtpms, num_vtpms, COMPARE_DEVID);
> +DEFINE_DEVICE_RM_AOCOMPLETE(nic, nics, num_nics, COMPARE_DEVID);
> +DEFINE_DEVICE_RM_AOCOMPLETE(disk, disks, num_disks, COMPARE_DISK);
> +DEFINE_DEVICE_RM_AOCOMPLETE(vfb, vfbs, num_vfbs, COMPARE_DEVID);
> +DEFINE_DEVICE_RM_AOCOMPLETE(vkb, vkbs, num_vkbs, COMPARE_DEVID);

A bit more cpp pasting would give you arguments #2 and #3 from #1.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device
  2014-07-16 16:48   ` Ian Campbell
@ 2014-07-16 17:12     ` Wei Liu
  2014-07-17 11:44       ` Ian Campbell
  0 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-16 17:12 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Wed, Jul 16, 2014 at 05:48:59PM +0100, Ian Campbell wrote:
> On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> > We update JSON version first, then write to xenstore, so that we
> > maintain the following invariant: any device which is present in
> > xenstore has a corresponding entry in JSON.
> > 
> > The locking pattern is as followed:
> >  1. lock JSON domain config
> >  2. write JSON entry
> >  3. write xenstore entry for a device
> >  4. unlock JSON domain config
> > 
> > And we ensure in code that JSON entry is always written before the
> > xenstore entry is written. That is, if 2 fails, 3 will not be executed.
> > We also ensure that if 2 and 3 are invoked in a loop, the previous
> > added JSON entry is removed before writing new one.
> > 
> > The locking patten is designed like this, because when adding a disk,
> 
> "pattern". Apparently a patten is a type of clog or shoe l-)
> 
> > the caller might specify get_vdev which will get called in the middle
> > of a xenstore transaction to determine the identifier of a disk. Other
> > devices don't have this property, but I make them follow the same
> > pattern for consistency.
> 
> I don't know if it is helpful but AIUI the get_vdev there is purely to
> handle the case where the disk is being attached to the toolstack domain
> (e.g. for pygrub). I don't know if that simplifies anything for you.
> 

Probably not if get_vdev is set to be called in this loop and embedded
in a xenstore transaction.

For other device, determining devid only requires XBT_NULL, so that I
can safely move the call earlier. I'm not confident that I can do the
same for disk.

> > As we don't have a JSON config file for libxl toolstack domain
> > (currently Dom0) we need to skip JSON manipulation for it.
> 
> How hard would it be to create a stub/stunt JSON for dom0 when we notice
> it is missing? Or from e.g. xencommons perhaps?
> 

We need to determine:
1. when / where to generate such thing (xencommons?)
2. what to put in it

I have yet had answers for #2. The simplest version can be "{}" I think.
That is an empty configuration that every fields gets the default value.
But we probably need more than that.

> > Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> > @@ -1917,6 +1922,8 @@ void libxl__device_vtpm_add(libxl__egc *egc, uint32_t domid,
> >              goto out;
> >          }
> >      }
> > +    vtpm_saved.devid = vtpm->devid;
> 
> Why isn't this in the update_config fn?
> 

Sure, this can be done.

> > +    libxl__update_config_vtpm(gc, &vtpm_saved, vtpm);
> >  
> >      GCNEW(device);
> >      rc = libxl__device_from_vtpm(gc, domid, vtpm, device);
> > @@ -1943,17 +1950,30 @@ void libxl__device_vtpm_add(libxl__egc *egc, uint32_t domid,
> >      flexarray_append(front, "handle");
> >      flexarray_append(front, GCSPRINTF("%d", vtpm->devid));
> >  
> > +    LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc);
> > +    if (rc) goto out;
> 
> I can't see anything in this macro (or the unlock one) which
> necessitates it being a macro rather than a helper function.
> 

Helper funtion it is.

> Couldn't you just do the LIBXL_TOOLSTACK_DOMID check in
> libxl__lock_domain_configuration?
> 

This can be done.

> > +
> > +    DEVICE_ADD_JSON(vtpm, vtpms, num_vtpms, domid, &vtpm_saved,
> 
> The second and third of these arguments could be derived from the first.
> 

Unfortunately not: PCI device doesn't follow this rule. Otherwise I
would have done it already.  :-(

> > @@ -2193,21 +2217,56 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
> >          goto out;
> >      }
> >  
> > +    libxl_device_disk_copy(CTX, &disk_saved, disk);
> > +
> > +    LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc);
> > +    if (rc) goto out;
> > +
> >      for (;;) {
> >          rc = libxl__xs_transaction_start(gc, &t);
> > -        if (rc) goto out;
> > +        if (rc) {
> > +            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> > +            goto out;
> > +        }
> > +
> > +        /* This is a loop, if disk has been added to JSON before, we
> > +         * need to remove it, then add it later, because disk->vdev
> > +         * might change.
> 
> Has the old device been removed from xenstore at this point? Otherwise
> aren't you breaking your invariant that a device in xenstore is always
> described in the json.
> 

Not yet, we're here (again) because the transaction hasn't been
committed.

> > +         */
> > +        DEVICE_REMOVE_JSON(disk, disks, num_disks, domid, &disk_saved,
> > +                           COMPARE_DISK, rc);
> > +        if (rc) {
> > +            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> > +            goto out;
> 
> I wonder if you can arrange the exit path such that this can happen
> there and save all of these?
> 

I will see what I can do.

> > +#define DEVICE_ADD_JSON(type, ptr, cnt, domid, dev, compare, rc)        \
> > +    do {                                                                \
> > +        int x;                                                          \
> > +        libxl_domain_config d_config;                                   \
> > +        libxl_device_##type *p;                                         \
> > +                                                                        \
> > +        if (domid == LIBXL_TOOLSTACK_DOMID)                             \
> > +            break;                                                      \
> > +                                                                        \
> > +        libxl_domain_config_init(&d_config);                            \
> > +                                                                        \
> > +        rc = libxl__get_domain_configuration(gc, (domid), &d_config);   \
> > +        if (rc)                                                         \
> > +            goto add_done;                                              \
> > +                                                                        \
> > +        /* Check for duplicated device */                               \
> > +        for (x = 0; x < d_config.cnt; x++) {                            \
> > +            if (compare(&d_config.ptr[x], (dev))) {                     \
> 
> Compare just checks ->devid etc, right?
> 

It depends on specific device type. See those COMPARE_* macros.
Disk and PCI devices don't have devid.

> What if this device has the same devid but a different config?
> 
> That probably ought to be an error, have we already caught that or are
> we relying on something later?
> 

Then the caller should use DEVICE_UPDATE_JSON which is introduced in
later patch.

> I suppose either way we don't want to add the device again now.
> 

Yes.

> > +                rc = 0;                                                 \
> > +                goto add_done;                                          \
> > +            }                                                           \
> > +        }                                                               \
> > +                                                                        \
> > +        d_config.ptr =                                                  \
> > +            libxl__realloc(gc, d_config.ptr,                            \
> > +                           (d_config.cnt + 1) *                         \
> > +                           sizeof(libxl_device_##type));                \
> > +        p = &d_config.ptr[d_config.cnt];                                \
> > +        d_config.cnt++;                                                 \
> > +        libxl_device_##type##_copy(CTX, p, (dev));                      \
> > +                                                                        \
> > +        rc = libxl__set_domain_configuration(gc, (domid), &d_config);   \
> 
> How close to being possible is it to do this as a proper helper function
> which takes a size_t and a bunch of fn pointers with void where the type
> would be?
> 

We need to know the type of this structure otherwise we don't know what
*_copy function to call. Sadly there's no way to pass in type information
in C in runtime.

> > +                                                                        \
> > +    add_done:                                                           \
> 
> You are going to have multiple of these labels in this source file. Not
> sure quite what that means in C. The simple answer would be to involve
> ##type## in the name.
> 

That would break compilation I think. But I haven't seen it so far
because this macro only gets used once in those caller functions.

I will make this change as you suggest anyway.

> > +        libxl_domain_config_dispose(&d_config);                         \
> > +    } while (0)
> > +
> > +#define DEVICE_REMOVE_JSON(type, ptr, cnt, domid, dev, compare, rc)     \
> 
> Same comments as on add.
> 

Same reply as on add. :-)

> > +    do {                                                                \
> > +        int i, j;                                                       \
> > +        libxl_device_##type *p = dev;                                   \
> > +        libxl_domain_config d_config;                                   \
> > +        bool removed = false;                                           \
> > +                                                                        \
> > +        if (domid == LIBXL_TOOLSTACK_DOMID)                             \
> > +            break;                                                      \
> > +                                                                        \
> > +        libxl_domain_config_init(&d_config);                            \
> > +                                                                        \
> > +        rc = libxl__get_domain_configuration(gc, (domid), &d_config);   \
> > +        if (rc)                                                         \
> > +            goto remove_done;                                           \
> > +                                                                        \
> > +        for (i = j = 0; i < d_config.cnt; i++) {                        \
> > +            if (!compare(&d_config.ptr[i], p)) {                        \
> > +                if (i != j) {                                           \
> > +                    libxl_device_##type##_dispose(&d_config.ptr[j]);    \
> > +                    d_config.ptr[j] = d_config.ptr[i];                  \
> > +                    removed = true;                                     \
> > +                }                                                       \
> > +                j++;                                                    \
> > +            }                                                           \
> > +        }                                                               \
> > +                                                                        \
> > +        if (!removed)  /* no matching entry found */                    \
> > +            goto remove_done;                                           \
> > +                                                                        \
> > +        d_config.ptr =                                                  \
> > +            libxl__realloc(NOGC, d_config.ptr,                          \
> > +                           j * sizeof(libxl_device_##type));            \
> > +        d_config.cnt = j;                                               \
> 
> The fact that the compare loop also shuffles everything down is worthy
> of noting in a comment I think.
> 

OK.

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 07/10] libxl: make libxl_cd_insert "eject" + "insert"
  2014-07-10 14:32 ` [PATCH v1 07/10] libxl: make libxl_cd_insert "eject" + "insert" Wei Liu
@ 2014-07-17 10:44   ` Ian Campbell
  2014-07-17 14:20     ` Wei Liu
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-17 10:44 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> A "cdrom insert" is always processed as "eject" + "insert", with JSON
> config updated in between. So that we can know the correct state of
> CDROM later when we try to retrieve domain configuration: if xenstore is
> "empty", then CDROM is "empty"; otherwise use the information presented
> in JSON.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/libxl.c          |   48 ++++++++++++++++++++++++++++++++++++++++--
>  tools/libxl/libxl_internal.h |   34 ++++++++++++++++++++++++++++++
>  2 files changed, 80 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 58f07d2..69d94b1 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -2654,8 +2654,9 @@ int libxl_device_disk_getinfo(libxl_ctx *ctx, uint32_t domid,
>      return 0;
>  }
>  
> -int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk,
> -                       const libxl_asyncop_how *ao_how)
> +static int libxl__cdrom_insert(libxl_ctx *ctx, uint32_t domid,
> +                               libxl_device_disk *disk,
> +                               const libxl_asyncop_how *ao_how)
>  {
>      AO_CREATE(ctx, domid, ao_how);
>      int num = 0, i;
> @@ -2766,6 +2767,49 @@ out:
>      return AO_INPROGRESS;
>  }
>  
> +/*
> + * A "cdrom insert" is always processed as "eject" + "insert", with
> + * updating JSON in between. So that we can know the current state of
> + * CDROM later when we try to retrieve domain configuration: if
> + * xenstore is "empty", then CDROM is "empty"; otherwise use the image
> + * in JSON.

I think you could short circuit the case where the user inserts an empty
disk (AKA ejects), right now you will eject twice. (We don't have an
explicit libxl_cdrom_eject so I presume this is how it is intended to be
done).

I think this function also handles HVM emulated IDE CDROM, which are
semantically a bit different from PV CDROMs in that the empty drive is
still an actual thing. Not sure if/how this impacts on your handling of
the JSON though.

> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index b4f518a..f8f2ba2 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -3355,6 +3355,40 @@ static inline void libxl__update_config_vtpm(libxl__gc *gc,
>          libxl_domain_config_dispose(&d_config);                         \
>      } while (0)
>  
> +/* Search device list for device with the same identifier as "dev",
> + * update that device with the content pointed to by "dev".
> + */
> +#define DEVICE_UPDATE_JSON(type, ptr, cnt, domid, dev, compare, rc)     \

You could define this in terms of REMOVE+ADD, which would match the
behaviour of the actual operation. Slightly less efficient but less code
perhaps?

If not then I wonder how much of these three macros could be made
common.

> +    do {                                                                \
> +        int x;                                                          \
> +        libxl_domain_config d_config;                                   \
> +        bool updated = false;                                           \
> +                                                                        \
> +        if (domid == LIBXL_TOOLSTACK_DOMID)                             \
> +            break;                                                      \
> +                                                                        \
> +        libxl_domain_config_init(&d_config);                            \
> +                                                                        \
> +        rc = libxl__get_domain_configuration(gc, domid, &d_config);     \
> +        if (rc)                                                         \
> +            goto update_done;                                           \
> +                                                                        \
> +        for (x = 0; x < d_config.cnt; x++) {                            \
> +            if (compare(&d_config.ptr[x], (dev))) {                     \
> +                libxl_device_##type##_dispose(&d_config.ptr[x]);        \
> +                libxl_device_##type##_copy(CTX, &d_config.ptr[x],       \
> +                                           (dev));                      \
> +                updated = true;                                         \
> +            }                                                           \
> +        }                                                               \
> +                                                                        \
> +        if (updated)                                                    \
> +            rc = libxl__set_domain_configuration(gc, domid, &d_config); \
> +                                                                        \
> +    update_done:                                                        \
> +        libxl_domain_config_dispose(&d_config);                         \
> +    } while (0)
> +
>  #endif
>  
>  /*

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 08/10] libxl: introduce libxl_get_memory_static_max
  2014-07-10 14:32 ` [PATCH v1 08/10] libxl: introduce libxl_get_memory_static_max Wei Liu
@ 2014-07-17 10:47   ` Ian Campbell
  2014-07-17 12:02     ` Wei Liu
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-17 10:47 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> ... which returns the "static-max" knob of a domain. It will be used in
> later patch to retrieve memory static-max value of a domain.
> 
> As libxl_get_memory_{target, static_max} have similar logic, a macro is
> introduced to avoid code duplication.

It looks to me like this could instead be a common helper function, with
a simple boolean parameter.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 09/10] libxl: introduce libxl_retrieve_domain_configuration
  2014-07-10 14:32 ` [PATCH v1 09/10] libxl: introduce libxl_retrieve_domain_configuration Wei Liu
@ 2014-07-17 10:59   ` Ian Campbell
  2014-07-17 12:11     ` Wei Liu
  2014-07-29 15:29     ` Ian Jackson
  0 siblings, 2 replies; 70+ messages in thread
From: Ian Campbell @ 2014-07-17 10:59 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> Introduce a new public API to return domain configuration. This returned
> configuration can be used to rebuild a domain.
> 
> Note that this configuration doesn't equal to the current state of the
> domain. What it does is to use JSON version configuration as template
> and pull in relevant information from xenstore.

I think this configuration does equal the current state, doesn't it?
Isn't that the whole point?

> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 49d7ef6..cd5914c 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -6034,6 +6034,206 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src)
>      for (i = 0; i < 6; i++)
>          (*dst)[i] = (*src)[i];
>  }
> +
> +int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
> +                                        libxl_domain_config *d_config)

It occurs to me to wonder if any function which takes a lock ought to be
async capable?

Also is there is any possibility that any of the operations needed to
gather the updated configuration might take a long time.

> +    /* Memory limits:
> +     *
> +     * Currently there're three memory limits:
> +     *  1. "target" in xenstore (originally memory= in config file)
> +     *  2. "static-max" in xenstore (originally maxmem= in config file)

Nit: strictly speaking those "originally..." are xl specific and this is
libxl. </pendant>

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
  2014-07-16 16:47     ` Wei Liu
@ 2014-07-17 11:06       ` Ian Campbell
  2014-07-17 11:46         ` Wei Liu
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-17 11:06 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Wed, 2014-07-16 at 17:47 +0100, Wei Liu wrote:
> On Wed, Jul 16, 2014 at 05:18:08PM +0100, Ian Campbell wrote:
> > On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> > 
> > > +/* update the saved domain configuration with a callback function,
> > > + * which is responsible to pull in useful fields from src.
> > > + */
> > > +typedef void (update_callback)(libxl__gc *, libxl_domain_config *,
> > > +                               const libxl_domain_config *);
> > > +static int libxl__update_domain_configuration(libxl__gc *gc,
> > > +                                              uint32_t domid,
> > > +                                              update_callback callback,
> > > +                                              const libxl_domain_config *src)
> > 
> > You aren't using your shiny new locking functions here?
> > 
> 
> We don't need locking here. The lock should already be held (if it needs
> to be held).

OK. It's worth documenting that I think.

> Note, this is not async callback, this is just to make it possible to
> replace the update function at will. Probably renaming "update_callback"
> to "update" is better?

Either works for me.

> And for larger scope, when creating a domain, there shouldn't be any
> other thread trying to access the configuration file (is there?), so
> locking is not necessary.

Could someone race and try and call xl mem-set in the middle of the
create? Not sure. 

Always taking the lock when manipulating a domain's stored config, even
if it isn't strictly necessary means we don't need to think to hard
about those corner cases...

I suppose there is also a brief interval after the createdomain domctl
where the domid is "valid" but no config has been stored yet (since the
create doesn't hold the lock, I think). You might need some robustness
against that too.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 10/10] xl: use libxl_retrieve_domain_configuration and JSON format
  2014-07-10 14:32 ` [PATCH v1 10/10] xl: use libxl_retrieve_domain_configuration and JSON format Wei Liu
@ 2014-07-17 11:13   ` Ian Campbell
  2014-07-17 12:14     ` Wei Liu
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-17 11:13 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> Before this change, xl stores domain configuration in "xl" format, which
> is in fact a verbatim copy of user supplied domain config.
> 
> Now libxl provides a new API to retrieve domain configuration, switch to
> that new API, store configuration in JSON format.
> 
> Tests done so far (xl.{new,old} denotes xl with{,out} "libxl-json"
> support):
> 
> 1. xl.new create then xl.new save, hexdump saved file: domain config
>    saved in JSON format
> 2. xl.new create, xl.new save then xl.old restore: failed on
>    mandatory flag check
> 3. xl.new create, xl.new save then xl.new restore: succeeded
> 4. xl.old create, xl.old save then xl.new restore: succeeded
> 5. xl.new create then local migrate, receiving end xl.new: succeeded
> 6. xl.old create then local migrate, receiving end xl.new: succeeded
> 
> Note that "xl" config is still supported and handled when restarting a
> domain. "xl" config file takes precedence over "libxl-json" in that
> case, so that user who uses "config-update" to store new config file
> won't have regression. All other scenarios (migration, domain listing
> etc.) now use the new API.
> 
> Lastly, print out warning when users invoke "config-update" to
> discourage them from using this command.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> ---
>  docs/man/xl.pod.1         |    3 ++
>  tools/libxl/xl_cmdimpl.c  |  121 ++++++++++++++++++++++++++++-----------------
>  tools/libxl/xl_cmdtable.c |    4 +-
>  3 files changed, 82 insertions(+), 46 deletions(-)
> 
> diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
> index 30bd4bf..6b5baf4 100644
> --- a/docs/man/xl.pod.1
> +++ b/docs/man/xl.pod.1
> @@ -187,6 +187,9 @@ immediate effect but will be applied when the guest is next
>  restarted. This command is useful to ensure that runtime modifications
>  made to the guest will be preserved when the guest is restarted.
>  
> +Libxl now has better capability to handle domain configuration, avoid
> +using this command when possible.

Perhaps say:

        Libxl now has improved capabilities to handle dynamic domain
        configuration changes and will preserve any changes made a
        runtime when necessary. Therefore it should not normally be
        necessary to use this command any more.
        
Ian.

> +
>  I<configfile> has to be an absolute path to a file.
>  
>  B<OPTIONS>
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 68df548..3ee490e 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -107,6 +107,8 @@ static const char migrate_report[]=
>     *            from target to source
>     */
>  
> +#define XL_MANDATORY_FLAG_JSON (1U << 0) /* config data is in JSON format */
> +#define XL_MANDATORY_FLAG_ALL  (XL_MANDATORY_FLAG_JSON)
>  struct save_file_header {
>      char magic[32]; /* savefileheader_magic */
>      /* All uint32_ts are in domain's byte order. */
> @@ -1699,21 +1701,40 @@ skip_vfb:
>  }
>  
>  static void reload_domain_config(uint32_t domid,
> -                                 uint8_t **config_data, int *config_len)
> +                                 libxl_domain_config *d_config)
>  {
> +    int rc;
>      uint8_t *t_data;
>      int ret, t_len;
> +    libxl_domain_config d_config_new;
>  
> +    /* In case user has used "config-update" to store a new config
> +     * file.
> +     */
>      ret = libxl_userdata_retrieve(ctx, domid, "xl", &t_data, &t_len);
> -    if (ret) {
> -        LOG("failed to retrieve guest configuration (rc=%d). "
> -            "reusing old configuration", ret);
> +    if (ret && errno != ENOENT) {
> +        LOG("\"xl\" file found but failed to load\n");

Perhaps "Saved \"xl\" configuration found ..."
> +    }
> +    if (t_len > 0) {
> +        LOG("\"xl\" config file found, use it\n");

"using"

I wonder if at this point we should delete the saved config.

IOW do we expect xl update-config to be one shot or to persist. I think
the one shot is the only sensible answer and it doesn't conflict with
the previous implementation since in the old implementation the saved
config became the new stored config.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 02/10] libxl_internal: functions to lock / unlock domain configuration
  2014-07-16 16:44     ` Wei Liu
@ 2014-07-17 11:29       ` Ian Campbell
  2014-07-17 11:41         ` Wei Liu
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-17 11:29 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Wed, 2014-07-16 at 17:44 +0100, Wei Liu wrote:
> On Wed, Jul 16, 2014 at 05:15:41PM +0100, Ian Campbell wrote:
> > On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> > > Simple file lock taken from xl to serialise access to "libxl-json" file.
> > 
> > Do you mean libxl here?
> > 
> 
> No. I did mean "xl". This implementation is "borrowed" from "xl".


Ah, I thought you meant the lock was taken (as in locked) from xl, which
made no sense. Suggest inserting the word "implementation" in that
sentence somewhere.

> > > If a thread cannot get hold of the lock it waits due to F_SETLKW.
> > 
> > How is deadlock avoided here?
> > 
> 
> This patch only provides "primitive" to lock and unlock. Avoiding
> deadlock is out of scope of this patch. That pretty much depends on the
> users of this lock.

Hrm. It would be conventional to describe for the benefit of those users
what the requirements are. e.g. by defining a locking hierarchy for
libxl. e.g. relative to the CTX lock.

It might also be necessary  to specify what the various event callbacks
are allowed to assume here -- but Ian knows the event code best so I'll
let him comment.

> > > In order to generate lock file name, rename userdata_path to
> > > libxl__userdata_path and declare it in libxl_internal.h
> > 
> > Given that you don't appear to be locking the userdata file itself that
> > seems like an odd place for a lock to me, why not XEN_LOCK_DIR?
> > 
> 
> So that the lock file itself gets deleted by libxl automatically. Using
> XEN_LOCK_DIR requires some extra care to delete the lock file. And if
> libxl user crashes for any reason, those files become stale. Leaving
> them in one location is easier to clean them up.

There's only one (empty) file though, isn't there? and it will be reused
by any subsequent libxl call from any process. IOW I don't think you
need to concern yourself with removing it.

> > Did you consider locking the actual file though? That would avoid this
> > lock being a bottleneck...
> > 
> 
> It's just that part of this series was taken from my older series where
> some macro already had its own code to manipulate configuration file,
> plumbing locking code (passing fd around) and those macros together
> would make code look complete new. I was a bit worried I would introduce
> too many changes that might confuse you maintainers. I can lock the file
> itself if you prefer. 
> 
> But I don't quite follow how this lock can be a bottleneck. Locking the
> actual file is different from locking a lock file? Isn't there one file
> to be locked anyway?

If you lock the actual file then you are only serialising operations on
a single domain, operations on other domains can go ahead (each with
their own lock).

If you lock a single file then all operations on all domains are
serialised.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 02/10] libxl_internal: functions to lock / unlock domain configuration
  2014-07-17 11:29       ` Ian Campbell
@ 2014-07-17 11:41         ` Wei Liu
  2014-07-17 11:48           ` Ian Campbell
  0 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-17 11:41 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Thu, Jul 17, 2014 at 12:29:11PM +0100, Ian Campbell wrote:
> On Wed, 2014-07-16 at 17:44 +0100, Wei Liu wrote:
> > On Wed, Jul 16, 2014 at 05:15:41PM +0100, Ian Campbell wrote:
> > > On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> > > > Simple file lock taken from xl to serialise access to "libxl-json" file.
> > > 
> > > Do you mean libxl here?
> > > 
> > 
> > No. I did mean "xl". This implementation is "borrowed" from "xl".
> 
> 
> Ah, I thought you meant the lock was taken (as in locked) from xl, which
> made no sense. Suggest inserting the word "implementation" in that
> sentence somewhere.
> 

No problem.

> > > > If a thread cannot get hold of the lock it waits due to F_SETLKW.
> > > 
> > > How is deadlock avoided here?
> > > 
> > 
> > This patch only provides "primitive" to lock and unlock. Avoiding
> > deadlock is out of scope of this patch. That pretty much depends on the
> > users of this lock.
> 
> Hrm. It would be conventional to describe for the benefit of those users
> what the requirements are. e.g. by defining a locking hierarchy for
> libxl. e.g. relative to the CTX lock.
> 

This lock is only used internally by libxl. I guess I should document it
in libxl_internal.h? There doesn't seem to a existing doc on locking
hierarchy in either libxl.h or libxl_internal.h. I guess the
introduction of this lock is the first instance that needs documented?

> It might also be necessary  to specify what the various event callbacks
> are allowed to assume here -- but Ian knows the event code best so I'll
> let him comment.
> 
> > > > In order to generate lock file name, rename userdata_path to
> > > > libxl__userdata_path and declare it in libxl_internal.h
> > > 
> > > Given that you don't appear to be locking the userdata file itself that
> > > seems like an odd place for a lock to me, why not XEN_LOCK_DIR?
> > > 
> > 
> > So that the lock file itself gets deleted by libxl automatically. Using
> > XEN_LOCK_DIR requires some extra care to delete the lock file. And if
> > libxl user crashes for any reason, those files become stale. Leaving
> > them in one location is easier to clean them up.
> 
> There's only one (empty) file though, isn't there? and it will be reused
> by any subsequent libxl call from any process. IOW I don't think you
> need to concern yourself with removing it.
> 

No, it's per-domain, so that we don't content on it (now I see why you
said it would be a bottleneck because you thought there's only one
file). Leaving a bunch of stale files is not that nice, isn't it?

> > > Did you consider locking the actual file though? That would avoid this
> > > lock being a bottleneck...
> > > 
> > 
> > It's just that part of this series was taken from my older series where
> > some macro already had its own code to manipulate configuration file,
> > plumbing locking code (passing fd around) and those macros together
> > would make code look complete new. I was a bit worried I would introduce
> > too many changes that might confuse you maintainers. I can lock the file
> > itself if you prefer. 
> > 
> > But I don't quite follow how this lock can be a bottleneck. Locking the
> > actual file is different from locking a lock file? Isn't there one file
> > to be locked anyway?
> 
> If you lock the actual file then you are only serialising operations on
> a single domain, operations on other domains can go ahead (each with
> their own lock).
> 
> If you lock a single file then all operations on all domains are
> serialised.
> 

See above.

Wei.

> Ian.
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device
  2014-07-16 17:12     ` Wei Liu
@ 2014-07-17 11:44       ` Ian Campbell
  2014-07-17 14:13         ` Wei Liu
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-17 11:44 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Wed, 2014-07-16 at 18:12 +0100, Wei Liu wrote:
> > > As we don't have a JSON config file for libxl toolstack domain
> > > (currently Dom0) we need to skip JSON manipulation for it.
> > 
> > How hard would it be to create a stub/stunt JSON for dom0 when we notice
> > it is missing? Or from e.g. xencommons perhaps?
> > 
> 
> We need to determine:
> 1. when / where to generate such thing (xencommons?)
> 2. what to put in it
> 
> I have yet had answers for #2. The simplest version can be "{}" I think.
> That is an empty configuration that every fields gets the default value.
> But we probably need more than that.

I think you would want at least a name and perhaps a uuid? And cinfo
type == PV.

Device arrays are all empty at start of day.

Some of the stuff about target and maxmem you could perhaps infer at
start of day?

> > > +    DEVICE_ADD_JSON(vtpm, vtpms, num_vtpms, domid, &vtpm_saved,
> > 
> > The second and third of these arguments could be derived from the first.
> > 
> 
> Unfortunately not: PCI device doesn't follow this rule. Otherwise I
> would have done it already.  :-(

Ah. I had somehow concluded that you weren't using this for PCI. How
annoying.

You could consider DEVICE_ADD_JSON_SIMPLE (or some other word) which
calls the above for the non-PCI cases. Perhaps not worth it though.

> 
> > > +#define DEVICE_ADD_JSON(type, ptr, cnt, domid, dev, compare, rc)        \
> > > +    do {                                                                \
> > > +        int x;                                                          \
> > > +        libxl_domain_config d_config;                                   \
> > > +        libxl_device_##type *p;                                         \
> > > +                                                                        \
> > > +        if (domid == LIBXL_TOOLSTACK_DOMID)                             \
> > > +            break;                                                      \
> > > +                                                                        \
> > > +        libxl_domain_config_init(&d_config);                            \
> > > +                                                                        \
> > > +        rc = libxl__get_domain_configuration(gc, (domid), &d_config);   \
> > > +        if (rc)                                                         \
> > > +            goto add_done;                                              \
> > > +                                                                        \
> > > +        /* Check for duplicated device */                               \
> > > +        for (x = 0; x < d_config.cnt; x++) {                            \
> > > +            if (compare(&d_config.ptr[x], (dev))) {                     \
> > 
> > Compare just checks ->devid etc, right?
> > 
> 
> It depends on specific device type. See those COMPARE_* macros.
> Disk and PCI devices don't have devid.

Sure. I just meant to use devid as a specific example.

> > What if this device has the same devid but a different config?
> > 
> > That probably ought to be an error, have we already caught that or are
> > we relying on something later?
> > 
> 
> Then the caller should use DEVICE_UPDATE_JSON which is introduced in
> later patch.
> 
> > I suppose either way we don't want to add the device again now.
> > 
> 
> Yes.
> 
> > > +                rc = 0;                                                 \
> > > +                goto add_done;                                          \
> > > +            }                                                           \
> > > +        }                                                               \
> > > +                                                                        \
> > > +        d_config.ptr =                                                  \
> > > +            libxl__realloc(gc, d_config.ptr,                            \
> > > +                           (d_config.cnt + 1) *                         \
> > > +                           sizeof(libxl_device_##type));                \
> > > +        p = &d_config.ptr[d_config.cnt];                                \
> > > +        d_config.cnt++;                                                 \
> > > +        libxl_device_##type##_copy(CTX, p, (dev));                      \
> > > +                                                                        \
> > > +        rc = libxl__set_domain_configuration(gc, (domid), &d_config);   \
> > 
> > How close to being possible is it to do this as a proper helper function
> > which takes a size_t and a bunch of fn pointers with void where the type
> > would be?
> > 
> 
> We need to know the type of this structure otherwise we don't know what
> *_copy function to call. Sadly there's no way to pass in type information
> in C in runtime.

You could pass a copyfn as a function pointer with a void * argument for
the object to a helper function which doesn't need the specifics and
then have a macro which simply defines the type safe wrappers around
that. ISTR thinking  that the helper would need a sizeof passing to it
as well for the realloc.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
  2014-07-17 11:06       ` Ian Campbell
@ 2014-07-17 11:46         ` Wei Liu
  0 siblings, 0 replies; 70+ messages in thread
From: Wei Liu @ 2014-07-17 11:46 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Thu, Jul 17, 2014 at 12:06:48PM +0100, Ian Campbell wrote:
[...]
> Either works for me.
> 
> > And for larger scope, when creating a domain, there shouldn't be any
> > other thread trying to access the configuration file (is there?), so
> > locking is not necessary.
> 
> Could someone race and try and call xl mem-set in the middle of the
> create? Not sure. 
> 

Good point.

To be clear, we don't really care about those moving parts in xenstore,
as we always read from xenstore later to get the latest value.

What concerns me is user doing "xl XXX-attach" etc before the config is
stored.

> Always taking the lock when manipulating a domain's stored config, even
> if it isn't strictly necessary means we don't need to think to hard
> about those corner cases...
> 
> I suppose there is also a brief interval after the createdomain domctl
> where the domid is "valid" but no config has been stored yet (since the
> create doesn't hold the lock, I think). You might need some robustness
> against that too.
> 

I think you're right.

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 02/10] libxl_internal: functions to lock / unlock domain configuration
  2014-07-17 11:41         ` Wei Liu
@ 2014-07-17 11:48           ` Ian Campbell
  0 siblings, 0 replies; 70+ messages in thread
From: Ian Campbell @ 2014-07-17 11:48 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Thu, 2014-07-17 at 12:41 +0100, Wei Liu wrote:
> > > > > If a thread cannot get hold of the lock it waits due to F_SETLKW.
> > > > 
> > > > How is deadlock avoided here?
> > > > 
> > > 
> > > This patch only provides "primitive" to lock and unlock. Avoiding
> > > deadlock is out of scope of this patch. That pretty much depends on the
> > > users of this lock.
> > 
> > Hrm. It would be conventional to describe for the benefit of those users
> > what the requirements are. e.g. by defining a locking hierarchy for
> > libxl. e.g. relative to the CTX lock.
> > 
> 
> This lock is only used internally by libxl. I guess I should document it
> in libxl_internal.h? There doesn't seem to a existing doc on locking
> hierarchy in either libxl.h or libxl_internal.h. I guess the
> introduction of this lock is the first instance that needs documented?

The context lock is documented in libxl_internal.h. It says there that
it must be the inner most lock, so it would be invalid to take the CTX
lock with your new lock held for example.

> > It might also be necessary  to specify what the various event callbacks
> > are allowed to assume here -- but Ian knows the event code best so I'll
> > let him comment.
> > 
> > > > > In order to generate lock file name, rename userdata_path to
> > > > > libxl__userdata_path and declare it in libxl_internal.h
> > > > 
> > > > Given that you don't appear to be locking the userdata file itself that
> > > > seems like an odd place for a lock to me, why not XEN_LOCK_DIR?
> > > > 
> > > 
> > > So that the lock file itself gets deleted by libxl automatically. Using
> > > XEN_LOCK_DIR requires some extra care to delete the lock file. And if
> > > libxl user crashes for any reason, those files become stale. Leaving
> > > them in one location is easier to clean them up.
> > 
> > There's only one (empty) file though, isn't there? and it will be reused
> > by any subsequent libxl call from any process. IOW I don't think you
> > need to concern yourself with removing it.
> > 
> 
> No, it's per-domain, so that we don't content on it (now I see why you
> said it would be a bottleneck because you thought there's only one
> file). Leaving a bunch of stale files is not that nice, isn't it?

Yes, I misread something earlier and thought there was only one lock
file. Leaving lots of files around is indeed to be avoided.

If you are going to use libxl__userdata_path for this then you should
add libxl-json.lock to the registry of userdata types.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 08/10] libxl: introduce libxl_get_memory_static_max
  2014-07-17 10:47   ` Ian Campbell
@ 2014-07-17 12:02     ` Wei Liu
  2014-07-17 13:59       ` Ian Campbell
  0 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-17 12:02 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Thu, Jul 17, 2014 at 11:47:08AM +0100, Ian Campbell wrote:
> On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> > ... which returns the "static-max" knob of a domain. It will be used in
> > later patch to retrieve memory static-max value of a domain.
> > 
> > As libxl_get_memory_{target, static_max} have similar logic, a macro is
> > introduced to avoid code duplication.
> 
> It looks to me like this could instead be a common helper function, with
> a simple boolean parameter.
> 

Ian J likes macro while you likes functions. I'm fine with anything that
works. :-)

Helper function it is.

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 09/10] libxl: introduce libxl_retrieve_domain_configuration
  2014-07-17 10:59   ` Ian Campbell
@ 2014-07-17 12:11     ` Wei Liu
  2014-07-17 14:02       ` Ian Campbell
  2014-07-29 15:31       ` Ian Jackson
  2014-07-29 15:29     ` Ian Jackson
  1 sibling, 2 replies; 70+ messages in thread
From: Wei Liu @ 2014-07-17 12:11 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Thu, Jul 17, 2014 at 11:59:01AM +0100, Ian Campbell wrote:
> On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> > Introduce a new public API to return domain configuration. This returned
> > configuration can be used to rebuild a domain.
> > 
> > Note that this configuration doesn't equal to the current state of the
> > domain. What it does is to use JSON version configuration as template
> > and pull in relevant information from xenstore.
> 
> I think this configuration does equal the current state, doesn't it?
> Isn't that the whole point?
> 

Hrm... by "current state" I mean the current running state. But to
rebuild a domain we might leave some configurations for the remote host
toolstack to decide. The configuration this API returns is (template
configuration + current state that we care about).

> > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> > index 49d7ef6..cd5914c 100644
> > --- a/tools/libxl/libxl.c
> > +++ b/tools/libxl/libxl.c
> > @@ -6034,6 +6034,206 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src)
> >      for (i = 0; i < 6; i++)
> >          (*dst)[i] = (*src)[i];
> >  }
> > +
> > +int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
> > +                                        libxl_domain_config *d_config)
> 
> It occurs to me to wonder if any function which takes a lock ought to be
> async capable?
> 

Good point.

> Also is there is any possibility that any of the operations needed to
> gather the updated configuration might take a long time.
> 

I think reading from xenstore and filesystem can be slow, in the sense
that if there's much contention on these two resources it could take
seconds to finish operations.

> > +    /* Memory limits:
> > +     *
> > +     * Currently there're three memory limits:
> > +     *  1. "target" in xenstore (originally memory= in config file)
> > +     *  2. "static-max" in xenstore (originally maxmem= in config file)
> 
> Nit: strictly speaking those "originally..." are xl specific and this is
> libxl. </pendant>
> 

You mean I should write "originally memory= in xl config file"?

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 10/10] xl: use libxl_retrieve_domain_configuration and JSON format
  2014-07-17 11:13   ` Ian Campbell
@ 2014-07-17 12:14     ` Wei Liu
  0 siblings, 0 replies; 70+ messages in thread
From: Wei Liu @ 2014-07-17 12:14 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Thu, Jul 17, 2014 at 12:13:17PM +0100, Ian Campbell wrote:
[...]
> >  made to the guest will be preserved when the guest is restarted.
> >  
> > +Libxl now has better capability to handle domain configuration, avoid
> > +using this command when possible.
> 
> Perhaps say:
> 
>         Libxl now has improved capabilities to handle dynamic domain
>         configuration changes and will preserve any changes made a
>         runtime when necessary. Therefore it should not normally be
>         necessary to use this command any more.
>         

I will use this.

> > +
> >  I<configfile> has to be an absolute path to a file.
> >  
[...]
> >  static void reload_domain_config(uint32_t domid,
> > -                                 uint8_t **config_data, int *config_len)
> > +                                 libxl_domain_config *d_config)
> >  {
> > +    int rc;
> >      uint8_t *t_data;
> >      int ret, t_len;
> > +    libxl_domain_config d_config_new;
> >  
> > +    /* In case user has used "config-update" to store a new config
> > +     * file.
> > +     */
> >      ret = libxl_userdata_retrieve(ctx, domid, "xl", &t_data, &t_len);
> > -    if (ret) {
> > -        LOG("failed to retrieve guest configuration (rc=%d). "
> > -            "reusing old configuration", ret);
> > +    if (ret && errno != ENOENT) {
> > +        LOG("\"xl\" file found but failed to load\n");
> 
> Perhaps "Saved \"xl\" configuration found ..."

NP.

> > +    }
> > +    if (t_len > 0) {
> > +        LOG("\"xl\" config file found, use it\n");
> 
> "using"
> 
> I wonder if at this point we should delete the saved config.
> 
> IOW do we expect xl update-config to be one shot or to persist. I think
> the one shot is the only sensible answer and it doesn't conflict with
> the previous implementation since in the old implementation the saved
> config became the new stored config.
> 

Makes sense.

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 08/10] libxl: introduce libxl_get_memory_static_max
  2014-07-17 12:02     ` Wei Liu
@ 2014-07-17 13:59       ` Ian Campbell
  2014-07-29 13:39         ` Ian Jackson
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-17 13:59 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Thu, 2014-07-17 at 13:02 +0100, Wei Liu wrote:
> On Thu, Jul 17, 2014 at 11:47:08AM +0100, Ian Campbell wrote:
> > On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> > > ... which returns the "static-max" knob of a domain. It will be used in
> > > later patch to retrieve memory static-max value of a domain.
> > > 
> > > As libxl_get_memory_{target, static_max} have similar logic, a macro is
> > > introduced to avoid code duplication.
> > 
> > It looks to me like this could instead be a common helper function, with
> > a simple boolean parameter.
> > 
> 
> Ian J likes macro while you likes functions. I'm fine with anything that
> works. :-)

I think Ian prefers macros over repetition, but not at the expense of a
helper function. IOW macros only when the function is impossible. At
least I hope that's the case!

> Helper function it is.

You should probably check that Ian J doesn't disagree violently
first ;-)

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 09/10] libxl: introduce libxl_retrieve_domain_configuration
  2014-07-17 12:11     ` Wei Liu
@ 2014-07-17 14:02       ` Ian Campbell
  2014-07-17 14:28         ` Wei Liu
  2014-07-29 15:31       ` Ian Jackson
  1 sibling, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-17 14:02 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Thu, 2014-07-17 at 13:11 +0100, Wei Liu wrote:
> On Thu, Jul 17, 2014 at 11:59:01AM +0100, Ian Campbell wrote:
> > On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> > > Introduce a new public API to return domain configuration. This returned
> > > configuration can be used to rebuild a domain.
> > > 
> > > Note that this configuration doesn't equal to the current state of the
> > > domain. What it does is to use JSON version configuration as template
> > > and pull in relevant information from xenstore.
> > 
> > I think this configuration does equal the current state, doesn't it?
> > Isn't that the whole point?
> > 
> 
> Hrm... by "current state" I mean the current running state. But to
> rebuild a domain we might leave some configurations for the remote host
> toolstack to decide. The configuration this API returns is (template
> configuration + current state that we care about).

Hrm, I can't think of a good name for this. Describing it as the state
needed to reproduce the guest visible state might make sense?
> 
> > > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> > > index 49d7ef6..cd5914c 100644
> > > --- a/tools/libxl/libxl.c
> > > +++ b/tools/libxl/libxl.c
> > > @@ -6034,6 +6034,206 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src)
> > >      for (i = 0; i < 6; i++)
> > >          (*dst)[i] = (*src)[i];
> > >  }
> > > +
> > > +int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
> > > +                                        libxl_domain_config *d_config)
> > 
> > It occurs to me to wonder if any function which takes a lock ought to be
> > async capable?
> > 
> 
> Good point.
> 
> > Also is there is any possibility that any of the operations needed to
> > gather the updated configuration might take a long time.
> > 
> 
> I think reading from xenstore and filesystem can be slow, in the sense
> that if there's much contention on these two resources it could take
> seconds to finish operations.

I'm pretty sure we don't consider those to be "slow". See in
libxl_internal.h:

 * "Slow" functions includes any that might block on a guest or an
 * external script.  More broadly, it includes any operations which
 * are sufficiently slow that an application might reasonably want to
 * initiate them, and then carry on doing something else, while the
 * operation completes.  That is, a "fast" function must be fast
 * enough that we do not mind blocking all other management operations
 * on the same host while it completes.
 *
 * There are certain primitive functions which make a libxl operation
 * necessarily "slow" for API reasons.  These are:
 *  - awaiting xenstore watches (although read-modify-write xenstore
 *    transactions are OK for fast functions)
 *  - spawning subprocesses
 *  - anything with a timeout

> > > +    /* Memory limits:
> > > +     *
> > > +     * Currently there're three memory limits:
> > > +     *  1. "target" in xenstore (originally memory= in config file)
> > > +     *  2. "static-max" in xenstore (originally maxmem= in config file)
> > 
> > Nit: strictly speaking those "originally..." are xl specific and this is
> > libxl. </pendant>
> > 
> 
> You mean I should write "originally memory= in xl config file"?

That would satisfy my pedantry, yes ;-)

But maybe you want to reference the libxl_domain_config fields instead?

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device
  2014-07-17 11:44       ` Ian Campbell
@ 2014-07-17 14:13         ` Wei Liu
  2014-07-18  8:49           ` Ian Campbell
  0 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-17 14:13 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Thu, Jul 17, 2014 at 12:44:43PM +0100, Ian Campbell wrote:
> On Wed, 2014-07-16 at 18:12 +0100, Wei Liu wrote:
> > > > As we don't have a JSON config file for libxl toolstack domain
> > > > (currently Dom0) we need to skip JSON manipulation for it.
> > > 
> > > How hard would it be to create a stub/stunt JSON for dom0 when we notice
> > > it is missing? Or from e.g. xencommons perhaps?
> > > 
> > 
> > We need to determine:
> > 1. when / where to generate such thing (xencommons?)
> > 2. what to put in it
> > 
> > I have yet had answers for #2. The simplest version can be "{}" I think.
> > That is an empty configuration that every fields gets the default value.
> > But we probably need more than that.
> 
> I think you would want at least a name and perhaps a uuid? And cinfo
> type == PV.
> 
> Device arrays are all empty at start of day.
> 
> Some of the stuff about target and maxmem you could perhaps infer at
> start of day?
> 

UUID (Dom0 doesn't seem to have one), name and memory targets can all be
pulled from xenstore when they are required.

And it occurs to me as I discovered Dom0 doesn't have UUID that we need
to special-case reading / writing of Dom0's JSON config. That's because
all other guests' JSON config are to be named with UUID and domain id.
How annoying. :-(

I would like to put as few things as possible in the stub because there
doesn't seem to be a way to conveniently generate a valid JSON config
for Dom0. Will you be against the idea of having 'xl generate-dom0-json'
in xl to do that? Otherwise we have to basically generate a
semi-handcoded stub in xencommons, or even a hardcoded stub.


> > > > +    DEVICE_ADD_JSON(vtpm, vtpms, num_vtpms, domid, &vtpm_saved,
> > > 
> > > The second and third of these arguments could be derived from the first.
[...]
> > > How close to being possible is it to do this as a proper helper function
> > > which takes a size_t and a bunch of fn pointers with void where the type
> > > would be?
> > > 
> > 
> > We need to know the type of this structure otherwise we don't know what
> > *_copy function to call. Sadly there's no way to pass in type information
> > in C in runtime.
> 
> You could pass a copyfn as a function pointer with a void * argument for
> the object to a helper function which doesn't need the specifics and
> then have a macro which simply defines the type safe wrappers around
> that. ISTR thinking  that the helper would need a sizeof passing to it
> as well for the realloc.
> 

I see. I will try to go with this approach.

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 07/10] libxl: make libxl_cd_insert "eject" + "insert"
  2014-07-17 10:44   ` Ian Campbell
@ 2014-07-17 14:20     ` Wei Liu
  0 siblings, 0 replies; 70+ messages in thread
From: Wei Liu @ 2014-07-17 14:20 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Thu, Jul 17, 2014 at 11:44:35AM +0100, Ian Campbell wrote:
[...]
> >  
> > +/*
> > + * A "cdrom insert" is always processed as "eject" + "insert", with
> > + * updating JSON in between. So that we can know the current state of
> > + * CDROM later when we try to retrieve domain configuration: if
> > + * xenstore is "empty", then CDROM is "empty"; otherwise use the image
> > + * in JSON.
> 
> I think you could short circuit the case where the user inserts an empty
> disk (AKA ejects), right now you will eject twice. (We don't have an
> explicit libxl_cdrom_eject so I presume this is how it is intended to be
> done).
> 

OK. This is a valid optimisation.

> I think this function also handles HVM emulated IDE CDROM, which are
> semantically a bit different from PV CDROMs in that the empty drive is
> still an actual thing. Not sure if/how this impacts on your handling of
> the JSON though.
> 

IMO it wouldn't impact that much. I don't actually care what type of
backend it is using. I only care about whether the operation succeeds or
not.

> > diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> > index b4f518a..f8f2ba2 100644
> > --- a/tools/libxl/libxl_internal.h
> > +++ b/tools/libxl/libxl_internal.h
> > @@ -3355,6 +3355,40 @@ static inline void libxl__update_config_vtpm(libxl__gc *gc,
> >          libxl_domain_config_dispose(&d_config);                         \
> >      } while (0)
> >  
> > +/* Search device list for device with the same identifier as "dev",
> > + * update that device with the content pointed to by "dev".
> > + */
> > +#define DEVICE_UPDATE_JSON(type, ptr, cnt, domid, dev, compare, rc)     \
> 
> You could define this in terms of REMOVE+ADD, which would match the
> behaviour of the actual operation. Slightly less efficient but less code
> perhaps?
> 

Correct, it's less efficient. But if you prefer it that way I can make
it that way. I guess one extra read and one extra write won't slow down
things much.

Wei.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 09/10] libxl: introduce libxl_retrieve_domain_configuration
  2014-07-17 14:02       ` Ian Campbell
@ 2014-07-17 14:28         ` Wei Liu
  2014-07-18  8:52           ` Ian Campbell
  0 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-17 14:28 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Thu, Jul 17, 2014 at 03:02:11PM +0100, Ian Campbell wrote:
> On Thu, 2014-07-17 at 13:11 +0100, Wei Liu wrote:
> > On Thu, Jul 17, 2014 at 11:59:01AM +0100, Ian Campbell wrote:
> > > On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> > > > Introduce a new public API to return domain configuration. This returned
> > > > configuration can be used to rebuild a domain.
> > > > 
> > > > Note that this configuration doesn't equal to the current state of the
> > > > domain. What it does is to use JSON version configuration as template
> > > > and pull in relevant information from xenstore.
> > > 
> > > I think this configuration does equal the current state, doesn't it?
> > > Isn't that the whole point?
> > > 
> > 
> > Hrm... by "current state" I mean the current running state. But to
> > rebuild a domain we might leave some configurations for the remote host
> > toolstack to decide. The configuration this API returns is (template
> > configuration + current state that we care about).
> 
> Hrm, I can't think of a good name for this. Describing it as the state
> needed to reproduce the guest visible state might make sense?

That will do for me. But as I'm the one implementing this I can always
understand the description however vague it is. :-)

> > 
> > > > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> > > > index 49d7ef6..cd5914c 100644
> > > > --- a/tools/libxl/libxl.c
> > > > +++ b/tools/libxl/libxl.c
> > > > @@ -6034,6 +6034,206 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src)
> > > >      for (i = 0; i < 6; i++)
> > > >          (*dst)[i] = (*src)[i];
> > > >  }
> > > > +
> > > > +int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
> > > > +                                        libxl_domain_config *d_config)
> > > 
> > > It occurs to me to wonder if any function which takes a lock ought to be
> > > async capable?
> > > 
> > 
> > Good point.
> > 
> > > Also is there is any possibility that any of the operations needed to
> > > gather the updated configuration might take a long time.
> > > 
> > 
> > I think reading from xenstore and filesystem can be slow, in the sense
> > that if there's much contention on these two resources it could take
> > seconds to finish operations.
> 
> I'm pretty sure we don't consider those to be "slow". See in
> libxl_internal.h:
> 
>  * "Slow" functions includes any that might block on a guest or an
>  * external script.  More broadly, it includes any operations which
>  * are sufficiently slow that an application might reasonably want to
>  * initiate them, and then carry on doing something else, while the
>  * operation completes.  That is, a "fast" function must be fast
>  * enough that we do not mind blocking all other management operations
>  * on the same host while it completes.
>  *
>  * There are certain primitive functions which make a libxl operation
>  * necessarily "slow" for API reasons.  These are:
>  *  - awaiting xenstore watches (although read-modify-write xenstore
>  *    transactions are OK for fast functions)
>  *  - spawning subprocesses
>  *  - anything with a timeout
> 

So I was thinking about "slow" in general while you were talking about
"slow" in the context of libxl.

If we discuss this in the context of libxl, then I think this API is a
"fast" API that doesn't require to be made async.

> > > > +    /* Memory limits:
> > > > +     *
> > > > +     * Currently there're three memory limits:
> > > > +     *  1. "target" in xenstore (originally memory= in config file)
> > > > +     *  2. "static-max" in xenstore (originally maxmem= in config file)
> > > 
> > > Nit: strictly speaking those "originally..." are xl specific and this is
> > > libxl. </pendant>
> > > 
> > 
> > You mean I should write "originally memory= in xl config file"?
> 
> That would satisfy my pedantry, yes ;-)
> 
> But maybe you want to reference the libxl_domain_config fields instead?
> 

I'm fine with this.

Wei.

> Ian.
> 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device
  2014-07-17 14:13         ` Wei Liu
@ 2014-07-18  8:49           ` Ian Campbell
  2014-07-18 11:22             ` Wei Liu
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-18  8:49 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Thu, 2014-07-17 at 15:13 +0100, Wei Liu wrote:
> On Thu, Jul 17, 2014 at 12:44:43PM +0100, Ian Campbell wrote:
> > On Wed, 2014-07-16 at 18:12 +0100, Wei Liu wrote:
> > > > > As we don't have a JSON config file for libxl toolstack domain
> > > > > (currently Dom0) we need to skip JSON manipulation for it.
> > > > 
> > > > How hard would it be to create a stub/stunt JSON for dom0 when we notice
> > > > it is missing? Or from e.g. xencommons perhaps?
> > > > 
> > > 
> > > We need to determine:
> > > 1. when / where to generate such thing (xencommons?)
> > > 2. what to put in it
> > > 
> > > I have yet had answers for #2. The simplest version can be "{}" I think.
> > > That is an empty configuration that every fields gets the default value.
> > > But we probably need more than that.
> > 
> > I think you would want at least a name and perhaps a uuid? And cinfo
> > type == PV.
> > 
> > Device arrays are all empty at start of day.
> > 
> > Some of the stuff about target and maxmem you could perhaps infer at
> > start of day?
> > 
> 
> UUID (Dom0 doesn't seem to have one), name and memory targets can all be
> pulled from xenstore when they are required.
> 
> And it occurs to me as I discovered Dom0 doesn't have UUID that we need
> to special-case reading / writing of Dom0's JSON config. That's because
> all other guests' JSON config are to be named with UUID and domain id.
> How annoying. :-(

Indeed.

I think you could generate one on boot and set it with
XEN_DOMCTL_setdomainhandle. It might be a bug that we don't do that
today (that said I can't see any evidence that xend used to do
differently).

> I would like to put as few things as possible in the stub because there
> doesn't seem to be a way to conveniently generate a valid JSON config
> for Dom0. Will you be against the idea of having 'xl generate-dom0-json'
> in xl to do that? Otherwise we have to basically generate a
> semi-handcoded stub in xencommons, or even a hardcoded stub.

I'm in favour of some sort of command to "initialise dom0". Either part
of xl or a new helper app based on libxl.

I had a similar (unposted I think) patch to add xl launch-dom0-qemu so
that it could pick the correct arch (see below, warning: it's a bit
skanky). If I were to do it again today I'd probably make a separate
$libexec helper instead of bolting it into xl though.

Probably it should subsume this bit of xencommons too:

                echo Setting domain 0 name and domid...
                ${BINDIR}/xenstore-write "/local/domain/0/name" "Domain-0"
                ${BINDIR}/xenstore-write "/local/domain/0/domid" 0

What do you think?

> > > > > +    DEVICE_ADD_JSON(vtpm, vtpms, num_vtpms, domid, &vtpm_saved,
> > > > 
> > > > The second and third of these arguments could be derived from the first.
> [...]
> > > > How close to being possible is it to do this as a proper helper function
> > > > which takes a size_t and a bunch of fn pointers with void where the type
> > > > would be?
> > > > 
> > > 
> > > We need to know the type of this structure otherwise we don't know what
> > > *_copy function to call. Sadly there's no way to pass in type information
> > > in C in runtime.
> > 
> > You could pass a copyfn as a function pointer with a void * argument for
> > the object to a helper function which doesn't need the specifics and
> > then have a macro which simply defines the type safe wrappers around
> > that. ISTR thinking  that the helper would need a sizeof passing to it
> > as well for the realloc.
> > 
> 
> I see. I will try to go with this approach.

I'd check with Ian J first, he might have some reason to prefer macros
over void * + wrapper.



Ian.


commit 7b5d54c9a5d09c4138bec905c9accea34173ba77
Author: Ian Campbell <ian.campbell@citrix.com>
Date:   Wed May 15 16:34:55 2013 +0100

    tools: build and launch correct qemu for architecture
    
    xl now provides a launch-dom0-qemu command which avoids the need to have the
    initscripts be aware of the specific qermu binary name, which differs by
    architecture and which also may have been specified by the user via the
    --with-system-qemu=PATH option to configure.
    
    Perhaps this should be a separate binary hidden in libexec?
    
    Signed-off-by: Ian Campbell <ian.campbell@citrix.com>

diff --git a/config/arm32.mk b/config/arm32.mk
index aa79d22..f3599a3 100644
--- a/config/arm32.mk
+++ b/config/arm32.mk
@@ -3,6 +3,8 @@ CONFIG_ARM_32 := y
 CONFIG_ARM_$(XEN_OS) := y
 
 CONFIG_XEN_INSTALL_SUFFIX :=
+CONFIG_QEMU_ARCH := arm
+CONFIG_QEMU_TARGET := arm-softmmu
 
 # -march= -mcpu=
 
diff --git a/config/arm64.mk b/config/arm64.mk
index 15b57a4..4ff15e0 100644
--- a/config/arm64.mk
+++ b/config/arm64.mk
@@ -3,6 +3,8 @@ CONFIG_ARM_64 := y
 CONFIG_ARM_$(XEN_OS) := y
 
 CONFIG_XEN_INSTALL_SUFFIX :=
+CONFIG_QEMU_ARCH := aarch64
+CONFIG_QEMU_TARGET := arm-softmmu
 
 CFLAGS += #-marm -march= -mcpu= etc
 
diff --git a/config/x86_32.mk b/config/x86_32.mk
index 7f76b25..da3111d 100644
--- a/config/x86_32.mk
+++ b/config/x86_32.mk
@@ -2,6 +2,9 @@ CONFIG_X86 := y
 CONFIG_X86_32 := y
 CONFIG_X86_$(XEN_OS) := y
 
+CONFIG_QEMU_ARCH := i386
+CONFIG_QEMU_TARGET := i386-softmmu
+
 CONFIG_HVM := y
 CONFIG_MIGRATE := y
 CONFIG_XCUTILS := y
diff --git a/config/x86_64.mk b/config/x86_64.mk
index 11104bd..f59e36d 100644
--- a/config/x86_64.mk
+++ b/config/x86_64.mk
@@ -2,6 +2,9 @@ CONFIG_X86 := y
 CONFIG_X86_64 := y
 CONFIG_X86_$(XEN_OS) := y
 
+CONFIG_QEMU_ARCH := x86_64
+CONFIG_QEMU_ARCH := i386-softmmu
+
 CONFIG_COMPAT := y
 CONFIG_HVM := y
 CONFIG_MIGRATE := y
diff --git a/tools/Makefile b/tools/Makefile
index 00c69ee..250b931 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -107,7 +107,7 @@ distclean: subdirs-distclean
 		config.cache autom4te.cache
 
 ifneq ($(XEN_COMPILE_ARCH),$(XEN_TARGET_ARCH))
-IOEMU_CONFIGURE_CROSS ?= --cpu=$(XEN_TARGET_ARCH) \
+IOEMU_CONFIGURE_CROSS ?= --cpu=$(CONFIG_QEMU_ARCH) \
 			 --cross-prefix=$(CROSS_COMPILE) \
 			 --interp-prefix=$(CROSS_SYS_ROOT)
 endif
@@ -186,8 +186,7 @@ subdir-all-qemu-xen-dir: qemu-xen-dir-find
 		source=.; \
 	fi; \
 	cd qemu-xen-dir; \
-	$$source/configure --enable-xen --target-list=i386-softmmu \
-		$(QEMU_XEN_ENABLE_DEBUG) \
+	$$source/configure --enable-xen --target-list=$(CONFIG_QEMU_TARGET) \
 		--prefix=$(PREFIX) \
 		--source-path=$$source \
 		--extra-cflags="-I$(XEN_ROOT)/tools/include \
diff --git a/tools/hotplug/Linux/init.d/xencommons b/tools/hotplug/Linux/init.d/xencommons
index 4ebd636..f568085 100644
--- a/tools/hotplug/Linux/init.d/xencommons
+++ b/tools/hotplug/Linux/init.d/xencommons
@@ -116,11 +116,7 @@ do_start () {
 	echo Starting xenconsoled...
 	test -z "$XENCONSOLED_TRACE" || XENCONSOLED_ARGS=" --log=$XENCONSOLED_TRACE"
 	${SBINDIR}/xenconsoled --pid-file=$XENCONSOLED_PIDFILE $XENCONSOLED_ARGS
-	echo Starting QEMU as disk backend for dom0
-	test -z "$QEMU_XEN" && QEMU_XEN="${LIBEXEC}/qemu-system-i386"
-	$QEMU_XEN -xen-domid 0 -xen-attach -name dom0 -nographic -M xenpv -daemonize \
-		-monitor /dev/null -serial /dev/null -parallel /dev/null \
-		-pidfile $QEMU_PIDFILE
+	xl launch-dom0-qemu $QEMU_XEN
 }
 do_stop () {
         echo Stopping xenconsoled
diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index d8495bb..d8b6a5c 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -27,6 +27,7 @@ CFLAGS_LIBXL += $(CFLAGS_libxenguest)
 CFLAGS_LIBXL += $(CFLAGS_libxenstore)
 CFLAGS_LIBXL += $(CFLAGS_libblktapctl) 
 CFLAGS_LIBXL += -Wshadow
+CFLAGS_LIBXL += -DCONFIG_QEMU_ARCH=\"$(CONFIG_QEMU_ARCH)\"
 
 LIBXL_LIBS-$(CONFIG_ARM) += -lfdt
 
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 12d6c31..0287a35 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -605,6 +605,9 @@ int libxl_ctx_alloc(libxl_ctx **pctx, int version,
                     xentoollog_logger *lg);
 int libxl_ctx_free(libxl_ctx *ctx /* 0 is OK */);
 
+/* exec's device model for dom0 and does not return. */
+void libxl_launch_dom0_qemu(libxl_ctx *ctx, const char *qemu_path, const char *pidfile);
+
 /* domain related functions */
 
 int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index f6f7bbd..f0b13a9 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -38,7 +38,7 @@ static const char *qemu_xen_path(libxl__gc *gc)
 #ifdef QEMU_XEN_PATH
     return QEMU_XEN_PATH;
 #else
-    return libxl__abs_path(gc, "qemu-system-i386", libxl__libexec_path());
+    return libxl__abs_path(gc, "qemu-system-" CONFIG_QEMU_ARCH, libxl__libexec_path());
 #endif
 }
 
@@ -1560,6 +1560,37 @@ out:
     return ret;
 }
 
+void libxl_launch_dom0_qemu(libxl_ctx *ctx, const char *qemu_path, const char *pidfile)
+{
+    GC_INIT(ctx);
+
+    flexarray_t *dm_args = flexarray_make(gc, 16, 1);
+
+    if (qemu_path == NULL)
+        qemu_path = qemu_xen_path(gc);
+
+    flexarray_vappend(dm_args,
+                      "-xen-domid", "0",
+                      "-xen-attach",
+                      "-name", "dom0",
+                      "-nographic",
+                      "-M" "xenpv",
+                      "-daemonize",
+                      "-monitor", "/dev/null",
+                      "-serial", "/dev/null",
+                      "-parallel", "/dev/null",
+                      NULL);
+    if (pidfile)
+        flexarray_append_pair(dm_args, "-pidfile", libxl__strdup(gc, pidfile));
+
+    libxl__exec(gc, -1, -1, -1,
+                qemu_path, (char **)flexarray_contents(dm_args), NULL);
+
+    /* Shouldn't return */
+    LOG(CRITICAL, "Failed to exec dom0 device model");
+    GC_FREE;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/xl.h b/tools/libxl/xl.h
index c876a33..1d7602c 100644
--- a/tools/libxl/xl.h
+++ b/tools/libxl/xl.h
@@ -106,6 +106,7 @@ int main_setenforce(int argc, char **argv);
 int main_loadpolicy(int argc, char **argv);
 int main_remus(int argc, char **argv);
 int main_devd(int argc, char **argv);
+int main_launch_dom0_qemu(int argc, char **argv);
 
 void help(const char *command);
 
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index bd26bcc..108dfac 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -7313,6 +7313,31 @@ out:
     return ret;
 }
 
+int main_launch_dom0_qemu(int argc, char **argv)
+{
+    int opt;
+    const char *qemu = NULL;
+    const char *pidfile = NULL;
+
+    SWITCH_FOREACH_OPT(opt, "p:", NULL, "launch-dom-qemu", 0) {
+    case 'p':
+        pidfile = optarg;
+        break;
+        /* No options */
+    }
+    if (optind < argc)
+        qemu = argv[optind];
+
+    fprintf(stderr, "argc %d\n", argc);
+    fprintf(stderr, "qemu = %s", qemu ? : "<default>");
+
+    fprintf(stdout, "Starting QEMU as disk backend for dom0");
+
+    libxl_launch_dom0_qemu(ctx, qemu, pidfile);
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/xl_cmdtable.c b/tools/libxl/xl_cmdtable.c
index ebe0220..ab4d56c 100644
--- a/tools/libxl/xl_cmdtable.c
+++ b/tools/libxl/xl_cmdtable.c
@@ -494,6 +494,12 @@ struct cmd_spec cmd_table[] = {
       "[options]",
       "-F                      Run in the foreground",
     },
+    { "launch-dom0-qemu",
+      &main_launch_dom0_qemu, 0, 1,
+      "Start qemu process to service dom0 disk backends",
+      "[options] [QEMU_PATH]",
+      "-p PIDFILE              Write a PIDFILE\n",
+    },
 };
 
 int cmdtable_len = sizeof(cmd_table)/sizeof(struct cmd_spec);

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 09/10] libxl: introduce libxl_retrieve_domain_configuration
  2014-07-17 14:28         ` Wei Liu
@ 2014-07-18  8:52           ` Ian Campbell
  2014-07-18 11:17             ` Wei Liu
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-18  8:52 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Thu, 2014-07-17 at 15:28 +0100, Wei Liu wrote:
> On Thu, Jul 17, 2014 at 03:02:11PM +0100, Ian Campbell wrote:
> > On Thu, 2014-07-17 at 13:11 +0100, Wei Liu wrote:
> > > On Thu, Jul 17, 2014 at 11:59:01AM +0100, Ian Campbell wrote:
> > > > On Thu, 2014-07-10 at 15:32 +0100, Wei Liu wrote:
> > > > > Introduce a new public API to return domain configuration. This returned
> > > > > configuration can be used to rebuild a domain.
> > > > > 
> > > > > Note that this configuration doesn't equal to the current state of the
> > > > > domain. What it does is to use JSON version configuration as template
> > > > > and pull in relevant information from xenstore.
> > > > 
> > > > I think this configuration does equal the current state, doesn't it?
> > > > Isn't that the whole point?
> > > > 
> > > 
> > > Hrm... by "current state" I mean the current running state. But to
> > > rebuild a domain we might leave some configurations for the remote host
> > > toolstack to decide. The configuration this API returns is (template
> > > configuration + current state that we care about).
> > 
> > Hrm, I can't think of a good name for this. Describing it as the state
> > needed to reproduce the guest visible state might make sense?
> 
> That will do for me. But as I'm the one implementing this I can always
> understand the description however vague it is. :-)

;-)

How about:
        Note that this configuration only describes the configuration
        necessary to reproduce the guest visible state and does not
        necessarily include specific decisions made by the toolstack
        regarding its current incarnation (e.g. disk backend) unless
        they were specified by the application when the domain was
        created.

?

> 
> > > 
> > > > > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> > > > > index 49d7ef6..cd5914c 100644
> > > > > --- a/tools/libxl/libxl.c
> > > > > +++ b/tools/libxl/libxl.c
> > > > > @@ -6034,6 +6034,206 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src)
> > > > >      for (i = 0; i < 6; i++)
> > > > >          (*dst)[i] = (*src)[i];
> > > > >  }
> > > > > +
> > > > > +int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
> > > > > +                                        libxl_domain_config *d_config)
> > > > 
> > > > It occurs to me to wonder if any function which takes a lock ought to be
> > > > async capable?
> > > > 
> > > 
> > > Good point.
> > > 
> > > > Also is there is any possibility that any of the operations needed to
> > > > gather the updated configuration might take a long time.
> > > > 
> > > 
> > > I think reading from xenstore and filesystem can be slow, in the sense
> > > that if there's much contention on these two resources it could take
> > > seconds to finish operations.
> > 
> > I'm pretty sure we don't consider those to be "slow". See in
> > libxl_internal.h:
> > 
> >  * "Slow" functions includes any that might block on a guest or an
> >  * external script.  More broadly, it includes any operations which
> >  * are sufficiently slow that an application might reasonably want to
> >  * initiate them, and then carry on doing something else, while the
> >  * operation completes.  That is, a "fast" function must be fast
> >  * enough that we do not mind blocking all other management operations
> >  * on the same host while it completes.
> >  *
> >  * There are certain primitive functions which make a libxl operation
> >  * necessarily "slow" for API reasons.  These are:
> >  *  - awaiting xenstore watches (although read-modify-write xenstore
> >  *    transactions are OK for fast functions)
> >  *  - spawning subprocesses
> >  *  - anything with a timeout
> > 
> 
> So I was thinking about "slow" in general while you were talking about
> "slow" in the context of libxl.
> 
> If we discuss this in the context of libxl, then I think this API is a
> "fast" API that doesn't require to be made async.

OK, great.

> 
> > > > > +    /* Memory limits:
> > > > > +     *
> > > > > +     * Currently there're three memory limits:
> > > > > +     *  1. "target" in xenstore (originally memory= in config file)
> > > > > +     *  2. "static-max" in xenstore (originally maxmem= in config file)
> > > > 
> > > > Nit: strictly speaking those "originally..." are xl specific and this is
> > > > libxl. </pendant>
> > > > 
> > > 
> > > You mean I should write "originally memory= in xl config file"?
> > 
> > That would satisfy my pedantry, yes ;-)
> > 
> > But maybe you want to reference the libxl_domain_config fields instead?
> > 
> 
> I'm fine with this.
> 
> Wei.
> 
> > Ian.
> > 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 09/10] libxl: introduce libxl_retrieve_domain_configuration
  2014-07-18  8:52           ` Ian Campbell
@ 2014-07-18 11:17             ` Wei Liu
  0 siblings, 0 replies; 70+ messages in thread
From: Wei Liu @ 2014-07-18 11:17 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Fri, Jul 18, 2014 at 09:52:28AM +0100, Ian Campbell wrote:
[...]
> 
> How about:
>         Note that this configuration only describes the configuration
>         necessary to reproduce the guest visible state and does not
>         necessarily include specific decisions made by the toolstack
>         regarding its current incarnation (e.g. disk backend) unless
>         they were specified by the application when the domain was
>         created.
> 
> ?
> 

Pretty clear I think.

Wei.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device
  2014-07-18  8:49           ` Ian Campbell
@ 2014-07-18 11:22             ` Wei Liu
  2014-07-18 12:20               ` Ian Campbell
  0 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-18 11:22 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Fri, Jul 18, 2014 at 09:49:32AM +0100, Ian Campbell wrote:
> On Thu, 2014-07-17 at 15:13 +0100, Wei Liu wrote:
> > On Thu, Jul 17, 2014 at 12:44:43PM +0100, Ian Campbell wrote:
> > > On Wed, 2014-07-16 at 18:12 +0100, Wei Liu wrote:
> > > > > > As we don't have a JSON config file for libxl toolstack domain
> > > > > > (currently Dom0) we need to skip JSON manipulation for it.
> > > > > 
> > > > > How hard would it be to create a stub/stunt JSON for dom0 when we notice
> > > > > it is missing? Or from e.g. xencommons perhaps?
> > > > > 
> > > > 
> > > > We need to determine:
> > > > 1. when / where to generate such thing (xencommons?)
> > > > 2. what to put in it
> > > > 
> > > > I have yet had answers for #2. The simplest version can be "{}" I think.
> > > > That is an empty configuration that every fields gets the default value.
> > > > But we probably need more than that.
> > > 
> > > I think you would want at least a name and perhaps a uuid? And cinfo
> > > type == PV.
> > > 
> > > Device arrays are all empty at start of day.
> > > 
> > > Some of the stuff about target and maxmem you could perhaps infer at
> > > start of day?
> > > 
> > 
> > UUID (Dom0 doesn't seem to have one), name and memory targets can all be
> > pulled from xenstore when they are required.
> > 
> > And it occurs to me as I discovered Dom0 doesn't have UUID that we need
> > to special-case reading / writing of Dom0's JSON config. That's because
> > all other guests' JSON config are to be named with UUID and domain id.
> > How annoying. :-(
> 
> Indeed.
> 
> I think you could generate one on boot and set it with
> XEN_DOMCTL_setdomainhandle. It might be a bug that we don't do that
> today (that said I can't see any evidence that xend used to do
> differently).
> 

I think this approach is OK.

> > I would like to put as few things as possible in the stub because there
> > doesn't seem to be a way to conveniently generate a valid JSON config
> > for Dom0. Will you be against the idea of having 'xl generate-dom0-json'
> > in xl to do that? Otherwise we have to basically generate a
> > semi-handcoded stub in xencommons, or even a hardcoded stub.
> 
> I'm in favour of some sort of command to "initialise dom0". Either part
> of xl or a new helper app based on libxl.
> 
> I had a similar (unposted I think) patch to add xl launch-dom0-qemu so
> that it could pick the correct arch (see below, warning: it's a bit
> skanky). If I were to do it again today I'd probably make a separate
> $libexec helper instead of bolting it into xl though.
> 
> Probably it should subsume this bit of xencommons too:
> 
>                 echo Setting domain 0 name and domid...
>                 ${BINDIR}/xenstore-write "/local/domain/0/name" "Domain-0"
>                 ${BINDIR}/xenstore-write "/local/domain/0/domid" 0
> 
> What do you think?
> 

Agreed.

I'm think about adding "xl initialise-dom0", which:
1. generates UUID
2. writes relevant xenstore keys
3. generates stub JSON config
4. launches QEMU

I think we always needs first 3 items. But I'm not quite sure about the
4th. It's more flexible to launch a process in xencommons, isn't it?

Wei.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device
  2014-07-18 11:22             ` Wei Liu
@ 2014-07-18 12:20               ` Ian Campbell
  2014-07-18 13:41                 ` Wei Liu
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Campbell @ 2014-07-18 12:20 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Fri, 2014-07-18 at 12:22 +0100, Wei Liu wrote:
> I'm think about adding "xl initialise-dom0", which:
> 1. generates UUID
> 2. writes relevant xenstore keys
> 3. generates stub JSON config
> 4. launches QEMU
> 
> I think we always needs first 3 items. But I'm not quite sure about the
> 4th. It's more flexible to launch a process in xencommons, isn't it?

I suppose so. Perhaps we can simply provide some --no-do-foo which would
allow each of these to be turned off so that an interested admin can
launch it themselves?

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device
  2014-07-18 12:20               ` Ian Campbell
@ 2014-07-18 13:41                 ` Wei Liu
  2014-07-18 13:44                   ` Ian Campbell
  0 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-18 13:41 UTC (permalink / raw)
  To: Ian Campbell; +Cc: ian.jackson, Wei Liu, xen-devel

On Fri, Jul 18, 2014 at 01:20:11PM +0100, Ian Campbell wrote:
> On Fri, 2014-07-18 at 12:22 +0100, Wei Liu wrote:
> > I'm think about adding "xl initialise-dom0", which:
> > 1. generates UUID
> > 2. writes relevant xenstore keys
> > 3. generates stub JSON config
> > 4. launches QEMU
> > 
> > I think we always needs first 3 items. But I'm not quite sure about the
> > 4th. It's more flexible to launch a process in xencommons, isn't it?
> 
> I suppose so. Perhaps we can simply provide some --no-do-foo which would
> allow each of these to be turned off so that an interested admin can
> launch it themselves?
> 

What I had in mind is situation like this: if we are to launch some
other service, we would then need to modify a C program, which is a bit
error prone IMHO. And if we make mistake in parameters in one of our
releases, admin needs to either a) recompile xl or b) use --no-do-foo
then write some runes in xencommon.

For plan a), recompiling a binary is a hassle comparing to modifying a
script. For plan b), the admin ends up modifying the script anyway.

If admin wants to launch new service then he / she ends up editing some
init-scripts as well.

And AIUI we don't have qemu-system-arm for the moment. So picking up the
right arch for QEMU isn't really a requirement now.

So I propose we implement the first 3 items for the moment, and leave
launching QEMU as is in xencommon.

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device
  2014-07-18 13:41                 ` Wei Liu
@ 2014-07-18 13:44                   ` Ian Campbell
  0 siblings, 0 replies; 70+ messages in thread
From: Ian Campbell @ 2014-07-18 13:44 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.jackson, xen-devel

On Fri, 2014-07-18 at 14:41 +0100, Wei Liu wrote:
> On Fri, Jul 18, 2014 at 01:20:11PM +0100, Ian Campbell wrote:
> > On Fri, 2014-07-18 at 12:22 +0100, Wei Liu wrote:
> > > I'm think about adding "xl initialise-dom0", which:
> > > 1. generates UUID
> > > 2. writes relevant xenstore keys
> > > 3. generates stub JSON config
> > > 4. launches QEMU
> > > 
> > > I think we always needs first 3 items. But I'm not quite sure about the
> > > 4th. It's more flexible to launch a process in xencommons, isn't it?
> > 
> > I suppose so. Perhaps we can simply provide some --no-do-foo which would
> > allow each of these to be turned off so that an interested admin can
> > launch it themselves?
> > 
> 
> What I had in mind is situation like this: if we are to launch some
> other service, we would then need to modify a C program, which is a bit
> error prone IMHO. And if we make mistake in parameters in one of our
> releases, admin needs to either a) recompile xl or b) use --no-do-foo
> then write some runes in xencommon.
> 
> For plan a), recompiling a binary is a hassle comparing to modifying a
> script. For plan b), the admin ends up modifying the script anyway.
> 
> If admin wants to launch new service then he / she ends up editing some
> init-scripts as well.
> 
> And AIUI we don't have qemu-system-arm for the moment. So picking up the
> right arch for QEMU isn't really a requirement now.
> 
> So I propose we implement the first 3 items for the moment, and leave
> launching QEMU as is in xencommon.

Sure.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it
  2014-07-10 14:32 ` [PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it Wei Liu
  2014-07-16 16:11   ` Ian Campbell
@ 2014-07-24 18:09   ` Ian Jackson
  2014-07-24 18:29   ` Ian Jackson
  2 siblings, 0 replies; 70+ messages in thread
From: Ian Jackson @ 2014-07-24 18:09 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.campbell, xen-devel

Wei Liu writes ("[PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it"):
> Introduce a new format in libxl userdata store called "libxl-json". This
> file format contains JSON version of libxl_domain_config, generated by
> libxl.
...
> + *  "libxl-json"  libxl_domain_config object in JSON format, generated
> + *                by libxl

Applications should probably be more strongly discouraged from
accessing this directly, at least by wording here in this comment.

> +int libxl__set_domain_configuration(libxl__gc *gc, uint32_t domid,
> +                                    libxl_domain_config *d_config)
> +{
> +    char *d_config_json;
> +    int rc;
> +
> +    d_config_json = libxl_domain_config_to_json(CTX, d_config);
> +    if (!d_config_json) {
> +        LOGE(ERROR, "failed to convert domain configuration to JSON for domain %d",

If you add a newline before the first " this won't need to wrap.


Aside from that,

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 02/10] libxl_internal: functions to lock / unlock domain configuration
  2014-07-10 14:32 ` [PATCH v1 02/10] libxl_internal: functions to lock / unlock domain configuration Wei Liu
  2014-07-16 16:15   ` Ian Campbell
@ 2014-07-24 18:24   ` Ian Jackson
  2014-07-25 10:36     ` Wei Liu
  1 sibling, 1 reply; 70+ messages in thread
From: Ian Jackson @ 2014-07-24 18:24 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.campbell, xen-devel

Wei Liu writes ("[PATCH v1 02/10] libxl_internal: functions to lock / unlock domain configuration"):
> Simple file lock taken from xl to serialise access to "libxl-json" file.
> If a thread cannot get hold of the lock it waits due to F_SETLKW.

Right.

> In order to generate lock file name, rename userdata_path to
> libxl__userdata_path and declare it in libxl_internal.h

I don't mind it in such a small patch but in general it is easier to
review things if non-functional changes like this are split out into a
separate patch.

> +int libxl__lock_domain_configuration(libxl__gc *gc, uint32_t domid,
> +                                     int *fd_lock)
> +{
...
> +    int rc;
> +    struct flock fl;
> +    const char *lockfile;
> +
> +    if (*fd_lock >= 0)
> +        return ERROR_INVAL;

Why not assert() ?

> +    lockfile = libxl__userdata_path(gc, domid, "libxl-json.lock", "d");

Perhaps lockfile = ...(, "libxl-json", "l") ?  I think users of
libxl__userdata_path are entitled to invent their own `wh' values.

Otherwise you have to document "libxl-json.lock" as a reserved
userdata name (which is a bit daft because no-one would use it, but it
is, formally speaking, wrong to use it here).

> +    *fd_lock = open(lockfile, O_WRONLY|O_CREAT, S_IWUSR);
> +    if (*fd_lock < 0) {
> +        LOGE(ERROR, "cannot open lockfile %s errno=%d\n", lockfile, errno);

LOGE's message should not contain \n.

> +    if (fcntl(*fd_lock, F_SETFD, FD_CLOEXEC) < 0) {

What's wrong with libxl_fd_set_cloexec ?

> +        close(*fd_lock);

Please use the idempotent `goto out' error handling style to deal with
closing the fd on error.  Your failure to do so has resulted in
error-case fd leak in this function.

> +get_lock:
> +    fl.l_type = F_WRLCK;
> +    fl.l_whence = SEEK_SET;
> +    fl.l_start = 0;
> +    fl.l_len = 0;
> +    rc = fcntl(*fd_lock, F_SETLKW, &fl);
> +    if (rc < 0 && errno == EINTR)
> +        goto get_lock;

Please, no more of these `goto'-based loops!

> +    if (rc < 0) {
> +        LOGE(ERROR, "cannot acquire lock %s errno=%d\n", lockfile, errno);
> +        rc = ERROR_FAIL;

goto out.

> +    } else
> +        rc = 0;

No, not like that.  Like this:

     rc = 0;
     return rc;

   out:
     if (*fd_lock >= 0) { close(*fd_lock); *fd_lock = -1; }
     return rc;


> +int libxl__unlock_domain_configuration(libxl__gc *gc, uint32_t domid,
> +                                       int *fd_lock)
> +{

Closing the fd is sufficient.  I'm not even sure why you need a whole
function for this; the caller could just call close().  The caller can
can ignore any errors (which I think are impossible anyway) since
after close the fd is gone anyway.

> +/*
> + * Lock / unlock domain configuration in libxl private data store.
> + * fd_lock contains the file descriptor pointing to the lock file.
> + */
> +int libxl__lock_domain_configuration(libxl__gc *gc, uint32_t domid,
> +                                     int *fd_lock);

You need to explain the lifetime semantics of *fd_lock.  Your code
demands that the caller set it to -1 beforehand (which is fine).

> +int libxl__unlock_domain_configuration(libxl__gc *gc, uint32_t domid,
> +                                       int *fd_lock);

If you do retain this as a separate function, it should return void.
I can think of nothing useful that the caller could do with an error
from it.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it
  2014-07-10 14:32 ` [PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it Wei Liu
  2014-07-16 16:11   ` Ian Campbell
  2014-07-24 18:09   ` Ian Jackson
@ 2014-07-24 18:29   ` Ian Jackson
  2014-07-25 10:30     ` Wei Liu
  2 siblings, 1 reply; 70+ messages in thread
From: Ian Jackson @ 2014-07-24 18:29 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.campbell, xen-devel

Wei Liu writes ("[PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it"):
> Introduce a new format in libxl userdata store called "libxl-json". This
> file format contains JSON version of libxl_domain_config, generated by
> libxl.
...
> +int libxl__get_domain_configuration(libxl__gc *gc, uint32_t domid,
> +                                    libxl_domain_config *d_config)
> +{
> +    uint8_t *data = NULL;
> +    int rc, len;
> +
> +    rc = libxl_userdata_retrieve(CTX, domid, "libxl-json", &data, &len);
...
> +    if (len == 0) {
> +        LOGE(ERROR, "configuration data stream empty for domain %d", domid);
> +        rc = ERROR_FAIL;

Reading 03/10 I notice the following issue with this:

During domain setup there is a moment where the domid exists, but the
configuration data doesn't.  Therefore this is a possible situation,
and does not reflect an internal error.

So, you need to treat this case differently.  It shouldn't log an
error and it should probably return a custom error code, which all
callers would have to handle specially.

Ultimately if an application asks for the configuration of a domain in
this state, it is not possible to tell whether it is in the process
being created, or is left in an anomalous (crashed) state.  So I think
this possible but anomalous state needs to be exposed all the way up
the stack.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
  2014-07-10 14:32 ` [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain Wei Liu
  2014-07-16 16:18   ` Ian Campbell
@ 2014-07-24 18:52   ` Ian Jackson
  2014-07-25 10:53     ` Wei Liu
  1 sibling, 1 reply; 70+ messages in thread
From: Ian Jackson @ 2014-07-24 18:52 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.campbell, xen-devel

Wei Liu writes ("[PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain"):
> This patch utilises new "libxl-json" file and stores a copy of user
> supplied domain configuration in JSON form. This copy will be used as
> template.
> 
> There're two save operations in total. The template config is first
> saved right after the domain gets its UUID and domain id. Then after the
> domain creation succeeds, the relevant bits touched by libxl are
> synchronised to the stored config.

I think there is a potential race with domain destruction here:

    Task 1                                 Task 2
    Creating the domain                    Trying to shut down

      actually create domain
                                             observe domid
                                             start domain destruction
                                             delete all userdata
                                             destroy domain
      store the userdata
        *** forbidden state created: userdata exists but domain doesn't
        *** userdata has been leaked
      [ would now bomb out ]

This race actually exists in libxl_userdata_store so it is my fault.
Sorry!

I think this should be fixed as follows:

  * domain destruction should take a lock across deleting the
    userdata and destroying the domain in Xen

  * libxl_userdata_store should take take the lock,
    check domain exists, store userdata, unlock lock.

This would be a separate lock to the one you introduce, I think.
Your lock covers a lot of xenstore processing.  But perhaps we can
reuse some of the fcntl and lockfile massaging stuff.


> +    /* At this point we've got domid and UUID, store configuration */
> +    ret = libxl__set_domain_configuration(gc, domid, &d_config_saved);

This libxl__set_domain_configuration is the only thing which creates
the libxl-json userdata ?  If so I think there is no 

> +    libxl_domain_config_dispose(&d_config_saved);
> +    if (ret) {

Please use the idempotent error-handling style.  This dispose needs to
be in the `error_out' section.  You will need to call _init at the top
of the function.  I think there's a separate success exit path in this
function so I think you do need two calls to dispose.

> +static void update_domain_config(libxl__gc *gc,
> +                                 libxl_domain_config *dst,
> +                                 const libxl_domain_config *src)
> +{

I think it would be a good idea to move this domain configuration
update stuff into a file of its own rather than interleaving it with
the (supposedly chronologically-ordered) domain creation logic.

> +    /* update network interface information */
> +    int i;
> +
> +    for (i = 0; i < src->num_nics; i++)
> +        libxl__update_config_nic(gc, &dst->nics[i], &src->nics[i]);

Does creating the config early, and then updating it, not mean that
there will be a window during domain creation when configuration
exists but lacks these updates ?

I think that might be a (theoretically application-visible) bug.

> +/* update the saved domain configuration with a callback function,
> + * which is responsible to pull in useful fields from src.
> + */
> +typedef void (update_callback)(libxl__gc *, libxl_domain_config *,
> +                               const libxl_domain_config *);

libxl coding style has no ` ' before the `*'.

> +static int libxl__update_domain_configuration(libxl__gc *gc,
> +                                              uint32_t domid,
> +                                              update_callback callback,
> +                                              const libxl_domain_config *src)
...
> +    libxl_domain_config_init(&d_config_saved);
> +
> +    rc = libxl__get_domain_configuration(gc, domid, &d_config_saved);
> +
> +    if (rc) {

Spurious blank line.

> +        LOG(ERROR, "cannot get domain configuration: %d", rc);

Doesn't libxl__get_domain_configuration log a message already ?  I
think so, in which case it's probably not ideal to log again.

To help with (a) not spewing lots of messages for a single error and
(b) writing code uncluttered by unnecessary logging calls, it can be
helpful to mention in the doc comments for functions whether they log
failures.

If they do you can replace this whole block with a one-line
       if (rc) goto out;

> +    callback(gc, &d_config_saved, src);

Callback should probably have the ability to fail.

> +    rc = libxl__set_domain_configuration(gc, domid, &d_config_saved);
> +
> +    if (rc)
> +        LOG(ERROR, "cannot set domain configuration: %d", rc);

1. Spurious blank line.  2. See above re logging.   3. This falling
through into the end of the function is unpleasant - please make
things regular.

The end of most functions should have something like:
       rc = 0;
    out:

> +    libxl_domain_config_dispose(&d_config_saved);
> +
> +out:
> +    return rc;

This seems to leak d_config_saved on error paths.

> @@ -1354,6 +1424,12 @@ static void domcreate_complete(libxl__egc *egc,
>      if (!rc && d_config->b_info.exec_ssidref)
>          rc = xc_flask_relabel_domain(CTX->xch, dcs->guest_domid, d_config->b_info.exec_ssidref);
>  
> +    if (!rc) {
> +        rc = libxl__update_domain_configuration(gc, dcs->guest_domid,
> +                                                update_domain_config,
> +                                                d_config);
> +    }

Oh dear.  I see this function already has a mad error handling style
which you are following.  I won't ask you to fix it.

> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index e9d93ac..72d21ad 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -3240,6 +3240,20 @@ int libxl__lock_domain_configuration(libxl__gc *gc, uint32_t domid,
>  int libxl__unlock_domain_configuration(libxl__gc *gc, uint32_t domid,
>                                         int *fd_lock);
>  
> +static inline void libxl__update_config_nic(libxl__gc *gc,
> +                                            libxl_device_nic *dst,
> +                                            libxl_device_nic *src)
> +{
> +    libxl_mac_copy(CTX, &dst->mac, &src->mac);
> +}

Nice that this is so short...

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it
  2014-07-24 18:29   ` Ian Jackson
@ 2014-07-25 10:30     ` Wei Liu
  2014-07-25 14:51       ` Ian Jackson
  0 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-25 10:30 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, ian.campbell, xen-devel

On Thu, Jul 24, 2014 at 07:29:58PM +0100, Ian Jackson wrote:
> Wei Liu writes ("[PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it"):
> > Introduce a new format in libxl userdata store called "libxl-json". This
> > file format contains JSON version of libxl_domain_config, generated by
> > libxl.
> ...
> > +int libxl__get_domain_configuration(libxl__gc *gc, uint32_t domid,
> > +                                    libxl_domain_config *d_config)
> > +{
> > +    uint8_t *data = NULL;
> > +    int rc, len;
> > +
> > +    rc = libxl_userdata_retrieve(CTX, domid, "libxl-json", &data, &len);
> ...
> > +    if (len == 0) {
> > +        LOGE(ERROR, "configuration data stream empty for domain %d", domid);
> > +        rc = ERROR_FAIL;
> 
> Reading 03/10 I notice the following issue with this:
> 
> During domain setup there is a moment where the domid exists, but the
> configuration data doesn't.  Therefore this is a possible situation,
> and does not reflect an internal error.
> 
> So, you need to treat this case differently.  It shouldn't log an
> error and it should probably return a custom error code, which all
> callers would have to handle specially.
> 
> Ultimately if an application asks for the configuration of a domain in
> this state, it is not possible to tell whether it is in the process
> being created, or is left in an anomalous (crashed) state.  So I think
> this possible but anomalous state needs to be exposed all the way up
> the stack.
> 

No problem. I can return ERROR_JSON_CONFIG_EMPTY.

Does this patch still have your ack provided I make that change?

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 02/10] libxl_internal: functions to lock / unlock domain configuration
  2014-07-24 18:24   ` Ian Jackson
@ 2014-07-25 10:36     ` Wei Liu
  0 siblings, 0 replies; 70+ messages in thread
From: Wei Liu @ 2014-07-25 10:36 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, ian.campbell, xen-devel

On Thu, Jul 24, 2014 at 07:24:10PM +0100, Ian Jackson wrote:
> Wei Liu writes ("[PATCH v1 02/10] libxl_internal: functions to lock / unlock domain configuration"):
> > Simple file lock taken from xl to serialise access to "libxl-json" file.
> > If a thread cannot get hold of the lock it waits due to F_SETLKW.
> 
> Right.
> 
> > In order to generate lock file name, rename userdata_path to
> > libxl__userdata_path and declare it in libxl_internal.h
> 
> I don't mind it in such a small patch but in general it is easier to
> review things if non-functional changes like this are split out into a
> separate patch.
> 

Ack.

> > +int libxl__lock_domain_configuration(libxl__gc *gc, uint32_t domid,
> > +                                     int *fd_lock)
> > +{
> ...
> > +    int rc;
> > +    struct flock fl;
> > +    const char *lockfile;
> > +
> > +    if (*fd_lock >= 0)
> > +        return ERROR_INVAL;
> 
> Why not assert() ?
> 

Ack.

> > +    lockfile = libxl__userdata_path(gc, domid, "libxl-json.lock", "d");
> 
> Perhaps lockfile = ...(, "libxl-json", "l") ?  I think users of
> libxl__userdata_path are entitled to invent their own `wh' values.
> 

Good to know.

> Otherwise you have to document "libxl-json.lock" as a reserved
> userdata name (which is a bit daft because no-one would use it, but it
> is, formally speaking, wrong to use it here).
> 

I think I will go with the "l" approach.

> > +    *fd_lock = open(lockfile, O_WRONLY|O_CREAT, S_IWUSR);
> > +    if (*fd_lock < 0) {
> > +        LOGE(ERROR, "cannot open lockfile %s errno=%d\n", lockfile, errno);
> 
> LOGE's message should not contain \n.
> 

Fixed.

> > +    if (fcntl(*fd_lock, F_SETFD, FD_CLOEXEC) < 0) {
> 
> What's wrong with libxl_fd_set_cloexec ?
> 

Will use this one.

This implementation was basically a copy of the one in xl, I didn't
check if there's some helper function to do that.

> > +        close(*fd_lock);
> 
> Please use the idempotent `goto out' error handling style to deal with
> closing the fd on error.  Your failure to do so has resulted in
> error-case fd leak in this function.
> 

Ack.

> > +get_lock:
> > +    fl.l_type = F_WRLCK;
> > +    fl.l_whence = SEEK_SET;
> > +    fl.l_start = 0;
> > +    fl.l_len = 0;
> > +    rc = fcntl(*fd_lock, F_SETLKW, &fl);
> > +    if (rc < 0 && errno == EINTR)
> > +        goto get_lock;
> 
> Please, no more of these `goto'-based loops!
> 

Ack.

> > +    if (rc < 0) {
> > +        LOGE(ERROR, "cannot acquire lock %s errno=%d\n", lockfile, errno);
> > +        rc = ERROR_FAIL;
> 
> goto out.
> 
> > +    } else
> > +        rc = 0;
> 
> No, not like that.  Like this:
> 
>      rc = 0;
>      return rc;
> 
>    out:
>      if (*fd_lock >= 0) { close(*fd_lock); *fd_lock = -1; }
>      return rc;
> 
> 

Will fix this.

> > +int libxl__unlock_domain_configuration(libxl__gc *gc, uint32_t domid,
> > +                                       int *fd_lock)
> > +{
> 
> Closing the fd is sufficient.  I'm not even sure why you need a whole
> function for this; the caller could just call close().  The caller can
> can ignore any errors (which I think are impossible anyway) since
> after close the fd is gone anyway.
> 
> > +/*
> > + * Lock / unlock domain configuration in libxl private data store.
> > + * fd_lock contains the file descriptor pointing to the lock file.
> > + */
> > +int libxl__lock_domain_configuration(libxl__gc *gc, uint32_t domid,
> > +                                     int *fd_lock);
> 
> You need to explain the lifetime semantics of *fd_lock.  Your code
> demands that the caller set it to -1 beforehand (which is fine).
> 

Ack.

> > +int libxl__unlock_domain_configuration(libxl__gc *gc, uint32_t domid,
> > +                                       int *fd_lock);
> 
> If you do retain this as a separate function, it should return void.
> I can think of nothing useful that the caller could do with an error
> from it.
> 

I will retain this as a separate function so that it looks symmetric to
__lock.

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
  2014-07-24 18:52   ` Ian Jackson
@ 2014-07-25 10:53     ` Wei Liu
  2014-07-25 15:01       ` Ian Jackson
  0 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-25 10:53 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, ian.campbell, xen-devel

On Thu, Jul 24, 2014 at 07:52:46PM +0100, Ian Jackson wrote:
> Wei Liu writes ("[PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain"):
> > This patch utilises new "libxl-json" file and stores a copy of user
> > supplied domain configuration in JSON form. This copy will be used as
> > template.
> > 
> > There're two save operations in total. The template config is first
> > saved right after the domain gets its UUID and domain id. Then after the
> > domain creation succeeds, the relevant bits touched by libxl are
> > synchronised to the stored config.
> 
> I think there is a potential race with domain destruction here:
> 
>     Task 1                                 Task 2
>     Creating the domain                    Trying to shut down
> 
>       actually create domain
>                                              observe domid
>                                              start domain destruction
>                                              delete all userdata
>                                              destroy domain
>       store the userdata
>         *** forbidden state created: userdata exists but domain doesn't
>         *** userdata has been leaked
>       [ would now bomb out ]
> 
> This race actually exists in libxl_userdata_store so it is my fault.
> Sorry!
> 

No worries. I can fix this.

> I think this should be fixed as follows:
> 
>   * domain destruction should take a lock across deleting the
>     userdata and destroying the domain in Xen
> 
>   * libxl_userdata_store should take take the lock,
>     check domain exists, store userdata, unlock lock.
> 
> This would be a separate lock to the one you introduce, I think.
> Your lock covers a lot of xenstore processing.  But perhaps we can
> reuse some of the fcntl and lockfile massaging stuff.
> 

Why do we need a second lock? The one I introduced in previous patch is
used to serialise access to user data. Isn't that just what we want in
this case?

> 
> > +    /* At this point we've got domid and UUID, store configuration */
> > +    ret = libxl__set_domain_configuration(gc, domid, &d_config_saved);
> 
> This libxl__set_domain_configuration is the only thing which creates
> the libxl-json userdata ?  If so I think there is no 
> 

-EPARSE. :-(

> > +    libxl_domain_config_dispose(&d_config_saved);
> > +    if (ret) {
> 
> Please use the idempotent error-handling style.  This dispose needs to
> be in the `error_out' section.  You will need to call _init at the top
> of the function.  I think there's a separate success exit path in this
> function so I think you do need two calls to dispose.
> 

Will fix this.

> > +static void update_domain_config(libxl__gc *gc,
> > +                                 libxl_domain_config *dst,
> > +                                 const libxl_domain_config *src)
> > +{
> 
> I think it would be a good idea to move this domain configuration
> update stuff into a file of its own rather than interleaving it with
> the (supposedly chronologically-ordered) domain creation logic.
> 

No problem.

> > +    /* update network interface information */
> > +    int i;
> > +
> > +    for (i = 0; i < src->num_nics; i++)
> > +        libxl__update_config_nic(gc, &dst->nics[i], &src->nics[i]);
> 
> Does creating the config early, and then updating it, not mean that
> there will be a window during domain creation when configuration
> exists but lacks these updates ?
> 
> I think that might be a (theoretically application-visible) bug.
> 

Yes, there's such windows.

Ian C suggested I add lock during creation, which I think should fix
this problem.

> > +/* update the saved domain configuration with a callback function,
> > + * which is responsible to pull in useful fields from src.
> > + */
> > +typedef void (update_callback)(libxl__gc *, libxl_domain_config *,
> > +                               const libxl_domain_config *);
> 
> libxl coding style has no ` ' before the `*'.
> 

Ack.

> > +static int libxl__update_domain_configuration(libxl__gc *gc,
> > +                                              uint32_t domid,
> > +                                              update_callback callback,
> > +                                              const libxl_domain_config *src)
> ...
> > +    libxl_domain_config_init(&d_config_saved);
> > +
> > +    rc = libxl__get_domain_configuration(gc, domid, &d_config_saved);
> > +
> > +    if (rc) {
> 
> Spurious blank line.
> 

Fixed.

> > +        LOG(ERROR, "cannot get domain configuration: %d", rc);
> 
> Doesn't libxl__get_domain_configuration log a message already ?  I
> think so, in which case it's probably not ideal to log again.
> 

Yes. I agree.

> To help with (a) not spewing lots of messages for a single error and
> (b) writing code uncluttered by unnecessary logging calls, it can be
> helpful to mention in the doc comments for functions whether they log
> failures.
> 
> If they do you can replace this whole block with a one-line
>        if (rc) goto out;
> 
> > +    callback(gc, &d_config_saved, src);
> 
> Callback should probably have the ability to fail.
> 

I guess you mean I should write

   if (callback(XXX))

This of course can be done.

> > +    rc = libxl__set_domain_configuration(gc, domid, &d_config_saved);
> > +
> > +    if (rc)
> > +        LOG(ERROR, "cannot set domain configuration: %d", rc);
> 
> 1. Spurious blank line.  2. See above re logging.   3. This falling
> through into the end of the function is unpleasant - please make
> things regular.
> 
> The end of most functions should have something like:
>        rc = 0;
>     out:
> 
> > +    libxl_domain_config_dispose(&d_config_saved);
> > +
> > +out:
> > +    return rc;
> 
> This seems to leak d_config_saved on error paths.
> 

Will fix this.

> > @@ -1354,6 +1424,12 @@ static void domcreate_complete(libxl__egc *egc,
> >      if (!rc && d_config->b_info.exec_ssidref)
> >          rc = xc_flask_relabel_domain(CTX->xch, dcs->guest_domid, d_config->b_info.exec_ssidref);
> >  
> > +    if (!rc) {
> > +        rc = libxl__update_domain_configuration(gc, dcs->guest_domid,
> > +                                                update_domain_config,
> > +                                                d_config);
> > +    }
> 
> Oh dear.  I see this function already has a mad error handling style
> which you are following.  I won't ask you to fix it.
> 

Luckily it's very short at the moment.

Wei.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it
  2014-07-25 10:30     ` Wei Liu
@ 2014-07-25 14:51       ` Ian Jackson
  0 siblings, 0 replies; 70+ messages in thread
From: Ian Jackson @ 2014-07-25 14:51 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.campbell, xen-devel

Wei Liu writes ("Re: [PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it"):
> On Thu, Jul 24, 2014 at 07:29:58PM +0100, Ian Jackson wrote:
> > Reading 03/10 I notice the following issue with this:
...
> No problem. I can return ERROR_JSON_CONFIG_EMPTY.
> 
> Does this patch still have your ack provided I make that change?

Yes, thanks.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
  2014-07-25 10:53     ` Wei Liu
@ 2014-07-25 15:01       ` Ian Jackson
  2014-07-25 15:43         ` Wei Liu
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Jackson @ 2014-07-25 15:01 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.campbell, xen-devel

Wei Liu writes ("Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain"):
> On Thu, Jul 24, 2014 at 07:52:46PM +0100, Ian Jackson wrote:
> > This race actually exists in libxl_userdata_store so it is my fault.
> > Sorry!
> 
> No worries. I can fix this.

Thanks.

> > I think this should be fixed as follows:
> > 
> >   * domain destruction should take a lock across deleting the
> >     userdata and destroying the domain in Xen
> > 
> >   * libxl_userdata_store should take take the lock,
> >     check domain exists, store userdata, unlock lock.
> > 
> > This would be a separate lock to the one you introduce, I think.
> > Your lock covers a lot of xenstore processing.  But perhaps we can
> > reuse some of the fcntl and lockfile massaging stuff.
> > 
> 
> Why do we need a second lock? The one I introduced in previous patch is
> used to serialise access to user data. Isn't that just what we want in
> this case?

I think you could use the same lock.  I suggested otherwise because
the xenstore manipulations might hold the lock for a bit longer than
ideal but today that doesn't seem so important.

(Sorry if I seem rather vague.  I have a head cold at the moment and
am quite confused.)

> > > +    /* At this point we've got domid and UUID, store configuration */
> > > +    ret = libxl__set_domain_configuration(gc, domid, &d_config_saved);
> > 
> > This libxl__set_domain_configuration is the only thing which creates
> > the libxl-json userdata ?  If so I think there is no 
> 
> -EPARSE. :-(

You can forget about that comment.  I was writing about the race I
mentioned further up, and forgot to come back and delete this after I
had added the other text to my mail.

> > > +    /* update network interface information */
...
> > Does creating the config early, and then updating it, not mean that
> > there will be a window during domain creation when configuration
> > exists but lacks these updates ?
> > 
> > I think that might be a (theoretically application-visible) bug.
> 
> Yes, there's such windows.
> 
> Ian C suggested I add lock during creation, which I think should fix
> this problem.

I don't think that's a particularly good idea.  Such a lock would
presumably cover most of the domain creation work (hotplug scripts
etc) which might take a long time.

During that time various operations would block unreasonably.

Also, I think you mustn't use an fcntl lock across ao operation
callback chains.  fcntl locks do not exclude other threads in the same
process.

> > If they do you can replace this whole block with a one-line
> >        if (rc) goto out;
> > 
> > > +    callback(gc, &d_config_saved, src);
> > 
> > Callback should probably have the ability to fail.
> 
> I guess you mean I should write
>    if (callback(XXX))
> 
> This of course can be done.

I was thinking more

     rc = callback(...);
     if (rc) goto out

:-).

(Unless you are sure that none of these callbacks will ever want to
fail...)

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
  2014-07-25 15:01       ` Ian Jackson
@ 2014-07-25 15:43         ` Wei Liu
  2014-07-25 17:14           ` Ian Jackson
  0 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-25 15:43 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, ian.campbell, xen-devel

On Fri, Jul 25, 2014 at 04:01:06PM +0100, Ian Jackson wrote:
[...]
> > > > +    /* update network interface information */
> ...
> > > Does creating the config early, and then updating it, not mean that
> > > there will be a window during domain creation when configuration
> > > exists but lacks these updates ?
> > > 
> > > I think that might be a (theoretically application-visible) bug.
> > 
> > Yes, there's such windows.
> > 
> > Ian C suggested I add lock during creation, which I think should fix
> > this problem.
> 
> I don't think that's a particularly good idea.  Such a lock would
> presumably cover most of the domain creation work (hotplug scripts
> etc) which might take a long time.
> 
> During that time various operations would block unreasonably.
> 

That's a bit unfortunate. But what can we do to close that window?
I think no good will come out if I try to fiddle with domain state
during creation anyway, so locking like this may be acceptable?

> Also, I think you mustn't use an fcntl lock across ao operation
> callback chains.  fcntl locks do not exclude other threads in the same
> process.
> 

Then we need both mutex and file lock? Mutex to protect against threads
in the same process while file lock protect against other processes.

Wei.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device
  2014-07-10 14:32 ` [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device Wei Liu
  2014-07-16 16:48   ` Ian Campbell
@ 2014-07-25 16:06   ` Ian Jackson
  2014-07-25 16:40     ` Wei Liu
  1 sibling, 1 reply; 70+ messages in thread
From: Ian Jackson @ 2014-07-25 16:06 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.campbell, xen-devel

Wei Liu writes ("[PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device"):
> We update JSON version first, then write to xenstore, so that we
> maintain the following invariant: any device which is present in
> xenstore has a corresponding entry in JSON.

Thanks.  As I said I'm rather thick-headed today so please forgive me
if what I'm saying makes no sense.  But here goes:

> @@ -2193,21 +2217,56 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
>          goto out;
>      }
>  
> +    libxl_device_disk_copy(CTX, &disk_saved, disk);
> +
> +    LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc);
> +    if (rc) goto out;
> +
>      for (;;) {
>          rc = libxl__xs_transaction_start(gc, &t);
> -        if (rc) goto out;
> +        if (rc) {
> +            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> +            goto out;
> +        }
> +
> +        /* This is a loop, if disk has been added to JSON before, we
> +         * need to remove it, then add it later, because disk->vdev
> +         * might change.
> +         */
> +        DEVICE_REMOVE_JSON(disk, disks, num_disks, domid, &disk_saved,
> +                           COMPARE_DISK, rc);

I don't think this is right.  At this point it may be the case that
the disk is already attached.

I see in libxl__device_generic_add that we seem to unconditionally
remove the device if we do an add of a device that's already there.  I
think this is probably wrong.

But whether it is, or not, it is not correct to remove the entry from
the json while it might be in xenstore.  I think you may need to do a
xenstore read here to check and take appropriate action if it's there.

> +        if (rc) {
> +            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> +            goto out;
> +        }
>  
>          if (get_vdev) {
>              assert(get_vdev_user);
>              disk->vdev = get_vdev(gc, get_vdev_user, t);
>              if (disk->vdev == NULL) {
>                  rc = ERROR_FAIL;
> +                UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
>                  goto out;
>              }
>          }
>  
> +        if (strcmp(disk_saved.vdev, disk->vdev)) {
> +            free(disk_saved.vdev);
> +            disk_saved.vdev = libxl__strdup(NOGC, disk->vdev);
> +        }
> +
> +        DEVICE_ADD_JSON(disk, disks, num_disks, domid, &disk_saved,
> +                        COMPARE_DISK, rc);
> +        if (rc) {
> +            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> +            goto out;
> +        }
> +
>          rc = libxl__device_disk_setdefault(gc, disk);
> -        if (rc) goto out;
> +        if (rc) {
> +            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> +            goto out;
> +        }
>  
>          front = flexarray_make(gc, 16, 1);
>          back = flexarray_make(gc, 16, 1);
> @@ -2217,6 +2276,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
>          if (rc != 0) {
>              LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "Invalid or unsupported"
>                     " virtual disk identifier %s", disk->vdev);
> +            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
>              goto out;
>          }
>  
> @@ -2257,6 +2317,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
>                      if (!dev) {
>                          LOG(ERROR, "failed to get blktap devpath for %p\n",
>                              disk->pdev_path);
> +                        UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
>                          rc = ERROR_FAIL;
>                          goto out;
>                      }
> @@ -2281,6 +2342,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
>              default:
>                  LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "unrecognized disk backend type: %d\n", disk->backend);
>                  rc = ERROR_INVAL;
> +                UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
>                  goto out;
>          }
>  
> @@ -2343,9 +2405,14 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
>  
>          rc = libxl__xs_transaction_commit(gc, &t);
>          if (!rc) break;
> -        if (rc < 0) goto out;
> +        if (rc < 0) {
> +            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> +            goto out;
> +        }
>      }
>  
> +    UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> +
>      aodev->dev = device;
>      aodev->action = LIBXL__DEVICE_ACTION_ADD;
>      libxl__wait_device_connection(egc, aodev);
> @@ -2353,6 +2420,7 @@ static void device_disk_add(libxl__egc *egc, uint32_t domid,
>      rc = 0;
>  
>  out:
> +    libxl_device_disk_dispose(&disk_saved);
>      libxl__xs_transaction_abort(gc, &t);
>      aodev->rc = rc;
>      if (rc) aodev->callback(egc, aodev);
> @@ -3009,6 +3077,11 @@ void libxl__device_nic_add(libxl__egc *egc, uint32_t domid,
>      flexarray_t *back;
>      libxl__device *device;
>      unsigned int rc;
> +    int lock = -1;
> +    libxl_device_nic nic_saved;
> +
> +    libxl_device_nic_init(&nic_saved);
> +    libxl_device_nic_copy(CTX, &nic_saved, nic);
>  
>      rc = libxl__device_nic_setdefault(gc, nic, domid);
>      if (rc) goto out;
> @@ -3023,6 +3096,9 @@ void libxl__device_nic_add(libxl__egc *egc, uint32_t domid,
>          }
>      }
>  
> +    nic_saved.devid = nic->devid;
> +    libxl__update_config_nic(gc, &nic_saved, nic);
> +
>      GCNEW(device);
>      rc = libxl__device_from_nic(gc, domid, nic, device);
>      if ( rc != 0 ) goto out;
> @@ -3079,17 +3155,31 @@ void libxl__device_nic_add(libxl__egc *egc, uint32_t domid,
>      flexarray_append(front, "mac");
>      flexarray_append(front, libxl__sprintf(gc,
>                                      LIBXL_MAC_FMT, LIBXL_MAC_BYTES(nic->mac)));
> +
> +    LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc);
> +    if (rc) goto out;
> +
> +    DEVICE_ADD_JSON(nic, nics, num_nics, domid, nic, COMPARE_DEVID, rc);
> +
> +    if (rc) {
> +        UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> +        goto out;
> +    }
> +
>      libxl__device_generic_add(gc, XBT_NULL, device,
>                                libxl__xs_kvs_of_flexarray(gc, back, back->count),
>                                libxl__xs_kvs_of_flexarray(gc, front, front->count),
>                                NULL);
>  
> +    UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> +
>      aodev->dev = device;
>      aodev->action = LIBXL__DEVICE_ACTION_ADD;
>      libxl__wait_device_connection(egc, aodev);
>  
>      rc = 0;
>  out:
> +    libxl_device_nic_dispose(&nic_saved);
>      aodev->rc = rc;
>      if (rc) aodev->callback(egc, aodev);
>      return;
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 72d21ad..f27a6e98 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -3254,6 +3254,105 @@ static inline void libxl__update_config_vtpm(libxl__gc *gc,
>      libxl_uuid_copy(CTX, &dst->uuid, &src->uuid);
>  }
>  
> +/* Macro used to add the new device to JSON template */
> +#define COMPARE_DEVID(a, b) ((a)->devid == (b)->devid)
> +#define COMPARE_DISK(a, b) (!strcmp((a)->vdev, (b)->vdev))
> +#define COMPARE_PCI(a, b) ((a)->func == (b)->func &&    \
> +                           (a)->bus == (b)->bus &&      \
> +                           (a)->dev == (b)->dev)
> +
> +#define LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc)                 \
> +    do {                                                             \
> +        if (domid != LIBXL_TOOLSTACK_DOMID)                          \
> +            rc = libxl__lock_domain_configuration(gc, domid, &lock); \
> +        else                                                         \
> +            rc = 0;                                                  \
> +    } while (0)

Can this test be perhaps moved into libxl__lock_domain_configuration ?

Even if not, I don't think a macro is warranted here.  You could make
it an inline function which returns rc.

> +#define DEVICE_ADD_JSON(type, ptr, cnt, domid, dev, compare, rc)        \

This needs a good doc comment.  See the doc comments for things like
GCSPRINTF, CONTAINER_OF, CTYPE, etc., for the kind of thing that's
helpful.

> +    do {                                                                \
> +        int x;                                                          \
> +        libxl_domain_config d_config;                                   \
> +        libxl_device_##type *p;                                         \

We have -Wshadow so you need to namespace all of these variable names
:-/.

That is, give them names which are such that surrounding code won't
clash with.  Presumably this is the reason why you have `int x' rather
than `int i' but that's not sufficient because in general the macro's
user might have an `int x'.  I suggest something like
  +        int DAJ_x;                                                    \
  +        libxl_domain_config DAJ_cfg;                                  \
  +        libxl_device_##type *DAJ_dev;                                 \
or the like.

The same goes for the `add_done' label.  It should probably be
`DAJ_out' (since it's a kind of `out').


You can get rid of the rc macro parameter if you:
 - replace the do{ } encapsulation with a gcc block expression ({ })
 - make rc a namespaced local variable eg
       ({              \
           int DAJ_rc; \
 - end the ({ }) expression like this
           DAJ_rc;   \
       )}
This would mean that the macro wouldn't set the caller's rc; it
would instead return it like our other macros.  The user would write
    rc = DEVICE_ADD_JSON(type, ptr, cnt, domid, dev, compare);
    if (rc) goto out;


I considered whether it would be possible to make this macro a
function instead but I think it's quite tricky because it would
involve a lot of casting.  (Eg of a function pointer type.)  I haven't
looked up the exact rules to see whether the resulting code would be
legal C99 but I think it would probably be clumsier so we should
probably stick with the macro.


Thanks,
Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device
  2014-07-25 16:06   ` Ian Jackson
@ 2014-07-25 16:40     ` Wei Liu
  2014-07-25 17:11       ` Ian Jackson
  0 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-25 16:40 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, ian.campbell, xen-devel

On Fri, Jul 25, 2014 at 05:06:11PM +0100, Ian Jackson wrote:
[...]
> > +    libxl_device_disk_copy(CTX, &disk_saved, disk);
> > +
> > +    LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc);
> > +    if (rc) goto out;
> > +
> >      for (;;) {
> >          rc = libxl__xs_transaction_start(gc, &t);
> > -        if (rc) goto out;
> > +        if (rc) {
> > +            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> > +            goto out;
> > +        }
> > +
> > +        /* This is a loop, if disk has been added to JSON before, we
> > +         * need to remove it, then add it later, because disk->vdev
> > +         * might change.
> > +         */
> > +        DEVICE_REMOVE_JSON(disk, disks, num_disks, domid, &disk_saved,
> > +                           COMPARE_DISK, rc);
> 
> I don't think this is right.  At this point it may be the case that
> the disk is already attached.
> 

What do you mean by "attched"? I don't think guest knows anything about
that disk at this point.

> I see in libxl__device_generic_add that we seem to unconditionally
> remove the device if we do an add of a device that's already there.  I
> think this is probably wrong.
> 
> But whether it is, or not, it is not correct to remove the entry from
> the json while it might be in xenstore.  I think you may need to do a
> xenstore read here to check and take appropriate action if it's there.
> 

I'm a bit confused. If we ever come back to the beginning of the loop,
doesn't that mean this xenstore transaction is not committed, so that
any change to xenstore entries is not visible to others? That is,
libxl__device_generic_add is doing the right thing (at least in this
loop), as well as my call to DEVICE_REMOVE_JSON.

> > +        if (rc) {
> > +            UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> > +            goto out;
> > +        }
> >  
> >          if (get_vdev) {
> >              assert(get_vdev_user);
> >              disk->vdev = get_vdev(gc, get_vdev_user, t);
> >              if (disk->vdev == NULL) {
> >                  rc = ERROR_FAIL;
> > +                UNLOCK_DOMAIN_JSON_CONFIG(gc, domid, lock);
> >                  goto out;
> >              }
> >          }
> >  
[...]
> >      return;
> > diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> > index 72d21ad..f27a6e98 100644
> > --- a/tools/libxl/libxl_internal.h
> > +++ b/tools/libxl/libxl_internal.h
> > @@ -3254,6 +3254,105 @@ static inline void libxl__update_config_vtpm(libxl__gc *gc,
> >      libxl_uuid_copy(CTX, &dst->uuid, &src->uuid);
> >  }
> >  
> > +/* Macro used to add the new device to JSON template */
> > +#define COMPARE_DEVID(a, b) ((a)->devid == (b)->devid)
> > +#define COMPARE_DISK(a, b) (!strcmp((a)->vdev, (b)->vdev))
> > +#define COMPARE_PCI(a, b) ((a)->func == (b)->func &&    \
> > +                           (a)->bus == (b)->bus &&      \
> > +                           (a)->dev == (b)->dev)
> > +
> > +#define LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc)                 \
> > +    do {                                                             \
> > +        if (domid != LIBXL_TOOLSTACK_DOMID)                          \
> > +            rc = libxl__lock_domain_configuration(gc, domid, &lock); \
> > +        else                                                         \
> > +            rc = 0;                                                  \
> > +    } while (0)
> 
> Can this test be perhaps moved into libxl__lock_domain_configuration ?
> 

Sure, but Ian C suggested I generate a stub JSON config for Dom0 so that
we can get rid of this special case. I think I will go down that route.

> Even if not, I don't think a macro is warranted here.  You could make
> it an inline function which returns rc.
> 

No problem.

> > +#define DEVICE_ADD_JSON(type, ptr, cnt, domid, dev, compare, rc)        \
> 
> This needs a good doc comment.  See the doc comments for things like
> GCSPRINTF, CONTAINER_OF, CTYPE, etc., for the kind of thing that's
> helpful.
> 

Ack.

> > +    do {                                                                \
> > +        int x;                                                          \
> > +        libxl_domain_config d_config;                                   \
> > +        libxl_device_##type *p;                                         \
> 
> We have -Wshadow so you need to namespace all of these variable names
> :-/.
> 

Yes I know, and build system has not complained so far, but ...

> That is, give them names which are such that surrounding code won't
> clash with.  Presumably this is the reason why you have `int x' rather
> than `int i' but that's not sufficient because in general the macro's
> user might have an `int x'.  I suggest something like
>   +        int DAJ_x;                                                    \
>   +        libxl_domain_config DAJ_cfg;                                  \
>   +        libxl_device_##type *DAJ_dev;                                 \
> or the like.
> 
> The same goes for the `add_done' label.  It should probably be
> `DAJ_out' (since it's a kind of `out').
> 

I think I will use this approach in my next version.

> 
> You can get rid of the rc macro parameter if you:
>  - replace the do{ } encapsulation with a gcc block expression ({ })
>  - make rc a namespaced local variable eg
>        ({              \
>            int DAJ_rc; \
>  - end the ({ }) expression like this
>            DAJ_rc;   \
>        )}
> This would mean that the macro wouldn't set the caller's rc; it
> would instead return it like our other macros.  The user would write
>     rc = DEVICE_ADD_JSON(type, ptr, cnt, domid, dev, compare);
>     if (rc) goto out;
> 

Ack.

> 
> I considered whether it would be possible to make this macro a
> function instead but I think it's quite tricky because it would
> involve a lot of casting.  (Eg of a function pointer type.)  I haven't
> looked up the exact rules to see whether the resulting code would be
> legal C99 but I think it would probably be clumsier so we should
> probably stick with the macro.
> 

Yes I would rather stick with this macro.

Wei.

> 
> Thanks,
> Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device
  2014-07-25 16:40     ` Wei Liu
@ 2014-07-25 17:11       ` Ian Jackson
  2014-07-25 17:19         ` Wei Liu
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Jackson @ 2014-07-25 17:11 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.campbell, xen-devel

Wei Liu writes ("Re: [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device"):
> On Fri, Jul 25, 2014 at 05:06:11PM +0100, Ian Jackson wrote:
> [...]
> > I don't think this is right.  At this point it may be the case that
> > the disk is already attached.
> 
> What do you mean by "attched"? I don't think guest knows anything about
> that disk at this point.

I mean, suppose I do this
     xl block-attach /dev/vm/foo,,xvda
     xl block-attach /dev/vm/bar,,xvda
then your code running during the second attach will remove
foo,,xvda from the json config while it remains in the domain.  That's
wrong.

> I'm a bit confused. If we ever come back to the beginning of the loop,
> doesn't that mean this xenstore transaction is not committed, so that
> any change to xenstore entries is not visible to others? That is,
> libxl__device_generic_add is doing the right thing (at least in this
> loop), as well as my call to DEVICE_REMOVE_JSON.

I meant, a device which existed previously.

> > > +#define LOCK_DOMAIN_JSON_CONFIG(gc, domid, lock, rc)                 \
> > > +    do {                                                             \
> > > +        if (domid != LIBXL_TOOLSTACK_DOMID)                          \
> > > +            rc = libxl__lock_domain_configuration(gc, domid, &lock); \
> > > +        else                                                         \
> > > +            rc = 0;                                                  \
> > > +    } while (0)
> > 
> > Can this test be perhaps moved into libxl__lock_domain_configuration ?
> 
> Sure, but Ian C suggested I generate a stub JSON config for Dom0 so that
> we can get rid of this special case. I think I will go down that route.

That would be even better.

> > I considered whether it would be possible to make this macro a
> > function instead but I think it's quite tricky because it would
> > involve a lot of casting.  (Eg of a function pointer type.)  I haven't
> > looked up the exact rules to see whether the resulting code would be
> > legal C99 but I think it would probably be clumsier so we should
> > probably stick with the macro.
> > 
> 
> Yes I would rather stick with this macro.

Right.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
  2014-07-25 15:43         ` Wei Liu
@ 2014-07-25 17:14           ` Ian Jackson
  2014-07-25 17:34             ` Wei Liu
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Jackson @ 2014-07-25 17:14 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.campbell, xen-devel

Wei Liu writes ("Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain"):
> On Fri, Jul 25, 2014 at 04:01:06PM +0100, Ian Jackson wrote:
> > During that time various operations would block unreasonably.
> 
> That's a bit unfortunate. But what can we do to close that window?
> I think no good will come out if I try to fiddle with domain state
> during creation anyway, so locking like this may be acceptable?

No, because it might block a management daemon.  You're not allowed to
hold a lock which libxl operations will block on while doing anything
`slow' in the ao sense.  Conceivably in a sufficiently complicated
system it might even result in deadlock.

I think the right answer is for those attempts to fiddle with the
domain state to fail.

> > Also, I think you mustn't use an fcntl lock across ao operation
> > callback chains.  fcntl locks do not exclude other threads in the same
> > process.
> 
> Then we need both mutex and file lock? Mutex to protect against threads
> in the same process while file lock protect against other processes.

I think the answer is that you have to not retain the lock for so
long.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device
  2014-07-25 17:11       ` Ian Jackson
@ 2014-07-25 17:19         ` Wei Liu
  0 siblings, 0 replies; 70+ messages in thread
From: Wei Liu @ 2014-07-25 17:19 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, ian.campbell, xen-devel

On Fri, Jul 25, 2014 at 06:11:31PM +0100, Ian Jackson wrote:
> Wei Liu writes ("Re: [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device"):
> > On Fri, Jul 25, 2014 at 05:06:11PM +0100, Ian Jackson wrote:
> > [...]
> > > I don't think this is right.  At this point it may be the case that
> > > the disk is already attached.
> > 
> > What do you mean by "attched"? I don't think guest knows anything about
> > that disk at this point.
> 
> I mean, suppose I do this
>      xl block-attach /dev/vm/foo,,xvda
>      xl block-attach /dev/vm/bar,,xvda
> then your code running during the second attach will remove
> foo,,xvda from the json config while it remains in the domain.  That's
> wrong.
> 

I see, I will do a xenstore read to determine if a device is already
there.

Note to self: also check other devices for this problem.

Wei.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
  2014-07-25 17:14           ` Ian Jackson
@ 2014-07-25 17:34             ` Wei Liu
  2014-07-25 18:31               ` Ian Jackson
  0 siblings, 1 reply; 70+ messages in thread
From: Wei Liu @ 2014-07-25 17:34 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, ian.campbell, xen-devel

On Fri, Jul 25, 2014 at 06:14:05PM +0100, Ian Jackson wrote:
> Wei Liu writes ("Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain"):
> > On Fri, Jul 25, 2014 at 04:01:06PM +0100, Ian Jackson wrote:
> > > During that time various operations would block unreasonably.
> > 
> > That's a bit unfortunate. But what can we do to close that window?
> > I think no good will come out if I try to fiddle with domain state
> > during creation anyway, so locking like this may be acceptable?
> 
> No, because it might block a management daemon.  You're not allowed to
> hold a lock which libxl operations will block on while doing anything
> `slow' in the ao sense.  Conceivably in a sufficiently complicated
> system it might even result in deadlock.
> 
> I think the right answer is for those attempts to fiddle with the
> domain state to fail.
> 

OK, how about we do this.
1. keep a copy of vanilla config in memory instead of writing it to disk
   at the beginning
2. write the most update-to-date version when domain creation finishes

That way, any operations that try to fiddle with domain state,
especially those who require reading / writing JSON config, can fail at
the point when they try to access JSON config file.

However this approach has a downside -- we try to maintain invariant
that every device in xenstore has an entry in JSON config -- this seems
to break that invariant. On the flip side, I think we can loosen this a
bit, because even if we write JSON config at the beginning, the stored
entry doesn't contain device identifier so that we're impossible to
maitain that invariant anyway (that is, there's an entry in JSON but we
cannot associate it with an entry in xenstore).

> > > Also, I think you mustn't use an fcntl lock across ao operation
> > > callback chains.  fcntl locks do not exclude other threads in the same
> > > process.
> > 
> > Then we need both mutex and file lock? Mutex to protect against threads
> > in the same process while file lock protect against other processes.
> 
> I think the answer is that you have to not retain the lock for so
> long.
> 

I don't follow. As you said fcntl lock has no effect on threads, so "not
retain the lock for so long" is irrelevant to protect races among
threads.

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
  2014-07-25 17:34             ` Wei Liu
@ 2014-07-25 18:31               ` Ian Jackson
  2014-07-25 19:47                 ` Wei Liu
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Jackson @ 2014-07-25 18:31 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.campbell, xen-devel

Wei Liu writes ("Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain"):
> OK, how about we do this.
> 1. keep a copy of vanilla config in memory instead of writing it to disk
>    at the beginning
> 2. write the most update-to-date version when domain creation finishes

I think that's correct.

> That way, any operations that try to fiddle with domain state,
> especially those who require reading / writing JSON config, can fail at
> the point when they try to access JSON config file.

Right.

> However this approach has a downside -- we try to maintain invariant
> that every device in xenstore has an entry in JSON config -- this seems
> to break that invariant. On the flip side, I think we can loosen this a
> bit, because even if we write JSON config at the beginning, the stored
> entry doesn't contain device identifier so that we're impossible to
> maitain that invariant anyway (that is, there's an entry in JSON but we
> cannot associate it with an entry in xenstore).

Yes, I think the invariant needs to be relaxed so that we also permit
a domain which has no JSON at all.  Such a domain cannot have devices
added or removed.  It is in the process of being created or destroyed.

> > > Then we need both mutex and file lock? Mutex to protect against threads
> > > in the same process while file lock protect against other processes.
> > 
> > I think the answer is that you have to not retain the lock for so
> > long.
> 
> I don't follow. As you said fcntl lock has no effect on threads, so "not
> retain the lock for so long" is irrelevant to protect races among
> threads.

If the application makes multiple threads which call into libxl to do
different things (things which might need to be serialised), these
different threads will not benefit from any interlocking due to the
fcntl lock.

I was going to say that this doesn't matter so long as all those
threads are all within the scope of a single acquision of the libxl
ctx lock.  But of course that's wrong because the two threads might be
using different ctxs.

In summary I think you should be using flock which doesn't have these
problems, but which instead requires using the carefd machinery to
avoid unrelated processes inheriting the fd and hence the lock.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
  2014-07-25 18:31               ` Ian Jackson
@ 2014-07-25 19:47                 ` Wei Liu
  2014-07-28  9:42                   ` Ian Campbell
  2014-07-28  9:50                   ` Ian Jackson
  0 siblings, 2 replies; 70+ messages in thread
From: Wei Liu @ 2014-07-25 19:47 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Wei Liu, ian.campbell, xen-devel

On Fri, Jul 25, 2014 at 07:31:38PM +0100, Ian Jackson wrote:
> Wei Liu writes ("Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain"):
> > OK, how about we do this.
> > 1. keep a copy of vanilla config in memory instead of writing it to disk
> >    at the beginning
> > 2. write the most update-to-date version when domain creation finishes
> 
> I think that's correct.
> 
> > That way, any operations that try to fiddle with domain state,
> > especially those who require reading / writing JSON config, can fail at
> > the point when they try to access JSON config file.
> 
> Right.
> 
> > However this approach has a downside -- we try to maintain invariant
> > that every device in xenstore has an entry in JSON config -- this seems
> > to break that invariant. On the flip side, I think we can loosen this a
> > bit, because even if we write JSON config at the beginning, the stored
> > entry doesn't contain device identifier so that we're impossible to
> > maitain that invariant anyway (that is, there's an entry in JSON but we
> > cannot associate it with an entry in xenstore).
> 
> Yes, I think the invariant needs to be relaxed so that we also permit
> a domain which has no JSON at all.  Such a domain cannot have devices
> added or removed.  It is in the process of being created or destroyed.
> 

Ack. This sounds plausible.

> > > > Then we need both mutex and file lock? Mutex to protect against threads
> > > > in the same process while file lock protect against other processes.
> > > 
> > > I think the answer is that you have to not retain the lock for so
> > > long.
> > 
> > I don't follow. As you said fcntl lock has no effect on threads, so "not
> > retain the lock for so long" is irrelevant to protect races among
> > threads.
> 
> If the application makes multiple threads which call into libxl to do
> different things (things which might need to be serialised), these
> different threads will not benefit from any interlocking due to the
> fcntl lock.
> 
> I was going to say that this doesn't matter so long as all those
> threads are all within the scope of a single acquision of the libxl
> ctx lock.  But of course that's wrong because the two threads might be
> using different ctxs.
> 
> In summary I think you should be using flock which doesn't have these
> problems, but which instead requires using the carefd machinery to
> avoid unrelated processes inheriting the fd and hence the lock.
> 

I actually looked at flock(2) this afternoon but deemed it is not as
portable as fcntl so I didn't propose it. flock on NFS file probably
won't work (I bet we won't have the chance to put user data store on NFS
though). And somee other UNIX systems might actually implement flock with
fcntl. The latter is more concerning.

If you don't think the above issues are critical I can try to replace
fcntl with flock.

Wei.

> Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
  2014-07-25 19:47                 ` Wei Liu
@ 2014-07-28  9:42                   ` Ian Campbell
  2014-07-28  9:50                   ` Ian Jackson
  1 sibling, 0 replies; 70+ messages in thread
From: Ian Campbell @ 2014-07-28  9:42 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, xen-devel

On Fri, 2014-07-25 at 20:47 +0100, Wei Liu wrote:
> (I bet we won't have the chance to put user data store on NFS
> though)

...shudder...

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain
  2014-07-25 19:47                 ` Wei Liu
  2014-07-28  9:42                   ` Ian Campbell
@ 2014-07-28  9:50                   ` Ian Jackson
  1 sibling, 0 replies; 70+ messages in thread
From: Ian Jackson @ 2014-07-28  9:50 UTC (permalink / raw)
  To: Wei Liu; +Cc: ian.campbell, xen-devel

Wei Liu writes ("Re: [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain"):
> I actually looked at flock(2) this afternoon but deemed it is not as
> portable as fcntl so I didn't propose it. flock on NFS file probably
> won't work (I bet we won't have the chance to put user data store on NFS
> though). And somee other UNIX systems might actually implement flock with
> fcntl. The latter is more concerning.

No-one will put /var/run/xen on NFS.  Or at least, if they do, they
deserve to keep all of the pieces.  (fcntl sometimes doesn't work
properly on NFS either.)

And we only care about dom0 kernel operating systems.  So I think we
can discount the possibility that any of them are god-knows-what
ancient thing that doesn't have a real flock.

> If you don't think the above issues are critical I can try to replace
> fcntl with flock.

Yes, please.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 08/10] libxl: introduce libxl_get_memory_static_max
  2014-07-17 13:59       ` Ian Campbell
@ 2014-07-29 13:39         ` Ian Jackson
  0 siblings, 0 replies; 70+ messages in thread
From: Ian Jackson @ 2014-07-29 13:39 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, xen-devel

Ian Campbell writes ("Re: [PATCH v1 08/10] libxl: introduce libxl_get_memory_static_max"):
> On Thu, 2014-07-17 at 13:02 +0100, Wei Liu wrote:
> > On Thu, Jul 17, 2014 at 11:47:08AM +0100, Ian Campbell wrote:
> > > It looks to me like this could instead be a common helper function, with
> > > a simple boolean parameter.
> > 
> > Ian J likes macro while you likes functions. I'm fine with anything that
> > works. :-)
> 
> I think Ian prefers macros over repetition, but not at the expense of a
> helper function. IOW macros only when the function is impossible. At
> least I hope that's the case!

Yes.  I am indeed allergic to repetition.  Repetition should be
avoided by the use of helper functions if possible; or helper
functions with helper macros; or failing that just macros; or failing
that with code autogeneration run from the build system.

So I agree with Ian C that if you can shrink the proportion of the
code in the macros that's probably a good thing, unless it seems to
break things up so much it makes them more confusing.

Note also that there are complicated rules about pointer casting -
particularly about calling a function through a pointer which has been
cast from one function pointer type to another.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 09/10] libxl: introduce libxl_retrieve_domain_configuration
  2014-07-17 10:59   ` Ian Campbell
  2014-07-17 12:11     ` Wei Liu
@ 2014-07-29 15:29     ` Ian Jackson
  1 sibling, 0 replies; 70+ messages in thread
From: Ian Jackson @ 2014-07-29 15:29 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, xen-devel

Ian Campbell writes ("Re: [PATCH v1 09/10] libxl: introduce libxl_retrieve_domain_configuration"):
> It occurs to me to wonder if any function which takes a lock ought to be
> async capable?

If nothing holds the lock over a slow operation, then this is not
necessary.

This property of the lock ought to be documented.

> Also is there is any possibility that any of the operations needed to
> gather the updated configuration might take a long time.

I think we're hoping not.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v1 09/10] libxl: introduce libxl_retrieve_domain_configuration
  2014-07-17 12:11     ` Wei Liu
  2014-07-17 14:02       ` Ian Campbell
@ 2014-07-29 15:31       ` Ian Jackson
  1 sibling, 0 replies; 70+ messages in thread
From: Ian Jackson @ 2014-07-29 15:31 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Campbell, xen-devel

Wei Liu writes ("Re: [PATCH v1 09/10] libxl: introduce libxl_retrieve_domain_configuration"):
> On Thu, Jul 17, 2014 at 11:59:01AM +0100, Ian Campbell wrote:
> > Also is there is any possibility that any of the operations needed to
> > gather the updated configuration might take a long time.
> 
> I think reading from xenstore and filesystem can be slow, in the sense
> that if there's much contention on these two resources it could take
> seconds to finish operations.

For the purposes of libxl, slow functions are defined as follows
(libxl_internal.h):

 * "Slow" functions includes any that might block on a guest or an
 * external script.  More broadly, it includes any operations which
 * are sufficiently slow that an application might reasonably want to
 * initiate them, and then carry on doing something else, while the
 * operation completes.  That is, a "fast" function must be fast
 * enough that we do not mind blocking all other management operations
 * on the same host while it completes.
 *
 * There are certain primitive functions which make a libxl operation
 * necessarily "slow" for API reasons.  These are:
 *  - awaiting xenstore watches (although read-modify-write xenstore
 *    transactions are OK for fast functions)
 *  - spawning subprocesses
 *  - anything with a timeout

Note that xenstore accesses are explicitly excluded.

Ian.

^ permalink raw reply	[flat|nested] 70+ messages in thread

end of thread, other threads:[~2014-07-29 15:31 UTC | newest]

Thread overview: 70+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-10 14:32 [PATCH v1 00/10] libxl: synchronise domain configuration Wei Liu
2014-07-10 14:32 ` [PATCH v1 01/10] libxl: libxl-json format and internal functions to get / set it Wei Liu
2014-07-16 16:11   ` Ian Campbell
2014-07-16 16:44     ` Wei Liu
2014-07-24 18:09   ` Ian Jackson
2014-07-24 18:29   ` Ian Jackson
2014-07-25 10:30     ` Wei Liu
2014-07-25 14:51       ` Ian Jackson
2014-07-10 14:32 ` [PATCH v1 02/10] libxl_internal: functions to lock / unlock domain configuration Wei Liu
2014-07-16 16:15   ` Ian Campbell
2014-07-16 16:44     ` Wei Liu
2014-07-17 11:29       ` Ian Campbell
2014-07-17 11:41         ` Wei Liu
2014-07-17 11:48           ` Ian Campbell
2014-07-24 18:24   ` Ian Jackson
2014-07-25 10:36     ` Wei Liu
2014-07-10 14:32 ` [PATCH v1 03/10] libxl: store a copy of vanilla domain configuration when creating domain Wei Liu
2014-07-16 16:18   ` Ian Campbell
2014-07-16 16:47     ` Wei Liu
2014-07-17 11:06       ` Ian Campbell
2014-07-17 11:46         ` Wei Liu
2014-07-24 18:52   ` Ian Jackson
2014-07-25 10:53     ` Wei Liu
2014-07-25 15:01       ` Ian Jackson
2014-07-25 15:43         ` Wei Liu
2014-07-25 17:14           ` Ian Jackson
2014-07-25 17:34             ` Wei Liu
2014-07-25 18:31               ` Ian Jackson
2014-07-25 19:47                 ` Wei Liu
2014-07-28  9:42                   ` Ian Campbell
2014-07-28  9:50                   ` Ian Jackson
2014-07-10 14:32 ` [PATCH v1 04/10] libxl: separate device add/rm complete callbacks Wei Liu
2014-07-16 16:26   ` Ian Campbell
2014-07-16 16:48     ` Wei Liu
2014-07-10 14:32 ` [PATCH v1 05/10] libxl: synchronise configuration when we hotplug a device Wei Liu
2014-07-16 16:48   ` Ian Campbell
2014-07-16 17:12     ` Wei Liu
2014-07-17 11:44       ` Ian Campbell
2014-07-17 14:13         ` Wei Liu
2014-07-18  8:49           ` Ian Campbell
2014-07-18 11:22             ` Wei Liu
2014-07-18 12:20               ` Ian Campbell
2014-07-18 13:41                 ` Wei Liu
2014-07-18 13:44                   ` Ian Campbell
2014-07-25 16:06   ` Ian Jackson
2014-07-25 16:40     ` Wei Liu
2014-07-25 17:11       ` Ian Jackson
2014-07-25 17:19         ` Wei Liu
2014-07-10 14:32 ` [PATCH v1 06/10] libxl: synchronise configuration when we remove/destroy " Wei Liu
2014-07-16 16:58   ` Ian Campbell
2014-07-10 14:32 ` [PATCH v1 07/10] libxl: make libxl_cd_insert "eject" + "insert" Wei Liu
2014-07-17 10:44   ` Ian Campbell
2014-07-17 14:20     ` Wei Liu
2014-07-10 14:32 ` [PATCH v1 08/10] libxl: introduce libxl_get_memory_static_max Wei Liu
2014-07-17 10:47   ` Ian Campbell
2014-07-17 12:02     ` Wei Liu
2014-07-17 13:59       ` Ian Campbell
2014-07-29 13:39         ` Ian Jackson
2014-07-10 14:32 ` [PATCH v1 09/10] libxl: introduce libxl_retrieve_domain_configuration Wei Liu
2014-07-17 10:59   ` Ian Campbell
2014-07-17 12:11     ` Wei Liu
2014-07-17 14:02       ` Ian Campbell
2014-07-17 14:28         ` Wei Liu
2014-07-18  8:52           ` Ian Campbell
2014-07-18 11:17             ` Wei Liu
2014-07-29 15:31       ` Ian Jackson
2014-07-29 15:29     ` Ian Jackson
2014-07-10 14:32 ` [PATCH v1 10/10] xl: use libxl_retrieve_domain_configuration and JSON format Wei Liu
2014-07-17 11:13   ` Ian Campbell
2014-07-17 12:14     ` Wei Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.