* [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO
@ 2015-07-15  7:45 Yang Hongyang
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 01/25] tools/libxl: rename libxl__domain_suspend to libxl__domain_save Yang Hongyang
                   ` (25 more replies)
  0 siblings, 26 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, ian.jackson

This patchset is a prerequisite for the COLO feature. Refer to:
http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping

This patchset is based on Andrew Cooper's Libxl migration v4.1:
  http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/libxl-migv2-v4.1

In this version, I moved some of the COLO-specific patches down to the COLO
main series, so most patches in this series are refactoring and can be applied
first.

I've done some simple tests. Both Remus and normal migration work after applying
this patchset. The patch to fix Remus on migration v2 will be sent later as
a separate patch.

You can also get the patchset from:
  https://github.com/macrosheep/xen/tree/colo-v8

v3->v4:
 - Rebased to the latest migration v2 branch
 - Addressed comments from last round

v2->v3:
 - Merge '[PATCH v2 0/6] Misc cleanups for libxl' into this patchset
   for easy review
 - Addressed review comments
 - Add back channel to libxc
 - Introduce should_checkpoint callback
 - Introduce DIRTY_BITMAP record on libxc side
 - Introduce COLO_CONTEXT record on libxl side
 - Ported to Libxl migration v2

v1->v2:
 - Rebased to [PATCH v2 0/6] Misc cleanups for libxl
 - Add a bugfix for the error handling of process_record


Wen Congyang (2):
  tools/libxc: support to resume uncooperative HVM guests
  tools/libxl: Add back channel to allow migration target send data back

Yang Hongyang (23):
  tools/libxl: rename libxl__domain_suspend to libxl__domain_save
A  tools/libxl: move domain suspend code into libxl_dom_suspend.c
A  tools/libxl: move domain resume code into libxl_dom_suspend.c
  tools/libxl: rename remus checkpoint callbacks
  libxl/remus: introduce libxl__remus_setup
  libxl/remus: introduce libxl__remus_teardown
  libxl/remus: init checkpoint_callback in Remus checkpoint callback
  tools/libxl: move remus code into libxl_remus.c
A  tools/libxl: move save/restore code into libxl_dom_save.c
  libxl/save: Refactor libxl__domain_suspend_state
  tools/libxl: introduce enum type libxl_checkpointed_stream
  migration/save: pass checkpointed_stream from libxl to libxc
  tools/libxl: introduce libxl__domain_restore_device_model to load qemu
    state
  tools/libxl: check QEMU state before resume dm
  tools/libxl: Update libxl_domain_unpause() to support qemu-xen
A  tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()
A  tools/libxl: export logdirty_init
  tools/libx{l,c}: add back channel to libxc
  tools/libxl: rename remus device to checkpoint device
A  tools/libxl: adjust the indentation
  tools/libxl: store remus_ops in checkpoint device state
  tools/libxl: move remus state into a seperate structure
  tools/libxl: seperate device init/cleanup from checkpoint device layer

 tools/libxc/include/xenguest.h        |   13 +-
 tools/libxc/xc_domain_restore.c       |    4 +-
 tools/libxc/xc_domain_save.c          |    6 +-
 tools/libxc/xc_nomigrate.c            |    3 +-
 tools/libxc/xc_resume.c               |   22 +-
 tools/libxc/xc_sr_common.h            |    2 +-
 tools/libxc/xc_sr_restore.c           |    2 +-
 tools/libxc/xc_sr_save.c              |    5 +-
 tools/libxl/Makefile                  |    5 +-
 tools/libxl/libxl.c                   |  119 +---
 tools/libxl/libxl.h                   |   30 +-
 tools/libxl/libxl_checkpoint_device.c |  282 ++++++++
 tools/libxl/libxl_create.c            |   33 +-
 tools/libxl/libxl_dom.c               | 1243 ---------------------------------
 tools/libxl/libxl_dom_save.c          |  721 +++++++++++++++++++
 tools/libxl/libxl_dom_suspend.c       |  503 +++++++++++++
 tools/libxl/libxl_internal.h          |  246 ++++---
 tools/libxl/libxl_netbuffer.c         |  117 ++--
 tools/libxl/libxl_nonetbuffer.c       |   10 +-
 tools/libxl/libxl_qmp.c               |   10 +
 tools/libxl/libxl_remus.c             |  395 +++++++++++
 tools/libxl/libxl_remus_device.c      |  327 ---------
 tools/libxl/libxl_remus_disk_drbd.c   |   56 +-
 tools/libxl/libxl_save_callout.c      |   43 +-
 tools/libxl/libxl_save_helper.c       |    9 +-
 tools/libxl/libxl_stream_write.c      |   14 +-
 tools/libxl/libxl_types.idl           |   10 +-
 tools/libxl/xl_cmdimpl.c              |   21 +-
 tools/ocaml/libs/xl/xenlight_stubs.c  |    2 +-
 29 files changed, 2321 insertions(+), 1932 deletions(-)
 create mode 100644 tools/libxl/libxl_checkpoint_device.c
 create mode 100644 tools/libxl/libxl_dom_save.c
 create mode 100644 tools/libxl/libxl_dom_suspend.c
 create mode 100644 tools/libxl/libxl_remus.c
 delete mode 100644 tools/libxl/libxl_remus_device.c

-- 
1.9.1


* [PATCH v4 --for 4.6 COLOPre 01/25] tools/libxl: rename libxl__domain_suspend to libxl__domain_save
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 11:16   ` Ian Campbell
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 02/25] tools/libxl: move domain suspend code into libxl_dom_suspend.c Yang Hongyang
                   ` (24 subsequent siblings)
  25 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, Ian Jackson

The suspend/save terminology used by libxc is more consistent.
"suspend" refers to quiescing the VM, so pausing qemu, making a
remote_shutdown(SHUTDOWN_suspend) hypercall etc.
"save" refers to the actions involved in actually shuffling the
state of the VM, so xc_domain_save() etc.

libxl currently uses "suspend" to encapsulate both. This patch
renames libxl__domain_suspend() to libxl__domain_save() since it
actually refers to shuffling the state of the VM.

This results in some strangeness in that some functions called *save*
are now passed a struct called *suspend*. This is temporary and is all
fixed up later by the refactoring of the suspend_state.
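
As a rough illustration of the terminology only, the split can be sketched
with a toy, non-live model in which "save" uses "suspend" as one of its
steps. Everything named toy_* below is hypothetical and is not libxl or
libxc code; the real paths are asynchronous and considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-domain state; not the real
 * libxl__domain_suspend_state. */
struct toy_domain {
    bool running;     /* guest is still executing */
    bool state_sent;  /* VM state has been written out */
};

/* "suspend": quiesce the guest and nothing else. */
static int toy_suspend(struct toy_domain *d)
{
    if (!d->running)
        return -1;            /* nothing left to quiesce */
    d->running = false;
    return 0;
}

/* "save": shuffle the VM state; suspending is only one step of it. */
static int toy_save(struct toy_domain *d)
{
    int rc = toy_suspend(d);
    if (rc)
        return rc;
    d->state_sent = true;     /* stands in for the actual state transfer */
    return 0;
}
```

In this model a caller that only wants the guest quiesced calls
toy_suspend(), while a caller that wants the state moved calls toy_save().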

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Some comments, commit messages:
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl.c          |  4 ++--
 tools/libxl/libxl_dom.c      | 16 ++++++++--------
 tools/libxl/libxl_internal.h | 13 ++++++++++---
 3 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 38aff8d..fa42c1c 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -914,7 +914,7 @@ static void libxl__remus_setup_done(libxl__egc *egc,
     STATE_AO_GC(dss->ao);
 
     if (!rc) {
-        libxl__domain_suspend(egc, dss);
+        libxl__domain_save(egc, dss);
         return;
     }
 
@@ -981,7 +981,7 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
     dss->live = flags & LIBXL_SUSPEND_LIVE;
     dss->debug = flags & LIBXL_SUSPEND_DEBUG;
 
-    libxl__domain_suspend(egc, dss);
+    libxl__domain_save(egc, dss);
     return AO_INPROGRESS;
 
  out_err:
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 81adb3d..3bbec99 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1155,8 +1155,8 @@ out:
 
 static void stream_done(libxl__egc *egc,
                         libxl__stream_write_state *sws, int rc);
-static void domain_suspend_done(libxl__egc *egc,
-                        libxl__domain_suspend_state *dss, int rc);
+static void domain_save_done(libxl__egc *egc,
+                             libxl__domain_suspend_state *dss, int rc);
 static void domain_suspend_callback_common_done(libxl__egc *egc,
                                 libxl__domain_suspend_state *dss, int rc);
 
@@ -2036,9 +2036,9 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
     libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
 }
 
-/*----- main code for suspending, in order of execution -----*/
+/*----- main code for saving, in order of execution -----*/
 
-void libxl__domain_suspend(libxl__egc *egc, libxl__domain_suspend_state *dss)
+void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
 {
     STATE_AO_GC(dss->ao);
     int port;
@@ -2125,13 +2125,13 @@ void libxl__domain_suspend(libxl__egc *egc, libxl__domain_suspend_state *dss)
     return;
 
  out:
-    domain_suspend_done(egc, dss, rc);
+    domain_save_done(egc, dss, rc);
 }
 
 static void stream_done(libxl__egc *egc,
                         libxl__stream_write_state *sws, int rc)
 {
-    domain_suspend_done(egc, sws->dss, rc);
+    domain_save_done(egc, sws->dss, rc);
 }
 
 static void save_device_model_datacopier_done(libxl__egc *egc,
@@ -2229,8 +2229,8 @@ static void remus_teardown_done(libxl__egc *egc,
                                        libxl__remus_devices_state *rds,
                                        int rc);
 
-static void domain_suspend_done(libxl__egc *egc,
-                        libxl__domain_suspend_state *dss, int rc)
+static void domain_save_done(libxl__egc *egc,
+                             libxl__domain_suspend_state *dss, int rc)
 {
     STATE_AO_GC(dss->ao);
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index d9deaad..7599f15 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2946,6 +2946,13 @@ static inline bool libxl__conversion_helper_inuse
 
 
 /*----- Domain suspend (save) state structure -----*/
+/*
+ * "suspend" refers to quiescing the VM, so pausing qemu, making a
+ * remote_shutdown(SHUTDOWN_suspend) hypercall etc.
+ *
+ * "save" refers to the actions involved in actually shuffling the
+ * state of the VM, so xc_domain_save() etc.
+ */
 
 typedef struct libxl__domain_suspend_state libxl__domain_suspend_state;
 
@@ -3010,7 +3017,7 @@ typedef struct libxl__logdirty_switch {
 } libxl__logdirty_switch;
 
 struct libxl__domain_suspend_state {
-    /* set by caller of libxl__domain_suspend */
+    /* set by caller of libxl__domain_save */
     libxl__ao *ao;
     libxl__domain_suspend_cb *callback;
 
@@ -3375,8 +3382,8 @@ struct libxl__domain_create_state {
 /*----- Domain suspend (save) functions -----*/
 
 /* calls dss->callback when done */
-_hidden void libxl__domain_suspend(libxl__egc *egc,
-                                   libxl__domain_suspend_state *dss);
+_hidden void libxl__domain_save(libxl__egc *egc,
+                                libxl__domain_suspend_state *dss);
 
 
 /* calls libxl__xc_domain_suspend_done when done */
-- 
1.9.1


* [PATCH v4 --for 4.6 COLOPre 02/25] tools/libxl: move domain suspend code into libxl_dom_suspend.c
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 01/25] tools/libxl: rename libxl__domain_suspend to libxl__domain_save Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 03/25] tools/libxl: move domain resume " Yang Hongyang
                   ` (23 subsequent siblings)
  25 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, Ian Jackson

Move domain suspend code into a separate file libxl_dom_suspend.c.
Add an API libxl__domain_suspend() which wraps the static
function domain_suspend_callback_common() for internal use.
Export the existing API libxl__domain_suspend_callback() used by
libxc to suspend the guest during migration.

Note that the newly added file libxl_dom_suspend.c is used for
suspend/resume code.
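
The wrapper arrangement described above can be sketched in isolation as
follows. This is a simplified, synchronous model under stated assumptions:
all toy_* names are hypothetical, and the real code takes an egc and
completes asynchronously via dss->callback_common_done.

```c
#include <assert.h>

struct toy_dss;
typedef void toy_done_cb(struct toy_dss *dss, int rc);

/* Hypothetical, cut-down analogue of the suspend state. */
struct toy_dss {
    toy_done_cb *callback_common_done; /* set by the caller */
    int rc;                            /* result recorded by the callback */
    int done_calls;                    /* how often the callback fired */
};

/* File-local worker, analogous to the static common suspend function.
 * The real one waits for the guest; this one completes immediately. */
static void toy_suspend_common(struct toy_dss *dss)
{
    dss->callback_common_done(dss, 0);
}

/* Exported entry point: a thin wrapper, so the worker can stay static. */
void toy_domain_suspend(struct toy_dss *dss)
{
    toy_suspend_common(dss);
}

/* Example completion callback supplied by a caller. */
static void toy_record_done(struct toy_dss *dss, int rc)
{
    dss->rc = rc;
    dss->done_calls++;
}
```

Keeping the worker static and exporting only the wrapper means callers in
other files depend on one stable name, while the helper functions behind it
remain private to the new file.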

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/Makefile            |   3 +-
 tools/libxl/libxl_dom.c         | 342 +-----------------------------------
 tools/libxl/libxl_dom_suspend.c | 380 ++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.h    |   6 +
 4 files changed, 389 insertions(+), 342 deletions(-)
 create mode 100644 tools/libxl/libxl_dom_suspend.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 0150ec7..4a5957e 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -102,7 +102,8 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
 			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
 			libxl_stream_read.o libxl_stream_write.o \
 			libxl_save_callout.o _libxl_save_msgs_callout.o \
-			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
+			libxl_qmp.o libxl_event.o libxl_fork.o \
+			libxl_dom_suspend.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += libxl_genid.o
 LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 3bbec99..e21e110 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1157,8 +1157,6 @@ static void stream_done(libxl__egc *egc,
                         libxl__stream_write_state *sws, int rc);
 static void domain_save_done(libxl__egc *egc,
                              libxl__domain_suspend_state *dss, int rc);
-static void domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int rc);
 
 /*----- complicated callback, called by xc_domain_save -----*/
 
@@ -1386,35 +1384,6 @@ static void switch_logdirty_done(libxl__egc *egc,
 
 /*----- callbacks, called by xc_domain_save -----*/
 
-int libxl__domain_suspend_device_model(libxl__gc *gc,
-                                       libxl__domain_suspend_state *dss)
-{
-    int ret = 0;
-    uint32_t const domid = dss->domid;
-    const char *const filename = dss->dm_savefile;
-
-    switch (libxl__device_model_version_running(gc, domid)) {
-    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
-        LOG(DEBUG, "Saving device model state to %s", filename);
-        libxl__qemu_traditional_cmd(gc, domid, "save");
-        libxl__wait_for_device_model_deprecated(gc, domid, "paused", NULL, NULL, NULL);
-        break;
-    }
-    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-        if (libxl__qmp_stop(gc, domid))
-            return ERROR_FAIL;
-        /* Save DM state into filename */
-        ret = libxl__qmp_save(gc, domid, filename);
-        if (ret)
-            unlink(filename);
-        break;
-    default:
-        return ERROR_INVAL;
-    }
-
-    return ret;
-}
-
 int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
 {
 
@@ -1435,298 +1404,6 @@ int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
     return 0;
 }
 
-static void domain_suspend_common_wait_guest(libxl__egc *egc,
-                                             libxl__domain_suspend_state *dss);
-static void domain_suspend_common_guest_suspended(libxl__egc *egc,
-                                         libxl__domain_suspend_state *dss);
-
-static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
-      libxl__xswait_state *xswa, int rc, const char *state);
-static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
-        libxl__ev_evtchn *evev);
-static void suspend_common_wait_guest_watch(libxl__egc *egc,
-      libxl__ev_xswatch *xsw, const char *watch_path, const char *event_path);
-static void suspend_common_wait_guest_check(libxl__egc *egc,
-        libxl__domain_suspend_state *dss);
-static void suspend_common_wait_guest_timeout(libxl__egc *egc,
-      libxl__ev_time *ev, const struct timeval *requested_abs, int rc);
-
-static void domain_suspend_common_done(libxl__egc *egc,
-                                       libxl__domain_suspend_state *dss,
-                                       int rc);
-
-static bool domain_suspend_pvcontrol_acked(const char *state) {
-    /* any value other than "suspend", including ENOENT (i.e. !state), is OK */
-    if (!state) return 1;
-    return strcmp(state,"suspend");
-}
-
-/* calls dss->callback_common_done when done */
-static void domain_suspend_callback_common(libxl__egc *egc,
-                                           libxl__domain_suspend_state *dss)
-{
-    STATE_AO_GC(dss->ao);
-    uint64_t hvm_s_state = 0, hvm_pvdrv = 0;
-    int ret, rc;
-
-    /* Convenience aliases */
-    const uint32_t domid = dss->domid;
-
-    if (dss->hvm) {
-        xc_hvm_param_get(CTX->xch, domid, HVM_PARAM_CALLBACK_IRQ, &hvm_pvdrv);
-        xc_hvm_param_get(CTX->xch, domid, HVM_PARAM_ACPI_S_STATE, &hvm_s_state);
-    }
-
-    if ((hvm_s_state == 0) && (dss->guest_evtchn.port >= 0)) {
-        LOG(DEBUG, "issuing %s suspend request via event channel",
-            dss->hvm ? "PVHVM" : "PV");
-        ret = xc_evtchn_notify(CTX->xce, dss->guest_evtchn.port);
-        if (ret < 0) {
-            LOG(ERROR, "xc_evtchn_notify failed ret=%d", ret);
-            rc = ERROR_FAIL;
-            goto err;
-        }
-
-        dss->guest_evtchn.callback = domain_suspend_common_wait_guest_evtchn;
-        rc = libxl__ev_evtchn_wait(gc, &dss->guest_evtchn);
-        if (rc) goto err;
-
-        rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout,
-                                         suspend_common_wait_guest_timeout,
-                                         60*1000);
-        if (rc) goto err;
-
-        return;
-    }
-
-    if (dss->hvm && (!hvm_pvdrv || hvm_s_state)) {
-        LOG(DEBUG, "Calling xc_domain_shutdown on HVM domain");
-        ret = xc_domain_shutdown(CTX->xch, domid, SHUTDOWN_suspend);
-        if (ret < 0) {
-            LOGE(ERROR, "xc_domain_shutdown failed");
-            rc = ERROR_FAIL;
-            goto err;
-        }
-        /* The guest does not (need to) respond to this sort of request. */
-        dss->guest_responded = 1;
-        domain_suspend_common_wait_guest(egc, dss);
-        return;
-    }
-
-    LOG(DEBUG, "issuing %s suspend request via XenBus control node",
-        dss->hvm ? "PVHVM" : "PV");
-
-    libxl__domain_pvcontrol_write(gc, XBT_NULL, domid, "suspend");
-
-    dss->pvcontrol.path = libxl__domain_pvcontrol_xspath(gc, domid);
-    if (!dss->pvcontrol.path) { rc = ERROR_FAIL; goto err; }
-
-    dss->pvcontrol.ao = ao;
-    dss->pvcontrol.what = "guest acknowledgement of suspend request";
-    dss->pvcontrol.timeout_ms = 60 * 1000;
-    dss->pvcontrol.callback = domain_suspend_common_pvcontrol_suspending;
-    libxl__xswait_start(gc, &dss->pvcontrol);
-    return;
-
- err:
-    domain_suspend_common_done(egc, dss, rc);
-}
-
-static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
-        libxl__ev_evtchn *evev)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(evev, *dss, guest_evtchn);
-    STATE_AO_GC(dss->ao);
-    /* If we should be done waiting, suspend_common_wait_guest_check
-     * will end up calling domain_suspend_common_guest_suspended or
-     * domain_suspend_common_done, both of which cancel the evtchn
-     * wait as needed.  So re-enable it now. */
-    libxl__ev_evtchn_wait(gc, &dss->guest_evtchn);
-    suspend_common_wait_guest_check(egc, dss);
-}
-
-static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
-      libxl__xswait_state *xswa, int rc, const char *state)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(xswa, *dss, pvcontrol);
-    STATE_AO_GC(dss->ao);
-    xs_transaction_t t = 0;
-
-    if (!rc && !domain_suspend_pvcontrol_acked(state))
-        /* keep waiting */
-        return;
-
-    libxl__xswait_stop(gc, &dss->pvcontrol);
-
-    if (rc == ERROR_TIMEDOUT) {
-        /*
-         * Guest appears to not be responding. Cancel the suspend
-         * request.
-         *
-         * We re-read the suspend node and clear it within a
-         * transaction in order to handle the case where we race
-         * against the guest catching up and acknowledging the request
-         * at the last minute.
-         */
-        for (;;) {
-            rc = libxl__xs_transaction_start(gc, &t);
-            if (rc) goto err;
-
-            rc = libxl__xs_read_checked(gc, t, xswa->path, &state);
-            if (rc) goto err;
-
-            if (domain_suspend_pvcontrol_acked(state))
-                /* last minute ack */
-                break;
-
-            rc = libxl__xs_write_checked(gc, t, xswa->path, "");
-            if (rc) goto err;
-
-            rc = libxl__xs_transaction_commit(gc, &t);
-            if (!rc) {
-                LOG(ERROR,
-                    "guest didn't acknowledge suspend, cancelling request");
-                goto err;
-            }
-            if (rc<0) goto err;
-        }
-    } else if (rc) {
-        /* some error in xswait's read of xenstore, already logged */
-        goto err;
-    }
-
-    assert(domain_suspend_pvcontrol_acked(state));
-    LOG(DEBUG, "guest acknowledged suspend request");
-
-    libxl__xs_transaction_abort(gc, &t);
-    dss->guest_responded = 1;
-    domain_suspend_common_wait_guest(egc,dss);
-    return;
-
- err:
-    libxl__xs_transaction_abort(gc, &t);
-    domain_suspend_common_done(egc, dss, rc);
-    return;
-}
-
-static void domain_suspend_common_wait_guest(libxl__egc *egc,
-                                             libxl__domain_suspend_state *dss)
-{
-    STATE_AO_GC(dss->ao);
-    int rc;
-
-    LOG(DEBUG, "wait for the guest to suspend");
-
-    rc = libxl__ev_xswatch_register(gc, &dss->guest_watch,
-                                    suspend_common_wait_guest_watch,
-                                    "@releaseDomain");
-    if (rc) goto err;
-
-    rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout,
-                                     suspend_common_wait_guest_timeout,
-                                     60*1000);
-    if (rc) goto err;
-    return;
-
- err:
-    domain_suspend_common_done(egc, dss, rc);
-}
-
-static void suspend_common_wait_guest_watch(libxl__egc *egc,
-      libxl__ev_xswatch *xsw, const char *watch_path, const char *event_path)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(xsw, *dss, guest_watch);
-    suspend_common_wait_guest_check(egc, dss);
-}
-
-static void suspend_common_wait_guest_check(libxl__egc *egc,
-        libxl__domain_suspend_state *dss)
-{
-    STATE_AO_GC(dss->ao);
-    xc_domaininfo_t info;
-    int ret;
-    int shutdown_reason;
-
-    /* Convenience aliases */
-    const uint32_t domid = dss->domid;
-
-    ret = xc_domain_getinfolist(CTX->xch, domid, 1, &info);
-    if (ret < 0) {
-        LOGE(ERROR, "unable to check for status of guest %"PRId32"", domid);
-        goto err;
-    }
-
-    if (!(ret == 1 && info.domain == domid)) {
-        LOGE(ERROR, "guest %"PRId32" we were suspending has been destroyed",
-             domid);
-        goto err;
-    }
-
-    if (!(info.flags & XEN_DOMINF_shutdown))
-        /* keep waiting */
-        return;
-
-    shutdown_reason = (info.flags >> XEN_DOMINF_shutdownshift)
-        & XEN_DOMINF_shutdownmask;
-    if (shutdown_reason != SHUTDOWN_suspend) {
-        LOG(DEBUG, "guest %"PRId32" we were suspending has shut down"
-            " with unexpected reason code %d", domid, shutdown_reason);
-        goto err;
-    }
-
-    LOG(DEBUG, "guest has suspended");
-    domain_suspend_common_guest_suspended(egc, dss);
-    return;
-
- err:
-    domain_suspend_common_done(egc, dss, ERROR_FAIL);
-}
-
-static void suspend_common_wait_guest_timeout(libxl__egc *egc,
-      libxl__ev_time *ev, const struct timeval *requested_abs, int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, guest_timeout);
-    STATE_AO_GC(dss->ao);
-    if (rc == ERROR_TIMEDOUT) {
-        LOG(ERROR, "guest did not suspend, timed out");
-        rc = ERROR_GUEST_TIMEDOUT;
-    }
-    domain_suspend_common_done(egc, dss, rc);
-}
-
-static void domain_suspend_common_guest_suspended(libxl__egc *egc,
-                                         libxl__domain_suspend_state *dss)
-{
-    STATE_AO_GC(dss->ao);
-    int rc;
-
-    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
-    libxl__ev_xswatch_deregister(gc, &dss->guest_watch);
-    libxl__ev_time_deregister(gc, &dss->guest_timeout);
-
-    if (dss->hvm) {
-        rc = libxl__domain_suspend_device_model(gc, dss);
-        if (rc) {
-            LOG(ERROR, "libxl__domain_suspend_device_model failed ret=%d", rc);
-            domain_suspend_common_done(egc, dss, rc);
-            return;
-        }
-    }
-    domain_suspend_common_done(egc, dss, 0);
-}
-
-static void domain_suspend_common_done(libxl__egc *egc,
-                                       libxl__domain_suspend_state *dss,
-                                       int rc)
-{
-    EGC_GC;
-    assert(!libxl__xswait_inuse(&dss->pvcontrol));
-    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
-    libxl__ev_xswatch_deregister(gc, &dss->guest_watch);
-    libxl__ev_time_deregister(gc, &dss->guest_timeout);
-    dss->callback_common_done(egc, dss, rc);
-}
-
 static inline char *physmap_path(libxl__gc *gc, uint32_t dm_domid,
                                  uint32_t domid,
                                  char *phys_offset, char *node)
@@ -1830,23 +1507,6 @@ out:
     return ret;
 }
 
-static void libxl__domain_suspend_callback(void *data)
-{
-    libxl__save_helper_state *shs = data;
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-
-    dss->callback_common_done = domain_suspend_callback_common_done;
-    domain_suspend_callback_common(egc, dss);
-}
-
-static void domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int rc)
-{
-    dss->rc = rc;
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
-}
-
 /*----- remus callbacks -----*/
 static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
                                 libxl__domain_suspend_state *dss, int ok);
@@ -1864,7 +1524,7 @@ static void libxl__remus_domain_suspend_callback(void *data)
     libxl__domain_suspend_state *dss = shs->caller_state;
 
     dss->callback_common_done = remus_domain_suspend_callback_common_done;
-    domain_suspend_callback_common(egc, dss);
+    libxl__domain_suspend(egc, dss);
 }
 
 static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
new file mode 100644
index 0000000..5146402
--- /dev/null
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -0,0 +1,380 @@
+/*
+ * Copyright (C) 2009      Citrix Ltd.
+ * Author Vincent Hanquez <vincent.hanquez@eu.citrix.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+/*====================== Domain suspend =======================*/
+
+/*----- callbacks, called by xc_domain_save -----*/
+
+int libxl__domain_suspend_device_model(libxl__gc *gc,
+                                       libxl__domain_suspend_state *dss)
+{
+    int ret = 0;
+    uint32_t const domid = dss->domid;
+    const char *const filename = dss->dm_savefile;
+
+    switch (libxl__device_model_version_running(gc, domid)) {
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
+        LOG(DEBUG, "Saving device model state to %s", filename);
+        libxl__qemu_traditional_cmd(gc, domid, "save");
+        libxl__wait_for_device_model_deprecated(gc, domid, "paused", NULL, NULL, NULL);
+        break;
+    }
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+        if (libxl__qmp_stop(gc, domid))
+            return ERROR_FAIL;
+        /* Save DM state into filename */
+        ret = libxl__qmp_save(gc, domid, filename);
+        if (ret)
+            unlink(filename);
+        break;
+    default:
+        return ERROR_INVAL;
+    }
+
+    return ret;
+}
+
+static void domain_suspend_common_wait_guest(libxl__egc *egc,
+                                             libxl__domain_suspend_state *dss);
+static void domain_suspend_common_guest_suspended(libxl__egc *egc,
+                                         libxl__domain_suspend_state *dss);
+
+static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
+      libxl__xswait_state *xswa, int rc, const char *state);
+static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
+        libxl__ev_evtchn *evev);
+static void suspend_common_wait_guest_watch(libxl__egc *egc,
+      libxl__ev_xswatch *xsw, const char *watch_path, const char *event_path);
+static void suspend_common_wait_guest_check(libxl__egc *egc,
+        libxl__domain_suspend_state *dss);
+static void suspend_common_wait_guest_timeout(libxl__egc *egc,
+      libxl__ev_time *ev, const struct timeval *requested_abs, int rc);
+
+static void domain_suspend_common_done(libxl__egc *egc,
+                                       libxl__domain_suspend_state *dss,
+                                       int rc);
+
+static void domain_suspend_callback_common(libxl__egc *egc,
+                                           libxl__domain_suspend_state *dss);
+static void domain_suspend_callback_common_done(libxl__egc *egc,
+                                libxl__domain_suspend_state *dss, int rc);
+
+/* calls dss->callback_common_done when done */
+void libxl__domain_suspend(libxl__egc *egc,
+                           libxl__domain_suspend_state *dss)
+{
+    domain_suspend_callback_common(egc, dss);
+}
+
+static bool domain_suspend_pvcontrol_acked(const char *state) {
+    /* any value other than "suspend", including ENOENT (i.e. !state), is OK */
+    if (!state) return 1;
+    return strcmp(state,"suspend");
+}
+
+/* calls dss->callback_common_done when done */
+static void domain_suspend_callback_common(libxl__egc *egc,
+                                           libxl__domain_suspend_state *dss)
+{
+    STATE_AO_GC(dss->ao);
+    uint64_t hvm_s_state = 0, hvm_pvdrv = 0;
+    int ret, rc;
+
+    /* Convenience aliases */
+    const uint32_t domid = dss->domid;
+
+    if (dss->hvm) {
+        xc_hvm_param_get(CTX->xch, domid, HVM_PARAM_CALLBACK_IRQ, &hvm_pvdrv);
+        xc_hvm_param_get(CTX->xch, domid, HVM_PARAM_ACPI_S_STATE, &hvm_s_state);
+    }
+
+    if ((hvm_s_state == 0) && (dss->guest_evtchn.port >= 0)) {
+        LOG(DEBUG, "issuing %s suspend request via event channel",
+            dss->hvm ? "PVHVM" : "PV");
+        ret = xc_evtchn_notify(CTX->xce, dss->guest_evtchn.port);
+        if (ret < 0) {
+            LOG(ERROR, "xc_evtchn_notify failed ret=%d", ret);
+            rc = ERROR_FAIL;
+            goto err;
+        }
+
+        dss->guest_evtchn.callback = domain_suspend_common_wait_guest_evtchn;
+        rc = libxl__ev_evtchn_wait(gc, &dss->guest_evtchn);
+        if (rc) goto err;
+
+        rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout,
+                                         suspend_common_wait_guest_timeout,
+                                         60*1000);
+        if (rc) goto err;
+
+        return;
+    }
+
+    if (dss->hvm && (!hvm_pvdrv || hvm_s_state)) {
+        LOG(DEBUG, "Calling xc_domain_shutdown on HVM domain");
+        ret = xc_domain_shutdown(CTX->xch, domid, SHUTDOWN_suspend);
+        if (ret < 0) {
+            LOGE(ERROR, "xc_domain_shutdown failed");
+            rc = ERROR_FAIL;
+            goto err;
+        }
+        /* The guest does not (need to) respond to this sort of request. */
+        dss->guest_responded = 1;
+        domain_suspend_common_wait_guest(egc, dss);
+        return;
+    }
+
+    LOG(DEBUG, "issuing %s suspend request via XenBus control node",
+        dss->hvm ? "PVHVM" : "PV");
+
+    libxl__domain_pvcontrol_write(gc, XBT_NULL, domid, "suspend");
+
+    dss->pvcontrol.path = libxl__domain_pvcontrol_xspath(gc, domid);
+    if (!dss->pvcontrol.path) { rc = ERROR_FAIL; goto err; }
+
+    dss->pvcontrol.ao = ao;
+    dss->pvcontrol.what = "guest acknowledgement of suspend request";
+    dss->pvcontrol.timeout_ms = 60 * 1000;
+    dss->pvcontrol.callback = domain_suspend_common_pvcontrol_suspending;
+    libxl__xswait_start(gc, &dss->pvcontrol);
+    return;
+
+ err:
+    domain_suspend_common_done(egc, dss, rc);
+}
+
+static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
+        libxl__ev_evtchn *evev)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(evev, *dss, guest_evtchn);
+    STATE_AO_GC(dss->ao);
+    /* If we should be done waiting, suspend_common_wait_guest_check
+     * will end up calling domain_suspend_common_guest_suspended or
+     * domain_suspend_common_done, both of which cancel the evtchn
+     * wait as needed.  So re-enable it now. */
+    libxl__ev_evtchn_wait(gc, &dss->guest_evtchn);
+    suspend_common_wait_guest_check(egc, dss);
+}
+
+static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
+      libxl__xswait_state *xswa, int rc, const char *state)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(xswa, *dss, pvcontrol);
+    STATE_AO_GC(dss->ao);
+    xs_transaction_t t = 0;
+
+    if (!rc && !domain_suspend_pvcontrol_acked(state))
+        /* keep waiting */
+        return;
+
+    libxl__xswait_stop(gc, &dss->pvcontrol);
+
+    if (rc == ERROR_TIMEDOUT) {
+        /*
+         * Guest appears to not be responding. Cancel the suspend
+         * request.
+         *
+         * We re-read the suspend node and clear it within a
+         * transaction in order to handle the case where we race
+         * against the guest catching up and acknowledging the request
+         * at the last minute.
+         */
+        for (;;) {
+            rc = libxl__xs_transaction_start(gc, &t);
+            if (rc) goto err;
+
+            rc = libxl__xs_read_checked(gc, t, xswa->path, &state);
+            if (rc) goto err;
+
+            if (domain_suspend_pvcontrol_acked(state))
+                /* last minute ack */
+                break;
+
+            rc = libxl__xs_write_checked(gc, t, xswa->path, "");
+            if (rc) goto err;
+
+            rc = libxl__xs_transaction_commit(gc, &t);
+            if (!rc) {
+                LOG(ERROR,
+                    "guest didn't acknowledge suspend, cancelling request");
+                goto err;
+            }
+            if (rc<0) goto err;
+        }
+    } else if (rc) {
+        /* some error in xswait's read of xenstore, already logged */
+        goto err;
+    }
+
+    assert(domain_suspend_pvcontrol_acked(state));
+    LOG(DEBUG, "guest acknowledged suspend request");
+
+    libxl__xs_transaction_abort(gc, &t);
+    dss->guest_responded = 1;
+    domain_suspend_common_wait_guest(egc,dss);
+    return;
+
+ err:
+    libxl__xs_transaction_abort(gc, &t);
+    domain_suspend_common_done(egc, dss, rc);
+    return;
+}
+
+static void domain_suspend_common_wait_guest(libxl__egc *egc,
+                                             libxl__domain_suspend_state *dss)
+{
+    STATE_AO_GC(dss->ao);
+    int rc;
+
+    LOG(DEBUG, "wait for the guest to suspend");
+
+    rc = libxl__ev_xswatch_register(gc, &dss->guest_watch,
+                                    suspend_common_wait_guest_watch,
+                                    "@releaseDomain");
+    if (rc) goto err;
+
+    rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout,
+                                     suspend_common_wait_guest_timeout,
+                                     60*1000);
+    if (rc) goto err;
+    return;
+
+ err:
+    domain_suspend_common_done(egc, dss, rc);
+}
+
+static void suspend_common_wait_guest_watch(libxl__egc *egc,
+      libxl__ev_xswatch *xsw, const char *watch_path, const char *event_path)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(xsw, *dss, guest_watch);
+    suspend_common_wait_guest_check(egc, dss);
+}
+
+static void suspend_common_wait_guest_check(libxl__egc *egc,
+        libxl__domain_suspend_state *dss)
+{
+    STATE_AO_GC(dss->ao);
+    xc_domaininfo_t info;
+    int ret;
+    int shutdown_reason;
+
+    /* Convenience aliases */
+    const uint32_t domid = dss->domid;
+
+    ret = xc_domain_getinfolist(CTX->xch, domid, 1, &info);
+    if (ret < 0) {
+        LOGE(ERROR, "unable to check for status of guest %"PRId32"", domid);
+        goto err;
+    }
+
+    if (!(ret == 1 && info.domain == domid)) {
+        LOGE(ERROR, "guest %"PRId32" we were suspending has been destroyed",
+             domid);
+        goto err;
+    }
+
+    if (!(info.flags & XEN_DOMINF_shutdown))
+        /* keep waiting */
+        return;
+
+    shutdown_reason = (info.flags >> XEN_DOMINF_shutdownshift)
+        & XEN_DOMINF_shutdownmask;
+    if (shutdown_reason != SHUTDOWN_suspend) {
+        LOG(DEBUG, "guest %"PRId32" we were suspending has shut down"
+            " with unexpected reason code %d", domid, shutdown_reason);
+        goto err;
+    }
+
+    LOG(DEBUG, "guest has suspended");
+    domain_suspend_common_guest_suspended(egc, dss);
+    return;
+
+ err:
+    domain_suspend_common_done(egc, dss, ERROR_FAIL);
+}
+
+static void suspend_common_wait_guest_timeout(libxl__egc *egc,
+      libxl__ev_time *ev, const struct timeval *requested_abs, int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, guest_timeout);
+    STATE_AO_GC(dss->ao);
+    if (rc == ERROR_TIMEDOUT) {
+        LOG(ERROR, "guest did not suspend, timed out");
+        rc = ERROR_GUEST_TIMEDOUT;
+    }
+    domain_suspend_common_done(egc, dss, rc);
+}
+
+static void domain_suspend_common_guest_suspended(libxl__egc *egc,
+                                         libxl__domain_suspend_state *dss)
+{
+    STATE_AO_GC(dss->ao);
+    int rc;
+
+    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
+    libxl__ev_xswatch_deregister(gc, &dss->guest_watch);
+    libxl__ev_time_deregister(gc, &dss->guest_timeout);
+
+    if (dss->hvm) {
+        rc = libxl__domain_suspend_device_model(gc, dss);
+        if (rc) {
+            LOG(ERROR, "libxl__domain_suspend_device_model failed ret=%d", rc);
+            domain_suspend_common_done(egc, dss, rc);
+            return;
+        }
+    }
+    domain_suspend_common_done(egc, dss, 0);
+}
+
+static void domain_suspend_common_done(libxl__egc *egc,
+                                       libxl__domain_suspend_state *dss,
+                                       int rc)
+{
+    EGC_GC;
+    assert(!libxl__xswait_inuse(&dss->pvcontrol));
+    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
+    libxl__ev_xswatch_deregister(gc, &dss->guest_watch);
+    libxl__ev_time_deregister(gc, &dss->guest_timeout);
+    dss->callback_common_done(egc, dss, rc);
+}
+
+void libxl__domain_suspend_callback(void *data)
+{
+    libxl__save_helper_state *shs = data;
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+
+    dss->callback_common_done = domain_suspend_callback_common_done;
+    domain_suspend_callback_common(egc, dss);
+}
+
+static void domain_suspend_callback_common_done(libxl__egc *egc,
+                                libxl__domain_suspend_state *dss, int rc)
+{
+    dss->rc = rc;
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 7599f15..3a2ef00 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3439,6 +3439,12 @@ _hidden void libxl__domain_save_device_model(libxl__egc *egc,
 
 _hidden const char *libxl__device_model_savefile(libxl__gc *gc, uint32_t domid);
 
+/* calls dss->callback_common_done when done */
+_hidden void libxl__domain_suspend(libxl__egc *egc,
+                                   libxl__domain_suspend_state *dss);
+/* used by libxc to suspend the guest during migration */
+_hidden void libxl__domain_suspend_callback(void *data);
+
 
 /*
  * Convenience macros.
-- 
1.9.1
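
For readers following the moved suspend code, here is a minimal standalone
sketch (not part of the patch) of the two small checks it relies on: the
"has the guest acked the suspend request" predicate and the decoding of the
shutdown reason from the domain info flags. The DEMO_* constants and demo_*
names are hypothetical stand-ins; their values are assumed to mirror Xen's
public headers and should be checked there before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed, illustrative stand-ins for XEN_DOMINF_shutdown,
 * XEN_DOMINF_shutdownshift, XEN_DOMINF_shutdownmask and SHUTDOWN_suspend. */
#define DEMO_DOMINF_SHUTDOWN       (1u << 2)
#define DEMO_DOMINF_SHUTDOWNSHIFT  16
#define DEMO_DOMINF_SHUTDOWNMASK   255u
#define DEMO_SHUTDOWN_SUSPEND      2

/* Any value other than "suspend", including a missing node (NULL),
 * counts as an acknowledgement of the request. */
bool demo_pvcontrol_acked(const char *state)
{
    if (!state)
        return true;
    return strcmp(state, "suspend") != 0;
}

/* Returns -1 while the domain has not shut down yet (keep waiting),
 * otherwise the shutdown reason code extracted from the flags. */
int demo_shutdown_reason(uint32_t flags)
{
    if (!(flags & DEMO_DOMINF_SHUTDOWN))
        return -1;
    return (int)((flags >> DEMO_DOMINF_SHUTDOWNSHIFT)
                 & DEMO_DOMINF_SHUTDOWNMASK);
}
```

A caller would treat any reason other than DEMO_SHUTDOWN_SUSPEND as an
error, as suspend_common_wait_guest_check() does in the patch.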


* [PATCH v4 --for 4.6 COLOPre 03/25] tools/libxl: move domain resume code into libxl_dom_suspend.c
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 01/25] tools/libxl: rename libxl__domain_suspend to libxl__domain_save Yang Hongyang
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 02/25] tools/libxl: move domain suspend code into libxl_dom_suspend.c Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 04/25] tools/libxl: rename remus checkpoint callbacks Yang Hongyang
                   ` (22 subsequent siblings)
  25 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, Ian Jackson

Move the domain resume code into libxl_dom_suspend.c.
This is a pure code move.

libxl__domain_resume_device_model() will be used later by COLO,
so it is not made static.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/libxl.c             | 33 -------------------------
 tools/libxl/libxl_dom.c         | 20 ---------------
 tools/libxl/libxl_dom_suspend.c | 55 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 55 insertions(+), 53 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index fa42c1c..69a6937 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -513,39 +513,6 @@ int libxl_domain_rename(libxl_ctx *ctx, uint32_t domid,
     return rc;
 }
 
-int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int suspend_cancel)
-{
-    int rc = 0;
-
-    if (xc_domain_resume(CTX->xch, domid, suspend_cancel)) {
-        LOGE(ERROR, "xc_domain_resume failed for domain %u", domid);
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    libxl_domain_type type = libxl__domain_type(gc, domid);
-    if (type == LIBXL_DOMAIN_TYPE_INVALID) {
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    if (type == LIBXL_DOMAIN_TYPE_HVM) {
-        rc = libxl__domain_resume_device_model(gc, domid);
-        if (rc) {
-            LOG(ERROR, "failed to resume device model for domain %u:%d",
-                domid, rc);
-            goto out;
-        }
-    }
-
-    if (!xs_resume_domain(CTX->xsh, domid)) {
-        LOGE(ERROR, "xs_resume_domain failed for domain %u", domid);
-        rc = ERROR_FAIL;
-    }
-out:
-    return rc;
-}
-
 int libxl_domain_resume(libxl_ctx *ctx, uint32_t domid, int suspend_cancel,
                         const libxl_asyncop_how *ao_how)
 {
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index e21e110..0788309 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1384,26 +1384,6 @@ static void switch_logdirty_done(libxl__egc *egc,
 
 /*----- callbacks, called by xc_domain_save -----*/
 
-int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
-{
-
-    switch (libxl__device_model_version_running(gc, domid)) {
-    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
-        libxl__qemu_traditional_cmd(gc, domid, "continue");
-        libxl__wait_for_device_model_deprecated(gc, domid, "running", NULL, NULL, NULL);
-        break;
-    }
-    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-        if (libxl__qmp_resume(gc, domid))
-            return ERROR_FAIL;
-        break;
-    default:
-        return ERROR_INVAL;
-    }
-
-    return 0;
-}
-
 static inline char *physmap_path(libxl__gc *gc, uint32_t dm_domid,
                                  uint32_t domid,
                                  char *phys_offset, char *node)
diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
index 5146402..a90800d 100644
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -371,6 +371,61 @@ static void domain_suspend_callback_common_done(libxl__egc *egc,
     libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
 }
 
+/*======================= Domain resume ========================*/
+
+int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
+{
+
+    switch (libxl__device_model_version_running(gc, domid)) {
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
+        libxl__qemu_traditional_cmd(gc, domid, "continue");
+        libxl__wait_for_device_model_deprecated(gc, domid, "running", NULL, NULL, NULL);
+        break;
+    }
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+        if (libxl__qmp_resume(gc, domid))
+            return ERROR_FAIL;
+        break;
+    default:
+        return ERROR_INVAL;
+    }
+
+    return 0;
+}
+
+int libxl__domain_resume(libxl__gc *gc, uint32_t domid, int suspend_cancel)
+{
+    int rc = 0;
+
+    if (xc_domain_resume(CTX->xch, domid, suspend_cancel)) {
+        LOGE(ERROR, "xc_domain_resume failed for domain %u", domid);
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    libxl_domain_type type = libxl__domain_type(gc, domid);
+    if (type == LIBXL_DOMAIN_TYPE_INVALID) {
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    if (type == LIBXL_DOMAIN_TYPE_HVM) {
+        rc = libxl__domain_resume_device_model(gc, domid);
+        if (rc) {
+            LOG(ERROR, "failed to resume device model for domain %u:%d",
+                domid, rc);
+            goto out;
+        }
+    }
+
+    if (!xs_resume_domain(CTX->xsh, domid)) {
+        LOGE(ERROR, "xs_resume_domain failed for domain %u", domid);
+        rc = ERROR_FAIL;
+    }
+out:
+    return rc;
+}
+
 /*
  * Local variables:
  * mode: C
-- 
1.9.1


* [PATCH v4 --for 4.6 COLOPre 04/25] tools/libxl: rename remus checkpoint callbacks
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (2 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 03/25] tools/libxl: move domain resume " Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 11:17   ` Ian Campbell
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 05/25] libxl/remus: introduce libxl__remus_setup Yang Hongyang
                   ` (21 subsequent siblings)
  25 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, Ian Jackson

There are two Remus checkpoint callbacks (save and restore). Currently
both are named libxl__remus_domain_checkpoint_callback, which is only
OK because they are static functions in different files. A following
patch moves all of the Remus callback code into a separate file, so
the names must be different. Rename them to:
  libxl__remus_domain_{save/restore}_checkpoint_callback
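
As an illustration only (not part of the patch), the sketch below shows why
distinct names are needed once both callbacks share one file: each side
installs its own function in a callback table. struct demo_callbacks and
all demo_* names are hypothetical, simplified stand-ins for the real libxl
types.

```c
/* Hypothetical, simplified stand-in for the save/restore callback table. */
struct demo_callbacks {
    void (*checkpoint)(void *data);
};

enum demo_side { DEMO_NONE, DEMO_SAVE, DEMO_RESTORE };

/* Two callbacks with distinct names can coexist in a single file. */
void demo_remus_save_checkpoint_callback(void *data)
{
    *(enum demo_side *)data = DEMO_SAVE;
}

void demo_remus_restore_checkpoint_callback(void *data)
{
    *(enum demo_side *)data = DEMO_RESTORE;
}

/* Install the callback that matches the side being set up. */
void demo_install_checkpoint(struct demo_callbacks *cb, int is_restore)
{
    cb->checkpoint = is_restore ? demo_remus_restore_checkpoint_callback
                                : demo_remus_save_checkpoint_callback;
}
```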

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_create.c | 4 ++--
 tools/libxl/libxl_dom.c    | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 5b4d333..a32e3df 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -677,7 +677,7 @@ static int store_libxl_entry(libxl__gc *gc, uint32_t domid,
 static void remus_checkpoint_stream_done(
     libxl__egc *egc, libxl__stream_read_state *srs, int rc);
 
-static void libxl__remus_domain_checkpoint_callback(void *data)
+static void libxl__remus_domain_restore_checkpoint_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
     libxl__domain_create_state *dcs = shs->caller_state;
@@ -989,7 +989,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
     }
 
     /* Restore */
-    callbacks->checkpoint = libxl__remus_domain_checkpoint_callback;
+    callbacks->checkpoint = libxl__remus_domain_restore_checkpoint_callback;
 
     rc = libxl__build_pre(gc, domid, d_config, state);
     if (rc)
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 0788309..9c61fa7 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1586,7 +1586,7 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
                                   const struct timeval *requested_abs,
                                   int rc);
 
-static void libxl__remus_domain_checkpoint_callback(void *data)
+static void libxl__remus_domain_save_checkpoint_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
     libxl__domain_suspend_state *dss = shs->caller_state;
@@ -1749,7 +1749,7 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
     if (r_info != NULL) {
         callbacks->suspend = libxl__remus_domain_suspend_callback;
         callbacks->postcopy = libxl__remus_domain_resume_callback;
-        callbacks->checkpoint = libxl__remus_domain_checkpoint_callback;
+        callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
         dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
     } else
         callbacks->suspend = libxl__domain_suspend_callback;
-- 
1.9.1


* [PATCH v4 --for 4.6 COLOPre 05/25] libxl/remus: introduce libxl__remus_setup
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (3 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 04/25] tools/libxl: rename remus checkpoint callbacks Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 11:26   ` Ian Campbell
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 06/25] libxl/remus: introduce libxl__remus_teardown Yang Hongyang
                   ` (20 subsequent siblings)
  25 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, Ian Jackson

Refactor Remus setup by introducing libxl__remus_setup().
All Remus setup work is done in this function.

Also remove the libxl__ prefix from static functions.
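
One consequence worth spelling out: because the setup function runs after
the point of no return, it cannot return an error code and has to report
failure through the completion callback instead. A minimal sketch of that
pattern follows (not part of the patch). struct demo_state, DEMO_ERROR_FAIL
and the other demo_* names are hypothetical and only loosely modelled on
libxl__domain_suspend_state.

```c
struct demo_state;
typedef void (*demo_done_fn)(struct demo_state *st, int rc);

/* Hypothetical, cut-down state object. */
struct demo_state {
    int netbuf_requested;       /* caller asked for network buffering */
    int netbuf_supported;       /* build/host supports network buffering */
    unsigned device_kind_flags; /* which device kinds to set up */
    demo_done_fn callback;      /* completion callback */
    int last_rc;                /* recorded by demo_on_done() */
    int devices_setup_started;  /* set once device setup is kicked off */
};

#define DEMO_ERROR_FAIL (-3)    /* illustrative error value */
#define DEMO_KIND_VIF   1u      /* illustrative device-kind bit */

/* Example completion callback: just record the result. */
void demo_on_done(struct demo_state *st, int rc)
{
    st->last_rc = rc;
}

/* Returns void: errors are delivered via st->callback, not a return value. */
void demo_remus_setup(struct demo_state *st)
{
    if (st->netbuf_requested) {
        if (!st->netbuf_supported) {
            st->callback(st, DEMO_ERROR_FAIL);
            return;
        }
        st->device_kind_flags |= DEMO_KIND_VIF;
    }
    /* Stands in for starting the asynchronous device setup. */
    st->devices_setup_started = 1;
}
```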

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl.c | 46 ++++++++++++++++++++++++++++++----------------
 1 file changed, 30 insertions(+), 16 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 69a6937..acb5639 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -795,10 +795,12 @@ out:
     return ptr;
 }
 
-static void libxl__remus_setup_done(libxl__egc *egc,
-                                    libxl__remus_devices_state *rds, int rc);
-static void libxl__remus_setup_failed(libxl__egc *egc,
-                                      libxl__remus_devices_state *rds, int rc);
+static void libxl__remus_setup(libxl__egc *egc,
+                               libxl__domain_suspend_state *dss);
+static void remus_setup_done(libxl__egc *egc,
+                             libxl__remus_devices_state *rds, int rc);
+static void remus_setup_failed(libxl__egc *egc,
+                               libxl__remus_devices_state *rds, int rc);
 static void remus_failover_cb(libxl__egc *egc,
                               libxl__domain_suspend_state *dss, int rc);
 
@@ -847,13 +849,26 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
 
     assert(info);
 
+    /* Point of no return */
+    libxl__remus_setup(egc, dss);
+    return AO_INPROGRESS;
+
+ out:
+    return AO_CREATE_FAIL(rc);
+}
+
+static void libxl__remus_setup(libxl__egc *egc,
+                               libxl__domain_suspend_state *dss)
+{
     /* Convenience aliases */
     libxl__remus_devices_state *const rds = &dss->rds;
+    const libxl_domain_remus_info *const info = dss->remus;
+
+    STATE_AO_GC(dss->ao);
 
     if (libxl_defbool_val(info->netbuf)) {
         if (!libxl__netbuffer_enabled(gc)) {
             LOG(ERROR, "Remus: No support for network buffering");
-            rc = ERROR_FAIL;
             goto out;
         }
         rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
@@ -863,19 +878,18 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
         rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
 
     rds->ao = ao;
-    rds->domid = domid;
-    rds->callback = libxl__remus_setup_done;
+    rds->domid = dss->domid;
+    rds->callback = remus_setup_done;
 
-    /* Point of no return */
     libxl__remus_devices_setup(egc, rds);
-    return AO_INPROGRESS;
+    return;
 
- out:
-    return AO_CREATE_FAIL(rc);
+out:
+    dss->callback(egc, dss, ERROR_FAIL);
 }
 
-static void libxl__remus_setup_done(libxl__egc *egc,
-                                    libxl__remus_devices_state *rds, int rc)
+static void remus_setup_done(libxl__egc *egc,
+                             libxl__remus_devices_state *rds, int rc)
 {
     libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
     STATE_AO_GC(dss->ao);
@@ -887,12 +901,12 @@ static void libxl__remus_setup_done(libxl__egc *egc,
 
     LOG(ERROR, "Remus: failed to setup device for guest with domid %u, rc %d",
         dss->domid, rc);
-    rds->callback = libxl__remus_setup_failed;
+    rds->callback = remus_setup_failed;
     libxl__remus_devices_teardown(egc, rds);
 }
 
-static void libxl__remus_setup_failed(libxl__egc *egc,
-                                      libxl__remus_devices_state *rds, int rc)
+static void remus_setup_failed(libxl__egc *egc,
+                               libxl__remus_devices_state *rds, int rc)
 {
     libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
     STATE_AO_GC(dss->ao);
-- 
1.9.1


* [PATCH v4 --for 4.6 COLOPre 06/25] libxl/remus: introduce libxl__remus_teardown
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (4 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 05/25] libxl/remus: introduce libxl__remus_setup Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 11:59   ` Ian Campbell
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 07/25] libxl/remus: init checkpoint_callback in Remus checkpoint callback Yang Hongyang
                   ` (19 subsequent siblings)
  25 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, Ian Jackson

Introduce libxl__remus_teardown() to tear down Remus devices.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_dom.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 9c61fa7..77a917c 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1865,6 +1865,9 @@ static void save_device_model_datacopier_done(libxl__egc *egc,
     dss->save_dm_callback(egc, dss, our_rc);
 }
 
+static void libxl__remus_teardown(libxl__egc *egc,
+                                  libxl__domain_suspend_state *dss,
+                                  int rc);
 static void remus_teardown_done(libxl__egc *egc,
                                        libxl__remus_devices_state *rds,
                                        int rc);
@@ -1894,6 +1897,15 @@ static void domain_save_done(libxl__egc *egc,
      * from sending checkpoints. Teardown the network buffers and
      * release netlink resources.  This is an async op.
      */
+    libxl__remus_teardown(egc, dss, rc);
+}
+
+static void libxl__remus_teardown(libxl__egc *egc,
+                                  libxl__domain_suspend_state *dss,
+                                  int rc)
+{
+    EGC_GC;
+
     LOG(WARN, "Remus: Domain suspend terminated with rc %d,"
         " teardown Remus devices...", rc);
     dss->rds.callback = remus_teardown_done;
-- 
1.9.1


* [PATCH v4 --for 4.6 COLOPre 07/25] libxl/remus: init checkpoint_callback in Remus checkpoint callback
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (5 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 06/25] libxl/remus: introduce libxl__remus_teardown Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 12:02   ` Ian Campbell
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 08/25] tools/libxl: move remus code into libxl_remus.c Yang Hongyang
                   ` (18 subsequent siblings)
  25 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, Ian Jackson

Initialize the stream {read/write} state's checkpoint_callback in the
Remus checkpoint callback, instead of in domcreate_bootloader_done()
and libxl__domain_save().
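
A rough sketch of the resulting shape (not part of the patch): the generic
stream setup no longer knows about Remus, and the Remus-specific callback
installs its handler just before starting a checkpoint. struct demo_stream
and the demo_* functions are hypothetical stand-ins for the stream state
and the libxl__stream_*_start_checkpoint() calls.

```c
#include <stddef.h>

struct demo_stream;
typedef void (*demo_checkpoint_fn)(struct demo_stream *s, int rc);

/* Hypothetical, cut-down stream state. */
struct demo_stream {
    demo_checkpoint_fn checkpoint_callback; /* NULL until a checkpoint starts */
    int checkpoints_done;
};

/* Handler invoked when a checkpoint has been written. */
void demo_checkpoint_written(struct demo_stream *s, int rc)
{
    if (!rc)
        s->checkpoints_done++;
}

/* Stand-in for starting a checkpoint; completes immediately here. */
void demo_stream_start_checkpoint(struct demo_stream *s)
{
    s->checkpoint_callback(s, 0);
}

/* The checkpoint entry point sets the handler itself, so generic
 * stream setup code does not need to. */
void demo_save_checkpoint(struct demo_stream *s)
{
    s->checkpoint_callback = demo_checkpoint_written;
    demo_stream_start_checkpoint(s);
}
```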

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_create.c | 2 +-
 tools/libxl/libxl_dom.c    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index a32e3df..94fe98f 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -684,6 +684,7 @@ static void libxl__remus_domain_restore_checkpoint_callback(void *data)
     libxl__egc *egc = shs->egc;
     STATE_AO_GC(dcs->ao);
 
+    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
     libxl__stream_read_start_checkpoint(egc, &dcs->srs);
 }
 
@@ -1000,7 +1001,6 @@ static void domcreate_bootloader_done(libxl__egc *egc,
     dcs->srs.fd = restore_fd;
     dcs->srs.legacy = (dcs->restore_params.stream_version == 1);
     dcs->srs.completion_callback = domcreate_stream_done;
-    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
 
     libxl__stream_read_start(egc, &dcs->srs);
     return;
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 77a917c..1740bed 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1593,6 +1593,7 @@ static void libxl__remus_domain_save_checkpoint_callback(void *data)
     libxl__egc *egc = shs->egc;
     STATE_AO_GC(dss->ao);
 
+    dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
     libxl__stream_write_start_checkpoint(egc, &dss->sws);
 }
 
@@ -1750,7 +1751,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
         callbacks->suspend = libxl__remus_domain_suspend_callback;
         callbacks->postcopy = libxl__remus_domain_resume_callback;
         callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
-        dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
     } else
         callbacks->suspend = libxl__domain_suspend_callback;
 
-- 
1.9.1


* [PATCH v4 --for 4.6 COLOPre 08/25] tools/libxl: move remus code into libxl_remus.c
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (6 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 07/25] libxl/remus: init checkpoint_callback in Remus checkpoint callback Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 12:05   ` Ian Campbell
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 09/25] tools/libxl: move save/restore code into libxl_dom_save.c Yang Hongyang
                   ` (17 subsequent siblings)
  25 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, Ian Jackson

After the previous refactoring, we can now move all Remus code
into a separate file, libxl_remus.c.

Export the following functions for internal use:
- Remus callbacks
  * libxl__remus_domain_suspend_callback
  * libxl__remus_domain_resume_callback
  * libxl__remus_domain_save_checkpoint_callback
  * libxl__remus_domain_restore_checkpoint_callback
- setup/teardown Remus:
  * libxl__remus_setup
  * libxl__remus_teardown

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/Makefile         |   2 +-
 tools/libxl/libxl.c          |  67 ---------
 tools/libxl/libxl_create.c   |  22 ---
 tools/libxl/libxl_dom.c      | 223 ----------------------------
 tools/libxl/libxl_internal.h |  12 ++
 tools/libxl/libxl_remus.c    | 339 +++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 352 insertions(+), 313 deletions(-)
 create mode 100644 tools/libxl/libxl_remus.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 4a5957e..b10f4e7 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -62,7 +62,7 @@ else
 LIBXL_OBJS-y += libxl_no_convert_callout.o
 endif
 
-LIBXL_OBJS-y += libxl_remus_device.o libxl_remus_disk_drbd.o
+LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o
 
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index acb5639..f1237d8 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -795,12 +795,6 @@ out:
     return ptr;
 }
 
-static void libxl__remus_setup(libxl__egc *egc,
-                               libxl__domain_suspend_state *dss);
-static void remus_setup_done(libxl__egc *egc,
-                             libxl__remus_devices_state *rds, int rc);
-static void remus_setup_failed(libxl__egc *egc,
-                               libxl__remus_devices_state *rds, int rc);
 static void remus_failover_cb(libxl__egc *egc,
                               libxl__domain_suspend_state *dss, int rc);
 
@@ -857,67 +851,6 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
     return AO_CREATE_FAIL(rc);
 }
 
-static void libxl__remus_setup(libxl__egc *egc,
-                               libxl__domain_suspend_state *dss)
-{
-    /* Convenience aliases */
-    libxl__remus_devices_state *const rds = &dss->rds;
-    const libxl_domain_remus_info *const info = dss->remus;
-
-    STATE_AO_GC(dss->ao);
-
-    if (libxl_defbool_val(info->netbuf)) {
-        if (!libxl__netbuffer_enabled(gc)) {
-            LOG(ERROR, "Remus: No support for network buffering");
-            goto out;
-        }
-        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
-    }
-
-    if (libxl_defbool_val(info->diskbuf))
-        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
-
-    rds->ao = ao;
-    rds->domid = dss->domid;
-    rds->callback = remus_setup_done;
-
-    libxl__remus_devices_setup(egc, rds);
-    return;
-
-out:
-    dss->callback(egc, dss, ERROR_FAIL);
-}
-
-static void remus_setup_done(libxl__egc *egc,
-                             libxl__remus_devices_state *rds, int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-    STATE_AO_GC(dss->ao);
-
-    if (!rc) {
-        libxl__domain_save(egc, dss);
-        return;
-    }
-
-    LOG(ERROR, "Remus: failed to setup device for guest with domid %u, rc %d",
-        dss->domid, rc);
-    rds->callback = remus_setup_failed;
-    libxl__remus_devices_teardown(egc, rds);
-}
-
-static void remus_setup_failed(libxl__egc *egc,
-                               libxl__remus_devices_state *rds, int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-    STATE_AO_GC(dss->ao);
-
-    if (rc)
-        LOG(ERROR, "Remus: failed to teardown device after setup failed"
-            " for guest with domid %u, rc %d", dss->domid, rc);
-
-    dss->callback(egc, dss, rc);
-}
-
 static void remus_failover_cb(libxl__egc *egc,
                               libxl__domain_suspend_state *dss, int rc)
 {
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 94fe98f..cbd7693 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -672,28 +672,6 @@ static int store_libxl_entry(libxl__gc *gc, uint32_t domid,
         libxl_device_model_version_to_string(b_info->device_model_version));
 }
 
-/*----- remus asynchronous checkpoint callback -----*/
-
-static void remus_checkpoint_stream_done(
-    libxl__egc *egc, libxl__stream_read_state *srs, int rc);
-
-static void libxl__remus_domain_restore_checkpoint_callback(void *data)
-{
-    libxl__save_helper_state *shs = data;
-    libxl__domain_create_state *dcs = shs->caller_state;
-    libxl__egc *egc = shs->egc;
-    STATE_AO_GC(dcs->ao);
-
-    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
-    libxl__stream_read_start_checkpoint(egc, &dcs->srs);
-}
-
-static void remus_checkpoint_stream_done(
-    libxl__egc *egc, libxl__stream_read_state *stream, int rc)
-{
-    libxl__xc_domain_saverestore_async_callback_done(egc, &stream->shs, rc);
-}
-
 /*----- main domain creation -----*/
 
 /* We have a linear control flow; only one event callback is
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 1740bed..ad5e810 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1487,196 +1487,6 @@ out:
     return ret;
 }
 
-/*----- remus callbacks -----*/
-static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int ok);
-static void remus_devices_postsuspend_cb(libxl__egc *egc,
-                                         libxl__remus_devices_state *rds,
-                                         int rc);
-static void remus_devices_preresume_cb(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
-                                       int rc);
-
-static void libxl__remus_domain_suspend_callback(void *data)
-{
-    libxl__save_helper_state *shs = data;
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-
-    dss->callback_common_done = remus_domain_suspend_callback_common_done;
-    libxl__domain_suspend(egc, dss);
-}
-
-static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int rc)
-{
-    if (rc)
-        goto out;
-
-    libxl__remus_devices_state *const rds = &dss->rds;
-    rds->callback = remus_devices_postsuspend_cb;
-    libxl__remus_devices_postsuspend(egc, rds);
-    return;
-
-out:
-    dss->rc = rc;
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
-}
-
-static void remus_devices_postsuspend_cb(libxl__egc *egc,
-                                         libxl__remus_devices_state *rds,
-                                         int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-
-    if (rc)
-        goto out;
-
-    rc = 0;
-
-out:
-    if (rc)
-        dss->rc = rc;
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
-}
-
-static void libxl__remus_domain_resume_callback(void *data)
-{
-    libxl__save_helper_state *shs = data;
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    STATE_AO_GC(dss->ao);
-
-    libxl__remus_devices_state *const rds = &dss->rds;
-    rds->callback = remus_devices_preresume_cb;
-    libxl__remus_devices_preresume(egc, rds);
-}
-
-static void remus_devices_preresume_cb(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
-                                       int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-    STATE_AO_GC(dss->ao);
-
-    if (rc)
-        goto out;
-
-    /* Resumes the domain and the device model */
-    rc = libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1);
-    if (rc)
-        goto out;
-
-    rc = 0;
-
-out:
-    if (rc)
-        dss->rc = rc;
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
-}
-
-/*----- remus asynchronous checkpoint callback -----*/
-
-static void remus_checkpoint_stream_written(
-    libxl__egc *egc, libxl__stream_write_state *sws, int rc);
-static void remus_devices_commit_cb(libxl__egc *egc,
-                                    libxl__remus_devices_state *rds,
-                                    int rc);
-static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
-                                  const struct timeval *requested_abs,
-                                  int rc);
-
-static void libxl__remus_domain_save_checkpoint_callback(void *data)
-{
-    libxl__save_helper_state *shs = data;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    libxl__egc *egc = shs->egc;
-    STATE_AO_GC(dss->ao);
-
-    dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
-    libxl__stream_write_start_checkpoint(egc, &dss->sws);
-}
-
-static void remus_checkpoint_stream_written(
-    libxl__egc *egc, libxl__stream_write_state *sws, int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(sws, *dss, sws);
-
-    /* Convenience aliases */
-    libxl__remus_devices_state *const rds = &dss->rds;
-
-    STATE_AO_GC(dss->ao);
-
-    if (rc) {
-        LOG(ERROR, "Failed to save device model. Terminating Remus..");
-        goto out;
-    }
-
-    rds->callback = remus_devices_commit_cb;
-    libxl__remus_devices_commit(egc, rds);
-
-    return;
-
-out:
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
-}
-
-static void remus_devices_commit_cb(libxl__egc *egc,
-                                    libxl__remus_devices_state *rds,
-                                    int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-
-    STATE_AO_GC(dss->ao);
-
-    if (rc) {
-        LOG(ERROR, "Failed to do device commit op."
-            " Terminating Remus..");
-        goto out;
-    }
-
-    /*
-     * At this point, we have successfully checkpointed the guest and
-     * committed it at the backup. We'll come back after the checkpoint
-     * interval to checkpoint the guest again. Until then, let the guest
-     * continue execution.
-     */
-
-    /* Set checkpoint interval timeout */
-    rc = libxl__ev_time_register_rel(ao, &dss->checkpoint_timeout,
-                                     remus_next_checkpoint,
-                                     dss->interval);
-
-    if (rc)
-        goto out;
-
-    return;
-
-out:
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
-}
-
-static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
-                                  const struct timeval *requested_abs,
-                                  int rc)
-{
-    libxl__domain_suspend_state *dss =
-                            CONTAINER_OF(ev, *dss, checkpoint_timeout);
-
-    STATE_AO_GC(dss->ao);
-
-    /*
-     * Time to checkpoint the guest again. We return 1 to libxc
-     * (xc_domain_save.c). in order to continue executing the infinite loop
-     * (suspend, checkpoint, resume) in xc_domain_save().
-     */
-
-    if (rc)
-        dss->rc = rc;
-
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
-}
-
 /*----- main code for saving, in order of execution -----*/
 
 void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
@@ -1865,13 +1675,6 @@ static void save_device_model_datacopier_done(libxl__egc *egc,
     dss->save_dm_callback(egc, dss, our_rc);
 }
 
-static void libxl__remus_teardown(libxl__egc *egc,
-                                  libxl__domain_suspend_state *dss,
-                                  int rc);
-static void remus_teardown_done(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
-                                       int rc);
-
 static void domain_save_done(libxl__egc *egc,
                              libxl__domain_suspend_state *dss, int rc)
 {
@@ -1900,32 +1703,6 @@ static void domain_save_done(libxl__egc *egc,
     libxl__remus_teardown(egc, dss, rc);
 }
 
-static void libxl__remus_teardown(libxl__egc *egc,
-                                  libxl__domain_suspend_state *dss,
-                                  int rc)
-{
-    EGC_GC;
-
-    LOG(WARN, "Remus: Domain suspend terminated with rc %d,"
-        " teardown Remus devices...", rc);
-    dss->rds.callback = remus_teardown_done;
-    libxl__remus_devices_teardown(egc, &dss->rds);
-}
-
-static void remus_teardown_done(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
-                                       int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-    STATE_AO_GC(dss->ao);
-
-    if (rc)
-        LOG(ERROR, "Remus: failed to teardown device for guest with domid %u,"
-            " rc %d", dss->domid, rc);
-
-    dss->callback(egc, dss, rc);
-}
-
 /*==================== Miscellaneous ====================*/
 
 char *libxl__uuid2string(libxl__gc *gc, const libxl_uuid uuid)
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 3a2ef00..7ce3eca 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3445,6 +3445,18 @@ _hidden void libxl__domain_suspend(libxl__egc *egc,
 /* used by libxc to suspend the guest during migration */
 _hidden void libxl__domain_suspend_callback(void *data);
 
+/* Remus callbacks for save */
+_hidden void libxl__remus_domain_suspend_callback(void *data);
+_hidden void libxl__remus_domain_resume_callback(void *data);
+_hidden void libxl__remus_domain_save_checkpoint_callback(void *data);
+/* Remus callbacks for restore */
+_hidden void libxl__remus_domain_restore_checkpoint_callback(void *data);
+/* Remus setup and teardown */
+_hidden void libxl__remus_setup(libxl__egc *egc,
+                                libxl__domain_suspend_state *dss);
+_hidden void libxl__remus_teardown(libxl__egc *egc,
+                                   libxl__domain_suspend_state *dss,
+                                   int rc);
 
 /*
  * Convenience macros.
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
new file mode 100644
index 0000000..b7fa022
--- /dev/null
+++ b/tools/libxl/libxl_remus.c
@@ -0,0 +1,339 @@
+/*
+ * Copyright (C) 2009      Citrix Ltd.
+ * Author Vincent Hanquez <vincent.hanquez@eu.citrix.com>
+ *        Yang Hongyang <yanghy@cn.fujitsu.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+/*-------------------- Remus setup and teardown ---------------------*/
+
+static void remus_setup_done(libxl__egc *egc,
+                             libxl__remus_devices_state *rds, int rc);
+static void remus_setup_failed(libxl__egc *egc,
+                               libxl__remus_devices_state *rds, int rc);
+
+void libxl__remus_setup(libxl__egc *egc,
+                        libxl__domain_suspend_state *dss)
+{
+    /* Convenience aliases */
+    libxl__remus_devices_state *const rds = &dss->rds;
+    const libxl_domain_remus_info *const info = dss->remus;
+
+    STATE_AO_GC(dss->ao);
+
+    if (libxl_defbool_val(info->netbuf)) {
+        if (!libxl__netbuffer_enabled(gc)) {
+            LOG(ERROR, "Remus: No support for network buffering");
+            goto out;
+        }
+        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
+    }
+
+    if (libxl_defbool_val(info->diskbuf))
+        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
+
+    rds->ao = ao;
+    rds->domid = dss->domid;
+    rds->callback = remus_setup_done;
+
+    libxl__remus_devices_setup(egc, rds);
+    return;
+
+out:
+    dss->callback(egc, dss, ERROR_FAIL);
+}
+
+static void remus_setup_done(libxl__egc *egc,
+                             libxl__remus_devices_state *rds, int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    STATE_AO_GC(dss->ao);
+
+    if (!rc) {
+        libxl__domain_save(egc, dss);
+        return;
+    }
+
+    LOG(ERROR, "Remus: failed to setup device for guest with domid %u, rc %d",
+        dss->domid, rc);
+    rds->callback = remus_setup_failed;
+    libxl__remus_devices_teardown(egc, rds);
+}
+
+static void remus_setup_failed(libxl__egc *egc,
+                               libxl__remus_devices_state *rds, int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    STATE_AO_GC(dss->ao);
+
+    if (rc)
+        LOG(ERROR, "Remus: failed to teardown device after setup failed"
+            " for guest with domid %u, rc %d", dss->domid, rc);
+
+    dss->callback(egc, dss, rc);
+}
+
+static void remus_teardown_done(libxl__egc *egc,
+                                libxl__remus_devices_state *rds,
+                                int rc);
+void libxl__remus_teardown(libxl__egc *egc,
+                           libxl__domain_suspend_state *dss,
+                           int rc)
+{
+    EGC_GC;
+
+    LOG(WARN, "Remus: Domain suspend terminated with rc %d,"
+        " teardown Remus devices...", rc);
+    dss->rds.callback = remus_teardown_done;
+    libxl__remus_devices_teardown(egc, &dss->rds);
+}
+
+static void remus_teardown_done(libxl__egc *egc,
+                                libxl__remus_devices_state *rds,
+                                int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    STATE_AO_GC(dss->ao);
+
+    if (rc)
+        LOG(ERROR, "Remus: failed to teardown device for guest with domid %u,"
+            " rc %d", dss->domid, rc);
+
+    dss->callback(egc, dss, rc);
+}
+
+/*---------------------- remus callbacks (save) -----------------------*/
+
+static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
+                                libxl__domain_suspend_state *dss, int ok);
+static void remus_devices_postsuspend_cb(libxl__egc *egc,
+                                         libxl__remus_devices_state *rds,
+                                         int rc);
+static void remus_devices_preresume_cb(libxl__egc *egc,
+                                       libxl__remus_devices_state *rds,
+                                       int rc);
+
+void libxl__remus_domain_suspend_callback(void *data)
+{
+    libxl__save_helper_state *shs = data;
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+
+    dss->callback_common_done = remus_domain_suspend_callback_common_done;
+    libxl__domain_suspend(egc, dss);
+}
+
+static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
+                                libxl__domain_suspend_state *dss, int rc)
+{
+    if (rc)
+        goto out;
+
+    libxl__remus_devices_state *const rds = &dss->rds;
+    rds->callback = remus_devices_postsuspend_cb;
+    libxl__remus_devices_postsuspend(egc, rds);
+    return;
+
+out:
+    dss->rc = rc;
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
+}
+
+static void remus_devices_postsuspend_cb(libxl__egc *egc,
+                                         libxl__remus_devices_state *rds,
+                                         int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+
+    if (rc)
+        goto out;
+
+    rc = 0;
+
+out:
+    if (rc)
+        dss->rc = rc;
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
+}
+
+void libxl__remus_domain_resume_callback(void *data)
+{
+    libxl__save_helper_state *shs = data;
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    STATE_AO_GC(dss->ao);
+
+    libxl__remus_devices_state *const rds = &dss->rds;
+    rds->callback = remus_devices_preresume_cb;
+    libxl__remus_devices_preresume(egc, rds);
+}
+
+static void remus_devices_preresume_cb(libxl__egc *egc,
+                                       libxl__remus_devices_state *rds,
+                                       int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    STATE_AO_GC(dss->ao);
+
+    if (rc)
+        goto out;
+
+    /* Resumes the domain and the device model */
+    rc = libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1);
+    if (rc)
+        goto out;
+
+    rc = 0;
+
+out:
+    if (rc)
+        dss->rc = rc;
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
+}
+
+/*----- remus asynchronous checkpoint callback -----*/
+
+static void remus_checkpoint_stream_written(
+    libxl__egc *egc, libxl__stream_write_state *sws, int rc);
+static void remus_devices_commit_cb(libxl__egc *egc,
+                                    libxl__remus_devices_state *rds,
+                                    int rc);
+static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
+                                  const struct timeval *requested_abs,
+                                  int rc);
+
+void libxl__remus_domain_save_checkpoint_callback(void *data)
+{
+    libxl__save_helper_state *shs = data;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__egc *egc = shs->egc;
+    STATE_AO_GC(dss->ao);
+
+    dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
+    libxl__stream_write_start_checkpoint(egc, &dss->sws);
+}
+
+static void remus_checkpoint_stream_written(
+    libxl__egc *egc, libxl__stream_write_state *sws, int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(sws, *dss, sws);
+
+    /* Convenience aliases */
+    libxl__remus_devices_state *const rds = &dss->rds;
+
+    STATE_AO_GC(dss->ao);
+
+    if (rc) {
+        LOG(ERROR, "Failed to save device model. Terminating Remus..");
+        goto out;
+    }
+
+    rds->callback = remus_devices_commit_cb;
+    libxl__remus_devices_commit(egc, rds);
+
+    return;
+
+out:
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
+}
+
+static void remus_devices_commit_cb(libxl__egc *egc,
+                                    libxl__remus_devices_state *rds,
+                                    int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+
+    STATE_AO_GC(dss->ao);
+
+    if (rc) {
+        LOG(ERROR, "Failed to do device commit op."
+            " Terminating Remus..");
+        goto out;
+    }
+
+    /*
+     * At this point, we have successfully checkpointed the guest and
+     * committed it at the backup. We'll come back after the checkpoint
+     * interval to checkpoint the guest again. Until then, let the guest
+     * continue execution.
+     */
+
+    /* Set checkpoint interval timeout */
+    rc = libxl__ev_time_register_rel(ao, &dss->checkpoint_timeout,
+                                     remus_next_checkpoint,
+                                     dss->interval);
+
+    if (rc)
+        goto out;
+
+    return;
+
+out:
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
+}
+
+static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
+                                  const struct timeval *requested_abs,
+                                  int rc)
+{
+    libxl__domain_suspend_state *dss =
+                            CONTAINER_OF(ev, *dss, checkpoint_timeout);
+
+    STATE_AO_GC(dss->ao);
+
+    /*
+     * Time to checkpoint the guest again. We return 1 to libxc
+     * (xc_domain_save.c). in order to continue executing the infinite loop
+     * (suspend, checkpoint, resume) in xc_domain_save().
+     */
+
+    if (rc)
+        dss->rc = rc;
+
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
+}
+
+/*---------------------- remus callbacks (restore) -----------------------*/
+
+/*----- remus asynchronous checkpoint callback -----*/
+
+static void remus_checkpoint_stream_done(
+    libxl__egc *egc, libxl__stream_read_state *srs, int rc);
+
+void libxl__remus_domain_restore_checkpoint_callback(void *data)
+{
+    libxl__save_helper_state *shs = data;
+    libxl__domain_create_state *dcs = shs->caller_state;
+    libxl__egc *egc = shs->egc;
+    STATE_AO_GC(dcs->ao);
+
+    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
+    libxl__stream_read_start_checkpoint(egc, &dcs->srs);
+}
+
+static void remus_checkpoint_stream_done(
+    libxl__egc *egc, libxl__stream_read_state *stream, int rc)
+{
+    libxl__xc_domain_saverestore_async_callback_done(egc, &stream->shs, rc);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.9.1


* [PATCH v4 --for 4.6 COLOPre 09/25] tools/libxl: move save/restore code into libxl_dom_save.c
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (7 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 08/25] tools/libxl: move remus code into libxl_remus.c Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 10/25] libxl/save: Refactor libxl__domain_suspend_state Yang Hongyang
                   ` (16 subsequent siblings)
  25 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, Ian Jackson

This is purely code motion.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
---
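Note on the toolstack record layout described by the "Version 1"
comment in the moved code: a uint32_t version, then a uint32_t count,
then count physmap entries, each a fixed header followed by namelen
bytes of name. The following is only a standalone sketch of walking such
a buffer, not the libxl implementation. The sketch_* names are
hypothetical, and the per-entry bounds check on namelen is an assumption
of this sketch rather than something the moved code does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the field layout of struct libxl__physmap_info in the patch. */
struct sketch_physmap_info {
    uint64_t phys_offset;
    uint64_t start_addr;
    uint64_t size;
    uint32_t namelen;
    char name[];
};

/* Walk a version-1 toolstack buffer and return the number of entries,
 * or -1 if the buffer is truncated or has an unknown version. */
int sketch_count_entries(const uint8_t *buf, size_t size)
{
    const size_t hdr = sizeof(struct sketch_physmap_info);
    uint32_t version, count, namelen, i;
    size_t off = 0;

    assert(buf != NULL);

    if (size - off < sizeof(version))
        return -1;
    memcpy(&version, buf + off, sizeof(version));
    off += sizeof(version);
    if (version != 1)
        return -1;

    if (size - off < sizeof(count))
        return -1;
    memcpy(&count, buf + off, sizeof(count));
    off += sizeof(count);

    for (i = 0; i < count; i++) {
        /* The fixed part of the entry must fit in what is left. */
        if (size - off < hdr)
            return -1;
        memcpy(&namelen,
               buf + off + offsetof(struct sketch_physmap_info, namelen),
               sizeof(namelen));
        /* The variable-length name must fit as well. */
        if (namelen > size - off - hdr)
            return -1;
        /* Same stride as the moved code: sizeof(header) + namelen. */
        off += hdr + namelen;
    }

    return (int)count;
}
```

Reading the fields with memcpy avoids unaligned access, since the
entries are packed back to back in a byte buffer.
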
 tools/libxl/Makefile         |   2 +-
 tools/libxl/libxl_dom.c      | 672 -----------------------------------------
 tools/libxl/libxl_dom_save.c | 700 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 701 insertions(+), 673 deletions(-)
 create mode 100644 tools/libxl/libxl_dom_save.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index b10f4e7..2e4c944 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -103,7 +103,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
 			libxl_stream_read.o libxl_stream_write.o \
 			libxl_save_callout.o _libxl_save_msgs_callout.o \
 			libxl_qmp.o libxl_event.o libxl_fork.o \
-			libxl_dom_suspend.o $(LIBXL_OBJS-y)
+			libxl_dom_suspend.o libxl_dom_save.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += libxl_genid.o
 LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index ad5e810..81e6a4e 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1031,678 +1031,6 @@ int libxl__qemu_traditional_cmd(libxl__gc *gc, uint32_t domid,
     return libxl__xs_write(gc, XBT_NULL, path, "%s", cmd);
 }
 
-struct libxl__physmap_info {
-    uint64_t phys_offset;
-    uint64_t start_addr;
-    uint64_t size;
-    uint32_t namelen;
-    char name[];
-};
-
-/* Bump version every time when toolstack saved data changes.
- * Different types of data are arranged in the specified order.
- *
- * Version 1:
- *   uint32_t version
- *   QEMU physmap data:
- *     uint32_t count
- *     libxl__physmap_info * count
- */
-#define TOOLSTACK_SAVE_VERSION 1
-
-static inline char *restore_helper(libxl__gc *gc, uint32_t dm_domid,
-                                   uint32_t domid,
-                                   uint64_t phys_offset, char *node)
-{
-    return libxl__device_model_xs_path(gc, dm_domid, domid,
-                                       "/physmap/%"PRIx64"/%s",
-                                       phys_offset, node);
-}
-
-static int libxl__toolstack_restore_qemu(libxl__gc *gc, uint32_t domid,
-                                         const uint8_t *ptr, uint32_t size)
-{
-    int ret, i;
-    uint32_t count;
-    char *xs_path;
-    uint32_t dm_domid;
-    struct libxl__physmap_info *pi;
-
-    if (size < sizeof(count)) {
-        LOG(ERROR, "wrong size");
-        ret = -1;
-        goto out;
-    }
-
-    memcpy(&count, ptr, sizeof(count));
-    ptr += sizeof(count);
-
-    if (size < sizeof(count) + count*(sizeof(struct libxl__physmap_info))) {
-        LOG(ERROR, "wrong size");
-        ret = -1;
-        goto out;
-    }
-
-    dm_domid = libxl_get_stubdom_id(CTX, domid);
-    for (i = 0; i < count; i++) {
-        pi = (struct libxl__physmap_info*) ptr;
-        ptr += sizeof(struct libxl__physmap_info) + pi->namelen;
-
-        xs_path = restore_helper(gc, dm_domid, domid,
-                                 pi->phys_offset, "start_addr");
-        ret = libxl__xs_write(gc, 0, xs_path, "%"PRIx64, pi->start_addr);
-        if (ret) goto out;
-
-        xs_path = restore_helper(gc, dm_domid, domid, pi->phys_offset, "size");
-        ret = libxl__xs_write(gc, 0, xs_path, "%"PRIx64, pi->size);
-        if (ret) goto out;
-
-        if (pi->namelen > 0) {
-            xs_path = restore_helper(gc, dm_domid, domid,
-                                     pi->phys_offset, "name");
-            ret = libxl__xs_write(gc, 0, xs_path, "%s", pi->name);
-            if (ret) goto out;
-        }
-    }
-
-    ret = 0;
-out:
-    return ret;
-
-}
-
-static int libxl__toolstack_restore_v1(libxl__gc *gc, uint32_t domid,
-                                       const uint8_t *ptr, uint32_t size)
-{
-    return libxl__toolstack_restore_qemu(gc, domid, ptr, size);
-}
-
-int libxl__toolstack_restore(uint32_t domid, const uint8_t *ptr,
-                             uint32_t size, void *user)
-{
-    libxl__save_helper_state *shs = user;
-    libxl__domain_create_state *dcs = shs->caller_state;
-    STATE_AO_GC(dcs->ao);
-    int ret;
-    uint32_t version = 0, bufsize;
-
-    LOG(DEBUG,"domain=%"PRIu32" toolstack data size=%"PRIu32, domid, size);
-
-    if (size < sizeof(version)) {
-        LOG(ERROR, "wrong size");
-        ret = -1;
-        goto out;
-    }
-
-    memcpy(&version, ptr, sizeof(version));
-    ptr += sizeof(version);
-    bufsize = size - sizeof(version);
-
-    switch (version) {
-    case 1:
-        ret = libxl__toolstack_restore_v1(gc, domid, ptr, bufsize);
-        break;
-    default:
-        LOG(ERROR, "wrong version");
-        ret = -1;
-    }
-
-out:
-    return ret;
-}
-
-/*==================== Domain suspend (save) ====================*/
-
-static void stream_done(libxl__egc *egc,
-                        libxl__stream_write_state *sws, int rc);
-static void domain_save_done(libxl__egc *egc,
-                             libxl__domain_suspend_state *dss, int rc);
-
-/*----- complicated callback, called by xc_domain_save -----*/
-
-/*
- * We implement the other end of protocol for controlling qemu-dm's
- * logdirty.  There is no documentation for this protocol, but our
- * counterparty's implementation is in
- * qemu-xen-traditional.git:xenstore.c in the function
- * xenstore_process_logdirty_event
- */
-
-static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                    const struct timeval *requested_abs,
-                                    int rc);
-static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
-                            const char *watch_path, const char *event_path);
-static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss, int rc);
-
-static void logdirty_init(libxl__logdirty_switch *lds)
-{
-    lds->cmd_path = 0;
-    libxl__ev_xswatch_init(&lds->watch);
-    libxl__ev_time_init(&lds->timeout);
-}
-
-static void domain_suspend_switch_qemu_xen_traditional_logdirty
-                               (int domid, unsigned enable,
-                                libxl__save_helper_state *shs)
-{
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    libxl__logdirty_switch *lds = &dss->logdirty;
-    STATE_AO_GC(dss->ao);
-    int rc;
-    xs_transaction_t t = 0;
-    const char *got;
-
-    if (!lds->cmd_path) {
-        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
-        lds->cmd_path = libxl__device_model_xs_path(gc, dm_domid, domid,
-                                                    "/logdirty/cmd");
-        lds->ret_path = libxl__device_model_xs_path(gc, dm_domid, domid,
-                                                    "/logdirty/ret");
-    }
-    lds->cmd = enable ? "enable" : "disable";
-
-    rc = libxl__ev_xswatch_register(gc, &lds->watch,
-                                switch_logdirty_xswatch, lds->ret_path);
-    if (rc) goto out;
-
-    rc = libxl__ev_time_register_rel(ao, &lds->timeout,
-                                switch_logdirty_timeout, 10*1000);
-    if (rc) goto out;
-
-    for (;;) {
-        rc = libxl__xs_transaction_start(gc, &t);
-        if (rc) goto out;
-
-        rc = libxl__xs_read_checked(gc, t, lds->cmd_path, &got);
-        if (rc) goto out;
-
-        if (got) {
-            const char *got_ret;
-            rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got_ret);
-            if (rc) goto out;
-
-            if (!got_ret || strcmp(got, got_ret)) {
-                LOG(ERROR,"controlling logdirty: qemu was already sent"
-                    " command `%s' (xenstore path `%s') but result is `%s'",
-                    got, lds->cmd_path, got_ret ? got_ret : "<none>");
-                rc = ERROR_FAIL;
-                goto out;
-            }
-            rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
-            if (rc) goto out;
-        }
-
-        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
-        if (rc) goto out;
-
-        rc = libxl__xs_write_checked(gc, t, lds->cmd_path, lds->cmd);
-        if (rc) goto out;
-
-        rc = libxl__xs_transaction_commit(gc, &t);
-        if (!rc) break;
-        if (rc<0) goto out;
-    }
-
-    /* OK, wait for some callback */
-    return;
-
- out:
-    LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
-    libxl__xs_transaction_abort(gc, &t);
-    switch_logdirty_done(egc,dss,rc);
-}
-
-static void domain_suspend_switch_qemu_xen_logdirty
-                               (int domid, unsigned enable,
-                                libxl__save_helper_state *shs)
-{
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    STATE_AO_GC(dss->ao);
-    int rc;
-
-    rc = libxl__qmp_set_global_dirty_log(gc, domid, enable);
-    if (!rc) {
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
-    } else {
-        LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
-        dss->rc = rc;
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
-    }
-}
-
-void libxl__domain_suspend_common_switch_qemu_logdirty
-                               (int domid, unsigned enable, void *user)
-{
-    libxl__save_helper_state *shs = user;
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    STATE_AO_GC(dss->ao);
-
-    switch (libxl__device_model_version_running(gc, domid)) {
-    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
-        domain_suspend_switch_qemu_xen_traditional_logdirty(domid, enable, shs);
-        break;
-    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-        domain_suspend_switch_qemu_xen_logdirty(domid, enable, shs);
-        break;
-    default:
-        LOG(ERROR,"logdirty switch failed"
-            ", no valid device model version found, abandoning suspend");
-        dss->rc = ERROR_FAIL;
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
-    }
-}
-static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                    const struct timeval *requested_abs,
-                                    int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
-    STATE_AO_GC(dss->ao);
-    LOG(ERROR,"logdirty switch: wait for device model timed out");
-    switch_logdirty_done(egc,dss,ERROR_FAIL);
-}
-
-static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
-                            const char *watch_path, const char *event_path)
-{
-    libxl__domain_suspend_state *dss =
-        CONTAINER_OF(watch, *dss, logdirty.watch);
-    libxl__logdirty_switch *lds = &dss->logdirty;
-    STATE_AO_GC(dss->ao);
-    const char *got;
-    xs_transaction_t t = 0;
-    int rc;
-
-    for (;;) {
-        rc = libxl__xs_transaction_start(gc, &t);
-        if (rc) goto out;
-
-        rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got);
-        if (rc) goto out;
-
-        if (!got) {
-            rc = +1;
-            goto out;
-        }
-
-        if (strcmp(got, lds->cmd)) {
-            LOG(ERROR,"logdirty switch: sent command `%s' but got reply `%s'"
-                " (xenstore paths `%s' / `%s')", lds->cmd, got,
-                lds->cmd_path, lds->ret_path);
-            rc = ERROR_FAIL;
-            goto out;
-        }
-
-        rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
-        if (rc) goto out;
-
-        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
-        if (rc) goto out;
-
-        rc = libxl__xs_transaction_commit(gc, &t);
-        if (!rc) break;
-        if (rc<0) goto out;
-    }
-
- out:
-    /* rc < 0: error
-     * rc == 0: ok, we are done
-     * rc == +1: need to keep waiting
-     */
-    libxl__xs_transaction_abort(gc, &t);
-
-    if (rc <= 0) {
-        if (rc < 0)
-            LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
-        switch_logdirty_done(egc,dss,rc);
-    }
-}
-
-static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss,
-                                 int rc)
-{
-    STATE_AO_GC(dss->ao);
-    libxl__logdirty_switch *lds = &dss->logdirty;
-
-    libxl__ev_xswatch_deregister(gc, &lds->watch);
-    libxl__ev_time_deregister(gc, &lds->timeout);
-
-    int broke;
-    if (rc) {
-        broke = -1;
-        dss->rc = rc;
-    } else {
-        broke = 0;
-    }
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, broke);
-}
-
-/*----- callbacks, called by xc_domain_save -----*/
-
-static inline char *physmap_path(libxl__gc *gc, uint32_t dm_domid,
-                                 uint32_t domid,
-                                 char *phys_offset, char *node)
-{
-    return libxl__device_model_xs_path(gc, dm_domid, domid,
-                                       "/physmap/%s/%s",
-                                       phys_offset, node);
-}
-
-int libxl__toolstack_save(uint32_t domid, uint8_t **buf,
-        uint32_t *len, void *dss_void)
-{
-    libxl__domain_suspend_state *dss = dss_void;
-    int ret;
-    STATE_AO_GC(dss->ao);
-    int i = 0;
-    uint32_t version = TOOLSTACK_SAVE_VERSION;
-    uint8_t *ptr = NULL;
-
-    ret = -1;
-
-    /* Version number */
-    *len = sizeof(version);
-    *buf = calloc(1, *len);
-    if (*buf == NULL) goto out;
-    ptr = *buf;
-    memcpy(ptr, &version, sizeof(version));
-
-    /* QEMU physmap data */
-    {
-        char **entries = NULL, *xs_path;
-        struct libxl__physmap_info *pi;
-        uint32_t dm_domid;
-        char *start_addr = NULL, *size = NULL, *phys_offset = NULL;
-        char *name = NULL;
-        unsigned int num = 0;
-        uint32_t count = 0, namelen = 0;
-
-        dm_domid = libxl_get_stubdom_id(CTX, domid);
-
-        xs_path = libxl__device_model_xs_path(gc, dm_domid, domid,
-                                              "/physmap");
-        entries = libxl__xs_directory(gc, 0, xs_path, &num);
-        count = num;
-
-        *len += sizeof(count);
-        *buf = realloc(*buf, *len);
-        if (*buf == NULL) goto out;
-        ptr = *buf + sizeof(version);
-        memcpy(ptr, &count, sizeof(count));
-        ptr += sizeof(count);
-
-        for (i = 0; i < count; i++) {
-            unsigned long offset;
-            phys_offset = entries[i];
-            if (phys_offset == NULL) {
-                LOG(ERROR, "phys_offset %d is NULL", i);
-                goto out;
-            }
-
-            xs_path = physmap_path(gc, dm_domid, domid, phys_offset,
-                                   "start_addr");
-            start_addr = libxl__xs_read(gc, 0, xs_path);
-            if (start_addr == NULL) {
-                LOG(ERROR, "%s is NULL", xs_path);
-                goto out;
-            }
-
-            xs_path = physmap_path(gc, dm_domid, domid, phys_offset, "size");
-            size = libxl__xs_read(gc, 0, xs_path);
-            if (size == NULL) {
-                LOG(ERROR, "%s is NULL", xs_path);
-                goto out;
-            }
-
-            xs_path = physmap_path(gc, dm_domid, domid, phys_offset, "name");
-            name = libxl__xs_read(gc, 0, xs_path);
-            if (name == NULL)
-                namelen = 0;
-            else
-                namelen = strlen(name) + 1;
-            *len += namelen + sizeof(struct libxl__physmap_info);
-            offset = ptr - (*buf);
-            *buf = realloc(*buf, *len);
-            if (*buf == NULL) goto out;
-            ptr = (*buf) + offset;
-            pi = (struct libxl__physmap_info *) ptr;
-            pi->phys_offset = strtoll(phys_offset, NULL, 16);
-            pi->start_addr = strtoll(start_addr, NULL, 16);
-            pi->size = strtoll(size, NULL, 16);
-            pi->namelen = namelen;
-            memcpy(pi->name, name, namelen);
-            ptr += sizeof(struct libxl__physmap_info) + namelen;
-        }
-    }
-
-    LOG(DEBUG,"domain=%"PRIu32" toolstack data size=%"PRIu32, domid, *len);
-
-    ret = 0;
-out:
-    return ret;
-}
-
-/*----- main code for saving, in order of execution -----*/
-
-void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
-{
-    STATE_AO_GC(dss->ao);
-    int port;
-    int rc = ERROR_FAIL;
-
-    /* Convenience aliases */
-    const uint32_t domid = dss->domid;
-    const libxl_domain_type type = dss->type;
-    const int live = dss->live;
-    const int debug = dss->debug;
-    const libxl_domain_remus_info *const r_info = dss->remus;
-    libxl__srm_save_autogen_callbacks *const callbacks =
-        &dss->sws.shs.callbacks.save.a;
-
-    dss->rc = 0;
-    logdirty_init(&dss->logdirty);
-    libxl__xswait_init(&dss->pvcontrol);
-    libxl__ev_evtchn_init(&dss->guest_evtchn);
-    libxl__ev_xswatch_init(&dss->guest_watch);
-    libxl__ev_time_init(&dss->guest_timeout);
-
-    switch (type) {
-    case LIBXL_DOMAIN_TYPE_HVM: {
-        dss->hvm = 1;
-        break;
-    }
-    case LIBXL_DOMAIN_TYPE_PV:
-        dss->hvm = 0;
-        break;
-    default:
-        abort();
-    }
-
-    dss->xcflags = (live ? XCFLAGS_LIVE : 0)
-          | (debug ? XCFLAGS_DEBUG : 0)
-          | (dss->hvm ? XCFLAGS_HVM : 0);
-
-    dss->guest_evtchn.port = -1;
-    dss->guest_evtchn_lockfd = -1;
-    dss->guest_responded = 0;
-    dss->dm_savefile = libxl__device_model_savefile(gc, domid);
-
-    if (r_info != NULL) {
-        dss->interval = r_info->interval;
-        dss->xcflags |= XCFLAGS_CHECKPOINTED;
-        if (libxl_defbool_val(r_info->compression))
-            dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
-    }
-
-    port = xs_suspend_evtchn_port(dss->domid);
-
-    if (port >= 0) {
-        rc = libxl__ctx_evtchn_init(gc);
-        if (rc) goto out;
-
-        dss->guest_evtchn.port =
-            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
-                                  dss->domid, port, &dss->guest_evtchn_lockfd);
-
-        if (dss->guest_evtchn.port < 0) {
-            LOG(WARN, "Suspend event channel initialization failed");
-            rc = ERROR_FAIL;
-            goto out;
-        }
-    }
-
-    memset(callbacks, 0, sizeof(*callbacks));
-    if (r_info != NULL) {
-        callbacks->suspend = libxl__remus_domain_suspend_callback;
-        callbacks->postcopy = libxl__remus_domain_resume_callback;
-        callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
-    } else
-        callbacks->suspend = libxl__domain_suspend_callback;
-
-    callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty;
-
-    dss->sws.ao  = dss->ao;
-    dss->sws.dss = dss;
-    dss->sws.fd  = dss->fd;
-    dss->sws.completion_callback = stream_done;
-
-    libxl__stream_write_start(egc, &dss->sws);
-    return;
-
- out:
-    domain_save_done(egc, dss, rc);
-}
-
-static void stream_done(libxl__egc *egc,
-                        libxl__stream_write_state *sws, int rc)
-{
-    domain_save_done(egc, sws->dss, rc);
-}
-
-static void save_device_model_datacopier_done(libxl__egc *egc,
-     libxl__datacopier_state *dc, int rc, int onwrite, int errnoval);
-
-void libxl__domain_save_device_model(libxl__egc *egc,
-                                     libxl__domain_suspend_state *dss,
-                                     libxl__save_device_model_cb *callback)
-{
-    STATE_AO_GC(dss->ao);
-    struct stat st;
-    uint32_t qemu_state_len;
-    int rc;
-
-    dss->save_dm_callback = callback;
-
-    /* Convenience aliases */
-    const char *const filename = dss->dm_savefile;
-    const int fd = dss->fd;
-
-    libxl__datacopier_state *dc = &dss->save_dm_datacopier;
-    memset(dc, 0, sizeof(*dc));
-    dc->readwhat = GCSPRINTF("qemu save file %s", filename);
-    dc->ao = ao;
-    dc->readfd = -1;
-    dc->writefd = fd;
-    dc->maxsz = INT_MAX;
-    dc->bytes_to_read = -1;
-    dc->copywhat = GCSPRINTF("qemu save file for domain %"PRIu32, dss->domid);
-    dc->writewhat = "save/migration stream";
-    dc->callback = save_device_model_datacopier_done;
-
-    dc->readfd = open(filename, O_RDONLY);
-    if (dc->readfd < 0) {
-        LOGE(ERROR, "unable to open %s", dc->readwhat);
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    if (fstat(dc->readfd, &st))
-    {
-        LOGE(ERROR, "unable to fstat %s", dc->readwhat);
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    if (!S_ISREG(st.st_mode)) {
-        LOG(ERROR, "%s is not a plain file!", dc->readwhat);
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    qemu_state_len = st.st_size;
-    LOG(DEBUG, "%s is %d bytes", dc->readwhat, qemu_state_len);
-
-    rc = libxl__datacopier_start(dc);
-    if (rc) goto out;
-
-    libxl__datacopier_prefixdata(egc, dc,
-                                 QEMU_SIGNATURE, strlen(QEMU_SIGNATURE));
-
-    libxl__datacopier_prefixdata(egc, dc,
-                                 &qemu_state_len, sizeof(qemu_state_len));
-    return;
-
- out:
-    save_device_model_datacopier_done(egc, dc, rc, -1, EIO);
-}
-
-static void save_device_model_datacopier_done(libxl__egc *egc,
-     libxl__datacopier_state *dc, int our_rc, int onwrite, int errnoval)
-{
-    libxl__domain_suspend_state *dss =
-        CONTAINER_OF(dc, *dss, save_dm_datacopier);
-    STATE_AO_GC(dss->ao);
-
-    /* Convenience aliases */
-    const char *const filename = dss->dm_savefile;
-    int rc;
-
-    libxl__datacopier_kill(dc);
-
-    if (dc->readfd >= 0) {
-        close(dc->readfd);
-        dc->readfd = -1;
-    }
-
-    rc = libxl__remove_file(gc, filename);
-    if (!our_rc) our_rc = rc;
-
-    dss->save_dm_callback(egc, dss, our_rc);
-}
-
-static void domain_save_done(libxl__egc *egc,
-                             libxl__domain_suspend_state *dss, int rc)
-{
-    STATE_AO_GC(dss->ao);
-
-    /* Convenience aliases */
-    const uint32_t domid = dss->domid;
-
-    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
-
-    if (dss->guest_evtchn.port > 0)
-        xc_suspend_evtchn_release(CTX->xch, CTX->xce, domid,
-                           dss->guest_evtchn.port, &dss->guest_evtchn_lockfd);
-
-    if (!dss->remus) {
-        dss->callback(egc, dss, rc);
-        return;
-    }
-
-    /*
-     * With Remus, if we reach this point, it means either
-     * backup died or some network error occurred preventing us
-     * from sending checkpoints. Teardown the network buffers and
-     * release netlink resources.  This is an async op.
-     */
-    libxl__remus_teardown(egc, dss, rc);
-}
-
 /*==================== Miscellaneous ====================*/
 
 char *libxl__uuid2string(libxl__gc *gc, const libxl_uuid uuid)
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
new file mode 100644
index 0000000..d8383b1
--- /dev/null
+++ b/tools/libxl/libxl_dom_save.c
@@ -0,0 +1,700 @@
+/*
+ * Copyright (C) 2009      Citrix Ltd.
+ * Author Vincent Hanquez <vincent.hanquez@eu.citrix.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+struct libxl__physmap_info {
+    uint64_t phys_offset;
+    uint64_t start_addr;
+    uint64_t size;
+    uint32_t namelen;
+    char name[];
+};
+
+/* Bump the version whenever the toolstack saved data changes.
+ * Different types of data are arranged in the specified order.
+ *
+ * Version 1:
+ *   uint32_t version
+ *   QEMU physmap data:
+ *     uint32_t count
+ *     libxl__physmap_info * count
+ */
+#define TOOLSTACK_SAVE_VERSION 1
+
+/*========================= Domain save ============================*/
+
+static void stream_done(libxl__egc *egc,
+                        libxl__stream_write_state *sws, int rc);
+static void domain_save_done(libxl__egc *egc,
+                             libxl__domain_suspend_state *dss, int rc);
+
+/*----- complicated callback, called by xc_domain_save -----*/
+
+/*
+ * We implement the other end of protocol for controlling qemu-dm's
+ * logdirty.  There is no documentation for this protocol, but our
+ * counterparty's implementation is in
+ * qemu-xen-traditional.git:xenstore.c in the function
+ * xenstore_process_logdirty_event
+ */
+
+static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
+                                    const struct timeval *requested_abs,
+                                    int rc);
+static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
+                            const char *watch_path, const char *event_path);
+static void switch_logdirty_done(libxl__egc *egc,
+                                 libxl__domain_suspend_state *dss, int rc);
+
+static void logdirty_init(libxl__logdirty_switch *lds)
+{
+    lds->cmd_path = 0;
+    libxl__ev_xswatch_init(&lds->watch);
+    libxl__ev_time_init(&lds->timeout);
+}
+
+static void domain_suspend_switch_qemu_xen_traditional_logdirty
+                               (int domid, unsigned enable,
+                                libxl__save_helper_state *shs)
+{
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__logdirty_switch *lds = &dss->logdirty;
+    STATE_AO_GC(dss->ao);
+    int rc;
+    xs_transaction_t t = 0;
+    const char *got;
+
+    if (!lds->cmd_path) {
+        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+        lds->cmd_path = libxl__device_model_xs_path(gc, dm_domid, domid,
+                                                    "/logdirty/cmd");
+        lds->ret_path = libxl__device_model_xs_path(gc, dm_domid, domid,
+                                                    "/logdirty/ret");
+    }
+    lds->cmd = enable ? "enable" : "disable";
+
+    rc = libxl__ev_xswatch_register(gc, &lds->watch,
+                                switch_logdirty_xswatch, lds->ret_path);
+    if (rc) goto out;
+
+    rc = libxl__ev_time_register_rel(ao, &lds->timeout,
+                                switch_logdirty_timeout, 10*1000);
+    if (rc) goto out;
+
+    for (;;) {
+        rc = libxl__xs_transaction_start(gc, &t);
+        if (rc) goto out;
+
+        rc = libxl__xs_read_checked(gc, t, lds->cmd_path, &got);
+        if (rc) goto out;
+
+        if (got) {
+            const char *got_ret;
+            rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got_ret);
+            if (rc) goto out;
+
+            if (!got_ret || strcmp(got, got_ret)) {
+                LOG(ERROR,"controlling logdirty: qemu was already sent"
+                    " command `%s' (xenstore path `%s') but result is `%s'",
+                    got, lds->cmd_path, got_ret ? got_ret : "<none>");
+                rc = ERROR_FAIL;
+                goto out;
+            }
+            rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
+            if (rc) goto out;
+        }
+
+        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
+        if (rc) goto out;
+
+        rc = libxl__xs_write_checked(gc, t, lds->cmd_path, lds->cmd);
+        if (rc) goto out;
+
+        rc = libxl__xs_transaction_commit(gc, &t);
+        if (!rc) break;
+        if (rc<0) goto out;
+    }
+
+    /* OK, wait for some callback */
+    return;
+
+ out:
+    LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
+    libxl__xs_transaction_abort(gc, &t);
+    switch_logdirty_done(egc,dss,rc);
+}
+
+static void domain_suspend_switch_qemu_xen_logdirty
+                               (int domid, unsigned enable,
+                                libxl__save_helper_state *shs)
+{
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    STATE_AO_GC(dss->ao);
+    int rc;
+
+    rc = libxl__qmp_set_global_dirty_log(gc, domid, enable);
+    if (!rc) {
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+    } else {
+        LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
+        dss->rc = rc;
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
+    }
+}
+
+void libxl__domain_suspend_common_switch_qemu_logdirty
+                               (int domid, unsigned enable, void *user)
+{
+    libxl__save_helper_state *shs = user;
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    STATE_AO_GC(dss->ao);
+
+    switch (libxl__device_model_version_running(gc, domid)) {
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+        domain_suspend_switch_qemu_xen_traditional_logdirty(domid, enable, shs);
+        break;
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+        domain_suspend_switch_qemu_xen_logdirty(domid, enable, shs);
+        break;
+    default:
+        LOG(ERROR,"logdirty switch failed"
+            ", no valid device model version found, abandoning suspend");
+        dss->rc = ERROR_FAIL;
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
+    }
+}
+static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
+                                    const struct timeval *requested_abs,
+                                    int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
+    STATE_AO_GC(dss->ao);
+    LOG(ERROR,"logdirty switch: wait for device model timed out");
+    switch_logdirty_done(egc,dss,ERROR_FAIL);
+}
+
+static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
+                            const char *watch_path, const char *event_path)
+{
+    libxl__domain_suspend_state *dss =
+        CONTAINER_OF(watch, *dss, logdirty.watch);
+    libxl__logdirty_switch *lds = &dss->logdirty;
+    STATE_AO_GC(dss->ao);
+    const char *got;
+    xs_transaction_t t = 0;
+    int rc;
+
+    for (;;) {
+        rc = libxl__xs_transaction_start(gc, &t);
+        if (rc) goto out;
+
+        rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got);
+        if (rc) goto out;
+
+        if (!got) {
+            rc = +1;
+            goto out;
+        }
+
+        if (strcmp(got, lds->cmd)) {
+            LOG(ERROR,"logdirty switch: sent command `%s' but got reply `%s'"
+                " (xenstore paths `%s' / `%s')", lds->cmd, got,
+                lds->cmd_path, lds->ret_path);
+            rc = ERROR_FAIL;
+            goto out;
+        }
+
+        rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
+        if (rc) goto out;
+
+        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
+        if (rc) goto out;
+
+        rc = libxl__xs_transaction_commit(gc, &t);
+        if (!rc) break;
+        if (rc<0) goto out;
+    }
+
+ out:
+    /* rc < 0: error
+     * rc == 0: ok, we are done
+     * rc == +1: need to keep waiting
+     */
+    libxl__xs_transaction_abort(gc, &t);
+
+    if (rc <= 0) {
+        if (rc < 0)
+            LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
+        switch_logdirty_done(egc,dss,rc);
+    }
+}
+
+static void switch_logdirty_done(libxl__egc *egc,
+                                 libxl__domain_suspend_state *dss,
+                                 int rc)
+{
+    STATE_AO_GC(dss->ao);
+    libxl__logdirty_switch *lds = &dss->logdirty;
+
+    libxl__ev_xswatch_deregister(gc, &lds->watch);
+    libxl__ev_time_deregister(gc, &lds->timeout);
+
+    int broke;
+    if (rc) {
+        broke = -1;
+        dss->rc = rc;
+    } else {
+        broke = 0;
+    }
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, broke);
+}
+
+/*----- callbacks, called by xc_domain_save -----*/
+
+static inline char *physmap_path(libxl__gc *gc, uint32_t dm_domid,
+                                 uint32_t domid,
+                                 char *phys_offset, char *node)
+{
+    return libxl__device_model_xs_path(gc, dm_domid, domid,
+                                       "/physmap/%s/%s",
+                                       phys_offset, node);
+}
+
+int libxl__toolstack_save(uint32_t domid, uint8_t **buf,
+        uint32_t *len, void *dss_void)
+{
+    libxl__domain_suspend_state *dss = dss_void;
+    int ret;
+    STATE_AO_GC(dss->ao);
+    int i = 0;
+    uint32_t version = TOOLSTACK_SAVE_VERSION;
+    uint8_t *ptr = NULL;
+
+    ret = -1;
+
+    /* Version number */
+    *len = sizeof(version);
+    *buf = calloc(1, *len);
+    if (*buf == NULL) goto out;
+    ptr = *buf;
+    memcpy(ptr, &version, sizeof(version));
+
+    /* QEMU physmap data */
+    {
+        char **entries = NULL, *xs_path;
+        struct libxl__physmap_info *pi;
+        uint32_t dm_domid;
+        char *start_addr = NULL, *size = NULL, *phys_offset = NULL;
+        char *name = NULL;
+        unsigned int num = 0;
+        uint32_t count = 0, namelen = 0;
+
+        dm_domid = libxl_get_stubdom_id(CTX, domid);
+
+        xs_path = libxl__device_model_xs_path(gc, dm_domid, domid,
+                                              "/physmap");
+        entries = libxl__xs_directory(gc, 0, xs_path, &num);
+        count = num;
+
+        *len += sizeof(count);
+        *buf = realloc(*buf, *len);
+        if (*buf == NULL) goto out;
+        ptr = *buf + sizeof(version);
+        memcpy(ptr, &count, sizeof(count));
+        ptr += sizeof(count);
+
+        for (i = 0; i < count; i++) {
+            unsigned long offset;
+            phys_offset = entries[i];
+            if (phys_offset == NULL) {
+                LOG(ERROR, "phys_offset %d is NULL", i);
+                goto out;
+            }
+
+            xs_path = physmap_path(gc, dm_domid, domid, phys_offset,
+                                   "start_addr");
+            start_addr = libxl__xs_read(gc, 0, xs_path);
+            if (start_addr == NULL) {
+                LOG(ERROR, "%s is NULL", xs_path);
+                goto out;
+            }
+
+            xs_path = physmap_path(gc, dm_domid, domid, phys_offset, "size");
+            size = libxl__xs_read(gc, 0, xs_path);
+            if (size == NULL) {
+                LOG(ERROR, "%s is NULL", xs_path);
+                goto out;
+            }
+
+            xs_path = physmap_path(gc, dm_domid, domid, phys_offset, "name");
+            name = libxl__xs_read(gc, 0, xs_path);
+            if (name == NULL)
+                namelen = 0;
+            else
+                namelen = strlen(name) + 1;
+            *len += namelen + sizeof(struct libxl__physmap_info);
+            offset = ptr - (*buf);
+            *buf = realloc(*buf, *len);
+            if (*buf == NULL) goto out;
+            ptr = (*buf) + offset;
+            pi = (struct libxl__physmap_info *) ptr;
+            pi->phys_offset = strtoull(phys_offset, NULL, 16);
+            pi->start_addr = strtoull(start_addr, NULL, 16);
+            pi->size = strtoull(size, NULL, 16);
+            pi->namelen = namelen;
+            memcpy(pi->name, name, namelen);
+            ptr += sizeof(struct libxl__physmap_info) + namelen;
+        }
+    }
+
+    LOG(DEBUG,"domain=%"PRIu32" toolstack data size=%"PRIu32, domid, *len);
+
+    ret = 0;
+out:
+    return ret;
+}
+
+/*----- main code for saving, in order of execution -----*/
+
+void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
+{
+    STATE_AO_GC(dss->ao);
+    int port;
+    int rc = ERROR_FAIL;
+
+    /* Convenience aliases */
+    const uint32_t domid = dss->domid;
+    const libxl_domain_type type = dss->type;
+    const int live = dss->live;
+    const int debug = dss->debug;
+    const libxl_domain_remus_info *const r_info = dss->remus;
+    libxl__srm_save_autogen_callbacks *const callbacks =
+        &dss->sws.shs.callbacks.save.a;
+
+    dss->rc = 0;
+    logdirty_init(&dss->logdirty);
+    libxl__xswait_init(&dss->pvcontrol);
+    libxl__ev_evtchn_init(&dss->guest_evtchn);
+    libxl__ev_xswatch_init(&dss->guest_watch);
+    libxl__ev_time_init(&dss->guest_timeout);
+
+    switch (type) {
+    case LIBXL_DOMAIN_TYPE_HVM: {
+        dss->hvm = 1;
+        break;
+    }
+    case LIBXL_DOMAIN_TYPE_PV:
+        dss->hvm = 0;
+        break;
+    default:
+        abort();
+    }
+
+    dss->xcflags = (live ? XCFLAGS_LIVE : 0)
+          | (debug ? XCFLAGS_DEBUG : 0)
+          | (dss->hvm ? XCFLAGS_HVM : 0);
+
+    dss->guest_evtchn.port = -1;
+    dss->guest_evtchn_lockfd = -1;
+    dss->guest_responded = 0;
+    dss->dm_savefile = libxl__device_model_savefile(gc, domid);
+
+    if (r_info != NULL) {
+        dss->interval = r_info->interval;
+        dss->xcflags |= XCFLAGS_CHECKPOINTED;
+        if (libxl_defbool_val(r_info->compression))
+            dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
+    }
+
+    port = xs_suspend_evtchn_port(dss->domid);
+
+    if (port >= 0) {
+        rc = libxl__ctx_evtchn_init(gc);
+        if (rc) goto out;
+
+        dss->guest_evtchn.port =
+            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
+                                  dss->domid, port, &dss->guest_evtchn_lockfd);
+
+        if (dss->guest_evtchn.port < 0) {
+            LOG(WARN, "Suspend event channel initialization failed");
+            rc = ERROR_FAIL;
+            goto out;
+        }
+    }
+
+    memset(callbacks, 0, sizeof(*callbacks));
+    if (r_info != NULL) {
+        callbacks->suspend = libxl__remus_domain_suspend_callback;
+        callbacks->postcopy = libxl__remus_domain_resume_callback;
+        callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
+    } else
+        callbacks->suspend = libxl__domain_suspend_callback;
+
+    callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty;
+
+    dss->sws.ao  = dss->ao;
+    dss->sws.dss = dss;
+    dss->sws.fd  = dss->fd;
+    dss->sws.completion_callback = stream_done;
+
+    libxl__stream_write_start(egc, &dss->sws);
+    return;
+
+ out:
+    domain_save_done(egc, dss, rc);
+}
+
+static void stream_done(libxl__egc *egc,
+                        libxl__stream_write_state *sws, int rc)
+{
+    domain_save_done(egc, sws->dss, rc);
+}
+
+static void save_device_model_datacopier_done(libxl__egc *egc,
+     libxl__datacopier_state *dc, int rc, int onwrite, int errnoval);
+
+void libxl__domain_save_device_model(libxl__egc *egc,
+                                     libxl__domain_suspend_state *dss,
+                                     libxl__save_device_model_cb *callback)
+{
+    STATE_AO_GC(dss->ao);
+    struct stat st;
+    uint32_t qemu_state_len;
+    int rc;
+
+    dss->save_dm_callback = callback;
+
+    /* Convenience aliases */
+    const char *const filename = dss->dm_savefile;
+    const int fd = dss->fd;
+
+    libxl__datacopier_state *dc = &dss->save_dm_datacopier;
+    memset(dc, 0, sizeof(*dc));
+    dc->readwhat = GCSPRINTF("qemu save file %s", filename);
+    dc->ao = ao;
+    dc->readfd = -1;
+    dc->writefd = fd;
+    dc->maxsz = INT_MAX;
+    dc->bytes_to_read = -1;
+    dc->copywhat = GCSPRINTF("qemu save file for domain %"PRIu32, dss->domid);
+    dc->writewhat = "save/migration stream";
+    dc->callback = save_device_model_datacopier_done;
+
+    dc->readfd = open(filename, O_RDONLY);
+    if (dc->readfd < 0) {
+        LOGE(ERROR, "unable to open %s", dc->readwhat);
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    if (fstat(dc->readfd, &st))
+    {
+        LOGE(ERROR, "unable to fstat %s", dc->readwhat);
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    if (!S_ISREG(st.st_mode)) {
+        LOG(ERROR, "%s is not a plain file!", dc->readwhat);
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    qemu_state_len = st.st_size;
+    LOG(DEBUG, "%s is %d bytes", dc->readwhat, qemu_state_len);
+
+    rc = libxl__datacopier_start(dc);
+    if (rc) goto out;
+
+    libxl__datacopier_prefixdata(egc, dc,
+                                 QEMU_SIGNATURE, strlen(QEMU_SIGNATURE));
+
+    libxl__datacopier_prefixdata(egc, dc,
+                                 &qemu_state_len, sizeof(qemu_state_len));
+    return;
+
+ out:
+    save_device_model_datacopier_done(egc, dc, rc, -1, EIO);
+}
+
+static void save_device_model_datacopier_done(libxl__egc *egc,
+     libxl__datacopier_state *dc, int our_rc, int onwrite, int errnoval)
+{
+    libxl__domain_suspend_state *dss =
+        CONTAINER_OF(dc, *dss, save_dm_datacopier);
+    STATE_AO_GC(dss->ao);
+
+    /* Convenience aliases */
+    const char *const filename = dss->dm_savefile;
+    int rc;
+
+    libxl__datacopier_kill(dc);
+
+    if (dc->readfd >= 0) {
+        close(dc->readfd);
+        dc->readfd = -1;
+    }
+
+    rc = libxl__remove_file(gc, filename);
+    if (!our_rc) our_rc = rc;
+
+    dss->save_dm_callback(egc, dss, our_rc);
+}
+
+static void domain_save_done(libxl__egc *egc,
+                             libxl__domain_suspend_state *dss, int rc)
+{
+    STATE_AO_GC(dss->ao);
+
+    /* Convenience aliases */
+    const uint32_t domid = dss->domid;
+
+    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
+
+    if (dss->guest_evtchn.port > 0)
+        xc_suspend_evtchn_release(CTX->xch, CTX->xce, domid,
+                           dss->guest_evtchn.port, &dss->guest_evtchn_lockfd);
+
+    if (!dss->remus) {
+        dss->callback(egc, dss, rc);
+        return;
+    }
+
+    /*
+     * With Remus, if we reach this point, it means either
+     * backup died or some network error occurred preventing us
+     * from sending checkpoints. Teardown the network buffers and
+     * release netlink resources.  This is an async op.
+     */
+    libxl__remus_teardown(egc, dss, rc);
+}
+
+/*========================= Domain restore ============================*/
+
+static inline char *restore_helper(libxl__gc *gc, uint32_t dm_domid,
+                                   uint32_t domid,
+                                   uint64_t phys_offset, char *node)
+{
+    return libxl__device_model_xs_path(gc, dm_domid, domid,
+                                       "/physmap/%"PRIx64"/%s",
+                                       phys_offset, node);
+}
+
+static int libxl__toolstack_restore_qemu(libxl__gc *gc, uint32_t domid,
+                                         const uint8_t *ptr, uint32_t size)
+{
+    int ret, i;
+    uint32_t count;
+    char *xs_path;
+    uint32_t dm_domid;
+    struct libxl__physmap_info *pi;
+
+    if (size < sizeof(count)) {
+        LOG(ERROR, "wrong size");
+        ret = -1;
+        goto out;
+    }
+
+    memcpy(&count, ptr, sizeof(count));
+    ptr += sizeof(count);
+
+    if (size < sizeof(count) + count*(sizeof(struct libxl__physmap_info))) {
+        LOG(ERROR, "wrong size");
+        ret = -1;
+        goto out;
+    }
+
+    dm_domid = libxl_get_stubdom_id(CTX, domid);
+    for (i = 0; i < count; i++) {
+        pi = (struct libxl__physmap_info*) ptr;
+        ptr += sizeof(struct libxl__physmap_info) + pi->namelen;
+
+        xs_path = restore_helper(gc, dm_domid, domid,
+                                 pi->phys_offset, "start_addr");
+        ret = libxl__xs_write(gc, 0, xs_path, "%"PRIx64, pi->start_addr);
+        if (ret) goto out;
+
+        xs_path = restore_helper(gc, dm_domid, domid, pi->phys_offset, "size");
+        ret = libxl__xs_write(gc, 0, xs_path, "%"PRIx64, pi->size);
+        if (ret) goto out;
+
+        if (pi->namelen > 0) {
+            xs_path = restore_helper(gc, dm_domid, domid,
+                                     pi->phys_offset, "name");
+            ret = libxl__xs_write(gc, 0, xs_path, "%s", pi->name);
+            if (ret) goto out;
+        }
+    }
+
+    ret = 0;
+out:
+    return ret;
+
+}
+
+static int libxl__toolstack_restore_v1(libxl__gc *gc, uint32_t domid,
+                                       const uint8_t *ptr, uint32_t size)
+{
+    return libxl__toolstack_restore_qemu(gc, domid, ptr, size);
+}
+
+int libxl__toolstack_restore(uint32_t domid, const uint8_t *ptr,
+                             uint32_t size, void *user)
+{
+    libxl__save_helper_state *shs = user;
+    libxl__domain_create_state *dcs = shs->caller_state;
+    STATE_AO_GC(dcs->ao);
+    int ret;
+    uint32_t version = 0, bufsize;
+
+    LOG(DEBUG,"domain=%"PRIu32" toolstack data size=%"PRIu32, domid, size);
+
+    if (size < sizeof(version)) {
+        LOG(ERROR, "wrong size");
+        ret = -1;
+        goto out;
+    }
+
+    memcpy(&version, ptr, sizeof(version));
+    ptr += sizeof(version);
+    bufsize = size - sizeof(version);
+
+    switch (version) {
+    case 1:
+        ret = libxl__toolstack_restore_v1(gc, domid, ptr, bufsize);
+        break;
+    default:
+        LOG(ERROR, "wrong version");
+        ret = -1;
+    }
+
+out:
+    return ret;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.9.1

* [PATCH v4 --for 4.6 COLOPre 10/25] libxl/save: Refactor libxl__domain_suspend_state
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (8 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 09/25] tools/libxl: move save/restore code into libxl_dom_save.c Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 12:10   ` Ian Campbell
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests Yang Hongyang
                   ` (15 subsequent siblings)
  25 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, Ian Jackson

Currently struct libxl__domain_suspend_state contains two types of state:
save state and suspend state. This patch separates the two.
The motivation is that COLO will need to suspend and resume the guest
continuously, so we need a more general suspend state.

After this change, dss stands for libxl__domain_save_state and
dsps stands for libxl__domain_suspend_state.

Also introduce libxl__domain_suspend_init to initialise the
libxl__domain_suspend_state.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/libxl/libxl.c              |  10 +-
 tools/libxl/libxl_dom_save.c     |  69 +++++--------
 tools/libxl/libxl_dom_suspend.c  | 217 +++++++++++++++++++++++++--------------
 tools/libxl/libxl_internal.h     |  60 +++++++----
 tools/libxl/libxl_netbuffer.c    |   2 +-
 tools/libxl/libxl_remus.c        |  37 ++++---
 tools/libxl/libxl_save_callout.c |   2 +-
 tools/libxl/libxl_stream_write.c |  14 +--
 8 files changed, 234 insertions(+), 177 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index f1237d8..05688cd 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -796,7 +796,7 @@ out:
 }
 
 static void remus_failover_cb(libxl__egc *egc,
-                              libxl__domain_suspend_state *dss, int rc);
+                              libxl__domain_save_state *dss, int rc);
 
 /* TODO: Explicit Checkpoint acknowledgements via recv_fd. */
 int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
@@ -804,7 +804,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
                              const libxl_asyncop_how *ao_how)
 {
     AO_CREATE(ctx, domid, ao_how);
-    libxl__domain_suspend_state *dss;
+    libxl__domain_save_state *dss;
     int rc;
 
     libxl_domain_type type = libxl__domain_type(gc, domid);
@@ -852,7 +852,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
 }
 
 static void remus_failover_cb(libxl__egc *egc,
-                              libxl__domain_suspend_state *dss, int rc)
+                              libxl__domain_save_state *dss, int rc)
 {
     STATE_AO_GC(dss->ao);
     /*
@@ -864,7 +864,7 @@ static void remus_failover_cb(libxl__egc *egc,
 }
 
 static void domain_suspend_cb(libxl__egc *egc,
-                              libxl__domain_suspend_state *dss, int rc)
+                              libxl__domain_save_state *dss, int rc)
 {
     STATE_AO_GC(dss->ao);
     libxl__ao_complete(egc,ao,rc);
@@ -883,7 +883,7 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
         goto out_err;
     }
 
-    libxl__domain_suspend_state *dss;
+    libxl__domain_save_state *dss;
     GCNEW(dss);
 
     dss->ao = ao;
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index d8383b1..6348cae 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -41,7 +41,7 @@ struct libxl__physmap_info {
 static void stream_done(libxl__egc *egc,
                         libxl__stream_write_state *sws, int rc);
 static void domain_save_done(libxl__egc *egc,
-                             libxl__domain_suspend_state *dss, int rc);
+                             libxl__domain_save_state *dss, int rc);
 
 /*----- complicated callback, called by xc_domain_save -----*/
 
@@ -59,7 +59,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
                             const char *watch_path, const char *event_path);
 static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss, int rc);
+                                 libxl__domain_save_state *dss, int rc);
 
 static void logdirty_init(libxl__logdirty_switch *lds)
 {
@@ -73,7 +73,7 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty
                                 libxl__save_helper_state *shs)
 {
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
     libxl__logdirty_switch *lds = &dss->logdirty;
     STATE_AO_GC(dss->ao);
     int rc;
@@ -145,7 +145,7 @@ static void domain_suspend_switch_qemu_xen_logdirty
                                 libxl__save_helper_state *shs)
 {
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
     STATE_AO_GC(dss->ao);
     int rc;
 
@@ -164,7 +164,7 @@ void libxl__domain_suspend_common_switch_qemu_logdirty
 {
     libxl__save_helper_state *shs = user;
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
     STATE_AO_GC(dss->ao);
 
     switch (libxl__device_model_version_running(gc, domid)) {
@@ -185,7 +185,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
                                     const struct timeval *requested_abs,
                                     int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
+    libxl__domain_save_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
     STATE_AO_GC(dss->ao);
     LOG(ERROR,"logdirty switch: wait for device model timed out");
     switch_logdirty_done(egc,dss,ERROR_FAIL);
@@ -194,7 +194,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
                             const char *watch_path, const char *event_path)
 {
-    libxl__domain_suspend_state *dss =
+    libxl__domain_save_state *dss =
         CONTAINER_OF(watch, *dss, logdirty.watch);
     libxl__logdirty_switch *lds = &dss->logdirty;
     STATE_AO_GC(dss->ao);
@@ -248,7 +248,7 @@ static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
 }
 
 static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss,
+                                 libxl__domain_save_state *dss,
                                  int rc)
 {
     STATE_AO_GC(dss->ao);
@@ -281,7 +281,7 @@ static inline char *physmap_path(libxl__gc *gc, uint32_t dm_domid,
 int libxl__toolstack_save(uint32_t domid, uint8_t **buf,
         uint32_t *len, void *dss_void)
 {
-    libxl__domain_suspend_state *dss = dss_void;
+    libxl__domain_save_state *dss = dss_void;
     int ret;
     STATE_AO_GC(dss->ao);
     int i = 0;
@@ -374,10 +374,9 @@ out:
 
 /*----- main code for saving, in order of execution -----*/
 
-void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
+void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
 {
     STATE_AO_GC(dss->ao);
-    int port;
     int rc = ERROR_FAIL;
 
     /* Convenience aliases */
@@ -388,13 +387,14 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
     const libxl_domain_remus_info *const r_info = dss->remus;
     libxl__srm_save_autogen_callbacks *const callbacks =
         &dss->sws.shs.callbacks.save.a;
+    libxl__domain_suspend_state *dsps = &dss->dsps;
 
     dss->rc = 0;
     logdirty_init(&dss->logdirty);
-    libxl__xswait_init(&dss->pvcontrol);
-    libxl__ev_evtchn_init(&dss->guest_evtchn);
-    libxl__ev_xswatch_init(&dss->guest_watch);
-    libxl__ev_time_init(&dss->guest_timeout);
+    dsps->ao = ao;
+    dsps->domid = domid;
+    rc = libxl__domain_suspend_init(egc, dsps);
+    if (rc) goto out;
 
     switch (type) {
     case LIBXL_DOMAIN_TYPE_HVM: {
@@ -412,11 +412,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
           | (debug ? XCFLAGS_DEBUG : 0)
           | (dss->hvm ? XCFLAGS_HVM : 0);
 
-    dss->guest_evtchn.port = -1;
-    dss->guest_evtchn_lockfd = -1;
-    dss->guest_responded = 0;
-    dss->dm_savefile = libxl__device_model_savefile(gc, domid);
-
     if (r_info != NULL) {
         dss->interval = r_info->interval;
         dss->xcflags |= XCFLAGS_CHECKPOINTED;
@@ -424,23 +419,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
             dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
     }
 
-    port = xs_suspend_evtchn_port(dss->domid);
-
-    if (port >= 0) {
-        rc = libxl__ctx_evtchn_init(gc);
-        if (rc) goto out;
-
-        dss->guest_evtchn.port =
-            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
-                                  dss->domid, port, &dss->guest_evtchn_lockfd);
-
-        if (dss->guest_evtchn.port < 0) {
-            LOG(WARN, "Suspend event channel initialization failed");
-            rc = ERROR_FAIL;
-            goto out;
-        }
-    }
-
     memset(callbacks, 0, sizeof(*callbacks));
     if (r_info != NULL) {
         callbacks->suspend = libxl__remus_domain_suspend_callback;
@@ -473,7 +451,7 @@ static void save_device_model_datacopier_done(libxl__egc *egc,
      libxl__datacopier_state *dc, int rc, int onwrite, int errnoval);
 
 void libxl__domain_save_device_model(libxl__egc *egc,
-                                     libxl__domain_suspend_state *dss,
+                                     libxl__domain_save_state *dss,
                                      libxl__save_device_model_cb *callback)
 {
     STATE_AO_GC(dss->ao);
@@ -484,7 +462,7 @@ void libxl__domain_save_device_model(libxl__egc *egc,
     dss->save_dm_callback = callback;
 
     /* Convenience aliases */
-    const char *const filename = dss->dm_savefile;
+    const char *const filename = dss->dsps.dm_savefile;
     const int fd = dss->fd;
 
     libxl__datacopier_state *dc = &dss->save_dm_datacopier;
@@ -539,12 +517,12 @@ void libxl__domain_save_device_model(libxl__egc *egc,
 static void save_device_model_datacopier_done(libxl__egc *egc,
      libxl__datacopier_state *dc, int our_rc, int onwrite, int errnoval)
 {
-    libxl__domain_suspend_state *dss =
+    libxl__domain_save_state *dss =
         CONTAINER_OF(dc, *dss, save_dm_datacopier);
     STATE_AO_GC(dss->ao);
 
     /* Convenience aliases */
-    const char *const filename = dss->dm_savefile;
+    const char *const filename = dss->dsps.dm_savefile;
     int rc;
 
     libxl__datacopier_kill(dc);
@@ -561,18 +539,19 @@ static void save_device_model_datacopier_done(libxl__egc *egc,
 }
 
 static void domain_save_done(libxl__egc *egc,
-                             libxl__domain_suspend_state *dss, int rc)
+                             libxl__domain_save_state *dss, int rc)
 {
     STATE_AO_GC(dss->ao);
 
     /* Convenience aliases */
     const uint32_t domid = dss->domid;
+    libxl__domain_suspend_state *dsps = &dss->dsps;
 
-    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
+    libxl__ev_evtchn_cancel(gc, &dsps->guest_evtchn);
 
-    if (dss->guest_evtchn.port > 0)
+    if (dsps->guest_evtchn.port > 0)
         xc_suspend_evtchn_release(CTX->xch, CTX->xce, domid,
-                           dss->guest_evtchn.port, &dss->guest_evtchn_lockfd);
+                        dsps->guest_evtchn.port, &dsps->guest_evtchn_lockfd);
 
     if (!dss->remus) {
         dss->callback(egc, dss, rc);
diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
index a90800d..6f04c26 100644
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -19,14 +19,71 @@
 
 /*====================== Domain suspend =======================*/
 
+int libxl__domain_suspend_init(libxl__egc *egc,
+                               libxl__domain_suspend_state *dsps)
+{
+    STATE_AO_GC(dsps->ao);
+    int rc = ERROR_FAIL;
+    int port;
+    libxl_domain_type type;
+
+    /* Convenience aliases */
+    const uint32_t domid = dsps->domid;
+
+    type = libxl__domain_type(gc, domid);
+    switch (type) {
+    case LIBXL_DOMAIN_TYPE_HVM: {
+        dsps->hvm = 1;
+        break;
+    }
+    case LIBXL_DOMAIN_TYPE_PV:
+        dsps->hvm = 0;
+        break;
+    default:
+        goto out;
+    }
+
+    libxl__xswait_init(&dsps->pvcontrol);
+    libxl__ev_evtchn_init(&dsps->guest_evtchn);
+    libxl__ev_xswatch_init(&dsps->guest_watch);
+    libxl__ev_time_init(&dsps->guest_timeout);
+
+    dsps->guest_evtchn.port = -1;
+    dsps->guest_evtchn_lockfd = -1;
+    dsps->guest_responded = 0;
+    dsps->dm_savefile = libxl__device_model_savefile(gc, domid);
+
+    port = xs_suspend_evtchn_port(domid);
+
+    if (port >= 0) {
+        rc = libxl__ctx_evtchn_init(gc);
+        if (rc) goto out;
+
+        dsps->guest_evtchn.port =
+            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
+                                    domid, port, &dsps->guest_evtchn_lockfd);
+
+        if (dsps->guest_evtchn.port < 0) {
+            LOG(WARN, "Suspend event channel initialization failed");
+            rc = ERROR_FAIL;
+            goto out;
+        }
+    }
+
+    rc = 0;
+
+out:
+    return rc;
+}
+
 /*----- callbacks, called by xc_domain_save -----*/
 
 int libxl__domain_suspend_device_model(libxl__gc *gc,
-                                       libxl__domain_suspend_state *dss)
+                                       libxl__domain_suspend_state *dsps)
 {
     int ret = 0;
-    uint32_t const domid = dss->domid;
-    const char *const filename = dss->dm_savefile;
+    uint32_t const domid = dsps->domid;
+    const char *const filename = dsps->dm_savefile;
 
     switch (libxl__device_model_version_running(gc, domid)) {
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
@@ -51,9 +108,9 @@ int libxl__domain_suspend_device_model(libxl__gc *gc,
 }
 
 static void domain_suspend_common_wait_guest(libxl__egc *egc,
-                                             libxl__domain_suspend_state *dss);
+                                             libxl__domain_suspend_state *dsps);
 static void domain_suspend_common_guest_suspended(libxl__egc *egc,
-                                         libxl__domain_suspend_state *dss);
+                                         libxl__domain_suspend_state *dsps);
 
 static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
       libxl__xswait_state *xswa, int rc, const char *state);
@@ -62,24 +119,24 @@ static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
 static void suspend_common_wait_guest_watch(libxl__egc *egc,
       libxl__ev_xswatch *xsw, const char *watch_path, const char *event_path);
 static void suspend_common_wait_guest_check(libxl__egc *egc,
-        libxl__domain_suspend_state *dss);
+        libxl__domain_suspend_state *dsps);
 static void suspend_common_wait_guest_timeout(libxl__egc *egc,
       libxl__ev_time *ev, const struct timeval *requested_abs, int rc);
 
 static void domain_suspend_common_done(libxl__egc *egc,
-                                       libxl__domain_suspend_state *dss,
+                                       libxl__domain_suspend_state *dsps,
                                        int rc);
 
 static void domain_suspend_callback_common(libxl__egc *egc,
-                                           libxl__domain_suspend_state *dss);
+                                           libxl__domain_suspend_state *dsps);
 static void domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int rc);
+                                libxl__domain_suspend_state *dsps, int rc);
 
-/* calls dss->callback_common_done when done */
+/* calls dsps->callback_common_done when done */
 void libxl__domain_suspend(libxl__egc *egc,
-                           libxl__domain_suspend_state *dss)
+                           libxl__domain_suspend_state *dsps)
 {
-    domain_suspend_callback_common(egc, dss);
+    domain_suspend_callback_common(egc, dsps);
 }
 
 static bool domain_suspend_pvcontrol_acked(const char *state) {
@@ -88,37 +145,37 @@ static bool domain_suspend_pvcontrol_acked(const char *state) {
     return strcmp(state,"suspend");
 }
 
-/* calls dss->callback_common_done when done */
+/* calls dsps->callback_common_done when done */
 static void domain_suspend_callback_common(libxl__egc *egc,
-                                           libxl__domain_suspend_state *dss)
+                                           libxl__domain_suspend_state *dsps)
 {
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(dsps->ao);
     uint64_t hvm_s_state = 0, hvm_pvdrv = 0;
     int ret, rc;
 
     /* Convenience aliases */
-    const uint32_t domid = dss->domid;
+    const uint32_t domid = dsps->domid;
 
-    if (dss->hvm) {
+    if (dsps->hvm) {
         xc_hvm_param_get(CTX->xch, domid, HVM_PARAM_CALLBACK_IRQ, &hvm_pvdrv);
         xc_hvm_param_get(CTX->xch, domid, HVM_PARAM_ACPI_S_STATE, &hvm_s_state);
     }
 
-    if ((hvm_s_state == 0) && (dss->guest_evtchn.port >= 0)) {
+    if ((hvm_s_state == 0) && (dsps->guest_evtchn.port >= 0)) {
         LOG(DEBUG, "issuing %s suspend request via event channel",
-            dss->hvm ? "PVHVM" : "PV");
-        ret = xc_evtchn_notify(CTX->xce, dss->guest_evtchn.port);
+            dsps->hvm ? "PVHVM" : "PV");
+        ret = xc_evtchn_notify(CTX->xce, dsps->guest_evtchn.port);
         if (ret < 0) {
             LOG(ERROR, "xc_evtchn_notify failed ret=%d", ret);
             rc = ERROR_FAIL;
             goto err;
         }
 
-        dss->guest_evtchn.callback = domain_suspend_common_wait_guest_evtchn;
-        rc = libxl__ev_evtchn_wait(gc, &dss->guest_evtchn);
+        dsps->guest_evtchn.callback = domain_suspend_common_wait_guest_evtchn;
+        rc = libxl__ev_evtchn_wait(gc, &dsps->guest_evtchn);
         if (rc) goto err;
 
-        rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout,
+        rc = libxl__ev_time_register_rel(ao, &dsps->guest_timeout,
                                          suspend_common_wait_guest_timeout,
                                          60*1000);
         if (rc) goto err;
@@ -126,7 +183,7 @@ static void domain_suspend_callback_common(libxl__egc *egc,
         return;
     }
 
-    if (dss->hvm && (!hvm_pvdrv || hvm_s_state)) {
+    if (dsps->hvm && (!hvm_pvdrv || hvm_s_state)) {
         LOG(DEBUG, "Calling xc_domain_shutdown on HVM domain");
         ret = xc_domain_shutdown(CTX->xch, domid, SHUTDOWN_suspend);
         if (ret < 0) {
@@ -135,55 +192,55 @@ static void domain_suspend_callback_common(libxl__egc *egc,
             goto err;
         }
         /* The guest does not (need to) respond to this sort of request. */
-        dss->guest_responded = 1;
-        domain_suspend_common_wait_guest(egc, dss);
+        dsps->guest_responded = 1;
+        domain_suspend_common_wait_guest(egc, dsps);
         return;
     }
 
     LOG(DEBUG, "issuing %s suspend request via XenBus control node",
-        dss->hvm ? "PVHVM" : "PV");
+        dsps->hvm ? "PVHVM" : "PV");
 
     libxl__domain_pvcontrol_write(gc, XBT_NULL, domid, "suspend");
 
-    dss->pvcontrol.path = libxl__domain_pvcontrol_xspath(gc, domid);
-    if (!dss->pvcontrol.path) { rc = ERROR_FAIL; goto err; }
+    dsps->pvcontrol.path = libxl__domain_pvcontrol_xspath(gc, domid);
+    if (!dsps->pvcontrol.path) { rc = ERROR_FAIL; goto err; }
 
-    dss->pvcontrol.ao = ao;
-    dss->pvcontrol.what = "guest acknowledgement of suspend request";
-    dss->pvcontrol.timeout_ms = 60 * 1000;
-    dss->pvcontrol.callback = domain_suspend_common_pvcontrol_suspending;
-    libxl__xswait_start(gc, &dss->pvcontrol);
+    dsps->pvcontrol.ao = ao;
+    dsps->pvcontrol.what = "guest acknowledgement of suspend request";
+    dsps->pvcontrol.timeout_ms = 60 * 1000;
+    dsps->pvcontrol.callback = domain_suspend_common_pvcontrol_suspending;
+    libxl__xswait_start(gc, &dsps->pvcontrol);
     return;
 
  err:
-    domain_suspend_common_done(egc, dss, rc);
+    domain_suspend_common_done(egc, dsps, rc);
 }
 
 static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
         libxl__ev_evtchn *evev)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(evev, *dss, guest_evtchn);
-    STATE_AO_GC(dss->ao);
+    libxl__domain_suspend_state *dsps = CONTAINER_OF(evev, *dsps, guest_evtchn);
+    STATE_AO_GC(dsps->ao);
     /* If we should be done waiting, suspend_common_wait_guest_check
      * will end up calling domain_suspend_common_guest_suspended or
      * domain_suspend_common_done, both of which cancel the evtchn
      * wait as needed.  So re-enable it now. */
-    libxl__ev_evtchn_wait(gc, &dss->guest_evtchn);
-    suspend_common_wait_guest_check(egc, dss);
+    libxl__ev_evtchn_wait(gc, &dsps->guest_evtchn);
+    suspend_common_wait_guest_check(egc, dsps);
 }
 
 static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
       libxl__xswait_state *xswa, int rc, const char *state)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(xswa, *dss, pvcontrol);
-    STATE_AO_GC(dss->ao);
+    libxl__domain_suspend_state *dsps = CONTAINER_OF(xswa, *dsps, pvcontrol);
+    STATE_AO_GC(dsps->ao);
     xs_transaction_t t = 0;
 
     if (!rc && !domain_suspend_pvcontrol_acked(state))
         /* keep waiting */
         return;
 
-    libxl__xswait_stop(gc, &dss->pvcontrol);
+    libxl__xswait_stop(gc, &dsps->pvcontrol);
 
     if (rc == ERROR_TIMEDOUT) {
         /*
@@ -226,56 +283,56 @@ static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
     LOG(DEBUG, "guest acknowledged suspend request");
 
     libxl__xs_transaction_abort(gc, &t);
-    dss->guest_responded = 1;
-    domain_suspend_common_wait_guest(egc,dss);
+    dsps->guest_responded = 1;
+    domain_suspend_common_wait_guest(egc,dsps);
     return;
 
  err:
     libxl__xs_transaction_abort(gc, &t);
-    domain_suspend_common_done(egc, dss, rc);
+    domain_suspend_common_done(egc, dsps, rc);
     return;
 }
 
 static void domain_suspend_common_wait_guest(libxl__egc *egc,
-                                             libxl__domain_suspend_state *dss)
+                                             libxl__domain_suspend_state *dsps)
 {
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(dsps->ao);
     int rc;
 
     LOG(DEBUG, "wait for the guest to suspend");
 
-    rc = libxl__ev_xswatch_register(gc, &dss->guest_watch,
+    rc = libxl__ev_xswatch_register(gc, &dsps->guest_watch,
                                     suspend_common_wait_guest_watch,
                                     "@releaseDomain");
     if (rc) goto err;
 
-    rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout,
+    rc = libxl__ev_time_register_rel(ao, &dsps->guest_timeout,
                                      suspend_common_wait_guest_timeout,
                                      60*1000);
     if (rc) goto err;
     return;
 
  err:
-    domain_suspend_common_done(egc, dss, rc);
+    domain_suspend_common_done(egc, dsps, rc);
 }
 
 static void suspend_common_wait_guest_watch(libxl__egc *egc,
       libxl__ev_xswatch *xsw, const char *watch_path, const char *event_path)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(xsw, *dss, guest_watch);
-    suspend_common_wait_guest_check(egc, dss);
+    libxl__domain_suspend_state *dsps = CONTAINER_OF(xsw, *dsps, guest_watch);
+    suspend_common_wait_guest_check(egc, dsps);
 }
 
 static void suspend_common_wait_guest_check(libxl__egc *egc,
-        libxl__domain_suspend_state *dss)
+        libxl__domain_suspend_state *dsps)
 {
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(dsps->ao);
     xc_domaininfo_t info;
     int ret;
     int shutdown_reason;
 
     /* Convenience aliases */
-    const uint32_t domid = dss->domid;
+    const uint32_t domid = dsps->domid;
 
     ret = xc_domain_getinfolist(CTX->xch, domid, 1, &info);
     if (ret < 0) {
@@ -302,71 +359,73 @@ static void suspend_common_wait_guest_check(libxl__egc *egc,
     }
 
     LOG(DEBUG, "guest has suspended");
-    domain_suspend_common_guest_suspended(egc, dss);
+    domain_suspend_common_guest_suspended(egc, dsps);
     return;
 
  err:
-    domain_suspend_common_done(egc, dss, ERROR_FAIL);
+    domain_suspend_common_done(egc, dsps, ERROR_FAIL);
 }
 
 static void suspend_common_wait_guest_timeout(libxl__egc *egc,
       libxl__ev_time *ev, const struct timeval *requested_abs, int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, guest_timeout);
-    STATE_AO_GC(dss->ao);
+    libxl__domain_suspend_state *dsps = CONTAINER_OF(ev, *dsps, guest_timeout);
+    STATE_AO_GC(dsps->ao);
     if (rc == ERROR_TIMEDOUT) {
         LOG(ERROR, "guest did not suspend, timed out");
         rc = ERROR_GUEST_TIMEDOUT;
     }
-    domain_suspend_common_done(egc, dss, rc);
+    domain_suspend_common_done(egc, dsps, rc);
 }
 
 static void domain_suspend_common_guest_suspended(libxl__egc *egc,
-                                         libxl__domain_suspend_state *dss)
+                                         libxl__domain_suspend_state *dsps)
 {
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(dsps->ao);
     int rc;
 
-    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
-    libxl__ev_xswatch_deregister(gc, &dss->guest_watch);
-    libxl__ev_time_deregister(gc, &dss->guest_timeout);
+    libxl__ev_evtchn_cancel(gc, &dsps->guest_evtchn);
+    libxl__ev_xswatch_deregister(gc, &dsps->guest_watch);
+    libxl__ev_time_deregister(gc, &dsps->guest_timeout);
 
-    if (dss->hvm) {
-        rc = libxl__domain_suspend_device_model(gc, dss);
+    if (dsps->hvm) {
+        rc = libxl__domain_suspend_device_model(gc, dsps);
         if (rc) {
             LOG(ERROR, "libxl__domain_suspend_device_model failed ret=%d", rc);
-            domain_suspend_common_done(egc, dss, rc);
+            domain_suspend_common_done(egc, dsps, rc);
             return;
         }
     }
-    domain_suspend_common_done(egc, dss, 0);
+    domain_suspend_common_done(egc, dsps, 0);
 }
 
 static void domain_suspend_common_done(libxl__egc *egc,
-                                       libxl__domain_suspend_state *dss,
+                                       libxl__domain_suspend_state *dsps,
                                        int rc)
 {
     EGC_GC;
-    assert(!libxl__xswait_inuse(&dss->pvcontrol));
-    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
-    libxl__ev_xswatch_deregister(gc, &dss->guest_watch);
-    libxl__ev_time_deregister(gc, &dss->guest_timeout);
-    dss->callback_common_done(egc, dss, rc);
+    assert(!libxl__xswait_inuse(&dsps->pvcontrol));
+    libxl__ev_evtchn_cancel(gc, &dsps->guest_evtchn);
+    libxl__ev_xswatch_deregister(gc, &dsps->guest_watch);
+    libxl__ev_time_deregister(gc, &dsps->guest_timeout);
+    dsps->callback_common_done(egc, dsps, rc);
 }
 
 void libxl__domain_suspend_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
+    libxl__domain_suspend_state *dsps = &dss->dsps;
 
-    dss->callback_common_done = domain_suspend_callback_common_done;
-    domain_suspend_callback_common(egc, dss);
+    dsps->callback_common_done = domain_suspend_callback_common_done;
+    domain_suspend_callback_common(egc, dsps);
 }
 
 static void domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int rc)
+                                libxl__domain_suspend_state *dsps, int rc)
 {
+    libxl__domain_save_state *dss = CONTAINER_OF(dsps, *dss, dsps);
     dss->rc = rc;
     libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 7ce3eca..c53a148 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2955,11 +2955,12 @@ static inline bool libxl__conversion_helper_inuse
  */
 
 typedef struct libxl__domain_suspend_state libxl__domain_suspend_state;
+typedef struct libxl__domain_save_state libxl__domain_save_state;
 
-typedef void libxl__domain_suspend_cb(libxl__egc*,
-                                      libxl__domain_suspend_state*, int rc);
+typedef void libxl__domain_save_cb(libxl__egc*,
+                                   libxl__domain_save_state*, int rc);
 typedef void libxl__save_device_model_cb(libxl__egc*,
-                                         libxl__domain_suspend_state*, int rc);
+                                         libxl__domain_save_state*, int rc);
 
 /* State for writing a libxl migration v2 stream */
 typedef struct libxl__stream_write_state libxl__stream_write_state;
@@ -2968,7 +2969,7 @@ typedef void (*sws_record_done_cb)(libxl__egc *egc,
 struct libxl__stream_write_state {
     /* filled by the user */
     libxl__ao *ao;
-    libxl__domain_suspend_state *dss;
+    libxl__domain_save_state *dss;
     int fd;
     void (*completion_callback)(libxl__egc *egc,
                                 libxl__stream_write_state *sws,
@@ -3017,9 +3018,32 @@ typedef struct libxl__logdirty_switch {
 } libxl__logdirty_switch;
 
 struct libxl__domain_suspend_state {
+    /* set by caller of libxl__domain_suspend_init */
+    libxl__ao *ao;
+    uint32_t domid;
+
+    /* private */
+    int hvm;
+
+    libxl__ev_evtchn guest_evtchn;
+    int guest_evtchn_lockfd;
+    int guest_responded;
+
+    libxl__xswait_state pvcontrol;
+    libxl__ev_xswatch guest_watch;
+    libxl__ev_time guest_timeout;
+
+    const char *dm_savefile;
+    void (*callback_common_done)(libxl__egc*,
+                                 struct libxl__domain_suspend_state*, int ok);
+};
+int libxl__domain_suspend_init(libxl__egc *egc,
+                               libxl__domain_suspend_state *dsps);
+
+struct libxl__domain_save_state {
     /* set by caller of libxl__domain_save */
     libxl__ao *ao;
-    libxl__domain_suspend_cb *callback;
+    libxl__domain_save_cb *callback;
 
     uint32_t domid;
     int fd;
@@ -3029,22 +3053,14 @@ struct libxl__domain_suspend_state {
     const libxl_domain_remus_info *remus;
     /* private */
     int rc;
-    libxl__ev_evtchn guest_evtchn;
-    int guest_evtchn_lockfd;
     int hvm;
     int xcflags;
-    int guest_responded;
-    libxl__xswait_state pvcontrol;
-    libxl__ev_xswatch guest_watch;
-    libxl__ev_time guest_timeout;
-    const char *dm_savefile;
+    libxl__domain_suspend_state dsps;
     libxl__remus_devices_state rds;
     libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
     int interval; /* checkpoint interval (for Remus) */
     libxl__stream_write_state sws;
     libxl__logdirty_switch logdirty;
-    void (*callback_common_done)(libxl__egc*,
-                                 struct libxl__domain_suspend_state*, int ok);
     /* private for libxl__domain_save_device_model */
     libxl__save_device_model_cb *save_dm_callback;
     libxl__datacopier_state save_dm_datacopier;
@@ -3383,12 +3399,12 @@ struct libxl__domain_create_state {
 
 /* calls dss->callback when done */
 _hidden void libxl__domain_save(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss);
+                                libxl__domain_save_state *dss);
 
 
 /* calls libxl__xc_domain_suspend_done when done */
 _hidden void libxl__xc_domain_save(libxl__egc *egc,
-                                   libxl__domain_suspend_state *dss,
+                                   libxl__domain_save_state *dss,
                                    libxl__save_helper_state *shs);
 /* If rc==0 then retval is the return value from xc_domain_save
  * and errnoval is the errno value it provided.
@@ -3432,16 +3448,16 @@ static inline bool libxl__save_helper_inuse(const libxl__save_helper_state *shs)
 
 /* Each time the dm needs to be saved, we must call suspend and then save */
 _hidden int libxl__domain_suspend_device_model(libxl__gc *gc,
-                                           libxl__domain_suspend_state *dss);
+                                           libxl__domain_suspend_state *dsps);
 _hidden void libxl__domain_save_device_model(libxl__egc *egc,
-                                     libxl__domain_suspend_state *dss,
+                                     libxl__domain_save_state *dss,
                                      libxl__save_device_model_cb *callback);
 
 _hidden const char *libxl__device_model_savefile(libxl__gc *gc, uint32_t domid);
 
-/* calls dss->callback_common_done when done */
+/* calls dsps->callback_common_done when done */
 _hidden void libxl__domain_suspend(libxl__egc *egc,
-                                   libxl__domain_suspend_state *dss);
+                                   libxl__domain_suspend_state *dsps);
 /* used by libxc to suspend the guest during migration */
 _hidden void libxl__domain_suspend_callback(void *data);
 
@@ -3453,9 +3469,9 @@ _hidden void libxl__remus_domain_save_checkpoint_callback(void *data);
 _hidden void libxl__remus_domain_restore_checkpoint_callback(void *data);
 /* Remus setup and teardown*/
 _hidden void libxl__remus_setup(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss);
+                                libxl__domain_save_state *dss);
 _hidden void libxl__remus_teardown(libxl__egc *egc,
-                                   libxl__domain_suspend_state *dss,
+                                   libxl__domain_save_state *dss,
                                    int rc);
 
 /*
diff --git a/tools/libxl/libxl_netbuffer.c b/tools/libxl/libxl_netbuffer.c
index 107e867..c245a4e 100644
--- a/tools/libxl/libxl_netbuffer.c
+++ b/tools/libxl/libxl_netbuffer.c
@@ -41,7 +41,7 @@ int libxl__netbuffer_enabled(libxl__gc *gc)
 int init_subkind_nic(libxl__remus_devices_state *rds)
 {
     int rc, ret;
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
 
     STATE_AO_GC(rds->ao);
 
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index b7fa022..e64792b 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -26,7 +26,7 @@ static void remus_setup_failed(libxl__egc *egc,
                                libxl__remus_devices_state *rds, int rc);
 
 void libxl__remus_setup(libxl__egc *egc,
-                        libxl__domain_suspend_state *dss)
+                        libxl__domain_save_state *dss)
 {
     /* Convenience aliases */
     libxl__remus_devices_state *const rds = &dss->rds;
@@ -59,7 +59,7 @@ out:
 static void remus_setup_done(libxl__egc *egc,
                              libxl__remus_devices_state *rds, int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
     STATE_AO_GC(dss->ao);
 
     if (!rc) {
@@ -76,7 +76,7 @@ static void remus_setup_done(libxl__egc *egc,
 static void remus_setup_failed(libxl__egc *egc,
                                libxl__remus_devices_state *rds, int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -90,7 +90,7 @@ static void remus_teardown_done(libxl__egc *egc,
                                 libxl__remus_devices_state *rds,
                                 int rc);
 void libxl__remus_teardown(libxl__egc *egc,
-                           libxl__domain_suspend_state *dss,
+                           libxl__domain_save_state *dss,
                            int rc)
 {
     EGC_GC;
@@ -105,7 +105,7 @@ static void remus_teardown_done(libxl__egc *egc,
                                 libxl__remus_devices_state *rds,
                                 int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -118,7 +118,7 @@ static void remus_teardown_done(libxl__egc *egc,
 /*---------------------- remus callbacks (save) -----------------------*/
 
 static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int ok);
+                                libxl__domain_suspend_state *dsps, int ok);
 static void remus_devices_postsuspend_cb(libxl__egc *egc,
                                          libxl__remus_devices_state *rds,
                                          int rc);
@@ -130,15 +130,18 @@ void libxl__remus_domain_suspend_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
+    libxl__domain_suspend_state *dsps = &dss->dsps;
 
-    dss->callback_common_done = remus_domain_suspend_callback_common_done;
-    libxl__domain_suspend(egc, dss);
+    dsps->callback_common_done = remus_domain_suspend_callback_common_done;
+    libxl__domain_suspend(egc, dsps);
 }
 
 static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int rc)
+                                libxl__domain_suspend_state *dsps, int rc)
 {
+    libxl__domain_save_state *dss = CONTAINER_OF(dsps, *dss, dsps);
+
     if (rc)
         goto out;
 
@@ -156,7 +159,7 @@ static void remus_devices_postsuspend_cb(libxl__egc *egc,
                                          libxl__remus_devices_state *rds,
                                          int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
 
     if (rc)
         goto out;
@@ -173,7 +176,7 @@ void libxl__remus_domain_resume_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
     STATE_AO_GC(dss->ao);
 
     libxl__remus_devices_state *const rds = &dss->rds;
@@ -185,7 +188,7 @@ static void remus_devices_preresume_cb(libxl__egc *egc,
                                        libxl__remus_devices_state *rds,
                                        int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -218,7 +221,7 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
 void libxl__remus_domain_save_checkpoint_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
     libxl__egc *egc = shs->egc;
     STATE_AO_GC(dss->ao);
 
@@ -229,7 +232,7 @@ void libxl__remus_domain_save_checkpoint_callback(void *data)
 static void remus_checkpoint_stream_written(
     libxl__egc *egc, libxl__stream_write_state *sws, int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(sws, *dss, sws);
+    libxl__domain_save_state *dss = CONTAINER_OF(sws, *dss, sws);
 
     /* Convenience aliases */
     libxl__remus_devices_state *const rds = &dss->rds;
@@ -254,7 +257,7 @@ static void remus_devices_commit_cb(libxl__egc *egc,
                                     libxl__remus_devices_state *rds,
                                     int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
 
     STATE_AO_GC(dss->ao);
 
@@ -289,7 +292,7 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
                                   const struct timeval *requested_abs,
                                   int rc)
 {
-    libxl__domain_suspend_state *dss =
+    libxl__domain_save_state *dss =
                             CONTAINER_OF(ev, *dss, checkpoint_timeout);
 
     STATE_AO_GC(dss->ao);
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index f2ce868..eecb356 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -76,7 +76,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
                argnums, ARRAY_SIZE(argnums));
 }
 
-void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss,
+void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_save_state *dss,
                            libxl__save_helper_state *shs)
 {
     STATE_AO_GC(dss->ao);
diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
index 944a87b..16f667a 100644
--- a/tools/libxl/libxl_stream_write.c
+++ b/tools/libxl/libxl_stream_write.c
@@ -252,7 +252,7 @@ static void libxc_header_done(libxl__egc *egc,
 void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
                                 int rc, int retval, int errnoval)
 {
-    libxl__domain_suspend_state *dss = dss_void;
+    libxl__domain_save_state *dss = dss_void;
     libxl__stream_write_state *stream = &dss->sws;
     STATE_AO_GC(dss->ao);
 
@@ -261,10 +261,10 @@ void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
 
     if (retval) {
         LOGEV(ERROR, errnoval, "saving domain: %s",
-              dss->guest_responded ?
+              dss->dsps.guest_responded ?
               "domain responded to suspend request" :
               "domain did not respond to suspend request");
-        if (!dss->guest_responded)
+        if (!dss->dsps.guest_responded)
             rc = ERROR_GUEST_TIMEDOUT;
         else if (dss->rc)
             rc = dss->rc;
@@ -282,7 +282,7 @@ void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
 static void write_toolstack_record(libxl__egc *egc,
                                    libxl__stream_write_state *stream)
 {
-    libxl__domain_suspend_state *dss = stream->dss;
+    libxl__domain_save_state *dss = stream->dss;
     STATE_AO_GC(stream->ao);
     struct libxl__sr_rec_hdr rec;
     int rc;
@@ -314,7 +314,7 @@ static void write_toolstack_record(libxl__egc *egc,
 static void toolstack_record_done(libxl__egc *egc,
                                   libxl__stream_write_state *stream)
 {
-    libxl__domain_suspend_state *dss = stream->dss;
+    libxl__domain_save_state *dss = stream->dss;
 
     if (dss->type == LIBXL_DOMAIN_TYPE_HVM)
         write_emulator_record(egc, stream);
@@ -329,7 +329,7 @@ static void toolstack_record_done(libxl__egc *egc,
 static void write_emulator_record(libxl__egc *egc,
                                   libxl__stream_write_state *stream)
 {
-    libxl__domain_suspend_state *dss = stream->dss;
+    libxl__domain_save_state *dss = stream->dss;
     libxl__datacopier_state *dc = &stream->emu_dc;
     STATE_AO_GC(stream->ao);
     struct libxl__sr_rec_hdr *rec = &stream->emu_rec_hdr;
@@ -340,7 +340,7 @@ static void write_emulator_record(libxl__egc *egc,
     assert(dss->type == LIBXL_DOMAIN_TYPE_HVM);
 
     /* Convenience aliases */
-    const char *const filename = dss->dm_savefile;
+    const char *const filename = dss->dsps.dm_savefile;
     const uint32_t domid = dss->domid;
 
     libxl__carefd_begin();
-- 
1.9.1
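
Note on the refactoring above: the suspend state (dsps) is now embedded in the save state (dss), and callbacks that receive only dsps recover dss with CONTAINER_OF. The following is a minimal sketch of that pattern, not the libxl code. The struct names, the OUTER_OF macro and run_suspend_round() are hypothetical stand-ins, and OUTER_OF is a plain offsetof-based lookup rather than libxl's CONTAINER_OF.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for libxl__domain_suspend_state
 * and libxl__domain_save_state; not the real libxl definitions. */
struct suspend_state {
    int guest_responded;
    void (*callback_common_done)(struct suspend_state *dsps, int rc);
};

struct save_state {
    int rc;
    struct suspend_state dsps;   /* embedded, as in the patch */
};

/* Plain offsetof-based container lookup (an assumption; libxl's
 * CONTAINER_OF macro has a different calling convention). */
#define OUTER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The callback receives only the inner state and recovers the outer one. */
static void save_suspend_done(struct suspend_state *dsps, int rc)
{
    struct save_state *dss = OUTER_OF(dsps, struct save_state, dsps);
    dss->rc = rc;
}

/* Generic suspend code; it knows nothing about struct save_state. */
static void suspend_finish(struct suspend_state *dsps, int rc)
{
    assert(dsps->callback_common_done != NULL);
    dsps->callback_common_done(dsps, rc);
}

/* Drives one suspend round and returns the rc recorded in the save state. */
static int run_suspend_round(int rc)
{
    struct save_state dss = { 0 };

    dss.dsps.callback_common_done = save_suspend_done;
    suspend_finish(&dss.dsps, rc);
    return dss.rc;
}
```

The point of the split is that suspend_finish() can be reused by a caller whose outer structure is not a save state, as long as that caller installs its own callback_common_done.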


* [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (9 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 10/25] libxl/save: Refactor libxl__domain_suspend_state Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 12:26   ` Ian Campbell
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 12/25] tools/libxl: introduce enum type libxl_checkpointed_stream Yang Hongyang
                   ` (14 subsequent siblings)
  25 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, ian.jackson

From: Wen Congyang <wency@cn.fujitsu.com>

1. Suspend:
a. PVHVM and PV: we suspend the guest the same way (send the suspend
   request to the guest). If the guest doesn't support evtchn, the xenstore
   variant is used, suspending the guest via the XenBus control node.
b. pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to suspend
   the guest.

2. Resume:
a. fast path
   In this case, we don't change the guest's state.
   PV: modify the return code to 1, and then call the domctl:
       XEN_DOMCTL_resumedomain
   PVHVM: same as PV
   HVM: do nothing in modify_returncode, and then call the domctl:
        XEN_DOMCTL_resumedomain
b. slow path
   Used when the guest's state has been changed.
   PV: update start info, and reset all secondary CPU states. Then call the
   domctl: XEN_DOMCTL_resumedomain
   PVHVM and HVM cannot be resumed.

For PVHVM, in my test, calling only the domctl XEN_DOMCTL_resumedomain
works. I am not sure if we should update start info and reset all
secondary CPU states.

For a pure HVM guest, in my test, calling only the domctl
XEN_DOMCTL_resumedomain works.

So we can call libxl__domain_resume(..., 1) if we don't change the guest
state, otherwise call libxl__domain_resume(..., 0).

Under COLO, we will update the guest's state (modify memory, CPU registers,
device status...). In this case, we cannot use the fast path to resume it.
Keep the return code 0, and use the slow path to resume the guest. Resuming
HVM using the slow path is not supported currently; this patch only makes
the resume call not fail.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 tools/libxc/xc_resume.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)
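
Note on the commit message: the resume cases it lists can be read as a small decision table. The sketch below encodes that table only; it is not the xc_resume.c code. The enum names, the STEP_* bits and resume_steps() are hypothetical, and the slow-path rows for PVHVM and HVM reflect the behaviour with this patch applied (just the domctl).

```c
#include <assert.h>

/* Hypothetical guest classification, following the commit message. */
enum guest_kind { GUEST_PV, GUEST_PVHVM, GUEST_HVM };

/* Steps a resume may perform, as described in the commit message. */
enum {
    STEP_SET_RETCODE_1 = 1 << 0, /* modify the suspend return code to 1 */
    STEP_REBUILD_PV    = 1 << 1, /* update start info, reset secondary CPUs */
    STEP_RESUMEDOMAIN  = 1 << 2  /* issue XEN_DOMCTL_resumedomain */
};

/* Returns the set of steps for a given guest kind and path.
 * fast != 0 means the guest state was not changed. */
static unsigned resume_steps(enum guest_kind kind, int fast)
{
    unsigned steps = STEP_RESUMEDOMAIN; /* every path ends with the domctl */

    assert(kind == GUEST_PV || kind == GUEST_PVHVM || kind == GUEST_HVM);

    if (fast) {
        /* PV and PVHVM modify the return code; pure HVM does nothing here. */
        if (kind != GUEST_HVM)
            steps |= STEP_SET_RETCODE_1;
    } else if (kind == GUEST_PV) {
        /* Only PV has a real slow path. */
        steps |= STEP_REBUILD_PV;
    }
    return steps;
}
```

With this table, a COLO-style resume of an HVM guest (fast == 0) reduces to the domctl alone, which is what xc_domain_resume_hvm() in the diff below does.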

diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c
index e67bebd..bd82334 100644
--- a/tools/libxc/xc_resume.c
+++ b/tools/libxc/xc_resume.c
@@ -109,6 +109,23 @@ static int xc_domain_resume_cooperative(xc_interface *xch, uint32_t domid)
     return do_domctl(xch, &domctl);
 }
 
+static int xc_domain_resume_hvm(xc_interface *xch, uint32_t domid)
+{
+    DECLARE_DOMCTL;
+
+    /*
+     * If it is PVHVM, the hypercall return code is 0, because this
+     * is not a fast path resume, we do not modify_returncode as in
+     * xc_domain_resume_cooperative.
+     * (resuming it in a new domain context)
+     *
+     * If it is a HVM, the hypercall is a NOP.
+     */
+    domctl.cmd = XEN_DOMCTL_resumedomain;
+    domctl.domain = domid;
+    return do_domctl(xch, &domctl);
+}
+
 static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
 {
     DECLARE_DOMCTL;
@@ -138,10 +155,7 @@ static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
      */
 #if defined(__i386__) || defined(__x86_64__)
     if ( info.hvm )
-    {
-        ERROR("Cannot resume uncooperative HVM guests");
-        return rc;
-    }
+        return xc_domain_resume_hvm(xch, domid);
 
     if ( xc_domain_get_guest_width(xch, domid, &dinfo->guest_width) != 0 )
     {
-- 
1.9.1


* [PATCH v4 --for 4.6 COLOPre 12/25] tools/libxl: introduce enum type libxl_checkpointed_stream
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (10 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 12:34   ` Ian Campbell
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 13/25] migration/save: pass checkpointed_stream from libxl to libxc Yang Hongyang
                   ` (13 subsequent siblings)
  25 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, ian.jackson

Introduce enum type libxl_checkpointed_stream in the IDL.
Rename the last argument of migrate_receive from "remus" to
"checkpointed" since the semantics of this parameter have
changed.

NOTE:
 libxl_domain_restore_params isn't changed here,
 checkpointed_stream is still an int.
 It has to change eventually and other callers will have to be
 updated to cope (and there should be LIBXL_HAVE_...).

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 tools/libxl/libxl_types.idl |  5 +++++
 tools/libxl/xl_cmdimpl.c    | 13 +++++++------
 2 files changed, 12 insertions(+), 6 deletions(-)
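
Note on the IDL entry: a sketch of how the new enumeration is expected to look from C, and how the option handling in main_migrate_receive() maps onto it. The enum layout is an assumption about the generator output (the constant names and values are taken from the patch); checkpointed_from_opt() and is_checkpointed() are hypothetical helpers written for this example.

```c
#include <assert.h>

/* Roughly what the IDL entry expands to in the generated header.
 * This layout is an assumption; names and values come from the patch. */
typedef enum libxl_checkpointed_stream {
    LIBXL_CHECKPOINTED_STREAM_NONE = 0,
    LIBXL_CHECKPOINTED_STREAM_REMUS = 1
} libxl_checkpointed_stream;

/* Hypothetical helper mirroring the option switch in
 * main_migrate_receive(): only 'r' selects a checkpointed stream. */
static int checkpointed_from_opt(int current, int opt)
{
    assert(current == LIBXL_CHECKPOINTED_STREAM_NONE ||
           current == LIBXL_CHECKPOINTED_STREAM_REMUS);
    return opt == 'r' ? LIBXL_CHECKPOINTED_STREAM_REMUS : current;
}

/* Hypothetical helper: migrate_receive() still tests the value as a
 * boolean, which works because NONE is 0. */
static int is_checkpointed(int checkpointed)
{
    return checkpointed != LIBXL_CHECKPOINTED_STREAM_NONE;
}
```

Keeping NONE at 0 means existing callers that pass a plain int (as libxl_domain_restore_params still does) keep their meaning.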

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index bc0c4ef..94c230e 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -207,6 +207,11 @@ libxl_hdtype = Enumeration("hdtype", [
     (2, "AHCI"),
     ], init_val = "LIBXL_HDTYPE_IDE")
 
+libxl_checkpointed_stream = Enumeration("checkpointed_stream", [
+    (0, "NONE"),
+    (1, "REMUS"),
+    ])
+
 #
 # Complex libxl types
 #
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 13e154d..aa641fd 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -4282,7 +4282,7 @@ static void migrate_domain(uint32_t domid, const char *rune, int debug,
 }
 
 static void migrate_receive(int debug, int daemonize, int monitor,
-                            int send_fd, int recv_fd, int remus)
+                            int send_fd, int recv_fd, int checkpointed)
 {
     uint32_t domid;
     int rc, rc2;
@@ -4307,7 +4307,7 @@ static void migrate_receive(int debug, int daemonize, int monitor,
     dom_info.paused = 1;
     dom_info.migrate_fd = recv_fd;
     dom_info.migration_domname_r = &migration_domname;
-    dom_info.checkpointed_stream = remus;
+    dom_info.checkpointed_stream = checkpointed;
 
     rc = create_domain(&dom_info);
     if (rc < 0) {
@@ -4318,7 +4318,7 @@ static void migrate_receive(int debug, int daemonize, int monitor,
 
     domid = rc;
 
-    if (remus) {
+    if (checkpointed) {
         /* If we are here, it means that the sender (primary) has crashed.
          * TODO: Split-Brain Check.
          */
@@ -4489,7 +4489,8 @@ int main_restore(int argc, char **argv)
 
 int main_migrate_receive(int argc, char **argv)
 {
-    int debug = 0, daemonize = 1, monitor = 1, remus = 0;
+    int debug = 0, daemonize = 1, monitor = 1;
+    int checkpointed = LIBXL_CHECKPOINTED_STREAM_NONE;
     int opt;
 
     SWITCH_FOREACH_OPT(opt, "Fedr", NULL, "migrate-receive", 0) {
@@ -4504,7 +4505,7 @@ int main_migrate_receive(int argc, char **argv)
         debug = 1;
         break;
     case 'r':
-        remus = 1;
+        checkpointed = LIBXL_CHECKPOINTED_STREAM_REMUS;
         break;
     }
 
@@ -4514,7 +4515,7 @@ int main_migrate_receive(int argc, char **argv)
     }
     migrate_receive(debug, daemonize, monitor,
                     STDOUT_FILENO, STDIN_FILENO,
-                    remus);
+                    checkpointed);
 
     return 0;
 }
-- 
1.9.1


* [PATCH v4 --for 4.6 COLOPre 13/25] migration/save: pass checkpointed_stream from libxl to libxc
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (11 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 12/25] tools/libxl: introduce enum type libxl_checkpointed_stream Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 12:38   ` Ian Campbell
  2015-07-16 16:10   ` Wei Liu
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 14/25] tools/libxl: introduce libxl__domain_restore_device_model to load qemu state Yang Hongyang
                   ` (12 subsequent siblings)
  25 siblings, 2 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, Ian Jackson

Pass checkpointed_stream from libxl to libxc.
It won't affect legacy migration because legacy migration
doesn't use this parameter.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/libxc/include/xenguest.h   |  9 ++++++---
 tools/libxc/xc_domain_save.c     |  6 ++++--
 tools/libxc/xc_nomigrate.c       |  3 ++-
 tools/libxc/xc_sr_common.h       |  2 +-
 tools/libxc/xc_sr_save.c         |  5 +++--
 tools/libxl/libxl.c              |  2 ++
 tools/libxl/libxl_dom_save.c     | 11 ++++++++---
 tools/libxl/libxl_internal.h     |  1 +
 tools/libxl/libxl_save_callout.c |  2 +-
 tools/libxl/libxl_save_helper.c  |  3 ++-
 10 files changed, 30 insertions(+), 14 deletions(-)
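
Note on the libxl side of this change: the stream kind, rather than the presence of Remus info, now selects the checkpointing setup, and a checkpointed stream without checkpoint info is rejected. The sketch below shows that logic in isolation under simplified assumptions. struct remus_info, struct save_req, prepare_save() and the STREAM_* constants are hypothetical stand-ins; only the compress flag bit is copied from xenguest.h.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for LIBXL_CHECKPOINTED_STREAM_NONE / _REMUS. */
enum { STREAM_NONE = 0, STREAM_REMUS = 1 };

/* Same bit as XCFLAGS_CHECKPOINT_COMPRESS in xenguest.h. */
#define FLAG_CHECKPOINT_COMPRESS (1 << 4)

/* Hypothetical, simplified stand-in for libxl_domain_remus_info. */
struct remus_info {
    int interval;
    int compression;
};

/* Hypothetical, simplified stand-in for libxl__domain_save_state. */
struct save_req {
    int checkpointed_stream;
    const struct remus_info *remus;
    int interval;
    int xcflags;
};

/* Returns 0 on success, or -1 if the stream is marked checkpointed
 * but no checkpoint info was supplied. */
static int prepare_save(struct save_req *req)
{
    assert(req != NULL);

    if (req->checkpointed_stream && !req->remus)
        return -1;

    if (req->checkpointed_stream == STREAM_REMUS) {
        req->interval = req->remus->interval;
        if (req->remus->compression)
            req->xcflags |= FLAG_CHECKPOINT_COMPRESS;
    }
    return 0;
}
```

Note that no "checkpointed" flag bit is set here: after this patch the stream kind travels to libxc as its own argument instead of through XCFLAGS_CHECKPOINTED.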

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index e95af54..6e24b6c 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -30,7 +30,6 @@
 #define XCFLAGS_HVM       (1 << 2)
 #define XCFLAGS_STDVGA    (1 << 3)
 #define XCFLAGS_CHECKPOINT_COMPRESS    (1 << 4)
-#define XCFLAGS_CHECKPOINTED    (1 << 5)
 
 #define X86_64_B_SIZE   64 
 #define X86_32_B_SIZE   32
@@ -85,16 +84,20 @@ struct save_callbacks {
  * @parm xch a handle to an open hypervisor interface
  * @parm fd the file descriptor to save a domain to
  * @parm dom the id of the domain
+ * @parm checkpointed_stream non-zero if the far end of the stream is using
+ *       checkpointing
  * @return 0 on success, -1 on failure
  */
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
                    uint32_t max_factor, uint32_t flags /* XCFLAGS_xxx */,
-                   struct save_callbacks* callbacks, int hvm);
+                   struct save_callbacks* callbacks, int hvm,
+                   int checkpointed_stream);
 
 /* Domain Save v2 */
 int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
                     uint32_t max_factor, uint32_t flags,
-                    struct save_callbacks* callbacks, int hvm);
+                    struct save_callbacks* callbacks, int hvm,
+                    int checkpointed_stream);
 
 /* callbacks provided by xc_domain_restore */
 struct restore_callbacks {
diff --git a/tools/libxc/xc_domain_save.c b/tools/libxc/xc_domain_save.c
index 3222473..0da3cca 100644
--- a/tools/libxc/xc_domain_save.c
+++ b/tools/libxc/xc_domain_save.c
@@ -802,7 +802,8 @@ static int save_tsc_info(xc_interface *xch, uint32_t dom, int io_fd)
 
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
                    uint32_t max_factor, uint32_t flags,
-                   struct save_callbacks* callbacks, int hvm)
+                   struct save_callbacks* callbacks, int hvm,
+                   int checkpointed_stream)
 {
     xc_dominfo_t info;
     DECLARE_DOMCTL;
@@ -897,7 +898,8 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
     if ( getenv("XG_MIGRATION_V2") )
     {
         return xc_domain_save2(xch, io_fd, dom, max_iters,
-                               max_factor, flags, callbacks, hvm);
+                               max_factor, flags, callbacks, hvm,
+                               checkpointed_stream);
     }
 
     DPRINTF("%s: starting save of domid %u", __func__, dom);
diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
index 76978a0..374d5bf 100644
--- a/tools/libxc/xc_nomigrate.c
+++ b/tools/libxc/xc_nomigrate.c
@@ -23,7 +23,8 @@
 
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
                    uint32_t max_factor, uint32_t flags,
-                   struct save_callbacks* callbacks, int hvm)
+                   struct save_callbacks* callbacks, int hvm,
+                   int checkpointed_stream)
 {
     errno = ENOSYS;
     return -1;
diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 64f6082..28755ac 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -178,7 +178,7 @@ struct xc_sr_context
             bool live;
 
             /* Plain VM, or checkpoints over time. */
-            bool checkpointed;
+            int checkpointed;
 
             /* Further debugging information in the stream. */
             bool debug;
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index d63b783..6102b66 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -820,7 +820,8 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
 
 int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom,
                     uint32_t max_iters, uint32_t max_factor, uint32_t flags,
-                    struct save_callbacks* callbacks, int hvm)
+                    struct save_callbacks* callbacks, int hvm,
+                    int checkpointed_stream)
 {
     xen_pfn_t nr_pfns;
     struct xc_sr_context ctx =
@@ -833,7 +834,7 @@ int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom,
     ctx.save.callbacks = callbacks;
     ctx.save.live  = !!(flags & XCFLAGS_LIVE);
     ctx.save.debug = !!(flags & XCFLAGS_DEBUG);
-    ctx.save.checkpointed = !!(flags & XCFLAGS_CHECKPOINTED);
+    ctx.save.checkpointed = checkpointed_stream;
 
     /*
      * TODO: Find some time to better tweak the live migration algorithm.
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 05688cd..5b2d045 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -840,6 +840,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
     dss->live = 1;
     dss->debug = 0;
     dss->remus = info;
+    dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_REMUS;
 
     assert(info);
 
@@ -894,6 +895,7 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
     dss->type = type;
     dss->live = flags & LIBXL_SUSPEND_LIVE;
     dss->debug = flags & LIBXL_SUSPEND_DEBUG;
+    dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_NONE;
 
     libxl__domain_save(egc, dss);
     return AO_INPROGRESS;
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 6348cae..f89f5d4 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -389,6 +389,12 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
         &dss->sws.shs.callbacks.save.a;
     libxl__domain_suspend_state *dsps = &dss->dsps;
 
+    if (dss->checkpointed_stream && !r_info) {
+        LOG(ERROR, "Migration stream is checkpointed, but there's no "
+                   "checkpoint info!");
+        goto out;
+    }
+
     dss->rc = 0;
     logdirty_init(&dss->logdirty);
     dsps->ao = ao;
@@ -412,15 +418,14 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
           | (debug ? XCFLAGS_DEBUG : 0)
           | (dss->hvm ? XCFLAGS_HVM : 0);
 
-    if (r_info != NULL) {
+    if (dss->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_REMUS) {
         dss->interval = r_info->interval;
-        dss->xcflags |= XCFLAGS_CHECKPOINTED;
         if (libxl_defbool_val(r_info->compression))
             dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
     }
 
     memset(callbacks, 0, sizeof(*callbacks));
-    if (r_info != NULL) {
+    if (dss->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_REMUS) {
         callbacks->suspend = libxl__remus_domain_suspend_callback;
         callbacks->postcopy = libxl__remus_domain_resume_callback;
         callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index c53a148..0eb5f41 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3050,6 +3050,7 @@ struct libxl__domain_save_state {
     libxl_domain_type type;
     int live;
     int debug;
+    int checkpointed_stream;
     const libxl_domain_remus_info *remus;
     /* private */
     int rc;
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index eecb356..f393abc 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -86,7 +86,7 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_save_state *dss,
 
     const unsigned long argnums[] = {
         dss->domid, 0, 0, dss->xcflags, dss->hvm,
-        cbflags,
+        cbflags, dss->checkpointed_stream,
     };
 
     shs->ao = ao;
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 1622bb7..4c9d34c 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -250,6 +250,7 @@ int main(int argc, char **argv)
         uint32_t flags =           strtoul(NEXTARG,0,10);
         int hvm =                  atoi(NEXTARG);
         unsigned cbflags =         strtoul(NEXTARG,0,10);
+        int checkpointed_stream =  strtoul(NEXTARG,0,10);
         assert(!*++argv);
 
         helper_setcallbacks_save(&helper_save_callbacks, cbflags);
@@ -258,7 +259,7 @@ int main(int argc, char **argv)
         setup_signals(save_signal_handler);
 
         r = xc_domain_save2(xch, io_fd, dom, max_iters, max_factor, flags,
-                           &helper_save_callbacks, hvm);
+                           &helper_save_callbacks, hvm, checkpointed_stream);
         complete(r);
 
     } else if (!strcmp(mode,"--restore-domain")) {
-- 
1.9.1

* [PATCH v4 --for 4.6 COLOPre 14/25] tools/libxl: introduce libxl__domain_restore_device_model to load qemu state
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (12 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 13/25] migration/save: pass checkpointed_stream from libxl to libxc Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 12:45   ` Ian Campbell
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 15/25] tools/libxl: check QEMU state before resume dm Yang Hongyang
                   ` (11 subsequent siblings)
  25 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, Anthony Perard, guijianfeng, rshriram, ian.jackson

In normal migration, the QEMU state is passed to QEMU as a parameter.
With COLO, the secondary VM is already running, so we do the following
steps at every checkpoint:
1. suspend both the primary VM and the secondary VM
2. sync the state
3. resume both the primary VM and the secondary VM
The primary sends QEMU's state in step 2, and the secondary's QEMU
must read it and restore the state before it is resumed. We cannot
pass the state to QEMU as a parameter because the secondary QEMU has
already started at this point, so we introduce
libxl__domain_restore_device_model() to do it.
This API should be called before resuming the secondary VM.
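
The ordering constraint described above can be sketched as a small
standalone model. This is only an illustration: struct secondary_vm and
the secondary_*() helpers are hypothetical names, not libxl code, and
secondary_restore_dm() merely stands in for
libxl__domain_restore_device_model().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the secondary side; these names are not libxl's. */
struct secondary_vm {
    int suspended;    /* 1 while the VM is paused for a checkpoint */
    int state_synced; /* 1 once the primary's state has arrived */
    int dm_restored;  /* 1 once QEMU has loaded the new state */
};

/* Step 1: suspend, and forget anything left over from the last checkpoint. */
static int secondary_suspend(struct secondary_vm *s)
{
    assert(s != NULL);
    s->suspended = 1;
    s->state_synced = 0;
    s->dm_restored = 0;
    return 0;
}

/* Step 2: the state can only be synced while the VM is suspended. */
static int secondary_sync_state(struct secondary_vm *s)
{
    if (!s->suspended)
        return -1;
    s->state_synced = 1;
    return 0;
}

/* Stand-in for libxl__domain_restore_device_model(). */
static int secondary_restore_dm(struct secondary_vm *s)
{
    if (!s->state_synced)
        return -1;
    s->dm_restored = 1;
    return 0;
}

/* Step 3: refuse to resume until QEMU has loaded the synced state. */
static int secondary_resume(struct secondary_vm *s)
{
    if (!s->suspended || !s->dm_restored)
        return -1;
    s->suspended = 0;
    return 0;
}

/* One full checkpoint on the secondary, in the required order. */
static int secondary_checkpoint(struct secondary_vm *s)
{
    if (secondary_suspend(s))
        return -1;
    if (secondary_sync_state(s))
        return -1;
    if (secondary_restore_dm(s))
        return -1;
    return secondary_resume(s);
}
```

In this model, resuming before the device model state has been restored
is rejected, which is the property the new API is meant to make
possible on the secondary.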

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Anthony Perard <anthony.perard@citrix.com>
---
 tools/libxl/libxl_dom_save.c | 29 +++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.h |  3 +++
 tools/libxl/libxl_qmp.c      | 10 ++++++++++
 3 files changed, 42 insertions(+)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index f89f5d4..0926b71 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -675,6 +675,35 @@ out:
     return ret;
 }
 
+int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid)
+{
+    char *state_file;
+    int rc;
+
+    switch (libxl__device_model_version_running(gc, domid)) {
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+        /* not supported now */
+        rc = ERROR_INVAL;
+        break;
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+        /*
+         * This function may be called too many times for the same gc,
+         * so we use NOGC, and free the memory before return to avoid
+         * OOM.
+         */
+        state_file = libxl__sprintf(NOGC,
+                                    XC_DEVICE_MODEL_RESTORE_FILE".%d",
+                                    domid);
+        rc = libxl__qmp_restore(gc, domid, state_file);
+        free(state_file);
+        break;
+    default:
+        rc = ERROR_INVAL;
+    }
+
+    return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 0eb5f41..fb777c1 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1074,6 +1074,7 @@ _hidden int libxl__domain_rename(libxl__gc *gc, uint32_t domid,
 
 _hidden int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
                                      uint32_t size, void *data);
+_hidden int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid);
 _hidden int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid);
 
 _hidden const char *libxl__userdata_path(libxl__gc *gc, uint32_t domid,
@@ -1702,6 +1703,8 @@ _hidden int libxl__qmp_stop(libxl__gc *gc, int domid);
 _hidden int libxl__qmp_resume(libxl__gc *gc, int domid);
 /* Save current QEMU state into fd. */
 _hidden int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename);
+/* Load current QEMU state from fd. */
+_hidden int libxl__qmp_restore(libxl__gc *gc, int domid, const char *filename);
 /* Set dirty bitmap logging status */
 _hidden int libxl__qmp_set_global_dirty_log(libxl__gc *gc, int domid, bool enable);
 _hidden int libxl__qmp_insert_cdrom(libxl__gc *gc, int domid, const libxl_device_disk *disk);
diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index 6484f5e..080cb9f 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -904,6 +904,16 @@ int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename)
                            NULL, NULL);
 }
 
+int libxl__qmp_restore(libxl__gc *gc, int domid, const char *state_file)
+{
+    libxl__json_object *args = NULL;
+
+    qmp_parameters_add_string(gc, &args, "filename", state_file);
+
+    return qmp_run_command(gc, domid, "xen-load-devices-state", args,
+                           NULL, NULL);
+}
+
 static int qmp_change(libxl__gc *gc, libxl__qmp_handler *qmp,
                       char *device, char *target, char *arg)
 {
-- 
1.9.1

* [PATCH v4 --for 4.6 COLOPre 15/25] tools/libxl: check QEMU state before resume dm
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (13 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 14/25] tools/libxl: introduce libxl__domain_restore_device_model to load qemu state Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 12:48   ` Ian Campbell
  2015-07-16 14:43   ` Wei Liu
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 16/25] tools/libxl: Update libxl_domain_unpause() to support qemu-xen Yang Hongyang
                   ` (10 subsequent siblings)
  25 siblings, 2 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, Ian Jackson

Check the QEMU state before resuming the device model on
QEMU_XEN_TRADITIONAL: only send "continue" if the state is "paused".
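
The effect of the check can be sketched with a small mock. This is an
assumption-laden illustration, not libxl code: struct dm_mock and
dm_resume() are hypothetical, the state field stands for the value read
from the device model's "/state" xenstore node (NULL if unreadable),
and setting it to "running" stands for the wait that follows the
command.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mock of a qemu-xen-traditional device model. */
struct dm_mock {
    const char *state;  /* value of the "/state" node, or NULL */
    int continue_sent;  /* how many "continue" commands were issued */
};

/* Send "continue" only when the device model is actually paused. */
static void dm_resume(struct dm_mock *dm)
{
    assert(dm != NULL);
    if (dm->state != NULL && strcmp(dm->state, "paused") == 0) {
        dm->continue_sent++;
        dm->state = "running";  /* models waiting for "running" */
    }
}
```

With the guard in place, calling the resume path on a device model that
is already running (or whose state cannot be read) sends no command.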

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_dom_suspend.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
index 6f04c26..686a49b 100644
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -434,11 +434,20 @@ static void domain_suspend_callback_common_done(libxl__egc *egc,
 
 int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
 {
+    char *path;
+    char *state;
 
     switch (libxl__device_model_version_running(gc, domid)) {
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
-        libxl__qemu_traditional_cmd(gc, domid, "continue");
-        libxl__wait_for_device_model_deprecated(gc, domid, "running", NULL, NULL, NULL);
+        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+
+        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
+        state = libxl__xs_read(gc, XBT_NULL, path);
+        if (state != NULL && !strcmp(state, "paused")) {
+            libxl__qemu_traditional_cmd(gc, domid, "continue");
+            libxl__wait_for_device_model_deprecated(gc, domid, "running",
+                                                    NULL, NULL, NULL);
+        }
         break;
     }
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-- 
1.9.1

* [PATCH v4 --for 4.6 COLOPre 16/25] tools/libxl: Update libxl_domain_unpause() to support qemu-xen
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (14 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 15/25] tools/libxl: check QEMU state before resume dm Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 12:50   ` Ian Campbell
  2015-07-16 16:26   ` Wei Liu
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 17/25] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty() Yang Hongyang
                   ` (9 subsequent siblings)
  25 siblings, 2 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, Ian Jackson

Currently, libxl_domain_unpause() only supports
qemu-xen-traditional. Update it to support qemu-xen.
We use libxl__domain_resume_device_model() to unpause the guest's
device model.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 5b2d045..799aead 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -941,8 +941,6 @@ out:
 int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
 {
     GC_INIT(ctx);
-    char *path;
-    char *state;
     int ret, rc = 0;
 
     libxl_domain_type type = libxl__domain_type(gc, domid);
@@ -952,14 +950,11 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
     }
 
     if (type == LIBXL_DOMAIN_TYPE_HVM) {
-        uint32_t dm_domid = libxl_get_stubdom_id(ctx, domid);
-
-        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
-        state = libxl__xs_read(gc, XBT_NULL, path);
-        if (state != NULL && !strcmp(state, "paused")) {
-            libxl__qemu_traditional_cmd(gc, domid, "continue");
-            libxl__wait_for_device_model_deprecated(gc, domid, "running",
-                                         NULL, NULL, NULL);
+        rc = libxl__domain_resume_device_model(gc, domid);
+        if (rc < 0) {
+            LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "failed to unpause device model "
+                       "for domain %u:%d", domid, rc);
+            goto out;
         }
     }
     ret = xc_domain_unpause(ctx->xch, domid);
-- 
1.9.1

* [PATCH v4 --for 4.6 COLOPre 17/25] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (15 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 16/25] tools/libxl: Update libxl_domain_unpause() to support qemu-xen Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 18/25] tools/libxl: export logdirty_init Yang Hongyang
                   ` (8 subsequent siblings)
  25 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, ian.jackson

The secondary VM is running in COLO mode, and we need to send the
secondary VM's dirty page information to the master at each
checkpoint, so we have to enable qemu logdirty on the secondary.

libxl__domain_suspend_common_switch_qemu_logdirty() enables qemu
logdirty, but it uses domain_save_state and calls
libxl__xc_domain_saverestore_async_callback_done() before it
exits, so it cannot be used for the secondary VM.

Update libxl__domain_suspend_common_switch_qemu_logdirty() and
introduce a new API, libxl__domain_common_switch_qemu_logdirty().
This API only uses libxl__logdirty_switch, and calls
lds->callback before it exits.
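
The decoupling can be sketched outside libxl as follows. This is a
minimal standalone illustration, not the patch itself: all names are
hypothetical, CONTAINER_OF_SKETCH is written with offsetof rather than
copied from libxl's CONTAINER_OF, and simulated_rc replaces the real
device model interaction.

```c
#include <assert.h>
#include <stddef.h>

/* Same idea as libxl's CONTAINER_OF, written with offsetof for this sketch. */
#define CONTAINER_OF_SKETCH(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The switch only knows about its own completion callback. */
struct logdirty_switch {
    void (*callback)(struct logdirty_switch *lds, int rc);
};

/* One possible owner of the switch (the save path in the patch). */
struct save_state {
    int rc;
    int done_calls;
    struct logdirty_switch logdirty;
};

/* The owner's callback recovers its own state from the embedded switch. */
static void save_logdirty_done(struct logdirty_switch *lds, int rc)
{
    struct save_state *dss =
        CONTAINER_OF_SKETCH(lds, struct save_state, logdirty);

    if (rc)
        dss->rc = rc;
    dss->done_calls++;
}

/* Generic switch: reports the result without knowing who owns it. */
static void switch_logdirty(struct logdirty_switch *lds, int simulated_rc)
{
    assert(lds != NULL && lds->callback != NULL);
    lds->callback(lds, simulated_rc);
}
```

Another owner, such as the secondary side, could embed the same switch
in a different structure and install a different callback without
touching switch_logdirty().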

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_dom_save.c | 93 ++++++++++++++++++++++++--------------------
 tools/libxl/libxl_internal.h |  8 ++++
 2 files changed, 59 insertions(+), 42 deletions(-)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 0926b71..ba7fc42 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -59,7 +59,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
                             const char *watch_path, const char *event_path);
 static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_save_state *dss, int rc);
+                                 libxl__logdirty_switch *lds, int rc);
 
 static void logdirty_init(libxl__logdirty_switch *lds)
 {
@@ -69,13 +69,10 @@ static void logdirty_init(libxl__logdirty_switch *lds)
 }
 
 static void domain_suspend_switch_qemu_xen_traditional_logdirty
-                               (int domid, unsigned enable,
-                                libxl__save_helper_state *shs)
+                               (libxl__egc *egc, int domid, unsigned enable,
+                                libxl__logdirty_switch *lds)
 {
-    libxl__egc *egc = shs->egc;
-    libxl__domain_save_state *dss = shs->caller_state;
-    libxl__logdirty_switch *lds = &dss->logdirty;
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(lds->ao);
     int rc;
     xs_transaction_t t = 0;
     const char *got;
@@ -137,26 +134,34 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty
  out:
     LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
     libxl__xs_transaction_abort(gc, &t);
-    switch_logdirty_done(egc,dss,rc);
+    switch_logdirty_done(egc,lds,rc);
 }
 
 static void domain_suspend_switch_qemu_xen_logdirty
-                               (int domid, unsigned enable,
-                                libxl__save_helper_state *shs)
+                               (libxl__egc *egc, int domid, unsigned enable,
+                                libxl__logdirty_switch *lds)
 {
-    libxl__egc *egc = shs->egc;
-    libxl__domain_save_state *dss = shs->caller_state;
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(lds->ao);
     int rc;
 
     rc = libxl__qmp_set_global_dirty_log(gc, domid, enable);
-    if (!rc) {
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
-    } else {
+    if (rc)
         LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
+
+    lds->callback(egc, lds, rc);
+}
+
+static void domain_suspend_switch_qemu_logdirty_done
+                        (libxl__egc *egc, libxl__logdirty_switch *lds, int rc)
+{
+    libxl__domain_save_state *dss = CONTAINER_OF(lds, *dss, logdirty);
+
+    if (rc) {
         dss->rc = rc;
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
-    }
+        libxl__xc_domain_saverestore_async_callback_done(egc,
+                                                         &dss->sws.shs, -1);
+    } else
+        libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
 }
 
 void libxl__domain_suspend_common_switch_qemu_logdirty
@@ -165,39 +170,49 @@ void libxl__domain_suspend_common_switch_qemu_logdirty
     libxl__save_helper_state *shs = user;
     libxl__egc *egc = shs->egc;
     libxl__domain_save_state *dss = shs->caller_state;
-    STATE_AO_GC(dss->ao);
+
+    /* convenience aliases */
+    libxl__logdirty_switch *const lds = &dss->logdirty;
+
+    lds->callback = domain_suspend_switch_qemu_logdirty_done;
+    libxl__domain_common_switch_qemu_logdirty(egc, domid, enable, lds);
+}
+
+void libxl__domain_common_switch_qemu_logdirty(libxl__egc *egc,
+                                               int domid, unsigned enable,
+                                               libxl__logdirty_switch *lds)
+{
+    STATE_AO_GC(lds->ao);
 
     switch (libxl__device_model_version_running(gc, domid)) {
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
-        domain_suspend_switch_qemu_xen_traditional_logdirty(domid, enable, shs);
+        domain_suspend_switch_qemu_xen_traditional_logdirty(egc, domid, enable,
+                                                            lds);
         break;
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-        domain_suspend_switch_qemu_xen_logdirty(domid, enable, shs);
+        domain_suspend_switch_qemu_xen_logdirty(egc, domid, enable, lds);
         break;
     default:
         LOG(ERROR,"logdirty switch failed"
             ", no valid device model version found, abandoning suspend");
-        dss->rc = ERROR_FAIL;
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
+        lds->callback(egc, lds, ERROR_FAIL);
     }
 }
 static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
                                     const struct timeval *requested_abs,
                                     int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
-    STATE_AO_GC(dss->ao);
+    libxl__logdirty_switch *lds = CONTAINER_OF(ev, *lds, timeout);
+    STATE_AO_GC(lds->ao);
     LOG(ERROR,"logdirty switch: wait for device model timed out");
-    switch_logdirty_done(egc,dss,ERROR_FAIL);
+    switch_logdirty_done(egc,lds,ERROR_FAIL);
 }
 
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
                             const char *watch_path, const char *event_path)
 {
-    libxl__domain_save_state *dss =
-        CONTAINER_OF(watch, *dss, logdirty.watch);
-    libxl__logdirty_switch *lds = &dss->logdirty;
-    STATE_AO_GC(dss->ao);
+    libxl__logdirty_switch *lds = CONTAINER_OF(watch, *lds, watch);
+    STATE_AO_GC(lds->ao);
     const char *got;
     xs_transaction_t t = 0;
     int rc;
@@ -243,28 +258,20 @@ static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
     if (rc <= 0) {
         if (rc < 0)
             LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
-        switch_logdirty_done(egc,dss,rc);
+        switch_logdirty_done(egc,lds,rc);
     }
 }
 
 static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_save_state *dss,
+                                 libxl__logdirty_switch *lds,
                                  int rc)
 {
-    STATE_AO_GC(dss->ao);
-    libxl__logdirty_switch *lds = &dss->logdirty;
+    STATE_AO_GC(lds->ao);
 
     libxl__ev_xswatch_deregister(gc, &lds->watch);
     libxl__ev_time_deregister(gc, &lds->timeout);
 
-    int broke;
-    if (rc) {
-        broke = -1;
-        dss->rc = rc;
-    } else {
-        broke = 0;
-    }
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, broke);
+    lds->callback(egc, lds, rc);
 }
 
 /*----- callbacks, called by xc_domain_save -----*/
@@ -397,6 +404,8 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
 
     dss->rc = 0;
     logdirty_init(&dss->logdirty);
+    dss->logdirty.ao = ao;
+
     dsps->ao = ao;
     dsps->domid = domid;
     rc = libxl__domain_suspend_init(egc, dsps);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index fb777c1..0b792e3 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3013,6 +3013,11 @@ libxl__stream_write_inuse(const libxl__stream_write_state *stream)
 }
 
 typedef struct libxl__logdirty_switch {
+    /* set by caller of libxl__domain_common_switch_qemu_logdirty */
+    libxl__ao *ao;
+    void (*callback)(libxl__egc *egc, struct libxl__logdirty_switch *lds,
+                     int rc);
+
     const char *cmd;
     const char *cmd_path;
     const char *ret_path;
@@ -3426,6 +3431,9 @@ void libxl__xc_domain_saverestore_async_callback_done(libxl__egc *egc,
 
 _hidden void libxl__domain_suspend_common_switch_qemu_logdirty
                                (int domid, unsigned int enable, void *data);
+_hidden void libxl__domain_common_switch_qemu_logdirty(libxl__egc *egc,
+                                               int domid, unsigned enable,
+                                               libxl__logdirty_switch *lds);
 _hidden int libxl__toolstack_save(uint32_t domid, uint8_t **buf,
         uint32_t *len, void *data);
 
-- 
1.9.1

* [PATCH v4 --for 4.6 COLOPre 18/25] tools/libxl: export logdirty_init
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (16 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 17/25] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty() Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 19/25] tools/libxl: Add back channel to allow migration target send data back Yang Hongyang
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, ian.jackson

We need to enable logdirty on the secondary, so export logdirty_init
for internal use and rename it to libxl__logdirty_init.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_dom_save.c | 4 ++--
 tools/libxl/libxl_internal.h | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index ba7fc42..9364a1d 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -61,7 +61,7 @@ static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
 static void switch_logdirty_done(libxl__egc *egc,
                                  libxl__logdirty_switch *lds, int rc);
 
-static void logdirty_init(libxl__logdirty_switch *lds)
+void libxl__logdirty_init(libxl__logdirty_switch *lds)
 {
     lds->cmd_path = 0;
     libxl__ev_xswatch_init(&lds->watch);
@@ -403,7 +403,7 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
     }
 
     dss->rc = 0;
-    logdirty_init(&dss->logdirty);
+    libxl__logdirty_init(&dss->logdirty);
     dss->logdirty.ao = ao;
 
     dsps->ao = ao;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 0b792e3..219176e 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3025,6 +3025,8 @@ typedef struct libxl__logdirty_switch {
     libxl__ev_time timeout;
 } libxl__logdirty_switch;
 
+_hidden void libxl__logdirty_init(libxl__logdirty_switch *lds);
+
 struct libxl__domain_suspend_state {
     /* set by caller of libxl__domain_suspend_init */
     libxl__ao *ao;
-- 
1.9.1

* [PATCH v4 --for 4.6 COLOPre 19/25] tools/libxl: Add back channel to allow migration target send data back
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (17 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 18/25] tools/libxl: export logdirty_init Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 20/25] tools/libx{l, c}: add back channel to libxc Yang Hongyang
                   ` (6 subsequent siblings)
  25 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, ian.jackson

From: Wen Congyang <wency@cn.fujitsu.com>

In COLO mode, the slave needs to send data to the master, but io_fd
can only be written by the master and only be read by the slave.
Save recv_fd in domain_save_state, and send_fd in
domain_create_state.
Extend the libxl_domain_create_restore() API by adding a send_fd
parameter to it.
Add LIBXL_HAVE_DOMAIN_CREATE_RESTORE_SEND_FD to indicate the API change.
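
A caller that has to build against both signatures could guard on the
feature macro roughly as below. This is a standalone sketch under
stated assumptions: SKETCH_HAVE_CREATE_RESTORE_SEND_FD stands in for
the macro from libxl.h, and sketch_create_restore() stands in for
libxl_domain_create_restore() with most parameters dropped; it simply
returns the back channel fd it ends up with.

```c
#include <assert.h>

/* Stand-in for the feature macro so that this sketch compiles alone. */
#define SKETCH_HAVE_CREATE_RESTORE_SEND_FD 1

#ifdef SKETCH_HAVE_CREATE_RESTORE_SEND_FD
/* New-style signature: the caller supplies the back channel fd. */
static int sketch_create_restore(int restore_fd, int send_fd)
{
    assert(restore_fd >= 0);
    return send_fd;
}
#else
/* Old-style signature: there is no back channel. */
static int sketch_create_restore(int restore_fd)
{
    assert(restore_fd >= 0);
    return -1;
}
#endif

/* Caller code that works with either signature. */
static int caller_restore(int restore_fd, int back_channel_fd)
{
#ifdef SKETCH_HAVE_CREATE_RESTORE_SEND_FD
    return sketch_create_restore(restore_fd, back_channel_fd);
#else
    (void)back_channel_fd;  /* no back channel available */
    return sketch_create_restore(restore_fd);
#endif
}
```

Callers that do not use COLO pass -1 as the back channel fd, which is
also what the compatibility wrappers in this patch do.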

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 tools/libxl/libxl.c                  |  2 +-
 tools/libxl/libxl.h                  | 30 ++++++++++++++++++++++++++++--
 tools/libxl/libxl_create.c           |  9 +++++----
 tools/libxl/libxl_internal.h         |  2 ++
 tools/libxl/libxl_types.idl          |  1 +
 tools/libxl/xl_cmdimpl.c             |  8 +++++++-
 tools/ocaml/libs/xl/xenlight_stubs.c |  2 +-
 7 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 799aead..fcf91f1 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -835,7 +835,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
     dss->callback = remus_failover_cb;
     dss->domid = domid;
     dss->fd = send_fd;
-    /* TODO do something with recv_fd */
+    dss->recv_fd = recv_fd;
     dss->type = type;
     dss->live = 1;
     dss->debug = 0;
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 5a7308d..c492d20 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -617,6 +617,15 @@ typedef struct libxl__ctx libxl_ctx;
 #define LIBXL_HAVE_DOMAIN_CREATE_RESTORE_PARAMS 1
 
 /*
+ * LIBXL_HAVE_DOMAIN_CREATE_RESTORE_SEND_FD 1
+ *
+ * If this is defined, libxl_domain_create_restore()'s API has changed to
+ * include a send_fd param which is used for the libxl migration back
+ * channel during COLO FT.
+ */
+#define LIBXL_HAVE_DOMAIN_CREATE_RESTORE_SEND_FD 1
+
+/*
  * LIBXL_HAVE_CREATEINFO_PVH
  * If this is defined, then libxl supports creation of a PVH guest.
  */
@@ -1089,7 +1098,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
                             const libxl_asyncprogress_how *aop_console_how)
                             LIBXL_EXTERNAL_CALLERS_ONLY;
 int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
-                                uint32_t *domid, int restore_fd,
+                                uint32_t *domid, int restore_fd, int send_fd,
                                 const libxl_domain_restore_params *params,
                                 const libxl_asyncop_how *ao_how,
                                 const libxl_asyncprogress_how *aop_console_how)
@@ -1110,7 +1119,7 @@ int static inline libxl_domain_create_restore_0x040200(
     libxl_domain_restore_params_init(&params);
 
     ret = libxl_domain_create_restore(
-        ctx, d_config, domid, restore_fd, &params, ao_how, aop_console_how);
+        ctx, d_config, domid, restore_fd, -1, &params, ao_how, aop_console_how);
 
     libxl_domain_restore_params_dispose(&params);
     return ret;
@@ -1118,6 +1127,23 @@ int static inline libxl_domain_create_restore_0x040200(
 
 #define libxl_domain_create_restore libxl_domain_create_restore_0x040200
 
+#elif defined(LIBXL_API_VERSION) && LIBXL_API_VERSION >= 0x040400 \
+                                 && LIBXL_API_VERSION < 0x040600
+
+int static inline libxl_domain_create_restore_0x040400(
+    libxl_ctx *ctx, libxl_domain_config *d_config,
+    uint32_t *domid, int restore_fd,
+    const libxl_domain_restore_params *params,
+    const libxl_asyncop_how *ao_how,
+    const libxl_asyncprogress_how *aop_console_how)
+    LIBXL_EXTERNAL_CALLERS_ONLY
+{
+    return libxl_domain_create_restore(ctx, d_config, domid, restore_fd,
+                                       -1, params, ao_how, aop_console_how);
+}
+
+#define libxl_domain_create_restore libxl_domain_create_restore_0x040400
+
 #endif
 
   /* A progress report will be made via ao_console_how, of type
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index cbd7693..1d4b13b 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1498,7 +1498,7 @@ static void domain_create_cb(libxl__egc *egc,
                              int rc, uint32_t domid);
 
 static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
-                            uint32_t *domid, int restore_fd,
+                            uint32_t *domid, int restore_fd, int send_fd,
                             const libxl_domain_restore_params *params,
                             const libxl_asyncop_how *ao_how,
                             const libxl_asyncprogress_how *aop_console_how)
@@ -1512,6 +1512,7 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
     libxl_domain_config_init(&cdcs->dcs.guest_config_saved);
     libxl_domain_config_copy(ctx, &cdcs->dcs.guest_config_saved, d_config);
     cdcs->dcs.restore_fd = cdcs->dcs.libxc_fd = restore_fd;
+    cdcs->dcs.send_fd = send_fd;
     if (restore_fd > -1)
         cdcs->dcs.restore_params = *params;
     cdcs->dcs.callback = domain_create_cb;
@@ -1540,17 +1541,17 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
                             const libxl_asyncop_how *ao_how,
                             const libxl_asyncprogress_how *aop_console_how)
 {
-    return do_domain_create(ctx, d_config, domid, -1, NULL,
+    return do_domain_create(ctx, d_config, domid, -1, -1, NULL,
                             ao_how, aop_console_how);
 }
 
 int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
-                                uint32_t *domid, int restore_fd,
+                                uint32_t *domid, int restore_fd, int send_fd,
                                 const libxl_domain_restore_params *params,
                                 const libxl_asyncop_how *ao_how,
                                 const libxl_asyncprogress_how *aop_console_how)
 {
-    return do_domain_create(ctx, d_config, domid, restore_fd, params,
+    return do_domain_create(ctx, d_config, domid, restore_fd, send_fd, params,
                             ao_how, aop_console_how);
 }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 219176e..8a36853 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3057,6 +3057,7 @@ struct libxl__domain_save_state {
 
     uint32_t domid;
     int fd;
+    int recv_fd;
     libxl_domain_type type;
     int live;
     int debug;
@@ -3390,6 +3391,7 @@ struct libxl__domain_create_state {
     libxl_domain_config *guest_config;
     libxl_domain_config guest_config_saved; /* vanilla config */
     int restore_fd, libxc_fd;
+    int send_fd;
     libxl_domain_restore_params restore_params;
     libxl__domain_create_cb *callback;
     libxl_asyncprogress_how aop_console_how;
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 94c230e..e8d3647 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -210,6 +210,7 @@ libxl_hdtype = Enumeration("hdtype", [
 libxl_checkpointed_stream = Enumeration("checkpointed_stream", [
     (0, "NONE"),
     (1, "REMUS"),
+    (2, "COLO"),
     ])
 
 #
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index aa641fd..ace4a65 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -157,6 +157,7 @@ struct domain_create {
     char *extra_config; /* extra config string */
     const char *restore_file;
     int migrate_fd; /* -1 means none */
+    int send_fd; /* -1 means none */
     char **migration_domname_r; /* from malloc */
 };
 
@@ -2560,6 +2561,7 @@ static uint32_t create_domain(struct domain_create *dom_info)
     void *config_data = 0;
     int config_len = 0;
     int restore_fd = -1;
+    int send_fd = -1;
     const libxl_asyncprogress_how *autoconnect_console_how;
     struct save_file_header hdr;
 
@@ -2576,6 +2578,7 @@ static uint32_t create_domain(struct domain_create *dom_info)
         if (migrate_fd >= 0) {
             restore_source = "<incoming migration stream>";
             restore_fd = migrate_fd;
+            send_fd = dom_info->send_fd;
         } else {
             restore_source = restore_file;
             restore_fd = open(restore_file, O_RDONLY);
@@ -2764,7 +2767,7 @@ start:
 
         ret = libxl_domain_create_restore(ctx, &d_config,
                                           &domid, restore_fd,
-                                          &params,
+                                          send_fd, &params,
                                           0, autoconnect_console_how);
 
         libxl_domain_restore_params_dispose(&params);
@@ -4306,6 +4309,7 @@ static void migrate_receive(int debug, int daemonize, int monitor,
     dom_info.monitor = monitor;
     dom_info.paused = 1;
     dom_info.migrate_fd = recv_fd;
+    dom_info.send_fd = send_fd;
     dom_info.migration_domname_r = &migration_domname;
     dom_info.checkpointed_stream = checkpointed;
 
@@ -4476,6 +4480,7 @@ int main_restore(int argc, char **argv)
     dom_info.config_file = config_file;
     dom_info.restore_file = checkpoint_file;
     dom_info.migrate_fd = -1;
+    dom_info.send_fd = -1;
     dom_info.vnc = vnc;
     dom_info.vncautopass = vncautopass;
     dom_info.console_autoconnect = console_autoconnect;
@@ -4943,6 +4948,7 @@ int main_create(int argc, char **argv)
     dom_info.quiet = quiet;
     dom_info.config_file = filename;
     dom_info.migrate_fd = -1;
+    dom_info.send_fd = -1;
     dom_info.vnc = vnc;
     dom_info.vncautopass = vncautopass;
     dom_info.console_autoconnect = console_autoconnect;
diff --git a/tools/ocaml/libs/xl/xenlight_stubs.c b/tools/ocaml/libs/xl/xenlight_stubs.c
index 4133527..1c52c2a 100644
--- a/tools/ocaml/libs/xl/xenlight_stubs.c
+++ b/tools/ocaml/libs/xl/xenlight_stubs.c
@@ -537,7 +537,7 @@ value stub_libxl_domain_create_restore(value ctx, value domain_config, value par
 	restore_fd = Int_val(Field(params, 0));
 
 	caml_enter_blocking_section();
-	ret = libxl_domain_create_restore(CTX, &c_dconfig, &c_domid, restore_fd,
+	ret = libxl_domain_create_restore(CTX, &c_dconfig, &c_domid, restore_fd, -1,
 		&c_params, ao_how, NULL);
 	caml_leave_blocking_section();
 
-- 
1.9.1


* [PATCH v4 --for 4.6 COLOPre 20/25] tools/libx{l, c}: add back channel to libxc
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (18 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 19/25] tools/libxl: Add back channel to allow migration target send data back Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 13:13   ` Ian Campbell
  2015-07-15 13:21   ` Andrew Cooper
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device Yang Hongyang
                   ` (5 subsequent siblings)
  25 siblings, 2 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, Ian Jackson

In COLO mode, both VMs are running, and are considered in sync if the
visible network traffic is identical.  After some time, they fall out of
sync.

At this point, the two VMs have definitely diverged.  Let's call the
primary's dirty bitmap set A and the secondary's dirty bitmap set B.

Sets A and B are different.

Under normal migration, the page data for set A will be sent from the
primary to the secondary.

However, the set difference B - A (let's call this C) is out-of-date on
the secondary (with respect to the primary) and will not be sent by the
primary, as it was not memory dirtied by the primary.  The secondary
needs the page data for C to reconstruct an exact copy of the primary at
the checkpoint.

The secondary cannot calculate C as it doesn't know A.  Instead, the
secondary must send B to the primary, at which point the primary
calculates the union of A and B (let's call this D), which is all the
pages dirtied by either the primary or the secondary, and sends all page
data covered by D.

In the general case, D is a superset of both A and B.  Without the
backchannel dirty bitmap, a COLO checkpoint can't reconstruct a valid
copy of the primary.

The dirty bitmap is transferred on the libxc side, so a back channel
needs to be introduced into libxc.
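
The set arithmetic above can be sketched as plain bitmap operations.  This
is only an illustration, not code from this series: the helper names
(dirty_bitmap_set, dirty_bitmap_test, dirty_bitmap_union,
dirty_bitmap_difference) and the one-bit-per-pfn layout in an array of
unsigned long are assumptions made for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical layout: one bit per pfn, packed into unsigned longs. */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_LONGS(nr_pfns) (((nr_pfns) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Mark one pfn as dirty. */
void dirty_bitmap_set(unsigned long *bm, size_t pfn)
{
    assert(bm);
    bm[pfn / BITS_PER_LONG] |= 1UL << (pfn % BITS_PER_LONG);
}

/* Return 1 if the pfn is marked dirty, 0 otherwise. */
int dirty_bitmap_test(const unsigned long *bm, size_t pfn)
{
    assert(bm);
    return (int)((bm[pfn / BITS_PER_LONG] >> (pfn % BITS_PER_LONG)) & 1UL);
}

/* D = A | B: every page dirtied by either side; this is what gets sent. */
void dirty_bitmap_union(unsigned long *d, const unsigned long *a,
                        const unsigned long *b, size_t nr_pfns)
{
    size_t i, n = BITMAP_LONGS(nr_pfns);

    assert(d && a && b);
    for (i = 0; i < n; i++)
        d[i] = a[i] | b[i];
}

/* C = B - A: pages dirtied only by the secondary. */
void dirty_bitmap_difference(unsigned long *c, const unsigned long *b,
                             const unsigned long *a, size_t nr_pfns)
{
    size_t i, n = BITMAP_LONGS(nr_pfns);

    assert(c && b && a);
    for (i = 0; i < n; i++)
        c[i] = b[i] & ~a[i];
}
```

In this sketch the primary would receive B over the back channel, call
dirty_bitmap_union() with its own A, and send the pages marked in the
result.  dirty_bitmap_difference() is shown only to make C concrete; the
secondary cannot compute it because it never sees A.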

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
commit message:
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/include/xenguest.h   |  8 ++++----
 tools/libxc/xc_domain_restore.c  |  4 ++--
 tools/libxc/xc_domain_save.c     |  4 ++--
 tools/libxc/xc_sr_restore.c      |  2 +-
 tools/libxc/xc_sr_save.c         |  2 +-
 tools/libxl/libxl_save_callout.c | 39 ++++++++++++++++++++++++++-------------
 tools/libxl/libxl_save_helper.c  |  8 ++++++--
 7 files changed, 42 insertions(+), 25 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 6e24b6c..4056955 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -91,13 +91,13 @@ struct save_callbacks {
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
                    uint32_t max_factor, uint32_t flags /* XCFLAGS_xxx */,
                    struct save_callbacks* callbacks, int hvm,
-                   int checkpointed_stream);
+                   int checkpointed_stream, int back_fd);
 
 /* Domain Save v2 */
 int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
                     uint32_t max_factor, uint32_t flags,
                     struct save_callbacks* callbacks, int hvm,
-                    int checkpointed_stream);
+                    int checkpointed_stream, int back_fd);
 
 /* callbacks provided by xc_domain_restore */
 struct restore_callbacks {
@@ -140,7 +140,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
                       unsigned long *console_mfn, domid_t console_domid,
                       unsigned int hvm, unsigned int pae, int superpages,
                       int checkpointed_stream,
-                      struct restore_callbacks *callbacks);
+                      struct restore_callbacks *callbacks, int back_fd);
 
 /* Domain Restore v2 */
 int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
@@ -149,7 +149,7 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
                        unsigned long *console_mfn, domid_t console_domid,
                        unsigned int hvm, unsigned int pae, int superpages,
                        int checkpointed_stream,
-                       struct restore_callbacks *callbacks);
+                       struct restore_callbacks *callbacks, int back_fd);
 /**
  * xc_domain_restore writes a file to disk that contains the device
  * model saved state.
diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
index 3cd3483..63d1e6b 100644
--- a/tools/libxc/xc_domain_restore.c
+++ b/tools/libxc/xc_domain_restore.c
@@ -1515,7 +1515,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
                       unsigned long *console_mfn, domid_t console_domid,
                       unsigned int hvm, unsigned int pae, int superpages,
                       int checkpointed_stream,
-                      struct restore_callbacks *callbacks)
+                      struct restore_callbacks *callbacks, int back_fd)
 {
     DECLARE_DOMCTL;
     xc_dominfo_t info;
@@ -1578,7 +1578,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
         return xc_domain_restore2(
             xch, io_fd, dom, store_evtchn, store_mfn,
             store_domid, console_evtchn, console_mfn, console_domid,
-            hvm,  pae,  superpages, checkpointed_stream, callbacks);
+            hvm,  pae,  superpages, checkpointed_stream, callbacks, back_fd);
     }
 
     DPRINTF("%s: starting restore of new domid %u", __func__, dom);
diff --git a/tools/libxc/xc_domain_save.c b/tools/libxc/xc_domain_save.c
index 0da3cca..b111384 100644
--- a/tools/libxc/xc_domain_save.c
+++ b/tools/libxc/xc_domain_save.c
@@ -803,7 +803,7 @@ static int save_tsc_info(xc_interface *xch, uint32_t dom, int io_fd)
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
                    uint32_t max_factor, uint32_t flags,
                    struct save_callbacks* callbacks, int hvm,
-                   int checkpointed_stream)
+                   int checkpointed_stream, int back_fd)
 {
     xc_dominfo_t info;
     DECLARE_DOMCTL;
@@ -899,7 +899,7 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
     {
         return xc_domain_save2(xch, io_fd, dom, max_iters,
                                max_factor, flags, callbacks, hvm,
-                               checkpointed_stream);
+                               checkpointed_stream, back_fd);
     }
 
     DPRINTF("%s: starting save of domid %u", __func__, dom);
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index bf1ee15..504463e 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -720,7 +720,7 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
                        unsigned long *console_gfn, domid_t console_domid,
                        unsigned int hvm, unsigned int pae, int superpages,
                        int checkpointed_stream,
-                       struct restore_callbacks *callbacks)
+                       struct restore_callbacks *callbacks, int back_fd)
 {
     struct xc_sr_context ctx =
         {
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index 6102b66..d12e5b1 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -821,7 +821,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
 int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom,
                     uint32_t max_iters, uint32_t max_factor, uint32_t flags,
                     struct save_callbacks* callbacks, int hvm,
-                    int checkpointed_stream)
+                    int checkpointed_stream, int back_fd)
 {
     xen_pfn_t nr_pfns;
     struct xc_sr_context ctx =
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index f393abc..f8c6cf0 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -27,7 +27,7 @@
  */
 static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
                        const char *mode_arg,
-                       int stream_fd,
+                       int stream_fd, int back_fd,
                        const int *preserve_fds, int num_preserve_fds,
                        const unsigned long *argnums, int num_argnums);
 
@@ -50,6 +50,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
     /* Convenience aliases */
     const uint32_t domid = dcs->guest_domid;
     const int restore_fd = dcs->libxc_fd;
+    const int send_fd = dcs->send_fd;
     libxl__domain_build_state *const state = &dcs->build_state;
 
     unsigned cbflags =
@@ -72,7 +73,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
     shs->need_results = 1;
     shs->toolstack_data_file = 0;
 
-    run_helper(egc, shs, "--restore-domain", restore_fd, 0, 0,
+    run_helper(egc, shs, "--restore-domain", restore_fd, send_fd, 0, 0,
                argnums, ARRAY_SIZE(argnums));
 }
 
@@ -96,7 +97,7 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_save_state *dss,
     shs->caller_state = dss;
     shs->need_results = 0;
 
-    run_helper(egc, shs, "--save-domain", dss->fd,
+    run_helper(egc, shs, "--save-domain", dss->fd, dss->recv_fd,
                NULL, 0,
                argnums, ARRAY_SIZE(argnums));
     return;
@@ -119,14 +120,29 @@ void libxl__save_helper_init(libxl__save_helper_state *shs)
 }
 
 /*----- helper execution -----*/
+static int dup_fd_helper(libxl__gc *gc, int fd, const char *what)
+{
+    int dup_fd = fd;
+
+    if (fd <= 2) {
+        dup_fd = dup(fd);
+        if (dup_fd < 0) {
+            LOGE(ERROR,"dup %s", what);
+            exit(-1);
+        }
+    }
+    libxl_fd_set_cloexec(CTX, dup_fd, 0);
+
+    return dup_fd;
+}
 
 static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
-                       const char *mode_arg, int stream_fd,
+                       const char *mode_arg, int stream_fd, int back_fd,
                        const int *preserve_fds, int num_preserve_fds,
                        const unsigned long *argnums, int num_argnums)
 {
     STATE_AO_GC(shs->ao);
-    const char *args[4 + num_argnums];
+    const char *args[5 + num_argnums];
     const char **arg = args;
     int i, rc;
 
@@ -154,6 +170,7 @@ static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
     *arg++ = getenv("LIBXL_SAVE_HELPER") ?: LIBEXEC_BIN "/" "libxl-save-helper";
     *arg++ = mode_arg;
     const char **stream_fd_arg = arg++;
+    const char **back_fd_arg = arg++;
     for (i=0; i<num_argnums; i++)
         *arg++ = GCSPRINTF("%lu", argnums[i]);
     *arg++ = 0;
@@ -178,16 +195,12 @@ static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
 
     pid_t pid = libxl__ev_child_fork(gc, &shs->child, helper_exited);
     if (!pid) {
-        if (stream_fd <= 2) {
-            stream_fd = dup(stream_fd);
-            if (stream_fd < 0) {
-                LOGE(ERROR,"dup migration stream fd");
-                exit(-1);
-            }
-        }
-        libxl_fd_set_cloexec(CTX, stream_fd, 0);
+        stream_fd = dup_fd_helper(gc, stream_fd, "migration stream fd");
         *stream_fd_arg = GCSPRINTF("%d", stream_fd);
 
+        back_fd = dup_fd_helper(gc, back_fd, "migration back channel fd");
+        *back_fd_arg = GCSPRINTF("%d", back_fd);
+
         for (i=0; i<num_preserve_fds; i++)
             if (preserve_fds[i] >= 0) {
                 assert(preserve_fds[i] > 2);
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 4c9d34c..9de5694 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -235,6 +235,7 @@ static struct restore_callbacks helper_restore_callbacks;
 int main(int argc, char **argv)
 {
     int r;
+    int back_fd;
 
 #define NEXTARG (++argv, assert(*argv), *argv)
 
@@ -244,6 +245,7 @@ int main(int argc, char **argv)
     if (!strcmp(mode,"--save-domain")) {
 
         io_fd =                    atoi(NEXTARG);
+        back_fd =                  atoi(NEXTARG);
         uint32_t dom =             strtoul(NEXTARG,0,10);
         uint32_t max_iters =       strtoul(NEXTARG,0,10);
         uint32_t max_factor =      strtoul(NEXTARG,0,10);
@@ -259,12 +261,14 @@ int main(int argc, char **argv)
         setup_signals(save_signal_handler);
 
         r = xc_domain_save2(xch, io_fd, dom, max_iters, max_factor, flags,
-                           &helper_save_callbacks, hvm, checkpointed_stream);
+                            &helper_save_callbacks, hvm, checkpointed_stream,
+                            back_fd);
         complete(r);
 
     } else if (!strcmp(mode,"--restore-domain")) {
 
         io_fd =                    atoi(NEXTARG);
+        back_fd =                  atoi(NEXTARG);
         uint32_t dom =             strtoul(NEXTARG,0,10);
         unsigned store_evtchn =    strtoul(NEXTARG,0,10);
         domid_t store_domid =      strtoul(NEXTARG,0,10);
@@ -289,7 +293,7 @@ int main(int argc, char **argv)
                               store_domid, console_evtchn, &console_mfn,
                               console_domid, hvm, pae, superpages,
                               checkpointed,
-                              &helper_restore_callbacks);
+                              &helper_restore_callbacks, back_fd);
         helper_stub_restore_results(store_mfn,console_mfn,0);
         complete(r);
 
-- 
1.9.1


* [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (19 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 20/25] tools/libx{l, c}: add back channel to libxc Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 13:15   ` Ian Campbell
  2015-07-15 13:32   ` Ian Campbell
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 22/25] tools/libxl: adjust the indentation Yang Hongyang
                   ` (4 subsequent siblings)
  25 siblings, 2 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, ian.jackson

This patch is auto-generated by the following commands:
 1. git mv tools/libxl/libxl_remus_device.c tools/libxl/libxl_checkpoint_device.c
 2. perl -pi -e 's/libxl_remus_device/libxl_checkpoint_device/g' tools/libxl/Makefile
 3. perl -pi -e 's/\blibxl__remus_devices/libxl__checkpoint_devices/g' tools/libxl/*.[ch]
 4. perl -pi -e 's/\blibxl__remus_device\b/libxl__checkpoint_device/g' tools/libxl/*.[ch]
 5. perl -pi -e 's/\blibxl__remus_device_instance_ops\b/libxl__checkpoint_device_instance_ops/g' tools/libxl/*.[ch]
 6. perl -pi -e 's/\blibxl__remus_callback\b/libxl__checkpoint_callback/g' tools/libxl/*.[ch]
 7. perl -pi -e 's/\bremus_device_init\b/checkpoint_device_init/g' tools/libxl/*.[ch]
 8. perl -pi -e 's/\bremus_devices_setup\b/checkpoint_devices_setup/g' tools/libxl/*.[ch]
 9. perl -pi -e 's/\bdefine_remus_checkpoint_api\b/define_checkpoint_api/g' tools/libxl/*.[ch]
10. perl -pi -e 's/\brds\b/cds/g' tools/libxl/*.[ch]
11. perl -pi -e 's/REMUS_DEVICE/CHECKPOINT_DEVICE/g' tools/libxl/*.[ch] tools/libxl/*.idl
12. perl -pi -e 's/REMUS_DEVOPS/CHECKPOINT_DEVOPS/g' tools/libxl/*.[ch] tools/libxl/*.idl
13. perl -pi -e 's/\bremus\b/checkpoint/g' tools/libxl/libxl_checkpoint_device.[ch]
14. perl -pi -e 's/\bremus device/checkpoint device/g' tools/libxl/libxl_internal.h
15. perl -pi -e 's/\bRemus device/checkpoint device/g' tools/libxl/libxl_internal.h
16. perl -pi -e 's/\bremus abstract/checkpoint abstract/g' tools/libxl/libxl_internal.h
17. perl -pi -e 's/\bremus invocation/checkpoint invocation/g' tools/libxl/libxl_internal.h
18. perl -pi -e 's/\blibxl__remus_device_\(/libxl__checkpoint_device_(/g' tools/libxl/libxl_internal.h

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 tools/libxl/Makefile                  |   2 +-
 tools/libxl/libxl_checkpoint_device.c | 327 ++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.h          | 112 ++++++------
 tools/libxl/libxl_netbuffer.c         | 108 +++++------
 tools/libxl/libxl_nonetbuffer.c       |  10 +-
 tools/libxl/libxl_remus.c             |  76 ++++----
 tools/libxl/libxl_remus_device.c      | 327 ----------------------------------
 tools/libxl/libxl_remus_disk_drbd.c   |  52 +++---
 tools/libxl/libxl_types.idl           |   4 +-
 9 files changed, 509 insertions(+), 509 deletions(-)
 create mode 100644 tools/libxl/libxl_checkpoint_device.c
 delete mode 100644 tools/libxl/libxl_remus_device.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 2e4c944..3cb3ae9 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -62,7 +62,7 @@ else
 LIBXL_OBJS-y += libxl_no_convert_callout.o
 endif
 
-LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o
+LIBXL_OBJS-y += libxl_remus.o libxl_checkpoint_device.o libxl_remus_disk_drbd.o
 
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
new file mode 100644
index 0000000..109cd23
--- /dev/null
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -0,0 +1,327 @@
+/*
+ * Copyright (C) 2014 FUJITSU LIMITED
+ * Author: Yang Hongyang <yanghy@cn.fujitsu.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+extern const libxl__checkpoint_device_instance_ops remus_device_nic;
+extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
+static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
+    &remus_device_nic,
+    &remus_device_drbd_disk,
+    NULL,
+};
+
+/*----- helper functions -----*/
+
+static int init_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+    /* init device subkind-specific state in the libxl ctx */
+    int rc;
+    STATE_AO_GC(cds->ao);
+
+    if (libxl__netbuffer_enabled(gc)) {
+        rc = init_subkind_nic(cds);
+        if (rc) goto out;
+    }
+
+    rc = init_subkind_drbd_disk(cds);
+    if (rc) goto out;
+
+    rc = 0;
+out:
+    return rc;
+}
+
+static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+    /* cleanup device subkind-specific state in the libxl ctx */
+    STATE_AO_GC(cds->ao);
+
+    if (libxl__netbuffer_enabled(gc))
+        cleanup_subkind_nic(cds);
+
+    cleanup_subkind_drbd_disk(cds);
+}
+
+/*----- setup() and teardown() -----*/
+
+/* callbacks */
+
+static void all_devices_setup_cb(libxl__egc *egc,
+                                 libxl__multidev *multidev,
+                                 int rc);
+static void device_setup_iterate(libxl__egc *egc,
+                                 libxl__ao_device *aodev);
+static void devices_teardown_cb(libxl__egc *egc,
+                                libxl__multidev *multidev,
+                                int rc);
+
+/* checkpoint device setup and teardown */
+
+static libxl__checkpoint_device* checkpoint_device_init(libxl__egc *egc,
+                                              libxl__checkpoint_devices_state *cds,
+                                              libxl__device_kind kind,
+                                              void *libxl_dev)
+{
+    libxl__checkpoint_device *dev = NULL;
+
+    STATE_AO_GC(cds->ao);
+    GCNEW(dev);
+    dev->backend_dev = libxl_dev;
+    dev->kind = kind;
+    dev->cds = cds;
+
+    return dev;
+}
+
+static void checkpoint_devices_setup(libxl__egc *egc,
+                                libxl__checkpoint_devices_state *cds);
+
+void libxl__checkpoint_devices_setup(libxl__egc *egc, libxl__checkpoint_devices_state *cds)
+{
+    int i, rc;
+
+    STATE_AO_GC(cds->ao);
+
+    rc = init_device_subkind(cds);
+    if (rc)
+        goto out;
+
+    cds->num_devices = 0;
+    cds->num_nics = 0;
+    cds->num_disks = 0;
+
+    if (cds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VIF))
+        cds->nics = libxl_device_nic_list(CTX, cds->domid, &cds->num_nics);
+
+    if (cds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VBD))
+        cds->disks = libxl_device_disk_list(CTX, cds->domid, &cds->num_disks);
+
+    if (cds->num_nics == 0 && cds->num_disks == 0)
+        goto out;
+
+    GCNEW_ARRAY(cds->devs, cds->num_nics + cds->num_disks);
+
+    for (i = 0; i < cds->num_nics; i++) {
+        cds->devs[cds->num_devices++] = checkpoint_device_init(egc, cds,
+                                                LIBXL__DEVICE_KIND_VIF,
+                                                &cds->nics[i]);
+    }
+
+    for (i = 0; i < cds->num_disks; i++) {
+        cds->devs[cds->num_devices++] = checkpoint_device_init(egc, cds,
+                                                LIBXL__DEVICE_KIND_VBD,
+                                                &cds->disks[i]);
+    }
+
+    checkpoint_devices_setup(egc, cds);
+
+    return;
+
+out:
+    cds->callback(egc, cds, rc);
+}
+
+static void checkpoint_devices_setup(libxl__egc *egc,
+                                libxl__checkpoint_devices_state *cds)
+{
+    int i, rc;
+
+    STATE_AO_GC(cds->ao);
+
+    libxl__multidev_begin(ao, &cds->multidev);
+    cds->multidev.callback = all_devices_setup_cb;
+    for (i = 0; i < cds->num_devices; i++) {
+        libxl__checkpoint_device *dev = cds->devs[i];
+        dev->ops_index = -1;
+        libxl__multidev_prepare_with_aodev(&cds->multidev, &dev->aodev);
+
+        dev->aodev.rc = ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED;
+        dev->aodev.callback = device_setup_iterate;
+        device_setup_iterate(egc,&dev->aodev);
+    }
+
+    rc = 0;
+    libxl__multidev_prepared(egc, &cds->multidev, rc);
+}
+
+
+static void device_setup_iterate(libxl__egc *egc, libxl__ao_device *aodev)
+{
+    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+    EGC_GC;
+
+    if (aodev->rc != ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED &&
+        aodev->rc != ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH)
+        /* might be success or disaster */
+        goto out;
+
+    do {
+        dev->ops = remus_ops[++dev->ops_index];
+        if (!dev->ops) {
+            libxl_device_nic * nic = NULL;
+            libxl_device_disk * disk = NULL;
+            uint32_t domid;
+            int devid;
+            if (dev->kind == LIBXL__DEVICE_KIND_VIF) {
+                nic = (libxl_device_nic *)dev->backend_dev;
+                domid = nic->backend_domid;
+                devid = nic->devid;
+            } else if (dev->kind == LIBXL__DEVICE_KIND_VBD) {
+                disk = (libxl_device_disk *)dev->backend_dev;
+                domid = disk->backend_domid;
+                devid = libxl__device_disk_dev_number(disk->vdev, NULL, NULL);
+            } else {
+                LOG(ERROR,"device kind not handled by checkpoint: %s",
+                    libxl__device_kind_to_string(dev->kind));
+                aodev->rc = ERROR_FAIL;
+                goto out;
+            }
+            LOG(ERROR,"device not handled by checkpoint"
+                " (device=%s:%"PRId32"/%"PRId32")",
+                libxl__device_kind_to_string(dev->kind),
+                domid, devid);
+            aodev->rc = ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED;
+            goto out;
+        }
+    } while (dev->ops->kind != dev->kind);
+
+    /* found the next ops_index to try */
+    assert(dev->aodev.callback == device_setup_iterate);
+    dev->ops->setup(egc,dev);
+    return;
+
+ out:
+    libxl__multidev_one_callback(egc,aodev);
+}
+
+static void all_devices_setup_cb(libxl__egc *egc,
+                                 libxl__multidev *multidev,
+                                 int rc)
+{
+    STATE_AO_GC(multidev->ao);
+
+    /* Convenience aliases */
+    libxl__checkpoint_devices_state *const cds =
+                            CONTAINER_OF(multidev, *cds, multidev);
+
+    cds->callback(egc, cds, rc);
+}
+
+void libxl__checkpoint_devices_teardown(libxl__egc *egc,
+                                   libxl__checkpoint_devices_state *cds)
+{
+    int i;
+    libxl__checkpoint_device *dev;
+
+    STATE_AO_GC(cds->ao);
+
+    libxl__multidev_begin(ao, &cds->multidev);
+    cds->multidev.callback = devices_teardown_cb;
+    for (i = 0; i < cds->num_devices; i++) {
+        dev = cds->devs[i];
+        if (!dev->ops || !dev->matched)
+            continue;
+
+        libxl__multidev_prepare_with_aodev(&cds->multidev, &dev->aodev);
+        dev->ops->teardown(egc,dev);
+    }
+
+    libxl__multidev_prepared(egc, &cds->multidev, 0);
+}
+
+static void devices_teardown_cb(libxl__egc *egc,
+                                libxl__multidev *multidev,
+                                int rc)
+{
+    int i;
+
+    STATE_AO_GC(multidev->ao);
+
+    /* Convenience aliases */
+    libxl__checkpoint_devices_state *const cds =
+                            CONTAINER_OF(multidev, *cds, multidev);
+
+    /* clean nic */
+    for (i = 0; i < cds->num_nics; i++)
+        libxl_device_nic_dispose(&cds->nics[i]);
+    free(cds->nics);
+    cds->nics = NULL;
+    cds->num_nics = 0;
+
+    /* clean disk */
+    for (i = 0; i < cds->num_disks; i++)
+        libxl_device_disk_dispose(&cds->disks[i]);
+    free(cds->disks);
+    cds->disks = NULL;
+    cds->num_disks = 0;
+
+    cleanup_device_subkind(cds);
+
+    cds->callback(egc, cds, rc);
+}
+
+/*----- checkpointing APIs -----*/
+
+/* callbacks */
+
+static void devices_checkpoint_cb(libxl__egc *egc,
+                                  libxl__multidev *multidev,
+                                  int rc);
+
+/* API implementations */
+
+#define define_checkpoint_api(api)                                \
+void libxl__checkpoint_devices_##api(libxl__egc *egc,                        \
+                                libxl__checkpoint_devices_state *cds)        \
+{                                                                       \
+    int i;                                                              \
+    libxl__checkpoint_device *dev;                                           \
+                                                                        \
+    STATE_AO_GC(cds->ao);                                               \
+                                                                        \
+    libxl__multidev_begin(ao, &cds->multidev);                          \
+    cds->multidev.callback = devices_checkpoint_cb;                     \
+    for (i = 0; i < cds->num_devices; i++) {                            \
+        dev = cds->devs[i];                                             \
+        if (!dev->matched || !dev->ops->api)                            \
+            continue;                                                   \
+        libxl__multidev_prepare_with_aodev(&cds->multidev, &dev->aodev);\
+        dev->ops->api(egc,dev);                                         \
+    }                                                                   \
+                                                                        \
+    libxl__multidev_prepared(egc, &cds->multidev, 0);                   \
+}
+
+define_checkpoint_api(postsuspend);
+
+define_checkpoint_api(preresume);
+
+define_checkpoint_api(commit);
+
+static void devices_checkpoint_cb(libxl__egc *egc,
+                                  libxl__multidev *multidev,
+                                  int rc)
+{
+    STATE_AO_GC(multidev->ao);
+
+    /* Convenience aliases */
+    libxl__checkpoint_devices_state *const cds =
+                            CONTAINER_OF(multidev, *cds, multidev);
+
+    cds->callback(egc, cds, rc);
+}
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 8a36853..901e216 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2741,9 +2741,9 @@ typedef struct libxl__save_helper_state {
                       * marshalling and xc callback functions */
 } libxl__save_helper_state;
 
-/*----- remus device related state structure -----*/
+/*----- checkpoint device related state structure -----*/
 /*
- * The abstract Remus device layer exposes a common
+ * The abstract checkpoint device layer exposes a common
  * set of API to [external] libxl for manipulating devices attached to
  * a guest protected by Remus. The device layer also exposes a set of
  * [internal] interfaces that every device type must implement.
@@ -2751,34 +2751,34 @@ typedef struct libxl__save_helper_state {
  * The following API are exposed to libxl:
  *
  * One-time configuration operations:
- *  +libxl__remus_devices_setup
+ *  +libxl__checkpoint_devices_setup
  *    > Enable output buffering for NICs, setup disk replication, etc.
- *  +libxl__remus_devices_teardown
+ *  +libxl__checkpoint_devices_teardown
  *    > Disable output buffering and disk replication; teardown any
  *       associated external setups like qdiscs for NICs.
  *
  * Operations executed every checkpoint (in order of invocation):
- *  +libxl__remus_devices_postsuspend
- *  +libxl__remus_devices_preresume
- *  +libxl__remus_devices_commit
+ *  +libxl__checkpoint_devices_postsuspend
+ *  +libxl__checkpoint_devices_preresume
+ *  +libxl__checkpoint_devices_commit
  *
  * Each device type needs to implement the interfaces specified in
- * the libxl__remus_device_instance_ops if it wishes to support Remus.
+ * the libxl__checkpoint_device_instance_ops if it wishes to support Remus.
  *
- * The high-level control flow through the Remus device layer is shown below:
+ * The high-level control flow through the checkpoint device layer is shown below:
  *
  * xl remus
  *  |->  libxl_domain_remus_start
- *    |-> libxl__remus_devices_setup
- *      |-> Per-checkpoint libxl__remus_devices_[postsuspend,preresume,commit]
+ *    |-> libxl__checkpoint_devices_setup
+ *      |-> Per-checkpoint libxl__checkpoint_devices_[postsuspend,preresume,commit]
  *        ...
  *        |-> On backup failure, network error or other internal errors:
- *            libxl__remus_devices_teardown
+ *            libxl__checkpoint_devices_teardown
  */
 
-typedef struct libxl__remus_device libxl__remus_device;
-typedef struct libxl__remus_devices_state libxl__remus_devices_state;
-typedef struct libxl__remus_device_instance_ops libxl__remus_device_instance_ops;
+typedef struct libxl__checkpoint_device libxl__checkpoint_device;
+typedef struct libxl__checkpoint_devices_state libxl__checkpoint_devices_state;
+typedef struct libxl__checkpoint_device_instance_ops libxl__checkpoint_device_instance_ops;
 
 /*
  * Interfaces to be implemented by every device subkind that wishes to
@@ -2788,7 +2788,7 @@ typedef struct libxl__remus_device_instance_ops libxl__remus_device_instance_ops
  * synchronous and call dev->aodev.callback directly (as the last
  * thing they do).
  */
-struct libxl__remus_device_instance_ops {
+struct libxl__checkpoint_device_instance_ops {
     /* the device kind this ops belongs to... */
     libxl__device_kind kind;
 
@@ -2799,12 +2799,12 @@ struct libxl__remus_device_instance_ops {
      * Asynchronous.
      */
 
-    void (*postsuspend)(libxl__egc *egc, libxl__remus_device *dev);
-    void (*preresume)(libxl__egc *egc, libxl__remus_device *dev);
-    void (*commit)(libxl__egc *egc, libxl__remus_device *dev);
+    void (*postsuspend)(libxl__egc *egc, libxl__checkpoint_device *dev);
+    void (*preresume)(libxl__egc *egc, libxl__checkpoint_device *dev);
+    void (*commit)(libxl__egc *egc, libxl__checkpoint_device *dev);
 
     /*
-     * setup() and teardown() are refer to the actual remus device.
+     * setup() and teardown() refer to the actual checkpoint device.
      * Asynchronous.
      * teardown is called even if setup fails.
      */
@@ -2813,45 +2813,45 @@ struct libxl__remus_device_instance_ops {
      * device. If matched, the device will then be managed with this set of
      * subkind operations.
      * Yields 0 if the device successfully set up.
-     * REMUS_DEVOPS_DOES_NOT_MATCH if the ops does not match the device.
+     * CHECKPOINT_DEVOPS_DOES_NOT_MATCH if the ops does not match the device.
      * any other rc indicates failure.
      */
-    void (*setup)(libxl__egc *egc, libxl__remus_device *dev);
-    void (*teardown)(libxl__egc *egc, libxl__remus_device *dev);
+    void (*setup)(libxl__egc *egc, libxl__checkpoint_device *dev);
+    void (*teardown)(libxl__egc *egc, libxl__checkpoint_device *dev);
 };
 
-int init_subkind_nic(libxl__remus_devices_state *rds);
-void cleanup_subkind_nic(libxl__remus_devices_state *rds);
-int init_subkind_drbd_disk(libxl__remus_devices_state *rds);
-void cleanup_subkind_drbd_disk(libxl__remus_devices_state *rds);
+int init_subkind_nic(libxl__checkpoint_devices_state *cds);
+void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds);
+int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
+void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
 
-typedef void libxl__remus_callback(libxl__egc *,
-                                   libxl__remus_devices_state *, int rc);
+typedef void libxl__checkpoint_callback(libxl__egc *,
+                                   libxl__checkpoint_devices_state *, int rc);
 
 /*
- * State associated with a remus invocation, including parameters
- * passed to the remus abstract device layer by the remus
+ * State associated with a checkpoint invocation, including parameters
+ * passed to the checkpoint abstract device layer by the remus
  * save/restore machinery.
  */
-struct libxl__remus_devices_state {
-    /*---- must be set by caller of libxl__remus_device_(setup|teardown) ----*/
+struct libxl__checkpoint_devices_state {
+    /*---- must be set by caller of libxl__checkpoint_devices_(setup|teardown) ----*/
 
     libxl__ao *ao;
     uint32_t domid;
-    libxl__remus_callback *callback;
+    libxl__checkpoint_callback *callback;
     int device_kind_flags;
 
     /*----- private for abstract layer only -----*/
 
     int num_devices;
     /*
-     * this array is allocated before setup the remus devices by the
-     * remus abstract layer.
-     * devs may be NULL, means there's no remus devices that has been set up.
+     * this array is allocated by the checkpoint abstract layer before
+     * the checkpoint devices are set up.
+     * devs may be NULL, meaning no checkpoint devices have been set up.
      * the size of this array is 'num_devices', which is the total number
      * of libxl nic devices and disk devices(num_nics + num_disks).
      */
-    libxl__remus_device **devs;
+    libxl__checkpoint_device **devs;
 
     libxl_device_nic *nics;
     int num_nics;
@@ -2873,20 +2873,20 @@ struct libxl__remus_devices_state {
 
 /*
  * Information about a single device being handled by remus.
- * Allocated by the remus abstract layer.
+ * Allocated by the checkpoint abstract layer.
  */
-struct libxl__remus_device {
+struct libxl__checkpoint_device {
     /*----- shared between abstract and concrete layers -----*/
     /*
      * if this is true, that means the subkind ops match the device
      */
     bool matched;
 
-    /*----- set by remus device abstruct layer -----*/
-    /* libxl__device_* which this remus device related to */
+    /*----- set by checkpoint device abstract layer -----*/
+    /* libxl__device_* which this checkpoint device is related to */
     const void *backend_dev;
     libxl__device_kind kind;
-    libxl__remus_devices_state *rds;
+    libxl__checkpoint_devices_state *cds;
     libxl__ao_device aodev;
 
     /*----- private for abstract layer only -----*/
@@ -2897,7 +2897,7 @@ struct libxl__remus_device {
      * individual devices.
      */
     int ops_index;
-    const libxl__remus_device_instance_ops *ops;
+    const libxl__checkpoint_device_instance_ops *ops;
 
     /*----- private for concrete (device-specific) layer -----*/
 
@@ -2905,17 +2905,17 @@ struct libxl__remus_device {
     void *concrete_data;
 };
 
-/* the following 5 APIs are async ops, call rds->callback when done */
-_hidden void libxl__remus_devices_setup(libxl__egc *egc,
-                                        libxl__remus_devices_state *rds);
-_hidden void libxl__remus_devices_teardown(libxl__egc *egc,
-                                           libxl__remus_devices_state *rds);
-_hidden void libxl__remus_devices_postsuspend(libxl__egc *egc,
-                                              libxl__remus_devices_state *rds);
-_hidden void libxl__remus_devices_preresume(libxl__egc *egc,
-                                            libxl__remus_devices_state *rds);
-_hidden void libxl__remus_devices_commit(libxl__egc *egc,
-                                         libxl__remus_devices_state *rds);
+/* the following 5 APIs are async ops, call cds->callback when done */
+_hidden void libxl__checkpoint_devices_setup(libxl__egc *egc,
+                                        libxl__checkpoint_devices_state *cds);
+_hidden void libxl__checkpoint_devices_teardown(libxl__egc *egc,
+                                           libxl__checkpoint_devices_state *cds);
+_hidden void libxl__checkpoint_devices_postsuspend(libxl__egc *egc,
+                                              libxl__checkpoint_devices_state *cds);
+_hidden void libxl__checkpoint_devices_preresume(libxl__egc *egc,
+                                            libxl__checkpoint_devices_state *cds);
+_hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
+                                         libxl__checkpoint_devices_state *cds);
 _hidden int libxl__netbuffer_enabled(libxl__gc *gc);
 
 /*----- Legacy conversion helper -----*/
@@ -3068,7 +3068,7 @@ struct libxl__domain_save_state {
     int hvm;
     int xcflags;
     libxl__domain_suspend_state dsps;
-    libxl__remus_devices_state rds;
+    libxl__checkpoint_devices_state cds;
     libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
     int interval; /* checkpoint interval (for Remus) */
     libxl__stream_write_state sws;
diff --git a/tools/libxl/libxl_netbuffer.c b/tools/libxl/libxl_netbuffer.c
index c245a4e..33c2a42 100644
--- a/tools/libxl/libxl_netbuffer.c
+++ b/tools/libxl/libxl_netbuffer.c
@@ -38,21 +38,21 @@ int libxl__netbuffer_enabled(libxl__gc *gc)
     return 1;
 }
 
-int init_subkind_nic(libxl__remus_devices_state *rds)
+int init_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
     int rc, ret;
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
-    rds->nlsock = nl_socket_alloc();
-    if (!rds->nlsock) {
+    cds->nlsock = nl_socket_alloc();
+    if (!cds->nlsock) {
         LOG(ERROR, "cannot allocate nl socket");
         rc = ERROR_FAIL;
         goto out;
     }
 
-    ret = nl_connect(rds->nlsock, NETLINK_ROUTE);
+    ret = nl_connect(cds->nlsock, NETLINK_ROUTE);
     if (ret) {
         LOG(ERROR, "failed to open netlink socket: %s",
             nl_geterror(ret));
@@ -61,7 +61,7 @@ int init_subkind_nic(libxl__remus_devices_state *rds)
     }
 
     /* get list of all qdiscs installed on network devs. */
-    ret = rtnl_qdisc_alloc_cache(rds->nlsock, &rds->qdisc_cache);
+    ret = rtnl_qdisc_alloc_cache(cds->nlsock, &cds->qdisc_cache);
     if (ret) {
         LOG(ERROR, "failed to allocate qdisc cache: %s",
             nl_geterror(ret));
@@ -70,9 +70,9 @@ int init_subkind_nic(libxl__remus_devices_state *rds)
     }
 
     if (dss->remus->netbufscript) {
-        rds->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
+        cds->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
     } else {
-        rds->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
+        cds->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
                                       libxl__xen_script_dir_path());
     }
 
@@ -82,22 +82,22 @@ out:
     return rc;
 }
 
-void cleanup_subkind_nic(libxl__remus_devices_state *rds)
+void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     /* free qdisc cache */
-    if (rds->qdisc_cache) {
-        nl_cache_clear(rds->qdisc_cache);
-        nl_cache_free(rds->qdisc_cache);
-        rds->qdisc_cache = NULL;
+    if (cds->qdisc_cache) {
+        nl_cache_clear(cds->qdisc_cache);
+        nl_cache_free(cds->qdisc_cache);
+        cds->qdisc_cache = NULL;
     }
 
     /* close & free nlsock */
-    if (rds->nlsock) {
-        nl_close(rds->nlsock);
-        nl_socket_free(rds->nlsock);
-        rds->nlsock = NULL;
+    if (cds->nlsock) {
+        nl_close(cds->nlsock);
+        nl_socket_free(cds->nlsock);
+        cds->nlsock = NULL;
     }
 }
 
@@ -111,17 +111,17 @@ void cleanup_subkind_nic(libxl__remus_devices_state *rds)
  * it must ONLY be used for remus because if driver domains
  * were in use it would constitute a security vulnerability.
  */
-static const char *get_vifname(libxl__remus_device *dev,
+static const char *get_vifname(libxl__checkpoint_device *dev,
                                const libxl_device_nic *nic)
 {
     const char *vifname = NULL;
     const char *path;
     int rc;
 
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     /* Convenience aliases */
-    const uint32_t domid = dev->rds->domid;
+    const uint32_t domid = dev->cds->domid;
 
     path = GCSPRINTF("%s/backend/vif/%d/%d/vifname",
                      libxl__xs_get_dompath(gc, 0), domid, nic->devid);
@@ -144,19 +144,19 @@ static void free_qdisc(libxl__remus_device_nic *remus_nic)
     remus_nic->qdisc = NULL;
 }
 
-static int init_qdisc(libxl__remus_devices_state *rds,
+static int init_qdisc(libxl__checkpoint_devices_state *cds,
                       libxl__remus_device_nic *remus_nic)
 {
     int rc, ret, ifindex;
     struct rtnl_link *ifb = NULL;
     struct rtnl_qdisc *qdisc = NULL;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     /* Now that we have brought up REMUS_IFB device with plug qdisc for
      * this vif, so we need to refill the qdisc cache.
      */
-    ret = nl_cache_refill(rds->nlsock, rds->qdisc_cache);
+    ret = nl_cache_refill(cds->nlsock, cds->qdisc_cache);
     if (ret) {
         LOG(ERROR, "cannot refill qdisc cache: %s", nl_geterror(ret));
         rc = ERROR_FAIL;
@@ -164,7 +164,7 @@ static int init_qdisc(libxl__remus_devices_state *rds,
     }
 
     /* get a handle to the REMUS_IFB interface */
-    ret = rtnl_link_get_kernel(rds->nlsock, 0, remus_nic->ifb, &ifb);
+    ret = rtnl_link_get_kernel(cds->nlsock, 0, remus_nic->ifb, &ifb);
     if (ret) {
         LOG(ERROR, "cannot obtain handle for %s: %s", remus_nic->ifb,
             nl_geterror(ret));
@@ -187,7 +187,7 @@ static int init_qdisc(libxl__remus_devices_state *rds,
      * There is no need to explicitly free this qdisc as its just a
      * reference from the qdisc cache we allocated earlier.
      */
-    qdisc = rtnl_qdisc_get_by_parent(rds->qdisc_cache, ifindex, TC_H_ROOT);
+    qdisc = rtnl_qdisc_get_by_parent(cds->qdisc_cache, ifindex, TC_H_ROOT);
     if (qdisc) {
         const char *tc_kind = rtnl_tc_get_kind(TC_CAST(qdisc));
         /* Sanity check: Ensure that the root qdisc is a plug qdisc. */
@@ -231,19 +231,19 @@ static void netbuf_teardown_script_cb(libxl__egc *egc,
  * $REMUS_IFB (for teardown)
  * setup/teardown as command line arg.
  */
-static void setup_async_exec(libxl__remus_device *dev, char *op)
+static void setup_async_exec(libxl__checkpoint_device *dev, char *op)
 {
     int arraysize, nr = 0;
     char **env = NULL, **args = NULL;
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
-    libxl__remus_devices_state *rds = dev->rds;
+    libxl__checkpoint_devices_state *cds = dev->cds;
     libxl__async_exec_state *aes = &dev->aodev.aes;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     /* Convenience aliases */
-    char *const script = libxl__strdup(gc, rds->netbufscript);
-    const uint32_t domid = rds->domid;
+    char *const script = libxl__strdup(gc, cds->netbufscript);
+    const uint32_t domid = cds->domid;
     const int dev_id = remus_nic->devid;
     const char *const vif = remus_nic->vif;
     const char *const ifb = remus_nic->ifb;
@@ -269,7 +269,7 @@ static void setup_async_exec(libxl__remus_device *dev, char *op)
     args[nr++] = NULL;
     assert(nr == arraysize);
 
-    aes->ao = dev->rds->ao;
+    aes->ao = dev->cds->ao;
     aes->what = GCSPRINTF("%s %s", args[0], args[1]);
     aes->env = env;
     aes->args = args;
@@ -286,13 +286,13 @@ static void setup_async_exec(libxl__remus_device *dev, char *op)
 
 /* setup() and teardown() */
 
-static void nic_setup(libxl__egc *egc, libxl__remus_device *dev)
+static void nic_setup(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     int rc;
     libxl__remus_device_nic *remus_nic;
     const libxl_device_nic *nic = dev->backend_dev;
 
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     /*
      * thers's no subkind of nic devices, so nic ops is always matched
@@ -330,15 +330,15 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
                                    int rc, int status)
 {
     libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
-    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
-    libxl__remus_devices_state *rds = dev->rds;
+    libxl__checkpoint_devices_state *cds = dev->cds;
     const char *out_path_base, *hotplug_error = NULL;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     /* Convenience aliases */
-    const uint32_t domid = rds->domid;
+    const uint32_t domid = cds->domid;
     const int devid = remus_nic->devid;
     const char *const vif = remus_nic->vif;
     const char **const ifb = &remus_nic->ifb;
@@ -377,7 +377,7 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
 
     if (hotplug_error) {
         LOG(ERROR, "netbuf script %s setup failed for vif %s: %s",
-            rds->netbufscript, vif, hotplug_error);
+            cds->netbufscript, vif, hotplug_error);
         rc = ERROR_FAIL;
         goto out;
     }
@@ -388,17 +388,17 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
     }
 
     LOG(DEBUG, "%s will buffer packets from vif %s", *ifb, vif);
-    rc = init_qdisc(rds, remus_nic);
+    rc = init_qdisc(cds, remus_nic);
 
 out:
     aodev->rc = rc;
     aodev->callback(egc, aodev);
 }
 
-static void nic_teardown(libxl__egc *egc, libxl__remus_device *dev)
+static void nic_teardown(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     int rc;
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     setup_async_exec(dev, "teardown");
 
@@ -418,7 +418,7 @@ static void netbuf_teardown_script_cb(libxl__egc *egc,
                                       int rc, int status)
 {
     libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
-    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
 
     if (status && !rc)
@@ -441,12 +441,12 @@ enum {
 /* API implementations */
 
 static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
-                           libxl__remus_devices_state *rds,
+                           libxl__checkpoint_devices_state *cds,
                            int buffer_op)
 {
     int rc, ret;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     if (buffer_op == tc_buffer_start)
         ret = rtnl_qdisc_plug_buffer(remus_nic->qdisc);
@@ -458,7 +458,7 @@ static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
         goto out;
     }
 
-    ret = rtnl_qdisc_add(rds->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
+    ret = rtnl_qdisc_add(cds->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
     if (ret) {
         rc = ERROR_FAIL;
         goto out;
@@ -475,33 +475,33 @@ out:
     return rc;
 }
 
-static void nic_postsuspend(libxl__egc *egc, libxl__remus_device *dev)
+static void nic_postsuspend(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     int rc;
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
 
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
-    rc = remus_netbuf_op(remus_nic, dev->rds, tc_buffer_start);
+    rc = remus_netbuf_op(remus_nic, dev->cds, tc_buffer_start);
 
     dev->aodev.rc = rc;
     dev->aodev.callback(egc, &dev->aodev);
 }
 
-static void nic_commit(libxl__egc *egc, libxl__remus_device *dev)
+static void nic_commit(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     int rc;
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
 
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
-    rc = remus_netbuf_op(remus_nic, dev->rds, tc_buffer_release);
+    rc = remus_netbuf_op(remus_nic, dev->cds, tc_buffer_release);
 
     dev->aodev.rc = rc;
     dev->aodev.callback(egc, &dev->aodev);
 }
 
-const libxl__remus_device_instance_ops remus_device_nic = {
+const libxl__checkpoint_device_instance_ops remus_device_nic = {
     .kind = LIBXL__DEVICE_KIND_VIF,
     .setup = nic_setup,
     .teardown = nic_teardown,
diff --git a/tools/libxl/libxl_nonetbuffer.c b/tools/libxl/libxl_nonetbuffer.c
index 3c659c2..4b68152 100644
--- a/tools/libxl/libxl_nonetbuffer.c
+++ b/tools/libxl/libxl_nonetbuffer.c
@@ -22,25 +22,25 @@ int libxl__netbuffer_enabled(libxl__gc *gc)
     return 0;
 }
 
-int init_subkind_nic(libxl__remus_devices_state *rds)
+int init_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
     return 0;
 }
 
-void cleanup_subkind_nic(libxl__remus_devices_state *rds)
+void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
     return;
 }
 
-static void nic_setup(libxl__egc *egc, libxl__remus_device *dev)
+static void nic_setup(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     dev->aodev.rc = ERROR_FAIL;
     dev->aodev.callback(egc, &dev->aodev);
 }
 
-const libxl__remus_device_instance_ops remus_device_nic = {
+const libxl__checkpoint_device_instance_ops remus_device_nic = {
     .kind = LIBXL__DEVICE_KIND_VIF,
     .setup = nic_setup,
 };
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index e64792b..fb21b6d 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -21,15 +21,15 @@
 /*-------------------- Remus setup and teardown ---------------------*/
 
 static void remus_setup_done(libxl__egc *egc,
-                             libxl__remus_devices_state *rds, int rc);
+                             libxl__checkpoint_devices_state *cds, int rc);
 static void remus_setup_failed(libxl__egc *egc,
-                               libxl__remus_devices_state *rds, int rc);
+                               libxl__checkpoint_devices_state *cds, int rc);
 
 void libxl__remus_setup(libxl__egc *egc,
                         libxl__domain_save_state *dss)
 {
     /* Convenience aliases */
-    libxl__remus_devices_state *const rds = &dss->rds;
+    libxl__checkpoint_devices_state *const cds = &dss->cds;
     const libxl_domain_remus_info *const info = dss->remus;
 
     STATE_AO_GC(dss->ao);
@@ -39,17 +39,17 @@ void libxl__remus_setup(libxl__egc *egc,
             LOG(ERROR, "Remus: No support for network buffering");
             goto out;
         }
-        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
+        cds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
     }
 
     if (libxl_defbool_val(info->diskbuf))
-        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
+        cds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
 
-    rds->ao = ao;
-    rds->domid = dss->domid;
-    rds->callback = remus_setup_done;
+    cds->ao = ao;
+    cds->domid = dss->domid;
+    cds->callback = remus_setup_done;
 
-    libxl__remus_devices_setup(egc, rds);
+    libxl__checkpoint_devices_setup(egc, cds);
     return;
 
 out:
@@ -57,9 +57,9 @@ out:
 }
 
 static void remus_setup_done(libxl__egc *egc,
-                             libxl__remus_devices_state *rds, int rc)
+                             libxl__checkpoint_devices_state *cds, int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
     STATE_AO_GC(dss->ao);
 
     if (!rc) {
@@ -69,14 +69,14 @@ static void remus_setup_done(libxl__egc *egc,
 
     LOG(ERROR, "Remus: failed to setup device for guest with domid %u, rc %d",
         dss->domid, rc);
-    rds->callback = remus_setup_failed;
-    libxl__remus_devices_teardown(egc, rds);
+    cds->callback = remus_setup_failed;
+    libxl__checkpoint_devices_teardown(egc, cds);
 }
 
 static void remus_setup_failed(libxl__egc *egc,
-                               libxl__remus_devices_state *rds, int rc)
+                               libxl__checkpoint_devices_state *cds, int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -87,7 +87,7 @@ static void remus_setup_failed(libxl__egc *egc,
 }
 
 static void remus_teardown_done(libxl__egc *egc,
-                                libxl__remus_devices_state *rds,
+                                libxl__checkpoint_devices_state *cds,
                                 int rc);
 void libxl__remus_teardown(libxl__egc *egc,
                            libxl__domain_save_state *dss,
@@ -97,15 +97,15 @@ void libxl__remus_teardown(libxl__egc *egc,
 
     LOG(WARN, "Remus: Domain suspend terminated with rc %d,"
         " teardown Remus devices...", rc);
-    dss->rds.callback = remus_teardown_done;
-    libxl__remus_devices_teardown(egc, &dss->rds);
+    dss->cds.callback = remus_teardown_done;
+    libxl__checkpoint_devices_teardown(egc, &dss->cds);
 }
 
 static void remus_teardown_done(libxl__egc *egc,
-                                libxl__remus_devices_state *rds,
+                                libxl__checkpoint_devices_state *cds,
                                 int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -120,10 +120,10 @@ static void remus_teardown_done(libxl__egc *egc,
 static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
                                 libxl__domain_suspend_state *dsps, int ok);
 static void remus_devices_postsuspend_cb(libxl__egc *egc,
-                                         libxl__remus_devices_state *rds,
+                                         libxl__checkpoint_devices_state *cds,
                                          int rc);
 static void remus_devices_preresume_cb(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
+                                       libxl__checkpoint_devices_state *cds,
                                        int rc);
 
 void libxl__remus_domain_suspend_callback(void *data)
@@ -145,9 +145,9 @@ static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
     if (rc)
         goto out;
 
-    libxl__remus_devices_state *const rds = &dss->rds;
-    rds->callback = remus_devices_postsuspend_cb;
-    libxl__remus_devices_postsuspend(egc, rds);
+    libxl__checkpoint_devices_state *const cds = &dss->cds;
+    cds->callback = remus_devices_postsuspend_cb;
+    libxl__checkpoint_devices_postsuspend(egc, cds);
     return;
 
 out:
@@ -156,10 +156,10 @@ out:
 }
 
 static void remus_devices_postsuspend_cb(libxl__egc *egc,
-                                         libxl__remus_devices_state *rds,
+                                         libxl__checkpoint_devices_state *cds,
                                          int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
 
     if (rc)
         goto out;
@@ -179,16 +179,16 @@ void libxl__remus_domain_resume_callback(void *data)
     libxl__domain_save_state *dss = shs->caller_state;
     STATE_AO_GC(dss->ao);
 
-    libxl__remus_devices_state *const rds = &dss->rds;
-    rds->callback = remus_devices_preresume_cb;
-    libxl__remus_devices_preresume(egc, rds);
+    libxl__checkpoint_devices_state *const cds = &dss->cds;
+    cds->callback = remus_devices_preresume_cb;
+    libxl__checkpoint_devices_preresume(egc, cds);
 }
 
 static void remus_devices_preresume_cb(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
+                                       libxl__checkpoint_devices_state *cds,
                                        int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -212,7 +212,7 @@ out:
 static void remus_checkpoint_stream_written(
     libxl__egc *egc, libxl__stream_write_state *sws, int rc);
 static void remus_devices_commit_cb(libxl__egc *egc,
-                                    libxl__remus_devices_state *rds,
+                                    libxl__checkpoint_devices_state *cds,
                                     int rc);
 static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
                                   const struct timeval *requested_abs,
@@ -235,7 +235,7 @@ static void remus_checkpoint_stream_written(
     libxl__domain_save_state *dss = CONTAINER_OF(sws, *dss, sws);
 
     /* Convenience aliases */
-    libxl__remus_devices_state *const rds = &dss->rds;
+    libxl__checkpoint_devices_state *const cds = &dss->cds;
 
     STATE_AO_GC(dss->ao);
 
@@ -244,8 +244,8 @@ static void remus_checkpoint_stream_written(
         goto out;
     }
 
-    rds->callback = remus_devices_commit_cb;
-    libxl__remus_devices_commit(egc, rds);
+    cds->callback = remus_devices_commit_cb;
+    libxl__checkpoint_devices_commit(egc, cds);
 
     return;
 
@@ -254,10 +254,10 @@ out:
 }
 
 static void remus_devices_commit_cb(libxl__egc *egc,
-                                    libxl__remus_devices_state *rds,
+                                    libxl__checkpoint_devices_state *cds,
                                     int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
 
     STATE_AO_GC(dss->ao);
 
diff --git a/tools/libxl/libxl_remus_device.c b/tools/libxl/libxl_remus_device.c
deleted file mode 100644
index a6cb7f6..0000000
--- a/tools/libxl/libxl_remus_device.c
+++ /dev/null
@@ -1,327 +0,0 @@
-/*
- * Copyright (C) 2014 FUJITSU LIMITED
- * Author: Yang Hongyang <yanghy@cn.fujitsu.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as published
- * by the Free Software Foundation; version 2.1 only. with the special
- * exception on linking described in file LICENSE.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU Lesser General Public License for more details.
- */
-
-#include "libxl_osdeps.h" /* must come before any other headers */
-
-#include "libxl_internal.h"
-
-extern const libxl__remus_device_instance_ops remus_device_nic;
-extern const libxl__remus_device_instance_ops remus_device_drbd_disk;
-static const libxl__remus_device_instance_ops *remus_ops[] = {
-    &remus_device_nic,
-    &remus_device_drbd_disk,
-    NULL,
-};
-
-/*----- helper functions -----*/
-
-static int init_device_subkind(libxl__remus_devices_state *rds)
-{
-    /* init device subkind-specific state in the libxl ctx */
-    int rc;
-    STATE_AO_GC(rds->ao);
-
-    if (libxl__netbuffer_enabled(gc)) {
-        rc = init_subkind_nic(rds);
-        if (rc) goto out;
-    }
-
-    rc = init_subkind_drbd_disk(rds);
-    if (rc) goto out;
-
-    rc = 0;
-out:
-    return rc;
-}
-
-static void cleanup_device_subkind(libxl__remus_devices_state *rds)
-{
-    /* cleanup device subkind-specific state in the libxl ctx */
-    STATE_AO_GC(rds->ao);
-
-    if (libxl__netbuffer_enabled(gc))
-        cleanup_subkind_nic(rds);
-
-    cleanup_subkind_drbd_disk(rds);
-}
-
-/*----- setup() and teardown() -----*/
-
-/* callbacks */
-
-static void all_devices_setup_cb(libxl__egc *egc,
-                                 libxl__multidev *multidev,
-                                 int rc);
-static void device_setup_iterate(libxl__egc *egc,
-                                 libxl__ao_device *aodev);
-static void devices_teardown_cb(libxl__egc *egc,
-                                libxl__multidev *multidev,
-                                int rc);
-
-/* remus device setup and teardown */
-
-static libxl__remus_device* remus_device_init(libxl__egc *egc,
-                                              libxl__remus_devices_state *rds,
-                                              libxl__device_kind kind,
-                                              void *libxl_dev)
-{
-    libxl__remus_device *dev = NULL;
-
-    STATE_AO_GC(rds->ao);
-    GCNEW(dev);
-    dev->backend_dev = libxl_dev;
-    dev->kind = kind;
-    dev->rds = rds;
-
-    return dev;
-}
-
-static void remus_devices_setup(libxl__egc *egc,
-                                libxl__remus_devices_state *rds);
-
-void libxl__remus_devices_setup(libxl__egc *egc, libxl__remus_devices_state *rds)
-{
-    int i, rc;
-
-    STATE_AO_GC(rds->ao);
-
-    rc = init_device_subkind(rds);
-    if (rc)
-        goto out;
-
-    rds->num_devices = 0;
-    rds->num_nics = 0;
-    rds->num_disks = 0;
-
-    if (rds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VIF))
-        rds->nics = libxl_device_nic_list(CTX, rds->domid, &rds->num_nics);
-
-    if (rds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VBD))
-        rds->disks = libxl_device_disk_list(CTX, rds->domid, &rds->num_disks);
-
-    if (rds->num_nics == 0 && rds->num_disks == 0)
-        goto out;
-
-    GCNEW_ARRAY(rds->devs, rds->num_nics + rds->num_disks);
-
-    for (i = 0; i < rds->num_nics; i++) {
-        rds->devs[rds->num_devices++] = remus_device_init(egc, rds,
-                                                LIBXL__DEVICE_KIND_VIF,
-                                                &rds->nics[i]);
-    }
-
-    for (i = 0; i < rds->num_disks; i++) {
-        rds->devs[rds->num_devices++] = remus_device_init(egc, rds,
-                                                LIBXL__DEVICE_KIND_VBD,
-                                                &rds->disks[i]);
-    }
-
-    remus_devices_setup(egc, rds);
-
-    return;
-
-out:
-    rds->callback(egc, rds, rc);
-}
-
-static void remus_devices_setup(libxl__egc *egc,
-                                libxl__remus_devices_state *rds)
-{
-    int i, rc;
-
-    STATE_AO_GC(rds->ao);
-
-    libxl__multidev_begin(ao, &rds->multidev);
-    rds->multidev.callback = all_devices_setup_cb;
-    for (i = 0; i < rds->num_devices; i++) {
-        libxl__remus_device *dev = rds->devs[i];
-        dev->ops_index = -1;
-        libxl__multidev_prepare_with_aodev(&rds->multidev, &dev->aodev);
-
-        dev->aodev.rc = ERROR_REMUS_DEVICE_NOT_SUPPORTED;
-        dev->aodev.callback = device_setup_iterate;
-        device_setup_iterate(egc,&dev->aodev);
-    }
-
-    rc = 0;
-    libxl__multidev_prepared(egc, &rds->multidev, rc);
-}
-
-
-static void device_setup_iterate(libxl__egc *egc, libxl__ao_device *aodev)
-{
-    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
-    EGC_GC;
-
-    if (aodev->rc != ERROR_REMUS_DEVICE_NOT_SUPPORTED &&
-        aodev->rc != ERROR_REMUS_DEVOPS_DOES_NOT_MATCH)
-        /* might be success or disaster */
-        goto out;
-
-    do {
-        dev->ops = remus_ops[++dev->ops_index];
-        if (!dev->ops) {
-            libxl_device_nic * nic = NULL;
-            libxl_device_disk * disk = NULL;
-            uint32_t domid;
-            int devid;
-            if (dev->kind == LIBXL__DEVICE_KIND_VIF) {
-                nic = (libxl_device_nic *)dev->backend_dev;
-                domid = nic->backend_domid;
-                devid = nic->devid;
-            } else if (dev->kind == LIBXL__DEVICE_KIND_VBD) {
-                disk = (libxl_device_disk *)dev->backend_dev;
-                domid = disk->backend_domid;
-                devid = libxl__device_disk_dev_number(disk->vdev, NULL, NULL);
-            } else {
-                LOG(ERROR,"device kind not handled by remus: %s",
-                    libxl__device_kind_to_string(dev->kind));
-                aodev->rc = ERROR_FAIL;
-                goto out;
-            }
-            LOG(ERROR,"device not handled by remus"
-                " (device=%s:%"PRId32"/%"PRId32")",
-                libxl__device_kind_to_string(dev->kind),
-                domid, devid);
-            aodev->rc = ERROR_REMUS_DEVICE_NOT_SUPPORTED;
-            goto out;
-        }
-    } while (dev->ops->kind != dev->kind);
-
-    /* found the next ops_index to try */
-    assert(dev->aodev.callback == device_setup_iterate);
-    dev->ops->setup(egc,dev);
-    return;
-
- out:
-    libxl__multidev_one_callback(egc,aodev);
-}
-
-static void all_devices_setup_cb(libxl__egc *egc,
-                                 libxl__multidev *multidev,
-                                 int rc)
-{
-    STATE_AO_GC(multidev->ao);
-
-    /* Convenience aliases */
-    libxl__remus_devices_state *const rds =
-                            CONTAINER_OF(multidev, *rds, multidev);
-
-    rds->callback(egc, rds, rc);
-}
-
-void libxl__remus_devices_teardown(libxl__egc *egc,
-                                   libxl__remus_devices_state *rds)
-{
-    int i;
-    libxl__remus_device *dev;
-
-    STATE_AO_GC(rds->ao);
-
-    libxl__multidev_begin(ao, &rds->multidev);
-    rds->multidev.callback = devices_teardown_cb;
-    for (i = 0; i < rds->num_devices; i++) {
-        dev = rds->devs[i];
-        if (!dev->ops || !dev->matched)
-            continue;
-
-        libxl__multidev_prepare_with_aodev(&rds->multidev, &dev->aodev);
-        dev->ops->teardown(egc,dev);
-    }
-
-    libxl__multidev_prepared(egc, &rds->multidev, 0);
-}
-
-static void devices_teardown_cb(libxl__egc *egc,
-                                libxl__multidev *multidev,
-                                int rc)
-{
-    int i;
-
-    STATE_AO_GC(multidev->ao);
-
-    /* Convenience aliases */
-    libxl__remus_devices_state *const rds =
-                            CONTAINER_OF(multidev, *rds, multidev);
-
-    /* clean nic */
-    for (i = 0; i < rds->num_nics; i++)
-        libxl_device_nic_dispose(&rds->nics[i]);
-    free(rds->nics);
-    rds->nics = NULL;
-    rds->num_nics = 0;
-
-    /* clean disk */
-    for (i = 0; i < rds->num_disks; i++)
-        libxl_device_disk_dispose(&rds->disks[i]);
-    free(rds->disks);
-    rds->disks = NULL;
-    rds->num_disks = 0;
-
-    cleanup_device_subkind(rds);
-
-    rds->callback(egc, rds, rc);
-}
-
-/*----- checkpointing APIs -----*/
-
-/* callbacks */
-
-static void devices_checkpoint_cb(libxl__egc *egc,
-                                  libxl__multidev *multidev,
-                                  int rc);
-
-/* API implementations */
-
-#define define_remus_checkpoint_api(api)                                \
-void libxl__remus_devices_##api(libxl__egc *egc,                        \
-                                libxl__remus_devices_state *rds)        \
-{                                                                       \
-    int i;                                                              \
-    libxl__remus_device *dev;                                           \
-                                                                        \
-    STATE_AO_GC(rds->ao);                                               \
-                                                                        \
-    libxl__multidev_begin(ao, &rds->multidev);                          \
-    rds->multidev.callback = devices_checkpoint_cb;                     \
-    for (i = 0; i < rds->num_devices; i++) {                            \
-        dev = rds->devs[i];                                             \
-        if (!dev->matched || !dev->ops->api)                            \
-            continue;                                                   \
-        libxl__multidev_prepare_with_aodev(&rds->multidev, &dev->aodev);\
-        dev->ops->api(egc,dev);                                         \
-    }                                                                   \
-                                                                        \
-    libxl__multidev_prepared(egc, &rds->multidev, 0);                   \
-}
-
-define_remus_checkpoint_api(postsuspend);
-
-define_remus_checkpoint_api(preresume);
-
-define_remus_checkpoint_api(commit);
-
-static void devices_checkpoint_cb(libxl__egc *egc,
-                                  libxl__multidev *multidev,
-                                  int rc)
-{
-    STATE_AO_GC(multidev->ao);
-
-    /* Convenience aliases */
-    libxl__remus_devices_state *const rds =
-                            CONTAINER_OF(multidev, *rds, multidev);
-
-    rds->callback(egc, rds, rc);
-}
diff --git a/tools/libxl/libxl_remus_disk_drbd.c b/tools/libxl/libxl_remus_disk_drbd.c
index fc76b89..b6448f6 100644
--- a/tools/libxl/libxl_remus_disk_drbd.c
+++ b/tools/libxl/libxl_remus_disk_drbd.c
@@ -26,30 +26,30 @@ typedef struct libxl__remus_drbd_disk {
     int ackwait;
 } libxl__remus_drbd_disk;
 
-int init_subkind_drbd_disk(libxl__remus_devices_state *rds)
+int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds)
 {
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
-    rds->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
+    cds->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
                                        libxl__xen_script_dir_path());
 
     return 0;
 }
 
-void cleanup_subkind_drbd_disk(libxl__remus_devices_state *rds)
+void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds)
 {
     return;
 }
 
 /*----- helper functions, for async calls -----*/
 static void drbd_async_call(libxl__egc *egc,
-                            libxl__remus_device *dev,
-                            void func(libxl__remus_device *),
+                            libxl__checkpoint_device *dev,
+                            void func(libxl__checkpoint_device *),
                             libxl__ev_child_callback callback)
 {
     int pid = -1, rc;
     libxl__ao_device *aodev = &dev->aodev;
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     /* Fork and call */
     pid = libxl__ev_child_fork(gc, &aodev->child, callback);
@@ -82,21 +82,21 @@ static void match_async_exec_cb(libxl__egc *egc,
 
 /* implementations */
 
-static void match_async_exec(libxl__egc *egc, libxl__remus_device *dev);
+static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev);
 
-static void drbd_setup(libxl__egc *egc, libxl__remus_device *dev)
+static void drbd_setup(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     match_async_exec(egc, dev);
 }
 
-static void match_async_exec(libxl__egc *egc, libxl__remus_device *dev)
+static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     int arraysize, nr = 0, rc;
     const libxl_device_disk *disk = dev->backend_dev;
     libxl__async_exec_state *aes = &dev->aodev.aes;
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     /* setup env & args */
     arraysize = 1;
@@ -107,12 +107,12 @@ static void match_async_exec(libxl__egc *egc, libxl__remus_device *dev)
     arraysize = 3;
     nr = 0;
     GCNEW_ARRAY(aes->args, arraysize);
-    aes->args[nr++] = dev->rds->drbd_probe_script;
+    aes->args[nr++] = dev->cds->drbd_probe_script;
     aes->args[nr++] = disk->pdev_path;
     aes->args[nr++] = NULL;
     assert(nr <= arraysize);
 
-    aes->ao = dev->rds->ao;
+    aes->ao = dev->cds->ao;
     aes->what = GCSPRINTF("%s %s", aes->args[0], aes->args[1]);
     aes->timeout_ms = LIBXL_HOTPLUG_TIMEOUT * 1000;
     aes->callback = match_async_exec_cb;
@@ -136,7 +136,7 @@ static void match_async_exec_cb(libxl__egc *egc,
                                 int rc, int status)
 {
     libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
-    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_drbd_disk *drbd_disk;
     const libxl_device_disk *disk = dev->backend_dev;
 
@@ -146,7 +146,7 @@ static void match_async_exec_cb(libxl__egc *egc,
         goto out;
 
     if (status) {
-        rc = ERROR_REMUS_DEVOPS_DOES_NOT_MATCH;
+        rc = ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH;
         /* BUG: seems to assume that any exit status means `no match' */
         /* BUG: exit status will have been logged as an error */
         goto out;
@@ -171,10 +171,10 @@ out:
     aodev->callback(egc, aodev);
 }
 
-static void drbd_teardown(libxl__egc *egc, libxl__remus_device *dev)
+static void drbd_teardown(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     libxl__remus_drbd_disk *drbd_disk = dev->concrete_data;
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     close(drbd_disk->ctl_fd);
     dev->aodev.rc = 0;
@@ -191,9 +191,9 @@ static void checkpoint_async_call_done(libxl__egc *egc,
 /* API implementations */
 
 /* this op will not wait and block, so implement as sync op */
-static void drbd_postsuspend(libxl__egc *egc, libxl__remus_device *dev)
+static void drbd_postsuspend(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     libxl__remus_drbd_disk *rdd = dev->concrete_data;
 
@@ -207,16 +207,16 @@ static void drbd_postsuspend(libxl__egc *egc, libxl__remus_device *dev)
 }
 
 
-static void drbd_preresume_async(libxl__remus_device *dev);
+static void drbd_preresume_async(libxl__checkpoint_device *dev);
 
-static void drbd_preresume(libxl__egc *egc, libxl__remus_device *dev)
+static void drbd_preresume(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     drbd_async_call(egc, dev, drbd_preresume_async, checkpoint_async_call_done);
 }
 
-static void drbd_preresume_async(libxl__remus_device *dev)
+static void drbd_preresume_async(libxl__checkpoint_device *dev)
 {
     libxl__remus_drbd_disk *rdd = dev->concrete_data;
     int ackwait = rdd->ackwait;
@@ -235,7 +235,7 @@ static void checkpoint_async_call_done(libxl__egc *egc,
 {
     int rc;
     libxl__ao_device *aodev = CONTAINER_OF(child, *aodev, child);
-    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_drbd_disk *rdd = dev->concrete_data;
 
     STATE_AO_GC(aodev->ao);
@@ -253,7 +253,7 @@ out:
     aodev->callback(egc, aodev);
 }
 
-const libxl__remus_device_instance_ops remus_device_drbd_disk = {
+const libxl__checkpoint_device_instance_ops remus_device_drbd_disk = {
     .kind = LIBXL__DEVICE_KIND_VBD,
     .setup = drbd_setup,
     .teardown = drbd_teardown,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index e8d3647..1d676ef 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -61,8 +61,8 @@ libxl_error = Enumeration("error", [
     (-15, "LOCK_FAIL"),
     (-16, "JSON_CONFIG_EMPTY"),
     (-17, "DEVICE_EXISTS"),
-    (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
-    (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
+    (-18, "CHECKPOINT_DEVOPS_DOES_NOT_MATCH"),
+    (-19, "CHECKPOINT_DEVICE_NOT_SUPPORTED"),
     (-20, "VNUMA_CONFIG_INVALID"),
     (-21, "DOMAIN_NOTFOUND"),
     (-22, "ABORTED"),
-- 
1.9.1

* [PATCH v4 --for 4.6 COLOPre 22/25] tools/libxl: adjust the indentation
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (20 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 23/25] tools/libxl: store remus_ops in checkpoint device state Yang Hongyang
                   ` (3 subsequent siblings)
  25 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, ian.jackson

This is just tidying up after the previous automatic renaming.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_checkpoint_device.c | 21 +++++++++++----------
 tools/libxl/libxl_internal.h          | 19 +++++++++++--------
 2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
index 109cd23..226f159 100644
--- a/tools/libxl/libxl_checkpoint_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -73,9 +73,9 @@ static void devices_teardown_cb(libxl__egc *egc,
 /* checkpoint device setup and teardown */
 
 static libxl__checkpoint_device* checkpoint_device_init(libxl__egc *egc,
-                                              libxl__checkpoint_devices_state *cds,
-                                              libxl__device_kind kind,
-                                              void *libxl_dev)
+                                        libxl__checkpoint_devices_state *cds,
+                                        libxl__device_kind kind,
+                                        void *libxl_dev)
 {
     libxl__checkpoint_device *dev = NULL;
 
@@ -89,9 +89,10 @@ static libxl__checkpoint_device* checkpoint_device_init(libxl__egc *egc,
 }
 
 static void checkpoint_devices_setup(libxl__egc *egc,
-                                libxl__checkpoint_devices_state *cds);
+                                     libxl__checkpoint_devices_state *cds);
 
-void libxl__checkpoint_devices_setup(libxl__egc *egc, libxl__checkpoint_devices_state *cds)
+void libxl__checkpoint_devices_setup(libxl__egc *egc,
+                                     libxl__checkpoint_devices_state *cds)
 {
     int i, rc;
 
@@ -137,7 +138,7 @@ out:
 }
 
 static void checkpoint_devices_setup(libxl__egc *egc,
-                                libxl__checkpoint_devices_state *cds)
+                                     libxl__checkpoint_devices_state *cds)
 {
     int i, rc;
 
@@ -285,12 +286,12 @@ static void devices_checkpoint_cb(libxl__egc *egc,
 
 /* API implementations */
 
-#define define_checkpoint_api(api)                                \
-void libxl__checkpoint_devices_##api(libxl__egc *egc,                        \
-                                libxl__checkpoint_devices_state *cds)        \
+#define define_checkpoint_api(api)                                      \
+void libxl__checkpoint_devices_##api(libxl__egc *egc,                   \
+                                libxl__checkpoint_devices_state *cds)   \
 {                                                                       \
     int i;                                                              \
-    libxl__checkpoint_device *dev;                                           \
+    libxl__checkpoint_device *dev;                                      \
                                                                         \
     STATE_AO_GC(cds->ao);                                               \
                                                                         \
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 901e216..af992fc 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2765,7 +2765,8 @@ typedef struct libxl__save_helper_state {
  * Each device type needs to implement the interfaces specified in
  * the libxl__checkpoint_device_instance_ops if it wishes to support Remus.
  *
- * The high-level control flow through the checkpoint device layer is shown below:
+ * The high-level control flow through the checkpoint device layer is shown
+ * below:
  *
  * xl remus
  *  |->  libxl_domain_remus_start
@@ -2826,7 +2827,8 @@ int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
 void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
 
 typedef void libxl__checkpoint_callback(libxl__egc *,
-                                   libxl__checkpoint_devices_state *, int rc);
+                                        libxl__checkpoint_devices_state *,
+                                        int rc);
 
 /*
  * State associated with a checkpoint invocation, including parameters
@@ -2834,7 +2836,7 @@ typedef void libxl__checkpoint_callback(libxl__egc *,
  * save/restore machinery.
  */
 struct libxl__checkpoint_devices_state {
-    /*---- must be set by caller of libxl__checkpoint_device_(setup|teardown) ----*/
+    /*-- must be set by caller of libxl__checkpoint_device_(setup|teardown) --*/
 
     libxl__ao *ao;
     uint32_t domid;
@@ -2847,7 +2849,8 @@ struct libxl__checkpoint_devices_state {
     /*
      * this array is allocated before setup the checkpoint devices by the
      * checkpoint abstract layer.
-     * devs may be NULL, means there's no checkpoint devices that has been set up.
+     * devs may be NULL, means there's no checkpoint devices that has been
+     * set up.
      * the size of this array is 'num_devices', which is the total number
      * of libxl nic devices and disk devices(num_nics + num_disks).
      */
@@ -2909,13 +2912,13 @@ struct libxl__checkpoint_device {
 _hidden void libxl__checkpoint_devices_setup(libxl__egc *egc,
                                         libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_teardown(libxl__egc *egc,
-                                           libxl__checkpoint_devices_state *cds);
+                                        libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_postsuspend(libxl__egc *egc,
-                                              libxl__checkpoint_devices_state *cds);
+                                        libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_preresume(libxl__egc *egc,
-                                            libxl__checkpoint_devices_state *cds);
+                                        libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
-                                         libxl__checkpoint_devices_state *cds);
+                                        libxl__checkpoint_devices_state *cds);
 _hidden int libxl__netbuffer_enabled(libxl__gc *gc);
 
 /*----- Legacy conversion helper -----*/
-- 
1.9.1

* [PATCH v4 --for 4.6 COLOPre 23/25] tools/libxl: store remus_ops in checkpoint device state
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (21 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 22/25] tools/libxl: adjust the indentation Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 13:21   ` Ian Campbell
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure Yang Hongyang
                   ` (2 subsequent siblings)
  25 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, ian.jackson

The checkpoint device layer is an abstraction for performing
checkpoints, and COLO can use it as well. However, some code in
the checkpoint device layer still refers to Remus.

This patch and the following two separate Remus from the
checkpoint device layer.

The checkpoint device layer currently uses remus_ops directly.
Store the ops in the checkpoint device state instead, so that the
checkpoint device layer no longer needs to know about remus_ops.

This is pure refactoring with no functional changes.
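
For readers unfamiliar with the pattern, here is a minimal standalone
sketch of the idea: the caller supplies a NULL-terminated array of ops
pointers in the state, and the generic layer walks it without knowing
who defined it. All type and function names below (device_ops,
devices_state, next_matching_ops, remus_like_ops) are simplified,
hypothetical stand-ins, not the real libxl definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the libxl types. */
typedef enum { KIND_VIF, KIND_VBD, KIND_OTHER } device_kind;

typedef struct {
    device_kind kind;
    const char *name;
} device_ops;

typedef struct {
    /* Set by the caller; must be a NULL-terminated pointer array. */
    const device_ops **ops;
} devices_state;

typedef struct {
    device_kind kind;
    int ops_index;          /* starts at -1, advanced before each try */
    const device_ops *ops;  /* ops currently being tried, or NULL */
    devices_state *cds;
} device;

/* The concrete layer (Remus in the patch) owns the ops table. */
static const device_ops nic_ops  = { KIND_VIF, "nic" };
static const device_ops drbd_ops = { KIND_VBD, "drbd" };
static const device_ops *remus_like_ops[] = { &nic_ops, &drbd_ops, NULL };

/*
 * Generic layer: advance to the next ops entry whose kind matches the
 * device. Returns NULL once the terminating NULL entry is reached.
 */
static const device_ops *next_matching_ops(device *dev)
{
    assert(dev->cds->ops != NULL);
    do {
        dev->ops = dev->cds->ops[++dev->ops_index];
        if (!dev->ops)
            return NULL;
    } while (dev->ops->kind != dev->kind);
    return dev->ops;
}
```

With this layout, a second user such as COLO would only need to
provide its own table and assign it to the ops field before calling
setup; the iteration code itself stays unchanged.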

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 tools/libxl/libxl_checkpoint_device.c | 10 +---------
 tools/libxl/libxl_internal.h          |  2 ++
 tools/libxl/libxl_remus.c             |  9 +++++++++
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
index 226f159..bbc6dc4 100644
--- a/tools/libxl/libxl_checkpoint_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -17,14 +17,6 @@
 
 #include "libxl_internal.h"
 
-extern const libxl__checkpoint_device_instance_ops remus_device_nic;
-extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
-static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
-    &remus_device_nic,
-    &remus_device_drbd_disk,
-    NULL,
-};
-
 /*----- helper functions -----*/
 
 static int init_device_subkind(libxl__checkpoint_devices_state *cds)
@@ -172,7 +164,7 @@ static void device_setup_iterate(libxl__egc *egc, libxl__ao_device *aodev)
         goto out;
 
     do {
-        dev->ops = remus_ops[++dev->ops_index];
+        dev->ops = dev->cds->ops[++dev->ops_index];
         if (!dev->ops) {
             libxl_device_nic * nic = NULL;
             libxl_device_disk * disk = NULL;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index af992fc..d92eabc 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2842,6 +2842,8 @@ struct libxl__checkpoint_devices_state {
     uint32_t domid;
     libxl__checkpoint_callback *callback;
     int device_kind_flags;
+    /* The ops must be pointer array, and the last ops must be NULL */
+    const libxl__checkpoint_device_instance_ops **ops;
 
     /*----- private for abstract layer only -----*/
 
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index fb21b6d..d2e4d42 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -18,6 +18,14 @@
 
 #include "libxl_internal.h"
 
+extern const libxl__checkpoint_device_instance_ops remus_device_nic;
+extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
+static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
+    &remus_device_nic,
+    &remus_device_drbd_disk,
+    NULL,
+};
+
 /*-------------------- Remus setup and teardown ---------------------*/
 
 static void remus_setup_done(libxl__egc *egc,
@@ -48,6 +56,7 @@ void libxl__remus_setup(libxl__egc *egc,
     cds->ao = ao;
     cds->domid = dss->domid;
     cds->callback = remus_setup_done;
+    cds->ops = remus_ops;
 
     libxl__checkpoint_devices_setup(egc, cds);
     return;
-- 
1.9.1

* [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (22 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 23/25] tools/libxl: store remus_ops in checkpoint device state Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 13:28   ` Ian Campbell
  2015-07-15 15:08   ` Ian Jackson
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 25/25] tools/libxl: seperate device init/cleanup from checkpoint device layer Yang Hongyang
  2015-07-16  1:37 ` [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
  25 siblings, 2 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, ian.jackson

Add a new structure, remus state, and move the concrete layer's
private members into it.
This is pure refactoring with no functional changes.
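
As a rough illustration of the layering, the sketch below assumes the
Remus-specific state embeds the generic checkpoint state as a member,
so that callbacks handed only the generic pointer can recover the
outer structure. The struct layouts, field selection and the
CONTAINER_OF_SKETCH macro are hypothetical simplifications, not the
actual libxl code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container-of helper; libxl has its own CONTAINER_OF. */
#define CONTAINER_OF_SKETCH(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Generic layer state: knows nothing about Remus. */
typedef struct {
    int num_devices;
} generic_state;

/* Concrete layer state: private members live here (hypothetical). */
typedef struct {
    int interval;
    const char *netbufscript;
    const char *drbd_probe_script;
    generic_state cds;      /* embedded generic state */
} remus_like_state;

/* Recover the concrete state from a pointer to the embedded member. */
static remus_like_state *state_from_cds(generic_state *cds)
{
    assert(cds != NULL);
    return CONTAINER_OF_SKETCH(cds, remus_like_state, cds);
}
```

Keeping the device-specific members out of the generic structure means
the checkpoint device layer compiles without any knowledge of netlink
or DRBD details.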

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 tools/libxl/libxl.c                 |  2 +-
 tools/libxl/libxl_dom_save.c        |  3 +--
 tools/libxl/libxl_internal.h        | 38 ++++++++++++++++-----------
 tools/libxl/libxl_netbuffer.c       | 51 +++++++++++++++++++++----------------
 tools/libxl/libxl_remus.c           | 38 ++++++++++++++-------------
 tools/libxl/libxl_remus_disk_drbd.c |  8 +++---
 6 files changed, 79 insertions(+), 61 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index fcf91f1..5502709 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -845,7 +845,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
     assert(info);
 
     /* Point of no return */
-    libxl__remus_setup(egc, dss);
+    libxl__remus_setup(egc, &dss->rs);
     return AO_INPROGRESS;
 
  out:
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 9364a1d..9b7159f 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -428,7 +428,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
           | (dss->hvm ? XCFLAGS_HVM : 0);
 
     if (dss->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_REMUS) {
-        dss->interval = r_info->interval;
         if (libxl_defbool_val(r_info->compression))
             dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
     }
@@ -578,7 +577,7 @@ static void domain_save_done(libxl__egc *egc,
      * from sending checkpoints. Teardown the network buffers and
      * release netlink resources.  This is an async op.
      */
-    libxl__remus_teardown(egc, dss, rc);
+    libxl__remus_teardown(egc, &dss->rs, rc);
 }
 
 /*========================= Domain restore ============================*/
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index d92eabc..9c81d8d 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2864,16 +2864,6 @@ struct libxl__checkpoint_devices_state {
     int num_disks;
 
     libxl__multidev multidev;
-
-    /*----- private for concrete (device-specific) layer only -----*/
-
-    /* private for nic device subkind ops */
-    char *netbufscript;
-    struct nl_sock *nlsock;
-    struct nl_cache *qdisc_cache;
-
-    /* private for drbd disk subkind ops */
-    char *drbd_probe_script;
 };
 
 /*
@@ -2921,6 +2911,26 @@ _hidden void libxl__checkpoint_devices_preresume(libxl__egc *egc,
                                         libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
                                         libxl__checkpoint_devices_state *cds);
+
+/*----- Remus related state structure -----*/
+typedef struct libxl__remus_state libxl__remus_state;
+struct libxl__remus_state {
+    /* private */
+    libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
+    int interval; /* checkpoint interval */
+
+    /* abstract layer */
+    libxl__checkpoint_devices_state cds;
+
+    /*----- private for concrete (device-specific) layer only -----*/
+    /* private for nic device subkind ops */
+    char *netbufscript;
+    struct nl_sock *nlsock;
+    struct nl_cache *qdisc_cache;
+
+    /* private for drbd disk subkind ops */
+    char *drbd_probe_script;
+};
 _hidden int libxl__netbuffer_enabled(libxl__gc *gc);
 
 /*----- Legacy conversion helper -----*/
@@ -3073,9 +3083,7 @@ struct libxl__domain_save_state {
     int hvm;
     int xcflags;
     libxl__domain_suspend_state dsps;
-    libxl__checkpoint_devices_state cds;
-    libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
-    int interval; /* checkpoint interval (for Remus) */
+    libxl__remus_state rs;
     libxl__stream_write_state sws;
     libxl__logdirty_switch logdirty;
     /* private for libxl__domain_save_device_model */
@@ -3490,9 +3498,9 @@ _hidden void libxl__remus_domain_save_checkpoint_callback(void *data);
 _hidden void libxl__remus_domain_restore_checkpoint_callback(void *data);
 /* Remus setup and teardown*/
 _hidden void libxl__remus_setup(libxl__egc *egc,
-                                libxl__domain_save_state *dss);
+                                libxl__remus_state *rs);
 _hidden void libxl__remus_teardown(libxl__egc *egc,
-                                   libxl__domain_save_state *dss,
+                                   libxl__remus_state *rs,
                                    int rc);
 
 /*
diff --git a/tools/libxl/libxl_netbuffer.c b/tools/libxl/libxl_netbuffer.c
index 33c2a42..f7a8448 100644
--- a/tools/libxl/libxl_netbuffer.c
+++ b/tools/libxl/libxl_netbuffer.c
@@ -41,18 +41,19 @@ int libxl__netbuffer_enabled(libxl__gc *gc)
 int init_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
     int rc, ret;
-    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
+    libxl__remus_state *rs = CONTAINER_OF(cds, *rs, cds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rs, *dss, rs);
 
     STATE_AO_GC(cds->ao);
 
-    cds->nlsock = nl_socket_alloc();
-    if (!cds->nlsock) {
+    rs->nlsock = nl_socket_alloc();
+    if (!rs->nlsock) {
         LOG(ERROR, "cannot allocate nl socket");
         rc = ERROR_FAIL;
         goto out;
     }
 
-    ret = nl_connect(cds->nlsock, NETLINK_ROUTE);
+    ret = nl_connect(rs->nlsock, NETLINK_ROUTE);
     if (ret) {
         LOG(ERROR, "failed to open netlink socket: %s",
             nl_geterror(ret));
@@ -61,7 +62,7 @@ int init_subkind_nic(libxl__checkpoint_devices_state *cds)
     }
 
     /* get list of all qdiscs installed on network devs. */
-    ret = rtnl_qdisc_alloc_cache(cds->nlsock, &cds->qdisc_cache);
+    ret = rtnl_qdisc_alloc_cache(rs->nlsock, &rs->qdisc_cache);
     if (ret) {
         LOG(ERROR, "failed to allocate qdisc cache: %s",
             nl_geterror(ret));
@@ -70,10 +71,10 @@ int init_subkind_nic(libxl__checkpoint_devices_state *cds)
     }
 
     if (dss->remus->netbufscript) {
-        cds->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
+        rs->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
     } else {
-        cds->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
-                                      libxl__xen_script_dir_path());
+        rs->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
+                                     libxl__xen_script_dir_path());
     }
 
     rc = 0;
@@ -84,20 +85,22 @@ out:
 
 void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
+    libxl__remus_state *rs = CONTAINER_OF(cds, *rs, cds);
+
     STATE_AO_GC(cds->ao);
 
     /* free qdisc cache */
-    if (cds->qdisc_cache) {
-        nl_cache_clear(cds->qdisc_cache);
-        nl_cache_free(cds->qdisc_cache);
-        cds->qdisc_cache = NULL;
+    if (rs->qdisc_cache) {
+        nl_cache_clear(rs->qdisc_cache);
+        nl_cache_free(rs->qdisc_cache);
+        rs->qdisc_cache = NULL;
     }
 
     /* close & free nlsock */
-    if (cds->nlsock) {
-        nl_close(cds->nlsock);
-        nl_socket_free(cds->nlsock);
-        cds->nlsock = NULL;
+    if (rs->nlsock) {
+        nl_close(rs->nlsock);
+        nl_socket_free(rs->nlsock);
+        rs->nlsock = NULL;
     }
 }
 
@@ -150,13 +153,14 @@ static int init_qdisc(libxl__checkpoint_devices_state *cds,
     int rc, ret, ifindex;
     struct rtnl_link *ifb = NULL;
     struct rtnl_qdisc *qdisc = NULL;
+    libxl__remus_state *rs = CONTAINER_OF(cds, *rs, cds);
 
     STATE_AO_GC(cds->ao);
 
     /* Now that we have brought up REMUS_IFB device with plug qdisc for
      * this vif, so we need to refill the qdisc cache.
      */
-    ret = nl_cache_refill(cds->nlsock, cds->qdisc_cache);
+    ret = nl_cache_refill(rs->nlsock, rs->qdisc_cache);
     if (ret) {
         LOG(ERROR, "cannot refill qdisc cache: %s", nl_geterror(ret));
         rc = ERROR_FAIL;
@@ -164,7 +168,7 @@ static int init_qdisc(libxl__checkpoint_devices_state *cds,
     }
 
     /* get a handle to the REMUS_IFB interface */
-    ret = rtnl_link_get_kernel(cds->nlsock, 0, remus_nic->ifb, &ifb);
+    ret = rtnl_link_get_kernel(rs->nlsock, 0, remus_nic->ifb, &ifb);
     if (ret) {
         LOG(ERROR, "cannot obtain handle for %s: %s", remus_nic->ifb,
             nl_geterror(ret));
@@ -187,7 +191,7 @@ static int init_qdisc(libxl__checkpoint_devices_state *cds,
      * There is no need to explicitly free this qdisc as its just a
      * reference from the qdisc cache we allocated earlier.
      */
-    qdisc = rtnl_qdisc_get_by_parent(cds->qdisc_cache, ifindex, TC_H_ROOT);
+    qdisc = rtnl_qdisc_get_by_parent(rs->qdisc_cache, ifindex, TC_H_ROOT);
     if (qdisc) {
         const char *tc_kind = rtnl_tc_get_kind(TC_CAST(qdisc));
         /* Sanity check: Ensure that the root qdisc is a plug qdisc. */
@@ -238,11 +242,12 @@ static void setup_async_exec(libxl__checkpoint_device *dev, char *op)
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
     libxl__checkpoint_devices_state *cds = dev->cds;
     libxl__async_exec_state *aes = &dev->aodev.aes;
+    libxl__remus_state *rs = CONTAINER_OF(cds, *rs, cds);
 
     STATE_AO_GC(cds->ao);
 
     /* Convenience aliases */
-    char *const script = libxl__strdup(gc, cds->netbufscript);
+    char *const script = libxl__strdup(gc, rs->netbufscript);
     const uint32_t domid = cds->domid;
     const int dev_id = remus_nic->devid;
     const char *const vif = remus_nic->vif;
@@ -333,6 +338,7 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
     libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
     libxl__checkpoint_devices_state *cds = dev->cds;
+    libxl__remus_state *rs = CONTAINER_OF(cds, *rs, cds);
     const char *out_path_base, *hotplug_error = NULL;
 
     STATE_AO_GC(cds->ao);
@@ -377,7 +383,7 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
 
     if (hotplug_error) {
         LOG(ERROR, "netbuf script %s setup failed for vif %s: %s",
-            cds->netbufscript, vif, hotplug_error);
+            rs->netbufscript, vif, hotplug_error);
         rc = ERROR_FAIL;
         goto out;
     }
@@ -445,6 +451,7 @@ static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
                            int buffer_op)
 {
     int rc, ret;
+    libxl__remus_state *rs = CONTAINER_OF(cds, *rs, cds);
 
     STATE_AO_GC(cds->ao);
 
@@ -458,7 +465,7 @@ static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
         goto out;
     }
 
-    ret = rtnl_qdisc_add(cds->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
+    ret = rtnl_qdisc_add(rs->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
     if (ret) {
         rc = ERROR_FAIL;
         goto out;
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index d2e4d42..91abf8e 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -33,11 +33,12 @@ static void remus_setup_done(libxl__egc *egc,
 static void remus_setup_failed(libxl__egc *egc,
                                libxl__checkpoint_devices_state *cds, int rc);
 
-void libxl__remus_setup(libxl__egc *egc,
-                        libxl__domain_save_state *dss)
+void libxl__remus_setup(libxl__egc *egc, libxl__remus_state *rs)
 {
+    libxl__domain_save_state *dss = CONTAINER_OF(rs, *dss, rs);
+
     /* Convenience aliases */
-    libxl__checkpoint_devices_state *const cds = &dss->cds;
+    libxl__checkpoint_devices_state *const cds = &rs->cds;
     const libxl_domain_remus_info *const info = dss->remus;
 
     STATE_AO_GC(dss->ao);
@@ -57,6 +58,7 @@ void libxl__remus_setup(libxl__egc *egc,
     cds->domid = dss->domid;
     cds->callback = remus_setup_done;
     cds->ops = remus_ops;
+    rs->interval = info->interval;
 
     libxl__checkpoint_devices_setup(egc, cds);
     return;
@@ -68,7 +70,7 @@ out:
 static void remus_setup_done(libxl__egc *egc,
                                    libxl__checkpoint_devices_state *cds, int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, rs.cds);
     STATE_AO_GC(dss->ao);
 
     if (!rc) {
@@ -85,7 +87,7 @@ static void remus_setup_done(libxl__egc *egc,
 static void remus_setup_failed(libxl__egc *egc,
                                libxl__checkpoint_devices_state *cds, int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, rs.cds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -99,22 +101,22 @@ static void remus_teardown_done(libxl__egc *egc,
                                 libxl__checkpoint_devices_state *cds,
                                 int rc);
 void libxl__remus_teardown(libxl__egc *egc,
-                           libxl__domain_save_state *dss,
+                           libxl__remus_state *rs,
                            int rc)
 {
     EGC_GC;
 
     LOG(WARN, "Remus: Domain suspend terminated with rc %d,"
         " teardown Remus devices...", rc);
-    dss->cds.callback = remus_teardown_done;
-    libxl__checkpoint_devices_teardown(egc, &dss->cds);
+    rs->cds.callback = remus_teardown_done;
+    libxl__checkpoint_devices_teardown(egc, &rs->cds);
 }
 
 static void remus_teardown_done(libxl__egc *egc,
                                 libxl__checkpoint_devices_state *cds,
                                 int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, rs.cds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -154,7 +156,7 @@ static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
     if (rc)
         goto out;
 
-    libxl__checkpoint_devices_state *const cds = &dss->cds;
+    libxl__checkpoint_devices_state *const cds = &dss->rs.cds;
     cds->callback = remus_devices_postsuspend_cb;
     libxl__checkpoint_devices_postsuspend(egc, cds);
     return;
@@ -168,7 +170,7 @@ static void remus_devices_postsuspend_cb(libxl__egc *egc,
                                          libxl__checkpoint_devices_state *cds,
                                          int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, rs.cds);
 
     if (rc)
         goto out;
@@ -188,7 +190,7 @@ void libxl__remus_domain_resume_callback(void *data)
     libxl__domain_save_state *dss = shs->caller_state;
     STATE_AO_GC(dss->ao);
 
-    libxl__checkpoint_devices_state *const cds = &dss->cds;
+    libxl__checkpoint_devices_state *const cds = &dss->rs.cds;
     cds->callback = remus_devices_preresume_cb;
     libxl__checkpoint_devices_preresume(egc, cds);
 }
@@ -197,7 +199,7 @@ static void remus_devices_preresume_cb(libxl__egc *egc,
                                        libxl__checkpoint_devices_state *cds,
                                        int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, rs.cds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -244,7 +246,7 @@ static void remus_checkpoint_stream_written(
     libxl__domain_save_state *dss = CONTAINER_OF(sws, *dss, sws);
 
     /* Convenience aliases */
-    libxl__checkpoint_devices_state *const cds = &dss->cds;
+    libxl__checkpoint_devices_state *const cds = &dss->rs.cds;
 
     STATE_AO_GC(dss->ao);
 
@@ -266,7 +268,7 @@ static void remus_devices_commit_cb(libxl__egc *egc,
                                     libxl__checkpoint_devices_state *cds,
                                     int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, rs.cds);
 
     STATE_AO_GC(dss->ao);
 
@@ -284,9 +286,9 @@ static void remus_devices_commit_cb(libxl__egc *egc,
      */
 
     /* Set checkpoint interval timeout */
-    rc = libxl__ev_time_register_rel(ao, &dss->checkpoint_timeout,
+    rc = libxl__ev_time_register_rel(ao, &dss->rs.checkpoint_timeout,
                                      remus_next_checkpoint,
-                                     dss->interval);
+                                     dss->rs.interval);
 
     if (rc)
         goto out;
@@ -302,7 +304,7 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
                                   int rc)
 {
     libxl__domain_save_state *dss =
-                            CONTAINER_OF(ev, *dss, checkpoint_timeout);
+                            CONTAINER_OF(ev, *dss, rs.checkpoint_timeout);
 
     STATE_AO_GC(dss->ao);
 
diff --git a/tools/libxl/libxl_remus_disk_drbd.c b/tools/libxl/libxl_remus_disk_drbd.c
index b6448f6..616d87e 100644
--- a/tools/libxl/libxl_remus_disk_drbd.c
+++ b/tools/libxl/libxl_remus_disk_drbd.c
@@ -28,10 +28,11 @@ typedef struct libxl__remus_drbd_disk {
 
 int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds)
 {
+    libxl__remus_state *rs = CONTAINER_OF(cds, *rs, cds);
     STATE_AO_GC(cds->ao);
 
-    cds->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
-                                       libxl__xen_script_dir_path());
+    rs->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
+                                      libxl__xen_script_dir_path());
 
     return 0;
 }
@@ -96,6 +97,7 @@ static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev)
     int arraysize, nr = 0, rc;
     const libxl_device_disk *disk = dev->backend_dev;
     libxl__async_exec_state *aes = &dev->aodev.aes;
+    libxl__remus_state *rs = CONTAINER_OF(dev->cds, *rs, cds);
     STATE_AO_GC(dev->cds->ao);
 
     /* setup env & args */
@@ -107,7 +109,7 @@ static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev)
     arraysize = 3;
     nr = 0;
     GCNEW_ARRAY(aes->args, arraysize);
-    aes->args[nr++] = dev->cds->drbd_probe_script;
+    aes->args[nr++] = rs->drbd_probe_script;
     aes->args[nr++] = disk->pdev_path;
     aes->args[nr++] = NULL;
     assert(nr <= arraysize);
-- 
1.9.1


* [PATCH v4 --for 4.6 COLOPre 25/25] tools/libxl: seperate device init/cleanup from checkpoint device layer
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (23 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure Yang Hongyang
@ 2015-07-15  7:45 ` Yang Hongyang
  2015-07-15 13:37   ` Ian Campbell
  2015-07-16  1:37 ` [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
  25 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15  7:45 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, ian.jackson

We call (init|cleanup)_subkind_nic and (init|cleanup)_subkind_drbd_disk
directly in the checkpoint device layer. Move them to libxl_remus.c and call
them before calling libxl__checkpoint_devices_setup() or after calling
libxl__checkpoint_devices_teardown().
This is pure refactoring with no functional changes.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
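The intended ordering after this move can be sketched as a standalone
trace. This is only an illustration under simplifying assumptions: the
asynchronous callbacks are flattened into direct calls, and every sketch_*
name is hypothetical rather than libxl code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical event trace used to make the call order observable. */
struct sketch_trace {
    char log[64];
};

static void sketch_note(struct sketch_trace *t, const char *ev)
{
    assert(t != NULL);
    /* Append the event, always leaving room for the terminating NUL. */
    strncat(t->log, ev, sizeof(t->log) - strlen(t->log) - 1);
}

/* Setup path: subkind init runs first; generic setup only if it worked. */
static int sketch_remus_setup(struct sketch_trace *t, int init_rc)
{
    sketch_note(t, "init;");
    if (init_rc)
        return init_rc;         /* generic device setup never runs */
    sketch_note(t, "setup;");
    return 0;
}

/* Teardown path: generic teardown, then subkind cleanup, then the
 * caller's completion callback. */
static void sketch_remus_teardown(struct sketch_trace *t)
{
    sketch_note(t, "teardown;");
    sketch_note(t, "cleanup;");
    sketch_note(t, "callback;");
}
```

Calling sketch_remus_setup() with init_rc == 0 followed by
sketch_remus_teardown() records the events in the order the commit message
describes; a non-zero init_rc stops before the generic setup step.
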
 tools/libxl/libxl_checkpoint_device.c | 42 ++---------------------------------
 tools/libxl/libxl_remus.c             | 42 +++++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+), 40 deletions(-)

diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
index bbc6dc4..0a16dbb 100644
--- a/tools/libxl/libxl_checkpoint_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -17,38 +17,6 @@
 
 #include "libxl_internal.h"
 
-/*----- helper functions -----*/
-
-static int init_device_subkind(libxl__checkpoint_devices_state *cds)
-{
-    /* init device subkind-specific state in the libxl ctx */
-    int rc;
-    STATE_AO_GC(cds->ao);
-
-    if (libxl__netbuffer_enabled(gc)) {
-        rc = init_subkind_nic(cds);
-        if (rc) goto out;
-    }
-
-    rc = init_subkind_drbd_disk(cds);
-    if (rc) goto out;
-
-    rc = 0;
-out:
-    return rc;
-}
-
-static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
-{
-    /* cleanup device subkind-specific state in the libxl ctx */
-    STATE_AO_GC(cds->ao);
-
-    if (libxl__netbuffer_enabled(gc))
-        cleanup_subkind_nic(cds);
-
-    cleanup_subkind_drbd_disk(cds);
-}
-
 /*----- setup() and teardown() -----*/
 
 /* callbacks */
@@ -86,14 +54,10 @@ static void checkpoint_devices_setup(libxl__egc *egc,
 void libxl__checkpoint_devices_setup(libxl__egc *egc,
                                      libxl__checkpoint_devices_state *cds)
 {
-    int i, rc;
+    int i;
 
     STATE_AO_GC(cds->ao);
 
-    rc = init_device_subkind(cds);
-    if (rc)
-        goto out;
-
     cds->num_devices = 0;
     cds->num_nics = 0;
     cds->num_disks = 0;
@@ -126,7 +90,7 @@ void libxl__checkpoint_devices_setup(libxl__egc *egc,
     return;
 
 out:
-    cds->callback(egc, cds, rc);
+    cds->callback(egc, cds, 0);
 }
 
 static void checkpoint_devices_setup(libxl__egc *egc,
@@ -263,8 +227,6 @@ static void devices_teardown_cb(libxl__egc *egc,
     cds->disks = NULL;
     cds->num_disks = 0;
 
-    cleanup_device_subkind(cds);
-
     cds->callback(egc, cds, rc);
 }
 
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index 91abf8e..46dcc3c 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -26,6 +26,38 @@ static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
     NULL,
 };
 
+/*----- helper functions -----*/
+
+static int init_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+    /* init device subkind-specific state in the libxl ctx */
+    int rc;
+    STATE_AO_GC(cds->ao);
+
+    if (libxl__netbuffer_enabled(gc)) {
+        rc = init_subkind_nic(cds);
+        if (rc) goto out;
+    }
+
+    rc = init_subkind_drbd_disk(cds);
+    if (rc) goto out;
+
+    rc = 0;
+out:
+    return rc;
+}
+
+static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+    /* cleanup device subkind-specific state in the libxl ctx */
+    STATE_AO_GC(cds->ao);
+
+    if (libxl__netbuffer_enabled(gc))
+        cleanup_subkind_nic(cds);
+
+    cleanup_subkind_drbd_disk(cds);
+}
+
 /*-------------------- Remus setup and teardown ---------------------*/
 
 static void remus_setup_done(libxl__egc *egc,
@@ -60,6 +92,12 @@ void libxl__remus_setup(libxl__egc *egc, libxl__remus_state *rs)
     cds->ops = remus_ops;
     rs->interval = info->interval;
 
+    if (init_device_subkind(cds)) {
+        LOG(ERROR, "Remus: failed to init device subkind for guest %u",
+            dss->domid);
+        goto out;
+    }
+
     libxl__checkpoint_devices_setup(egc, cds);
     return;
 
@@ -94,6 +132,8 @@ static void remus_setup_failed(libxl__egc *egc,
         LOG(ERROR, "Remus: failed to teardown device after setup failed"
             " for guest with domid %u, rc %d", dss->domid, rc);
 
+    cleanup_device_subkind(cds);
+
     dss->callback(egc, dss, rc);
 }
 
@@ -123,6 +163,8 @@ static void remus_teardown_done(libxl__egc *egc,
         LOG(ERROR, "Remus: failed to teardown device for guest with domid %u,"
             " rc %d", dss->domid, rc);
 
+    cleanup_device_subkind(cds);
+
     dss->callback(egc, dss, rc);
 }
 
-- 
1.9.1


* Re: [PATCH v4 --for 4.6 COLOPre 01/25] tools/libxl: rename libxl__domain_suspend to libxl__domain_save
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 01/25] tools/libxl: rename libxl__domain_suspend to libxl__domain_save Yang Hongyang
@ 2015-07-15 11:16   ` Ian Campbell
  0 siblings, 0 replies; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 11:16 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, eddie.dong, wency, andrew.cooper3, yunhong.jiang,
	ian.jackson, xen-devel, guijianfeng, rshriram

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> The suspend/save terminology used by libxc is more consistent.
> "suspend" refers to quiescing the VM, so pausing qemu, making a
> remote_shutdown(SHUTDOWN_suspend) hypercall etc.
> "save" refers to the actions involved in actually shuffling the
> state of the VM, so xc_domain_save() etc.
> 
> libxl currently uses "suspend" to encapsulate both. The patch
> Rename libxl__domain_suspend() to libxl__domain_save() since it
> actually refers to shuffling the state of the VM.
> 
> This results in some strangeness in that some functions called *save*
> are now passed a struct called *suspend*, this is temporary and is all
> fixed up later by the refactoring of the suspend_state.
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> Some comments, commit messages:
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>


* Re: [PATCH v4 --for 4.6 COLOPre 04/25] tools/libxl: rename remus checkpoint callbacks
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 04/25] tools/libxl: rename remus checkpoint callbacks Yang Hongyang
@ 2015-07-15 11:17   ` Ian Campbell
  2015-07-16  1:43     ` Yang Hongyang
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 11:17 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> There are 2 remus checkpoint callbacks(save/restore), currently, they
> both called libxl__remus_domain_checkpoint_callback in diffrent
> file, so it is ok. But in the following patch, we will move all of the
> remus callback code into a seperate file, the name should be diffrent.

"separate" and "different" (twice).

> So rename them to:
>   libxl__remus_domain_{save/restore}_checkpoint_callback
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>

> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/libxl_create.c | 4 ++--
>  tools/libxl/libxl_dom.c    | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index 5b4d333..a32e3df 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -677,7 +677,7 @@ static int store_libxl_entry(libxl__gc *gc, uint32_t domid,
>  static void remus_checkpoint_stream_done(
>      libxl__egc *egc, libxl__stream_read_state *srs, int rc);
>  
> -static void libxl__remus_domain_checkpoint_callback(void *data)
> +static void libxl__remus_domain_restore_checkpoint_callback(void *data)
>  {
>      libxl__save_helper_state *shs = data;
>      libxl__domain_create_state *dcs = shs->caller_state;
> @@ -989,7 +989,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
>      }
>  
>      /* Restore */
> -    callbacks->checkpoint = libxl__remus_domain_checkpoint_callback;
> +    callbacks->checkpoint = libxl__remus_domain_restore_checkpoint_callback;
>  
>      rc = libxl__build_pre(gc, domid, d_config, state);
>      if (rc)
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index 0788309..9c61fa7 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -1586,7 +1586,7 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
>                                    const struct timeval *requested_abs,
>                                    int rc);
>  
> -static void libxl__remus_domain_checkpoint_callback(void *data)
> +static void libxl__remus_domain_save_checkpoint_callback(void *data)
>  {
>      libxl__save_helper_state *shs = data;
>      libxl__domain_suspend_state *dss = shs->caller_state;
> @@ -1749,7 +1749,7 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
>      if (r_info != NULL) {
>          callbacks->suspend = libxl__remus_domain_suspend_callback;
>          callbacks->postcopy = libxl__remus_domain_resume_callback;
> -        callbacks->checkpoint = libxl__remus_domain_checkpoint_callback;
> +        callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
>          dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
>      } else
>          callbacks->suspend = libxl__domain_suspend_callback;


* Re: [PATCH v4 --for 4.6 COLOPre 05/25] libxl/remus: introduce libxl__remus_setup
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 05/25] libxl/remus: introduce libxl__remus_setup Yang Hongyang
@ 2015-07-15 11:26   ` Ian Campbell
  2015-07-16  5:32     ` Yang Hongyang
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 11:26 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> Refactoring Remus setup by introducing libxl__remus_setup API.
> All Remus setup work are done in this function.
> 
> Also remove the libxl__ prefix for static functions.

There is a subtle behavioural change here, which is that if anything
which is now done in _setup fails then the result is a call to
dss->callback( ..,..,ERROR_FAIL) rather than _start returning
AO_CREATE_FAIL(ERROR_FAIL).

I think this is probably a reasonable and correct change, but I think it
is worth mentioning in the commit log.

That said, I also wonder if the actual check for netbuffer_enabled (the
only such failure in practice) ought to be moved up such that it stays
in _start along with the other similar checks, i.e. _start would do:

    if (libxl_defbool_val(info->netbuf) && !libxl__netbuffer_enabled(gc)) {
        LOG(ERROR, "Remus: No support for network buffering");
        rc = ERROR_FAIL;
        goto out;
    }

while _setup would do:

    if (libxl_defbool_val(info->netbuf)) {
        // MAYBE : assert(libxl__netbuffer_enabled(gc))
        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
    }

Ian.
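
The split suggested above (validate in _start, assert in _setup) can be
exercised in isolation. A minimal sketch, assuming stand-in types and
constants: all sketch_* / SKETCH_* names are hypothetical and their values
are arbitrary, and the netbuffer check is passed in as a plain boolean
instead of calling libxl__netbuffer_enabled().

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_ERROR_FAIL (-1)  /* stand-in error code, arbitrary value */
#define SKETCH_KIND_VIF   1     /* stand-in device kind, arbitrary value */

struct sketch_info {
    bool netbuf;                /* user asked for network buffering */
};

struct sketch_devices {
    unsigned device_kind_flags;
};

/* _start side: reject unsupported requests before the point of no return,
 * so the caller gets a synchronous failure. */
static int sketch_start_checks(const struct sketch_info *info,
                               bool netbuffer_enabled)
{
    if (info->netbuf && !netbuffer_enabled)
        return SKETCH_ERROR_FAIL;
    return 0;
}

/* _setup side: the precondition was already validated, so it is only
 * asserted here before the device kind is recorded. */
static void sketch_setup(const struct sketch_info *info,
                         bool netbuffer_enabled,
                         struct sketch_devices *rds)
{
    if (info->netbuf) {
        assert(netbuffer_enabled);
        rds->device_kind_flags |= (1u << SKETCH_KIND_VIF);
    }
}
```

With this shape the only way to reach the assert is a caller that skipped
the _start checks, which keeps the failure reporting where it was before
the refactoring.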


* Re: [PATCH v4 --for 4.6 COLOPre 06/25] libxl/remus: introduce libxl__remus_teardown
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 06/25] libxl/remus: introduce libxl__remus_teardown Yang Hongyang
@ 2015-07-15 11:59   ` Ian Campbell
  2015-07-16  1:43     ` Yang Hongyang
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 11:59 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> introduce libxl__remus_teardown to teardown Remus devices.
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>

If you need to respin then you might consider inverting the if remus
check in domain_suspend_done and calling this new function if true, e.g.

    if (dss->remus) {
        libxl__remus_teardown(...);
        return;
    }

    dss->callback(egc, dss, rc);

I think the control flow would feel more natural then.


* Re: [PATCH v4 --for 4.6 COLOPre 07/25] libxl/remus: init checkpoint_callback in Remus checkpoint callback
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 07/25] libxl/remus: init checkpoint_callback in Remus checkpoint callback Yang Hongyang
@ 2015-07-15 12:02   ` Ian Campbell
  2015-07-15 12:35     ` Yang Hongyang
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 12:02 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> init stream {read/write} state checkpoint_callback in Remus
> checkpoint callback.

Why? Is this earlier or later than previously? Seems later?

> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/libxl_create.c | 2 +-
>  tools/libxl/libxl_dom.c    | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index a32e3df..94fe98f 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -684,6 +684,7 @@ static void libxl__remus_domain_restore_checkpoint_callback(void *data)
>      libxl__egc *egc = shs->egc;
>      STATE_AO_GC(dcs->ao);
>  
> +    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
>      libxl__stream_read_start_checkpoint(egc, &dcs->srs);
>  }
>  
> @@ -1000,7 +1001,6 @@ static void domcreate_bootloader_done(libxl__egc *egc,
>      dcs->srs.fd = restore_fd;
>      dcs->srs.legacy = (dcs->restore_params.stream_version == 1);
>      dcs->srs.completion_callback = domcreate_stream_done;
> -    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
>  
>      libxl__stream_read_start(egc, &dcs->srs);
>      return;
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index 77a917c..1740bed 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -1593,6 +1593,7 @@ static void libxl__remus_domain_save_checkpoint_callback(void *data)
>      libxl__egc *egc = shs->egc;
>      STATE_AO_GC(dss->ao);
>  
> +    dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
>      libxl__stream_write_start_checkpoint(egc, &dss->sws);
>  }
>  
> @@ -1750,7 +1751,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
>          callbacks->suspend = libxl__remus_domain_suspend_callback;
>          callbacks->postcopy = libxl__remus_domain_resume_callback;
>          callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
> -        dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
>      } else
>          callbacks->suspend = libxl__domain_suspend_callback;
>  


* Re: [PATCH v4 --for 4.6 COLOPre 08/25] tools/libxl: move remus code into libxl_remus.c
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 08/25] tools/libxl: move remus code into libxl_remus.c Yang Hongyang
@ 2015-07-15 12:05   ` Ian Campbell
  0 siblings, 0 replies; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 12:05 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> After previous refactoring, we are now able to move all remus code
> into a separate file libxl_remus.c.
> 
> Export following functions for internal use:
> - Remus callbacks
>   * libxl__remus_domain_suspend_callback
>   * libxl__remus_domain_resume_callback
>   * libxl__remus_domain_save_checkpoint_callback
>   * libxl__remus_domain_restore_checkpoint_callback
> - setup/teardown Remus:
>   * libxl__remus_setup
>   * libxl__remus_teardown
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>

On the understanding this is code motion apart from some loss of static
and associated tweaks:

> CC: Ian Campbell <Ian.Campbell@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>


* Re: [PATCH v4 --for 4.6 COLOPre 10/25] libxl/save: Refactor libxl__domain_suspend_state
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 10/25] libxl/save: Refactor libxl__domain_suspend_state Yang Hongyang
@ 2015-07-15 12:10   ` Ian Campbell
  0 siblings, 0 replies; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 12:10 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, eddie.dong, wency, andrew.cooper3, yunhong.jiang,
	ian.jackson, xen-devel, guijianfeng, rshriram

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> Currently struct libxl__domain_suspend_state contains two types of state:
> save state and suspend state. This patch separates the two.
> The motivation of this is that COLO will need to do suspend/resume
> continuously, we need a more common suspend state.
> 
> After this change, dss stands for libxl__domain_save_state,
> dsps stands for libxl__domain_suspend_state.
> 
> Also introduce libxl__domain_suspend_init to initialise the
> libxl__domain_suspend_state.
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>


* Re: [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests Yang Hongyang
@ 2015-07-15 12:26   ` Ian Campbell
  2015-07-16  5:57     ` Yang Hongyang
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 12:26 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> From: Wen Congyang <wency@cn.fujitsu.com>
> 
> 1. suspend
> a. PVHVM and PV: we use the same way to suspend the guest (send the suspend
>    request to the guest). If the guest doesn't support evtchn, the xenstore
>    variant will be used, suspending the guest via XenBus control node.
> b. pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to suspend
>    the guest
> 
> 2. Resume:
> a. fast path
>    In this case, we don't change the guest's state.
>    PV: modify the return code to 1, and then call the domctl:
>        XEN_DOMCTL_resumedomain
>    PVHVM: same as PV
>    HVM: do nothing in modify_returncode, and then call the domctl:
>         XEN_DOMCTL_resumedomain
> b. slow path
>    Used when the guest's state has been changed.
>    PV: update start info, and reset all secondary CPU states. Then call the
>    domctl: XEN_DOMCTL_resumedomain
>    PVHVM and HVM cannot be resumed.
> 
> For PVHVM, in my test, only call the domctl: XEN_DOMCTL_resumedomain
> can work. I am not sure if we should update start info and reset all
> secondary CPU states.
> 
> For pure HVM guest, in my test, only call the domctl:
> XEN_DOMCTL_resumedomain can work.
> 
> So we can call libxl__domain_resume(..., 1) if we don't change the guest
> state, otherwise call libxl__domain_resume(..., 0).
> 
> Under COLO, we will update the guest's state(modify memory, cpu's registers,
> device status...). In this case, we cannot use the fast path to resume it.
> Keep the return code 0, and use a slow path to resume the guest. Resuming
> HVM using the slow path is not currently supported; this patch only makes
> the resume call not fail.

I'm afraid that the addition of this paragraph has not really addressed
my comment on v3:

        I'm afraid I think the commit message for this patch (and the associated
        doc comments) need revisiting almost from scratch, to clearly explain
        what this patch is doing and why and what the constraints on the new
        functionality will be.
        
        At the moment it mostly talks in a confusing way about the old behaviour
        and adds very specific assumptions to the new function which are not
        made clear.

It also appears that this has not been addressed:

        Hrm, so it sounds here like the correctness of this new functionality
        requires the caller to have not messed with the domain's state? What
        sort of changes to the guest state are we talking about here?
        
        Isn't that a new requirement for this call? If so then it should be
        documented somewhere, specifically what sorts of changes are and are not
        allowed and the types of guests which are affected.
        
The two usages of "in my test" in the commit message also do not inspire
confidence that this change is understood to be correct, vs. happening
to be something which works for you.

Ian.

> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> ---
>  tools/libxc/xc_resume.c | 22 ++++++++++++++++++----
>  1 file changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c
> index e67bebd..bd82334 100644
> --- a/tools/libxc/xc_resume.c
> +++ b/tools/libxc/xc_resume.c
> @@ -109,6 +109,23 @@ static int xc_domain_resume_cooperative(xc_interface *xch, uint32_t domid)
>      return do_domctl(xch, &domctl);
>  }
>  
> +static int xc_domain_resume_hvm(xc_interface *xch, uint32_t domid)
> +{
> +    DECLARE_DOMCTL;
> +
> +    /*
> +     * If it is PVHVM, the hypercall return code is 0, because this
> +     * is not a fast path resume, we do not modify_returncode as in
> +     * xc_domain_resume_cooperative.
> +     * (resuming it in a new domain context)
> +     *
> +     * If it is a HVM, the hypercall is a NOP.
> +     */
> +    domctl.cmd = XEN_DOMCTL_resumedomain;
> +    domctl.domain = domid;
> +    return do_domctl(xch, &domctl);
> +}
> +
>  static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
>  {
>      DECLARE_DOMCTL;
> @@ -138,10 +155,7 @@ static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
>       */
>  #if defined(__i386__) || defined(__x86_64__)
>      if ( info.hvm )
> -    {
> -        ERROR("Cannot resume uncooperative HVM guests");
> -        return rc;
> -    }
> +        return xc_domain_resume_hvm(xch, domid);
>  
>      if ( xc_domain_get_guest_width(xch, domid, &dinfo->guest_width) != 0 )
>      {


* Re: [PATCH v4 --for 4.6 COLOPre 12/25] tools/libxl: introduce enum type libxl_checkpointed_stream
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 12/25] tools/libxl: introduce enum type libxl_checkpointed_stream Yang Hongyang
@ 2015-07-15 12:34   ` Ian Campbell
  2015-07-15 13:58     ` Yang Hongyang
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 12:34 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> introduce enum type libxl_checkpointed_stream in IDL.
> rename the last argument of migrate_receive from "remus" to
> "checkpointed" since the semantics of this parameter has
> changed.
> 
> NOTE:
>  libxl_domain_restore_params isn't changed here,
>  checkpointed_stream is still an int.
>  It has to change eventually and other callers will have to be
>  updated to cope (and there should be LIBXL_HAVE_...).

Will this be fixed up later in this series? If so please say so.

> @@ -4282,7 +4282,7 @@ static void migrate_domain(uint32_t domid, const char *rune, int debug,
>  }
>  
>  static void migrate_receive(int debug, int daemonize, int monitor,
> -                            int send_fd, int recv_fd, int remus)
> +                            int send_fd, int recv_fd, int checkpointed)

I think you can start using the new enum type in xl straight away even
if dom_info.checkpointed_stream remains an int. So that means here.

> @@ -4489,7 +4489,8 @@ int main_restore(int argc, char **argv)
>  
>  int main_migrate_receive(int argc, char **argv)
>  {
> -    int debug = 0, daemonize = 1, monitor = 1, remus = 0;
> +    int debug = 0, daemonize = 1, monitor = 1;
> +    int checkpointed = LIBXL_CHECKPOINTED_STREAM_NONE;

and here.

> @@ -4318,7 +4318,7 @@ static void migrate_receive(int debug, int daemonize, int monitor,
>  
>      domid = rc;
>  
> -    if (remus) {
> +    if (checkpointed) {
>          /* If we are here, it means that the sender (primary) has crashed.
>           * TODO: Split-Brain Check.
>           */

Is it the case that we expect all checkpointing solutions will use the
same failover code here? If yes then this should be
"if (checkpointed != ...NONE)".

If we think they might differ (even if remus and colo happen to be the
same) then I think a switch where the NONE case does nothing would be
more structurally appropriate.
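
To make the two options concrete, a rough sketch follows. The enum and the
failover_action names are hypothetical stand-ins, not the real IDL-generated
libxl types, and a COLO value is assumed purely for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the IDL-generated libxl_checkpointed_stream. */
typedef enum {
    CHECKPOINTED_STREAM_NONE = 0,
    CHECKPOINTED_STREAM_REMUS = 1,
    CHECKPOINTED_STREAM_COLO = 2,   /* assumed value, illustration only */
} checkpointed_stream;

/* Hypothetical failover actions, for illustration only. */
typedef enum {
    FAILOVER_NOTHING,
    FAILOVER_REMUS,
    FAILOVER_COLO,
} failover_action;

/* Option 1: every checkpointing scheme shares the same failover code. */
static bool needs_failover(checkpointed_stream cs)
{
    return cs != CHECKPOINTED_STREAM_NONE;
}

/* Option 2: schemes may diverge, so dispatch on the type explicitly. */
static failover_action pick_failover(checkpointed_stream cs)
{
    switch (cs) {
    case CHECKPOINTED_STREAM_NONE:
        return FAILOVER_NOTHING;    /* plain migration: nothing to do */
    case CHECKPOINTED_STREAM_REMUS:
        return FAILOVER_REMUS;
    case CHECKPOINTED_STREAM_COLO:
        return FAILOVER_COLO;
    }
    assert(!"unknown checkpointed stream type");
    return FAILOVER_NOTHING;
}
```

With the switch and no default case, a compiler with enum-switch warnings
enabled can point out a newly added stream type that has not been handled.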

Ian.


* Re: [PATCH v4 --for 4.6 COLOPre 07/25] libxl/remus: init checkpoint_callback in Remus checkpoint callback
  2015-07-15 12:02   ` Ian Campbell
@ 2015-07-15 12:35     ` Yang Hongyang
  2015-07-16 10:32       ` Ian Campbell
  0 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15 12:35 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson



On 07/15/2015 08:02 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>> init stream {read/write} state checkpoint_callback in Remus
>> checkpoint callback.
>
> Why? Is this earlier or later than previously? Seems later?

There's no functional change, it's just refactoring so that we can move
all remus code into one file.

>
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> ---
>>   tools/libxl/libxl_create.c | 2 +-
>>   tools/libxl/libxl_dom.c    | 2 +-
>>   2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
>> index a32e3df..94fe98f 100644
>> --- a/tools/libxl/libxl_create.c
>> +++ b/tools/libxl/libxl_create.c
>> @@ -684,6 +684,7 @@ static void libxl__remus_domain_restore_checkpoint_callback(void *data)
>>       libxl__egc *egc = shs->egc;
>>       STATE_AO_GC(dcs->ao);
>>
>> +    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
>>       libxl__stream_read_start_checkpoint(egc, &dcs->srs);
>>   }
>>
>> @@ -1000,7 +1001,6 @@ static void domcreate_bootloader_done(libxl__egc *egc,
>>       dcs->srs.fd = restore_fd;
>>       dcs->srs.legacy = (dcs->restore_params.stream_version == 1);
>>       dcs->srs.completion_callback = domcreate_stream_done;
>> -    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
>>
>>       libxl__stream_read_start(egc, &dcs->srs);
>>       return;
>> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
>> index 77a917c..1740bed 100644
>> --- a/tools/libxl/libxl_dom.c
>> +++ b/tools/libxl/libxl_dom.c
>> @@ -1593,6 +1593,7 @@ static void libxl__remus_domain_save_checkpoint_callback(void *data)
>>       libxl__egc *egc = shs->egc;
>>       STATE_AO_GC(dss->ao);
>>
>> +    dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
>>       libxl__stream_write_start_checkpoint(egc, &dss->sws);
>>   }
>>
>> @@ -1750,7 +1751,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
>>           callbacks->suspend = libxl__remus_domain_suspend_callback;
>>           callbacks->postcopy = libxl__remus_domain_resume_callback;
>>           callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
>> -        dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
>>       } else
>>           callbacks->suspend = libxl__domain_suspend_callback;
>>
>
>
> .
>

-- 
Thanks,
Yang.


* Re: [PATCH v4 --for 4.6 COLOPre 13/25] migration/save: pass checkpointed_stream from libxl to libxc
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 13/25] migration/save: pass checkpointed_stream from libxl to libxc Yang Hongyang
@ 2015-07-15 12:38   ` Ian Campbell
  2015-07-16  6:05     ` Yang Hongyang
  2015-07-16 16:10   ` Wei Liu
  1 sibling, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 12:38 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> Pass checkpointed_stream from libxl to libxc.
> It won't affect legacy migration because legacy migration
> won't use this param.
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  tools/libxc/include/xenguest.h   |  9 ++++++---
>  tools/libxc/xc_domain_save.c     |  6 ++++--
>  tools/libxc/xc_nomigrate.c       |  3 ++-
>  tools/libxc/xc_sr_common.h       |  2 +-
>  tools/libxc/xc_sr_save.c         |  5 +++--
>  tools/libxl/libxl.c              |  2 ++
>  tools/libxl/libxl_dom_save.c     | 11 ++++++++---
>  tools/libxl/libxl_internal.h     |  1 +
>  tools/libxl/libxl_save_callout.c |  2 +-
>  tools/libxl/libxl_save_helper.c  |  3 ++-
>  10 files changed, 30 insertions(+), 14 deletions(-)
> 
> diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
> index e95af54..6e24b6c 100644
> --- a/tools/libxc/include/xenguest.h
> +++ b/tools/libxc/include/xenguest.h
> @@ -30,7 +30,6 @@
>  #define XCFLAGS_HVM       (1 << 2)
>  #define XCFLAGS_STDVGA    (1 << 3)
>  #define XCFLAGS_CHECKPOINT_COMPRESS    (1 << 4)
> -#define XCFLAGS_CHECKPOINTED    (1 << 5)
>  
>  #define X86_64_B_SIZE   64 
>  #define X86_32_B_SIZE   32
> @@ -85,16 +84,20 @@ struct save_callbacks {
>   * @parm xch a handle to an open hypervisor interface
>   * @parm fd the file descriptor to save a domain to
>   * @parm dom the id of the domain
> + * @parm checkpointed_stream non-zero if the far end of the stream is using
> + *       checkpointing

Do (or will) specific non-zero values have any meaning to the libxc
layer? i.e. does it have any knowledge of COLO vs. Remus as the libxl
enum added in the last patch does?

If (as I hope) the answer is no then this should be a boolean and the
libxl code which propagates the enum into this field ought to use some
appropriate condition (!= ..._NONE most likely).
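
Roughly what I have in mind at the libxl/libxc boundary, as a sketch only.
Both the enum and the params struct below are hypothetical stand-ins rather
than the real libxl or libxc definitions:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the libxl-level enum from the previous patch. */
typedef enum {
    STREAM_NONE = 0,
    STREAM_REMUS = 1,
    STREAM_COLO = 2,    /* assumed future value, illustration only */
} stream_type;

/* Hypothetical libxc-facing parameters: only "checkpointed or not". */
struct xc_save_params_sketch {
    bool checkpointed;
};

/* libxl side: collapse the enum to a boolean before calling into libxc. */
static struct xc_save_params_sketch to_xc_params(stream_type t)
{
    struct xc_save_params_sketch p;

    assert(t >= STREAM_NONE && t <= STREAM_COLO);
    p.checkpointed = (t != STREAM_NONE);
    return p;
}
```

That way libxc never needs to learn about individual checkpointing schemes.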

Ian.


* Re: [PATCH v4 --for 4.6 COLOPre 14/25] tools/libxl: introduce libxl__domain_restore_device_model to load qemu state
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 14/25] tools/libxl: introduce libxl__domain_restore_device_model to load qemu state Yang Hongyang
@ 2015-07-15 12:45   ` Ian Campbell
  2015-07-15 13:42     ` Yang Hongyang
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 12:45 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, rshriram, guijianfeng, Anthony Perard, ian.jackson

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> In normal migration, the qemu state was passed to qemu as a parameter.
> With COLO, Secondary vm is running. So we will do the following steps
> at every checkpoint:
> 1. suspend both primay vm and secondary vm

"primary"

> 2. sync the state
> 3. resume both primary vm and secondary vm
> Primary will send qemu's state in step2, and
> Secondary's qemu should read it and restore the state before it
> is resumed. We can not pass the state to qemu as a parameter because
> Secondary QEMU already started at this point, so we introduce
> libxl__domain_restore_device_model() to do it.
> This API should be called before resuming secondary vm.
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Cc: Anthony Perard <anthony.perard@citrix.com>
> ---
>  tools/libxl/libxl_dom_save.c | 29 +++++++++++++++++++++++++++++
>  tools/libxl/libxl_internal.h |  3 +++
>  tools/libxl/libxl_qmp.c      | 10 ++++++++++
>  3 files changed, 42 insertions(+)
> 
> diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
> index f89f5d4..0926b71 100644
> --- a/tools/libxl/libxl_dom_save.c
> +++ b/tools/libxl/libxl_dom_save.c
> @@ -675,6 +675,35 @@ out:
>      return ret;
>  }
>  
> +int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid)
> +{
> +    char *state_file;
> +    int rc;
> +
> +    switch (libxl__device_model_version_running(gc, domid)) {
> +    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
> +        /* not supported now */
> +        rc = ERROR_INVAL;
> +        break;
> +    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
> +        /*
> +         * This function may be called too many times for the same gc,
> +         * so we use NOGC, and free the memory before return to avoid
> +         * OOM.
> +         */

It occurs to me that domid shouldn't change for the duration of a COLO
run, right? 

Thus I think the path can be allocated once at start of day and not per
iteration, and can be stored in suspend_state (or similar) and passed in
here. Hence no complexity like a nested ao is needed.
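
A minimal sketch of that shape, assuming a hypothetical state struct and
helper names (none of these exist in libxl, and the base path is supplied by
the caller): the path is built once at setup, and each checkpoint just reads
it back without allocating.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-run state; in libxl this would live in suspend_state. */
struct dm_restore_state {
    uint32_t domid;
    char *state_file;   /* built once, reused at every checkpoint */
};

/* Start of day: allocate and format "<base>.<domid>" exactly once. */
static int dm_restore_state_init(struct dm_restore_state *s,
                                 const char *base, uint32_t domid)
{
    size_t len;

    assert(s != NULL && base != NULL);

    /* Room for base, '.', up to 10 decimal digits of a uint32_t, and NUL. */
    len = strlen(base) + 1 + 10 + 1;
    s->state_file = malloc(len);
    if (s->state_file == NULL)
        return -1;

    snprintf(s->state_file, len, "%s.%u", base, (unsigned int)domid);
    s->domid = domid;
    return 0;
}

/* Every checkpoint: no allocation, just hand back the stored path. */
static const char *dm_restore_state_file(const struct dm_restore_state *s)
{
    return s->state_file;
}

/* End of the run: release the path. */
static void dm_restore_state_dispose(struct dm_restore_state *s)
{
    free(s->state_file);
    s->state_file = NULL;
}
```

With this arrangement the per-iteration function takes the stored path as an
argument and has no need for NOGC or a manual free().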

> +        state_file = libxl__sprintf(NOGC,
> +                                    XC_DEVICE_MODEL_RESTORE_FILE".%d",
> +                                    domid);
> +        rc = libxl__qmp_restore(gc, domid, state_file);
> +        free(state_file);
> +        break;
> +    default:
> +        rc = ERROR_INVAL;
> +    }
> +
> +    return rc;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 0eb5f41..fb777c1 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -1074,6 +1074,7 @@ _hidden int libxl__domain_rename(libxl__gc *gc, uint32_t domid,
>  
>  _hidden int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
>                                       uint32_t size, void *data);
> +_hidden int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid);
>  _hidden int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid);
>  
>  _hidden const char *libxl__userdata_path(libxl__gc *gc, uint32_t domid,
> @@ -1702,6 +1703,8 @@ _hidden int libxl__qmp_stop(libxl__gc *gc, int domid);
>  _hidden int libxl__qmp_resume(libxl__gc *gc, int domid);
>  /* Save current QEMU state into fd. */
>  _hidden int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename);
> +/* Load current QEMU state from fd. */
> +_hidden int libxl__qmp_restore(libxl__gc *gc, int domid, const char *filename);
>  /* Set dirty bitmap logging status */
>  _hidden int libxl__qmp_set_global_dirty_log(libxl__gc *gc, int domid, bool enable);
>  _hidden int libxl__qmp_insert_cdrom(libxl__gc *gc, int domid, const libxl_device_disk *disk);
> diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
> index 6484f5e..080cb9f 100644
> --- a/tools/libxl/libxl_qmp.c
> +++ b/tools/libxl/libxl_qmp.c
> @@ -904,6 +904,16 @@ int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename)
>                             NULL, NULL);
>  }
>  
> +int libxl__qmp_restore(libxl__gc *gc, int domid, const char *state_file)
> +{
> +    libxl__json_object *args = NULL;
> +
> +    qmp_parameters_add_string(gc, &args, "filename", state_file);
> +
> +    return qmp_run_command(gc, domid, "xen-load-devices-state", args,
> +                           NULL, NULL);
> +}
> +
>  static int qmp_change(libxl__gc *gc, libxl__qmp_handler *qmp,
>                        char *device, char *target, char *arg)
>  {


* Re: [PATCH v4 --for 4.6 COLOPre 15/25] tools/libxl: check QEMU state before resume dm
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 15/25] tools/libxl: check QEMU state before resume dm Yang Hongyang
@ 2015-07-15 12:48   ` Ian Campbell
  2015-07-15 12:54     ` Ian Campbell
  2015-07-15 13:49     ` Ian Campbell
  2015-07-16 14:43   ` Wei Liu
  1 sibling, 2 replies; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 12:48 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> check QEMU state before resume dm on QEMU_XEN_TRADITIONAL.
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/libxl_dom_suspend.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
> index 6f04c26..686a49b 100644
> --- a/tools/libxl/libxl_dom_suspend.c
> +++ b/tools/libxl/libxl_dom_suspend.c
> @@ -434,11 +434,20 @@ static void domain_suspend_callback_common_done(libxl__egc *egc,
>  
>  int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
>  {
> +    char *path;
> +    char *state;

Can both be const.

Could also be on one line, but that is a matter of taste so up to
you.
 
>      switch (libxl__device_model_version_running(gc, domid)) {
>      case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
> -        libxl__qemu_traditional_cmd(gc, domid, "continue");
> -        libxl__wait_for_device_model_deprecated(gc, domid, "running", NULL, NULL, NULL);
> +        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
> +
> +        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
> +        state = libxl__xs_read(gc, XBT_NULL, path);
> +        if (state != NULL && !strcmp(state, "paused")) {
> +            libxl__qemu_traditional_cmd(gc, domid, "continue");

Please can you explain the apparent discrepancy between the use of
dm_domid and domid here?

> +            libxl__wait_for_device_model_deprecated(gc, domid, "running",
> +                                                    NULL, NULL, NULL);
> +        }
>          break;
>      }
>      case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:


* Re: [PATCH v4 --for 4.6 COLOPre 16/25] tools/libxl: Update libxl_domain_unpause() to support qemu-xen
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 16/25] tools/libxl: Update libxl_domain_unpause() to support qemu-xen Yang Hongyang
@ 2015-07-15 12:50   ` Ian Campbell
  2015-07-16  3:49     ` Yang Hongyang
  2015-07-16 16:26   ` Wei Liu
  1 sibling, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 12:50 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> Currently, libxl__domain_unpause() only supports
> qemu-xen-traditional. Update it to support qemu-xen.
> We use libxl__domain_resume_device_model to unpause guest dm.
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/libxl.c | 15 +++++----------
>  1 file changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 5b2d045..799aead 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -941,8 +941,6 @@ out:
>  int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
>  {
>      GC_INIT(ctx);
> -    char *path;
> -    char *state;
>      int ret, rc = 0;
>  
>      libxl_domain_type type = libxl__domain_type(gc, domid);
> @@ -952,14 +950,11 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
>      }
>  
>      if (type == LIBXL_DOMAIN_TYPE_HVM) {
> -        uint32_t dm_domid = libxl_get_stubdom_id(ctx, domid);
> -
> -        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
> -        state = libxl__xs_read(gc, XBT_NULL, path);
> -        if (state != NULL && !strcmp(state, "paused")) {
> -            libxl__qemu_traditional_cmd(gc, domid, "continue");
> -            libxl__wait_for_device_model_deprecated(gc, domid, "running",
> -                                         NULL, NULL, NULL);
> +        rc = libxl__domain_resume_device_model(gc, domid);
> +        if (rc < 0) {
> +            LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "failed to unpause device model "
> +                       "for domain %u:%d", domid, rc);

Please use the preferred form of LOG(ERROR, "failed to..."), which
should also hopefully allow you to avoid splitting the line in the
middle of a string constant which is discouraged.

If you can't use LOG() then please:
            LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
                       "failed to unpause device model for domain %u:%d",
                        domid, rc);

Not splitting string constants means you can grep for an error message.

Ian.


* Re: [PATCH v4 --for 4.6 COLOPre 15/25] tools/libxl: check QEMU state before resume dm
  2015-07-15 12:48   ` Ian Campbell
@ 2015-07-15 12:54     ` Ian Campbell
  2015-07-15 13:00       ` Wei Liu
  2015-07-15 13:49     ` Ian Campbell
  1 sibling, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 12:54 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On Wed, 2015-07-15 at 13:48 +0100, Ian Campbell wrote:
> >      switch (libxl__device_model_version_running(gc, domid)) {
> >      case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
> > -        libxl__qemu_traditional_cmd(gc, domid, "continue");
> > -        libxl__wait_for_device_model_deprecated(gc, domid, "running", NULL, NULL, NULL);
> > +        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
> > +
> > +        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
> > +        state = libxl__xs_read(gc, XBT_NULL, path);
> > +        if (state != NULL && !strcmp(state, "paused")) {
> > +            libxl__qemu_traditional_cmd(gc, domid, "continue");
> 
> Please can you explain the apparent discrepancy between the use of
> dm_domid and domid here?

I see from the next patch that this pattern came from the existing
libxl_domain_unpause, which hopes to use this helper in the future.

Looking at git annotate:
83cc69fa        (Ian Jackson    2012-06-28 18:43:28 +0100       1045)    if (type == LIBXL_DOMAIN_TYPE_HVM) {
1fc3aeb3        (   Wei Liu     2015-04-09 19:49:25 +0100       1046)        uint32_t dm_domid = libxl_get_stubdom_id(ctx, domid);
1fc3aeb3        (   Wei Liu     2015-04-09 19:49:25 +0100       1047)
1fc3aeb3        (   Wei Liu     2015-04-09 19:49:25 +0100       1048)        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
bdf07e8e        (Ian Jackson    2011-12-12 17:48:42 +0000       1049)        state = libxl__xs_read(gc, XBT_NULL, path);
d1c7c3ef        (Keir Fraser    2009-11-30 10:53:39 +0000       1050)        if (state != NULL && !strcmp(state, "paused")) {
0cb90b31        (Shriram Rajagopalan    2012-02-09 18:07:48 +0000       1051)            libxl__qemu_traditional_cmd(gc, domid, "continue");
47cb2273        (Ian Jackson    2013-10-14 17:26:01 +0100       1052)            libxl__wait_for_device_model_deprecated(gc, domid, "running",
3b6eaa3e        (Ian Campbell   2011-05-24 15:57:24 +0100       1053)                                         NULL, NULL, NULL);
d1c7c3ef        (Keir Fraser    2009-11-30 10:53:39 +0000       1054)        }

It seems this came from Wei in 1fc3aeb3aa26 "libxl: use new QEMU
xenstore protocol". I suspect it was a mistake. Wei?

Ian.


* Re: [PATCH v4 --for 4.6 COLOPre 15/25] tools/libxl: check QEMU state before resume dm
  2015-07-15 12:54     ` Ian Campbell
@ 2015-07-15 13:00       ` Wei Liu
  2015-07-15 13:48         ` Ian Campbell
  0 siblings, 1 reply; 101+ messages in thread
From: Wei Liu @ 2015-07-15 13:00 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Yang Hongyang, Ian Jackson

On Wed, Jul 15, 2015 at 01:54:12PM +0100, Ian Campbell wrote:
> On Wed, 2015-07-15 at 13:48 +0100, Ian Campbell wrote:
> > >      switch (libxl__device_model_version_running(gc, domid)) {
> > >      case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
> > > -        libxl__qemu_traditional_cmd(gc, domid, "continue");
> > > -        libxl__wait_for_device_model_deprecated(gc, domid, "running", NULL, NULL, NULL);
> > > +        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
> > > +
> > > +        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
> > > +        state = libxl__xs_read(gc, XBT_NULL, path);
> > > +        if (state != NULL && !strcmp(state, "paused")) {
> > > +            libxl__qemu_traditional_cmd(gc, domid, "continue");
> > 
> > Please can you explain the apparent discrepancy between the use of
> > dm_domid and domid here?
> 
> I see from the next patch that this pattern came from the existing
> libxl_domain_unpause, which hopes to use this helper in the future.
> 
> Looking at git annotate:
> 83cc69fa        (Ian Jackson    2012-06-28 18:43:28 +0100       1045)    if (type == LIBXL_DOMAIN_TYPE_HVM) {
> 1fc3aeb3        (   Wei Liu     2015-04-09 19:49:25 +0100       1046)        uint32_t dm_domid = libxl_get_stubdom_id(ctx, domid);
> 1fc3aeb3        (   Wei Liu     2015-04-09 19:49:25 +0100       1047)
> 1fc3aeb3        (   Wei Liu     2015-04-09 19:49:25 +0100       1048)        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
> bdf07e8e        (Ian Jackson    2011-12-12 17:48:42 +0000       1049)        state = libxl__xs_read(gc, XBT_NULL, path);
> d1c7c3ef        (Keir Fraser    2009-11-30 10:53:39 +0000       1050)        if (state != NULL && !strcmp(state, "paused")) {
> 0cb90b31        (Shriram Rajagopalan    2012-02-09 18:07:48 +0000       1051)            libxl__qemu_traditional_cmd(gc, domid, "continue");
> 47cb2273        (Ian Jackson    2013-10-14 17:26:01 +0100       1052)            libxl__wait_for_device_model_deprecated(gc, domid, "running",
> 3b6eaa3e        (Ian Campbell   2011-05-24 15:57:24 +0100       1053)                                         NULL, NULL, NULL);
> d1c7c3ef        (Keir Fraser    2009-11-30 10:53:39 +0000       1054)        }
> 
> It seems this came from Wei in 1fc3aeb3aa26 "libxl: use new QEMU
> xenstore protocol". I suspect it was a mistake. Wei?
> 

No, it's not.

libxl__qemu_traditional_cmd accepts domid and then it calls
libxl_get_stubdom_id to extract dm_domid.

Wei.
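
A toy model of the point above, for readers following the dm_domid/domid
question. All names are hypothetical stand-ins, not libxl functions, and
the xenstore path layout is an assumption made for illustration. The
caller needs the device-model domain id to build the state path, while
the command helper takes the guest domid and resolves the stub domain
itself:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a stubdom lookup: returns the stub domain's
 * id, or 0 when the device model runs in dom0. Here we pretend that
 * guest 7 is served by stub domain 8. */
static uint32_t toy_get_stubdom_id(uint32_t domid)
{
    return domid == 7 ? 8 : 0;
}

/* The caller builds the state path, so it must know dm_domid.
 * The path layout is assumed for illustration only. */
static int toy_dm_state_path(char *buf, size_t len,
                             uint32_t dm_domid, uint32_t domid)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "/local/domain/%u/device-model/%u/state",
                    (unsigned)dm_domid, (unsigned)domid);
}

/* Hypothetical stand-in for the command helper: it accepts the guest
 * domid and looks up the device-model domain internally, so passing
 * domid (not dm_domid) is the correct call. */
static uint32_t toy_cmd_target(uint32_t domid)
{
    return toy_get_stubdom_id(domid);
}
```

In this sketch, passing dm_domid to toy_cmd_target() would be the bug,
because the lookup would then be applied twice.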

> Ian.

* Re: [PATCH v4 --for 4.6 COLOPre 20/25] tools/libx{l, c}: add back channel to libxc
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 20/25] tools/libx{l, c}: add back channel to libxc Yang Hongyang
@ 2015-07-15 13:13   ` Ian Campbell
  2015-07-16  6:29     ` Yang Hongyang
  2015-07-15 13:21   ` Andrew Cooper
  1 sibling, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 13:13 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, eddie.dong, wency, andrew.cooper3, yunhong.jiang,
	ian.jackson, xen-devel, guijianfeng, rshriram

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> In COLO mode, both VMs are running, and are considered in sync if the
> visible network traffic is identical.  After some time, they fall out of
> sync.
> 
> At this point, the two VMs have definitely diverged.  Lets call the
> primary dirty bitmap set A, while the secondary dirty bitmap set B.
> 
> Sets A and B are different.
> 
> Under normal migration, the page data for set A will be sent form the
> primary to the secondary.
> 
> However, the set difference B - A (lets call this C) is out-of-date on
> the secondary (with respect to the primary) and will not be sent by the
> primary, as it was not memory dirtied by the primary.  The secondary
> needs the page data for C to reconstruct an exact copy of the primary at
> the checkpoint.
> 
> The secondary cannot calculate C as it doesn't know A.  Instead, the
> secondary must send B to the primary, at which point the primary
> calculates the union of A and B (lets call this D) which is all the
> pages dirtied by both the primary and the secondary, and sends all page
> data covered by D.
> 
> In the general case, D is a superset of both A and B.  Without the
> backchannel dirty bitmap, a COLO checkpoint can't reconstruct a valid
> copy of the primary.

When Andy (who wrote this) said this via email I replied [0] including:

        According to the paper there is no need to resend because the
        secondary already has a non-dirty copy of any memory which is
        dirty in B but not A.

So it is not the case that a checkpoint _can't_ reconstruct a valid copy
of the primary; clearly it is possible. For some reason this
implementation deviates from the paper and does things in a way where it
indeed cannot reconstruct D, but I've yet to see a description of _why_
the implementation produced here differs from the paper.

> We transfer the dirty bitmap on libxc side, so we need to introduce back
> channel to libxc.

I'm sure you have good practical reasons why the implementation differs
from the design, and I would like to know what they are. The back
channel adds extra complexity to libxc and libxl, so I want to know why
it is justified, as I also said in [1].

Lastly Ian said in [2]:
        
        To be clear, I have no problem if the design has changed since the
        paper was written.  I just want:
        
         * A clear high-level explanation of the actually-implemented
           arrangements to exist somewhere
        
         * The commit messages, or code, to refer to that explanation

IMHO the addition of this extra commit message doesn't really meet at
least the first requirement. Please point us to an up to date design
document which describes COLO as actually implemented.

Ian.

[0] http://lists.xen.org/archives/html/xen-devel/2015-07/msg00090.html
[1] http://lists.xen.org/archives/html/xen-devel/2015-07/msg00148.html
[2] http://lists.xen.org/archives/html/xen-devel/2015-07/msg00101.html
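
For reference, the set arithmetic in the quoted commit message can be
sketched as plain bitmap operations. This is only an illustration of the
sets A, B, C and D as the commit message defines them; the function names
and the uint64_t word layout are assumptions, not the libxc
representation, and the sketch takes no position on whether resending C
is actually required:

```c
#include <stddef.h>
#include <stdint.h>

/* C = B - A: pages dirtied by the secondary but not by the primary. */
static void bitmap_diff(uint64_t *c, const uint64_t *b,
                        const uint64_t *a, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        c[i] = b[i] & ~a[i];
}

/* D = A | B: pages dirtied by either side, which is what the primary
 * would send once it has received B over the back channel. */
static void bitmap_union(uint64_t *d, const uint64_t *a,
                         const uint64_t *b, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        d[i] = a[i] | b[i];
}
```

With A = {page 0, page 1} and B = {page 1, page 2}, C is {page 2} and D
is {page 0, page 1, page 2}.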

> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> commit message:
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxc/include/xenguest.h   |  8 ++++----
>  tools/libxc/xc_domain_restore.c  |  4 ++--
>  tools/libxc/xc_domain_save.c     |  4 ++--
>  tools/libxc/xc_sr_restore.c      |  2 +-
>  tools/libxc/xc_sr_save.c         |  2 +-
>  tools/libxl/libxl_save_callout.c | 39 ++++++++++++++++++++++++++-------------
>  tools/libxl/libxl_save_helper.c  |  8 ++++++--
>  7 files changed, 42 insertions(+), 25 deletions(-)
> 
> diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
> index 6e24b6c..4056955 100644
> --- a/tools/libxc/include/xenguest.h
> +++ b/tools/libxc/include/xenguest.h
> @@ -91,13 +91,13 @@ struct save_callbacks {
>  int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
>                     uint32_t max_factor, uint32_t flags /* XCFLAGS_xxx */,
>                     struct save_callbacks* callbacks, int hvm,
> -                   int checkpointed_stream);
> +                   int checkpointed_stream, int back_fd);
>  
>  /* Domain Save v2 */
>  int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
>                      uint32_t max_factor, uint32_t flags,
>                      struct save_callbacks* callbacks, int hvm,
> -                    int checkpointed_stream);
> +                    int checkpointed_stream, int back_fd);
>  
>  /* callbacks provided by xc_domain_restore */
>  struct restore_callbacks {
> @@ -140,7 +140,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>                        unsigned long *console_mfn, domid_t console_domid,
>                        unsigned int hvm, unsigned int pae, int superpages,
>                        int checkpointed_stream,
> -                      struct restore_callbacks *callbacks);
> +                      struct restore_callbacks *callbacks, int back_fd);
>  
>  /* Domain Restore v2 */
>  int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
> @@ -149,7 +149,7 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
>                         unsigned long *console_mfn, domid_t console_domid,
>                         unsigned int hvm, unsigned int pae, int superpages,
>                         int checkpointed_stream,
> -                       struct restore_callbacks *callbacks);
> +                       struct restore_callbacks *callbacks, int back_fd);
>  /**
>   * xc_domain_restore writes a file to disk that contains the device
>   * model saved state.
> diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
> index 3cd3483..63d1e6b 100644
> --- a/tools/libxc/xc_domain_restore.c
> +++ b/tools/libxc/xc_domain_restore.c
> @@ -1515,7 +1515,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>                        unsigned long *console_mfn, domid_t console_domid,
>                        unsigned int hvm, unsigned int pae, int superpages,
>                        int checkpointed_stream,
> -                      struct restore_callbacks *callbacks)
> +                      struct restore_callbacks *callbacks, int back_fd)
>  {
>      DECLARE_DOMCTL;
>      xc_dominfo_t info;
> @@ -1578,7 +1578,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>          return xc_domain_restore2(
>              xch, io_fd, dom, store_evtchn, store_mfn,
>              store_domid, console_evtchn, console_mfn, console_domid,
> -            hvm,  pae,  superpages, checkpointed_stream, callbacks);
> +            hvm,  pae,  superpages, checkpointed_stream, callbacks, back_fd);
>      }
>  
>      DPRINTF("%s: starting restore of new domid %u", __func__, dom);
> diff --git a/tools/libxc/xc_domain_save.c b/tools/libxc/xc_domain_save.c
> index 0da3cca..b111384 100644
> --- a/tools/libxc/xc_domain_save.c
> +++ b/tools/libxc/xc_domain_save.c
> @@ -803,7 +803,7 @@ static int save_tsc_info(xc_interface *xch, uint32_t dom, int io_fd)
>  int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
>                     uint32_t max_factor, uint32_t flags,
>                     struct save_callbacks* callbacks, int hvm,
> -                   int checkpointed_stream)
> +                   int checkpointed_stream, int back_fd)
>  {
>      xc_dominfo_t info;
>      DECLARE_DOMCTL;
> @@ -899,7 +899,7 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
>      {
>          return xc_domain_save2(xch, io_fd, dom, max_iters,
>                                 max_factor, flags, callbacks, hvm,
> -                               checkpointed_stream);
> +                               checkpointed_stream, back_fd);
>      }
>  
>      DPRINTF("%s: starting save of domid %u", __func__, dom);
> diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
> index bf1ee15..504463e 100644
> --- a/tools/libxc/xc_sr_restore.c
> +++ b/tools/libxc/xc_sr_restore.c
> @@ -720,7 +720,7 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
>                         unsigned long *console_gfn, domid_t console_domid,
>                         unsigned int hvm, unsigned int pae, int superpages,
>                         int checkpointed_stream,
> -                       struct restore_callbacks *callbacks)
> +                       struct restore_callbacks *callbacks, int back_fd)
>  {
>      struct xc_sr_context ctx =
>          {
> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
> index 6102b66..d12e5b1 100644
> --- a/tools/libxc/xc_sr_save.c
> +++ b/tools/libxc/xc_sr_save.c
> @@ -821,7 +821,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
>  int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom,
>                      uint32_t max_iters, uint32_t max_factor, uint32_t flags,
>                      struct save_callbacks* callbacks, int hvm,
> -                    int checkpointed_stream)
> +                    int checkpointed_stream, int back_fd)
>  {
>      xen_pfn_t nr_pfns;
>      struct xc_sr_context ctx =
> diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
> index f393abc..f8c6cf0 100644
> --- a/tools/libxl/libxl_save_callout.c
> +++ b/tools/libxl/libxl_save_callout.c
> @@ -27,7 +27,7 @@
>   */
>  static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
>                         const char *mode_arg,
> -                       int stream_fd,
> +                       int stream_fd, int back_fd,
>                         const int *preserve_fds, int num_preserve_fds,
>                         const unsigned long *argnums, int num_argnums);
>  
> @@ -50,6 +50,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
>      /* Convenience aliases */
>      const uint32_t domid = dcs->guest_domid;
>      const int restore_fd = dcs->libxc_fd;
> +    const int send_fd = dcs->send_fd;
>      libxl__domain_build_state *const state = &dcs->build_state;
>  
>      unsigned cbflags =
> @@ -72,7 +73,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
>      shs->need_results = 1;
>      shs->toolstack_data_file = 0;
>  
> -    run_helper(egc, shs, "--restore-domain", restore_fd, 0, 0,
> +    run_helper(egc, shs, "--restore-domain", restore_fd, send_fd, 0, 0,
>                 argnums, ARRAY_SIZE(argnums));
>  }
>  
> @@ -96,7 +97,7 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_save_state *dss,
>      shs->caller_state = dss;
>      shs->need_results = 0;
>  
> -    run_helper(egc, shs, "--save-domain", dss->fd,
> +    run_helper(egc, shs, "--save-domain", dss->fd, dss->recv_fd,
>                 NULL, 0,
>                 argnums, ARRAY_SIZE(argnums));
>      return;
> @@ -119,14 +120,29 @@ void libxl__save_helper_init(libxl__save_helper_state *shs)
>  }
>  
>  /*----- helper execution -----*/
> +static int dup_fd_helper(libxl__gc *gc, int fd, const char *what)
> +{
> +    int dup_fd = fd;
> +
> +    if (fd <= 2) {
> +        dup_fd = dup(fd);
> +        if (dup_fd < 0) {
> +            LOGE(ERROR,"dup %s", what);
> +            exit(-1);
> +        }
> +    }
> +    libxl_fd_set_cloexec(CTX, dup_fd, 0);
> +
> +    return dup_fd;
> +}
>  
>  static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
> -                       const char *mode_arg, int stream_fd,
> +                       const char *mode_arg, int stream_fd, int back_fd,
>                         const int *preserve_fds, int num_preserve_fds,
>                         const unsigned long *argnums, int num_argnums)
>  {
>      STATE_AO_GC(shs->ao);
> -    const char *args[4 + num_argnums];
> +    const char *args[5 + num_argnums];
>      const char **arg = args;
>      int i, rc;
>  
> @@ -154,6 +170,7 @@ static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
>      *arg++ = getenv("LIBXL_SAVE_HELPER") ?: LIBEXEC_BIN "/" "libxl-save-helper";
>      *arg++ = mode_arg;
>      const char **stream_fd_arg = arg++;
> +    const char **back_fd_arg = arg++;
>      for (i=0; i<num_argnums; i++)
>          *arg++ = GCSPRINTF("%lu", argnums[i]);
>      *arg++ = 0;
> @@ -178,16 +195,12 @@ static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
>  
>      pid_t pid = libxl__ev_child_fork(gc, &shs->child, helper_exited);
>      if (!pid) {
> -        if (stream_fd <= 2) {
> -            stream_fd = dup(stream_fd);
> -            if (stream_fd < 0) {
> -                LOGE(ERROR,"dup migration stream fd");
> -                exit(-1);
> -            }
> -        }
> -        libxl_fd_set_cloexec(CTX, stream_fd, 0);
> +        stream_fd = dup_fd_helper(gc, stream_fd, "migration stream fd");
>          *stream_fd_arg = GCSPRINTF("%d", stream_fd);
>  
> +        back_fd = dup_fd_helper(gc, back_fd, "migration back channel fd");
> +        *back_fd_arg = GCSPRINTF("%d", back_fd);
> +
>          for (i=0; i<num_preserve_fds; i++)
>              if (preserve_fds[i] >= 0) {
>                  assert(preserve_fds[i] > 2);
> diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
> index 4c9d34c..9de5694 100644
> --- a/tools/libxl/libxl_save_helper.c
> +++ b/tools/libxl/libxl_save_helper.c
> @@ -235,6 +235,7 @@ static struct restore_callbacks helper_restore_callbacks;
>  int main(int argc, char **argv)
>  {
>      int r;
> +    int back_fd;
>  
>  #define NEXTARG (++argv, assert(*argv), *argv)
>  
> @@ -244,6 +245,7 @@ int main(int argc, char **argv)
>      if (!strcmp(mode,"--save-domain")) {
>  
>          io_fd =                    atoi(NEXTARG);
> +        back_fd =                  atoi(NEXTARG);
>          uint32_t dom =             strtoul(NEXTARG,0,10);
>          uint32_t max_iters =       strtoul(NEXTARG,0,10);
>          uint32_t max_factor =      strtoul(NEXTARG,0,10);
> @@ -259,12 +261,14 @@ int main(int argc, char **argv)
>          setup_signals(save_signal_handler);
>  
>          r = xc_domain_save2(xch, io_fd, dom, max_iters, max_factor, flags,
> -                           &helper_save_callbacks, hvm, checkpointed_stream);
> +                            &helper_save_callbacks, hvm, checkpointed_stream,
> +                            back_fd);
>          complete(r);
>  
>      } else if (!strcmp(mode,"--restore-domain")) {
>  
>          io_fd =                    atoi(NEXTARG);
> +        back_fd =                  atoi(NEXTARG);
>          uint32_t dom =             strtoul(NEXTARG,0,10);
>          unsigned store_evtchn =    strtoul(NEXTARG,0,10);
>          domid_t store_domid =      strtoul(NEXTARG,0,10);
> @@ -289,7 +293,7 @@ int main(int argc, char **argv)
>                                store_domid, console_evtchn, &console_mfn,
>                                console_domid, hvm, pae, superpages,
>                                checkpointed,
> -                              &helper_restore_callbacks);
> +                              &helper_restore_callbacks, back_fd);
>          helper_stub_restore_results(store_mfn,console_mfn,0);
>          complete(r);
>  
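
The dup_fd_helper() pattern in the quoted libxl_save_callout.c hunk
(move an fd off 0..2, then clear close-on-exec so it survives exec of
the helper) can be sketched standalone with plain POSIX calls. The name
dup_above_stdio() is hypothetical. Unlike the quoted helper, this sketch
uses fcntl(F_DUPFD, 3) instead of dup(), and returns -1 rather than
calling exit():

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns an fd > 2 that refers to the same open file as fd, with
 * FD_CLOEXEC cleared, or -1 on error. If fd is already > 2 it is
 * returned unchanged (apart from the cleared flag). */
static int dup_above_stdio(int fd)
{
    int out = fd;
    int flags;

    if (fd <= 2) {
        out = fcntl(fd, F_DUPFD, 3);    /* lowest free fd >= 3 */
        if (out < 0)
            return -1;
    }

    flags = fcntl(out, F_GETFD);
    if (flags < 0 || fcntl(out, F_SETFD, flags & ~FD_CLOEXEC) < 0) {
        if (out != fd)
            close(out);                 /* do not leak the duplicate */
        return -1;
    }

    return out;
}
```

A caller would format the returned fd into the helper's argument list
after fork() and before exec, as the quoted run_helper() does for both
stream_fd and back_fd.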

* Re: [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device Yang Hongyang
@ 2015-07-15 13:15   ` Ian Campbell
  2015-07-15 13:34     ` Yang Hongyang
  2015-07-15 13:32   ` Ian Campbell
  1 sibling, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 13:15 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> This patch is auto generated by the following commands:
>  1. git mv tools/libxl/libxl_remus_device.c tools/libxl/libxl_checkpoint_device.c

This patch does not appear to have been formatted with git format-patch
-M as requested last time around.

* Re: [PATCH v4 --for 4.6 COLOPre 23/25] tools/libxl: store remus_ops in checkpoint device state
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 23/25] tools/libxl: store remus_ops in checkpoint device state Yang Hongyang
@ 2015-07-15 13:21   ` Ian Campbell
  0 siblings, 0 replies; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 13:21 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> Checkpoint device is an abstract layer to do checkpoint.
> COLO can also use it to do checkpoint. But there are
> still some codes in checkpoint device which touch remus.
> 
> This patch and the following 2 will seperate remus from
> checkpoint device layer.
> 
> We use remus ops directly in checkpoint device. Store it
> in checkpoint device state so that we do not aware of
> remus_ops in the checkpoint device layer.
> 
> it is pure refactoring and no functional changes.
> 
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

> @@ -172,7 +164,7 @@ static void device_setup_iterate(libxl__egc *egc, libxl__ao_device *aodev)
>          goto out;
>  
>      do {
> -        dev->ops = remus_ops[++dev->ops_index];
> +        dev->ops = dev->cds->ops[++dev->ops_index];

This do/while loop is really quite confusingly structured, but that's
not your fault.

Ian.

* Re: [PATCH v4 --for 4.6 COLOPre 20/25] tools/libx{l, c}: add back channel to libxc
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 20/25] tools/libx{l, c}: add back channel to libxc Yang Hongyang
  2015-07-15 13:13   ` Ian Campbell
@ 2015-07-15 13:21   ` Andrew Cooper
  2015-07-16  6:07     ` Yang Hongyang
  1 sibling, 1 reply; 101+ messages in thread
From: Andrew Cooper @ 2015-07-15 13:21 UTC (permalink / raw)
  To: Yang Hongyang, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram

On 15/07/15 08:45, Yang Hongyang wrote:
> In COLO mode, both VMs are running, and are considered in sync if the
> visible network traffic is identical.  After some time, they fall out of
> sync.
>
> At this point, the two VMs have definitely diverged.  Lets call the
> primary dirty bitmap set A, while the secondary dirty bitmap set B.
>
> Sets A and B are different.
>
> Under normal migration, the page data for set A will be sent form the
> primary to the secondary.
>
> However, the set difference B - A (lets call this C) is out-of-date on
> the secondary (with respect to the primary) and will not be sent by the
> primary, as it was not memory dirtied by the primary.  The secondary
> needs the page data for C to reconstruct an exact copy of the primary at
> the checkpoint.
>
> The secondary cannot calculate C as it doesn't know A.  Instead, the
> secondary must send B to the primary, at which point the primary
> calculates the union of A and B (lets call this D) which is all the
> pages dirtied by both the primary and the secondary, and sends all page
> data covered by D.
>
> In the general case, D is a superset of both A and B.  Without the
> backchannel dirty bitmap, a COLO checkpoint can't reconstruct a valid
> copy of the primary.
>
> We transfer the dirty bitmap on libxc side, so we need to introduce back
> channel to libxc.
>
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> commit message:
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxc/include/xenguest.h   |  8 ++++----
>  tools/libxc/xc_domain_restore.c  |  4 ++--
>  tools/libxc/xc_domain_save.c     |  4 ++--
>  tools/libxc/xc_sr_restore.c      |  2 +-
>  tools/libxc/xc_sr_save.c         |  2 +-
>  tools/libxl/libxl_save_callout.c | 39 ++++++++++++++++++++++++++-------------
>  tools/libxl/libxl_save_helper.c  |  8 ++++++--
>  7 files changed, 42 insertions(+), 25 deletions(-)

You have not patched xc_nomigrate.c, which means this will break the ARM
build.  (I fell into the same trap, requiring c/s f50fe3a5 as a fixup).

Having said that, I plan to throw together some cleanup patches removing
files like xc_domain_{save,restore}.c and dropping most of the
parameters from the parameter list, as they are superfluous.

I will try to get my cleanup done shortly, which should make this prereq
series easier, although I am focusing on some hypervisor side fixes
right at the moment.

~Andrew
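
To illustrate the xc_nomigrate.c point above: a stub file that defines
the same functions as the real implementation has to follow every
prototype change, otherwise the definition conflicts with the shared
header on builds that compile the stub. A minimal sketch, using a
hypothetical trimmed-down function rather than the real xc_domain_save()
signature, and assuming the stubs simply fail with ENOSYS:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical prototype, as it would appear in the shared header
 * after the new back_fd parameter is added. */
int toy_domain_save(int io_fd, uint32_t dom,
                    int checkpointed_stream, int back_fd);

/* Stub for builds without migration support. It must carry the same
 * parameter list as the prototype above; leaving out back_fd here is a
 * conflicting-types error at compile time. */
int toy_domain_save(int io_fd, uint32_t dom,
                    int checkpointed_stream, int back_fd)
{
    (void)io_fd;
    (void)dom;
    (void)checkpointed_stream;
    (void)back_fd;

    errno = ENOSYS;
    return -1;
}
```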

* Re: [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure Yang Hongyang
@ 2015-07-15 13:28   ` Ian Campbell
  2015-07-15 13:50     ` Yang Hongyang
  2015-07-15 15:08   ` Ian Jackson
  1 sibling, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 13:28 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> @@ -2921,6 +2911,26 @@ _hidden void libxl__checkpoint_devices_preresume(libxl__egc *egc,
>                                          libxl__checkpoint_devices_state *cds);
>  _hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
>                                          libxl__checkpoint_devices_state *cds);
> +
> +/*----- Remus related state structure -----*/
> +typedef struct libxl__remus_state libxl__remus_state;
> +struct libxl__remus_state {
> +    /* private */
> +    libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
> +    int interval; /* checkpoint interval */
> +
> +    /* abstract layer */
> +    libxl__checkpoint_devices_state cds;

This mostly makes sense, I think, but this one field feels like it will
be wanted by colo too. Does that mean we will end up with dss->rs.cds
and dss->colo.cds doing effectively the same thing?

Ian.

* Re: [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device Yang Hongyang
  2015-07-15 13:15   ` Ian Campbell
@ 2015-07-15 13:32   ` Ian Campbell
  2015-07-15 13:38     ` Yang Hongyang
  2015-07-16  9:23     ` Yang Hongyang
  1 sibling, 2 replies; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 13:32 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>  tools/libxl/libxl_types.idl           |   4 +-

> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index e8d3647..1d676ef 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -61,8 +61,8 @@ libxl_error = Enumeration("error", [
>      (-15, "LOCK_FAIL"),
>      (-16, "JSON_CONFIG_EMPTY"),
>      (-17, "DEVICE_EXISTS"),
> -    (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
> -    (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
> +    (-18, "CHECKPOINT_DEVOPS_DOES_NOT_MATCH"),
> +    (-19, "CHECKPOINT_DEVICE_NOT_SUPPORTED"),
>      (-20, "VNUMA_CONFIG_INVALID"),
>      (-21, "DOMAIN_NOTFOUND"),
>      (-22, "ABORTED"),

This is an API change, which I think we discussed before.

In <558BC6EE.60801@cn.fujitsu.com> you said you would add an extra patch
to deal with that, and I think that needs to come before this automatic
renaming so that there is no bisect hazard. I don't see any such patch
even after this point though (from grepping your colo-v8 branch).

Ian.
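
One common way to avoid breaking callers when enum entries are renamed
is to keep the old names as aliases for the new ones. The following is
a sketch with a toy enum, not the libxl IDL output, and not necessarily
how the series will resolve this; all TOY_ names are hypothetical:

```c
/* Toy enum mirroring the renamed entries and their values. */
enum toy_error {
    TOY_ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH = -18,
    TOY_ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED  = -19
};

/* Compatibility aliases: code written against the old names still
 * builds and sees the same numeric values. */
#define TOY_ERROR_REMUS_DEVOPS_DOES_NOT_MATCH \
    TOY_ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH
#define TOY_ERROR_REMUS_DEVICE_NOT_SUPPORTED \
    TOY_ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED

/* Example consumer that only knows the new names. */
static const char *toy_error_name(int rc)
{
    switch (rc) {
    case TOY_ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH:
        return "CHECKPOINT_DEVOPS_DOES_NOT_MATCH";
    case TOY_ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED:
        return "CHECKPOINT_DEVICE_NOT_SUPPORTED";
    default:
        return "UNKNOWN";
    }
}
```

Placing such aliases in a patch before the automatic rename keeps every
commit in the series buildable, which addresses the bisect concern.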

* Re: [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device
  2015-07-15 13:15   ` Ian Campbell
@ 2015-07-15 13:34     ` Yang Hongyang
  2015-07-16  9:26       ` Andrew Cooper
  0 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15 13:34 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson



On 07/15/2015 09:15 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>> This patch is auto generated by the following commands:
>>   1. git mv tools/libxl/libxl_remus_device.c tools/libxl/libxl_checkpoint_device.c
>
> This patch does not appear to have been formatted with git format-patch
> -M as requested last time around.

Sorry, I missed this :(
Will do in the next version. By the way, I have a dumb question: how do
I specify -M for only this patch while it is in a series?

>
>
> .
>

-- 
Thanks,
Yang.

* Re: [PATCH v4 --for 4.6 COLOPre 25/25] tools/libxl: seperate device init/cleanup from checkpoint device layer
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 25/25] tools/libxl: seperate device init/cleanup from checkpoint device layer Yang Hongyang
@ 2015-07-15 13:37   ` Ian Campbell
  0 siblings, 0 replies; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 13:37 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, eddie.dong, wency, andrew.cooper3, yunhong.jiang,
	ian.jackson, xen-devel, guijianfeng, rshriram

On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> we call (init|cleanup)_subkind_nic and (init|cleanup)_subkind_drbd_disk
> directly in checkpoint device. Move them to libxl_remus.c, Call them before
> calling libxl__checkpoint_devices_setup() or after calling
> libxl__checkpoint_devices_teardown().
> it is pure refactoring and no functional changes.
> 
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

> @@ -86,14 +54,10 @@ static void checkpoint_devices_setup(libxl__egc *egc,
>  void libxl__checkpoint_devices_setup(libxl__egc *egc,
>                                       libxl__checkpoint_devices_state *cds)
>  {
> -    int i, rc;
> +    int i;
>  
>      STATE_AO_GC(cds->ao);
>  
> -    rc = init_device_subkind(cds);
> -    if (rc)
> -        goto out;
> -
>      cds->num_devices = 0;
>      cds->num_nics = 0;
>      cds->num_disks = 0;
> @@ -126,7 +90,7 @@ void libxl__checkpoint_devices_setup(libxl__egc *egc,
>      return;
>  
>  out:
> -    cds->callback(egc, cds, rc);
> +    cds->callback(egc, cds, 0);

This change highlights a slightly odd (non-error) flow of code when
there are no NICs or disks. I think it would be more natural for
remus_devices_setup (with its new name) to handle this by skipping its
operation and calling all_devices_setup_cb directly when there are no
devices.

A cleanup for another time though.

* Re: [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device
  2015-07-15 13:32   ` Ian Campbell
@ 2015-07-15 13:38     ` Yang Hongyang
  2015-07-16  9:23     ` Yang Hongyang
  1 sibling, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15 13:38 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson



On 07/15/2015 09:32 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>>   tools/libxl/libxl_types.idl           |   4 +-
>
>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>> index e8d3647..1d676ef 100644
>> --- a/tools/libxl/libxl_types.idl
>> +++ b/tools/libxl/libxl_types.idl
>> @@ -61,8 +61,8 @@ libxl_error = Enumeration("error", [
>>       (-15, "LOCK_FAIL"),
>>       (-16, "JSON_CONFIG_EMPTY"),
>>       (-17, "DEVICE_EXISTS"),
>> -    (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
>> -    (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
>> +    (-18, "CHECKPOINT_DEVOPS_DOES_NOT_MATCH"),
>> +    (-19, "CHECKPOINT_DEVICE_NOT_SUPPORTED"),
>>       (-20, "VNUMA_CONFIG_INVALID"),
>>       (-21, "DOMAIN_NOTFOUND"),
>>       (-22, "ABORTED"),
>
> This is an API change, which I think we discussed before.

Also missed this one, sorry.

>
> In <558BC6EE.60801@cn.fujitsu.com> you said you would add an extra patch
> to deal with that, and I think that needs to come before this automatic

I will add that patch before the automatic renaming.

> renaming so that there is no bisect hazard. I don't see any such patch
> even after this point though (from grepping your colo-v8 branch).
>
> Ian.
>
> .
>

-- 
Thanks,
Yang.


* Re: [PATCH v4 --for 4.6 COLOPre 14/25] tools/libxl: introduce libxl__domain_restore_device_model to load qemu state
  2015-07-15 12:45   ` Ian Campbell
@ 2015-07-15 13:42     ` Yang Hongyang
  0 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15 13:42 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, rshriram, guijianfeng, Anthony Perard, ian.jackson



On 07/15/2015 08:45 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>> In normal migration, the qemu state was passed to qemu as a parameter.
>> With COLO, Secondary vm is running. So we will do the following steps
>> at every checkpoint:
>> 1. suspend both primay vm and secondary vm
>
> "primary"
>
>> 2. sync the state
>> 3. resume both primary vm and secondary vm
>> Primary will send qemu's state in step2, and
>> Secondary's qemu should read it and restore the state before it
>> is resumed. We can not pass the state to qemu as a parameter because
>> Secondary QEMU already started at this point, so we introduce
>> libxl__domain_restore_device_model() to do it.
>> This API should be called before resuming secondary vm.
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> Cc: Anthony Perard <anthony.perard@citrix.com>
>> ---
>>   tools/libxl/libxl_dom_save.c | 29 +++++++++++++++++++++++++++++
>>   tools/libxl/libxl_internal.h |  3 +++
>>   tools/libxl/libxl_qmp.c      | 10 ++++++++++
>>   3 files changed, 42 insertions(+)
>>
>> diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
>> index f89f5d4..0926b71 100644
>> --- a/tools/libxl/libxl_dom_save.c
>> +++ b/tools/libxl/libxl_dom_save.c
>> @@ -675,6 +675,35 @@ out:
>>       return ret;
>>   }
>>
>> +int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid)
>> +{
>> +    char *state_file;
>> +    int rc;
>> +
>> +    switch (libxl__device_model_version_running(gc, domid)) {
>> +    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
>> +        /* not supported now */
>> +        rc = ERROR_INVAL;
>> +        break;
>> +    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
>> +        /*
>> +         * This function may be called too many times for the same gc,
>> +         * so we use NOGC, and free the memory before return to avoid
>> +         * OOM.
>> +         */
>
> It occurs to me that domid shouldn't change for the duration of a COLO
> run, right?

right!

>
> Thus I think the path can be allocated once at start of day and not per
> iteration, and can be stored in suspend_state (or similar) and passed in
> here. Hence no complexity like a nested ao is needed.

Good idea, thank you!
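
A minimal sketch of that idea, assuming a hypothetical long-lived state
structure (checkpoint_run_state and its helpers are illustrative names, not
libxl API, and the path prefix is a placeholder for
XC_DEVICE_MODEL_RESTORE_FILE). The path is built once per run and only read
at each checkpoint, so nothing is allocated per iteration:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder for XC_DEVICE_MODEL_RESTORE_FILE; the real value is not
 * shown in this thread, so this path is an assumption. */
#define DM_RESTORE_FILE "/var/lib/xen/qemu-resume"

/* Hypothetical long-lived state for one checkpointing run. */
typedef struct {
    uint32_t domid;
    char *dm_restore_path;   /* built once, reused at every checkpoint */
} checkpoint_run_state;

/* Start of day: build the path once. Returns 0 on success, -1 on failure. */
static int checkpoint_run_init(checkpoint_run_state *st, uint32_t domid)
{
    int len = snprintf(NULL, 0, DM_RESTORE_FILE ".%u", (unsigned)domid);
    if (len < 0)
        return -1;

    st->dm_restore_path = malloc((size_t)len + 1);
    if (!st->dm_restore_path)
        return -1;

    snprintf(st->dm_restore_path, (size_t)len + 1,
             DM_RESTORE_FILE ".%u", (unsigned)domid);
    st->domid = domid;
    return 0;
}

/* Per checkpoint: no allocation, just hand back the stored path. */
static const char *checkpoint_run_restore_path(const checkpoint_run_state *st)
{
    assert(st->dm_restore_path != NULL);
    return st->dm_restore_path;
}

/* End of run: release the path exactly once. */
static void checkpoint_run_dispose(checkpoint_run_state *st)
{
    free(st->dm_restore_path);
    st->dm_restore_path = NULL;
}
```

With this shape the restore function would simply take the stored path as an
argument, and the NOGC allocation plus free() per call goes away.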

>
>> +        state_file = libxl__sprintf(NOGC,
>> +                                    XC_DEVICE_MODEL_RESTORE_FILE".%d",
>> +                                    domid);
>> +        rc = libxl__qmp_restore(gc, domid, state_file);
>> +        free(state_file);
>> +        break;
>> +    default:
>> +        rc = ERROR_INVAL;
>> +    }
>> +
>> +    return rc;
>> +}
>> +
>>   /*
>>    * Local variables:
>>    * mode: C
>> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
>> index 0eb5f41..fb777c1 100644
>> --- a/tools/libxl/libxl_internal.h
>> +++ b/tools/libxl/libxl_internal.h
>> @@ -1074,6 +1074,7 @@ _hidden int libxl__domain_rename(libxl__gc *gc, uint32_t domid,
>>
>>   _hidden int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
>>                                        uint32_t size, void *data);
>> +_hidden int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid);
>>   _hidden int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid);
>>
>>   _hidden const char *libxl__userdata_path(libxl__gc *gc, uint32_t domid,
>> @@ -1702,6 +1703,8 @@ _hidden int libxl__qmp_stop(libxl__gc *gc, int domid);
>>   _hidden int libxl__qmp_resume(libxl__gc *gc, int domid);
>>   /* Save current QEMU state into fd. */
>>   _hidden int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename);
>> +/* Load current QEMU state from fd. */
>> +_hidden int libxl__qmp_restore(libxl__gc *gc, int domid, const char *filename);
>>   /* Set dirty bitmap logging status */
>>   _hidden int libxl__qmp_set_global_dirty_log(libxl__gc *gc, int domid, bool enable);
>>   _hidden int libxl__qmp_insert_cdrom(libxl__gc *gc, int domid, const libxl_device_disk *disk);
>> diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
>> index 6484f5e..080cb9f 100644
>> --- a/tools/libxl/libxl_qmp.c
>> +++ b/tools/libxl/libxl_qmp.c
>> @@ -904,6 +904,16 @@ int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename)
>>                              NULL, NULL);
>>   }
>>
>> +int libxl__qmp_restore(libxl__gc *gc, int domid, const char *state_file)
>> +{
>> +    libxl__json_object *args = NULL;
>> +
>> +    qmp_parameters_add_string(gc, &args, "filename", state_file);
>> +
>> +    return qmp_run_command(gc, domid, "xen-load-devices-state", args,
>> +                           NULL, NULL);
>> +}
>> +
>>   static int qmp_change(libxl__gc *gc, libxl__qmp_handler *qmp,
>>                         char *device, char *target, char *arg)
>>   {
>
>
> .
>

-- 
Thanks,
Yang.


* Re: [PATCH v4 --for 4.6 COLOPre 15/25] tools/libxl: check QEMU state before resume dm
  2015-07-15 13:00       ` Wei Liu
@ 2015-07-15 13:48         ` Ian Campbell
  0 siblings, 0 replies; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 13:48 UTC (permalink / raw)
  To: Wei Liu
  Cc: wency, andrew.cooper3, yunhong.jiang, eddie.dong, xen-devel,
	guijianfeng, rshriram, Yang Hongyang, Ian Jackson

On Wed, 2015-07-15 at 14:00 +0100, Wei Liu wrote:
> On Wed, Jul 15, 2015 at 01:54:12PM +0100, Ian Campbell wrote:
> > On Wed, 2015-07-15 at 13:48 +0100, Ian Campbell wrote:
> > > >      switch (libxl__device_model_version_running(gc, domid)) {
> > > >      case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
> > > > -        libxl__qemu_traditional_cmd(gc, domid, "continue");
> > > > -        libxl__wait_for_device_model_deprecated(gc, domid, "running", NULL, NULL, NULL);
> > > > +        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
> > > > +
> > > > +        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
> > > > +        state = libxl__xs_read(gc, XBT_NULL, path);
> > > > +        if (state != NULL && !strcmp(state, "paused")) {
> > > > +            libxl__qemu_traditional_cmd(gc, domid, "continue");
> > > 
> > > Please can you explain the apparent discrepancy between the use of
> > > dm_domid and domid here?
> > 
> > I see from the next patch that this pattern came from the existing
> > libxl_domain_unpause, which hopes to use this helper in the future.
> > 
> > Looking at git annotate:
> > 83cc69fa        (Ian Jackson    2012-06-28 18:43:28 +0100       1045)    if (type == LIBXL_DOMAIN_TYPE_HVM) {
> > 1fc3aeb3        (   Wei Liu     2015-04-09 19:49:25 +0100       1046)        uint32_t dm_domid = libxl_get_stubdom_id(ctx, domid);
> > 1fc3aeb3        (   Wei Liu     2015-04-09 19:49:25 +0100       1047)
> > 1fc3aeb3        (   Wei Liu     2015-04-09 19:49:25 +0100       1048)        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
> > bdf07e8e        (Ian Jackson    2011-12-12 17:48:42 +0000       1049)        state = libxl__xs_read(gc, XBT_NULL, path);
> > d1c7c3ef        (Keir Fraser    2009-11-30 10:53:39 +0000       1050)        if (state != NULL && !strcmp(state, "paused")) {
> > 0cb90b31        (Shriram Rajagopalan    2012-02-09 18:07:48 +0000       1051)            libxl__qemu_traditional_cmd(gc, domid, "continue");
> > 47cb2273        (Ian Jackson    2013-10-14 17:26:01 +0100       1052)            libxl__wait_for_device_model_deprecated(gc, domid, "running",
> > 3b6eaa3e        (Ian Campbell   2011-05-24 15:57:24 +0100       1053)                                         NULL, NULL, NULL);
> > d1c7c3ef        (Keir Fraser    2009-11-30 10:53:39 +0000       1054)        }
> > 
> > It seems this came from Wei in 1fc3aeb3aa26 "libxl: use new QEMU
> > xenstore protocol". I suspect it was a mistake. Wei?
> > 
> 
> No, it's not.
> 
> libxl__qemu_traditional_cmd accepts domid and then it calls
> libxl_get_stubdom_id to extract dm_domid.

How... exciting.

Some sort of helper to get the DM state would help to hide this sort of
wrinkle.

Anyway, I shall go see if this means I can ack the COLO patches which
moved this code.

Ian.


* Re: [PATCH v4 --for 4.6 COLOPre 15/25] tools/libxl: check QEMU state before resume dm
  2015-07-15 12:48   ` Ian Campbell
  2015-07-15 12:54     ` Ian Campbell
@ 2015-07-15 13:49     ` Ian Campbell
  1 sibling, 0 replies; 101+ messages in thread
From: Ian Campbell @ 2015-07-15 13:49 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On Wed, 2015-07-15 at 13:48 +0100, Ian Campbell wrote:
> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> > check QEMU state before resume dm on QEMU_XEN_TRADITIONAL.
> > 
> > Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> > CC: Ian Campbell <Ian.Campbell@citrix.com>
> > CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> > CC: Wei Liu <wei.liu2@citrix.com>
> > ---
> >  tools/libxl/libxl_dom_suspend.c | 13 +++++++++++--
> >  1 file changed, 11 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
> > index 6f04c26..686a49b 100644
> > --- a/tools/libxl/libxl_dom_suspend.c
> > +++ b/tools/libxl/libxl_dom_suspend.c
> > @@ -434,11 +434,20 @@ static void domain_suspend_callback_common_done(libxl__egc *egc,
> >  
> >  int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
> >  {
> > +    char *path;
> > +    char *state;
> 
> Can both be const.
> 
> Could also be on one line, but that is a matter of taste so up to
> you.
>  
> >      switch (libxl__device_model_version_running(gc, domid)) {
> >      case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
> > -        libxl__qemu_traditional_cmd(gc, domid, "continue");
> > -        libxl__wait_for_device_model_deprecated(gc, domid, "running", NULL, NULL, NULL);
> > +        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
> > +
> > +        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
> > +        state = libxl__xs_read(gc, XBT_NULL, path);
> > +        if (state != NULL && !strcmp(state, "paused")) {
> > +            libxl__qemu_traditional_cmd(gc, domid, "continue");
> 
> Please can you explain the apparent discrepancy between the use of
> dm_domid and domid here?

Wei explained this, so with at least the const-ness issue mentioned
above fixed:
Acked-by: Ian Campbell <ian.campbell@citrix.com>


* Re: [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure
  2015-07-15 13:28   ` Ian Campbell
@ 2015-07-15 13:50     ` Yang Hongyang
  2015-07-16 10:37       ` Ian Campbell
  0 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15 13:50 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson



On 07/15/2015 09:28 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>> @@ -2921,6 +2911,26 @@ _hidden void libxl__checkpoint_devices_preresume(libxl__egc *egc,
>>                                           libxl__checkpoint_devices_state *cds);
>>   _hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
>>                                           libxl__checkpoint_devices_state *cds);
>> +
>> +/*----- Remus related state structure -----*/
>> +typedef struct libxl__remus_state libxl__remus_state;
>> +struct libxl__remus_state {
>> +    /* private */
>> +    libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
>> +    int interval; /* checkpoint interval */
>> +
>> +    /* abstract layer */
>> +    libxl__checkpoint_devices_state cds;
>
> This mostly makes sense, I think, but this one field feels like it will
> be wanted by colo too. Does that mean we will end up with dss->rs.cds
> and dss->colo.cds doing effectively the same thing?

Yes. The checkpoint device is an abstract layer used by both Remus and COLO.
The abstract layer itself is not aware of Remus or COLO; in the Remus or COLO
code we can use a container_of-style lookup on cds to retrieve the Remus/COLO
state.
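
As a rough illustration of that lookup (a sketch only: the struct names are
trimmed-down stand-ins for the libxl types, colo_state is hypothetical, and
container_of here is the generic idiom rather than libxl's own macro), each
concrete state embeds cds by value and steps back from the cds pointer it is
handed:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for libxl__checkpoint_devices_state: the abstract layer. */
typedef struct checkpoint_devices_state {
    int device_kind_flags;
} checkpoint_devices_state;

/* Stand-in for libxl__remus_state: private data plus the embedded cds. */
typedef struct remus_state {
    int interval;                    /* Remus-private */
    checkpoint_devices_state cds;    /* abstract layer, embedded by value */
} remus_state;

/* Hypothetical COLO counterpart; it embeds its own cds the same way. */
typedef struct colo_state {
    int colo_private;                /* placeholder for COLO-private data */
    checkpoint_devices_state cds;
} colo_state;

/* Generic container_of: from a pointer to a member back to its parent. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* A Remus-side callback only receives the abstract cds pointer, yet can
 * still reach its own private state. */
static int remus_interval_from_cds(checkpoint_devices_state *cds)
{
    remus_state *rs = container_of(cds, remus_state, cds);
    return rs->interval;
}
```

So dss->rs.cds and a COLO-side cds would be two instances of the same
abstract type, each owned by the concrete layer that embeds it.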

>
> Ian.
>
> .
>

-- 
Thanks,
Yang.


* Re: [PATCH v4 --for 4.6 COLOPre 12/25] tools/libxl: introduce enum type libxl_checkpointed_stream
  2015-07-15 12:34   ` Ian Campbell
@ 2015-07-15 13:58     ` Yang Hongyang
  2015-07-16 10:34       ` Ian Campbell
  0 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15 13:58 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson



On 07/15/2015 08:34 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>> introduce enum type libxl_checkpointed_stream in IDL.
>> rename the last argument of migrate_receive from "remus" to
>> "checkpointed" since the semantics of this parameter has
>> changed.
>>
>> NOTE:
>>   libxl_domain_restore_params isn't changed here,
>>   checkpointed_stream is still an int.
>>   It has to change eventually and other callers will have to be
>>   updated to cope (and there should be LIBXL_HAVE_...).
>
> Will this be fixed up later in this series? If so please say so.

It's not fixed in this series. I had planned to fix it later, but since it
seems there will be another round for this series, I can fix it in the next
version. My main concern is that this is an API change and will affect
existing callers.

>
>> @@ -4282,7 +4282,7 @@ static void migrate_domain(uint32_t domid, const char *rune, int debug,
>>   }
>>
>>   static void migrate_receive(int debug, int daemonize, int monitor,
>> -                            int send_fd, int recv_fd, int remus)
>> +                            int send_fd, int recv_fd, int checkpointed)
>
> I think you can start using the new enum type in xl straight away even
> if dom_info.checkpointed_stream remains an int. So that means here.
>
>> @@ -4489,7 +4489,8 @@ int main_restore(int argc, char **argv)
>>
>>   int main_migrate_receive(int argc, char **argv)
>>   {
>> -    int debug = 0, daemonize = 1, monitor = 1, remus = 0;
>> +    int debug = 0, daemonize = 1, monitor = 1;
>> +    int checkpointed = LIBXL_CHECKPOINTED_STREAM_NONE;
>
> and here.
>
>> @@ -4318,7 +4318,7 @@ static void migrate_receive(int debug, int daemonize, int monitor,
>>
>>       domid = rc;
>>
>> -    if (remus) {
>> +    if (checkpointed) {
>>           /* If we are here, it means that the sender (primary) has crashed.
>>            * TODO: Split-Brain Check.
>>            */
>
> Is it the case that we expect all check pointing solutions will use the
> same failover code here? If yes then this should be "if (checkpointed !
> = ...NONE)".
>
> If we think they might differ (even if remus and colo happen to be the
> same) then I think a switch where the NONE case does nothing would be
> more structurally appropriate.
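
A sketch of the switch form suggested above, assuming local stand-ins for the
enum (the COLO value and the action names are hypothetical; only NONE and
REMUS are named in this series). NONE falls through to "do nothing", and each
checkpointing flavour gets its own failover arm:

```c
#include <assert.h>

/* Local stand-in for libxl_checkpointed_stream; values are illustrative. */
typedef enum {
    CHECKPOINTED_STREAM_NONE = 0,
    CHECKPOINTED_STREAM_REMUS = 1,
    CHECKPOINTED_STREAM_COLO = 2    /* assumption: not added by this series */
} checkpointed_stream;

/* Hypothetical description of what the receiver does once the stream ends. */
typedef enum {
    ACTION_NOTHING,
    ACTION_REMUS_FAILOVER,
    ACTION_COLO_FAILOVER
} post_receive_action;

static post_receive_action action_after_receive(checkpointed_stream cs)
{
    switch (cs) {
    case CHECKPOINTED_STREAM_NONE:
        /* Plain migration: nothing special to do. */
        return ACTION_NOTHING;
    case CHECKPOINTED_STREAM_REMUS:
        /* The sender (primary) has crashed: take over. */
        return ACTION_REMUS_FAILOVER;
    case CHECKPOINTED_STREAM_COLO:
        /* May need different handling from Remus. */
        return ACTION_COLO_FAILOVER;
    }
    return ACTION_NOTHING;          /* unknown value: be conservative */
}
```

If the failover code turns out to be identical for every flavour, the arms
collapse and a plain comparison against NONE is enough.
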
>
> Ian.
>
> .
>

-- 
Thanks,
Yang.


* Re: [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure Yang Hongyang
  2015-07-15 13:28   ` Ian Campbell
@ 2015-07-15 15:08   ` Ian Jackson
  2015-07-15 15:18     ` Yang Hongyang
  1 sibling, 1 reply; 101+ messages in thread
From: Ian Jackson @ 2015-07-15 15:08 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, xen-devel, guijianfeng, rshriram

Yang Hongyang writes ("[Xen-devel] [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure"):
> Add a new structure remus state, and move concrete layer's private
> member to remus state.
> it is pure refactoring and no functional changes.

Thanks.  I don't have much to add to what Ian Campbell has said, but

>      if (dss->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_REMUS) {
> -        dss->interval = r_info->interval;
>          if (libxl_defbool_val(r_info->compression))
>              dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;

In your next version it would be worth mentioning the movement of this
initialisation in the commit message.

Ian.


* Re: [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure
  2015-07-15 15:08   ` Ian Jackson
@ 2015-07-15 15:18     ` Yang Hongyang
  0 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-15 15:18 UTC (permalink / raw)
  To: Ian Jackson
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, xen-devel, guijianfeng, rshriram



On 07/15/2015 11:08 PM, Ian Jackson wrote:
> Yang Hongyang writes ("[Xen-devel] [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure"):
>> Add a new structure remus state, and move concrete layer's private
>> member to remus state.
>> it is pure refactoring and no functional changes.
>
> Thanks.  I don't have much to add to what Ian Campbell has said, but
>
>>       if (dss->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_REMUS) {
>> -        dss->interval = r_info->interval;
>>           if (libxl_defbool_val(r_info->compression))
>>               dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
>
> In your next version it would be worth mentioning the movement of this
> initialisation in the commit message.

Ok, thanks!

>
> Ian.
> .
>

-- 
Thanks,
Yang.


* Re: [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO
  2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
                   ` (24 preceding siblings ...)
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 25/25] tools/libxl: seperate device init/cleanup from checkpoint device layer Yang Hongyang
@ 2015-07-16  1:37 ` Yang Hongyang
  25 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16  1:37 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram, ian.jackson

It seems the replies I sent last night were lost; they didn't appear on the
list, so I'm going to repost them.

On 07/15/2015 03:45 PM, Yang Hongyang wrote:
> This patchset is Prerequisite for COLO feature. Refer to:
> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
>
> This patchse is based on Andrew Cooper's Libxl migration v4.1:
>    http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/libxl-migv2-v4.1
>
> In this version, I moved some of the COLO specific patches down to the COLO
> main series, so most patches of this series are refactoring and can be applied
> first.
>
> I've done some simple test. Both Remus and normal migration work after apply
> this patchset. The patch to fix Remus on migration v2 will be sent later as
> a seperate patch.
>
> You can also get the patchset from:
>    https://github.com/macrosheep/xen/tree/colo-v8
>
> v3->v4:
>   - Rebased to the latest migration v2 branch
>   - Addressed comments from last round
>
> v2->v3:
>   - Merge '[PATCH v2 0/6] Misc cleanups for libxl' into this patchset
>     for easy review
>   - Addressed review comments
>   - Add back channel to libxc
>   - Introduce should_checkpoint callback
>   - Introduce DIRTY_BITMAP record on libxc side
>   - Introduce COLO_CONTEXT record on libxl side
>   - Ported to Libxl migration v2
>
> v1->v2:
>   - Rebased to [PATCH v2 0/6] Misc cleanups for libxl
>   - Add a bugfix for the error handling of process_record
>
>
> Wen Congyang (2):
>    tools/libxc: support to resume uncooperative HVM guests
>    tools/libxl: Add back channel to allow migration target send data back
>
> Yang Hongyang (23):
>    tools/libxl: rename libxl__domain_suspend to libxl__domain_save
> A  tools/libxl: move domain suspend code into libxl_dom_suspend.c
> A  tools/libxl: move domain resume code into libxl_dom_suspend.c
>    tools/libxl: rename remus checkpoint callbacks
>    libxl/remus: introduce libxl__remus_setup
>    libxl/remus: introduce libxl__remus_teardown
>    libxl/remus: init checkpoint_callback in Remus checkpoint callback
>    tools/libxl: move remus code into libxl_remus.c
> A  tools/libxl: move save/restore code into libxl_dom_save.c
>    libxl/save: Refactor libxl__domain_suspend_state
>    tools/libxl: introduce enum type libxl_checkpointed_stream
>    migration/save: pass checkpointed_stream from libxl to libxc
>    tools/libxl: introduce libxl__domain_restore_device_model to load qemu
>      state
>    tools/libxl: check QEMU state before resume dm
>    tools/libxl: Update libxl_domain_unpause() to support qemu-xen
> A  tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()
> A  tools/libxl: export logdirty_init
>    tools/libx{l,c}: add back channel to libxc
>    tools/libxl: rename remus device to checkpoint device
> A  tools/libxl: adjust the indentation
>    tools/libxl: store remus_ops in checkpoint device state
>    tools/libxl: move remus state into a seperate structure
>    tools/libxl: seperate device init/cleanup from checkpoint device layer
>
>   tools/libxc/include/xenguest.h        |   13 +-
>   tools/libxc/xc_domain_restore.c       |    4 +-
>   tools/libxc/xc_domain_save.c          |    6 +-
>   tools/libxc/xc_nomigrate.c            |    3 +-
>   tools/libxc/xc_resume.c               |   22 +-
>   tools/libxc/xc_sr_common.h            |    2 +-
>   tools/libxc/xc_sr_restore.c           |    2 +-
>   tools/libxc/xc_sr_save.c              |    5 +-
>   tools/libxl/Makefile                  |    5 +-
>   tools/libxl/libxl.c                   |  119 +---
>   tools/libxl/libxl.h                   |   30 +-
>   tools/libxl/libxl_checkpoint_device.c |  282 ++++++++
>   tools/libxl/libxl_create.c            |   33 +-
>   tools/libxl/libxl_dom.c               | 1243 ---------------------------------
>   tools/libxl/libxl_dom_save.c          |  721 +++++++++++++++++++
>   tools/libxl/libxl_dom_suspend.c       |  503 +++++++++++++
>   tools/libxl/libxl_internal.h          |  246 ++++---
>   tools/libxl/libxl_netbuffer.c         |  117 ++--
>   tools/libxl/libxl_nonetbuffer.c       |   10 +-
>   tools/libxl/libxl_qmp.c               |   10 +
>   tools/libxl/libxl_remus.c             |  395 +++++++++++
>   tools/libxl/libxl_remus_device.c      |  327 ---------
>   tools/libxl/libxl_remus_disk_drbd.c   |   56 +-
>   tools/libxl/libxl_save_callout.c      |   43 +-
>   tools/libxl/libxl_save_helper.c       |    9 +-
>   tools/libxl/libxl_stream_write.c      |   14 +-
>   tools/libxl/libxl_types.idl           |   10 +-
>   tools/libxl/xl_cmdimpl.c              |   21 +-
>   tools/ocaml/libs/xl/xenlight_stubs.c  |    2 +-
>   29 files changed, 2321 insertions(+), 1932 deletions(-)
>   create mode 100644 tools/libxl/libxl_checkpoint_device.c
>   create mode 100644 tools/libxl/libxl_dom_save.c
>   create mode 100644 tools/libxl/libxl_dom_suspend.c
>   create mode 100644 tools/libxl/libxl_remus.c
>   delete mode 100644 tools/libxl/libxl_remus_device.c
>

-- 
Thanks,
Yang.


* Re: [PATCH v4 --for 4.6 COLOPre 04/25] tools/libxl: rename remus checkpoint callbacks
  2015-07-15 11:17   ` Ian Campbell
@ 2015-07-16  1:43     ` Yang Hongyang
  0 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16  1:43 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On 07/15/2015 07:17 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>> There are 2 remus checkpoint callbacks(save/restore), currently, they
>> both called libxl__remus_domain_checkpoint_callback in diffrent
>> file, so it is ok. But in the following patch, we will move all of the
>> remus callback code into a seperate file, the name should be diffrent.
>
> "separate" and "different" (twice).

OK, thanks!

>
>> So rename them to:
>>    libxl__remus_domain_{save/restore}_checkpoint_callback
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>
> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
>
>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> ---
>>   tools/libxl/libxl_create.c | 4 ++--
>>   tools/libxl/libxl_dom.c    | 4 ++--
>>   2 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
>> index 5b4d333..a32e3df 100644
>> --- a/tools/libxl/libxl_create.c
>> +++ b/tools/libxl/libxl_create.c
>> @@ -677,7 +677,7 @@ static int store_libxl_entry(libxl__gc *gc, uint32_t domid,
>>   static void remus_checkpoint_stream_done(
>>       libxl__egc *egc, libxl__stream_read_state *srs, int rc);
>>
>> -static void libxl__remus_domain_checkpoint_callback(void *data)
>> +static void libxl__remus_domain_restore_checkpoint_callback(void *data)
>>   {
>>       libxl__save_helper_state *shs = data;
>>       libxl__domain_create_state *dcs = shs->caller_state;
>> @@ -989,7 +989,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
>>       }
>>
>>       /* Restore */
>> -    callbacks->checkpoint = libxl__remus_domain_checkpoint_callback;
>> +    callbacks->checkpoint = libxl__remus_domain_restore_checkpoint_callback;
>>
>>       rc = libxl__build_pre(gc, domid, d_config, state);
>>       if (rc)
>> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
>> index 0788309..9c61fa7 100644
>> --- a/tools/libxl/libxl_dom.c
>> +++ b/tools/libxl/libxl_dom.c
>> @@ -1586,7 +1586,7 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
>>                                     const struct timeval *requested_abs,
>>                                     int rc);
>>
>> -static void libxl__remus_domain_checkpoint_callback(void *data)
>> +static void libxl__remus_domain_save_checkpoint_callback(void *data)
>>   {
>>       libxl__save_helper_state *shs = data;
>>       libxl__domain_suspend_state *dss = shs->caller_state;
>> @@ -1749,7 +1749,7 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
>>       if (r_info != NULL) {
>>           callbacks->suspend = libxl__remus_domain_suspend_callback;
>>           callbacks->postcopy = libxl__remus_domain_resume_callback;
>> -        callbacks->checkpoint = libxl__remus_domain_checkpoint_callback;
>> +        callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
>>           dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
>>       } else
>>           callbacks->suspend = libxl__domain_suspend_callback;
>
>
> .
>

-- 
Thanks,
Yang.


* Re: [PATCH v4 --for 4.6 COLOPre 06/25] libxl/remus: introduce libxl__remus_teardown
  2015-07-15 11:59   ` Ian Campbell
@ 2015-07-16  1:43     ` Yang Hongyang
  0 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16  1:43 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson



On 07/15/2015 07:59 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>> introduce libxl__remus_teardown to teardown Remus devices.
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>
> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
>
> If you need to respin then you might consider inverting the if remus
> check in domain_suspend_done and calling this new function if true, e.g.
>
>      if (dss->remus) {
> 	libxl__remus_teardown(...)
> 	return;
>      }
>
>      dss->callback(egc, dss, rc);
>
> I think the control flow would feel more natural then.

will do, thanks!

>
>
> .
>

-- 
Thanks,
Yang.


* Re: [PATCH v4 --for 4.6 COLOPre 16/25] tools/libxl: Update libxl_domain_unpause() to support qemu-xen
  2015-07-15 12:50   ` Ian Campbell
@ 2015-07-16  3:49     ` Yang Hongyang
  2015-07-16 10:39       ` Ian Campbell
  0 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16  3:49 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson



On 07/15/2015 08:50 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>> Currently, libxl__domain_unpause() only supports
>> qemu-xen-traditional. Update it to support qemu-xen.
>> We use libxl__domain_resume_device_model to unpause guest dm.
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> ---
>>   tools/libxl/libxl.c | 15 +++++----------
>>   1 file changed, 5 insertions(+), 10 deletions(-)
>>
>> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
>> index 5b2d045..799aead 100644
>> --- a/tools/libxl/libxl.c
>> +++ b/tools/libxl/libxl.c
>> @@ -941,8 +941,6 @@ out:
>>   int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
>>   {
>>       GC_INIT(ctx);
>> -    char *path;
>> -    char *state;
>>       int ret, rc = 0;
>>
>>       libxl_domain_type type = libxl__domain_type(gc, domid);
>> @@ -952,14 +950,11 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
>>       }
>>
>>       if (type == LIBXL_DOMAIN_TYPE_HVM) {
>> -        uint32_t dm_domid = libxl_get_stubdom_id(ctx, domid);
>> -
>> -        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
>> -        state = libxl__xs_read(gc, XBT_NULL, path);
>> -        if (state != NULL && !strcmp(state, "paused")) {
>> -            libxl__qemu_traditional_cmd(gc, domid, "continue");
>> -            libxl__wait_for_device_model_deprecated(gc, domid, "running",
>> -                                         NULL, NULL, NULL);
>> +        rc = libxl__domain_resume_device_model(gc, domid);
>> +        if (rc < 0) {
>> +            LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "failed to unpause device model "
>> +                       "for domain %u:%d", domid, rc);
>
> Please use the preferred form of LOG(ERROR, "failed to..."), which
> should also hopefully allow you to avoid splitting the line in the
> middle of a string constant which is discouraged.
>
> If you can't use LOG() then please:
>              LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
>                         "failed to unpause device model for domain %u:%d",
>                          domid, rc);
>
> Not splitting string constants means you can grep for an error message.

Sorry, the commit message is wrong: it's libxl_domain_unpause, not
libxl__domain_unpause. LOG() can't be used here, so I will update the commit
message and use your latter suggestion, thank you!

>
> Ian.
>
> .
>

-- 
Thanks,
Yang.


* Re: [PATCH v4 --for 4.6 COLOPre 05/25] libxl/remus: introduce libxl__remus_setup
  2015-07-15 11:26   ` Ian Campbell
@ 2015-07-16  5:32     ` Yang Hongyang
  2015-07-16 10:40       ` Ian Campbell
  0 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16  5:32 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson



On 07/15/2015 07:26 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>> Refactoring Remus setup by introducing libxl__remus_setup API.
>> All Remus setup work are done in this function.
>>
>> Also remove the libxl__ prefix for static functions.
>
> There is a subtle behavioural change here, which is that if anything
> which is now done in _setup fails then the result is a call to
> dss->callback( ..,..,ERROR_FAIL) rather than _start returning
> AO_CREATE_FAIL(ERROR_FAIL).
>
> I think this is probably a reasonable and correct change, but I think it
> is worth mentioning in the commit log.

Yes, will update the commit log.

>
> That said, I also wonder if the actual check for netbuffer_enabled (the
> only such failure in practice) ought to be moved up such that it stays
> in _start along with the other similar checks, i.e. _start would do:
>
>      if (libxl_defbool_val(info->netbuf) && !libxl__netbuffer_enabled(gc)) {
>              LOG(ERROR, "Remus: No support for network buffering");
>              rc = ERROR_FAIL;
>              goto out;
>          }

This check is Remus-only. We want to reuse _start for COLO, so anything
that is specific to Remus should sit in libxl_remus.c.

>
> while _setup would do:
>
>      if (libxl_defbool_val(info->netbuf)) {
>          // MAYBE : assert(libxl__netbuffer_enabled(gc))
>          rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
>      }
>
> Ian.
>
> .
>

-- 
Thanks,
Yang.


* Re: [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests
  2015-07-15 12:26   ` Ian Campbell
@ 2015-07-16  5:57     ` Yang Hongyang
  2015-07-16 15:40       ` Ian Jackson
  0 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16  5:57 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson



On 07/15/2015 08:26 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>> From: Wen Congyang <wency@cn.fujitsu.com>
>>
>> 1. suspend
>> a. PVHVM and PV: we use the same way to suspend the guest (send the suspend
>>     request to the guest). If the guest doesn't support evtchn, the xenstore
>>     variant will be used, suspending the guest via XenBus control node.
>> b. pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to suspend
>>     the guest
>>
>> 2. Resume:
>> a. fast path
>>     In this case, we don't change the guest's state.
>>     PV: modify the return code to 1, and then call the domctl:
>>         XEN_DOMCTL_resumedomain
>>     PVHVM: same as PV
>>     HVM: do nothing in modify_returncode, and then call the domctl:
>>          XEN_DOMCTL_resumedomain
>> b. slow
>>     Used when the guest's state has been changed.
>>     PV: update start info, and reset all secondary CPU states. Then call the
>>     domctl: XEN_DOMCTL_resumedomain
>>     PVHVM and HVM can not be resumed.
>>
>> For PVHVM, in my test, only call the domctl: XEN_DOMCTL_resumedomain
>> can work. I am not sure if we should update start info and reset all
>> secondary CPU states.
>>
>> For pure HVM guest, in my test, only call the domctl:
>> XEN_DOMCTL_resumedomain can work.
>>
>> So we can call libxl__domain_resume(..., 1) if we don't change the guest
>> state, otherwise call libxl__domain_resume(..., 0).
>>
>> Under COLO, we will update the guest's state(modify memory, cpu's registers,
>> device status...). In this case, we cannot use the fast path to resume it.
>> Keep the return code 0, and use a slow path to resume the guest. While
>> resuming HVM using slow path is not supported currently, this patch is to
>> make the resume call not fail.
>
> I'm afraid that the addition of this paragraph has not really addressed
> my comment on v3:
>
>          I'm afraid I think the commit message for this patch (and the associated
>          doc comments) need revisiting almost from scratch, to clearly explain
>          what this patch is doing and why and what the constraints on the new
>          functionality will be.
>
>          At the moment it mostly talks in a confusing way about the old behaviour
>          and adds very specific assumptions to the new function which are not
>          made clear.
>
> It also appears that this has not been addressed:
>
>          Hrm, so it sounds here like the correctness of this new functionality
>          requires the caller to have not messed with the domain's state? What
>          sort of changes are to the guest state are we talking about here?

This is used on the secondary. At a checkpoint, we do:
1. suspend the guest
2. sync the guest state with primary  <== here the guest state has been changed
3. resume the guest
Step 2 changes the guest state, so when we then resume the guest we cannot
use the fast path. The slow path does not currently support resuming HVM
guests; this patch adds that support.
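
To make the choice between the two paths concrete, here is a minimal sketch.
It is not the libxl code: suspend_cancel_for() is a hypothetical helper, and
the only thing it encodes is the rule from the commit message, i.e. call
libxl__domain_resume(..., 1) when the guest state is untouched and
libxl__domain_resume(..., 0) otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustration only; suspend_cancel_for() is a hypothetical helper,
 * not part of libxl or libxc.
 *
 * Returns the value to pass as the last argument of
 * libxl__domain_resume(): 1 selects the fast path (guest state
 * untouched), 0 selects the slow path (guest state was rewritten).
 */
static int suspend_cancel_for(bool guest_state_changed)
{
    int flag = guest_state_changed ? 0 : 1;

    assert(flag == 0 || flag == 1);
    return flag;
}
```

On the secondary, step 2 always rewrites the guest state, so this would
always yield 0 there.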

Since the XEN_DOMCTL_resumedomain hypercall is a NOP for HVM, it occurs
to me that we could do this in a different way: modify
libxl__domain_resume so that, if the domain is HVM, we skip the
xc_domain_resume call. What do you think?

>
>          Isn't that a new requirement for this call? If so then it should be
>          documented somewhere, specifically what sorts of changes are and are not
>          allowed and the types of guests which are affected.
>
> The two usages of "in my test" in the commit message also do not inspire
> confidence that this change is understood to be correct, vs. happening
> to be something which works for you.
>
> Ian.
>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> ---
>>   tools/libxc/xc_resume.c | 22 ++++++++++++++++++----
>>   1 file changed, 18 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c
>> index e67bebd..bd82334 100644
>> --- a/tools/libxc/xc_resume.c
>> +++ b/tools/libxc/xc_resume.c
>> @@ -109,6 +109,23 @@ static int xc_domain_resume_cooperative(xc_interface *xch, uint32_t domid)
>>       return do_domctl(xch, &domctl);
>>   }
>>
>> +static int xc_domain_resume_hvm(xc_interface *xch, uint32_t domid)
>> +{
>> +    DECLARE_DOMCTL;
>> +
>> +    /*
>> +     * If it is PVHVM, the hypercall return code is 0, because this
>> +     * is not a fast path resume, we do not modify_returncode as in
>> +     * xc_domain_resume_cooperative.
>> +     * (resuming it in a new domain context)
>> +     *
>> +     * If it is a HVM, the hypercall is a NOP.
>> +     */
>> +    domctl.cmd = XEN_DOMCTL_resumedomain;
>> +    domctl.domain = domid;
>> +    return do_domctl(xch, &domctl);
>> +}
>> +
>>   static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
>>   {
>>       DECLARE_DOMCTL;
>> @@ -138,10 +155,7 @@ static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
>>        */
>>   #if defined(__i386__) || defined(__x86_64__)
>>       if ( info.hvm )
>> -    {
>> -        ERROR("Cannot resume uncooperative HVM guests");
>> -        return rc;
>> -    }
>> +        return xc_domain_resume_hvm(xch, domid);
>>
>>       if ( xc_domain_get_guest_width(xch, domid, &dinfo->guest_width) != 0 )
>>       {
>
>
> .
>

-- 
Thanks,
Yang.


* Re: [PATCH v4 --for 4.6 COLOPre 13/25] migration/save: pass checkpointed_stream from libxl to libxc
  2015-07-15 12:38   ` Ian Campbell
@ 2015-07-16  6:05     ` Yang Hongyang
  2015-07-16 10:47       ` Ian Campbell
  2015-07-16 16:13       ` Wei Liu
  0 siblings, 2 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16  6:05 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson



On 07/15/2015 08:38 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>> Pass checkpointed_stream from libxl to libxc.
>> It won't affect legacy migration because legacy migration
>> won't use this param.
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>>   tools/libxc/include/xenguest.h   |  9 ++++++---
>>   tools/libxc/xc_domain_save.c     |  6 ++++--
>>   tools/libxc/xc_nomigrate.c       |  3 ++-
>>   tools/libxc/xc_sr_common.h       |  2 +-
>>   tools/libxc/xc_sr_save.c         |  5 +++--
>>   tools/libxl/libxl.c              |  2 ++
>>   tools/libxl/libxl_dom_save.c     | 11 ++++++++---
>>   tools/libxl/libxl_internal.h     |  1 +
>>   tools/libxl/libxl_save_callout.c |  2 +-
>>   tools/libxl/libxl_save_helper.c  |  3 ++-
>>   10 files changed, 30 insertions(+), 14 deletions(-)
>>
>> diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
>> index e95af54..6e24b6c 100644
>> --- a/tools/libxc/include/xenguest.h
>> +++ b/tools/libxc/include/xenguest.h
>> @@ -30,7 +30,6 @@
>>   #define XCFLAGS_HVM       (1 << 2)
>>   #define XCFLAGS_STDVGA    (1 << 3)
>>   #define XCFLAGS_CHECKPOINT_COMPRESS    (1 << 4)
>> -#define XCFLAGS_CHECKPOINTED    (1 << 5)
>>
>>   #define X86_64_B_SIZE   64
>>   #define X86_32_B_SIZE   32
>> @@ -85,16 +84,20 @@ struct save_callbacks {
>>    * @parm xch a handle to an open hypervisor interface
>>    * @parm fd the file descriptor to save a domain to
>>    * @parm dom the id of the domain
>> + * @parm checkpointed_stream non-zero if the far end of the stream is using
>> + *       checkpointing
>
> Do (or will) specific non-zero values have any meaning to the libxc
> layer? i.e. does it have any knowledge of COLO vs. Remus as the libxl
> enum added in the last patch does?

Yes, the libxc side should be aware of the type of checkpointed_stream
(Remus or COLO).

I think it would be better to document the non-zero values here,
for example:
      * @parm checkpointed_stream non-zero if the far end of the stream is using
      *                           checkpointing
      *                           0 no checkpointed stream
      *                           1 Remus
      *                           2 COLO
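
As a sketch only, the same values could also be given names. The identifiers
below are hypothetical and do not exist in xenguest.h; they just mirror the
0/1/2 values listed above:

```c
#include <assert.h>

/* Hypothetical names for illustration; not existing libxc identifiers. */
enum checkpointed_stream_kind {
    CHECKPOINTED_STREAM_NONE  = 0, /* no checkpointed stream */
    CHECKPOINTED_STREAM_REMUS = 1, /* far end is using Remus */
    CHECKPOINTED_STREAM_COLO  = 2  /* far end is using COLO */
};

/* Non-zero means the far end of the stream is using checkpointing. */
static int stream_is_checkpointed(int checkpointed_stream)
{
    assert(checkpointed_stream >= CHECKPOINTED_STREAM_NONE);
    return checkpointed_stream != CHECKPOINTED_STREAM_NONE;
}
```

Code that only needs to know whether checkpointing is in use can test for
non-zero, while code that cares about Remus vs COLO compares against the
specific value.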

>
> If (as I hope) the answer is no then this should be a boolean and the
> libxl code which propagates the enum into this field ought to use some
> appropriate condition (!= ..._NONE most likely).
>
> Ian.
>
> .
>

-- 
Thanks,
Yang.


* Re: [PATCH v4 --for 4.6 COLOPre 20/25] tools/libx{l, c}: add back channel to libxc
  2015-07-15 13:21   ` Andrew Cooper
@ 2015-07-16  6:07     ` Yang Hongyang
  0 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16  6:07 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, guijianfeng, rshriram



On 07/15/2015 09:21 PM, Andrew Cooper wrote:
> On 15/07/15 08:45, Yang Hongyang wrote:
>> In COLO mode, both VMs are running, and are considered in sync if the
>> visible network traffic is identical.  After some time, they fall out of
>> sync.
>>
>> At this point, the two VMs have definitely diverged.  Lets call the
>> primary dirty bitmap set A, while the secondary dirty bitmap set B.
>>
>> Sets A and B are different.
>>
>> Under normal migration, the page data for set A will be sent from the
>> primary to the secondary.
>>
>> However, the set difference B - A (lets call this C) is out-of-date on
>> the secondary (with respect to the primary) and will not be sent by the
>> primary, as it was not memory dirtied by the primary.  The secondary
>> needs the page data for C to reconstruct an exact copy of the primary at
>> the checkpoint.
>>
>> The secondary cannot calculate C as it doesn't know A.  Instead, the
>> secondary must send B to the primary, at which point the primary
>> calculates the union of A and B (lets call this D) which is all the
>> pages dirtied by both the primary and the secondary, and sends all page
>> data covered by D.
>>
>> In the general case, D is a superset of both A and B.  Without the
>> backchannel dirty bitmap, a COLO checkpoint can't reconstruct a valid
>> copy of the primary.
>>
>> We transfer the dirty bitmap on libxc side, so we need to introduce back
>> channel to libxc.
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> commit message:
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> ---
>>   tools/libxc/include/xenguest.h   |  8 ++++----
>>   tools/libxc/xc_domain_restore.c  |  4 ++--
>>   tools/libxc/xc_domain_save.c     |  4 ++--
>>   tools/libxc/xc_sr_restore.c      |  2 +-
>>   tools/libxc/xc_sr_save.c         |  2 +-
>>   tools/libxl/libxl_save_callout.c | 39 ++++++++++++++++++++++++++-------------
>>   tools/libxl/libxl_save_helper.c  |  8 ++++++--
>>   7 files changed, 42 insertions(+), 25 deletions(-)
>
> You have not patched xc_nomigrate.c, which means this will break the ARM
> build.  (I fell into the same trap, requiring c/s f50fe3a5 as a fixup).

Thank you for pointing this out, I will look at that commit.

>
> Having said that, I plan to throw together some cleanup patches removing
> files like xc_domain_{save,restore}.c and dropping most of the
> parameters from the parameter list, as they are superfluous.
>
> I will try to get my cleanup done shortly, which should make this prereq
> series easier,

That would be great, thanks!

> although I am focusing on some hypervisor side fixes
> right at the moment.
>
> ~Andrew
> .
>

-- 
Thanks,
Yang.


* Re: [PATCH v4 --for 4.6 COLOPre 20/25] tools/libx{l, c}: add back channel to libxc
  2015-07-15 13:13   ` Ian Campbell
@ 2015-07-16  6:29     ` Yang Hongyang
  2015-07-16 11:01       ` Ian Campbell
  0 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16  6:29 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, eddie.dong, wency, andrew.cooper3, yunhong.jiang,
	ian.jackson, xen-devel, guijianfeng, rshriram

On 07/15/2015 09:13 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>> In COLO mode, both VMs are running, and are considered in sync if the
>> visible network traffic is identical.  After some time, they fall out of
>> sync.
>>
>> At this point, the two VMs have definitely diverged.  Lets call the
>> primary dirty bitmap set A, while the secondary dirty bitmap set B.
>>
>> Sets A and B are different.
>>
>> Under normal migration, the page data for set A will be sent from the
>> primary to the secondary.
>>
>> However, the set difference B - A (lets call this C) is out-of-date on
>> the secondary (with respect to the primary) and will not be sent by the
>> primary, as it was not memory dirtied by the primary.  The secondary
>> needs the page data for C to reconstruct an exact copy of the primary at
>> the checkpoint.
>>
>> The secondary cannot calculate C as it doesn't know A.  Instead, the
>> secondary must send B to the primary, at which point the primary
>> calculates the union of A and B (lets call this D) which is all the
>> pages dirtied by both the primary and the secondary, and sends all page
>> data covered by D.
>>
>> In the general case, D is a superset of both A and B.  Without the
>> backchannel dirty bitmap, a COLO checkpoint can't reconstruct a valid
>> copy of the primary.
>
> When Andy (who wrote this) said this via email I replied [0] including:
>
>          According to the paper there is no need to resend because the
>          secondary already has a non-dirty copy of any memory which is
>          dirty in B but not A.
>
> So it is not the case that a checkpoint _can't_ reconstruct a valid copy
> of the primary, clearly it is possible, but for some reason this
> implementation chooses to deviate from the paper and does things in a
> way where it indeed cannot reconstruct D but I've yet to see a
> description of _why_ the implementation produced here differs from the
> paper.
>
>> We transfer the dirty bitmap on libxc side, so we need to introduce back
>> channel to libxc.
>
> I'm sure you have good practical reasons why the implementation differs
> from the design and I would like to know what they are because the back
> channel is adding extra complexity to libxc and libxl so I want to know
> why it is justified, as I also said this in [1].
>
> Lastly Ian said in [2]:
>
>          To be clear, I have no problem if the design has changed since the
>          paper was written.  I just want:
>
>           * A clear high-level explanation of the actually-implemented
>             arrangements to exist somewhere
>
>           * The commit messages, or code, to refer to that explanation
>
> IMHO the addition of this extra commit message doesn't really meet at
> least the first requirement. Please point us to an up to date design
> document which describes COLO as actually implemented.

The original design of COLO is as follows:
the Secondary maintains an exact copy of the Primary's memory. At every
checkpoint, we receive the Primary's memory into that copy, then flush
that memory to the Secondary.

We changed the original design to the current one because of the
following concerns:
1. The original design needs extra memory on the Secondary host. When
    there are multiple backups on one host, the memory cost is high.
2. The memory cache code would be another 1k+, which would make the
    review more time consuming.

The best way to accomplish this is the COW which Andrew mentioned earlier,
but that should be a further improvement. We will certainly continue to
improve this when COW is ready to use.

Does the description above resolve your confusion? If so, I will add it to
the commit log.
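
For reference, the merge that the commit message describes (the primary
computing D as the union of A and B) amounts to a bitwise OR of the two
dirty bitmaps. The following is only a sketch under that assumption:
dirty_bitmap_union() and its one-bit-per-page array layout are hypothetical
and are not the libxc implementation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative only: merge the secondary's dirty bitmap (B) into the
 * primary's (A) so that the primary's bitmap becomes D = A | B.
 * Both bitmaps are arrays of nr_words unsigned longs, one bit per
 * guest page. The secondary's bitmap is left unmodified.
 */
static void dirty_bitmap_union(unsigned long *primary,
                               const unsigned long *secondary,
                               size_t nr_words)
{
    size_t i;

    assert(primary != NULL && secondary != NULL);
    for (i = 0; i < nr_words; i++)
        primary[i] |= secondary[i];
}
```

After the merge, every page dirtied on either side is marked, so the
primary sends the page data for all of D.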

>
> Ian.
>
> [0] http://lists.xen.org/archives/html/xen-devel/2015-07/msg00090.html
> [1] http://lists.xen.org/archives/html/xen-devel/2015-07/msg00148.html
> [2] http://lists.xen.org/archives/html/xen-devel/2015-07/msg00101.html
>
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> commit message:
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> ---
>>   tools/libxc/include/xenguest.h   |  8 ++++----
>>   tools/libxc/xc_domain_restore.c  |  4 ++--
>>   tools/libxc/xc_domain_save.c     |  4 ++--
>>   tools/libxc/xc_sr_restore.c      |  2 +-
>>   tools/libxc/xc_sr_save.c         |  2 +-
>>   tools/libxl/libxl_save_callout.c | 39 ++++++++++++++++++++++++++-------------
>>   tools/libxl/libxl_save_helper.c  |  8 ++++++--
>>   7 files changed, 42 insertions(+), 25 deletions(-)
>>
>> diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
>> index 6e24b6c..4056955 100644
>> --- a/tools/libxc/include/xenguest.h
>> +++ b/tools/libxc/include/xenguest.h
>> @@ -91,13 +91,13 @@ struct save_callbacks {
>>   int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
>>                      uint32_t max_factor, uint32_t flags /* XCFLAGS_xxx */,
>>                      struct save_callbacks* callbacks, int hvm,
>> -                   int checkpointed_stream);
>> +                   int checkpointed_stream, int back_fd);
>>
>>   /* Domain Save v2 */
>>   int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
>>                       uint32_t max_factor, uint32_t flags,
>>                       struct save_callbacks* callbacks, int hvm,
>> -                    int checkpointed_stream);
>> +                    int checkpointed_stream, int back_fd);
>>
>>   /* callbacks provided by xc_domain_restore */
>>   struct restore_callbacks {
>> @@ -140,7 +140,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>>                         unsigned long *console_mfn, domid_t console_domid,
>>                         unsigned int hvm, unsigned int pae, int superpages,
>>                         int checkpointed_stream,
>> -                      struct restore_callbacks *callbacks);
>> +                      struct restore_callbacks *callbacks, int back_fd);
>>
>>   /* Domain Restore v2 */
>>   int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
>> @@ -149,7 +149,7 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
>>                          unsigned long *console_mfn, domid_t console_domid,
>>                          unsigned int hvm, unsigned int pae, int superpages,
>>                          int checkpointed_stream,
>> -                       struct restore_callbacks *callbacks);
>> +                       struct restore_callbacks *callbacks, int back_fd);
>>   /**
>>    * xc_domain_restore writes a file to disk that contains the device
>>    * model saved state.
>> diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
>> index 3cd3483..63d1e6b 100644
>> --- a/tools/libxc/xc_domain_restore.c
>> +++ b/tools/libxc/xc_domain_restore.c
>> @@ -1515,7 +1515,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>>                         unsigned long *console_mfn, domid_t console_domid,
>>                         unsigned int hvm, unsigned int pae, int superpages,
>>                         int checkpointed_stream,
>> -                      struct restore_callbacks *callbacks)
>> +                      struct restore_callbacks *callbacks, int back_fd)
>>   {
>>       DECLARE_DOMCTL;
>>       xc_dominfo_t info;
>> @@ -1578,7 +1578,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>>           return xc_domain_restore2(
>>               xch, io_fd, dom, store_evtchn, store_mfn,
>>               store_domid, console_evtchn, console_mfn, console_domid,
>> -            hvm,  pae,  superpages, checkpointed_stream, callbacks);
>> +            hvm,  pae,  superpages, checkpointed_stream, callbacks, back_fd);
>>       }
>>
>>       DPRINTF("%s: starting restore of new domid %u", __func__, dom);
>> diff --git a/tools/libxc/xc_domain_save.c b/tools/libxc/xc_domain_save.c
>> index 0da3cca..b111384 100644
>> --- a/tools/libxc/xc_domain_save.c
>> +++ b/tools/libxc/xc_domain_save.c
>> @@ -803,7 +803,7 @@ static int save_tsc_info(xc_interface *xch, uint32_t dom, int io_fd)
>>   int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
>>                      uint32_t max_factor, uint32_t flags,
>>                      struct save_callbacks* callbacks, int hvm,
>> -                   int checkpointed_stream)
>> +                   int checkpointed_stream, int back_fd)
>>   {
>>       xc_dominfo_t info;
>>       DECLARE_DOMCTL;
>> @@ -899,7 +899,7 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
>>       {
>>           return xc_domain_save2(xch, io_fd, dom, max_iters,
>>                                  max_factor, flags, callbacks, hvm,
>> -                               checkpointed_stream);
>> +                               checkpointed_stream, back_fd);
>>       }
>>
>>       DPRINTF("%s: starting save of domid %u", __func__, dom);
>> diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
>> index bf1ee15..504463e 100644
>> --- a/tools/libxc/xc_sr_restore.c
>> +++ b/tools/libxc/xc_sr_restore.c
>> @@ -720,7 +720,7 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
>>                          unsigned long *console_gfn, domid_t console_domid,
>>                          unsigned int hvm, unsigned int pae, int superpages,
>>                          int checkpointed_stream,
>> -                       struct restore_callbacks *callbacks)
>> +                       struct restore_callbacks *callbacks, int back_fd)
>>   {
>>       struct xc_sr_context ctx =
>>           {
>> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
>> index 6102b66..d12e5b1 100644
>> --- a/tools/libxc/xc_sr_save.c
>> +++ b/tools/libxc/xc_sr_save.c
>> @@ -821,7 +821,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
>>   int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom,
>>                       uint32_t max_iters, uint32_t max_factor, uint32_t flags,
>>                       struct save_callbacks* callbacks, int hvm,
>> -                    int checkpointed_stream)
>> +                    int checkpointed_stream, int back_fd)
>>   {
>>       xen_pfn_t nr_pfns;
>>       struct xc_sr_context ctx =
>> diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
>> index f393abc..f8c6cf0 100644
>> --- a/tools/libxl/libxl_save_callout.c
>> +++ b/tools/libxl/libxl_save_callout.c
>> @@ -27,7 +27,7 @@
>>    */
>>   static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
>>                          const char *mode_arg,
>> -                       int stream_fd,
>> +                       int stream_fd, int back_fd,
>>                          const int *preserve_fds, int num_preserve_fds,
>>                          const unsigned long *argnums, int num_argnums);
>>
>> @@ -50,6 +50,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
>>       /* Convenience aliases */
>>       const uint32_t domid = dcs->guest_domid;
>>       const int restore_fd = dcs->libxc_fd;
>> +    const int send_fd = dcs->send_fd;
>>       libxl__domain_build_state *const state = &dcs->build_state;
>>
>>       unsigned cbflags =
>> @@ -72,7 +73,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
>>       shs->need_results = 1;
>>       shs->toolstack_data_file = 0;
>>
>> -    run_helper(egc, shs, "--restore-domain", restore_fd, 0, 0,
>> +    run_helper(egc, shs, "--restore-domain", restore_fd, send_fd, 0, 0,
>>                  argnums, ARRAY_SIZE(argnums));
>>   }
>>
>> @@ -96,7 +97,7 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_save_state *dss,
>>       shs->caller_state = dss;
>>       shs->need_results = 0;
>>
>> -    run_helper(egc, shs, "--save-domain", dss->fd,
>> +    run_helper(egc, shs, "--save-domain", dss->fd, dss->recv_fd,
>>                  NULL, 0,
>>                  argnums, ARRAY_SIZE(argnums));
>>       return;
>> @@ -119,14 +120,29 @@ void libxl__save_helper_init(libxl__save_helper_state *shs)
>>   }
>>
>>   /*----- helper execution -----*/
>> +static int dup_fd_helper(libxl__gc *gc, int fd, const char *what)
>> +{
>> +    int dup_fd = fd;
>> +
>> +    if (fd <= 2) {
>> +        dup_fd = dup(fd);
>> +        if (dup_fd < 0) {
>> +            LOGE(ERROR,"dup %s", what);
>> +            exit(-1);
>> +        }
>> +    }
>> +    libxl_fd_set_cloexec(CTX, dup_fd, 0);
>> +
>> +    return dup_fd;
>> +}
>>
>>   static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
>> -                       const char *mode_arg, int stream_fd,
>> +                       const char *mode_arg, int stream_fd, int back_fd,
>>                          const int *preserve_fds, int num_preserve_fds,
>>                          const unsigned long *argnums, int num_argnums)
>>   {
>>       STATE_AO_GC(shs->ao);
>> -    const char *args[4 + num_argnums];
>> +    const char *args[5 + num_argnums];
>>       const char **arg = args;
>>       int i, rc;
>>
>> @@ -154,6 +170,7 @@ static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
>>       *arg++ = getenv("LIBXL_SAVE_HELPER") ?: LIBEXEC_BIN "/" "libxl-save-helper";
>>       *arg++ = mode_arg;
>>       const char **stream_fd_arg = arg++;
>> +    const char **back_fd_arg = arg++;
>>       for (i=0; i<num_argnums; i++)
>>           *arg++ = GCSPRINTF("%lu", argnums[i]);
>>       *arg++ = 0;
>> @@ -178,16 +195,12 @@ static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
>>
>>       pid_t pid = libxl__ev_child_fork(gc, &shs->child, helper_exited);
>>       if (!pid) {
>> -        if (stream_fd <= 2) {
>> -            stream_fd = dup(stream_fd);
>> -            if (stream_fd < 0) {
>> -                LOGE(ERROR,"dup migration stream fd");
>> -                exit(-1);
>> -            }
>> -        }
>> -        libxl_fd_set_cloexec(CTX, stream_fd, 0);
>> +        stream_fd = dup_fd_helper(gc, stream_fd, "migration stream fd");
>>           *stream_fd_arg = GCSPRINTF("%d", stream_fd);
>>
>> +        back_fd = dup_fd_helper(gc, back_fd, "migration back channel fd");
>> +        *back_fd_arg = GCSPRINTF("%d", back_fd);
>> +
>>           for (i=0; i<num_preserve_fds; i++)
>>               if (preserve_fds[i] >= 0) {
>>                   assert(preserve_fds[i] > 2);
>> diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
>> index 4c9d34c..9de5694 100644
>> --- a/tools/libxl/libxl_save_helper.c
>> +++ b/tools/libxl/libxl_save_helper.c
>> @@ -235,6 +235,7 @@ static struct restore_callbacks helper_restore_callbacks;
>>   int main(int argc, char **argv)
>>   {
>>       int r;
>> +    int back_fd;
>>
>>   #define NEXTARG (++argv, assert(*argv), *argv)
>>
>> @@ -244,6 +245,7 @@ int main(int argc, char **argv)
>>       if (!strcmp(mode,"--save-domain")) {
>>
>>           io_fd =                    atoi(NEXTARG);
>> +        back_fd =                  atoi(NEXTARG);
>>           uint32_t dom =             strtoul(NEXTARG,0,10);
>>           uint32_t max_iters =       strtoul(NEXTARG,0,10);
>>           uint32_t max_factor =      strtoul(NEXTARG,0,10);
>> @@ -259,12 +261,14 @@ int main(int argc, char **argv)
>>           setup_signals(save_signal_handler);
>>
>>           r = xc_domain_save2(xch, io_fd, dom, max_iters, max_factor, flags,
>> -                           &helper_save_callbacks, hvm, checkpointed_stream);
>> +                            &helper_save_callbacks, hvm, checkpointed_stream,
>> +                            back_fd);
>>           complete(r);
>>
>>       } else if (!strcmp(mode,"--restore-domain")) {
>>
>>           io_fd =                    atoi(NEXTARG);
>> +        back_fd =                  atoi(NEXTARG);
>>           uint32_t dom =             strtoul(NEXTARG,0,10);
>>           unsigned store_evtchn =    strtoul(NEXTARG,0,10);
>>           domid_t store_domid =      strtoul(NEXTARG,0,10);
>> @@ -289,7 +293,7 @@ int main(int argc, char **argv)
>>                                 store_domid, console_evtchn, &console_mfn,
>>                                 console_domid, hvm, pae, superpages,
>>                                 checkpointed,
>> -                              &helper_restore_callbacks);
>> +                              &helper_restore_callbacks, back_fd);
>>           helper_stub_restore_results(store_mfn,console_mfn,0);
>>           complete(r);
>>
>
>
> .
>

-- 
Thanks,
Yang.


* Re: [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device
  2015-07-15 13:32   ` Ian Campbell
  2015-07-15 13:38     ` Yang Hongyang
@ 2015-07-16  9:23     ` Yang Hongyang
  2015-07-16  9:31       ` Ian Campbell
  1 sibling, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16  9:23 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson



On 07/15/2015 09:32 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>>   tools/libxl/libxl_types.idl           |   4 +-
>
>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>> index e8d3647..1d676ef 100644
>> --- a/tools/libxl/libxl_types.idl
>> +++ b/tools/libxl/libxl_types.idl
>> @@ -61,8 +61,8 @@ libxl_error = Enumeration("error", [
>>       (-15, "LOCK_FAIL"),
>>       (-16, "JSON_CONFIG_EMPTY"),
>>       (-17, "DEVICE_EXISTS"),
>> -    (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
>> -    (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
>> +    (-18, "CHECKPOINT_DEVOPS_DOES_NOT_MATCH"),
>> +    (-19, "CHECKPOINT_DEVICE_NOT_SUPPORTED"),
>>       (-20, "VNUMA_CONFIG_INVALID"),
>>       (-21, "DOMAIN_NOTFOUND"),
>>       (-22, "ABORTED"),
>
> This is an API change, which I think we discussed before.
>
> In <558BC6EE.60801@cn.fujitsu.com> you said you would add an extra patch
> to deal with that, and I think that needs to come before this automatic
> renaming so that there is no bisect hazard.

It seems that putting it either before or after will break bisection...
Only merging it into this one patch makes sense...

> I don't see any such patch
> even after this point though (from grepping your colo-v8 branch).
>
> Ian.
>
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device
  2015-07-15 13:34     ` Yang Hongyang
@ 2015-07-16  9:26       ` Andrew Cooper
  2015-07-16  9:29         ` Yang Hongyang
  0 siblings, 1 reply; 101+ messages in thread
From: Andrew Cooper @ 2015-07-16  9:26 UTC (permalink / raw)
  To: Yang Hongyang, Ian Campbell
  Cc: wei.liu2, wency, guijianfeng, yunhong.jiang, eddie.dong,
	xen-devel, rshriram, ian.jackson

On 15/07/15 14:34, Yang Hongyang wrote:
>
>
> On 07/15/2015 09:15 PM, Ian Campbell wrote:
>> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>>> This patch is auto generated by the following commands:
>>>   1. git mv tools/libxl/libxl_remus_device.c
>>> tools/libxl/libxl_checkpoint_device.c
>>
>> This patch does not appear to have been formatted with git format-patch
>> -M as requested last time around.
>
> Sorry I missed this :(
> will do in the next version. btw, I have a dump question...how to
> specify -M
> for only this patch while it is in a series?

Just format the entire series using -M.  In most cases it will be a
no-op and there will be no change in the generated patch.

~Andrew

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device
  2015-07-16  9:26       ` Andrew Cooper
@ 2015-07-16  9:29         ` Yang Hongyang
  0 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16  9:29 UTC (permalink / raw)
  To: Andrew Cooper, Ian Campbell
  Cc: wei.liu2, wency, guijianfeng, yunhong.jiang, eddie.dong,
	xen-devel, rshriram, ian.jackson



On 07/16/2015 05:26 PM, Andrew Cooper wrote:
> On 15/07/15 14:34, Yang Hongyang wrote:
>>
>>
>> On 07/15/2015 09:15 PM, Ian Campbell wrote:
>>> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>>>> This patch is auto generated by the following commands:
>>>>    1. git mv tools/libxl/libxl_remus_device.c
>>>> tools/libxl/libxl_checkpoint_device.c
>>>
>>> This patch does not appear to have been formatted with git format-patch
>>> -M as requested last time around.
>>
>> Sorry I missed this :(
>> will do in the next version. btw, I have a dump question...how to
>> specify -M
>> for only this patch while it is in a series?
>
> Just format the entire series using -M.  In most cases it will be a
> no-op and there will be no change in the generated patch.

Thanks for your help!

>
> ~Andrew
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device
  2015-07-16  9:23     ` Yang Hongyang
@ 2015-07-16  9:31       ` Ian Campbell
  2015-07-16  9:36         ` Yang Hongyang
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-16  9:31 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson

On Thu, 2015-07-16 at 17:23 +0800, Yang Hongyang wrote:
> 
> On 07/15/2015 09:32 PM, Ian Campbell wrote:
> > On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> >>   tools/libxl/libxl_types.idl           |   4 +-
> >
> >> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> >> index e8d3647..1d676ef 100644
> >> --- a/tools/libxl/libxl_types.idl
> >> +++ b/tools/libxl/libxl_types.idl
> >> @@ -61,8 +61,8 @@ libxl_error = Enumeration("error", [
> >>       (-15, "LOCK_FAIL"),
> >>       (-16, "JSON_CONFIG_EMPTY"),
> >>       (-17, "DEVICE_EXISTS"),
> >> -    (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
> >> -    (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
> >> +    (-18, "CHECKPOINT_DEVOPS_DOES_NOT_MATCH"),
> >> +    (-19, "CHECKPOINT_DEVICE_NOT_SUPPORTED"),
> >>       (-20, "VNUMA_CONFIG_INVALID"),
> >>       (-21, "DOMAIN_NOTFOUND"),
> >>       (-22, "ABORTED"),
> >
> > This is an API change, which I think we discussed before.
> >
> > In <558BC6EE.60801@cn.fujitsu.com> you said you would add an extra patch
> > to deal with that, and I think that needs to come before this automatic
> > renaming so that there is no bisect hazard.
> 
> Seems either before or after will break the bisection...Only merge in one
> patch will makes sense...

If you break out this specific renaming into a precursor patch then you
can add the compat stuff in at that time. Please don't combine all that
into this one patch.

> I don't see any such patch
> > even after this point though (from grepping your colo-v8 branch).
> >
> > Ian.
> >
> > .
> >
> 

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device
  2015-07-16  9:31       ` Ian Campbell
@ 2015-07-16  9:36         ` Yang Hongyang
  2015-07-16 10:14           ` Ian Campbell
  0 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16  9:36 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson



On 07/16/2015 05:31 PM, Ian Campbell wrote:
> On Thu, 2015-07-16 at 17:23 +0800, Yang Hongyang wrote:
>>
>> On 07/15/2015 09:32 PM, Ian Campbell wrote:
>>> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>>>>    tools/libxl/libxl_types.idl           |   4 +-
>>>
>>>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>>>> index e8d3647..1d676ef 100644
>>>> --- a/tools/libxl/libxl_types.idl
>>>> +++ b/tools/libxl/libxl_types.idl
>>>> @@ -61,8 +61,8 @@ libxl_error = Enumeration("error", [
>>>>        (-15, "LOCK_FAIL"),
>>>>        (-16, "JSON_CONFIG_EMPTY"),
>>>>        (-17, "DEVICE_EXISTS"),
>>>> -    (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
>>>> -    (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
>>>> +    (-18, "CHECKPOINT_DEVOPS_DOES_NOT_MATCH"),
>>>> +    (-19, "CHECKPOINT_DEVICE_NOT_SUPPORTED"),
>>>>        (-20, "VNUMA_CONFIG_INVALID"),
>>>>        (-21, "DOMAIN_NOTFOUND"),
>>>>        (-22, "ABORTED"),
>>>
>>> This is an API change, which I think we discussed before.
>>>
>>> In <558BC6EE.60801@cn.fujitsu.com> you said you would add an extra patch
>>> to deal with that, and I think that needs to come before this automatic
>>> renaming so that there is no bisect hazard.
>>
>> Seems either before or after will break the bisection...Only merge in one
>> patch will makes sense...
>
> If you break out this specific renaming into a precursor patch then you
> can add the compat stuff in at that time. Please don't combine all that
> into this one patch.

The fix is trivial though, can I just squash them and also mention this
backward compatibility fix in the commit log?

commit cdfc734337b008a811c1896f9248256474906c82
Author: Yang Hongyang <yanghy@cn.fujitsu.com>
Date:   Thu Jul 16 17:23:25 2015 +0800

     tools/libxl: fix backward compatibility after the automatic renaming

     The error codes ERROR_REMUS_XXX were introduced in Xen 4.5, and
     changed to ERROR_CHECKPOINT_XXX by the previous renaming.
     This patch fixes the backward compatibility.

     Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index c492d20..cb3d14f 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -835,6 +835,19 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
   */
  #define LIBXL_HAVE_SRM_V1 1

+/* Remus stuff */
+/*
+ * The ERROR_REMUS_XXX error codes were introduced in Xen 4.5, and in
+ * Xen 4.6 they are renamed to ERROR_CHECKPOINT_XXX.
+ */
+#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION >= 0x040500 \
+                               && LIBXL_API_VERSION < 0x040600
+#define ERROR_REMUS_DEVOPS_DOES_NOT_MATCH \
+        ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH
+#define ERROR_REMUS_DEVICE_NOT_SUPPORTED \
+        ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED
+#endif
+
  typedef char **libxl_string_list;
  void libxl_string_list_dispose(libxl_string_list *sl);
  int libxl_string_list_length(const libxl_string_list *sl);
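
To illustrate what the hunk above is meant to buy an existing caller, here
is a minimal, self-contained sketch. It does not include libxl.h: the enum
is a stand-in that reuses the values from the libxl_types.idl hunk, the
LIBXL_API_VERSION definition stands for an application that asks for the
4.5 API, and is_checkpoint_device_error() is a hypothetical helper that
exists only for this example.

```c
#include <assert.h>

/* Assumption: the application requests the Xen 4.5 API before including libxl.h. */
#define LIBXL_API_VERSION 0x040500

/* Stand-in for the generated error enum after the rename (values from the IDL hunk). */
enum {
    ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH = -18,
    ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED  = -19,
};

/* Compat aliases, same condition as in the proposed libxl.h hunk. */
#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION >= 0x040500 \
                               && LIBXL_API_VERSION < 0x040600
#define ERROR_REMUS_DEVOPS_DOES_NOT_MATCH \
        ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH
#define ERROR_REMUS_DEVICE_NOT_SUPPORTED \
        ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED
#endif

/*
 * Hypothetical caller code written against the 4.5 names. It keeps
 * compiling because the old names now expand to the new ones.
 */
int is_checkpoint_device_error(int rc)
{
    switch (rc) {
    case ERROR_REMUS_DEVOPS_DOES_NOT_MATCH:
    case ERROR_REMUS_DEVICE_NOT_SUPPORTED:
        return 1;
    default:
        return 0;
    }
}
```

A caller that requests 0x040600 or later would not get the aliases and
would have to use the ERROR_CHECKPOINT_XXX names directly.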


>
>> I don't see any such patch
>>> even after this point though (from grepping your colo-v8 branch).
>>>
>>> Ian.
>>>
>>> .
>>>
>>
>
>
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply related	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device
  2015-07-16  9:36         ` Yang Hongyang
@ 2015-07-16 10:14           ` Ian Campbell
  2015-07-16 10:22             ` Yang Hongyang
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-16 10:14 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson

On Thu, 2015-07-16 at 17:36 +0800, Yang Hongyang wrote:
> 
> On 07/16/2015 05:31 PM, Ian Campbell wrote:
> > On Thu, 2015-07-16 at 17:23 +0800, Yang Hongyang wrote:
> >>
> >> On 07/15/2015 09:32 PM, Ian Campbell wrote:
> >>> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> >>>>    tools/libxl/libxl_types.idl           |   4 +-
> >>>
> >>>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> >>>> index e8d3647..1d676ef 100644
> >>>> --- a/tools/libxl/libxl_types.idl
> >>>> +++ b/tools/libxl/libxl_types.idl
> >>>> @@ -61,8 +61,8 @@ libxl_error = Enumeration("error", [
> >>>>        (-15, "LOCK_FAIL"),
> >>>>        (-16, "JSON_CONFIG_EMPTY"),
> >>>>        (-17, "DEVICE_EXISTS"),
> >>>> -    (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
> >>>> -    (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
> >>>> +    (-18, "CHECKPOINT_DEVOPS_DOES_NOT_MATCH"),
> >>>> +    (-19, "CHECKPOINT_DEVICE_NOT_SUPPORTED"),
> >>>>        (-20, "VNUMA_CONFIG_INVALID"),
> >>>>        (-21, "DOMAIN_NOTFOUND"),
> >>>>        (-22, "ABORTED"),
> >>>
> >>> This is an API change, which I think we discussed before.
> >>>
> >>> In <558BC6EE.60801@cn.fujitsu.com> you said you would add an extra patch
> >>> to deal with that, and I think that needs to come before this automatic
> >>> renaming so that there is no bisect hazard.
> >>
> >> Seems either before or after will break the bisection...Only merge in one
> >> patch will makes sense...
> >
> > If you break out this specific renaming into a precursor patch then you
> > can add the compat stuff in at that time. Please don't combine all that
> > into this one patch.
> 
> The fix is trival though, can I just quash them and also mention this
> backword compatibility fix in the commit log?

At the moment this particular patch is entirely mechanical (a long list
of sed expressions); this would make it no longer so.

Ian.

> 
> commit cdfc734337b008a811c1896f9248256474906c82
> Author: Yang Hongyang <yanghy@cn.fujitsu.com>
> Date:   Thu Jul 16 17:23:25 2015 +0800
> 
>      tools/libxl: fix backword compatibility after the automatic renaming
> 
>      The error code ERROR_REMUS_XXX was introduced in Xen 4.5, and
>      changed to ERROR_CHECKPOINT_XXX after previous renaming.
>      The patch fix the backword compatibility.
> 
>      Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> 
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index c492d20..cb3d14f 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -835,6 +835,19 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, 
> libxl_mac *src);
>    */
>   #define LIBXL_HAVE_SRM_V1 1
> 
> +/* Remus stuff */
> +/*
> + * ERROR_REMUS_XXX error code only exists from Xen 4.5, and in Xen 4.6
> + * it is changed to ERROR_CHECKPOINT_XXX
> + */
> +#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION >= 0x040500 \
> +                               && LIBXL_API_VERSION < 0x040600
> +#define ERROR_REMUS_DEVOPS_DOES_NOT_MATCH \
> +        ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH
> +#define ERROR_REMUS_DEVICE_NOT_SUPPORTED \
> +        ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED
> +#endif
> +
>   typedef char **libxl_string_list;
>   void libxl_string_list_dispose(libxl_string_list *sl);
>   int libxl_string_list_length(const libxl_string_list *sl);
> 
> 
> >
> >> I don't see any such patch
> >>> even after this point though (from grepping your colo-v8 branch).
> >>>
> >>> Ian.
> >>>
> >>> .
> >>>
> >>
> >
> >
> > .
> >
> 

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device
  2015-07-16 10:14           ` Ian Campbell
@ 2015-07-16 10:22             ` Yang Hongyang
  0 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16 10:22 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson



On 07/16/2015 06:14 PM, Ian Campbell wrote:
> On Thu, 2015-07-16 at 17:36 +0800, Yang Hongyang wrote:
>>
>> On 07/16/2015 05:31 PM, Ian Campbell wrote:
>>> On Thu, 2015-07-16 at 17:23 +0800, Yang Hongyang wrote:
>>>>
>>>> On 07/15/2015 09:32 PM, Ian Campbell wrote:
>>>>> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>>>>>>     tools/libxl/libxl_types.idl           |   4 +-
>>>>>
>>>>>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>>>>>> index e8d3647..1d676ef 100644
>>>>>> --- a/tools/libxl/libxl_types.idl
>>>>>> +++ b/tools/libxl/libxl_types.idl
>>>>>> @@ -61,8 +61,8 @@ libxl_error = Enumeration("error", [
>>>>>>         (-15, "LOCK_FAIL"),
>>>>>>         (-16, "JSON_CONFIG_EMPTY"),
>>>>>>         (-17, "DEVICE_EXISTS"),
>>>>>> -    (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
>>>>>> -    (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
>>>>>> +    (-18, "CHECKPOINT_DEVOPS_DOES_NOT_MATCH"),
>>>>>> +    (-19, "CHECKPOINT_DEVICE_NOT_SUPPORTED"),
>>>>>>         (-20, "VNUMA_CONFIG_INVALID"),
>>>>>>         (-21, "DOMAIN_NOTFOUND"),
>>>>>>         (-22, "ABORTED"),
>>>>>
>>>>> This is an API change, which I think we discussed before.
>>>>>
>>>>> In <558BC6EE.60801@cn.fujitsu.com> you said you would add an extra patch
>>>>> to deal with that, and I think that needs to come before this automatic
>>>>> renaming so that there is no bisect hazard.
>>>>
>>>> Seems either before or after will break the bisection...Only merge in one
>>>> patch will makes sense...
>>>
>>> If you break out this specific renaming into a precursor patch then you
>>> can add the compat stuff in at that time. Please don't combine all that
>>> into this one patch.
>>
>> The fix is trival though, can I just quash them and also mention this
>> backword compatibility fix in the commit log?
>
> At the moment this particular patch is entirely mechanical (a long list
> of sed expression), this would make it no longer so.

Ok, I will break out this specific renaming into a precursor patch...

>
> Ian.
>
>>
>> commit cdfc734337b008a811c1896f9248256474906c82
>> Author: Yang Hongyang <yanghy@cn.fujitsu.com>
>> Date:   Thu Jul 16 17:23:25 2015 +0800
>>
>>       tools/libxl: fix backword compatibility after the automatic renaming
>>
>>       The error code ERROR_REMUS_XXX was introduced in Xen 4.5, and
>>       changed to ERROR_CHECKPOINT_XXX after previous renaming.
>>       The patch fix the backword compatibility.
>>
>>       Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>>
>> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
>> index c492d20..cb3d14f 100644
>> --- a/tools/libxl/libxl.h
>> +++ b/tools/libxl/libxl.h
>> @@ -835,6 +835,19 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst,
>> libxl_mac *src);
>>     */
>>    #define LIBXL_HAVE_SRM_V1 1
>>
>> +/* Remus stuff */
>> +/*
>> + * ERROR_REMUS_XXX error code only exists from Xen 4.5, and in Xen 4.6
>> + * it is changed to ERROR_CHECKPOINT_XXX
>> + */
>> +#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION >= 0x040500 \
>> +                               && LIBXL_API_VERSION < 0x040600
>> +#define ERROR_REMUS_DEVOPS_DOES_NOT_MATCH \
>> +        ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH
>> +#define ERROR_REMUS_DEVICE_NOT_SUPPORTED \
>> +        ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED
>> +#endif
>> +
>>    typedef char **libxl_string_list;
>>    void libxl_string_list_dispose(libxl_string_list *sl);
>>    int libxl_string_list_length(const libxl_string_list *sl);
>>
>>
>>>
>>>> I don't see any such patch
>>>>> even after this point though (from grepping your colo-v8 branch).
>>>>>
>>>>> Ian.
>>>>>
>>>>> .
>>>>>
>>>>
>>>
>>>
>>> .
>>>
>>
>
>
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 07/25] libxl/remus: init checkpoint_callback in Remus checkpoint callback
  2015-07-15 12:35     ` Yang Hongyang
@ 2015-07-16 10:32       ` Ian Campbell
  2015-07-16 11:00         ` Yang Hongyang
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-16 10:32 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On Wed, 2015-07-15 at 20:35 +0800, Yang Hongyang wrote:
> 
> On 07/15/2015 08:02 PM, Ian Campbell wrote:
> > On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> >> init stream {read/write} state checkpoint_callback in Remus
> >> checkpoint callback.
> >
> > Why? Is this earlier or later than previously? Seems later?
> 
> There's no functional change, it's just refactoring so that we can move
> all remus code into one file.

That answers the why, thanks.

But, it would be a no functional change if the initialisation was moving
from e.g. the very start of a function to the caller of that function
right before the call or from the end of a function to the caller right
after the call.

But AFAICT this movement is a bit more than that, e.g. the init of
dcs->srs.checkpoint_callback has moved from near the end of
domcreate_bootloader_done to
libxl__remus_domain_restore_checkpoint_callback before a call to
libxl__stream_read_start_checkpoint, which doesn't have the property I
describe above, at least not in an obvious way.

So there is either a span of time where the callback is no longer
initialised when it was before, or it is initialised for a larger span
than it was before (with the former having the larger potential for
issues).

It's also not entirely clear that the new location of the initialisation
is traversed on all the same paths as before or, if it happens on fewer
paths, that those are exactly the ones which matter.

Lastly it seems odd to split out the initialisation of only one member
of dcs->srs and dcs->sws into a different location to all the others,
especially putting it into the checkpoint callback (which is called
repeatedly).

Perhaps what is really needed here is a new function to initialise a
dcs->srs for remus, and another for dcs->sws to be called exactly where
the init happens today?
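
A rough sketch of the shape such an init function could take, for the read
side only. Everything here is a stand-in invented for illustration: the
struct, its fields and the function names are not the real libxl
definitions, they only show all members being set in one place, once,
before the stream is started.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the stream read state; not the libxl struct. */
typedef struct stream_read_state stream_read_state;
typedef void checkpoint_cb(stream_read_state *srs, int rc);

struct stream_read_state {
    int fd;                              /* stream file descriptor */
    int last_rc;                         /* result of the last checkpoint */
    checkpoint_cb *checkpoint_callback;  /* called after each checkpoint */
};

/* Hypothetical Remus checkpoint handler: just records the result. */
void remus_checkpoint_done(stream_read_state *srs, int rc)
{
    srs->last_rc = rc;
}

/*
 * Initialise every Remus-specific member together, at the point where the
 * other members are set up, rather than inside the (repeatedly called)
 * checkpoint callback.
 */
void remus_init_stream_read_state(stream_read_state *srs, int fd)
{
    srs->fd = fd;
    srs->last_rc = 0;
    srs->checkpoint_callback = remus_checkpoint_done;
}
```

A matching function for the write side would follow the same pattern.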

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 12/25] tools/libxl: introduce enum type libxl_checkpointed_stream
  2015-07-15 13:58     ` Yang Hongyang
@ 2015-07-16 10:34       ` Ian Campbell
  2015-07-16 10:47         ` Yang Hongyang
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-16 10:34 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson

On Wed, 2015-07-15 at 21:58 +0800, Yang Hongyang wrote:
> 
> On 07/15/2015 08:34 PM, Ian Campbell wrote:
> > On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> >> introduce enum type libxl_checkpointed_stream in IDL.
> >> rename the last argument of migrate_receive from "remus" to
> >> "checkpointed" since the semantics of this parameter has
> >> changed.
> >>
> >> NOTE:
> >>   libxl_domain_restore_params isn't changed here,
> >>   checkpointed_stream is still an int.
> >>   It has to change eventually and other callers will have to be
> >>   updated to cope (and there should be LIBXL_HAVE_...).
> >
> > Will this be fixed up later in this series? If so please say so.
> 
> It's not fixed in this series, I plan to fix this later, but seems there
> will be another round for this series, I can fix this in the next version.
> My main concern is that this change is an api change, it will affect the
> existing callers.

It is already an API change. Whether or not it is reflected in the type of
checkpointed_stream in the API struct, you've already changed the
semantics of that field, and so a LIBXL_HAVE is already needed. It makes
no sense not to also change the type to be correct while you are making
these changes, even if the interchangeability of ints and enums seems on
the face of it to make it possible to avoid doing so.
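
For reference, this is roughly how a caller would cope with such a
LIBXL_HAVE define. The sketch is self-contained and does not use libxl.h:
the macro name LIBXL_HAVE_CHECKPOINTED_STREAM, the demo_ types and
request_remus_stream() are assumptions made up for the example, not
something this series defines.

```c
#include <assert.h>

/* Assumption: a newer header advertises the typed field with this define. */
#define LIBXL_HAVE_CHECKPOINTED_STREAM 1

/* Stand-in for the enum type introduced in the IDL. */
typedef enum {
    DEMO_CHECKPOINTED_STREAM_NONE  = 0,
    DEMO_CHECKPOINTED_STREAM_REMUS = 1,
} demo_checkpointed_stream;

/* Stand-in for the restore params struct, typed only when advertised. */
typedef struct {
#ifdef LIBXL_HAVE_CHECKPOINTED_STREAM
    demo_checkpointed_stream checkpointed_stream;
#else
    int checkpointed_stream;
#endif
} demo_restore_params;

/* Caller code that builds against both the old and the new header. */
void request_remus_stream(demo_restore_params *p)
{
#ifdef LIBXL_HAVE_CHECKPOINTED_STREAM
    p->checkpointed_stream = DEMO_CHECKPOINTED_STREAM_REMUS;
#else
    p->checkpointed_stream = 1; /* old semantics: non-zero means Remus */
#endif
}
```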

Ian.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure
  2015-07-15 13:50     ` Yang Hongyang
@ 2015-07-16 10:37       ` Ian Campbell
  2015-07-16 11:10         ` Ian Jackson
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-16 10:37 UTC (permalink / raw)
  To: Yang Hongyang, ian.jackson
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram

On Wed, 2015-07-15 at 21:50 +0800, Yang Hongyang wrote:
> 
> On 07/15/2015 09:28 PM, Ian Campbell wrote:
> > On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> >> @@ -2921,6 +2911,26 @@ _hidden void libxl__checkpoint_devices_preresume(libxl__egc *egc,
> >>                                           libxl__checkpoint_devices_state *cds);
> >>   _hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
> >>                                           libxl__checkpoint_devices_state *cds);
> >> +
> >> +/*----- Remus related state structure -----*/
> >> +typedef struct libxl__remus_state libxl__remus_state;
> >> +struct libxl__remus_state {
> >> +    /* private */
> >> +    libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
> >> +    int interval; /* checkpoint interval */
> >> +
> >> +    /* abstract layer */
> >> +    libxl__checkpoint_devices_state cds;
> >
> > This mostly makes sense, I think, but this one field feels like it will
> > be wanted by colo too. Does that mean we will end up with dss->rs.cds
> > and dss->colo.cds doing effectively the same thing?
> 
> Yes, checkpoint device is an abstract layer, used by both Remus & colo,
> in the abstract layer, we do not aware of remus or colo, in Remus or colo,
> we can use container of cds to retrive Remus/colo state.

This is because the cds callbacks receive a
libxl__checkpoint_devices_state * but are specific to either Remus or
Colo?

I think the usual way to solve that would be for the callback to take a
void *data "closure" field, which is registered along with the callbacks
and passed to all callbacks, or in this case perhaps you can get away
with just including it in the cds itself.
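
To make the two options concrete, here is a small self-contained sketch.
All types and names are simplified stand-ins, not the libxl ones: one
callback recovers its owner with a container-of macro, the other reads an
opaque data pointer stored in the cds.

```c
#include <assert.h>
#include <stddef.h>

typedef struct checkpoint_devices_state checkpoint_devices_state;
typedef void cds_callback(checkpoint_devices_state *cds, int rc);

/* Stand-in for the abstract layer state. */
struct checkpoint_devices_state {
    cds_callback *callback;
    void *data;               /* closure option: opaque owner pointer */
};

/* Stand-in for the Remus state that embeds the abstract layer. */
typedef struct {
    int interval;
    int last_rc;
    checkpoint_devices_state cds;
} remus_state;

#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Option 1: recover the owner from the embedded member's address. */
void remus_cb_container(checkpoint_devices_state *cds, int rc)
{
    remus_state *rs = CONTAINER_OF(cds, remus_state, cds);
    rs->last_rc = rc;
}

/* Option 2: the owner was registered in cds->data along with the callback. */
void remus_cb_closure(checkpoint_devices_state *cds, int rc)
{
    remus_state *rs = cds->data;
    rs->last_rc = rc;
}

/* The abstract layer only knows about the cds and its callback. */
void checkpoint_devices_finish(checkpoint_devices_state *cds, int rc)
{
    cds->callback(cds, rc);
}
```

With the data pointer the cds does not have to be embedded in the
Remus/COLO state at all, at the cost of one extra field to keep in sync.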

Ian, what do you think?

Ian.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 16/25] tools/libxl: Update libxl_domain_unpause() to support qemu-xen
  2015-07-16  3:49     ` Yang Hongyang
@ 2015-07-16 10:39       ` Ian Campbell
  2015-07-16 10:51         ` Yang Hongyang
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Campbell @ 2015-07-16 10:39 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On Thu, 2015-07-16 at 11:49 +0800, Yang Hongyang wrote:
> 
> On 07/15/2015 08:50 PM, Ian Campbell wrote:
> > On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> >> Currently, libxl__domain_unpause() only supports
> >> qemu-xen-traditional. Update it to support qemu-xen.
> >> We use libxl__domain_resume_device_model to unpause guest dm.
> >>
> >> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> >> CC: Ian Campbell <Ian.Campbell@citrix.com>
> >> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> >> CC: Wei Liu <wei.liu2@citrix.com>
> >> ---
> >>   tools/libxl/libxl.c | 15 +++++----------
> >>   1 file changed, 5 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> >> index 5b2d045..799aead 100644
> >> --- a/tools/libxl/libxl.c
> >> +++ b/tools/libxl/libxl.c
> >> @@ -941,8 +941,6 @@ out:
> >>   int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
> >>   {
> >>       GC_INIT(ctx);
> >> -    char *path;
> >> -    char *state;
> >>       int ret, rc = 0;
> >>
> >>       libxl_domain_type type = libxl__domain_type(gc, domid);
> >> @@ -952,14 +950,11 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
> >>       }
> >>
> >>       if (type == LIBXL_DOMAIN_TYPE_HVM) {
> >> -        uint32_t dm_domid = libxl_get_stubdom_id(ctx, domid);
> >> -
> >> -        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
> >> -        state = libxl__xs_read(gc, XBT_NULL, path);
> >> -        if (state != NULL && !strcmp(state, "paused")) {
> >> -            libxl__qemu_traditional_cmd(gc, domid, "continue");
> >> -            libxl__wait_for_device_model_deprecated(gc, domid, "running",
> >> -                                         NULL, NULL, NULL);
> >> +        rc = libxl__domain_resume_device_model(gc, domid);
> >> +        if (rc < 0) {
> >> +            LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "failed to unpause device model "
> >> +                       "for domain %u:%d", domid, rc);
> >
> > Please use the preferred form of LOG(ERROR, "failed to..."), which
> > should also hopefully allow you to avoid splitting the line in the
> > middle of a string constant which is discouraged.
> >
> > If you can't use LOG() then please:
> >              LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
> >                         "failed to unpause device model for domain %u:%d",
> >                          domid, rc);
> >
> > Not splitting string constants means you can grep for an error message.
> 
> Sorry, the commit message is wrong, it's libxl_domain_unpause, not
> libxl__domain_unpause, LOG() can't be used, so I will update commit message
> and use your later suggestion, thank you!

Why can't LOG() be used? libxl_domain_unpause has a GC_INIT(ctx), so you
should have a gc in scope, which is all you need.
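
A toy model of why that is enough, assuming (as in libxl) that the logging
macro picks up a local variable named gc implicitly. The MOCK_ macros and
mock_ types below are invented for this sketch and only imitate the
pattern; they are not the libxl definitions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Mock context that remembers the last message logged. */
typedef struct { char last_msg[128]; } mock_ctx;
/* Mock gc that knows which context owns it. */
typedef struct { mock_ctx *owner; } mock_gc;

/* Declares a local 'gc', in the spirit of GC_INIT(ctx). */
#define MOCK_GC_INIT(ctx) \
    mock_gc gc_storage = { (ctx) }; mock_gc *gc = &gc_storage

/* Uses the local 'gc' implicitly, so callers never pass a ctx. */
#define MOCK_LOG(fmt, ...) \
    snprintf(gc->owner->last_msg, sizeof(gc->owner->last_msg), \
             fmt, __VA_ARGS__)

/* Public-style entry point: takes a ctx, creates the gc itself. */
int mock_domain_unpause(mock_ctx *ctx, unsigned domid, int resume_rc)
{
    MOCK_GC_INIT(ctx);

    if (resume_rc < 0) {
        MOCK_LOG("failed to unpause device model for domain %u:%d",
                 domid, resume_rc);
        return resume_rc;
    }
    return 0;
}
```

Note that the format string stays on one line, so the message can still be
found with grep.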

Ian.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 05/25] libxl/remus: introduce libxl__remus_setup
  2015-07-16  5:32     ` Yang Hongyang
@ 2015-07-16 10:40       ` Ian Campbell
  0 siblings, 0 replies; 101+ messages in thread
From: Ian Campbell @ 2015-07-16 10:40 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On Thu, 2015-07-16 at 13:32 +0800, Yang Hongyang wrote:
> 
> On 07/15/2015 07:26 PM, Ian Campbell wrote:
> > On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> >> Refactoring Remus setup by introducing libxl__remus_setup API.
> >> All Remus setup work are done in this function.
> >>
> >> Also remove the libxl__ prefix for static functions.
> >
> > There is a subtle behavioural change here, which is that if anything
> > which is now done in _setup fails then the result is a call to
> > dss->callback( ..,..,ERROR_FAIL) rather than _start returning
> > AO_CREATE_FAIL(ERROR_FAIL).
> >
> > I think this is probably a reasonable and correct change, but I think it
> > is worth mentioning in the commit log.
> 
> Yes, will update the commit log.
> 
> >
> > That said, I also wonder if the actual check for netbuffer_enabled (the
> > only such failure in practice) ought to be moved up such that it stays
> > in _start along with the other similar checks, i.e. _start would do:
> >
> >      if (libxl_defbool_val(info->netbuf) && !libxl__netbuffer_enabled(gc)) {
> >              LOG(ERROR, "Remus: No support for network buffering");
> >              rc = ERROR_FAIL;
> >              goto out;
> >          }
> 
> This check is for Remus only, we want to reuse _start for COLO, so anything
> related to Remus only should sit in libxl_remus.c.

Makes sense, thanks.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 12/25] tools/libxl: introduce enum type libxl_checkpointed_stream
  2015-07-16 10:34       ` Ian Campbell
@ 2015-07-16 10:47         ` Yang Hongyang
  0 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16 10:47 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, ian.jackson



On 07/16/2015 06:34 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 21:58 +0800, Yang Hongyang wrote:
>>
>> On 07/15/2015 08:34 PM, Ian Campbell wrote:
>>> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>>>> introduce enum type libxl_checkpointed_stream in IDL.
>>>> rename the last argument of migrate_receive from "remus" to
>>>> "checkpointed" since the semantics of this parameter has
>>>> changed.
>>>>
>>>> NOTE:
>>>>    libxl_domain_restore_params isn't changed here,
>>>>    checkpointed_stream is still an int.
>>>>    It has to change eventually and other callers will have to be
>>>>    updated to cope (and there should be LIBXL_HAVE_...).
>>>
>>> Will this be fixed up later in this series? If so please say so.
>>
>> It's not fixed in this series, I plan to fix this later, but seems there
>> will be another round for this series, I can fix this in the next version.
>> My main concern is that this change is an api change, it will affect the
>> existing callers.
>
> It is already an API change, whether or not is reflected in the type of
> checkpointed_stream in the API struct you've already changed the
> semantics of that field and so a LIBLX_HAVE is already needed, it makes
> no sense to not also change the type to be correct while you are making
> these changes even if the interchangeability of ints and enums seems on
> the face of it to make it possible to avoid doing so.

Fair enough, will fix, thank you!

>
> Ian.
>
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 13/25] migration/save: pass checkpointed_stream from libxl to libxc
  2015-07-16  6:05     ` Yang Hongyang
@ 2015-07-16 10:47       ` Ian Campbell
  2015-07-16 16:13       ` Wei Liu
  1 sibling, 0 replies; 101+ messages in thread
From: Ian Campbell @ 2015-07-16 10:47 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On Thu, 2015-07-16 at 14:05 +0800, Yang Hongyang wrote:
> 
> On 07/15/2015 08:38 PM, Ian Campbell wrote:
> > On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> >> Pass checkpointed_stream from libxl to libxc.
> >> It won't affact legacy migration because legacy migration
> >> won't use this param.
> >>
> >> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> >> CC: Ian Campbell <Ian.Campbell@citrix.com>
> >> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> >> CC: Wei Liu <wei.liu2@citrix.com>
> >> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> >> ---
> >>   tools/libxc/include/xenguest.h   |  9 ++++++---
> >>   tools/libxc/xc_domain_save.c     |  6 ++++--
> >>   tools/libxc/xc_nomigrate.c       |  3 ++-
> >>   tools/libxc/xc_sr_common.h       |  2 +-
> >>   tools/libxc/xc_sr_save.c         |  5 +++--
> >>   tools/libxl/libxl.c              |  2 ++
> >>   tools/libxl/libxl_dom_save.c     | 11 ++++++++---
> >>   tools/libxl/libxl_internal.h     |  1 +
> >>   tools/libxl/libxl_save_callout.c |  2 +-
> >>   tools/libxl/libxl_save_helper.c  |  3 ++-
> >>   10 files changed, 30 insertions(+), 14 deletions(-)
> >>
> >> diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
> >> index e95af54..6e24b6c 100644
> >> --- a/tools/libxc/include/xenguest.h
> >> +++ b/tools/libxc/include/xenguest.h
> >> @@ -30,7 +30,6 @@
> >>   #define XCFLAGS_HVM       (1 << 2)
> >>   #define XCFLAGS_STDVGA    (1 << 3)
> >>   #define XCFLAGS_CHECKPOINT_COMPRESS    (1 << 4)
> >> -#define XCFLAGS_CHECKPOINTED    (1 << 5)
> >>
> >>   #define X86_64_B_SIZE   64
> >>   #define X86_32_B_SIZE   32
> >> @@ -85,16 +84,20 @@ struct save_callbacks {
> >>    * @parm xch a handle to an open hypervisor interface
> >>    * @parm fd the file descriptor to save a domain to
> >>    * @parm dom the id of the domain
> >> + * @parm checkpointed_stream non-zero if the far end of the stream is using
> >> + *       checkpointing
> >
> > Do (or will) specific non-zero values have any meaning to the libxc
> > layer? i.e. does it have any knowledge of COLO vs. Remus as the libxl
> > enum added in the last patch does?
> 
> Yes, libxc side should be aware of the type of checkpointed_stream (Remus
> or COLO).

In which case I'm afraid something somewhere needs to explicitly convert
from the LIBXL_ namespace to the libxc one (which would be better made
explicit with their own names not open coded numbers). See the handling
of libxl_tsc_mode or libxl_trigger for an example.

Alternatively you could add a comment to libxl_types.idl like
libxl_timer_mode and libxl_shutdown_reason have indicating that these
values must remain consistent with some underlying interface. I don't
much like that but it is tolerable I suppose.

> 
> I think it is better to document the non-zero values here?
> for example:
>       * @parm checkpointed_stream non-zero if the far end of the stream is using
>       *                           checkpointing
>       *                           0 no checkpointed stream
>       *                           1 Remus
>       *                           2 COLO

I'd prefer named constants in the XC_ namespace.
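
A minimal sketch of what the explicit conversion could look like. All the
names below (XC_CKPT_STREAM_*, SKETCH_LIBXL_STREAM_*, and the helper) are
hypothetical stand-ins chosen for illustration; they are not taken from the
patch or from the existing headers.

```c
#include <assert.h>

/* Hypothetical libxc-side named constants (illustrative names only). */
enum xc_ckpt_stream {
    XC_CKPT_STREAM_NONE  = 0,   /* no checkpointed stream */
    XC_CKPT_STREAM_REMUS = 1,
    XC_CKPT_STREAM_COLO  = 2,
};

/* Stand-in for the libxl enum added in the previous patch (assumed shape). */
enum sketch_libxl_stream {
    SKETCH_LIBXL_STREAM_NONE,
    SKETCH_LIBXL_STREAM_REMUS,
    SKETCH_LIBXL_STREAM_COLO,
};

/*
 * Convert explicitly between the two namespaces, so the libxl values are
 * free to change without silently altering what libxc sees.
 * Returns -1 for a value that has no libxc equivalent.
 */
static int checkpointed_stream_to_xc(int libxl_val)
{
    switch (libxl_val) {
    case SKETCH_LIBXL_STREAM_NONE:  return XC_CKPT_STREAM_NONE;
    case SKETCH_LIBXL_STREAM_REMUS: return XC_CKPT_STREAM_REMUS;
    case SKETCH_LIBXL_STREAM_COLO:  return XC_CKPT_STREAM_COLO;
    default:                        return -1;
    }
}
```

With something of this shape the caller can reject an unknown value before
it ever reaches libxc, rather than passing an open-coded number through.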

Ian.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 16/25] tools/libxl: Update libxl_domain_unpause() to support qemu-xen
  2015-07-16 10:39       ` Ian Campbell
@ 2015-07-16 10:51         ` Yang Hongyang
  0 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16 10:51 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson



On 07/16/2015 06:39 PM, Ian Campbell wrote:
> On Thu, 2015-07-16 at 11:49 +0800, Yang Hongyang wrote:
>>
>> On 07/15/2015 08:50 PM, Ian Campbell wrote:
>>> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>>>> Currently, libxl__domain_unpause() only supports
>>>> qemu-xen-traditional. Update it to support qemu-xen.
>>>> We use libxl__domain_resume_device_model to unpause guest dm.
>>>>
>>>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>>>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>>>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>>>> CC: Wei Liu <wei.liu2@citrix.com>
>>>> ---
>>>>    tools/libxl/libxl.c | 15 +++++----------
>>>>    1 file changed, 5 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
>>>> index 5b2d045..799aead 100644
>>>> --- a/tools/libxl/libxl.c
>>>> +++ b/tools/libxl/libxl.c
>>>> @@ -941,8 +941,6 @@ out:
>>>>    int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
>>>>    {
>>>>        GC_INIT(ctx);
>>>> -    char *path;
>>>> -    char *state;
>>>>        int ret, rc = 0;
>>>>
>>>>        libxl_domain_type type = libxl__domain_type(gc, domid);
>>>> @@ -952,14 +950,11 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
>>>>        }
>>>>
>>>>        if (type == LIBXL_DOMAIN_TYPE_HVM) {
>>>> -        uint32_t dm_domid = libxl_get_stubdom_id(ctx, domid);
>>>> -
>>>> -        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
>>>> -        state = libxl__xs_read(gc, XBT_NULL, path);
>>>> -        if (state != NULL && !strcmp(state, "paused")) {
>>>> -            libxl__qemu_traditional_cmd(gc, domid, "continue");
>>>> -            libxl__wait_for_device_model_deprecated(gc, domid, "running",
>>>> -                                         NULL, NULL, NULL);
>>>> +        rc = libxl__domain_resume_device_model(gc, domid);
>>>> +        if (rc < 0) {
>>>> +            LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "failed to unpause device model "
>>>> +                       "for domain %u:%d", domid, rc);
>>>
>>> Please use the preferred form of LOG(ERROR, "failed to..."), which
>>> should also hopefully allow you to avoid splitting the line in the
>>> middle of a string constant which is discouraged.
>>>
>>> If you can't use LOG() then please:
>>>               LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
>>>                          "failed to unpause device model for domain %u:%d",
>>>                           domid, rc);
>>>
>>> Not splitting string constants means you can grep for an error message.
>>
>> Sorry, the commit message is wrong, it's libxl_domain_unpause, not
>> libxl__domain_unpause, LOG() can't be used, so I will update commit message
>> and use your later suggestion, thank you!
>
> Why can't LOG() be used? libxl_domain_unpause has a GC_INIT(ctx), so you
> should have a gc in scope, which is all you need.

Sorry, my memory is wrong, fixed now.

>
> Ian.
>
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 07/25] libxl/remus: init checkpoint_callback in Remus checkpoint callback
  2015-07-16 10:32       ` Ian Campbell
@ 2015-07-16 11:00         ` Yang Hongyang
  2015-07-16 11:16           ` Ian Campbell
  0 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16 11:00 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson



On 07/16/2015 06:32 PM, Ian Campbell wrote:
> On Wed, 2015-07-15 at 20:35 +0800, Yang Hongyang wrote:
>>
>> On 07/15/2015 08:02 PM, Ian Campbell wrote:
>>> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>>>> init stream {read/write} state checkpoint_callback in Remus
>>>> checkpoint callback.
>>>
>>> Why? Is this earlier or later than previously? Seems later?
>>
>> There's no functional change, it's just refactoring so that we can move
>> all remus code into one file.
>
> That answers the why, thanks.
>
> But, it would be a no functional change if the initialisation was moving
> from e.g. the very start of a function to the caller of that function
> right before the call or from the end of a function to the caller right
> after the call.
>
> But AFAICT this movement is a bit more than that, e.g. the init of
> dcs->srs.checkpoint_callback has moved from near the end of
> domcreate_bootloader_done to
> libxl__remus_domain_restore_checkpoint_callback before a call to
> libxl__stream_read_start_checkpoint, which doesn't have the property I
> describe above, at least not in an obvious way.
>
> So there is either a span of time where the callback is no longer
> initialised when it was before, or it is initialised for a larger span
> than it was before (with the former having the larger potential for
> issues).
>
> It's also not entirely clear that the new location of the initialisation
> is traversed on all the same paths as before or, if it happens on fewer
> paths, that those are exactly the ones which matter.
>
> Lastly it seems odd to split out the initialisation of only one member
> of dcs->srs and dcs->sws into a different location to all the others,
> especially putting it into the checkpoint callback (which is called
> repeatedly).
>
> Perhaps what is really needed here is a new function to initialise a
> dcs->srs for remus, and another for dcs->sws to be called exactly where
> the init happens today?

checkpoint_callback() is only used by Remus, as you can see from the
initialisation lines:
dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;

These two functions, remus_checkpoint_stream_written and
remus_checkpoint_stream_done, are only called when libxc invokes the
checkpoint() callback, so there should be no problem with the move.
Given that it is only used by Remus, initialising it in the previous
place is not a good idea IMO.

>
>
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 20/25] tools/libx{l, c}: add back channel to libxc
  2015-07-16  6:29     ` Yang Hongyang
@ 2015-07-16 11:01       ` Ian Campbell
  0 siblings, 0 replies; 101+ messages in thread
From: Ian Campbell @ 2015-07-16 11:01 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, eddie.dong, wency, andrew.cooper3, yunhong.jiang,
	ian.jackson, xen-devel, guijianfeng, rshriram

On Thu, 2015-07-16 at 14:29 +0800, Yang Hongyang wrote:
> On 07/15/2015 09:13 PM, Ian Campbell wrote:
> > On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> >> In COLO mode, both VMs are running, and are considered in sync if the
> >> visible network traffic is identical.  After some time, they fall out of
> >> sync.
> >>
> >> At this point, the two VMs have definitely diverged.  Let's call the
> >> primary dirty bitmap set A, and the secondary dirty bitmap set B.
> >>
> >> Sets A and B are different.
> >>
> >> Under normal migration, the page data for set A will be sent from the
> >> primary to the secondary.
> >>
> >> However, the set difference B - A (let's call this C) is out-of-date on
> >> the secondary (with respect to the primary) and will not be sent by the
> >> primary, as it was not memory dirtied by the primary.  The secondary
> >> needs the page data for C to reconstruct an exact copy of the primary at
> >> the checkpoint.
> >>
> >> The secondary cannot calculate C as it doesn't know A.  Instead, the
> >> secondary must send B to the primary, at which point the primary
> >> calculates the union of A and B (let's call this D) which is all the
> >> pages dirtied by either the primary or the secondary, and sends all page
> >> data covered by D.
> >>
> >> In the general case, D is a superset of both A and B.  Without the
> >> backchannel dirty bitmap, a COLO checkpoint can't reconstruct a valid
> >> copy of the primary.
> >
> > When Andy (who wrote this) said this via email I replied [0] including:
> >
> >          According to the paper there is no need to resend because the
> >          secondary already has a non-dirty copy of any memory which is
> >          dirty in B but not A.
> >
> > So it is not the case that a checkpoint _can't_ reconstruct a valid copy
> > of the primary, clearly it is possible, but for some reason this
> > implementation chooses to deviate from the paper and does things in a
> > way where it indeed cannot reconstruct D but I've yet to see a
> > description of _why_ the implementation produced here differs from the
> > paper.
> >
> >> We transfer the dirty bitmap on libxc side, so we need to introduce back
> >> channel to libxc.
> >
> > I'm sure you have good practical reasons why the implementation differs
> > from the design and I would like to know what they are because the back
> > channel is adding extra complexity to libxc and libxl so I want to know
> > why it is justified, as I also said this in [1].
> >
> > Lastly Ian said in [2]:
> >
> >          To be clear, I have no problem if the design has changed since the
> >          paper was written.  I just want:
> >
> >           * A clear high-level explanation of the actually-implemented
> >             arrangements to exist somewhere
> >
> >           * The commit messages, or code, to refer to that explanation
> >
> > IMHO the addition of this extra commit message doesn't really meet at
> > least the first requirement. Please point us to an up to date design
> > document which describes COLO as actually implemented.
> 
> The original design of COLO is that:
> Secondary should maintain an exact copy of Primary memory. At every
> checkpoint, we receive the Primary memory into that copy, then flush
> the memory to Secondary.
> 
> We changed the original design to the current one because of the
> following concerns:
> 1. The original design needs extra memory on the Secondary host. When
>     there are multiple backups on one host, the memory cost is high.
> 2. The memory cache code will be another 1k+, it will make the
>     review more time consuming.
> 
> The best way to accomplish this is the COW approach which Andrew mentioned
> earlier, but that should be a further improvement. We will certainly
> continue to improve this when COW is ready to use.
> 
> Does the description above resolve your confusion? If so, I will add it
> to the commit log.

I'd much prefer to see this incorporated into an updated design which
supersedes the existing paper and is referenced from the wiki etc.

It doesn't need to be anything like as formal as that paper, of course,
but I think it does need to cover the overall design of COLO as it is in
reality today as a whole and not just as a delta to the paper in the
commit log.

Such a document might also usefully _in_addition_ contain a section
explaining the delta from the original paper and the reasons for those.
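
As a starting point for such a document, the set arithmetic from the quoted
commit message can be sketched as follows. This is only an illustration: the
function names are hypothetical, and representing each dirty bitmap as an
array of 64-bit words is an assumption, not a description of the libxc code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* D = A | B: every page dirtied by either the primary (A) or secondary (B). */
static void dirty_bitmap_union(uint64_t *d, const uint64_t *a,
                               const uint64_t *b, size_t nr_words)
{
    for (size_t i = 0; i < nr_words; i++)
        d[i] = a[i] | b[i];
}

/* C = B & ~A: pages dirtied only by the secondary, which A alone misses. */
static void dirty_bitmap_difference(uint64_t *c, const uint64_t *b,
                                    const uint64_t *a, size_t nr_words)
{
    for (size_t i = 0; i < nr_words; i++)
        c[i] = b[i] & ~a[i];
}

/* Number of set bits, i.e. how many pages the bitmap says must be sent. */
static size_t dirty_bitmap_count(const uint64_t *bm, size_t nr_words)
{
    size_t n = 0;

    for (size_t i = 0; i < nr_words; i++)
        for (uint64_t w = bm[i]; w; w &= w - 1)   /* clear lowest set bit */
            n++;
    return n;
}
```

The secondary can compute neither C nor D on its own because it never sees
A, which is why B has to travel back to the primary in the current scheme.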

Ian.

> >
> > Ian.
> >
> > [0] http://lists.xen.org/archives/html/xen-devel/2015-07/msg00090.html
> > [1] http://lists.xen.org/archives/html/xen-devel/2015-07/msg00148.html
> > [2] http://lists.xen.org/archives/html/xen-devel/2015-07/msg00101.html
> >
> >>
> >> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> >> commit message:
> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> >> CC: Ian Campbell <Ian.Campbell@citrix.com>
> >> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> >> CC: Wei Liu <wei.liu2@citrix.com>
> >> ---
> >>   tools/libxc/include/xenguest.h   |  8 ++++----
> >>   tools/libxc/xc_domain_restore.c  |  4 ++--
> >>   tools/libxc/xc_domain_save.c     |  4 ++--
> >>   tools/libxc/xc_sr_restore.c      |  2 +-
> >>   tools/libxc/xc_sr_save.c         |  2 +-
> >>   tools/libxl/libxl_save_callout.c | 39 ++++++++++++++++++++++++++-------------
> >>   tools/libxl/libxl_save_helper.c  |  8 ++++++--
> >>   7 files changed, 42 insertions(+), 25 deletions(-)
> >>
> >> diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
> >> index 6e24b6c..4056955 100644
> >> --- a/tools/libxc/include/xenguest.h
> >> +++ b/tools/libxc/include/xenguest.h
> >> @@ -91,13 +91,13 @@ struct save_callbacks {
> >>   int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
> >>                      uint32_t max_factor, uint32_t flags /* XCFLAGS_xxx */,
> >>                      struct save_callbacks* callbacks, int hvm,
> >> -                   int checkpointed_stream);
> >> +                   int checkpointed_stream, int back_fd);
> >>
> >>   /* Domain Save v2 */
> >>   int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
> >>                       uint32_t max_factor, uint32_t flags,
> >>                       struct save_callbacks* callbacks, int hvm,
> >> -                    int checkpointed_stream);
> >> +                    int checkpointed_stream, int back_fd);
> >>
> >>   /* callbacks provided by xc_domain_restore */
> >>   struct restore_callbacks {
> >> @@ -140,7 +140,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
> >>                         unsigned long *console_mfn, domid_t console_domid,
> >>                         unsigned int hvm, unsigned int pae, int superpages,
> >>                         int checkpointed_stream,
> >> -                      struct restore_callbacks *callbacks);
> >> +                      struct restore_callbacks *callbacks, int back_fd);
> >>
> >>   /* Domain Restore v2 */
> >>   int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
> >> @@ -149,7 +149,7 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
> >>                          unsigned long *console_mfn, domid_t console_domid,
> >>                          unsigned int hvm, unsigned int pae, int superpages,
> >>                          int checkpointed_stream,
> >> -                       struct restore_callbacks *callbacks);
> >> +                       struct restore_callbacks *callbacks, int back_fd);
> >>   /**
> >>    * xc_domain_restore writes a file to disk that contains the device
> >>    * model saved state.
> >> diff --git a/tools/libxc/xc_domain_restore.c b/tools/libxc/xc_domain_restore.c
> >> index 3cd3483..63d1e6b 100644
> >> --- a/tools/libxc/xc_domain_restore.c
> >> +++ b/tools/libxc/xc_domain_restore.c
> >> @@ -1515,7 +1515,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
> >>                         unsigned long *console_mfn, domid_t console_domid,
> >>                         unsigned int hvm, unsigned int pae, int superpages,
> >>                         int checkpointed_stream,
> >> -                      struct restore_callbacks *callbacks)
> >> +                      struct restore_callbacks *callbacks, int back_fd)
> >>   {
> >>       DECLARE_DOMCTL;
> >>       xc_dominfo_t info;
> >> @@ -1578,7 +1578,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
> >>           return xc_domain_restore2(
> >>               xch, io_fd, dom, store_evtchn, store_mfn,
> >>               store_domid, console_evtchn, console_mfn, console_domid,
> >> -            hvm,  pae,  superpages, checkpointed_stream, callbacks);
> >> +            hvm,  pae,  superpages, checkpointed_stream, callbacks, back_fd);
> >>       }
> >>
> >>       DPRINTF("%s: starting restore of new domid %u", __func__, dom);
> >> diff --git a/tools/libxc/xc_domain_save.c b/tools/libxc/xc_domain_save.c
> >> index 0da3cca..b111384 100644
> >> --- a/tools/libxc/xc_domain_save.c
> >> +++ b/tools/libxc/xc_domain_save.c
> >> @@ -803,7 +803,7 @@ static int save_tsc_info(xc_interface *xch, uint32_t dom, int io_fd)
> >>   int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
> >>                      uint32_t max_factor, uint32_t flags,
> >>                      struct save_callbacks* callbacks, int hvm,
> >> -                   int checkpointed_stream)
> >> +                   int checkpointed_stream, int back_fd)
> >>   {
> >>       xc_dominfo_t info;
> >>       DECLARE_DOMCTL;
> >> @@ -899,7 +899,7 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iter
> >>       {
> >>           return xc_domain_save2(xch, io_fd, dom, max_iters,
> >>                                  max_factor, flags, callbacks, hvm,
> >> -                               checkpointed_stream);
> >> +                               checkpointed_stream, back_fd);
> >>       }
> >>
> >>       DPRINTF("%s: starting save of domid %u", __func__, dom);
> >> diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
> >> index bf1ee15..504463e 100644
> >> --- a/tools/libxc/xc_sr_restore.c
> >> +++ b/tools/libxc/xc_sr_restore.c
> >> @@ -720,7 +720,7 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
> >>                          unsigned long *console_gfn, domid_t console_domid,
> >>                          unsigned int hvm, unsigned int pae, int superpages,
> >>                          int checkpointed_stream,
> >> -                       struct restore_callbacks *callbacks)
> >> +                       struct restore_callbacks *callbacks, int back_fd)
> >>   {
> >>       struct xc_sr_context ctx =
> >>           {
> >> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
> >> index 6102b66..d12e5b1 100644
> >> --- a/tools/libxc/xc_sr_save.c
> >> +++ b/tools/libxc/xc_sr_save.c
> >> @@ -821,7 +821,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
> >>   int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom,
> >>                       uint32_t max_iters, uint32_t max_factor, uint32_t flags,
> >>                       struct save_callbacks* callbacks, int hvm,
> >> -                    int checkpointed_stream)
> >> +                    int checkpointed_stream, int back_fd)
> >>   {
> >>       xen_pfn_t nr_pfns;
> >>       struct xc_sr_context ctx =
> >> diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
> >> index f393abc..f8c6cf0 100644
> >> --- a/tools/libxl/libxl_save_callout.c
> >> +++ b/tools/libxl/libxl_save_callout.c
> >> @@ -27,7 +27,7 @@
> >>    */
> >>   static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
> >>                          const char *mode_arg,
> >> -                       int stream_fd,
> >> +                       int stream_fd, int back_fd,
> >>                          const int *preserve_fds, int num_preserve_fds,
> >>                          const unsigned long *argnums, int num_argnums);
> >>
> >> @@ -50,6 +50,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
> >>       /* Convenience aliases */
> >>       const uint32_t domid = dcs->guest_domid;
> >>       const int restore_fd = dcs->libxc_fd;
> >> +    const int send_fd = dcs->send_fd;
> >>       libxl__domain_build_state *const state = &dcs->build_state;
> >>
> >>       unsigned cbflags =
> >> @@ -72,7 +73,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
> >>       shs->need_results = 1;
> >>       shs->toolstack_data_file = 0;
> >>
> >> -    run_helper(egc, shs, "--restore-domain", restore_fd, 0, 0,
> >> +    run_helper(egc, shs, "--restore-domain", restore_fd, send_fd, 0, 0,
> >>                  argnums, ARRAY_SIZE(argnums));
> >>   }
> >>
> >> @@ -96,7 +97,7 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_save_state *dss,
> >>       shs->caller_state = dss;
> >>       shs->need_results = 0;
> >>
> >> -    run_helper(egc, shs, "--save-domain", dss->fd,
> >> +    run_helper(egc, shs, "--save-domain", dss->fd, dss->recv_fd,
> >>                  NULL, 0,
> >>                  argnums, ARRAY_SIZE(argnums));
> >>       return;
> >> @@ -119,14 +120,29 @@ void libxl__save_helper_init(libxl__save_helper_state *shs)
> >>   }
> >>
> >>   /*----- helper execution -----*/
> >> +static int dup_fd_helper(libxl__gc *gc, int fd, const char *what)
> >> +{
> >> +    int dup_fd = fd;
> >> +
> >> +    if (fd <= 2) {
> >> +        dup_fd = dup(fd);
> >> +        if (dup_fd < 0) {
> >> +            LOGE(ERROR,"dup %s", what);
> >> +            exit(-1);
> >> +        }
> >> +    }
> >> +    libxl_fd_set_cloexec(CTX, dup_fd, 0);
> >> +
> >> +    return dup_fd;
> >> +}
> >>
> >>   static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
> >> -                       const char *mode_arg, int stream_fd,
> >> +                       const char *mode_arg, int stream_fd, int back_fd,
> >>                          const int *preserve_fds, int num_preserve_fds,
> >>                          const unsigned long *argnums, int num_argnums)
> >>   {
> >>       STATE_AO_GC(shs->ao);
> >> -    const char *args[4 + num_argnums];
> >> +    const char *args[5 + num_argnums];
> >>       const char **arg = args;
> >>       int i, rc;
> >>
> >> @@ -154,6 +170,7 @@ static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
> >>       *arg++ = getenv("LIBXL_SAVE_HELPER") ?: LIBEXEC_BIN "/" "libxl-save-helper";
> >>       *arg++ = mode_arg;
> >>       const char **stream_fd_arg = arg++;
> >> +    const char **back_fd_arg = arg++;
> >>       for (i=0; i<num_argnums; i++)
> >>           *arg++ = GCSPRINTF("%lu", argnums[i]);
> >>       *arg++ = 0;
> >> @@ -178,16 +195,12 @@ static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
> >>
> >>       pid_t pid = libxl__ev_child_fork(gc, &shs->child, helper_exited);
> >>       if (!pid) {
> >> -        if (stream_fd <= 2) {
> >> -            stream_fd = dup(stream_fd);
> >> -            if (stream_fd < 0) {
> >> -                LOGE(ERROR,"dup migration stream fd");
> >> -                exit(-1);
> >> -            }
> >> -        }
> >> -        libxl_fd_set_cloexec(CTX, stream_fd, 0);
> >> +        stream_fd = dup_fd_helper(gc, stream_fd, "migration stream fd");
> >>           *stream_fd_arg = GCSPRINTF("%d", stream_fd);
> >>
> >> +        back_fd = dup_fd_helper(gc, back_fd, "migration back channel fd");
> >> +        *back_fd_arg = GCSPRINTF("%d", back_fd);
> >> +
> >>           for (i=0; i<num_preserve_fds; i++)
> >>               if (preserve_fds[i] >= 0) {
> >>                   assert(preserve_fds[i] > 2);
> >> diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
> >> index 4c9d34c..9de5694 100644
> >> --- a/tools/libxl/libxl_save_helper.c
> >> +++ b/tools/libxl/libxl_save_helper.c
> >> @@ -235,6 +235,7 @@ static struct restore_callbacks helper_restore_callbacks;
> >>   int main(int argc, char **argv)
> >>   {
> >>       int r;
> >> +    int back_fd;
> >>
> >>   #define NEXTARG (++argv, assert(*argv), *argv)
> >>
> >> @@ -244,6 +245,7 @@ int main(int argc, char **argv)
> >>       if (!strcmp(mode,"--save-domain")) {
> >>
> >>           io_fd =                    atoi(NEXTARG);
> >> +        back_fd =                  atoi(NEXTARG);
> >>           uint32_t dom =             strtoul(NEXTARG,0,10);
> >>           uint32_t max_iters =       strtoul(NEXTARG,0,10);
> >>           uint32_t max_factor =      strtoul(NEXTARG,0,10);
> >> @@ -259,12 +261,14 @@ int main(int argc, char **argv)
> >>           setup_signals(save_signal_handler);
> >>
> >>           r = xc_domain_save2(xch, io_fd, dom, max_iters, max_factor, flags,
> >> -                           &helper_save_callbacks, hvm, checkpointed_stream);
> >> +                            &helper_save_callbacks, hvm, checkpointed_stream,
> >> +                            back_fd);
> >>           complete(r);
> >>
> >>       } else if (!strcmp(mode,"--restore-domain")) {
> >>
> >>           io_fd =                    atoi(NEXTARG);
> >> +        back_fd =                  atoi(NEXTARG);
> >>           uint32_t dom =             strtoul(NEXTARG,0,10);
> >>           unsigned store_evtchn =    strtoul(NEXTARG,0,10);
> >>           domid_t store_domid =      strtoul(NEXTARG,0,10);
> >> @@ -289,7 +293,7 @@ int main(int argc, char **argv)
> >>                                 store_domid, console_evtchn, &console_mfn,
> >>                                 console_domid, hvm, pae, superpages,
> >>                                 checkpointed,
> >> -                              &helper_restore_callbacks);
> >> +                              &helper_restore_callbacks, back_fd);
> >>           helper_stub_restore_results(store_mfn,console_mfn,0);
> >>           complete(r);
> >>
> >
> >
> > .
> >
> 

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure
  2015-07-16 10:37       ` Ian Campbell
@ 2015-07-16 11:10         ` Ian Jackson
  2015-07-16 11:19           ` Ian Campbell
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Jackson @ 2015-07-16 11:10 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Yang Hongyang

Ian Campbell writes ("Re: [Xen-devel] [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure"):
> On Wed, 2015-07-15 at 21:50 +0800, Yang Hongyang wrote:
> > Yes, the checkpoint device is an abstraction layer used by both Remus
> > and COLO. The abstraction layer is not aware of Remus or COLO; in Remus
> > or COLO we can use the container of cds to retrieve the Remus/COLO state.
> 
> This is because the cds callbacks receive a
> libxl__checkpoint_devices_state * but are specific to either Remus or
> COLO?
> 
> I think the usual way to solve that would be for the callback to take a
> void *data "closure" field, which is registered along with the callbacks
> and passed to all callbacks, or in this case perhaps you can get away
> with just including it in the cds itself.
> 
> Ian, what do you think?

This is rather an unusual situation.  Normally there are two patterns:


1.

Things like:

  static void device_disk_add(libxl__egc *egc, uint32_t domid,
                           libxl_device_disk *disk,
                           libxl__ao_device *aodev,
                           char *get_vdev(libxl__gc *, void *,
                                          xs_transaction_t),
                           void *get_vdev_user)

and similar patterns in much code in libxl and elsewhere.  This is the
normal case.  (It is of course essential to use this when there are
multiple call sites, so the void* data pointer might vary.)


2.

Things like (NB not very like real code):

  struct libxl_some_operation_state {
    libxl_inner_generic_operation_state igos;

  void someop_innerop_make_happen(libxl_some_operation_state *sos) {
    sos->igos.callback = someop_innerop_done;

  void someop_innerop_done(libxl_inner_generic_operation_state *igos) {
    sos = CONTAINER_OF(igos);

Here the callback someop_innerop_done can be sure that CONTAINER_OF is
correct because the callback is set up only in one place where it is
obvious that the igos is part of a sos.

IMO this is an exception to  the usual rule that you have to accompany
the function pointer with a void*.  The exception is justified because
it is very easy to see that  the code is correct.  And, if any mistake
is made, the setup is unconditional, so it will _always_ get the wrong
container and go wrong (which will hopefully be spotted in testing).



In this particular situation the plumbing that relates a particular
callback to the remus or colo state is rather more complicated.  I
don't think it will be as obvious that the appropriate CONTAINER_OF is
being used, let alone obvious that it's always the same.

OTOH for any particular callback the context pointer is supposed to be
a particular CONTAINER_OF.  It would be nice to write this in the
code.

I think in this case the best answer would be:

  struct libxl__checkpoint_devices_state {
      void (*postsuspend)(libxl__egc *egc, libxl__remus_device *dev);
      void *callbacks_context;

and in the callback

  void libxl__remus_devices_postsuspend(libxl__egc *egc,
                                libxl__checkpoint_devices_state *cds)
  {
      libxl__remus_state *rs = cds->callbacks_context;
      assert(cds == &rs->cds);

or some such.
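
A self-contained sketch of that arrangement, using simplified stand-in types
rather than the real libxl structures (every name below is hypothetical and
exists only to show the context pointer plus the assert cross-check):

```c
#include <assert.h>
#include <stddef.h>

/* Generic layer: knows nothing about who owns it. */
struct checkpoint_devices_state {
    void (*postsuspend)(struct checkpoint_devices_state *cds);
    void *callbacks_context;    /* set wherever the callbacks are registered */
};

/* One specific user of the generic layer, embedding the generic state. */
struct remus_state {
    int postsuspend_calls;
    struct checkpoint_devices_state cds;
};

static void remus_devices_postsuspend(struct checkpoint_devices_state *cds)
{
    struct remus_state *rs = cds->callbacks_context;

    /* Documents, and checks, which container this callback expects. */
    assert(cds == &rs->cds);
    rs->postsuspend_calls++;
}

/* The callback and its context are registered together, in one place. */
static void remus_setup(struct remus_state *rs)
{
    rs->postsuspend_calls = 0;
    rs->cds.postsuspend = remus_devices_postsuspend;
    rs->cds.callbacks_context = rs;
}

/* Generic code invokes the callback without knowing the owner's type. */
static void checkpoint_devices_postsuspend(struct checkpoint_devices_state *cds)
{
    cds->postsuspend(cds);
}
```

A second user would register its own callback with its own state as the
context, and the assert in each callback would catch a mismatched pairing.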


Thanks,
Ian.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 07/25] libxl/remus: init checkpoint_callback in Remus checkpoint callback
  2015-07-16 11:00         ` Yang Hongyang
@ 2015-07-16 11:16           ` Ian Campbell
  0 siblings, 0 replies; 101+ messages in thread
From: Ian Campbell @ 2015-07-16 11:16 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson

On Thu, 2015-07-16 at 19:00 +0800, Yang Hongyang wrote:
> 
> On 07/16/2015 06:32 PM, Ian Campbell wrote:
> > On Wed, 2015-07-15 at 20:35 +0800, Yang Hongyang wrote:
> >>
> >> On 07/15/2015 08:02 PM, Ian Campbell wrote:
> >>> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> >>>> init stream {read/write} state checkpoint_callback in Remus
> >>>> checkpoint callback.
> >>>
> >>> Why? Is this earlier or later than previously? Seems later?
> >>
> >> There's no functional change, it's just refactoring so that we can move
> >> all remus code into one file.
> >
> > That answers the why, thanks.
> >
> > But, it would be a no functional change if the initialisation was moving
> > from e.g. the very start of a function to the caller of that function
> > right before the call or from the end of a function to the caller right
> > after the call.
> >
> > But AFAICT this movement is a bit more than that, e.g. the init of
> > dcs->srs.checkpoint_callback has moved from near the end of
> > domcreate_bootloader_done to
> > libxl__remus_domain_restore_checkpoint_callback before a call to
> > libxl__stream_read_start_checkpoint, which doesn't have the property I
> > describe above, at least not in an obvious way.
> >
> > So there is either a span of time where the callback is no longer
> > initialised when it was before, or it is initialised for a larger span
> > than it was before (with the former having the larger potential for
> > issues).
> >
> > It's also not entirely clear that the new location of the initialisation
> > is traversed on all the same paths as before or, if it happens on fewer
> > paths, that those are exactly the ones which matter.
> >
> > Lastly it seems odd to split out the initialisation of only one member
> > of dcs->srs and dcs->sws into a different location to all the others,
> > especially putting it into the checkpoint callback (which is called
> > repeatedly).
> >
> > Perhaps what is really needed here is a new function to initialise a
> > dcs->srs for remus, and another for dcs->sws to be called exactly where
> > the init happens today?
> 
> The checkpoint_callback() is only used by Remus, as you can see from the
> initialisation lines:
> dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
> dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
> 
> These two functions, remus_checkpoint_stream_written and
> remus_checkpoint_stream_done, are only called when libxc calls the
> checkpoint() callback, so there should be no problem
> with the move. Given that it's only used by Remus, initialising
> it in the previous place is not a good idea IMO.

I suggested a pair of Remus specific functions to init an sws or srs
stream for Remus, called in the same place, which is not the same thing
as just leaving it there, nor as initialising this state from the
callback itself. You can call that init function on the same condition
as I suppose you will eventually apply to:
    callbacks->checkpoint = libxl__remus_domain_checkpoint_callback;

Or perhaps have the remus (and other checkpoint types if necessary)
setup code fill in two callbacks in a higher level struct which are
called at the appropriate points to init dss->{sws,srs}?
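
A minimal sketch of the first suggestion, assuming simplified stand-in types.
The struct layouts and the remus_stream_*_init helper names are hypothetical;
only the two callback names are taken from the lines quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real stream states carry many more fields. */
struct stream_write_state {
    void (*checkpoint_callback)(struct stream_write_state *sws, int rc);
};
struct stream_read_state {
    void (*checkpoint_callback)(struct stream_read_state *srs, int rc);
};

static void remus_checkpoint_stream_written(struct stream_write_state *sws,
                                            int rc)
{
    (void)sws; (void)rc;   /* body omitted in this sketch */
}

static void remus_checkpoint_stream_done(struct stream_read_state *srs,
                                         int rc)
{
    (void)srs; (void)rc;   /* body omitted in this sketch */
}

/* Remus-specific init helpers, called once where the stream is set up. */
static void remus_stream_write_init(struct stream_write_state *sws)
{
    sws->checkpoint_callback = remus_checkpoint_stream_written;
}

static void remus_stream_read_init(struct stream_read_state *srs)
{
    srs->checkpoint_callback = remus_checkpoint_stream_done;
}

/* Returns 1 if both callbacks ended up initialised, 0 otherwise. */
static int demo_stream_init(int is_remus)
{
    struct stream_write_state sws = { NULL };
    struct stream_read_state srs = { NULL };

    if (is_remus) {        /* same condition that selects the Remus callbacks */
        remus_stream_write_init(&sws);
        remus_stream_read_init(&srs);
    }
    return sws.checkpoint_callback != NULL && srs.checkpoint_callback != NULL;
}
```

This keeps the initialisation at setup time, rather than inside a callback
which is called repeatedly, while still letting the Remus code live in its
own file.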

Ian.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure
  2015-07-16 11:10         ` Ian Jackson
@ 2015-07-16 11:19           ` Ian Campbell
  0 siblings, 0 replies; 101+ messages in thread
From: Ian Campbell @ 2015-07-16 11:19 UTC (permalink / raw)
  To: Ian Jackson
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Yang Hongyang

On Thu, 2015-07-16 at 12:10 +0100, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure"):
> > On Wed, 2015-07-15 at 21:50 +0800, Yang Hongyang wrote:
> > > Yes, checkpoint device is an abstract layer, used by both Remus & colo.
> > > In the abstract layer we are not aware of Remus or colo; in Remus or colo,
> > > we can use CONTAINER_OF on the cds to retrieve the Remus/colo state.
> > 
> > This is because the cds callbacks receive a
> > libxl__checkpoint_devices_state * but are specific to either Remus or
> > Colo?
> > 
> > I think the usual way to solve that would be for the callback to take a
> > void *data "closure" field, which is registered along with the callbacks
> > and passed to all callbacks, or in this case perhaps you can get away
> > with just including it in the cds itself.
> > 
> > Ian, what do you think?
> 
> This is rather an unusual situation.  Normally there are two patterns:
> 
> 
> 1.
> 
> Things like:
> 
>   static void device_disk_add(libxl__egc *egc, uint32_t domid,
>                            libxl_device_disk *disk,
>                            libxl__ao_device *aodev,
>                            char *get_vdev(libxl__gc *, void *,
>                                           xs_transaction_t),
>                            void *get_vdev_user)
> 
> and similar patterns in much code in libxl and elsewhere.  This is the
> normal case.  (It is of course essential to use this when there are
> multiple call sites, so the void* data pointer might vary.)
> 
> 
> 2.
> 
> Things like (NB not very like real code):
> 
>   struct libxl_some_operation_state {
>     libxl_inner_generic_operation_state igos;
> 
>   void someop_innerop_make_happen(libxl_some_operation_state *sos) {
>     sos->igos.callback = someop_innerop_done;
> 
>   void someop_innerop_done(libxl_inner_generic_operation_state *igos) {
>     sos = CONTAINER_OF(igos);
> 
> Here the callback someop_innerop_done can be sure that CONTAINER_OF is
> correct because the callback is set up only in one place where it is
> obvious that the igos is part of a sos.
> 
> IMO this is an exception to  the usual rule that you have to accompany
> the function pointer with a void*.  The exception is justified because
> it is very easy to see that  the code is correct.  And, if any mistake
> is made, the setup is unconditional, so it will _always_ get the wrong
> container and go wrong (which will hopefully be spotted in testing).
> 
> 
> 
> In this particular situation the plumbing that relates a particular
> callback to the remus or colo state is rather more complicated.  I
> don't think it will be as obvious that the appropriate CONTAINER_OF is
> being used, let alone obvious that it's always the same.
> 
> OTOH for any particular callback the context pointer is supposed to be
> a particular CONTAINER_OF.  It would be nice to write this in the
> code.
> 
> I think in this case the best answer would be:
> 
>   struct libxl__checkpoint_devices_state {
>       void (*postsuspend)(libxl__egc *egc, libxl__remus_device *dev);

ITYM s/libxl__remus_device/libxl__checkpoint_devices_state here?

If so then this is pretty much what I meant by "in this case perhaps you
can get away with just including it in the cds itself", but more clearly
explained.

>       void *callbacks_context;
> 
> and in the callback
> 
>   void libxl__remus_devices_postsuspend(libxl__egc *egc,
>                                 libxl__checkpoint_devices_state *cds)
>   {
>       libxl__remus_state *rs = cds->callbacks_context;
>       assert(cds == &rs->cds);
> 
> or some such.
> 
> 
> Thanks,
> Ian.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 15/25] tools/libxl: check QEMU state before resume dm
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 15/25] tools/libxl: check QEMU state before resume dm Yang Hongyang
  2015-07-15 12:48   ` Ian Campbell
@ 2015-07-16 14:43   ` Wei Liu
  2015-07-16 15:43     ` Yang Hongyang
  1 sibling, 1 reply; 101+ messages in thread
From: Wei Liu @ 2015-07-16 14:43 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, xen-devel, guijianfeng, rshriram, Ian Jackson

On Wed, Jul 15, 2015 at 03:45:41PM +0800, Yang Hongyang wrote:
> check QEMU state before resume dm on QEMU_XEN_TRADITIONAL.
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>

I think it was me who suggested this and this patch looks correct.

Acked-by: Wei Liu <wei.liu2@citrix.com>
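
For reference, the guard added by this patch boils down to the check sketched
below.  The helper name dm_needs_continue is hypothetical, and the state
string stands in for the value read from the device model's xenstore "state"
node.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if a "continue" command should be sent to the device model.
 * state is NULL when the state node could not be read.
 */
static int dm_needs_continue(const char *state)
{
    return state != NULL && strcmp(state, "paused") == 0;
}
```

With this check, a device model that is already running is left alone instead
of being sent a redundant "continue".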

> ---
>  tools/libxl/libxl_dom_suspend.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
> index 6f04c26..686a49b 100644
> --- a/tools/libxl/libxl_dom_suspend.c
> +++ b/tools/libxl/libxl_dom_suspend.c
> @@ -434,11 +434,20 @@ static void domain_suspend_callback_common_done(libxl__egc *egc,
>  
>  int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
>  {
> +    char *path;
> +    char *state;
>  
>      switch (libxl__device_model_version_running(gc, domid)) {
>      case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
> -        libxl__qemu_traditional_cmd(gc, domid, "continue");
> -        libxl__wait_for_device_model_deprecated(gc, domid, "running", NULL, NULL, NULL);
> +        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
> +
> +        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
> +        state = libxl__xs_read(gc, XBT_NULL, path);
> +        if (state != NULL && !strcmp(state, "paused")) {
> +            libxl__qemu_traditional_cmd(gc, domid, "continue");
> +            libxl__wait_for_device_model_deprecated(gc, domid, "running",
> +                                                    NULL, NULL, NULL);
> +        }
>          break;
>      }
>      case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
> -- 
> 1.9.1
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests
  2015-07-16  5:57     ` Yang Hongyang
@ 2015-07-16 15:40       ` Ian Jackson
  2015-07-16 16:15         ` Yang Hongyang
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Jackson @ 2015-07-16 15:40 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, xen-devel, guijianfeng, rshriram

Yang Hongyang writes ("Re: [Xen-devel] [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests"):
...
> This is used for secondary, at a checkpoint, we do:

Thanks for this explanation, which helps somewhat.


However, Ian Campbell asked for changes to the commit message to
better explain what is going on.  I don't think what you have sent here
is intended as a new commit message.  So you aren't addressing his
concern in the way that he is expecting.

Ian wrote, in response to v3:

        I'm afraid I think the commit message for this patch (and the
        associated doc comments) need revisiting almost from scratch,
        to clearly explain what this patch is doing and why and what
        the constraints on the new functionality will be.
        
        At the moment it mostly talks in a confusing way about the old
        behaviour and adds very specific assumptions to the new
        function which are not made clear.

To `revisit the commit message almost from scratch' means to
completely rewrite the commit message.

In v4 this patch had an additional paragraph in the commit message,
but the commit message was otherwise substantially the same.  So in
response to v4 Ian C said:

        I'm afraid that the addition of [an extra] paragraph has not
        really addressed my comment on v3:

and then he requoted the text above.

Your response seems again to miss the main point of Ian's comment.


If you are unable to rewrite the commit message right now because you
don't understand what is missing or wrong, then we can discuss that
more.  We may even be able to help write it.

However, it is not appropriate to simply ignore Ian Campbell's very
clearly stated request for a complete overhaul of the commit message.

While it is a good spirit of helpfulness to provide additional
explanation in an email, that is not the same thing.


Moving forward, what we need is a commit message which explains, in
Ian Campbell's words:

  what this patch is doing

    That is, what the change in behaviour is.  This includes clearly
    distinguishing old behaviour, before the patch, from new
    behaviour, after the patch.  I appreciate that there may be
    language problems which are making this more difficult - I think
    your native language may not use tenses the way English does.  So
    we can help you with the language, but we need the old and new
    behaviours to be clearly marked in your message.

  why

    You have already gone some way to answering this but the
    information needs to be folded into the commit message.

  what the constraints on the new functionality will be.

    It appears that you are supporting slow path resume for all HVM
    guests.  Is that true ?  Are there any cases left unhandled ?


I suggest that the best thing for you to do next is either to reply to
me with ideally (a) a draft of a new commit message for the patch, or
if (as I suspect) you don't feel confident to do that, (b) questions
to help you understand what we are looking for, or (c) a request for
help with drafting.


> While the XEN_DOMCTL_resumedomain hypercall for HVM is a NOP, it occurs
> to me that we could do this in a different way. We can modify
> libxl__domain_resume so that, if the domain is HVM, we skip the
> xc_domain_resume call. What do you think?

Until we understand the answers to the questions above, it will be
difficult for us to give a sensible opinion.


Thanks,
Ian.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 15/25] tools/libxl: check QEMU state before resume dm
  2015-07-16 14:43   ` Wei Liu
@ 2015-07-16 15:43     ` Yang Hongyang
  0 siblings, 0 replies; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16 15:43 UTC (permalink / raw)
  To: Wei Liu
  Cc: Ian Campbell, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson



On 07/16/2015 10:43 PM, Wei Liu wrote:
> On Wed, Jul 15, 2015 at 03:45:41PM +0800, Yang Hongyang wrote:
>> check QEMU state before resume dm on QEMU_XEN_TRADITIONAL.
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>
> I think it was me who suggested this and this patch looks correct.
>
> Acked-by: Wei Liu <wei.liu2@citrix.com>

Thanks!

>
>> ---
>>   tools/libxl/libxl_dom_suspend.c | 13 +++++++++++--
>>   1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
>> index 6f04c26..686a49b 100644
>> --- a/tools/libxl/libxl_dom_suspend.c
>> +++ b/tools/libxl/libxl_dom_suspend.c
>> @@ -434,11 +434,20 @@ static void domain_suspend_callback_common_done(libxl__egc *egc,
>>
>>   int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid)
>>   {
>> +    char *path;
>> +    char *state;
>>
>>       switch (libxl__device_model_version_running(gc, domid)) {
>>       case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
>> -        libxl__qemu_traditional_cmd(gc, domid, "continue");
>> -        libxl__wait_for_device_model_deprecated(gc, domid, "running", NULL, NULL, NULL);
>> +        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
>> +
>> +        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
>> +        state = libxl__xs_read(gc, XBT_NULL, path);
>> +        if (state != NULL && !strcmp(state, "paused")) {
>> +            libxl__qemu_traditional_cmd(gc, domid, "continue");
>> +            libxl__wait_for_device_model_deprecated(gc, domid, "running",
>> +                                                    NULL, NULL, NULL);
>> +        }
>>           break;
>>       }
>>       case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
>> --
>> 1.9.1
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 13/25] migration/save: pass checkpointed_stream from libxl to libxc
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 13/25] migration/save: pass checkpointed_stream from libxl to libxc Yang Hongyang
  2015-07-15 12:38   ` Ian Campbell
@ 2015-07-16 16:10   ` Wei Liu
  2015-07-16 16:24     ` Yang Hongyang
  1 sibling, 1 reply; 101+ messages in thread
From: Wei Liu @ 2015-07-16 16:10 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, xen-devel, guijianfeng, rshriram, Ian Jackson

On Wed, Jul 15, 2015 at 03:45:39PM +0800, Yang Hongyang wrote:
> Pass checkpointed_stream from libxl to libxc.
> It won't affect legacy migration because legacy migration
> won't use this param.
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
[...]
>  
> +    if (dss->checkpointed_stream && !r_info) {

Please explicitly check for _NONE type instead of relying on it being 0.
Arguably the actual value is not going to change in the future but it's
better to be explicit.

Wei.

> +        LOG(ERROR, "Migration stream is checkpointed, but there's no "
> +                   "checkpoint info!");
> +        goto out;
> +    }
> +
>      dss->rc = 0;
>      logdirty_init(&dss->logdirty);
>      dsps->ao = ao;
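
The explicit check requested above might look roughly like the sketch below.
The enum and its constant names are hypothetical stand-ins for the
checkpointed stream type, and r_info is modelled as an opaque pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enum standing in for the checkpointed stream type. */
enum checkpointed_stream {
    CHECKPOINTED_STREAM_NONE = 0,
    CHECKPOINTED_STREAM_REMUS,
    CHECKPOINTED_STREAM_COLO
};

/* Returns 0 if the combination is acceptable, -1 otherwise. */
static int check_checkpoint_args(enum checkpointed_stream type,
                                 const void *r_info)
{
    /* Compare against _NONE explicitly rather than relying on it being 0. */
    if (type != CHECKPOINTED_STREAM_NONE && r_info == NULL)
        return -1;
    return 0;
}
```

The behaviour is unchanged while _NONE is 0; the comparison simply no longer
depends on that value.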

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 13/25] migration/save: pass checkpointed_stream from libxl to libxc
  2015-07-16  6:05     ` Yang Hongyang
  2015-07-16 10:47       ` Ian Campbell
@ 2015-07-16 16:13       ` Wei Liu
  2015-07-16 16:21         ` Yang Hongyang
  1 sibling, 1 reply; 101+ messages in thread
From: Wei Liu @ 2015-07-16 16:13 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, xen-devel, guijianfeng, rshriram, Ian Jackson

On Thu, Jul 16, 2015 at 02:05:45PM +0800, Yang Hongyang wrote:
> 
> 
> On 07/15/2015 08:38 PM, Ian Campbell wrote:
> >On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
> >>Pass checkpointed_stream from libxl to libxc.
> >>It won't affect legacy migration because legacy migration
> >>won't use this param.
> >>
> >>Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> >>CC: Ian Campbell <Ian.Campbell@citrix.com>
> >>CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> >>CC: Wei Liu <wei.liu2@citrix.com>
> >>CC: Andrew Cooper <andrew.cooper3@citrix.com>
> >>---
> >>  tools/libxc/include/xenguest.h   |  9 ++++++---
> >>  tools/libxc/xc_domain_save.c     |  6 ++++--
> >>  tools/libxc/xc_nomigrate.c       |  3 ++-
> >>  tools/libxc/xc_sr_common.h       |  2 +-
> >>  tools/libxc/xc_sr_save.c         |  5 +++--
> >>  tools/libxl/libxl.c              |  2 ++
> >>  tools/libxl/libxl_dom_save.c     | 11 ++++++++---
> >>  tools/libxl/libxl_internal.h     |  1 +
> >>  tools/libxl/libxl_save_callout.c |  2 +-
> >>  tools/libxl/libxl_save_helper.c  |  3 ++-
> >>  10 files changed, 30 insertions(+), 14 deletions(-)
> >>
> >>diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
> >>index e95af54..6e24b6c 100644
> >>--- a/tools/libxc/include/xenguest.h
> >>+++ b/tools/libxc/include/xenguest.h
> >>@@ -30,7 +30,6 @@
> >>  #define XCFLAGS_HVM       (1 << 2)
> >>  #define XCFLAGS_STDVGA    (1 << 3)
> >>  #define XCFLAGS_CHECKPOINT_COMPRESS    (1 << 4)
> >>-#define XCFLAGS_CHECKPOINTED    (1 << 5)
> >>
> >>  #define X86_64_B_SIZE   64
> >>  #define X86_32_B_SIZE   32
> >>@@ -85,16 +84,20 @@ struct save_callbacks {
> >>   * @parm xch a handle to an open hypervisor interface
> >>   * @parm fd the file descriptor to save a domain to
> >>   * @parm dom the id of the domain
> >>+ * @parm checkpointed_stream non-zero if the far end of the stream is using
> >>+ *       checkpointing
> >
> >Do (or will) specific non-zero values have any meaning to the libxc
> >layer? i.e. does it have any knowledge of COLO vs. Remus as the libxl
> >enum added in the last patch does?
> 
> Yes, libxc side should be aware of the type of checkpointed_stream (Remus
> or COLO).
> 
> I think it is better to document the non-zero values here?
> for example:
>      * @parm checkpointed_stream non-zero if the far end of the stream is using
>      *                           checkpointing
>      *                           0 no checkpointed stream
>      *                           1 Remus
>      *                           2 COLO
> 

These should be proper #defines instead of being buried in comments --
so that you can use them in code.
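
Something along these lines, as a sketch only.  The macro names below are
illustrative assumptions and may not match what is eventually committed; the
values follow the 0/1/2 mapping proposed in the quoted comment.

```c
#include <assert.h>
#include <string.h>

/* Illustrative names; values follow the mapping proposed above. */
#define XC_MIG_STREAM_NONE  0   /* no checkpointed stream */
#define XC_MIG_STREAM_REMUS 1
#define XC_MIG_STREAM_COLO  2

/* Example use in code: map a stream type to a printable name. */
static const char *stream_type_name(int checkpointed_stream)
{
    switch (checkpointed_stream) {
    case XC_MIG_STREAM_NONE:  return "none";
    case XC_MIG_STREAM_REMUS: return "Remus";
    case XC_MIG_STREAM_COLO:  return "COLO";
    default:                  return "unknown";
    }
}
```

The doc comment can then refer to the macro names instead of bare numbers.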

Wei.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests
  2015-07-16 15:40       ` Ian Jackson
@ 2015-07-16 16:15         ` Yang Hongyang
  2015-07-16 16:27           ` Ian Jackson
  0 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16 16:15 UTC (permalink / raw)
  To: Ian Jackson
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, xen-devel, guijianfeng, rshriram



On 07/16/2015 11:40 PM, Ian Jackson wrote:
> Yang Hongyang writes ("Re: [Xen-devel] [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests"):
> ...
>> This is used for secondary, at a checkpoint, we do:
>
> Thanks for this explanation, which helps somewhat.
>
>
> However, Ian Campbell asked for changes to the commit message to
> better explain what is going on.  I don't think what you have sent here
> is intended as a new commit message.  So you aren't addressing his
> concern in the way that he is expecting.
>
> Ian wrote, in response to v3:
>
>          I'm afraid I think the commit message for this patch (and the
>          associated doc comments) need revisiting almost from scratch,
>          to clearly explain what this patch is doing and why and what
>          the constraints on the new functionality will be.
>
>          At the moment it mostly talks in a confusing way about the old
>          behaviour and adds very specific assumptions to the new
>          function which are not made clear.
>
> To `revisit the commit message almost from scratch' means to
> completely rewrite the commit message.
>
> In v4 this patch had an additional paragraph in the commit message,
> but the commit message was otherwise substantially the same.  So in
> response to v4 Ian C said:
>
>          I'm afraid that the addition of [an extra] paragraph has not
>          really addressed my comment on v3:
>
> and then he requoted the text above.
>
> Your response seems again to miss the main point of Ian's comment.
>
>
> If you are unable to rewrite the commit message right now because you
> don't understand what is missing or wrong, then we can discuss that
> more.  We may even be able to help write it.
>
> However, it is not appropriate to simply ignore Ian Campbell's very
> clearly stated request for a complete overhaul of the commit message.

I'm sorry, I didn't mean to ignore it.

>
> While it is a good spirit of helpfulness to provide additional
> explanation in an email, that is not the same thing.
>
>
> Moving forward, what we need is a commit message which explains, in
> Ian Campbell's words:
>
>    what this patch is doing
>
>      That is, what the change in behaviour is.  This includes clearly
>      distinguishing old behaviour, before the patch, from new
>      behaviour, after the patch.  I appreciate that there may be
>      language problems which are making this more difficult - I think
>      your native language may not use tenses the way English does.  So
>      we can help you with the language, but we need the old and new
>      behaviours to be clearly marked in your message.

I thought this was addressed in the commit message. Sorry again
for my poor English and for not making it clear; I would appreciate your
help.

>
>    why
>
>      You have already gone some way to answering this but the
>      information needs to be folded into the commit message.
>
>    what the constraints on the new functionality will be.
>
>      It appears that you are supporting slow path resume for all HVM
>      guests.  Is that true ?  Are there any cases left unhandled ?

For the first question, yes. For the second, sorry, I didn't catch your
question: did you mean that in some cases resuming HVM through the slow path
will be unhandled?

>
>
> I suggest that the best thing for you to do next is either to reply to
> me with ideally (a) a draft of a new commit message for the patch, or
> if (as I suspect) you don't feel confident to do that, (b) questions
> to help you understand what we are looking for, or (c) a request for
> help with drafting.

(c): I would appreciate your help with drafting, thank you.

>
>
>> While the XEN_DOMCTL_resumedomain hypercall for HVM is a NOP, it occurs
>> to me that we could do this in a different way. We can modify
>> libxl__domain_resume so that, if the domain is HVM, we skip the
>> xc_domain_resume call. What do you think?
>
> Until we understand the answers to the questions above, it will be
> difficult for us to give a sensible opinion.
>
>
> Thanks,
> Ian.
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 13/25] migration/save: pass checkpointed_stream from libxl to libxc
  2015-07-16 16:13       ` Wei Liu
@ 2015-07-16 16:21         ` Yang Hongyang
  2015-07-16 16:39           ` Wei Liu
  0 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16 16:21 UTC (permalink / raw)
  To: Wei Liu
  Cc: Ian Campbell, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson



On 07/17/2015 12:13 AM, Wei Liu wrote:
> On Thu, Jul 16, 2015 at 02:05:45PM +0800, Yang Hongyang wrote:
>>
>>
>> On 07/15/2015 08:38 PM, Ian Campbell wrote:
>>> On Wed, 2015-07-15 at 15:45 +0800, Yang Hongyang wrote:
>>>> Pass checkpointed_stream from libxl to libxc.
>>>> It won't affect legacy migration because legacy migration
>>>> won't use this param.
>>>>
>>>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>>>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>>>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>>>> CC: Wei Liu <wei.liu2@citrix.com>
>>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>>> ---
>>>>   tools/libxc/include/xenguest.h   |  9 ++++++---
>>>>   tools/libxc/xc_domain_save.c     |  6 ++++--
>>>>   tools/libxc/xc_nomigrate.c       |  3 ++-
>>>>   tools/libxc/xc_sr_common.h       |  2 +-
>>>>   tools/libxc/xc_sr_save.c         |  5 +++--
>>>>   tools/libxl/libxl.c              |  2 ++
>>>>   tools/libxl/libxl_dom_save.c     | 11 ++++++++---
>>>>   tools/libxl/libxl_internal.h     |  1 +
>>>>   tools/libxl/libxl_save_callout.c |  2 +-
>>>>   tools/libxl/libxl_save_helper.c  |  3 ++-
>>>>   10 files changed, 30 insertions(+), 14 deletions(-)
>>>>
>>>> diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
>>>> index e95af54..6e24b6c 100644
>>>> --- a/tools/libxc/include/xenguest.h
>>>> +++ b/tools/libxc/include/xenguest.h
>>>> @@ -30,7 +30,6 @@
>>>>   #define XCFLAGS_HVM       (1 << 2)
>>>>   #define XCFLAGS_STDVGA    (1 << 3)
>>>>   #define XCFLAGS_CHECKPOINT_COMPRESS    (1 << 4)
>>>> -#define XCFLAGS_CHECKPOINTED    (1 << 5)
>>>>
>>>>   #define X86_64_B_SIZE   64
>>>>   #define X86_32_B_SIZE   32
>>>> @@ -85,16 +84,20 @@ struct save_callbacks {
>>>>    * @parm xch a handle to an open hypervisor interface
>>>>    * @parm fd the file descriptor to save a domain to
>>>>    * @parm dom the id of the domain
>>>> + * @parm checkpointed_stream non-zero if the far end of the stream is using
>>>> + *       checkpointing
>>>
>>> Do (or will) specific non-zero values have any meaning to the libxc
>>> layer? i.e. does it have any knowledge of COLO vs. Remus as the libxl
>>> enum added in the last patch does?
>>
>> Yes, libxc side should be aware of the type of checkpointed_stream (Remus
>> or COLO).
>>
>> I think it is better to document the non-zero values here?
>> for example:
>>       * @parm checkpointed_stream non-zero if the far end of the stream is using
>>       *                           checkpointing
>>       *                           0 no checkpointed stream
>>       *                           1 Remus
>>       *                           2 COLO
>>
>
> These should be proper #defines instead of being buried in comments --
> so that you can use them in code.

Those defines are introduced later in the COLO series. I will move them
here and follow Ian C's suggestion of using an XC_ prefix, or I can move
this patch down to the COLO series.

>
> Wei.
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 13/25] migration/save: pass checkpointed_stream from libxl to libxc
  2015-07-16 16:10   ` Wei Liu
@ 2015-07-16 16:24     ` Yang Hongyang
  2015-07-16 16:37       ` Wei Liu
  0 siblings, 1 reply; 101+ messages in thread
From: Yang Hongyang @ 2015-07-16 16:24 UTC (permalink / raw)
  To: Wei Liu
  Cc: Ian Campbell, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, guijianfeng, rshriram, Ian Jackson



On 07/17/2015 12:10 AM, Wei Liu wrote:
> On Wed, Jul 15, 2015 at 03:45:39PM +0800, Yang Hongyang wrote:
>> Pass checkpointed_stream from libxl to libxc.
>> It won't affect legacy migration because legacy migration
>> won't use this param.
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
> [...]
>>
>> +    if (dss->checkpointed_stream && !r_info) {
>
> Please explicitly check for _NONE type instead of relying on it being 0.
> Arguably the actual value is not going to change in the future but it's
> better to be explicit.

I think you mean checking xxx != _NONE here, right? Will fix.

>
> Wei.
>
>> +        LOG(ERROR, "Migration stream is checkpointed, but there's no "
>> +                   "checkpoint info!");
>> +        goto out;
>> +    }
>> +
>>       dss->rc = 0;
>>       logdirty_init(&dss->logdirty);
>>       dsps->ao = ao;
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH v4 --for 4.6 COLOPre 16/25] tools/libxl: Update libxl_domain_unpause() to support qemu-xen
  2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 16/25] tools/libxl: Update libxl_domain_unpause() to support qemu-xen Yang Hongyang
  2015-07-15 12:50   ` Ian Campbell
@ 2015-07-16 16:26   ` Wei Liu
  1 sibling, 0 replies; 101+ messages in thread
From: Wei Liu @ 2015-07-16 16:26 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, xen-devel, guijianfeng, rshriram, Ian Jackson

On Wed, Jul 15, 2015 at 03:45:42PM +0800, Yang Hongyang wrote:
> Currently, libxl_domain_unpause() only supports
> qemu-xen-traditional. Update it to support qemu-xen.
> We use libxl__domain_resume_device_model to unpause guest dm.
> 
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>

I think I suggested this, too.

With Ian's comment addressed:

Acked-by: Wei Liu <wei.liu2@citrix.com>

> ---
>  tools/libxl/libxl.c | 15 +++++----------
>  1 file changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 5b2d045..799aead 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -941,8 +941,6 @@ out:
>  int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
>  {
>      GC_INIT(ctx);
> -    char *path;
> -    char *state;
>      int ret, rc = 0;
>  
>      libxl_domain_type type = libxl__domain_type(gc, domid);
> @@ -952,14 +950,11 @@ int libxl_domain_unpause(libxl_ctx *ctx, uint32_t domid)
>      }
>  
>      if (type == LIBXL_DOMAIN_TYPE_HVM) {
> -        uint32_t dm_domid = libxl_get_stubdom_id(ctx, domid);
> -
> -        path = libxl__device_model_xs_path(gc, dm_domid, domid, "/state");
> -        state = libxl__xs_read(gc, XBT_NULL, path);
> -        if (state != NULL && !strcmp(state, "paused")) {
> -            libxl__qemu_traditional_cmd(gc, domid, "continue");
> -            libxl__wait_for_device_model_deprecated(gc, domid, "running",
> -                                         NULL, NULL, NULL);
> +        rc = libxl__domain_resume_device_model(gc, domid);
> +        if (rc < 0) {
> +            LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "failed to unpause device model "
> +                       "for domain %u:%d", domid, rc);
> +            goto out;
>          }
>      }
>      ret = xc_domain_unpause(ctx->xch, domid);
> -- 
> 1.9.1
> 
> 


* Re: [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests
  2015-07-16 16:15         ` Yang Hongyang
@ 2015-07-16 16:27           ` Ian Jackson
  2015-12-15  2:05             ` Wen Congyang
  0 siblings, 1 reply; 101+ messages in thread
From: Ian Jackson @ 2015-07-16 16:27 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, xen-devel, guijianfeng, rshriram

Yang Hongyang writes ("Re: [Xen-devel] [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests"):
> On 07/16/2015 11:40 PM, Ian Jackson wrote:
> >    what this patch is doing
> >
> >      That is, what the change in behaviour is.  This includes clearly
> >      distinguishing old behaviour, before the patch, from new
> >      behaviour, after the patch.  I appreciate that there may be
> >      language problems which are making this more difficult - I think
> >      your native language may not use tenses the way English does.  So
> >      we can help you with the language, but we need the old and new
> >      behaviours to be clearly marked in your message.
> 
> I thought this is being addressed in the commit message, sorry again
> for my poor English and not make it clear, I would appreciate your
> help.

Right.  Thanks.  I hope we can work on this together.  I appreciate
that working in a non-native language is difficult.

OK, at the moment I find the existing proposed commit message unclear
about before-and-after.  I'm not sure I can write it correctly.  Can I
make a suggestion ?  How about you send me a copy of it with
the different parts explicitly marked BEFORE: and AFTER: ?

> >    what the constraints on the new functionality will be.
> >
> >      It appears that you are supporting slow path resume for all HVM
> >      guests.  Is that true ?  Are there any cases left unhandled ?
> 
> For the first question, yes. For second, Sorry that I don't catch
> your question, did you mean in some cases resuming HVM through slow
> path will be unhandled?

What I mean is: I think that this patch has this overall effect:

   BEFORE: HVM resume for slow path does not work

   AFTER: HVM resume for slow path does work

But I have questions.  I don't know in what way it "does not work".
What happens instead ?

And, another question: is it true that

   AFTER: HVM resume for slow path does work in all cases

or

   AFTER: HVM resume for slow path works in some cases (specify!)
          but in other cases it (does something else - what?)

Does that make sense of my question ?


Thanks,
Ian.


* Re: [PATCH v4 --for 4.6 COLOPre 13/25] migration/save: pass checkpointed_stream from libxl to libxc
  2015-07-16 16:24     ` Yang Hongyang
@ 2015-07-16 16:37       ` Wei Liu
  0 siblings, 0 replies; 101+ messages in thread
From: Wei Liu @ 2015-07-16 16:37 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: Wei Liu, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, xen-devel, guijianfeng, rshriram, Ian Jackson

On Fri, Jul 17, 2015 at 12:24:11AM +0800, Yang Hongyang wrote:
> 
> 
> On 07/17/2015 12:10 AM, Wei Liu wrote:
> >On Wed, Jul 15, 2015 at 03:45:39PM +0800, Yang Hongyang wrote:
> >>Pass checkpointed_stream from libxl to libxc.
> >>It won't affect legacy migration because legacy migration
> >>won't use this param.
> >>
> >>Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> >>CC: Ian Campbell <Ian.Campbell@citrix.com>
> >>CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> >>CC: Wei Liu <wei.liu2@citrix.com>
> >>CC: Andrew Cooper <andrew.cooper3@citrix.com>
> >>---
> >[...]
> >>
> >>+    if (dss->checkpointed_stream && !r_info) {
> >
> >Please explicitly check for _NONE type instead of relying on it being 0.
> >Arguably the actual value is not going to change in the future but it's
> >better to be explicit.
> 
> I think you mean check xxx != _NONE here, right? will fix.
> 

Yes, that's what I meant.
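
A minimal sketch of the explicit comparison agreed on above. The enum,
struct and function names here are hypothetical stand-ins, not the
identifiers from the patch; only the "compare against _NONE rather than
test for non-zero" idea comes from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the checkpointed-stream enum from patch 12. */
typedef enum {
    CHECKPOINTED_STREAM_NONE = 0,
    CHECKPOINTED_STREAM_REMUS = 1,
} checkpointed_stream;

/* Hypothetical stand-in for the checkpoint info passed along with a save. */
struct remus_info {
    int interval_ms;
};

/*
 * True when the stream is checkpointed but no checkpoint info was given.
 * The comparison names _NONE explicitly instead of relying on it being 0.
 */
static bool missing_checkpoint_info(checkpointed_stream s,
                                    const struct remus_info *r_info)
{
    return s != CHECKPOINTED_STREAM_NONE && r_info == NULL;
}

/* Exercise the three interesting combinations. */
static void self_check(void)
{
    struct remus_info info = { 100 };

    assert(missing_checkpoint_info(CHECKPOINTED_STREAM_REMUS, NULL));
    assert(!missing_checkpoint_info(CHECKPOINTED_STREAM_REMUS, &info));
    assert(!missing_checkpoint_info(CHECKPOINTED_STREAM_NONE, NULL));
}
```

The behaviour is identical to the non-zero test as long as _NONE stays 0;
the explicit form simply keeps working if the enum values are ever
reordered.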

Wei.


* Re: [PATCH v4 --for 4.6 COLOPre 13/25] migration/save: pass checkpointed_stream from libxl to libxc
  2015-07-16 16:21         ` Yang Hongyang
@ 2015-07-16 16:39           ` Wei Liu
  0 siblings, 0 replies; 101+ messages in thread
From: Wei Liu @ 2015-07-16 16:39 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: Wei Liu, Ian Campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, xen-devel, guijianfeng, rshriram, Ian Jackson

On Fri, Jul 17, 2015 at 12:21:11AM +0800, Yang Hongyang wrote:
[...]
> >>>Do (or will) specific non-zero values have any meaning to the libxc
> >>>layer? i.e. does it have any knowledge of COLO vs. Remus as the libxl
> >>>enum added in the last patch does?
> >>
> >>Yes, libxc side should be aware of the type of checkpointed_stream (Remus
> >>or COLO).
> >>
> >>I think it is better to document the non-zero values here?
> >>for example:
> >>      * @parm checkpointed_stream non-zero if the far end of the stream is using
> >>      *                           checkpointing
> >>      *                           0 no checkpointed stream
> >>      *                           1 Remus
> >>      *                           2 COLO
> >>
> >
> >These should be proper #defines instead of being buried in comments --
> >so that you can use them in code.
> 
> Those defines are introduced later in the colo series, I will move them
> here and follow Ian C's suggestion using XC_ prefix, or I can move this
> patch down to colo series.
> 

Either works for me.
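
For illustration, the documented values could be turned into defines
along these lines. This is only a sketch: the macro names are
hypothetical (the thread only settles on an XC_ prefix and the values
0/1/2), and stream_name() is a helper invented for the example.

```c
#include <assert.h>

/* Hypothetical names; values follow the comment proposed earlier in the thread. */
#define XC_STREAM_NONE  0  /* no checkpointed stream */
#define XC_STREAM_REMUS 1  /* far end is using Remus checkpointing */
#define XC_STREAM_COLO  2  /* far end is using COLO checkpointing */

/* Existing callers that test checkpointed_stream for zero keep working. */
static_assert(XC_STREAM_NONE == 0, "NONE must remain the zero value");

/* Illustrative helper: map a stream type to a printable name. */
static const char *stream_name(int checkpointed_stream)
{
    switch (checkpointed_stream) {
    case XC_STREAM_NONE:  return "none";
    case XC_STREAM_REMUS: return "Remus";
    case XC_STREAM_COLO:  return "COLO";
    default:              return "unknown";
    }
}
```

With named constants the code can switch on the stream type directly,
which a comment-only list of magic numbers does not allow.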

Wei.


* Re: [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests
  2015-07-16 16:27           ` Ian Jackson
@ 2015-12-15  2:05             ` Wen Congyang
  2016-01-04 16:33               ` Ian Jackson
  0 siblings, 1 reply; 101+ messages in thread
From: Wen Congyang @ 2015-12-15  2:05 UTC (permalink / raw)
  To: Ian Jackson, Yang Hongyang
  Cc: wei.liu2, Ian Campbell, andrew.cooper3, yunhong.jiang,
	eddie.dong, xen-devel, guijianfeng, rshriram

On 07/17/2015 12:27 AM, Ian Jackson wrote:
> Yang Hongyang writes ("Re: [Xen-devel] [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests"):
>> On 07/16/2015 11:40 PM, Ian Jackson wrote:
>>>    what this patch is doing
>>>
>>>      That is, what the change in behaviour is.  This includes clearly
>>>      distinguishing old behaviour, before the patch, from new
>>>      behaviour, after the patch.  I appreciate that there may be
>>>      language problems which are making this more difficult - I think
>>>      your native language may not use tenses the way English does.  So
>>>      we can help you with the language, but we need the old and new
>>>      behaviours to be clearly marked in your message.
>>
>> I thought this is being addressed in the commit message, sorry again
>> for my poor English and not make it clear, I would appreciate your
>> help.
> 
> Right.  Thanks.  I hope we can work on this together.  I appreciate
> that working in a non-native language is difficult.
> 
> OK, at the moment I find the existing proposed commit message unclear
> about before-and-after.  I'm not sure I can write it correctly.  Can I
> make a suggestion ?  How about you send me a copy of it with
> the different parts explicitly marked BEFORE: and AFTER: ?
> 
>>>    what the constraints on the new functionality will be.
>>>
>>>      It appears that you are supporting slow path resume for all HVM
>>>      guests.  Is that true ?  Are there any cases left unhandled ?
>>
>> For the first question, yes. For second, Sorry that I don't catch
>> your question, did you mean in some cases resuming HVM through slow
>> path will be unhandled?
> 
> What I mean is: I think that this patch has this overall effect:
> 
>    BEFORE: HVM resume for slow path does not work
> 
>    AFTER: HVM resume for slow path does work
> 
> But I have questions.  I don't know in what way it "does not work".
> What happens instead ?

Sorry for the late reply.
BEFORE: HVM resume for slow path does not work. You will get the following
error message:
"Cannot resume uncooperative HVM guests"

Fast resume: the guest state has not changed, so there is no need to disconnect and
reconnect the backend and frontend PV drivers.

Slow path resume: the guest state has changed, so we must disconnect and reconnect
the backend and frontend PV drivers. Reconnecting them takes a long time because
xenstore is very slow. That is why it is called the slow path.

In which case does the slow path not work? When the guest state has changed and is
also corrupted. I don't know what will happen in this case. I think resuming a PV guest
in such a state doesn't work either (the behavior is undefined).

> 
> And, another question: is it true that
> 
>    AFTER: HVM resume for slow path does work in all cases
> 
> or
> 
>    AFTER: HVM resume for slow path works in some cases (specify!)
>           but in other cases it (does something else - what?)
> 
> Does that make sense of my question ?

In my test, it works. I know I cannot say it does work in all cases.
How to know if it does work in all cases?
    List all cases, and do a test for all cases.
But I think it is hard to list all cases...

How do we resume a domU whose state (memory, device state, CPU registers, ...) has changed?
Note that the domU can be resumed (all state is copied from another guest with the same
config).
Before this patch, we only supported PV guests, and did the following:
1. rewrite store_mfn and console_mfn
2. reset all secondary CPU states
3. resume domain(do_domctl(xch, ...), cmd is XEN_DOMCTL_resumedomain)
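
The three steps above can be sketched as a toy model. Everything here is
hypothetical: struct guest and slow_path_resume_pv() are invented for
illustration and do not call into libxc, and modelling "reset" as marking
a vCPU offline is an assumption. Only the order of the steps is taken
from the list.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_VCPUS 8

/* Toy model of the guest state touched by the slow path (all fields hypothetical). */
struct guest {
    uint64_t store_mfn;
    uint64_t console_mfn;
    int nr_vcpus;
    int vcpu_online[MAX_VCPUS];
    int running;
};

/* Walk through the three PV slow-path steps on the toy model. */
static void slow_path_resume_pv(struct guest *g,
                                uint64_t store_mfn, uint64_t console_mfn)
{
    assert(g->nr_vcpus >= 1 && g->nr_vcpus <= MAX_VCPUS);

    /* 1. rewrite store_mfn and console_mfn */
    g->store_mfn = store_mfn;
    g->console_mfn = console_mfn;

    /* 2. reset all secondary CPU states (vCPU 0 is left alone) */
    for (int i = 1; i < g->nr_vcpus; i++)
        g->vcpu_online[i] = 0;

    /* 3. resume the domain (XEN_DOMCTL_resumedomain in the real code) */
    g->running = 1;
}
```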

Thanks
Wen Congyang

> 
> 
> Thanks,
> Ian.
> .
> 


* Re: [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests
  2015-12-15  2:05             ` Wen Congyang
@ 2016-01-04 16:33               ` Ian Jackson
  0 siblings, 0 replies; 101+ messages in thread
From: Ian Jackson @ 2016-01-04 16:33 UTC (permalink / raw)
  To: Wen Congyang
  Cc: wei.liu2, Ian Campbell, andrew.cooper3, yunhong.jiang,
	eddie.dong, xen-devel, guijianfeng, rshriram, Yang Hongyang

Wen Congyang writes ("Re: [Xen-devel] [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests"):
> Sorry for the late reply.
> BEFORE: HVM resume for slow path does not work. You will get the following
> error message:
> "Cannot resume uncooperative HVM guests"
> 
> Fast resume: the guest state has not changed, so there is no need to disconnect and
> reconnect the backend and frontend PV drivers.
> 
> Slow path resume: the guest state has changed, so we must disconnect and reconnect
> the backend and frontend PV drivers. Reconnecting them takes a long time because
> xenstore is very slow. That is why it is called the slow path.
> 
> In which case does the slow path not work? When the guest state has changed and is
> also corrupted. I don't know what will happen in this case. I think resuming a PV guest
> in such a state doesn't work either (the behavior is undefined).

Right.  Thanks, this is nice and clear.  Indeed, the guest will be
corrupted and if we're lucky (!) it will crash.

> [discussion of AFTER]
>
> In my test, it works. I know I cannot say it does work in all cases.
> How to know if it does work in all cases?
>     List all cases, and do a test for all cases.
> But I think it is hard to list all cases...

I certainly don't expect you to list them all and test them.  I'm just
trying to understand what you are saying in the commit message.

So I think you could say something like:

 AFTER: I think it should work in XYZ cases.  I have tested PQR
 (only), but the others are probably good because [explanation].

Or perhaps:

 There may still be problems with situation ABC because of problem
 FOO.  I have not tested this.  But situation ABC is already broken
 and this patch makes it no worse.

Thanks,
Ian.



Thread overview: 101+ messages
2015-07-15  7:45 [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 01/25] tools/libxl: rename libxl__domain_suspend to libxl__domain_save Yang Hongyang
2015-07-15 11:16   ` Ian Campbell
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 02/25] tools/libxl: move domain suspend code into libxl_dom_suspend.c Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 03/25] tools/libxl: move domain resume " Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 04/25] tools/libxl: rename remus checkpoint callbacks Yang Hongyang
2015-07-15 11:17   ` Ian Campbell
2015-07-16  1:43     ` Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 05/25] libxl/remus: introduce libxl__remus_setup Yang Hongyang
2015-07-15 11:26   ` Ian Campbell
2015-07-16  5:32     ` Yang Hongyang
2015-07-16 10:40       ` Ian Campbell
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 06/25] libxl/remus: introduce libxl__remus_teardown Yang Hongyang
2015-07-15 11:59   ` Ian Campbell
2015-07-16  1:43     ` Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 07/25] libxl/remus: init checkpoint_callback in Remus checkpoint callback Yang Hongyang
2015-07-15 12:02   ` Ian Campbell
2015-07-15 12:35     ` Yang Hongyang
2015-07-16 10:32       ` Ian Campbell
2015-07-16 11:00         ` Yang Hongyang
2015-07-16 11:16           ` Ian Campbell
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 08/25] tools/libxl: move remus code into libxl_remus.c Yang Hongyang
2015-07-15 12:05   ` Ian Campbell
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 09/25] tools/libxl: move save/restore code into libxl_dom_save.c Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 10/25] libxl/save: Refactor libxl__domain_suspend_state Yang Hongyang
2015-07-15 12:10   ` Ian Campbell
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 11/25] tools/libxc: support to resume uncooperative HVM guests Yang Hongyang
2015-07-15 12:26   ` Ian Campbell
2015-07-16  5:57     ` Yang Hongyang
2015-07-16 15:40       ` Ian Jackson
2015-07-16 16:15         ` Yang Hongyang
2015-07-16 16:27           ` Ian Jackson
2015-12-15  2:05             ` Wen Congyang
2016-01-04 16:33               ` Ian Jackson
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 12/25] tools/libxl: introduce enum type libxl_checkpointed_stream Yang Hongyang
2015-07-15 12:34   ` Ian Campbell
2015-07-15 13:58     ` Yang Hongyang
2015-07-16 10:34       ` Ian Campbell
2015-07-16 10:47         ` Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 13/25] migration/save: pass checkpointed_stream from libxl to libxc Yang Hongyang
2015-07-15 12:38   ` Ian Campbell
2015-07-16  6:05     ` Yang Hongyang
2015-07-16 10:47       ` Ian Campbell
2015-07-16 16:13       ` Wei Liu
2015-07-16 16:21         ` Yang Hongyang
2015-07-16 16:39           ` Wei Liu
2015-07-16 16:10   ` Wei Liu
2015-07-16 16:24     ` Yang Hongyang
2015-07-16 16:37       ` Wei Liu
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 14/25] tools/libxl: introduce libxl__domain_restore_device_model to load qemu state Yang Hongyang
2015-07-15 12:45   ` Ian Campbell
2015-07-15 13:42     ` Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 15/25] tools/libxl: check QEMU state before resume dm Yang Hongyang
2015-07-15 12:48   ` Ian Campbell
2015-07-15 12:54     ` Ian Campbell
2015-07-15 13:00       ` Wei Liu
2015-07-15 13:48         ` Ian Campbell
2015-07-15 13:49     ` Ian Campbell
2015-07-16 14:43   ` Wei Liu
2015-07-16 15:43     ` Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 16/25] tools/libxl: Update libxl_domain_unpause() to support qemu-xen Yang Hongyang
2015-07-15 12:50   ` Ian Campbell
2015-07-16  3:49     ` Yang Hongyang
2015-07-16 10:39       ` Ian Campbell
2015-07-16 10:51         ` Yang Hongyang
2015-07-16 16:26   ` Wei Liu
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 17/25] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty() Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 18/25] tools/libxl: export logdirty_init Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 19/25] tools/libxl: Add back channel to allow migration target send data back Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 20/25] tools/libx{l, c}: add back channel to libxc Yang Hongyang
2015-07-15 13:13   ` Ian Campbell
2015-07-16  6:29     ` Yang Hongyang
2015-07-16 11:01       ` Ian Campbell
2015-07-15 13:21   ` Andrew Cooper
2015-07-16  6:07     ` Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 21/25] tools/libxl: rename remus device to checkpoint device Yang Hongyang
2015-07-15 13:15   ` Ian Campbell
2015-07-15 13:34     ` Yang Hongyang
2015-07-16  9:26       ` Andrew Cooper
2015-07-16  9:29         ` Yang Hongyang
2015-07-15 13:32   ` Ian Campbell
2015-07-15 13:38     ` Yang Hongyang
2015-07-16  9:23     ` Yang Hongyang
2015-07-16  9:31       ` Ian Campbell
2015-07-16  9:36         ` Yang Hongyang
2015-07-16 10:14           ` Ian Campbell
2015-07-16 10:22             ` Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 22/25] tools/libxl: adjust the indentation Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 23/25] tools/libxl: store remus_ops in checkpoint device state Yang Hongyang
2015-07-15 13:21   ` Ian Campbell
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 24/25] tools/libxl: move remus state into a seperate structure Yang Hongyang
2015-07-15 13:28   ` Ian Campbell
2015-07-15 13:50     ` Yang Hongyang
2015-07-16 10:37       ` Ian Campbell
2015-07-16 11:10         ` Ian Jackson
2015-07-16 11:19           ` Ian Campbell
2015-07-15 15:08   ` Ian Jackson
2015-07-15 15:18     ` Yang Hongyang
2015-07-15  7:45 ` [PATCH v4 --for 4.6 COLOPre 25/25] tools/libxl: seperate device init/cleanup from checkpoint device layer Yang Hongyang
2015-07-15 13:37   ` Ian Campbell
2015-07-16  1:37 ` [PATCH v4 --for 4.6 COLOPre 00/25] Prerequisite patches for COLO Yang Hongyang
