* [PATCH v8 00/13] Prerequisite patches for COLO
@ 2016-02-18  2:43 Wen Congyang
  2016-02-18  2:43 ` [PATCH v8 01/13] libxl/remus: init checkpoint callback in Remus setup callback Wen Congyang
                   ` (13 more replies)
  0 siblings, 14 replies; 26+ messages in thread
From: Wen Congyang @ 2016-02-18  2:43 UTC
  To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell,
	Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

This patch set is a prerequisite for the COLO feature. Refer to:
http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping

Patch status:
1. Acked patches: patches 2-4, 6-13
2. Reviewed patches: all
3. New patches: none
Note:
1. Patches 1 and 7 are updated according to Wei Liu's comments
2. Patches 2-3 are updated because patch 1 is updated
3. Patches 8, 9, 11, 12 in v7 are moved to another series
4. Patches 13, 14 in v7 are folded into one patch (patch 9)
5. The commit message for patch 5 is not updated (waiting for replies
   from Ian C and Ian J)

You can get the code from here:
https://github.com/wencongyang/xen/tree/colo_pre_v8
You can get the complete set of COLO-related patches from here:
https://github.com/wencongyang/xen/tree/colo_v10

v6->v7:
 - Addressed comments from Konrad Rzeszutek Wilk

v5->v6:
 - Fixed some bugs found in testing

v4->v5:
 - Rebased to the latest xen
 - Addressed comments from last round

v3->v4:
 - Rebased to the latest migration v2 branch
 - Addressed comments from last round

v2->v3:
 - Merge '[PATCH v2 0/6] Misc cleanups for libxl' into this patchset
   for easy review
 - Addressed review comments
 - Add back channel to libxc
 - Introduce should_checkpoint callback
 - Introduce DIRTY_BITMAP record on libxc side
 - Introduce COLO_CONTEXT record on libxl side
 - Ported to Libxl migration v2

v1->v2:
 - Rebased to [PATCH v2 0/6] Misc cleanups for libxl
 - Add a bugfix for the error handling of process_record

Wen Congyang (13):
  libxl/remus: init checkpoint callback in Remus setup callback
  tools/libxl: move remus code into libxl_remus.c
  tools/libxl: move save/restore code into libxl_dom_save.c
  libxl/save: Refactor libxl__domain_suspend_state
  tools/libxc: support to resume uncooperative HVM guests
  tools/libxl: introduce enum type libxl_checkpointed_stream
  migration/save: pass checkpointed_stream from libxl to libxc
  tools/libxl: export logdirty_init
  tools/libxl: rename remus device to checkpoint device
  tools/libxl: adjust the indentation
  tools/libxl: store remus_ops in checkpoint device state
  tools/libxl: move remus state into a seperate structure
  tools/libxl: seperate device init/cleanup from checkpoint device layer

 tools/libxc/include/xenguest.h        |   6 +-
 tools/libxc/xc_nomigrate.c            |   3 +-
 tools/libxc/xc_resume.c               |  25 +-
 tools/libxc/xc_sr_common.h            |  12 +-
 tools/libxc/xc_sr_save.c              |  17 +-
 tools/libxl/Makefile                  |   4 +-
 tools/libxl/libxl.c                   |  81 +---
 tools/libxl/libxl.h                   |  19 +
 tools/libxl/libxl_checkpoint_device.c | 282 +++++++++++++
 tools/libxl/libxl_create.c            |  44 +-
 tools/libxl/libxl_dom.c               | 740 ----------------------------------
 tools/libxl/libxl_dom_save.c          | 521 ++++++++++++++++++++++++
 tools/libxl/libxl_dom_suspend.c       | 207 ++++++----
 tools/libxl/libxl_internal.h          | 217 ++++++----
 tools/libxl/libxl_netbuffer.c         | 117 +++---
 tools/libxl/libxl_nonetbuffer.c       |  10 +-
 tools/libxl/libxl_remus.c             | 424 +++++++++++++++++++
 tools/libxl/libxl_remus_device.c      | 327 ---------------
 tools/libxl/libxl_remus_disk_drbd.c   |  56 +--
 tools/libxl/libxl_save_callout.c      |   4 +-
 tools/libxl/libxl_save_helper.c       |   3 +-
 tools/libxl/libxl_stream_read.c       |   7 +-
 tools/libxl/libxl_stream_write.c      |  18 +-
 tools/libxl/libxl_types.idl           |  10 +-
 tools/libxl/xl_cmdimpl.c              |  18 +-
 25 files changed, 1709 insertions(+), 1463 deletions(-)
 create mode 100644 tools/libxl/libxl_checkpoint_device.c
 create mode 100644 tools/libxl/libxl_dom_save.c
 create mode 100644 tools/libxl/libxl_remus.c
 delete mode 100644 tools/libxl/libxl_remus_device.c

-- 
2.5.0


* [PATCH v8 01/13] libxl/remus: init checkpoint callback in Remus setup callback
  2016-02-18  2:43 [PATCH v8 00/13] Prerequisite patches for COLO Wen Congyang
@ 2016-02-18  2:43 ` Wen Congyang
  2016-02-18 12:30   ` Wei Liu
  2016-02-18  2:43 ` [PATCH v8 02/13] tools/libxl: move remus code into libxl_remus.c Wen Congyang
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Wen Congyang @ 2016-02-18  2:43 UTC
  To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell,
	Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Ian Campbell, Wen Congyang,
	Gui Jianfeng, Jiang Yunhong, Dong Eddie, Shriram Rajagopalan,
	Ian Jackson, Yang Hongyang

Initialize the stream read/write state's checkpoint_callback and the
suspend/resume/checkpoint callbacks in the Remus setup callback.
There is no functional change; this is just refactoring so that we can
move all Remus code into one file.

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
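Note (illustration only, not part of the commit): the effect of this
refactoring can be sketched as a standalone C fragment. All type and
function names below (save_state, remus_setup_sketch, and so on) are
hypothetical stand-ins, not libxl definitions. The sketch tries to show the
point visible in the libxl_dom.c hunk: once the Remus setup path installs
its callbacks before the common save path runs, the common path must no
longer clear the callback table, and only fills in the plain suspend
callback for the non-Remus case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the autogenerated save callback table. */
typedef void (*cb_fn)(void *data);

struct save_callbacks {
    cb_fn suspend;
    cb_fn postcopy;
    cb_fn checkpoint;
};

/* Hypothetical stand-in for the domain suspend state. */
struct save_state {
    int is_remus;                 /* plays the role of "dss->remus != NULL" */
    struct save_callbacks cbs;
};

static void plain_suspend(void *data)    { (void)data; }
static void remus_suspend(void *data)    { (void)data; }
static void remus_resume(void *data)     { (void)data; }
static void remus_checkpoint(void *data) { (void)data; }

/* Feature-specific setup: runs first and installs its own callbacks. */
void remus_setup_sketch(struct save_state *s)
{
    assert(s != NULL);
    s->is_remus = 1;
    s->cbs.suspend = remus_suspend;
    s->cbs.postcopy = remus_resume;
    s->cbs.checkpoint = remus_checkpoint;
}

/*
 * Common save path: it does not zero the table any more, because that
 * would wipe what the setup step installed. It only supplies the plain
 * suspend callback when no feature-specific setup has run.
 */
void domain_save_sketch(struct save_state *s)
{
    assert(s != NULL);
    if (!s->is_remus)
        s->cbs.suspend = plain_suspend;
}

/* Helper for callers: are all three Remus callbacks still in place? */
int uses_remus_callbacks(const struct save_state *s)
{
    return s->cbs.suspend == remus_suspend &&
           s->cbs.postcopy == remus_resume &&
           s->cbs.checkpoint == remus_checkpoint;
}
```

Calling remus_setup_sketch() and then domain_save_sketch() on a
zero-initialized save_state leaves the Remus callbacks intact, while calling
only domain_save_sketch() selects the plain suspend callback.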
 tools/libxl/libxl.c          |  8 ++++++++
 tools/libxl/libxl_create.c   | 18 ++++++++++++++----
 tools/libxl/libxl_dom.c      | 18 +++++-------------
 tools/libxl/libxl_internal.h |  7 +++++++
 4 files changed, 34 insertions(+), 17 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 2d18b8d..38029cd 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -899,6 +899,8 @@ static void libxl__remus_setup(libxl__egc *egc,
     /* Convenience aliases */
     libxl__remus_devices_state *const rds = &dss->rds;
     const libxl_domain_remus_info *const info = dss->remus;
+    libxl__srm_save_autogen_callbacks *const callbacks =
+        &dss->sws.shs.callbacks.save.a;
 
     STATE_AO_GC(dss->ao);
 
@@ -917,6 +919,12 @@ static void libxl__remus_setup(libxl__egc *egc,
     rds->domid = dss->domid;
     rds->callback = remus_setup_done;
 
+    dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
+
+    callbacks->suspend = libxl__remus_domain_suspend_callback;
+    callbacks->postcopy = libxl__remus_domain_resume_callback;
+    callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
+
     libxl__remus_devices_setup(egc, rds);
     return;
 
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index de5d27f..7293d0b 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -730,6 +730,17 @@ static void remus_checkpoint_stream_done(
     libxl__xc_domain_saverestore_async_callback_done(egc, &stream->shs, rc);
 }
 
+static void libxl__remus_restore_setup(libxl__egc *egc,
+                                       libxl__domain_create_state *dcs)
+{
+    /* Convenience aliases */
+    libxl__srm_restore_autogen_callbacks *const callbacks =
+        &dcs->srs.shs.callbacks.restore.a;
+
+    callbacks->checkpoint = libxl__remus_domain_restore_checkpoint_callback;
+    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
+}
+
 /*----- main domain creation -----*/
 
 /* We have a linear control flow; only one event callback is
@@ -1014,8 +1025,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
     libxl_domain_config *const d_config = dcs->guest_config;
     const int restore_fd = dcs->restore_fd;
     libxl__domain_build_state *const state = &dcs->build_state;
-    libxl__srm_restore_autogen_callbacks *const callbacks =
-        &dcs->srs.shs.callbacks.restore.a;
+    const int checkpointed_stream = dcs->restore_params.checkpointed_stream;
 
     if (rc) {
         domcreate_rebuild_done(egc, dcs, rc);
@@ -1043,7 +1053,6 @@ static void domcreate_bootloader_done(libxl__egc *egc,
     }
 
     /* Restore */
-    callbacks->checkpoint = libxl__remus_domain_restore_checkpoint_callback;
 
     rc = libxl__build_pre(gc, domid, d_config, state);
     if (rc)
@@ -1054,9 +1063,10 @@ static void domcreate_bootloader_done(libxl__egc *egc,
     dcs->srs.fd = restore_fd;
     dcs->srs.legacy = (dcs->restore_params.stream_version == 1);
     dcs->srs.completion_callback = domcreate_stream_done;
-    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
 
     if (restore_fd >= 0) {
+        if (checkpointed_stream)
+            libxl__remus_restore_setup(egc, dcs);
         libxl__stream_read_start(egc, &dcs->srs);
         return;
     }
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 2269998..7835d4d 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1489,7 +1489,7 @@ static void remus_devices_preresume_cb(libxl__egc *egc,
                                        libxl__remus_devices_state *rds,
                                        int rc);
 
-static void libxl__remus_domain_suspend_callback(void *data)
+void libxl__remus_domain_suspend_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
     libxl__egc *egc = shs->egc;
@@ -1532,7 +1532,7 @@ out:
     libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
 }
 
-static void libxl__remus_domain_resume_callback(void *data)
+void libxl__remus_domain_resume_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
     libxl__egc *egc = shs->egc;
@@ -1569,8 +1569,6 @@ out:
 
 /*----- remus asynchronous checkpoint callback -----*/
 
-static void remus_checkpoint_stream_written(
-    libxl__egc *egc, libxl__stream_write_state *sws, int rc);
 static void remus_devices_commit_cb(libxl__egc *egc,
                                     libxl__remus_devices_state *rds,
                                     int rc);
@@ -1578,7 +1576,7 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
                                   const struct timeval *requested_abs,
                                   int rc);
 
-static void libxl__remus_domain_save_checkpoint_callback(void *data)
+void libxl__remus_domain_save_checkpoint_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
     libxl__domain_suspend_state *dss = shs->caller_state;
@@ -1588,7 +1586,7 @@ static void libxl__remus_domain_save_checkpoint_callback(void *data)
     libxl__stream_write_start_checkpoint(egc, &dss->sws);
 }
 
-static void remus_checkpoint_stream_written(
+void remus_checkpoint_stream_written(
     libxl__egc *egc, libxl__stream_write_state *sws, int rc)
 {
     libxl__domain_suspend_state *dss = CONTAINER_OF(sws, *dss, sws);
@@ -1756,13 +1754,7 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
         }
     }
 
-    memset(callbacks, 0, sizeof(*callbacks));
-    if (r_info != NULL) {
-        callbacks->suspend = libxl__remus_domain_suspend_callback;
-        callbacks->postcopy = libxl__remus_domain_resume_callback;
-        callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
-        dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
-    } else
+    if (r_info == NULL)
         callbacks->suspend = libxl__domain_suspend_callback;
 
     callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 650a958..29c87a2 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3508,6 +3508,13 @@ _hidden void libxl__domain_suspend(libxl__egc *egc,
 /* used by libxc to suspend the guest during migration */
 _hidden void libxl__domain_suspend_callback(void *data);
 
+/* Remus callbacks for save */
+_hidden void remus_checkpoint_stream_written(
+    libxl__egc *egc, libxl__stream_write_state *sws, int rc);
+_hidden void libxl__remus_domain_suspend_callback(void *data);
+_hidden void libxl__remus_domain_resume_callback(void *data);
+_hidden void libxl__remus_domain_save_checkpoint_callback(void *data);
+
 
 /*
  * Convenience macros.
-- 
2.5.0


* [PATCH v8 02/13] tools/libxl: move remus code into libxl_remus.c
  2016-02-18  2:43 [PATCH v8 00/13] Prerequisite patches for COLO Wen Congyang
  2016-02-18  2:43 ` [PATCH v8 01/13] libxl/remus: init checkpoint callback in Remus setup callback Wen Congyang
@ 2016-02-18  2:43 ` Wen Congyang
  2016-02-18  2:43 ` [PATCH v8 03/13] tools/libxl: move save/restore code into libxl_dom_save.c Wen Congyang
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Wen Congyang @ 2016-02-18  2:43 UTC
  To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell,
	Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Ian Campbell, Wen Congyang,
	Gui Jianfeng, Jiang Yunhong, Dong Eddie, Shriram Rajagopalan,
	Ian Jackson, Yang Hongyang

After the previous refactoring, we can now move all Remus code into a
separate file, libxl_remus.c.

Export the following functions for internal use:
- setup/teardown Remus:
  * libxl__remus_setup
  * libxl__remus_teardown
  * libxl__remus_restore_setup

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
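Note (illustration only, not part of the commit): the callbacks moved by
this patch, such as remus_setup_done, receive a pointer to the embedded
devices state and recover the enclosing suspend state with CONTAINER_OF.
The fragment below is a minimal standalone sketch of that idiom. The macro
CONTAINER_OF_SKETCH and the struct names are hypothetical; libxl's own
CONTAINER_OF takes different arguments and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: step back from a member pointer to its container. */
#define CONTAINER_OF_SKETCH(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Inner state: only knows about its own completion callback. */
struct devices_state {
    void (*callback)(struct devices_state *ds, int rc);
};

/* Outer state: embeds the inner state by value, not by pointer. */
struct suspend_state {
    int domid;
    int last_rc;
    struct devices_state ds;
};

/* Completion callback: gets the inner pointer, recovers the outer one. */
static void setup_done(struct devices_state *ds, int rc)
{
    struct suspend_state *ss =
        CONTAINER_OF_SKETCH(ds, struct suspend_state, ds);

    assert(&ss->ds == ds);    /* the recovered container owns this member */
    ss->last_rc = rc;
}

/* Install the callback, fire it with rc, and report what it recorded. */
int run_setup(struct suspend_state *ss, int rc)
{
    ss->ds.callback = setup_done;
    ss->ds.callback(&ss->ds, rc);
    return ss->last_rc;
}
```

Because the inner state is embedded by value, no back-pointer has to be
stored in it; the offset of the member within the outer struct is enough to
find the container.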
 tools/libxl/Makefile         |   2 +-
 tools/libxl/libxl.c          |  75 ---------
 tools/libxl/libxl_create.c   |  32 ----
 tools/libxl/libxl_dom.c      | 223 --------------------------
 tools/libxl/libxl_internal.h |  14 +-
 tools/libxl/libxl_remus.c    | 362 +++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 371 insertions(+), 337 deletions(-)
 create mode 100644 tools/libxl/libxl_remus.c
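
Note (illustration only, not part of the commit): several of the moved
callbacks finish by passing "!rc" to
libxl__xc_domain_saverestore_async_callback_done, and the comment in
remus_next_checkpoint says that returning 1 to libxc keeps the
suspend/checkpoint/resume loop running. A minimal sketch of that convention
follows. The function names and the numeric value used for the timeout code
are assumptions made for this example, not libxl definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration; the real constant is ERROR_TIMEDOUT. */
#define SKETCH_ERROR_TIMEDOUT (-8)

/*
 * Convert a libxl-style rc (0 means success) from the checkpoint timer
 * into the reply handed back to the save helper (1 means "continue").
 * A timeout is the expected way for the interval timer to fire, so it
 * is treated as success.
 */
int timer_rc_to_reply(int rc)
{
    if (rc == SKETCH_ERROR_TIMEDOUT)
        rc = 0;
    return !rc;
}

/* Count how many checkpoints run before a reply of 0 stops the loop. */
int count_checkpoints(const int *timer_rcs, int n)
{
    int done = 0;

    assert(n >= 0);
    assert(timer_rcs != NULL || n == 0);

    for (int i = 0; i < n; i++) {
        if (!timer_rc_to_reply(timer_rcs[i]))
            break;
        done++;
    }
    return done;
}
```

The inversion matters because the two layers use opposite conventions: a
zero rc is success on the libxl side, while a zero reply tells the helper
to stop.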

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 620720e..7d64ecc 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -64,7 +64,7 @@ else
 LIBXL_OBJS-y += libxl_no_convert_callout.o
 endif
 
-LIBXL_OBJS-y += libxl_remus_device.o libxl_remus_disk_drbd.o
+LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o
 
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 38029cd..d6ce7da 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -831,12 +831,6 @@ out:
     return ptr;
 }
 
-static void libxl__remus_setup(libxl__egc *egc,
-                               libxl__domain_suspend_state *dss);
-static void remus_setup_done(libxl__egc *egc,
-                             libxl__remus_devices_state *rds, int rc);
-static void remus_setup_failed(libxl__egc *egc,
-                               libxl__remus_devices_state *rds, int rc);
 static void remus_failover_cb(libxl__egc *egc,
                               libxl__domain_suspend_state *dss, int rc);
 
@@ -893,75 +887,6 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
     return AO_CREATE_FAIL(rc);
 }
 
-static void libxl__remus_setup(libxl__egc *egc,
-                               libxl__domain_suspend_state *dss)
-{
-    /* Convenience aliases */
-    libxl__remus_devices_state *const rds = &dss->rds;
-    const libxl_domain_remus_info *const info = dss->remus;
-    libxl__srm_save_autogen_callbacks *const callbacks =
-        &dss->sws.shs.callbacks.save.a;
-
-    STATE_AO_GC(dss->ao);
-
-    if (libxl_defbool_val(info->netbuf)) {
-        if (!libxl__netbuffer_enabled(gc)) {
-            LOG(ERROR, "Remus: No support for network buffering");
-            goto out;
-        }
-        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
-    }
-
-    if (libxl_defbool_val(info->diskbuf))
-        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
-
-    rds->ao = ao;
-    rds->domid = dss->domid;
-    rds->callback = remus_setup_done;
-
-    dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
-
-    callbacks->suspend = libxl__remus_domain_suspend_callback;
-    callbacks->postcopy = libxl__remus_domain_resume_callback;
-    callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
-
-    libxl__remus_devices_setup(egc, rds);
-    return;
-
-out:
-    dss->callback(egc, dss, ERROR_FAIL);
-}
-
-static void remus_setup_done(libxl__egc *egc,
-                             libxl__remus_devices_state *rds, int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-    STATE_AO_GC(dss->ao);
-
-    if (!rc) {
-        libxl__domain_save(egc, dss);
-        return;
-    }
-
-    LOG(ERROR, "Remus: failed to setup device for guest with domid %u, rc %d",
-        dss->domid, rc);
-    rds->callback = remus_setup_failed;
-    libxl__remus_devices_teardown(egc, rds);
-}
-
-static void remus_setup_failed(libxl__egc *egc,
-                               libxl__remus_devices_state *rds, int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-    STATE_AO_GC(dss->ao);
-
-    if (rc)
-        LOG(ERROR, "Remus: failed to teardown device after setup failed"
-            " for guest with domid %u, rc %d", dss->domid, rc);
-
-    dss->callback(egc, dss, rc);
-}
-
 static void remus_failover_cb(libxl__egc *egc,
                               libxl__domain_suspend_state *dss, int rc)
 {
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 7293d0b..e421d36 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -709,38 +709,6 @@ static int store_libxl_entry(libxl__gc *gc, uint32_t domid,
                             libxl_device_model_version_to_string(b_info->device_model_version));
 }
 
-/*----- remus asynchronous checkpoint callback -----*/
-
-static void remus_checkpoint_stream_done(
-    libxl__egc *egc, libxl__stream_read_state *srs, int rc);
-
-static void libxl__remus_domain_restore_checkpoint_callback(void *data)
-{
-    libxl__save_helper_state *shs = data;
-    libxl__domain_create_state *dcs = shs->caller_state;
-    libxl__egc *egc = shs->egc;
-    STATE_AO_GC(dcs->ao);
-
-    libxl__stream_read_start_checkpoint(egc, &dcs->srs);
-}
-
-static void remus_checkpoint_stream_done(
-    libxl__egc *egc, libxl__stream_read_state *stream, int rc)
-{
-    libxl__xc_domain_saverestore_async_callback_done(egc, &stream->shs, rc);
-}
-
-static void libxl__remus_restore_setup(libxl__egc *egc,
-                                       libxl__domain_create_state *dcs)
-{
-    /* Convenience aliases */
-    libxl__srm_restore_autogen_callbacks *const callbacks =
-        &dcs->srs.shs.callbacks.restore.a;
-
-    callbacks->checkpoint = libxl__remus_domain_restore_checkpoint_callback;
-    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
-}
-
 /*----- main domain creation -----*/
 
 /* We have a linear control flow; only one event callback is
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 7835d4d..d74f1a4 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1479,196 +1479,6 @@ int libxl__save_emulator_xenstore_data(libxl__domain_suspend_state *dss,
     return rc;
 }
 
-/*----- remus callbacks -----*/
-static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int ok);
-static void remus_devices_postsuspend_cb(libxl__egc *egc,
-                                         libxl__remus_devices_state *rds,
-                                         int rc);
-static void remus_devices_preresume_cb(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
-                                       int rc);
-
-void libxl__remus_domain_suspend_callback(void *data)
-{
-    libxl__save_helper_state *shs = data;
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-
-    dss->callback_common_done = remus_domain_suspend_callback_common_done;
-    libxl__domain_suspend(egc, dss);
-}
-
-static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int rc)
-{
-    if (rc)
-        goto out;
-
-    libxl__remus_devices_state *const rds = &dss->rds;
-    rds->callback = remus_devices_postsuspend_cb;
-    libxl__remus_devices_postsuspend(egc, rds);
-    return;
-
-out:
-    dss->rc = rc;
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
-}
-
-static void remus_devices_postsuspend_cb(libxl__egc *egc,
-                                         libxl__remus_devices_state *rds,
-                                         int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-
-    if (rc)
-        goto out;
-
-    rc = 0;
-
-out:
-    if (rc)
-        dss->rc = rc;
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
-}
-
-void libxl__remus_domain_resume_callback(void *data)
-{
-    libxl__save_helper_state *shs = data;
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    STATE_AO_GC(dss->ao);
-
-    libxl__remus_devices_state *const rds = &dss->rds;
-    rds->callback = remus_devices_preresume_cb;
-    libxl__remus_devices_preresume(egc, rds);
-}
-
-static void remus_devices_preresume_cb(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
-                                       int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-    STATE_AO_GC(dss->ao);
-
-    if (rc)
-        goto out;
-
-    /* Resumes the domain and the device model */
-    rc = libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1);
-    if (rc)
-        goto out;
-
-    rc = 0;
-
-out:
-    if (rc)
-        dss->rc = rc;
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
-}
-
-/*----- remus asynchronous checkpoint callback -----*/
-
-static void remus_devices_commit_cb(libxl__egc *egc,
-                                    libxl__remus_devices_state *rds,
-                                    int rc);
-static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
-                                  const struct timeval *requested_abs,
-                                  int rc);
-
-void libxl__remus_domain_save_checkpoint_callback(void *data)
-{
-    libxl__save_helper_state *shs = data;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    libxl__egc *egc = shs->egc;
-    STATE_AO_GC(dss->ao);
-
-    libxl__stream_write_start_checkpoint(egc, &dss->sws);
-}
-
-void remus_checkpoint_stream_written(
-    libxl__egc *egc, libxl__stream_write_state *sws, int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(sws, *dss, sws);
-
-    /* Convenience aliases */
-    libxl__remus_devices_state *const rds = &dss->rds;
-
-    STATE_AO_GC(dss->ao);
-
-    if (rc) {
-        LOG(ERROR, "Failed to save device model. Terminating Remus..");
-        goto out;
-    }
-
-    rds->callback = remus_devices_commit_cb;
-    libxl__remus_devices_commit(egc, rds);
-
-    return;
-
-out:
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
-}
-
-static void remus_devices_commit_cb(libxl__egc *egc,
-                                    libxl__remus_devices_state *rds,
-                                    int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-
-    STATE_AO_GC(dss->ao);
-
-    if (rc) {
-        LOG(ERROR, "Failed to do device commit op."
-            " Terminating Remus..");
-        goto out;
-    }
-
-    /*
-     * At this point, we have successfully checkpointed the guest and
-     * committed it at the backup. We'll come back after the checkpoint
-     * interval to checkpoint the guest again. Until then, let the guest
-     * continue execution.
-     */
-
-    /* Set checkpoint interval timeout */
-    rc = libxl__ev_time_register_rel(ao, &dss->checkpoint_timeout,
-                                     remus_next_checkpoint,
-                                     dss->interval);
-
-    if (rc)
-        goto out;
-
-    return;
-
-out:
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
-}
-
-static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
-                                  const struct timeval *requested_abs,
-                                  int rc)
-{
-    libxl__domain_suspend_state *dss =
-                            CONTAINER_OF(ev, *dss, checkpoint_timeout);
-
-    STATE_AO_GC(dss->ao);
-
-    if (rc == ERROR_TIMEDOUT) /* As intended */
-        rc = 0;
-
-    /*
-     * Time to checkpoint the guest again. We return 1 to libxc
-     * (xc_domain_save.c). in order to continue executing the infinite loop
-     * (suspend, checkpoint, resume) in xc_domain_save().
-     */
-
-    if (rc)
-        dss->rc = rc;
-
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
-}
-
 /*----- main code for saving, in order of execution -----*/
 
 void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
@@ -1777,13 +1587,6 @@ static void stream_done(libxl__egc *egc,
     domain_save_done(egc, sws->dss, rc);
 }
 
-static void libxl__remus_teardown(libxl__egc *egc,
-                                  libxl__domain_suspend_state *dss,
-                                  int rc);
-static void remus_teardown_done(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
-                                       int rc);
-
 static void domain_save_done(libxl__egc *egc,
                              libxl__domain_suspend_state *dss, int rc)
 {
@@ -1812,32 +1615,6 @@ static void domain_save_done(libxl__egc *egc,
     dss->callback(egc, dss, rc);
 }
 
-static void libxl__remus_teardown(libxl__egc *egc,
-                                  libxl__domain_suspend_state *dss,
-                                  int rc)
-{
-    EGC_GC;
-
-    LOG(WARN, "Remus: Domain suspend terminated with rc %d,"
-        " teardown Remus devices...", rc);
-    dss->rds.callback = remus_teardown_done;
-    libxl__remus_devices_teardown(egc, &dss->rds);
-}
-
-static void remus_teardown_done(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
-                                       int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-    STATE_AO_GC(dss->ao);
-
-    if (rc)
-        LOG(ERROR, "Remus: failed to teardown device for guest with domid %u,"
-            " rc %d", dss->domid, rc);
-
-    dss->callback(egc, dss, rc);
-}
-
 /*==================== Miscellaneous ====================*/
 
 char *libxl__uuid2string(libxl__gc *gc, const libxl_uuid uuid)
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 29c87a2..d9b9e2a 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3508,12 +3508,14 @@ _hidden void libxl__domain_suspend(libxl__egc *egc,
 /* used by libxc to suspend the guest during migration */
 _hidden void libxl__domain_suspend_callback(void *data);
 
-/* Remus callbacks for save */
-_hidden void remus_checkpoint_stream_written(
-    libxl__egc *egc, libxl__stream_write_state *sws, int rc);
-_hidden void libxl__remus_domain_suspend_callback(void *data);
-_hidden void libxl__remus_domain_resume_callback(void *data);
-_hidden void libxl__remus_domain_save_checkpoint_callback(void *data);
+/* Remus setup and teardown */
+_hidden void libxl__remus_setup(libxl__egc *egc,
+                                libxl__domain_suspend_state *dss);
+_hidden void libxl__remus_teardown(libxl__egc *egc,
+                                   libxl__domain_suspend_state *dss,
+                                   int rc);
+_hidden void libxl__remus_restore_setup(libxl__egc *egc,
+                                        libxl__domain_create_state *dcs);
 
 
 /*
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
new file mode 100644
index 0000000..567250d
--- /dev/null
+++ b/tools/libxl/libxl_remus.c
@@ -0,0 +1,362 @@
+/*
+ * Copyright (C) 2009      Citrix Ltd.
+ * Author Vincent Hanquez <vincent.hanquez@eu.citrix.com>
+ *        Yang Hongyang <hongyang.yang@easystack.cn>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+/*-------------------- Remus setup and teardown ---------------------*/
+
+static void remus_setup_done(libxl__egc *egc,
+                             libxl__remus_devices_state *rds, int rc);
+static void remus_setup_failed(libxl__egc *egc,
+                               libxl__remus_devices_state *rds, int rc);
+static void remus_checkpoint_stream_written(
+    libxl__egc *egc, libxl__stream_write_state *sws, int rc);
+static void libxl__remus_domain_suspend_callback(void *data);
+static void libxl__remus_domain_resume_callback(void *data);
+static void libxl__remus_domain_save_checkpoint_callback(void *data);
+
+void libxl__remus_setup(libxl__egc *egc,
+                        libxl__domain_suspend_state *dss)
+{
+    /* Convenience aliases */
+    libxl__remus_devices_state *const rds = &dss->rds;
+    const libxl_domain_remus_info *const info = dss->remus;
+    libxl__srm_save_autogen_callbacks *const callbacks =
+        &dss->sws.shs.callbacks.save.a;
+
+    STATE_AO_GC(dss->ao);
+
+    if (libxl_defbool_val(info->netbuf)) {
+        if (!libxl__netbuffer_enabled(gc)) {
+            LOG(ERROR, "Remus: No support for network buffering");
+            goto out;
+        }
+        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
+    }
+
+    if (libxl_defbool_val(info->diskbuf))
+        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
+
+    rds->ao = ao;
+    rds->domid = dss->domid;
+    rds->callback = remus_setup_done;
+
+    dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
+
+    callbacks->suspend = libxl__remus_domain_suspend_callback;
+    callbacks->postcopy = libxl__remus_domain_resume_callback;
+    callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
+
+    libxl__remus_devices_setup(egc, rds);
+    return;
+
+out:
+    dss->callback(egc, dss, ERROR_FAIL);
+}
+
+static void remus_setup_done(libxl__egc *egc,
+                             libxl__remus_devices_state *rds, int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    STATE_AO_GC(dss->ao);
+
+    if (!rc) {
+        libxl__domain_save(egc, dss);
+        return;
+    }
+
+    LOG(ERROR, "Remus: failed to setup device for guest with domid %u, rc %d",
+        dss->domid, rc);
+    rds->callback = remus_setup_failed;
+    libxl__remus_devices_teardown(egc, rds);
+}
+
+static void remus_setup_failed(libxl__egc *egc,
+                               libxl__remus_devices_state *rds, int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    STATE_AO_GC(dss->ao);
+
+    if (rc)
+        LOG(ERROR, "Remus: failed to teardown device after setup failed"
+            " for guest with domid %u, rc %d", dss->domid, rc);
+
+    dss->callback(egc, dss, rc);
+}
+
+static void remus_teardown_done(libxl__egc *egc,
+                                libxl__remus_devices_state *rds,
+                                int rc);
+void libxl__remus_teardown(libxl__egc *egc,
+                           libxl__domain_suspend_state *dss,
+                           int rc)
+{
+    EGC_GC;
+
+    LOG(WARN, "Remus: Domain suspend terminated with rc %d,"
+        " teardown Remus devices...", rc);
+    dss->rds.callback = remus_teardown_done;
+    libxl__remus_devices_teardown(egc, &dss->rds);
+}
+
+static void remus_teardown_done(libxl__egc *egc,
+                                libxl__remus_devices_state *rds,
+                                int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    STATE_AO_GC(dss->ao);
+
+    if (rc)
+        LOG(ERROR, "Remus: failed to teardown device for guest with domid %u,"
+            " rc %d", dss->domid, rc);
+
+    dss->callback(egc, dss, rc);
+}
+
+/*---------------------- remus callbacks (save) -----------------------*/
+
+static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
+                                libxl__domain_suspend_state *dss, int ok);
+static void remus_devices_postsuspend_cb(libxl__egc *egc,
+                                         libxl__remus_devices_state *rds,
+                                         int rc);
+static void remus_devices_preresume_cb(libxl__egc *egc,
+                                       libxl__remus_devices_state *rds,
+                                       int rc);
+
+static void libxl__remus_domain_suspend_callback(void *data)
+{
+    libxl__save_helper_state *shs = data;
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+
+    dss->callback_common_done = remus_domain_suspend_callback_common_done;
+    libxl__domain_suspend(egc, dss);
+}
+
+static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
+                                libxl__domain_suspend_state *dss, int rc)
+{
+    if (rc)
+        goto out;
+
+    libxl__remus_devices_state *const rds = &dss->rds;
+    rds->callback = remus_devices_postsuspend_cb;
+    libxl__remus_devices_postsuspend(egc, rds);
+    return;
+
+out:
+    dss->rc = rc;
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
+}
+
+static void remus_devices_postsuspend_cb(libxl__egc *egc,
+                                         libxl__remus_devices_state *rds,
+                                         int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+
+    if (rc)
+        goto out;
+
+    rc = 0;
+
+out:
+    if (rc)
+        dss->rc = rc;
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
+}
+
+static void libxl__remus_domain_resume_callback(void *data)
+{
+    libxl__save_helper_state *shs = data;
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    STATE_AO_GC(dss->ao);
+
+    libxl__remus_devices_state *const rds = &dss->rds;
+    rds->callback = remus_devices_preresume_cb;
+    libxl__remus_devices_preresume(egc, rds);
+}
+
+static void remus_devices_preresume_cb(libxl__egc *egc,
+                                       libxl__remus_devices_state *rds,
+                                       int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    STATE_AO_GC(dss->ao);
+
+    if (rc)
+        goto out;
+
+    /* Resumes the domain and the device model */
+    rc = libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1);
+    if (rc)
+        goto out;
+
+    rc = 0;
+
+out:
+    if (rc)
+        dss->rc = rc;
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
+}
+
+/*----- remus asynchronous checkpoint callback -----*/
+
+static void remus_devices_commit_cb(libxl__egc *egc,
+                                    libxl__remus_devices_state *rds,
+                                    int rc);
+static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
+                                  const struct timeval *requested_abs,
+                                  int rc);
+
+static void libxl__remus_domain_save_checkpoint_callback(void *data)
+{
+    libxl__save_helper_state *shs = data;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__egc *egc = shs->egc;
+    STATE_AO_GC(dss->ao);
+
+    libxl__stream_write_start_checkpoint(egc, &dss->sws);
+}
+
+static void remus_checkpoint_stream_written(
+    libxl__egc *egc, libxl__stream_write_state *sws, int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(sws, *dss, sws);
+
+    /* Convenience aliases */
+    libxl__remus_devices_state *const rds = &dss->rds;
+
+    STATE_AO_GC(dss->ao);
+
+    if (rc) {
+        LOG(ERROR, "Failed to save device model. Terminating Remus..");
+        goto out;
+    }
+
+    rds->callback = remus_devices_commit_cb;
+    libxl__remus_devices_commit(egc, rds);
+
+    return;
+
+out:
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
+}
+
+static void remus_devices_commit_cb(libxl__egc *egc,
+                                    libxl__remus_devices_state *rds,
+                                    int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+
+    STATE_AO_GC(dss->ao);
+
+    if (rc) {
+        LOG(ERROR, "Failed to do device commit op."
+            " Terminating Remus..");
+        goto out;
+    }
+
+    /*
+     * At this point, we have successfully checkpointed the guest and
+     * committed it at the backup. We'll come back after the checkpoint
+     * interval to checkpoint the guest again. Until then, let the guest
+     * continue execution.
+     */
+
+    /* Set checkpoint interval timeout */
+    rc = libxl__ev_time_register_rel(ao, &dss->checkpoint_timeout,
+                                     remus_next_checkpoint,
+                                     dss->interval);
+
+    if (rc)
+        goto out;
+
+    return;
+
+out:
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
+}
+
+static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
+                                  const struct timeval *requested_abs,
+                                  int rc)
+{
+    libxl__domain_suspend_state *dss =
+                            CONTAINER_OF(ev, *dss, checkpoint_timeout);
+
+    STATE_AO_GC(dss->ao);
+
+    if (rc == ERROR_TIMEDOUT) /* As intended */
+        rc = 0;
+
+    /*
+     * Time to checkpoint the guest again. We return 1 to libxc
+     * (xc_domain_save.c) in order to continue executing the infinite loop
+     * (suspend, checkpoint, resume) in xc_domain_save().
+     */
+
+    if (rc)
+        dss->rc = rc;
+
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
+}
+
+/*---------------------- remus callbacks (restore) -----------------------*/
+
+/*----- remus asynchronous checkpoint callback -----*/
+
+static void remus_checkpoint_stream_done(
+    libxl__egc *egc, libxl__stream_read_state *srs, int rc);
+
+static void libxl__remus_domain_restore_checkpoint_callback(void *data)
+{
+    libxl__save_helper_state *shs = data;
+    libxl__domain_create_state *dcs = shs->caller_state;
+    libxl__egc *egc = shs->egc;
+    STATE_AO_GC(dcs->ao);
+
+    libxl__stream_read_start_checkpoint(egc, &dcs->srs);
+}
+
+static void remus_checkpoint_stream_done(
+    libxl__egc *egc, libxl__stream_read_state *stream, int rc)
+{
+    libxl__xc_domain_saverestore_async_callback_done(egc, &stream->shs, rc);
+}
+
+void libxl__remus_restore_setup(libxl__egc *egc,
+                                libxl__domain_create_state *dcs)
+{
+    /* Convenience aliases */
+    libxl__srm_restore_autogen_callbacks *const callbacks =
+        &dcs->srs.shs.callbacks.restore.a;
+
+    callbacks->checkpoint = libxl__remus_domain_restore_checkpoint_callback;
+    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.5.0

* [PATCH v8 03/13] tools/libxl: move save/restore code into libxl_dom_save.c
  2016-02-18  2:43 [PATCH v8 00/13] Prerequisite patches for COLO Wen Congyang
  2016-02-18  2:43 ` [PATCH v8 01/13] libxl/remus: init checkpoint callback in Remus setup callback Wen Congyang
  2016-02-18  2:43 ` [PATCH v8 02/13] tools/libxl: move remus code into libxl_remus.c Wen Congyang
@ 2016-02-18  2:43 ` Wen Congyang
  2016-02-18  2:43 ` [PATCH v8 04/13] libxl/save: Refactor libxl__domain_suspend_state Wen Congyang
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Wen Congyang @ 2016-02-18  2:43 UTC (permalink / raw)
  To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell,
	Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Ian Jackson,
	Yang Hongyang

This is purely code motion.

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/Makefile         |   2 +-
 tools/libxl/libxl_dom.c      | 509 ----------------------------------------
 tools/libxl/libxl_dom_save.c | 538 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 539 insertions(+), 510 deletions(-)
 create mode 100644 tools/libxl/libxl_dom_save.c
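
Reviewer note (not part of the commit): among the code this patch removes from
libxl_dom.c are next_string() and append_string(), which read and build a flat
buffer of NUL-terminated "key", "value" strings laid end to end. The sketch
below illustrates that layout in isolation. It is only an approximation under
stated assumptions: the blob_* names are hypothetical, plain realloc() stands
in for libxl__realloc(), and the xenstore reads and writes are left out.

```c
#define _POSIX_C_SOURCE 200809L /* for strnlen() */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Append 'str', including its NUL terminator, to a growing buffer.
 * Returns 0 on success, -1 if realloc() fails. (Hypothetical helper:
 * plain realloc() is an assumption of this sketch, not libxl's allocator.)
 */
static int blob_append(char **buf, uint32_t *len, const char *str)
{
    assert(buf && len && str);

    size_t extralen = strlen(str) + 1;
    char *new_buf = realloc(*buf, *len + extralen);

    if (!new_buf) return -1;
    memcpy(new_buf + *len, str, extralen);
    *buf = new_buf;
    *len += extralen;
    return 0;
}

/*
 * Return a pointer to the character following the NUL terminator of
 * 'start', or NULL if 'start' is not terminated before 'end'.
 */
static const char *blob_next(const char *start, const char *end)
{
    if (start >= end) return NULL;

    size_t total_len = end - start;
    size_t len = strnlen(start, total_len);

    return len == total_len ? NULL : start + len + 1;
}

/*
 * Walk the buffer as key/value pairs. Returns the number of complete
 * pairs, or -1 if a string is unterminated or a key is empty or absolute.
 */
static int blob_count_pairs(const char *ptr, uint32_t size)
{
    const char *next = ptr, *end = ptr + size;
    int pairs = 0;

    while (next < end) {
        const char *key = next;

        next = blob_next(next, end);
        if (!next || key[0] == '\0' || key[0] == '/') return -1;

        next = blob_next(next, end); /* step over the value */
        if (!next) return -1;

        pairs++;
    }
    return pairs;
}
```

Appending "physmap/0/size" and then "0x1000" gives a 22-byte buffer that
blob_count_pairs() reports as one pair. Dropping the final NUL makes the same
walk fail, which corresponds to the "not NUL terminated" checks in
libxl__restore_emulator_xenstore_data().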

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 7d64ecc..263ea0e 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -105,7 +105,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
 			libxl_stream_read.o libxl_stream_write.o \
 			libxl_save_callout.o _libxl_save_msgs_callout.o \
 			libxl_qmp.o libxl_event.o libxl_fork.o \
-			libxl_dom_suspend.o $(LIBXL_OBJS-y)
+			libxl_dom_suspend.o libxl_dom_save.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += libxl_genid.o
 LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index d74f1a4..664adad 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -24,7 +24,6 @@
 #include <xen/hvm/hvm_info_table.h>
 #include <xen/hvm/hvm_xs_strings.h>
 #include <xen/hvm/e820.h>
-#include <xen/errno.h>
 
 libxl_domain_type libxl__domain_type(libxl__gc *gc, uint32_t domid)
 {
@@ -1107,514 +1106,6 @@ int libxl__qemu_traditional_cmd(libxl__gc *gc, uint32_t domid,
     return libxl__xs_printf(gc, XBT_NULL, path, "%s", cmd);
 }
 
-/*
- * Inspect the buffer between start and end, and return a pointer to the
- * character following the NUL terminator of start, or NULL if start is not
- * terminated before end.
- */
-static const char *next_string(const char *start, const char *end)
-{
-    if (start >= end) return NULL;
-
-    size_t total_len = end - start;
-    size_t len = strnlen(start, total_len);
-
-    if (len == total_len)
-        return NULL;
-    else
-        return start + len + 1;
-}
-
-int libxl__restore_emulator_xenstore_data(libxl__domain_create_state *dcs,
-                                          const char *ptr, uint32_t size)
-{
-    STATE_AO_GC(dcs->ao);
-    const char *next = ptr, *end = ptr + size, *key, *val;
-    int rc;
-
-    const uint32_t domid = dcs->guest_domid;
-    const uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
-    const char *xs_root = libxl__device_model_xs_path(gc, dm_domid, domid, "");
-
-    while (next < end) {
-        key = next;
-        next = next_string(next, end);
-
-        /* Sanitise 'key'. */
-        if (!next) {
-            rc = ERROR_FAIL;
-            LOG(ERROR, "Key in xenstore data not NUL terminated");
-            goto out;
-        }
-        if (key[0] == '\0') {
-            rc = ERROR_FAIL;
-            LOG(ERROR, "empty key found in xenstore data");
-            goto out;
-        }
-        if (key[0] == '/') {
-            rc = ERROR_FAIL;
-            LOG(ERROR, "Key in xenstore data not relative");
-            goto out;
-        }
-
-        val = next;
-        next = next_string(next, end);
-
-        /* Sanitise 'val'. */
-        if (!next) {
-            rc = ERROR_FAIL;
-            LOG(ERROR, "Val in xenstore data not NUL terminated");
-            goto out;
-        }
-
-        libxl__xs_printf(gc, XBT_NULL,
-                         GCSPRINTF("%s/%s", xs_root, key),
-                         "%s", val);
-    }
-
-    rc = 0;
-
- out:
-    return rc;
-}
-
-/*==================== Domain suspend (save) ====================*/
-
-static void stream_done(libxl__egc *egc,
-                        libxl__stream_write_state *sws, int rc);
-static void domain_save_done(libxl__egc *egc,
-                             libxl__domain_suspend_state *dss, int rc);
-
-/*----- complicated callback, called by xc_domain_save -----*/
-
-/*
- * We implement the other end of protocol for controlling qemu-dm's
- * logdirty.  There is no documentation for this protocol, but our
- * counterparty's implementation is in
- * qemu-xen-traditional.git:xenstore.c in the function
- * xenstore_process_logdirty_event
- */
-
-static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                    const struct timeval *requested_abs,
-                                    int rc);
-static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
-                            const char *watch_path, const char *event_path);
-static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss, int rc);
-
-static void logdirty_init(libxl__logdirty_switch *lds)
-{
-    lds->cmd_path = 0;
-    libxl__ev_xswatch_init(&lds->watch);
-    libxl__ev_time_init(&lds->timeout);
-}
-
-static void domain_suspend_switch_qemu_xen_traditional_logdirty
-                               (int domid, unsigned enable,
-                                libxl__save_helper_state *shs)
-{
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    libxl__logdirty_switch *lds = &dss->logdirty;
-    STATE_AO_GC(dss->ao);
-    int rc;
-    xs_transaction_t t = 0;
-    const char *got;
-
-    if (!lds->cmd_path) {
-        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
-        lds->cmd_path = libxl__device_model_xs_path(gc, dm_domid, domid,
-                                                    "/logdirty/cmd");
-        lds->ret_path = libxl__device_model_xs_path(gc, dm_domid, domid,
-                                                    "/logdirty/ret");
-    }
-    lds->cmd = enable ? "enable" : "disable";
-
-    rc = libxl__ev_xswatch_register(gc, &lds->watch,
-                                switch_logdirty_xswatch, lds->ret_path);
-    if (rc) goto out;
-
-    rc = libxl__ev_time_register_rel(ao, &lds->timeout,
-                                switch_logdirty_timeout, 10*1000);
-    if (rc) goto out;
-
-    for (;;) {
-        rc = libxl__xs_transaction_start(gc, &t);
-        if (rc) goto out;
-
-        rc = libxl__xs_read_checked(gc, t, lds->cmd_path, &got);
-        if (rc) goto out;
-
-        if (got) {
-            const char *got_ret;
-            rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got_ret);
-            if (rc) goto out;
-
-            if (!got_ret || strcmp(got, got_ret)) {
-                LOG(ERROR,"controlling logdirty: qemu was already sent"
-                    " command `%s' (xenstore path `%s') but result is `%s'",
-                    got, lds->cmd_path, got_ret ? got_ret : "<none>");
-                rc = ERROR_FAIL;
-                goto out;
-            }
-            rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
-            if (rc) goto out;
-        }
-
-        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
-        if (rc) goto out;
-
-        rc = libxl__xs_write_checked(gc, t, lds->cmd_path, lds->cmd);
-        if (rc) goto out;
-
-        rc = libxl__xs_transaction_commit(gc, &t);
-        if (!rc) break;
-        if (rc<0) goto out;
-    }
-
-    /* OK, wait for some callback */
-    return;
-
- out:
-    LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
-    libxl__xs_transaction_abort(gc, &t);
-    switch_logdirty_done(egc,dss,rc);
-}
-
-static void domain_suspend_switch_qemu_xen_logdirty
-                               (int domid, unsigned enable,
-                                libxl__save_helper_state *shs)
-{
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    STATE_AO_GC(dss->ao);
-    int rc;
-
-    rc = libxl__qmp_set_global_dirty_log(gc, domid, enable);
-    if (!rc) {
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
-    } else {
-        LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
-        dss->rc = rc;
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
-    }
-}
-
-void libxl__domain_suspend_common_switch_qemu_logdirty
-                               (int domid, unsigned enable, void *user)
-{
-    libxl__save_helper_state *shs = user;
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    STATE_AO_GC(dss->ao);
-
-    switch (libxl__device_model_version_running(gc, domid)) {
-    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
-        domain_suspend_switch_qemu_xen_traditional_logdirty(domid, enable, shs);
-        break;
-    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-        domain_suspend_switch_qemu_xen_logdirty(domid, enable, shs);
-        break;
-    case LIBXL_DEVICE_MODEL_VERSION_NONE:
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
-        break;
-    default:
-        LOG(ERROR,"logdirty switch failed"
-            ", no valid device model version found, abandoning suspend");
-        dss->rc = ERROR_FAIL;
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
-    }
-}
-static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                    const struct timeval *requested_abs,
-                                    int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
-    STATE_AO_GC(dss->ao);
-    LOG(ERROR,"logdirty switch: wait for device model timed out");
-    switch_logdirty_done(egc,dss,ERROR_FAIL);
-}
-
-static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
-                            const char *watch_path, const char *event_path)
-{
-    libxl__domain_suspend_state *dss =
-        CONTAINER_OF(watch, *dss, logdirty.watch);
-    libxl__logdirty_switch *lds = &dss->logdirty;
-    STATE_AO_GC(dss->ao);
-    const char *got;
-    xs_transaction_t t = 0;
-    int rc;
-
-    for (;;) {
-        rc = libxl__xs_transaction_start(gc, &t);
-        if (rc) goto out;
-
-        rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got);
-        if (rc) goto out;
-
-        if (!got) {
-            rc = +1;
-            goto out;
-        }
-
-        if (strcmp(got, lds->cmd)) {
-            LOG(ERROR,"logdirty switch: sent command `%s' but got reply `%s'"
-                " (xenstore paths `%s' / `%s')", lds->cmd, got,
-                lds->cmd_path, lds->ret_path);
-            rc = ERROR_FAIL;
-            goto out;
-        }
-
-        rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
-        if (rc) goto out;
-
-        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
-        if (rc) goto out;
-
-        rc = libxl__xs_transaction_commit(gc, &t);
-        if (!rc) break;
-        if (rc<0) goto out;
-    }
-
- out:
-    /* rc < 0: error
-     * rc == 0: ok, we are done
-     * rc == +1: need to keep waiting
-     */
-    libxl__xs_transaction_abort(gc, &t);
-
-    if (rc <= 0) {
-        if (rc < 0)
-            LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
-        switch_logdirty_done(egc,dss,rc);
-    }
-}
-
-static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss,
-                                 int rc)
-{
-    STATE_AO_GC(dss->ao);
-    libxl__logdirty_switch *lds = &dss->logdirty;
-
-    libxl__ev_xswatch_deregister(gc, &lds->watch);
-    libxl__ev_time_deregister(gc, &lds->timeout);
-
-    int broke;
-    if (rc) {
-        broke = -1;
-        dss->rc = rc;
-    } else {
-        broke = 0;
-    }
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, broke);
-}
-
-/*----- callbacks, called by xc_domain_save -----*/
-
-/*
- * Expand the buffer 'buf' of length 'len', to append 'str' including its NUL
- * terminator.
- */
-static void append_string(libxl__gc *gc, char **buf, uint32_t *len,
-                          const char *str)
-{
-    size_t extralen = strlen(str) + 1;
-    char *new = libxl__realloc(gc, *buf, *len + extralen);
-
-    *buf = new;
-    memcpy(new + *len, str, extralen);
-    *len += extralen;
-}
-
-int libxl__save_emulator_xenstore_data(libxl__domain_suspend_state *dss,
-                                       char **callee_buf,
-                                       uint32_t *callee_len)
-{
-    STATE_AO_GC(dss->ao);
-    const char *xs_root;
-    char **entries, *buf = NULL;
-    unsigned int nr_entries, i, j, len = 0;
-    int rc;
-
-    const uint32_t domid = dss->domid;
-    const uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
-
-    xs_root = libxl__device_model_xs_path(gc, dm_domid, domid, "");
-
-    entries = libxl__xs_directory(gc, 0, GCSPRINTF("%s/physmap", xs_root),
-                                  &nr_entries);
-    if (!entries || nr_entries == 0) { rc = 0; goto out; }
-
-    for (i = 0; i < nr_entries; ++i) {
-        static const char *const physmap_subkeys[] = {
-            "start_addr", "size", "name"
-        };
-
-        for (j = 0; j < ARRAY_SIZE(physmap_subkeys); ++j) {
-            const char *key = GCSPRINTF("physmap/%s/%s",
-                                        entries[i], physmap_subkeys[j]);
-
-            const char *val =
-                libxl__xs_read(gc, XBT_NULL,
-                               GCSPRINTF("%s/%s", xs_root, key));
-
-            if (!val) { rc = ERROR_FAIL; goto out; }
-
-            append_string(gc, &buf, &len, key);
-            append_string(gc, &buf, &len, val);
-        }
-    }
-
-    rc = 0;
-
- out:
-    if (!rc) {
-        *callee_buf = buf;
-        *callee_len = len;
-    }
-
-    return rc;
-}
-
-/*----- main code for saving, in order of execution -----*/
-
-void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
-{
-    STATE_AO_GC(dss->ao);
-    int port;
-    int rc, ret;
-
-    /* Convenience aliases */
-    const uint32_t domid = dss->domid;
-    const libxl_domain_type type = dss->type;
-    const int live = dss->live;
-    const int debug = dss->debug;
-    const libxl_domain_remus_info *const r_info = dss->remus;
-    libxl__srm_save_autogen_callbacks *const callbacks =
-        &dss->sws.shs.callbacks.save.a;
-    unsigned int nr_vnodes = 0, nr_vmemranges = 0, nr_vcpus = 0;
-
-    dss->rc = 0;
-    logdirty_init(&dss->logdirty);
-    libxl__xswait_init(&dss->pvcontrol);
-    libxl__ev_evtchn_init(&dss->guest_evtchn);
-    libxl__ev_xswatch_init(&dss->guest_watch);
-    libxl__ev_time_init(&dss->guest_timeout);
-
-    switch (type) {
-    case LIBXL_DOMAIN_TYPE_HVM: {
-        dss->hvm = 1;
-        break;
-    }
-    case LIBXL_DOMAIN_TYPE_PV:
-        dss->hvm = 0;
-        break;
-    default:
-        abort();
-    }
-
-    dss->xcflags = (live ? XCFLAGS_LIVE : 0)
-          | (debug ? XCFLAGS_DEBUG : 0)
-          | (dss->hvm ? XCFLAGS_HVM : 0);
-
-    /* Disallow saving a guest with vNUMA configured because migration
-     * stream does not preserve node information.
-     *
-     * Reject any domain which has vnuma enabled, even if the
-     * configuration is empty. Only domains which have no vnuma
-     * configuration at all are supported.
-     */
-    ret = xc_domain_getvnuma(CTX->xch, domid, &nr_vnodes, &nr_vmemranges,
-                             &nr_vcpus, NULL, NULL, NULL);
-    if (ret != -1 || errno != XEN_EOPNOTSUPP) {
-        LOG(ERROR, "Cannot save a guest with vNUMA configured");
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    dss->guest_evtchn.port = -1;
-    dss->guest_evtchn_lockfd = -1;
-    dss->guest_responded = 0;
-    dss->dm_savefile = libxl__device_model_savefile(gc, domid);
-
-    if (r_info != NULL) {
-        dss->interval = r_info->interval;
-        dss->xcflags |= XCFLAGS_CHECKPOINTED;
-        if (libxl_defbool_val(r_info->compression))
-            dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
-    }
-
-    port = xs_suspend_evtchn_port(dss->domid);
-
-    if (port >= 0) {
-        rc = libxl__ctx_evtchn_init(gc);
-        if (rc) goto out;
-
-        dss->guest_evtchn.port =
-            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
-                                  dss->domid, port, &dss->guest_evtchn_lockfd);
-
-        if (dss->guest_evtchn.port < 0) {
-            LOG(WARN, "Suspend event channel initialization failed");
-            rc = ERROR_FAIL;
-            goto out;
-        }
-    }
-
-    if (r_info == NULL)
-        callbacks->suspend = libxl__domain_suspend_callback;
-
-    callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty;
-
-    dss->sws.ao  = dss->ao;
-    dss->sws.dss = dss;
-    dss->sws.fd  = dss->fd;
-    dss->sws.completion_callback = stream_done;
-
-    libxl__stream_write_start(egc, &dss->sws);
-    return;
-
- out:
-    domain_save_done(egc, dss, rc);
-}
-
-static void stream_done(libxl__egc *egc,
-                        libxl__stream_write_state *sws, int rc)
-{
-    domain_save_done(egc, sws->dss, rc);
-}
-
-static void domain_save_done(libxl__egc *egc,
-                             libxl__domain_suspend_state *dss, int rc)
-{
-    STATE_AO_GC(dss->ao);
-
-    /* Convenience aliases */
-    const uint32_t domid = dss->domid;
-
-    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
-
-    if (dss->guest_evtchn.port > 0)
-        xc_suspend_evtchn_release(CTX->xch, CTX->xce, domid,
-                           dss->guest_evtchn.port, &dss->guest_evtchn_lockfd);
-
-    if (dss->remus) {
-        /*
-         * With Remus, if we reach this point, it means either
-         * backup died or some network error occurred preventing us
-         * from sending checkpoints. Teardown the network buffers and
-         * release netlink resources.  This is an async op.
-         */
-        libxl__remus_teardown(egc, dss, rc);
-        return;
-    }
-
-    dss->callback(egc, dss, rc);
-}
-
 /*==================== Miscellaneous ====================*/
 
 char *libxl__uuid2string(libxl__gc *gc, const libxl_uuid uuid)
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
new file mode 100644
index 0000000..cca3404
--- /dev/null
+++ b/tools/libxl/libxl_dom_save.c
@@ -0,0 +1,538 @@
+/*
+ * Copyright (C) 2009      Citrix Ltd.
+ * Author Vincent Hanquez <vincent.hanquez@eu.citrix.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+#include <xen/errno.h>
+
+/*========================= Domain save ============================*/
+
+static void stream_done(libxl__egc *egc,
+                        libxl__stream_write_state *sws, int rc);
+static void domain_save_done(libxl__egc *egc,
+                             libxl__domain_suspend_state *dss, int rc);
+
+/*----- complicated callback, called by xc_domain_save -----*/
+
+/*
+ * We implement the other end of protocol for controlling qemu-dm's
+ * logdirty.  There is no documentation for this protocol, but our
+ * counterparty's implementation is in
+ * qemu-xen-traditional.git:xenstore.c in the function
+ * xenstore_process_logdirty_event
+ */
+
+static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
+                                    const struct timeval *requested_abs,
+                                    int rc);
+static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
+                            const char *watch_path, const char *event_path);
+static void switch_logdirty_done(libxl__egc *egc,
+                                 libxl__domain_suspend_state *dss, int rc);
+
+static void logdirty_init(libxl__logdirty_switch *lds)
+{
+    lds->cmd_path = 0;
+    libxl__ev_xswatch_init(&lds->watch);
+    libxl__ev_time_init(&lds->timeout);
+}
+
+static void domain_suspend_switch_qemu_xen_traditional_logdirty
+                               (int domid, unsigned enable,
+                                libxl__save_helper_state *shs)
+{
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__logdirty_switch *lds = &dss->logdirty;
+    STATE_AO_GC(dss->ao);
+    int rc;
+    xs_transaction_t t = 0;
+    const char *got;
+
+    if (!lds->cmd_path) {
+        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+        lds->cmd_path = libxl__device_model_xs_path(gc, dm_domid, domid,
+                                                    "/logdirty/cmd");
+        lds->ret_path = libxl__device_model_xs_path(gc, dm_domid, domid,
+                                                    "/logdirty/ret");
+    }
+    lds->cmd = enable ? "enable" : "disable";
+
+    rc = libxl__ev_xswatch_register(gc, &lds->watch,
+                                switch_logdirty_xswatch, lds->ret_path);
+    if (rc) goto out;
+
+    rc = libxl__ev_time_register_rel(ao, &lds->timeout,
+                                switch_logdirty_timeout, 10*1000);
+    if (rc) goto out;
+
+    for (;;) {
+        rc = libxl__xs_transaction_start(gc, &t);
+        if (rc) goto out;
+
+        rc = libxl__xs_read_checked(gc, t, lds->cmd_path, &got);
+        if (rc) goto out;
+
+        if (got) {
+            const char *got_ret;
+            rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got_ret);
+            if (rc) goto out;
+
+            if (!got_ret || strcmp(got, got_ret)) {
+                LOG(ERROR,"controlling logdirty: qemu was already sent"
+                    " command `%s' (xenstore path `%s') but result is `%s'",
+                    got, lds->cmd_path, got_ret ? got_ret : "<none>");
+                rc = ERROR_FAIL;
+                goto out;
+            }
+            rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
+            if (rc) goto out;
+        }
+
+        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
+        if (rc) goto out;
+
+        rc = libxl__xs_write_checked(gc, t, lds->cmd_path, lds->cmd);
+        if (rc) goto out;
+
+        rc = libxl__xs_transaction_commit(gc, &t);
+        if (!rc) break;
+        if (rc<0) goto out;
+    }
+
+    /* OK, wait for some callback */
+    return;
+
+ out:
+    LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
+    libxl__xs_transaction_abort(gc, &t);
+    switch_logdirty_done(egc,dss,rc);
+}
+
+static void domain_suspend_switch_qemu_xen_logdirty
+                               (int domid, unsigned enable,
+                                libxl__save_helper_state *shs)
+{
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    STATE_AO_GC(dss->ao);
+    int rc;
+
+    rc = libxl__qmp_set_global_dirty_log(gc, domid, enable);
+    if (!rc) {
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+    } else {
+        LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
+        dss->rc = rc;
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
+    }
+}
+
+void libxl__domain_suspend_common_switch_qemu_logdirty
+                               (int domid, unsigned enable, void *user)
+{
+    libxl__save_helper_state *shs = user;
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    STATE_AO_GC(dss->ao);
+
+    switch (libxl__device_model_version_running(gc, domid)) {
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+        domain_suspend_switch_qemu_xen_traditional_logdirty(domid, enable, shs);
+        break;
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+        domain_suspend_switch_qemu_xen_logdirty(domid, enable, shs);
+        break;
+    case LIBXL_DEVICE_MODEL_VERSION_NONE:
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+        break;
+    default:
+        LOG(ERROR,"logdirty switch failed"
+            ", no valid device model version found, abandoning suspend");
+        dss->rc = ERROR_FAIL;
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
+    }
+}
+static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
+                                    const struct timeval *requested_abs,
+                                    int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
+    STATE_AO_GC(dss->ao);
+    LOG(ERROR,"logdirty switch: wait for device model timed out");
+    switch_logdirty_done(egc,dss,ERROR_FAIL);
+}
+
+static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
+                            const char *watch_path, const char *event_path)
+{
+    libxl__domain_suspend_state *dss =
+        CONTAINER_OF(watch, *dss, logdirty.watch);
+    libxl__logdirty_switch *lds = &dss->logdirty;
+    STATE_AO_GC(dss->ao);
+    const char *got;
+    xs_transaction_t t = 0;
+    int rc;
+
+    for (;;) {
+        rc = libxl__xs_transaction_start(gc, &t);
+        if (rc) goto out;
+
+        rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got);
+        if (rc) goto out;
+
+        if (!got) {
+            rc = +1;
+            goto out;
+        }
+
+        if (strcmp(got, lds->cmd)) {
+            LOG(ERROR,"logdirty switch: sent command `%s' but got reply `%s'"
+                " (xenstore paths `%s' / `%s')", lds->cmd, got,
+                lds->cmd_path, lds->ret_path);
+            rc = ERROR_FAIL;
+            goto out;
+        }
+
+        rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
+        if (rc) goto out;
+
+        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
+        if (rc) goto out;
+
+        rc = libxl__xs_transaction_commit(gc, &t);
+        if (!rc) break;
+        if (rc<0) goto out;
+    }
+
+ out:
+    /* rc < 0: error
+     * rc == 0: ok, we are done
+     * rc == +1: need to keep waiting
+     */
+    libxl__xs_transaction_abort(gc, &t);
+
+    if (rc <= 0) {
+        if (rc < 0)
+            LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
+        switch_logdirty_done(egc,dss,rc);
+    }
+}
+
+static void switch_logdirty_done(libxl__egc *egc,
+                                 libxl__domain_suspend_state *dss,
+                                 int rc)
+{
+    STATE_AO_GC(dss->ao);
+    libxl__logdirty_switch *lds = &dss->logdirty;
+
+    libxl__ev_xswatch_deregister(gc, &lds->watch);
+    libxl__ev_time_deregister(gc, &lds->timeout);
+
+    int broke;
+    if (rc) {
+        broke = -1;
+        dss->rc = rc;
+    } else {
+        broke = 0;
+    }
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, broke);
+}
+
+/*----- callbacks, called by xc_domain_save -----*/
+
+/*
+ * Expand the buffer 'buf' of length 'len', to append 'str' including its NUL
+ * terminator.
+ */
+static void append_string(libxl__gc *gc, char **buf, uint32_t *len,
+                          const char *str)
+{
+    size_t extralen = strlen(str) + 1;
+    char *new = libxl__realloc(gc, *buf, *len + extralen);
+
+    *buf = new;
+    memcpy(new + *len, str, extralen);
+    *len += extralen;
+}
+
+int libxl__save_emulator_xenstore_data(libxl__domain_suspend_state *dss,
+                                       char **callee_buf,
+                                       uint32_t *callee_len)
+{
+    STATE_AO_GC(dss->ao);
+    const char *xs_root;
+    char **entries, *buf = NULL;
+    unsigned int nr_entries, i, j, len = 0;
+    int rc;
+
+    const uint32_t domid = dss->domid;
+    const uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+
+    xs_root = libxl__device_model_xs_path(gc, dm_domid, domid, "");
+
+    entries = libxl__xs_directory(gc, 0, GCSPRINTF("%s/physmap", xs_root),
+                                  &nr_entries);
+    if (!entries || nr_entries == 0) { rc = 0; goto out; }
+
+    for (i = 0; i < nr_entries; ++i) {
+        static const char *const physmap_subkeys[] = {
+            "start_addr", "size", "name"
+        };
+
+        for (j = 0; j < ARRAY_SIZE(physmap_subkeys); ++j) {
+            const char *key = GCSPRINTF("physmap/%s/%s",
+                                        entries[i], physmap_subkeys[j]);
+
+            const char *val =
+                libxl__xs_read(gc, XBT_NULL,
+                               GCSPRINTF("%s/%s", xs_root, key));
+
+            if (!val) { rc = ERROR_FAIL; goto out; }
+
+            append_string(gc, &buf, &len, key);
+            append_string(gc, &buf, &len, val);
+        }
+    }
+
+    rc = 0;
+
+ out:
+    if (!rc) {
+        *callee_buf = buf;
+        *callee_len = len;
+    }
+
+    return rc;
+}
+
+/*----- main code for saving, in order of execution -----*/
+
+void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
+{
+    STATE_AO_GC(dss->ao);
+    int port;
+    int rc, ret;
+
+    /* Convenience aliases */
+    const uint32_t domid = dss->domid;
+    const libxl_domain_type type = dss->type;
+    const int live = dss->live;
+    const int debug = dss->debug;
+    const libxl_domain_remus_info *const r_info = dss->remus;
+    libxl__srm_save_autogen_callbacks *const callbacks =
+        &dss->sws.shs.callbacks.save.a;
+    unsigned int nr_vnodes = 0, nr_vmemranges = 0, nr_vcpus = 0;
+
+    dss->rc = 0;
+    logdirty_init(&dss->logdirty);
+    libxl__xswait_init(&dss->pvcontrol);
+    libxl__ev_evtchn_init(&dss->guest_evtchn);
+    libxl__ev_xswatch_init(&dss->guest_watch);
+    libxl__ev_time_init(&dss->guest_timeout);
+
+    switch (type) {
+    case LIBXL_DOMAIN_TYPE_HVM: {
+        dss->hvm = 1;
+        break;
+    }
+    case LIBXL_DOMAIN_TYPE_PV:
+        dss->hvm = 0;
+        break;
+    default:
+        abort();
+    }
+
+    dss->xcflags = (live ? XCFLAGS_LIVE : 0)
+          | (debug ? XCFLAGS_DEBUG : 0)
+          | (dss->hvm ? XCFLAGS_HVM : 0);
+
+    /* Disallow saving a guest with vNUMA configured because migration
+     * stream does not preserve node information.
+     *
+     * Reject any domain which has vnuma enabled, even if the
+     * configuration is empty. Only domains which have no vnuma
+     * configuration at all are supported.
+     */
+    ret = xc_domain_getvnuma(CTX->xch, domid, &nr_vnodes, &nr_vmemranges,
+                             &nr_vcpus, NULL, NULL, NULL);
+    if (ret != -1 || errno != XEN_EOPNOTSUPP) {
+        LOG(ERROR, "Cannot save a guest with vNUMA configured");
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    dss->guest_evtchn.port = -1;
+    dss->guest_evtchn_lockfd = -1;
+    dss->guest_responded = 0;
+    dss->dm_savefile = libxl__device_model_savefile(gc, domid);
+
+    if (r_info != NULL) {
+        dss->interval = r_info->interval;
+        dss->xcflags |= XCFLAGS_CHECKPOINTED;
+        if (libxl_defbool_val(r_info->compression))
+            dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
+    }
+
+    port = xs_suspend_evtchn_port(dss->domid);
+
+    if (port >= 0) {
+        rc = libxl__ctx_evtchn_init(gc);
+        if (rc) goto out;
+
+        dss->guest_evtchn.port =
+            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
+                                  dss->domid, port, &dss->guest_evtchn_lockfd);
+
+        if (dss->guest_evtchn.port < 0) {
+            LOG(WARN, "Suspend event channel initialization failed");
+            rc = ERROR_FAIL;
+            goto out;
+        }
+    }
+
+    if (r_info == NULL)
+        callbacks->suspend = libxl__domain_suspend_callback;
+
+    callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty;
+
+    dss->sws.ao  = dss->ao;
+    dss->sws.dss = dss;
+    dss->sws.fd  = dss->fd;
+    dss->sws.completion_callback = stream_done;
+
+    libxl__stream_write_start(egc, &dss->sws);
+    return;
+
+ out:
+    domain_save_done(egc, dss, rc);
+}
+
+static void stream_done(libxl__egc *egc,
+                        libxl__stream_write_state *sws, int rc)
+{
+    domain_save_done(egc, sws->dss, rc);
+}
+
+static void domain_save_done(libxl__egc *egc,
+                             libxl__domain_suspend_state *dss, int rc)
+{
+    STATE_AO_GC(dss->ao);
+
+    /* Convenience aliases */
+    const uint32_t domid = dss->domid;
+
+    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
+
+    if (dss->guest_evtchn.port > 0)
+        xc_suspend_evtchn_release(CTX->xch, CTX->xce, domid,
+                           dss->guest_evtchn.port, &dss->guest_evtchn_lockfd);
+
+    if (dss->remus) {
+        /*
+         * With Remus, if we reach this point, it means either
+         * backup died or some network error occurred preventing us
+         * from sending checkpoints. Teardown the network buffers and
+         * release netlink resources.  This is an async op.
+         */
+        libxl__remus_teardown(egc, dss, rc);
+        return;
+    }
+
+    dss->callback(egc, dss, rc);
+}
+
+/*========================= Domain restore ============================*/
+
+/*
+ * Inspect the buffer between start and end, and return a pointer to the
+ * character following the NUL terminator of start, or NULL if start is not
+ * terminated before end.
+ */
+static const char *next_string(const char *start, const char *end)
+{
+    if (start >= end) return NULL;
+
+    size_t total_len = end - start;
+    size_t len = strnlen(start, total_len);
+
+    if (len == total_len)
+        return NULL;
+    else
+        return start + len + 1;
+}
+
+int libxl__restore_emulator_xenstore_data(libxl__domain_create_state *dcs,
+                                          const char *ptr, uint32_t size)
+{
+    STATE_AO_GC(dcs->ao);
+    const char *next = ptr, *end = ptr + size, *key, *val;
+    int rc;
+
+    const uint32_t domid = dcs->guest_domid;
+    const uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+    const char *xs_root = libxl__device_model_xs_path(gc, dm_domid, domid, "");
+
+    while (next < end) {
+        key = next;
+        next = next_string(next, end);
+
+        /* Sanitise 'key'. */
+        if (!next) {
+            rc = ERROR_FAIL;
+            LOG(ERROR, "Key in xenstore data not NUL terminated");
+            goto out;
+        }
+        if (key[0] == '\0') {
+            rc = ERROR_FAIL;
+            LOG(ERROR, "empty key found in xenstore data");
+            goto out;
+        }
+        if (key[0] == '/') {
+            rc = ERROR_FAIL;
+            LOG(ERROR, "Key in xenstore data not relative");
+            goto out;
+        }
+
+        val = next;
+        next = next_string(next, end);
+
+        /* Sanitise 'val'. */
+        if (!next) {
+            rc = ERROR_FAIL;
+            LOG(ERROR, "Val in xenstore data not NUL terminated");
+            goto out;
+        }
+
+        libxl__xs_printf(gc, XBT_NULL,
+                         GCSPRINTF("%s/%s", xs_root, key),
+                         "%s", val);
+    }
+
+    rc = 0;
+
+ out:
+    return rc;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.5.0

* [PATCH v8 04/13] libxl/save: Refactor libxl__domain_suspend_state
  2016-02-18  2:43 [PATCH v8 00/13] Prerequisite patches for COLO Wen Congyang
                   ` (2 preceding siblings ...)
  2016-02-18  2:43 ` [PATCH v8 03/13] tools/libxl: move save/restore code into libxl_dom_save.c Wen Congyang
@ 2016-02-18  2:43 ` Wen Congyang
  2016-02-18  2:43 ` [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests Wen Congyang
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Wen Congyang @ 2016-02-18  2:43 UTC (permalink / raw)
  To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell,
	Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Ian Campbell, Wen Congyang,
	Gui Jianfeng, Jiang Yunhong, Dong Eddie, Shriram Rajagopalan,
	Ian Jackson, Yang Hongyang

Currently struct libxl__domain_suspend_state contains two types of state:
save state and suspend state. This patch separates the two.
The motivation is that COLO needs to suspend and resume the guest
continuously, so we need a more generic suspend state.

After this change, dss stands for libxl__domain_save_state,
dsps stands for libxl__domain_suspend_state.

Also introduce libxl__domain_suspend_init to initialise the
libxl__domain_suspend_state.
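
The shape of the split can be sketched as below. This is a minimal
illustration only: every demo_* name is hypothetical and the structs are
heavily simplified stand-ins, not the real libxl definitions. It assumes
only what the description above states: the save state embeds a suspend
state, and the suspend state gets its own init function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the real libxl definitions. */
enum demo_domain_type {
    DEMO_TYPE_INVALID = -1,
    DEMO_TYPE_HVM = 1,
    DEMO_TYPE_PV = 2
};

/* Suspend-only state ("dsps"): usable by any caller that must suspend. */
struct demo_suspend_state {
    uint32_t domid;
    enum demo_domain_type type;
    int guest_evtchn_port;
    int guest_responded;
};

/* Save state ("dss"): keeps the save-specific fields and embeds a dsps. */
struct demo_save_state {
    uint32_t domid;
    int live;
    struct demo_suspend_state dsps;
};

/* Recover the enclosing struct from a pointer to one of its members. */
#define DEMO_CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Initialise the suspend state on its own; 0 on success, -1 on error. */
static int demo_suspend_init(struct demo_suspend_state *dsps,
                             enum demo_domain_type type)
{
    assert(dsps != NULL);
    if (type == DEMO_TYPE_INVALID) return -1;
    dsps->type = type;
    dsps->guest_evtchn_port = -1;   /* no event channel bound yet */
    dsps->guest_responded = 0;
    return 0;
}

/* The save path fills in the embedded dsps, then delegates to its init. */
static int demo_save_start(struct demo_save_state *dss,
                           enum demo_domain_type type)
{
    dss->dsps.domid = dss->domid;
    return demo_suspend_init(&dss->dsps, type);
}
```

In this sketch a suspend-only caller would allocate a demo_suspend_state
directly, while a save caller can still get from &dss->dsps back to dss
with DEMO_CONTAINER_OF when a callback only receives the inner pointer.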

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl.c              |  10 +-
 tools/libxl/libxl_create.c       |  10 +-
 tools/libxl/libxl_dom_save.c     |  61 ++++--------
 tools/libxl/libxl_dom_suspend.c  | 207 ++++++++++++++++++++++++---------------
 tools/libxl/libxl_internal.h     |  61 +++++++-----
 tools/libxl/libxl_netbuffer.c    |   2 +-
 tools/libxl/libxl_remus.c        |  37 +++----
 tools/libxl/libxl_save_callout.c |   2 +-
 tools/libxl/libxl_stream_write.c |  16 +--
 9 files changed, 227 insertions(+), 179 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index d6ce7da..db5732c 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -832,7 +832,7 @@ out:
 }
 
 static void remus_failover_cb(libxl__egc *egc,
-                              libxl__domain_suspend_state *dss, int rc);
+                              libxl__domain_save_state *dss, int rc);
 
 /* TODO: Explicit Checkpoint acknowledgements via recv_fd. */
 int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
@@ -840,7 +840,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
                              const libxl_asyncop_how *ao_how)
 {
     AO_CREATE(ctx, domid, ao_how);
-    libxl__domain_suspend_state *dss;
+    libxl__domain_save_state *dss;
     int rc;
 
     libxl_domain_type type = libxl__domain_type(gc, domid);
@@ -888,7 +888,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
 }
 
 static void remus_failover_cb(libxl__egc *egc,
-                              libxl__domain_suspend_state *dss, int rc)
+                              libxl__domain_save_state *dss, int rc)
 {
     STATE_AO_GC(dss->ao);
     /*
@@ -900,7 +900,7 @@ static void remus_failover_cb(libxl__egc *egc,
 }
 
 static void domain_suspend_cb(libxl__egc *egc,
-                              libxl__domain_suspend_state *dss, int rc)
+                              libxl__domain_save_state *dss, int rc)
 {
     STATE_AO_GC(dss->ao);
     int flrc;
@@ -925,7 +925,7 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
         goto out_err;
     }
 
-    libxl__domain_suspend_state *dss;
+    libxl__domain_save_state *dss;
     GCNEW(dss);
 
     dss->ao = ao;
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index e421d36..ad1d50c 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1558,7 +1558,7 @@ typedef struct {
 typedef struct {
     libxl__app_domain_create_state cdcs;
     libxl__domain_destroy_state dds;
-    libxl__domain_suspend_state dss;
+    libxl__domain_save_state dss;
     char *toolstack_buf;
     uint32_t toolstack_len;
 } libxl__domain_soft_reset_state;
@@ -1653,7 +1653,7 @@ static int do_domain_soft_reset(libxl_ctx *ctx,
     libxl__app_domain_create_state *cdcs;
     libxl__domain_create_state *dcs;
     libxl__domain_build_state *state;
-    libxl__domain_suspend_state *dss;
+    libxl__domain_save_state *dss;
     char *dom_path, *xs_store_mfn, *xs_console_mfn;
     uint32_t domid_out;
     int rc;
@@ -1697,8 +1697,8 @@ static int do_domain_soft_reset(libxl_ctx *ctx,
 
     dss->ao = ao;
     dss->domid = domid_soft_reset;
-    dss->dm_savefile = GCSPRINTF(LIBXL_DEVICE_MODEL_SAVE_FILE".%d",
-                                 domid_soft_reset);
+    dss->dsps.dm_savefile = GCSPRINTF(LIBXL_DEVICE_MODEL_SAVE_FILE".%d",
+                                      domid_soft_reset);
 
     rc = libxl__save_emulator_xenstore_data(dss, &srs->toolstack_buf,
                                             &srs->toolstack_len);
@@ -1707,7 +1707,7 @@ static int do_domain_soft_reset(libxl_ctx *ctx,
         goto out;
     }
 
-    rc = libxl__domain_suspend_device_model(gc, dss);
+    rc = libxl__domain_suspend_device_model(gc, &dss->dsps);
     if (rc) {
         LOG(ERROR, "failed to suspend device model.");
         goto out;
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index cca3404..aead042 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -24,7 +24,7 @@
 static void stream_done(libxl__egc *egc,
                         libxl__stream_write_state *sws, int rc);
 static void domain_save_done(libxl__egc *egc,
-                             libxl__domain_suspend_state *dss, int rc);
+                             libxl__domain_save_state *dss, int rc);
 
 /*----- complicated callback, called by xc_domain_save -----*/
 
@@ -42,7 +42,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
                             const char *watch_path, const char *event_path);
 static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss, int rc);
+                                 libxl__domain_save_state *dss, int rc);
 
 static void logdirty_init(libxl__logdirty_switch *lds)
 {
@@ -56,7 +56,7 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty
                                 libxl__save_helper_state *shs)
 {
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
     libxl__logdirty_switch *lds = &dss->logdirty;
     STATE_AO_GC(dss->ao);
     int rc;
@@ -128,7 +128,7 @@ static void domain_suspend_switch_qemu_xen_logdirty
                                 libxl__save_helper_state *shs)
 {
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
     STATE_AO_GC(dss->ao);
     int rc;
 
@@ -147,7 +147,7 @@ void libxl__domain_suspend_common_switch_qemu_logdirty
 {
     libxl__save_helper_state *shs = user;
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
     STATE_AO_GC(dss->ao);
 
     switch (libxl__device_model_version_running(gc, domid)) {
@@ -171,7 +171,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
                                     const struct timeval *requested_abs,
                                     int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
+    libxl__domain_save_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
     STATE_AO_GC(dss->ao);
     LOG(ERROR,"logdirty switch: wait for device model timed out");
     switch_logdirty_done(egc,dss,ERROR_FAIL);
@@ -180,7 +180,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
                             const char *watch_path, const char *event_path)
 {
-    libxl__domain_suspend_state *dss =
+    libxl__domain_save_state *dss =
         CONTAINER_OF(watch, *dss, logdirty.watch);
     libxl__logdirty_switch *lds = &dss->logdirty;
     STATE_AO_GC(dss->ao);
@@ -234,7 +234,7 @@ static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
 }
 
 static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss,
+                                 libxl__domain_save_state *dss,
                                  int rc)
 {
     STATE_AO_GC(dss->ao);
@@ -270,7 +270,7 @@ static void append_string(libxl__gc *gc, char **buf, uint32_t *len,
     *len += extralen;
 }
 
-int libxl__save_emulator_xenstore_data(libxl__domain_suspend_state *dss,
+int libxl__save_emulator_xenstore_data(libxl__domain_save_state *dss,
                                        char **callee_buf,
                                        uint32_t *callee_len)
 {
@@ -322,10 +322,9 @@ int libxl__save_emulator_xenstore_data(libxl__domain_suspend_state *dss,
 
 /*----- main code for saving, in order of execution -----*/
 
-void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
+void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
 {
     STATE_AO_GC(dss->ao);
-    int port;
     int rc, ret;
 
     /* Convenience aliases */
@@ -337,13 +336,14 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
     libxl__srm_save_autogen_callbacks *const callbacks =
         &dss->sws.shs.callbacks.save.a;
     unsigned int nr_vnodes = 0, nr_vmemranges = 0, nr_vcpus = 0;
+    libxl__domain_suspend_state *dsps = &dss->dsps;
 
     dss->rc = 0;
     logdirty_init(&dss->logdirty);
-    libxl__xswait_init(&dss->pvcontrol);
-    libxl__ev_evtchn_init(&dss->guest_evtchn);
-    libxl__ev_xswatch_init(&dss->guest_watch);
-    libxl__ev_time_init(&dss->guest_timeout);
+    dsps->ao = ao;
+    dsps->domid = domid;
+    rc = libxl__domain_suspend_init(egc, dsps, type);
+    if (rc) goto out;
 
     switch (type) {
     case LIBXL_DOMAIN_TYPE_HVM: {
@@ -376,11 +376,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
         goto out;
     }
 
-    dss->guest_evtchn.port = -1;
-    dss->guest_evtchn_lockfd = -1;
-    dss->guest_responded = 0;
-    dss->dm_savefile = libxl__device_model_savefile(gc, domid);
-
     if (r_info != NULL) {
         dss->interval = r_info->interval;
         dss->xcflags |= XCFLAGS_CHECKPOINTED;
@@ -388,23 +383,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
             dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
     }
 
-    port = xs_suspend_evtchn_port(dss->domid);
-
-    if (port >= 0) {
-        rc = libxl__ctx_evtchn_init(gc);
-        if (rc) goto out;
-
-        dss->guest_evtchn.port =
-            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
-                                  dss->domid, port, &dss->guest_evtchn_lockfd);
-
-        if (dss->guest_evtchn.port < 0) {
-            LOG(WARN, "Suspend event channel initialization failed");
-            rc = ERROR_FAIL;
-            goto out;
-        }
-    }
-
     if (r_info == NULL)
         callbacks->suspend = libxl__domain_suspend_callback;
 
@@ -429,18 +407,19 @@ static void stream_done(libxl__egc *egc,
 }
 
 static void domain_save_done(libxl__egc *egc,
-                             libxl__domain_suspend_state *dss, int rc)
+                             libxl__domain_save_state *dss, int rc)
 {
     STATE_AO_GC(dss->ao);
 
     /* Convenience aliases */
     const uint32_t domid = dss->domid;
+    libxl__domain_suspend_state *dsps = &dss->dsps;
 
-    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
+    libxl__ev_evtchn_cancel(gc, &dsps->guest_evtchn);
 
-    if (dss->guest_evtchn.port > 0)
+    if (dsps->guest_evtchn.port > 0)
         xc_suspend_evtchn_release(CTX->xch, CTX->xce, domid,
-                           dss->guest_evtchn.port, &dss->guest_evtchn_lockfd);
+                        dsps->guest_evtchn.port, &dsps->guest_evtchn_lockfd);
 
     if (dss->remus) {
         /*
diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
index 16f603f..cc0b217 100644
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -19,14 +19,61 @@
 
 /*====================== Domain suspend =======================*/
 
+int libxl__domain_suspend_init(libxl__egc *egc,
+                               libxl__domain_suspend_state *dsps,
+                               libxl_domain_type type)
+{
+    STATE_AO_GC(dsps->ao);
+    int rc = ERROR_FAIL;
+    int port;
+
+    /* Convenience aliases */
+    const uint32_t domid = dsps->domid;
+
+    libxl__xswait_init(&dsps->pvcontrol);
+    libxl__ev_evtchn_init(&dsps->guest_evtchn);
+    libxl__ev_xswatch_init(&dsps->guest_watch);
+    libxl__ev_time_init(&dsps->guest_timeout);
+
+    if (type == LIBXL_DOMAIN_TYPE_INVALID) goto out;
+    dsps->type = type;
+
+    dsps->guest_evtchn.port = -1;
+    dsps->guest_evtchn_lockfd = -1;
+    dsps->guest_responded = 0;
+    dsps->dm_savefile = libxl__device_model_savefile(gc, domid);
+
+    port = xs_suspend_evtchn_port(domid);
+
+    if (port >= 0) {
+        rc = libxl__ctx_evtchn_init(gc);
+        if (rc) goto out;
+
+        dsps->guest_evtchn.port =
+            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
+                                    domid, port, &dsps->guest_evtchn_lockfd);
+
+        if (dsps->guest_evtchn.port < 0) {
+            LOG(WARN, "Suspend event channel initialization failed");
+            rc = ERROR_FAIL;
+            goto out;
+        }
+    }
+
+    rc = 0;
+
+out:
+    return rc;
+}
+
 /*----- callbacks, called by xc_domain_save -----*/
 
 int libxl__domain_suspend_device_model(libxl__gc *gc,
-                                       libxl__domain_suspend_state *dss)
+                                       libxl__domain_suspend_state *dsps)
 {
     int ret = 0;
-    uint32_t const domid = dss->domid;
-    const char *const filename = dss->dm_savefile;
+    uint32_t const domid = dsps->domid;
+    const char *const filename = dsps->dm_savefile;
 
     switch (libxl__device_model_version_running(gc, domid)) {
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
@@ -53,9 +100,9 @@ int libxl__domain_suspend_device_model(libxl__gc *gc,
 }
 
 static void domain_suspend_common_wait_guest(libxl__egc *egc,
-                                             libxl__domain_suspend_state *dss);
+                                             libxl__domain_suspend_state *dsps);
 static void domain_suspend_common_guest_suspended(libxl__egc *egc,
-                                         libxl__domain_suspend_state *dss);
+                                         libxl__domain_suspend_state *dsps);
 
 static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
       libxl__xswait_state *xswa, int rc, const char *state);
@@ -64,24 +111,24 @@ static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
 static void suspend_common_wait_guest_watch(libxl__egc *egc,
       libxl__ev_xswatch *xsw, const char *watch_path, const char *event_path);
 static void suspend_common_wait_guest_check(libxl__egc *egc,
-        libxl__domain_suspend_state *dss);
+        libxl__domain_suspend_state *dsps);
 static void suspend_common_wait_guest_timeout(libxl__egc *egc,
       libxl__ev_time *ev, const struct timeval *requested_abs, int rc);
 
 static void domain_suspend_common_done(libxl__egc *egc,
-                                       libxl__domain_suspend_state *dss,
+                                       libxl__domain_suspend_state *dsps,
                                        int rc);
 
 static void domain_suspend_callback_common(libxl__egc *egc,
-                                           libxl__domain_suspend_state *dss);
+                                           libxl__domain_suspend_state *dsps);
 static void domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int rc);
+                                libxl__domain_suspend_state *dsps, int rc);
 
-/* calls dss->callback_common_done when done */
+/* calls dsps->callback_common_done when done */
 void libxl__domain_suspend(libxl__egc *egc,
-                           libxl__domain_suspend_state *dss)
+                           libxl__domain_suspend_state *dsps)
 {
-    domain_suspend_callback_common(egc, dss);
+    domain_suspend_callback_common(egc, dsps);
 }
 
 static bool domain_suspend_pvcontrol_acked(const char *state) {
@@ -90,37 +137,37 @@ static bool domain_suspend_pvcontrol_acked(const char *state) {
     return strcmp(state,"suspend");
 }
 
-/* calls dss->callback_common_done when done */
+/* calls dsps->callback_common_done when done */
 static void domain_suspend_callback_common(libxl__egc *egc,
-                                           libxl__domain_suspend_state *dss)
+                                           libxl__domain_suspend_state *dsps)
 {
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(dsps->ao);
     uint64_t hvm_s_state = 0, hvm_pvdrv = 0;
     int ret, rc;
 
     /* Convenience aliases */
-    const uint32_t domid = dss->domid;
+    const uint32_t domid = dsps->domid;
 
-    if (dss->hvm) {
+    if (dsps->type == LIBXL_DOMAIN_TYPE_HVM) {
         xc_hvm_param_get(CTX->xch, domid, HVM_PARAM_CALLBACK_IRQ, &hvm_pvdrv);
         xc_hvm_param_get(CTX->xch, domid, HVM_PARAM_ACPI_S_STATE, &hvm_s_state);
     }
 
-    if ((hvm_s_state == 0) && (dss->guest_evtchn.port >= 0)) {
+    if ((hvm_s_state == 0) && (dsps->guest_evtchn.port >= 0)) {
         LOG(DEBUG, "issuing %s suspend request via event channel",
-            dss->hvm ? "PVHVM" : "PV");
-        ret = xenevtchn_notify(CTX->xce, dss->guest_evtchn.port);
+            dsps->type == LIBXL_DOMAIN_TYPE_HVM ? "PVHVM" : "PV");
+        ret = xenevtchn_notify(CTX->xce, dsps->guest_evtchn.port);
         if (ret < 0) {
             LOG(ERROR, "xenevtchn_notify failed ret=%d", ret);
             rc = ERROR_FAIL;
             goto err;
         }
 
-        dss->guest_evtchn.callback = domain_suspend_common_wait_guest_evtchn;
-        rc = libxl__ev_evtchn_wait(gc, &dss->guest_evtchn);
+        dsps->guest_evtchn.callback = domain_suspend_common_wait_guest_evtchn;
+        rc = libxl__ev_evtchn_wait(gc, &dsps->guest_evtchn);
         if (rc) goto err;
 
-        rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout,
+        rc = libxl__ev_time_register_rel(ao, &dsps->guest_timeout,
                                          suspend_common_wait_guest_timeout,
                                          60*1000);
         if (rc) goto err;
@@ -128,7 +175,7 @@ static void domain_suspend_callback_common(libxl__egc *egc,
         return;
     }
 
-    if (dss->hvm && (!hvm_pvdrv || hvm_s_state)) {
+    if (dsps->type == LIBXL_DOMAIN_TYPE_HVM && (!hvm_pvdrv || hvm_s_state)) {
         LOG(DEBUG, "Calling xc_domain_shutdown on HVM domain");
         ret = xc_domain_shutdown(CTX->xch, domid, SHUTDOWN_suspend);
         if (ret < 0) {
@@ -137,55 +184,55 @@ static void domain_suspend_callback_common(libxl__egc *egc,
             goto err;
         }
         /* The guest does not (need to) respond to this sort of request. */
-        dss->guest_responded = 1;
-        domain_suspend_common_wait_guest(egc, dss);
+        dsps->guest_responded = 1;
+        domain_suspend_common_wait_guest(egc, dsps);
         return;
     }
 
     LOG(DEBUG, "issuing %s suspend request via XenBus control node",
-        dss->hvm ? "PVHVM" : "PV");
+        dsps->type == LIBXL_DOMAIN_TYPE_HVM ? "PVHVM" : "PV");
 
     libxl__domain_pvcontrol_write(gc, XBT_NULL, domid, "suspend");
 
-    dss->pvcontrol.path = libxl__domain_pvcontrol_xspath(gc, domid);
-    if (!dss->pvcontrol.path) { rc = ERROR_FAIL; goto err; }
+    dsps->pvcontrol.path = libxl__domain_pvcontrol_xspath(gc, domid);
+    if (!dsps->pvcontrol.path) { rc = ERROR_FAIL; goto err; }
 
-    dss->pvcontrol.ao = ao;
-    dss->pvcontrol.what = "guest acknowledgement of suspend request";
-    dss->pvcontrol.timeout_ms = 60 * 1000;
-    dss->pvcontrol.callback = domain_suspend_common_pvcontrol_suspending;
-    libxl__xswait_start(gc, &dss->pvcontrol);
+    dsps->pvcontrol.ao = ao;
+    dsps->pvcontrol.what = "guest acknowledgement of suspend request";
+    dsps->pvcontrol.timeout_ms = 60 * 1000;
+    dsps->pvcontrol.callback = domain_suspend_common_pvcontrol_suspending;
+    libxl__xswait_start(gc, &dsps->pvcontrol);
     return;
 
  err:
-    domain_suspend_common_done(egc, dss, rc);
+    domain_suspend_common_done(egc, dsps, rc);
 }
 
 static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
         libxl__ev_evtchn *evev)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(evev, *dss, guest_evtchn);
-    STATE_AO_GC(dss->ao);
+    libxl__domain_suspend_state *dsps = CONTAINER_OF(evev, *dsps, guest_evtchn);
+    STATE_AO_GC(dsps->ao);
     /* If we should be done waiting, suspend_common_wait_guest_check
      * will end up calling domain_suspend_common_guest_suspended or
      * domain_suspend_common_done, both of which cancel the evtchn
      * wait as needed.  So re-enable it now. */
-    libxl__ev_evtchn_wait(gc, &dss->guest_evtchn);
-    suspend_common_wait_guest_check(egc, dss);
+    libxl__ev_evtchn_wait(gc, &dsps->guest_evtchn);
+    suspend_common_wait_guest_check(egc, dsps);
 }
 
 static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
       libxl__xswait_state *xswa, int rc, const char *state)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(xswa, *dss, pvcontrol);
-    STATE_AO_GC(dss->ao);
+    libxl__domain_suspend_state *dsps = CONTAINER_OF(xswa, *dsps, pvcontrol);
+    STATE_AO_GC(dsps->ao);
     xs_transaction_t t = 0;
 
     if (!rc && !domain_suspend_pvcontrol_acked(state))
         /* keep waiting */
         return;
 
-    libxl__xswait_stop(gc, &dss->pvcontrol);
+    libxl__xswait_stop(gc, &dsps->pvcontrol);
 
     if (rc == ERROR_TIMEDOUT) {
         /*
@@ -228,56 +275,56 @@ static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
     LOG(DEBUG, "guest acknowledged suspend request");
 
     libxl__xs_transaction_abort(gc, &t);
-    dss->guest_responded = 1;
-    domain_suspend_common_wait_guest(egc,dss);
+    dsps->guest_responded = 1;
+    domain_suspend_common_wait_guest(egc,dsps);
     return;
 
  err:
     libxl__xs_transaction_abort(gc, &t);
-    domain_suspend_common_done(egc, dss, rc);
+    domain_suspend_common_done(egc, dsps, rc);
     return;
 }
 
 static void domain_suspend_common_wait_guest(libxl__egc *egc,
-                                             libxl__domain_suspend_state *dss)
+                                             libxl__domain_suspend_state *dsps)
 {
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(dsps->ao);
     int rc;
 
     LOG(DEBUG, "wait for the guest to suspend");
 
-    rc = libxl__ev_xswatch_register(gc, &dss->guest_watch,
+    rc = libxl__ev_xswatch_register(gc, &dsps->guest_watch,
                                     suspend_common_wait_guest_watch,
                                     "@releaseDomain");
     if (rc) goto err;
 
-    rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout,
+    rc = libxl__ev_time_register_rel(ao, &dsps->guest_timeout,
                                      suspend_common_wait_guest_timeout,
                                      60*1000);
     if (rc) goto err;
     return;
 
  err:
-    domain_suspend_common_done(egc, dss, rc);
+    domain_suspend_common_done(egc, dsps, rc);
 }
 
 static void suspend_common_wait_guest_watch(libxl__egc *egc,
       libxl__ev_xswatch *xsw, const char *watch_path, const char *event_path)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(xsw, *dss, guest_watch);
-    suspend_common_wait_guest_check(egc, dss);
+    libxl__domain_suspend_state *dsps = CONTAINER_OF(xsw, *dsps, guest_watch);
+    suspend_common_wait_guest_check(egc, dsps);
 }
 
 static void suspend_common_wait_guest_check(libxl__egc *egc,
-        libxl__domain_suspend_state *dss)
+        libxl__domain_suspend_state *dsps)
 {
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(dsps->ao);
     xc_domaininfo_t info;
     int ret;
     int shutdown_reason;
 
     /* Convenience aliases */
-    const uint32_t domid = dss->domid;
+    const uint32_t domid = dsps->domid;
 
     ret = xc_domain_getinfolist(CTX->xch, domid, 1, &info);
     if (ret < 0) {
@@ -304,71 +351,73 @@ static void suspend_common_wait_guest_check(libxl__egc *egc,
     }
 
     LOG(DEBUG, "guest has suspended");
-    domain_suspend_common_guest_suspended(egc, dss);
+    domain_suspend_common_guest_suspended(egc, dsps);
     return;
 
  err:
-    domain_suspend_common_done(egc, dss, ERROR_FAIL);
+    domain_suspend_common_done(egc, dsps, ERROR_FAIL);
 }
 
 static void suspend_common_wait_guest_timeout(libxl__egc *egc,
       libxl__ev_time *ev, const struct timeval *requested_abs, int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, guest_timeout);
-    STATE_AO_GC(dss->ao);
+    libxl__domain_suspend_state *dsps = CONTAINER_OF(ev, *dsps, guest_timeout);
+    STATE_AO_GC(dsps->ao);
     if (rc == ERROR_TIMEDOUT) {
         LOG(ERROR, "guest did not suspend, timed out");
         rc = ERROR_GUEST_TIMEDOUT;
     }
-    domain_suspend_common_done(egc, dss, rc);
+    domain_suspend_common_done(egc, dsps, rc);
 }
 
 static void domain_suspend_common_guest_suspended(libxl__egc *egc,
-                                         libxl__domain_suspend_state *dss)
+                                         libxl__domain_suspend_state *dsps)
 {
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(dsps->ao);
     int rc;
 
-    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
-    libxl__ev_xswatch_deregister(gc, &dss->guest_watch);
-    libxl__ev_time_deregister(gc, &dss->guest_timeout);
+    libxl__ev_evtchn_cancel(gc, &dsps->guest_evtchn);
+    libxl__ev_xswatch_deregister(gc, &dsps->guest_watch);
+    libxl__ev_time_deregister(gc, &dsps->guest_timeout);
 
-    if (dss->hvm) {
-        rc = libxl__domain_suspend_device_model(gc, dss);
+    if (dsps->type == LIBXL_DOMAIN_TYPE_HVM) {
+        rc = libxl__domain_suspend_device_model(gc, dsps);
         if (rc) {
             LOG(ERROR, "libxl__domain_suspend_device_model failed ret=%d", rc);
-            domain_suspend_common_done(egc, dss, rc);
+            domain_suspend_common_done(egc, dsps, rc);
             return;
         }
     }
-    domain_suspend_common_done(egc, dss, 0);
+    domain_suspend_common_done(egc, dsps, 0);
 }
 
 static void domain_suspend_common_done(libxl__egc *egc,
-                                       libxl__domain_suspend_state *dss,
+                                       libxl__domain_suspend_state *dsps,
                                        int rc)
 {
     EGC_GC;
-    assert(!libxl__xswait_inuse(&dss->pvcontrol));
-    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
-    libxl__ev_xswatch_deregister(gc, &dss->guest_watch);
-    libxl__ev_time_deregister(gc, &dss->guest_timeout);
-    dss->callback_common_done(egc, dss, rc);
+    assert(!libxl__xswait_inuse(&dsps->pvcontrol));
+    libxl__ev_evtchn_cancel(gc, &dsps->guest_evtchn);
+    libxl__ev_xswatch_deregister(gc, &dsps->guest_watch);
+    libxl__ev_time_deregister(gc, &dsps->guest_timeout);
+    dsps->callback_common_done(egc, dsps, rc);
 }
 
 void libxl__domain_suspend_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
+    libxl__domain_suspend_state *dsps = &dss->dsps;
 
-    dss->callback_common_done = domain_suspend_callback_common_done;
-    domain_suspend_callback_common(egc, dss);
+    dsps->callback_common_done = domain_suspend_callback_common_done;
+    domain_suspend_callback_common(egc, dsps);
 }
 
 static void domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int rc)
+                                libxl__domain_suspend_state *dsps, int rc)
 {
+    libxl__domain_save_state *dss = CONTAINER_OF(dsps, *dss, dsps);
     dss->rc = rc;
     libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index d9b9e2a..82c3610 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3011,11 +3011,12 @@ static inline bool libxl__conversion_helper_inuse
  */
 
 typedef struct libxl__domain_suspend_state libxl__domain_suspend_state;
+typedef struct libxl__domain_save_state libxl__domain_save_state;
 
-typedef void libxl__domain_suspend_cb(libxl__egc*,
-                                      libxl__domain_suspend_state*, int rc);
+typedef void libxl__domain_save_cb(libxl__egc*,
+                                   libxl__domain_save_state*, int rc);
 typedef void libxl__save_device_model_cb(libxl__egc*,
-                                         libxl__domain_suspend_state*, int rc);
+                                         libxl__domain_save_state*, int rc);
 
 /* State for writing a libxl migration v2 stream */
 typedef struct libxl__stream_write_state libxl__stream_write_state;
@@ -3024,7 +3025,7 @@ typedef void (*sws_record_done_cb)(libxl__egc *egc,
 struct libxl__stream_write_state {
     /* filled by the user */
     libxl__ao *ao;
-    libxl__domain_suspend_state *dss;
+    libxl__domain_save_state *dss;
     int fd;
     void (*completion_callback)(libxl__egc *egc,
                                 libxl__stream_write_state *sws,
@@ -3078,9 +3079,33 @@ typedef struct libxl__logdirty_switch {
 } libxl__logdirty_switch;
 
 struct libxl__domain_suspend_state {
+    /* set by caller of libxl__domain_suspend_init */
+    libxl__ao *ao;
+    uint32_t domid;
+
+    /* private */
+    libxl_domain_type type;
+
+    libxl__ev_evtchn guest_evtchn;
+    int guest_evtchn_lockfd;
+    int guest_responded;
+
+    libxl__xswait_state pvcontrol;
+    libxl__ev_xswatch guest_watch;
+    libxl__ev_time guest_timeout;
+
+    const char *dm_savefile;
+    void (*callback_common_done)(libxl__egc*,
+                                 struct libxl__domain_suspend_state*, int ok);
+};
+int libxl__domain_suspend_init(libxl__egc *egc,
+                               libxl__domain_suspend_state *dsps,
+                               libxl_domain_type type);
+
+struct libxl__domain_save_state {
     /* set by caller of libxl__domain_save */
     libxl__ao *ao;
-    libxl__domain_suspend_cb *callback;
+    libxl__domain_save_cb *callback;
 
     uint32_t domid;
     int fd;
@@ -3091,22 +3116,14 @@ struct libxl__domain_suspend_state {
     const libxl_domain_remus_info *remus;
     /* private */
     int rc;
-    libxl__ev_evtchn guest_evtchn;
-    int guest_evtchn_lockfd;
     int hvm;
     int xcflags;
-    int guest_responded;
-    libxl__xswait_state pvcontrol;
-    libxl__ev_xswatch guest_watch;
-    libxl__ev_time guest_timeout;
-    const char *dm_savefile;
+    libxl__domain_suspend_state dsps;
     libxl__remus_devices_state rds;
     libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
     int interval; /* checkpoint interval (for Remus) */
     libxl__stream_write_state sws;
     libxl__logdirty_switch logdirty;
-    void (*callback_common_done)(libxl__egc*,
-                                 struct libxl__domain_suspend_state*, int ok);
 };
 
 
@@ -3447,12 +3464,12 @@ struct libxl__domain_create_state {
 
 /* calls dss->callback when done */
 _hidden void libxl__domain_save(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss);
+                                libxl__domain_save_state *dss);
 
 
 /* calls libxl__xc_domain_suspend_done when done */
 _hidden void libxl__xc_domain_save(libxl__egc *egc,
-                                   libxl__domain_suspend_state *dss,
+                                   libxl__domain_save_state *dss,
                                    libxl__save_helper_state *shs);
 /* If rc==0 then retval is the return value from xc_domain_save
  * and errnoval is the errno value it provided.
@@ -3470,7 +3487,7 @@ void libxl__xc_domain_saverestore_async_callback_done(libxl__egc *egc,
 
 _hidden void libxl__domain_suspend_common_switch_qemu_logdirty
                                (int domid, unsigned int enable, void *data);
-_hidden int libxl__save_emulator_xenstore_data(libxl__domain_suspend_state *dss,
+_hidden int libxl__save_emulator_xenstore_data(libxl__domain_save_state *dss,
                                                char **buf, uint32_t *len);
 _hidden int libxl__restore_emulator_xenstore_data
     (libxl__domain_create_state *dcs, const char *ptr, uint32_t size);
@@ -3498,21 +3515,21 @@ static inline bool libxl__save_helper_inuse(const libxl__save_helper_state *shs)
 
 /* Each time the dm needs to be saved, we must call suspend and then save */
 _hidden int libxl__domain_suspend_device_model(libxl__gc *gc,
-                                           libxl__domain_suspend_state *dss);
+                                           libxl__domain_suspend_state *dsps);
 
 _hidden const char *libxl__device_model_savefile(libxl__gc *gc, uint32_t domid);
 
-/* calls dss->callback_common_done when done */
+/* calls dsps->callback_common_done when done */
 _hidden void libxl__domain_suspend(libxl__egc *egc,
-                                   libxl__domain_suspend_state *dss);
+                                   libxl__domain_suspend_state *dsps);
 /* used by libxc to suspend the guest during migration */
 _hidden void libxl__domain_suspend_callback(void *data);
 
 /* Remus setup and teardown */
 _hidden void libxl__remus_setup(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss);
+                                libxl__domain_save_state *dss);
 _hidden void libxl__remus_teardown(libxl__egc *egc,
-                                   libxl__domain_suspend_state *dss,
+                                   libxl__domain_save_state *dss,
                                    int rc);
 _hidden void libxl__remus_restore_setup(libxl__egc *egc,
                                         libxl__domain_create_state *dcs);
diff --git a/tools/libxl/libxl_netbuffer.c b/tools/libxl/libxl_netbuffer.c
index 107e867..c245a4e 100644
--- a/tools/libxl/libxl_netbuffer.c
+++ b/tools/libxl/libxl_netbuffer.c
@@ -41,7 +41,7 @@ int libxl__netbuffer_enabled(libxl__gc *gc)
 int init_subkind_nic(libxl__remus_devices_state *rds)
 {
     int rc, ret;
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
 
     STATE_AO_GC(rds->ao);
 
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index 567250d..340d076 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -31,7 +31,7 @@ static void libxl__remus_domain_resume_callback(void *data);
 static void libxl__remus_domain_save_checkpoint_callback(void *data);
 
 void libxl__remus_setup(libxl__egc *egc,
-                        libxl__domain_suspend_state *dss)
+                        libxl__domain_save_state *dss)
 {
     /* Convenience aliases */
     libxl__remus_devices_state *const rds = &dss->rds;
@@ -72,7 +72,7 @@ out:
 static void remus_setup_done(libxl__egc *egc,
                              libxl__remus_devices_state *rds, int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
     STATE_AO_GC(dss->ao);
 
     if (!rc) {
@@ -89,7 +89,7 @@ static void remus_setup_done(libxl__egc *egc,
 static void remus_setup_failed(libxl__egc *egc,
                                libxl__remus_devices_state *rds, int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -103,7 +103,7 @@ static void remus_teardown_done(libxl__egc *egc,
                                 libxl__remus_devices_state *rds,
                                 int rc);
 void libxl__remus_teardown(libxl__egc *egc,
-                           libxl__domain_suspend_state *dss,
+                           libxl__domain_save_state *dss,
                            int rc)
 {
     EGC_GC;
@@ -118,7 +118,7 @@ static void remus_teardown_done(libxl__egc *egc,
                                 libxl__remus_devices_state *rds,
                                 int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -131,7 +131,7 @@ static void remus_teardown_done(libxl__egc *egc,
 /*---------------------- remus callbacks (save) -----------------------*/
 
 static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int ok);
+                                libxl__domain_suspend_state *dsps, int ok);
 static void remus_devices_postsuspend_cb(libxl__egc *egc,
                                          libxl__remus_devices_state *rds,
                                          int rc);
@@ -143,15 +143,18 @@ static void libxl__remus_domain_suspend_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
+    libxl__domain_suspend_state *dsps = &dss->dsps;
 
-    dss->callback_common_done = remus_domain_suspend_callback_common_done;
-    libxl__domain_suspend(egc, dss);
+    dsps->callback_common_done = remus_domain_suspend_callback_common_done;
+    libxl__domain_suspend(egc, dsps);
 }
 
 static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int rc)
+                                libxl__domain_suspend_state *dsps, int rc)
 {
+    libxl__domain_save_state *dss = CONTAINER_OF(dsps, *dss, dsps);
+
     if (rc)
         goto out;
 
@@ -169,7 +172,7 @@ static void remus_devices_postsuspend_cb(libxl__egc *egc,
                                          libxl__remus_devices_state *rds,
                                          int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
 
     if (rc)
         goto out;
@@ -186,7 +189,7 @@ static void libxl__remus_domain_resume_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
     STATE_AO_GC(dss->ao);
 
     libxl__remus_devices_state *const rds = &dss->rds;
@@ -198,7 +201,7 @@ static void remus_devices_preresume_cb(libxl__egc *egc,
                                        libxl__remus_devices_state *rds,
                                        int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -229,7 +232,7 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
 static void libxl__remus_domain_save_checkpoint_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
     libxl__egc *egc = shs->egc;
     STATE_AO_GC(dss->ao);
 
@@ -239,7 +242,7 @@ static void libxl__remus_domain_save_checkpoint_callback(void *data)
 static void remus_checkpoint_stream_written(
     libxl__egc *egc, libxl__stream_write_state *sws, int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(sws, *dss, sws);
+    libxl__domain_save_state *dss = CONTAINER_OF(sws, *dss, sws);
 
     /* Convenience aliases */
     libxl__remus_devices_state *const rds = &dss->rds;
@@ -264,7 +267,7 @@ static void remus_devices_commit_cb(libxl__egc *egc,
                                     libxl__remus_devices_state *rds,
                                     int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
 
     STATE_AO_GC(dss->ao);
 
@@ -299,7 +302,7 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
                                   const struct timeval *requested_abs,
                                   int rc)
 {
-    libxl__domain_suspend_state *dss =
+    libxl__domain_save_state *dss =
                             CONTAINER_OF(ev, *dss, checkpoint_timeout);
 
     STATE_AO_GC(dss->ao);
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 45b9727..94b6b67 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -75,7 +75,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
                argnums, ARRAY_SIZE(argnums));
 }
 
-void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss,
+void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_save_state *dss,
                            libxl__save_helper_state *shs)
 {
     STATE_AO_GC(dss->ao);
diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
index 21b4b51..9053146 100644
--- a/tools/libxl/libxl_stream_write.c
+++ b/tools/libxl/libxl_stream_write.c
@@ -216,7 +216,7 @@ void libxl__stream_write_start(libxl__egc *egc,
                                libxl__stream_write_state *stream)
 {
     libxl__datacopier_state *dc = &stream->dc;
-    libxl__domain_suspend_state *dss = stream->dss;
+    libxl__domain_save_state *dss = stream->dss;
     STATE_AO_GC(stream->ao);
     struct libxl__sr_hdr hdr;
     int rc = 0;
@@ -324,7 +324,7 @@ static void libxc_header_done(libxl__egc *egc,
 void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
                                 int rc, int retval, int errnoval)
 {
-    libxl__domain_suspend_state *dss = dss_void;
+    libxl__domain_save_state *dss = dss_void;
     libxl__stream_write_state *stream = &dss->sws;
     STATE_AO_GC(dss->ao);
 
@@ -333,10 +333,10 @@ void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
 
     if (retval) {
         LOGEV(ERROR, errnoval, "saving domain: %s",
-              dss->guest_responded ?
+              dss->dsps.guest_responded ?
               "domain responded to suspend request" :
               "domain did not respond to suspend request");
-        if (!dss->guest_responded)
+        if (!dss->dsps.guest_responded)
             rc = ERROR_GUEST_TIMEDOUT;
         else if (dss->rc)
             rc = dss->rc;
@@ -371,7 +371,7 @@ void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
 static void write_emulator_xenstore_record(libxl__egc *egc,
                                            libxl__stream_write_state *stream)
 {
-    libxl__domain_suspend_state *dss = stream->dss;
+    libxl__domain_save_state *dss = stream->dss;
     STATE_AO_GC(stream->ao);
     struct libxl__sr_rec_hdr rec;
     int rc;
@@ -410,7 +410,7 @@ static void write_emulator_xenstore_record(libxl__egc *egc,
 static void emulator_xenstore_record_done(libxl__egc *egc,
                                           libxl__stream_write_state *stream)
 {
-    libxl__domain_suspend_state *dss = stream->dss;
+    libxl__domain_save_state *dss = stream->dss;
 
     if (dss->type == LIBXL_DOMAIN_TYPE_HVM)
         write_emulator_context_record(egc, stream);
@@ -425,7 +425,7 @@ static void emulator_xenstore_record_done(libxl__egc *egc,
 static void write_emulator_context_record(libxl__egc *egc,
                                           libxl__stream_write_state *stream)
 {
-    libxl__domain_suspend_state *dss = stream->dss;
+    libxl__domain_save_state *dss = stream->dss;
     libxl__datacopier_state *dc = &stream->emu_dc;
     STATE_AO_GC(stream->ao);
     struct libxl__sr_rec_hdr *rec = &stream->emu_rec_hdr;
@@ -440,7 +440,7 @@ static void write_emulator_context_record(libxl__egc *egc,
     }
 
     /* Convenience aliases */
-    const char *const filename = dss->dm_savefile;
+    const char *const filename = dss->dsps.dm_savefile;
 
     libxl__carefd_begin();
     int readfd = open(filename, O_RDONLY);
-- 
2.5.0

* [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests
  2016-02-18  2:43 [PATCH v8 00/13] Prerequisite patches for COLO Wen Congyang
                   ` (3 preceding siblings ...)
  2016-02-18  2:43 ` [PATCH v8 04/13] libxl/save: Refactor libxl__domain_suspend_state Wen Congyang
@ 2016-02-18  2:43 ` Wen Congyang
  2016-02-18 12:13   ` Wei Liu
  2016-02-18  2:43 ` [PATCH v8 06/13] tools/libxl: introduce enum type libxl_checkpointed_stream Wen Congyang
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 26+ messages in thread
From: Wen Congyang @ 2016-02-18  2:43 UTC (permalink / raw)
  To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell,
	Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

Before this patch:
1. Suspend
a. PVHVM and PV: both are suspended the same way, by sending a suspend
   request to the guest. If the guest doesn't support evtchn, the xenstore
   variant is used, suspending the guest via the XenBus control node.
b. pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to suspend
   the guest.

2. Resume
a. fast path (fast=1)
   Do not change the guest state. We call libxl__domain_resume(..., 1), which
   calls xc_domain_resume(..., 1 /* fast=1 */) to resume the guest.
   PV:       modify the return code to 1, and then call the domctl
             XEN_DOMCTL_resumedomain
   PVHVM:    same as PV
   pure HVM: do nothing in modify_returncode, and then call the domctl
             XEN_DOMCTL_resumedomain
b. slow path
   Used when the guest's state has been changed. We call
   libxl__domain_resume(..., 0) to resume the guest.
   PV:       update start info and reset all secondary CPU states, then call
             the domctl XEN_DOMCTL_resumedomain
   PVHVM:    cannot be resumed. You will get the following error message:
                 "Cannot resume uncooperative HVM guests"
   pure HVM: same as PVHVM

After this patch:
1. suspend
   unchanged

2. Resume
a. fast path:
   unchanged
b. slow path
   PV:       unchanged
   PVHVM:    call XEN_DOMCTL_resumedomain to resume the guest. Because we
             don't modify the return code, the PV driver will disconnect
             and reconnect.
             The guest ends up doing the XENMAPSPACE_shared_info
             XENMEM_add_to_physmap hypercall and resetting all of its CPU
             states to point to the shared_info (except the ones past 32).
             The Linux kernel does this regardless of whether
             SCHEDOP_shutdown:SHUTDOWN_suspend returns 1 or not.
   pure HVM: call XEN_DOMCTL_resumedomain to resume the guest.

Under COLO, we update the guest's state (memory, CPU registers, device
status, ...). In this case we cannot use the fast path to resume it, so we
keep the return code at 0 and use the slow path. Resuming an HVM guest via
the slow path is currently not supported, so this patch makes that resume
call succeed instead of failing.
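
As a rough illustration only, the before/after resume matrix described above
can be written as a small decision function. This is a sketch, not code from
libxc: the enum names, the function choose_resume_prep() and its "patched"
flag are hypothetical and exist only to restate the table.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three guest kinds in the commit message. */
enum guest_kind { GUEST_PV, GUEST_PVHVM, GUEST_PURE_HVM };

/* Hypothetical labels for what happens before XEN_DOMCTL_resumedomain. */
enum resume_prep {
    PREP_SET_RETURN_CODE, /* fast path: suspend hypercall is made to return 1 */
    PREP_NOTHING,         /* go straight to the domctl */
    PREP_REBUILD_PV,      /* slow PV: update start info, reset secondary CPUs */
    PREP_REFUSED          /* pre-patch: "Cannot resume uncooperative HVM guests" */
};

/* Restates the table: 'fast' selects the path, 'patched' selects
 * the behaviour before (false) or after (true) this patch. */
static enum resume_prep choose_resume_prep(enum guest_kind kind,
                                           bool fast, bool patched)
{
    if (fast)
        return kind == GUEST_PURE_HVM ? PREP_NOTHING : PREP_SET_RETURN_CODE;

    /* Slow path: PVHVM and pure HVM are treated alike. */
    if (kind != GUEST_PV)
        return patched ? PREP_NOTHING : PREP_REFUSED;

    assert(kind == GUEST_PV);
    return PREP_REBUILD_PV;
}
```

Only the slow-path HVM row differs between patched=false and patched=true;
every other combination is unchanged by this patch.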

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 tools/libxc/xc_resume.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c
index e692b81..4eedf87 100644
--- a/tools/libxc/xc_resume.c
+++ b/tools/libxc/xc_resume.c
@@ -108,6 +108,26 @@ static int xc_domain_resume_cooperative(xc_interface *xch, uint32_t domid)
     return do_domctl(xch, &domctl);
 }
 
+static int xc_domain_resume_hvm(xc_interface *xch, uint32_t domid)
+{
+    DECLARE_DOMCTL;
+
+    /*
+     * The domctl XEN_DOMCTL_resumedomain unpauses each vcpu. After
+     * the domctl, the guest will run.
+     *
+     * If it is PVHVM, the guest called the hypercall
+     *    SCHEDOP_shutdown:SHUTDOWN_suspend
+     * to suspend itself. We don't modify the return code, so the PV driver
+     * will disconnect and reconnect.
+     *
+     * If it is a HVM, the guest will continue running.
+     */
+    domctl.cmd = XEN_DOMCTL_resumedomain;
+    domctl.domain = domid;
+    return do_domctl(xch, &domctl);
+}
+
 static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
 {
     DECLARE_DOMCTL;
@@ -137,10 +157,7 @@ static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
      */
 #if defined(__i386__) || defined(__x86_64__)
     if ( info.hvm )
-    {
-        ERROR("Cannot resume uncooperative HVM guests");
-        return rc;
-    }
+        return xc_domain_resume_hvm(xch, domid);
 
     if ( xc_domain_get_guest_width(xch, domid, &dinfo->guest_width) != 0 )
     {
-- 
2.5.0

* [PATCH v8 06/13] tools/libxl: introduce enum type libxl_checkpointed_stream
  2016-02-18  2:43 [PATCH v8 00/13] Prerequisite patches for COLO Wen Congyang
                   ` (4 preceding siblings ...)
  2016-02-18  2:43 ` [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests Wen Congyang
@ 2016-02-18  2:43 ` Wen Congyang
  2016-02-18  2:43 ` [PATCH v8 07/13] migration/save: pass checkpointed_stream from libxl to libxc Wen Congyang
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Wen Congyang @ 2016-02-18  2:43 UTC (permalink / raw)
  To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell,
	Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

Introduce the enum type libxl_checkpointed_stream in the IDL.
Rename the last argument of migrate_receive from "remus" to
"checkpointed", since the semantics of this parameter have
changed.

NOTE:
 libxl_domain_restore_params and domain_create aren't changed here;
 checkpointed_stream is still an int there, because the value is
 passed from libxl to libxc.
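
For readers unfamiliar with the pattern, the switch that this patch adds to
domcreate_bootloader_done relies on deliberate fall-through: the Remus case
does its extra setup and then shares the common "start reading the stream"
step. A minimal standalone sketch follows. The enum stream_kind, its numeric
values, struct restore_trace and start_restore() are hypothetical stand-ins,
not the generated libxl types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the IDL-generated enum; values are assumed. */
enum stream_kind { STREAM_NONE = 0, STREAM_REMUS = 1 };

/* Records which steps ran, in place of the real libxl calls. */
struct restore_trace {
    int remus_setup_calls;   /* stands in for libxl__remus_restore_setup */
    int stream_start_calls;  /* stands in for libxl__stream_read_start */
};

static void start_restore(enum stream_kind kind, struct restore_trace *t)
{
    assert(t != NULL);

    switch (kind) {
    case STREAM_REMUS:
        t->remus_setup_calls++;
        /* fall through */
    case STREAM_NONE:
        t->stream_start_calls++;
        break;
    }
}
```

With this shape, a later stream type can get its own case label and setup
step without touching the shared start step.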

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl.h             |  7 +++++++
 tools/libxl/libxl_create.c      |  8 ++++++--
 tools/libxl/libxl_stream_read.c |  7 +++++--
 tools/libxl/libxl_types.idl     |  5 +++++
 tools/libxl/xl_cmdimpl.c        | 18 ++++++++++++------
 5 files changed, 35 insertions(+), 10 deletions(-)
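
For readers skimming the series, here is a minimal stand-alone sketch of
the dispatch pattern used in domcreate_bootloader_done(): the Remus case
does its extra setup and then falls through to the common stream start.
The typedef below only imitates what the IDL entry is expected to
generate, and restore_steps() and the STEP_* flags are hypothetical names
made up for this illustration; none of it is libxl code.

```c
/* Stand-in for the enum generated from the IDL entry (assumed shape). */
typedef enum {
    LIBXL_CHECKPOINTED_STREAM_NONE = 0,
    LIBXL_CHECKPOINTED_STREAM_REMUS = 1,
} libxl_checkpointed_stream;

/* Hypothetical flags recording which steps ran; not part of libxl. */
#define STEP_REMUS_SETUP  (1u << 0)
#define STEP_STREAM_START (1u << 1)

/*
 * Same shape as the switch in the patch: REMUS performs its setup and
 * then falls through to the step shared with a plain stream.
 */
static unsigned restore_steps(libxl_checkpointed_stream cs)
{
    unsigned steps = 0;

    switch (cs) {
    case LIBXL_CHECKPOINTED_STREAM_REMUS:
        steps |= STEP_REMUS_SETUP;
        /* fall through */
    case LIBXL_CHECKPOINTED_STREAM_NONE:
        steps |= STEP_STREAM_START;
        break;
    }

    return steps;
}
```

Because the switch has no default case, a value outside the enum starts
nothing in this sketch. Whether libxl wants that behaviour or an explicit
error is a separate question that the sketch does not try to answer.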

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index fa87f53..6225db1 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -876,6 +876,13 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
  */
 #define LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE 1
 
+/*
+ * LIBXL_HAVE_CHECKPOINTED_STREAM
+ *
+ * If this is defined, then libxl_checkpointed_stream exists.
+ */
+#define LIBXL_HAVE_CHECKPOINTED_STREAM 1
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index ad1d50c..f1028bc 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1033,9 +1033,13 @@ static void domcreate_bootloader_done(libxl__egc *egc,
     dcs->srs.completion_callback = domcreate_stream_done;
 
     if (restore_fd >= 0) {
-        if (checkpointed_stream)
+        switch (checkpointed_stream) {
+        case LIBXL_CHECKPOINTED_STREAM_REMUS:
             libxl__remus_restore_setup(egc, dcs);
-        libxl__stream_read_start(egc, &dcs->srs);
+            /* fall through */
+        case LIBXL_CHECKPOINTED_STREAM_NONE:
+            libxl__stream_read_start(egc, &dcs->srs);
+        }
         return;
     }
 
diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index dac134e..f4781eb 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -794,19 +794,22 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
      * If the stream is not still alive, we must not continue any work.
      */
     if (libxl__stream_read_inuse(stream)) {
-        if (checkpointed_stream) {
+        switch (checkpointed_stream) {
+        case LIBXL_CHECKPOINTED_STREAM_REMUS:
             /*
              * Failover from primary. Domain state is currently at a
              * consistent checkpoint, complete the stream, and call
              * stream->completion_callback() to resume the guest.
              */
             stream_complete(egc, stream, 0);
-        } else {
+            break;
+        case LIBXL_CHECKPOINTED_STREAM_NONE:
             /*
              * Libxc has indicated that it is done with the stream.
              * Resume reading libxl records from it.
              */
             stream_continue(egc, stream);
+            break;
         }
     }
 }
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 9ad7eba..b8fb22f 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -228,6 +228,11 @@ libxl_hdtype = Enumeration("hdtype", [
     (2, "AHCI"),
     ], init_val = "LIBXL_HDTYPE_IDE")
 
+libxl_checkpointed_stream = Enumeration("checkpointed_stream", [
+    (0, "NONE"),
+    (1, "REMUS"),
+    ])
+
 #
 # Complex libxl types
 #
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index d07ccb2..6597ebd 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -4426,7 +4426,8 @@ static void migrate_domain(uint32_t domid, const char *rune, int debug,
 }
 
 static void migrate_receive(int debug, int daemonize, int monitor,
-                            int send_fd, int recv_fd, int remus)
+                            int send_fd, int recv_fd,
+                            libxl_checkpointed_stream checkpointed)
 {
     uint32_t domid;
     int rc, rc2;
@@ -4451,7 +4452,7 @@ static void migrate_receive(int debug, int daemonize, int monitor,
     dom_info.paused = 1;
     dom_info.migrate_fd = recv_fd;
     dom_info.migration_domname_r = &migration_domname;
-    dom_info.checkpointed_stream = remus;
+    dom_info.checkpointed_stream = checkpointed;
 
     rc = create_domain(&dom_info);
     if (rc < 0) {
@@ -4462,7 +4463,8 @@ static void migrate_receive(int debug, int daemonize, int monitor,
 
     domid = rc;
 
-    if (remus) {
+    switch (checkpointed) {
+    case LIBXL_CHECKPOINTED_STREAM_REMUS:
         /* If we are here, it means that the sender (primary) has crashed.
          * TODO: Split-Brain Check.
          */
@@ -4495,6 +4497,9 @@ static void migrate_receive(int debug, int daemonize, int monitor,
                     common_domname, domid, rc);
 
         exit(rc ? -ERROR_FAIL: 0);
+    default:
+        /* do nothing */
+        break;
     }
 
     fprintf(stderr, "migration target: Transfer complete,"
@@ -4632,7 +4637,8 @@ int main_restore(int argc, char **argv)
 
 int main_migrate_receive(int argc, char **argv)
 {
-    int debug = 0, daemonize = 1, monitor = 1, remus = 0;
+    int debug = 0, daemonize = 1, monitor = 1;
+    libxl_checkpointed_stream checkpointed = LIBXL_CHECKPOINTED_STREAM_NONE;
     int opt;
 
     SWITCH_FOREACH_OPT(opt, "Fedr", NULL, "migrate-receive", 0) {
@@ -4647,7 +4653,7 @@ int main_migrate_receive(int argc, char **argv)
         debug = 1;
         break;
     case 'r':
-        remus = 1;
+        checkpointed = LIBXL_CHECKPOINTED_STREAM_REMUS;
         break;
     }
 
@@ -4657,7 +4663,7 @@ int main_migrate_receive(int argc, char **argv)
     }
     migrate_receive(debug, daemonize, monitor,
                     STDOUT_FILENO, STDIN_FILENO,
-                    remus);
+                    checkpointed);
 
     return 0;
 }
-- 
2.5.0


* [PATCH v8 07/13] migration/save: pass checkpointed_stream from libxl to libxc
  2016-02-18  2:43 [PATCH v8 00/13] Prerequisite patches for COLO Wen Congyang
                   ` (5 preceding siblings ...)
  2016-02-18  2:43 ` [PATCH v8 06/13] tools/libxl: introduce enum type libxl_checkpointed_stream Wen Congyang
@ 2016-02-18  2:43 ` Wen Congyang
  2016-02-18  2:43 ` [PATCH v8 08/13] tools/libxl: export logdirty_init Wen Congyang
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Wen Congyang @ 2016-02-18  2:43 UTC (permalink / raw)
  To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell,
	Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Ian Campbell, Wen Congyang,
	Gui Jianfeng, Jiang Yunhong, Dong Eddie, Shriram Rajagopalan,
	Ian Jackson, Yang Hongyang

Pass checkpointed_stream from libxl to libxc.
This doesn't affect legacy migration, because legacy migration
doesn't use this parameter.

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/include/xenguest.h   |  6 ++++--
 tools/libxc/xc_nomigrate.c       |  3 ++-
 tools/libxc/xc_sr_common.h       | 12 +++++++++++-
 tools/libxc/xc_sr_save.c         | 17 +++++++++++------
 tools/libxl/libxl.c              |  2 ++
 tools/libxl/libxl_dom_save.c     | 11 ++++++++---
 tools/libxl/libxl_internal.h     |  1 +
 tools/libxl/libxl_save_callout.c |  2 +-
 tools/libxl/libxl_save_helper.c  |  3 ++-
 tools/libxl/libxl_stream_write.c |  2 +-
 tools/libxl/libxl_types.idl      |  1 +
 11 files changed, 44 insertions(+), 16 deletions(-)
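
The value travels from libxl to the save helper as one more decimal
argument and is then handed to xc_domain_save(). The following is a
rough, self-contained sketch of that round trip, assuming the helper
wanted to reject unknown values instead of asserting on them.
format_arg() and parse_checkpointed_stream() are hypothetical helpers
written for this note; only the MIG_STREAM_* names come from the patch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Same two values as the enum added to struct xc_sr_context. */
enum { MIG_STREAM_NONE, MIG_STREAM_REMUS };

/* Hypothetical: render one numeric argument as a decimal string. */
static int format_arg(char *buf, size_t len, unsigned long v)
{
    int n = snprintf(buf, len, "%lu", v);

    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/*
 * Hypothetical: parse the argument on the helper side and accept only
 * the stream types that the save code knows about.
 */
static int parse_checkpointed_stream(const char *arg, int *out)
{
    char *end;
    unsigned long v;

    if (!arg || !*arg)
        return -1;

    v = strtoul(arg, &end, 10);
    if (*end != '\0')
        return -1;                  /* trailing garbage */

    if (v != MIG_STREAM_NONE && v != MIG_STREAM_REMUS)
        return -1;                  /* unknown stream type */

    *out = (int)v;
    return 0;
}
```

The patch itself keeps the plain strtoul() call in the helper and relies
on the assert() in xc_domain_save() to catch unexpected values; returning
an error as above is just one alternative design, not what the series
does.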

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index d48b3ff..affc42b 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -29,7 +29,6 @@
 #define XCFLAGS_HVM       (1 << 2)
 #define XCFLAGS_STDVGA    (1 << 3)
 #define XCFLAGS_CHECKPOINT_COMPRESS    (1 << 4)
-#define XCFLAGS_CHECKPOINTED    (1 << 5)
 
 #define X86_64_B_SIZE   64 
 #define X86_32_B_SIZE   32
@@ -82,11 +81,14 @@ struct save_callbacks {
  * @parm xch a handle to an open hypervisor interface
  * @parm fd the file descriptor to save a domain to
  * @parm dom the id of the domain
+ * @param checkpointed_stream MIG_STREAM_NONE if the far end of the stream
+ *        doesn't use checkpointing
  * @return 0 on success, -1 on failure
  */
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
                    uint32_t max_factor, uint32_t flags /* XCFLAGS_xxx */,
-                   struct save_callbacks* callbacks, int hvm);
+                   struct save_callbacks* callbacks, int hvm,
+                   int checkpointed_stream);
 
 /* callbacks provided by xc_domain_restore */
 struct restore_callbacks {
diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
index 902429e..c9124df 100644
--- a/tools/libxc/xc_nomigrate.c
+++ b/tools/libxc/xc_nomigrate.c
@@ -22,7 +22,8 @@
 
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
                    uint32_t max_factor, uint32_t flags,
-                   struct save_callbacks* callbacks, int hvm)
+                   struct save_callbacks* callbacks, int hvm,
+                   int checkpointed_stream)
 {
     errno = ENOSYS;
     return -1;
diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 60b43e8..66f595f 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -180,6 +180,16 @@ struct xc_sr_context
 
     xc_dominfo_t dominfo;
 
+    /*
+     * migration stream
+     * 0: Plain VM
+     * 1: Remus
+     */
+    enum {
+        MIG_STREAM_NONE, /* plain stream */
+        MIG_STREAM_REMUS,
+    } migration_stream;
+
     union /* Common save or restore data. */
     {
         struct /* Save data. */
@@ -191,7 +201,7 @@ struct xc_sr_context
             bool live;
 
             /* Plain VM, or checkpoints over time. */
-            bool checkpointed;
+            int checkpointed;
 
             /* Further debugging information in the stream. */
             bool debug;
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index ccb000e..e258b7c 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -629,7 +629,7 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
     if ( rc )
         goto out;
 
-    if ( ctx->save.debug && !ctx->save.checkpointed )
+    if ( ctx->save.debug && ctx->save.checkpointed != MIG_STREAM_NONE )
     {
         rc = verify_frames(ctx);
         if ( rc )
@@ -758,7 +758,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
 
         if ( ctx->save.live )
             rc = send_domain_memory_live(ctx);
-        else if ( ctx->save.checkpointed )
+        else if ( ctx->save.checkpointed != MIG_STREAM_NONE )
             rc = send_domain_memory_checkpointed(ctx);
         else
             rc = send_domain_memory_nonlive(ctx);
@@ -778,7 +778,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
         if ( rc )
             goto err;
 
-        if ( ctx->save.checkpointed )
+        if ( ctx->save.checkpointed != MIG_STREAM_NONE )
         {
             /*
              * We have now completed the initial live portion of the checkpoint
@@ -799,7 +799,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
             if ( rc <= 0 )
                 goto err;
         }
-    } while ( ctx->save.checkpointed );
+    } while ( ctx->save.checkpointed != MIG_STREAM_NONE );
 
     xc_report_progress_single(xch, "End of stream");
 
@@ -829,7 +829,8 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
 
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
                    uint32_t max_iters, uint32_t max_factor, uint32_t flags,
-                   struct save_callbacks* callbacks, int hvm)
+                   struct save_callbacks* callbacks, int hvm,
+                   int checkpointed_stream)
 {
     struct xc_sr_context ctx =
         {
@@ -841,7 +842,11 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
     ctx.save.callbacks = callbacks;
     ctx.save.live  = !!(flags & XCFLAGS_LIVE);
     ctx.save.debug = !!(flags & XCFLAGS_DEBUG);
-    ctx.save.checkpointed = !!(flags & XCFLAGS_CHECKPOINTED);
+    ctx.save.checkpointed = checkpointed_stream;
+
+    /* If altering migration_stream update this assert too. */
+    assert(checkpointed_stream == MIG_STREAM_NONE ||
+           checkpointed_stream == MIG_STREAM_REMUS);
 
     /*
      * TODO: Find some time to better tweak the live migration algorithm.
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index db5732c..58b4574 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -876,6 +876,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
     dss->live = 1;
     dss->debug = 0;
     dss->remus = info;
+    dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_REMUS;
 
     assert(info);
 
@@ -936,6 +937,7 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
     dss->type = type;
     dss->live = flags & LIBXL_SUSPEND_LIVE;
     dss->debug = flags & LIBXL_SUSPEND_DEBUG;
+    dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_NONE;
 
     rc = libxl__fd_flags_modify_save(gc, dss->fd,
                                      ~(O_NONBLOCK|O_NDELAY), 0,
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index aead042..a385500 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -338,6 +338,12 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
     unsigned int nr_vnodes = 0, nr_vmemranges = 0, nr_vcpus = 0;
     libxl__domain_suspend_state *dsps = &dss->dsps;
 
+    if (dss->checkpointed_stream != LIBXL_CHECKPOINTED_STREAM_NONE && !r_info) {
+        LOG(ERROR, "Migration stream is checkpointed, but there's no "
+                   "checkpoint info!");
+        goto out;
+    }
+
     dss->rc = 0;
     logdirty_init(&dss->logdirty);
     dsps->ao = ao;
@@ -376,14 +382,13 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
         goto out;
     }
 
-    if (r_info != NULL) {
+    if (dss->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_REMUS) {
         dss->interval = r_info->interval;
-        dss->xcflags |= XCFLAGS_CHECKPOINTED;
         if (libxl_defbool_val(r_info->compression))
             dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
     }
 
-    if (r_info == NULL)
+    if (dss->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_NONE)
         callbacks->suspend = libxl__domain_suspend_callback;
 
     callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 82c3610..ac6457f 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3113,6 +3113,7 @@ struct libxl__domain_save_state {
     libxl_domain_type type;
     int live;
     int debug;
+    int checkpointed_stream;
     const libxl_domain_remus_info *remus;
     /* private */
     int rc;
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 94b6b67..7f1f5d4 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -85,7 +85,7 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_save_state *dss,
 
     const unsigned long argnums[] = {
         dss->domid, 0, 0, dss->xcflags, dss->hvm,
-        cbflags,
+        cbflags, dss->checkpointed_stream,
     };
 
     shs->ao = ao;
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 39038f9..6bdcf13 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -253,6 +253,7 @@ int main(int argc, char **argv)
         uint32_t flags =           strtoul(NEXTARG,0,10);
         int hvm =                  atoi(NEXTARG);
         unsigned cbflags =         strtoul(NEXTARG,0,10);
+        int checkpointed_stream =  strtoul(NEXTARG,0,10);
         assert(!*++argv);
 
         helper_setcallbacks_save(&helper_save_callbacks, cbflags);
@@ -261,7 +262,7 @@ int main(int argc, char **argv)
         setup_signals(save_signal_handler);
 
         r = xc_domain_save(xch, io_fd, dom, max_iters, max_factor, flags,
-                           &helper_save_callbacks, hvm);
+                           &helper_save_callbacks, hvm, checkpointed_stream);
         complete(r);
 
     } else if (!strcmp(mode,"--restore-domain")) {
diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
index 9053146..f6ea55d 100644
--- a/tools/libxl/libxl_stream_write.c
+++ b/tools/libxl/libxl_stream_write.c
@@ -355,7 +355,7 @@ void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
      * If the stream is not still alive, we must not continue any work.
      */
     if (libxl__stream_write_inuse(stream)) {
-        if (dss->remus)
+        if (dss->checkpointed_stream != LIBXL_CHECKPOINTED_STREAM_NONE)
             /*
              * For remus, if libxl__xc_domain_save_done() completes,
              * there was an error sending data to the secondary.
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index b8fb22f..605fb9a 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -228,6 +228,7 @@ libxl_hdtype = Enumeration("hdtype", [
     (2, "AHCI"),
     ], init_val = "LIBXL_HDTYPE_IDE")
 
+# Consistent with the values defined for migration_stream.
 libxl_checkpointed_stream = Enumeration("checkpointed_stream", [
     (0, "NONE"),
     (1, "REMUS"),
-- 
2.5.0


* [PATCH v8 08/13] tools/libxl: export logdirty_init
  2016-02-18  2:43 [PATCH v8 00/13] Prerequisite patches for COLO Wen Congyang
                   ` (6 preceding siblings ...)
  2016-02-18  2:43 ` [PATCH v8 07/13] migration/save: pass checkpointed_stream from libxl to libxc Wen Congyang
@ 2016-02-18  2:43 ` Wen Congyang
  2016-02-18  2:43 ` [PATCH v8 09/13] tools/libxl: rename remus device to checkpoint device Wen Congyang
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Wen Congyang @ 2016-02-18  2:43 UTC (permalink / raw)
  To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell,
	Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

We need to enable logdirty on the secondary, so export logdirty_init
for internal use and rename it to libxl__logdirty_init.

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_dom_save.c | 4 ++--
 tools/libxl/libxl_internal.h | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index a385500..28e2a41 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -44,7 +44,7 @@ static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
 static void switch_logdirty_done(libxl__egc *egc,
                                  libxl__domain_save_state *dss, int rc);
 
-static void logdirty_init(libxl__logdirty_switch *lds)
+void libxl__logdirty_init(libxl__logdirty_switch *lds)
 {
     lds->cmd_path = 0;
     libxl__ev_xswatch_init(&lds->watch);
@@ -345,7 +345,7 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
     }
 
     dss->rc = 0;
-    logdirty_init(&dss->logdirty);
+    libxl__logdirty_init(&dss->logdirty);
     dsps->ao = ao;
     dsps->domid = domid;
     rc = libxl__domain_suspend_init(egc, dsps, type);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index ac6457f..656bccd 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3078,6 +3078,8 @@ typedef struct libxl__logdirty_switch {
     libxl__ev_time timeout;
 } libxl__logdirty_switch;
 
+_hidden void libxl__logdirty_init(libxl__logdirty_switch *lds);
+
 struct libxl__domain_suspend_state {
     /* set by caller of libxl__domain_suspend_init */
     libxl__ao *ao;
-- 
2.5.0


* [PATCH v8 09/13] tools/libxl: rename remus device to checkpoint device
  2016-02-18  2:43 [PATCH v8 00/13] Prerequisite patches for COLO Wen Congyang
                   ` (7 preceding siblings ...)
  2016-02-18  2:43 ` [PATCH v8 08/13] tools/libxl: export logdirty_init Wen Congyang
@ 2016-02-18  2:43 ` Wen Congyang
  2016-02-18  2:43 ` [PATCH v8 10/13] tools/libxl: adjust the indentation Wen Congyang
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Wen Congyang @ 2016-02-18  2:43 UTC (permalink / raw)
  To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell,
	Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

This patch is auto-generated by the following commands:
 1. git mv tools/libxl/libxl_remus_device.c tools/libxl/libxl_checkpoint_device.c
 2. perl -pi -e 's/libxl_remus_device/libxl_checkpoint_device/g' tools/libxl/Makefile
 3. perl -pi -e 's/\blibxl__remus_devices/libxl__checkpoint_devices/g' tools/libxl/*.[ch]
 4. perl -pi -e 's/\blibxl__remus_device\b/libxl__checkpoint_device/g' tools/libxl/*.[ch]
 5. perl -pi -e 's/\blibxl__remus_device_instance_ops\b/libxl__checkpoint_device_instance_ops/g' tools/libxl/*.[ch]
 6. perl -pi -e 's/\blibxl__remus_callback\b/libxl__checkpoint_callback/g' tools/libxl/*.[ch]
 7. perl -pi -e 's/\bremus_device_init\b/checkpoint_device_init/g' tools/libxl/*.[ch]
 8. perl -pi -e 's/\bremus_devices_setup\b/checkpoint_devices_setup/g' tools/libxl/*.[ch]
 9. perl -pi -e 's/\bdefine_remus_checkpoint_api\b/define_checkpoint_api/g' tools/libxl/*.[ch]
10. perl -pi -e 's/\brds\b/cds/g' tools/libxl/*.[ch]
11. perl -pi -e 's/REMUS_DEVICE/CHECKPOINT_DEVICE/g' tools/libxl/*.[ch] tools/libxl/*.idl
12. perl -pi -e 's/REMUS_DEVOPS/CHECKPOINT_DEVOPS/g' tools/libxl/*.[ch] tools/libxl/*.idl
13. perl -pi -e 's/\bremus\b/checkpoint/g' tools/libxl/libxl_checkpoint_device.[ch]
14. perl -pi -e 's/\bremus device/checkpoint device/g' tools/libxl/libxl_internal.h
15. perl -pi -e 's/\bRemus device/checkpoint device/g' tools/libxl/libxl_internal.h
16. perl -pi -e 's/\bremus abstract/checkpoint abstract/g' tools/libxl/libxl_internal.h
17. perl -pi -e 's/\bremus invocation/checkpoint invocation/g' tools/libxl/libxl_internal.h
18. perl -pi -e 's/\blibxl__remus_device_\(/libxl__checkpoint_device_(/g' tools/libxl/libxl_internal.h

The patch also fixes the following backward compatibility issue:
  The error codes ERROR_REMUS_XXX were introduced in Xen 4.5 and
  were changed to ERROR_CHECKPOINT_XXX by the renaming above.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Reviewed-Lightly-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/Makefile                               |   2 +-
 tools/libxl/libxl.h                                |  12 ++
 ...xl_remus_device.c => libxl_checkpoint_device.c} | 198 ++++++++++-----------
 tools/libxl/libxl_internal.h                       | 112 ++++++------
 tools/libxl/libxl_netbuffer.c                      | 108 +++++------
 tools/libxl/libxl_nonetbuffer.c                    |  10 +-
 tools/libxl/libxl_remus.c                          |  76 ++++----
 tools/libxl/libxl_remus_disk_drbd.c                |  52 +++---
 tools/libxl/libxl_types.idl                        |   4 +-
 9 files changed, 293 insertions(+), 281 deletions(-)
 rename tools/libxl/{libxl_remus_device.c => libxl_checkpoint_device.c} (52%)
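
To illustrate the compatibility shim added to libxl.h, here is a small
stand-alone sketch of how a caller built against the 4.5 API would still
compile with the old spelling. The numeric error values and the
is_unsupported_device() function are hypothetical and exist only for this
example; the #if guard is copied from the libxl.h hunk in this patch.

```c
/* Assume the application requested the Xen 4.5 API (demo assumption). */
#define LIBXL_API_VERSION 0x040500

/* Stand-ins for the renamed error codes; the values are hypothetical. */
enum {
    ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH = -101,
    ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED  = -102,
};

/* Guard as in the libxl.h hunk: old names alias the new ones. */
#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION >= 0x040500 \
                               && LIBXL_API_VERSION < 0x040700
#define ERROR_REMUS_DEVOPS_DOES_NOT_MATCH \
        ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH
#define ERROR_REMUS_DEVICE_NOT_SUPPORTED \
        ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED
#endif

/* Old caller code that still uses the Xen 4.5/4.6 spelling. */
static int is_unsupported_device(int rc)
{
    return rc == ERROR_REMUS_DEVICE_NOT_SUPPORTED;
}
```

With LIBXL_API_VERSION left undefined, or set to 0x040700 or later, the
old names are not provided and the function above would fail to compile,
which is the intended signal to move to the ERROR_CHECKPOINT_XXX names.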

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 263ea0e..789a12e 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -64,7 +64,7 @@ else
 LIBXL_OBJS-y += libxl_no_convert_callout.o
 endif
 
-LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o
+LIBXL_OBJS-y += libxl_remus.o libxl_checkpoint_device.o libxl_remus_disk_drbd.o
 
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 6225db1..f9e3ef5 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -883,6 +883,18 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
  */
 #define LIBXL_HAVE_CHECKPOINTED_STREAM 1
 
+/*
+ * ERROR_REMUS_XXX error code only exists from Xen 4.5, Xen 4.6 and it
+ * is changed to ERROR_CHECKPOINT_XXX in Xen 4.7
+ */
+#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION >= 0x040500 \
+                               && LIBXL_API_VERSION < 0x040700
+#define ERROR_REMUS_DEVOPS_DOES_NOT_MATCH \
+        ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH
+#define ERROR_REMUS_DEVICE_NOT_SUPPORTED \
+        ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED
+#endif
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);
diff --git a/tools/libxl/libxl_remus_device.c b/tools/libxl/libxl_checkpoint_device.c
similarity index 52%
rename from tools/libxl/libxl_remus_device.c
rename to tools/libxl/libxl_checkpoint_device.c
index a6cb7f6..109cd23 100644
--- a/tools/libxl/libxl_remus_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -17,9 +17,9 @@
 
 #include "libxl_internal.h"
 
-extern const libxl__remus_device_instance_ops remus_device_nic;
-extern const libxl__remus_device_instance_ops remus_device_drbd_disk;
-static const libxl__remus_device_instance_ops *remus_ops[] = {
+extern const libxl__checkpoint_device_instance_ops remus_device_nic;
+extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
+static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
     &remus_device_nic,
     &remus_device_drbd_disk,
     NULL,
@@ -27,18 +27,18 @@ static const libxl__remus_device_instance_ops *remus_ops[] = {
 
 /*----- helper functions -----*/
 
-static int init_device_subkind(libxl__remus_devices_state *rds)
+static int init_device_subkind(libxl__checkpoint_devices_state *cds)
 {
     /* init device subkind-specific state in the libxl ctx */
     int rc;
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     if (libxl__netbuffer_enabled(gc)) {
-        rc = init_subkind_nic(rds);
+        rc = init_subkind_nic(cds);
         if (rc) goto out;
     }
 
-    rc = init_subkind_drbd_disk(rds);
+    rc = init_subkind_drbd_disk(cds);
     if (rc) goto out;
 
     rc = 0;
@@ -46,15 +46,15 @@ out:
     return rc;
 }
 
-static void cleanup_device_subkind(libxl__remus_devices_state *rds)
+static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
 {
     /* cleanup device subkind-specific state in the libxl ctx */
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     if (libxl__netbuffer_enabled(gc))
-        cleanup_subkind_nic(rds);
+        cleanup_subkind_nic(cds);
 
-    cleanup_subkind_drbd_disk(rds);
+    cleanup_subkind_drbd_disk(cds);
 }
 
 /*----- setup() and teardown() -----*/
@@ -70,103 +70,103 @@ static void devices_teardown_cb(libxl__egc *egc,
                                 libxl__multidev *multidev,
                                 int rc);
 
-/* remus device setup and teardown */
+/* checkpoint device setup and teardown */
 
-static libxl__remus_device* remus_device_init(libxl__egc *egc,
-                                              libxl__remus_devices_state *rds,
+static libxl__checkpoint_device* checkpoint_device_init(libxl__egc *egc,
+                                              libxl__checkpoint_devices_state *cds,
                                               libxl__device_kind kind,
                                               void *libxl_dev)
 {
-    libxl__remus_device *dev = NULL;
+    libxl__checkpoint_device *dev = NULL;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
     GCNEW(dev);
     dev->backend_dev = libxl_dev;
     dev->kind = kind;
-    dev->rds = rds;
+    dev->cds = cds;
 
     return dev;
 }
 
-static void remus_devices_setup(libxl__egc *egc,
-                                libxl__remus_devices_state *rds);
+static void checkpoint_devices_setup(libxl__egc *egc,
+                                libxl__checkpoint_devices_state *cds);
 
-void libxl__remus_devices_setup(libxl__egc *egc, libxl__remus_devices_state *rds)
+void libxl__checkpoint_devices_setup(libxl__egc *egc, libxl__checkpoint_devices_state *cds)
 {
     int i, rc;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
-    rc = init_device_subkind(rds);
+    rc = init_device_subkind(cds);
     if (rc)
         goto out;
 
-    rds->num_devices = 0;
-    rds->num_nics = 0;
-    rds->num_disks = 0;
+    cds->num_devices = 0;
+    cds->num_nics = 0;
+    cds->num_disks = 0;
 
-    if (rds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VIF))
-        rds->nics = libxl_device_nic_list(CTX, rds->domid, &rds->num_nics);
+    if (cds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VIF))
+        cds->nics = libxl_device_nic_list(CTX, cds->domid, &cds->num_nics);
 
-    if (rds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VBD))
-        rds->disks = libxl_device_disk_list(CTX, rds->domid, &rds->num_disks);
+    if (cds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VBD))
+        cds->disks = libxl_device_disk_list(CTX, cds->domid, &cds->num_disks);
 
-    if (rds->num_nics == 0 && rds->num_disks == 0)
+    if (cds->num_nics == 0 && cds->num_disks == 0)
         goto out;
 
-    GCNEW_ARRAY(rds->devs, rds->num_nics + rds->num_disks);
+    GCNEW_ARRAY(cds->devs, cds->num_nics + cds->num_disks);
 
-    for (i = 0; i < rds->num_nics; i++) {
-        rds->devs[rds->num_devices++] = remus_device_init(egc, rds,
+    for (i = 0; i < cds->num_nics; i++) {
+        cds->devs[cds->num_devices++] = checkpoint_device_init(egc, cds,
                                                 LIBXL__DEVICE_KIND_VIF,
-                                                &rds->nics[i]);
+                                                &cds->nics[i]);
     }
 
-    for (i = 0; i < rds->num_disks; i++) {
-        rds->devs[rds->num_devices++] = remus_device_init(egc, rds,
+    for (i = 0; i < cds->num_disks; i++) {
+        cds->devs[cds->num_devices++] = checkpoint_device_init(egc, cds,
                                                 LIBXL__DEVICE_KIND_VBD,
-                                                &rds->disks[i]);
+                                                &cds->disks[i]);
     }
 
-    remus_devices_setup(egc, rds);
+    checkpoint_devices_setup(egc, cds);
 
     return;
 
 out:
-    rds->callback(egc, rds, rc);
+    cds->callback(egc, cds, rc);
 }
 
-static void remus_devices_setup(libxl__egc *egc,
-                                libxl__remus_devices_state *rds)
+static void checkpoint_devices_setup(libxl__egc *egc,
+                                libxl__checkpoint_devices_state *cds)
 {
     int i, rc;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
-    libxl__multidev_begin(ao, &rds->multidev);
-    rds->multidev.callback = all_devices_setup_cb;
-    for (i = 0; i < rds->num_devices; i++) {
-        libxl__remus_device *dev = rds->devs[i];
+    libxl__multidev_begin(ao, &cds->multidev);
+    cds->multidev.callback = all_devices_setup_cb;
+    for (i = 0; i < cds->num_devices; i++) {
+        libxl__checkpoint_device *dev = cds->devs[i];
         dev->ops_index = -1;
-        libxl__multidev_prepare_with_aodev(&rds->multidev, &dev->aodev);
+        libxl__multidev_prepare_with_aodev(&cds->multidev, &dev->aodev);
 
-        dev->aodev.rc = ERROR_REMUS_DEVICE_NOT_SUPPORTED;
+        dev->aodev.rc = ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED;
         dev->aodev.callback = device_setup_iterate;
         device_setup_iterate(egc,&dev->aodev);
     }
 
     rc = 0;
-    libxl__multidev_prepared(egc, &rds->multidev, rc);
+    libxl__multidev_prepared(egc, &cds->multidev, rc);
 }
 
 
 static void device_setup_iterate(libxl__egc *egc, libxl__ao_device *aodev)
 {
-    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     EGC_GC;
 
-    if (aodev->rc != ERROR_REMUS_DEVICE_NOT_SUPPORTED &&
-        aodev->rc != ERROR_REMUS_DEVOPS_DOES_NOT_MATCH)
+    if (aodev->rc != ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED &&
+        aodev->rc != ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH)
         /* might be success or disaster */
         goto out;
 
@@ -186,16 +186,16 @@ static void device_setup_iterate(libxl__egc *egc, libxl__ao_device *aodev)
                 domid = disk->backend_domid;
                 devid = libxl__device_disk_dev_number(disk->vdev, NULL, NULL);
             } else {
-                LOG(ERROR,"device kind not handled by remus: %s",
+                LOG(ERROR,"device kind not handled by checkpoint: %s",
                     libxl__device_kind_to_string(dev->kind));
                 aodev->rc = ERROR_FAIL;
                 goto out;
             }
-            LOG(ERROR,"device not handled by remus"
+            LOG(ERROR,"device not handled by checkpoint"
                 " (device=%s:%"PRId32"/%"PRId32")",
                 libxl__device_kind_to_string(dev->kind),
                 domid, devid);
-            aodev->rc = ERROR_REMUS_DEVICE_NOT_SUPPORTED;
+            aodev->rc = ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED;
             goto out;
         }
     } while (dev->ops->kind != dev->kind);
@@ -216,32 +216,32 @@ static void all_devices_setup_cb(libxl__egc *egc,
     STATE_AO_GC(multidev->ao);
 
     /* Convenience aliases */
-    libxl__remus_devices_state *const rds =
-                            CONTAINER_OF(multidev, *rds, multidev);
+    libxl__checkpoint_devices_state *const cds =
+                            CONTAINER_OF(multidev, *cds, multidev);
 
-    rds->callback(egc, rds, rc);
+    cds->callback(egc, cds, rc);
 }
 
-void libxl__remus_devices_teardown(libxl__egc *egc,
-                                   libxl__remus_devices_state *rds)
+void libxl__checkpoint_devices_teardown(libxl__egc *egc,
+                                        libxl__checkpoint_devices_state *cds)
 {
     int i;
-    libxl__remus_device *dev;
+    libxl__checkpoint_device *dev;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
-    libxl__multidev_begin(ao, &rds->multidev);
-    rds->multidev.callback = devices_teardown_cb;
-    for (i = 0; i < rds->num_devices; i++) {
-        dev = rds->devs[i];
+    libxl__multidev_begin(ao, &cds->multidev);
+    cds->multidev.callback = devices_teardown_cb;
+    for (i = 0; i < cds->num_devices; i++) {
+        dev = cds->devs[i];
         if (!dev->ops || !dev->matched)
             continue;
 
-        libxl__multidev_prepare_with_aodev(&rds->multidev, &dev->aodev);
+        libxl__multidev_prepare_with_aodev(&cds->multidev, &dev->aodev);
         dev->ops->teardown(egc,dev);
     }
 
-    libxl__multidev_prepared(egc, &rds->multidev, 0);
+    libxl__multidev_prepared(egc, &cds->multidev, 0);
 }
 
 static void devices_teardown_cb(libxl__egc *egc,
@@ -253,26 +253,26 @@ static void devices_teardown_cb(libxl__egc *egc,
     STATE_AO_GC(multidev->ao);
 
     /* Convenience aliases */
-    libxl__remus_devices_state *const rds =
-                            CONTAINER_OF(multidev, *rds, multidev);
+    libxl__checkpoint_devices_state *const cds =
+                            CONTAINER_OF(multidev, *cds, multidev);
 
     /* clean nic */
-    for (i = 0; i < rds->num_nics; i++)
-        libxl_device_nic_dispose(&rds->nics[i]);
-    free(rds->nics);
-    rds->nics = NULL;
-    rds->num_nics = 0;
+    for (i = 0; i < cds->num_nics; i++)
+        libxl_device_nic_dispose(&cds->nics[i]);
+    free(cds->nics);
+    cds->nics = NULL;
+    cds->num_nics = 0;
 
     /* clean disk */
-    for (i = 0; i < rds->num_disks; i++)
-        libxl_device_disk_dispose(&rds->disks[i]);
-    free(rds->disks);
-    rds->disks = NULL;
-    rds->num_disks = 0;
+    for (i = 0; i < cds->num_disks; i++)
+        libxl_device_disk_dispose(&cds->disks[i]);
+    free(cds->disks);
+    cds->disks = NULL;
+    cds->num_disks = 0;
 
-    cleanup_device_subkind(rds);
+    cleanup_device_subkind(cds);
 
-    rds->callback(egc, rds, rc);
+    cds->callback(egc, cds, rc);
 }
 
 /*----- checkpointing APIs -----*/
@@ -285,33 +285,33 @@ static void devices_checkpoint_cb(libxl__egc *egc,
 
 /* API implementations */
 
-#define define_remus_checkpoint_api(api)                                \
-void libxl__remus_devices_##api(libxl__egc *egc,                        \
-                                libxl__remus_devices_state *rds)        \
+#define define_checkpoint_api(api)                                      \
+void libxl__checkpoint_devices_##api(libxl__egc *egc,                   \
+                                libxl__checkpoint_devices_state *cds)   \
 {                                                                       \
     int i;                                                              \
-    libxl__remus_device *dev;                                           \
+    libxl__checkpoint_device *dev;                                      \
                                                                         \
-    STATE_AO_GC(rds->ao);                                               \
+    STATE_AO_GC(cds->ao);                                               \
                                                                         \
-    libxl__multidev_begin(ao, &rds->multidev);                          \
-    rds->multidev.callback = devices_checkpoint_cb;                     \
-    for (i = 0; i < rds->num_devices; i++) {                            \
-        dev = rds->devs[i];                                             \
+    libxl__multidev_begin(ao, &cds->multidev);                          \
+    cds->multidev.callback = devices_checkpoint_cb;                     \
+    for (i = 0; i < cds->num_devices; i++) {                            \
+        dev = cds->devs[i];                                             \
         if (!dev->matched || !dev->ops->api)                            \
             continue;                                                   \
-        libxl__multidev_prepare_with_aodev(&rds->multidev, &dev->aodev);\
+        libxl__multidev_prepare_with_aodev(&cds->multidev, &dev->aodev);\
         dev->ops->api(egc,dev);                                         \
     }                                                                   \
                                                                         \
-    libxl__multidev_prepared(egc, &rds->multidev, 0);                   \
+    libxl__multidev_prepared(egc, &cds->multidev, 0);                   \
 }
 
-define_remus_checkpoint_api(postsuspend);
+define_checkpoint_api(postsuspend);
 
-define_remus_checkpoint_api(preresume);
+define_checkpoint_api(preresume);
 
-define_remus_checkpoint_api(commit);
+define_checkpoint_api(commit);
 
 static void devices_checkpoint_cb(libxl__egc *egc,
                                   libxl__multidev *multidev,
@@ -320,8 +320,8 @@ static void devices_checkpoint_cb(libxl__egc *egc,
     STATE_AO_GC(multidev->ao);
 
     /* Convenience aliases */
-    libxl__remus_devices_state *const rds =
-                            CONTAINER_OF(multidev, *rds, multidev);
+    libxl__checkpoint_devices_state *const cds =
+                            CONTAINER_OF(multidev, *cds, multidev);
 
-    rds->callback(egc, rds, rc);
+    cds->callback(egc, cds, rc);
 }
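
Reviewer-style note on the define_checkpoint_api() macro in the file above:
it stamps out one entry point per checkpoint phase, and each entry point
skips devices that never matched an ops table or whose table leaves that
hook NULL. The sketch below shows only that dispatch pattern. It assumes
simplified, hypothetical types (sketch_device, sketch_ops, sketch_state) and
runs synchronously; the real code also registers every device with
libxl__multidev and reports completion through cds->callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the libxl types; not the real API. */
struct sketch_device;

struct sketch_ops {
    void (*postsuspend)(struct sketch_device *dev);
    void (*preresume)(struct sketch_device *dev);
    void (*commit)(struct sketch_device *dev);
};

struct sketch_device {
    int matched;                  /* set once an ops table accepted the device */
    const struct sketch_ops *ops; /* may be NULL while matched == 0 */
    int calls;                    /* how many hooks ran on this device */
};

struct sketch_state {
    struct sketch_device **devs;
    int num_devices;
};

/*
 * One expansion per checkpoint phase. Unmatched devices and devices whose
 * ops table does not implement the hook are skipped. Returns the number of
 * devices the hook was dispatched to.
 */
#define define_sketch_api(api)                                          \
int sketch_devices_##api(struct sketch_state *st)                       \
{                                                                       \
    int i, dispatched = 0;                                              \
                                                                        \
    for (i = 0; i < st->num_devices; i++) {                             \
        struct sketch_device *dev = st->devs[i];                        \
        if (!dev->matched || !dev->ops->api)                            \
            continue;                                                   \
        dev->ops->api(dev);                                             \
        dispatched++;                                                   \
    }                                                                   \
    return dispatched;                                                  \
}

define_sketch_api(postsuspend)
define_sketch_api(preresume)
define_sketch_api(commit)

/* Example hook: just counts invocations. */
void count_call(struct sketch_device *dev)
{
    dev->calls++;
}

/* A NIC-like table that implements only two of the three hooks. */
const struct sketch_ops nic_like_ops = {
    .postsuspend = count_call,
    .commit = count_call,
};
```

With one matched device using nic_like_ops and one unmatched device,
sketch_devices_postsuspend() and sketch_devices_commit() each dispatch to a
single device, while sketch_devices_preresume() dispatches to none.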
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 656bccd..630f048 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2794,9 +2794,9 @@ typedef struct libxl__save_helper_state {
                       * marshalling and xc callback functions */
 } libxl__save_helper_state;
 
-/*----- remus device related state structure -----*/
+/*----- checkpoint device related state structure -----*/
 /*
- * The abstract Remus device layer exposes a common
+ * The abstract checkpoint device layer exposes a common
  * set of API to [external] libxl for manipulating devices attached to
  * a guest protected by Remus. The device layer also exposes a set of
  * [internal] interfaces that every device type must implement.
@@ -2804,34 +2804,34 @@ typedef struct libxl__save_helper_state {
  * The following API are exposed to libxl:
  *
  * One-time configuration operations:
- *  +libxl__remus_devices_setup
+ *  +libxl__checkpoint_devices_setup
  *    > Enable output buffering for NICs, setup disk replication, etc.
- *  +libxl__remus_devices_teardown
+ *  +libxl__checkpoint_devices_teardown
  *    > Disable output buffering and disk replication; teardown any
  *       associated external setups like qdiscs for NICs.
  *
  * Operations executed every checkpoint (in order of invocation):
- *  +libxl__remus_devices_postsuspend
- *  +libxl__remus_devices_preresume
- *  +libxl__remus_devices_commit
+ *  +libxl__checkpoint_devices_postsuspend
+ *  +libxl__checkpoint_devices_preresume
+ *  +libxl__checkpoint_devices_commit
  *
  * Each device type needs to implement the interfaces specified in
- * the libxl__remus_device_instance_ops if it wishes to support Remus.
+ * the libxl__checkpoint_device_instance_ops if it wishes to support Remus.
  *
- * The high-level control flow through the Remus device layer is shown below:
+ * The high-level control flow through the checkpoint device layer is:
  *
  * xl remus
  *  |->  libxl_domain_remus_start
- *    |-> libxl__remus_devices_setup
- *      |-> Per-checkpoint libxl__remus_devices_[postsuspend,preresume,commit]
+ *    |-> libxl__checkpoint_devices_setup
+ *      |-> Per-checkpoint libxl__checkpoint_devices_[postsuspend,preresume,commit]
  *        ...
  *        |-> On backup failure, network error or other internal errors:
- *            libxl__remus_devices_teardown
+ *            libxl__checkpoint_devices_teardown
  */
 
-typedef struct libxl__remus_device libxl__remus_device;
-typedef struct libxl__remus_devices_state libxl__remus_devices_state;
-typedef struct libxl__remus_device_instance_ops libxl__remus_device_instance_ops;
+typedef struct libxl__checkpoint_device libxl__checkpoint_device;
+typedef struct libxl__checkpoint_devices_state libxl__checkpoint_devices_state;
+typedef struct libxl__checkpoint_device_instance_ops libxl__checkpoint_device_instance_ops;
 
 /*
  * Interfaces to be implemented by every device subkind that wishes to
@@ -2841,7 +2841,7 @@ typedef struct libxl__remus_device_instance_ops libxl__remus_device_instance_ops
  * synchronous and call dev->aodev.callback directly (as the last
  * thing they do).
  */
-struct libxl__remus_device_instance_ops {
+struct libxl__checkpoint_device_instance_ops {
     /* the device kind this ops belongs to... */
     libxl__device_kind kind;
 
@@ -2852,12 +2852,12 @@ struct libxl__remus_device_instance_ops {
      * Asynchronous.
      */
 
-    void (*postsuspend)(libxl__egc *egc, libxl__remus_device *dev);
-    void (*preresume)(libxl__egc *egc, libxl__remus_device *dev);
-    void (*commit)(libxl__egc *egc, libxl__remus_device *dev);
+    void (*postsuspend)(libxl__egc *egc, libxl__checkpoint_device *dev);
+    void (*preresume)(libxl__egc *egc, libxl__checkpoint_device *dev);
+    void (*commit)(libxl__egc *egc, libxl__checkpoint_device *dev);
 
     /*
-     * setup() and teardown() are refer to the actual remus device.
+     * setup() and teardown() refer to the actual checkpoint device.
      * Asynchronous.
      * teardown is called even if setup fails.
      */
@@ -2866,45 +2866,45 @@ struct libxl__remus_device_instance_ops {
      * device. If matched, the device will then be managed with this set of
      * subkind operations.
      * Yields 0 if the device successfully set up.
-     * REMUS_DEVOPS_DOES_NOT_MATCH if the ops does not match the device.
+     * CHECKPOINT_DEVOPS_DOES_NOT_MATCH if the ops does not match the device.
      * any other rc indicates failure.
      */
-    void (*setup)(libxl__egc *egc, libxl__remus_device *dev);
-    void (*teardown)(libxl__egc *egc, libxl__remus_device *dev);
+    void (*setup)(libxl__egc *egc, libxl__checkpoint_device *dev);
+    void (*teardown)(libxl__egc *egc, libxl__checkpoint_device *dev);
 };
 
-int init_subkind_nic(libxl__remus_devices_state *rds);
-void cleanup_subkind_nic(libxl__remus_devices_state *rds);
-int init_subkind_drbd_disk(libxl__remus_devices_state *rds);
-void cleanup_subkind_drbd_disk(libxl__remus_devices_state *rds);
+int init_subkind_nic(libxl__checkpoint_devices_state *cds);
+void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds);
+int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
+void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
 
-typedef void libxl__remus_callback(libxl__egc *,
-                                   libxl__remus_devices_state *, int rc);
+typedef void libxl__checkpoint_callback(libxl__egc *,
+                                   libxl__checkpoint_devices_state *, int rc);
 
 /*
- * State associated with a remus invocation, including parameters
- * passed to the remus abstract device layer by the remus
+ * State associated with a checkpoint invocation, including parameters
+ * passed to the checkpoint abstract device layer by the remus
  * save/restore machinery.
  */
-struct libxl__remus_devices_state {
-    /*---- must be set by caller of libxl__remus_device_(setup|teardown) ----*/
+struct libxl__checkpoint_devices_state {
+    /*---- must be set by caller of libxl__checkpoint_devices_(setup|teardown) ----*/
 
     libxl__ao *ao;
     uint32_t domid;
-    libxl__remus_callback *callback;
+    libxl__checkpoint_callback *callback;
     int device_kind_flags;
 
     /*----- private for abstract layer only -----*/
 
     int num_devices;
     /*
-     * this array is allocated before setup the remus devices by the
-     * remus abstract layer.
-     * devs may be NULL, means there's no remus devices that has been set up.
+     * this array is allocated by the checkpoint abstract layer before
+     * the checkpoint devices are set up.
+     * devs may be NULL, meaning no checkpoint device has been set up.
      * the size of this array is 'num_devices', which is the total number
      * of libxl nic devices and disk devices(num_nics + num_disks).
      */
-    libxl__remus_device **devs;
+    libxl__checkpoint_device **devs;
 
     libxl_device_nic *nics;
     int num_nics;
@@ -2926,20 +2926,20 @@ struct libxl__remus_devices_state {
 
 /*
  * Information about a single device being handled by remus.
- * Allocated by the remus abstract layer.
+ * Allocated by the checkpoint abstract layer.
  */
-struct libxl__remus_device {
+struct libxl__checkpoint_device {
     /*----- shared between abstract and concrete layers -----*/
     /*
      * if this is true, that means the subkind ops match the device
      */
     bool matched;
 
-    /*----- set by remus device abstruct layer -----*/
-    /* libxl__device_* which this remus device related to */
+    /*----- set by checkpoint device abstract layer -----*/
+    /* libxl__device_* that this checkpoint device relates to */
     const void *backend_dev;
     libxl__device_kind kind;
-    libxl__remus_devices_state *rds;
+    libxl__checkpoint_devices_state *cds;
     libxl__ao_device aodev;
 
     /*----- private for abstract layer only -----*/
@@ -2950,7 +2950,7 @@ struct libxl__remus_device {
      * individual devices.
      */
     int ops_index;
-    const libxl__remus_device_instance_ops *ops;
+    const libxl__checkpoint_device_instance_ops *ops;
 
     /*----- private for concrete (device-specific) layer -----*/
 
@@ -2958,17 +2958,17 @@ struct libxl__remus_device {
     void *concrete_data;
 };
 
-/* the following 5 APIs are async ops, call rds->callback when done */
-_hidden void libxl__remus_devices_setup(libxl__egc *egc,
-                                        libxl__remus_devices_state *rds);
-_hidden void libxl__remus_devices_teardown(libxl__egc *egc,
-                                           libxl__remus_devices_state *rds);
-_hidden void libxl__remus_devices_postsuspend(libxl__egc *egc,
-                                              libxl__remus_devices_state *rds);
-_hidden void libxl__remus_devices_preresume(libxl__egc *egc,
-                                            libxl__remus_devices_state *rds);
-_hidden void libxl__remus_devices_commit(libxl__egc *egc,
-                                         libxl__remus_devices_state *rds);
+/* the following 5 APIs are async ops, call cds->callback when done */
+_hidden void libxl__checkpoint_devices_setup(libxl__egc *egc,
+                                        libxl__checkpoint_devices_state *cds);
+_hidden void libxl__checkpoint_devices_teardown(libxl__egc *egc,
+                                           libxl__checkpoint_devices_state *cds);
+_hidden void libxl__checkpoint_devices_postsuspend(libxl__egc *egc,
+                                              libxl__checkpoint_devices_state *cds);
+_hidden void libxl__checkpoint_devices_preresume(libxl__egc *egc,
+                                            libxl__checkpoint_devices_state *cds);
+_hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
+                                         libxl__checkpoint_devices_state *cds);
 _hidden int libxl__netbuffer_enabled(libxl__gc *gc);
 
 /*----- Legacy conversion helper -----*/
@@ -3122,7 +3122,7 @@ struct libxl__domain_save_state {
     int hvm;
     int xcflags;
     libxl__domain_suspend_state dsps;
-    libxl__remus_devices_state rds;
+    libxl__checkpoint_devices_state cds;
     libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
     int interval; /* checkpoint interval (for Remus) */
     libxl__stream_write_state sws;
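
Reviewer-style note on the setup() contract documented in libxl_internal.h
above: setup yields 0 on success, CHECKPOINT_DEVOPS_DOES_NOT_MATCH when the
ops table does not fit the device, and any other rc on failure. A minimal
sketch of how a caller might walk the candidate tables under that contract
follows. All names (sketch_dev, sketch_subkind_ops, the SKETCH_* codes, the
is_drbd flag) are hypothetical, the walk is synchronous, and the caller sets
'matched' itself, whereas in libxl that field is shared with the concrete
layer and the iteration is driven by aodev callbacks.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_kind { SKETCH_KIND_VIF, SKETCH_KIND_VBD };

/* Hypothetical return codes mirroring the outcomes named in the comment. */
enum {
    SKETCH_OK = 0,
    SKETCH_DEVOPS_DOES_NOT_MATCH = -1,
    SKETCH_DEVICE_NOT_SUPPORTED = -2,
    SKETCH_FAIL = -3,
};

struct sketch_subkind_ops;

struct sketch_dev {
    enum sketch_kind kind;
    int is_drbd;                          /* property a disk subkind probes */
    int matched;
    const struct sketch_subkind_ops *ops;
};

struct sketch_subkind_ops {
    enum sketch_kind kind;
    int (*setup)(struct sketch_dev *dev);
};

/*
 * Try each ops table of the right kind in turn. DOES_NOT_MATCH means
 * "try the next candidate"; any other rc (success or failure) ends the walk.
 * If no candidate accepts the device, report it as unsupported.
 */
int sketch_setup_device(struct sketch_dev *dev,
                        const struct sketch_subkind_ops *const *table)
{
    size_t i;

    for (i = 0; table[i]; i++) {
        int rc;

        if (table[i]->kind != dev->kind)
            continue;
        rc = table[i]->setup(dev);
        if (rc == SKETCH_DEVOPS_DOES_NOT_MATCH)
            continue;
        if (rc == SKETCH_OK) {
            dev->matched = 1;
            dev->ops = table[i];
        }
        return rc;
    }
    return SKETCH_DEVICE_NOT_SUPPORTED;
}

/* NICs have no subkinds in this sketch, so the NIC table always matches. */
int sketch_nic_setup(struct sketch_dev *dev)
{
    (void)dev;
    return SKETCH_OK;
}

/* The disk table only accepts devices flagged as DRBD-backed. */
int sketch_drbd_setup(struct sketch_dev *dev)
{
    return dev->is_drbd ? SKETCH_OK : SKETCH_DEVOPS_DOES_NOT_MATCH;
}

const struct sketch_subkind_ops sketch_nic_ops = {
    .kind = SKETCH_KIND_VIF, .setup = sketch_nic_setup,
};
const struct sketch_subkind_ops sketch_drbd_ops = {
    .kind = SKETCH_KIND_VBD, .setup = sketch_drbd_setup,
};
```

Given a NULL-terminated table holding both ops, a VIF and a DRBD-flagged VBD
end up matched, while a plain VBD falls through every candidate and comes
back as SKETCH_DEVICE_NOT_SUPPORTED with 'matched' still 0.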
diff --git a/tools/libxl/libxl_netbuffer.c b/tools/libxl/libxl_netbuffer.c
index c245a4e..33c2a42 100644
--- a/tools/libxl/libxl_netbuffer.c
+++ b/tools/libxl/libxl_netbuffer.c
@@ -38,21 +38,21 @@ int libxl__netbuffer_enabled(libxl__gc *gc)
     return 1;
 }
 
-int init_subkind_nic(libxl__remus_devices_state *rds)
+int init_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
     int rc, ret;
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
-    rds->nlsock = nl_socket_alloc();
-    if (!rds->nlsock) {
+    cds->nlsock = nl_socket_alloc();
+    if (!cds->nlsock) {
         LOG(ERROR, "cannot allocate nl socket");
         rc = ERROR_FAIL;
         goto out;
     }
 
-    ret = nl_connect(rds->nlsock, NETLINK_ROUTE);
+    ret = nl_connect(cds->nlsock, NETLINK_ROUTE);
     if (ret) {
         LOG(ERROR, "failed to open netlink socket: %s",
             nl_geterror(ret));
@@ -61,7 +61,7 @@ int init_subkind_nic(libxl__remus_devices_state *rds)
     }
 
     /* get list of all qdiscs installed on network devs. */
-    ret = rtnl_qdisc_alloc_cache(rds->nlsock, &rds->qdisc_cache);
+    ret = rtnl_qdisc_alloc_cache(cds->nlsock, &cds->qdisc_cache);
     if (ret) {
         LOG(ERROR, "failed to allocate qdisc cache: %s",
             nl_geterror(ret));
@@ -70,9 +70,9 @@ int init_subkind_nic(libxl__remus_devices_state *rds)
     }
 
     if (dss->remus->netbufscript) {
-        rds->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
+        cds->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
     } else {
-        rds->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
+        cds->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
                                       libxl__xen_script_dir_path());
     }
 
@@ -82,22 +82,22 @@ out:
     return rc;
 }
 
-void cleanup_subkind_nic(libxl__remus_devices_state *rds)
+void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     /* free qdisc cache */
-    if (rds->qdisc_cache) {
-        nl_cache_clear(rds->qdisc_cache);
-        nl_cache_free(rds->qdisc_cache);
-        rds->qdisc_cache = NULL;
+    if (cds->qdisc_cache) {
+        nl_cache_clear(cds->qdisc_cache);
+        nl_cache_free(cds->qdisc_cache);
+        cds->qdisc_cache = NULL;
     }
 
     /* close & free nlsock */
-    if (rds->nlsock) {
-        nl_close(rds->nlsock);
-        nl_socket_free(rds->nlsock);
-        rds->nlsock = NULL;
+    if (cds->nlsock) {
+        nl_close(cds->nlsock);
+        nl_socket_free(cds->nlsock);
+        cds->nlsock = NULL;
     }
 }
 
@@ -111,17 +111,17 @@ void cleanup_subkind_nic(libxl__remus_devices_state *rds)
  * it must ONLY be used for remus because if driver domains
  * were in use it would constitute a security vulnerability.
  */
-static const char *get_vifname(libxl__remus_device *dev,
+static const char *get_vifname(libxl__checkpoint_device *dev,
                                const libxl_device_nic *nic)
 {
     const char *vifname = NULL;
     const char *path;
     int rc;
 
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     /* Convenience aliases */
-    const uint32_t domid = dev->rds->domid;
+    const uint32_t domid = dev->cds->domid;
 
     path = GCSPRINTF("%s/backend/vif/%d/%d/vifname",
                      libxl__xs_get_dompath(gc, 0), domid, nic->devid);
@@ -144,19 +144,19 @@ static void free_qdisc(libxl__remus_device_nic *remus_nic)
     remus_nic->qdisc = NULL;
 }
 
-static int init_qdisc(libxl__remus_devices_state *rds,
+static int init_qdisc(libxl__checkpoint_devices_state *cds,
                       libxl__remus_device_nic *remus_nic)
 {
     int rc, ret, ifindex;
     struct rtnl_link *ifb = NULL;
     struct rtnl_qdisc *qdisc = NULL;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     /* Now that we have brought up REMUS_IFB device with plug qdisc for
      * this vif, so we need to refill the qdisc cache.
      */
-    ret = nl_cache_refill(rds->nlsock, rds->qdisc_cache);
+    ret = nl_cache_refill(cds->nlsock, cds->qdisc_cache);
     if (ret) {
         LOG(ERROR, "cannot refill qdisc cache: %s", nl_geterror(ret));
         rc = ERROR_FAIL;
@@ -164,7 +164,7 @@ static int init_qdisc(libxl__remus_devices_state *rds,
     }
 
     /* get a handle to the REMUS_IFB interface */
-    ret = rtnl_link_get_kernel(rds->nlsock, 0, remus_nic->ifb, &ifb);
+    ret = rtnl_link_get_kernel(cds->nlsock, 0, remus_nic->ifb, &ifb);
     if (ret) {
         LOG(ERROR, "cannot obtain handle for %s: %s", remus_nic->ifb,
             nl_geterror(ret));
@@ -187,7 +187,7 @@ static int init_qdisc(libxl__remus_devices_state *rds,
      * There is no need to explicitly free this qdisc as its just a
      * reference from the qdisc cache we allocated earlier.
      */
-    qdisc = rtnl_qdisc_get_by_parent(rds->qdisc_cache, ifindex, TC_H_ROOT);
+    qdisc = rtnl_qdisc_get_by_parent(cds->qdisc_cache, ifindex, TC_H_ROOT);
     if (qdisc) {
         const char *tc_kind = rtnl_tc_get_kind(TC_CAST(qdisc));
         /* Sanity check: Ensure that the root qdisc is a plug qdisc. */
@@ -231,19 +231,19 @@ static void netbuf_teardown_script_cb(libxl__egc *egc,
  * $REMUS_IFB (for teardown)
  * setup/teardown as command line arg.
  */
-static void setup_async_exec(libxl__remus_device *dev, char *op)
+static void setup_async_exec(libxl__checkpoint_device *dev, char *op)
 {
     int arraysize, nr = 0;
     char **env = NULL, **args = NULL;
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
-    libxl__remus_devices_state *rds = dev->rds;
+    libxl__checkpoint_devices_state *cds = dev->cds;
     libxl__async_exec_state *aes = &dev->aodev.aes;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     /* Convenience aliases */
-    char *const script = libxl__strdup(gc, rds->netbufscript);
-    const uint32_t domid = rds->domid;
+    char *const script = libxl__strdup(gc, cds->netbufscript);
+    const uint32_t domid = cds->domid;
     const int dev_id = remus_nic->devid;
     const char *const vif = remus_nic->vif;
     const char *const ifb = remus_nic->ifb;
@@ -269,7 +269,7 @@ static void setup_async_exec(libxl__remus_device *dev, char *op)
     args[nr++] = NULL;
     assert(nr == arraysize);
 
-    aes->ao = dev->rds->ao;
+    aes->ao = dev->cds->ao;
     aes->what = GCSPRINTF("%s %s", args[0], args[1]);
     aes->env = env;
     aes->args = args;
@@ -286,13 +286,13 @@ static void setup_async_exec(libxl__remus_device *dev, char *op)
 
 /* setup() and teardown() */
 
-static void nic_setup(libxl__egc *egc, libxl__remus_device *dev)
+static void nic_setup(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     int rc;
     libxl__remus_device_nic *remus_nic;
     const libxl_device_nic *nic = dev->backend_dev;
 
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     /*
      * thers's no subkind of nic devices, so nic ops is always matched
@@ -330,15 +330,15 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
                                    int rc, int status)
 {
     libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
-    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
-    libxl__remus_devices_state *rds = dev->rds;
+    libxl__checkpoint_devices_state *cds = dev->cds;
     const char *out_path_base, *hotplug_error = NULL;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     /* Convenience aliases */
-    const uint32_t domid = rds->domid;
+    const uint32_t domid = cds->domid;
     const int devid = remus_nic->devid;
     const char *const vif = remus_nic->vif;
     const char **const ifb = &remus_nic->ifb;
@@ -377,7 +377,7 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
 
     if (hotplug_error) {
         LOG(ERROR, "netbuf script %s setup failed for vif %s: %s",
-            rds->netbufscript, vif, hotplug_error);
+            cds->netbufscript, vif, hotplug_error);
         rc = ERROR_FAIL;
         goto out;
     }
@@ -388,17 +388,17 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
     }
 
     LOG(DEBUG, "%s will buffer packets from vif %s", *ifb, vif);
-    rc = init_qdisc(rds, remus_nic);
+    rc = init_qdisc(cds, remus_nic);
 
 out:
     aodev->rc = rc;
     aodev->callback(egc, aodev);
 }
 
-static void nic_teardown(libxl__egc *egc, libxl__remus_device *dev)
+static void nic_teardown(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     int rc;
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     setup_async_exec(dev, "teardown");
 
@@ -418,7 +418,7 @@ static void netbuf_teardown_script_cb(libxl__egc *egc,
                                       int rc, int status)
 {
     libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
-    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
 
     if (status && !rc)
@@ -441,12 +441,12 @@ enum {
 /* API implementations */
 
 static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
-                           libxl__remus_devices_state *rds,
+                           libxl__checkpoint_devices_state *cds,
                            int buffer_op)
 {
     int rc, ret;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     if (buffer_op == tc_buffer_start)
         ret = rtnl_qdisc_plug_buffer(remus_nic->qdisc);
@@ -458,7 +458,7 @@ static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
         goto out;
     }
 
-    ret = rtnl_qdisc_add(rds->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
+    ret = rtnl_qdisc_add(cds->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
     if (ret) {
         rc = ERROR_FAIL;
         goto out;
@@ -475,33 +475,33 @@ out:
     return rc;
 }
 
-static void nic_postsuspend(libxl__egc *egc, libxl__remus_device *dev)
+static void nic_postsuspend(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     int rc;
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
 
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
-    rc = remus_netbuf_op(remus_nic, dev->rds, tc_buffer_start);
+    rc = remus_netbuf_op(remus_nic, dev->cds, tc_buffer_start);
 
     dev->aodev.rc = rc;
     dev->aodev.callback(egc, &dev->aodev);
 }
 
-static void nic_commit(libxl__egc *egc, libxl__remus_device *dev)
+static void nic_commit(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     int rc;
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
 
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
-    rc = remus_netbuf_op(remus_nic, dev->rds, tc_buffer_release);
+    rc = remus_netbuf_op(remus_nic, dev->cds, tc_buffer_release);
 
     dev->aodev.rc = rc;
     dev->aodev.callback(egc, &dev->aodev);
 }
 
-const libxl__remus_device_instance_ops remus_device_nic = {
+const libxl__checkpoint_device_instance_ops remus_device_nic = {
     .kind = LIBXL__DEVICE_KIND_VIF,
     .setup = nic_setup,
     .teardown = nic_teardown,
diff --git a/tools/libxl/libxl_nonetbuffer.c b/tools/libxl/libxl_nonetbuffer.c
index 3c659c2..4b68152 100644
--- a/tools/libxl/libxl_nonetbuffer.c
+++ b/tools/libxl/libxl_nonetbuffer.c
@@ -22,25 +22,25 @@ int libxl__netbuffer_enabled(libxl__gc *gc)
     return 0;
 }
 
-int init_subkind_nic(libxl__remus_devices_state *rds)
+int init_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
     return 0;
 }
 
-void cleanup_subkind_nic(libxl__remus_devices_state *rds)
+void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
     return;
 }
 
-static void nic_setup(libxl__egc *egc, libxl__remus_device *dev)
+static void nic_setup(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     dev->aodev.rc = ERROR_FAIL;
     dev->aodev.callback(egc, &dev->aodev);
 }
 
-const libxl__remus_device_instance_ops remus_device_nic = {
+const libxl__checkpoint_device_instance_ops remus_device_nic = {
     .kind = LIBXL__DEVICE_KIND_VIF,
     .setup = nic_setup,
 };
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index 340d076..d41a439 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -21,9 +21,9 @@
 /*-------------------- Remus setup and teardown ---------------------*/
 
 static void remus_setup_done(libxl__egc *egc,
-                             libxl__remus_devices_state *rds, int rc);
+                             libxl__checkpoint_devices_state *cds, int rc);
 static void remus_setup_failed(libxl__egc *egc,
-                               libxl__remus_devices_state *rds, int rc);
+                               libxl__checkpoint_devices_state *cds, int rc);
 static void remus_checkpoint_stream_written(
     libxl__egc *egc, libxl__stream_write_state *sws, int rc);
 static void libxl__remus_domain_suspend_callback(void *data);
@@ -34,7 +34,7 @@ void libxl__remus_setup(libxl__egc *egc,
                         libxl__domain_save_state *dss)
 {
     /* Convenience aliases */
-    libxl__remus_devices_state *const rds = &dss->rds;
+    libxl__checkpoint_devices_state *const cds = &dss->cds;
     const libxl_domain_remus_info *const info = dss->remus;
     libxl__srm_save_autogen_callbacks *const callbacks =
         &dss->sws.shs.callbacks.save.a;
@@ -46,15 +46,15 @@ void libxl__remus_setup(libxl__egc *egc,
             LOG(ERROR, "Remus: No support for network buffering");
             goto out;
         }
-        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
+        cds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
     }
 
     if (libxl_defbool_val(info->diskbuf))
-        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
+        cds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
 
-    rds->ao = ao;
-    rds->domid = dss->domid;
-    rds->callback = remus_setup_done;
+    cds->ao = ao;
+    cds->domid = dss->domid;
+    cds->callback = remus_setup_done;
 
     dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
 
@@ -62,7 +62,7 @@ void libxl__remus_setup(libxl__egc *egc,
     callbacks->postcopy = libxl__remus_domain_resume_callback;
     callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
 
-    libxl__remus_devices_setup(egc, rds);
+    libxl__checkpoint_devices_setup(egc, cds);
     return;
 
 out:
@@ -70,9 +70,9 @@ out:
 }
 
 static void remus_setup_done(libxl__egc *egc,
-                             libxl__remus_devices_state *rds, int rc)
+                             libxl__checkpoint_devices_state *cds, int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
     STATE_AO_GC(dss->ao);
 
     if (!rc) {
@@ -82,14 +82,14 @@ static void remus_setup_done(libxl__egc *egc,
 
     LOG(ERROR, "Remus: failed to setup device for guest with domid %u, rc %d",
         dss->domid, rc);
-    rds->callback = remus_setup_failed;
-    libxl__remus_devices_teardown(egc, rds);
+    cds->callback = remus_setup_failed;
+    libxl__checkpoint_devices_teardown(egc, cds);
 }
 
 static void remus_setup_failed(libxl__egc *egc,
-                               libxl__remus_devices_state *rds, int rc)
+                               libxl__checkpoint_devices_state *cds, int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -100,7 +100,7 @@ static void remus_setup_failed(libxl__egc *egc,
 }
 
 static void remus_teardown_done(libxl__egc *egc,
-                                libxl__remus_devices_state *rds,
+                                libxl__checkpoint_devices_state *cds,
                                 int rc);
 void libxl__remus_teardown(libxl__egc *egc,
                            libxl__domain_save_state *dss,
@@ -110,15 +110,15 @@ void libxl__remus_teardown(libxl__egc *egc,
 
     LOG(WARN, "Remus: Domain suspend terminated with rc %d,"
         " teardown Remus devices...", rc);
-    dss->rds.callback = remus_teardown_done;
-    libxl__remus_devices_teardown(egc, &dss->rds);
+    dss->cds.callback = remus_teardown_done;
+    libxl__checkpoint_devices_teardown(egc, &dss->cds);
 }
 
 static void remus_teardown_done(libxl__egc *egc,
-                                libxl__remus_devices_state *rds,
+                                libxl__checkpoint_devices_state *cds,
                                 int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -133,10 +133,10 @@ static void remus_teardown_done(libxl__egc *egc,
 static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
                                 libxl__domain_suspend_state *dsps, int ok);
 static void remus_devices_postsuspend_cb(libxl__egc *egc,
-                                         libxl__remus_devices_state *rds,
+                                         libxl__checkpoint_devices_state *cds,
                                          int rc);
 static void remus_devices_preresume_cb(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
+                                       libxl__checkpoint_devices_state *cds,
                                        int rc);
 
 static void libxl__remus_domain_suspend_callback(void *data)
@@ -158,9 +158,9 @@ static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
     if (rc)
         goto out;
 
-    libxl__remus_devices_state *const rds = &dss->rds;
-    rds->callback = remus_devices_postsuspend_cb;
-    libxl__remus_devices_postsuspend(egc, rds);
+    libxl__checkpoint_devices_state *const cds = &dss->cds;
+    cds->callback = remus_devices_postsuspend_cb;
+    libxl__checkpoint_devices_postsuspend(egc, cds);
     return;
 
 out:
@@ -169,10 +169,10 @@ out:
 }
 
 static void remus_devices_postsuspend_cb(libxl__egc *egc,
-                                         libxl__remus_devices_state *rds,
+                                         libxl__checkpoint_devices_state *cds,
                                          int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
 
     if (rc)
         goto out;
@@ -192,16 +192,16 @@ static void libxl__remus_domain_resume_callback(void *data)
     libxl__domain_save_state *dss = shs->caller_state;
     STATE_AO_GC(dss->ao);
 
-    libxl__remus_devices_state *const rds = &dss->rds;
-    rds->callback = remus_devices_preresume_cb;
-    libxl__remus_devices_preresume(egc, rds);
+    libxl__checkpoint_devices_state *const cds = &dss->cds;
+    cds->callback = remus_devices_preresume_cb;
+    libxl__checkpoint_devices_preresume(egc, cds);
 }
 
 static void remus_devices_preresume_cb(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
+                                       libxl__checkpoint_devices_state *cds,
                                        int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -223,7 +223,7 @@ out:
 /*----- remus asynchronous checkpoint callback -----*/
 
 static void remus_devices_commit_cb(libxl__egc *egc,
-                                    libxl__remus_devices_state *rds,
+                                    libxl__checkpoint_devices_state *cds,
                                     int rc);
 static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
                                   const struct timeval *requested_abs,
@@ -245,7 +245,7 @@ static void remus_checkpoint_stream_written(
     libxl__domain_save_state *dss = CONTAINER_OF(sws, *dss, sws);
 
     /* Convenience aliases */
-    libxl__remus_devices_state *const rds = &dss->rds;
+    libxl__checkpoint_devices_state *const cds = &dss->cds;
 
     STATE_AO_GC(dss->ao);
 
@@ -254,8 +254,8 @@ static void remus_checkpoint_stream_written(
         goto out;
     }
 
-    rds->callback = remus_devices_commit_cb;
-    libxl__remus_devices_commit(egc, rds);
+    cds->callback = remus_devices_commit_cb;
+    libxl__checkpoint_devices_commit(egc, cds);
 
     return;
 
@@ -264,10 +264,10 @@ out:
 }
 
 static void remus_devices_commit_cb(libxl__egc *egc,
-                                    libxl__remus_devices_state *rds,
+                                    libxl__checkpoint_devices_state *cds,
                                     int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
 
     STATE_AO_GC(dss->ao);
 
diff --git a/tools/libxl/libxl_remus_disk_drbd.c b/tools/libxl/libxl_remus_disk_drbd.c
index 1c3a88a..4dddc58 100644
--- a/tools/libxl/libxl_remus_disk_drbd.c
+++ b/tools/libxl/libxl_remus_disk_drbd.c
@@ -26,30 +26,30 @@ typedef struct libxl__remus_drbd_disk {
     int ackwait;
 } libxl__remus_drbd_disk;
 
-int init_subkind_drbd_disk(libxl__remus_devices_state *rds)
+int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds)
 {
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
-    rds->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
+    cds->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
                                        libxl__xen_script_dir_path());
 
     return 0;
 }
 
-void cleanup_subkind_drbd_disk(libxl__remus_devices_state *rds)
+void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds)
 {
     return;
 }
 
 /*----- helper functions, for async calls -----*/
 static void drbd_async_call(libxl__egc *egc,
-                            libxl__remus_device *dev,
-                            void func(libxl__remus_device *),
+                            libxl__checkpoint_device *dev,
+                            void func(libxl__checkpoint_device *),
                             libxl__ev_child_callback callback)
 {
     int pid, rc;
     libxl__ao_device *aodev = &dev->aodev;
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     /* Fork and call */
     pid = libxl__ev_child_fork(gc, &aodev->child, callback);
@@ -82,21 +82,21 @@ static void match_async_exec_cb(libxl__egc *egc,
 
 /* implementations */
 
-static void match_async_exec(libxl__egc *egc, libxl__remus_device *dev);
+static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev);
 
-static void drbd_setup(libxl__egc *egc, libxl__remus_device *dev)
+static void drbd_setup(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     match_async_exec(egc, dev);
 }
 
-static void match_async_exec(libxl__egc *egc, libxl__remus_device *dev)
+static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     int arraysize, nr = 0, rc;
     const libxl_device_disk *disk = dev->backend_dev;
     libxl__async_exec_state *aes = &dev->aodev.aes;
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     /* setup env & args */
     arraysize = 1;
@@ -107,12 +107,12 @@ static void match_async_exec(libxl__egc *egc, libxl__remus_device *dev)
     arraysize = 3;
     nr = 0;
     GCNEW_ARRAY(aes->args, arraysize);
-    aes->args[nr++] = dev->rds->drbd_probe_script;
+    aes->args[nr++] = dev->cds->drbd_probe_script;
     aes->args[nr++] = disk->pdev_path;
     aes->args[nr++] = NULL;
     assert(nr <= arraysize);
 
-    aes->ao = dev->rds->ao;
+    aes->ao = dev->cds->ao;
     aes->what = GCSPRINTF("%s %s", aes->args[0], aes->args[1]);
     aes->timeout_ms = LIBXL_HOTPLUG_TIMEOUT * 1000;
     aes->callback = match_async_exec_cb;
@@ -136,7 +136,7 @@ static void match_async_exec_cb(libxl__egc *egc,
                                 int rc, int status)
 {
     libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
-    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_drbd_disk *drbd_disk;
     const libxl_device_disk *disk = dev->backend_dev;
 
@@ -146,7 +146,7 @@ static void match_async_exec_cb(libxl__egc *egc,
         goto out;
 
     if (status) {
-        rc = ERROR_REMUS_DEVOPS_DOES_NOT_MATCH;
+        rc = ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH;
         /* BUG: seems to assume that any exit status means `no match' */
         /* BUG: exit status will have been logged as an error */
         goto out;
@@ -171,10 +171,10 @@ out:
     aodev->callback(egc, aodev);
 }
 
-static void drbd_teardown(libxl__egc *egc, libxl__remus_device *dev)
+static void drbd_teardown(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     libxl__remus_drbd_disk *drbd_disk = dev->concrete_data;
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     close(drbd_disk->ctl_fd);
     dev->aodev.rc = 0;
@@ -191,9 +191,9 @@ static void checkpoint_async_call_done(libxl__egc *egc,
 /* API implementations */
 
 /* this op will not wait and block, so implement as sync op */
-static void drbd_postsuspend(libxl__egc *egc, libxl__remus_device *dev)
+static void drbd_postsuspend(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     libxl__remus_drbd_disk *rdd = dev->concrete_data;
 
@@ -207,16 +207,16 @@ static void drbd_postsuspend(libxl__egc *egc, libxl__remus_device *dev)
 }
 
 
-static void drbd_preresume_async(libxl__remus_device *dev);
+static void drbd_preresume_async(libxl__checkpoint_device *dev);
 
-static void drbd_preresume(libxl__egc *egc, libxl__remus_device *dev)
+static void drbd_preresume(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     drbd_async_call(egc, dev, drbd_preresume_async, checkpoint_async_call_done);
 }
 
-static void drbd_preresume_async(libxl__remus_device *dev)
+static void drbd_preresume_async(libxl__checkpoint_device *dev)
 {
     libxl__remus_drbd_disk *rdd = dev->concrete_data;
     int ackwait = rdd->ackwait;
@@ -235,7 +235,7 @@ static void checkpoint_async_call_done(libxl__egc *egc,
 {
     int rc;
     libxl__ao_device *aodev = CONTAINER_OF(child, *aodev, child);
-    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_drbd_disk *rdd = dev->concrete_data;
 
     STATE_AO_GC(aodev->ao);
@@ -253,7 +253,7 @@ out:
     aodev->callback(egc, aodev);
 }
 
-const libxl__remus_device_instance_ops remus_device_drbd_disk = {
+const libxl__checkpoint_device_instance_ops remus_device_drbd_disk = {
     .kind = LIBXL__DEVICE_KIND_VBD,
     .setup = drbd_setup,
     .teardown = drbd_teardown,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 605fb9a..632c009 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -61,8 +61,8 @@ libxl_error = Enumeration("error", [
     (-15, "LOCK_FAIL"),
     (-16, "JSON_CONFIG_EMPTY"),
     (-17, "DEVICE_EXISTS"),
-    (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
-    (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
+    (-18, "CHECKPOINT_DEVOPS_DOES_NOT_MATCH"),
+    (-19, "CHECKPOINT_DEVICE_NOT_SUPPORTED"),
     (-20, "VNUMA_CONFIG_INVALID"),
     (-21, "DOMAIN_NOTFOUND"),
     (-22, "ABORTED"),
-- 
2.5.0


* [PATCH v8 10/13] tools/libxl: adjust the indentation
  2016-02-18  2:43 [PATCH v8 00/13] Prerequisite patches for COLO Wen Congyang
                   ` (8 preceding siblings ...)
  2016-02-18  2:43 ` [PATCH v8 09/13] tools/libxl: rename remus device to checkpoint device Wen Congyang
@ 2016-02-18  2:43 ` Wen Congyang
  2016-02-18  2:43 ` [PATCH v8 11/13] tools/libxl: store remus_ops in checkpoint device state Wen Congyang
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Wen Congyang @ 2016-02-18  2:43 UTC (permalink / raw)
  To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell,
	Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

This just tidies up the indentation left behind by the automatic
renaming in the "tools/libxl: rename remus device to checkpoint device"
patch.

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_checkpoint_device.c | 21 +++++++++++----------
 tools/libxl/libxl_internal.h          | 19 +++++++++++--------
 2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
index 109cd23..226f159 100644
--- a/tools/libxl/libxl_checkpoint_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -73,9 +73,9 @@ static void devices_teardown_cb(libxl__egc *egc,
 /* checkpoint device setup and teardown */
 
 static libxl__checkpoint_device* checkpoint_device_init(libxl__egc *egc,
-                                              libxl__checkpoint_devices_state *cds,
-                                              libxl__device_kind kind,
-                                              void *libxl_dev)
+                                        libxl__checkpoint_devices_state *cds,
+                                        libxl__device_kind kind,
+                                        void *libxl_dev)
 {
     libxl__checkpoint_device *dev = NULL;
 
@@ -89,9 +89,10 @@ static libxl__checkpoint_device* checkpoint_device_init(libxl__egc *egc,
 }
 
 static void checkpoint_devices_setup(libxl__egc *egc,
-                                libxl__checkpoint_devices_state *cds);
+                                     libxl__checkpoint_devices_state *cds);
 
-void libxl__checkpoint_devices_setup(libxl__egc *egc, libxl__checkpoint_devices_state *cds)
+void libxl__checkpoint_devices_setup(libxl__egc *egc,
+                                     libxl__checkpoint_devices_state *cds)
 {
     int i, rc;
 
@@ -137,7 +138,7 @@ out:
 }
 
 static void checkpoint_devices_setup(libxl__egc *egc,
-                                libxl__checkpoint_devices_state *cds)
+                                     libxl__checkpoint_devices_state *cds)
 {
     int i, rc;
 
@@ -285,12 +286,12 @@ static void devices_checkpoint_cb(libxl__egc *egc,
 
 /* API implementations */
 
-#define define_checkpoint_api(api)                                \
-void libxl__checkpoint_devices_##api(libxl__egc *egc,                        \
-                                libxl__checkpoint_devices_state *cds)        \
+#define define_checkpoint_api(api)                                      \
+void libxl__checkpoint_devices_##api(libxl__egc *egc,                   \
+                                libxl__checkpoint_devices_state *cds)   \
 {                                                                       \
     int i;                                                              \
-    libxl__checkpoint_device *dev;                                           \
+    libxl__checkpoint_device *dev;                                      \
                                                                         \
     STATE_AO_GC(cds->ao);                                               \
                                                                         \
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 630f048..bde7a15 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2818,7 +2818,8 @@ typedef struct libxl__save_helper_state {
  * Each device type needs to implement the interfaces specified in
  * the libxl__checkpoint_device_instance_ops if it wishes to support Remus.
  *
- * The high-level control flow through the checkpoint device layer is shown below:
+ * The high-level control flow through the checkpoint device layer is shown
+ * below:
  *
  * xl remus
  *  |->  libxl_domain_remus_start
@@ -2879,7 +2880,8 @@ int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
 void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
 
 typedef void libxl__checkpoint_callback(libxl__egc *,
-                                   libxl__checkpoint_devices_state *, int rc);
+                                        libxl__checkpoint_devices_state *,
+                                        int rc);
 
 /*
  * State associated with a checkpoint invocation, including parameters
@@ -2887,7 +2889,7 @@ typedef void libxl__checkpoint_callback(libxl__egc *,
  * save/restore machinery.
  */
 struct libxl__checkpoint_devices_state {
-    /*---- must be set by caller of libxl__checkpoint_device_(setup|teardown) ----*/
+    /*-- must be set by caller of libxl__checkpoint_device_(setup|teardown) --*/
 
     libxl__ao *ao;
     uint32_t domid;
@@ -2900,7 +2902,8 @@ struct libxl__checkpoint_devices_state {
     /*
      * this array is allocated before setup the checkpoint devices by the
      * checkpoint abstract layer.
-     * devs may be NULL, means there's no checkpoint devices that has been set up.
+     * devs may be NULL, means there's no checkpoint devices that has been
+     * set up.
      * the size of this array is 'num_devices', which is the total number
      * of libxl nic devices and disk devices(num_nics + num_disks).
      */
@@ -2962,13 +2965,13 @@ struct libxl__checkpoint_device {
 _hidden void libxl__checkpoint_devices_setup(libxl__egc *egc,
                                         libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_teardown(libxl__egc *egc,
-                                           libxl__checkpoint_devices_state *cds);
+                                        libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_postsuspend(libxl__egc *egc,
-                                              libxl__checkpoint_devices_state *cds);
+                                        libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_preresume(libxl__egc *egc,
-                                            libxl__checkpoint_devices_state *cds);
+                                        libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
-                                         libxl__checkpoint_devices_state *cds);
+                                        libxl__checkpoint_devices_state *cds);
 _hidden int libxl__netbuffer_enabled(libxl__gc *gc);
 
 /*----- Legacy conversion helper -----*/
-- 
2.5.0


* [PATCH v8 11/13] tools/libxl: store remus_ops in checkpoint device state
  2016-02-18  2:43 [PATCH v8 00/13] Prerequisite patches for COLO Wen Congyang
                   ` (9 preceding siblings ...)
  2016-02-18  2:43 ` [PATCH v8 10/13] tools/libxl: adjust the indentation Wen Congyang
@ 2016-02-18  2:43 ` Wen Congyang
  2016-02-18  2:43 ` [PATCH v8 12/13] tools/libxl: move remus state into a seperate structure Wen Congyang
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 26+ messages in thread
From: Wen Congyang @ 2016-02-18  2:43 UTC (permalink / raw)
  To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell,
	Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

The checkpoint device is an abstraction layer for checkpointing, and
COLO can use it as well. However, some code in the checkpoint device
layer still refers to Remus directly.

This patch, together with:
 tools/libxl: move remus state into a seperate structure
 tools/libxl: seperate device init/cleanup from checkpoint device layer
separates Remus from the checkpoint device layer.

The checkpoint device layer currently uses remus_ops directly. Store
the ops in the checkpoint device state instead, so that the checkpoint
device layer is no longer aware of remus_ops.

This is pure refactoring with no functional change.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
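Note for readers of the series: the lookup that this patch relocates
can be sketched as below. This is a minimal, self-contained
illustration and not the libxl code. All demo_* names are hypothetical
stand-ins, the comparison on kind replaces the real asynchronous
setup/match step, and starting ops_index at -1 is an assumption made
for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the libxl types. */
typedef struct demo_ops {
    int kind;            /* device kind this ops table handles */
    const char *name;    /* label used only by this sketch */
} demo_ops;

typedef struct demo_state {
    /* NULL-terminated array supplied by the caller (Remus here). */
    const demo_ops **ops;
} demo_state;

typedef struct demo_device {
    int kind;
    demo_state *cds;
    const demo_ops *ops;
    int ops_index;       /* assumed to start at -1 */
} demo_device;

/* The caller owns the table; the generic layer never names it. */
static const demo_ops demo_nic  = { 1, "nic" };
static const demo_ops demo_disk = { 2, "drbd_disk" };
static const demo_ops *demo_remus_ops[] = {
    &demo_nic,
    &demo_disk,
    NULL,
};

/*
 * Generic layer: walk cds->ops until an entry of the right kind is
 * found, or the NULL terminator is reached.
 */
static const demo_ops *demo_pick_ops(demo_device *dev)
{
    do {
        dev->ops = dev->cds->ops[++dev->ops_index];
        if (!dev->ops)
            return NULL;     /* no ops table supports this device */
    } while (dev->ops->kind != dev->kind);

    return dev->ops;
}
```

With this shape, a second user such as COLO would only need to supply
its own NULL-terminated table through the state structure; the
iteration itself stays unchanged.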
 tools/libxl/libxl_checkpoint_device.c | 10 +---------
 tools/libxl/libxl_internal.h          |  2 ++
 tools/libxl/libxl_remus.c             |  9 +++++++++
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
index 226f159..bbc6dc4 100644
--- a/tools/libxl/libxl_checkpoint_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -17,14 +17,6 @@
 
 #include "libxl_internal.h"
 
-extern const libxl__checkpoint_device_instance_ops remus_device_nic;
-extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
-static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
-    &remus_device_nic,
-    &remus_device_drbd_disk,
-    NULL,
-};
-
 /*----- helper functions -----*/
 
 static int init_device_subkind(libxl__checkpoint_devices_state *cds)
@@ -172,7 +164,7 @@ static void device_setup_iterate(libxl__egc *egc, libxl__ao_device *aodev)
         goto out;
 
     do {
-        dev->ops = remus_ops[++dev->ops_index];
+        dev->ops = dev->cds->ops[++dev->ops_index];
         if (!dev->ops) {
             libxl_device_nic * nic = NULL;
             libxl_device_disk * disk = NULL;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index bde7a15..2847d13 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2895,6 +2895,8 @@ struct libxl__checkpoint_devices_state {
     uint32_t domid;
     libxl__checkpoint_callback *callback;
     int device_kind_flags;
+    /* The ops must be pointer array, and the last ops must be NULL. */
+    const libxl__checkpoint_device_instance_ops **ops;
 
     /*----- private for abstract layer only -----*/
 
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index d41a439..86f81c3 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -18,6 +18,14 @@
 
 #include "libxl_internal.h"
 
+extern const libxl__checkpoint_device_instance_ops remus_device_nic;
+extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
+static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
+    &remus_device_nic,
+    &remus_device_drbd_disk,
+    NULL,
+};
+
 /*-------------------- Remus setup and teardown ---------------------*/
 
 static void remus_setup_done(libxl__egc *egc,
@@ -55,6 +63,7 @@ void libxl__remus_setup(libxl__egc *egc,
     cds->ao = ao;
     cds->domid = dss->domid;
     cds->callback = remus_setup_done;
+    cds->ops = remus_ops;
 
     dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
 
-- 
2.5.0


* [PATCH v8 12/13] tools/libxl: move remus state into a seperate structure
  2016-02-18  2:43 [PATCH v8 00/13] Prerequisite patches for COLO Wen Congyang
                   ` (10 preceding siblings ...)
  2016-02-18  2:43 ` [PATCH v8 11/13] tools/libxl: store remus_ops in checkpoint device state Wen Congyang
@ 2016-02-18  2:43 ` Wen Congyang
  2016-02-18  2:43 ` [PATCH v8 13/13] tools/libxl: seperate device init/cleanup from checkpoint device layer Wen Congyang
  2016-02-26 15:54 ` [PATCH v8 00/13] Prerequisite patches for COLO Wei Liu
  13 siblings, 0 replies; 26+ messages in thread
From: Wen Congyang @ 2016-02-18  2:43 UTC (permalink / raw)
  To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell,
	Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

Add a new structure, libxl__remus_state, and move the concrete layer's
private members into it.
This is pure refactoring with no functional change.
Initialise interval in libxl__remus_setup(). It is safe to move this
initialisation because the value is only used by Remus, and Remus only
uses it after libxl__remus_setup().

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
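Note for readers of the series: the split between the generic state
and the Remus state can be sketched as below. This is a simplified
illustration under stated assumptions, not the libxl code. The demo_*
names are hypothetical, and the sketch assumes that the Remus setup
path is what points concrete_data at the Remus state before any
subkind init function runs.

```c
#include <assert.h>
#include <stddef.h>

/* Generic checkpoint state: knows nothing about Remus. */
typedef struct demo_cds {
    int domid;
    void *concrete_data;   /* opaque to the generic layer */
} demo_cds;

/* Remus-only state, formerly mixed into the generic structure. */
typedef struct demo_remus_state {
    int interval;               /* checkpoint interval */
    const char *netbufscript;   /* private to the nic subkind */
} demo_remus_state;

/* Both live side by side in the save state. */
typedef struct demo_save_state {
    demo_remus_state rs;
    demo_cds cds;
} demo_save_state;

/* Assumed setup step: record the interval and publish the Remus state. */
static void demo_remus_setup(demo_save_state *dss, int interval)
{
    dss->rs.interval = interval;
    dss->cds.concrete_data = &dss->rs;
}

/* Concrete layer: recover the Remus state from the opaque pointer. */
static int demo_init_subkind_nic(demo_cds *cds)
{
    demo_remus_state *rs = cds->concrete_data;

    if (!rs)
        return -1;          /* setup has not run yet */

    rs->netbufscript = "remus-netbuf-setup";
    return 0;
}
```

The void pointer keeps the generic structure free of Remus fields, at
the cost of the concrete layer having to trust that the pointer really
refers to a Remus state.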
 tools/libxl/libxl.c                 |  2 +-
 tools/libxl/libxl_dom_save.c        |  3 +--
 tools/libxl/libxl_internal.h        | 35 +++++++++++++++-----------
 tools/libxl/libxl_netbuffer.c       | 49 +++++++++++++++++++++----------------
 tools/libxl/libxl_remus.c           | 24 ++++++++++++------
 tools/libxl/libxl_remus_disk_drbd.c |  8 +++---
 6 files changed, 72 insertions(+), 49 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 58b4574..4cdc169 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -881,7 +881,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
     assert(info);
 
     /* Point of no return */
-    libxl__remus_setup(egc, dss);
+    libxl__remus_setup(egc, &dss->rs);
     return AO_INPROGRESS;
 
  out:
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 28e2a41..4eb7960 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -383,7 +383,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
     }
 
     if (dss->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_REMUS) {
-        dss->interval = r_info->interval;
         if (libxl_defbool_val(r_info->compression))
             dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
     }
@@ -433,7 +432,7 @@ static void domain_save_done(libxl__egc *egc,
          * from sending checkpoints. Teardown the network buffers and
          * release netlink resources.  This is an async op.
          */
-        libxl__remus_teardown(egc, dss, rc);
+        libxl__remus_teardown(egc, &dss->rs, rc);
         return;
     }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 2847d13..a1aae97 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2894,6 +2894,7 @@ struct libxl__checkpoint_devices_state {
     libxl__ao *ao;
     uint32_t domid;
     libxl__checkpoint_callback *callback;
+    void *concrete_data;
     int device_kind_flags;
     /* The ops must be pointer array, and the last ops must be NULL. */
     const libxl__checkpoint_device_instance_ops **ops;
@@ -2917,16 +2918,6 @@ struct libxl__checkpoint_devices_state {
     int num_disks;
 
     libxl__multidev multidev;
-
-    /*----- private for concrete (device-specific) layer only -----*/
-
-    /* private for nic device subkind ops */
-    char *netbufscript;
-    struct nl_sock *nlsock;
-    struct nl_cache *qdisc_cache;
-
-    /* private for drbd disk subkind ops */
-    char *drbd_probe_script;
 };
 
 /*
@@ -2974,6 +2965,23 @@ _hidden void libxl__checkpoint_devices_preresume(libxl__egc *egc,
                                         libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
                                         libxl__checkpoint_devices_state *cds);
+
+/*----- Remus related state structure -----*/
+typedef struct libxl__remus_state libxl__remus_state;
+struct libxl__remus_state {
+    /* private */
+    libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
+    int interval; /* checkpoint interval */
+
+    /*----- private for concrete (device-specific) layer only -----*/
+    /* private for nic device subkind ops */
+    char *netbufscript;
+    struct nl_sock *nlsock;
+    struct nl_cache *qdisc_cache;
+
+    /* private for drbd disk subkind ops */
+    char *drbd_probe_script;
+};
 _hidden int libxl__netbuffer_enabled(libxl__gc *gc);
 
 /*----- Legacy conversion helper -----*/
@@ -3127,9 +3135,8 @@ struct libxl__domain_save_state {
     int hvm;
     int xcflags;
     libxl__domain_suspend_state dsps;
+    libxl__remus_state rs;
     libxl__checkpoint_devices_state cds;
-    libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
-    int interval; /* checkpoint interval (for Remus) */
     libxl__stream_write_state sws;
     libxl__logdirty_switch logdirty;
 };
@@ -3535,9 +3542,9 @@ _hidden void libxl__domain_suspend_callback(void *data);
 
 /* Remus setup and teardown */
 _hidden void libxl__remus_setup(libxl__egc *egc,
-                                libxl__domain_save_state *dss);
+                                libxl__remus_state *rs);
 _hidden void libxl__remus_teardown(libxl__egc *egc,
-                                   libxl__domain_save_state *dss,
+                                   libxl__remus_state *rs,
                                    int rc);
 _hidden void libxl__remus_restore_setup(libxl__egc *egc,
                                         libxl__domain_create_state *dcs);
diff --git a/tools/libxl/libxl_netbuffer.c b/tools/libxl/libxl_netbuffer.c
index 33c2a42..5c7e8a2 100644
--- a/tools/libxl/libxl_netbuffer.c
+++ b/tools/libxl/libxl_netbuffer.c
@@ -42,17 +42,18 @@ int init_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
     int rc, ret;
     libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
+    libxl__remus_state *rs = cds->concrete_data;
 
     STATE_AO_GC(cds->ao);
 
-    cds->nlsock = nl_socket_alloc();
-    if (!cds->nlsock) {
+    rs->nlsock = nl_socket_alloc();
+    if (!rs->nlsock) {
         LOG(ERROR, "cannot allocate nl socket");
         rc = ERROR_FAIL;
         goto out;
     }
 
-    ret = nl_connect(cds->nlsock, NETLINK_ROUTE);
+    ret = nl_connect(rs->nlsock, NETLINK_ROUTE);
     if (ret) {
         LOG(ERROR, "failed to open netlink socket: %s",
             nl_geterror(ret));
@@ -61,7 +62,7 @@ int init_subkind_nic(libxl__checkpoint_devices_state *cds)
     }
 
     /* get list of all qdiscs installed on network devs. */
-    ret = rtnl_qdisc_alloc_cache(cds->nlsock, &cds->qdisc_cache);
+    ret = rtnl_qdisc_alloc_cache(rs->nlsock, &rs->qdisc_cache);
     if (ret) {
         LOG(ERROR, "failed to allocate qdisc cache: %s",
             nl_geterror(ret));
@@ -70,10 +71,10 @@ int init_subkind_nic(libxl__checkpoint_devices_state *cds)
     }
 
     if (dss->remus->netbufscript) {
-        cds->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
+        rs->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
     } else {
-        cds->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
-                                      libxl__xen_script_dir_path());
+        rs->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
+                                     libxl__xen_script_dir_path());
     }
 
     rc = 0;
@@ -84,20 +85,22 @@ out:
 
 void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
+    libxl__remus_state *rs = cds->concrete_data;
+
     STATE_AO_GC(cds->ao);
 
     /* free qdisc cache */
-    if (cds->qdisc_cache) {
-        nl_cache_clear(cds->qdisc_cache);
-        nl_cache_free(cds->qdisc_cache);
-        cds->qdisc_cache = NULL;
+    if (rs->qdisc_cache) {
+        nl_cache_clear(rs->qdisc_cache);
+        nl_cache_free(rs->qdisc_cache);
+        rs->qdisc_cache = NULL;
     }
 
     /* close & free nlsock */
-    if (cds->nlsock) {
-        nl_close(cds->nlsock);
-        nl_socket_free(cds->nlsock);
-        cds->nlsock = NULL;
+    if (rs->nlsock) {
+        nl_close(rs->nlsock);
+        nl_socket_free(rs->nlsock);
+        rs->nlsock = NULL;
     }
 }
 
@@ -150,13 +153,14 @@ static int init_qdisc(libxl__checkpoint_devices_state *cds,
     int rc, ret, ifindex;
     struct rtnl_link *ifb = NULL;
     struct rtnl_qdisc *qdisc = NULL;
+    libxl__remus_state *rs = cds->concrete_data;
 
     STATE_AO_GC(cds->ao);
 
     /* Now that we have brought up REMUS_IFB device with plug qdisc for
      * this vif, so we need to refill the qdisc cache.
      */
-    ret = nl_cache_refill(cds->nlsock, cds->qdisc_cache);
+    ret = nl_cache_refill(rs->nlsock, rs->qdisc_cache);
     if (ret) {
         LOG(ERROR, "cannot refill qdisc cache: %s", nl_geterror(ret));
         rc = ERROR_FAIL;
@@ -164,7 +168,7 @@ static int init_qdisc(libxl__checkpoint_devices_state *cds,
     }
 
     /* get a handle to the REMUS_IFB interface */
-    ret = rtnl_link_get_kernel(cds->nlsock, 0, remus_nic->ifb, &ifb);
+    ret = rtnl_link_get_kernel(rs->nlsock, 0, remus_nic->ifb, &ifb);
     if (ret) {
         LOG(ERROR, "cannot obtain handle for %s: %s", remus_nic->ifb,
             nl_geterror(ret));
@@ -187,7 +191,7 @@ static int init_qdisc(libxl__checkpoint_devices_state *cds,
      * There is no need to explicitly free this qdisc as its just a
      * reference from the qdisc cache we allocated earlier.
      */
-    qdisc = rtnl_qdisc_get_by_parent(cds->qdisc_cache, ifindex, TC_H_ROOT);
+    qdisc = rtnl_qdisc_get_by_parent(rs->qdisc_cache, ifindex, TC_H_ROOT);
     if (qdisc) {
         const char *tc_kind = rtnl_tc_get_kind(TC_CAST(qdisc));
         /* Sanity check: Ensure that the root qdisc is a plug qdisc. */
@@ -238,11 +242,12 @@ static void setup_async_exec(libxl__checkpoint_device *dev, char *op)
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
     libxl__checkpoint_devices_state *cds = dev->cds;
     libxl__async_exec_state *aes = &dev->aodev.aes;
+    libxl__remus_state *rs = cds->concrete_data;
 
     STATE_AO_GC(cds->ao);
 
     /* Convenience aliases */
-    char *const script = libxl__strdup(gc, cds->netbufscript);
+    char *const script = libxl__strdup(gc, rs->netbufscript);
     const uint32_t domid = cds->domid;
     const int dev_id = remus_nic->devid;
     const char *const vif = remus_nic->vif;
@@ -333,6 +338,7 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
     libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
     libxl__checkpoint_devices_state *cds = dev->cds;
+    libxl__remus_state *rs = cds->concrete_data;
     const char *out_path_base, *hotplug_error = NULL;
 
     STATE_AO_GC(cds->ao);
@@ -377,7 +383,7 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
 
     if (hotplug_error) {
         LOG(ERROR, "netbuf script %s setup failed for vif %s: %s",
-            cds->netbufscript, vif, hotplug_error);
+            rs->netbufscript, vif, hotplug_error);
         rc = ERROR_FAIL;
         goto out;
     }
@@ -445,6 +451,7 @@ static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
                            int buffer_op)
 {
     int rc, ret;
+    libxl__remus_state *rs = cds->concrete_data;
 
     STATE_AO_GC(cds->ao);
 
@@ -458,7 +465,7 @@ static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
         goto out;
     }
 
-    ret = rtnl_qdisc_add(cds->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
+    ret = rtnl_qdisc_add(rs->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
     if (ret) {
         rc = ERROR_FAIL;
         goto out;
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index 86f81c3..e83cdc9 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -38,9 +38,10 @@ static void libxl__remus_domain_suspend_callback(void *data);
 static void libxl__remus_domain_resume_callback(void *data);
 static void libxl__remus_domain_save_checkpoint_callback(void *data);
 
-void libxl__remus_setup(libxl__egc *egc,
-                        libxl__domain_save_state *dss)
+void libxl__remus_setup(libxl__egc *egc, libxl__remus_state *rs)
 {
+    libxl__domain_save_state *dss = CONTAINER_OF(rs, *dss, rs);
+
     /* Convenience aliases */
     libxl__checkpoint_devices_state *const cds = &dss->cds;
     const libxl_domain_remus_info *const info = dss->remus;
@@ -64,6 +65,8 @@ void libxl__remus_setup(libxl__egc *egc,
     cds->domid = dss->domid;
     cds->callback = remus_setup_done;
     cds->ops = remus_ops;
+    cds->concrete_data = rs;
+    rs->interval = info->interval;
 
     dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
 
@@ -112,15 +115,20 @@ static void remus_teardown_done(libxl__egc *egc,
                                 libxl__checkpoint_devices_state *cds,
                                 int rc);
 void libxl__remus_teardown(libxl__egc *egc,
-                           libxl__domain_save_state *dss,
+                           libxl__remus_state *rs,
                            int rc)
 {
+    libxl__domain_save_state *dss = CONTAINER_OF(rs, *dss, rs);
+
+    /* Convenience aliases */
+    libxl__checkpoint_devices_state *const cds = &dss->cds;
+
     EGC_GC;
 
     LOG(WARN, "Remus: Domain suspend terminated with rc %d,"
         " teardown Remus devices...", rc);
-    dss->cds.callback = remus_teardown_done;
-    libxl__checkpoint_devices_teardown(egc, &dss->cds);
+    cds->callback = remus_teardown_done;
+    libxl__checkpoint_devices_teardown(egc, cds);
 }
 
 static void remus_teardown_done(libxl__egc *egc,
@@ -294,9 +302,9 @@ static void remus_devices_commit_cb(libxl__egc *egc,
      */
 
     /* Set checkpoint interval timeout */
-    rc = libxl__ev_time_register_rel(ao, &dss->checkpoint_timeout,
+    rc = libxl__ev_time_register_rel(ao, &dss->rs.checkpoint_timeout,
                                      remus_next_checkpoint,
-                                     dss->interval);
+                                     dss->rs.interval);
 
     if (rc)
         goto out;
@@ -312,7 +320,7 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
                                   int rc)
 {
     libxl__domain_save_state *dss =
-                            CONTAINER_OF(ev, *dss, checkpoint_timeout);
+                            CONTAINER_OF(ev, *dss, rs.checkpoint_timeout);
 
     STATE_AO_GC(dss->ao);
 
diff --git a/tools/libxl/libxl_remus_disk_drbd.c b/tools/libxl/libxl_remus_disk_drbd.c
index 4dddc58..844dd66 100644
--- a/tools/libxl/libxl_remus_disk_drbd.c
+++ b/tools/libxl/libxl_remus_disk_drbd.c
@@ -28,10 +28,11 @@ typedef struct libxl__remus_drbd_disk {
 
 int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds)
 {
+    libxl__remus_state *rs = cds->concrete_data;
     STATE_AO_GC(cds->ao);
 
-    cds->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
-                                       libxl__xen_script_dir_path());
+    rs->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
+                                      libxl__xen_script_dir_path());
 
     return 0;
 }
@@ -96,6 +97,7 @@ static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev)
     int arraysize, nr = 0, rc;
     const libxl_device_disk *disk = dev->backend_dev;
     libxl__async_exec_state *aes = &dev->aodev.aes;
+    libxl__remus_state *rs = dev->cds->concrete_data;
     STATE_AO_GC(dev->cds->ao);
 
     /* setup env & args */
@@ -107,7 +109,7 @@ static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev)
     arraysize = 3;
     nr = 0;
     GCNEW_ARRAY(aes->args, arraysize);
-    aes->args[nr++] = dev->cds->drbd_probe_script;
+    aes->args[nr++] = rs->drbd_probe_script;
     aes->args[nr++] = disk->pdev_path;
     aes->args[nr++] = NULL;
     assert(nr <= arraysize);
-- 
2.5.0
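
Editorial note on the pattern used in the diff above: libxl__remus_setup()
and libxl__remus_teardown() now take only the embedded libxl__remus_state
and recover the enclosing libxl__domain_save_state with CONTAINER_OF. A
minimal sketch of that idea follows. The macro and struct names here are
simplified stand-ins invented for illustration, not the real libxl
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for libxl's CONTAINER_OF; an assumption for this sketch. */
#define CONTAINER_OF_SKETCH(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, cut-down version of the Remus state. */
struct remus_state_sketch {
    int interval;                   /* checkpoint interval */
};

/* Hypothetical, cut-down version of the save state that embeds it. */
struct save_state_sketch {
    int domid;
    struct remus_state_sketch rs;   /* embedded by value, like dss->rs */
};

/* Receives only the Remus state and walks back to the enclosing struct. */
static int domid_from_rs(struct remus_state_sketch *rs)
{
    struct save_state_sketch *dss;

    assert(rs != NULL);
    dss = CONTAINER_OF_SKETCH(rs, struct save_state_sketch, rs);
    return dss->domid;
}
```

This only works because rs is embedded by value in the outer structure; a
pointer member would not have a fixed offset to subtract.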

* [PATCH v8 13/13] tools/libxl: seperate device init/cleanup from checkpoint device layer
  2016-02-18  2:43 [PATCH v8 00/13] Prerequisite patches for COLO Wen Congyang
                   ` (11 preceding siblings ...)
  2016-02-18  2:43 ` [PATCH v8 12/13] tools/libxl: move remus state into a seperate structure Wen Congyang
@ 2016-02-18  2:43 ` Wen Congyang
  2016-02-26 15:54 ` [PATCH v8 00/13] Prerequisite patches for COLO Wei Liu
  13 siblings, 0 replies; 26+ messages in thread
From: Wen Congyang @ 2016-02-18  2:43 UTC (permalink / raw)
  To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell,
	Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

We call (init|cleanup)_subkind_nic and (init|cleanup)_subkind_drbd_disk
directly in the checkpoint device layer. Move them to libxl_remus.c, and
call them before libxl__checkpoint_devices_setup() and after
libxl__checkpoint_devices_teardown() respectively.
This is pure refactoring, with no functional changes.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_checkpoint_device.c | 42 ++---------------------------------
 tools/libxl/libxl_remus.c             | 42 +++++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+), 40 deletions(-)

diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
index bbc6dc4..0a16dbb 100644
--- a/tools/libxl/libxl_checkpoint_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -17,38 +17,6 @@
 
 #include "libxl_internal.h"
 
-/*----- helper functions -----*/
-
-static int init_device_subkind(libxl__checkpoint_devices_state *cds)
-{
-    /* init device subkind-specific state in the libxl ctx */
-    int rc;
-    STATE_AO_GC(cds->ao);
-
-    if (libxl__netbuffer_enabled(gc)) {
-        rc = init_subkind_nic(cds);
-        if (rc) goto out;
-    }
-
-    rc = init_subkind_drbd_disk(cds);
-    if (rc) goto out;
-
-    rc = 0;
-out:
-    return rc;
-}
-
-static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
-{
-    /* cleanup device subkind-specific state in the libxl ctx */
-    STATE_AO_GC(cds->ao);
-
-    if (libxl__netbuffer_enabled(gc))
-        cleanup_subkind_nic(cds);
-
-    cleanup_subkind_drbd_disk(cds);
-}
-
 /*----- setup() and teardown() -----*/
 
 /* callbacks */
@@ -86,14 +54,10 @@ static void checkpoint_devices_setup(libxl__egc *egc,
 void libxl__checkpoint_devices_setup(libxl__egc *egc,
                                      libxl__checkpoint_devices_state *cds)
 {
-    int i, rc;
+    int i;
 
     STATE_AO_GC(cds->ao);
 
-    rc = init_device_subkind(cds);
-    if (rc)
-        goto out;
-
     cds->num_devices = 0;
     cds->num_nics = 0;
     cds->num_disks = 0;
@@ -126,7 +90,7 @@ void libxl__checkpoint_devices_setup(libxl__egc *egc,
     return;
 
 out:
-    cds->callback(egc, cds, rc);
+    cds->callback(egc, cds, 0);
 }
 
 static void checkpoint_devices_setup(libxl__egc *egc,
@@ -263,8 +227,6 @@ static void devices_teardown_cb(libxl__egc *egc,
     cds->disks = NULL;
     cds->num_disks = 0;
 
-    cleanup_device_subkind(cds);
-
     cds->callback(egc, cds, rc);
 }
 
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index e83cdc9..54ec7de 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -26,6 +26,38 @@ static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
     NULL,
 };
 
+/*----- helper functions -----*/
+
+static int init_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+    /* init device subkind-specific state in the libxl ctx */
+    int rc;
+    STATE_AO_GC(cds->ao);
+
+    if (libxl__netbuffer_enabled(gc)) {
+        rc = init_subkind_nic(cds);
+        if (rc) goto out;
+    }
+
+    rc = init_subkind_drbd_disk(cds);
+    if (rc) goto out;
+
+    rc = 0;
+out:
+    return rc;
+}
+
+static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+    /* cleanup device subkind-specific state in the libxl ctx */
+    STATE_AO_GC(cds->ao);
+
+    if (libxl__netbuffer_enabled(gc))
+        cleanup_subkind_nic(cds);
+
+    cleanup_subkind_drbd_disk(cds);
+}
+
 /*-------------------- Remus setup and teardown ---------------------*/
 
 static void remus_setup_done(libxl__egc *egc,
@@ -68,6 +100,12 @@ void libxl__remus_setup(libxl__egc *egc, libxl__remus_state *rs)
     cds->concrete_data = rs;
     rs->interval = info->interval;
 
+    if (init_device_subkind(cds)) {
+        LOG(ERROR, "Remus: failed to init device subkind for guest %u",
+            dss->domid);
+        goto out;
+    }
+
     dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
 
     callbacks->suspend = libxl__remus_domain_suspend_callback;
@@ -108,6 +146,8 @@ static void remus_setup_failed(libxl__egc *egc,
         LOG(ERROR, "Remus: failed to teardown device after setup failed"
             " for guest with domid %u, rc %d", dss->domid, rc);
 
+    cleanup_device_subkind(cds);
+
     dss->callback(egc, dss, rc);
 }
 
@@ -142,6 +182,8 @@ static void remus_teardown_done(libxl__egc *egc,
         LOG(ERROR, "Remus: failed to teardown device for guest with domid %u,"
             " rc %d", dss->domid, rc);
 
+    cleanup_device_subkind(cds);
+
     dss->callback(egc, dss, rc);
 }
 
-- 
2.5.0
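
Editorial note on the ordering this patch establishes: subkind init runs
before the generic device setup, and subkind cleanup runs after the generic
device teardown, including on the setup-failure path. The sketch below
records that call order in a trace string. Every name in it is a
hypothetical stand-in, and it is synchronous, whereas the real libxl code
is driven by asynchronous callbacks (remus_setup_failed,
remus_teardown_done).

```c
#include <string.h>

/* Hypothetical trace buffer, used only to record the call order. */
struct trace { char buf[128]; };

static void note(struct trace *t, const char *step)
{
    if (t->buf[0])
        strncat(t->buf, ",", sizeof(t->buf) - strlen(t->buf) - 1);
    strncat(t->buf, step, sizeof(t->buf) - strlen(t->buf) - 1);
}

/* Stand-ins for the generic checkpoint device layer. */
static int devices_setup(struct trace *t, int fail)
{
    note(t, "setup");
    return fail ? -1 : 0;
}

static void devices_teardown(struct trace *t) { note(t, "teardown"); }

/* Stand-ins for the Remus-specific subkind handling moved by the patch. */
static int init_subkind(struct trace *t) { note(t, "init"); return 0; }
static void cleanup_subkind(struct trace *t) { note(t, "cleanup"); }

/* Setup: subkind init first, then the generic layer. */
static int remus_setup_sketch(struct trace *t, int setup_fails)
{
    int rc = init_subkind(t);
    if (rc) return rc;

    rc = devices_setup(t, setup_fails);
    if (rc) {
        /* Failure path: tear devices down, then release subkind state. */
        devices_teardown(t);
        cleanup_subkind(t);
    }
    return rc;
}

/* Teardown: generic layer first, subkind cleanup last. */
static void remus_teardown_sketch(struct trace *t)
{
    devices_teardown(t);
    cleanup_subkind(t);
}
```

With this split the generic layer no longer needs to know about nic or
drbd state at all, which is the point of the refactoring.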

* Re: [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests
  2016-02-18  2:43 ` [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests Wen Congyang
@ 2016-02-18 12:13   ` Wei Liu
  2016-02-19 14:15     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 26+ messages in thread
From: Wei Liu @ 2016-02-18 12:13 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Thu, Feb 18, 2016 at 10:43:15AM +0800, Wen Congyang wrote:
> Before this patch:
> 1. suspend
> a. PVHVM and PV: we use the same way to suspend the guest (send the suspend
>    request to the guest). If the guest doesn't support evtchn, the xenstore
>    variant will be used, suspending the guest via XenBus control node.
> b. pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to suspend
>    the guest
> 
> 2. Resume:
> a. fast path(fast=1)
>    Do not change the guest state. We call libxl__domain_resume(..., 1) which
>    calls xc_domain_resume(..., 1 /* fast=1 */) to resume the guest.
>    PV:       modify the return code to 1, and then call the domctl:
>              XEN_DOMCTL_resumedomain
>    PVHVM:    same as PV
>    pure HVM: do nothing in modify_returncode, and then call the domctl:
>              XEN_DOMCTL_resumedomain
> b. slow
>    Used when the guest's state has been changed. Will call
>    libxl__domain_resume(..., 0) to resume the guest.
>    PV:       update start info, and reset all secondary CPU states. Then call
>              the domctl: XEN_DOMCTL_resumedomain
>    PVHVM:    cannot be resumed. You will get the following error message:
>                  "Cannot resume uncooperative HVM guests"
>    pure HVM: same as PVHVM
> 
> After this patch:
> 1. suspend
>    unchanged
> 
> 2. Resume
> a. fast path:
>    unchanged
> b. slow
>    PV:       unchanged
>    PVHVM:    call XEN_DOMCTL_resumedomain to resume the guest. Because we
>              don't modify the return code, the PV driver will disconnect
>              and reconnect.
>              The guest ends up doing the XENMAPSPACE_shared_info
>              XENMEM_add_to_physmap hypercall and resetting all of its CPU
>              states to point to the shared_info(well except the ones past 32).
>              That is the Linux kernel does that - regardless whether the
>              SCHEDOP_shutdown:SHUTDOWN_suspend returns 1 or not.
>    Pure HVM: call XEN_DOMCTL_resumedomain to resume the guest.
> 
> Under COLO, we will update the guest's state (modify memory, CPU registers,
> device status...). In this case, we cannot use the fast path to resume it.
> Keep the return code 0, and use the slow path to resume the guest. Resuming
> an HVM guest via the slow path is not supported currently, so this patch
> makes that resume call succeed instead of failing.
> 
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

I proposed an alternative commit log in a previous reply:

===
Use XEN_DOMCTL_resumedomain to resume (PV)HVM guest in slow path

Previously it was not possible to resume PVHVM or pure HVM guest in slow
path because libxc didn't support that.

Using XEN_DOMCTL_resumedomain without modifying guest return code  to resume a
guest is considered to be always safe.  Introduce a function to do that for
(PV)HVM guests in slow path resume.

This patch fixes a bug that denies (PV)HVM slow path resume.  This will
enable COLO to work properly:  COLO requires HVM guest to start in the
new context that has been set up by COLO, hence slow path resume is
required.
===

Note that I fixed one place in this version, changing "guest state" to
"guest return code" in the second paragraph. That sentence is a big
assumption, and I don't know whether it is true or not: I
reverse-engineered it from the comment before xc_domain_resume and from
what Linux does.

But the more I think about it, the less sure I am that I'm writing the
right thing. I also can't judge what the right behaviour is on the Linux
side.

Konrad, can you fact-check the commit message a bit? And maybe you can
help answer the following questions?

1. If we use fast=0 on PVHVM guest, will it work?
2. If we use fast=0 on HVM guest, will it work?

What is worse, when I say "work" I actually have no clear definition of
it. There doesn't seem to be a defined state that the guest needs to be in.

Wei.
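
Editorial note: the before/after behaviour listed in the quoted commit
message can be condensed into a small decision function. This is only a
sketch of the table as described in this thread. The enum, the flag names
and the "patched" parameter are invented for illustration, and none of
this is the actual libxc code.

```c
/* Guest kinds as distinguished in the commit message. */
enum guest_kind { GUEST_PV, GUEST_PVHVM, GUEST_PURE_HVM };

/* Hypothetical step flags; names are assumptions made for this sketch. */
enum {
    STEP_SET_RETURN_CODE_1 = 1 << 0, /* modify suspend return code to 1   */
    STEP_REBUILD_PV_STATE  = 1 << 1, /* update start info, reset CPUs     */
    STEP_RESUMEDOMAIN      = 1 << 2, /* issue XEN_DOMCTL_resumedomain     */
    STEP_FAIL              = 1 << 3  /* "Cannot resume uncooperative ..." */
};

/*
 * Returns the steps the toolstack takes for a resume request.
 * patched == 0 models the behaviour before patch 05, patched != 0 after it.
 */
static unsigned resume_steps(enum guest_kind kind, int fast, int patched)
{
    if (fast) {
        /* Fast path is unchanged by the patch. */
        unsigned steps = STEP_RESUMEDOMAIN;
        if (kind != GUEST_PURE_HVM)
            steps |= STEP_SET_RETURN_CODE_1;
        return steps;
    }

    /* Slow path for PV is unchanged by the patch. */
    if (kind == GUEST_PV)
        return STEP_REBUILD_PV_STATE | STEP_RESUMEDOMAIN;

    /* Slow path for PVHVM and pure HVM: failed before, plain resume after. */
    return patched ? STEP_RESUMEDOMAIN : STEP_FAIL;
}
```

The only cells that differ between patched == 0 and patched != 0 are the
slow-path entries for PVHVM and pure HVM, which is the whole of the change
under discussion.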

* Re: [PATCH v8 01/13] libxl/remus: init checkpoint callback in Remus setup callback
  2016-02-18  2:43 ` [PATCH v8 01/13] libxl/remus: init checkpoint callback in Remus setup callback Wen Congyang
@ 2016-02-18 12:30   ` Wei Liu
  0 siblings, 0 replies; 26+ messages in thread
From: Wei Liu @ 2016-02-18 12:30 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Thu, Feb 18, 2016 at 10:43:11AM +0800, Wen Congyang wrote:
> Init stream {read/write} state checkpoint_callback, suspend/resume/checkpoint
> callback in Remus setup callback.
> There's no functional change, it's just refactoring so that we can move
> all remus code into one file.
> 
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

* Re: [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests
  2016-02-18 12:13   ` Wei Liu
@ 2016-02-19 14:15     ` Konrad Rzeszutek Wilk
  2016-02-19 14:43       ` Wei Liu
  0 siblings, 1 reply; 26+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-02-19 14:15 UTC (permalink / raw)
  To: Wei Liu
  Cc: Lars Kurth, Changlong Xie, Ian Campbell, Wen Congyang,
	Andrew Cooper, Jiang Yunhong, Dong Eddie, xen devel,
	Gui Jianfeng, Shriram Rajagopalan, Ian Jackson, Yang Hongyang

On Thu, Feb 18, 2016 at 12:13:36PM +0000, Wei Liu wrote:
> On Thu, Feb 18, 2016 at 10:43:15AM +0800, Wen Congyang wrote:
> > Before this patch:
> > 1. suspend
> > a. PVHVM and PV: we use the same way to suspend the guest (send the suspend
> >    request to the guest). If the guest doesn't support evtchn, the xenstore
> >    variant will be used, suspending the guest via XenBus control node.
> > b. pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to suspend
> >    the guest
> > 
> > 2. Resume:
> > a. fast path(fast=1)
> >    Do not change the guest state. We call libxl__domain_resume(..., 1) which
> >    calls xc_domain_resume(..., 1 /* fast=1 */) to resume the guest.
> >    PV:       modify the return code to 1, and then call the domctl:
> >              XEN_DOMCTL_resumedomain
> >    PVHVM:    same as PV
> >    pure HVM: do nothing in modify_returncode, and then call the domctl:
> >              XEN_DOMCTL_resumedomain
> > b. slow
> >    Used when the guest's state has been changed. Will call
> >    libxl__domain_resume(..., 0) to resume the guest.
> >    PV:       update start info, and reset all secondary CPU states. Then call
> >              the domctl: XEN_DOMCTL_resumedomain
> >    PVHVM:    cannot be resumed. You will get the following error message:
> >                  "Cannot resume uncooperative HVM guests"
> >    pure HVM: same as PVHVM
> > 
> > After this patch:
> > 1. suspend
> >    unchanged
> > 
> > 2. Resume
> > a. fast path:
> >    unchanged
> > b. slow
> >    PV:       unchanged
> >    PVHVM:    call XEN_DOMCTL_resumedomain to resume the guest. Because we
> >              don't modify the return code, the PV driver will disconnect
> >              and reconnect.
> >              The guest ends up doing the XENMAPSPACE_shared_info
> >              XENMEM_add_to_physmap hypercall and resetting all of its CPU
> >              states to point to the shared_info(well except the ones past 32).
> >              That is the Linux kernel does that - regardless whether the
> >              SCHEDOP_shutdown:SHUTDOWN_suspend returns 1 or not.
> >    Pure HVM: call XEN_DOMCTL_resumedomain to resume the guest.
> > 
> > Under COLO, we will update the guest's state (modify memory, CPU registers,
> > device status...). In this case, we cannot use the fast path to resume it.
> > Keep the return code 0, and use the slow path to resume the guest. Resuming
> > an HVM guest via the slow path is not supported currently, so this patch
> > makes that resume call succeed instead of failing.
> > 
> > Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> > Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> > Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> 
> I proposed an alternative commit log in a previous reply:
> 
> ===
> Use XEN_DOMCTL_resumedomain to resume (PV)HVM guest in slow path
> 
> Previously it was not possible to resume PVHVM or pure HVM guest in slow
> path because libxc didn't support that.
> 
> Using XEN_DOMCTL_resumedomain without modifying guest return code  to resume a
> guest is considered to be always safe.  Introduce a function to do that for
> (PV)HVM guests in slow path resume.
> 
> This patch fixes a bug that denies (PV)HVM slow path resume.  This will
> enable COLO to work properly:  COLO requires HVM guest to start in the
> new context that has been set up by COLO, hence slow path resume is
> required.
> ===
> 
> Note that I fix one place in this version from "guest state" to "guest
> return code" in the second paragraph. And that sentence is a big big
> assumption that I don't know whether it is true or not --
> reverse-engineer from comment before xc_domain_resume and what Linux
> does.
> 
> But the more I think the more I'm not sure if I'm writing the right
> thing. I also can't judge what is the right behaviour on the Linux side.
> 
> Konrad, can you fact-check the commit message a bit? And maybe you can
> help answer the following questions?
> 
> 1. If we use fast=0 on PVHVM guest, will it work?

Yes.
> 2. If we use fast=0 on HVM guest, will it work?

Yes.

> 
> What is worse, when I say "work" I actually have no clear definition of
> it. There doesn't seem to be a defined state that the guest needs to be.

For PVHVM guests, fast = 0 requires that the guest makes a hypercall
to SCHEDOP_shutdown(SHUTDOWN_suspend). After the hypercall has
completed (so Xen has suspended the guest and later resumed it), it
is the guest's responsibility to set up the Xen infrastructure again,
as in retrieve the shared_info (XENMAPSPACE_shared_info), set up XenBus,
etc.

For HVM guests, fast = 0 suspends the guest without the guest making
any hypercalls. It is in effect the hypervisor injecting an S3 suspend.
Afterwards the guest is resumed and continues as usual. There are no PV
drivers, hence no need to re-establish the Xen PV infrastructure.

Hope this helps.
> 
> Wei.

* Re: [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests
  2016-02-19 14:15     ` Konrad Rzeszutek Wilk
@ 2016-02-19 14:43       ` Wei Liu
  2016-02-19 14:52         ` Ian Campbell
  0 siblings, 1 reply; 26+ messages in thread
From: Wei Liu @ 2016-02-19 14:43 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Wen Congyang,
	Andrew Cooper, Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie,
	Gui Jianfeng, Shriram Rajagopalan, Yang Hongyang

On Fri, Feb 19, 2016 at 09:15:38AM -0500, Konrad Rzeszutek Wilk wrote:
> On Thu, Feb 18, 2016 at 12:13:36PM +0000, Wei Liu wrote:
> > On Thu, Feb 18, 2016 at 10:43:15AM +0800, Wen Congyang wrote:
> > > Before this patch:
> > > 1. suspend
> > > a. PVHVM and PV: we use the same way to suspend the guest (send the suspend
> > >    request to the guest). If the guest doesn't support evtchn, the xenstore
> > >    variant will be used, suspending the guest via XenBus control node.
> > > b. pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to suspend
> > >    the guest
> > > 
> > > 2. Resume:
> > > a. fast path(fast=1)
> > >    Do not change the guest state. We call libxl__domain_resume(..., 1) which
> > >    calls xc_domain_resume(..., 1 /* fast=1 */) to resume the guest.
> > >    PV:       modify the return code to 1, and then call the domctl:
> > >              XEN_DOMCTL_resumedomain
> > >    PVHVM:    same as PV
> > >    pure HVM: do nothing in modify_returncode, and then call the domctl:
> > >              XEN_DOMCTL_resumedomain
> > > b. slow
> > >    Used when the guest's state has been changed. Will call
> > >    libxl__domain_resume(..., 0) to resume the guest.
> > >    PV:       update start info, and reset all secondary CPU states. Then call
> > >              the domctl: XEN_DOMCTL_resumedomain
> > >    PVHVM:    cannot be resumed. You will get the following error message:
> > >                  "Cannot resume uncooperative HVM guests"
> > >    pure HVM: same as PVHVM
> > > 
> > > After this patch:
> > > 1. suspend
> > >    unchanged
> > > 
> > > 2. Resume
> > > a. fast path:
> > >    unchanged
> > > b. slow
> > >    PV:       unchanged
> > >    PVHVM:    call XEN_DOMCTL_resumedomain to resume the guest. Because we
> > >              don't modify the return code, the PV driver will disconnect
> > >              and reconnect.
> > >              The guest ends up doing the XENMAPSPACE_shared_info
> > >              XENMEM_add_to_physmap hypercall and resetting all of its CPU
> > >              states to point to the shared_info(well except the ones past 32).
> > >              That is the Linux kernel does that - regardless whether the
> > >              SCHEDOP_shutdown:SHUTDOWN_suspend returns 1 or not.
> > >    Pure HVM: call XEN_DOMCTL_resumedomain to resume the guest.
> > > 
> > > Under COLO, we will update the guest's state (modify memory, CPU registers,
> > > device status...). In this case, we cannot use the fast path to resume it.
> > > Keep the return code 0, and use the slow path to resume the guest. Resuming
> > > an HVM guest via the slow path is not supported currently, so this patch
> > > makes that resume call succeed instead of failing.
> > > 
> > > Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> > > Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> > > Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > 
> > I proposed an alternative commit log in a previous reply:
> > 
> > ===
> > Use XEN_DOMCTL_resumedomain to resume (PV)HVM guest in slow path
> > 
> > Previously it was not possible to resume PVHVM or pure HVM guest in slow
> > path because libxc didn't support that.
> > 
> > Using XEN_DOMCTL_resumedomain without modifying guest return code  to resume a
> > guest is considered to be always safe.  Introduce a function to do that for
> > (PV)HVM guests in slow path resume.
> > 
> > This patch fixes a bug that denies (PV)HVM slow path resume.  This will
> > enable COLO to work properly:  COLO requires HVM guest to start in the
> > new context that has been set up by COLO, hence slow path resume is
> > required.
> > ===
> > 
> > Note that I fix one place in this version from "guest state" to "guest
> > return code" in the second paragraph. And that sentence is a big big
> > assumption that I don't know whether it is true or not --
> > reverse-engineer from comment before xc_domain_resume and what Linux
> > does.
> > 
> > But the more I think the more I'm not sure if I'm writing the right
> > thing. I also can't judge what is the right behaviour on the Linux side.
> > 
> > Konrad, can you fact-check the commit message a bit? And maybe you can
> > help answer the following questions?
> > 
> > 1. If we use fast=0 on PVHVM guest, will it work?
> 
> Yes.
> > 2. If we use fast=0 on HVM guest, will it work?
> 
> Yes.
> 
> > 
> > What is worse, when I say "work" I actually have no clear definition of
> > it. There doesn't seem to be a defined state that the guest needs to be.
> 
> For PVHVM guests, fast = 0, requires that the guest makes an hypercall
> to  SCHEDOP_shutdown(SHUTDOWN_suspend). After the hypercall has
> completed (so Xen has suspended the guest then later resumed it), it
> would be the guest responsibility to setup Xen infrastructure. As in
> retrieve the shared_info (XENMAPSPACE_shared_info), setup XenBus, etc.
> 
> For HVM guests, fast = 0, suspends the guests without the guest making
> any hypercalls. It is in effect the hypervisor injecting an S3 suspend.
> Afterwards the guest is resumed and continues as usual. No PV drivers -
> hence no need to re-establish Xen PV infrastructure.
> 

Wait, isn't this function about resuming a guest? I'm confused because
you talk about HV injecting S3 suspend. I guess you wrote the wrong
thing?

My guess is below, from the perspective of resuming a guest:

  A PVHVM guest would have used SCHEDOP_shutdown(SHUTDOWN_suspend) to
  suspend. So when the toolstack uses fast=0, the guest resumes from the
  hypercall with the return code unmodified. The guest then sets up the
  Xen infrastructure again.

  An HVM guest would have used S3 suspend to suspend itself. So when the
  toolstack uses fast=0, the hypervisor injects an S3 resume and the
  guest just takes the normal path, like a real machine does.
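
To make the cases concrete, here is a rough C model of what happens on
each resume path after this patch, based only on the commit message
quoted above. The enum, struct and function names are made up for
illustration and are not libxc code; the real logic sits behind
xc_domain_resume().

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not part of libxc. */
enum guest_type { GUEST_PV, GUEST_PVHVM, GUEST_HVM };

struct resume_plan {
    bool modify_return_code;     /* make the suspend hypercall return 1 */
    bool update_start_info;      /* PV slow path: update start info and
                                    reset secondary CPU states */
    bool guest_reinits_pv_infra; /* guest sets up shared_info, XenBus and
                                    PV drivers again; the thread only
                                    spells this out for PVHVM */
};

/* Work done before XEN_DOMCTL_resumedomain is issued, per this thread. */
static struct resume_plan plan_resume(enum guest_type type, bool fast)
{
    struct resume_plan plan = { false, false, false };

    if (fast) {
        /* Fast path: PV and PVHVM see return code 1; pure HVM is
           left untouched. */
        plan.modify_return_code = (type != GUEST_HVM);
        return plan;
    }

    /* Slow path: the return code stays 0. */
    if (type == GUEST_PV)
        plan.update_start_info = true;
    if (type == GUEST_PVHVM)
        plan.guest_reinits_pv_infra = true;
    /* Pure HVM: nothing extra; the guest resumes like a real machine. */
    return plan;
}
```

For example, plan_resume(GUEST_PVHVM, false) leaves the return code
alone and relies on the guest to reconnect, which is the combination
being questioned here.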

Does that make sense?

Wei.

> Hope this helps.
> > 
> > Wei.


* Re: [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests
  2016-02-19 14:43       ` Wei Liu
@ 2016-02-19 14:52         ` Ian Campbell
  2016-02-19 15:16           ` Wei Liu
  0 siblings, 1 reply; 26+ messages in thread
From: Ian Campbell @ 2016-02-19 14:52 UTC (permalink / raw)
  To: Wei Liu, Konrad Rzeszutek Wilk
  Cc: Lars Kurth, Changlong Xie, Dong Eddie, Wen Congyang,
	Andrew Cooper, Jiang Yunhong, Ian Jackson, xen devel,
	Gui Jianfeng, Shriram Rajagopalan, Yang Hongyang

On Fri, 2016-02-19 at 14:43 +0000, Wei Liu wrote:
> On Fri, Feb 19, 2016 at 09:15:38AM -0500, Konrad Rzeszutek Wilk wrote:
> > On Thu, Feb 18, 2016 at 12:13:36PM +0000, Wei Liu wrote:
> > > On Thu, Feb 18, 2016 at 10:43:15AM +0800, Wen Congyang wrote:
> > > > Before this patch:
> > > > 1. suspend
> > > > a. PVHVM and PV: we use the same way to suspend the guest (send the
> > > > suspend
> > > >    request to the guest). If the guest doesn't support evtchn, the
> > > > xenstore
> > > >    variant will be used, suspending the guest via XenBus control
> > > > node.
> > > > b. pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to
> > > > suspend
> > > >    the guest
> > > > 
> > > > 2. Resume:
> > > > a. fast path(fast=1)
> > > >    Do not change the guest state. We call libxl__domain_resume(..,
> > > > 1) which
> > > >    calls xc_domain_resume(..., 1 /* fast=1*/) to resume the guest.
> > > >    PV:       modify the return code to 1, and than call the domctl:
> > > >              XEN_DOMCTL_resumedomain
> > > >    PVHVM:    same with PV
> > > >    pure HVM: do nothing in modify_returncode, and than call the
> > > > domctl:
> > > >              XEN_DOMCTL_resumedomain
> > > > b. slow
> > > >    Used when the guest's state have been changed. Will call
> > > >    libxl__domain_resume(..., 0) to resume the guest.
> > > >    PV:       update start info, and reset all secondary CPU states.
> > > > Than call
> > > >              the domctl: XEN_DOMCTL_resumedomain
> > > >    PVHVM:    can not be resumed. You will get the following error
> > > > message:
> > > >                  "Cannot resume uncooperative HVM guests"
> > > >    pure HVM: same with PVHVM
> > > > 
> > > > After this patch:
> > > > 1. suspend
> > > >    unchanged
> > > > 
> > > > 2. Resume
> > > > a. fast path:
> > > >    unchanged
> > > > b. slow
> > > >    PV:       unchanged
> > > >    PVHVM:    call XEN_DOMCTL_resumedomain to resume the guest.
> > > > Because we
> > > >              don't modify the return code, the PV driver will
> > > > disconnect
> > > >              and reconnect.
> > > >              The guest ends up doing the XENMAPSPACE_shared_info
> > > >              XENMEM_add_to_physmap hypercall and resetting all of
> > > > its CPU
> > > >              states to point to the shared_info(well except the
> > > > ones past 32).
> > > >              That is the Linux kernel does that - regardless
> > > > whether the
> > > >              SCHEDOP_shutdown:SHUTDOWN_suspend returns 1 or not.
> > > >    Pure HVM: call XEN_DOMCTL_resumedomain to resume the guest.
> > > > 
> > > > Under COLO, we will update the guest's state(modify memory, cpu's
> > > > registers,
> > > > device status...). In this case, we cannot use the fast path to
> > > > resume it.
> > > > Keep the return code 0, and use a slow path to resume the guest.
> > > > While
> > > > resuming HVM using slow path is not supported currently, this patch
> > > > is to
> > > > make the resume call to not fail.
> > > > 
> > > > Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> > > > Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> > > > Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > > 
> > > I proposed an alternative commit log in a previous reply:
> > > 
> > > ===
> > > Use XEN_DOMCTL_resumedomain to resume (PV)HVM guest in slow path
> > > 
> > > Previously it was not possible to resume PVHVM or pure HVM guest in
> > > slow
> > > path because libxc didn't support that.
> > > 
> > > Using XEN_DOMCTL_resumedomain without modifying guest return code  to
> > > resume a
> > > guest is considered to be always safe.  Introduce a function to do
> > > that for
> > > (PV)HVM guests in slow path resume.
> > > 
> > > This patch fixes a bug that denies (PV)HVM slow path resume.  This
> > > will
> > > enable COLO to work properly:  COLO requires HVM guest to start in
> > > the
> > > new context that has been set up by COLO, hence slow path resume is
> > > required.
> > > ===
> > > 
> > > Note that I fix one place in this version from "guest state" to
> > > "guest
> > > return code" in the second paragraph. And that sentence is a big big
> > > assumption that I don't know whether it is true or not --
> > > reverse-engineer from comment before xc_domain_resume and what Linux
> > > does.
> > > 
> > > But the more I think the more I'm not sure if I'm writing the right
> > > thing. I also can't judge what is the right behaviour on the Linux
> > > side.
> > > 
> > > Konrad, can you fact-check the commit message a bit? And maybe you
> > > can
> > > help answer the following questions?
> > > 
> > > 1. If we use fast=0 on PVHVM guest, will it work?
> > 
> > Yes.
> > > 2. If we use fast=0 on HVM guest, will it work?
> > 
> > Yes.
> > 
> > > 
> > > What is worse, when I say "work" I actually have no clear definition
> > > of
> > > it. There doesn't seem to be a defined state that the guest needs to
> > > be.
> > 
> > For PVHVM guests, fast = 0, requires that the guest makes an hypercall
> > to  SCHEDOP_shutdown(SHUTDOWN_suspend). After the hypercall has
> > completed (so Xen has suspended the guest then later resumed it), it
> > would be the guest responsibility to setup Xen infrastructure. As in
> > retrieve the shared_info (XENMAPSPACE_shared_info), setup XenBus, etc.
> > 
> > For HVM guests, fast = 0, suspends the guests without the guest making
> > any hypercalls. It is in effect the hypervisor injecting an S3 suspend.
> > Afterwards the guest is resumed and continues as usual. No PV drivers -
> > hence no need to re-establish Xen PV infrastructure.
> > 
> 
> Wait, isn't this function about resuming a guest? I'm confused because
> you talk about HV injecting S3 suspend. I guess you wrote the wrong
> thing?
> 
> My guess is below, from the perspective of resuming a guest
> 
>   PVHVM guest would have used SCHEDOP_shutdown(SHUTDOWN_suspend) to
>   suspend. So when toolstack uses fast=0, the guest resumes from the
>   hypercall with return code unmodified. Guest then re-setup Xen
>   infrastructure.

Who or what has torn down the existing infrastructure from the guest's life
before the suspend in this case? AFAIR a guest expects to return
from SCHEDOP_shutdown(SHUTDOWN_suspend) with return code == 0 in a freshly
minted new domain, but in the resume case it is actually resuming in the
original domain, complete with any evtchns and grant table mappings etc.
still intact from before it slept.

Perhaps I'm misremembering and the guest is expected to deal with the
possibility of resources already being in place when it re-sets up the
infra?

Ian.




* Re: [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests
  2016-02-19 14:52         ` Ian Campbell
@ 2016-02-19 15:16           ` Wei Liu
  2016-02-19 16:20             ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 26+ messages in thread
From: Wei Liu @ 2016-02-19 15:16 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Dong Eddie, Wen Congyang,
	Andrew Cooper, Jiang Yunhong, Ian Jackson, xen devel,
	Gui Jianfeng, Shriram Rajagopalan, Yang Hongyang

On Fri, Feb 19, 2016 at 02:52:11PM +0000, Ian Campbell wrote:
> On Fri, 2016-02-19 at 14:43 +0000, Wei Liu wrote:
> > On Fri, Feb 19, 2016 at 09:15:38AM -0500, Konrad Rzeszutek Wilk wrote:
> > > On Thu, Feb 18, 2016 at 12:13:36PM +0000, Wei Liu wrote:
> > > > On Thu, Feb 18, 2016 at 10:43:15AM +0800, Wen Congyang wrote:
> > > > > Before this patch:
> > > > > 1. suspend
> > > > > a. PVHVM and PV: we use the same way to suspend the guest (send the
> > > > > suspend
> > > > >    request to the guest). If the guest doesn't support evtchn, the
> > > > > xenstore
> > > > >    variant will be used, suspending the guest via XenBus control
> > > > > node.
> > > > > b. pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to
> > > > > suspend
> > > > >    the guest
> > > > > 
> > > > > 2. Resume:
> > > > > a. fast path(fast=1)
> > > > >    Do not change the guest state. We call libxl__domain_resume(..,
> > > > > 1) which
> > > > >    calls xc_domain_resume(..., 1 /* fast=1*/) to resume the guest.
> > > > >    PV:       modify the return code to 1, and than call the domctl:
> > > > >              XEN_DOMCTL_resumedomain
> > > > >    PVHVM:    same with PV
> > > > >    pure HVM: do nothing in modify_returncode, and than call the
> > > > > domctl:
> > > > >              XEN_DOMCTL_resumedomain
> > > > > b. slow
> > > > >    Used when the guest's state have been changed. Will call
> > > > >    libxl__domain_resume(..., 0) to resume the guest.
> > > > >    PV:       update start info, and reset all secondary CPU states.
> > > > > Than call
> > > > >              the domctl: XEN_DOMCTL_resumedomain
> > > > >    PVHVM:    can not be resumed. You will get the following error
> > > > > message:
> > > > >                  "Cannot resume uncooperative HVM guests"
> > > > >    pure HVM: same with PVHVM
> > > > > 
> > > > > After this patch:
> > > > > 1. suspend
> > > > >    unchanged
> > > > > 
> > > > > 2. Resume
> > > > > a. fast path:
> > > > >    unchanged
> > > > > b. slow
> > > > >    PV:       unchanged
> > > > >    PVHVM:    call XEN_DOMCTL_resumedomain to resume the guest.
> > > > > Because we
> > > > >              don't modify the return code, the PV driver will
> > > > > disconnect
> > > > >              and reconnect.
> > > > >              The guest ends up doing the XENMAPSPACE_shared_info
> > > > >              XENMEM_add_to_physmap hypercall and resetting all of
> > > > > its CPU
> > > > >              states to point to the shared_info(well except the
> > > > > ones past 32).
> > > > >              That is the Linux kernel does that - regardless
> > > > > whether the
> > > > >              SCHEDOP_shutdown:SHUTDOWN_suspend returns 1 or not.
> > > > >    Pure HVM: call XEN_DOMCTL_resumedomain to resume the guest.
> > > > > 
> > > > > Under COLO, we will update the guest's state(modify memory, cpu's
> > > > > registers,
> > > > > device status...). In this case, we cannot use the fast path to
> > > > > resume it.
> > > > > Keep the return code 0, and use a slow path to resume the guest.
> > > > > While
> > > > > resuming HVM using slow path is not supported currently, this patch
> > > > > is to
> > > > > make the resume call to not fail.
> > > > > 
> > > > > Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> > > > > Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> > > > > Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > > > 
> > > > I proposed an alternative commit log in a previous reply:
> > > > 
> > > > ===
> > > > Use XEN_DOMCTL_resumedomain to resume (PV)HVM guest in slow path
> > > > 
> > > > Previously it was not possible to resume PVHVM or pure HVM guest in
> > > > slow
> > > > path because libxc didn't support that.
> > > > 
> > > > Using XEN_DOMCTL_resumedomain without modifying guest return code  to
> > > > resume a
> > > > guest is considered to be always safe.  Introduce a function to do
> > > > that for
> > > > (PV)HVM guests in slow path resume.
> > > > 
> > > > This patch fixes a bug that denies (PV)HVM slow path resume.  This
> > > > will
> > > > enable COLO to work properly:  COLO requires HVM guest to start in
> > > > the
> > > > new context that has been set up by COLO, hence slow path resume is
> > > > required.
> > > > ===
> > > > 
> > > > Note that I fix one place in this version from "guest state" to
> > > > "guest
> > > > return code" in the second paragraph. And that sentence is a big big
> > > > assumption that I don't know whether it is true or not --
> > > > reverse-engineer from comment before xc_domain_resume and what Linux
> > > > does.
> > > > 
> > > > But the more I think the more I'm not sure if I'm writing the right
> > > > thing. I also can't judge what is the right behaviour on the Linux
> > > > side.
> > > > 
> > > > Konrad, can you fact-check the commit message a bit? And maybe you
> > > > can
> > > > help answer the following questions?
> > > > 
> > > > 1. If we use fast=0 on PVHVM guest, will it work?
> > > 
> > > Yes.
> > > > 2. If we use fast=0 on HVM guest, will it work?
> > > 
> > > Yes.
> > > 
> > > > 
> > > > What is worse, when I say "work" I actually have no clear definition
> > > > of
> > > > it. There doesn't seem to be a defined state that the guest needs to
> > > > be.
> > > 
> > > For PVHVM guests, fast = 0, requires that the guest makes an hypercall
> > > to  SCHEDOP_shutdown(SHUTDOWN_suspend). After the hypercall has
> > > completed (so Xen has suspended the guest then later resumed it), it
> > > would be the guest responsibility to setup Xen infrastructure. As in
> > > retrieve the shared_info (XENMAPSPACE_shared_info), setup XenBus, etc.
> > > 
> > > For HVM guests, fast = 0, suspends the guests without the guest making
> > > any hypercalls. It is in effect the hypervisor injecting an S3 suspend.
> > > Afterwards the guest is resumed and continues as usual. No PV drivers -
> > > hence no need to re-establish Xen PV infrastructure.
> > > 
> > 
> > Wait, isn't this function about resuming a guest? I'm confused because
> > you talk about HV injecting S3 suspend. I guess you wrote the wrong
> > thing?
> > 
> > My guess is below, from the perspective of resuming a guest
> > 
> >   PVHVM guest would have used SCHEDOP_shutdown(SHUTDOWN_suspend) to
> >   suspend. So when toolstack uses fast=0, the guest resumes from the
> >   hypercall with return code unmodified. Guest then re-setup Xen
> >   infrastructure.
> 
> Who or what has torn down the existing infrastructure from the guest's life
> before the suspend in this case? AFAI Remember a guest expects to return
> from SCHEDOP_shutdown(SHUTDOWN_suspend) with return code == 0 in a freshly
> minted new domain, but in the resume case it is actually resuming in the
> original domain, complete with any evtchn's and grant tables mappings etc
> still intact from before it slept.
> 
> Perhaps I'm misremembering and the guest is expected to deal with the
> possibility of resources already being in place when it re-sets up the
> infra?
> 

Sigh, this is the sort of thing that gets on my nerves. I should try to
write something down when we come to a conclusion.  I would be happy to
have a definite answer about the expected behaviour of the guest.
Extrapolation is not very helpful in the face of so many different
versions of Linux and the BSDs.

But, if the confusion is only about PVHVM guest with fast=0, we can
forbid that specific combination for now. That should be enough to move
COLO forward.

Wei.

> Ian.
> 


* Re: [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests
  2016-02-19 15:16           ` Wei Liu
@ 2016-02-19 16:20             ` Konrad Rzeszutek Wilk
  2016-02-19 16:42               ` Wei Liu
  0 siblings, 1 reply; 26+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-02-19 16:20 UTC (permalink / raw)
  To: Wei Liu
  Cc: Lars Kurth, Changlong Xie, Ian Campbell, Wen Congyang,
	Andrew Cooper, Jiang Yunhong, Dong Eddie, xen devel,
	Gui Jianfeng, Shriram Rajagopalan, Ian Jackson, Yang Hongyang

On Fri, Feb 19, 2016 at 03:16:27PM +0000, Wei Liu wrote:
> On Fri, Feb 19, 2016 at 02:52:11PM +0000, Ian Campbell wrote:
> > On Fri, 2016-02-19 at 14:43 +0000, Wei Liu wrote:
> > > On Fri, Feb 19, 2016 at 09:15:38AM -0500, Konrad Rzeszutek Wilk wrote:
> > > > On Thu, Feb 18, 2016 at 12:13:36PM +0000, Wei Liu wrote:
> > > > > On Thu, Feb 18, 2016 at 10:43:15AM +0800, Wen Congyang wrote:
> > > > > > Before this patch:
> > > > > > 1. suspend
> > > > > > a. PVHVM and PV: we use the same way to suspend the guest (send the
> > > > > > suspend
> > > > > >    request to the guest). If the guest doesn't support evtchn, the
> > > > > > xenstore
> > > > > >    variant will be used, suspending the guest via XenBus control
> > > > > > node.
> > > > > > b. pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to
> > > > > > suspend
> > > > > >    the guest
> > > > > > 
> > > > > > 2. Resume:
> > > > > > a. fast path(fast=1)
> > > > > >    Do not change the guest state. We call libxl__domain_resume(..,
> > > > > > 1) which
> > > > > >    calls xc_domain_resume(..., 1 /* fast=1*/) to resume the guest.
> > > > > >    PV:       modify the return code to 1, and than call the domctl:
> > > > > >              XEN_DOMCTL_resumedomain
> > > > > >    PVHVM:    same with PV
> > > > > >    pure HVM: do nothing in modify_returncode, and than call the
> > > > > > domctl:
> > > > > >              XEN_DOMCTL_resumedomain
> > > > > > b. slow
> > > > > >    Used when the guest's state have been changed. Will call
> > > > > >    libxl__domain_resume(..., 0) to resume the guest.
> > > > > >    PV:       update start info, and reset all secondary CPU states.
> > > > > > Than call
> > > > > >              the domctl: XEN_DOMCTL_resumedomain
> > > > > >    PVHVM:    can not be resumed. You will get the following error
> > > > > > message:
> > > > > >                  "Cannot resume uncooperative HVM guests"
> > > > > >    pure HVM: same with PVHVM
> > > > > > 
> > > > > > After this patch:
> > > > > > 1. suspend
> > > > > >    unchanged
> > > > > > 
> > > > > > 2. Resume
> > > > > > a. fast path:
> > > > > >    unchanged
> > > > > > b. slow
> > > > > >    PV:       unchanged
> > > > > >    PVHVM:    call XEN_DOMCTL_resumedomain to resume the guest.
> > > > > > Because we
> > > > > >              don't modify the return code, the PV driver will
> > > > > > disconnect
> > > > > >              and reconnect.
> > > > > >              The guest ends up doing the XENMAPSPACE_shared_info
> > > > > >              XENMEM_add_to_physmap hypercall and resetting all of
> > > > > > its CPU
> > > > > >              states to point to the shared_info(well except the
> > > > > > ones past 32).
> > > > > >              That is the Linux kernel does that - regardless
> > > > > > whether the
> > > > > >              SCHEDOP_shutdown:SHUTDOWN_suspend returns 1 or not.
> > > > > >    Pure HVM: call XEN_DOMCTL_resumedomain to resume the guest.
> > > > > > 
> > > > > > Under COLO, we will update the guest's state(modify memory, cpu's
> > > > > > registers,
> > > > > > device status...). In this case, we cannot use the fast path to
> > > > > > resume it.
> > > > > > Keep the return code 0, and use a slow path to resume the guest.
> > > > > > While
> > > > > > resuming HVM using slow path is not supported currently, this patch
> > > > > > is to
> > > > > > make the resume call to not fail.
> > > > > > 
> > > > > > Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> > > > > > Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> > > > > > Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > > > > 
> > > > > I proposed an alternative commit log in a previous reply:
> > > > > 
> > > > > ===
> > > > > Use XEN_DOMCTL_resumedomain to resume (PV)HVM guest in slow path
> > > > > 
> > > > > Previously it was not possible to resume PVHVM or pure HVM guest in
> > > > > slow
> > > > > path because libxc didn't support that.
> > > > > 
> > > > > Using XEN_DOMCTL_resumedomain without modifying guest return code  to
> > > > > resume a
> > > > > guest is considered to be always safe.  Introduce a function to do
> > > > > that for
> > > > > (PV)HVM guests in slow path resume.
> > > > > 
> > > > > This patch fixes a bug that denies (PV)HVM slow path resume.  This
> > > > > will
> > > > > enable COLO to work properly:  COLO requires HVM guest to start in
> > > > > the
> > > > > new context that has been set up by COLO, hence slow path resume is
> > > > > required.
> > > > > ===
> > > > > 
> > > > > Note that I fix one place in this version from "guest state" to
> > > > > "guest
> > > > > return code" in the second paragraph. And that sentence is a big big
> > > > > assumption that I don't know whether it is true or not --
> > > > > reverse-engineer from comment before xc_domain_resume and what Linux
> > > > > does.
> > > > > 
> > > > > But the more I think the more I'm not sure if I'm writing the right
> > > > > thing. I also can't judge what is the right behaviour on the Linux
> > > > > side.
> > > > > 
> > > > > Konrad, can you fact-check the commit message a bit? And maybe you
> > > > > can
> > > > > help answer the following questions?
> > > > > 
> > > > > 1. If we use fast=0 on PVHVM guest, will it work?
> > > > 
> > > > Yes.
> > > > > 2. If we use fast=0 on HVM guest, will it work?
> > > > 
> > > > Yes.
> > > > 
> > > > > 
> > > > > What is worse, when I say "work" I actually have no clear definition
> > > > > of
> > > > > it. There doesn't seem to be a defined state that the guest needs to
> > > > > be.
> > > > 
> > > > For PVHVM guests, fast = 0, requires that the guest makes an hypercall
> > > > to  SCHEDOP_shutdown(SHUTDOWN_suspend). After the hypercall has
> > > > completed (so Xen has suspended the guest then later resumed it), it
> > > > would be the guest responsibility to setup Xen infrastructure. As in
> > > > retrieve the shared_info (XENMAPSPACE_shared_info), setup XenBus, etc.
> > > > 
> > > > For HVM guests, fast = 0, suspends the guests without the guest making
> > > > any hypercalls. It is in effect the hypervisor injecting an S3 suspend.
> > > > Afterwards the guest is resumed and continues as usual. No PV drivers -
> > > > hence no need to re-establish Xen PV infrastructure.
> > > > 
> > > 
> > > Wait, isn't this function about resuming a guest? I'm confused because
> > > you talk about HV injecting S3 suspend. I guess you wrote the wrong
> > > thing?

I was writing the whole chain - suspend, and then resume. This patch is
about resume - but to get to resume you need to suspend first.

> > > 
> > > My guess is below, from the perspective of resuming a guest
> > > 
> > >   PVHVM guest would have used SCHEDOP_shutdown(SHUTDOWN_suspend) to
> > >   suspend. So when toolstack uses fast=0, the guest resumes from the
> > >   hypercall with return code unmodified. Guest then re-setup Xen
> > >   infrastructure.
> > 
> > Who or what has torn down the existing infrastructure from the guest's life
> > before the suspend in this case? AFAI Remember a guest expects to return

The guest. Or it can ignore it and just re-init all its settings.

> > from SCHEDOP_shutdown(SHUTDOWN_suspend) with return code == 0 in a freshly
> > minted new domain, but in the resume case it is actually resuming in the
> > original domain, complete with any evtchn's and grant tables mappings etc
> > still intact from before it slept.
> > 
> > Perhaps I'm misremembering and the guest is expected to deal with the
> > possibility of resources already being in place when it re-sets up the
> > infra?

Correct - albeit all of them are stale. Though on some off-chance they may
be set correctly.
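
As a rough sketch of what "dealing with it" could look like on the guest
side: the names below are hypothetical and not taken from Linux or any
BSD. It also simplifies, since (as the commit message notes) Linux redoes
the shared_info mapping regardless of the return code.

```c
#include <stdbool.h>

/* Hypothetical guest-side bookkeeping, for illustration only. */
struct guest_xen_state {
    bool shared_info_mapped;
    bool xenbus_connected;
    unsigned int reinit_count;
};

/*
 * suspend_rc models what the guest sees when
 * SCHEDOP_shutdown(SHUTDOWN_suspend) returns: 1 after a fast resume
 * (the toolstack modified the return code), 0 after a slow resume.
 */
static void guest_after_suspend(struct guest_xen_state *s, int suspend_rc)
{
    if (suspend_rc == 1)
        return;                     /* fast path: keep existing state */

    /* Slow path: treat whatever is still in place as stale ... */
    s->shared_info_mapped = false;
    s->xenbus_connected = false;

    /* ... and set everything up again from scratch. */
    s->shared_info_mapped = true;   /* stands in for XENMAPSPACE_shared_info */
    s->xenbus_connected = true;     /* stands in for XenBus setup */
    s->reinit_count++;
}
```

The point of the sketch is only that the slow path never trusts leftover
evtchn or grant state, so it does not matter whether the domain is new or
the original one.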

> > 
> 
> Sigh, this is that sort of things that get to my nerves. I should try to
> write something down when we come to a conclusion.  I would be happy to
> have any definite answer to the expected behaviour of guest.
> Extrapolation is not very helpful in the face of some many different
> versions of Linux'es and BSDs.
> 
> But, if the confusion is only about PVHVM guest with fast=0, we can
> forbid that specific combination for now. That should be enough to move
> COLO forward.

.. forbid what? PVHVM resuming with fast=0? Why?  Because the guest may
fall on its face?
> 
> Wei.
> 
> > Ian.
> > 


* Re: [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests
  2016-02-19 16:20             ` Konrad Rzeszutek Wilk
@ 2016-02-19 16:42               ` Wei Liu
  2016-02-19 17:16                 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 26+ messages in thread
From: Wei Liu @ 2016-02-19 16:42 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Wen Congyang,
	Andrew Cooper, Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie,
	Gui Jianfeng, Shriram Rajagopalan, Yang Hongyang

On Fri, Feb 19, 2016 at 11:20:08AM -0500, Konrad Rzeszutek Wilk wrote:
[...]
> > > > > > ===
> > > > > > 
> > > > > > Note that I fix one place in this version from "guest state" to
> > > > > > "guest
> > > > > > return code" in the second paragraph. And that sentence is a big big
> > > > > > assumption that I don't know whether it is true or not --
> > > > > > reverse-engineer from comment before xc_domain_resume and what Linux
> > > > > > does.
> > > > > > 
> > > > > > But the more I think the more I'm not sure if I'm writing the right
> > > > > > thing. I also can't judge what is the right behaviour on the Linux
> > > > > > side.
> > > > > > 
> > > > > > Konrad, can you fact-check the commit message a bit? And maybe you
> > > > > > can
> > > > > > help answer the following questions?
> > > > > > 
> > > > > > 1. If we use fast=0 on PVHVM guest, will it work?
> > > > > 
> > > > > Yes.
> > > > > > 2. If we use fast=0 on HVM guest, will it work?
> > > > > 
> > > > > Yes.
> > > > > 
> > > > > > 
> > > > > > What is worse, when I say "work" I actually have no clear definition
> > > > > > of
> > > > > > it. There doesn't seem to be a defined state that the guest needs to
> > > > > > be.
> > > > > 
> > > > > For PVHVM guests, fast = 0, requires that the guest makes an hypercall
> > > > > to  SCHEDOP_shutdown(SHUTDOWN_suspend). After the hypercall has
> > > > > completed (so Xen has suspended the guest then later resumed it), it
> > > > > would be the guest responsibility to setup Xen infrastructure. As in
> > > > > retrieve the shared_info (XENMAPSPACE_shared_info), setup XenBus, etc.
> > > > > 
> > > > > For HVM guests, fast = 0, suspends the guests without the guest making
> > > > > any hypercalls. It is in effect the hypervisor injecting an S3 suspend.
> > > > > Afterwards the guest is resumed and continues as usual. No PV drivers -
> > > > > hence no need to re-establish Xen PV infrastructure.
> > > > > 
> > > > 
> > > > Wait, isn't this function about resuming a guest? I'm confused because
> > > > you talk about HV injecting S3 suspend. I guess you wrote the wrong
> > > > thing?
> 
> I was writing the whole chain - suspend, and then resume. This patch is
> about resume - but to get to resume you need to suspend first.
> 

Yes, of course. I was thinking more about writing it down as a comment for
xc_domain_resume, so I wrote something from the perspective of resuming.

If you don't disagree with my extrapolation in the previous email, we don't
need to quibble about the wording anymore.

> > > > 
> > > > My guess is below, from the perspective of resuming a guest
> > > > 
> > > >   PVHVM guest would have used SCHEDOP_shutdown(SHUTDOWN_suspend) to
> > > >   suspend. So when toolstack uses fast=0, the guest resumes from the
> > > >   hypercall with return code unmodified. Guest then re-setup Xen
> > > >   infrastructure.
> > > 
> > > Who or what has torn down the existing infrastructure from the guest's life
> > > before the suspend in this case? AFAI Remember a guest expects to return
> 
> The guest. Or it can ignore it and and just re-init all its settings.
> 
> > > from SCHEDOP_shutdown(SHUTDOWN_suspend) with return code == 0 in a freshly
> > > minted new domain, but in the resume case it is actually resuming in the
> > > original domain, complete with any evtchn's and grant tables mappings etc
> > > still intact from before it slept.
> > > 
> > > Perhaps I'm misremembering and the guest is expected to deal with the
> > > possibility of resources already being in place when it re-sets up the
> > > infra?
> 
> Correct - albeit all of them are stale. Thought on some off-chance they may
> be set correctly.
> 
> > > 
> > 
> > Sigh, this is that sort of things that get to my nerves. I should try to
> > write something down when we come to a conclusion.  I would be happy to
> > have any definite answer to the expected behaviour of guest.
> > Extrapolation is not very helpful in the face of some many different
> > versions of Linux'es and BSDs.
> > 
> > But, if the confusion is only about PVHVM guest with fast=0, we can
> > forbid that specific combination for now. That should be enough to move
> > COLO forward.
> 
> .. forbid what? PVHVM resuming with fast=0? Why?  Because the guest may
> fall on its face?

Yes, forbid resuming PVHVM with fast=0 if we have no clear definition of
how it works. It's not because the guest would fall, it's because we can't
tell which side (the guest or the toolstack) is buggy when the guest
falls.

But it looks like we (you ;-) ) have a clear idea of how it works, we
(you) just need to write it down.

Wei.

> > 
> > Wei.
> > 
> > > Ian.
> > > 


* Re: [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests
  2016-02-19 16:42               ` Wei Liu
@ 2016-02-19 17:16                 ` Konrad Rzeszutek Wilk
  2016-02-19 17:21                   ` Wei Liu
  0 siblings, 1 reply; 26+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-02-19 17:16 UTC (permalink / raw)
  To: Wei Liu
  Cc: Lars Kurth, Changlong Xie, Ian Campbell, Wen Congyang,
	Andrew Cooper, Jiang Yunhong, Dong Eddie, xen devel,
	Gui Jianfeng, Shriram Rajagopalan, Ian Jackson, Yang Hongyang

> > .. forbid what? PVHVM resuming with fast=0? Why?  Because the guest may
> > fall on its face?
> 
> Yes, forbid resuming PVHVM with fast=0 if we have no clear definition of
> how it works. It's not because guest would fall, it's because we can't
> tell which side (the guest or the toolstack) is buggy when the guest
> falls.
> 
> But it looks like we (you ;-) ) have clear idea of how it works, we
> (you) just need to write it down.


Where? The header file where SHUTDOWN_suspend is introduced?

Or the libxc ones?


* Re: [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests
  2016-02-19 17:16                 ` Konrad Rzeszutek Wilk
@ 2016-02-19 17:21                   ` Wei Liu
  0 siblings, 0 replies; 26+ messages in thread
From: Wei Liu @ 2016-02-19 17:21 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Wen Congyang,
	Andrew Cooper, Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie,
	Gui Jianfeng, Shriram Rajagopalan, Yang Hongyang

On Fri, Feb 19, 2016 at 12:16:31PM -0500, Konrad Rzeszutek Wilk wrote:
> > > .. forbid what? PVHVM resuming with fast=0? Why?  Because the guest may
> > > fall on its face?
> > 
> > Yes, forbid resuming PVHVM with fast=0 if we have no clear definition of
> > how it works. It's not because the guest would fall; it's because we
> > can't tell which side (the guest or the toolstack) is buggy when the
> > guest falls.
> > 
> > But it looks like we (you ;-) ) have a clear idea of how it works; we
> > (you) just need to write it down.
> 
> 
> Where? The header file where SHUTDOWN_suspend is introduced?
> 
> Or the libxc ones?

I have no opinion on whether the Xen public header should contain such
text, but I do wish to have better documentation for xc_domain_resume.
Basically, it is just turning what you wrote in this thread into a
comment for xc_domain_resume.

Wei.


* Re: [PATCH v8 00/13] Prerequisite patches for COLO
  2016-02-18  2:43 [PATCH v8 00/13] Prerequisite patches for COLO Wen Congyang
                   ` (12 preceding siblings ...)
  2016-02-18  2:43 ` [PATCH v8 13/13] tools/libxl: seperate device init/cleanup from checkpoint device layer Wen Congyang
@ 2016-02-26 15:54 ` Wei Liu
  2016-02-26 18:16   ` Konrad Rzeszutek Wilk
  13 siblings, 1 reply; 26+ messages in thread
From: Wei Liu @ 2016-02-26 15:54 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

I've prepared a branch for this series (sans patch #5) and run a compile
test on every commit.

Please pull from

git://xenbits.xen.org/people/liuw/xen.git colo-prep-v8-base..colo-prep-v8

Patch #5 needs a bit more work. Wen, note that you don't need to resubmit
#5 just yet. We will come back to it next week.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* Re: [PATCH v8 00/13] Prerequisite patches for COLO
  2016-02-26 15:54 ` [PATCH v8 00/13] Prerequisite patches for COLO Wei Liu
@ 2016-02-26 18:16   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 26+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-02-26 18:16 UTC (permalink / raw)
  To: Wei Liu
  Cc: Lars Kurth, Changlong Xie, Ian Campbell, Wen Congyang,
	Andrew Cooper, Jiang Yunhong, Dong Eddie, xen devel,
	Gui Jianfeng, Shriram Rajagopalan, Ian Jackson, Yang Hongyang

On Fri, Feb 26, 2016 at 03:54:46PM +0000, Wei Liu wrote:
> I've prepared a branch for this series (sans patch #5) and run a compile
> test on every commit.
> 
> Please pull from
> 
> git://xenbits.xen.org/people/liuw/xen.git colo-prep-v8-base..colo-prep-v8

I've applied:

Wen Congyang (12):
      libxl/remus: init checkpoint callback in Remus setup callback
      tools/libxl: move remus code into libxl_remus.c
      tools/libxl: move save/restore code into libxl_dom_save.c
      libxl/save: Refactor libxl__domain_suspend_state
      tools/libxl: introduce enum type libxl_checkpointed_stream
      migration/save: pass checkpointed_stream from libxl to libxc
      tools/libxl: export logdirty_init
      tools/libxl: rename remus device to checkpoint device
      tools/libxl: adjust the indentation
      tools/libxl: store remus_ops in checkpoint device state
      tools/libxl: move remus state into a seperate structure
      tools/libxl: seperate device init/cleanup from checkpoint device layer

to staging.

Will revisit #5 next week and submit a patch explaining xc_domain_resume
and the shortcuts it can take.

> 
> Patch #5 needs a bit more work. Wen, note that you don't need to resubmit
> #5 just yet. We will come back to it next week.
> 
> Wei.



Thread overview: 26+ messages
2016-02-18  2:43 [PATCH v8 00/13] Prerequisite patches for COLO Wen Congyang
2016-02-18  2:43 ` [PATCH v8 01/13] libxl/remus: init checkpoint callback in Remus setup callback Wen Congyang
2016-02-18 12:30   ` Wei Liu
2016-02-18  2:43 ` [PATCH v8 02/13] tools/libxl: move remus code into libxl_remus.c Wen Congyang
2016-02-18  2:43 ` [PATCH v8 03/13] tools/libxl: move save/restore code into libxl_dom_save.c Wen Congyang
2016-02-18  2:43 ` [PATCH v8 04/13] libxl/save: Refactor libxl__domain_suspend_state Wen Congyang
2016-02-18  2:43 ` [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests Wen Congyang
2016-02-18 12:13   ` Wei Liu
2016-02-19 14:15     ` Konrad Rzeszutek Wilk
2016-02-19 14:43       ` Wei Liu
2016-02-19 14:52         ` Ian Campbell
2016-02-19 15:16           ` Wei Liu
2016-02-19 16:20             ` Konrad Rzeszutek Wilk
2016-02-19 16:42               ` Wei Liu
2016-02-19 17:16                 ` Konrad Rzeszutek Wilk
2016-02-19 17:21                   ` Wei Liu
2016-02-18  2:43 ` [PATCH v8 06/13] tools/libxl: introduce enum type libxl_checkpointed_stream Wen Congyang
2016-02-18  2:43 ` [PATCH v8 07/13] migration/save: pass checkpointed_stream from libxl to libxc Wen Congyang
2016-02-18  2:43 ` [PATCH v8 08/13] tools/libxl: export logdirty_init Wen Congyang
2016-02-18  2:43 ` [PATCH v8 09/13] tools/libxl: rename remus device to checkpoint device Wen Congyang
2016-02-18  2:43 ` [PATCH v8 10/13] tools/libxl: adjust the indentation Wen Congyang
2016-02-18  2:43 ` [PATCH v8 11/13] tools/libxl: store remus_ops in checkpoint device state Wen Congyang
2016-02-18  2:43 ` [PATCH v8 12/13] tools/libxl: move remus state into a seperate structure Wen Congyang
2016-02-18  2:43 ` [PATCH v8 13/13] tools/libxl: seperate device init/cleanup from checkpoint device layer Wen Congyang
2016-02-26 15:54 ` [PATCH v8 00/13] Prerequisite patches for COLO Wei Liu
2016-02-26 18:16   ` Konrad Rzeszutek Wilk
