* [PATCH v6 00/18] Prerequisite patches for COLO
@ 2015-12-30  2:28 Wen Congyang
  2015-12-30  2:28 ` [PATCH v6 01/18] libxl/remus: init checkpoint_callback in Remus setup callback Wen Congyang
                   ` (18 more replies)
  0 siblings, 19 replies; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:28 UTC
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

This patchset is a prerequisite for the COLO feature. Refer to:
http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping

It is based on the following series:
http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02881.html

v5->v6:
 - Fix some bugs found in testing

v4->v5:
 - Rebased to the latest xen
 - Addressed comments from last round

v3->v4:
 - Rebased to the latest migration v2 branch
 - Addressed comments from last round

v2->v3:
 - Merge '[PATCH v2 0/6] Misc cleanups for libxl' into this patchset
   for easy review
 - Addressed review comments
 - Add back channel to libxc
 - Introduce should_checkpoint callback
 - Introduce DIRTY_BITMAP record on libxc side
 - Introduce COLO_CONTEXT record on libxl side
 - Ported to Libxl migration v2

v1->v2:
 - Rebased to [PATCH v2 0/6] Misc cleanups for libxl
 - Add a bugfix for the error handling of process_record

Wen Congyang (18):
  libxl/remus: init checkpoint_callback in Remus setup callback
  tools/libxl: move remus code into libxl_remus.c
  tools/libxl: move save/restore code into libxl_dom_save.c
  libxl/save: Refactor libxl__domain_suspend_state
  tools/libxc: support to resume uncooperative HVM guests
  tools/libxl: introduce enum type libxl_checkpointed_stream
  migration/save: pass checkpointed_stream from libxl to libxc
  tools/libxl: introduce libxl__domain_restore_device_model to load qemu
    state
  tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()
  tools/libxl: export logdirty_init
  tools/libxl: Add back channel to allow migration target send data back
  tools/libx{l,c}: add back channel to libxc
  tools/libxl: rename remus device to checkpoint device
  tools/libxl: fix backword compatibility after the automatic renaming
  tools/libxl: adjust the indentation
  tools/libxl: store remus_ops in checkpoint device state
  tools/libxl: move remus state into a seperate structure
  tools/libxl: seperate device init/cleanup from checkpoint device layer

 tools/libxc/include/xenguest.h        |   8 +-
 tools/libxc/xc_nomigrate.c            |   5 +-
 tools/libxc/xc_resume.c               |  24 +-
 tools/libxc/xc_sr_common.h            |  12 +-
 tools/libxc/xc_sr_restore.c           |   2 +-
 tools/libxc/xc_sr_save.c              |  14 +-
 tools/libxl/Makefile                  |   4 +-
 tools/libxl/libxl.c                   |  83 +---
 tools/libxl/libxl.h                   |  50 ++-
 tools/libxl/libxl_checkpoint_device.c | 282 +++++++++++++
 tools/libxl/libxl_create.c            |  50 +--
 tools/libxl/libxl_dom.c               | 740 ----------------------------------
 tools/libxl/libxl_dom_save.c          | 555 +++++++++++++++++++++++++
 tools/libxl/libxl_dom_suspend.c       | 217 ++++++----
 tools/libxl/libxl_internal.h          | 236 +++++++----
 tools/libxl/libxl_netbuffer.c         | 117 +++---
 tools/libxl/libxl_nonetbuffer.c       |  10 +-
 tools/libxl/libxl_qmp.c               |  10 +
 tools/libxl/libxl_remus.c             | 410 +++++++++++++++++++
 tools/libxl/libxl_remus_device.c      | 327 ---------------
 tools/libxl/libxl_remus_disk_drbd.c   |  56 +--
 tools/libxl/libxl_save_callout.c      |  43 +-
 tools/libxl/libxl_save_helper.c       |   9 +-
 tools/libxl/libxl_stream_read.c       |   7 +-
 tools/libxl/libxl_stream_write.c      |  18 +-
 tools/libxl/libxl_types.idl           |  11 +-
 tools/libxl/xl_cmdimpl.c              |  26 +-
 tools/ocaml/libs/xl/xenlight_stubs.c  |   2 +-
 28 files changed, 1842 insertions(+), 1486 deletions(-)
 create mode 100644 tools/libxl/libxl_checkpoint_device.c
 create mode 100644 tools/libxl/libxl_dom_save.c
 create mode 100644 tools/libxl/libxl_remus.c
 delete mode 100644 tools/libxl/libxl_remus_device.c

-- 
2.5.0

* [PATCH v6 01/18] libxl/remus: init checkpoint_callback in Remus setup callback
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
@ 2015-12-30  2:28 ` Wen Congyang
  2016-01-25 17:29   ` Konrad Rzeszutek Wilk
  2015-12-30  2:28 ` [PATCH v6 02/18] tools/libxl: move remus code into libxl_remus.c Wen Congyang
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:28 UTC
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Ian Campbell, Wen Congyang,
	Gui Jianfeng, Jiang Yunhong, Dong Eddie, Shriram Rajagopalan,
	Ian Jackson, Yang Hongyang

Initialize the stream {read/write} state's checkpoint_callback in the Remus
setup callbacks. There is no functional change; this is just refactoring so
that we can move all Remus code into one file.
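
To illustrate the refactoring described above, here is a minimal standalone
sketch. It is not the libxl code: the demo_* names and the simplified
struct stream_state are hypothetical stand-ins. Only the idea is taken from
the patch, namely that the generic path stops installing the checkpoint
callback and a Remus-specific setup function installs it instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a libxl stream state. */
struct stream_state {
    void (*checkpoint_callback)(struct stream_state *s, int rc);
    int last_rc;
};

/* Stand-in for a checkpoint completion callback: records the result. */
static void demo_checkpoint_done(struct stream_state *s, int rc)
{
    assert(s != NULL);
    s->last_rc = rc;
}

/* Remus-specific setup: the only place that installs the callback. */
static void demo_remus_restore_setup(struct stream_state *s)
{
    s->checkpoint_callback = demo_checkpoint_done;
}

/* Generic restore path: defers to the Remus setup only when the
 * stream is checkpointed, and otherwise leaves the callback unset. */
static void demo_restore_start(struct stream_state *s, int checkpointed_stream)
{
    s->checkpoint_callback = NULL;
    s->last_rc = -1;
    if (checkpointed_stream)
        demo_remus_restore_setup(s);
}
```

In this sketch a plain (non-checkpointed) restore leaves the callback NULL,
so all Remus-specific wiring lives in one function that can later be moved
to its own file.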

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl.c          |  2 ++
 tools/libxl/libxl_create.c   | 10 +++++++++-
 tools/libxl/libxl_dom.c      |  5 +----
 tools/libxl/libxl_internal.h |  4 ++++
 4 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 9207621..d340a20 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -918,6 +918,8 @@ static void libxl__remus_setup(libxl__egc *egc,
     rds->domid = dss->domid;
     rds->callback = remus_setup_done;
 
+    dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
+
     libxl__remus_devices_setup(egc, rds);
     return;
 
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 261816a..6ea9bc2 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -709,6 +709,12 @@ static void remus_checkpoint_stream_done(
     libxl__xc_domain_saverestore_async_callback_done(egc, &stream->shs, rc);
 }
 
+static void libxl__remus_restore_setup(libxl__egc *egc,
+                                       libxl__domain_create_state *dcs)
+{
+    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
+}
+
 /*----- main domain creation -----*/
 
 /* We have a linear control flow; only one event callback is
@@ -995,6 +1001,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
     libxl__domain_build_state *const state = &dcs->build_state;
     libxl__srm_restore_autogen_callbacks *const callbacks =
         &dcs->srs.shs.callbacks.restore.a;
+    const int checkpointed_stream = dcs->restore_params.checkpointed_stream;
 
     if (rc) {
         domcreate_rebuild_done(egc, dcs, rc);
@@ -1033,9 +1040,10 @@ static void domcreate_bootloader_done(libxl__egc *egc,
     dcs->srs.fd = restore_fd;
     dcs->srs.legacy = (dcs->restore_params.stream_version == 1);
     dcs->srs.completion_callback = domcreate_stream_done;
-    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
 
     if (restore_fd >= 0) {
+        if (checkpointed_stream)
+            libxl__remus_restore_setup(egc, dcs);
         libxl__stream_read_start(egc, &dcs->srs);
         return;
     }
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 2269998..9e28bc4 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1569,8 +1569,6 @@ out:
 
 /*----- remus asynchronous checkpoint callback -----*/
 
-static void remus_checkpoint_stream_written(
-    libxl__egc *egc, libxl__stream_write_state *sws, int rc);
 static void remus_devices_commit_cb(libxl__egc *egc,
                                     libxl__remus_devices_state *rds,
                                     int rc);
@@ -1588,7 +1586,7 @@ static void libxl__remus_domain_save_checkpoint_callback(void *data)
     libxl__stream_write_start_checkpoint(egc, &dss->sws);
 }
 
-static void remus_checkpoint_stream_written(
+void remus_checkpoint_stream_written(
     libxl__egc *egc, libxl__stream_write_state *sws, int rc)
 {
     libxl__domain_suspend_state *dss = CONTAINER_OF(sws, *dss, sws);
@@ -1761,7 +1759,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
         callbacks->suspend = libxl__remus_domain_suspend_callback;
         callbacks->postcopy = libxl__remus_domain_resume_callback;
         callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
-        dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
     } else
         callbacks->suspend = libxl__domain_suspend_callback;
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 630172b..45d7961 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3507,6 +3507,10 @@ _hidden void libxl__domain_suspend(libxl__egc *egc,
 /* used by libxc to suspend the guest during migration */
 _hidden void libxl__domain_suspend_callback(void *data);
 
+/* Remus callbacks for restore */
+_hidden void remus_checkpoint_stream_written(
+    libxl__egc *egc, libxl__stream_write_state *sws, int rc);
+
 
 /*
  * Convenience macros.
-- 
2.5.0

* [PATCH v6 02/18] tools/libxl: move remus code into libxl_remus.c
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
  2015-12-30  2:28 ` [PATCH v6 01/18] libxl/remus: init checkpoint_callback in Remus setup callback Wen Congyang
@ 2015-12-30  2:28 ` Wen Congyang
  2015-12-30  2:28 ` [PATCH v6 03/18] tools/libxl: move save/restore code into libxl_dom_save.c Wen Congyang
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:28 UTC
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Ian Campbell, Wen Congyang,
	Gui Jianfeng, Jiang Yunhong, Dong Eddie, Shriram Rajagopalan,
	Ian Jackson, Yang Hongyang

After the previous refactoring, we are now able to move all Remus code
into a separate file, libxl_remus.c.

Export the following functions for internal use:
- Remus callbacks
  * libxl__remus_domain_suspend_callback
  * libxl__remus_domain_resume_callback
  * libxl__remus_domain_save_checkpoint_callback
  * libxl__remus_domain_restore_checkpoint_callback
- setup/teardown Remus:
  * libxl__remus_setup
  * libxl__remus_teardown
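
The exported Remus callbacks listed above all take a single void *data
argument and recover their typed state from it. The sketch below shows
that calling convention in isolation. The demo_* types and the
suspend_requests counter are hypothetical and much simpler than the real
libxl structures; the sketch only assumes that the helper state carries a
pointer back to the caller's state, as the callbacks in this patch do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the save helper state passed as `data`. */
struct demo_helper_state {
    void *caller_state;   /* points back at the owning suspend state */
};

/* Hypothetical stand-in for the domain suspend state. */
struct demo_suspend_state {
    int domid;
    int suspend_requests; /* illustration only: counts callback calls */
};

/* Prototype as an internal header would declare it once the
 * function is no longer static. */
void demo_remus_suspend_callback(void *data);

/* Exported callback: unpacks the opaque pointer into typed state. */
void demo_remus_suspend_callback(void *data)
{
    struct demo_helper_state *shs = data;
    struct demo_suspend_state *dss = shs->caller_state;

    assert(dss != NULL);
    dss->suspend_requests++;
}
```

Because the signature is just void (*)(void *), such a function can be
stored in a generic callback table and defined in any file, which is what
makes moving it out of its original file straightforward.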

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/Makefile         |   2 +-
 tools/libxl/libxl.c          |  69 ---------
 tools/libxl/libxl_create.c   |  27 ----
 tools/libxl/libxl_dom.c      | 223 ---------------------------
 tools/libxl/libxl_internal.h |  15 +-
 tools/libxl/libxl_remus.c    | 348 +++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 362 insertions(+), 322 deletions(-)
 create mode 100644 tools/libxl/libxl_remus.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 6ff5bee..90e15d2 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -62,7 +62,7 @@ else
 LIBXL_OBJS-y += libxl_no_convert_callout.o
 endif
 
-LIBXL_OBJS-y += libxl_remus_device.o libxl_remus_disk_drbd.o
+LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o
 
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index d340a20..bdd8ad0 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -832,12 +832,6 @@ out:
     return ptr;
 }
 
-static void libxl__remus_setup(libxl__egc *egc,
-                               libxl__domain_suspend_state *dss);
-static void remus_setup_done(libxl__egc *egc,
-                             libxl__remus_devices_state *rds, int rc);
-static void remus_setup_failed(libxl__egc *egc,
-                               libxl__remus_devices_state *rds, int rc);
 static void remus_failover_cb(libxl__egc *egc,
                               libxl__domain_suspend_state *dss, int rc);
 
@@ -894,69 +888,6 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
     return AO_CREATE_FAIL(rc);
 }
 
-static void libxl__remus_setup(libxl__egc *egc,
-                               libxl__domain_suspend_state *dss)
-{
-    /* Convenience aliases */
-    libxl__remus_devices_state *const rds = &dss->rds;
-    const libxl_domain_remus_info *const info = dss->remus;
-
-    STATE_AO_GC(dss->ao);
-
-    if (libxl_defbool_val(info->netbuf)) {
-        if (!libxl__netbuffer_enabled(gc)) {
-            LOG(ERROR, "Remus: No support for network buffering");
-            goto out;
-        }
-        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
-    }
-
-    if (libxl_defbool_val(info->diskbuf))
-        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
-
-    rds->ao = ao;
-    rds->domid = dss->domid;
-    rds->callback = remus_setup_done;
-
-    dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
-
-    libxl__remus_devices_setup(egc, rds);
-    return;
-
-out:
-    dss->callback(egc, dss, ERROR_FAIL);
-}
-
-static void remus_setup_done(libxl__egc *egc,
-                             libxl__remus_devices_state *rds, int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-    STATE_AO_GC(dss->ao);
-
-    if (!rc) {
-        libxl__domain_save(egc, dss);
-        return;
-    }
-
-    LOG(ERROR, "Remus: failed to setup device for guest with domid %u, rc %d",
-        dss->domid, rc);
-    rds->callback = remus_setup_failed;
-    libxl__remus_devices_teardown(egc, rds);
-}
-
-static void remus_setup_failed(libxl__egc *egc,
-                               libxl__remus_devices_state *rds, int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-    STATE_AO_GC(dss->ao);
-
-    if (rc)
-        LOG(ERROR, "Remus: failed to teardown device after setup failed"
-            " for guest with domid %u, rc %d", dss->domid, rc);
-
-    dss->callback(egc, dss, rc);
-}
-
 static void remus_failover_cb(libxl__egc *egc,
                               libxl__domain_suspend_state *dss, int rc)
 {
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 6ea9bc2..0ee9a57 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -688,33 +688,6 @@ static int store_libxl_entry(libxl__gc *gc, uint32_t domid,
                             libxl_device_model_version_to_string(b_info->device_model_version));
 }
 
-/*----- remus asynchronous checkpoint callback -----*/
-
-static void remus_checkpoint_stream_done(
-    libxl__egc *egc, libxl__stream_read_state *srs, int rc);
-
-static void libxl__remus_domain_restore_checkpoint_callback(void *data)
-{
-    libxl__save_helper_state *shs = data;
-    libxl__domain_create_state *dcs = shs->caller_state;
-    libxl__egc *egc = shs->egc;
-    STATE_AO_GC(dcs->ao);
-
-    libxl__stream_read_start_checkpoint(egc, &dcs->srs);
-}
-
-static void remus_checkpoint_stream_done(
-    libxl__egc *egc, libxl__stream_read_state *stream, int rc)
-{
-    libxl__xc_domain_saverestore_async_callback_done(egc, &stream->shs, rc);
-}
-
-static void libxl__remus_restore_setup(libxl__egc *egc,
-                                       libxl__domain_create_state *dcs)
-{
-    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
-}
-
 /*----- main domain creation -----*/
 
 /* We have a linear control flow; only one event callback is
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 9e28bc4..81bd464 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1479,196 +1479,6 @@ int libxl__save_emulator_xenstore_data(libxl__domain_suspend_state *dss,
     return rc;
 }
 
-/*----- remus callbacks -----*/
-static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int ok);
-static void remus_devices_postsuspend_cb(libxl__egc *egc,
-                                         libxl__remus_devices_state *rds,
-                                         int rc);
-static void remus_devices_preresume_cb(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
-                                       int rc);
-
-static void libxl__remus_domain_suspend_callback(void *data)
-{
-    libxl__save_helper_state *shs = data;
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-
-    dss->callback_common_done = remus_domain_suspend_callback_common_done;
-    libxl__domain_suspend(egc, dss);
-}
-
-static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int rc)
-{
-    if (rc)
-        goto out;
-
-    libxl__remus_devices_state *const rds = &dss->rds;
-    rds->callback = remus_devices_postsuspend_cb;
-    libxl__remus_devices_postsuspend(egc, rds);
-    return;
-
-out:
-    dss->rc = rc;
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
-}
-
-static void remus_devices_postsuspend_cb(libxl__egc *egc,
-                                         libxl__remus_devices_state *rds,
-                                         int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-
-    if (rc)
-        goto out;
-
-    rc = 0;
-
-out:
-    if (rc)
-        dss->rc = rc;
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
-}
-
-static void libxl__remus_domain_resume_callback(void *data)
-{
-    libxl__save_helper_state *shs = data;
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    STATE_AO_GC(dss->ao);
-
-    libxl__remus_devices_state *const rds = &dss->rds;
-    rds->callback = remus_devices_preresume_cb;
-    libxl__remus_devices_preresume(egc, rds);
-}
-
-static void remus_devices_preresume_cb(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
-                                       int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-    STATE_AO_GC(dss->ao);
-
-    if (rc)
-        goto out;
-
-    /* Resumes the domain and the device model */
-    rc = libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1);
-    if (rc)
-        goto out;
-
-    rc = 0;
-
-out:
-    if (rc)
-        dss->rc = rc;
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
-}
-
-/*----- remus asynchronous checkpoint callback -----*/
-
-static void remus_devices_commit_cb(libxl__egc *egc,
-                                    libxl__remus_devices_state *rds,
-                                    int rc);
-static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
-                                  const struct timeval *requested_abs,
-                                  int rc);
-
-static void libxl__remus_domain_save_checkpoint_callback(void *data)
-{
-    libxl__save_helper_state *shs = data;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    libxl__egc *egc = shs->egc;
-    STATE_AO_GC(dss->ao);
-
-    libxl__stream_write_start_checkpoint(egc, &dss->sws);
-}
-
-void remus_checkpoint_stream_written(
-    libxl__egc *egc, libxl__stream_write_state *sws, int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(sws, *dss, sws);
-
-    /* Convenience aliases */
-    libxl__remus_devices_state *const rds = &dss->rds;
-
-    STATE_AO_GC(dss->ao);
-
-    if (rc) {
-        LOG(ERROR, "Failed to save device model. Terminating Remus..");
-        goto out;
-    }
-
-    rds->callback = remus_devices_commit_cb;
-    libxl__remus_devices_commit(egc, rds);
-
-    return;
-
-out:
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
-}
-
-static void remus_devices_commit_cb(libxl__egc *egc,
-                                    libxl__remus_devices_state *rds,
-                                    int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-
-    STATE_AO_GC(dss->ao);
-
-    if (rc) {
-        LOG(ERROR, "Failed to do device commit op."
-            " Terminating Remus..");
-        goto out;
-    }
-
-    /*
-     * At this point, we have successfully checkpointed the guest and
-     * committed it at the backup. We'll come back after the checkpoint
-     * interval to checkpoint the guest again. Until then, let the guest
-     * continue execution.
-     */
-
-    /* Set checkpoint interval timeout */
-    rc = libxl__ev_time_register_rel(ao, &dss->checkpoint_timeout,
-                                     remus_next_checkpoint,
-                                     dss->interval);
-
-    if (rc)
-        goto out;
-
-    return;
-
-out:
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
-}
-
-static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
-                                  const struct timeval *requested_abs,
-                                  int rc)
-{
-    libxl__domain_suspend_state *dss =
-                            CONTAINER_OF(ev, *dss, checkpoint_timeout);
-
-    STATE_AO_GC(dss->ao);
-
-    if (rc == ERROR_TIMEDOUT) /* As intended */
-        rc = 0;
-
-    /*
-     * Time to checkpoint the guest again. We return 1 to libxc
-     * (xc_domain_save.c). in order to continue executing the infinite loop
-     * (suspend, checkpoint, resume) in xc_domain_save().
-     */
-
-    if (rc)
-        dss->rc = rc;
-
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
-}
-
 /*----- main code for saving, in order of execution -----*/
 
 void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
@@ -1782,13 +1592,6 @@ static void stream_done(libxl__egc *egc,
     domain_save_done(egc, sws->dss, rc);
 }
 
-static void libxl__remus_teardown(libxl__egc *egc,
-                                  libxl__domain_suspend_state *dss,
-                                  int rc);
-static void remus_teardown_done(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
-                                       int rc);
-
 static void domain_save_done(libxl__egc *egc,
                              libxl__domain_suspend_state *dss, int rc)
 {
@@ -1817,32 +1620,6 @@ static void domain_save_done(libxl__egc *egc,
     dss->callback(egc, dss, rc);
 }
 
-static void libxl__remus_teardown(libxl__egc *egc,
-                                  libxl__domain_suspend_state *dss,
-                                  int rc)
-{
-    EGC_GC;
-
-    LOG(WARN, "Remus: Domain suspend terminated with rc %d,"
-        " teardown Remus devices...", rc);
-    dss->rds.callback = remus_teardown_done;
-    libxl__remus_devices_teardown(egc, &dss->rds);
-}
-
-static void remus_teardown_done(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
-                                       int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-    STATE_AO_GC(dss->ao);
-
-    if (rc)
-        LOG(ERROR, "Remus: failed to teardown device for guest with domid %u,"
-            " rc %d", dss->domid, rc);
-
-    dss->callback(egc, dss, rc);
-}
-
 /*==================== Miscellaneous ====================*/
 
 char *libxl__uuid2string(libxl__gc *gc, const libxl_uuid uuid)
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 45d7961..05537cc 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3507,9 +3507,20 @@ _hidden void libxl__domain_suspend(libxl__egc *egc,
 /* used by libxc to suspend the guest during migration */
 _hidden void libxl__domain_suspend_callback(void *data);
 
+/* Remus callbacks for save */
+_hidden void libxl__remus_domain_suspend_callback(void *data);
+_hidden void libxl__remus_domain_resume_callback(void *data);
+_hidden void libxl__remus_domain_save_checkpoint_callback(void *data);
+/* Remus setup and teardown*/
+_hidden void libxl__remus_setup(libxl__egc *egc,
+                                libxl__domain_suspend_state *dss);
+_hidden void libxl__remus_teardown(libxl__egc *egc,
+                                   libxl__domain_suspend_state *dss,
+                                   int rc);
 /* Remus callbacks for restore */
-_hidden void remus_checkpoint_stream_written(
-    libxl__egc *egc, libxl__stream_write_state *sws, int rc);
+_hidden void libxl__remus_domain_restore_checkpoint_callback(void *data);
+_hidden void libxl__remus_restore_setup(libxl__egc *egc,
+                                        libxl__domain_create_state *dcs);
 
 
 /*
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
new file mode 100644
index 0000000..e3caf7d
--- /dev/null
+++ b/tools/libxl/libxl_remus.c
@@ -0,0 +1,348 @@
+/*
+ * Copyright (C) 2009      Citrix Ltd.
+ * Author Vincent Hanquez <vincent.hanquez@eu.citrix.com>
+ *        Yang Hongyang <hongyang.yang@easystack.cn>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+/*-------------------- Remus setup and teardown ---------------------*/
+
+static void remus_setup_done(libxl__egc *egc,
+                             libxl__remus_devices_state *rds, int rc);
+static void remus_setup_failed(libxl__egc *egc,
+                               libxl__remus_devices_state *rds, int rc);
+static void remus_checkpoint_stream_written(
+    libxl__egc *egc, libxl__stream_write_state *sws, int rc);
+
+void libxl__remus_setup(libxl__egc *egc,
+                        libxl__domain_suspend_state *dss)
+{
+    /* Convenience aliases */
+    libxl__remus_devices_state *const rds = &dss->rds;
+    const libxl_domain_remus_info *const info = dss->remus;
+
+    STATE_AO_GC(dss->ao);
+
+    if (libxl_defbool_val(info->netbuf)) {
+        if (!libxl__netbuffer_enabled(gc)) {
+            LOG(ERROR, "Remus: No support for network buffering");
+            goto out;
+        }
+        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
+    }
+
+    if (libxl_defbool_val(info->diskbuf))
+        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
+
+    rds->ao = ao;
+    rds->domid = dss->domid;
+    rds->callback = remus_setup_done;
+
+    dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
+
+    libxl__remus_devices_setup(egc, rds);
+    return;
+
+out:
+    dss->callback(egc, dss, ERROR_FAIL);
+}
+
+static void remus_setup_done(libxl__egc *egc,
+                             libxl__remus_devices_state *rds, int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    STATE_AO_GC(dss->ao);
+
+    if (!rc) {
+        libxl__domain_save(egc, dss);
+        return;
+    }
+
+    LOG(ERROR, "Remus: failed to setup device for guest with domid %u, rc %d",
+        dss->domid, rc);
+    rds->callback = remus_setup_failed;
+    libxl__remus_devices_teardown(egc, rds);
+}
+
+static void remus_setup_failed(libxl__egc *egc,
+                               libxl__remus_devices_state *rds, int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    STATE_AO_GC(dss->ao);
+
+    if (rc)
+        LOG(ERROR, "Remus: failed to teardown device after setup failed"
+            " for guest with domid %u, rc %d", dss->domid, rc);
+
+    dss->callback(egc, dss, rc);
+}
+
+static void remus_teardown_done(libxl__egc *egc,
+                                libxl__remus_devices_state *rds,
+                                int rc);
+void libxl__remus_teardown(libxl__egc *egc,
+                           libxl__domain_suspend_state *dss,
+                           int rc)
+{
+    EGC_GC;
+
+    LOG(WARN, "Remus: Domain suspend terminated with rc %d,"
+        " teardown Remus devices...", rc);
+    dss->rds.callback = remus_teardown_done;
+    libxl__remus_devices_teardown(egc, &dss->rds);
+}
+
+static void remus_teardown_done(libxl__egc *egc,
+                                libxl__remus_devices_state *rds,
+                                int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    STATE_AO_GC(dss->ao);
+
+    if (rc)
+        LOG(ERROR, "Remus: failed to teardown device for guest with domid %u,"
+            " rc %d", dss->domid, rc);
+
+    dss->callback(egc, dss, rc);
+}
+
+/*---------------------- remus callbacks (save) -----------------------*/
+
+static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
+                                libxl__domain_suspend_state *dss, int ok);
+static void remus_devices_postsuspend_cb(libxl__egc *egc,
+                                         libxl__remus_devices_state *rds,
+                                         int rc);
+static void remus_devices_preresume_cb(libxl__egc *egc,
+                                       libxl__remus_devices_state *rds,
+                                       int rc);
+
+void libxl__remus_domain_suspend_callback(void *data)
+{
+    libxl__save_helper_state *shs = data;
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+
+    dss->callback_common_done = remus_domain_suspend_callback_common_done;
+    libxl__domain_suspend(egc, dss);
+}
+
+static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
+                                libxl__domain_suspend_state *dss, int rc)
+{
+    if (rc)
+        goto out;
+
+    libxl__remus_devices_state *const rds = &dss->rds;
+    rds->callback = remus_devices_postsuspend_cb;
+    libxl__remus_devices_postsuspend(egc, rds);
+    return;
+
+out:
+    dss->rc = rc;
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
+}
+
+static void remus_devices_postsuspend_cb(libxl__egc *egc,
+                                         libxl__remus_devices_state *rds,
+                                         int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+
+    if (rc)
+        goto out;
+
+    rc = 0;
+
+out:
+    if (rc)
+        dss->rc = rc;
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
+}
+
+void libxl__remus_domain_resume_callback(void *data)
+{
+    libxl__save_helper_state *shs = data;
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    STATE_AO_GC(dss->ao);
+
+    libxl__remus_devices_state *const rds = &dss->rds;
+    rds->callback = remus_devices_preresume_cb;
+    libxl__remus_devices_preresume(egc, rds);
+}
+
+static void remus_devices_preresume_cb(libxl__egc *egc,
+                                       libxl__remus_devices_state *rds,
+                                       int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    STATE_AO_GC(dss->ao);
+
+    if (rc)
+        goto out;
+
+    /* Resumes the domain and the device model */
+    rc = libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1);
+    if (rc)
+        goto out;
+
+    rc = 0;
+
+out:
+    if (rc)
+        dss->rc = rc;
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
+}
+
+/*----- remus asynchronous checkpoint callback -----*/
+
+static void remus_devices_commit_cb(libxl__egc *egc,
+                                    libxl__remus_devices_state *rds,
+                                    int rc);
+static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
+                                  const struct timeval *requested_abs,
+                                  int rc);
+
+void libxl__remus_domain_save_checkpoint_callback(void *data)
+{
+    libxl__save_helper_state *shs = data;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__egc *egc = shs->egc;
+    STATE_AO_GC(dss->ao);
+
+    libxl__stream_write_start_checkpoint(egc, &dss->sws);
+}
+
+static void remus_checkpoint_stream_written(
+    libxl__egc *egc, libxl__stream_write_state *sws, int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(sws, *dss, sws);
+
+    /* Convenience aliases */
+    libxl__remus_devices_state *const rds = &dss->rds;
+
+    STATE_AO_GC(dss->ao);
+
+    if (rc) {
+        LOG(ERROR, "Failed to save device model. Terminating Remus..");
+        goto out;
+    }
+
+    rds->callback = remus_devices_commit_cb;
+    libxl__remus_devices_commit(egc, rds);
+
+    return;
+
+out:
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
+}
+
+static void remus_devices_commit_cb(libxl__egc *egc,
+                                    libxl__remus_devices_state *rds,
+                                    int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+
+    STATE_AO_GC(dss->ao);
+
+    if (rc) {
+        LOG(ERROR, "Failed to do device commit op."
+            " Terminating Remus..");
+        goto out;
+    }
+
+    /*
+     * At this point, we have successfully checkpointed the guest and
+     * committed it at the backup. We'll come back after the checkpoint
+     * interval to checkpoint the guest again. Until then, let the guest
+     * continue execution.
+     */
+
+    /* Set checkpoint interval timeout */
+    rc = libxl__ev_time_register_rel(ao, &dss->checkpoint_timeout,
+                                     remus_next_checkpoint,
+                                     dss->interval);
+
+    if (rc)
+        goto out;
+
+    return;
+
+out:
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
+}
+
+static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
+                                  const struct timeval *requested_abs,
+                                  int rc)
+{
+    libxl__domain_suspend_state *dss =
+                            CONTAINER_OF(ev, *dss, checkpoint_timeout);
+
+    STATE_AO_GC(dss->ao);
+
+    if (rc == ERROR_TIMEDOUT) /* As intended */
+        rc = 0;
+
+    /*
+     * Time to checkpoint the guest again. We return 1 to libxc
+     * (xc_domain_save.c) in order to continue executing the infinite loop
+     * (suspend, checkpoint, resume) in xc_domain_save().
+     */
+
+    if (rc)
+        dss->rc = rc;
+
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
+}
+
+/*---------------------- remus callbacks (restore) -----------------------*/
+
+/*----- remus asynchronous checkpoint callback -----*/
+
+static void remus_checkpoint_stream_done(
+    libxl__egc *egc, libxl__stream_read_state *srs, int rc);
+
+void libxl__remus_domain_restore_checkpoint_callback(void *data)
+{
+    libxl__save_helper_state *shs = data;
+    libxl__domain_create_state *dcs = shs->caller_state;
+    libxl__egc *egc = shs->egc;
+    STATE_AO_GC(dcs->ao);
+
+    libxl__stream_read_start_checkpoint(egc, &dcs->srs);
+}
+
+static void remus_checkpoint_stream_done(
+    libxl__egc *egc, libxl__stream_read_state *stream, int rc)
+{
+    libxl__xc_domain_saverestore_async_callback_done(egc, &stream->shs, rc);
+}
+
+void libxl__remus_restore_setup(libxl__egc *egc,
+                                libxl__domain_create_state *dcs)
+{
+    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.5.0
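
Reader's note on the callback chain in the moved Remus code: each step stores a completion callback in the embedded libxl__remus_devices_state, starts the device operation, and the callback later recovers the enclosing libxl__domain_suspend_state with CONTAINER_OF(). The standalone sketch below shows only that pattern. It is not libxl code: the struct names, OUTER_OF(), and the functions are hypothetical stand-ins, and the "asynchronous" operation completes immediately.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for libxl__remus_devices_state and
 * libxl__domain_suspend_state; only the fields needed here. */
struct devices_state;
typedef void devices_cb(struct devices_state *rds, int rc);

struct devices_state {
    devices_cb *callback;      /* invoked when the device op completes */
};

struct suspend_state {
    int rc;                    /* first error seen, 0 if none */
    int finished;              /* set once the chain has completed */
    struct devices_state rds;  /* embedded, not a pointer */
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define OUTER_OF(inner_ptr, outer_type, member) \
    ((outer_type *)((char *)(inner_ptr) - offsetof(outer_type, member)))

/* Stand-in for an asynchronous device operation; it completes
 * immediately by invoking the registered callback. */
static void devices_postsuspend(struct devices_state *rds, int rc)
{
    rds->callback(rds, rc);
}

/* Completion callback: only 'rds' is passed in, so the outer state is
 * recovered from it before the result is recorded. */
static void postsuspend_cb(struct devices_state *rds, int rc)
{
    struct suspend_state *dss = OUTER_OF(rds, struct suspend_state, rds);

    if (rc)
        dss->rc = rc;
    dss->finished = 1;
}

/* One step of the chain: register the callback, then start the op. */
static void suspend_step(struct suspend_state *dss, int simulated_rc)
{
    struct devices_state *const rds = &dss->rds;

    rds->callback = postsuspend_cb;
    devices_postsuspend(rds, simulated_rc);
}
```

Embedding the inner state by value is what makes the pointer arithmetic valid; if rds were a separately allocated object, the callback would need an explicit back-pointer instead.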


* [PATCH v6 03/18] tools/libxl: move save/restore code into libxl_dom_save.c
From: Wen Congyang @ 2015-12-30  2:28 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Ian Jackson,
	Yang Hongyang

This is purely code motion: the save/restore code moves from libxl_dom.c
into the new file libxl_dom_save.c.

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
---
 tools/libxl/Makefile         |   2 +-
 tools/libxl/libxl_dom.c      | 514 ----------------------------------------
 tools/libxl/libxl_dom_save.c | 543 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 544 insertions(+), 515 deletions(-)
 create mode 100644 tools/libxl/libxl_dom_save.c
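
Reader's note: the moved code serialises the device model's xenstore entries as a flat buffer of NUL-terminated strings, alternating key and value (append_string() on the save side, next_string() on the restore side). The standalone sketch below illustrates that layout only. It is not libxl code: pack_append(), pack_next() and pack_count_pairs() are hypothetical names, plain realloc() stands in for libxl__realloc(), and memchr() is used where the patch uses strnlen().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Append 'str', including its NUL terminator, to the buffer *buf of
 * length *len.  Returns 0 on success, -1 if realloc() fails (the
 * buffer is left untouched in that case).
 */
static int pack_append(char **buf, uint32_t *len, const char *str)
{
    size_t extralen;
    char *new_buf;

    assert(buf && len && str);

    extralen = strlen(str) + 1;
    new_buf = realloc(*buf, *len + extralen);
    if (!new_buf)
        return -1;

    memcpy(new_buf + *len, str, extralen);
    *buf = new_buf;
    *len += (uint32_t)extralen;
    return 0;
}

/*
 * Return a pointer to the character following the NUL terminator of
 * 'start', or NULL if 'start' is not terminated before 'end'.
 */
static const char *pack_next(const char *start, const char *end)
{
    const char *nul;

    if (start >= end)
        return NULL;

    nul = memchr(start, '\0', (size_t)(end - start));
    return nul ? nul + 1 : NULL;
}

/*
 * Count complete key/value pairs in a packed buffer.  Returns -1 on a
 * malformed buffer: unterminated string, empty key, or absolute key.
 */
static int pack_count_pairs(const char *ptr, uint32_t size)
{
    const char *next = ptr, *end = ptr + size;
    int pairs = 0;

    while (next < end) {
        const char *key = next;

        next = pack_next(next, end);
        if (!next || key[0] == '\0' || key[0] == '/')
            return -1;

        next = pack_next(next, end);   /* skip over the value */
        if (!next)
            return -1;

        pairs++;
    }
    return pairs;
}
```

The reader validates every string against the end of the buffer before touching it, so a truncated stream is rejected instead of being read past its end.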

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 90e15d2..b476012 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -103,7 +103,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
 			libxl_stream_read.o libxl_stream_write.o \
 			libxl_save_callout.o _libxl_save_msgs_callout.o \
 			libxl_qmp.o libxl_event.o libxl_fork.o \
-			libxl_dom_suspend.o $(LIBXL_OBJS-y)
+			libxl_dom_suspend.o libxl_dom_save.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += libxl_genid.o
 LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 81bd464..664adad 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -24,7 +24,6 @@
 #include <xen/hvm/hvm_info_table.h>
 #include <xen/hvm/hvm_xs_strings.h>
 #include <xen/hvm/e820.h>
-#include <xen/errno.h>
 
 libxl_domain_type libxl__domain_type(libxl__gc *gc, uint32_t domid)
 {
@@ -1107,519 +1106,6 @@ int libxl__qemu_traditional_cmd(libxl__gc *gc, uint32_t domid,
     return libxl__xs_printf(gc, XBT_NULL, path, "%s", cmd);
 }
 
-/*
- * Inspect the buffer between start and end, and return a pointer to the
- * character following the NUL terminator of start, or NULL if start is not
- * terminated before end.
- */
-static const char *next_string(const char *start, const char *end)
-{
-    if (start >= end) return NULL;
-
-    size_t total_len = end - start;
-    size_t len = strnlen(start, total_len);
-
-    if (len == total_len)
-        return NULL;
-    else
-        return start + len + 1;
-}
-
-int libxl__restore_emulator_xenstore_data(libxl__domain_create_state *dcs,
-                                          const char *ptr, uint32_t size)
-{
-    STATE_AO_GC(dcs->ao);
-    const char *next = ptr, *end = ptr + size, *key, *val;
-    int rc;
-
-    const uint32_t domid = dcs->guest_domid;
-    const uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
-    const char *xs_root = libxl__device_model_xs_path(gc, dm_domid, domid, "");
-
-    while (next < end) {
-        key = next;
-        next = next_string(next, end);
-
-        /* Sanitise 'key'. */
-        if (!next) {
-            rc = ERROR_FAIL;
-            LOG(ERROR, "Key in xenstore data not NUL terminated");
-            goto out;
-        }
-        if (key[0] == '\0') {
-            rc = ERROR_FAIL;
-            LOG(ERROR, "empty key found in xenstore data");
-            goto out;
-        }
-        if (key[0] == '/') {
-            rc = ERROR_FAIL;
-            LOG(ERROR, "Key in xenstore data not relative");
-            goto out;
-        }
-
-        val = next;
-        next = next_string(next, end);
-
-        /* Sanitise 'val'. */
-        if (!next) {
-            rc = ERROR_FAIL;
-            LOG(ERROR, "Val in xenstore data not NUL terminated");
-            goto out;
-        }
-
-        libxl__xs_printf(gc, XBT_NULL,
-                         GCSPRINTF("%s/%s", xs_root, key),
-                         "%s", val);
-    }
-
-    rc = 0;
-
- out:
-    return rc;
-}
-
-/*==================== Domain suspend (save) ====================*/
-
-static void stream_done(libxl__egc *egc,
-                        libxl__stream_write_state *sws, int rc);
-static void domain_save_done(libxl__egc *egc,
-                             libxl__domain_suspend_state *dss, int rc);
-
-/*----- complicated callback, called by xc_domain_save -----*/
-
-/*
- * We implement the other end of protocol for controlling qemu-dm's
- * logdirty.  There is no documentation for this protocol, but our
- * counterparty's implementation is in
- * qemu-xen-traditional.git:xenstore.c in the function
- * xenstore_process_logdirty_event
- */
-
-static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                    const struct timeval *requested_abs,
-                                    int rc);
-static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
-                            const char *watch_path, const char *event_path);
-static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss, int rc);
-
-static void logdirty_init(libxl__logdirty_switch *lds)
-{
-    lds->cmd_path = 0;
-    libxl__ev_xswatch_init(&lds->watch);
-    libxl__ev_time_init(&lds->timeout);
-}
-
-static void domain_suspend_switch_qemu_xen_traditional_logdirty
-                               (int domid, unsigned enable,
-                                libxl__save_helper_state *shs)
-{
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    libxl__logdirty_switch *lds = &dss->logdirty;
-    STATE_AO_GC(dss->ao);
-    int rc;
-    xs_transaction_t t = 0;
-    const char *got;
-
-    if (!lds->cmd_path) {
-        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
-        lds->cmd_path = libxl__device_model_xs_path(gc, dm_domid, domid,
-                                                    "/logdirty/cmd");
-        lds->ret_path = libxl__device_model_xs_path(gc, dm_domid, domid,
-                                                    "/logdirty/ret");
-    }
-    lds->cmd = enable ? "enable" : "disable";
-
-    rc = libxl__ev_xswatch_register(gc, &lds->watch,
-                                switch_logdirty_xswatch, lds->ret_path);
-    if (rc) goto out;
-
-    rc = libxl__ev_time_register_rel(ao, &lds->timeout,
-                                switch_logdirty_timeout, 10*1000);
-    if (rc) goto out;
-
-    for (;;) {
-        rc = libxl__xs_transaction_start(gc, &t);
-        if (rc) goto out;
-
-        rc = libxl__xs_read_checked(gc, t, lds->cmd_path, &got);
-        if (rc) goto out;
-
-        if (got) {
-            const char *got_ret;
-            rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got_ret);
-            if (rc) goto out;
-
-            if (!got_ret || strcmp(got, got_ret)) {
-                LOG(ERROR,"controlling logdirty: qemu was already sent"
-                    " command `%s' (xenstore path `%s') but result is `%s'",
-                    got, lds->cmd_path, got_ret ? got_ret : "<none>");
-                rc = ERROR_FAIL;
-                goto out;
-            }
-            rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
-            if (rc) goto out;
-        }
-
-        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
-        if (rc) goto out;
-
-        rc = libxl__xs_write_checked(gc, t, lds->cmd_path, lds->cmd);
-        if (rc) goto out;
-
-        rc = libxl__xs_transaction_commit(gc, &t);
-        if (!rc) break;
-        if (rc<0) goto out;
-    }
-
-    /* OK, wait for some callback */
-    return;
-
- out:
-    LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
-    libxl__xs_transaction_abort(gc, &t);
-    switch_logdirty_done(egc,dss,rc);
-}
-
-static void domain_suspend_switch_qemu_xen_logdirty
-                               (int domid, unsigned enable,
-                                libxl__save_helper_state *shs)
-{
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    STATE_AO_GC(dss->ao);
-    int rc;
-
-    rc = libxl__qmp_set_global_dirty_log(gc, domid, enable);
-    if (!rc) {
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
-    } else {
-        LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
-        dss->rc = rc;
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
-    }
-}
-
-void libxl__domain_suspend_common_switch_qemu_logdirty
-                               (int domid, unsigned enable, void *user)
-{
-    libxl__save_helper_state *shs = user;
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    STATE_AO_GC(dss->ao);
-
-    switch (libxl__device_model_version_running(gc, domid)) {
-    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
-        domain_suspend_switch_qemu_xen_traditional_logdirty(domid, enable, shs);
-        break;
-    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-        domain_suspend_switch_qemu_xen_logdirty(domid, enable, shs);
-        break;
-    case LIBXL_DEVICE_MODEL_VERSION_NONE:
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
-        break;
-    default:
-        LOG(ERROR,"logdirty switch failed"
-            ", no valid device model version found, abandoning suspend");
-        dss->rc = ERROR_FAIL;
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
-    }
-}
-static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                    const struct timeval *requested_abs,
-                                    int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
-    STATE_AO_GC(dss->ao);
-    LOG(ERROR,"logdirty switch: wait for device model timed out");
-    switch_logdirty_done(egc,dss,ERROR_FAIL);
-}
-
-static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
-                            const char *watch_path, const char *event_path)
-{
-    libxl__domain_suspend_state *dss =
-        CONTAINER_OF(watch, *dss, logdirty.watch);
-    libxl__logdirty_switch *lds = &dss->logdirty;
-    STATE_AO_GC(dss->ao);
-    const char *got;
-    xs_transaction_t t = 0;
-    int rc;
-
-    for (;;) {
-        rc = libxl__xs_transaction_start(gc, &t);
-        if (rc) goto out;
-
-        rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got);
-        if (rc) goto out;
-
-        if (!got) {
-            rc = +1;
-            goto out;
-        }
-
-        if (strcmp(got, lds->cmd)) {
-            LOG(ERROR,"logdirty switch: sent command `%s' but got reply `%s'"
-                " (xenstore paths `%s' / `%s')", lds->cmd, got,
-                lds->cmd_path, lds->ret_path);
-            rc = ERROR_FAIL;
-            goto out;
-        }
-
-        rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
-        if (rc) goto out;
-
-        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
-        if (rc) goto out;
-
-        rc = libxl__xs_transaction_commit(gc, &t);
-        if (!rc) break;
-        if (rc<0) goto out;
-    }
-
- out:
-    /* rc < 0: error
-     * rc == 0: ok, we are done
-     * rc == +1: need to keep waiting
-     */
-    libxl__xs_transaction_abort(gc, &t);
-
-    if (rc <= 0) {
-        if (rc < 0)
-            LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
-        switch_logdirty_done(egc,dss,rc);
-    }
-}
-
-static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss,
-                                 int rc)
-{
-    STATE_AO_GC(dss->ao);
-    libxl__logdirty_switch *lds = &dss->logdirty;
-
-    libxl__ev_xswatch_deregister(gc, &lds->watch);
-    libxl__ev_time_deregister(gc, &lds->timeout);
-
-    int broke;
-    if (rc) {
-        broke = -1;
-        dss->rc = rc;
-    } else {
-        broke = 0;
-    }
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, broke);
-}
-
-/*----- callbacks, called by xc_domain_save -----*/
-
-/*
- * Expand the buffer 'buf' of length 'len', to append 'str' including its NUL
- * terminator.
- */
-static void append_string(libxl__gc *gc, char **buf, uint32_t *len,
-                          const char *str)
-{
-    size_t extralen = strlen(str) + 1;
-    char *new = libxl__realloc(gc, *buf, *len + extralen);
-
-    *buf = new;
-    memcpy(new + *len, str, extralen);
-    *len += extralen;
-}
-
-int libxl__save_emulator_xenstore_data(libxl__domain_suspend_state *dss,
-                                       char **callee_buf,
-                                       uint32_t *callee_len)
-{
-    STATE_AO_GC(dss->ao);
-    const char *xs_root;
-    char **entries, *buf = NULL;
-    unsigned int nr_entries, i, j, len = 0;
-    int rc;
-
-    const uint32_t domid = dss->domid;
-    const uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
-
-    xs_root = libxl__device_model_xs_path(gc, dm_domid, domid, "");
-
-    entries = libxl__xs_directory(gc, 0, GCSPRINTF("%s/physmap", xs_root),
-                                  &nr_entries);
-    if (!entries || nr_entries == 0) { rc = 0; goto out; }
-
-    for (i = 0; i < nr_entries; ++i) {
-        static const char *const physmap_subkeys[] = {
-            "start_addr", "size", "name"
-        };
-
-        for (j = 0; j < ARRAY_SIZE(physmap_subkeys); ++j) {
-            const char *key = GCSPRINTF("physmap/%s/%s",
-                                        entries[i], physmap_subkeys[j]);
-
-            const char *val =
-                libxl__xs_read(gc, XBT_NULL,
-                               GCSPRINTF("%s/%s", xs_root, key));
-
-            if (!val) { rc = ERROR_FAIL; goto out; }
-
-            append_string(gc, &buf, &len, key);
-            append_string(gc, &buf, &len, val);
-        }
-    }
-
-    rc = 0;
-
- out:
-    if (!rc) {
-        *callee_buf = buf;
-        *callee_len = len;
-    }
-
-    return rc;
-}
-
-/*----- main code for saving, in order of execution -----*/
-
-void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
-{
-    STATE_AO_GC(dss->ao);
-    int port;
-    int rc, ret;
-
-    /* Convenience aliases */
-    const uint32_t domid = dss->domid;
-    const libxl_domain_type type = dss->type;
-    const int live = dss->live;
-    const int debug = dss->debug;
-    const libxl_domain_remus_info *const r_info = dss->remus;
-    libxl__srm_save_autogen_callbacks *const callbacks =
-        &dss->sws.shs.callbacks.save.a;
-    unsigned int nr_vnodes = 0, nr_vmemranges = 0, nr_vcpus = 0;
-
-    dss->rc = 0;
-    logdirty_init(&dss->logdirty);
-    libxl__xswait_init(&dss->pvcontrol);
-    libxl__ev_evtchn_init(&dss->guest_evtchn);
-    libxl__ev_xswatch_init(&dss->guest_watch);
-    libxl__ev_time_init(&dss->guest_timeout);
-
-    switch (type) {
-    case LIBXL_DOMAIN_TYPE_HVM: {
-        dss->hvm = 1;
-        break;
-    }
-    case LIBXL_DOMAIN_TYPE_PV:
-        dss->hvm = 0;
-        break;
-    default:
-        abort();
-    }
-
-    dss->xcflags = (live ? XCFLAGS_LIVE : 0)
-          | (debug ? XCFLAGS_DEBUG : 0)
-          | (dss->hvm ? XCFLAGS_HVM : 0);
-
-    /* Disallow saving a guest with vNUMA configured because migration
-     * stream does not preserve node information.
-     *
-     * Reject any domain which has vnuma enabled, even if the
-     * configuration is empty. Only domains which have no vnuma
-     * configuration at all are supported.
-     */
-    ret = xc_domain_getvnuma(CTX->xch, domid, &nr_vnodes, &nr_vmemranges,
-                             &nr_vcpus, NULL, NULL, NULL);
-    if (ret != -1 || errno != XEN_EOPNOTSUPP) {
-        LOG(ERROR, "Cannot save a guest with vNUMA configured");
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    dss->guest_evtchn.port = -1;
-    dss->guest_evtchn_lockfd = -1;
-    dss->guest_responded = 0;
-    dss->dm_savefile = libxl__device_model_savefile(gc, domid);
-
-    if (r_info != NULL) {
-        dss->interval = r_info->interval;
-        dss->xcflags |= XCFLAGS_CHECKPOINTED;
-        if (libxl_defbool_val(r_info->compression))
-            dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
-    }
-
-    port = xs_suspend_evtchn_port(dss->domid);
-
-    if (port >= 0) {
-        rc = libxl__ctx_evtchn_init(gc);
-        if (rc) goto out;
-
-        dss->guest_evtchn.port =
-            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
-                                  dss->domid, port, &dss->guest_evtchn_lockfd);
-
-        if (dss->guest_evtchn.port < 0) {
-            LOG(WARN, "Suspend event channel initialization failed");
-            rc = ERROR_FAIL;
-            goto out;
-        }
-    }
-
-    memset(callbacks, 0, sizeof(*callbacks));
-    if (r_info != NULL) {
-        callbacks->suspend = libxl__remus_domain_suspend_callback;
-        callbacks->postcopy = libxl__remus_domain_resume_callback;
-        callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
-    } else
-        callbacks->suspend = libxl__domain_suspend_callback;
-
-    callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty;
-
-    dss->sws.ao  = dss->ao;
-    dss->sws.dss = dss;
-    dss->sws.fd  = dss->fd;
-    dss->sws.completion_callback = stream_done;
-
-    libxl__stream_write_start(egc, &dss->sws);
-    return;
-
- out:
-    domain_save_done(egc, dss, rc);
-}
-
-static void stream_done(libxl__egc *egc,
-                        libxl__stream_write_state *sws, int rc)
-{
-    domain_save_done(egc, sws->dss, rc);
-}
-
-static void domain_save_done(libxl__egc *egc,
-                             libxl__domain_suspend_state *dss, int rc)
-{
-    STATE_AO_GC(dss->ao);
-
-    /* Convenience aliases */
-    const uint32_t domid = dss->domid;
-
-    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
-
-    if (dss->guest_evtchn.port > 0)
-        xc_suspend_evtchn_release(CTX->xch, CTX->xce, domid,
-                           dss->guest_evtchn.port, &dss->guest_evtchn_lockfd);
-
-    if (dss->remus) {
-        /*
-         * With Remus, if we reach this point, it means either
-         * backup died or some network error occurred preventing us
-         * from sending checkpoints. Teardown the network buffers and
-         * release netlink resources.  This is an async op.
-         */
-        libxl__remus_teardown(egc, dss, rc);
-        return;
-    }
-
-    dss->callback(egc, dss, rc);
-}
-
 /*==================== Miscellaneous ====================*/
 
 char *libxl__uuid2string(libxl__gc *gc, const libxl_uuid uuid)
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
new file mode 100644
index 0000000..27fd58b
--- /dev/null
+++ b/tools/libxl/libxl_dom_save.c
@@ -0,0 +1,543 @@
+/*
+ * Copyright (C) 2009      Citrix Ltd.
+ * Author Vincent Hanquez <vincent.hanquez@eu.citrix.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+#include <xen/errno.h>
+
+/*========================= Domain save ============================*/
+
+static void stream_done(libxl__egc *egc,
+                        libxl__stream_write_state *sws, int rc);
+static void domain_save_done(libxl__egc *egc,
+                             libxl__domain_suspend_state *dss, int rc);
+
+/*----- complicated callback, called by xc_domain_save -----*/
+
+/*
+ * We implement the other end of protocol for controlling qemu-dm's
+ * logdirty.  There is no documentation for this protocol, but our
+ * counterparty's implementation is in
+ * qemu-xen-traditional.git:xenstore.c in the function
+ * xenstore_process_logdirty_event
+ */
+
+static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
+                                    const struct timeval *requested_abs,
+                                    int rc);
+static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
+                            const char *watch_path, const char *event_path);
+static void switch_logdirty_done(libxl__egc *egc,
+                                 libxl__domain_suspend_state *dss, int rc);
+
+static void logdirty_init(libxl__logdirty_switch *lds)
+{
+    lds->cmd_path = 0;
+    libxl__ev_xswatch_init(&lds->watch);
+    libxl__ev_time_init(&lds->timeout);
+}
+
+static void domain_suspend_switch_qemu_xen_traditional_logdirty
+                               (int domid, unsigned enable,
+                                libxl__save_helper_state *shs)
+{
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__logdirty_switch *lds = &dss->logdirty;
+    STATE_AO_GC(dss->ao);
+    int rc;
+    xs_transaction_t t = 0;
+    const char *got;
+
+    if (!lds->cmd_path) {
+        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+        lds->cmd_path = libxl__device_model_xs_path(gc, dm_domid, domid,
+                                                    "/logdirty/cmd");
+        lds->ret_path = libxl__device_model_xs_path(gc, dm_domid, domid,
+                                                    "/logdirty/ret");
+    }
+    lds->cmd = enable ? "enable" : "disable";
+
+    rc = libxl__ev_xswatch_register(gc, &lds->watch,
+                                switch_logdirty_xswatch, lds->ret_path);
+    if (rc) goto out;
+
+    rc = libxl__ev_time_register_rel(ao, &lds->timeout,
+                                switch_logdirty_timeout, 10*1000);
+    if (rc) goto out;
+
+    for (;;) {
+        rc = libxl__xs_transaction_start(gc, &t);
+        if (rc) goto out;
+
+        rc = libxl__xs_read_checked(gc, t, lds->cmd_path, &got);
+        if (rc) goto out;
+
+        if (got) {
+            const char *got_ret;
+            rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got_ret);
+            if (rc) goto out;
+
+            if (!got_ret || strcmp(got, got_ret)) {
+                LOG(ERROR,"controlling logdirty: qemu was already sent"
+                    " command `%s' (xenstore path `%s') but result is `%s'",
+                    got, lds->cmd_path, got_ret ? got_ret : "<none>");
+                rc = ERROR_FAIL;
+                goto out;
+            }
+            rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
+            if (rc) goto out;
+        }
+
+        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
+        if (rc) goto out;
+
+        rc = libxl__xs_write_checked(gc, t, lds->cmd_path, lds->cmd);
+        if (rc) goto out;
+
+        rc = libxl__xs_transaction_commit(gc, &t);
+        if (!rc) break;
+        if (rc<0) goto out;
+    }
+
+    /* OK, wait for some callback */
+    return;
+
+ out:
+    LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
+    libxl__xs_transaction_abort(gc, &t);
+    switch_logdirty_done(egc,dss,rc);
+}
+
+static void domain_suspend_switch_qemu_xen_logdirty
+                               (int domid, unsigned enable,
+                                libxl__save_helper_state *shs)
+{
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    STATE_AO_GC(dss->ao);
+    int rc;
+
+    rc = libxl__qmp_set_global_dirty_log(gc, domid, enable);
+    if (!rc) {
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+    } else {
+        LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
+        dss->rc = rc;
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
+    }
+}
+
+void libxl__domain_suspend_common_switch_qemu_logdirty
+                               (int domid, unsigned enable, void *user)
+{
+    libxl__save_helper_state *shs = user;
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    STATE_AO_GC(dss->ao);
+
+    switch (libxl__device_model_version_running(gc, domid)) {
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+        domain_suspend_switch_qemu_xen_traditional_logdirty(domid, enable, shs);
+        break;
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+        domain_suspend_switch_qemu_xen_logdirty(domid, enable, shs);
+        break;
+    case LIBXL_DEVICE_MODEL_VERSION_NONE:
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+        break;
+    default:
+        LOG(ERROR,"logdirty switch failed"
+            ", no valid device model version found, abandoning suspend");
+        dss->rc = ERROR_FAIL;
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
+    }
+}
+static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
+                                    const struct timeval *requested_abs,
+                                    int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
+    STATE_AO_GC(dss->ao);
+    LOG(ERROR,"logdirty switch: wait for device model timed out");
+    switch_logdirty_done(egc,dss,ERROR_FAIL);
+}
+
+static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
+                            const char *watch_path, const char *event_path)
+{
+    libxl__domain_suspend_state *dss =
+        CONTAINER_OF(watch, *dss, logdirty.watch);
+    libxl__logdirty_switch *lds = &dss->logdirty;
+    STATE_AO_GC(dss->ao);
+    const char *got;
+    xs_transaction_t t = 0;
+    int rc;
+
+    for (;;) {
+        rc = libxl__xs_transaction_start(gc, &t);
+        if (rc) goto out;
+
+        rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got);
+        if (rc) goto out;
+
+        if (!got) {
+            rc = +1;
+            goto out;
+        }
+
+        if (strcmp(got, lds->cmd)) {
+            LOG(ERROR,"logdirty switch: sent command `%s' but got reply `%s'"
+                " (xenstore paths `%s' / `%s')", lds->cmd, got,
+                lds->cmd_path, lds->ret_path);
+            rc = ERROR_FAIL;
+            goto out;
+        }
+
+        rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
+        if (rc) goto out;
+
+        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
+        if (rc) goto out;
+
+        rc = libxl__xs_transaction_commit(gc, &t);
+        if (!rc) break;
+        if (rc<0) goto out;
+    }
+
+ out:
+    /* rc < 0: error
+     * rc == 0: ok, we are done
+     * rc == +1: need to keep waiting
+     */
+    libxl__xs_transaction_abort(gc, &t);
+
+    if (rc <= 0) {
+        if (rc < 0)
+            LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
+        switch_logdirty_done(egc,dss,rc);
+    }
+}
+
+static void switch_logdirty_done(libxl__egc *egc,
+                                 libxl__domain_suspend_state *dss,
+                                 int rc)
+{
+    STATE_AO_GC(dss->ao);
+    libxl__logdirty_switch *lds = &dss->logdirty;
+
+    libxl__ev_xswatch_deregister(gc, &lds->watch);
+    libxl__ev_time_deregister(gc, &lds->timeout);
+
+    int broke;
+    if (rc) {
+        broke = -1;
+        dss->rc = rc;
+    } else {
+        broke = 0;
+    }
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, broke);
+}
+
+/*----- callbacks, called by xc_domain_save -----*/
+
+/*
+ * Expand the buffer 'buf' of length 'len', to append 'str' including its NUL
+ * terminator.
+ */
+static void append_string(libxl__gc *gc, char **buf, uint32_t *len,
+                          const char *str)
+{
+    size_t extralen = strlen(str) + 1;
+    char *new = libxl__realloc(gc, *buf, *len + extralen);
+
+    *buf = new;
+    memcpy(new + *len, str, extralen);
+    *len += extralen;
+}
+
+int libxl__save_emulator_xenstore_data(libxl__domain_suspend_state *dss,
+                                       char **callee_buf,
+                                       uint32_t *callee_len)
+{
+    STATE_AO_GC(dss->ao);
+    const char *xs_root;
+    char **entries, *buf = NULL;
+    unsigned int nr_entries, i, j, len = 0;
+    int rc;
+
+    const uint32_t domid = dss->domid;
+    const uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+
+    xs_root = libxl__device_model_xs_path(gc, dm_domid, domid, "");
+
+    entries = libxl__xs_directory(gc, 0, GCSPRINTF("%s/physmap", xs_root),
+                                  &nr_entries);
+    if (!entries || nr_entries == 0) { rc = 0; goto out; }
+
+    for (i = 0; i < nr_entries; ++i) {
+        static const char *const physmap_subkeys[] = {
+            "start_addr", "size", "name"
+        };
+
+        for (j = 0; j < ARRAY_SIZE(physmap_subkeys); ++j) {
+            const char *key = GCSPRINTF("physmap/%s/%s",
+                                        entries[i], physmap_subkeys[j]);
+
+            const char *val =
+                libxl__xs_read(gc, XBT_NULL,
+                               GCSPRINTF("%s/%s", xs_root, key));
+
+            if (!val) { rc = ERROR_FAIL; goto out; }
+
+            append_string(gc, &buf, &len, key);
+            append_string(gc, &buf, &len, val);
+        }
+    }
+
+    rc = 0;
+
+ out:
+    if (!rc) {
+        *callee_buf = buf;
+        *callee_len = len;
+    }
+
+    return rc;
+}
+
+/*----- main code for saving, in order of execution -----*/
+
+void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
+{
+    STATE_AO_GC(dss->ao);
+    int port;
+    int rc, ret;
+
+    /* Convenience aliases */
+    const uint32_t domid = dss->domid;
+    const libxl_domain_type type = dss->type;
+    const int live = dss->live;
+    const int debug = dss->debug;
+    const libxl_domain_remus_info *const r_info = dss->remus;
+    libxl__srm_save_autogen_callbacks *const callbacks =
+        &dss->sws.shs.callbacks.save.a;
+    unsigned int nr_vnodes = 0, nr_vmemranges = 0, nr_vcpus = 0;
+
+    dss->rc = 0;
+    logdirty_init(&dss->logdirty);
+    libxl__xswait_init(&dss->pvcontrol);
+    libxl__ev_evtchn_init(&dss->guest_evtchn);
+    libxl__ev_xswatch_init(&dss->guest_watch);
+    libxl__ev_time_init(&dss->guest_timeout);
+
+    switch (type) {
+    case LIBXL_DOMAIN_TYPE_HVM: {
+        dss->hvm = 1;
+        break;
+    }
+    case LIBXL_DOMAIN_TYPE_PV:
+        dss->hvm = 0;
+        break;
+    default:
+        abort();
+    }
+
+    dss->xcflags = (live ? XCFLAGS_LIVE : 0)
+          | (debug ? XCFLAGS_DEBUG : 0)
+          | (dss->hvm ? XCFLAGS_HVM : 0);
+
+    /* Disallow saving a guest with vNUMA configured because migration
+     * stream does not preserve node information.
+     *
+     * Reject any domain which has vnuma enabled, even if the
+     * configuration is empty. Only domains which have no vnuma
+     * configuration at all are supported.
+     */
+    ret = xc_domain_getvnuma(CTX->xch, domid, &nr_vnodes, &nr_vmemranges,
+                             &nr_vcpus, NULL, NULL, NULL);
+    if (ret != -1 || errno != XEN_EOPNOTSUPP) {
+        LOG(ERROR, "Cannot save a guest with vNUMA configured");
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    dss->guest_evtchn.port = -1;
+    dss->guest_evtchn_lockfd = -1;
+    dss->guest_responded = 0;
+    dss->dm_savefile = libxl__device_model_savefile(gc, domid);
+
+    if (r_info != NULL) {
+        dss->interval = r_info->interval;
+        dss->xcflags |= XCFLAGS_CHECKPOINTED;
+        if (libxl_defbool_val(r_info->compression))
+            dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
+    }
+
+    port = xs_suspend_evtchn_port(dss->domid);
+
+    if (port >= 0) {
+        rc = libxl__ctx_evtchn_init(gc);
+        if (rc) goto out;
+
+        dss->guest_evtchn.port =
+            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
+                                  dss->domid, port, &dss->guest_evtchn_lockfd);
+
+        if (dss->guest_evtchn.port < 0) {
+            LOG(WARN, "Suspend event channel initialization failed");
+            rc = ERROR_FAIL;
+            goto out;
+        }
+    }
+
+    memset(callbacks, 0, sizeof(*callbacks));
+    if (r_info != NULL) {
+        callbacks->suspend = libxl__remus_domain_suspend_callback;
+        callbacks->postcopy = libxl__remus_domain_resume_callback;
+        callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
+    } else
+        callbacks->suspend = libxl__domain_suspend_callback;
+
+    callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty;
+
+    dss->sws.ao  = dss->ao;
+    dss->sws.dss = dss;
+    dss->sws.fd  = dss->fd;
+    dss->sws.completion_callback = stream_done;
+
+    libxl__stream_write_start(egc, &dss->sws);
+    return;
+
+ out:
+    domain_save_done(egc, dss, rc);
+}
+
+static void stream_done(libxl__egc *egc,
+                        libxl__stream_write_state *sws, int rc)
+{
+    domain_save_done(egc, sws->dss, rc);
+}
+
+static void domain_save_done(libxl__egc *egc,
+                             libxl__domain_suspend_state *dss, int rc)
+{
+    STATE_AO_GC(dss->ao);
+
+    /* Convenience aliases */
+    const uint32_t domid = dss->domid;
+
+    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
+
+    if (dss->guest_evtchn.port > 0)
+        xc_suspend_evtchn_release(CTX->xch, CTX->xce, domid,
+                           dss->guest_evtchn.port, &dss->guest_evtchn_lockfd);
+
+    if (dss->remus) {
+        /*
+         * With Remus, if we reach this point, it means either
+         * backup died or some network error occurred preventing us
+         * from sending checkpoints. Teardown the network buffers and
+         * release netlink resources.  This is an async op.
+         */
+        libxl__remus_teardown(egc, dss, rc);
+        return;
+    }
+
+    dss->callback(egc, dss, rc);
+}
+
+/*========================= Domain restore ============================*/
+
+/*
+ * Inspect the buffer between start and end, and return a pointer to the
+ * character following the NUL terminator of start, or NULL if start is not
+ * terminated before end.
+ */
+static const char *next_string(const char *start, const char *end)
+{
+    if (start >= end) return NULL;
+
+    size_t total_len = end - start;
+    size_t len = strnlen(start, total_len);
+
+    if (len == total_len)
+        return NULL;
+    else
+        return start + len + 1;
+}
+
+int libxl__restore_emulator_xenstore_data(libxl__domain_create_state *dcs,
+                                          const char *ptr, uint32_t size)
+{
+    STATE_AO_GC(dcs->ao);
+    const char *next = ptr, *end = ptr + size, *key, *val;
+    int rc;
+
+    const uint32_t domid = dcs->guest_domid;
+    const uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+    const char *xs_root = libxl__device_model_xs_path(gc, dm_domid, domid, "");
+
+    while (next < end) {
+        key = next;
+        next = next_string(next, end);
+
+        /* Sanitise 'key'. */
+        if (!next) {
+            rc = ERROR_FAIL;
+            LOG(ERROR, "Key in xenstore data not NUL terminated");
+            goto out;
+        }
+        if (key[0] == '\0') {
+            rc = ERROR_FAIL;
+            LOG(ERROR, "empty key found in xenstore data");
+            goto out;
+        }
+        if (key[0] == '/') {
+            rc = ERROR_FAIL;
+            LOG(ERROR, "Key in xenstore data not relative");
+            goto out;
+        }
+
+        val = next;
+        next = next_string(next, end);
+
+        /* Sanitise 'val'. */
+        if (!next) {
+            rc = ERROR_FAIL;
+            LOG(ERROR, "Val in xenstore data not NUL terminated");
+            goto out;
+        }
+
+        libxl__xs_printf(gc, XBT_NULL,
+                         GCSPRINTF("%s/%s", xs_root, key),
+                         "%s", val);
+    }
+
+    rc = 0;
+
+ out:
+    return rc;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.5.0

* [PATCH v6 04/18] libxl/save: Refactor libxl__domain_suspend_state
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (2 preceding siblings ...)
  2015-12-30  2:28 ` [PATCH v6 03/18] tools/libxl: move save/restore code into libxl_dom_save.c Wen Congyang
@ 2015-12-30  2:28 ` Wen Congyang
  2016-01-25 17:29   ` Konrad Rzeszutek Wilk
  2015-12-30  2:28 ` [PATCH v6 05/18] tools/libxc: support to resume uncooperative HVM guests Wen Congyang
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:28 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Ian Campbell, Wen Congyang,
	Gui Jianfeng, Jiang Yunhong, Dong Eddie, Shriram Rajagopalan,
	Ian Jackson, Yang Hongyang

Currently struct libxl__domain_suspend_state contains two kinds of state:
save state and suspend state. This patch separates the two.
The motivation is that COLO needs to suspend and resume the guest
continuously, so a more general suspend state is required.

After this change, dss stands for libxl__domain_save_state and
dsps stands for libxl__domain_suspend_state.

Also introduce libxl__domain_suspend_init to initialise the
libxl__domain_suspend_state.
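
For illustration only, the resulting layout can be sketched roughly as
below. This is a simplified sketch, not the code in this patch: the
sketch_* types, fields and helpers are hypothetical stand-ins for
libxl__domain_save_state, libxl__domain_suspend_state and
libxl__domain_suspend_init, and SKETCH_CONTAINER_OF is a plain
offsetof-based stand-in for libxl's CONTAINER_OF. It assumes the suspend
state is embedded by value in the save state (as dss->dsps is in the
diff), so code holding only a dsps could recover the enclosing dss.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for libxl's CONTAINER_OF: recover the enclosing struct from a
 * pointer to one of its members. */
#define SKETCH_CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down suspend state ("dsps"). */
typedef struct sketch_suspend_state {
    uint32_t domid;
    int hvm;
    int guest_evtchn_port;
    int guest_evtchn_lockfd;
    int guest_responded;
} sketch_suspend_state;

/* Hypothetical, trimmed-down save state ("dss"), embedding dsps by value. */
typedef struct sketch_save_state {
    uint32_t domid;
    int rc;
    sketch_suspend_state dsps;
} sketch_save_state;

/* Put the suspend-only fields into a known "nothing set up yet" state.
 * Returns 0 on success, mirroring the rc convention used in the diff. */
static int sketch_suspend_init(sketch_suspend_state *dsps,
                               uint32_t domid, int hvm)
{
    assert(dsps != NULL);
    dsps->domid = domid;
    dsps->hvm = hvm;
    dsps->guest_evtchn_port = -1;
    dsps->guest_evtchn_lockfd = -1;
    dsps->guest_responded = 0;
    return 0;
}

/* Recover the enclosing save state from an embedded suspend state.
 * Only valid when dsps really is the member of a sketch_save_state. */
static sketch_save_state *sketch_save_from_suspend(sketch_suspend_state *dsps)
{
    return SKETCH_CONTAINER_OF(dsps, sketch_save_state, dsps);
}
```

A save-side caller would fill in the outer state, initialise &dss.dsps,
and pass only the dsps pointer to the suspend helpers. A user that needs
suspend/resume without a save stream could allocate the suspend state on
its own, in which case the container lookup must not be used.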

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl.c              |  10 +-
 tools/libxl/libxl_create.c       |  10 +-
 tools/libxl/libxl_dom_save.c     |  61 ++++-------
 tools/libxl/libxl_dom_suspend.c  | 217 +++++++++++++++++++++++++--------------
 tools/libxl/libxl_internal.h     |  60 +++++++----
 tools/libxl/libxl_netbuffer.c    |   2 +-
 tools/libxl/libxl_remus.c        |  37 ++++---
 tools/libxl/libxl_save_callout.c |   2 +-
 tools/libxl/libxl_stream_write.c |  16 +--
 9 files changed, 236 insertions(+), 179 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index bdd8ad0..f8f7158 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -833,7 +833,7 @@ out:
 }
 
 static void remus_failover_cb(libxl__egc *egc,
-                              libxl__domain_suspend_state *dss, int rc);
+                              libxl__domain_save_state *dss, int rc);
 
 /* TODO: Explicit Checkpoint acknowledgements via recv_fd. */
 int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
@@ -841,7 +841,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
                              const libxl_asyncop_how *ao_how)
 {
     AO_CREATE(ctx, domid, ao_how);
-    libxl__domain_suspend_state *dss;
+    libxl__domain_save_state *dss;
     int rc;
 
     libxl_domain_type type = libxl__domain_type(gc, domid);
@@ -889,7 +889,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
 }
 
 static void remus_failover_cb(libxl__egc *egc,
-                              libxl__domain_suspend_state *dss, int rc)
+                              libxl__domain_save_state *dss, int rc)
 {
     STATE_AO_GC(dss->ao);
     /*
@@ -901,7 +901,7 @@ static void remus_failover_cb(libxl__egc *egc,
 }
 
 static void domain_suspend_cb(libxl__egc *egc,
-                              libxl__domain_suspend_state *dss, int rc)
+                              libxl__domain_save_state *dss, int rc)
 {
     STATE_AO_GC(dss->ao);
     int flrc;
@@ -926,7 +926,7 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
         goto out_err;
     }
 
-    libxl__domain_suspend_state *dss;
+    libxl__domain_save_state *dss;
     GCNEW(dss);
 
     dss->ao = ao;
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 0ee9a57..bfa0552 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1540,7 +1540,7 @@ typedef struct {
 typedef struct {
     libxl__app_domain_create_state cdcs;
     libxl__domain_destroy_state dds;
-    libxl__domain_suspend_state dss;
+    libxl__domain_save_state dss;
     char *toolstack_buf;
     uint32_t toolstack_len;
 } libxl__domain_soft_reset_state;
@@ -1635,7 +1635,7 @@ static int do_domain_soft_reset(libxl_ctx *ctx,
     libxl__app_domain_create_state *cdcs;
     libxl__domain_create_state *dcs;
     libxl__domain_build_state *state;
-    libxl__domain_suspend_state *dss;
+    libxl__domain_save_state *dss;
     char *dom_path, *xs_store_mfn, *xs_console_mfn;
     uint32_t domid_out;
     int rc;
@@ -1679,8 +1679,8 @@ static int do_domain_soft_reset(libxl_ctx *ctx,
 
     dss->ao = ao;
     dss->domid = domid_soft_reset;
-    dss->dm_savefile = GCSPRINTF(LIBXL_DEVICE_MODEL_SAVE_FILE".%d",
-                                 domid_soft_reset);
+    dss->dsps.dm_savefile = GCSPRINTF(LIBXL_DEVICE_MODEL_SAVE_FILE".%d",
+                                      domid_soft_reset);
 
     rc = libxl__save_emulator_xenstore_data(dss, &srs->toolstack_buf,
                                             &srs->toolstack_len);
@@ -1689,7 +1689,7 @@ static int do_domain_soft_reset(libxl_ctx *ctx,
         goto out;
     }
 
-    rc = libxl__domain_suspend_device_model(gc, dss);
+    rc = libxl__domain_suspend_device_model(gc, &dss->dsps);
     if (rc) {
         LOG(ERROR, "failed to suspend device model.");
         goto out;
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 27fd58b..cc8cabe 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -24,7 +24,7 @@
 static void stream_done(libxl__egc *egc,
                         libxl__stream_write_state *sws, int rc);
 static void domain_save_done(libxl__egc *egc,
-                             libxl__domain_suspend_state *dss, int rc);
+                             libxl__domain_save_state *dss, int rc);
 
 /*----- complicated callback, called by xc_domain_save -----*/
 
@@ -42,7 +42,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
                             const char *watch_path, const char *event_path);
 static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss, int rc);
+                                 libxl__domain_save_state *dss, int rc);
 
 static void logdirty_init(libxl__logdirty_switch *lds)
 {
@@ -56,7 +56,7 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty
                                 libxl__save_helper_state *shs)
 {
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
     libxl__logdirty_switch *lds = &dss->logdirty;
     STATE_AO_GC(dss->ao);
     int rc;
@@ -128,7 +128,7 @@ static void domain_suspend_switch_qemu_xen_logdirty
                                 libxl__save_helper_state *shs)
 {
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
     STATE_AO_GC(dss->ao);
     int rc;
 
@@ -147,7 +147,7 @@ void libxl__domain_suspend_common_switch_qemu_logdirty
 {
     libxl__save_helper_state *shs = user;
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
     STATE_AO_GC(dss->ao);
 
     switch (libxl__device_model_version_running(gc, domid)) {
@@ -171,7 +171,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
                                     const struct timeval *requested_abs,
                                     int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
+    libxl__domain_save_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
     STATE_AO_GC(dss->ao);
     LOG(ERROR,"logdirty switch: wait for device model timed out");
     switch_logdirty_done(egc,dss,ERROR_FAIL);
@@ -180,7 +180,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
                             const char *watch_path, const char *event_path)
 {
-    libxl__domain_suspend_state *dss =
+    libxl__domain_save_state *dss =
         CONTAINER_OF(watch, *dss, logdirty.watch);
     libxl__logdirty_switch *lds = &dss->logdirty;
     STATE_AO_GC(dss->ao);
@@ -234,7 +234,7 @@ static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
 }
 
 static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss,
+                                 libxl__domain_save_state *dss,
                                  int rc)
 {
     STATE_AO_GC(dss->ao);
@@ -270,7 +270,7 @@ static void append_string(libxl__gc *gc, char **buf, uint32_t *len,
     *len += extralen;
 }
 
-int libxl__save_emulator_xenstore_data(libxl__domain_suspend_state *dss,
+int libxl__save_emulator_xenstore_data(libxl__domain_save_state *dss,
                                        char **callee_buf,
                                        uint32_t *callee_len)
 {
@@ -322,10 +322,9 @@ int libxl__save_emulator_xenstore_data(libxl__domain_suspend_state *dss,
 
 /*----- main code for saving, in order of execution -----*/
 
-void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
+void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
 {
     STATE_AO_GC(dss->ao);
-    int port;
     int rc, ret;
 
     /* Convenience aliases */
@@ -337,13 +336,14 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
     libxl__srm_save_autogen_callbacks *const callbacks =
         &dss->sws.shs.callbacks.save.a;
     unsigned int nr_vnodes = 0, nr_vmemranges = 0, nr_vcpus = 0;
+    libxl__domain_suspend_state *dsps = &dss->dsps;
 
     dss->rc = 0;
     logdirty_init(&dss->logdirty);
-    libxl__xswait_init(&dss->pvcontrol);
-    libxl__ev_evtchn_init(&dss->guest_evtchn);
-    libxl__ev_xswatch_init(&dss->guest_watch);
-    libxl__ev_time_init(&dss->guest_timeout);
+    dsps->ao = ao;
+    dsps->domid = domid;
+    rc = libxl__domain_suspend_init(egc, dsps);
+    if (rc) goto out;
 
     switch (type) {
     case LIBXL_DOMAIN_TYPE_HVM: {
@@ -376,11 +376,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
         goto out;
     }
 
-    dss->guest_evtchn.port = -1;
-    dss->guest_evtchn_lockfd = -1;
-    dss->guest_responded = 0;
-    dss->dm_savefile = libxl__device_model_savefile(gc, domid);
-
     if (r_info != NULL) {
         dss->interval = r_info->interval;
         dss->xcflags |= XCFLAGS_CHECKPOINTED;
@@ -388,23 +383,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
             dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
     }
 
-    port = xs_suspend_evtchn_port(dss->domid);
-
-    if (port >= 0) {
-        rc = libxl__ctx_evtchn_init(gc);
-        if (rc) goto out;
-
-        dss->guest_evtchn.port =
-            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
-                                  dss->domid, port, &dss->guest_evtchn_lockfd);
-
-        if (dss->guest_evtchn.port < 0) {
-            LOG(WARN, "Suspend event channel initialization failed");
-            rc = ERROR_FAIL;
-            goto out;
-        }
-    }
-
     memset(callbacks, 0, sizeof(*callbacks));
     if (r_info != NULL) {
         callbacks->suspend = libxl__remus_domain_suspend_callback;
@@ -434,18 +412,19 @@ static void stream_done(libxl__egc *egc,
 }
 
 static void domain_save_done(libxl__egc *egc,
-                             libxl__domain_suspend_state *dss, int rc)
+                             libxl__domain_save_state *dss, int rc)
 {
     STATE_AO_GC(dss->ao);
 
     /* Convenience aliases */
     const uint32_t domid = dss->domid;
+    libxl__domain_suspend_state *dsps = &dss->dsps;
 
-    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
+    libxl__ev_evtchn_cancel(gc, &dsps->guest_evtchn);
 
-    if (dss->guest_evtchn.port > 0)
+    if (dsps->guest_evtchn.port > 0)
         xc_suspend_evtchn_release(CTX->xch, CTX->xce, domid,
-                           dss->guest_evtchn.port, &dss->guest_evtchn_lockfd);
+                        dsps->guest_evtchn.port, &dsps->guest_evtchn_lockfd);
 
     if (dss->remus) {
         /*
diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c
index 3313ad1..f8185fb 100644
--- a/tools/libxl/libxl_dom_suspend.c
+++ b/tools/libxl/libxl_dom_suspend.c
@@ -19,14 +19,71 @@
 
 /*====================== Domain suspend =======================*/
 
+int libxl__domain_suspend_init(libxl__egc *egc,
+                               libxl__domain_suspend_state *dsps)
+{
+    STATE_AO_GC(dsps->ao);
+    int rc = ERROR_FAIL;
+    int port;
+    libxl_domain_type type;
+
+    /* Convenience aliases */
+    const uint32_t domid = dsps->domid;
+
+    type = libxl__domain_type(gc, domid);
+    switch (type) {
+    case LIBXL_DOMAIN_TYPE_HVM: {
+        dsps->hvm = 1;
+        break;
+    }
+    case LIBXL_DOMAIN_TYPE_PV:
+        dsps->hvm = 0;
+        break;
+    default:
+        goto out;
+    }
+
+    libxl__xswait_init(&dsps->pvcontrol);
+    libxl__ev_evtchn_init(&dsps->guest_evtchn);
+    libxl__ev_xswatch_init(&dsps->guest_watch);
+    libxl__ev_time_init(&dsps->guest_timeout);
+
+    dsps->guest_evtchn.port = -1;
+    dsps->guest_evtchn_lockfd = -1;
+    dsps->guest_responded = 0;
+    dsps->dm_savefile = libxl__device_model_savefile(gc, domid);
+
+    port = xs_suspend_evtchn_port(domid);
+
+    if (port >= 0) {
+        rc = libxl__ctx_evtchn_init(gc);
+        if (rc) goto out;
+
+        dsps->guest_evtchn.port =
+            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
+                                    domid, port, &dsps->guest_evtchn_lockfd);
+
+        if (dsps->guest_evtchn.port < 0) {
+            LOG(WARN, "Suspend event channel initialization failed");
+            rc = ERROR_FAIL;
+            goto out;
+        }
+    }
+
+    rc = 0;
+
+out:
+    return rc;
+}
+
 /*----- callbacks, called by xc_domain_save -----*/
 
 int libxl__domain_suspend_device_model(libxl__gc *gc,
-                                       libxl__domain_suspend_state *dss)
+                                       libxl__domain_suspend_state *dsps)
 {
     int ret = 0;
-    uint32_t const domid = dss->domid;
-    const char *const filename = dss->dm_savefile;
+    uint32_t const domid = dsps->domid;
+    const char *const filename = dsps->dm_savefile;
 
     switch (libxl__device_model_version_running(gc, domid)) {
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
@@ -53,9 +110,9 @@ int libxl__domain_suspend_device_model(libxl__gc *gc,
 }
 
 static void domain_suspend_common_wait_guest(libxl__egc *egc,
-                                             libxl__domain_suspend_state *dss);
+                                             libxl__domain_suspend_state *dsps);
 static void domain_suspend_common_guest_suspended(libxl__egc *egc,
-                                         libxl__domain_suspend_state *dss);
+                                         libxl__domain_suspend_state *dsps);
 
 static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
       libxl__xswait_state *xswa, int rc, const char *state);
@@ -64,24 +121,24 @@ static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
 static void suspend_common_wait_guest_watch(libxl__egc *egc,
       libxl__ev_xswatch *xsw, const char *watch_path, const char *event_path);
 static void suspend_common_wait_guest_check(libxl__egc *egc,
-        libxl__domain_suspend_state *dss);
+        libxl__domain_suspend_state *dsps);
 static void suspend_common_wait_guest_timeout(libxl__egc *egc,
       libxl__ev_time *ev, const struct timeval *requested_abs, int rc);
 
 static void domain_suspend_common_done(libxl__egc *egc,
-                                       libxl__domain_suspend_state *dss,
+                                       libxl__domain_suspend_state *dsps,
                                        int rc);
 
 static void domain_suspend_callback_common(libxl__egc *egc,
-                                           libxl__domain_suspend_state *dss);
+                                           libxl__domain_suspend_state *dsps);
 static void domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int rc);
+                                libxl__domain_suspend_state *dsps, int rc);
 
-/* calls dss->callback_common_done when done */
+/* calls dsps->callback_common_done when done */
 void libxl__domain_suspend(libxl__egc *egc,
-                           libxl__domain_suspend_state *dss)
+                           libxl__domain_suspend_state *dsps)
 {
-    domain_suspend_callback_common(egc, dss);
+    domain_suspend_callback_common(egc, dsps);
 }
 
 static bool domain_suspend_pvcontrol_acked(const char *state) {
@@ -90,37 +147,37 @@ static bool domain_suspend_pvcontrol_acked(const char *state) {
     return strcmp(state,"suspend");
 }
 
-/* calls dss->callback_common_done when done */
+/* calls dsps->callback_common_done when done */
 static void domain_suspend_callback_common(libxl__egc *egc,
-                                           libxl__domain_suspend_state *dss)
+                                           libxl__domain_suspend_state *dsps)
 {
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(dsps->ao);
     uint64_t hvm_s_state = 0, hvm_pvdrv = 0;
     int ret, rc;
 
     /* Convenience aliases */
-    const uint32_t domid = dss->domid;
+    const uint32_t domid = dsps->domid;
 
-    if (dss->hvm) {
+    if (dsps->hvm) {
         xc_hvm_param_get(CTX->xch, domid, HVM_PARAM_CALLBACK_IRQ, &hvm_pvdrv);
         xc_hvm_param_get(CTX->xch, domid, HVM_PARAM_ACPI_S_STATE, &hvm_s_state);
     }
 
-    if ((hvm_s_state == 0) && (dss->guest_evtchn.port >= 0)) {
+    if ((hvm_s_state == 0) && (dsps->guest_evtchn.port >= 0)) {
         LOG(DEBUG, "issuing %s suspend request via event channel",
-            dss->hvm ? "PVHVM" : "PV");
-        ret = xc_evtchn_notify(CTX->xce, dss->guest_evtchn.port);
+            dsps->hvm ? "PVHVM" : "PV");
+        ret = xc_evtchn_notify(CTX->xce, dsps->guest_evtchn.port);
         if (ret < 0) {
             LOG(ERROR, "xc_evtchn_notify failed ret=%d", ret);
             rc = ERROR_FAIL;
             goto err;
         }
 
-        dss->guest_evtchn.callback = domain_suspend_common_wait_guest_evtchn;
-        rc = libxl__ev_evtchn_wait(gc, &dss->guest_evtchn);
+        dsps->guest_evtchn.callback = domain_suspend_common_wait_guest_evtchn;
+        rc = libxl__ev_evtchn_wait(gc, &dsps->guest_evtchn);
         if (rc) goto err;
 
-        rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout,
+        rc = libxl__ev_time_register_rel(ao, &dsps->guest_timeout,
                                          suspend_common_wait_guest_timeout,
                                          60*1000);
         if (rc) goto err;
@@ -128,7 +185,7 @@ static void domain_suspend_callback_common(libxl__egc *egc,
         return;
     }
 
-    if (dss->hvm && (!hvm_pvdrv || hvm_s_state)) {
+    if (dsps->hvm && (!hvm_pvdrv || hvm_s_state)) {
         LOG(DEBUG, "Calling xc_domain_shutdown on HVM domain");
         ret = xc_domain_shutdown(CTX->xch, domid, SHUTDOWN_suspend);
         if (ret < 0) {
@@ -137,55 +194,55 @@ static void domain_suspend_callback_common(libxl__egc *egc,
             goto err;
         }
         /* The guest does not (need to) respond to this sort of request. */
-        dss->guest_responded = 1;
-        domain_suspend_common_wait_guest(egc, dss);
+        dsps->guest_responded = 1;
+        domain_suspend_common_wait_guest(egc, dsps);
         return;
     }
 
     LOG(DEBUG, "issuing %s suspend request via XenBus control node",
-        dss->hvm ? "PVHVM" : "PV");
+        dsps->hvm ? "PVHVM" : "PV");
 
     libxl__domain_pvcontrol_write(gc, XBT_NULL, domid, "suspend");
 
-    dss->pvcontrol.path = libxl__domain_pvcontrol_xspath(gc, domid);
-    if (!dss->pvcontrol.path) { rc = ERROR_FAIL; goto err; }
+    dsps->pvcontrol.path = libxl__domain_pvcontrol_xspath(gc, domid);
+    if (!dsps->pvcontrol.path) { rc = ERROR_FAIL; goto err; }
 
-    dss->pvcontrol.ao = ao;
-    dss->pvcontrol.what = "guest acknowledgement of suspend request";
-    dss->pvcontrol.timeout_ms = 60 * 1000;
-    dss->pvcontrol.callback = domain_suspend_common_pvcontrol_suspending;
-    libxl__xswait_start(gc, &dss->pvcontrol);
+    dsps->pvcontrol.ao = ao;
+    dsps->pvcontrol.what = "guest acknowledgement of suspend request";
+    dsps->pvcontrol.timeout_ms = 60 * 1000;
+    dsps->pvcontrol.callback = domain_suspend_common_pvcontrol_suspending;
+    libxl__xswait_start(gc, &dsps->pvcontrol);
     return;
 
  err:
-    domain_suspend_common_done(egc, dss, rc);
+    domain_suspend_common_done(egc, dsps, rc);
 }
 
 static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
         libxl__ev_evtchn *evev)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(evev, *dss, guest_evtchn);
-    STATE_AO_GC(dss->ao);
+    libxl__domain_suspend_state *dsps = CONTAINER_OF(evev, *dsps, guest_evtchn);
+    STATE_AO_GC(dsps->ao);
     /* If we should be done waiting, suspend_common_wait_guest_check
      * will end up calling domain_suspend_common_guest_suspended or
      * domain_suspend_common_done, both of which cancel the evtchn
      * wait as needed.  So re-enable it now. */
-    libxl__ev_evtchn_wait(gc, &dss->guest_evtchn);
-    suspend_common_wait_guest_check(egc, dss);
+    libxl__ev_evtchn_wait(gc, &dsps->guest_evtchn);
+    suspend_common_wait_guest_check(egc, dsps);
 }
 
 static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
       libxl__xswait_state *xswa, int rc, const char *state)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(xswa, *dss, pvcontrol);
-    STATE_AO_GC(dss->ao);
+    libxl__domain_suspend_state *dsps = CONTAINER_OF(xswa, *dsps, pvcontrol);
+    STATE_AO_GC(dsps->ao);
     xs_transaction_t t = 0;
 
     if (!rc && !domain_suspend_pvcontrol_acked(state))
         /* keep waiting */
         return;
 
-    libxl__xswait_stop(gc, &dss->pvcontrol);
+    libxl__xswait_stop(gc, &dsps->pvcontrol);
 
     if (rc == ERROR_TIMEDOUT) {
         /*
@@ -228,56 +285,56 @@ static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
     LOG(DEBUG, "guest acknowledged suspend request");
 
     libxl__xs_transaction_abort(gc, &t);
-    dss->guest_responded = 1;
-    domain_suspend_common_wait_guest(egc,dss);
+    dsps->guest_responded = 1;
+    domain_suspend_common_wait_guest(egc,dsps);
     return;
 
  err:
     libxl__xs_transaction_abort(gc, &t);
-    domain_suspend_common_done(egc, dss, rc);
+    domain_suspend_common_done(egc, dsps, rc);
     return;
 }
 
 static void domain_suspend_common_wait_guest(libxl__egc *egc,
-                                             libxl__domain_suspend_state *dss)
+                                             libxl__domain_suspend_state *dsps)
 {
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(dsps->ao);
     int rc;
 
     LOG(DEBUG, "wait for the guest to suspend");
 
-    rc = libxl__ev_xswatch_register(gc, &dss->guest_watch,
+    rc = libxl__ev_xswatch_register(gc, &dsps->guest_watch,
                                     suspend_common_wait_guest_watch,
                                     "@releaseDomain");
     if (rc) goto err;
 
-    rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout,
+    rc = libxl__ev_time_register_rel(ao, &dsps->guest_timeout,
                                      suspend_common_wait_guest_timeout,
                                      60*1000);
     if (rc) goto err;
     return;
 
  err:
-    domain_suspend_common_done(egc, dss, rc);
+    domain_suspend_common_done(egc, dsps, rc);
 }
 
 static void suspend_common_wait_guest_watch(libxl__egc *egc,
       libxl__ev_xswatch *xsw, const char *watch_path, const char *event_path)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(xsw, *dss, guest_watch);
-    suspend_common_wait_guest_check(egc, dss);
+    libxl__domain_suspend_state *dsps = CONTAINER_OF(xsw, *dsps, guest_watch);
+    suspend_common_wait_guest_check(egc, dsps);
 }
 
 static void suspend_common_wait_guest_check(libxl__egc *egc,
-        libxl__domain_suspend_state *dss)
+        libxl__domain_suspend_state *dsps)
 {
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(dsps->ao);
     xc_domaininfo_t info;
     int ret;
     int shutdown_reason;
 
     /* Convenience aliases */
-    const uint32_t domid = dss->domid;
+    const uint32_t domid = dsps->domid;
 
     ret = xc_domain_getinfolist(CTX->xch, domid, 1, &info);
     if (ret < 0) {
@@ -304,71 +361,73 @@ static void suspend_common_wait_guest_check(libxl__egc *egc,
     }
 
     LOG(DEBUG, "guest has suspended");
-    domain_suspend_common_guest_suspended(egc, dss);
+    domain_suspend_common_guest_suspended(egc, dsps);
     return;
 
  err:
-    domain_suspend_common_done(egc, dss, ERROR_FAIL);
+    domain_suspend_common_done(egc, dsps, ERROR_FAIL);
 }
 
 static void suspend_common_wait_guest_timeout(libxl__egc *egc,
       libxl__ev_time *ev, const struct timeval *requested_abs, int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, guest_timeout);
-    STATE_AO_GC(dss->ao);
+    libxl__domain_suspend_state *dsps = CONTAINER_OF(ev, *dsps, guest_timeout);
+    STATE_AO_GC(dsps->ao);
     if (rc == ERROR_TIMEDOUT) {
         LOG(ERROR, "guest did not suspend, timed out");
         rc = ERROR_GUEST_TIMEDOUT;
     }
-    domain_suspend_common_done(egc, dss, rc);
+    domain_suspend_common_done(egc, dsps, rc);
 }
 
 static void domain_suspend_common_guest_suspended(libxl__egc *egc,
-                                         libxl__domain_suspend_state *dss)
+                                         libxl__domain_suspend_state *dsps)
 {
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(dsps->ao);
     int rc;
 
-    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
-    libxl__ev_xswatch_deregister(gc, &dss->guest_watch);
-    libxl__ev_time_deregister(gc, &dss->guest_timeout);
+    libxl__ev_evtchn_cancel(gc, &dsps->guest_evtchn);
+    libxl__ev_xswatch_deregister(gc, &dsps->guest_watch);
+    libxl__ev_time_deregister(gc, &dsps->guest_timeout);
 
-    if (dss->hvm) {
-        rc = libxl__domain_suspend_device_model(gc, dss);
+    if (dsps->hvm) {
+        rc = libxl__domain_suspend_device_model(gc, dsps);
         if (rc) {
             LOG(ERROR, "libxl__domain_suspend_device_model failed ret=%d", rc);
-            domain_suspend_common_done(egc, dss, rc);
+            domain_suspend_common_done(egc, dsps, rc);
             return;
         }
     }
-    domain_suspend_common_done(egc, dss, 0);
+    domain_suspend_common_done(egc, dsps, 0);
 }
 
 static void domain_suspend_common_done(libxl__egc *egc,
-                                       libxl__domain_suspend_state *dss,
+                                       libxl__domain_suspend_state *dsps,
                                        int rc)
 {
     EGC_GC;
-    assert(!libxl__xswait_inuse(&dss->pvcontrol));
-    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
-    libxl__ev_xswatch_deregister(gc, &dss->guest_watch);
-    libxl__ev_time_deregister(gc, &dss->guest_timeout);
-    dss->callback_common_done(egc, dss, rc);
+    assert(!libxl__xswait_inuse(&dsps->pvcontrol));
+    libxl__ev_evtchn_cancel(gc, &dsps->guest_evtchn);
+    libxl__ev_xswatch_deregister(gc, &dsps->guest_watch);
+    libxl__ev_time_deregister(gc, &dsps->guest_timeout);
+    dsps->callback_common_done(egc, dsps, rc);
 }
 
 void libxl__domain_suspend_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
+    libxl__domain_suspend_state *dsps = &dss->dsps;
 
-    dss->callback_common_done = domain_suspend_callback_common_done;
-    domain_suspend_callback_common(egc, dss);
+    dsps->callback_common_done = domain_suspend_callback_common_done;
+    domain_suspend_callback_common(egc, dsps);
 }
 
 static void domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int rc)
+                                libxl__domain_suspend_state *dsps, int rc)
 {
+    libxl__domain_save_state *dss = CONTAINER_OF(dsps, *dss, dsps);
     dss->rc = rc;
     libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 05537cc..93dd06c 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3007,11 +3007,12 @@ static inline bool libxl__conversion_helper_inuse
  */
 
 typedef struct libxl__domain_suspend_state libxl__domain_suspend_state;
+typedef struct libxl__domain_save_state libxl__domain_save_state;
 
-typedef void libxl__domain_suspend_cb(libxl__egc*,
-                                      libxl__domain_suspend_state*, int rc);
+typedef void libxl__domain_save_cb(libxl__egc*,
+                                   libxl__domain_save_state*, int rc);
 typedef void libxl__save_device_model_cb(libxl__egc*,
-                                         libxl__domain_suspend_state*, int rc);
+                                         libxl__domain_save_state*, int rc);
 
 /* State for writing a libxl migration v2 stream */
 typedef struct libxl__stream_write_state libxl__stream_write_state;
@@ -3020,7 +3021,7 @@ typedef void (*sws_record_done_cb)(libxl__egc *egc,
 struct libxl__stream_write_state {
     /* filled by the user */
     libxl__ao *ao;
-    libxl__domain_suspend_state *dss;
+    libxl__domain_save_state *dss;
     int fd;
     void (*completion_callback)(libxl__egc *egc,
                                 libxl__stream_write_state *sws,
@@ -3074,9 +3075,32 @@ typedef struct libxl__logdirty_switch {
 } libxl__logdirty_switch;
 
 struct libxl__domain_suspend_state {
+    /* set by caller of libxl__domain_suspend_init */
+    libxl__ao *ao;
+    uint32_t domid;
+
+    /* private */
+    int hvm;
+
+    libxl__ev_evtchn guest_evtchn;
+    int guest_evtchn_lockfd;
+    int guest_responded;
+
+    libxl__xswait_state pvcontrol;
+    libxl__ev_xswatch guest_watch;
+    libxl__ev_time guest_timeout;
+
+    const char *dm_savefile;
+    void (*callback_common_done)(libxl__egc*,
+                                 struct libxl__domain_suspend_state*, int ok);
+};
+int libxl__domain_suspend_init(libxl__egc *egc,
+                               libxl__domain_suspend_state *dsps);
+
+struct libxl__domain_save_state {
     /* set by caller of libxl__domain_save */
     libxl__ao *ao;
-    libxl__domain_suspend_cb *callback;
+    libxl__domain_save_cb *callback;
 
     uint32_t domid;
     int fd;
@@ -3087,22 +3111,14 @@ struct libxl__domain_suspend_state {
     const libxl_domain_remus_info *remus;
     /* private */
     int rc;
-    libxl__ev_evtchn guest_evtchn;
-    int guest_evtchn_lockfd;
     int hvm;
     int xcflags;
-    int guest_responded;
-    libxl__xswait_state pvcontrol;
-    libxl__ev_xswatch guest_watch;
-    libxl__ev_time guest_timeout;
-    const char *dm_savefile;
+    libxl__domain_suspend_state dsps;
     libxl__remus_devices_state rds;
     libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
     int interval; /* checkpoint interval (for Remus) */
     libxl__stream_write_state sws;
     libxl__logdirty_switch logdirty;
-    void (*callback_common_done)(libxl__egc*,
-                                 struct libxl__domain_suspend_state*, int ok);
     /* private for libxl__domain_save_device_model */
     libxl__save_device_model_cb *save_dm_callback;
     libxl__datacopier_state save_dm_datacopier;
@@ -3446,12 +3462,12 @@ struct libxl__domain_create_state {
 
 /* calls dss->callback when done */
 _hidden void libxl__domain_save(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss);
+                                libxl__domain_save_state *dss);
 
 
 /* calls libxl__xc_domain_suspend_done when done */
 _hidden void libxl__xc_domain_save(libxl__egc *egc,
-                                   libxl__domain_suspend_state *dss,
+                                   libxl__domain_save_state *dss,
                                    libxl__save_helper_state *shs);
 /* If rc==0 then retval is the return value from xc_domain_save
  * and errnoval is the errno value it provided.
@@ -3469,7 +3485,7 @@ void libxl__xc_domain_saverestore_async_callback_done(libxl__egc *egc,
 
 _hidden void libxl__domain_suspend_common_switch_qemu_logdirty
                                (int domid, unsigned int enable, void *data);
-_hidden int libxl__save_emulator_xenstore_data(libxl__domain_suspend_state *dss,
+_hidden int libxl__save_emulator_xenstore_data(libxl__domain_save_state *dss,
                                                char **buf, uint32_t *len);
 _hidden int libxl__restore_emulator_xenstore_data
     (libxl__domain_create_state *dcs, const char *ptr, uint32_t size);
@@ -3497,13 +3513,13 @@ static inline bool libxl__save_helper_inuse(const libxl__save_helper_state *shs)
 
 /* Each time the dm needs to be saved, we must call suspend and then save */
 _hidden int libxl__domain_suspend_device_model(libxl__gc *gc,
-                                           libxl__domain_suspend_state *dss);
+                                           libxl__domain_suspend_state *dsps);
 
 _hidden const char *libxl__device_model_savefile(libxl__gc *gc, uint32_t domid);
 
-/* calls dss->callback_common_done when done */
+/* calls dsps->callback_common_done when done */
 _hidden void libxl__domain_suspend(libxl__egc *egc,
-                                   libxl__domain_suspend_state *dss);
+                                   libxl__domain_suspend_state *dsps);
 /* used by libxc to suspend the guest during migration */
 _hidden void libxl__domain_suspend_callback(void *data);
 
@@ -3513,9 +3529,9 @@ _hidden void libxl__remus_domain_resume_callback(void *data);
 _hidden void libxl__remus_domain_save_checkpoint_callback(void *data);
 /* Remus setup and teardown*/
 _hidden void libxl__remus_setup(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss);
+                                libxl__domain_save_state *dss);
 _hidden void libxl__remus_teardown(libxl__egc *egc,
-                                   libxl__domain_suspend_state *dss,
+                                   libxl__domain_save_state *dss,
                                    int rc);
 /* Remus callbacks for restore */
 _hidden void libxl__remus_domain_restore_checkpoint_callback(void *data);
diff --git a/tools/libxl/libxl_netbuffer.c b/tools/libxl/libxl_netbuffer.c
index 107e867..c245a4e 100644
--- a/tools/libxl/libxl_netbuffer.c
+++ b/tools/libxl/libxl_netbuffer.c
@@ -41,7 +41,7 @@ int libxl__netbuffer_enabled(libxl__gc *gc)
 int init_subkind_nic(libxl__remus_devices_state *rds)
 {
     int rc, ret;
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
 
     STATE_AO_GC(rds->ao);
 
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index e3caf7d..fae2120 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -28,7 +28,7 @@ static void remus_checkpoint_stream_written(
     libxl__egc *egc, libxl__stream_write_state *sws, int rc);
 
 void libxl__remus_setup(libxl__egc *egc,
-                        libxl__domain_suspend_state *dss)
+                        libxl__domain_save_state *dss)
 {
     /* Convenience aliases */
     libxl__remus_devices_state *const rds = &dss->rds;
@@ -63,7 +63,7 @@ out:
 static void remus_setup_done(libxl__egc *egc,
                              libxl__remus_devices_state *rds, int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
     STATE_AO_GC(dss->ao);
 
     if (!rc) {
@@ -80,7 +80,7 @@ static void remus_setup_done(libxl__egc *egc,
 static void remus_setup_failed(libxl__egc *egc,
                                libxl__remus_devices_state *rds, int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -94,7 +94,7 @@ static void remus_teardown_done(libxl__egc *egc,
                                 libxl__remus_devices_state *rds,
                                 int rc);
 void libxl__remus_teardown(libxl__egc *egc,
-                           libxl__domain_suspend_state *dss,
+                           libxl__domain_save_state *dss,
                            int rc)
 {
     EGC_GC;
@@ -109,7 +109,7 @@ static void remus_teardown_done(libxl__egc *egc,
                                 libxl__remus_devices_state *rds,
                                 int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -122,7 +122,7 @@ static void remus_teardown_done(libxl__egc *egc,
 /*---------------------- remus callbacks (save) -----------------------*/
 
 static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int ok);
+                                libxl__domain_suspend_state *dsps, int ok);
 static void remus_devices_postsuspend_cb(libxl__egc *egc,
                                          libxl__remus_devices_state *rds,
                                          int rc);
@@ -134,15 +134,18 @@ void libxl__remus_domain_suspend_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
+    libxl__domain_suspend_state *dsps = &dss->dsps;
 
-    dss->callback_common_done = remus_domain_suspend_callback_common_done;
-    libxl__domain_suspend(egc, dss);
+    dsps->callback_common_done = remus_domain_suspend_callback_common_done;
+    libxl__domain_suspend(egc, dsps);
 }
 
 static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int rc)
+                                libxl__domain_suspend_state *dsps, int rc)
 {
+    libxl__domain_save_state *dss = CONTAINER_OF(dsps, *dss, dsps);
+
     if (rc)
         goto out;
 
@@ -160,7 +163,7 @@ static void remus_devices_postsuspend_cb(libxl__egc *egc,
                                          libxl__remus_devices_state *rds,
                                          int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
 
     if (rc)
         goto out;
@@ -177,7 +180,7 @@ void libxl__remus_domain_resume_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
     libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
     STATE_AO_GC(dss->ao);
 
     libxl__remus_devices_state *const rds = &dss->rds;
@@ -189,7 +192,7 @@ static void remus_devices_preresume_cb(libxl__egc *egc,
                                        libxl__remus_devices_state *rds,
                                        int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -220,7 +223,7 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
 void libxl__remus_domain_save_checkpoint_callback(void *data)
 {
     libxl__save_helper_state *shs = data;
-    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__domain_save_state *dss = shs->caller_state;
     libxl__egc *egc = shs->egc;
     STATE_AO_GC(dss->ao);
 
@@ -230,7 +233,7 @@ void libxl__remus_domain_save_checkpoint_callback(void *data)
 static void remus_checkpoint_stream_written(
     libxl__egc *egc, libxl__stream_write_state *sws, int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(sws, *dss, sws);
+    libxl__domain_save_state *dss = CONTAINER_OF(sws, *dss, sws);
 
     /* Convenience aliases */
     libxl__remus_devices_state *const rds = &dss->rds;
@@ -255,7 +258,7 @@ static void remus_devices_commit_cb(libxl__egc *egc,
                                     libxl__remus_devices_state *rds,
                                     int rc)
 {
-    libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
 
     STATE_AO_GC(dss->ao);
 
@@ -290,7 +293,7 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
                                   const struct timeval *requested_abs,
                                   int rc)
 {
-    libxl__domain_suspend_state *dss =
+    libxl__domain_save_state *dss =
                             CONTAINER_OF(ev, *dss, checkpoint_timeout);
 
     STATE_AO_GC(dss->ao);
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 3af99af..2d06b42 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -75,7 +75,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
                argnums, ARRAY_SIZE(argnums));
 }
 
-void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss,
+void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_save_state *dss,
                            libxl__save_helper_state *shs)
 {
     STATE_AO_GC(dss->ao);
diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
index ee9c53a..c408ab1 100644
--- a/tools/libxl/libxl_stream_write.c
+++ b/tools/libxl/libxl_stream_write.c
@@ -216,7 +216,7 @@ void libxl__stream_write_start(libxl__egc *egc,
                                libxl__stream_write_state *stream)
 {
     libxl__datacopier_state *dc = &stream->dc;
-    libxl__domain_suspend_state *dss = stream->dss;
+    libxl__domain_save_state *dss = stream->dss;
     STATE_AO_GC(stream->ao);
     struct libxl__sr_hdr hdr;
     int rc = 0;
@@ -324,7 +324,7 @@ static void libxc_header_done(libxl__egc *egc,
 void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
                                 int rc, int retval, int errnoval)
 {
-    libxl__domain_suspend_state *dss = dss_void;
+    libxl__domain_save_state *dss = dss_void;
     libxl__stream_write_state *stream = &dss->sws;
     STATE_AO_GC(dss->ao);
 
@@ -333,10 +333,10 @@ void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
 
     if (retval) {
         LOGEV(ERROR, errnoval, "saving domain: %s",
-              dss->guest_responded ?
+              dss->dsps.guest_responded ?
               "domain responded to suspend request" :
               "domain did not respond to suspend request");
-        if (!dss->guest_responded)
+        if (!dss->dsps.guest_responded)
             rc = ERROR_GUEST_TIMEDOUT;
         else if (dss->rc)
             rc = dss->rc;
@@ -365,7 +365,7 @@ void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
 static void write_emulator_xenstore_record(libxl__egc *egc,
                                            libxl__stream_write_state *stream)
 {
-    libxl__domain_suspend_state *dss = stream->dss;
+    libxl__domain_save_state *dss = stream->dss;
     STATE_AO_GC(stream->ao);
     struct libxl__sr_rec_hdr rec;
     int rc;
@@ -404,7 +404,7 @@ static void write_emulator_xenstore_record(libxl__egc *egc,
 static void emulator_xenstore_record_done(libxl__egc *egc,
                                           libxl__stream_write_state *stream)
 {
-    libxl__domain_suspend_state *dss = stream->dss;
+    libxl__domain_save_state *dss = stream->dss;
 
     if (dss->type == LIBXL_DOMAIN_TYPE_HVM)
         write_emulator_context_record(egc, stream);
@@ -419,7 +419,7 @@ static void emulator_xenstore_record_done(libxl__egc *egc,
 static void write_emulator_context_record(libxl__egc *egc,
                                           libxl__stream_write_state *stream)
 {
-    libxl__domain_suspend_state *dss = stream->dss;
+    libxl__domain_save_state *dss = stream->dss;
     libxl__datacopier_state *dc = &stream->emu_dc;
     STATE_AO_GC(stream->ao);
     struct libxl__sr_rec_hdr *rec = &stream->emu_rec_hdr;
@@ -434,7 +434,7 @@ static void write_emulator_context_record(libxl__egc *egc,
     }
 
     /* Convenience aliases */
-    const char *const filename = dss->dm_savefile;
+    const char *const filename = dss->dsps.dm_savefile;
 
     libxl__carefd_begin();
     int readfd = open(filename, O_RDONLY);
-- 
2.5.0
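
For readers following the refactoring above: the split relies on embedding
the suspend state inside the save state and recovering the outer structure
from the inner one in the completion callback. The sketch below is not part
of the patch. It is a minimal illustration under stated assumptions: every
demo_* name is hypothetical, and DEMO_CONTAINER_OF is a simplified stand-in
for libxl's CONTAINER_OF, which takes different arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for libxl's CONTAINER_OF (assumption: the real macro
 * has a different argument list and extra type checking). */
#define DEMO_CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical inner state, in the role of libxl__domain_suspend_state. */
struct demo_suspend_state {
    int guest_responded;
    void (*callback_common_done)(struct demo_suspend_state *dsps, int rc);
};

/* Hypothetical outer state, in the role of libxl__domain_save_state. */
struct demo_save_state {
    int rc;
    struct demo_suspend_state dsps;   /* embedded by value, not a pointer */
};

/* Completion callback: receives only the inner state and recovers the
 * enclosing save state from it. */
static void demo_save_suspend_done(struct demo_suspend_state *dsps, int rc)
{
    struct demo_save_state *dss =
        DEMO_CONTAINER_OF(dsps, struct demo_save_state, dsps);
    dss->rc = rc;
}

/* The suspend logic sees only the inner state and knows nothing about
 * who embeds it. */
static void demo_suspend(struct demo_suspend_state *dsps, int rc)
{
    assert(dsps->callback_common_done != NULL);
    dsps->guest_responded = (rc == 0);
    dsps->callback_common_done(dsps, rc);
}

/* Runs one suspend on a fresh save state and returns the rc it recorded. */
static int demo_run(int rc)
{
    struct demo_save_state dss = { .rc = -1 };

    dss.dsps.callback_common_done = demo_save_suspend_done;
    demo_suspend(&dss.dsps, rc);
    return dss.rc;
}
```

Calling demo_run(rc) shows that the outer rc field is updated even though
demo_suspend() was handed only the embedded member. This is what should let
another caller embed the same suspend state in a different outer structure.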


* [PATCH v6 05/18] tools/libxc: support to resume uncooperative HVM guests
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (3 preceding siblings ...)
  2015-12-30  2:28 ` [PATCH v6 04/18] libxl/save: Refactor libxl__domain_suspend_state Wen Congyang
@ 2015-12-30  2:28 ` Wen Congyang
  2016-01-25 18:21   ` Konrad Rzeszutek Wilk
  2015-12-30  2:28 ` [PATCH v6 06/18] tools/libxl: introduce enum type libxl_checkpointed_stream Wen Congyang
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:28 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

Before this patch:
1. Suspend
a. PVHVM and PV: both are suspended the same way (by sending the suspend
   request to the guest). If the guest doesn't support evtchn, the xenstore
   variant is used, suspending the guest via the XenBus control node.
b. pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to suspend
   the guest.

2. Resume
a. fast path
   In this case, we don't change the guest's state, and we call
   libxl__domain_resume(..., 1) to resume the guest.
   PV:       modify the return code to 1, and then call the domctl
             XEN_DOMCTL_resumedomain
   PVHVM:    same as PV
   pure HVM: do nothing in modify_returncode, and then call the domctl
             XEN_DOMCTL_resumedomain
b. slow path
   Used when the guest's state has been changed. We call
   libxl__domain_resume(..., 0) to resume the guest.
   PV:       update start info and reset all secondary CPU states, then call
             the domctl XEN_DOMCTL_resumedomain
   PVHVM:    cannot be resumed. You will get the following error message:
                 "Cannot resume uncooperative HVM guests"
   pure HVM: same as PVHVM

After this patch:
1. Suspend
   unchanged

2. Resume
a. fast path
   unchanged
b. slow path
   PV:       unchanged
   PVHVM:    call XEN_DOMCTL_resumedomain to resume the guest. Because we
             don't modify the return code, the PV driver will disconnect
             and reconnect. I am not sure if we should update start info
             and reset all secondary CPU states.
   pure HVM: call XEN_DOMCTL_resumedomain to resume the guest.

Under COLO, we will update the guest's state (modify memory, CPU registers,
device status...). In this case, we cannot use the fast path to resume it.
Instead we keep the return code at 0 and use the slow path to resume the
guest. Resuming an HVM guest via the slow path is not currently supported,
so this patch makes that resume call succeed instead of failing.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
---
 tools/libxc/xc_resume.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c
index 87d4324..503e4f8 100644
--- a/tools/libxc/xc_resume.c
+++ b/tools/libxc/xc_resume.c
@@ -108,6 +108,25 @@ static int xc_domain_resume_cooperative(xc_interface *xch, uint32_t domid)
     return do_domctl(xch, &domctl);
 }
 
+static int xc_domain_resume_hvm(xc_interface *xch, uint32_t domid)
+{
+    DECLARE_DOMCTL;
+
+    /*
+     * The domctl XEN_DOMCTL_resumedomain just unpauses each vcpu. After
+     * this domctl, the guest will run.
+     *
+     * If it is PVHVM, the guest called the hypercall HYPERVISOR_sched_op
+     * to suspend itself. We don't modify the return code, so the PV driver
+     * will disconnect and reconnect.
+     *
+     * If it is a pure HVM guest, the guest will continue running.
+     */
+    domctl.cmd = XEN_DOMCTL_resumedomain;
+    domctl.domain = domid;
+    return do_domctl(xch, &domctl);
+}
+
 static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
 {
     DECLARE_DOMCTL;
@@ -137,10 +156,7 @@ static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
      */
 #if defined(__i386__) || defined(__x86_64__)
     if ( info.hvm )
-    {
-        ERROR("Cannot resume uncooperative HVM guests");
-        return rc;
-    }
+        return xc_domain_resume_hvm(xch, domid);
 
     if ( xc_domain_get_guest_width(xch, domid, &dinfo->guest_width) != 0 )
     {
-- 
2.5.0
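
The before/after behaviour described in the commit message can be summarised
as a small decision function. This is only a sketch of the dispatch as the
commit message describes it, not code from libxc: the demo_* names and the
"patched" flag are hypothetical, and is_hvm stands for the info.hvm test in
the diff, which covers both PVHVM and pure HVM guests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the resume strategies named in the commit message. */
enum demo_resume_action {
    DEMO_RESUME_COOPERATIVE,  /* fast path: guest state untouched */
    DEMO_RESUME_PV_REBUILD,   /* slow path, PV: update start info, reset
                               * secondary CPU states, then resumedomain */
    DEMO_RESUME_HVM_UNPAUSE,  /* slow path, HVM after the patch: only
                               * XEN_DOMCTL_resumedomain */
    DEMO_RESUME_ERROR         /* slow path, HVM before the patch:
                               * "Cannot resume uncooperative HVM guests" */
};

/*
 * Pick the resume strategy.
 *   is_hvm  - true for PVHVM and pure HVM guests
 *   fast    - true when the guest's state was not modified
 *   patched - true to model the behaviour after this patch
 */
static enum demo_resume_action demo_pick_resume(bool is_hvm, bool fast,
                                                bool patched)
{
    if (fast)
        return DEMO_RESUME_COOPERATIVE;
    if (!is_hvm)
        return DEMO_RESUME_PV_REBUILD;
    return patched ? DEMO_RESUME_HVM_UNPAUSE : DEMO_RESUME_ERROR;
}
```

Only the slow-path HVM case changes between patched == false and
patched == true, which matches the "unchanged" entries in the commit message.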


* [PATCH v6 06/18] tools/libxl: introduce enum type libxl_checkpointed_stream
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (4 preceding siblings ...)
  2015-12-30  2:28 ` [PATCH v6 05/18] tools/libxc: support to resume uncooperative HVM guests Wen Congyang
@ 2015-12-30  2:28 ` Wen Congyang
  2016-01-25 18:30   ` Konrad Rzeszutek Wilk
  2015-12-30  2:28 ` [PATCH v6 07/18] migration/save: pass checkpointed_stream from libxl to libxc Wen Congyang
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:28 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

Introduce the enum type libxl_checkpointed_stream in the IDL.
Rename the last argument of migrate_receive from "remus" to
"checkpointed" since the semantics of this parameter have
changed.

NOTE:
 libxl_domain_restore_params isn't changed here;
 checkpointed_stream is still an int.

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 tools/libxl/libxl.h             |  7 +++++++
 tools/libxl/libxl_create.c      |  8 ++++++--
 tools/libxl/libxl_stream_read.c |  7 +++++--
 tools/libxl/libxl_types.idl     |  5 +++++
 tools/libxl/xl_cmdimpl.c        | 18 ++++++++++++------
 5 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 05606a7..a01e448 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -867,6 +867,13 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
  */
 #define LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE 1
 
+/*
+ * LIBXL_HAVE_CHECKPOINTED_STREAM
+ *
+ * If this is defined, then libxl_checkpointed_stream exists.
+ */
+#define LIBXL_HAVE_CHECKPOINTED_STREAM 1
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index bfa0552..8d3896f 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1015,9 +1015,13 @@ static void domcreate_bootloader_done(libxl__egc *egc,
     dcs->srs.completion_callback = domcreate_stream_done;
 
     if (restore_fd >= 0) {
-        if (checkpointed_stream)
+        switch (checkpointed_stream) {
+        case LIBXL_CHECKPOINTED_STREAM_REMUS:
             libxl__remus_restore_setup(egc, dcs);
-        libxl__stream_read_start(egc, &dcs->srs);
+            /* fall through */
+        case LIBXL_CHECKPOINTED_STREAM_NONE:
+            libxl__stream_read_start(egc, &dcs->srs);
+        }
         return;
     }
 
diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index 42c087f..6ad2a27 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -780,15 +780,18 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
      * If the stream is not still alive, we must not continue any work.
      */
     if (libxl__stream_read_inuse(stream)) {
-        if (checkpointed_stream) {
+        switch (checkpointed_stream) {
+        case LIBXL_CHECKPOINTED_STREAM_REMUS:
             /* failover */
             stream_complete(egc, stream, 0);
-        } else {
+            break;
+        case LIBXL_CHECKPOINTED_STREAM_NONE:
             /*
              * Libxc has indicated that it is done with the stream.
              * Resume reading libxl records from it.
              */
             stream_continue(egc, stream);
+            break;
         }
     }
 }
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 9658356..3ef11aa 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -228,6 +228,11 @@ libxl_hdtype = Enumeration("hdtype", [
     (2, "AHCI"),
     ], init_val = "LIBXL_HDTYPE_IDE")
 
+libxl_checkpointed_stream = Enumeration("checkpointed_stream", [
+    (0, "NONE"),
+    (1, "REMUS"),
+    ])
+
 #
 # Complex libxl types
 #
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index f9933cb..c1cd696 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -4424,7 +4424,8 @@ static void migrate_domain(uint32_t domid, const char *rune, int debug,
 }
 
 static void migrate_receive(int debug, int daemonize, int monitor,
-                            int send_fd, int recv_fd, int remus)
+                            int send_fd, int recv_fd,
+                            libxl_checkpointed_stream checkpointed)
 {
     uint32_t domid;
     int rc, rc2;
@@ -4449,7 +4450,7 @@ static void migrate_receive(int debug, int daemonize, int monitor,
     dom_info.paused = 1;
     dom_info.migrate_fd = recv_fd;
     dom_info.migration_domname_r = &migration_domname;
-    dom_info.checkpointed_stream = remus;
+    dom_info.checkpointed_stream = checkpointed;
 
     rc = create_domain(&dom_info);
     if (rc < 0) {
@@ -4460,7 +4461,8 @@ static void migrate_receive(int debug, int daemonize, int monitor,
 
     domid = rc;
 
-    if (remus) {
+    switch (checkpointed) {
+    case LIBXL_CHECKPOINTED_STREAM_REMUS:
         /* If we are here, it means that the sender (primary) has crashed.
          * TODO: Split-Brain Check.
          */
@@ -4493,6 +4495,9 @@ static void migrate_receive(int debug, int daemonize, int monitor,
                     common_domname, domid, rc);
 
         exit(rc ? -ERROR_FAIL: 0);
+    default:
+        /* do nothing */
+        break;
     }
 
     fprintf(stderr, "migration target: Transfer complete,"
@@ -4630,7 +4635,8 @@ int main_restore(int argc, char **argv)
 
 int main_migrate_receive(int argc, char **argv)
 {
-    int debug = 0, daemonize = 1, monitor = 1, remus = 0;
+    int debug = 0, daemonize = 1, monitor = 1;
+    libxl_checkpointed_stream checkpointed = LIBXL_CHECKPOINTED_STREAM_NONE;
     int opt;
 
     SWITCH_FOREACH_OPT(opt, "Fedr", NULL, "migrate-receive", 0) {
@@ -4645,7 +4651,7 @@ int main_migrate_receive(int argc, char **argv)
         debug = 1;
         break;
     case 'r':
-        remus = 1;
+        checkpointed = LIBXL_CHECKPOINTED_STREAM_REMUS;
         break;
     }
 
@@ -4655,7 +4661,7 @@ int main_migrate_receive(int argc, char **argv)
     }
     migrate_receive(debug, daemonize, monitor,
                     STDOUT_FILENO, STDIN_FILENO,
-                    remus);
+                    checkpointed);
 
     return 0;
 }
-- 
2.5.0


* [PATCH v6 07/18] migration/save: pass checkpointed_stream from libxl to libxc
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (5 preceding siblings ...)
  2015-12-30  2:28 ` [PATCH v6 06/18] tools/libxl: introduce enum type libxl_checkpointed_stream Wen Congyang
@ 2015-12-30  2:28 ` Wen Congyang
  2015-12-30  2:28 ` [PATCH v6 08/18] tools/libxl: introduce libxl__domain_restore_device_model to load qemu state Wen Congyang
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:28 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Ian Campbell, Wen Congyang,
	Gui Jianfeng, Jiang Yunhong, Dong Eddie, Shriram Rajagopalan,
	Ian Jackson, Yang Hongyang

Pass checkpointed_stream from libxl to libxc.
This does not affect legacy migration, because legacy migration
does not use this parameter.
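
For illustration only, a minimal sketch of why replacing the boolean flag
with an enum keeps existing behaviour: NONE stays 0, so every former
"is checkpointed" test becomes a comparison against MIG_STREAM_NONE. The
helpers stream_is_checkpointed() and pick_send_path() and the send_path
enum are hypothetical names, not part of libxc; the selection order
mirrors the live / checkpointed / nonlive choice made in save().

```c
#include <assert.h>
#include <stdbool.h>

/* Same values as the enum added to struct xc_sr_context in this patch. */
enum mig_stream { MIG_STREAM_NONE = 0, MIG_STREAM_REMUS = 1 };

/* Hypothetical labels for the three memory-sending paths in save(). */
enum send_path { SEND_LIVE, SEND_CHECKPOINTED, SEND_NONLIVE };

/* Hypothetical helper: true for any stream type other than NONE. */
static bool stream_is_checkpointed(int checkpointed_stream)
{
    assert(checkpointed_stream >= MIG_STREAM_NONE);
    return checkpointed_stream != MIG_STREAM_NONE;
}

/* Hypothetical helper: live wins, then checkpointed, then nonlive. */
static enum send_path pick_send_path(bool live, int checkpointed_stream)
{
    if (live)
        return SEND_LIVE;
    if (stream_is_checkpointed(checkpointed_stream))
        return SEND_CHECKPOINTED;
    return SEND_NONLIVE;
}
```

Because a plain migration passes MIG_STREAM_NONE (0), it takes exactly the
paths it took when the flag was a bool.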

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/libxc/include/xenguest.h   |  6 ++++--
 tools/libxc/xc_nomigrate.c       |  3 ++-
 tools/libxc/xc_sr_common.h       | 12 +++++++++++-
 tools/libxc/xc_sr_save.c         | 14 ++++++++------
 tools/libxl/libxl.c              |  2 ++
 tools/libxl/libxl_dom_save.c     | 11 ++++++++---
 tools/libxl/libxl_internal.h     |  1 +
 tools/libxl/libxl_save_callout.c |  2 +-
 tools/libxl/libxl_save_helper.c  |  3 ++-
 tools/libxl/libxl_stream_write.c |  2 +-
 tools/libxl/libxl_types.idl      |  1 +
 11 files changed, 41 insertions(+), 16 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 8f918b1..e8bc46d 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -29,7 +29,6 @@
 #define XCFLAGS_HVM       (1 << 2)
 #define XCFLAGS_STDVGA    (1 << 3)
 #define XCFLAGS_CHECKPOINT_COMPRESS    (1 << 4)
-#define XCFLAGS_CHECKPOINTED    (1 << 5)
 
 #define X86_64_B_SIZE   64 
 #define X86_32_B_SIZE   32
@@ -76,11 +75,14 @@ struct save_callbacks {
  * @parm xch a handle to an open hypervisor interface
  * @parm fd the file descriptor to save a domain to
  * @parm dom the id of the domain
+ * @param checkpointed_stream MIG_STREAM_NONE if the far end of the stream
+ *        doesn't use checkpointing
  * @return 0 on success, -1 on failure
  */
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
                    uint32_t max_factor, uint32_t flags /* XCFLAGS_xxx */,
-                   struct save_callbacks* callbacks, int hvm);
+                   struct save_callbacks* callbacks, int hvm,
+                   int checkpointed_stream);
 
 /* callbacks provided by xc_domain_restore */
 struct restore_callbacks {
diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
index 902429e..c9124df 100644
--- a/tools/libxc/xc_nomigrate.c
+++ b/tools/libxc/xc_nomigrate.c
@@ -22,7 +22,8 @@
 
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
                    uint32_t max_factor, uint32_t flags,
-                   struct save_callbacks* callbacks, int hvm)
+                   struct save_callbacks* callbacks, int hvm,
+                   int checkpointed_stream)
 {
     errno = ENOSYS;
     return -1;
diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 9aecde2..bc99e9a 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -171,6 +171,16 @@ struct xc_sr_context
 
     xc_dominfo_t dominfo;
 
+    /*
+     * migration stream
+     * 0: Plain VM
+     * 1: Remus
+     */
+    enum {
+        MIG_STREAM_NONE, /* plain stream */
+        MIG_STREAM_REMUS,
+    } migration_stream;
+
     union /* Common save or restore data. */
     {
         struct /* Save data. */
@@ -182,7 +192,7 @@ struct xc_sr_context
             bool live;
 
             /* Plain VM, or checkpoints over time. */
-            bool checkpointed;
+            int checkpointed;
 
             /* Further debugging information in the stream. */
             bool debug;
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index 76ebb34..8ffd71d 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -628,7 +628,8 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
     if ( rc )
         goto out;
 
-    if ( ctx->save.debug && !ctx->save.checkpointed )
+    if ( ctx->save.debug &&
+         ctx->save.checkpointed == MIG_STREAM_NONE )
     {
         rc = verify_frames(ctx);
         if ( rc )
@@ -753,7 +754,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
 
         if ( ctx->save.live )
             rc = send_domain_memory_live(ctx);
-        else if ( ctx->save.checkpointed )
+        else if ( ctx->save.checkpointed != MIG_STREAM_NONE )
             rc = send_domain_memory_checkpointed(ctx);
         else
             rc = send_domain_memory_nonlive(ctx);
@@ -773,7 +774,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
         if ( rc )
             goto err;
 
-        if ( ctx->save.checkpointed )
+        if ( ctx->save.checkpointed != MIG_STREAM_NONE )
         {
             /*
              * We have now completed the initial live portion of the checkpoint
@@ -792,7 +793,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
             if ( rc <= 0 )
                 goto err;
         }
-    } while ( ctx->save.checkpointed );
+    } while ( ctx->save.checkpointed != MIG_STREAM_NONE );
 
     xc_report_progress_single(xch, "End of stream");
 
@@ -822,7 +823,8 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
 
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
                    uint32_t max_iters, uint32_t max_factor, uint32_t flags,
-                   struct save_callbacks* callbacks, int hvm)
+                   struct save_callbacks* callbacks, int hvm,
+                   int checkpointed_stream)
 {
     struct xc_sr_context ctx =
         {
@@ -834,7 +836,7 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
     ctx.save.callbacks = callbacks;
     ctx.save.live  = !!(flags & XCFLAGS_LIVE);
     ctx.save.debug = !!(flags & XCFLAGS_DEBUG);
-    ctx.save.checkpointed = !!(flags & XCFLAGS_CHECKPOINTED);
+    ctx.save.checkpointed = checkpointed_stream;
 
     /*
      * TODO: Find some time to better tweak the live migration algorithm.
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index f8f7158..2faea4d 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -877,6 +877,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
     dss->live = 1;
     dss->debug = 0;
     dss->remus = info;
+    dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_REMUS;
 
     assert(info);
 
@@ -937,6 +938,7 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
     dss->type = type;
     dss->live = flags & LIBXL_SUSPEND_LIVE;
     dss->debug = flags & LIBXL_SUSPEND_DEBUG;
+    dss->checkpointed_stream = LIBXL_CHECKPOINTED_STREAM_NONE;
 
     rc = libxl__fd_flags_modify_save(gc, dss->fd,
                                      ~(O_NONBLOCK|O_NDELAY), 0,
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index cc8cabe..dbdee8f 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -338,6 +338,12 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
     unsigned int nr_vnodes = 0, nr_vmemranges = 0, nr_vcpus = 0;
     libxl__domain_suspend_state *dsps = &dss->dsps;
 
+    if (dss->checkpointed_stream != LIBXL_CHECKPOINTED_STREAM_NONE && !r_info) {
+        LOG(ERROR, "Migration stream is checkpointed, but there's no "
+                   "checkpoint info!");
+        goto out;
+    }
+
     dss->rc = 0;
     logdirty_init(&dss->logdirty);
     dsps->ao = ao;
@@ -376,15 +382,14 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
         goto out;
     }
 
-    if (r_info != NULL) {
+    if (dss->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_REMUS) {
         dss->interval = r_info->interval;
-        dss->xcflags |= XCFLAGS_CHECKPOINTED;
         if (libxl_defbool_val(r_info->compression))
             dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
     }
 
     memset(callbacks, 0, sizeof(*callbacks));
-    if (r_info != NULL) {
+    if (dss->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_REMUS) {
         callbacks->suspend = libxl__remus_domain_suspend_callback;
         callbacks->postcopy = libxl__remus_domain_resume_callback;
         callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 93dd06c..cc7978d 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3108,6 +3108,7 @@ struct libxl__domain_save_state {
     libxl_domain_type type;
     int live;
     int debug;
+    int checkpointed_stream;
     const libxl_domain_remus_info *remus;
     /* private */
     int rc;
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 2d06b42..416b318 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -85,7 +85,7 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_save_state *dss,
 
     const unsigned long argnums[] = {
         dss->domid, 0, 0, dss->xcflags, dss->hvm,
-        cbflags,
+        cbflags, dss->checkpointed_stream,
     };
 
     shs->ao = ao;
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 39038f9..6bdcf13 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -253,6 +253,7 @@ int main(int argc, char **argv)
         uint32_t flags =           strtoul(NEXTARG,0,10);
         int hvm =                  atoi(NEXTARG);
         unsigned cbflags =         strtoul(NEXTARG,0,10);
+        int checkpointed_stream =  strtoul(NEXTARG,0,10);
         assert(!*++argv);
 
         helper_setcallbacks_save(&helper_save_callbacks, cbflags);
@@ -261,7 +262,7 @@ int main(int argc, char **argv)
         setup_signals(save_signal_handler);
 
         r = xc_domain_save(xch, io_fd, dom, max_iters, max_factor, flags,
-                           &helper_save_callbacks, hvm);
+                           &helper_save_callbacks, hvm, checkpointed_stream);
         complete(r);
 
     } else if (!strcmp(mode,"--restore-domain")) {
diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
index c408ab1..852546f 100644
--- a/tools/libxl/libxl_stream_write.c
+++ b/tools/libxl/libxl_stream_write.c
@@ -355,7 +355,7 @@ void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
      * If the stream is not still alive, we must not continue any work.
      */
     if (libxl__stream_write_inuse(stream)) {
-        if (dss->remus)
+        if (dss->checkpointed_stream != LIBXL_CHECKPOINTED_STREAM_NONE)
             stream_complete(egc, stream, 0);
         else
             write_emulator_xenstore_record(egc, stream);
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 3ef11aa..9aa94be 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -228,6 +228,7 @@ libxl_hdtype = Enumeration("hdtype", [
     (2, "AHCI"),
     ], init_val = "LIBXL_HDTYPE_IDE")
 
+# Consistent with the values defined for migration_stream
 libxl_checkpointed_stream = Enumeration("checkpointed_stream", [
     (0, "NONE"),
     (1, "REMUS"),
-- 
2.5.0


* [PATCH v6 08/18] tools/libxl: introduce libxl__domain_restore_device_model to load qemu state
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (6 preceding siblings ...)
  2015-12-30  2:28 ` [PATCH v6 07/18] migration/save: pass checkpointed_stream from libxl to libxc Wen Congyang
@ 2015-12-30  2:28 ` Wen Congyang
  2015-12-30  2:28 ` [PATCH v6 09/18] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty() Wen Congyang
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:28 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Anthony Perard, Shriram Rajagopalan,
	Yang Hongyang

In normal migration, the qemu state is passed to qemu as a parameter.
With COLO, the secondary vm is already running, so at every checkpoint
we do the following:
1. suspend both primary vm and secondary vm
2. sync the state
3. resume both primary vm and secondary vm
The primary sends qemu's state in step 2, and the secondary's qemu must
read it and restore the state before the secondary is resumed. We cannot
pass the state to qemu as a parameter because the secondary qemu has
already started at this point, so we introduce
libxl__domain_restore_device_model() to do it.
This API must be called before resuming the secondary vm.
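
As a rough illustration of what libxl__qmp_restore() asks qemu to do, the
sketch below formats a QMP request for xen-load-devices-state with a
"filename" argument, the same command and argument name used in the patch.
build_load_devices_state() is a hypothetical helper, the state file path
is chosen by the caller, and no JSON escaping is done, so this is not how
libxl builds the message internally.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the QMP request into buf.
 * Returns the string length, or -1 if buf is too small.
 * NOTE: state_file is inserted verbatim; a real client must escape it.
 */
static int build_load_devices_state(char *buf, size_t len,
                                    const char *state_file)
{
    int n;

    assert(buf != NULL && state_file != NULL);

    n = snprintf(buf, len,
                 "{\"execute\": \"xen-load-devices-state\", "
                 "\"arguments\": {\"filename\": \"%s\"}}",
                 state_file);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

The request would be sent while the secondary is still suspended, between
steps 2 and 3 above.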

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Anthony Perard <anthony.perard@citrix.com>
---
 tools/libxl/libxl_dom_save.c | 20 ++++++++++++++++++++
 tools/libxl/libxl_internal.h |  4 ++++
 tools/libxl/libxl_qmp.c      | 10 ++++++++++
 3 files changed, 34 insertions(+)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index dbdee8f..b3ecad7 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -518,6 +518,26 @@ int libxl__restore_emulator_xenstore_data(libxl__domain_create_state *dcs,
     return rc;
 }
 
+int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid,
+                                       const char *restore_file)
+{
+    int rc;
+
+    switch (libxl__device_model_version_running(gc, domid)) {
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+        /* not supported now */
+        rc = ERROR_INVAL;
+        break;
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+        rc = libxl__qmp_restore(gc, domid, restore_file);
+        break;
+    default:
+        rc = ERROR_INVAL;
+    }
+
+    return rc;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index cc7978d..4872619 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1115,6 +1115,8 @@ _hidden int libxl__domain_rename(libxl__gc *gc, uint32_t domid,
                                  const char *old_name, const char *new_name,
                                  xs_transaction_t trans);
 
+_hidden int libxl__domain_restore_device_model(libxl__gc *gc, uint32_t domid,
+                                               const char *restore_file);
 _hidden int libxl__domain_resume_device_model(libxl__gc *gc, uint32_t domid);
 
 _hidden const char *libxl__userdata_path(libxl__gc *gc, uint32_t domid,
@@ -1758,6 +1760,8 @@ _hidden int libxl__qmp_stop(libxl__gc *gc, int domid);
 _hidden int libxl__qmp_resume(libxl__gc *gc, int domid);
 /* Save current QEMU state into fd. */
 _hidden int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename);
+/* Load current QEMU state from fd. */
+_hidden int libxl__qmp_restore(libxl__gc *gc, int domid, const char *filename);
 /* Set dirty bitmap logging status */
 _hidden int libxl__qmp_set_global_dirty_log(libxl__gc *gc, int domid, bool enable);
 _hidden int libxl__qmp_insert_cdrom(libxl__gc *gc, int domid, const libxl_device_disk *disk);
diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
index 714038b..eec8a44 100644
--- a/tools/libxl/libxl_qmp.c
+++ b/tools/libxl/libxl_qmp.c
@@ -905,6 +905,16 @@ int libxl__qmp_save(libxl__gc *gc, int domid, const char *filename)
                            NULL, NULL);
 }
 
+int libxl__qmp_restore(libxl__gc *gc, int domid, const char *state_file)
+{
+    libxl__json_object *args = NULL;
+
+    qmp_parameters_add_string(gc, &args, "filename", state_file);
+
+    return qmp_run_command(gc, domid, "xen-load-devices-state", args,
+                           NULL, NULL);
+}
+
 static int qmp_change(libxl__gc *gc, libxl__qmp_handler *qmp,
                       char *device, char *target, char *arg)
 {
-- 
2.5.0


* [PATCH v6 09/18] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (7 preceding siblings ...)
  2015-12-30  2:28 ` [PATCH v6 08/18] tools/libxl: introduce libxl__domain_restore_device_model to load qemu state Wen Congyang
@ 2015-12-30  2:28 ` Wen Congyang
  2016-01-25 18:59   ` Konrad Rzeszutek Wilk
  2015-12-30  2:29 ` [PATCH v6 10/18] tools/libxl: export logdirty_init Wen Congyang
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:28 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

The secondary vm is running in COLO mode, and we need to send the
secondary vm's dirty page information to the master at each checkpoint,
so we have to enable qemu logdirty on the secondary.

libxl__domain_suspend_common_switch_qemu_logdirty() enables qemu
logdirty, but it uses domain_save_state and calls
libxl__xc_domain_saverestore_async_callback_done() before it
returns, so it cannot be used for the secondary vm.

Split libxl__domain_suspend_common_switch_qemu_logdirty() to
introduce a new API, libxl__domain_common_switch_qemu_logdirty().
This API only uses libxl__logdirty_switch, and calls
lds->callback before it returns.
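
The decoupling can be sketched as follows, under simplifying assumptions:
the common code only knows the embedded switch struct and its callback,
and the save-side callback recovers its enclosing state with a
container-of macro. All names here (struct save_state, helper_result,
common_switch_logdirty(), the three-argument CONTAINER_OF) are simplified
stand-ins, not the libxl definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Classic container-of; libxl's own CONTAINER_OF has a different form. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct logdirty_switch;
typedef void logdirty_cb(struct logdirty_switch *lds, int rc);

/* Stand-in for libxl__logdirty_switch: the caller sets the callback. */
struct logdirty_switch {
    logdirty_cb *callback;
};

/* Stand-in for the save state that embeds the switch. */
struct save_state {
    int rc;
    int helper_result;  /* models the value reported to the save helper */
    struct logdirty_switch logdirty;
};

/* Common path: knows nothing about save_state, only reports rc. */
static void common_switch_logdirty(struct logdirty_switch *lds, int rc)
{
    assert(lds != NULL && lds->callback != NULL);
    lds->callback(lds, rc);
}

/* Save-side completion: recover the outer state and record the result. */
static void save_logdirty_done(struct logdirty_switch *lds, int rc)
{
    struct save_state *ss = CONTAINER_OF(lds, struct save_state, logdirty);

    if (rc) {
        ss->rc = rc;
        ss->helper_result = -1;
    } else {
        ss->helper_result = 0;
    }
}
```

A secondary-side caller would install a different callback on its own
switch struct and reuse the common path unchanged.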

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_dom_save.c | 95 ++++++++++++++++++++++++--------------------
 tools/libxl/libxl_internal.h |  8 ++++
 2 files changed, 60 insertions(+), 43 deletions(-)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index b3ecad7..79e43f1 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -42,7 +42,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
                             const char *watch_path, const char *event_path);
 static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_save_state *dss, int rc);
+                                 libxl__logdirty_switch *lds, int rc);
 
 static void logdirty_init(libxl__logdirty_switch *lds)
 {
@@ -52,13 +52,10 @@ static void logdirty_init(libxl__logdirty_switch *lds)
 }
 
 static void domain_suspend_switch_qemu_xen_traditional_logdirty
-                               (int domid, unsigned enable,
-                                libxl__save_helper_state *shs)
+                               (libxl__egc *egc, int domid, unsigned enable,
+                                libxl__logdirty_switch *lds)
 {
-    libxl__egc *egc = shs->egc;
-    libxl__domain_save_state *dss = shs->caller_state;
-    libxl__logdirty_switch *lds = &dss->logdirty;
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(lds->ao);
     int rc;
     xs_transaction_t t = 0;
     const char *got;
@@ -120,26 +117,34 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty
  out:
     LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
     libxl__xs_transaction_abort(gc, &t);
-    switch_logdirty_done(egc,dss,rc);
+    switch_logdirty_done(egc,lds,rc);
 }
 
 static void domain_suspend_switch_qemu_xen_logdirty
-                               (int domid, unsigned enable,
-                                libxl__save_helper_state *shs)
+                               (libxl__egc *egc, int domid, unsigned enable,
+                                libxl__logdirty_switch *lds)
 {
-    libxl__egc *egc = shs->egc;
-    libxl__domain_save_state *dss = shs->caller_state;
-    STATE_AO_GC(dss->ao);
+    STATE_AO_GC(lds->ao);
     int rc;
 
     rc = libxl__qmp_set_global_dirty_log(gc, domid, enable);
-    if (!rc) {
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
-    } else {
+    if (rc)
         LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
+
+    lds->callback(egc, lds, rc);
+}
+
+static void domain_suspend_switch_qemu_logdirty_done
+                        (libxl__egc *egc, libxl__logdirty_switch *lds, int rc)
+{
+    libxl__domain_save_state *dss = CONTAINER_OF(lds, *dss, logdirty);
+
+    if (rc) {
         dss->rc = rc;
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
-    }
+        libxl__xc_domain_saverestore_async_callback_done(egc,
+                                                         &dss->sws.shs, -1);
+    } else
+        libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
 }
 
 void libxl__domain_suspend_common_switch_qemu_logdirty
@@ -148,42 +153,52 @@ void libxl__domain_suspend_common_switch_qemu_logdirty
     libxl__save_helper_state *shs = user;
     libxl__egc *egc = shs->egc;
     libxl__domain_save_state *dss = shs->caller_state;
-    STATE_AO_GC(dss->ao);
+
+    /* convenience aliases */
+    libxl__logdirty_switch *const lds = &dss->logdirty;
+
+    lds->callback = domain_suspend_switch_qemu_logdirty_done;
+    libxl__domain_common_switch_qemu_logdirty(egc, domid, enable, lds);
+}
+
+void libxl__domain_common_switch_qemu_logdirty(libxl__egc *egc,
+                                               int domid, unsigned enable,
+                                               libxl__logdirty_switch *lds)
+{
+    STATE_AO_GC(lds->ao);
 
     switch (libxl__device_model_version_running(gc, domid)) {
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
-        domain_suspend_switch_qemu_xen_traditional_logdirty(domid, enable, shs);
+        domain_suspend_switch_qemu_xen_traditional_logdirty(egc, domid, enable,
+                                                            lds);
         break;
     case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-        domain_suspend_switch_qemu_xen_logdirty(domid, enable, shs);
+        domain_suspend_switch_qemu_xen_logdirty(egc, domid, enable, lds);
         break;
     case LIBXL_DEVICE_MODEL_VERSION_NONE:
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+        lds->callback(egc, lds, 0);
         break;
     default:
         LOG(ERROR,"logdirty switch failed"
             ", no valid device model version found, abandoning suspend");
-        dss->rc = ERROR_FAIL;
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
+        lds->callback(egc, lds, ERROR_FAIL);
     }
 }
 static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
                                     const struct timeval *requested_abs,
                                     int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
-    STATE_AO_GC(dss->ao);
+    libxl__logdirty_switch *lds = CONTAINER_OF(ev, *lds, timeout);
+    STATE_AO_GC(lds->ao);
     LOG(ERROR,"logdirty switch: wait for device model timed out");
-    switch_logdirty_done(egc,dss,ERROR_FAIL);
+    switch_logdirty_done(egc,lds,ERROR_FAIL);
 }
 
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
                             const char *watch_path, const char *event_path)
 {
-    libxl__domain_save_state *dss =
-        CONTAINER_OF(watch, *dss, logdirty.watch);
-    libxl__logdirty_switch *lds = &dss->logdirty;
-    STATE_AO_GC(dss->ao);
+    libxl__logdirty_switch *lds = CONTAINER_OF(watch, *lds, watch);
+    STATE_AO_GC(lds->ao);
     const char *got;
     xs_transaction_t t = 0;
     int rc;
@@ -229,28 +244,20 @@ static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
     if (rc <= 0) {
         if (rc < 0)
             LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
-        switch_logdirty_done(egc,dss,rc);
+        switch_logdirty_done(egc,lds,rc);
     }
 }
 
 static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_save_state *dss,
+                                 libxl__logdirty_switch *lds,
                                  int rc)
 {
-    STATE_AO_GC(dss->ao);
-    libxl__logdirty_switch *lds = &dss->logdirty;
+    STATE_AO_GC(lds->ao);
 
     libxl__ev_xswatch_deregister(gc, &lds->watch);
     libxl__ev_time_deregister(gc, &lds->timeout);
 
-    int broke;
-    if (rc) {
-        broke = -1;
-        dss->rc = rc;
-    } else {
-        broke = 0;
-    }
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, broke);
+    lds->callback(egc, lds, rc);
 }
 
 /*----- callbacks, called by xc_domain_save -----*/
@@ -346,6 +353,8 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
 
     dss->rc = 0;
     logdirty_init(&dss->logdirty);
+    dss->logdirty.ao = ao;
+
     dsps->ao = ao;
     dsps->domid = domid;
     rc = libxl__domain_suspend_init(egc, dsps);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 4872619..552692f 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3071,6 +3071,11 @@ libxl__stream_write_inuse(const libxl__stream_write_state *stream)
 }
 
 typedef struct libxl__logdirty_switch {
+    /* set by caller of libxl__domain_common_switch_qemu_logdirty */
+    libxl__ao *ao;
+    void (*callback)(libxl__egc *egc, struct libxl__logdirty_switch *lds,
+                     int rc);
+
     const char *cmd;
     const char *cmd_path;
     const char *ret_path;
@@ -3490,6 +3495,9 @@ void libxl__xc_domain_saverestore_async_callback_done(libxl__egc *egc,
 
 _hidden void libxl__domain_suspend_common_switch_qemu_logdirty
                                (int domid, unsigned int enable, void *data);
+_hidden void libxl__domain_common_switch_qemu_logdirty(libxl__egc *egc,
+                                               int domid, unsigned enable,
+                                               libxl__logdirty_switch *lds);
 _hidden int libxl__save_emulator_xenstore_data(libxl__domain_save_state *dss,
                                                char **buf, uint32_t *len);
 _hidden int libxl__restore_emulator_xenstore_data
-- 
2.5.0


* [PATCH v6 10/18] tools/libxl: export logdirty_init
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (8 preceding siblings ...)
  2015-12-30  2:28 ` [PATCH v6 09/18] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty() Wen Congyang
@ 2015-12-30  2:29 ` Wen Congyang
  2016-01-25 19:01   ` Konrad Rzeszutek Wilk
  2015-12-30  2:29 ` [PATCH v6 11/18] tools/libxl: Add back channel to allow migration target send data back Wen Congyang
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:29 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

We need to enable logdirty on secondary, so we export logdirty_init
for internal use. Rename it to libxl__logdirty_init.

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_dom_save.c | 4 ++--
 tools/libxl/libxl_internal.h | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 79e43f1..8e8d280 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -44,7 +44,7 @@ static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
 static void switch_logdirty_done(libxl__egc *egc,
                                  libxl__logdirty_switch *lds, int rc);
 
-static void logdirty_init(libxl__logdirty_switch *lds)
+void libxl__logdirty_init(libxl__logdirty_switch *lds)
 {
     lds->cmd_path = 0;
     libxl__ev_xswatch_init(&lds->watch);
@@ -352,7 +352,7 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
     }
 
     dss->rc = 0;
-    logdirty_init(&dss->logdirty);
+    libxl__logdirty_init(&dss->logdirty);
     dss->logdirty.ao = ao;
 
     dsps->ao = ao;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 552692f..8a429b7 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3083,6 +3083,8 @@ typedef struct libxl__logdirty_switch {
     libxl__ev_time timeout;
 } libxl__logdirty_switch;
 
+_hidden void libxl__logdirty_init(libxl__logdirty_switch *lds);
+
 struct libxl__domain_suspend_state {
     /* set by caller of libxl__domain_suspend_init */
     libxl__ao *ao;
-- 
2.5.0


* [PATCH v6 11/18] tools/libxl: Add back channel to allow migration target send data back
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (9 preceding siblings ...)
  2015-12-30  2:29 ` [PATCH v6 10/18] tools/libxl: export logdirty_init Wen Congyang
@ 2015-12-30  2:29 ` Wen Congyang
  2016-01-25 19:17   ` Konrad Rzeszutek Wilk
  2015-12-30  2:29 ` [PATCH v6 12/18] tools/libx{l, c}: add back channel to libxc Wen Congyang
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:29 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

In COLO mode, the slave needs to send data back to the master, but
io_fd can only be written by the master and only be read by the slave.
Save recv_fd in libxl__domain_save_state and send_fd in
libxl__domain_create_state to provide the back channel.
Extend the libxl_domain_create_restore() API with a send_fd parameter.
Add LIBXL_HAVE_DOMAIN_CREATE_RESTORE_SEND_FD to indicate the API change.
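
The compatibility approach described above can be sketched in isolation.
This is a minimal illustration, not libxl code: restore_new_api(),
restore_compat() and struct restore_args are hypothetical stand-ins that
mimic the shape of the extended and the old entry points, and the feature
macro is defined locally only so that the sketch compiles without the Xen
headers. As in the wrappers in libxl.h, -1 means "no back channel".

```c
/* Stand-in for the feature macro from libxl.h; defined here only so the
 * sketch builds without the Xen headers. */
#define LIBXL_HAVE_DOMAIN_CREATE_RESTORE_SEND_FD 1

/* Hypothetical record of the descriptors handed to the restore path. */
struct restore_args {
    int restore_fd;  /* migration stream, read by the receiver */
    int send_fd;     /* back channel to the sender, -1 means none */
};

/* Hypothetical stand-in for the extended entry point. */
int restore_new_api(struct restore_args *out, int restore_fd, int send_fd)
{
    if (restore_fd < 0)
        return -1;   /* a restore always needs a stream to read */
    out->restore_fd = restore_fd;
    out->send_fd = send_fd;
    return 0;
}

/* Caller-side shim: code written against the old argument list keeps
 * working by passing -1 when the library offers a back channel. */
int restore_compat(struct restore_args *out, int restore_fd)
{
#ifdef LIBXL_HAVE_DOMAIN_CREATE_RESTORE_SEND_FD
    return restore_new_api(out, restore_fd, -1);
#else
    if (restore_fd < 0)
        return -1;
    out->restore_fd = restore_fd;
    out->send_fd = -1;
    return 0;
#endif
}
```

A plain restore from a file would go through restore_compat(), while a
COLO receiver would call restore_new_api() with a real descriptor.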

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
---
 tools/libxl/libxl.c                  |  2 +-
 tools/libxl/libxl.h                  | 30 ++++++++++++++++++++++++++++--
 tools/libxl/libxl_create.c           |  9 +++++----
 tools/libxl/libxl_internal.h         |  2 ++
 tools/libxl/libxl_types.idl          |  1 +
 tools/libxl/xl_cmdimpl.c             |  8 +++++++-
 tools/ocaml/libs/xl/xenlight_stubs.c |  2 +-
 7 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 2faea4d..69c8047 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -872,7 +872,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
     dss->callback = remus_failover_cb;
     dss->domid = domid;
     dss->fd = send_fd;
-    /* TODO do something with recv_fd */
+    dss->recv_fd = recv_fd;
     dss->type = type;
     dss->live = 1;
     dss->debug = 0;
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index a01e448..67a4ad7 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -630,6 +630,15 @@ typedef struct libxl__ctx libxl_ctx;
 #define LIBXL_HAVE_DOMAIN_CREATE_RESTORE_PARAMS 1
 
 /*
+ * LIBXL_HAVE_DOMAIN_CREATE_RESTORE_SEND_FD 1
+ *
+ * If this is defined, libxl_domain_create_restore()'s API has changed to
+ * include a send_fd param which is used for the libxl migration back
+ * channel during COLO FT.
+ */
+#define LIBXL_HAVE_DOMAIN_CREATE_RESTORE_SEND_FD 1
+
+/*
  * LIBXL_HAVE_CREATEINFO_PVH
  * If this is defined, then libxl supports creation of a PVH guest.
  */
@@ -1143,7 +1152,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
                             const libxl_asyncprogress_how *aop_console_how)
                             LIBXL_EXTERNAL_CALLERS_ONLY;
 int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
-                                uint32_t *domid, int restore_fd,
+                                uint32_t *domid, int restore_fd, int send_fd,
                                 const libxl_domain_restore_params *params,
                                 const libxl_asyncop_how *ao_how,
                                 const libxl_asyncprogress_how *aop_console_how)
@@ -1164,7 +1173,7 @@ int static inline libxl_domain_create_restore_0x040200(
     libxl_domain_restore_params_init(&params);
 
     ret = libxl_domain_create_restore(
-        ctx, d_config, domid, restore_fd, &params, ao_how, aop_console_how);
+        ctx, d_config, domid, restore_fd, -1, &params, ao_how, aop_console_how);
 
     libxl_domain_restore_params_dispose(&params);
     return ret;
@@ -1172,6 +1181,23 @@ int static inline libxl_domain_create_restore_0x040200(
 
 #define libxl_domain_create_restore libxl_domain_create_restore_0x040200
 
+#elif defined(LIBXL_API_VERSION) && LIBXL_API_VERSION >= 0x040400 \
+                                 && LIBXL_API_VERSION < 0x040600
+
+int static inline libxl_domain_create_restore_0x040400(
+    libxl_ctx *ctx, libxl_domain_config *d_config,
+    uint32_t *domid, int restore_fd,
+    const libxl_domain_restore_params *params,
+    const libxl_asyncop_how *ao_how,
+    const libxl_asyncprogress_how *aop_console_how)
+    LIBXL_EXTERNAL_CALLERS_ONLY
+{
+    return libxl_domain_create_restore(ctx, d_config, domid, restore_fd,
+                                       -1, params, ao_how, aop_console_how);
+}
+
+#define libxl_domain_create_restore libxl_domain_create_restore_0x040400
+
 #endif
 
 int libxl_domain_soft_reset(libxl_ctx *ctx,
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 8d3896f..8087bcc 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1554,7 +1554,7 @@ static void domain_create_cb(libxl__egc *egc,
                              int rc, uint32_t domid);
 
 static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
-                            uint32_t *domid, int restore_fd,
+                            uint32_t *domid, int restore_fd, int send_fd,
                             const libxl_domain_restore_params *params,
                             const libxl_asyncop_how *ao_how,
                             const libxl_asyncprogress_how *aop_console_how)
@@ -1569,6 +1569,7 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
     libxl_domain_config_init(&cdcs->dcs.guest_config_saved);
     libxl_domain_config_copy(ctx, &cdcs->dcs.guest_config_saved, d_config);
     cdcs->dcs.restore_fd = cdcs->dcs.libxc_fd = restore_fd;
+    cdcs->dcs.send_fd = send_fd;
     if (restore_fd > -1) {
         cdcs->dcs.restore_params = *params;
         rc = libxl__fd_flags_modify_save(gc, cdcs->dcs.restore_fd,
@@ -1747,17 +1748,17 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
                             const libxl_asyncop_how *ao_how,
                             const libxl_asyncprogress_how *aop_console_how)
 {
-    return do_domain_create(ctx, d_config, domid, -1, NULL,
+    return do_domain_create(ctx, d_config, domid, -1, -1, NULL,
                             ao_how, aop_console_how);
 }
 
 int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
-                                uint32_t *domid, int restore_fd,
+                                uint32_t *domid, int restore_fd, int send_fd,
                                 const libxl_domain_restore_params *params,
                                 const libxl_asyncop_how *ao_how,
                                 const libxl_asyncprogress_how *aop_console_how)
 {
-    return do_domain_create(ctx, d_config, domid, restore_fd, params,
+    return do_domain_create(ctx, d_config, domid, restore_fd, send_fd, params,
                             ao_how, aop_console_how);
 }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 8a429b7..99a4acf 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3116,6 +3116,7 @@ struct libxl__domain_save_state {
     uint32_t domid;
     int fd;
     int fdfl; /* original flags on fd */
+    int recv_fd;
     libxl_domain_type type;
     int live;
     int debug;
@@ -3453,6 +3454,7 @@ struct libxl__domain_create_state {
     libxl_domain_config guest_config_saved; /* vanilla config */
     int restore_fd, libxc_fd;
     int restore_fdfl; /* original flags of restore_fd */
+    int send_fd;
     libxl_domain_restore_params restore_params;
     uint32_t domid_soft_reset;
     libxl__domain_create_cb *callback;
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 9aa94be..c5d5d40 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -232,6 +232,7 @@ libxl_hdtype = Enumeration("hdtype", [
 libxl_checkpointed_stream = Enumeration("checkpointed_stream", [
     (0, "NONE"),
     (1, "REMUS"),
+    (2, "COLO"),
     ])
 
 #
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index c1cd696..6580b59 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -159,6 +159,7 @@ struct domain_create {
     char *extra_config; /* extra config string */
     const char *restore_file;
     int migrate_fd; /* -1 means none */
+    int send_fd; /* -1 means none */
     char **migration_domname_r; /* from malloc */
 };
 
@@ -2686,6 +2687,7 @@ static uint32_t create_domain(struct domain_create *dom_info)
     int config_len = 0;
     int restore_fd = -1;
     int restore_fd_to_close = -1;
+    int send_fd = -1;
     const libxl_asyncprogress_how *autoconnect_console_how;
     struct save_file_header hdr;
     uint32_t domid_soft_reset = INVALID_DOMID;
@@ -2703,6 +2705,7 @@ static uint32_t create_domain(struct domain_create *dom_info)
         if (migrate_fd >= 0) {
             restore_source = "<incoming migration stream>";
             restore_fd = migrate_fd;
+            send_fd = dom_info->send_fd;
         } else {
             restore_source = restore_file;
             restore_fd = open(restore_file, O_RDONLY);
@@ -2893,7 +2896,7 @@ start:
 
         ret = libxl_domain_create_restore(ctx, &d_config,
                                           &domid, restore_fd,
-                                          &params,
+                                          send_fd, &params,
                                           0, autoconnect_console_how);
 
         libxl_domain_restore_params_dispose(&params);
@@ -4449,6 +4452,7 @@ static void migrate_receive(int debug, int daemonize, int monitor,
     dom_info.monitor = monitor;
     dom_info.paused = 1;
     dom_info.migrate_fd = recv_fd;
+    dom_info.send_fd = send_fd;
     dom_info.migration_domname_r = &migration_domname;
     dom_info.checkpointed_stream = checkpointed;
 
@@ -4622,6 +4626,7 @@ int main_restore(int argc, char **argv)
     dom_info.config_file = config_file;
     dom_info.restore_file = checkpoint_file;
     dom_info.migrate_fd = -1;
+    dom_info.send_fd = -1;
     dom_info.vnc = vnc;
     dom_info.vncautopass = vncautopass;
     dom_info.console_autoconnect = console_autoconnect;
@@ -5088,6 +5093,7 @@ int main_create(int argc, char **argv)
     dom_info.quiet = quiet;
     dom_info.config_file = filename;
     dom_info.migrate_fd = -1;
+    dom_info.send_fd = -1;
     dom_info.vnc = vnc;
     dom_info.vncautopass = vncautopass;
     dom_info.console_autoconnect = console_autoconnect;
diff --git a/tools/ocaml/libs/xl/xenlight_stubs.c b/tools/ocaml/libs/xl/xenlight_stubs.c
index 4133527..1c52c2a 100644
--- a/tools/ocaml/libs/xl/xenlight_stubs.c
+++ b/tools/ocaml/libs/xl/xenlight_stubs.c
@@ -537,7 +537,7 @@ value stub_libxl_domain_create_restore(value ctx, value domain_config, value par
 	restore_fd = Int_val(Field(params, 0));
 
 	caml_enter_blocking_section();
-	ret = libxl_domain_create_restore(CTX, &c_dconfig, &c_domid, restore_fd,
+	ret = libxl_domain_create_restore(CTX, &c_dconfig, &c_domid, restore_fd, -1,
 		&c_params, ao_how, NULL);
 	caml_leave_blocking_section();
 
-- 
2.5.0


* [PATCH v6 12/18] tools/libx{l, c}: add back channel to libxc
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (10 preceding siblings ...)
  2015-12-30  2:29 ` [PATCH v6 11/18] tools/libxl: Add back channel to allow migration target send data back Wen Congyang
@ 2015-12-30  2:29 ` Wen Congyang
  2016-01-25 19:41   ` Konrad Rzeszutek Wilk
  2015-12-30  2:29 ` [PATCH v6 13/18] tools/libxl: rename remus device to checkpoint device Wen Congyang
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:29 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Ian Campbell, Wen Congyang,
	Gui Jianfeng, Jiang Yunhong, Dong Eddie, Shriram Rajagopalan,
	Ian Jackson, Yang Hongyang

In COLO mode, both VMs are running, and are considered in sync if the
visible network traffic is identical.  After some time, they fall out of
sync.

At this point, the two VMs have definitely diverged.  Let's call the
primary's dirty bitmap set A and the secondary's dirty bitmap set B.

Sets A and B are different.

Under normal migration, the page data for set A will be sent from the
primary to the secondary.

However, the set difference B - A (let's call this C) is out-of-date on
the secondary (with respect to the primary) and will not be sent by the
primary, as it was not memory dirtied by the primary.  The secondary
needs the page data for C to reconstruct an exact copy of the primary at
the checkpoint.

The secondary cannot calculate C as it doesn't know A.  Instead, the
secondary must send B to the primary, at which point the primary
calculates the union of A and B (let's call this D) which is all the
pages dirtied by either the primary or the secondary, and sends all page
data covered by D.

In the general case, D is a superset of both A and B.  Without the
backchannel dirty bitmap, a COLO checkpoint can't reconstruct a valid
copy of the primary.
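
The set arithmetic above can be sketched with plain word arrays. This is
only an illustration: representing a dirty bitmap as an array of uint64_t
words, and the function names dirty_union(), dirty_only_secondary() and
dirty_count(), are assumptions made for the example. The real libxc bitmap
layout and the record that carries B over the back channel are not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* D = A | B: every page dirtied on either side has to be resent. */
void dirty_union(const uint64_t *a, const uint64_t *b,
                 uint64_t *d, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        d[i] = a[i] | b[i];
}

/* C = B - A: pages dirtied only by the secondary.  The primary would
 * never send these on its own, which is why B must travel back. */
void dirty_only_secondary(const uint64_t *a, const uint64_t *b,
                          uint64_t *c, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        c[i] = b[i] & ~a[i];
}

/* Number of set bits, i.e. how many pages a bitmap marks as dirty. */
size_t dirty_count(const uint64_t *bm, size_t nwords)
{
    size_t n = 0;

    for (size_t i = 0; i < nwords; i++)
        for (uint64_t w = bm[i]; w; w &= w - 1)  /* clear lowest set bit */
            n++;
    return n;
}
```

With A = 0x6 (pages 1 and 2) and B = 0xC (pages 2 and 3), C is 0x8
(page 3 only) and D is 0xE (pages 1, 2 and 3).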

The dirty bitmap is transferred on the libxc side, so a back channel
needs to be introduced in libxc.
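
On the helper side, the back channel arrives as one more positional
argument directly after the stream fd. A minimal sketch of reading that
layout follows. The names parse_fd(), parse_helper_fds() and struct
helper_fds are hypothetical, and the sketch validates the numbers with
strtol(), whereas the helper in this patch uses atoi().

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

struct helper_fds {
    int io_fd;    /* main migration stream */
    int back_fd;  /* back channel, -1 when unused */
};

/* Parse one descriptor argument; only -1 or a non-negative int is valid. */
static int parse_fd(const char *s, int *fd)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (errno || end == s || *end != '\0' || v < -1 || v > INT_MAX)
        return -1;
    *fd = (int)v;
    return 0;
}

/* Layout: argv[1] = mode, argv[2] = io_fd, argv[3] = back_fd, and the
 * remaining numeric arguments follow.  Returns 0 on success. */
int parse_helper_fds(int argc, char **argv, struct helper_fds *out)
{
    if (argc < 4)
        return -1;
    if (strcmp(argv[1], "--save-domain") &&
        strcmp(argv[1], "--restore-domain"))
        return -1;
    if (parse_fd(argv[2], &out->io_fd) || parse_fd(argv[3], &out->back_fd))
        return -1;
    return 0;
}
```

Passing -1 for the back channel matches the non-COLO callers in this
series, which have no descriptor to hand over.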

Note: this differs from the paper. We changed the original design to
the current one because of the following concerns:
1. The original design needs extra memory on the Secondary host. When
   there are multiple backups on one host, the memory cost is high.
2. The memory cache code would be another 1k+, which would make the
   review more time-consuming.

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
commit message:
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/include/xenguest.h   |  4 ++--
 tools/libxc/xc_nomigrate.c       |  4 ++--
 tools/libxc/xc_sr_restore.c      |  2 +-
 tools/libxc/xc_sr_save.c         |  2 +-
 tools/libxl/libxl_save_callout.c | 39 ++++++++++++++++++++++++++-------------
 tools/libxl/libxl_save_helper.c  |  8 ++++++--
 6 files changed, 38 insertions(+), 21 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index e8bc46d..bd133af 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -82,7 +82,7 @@ struct save_callbacks {
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
                    uint32_t max_factor, uint32_t flags /* XCFLAGS_xxx */,
                    struct save_callbacks* callbacks, int hvm,
-                   int checkpointed_stream);
+                   int checkpointed_stream, int back_fd);
 
 /* callbacks provided by xc_domain_restore */
 struct restore_callbacks {
@@ -121,7 +121,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
                       unsigned long *console_mfn, domid_t console_domid,
                       unsigned int hvm, unsigned int pae, int superpages,
                       int checkpointed_stream,
-                      struct restore_callbacks *callbacks);
+                      struct restore_callbacks *callbacks, int back_fd);
 
 /**
  * This function will create a domain for a paravirtualized Linux
diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
index c9124df..089f767 100644
--- a/tools/libxc/xc_nomigrate.c
+++ b/tools/libxc/xc_nomigrate.c
@@ -23,7 +23,7 @@
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
                    uint32_t max_factor, uint32_t flags,
                    struct save_callbacks* callbacks, int hvm,
-                   int checkpointed_stream)
+                   int checkpointed_stream, int back_fd)
 {
     errno = ENOSYS;
     return -1;
@@ -35,7 +35,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
                       unsigned long *console_mfn, domid_t console_domid,
                       unsigned int hvm, unsigned int pae, int superpages,
                       int checkpointed_stream,
-                      struct restore_callbacks *callbacks)
+                      struct restore_callbacks *callbacks, int back_fd)
 {
     errno = ENOSYS;
     return -1;
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 05159bb..d4dc501 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -722,7 +722,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
                       unsigned long *console_gfn, domid_t console_domid,
                       unsigned int hvm, unsigned int pae, int superpages,
                       int checkpointed_stream,
-                      struct restore_callbacks *callbacks)
+                      struct restore_callbacks *callbacks, int back_fd)
 {
     struct xc_sr_context ctx =
         {
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index 8ffd71d..a49d083 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -824,7 +824,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
                    uint32_t max_iters, uint32_t max_factor, uint32_t flags,
                    struct save_callbacks* callbacks, int hvm,
-                   int checkpointed_stream)
+                   int checkpointed_stream, int back_fd)
 {
     struct xc_sr_context ctx =
         {
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 416b318..631e3e2 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -27,7 +27,7 @@
  */
 static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
                        const char *mode_arg,
-                       int stream_fd,
+                       int stream_fd, int back_fd,
                        const int *preserve_fds, int num_preserve_fds,
                        const unsigned long *argnums, int num_argnums);
 
@@ -50,6 +50,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
     /* Convenience aliases */
     const uint32_t domid = dcs->guest_domid;
     const int restore_fd = dcs->libxc_fd;
+    const int send_fd = dcs->send_fd;
     libxl__domain_build_state *const state = &dcs->build_state;
 
     unsigned cbflags =
@@ -71,7 +72,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
     shs->caller_state = dcs;
     shs->need_results = 1;
 
-    run_helper(egc, shs, "--restore-domain", restore_fd, 0, 0,
+    run_helper(egc, shs, "--restore-domain", restore_fd, send_fd, 0, 0,
                argnums, ARRAY_SIZE(argnums));
 }
 
@@ -95,7 +96,7 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_save_state *dss,
     shs->caller_state = dss;
     shs->need_results = 0;
 
-    run_helper(egc, shs, "--save-domain", dss->fd,
+    run_helper(egc, shs, "--save-domain", dss->fd, dss->recv_fd,
                NULL, 0,
                argnums, ARRAY_SIZE(argnums));
     return;
@@ -118,14 +119,29 @@ void libxl__save_helper_init(libxl__save_helper_state *shs)
 }
 
 /*----- helper execution -----*/
+static int dup_fd_helper(libxl__gc *gc, int fd, const char *what)
+{
+    int dup_fd = fd;
+
+    if (fd <= 2) {
+        dup_fd = dup(fd);
+        if (dup_fd < 0) {
+            LOGE(ERROR,"dup %s", what);
+            exit(-1);
+        }
+    }
+    libxl_fd_set_cloexec(CTX, dup_fd, 0);
+
+    return dup_fd;
+}
 
 static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
-                       const char *mode_arg, int stream_fd,
+                       const char *mode_arg, int stream_fd, int back_fd,
                        const int *preserve_fds, int num_preserve_fds,
                        const unsigned long *argnums, int num_argnums)
 {
     STATE_AO_GC(shs->ao);
-    const char *args[4 + num_argnums];
+    const char *args[5 + num_argnums];
     const char **arg = args;
     int i, rc;
 
@@ -153,6 +169,7 @@ static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
     *arg++ = getenv("LIBXL_SAVE_HELPER") ?: LIBEXEC_BIN "/" "libxl-save-helper";
     *arg++ = mode_arg;
     const char **stream_fd_arg = arg++;
+    const char **back_fd_arg = arg++;
     for (i=0; i<num_argnums; i++)
         *arg++ = GCSPRINTF("%lu", argnums[i]);
     *arg++ = 0;
@@ -177,16 +194,12 @@ static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
 
     pid_t pid = libxl__ev_child_fork(gc, &shs->child, helper_exited);
     if (!pid) {
-        if (stream_fd <= 2) {
-            stream_fd = dup(stream_fd);
-            if (stream_fd < 0) {
-                LOGE(ERROR,"dup migration stream fd");
-                exit(-1);
-            }
-        }
-        libxl_fd_set_cloexec(CTX, stream_fd, 0);
+        stream_fd = dup_fd_helper(gc, stream_fd, "migration stream fd");
         *stream_fd_arg = GCSPRINTF("%d", stream_fd);
 
+        back_fd = dup_fd_helper(gc, back_fd, "migration back channel fd");
+        *back_fd_arg = GCSPRINTF("%d", back_fd);
+
         for (i=0; i<num_preserve_fds; i++)
             if (preserve_fds[i] >= 0) {
                 assert(preserve_fds[i] > 2);
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 6bdcf13..9bdcf41 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -238,6 +238,7 @@ static struct restore_callbacks helper_restore_callbacks;
 int main(int argc, char **argv)
 {
     int r;
+    int back_fd;
 
 #define NEXTARG (++argv, assert(*argv), *argv)
 
@@ -247,6 +248,7 @@ int main(int argc, char **argv)
     if (!strcmp(mode,"--save-domain")) {
 
         io_fd =                    atoi(NEXTARG);
+        back_fd =                  atoi(NEXTARG);
         uint32_t dom =             strtoul(NEXTARG,0,10);
         uint32_t max_iters =       strtoul(NEXTARG,0,10);
         uint32_t max_factor =      strtoul(NEXTARG,0,10);
@@ -262,12 +264,14 @@ int main(int argc, char **argv)
         setup_signals(save_signal_handler);
 
         r = xc_domain_save(xch, io_fd, dom, max_iters, max_factor, flags,
-                           &helper_save_callbacks, hvm, checkpointed_stream);
+                           &helper_save_callbacks, hvm, checkpointed_stream,
+                           back_fd);
         complete(r);
 
     } else if (!strcmp(mode,"--restore-domain")) {
 
         io_fd =                    atoi(NEXTARG);
+        back_fd =                  atoi(NEXTARG);
         uint32_t dom =             strtoul(NEXTARG,0,10);
         unsigned store_evtchn =    strtoul(NEXTARG,0,10);
         domid_t store_domid =      strtoul(NEXTARG,0,10);
@@ -292,7 +296,7 @@ int main(int argc, char **argv)
                               store_domid, console_evtchn, &console_mfn,
                               console_domid, hvm, pae, superpages,
                               checkpointed,
-                              &helper_restore_callbacks);
+                              &helper_restore_callbacks, back_fd);
         helper_stub_restore_results(store_mfn,console_mfn,0);
         complete(r);
 
-- 
2.5.0


* [PATCH v6 13/18] tools/libxl: rename remus device to checkpoint device
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (11 preceding siblings ...)
  2015-12-30  2:29 ` [PATCH v6 12/18] tools/libx{l, c}: add back channel to libxc Wen Congyang
@ 2015-12-30  2:29 ` Wen Congyang
  2016-01-25 19:42   ` Konrad Rzeszutek Wilk
  2015-12-30  2:29 ` [PATCH v6 14/18] tools/libxl: fix backword compatibility after the automatic renaming Wen Congyang
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:29 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

This patch is auto-generated by the following commands:
 1. git mv tools/libxl/libxl_remus_device.c tools/libxl/libxl_checkpoint_device.c
 2. perl -pi -e 's/libxl_remus_device/libxl_checkpoint_device/g' tools/libxl/Makefile
 3. perl -pi -e 's/\blibxl__remus_devices/libxl__checkpoint_devices/g' tools/libxl/*.[ch]
 4. perl -pi -e 's/\blibxl__remus_device\b/libxl__checkpoint_device/g' tools/libxl/*.[ch]
 5. perl -pi -e 's/\blibxl__remus_device_instance_ops\b/libxl__checkpoint_device_instance_ops/g' tools/libxl/*.[ch]
 6. perl -pi -e 's/\blibxl__remus_callback\b/libxl__checkpoint_callback/g' tools/libxl/*.[ch]
 7. perl -pi -e 's/\bremus_device_init\b/checkpoint_device_init/g' tools/libxl/*.[ch]
 8. perl -pi -e 's/\bremus_devices_setup\b/checkpoint_devices_setup/g' tools/libxl/*.[ch]
 9. perl -pi -e 's/\bdefine_remus_checkpoint_api\b/define_checkpoint_api/g' tools/libxl/*.[ch]
10. perl -pi -e 's/\brds\b/cds/g' tools/libxl/*.[ch]
11. perl -pi -e 's/REMUS_DEVICE/CHECKPOINT_DEVICE/g' tools/libxl/*.[ch] tools/libxl/*.idl
12. perl -pi -e 's/REMUS_DEVOPS/CHECKPOINT_DEVOPS/g' tools/libxl/*.[ch] tools/libxl/*.idl
13. perl -pi -e 's/\bremus\b/checkpoint/g' tools/libxl/libxl_checkpoint_device.[ch]
14. perl -pi -e 's/\bremus device/checkpoint device/g' tools/libxl/libxl_internal.h
15. perl -pi -e 's/\bRemus device/checkpoint device/g' tools/libxl/libxl_internal.h
16. perl -pi -e 's/\bremus abstract/checkpoint abstract/g' tools/libxl/libxl_internal.h
17. perl -pi -e 's/\bremus invocation/checkpoint invocation/g' tools/libxl/libxl_internal.h
18. perl -pi -e 's/\blibxl__remus_device_\(/libxl__checkpoint_device_(/g' tools/libxl/libxl_internal.h

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
---
 tools/libxl/Makefile                               |   2 +-
 ...xl_remus_device.c => libxl_checkpoint_device.c} | 198 ++++++++++-----------
 tools/libxl/libxl_internal.h                       | 112 ++++++------
 tools/libxl/libxl_netbuffer.c                      | 108 +++++------
 tools/libxl/libxl_nonetbuffer.c                    |  10 +-
 tools/libxl/libxl_remus.c                          |  76 ++++----
 tools/libxl/libxl_remus_disk_drbd.c                |  52 +++---
 tools/libxl/libxl_types.idl                        |   4 +-
 8 files changed, 281 insertions(+), 281 deletions(-)
 rename tools/libxl/{libxl_remus_device.c => libxl_checkpoint_device.c} (52%)

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index b476012..d075a30 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -62,7 +62,7 @@ else
 LIBXL_OBJS-y += libxl_no_convert_callout.o
 endif
 
-LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o
+LIBXL_OBJS-y += libxl_remus.o libxl_checkpoint_device.o libxl_remus_disk_drbd.o
 
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl_remus_device.c b/tools/libxl/libxl_checkpoint_device.c
similarity index 52%
rename from tools/libxl/libxl_remus_device.c
rename to tools/libxl/libxl_checkpoint_device.c
index a6cb7f6..109cd23 100644
--- a/tools/libxl/libxl_remus_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -17,9 +17,9 @@
 
 #include "libxl_internal.h"
 
-extern const libxl__remus_device_instance_ops remus_device_nic;
-extern const libxl__remus_device_instance_ops remus_device_drbd_disk;
-static const libxl__remus_device_instance_ops *remus_ops[] = {
+extern const libxl__checkpoint_device_instance_ops remus_device_nic;
+extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
+static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
     &remus_device_nic,
     &remus_device_drbd_disk,
     NULL,
@@ -27,18 +27,18 @@ static const libxl__remus_device_instance_ops *remus_ops[] = {
 
 /*----- helper functions -----*/
 
-static int init_device_subkind(libxl__remus_devices_state *rds)
+static int init_device_subkind(libxl__checkpoint_devices_state *cds)
 {
     /* init device subkind-specific state in the libxl ctx */
     int rc;
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     if (libxl__netbuffer_enabled(gc)) {
-        rc = init_subkind_nic(rds);
+        rc = init_subkind_nic(cds);
         if (rc) goto out;
     }
 
-    rc = init_subkind_drbd_disk(rds);
+    rc = init_subkind_drbd_disk(cds);
     if (rc) goto out;
 
     rc = 0;
@@ -46,15 +46,15 @@ out:
     return rc;
 }
 
-static void cleanup_device_subkind(libxl__remus_devices_state *rds)
+static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
 {
     /* cleanup device subkind-specific state in the libxl ctx */
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     if (libxl__netbuffer_enabled(gc))
-        cleanup_subkind_nic(rds);
+        cleanup_subkind_nic(cds);
 
-    cleanup_subkind_drbd_disk(rds);
+    cleanup_subkind_drbd_disk(cds);
 }
 
 /*----- setup() and teardown() -----*/
@@ -70,103 +70,103 @@ static void devices_teardown_cb(libxl__egc *egc,
                                 libxl__multidev *multidev,
                                 int rc);
 
-/* remus device setup and teardown */
+/* checkpoint device setup and teardown */
 
-static libxl__remus_device* remus_device_init(libxl__egc *egc,
-                                              libxl__remus_devices_state *rds,
+static libxl__checkpoint_device* checkpoint_device_init(libxl__egc *egc,
+                                              libxl__checkpoint_devices_state *cds,
                                               libxl__device_kind kind,
                                               void *libxl_dev)
 {
-    libxl__remus_device *dev = NULL;
+    libxl__checkpoint_device *dev = NULL;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
     GCNEW(dev);
     dev->backend_dev = libxl_dev;
     dev->kind = kind;
-    dev->rds = rds;
+    dev->cds = cds;
 
     return dev;
 }
 
-static void remus_devices_setup(libxl__egc *egc,
-                                libxl__remus_devices_state *rds);
+static void checkpoint_devices_setup(libxl__egc *egc,
+                                libxl__checkpoint_devices_state *cds);
 
-void libxl__remus_devices_setup(libxl__egc *egc, libxl__remus_devices_state *rds)
+void libxl__checkpoint_devices_setup(libxl__egc *egc, libxl__checkpoint_devices_state *cds)
 {
     int i, rc;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
-    rc = init_device_subkind(rds);
+    rc = init_device_subkind(cds);
     if (rc)
         goto out;
 
-    rds->num_devices = 0;
-    rds->num_nics = 0;
-    rds->num_disks = 0;
+    cds->num_devices = 0;
+    cds->num_nics = 0;
+    cds->num_disks = 0;
 
-    if (rds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VIF))
-        rds->nics = libxl_device_nic_list(CTX, rds->domid, &rds->num_nics);
+    if (cds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VIF))
+        cds->nics = libxl_device_nic_list(CTX, cds->domid, &cds->num_nics);
 
-    if (rds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VBD))
-        rds->disks = libxl_device_disk_list(CTX, rds->domid, &rds->num_disks);
+    if (cds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VBD))
+        cds->disks = libxl_device_disk_list(CTX, cds->domid, &cds->num_disks);
 
-    if (rds->num_nics == 0 && rds->num_disks == 0)
+    if (cds->num_nics == 0 && cds->num_disks == 0)
         goto out;
 
-    GCNEW_ARRAY(rds->devs, rds->num_nics + rds->num_disks);
+    GCNEW_ARRAY(cds->devs, cds->num_nics + cds->num_disks);
 
-    for (i = 0; i < rds->num_nics; i++) {
-        rds->devs[rds->num_devices++] = remus_device_init(egc, rds,
+    for (i = 0; i < cds->num_nics; i++) {
+        cds->devs[cds->num_devices++] = checkpoint_device_init(egc, cds,
                                                 LIBXL__DEVICE_KIND_VIF,
-                                                &rds->nics[i]);
+                                                &cds->nics[i]);
     }
 
-    for (i = 0; i < rds->num_disks; i++) {
-        rds->devs[rds->num_devices++] = remus_device_init(egc, rds,
+    for (i = 0; i < cds->num_disks; i++) {
+        cds->devs[cds->num_devices++] = checkpoint_device_init(egc, cds,
                                                 LIBXL__DEVICE_KIND_VBD,
-                                                &rds->disks[i]);
+                                                &cds->disks[i]);
     }
 
-    remus_devices_setup(egc, rds);
+    checkpoint_devices_setup(egc, cds);
 
     return;
 
 out:
-    rds->callback(egc, rds, rc);
+    cds->callback(egc, cds, rc);
 }
 
-static void remus_devices_setup(libxl__egc *egc,
-                                libxl__remus_devices_state *rds)
+static void checkpoint_devices_setup(libxl__egc *egc,
+                                     libxl__checkpoint_devices_state *cds)
 {
     int i, rc;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
-    libxl__multidev_begin(ao, &rds->multidev);
-    rds->multidev.callback = all_devices_setup_cb;
-    for (i = 0; i < rds->num_devices; i++) {
-        libxl__remus_device *dev = rds->devs[i];
+    libxl__multidev_begin(ao, &cds->multidev);
+    cds->multidev.callback = all_devices_setup_cb;
+    for (i = 0; i < cds->num_devices; i++) {
+        libxl__checkpoint_device *dev = cds->devs[i];
         dev->ops_index = -1;
-        libxl__multidev_prepare_with_aodev(&rds->multidev, &dev->aodev);
+        libxl__multidev_prepare_with_aodev(&cds->multidev, &dev->aodev);
 
-        dev->aodev.rc = ERROR_REMUS_DEVICE_NOT_SUPPORTED;
+        dev->aodev.rc = ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED;
         dev->aodev.callback = device_setup_iterate;
         device_setup_iterate(egc,&dev->aodev);
     }
 
     rc = 0;
-    libxl__multidev_prepared(egc, &rds->multidev, rc);
+    libxl__multidev_prepared(egc, &cds->multidev, rc);
 }
 
 
 static void device_setup_iterate(libxl__egc *egc, libxl__ao_device *aodev)
 {
-    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     EGC_GC;
 
-    if (aodev->rc != ERROR_REMUS_DEVICE_NOT_SUPPORTED &&
-        aodev->rc != ERROR_REMUS_DEVOPS_DOES_NOT_MATCH)
+    if (aodev->rc != ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED &&
+        aodev->rc != ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH)
         /* might be success or disaster */
         goto out;
 
@@ -186,16 +186,16 @@ static void device_setup_iterate(libxl__egc *egc, libxl__ao_device *aodev)
                 domid = disk->backend_domid;
                 devid = libxl__device_disk_dev_number(disk->vdev, NULL, NULL);
             } else {
-                LOG(ERROR,"device kind not handled by remus: %s",
+                LOG(ERROR,"device kind not handled by checkpoint: %s",
                     libxl__device_kind_to_string(dev->kind));
                 aodev->rc = ERROR_FAIL;
                 goto out;
             }
-            LOG(ERROR,"device not handled by remus"
+            LOG(ERROR,"device not handled by checkpoint"
                 " (device=%s:%"PRId32"/%"PRId32")",
                 libxl__device_kind_to_string(dev->kind),
                 domid, devid);
-            aodev->rc = ERROR_REMUS_DEVICE_NOT_SUPPORTED;
+            aodev->rc = ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED;
             goto out;
         }
     } while (dev->ops->kind != dev->kind);
@@ -216,32 +216,32 @@ static void all_devices_setup_cb(libxl__egc *egc,
     STATE_AO_GC(multidev->ao);
 
     /* Convenience aliases */
-    libxl__remus_devices_state *const rds =
-                            CONTAINER_OF(multidev, *rds, multidev);
+    libxl__checkpoint_devices_state *const cds =
+                            CONTAINER_OF(multidev, *cds, multidev);
 
-    rds->callback(egc, rds, rc);
+    cds->callback(egc, cds, rc);
 }
 
-void libxl__remus_devices_teardown(libxl__egc *egc,
-                                   libxl__remus_devices_state *rds)
+void libxl__checkpoint_devices_teardown(libxl__egc *egc,
+                                        libxl__checkpoint_devices_state *cds)
 {
     int i;
-    libxl__remus_device *dev;
+    libxl__checkpoint_device *dev;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
-    libxl__multidev_begin(ao, &rds->multidev);
-    rds->multidev.callback = devices_teardown_cb;
-    for (i = 0; i < rds->num_devices; i++) {
-        dev = rds->devs[i];
+    libxl__multidev_begin(ao, &cds->multidev);
+    cds->multidev.callback = devices_teardown_cb;
+    for (i = 0; i < cds->num_devices; i++) {
+        dev = cds->devs[i];
         if (!dev->ops || !dev->matched)
             continue;
 
-        libxl__multidev_prepare_with_aodev(&rds->multidev, &dev->aodev);
+        libxl__multidev_prepare_with_aodev(&cds->multidev, &dev->aodev);
         dev->ops->teardown(egc,dev);
     }
 
-    libxl__multidev_prepared(egc, &rds->multidev, 0);
+    libxl__multidev_prepared(egc, &cds->multidev, 0);
 }
 
 static void devices_teardown_cb(libxl__egc *egc,
@@ -253,26 +253,26 @@ static void devices_teardown_cb(libxl__egc *egc,
     STATE_AO_GC(multidev->ao);
 
     /* Convenience aliases */
-    libxl__remus_devices_state *const rds =
-                            CONTAINER_OF(multidev, *rds, multidev);
+    libxl__checkpoint_devices_state *const cds =
+                            CONTAINER_OF(multidev, *cds, multidev);
 
     /* clean nic */
-    for (i = 0; i < rds->num_nics; i++)
-        libxl_device_nic_dispose(&rds->nics[i]);
-    free(rds->nics);
-    rds->nics = NULL;
-    rds->num_nics = 0;
+    for (i = 0; i < cds->num_nics; i++)
+        libxl_device_nic_dispose(&cds->nics[i]);
+    free(cds->nics);
+    cds->nics = NULL;
+    cds->num_nics = 0;
 
     /* clean disk */
-    for (i = 0; i < rds->num_disks; i++)
-        libxl_device_disk_dispose(&rds->disks[i]);
-    free(rds->disks);
-    rds->disks = NULL;
-    rds->num_disks = 0;
+    for (i = 0; i < cds->num_disks; i++)
+        libxl_device_disk_dispose(&cds->disks[i]);
+    free(cds->disks);
+    cds->disks = NULL;
+    cds->num_disks = 0;
 
-    cleanup_device_subkind(rds);
+    cleanup_device_subkind(cds);
 
-    rds->callback(egc, rds, rc);
+    cds->callback(egc, cds, rc);
 }
 
 /*----- checkpointing APIs -----*/
@@ -285,33 +285,33 @@ static void devices_checkpoint_cb(libxl__egc *egc,
 
 /* API implementations */
 
-#define define_remus_checkpoint_api(api)                                \
-void libxl__remus_devices_##api(libxl__egc *egc,                        \
-                                libxl__remus_devices_state *rds)        \
+#define define_checkpoint_api(api)                                      \
+void libxl__checkpoint_devices_##api(libxl__egc *egc,                   \
+                                libxl__checkpoint_devices_state *cds)   \
 {                                                                       \
     int i;                                                              \
-    libxl__remus_device *dev;                                           \
+    libxl__checkpoint_device *dev;                                      \
                                                                         \
-    STATE_AO_GC(rds->ao);                                               \
+    STATE_AO_GC(cds->ao);                                               \
                                                                         \
-    libxl__multidev_begin(ao, &rds->multidev);                          \
-    rds->multidev.callback = devices_checkpoint_cb;                     \
-    for (i = 0; i < rds->num_devices; i++) {                            \
-        dev = rds->devs[i];                                             \
+    libxl__multidev_begin(ao, &cds->multidev);                          \
+    cds->multidev.callback = devices_checkpoint_cb;                     \
+    for (i = 0; i < cds->num_devices; i++) {                            \
+        dev = cds->devs[i];                                             \
         if (!dev->matched || !dev->ops->api)                            \
             continue;                                                   \
-        libxl__multidev_prepare_with_aodev(&rds->multidev, &dev->aodev);\
+        libxl__multidev_prepare_with_aodev(&cds->multidev, &dev->aodev);\
         dev->ops->api(egc,dev);                                         \
     }                                                                   \
                                                                         \
-    libxl__multidev_prepared(egc, &rds->multidev, 0);                   \
+    libxl__multidev_prepared(egc, &cds->multidev, 0);                   \
 }
 
-define_remus_checkpoint_api(postsuspend);
+define_checkpoint_api(postsuspend);
 
-define_remus_checkpoint_api(preresume);
+define_checkpoint_api(preresume);
 
-define_remus_checkpoint_api(commit);
+define_checkpoint_api(commit);
 
 static void devices_checkpoint_cb(libxl__egc *egc,
                                   libxl__multidev *multidev,
@@ -320,8 +320,8 @@ static void devices_checkpoint_cb(libxl__egc *egc,
     STATE_AO_GC(multidev->ao);
 
     /* Convenience aliases */
-    libxl__remus_devices_state *const rds =
-                            CONTAINER_OF(multidev, *rds, multidev);
+    libxl__checkpoint_devices_state *const cds =
+                            CONTAINER_OF(multidev, *cds, multidev);
 
-    rds->callback(egc, rds, rc);
+    cds->callback(egc, cds, rc);
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 99a4acf..7f80ec5 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2794,9 +2794,9 @@ typedef struct libxl__save_helper_state {
                       * marshalling and xc callback functions */
 } libxl__save_helper_state;
 
-/*----- remus device related state structure -----*/
+/*----- checkpoint device related state structure -----*/
 /*
- * The abstract Remus device layer exposes a common
+ * The abstract checkpoint device layer exposes a common
  * set of API to [external] libxl for manipulating devices attached to
  * a guest protected by Remus. The device layer also exposes a set of
  * [internal] interfaces that every device type must implement.
@@ -2804,34 +2804,34 @@ typedef struct libxl__save_helper_state {
  * The following API are exposed to libxl:
  *
  * One-time configuration operations:
- *  +libxl__remus_devices_setup
+ *  +libxl__checkpoint_devices_setup
  *    > Enable output buffering for NICs, setup disk replication, etc.
- *  +libxl__remus_devices_teardown
+ *  +libxl__checkpoint_devices_teardown
  *    > Disable output buffering and disk replication; teardown any
  *       associated external setups like qdiscs for NICs.
  *
  * Operations executed every checkpoint (in order of invocation):
- *  +libxl__remus_devices_postsuspend
- *  +libxl__remus_devices_preresume
- *  +libxl__remus_devices_commit
+ *  +libxl__checkpoint_devices_postsuspend
+ *  +libxl__checkpoint_devices_preresume
+ *  +libxl__checkpoint_devices_commit
  *
  * Each device type needs to implement the interfaces specified in
- * the libxl__remus_device_instance_ops if it wishes to support Remus.
+ * the libxl__checkpoint_device_instance_ops if it wishes to support Remus.
  *
- * The high-level control flow through the Remus device layer is shown below:
+ * The high-level control flow through the checkpoint device layer is shown below:
  *
  * xl remus
  *  |->  libxl_domain_remus_start
- *    |-> libxl__remus_devices_setup
- *      |-> Per-checkpoint libxl__remus_devices_[postsuspend,preresume,commit]
+ *    |-> libxl__checkpoint_devices_setup
+ *      |-> Per-checkpoint libxl__checkpoint_devices_[postsuspend,preresume,commit]
  *        ...
  *        |-> On backup failure, network error or other internal errors:
- *            libxl__remus_devices_teardown
+ *            libxl__checkpoint_devices_teardown
  */
 
-typedef struct libxl__remus_device libxl__remus_device;
-typedef struct libxl__remus_devices_state libxl__remus_devices_state;
-typedef struct libxl__remus_device_instance_ops libxl__remus_device_instance_ops;
+typedef struct libxl__checkpoint_device libxl__checkpoint_device;
+typedef struct libxl__checkpoint_devices_state libxl__checkpoint_devices_state;
+typedef struct libxl__checkpoint_device_instance_ops libxl__checkpoint_device_instance_ops;
 
 /*
  * Interfaces to be implemented by every device subkind that wishes to
@@ -2841,7 +2841,7 @@ typedef struct libxl__remus_device_instance_ops libxl__remus_device_instance_ops
  * synchronous and call dev->aodev.callback directly (as the last
  * thing they do).
  */
-struct libxl__remus_device_instance_ops {
+struct libxl__checkpoint_device_instance_ops {
     /* the device kind this ops belongs to... */
     libxl__device_kind kind;
 
@@ -2852,12 +2852,12 @@ struct libxl__remus_device_instance_ops {
      * Asynchronous.
      */
 
-    void (*postsuspend)(libxl__egc *egc, libxl__remus_device *dev);
-    void (*preresume)(libxl__egc *egc, libxl__remus_device *dev);
-    void (*commit)(libxl__egc *egc, libxl__remus_device *dev);
+    void (*postsuspend)(libxl__egc *egc, libxl__checkpoint_device *dev);
+    void (*preresume)(libxl__egc *egc, libxl__checkpoint_device *dev);
+    void (*commit)(libxl__egc *egc, libxl__checkpoint_device *dev);
 
     /*
-     * setup() and teardown() are refer to the actual remus device.
+     * setup() and teardown() refer to the actual checkpoint device.
      * Asynchronous.
      * teardown is called even if setup fails.
      */
@@ -2866,45 +2866,45 @@ struct libxl__remus_device_instance_ops {
      * device. If matched, the device will then be managed with this set of
      * subkind operations.
      * Yields 0 if the device successfully set up.
-     * REMUS_DEVOPS_DOES_NOT_MATCH if the ops does not match the device.
+     * CHECKPOINT_DEVOPS_DOES_NOT_MATCH if the ops does not match the device.
      * any other rc indicates failure.
      */
-    void (*setup)(libxl__egc *egc, libxl__remus_device *dev);
-    void (*teardown)(libxl__egc *egc, libxl__remus_device *dev);
+    void (*setup)(libxl__egc *egc, libxl__checkpoint_device *dev);
+    void (*teardown)(libxl__egc *egc, libxl__checkpoint_device *dev);
 };
 
-int init_subkind_nic(libxl__remus_devices_state *rds);
-void cleanup_subkind_nic(libxl__remus_devices_state *rds);
-int init_subkind_drbd_disk(libxl__remus_devices_state *rds);
-void cleanup_subkind_drbd_disk(libxl__remus_devices_state *rds);
+int init_subkind_nic(libxl__checkpoint_devices_state *cds);
+void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds);
+int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
+void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
 
-typedef void libxl__remus_callback(libxl__egc *,
-                                   libxl__remus_devices_state *, int rc);
+typedef void libxl__checkpoint_callback(libxl__egc *,
+                                   libxl__checkpoint_devices_state *, int rc);
 
 /*
- * State associated with a remus invocation, including parameters
- * passed to the remus abstract device layer by the remus
+ * State associated with a checkpoint invocation, including parameters
+ * passed to the checkpoint abstract device layer by the remus
  * save/restore machinery.
  */
-struct libxl__remus_devices_state {
-    /*---- must be set by caller of libxl__remus_device_(setup|teardown) ----*/
+struct libxl__checkpoint_devices_state {
+    /*---- must be set by caller of libxl__checkpoint_devices_(setup|teardown) ----*/
 
     libxl__ao *ao;
     uint32_t domid;
-    libxl__remus_callback *callback;
+    libxl__checkpoint_callback *callback;
     int device_kind_flags;
 
     /*----- private for abstract layer only -----*/
 
     int num_devices;
     /*
-     * this array is allocated before setup the remus devices by the
-     * remus abstract layer.
-     * devs may be NULL, means there's no remus devices that has been set up.
+     * this array is allocated by the checkpoint abstract layer before
+     * the checkpoint devices are set up.
+     * devs may be NULL, which means no checkpoint devices have been set up.
      * the size of this array is 'num_devices', which is the total number
      * of libxl nic devices and disk devices(num_nics + num_disks).
      */
-    libxl__remus_device **devs;
+    libxl__checkpoint_device **devs;
 
     libxl_device_nic *nics;
     int num_nics;
@@ -2926,20 +2926,20 @@ struct libxl__remus_devices_state {
 
 /*
  * Information about a single device being handled by remus.
- * Allocated by the remus abstract layer.
+ * Allocated by the checkpoint abstract layer.
  */
-struct libxl__remus_device {
+struct libxl__checkpoint_device {
     /*----- shared between abstract and concrete layers -----*/
     /*
      * if this is true, that means the subkind ops match the device
      */
     bool matched;
 
-    /*----- set by remus device abstruct layer -----*/
-    /* libxl__device_* which this remus device related to */
+    /*----- set by checkpoint device abstract layer -----*/
+    /* libxl__device_* which this checkpoint device is related to */
     const void *backend_dev;
     libxl__device_kind kind;
-    libxl__remus_devices_state *rds;
+    libxl__checkpoint_devices_state *cds;
     libxl__ao_device aodev;
 
     /*----- private for abstract layer only -----*/
@@ -2950,7 +2950,7 @@ struct libxl__remus_device {
      * individual devices.
      */
     int ops_index;
-    const libxl__remus_device_instance_ops *ops;
+    const libxl__checkpoint_device_instance_ops *ops;
 
     /*----- private for concrete (device-specific) layer -----*/
 
@@ -2958,17 +2958,17 @@ struct libxl__remus_device {
     void *concrete_data;
 };
 
-/* the following 5 APIs are async ops, call rds->callback when done */
-_hidden void libxl__remus_devices_setup(libxl__egc *egc,
-                                        libxl__remus_devices_state *rds);
-_hidden void libxl__remus_devices_teardown(libxl__egc *egc,
-                                           libxl__remus_devices_state *rds);
-_hidden void libxl__remus_devices_postsuspend(libxl__egc *egc,
-                                              libxl__remus_devices_state *rds);
-_hidden void libxl__remus_devices_preresume(libxl__egc *egc,
-                                            libxl__remus_devices_state *rds);
-_hidden void libxl__remus_devices_commit(libxl__egc *egc,
-                                         libxl__remus_devices_state *rds);
+/* the following 5 APIs are async ops, call cds->callback when done */
+_hidden void libxl__checkpoint_devices_setup(libxl__egc *egc,
+                                        libxl__checkpoint_devices_state *cds);
+_hidden void libxl__checkpoint_devices_teardown(libxl__egc *egc,
+                                           libxl__checkpoint_devices_state *cds);
+_hidden void libxl__checkpoint_devices_postsuspend(libxl__egc *egc,
+                                              libxl__checkpoint_devices_state *cds);
+_hidden void libxl__checkpoint_devices_preresume(libxl__egc *egc,
+                                            libxl__checkpoint_devices_state *cds);
+_hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
+                                         libxl__checkpoint_devices_state *cds);
 _hidden int libxl__netbuffer_enabled(libxl__gc *gc);
 
 /*----- Legacy conversion helper -----*/
@@ -3127,7 +3127,7 @@ struct libxl__domain_save_state {
     int hvm;
     int xcflags;
     libxl__domain_suspend_state dsps;
-    libxl__remus_devices_state rds;
+    libxl__checkpoint_devices_state cds;
     libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
     int interval; /* checkpoint interval (for Remus) */
     libxl__stream_write_state sws;
diff --git a/tools/libxl/libxl_netbuffer.c b/tools/libxl/libxl_netbuffer.c
index c245a4e..33c2a42 100644
--- a/tools/libxl/libxl_netbuffer.c
+++ b/tools/libxl/libxl_netbuffer.c
@@ -38,21 +38,21 @@ int libxl__netbuffer_enabled(libxl__gc *gc)
     return 1;
 }
 
-int init_subkind_nic(libxl__remus_devices_state *rds)
+int init_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
     int rc, ret;
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
-    rds->nlsock = nl_socket_alloc();
-    if (!rds->nlsock) {
+    cds->nlsock = nl_socket_alloc();
+    if (!cds->nlsock) {
         LOG(ERROR, "cannot allocate nl socket");
         rc = ERROR_FAIL;
         goto out;
     }
 
-    ret = nl_connect(rds->nlsock, NETLINK_ROUTE);
+    ret = nl_connect(cds->nlsock, NETLINK_ROUTE);
     if (ret) {
         LOG(ERROR, "failed to open netlink socket: %s",
             nl_geterror(ret));
@@ -61,7 +61,7 @@ int init_subkind_nic(libxl__remus_devices_state *rds)
     }
 
     /* get list of all qdiscs installed on network devs. */
-    ret = rtnl_qdisc_alloc_cache(rds->nlsock, &rds->qdisc_cache);
+    ret = rtnl_qdisc_alloc_cache(cds->nlsock, &cds->qdisc_cache);
     if (ret) {
         LOG(ERROR, "failed to allocate qdisc cache: %s",
             nl_geterror(ret));
@@ -70,9 +70,9 @@ int init_subkind_nic(libxl__remus_devices_state *rds)
     }
 
     if (dss->remus->netbufscript) {
-        rds->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
+        cds->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
     } else {
-        rds->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
+        cds->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
                                       libxl__xen_script_dir_path());
     }
 
@@ -82,22 +82,22 @@ out:
     return rc;
 }
 
-void cleanup_subkind_nic(libxl__remus_devices_state *rds)
+void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     /* free qdisc cache */
-    if (rds->qdisc_cache) {
-        nl_cache_clear(rds->qdisc_cache);
-        nl_cache_free(rds->qdisc_cache);
-        rds->qdisc_cache = NULL;
+    if (cds->qdisc_cache) {
+        nl_cache_clear(cds->qdisc_cache);
+        nl_cache_free(cds->qdisc_cache);
+        cds->qdisc_cache = NULL;
     }
 
     /* close & free nlsock */
-    if (rds->nlsock) {
-        nl_close(rds->nlsock);
-        nl_socket_free(rds->nlsock);
-        rds->nlsock = NULL;
+    if (cds->nlsock) {
+        nl_close(cds->nlsock);
+        nl_socket_free(cds->nlsock);
+        cds->nlsock = NULL;
     }
 }
 
@@ -111,17 +111,17 @@ void cleanup_subkind_nic(libxl__remus_devices_state *rds)
  * it must ONLY be used for remus because if driver domains
  * were in use it would constitute a security vulnerability.
  */
-static const char *get_vifname(libxl__remus_device *dev,
+static const char *get_vifname(libxl__checkpoint_device *dev,
                                const libxl_device_nic *nic)
 {
     const char *vifname = NULL;
     const char *path;
     int rc;
 
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     /* Convenience aliases */
-    const uint32_t domid = dev->rds->domid;
+    const uint32_t domid = dev->cds->domid;
 
     path = GCSPRINTF("%s/backend/vif/%d/%d/vifname",
                      libxl__xs_get_dompath(gc, 0), domid, nic->devid);
@@ -144,19 +144,19 @@ static void free_qdisc(libxl__remus_device_nic *remus_nic)
     remus_nic->qdisc = NULL;
 }
 
-static int init_qdisc(libxl__remus_devices_state *rds,
+static int init_qdisc(libxl__checkpoint_devices_state *cds,
                       libxl__remus_device_nic *remus_nic)
 {
     int rc, ret, ifindex;
     struct rtnl_link *ifb = NULL;
     struct rtnl_qdisc *qdisc = NULL;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     /* Now that we have brought up REMUS_IFB device with plug qdisc for
      * this vif, so we need to refill the qdisc cache.
      */
-    ret = nl_cache_refill(rds->nlsock, rds->qdisc_cache);
+    ret = nl_cache_refill(cds->nlsock, cds->qdisc_cache);
     if (ret) {
         LOG(ERROR, "cannot refill qdisc cache: %s", nl_geterror(ret));
         rc = ERROR_FAIL;
@@ -164,7 +164,7 @@ static int init_qdisc(libxl__remus_devices_state *rds,
     }
 
     /* get a handle to the REMUS_IFB interface */
-    ret = rtnl_link_get_kernel(rds->nlsock, 0, remus_nic->ifb, &ifb);
+    ret = rtnl_link_get_kernel(cds->nlsock, 0, remus_nic->ifb, &ifb);
     if (ret) {
         LOG(ERROR, "cannot obtain handle for %s: %s", remus_nic->ifb,
             nl_geterror(ret));
@@ -187,7 +187,7 @@ static int init_qdisc(libxl__remus_devices_state *rds,
      * There is no need to explicitly free this qdisc as its just a
      * reference from the qdisc cache we allocated earlier.
      */
-    qdisc = rtnl_qdisc_get_by_parent(rds->qdisc_cache, ifindex, TC_H_ROOT);
+    qdisc = rtnl_qdisc_get_by_parent(cds->qdisc_cache, ifindex, TC_H_ROOT);
     if (qdisc) {
         const char *tc_kind = rtnl_tc_get_kind(TC_CAST(qdisc));
         /* Sanity check: Ensure that the root qdisc is a plug qdisc. */
@@ -231,19 +231,19 @@ static void netbuf_teardown_script_cb(libxl__egc *egc,
  * $REMUS_IFB (for teardown)
  * setup/teardown as command line arg.
  */
-static void setup_async_exec(libxl__remus_device *dev, char *op)
+static void setup_async_exec(libxl__checkpoint_device *dev, char *op)
 {
     int arraysize, nr = 0;
     char **env = NULL, **args = NULL;
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
-    libxl__remus_devices_state *rds = dev->rds;
+    libxl__checkpoint_devices_state *cds = dev->cds;
     libxl__async_exec_state *aes = &dev->aodev.aes;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     /* Convenience aliases */
-    char *const script = libxl__strdup(gc, rds->netbufscript);
-    const uint32_t domid = rds->domid;
+    char *const script = libxl__strdup(gc, cds->netbufscript);
+    const uint32_t domid = cds->domid;
     const int dev_id = remus_nic->devid;
     const char *const vif = remus_nic->vif;
     const char *const ifb = remus_nic->ifb;
@@ -269,7 +269,7 @@ static void setup_async_exec(libxl__remus_device *dev, char *op)
     args[nr++] = NULL;
     assert(nr == arraysize);
 
-    aes->ao = dev->rds->ao;
+    aes->ao = dev->cds->ao;
     aes->what = GCSPRINTF("%s %s", args[0], args[1]);
     aes->env = env;
     aes->args = args;
@@ -286,13 +286,13 @@ static void setup_async_exec(libxl__remus_device *dev, char *op)
 
 /* setup() and teardown() */
 
-static void nic_setup(libxl__egc *egc, libxl__remus_device *dev)
+static void nic_setup(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     int rc;
     libxl__remus_device_nic *remus_nic;
     const libxl_device_nic *nic = dev->backend_dev;
 
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     /*
      * thers's no subkind of nic devices, so nic ops is always matched
@@ -330,15 +330,15 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
                                    int rc, int status)
 {
     libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
-    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
-    libxl__remus_devices_state *rds = dev->rds;
+    libxl__checkpoint_devices_state *cds = dev->cds;
     const char *out_path_base, *hotplug_error = NULL;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     /* Convenience aliases */
-    const uint32_t domid = rds->domid;
+    const uint32_t domid = cds->domid;
     const int devid = remus_nic->devid;
     const char *const vif = remus_nic->vif;
     const char **const ifb = &remus_nic->ifb;
@@ -377,7 +377,7 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
 
     if (hotplug_error) {
         LOG(ERROR, "netbuf script %s setup failed for vif %s: %s",
-            rds->netbufscript, vif, hotplug_error);
+            cds->netbufscript, vif, hotplug_error);
         rc = ERROR_FAIL;
         goto out;
     }
@@ -388,17 +388,17 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
     }
 
     LOG(DEBUG, "%s will buffer packets from vif %s", *ifb, vif);
-    rc = init_qdisc(rds, remus_nic);
+    rc = init_qdisc(cds, remus_nic);
 
 out:
     aodev->rc = rc;
     aodev->callback(egc, aodev);
 }
 
-static void nic_teardown(libxl__egc *egc, libxl__remus_device *dev)
+static void nic_teardown(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     int rc;
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     setup_async_exec(dev, "teardown");
 
@@ -418,7 +418,7 @@ static void netbuf_teardown_script_cb(libxl__egc *egc,
                                       int rc, int status)
 {
     libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
-    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
 
     if (status && !rc)
@@ -441,12 +441,12 @@ enum {
 /* API implementations */
 
 static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
-                           libxl__remus_devices_state *rds,
+                           libxl__checkpoint_devices_state *cds,
                            int buffer_op)
 {
     int rc, ret;
 
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
     if (buffer_op == tc_buffer_start)
         ret = rtnl_qdisc_plug_buffer(remus_nic->qdisc);
@@ -458,7 +458,7 @@ static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
         goto out;
     }
 
-    ret = rtnl_qdisc_add(rds->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
+    ret = rtnl_qdisc_add(cds->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
     if (ret) {
         rc = ERROR_FAIL;
         goto out;
@@ -475,33 +475,33 @@ out:
     return rc;
 }
 
-static void nic_postsuspend(libxl__egc *egc, libxl__remus_device *dev)
+static void nic_postsuspend(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     int rc;
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
 
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
-    rc = remus_netbuf_op(remus_nic, dev->rds, tc_buffer_start);
+    rc = remus_netbuf_op(remus_nic, dev->cds, tc_buffer_start);
 
     dev->aodev.rc = rc;
     dev->aodev.callback(egc, &dev->aodev);
 }
 
-static void nic_commit(libxl__egc *egc, libxl__remus_device *dev)
+static void nic_commit(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     int rc;
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
 
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
-    rc = remus_netbuf_op(remus_nic, dev->rds, tc_buffer_release);
+    rc = remus_netbuf_op(remus_nic, dev->cds, tc_buffer_release);
 
     dev->aodev.rc = rc;
     dev->aodev.callback(egc, &dev->aodev);
 }
 
-const libxl__remus_device_instance_ops remus_device_nic = {
+const libxl__checkpoint_device_instance_ops remus_device_nic = {
     .kind = LIBXL__DEVICE_KIND_VIF,
     .setup = nic_setup,
     .teardown = nic_teardown,
diff --git a/tools/libxl/libxl_nonetbuffer.c b/tools/libxl/libxl_nonetbuffer.c
index 3c659c2..4b68152 100644
--- a/tools/libxl/libxl_nonetbuffer.c
+++ b/tools/libxl/libxl_nonetbuffer.c
@@ -22,25 +22,25 @@ int libxl__netbuffer_enabled(libxl__gc *gc)
     return 0;
 }
 
-int init_subkind_nic(libxl__remus_devices_state *rds)
+int init_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
     return 0;
 }
 
-void cleanup_subkind_nic(libxl__remus_devices_state *rds)
+void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
     return;
 }
 
-static void nic_setup(libxl__egc *egc, libxl__remus_device *dev)
+static void nic_setup(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     dev->aodev.rc = ERROR_FAIL;
     dev->aodev.callback(egc, &dev->aodev);
 }
 
-const libxl__remus_device_instance_ops remus_device_nic = {
+const libxl__checkpoint_device_instance_ops remus_device_nic = {
     .kind = LIBXL__DEVICE_KIND_VIF,
     .setup = nic_setup,
 };
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index fae2120..d088dad 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -21,9 +21,9 @@
 /*-------------------- Remus setup and teardown ---------------------*/
 
 static void remus_setup_done(libxl__egc *egc,
-                             libxl__remus_devices_state *rds, int rc);
+                             libxl__checkpoint_devices_state *cds, int rc);
 static void remus_setup_failed(libxl__egc *egc,
-                               libxl__remus_devices_state *rds, int rc);
+                               libxl__checkpoint_devices_state *cds, int rc);
 static void remus_checkpoint_stream_written(
     libxl__egc *egc, libxl__stream_write_state *sws, int rc);
 
@@ -31,7 +31,7 @@ void libxl__remus_setup(libxl__egc *egc,
                         libxl__domain_save_state *dss)
 {
     /* Convenience aliases */
-    libxl__remus_devices_state *const rds = &dss->rds;
+    libxl__checkpoint_devices_state *const cds = &dss->cds;
     const libxl_domain_remus_info *const info = dss->remus;
 
     STATE_AO_GC(dss->ao);
@@ -41,19 +41,19 @@ void libxl__remus_setup(libxl__egc *egc,
             LOG(ERROR, "Remus: No support for network buffering");
             goto out;
         }
-        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
+        cds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
     }
 
     if (libxl_defbool_val(info->diskbuf))
-        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
+        cds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
 
-    rds->ao = ao;
-    rds->domid = dss->domid;
-    rds->callback = remus_setup_done;
+    cds->ao = ao;
+    cds->domid = dss->domid;
+    cds->callback = remus_setup_done;
 
     dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
 
-    libxl__remus_devices_setup(egc, rds);
+    libxl__checkpoint_devices_setup(egc, cds);
     return;
 
 out:
@@ -61,9 +61,9 @@ out:
 }
 
 static void remus_setup_done(libxl__egc *egc,
-                             libxl__remus_devices_state *rds, int rc)
+                             libxl__checkpoint_devices_state *cds, int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
     STATE_AO_GC(dss->ao);
 
     if (!rc) {
@@ -73,14 +73,14 @@ static void remus_setup_done(libxl__egc *egc,
 
     LOG(ERROR, "Remus: failed to setup device for guest with domid %u, rc %d",
         dss->domid, rc);
-    rds->callback = remus_setup_failed;
-    libxl__remus_devices_teardown(egc, rds);
+    cds->callback = remus_setup_failed;
+    libxl__checkpoint_devices_teardown(egc, cds);
 }
 
 static void remus_setup_failed(libxl__egc *egc,
-                               libxl__remus_devices_state *rds, int rc)
+                               libxl__checkpoint_devices_state *cds, int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -91,7 +91,7 @@ static void remus_setup_failed(libxl__egc *egc,
 }
 
 static void remus_teardown_done(libxl__egc *egc,
-                                libxl__remus_devices_state *rds,
+                                libxl__checkpoint_devices_state *cds,
                                 int rc);
 void libxl__remus_teardown(libxl__egc *egc,
                            libxl__domain_save_state *dss,
@@ -101,15 +101,15 @@ void libxl__remus_teardown(libxl__egc *egc,
 
     LOG(WARN, "Remus: Domain suspend terminated with rc %d,"
         " teardown Remus devices...", rc);
-    dss->rds.callback = remus_teardown_done;
-    libxl__remus_devices_teardown(egc, &dss->rds);
+    dss->cds.callback = remus_teardown_done;
+    libxl__checkpoint_devices_teardown(egc, &dss->cds);
 }
 
 static void remus_teardown_done(libxl__egc *egc,
-                                libxl__remus_devices_state *rds,
+                                libxl__checkpoint_devices_state *cds,
                                 int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -124,10 +124,10 @@ static void remus_teardown_done(libxl__egc *egc,
 static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
                                 libxl__domain_suspend_state *dsps, int ok);
 static void remus_devices_postsuspend_cb(libxl__egc *egc,
-                                         libxl__remus_devices_state *rds,
+                                         libxl__checkpoint_devices_state *cds,
                                          int rc);
 static void remus_devices_preresume_cb(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
+                                       libxl__checkpoint_devices_state *cds,
                                        int rc);
 
 void libxl__remus_domain_suspend_callback(void *data)
@@ -149,9 +149,9 @@ static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
     if (rc)
         goto out;
 
-    libxl__remus_devices_state *const rds = &dss->rds;
-    rds->callback = remus_devices_postsuspend_cb;
-    libxl__remus_devices_postsuspend(egc, rds);
+    libxl__checkpoint_devices_state *const cds = &dss->cds;
+    cds->callback = remus_devices_postsuspend_cb;
+    libxl__checkpoint_devices_postsuspend(egc, cds);
     return;
 
 out:
@@ -160,10 +160,10 @@ out:
 }
 
 static void remus_devices_postsuspend_cb(libxl__egc *egc,
-                                         libxl__remus_devices_state *rds,
+                                         libxl__checkpoint_devices_state *cds,
                                          int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
 
     if (rc)
         goto out;
@@ -183,16 +183,16 @@ void libxl__remus_domain_resume_callback(void *data)
     libxl__domain_save_state *dss = shs->caller_state;
     STATE_AO_GC(dss->ao);
 
-    libxl__remus_devices_state *const rds = &dss->rds;
-    rds->callback = remus_devices_preresume_cb;
-    libxl__remus_devices_preresume(egc, rds);
+    libxl__checkpoint_devices_state *const cds = &dss->cds;
+    cds->callback = remus_devices_preresume_cb;
+    libxl__checkpoint_devices_preresume(egc, cds);
 }
 
 static void remus_devices_preresume_cb(libxl__egc *egc,
-                                       libxl__remus_devices_state *rds,
+                                       libxl__checkpoint_devices_state *cds,
                                        int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
     STATE_AO_GC(dss->ao);
 
     if (rc)
@@ -214,7 +214,7 @@ out:
 /*----- remus asynchronous checkpoint callback -----*/
 
 static void remus_devices_commit_cb(libxl__egc *egc,
-                                    libxl__remus_devices_state *rds,
+                                    libxl__checkpoint_devices_state *cds,
                                     int rc);
 static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
                                   const struct timeval *requested_abs,
@@ -236,7 +236,7 @@ static void remus_checkpoint_stream_written(
     libxl__domain_save_state *dss = CONTAINER_OF(sws, *dss, sws);
 
     /* Convenience aliases */
-    libxl__remus_devices_state *const rds = &dss->rds;
+    libxl__checkpoint_devices_state *const cds = &dss->cds;
 
     STATE_AO_GC(dss->ao);
 
@@ -245,8 +245,8 @@ static void remus_checkpoint_stream_written(
         goto out;
     }
 
-    rds->callback = remus_devices_commit_cb;
-    libxl__remus_devices_commit(egc, rds);
+    cds->callback = remus_devices_commit_cb;
+    libxl__checkpoint_devices_commit(egc, cds);
 
     return;
 
@@ -255,10 +255,10 @@ out:
 }
 
 static void remus_devices_commit_cb(libxl__egc *egc,
-                                    libxl__remus_devices_state *rds,
+                                    libxl__checkpoint_devices_state *cds,
                                     int rc)
 {
-    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
+    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
 
     STATE_AO_GC(dss->ao);
 
diff --git a/tools/libxl/libxl_remus_disk_drbd.c b/tools/libxl/libxl_remus_disk_drbd.c
index 1c3a88a..4dddc58 100644
--- a/tools/libxl/libxl_remus_disk_drbd.c
+++ b/tools/libxl/libxl_remus_disk_drbd.c
@@ -26,30 +26,30 @@ typedef struct libxl__remus_drbd_disk {
     int ackwait;
 } libxl__remus_drbd_disk;
 
-int init_subkind_drbd_disk(libxl__remus_devices_state *rds)
+int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds)
 {
-    STATE_AO_GC(rds->ao);
+    STATE_AO_GC(cds->ao);
 
-    rds->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
+    cds->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
                                        libxl__xen_script_dir_path());
 
     return 0;
 }
 
-void cleanup_subkind_drbd_disk(libxl__remus_devices_state *rds)
+void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds)
 {
     return;
 }
 
 /*----- helper functions, for async calls -----*/
 static void drbd_async_call(libxl__egc *egc,
-                            libxl__remus_device *dev,
-                            void func(libxl__remus_device *),
+                            libxl__checkpoint_device *dev,
+                            void func(libxl__checkpoint_device *),
                             libxl__ev_child_callback callback)
 {
     int pid, rc;
     libxl__ao_device *aodev = &dev->aodev;
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     /* Fork and call */
     pid = libxl__ev_child_fork(gc, &aodev->child, callback);
@@ -82,21 +82,21 @@ static void match_async_exec_cb(libxl__egc *egc,
 
 /* implementations */
 
-static void match_async_exec(libxl__egc *egc, libxl__remus_device *dev);
+static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev);
 
-static void drbd_setup(libxl__egc *egc, libxl__remus_device *dev)
+static void drbd_setup(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     match_async_exec(egc, dev);
 }
 
-static void match_async_exec(libxl__egc *egc, libxl__remus_device *dev)
+static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     int arraysize, nr = 0, rc;
     const libxl_device_disk *disk = dev->backend_dev;
     libxl__async_exec_state *aes = &dev->aodev.aes;
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     /* setup env & args */
     arraysize = 1;
@@ -107,12 +107,12 @@ static void match_async_exec(libxl__egc *egc, libxl__remus_device *dev)
     arraysize = 3;
     nr = 0;
     GCNEW_ARRAY(aes->args, arraysize);
-    aes->args[nr++] = dev->rds->drbd_probe_script;
+    aes->args[nr++] = dev->cds->drbd_probe_script;
     aes->args[nr++] = disk->pdev_path;
     aes->args[nr++] = NULL;
     assert(nr <= arraysize);
 
-    aes->ao = dev->rds->ao;
+    aes->ao = dev->cds->ao;
     aes->what = GCSPRINTF("%s %s", aes->args[0], aes->args[1]);
     aes->timeout_ms = LIBXL_HOTPLUG_TIMEOUT * 1000;
     aes->callback = match_async_exec_cb;
@@ -136,7 +136,7 @@ static void match_async_exec_cb(libxl__egc *egc,
                                 int rc, int status)
 {
     libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
-    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_drbd_disk *drbd_disk;
     const libxl_device_disk *disk = dev->backend_dev;
 
@@ -146,7 +146,7 @@ static void match_async_exec_cb(libxl__egc *egc,
         goto out;
 
     if (status) {
-        rc = ERROR_REMUS_DEVOPS_DOES_NOT_MATCH;
+        rc = ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH;
         /* BUG: seems to assume that any exit status means `no match' */
         /* BUG: exit status will have been logged as an error */
         goto out;
@@ -171,10 +171,10 @@ out:
     aodev->callback(egc, aodev);
 }
 
-static void drbd_teardown(libxl__egc *egc, libxl__remus_device *dev)
+static void drbd_teardown(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
     libxl__remus_drbd_disk *drbd_disk = dev->concrete_data;
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     close(drbd_disk->ctl_fd);
     dev->aodev.rc = 0;
@@ -191,9 +191,9 @@ static void checkpoint_async_call_done(libxl__egc *egc,
 /* API implementations */
 
 /* this op will not wait and block, so implement as sync op */
-static void drbd_postsuspend(libxl__egc *egc, libxl__remus_device *dev)
+static void drbd_postsuspend(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     libxl__remus_drbd_disk *rdd = dev->concrete_data;
 
@@ -207,16 +207,16 @@ static void drbd_postsuspend(libxl__egc *egc, libxl__remus_device *dev)
 }
 
 
-static void drbd_preresume_async(libxl__remus_device *dev);
+static void drbd_preresume_async(libxl__checkpoint_device *dev);
 
-static void drbd_preresume(libxl__egc *egc, libxl__remus_device *dev)
+static void drbd_preresume(libxl__egc *egc, libxl__checkpoint_device *dev)
 {
-    STATE_AO_GC(dev->rds->ao);
+    STATE_AO_GC(dev->cds->ao);
 
     drbd_async_call(egc, dev, drbd_preresume_async, checkpoint_async_call_done);
 }
 
-static void drbd_preresume_async(libxl__remus_device *dev)
+static void drbd_preresume_async(libxl__checkpoint_device *dev)
 {
     libxl__remus_drbd_disk *rdd = dev->concrete_data;
     int ackwait = rdd->ackwait;
@@ -235,7 +235,7 @@ static void checkpoint_async_call_done(libxl__egc *egc,
 {
     int rc;
     libxl__ao_device *aodev = CONTAINER_OF(child, *aodev, child);
-    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
+    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_drbd_disk *rdd = dev->concrete_data;
 
     STATE_AO_GC(aodev->ao);
@@ -253,7 +253,7 @@ out:
     aodev->callback(egc, aodev);
 }
 
-const libxl__remus_device_instance_ops remus_device_drbd_disk = {
+const libxl__checkpoint_device_instance_ops remus_device_drbd_disk = {
     .kind = LIBXL__DEVICE_KIND_VBD,
     .setup = drbd_setup,
     .teardown = drbd_teardown,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index c5d5d40..db001ad 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -61,8 +61,8 @@ libxl_error = Enumeration("error", [
     (-15, "LOCK_FAIL"),
     (-16, "JSON_CONFIG_EMPTY"),
     (-17, "DEVICE_EXISTS"),
-    (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
-    (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
+    (-18, "CHECKPOINT_DEVOPS_DOES_NOT_MATCH"),
+    (-19, "CHECKPOINT_DEVICE_NOT_SUPPORTED"),
     (-20, "VNUMA_CONFIG_INVALID"),
     (-21, "DOMAIN_NOTFOUND"),
     (-22, "ABORTED"),
-- 
2.5.0


* [PATCH v6 14/18] tools/libxl: fix backword compatibility after the automatic renaming
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (12 preceding siblings ...)
  2015-12-30  2:29 ` [PATCH v6 13/18] tools/libxl: rename remus device to checkpoint device Wen Congyang
@ 2015-12-30  2:29 ` Wen Congyang
  2015-12-30  2:29 ` [PATCH v6 15/18] tools/libxl: adjust the indentation Wen Congyang
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:29 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

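The ERROR_REMUS_XXX error codes were introduced in Xen 4.5 and were
renamed to ERROR_CHECKPOINT_XXX by the previous patch. Restore backward
compatibility by defining the old names as aliases of the new ones for
callers that request the 4.5 API through LIBXL_API_VERSION.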
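
For illustration only, here is a minimal sketch of how a caller written
against the Xen 4.5 names would keep compiling with this guard in place.
The enum is a stand-in for the generated libxl error type (the values -18
and -19 are taken from libxl_types.idl), and describe_remus_error() and
check_aliases() are hypothetical helpers, not part of libxl.

```c
#include <assert.h>

/* Assumption: the application requests the Xen 4.5 API before
 * including libxl.h. */
#define LIBXL_API_VERSION 0x040500

/* Stand-in for the generated libxl error enum. */
enum {
    ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH = -18,
    ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED  = -19
};

/* Same guard as the one this patch adds to libxl.h. */
#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION >= 0x040500 \
                               && LIBXL_API_VERSION < 0x040600
#define ERROR_REMUS_DEVOPS_DOES_NOT_MATCH \
        ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH
#define ERROR_REMUS_DEVICE_NOT_SUPPORTED \
        ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED
#endif

/* Hypothetical caller that still uses the Xen 4.5 names. */
static const char *describe_remus_error(int rc)
{
    switch (rc) {
    case ERROR_REMUS_DEVOPS_DOES_NOT_MATCH:
        return "no matching device ops";
    case ERROR_REMUS_DEVICE_NOT_SUPPORTED:
        return "device not supported";
    default:
        return "other";
    }
}

/* The old names are plain aliases, so both spellings compare equal. */
static void check_aliases(void)
{
    assert(ERROR_REMUS_DEVOPS_DOES_NOT_MATCH ==
           ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH);
    assert(ERROR_REMUS_DEVICE_NOT_SUPPORTED ==
           ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED);
}
```

A caller that defines a LIBXL_API_VERSION outside the guarded range, or
none at all, does not get the aliases and has to use the
ERROR_CHECKPOINT_XXX names.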

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 tools/libxl/libxl.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 67a4ad7..2a26ba2 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -883,6 +883,19 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
  */
 #define LIBXL_HAVE_CHECKPOINTED_STREAM 1
 
+/* Remus stuff */
+/*
+ * ERROR_REMUS_XXX error code only exists from Xen 4.5, and in Xen 4.6
+ * it is changed to ERROR_CHECKPOINT_XXX
+ */
+#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION >= 0x040500 \
+                               && LIBXL_API_VERSION < 0x040600
+#define ERROR_REMUS_DEVOPS_DOES_NOT_MATCH \
+        ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH
+#define ERROR_REMUS_DEVICE_NOT_SUPPORTED \
+        ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED
+#endif
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);
-- 
2.5.0


* [PATCH v6 15/18] tools/libxl: adjust the indentation
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (13 preceding siblings ...)
  2015-12-30  2:29 ` [PATCH v6 14/18] tools/libxl: fix backword compatibility after the automatic renaming Wen Congyang
@ 2015-12-30  2:29 ` Wen Congyang
  2016-01-25 19:44   ` Konrad Rzeszutek Wilk
  2015-12-30  2:29 ` [PATCH v6 16/18] tools/libxl: store remus_ops in checkpoint device state Wen Congyang
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:29 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

This is just tidying up after the previous automatic renaming.

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_checkpoint_device.c | 21 +++++++++++----------
 tools/libxl/libxl_internal.h          | 19 +++++++++++--------
 2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
index 109cd23..226f159 100644
--- a/tools/libxl/libxl_checkpoint_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -73,9 +73,9 @@ static void devices_teardown_cb(libxl__egc *egc,
 /* checkpoint device setup and teardown */
 
 static libxl__checkpoint_device* checkpoint_device_init(libxl__egc *egc,
-                                              libxl__checkpoint_devices_state *cds,
-                                              libxl__device_kind kind,
-                                              void *libxl_dev)
+                                        libxl__checkpoint_devices_state *cds,
+                                        libxl__device_kind kind,
+                                        void *libxl_dev)
 {
     libxl__checkpoint_device *dev = NULL;
 
@@ -89,9 +89,10 @@ static libxl__checkpoint_device* checkpoint_device_init(libxl__egc *egc,
 }
 
 static void checkpoint_devices_setup(libxl__egc *egc,
-                                libxl__checkpoint_devices_state *cds);
+                                     libxl__checkpoint_devices_state *cds);
 
-void libxl__checkpoint_devices_setup(libxl__egc *egc, libxl__checkpoint_devices_state *cds)
+void libxl__checkpoint_devices_setup(libxl__egc *egc,
+                                     libxl__checkpoint_devices_state *cds)
 {
     int i, rc;
 
@@ -137,7 +138,7 @@ out:
 }
 
 static void checkpoint_devices_setup(libxl__egc *egc,
-                                libxl__checkpoint_devices_state *cds)
+                                     libxl__checkpoint_devices_state *cds)
 {
     int i, rc;
 
@@ -285,12 +286,12 @@ static void devices_checkpoint_cb(libxl__egc *egc,
 
 /* API implementations */
 
-#define define_checkpoint_api(api)                                \
-void libxl__checkpoint_devices_##api(libxl__egc *egc,                        \
-                                libxl__checkpoint_devices_state *cds)        \
+#define define_checkpoint_api(api)                                      \
+void libxl__checkpoint_devices_##api(libxl__egc *egc,                   \
+                                libxl__checkpoint_devices_state *cds)   \
 {                                                                       \
     int i;                                                              \
-    libxl__checkpoint_device *dev;                                           \
+    libxl__checkpoint_device *dev;                                      \
                                                                         \
     STATE_AO_GC(cds->ao);                                               \
                                                                         \
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 7f80ec5..5b99d6e 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2818,7 +2818,8 @@ typedef struct libxl__save_helper_state {
  * Each device type needs to implement the interfaces specified in
  * the libxl__checkpoint_device_instance_ops if it wishes to support Remus.
  *
- * The high-level control flow through the checkpoint device layer is shown below:
+ * The high-level control flow through the checkpoint device layer is shown
+ * below:
  *
  * xl remus
  *  |->  libxl_domain_remus_start
@@ -2879,7 +2880,8 @@ int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
 void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
 
 typedef void libxl__checkpoint_callback(libxl__egc *,
-                                   libxl__checkpoint_devices_state *, int rc);
+                                        libxl__checkpoint_devices_state *,
+                                        int rc);
 
 /*
  * State associated with a checkpoint invocation, including parameters
@@ -2887,7 +2889,7 @@ typedef void libxl__checkpoint_callback(libxl__egc *,
  * save/restore machinery.
  */
 struct libxl__checkpoint_devices_state {
-    /*---- must be set by caller of libxl__checkpoint_device_(setup|teardown) ----*/
+    /*-- must be set by caller of libxl__checkpoint_device_(setup|teardown) --*/
 
     libxl__ao *ao;
     uint32_t domid;
@@ -2900,7 +2902,8 @@ struct libxl__checkpoint_devices_state {
     /*
      * this array is allocated before setup the checkpoint devices by the
      * checkpoint abstract layer.
-     * devs may be NULL, means there's no checkpoint devices that has been set up.
+     * devs may be NULL, means there's no checkpoint devices that has been
+     * set up.
      * the size of this array is 'num_devices', which is the total number
      * of libxl nic devices and disk devices(num_nics + num_disks).
      */
@@ -2962,13 +2965,13 @@ struct libxl__checkpoint_device {
 _hidden void libxl__checkpoint_devices_setup(libxl__egc *egc,
                                         libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_teardown(libxl__egc *egc,
-                                           libxl__checkpoint_devices_state *cds);
+                                        libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_postsuspend(libxl__egc *egc,
-                                              libxl__checkpoint_devices_state *cds);
+                                        libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_preresume(libxl__egc *egc,
-                                            libxl__checkpoint_devices_state *cds);
+                                        libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
-                                         libxl__checkpoint_devices_state *cds);
+                                        libxl__checkpoint_devices_state *cds);
 _hidden int libxl__netbuffer_enabled(libxl__gc *gc);
 
 /*----- Legacy conversion helper -----*/
-- 
2.5.0


* [PATCH v6 16/18] tools/libxl: store remus_ops in checkpoint device state
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (14 preceding siblings ...)
  2015-12-30  2:29 ` [PATCH v6 15/18] tools/libxl: adjust the indentation Wen Congyang
@ 2015-12-30  2:29 ` Wen Congyang
  2016-01-25 19:55   ` Konrad Rzeszutek Wilk
  2015-12-30  2:29 ` [PATCH v6 17/18] tools/libxl: move remus state into a seperate structure Wen Congyang
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:29 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

The checkpoint device layer is an abstraction for performing checkpoints,
and COLO can use it as well. However, some code in the checkpoint device
layer still refers to Remus directly.

This patch and the following two separate Remus from the checkpoint
device layer.

The checkpoint device layer currently uses remus_ops directly. Store the
ops array in the checkpoint device state instead, so that the layer no
longer needs to know about remus_ops.

This is pure refactoring with no functional change.
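
As a rough sketch of the idea, the following shows a NULL-terminated ops
table that the caller hands to a generic layer through its state, and a
lookup loop loosely modelled on device_setup_iterate(). All type and
function names here (instance_ops, devices_state, find_ops, and so on) are
simplified, hypothetical stand-ins, not the real libxl definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the libxl types. */
typedef enum { KIND_VIF, KIND_VBD } device_kind;

typedef struct {
    device_kind kind;
    const char *name;
} instance_ops;

typedef struct {
    /* Must be an array of pointers whose last entry is NULL;
     * supplied by the caller (Remus here, possibly COLO later). */
    const instance_ops **ops;
} devices_state;

static const instance_ops nic_ops  = { KIND_VIF, "nic" };
static const instance_ops drbd_ops = { KIND_VBD, "drbd_disk" };

/* The concrete layer owns the table. */
static const instance_ops *remus_like_ops[] = {
    &nic_ops,
    &drbd_ops,
    NULL,
};

/* The generic layer only sees cds->ops: advance the index until an
 * entry of the wanted kind is found or the NULL sentinel is reached. */
static const instance_ops *find_ops(const devices_state *cds,
                                    device_kind kind)
{
    int ops_index = -1;
    const instance_ops *ops;

    do {
        ops = cds->ops[++ops_index];
        if (!ops)
            return NULL; /* table exhausted, nothing matched */
    } while (ops->kind != kind);

    return ops;
}
```

Because the generic layer reads the table only through the state, a
second user can pass its own table without the layer changing.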

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_checkpoint_device.c | 10 +---------
 tools/libxl/libxl_internal.h          |  2 ++
 tools/libxl/libxl_remus.c             |  9 +++++++++
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
index 226f159..bbc6dc4 100644
--- a/tools/libxl/libxl_checkpoint_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -17,14 +17,6 @@
 
 #include "libxl_internal.h"
 
-extern const libxl__checkpoint_device_instance_ops remus_device_nic;
-extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
-static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
-    &remus_device_nic,
-    &remus_device_drbd_disk,
-    NULL,
-};
-
 /*----- helper functions -----*/
 
 static int init_device_subkind(libxl__checkpoint_devices_state *cds)
@@ -172,7 +164,7 @@ static void device_setup_iterate(libxl__egc *egc, libxl__ao_device *aodev)
         goto out;
 
     do {
-        dev->ops = remus_ops[++dev->ops_index];
+        dev->ops = dev->cds->ops[++dev->ops_index];
         if (!dev->ops) {
             libxl_device_nic * nic = NULL;
             libxl_device_disk * disk = NULL;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 5b99d6e..914ce94 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2895,6 +2895,8 @@ struct libxl__checkpoint_devices_state {
     uint32_t domid;
     libxl__checkpoint_callback *callback;
     int device_kind_flags;
+    /* The ops must be pointer array, and the last ops must be NULL */
+    const libxl__checkpoint_device_instance_ops **ops;
 
     /*----- private for abstract layer only -----*/
 
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index d088dad..3375331 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -18,6 +18,14 @@
 
 #include "libxl_internal.h"
 
+extern const libxl__checkpoint_device_instance_ops remus_device_nic;
+extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
+static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
+    &remus_device_nic,
+    &remus_device_drbd_disk,
+    NULL,
+};
+
 /*-------------------- Remus setup and teardown ---------------------*/
 
 static void remus_setup_done(libxl__egc *egc,
@@ -50,6 +58,7 @@ void libxl__remus_setup(libxl__egc *egc,
     cds->ao = ao;
     cds->domid = dss->domid;
     cds->callback = remus_setup_done;
+    cds->ops = remus_ops;
 
     dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
 
-- 
2.5.0


* [PATCH v6 17/18] tools/libxl: move remus state into a seperate structure
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (15 preceding siblings ...)
  2015-12-30  2:29 ` [PATCH v6 16/18] tools/libxl: store remus_ops in checkpoint device state Wen Congyang
@ 2015-12-30  2:29 ` Wen Congyang
  2016-01-25 19:59   ` Konrad Rzeszutek Wilk
  2015-12-30  2:29 ` [PATCH v6 18/18] tools/libxl: seperate device init/cleanup from checkpoint device layer Wen Congyang
  2016-01-25 17:12 ` [PATCH v6 00/18] Prerequisite patches for COLO Konrad Rzeszutek Wilk
  18 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:29 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

Add a new structure, libxl__remus_state, and move the concrete layer's
private members into it.
This is pure refactoring with no functional change.
Also initialise interval in libxl__remus_setup(). It is safe to move this
initialisation because the value is only used by Remus, and Remus only
reads it after libxl__remus_setup() has run.
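
The layout can be sketched as follows: the Remus state is embedded in the
save state, the setup function receives only the Remus state, and the
generic device layer reaches it through an opaque concrete_data pointer.
This is a simplified model with hypothetical names (remus_state,
save_state, remus_setup, container_of); how the real code recovers the
enclosing structure is an assumption here, not something this description
states.

```c
#include <assert.h>
#include <stddef.h>

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, simplified stand-ins for the libxl structures. */
typedef struct {
    int interval;        /* checkpoint interval */
} remus_state;

typedef struct {
    void *concrete_data; /* opaque to the generic device layer */
} devices_state;

typedef struct {
    int domid;
    remus_state rs;
    devices_state cds;
} save_state;

/* Setup takes the Remus state only; the interval is initialised here. */
static void remus_setup(remus_state *rs, int interval)
{
    save_state *dss;

    assert(rs != NULL);
    dss = container_of(rs, save_state, rs);

    rs->interval = interval;
    dss->cds.concrete_data = rs;
}

/* A device-specific op gets back to the Remus state without the
 * generic layer knowing its type. */
static int interval_seen_by_device(const devices_state *cds)
{
    const remus_state *rs = cds->concrete_data;
    return rs->interval;
}
```

Keeping the Remus-only fields out of the generic state is what would let
another user of the checkpoint device layer supply its own private
structure through the same pointer.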

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
---
 tools/libxl/libxl.c                 |  2 +-
 tools/libxl/libxl_dom_save.c        |  3 +--
 tools/libxl/libxl_internal.h        | 35 +++++++++++++++-----------
 tools/libxl/libxl_netbuffer.c       | 49 +++++++++++++++++++++----------------
 tools/libxl/libxl_remus.c           | 24 ++++++++++++------
 tools/libxl/libxl_remus_disk_drbd.c |  8 +++---
 6 files changed, 72 insertions(+), 49 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 69c8047..481824d 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -882,7 +882,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
     assert(info);
 
     /* Point of no return */
-    libxl__remus_setup(egc, dss);
+    libxl__remus_setup(egc, &dss->rs);
     return AO_INPROGRESS;
 
  out:
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 8e8d280..86026ac 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -392,7 +392,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
     }
 
     if (dss->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_REMUS) {
-        dss->interval = r_info->interval;
         if (libxl_defbool_val(r_info->compression))
             dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
     }
@@ -447,7 +446,7 @@ static void domain_save_done(libxl__egc *egc,
          * from sending checkpoints. Teardown the network buffers and
          * release netlink resources.  This is an async op.
          */
-        libxl__remus_teardown(egc, dss, rc);
+        libxl__remus_teardown(egc, &dss->rs, rc);
         return;
     }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 914ce94..b6929a9 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2894,6 +2894,7 @@ struct libxl__checkpoint_devices_state {
     libxl__ao *ao;
     uint32_t domid;
     libxl__checkpoint_callback *callback;
+    void *concrete_data;
     int device_kind_flags;
     /* The ops must be pointer array, and the last ops must be NULL */
     const libxl__checkpoint_device_instance_ops **ops;
@@ -2917,16 +2918,6 @@ struct libxl__checkpoint_devices_state {
     int num_disks;
 
     libxl__multidev multidev;
-
-    /*----- private for concrete (device-specific) layer only -----*/
-
-    /* private for nic device subkind ops */
-    char *netbufscript;
-    struct nl_sock *nlsock;
-    struct nl_cache *qdisc_cache;
-
-    /* private for drbd disk subkind ops */
-    char *drbd_probe_script;
 };
 
 /*
@@ -2974,6 +2965,23 @@ _hidden void libxl__checkpoint_devices_preresume(libxl__egc *egc,
                                         libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
                                         libxl__checkpoint_devices_state *cds);
+
+/*----- Remus related state structure -----*/
+typedef struct libxl__remus_state libxl__remus_state;
+struct libxl__remus_state {
+    /* private */
+    libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
+    int interval; /* checkpoint interval */
+
+    /*----- private for concrete (device-specific) layer only -----*/
+    /* private for nic device subkind ops */
+    char *netbufscript;
+    struct nl_sock *nlsock;
+    struct nl_cache *qdisc_cache;
+
+    /* private for drbd disk subkind ops */
+    char *drbd_probe_script;
+};
 _hidden int libxl__netbuffer_enabled(libxl__gc *gc);
 
 /*----- Legacy conversion helper -----*/
@@ -3132,9 +3140,8 @@ struct libxl__domain_save_state {
     int hvm;
     int xcflags;
     libxl__domain_suspend_state dsps;
+    libxl__remus_state rs;
     libxl__checkpoint_devices_state cds;
-    libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
-    int interval; /* checkpoint interval (for Remus) */
     libxl__stream_write_state sws;
     libxl__logdirty_switch logdirty;
     /* private for libxl__domain_save_device_model */
@@ -3551,9 +3558,9 @@ _hidden void libxl__remus_domain_resume_callback(void *data);
 _hidden void libxl__remus_domain_save_checkpoint_callback(void *data);
 /* Remus setup and teardown*/
 _hidden void libxl__remus_setup(libxl__egc *egc,
-                                libxl__domain_save_state *dss);
+                                libxl__remus_state *rs);
 _hidden void libxl__remus_teardown(libxl__egc *egc,
-                                   libxl__domain_save_state *dss,
+                                   libxl__remus_state *rs,
                                    int rc);
 /* Remus callbacks for restore */
 _hidden void libxl__remus_domain_restore_checkpoint_callback(void *data);
diff --git a/tools/libxl/libxl_netbuffer.c b/tools/libxl/libxl_netbuffer.c
index 33c2a42..5c7e8a2 100644
--- a/tools/libxl/libxl_netbuffer.c
+++ b/tools/libxl/libxl_netbuffer.c
@@ -42,17 +42,18 @@ int init_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
     int rc, ret;
     libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
+    libxl__remus_state *rs = cds->concrete_data;
 
     STATE_AO_GC(cds->ao);
 
-    cds->nlsock = nl_socket_alloc();
-    if (!cds->nlsock) {
+    rs->nlsock = nl_socket_alloc();
+    if (!rs->nlsock) {
         LOG(ERROR, "cannot allocate nl socket");
         rc = ERROR_FAIL;
         goto out;
     }
 
-    ret = nl_connect(cds->nlsock, NETLINK_ROUTE);
+    ret = nl_connect(rs->nlsock, NETLINK_ROUTE);
     if (ret) {
         LOG(ERROR, "failed to open netlink socket: %s",
             nl_geterror(ret));
@@ -61,7 +62,7 @@ int init_subkind_nic(libxl__checkpoint_devices_state *cds)
     }
 
     /* get list of all qdiscs installed on network devs. */
-    ret = rtnl_qdisc_alloc_cache(cds->nlsock, &cds->qdisc_cache);
+    ret = rtnl_qdisc_alloc_cache(rs->nlsock, &rs->qdisc_cache);
     if (ret) {
         LOG(ERROR, "failed to allocate qdisc cache: %s",
             nl_geterror(ret));
@@ -70,10 +71,10 @@ int init_subkind_nic(libxl__checkpoint_devices_state *cds)
     }
 
     if (dss->remus->netbufscript) {
-        cds->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
+        rs->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
     } else {
-        cds->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
-                                      libxl__xen_script_dir_path());
+        rs->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
+                                     libxl__xen_script_dir_path());
     }
 
     rc = 0;
@@ -84,20 +85,22 @@ out:
 
 void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds)
 {
+    libxl__remus_state *rs = cds->concrete_data;
+
     STATE_AO_GC(cds->ao);
 
     /* free qdisc cache */
-    if (cds->qdisc_cache) {
-        nl_cache_clear(cds->qdisc_cache);
-        nl_cache_free(cds->qdisc_cache);
-        cds->qdisc_cache = NULL;
+    if (rs->qdisc_cache) {
+        nl_cache_clear(rs->qdisc_cache);
+        nl_cache_free(rs->qdisc_cache);
+        rs->qdisc_cache = NULL;
     }
 
     /* close & free nlsock */
-    if (cds->nlsock) {
-        nl_close(cds->nlsock);
-        nl_socket_free(cds->nlsock);
-        cds->nlsock = NULL;
+    if (rs->nlsock) {
+        nl_close(rs->nlsock);
+        nl_socket_free(rs->nlsock);
+        rs->nlsock = NULL;
     }
 }
 
@@ -150,13 +153,14 @@ static int init_qdisc(libxl__checkpoint_devices_state *cds,
     int rc, ret, ifindex;
     struct rtnl_link *ifb = NULL;
     struct rtnl_qdisc *qdisc = NULL;
+    libxl__remus_state *rs = cds->concrete_data;
 
     STATE_AO_GC(cds->ao);
 
     /* Now that we have brought up REMUS_IFB device with plug qdisc for
      * this vif, so we need to refill the qdisc cache.
      */
-    ret = nl_cache_refill(cds->nlsock, cds->qdisc_cache);
+    ret = nl_cache_refill(rs->nlsock, rs->qdisc_cache);
     if (ret) {
         LOG(ERROR, "cannot refill qdisc cache: %s", nl_geterror(ret));
         rc = ERROR_FAIL;
@@ -164,7 +168,7 @@ static int init_qdisc(libxl__checkpoint_devices_state *cds,
     }
 
     /* get a handle to the REMUS_IFB interface */
-    ret = rtnl_link_get_kernel(cds->nlsock, 0, remus_nic->ifb, &ifb);
+    ret = rtnl_link_get_kernel(rs->nlsock, 0, remus_nic->ifb, &ifb);
     if (ret) {
         LOG(ERROR, "cannot obtain handle for %s: %s", remus_nic->ifb,
             nl_geterror(ret));
@@ -187,7 +191,7 @@ static int init_qdisc(libxl__checkpoint_devices_state *cds,
      * There is no need to explicitly free this qdisc as its just a
      * reference from the qdisc cache we allocated earlier.
      */
-    qdisc = rtnl_qdisc_get_by_parent(cds->qdisc_cache, ifindex, TC_H_ROOT);
+    qdisc = rtnl_qdisc_get_by_parent(rs->qdisc_cache, ifindex, TC_H_ROOT);
     if (qdisc) {
         const char *tc_kind = rtnl_tc_get_kind(TC_CAST(qdisc));
         /* Sanity check: Ensure that the root qdisc is a plug qdisc. */
@@ -238,11 +242,12 @@ static void setup_async_exec(libxl__checkpoint_device *dev, char *op)
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
     libxl__checkpoint_devices_state *cds = dev->cds;
     libxl__async_exec_state *aes = &dev->aodev.aes;
+    libxl__remus_state *rs = cds->concrete_data;
 
     STATE_AO_GC(cds->ao);
 
     /* Convenience aliases */
-    char *const script = libxl__strdup(gc, cds->netbufscript);
+    char *const script = libxl__strdup(gc, rs->netbufscript);
     const uint32_t domid = cds->domid;
     const int dev_id = remus_nic->devid;
     const char *const vif = remus_nic->vif;
@@ -333,6 +338,7 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
     libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
     libxl__remus_device_nic *remus_nic = dev->concrete_data;
     libxl__checkpoint_devices_state *cds = dev->cds;
+    libxl__remus_state *rs = cds->concrete_data;
     const char *out_path_base, *hotplug_error = NULL;
 
     STATE_AO_GC(cds->ao);
@@ -377,7 +383,7 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
 
     if (hotplug_error) {
         LOG(ERROR, "netbuf script %s setup failed for vif %s: %s",
-            cds->netbufscript, vif, hotplug_error);
+            rs->netbufscript, vif, hotplug_error);
         rc = ERROR_FAIL;
         goto out;
     }
@@ -445,6 +451,7 @@ static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
                            int buffer_op)
 {
     int rc, ret;
+    libxl__remus_state *rs = cds->concrete_data;
 
     STATE_AO_GC(cds->ao);
 
@@ -458,7 +465,7 @@ static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
         goto out;
     }
 
-    ret = rtnl_qdisc_add(cds->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
+    ret = rtnl_qdisc_add(rs->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
     if (ret) {
         rc = ERROR_FAIL;
         goto out;
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index 3375331..00e3c80 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -35,9 +35,10 @@ static void remus_setup_failed(libxl__egc *egc,
 static void remus_checkpoint_stream_written(
     libxl__egc *egc, libxl__stream_write_state *sws, int rc);
 
-void libxl__remus_setup(libxl__egc *egc,
-                        libxl__domain_save_state *dss)
+void libxl__remus_setup(libxl__egc *egc, libxl__remus_state *rs)
 {
+    libxl__domain_save_state *dss = CONTAINER_OF(rs, *dss, rs);
+
     /* Convenience aliases */
     libxl__checkpoint_devices_state *const cds = &dss->cds;
     const libxl_domain_remus_info *const info = dss->remus;
@@ -59,6 +60,8 @@ void libxl__remus_setup(libxl__egc *egc,
     cds->domid = dss->domid;
     cds->callback = remus_setup_done;
     cds->ops = remus_ops;
+    cds->concrete_data = rs;
+    rs->interval = info->interval;
 
     dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
 
@@ -103,15 +106,20 @@ static void remus_teardown_done(libxl__egc *egc,
                                 libxl__checkpoint_devices_state *cds,
                                 int rc);
 void libxl__remus_teardown(libxl__egc *egc,
-                           libxl__domain_save_state *dss,
+                           libxl__remus_state *rs,
                            int rc)
 {
+    libxl__domain_save_state *dss = CONTAINER_OF(rs, *dss, rs);
+
+    /* Convenience aliases */
+    libxl__checkpoint_devices_state *const cds = &dss->cds;
+
     EGC_GC;
 
     LOG(WARN, "Remus: Domain suspend terminated with rc %d,"
         " teardown Remus devices...", rc);
-    dss->cds.callback = remus_teardown_done;
-    libxl__checkpoint_devices_teardown(egc, &dss->cds);
+    cds->callback = remus_teardown_done;
+    libxl__checkpoint_devices_teardown(egc, cds);
 }
 
 static void remus_teardown_done(libxl__egc *egc,
@@ -285,9 +293,9 @@ static void remus_devices_commit_cb(libxl__egc *egc,
      */
 
     /* Set checkpoint interval timeout */
-    rc = libxl__ev_time_register_rel(ao, &dss->checkpoint_timeout,
+    rc = libxl__ev_time_register_rel(ao, &dss->rs.checkpoint_timeout,
                                      remus_next_checkpoint,
-                                     dss->interval);
+                                     dss->rs.interval);
 
     if (rc)
         goto out;
@@ -303,7 +311,7 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
                                   int rc)
 {
     libxl__domain_save_state *dss =
-                            CONTAINER_OF(ev, *dss, checkpoint_timeout);
+                            CONTAINER_OF(ev, *dss, rs.checkpoint_timeout);
 
     STATE_AO_GC(dss->ao);
 
diff --git a/tools/libxl/libxl_remus_disk_drbd.c b/tools/libxl/libxl_remus_disk_drbd.c
index 4dddc58..844dd66 100644
--- a/tools/libxl/libxl_remus_disk_drbd.c
+++ b/tools/libxl/libxl_remus_disk_drbd.c
@@ -28,10 +28,11 @@ typedef struct libxl__remus_drbd_disk {
 
 int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds)
 {
+    libxl__remus_state *rs = cds->concrete_data;
     STATE_AO_GC(cds->ao);
 
-    cds->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
-                                       libxl__xen_script_dir_path());
+    rs->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
+                                      libxl__xen_script_dir_path());
 
     return 0;
 }
@@ -96,6 +97,7 @@ static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev)
     int arraysize, nr = 0, rc;
     const libxl_device_disk *disk = dev->backend_dev;
     libxl__async_exec_state *aes = &dev->aodev.aes;
+    libxl__remus_state *rs = dev->cds->concrete_data;
     STATE_AO_GC(dev->cds->ao);
 
     /* setup env & args */
@@ -107,7 +109,7 @@ static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev)
     arraysize = 3;
     nr = 0;
     GCNEW_ARRAY(aes->args, arraysize);
-    aes->args[nr++] = dev->cds->drbd_probe_script;
+    aes->args[nr++] = rs->drbd_probe_script;
     aes->args[nr++] = disk->pdev_path;
     aes->args[nr++] = NULL;
     assert(nr <= arraysize);
-- 
2.5.0


* [PATCH v6 18/18] tools/libxl: seperate device init/cleanup from checkpoint device layer
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (16 preceding siblings ...)
  2015-12-30  2:29 ` [PATCH v6 17/18] tools/libxl: move remus state into a seperate structure Wen Congyang
@ 2015-12-30  2:29 ` Wen Congyang
  2016-01-25 20:01   ` Konrad Rzeszutek Wilk
  2016-01-25 17:12 ` [PATCH v6 00/18] Prerequisite patches for COLO Konrad Rzeszutek Wilk
  18 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2015-12-30  2:29 UTC (permalink / raw)
  To: xen devel, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
  Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng,
	Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang

We call (init|cleanup)_subkind_nic and (init|cleanup)_subkind_drbd_disk
directly in the checkpoint device layer. Move these calls to libxl_remus.c:
call the init helpers before libxl__checkpoint_devices_setup() and the
cleanup helpers after libxl__checkpoint_devices_teardown().
This is pure refactoring with no functional change.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
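A minimal sketch of the call ordering this patch aims for, in case it helps
review. All names ending in _sketch are hypothetical stand-ins and not the
real libxl types or functions; the flags only model "has been initialised"
so the ordering can be checked.

```c
#include <assert.h>

/* Hypothetical stand-in for libxl__checkpoint_devices_state. */
struct cds_sketch {
    int subkind_ready; /* set by subkind init, cleared by subkind cleanup */
    int devices_up;    /* set by generic setup, cleared by generic teardown */
    int fail_init;     /* knob: force the subkind init to fail */
};

/* Remus-owned: nic/drbd specific initialisation. */
static int init_device_subkind_sketch(struct cds_sketch *cds)
{
    if (cds->fail_init)
        return -1;
    cds->subkind_ready = 1;
    return 0;
}

/* Remus-owned: nic/drbd specific cleanup. */
static void cleanup_device_subkind_sketch(struct cds_sketch *cds)
{
    cds->subkind_ready = 0;
}

/* Generic layer: no longer knows anything about nic/drbd subkinds. */
static void checkpoint_devices_setup_sketch(struct cds_sketch *cds)
{
    assert(cds->subkind_ready); /* the caller must have run the init */
    cds->devices_up = 1;
}

static void checkpoint_devices_teardown_sketch(struct cds_sketch *cds)
{
    cds->devices_up = 0;
}

/* Remus layer: subkind init first, then the generic setup. */
static int remus_setup_sketch(struct cds_sketch *cds)
{
    if (init_device_subkind_sketch(cds))
        return -1; /* generic setup is never reached */
    checkpoint_devices_setup_sketch(cds);
    return 0;
}

/* Remus layer: generic teardown first, then the subkind cleanup. */
static void remus_teardown_sketch(struct cds_sketch *cds)
{
    checkpoint_devices_teardown_sketch(cds);
    cleanup_device_subkind_sketch(cds);
}
```

The real code is asynchronous, so the cleanup runs from the teardown
callbacks rather than inline as in this sketch.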
 tools/libxl/libxl_checkpoint_device.c | 42 ++---------------------------------
 tools/libxl/libxl_remus.c             | 42 +++++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+), 40 deletions(-)

diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
index bbc6dc4..0a16dbb 100644
--- a/tools/libxl/libxl_checkpoint_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -17,38 +17,6 @@
 
 #include "libxl_internal.h"
 
-/*----- helper functions -----*/
-
-static int init_device_subkind(libxl__checkpoint_devices_state *cds)
-{
-    /* init device subkind-specific state in the libxl ctx */
-    int rc;
-    STATE_AO_GC(cds->ao);
-
-    if (libxl__netbuffer_enabled(gc)) {
-        rc = init_subkind_nic(cds);
-        if (rc) goto out;
-    }
-
-    rc = init_subkind_drbd_disk(cds);
-    if (rc) goto out;
-
-    rc = 0;
-out:
-    return rc;
-}
-
-static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
-{
-    /* cleanup device subkind-specific state in the libxl ctx */
-    STATE_AO_GC(cds->ao);
-
-    if (libxl__netbuffer_enabled(gc))
-        cleanup_subkind_nic(cds);
-
-    cleanup_subkind_drbd_disk(cds);
-}
-
 /*----- setup() and teardown() -----*/
 
 /* callbacks */
@@ -86,14 +54,10 @@ static void checkpoint_devices_setup(libxl__egc *egc,
 void libxl__checkpoint_devices_setup(libxl__egc *egc,
                                      libxl__checkpoint_devices_state *cds)
 {
-    int i, rc;
+    int i;
 
     STATE_AO_GC(cds->ao);
 
-    rc = init_device_subkind(cds);
-    if (rc)
-        goto out;
-
     cds->num_devices = 0;
     cds->num_nics = 0;
     cds->num_disks = 0;
@@ -126,7 +90,7 @@ void libxl__checkpoint_devices_setup(libxl__egc *egc,
     return;
 
 out:
-    cds->callback(egc, cds, rc);
+    cds->callback(egc, cds, 0);
 }
 
 static void checkpoint_devices_setup(libxl__egc *egc,
@@ -263,8 +227,6 @@ static void devices_teardown_cb(libxl__egc *egc,
     cds->disks = NULL;
     cds->num_disks = 0;
 
-    cleanup_device_subkind(cds);
-
     cds->callback(egc, cds, rc);
 }
 
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index 00e3c80..07a1699 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -26,6 +26,38 @@ static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
     NULL,
 };
 
+/*----- helper functions -----*/
+
+static int init_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+    /* init device subkind-specific state in the libxl ctx */
+    int rc;
+    STATE_AO_GC(cds->ao);
+
+    if (libxl__netbuffer_enabled(gc)) {
+        rc = init_subkind_nic(cds);
+        if (rc) goto out;
+    }
+
+    rc = init_subkind_drbd_disk(cds);
+    if (rc) goto out;
+
+    rc = 0;
+out:
+    return rc;
+}
+
+static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+    /* cleanup device subkind-specific state in the libxl ctx */
+    STATE_AO_GC(cds->ao);
+
+    if (libxl__netbuffer_enabled(gc))
+        cleanup_subkind_nic(cds);
+
+    cleanup_subkind_drbd_disk(cds);
+}
+
 /*-------------------- Remus setup and teardown ---------------------*/
 
 static void remus_setup_done(libxl__egc *egc,
@@ -63,6 +95,12 @@ void libxl__remus_setup(libxl__egc *egc, libxl__remus_state *rs)
     cds->concrete_data = rs;
     rs->interval = info->interval;
 
+    if (init_device_subkind(cds)) {
+        LOG(ERROR, "Remus: failed to init device subkind for guest %u",
+            dss->domid);
+        goto out;
+    }
+
     dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
 
     libxl__checkpoint_devices_setup(egc, cds);
@@ -99,6 +137,8 @@ static void remus_setup_failed(libxl__egc *egc,
         LOG(ERROR, "Remus: failed to teardown device after setup failed"
             " for guest with domid %u, rc %d", dss->domid, rc);
 
+    cleanup_device_subkind(cds);
+
     dss->callback(egc, dss, rc);
 }
 
@@ -133,6 +173,8 @@ static void remus_teardown_done(libxl__egc *egc,
         LOG(ERROR, "Remus: failed to teardown device for guest with domid %u,"
             " rc %d", dss->domid, rc);
 
+    cleanup_device_subkind(cds);
+
     dss->callback(egc, dss, rc);
 }
 
-- 
2.5.0


* Re: [PATCH v6 00/18] Prerequisite patches for COLO
  2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
                   ` (17 preceding siblings ...)
  2015-12-30  2:29 ` [PATCH v6 18/18] tools/libxl: seperate device init/cleanup from checkpoint device layer Wen Congyang
@ 2016-01-25 17:12 ` Konrad Rzeszutek Wilk
  2016-01-25 20:06   ` Konrad Rzeszutek Wilk
  18 siblings, 1 reply; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-25 17:12 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Wed, Dec 30, 2015 at 10:28:50AM +0800, Wen Congyang wrote:
> This patchset is Prerequisite for COLO feature. Refer to:
> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
> 
> It was based on the following series:
> http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02881.html

Do you have this in a git tree? It is a bit hard to apply on the latest
staging. Or could you say which branch/git commit it was based on?

Thanks.


* Re: [PATCH v6 04/18] libxl/save: Refactor libxl__domain_suspend_state
  2015-12-30  2:28 ` [PATCH v6 04/18] libxl/save: Refactor libxl__domain_suspend_state Wen Congyang
@ 2016-01-25 17:29   ` Konrad Rzeszutek Wilk
  2016-01-26  2:23     ` Wen Congyang
  0 siblings, 1 reply; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-25 17:29 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

.. snip..
> --- a/tools/libxl/libxl_dom_suspend.c
> +++ b/tools/libxl/libxl_dom_suspend.c
> @@ -19,14 +19,71 @@
>  
>  /*====================== Domain suspend =======================*/
>  
> +int libxl__domain_suspend_init(libxl__egc *egc,
> +                               libxl__domain_suspend_state *dsps)
> +{
> +    STATE_AO_GC(dsps->ao);
> +    int rc = ERROR_FAIL;
> +    int port;
> +    libxl_domain_type type;
> +
> +    /* Convenience aliases */
> +    const uint32_t domid = dsps->domid;
> +
> +    type = libxl__domain_type(gc, domid);
> +    switch (type) {
> +    case LIBXL_DOMAIN_TYPE_HVM: {
> +        dsps->hvm = 1;
> +        break;
> +    }
> +    case LIBXL_DOMAIN_TYPE_PV:
> +        dsps->hvm = 0;
> +        break;
> +    default:
> +        goto out;

This means we return to libxl__domain_save, which will 'goto out' and call
domain_save_done. That will try to use dsps->guest_evtchn, leading to a crash, since:
> +    }
> +
> +    libxl__xswait_init(&dsps->pvcontrol);
> +    libxl__ev_evtchn_init(&dsps->guest_evtchn);

we initialize them here.
> +    libxl__ev_xswatch_init(&dsps->guest_watch);
> +    libxl__ev_time_init(&dsps->guest_timeout);

I would instead recommend you move these initialization routines above the
'type' check.

> +
> +    dsps->guest_evtchn.port = -1;
> +    dsps->guest_evtchn_lockfd = -1;
> +    dsps->guest_responded = 0;
> +    dsps->dm_savefile = libxl__device_model_savefile(gc, domid);
> +
> +    port = xs_suspend_evtchn_port(domid);
> +
> +    if (port >= 0) {
> +        rc = libxl__ctx_evtchn_init(gc);
> +        if (rc) goto out;
> +
> +        dsps->guest_evtchn.port =
> +            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
> +                                    domid, port, &dsps->guest_evtchn_lockfd);
> +
> +        if (dsps->guest_evtchn.port < 0) {
> +            LOG(WARN, "Suspend event channel initialization failed");
> +            rc = ERROR_FAIL;
> +            goto out;
> +        }
> +    }
> +
> +    rc = 0;
> +
> +out:
> +    return rc;
> +}
> +

.. snip..
>  struct libxl__domain_suspend_state {
> +    /* set by caller of libxl__domain_suspend_init */
> +    libxl__ao *ao;
> +    uint32_t domid;
> +
> +    /* private */
> +    int hvm;

How about calling it 'is_hvm', or just storing the 'libxl_domain_type'
instead of an int? Then you can just do:

if (type == LIBXL_DOMAIN_TYPE_HVM) ..

And to check for non-conforming types, you can make libxl__domain_suspend_init
do this:

    if (type == LIBXL_DOMAIN_TYPE_INVALID) {
        rc = ERROR_FAIL;
        goto out;
    }

?
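
Putting the two suggestions together (initialise before the type check, and
keep the domain type rather than an int), a rough sketch could look like the
following. Everything here is a hypothetical stand-in: the enum values, the
struct and the function names are illustrative only, and the single flag
stands in for the libxl__ev_*_init() calls.

```c
#include <assert.h>

/* Hypothetical stand-in for libxl_domain_type; values are illustrative. */
enum domain_type_sketch { TYPE_INVALID = -1, TYPE_HVM = 1, TYPE_PV = 2 };

/* Hypothetical stand-in for libxl__domain_suspend_state. */
struct dsps_sketch {
    enum domain_type_sketch type;
    int events_initialised; /* models the libxl__ev_*_init() calls */
};

static int suspend_init_sketch(struct dsps_sketch *dsps,
                               enum domain_type_sketch type)
{
    int rc;

    assert(dsps);

    /* Initialise first, so every error path can tear down safely. */
    dsps->events_initialised = 1;

    dsps->type = type;
    if (type == TYPE_INVALID) {
        rc = -1;
        goto out;
    }

    rc = 0;
out:
    return rc;
}

/* Callers test the stored type instead of a separate int flag. */
static int is_hvm_sketch(const struct dsps_sketch *dsps)
{
    return dsps->type == TYPE_HVM;
}
```

With this ordering the failure path still sees initialised event state, which
is the property the crash comment above is about.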


* Re: [PATCH v6 01/18] libxl/remus: init checkpoint_callback in Remus setup callback
  2015-12-30  2:28 ` [PATCH v6 01/18] libxl/remus: init checkpoint_callback in Remus setup callback Wen Congyang
@ 2016-01-25 17:29   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-25 17:29 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Wed, Dec 30, 2015 at 10:28:51AM +0800, Wen Congyang wrote:
> init stream {read/write} state checkpoint_callback in Remus setup callback.
> There's no functional change, it's just refactoring so that we can move
> all remus code into one file.
> 
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  tools/libxl/libxl.c          |  2 ++
>  tools/libxl/libxl_create.c   | 10 +++++++++-
>  tools/libxl/libxl_dom.c      |  5 +----
>  tools/libxl/libxl_internal.h |  4 ++++
>  4 files changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 9207621..d340a20 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -918,6 +918,8 @@ static void libxl__remus_setup(libxl__egc *egc,
>      rds->domid = dss->domid;
>      rds->callback = remus_setup_done;
>  
> +    dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
> +
>      libxl__remus_devices_setup(egc, rds);
>      return;
>  
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index 261816a..6ea9bc2 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -709,6 +709,12 @@ static void remus_checkpoint_stream_done(
>      libxl__xc_domain_saverestore_async_callback_done(egc, &stream->shs, rc);
>  }
>  
> +static void libxl__remus_restore_setup(libxl__egc *egc,
> +                                       libxl__domain_create_state *dcs)
> +{
> +    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
> +}
> +
>  /*----- main domain creation -----*/
>  
>  /* We have a linear control flow; only one event callback is
> @@ -995,6 +1001,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
>      libxl__domain_build_state *const state = &dcs->build_state;
>      libxl__srm_restore_autogen_callbacks *const callbacks =
>          &dcs->srs.shs.callbacks.restore.a;
> +    const int checkpointed_stream = dcs->restore_params.checkpointed_stream;
>  
>      if (rc) {
>          domcreate_rebuild_done(egc, dcs, rc);
> @@ -1033,9 +1040,10 @@ static void domcreate_bootloader_done(libxl__egc *egc,
>      dcs->srs.fd = restore_fd;
>      dcs->srs.legacy = (dcs->restore_params.stream_version == 1);
>      dcs->srs.completion_callback = domcreate_stream_done;
> -    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
>  
>      if (restore_fd >= 0) {
> +        if (checkpointed_stream)
> +            libxl__remus_restore_setup(egc, dcs);
>          libxl__stream_read_start(egc, &dcs->srs);
>          return;
>      }
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index 2269998..9e28bc4 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -1569,8 +1569,6 @@ out:
>  
>  /*----- remus asynchronous checkpoint callback -----*/
>  
> -static void remus_checkpoint_stream_written(
> -    libxl__egc *egc, libxl__stream_write_state *sws, int rc);
>  static void remus_devices_commit_cb(libxl__egc *egc,
>                                      libxl__remus_devices_state *rds,
>                                      int rc);
> @@ -1588,7 +1586,7 @@ static void libxl__remus_domain_save_checkpoint_callback(void *data)
>      libxl__stream_write_start_checkpoint(egc, &dss->sws);
>  }
>  
> -static void remus_checkpoint_stream_written(
> +void remus_checkpoint_stream_written(
>      libxl__egc *egc, libxl__stream_write_state *sws, int rc)
>  {
>      libxl__domain_suspend_state *dss = CONTAINER_OF(sws, *dss, sws);
> @@ -1761,7 +1759,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
>          callbacks->suspend = libxl__remus_domain_suspend_callback;
>          callbacks->postcopy = libxl__remus_domain_resume_callback;
>          callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
> -        dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
>      } else
>          callbacks->suspend = libxl__domain_suspend_callback;
>  
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 630172b..45d7961 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -3507,6 +3507,10 @@ _hidden void libxl__domain_suspend(libxl__egc *egc,
>  /* used by libxc to suspend the guest during migration */
>  _hidden void libxl__domain_suspend_callback(void *data);
>  
> +/* Remus callbacks for restore */
> +_hidden void remus_checkpoint_stream_written(
> +    libxl__egc *egc, libxl__stream_write_state *sws, int rc);
> +
>  
>  /*
>   * Convenience macros.
> -- 
> 2.5.0
> 
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


* Re: [PATCH v6 05/18] tools/libxc: support to resume uncooperative HVM guests
  2015-12-30  2:28 ` [PATCH v6 05/18] tools/libxc: support to resume uncooperative HVM guests Wen Congyang
@ 2016-01-25 18:21   ` Konrad Rzeszutek Wilk
  2016-01-26  2:53     ` Wen Congyang
  0 siblings, 1 reply; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-25 18:21 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Wed, Dec 30, 2015 at 10:28:55AM +0800, Wen Congyang wrote:
> Befor this patch:

s/Befor/Before/
> 1. suspend
> a. PVHVM and PV: we use the same way to suspend the guest (send the suspend
>    request to the guest). If the guest doesn't support evtchn, the xenstore
>    variant will be used, suspending the guest via XenBus control node.
> b. pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to suspend
>    the guest
> 
> 2. Resume:
> a. fast path

s/fast path/fast path (fast=1)/

>    In this case, we don't change the guest's state. And we will call
>    libxl__domain_resume(..., 1) to resume the guest.

Do not change the guest state. We call libxl__domain_resume(.., 1) which
calls xc_domain_resume(..., 1 /* fast=1*/) to resume the guest.


>    PV:       modify the return code to 1, and than call the domctl

s/domctl/domctl:/
>              XEN_DOMCTL_resumedomain
>    PVHVM:    same with PV
>    pure HVM: do nothing in modify_returncode, and than call the domctl:
>              XEN_DOMCTL_resumedomain

> b. slow
>    Used when the guest's state have been changed. And we will call

s/And we will/Will/

>    libxl__domain_resume(..., 0) to resume the guest.
>    PV:       update start info, and reset all secondary CPU states. Than call
>              the domctl: XEN_DOMCTL_resumedomain
>    PVHVM:    can not be resumed. You will get the following error message:
>                  "Cannot resume uncooperative HVM guests"
>    purt HVM: same with PVHVM
> 
> After this patch:
> 1. suspend
>    unchanged
> 
> 2. Resume
> a. fast path:
>    unchanged
> b. slow
>    PV:       unchanged
>    PVHVM:    call XEN_DOMCTL_resumedomain to resume the guest. Because we
>              don't modify the return code, the PV driver will disconnect
>              and reconnect. I am not sure if we should update start info
>              and reset all secondary CPU states.

The guest ends up doing the XENMAPSPACE_shared_info XENMEM_add_to_physmap
hypercall and resetting all of its CPU states to point to the shared_info
(well except the ones past 32).

That is, the Linux kernel does that regardless of whether
SCHEDOP_shutdown:SHUTDOWN_suspend returns 1 or not.

>    Pure HVM: call XEN_DOMCTL_resumedomain to resume the guest.
> 
> Under COLO, we will update the guest's state(modify memory, cpu's registers,
> device status...). In this case, we cannot use the fast path to resume it.
> Keep the return code 0, and use a slow path to resume the guest. While
> resuming HVM using slow path is not supported currently, this patch is to
> make the resume call do not fail.

s/do/to/
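
For what it is worth, the before/after behaviour described in the commit
message can be tabulated as a small decision function. This is only a sketch
of the description above, not of xc_resume.c itself: the enums and the
function name are hypothetical, and 'patched' simply selects between the
"before" and "after" columns.

```c
#include <assert.h>

/* Hypothetical labels for the three guest flavours in the description. */
enum guest_kind_sketch { GUEST_PV, GUEST_PVHVM, GUEST_PURE_HVM };

/* What happens before XEN_DOMCTL_resumedomain is issued, if it is at all. */
enum resume_action_sketch {
    RESUME_REFUSED,                 /* "Cannot resume uncooperative HVM guests" */
    RESUME_AFTER_MODIFY_RETURNCODE, /* set the suspend return code to 1 first */
    RESUME_AFTER_REBUILD,           /* update start info, reset secondary CPUs */
    RESUME_DOMCTL_ONLY              /* just issue the domctl */
};

static enum resume_action_sketch
resume_action_sketch(enum guest_kind_sketch kind, int fast, int patched)
{
    assert(fast == 0 || fast == 1);

    if (fast) {
        /* The fast path is the same before and after the patch. */
        if (kind == GUEST_PURE_HVM)
            return RESUME_DOMCTL_ONLY;
        return RESUME_AFTER_MODIFY_RETURNCODE;
    }

    /* Slow path: PV is unchanged, HVM flavours depend on the patch. */
    if (kind == GUEST_PV)
        return RESUME_AFTER_REBUILD;
    return patched ? RESUME_DOMCTL_ONLY : RESUME_REFUSED;
}
```

The only cells that differ between patched=0 and patched=1 are the two HVM
flavours on the slow path, which matches the "After this patch" list.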
> 
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> ---
>  tools/libxc/xc_resume.c | 24 ++++++++++++++++++++----
>  1 file changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c
> index 87d4324..503e4f8 100644
> --- a/tools/libxc/xc_resume.c
> +++ b/tools/libxc/xc_resume.c
> @@ -108,6 +108,25 @@ static int xc_domain_resume_cooperative(xc_interface *xch, uint32_t domid)
>      return do_domctl(xch, &domctl);
>  }
>  
> +static int xc_domain_resume_hvm(xc_interface *xch, uint32_t domid)
> +{
> +    DECLARE_DOMCTL;
> +
> +    /*
> +     * This domctl XEN_DOMCTL_resumedomain just unpause each vcpu. After

s/This/The/
s/just//
> +     * this domctl, the guest will run.
s/this/the/

> +     *
> +     * If it is PVHVM, the guest called the hypercall HYPERVISOR_sched_op

s/HYPERVISOR_sched_op/SCHEDOP_shutdown:SHUTDOWN_suspend/
> +     * to suspend itself. We don't modify the return code, so the PV driver
> +     * will disconnect and reconnect.
> +     *
> +     * If it is a HVM, the guest will continue running.
> +     */
> +    domctl.cmd = XEN_DOMCTL_resumedomain;
> +    domctl.domain = domid;
> +    return do_domctl(xch, &domctl);
> +}
> +
>  static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
>  {
>      DECLARE_DOMCTL;
> @@ -137,10 +156,7 @@ static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
>       */
>  #if defined(__i386__) || defined(__x86_64__)
>      if ( info.hvm )
> -    {
> -        ERROR("Cannot resume uncooperative HVM guests");
> -        return rc;
> -    }
> +        return xc_domain_resume_hvm(xch, domid);
>  
>      if ( xc_domain_get_guest_width(xch, domid, &dinfo->guest_width) != 0 )
>      {
> -- 
> 2.5.0
> 
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

* Re: [PATCH v6 06/18] tools/libxl: introduce enum type libxl_checkpointed_stream
  2015-12-30  2:28 ` [PATCH v6 06/18] tools/libxl: introduce enum type libxl_checkpointed_stream Wen Congyang
@ 2016-01-25 18:30   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-25 18:30 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Wed, Dec 30, 2015 at 10:28:56AM +0800, Wen Congyang wrote:
> introduce enum type libxl_checkpointed_stream in IDL.

s/introduce/Introduce/
> rename the last argument of migrate_receive from "remus" to
> "checkpointed" since the semantics of this parameter has
> changed.
> 
> NOTE:
>  libxl_domain_restore_params isn't changed here,
>  checkpointed_stream is still an int.

Why? The enum looks so much nicer. Is the reason you do not want to
change it to an enum that struct domain_create seems to only have
basic types such as 'int' and 'const char'?

Thanks.
> 
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> ---
>  tools/libxl/libxl.h             |  7 +++++++
>  tools/libxl/libxl_create.c      |  8 ++++++--
>  tools/libxl/libxl_stream_read.c |  7 +++++--
>  tools/libxl/libxl_types.idl     |  5 +++++
>  tools/libxl/xl_cmdimpl.c        | 18 ++++++++++++------
>  5 files changed, 35 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 05606a7..a01e448 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -867,6 +867,13 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
>   */
>  #define LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE 1
>  
> +/*
> + * LIBXL_HAVE_CHECKPOINTED_STREAM
> + *
> + * If this is defined, then libxl_checkpointed_stream exists.
> + */
> +#define LIBXL_HAVE_CHECKPOINTED_STREAM 1
> +
>  typedef char **libxl_string_list;
>  void libxl_string_list_dispose(libxl_string_list *sl);
>  int libxl_string_list_length(const libxl_string_list *sl);
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index bfa0552..8d3896f 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -1015,9 +1015,13 @@ static void domcreate_bootloader_done(libxl__egc *egc,
>      dcs->srs.completion_callback = domcreate_stream_done;
>  
>      if (restore_fd >= 0) {
> -        if (checkpointed_stream)
> +        switch (checkpointed_stream) {
> +        case LIBXL_CHECKPOINTED_STREAM_REMUS:
>              libxl__remus_restore_setup(egc, dcs);
> -        libxl__stream_read_start(egc, &dcs->srs);
> +            /* fall through */
> +        case LIBXL_CHECKPOINTED_STREAM_NONE:
> +            libxl__stream_read_start(egc, &dcs->srs);
> +        }
>          return;
>      }
>  
> diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
> index 42c087f..6ad2a27 100644
> --- a/tools/libxl/libxl_stream_read.c
> +++ b/tools/libxl/libxl_stream_read.c
> @@ -780,15 +780,18 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
>       * If the stream is not still alive, we must not continue any work.
>       */
>      if (libxl__stream_read_inuse(stream)) {
> -        if (checkpointed_stream) {
> +        switch (checkpointed_stream) {
> +        case LIBXL_CHECKPOINTED_STREAM_REMUS:
>              /* failover */
>              stream_complete(egc, stream, 0);
> -        } else {
> +            break;
> +        case LIBXL_CHECKPOINTED_STREAM_NONE:
>              /*
>               * Libxc has indicated that it is done with the stream.
>               * Resume reading libxl records from it.
>               */
>              stream_continue(egc, stream);
> +            break;
>          }
>      }
>  }
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 9658356..3ef11aa 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -228,6 +228,11 @@ libxl_hdtype = Enumeration("hdtype", [
>      (2, "AHCI"),
>      ], init_val = "LIBXL_HDTYPE_IDE")
>  
> +libxl_checkpointed_stream = Enumeration("checkpointed_stream", [
> +    (0, "NONE"),
> +    (1, "REMUS"),
> +    ])
> +
>  #
>  # Complex libxl types
>  #
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index f9933cb..c1cd696 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -4424,7 +4424,8 @@ static void migrate_domain(uint32_t domid, const char *rune, int debug,
>  }
>  
>  static void migrate_receive(int debug, int daemonize, int monitor,
> -                            int send_fd, int recv_fd, int remus)
> +                            int send_fd, int recv_fd,
> +                            libxl_checkpointed_stream checkpointed)
>  {
>      uint32_t domid;
>      int rc, rc2;
> @@ -4449,7 +4450,7 @@ static void migrate_receive(int debug, int daemonize, int monitor,
>      dom_info.paused = 1;
>      dom_info.migrate_fd = recv_fd;
>      dom_info.migration_domname_r = &migration_domname;
> -    dom_info.checkpointed_stream = remus;
> +    dom_info.checkpointed_stream = checkpointed;
>  
>      rc = create_domain(&dom_info);
>      if (rc < 0) {
> @@ -4460,7 +4461,8 @@ static void migrate_receive(int debug, int daemonize, int monitor,
>  
>      domid = rc;
>  
> -    if (remus) {
> +    switch (checkpointed) {
> +    case LIBXL_CHECKPOINTED_STREAM_REMUS:
>          /* If we are here, it means that the sender (primary) has crashed.
>           * TODO: Split-Brain Check.
>           */
> @@ -4493,6 +4495,9 @@ static void migrate_receive(int debug, int daemonize, int monitor,
>                      common_domname, domid, rc);
>  
>          exit(rc ? -ERROR_FAIL: 0);
> +    default:
> +        /* do nothing */
> +        break;
>      }
>  
>      fprintf(stderr, "migration target: Transfer complete,"
> @@ -4630,7 +4635,8 @@ int main_restore(int argc, char **argv)
>  
>  int main_migrate_receive(int argc, char **argv)
>  {
> -    int debug = 0, daemonize = 1, monitor = 1, remus = 0;
> +    int debug = 0, daemonize = 1, monitor = 1;
> +    libxl_checkpointed_stream checkpointed = LIBXL_CHECKPOINTED_STREAM_NONE;
>      int opt;
>  
>      SWITCH_FOREACH_OPT(opt, "Fedr", NULL, "migrate-receive", 0) {
> @@ -4645,7 +4651,7 @@ int main_migrate_receive(int argc, char **argv)
>          debug = 1;
>          break;
>      case 'r':
> -        remus = 1;
> +        checkpointed = LIBXL_CHECKPOINTED_STREAM_REMUS;
>          break;
>      }
>  
> @@ -4655,7 +4661,7 @@ int main_migrate_receive(int argc, char **argv)
>      }
>      migrate_receive(debug, daemonize, monitor,
>                      STDOUT_FILENO, STDIN_FILENO,
> -                    remus);
> +                    checkpointed);
>  
>      return 0;
>  }
> -- 
> 2.5.0

* Re: [PATCH v6 09/18] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()
  2015-12-30  2:28 ` [PATCH v6 09/18] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty() Wen Congyang
@ 2016-01-25 18:59   ` Konrad Rzeszutek Wilk
  2016-01-26  7:04     ` Wen Congyang
  0 siblings, 1 reply; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-25 18:59 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Wed, Dec 30, 2015 at 10:28:59AM +0800, Wen Congyang wrote:
> Secondary vm is running in colo mode, we need to send
> secondary vm's dirty page information to master at checkpoint,

In previous patch you called it primary, so perhaps:
s/master/primary/ ?

> so we have to enable qemu logdirty on secondary.
> 
> libxl__domain_suspend_common_switch_qemu_logdirty() is to enable
> qemu logdirty. But it uses domain_save_state, and calls

s/domain_save_state/libxl__domain_save_state/
> libxl__xc_domain_saverestore_async_callback_done()
> before exits. This can not be used for secondary vm.
> 
> Update libxl__domain_suspend_common_switch_qemu_logdirty() to
> introduce a new API libxl__domain_common_switch_qemu_logdirty().
> This API only uses libxl__logdirty_switch, and calls
> lds->callback before exits.

One question - that perhaps had been part of the review earlier
(if so it may be good to include this in the description
so I don't ask silly questions):

Why add this extra API? You could squash libxl__domain_suspend_common_switch_qemu_logdirty
and libxl__domain_common_switch_qemu_logdirty code together
and call it libxl_domain_common_and_suspend_common_switch_qemu_logdirty
(ok, just kidding on the name). But - why not have one function
instead of splitting the functionality in two?
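
For reference, my reading of the split is roughly the following. This is
only a minimal sketch with made-up struct and function names (they are
stand-ins, not the libxl types, and the rc argument just simulates the
outcome of the switch): the common function only knows about the switch
state and reports through its callback, while the suspend wrapper installs
a callback that recovers the enclosing save state with CONTAINER_OF. If
that is the intent, it would be worth saying so in the description.

```c
#include <stddef.h>

/* Hypothetical stand-ins for libxl__logdirty_switch and
 * libxl__domain_save_state; field names are illustrative only. */
struct logdirty_switch {
    void (*callback)(struct logdirty_switch *lds, int rc);
};

struct save_state {
    int rc;
    int helper_result;      /* what the save helper would be told */
    struct logdirty_switch logdirty;
};

#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Generic part: knows only about the switch and its callback. */
static void common_switch_logdirty(struct logdirty_switch *lds, int rc)
{
    lds->callback(lds, rc);
}

/* Save-specific completion: recovers the enclosing save state. */
static void suspend_switch_done(struct logdirty_switch *lds, int rc)
{
    struct save_state *dss = CONTAINER_OF(lds, struct save_state, logdirty);

    if (rc)
        dss->rc = rc;
    dss->helper_result = rc ? -1 : 0;
}

/* Save-specific entry point: installs its callback, then delegates. */
static void suspend_switch_logdirty(struct save_state *dss, int rc)
{
    dss->logdirty.callback = suspend_switch_done;
    common_switch_logdirty(&dss->logdirty, rc);
}
```

A secondary-side caller would then embed its own switch state and install a
different callback, without needing a save state at all.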

Is there another patch that depends on it? If so it may be good
to spell it out, like:

Patch blah blah is going to use it.

Thanks!
> 
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> ---
>  tools/libxl/libxl_dom_save.c | 95 ++++++++++++++++++++++++--------------------
>  tools/libxl/libxl_internal.h |  8 ++++
>  2 files changed, 60 insertions(+), 43 deletions(-)
> 
> diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
> index b3ecad7..79e43f1 100644
> --- a/tools/libxl/libxl_dom_save.c
> +++ b/tools/libxl/libxl_dom_save.c
> @@ -42,7 +42,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
>  static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
>                              const char *watch_path, const char *event_path);
>  static void switch_logdirty_done(libxl__egc *egc,
> -                                 libxl__domain_save_state *dss, int rc);
> +                                 libxl__logdirty_switch *lds, int rc);
>  
>  static void logdirty_init(libxl__logdirty_switch *lds)
>  {
> @@ -52,13 +52,10 @@ static void logdirty_init(libxl__logdirty_switch *lds)
>  }
>  
>  static void domain_suspend_switch_qemu_xen_traditional_logdirty
> -                               (int domid, unsigned enable,
> -                                libxl__save_helper_state *shs)
> +                               (libxl__egc *egc, int domid, unsigned enable,
> +                                libxl__logdirty_switch *lds)
>  {
> -    libxl__egc *egc = shs->egc;
> -    libxl__domain_save_state *dss = shs->caller_state;
> -    libxl__logdirty_switch *lds = &dss->logdirty;
> -    STATE_AO_GC(dss->ao);
> +    STATE_AO_GC(lds->ao);
>      int rc;
>      xs_transaction_t t = 0;
>      const char *got;
> @@ -120,26 +117,34 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty
>   out:
>      LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
>      libxl__xs_transaction_abort(gc, &t);
> -    switch_logdirty_done(egc,dss,rc);
> +    switch_logdirty_done(egc,lds,rc);
>  }
>  
>  static void domain_suspend_switch_qemu_xen_logdirty
> -                               (int domid, unsigned enable,
> -                                libxl__save_helper_state *shs)
> +                               (libxl__egc *egc, int domid, unsigned enable,
> +                                libxl__logdirty_switch *lds)
>  {
> -    libxl__egc *egc = shs->egc;
> -    libxl__domain_save_state *dss = shs->caller_state;
> -    STATE_AO_GC(dss->ao);
> +    STATE_AO_GC(lds->ao);
>      int rc;
>  
>      rc = libxl__qmp_set_global_dirty_log(gc, domid, enable);
> -    if (!rc) {
> -        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
> -    } else {
> +    if (rc)
>          LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
> +
> +    lds->callback(egc, lds, rc);
> +}
> +
> +static void domain_suspend_switch_qemu_logdirty_done
> +                        (libxl__egc *egc, libxl__logdirty_switch *lds, int rc)
> +{
> +    libxl__domain_save_state *dss = CONTAINER_OF(lds, *dss, logdirty);
> +
> +    if (rc) {
>          dss->rc = rc;
> -        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
> -    }
> +        libxl__xc_domain_saverestore_async_callback_done(egc,
> +                                                         &dss->sws.shs, -1);
> +    } else
> +        libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
>  }
>  
>  void libxl__domain_suspend_common_switch_qemu_logdirty
> @@ -148,42 +153,52 @@ void libxl__domain_suspend_common_switch_qemu_logdirty
>      libxl__save_helper_state *shs = user;
>      libxl__egc *egc = shs->egc;
>      libxl__domain_save_state *dss = shs->caller_state;
> -    STATE_AO_GC(dss->ao);
> +
> +    /* convenience aliases */

/* Convenience aliases. */

> +    libxl__logdirty_switch *const lds = &dss->logdirty;
> +
> +    lds->callback = domain_suspend_switch_qemu_logdirty_done;
> +    libxl__domain_common_switch_qemu_logdirty(egc, domid, enable, lds);
> +}
> +
> +void libxl__domain_common_switch_qemu_logdirty(libxl__egc *egc,
> +                                               int domid, unsigned enable,
> +                                               libxl__logdirty_switch *lds)
> +{
> +    STATE_AO_GC(lds->ao);
>  
>      switch (libxl__device_model_version_running(gc, domid)) {
>      case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
> -        domain_suspend_switch_qemu_xen_traditional_logdirty(domid, enable, shs);
> +        domain_suspend_switch_qemu_xen_traditional_logdirty(egc, domid, enable,
> +                                                            lds);
>          break;
>      case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
> -        domain_suspend_switch_qemu_xen_logdirty(domid, enable, shs);
> +        domain_suspend_switch_qemu_xen_logdirty(egc, domid, enable, lds);
>          break;
>      case LIBXL_DEVICE_MODEL_VERSION_NONE:
> -        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
> +        lds->callback(egc, lds, 0);
>          break;
>      default:
>          LOG(ERROR,"logdirty switch failed"
>              ", no valid device model version found, abandoning suspend");
> -        dss->rc = ERROR_FAIL;
> -        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
> +        lds->callback(egc, lds, ERROR_FAIL);
>      }
>  }
>  static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
>                                      const struct timeval *requested_abs,
>                                      int rc)
>  {
> -    libxl__domain_save_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
> -    STATE_AO_GC(dss->ao);
> +    libxl__logdirty_switch *lds = CONTAINER_OF(ev, *lds, timeout);
> +    STATE_AO_GC(lds->ao);
>      LOG(ERROR,"logdirty switch: wait for device model timed out");
> -    switch_logdirty_done(egc,dss,ERROR_FAIL);
> +    switch_logdirty_done(egc,lds,ERROR_FAIL);
>  }
>  
>  static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
>                              const char *watch_path, const char *event_path)
>  {
> -    libxl__domain_save_state *dss =
> -        CONTAINER_OF(watch, *dss, logdirty.watch);
> -    libxl__logdirty_switch *lds = &dss->logdirty;
> -    STATE_AO_GC(dss->ao);
> +    libxl__logdirty_switch *lds = CONTAINER_OF(watch, *lds, watch);
> +    STATE_AO_GC(lds->ao);
>      const char *got;
>      xs_transaction_t t = 0;
>      int rc;
> @@ -229,28 +244,20 @@ static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
>      if (rc <= 0) {
>          if (rc < 0)
>              LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
> -        switch_logdirty_done(egc,dss,rc);
> +        switch_logdirty_done(egc,lds,rc);
>      }
>  }
>  
>  static void switch_logdirty_done(libxl__egc *egc,
> -                                 libxl__domain_save_state *dss,
> +                                 libxl__logdirty_switch *lds,
>                                   int rc)
>  {
> -    STATE_AO_GC(dss->ao);
> -    libxl__logdirty_switch *lds = &dss->logdirty;
> +    STATE_AO_GC(lds->ao);
>  
>      libxl__ev_xswatch_deregister(gc, &lds->watch);
>      libxl__ev_time_deregister(gc, &lds->timeout);
>  
> -    int broke;
> -    if (rc) {
> -        broke = -1;
> -        dss->rc = rc;
> -    } else {
> -        broke = 0;
> -    }
> -    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, broke);
> +    lds->callback(egc, lds, rc);
>  }
>  
>  /*----- callbacks, called by xc_domain_save -----*/
> @@ -346,6 +353,8 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
>  
>      dss->rc = 0;
>      logdirty_init(&dss->logdirty);
> +    dss->logdirty.ao = ao;
> +
>      dsps->ao = ao;
>      dsps->domid = domid;
>      rc = libxl__domain_suspend_init(egc, dsps);
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 4872619..552692f 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -3071,6 +3071,11 @@ libxl__stream_write_inuse(const libxl__stream_write_state *stream)
>  }
>  
>  typedef struct libxl__logdirty_switch {
> +    /* set by caller of libxl__domain_common_switch_qemu_logdirty */

s/set/Set/

> +    libxl__ao *ao;
> +    void (*callback)(libxl__egc *egc, struct libxl__logdirty_switch *lds,
> +                     int rc);
> +
>      const char *cmd;
>      const char *cmd_path;
>      const char *ret_path;
> @@ -3490,6 +3495,9 @@ void libxl__xc_domain_saverestore_async_callback_done(libxl__egc *egc,
>  
>  _hidden void libxl__domain_suspend_common_switch_qemu_logdirty
>                                 (int domid, unsigned int enable, void *data);
> +_hidden void libxl__domain_common_switch_qemu_logdirty(libxl__egc *egc,
> +                                               int domid, unsigned enable,
> +                                               libxl__logdirty_switch *lds);
>  _hidden int libxl__save_emulator_xenstore_data(libxl__domain_save_state *dss,
>                                                 char **buf, uint32_t *len);
>  _hidden int libxl__restore_emulator_xenstore_data
> -- 
> 2.5.0

* Re: [PATCH v6 10/18] tools/libxl: export logdirty_init
  2015-12-30  2:29 ` [PATCH v6 10/18] tools/libxl: export logdirty_init Wen Congyang
@ 2016-01-25 19:01   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-25 19:01 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Wed, Dec 30, 2015 at 10:29:00AM +0800, Wen Congyang wrote:
> We need to enable logdirty on secondary, so we export logdirty_init
> for internal use. Rename it to libxl__logdirty_init.
> 
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  tools/libxl/libxl_dom_save.c | 4 ++--
>  tools/libxl/libxl_internal.h | 2 ++
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
> index 79e43f1..8e8d280 100644
> --- a/tools/libxl/libxl_dom_save.c
> +++ b/tools/libxl/libxl_dom_save.c
> @@ -44,7 +44,7 @@ static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
>  static void switch_logdirty_done(libxl__egc *egc,
>                                   libxl__logdirty_switch *lds, int rc);
>  
> -static void logdirty_init(libxl__logdirty_switch *lds)
> +void libxl__logdirty_init(libxl__logdirty_switch *lds)
>  {
>      lds->cmd_path = 0;
>      libxl__ev_xswatch_init(&lds->watch);
> @@ -352,7 +352,7 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
>      }
>  
>      dss->rc = 0;
> -    logdirty_init(&dss->logdirty);
> +    libxl__logdirty_init(&dss->logdirty);
>      dss->logdirty.ao = ao;
>  
>      dsps->ao = ao;
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 552692f..8a429b7 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -3083,6 +3083,8 @@ typedef struct libxl__logdirty_switch {
>      libxl__ev_time timeout;
>  } libxl__logdirty_switch;
>  
> +_hidden void libxl__logdirty_init(libxl__logdirty_switch *lds);
> +
>  struct libxl__domain_suspend_state {
>      /* set by caller of libxl__domain_suspend_init */
>      libxl__ao *ao;
> -- 
> 2.5.0

* Re: [PATCH v6 11/18] tools/libxl: Add back channel to allow migration target send data back
  2015-12-30  2:29 ` [PATCH v6 11/18] tools/libxl: Add back channel to allow migration target send data back Wen Congyang
@ 2016-01-25 19:17   ` Konrad Rzeszutek Wilk
  2016-01-26  7:48     ` Wen Congyang
  0 siblings, 1 reply; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-25 19:17 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Wed, Dec 30, 2015 at 10:29:01AM +0800, Wen Congyang wrote:
> In colo mode, slave needs to send data to master, but the io_fd



In previous patches you used COLO in all caps, can that be uniform
across the patches?

Also, slave == secondary and master == primary? Perhaps you
could s/slave/secondary/ s/master/primary/ to sync up with
the other patches?

Thank you!
> only can be written in master, and only can be read in slave.



Could you mention what kind of data the secondary has to send
to the primary? The previous patch (tools/libxl:
introduce libxl__domain_common_switch_qemu_logdirty()) mentioned
dirty page information. Is that the case here? If so, can you mention
that here as well?


> Save recv_fd in domain_suspend_state, and send_fd in
> domain_create_state.
> Extend libxl_domain_create_restore API, add a send_fd param to
> it.
> Add LIBXL_HAVE_CREATE_RESTORE_SEND_FD to indicate the API change.
> 
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> ---
>  tools/libxl/libxl.c                  |  2 +-
>  tools/libxl/libxl.h                  | 30 ++++++++++++++++++++++++++++--
>  tools/libxl/libxl_create.c           |  9 +++++----
>  tools/libxl/libxl_internal.h         |  2 ++
>  tools/libxl/libxl_types.idl          |  1 +
>  tools/libxl/xl_cmdimpl.c             |  8 +++++++-
>  tools/ocaml/libs/xl/xenlight_stubs.c |  2 +-
>  7 files changed, 45 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 2faea4d..69c8047 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -872,7 +872,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
>      dss->callback = remus_failover_cb;
>      dss->domid = domid;
>      dss->fd = send_fd;
> -    /* TODO do something with recv_fd */
> +    dss->recv_fd = recv_fd;
>      dss->type = type;
>      dss->live = 1;
>      dss->debug = 0;
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index a01e448..67a4ad7 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -630,6 +630,15 @@ typedef struct libxl__ctx libxl_ctx;
>  #define LIBXL_HAVE_DOMAIN_CREATE_RESTORE_PARAMS 1
>  
>  /*
> + * LIBXL_HAVE_DOMAIN_CREATE_RESTORE_SEND_FD 1
> + *
> + * If this is defined, libxl_domain_create_restore()'s API has changed to
> + * include a send_fd param which used for libxl migration back channel
> + * during COLO FT.

FT? Could you explain that acronym please?
> + */
> +#define LIBXL_HAVE_DOMAIN_CREATE_RESTORE_SEND_FD 1
> +
> +/*
>   * LIBXL_HAVE_CREATEINFO_PVH
>   * If this is defined, then libxl supports creation of a PVH guest.
>   */
> @@ -1143,7 +1152,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
>                              const libxl_asyncprogress_how *aop_console_how)
>                              LIBXL_EXTERNAL_CALLERS_ONLY;
>  int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
> -                                uint32_t *domid, int restore_fd,
> +                                uint32_t *domid, int restore_fd, int send_fd,
>                                  const libxl_domain_restore_params *params,
>                                  const libxl_asyncop_how *ao_how,
>                                  const libxl_asyncprogress_how *aop_console_how)
> @@ -1164,7 +1173,7 @@ int static inline libxl_domain_create_restore_0x040200(
>      libxl_domain_restore_params_init(&params);
>  
>      ret = libxl_domain_create_restore(
> -        ctx, d_config, domid, restore_fd, &params, ao_how, aop_console_how);
> +        ctx, d_config, domid, restore_fd, -1, &params, ao_how, aop_console_how);
>  
>      libxl_domain_restore_params_dispose(&params);
>      return ret;
> @@ -1172,6 +1181,23 @@ int static inline libxl_domain_create_restore_0x040200(
>  
>  #define libxl_domain_create_restore libxl_domain_create_restore_0x040200
>  
> +#elif defined(LIBXL_API_VERSION) && LIBXL_API_VERSION >= 0x040400 \
> +                                 && LIBXL_API_VERSION < 0x040600

s/4060/4070? Or is that supposed to be <= 0x040600?
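
To spell out what I am asking: as written the range is half-open, so a
caller that requests 0x040600 is not covered by this branch. A minimal
sketch of the same comparison as a plain function (the function name is
made up for illustration, and I am not claiming which bound is the right
one):

```c
/* Mirrors the range test from the quoted #elif; the function name is
 * hypothetical. Returns nonzero when api_version falls in the range. */
static int in_restore_compat_range(unsigned int api_version)
{
    return api_version >= 0x040400 && api_version < 0x040600;
}
```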
> +
.. snip..
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 9aa94be..c5d5d40 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -232,6 +232,7 @@ libxl_hdtype = Enumeration("hdtype", [
>  libxl_checkpointed_stream = Enumeration("checkpointed_stream", [
>      (0, "NONE"),
>      (1, "REMUS"),
> +    (2, "COLO"),

You should also update the migration_stream enum with the extra value.

And if you follow my idea of adding an assertion in xc_domain_save
for the different checkpointed_stream types, then that would need to
be expanded to include support for the COLO type as well.
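
Something along these lines is what I have in mind for the assertion. It
is only a sketch: the enum and function names below are hypothetical, the
real names in libxc may differ, and where exactly the check lives is up
to you.

```c
#include <assert.h>

/* Hypothetical stream types; the real libxc names may differ. */
enum stream_type {
    STREAM_NONE  = 0,
    STREAM_REMUS = 1,
    STREAM_COLO  = 2,
};

/* Returns 1 for a known stream type, 0 otherwise. */
static int stream_type_is_valid(int checkpointed_stream)
{
    switch (checkpointed_stream) {
    case STREAM_NONE:
    case STREAM_REMUS:
    case STREAM_COLO:
        return 1;
    default:
        return 0;
    }
}

/* What an entry check at the top of a save function could look like. */
static void check_stream_type(int checkpointed_stream)
{
    assert(stream_type_is_valid(checkpointed_stream));
}
```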

* Re: [PATCH v6 12/18] tools/libx{l, c}: add back channel to libxc
  2015-12-30  2:29 ` [PATCH v6 12/18] tools/libx{l, c}: add back channel to libxc Wen Congyang
@ 2016-01-25 19:41   ` Konrad Rzeszutek Wilk
  2016-01-26  8:03     ` Wen Congyang
  0 siblings, 1 reply; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-25 19:41 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Wed, Dec 30, 2015 at 10:29:02AM +0800, Wen Congyang wrote:
> In COLO mode, both VMs are running, and are considered in sync if the
> visible network traffic is identical.  After some time, they fall out of
> sync.
> 
> At this point, the two VMs have definitely diverged.  Lets call the
> primary dirty bitmap set A, while the secondary dirty bitmap set B.
> 
> Sets A and B are different.
> 
> Under normal migration, the page data for set A will be sent form the

s/form/from/

> primary to the secondary.
> 
> However, the set difference B - A (lets call this C) is out-of-date on
> the secondary (with respect to the primary) and will not be sent by the
> primary, as it was not memory dirtied by the primary.  The secondary

s/primary/primary (to secondary)/

> needs the page data for C to reconstruct an exact copy of the primary at

s/the page data/C page data/

> the checkpoint.
> 
> The secondary cannot calculate C as it doesn't know A.  Instead, the
> secondary must send B to the primary, at which point the primary
> calculates the union of A and B (lets call this D) which is all the
> pages dirtied by both the primary and the secondary, and sends all page
> data covered by D.

You could invert this - the primary could send A to the secondary? I presume
this is non-optimal as the 'A' set is much, much bigger than the 'C' set?

It may be good to include this in the commit description.
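
To make the set arithmetic concrete, here is a minimal sketch on plain
word arrays. It is not the libxc bitmap code; the helper names and the
word-array layout are assumptions for illustration only. It shows that
D = A | B is the same as A plus the secondary-only pages C = B & ~A,
which is why sending the pages in A alone is not enough.

```c
#include <stddef.h>

/* D = A | B: every page dirtied on either side must be resent. */
static void bitmap_union(unsigned long *d, const unsigned long *a,
                         const unsigned long *b, size_t nr_words)
{
    for (size_t i = 0; i < nr_words; i++)
        d[i] = a[i] | b[i];
}

/* C = B & ~A: pages dirtied only on the secondary. */
static void bitmap_difference(unsigned long *c, const unsigned long *b,
                              const unsigned long *a, size_t nr_words)
{
    for (size_t i = 0; i < nr_words; i++)
        c[i] = b[i] & ~a[i];
}
```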

> 
> In the general case, D is a superset of both A and B.  Without the
> backchannel dirty bitmap, a COLO checkpoint can't reconstruct a valid
> copy of the primary.
> 
> We transfer the dirty bitmap on libxc side, so we need to introduce back
> channel to libxc.

> 
> Note: it is different from the paper. We change the original design to
> the current one, according to our following concerns:
> 1. The original design needs extra memory on Secondary host. When there's
>    multiple backups on one host, the memory cost is high.
> 2. The memory cache code will be another 1k+, it will make the review
>    more time consuming.

Well, that 2) is a very good reason :-)
> 
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> commit message:

? Huh?

> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>

.. snip..
> index 05159bb..d4dc501 100644
> --- a/tools/libxc/xc_sr_restore.c
> +++ b/tools/libxc/xc_sr_restore.c
> @@ -722,7 +722,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>                        unsigned long *console_gfn, domid_t console_domid,
>                        unsigned int hvm, unsigned int pae, int superpages,
>                        int checkpointed_stream,
> -                      struct restore_callbacks *callbacks)
> +                      struct restore_callbacks *callbacks, int back_fd)
>  {
>      struct xc_sr_context ctx =
>          {
> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
> index 8ffd71d..a49d083 100644
> --- a/tools/libxc/xc_sr_save.c
> +++ b/tools/libxc/xc_sr_save.c
> @@ -824,7 +824,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
>  int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
>                     uint32_t max_iters, uint32_t max_factor, uint32_t flags,
>                     struct save_callbacks* callbacks, int hvm,
> -                   int checkpointed_stream)
> +                   int checkpointed_stream, int back_fd)
>  {
>      struct xc_sr_context ctx =
>          {


But where is the code?

Or is that supposed to be done in another patch? If so, you may want to
mention that in the commit description.


* Re: [PATCH v6 13/18] tools/libxl: rename remus device to checkpoint device
  2015-12-30  2:29 ` [PATCH v6 13/18] tools/libxl: rename remus device to checkpoint device Wen Congyang
@ 2016-01-25 19:42   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-25 19:42 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Wed, Dec 30, 2015 at 10:29:03AM +0800, Wen Congyang wrote:
> This patch is auto generated by the following commands:
>  1. git mv tools/libxl/libxl_remus_device.c tools/libxl/libxl_checkpoint_device.c
>  2. perl -pi -e 's/libxl_remus_device/libxl_checkpoint_device/g' tools/libxl/Makefile
>  3. perl -pi -e 's/\blibxl__remus_devices/libxl__checkpoint_devices/g' tools/libxl/*.[ch]
>  4. perl -pi -e 's/\blibxl__remus_device\b/libxl__checkpoint_device/g' tools/libxl/*.[ch]
>  5. perl -pi -e 's/\blibxl__remus_device_instance_ops\b/libxl__checkpoint_device_instance_ops/g' tools/libxl/*.[ch]
>  6. perl -pi -e 's/\blibxl__remus_callback\b/libxl__checkpoint_callback/g' tools/libxl/*.[ch]
>  7. perl -pi -e 's/\bremus_device_init\b/checkpoint_device_init/g' tools/libxl/*.[ch]
>  8. perl -pi -e 's/\bremus_devices_setup\b/checkpoint_devices_setup/g' tools/libxl/*.[ch]
>  9. perl -pi -e 's/\bdefine_remus_checkpoint_api\b/define_checkpoint_api/g' tools/libxl/*.[ch]
> 10. perl -pi -e 's/\brds\b/cds/g' tools/libxl/*.[ch]
> 11. perl -pi -e 's/REMUS_DEVICE/CHECKPOINT_DEVICE/g' tools/libxl/*.[ch] tools/libxl/*.idl
> 12. perl -pi -e 's/REMUS_DEVOPS/CHECKPOINT_DEVOPS/g' tools/libxl/*.[ch] tools/libxl/*.idl
> 13. perl -pi -e 's/\bremus\b/checkpoint/g' tools/libxl/libxl_checkpoint_device.[ch]
> 14. perl -pi -e 's/\bremus device/checkpoint device/g' tools/libxl/libxl_internal.h
> 15. perl -pi -e 's/\bRemus device/checkpoint device/g' tools/libxl/libxl_internal.h
> 16. perl -pi -e 's/\bremus abstract/checkpoint abstract/g' tools/libxl/libxl_internal.h
> 17. perl -pi -e 's/\bremus invocation/checkpoint invocation/g' tools/libxl/libxl_internal.h
> 18. perl -pi -e 's/\blibxl__remus_device_\(/libxl__checkpoint_device_(/g' tools/libxl/libxl_internal.h
> 
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>

Reviewed-Lightly-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
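
A note on the \b anchors in the commands above: step 10 renames only the
whole word "rds", while step 3 deliberately leaves the right-hand side
open so that names such as libxl__remus_devices_state are caught too. The
sketch below approximates what /\brds\b/ matches, assuming the word
itself starts and ends with a word character. The function names are made
up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Perl's \w class: letters, digits and underscore. */
static int is_word_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/*
 * Count occurrences of 'word' in 'text' that have a non-word character
 * (or the string boundary) on both sides, roughly what /\bword\b/ matches.
 */
static int count_whole_word(const char *text, const char *word)
{
    size_t len = strlen(word);
    const char *p = text;
    int n = 0;

    assert(len > 0);
    while ((p = strstr(p, word)) != NULL) {
        int left_ok  = (p == text) || !is_word_char(p[-1]);
        int right_ok = !is_word_char(p[len]); /* '\0' is not a word char */

        if (left_ok && right_ok)
            n++;
        p++;                                  /* keep scanning after this hit */
    }
    return n;
}
```

So "dev->rds = rds;" has two hits, while "words" and "rds_foo" have none,
which is why step 10 does not touch unrelated identifiers.
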
> ---
>  tools/libxl/Makefile                               |   2 +-
>  ...xl_remus_device.c => libxl_checkpoint_device.c} | 198 ++++++++++-----------
>  tools/libxl/libxl_internal.h                       | 112 ++++++------
>  tools/libxl/libxl_netbuffer.c                      | 108 +++++------
>  tools/libxl/libxl_nonetbuffer.c                    |  10 +-
>  tools/libxl/libxl_remus.c                          |  76 ++++----
>  tools/libxl/libxl_remus_disk_drbd.c                |  52 +++---
>  tools/libxl/libxl_types.idl                        |   4 +-
>  8 files changed, 281 insertions(+), 281 deletions(-)
>  rename tools/libxl/{libxl_remus_device.c => libxl_checkpoint_device.c} (52%)
> 
> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
> index b476012..d075a30 100644
> --- a/tools/libxl/Makefile
> +++ b/tools/libxl/Makefile
> @@ -62,7 +62,7 @@ else
>  LIBXL_OBJS-y += libxl_no_convert_callout.o
>  endif
>  
> -LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o
> +LIBXL_OBJS-y += libxl_remus.o libxl_checkpoint_device.o libxl_remus_disk_drbd.o
>  
>  LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
>  LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
> diff --git a/tools/libxl/libxl_remus_device.c b/tools/libxl/libxl_checkpoint_device.c
> similarity index 52%
> rename from tools/libxl/libxl_remus_device.c
> rename to tools/libxl/libxl_checkpoint_device.c
> index a6cb7f6..109cd23 100644
> --- a/tools/libxl/libxl_remus_device.c
> +++ b/tools/libxl/libxl_checkpoint_device.c
> @@ -17,9 +17,9 @@
>  
>  #include "libxl_internal.h"
>  
> -extern const libxl__remus_device_instance_ops remus_device_nic;
> -extern const libxl__remus_device_instance_ops remus_device_drbd_disk;
> -static const libxl__remus_device_instance_ops *remus_ops[] = {
> +extern const libxl__checkpoint_device_instance_ops remus_device_nic;
> +extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
> +static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
>      &remus_device_nic,
>      &remus_device_drbd_disk,
>      NULL,
> @@ -27,18 +27,18 @@ static const libxl__remus_device_instance_ops *remus_ops[] = {
>  
>  /*----- helper functions -----*/
>  
> -static int init_device_subkind(libxl__remus_devices_state *rds)
> +static int init_device_subkind(libxl__checkpoint_devices_state *cds)
>  {
>      /* init device subkind-specific state in the libxl ctx */
>      int rc;
> -    STATE_AO_GC(rds->ao);
> +    STATE_AO_GC(cds->ao);
>  
>      if (libxl__netbuffer_enabled(gc)) {
> -        rc = init_subkind_nic(rds);
> +        rc = init_subkind_nic(cds);
>          if (rc) goto out;
>      }
>  
> -    rc = init_subkind_drbd_disk(rds);
> +    rc = init_subkind_drbd_disk(cds);
>      if (rc) goto out;
>  
>      rc = 0;
> @@ -46,15 +46,15 @@ out:
>      return rc;
>  }
>  
> -static void cleanup_device_subkind(libxl__remus_devices_state *rds)
> +static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
>  {
>      /* cleanup device subkind-specific state in the libxl ctx */
> -    STATE_AO_GC(rds->ao);
> +    STATE_AO_GC(cds->ao);
>  
>      if (libxl__netbuffer_enabled(gc))
> -        cleanup_subkind_nic(rds);
> +        cleanup_subkind_nic(cds);
>  
> -    cleanup_subkind_drbd_disk(rds);
> +    cleanup_subkind_drbd_disk(cds);
>  }
>  
>  /*----- setup() and teardown() -----*/
> @@ -70,103 +70,103 @@ static void devices_teardown_cb(libxl__egc *egc,
>                                  libxl__multidev *multidev,
>                                  int rc);
>  
> -/* remus device setup and teardown */
> +/* checkpoint device setup and teardown */
>  
> -static libxl__remus_device* remus_device_init(libxl__egc *egc,
> -                                              libxl__remus_devices_state *rds,
> +static libxl__checkpoint_device* checkpoint_device_init(libxl__egc *egc,
> +                                              libxl__checkpoint_devices_state *cds,
>                                                libxl__device_kind kind,
>                                                void *libxl_dev)
>  {
> -    libxl__remus_device *dev = NULL;
> +    libxl__checkpoint_device *dev = NULL;
>  
> -    STATE_AO_GC(rds->ao);
> +    STATE_AO_GC(cds->ao);
>      GCNEW(dev);
>      dev->backend_dev = libxl_dev;
>      dev->kind = kind;
> -    dev->rds = rds;
> +    dev->cds = cds;
>  
>      return dev;
>  }
>  
> -static void remus_devices_setup(libxl__egc *egc,
> -                                libxl__remus_devices_state *rds);
> +static void checkpoint_devices_setup(libxl__egc *egc,
> +                                libxl__checkpoint_devices_state *cds);
>  
> -void libxl__remus_devices_setup(libxl__egc *egc, libxl__remus_devices_state *rds)
> +void libxl__checkpoint_devices_setup(libxl__egc *egc, libxl__checkpoint_devices_state *cds)
>  {
>      int i, rc;
>  
> -    STATE_AO_GC(rds->ao);
> +    STATE_AO_GC(cds->ao);
>  
> -    rc = init_device_subkind(rds);
> +    rc = init_device_subkind(cds);
>      if (rc)
>          goto out;
>  
> -    rds->num_devices = 0;
> -    rds->num_nics = 0;
> -    rds->num_disks = 0;
> +    cds->num_devices = 0;
> +    cds->num_nics = 0;
> +    cds->num_disks = 0;
>  
> -    if (rds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VIF))
> -        rds->nics = libxl_device_nic_list(CTX, rds->domid, &rds->num_nics);
> +    if (cds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VIF))
> +        cds->nics = libxl_device_nic_list(CTX, cds->domid, &cds->num_nics);
>  
> -    if (rds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VBD))
> -        rds->disks = libxl_device_disk_list(CTX, rds->domid, &rds->num_disks);
> +    if (cds->device_kind_flags & (1 << LIBXL__DEVICE_KIND_VBD))
> +        cds->disks = libxl_device_disk_list(CTX, cds->domid, &cds->num_disks);
>  
> -    if (rds->num_nics == 0 && rds->num_disks == 0)
> +    if (cds->num_nics == 0 && cds->num_disks == 0)
>          goto out;
>  
> -    GCNEW_ARRAY(rds->devs, rds->num_nics + rds->num_disks);
> +    GCNEW_ARRAY(cds->devs, cds->num_nics + cds->num_disks);
>  
> -    for (i = 0; i < rds->num_nics; i++) {
> -        rds->devs[rds->num_devices++] = remus_device_init(egc, rds,
> +    for (i = 0; i < cds->num_nics; i++) {
> +        cds->devs[cds->num_devices++] = checkpoint_device_init(egc, cds,
>                                                  LIBXL__DEVICE_KIND_VIF,
> -                                                &rds->nics[i]);
> +                                                &cds->nics[i]);
>      }
>  
> -    for (i = 0; i < rds->num_disks; i++) {
> -        rds->devs[rds->num_devices++] = remus_device_init(egc, rds,
> +    for (i = 0; i < cds->num_disks; i++) {
> +        cds->devs[cds->num_devices++] = checkpoint_device_init(egc, cds,
>                                                  LIBXL__DEVICE_KIND_VBD,
> -                                                &rds->disks[i]);
> +                                                &cds->disks[i]);
>      }
>  
> -    remus_devices_setup(egc, rds);
> +    checkpoint_devices_setup(egc, cds);
>  
>      return;
>  
>  out:
> -    rds->callback(egc, rds, rc);
> +    cds->callback(egc, cds, rc);
>  }
>  
> -static void remus_devices_setup(libxl__egc *egc,
> -                                libxl__remus_devices_state *rds)
> +static void checkpoint_devices_setup(libxl__egc *egc,
> +                                libxl__checkpoint_devices_state *cds)
>  {
>      int i, rc;
>  
> -    STATE_AO_GC(rds->ao);
> +    STATE_AO_GC(cds->ao);
>  
> -    libxl__multidev_begin(ao, &rds->multidev);
> -    rds->multidev.callback = all_devices_setup_cb;
> -    for (i = 0; i < rds->num_devices; i++) {
> -        libxl__remus_device *dev = rds->devs[i];
> +    libxl__multidev_begin(ao, &cds->multidev);
> +    cds->multidev.callback = all_devices_setup_cb;
> +    for (i = 0; i < cds->num_devices; i++) {
> +        libxl__checkpoint_device *dev = cds->devs[i];
>          dev->ops_index = -1;
> -        libxl__multidev_prepare_with_aodev(&rds->multidev, &dev->aodev);
> +        libxl__multidev_prepare_with_aodev(&cds->multidev, &dev->aodev);
>  
> -        dev->aodev.rc = ERROR_REMUS_DEVICE_NOT_SUPPORTED;
> +        dev->aodev.rc = ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED;
>          dev->aodev.callback = device_setup_iterate;
>          device_setup_iterate(egc,&dev->aodev);
>      }
>  
>      rc = 0;
> -    libxl__multidev_prepared(egc, &rds->multidev, rc);
> +    libxl__multidev_prepared(egc, &cds->multidev, rc);
>  }
>  
>  
>  static void device_setup_iterate(libxl__egc *egc, libxl__ao_device *aodev)
>  {
> -    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
> +    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
>      EGC_GC;
>  
> -    if (aodev->rc != ERROR_REMUS_DEVICE_NOT_SUPPORTED &&
> -        aodev->rc != ERROR_REMUS_DEVOPS_DOES_NOT_MATCH)
> +    if (aodev->rc != ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED &&
> +        aodev->rc != ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH)
>          /* might be success or disaster */
>          goto out;
>  
> @@ -186,16 +186,16 @@ static void device_setup_iterate(libxl__egc *egc, libxl__ao_device *aodev)
>                  domid = disk->backend_domid;
>                  devid = libxl__device_disk_dev_number(disk->vdev, NULL, NULL);
>              } else {
> -                LOG(ERROR,"device kind not handled by remus: %s",
> +                LOG(ERROR,"device kind not handled by checkpoint: %s",
>                      libxl__device_kind_to_string(dev->kind));
>                  aodev->rc = ERROR_FAIL;
>                  goto out;
>              }
> -            LOG(ERROR,"device not handled by remus"
> +            LOG(ERROR,"device not handled by checkpoint"
>                  " (device=%s:%"PRId32"/%"PRId32")",
>                  libxl__device_kind_to_string(dev->kind),
>                  domid, devid);
> -            aodev->rc = ERROR_REMUS_DEVICE_NOT_SUPPORTED;
> +            aodev->rc = ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED;
>              goto out;
>          }
>      } while (dev->ops->kind != dev->kind);
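
The matching loop is only partly visible in this hunk, so here is a
stand-alone sketch of how I read it: ops_index starts at -1 and the
NULL-terminated ops table is walked until an entry of the right kind
turns up. All types and names below are simplified stand-ins, not the
libxl ones.

```c
#include <assert.h>
#include <stddef.h>

enum kind { KIND_VIF, KIND_VBD, KIND_OTHER };

struct ops {
    enum kind kind;
    const char *name;
};

static const struct ops nic_ops  = { KIND_VIF, "nic" };
static const struct ops drbd_ops = { KIND_VBD, "drbd" };

/* NULL-terminated, in the same spirit as remus_ops[] in the patch. */
static const struct ops *ops_table[] = { &nic_ops, &drbd_ops, NULL };

struct dev {
    enum kind kind;
    int ops_index;              /* -1 before the first attempt */
    const struct ops *ops;
};

/*
 * Advance to the next table entry whose kind matches the device.
 * Returns 0 on a match and -1 once the table is exhausted.
 * Must not be called again after it has returned -1.
 */
static int next_matching_ops(struct dev *dev)
{
    assert(dev);
    do {
        dev->ops = ops_table[++dev->ops_index];
        if (!dev->ops)
            return -1;          /* no (further) ops for this kind */
    } while (dev->ops->kind != dev->kind);
    return 0;
}
```

Because the index is kept in the device, a later call resumes after the
previous candidate instead of starting over.
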
> @@ -216,32 +216,32 @@ static void all_devices_setup_cb(libxl__egc *egc,
>      STATE_AO_GC(multidev->ao);
>  
>      /* Convenience aliases */
> -    libxl__remus_devices_state *const rds =
> -                            CONTAINER_OF(multidev, *rds, multidev);
> +    libxl__checkpoint_devices_state *const cds =
> +                            CONTAINER_OF(multidev, *cds, multidev);
>  
> -    rds->callback(egc, rds, rc);
> +    cds->callback(egc, cds, rc);
>  }
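
For anyone unfamiliar with the CONTAINER_OF pattern used in these
callbacks: the state struct embeds the multidev, so a pointer to the
member can be turned back into a pointer to the enclosing struct. The
macro below is a simplified stand-in that takes a type name; libxl's own
CONTAINER_OF takes a variable instead, and the struct names here are
invented.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in, not libxl's macro: recover the outer struct. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct multidev {
    int preparing;
};

struct devices_state {
    int domid;
    struct multidev multidev;   /* embedded, not a pointer */
};

/* Map a pointer to the embedded member back to its enclosing state. */
static struct devices_state *state_from_multidev(struct multidev *md)
{
    assert(md);
    return container_of(md, struct devices_state, multidev);
}
```

This also shows why a mechanical rename has to cover struct member names
as well as variables: the member name is an argument to the macro.
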
>  
> -void libxl__remus_devices_teardown(libxl__egc *egc,
> -                                   libxl__remus_devices_state *rds)
> +void libxl__checkpoint_devices_teardown(libxl__egc *egc,
> +                                   libxl__checkpoint_devices_state *cds)
>  {
>      int i;
> -    libxl__remus_device *dev;
> +    libxl__checkpoint_device *dev;
>  
> -    STATE_AO_GC(rds->ao);
> +    STATE_AO_GC(cds->ao);
>  
> -    libxl__multidev_begin(ao, &rds->multidev);
> -    rds->multidev.callback = devices_teardown_cb;
> -    for (i = 0; i < rds->num_devices; i++) {
> -        dev = rds->devs[i];
> +    libxl__multidev_begin(ao, &cds->multidev);
> +    cds->multidev.callback = devices_teardown_cb;
> +    for (i = 0; i < cds->num_devices; i++) {
> +        dev = cds->devs[i];
>          if (!dev->ops || !dev->matched)
>              continue;
>  
> -        libxl__multidev_prepare_with_aodev(&rds->multidev, &dev->aodev);
> +        libxl__multidev_prepare_with_aodev(&cds->multidev, &dev->aodev);
>          dev->ops->teardown(egc,dev);
>      }
>  
> -    libxl__multidev_prepared(egc, &rds->multidev, 0);
> +    libxl__multidev_prepared(egc, &cds->multidev, 0);
>  }
>  
>  static void devices_teardown_cb(libxl__egc *egc,
> @@ -253,26 +253,26 @@ static void devices_teardown_cb(libxl__egc *egc,
>      STATE_AO_GC(multidev->ao);
>  
>      /* Convenience aliases */
> -    libxl__remus_devices_state *const rds =
> -                            CONTAINER_OF(multidev, *rds, multidev);
> +    libxl__checkpoint_devices_state *const cds =
> +                            CONTAINER_OF(multidev, *cds, multidev);
>  
>      /* clean nic */
> -    for (i = 0; i < rds->num_nics; i++)
> -        libxl_device_nic_dispose(&rds->nics[i]);
> -    free(rds->nics);
> -    rds->nics = NULL;
> -    rds->num_nics = 0;
> +    for (i = 0; i < cds->num_nics; i++)
> +        libxl_device_nic_dispose(&cds->nics[i]);
> +    free(cds->nics);
> +    cds->nics = NULL;
> +    cds->num_nics = 0;
>  
>      /* clean disk */
> -    for (i = 0; i < rds->num_disks; i++)
> -        libxl_device_disk_dispose(&rds->disks[i]);
> -    free(rds->disks);
> -    rds->disks = NULL;
> -    rds->num_disks = 0;
> +    for (i = 0; i < cds->num_disks; i++)
> +        libxl_device_disk_dispose(&cds->disks[i]);
> +    free(cds->disks);
> +    cds->disks = NULL;
> +    cds->num_disks = 0;
>  
> -    cleanup_device_subkind(rds);
> +    cleanup_device_subkind(cds);
>  
> -    rds->callback(egc, rds, rc);
> +    cds->callback(egc, cds, rc);
>  }
>  
>  /*----- checkpointing APIs -----*/
> @@ -285,33 +285,33 @@ static void devices_checkpoint_cb(libxl__egc *egc,
>  
>  /* API implementations */
>  
> -#define define_remus_checkpoint_api(api)                                \
> -void libxl__remus_devices_##api(libxl__egc *egc,                        \
> -                                libxl__remus_devices_state *rds)        \
> +#define define_checkpoint_api(api)                                \
> +void libxl__checkpoint_devices_##api(libxl__egc *egc,                        \
> +                                libxl__checkpoint_devices_state *cds)        \
>  {                                                                       \
>      int i;                                                              \
> -    libxl__remus_device *dev;                                           \
> +    libxl__checkpoint_device *dev;                                           \
>                                                                          \
> -    STATE_AO_GC(rds->ao);                                               \
> +    STATE_AO_GC(cds->ao);                                               \
>                                                                          \
> -    libxl__multidev_begin(ao, &rds->multidev);                          \
> -    rds->multidev.callback = devices_checkpoint_cb;                     \
> -    for (i = 0; i < rds->num_devices; i++) {                            \
> -        dev = rds->devs[i];                                             \
> +    libxl__multidev_begin(ao, &cds->multidev);                          \
> +    cds->multidev.callback = devices_checkpoint_cb;                     \
> +    for (i = 0; i < cds->num_devices; i++) {                            \
> +        dev = cds->devs[i];                                             \
>          if (!dev->matched || !dev->ops->api)                            \
>              continue;                                                   \
> -        libxl__multidev_prepare_with_aodev(&rds->multidev, &dev->aodev);\
> +        libxl__multidev_prepare_with_aodev(&cds->multidev, &dev->aodev);\
>          dev->ops->api(egc,dev);                                         \
>      }                                                                   \
>                                                                          \
> -    libxl__multidev_prepared(egc, &rds->multidev, 0);                   \
> +    libxl__multidev_prepared(egc, &cds->multidev, 0);                   \
>  }
>  
> -define_remus_checkpoint_api(postsuspend);
> +define_checkpoint_api(postsuspend);
>  
> -define_remus_checkpoint_api(preresume);
> +define_checkpoint_api(preresume);
>  
> -define_remus_checkpoint_api(commit);
> +define_checkpoint_api(commit);
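
The macro above relies on ## token pasting to stamp out one function per
hook. A cut-down sketch of the same technique, with invented names and a
single device instead of a device list:

```c
#include <assert.h>
#include <stddef.h>

struct dev_ops {
    void (*postsuspend)(int *counter);
    void (*commit)(int *counter);
};

/*
 * Generate devices_<api>(): call the hook named 'api' if the ops table
 * provides one, otherwise do nothing. 'api' is used both as a struct
 * member name and, via ##, as part of the function name.
 */
#define define_api(api)                                             \
static void devices_##api(const struct dev_ops *ops, int *counter)  \
{                                                                   \
    assert(ops);                                                    \
    if (!ops->api)          /* hook not implemented: skip */        \
        return;                                                     \
    ops->api(counter);                                              \
}

define_api(postsuspend)
define_api(commit)

/* Example hook: count how many times it was invoked. */
static void bump(int *counter)
{
    (*counter)++;
}

/* Example ops table that implements postsuspend only. */
static const struct dev_ops example_ops = { bump, NULL };
```

Calling devices_postsuspend() and then devices_commit() on example_ops
increments the counter once, since the commit hook is NULL and is skipped.
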
>  
>  static void devices_checkpoint_cb(libxl__egc *egc,
>                                    libxl__multidev *multidev,
> @@ -320,8 +320,8 @@ static void devices_checkpoint_cb(libxl__egc *egc,
>      STATE_AO_GC(multidev->ao);
>  
>      /* Convenience aliases */
> -    libxl__remus_devices_state *const rds =
> -                            CONTAINER_OF(multidev, *rds, multidev);
> +    libxl__checkpoint_devices_state *const cds =
> +                            CONTAINER_OF(multidev, *cds, multidev);
>  
> -    rds->callback(egc, rds, rc);
> +    cds->callback(egc, cds, rc);
>  }
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 99a4acf..7f80ec5 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -2794,9 +2794,9 @@ typedef struct libxl__save_helper_state {
>                        * marshalling and xc callback functions */
>  } libxl__save_helper_state;
>  
> -/*----- remus device related state structure -----*/
> +/*----- checkpoint device related state structure -----*/
>  /*
> - * The abstract Remus device layer exposes a common
> + * The abstract checkpoint device layer exposes a common
>   * set of API to [external] libxl for manipulating devices attached to
>   * a guest protected by Remus. The device layer also exposes a set of
>   * [internal] interfaces that every device type must implement.
> @@ -2804,34 +2804,34 @@ typedef struct libxl__save_helper_state {
>   * The following API are exposed to libxl:
>   *
>   * One-time configuration operations:
> - *  +libxl__remus_devices_setup
> + *  +libxl__checkpoint_devices_setup
>   *    > Enable output buffering for NICs, setup disk replication, etc.
> - *  +libxl__remus_devices_teardown
> + *  +libxl__checkpoint_devices_teardown
>   *    > Disable output buffering and disk replication; teardown any
>   *       associated external setups like qdiscs for NICs.
>   *
>   * Operations executed every checkpoint (in order of invocation):
> - *  +libxl__remus_devices_postsuspend
> - *  +libxl__remus_devices_preresume
> - *  +libxl__remus_devices_commit
> + *  +libxl__checkpoint_devices_postsuspend
> + *  +libxl__checkpoint_devices_preresume
> + *  +libxl__checkpoint_devices_commit
>   *
>   * Each device type needs to implement the interfaces specified in
> - * the libxl__remus_device_instance_ops if it wishes to support Remus.
> + * the libxl__checkpoint_device_instance_ops if it wishes to support Remus.
>   *
> - * The high-level control flow through the Remus device layer is shown below:
> + * The high-level control flow through the checkpoint device layer is shown below:
>   *
>   * xl remus
>   *  |->  libxl_domain_remus_start
> - *    |-> libxl__remus_devices_setup
> - *      |-> Per-checkpoint libxl__remus_devices_[postsuspend,preresume,commit]
> + *    |-> libxl__checkpoint_devices_setup
> + *      |-> Per-checkpoint libxl__checkpoint_devices_[postsuspend,preresume,commit]
>   *        ...
>   *        |-> On backup failure, network error or other internal errors:
> - *            libxl__remus_devices_teardown
> + *            libxl__checkpoint_devices_teardown
>   */
>  
> -typedef struct libxl__remus_device libxl__remus_device;
> -typedef struct libxl__remus_devices_state libxl__remus_devices_state;
> -typedef struct libxl__remus_device_instance_ops libxl__remus_device_instance_ops;
> +typedef struct libxl__checkpoint_device libxl__checkpoint_device;
> +typedef struct libxl__checkpoint_devices_state libxl__checkpoint_devices_state;
> +typedef struct libxl__checkpoint_device_instance_ops libxl__checkpoint_device_instance_ops;
>  
>  /*
>   * Interfaces to be implemented by every device subkind that wishes to
> @@ -2841,7 +2841,7 @@ typedef struct libxl__remus_device_instance_ops libxl__remus_device_instance_ops
>   * synchronous and call dev->aodev.callback directly (as the last
>   * thing they do).
>   */
> -struct libxl__remus_device_instance_ops {
> +struct libxl__checkpoint_device_instance_ops {
>      /* the device kind this ops belongs to... */
>      libxl__device_kind kind;
>  
> @@ -2852,12 +2852,12 @@ struct libxl__remus_device_instance_ops {
>       * Asynchronous.
>       */
>  
> -    void (*postsuspend)(libxl__egc *egc, libxl__remus_device *dev);
> -    void (*preresume)(libxl__egc *egc, libxl__remus_device *dev);
> -    void (*commit)(libxl__egc *egc, libxl__remus_device *dev);
> +    void (*postsuspend)(libxl__egc *egc, libxl__checkpoint_device *dev);
> +    void (*preresume)(libxl__egc *egc, libxl__checkpoint_device *dev);
> +    void (*commit)(libxl__egc *egc, libxl__checkpoint_device *dev);
>  
>      /*
> -     * setup() and teardown() are refer to the actual remus device.
> +     * setup() and teardown() are refer to the actual checkpoint device.
>       * Asynchronous.
>       * teardown is called even if setup fails.
>       */
> @@ -2866,45 +2866,45 @@ struct libxl__remus_device_instance_ops {
>       * device. If matched, the device will then be managed with this set of
>       * subkind operations.
>       * Yields 0 if the device successfully set up.
> -     * REMUS_DEVOPS_DOES_NOT_MATCH if the ops does not match the device.
> +     * CHECKPOINT_DEVOPS_DOES_NOT_MATCH if the ops does not match the device.
>       * any other rc indicates failure.
>       */
> -    void (*setup)(libxl__egc *egc, libxl__remus_device *dev);
> -    void (*teardown)(libxl__egc *egc, libxl__remus_device *dev);
> +    void (*setup)(libxl__egc *egc, libxl__checkpoint_device *dev);
> +    void (*teardown)(libxl__egc *egc, libxl__checkpoint_device *dev);
>  };
>  
> -int init_subkind_nic(libxl__remus_devices_state *rds);
> -void cleanup_subkind_nic(libxl__remus_devices_state *rds);
> -int init_subkind_drbd_disk(libxl__remus_devices_state *rds);
> -void cleanup_subkind_drbd_disk(libxl__remus_devices_state *rds);
> +int init_subkind_nic(libxl__checkpoint_devices_state *cds);
> +void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds);
> +int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
> +void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
>  
> -typedef void libxl__remus_callback(libxl__egc *,
> -                                   libxl__remus_devices_state *, int rc);
> +typedef void libxl__checkpoint_callback(libxl__egc *,
> +                                   libxl__checkpoint_devices_state *, int rc);
>  
>  /*
> - * State associated with a remus invocation, including parameters
> - * passed to the remus abstract device layer by the remus
> + * State associated with a checkpoint invocation, including parameters
> + * passed to the checkpoint abstract device layer by the remus
>   * save/restore machinery.
>   */
> -struct libxl__remus_devices_state {
> -    /*---- must be set by caller of libxl__remus_device_(setup|teardown) ----*/
> +struct libxl__checkpoint_devices_state {
> +    /*---- must be set by caller of libxl__checkpoint_device_(setup|teardown) ----*/
>  
>      libxl__ao *ao;
>      uint32_t domid;
> -    libxl__remus_callback *callback;
> +    libxl__checkpoint_callback *callback;
>      int device_kind_flags;
>  
>      /*----- private for abstract layer only -----*/
>  
>      int num_devices;
>      /*
> -     * this array is allocated before setup the remus devices by the
> -     * remus abstract layer.
> -     * devs may be NULL, means there's no remus devices that has been set up.
> +     * this array is allocated before setup the checkpoint devices by the
> +     * checkpoint abstract layer.
> +     * devs may be NULL, means there's no checkpoint devices that has been set up.
>       * the size of this array is 'num_devices', which is the total number
>       * of libxl nic devices and disk devices(num_nics + num_disks).
>       */
> -    libxl__remus_device **devs;
> +    libxl__checkpoint_device **devs;
>  
>      libxl_device_nic *nics;
>      int num_nics;
> @@ -2926,20 +2926,20 @@ struct libxl__remus_devices_state {
>  
>  /*
>   * Information about a single device being handled by remus.
> - * Allocated by the remus abstract layer.
> + * Allocated by the checkpoint abstract layer.
>   */
> -struct libxl__remus_device {
> +struct libxl__checkpoint_device {
>      /*----- shared between abstract and concrete layers -----*/
>      /*
>       * if this is true, that means the subkind ops match the device
>       */
>      bool matched;
>  
> -    /*----- set by remus device abstruct layer -----*/
> -    /* libxl__device_* which this remus device related to */
> +    /*----- set by checkpoint device abstruct layer -----*/
> +    /* libxl__device_* which this checkpoint device related to */
>      const void *backend_dev;
>      libxl__device_kind kind;
> -    libxl__remus_devices_state *rds;
> +    libxl__checkpoint_devices_state *cds;
>      libxl__ao_device aodev;
>  
>      /*----- private for abstract layer only -----*/
> @@ -2950,7 +2950,7 @@ struct libxl__remus_device {
>       * individual devices.
>       */
>      int ops_index;
> -    const libxl__remus_device_instance_ops *ops;
> +    const libxl__checkpoint_device_instance_ops *ops;
>  
>      /*----- private for concrete (device-specific) layer -----*/
>  
> @@ -2958,17 +2958,17 @@ struct libxl__remus_device {
>      void *concrete_data;
>  };
>  
> -/* the following 5 APIs are async ops, call rds->callback when done */
> -_hidden void libxl__remus_devices_setup(libxl__egc *egc,
> -                                        libxl__remus_devices_state *rds);
> -_hidden void libxl__remus_devices_teardown(libxl__egc *egc,
> -                                           libxl__remus_devices_state *rds);
> -_hidden void libxl__remus_devices_postsuspend(libxl__egc *egc,
> -                                              libxl__remus_devices_state *rds);
> -_hidden void libxl__remus_devices_preresume(libxl__egc *egc,
> -                                            libxl__remus_devices_state *rds);
> -_hidden void libxl__remus_devices_commit(libxl__egc *egc,
> -                                         libxl__remus_devices_state *rds);
> +/* the following 5 APIs are async ops, call cds->callback when done */
> +_hidden void libxl__checkpoint_devices_setup(libxl__egc *egc,
> +                                        libxl__checkpoint_devices_state *cds);
> +_hidden void libxl__checkpoint_devices_teardown(libxl__egc *egc,
> +                                           libxl__checkpoint_devices_state *cds);
> +_hidden void libxl__checkpoint_devices_postsuspend(libxl__egc *egc,
> +                                              libxl__checkpoint_devices_state *cds);
> +_hidden void libxl__checkpoint_devices_preresume(libxl__egc *egc,
> +                                            libxl__checkpoint_devices_state *cds);
> +_hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
> +                                         libxl__checkpoint_devices_state *cds);
>  _hidden int libxl__netbuffer_enabled(libxl__gc *gc);
>  
>  /*----- Legacy conversion helper -----*/
> @@ -3127,7 +3127,7 @@ struct libxl__domain_save_state {
>      int hvm;
>      int xcflags;
>      libxl__domain_suspend_state dsps;
> -    libxl__remus_devices_state rds;
> +    libxl__checkpoint_devices_state cds;
>      libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
>      int interval; /* checkpoint interval (for Remus) */
>      libxl__stream_write_state sws;
> diff --git a/tools/libxl/libxl_netbuffer.c b/tools/libxl/libxl_netbuffer.c
> index c245a4e..33c2a42 100644
> --- a/tools/libxl/libxl_netbuffer.c
> +++ b/tools/libxl/libxl_netbuffer.c
> @@ -38,21 +38,21 @@ int libxl__netbuffer_enabled(libxl__gc *gc)
>      return 1;
>  }
>  
> -int init_subkind_nic(libxl__remus_devices_state *rds)
> +int init_subkind_nic(libxl__checkpoint_devices_state *cds)
>  {
>      int rc, ret;
> -    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
> +    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
>  
> -    STATE_AO_GC(rds->ao);
> +    STATE_AO_GC(cds->ao);
>  
> -    rds->nlsock = nl_socket_alloc();
> -    if (!rds->nlsock) {
> +    cds->nlsock = nl_socket_alloc();
> +    if (!cds->nlsock) {
>          LOG(ERROR, "cannot allocate nl socket");
>          rc = ERROR_FAIL;
>          goto out;
>      }
>  
> -    ret = nl_connect(rds->nlsock, NETLINK_ROUTE);
> +    ret = nl_connect(cds->nlsock, NETLINK_ROUTE);
>      if (ret) {
>          LOG(ERROR, "failed to open netlink socket: %s",
>              nl_geterror(ret));
> @@ -61,7 +61,7 @@ int init_subkind_nic(libxl__remus_devices_state *rds)
>      }
>  
>      /* get list of all qdiscs installed on network devs. */
> -    ret = rtnl_qdisc_alloc_cache(rds->nlsock, &rds->qdisc_cache);
> +    ret = rtnl_qdisc_alloc_cache(cds->nlsock, &cds->qdisc_cache);
>      if (ret) {
>          LOG(ERROR, "failed to allocate qdisc cache: %s",
>              nl_geterror(ret));
> @@ -70,9 +70,9 @@ int init_subkind_nic(libxl__remus_devices_state *rds)
>      }
>  
>      if (dss->remus->netbufscript) {
> -        rds->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
> +        cds->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
>      } else {
> -        rds->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
> +        cds->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
>                                        libxl__xen_script_dir_path());
>      }
>  
> @@ -82,22 +82,22 @@ out:
>      return rc;
>  }
>  
> -void cleanup_subkind_nic(libxl__remus_devices_state *rds)
> +void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds)
>  {
> -    STATE_AO_GC(rds->ao);
> +    STATE_AO_GC(cds->ao);
>  
>      /* free qdisc cache */
> -    if (rds->qdisc_cache) {
> -        nl_cache_clear(rds->qdisc_cache);
> -        nl_cache_free(rds->qdisc_cache);
> -        rds->qdisc_cache = NULL;
> +    if (cds->qdisc_cache) {
> +        nl_cache_clear(cds->qdisc_cache);
> +        nl_cache_free(cds->qdisc_cache);
> +        cds->qdisc_cache = NULL;
>      }
>  
>      /* close & free nlsock */
> -    if (rds->nlsock) {
> -        nl_close(rds->nlsock);
> -        nl_socket_free(rds->nlsock);
> -        rds->nlsock = NULL;
> +    if (cds->nlsock) {
> +        nl_close(cds->nlsock);
> +        nl_socket_free(cds->nlsock);
> +        cds->nlsock = NULL;
>      }
>  }
>  
> @@ -111,17 +111,17 @@ void cleanup_subkind_nic(libxl__remus_devices_state *rds)
>   * it must ONLY be used for remus because if driver domains
>   * were in use it would constitute a security vulnerability.
>   */
> -static const char *get_vifname(libxl__remus_device *dev,
> +static const char *get_vifname(libxl__checkpoint_device *dev,
>                                 const libxl_device_nic *nic)
>  {
>      const char *vifname = NULL;
>      const char *path;
>      int rc;
>  
> -    STATE_AO_GC(dev->rds->ao);
> +    STATE_AO_GC(dev->cds->ao);
>  
>      /* Convenience aliases */
> -    const uint32_t domid = dev->rds->domid;
> +    const uint32_t domid = dev->cds->domid;
>  
>      path = GCSPRINTF("%s/backend/vif/%d/%d/vifname",
>                       libxl__xs_get_dompath(gc, 0), domid, nic->devid);
> @@ -144,19 +144,19 @@ static void free_qdisc(libxl__remus_device_nic *remus_nic)
>      remus_nic->qdisc = NULL;
>  }
>  
> -static int init_qdisc(libxl__remus_devices_state *rds,
> +static int init_qdisc(libxl__checkpoint_devices_state *cds,
>                        libxl__remus_device_nic *remus_nic)
>  {
>      int rc, ret, ifindex;
>      struct rtnl_link *ifb = NULL;
>      struct rtnl_qdisc *qdisc = NULL;
>  
> -    STATE_AO_GC(rds->ao);
> +    STATE_AO_GC(cds->ao);
>  
>      /* Now that we have brought up REMUS_IFB device with plug qdisc for
>       * this vif, so we need to refill the qdisc cache.
>       */
> -    ret = nl_cache_refill(rds->nlsock, rds->qdisc_cache);
> +    ret = nl_cache_refill(cds->nlsock, cds->qdisc_cache);
>      if (ret) {
>          LOG(ERROR, "cannot refill qdisc cache: %s", nl_geterror(ret));
>          rc = ERROR_FAIL;
> @@ -164,7 +164,7 @@ static int init_qdisc(libxl__remus_devices_state *rds,
>      }
>  
>      /* get a handle to the REMUS_IFB interface */
> -    ret = rtnl_link_get_kernel(rds->nlsock, 0, remus_nic->ifb, &ifb);
> +    ret = rtnl_link_get_kernel(cds->nlsock, 0, remus_nic->ifb, &ifb);
>      if (ret) {
>          LOG(ERROR, "cannot obtain handle for %s: %s", remus_nic->ifb,
>              nl_geterror(ret));
> @@ -187,7 +187,7 @@ static int init_qdisc(libxl__remus_devices_state *rds,
>       * There is no need to explicitly free this qdisc as its just a
>       * reference from the qdisc cache we allocated earlier.
>       */
> -    qdisc = rtnl_qdisc_get_by_parent(rds->qdisc_cache, ifindex, TC_H_ROOT);
> +    qdisc = rtnl_qdisc_get_by_parent(cds->qdisc_cache, ifindex, TC_H_ROOT);
>      if (qdisc) {
>          const char *tc_kind = rtnl_tc_get_kind(TC_CAST(qdisc));
>          /* Sanity check: Ensure that the root qdisc is a plug qdisc. */
> @@ -231,19 +231,19 @@ static void netbuf_teardown_script_cb(libxl__egc *egc,
>   * $REMUS_IFB (for teardown)
>   * setup/teardown as command line arg.
>   */
> -static void setup_async_exec(libxl__remus_device *dev, char *op)
> +static void setup_async_exec(libxl__checkpoint_device *dev, char *op)
>  {
>      int arraysize, nr = 0;
>      char **env = NULL, **args = NULL;
>      libxl__remus_device_nic *remus_nic = dev->concrete_data;
> -    libxl__remus_devices_state *rds = dev->rds;
> +    libxl__checkpoint_devices_state *cds = dev->cds;
>      libxl__async_exec_state *aes = &dev->aodev.aes;
>  
> -    STATE_AO_GC(rds->ao);
> +    STATE_AO_GC(cds->ao);
>  
>      /* Convenience aliases */
> -    char *const script = libxl__strdup(gc, rds->netbufscript);
> -    const uint32_t domid = rds->domid;
> +    char *const script = libxl__strdup(gc, cds->netbufscript);
> +    const uint32_t domid = cds->domid;
>      const int dev_id = remus_nic->devid;
>      const char *const vif = remus_nic->vif;
>      const char *const ifb = remus_nic->ifb;
> @@ -269,7 +269,7 @@ static void setup_async_exec(libxl__remus_device *dev, char *op)
>      args[nr++] = NULL;
>      assert(nr == arraysize);
>  
> -    aes->ao = dev->rds->ao;
> +    aes->ao = dev->cds->ao;
>      aes->what = GCSPRINTF("%s %s", args[0], args[1]);
>      aes->env = env;
>      aes->args = args;
> @@ -286,13 +286,13 @@ static void setup_async_exec(libxl__remus_device *dev, char *op)
>  
>  /* setup() and teardown() */
>  
> -static void nic_setup(libxl__egc *egc, libxl__remus_device *dev)
> +static void nic_setup(libxl__egc *egc, libxl__checkpoint_device *dev)
>  {
>      int rc;
>      libxl__remus_device_nic *remus_nic;
>      const libxl_device_nic *nic = dev->backend_dev;
>  
> -    STATE_AO_GC(dev->rds->ao);
> +    STATE_AO_GC(dev->cds->ao);
>  
>      /*
>       * thers's no subkind of nic devices, so nic ops is always matched
> @@ -330,15 +330,15 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
>                                     int rc, int status)
>  {
>      libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
> -    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
> +    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
>      libxl__remus_device_nic *remus_nic = dev->concrete_data;
> -    libxl__remus_devices_state *rds = dev->rds;
> +    libxl__checkpoint_devices_state *cds = dev->cds;
>      const char *out_path_base, *hotplug_error = NULL;
>  
> -    STATE_AO_GC(rds->ao);
> +    STATE_AO_GC(cds->ao);
>  
>      /* Convenience aliases */
> -    const uint32_t domid = rds->domid;
> +    const uint32_t domid = cds->domid;
>      const int devid = remus_nic->devid;
>      const char *const vif = remus_nic->vif;
>      const char **const ifb = &remus_nic->ifb;
> @@ -377,7 +377,7 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
>  
>      if (hotplug_error) {
>          LOG(ERROR, "netbuf script %s setup failed for vif %s: %s",
> -            rds->netbufscript, vif, hotplug_error);
> +            cds->netbufscript, vif, hotplug_error);
>          rc = ERROR_FAIL;
>          goto out;
>      }
> @@ -388,17 +388,17 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
>      }
>  
>      LOG(DEBUG, "%s will buffer packets from vif %s", *ifb, vif);
> -    rc = init_qdisc(rds, remus_nic);
> +    rc = init_qdisc(cds, remus_nic);
>  
>  out:
>      aodev->rc = rc;
>      aodev->callback(egc, aodev);
>  }
>  
> -static void nic_teardown(libxl__egc *egc, libxl__remus_device *dev)
> +static void nic_teardown(libxl__egc *egc, libxl__checkpoint_device *dev)
>  {
>      int rc;
> -    STATE_AO_GC(dev->rds->ao);
> +    STATE_AO_GC(dev->cds->ao);
>  
>      setup_async_exec(dev, "teardown");
>  
> @@ -418,7 +418,7 @@ static void netbuf_teardown_script_cb(libxl__egc *egc,
>                                        int rc, int status)
>  {
>      libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
> -    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
> +    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
>      libxl__remus_device_nic *remus_nic = dev->concrete_data;
>  
>      if (status && !rc)
> @@ -441,12 +441,12 @@ enum {
>  /* API implementations */
>  
>  static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
> -                           libxl__remus_devices_state *rds,
> +                           libxl__checkpoint_devices_state *cds,
>                             int buffer_op)
>  {
>      int rc, ret;
>  
> -    STATE_AO_GC(rds->ao);
> +    STATE_AO_GC(cds->ao);
>  
>      if (buffer_op == tc_buffer_start)
>          ret = rtnl_qdisc_plug_buffer(remus_nic->qdisc);
> @@ -458,7 +458,7 @@ static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
>          goto out;
>      }
>  
> -    ret = rtnl_qdisc_add(rds->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
> +    ret = rtnl_qdisc_add(cds->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
>      if (ret) {
>          rc = ERROR_FAIL;
>          goto out;
> @@ -475,33 +475,33 @@ out:
>      return rc;
>  }
>  
> -static void nic_postsuspend(libxl__egc *egc, libxl__remus_device *dev)
> +static void nic_postsuspend(libxl__egc *egc, libxl__checkpoint_device *dev)
>  {
>      int rc;
>      libxl__remus_device_nic *remus_nic = dev->concrete_data;
>  
> -    STATE_AO_GC(dev->rds->ao);
> +    STATE_AO_GC(dev->cds->ao);
>  
> -    rc = remus_netbuf_op(remus_nic, dev->rds, tc_buffer_start);
> +    rc = remus_netbuf_op(remus_nic, dev->cds, tc_buffer_start);
>  
>      dev->aodev.rc = rc;
>      dev->aodev.callback(egc, &dev->aodev);
>  }
>  
> -static void nic_commit(libxl__egc *egc, libxl__remus_device *dev)
> +static void nic_commit(libxl__egc *egc, libxl__checkpoint_device *dev)
>  {
>      int rc;
>      libxl__remus_device_nic *remus_nic = dev->concrete_data;
>  
> -    STATE_AO_GC(dev->rds->ao);
> +    STATE_AO_GC(dev->cds->ao);
>  
> -    rc = remus_netbuf_op(remus_nic, dev->rds, tc_buffer_release);
> +    rc = remus_netbuf_op(remus_nic, dev->cds, tc_buffer_release);
>  
>      dev->aodev.rc = rc;
>      dev->aodev.callback(egc, &dev->aodev);
>  }
>  
> -const libxl__remus_device_instance_ops remus_device_nic = {
> +const libxl__checkpoint_device_instance_ops remus_device_nic = {
>      .kind = LIBXL__DEVICE_KIND_VIF,
>      .setup = nic_setup,
>      .teardown = nic_teardown,
> diff --git a/tools/libxl/libxl_nonetbuffer.c b/tools/libxl/libxl_nonetbuffer.c
> index 3c659c2..4b68152 100644
> --- a/tools/libxl/libxl_nonetbuffer.c
> +++ b/tools/libxl/libxl_nonetbuffer.c
> @@ -22,25 +22,25 @@ int libxl__netbuffer_enabled(libxl__gc *gc)
>      return 0;
>  }
>  
> -int init_subkind_nic(libxl__remus_devices_state *rds)
> +int init_subkind_nic(libxl__checkpoint_devices_state *cds)
>  {
>      return 0;
>  }
>  
> -void cleanup_subkind_nic(libxl__remus_devices_state *rds)
> +void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds)
>  {
>      return;
>  }
>  
> -static void nic_setup(libxl__egc *egc, libxl__remus_device *dev)
> +static void nic_setup(libxl__egc *egc, libxl__checkpoint_device *dev)
>  {
> -    STATE_AO_GC(dev->rds->ao);
> +    STATE_AO_GC(dev->cds->ao);
>  
>      dev->aodev.rc = ERROR_FAIL;
>      dev->aodev.callback(egc, &dev->aodev);
>  }
>  
> -const libxl__remus_device_instance_ops remus_device_nic = {
> +const libxl__checkpoint_device_instance_ops remus_device_nic = {
>      .kind = LIBXL__DEVICE_KIND_VIF,
>      .setup = nic_setup,
>  };
> diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
> index fae2120..d088dad 100644
> --- a/tools/libxl/libxl_remus.c
> +++ b/tools/libxl/libxl_remus.c
> @@ -21,9 +21,9 @@
>  /*-------------------- Remus setup and teardown ---------------------*/
>  
>  static void remus_setup_done(libxl__egc *egc,
> -                             libxl__remus_devices_state *rds, int rc);
> +                             libxl__checkpoint_devices_state *cds, int rc);
>  static void remus_setup_failed(libxl__egc *egc,
> -                               libxl__remus_devices_state *rds, int rc);
> +                               libxl__checkpoint_devices_state *cds, int rc);
>  static void remus_checkpoint_stream_written(
>      libxl__egc *egc, libxl__stream_write_state *sws, int rc);
>  
> @@ -31,7 +31,7 @@ void libxl__remus_setup(libxl__egc *egc,
>                          libxl__domain_save_state *dss)
>  {
>      /* Convenience aliases */
> -    libxl__remus_devices_state *const rds = &dss->rds;
> +    libxl__checkpoint_devices_state *const cds = &dss->cds;
>      const libxl_domain_remus_info *const info = dss->remus;
>  
>      STATE_AO_GC(dss->ao);
> @@ -41,19 +41,19 @@ void libxl__remus_setup(libxl__egc *egc,
>              LOG(ERROR, "Remus: No support for network buffering");
>              goto out;
>          }
> -        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
> +        cds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
>      }
>  
>      if (libxl_defbool_val(info->diskbuf))
> -        rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
> +        cds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
>  
> -    rds->ao = ao;
> -    rds->domid = dss->domid;
> -    rds->callback = remus_setup_done;
> +    cds->ao = ao;
> +    cds->domid = dss->domid;
> +    cds->callback = remus_setup_done;
>  
>      dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
>  
> -    libxl__remus_devices_setup(egc, rds);
> +    libxl__checkpoint_devices_setup(egc, cds);
>      return;
>  
>  out:
> @@ -61,9 +61,9 @@ out:
>  }
>  
>  static void remus_setup_done(libxl__egc *egc,
> -                             libxl__remus_devices_state *rds, int rc)
> +                             libxl__checkpoint_devices_state *cds, int rc)
>  {
> -    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
> +    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
>      STATE_AO_GC(dss->ao);
>  
>      if (!rc) {
> @@ -73,14 +73,14 @@ static void remus_setup_done(libxl__egc *egc,
>  
>      LOG(ERROR, "Remus: failed to setup device for guest with domid %u, rc %d",
>          dss->domid, rc);
> -    rds->callback = remus_setup_failed;
> -    libxl__remus_devices_teardown(egc, rds);
> +    cds->callback = remus_setup_failed;
> +    libxl__checkpoint_devices_teardown(egc, cds);
>  }
>  
>  static void remus_setup_failed(libxl__egc *egc,
> -                               libxl__remus_devices_state *rds, int rc)
> +                               libxl__checkpoint_devices_state *cds, int rc)
>  {
> -    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
> +    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
>      STATE_AO_GC(dss->ao);
>  
>      if (rc)
> @@ -91,7 +91,7 @@ static void remus_setup_failed(libxl__egc *egc,
>  }
>  
>  static void remus_teardown_done(libxl__egc *egc,
> -                                libxl__remus_devices_state *rds,
> +                                libxl__checkpoint_devices_state *cds,
>                                  int rc);
>  void libxl__remus_teardown(libxl__egc *egc,
>                             libxl__domain_save_state *dss,
> @@ -101,15 +101,15 @@ void libxl__remus_teardown(libxl__egc *egc,
>  
>      LOG(WARN, "Remus: Domain suspend terminated with rc %d,"
>          " teardown Remus devices...", rc);
> -    dss->rds.callback = remus_teardown_done;
> -    libxl__remus_devices_teardown(egc, &dss->rds);
> +    dss->cds.callback = remus_teardown_done;
> +    libxl__checkpoint_devices_teardown(egc, &dss->cds);
>  }
>  
>  static void remus_teardown_done(libxl__egc *egc,
> -                                libxl__remus_devices_state *rds,
> +                                libxl__checkpoint_devices_state *cds,
>                                  int rc)
>  {
> -    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
> +    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
>      STATE_AO_GC(dss->ao);
>  
>      if (rc)
> @@ -124,10 +124,10 @@ static void remus_teardown_done(libxl__egc *egc,
>  static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
>                                  libxl__domain_suspend_state *dsps, int ok);
>  static void remus_devices_postsuspend_cb(libxl__egc *egc,
> -                                         libxl__remus_devices_state *rds,
> +                                         libxl__checkpoint_devices_state *cds,
>                                           int rc);
>  static void remus_devices_preresume_cb(libxl__egc *egc,
> -                                       libxl__remus_devices_state *rds,
> +                                       libxl__checkpoint_devices_state *cds,
>                                         int rc);
>  
>  void libxl__remus_domain_suspend_callback(void *data)
> @@ -149,9 +149,9 @@ static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
>      if (rc)
>          goto out;
>  
> -    libxl__remus_devices_state *const rds = &dss->rds;
> -    rds->callback = remus_devices_postsuspend_cb;
> -    libxl__remus_devices_postsuspend(egc, rds);
> +    libxl__checkpoint_devices_state *const cds = &dss->cds;
> +    cds->callback = remus_devices_postsuspend_cb;
> +    libxl__checkpoint_devices_postsuspend(egc, cds);
>      return;
>  
>  out:
> @@ -160,10 +160,10 @@ out:
>  }
>  
>  static void remus_devices_postsuspend_cb(libxl__egc *egc,
> -                                         libxl__remus_devices_state *rds,
> +                                         libxl__checkpoint_devices_state *cds,
>                                           int rc)
>  {
> -    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
> +    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
>  
>      if (rc)
>          goto out;
> @@ -183,16 +183,16 @@ void libxl__remus_domain_resume_callback(void *data)
>      libxl__domain_save_state *dss = shs->caller_state;
>      STATE_AO_GC(dss->ao);
>  
> -    libxl__remus_devices_state *const rds = &dss->rds;
> -    rds->callback = remus_devices_preresume_cb;
> -    libxl__remus_devices_preresume(egc, rds);
> +    libxl__checkpoint_devices_state *const cds = &dss->cds;
> +    cds->callback = remus_devices_preresume_cb;
> +    libxl__checkpoint_devices_preresume(egc, cds);
>  }
>  
>  static void remus_devices_preresume_cb(libxl__egc *egc,
> -                                       libxl__remus_devices_state *rds,
> +                                       libxl__checkpoint_devices_state *cds,
>                                         int rc)
>  {
> -    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
> +    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
>      STATE_AO_GC(dss->ao);
>  
>      if (rc)
> @@ -214,7 +214,7 @@ out:
>  /*----- remus asynchronous checkpoint callback -----*/
>  
>  static void remus_devices_commit_cb(libxl__egc *egc,
> -                                    libxl__remus_devices_state *rds,
> +                                    libxl__checkpoint_devices_state *cds,
>                                      int rc);
>  static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
>                                    const struct timeval *requested_abs,
> @@ -236,7 +236,7 @@ static void remus_checkpoint_stream_written(
>      libxl__domain_save_state *dss = CONTAINER_OF(sws, *dss, sws);
>  
>      /* Convenience aliases */
> -    libxl__remus_devices_state *const rds = &dss->rds;
> +    libxl__checkpoint_devices_state *const cds = &dss->cds;
>  
>      STATE_AO_GC(dss->ao);
>  
> @@ -245,8 +245,8 @@ static void remus_checkpoint_stream_written(
>          goto out;
>      }
>  
> -    rds->callback = remus_devices_commit_cb;
> -    libxl__remus_devices_commit(egc, rds);
> +    cds->callback = remus_devices_commit_cb;
> +    libxl__checkpoint_devices_commit(egc, cds);
>  
>      return;
>  
> @@ -255,10 +255,10 @@ out:
>  }
>  
>  static void remus_devices_commit_cb(libxl__egc *egc,
> -                                    libxl__remus_devices_state *rds,
> +                                    libxl__checkpoint_devices_state *cds,
>                                      int rc)
>  {
> -    libxl__domain_save_state *dss = CONTAINER_OF(rds, *dss, rds);
> +    libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
>  
>      STATE_AO_GC(dss->ao);
>  
> diff --git a/tools/libxl/libxl_remus_disk_drbd.c b/tools/libxl/libxl_remus_disk_drbd.c
> index 1c3a88a..4dddc58 100644
> --- a/tools/libxl/libxl_remus_disk_drbd.c
> +++ b/tools/libxl/libxl_remus_disk_drbd.c
> @@ -26,30 +26,30 @@ typedef struct libxl__remus_drbd_disk {
>      int ackwait;
>  } libxl__remus_drbd_disk;
>  
> -int init_subkind_drbd_disk(libxl__remus_devices_state *rds)
> +int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds)
>  {
> -    STATE_AO_GC(rds->ao);
> +    STATE_AO_GC(cds->ao);
>  
> -    rds->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
> +    cds->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
>                                         libxl__xen_script_dir_path());
>  
>      return 0;
>  }
>  
> -void cleanup_subkind_drbd_disk(libxl__remus_devices_state *rds)
> +void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds)
>  {
>      return;
>  }
>  
>  /*----- helper functions, for async calls -----*/
>  static void drbd_async_call(libxl__egc *egc,
> -                            libxl__remus_device *dev,
> -                            void func(libxl__remus_device *),
> +                            libxl__checkpoint_device *dev,
> +                            void func(libxl__checkpoint_device *),
>                              libxl__ev_child_callback callback)
>  {
>      int pid, rc;
>      libxl__ao_device *aodev = &dev->aodev;
> -    STATE_AO_GC(dev->rds->ao);
> +    STATE_AO_GC(dev->cds->ao);
>  
>      /* Fork and call */
>      pid = libxl__ev_child_fork(gc, &aodev->child, callback);
> @@ -82,21 +82,21 @@ static void match_async_exec_cb(libxl__egc *egc,
>  
>  /* implementations */
>  
> -static void match_async_exec(libxl__egc *egc, libxl__remus_device *dev);
> +static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev);
>  
> -static void drbd_setup(libxl__egc *egc, libxl__remus_device *dev)
> +static void drbd_setup(libxl__egc *egc, libxl__checkpoint_device *dev)
>  {
> -    STATE_AO_GC(dev->rds->ao);
> +    STATE_AO_GC(dev->cds->ao);
>  
>      match_async_exec(egc, dev);
>  }
>  
> -static void match_async_exec(libxl__egc *egc, libxl__remus_device *dev)
> +static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev)
>  {
>      int arraysize, nr = 0, rc;
>      const libxl_device_disk *disk = dev->backend_dev;
>      libxl__async_exec_state *aes = &dev->aodev.aes;
> -    STATE_AO_GC(dev->rds->ao);
> +    STATE_AO_GC(dev->cds->ao);
>  
>      /* setup env & args */
>      arraysize = 1;
> @@ -107,12 +107,12 @@ static void match_async_exec(libxl__egc *egc, libxl__remus_device *dev)
>      arraysize = 3;
>      nr = 0;
>      GCNEW_ARRAY(aes->args, arraysize);
> -    aes->args[nr++] = dev->rds->drbd_probe_script;
> +    aes->args[nr++] = dev->cds->drbd_probe_script;
>      aes->args[nr++] = disk->pdev_path;
>      aes->args[nr++] = NULL;
>      assert(nr <= arraysize);
>  
> -    aes->ao = dev->rds->ao;
> +    aes->ao = dev->cds->ao;
>      aes->what = GCSPRINTF("%s %s", aes->args[0], aes->args[1]);
>      aes->timeout_ms = LIBXL_HOTPLUG_TIMEOUT * 1000;
>      aes->callback = match_async_exec_cb;
> @@ -136,7 +136,7 @@ static void match_async_exec_cb(libxl__egc *egc,
>                                  int rc, int status)
>  {
>      libxl__ao_device *aodev = CONTAINER_OF(aes, *aodev, aes);
> -    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
> +    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
>      libxl__remus_drbd_disk *drbd_disk;
>      const libxl_device_disk *disk = dev->backend_dev;
>  
> @@ -146,7 +146,7 @@ static void match_async_exec_cb(libxl__egc *egc,
>          goto out;
>  
>      if (status) {
> -        rc = ERROR_REMUS_DEVOPS_DOES_NOT_MATCH;
> +        rc = ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH;
>          /* BUG: seems to assume that any exit status means `no match' */
>          /* BUG: exit status will have been logged as an error */
>          goto out;
> @@ -171,10 +171,10 @@ out:
>      aodev->callback(egc, aodev);
>  }
>  
> -static void drbd_teardown(libxl__egc *egc, libxl__remus_device *dev)
> +static void drbd_teardown(libxl__egc *egc, libxl__checkpoint_device *dev)
>  {
>      libxl__remus_drbd_disk *drbd_disk = dev->concrete_data;
> -    STATE_AO_GC(dev->rds->ao);
> +    STATE_AO_GC(dev->cds->ao);
>  
>      close(drbd_disk->ctl_fd);
>      dev->aodev.rc = 0;
> @@ -191,9 +191,9 @@ static void checkpoint_async_call_done(libxl__egc *egc,
>  /* API implementations */
>  
>  /* this op will not wait and block, so implement as sync op */
> -static void drbd_postsuspend(libxl__egc *egc, libxl__remus_device *dev)
> +static void drbd_postsuspend(libxl__egc *egc, libxl__checkpoint_device *dev)
>  {
> -    STATE_AO_GC(dev->rds->ao);
> +    STATE_AO_GC(dev->cds->ao);
>  
>      libxl__remus_drbd_disk *rdd = dev->concrete_data;
>  
> @@ -207,16 +207,16 @@ static void drbd_postsuspend(libxl__egc *egc, libxl__remus_device *dev)
>  }
>  
>  
> -static void drbd_preresume_async(libxl__remus_device *dev);
> +static void drbd_preresume_async(libxl__checkpoint_device *dev);
>  
> -static void drbd_preresume(libxl__egc *egc, libxl__remus_device *dev)
> +static void drbd_preresume(libxl__egc *egc, libxl__checkpoint_device *dev)
>  {
> -    STATE_AO_GC(dev->rds->ao);
> +    STATE_AO_GC(dev->cds->ao);
>  
>      drbd_async_call(egc, dev, drbd_preresume_async, checkpoint_async_call_done);
>  }
>  
> -static void drbd_preresume_async(libxl__remus_device *dev)
> +static void drbd_preresume_async(libxl__checkpoint_device *dev)
>  {
>      libxl__remus_drbd_disk *rdd = dev->concrete_data;
>      int ackwait = rdd->ackwait;
> @@ -235,7 +235,7 @@ static void checkpoint_async_call_done(libxl__egc *egc,
>  {
>      int rc;
>      libxl__ao_device *aodev = CONTAINER_OF(child, *aodev, child);
> -    libxl__remus_device *dev = CONTAINER_OF(aodev, *dev, aodev);
> +    libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
>      libxl__remus_drbd_disk *rdd = dev->concrete_data;
>  
>      STATE_AO_GC(aodev->ao);
> @@ -253,7 +253,7 @@ out:
>      aodev->callback(egc, aodev);
>  }
>  
> -const libxl__remus_device_instance_ops remus_device_drbd_disk = {
> +const libxl__checkpoint_device_instance_ops remus_device_drbd_disk = {
>      .kind = LIBXL__DEVICE_KIND_VBD,
>      .setup = drbd_setup,
>      .teardown = drbd_teardown,
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index c5d5d40..db001ad 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -61,8 +61,8 @@ libxl_error = Enumeration("error", [
>      (-15, "LOCK_FAIL"),
>      (-16, "JSON_CONFIG_EMPTY"),
>      (-17, "DEVICE_EXISTS"),
> -    (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
> -    (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
> +    (-18, "CHECKPOINT_DEVOPS_DOES_NOT_MATCH"),
> +    (-19, "CHECKPOINT_DEVICE_NOT_SUPPORTED"),
>      (-20, "VNUMA_CONFIG_INVALID"),
>      (-21, "DOMAIN_NOTFOUND"),
>      (-22, "ABORTED"),
> -- 
> 2.5.0
> 
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
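
A note on why the rename above touches every CONTAINER_OF() call: the third argument is the name of the embedded member, so renaming the field in libxl__domain_save_state from rds to cds changes each call site as well. The sketch below illustrates the pattern with cut-down, hypothetical structures. The names CONTAINER_OF_SKETCH, domain_save_state, checkpoint_devices_state, setup_done, devices_setup and run_demo are assumptions made for illustration and are not libxl's definitions. The macro here takes a type name, whereas the calls quoted in the patch pass an expression (*dss).

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a container-of macro: step back from a pointer
 * to an embedded member to the structure that contains it. */
#define CONTAINER_OF_SKETCH(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, cut-down versions of the structures in the patch. */
typedef struct checkpoint_devices_state checkpoint_devices_state;
typedef void checkpoint_callback(checkpoint_devices_state *cds, int rc);

struct checkpoint_devices_state {
    uint32_t domid;
    checkpoint_callback *callback;   /* set by the caller before setup */
};

typedef struct domain_save_state {
    int interval;
    checkpoint_devices_state cds;    /* embedded by value, not a pointer */
    int last_rc;
} domain_save_state;

/* The callback only receives the inner state; it recovers the outer
 * save state from the member's address. */
static void setup_done(checkpoint_devices_state *cds, int rc)
{
    domain_save_state *dss = CONTAINER_OF_SKETCH(cds, domain_save_state, cds);
    dss->last_rc = rc;
}

/* Stand-in for the device layer: it knows nothing about the outer
 * structure and simply reports completion through the callback. */
static void devices_setup(checkpoint_devices_state *cds)
{
    cds->callback(cds, 0);
}

/* Wires the pieces together and returns the rc seen by the callback. */
int run_demo(void)
{
    domain_save_state dss = { .interval = 5, .last_rc = -1 };

    dss.cds.domid = 7;
    dss.cds.callback = setup_done;
    devices_setup(&dss.cds);
    return dss.last_rc;
}
```

Because the device layer only ever sees the embedded state, it can be shared by Remus and other checkpoint users, and each caller recovers its own outer structure in the callback.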


* Re: [PATCH v6 15/18] tools/libxl: adjust the indentation
  2015-12-30  2:29 ` [PATCH v6 15/18] tools/libxl: adjust the indentation Wen Congyang
@ 2016-01-25 19:44   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-25 19:44 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Wed, Dec 30, 2015 at 10:29:05AM +0800, Wen Congyang wrote:
> This is just tidying up after the previous automatic renaming.

s|previous|"tools/libxl: rename remus device to checkpoint device" patch|

> 
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  tools/libxl/libxl_checkpoint_device.c | 21 +++++++++++----------
>  tools/libxl/libxl_internal.h          | 19 +++++++++++--------
>  2 files changed, 22 insertions(+), 18 deletions(-)
> 
> diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
> index 109cd23..226f159 100644
> --- a/tools/libxl/libxl_checkpoint_device.c
> +++ b/tools/libxl/libxl_checkpoint_device.c
> @@ -73,9 +73,9 @@ static void devices_teardown_cb(libxl__egc *egc,
>  /* checkpoint device setup and teardown */
>  
>  static libxl__checkpoint_device* checkpoint_device_init(libxl__egc *egc,
> -                                              libxl__checkpoint_devices_state *cds,
> -                                              libxl__device_kind kind,
> -                                              void *libxl_dev)
> +                                        libxl__checkpoint_devices_state *cds,
> +                                        libxl__device_kind kind,
> +                                        void *libxl_dev)
>  {
>      libxl__checkpoint_device *dev = NULL;
>  
> @@ -89,9 +89,10 @@ static libxl__checkpoint_device* checkpoint_device_init(libxl__egc *egc,
>  }
>  
>  static void checkpoint_devices_setup(libxl__egc *egc,
> -                                libxl__checkpoint_devices_state *cds);
> +                                     libxl__checkpoint_devices_state *cds);
>  
> -void libxl__checkpoint_devices_setup(libxl__egc *egc, libxl__checkpoint_devices_state *cds)
> +void libxl__checkpoint_devices_setup(libxl__egc *egc,
> +                                     libxl__checkpoint_devices_state *cds)
>  {
>      int i, rc;
>  
> @@ -137,7 +138,7 @@ out:
>  }
>  
>  static void checkpoint_devices_setup(libxl__egc *egc,
> -                                libxl__checkpoint_devices_state *cds)
> +                                     libxl__checkpoint_devices_state *cds)
>  {
>      int i, rc;
>  
> @@ -285,12 +286,12 @@ static void devices_checkpoint_cb(libxl__egc *egc,
>  
>  /* API implementations */
>  
> -#define define_checkpoint_api(api)                                \
> -void libxl__checkpoint_devices_##api(libxl__egc *egc,                        \
> -                                libxl__checkpoint_devices_state *cds)        \
> +#define define_checkpoint_api(api)                                      \
> +void libxl__checkpoint_devices_##api(libxl__egc *egc,                   \
> +                                libxl__checkpoint_devices_state *cds)   \
>  {                                                                       \
>      int i;                                                              \
> -    libxl__checkpoint_device *dev;                                           \
> +    libxl__checkpoint_device *dev;                                      \
>                                                                          \
>      STATE_AO_GC(cds->ao);                                               \
>                                                                          \
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 7f80ec5..5b99d6e 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -2818,7 +2818,8 @@ typedef struct libxl__save_helper_state {
>   * Each device type needs to implement the interfaces specified in
>   * the libxl__checkpoint_device_instance_ops if it wishes to support Remus.
>   *
> - * The high-level control flow through the checkpoint device layer is shown below:
> + * The high-level control flow through the checkpoint device layer is shown
> + * below:
>   *
>   * xl remus
>   *  |->  libxl_domain_remus_start
> @@ -2879,7 +2880,8 @@ int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
>  void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
>  
>  typedef void libxl__checkpoint_callback(libxl__egc *,
> -                                   libxl__checkpoint_devices_state *, int rc);
> +                                        libxl__checkpoint_devices_state *,
> +                                        int rc);
>  
>  /*
>   * State associated with a checkpoint invocation, including parameters
> @@ -2887,7 +2889,7 @@ typedef void libxl__checkpoint_callback(libxl__egc *,
>   * save/restore machinery.
>   */
>  struct libxl__checkpoint_devices_state {
> -    /*---- must be set by caller of libxl__checkpoint_device_(setup|teardown) ----*/
> +    /*-- must be set by caller of libxl__checkpoint_device_(setup|teardown) --*/
>  
>      libxl__ao *ao;
>      uint32_t domid;
> @@ -2900,7 +2902,8 @@ struct libxl__checkpoint_devices_state {
>      /*
>       * this array is allocated before setup the checkpoint devices by the
>       * checkpoint abstract layer.
> -     * devs may be NULL, means there's no checkpoint devices that has been set up.
> +     * devs may be NULL, means there's no checkpoint devices that has been
> +     * set up.
>       * the size of this array is 'num_devices', which is the total number
>       * of libxl nic devices and disk devices(num_nics + num_disks).
>       */
> @@ -2962,13 +2965,13 @@ struct libxl__checkpoint_device {
>  _hidden void libxl__checkpoint_devices_setup(libxl__egc *egc,
>                                          libxl__checkpoint_devices_state *cds);
>  _hidden void libxl__checkpoint_devices_teardown(libxl__egc *egc,
> -                                           libxl__checkpoint_devices_state *cds);
> +                                        libxl__checkpoint_devices_state *cds);
>  _hidden void libxl__checkpoint_devices_postsuspend(libxl__egc *egc,
> -                                              libxl__checkpoint_devices_state *cds);
> +                                        libxl__checkpoint_devices_state *cds);
>  _hidden void libxl__checkpoint_devices_preresume(libxl__egc *egc,
> -                                            libxl__checkpoint_devices_state *cds);
> +                                        libxl__checkpoint_devices_state *cds);
>  _hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
> -                                         libxl__checkpoint_devices_state *cds);
> +                                        libxl__checkpoint_devices_state *cds);
>  _hidden int libxl__netbuffer_enabled(libxl__gc *gc);
>  
>  /*----- Legacy conversion helper -----*/
> -- 
> 2.5.0

* Re: [PATCH v6 16/18] tools/libxl: store remus_ops in checkpoint device state
  2015-12-30  2:29 ` [PATCH v6 16/18] tools/libxl: store remus_ops in checkpoint device state Wen Congyang
@ 2016-01-25 19:55   ` Konrad Rzeszutek Wilk
  2016-01-26  8:07     ` Wen Congyang
  0 siblings, 1 reply; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-25 19:55 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Wed, Dec 30, 2015 at 10:29:06AM +0800, Wen Congyang wrote:
> Checkpoint device is an abstract layer to do checkpoint.
> COLO can also use it to do checkpoint. But there are
> still some codes in checkpoint device which touch remus.
> 
> This patch and the following 2 will seperate remus from

s/and the following 2/and:

 tools/libxl: move remus state into a seperate structure 
 tools/libxl: seperate device init/cleanup from checkpoint device layer    

> checkpoint device layer.
> 
> We use remus ops directly in checkpoint device. Store it
> in checkpoint device state so that we do not aware of
> remus_ops in the checkpoint device layer.
> 
> it is pure refactoring and no functional changes.
s/it/It/

> 
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> Acked-by:Ian Campbell <ian.campbell@citrix.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

with the changes I mentioned.
> ---
>  tools/libxl/libxl_checkpoint_device.c | 10 +---------
>  tools/libxl/libxl_internal.h          |  2 ++
>  tools/libxl/libxl_remus.c             |  9 +++++++++
>  3 files changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
> index 226f159..bbc6dc4 100644
> --- a/tools/libxl/libxl_checkpoint_device.c
> +++ b/tools/libxl/libxl_checkpoint_device.c
> @@ -17,14 +17,6 @@
>  
>  #include "libxl_internal.h"
>  
> -extern const libxl__checkpoint_device_instance_ops remus_device_nic;
> -extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
> -static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
> -    &remus_device_nic,
> -    &remus_device_drbd_disk,
> -    NULL,
> -};
> -
>  /*----- helper functions -----*/
>  
>  static int init_device_subkind(libxl__checkpoint_devices_state *cds)
> @@ -172,7 +164,7 @@ static void device_setup_iterate(libxl__egc *egc, libxl__ao_device *aodev)
>          goto out;
>  
>      do {
> -        dev->ops = remus_ops[++dev->ops_index];
> +        dev->ops = dev->cds->ops[++dev->ops_index];
>          if (!dev->ops) {
>              libxl_device_nic * nic = NULL;
>              libxl_device_disk * disk = NULL;
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 5b99d6e..914ce94 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -2895,6 +2895,8 @@ struct libxl__checkpoint_devices_state {
>      uint32_t domid;
>      libxl__checkpoint_callback *callback;
>      int device_kind_flags;
> +    /* The ops must be pointer array, and the last ops must be NULL */

s/NULL/NULL./

> +    const libxl__checkpoint_device_instance_ops **ops;
>  
>      /*----- private for abstract layer only -----*/
>  
> diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
> index d088dad..3375331 100644
> --- a/tools/libxl/libxl_remus.c
> +++ b/tools/libxl/libxl_remus.c
> @@ -18,6 +18,14 @@
>  
>  #include "libxl_internal.h"
>  
> +extern const libxl__checkpoint_device_instance_ops remus_device_nic;
> +extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
> +static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
> +    &remus_device_nic,
> +    &remus_device_drbd_disk,
> +    NULL,
> +};
> +
>  /*-------------------- Remus setup and teardown ---------------------*/
>  
>  static void remus_setup_done(libxl__egc *egc,
> @@ -50,6 +58,7 @@ void libxl__remus_setup(libxl__egc *egc,
>      cds->ao = ao;
>      cds->domid = dss->domid;
>      cds->callback = remus_setup_done;
> +    cds->ops = remus_ops;
>  
>      dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
>  
> -- 
> 2.5.0
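
The patch above has the abstract layer read its ops from a NULL-terminated pointer array that the caller stores in the state, instead of naming remus_ops directly. A minimal sketch of that pattern follows. All type and function names here (device_ops, devices_state, find_ops, the match helpers) are hypothetical stand-ins, and the string-prefix matching is an assumption made only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for libxl__checkpoint_device_instance_ops. */
typedef struct device_ops {
    const char *name;
    /* Returns nonzero if this implementation can handle the device. */
    int (*match)(const char *device);
} device_ops;

static int match_nic(const char *device)  { return strncmp(device, "vif", 3) == 0; }
static int match_disk(const char *device) { return strncmp(device, "drbd", 4) == 0; }

static const device_ops nic_ops  = { "nic",  match_nic };
static const device_ops disk_ops = { "disk", match_disk };

/* The concrete layer owns the table; the last entry must be NULL. */
static const device_ops *example_ops[] = { &nic_ops, &disk_ops, NULL };

/* Hypothetical stand-in for libxl__checkpoint_devices_state. */
typedef struct devices_state {
    const device_ops **ops;   /* set by the caller before setup */
} devices_state;

/* Walk the caller-supplied table until an entry matches or NULL is hit. */
static const device_ops *find_ops(const devices_state *cds, const char *device)
{
    assert(cds->ops != NULL);
    for (size_t i = 0; cds->ops[i] != NULL; i++) {
        if (cds->ops[i]->match(device))
            return cds->ops[i];
    }
    return NULL;  /* no implementation supports this device */
}
```

Because the generic code only sees cds->ops, a second user such as COLO could supply its own table without touching the checkpoint device layer.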

* Re: [PATCH v6 17/18] tools/libxl: move remus state into a seperate structure
  2015-12-30  2:29 ` [PATCH v6 17/18] tools/libxl: move remus state into a seperate structure Wen Congyang
@ 2016-01-25 19:59   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-25 19:59 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Wed, Dec 30, 2015 at 10:29:07AM +0800, Wen Congyang wrote:
> Add a new structure remus state, and move concrete layer's private
> member to remus state.
> it is pure refactoring and no functional changes.
> Init interval in libxl__remus_setup(). It is safe to move this initialisation,
> because this value is only used for remus, and remus will use this value after
> libxl__remus_setup().
> 
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  tools/libxl/libxl.c                 |  2 +-
>  tools/libxl/libxl_dom_save.c        |  3 +--
>  tools/libxl/libxl_internal.h        | 35 +++++++++++++++-----------
>  tools/libxl/libxl_netbuffer.c       | 49 +++++++++++++++++++++----------------
>  tools/libxl/libxl_remus.c           | 24 ++++++++++++------
>  tools/libxl/libxl_remus_disk_drbd.c |  8 +++---
>  6 files changed, 72 insertions(+), 49 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 69c8047..481824d 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -882,7 +882,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
>      assert(info);
>  
>      /* Point of no return */
> -    libxl__remus_setup(egc, dss);
> +    libxl__remus_setup(egc, &dss->rs);
>      return AO_INPROGRESS;
>  
>   out:
> diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
> index 8e8d280..86026ac 100644
> --- a/tools/libxl/libxl_dom_save.c
> +++ b/tools/libxl/libxl_dom_save.c
> @@ -392,7 +392,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
>      }
>  
>      if (dss->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_REMUS) {
> -        dss->interval = r_info->interval;
>          if (libxl_defbool_val(r_info->compression))
>              dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
>      }
> @@ -447,7 +446,7 @@ static void domain_save_done(libxl__egc *egc,
>           * from sending checkpoints. Teardown the network buffers and
>           * release netlink resources.  This is an async op.
>           */
> -        libxl__remus_teardown(egc, dss, rc);
> +        libxl__remus_teardown(egc, &dss->rs, rc);
>          return;
>      }
>  
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 914ce94..b6929a9 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -2894,6 +2894,7 @@ struct libxl__checkpoint_devices_state {
>      libxl__ao *ao;
>      uint32_t domid;
>      libxl__checkpoint_callback *callback;
> +    void *concrete_data;
>      int device_kind_flags;
>      /* The ops must be pointer array, and the last ops must be NULL */
>      const libxl__checkpoint_device_instance_ops **ops;
> @@ -2917,16 +2918,6 @@ struct libxl__checkpoint_devices_state {
>      int num_disks;
>  
>      libxl__multidev multidev;
> -
> -    /*----- private for concrete (device-specific) layer only -----*/
> -
> -    /* private for nic device subkind ops */
> -    char *netbufscript;
> -    struct nl_sock *nlsock;
> -    struct nl_cache *qdisc_cache;
> -
> -    /* private for drbd disk subkind ops */
> -    char *drbd_probe_script;
>  };
>  
>  /*
> @@ -2974,6 +2965,23 @@ _hidden void libxl__checkpoint_devices_preresume(libxl__egc *egc,
>                                          libxl__checkpoint_devices_state *cds);
>  _hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
>                                          libxl__checkpoint_devices_state *cds);
> +
> +/*----- Remus related state structure -----*/
> +typedef struct libxl__remus_state libxl__remus_state;
> +struct libxl__remus_state {
> +    /* private */
> +    libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
> +    int interval; /* checkpoint interval */
> +
> +    /*----- private for concrete (device-specific) layer only -----*/
> +    /* private for nic device subkind ops */
> +    char *netbufscript;
> +    struct nl_sock *nlsock;
> +    struct nl_cache *qdisc_cache;
> +
> +    /* private for drbd disk subkind ops */
> +    char *drbd_probe_script;
> +};
>  _hidden int libxl__netbuffer_enabled(libxl__gc *gc);
>  
>  /*----- Legacy conversion helper -----*/
> @@ -3132,9 +3140,8 @@ struct libxl__domain_save_state {
>      int hvm;
>      int xcflags;
>      libxl__domain_suspend_state dsps;
> +    libxl__remus_state rs;
>      libxl__checkpoint_devices_state cds;
> -    libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
> -    int interval; /* checkpoint interval (for Remus) */
>      libxl__stream_write_state sws;
>      libxl__logdirty_switch logdirty;
>      /* private for libxl__domain_save_device_model */
> @@ -3551,9 +3558,9 @@ _hidden void libxl__remus_domain_resume_callback(void *data);
>  _hidden void libxl__remus_domain_save_checkpoint_callback(void *data);
>  /* Remus setup and teardown*/
>  _hidden void libxl__remus_setup(libxl__egc *egc,
> -                                libxl__domain_save_state *dss);
> +                                libxl__remus_state *rs);
>  _hidden void libxl__remus_teardown(libxl__egc *egc,
> -                                   libxl__domain_save_state *dss,
> +                                   libxl__remus_state *rs,
>                                     int rc);
>  /* Remus callbacks for restore */
>  _hidden void libxl__remus_domain_restore_checkpoint_callback(void *data);
> diff --git a/tools/libxl/libxl_netbuffer.c b/tools/libxl/libxl_netbuffer.c
> index 33c2a42..5c7e8a2 100644
> --- a/tools/libxl/libxl_netbuffer.c
> +++ b/tools/libxl/libxl_netbuffer.c
> @@ -42,17 +42,18 @@ int init_subkind_nic(libxl__checkpoint_devices_state *cds)
>  {
>      int rc, ret;
>      libxl__domain_save_state *dss = CONTAINER_OF(cds, *dss, cds);
> +    libxl__remus_state *rs = cds->concrete_data;
>  
>      STATE_AO_GC(cds->ao);
>  
> -    cds->nlsock = nl_socket_alloc();
> -    if (!cds->nlsock) {
> +    rs->nlsock = nl_socket_alloc();
> +    if (!rs->nlsock) {
>          LOG(ERROR, "cannot allocate nl socket");
>          rc = ERROR_FAIL;
>          goto out;
>      }
>  
> -    ret = nl_connect(cds->nlsock, NETLINK_ROUTE);
> +    ret = nl_connect(rs->nlsock, NETLINK_ROUTE);
>      if (ret) {
>          LOG(ERROR, "failed to open netlink socket: %s",
>              nl_geterror(ret));
> @@ -61,7 +62,7 @@ int init_subkind_nic(libxl__checkpoint_devices_state *cds)
>      }
>  
>      /* get list of all qdiscs installed on network devs. */
> -    ret = rtnl_qdisc_alloc_cache(cds->nlsock, &cds->qdisc_cache);
> +    ret = rtnl_qdisc_alloc_cache(rs->nlsock, &rs->qdisc_cache);
>      if (ret) {
>          LOG(ERROR, "failed to allocate qdisc cache: %s",
>              nl_geterror(ret));
> @@ -70,10 +71,10 @@ int init_subkind_nic(libxl__checkpoint_devices_state *cds)
>      }
>  
>      if (dss->remus->netbufscript) {
> -        cds->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
> +        rs->netbufscript = libxl__strdup(gc, dss->remus->netbufscript);
>      } else {
> -        cds->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
> -                                      libxl__xen_script_dir_path());
> +        rs->netbufscript = GCSPRINTF("%s/remus-netbuf-setup",
> +                                     libxl__xen_script_dir_path());
>      }
>  
>      rc = 0;
> @@ -84,20 +85,22 @@ out:
>  
>  void cleanup_subkind_nic(libxl__checkpoint_devices_state *cds)
>  {
> +    libxl__remus_state *rs = cds->concrete_data;
> +
>      STATE_AO_GC(cds->ao);
>  
>      /* free qdisc cache */
> -    if (cds->qdisc_cache) {
> -        nl_cache_clear(cds->qdisc_cache);
> -        nl_cache_free(cds->qdisc_cache);
> -        cds->qdisc_cache = NULL;
> +    if (rs->qdisc_cache) {
> +        nl_cache_clear(rs->qdisc_cache);
> +        nl_cache_free(rs->qdisc_cache);
> +        rs->qdisc_cache = NULL;
>      }
>  
>      /* close & free nlsock */
> -    if (cds->nlsock) {
> -        nl_close(cds->nlsock);
> -        nl_socket_free(cds->nlsock);
> -        cds->nlsock = NULL;
> +    if (rs->nlsock) {
> +        nl_close(rs->nlsock);
> +        nl_socket_free(rs->nlsock);
> +        rs->nlsock = NULL;
>      }
>  }
>  
> @@ -150,13 +153,14 @@ static int init_qdisc(libxl__checkpoint_devices_state *cds,
>      int rc, ret, ifindex;
>      struct rtnl_link *ifb = NULL;
>      struct rtnl_qdisc *qdisc = NULL;
> +    libxl__remus_state *rs = cds->concrete_data;
>  
>      STATE_AO_GC(cds->ao);
>  
>      /* Now that we have brought up REMUS_IFB device with plug qdisc for
>       * this vif, so we need to refill the qdisc cache.
>       */
> -    ret = nl_cache_refill(cds->nlsock, cds->qdisc_cache);
> +    ret = nl_cache_refill(rs->nlsock, rs->qdisc_cache);
>      if (ret) {
>          LOG(ERROR, "cannot refill qdisc cache: %s", nl_geterror(ret));
>          rc = ERROR_FAIL;
> @@ -164,7 +168,7 @@ static int init_qdisc(libxl__checkpoint_devices_state *cds,
>      }
>  
>      /* get a handle to the REMUS_IFB interface */
> -    ret = rtnl_link_get_kernel(cds->nlsock, 0, remus_nic->ifb, &ifb);
> +    ret = rtnl_link_get_kernel(rs->nlsock, 0, remus_nic->ifb, &ifb);
>      if (ret) {
>          LOG(ERROR, "cannot obtain handle for %s: %s", remus_nic->ifb,
>              nl_geterror(ret));
> @@ -187,7 +191,7 @@ static int init_qdisc(libxl__checkpoint_devices_state *cds,
>       * There is no need to explicitly free this qdisc as its just a
>       * reference from the qdisc cache we allocated earlier.
>       */
> -    qdisc = rtnl_qdisc_get_by_parent(cds->qdisc_cache, ifindex, TC_H_ROOT);
> +    qdisc = rtnl_qdisc_get_by_parent(rs->qdisc_cache, ifindex, TC_H_ROOT);
>      if (qdisc) {
>          const char *tc_kind = rtnl_tc_get_kind(TC_CAST(qdisc));
>          /* Sanity check: Ensure that the root qdisc is a plug qdisc. */
> @@ -238,11 +242,12 @@ static void setup_async_exec(libxl__checkpoint_device *dev, char *op)
>      libxl__remus_device_nic *remus_nic = dev->concrete_data;
>      libxl__checkpoint_devices_state *cds = dev->cds;
>      libxl__async_exec_state *aes = &dev->aodev.aes;
> +    libxl__remus_state *rs = cds->concrete_data;
>  
>      STATE_AO_GC(cds->ao);
>  
>      /* Convenience aliases */
> -    char *const script = libxl__strdup(gc, cds->netbufscript);
> +    char *const script = libxl__strdup(gc, rs->netbufscript);
>      const uint32_t domid = cds->domid;
>      const int dev_id = remus_nic->devid;
>      const char *const vif = remus_nic->vif;
> @@ -333,6 +338,7 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
>      libxl__checkpoint_device *dev = CONTAINER_OF(aodev, *dev, aodev);
>      libxl__remus_device_nic *remus_nic = dev->concrete_data;
>      libxl__checkpoint_devices_state *cds = dev->cds;
> +    libxl__remus_state *rs = cds->concrete_data;
>      const char *out_path_base, *hotplug_error = NULL;
>  
>      STATE_AO_GC(cds->ao);
> @@ -377,7 +383,7 @@ static void netbuf_setup_script_cb(libxl__egc *egc,
>  
>      if (hotplug_error) {
>          LOG(ERROR, "netbuf script %s setup failed for vif %s: %s",
> -            cds->netbufscript, vif, hotplug_error);
> +            rs->netbufscript, vif, hotplug_error);
>          rc = ERROR_FAIL;
>          goto out;
>      }
> @@ -445,6 +451,7 @@ static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
>                             int buffer_op)
>  {
>      int rc, ret;
> +    libxl__remus_state *rs = cds->concrete_data;
>  
>      STATE_AO_GC(cds->ao);
>  
> @@ -458,7 +465,7 @@ static int remus_netbuf_op(libxl__remus_device_nic *remus_nic,
>          goto out;
>      }
>  
> -    ret = rtnl_qdisc_add(cds->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
> +    ret = rtnl_qdisc_add(rs->nlsock, remus_nic->qdisc, NLM_F_REQUEST);
>      if (ret) {
>          rc = ERROR_FAIL;
>          goto out;
> diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
> index 3375331..00e3c80 100644
> --- a/tools/libxl/libxl_remus.c
> +++ b/tools/libxl/libxl_remus.c
> @@ -35,9 +35,10 @@ static void remus_setup_failed(libxl__egc *egc,
>  static void remus_checkpoint_stream_written(
>      libxl__egc *egc, libxl__stream_write_state *sws, int rc);
>  
> -void libxl__remus_setup(libxl__egc *egc,
> -                        libxl__domain_save_state *dss)
> +void libxl__remus_setup(libxl__egc *egc, libxl__remus_state *rs)
>  {
> +    libxl__domain_save_state *dss = CONTAINER_OF(rs, *dss, rs);
> +
>      /* Convenience aliases */
>      libxl__checkpoint_devices_state *const cds = &dss->cds;
>      const libxl_domain_remus_info *const info = dss->remus;
> @@ -59,6 +60,8 @@ void libxl__remus_setup(libxl__egc *egc,
>      cds->domid = dss->domid;
>      cds->callback = remus_setup_done;
>      cds->ops = remus_ops;
> +    cds->concrete_data = rs;
> +    rs->interval = info->interval;
>  
>      dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
>  
> @@ -103,15 +106,20 @@ static void remus_teardown_done(libxl__egc *egc,
>                                  libxl__checkpoint_devices_state *cds,
>                                  int rc);
>  void libxl__remus_teardown(libxl__egc *egc,
> -                           libxl__domain_save_state *dss,
> +                           libxl__remus_state *rs,
>                             int rc)
>  {
> +    libxl__domain_save_state *dss = CONTAINER_OF(rs, *dss, rs);
> +
> +    /* Convenience aliases */
> +    libxl__checkpoint_devices_state *const cds = &dss->cds;
> +
>      EGC_GC;
>  
>      LOG(WARN, "Remus: Domain suspend terminated with rc %d,"
>          " teardown Remus devices...", rc);
> -    dss->cds.callback = remus_teardown_done;
> -    libxl__checkpoint_devices_teardown(egc, &dss->cds);
> +    cds->callback = remus_teardown_done;
> +    libxl__checkpoint_devices_teardown(egc, cds);
>  }
>  
>  static void remus_teardown_done(libxl__egc *egc,
> @@ -285,9 +293,9 @@ static void remus_devices_commit_cb(libxl__egc *egc,
>       */
>  
>      /* Set checkpoint interval timeout */
> -    rc = libxl__ev_time_register_rel(ao, &dss->checkpoint_timeout,
> +    rc = libxl__ev_time_register_rel(ao, &dss->rs.checkpoint_timeout,
>                                       remus_next_checkpoint,
> -                                     dss->interval);
> +                                     dss->rs.interval);
>  
>      if (rc)
>          goto out;
> @@ -303,7 +311,7 @@ static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
>                                    int rc)
>  {
>      libxl__domain_save_state *dss =
> -                            CONTAINER_OF(ev, *dss, checkpoint_timeout);
> +                            CONTAINER_OF(ev, *dss, rs.checkpoint_timeout);
>  
>      STATE_AO_GC(dss->ao);
>  
> diff --git a/tools/libxl/libxl_remus_disk_drbd.c b/tools/libxl/libxl_remus_disk_drbd.c
> index 4dddc58..844dd66 100644
> --- a/tools/libxl/libxl_remus_disk_drbd.c
> +++ b/tools/libxl/libxl_remus_disk_drbd.c
> @@ -28,10 +28,11 @@ typedef struct libxl__remus_drbd_disk {
>  
>  int init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds)
>  {
> +    libxl__remus_state *rs = cds->concrete_data;
>      STATE_AO_GC(cds->ao);
>  
> -    cds->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
> -                                       libxl__xen_script_dir_path());
> +    rs->drbd_probe_script = GCSPRINTF("%s/block-drbd-probe",
> +                                      libxl__xen_script_dir_path());
>  
>      return 0;
>  }
> @@ -96,6 +97,7 @@ static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev)
>      int arraysize, nr = 0, rc;
>      const libxl_device_disk *disk = dev->backend_dev;
>      libxl__async_exec_state *aes = &dev->aodev.aes;
> +    libxl__remus_state *rs = dev->cds->concrete_data;
>      STATE_AO_GC(dev->cds->ao);
>  
>      /* setup env & args */
> @@ -107,7 +109,7 @@ static void match_async_exec(libxl__egc *egc, libxl__checkpoint_device *dev)
>      arraysize = 3;
>      nr = 0;
>      GCNEW_ARRAY(aes->args, arraysize);
> -    aes->args[nr++] = dev->cds->drbd_probe_script;
> +    aes->args[nr++] = rs->drbd_probe_script;
>      aes->args[nr++] = disk->pdev_path;
>      aes->args[nr++] = NULL;
>      assert(nr <= arraysize);
> -- 
> 2.5.0
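
The patch above relies on two pointer idioms: the abstract layer carries an opaque concrete_data pointer, and the concrete layer recovers the enclosing save state from an embedded member. The sketch below illustrates both under stated assumptions. The struct layouts are simplified stand-ins, and CONTAINER_OF_SKETCH is the generic three-argument offsetof form, not libxl's own CONTAINER_OF macro, which takes different arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Generic container_of: recover the enclosing struct from a member pointer. */
#define CONTAINER_OF_SKETCH(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical abstract-layer state: knows nothing about Remus. */
typedef struct devices_state {
    void *concrete_data;   /* opaque pointer owned by the concrete layer */
} devices_state;

/* Hypothetical concrete-layer state (stand-in for libxl__remus_state). */
typedef struct remus_state {
    int interval;
} remus_state;

/* Hypothetical outer state embedding both members. */
typedef struct save_state {
    int domid;
    remus_state rs;
    devices_state cds;
} save_state;

/* Concrete layer: publish its private state through the opaque pointer. */
static void remus_setup_sketch(remus_state *rs, int interval)
{
    save_state *dss = CONTAINER_OF_SKETCH(rs, save_state, rs);
    dss->cds.concrete_data = rs;
    rs->interval = interval;
}

/* Device hook: gets only the abstract state, casts the pointer back. */
static int device_hook_sketch(devices_state *cds)
{
    remus_state *rs = cds->concrete_data;
    assert(rs != NULL);
    return rs->interval;
}
```

The hook never needs the definition of save_state, which is what lets the nic and drbd code stop depending on fields inside the checkpoint device state.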

* Re: [PATCH v6 18/18] tools/libxl: seperate device init/cleanup from checkpoint device layer
  2015-12-30  2:29 ` [PATCH v6 18/18] tools/libxl: seperate device init/cleanup from checkpoint device layer Wen Congyang
@ 2016-01-25 20:01   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-25 20:01 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Wed, Dec 30, 2015 at 10:29:08AM +0800, Wen Congyang wrote:
> we call (init|cleanup)_subkind_nic and (init|cleanup)_subkind_drbd_disk
> directly in checkpoint device. Move them to libxl_remus.c, Call them before
> calling libxl__checkpoint_devices_setup() or after calling
> libxl__checkpoint_devices_teardown().
> it is pure refactoring and no functional changes.
> 
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  tools/libxl/libxl_checkpoint_device.c | 42 ++---------------------------------
>  tools/libxl/libxl_remus.c             | 42 +++++++++++++++++++++++++++++++++++
>  2 files changed, 44 insertions(+), 40 deletions(-)
> 
> diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
> index bbc6dc4..0a16dbb 100644
> --- a/tools/libxl/libxl_checkpoint_device.c
> +++ b/tools/libxl/libxl_checkpoint_device.c
> @@ -17,38 +17,6 @@
>  
>  #include "libxl_internal.h"
>  
> -/*----- helper functions -----*/
> -
> -static int init_device_subkind(libxl__checkpoint_devices_state *cds)
> -{
> -    /* init device subkind-specific state in the libxl ctx */
> -    int rc;
> -    STATE_AO_GC(cds->ao);
> -
> -    if (libxl__netbuffer_enabled(gc)) {
> -        rc = init_subkind_nic(cds);
> -        if (rc) goto out;
> -    }
> -
> -    rc = init_subkind_drbd_disk(cds);
> -    if (rc) goto out;
> -
> -    rc = 0;
> -out:
> -    return rc;
> -}
> -
> -static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
> -{
> -    /* cleanup device subkind-specific state in the libxl ctx */
> -    STATE_AO_GC(cds->ao);
> -
> -    if (libxl__netbuffer_enabled(gc))
> -        cleanup_subkind_nic(cds);
> -
> -    cleanup_subkind_drbd_disk(cds);
> -}
> -
>  /*----- setup() and teardown() -----*/
>  
>  /* callbacks */
> @@ -86,14 +54,10 @@ static void checkpoint_devices_setup(libxl__egc *egc,
>  void libxl__checkpoint_devices_setup(libxl__egc *egc,
>                                       libxl__checkpoint_devices_state *cds)
>  {
> -    int i, rc;
> +    int i;
>  
>      STATE_AO_GC(cds->ao);
>  
> -    rc = init_device_subkind(cds);
> -    if (rc)
> -        goto out;
> -
>      cds->num_devices = 0;
>      cds->num_nics = 0;
>      cds->num_disks = 0;
> @@ -126,7 +90,7 @@ void libxl__checkpoint_devices_setup(libxl__egc *egc,
>      return;
>  
>  out:
> -    cds->callback(egc, cds, rc);
> +    cds->callback(egc, cds, 0);
>  }
>  
>  static void checkpoint_devices_setup(libxl__egc *egc,
> @@ -263,8 +227,6 @@ static void devices_teardown_cb(libxl__egc *egc,
>      cds->disks = NULL;
>      cds->num_disks = 0;
>  
> -    cleanup_device_subkind(cds);
> -
>      cds->callback(egc, cds, rc);
>  }
>  
> diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
> index 00e3c80..07a1699 100644
> --- a/tools/libxl/libxl_remus.c
> +++ b/tools/libxl/libxl_remus.c
> @@ -26,6 +26,38 @@ static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
>      NULL,
>  };
>  
> +/*----- helper functions -----*/
> +
> +static int init_device_subkind(libxl__checkpoint_devices_state *cds)
> +{
> +    /* init device subkind-specific state in the libxl ctx */
> +    int rc;
> +    STATE_AO_GC(cds->ao);
> +
> +    if (libxl__netbuffer_enabled(gc)) {
> +        rc = init_subkind_nic(cds);
> +        if (rc) goto out;
> +    }
> +
> +    rc = init_subkind_drbd_disk(cds);
> +    if (rc) goto out;
> +
> +    rc = 0;
> +out:
> +    return rc;
> +}
> +
> +static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
> +{
> +    /* cleanup device subkind-specific state in the libxl ctx */
> +    STATE_AO_GC(cds->ao);
> +
> +    if (libxl__netbuffer_enabled(gc))
> +        cleanup_subkind_nic(cds);
> +
> +    cleanup_subkind_drbd_disk(cds);
> +}
> +
>  /*-------------------- Remus setup and teardown ---------------------*/
>  
>  static void remus_setup_done(libxl__egc *egc,
> @@ -63,6 +95,12 @@ void libxl__remus_setup(libxl__egc *egc, libxl__remus_state *rs)
>      cds->concrete_data = rs;
>      rs->interval = info->interval;
>  
> +    if (init_device_subkind(cds)) {
> +        LOG(ERROR, "Remus: failed to init device subkind for guest %u",
> +            dss->domid);
> +        goto out;
> +    }
> +
>      dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
>  
>      libxl__checkpoint_devices_setup(egc, cds);
> @@ -99,6 +137,8 @@ static void remus_setup_failed(libxl__egc *egc,
>          LOG(ERROR, "Remus: failed to teardown device after setup failed"
>              " for guest with domid %u, rc %d", dss->domid, rc);
>  
> +    cleanup_device_subkind(cds);
> +
>      dss->callback(egc, dss, rc);
>  }
>  
> @@ -133,6 +173,8 @@ static void remus_teardown_done(libxl__egc *egc,
>          LOG(ERROR, "Remus: failed to teardown device for guest with domid %u,"
>              " rc %d", dss->domid, rc);
>  
> +    cleanup_device_subkind(cds);
> +
>      dss->callback(egc, dss, rc);
>  }
>  
> -- 
> 2.5.0
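
The ordering this patch establishes is: the concrete layer initialises its subkind state before calling the generic setup, and cleans it up only from the callback that reports the generic teardown as finished. Below is a minimal synchronous sketch of that ordering. The real libxl code is asynchronous and event-driven, and every name here (layer_state, concrete_setup, and so on) is a hypothetical stand-in.

```c
#include <assert.h>
#include <stddef.h>

typedef struct layer_state layer_state;
typedef void done_cb(layer_state *st, int rc);

struct layer_state {
    done_cb *callback;    /* completion callback, set by the concrete layer */
    int subkind_ready;    /* concrete-layer resources are initialised */
    int devices_up;       /* generic-layer devices are set up */
    int last_rc;
};

/* Concrete-layer helpers, now called outside the generic layer. */
static int init_subkind(layer_state *st)     { st->subkind_ready = 1; return 0; }
static void cleanup_subkind(layer_state *st) { st->subkind_ready = 0; }

/* Generic layer: no subkind init or cleanup in here any more. */
static void devices_setup(layer_state *st)
{
    assert(st->callback != NULL);
    st->devices_up = 1;
    st->callback(st, 0);
}

static void devices_teardown(layer_state *st)
{
    assert(st->callback != NULL);
    st->devices_up = 0;
    st->callback(st, 0);
}

static void setup_done(layer_state *st, int rc) { st->last_rc = rc; }

static void teardown_done(layer_state *st, int rc)
{
    cleanup_subkind(st);  /* runs only after the generic teardown completed */
    st->last_rc = rc;
}

/* Concrete layer: init first, then hand over to the generic setup. */
static int concrete_setup(layer_state *st)
{
    int rc = init_subkind(st);
    if (rc) return rc;
    st->callback = setup_done;
    devices_setup(st);
    return st->last_rc;
}

static void concrete_teardown(layer_state *st)
{
    st->callback = teardown_done;
    devices_teardown(st);
}
```

Putting the cleanup in the completion callback rather than right after the teardown call matters in the asynchronous case, where the teardown may still be in flight when the call returns.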

* Re: [PATCH v6 00/18] Prerequisite patches for COLO
  2016-01-25 17:12 ` [PATCH v6 00/18] Prerequisite patches for COLO Konrad Rzeszutek Wilk
@ 2016-01-25 20:06   ` Konrad Rzeszutek Wilk
  2016-01-26  3:18     ` Wen Congyang
  0 siblings, 1 reply; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-25 20:06 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Mon, Jan 25, 2016 at 12:12:48PM -0500, Konrad Rzeszutek Wilk wrote:
> On Wed, Dec 30, 2015 at 10:28:50AM +0800, Wen Congyang wrote:
> > This patchset is Prerequisite for COLO feature. Refer to:
> > http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
> > 
> > It was based on the following series:
> > http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02881.html
> 
> You wouldn't have this in a git tree? It is a bit hard to apply on the latest
> staging. Or could you say on what branch/git commit it was based on?

I looked at the patches and they just need minor tweaking from my perspective.

I think when you get to reposting it with my review comments you may want
to have in the cover letter a list of all the patches and which
ones have been reviewed and acked. That way the maintainers can zoom in
on the ones that still need some tweaking/review.

And also if possible - do include a git tree. At a certain point in the patchset
you moved some functions and did a bit of renaming, and it was hard to refer
back to the original staging tree to see the code around the functions.

Having a git tree would allow the reviewers to nicely git checkout at certain
points and be able to see the code. Perhaps this is me - and if getting
a git tree is quite difficult - then don't sweat over it.

Thank you!
> 
> Thanks.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 04/18] libxl/save: Refactor libxl__domain_suspend_state
  2016-01-25 17:29   ` Konrad Rzeszutek Wilk
@ 2016-01-26  2:23     ` Wen Congyang
  2016-01-26 14:32       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2016-01-26  2:23 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On 01/26/2016 01:29 AM, Konrad Rzeszutek Wilk wrote:
> .snip..
>> --- a/tools/libxl/libxl_dom_suspend.c
>> +++ b/tools/libxl/libxl_dom_suspend.c
>> @@ -19,14 +19,71 @@
>>  
>>  /*====================== Domain suspend =======================*/
>>  
>> +int libxl__domain_suspend_init(libxl__egc *egc,
>> +                               libxl__domain_suspend_state *dsps)
>> +{
>> +    STATE_AO_GC(dsps->ao);
>> +    int rc = ERROR_FAIL;
>> +    int port;
>> +    libxl_domain_type type;
>> +
>> +    /* Convenience aliases */
>> +    const uint32_t domid = dsps->domid;
>> +
>> +    type = libxl__domain_type(gc, domid);
>> +    switch (type) {
>> +    case LIBXL_DOMAIN_TYPE_HVM: {
>> +        dsps->hvm = 1;
>> +        break;
>> +    }
>> +    case LIBXL_DOMAIN_TYPE_PV:
>> +        dsps->hvm = 0;
>> +        break;
>> +    default:
>> +        goto out;
> 
> This will mean we return back to libxl__domain_save which will goto out which calls:
> domain_save_done. And that will try to use the dsps->guestevtchn leading to a crash since:

Yes, thanks for pointing it out. In which case is the type neither HVM nor PV?

>> +    }
>> +
>> +    libxl__xswait_init(&dsps->pvcontrol);
>> +    libxl__ev_evtchn_init(&dsps->guest_evtchn);
> 
> we initialize them here.
>> +    libxl__ev_xswatch_init(&dsps->guest_watch);
>> +    libxl__ev_time_init(&dsps->guest_timeout);
> 
> I would instead recommend you move these initialization routines above the
> 'type' check.

I think we should not return ERROR_FAIL when the type is not PV or HVM. We should abort the program,
as we do in libxl__domain_save().
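
To make the ordering problem concrete, here is a minimal sketch. It assumes
simplified stand-in types: struct suspend_state, enum dom_type and both
functions are hypothetical and are not the libxl definitions. Every field that
the cleanup path looks at is put into a known state before the type check, so
an early failure cannot leave cleanup reading uninitialized data.

```c
/* Hypothetical stand-ins for illustration only; not the libxl types. */
enum dom_type { DOM_TYPE_INVALID = -1, DOM_TYPE_HVM = 1, DOM_TYPE_PV = 2 };

struct suspend_state {
    int hvm;
    int evtchn_port;    /* -1 means "not bound" */
    int evtchn_lockfd;  /* -1 means "not open" */
};

/* Returns 0 on success, -1 if the domain type is not usable. */
static int suspend_state_init(struct suspend_state *s, enum dom_type type)
{
    /* Initialize everything the cleanup path touches before any early exit. */
    s->hvm = 0;
    s->evtchn_port = -1;
    s->evtchn_lockfd = -1;

    switch (type) {
    case DOM_TYPE_HVM:
        s->hvm = 1;
        break;
    case DOM_TYPE_PV:
        s->hvm = 0;
        break;
    default:
        return -1;      /* an abort() here is the alternative discussed above */
    }
    return 0;
}

/* Safe to call after a failed init; returns how many resources it released. */
static int suspend_state_cleanup(struct suspend_state *s)
{
    int released = 0;

    if (s->evtchn_port >= 0) {
        s->evtchn_port = -1;
        released++;
    }
    if (s->evtchn_lockfd >= 0) {
        s->evtchn_lockfd = -1;
        released++;
    }
    return released;
}
```

With this ordering, calling suspend_state_cleanup() after a failed
suspend_state_init() finds nothing to release instead of acting on stale
values. Whether to return an error or abort on an unexpected type is a
separate choice and is not settled by the sketch.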

> 
>> +
>> +    dsps->guest_evtchn.port = -1;
>> +    dsps->guest_evtchn_lockfd = -1;
>> +    dsps->guest_responded = 0;
>> +    dsps->dm_savefile = libxl__device_model_savefile(gc, domid);
>> +
>> +    port = xs_suspend_evtchn_port(domid);
>> +
>> +    if (port >= 0) {
>> +        rc = libxl__ctx_evtchn_init(gc);
>> +        if (rc) goto out;
>> +
>> +        dsps->guest_evtchn.port =
>> +            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
>> +                                    domid, port, &dsps->guest_evtchn_lockfd);
>> +
>> +        if (dsps->guest_evtchn.port < 0) {
>> +            LOG(WARN, "Suspend event channel initialization failed");
>> +            rc = ERROR_FAIL;
>> +            goto out;
>> +        }
>> +    }
>> +
>> +    rc = 0;
>> +
>> +out:
>> +    return rc;
>> +}
>> +
> 
> .. snip..
>>  struct libxl__domain_suspend_state {
>> +    /* set by caller of libxl__domain_suspend_init */
>> +    libxl__ao *ao;
>> +    uint32_t domid;
>> +
>> +    /* private */
>> +    int hvm;
> 
> How about 'is_hvm' and just use 'libxl_domain_type' type?
> instead of having an int? You can just do:

In dss, it is 'int hvm'.
Before this patch:
if (dss->hvm) ...
After this patch:
if (dsps->hvm) ...

Thanks
Wen Congyang

> 
> if (type == LIBXL_DOMAIN_TYPE_HVM) ..
> 
> And to check for non-conforming types - you can make  libxl__domain_suspend_init
> do this:
> 
>     if (type == LIBXL_DOMAIN_TYPE_INVALID) {
>         rc = ERROR_FAIL;
>         goto out; 
>     }    
> 
> ?
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 05/18] tools/libxc: support to resume uncooperative HVM guests
  2016-01-25 18:21   ` Konrad Rzeszutek Wilk
@ 2016-01-26  2:53     ` Wen Congyang
  0 siblings, 0 replies; 48+ messages in thread
From: Wen Congyang @ 2016-01-26  2:53 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On 01/26/2016 02:21 AM, Konrad Rzeszutek Wilk wrote:
> On Wed, Dec 30, 2015 at 10:28:55AM +0800, Wen Congyang wrote:
>> Befor this patch:
> 
> s/Befor/Before
>> 1. suspend
>> a. PVHVM and PV: we use the same way to suspend the guest (send the suspend
>>    request to the guest). If the guest doesn't support evtchn, the xenstore
>>    variant will be used, suspending the guest via XenBus control node.
>> b. pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to suspend
>>    the guest
>>
>> 2. Resume:
>> a. fast path
> 
> s/fast path/fast path(fast=1)
> 
>>    In this case, we don't change the guest's state. And we will call
>>    libxl__domain_resume(..., 1) to resume the guest.
> 
> Do not change the guest state. We call libxl__domain_resume(.., 1) which
> calls xc_domain_resume(..., 1 /* fast=1*/) to resume the guest.
> 
> 
>>    PV:       modify the return code to 1, and than call the domctl
> 
> s/domctl/domctl:/
>>              XEN_DOMCTL_resumedomain
>>    PVHVM:    same with PV
>>    pure HVM: do nothing in modify_returncode, and than call the domctl:
>>              XEN_DOMCTL_resumedomain
> 
>> b. slow
>>    Used when the guest's state have been changed. And we will call
> 
> s/And we will/Will/
> 
>>    libxl__domain_resume(..., 0) to resume the guest.
>>    PV:       update start info, and reset all secondary CPU states. Than call
>>              the domctl: XEN_DOMCTL_resumedomain
>>    PVHVM:    can not be resumed. You will get the following error message:
>>                  "Cannot resume uncooperative HVM guests"
>>    purt HVM: same with PVHVM
>>
>> After this patch:
>> 1. suspend
>>    unchanged
>>
>> 2. Resume
>> a. fast path:
>>    unchanged
>> b. slow
>>    PV:       unchanged
>>    PVHVM:    call XEN_DOMCTL_resumedomain to resume the guest. Because we
>>              don't modify the return code, the PV driver will disconnect
>>              and reconnect. I am not sure if we should update start info
>>              and reset all secondary CPU states.
> 
> The guest ends up doing the XENMAPSPACE_shared_info XENMEM_add_to_physmap
> hypercall and resetting all of its CPU states to point to the shared_info
> (well except the ones past 32).
> 
> That is the Linux kernel does that - regardless whether the 
> SCHEDOP_shutdown:SHUTDOWN_suspend returns 1 or not.

Yes, I found it in the Linux kernel. Thanks for pointing it out.
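
For reference, the resume behaviour described in the commit message (the state
after this patch) can be summarized as a small decision function. This is only
a sketch of the description: the enum and function names are hypothetical and
nothing here is libxc code.

```c
/* Hypothetical names for illustration; none of these exist in libxc. */
enum guest_kind { GUEST_PV, GUEST_PVHVM, GUEST_PURE_HVM };

enum resume_steps {
    /* Set the suspend return code to 1, then XEN_DOMCTL_resumedomain. */
    STEPS_SET_RETCODE_THEN_DOMCTL,
    /* Update start info, reset secondary CPU states, then the domctl. */
    STEPS_REBUILD_PV_THEN_DOMCTL,
    /* XEN_DOMCTL_resumedomain and nothing else. */
    STEPS_DOMCTL_ONLY
};

/* fast != 0 selects the fast path (guest state unchanged). */
static enum resume_steps pick_resume_steps(enum guest_kind kind, int fast)
{
    if (fast)
        return kind == GUEST_PURE_HVM ? STEPS_DOMCTL_ONLY
                                      : STEPS_SET_RETCODE_THEN_DOMCTL;

    /* Slow path: only PV guests get their start info rebuilt. */
    if (kind == GUEST_PV)
        return STEPS_REBUILD_PV_THEN_DOMCTL;

    /* PVHVM and pure HVM: just unpause; PV drivers reconnect on their own. */
    return STEPS_DOMCTL_ONLY;
}
```

Before the patch, the two slow-path HVM cases failed with "Cannot resume
uncooperative HVM guests" instead of reaching the domctl.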

> 
>>    Pure HVM: call XEN_DOMCTL_resumedomain to resume the guest.
>>
>> Under COLO, we will update the guest's state(modify memory, cpu's registers,
>> device status...). In this case, we cannot use the fast path to resume it.
>> Keep the return code 0, and use a slow path to resume the guest. While
>> resuming HVM using slow path is not supported currently, this patch is to
>> make the resume call do not fail.
> 
> s/do/to/
>>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
>> ---
>>  tools/libxc/xc_resume.c | 24 ++++++++++++++++++++----
>>  1 file changed, 20 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c
>> index 87d4324..503e4f8 100644
>> --- a/tools/libxc/xc_resume.c
>> +++ b/tools/libxc/xc_resume.c
>> @@ -108,6 +108,25 @@ static int xc_domain_resume_cooperative(xc_interface *xch, uint32_t domid)
>>      return do_domctl(xch, &domctl);
>>  }
>>  
>> +static int xc_domain_resume_hvm(xc_interface *xch, uint32_t domid)
>> +{
>> +    DECLARE_DOMCTL;
>> +
>> +    /*
>> +     * This domctl XEN_DOMCTL_resumedomain just unpause each vcpu. After
> 
> s/This/The/
> s/just//
>> +     * this domctl, the guest will run.
> s/this/the/
> 
>> +     *
>> +     * If it is PVHVM, the guest called the hypercall HYPERVISOR_sched_op
> 
> s/HYPERVISOR_sched_op/SCHEDOP_shutdown:SHUTDOWN_suspend/
>> +     * to suspend itself. We don't modify the return code, so the PV driver
>> +     * will disconnect and reconnect.
>> +     *
>> +     * If it is a HVM, the guest will continue running.
>> +     */
>> +    domctl.cmd = XEN_DOMCTL_resumedomain;
>> +    domctl.domain = domid;
>> +    return do_domctl(xch, &domctl);
>> +}
>> +
>>  static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
>>  {
>>      DECLARE_DOMCTL;
>> @@ -137,10 +156,7 @@ static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
>>       */
>>  #if defined(__i386__) || defined(__x86_64__)
>>      if ( info.hvm )
>> -    {
>> -        ERROR("Cannot resume uncooperative HVM guests");
>> -        return rc;
>> -    }
>> +        return xc_domain_resume_hvm(xch, domid);
>>  
>>      if ( xc_domain_get_guest_width(xch, domid, &dinfo->guest_width) != 0 )
>>      {
>> -- 
>> 2.5.0
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 00/18] Prerequisite patches for COLO
  2016-01-25 20:06   ` Konrad Rzeszutek Wilk
@ 2016-01-26  3:18     ` Wen Congyang
  0 siblings, 0 replies; 48+ messages in thread
From: Wen Congyang @ 2016-01-26  3:18 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On 01/26/2016 04:06 AM, Konrad Rzeszutek Wilk wrote:
> On Mon, Jan 25, 2016 at 12:12:48PM -0500, Konrad Rzeszutek Wilk wrote:
>> On Wed, Dec 30, 2015 at 10:28:50AM +0800, Wen Congyang wrote:
>>> This patchset is Prerequisite for COLO feature. Refer to:
>>> http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping
>>>
>>> It was based on the following series:
>>> http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02881.html
>>
>> You wouldn't have this in a git tree? It is a bit hard to apply on the latest
>> staging. Or could you say on what branch/git commit it was based on?
> 
> I looked at the patches and they just minor tweaking from my perspective.
> 
> I think when you get to reposting it with my review comments you may want
> to have in the cover letter a list of all the patches and which
> ones have been reviewed and acked. That way the maintainers can zoom in
> on the ones that still need some tweaking/review.
> 
> And also if possible - do include a git tree. At certain point in the patchset
> you had moved some functions, did a bit of renaming and it was hard to reference
> to the original staging tree to see the code around the functions.
> 
> Having a git tree would allow the reviewers to nicely git checkout at certain
> points and be able to see the code. Perhaps this is me - and if getting
> a git tree is quite difficult - then don't sweat over it.

OK, will do it in the next version.

Thanks for your review.
Wen Congyang

> 
> Thank you!
>>
>> Thanks.
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 09/18] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()
  2016-01-25 18:59   ` Konrad Rzeszutek Wilk
@ 2016-01-26  7:04     ` Wen Congyang
  2016-01-26 14:27       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2016-01-26  7:04 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On 01/26/2016 02:59 AM, Konrad Rzeszutek Wilk wrote:
> On Wed, Dec 30, 2015 at 10:28:59AM +0800, Wen Congyang wrote:
>> Secondary vm is running in colo mode, we need to send
>> secondary vm's dirty page information to master at checkpoint,
> 
> In previous patch you called it primary, so perhaps:
> s/master/primary/ ?
> 
>> so we have to enable qemu logdirty on secondary.
>>
>> libxl__domain_suspend_common_switch_qemu_logdirty() is to enable
>> qemu logdirty. But it uses domain_save_state, and calls
> 
> s/domain_save_state/libxl__domain_save_state/
>> libxl__xc_domain_saverestore_async_callback_done()
>> before exits. This can not be used for secondary vm.
>>
>> Update libxl__domain_suspend_common_switch_qemu_logdirty() to
>> introduce a new API libxl__domain_common_switch_qemu_logdirty().
>> This API only uses libxl__logdirty_switch, and calls
>> lds->callback before exits.
> 
> One question - that perhaps had been part of the review earlier
> (if so it may be good to include this in the description
> so I don't ask silly questions):
> 
> Why add this extra API? You could squash libxl__domain_suspend_common_switch_qemu_logdirty
> and libxl__domain_common_switch_qemu_logdirty code together
> and call it libxl_domain_common_and_suspend_common_switch_qemu_logdirty
> (ok, just kidding on the name). But - why not have one function
> instead of splitting the functionality in two?

Do you mean automatically switching qemu logdirty when suspending the guest?
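
To show what the split buys, here is a minimal sketch of the pattern, assuming
simplified stand-in structures. The struct names, the container_of macro and
the simulated_rc parameter are hypothetical; libxl's own CONTAINER_OF macro
takes different arguments. The common function only knows about the logdirty
switch and its callback, and the save-specific completion handler recovers its
enclosing state from the embedded member.

```c
#include <stddef.h>

/* Simplified macro for this sketch; libxl's CONTAINER_OF differs. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the logdirty switch: it only carries a completion callback. */
struct logdirty_switch {
    void (*callback)(struct logdirty_switch *lds, int rc);
};

/* Stand-in for the save state, which embeds the switch. */
struct save_state {
    int rc;
    int done;
    struct logdirty_switch logdirty;
};

/* Save-specific completion: recover the outer struct from the member. */
static void save_logdirty_done(struct logdirty_switch *lds, int rc)
{
    struct save_state *dss = container_of(lds, struct save_state, logdirty);

    if (rc)
        dss->rc = rc;   /* record the error only on failure */
    dss->done = 1;
}

/* Common part: knows nothing about save_state, only the callback. */
static void common_switch_logdirty(struct logdirty_switch *lds,
                                   int simulated_rc)
{
    /* A real implementation would talk to the device model here. */
    lds->callback(lds, simulated_rc);
}
```

A second user, such as the secondary side, could embed the same
struct logdirty_switch in its own state and install a different callback
without touching the save path.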

Thanks
Wen Congyang

> 
> Is there another patch that depends on it? If so it may be good
> to spell it out, like:
> 
> Patch blah blah is going to use it.
> 
> Thanks!
>>
>> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>> Acked-by: Ian Campbell <ian.campbell@citrix.com>
>> ---
>>  tools/libxl/libxl_dom_save.c | 95 ++++++++++++++++++++++++--------------------
>>  tools/libxl/libxl_internal.h |  8 ++++
>>  2 files changed, 60 insertions(+), 43 deletions(-)
>>
>> diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
>> index b3ecad7..79e43f1 100644
>> --- a/tools/libxl/libxl_dom_save.c
>> +++ b/tools/libxl/libxl_dom_save.c
>> @@ -42,7 +42,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
>>  static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
>>                              const char *watch_path, const char *event_path);
>>  static void switch_logdirty_done(libxl__egc *egc,
>> -                                 libxl__domain_save_state *dss, int rc);
>> +                                 libxl__logdirty_switch *lds, int rc);
>>  
>>  static void logdirty_init(libxl__logdirty_switch *lds)
>>  {
>> @@ -52,13 +52,10 @@ static void logdirty_init(libxl__logdirty_switch *lds)
>>  }
>>  
>>  static void domain_suspend_switch_qemu_xen_traditional_logdirty
>> -                               (int domid, unsigned enable,
>> -                                libxl__save_helper_state *shs)
>> +                               (libxl__egc *egc, int domid, unsigned enable,
>> +                                libxl__logdirty_switch *lds)
>>  {
>> -    libxl__egc *egc = shs->egc;
>> -    libxl__domain_save_state *dss = shs->caller_state;
>> -    libxl__logdirty_switch *lds = &dss->logdirty;
>> -    STATE_AO_GC(dss->ao);
>> +    STATE_AO_GC(lds->ao);
>>      int rc;
>>      xs_transaction_t t = 0;
>>      const char *got;
>> @@ -120,26 +117,34 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty
>>   out:
>>      LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
>>      libxl__xs_transaction_abort(gc, &t);
>> -    switch_logdirty_done(egc,dss,rc);
>> +    switch_logdirty_done(egc,lds,rc);
>>  }
>>  
>>  static void domain_suspend_switch_qemu_xen_logdirty
>> -                               (int domid, unsigned enable,
>> -                                libxl__save_helper_state *shs)
>> +                               (libxl__egc *egc, int domid, unsigned enable,
>> +                                libxl__logdirty_switch *lds)
>>  {
>> -    libxl__egc *egc = shs->egc;
>> -    libxl__domain_save_state *dss = shs->caller_state;
>> -    STATE_AO_GC(dss->ao);
>> +    STATE_AO_GC(lds->ao);
>>      int rc;
>>  
>>      rc = libxl__qmp_set_global_dirty_log(gc, domid, enable);
>> -    if (!rc) {
>> -        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
>> -    } else {
>> +    if (rc)
>>          LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
>> +
>> +    lds->callback(egc, lds, rc);
>> +}
>> +
>> +static void domain_suspend_switch_qemu_logdirty_done
>> +                        (libxl__egc *egc, libxl__logdirty_switch *lds, int rc)
>> +{
>> +    libxl__domain_save_state *dss = CONTAINER_OF(lds, *dss, logdirty);
>> +
>> +    if (rc) {
>>          dss->rc = rc;
>> -        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
>> -    }
>> +        libxl__xc_domain_saverestore_async_callback_done(egc,
>> +                                                         &dss->sws.shs, -1);
>> +    } else
>> +        libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, 0);
>>  }
>>  
>>  void libxl__domain_suspend_common_switch_qemu_logdirty
>> @@ -148,42 +153,52 @@ void libxl__domain_suspend_common_switch_qemu_logdirty
>>      libxl__save_helper_state *shs = user;
>>      libxl__egc *egc = shs->egc;
>>      libxl__domain_save_state *dss = shs->caller_state;
>> -    STATE_AO_GC(dss->ao);
>> +
>> +    /* convenience aliases */
> 
> /* Convenience aliases. */
> 
>> +    libxl__logdirty_switch *const lds = &dss->logdirty;
>> +
>> +    lds->callback = domain_suspend_switch_qemu_logdirty_done;
>> +    libxl__domain_common_switch_qemu_logdirty(egc, domid, enable, lds);
>> +}
>> +
>> +void libxl__domain_common_switch_qemu_logdirty(libxl__egc *egc,
>> +                                               int domid, unsigned enable,
>> +                                               libxl__logdirty_switch *lds)
>> +{
>> +    STATE_AO_GC(lds->ao);
>>  
>>      switch (libxl__device_model_version_running(gc, domid)) {
>>      case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
>> -        domain_suspend_switch_qemu_xen_traditional_logdirty(domid, enable, shs);
>> +        domain_suspend_switch_qemu_xen_traditional_logdirty(egc, domid, enable,
>> +                                                            lds);
>>          break;
>>      case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
>> -        domain_suspend_switch_qemu_xen_logdirty(domid, enable, shs);
>> +        domain_suspend_switch_qemu_xen_logdirty(egc, domid, enable, lds);
>>          break;
>>      case LIBXL_DEVICE_MODEL_VERSION_NONE:
>> -        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
>> +        lds->callback(egc, lds, 0);
>>          break;
>>      default:
>>          LOG(ERROR,"logdirty switch failed"
>>              ", no valid device model version found, abandoning suspend");
>> -        dss->rc = ERROR_FAIL;
>> -        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
>> +        lds->callback(egc, lds, ERROR_FAIL);
>>      }
>>  }
>>  static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
>>                                      const struct timeval *requested_abs,
>>                                      int rc)
>>  {
>> -    libxl__domain_save_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
>> -    STATE_AO_GC(dss->ao);
>> +    libxl__logdirty_switch *lds = CONTAINER_OF(ev, *lds, timeout);
>> +    STATE_AO_GC(lds->ao);
>>      LOG(ERROR,"logdirty switch: wait for device model timed out");
>> -    switch_logdirty_done(egc,dss,ERROR_FAIL);
>> +    switch_logdirty_done(egc,lds,ERROR_FAIL);
>>  }
>>  
>>  static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
>>                              const char *watch_path, const char *event_path)
>>  {
>> -    libxl__domain_save_state *dss =
>> -        CONTAINER_OF(watch, *dss, logdirty.watch);
>> -    libxl__logdirty_switch *lds = &dss->logdirty;
>> -    STATE_AO_GC(dss->ao);
>> +    libxl__logdirty_switch *lds = CONTAINER_OF(watch, *lds, watch);
>> +    STATE_AO_GC(lds->ao);
>>      const char *got;
>>      xs_transaction_t t = 0;
>>      int rc;
>> @@ -229,28 +244,20 @@ static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
>>      if (rc <= 0) {
>>          if (rc < 0)
>>              LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
>> -        switch_logdirty_done(egc,dss,rc);
>> +        switch_logdirty_done(egc,lds,rc);
>>      }
>>  }
>>  
>>  static void switch_logdirty_done(libxl__egc *egc,
>> -                                 libxl__domain_save_state *dss,
>> +                                 libxl__logdirty_switch *lds,
>>                                   int rc)
>>  {
>> -    STATE_AO_GC(dss->ao);
>> -    libxl__logdirty_switch *lds = &dss->logdirty;
>> +    STATE_AO_GC(lds->ao);
>>  
>>      libxl__ev_xswatch_deregister(gc, &lds->watch);
>>      libxl__ev_time_deregister(gc, &lds->timeout);
>>  
>> -    int broke;
>> -    if (rc) {
>> -        broke = -1;
>> -        dss->rc = rc;
>> -    } else {
>> -        broke = 0;
>> -    }
>> -    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, broke);
>> +    lds->callback(egc, lds, rc);
>>  }
>>  
>>  /*----- callbacks, called by xc_domain_save -----*/
>> @@ -346,6 +353,8 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
>>  
>>      dss->rc = 0;
>>      logdirty_init(&dss->logdirty);
>> +    dss->logdirty.ao = ao;
>> +
>>      dsps->ao = ao;
>>      dsps->domid = domid;
>>      rc = libxl__domain_suspend_init(egc, dsps);
>> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
>> index 4872619..552692f 100644
>> --- a/tools/libxl/libxl_internal.h
>> +++ b/tools/libxl/libxl_internal.h
>> @@ -3071,6 +3071,11 @@ libxl__stream_write_inuse(const libxl__stream_write_state *stream)
>>  }
>>  
>>  typedef struct libxl__logdirty_switch {
>> +    /* set by caller of libxl__domain_common_switch_qemu_logdirty */
> 
> s/set/Set/
> 
>> +    libxl__ao *ao;
>> +    void (*callback)(libxl__egc *egc, struct libxl__logdirty_switch *lds,
>> +                     int rc);
>> +
>>      const char *cmd;
>>      const char *cmd_path;
>>      const char *ret_path;
>> @@ -3490,6 +3495,9 @@ void libxl__xc_domain_saverestore_async_callback_done(libxl__egc *egc,
>>  
>>  _hidden void libxl__domain_suspend_common_switch_qemu_logdirty
>>                                 (int domid, unsigned int enable, void *data);
>> +_hidden void libxl__domain_common_switch_qemu_logdirty(libxl__egc *egc,
>> +                                               int domid, unsigned enable,
>> +                                               libxl__logdirty_switch *lds);
>>  _hidden int libxl__save_emulator_xenstore_data(libxl__domain_save_state *dss,
>>                                                 char **buf, uint32_t *len);
>>  _hidden int libxl__restore_emulator_xenstore_data
>> -- 
>> 2.5.0
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 11/18] tools/libxl: Add back channel to allow migration target send data back
  2016-01-25 19:17   ` Konrad Rzeszutek Wilk
@ 2016-01-26  7:48     ` Wen Congyang
  0 siblings, 0 replies; 48+ messages in thread
From: Wen Congyang @ 2016-01-26  7:48 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On 01/26/2016 03:17 AM, Konrad Rzeszutek Wilk wrote:
> On Wed, Dec 30, 2015 at 10:29:01AM +0800, Wen Congyang wrote:
>> In colo mode, slave needs to send data to master, but the io_fd
> 
> 
> 
> In previous patches you used COLO in all caps, can that be uniform
> across the patches?

OK, I will check it.

> 
> Also, slave == secondary and master == primary? Perhaps you
> could s/slave/secondary/ s/master/primary/ to sync up with
> the other patches?

OK, I will do it.

> 
> Thank you!
>> only can be written in master, and only can be read in slave.
> 
> 
> 
> Could you mention what kind of data the secondary has to send
> to the primary? In the previous patch (] tools/libxl:
> introduce libxl__domain_common_switch_qemu_logdirty()) it mentioned
> dirty page. Is that the case here? If so can you mention
> that as well here?

OK. Will fix it in the next version.

> 
> 
>> Save recv_fd in domain_suspend_state, and send_fd in
>> domain_create_state.
>> Extend libxl_domain_create_restore API, add a send_fd param to
>> it.
>> Add LIBXL_HAVE_CREATE_RESTORE_SEND_FD to indicate the API change.
>>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
>> ---
>>  tools/libxl/libxl.c                  |  2 +-
>>  tools/libxl/libxl.h                  | 30 ++++++++++++++++++++++++++++--
>>  tools/libxl/libxl_create.c           |  9 +++++----
>>  tools/libxl/libxl_internal.h         |  2 ++
>>  tools/libxl/libxl_types.idl          |  1 +
>>  tools/libxl/xl_cmdimpl.c             |  8 +++++++-
>>  tools/ocaml/libs/xl/xenlight_stubs.c |  2 +-
>>  7 files changed, 45 insertions(+), 9 deletions(-)
>>
>> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
>> index 2faea4d..69c8047 100644
>> --- a/tools/libxl/libxl.c
>> +++ b/tools/libxl/libxl.c
>> @@ -872,7 +872,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
>>      dss->callback = remus_failover_cb;
>>      dss->domid = domid;
>>      dss->fd = send_fd;
>> -    /* TODO do something with recv_fd */
>> +    dss->recv_fd = recv_fd;
>>      dss->type = type;
>>      dss->live = 1;
>>      dss->debug = 0;
>> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
>> index a01e448..67a4ad7 100644
>> --- a/tools/libxl/libxl.h
>> +++ b/tools/libxl/libxl.h
>> @@ -630,6 +630,15 @@ typedef struct libxl__ctx libxl_ctx;
>>  #define LIBXL_HAVE_DOMAIN_CREATE_RESTORE_PARAMS 1
>>  
>>  /*
>> + * LIBXL_HAVE_DOMAIN_CREATE_RESTORE_SEND_FD 1
>> + *
>> + * If this is defined, libxl_domain_create_restore()'s API has changed to
>> + * include a send_fd param which used for libxl migration back channel
>> + * during COLO FT.
> 
> FT? Could you explain that acronym please?

FT means Fault Tolerance. COLO is an FT solution.
In the comment, I think FT can be removed.

>> + */
>> +#define LIBXL_HAVE_DOMAIN_CREATE_RESTORE_SEND_FD 1
>> +
>> +/*
>>   * LIBXL_HAVE_CREATEINFO_PVH
>>   * If this is defined, then libxl supports creation of a PVH guest.
>>   */
>> @@ -1143,7 +1152,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
>>                              const libxl_asyncprogress_how *aop_console_how)
>>                              LIBXL_EXTERNAL_CALLERS_ONLY;
>>  int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
>> -                                uint32_t *domid, int restore_fd,
>> +                                uint32_t *domid, int restore_fd, int send_fd,
>>                                  const libxl_domain_restore_params *params,
>>                                  const libxl_asyncop_how *ao_how,
>>                                  const libxl_asyncprogress_how *aop_console_how)
>> @@ -1164,7 +1173,7 @@ int static inline libxl_domain_create_restore_0x040200(
>>      libxl_domain_restore_params_init(&params);
>>  
>>      ret = libxl_domain_create_restore(
>> -        ctx, d_config, domid, restore_fd, &params, ao_how, aop_console_how);
>> +        ctx, d_config, domid, restore_fd, -1, &params, ao_how, aop_console_how);
>>  
>>      libxl_domain_restore_params_dispose(&params);
>>      return ret;
>> @@ -1172,6 +1181,23 @@ int static inline libxl_domain_create_restore_0x040200(
>>  
>>  #define libxl_domain_create_restore libxl_domain_create_restore_0x040200
>>  
>> +#elif defined(LIBXL_API_VERSION) && LIBXL_API_VERSION >= 0x040400 \
>> +                                 && LIBXL_API_VERSION < 0x040600
> 
> s/4060/4070? Or is that suppose to be <= 040600 ?

It should be 4070 here. I just rebased this series and forgot to update it.
Thanks for pointing it out.
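
A minimal sketch of the version guard being discussed, assuming the upper
bound becomes 0x040700. All demo_* names and DEMO_API_VERSION are hypothetical
stand-ins and are not the libxl.h definitions. A caller built against an older
API version keeps its old argument list, and the wrapper passes -1 for the new
send_fd parameter.

```c
/* Assumption: the application was written against the 4.5 API. */
#ifndef DEMO_API_VERSION
#define DEMO_API_VERSION 0x040500
#endif

/* Newest form of the call: takes the extra send_fd parameter. */
static int demo_create_restore(int restore_fd, int send_fd)
{
    /* Stand-in body: just report which back channel would be used. */
    (void)restore_fd;
    return send_fd;
}

#if DEMO_API_VERSION >= 0x040400 && DEMO_API_VERSION < 0x040700
/* Older callers do not know about send_fd: -1 means "no back channel". */
static inline int demo_create_restore_0x040400(int restore_fd)
{
    return demo_create_restore(restore_fd, -1);
}

/* From here on, the short name resolves to the compatibility wrapper. */
#define demo_create_restore demo_create_restore_0x040400
#endif
```

With the stale upper bound of 0x040600, a caller requesting version 0x040600
would fall outside the range and get the new signature, which is why the bound
has to track the release that introduces the change.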

>> +
> .. snip..
>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>> index 9aa94be..c5d5d40 100644
>> --- a/tools/libxl/libxl_types.idl
>> +++ b/tools/libxl/libxl_types.idl
>> @@ -232,6 +232,7 @@ libxl_hdtype = Enumeration("hdtype", [
>>  libxl_checkpointed_stream = Enumeration("checkpointed_stream", [
>>      (0, "NONE"),
>>      (1, "REMUS"),
>> +    (2, "COLO"),
> 
> You should also update the migration_stream enum with the extra enum.
> 
> And if you follow my idea of adding an assertion in xc_domain_save
> for the different checkpointed_stream types then that would need to
> be expanded to include support for COLO type as well.

Yes, but I can't find the code that uses this new type. I will check it.

Thanks
Wen Congyang

> 
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v6 12/18] tools/libx{l, c}: add back channel to libxc
  2016-01-25 19:41   ` Konrad Rzeszutek Wilk
@ 2016-01-26  8:03     ` Wen Congyang
  2016-01-26 14:29       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2016-01-26  8:03 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On 01/26/2016 03:41 AM, Konrad Rzeszutek Wilk wrote:
> On Wed, Dec 30, 2015 at 10:29:02AM +0800, Wen Congyang wrote:
>> In COLO mode, both VMs are running, and are considered in sync if the
>> visible network traffic is identical.  After some time, they fall out of
>> sync.
>>
>> At this point, the two VMs have definitely diverged.  Lets call the
>> primary dirty bitmap set A, while the secondary dirty bitmap set B.
>>
>> Sets A and B are different.
>>
>> Under normal migration, the page data for set A will be sent form the
> 
> s/form/from/
> 
>> primary to the secondary.
>>
>> However, the set difference B - A (lets call this C) is out-of-date on
>> the secondary (with respect to the primary) and will not be sent by the
>> primary, as it was not memory dirtied by the primary.  The secondary
> 
> s/primary/primary (to secondary)/
> 
>> needs the page data for C to reconstruct an exact copy of the primary at
> 
> s/the page data/C page data/
> 
>> the checkpoint.
>>
>> The secondary cannot calculate C as it doesn't know A.  Instead, the
>> secondary must send B to the primary, at which point the primary
>> calculates the union of A and B (lets call this D) which is all the
>> pages dirtied by both the primary and the secondary, and sends all page
>> data covered by D.
> 
> You could invert this - the primary could send A to secondary? I presume
> this non-optimal as the 'A' set is much much bigger than 'C' set?

Set 'C' contains the pages that are in set 'B' but not in set 'A'.
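
As a minimal sketch of the set arithmetic in the commit message, assuming the
dirty bitmaps are plain arrays of 64-bit words with one bit per page. The
helper names are hypothetical and are not libxc functions.

```c
#include <stddef.h>
#include <stdint.h>

/* D = A union B: every page dirtied by the primary or the secondary. */
static void bitmap_union(uint64_t *d, const uint64_t *a, const uint64_t *b,
                         size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        d[i] = a[i] | b[i];
}

/* C = B minus A: pages dirtied only by the secondary. */
static void bitmap_diff(uint64_t *c, const uint64_t *b, const uint64_t *a,
                        size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        c[i] = b[i] & ~a[i];
}

/* Returns 1 if the bit for page frame pfn is set, 0 otherwise. */
static int bitmap_test(const uint64_t *bm, size_t pfn)
{
    return (int)((bm[pfn / 64] >> (pfn % 64)) & 1);
}
```

The secondary cannot compute C by itself because it never sees A. Sending B
over the back channel lets the primary compute D and send page data for all
of it, which covers C as well.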

> 
> It may be good to include this in the commit description.
> 
>>
>> In the general case, D is a superset of both A and B.  Without the
>> backchannel dirty bitmap, a COLO checkpoint can't reconstruct a valid
>> copy of the primary.
>>
>> We transfer the dirty bitmap on libxc side, so we need to introduce back
>> channel to libxc.
> 
>>
>> Note: it is different from the paper. We change the original design to
>> the current one, according to our following concerns:
>> 1. The original design needs extra memory on Secondary host. When there's
>>    multiple backups on one host, the memory cost is high.
>> 2. The memory cache code will be another 1k+, it will make the review
>>    more time consuming.
> 
> Well, that 2) is a very good reason :-)
>>
>> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
>> commit message:
> 
> ? Huh?

I don't know what it is. Will remove it in the next version.

> 
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
> 
> .. snip..
>> index 05159bb..d4dc501 100644
>> --- a/tools/libxc/xc_sr_restore.c
>> +++ b/tools/libxc/xc_sr_restore.c
>> @@ -722,7 +722,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
>>                        unsigned long *console_gfn, domid_t console_domid,
>>                        unsigned int hvm, unsigned int pae, int superpages,
>>                        int checkpointed_stream,
>> -                      struct restore_callbacks *callbacks)
>> +                      struct restore_callbacks *callbacks, int back_fd)
>>  {
>>      struct xc_sr_context ctx =
>>          {
>> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
>> index 8ffd71d..a49d083 100644
>> --- a/tools/libxc/xc_sr_save.c
>> +++ b/tools/libxc/xc_sr_save.c
>> @@ -824,7 +824,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
>>  int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
>>                     uint32_t max_iters, uint32_t max_factor, uint32_t flags,
>>                     struct save_callbacks* callbacks, int hvm,
>> -                   int checkpointed_stream)
>> +                   int checkpointed_stream, int back_fd)
>>  {
>>      struct xc_sr_context ctx =
>>          {
> 
> 
> But where is the code?
> 
> Or is that suppose to be done in another patch? If so you may want to
> mention that in the commit description?

Do you mean where is the code that uses back_fd? It is in another series:
http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02904.html

Thanks
Wen Congyang

> 
> 
> 
> .
> 


* Re: [PATCH v6 16/18] tools/libxl: store remus_ops in checkpoint device state
  2016-01-25 19:55   ` Konrad Rzeszutek Wilk
@ 2016-01-26  8:07     ` Wen Congyang
  0 siblings, 0 replies; 48+ messages in thread
From: Wen Congyang @ 2016-01-26  8:07 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On 01/26/2016 03:55 AM, Konrad Rzeszutek Wilk wrote:
> On Wed, Dec 30, 2015 at 10:29:06AM +0800, Wen Congyang wrote:
>> Checkpoint device is an abstract layer to do checkpoint.
>> COLO can also use it to do checkpoint. But there are
>> still some codes in checkpoint device which touch remus.
>>
>> This patch and the following 2 will seperate remus from
> 
> s/and the following 2/and:
> 
>  tools/libxl: move remus state into a seperate structure 
>  tools/libxl: seperate device init/cleanup from checkpoint device layer    
> 
>> checkpoint device layer.
>>
>> We use remus ops directly in checkpoint device. Store it
>> in checkpoint device state so that we do not aware of
>> remus_ops in the checkpoint device layer.
>>
>> it is pure refactoring and no functional changes.
> s/it/It/
> 
>>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
>> Acked-by:Ian Campbell <ian.campbell@citrix.com>
> 
> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> 
> with the changes I mentioned.

OK, will fix it in the next version.
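
For readers following along, the idea in the quoted patch (the generic layer
walks a NULL-terminated ops array taken from its state, without naming the
Remus table) can be sketched roughly as below. All types and names here are
simplified stand-ins that I made up for illustration, not the libxl ones.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for libxl__checkpoint_device_instance_ops and
 * libxl__checkpoint_devices_state; only the fields needed here. */
struct instance_ops {
    const char *name;
    int (*match)(int kind);   /* nonzero if this ops handles the kind */
};

struct devices_state {
    const struct instance_ops **ops;   /* NULL-terminated pointer array */
};

static int match_nic(int kind)  { return kind == 1; }
static int match_disk(int kind) { return kind == 2; }

/* The table lives with its user (Remus in the patch), not in the
 * generic layer. */
static const struct instance_ops demo_nic  = { "nic",  match_nic };
static const struct instance_ops demo_disk = { "disk", match_disk };
static const struct instance_ops *demo_ops[] = { &demo_nic, &demo_disk, NULL };

/* Generic layer: walk cds->ops until an entry matches or the NULL
 * terminator is reached. */
static const struct instance_ops *
find_ops(const struct devices_state *cds, int kind)
{
    assert(cds && cds->ops);
    for (int i = 0; cds->ops[i]; i++)
        if (cds->ops[i]->match(kind))
            return cds->ops[i];
    return NULL;
}
```

A second user such as COLO would then only need to point ops at its own
table; find_ops() stays unchanged.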

Thanks
Wen Congyang

>> ---
>>  tools/libxl/libxl_checkpoint_device.c | 10 +---------
>>  tools/libxl/libxl_internal.h          |  2 ++
>>  tools/libxl/libxl_remus.c             |  9 +++++++++
>>  3 files changed, 12 insertions(+), 9 deletions(-)
>>
>> diff --git a/tools/libxl/libxl_checkpoint_device.c b/tools/libxl/libxl_checkpoint_device.c
>> index 226f159..bbc6dc4 100644
>> --- a/tools/libxl/libxl_checkpoint_device.c
>> +++ b/tools/libxl/libxl_checkpoint_device.c
>> @@ -17,14 +17,6 @@
>>  
>>  #include "libxl_internal.h"
>>  
>> -extern const libxl__checkpoint_device_instance_ops remus_device_nic;
>> -extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
>> -static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
>> -    &remus_device_nic,
>> -    &remus_device_drbd_disk,
>> -    NULL,
>> -};
>> -
>>  /*----- helper functions -----*/
>>  
>>  static int init_device_subkind(libxl__checkpoint_devices_state *cds)
>> @@ -172,7 +164,7 @@ static void device_setup_iterate(libxl__egc *egc, libxl__ao_device *aodev)
>>          goto out;
>>  
>>      do {
>> -        dev->ops = remus_ops[++dev->ops_index];
>> +        dev->ops = dev->cds->ops[++dev->ops_index];
>>          if (!dev->ops) {
>>              libxl_device_nic * nic = NULL;
>>              libxl_device_disk * disk = NULL;
>> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
>> index 5b99d6e..914ce94 100644
>> --- a/tools/libxl/libxl_internal.h
>> +++ b/tools/libxl/libxl_internal.h
>> @@ -2895,6 +2895,8 @@ struct libxl__checkpoint_devices_state {
>>      uint32_t domid;
>>      libxl__checkpoint_callback *callback;
>>      int device_kind_flags;
>> +    /* The ops must be pointer array, and the last ops must be NULL */
> 
> s/NULL/NULL./
> 
>> +    const libxl__checkpoint_device_instance_ops **ops;
>>  
>>      /*----- private for abstract layer only -----*/
>>  
>> diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
>> index d088dad..3375331 100644
>> --- a/tools/libxl/libxl_remus.c
>> +++ b/tools/libxl/libxl_remus.c
>> @@ -18,6 +18,14 @@
>>  
>>  #include "libxl_internal.h"
>>  
>> +extern const libxl__checkpoint_device_instance_ops remus_device_nic;
>> +extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
>> +static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
>> +    &remus_device_nic,
>> +    &remus_device_drbd_disk,
>> +    NULL,
>> +};
>> +
>>  /*-------------------- Remus setup and teardown ---------------------*/
>>  
>>  static void remus_setup_done(libxl__egc *egc,
>> @@ -50,6 +58,7 @@ void libxl__remus_setup(libxl__egc *egc,
>>      cds->ao = ao;
>>      cds->domid = dss->domid;
>>      cds->callback = remus_setup_done;
>> +    cds->ops = remus_ops;
>>  
>>      dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
>>  
>> -- 
>> 2.5.0
>>
>>
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel
> 
> 
> .
> 


* Re: [PATCH v6 09/18] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()
  2016-01-26  7:04     ` Wen Congyang
@ 2016-01-26 14:27       ` Konrad Rzeszutek Wilk
  2016-01-27  0:53         ` Wen Congyang
  2016-01-27  2:06         ` Wen Congyang
  0 siblings, 2 replies; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-26 14:27 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Tue, Jan 26, 2016 at 03:04:39PM +0800, Wen Congyang wrote:
> On 01/26/2016 02:59 AM, Konrad Rzeszutek Wilk wrote:
> > On Wed, Dec 30, 2015 at 10:28:59AM +0800, Wen Congyang wrote:
> >> Secondary vm is running in colo mode, we need to send
> >> secondary vm's dirty page information to master at checkpoint,
> > 
> > In previous patch you called it primary, so perhaps:
> > s/master/primary/ ?
> > 
> >> so we have to enable qemu logdirty on secondary.
> >>
> >> libxl__domain_suspend_common_switch_qemu_logdirty() is to enable
> >> qemu logdirty. But it uses domain_save_state, and calls
> > 
> > s/domain_save_state/libxl__domain_save_state/
> >> libxl__xc_domain_saverestore_async_callback_done()
> >> before exits. This can not be used for secondary vm.
> >>
> >> Update libxl__domain_suspend_common_switch_qemu_logdirty() to
> >> introduce a new API libxl__domain_common_switch_qemu_logdirty().
> >> This API only uses libxl__logdirty_switch, and calls
> >> lds->callback before exits.
> > 
> > One question - that perhaps had been part of the review earlier
> > (if so it may be good to include this in the description
> > so I don't ask silly questions):
> > 
> > Why add this extra API? You could squash libxl__domain_suspend_common_switch_qemu_logdirty
> > and libxl__domain_common_switch_qemu_logdirty code together
> > and call it libxl_domain_common_and_suspend_common_switch_qemu_logdirty
> > (ok, just kidding on the name). But - why not have one function
> > instead of splitting the functionality in two?
> 
> Do you mean that auto switch qemu logdirty when suspend the guest?

Squash the two functions - libxl__domain_common_switch_qemu_logdirty and
libxl__domain_suspend_common_switch_qemu_logdirty together?


* Re: [PATCH v6 12/18] tools/libx{l, c}: add back channel to libxc
  2016-01-26  8:03     ` Wen Congyang
@ 2016-01-26 14:29       ` Konrad Rzeszutek Wilk
  2016-01-27  0:52         ` Wen Congyang
  0 siblings, 1 reply; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-26 14:29 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

> > Or is that suppose to be done in another patch? If so you may want to
> > mention that in the commit description?
> 
> Do you mean where is the code that uses back_fd? It is in another series:
> http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02904.html

Ah right that big patchset one. Hadn't looked at that yet - it is a bit hard
without having a git tree on which the foundation patches (this patch
series) are applied so I can look at the contents of the functions.


* Re: [PATCH v6 04/18] libxl/save: Refactor libxl__domain_suspend_state
  2016-01-26  2:23     ` Wen Congyang
@ 2016-01-26 14:32       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 48+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-26 14:32 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On Tue, Jan 26, 2016 at 10:23:52AM +0800, Wen Congyang wrote:
> On 01/26/2016 01:29 AM, Konrad Rzeszutek Wilk wrote:
> > .snip..
> >> --- a/tools/libxl/libxl_dom_suspend.c
> >> +++ b/tools/libxl/libxl_dom_suspend.c
> >> @@ -19,14 +19,71 @@
> >>  
> >>  /*====================== Domain suspend =======================*/
> >>  
> >> +int libxl__domain_suspend_init(libxl__egc *egc,
> >> +                               libxl__domain_suspend_state *dsps)
> >> +{
> >> +    STATE_AO_GC(dsps->ao);
> >> +    int rc = ERROR_FAIL;
> >> +    int port;
> >> +    libxl_domain_type type;
> >> +
> >> +    /* Convenience aliases */
> >> +    const uint32_t domid = dsps->domid;
> >> +
> >> +    type = libxl__domain_type(gc, domid);
> >> +    switch (type) {
> >> +    case LIBXL_DOMAIN_TYPE_HVM: {
> >> +        dsps->hvm = 1;
> >> +        break;
> >> +    }
> >> +    case LIBXL_DOMAIN_TYPE_PV:
> >> +        dsps->hvm = 0;
> >> +        break;
> >> +    default:
> >> +        goto out;
> > 
> > This will mean we return back to libxl__domain_save which will goto out which calls:
> > domain_save_done. And that will try to use the dsps->guestevtchn leading to a crash since:
> 
> Yes, thanks for pointing it out. In which case, the type is not HVM or PV?

If you call those init routines (libxl__xswait_init, etc.) before the switch
statement, then you can still goto out.
> 
> >> +    }
> >> +
> >> +    libxl__xswait_init(&dsps->pvcontrol);
> >> +    libxl__ev_evtchn_init(&dsps->guest_evtchn);
> > 
> > we initialize them here.
> >> +    libxl__ev_xswatch_init(&dsps->guest_watch);
> >> +    libxl__ev_time_init(&dsps->guest_timeout);
> > 
> > I would instead recommend you move these initialization routines above the
> > 'type' check.
> 
> I think we should not return ERROR_FAIL when the type is not PV or HVM. We should abort the program
> like what we do in libxl__domain_save().

I would rather return an error. This is a library after all, so the controlling
program should take such drastic measures, not the library.
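
Roughly what I have in mind, as a minimal sketch. The types and helpers below
are simplified, hypothetical stand-ins (not the libxl definitions); the point
is only the ordering and the error return.

```c
#include <assert.h>

/* Hypothetical, simplified types; not the libxl definitions. */
enum dom_type { DOM_TYPE_INVALID = -1, DOM_TYPE_HVM = 1, DOM_TYPE_PV = 2 };

struct watch { int registered; int initialised; };

struct suspend_state {
    enum dom_type type;
    struct watch guest_watch;
};

static void watch_init(struct watch *w)
{
    w->registered = 0;
    w->initialised = 1;
}

/* Only valid on an initialised watch, whether registered or not. */
static void watch_cleanup(struct watch *w)
{
    assert(w->initialised);
    w->registered = 0;
}

/* Returns 0 on success, -1 on an unsupported type; never aborts. */
static int suspend_init(struct suspend_state *s, enum dom_type type)
{
    /* Initialise first, so every error path can run the cleanup. */
    watch_init(&s->guest_watch);

    s->type = type;
    if (type != DOM_TYPE_HVM && type != DOM_TYPE_PV)
        return -1;

    return 0;
}
```

With this ordering the caller's error path can run watch_cleanup() even when
the type check fails, and the decision to abort stays with the caller.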

> 
> > 
> >> +
> >> +    dsps->guest_evtchn.port = -1;
> >> +    dsps->guest_evtchn_lockfd = -1;
> >> +    dsps->guest_responded = 0;
> >> +    dsps->dm_savefile = libxl__device_model_savefile(gc, domid);
> >> +
> >> +    port = xs_suspend_evtchn_port(domid);
> >> +
> >> +    if (port >= 0) {
> >> +        rc = libxl__ctx_evtchn_init(gc);
> >> +        if (rc) goto out;
> >> +
> >> +        dsps->guest_evtchn.port =
> >> +            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
> >> +                                    domid, port, &dsps->guest_evtchn_lockfd);
> >> +
> >> +        if (dsps->guest_evtchn.port < 0) {
> >> +            LOG(WARN, "Suspend event channel initialization failed");
> >> +            rc = ERROR_FAIL;
> >> +            goto out;
> >> +        }
> >> +    }
> >> +
> >> +    rc = 0;
> >> +
> >> +out:
> >> +    return rc;
> >> +}
> >> +
> > 
> > .. snip..
> >>  struct libxl__domain_suspend_state {
> >> +    /* set by caller of libxl__domain_suspend_init */
> >> +    libxl__ao *ao;
> >> +    uint32_t domid;
> >> +
> >> +    /* private */
> >> +    int hvm;
> > 
> > How about 'is_hvm' and just use 'libxl_domain_type' type?
> > instead of having an int? You can just do:
> 
> In dss, it is 'int hvm'.
> Before this patch:
> if (dss->hvm) ...
> After this patch:
> if (dsps->hvm) ...

Right..
> 
> Thanks
> Wen Congyang
> 
> > 
> > if (type == LIBXL_DOMAIN_TYPE_HVM) ..

But what if you use that? As in dsps->type == LIBXL_DOMAIN_TYPE_HVM for example?

> > 
> > And to check for non-conforming types - you can make  libxl__domain_suspend_init
> > do this:
> > 
> >     if (type == LIBXL_DOMAIN_TYPE_INVALID) {
> >         rc = ERROR_FAIL;
> >         goto out; 
> >     }    
> > 
> > ?
> > 
> > 
> > .
> > 
> 
> 
> 


* Re: [PATCH v6 12/18] tools/libx{l, c}: add back channel to libxc
  2016-01-26 14:29       ` Konrad Rzeszutek Wilk
@ 2016-01-27  0:52         ` Wen Congyang
  0 siblings, 0 replies; 48+ messages in thread
From: Wen Congyang @ 2016-01-27  0:52 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On 01/26/2016 10:29 PM, Konrad Rzeszutek Wilk wrote:
>>> Or is that suppose to be done in another patch? If so you may want to
>>> mention that in the commit description?
>>
>> Do you mean where is the code that uses back_fd? It is in another series:
>> http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02904.html
> 
> Ah right that big patchset one. Hadn't looked at that yet - it is a bit hard
> without having a git tree on which the foundation patches (this patch
> series) are applied so I can look at the contents of the functions.

Yes, I will provide a git tree to help review.

Thanks
Wen Congyang

> 
> 
> 
> 
> .
> 


* Re: [PATCH v6 09/18] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()
  2016-01-26 14:27       ` Konrad Rzeszutek Wilk
@ 2016-01-27  0:53         ` Wen Congyang
  2016-01-27  0:55           ` Wen Congyang
  2016-01-27  2:06         ` Wen Congyang
  1 sibling, 1 reply; 48+ messages in thread
From: Wen Congyang @ 2016-01-27  0:53 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On 01/26/2016 10:27 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 26, 2016 at 03:04:39PM +0800, Wen Congyang wrote:
>> On 01/26/2016 02:59 AM, Konrad Rzeszutek Wilk wrote:
>>> On Wed, Dec 30, 2015 at 10:28:59AM +0800, Wen Congyang wrote:
>>>> Secondary vm is running in colo mode, we need to send
>>>> secondary vm's dirty page information to master at checkpoint,
>>>
>>> In previous patch you called it primary, so perhaps:
>>> s/master/primary/ ?
>>>
>>>> so we have to enable qemu logdirty on secondary.
>>>>
>>>> libxl__domain_suspend_common_switch_qemu_logdirty() is to enable
>>>> qemu logdirty. But it uses domain_save_state, and calls
>>>
>>> s/domain_save_state/libxl__domain_save_state/
>>>> libxl__xc_domain_saverestore_async_callback_done()
>>>> before exits. This can not be used for secondary vm.
>>>>
>>>> Update libxl__domain_suspend_common_switch_qemu_logdirty() to
>>>> introduce a new API libxl__domain_common_switch_qemu_logdirty().
>>>> This API only uses libxl__logdirty_switch, and calls
>>>> lds->callback before exits.
>>>
>>> One question - that perhaps had been part of the review earlier
>>> (if so it may be good to include this in the description
>>> so I don't ask silly questions):
>>>
>>> Why add this extra API? You could squash libxl__domain_suspend_common_switch_qemu_logdirty
>>> and libxl__domain_common_switch_qemu_logdirty code together
>>> and call it libxl_domain_common_and_suspend_common_switch_qemu_logdirty
>>> (ok, just kidding on the name). But - why not have one function
>>> instead of splitting the functionality in two?
>>
>> Do you mean that auto switch qemu logdirty when suspend the guest?
> 
> Squash the two functions - libxl__domain_common_switch_qemu_logdirty and
> libxl__domain_suspend_common_switch_qemu_logdirty together?

IIRC, no code needs such an API now.

Thanks
Wen Congyang

> 
> 
> .
> 


* Re: [PATCH v6 09/18] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()
  2016-01-27  0:53         ` Wen Congyang
@ 2016-01-27  0:55           ` Wen Congyang
  0 siblings, 0 replies; 48+ messages in thread
From: Wen Congyang @ 2016-01-27  0:55 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On 01/27/2016 08:53 AM, Wen Congyang wrote:
> On 01/26/2016 10:27 PM, Konrad Rzeszutek Wilk wrote:
>> On Tue, Jan 26, 2016 at 03:04:39PM +0800, Wen Congyang wrote:
>>> On 01/26/2016 02:59 AM, Konrad Rzeszutek Wilk wrote:
>>>> On Wed, Dec 30, 2015 at 10:28:59AM +0800, Wen Congyang wrote:
>>>>> Secondary vm is running in colo mode, we need to send
>>>>> secondary vm's dirty page information to master at checkpoint,
>>>>
>>>> In previous patch you called it primary, so perhaps:
>>>> s/master/primary/ ?
>>>>
>>>>> so we have to enable qemu logdirty on secondary.
>>>>>
>>>>> libxl__domain_suspend_common_switch_qemu_logdirty() is to enable
>>>>> qemu logdirty. But it uses domain_save_state, and calls
>>>>
>>>> s/domain_save_state/libxl__domain_save_state/
>>>>> libxl__xc_domain_saverestore_async_callback_done()
>>>>> before exits. This can not be used for secondary vm.
>>>>>
>>>>> Update libxl__domain_suspend_common_switch_qemu_logdirty() to
>>>>> introduce a new API libxl__domain_common_switch_qemu_logdirty().
>>>>> This API only uses libxl__logdirty_switch, and calls
>>>>> lds->callback before exits.
>>>>
>>>> One question - that perhaps had been part of the review earlier
>>>> (if so it may be good to include this in the description
>>>> so I don't ask silly questions):
>>>>
>>>> Why add this extra API? You could squash libxl__domain_suspend_common_switch_qemu_logdirty
>>>> and libxl__domain_common_switch_qemu_logdirty code together
>>>> and call it libxl_domain_common_and_suspend_common_switch_qemu_logdirty
>>>> (ok, just kidding on the name). But - why not have one function
>>>> instead of splitting the functionality in two?
>>>
>>> Do you mean that auto switch qemu logdirty when suspend the guest?
>>
>> Squash the two functions - libxl__domain_common_switch_qemu_logdirty and
>> libxl__domain_suspend_common_switch_qemu_logdirty together?
> 
> IIRC, no code needs such an API now.

Sorry for the mistake. I understand it now.

Thanks
Wen Congyang

> 
> Thanks
> Wen Congyang
> 
>>
>>
>> .
>>
> 
> 
> 
> 
> .
> 


* Re: [PATCH v6 09/18] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty()
  2016-01-26 14:27       ` Konrad Rzeszutek Wilk
  2016-01-27  0:53         ` Wen Congyang
@ 2016-01-27  2:06         ` Wen Congyang
  1 sibling, 0 replies; 48+ messages in thread
From: Wen Congyang @ 2016-01-27  2:06 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Lars Kurth, Changlong Xie, Wei Liu, Ian Campbell, Andrew Cooper,
	Jiang Yunhong, Ian Jackson, xen devel, Dong Eddie, Gui Jianfeng,
	Shriram Rajagopalan, Yang Hongyang

On 01/26/2016 10:27 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 26, 2016 at 03:04:39PM +0800, Wen Congyang wrote:
>> On 01/26/2016 02:59 AM, Konrad Rzeszutek Wilk wrote:
>>> On Wed, Dec 30, 2015 at 10:28:59AM +0800, Wen Congyang wrote:
>>>> Secondary vm is running in colo mode, we need to send
>>>> secondary vm's dirty page information to master at checkpoint,
>>>
>>> In previous patch you called it primary, so perhaps:
>>> s/master/primary/ ?
>>>
>>>> so we have to enable qemu logdirty on secondary.
>>>>
>>>> libxl__domain_suspend_common_switch_qemu_logdirty() is to enable
>>>> qemu logdirty. But it uses domain_save_state, and calls
>>>
>>> s/domain_save_state/libxl__domain_save_state/
>>>> libxl__xc_domain_saverestore_async_callback_done()
>>>> before exits. This can not be used for secondary vm.
>>>>
>>>> Update libxl__domain_suspend_common_switch_qemu_logdirty() to
>>>> introduce a new API libxl__domain_common_switch_qemu_logdirty().
>>>> This API only uses libxl__logdirty_switch, and calls
>>>> lds->callback before exits.
>>>
>>> One question - that perhaps had been part of the review earlier
>>> (if so it may be good to include this in the description
>>> so I don't ask silly questions):
>>>
>>> Why add this extra API? You could squash libxl__domain_suspend_common_switch_qemu_logdirty
>>> and libxl__domain_common_switch_qemu_logdirty code together
>>> and call it libxl_domain_common_and_suspend_common_switch_qemu_logdirty
>>> (ok, just kidding on the name). But - why not have one function
>>> instead of splitting the functionality in two?
>>
>> Do you mean that auto switch qemu logdirty when suspend the guest?
> 
> Squash the two functions - libxl__domain_common_switch_qemu_logdirty and
> libxl__domain_suspend_common_switch_qemu_logdirty together?

No, libxl__domain_suspend_common_switch_qemu_logdirty() is used by the save side.
libxl__domain_common_switch_qemu_logdirty() will be used by the restore side.
Please see patch 11 in the other series.
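
The split can be sketched as below. This is a simplified model with
hypothetical names, not the libxl code: the common function only knows about
the switch state and finishes by calling its callback, while the save-side
wrapper supplies a callback that also notifies the save helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model; not the libxl definitions. */
struct logdirty_switch {
    void (*callback)(struct logdirty_switch *lds, int rc);
    int last_rc;
};

struct save_state {
    struct logdirty_switch lds;   /* embedded, so the callback can recover it */
    int helper_notified;
};

/* Common part: knows only about lds and ends by calling lds->callback. */
static void common_switch_logdirty(struct logdirty_switch *lds, int enable)
{
    assert(lds && lds->callback);
    int rc = (enable == 0 || enable == 1) ? 0 : -1;  /* stand-in for real work */
    lds->callback(lds, rc);
}

/* Save-side completion: the only place that touches save_state. */
static void save_switch_done(struct logdirty_switch *lds, int rc)
{
    struct save_state *ss =
        (struct save_state *)((char *)lds - offsetof(struct save_state, lds));
    lds->last_rc = rc;
    ss->helper_notified = 1;
}

/* Save-side wrapper around the common function. */
static void save_switch_logdirty(struct save_state *ss, int enable)
{
    ss->lds.callback = save_switch_done;
    common_switch_logdirty(&ss->lds, enable);
}

/* Restore-side completion: no save_state involved. */
static void restore_switch_done(struct logdirty_switch *lds, int rc)
{
    lds->last_rc = rc;
}
```

The restore side calls common_switch_logdirty() directly with its own
callback, so it never needs a save_state.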

Thanks
Wen Congyang

> 
> 
> .
> 


Thread overview: 48+ messages
2015-12-30  2:28 [PATCH v6 00/18] Prerequisite patches for COLO Wen Congyang
2015-12-30  2:28 ` [PATCH v6 01/18] libxl/remus: init checkpoint_callback in Remus setup callback Wen Congyang
2016-01-25 17:29   ` Konrad Rzeszutek Wilk
2015-12-30  2:28 ` [PATCH v6 02/18] tools/libxl: move remus code into libxl_remus.c Wen Congyang
2015-12-30  2:28 ` [PATCH v6 03/18] tools/libxl: move save/restore code into libxl_dom_save.c Wen Congyang
2015-12-30  2:28 ` [PATCH v6 04/18] libxl/save: Refactor libxl__domain_suspend_state Wen Congyang
2016-01-25 17:29   ` Konrad Rzeszutek Wilk
2016-01-26  2:23     ` Wen Congyang
2016-01-26 14:32       ` Konrad Rzeszutek Wilk
2015-12-30  2:28 ` [PATCH v6 05/18] tools/libxc: support to resume uncooperative HVM guests Wen Congyang
2016-01-25 18:21   ` Konrad Rzeszutek Wilk
2016-01-26  2:53     ` Wen Congyang
2015-12-30  2:28 ` [PATCH v6 06/18] tools/libxl: introduce enum type libxl_checkpointed_stream Wen Congyang
2016-01-25 18:30   ` Konrad Rzeszutek Wilk
2015-12-30  2:28 ` [PATCH v6 07/18] migration/save: pass checkpointed_stream from libxl to libxc Wen Congyang
2015-12-30  2:28 ` [PATCH v6 08/18] tools/libxl: introduce libxl__domain_restore_device_model to load qemu state Wen Congyang
2015-12-30  2:28 ` [PATCH v6 09/18] tools/libxl: introduce libxl__domain_common_switch_qemu_logdirty() Wen Congyang
2016-01-25 18:59   ` Konrad Rzeszutek Wilk
2016-01-26  7:04     ` Wen Congyang
2016-01-26 14:27       ` Konrad Rzeszutek Wilk
2016-01-27  0:53         ` Wen Congyang
2016-01-27  0:55           ` Wen Congyang
2016-01-27  2:06         ` Wen Congyang
2015-12-30  2:29 ` [PATCH v6 10/18] tools/libxl: export logdirty_init Wen Congyang
2016-01-25 19:01   ` Konrad Rzeszutek Wilk
2015-12-30  2:29 ` [PATCH v6 11/18] tools/libxl: Add back channel to allow migration target send data back Wen Congyang
2016-01-25 19:17   ` Konrad Rzeszutek Wilk
2016-01-26  7:48     ` Wen Congyang
2015-12-30  2:29 ` [PATCH v6 12/18] tools/libx{l, c}: add back channel to libxc Wen Congyang
2016-01-25 19:41   ` Konrad Rzeszutek Wilk
2016-01-26  8:03     ` Wen Congyang
2016-01-26 14:29       ` Konrad Rzeszutek Wilk
2016-01-27  0:52         ` Wen Congyang
2015-12-30  2:29 ` [PATCH v6 13/18] tools/libxl: rename remus device to checkpoint device Wen Congyang
2016-01-25 19:42   ` Konrad Rzeszutek Wilk
2015-12-30  2:29 ` [PATCH v6 14/18] tools/libxl: fix backword compatibility after the automatic renaming Wen Congyang
2015-12-30  2:29 ` [PATCH v6 15/18] tools/libxl: adjust the indentation Wen Congyang
2016-01-25 19:44   ` Konrad Rzeszutek Wilk
2015-12-30  2:29 ` [PATCH v6 16/18] tools/libxl: store remus_ops in checkpoint device state Wen Congyang
2016-01-25 19:55   ` Konrad Rzeszutek Wilk
2016-01-26  8:07     ` Wen Congyang
2015-12-30  2:29 ` [PATCH v6 17/18] tools/libxl: move remus state into a seperate structure Wen Congyang
2016-01-25 19:59   ` Konrad Rzeszutek Wilk
2015-12-30  2:29 ` [PATCH v6 18/18] tools/libxl: seperate device init/cleanup from checkpoint device layer Wen Congyang
2016-01-25 20:01   ` Konrad Rzeszutek Wilk
2016-01-25 17:12 ` [PATCH v6 00/18] Prerequisite patches for COLO Konrad Rzeszutek Wilk
2016-01-25 20:06   ` Konrad Rzeszutek Wilk
2016-01-26  3:18     ` Wen Congyang
