From: Wen Congyang <wency@cn.fujitsu.com>
To: xen devel <xen-devel@lists.xen.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Ian Campbell <ian.campbell@citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	Wei Liu <wei.liu2@citrix.com>
Cc: Lars Kurth <lars.kurth@citrix.com>,
	Changlong Xie <xiecl.fnst@cn.fujitsu.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	Wen Congyang <wency@cn.fujitsu.com>,
	Gui Jianfeng <guijianfeng@cn.fujitsu.com>,
	Jiang Yunhong <yunhong.jiang@intel.com>,
	Dong Eddie <eddie.dong@intel.com>,
	Shriram Rajagopalan <rshriram@cs.ubc.ca>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	Yang Hongyang <hongyang.yang@easystack.cn>
Subject: [PATCH v7 12/18] tools/libx{l, c}: add back channel to libxc
Date: Fri, 29 Jan 2016 13:27:28 +0800
Message-ID: <1454045254-3711-13-git-send-email-wency@cn.fujitsu.com>
In-Reply-To: <1454045254-3711-1-git-send-email-wency@cn.fujitsu.com>

In COLO mode, both VMs are running, and are considered in sync if the
visible network traffic is identical.  After some time, they fall out of
sync.

At this point, the two VMs have definitely diverged.  Let's call the
primary's dirty bitmap set A and the secondary's dirty bitmap set B.

Sets A and B are different.

Under normal migration, the page data for set A will be sent from the
primary to the secondary.

However, the set difference B - A (the pages in B but not in A, let's
call this C) is out-of-date on the secondary (with respect to the
primary) and will not be sent by the primary (to the secondary), because
the primary did not dirty that memory.  The secondary needs the page data
for C to reconstruct an exact copy of the primary at the checkpoint.

The secondary cannot calculate C as it doesn't know A.  Instead, the
secondary must send B to the primary, at which point the primary
calculates the union of A and B (let's call this D), which is all the
pages dirtied by either the primary or the secondary, and sends all page
data covered by D.

In the general case, D is a superset of both A and B.  Without the
backchannel dirty bitmap, a COLO checkpoint can't reconstruct a valid
copy of the primary.
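
As an illustration of the set arithmetic above, here is a minimal sketch
of the union step in C.  It assumes the dirty bitmaps are stored as arrays
of 64-bit words with one bit per pfn; the helper names
dirty_bitmap_union() and dirty_bitmap_test() are hypothetical and are not
part of libxc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: merge the secondary's dirty bitmap (set B) into
 * the primary's dirty bitmap (set A), leaving D = A | B in 'a'.
 */
static void dirty_bitmap_union(uint64_t *a, const uint64_t *b,
                               size_t nr_words)
{
    assert(a != NULL && b != NULL);

    for (size_t i = 0; i < nr_words; i++)
        a[i] |= b[i];
}

/* Hypothetical helper: return 1 if 'pfn' is marked dirty in 'bm'. */
static int dirty_bitmap_test(const uint64_t *bm, size_t pfn)
{
    assert(bm != NULL);

    return (int)((bm[pfn / 64] >> (pfn % 64)) & 1);
}
```

After the union, every pfn set in either A or B is set in the result, so
the pages in C = B - A are covered as well.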

We transfer the dirty bitmap on the libxc side, so we need to introduce
a back channel to libxc.

Note: this differs from the paper. We changed the original design to
the current one because of the following concerns:
1. The original design needs extra memory on the Secondary host. When
   there are multiple backups on one host, the memory cost is high.
2. The memory cache code would add another 1k+ lines, which would make
   the review more time-consuming.

Note: the back channel will be used by the patch
 libxc/restore: send dirty pfn list to primary when checkpoint under COLO
to send the dirty pfn list from the secondary to the primary. That patch
is posted in another series.
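
For illustration only, writing a pfn list to a back channel file
descriptor could look roughly like the sketch below.  The wire format (a
64-bit count followed by that many 64-bit pfns) and the names write_all()
and send_dirty_pfns() are assumptions made for this example; they are not
taken from the follow-up patch or from libxc.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Write exactly 'len' bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical sender: emit the pfn count, then the pfns themselves,
 * on the back channel fd.  Returns 0 on success, -1 on error.
 */
static int send_dirty_pfns(int back_fd, const uint64_t *pfns,
                           uint64_t count)
{
    if (write_all(back_fd, &count, sizeof(count)) < 0)
        return -1;

    return write_all(back_fd, pfns, (size_t)count * sizeof(*pfns));
}
```

The receiving side would read the count first and then the matching
number of pfns from the other end of the same descriptor pair.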

Signed-off-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/include/xenguest.h   |  4 ++--
 tools/libxc/xc_nomigrate.c       |  4 ++--
 tools/libxc/xc_sr_restore.c      |  2 +-
 tools/libxc/xc_sr_save.c         |  2 +-
 tools/libxl/libxl_save_callout.c | 39 ++++++++++++++++++++++++++-------------
 tools/libxl/libxl_save_helper.c  |  8 ++++++--
 6 files changed, 38 insertions(+), 21 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index affc42b..ff230a4 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -88,7 +88,7 @@ struct save_callbacks {
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
                    uint32_t max_factor, uint32_t flags /* XCFLAGS_xxx */,
                    struct save_callbacks* callbacks, int hvm,
-                   int checkpointed_stream);
+                   int checkpointed_stream, int back_fd);
 
 /* callbacks provided by xc_domain_restore */
 struct restore_callbacks {
@@ -127,7 +127,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
                       unsigned long *console_mfn, domid_t console_domid,
                       unsigned int hvm, unsigned int pae, int superpages,
                       int checkpointed_stream,
-                      struct restore_callbacks *callbacks);
+                      struct restore_callbacks *callbacks, int back_fd);
 
 /**
  * This function will create a domain for a paravirtualized Linux
diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
index c9124df..089f767 100644
--- a/tools/libxc/xc_nomigrate.c
+++ b/tools/libxc/xc_nomigrate.c
@@ -23,7 +23,7 @@
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
                    uint32_t max_factor, uint32_t flags,
                    struct save_callbacks* callbacks, int hvm,
-                   int checkpointed_stream)
+                   int checkpointed_stream, int back_fd)
 {
     errno = ENOSYS;
     return -1;
@@ -35,7 +35,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
                       unsigned long *console_mfn, domid_t console_domid,
                       unsigned int hvm, unsigned int pae, int superpages,
                       int checkpointed_stream,
-                      struct restore_callbacks *callbacks)
+                      struct restore_callbacks *callbacks, int back_fd)
 {
     errno = ENOSYS;
     return -1;
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index d4d33fd..b0f47b5 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -726,7 +726,7 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
                       unsigned long *console_gfn, domid_t console_domid,
                       unsigned int hvm, unsigned int pae, int superpages,
                       int checkpointed_stream,
-                      struct restore_callbacks *callbacks)
+                      struct restore_callbacks *callbacks, int back_fd)
 {
     struct xc_sr_context ctx =
         {
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index 0bea97e..2cc5b45 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -831,7 +831,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
                    uint32_t max_iters, uint32_t max_factor, uint32_t flags,
                    struct save_callbacks* callbacks, int hvm,
-                   int checkpointed_stream)
+                   int checkpointed_stream, int back_fd)
 {
     struct xc_sr_context ctx =
         {
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 416b318..631e3e2 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -27,7 +27,7 @@
  */
 static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
                        const char *mode_arg,
-                       int stream_fd,
+                       int stream_fd, int back_fd,
                        const int *preserve_fds, int num_preserve_fds,
                        const unsigned long *argnums, int num_argnums);
 
@@ -50,6 +50,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
     /* Convenience aliases */
     const uint32_t domid = dcs->guest_domid;
     const int restore_fd = dcs->libxc_fd;
+    const int send_fd = dcs->send_fd;
     libxl__domain_build_state *const state = &dcs->build_state;
 
     unsigned cbflags =
@@ -71,7 +72,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
     shs->caller_state = dcs;
     shs->need_results = 1;
 
-    run_helper(egc, shs, "--restore-domain", restore_fd, 0, 0,
+    run_helper(egc, shs, "--restore-domain", restore_fd, send_fd, 0, 0,
                argnums, ARRAY_SIZE(argnums));
 }
 
@@ -95,7 +96,7 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_save_state *dss,
     shs->caller_state = dss;
     shs->need_results = 0;
 
-    run_helper(egc, shs, "--save-domain", dss->fd,
+    run_helper(egc, shs, "--save-domain", dss->fd, dss->recv_fd,
                NULL, 0,
                argnums, ARRAY_SIZE(argnums));
     return;
@@ -118,14 +119,29 @@ void libxl__save_helper_init(libxl__save_helper_state *shs)
 }
 
 /*----- helper execution -----*/
+static int dup_fd_helper(libxl__gc *gc, int fd, const char *what)
+{
+    int dup_fd = fd;
+
+    if (fd <= 2) {
+        dup_fd = dup(fd);
+        if (dup_fd < 0) {
+            LOGE(ERROR,"dup %s", what);
+            exit(-1);
+        }
+    }
+    libxl_fd_set_cloexec(CTX, dup_fd, 0);
+
+    return dup_fd;
+}
 
 static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
-                       const char *mode_arg, int stream_fd,
+                       const char *mode_arg, int stream_fd, int back_fd,
                        const int *preserve_fds, int num_preserve_fds,
                        const unsigned long *argnums, int num_argnums)
 {
     STATE_AO_GC(shs->ao);
-    const char *args[4 + num_argnums];
+    const char *args[5 + num_argnums];
     const char **arg = args;
     int i, rc;
 
@@ -153,6 +169,7 @@ static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
     *arg++ = getenv("LIBXL_SAVE_HELPER") ?: LIBEXEC_BIN "/" "libxl-save-helper";
     *arg++ = mode_arg;
     const char **stream_fd_arg = arg++;
+    const char **back_fd_arg = arg++;
     for (i=0; i<num_argnums; i++)
         *arg++ = GCSPRINTF("%lu", argnums[i]);
     *arg++ = 0;
@@ -177,16 +194,12 @@ static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
 
     pid_t pid = libxl__ev_child_fork(gc, &shs->child, helper_exited);
     if (!pid) {
-        if (stream_fd <= 2) {
-            stream_fd = dup(stream_fd);
-            if (stream_fd < 0) {
-                LOGE(ERROR,"dup migration stream fd");
-                exit(-1);
-            }
-        }
-        libxl_fd_set_cloexec(CTX, stream_fd, 0);
+        stream_fd = dup_fd_helper(gc, stream_fd, "migration stream fd");
         *stream_fd_arg = GCSPRINTF("%d", stream_fd);
 
+        back_fd = dup_fd_helper(gc, back_fd, "migration back channel fd");
+        *back_fd_arg = GCSPRINTF("%d", back_fd);
+
         for (i=0; i<num_preserve_fds; i++)
             if (preserve_fds[i] >= 0) {
                 assert(preserve_fds[i] > 2);
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 6bdcf13..9bdcf41 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -238,6 +238,7 @@ static struct restore_callbacks helper_restore_callbacks;
 int main(int argc, char **argv)
 {
     int r;
+    int back_fd;
 
 #define NEXTARG (++argv, assert(*argv), *argv)
 
@@ -247,6 +248,7 @@ int main(int argc, char **argv)
     if (!strcmp(mode,"--save-domain")) {
 
         io_fd =                    atoi(NEXTARG);
+        back_fd =                  atoi(NEXTARG);
         uint32_t dom =             strtoul(NEXTARG,0,10);
         uint32_t max_iters =       strtoul(NEXTARG,0,10);
         uint32_t max_factor =      strtoul(NEXTARG,0,10);
@@ -262,12 +264,14 @@ int main(int argc, char **argv)
         setup_signals(save_signal_handler);
 
         r = xc_domain_save(xch, io_fd, dom, max_iters, max_factor, flags,
-                           &helper_save_callbacks, hvm, checkpointed_stream);
+                           &helper_save_callbacks, hvm, checkpointed_stream,
+                           back_fd);
         complete(r);
 
     } else if (!strcmp(mode,"--restore-domain")) {
 
         io_fd =                    atoi(NEXTARG);
+        back_fd =                  atoi(NEXTARG);
         uint32_t dom =             strtoul(NEXTARG,0,10);
         unsigned store_evtchn =    strtoul(NEXTARG,0,10);
         domid_t store_domid =      strtoul(NEXTARG,0,10);
@@ -292,7 +296,7 @@ int main(int argc, char **argv)
                               store_domid, console_evtchn, &console_mfn,
                               console_domid, hvm, pae, superpages,
                               checkpointed,
-                              &helper_restore_callbacks);
+                              &helper_restore_callbacks, back_fd);
         helper_stub_restore_results(store_mfn,console_mfn,0);
         complete(r);
 
-- 
2.5.0
