All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH Remus v2 00/10] Remus support for Migration-v2
@ 2015-05-08  9:33 Yang Hongyang
  2015-05-08  9:33 ` [PATCH Remus v2 01/10] tools/libxc: adjust the memory allocation for migration Yang Hongyang
                   ` (10 more replies)
  0 siblings, 11 replies; 40+ messages in thread
From: Yang Hongyang @ 2015-05-08  9:33 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, rshriram, ian.jackson

This patchset implement the Remus support for Migration v2 but without
memory compressing.

The series can be found on github:
https://github.com/macrosheep/xen/tree/Remus-newmig-v2

PATCH 1-7: Some refactor and prepare work.
PATCH 8-9: The main Remus loop implement.
PATCH 10: Fix for Remus.

v2:
 - move to_send bitmap to ctx->save union and rename it to dirty_bitmap
 - introduce setup() and cleanup() on save
 - rename send_some_pages to send_dirty_pages
 - remove "defer the setting of HVM_PARAM_IDENT_PT", it should be fixed
   on hypervisor side
 - the last patch still there for my test purpose until Andrew find a
   suitable solution(which we commented on the first series) :)

v1:
initial support

Summary of changes:
M = modified
A = acked
N = new,
no mark = unchanged from last round

Yang Hongyang (10):
M  tools/libxc: adjust the memory allocation for migration
N  tools/libxc: introduce setup() and cleanup() on save
N  tools/libxc: rename send_some_pages to send_dirty_pages
N  tools/libxc: introduce DECLARE_HYPERCALL_BUFFER_USER_POINTER
M  tools/libxc: reuse send_dirty_pages() in send_all_pages()
   tools/libxc: introduce process_record()
   tools/libxc: split read/handle qemu info
   tools/libxc: implement Remus checkpointed save
   tools/libxc: implement Remus checkpointed restore
   tools/libxc: X86_PV_INFO can be sent multiple times under Remus

 tools/libxc/include/xenctrl.h       |   8 ++
 tools/libxc/include/xenguest.h      |   1 +
 tools/libxc/xc_bitops.h             |   5 +
 tools/libxc/xc_sr_common.h          |  15 +++
 tools/libxc/xc_sr_restore.c         | 179 ++++++++++++++++++++++-----
 tools/libxc/xc_sr_restore_x86_hvm.c |  28 ++++-
 tools/libxc/xc_sr_restore_x86_pv.c  |   2 +-
 tools/libxc/xc_sr_save.c            | 234 +++++++++++++++++++-----------------
 tools/libxl/libxl_dom.c             |   1 +
 9 files changed, 330 insertions(+), 143 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH Remus v2 01/10] tools/libxc: adjust the memory allocation for migration
  2015-05-08  9:33 [PATCH Remus v2 00/10] Remus support for Migration-v2 Yang Hongyang
@ 2015-05-08  9:33 ` Yang Hongyang
  2015-05-08  9:51   ` Andrew Cooper
  2015-05-11 11:50   ` Ian Campbell
  2015-05-08  9:33 ` [PATCH Remus v2 02/10] tools/libxc: introduce setup() and cleanup() on save Yang Hongyang
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 40+ messages in thread
From: Yang Hongyang @ 2015-05-08  9:33 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, rshriram, ian.jackson

Move the memory allocation before the concrete live/nolive save
in order to avoid the free/alloc memory loop when using Remus.
Rename the to_send bitmap to dirty_bitmap.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 tools/libxc/xc_sr_common.h |  1 +
 tools/libxc/xc_sr_save.c   | 68 ++++++++++++++++++++--------------------------
 2 files changed, 31 insertions(+), 38 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index ef42412..95e3f5d 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -182,6 +182,7 @@ struct xc_sr_context
             unsigned nr_batch_pfns;
             unsigned long *deferred_pages;
             unsigned long nr_deferred_pages;
+            xc_hypercall_buffer_t dirty_bitmap_hbuf;
         } save;
 
         struct /* Restore data. */
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index 5d9c267..cc3e6b1 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -475,24 +475,12 @@ static int update_progress_string(struct xc_sr_context *ctx,
 static int send_domain_memory_live(struct xc_sr_context *ctx)
 {
     xc_interface *xch = ctx->xch;
-    DECLARE_HYPERCALL_BUFFER(unsigned long, to_send);
     xc_shadow_op_stats_t stats = { 0, ctx->save.p2m_size };
     char *progress_str = NULL;
     unsigned x;
     int rc = -1;
-
-    to_send = xc_hypercall_buffer_alloc_pages(
-        xch, to_send, NRPAGES(bitmap_size(ctx->save.p2m_size)));
-
-    ctx->save.batch_pfns = malloc(MAX_BATCH_SIZE *
-                                  sizeof(*ctx->save.batch_pfns));
-    ctx->save.deferred_pages = calloc(1, bitmap_size(ctx->save.p2m_size));
-
-    if ( !ctx->save.batch_pfns || !to_send || !ctx->save.deferred_pages )
-    {
-        ERROR("Unable to allocate memory for to_{send,fix}/batch bitmaps");
-        goto out;
-    }
+    DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
+                                    (&ctx->save.dirty_bitmap_hbuf));
 
     rc = enable_logdirty(ctx);
     if ( rc )
@@ -512,7 +500,7 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
     {
         if ( xc_shadow_control(
                  xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_CLEAN,
-                 HYPERCALL_BUFFER(to_send), ctx->save.p2m_size,
+                 HYPERCALL_BUFFER(dirty_bitmap), ctx->save.p2m_size,
                  NULL, 0, &stats) != ctx->save.p2m_size )
         {
             PERROR("Failed to retrieve logdirty bitmap");
@@ -527,7 +515,7 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
         if ( rc )
             goto out;
 
-        rc = send_some_pages(ctx, to_send, stats.dirty_count);
+        rc = send_some_pages(ctx, dirty_bitmap, stats.dirty_count);
         if ( rc )
             goto out;
     }
@@ -538,7 +526,7 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
 
     if ( xc_shadow_control(
              xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_CLEAN,
-             HYPERCALL_BUFFER(to_send), ctx->save.p2m_size,
+             HYPERCALL_BUFFER(dirty_bitmap), ctx->save.p2m_size,
              NULL, 0, &stats) != ctx->save.p2m_size )
     {
         PERROR("Failed to retrieve logdirty bitmap");
@@ -550,9 +538,9 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
     if ( rc )
         goto out;
 
-    bitmap_or(to_send, ctx->save.deferred_pages, ctx->save.p2m_size);
+    bitmap_or(dirty_bitmap, ctx->save.deferred_pages, ctx->save.p2m_size);
 
-    rc = send_some_pages(ctx, to_send,
+    rc = send_some_pages(ctx, dirty_bitmap,
                          stats.dirty_count + ctx->save.nr_deferred_pages);
     if ( rc )
         goto out;
@@ -578,7 +566,7 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
 
         if ( xc_shadow_control(
                  xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_PEEK,
-                 HYPERCALL_BUFFER(to_send), ctx->save.p2m_size,
+                 HYPERCALL_BUFFER(dirty_bitmap), ctx->save.p2m_size,
                  NULL, 0, &stats) != ctx->save.p2m_size )
         {
             PERROR("Failed to retrieve logdirty bitmap");
@@ -593,10 +581,6 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
   out:
     xc_set_progress_prefix(xch, NULL);
     free(progress_str);
-    xc_hypercall_buffer_free_pages(xch, to_send,
-                                   NRPAGES(bitmap_size(ctx->save.p2m_size)));
-    free(ctx->save.deferred_pages);
-    free(ctx->save.batch_pfns);
     return rc;
 }
 
@@ -609,17 +593,6 @@ static int send_domain_memory_nonlive(struct xc_sr_context *ctx)
     xc_interface *xch = ctx->xch;
     int rc = -1;
 
-    ctx->save.batch_pfns = malloc(MAX_BATCH_SIZE *
-                                  sizeof(*ctx->save.batch_pfns));
-    ctx->save.deferred_pages = calloc(1, bitmap_size(ctx->save.p2m_size));
-
-    if ( !ctx->save.batch_pfns || !ctx->save.deferred_pages )
-    {
-        PERROR("Failed to allocate memory for nonlive tracking structures");
-        errno = ENOMEM;
-        goto err;
-    }
-
     rc = suspend_domain(ctx);
     if ( rc )
         goto err;
@@ -631,9 +604,6 @@ static int send_domain_memory_nonlive(struct xc_sr_context *ctx)
         goto err;
 
  err:
-    free(ctx->save.deferred_pages);
-    free(ctx->save.batch_pfns);
-
     return rc;
 }
 
@@ -644,6 +614,23 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
 {
     xc_interface *xch = ctx->xch;
     int rc, saved_rc = 0, saved_errno = 0;
+    DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
+                                    (&ctx->save.dirty_bitmap_hbuf));
+
+    dirty_bitmap = xc_hypercall_buffer_alloc_pages(
+                   xch, dirty_bitmap, NRPAGES(bitmap_size(ctx->save.p2m_size)));
+    ctx->save.batch_pfns = malloc(MAX_BATCH_SIZE *
+                                  sizeof(*ctx->save.batch_pfns));
+    ctx->save.deferred_pages = calloc(1, bitmap_size(ctx->save.p2m_size));
+
+    if ( !ctx->save.batch_pfns || !dirty_bitmap || !ctx->save.deferred_pages )
+    {
+        ERROR("Unable to allocate memory for dirty bitmaps, batch pfns and"
+              " deferred pages");
+        rc = -1;
+        errno = ENOMEM;
+        goto err;
+    }
 
     IPRINTF("Saving domain %d, type %s",
             ctx->domid, dhdr_type_to_str(guest_type));
@@ -704,6 +691,11 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
     if ( rc )
         PERROR("Failed to clean up");
 
+    xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
+                                   NRPAGES(bitmap_size(ctx->save.p2m_size)));
+    free(ctx->save.deferred_pages);
+    free(ctx->save.batch_pfns);
+
     if ( saved_rc )
     {
         rc = saved_rc;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH Remus v2 02/10] tools/libxc: introduce setup() and cleanup() on save
  2015-05-08  9:33 [PATCH Remus v2 00/10] Remus support for Migration-v2 Yang Hongyang
  2015-05-08  9:33 ` [PATCH Remus v2 01/10] tools/libxc: adjust the memory allocation for migration Yang Hongyang
@ 2015-05-08  9:33 ` Yang Hongyang
  2015-05-08  9:45   ` Andrew Cooper
  2015-05-08  9:33 ` [PATCH Remus v2 03/10] tools/libxc: rename send_some_pages to send_dirty_pages Yang Hongyang
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 40+ messages in thread
From: Yang Hongyang @ 2015-05-08  9:33 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, rshriram, ian.jackson

introduce setup() and cleanup() which subsume the
ctx->save.ops.{setup,cleanup}() calls and also do allocate/free
necessary memory.
The SHADOW_OP_OFF hypercall also included in the cleanup().

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 tools/libxc/xc_sr_save.c | 72 +++++++++++++++++++++++++++++-------------------
 1 file changed, 44 insertions(+), 28 deletions(-)

diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index cc3e6b1..2394bc4 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -607,13 +607,10 @@ static int send_domain_memory_nonlive(struct xc_sr_context *ctx)
     return rc;
 }
 
-/*
- * Save a domain.
- */
-static int save(struct xc_sr_context *ctx, uint16_t guest_type)
+static int setup(struct xc_sr_context *ctx)
 {
     xc_interface *xch = ctx->xch;
-    int rc, saved_rc = 0, saved_errno = 0;
+    int rc;
     DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
                                     (&ctx->save.dirty_bitmap_hbuf));
 
@@ -632,13 +629,51 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
         goto err;
     }
 
-    IPRINTF("Saving domain %d, type %s",
-            ctx->domid, dhdr_type_to_str(guest_type));
-
     rc = ctx->save.ops.setup(ctx);
     if ( rc )
         goto err;
 
+    rc = 0;
+
+ err:
+    return rc;
+
+}
+
+static void cleanup(struct xc_sr_context *ctx)
+{
+    xc_interface *xch = ctx->xch;
+    DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
+                                    (&ctx->save.dirty_bitmap_hbuf));
+
+    xc_shadow_control(xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_OFF,
+                      NULL, 0, NULL, 0, NULL);
+
+    if ( ctx->save.ops.cleanup(ctx) )
+        PERROR("Failed to clean up");
+
+    if ( dirty_bitmap )
+        xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
+                                   NRPAGES(bitmap_size(ctx->save.p2m_size)));
+    free(ctx->save.deferred_pages);
+    free(ctx->save.batch_pfns);
+}
+
+/*
+ * Save a domain.
+ */
+static int save(struct xc_sr_context *ctx, uint16_t guest_type)
+{
+    xc_interface *xch = ctx->xch;
+    int rc;
+
+    rc = setup(ctx);
+    if ( rc )
+        goto err;
+
+    IPRINTF("Saving domain %d, type %s",
+            ctx->domid, dhdr_type_to_str(guest_type));
+
     xc_report_progress_single(xch, "Start of stream");
 
     rc = write_headers(ctx, guest_type);
@@ -679,29 +714,10 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
     goto done;
 
  err:
-    saved_errno = errno;
-    saved_rc = rc;
     PERROR("Save failed");
 
  done:
-    xc_shadow_control(xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_OFF,
-                      NULL, 0, NULL, 0, NULL);
-
-    rc = ctx->save.ops.cleanup(ctx);
-    if ( rc )
-        PERROR("Failed to clean up");
-
-    xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
-                                   NRPAGES(bitmap_size(ctx->save.p2m_size)));
-    free(ctx->save.deferred_pages);
-    free(ctx->save.batch_pfns);
-
-    if ( saved_rc )
-    {
-        rc = saved_rc;
-        errno = saved_errno;
-    }
-
+    cleanup(ctx);
     return rc;
 };
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH Remus v2 03/10] tools/libxc: rename send_some_pages to send_dirty_pages
  2015-05-08  9:33 [PATCH Remus v2 00/10] Remus support for Migration-v2 Yang Hongyang
  2015-05-08  9:33 ` [PATCH Remus v2 01/10] tools/libxc: adjust the memory allocation for migration Yang Hongyang
  2015-05-08  9:33 ` [PATCH Remus v2 02/10] tools/libxc: introduce setup() and cleanup() on save Yang Hongyang
@ 2015-05-08  9:33 ` Yang Hongyang
  2015-05-08 10:11   ` Andrew Cooper
  2015-05-08  9:33 ` [PATCH Remus v2 04/10] tools/libxc: introduce DECLARE_HYPERCALL_BUFFER_USER_POINTER Yang Hongyang
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 40+ messages in thread
From: Yang Hongyang @ 2015-05-08  9:33 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, rshriram, ian.jackson

rename send_some_pages to send_dirty_pages, no functional change.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 tools/libxc/xc_sr_save.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index 2394bc4..9ca4336 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -379,7 +379,7 @@ static int send_all_pages(struct xc_sr_context *ctx)
  *
  * Bitmap is bounded by p2m_size.
  */
-static int send_some_pages(struct xc_sr_context *ctx,
+static int send_dirty_pages(struct xc_sr_context *ctx,
                            unsigned long *bitmap,
                            unsigned long entries)
 {
@@ -515,7 +515,7 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
         if ( rc )
             goto out;
 
-        rc = send_some_pages(ctx, dirty_bitmap, stats.dirty_count);
+        rc = send_dirty_pages(ctx, dirty_bitmap, stats.dirty_count);
         if ( rc )
             goto out;
     }
@@ -540,7 +540,7 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
 
     bitmap_or(dirty_bitmap, ctx->save.deferred_pages, ctx->save.p2m_size);
 
-    rc = send_some_pages(ctx, dirty_bitmap,
+    rc = send_dirty_pages(ctx, dirty_bitmap,
                          stats.dirty_count + ctx->save.nr_deferred_pages);
     if ( rc )
         goto out;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH Remus v2 04/10] tools/libxc: introduce DECLARE_HYPERCALL_BUFFER_USER_POINTER
  2015-05-08  9:33 [PATCH Remus v2 00/10] Remus support for Migration-v2 Yang Hongyang
                   ` (2 preceding siblings ...)
  2015-05-08  9:33 ` [PATCH Remus v2 03/10] tools/libxc: rename send_some_pages to send_dirty_pages Yang Hongyang
@ 2015-05-08  9:33 ` Yang Hongyang
  2015-05-08 10:16   ` Andrew Cooper
  2015-05-11 11:53   ` Ian Campbell
  2015-05-08  9:33 ` [PATCH Remus v2 05/10] tools/libxc: reuse send_dirty_pages() in send_all_pages() Yang Hongyang
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 40+ messages in thread
From: Yang Hongyang @ 2015-05-08  9:33 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, rshriram, ian.jackson

Define a user pointer that can access the hypercall buffer data
Useful when you only need to access the hypercall buffer data

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 tools/libxc/include/xenctrl.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 6994c51..12e8c36 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -296,6 +296,14 @@ typedef struct xc_hypercall_buffer xc_hypercall_buffer_t;
     }
 
 /*
+ * Define a user pointer that can access the hypercall buffer data
+ *
+ * Useful when you only need to access the hypercall buffer data
+ */
+#define DECLARE_HYPERCALL_BUFFER_USER_POINTER(_type, _name, _hbuf)  \
+    _type *_name = _hbuf->hbuf;
+
+/*
  * Declare the necessary data structure to allow a hypercall buffer
  * passed as an argument to a function to be used in the normal way.
  */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH Remus v2 05/10] tools/libxc: reuse send_dirty_pages() in send_all_pages()
  2015-05-08  9:33 [PATCH Remus v2 00/10] Remus support for Migration-v2 Yang Hongyang
                   ` (3 preceding siblings ...)
  2015-05-08  9:33 ` [PATCH Remus v2 04/10] tools/libxc: introduce DECLARE_HYPERCALL_BUFFER_USER_POINTER Yang Hongyang
@ 2015-05-08  9:33 ` Yang Hongyang
  2015-05-08 10:17   ` Andrew Cooper
  2015-05-08  9:33 ` [PATCH Remus v2 06/10] tools/libxc: introduce process_record() Yang Hongyang
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 40+ messages in thread
From: Yang Hongyang @ 2015-05-08  9:33 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, rshriram, ian.jackson

introduce bitmap_set() to set the entire bitmap.
in send_all_pages(), set the entire bitmap and call send_dirty_pages().

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 tools/libxc/xc_bitops.h  |  5 +++++
 tools/libxc/xc_sr_save.c | 44 ++++++++++++++------------------------------
 2 files changed, 19 insertions(+), 30 deletions(-)

diff --git a/tools/libxc/xc_bitops.h b/tools/libxc/xc_bitops.h
index dfce3b8..cd749f4 100644
--- a/tools/libxc/xc_bitops.h
+++ b/tools/libxc/xc_bitops.h
@@ -26,6 +26,11 @@ static inline unsigned long *bitmap_alloc(int nr_bits)
     return calloc(1, bitmap_size(nr_bits));
 }
 
+static inline void bitmap_set(unsigned long *addr, int nr_bits)
+{
+    memset(addr, 0xff, bitmap_size(nr_bits));
+}
+
 static inline void bitmap_clear(unsigned long *addr, int nr_bits)
 {
     memset(addr, 0, bitmap_size(nr_bits));
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index 9ca4336..7efffe6 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -344,36 +344,6 @@ static int suspend_domain(struct xc_sr_context *ctx)
 }
 
 /*
- * Send all pages in the guests p2m.  Used as the first iteration of the live
- * migration loop, and for a non-live save.
- */
-static int send_all_pages(struct xc_sr_context *ctx)
-{
-    xc_interface *xch = ctx->xch;
-    xen_pfn_t p;
-    int rc;
-
-    for ( p = 0; p < ctx->save.p2m_size; ++p )
-    {
-        rc = add_to_batch(ctx, p);
-        if ( rc )
-            return rc;
-
-        /* Update progress every 4MB worth of memory sent. */
-        if ( (p & ((1U << (22 - 12)) - 1)) == 0 )
-            xc_report_progress_step(xch, p, ctx->save.p2m_size);
-    }
-
-    rc = flush_batch(ctx);
-    if ( rc )
-        return rc;
-
-    xc_report_progress_step(xch, ctx->save.p2m_size,
-                            ctx->save.p2m_size);
-    return 0;
-}
-
-/*
  * Send a subset of pages in the guests p2m, according to the provided bitmap.
  * Used for each subsequent iteration of the live migration loop.
  *
@@ -415,6 +385,20 @@ static int send_dirty_pages(struct xc_sr_context *ctx,
     return 0;
 }
 
+/*
+ * Send all pages in the guests p2m.  Used as the first iteration of the live
+ * migration loop, and for a non-live save.
+ */
+static int send_all_pages(struct xc_sr_context *ctx)
+{
+    DECLARE_HYPERCALL_BUFFER_USER_POINTER(unsigned long, dirty_bitmap,
+                                          (&ctx->save.dirty_bitmap_hbuf));
+
+    bitmap_set(dirty_bitmap, ctx->save.p2m_size);
+
+    return send_dirty_pages(ctx, dirty_bitmap, ctx->save.p2m_size);
+}
+
 static int enable_logdirty(struct xc_sr_context *ctx)
 {
     xc_interface *xch = ctx->xch;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH Remus v2 06/10] tools/libxc: introduce process_record()
  2015-05-08  9:33 [PATCH Remus v2 00/10] Remus support for Migration-v2 Yang Hongyang
                   ` (4 preceding siblings ...)
  2015-05-08  9:33 ` [PATCH Remus v2 05/10] tools/libxc: reuse send_dirty_pages() in send_all_pages() Yang Hongyang
@ 2015-05-08  9:33 ` Yang Hongyang
  2015-05-08  9:33 ` [PATCH Remus v2 07/10] tools/libxc: split read/handle qemu info Yang Hongyang
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 40+ messages in thread
From: Yang Hongyang @ 2015-05-08  9:33 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, rshriram, ian.jackson

Move record handle codes into a function process_record().
It will be used multiple times by Remus.
No functional change.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/libxc/xc_sr_restore.c | 77 +++++++++++++++++++++++++--------------------
 1 file changed, 43 insertions(+), 34 deletions(-)

diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 0bf4bae..53bd674 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -468,6 +468,48 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
     return rc;
 }
 
+static int process_record(struct xc_sr_context *ctx, struct xc_sr_record *rec)
+{
+    xc_interface *xch = ctx->xch;
+    int rc = 0;
+
+    switch ( rec->type )
+    {
+    case REC_TYPE_END:
+        break;
+
+    case REC_TYPE_PAGE_DATA:
+        rc = handle_page_data(ctx, rec);
+        break;
+
+    case REC_TYPE_VERIFY:
+        DPRINTF("Verify mode enabled");
+        ctx->restore.verify = true;
+        break;
+
+    default:
+        rc = ctx->restore.ops.process_record(ctx, rec);
+        break;
+    }
+
+    free(rec->data);
+
+    if ( rc == RECORD_NOT_PROCESSED )
+    {
+        if ( rec->type & REC_TYPE_OPTIONAL )
+            DPRINTF("Ignoring optional record %#x (%s)",
+                    rec->type, rec_type_to_str(rec->type));
+        else
+        {
+            ERROR("Mandatory record %#x (%s) not handled",
+                  rec->type, rec_type_to_str(rec->type));
+            rc = -1;
+        }
+    }
+
+    return rc;
+}
+
 /*
  * Restore a domain.
  */
@@ -498,40 +540,7 @@ static int restore(struct xc_sr_context *ctx)
         if ( rc )
             goto err;
 
-        switch ( rec.type )
-        {
-        case REC_TYPE_END:
-            break;
-
-        case REC_TYPE_PAGE_DATA:
-            rc = handle_page_data(ctx, &rec);
-            break;
-
-        case REC_TYPE_VERIFY:
-            DPRINTF("Verify mode enabled");
-            ctx->restore.verify = true;
-            break;
-
-        default:
-            rc = ctx->restore.ops.process_record(ctx, &rec);
-            break;
-        }
-
-        free(rec.data);
-
-        if ( rc == RECORD_NOT_PROCESSED )
-        {
-            if ( rec.type & REC_TYPE_OPTIONAL )
-                DPRINTF("Ignoring optional record %#x (%s)",
-                        rec.type, rec_type_to_str(rec.type));
-            else
-            {
-                ERROR("Mandatory record %#x (%s) not handled",
-                      rec.type, rec_type_to_str(rec.type));
-                rc = -1;
-            }
-        }
-
+        rc = process_record(ctx, &rec);
         if ( rc )
             goto err;
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH Remus v2 07/10] tools/libxc: split read/handle qemu info
  2015-05-08  9:33 [PATCH Remus v2 00/10] Remus support for Migration-v2 Yang Hongyang
                   ` (5 preceding siblings ...)
  2015-05-08  9:33 ` [PATCH Remus v2 06/10] tools/libxc: introduce process_record() Yang Hongyang
@ 2015-05-08  9:33 ` Yang Hongyang
  2015-05-08  9:33 ` [PATCH Remus v2 08/10] tools/libxc: implement Remus checkpointed save Yang Hongyang
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 40+ messages in thread
From: Yang Hongyang @ 2015-05-08  9:33 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, rshriram, ian.jackson

Split read/handle qemu info. The receiving of qemu info
should be done while we receive the migration stream,
handle_qemu will be called when the stream complete.
Otherwise, it will break Remus because read_record()
won't read qemu info and stream_complete will be called
at failover.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 tools/libxc/xc_sr_common.h          |  5 +++++
 tools/libxc/xc_sr_restore.c         | 12 ++++++++++++
 tools/libxc/xc_sr_restore_x86_hvm.c | 28 +++++++++++++++++++++++++---
 3 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 95e3f5d..9567a35 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -280,6 +280,11 @@ struct xc_sr_context
                     /* HVM context blob. */
                     void *context;
                     size_t contextsz;
+
+#ifdef XG_LIBXL_HVM_COMPAT
+                    uint32_t qlen;
+                    void *qbuf;
+#endif
                 } restore;
             };
         } x86_hvm;
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 53bd674..8022c3d 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -510,6 +510,9 @@ static int process_record(struct xc_sr_context *ctx, struct xc_sr_record *rec)
     return rc;
 }
 
+#ifdef XG_LIBXL_HVM_COMPAT
+extern int read_qemu(struct xc_sr_context *ctx);
+#endif
 /*
  * Restore a domain.
  */
@@ -546,6 +549,15 @@ static int restore(struct xc_sr_context *ctx)
 
     } while ( rec.type != REC_TYPE_END );
 
+#ifdef XG_LIBXL_HVM_COMPAT
+    if ( ctx->dominfo.hvm )
+    {
+        rc = read_qemu(ctx);
+        if ( rc )
+            goto err;
+    }
+#endif
+
     rc = ctx->restore.ops.stream_complete(ctx);
     if ( rc )
         goto err;
diff --git a/tools/libxc/xc_sr_restore_x86_hvm.c b/tools/libxc/xc_sr_restore_x86_hvm.c
index 6e9b318..6f5af0e 100644
--- a/tools/libxc/xc_sr_restore_x86_hvm.c
+++ b/tools/libxc/xc_sr_restore_x86_hvm.c
@@ -94,14 +94,14 @@ static int handle_hvm_params(struct xc_sr_context *ctx,
 }
 
 #ifdef XG_LIBXL_HVM_COMPAT
-static int handle_qemu(struct xc_sr_context *ctx)
+int read_qemu(struct xc_sr_context *ctx);
+int read_qemu(struct xc_sr_context *ctx)
 {
     xc_interface *xch = ctx->xch;
-    char qemusig[21], path[256];
+    char qemusig[21];
     uint32_t qlen;
     void *qbuf = NULL;
     int rc = -1;
-    FILE *fp = NULL;
 
     if ( read_exact(ctx->fd, qemusig, sizeof(qemusig)) )
     {
@@ -137,6 +137,28 @@ static int handle_qemu(struct xc_sr_context *ctx)
         goto out;
     }
 
+    /* With Remus, this could be read many times */
+    if ( ctx->x86_hvm.restore.qbuf )
+        free(ctx->x86_hvm.restore.qbuf);
+    ctx->x86_hvm.restore.qbuf = qbuf;
+    ctx->x86_hvm.restore.qlen = qlen;
+    rc = 0;
+
+out:
+    if (rc)
+        free(qbuf);
+    return rc;
+}
+
+static int handle_qemu(struct xc_sr_context *ctx)
+{
+    xc_interface *xch = ctx->xch;
+    char path[256];
+    uint32_t qlen = ctx->x86_hvm.restore.qlen;
+    void *qbuf = ctx->x86_hvm.restore.qbuf;
+    int rc = -1;
+    FILE *fp = NULL;
+
     sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE".%u", ctx->domid);
     fp = fopen(path, "wb");
     if ( !fp )
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH Remus v2 08/10] tools/libxc: implement Remus checkpointed save
  2015-05-08  9:33 [PATCH Remus v2 00/10] Remus support for Migration-v2 Yang Hongyang
                   ` (6 preceding siblings ...)
  2015-05-08  9:33 ` [PATCH Remus v2 07/10] tools/libxc: split read/handle qemu info Yang Hongyang
@ 2015-05-08  9:33 ` Yang Hongyang
  2015-05-08  9:33 ` [PATCH Remus v2 09/10] tools/libxc: implement Remus checkpointed restore Yang Hongyang
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 40+ messages in thread
From: Yang Hongyang @ 2015-05-08  9:33 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, rshriram, ian.jackson

With Remus, the save flow should be:
live migration->{ periodically save(checkpointed save) }

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 tools/libxc/include/xenguest.h |  1 +
 tools/libxc/xc_sr_common.h     |  6 ++++
 tools/libxc/xc_sr_save.c       | 74 +++++++++++++++++++++++++++++-------------
 tools/libxl/libxl_dom.c        |  1 +
 4 files changed, 59 insertions(+), 23 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 8e39075..7581263 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -30,6 +30,7 @@
 #define XCFLAGS_HVM       (1 << 2)
 #define XCFLAGS_STDVGA    (1 << 3)
 #define XCFLAGS_CHECKPOINT_COMPRESS    (1 << 4)
+#define XCFLAGS_CHECKPOINTED    (1 << 5)
 
 #define X86_64_B_SIZE   64 
 #define X86_32_B_SIZE   32
diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 9567a35..95e0f19 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -172,6 +172,12 @@ struct xc_sr_context
             /* Further debugging information in the stream. */
             bool debug;
 
+            /*
+             * Whether it is a checkpointed save, we'll enter checkpointed
+             * save after live migration under Remus.
+             */
+            bool checkpointed;
+
             /* Parameters for tweaking live migration. */
             unsigned max_iterations;
             unsigned dirty_threshold;
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index 7efffe6..7cf45ed 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -466,6 +466,9 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
     DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
                                     (&ctx->save.dirty_bitmap_hbuf));
 
+    if ( !ctx->save.live )
+        goto last_iter;
+
     rc = enable_logdirty(ctx);
     if ( rc )
         goto out;
@@ -504,6 +507,7 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
             goto out;
     }
 
+ last_iter:
     rc = suspend_domain(ctx);
     if ( rc )
         goto out;
@@ -664,35 +668,52 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
     if ( rc )
         goto err;
 
-    rc = ctx->save.ops.start_of_stream(ctx);
-    if ( rc )
-        goto err;
+    do {
+        rc = ctx->save.ops.start_of_stream(ctx);
+        if ( rc )
+            goto err;
 
-    if ( ctx->save.live )
-        rc = send_domain_memory_live(ctx);
-    else
-        rc = send_domain_memory_nonlive(ctx);
+        if ( ctx->save.live || ctx->save.checkpointed )
+            rc = send_domain_memory_live(ctx);
+        else
+            rc = send_domain_memory_nonlive(ctx);
 
-    if ( rc )
-        goto err;
+        if ( rc )
+            goto err;
 
-    if ( !ctx->dominfo.shutdown ||
-         (ctx->dominfo.shutdown_reason != SHUTDOWN_suspend) )
-    {
-        ERROR("Domain has not been suspended");
-        rc = -1;
-        goto err;
-    }
+        if ( !ctx->dominfo.shutdown ||
+             (ctx->dominfo.shutdown_reason != SHUTDOWN_suspend) )
+        {
+            ERROR("Domain has not been suspended");
+            rc = -1;
+            goto err;
+        }
 
-    xc_report_progress_single(xch, "End of stream");
+        xc_report_progress_single(xch, "End of stream");
 
-    rc = ctx->save.ops.end_of_stream(ctx);
-    if ( rc )
-        goto err;
+        rc = ctx->save.ops.end_of_stream(ctx);
+        if ( rc )
+            goto err;
 
-    rc = write_end_record(ctx);
-    if ( rc )
-        goto err;
+        rc = write_end_record(ctx);
+        if ( rc )
+            goto err;
+
+        if ( ctx->save.live ) {
+            /* End of live migration */
+            ctx->save.live = false;
+        }
+
+        if ( ctx->save.checkpointed ) {
+            ctx->save.callbacks->postcopy(ctx->save.callbacks->data);
+
+            rc = ctx->save.callbacks->checkpoint(ctx->save.callbacks->data);
+            if ( rc > 0 )
+                IPRINTF("Checkpointed save");
+            else
+                ctx->save.checkpointed = false;
+        }
+    } while ( ctx->save.checkpointed );
 
     xc_report_progress_single(xch, "Complete");
     goto done;
@@ -720,6 +741,13 @@ int xc_domain_save2(xc_interface *xch, int io_fd, uint32_t dom,
     ctx.save.callbacks = callbacks;
     ctx.save.live  = !!(flags & XCFLAGS_LIVE);
     ctx.save.debug = !!(flags & XCFLAGS_DEBUG);
+    ctx.save.checkpointed = !!(flags & XCFLAGS_CHECKPOINTED);
+
+    if ( ctx.save.checkpointed ) {
+        /* This is a checkpointed save, we need these callbacks */
+        assert(ctx.save.callbacks->postcopy);
+        assert(ctx.save.callbacks->checkpoint);
+    }
 
     /*
      * TODO: Find some time to better tweak the live migration algorithm.
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index f408646..a0c9850 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -2003,6 +2003,7 @@ void libxl__domain_suspend(libxl__egc *egc, libxl__domain_suspend_state *dss)
 
     if (r_info != NULL) {
         dss->interval = r_info->interval;
+        dss->xcflags |= XCFLAGS_CHECKPOINTED;
         if (libxl_defbool_val(r_info->compression))
             dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
     }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH Remus v2 09/10] tools/libxc: implement Remus checkpointed restore
  2015-05-08  9:33 [PATCH Remus v2 00/10] Remus support for Migration-v2 Yang Hongyang
                   ` (7 preceding siblings ...)
  2015-05-08  9:33 ` [PATCH Remus v2 08/10] tools/libxc: implement Remus checkpointed save Yang Hongyang
@ 2015-05-08  9:33 ` Yang Hongyang
  2015-05-08  9:33 ` [PATCH Remus v2 10/10] tools/libxc: X86_PV_INFO can be sent multiple times under Remus Yang Hongyang
  2015-05-08 18:12 ` [PATCH Remus v2 00/10] Remus support for Migration-v2 Andrew Cooper
  10 siblings, 0 replies; 40+ messages in thread
From: Yang Hongyang @ 2015-05-08  9:33 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, rshriram, ian.jackson

With Remus, the restore flow should be:
the first full migration stream -> { periodically restore stream }
on failover:
  stream_complete()

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 tools/libxc/xc_sr_common.h  |   3 ++
 tools/libxc/xc_sr_restore.c | 108 +++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 104 insertions(+), 7 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 95e0f19..e34e6d4 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -196,6 +196,9 @@ struct xc_sr_context
             struct xc_sr_restore_ops ops;
             struct restore_callbacks *callbacks;
 
+            /* Whether it is a checkpointed stream */
+            bool checkpointed;
+
             /* From Image Header. */
             uint32_t format_version;
 
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 8022c3d..fceb5c7 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -3,6 +3,15 @@
 #include "xc_sr_common.h"
 
 /*
+ * With Remus, we buffer the records sent by primary at checkpoint,
+ * in case the primary will fail, we can recover from the last
+ * checkpoint state.
+ * This should be enough because primary only send dirty pages at
+ * checkpoint.
+ */
+#define MAX_RECORDS 1024
+
+/*
  * Read and validate the Image and Domain headers.
  */
 static int read_headers(struct xc_sr_context *ctx)
@@ -519,8 +528,11 @@ extern int read_qemu(struct xc_sr_context *ctx);
 static int restore(struct xc_sr_context *ctx)
 {
     xc_interface *xch = ctx->xch;
-    struct xc_sr_record rec;
-    int rc, saved_rc = 0, saved_errno = 0;
+    struct xc_sr_record *rec;
+    int rc, saved_rc = 0, saved_errno = 0, i;
+    struct xc_sr_record *records[MAX_RECORDS];
+
+    memset(records, 0, MAX_RECORDS*sizeof(void *));
 
     IPRINTF("Restoring domain");
 
@@ -537,17 +549,29 @@ static int restore(struct xc_sr_context *ctx)
         goto err;
     }
 
+    rec = malloc(sizeof(struct xc_sr_record));
+    if (!rec)
+    {
+        ERROR("Unable to allocate memory for record");
+        rc = -1;
+        goto err;
+    }
     do
     {
-        rc = read_record(ctx, &rec);
-        if ( rc )
+        rc = read_record(ctx, rec);
+        if ( rc ) {
+            free(rec);
             goto err;
+        }
 
-        rc = process_record(ctx, &rec);
-        if ( rc )
+        rc = process_record(ctx, rec);
+        if ( rc ) {
+            free(rec);
             goto err;
+        }
 
-    } while ( rec.type != REC_TYPE_END );
+    } while ( rec->type != REC_TYPE_END );
+    free(rec);
 
 #ifdef XG_LIBXL_HVM_COMPAT
     if ( ctx->dominfo.hvm )
@@ -557,7 +581,69 @@ static int restore(struct xc_sr_context *ctx)
             goto err;
     }
 #endif
+    /*
+     * If it is a live migration, this is the end of the live migration
+     * stream
+     * With Remus, we will enter a loop to receive the stream periodically
+     * sent by primary.
+     */
+
+    while ( ctx->restore.checkpointed )
+    {
+        i = 0;
+        do
+        {
+            /* Buffer records */
+            if ( i >= MAX_RECORDS )
+            {
+                ERROR("There are too many records");
+                rc = -1;
+                goto err;
+            }
 
+            rec = malloc(sizeof(struct xc_sr_record));
+            if (!rec)
+            {
+                ERROR("Unable to allocate memory for record");
+                rc = -1;
+                goto err;
+            }
+            rc = read_record(ctx, rec);
+            if ( rc )
+                goto err_buf;
+
+            records[i++] = rec;
+        } while ( rec->type != REC_TYPE_END );
+#ifdef XG_LIBXL_HVM_COMPAT
+        if ( ctx->dominfo.hvm )
+        {
+            rc = read_qemu(ctx);
+            if ( rc )
+                goto err_buf;
+        }
+#endif
+        IPRINTF("All records buffered");
+
+        i = 0;
+        rec = records[i++];
+        while (rec)
+        {
+            rc = process_record(ctx, rec);
+            free(rec);
+            records[i-1] = NULL;
+            if ( rc )
+                goto err;
+
+            rec = records[i++];
+        }
+        IPRINTF("All records processed");
+    }
+
+ err_buf:
+    /*
+     * With Remus, if we reach here, there must be some error on primary,
+     * failover from the last checkpoint state.
+     */
     rc = ctx->restore.ops.stream_complete(ctx);
     if ( rc )
         goto err;
@@ -571,6 +657,13 @@ static int restore(struct xc_sr_context *ctx)
     PERROR("Restore failed");
 
  done:
+    for ( i = 0; i < MAX_RECORDS; i++)
+    {
+        if ( records[i] ) {
+            free(records[i]->data);
+            free(records[i]);
+        }
+    }
     free(ctx->restore.populated_pfns);
     rc = ctx->restore.ops.cleanup(ctx);
     if ( rc )
@@ -605,6 +698,7 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
     ctx.restore.xenstore_evtchn = store_evtchn;
     ctx.restore.xenstore_domid = store_domid;
     ctx.restore.callbacks = callbacks;
+    ctx.restore.checkpointed = checkpointed_stream;
 
     IPRINTF("In experimental %s", __func__);
     DPRINTF("fd %d, dom %u, hvm %u, pae %u, superpages %d"
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH Remus v2 10/10] tools/libxc: X86_PV_INFO can be sent multiple times under Remus
  2015-05-08  9:33 [PATCH Remus v2 00/10] Remus support for Migration-v2 Yang Hongyang
                   ` (8 preceding siblings ...)
  2015-05-08  9:33 ` [PATCH Remus v2 09/10] tools/libxc: implement Remus checkpointed restore Yang Hongyang
@ 2015-05-08  9:33 ` Yang Hongyang
  2015-05-08 18:12 ` [PATCH Remus v2 00/10] Remus support for Migration-v2 Andrew Cooper
  10 siblings, 0 replies; 40+ messages in thread
From: Yang Hongyang @ 2015-05-08  9:33 UTC (permalink / raw)
  To: xen-devel
  Cc: wei.liu2, ian.campbell, wency, andrew.cooper3, yunhong.jiang,
	eddie.dong, rshriram, ian.jackson

X86_PV_INFO can be sent multiple times under Remus

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
---
 tools/libxc/xc_sr_restore_x86_pv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/libxc/xc_sr_restore_x86_pv.c b/tools/libxc/xc_sr_restore_x86_pv.c
index ef26c64..652b400 100644
--- a/tools/libxc/xc_sr_restore_x86_pv.c
+++ b/tools/libxc/xc_sr_restore_x86_pv.c
@@ -590,7 +590,7 @@ static int handle_x86_pv_info(struct xc_sr_context *ctx,
     xc_interface *xch = ctx->xch;
     struct xc_sr_rec_x86_pv_info *info = rec->data;
 
-    if ( ctx->x86_pv.restore.seen_pv_info )
+    if ( ctx->x86_pv.restore.seen_pv_info && !ctx->restore.checkpointed )
     {
         ERROR("Already received X86_PV_INFO record");
         return -1;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 02/10] tools/libxc: introduce setup() and cleanup() on save
  2015-05-08  9:33 ` [PATCH Remus v2 02/10] tools/libxc: introduce setup() and cleanup() on save Yang Hongyang
@ 2015-05-08  9:45   ` Andrew Cooper
  2015-05-08  9:59     ` Hongyang Yang
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2015-05-08  9:45 UTC (permalink / raw)
  To: Yang Hongyang, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram

On 08/05/15 10:33, Yang Hongyang wrote:
> introduce setup() and cleanup() which subsume the
> ctx->save.ops.{setup,cleanup}() calls and also do allocate/free
> necessary memory.
> The SHADOW_OP_OFF hypercall also included in the cleanup().
>
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>

I would suggest swapping this with patch 1, to save introducing new
code, just to move it again in the next patch.

In general, a good change, but some comments...

> ---
>  tools/libxc/xc_sr_save.c | 72 +++++++++++++++++++++++++++++-------------------
>  1 file changed, 44 insertions(+), 28 deletions(-)
>
> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
> index cc3e6b1..2394bc4 100644
> --- a/tools/libxc/xc_sr_save.c
> +++ b/tools/libxc/xc_sr_save.c
> @@ -607,13 +607,10 @@ static int send_domain_memory_nonlive(struct xc_sr_context *ctx)
>      return rc;
>  }
>  
> -/*
> - * Save a domain.
> - */
> -static int save(struct xc_sr_context *ctx, uint16_t guest_type)
> +static int setup(struct xc_sr_context *ctx)
>  {
>      xc_interface *xch = ctx->xch;
> -    int rc, saved_rc = 0, saved_errno = 0;
> +    int rc;
>      DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
>                                      (&ctx->save.dirty_bitmap_hbuf));
>  
> @@ -632,13 +629,51 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
>          goto err;
>      }
>  
> -    IPRINTF("Saving domain %d, type %s",
> -            ctx->domid, dhdr_type_to_str(guest_type));
> -
>      rc = ctx->save.ops.setup(ctx);
>      if ( rc )
>          goto err;
>  
> +    rc = 0;
> +
> + err:
> +    return rc;
> +
> +}
> +
> +static void cleanup(struct xc_sr_context *ctx)
> +{
> +    xc_interface *xch = ctx->xch;
> +    DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
> +                                    (&ctx->save.dirty_bitmap_hbuf));
> +
> +    xc_shadow_control(xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_OFF,
> +                      NULL, 0, NULL, 0, NULL);
> +
> +    if ( ctx->save.ops.cleanup(ctx) )
> +        PERROR("Failed to clean up");
> +
> +    if ( dirty_bitmap )
> +        xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
> +                                   NRPAGES(bitmap_size(ctx->save.p2m_size)));

xc_hypercall_buffer_free_pages() if fine dealing with NULL, just like
free() is.  You can drop the conditional.

> +    free(ctx->save.deferred_pages);
> +    free(ctx->save.batch_pfns);
> +}
> +
> +/*
> + * Save a domain.
> + */
> +static int save(struct xc_sr_context *ctx, uint16_t guest_type)
> +{
> +    xc_interface *xch = ctx->xch;
> +    int rc;
> +
> +    rc = setup(ctx);
> +    if ( rc )
> +        goto err;
> +
> +    IPRINTF("Saving domain %d, type %s",
> +            ctx->domid, dhdr_type_to_str(guest_type));
> +
>      xc_report_progress_single(xch, "Start of stream");
>  
>      rc = write_headers(ctx, guest_type);
> @@ -679,29 +714,10 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
>      goto done;
>  
>   err:
> -    saved_errno = errno;
> -    saved_rc = rc;
>      PERROR("Save failed");
>  
>   done:
> -    xc_shadow_control(xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_OFF,
> -                      NULL, 0, NULL, 0, NULL);
> -
> -    rc = ctx->save.ops.cleanup(ctx);
> -    if ( rc )
> -        PERROR("Failed to clean up");
> -
> -    xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
> -                                   NRPAGES(bitmap_size(ctx->save.p2m_size)));
> -    free(ctx->save.deferred_pages);
> -    free(ctx->save.batch_pfns);
> -
> -    if ( saved_rc )
> -    {
> -        rc = saved_rc;
> -        errno = saved_errno;
> -    }

You must keep saved_{rc,errno}, so that the cleanup doesn't clobber the
errno semantically relevant to why the save failed.

~Andrew

> -
> +    cleanup(ctx);
>      return rc;
>  };
>  

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 01/10] tools/libxc: adjust the memory allocation for migration
  2015-05-08  9:33 ` [PATCH Remus v2 01/10] tools/libxc: adjust the memory allocation for migration Yang Hongyang
@ 2015-05-08  9:51   ` Andrew Cooper
  2015-05-11 11:50   ` Ian Campbell
  1 sibling, 0 replies; 40+ messages in thread
From: Andrew Cooper @ 2015-05-08  9:51 UTC (permalink / raw)
  To: Yang Hongyang, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram

On 08/05/15 10:33, Yang Hongyang wrote:
> @@ -475,24 +475,12 @@ static int update_progress_string(struct xc_sr_context *ctx,
>  static int send_domain_memory_live(struct xc_sr_context *ctx)
>  {
>      xc_interface *xch = ctx->xch;
> -    DECLARE_HYPERCALL_BUFFER(unsigned long, to_send);
>      xc_shadow_op_stats_t stats = { 0, ctx->save.p2m_size };
>      char *progress_str = NULL;
>      unsigned x;
>      int rc = -1;
> -
> -    to_send = xc_hypercall_buffer_alloc_pages(
> -        xch, to_send, NRPAGES(bitmap_size(ctx->save.p2m_size)));
> -
> -    ctx->save.batch_pfns = malloc(MAX_BATCH_SIZE *
> -                                  sizeof(*ctx->save.batch_pfns));
> -    ctx->save.deferred_pages = calloc(1, bitmap_size(ctx->save.p2m_size));
> -
> -    if ( !ctx->save.batch_pfns || !to_send || !ctx->save.deferred_pages )
> -    {
> -        ERROR("Unable to allocate memory for to_{send,fix}/batch bitmaps");
> -        goto out;
> -    }
> +    DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
> +                                    (&ctx->save.dirty_bitmap_hbuf));

Nice bug.  I have just put a fix to DECLARE_HYPERCALL_BUFFER_SHADOW() in
my series.

~Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 02/10] tools/libxc: introduce setup() and cleanup() on save
  2015-05-08  9:45   ` Andrew Cooper
@ 2015-05-08  9:59     ` Hongyang Yang
  2015-05-08 10:08       ` Andrew Cooper
  0 siblings, 1 reply; 40+ messages in thread
From: Hongyang Yang @ 2015-05-08  9:59 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram



On 05/08/2015 05:45 PM, Andrew Cooper wrote:
> On 08/05/15 10:33, Yang Hongyang wrote:
>> introduce setup() and cleanup() which subsume the
>> ctx->save.ops.{setup,cleanup}() calls and also do allocate/free
>> necessary memory.
>> The SHADOW_OP_OFF hypercall also included in the cleanup().
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>
> I would suggest swapping this with patch 1, to save introducing new
> code, just to move it again in the next patch.

OK

>
> In general, a good change, but some comments...
>
>> ---
>>   tools/libxc/xc_sr_save.c | 72 +++++++++++++++++++++++++++++-------------------
>>   1 file changed, 44 insertions(+), 28 deletions(-)
>>
>> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
>> index cc3e6b1..2394bc4 100644
>> --- a/tools/libxc/xc_sr_save.c
>> +++ b/tools/libxc/xc_sr_save.c
>> @@ -607,13 +607,10 @@ static int send_domain_memory_nonlive(struct xc_sr_context *ctx)
>>       return rc;
>>   }
>>
>> -/*
>> - * Save a domain.
>> - */
>> -static int save(struct xc_sr_context *ctx, uint16_t guest_type)
>> +static int setup(struct xc_sr_context *ctx)
>>   {
>>       xc_interface *xch = ctx->xch;
>> -    int rc, saved_rc = 0, saved_errno = 0;
>> +    int rc;
>>       DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
>>                                       (&ctx->save.dirty_bitmap_hbuf));
>>
>> @@ -632,13 +629,51 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
>>           goto err;
>>       }
>>
>> -    IPRINTF("Saving domain %d, type %s",
>> -            ctx->domid, dhdr_type_to_str(guest_type));
>> -
>>       rc = ctx->save.ops.setup(ctx);
>>       if ( rc )
>>           goto err;
>>
>> +    rc = 0;
>> +
>> + err:
>> +    return rc;
>> +
>> +}
>> +
>> +static void cleanup(struct xc_sr_context *ctx)
>> +{
>> +    xc_interface *xch = ctx->xch;
>> +    DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
>> +                                    (&ctx->save.dirty_bitmap_hbuf));
>> +
>> +    xc_shadow_control(xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_OFF,
>> +                      NULL, 0, NULL, 0, NULL);
>> +
>> +    if ( ctx->save.ops.cleanup(ctx) )
>> +        PERROR("Failed to clean up");
>> +
>> +    if ( dirty_bitmap )
>> +        xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
>> +                                   NRPAGES(bitmap_size(ctx->save.p2m_size)));
>
> xc_hypercall_buffer_free_pages() if fine dealing with NULL, just like
> free() is.  You can drop the conditional.

Actually this is another trick that I need to deal with those hypercall macros.
DECLARE_HYPERCALL_BUFFER_SHADOW will define a user pointer "dirty_bitmap"
and a shadow buffer, although xc_hypercall_buffer_free_pages takes
"dirty_bitmap" as an augument, but it is also a MACRO, without
"if ( dirty_bitmap )", the compiler will report "dirty_bitmap" unused error...

>
>> +    free(ctx->save.deferred_pages);
>> +    free(ctx->save.batch_pfns);
>> +}
>> +
>> +/*
>> + * Save a domain.
>> + */
>> +static int save(struct xc_sr_context *ctx, uint16_t guest_type)
>> +{
>> +    xc_interface *xch = ctx->xch;
>> +    int rc;
>> +
>> +    rc = setup(ctx);
>> +    if ( rc )
>> +        goto err;
>> +
>> +    IPRINTF("Saving domain %d, type %s",
>> +            ctx->domid, dhdr_type_to_str(guest_type));
>> +
>>       xc_report_progress_single(xch, "Start of stream");
>>
>>       rc = write_headers(ctx, guest_type);
>> @@ -679,29 +714,10 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
>>       goto done;
>>
>>    err:
>> -    saved_errno = errno;
>> -    saved_rc = rc;
>>       PERROR("Save failed");
>>
>>    done:
>> -    xc_shadow_control(xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_OFF,
>> -                      NULL, 0, NULL, 0, NULL);
>> -
>> -    rc = ctx->save.ops.cleanup(ctx);
>> -    if ( rc )
>> -        PERROR("Failed to clean up");
>> -
>> -    xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
>> -                                   NRPAGES(bitmap_size(ctx->save.p2m_size)));
>> -    free(ctx->save.deferred_pages);
>> -    free(ctx->save.batch_pfns);
>> -
>> -    if ( saved_rc )
>> -    {
>> -        rc = saved_rc;
>> -        errno = saved_errno;
>> -    }
>
> You must keep saved_{rc,errno}, so that the cleanup doesn't clobber the
> errno semantically relevant to why the save failed.
>

OK

> ~Andrew
>
>> -
>> +    cleanup(ctx);
>>       return rc;
>>   };
>>
>
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 02/10] tools/libxc: introduce setup() and cleanup() on save
  2015-05-08  9:59     ` Hongyang Yang
@ 2015-05-08 10:08       ` Andrew Cooper
  2015-05-11  1:20         ` Hongyang Yang
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2015-05-08 10:08 UTC (permalink / raw)
  To: Hongyang Yang, xen-devel
  Cc: wei.liu2, ian.campbell, wency, eddie.dong, yunhong.jiang,
	ian.jackson, rshriram

On 08/05/15 10:59, Hongyang Yang wrote:
>
>>
>> In general, a good change, but some comments...
>>
>>> ---
>>>   tools/libxc/xc_sr_save.c | 72
>>> +++++++++++++++++++++++++++++-------------------
>>>   1 file changed, 44 insertions(+), 28 deletions(-)
>>>
>>> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
>>> index cc3e6b1..2394bc4 100644
>>> --- a/tools/libxc/xc_sr_save.c
>>> +++ b/tools/libxc/xc_sr_save.c
>>> @@ -607,13 +607,10 @@ static int send_domain_memory_nonlive(struct
>>> xc_sr_context *ctx)
>>>       return rc;
>>>   }
>>>
>>> -/*
>>> - * Save a domain.
>>> - */
>>> -static int save(struct xc_sr_context *ctx, uint16_t guest_type)
>>> +static int setup(struct xc_sr_context *ctx)
>>>   {
>>>       xc_interface *xch = ctx->xch;
>>> -    int rc, saved_rc = 0, saved_errno = 0;
>>> +    int rc;
>>>       DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
>>>                                       (&ctx->save.dirty_bitmap_hbuf));
>>>
>>> @@ -632,13 +629,51 @@ static int save(struct xc_sr_context *ctx,
>>> uint16_t guest_type)
>>>           goto err;
>>>       }
>>>
>>> -    IPRINTF("Saving domain %d, type %s",
>>> -            ctx->domid, dhdr_type_to_str(guest_type));
>>> -
>>>       rc = ctx->save.ops.setup(ctx);
>>>       if ( rc )
>>>           goto err;
>>>
>>> +    rc = 0;
>>> +
>>> + err:
>>> +    return rc;
>>> +
>>> +}
>>> +
>>> +static void cleanup(struct xc_sr_context *ctx)
>>> +{
>>> +    xc_interface *xch = ctx->xch;
>>> +    DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
>>> +                                    (&ctx->save.dirty_bitmap_hbuf));
>>> +
>>> +    xc_shadow_control(xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_OFF,
>>> +                      NULL, 0, NULL, 0, NULL);
>>> +
>>> +    if ( ctx->save.ops.cleanup(ctx) )
>>> +        PERROR("Failed to clean up");
>>> +
>>> +    if ( dirty_bitmap )
>>> +        xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
>>> +                                  
>>> NRPAGES(bitmap_size(ctx->save.p2m_size)));
>>
>> xc_hypercall_buffer_free_pages() if fine dealing with NULL, just like
>> free() is.  You can drop the conditional.
>
> Actually this is another trick that I need to deal with those
> hypercall macros.
> DECLARE_HYPERCALL_BUFFER_SHADOW will define a user pointer "dirty_bitmap"
> and a shadow buffer, although xc_hypercall_buffer_free_pages takes
> "dirty_bitmap" as an augument, but it is also a MACRO, without
> "if ( dirty_bitmap )", the compiler will report "dirty_bitmap" unused
> error...

Ah, in which case you would be better using
xc__hypercall_buffer_free_pages() and not creating the local shadow in
the first place.

~Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 03/10] tools/libxc: rename send_some_pages to send_dirty_pages
  2015-05-08  9:33 ` [PATCH Remus v2 03/10] tools/libxc: rename send_some_pages to send_dirty_pages Yang Hongyang
@ 2015-05-08 10:11   ` Andrew Cooper
  2015-05-11  1:21     ` Hongyang Yang
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2015-05-08 10:11 UTC (permalink / raw)
  To: Yang Hongyang, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram

On 08/05/15 10:33, Yang Hongyang wrote:
> rename send_some_pages to send_dirty_pages, no functional change.
>
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> ---
>  tools/libxc/xc_sr_save.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
> index 2394bc4..9ca4336 100644
> --- a/tools/libxc/xc_sr_save.c
> +++ b/tools/libxc/xc_sr_save.c
> @@ -379,7 +379,7 @@ static int send_all_pages(struct xc_sr_context *ctx)
>   *
>   * Bitmap is bounded by p2m_size.
>   */
> -static int send_some_pages(struct xc_sr_context *ctx,
> +static int send_dirty_pages(struct xc_sr_context *ctx,
>                             unsigned long *bitmap,
>                             unsigned long entries)

Please change the parameter alignment, so the end code is properly aligned.

Also, you can drop the *bitmap parameter, as it will always be the
bitmap found in ctx.  This will simplify some of your later patches.

~Andrew

>  {
> @@ -515,7 +515,7 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
>          if ( rc )
>              goto out;
>  
> -        rc = send_some_pages(ctx, dirty_bitmap, stats.dirty_count);
> +        rc = send_dirty_pages(ctx, dirty_bitmap, stats.dirty_count);
>          if ( rc )
>              goto out;
>      }
> @@ -540,7 +540,7 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
>  
>      bitmap_or(dirty_bitmap, ctx->save.deferred_pages, ctx->save.p2m_size);
>  
> -    rc = send_some_pages(ctx, dirty_bitmap,
> +    rc = send_dirty_pages(ctx, dirty_bitmap,
>                           stats.dirty_count + ctx->save.nr_deferred_pages);
>      if ( rc )
>          goto out;

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 04/10] tools/libxc: introduce DECLARE_HYPERCALL_BUFFER_USER_POINTER
  2015-05-08  9:33 ` [PATCH Remus v2 04/10] tools/libxc: introduce DECLARE_HYPERCALL_BUFFER_USER_POINTER Yang Hongyang
@ 2015-05-08 10:16   ` Andrew Cooper
  2015-05-11  1:22     ` Hongyang Yang
  2015-05-11 11:53   ` Ian Campbell
  1 sibling, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2015-05-08 10:16 UTC (permalink / raw)
  To: Yang Hongyang, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram

On 08/05/15 10:33, Yang Hongyang wrote:
> Define a user pointer that can access the hypercall buffer data
> Useful when you only need to access the hypercall buffer data
>
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
> ---
>  tools/libxc/include/xenctrl.h | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index 6994c51..12e8c36 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -296,6 +296,14 @@ typedef struct xc_hypercall_buffer xc_hypercall_buffer_t;
>      }
>  
>  /*
> + * Define a user pointer that can access the hypercall buffer data
> + *
> + * Useful when you only need to access the hypercall buffer data
> + */
> +#define DECLARE_HYPERCALL_BUFFER_USER_POINTER(_type, _name, _hbuf)  \
> +    _type *_name = _hbuf->hbuf;

There are some bracketing issues here.  Please refer to my fixup patch
which I will post as soon as I can.

~Andrew

> +
> +/*
>   * Declare the necessary data structure to allow a hypercall buffer
>   * passed as an argument to a function to be used in the normal way.
>   */

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 05/10] tools/libxc: reuse send_dirty_pages() in send_all_pages()
  2015-05-08  9:33 ` [PATCH Remus v2 05/10] tools/libxc: reuse send_dirty_pages() in send_all_pages() Yang Hongyang
@ 2015-05-08 10:17   ` Andrew Cooper
  0 siblings, 0 replies; 40+ messages in thread
From: Andrew Cooper @ 2015-05-08 10:17 UTC (permalink / raw)
  To: Yang Hongyang, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram

On 08/05/15 10:33, Yang Hongyang wrote:
> introduce bitmap_set() to set the entire bitmap.
> in send_all_pages(), set the entire bitmap and call send_dirty_pages().
>
> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>  (accepting for
the expected change due to cleanup in previous patches)

> ---
>  tools/libxc/xc_bitops.h  |  5 +++++
>  tools/libxc/xc_sr_save.c | 44 ++++++++++++++------------------------------
>  2 files changed, 19 insertions(+), 30 deletions(-)
>
> diff --git a/tools/libxc/xc_bitops.h b/tools/libxc/xc_bitops.h
> index dfce3b8..cd749f4 100644
> --- a/tools/libxc/xc_bitops.h
> +++ b/tools/libxc/xc_bitops.h
> @@ -26,6 +26,11 @@ static inline unsigned long *bitmap_alloc(int nr_bits)
>      return calloc(1, bitmap_size(nr_bits));
>  }
>  
> +static inline void bitmap_set(unsigned long *addr, int nr_bits)
> +{
> +    memset(addr, 0xff, bitmap_size(nr_bits));
> +}
> +
>  static inline void bitmap_clear(unsigned long *addr, int nr_bits)
>  {
>      memset(addr, 0, bitmap_size(nr_bits));
> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
> index 9ca4336..7efffe6 100644
> --- a/tools/libxc/xc_sr_save.c
> +++ b/tools/libxc/xc_sr_save.c
> @@ -344,36 +344,6 @@ static int suspend_domain(struct xc_sr_context *ctx)
>  }
>  
>  /*
> - * Send all pages in the guests p2m.  Used as the first iteration of the live
> - * migration loop, and for a non-live save.
> - */
> -static int send_all_pages(struct xc_sr_context *ctx)
> -{
> -    xc_interface *xch = ctx->xch;
> -    xen_pfn_t p;
> -    int rc;
> -
> -    for ( p = 0; p < ctx->save.p2m_size; ++p )
> -    {
> -        rc = add_to_batch(ctx, p);
> -        if ( rc )
> -            return rc;
> -
> -        /* Update progress every 4MB worth of memory sent. */
> -        if ( (p & ((1U << (22 - 12)) - 1)) == 0 )
> -            xc_report_progress_step(xch, p, ctx->save.p2m_size);
> -    }
> -
> -    rc = flush_batch(ctx);
> -    if ( rc )
> -        return rc;
> -
> -    xc_report_progress_step(xch, ctx->save.p2m_size,
> -                            ctx->save.p2m_size);
> -    return 0;
> -}
> -
> -/*
>   * Send a subset of pages in the guests p2m, according to the provided bitmap.
>   * Used for each subsequent iteration of the live migration loop.
>   *
> @@ -415,6 +385,20 @@ static int send_dirty_pages(struct xc_sr_context *ctx,
>      return 0;
>  }
>  
> +/*
> + * Send all pages in the guests p2m.  Used as the first iteration of the live
> + * migration loop, and for a non-live save.
> + */
> +static int send_all_pages(struct xc_sr_context *ctx)
> +{
> +    DECLARE_HYPERCALL_BUFFER_USER_POINTER(unsigned long, dirty_bitmap,
> +                                          (&ctx->save.dirty_bitmap_hbuf));
> +
> +    bitmap_set(dirty_bitmap, ctx->save.p2m_size);
> +
> +    return send_dirty_pages(ctx, dirty_bitmap, ctx->save.p2m_size);
> +}
> +
>  static int enable_logdirty(struct xc_sr_context *ctx)
>  {
>      xc_interface *xch = ctx->xch;

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 00/10] Remus support for Migration-v2
  2015-05-08  9:33 [PATCH Remus v2 00/10] Remus support for Migration-v2 Yang Hongyang
                   ` (9 preceding siblings ...)
  2015-05-08  9:33 ` [PATCH Remus v2 10/10] tools/libxc: X86_PV_INFO can be sent multiple times under Remus Yang Hongyang
@ 2015-05-08 18:12 ` Andrew Cooper
  2015-05-11  6:28   ` Hongyang Yang
  10 siblings, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2015-05-08 18:12 UTC (permalink / raw)
  To: Yang Hongyang, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram

On 08/05/15 10:33, Yang Hongyang wrote:
> This patchset implement the Remus support for Migration v2 but without
> memory compressing.
>
> The series can be found on github:
> https://github.com/macrosheep/xen/tree/Remus-newmig-v2
>
> PATCH 1-7: Some refactor and prepare work.
> PATCH 8-9: The main Remus loop implement.
> PATCH 10: Fix for Remus.

I have reviewed the other half of the series now, and have some design
to discuss.  (I was hoping to get this email sent in reply to v1, but
never mind).  This largely concerns patch 7 and onwards.

Migration v2 has substantially more structure than legacy did.  Once
issue so far is that your series relies on using more than one END
record, which is not supported in the spec.  (Of course - the spec is
fine to be extended in forward-compatible ways.)

To fix the qemu layering issues I need to have some explicit negotiation
between libxc and libxl about sharing ownership of the input fd.  This
is going to require a new record in the format, and I currently drafting
a patch or two which should help in this regard.

My view for the eventual stream looks something like this (time going
downwards):

libxc writes:                   libxl writes:

Image Header
Domain Header
start_of_stream()
start_of_checkpoint()
<memory>
end_of_checkpoint()
Checkpoint record
            ctx->save.callbacks->checkpoint()
                                libxl qemu record
                                ...
                                libxl end-of-checkpoint record
            ctx->save.callbacks->checkpoint() returns
start_of_checkpoint()
<memory>
end_of_checkpoint()
Checkpoint record
etc...

This will eventually allow both libxc and libxl to send checkpoint data
(and by the looks of it, remove the need for postcopy()).  With this
libxc/remus work it is fine to use XG_LIBXL_HVM_COMPAT to cover the
current qemu situation, but I would prefer not to be also retrofitting
libxc checkpoint records when doing the libxl/migv2 work.

Does this look plausible in for Remus (and eventually COLO) support?

~Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 02/10] tools/libxc: introduce setup() and cleanup() on save
  2015-05-08 10:08       ` Andrew Cooper
@ 2015-05-11  1:20         ` Hongyang Yang
  2015-05-11 11:47           ` Ian Campbell
  0 siblings, 1 reply; 40+ messages in thread
From: Hongyang Yang @ 2015-05-11  1:20 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: wei.liu2, ian.campbell, wency, eddie.dong, yunhong.jiang,
	ian.jackson, rshriram



On 05/08/2015 06:08 PM, Andrew Cooper wrote:
> On 08/05/15 10:59, Hongyang Yang wrote:
>>
>>>
>>> In general, a good change, but some comments...
>>>
>>>> ---
>>>>    tools/libxc/xc_sr_save.c | 72
>>>> +++++++++++++++++++++++++++++-------------------
>>>>    1 file changed, 44 insertions(+), 28 deletions(-)
>>>>
>>>> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
>>>> index cc3e6b1..2394bc4 100644
>>>> --- a/tools/libxc/xc_sr_save.c
>>>> +++ b/tools/libxc/xc_sr_save.c
>>>> @@ -607,13 +607,10 @@ static int send_domain_memory_nonlive(struct
>>>> xc_sr_context *ctx)
>>>>        return rc;
>>>>    }
>>>>
>>>> -/*
>>>> - * Save a domain.
>>>> - */
>>>> -static int save(struct xc_sr_context *ctx, uint16_t guest_type)
>>>> +static int setup(struct xc_sr_context *ctx)
>>>>    {
>>>>        xc_interface *xch = ctx->xch;
>>>> -    int rc, saved_rc = 0, saved_errno = 0;
>>>> +    int rc;
>>>>        DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
>>>>                                        (&ctx->save.dirty_bitmap_hbuf));
>>>>
>>>> @@ -632,13 +629,51 @@ static int save(struct xc_sr_context *ctx,
>>>> uint16_t guest_type)
>>>>            goto err;
>>>>        }
>>>>
>>>> -    IPRINTF("Saving domain %d, type %s",
>>>> -            ctx->domid, dhdr_type_to_str(guest_type));
>>>> -
>>>>        rc = ctx->save.ops.setup(ctx);
>>>>        if ( rc )
>>>>            goto err;
>>>>
>>>> +    rc = 0;
>>>> +
>>>> + err:
>>>> +    return rc;
>>>> +
>>>> +}
>>>> +
>>>> +static void cleanup(struct xc_sr_context *ctx)
>>>> +{
>>>> +    xc_interface *xch = ctx->xch;
>>>> +    DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
>>>> +                                    (&ctx->save.dirty_bitmap_hbuf));
>>>> +
>>>> +    xc_shadow_control(xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_OFF,
>>>> +                      NULL, 0, NULL, 0, NULL);
>>>> +
>>>> +    if ( ctx->save.ops.cleanup(ctx) )
>>>> +        PERROR("Failed to clean up");
>>>> +
>>>> +    if ( dirty_bitmap )
>>>> +        xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
>>>> +
>>>> NRPAGES(bitmap_size(ctx->save.p2m_size)));
>>>
>>> xc_hypercall_buffer_free_pages() if fine dealing with NULL, just like
>>> free() is.  You can drop the conditional.
>>
>> Actually this is another trick that I need to deal with those
>> hypercall macros.
>> DECLARE_HYPERCALL_BUFFER_SHADOW will define a user pointer "dirty_bitmap"
>> and a shadow buffer, although xc_hypercall_buffer_free_pages takes
>> "dirty_bitmap" as an augument, but it is also a MACRO, without
>> "if ( dirty_bitmap )", the compiler will report "dirty_bitmap" unused
>> error...
>
> Ah, in which case you would be better using
> xc__hypercall_buffer_free_pages() and not creating the local shadow in
> the first place.

I thought we'd better use those MACROs which described in the comments...
If it is OK to use xc__hypercall_buffer_free_pages(), I will fix it in
the next version.

>
> ~Andrew
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 03/10] tools/libxc: rename send_some_pages to send_dirty_pages
  2015-05-08 10:11   ` Andrew Cooper
@ 2015-05-11  1:21     ` Hongyang Yang
  0 siblings, 0 replies; 40+ messages in thread
From: Hongyang Yang @ 2015-05-11  1:21 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram



On 05/08/2015 06:11 PM, Andrew Cooper wrote:
> On 08/05/15 10:33, Yang Hongyang wrote:
>> rename send_some_pages to send_dirty_pages, no functional change.
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> ---
>>   tools/libxc/xc_sr_save.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
>> index 2394bc4..9ca4336 100644
>> --- a/tools/libxc/xc_sr_save.c
>> +++ b/tools/libxc/xc_sr_save.c
>> @@ -379,7 +379,7 @@ static int send_all_pages(struct xc_sr_context *ctx)
>>    *
>>    * Bitmap is bounded by p2m_size.
>>    */
>> -static int send_some_pages(struct xc_sr_context *ctx,
>> +static int send_dirty_pages(struct xc_sr_context *ctx,
>>                              unsigned long *bitmap,
>>                              unsigned long entries)
>
> Please change the parameter alignment, so the end code is properly aligned.
>
> Also, you can drop the *bitmap parameter, as it will always be the
> bitmap found in ctx.  This will simplify some of your later patches.

ok, thank you!

>
> ~Andrew
>
>>   {
>> @@ -515,7 +515,7 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
>>           if ( rc )
>>               goto out;
>>
>> -        rc = send_some_pages(ctx, dirty_bitmap, stats.dirty_count);
>> +        rc = send_dirty_pages(ctx, dirty_bitmap, stats.dirty_count);
>>           if ( rc )
>>               goto out;
>>       }
>> @@ -540,7 +540,7 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
>>
>>       bitmap_or(dirty_bitmap, ctx->save.deferred_pages, ctx->save.p2m_size);
>>
>> -    rc = send_some_pages(ctx, dirty_bitmap,
>> +    rc = send_dirty_pages(ctx, dirty_bitmap,
>>                            stats.dirty_count + ctx->save.nr_deferred_pages);
>>       if ( rc )
>>           goto out;
>
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 04/10] tools/libxc: introduce DECLARE_HYPERCALL_BUFFER_USER_POINTER
  2015-05-08 10:16   ` Andrew Cooper
@ 2015-05-11  1:22     ` Hongyang Yang
  0 siblings, 0 replies; 40+ messages in thread
From: Hongyang Yang @ 2015-05-11  1:22 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram



On 05/08/2015 06:16 PM, Andrew Cooper wrote:
> On 08/05/15 10:33, Yang Hongyang wrote:
>> Define a user pointer that can access the hypercall buffer data
>> Useful when you only need to access the hypercall buffer data
>>
>> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
>> ---
>>   tools/libxc/include/xenctrl.h | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
>> index 6994c51..12e8c36 100644
>> --- a/tools/libxc/include/xenctrl.h
>> +++ b/tools/libxc/include/xenctrl.h
>> @@ -296,6 +296,14 @@ typedef struct xc_hypercall_buffer xc_hypercall_buffer_t;
>>       }
>>
>>   /*
>> + * Define a user pointer that can access the hypercall buffer data
>> + *
>> + * Useful when you only need to access the hypercall buffer data
>> + */
>> +#define DECLARE_HYPERCALL_BUFFER_USER_POINTER(_type, _name, _hbuf)  \
>> +    _type *_name = _hbuf->hbuf;
>
> There are some bracketing issues here.  Please refer to my fixup patch
> which I will post as soon as I can.

I've seen those patches, will review them, thank you.

>
> ~Andrew
>
>> +
>> +/*
>>    * Declare the necessary data structure to allow a hypercall buffer
>>    * passed as an argument to a function to be used in the normal way.
>>    */
>
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 00/10] Remus support for Migration-v2
  2015-05-08 18:12 ` [PATCH Remus v2 00/10] Remus support for Migration-v2 Andrew Cooper
@ 2015-05-11  6:28   ` Hongyang Yang
  2015-05-11  9:00     ` Andrew Cooper
  0 siblings, 1 reply; 40+ messages in thread
From: Hongyang Yang @ 2015-05-11  6:28 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram

On 05/09/2015 02:12 AM, Andrew Cooper wrote:
> On 08/05/15 10:33, Yang Hongyang wrote:
>> This patchset implement the Remus support for Migration v2 but without
>> memory compressing.
>>
>> The series can be found on github:
>> https://github.com/macrosheep/xen/tree/Remus-newmig-v2
>>
>> PATCH 1-7: Some refactor and prepare work.
>> PATCH 8-9: The main Remus loop implement.
>> PATCH 10: Fix for Remus.
>
> I have reviewed the other half of the series now, and have some design
> to discuss.  (I was hoping to get this email sent in reply to v1, but
> never mind).  This largely concerns patch 7 and onwards.
>
> Migration v2 has substantially more structure than legacy did.  Once
> issue so far is that your series relies on using more than one END
> record, which is not supported in the spec.  (Of course - the spec is
> fine to be extended in forward-compatible ways.)

I use END record as a info that indicate the end of the stream. I saw
that you add a checkpoint record in your v2 series of Remus related patches,
I can use that record to indicate the end of the checkpointed stream, but
I think the record better to be called as end-of-checkpoint?

>
> To fix the qemu layering issues I need to have some explicit negotiation
> between libxc and libxl about sharing ownership of the input fd.  This
> is going to require a new record in the format, and I currently drafting
> a patch or two which should help in this regard.
>
> My view for the eventual stream looks something like this (time going
> downwards):
>
> libxc writes:                   libxl writes:
>
> Image Header
> Domain Header
> start_of_stream()
> start_of_checkpoint()

<live memory>

ctx->save.callbacks->suspend()
this callback suspend the primary guest and then calls Remus devices
postsuspend callbacks to buffer the network pkts etc.

<last iter of memory>

> end_of_checkpoint()
> Checkpoint record

ctx->save.callbacks->postcopy()
this callback should not be omitted, it do some necessary work before resume
primary (such as call Remus devices preresume callbacks to ensure the disk
data is consistent) and then resume the primary guest. I think this
callback should be renamed to ctx->save.callbacks->resume().

>              ctx->save.callbacks->checkpoint()
>                                  libxl qemu record

Maybe we should add another callback to send qemu record instead of
using checkpoint callback. We can call it ctx->save.callbacks->save_qemu()
Then in checkpoint callback, we only call remus devices commit callbacks(
which will release the network buffer etc...) then decide whether we need to
do another checkpoint or quit checkpointed stream.
With Remus, checkpoint callback only wait for 200ms(can be specified by -i)
then return.
With COLO, checkpoint callback will ask COLO proxy if we need to do a
checkpoint, will return when COLO proxy module indicate a checkpoint is needed.

>                                  ...
>                                  libxl end-of-checkpoint record
>              ctx->save.callbacks->checkpoint() returns
> start_of_checkpoint()

ctx->save.callbacks->suspend()

> <memory>
> end_of_checkpoint()
> Checkpoint record
> etc...
>
> This will eventually allow both libxc and libxl to send checkpoint data
> (and by the looks of it, remove the need for postcopy()).  With this
> libxc/remus work it is fine to use XG_LIBXL_HVM_COMPAT to cover the
> current qemu situation, but I would prefer not to be also retrofitting
> libxc checkpoint records when doing the libxl/migv2 work.
>
> Does this look plausible in for Remus (and eventually COLO) support?

With comments above, I would suggest the save flow as below:

libxc writes:                   libxl writes:

live migration:
Image Header
Domain Header
start_of_stream()
start_of_checkpoint()
<live memory>
ctx->save.callbacks->suspend()
<last iter memory>
end_of_checkpoint()
if ( checkpointd )
   End of Checkpoint record
   /*If resotre side receives this record, input fd should be handed to libxl*/
else
   goto end

loop of checkpointed stream:
ctx->save.callbacks->resume()
ctx->save.callbacks->save_qemu()
                                 libxl qemu record
                                 ...
                                 libxl end-of-checkpoint record
/*If resotre side receives this record, input fd should be handed to libxc*/
ctx->save.callbacks->save_qemu() returns
ctx->save.callbacks->checkpoint()
start_of_checkpoint()
ctx->save.callbacks->suspend()
<memory>
end_of_checkpoint()
End of Checkpoint record
goto 'loop of checkpointed stream'

end:
END record
/*If resotre side receives this record, input fd should be handed to libxl*/


In order to keep it simple, we can keep the current 
ctx->save.callbacks->checkpoint() as it is, which do the save_qemu thing, call
Remus devices commit callbacks and then decide whether we need a checkpoint. We
can also combine the ctx->save.callbacks->resume() with
ctx->save.callbacks->checkpoint(), with only one checkpoint() callback, we do
the following things:
  - Call Remus devices preresume callbacks
  - Resume the primary
  - Save qemu records
  - Call Remus devices commit callbacks
  - Decide whether we need a checkpoint

Overall, there are 3 options for the save flow:
1. keep the current callbacks, rename postcopy() to resume()
2. split the checkpoint() callback to save_qemu() and checkpoint()
3. combine the current postcopy() and checkpoint()
Which one do you think is the best?

>
> ~Andrew
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 00/10] Remus support for Migration-v2
  2015-05-11  6:28   ` Hongyang Yang
@ 2015-05-11  9:00     ` Andrew Cooper
  2015-05-11 10:48       ` Hongyang Yang
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2015-05-11  9:00 UTC (permalink / raw)
  To: Hongyang Yang, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram

On 11/05/15 07:28, Hongyang Yang wrote:
> On 05/09/2015 02:12 AM, Andrew Cooper wrote:
>> On 08/05/15 10:33, Yang Hongyang wrote:
>>> This patchset implement the Remus support for Migration v2 but without
>>> memory compressing.
>>>
>>> The series can be found on github:
>>> https://github.com/macrosheep/xen/tree/Remus-newmig-v2
>>>
>>> PATCH 1-7: Some refactor and prepare work.
>>> PATCH 8-9: The main Remus loop implement.
>>> PATCH 10: Fix for Remus.
>>
>> I have reviewed the other half of the series now, and have some design
>> to discuss.  (I was hoping to get this email sent in reply to v1, but
>> never mind).  This largely concerns patch 7 and onwards.
>>
>> Migration v2 has substantially more structure than legacy did.  Once
>> issue so far is that your series relies on using more than one END
>> record, which is not supported in the spec.  (Of course - the spec is
>> fine to be extended in forward-compatible ways.)
>
> I use END record as a info that indicate the end of the stream.

I suspected this, but is not a backwards compatible use of the migration
v2 stream.  There must only be a single END record, and it must be the
very last record the save side produces.

> I saw
> that you add a checkpoint record in your v2 series of Remus related
> patches,
> I can use that record to indicate the end of the checkpointed stream, but
> I think the record better to be called as end-of-checkpoint?

It is the logical end of the libxc bits of a checkpoint, but not of the
(soon to exist) libxl bits.

>
>>
>> To fix the qemu layering issues I need to have some explicit negotiation
>> between libxc and libxl about sharing ownership of the input fd.  This
>> is going to require a new record in the format, and I currently drafting
>> a patch or two which should help in this regard.
>>
>> My view for the eventual stream looks something like this (time going
>> downwards):
>>
>> libxc writes:                   libxl writes:
>>
>> Image Header
>> Domain Header
>> start_of_stream()
>> start_of_checkpoint()
>
> <live memory>
>
> ctx->save.callbacks->suspend()
> this callback suspend the primary guest and then calls Remus devices
> postsuspend callbacks to buffer the network pkts etc.

Sorry yes - I omitted this call in the example for brevity, but was not
intending to omit it from the code.

>
> <last iter of memory>
>
>> end_of_checkpoint()
>> Checkpoint record
>
> ctx->save.callbacks->postcopy()
> this callback should not be omitted, it do some necessary work before
> resume
> primary (such as call Remus devices preresume callbacks to ensure the
> disk
> data is consistent) and then resume the primary guest. I think this
> callback should be renamed to ctx->save.callbacks->resume().

That looks to be a useful cleanup (and answers one of my questions of
what exactly postcopy was)

>
>>              ctx->save.callbacks->checkpoint()
>>                                  libxl qemu record
>
> Maybe we should add another callback to send qemu record instead of
> using checkpoint callback. We can call it
> ctx->save.callbacks->save_qemu()

This is another layering violation.  libxc should not prescribe what
libxl might or might not do.  One example we are experimenting with in
XenServer at the moment is support for multiple emulators attached to a
single domain, which would necessitate two LIBXL_EMULATOR records to be
sent per checkpoint.  libxl might also want to send an updated json blob
or such.

> Then in checkpoint callback, we only call remus devices commit callbacks(
> which will release the network buffer etc...) then decide whether we
> need to
> do another checkpoint or quit checkpointed stream.
> With Remus, checkpoint callback only wait for 200ms(can be specified
> by -i)
> then return.
> With COLO, checkpoint callback will ask COLO proxy if we need to do a
> checkpoint, will return when COLO proxy module indicate a checkpoint
> is needed.

That sounds like COLO wants a should_checkpoint() callback which
separates the decision to make a checkpoint from the logic of
implementing a checkpoint.

>
>>                                  ...
>>                                  libxl end-of-checkpoint record
>>              ctx->save.callbacks->checkpoint() returns
>> start_of_checkpoint()
>
> ctx->save.callbacks->suspend()
>
>> <memory>
>> end_of_checkpoint()
>> Checkpoint record
>> etc...
>>
>> This will eventually allow both libxc and libxl to send checkpoint data
>> (and by the looks of it, remove the need for postcopy()).  With this
>> libxc/remus work it is fine to use XG_LIBXL_HVM_COMPAT to cover the
>> current qemu situation, but I would prefer not to be also retrofitting
>> libxc checkpoint records when doing the libxl/migv2 work.
>>
>> Does this look plausible in for Remus (and eventually COLO) support?
>
> With comments above, I would suggest the save flow as below:
>
> libxc writes:                   libxl writes:
>
> live migration:
> Image Header
> Domain Header
> start_of_stream()
> start_of_checkpoint()
> <live memory>
> ctx->save.callbacks->suspend()
> <last iter memory>
> end_of_checkpoint()
> if ( checkpointd )
>   End of Checkpoint record
>   /*If resotre side receives this record, input fd should be handed to
> libxl*/
> else
>   goto end
>
> loop of checkpointed stream:
> ctx->save.callbacks->resume()
> ctx->save.callbacks->save_qemu()
>                                 libxl qemu record
>                                 ...
>                                 libxl end-of-checkpoint record
> /*If resotre side receives this record, input fd should be handed to
> libxc*/
> ctx->save.callbacks->save_qemu() returns
> ctx->save.callbacks->checkpoint()
> start_of_checkpoint()
> ctx->save.callbacks->suspend()
> <memory>
> end_of_checkpoint()
> End of Checkpoint record
> goto 'loop of checkpointed stream'
>
> end:
> END record
> /*If resotre side receives this record, input fd should be handed to
> libxl*/
>
>
> In order to keep it simple, we can keep the current
> ctx->save.callbacks->checkpoint() as it is, which do the save_qemu
> thing, call
> Remus devices commit callbacks and then decide whether we need a
> checkpoint. We
> can also combine the ctx->save.callbacks->resume() with
> ctx->save.callbacks->checkpoint(), with only one checkpoint()
> callback, we do
> the following things:
>  - Call Remus devices preresume callbacks
>  - Resume the primary
>  - Save qemu records
>  - Call Remus devices commit callbacks
>  - Decide whether we need a checkpoint
>
> Overall, there are 3 options for the save flow:
> 1. keep the current callbacks, rename postcopy() to resume()
> 2. split the checkpoint() callback to save_qemu() and checkpoint()
> 3. combine the current postcopy() and checkpoint()
> Which one do you think is the best?

I have a 4th alternative in mind, but would like your feedback from my
comments in this email first.

~Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 00/10] Remus support for Migration-v2
  2015-05-11  9:00     ` Andrew Cooper
@ 2015-05-11 10:48       ` Hongyang Yang
  2015-05-11 11:01         ` Andrew Cooper
  0 siblings, 1 reply; 40+ messages in thread
From: Hongyang Yang @ 2015-05-11 10:48 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram


On 05/11/2015 05:00 PM, Andrew Cooper wrote:
> On 11/05/15 07:28, Hongyang Yang wrote:
>> On 05/09/2015 02:12 AM, Andrew Cooper wrote:
>>> On 08/05/15 10:33, Yang Hongyang wrote:
>>>> This patchset implement the Remus support for Migration v2 but without
>>>> memory compressing.
[...]
>
>>
>> <last iter of memory>
>>
>>> end_of_checkpoint()
>>> Checkpoint record
>>
>> ctx->save.callbacks->postcopy()
>> this callback should not be omitted, it do some necessary work before
>> resume
>> primary (such as call Remus devices preresume callbacks to ensure the
>> disk
>> data is consistent) and then resume the primary guest. I think this
>> callback should be renamed to ctx->save.callbacks->resume().
>
> That looks to be a useful cleanup (and answers one of my questions of
> what exactly postcopy was)
>
>>
>>>               ctx->save.callbacks->checkpoint()
>>>                                   libxl qemu record
>>
>> Maybe we should add another callback to send qemu record instead of
>> using checkpoint callback. We can call it
>> ctx->save.callbacks->save_qemu()
>
> This is another layering violation.  libxc should not prescribe what
> libxl might or might not do.  One example we are experimenting with in
> XenServer at the moment is support for multiple emulators attached to a
> single domain, which would necessitate two LIBXL_EMULATOR records to be
> sent per checkpoint.  libxl might also want to send an updated json blob
> or such.

Ok, so we'd better not introduce save_qemu callback.

>
>> Then in checkpoint callback, we only call remus devices commit callbacks(
>> which will release the network buffer etc...) then decide whether we
>> need to
>> do another checkpoint or quit checkpointed stream.
>> With Remus, checkpoint callback only wait for 200ms(can be specified
>> by -i)
>> then return.
>> With COLO, checkpoint callback will ask COLO proxy if we need to do a
>> checkpoint, will return when COLO proxy module indicate a checkpoint
>> is needed.
>
> That sounds like COLO wants a should_checkpoint() callback which
> separates the decision to make a checkpoint from the logic of
> implementing a checkpoint.

We use checkpoint callback to do should_checkpoint() thing currently.
libxc will check the return value of checkpoint callback.

>
>>
>>>                                   ...
>>>                                   libxl end-of-checkpoint record
>>>               ctx->save.callbacks->checkpoint() returns
>>> start_of_checkpoint()
>>
>> ctx->save.callbacks->suspend()
>>
>>> <memory>
>>> end_of_checkpoint()
>>> Checkpoint record
>>> etc...
>>>
>>> This will eventually allow both libxc and libxl to send checkpoint data
>>> (and by the looks of it, remove the need for postcopy()).  With this
>>> libxc/remus work it is fine to use XG_LIBXL_HVM_COMPAT to cover the
>>> current qemu situation, but I would prefer not to be also retrofitting
>>> libxc checkpoint records when doing the libxl/migv2 work.
>>>
>>> Does this look plausible in for Remus (and eventually COLO) support?
>>
>> With comments above, I would suggest the save flow as below:
>>
>> libxc writes:                   libxl writes:
>>
>> live migration:
>> Image Header
>> Domain Header
>> start_of_stream()
>> start_of_checkpoint()
>> <live memory>
>> ctx->save.callbacks->suspend()
>> <last iter memory>
>> end_of_checkpoint()
>> if ( checkpointd )
>>    End of Checkpoint record
>>    /*If resotre side receives this record, input fd should be handed to
>> libxl*/
>> else
>>    goto end
>>
>> loop of checkpointed stream:
>> ctx->save.callbacks->resume()
>> ctx->save.callbacks->save_qemu()
>>                                  libxl qemu record
>>                                  ...
>>                                  libxl end-of-checkpoint record
>> /*If resotre side receives this record, input fd should be handed to
>> libxc*/
>> ctx->save.callbacks->save_qemu() returns
>> ctx->save.callbacks->checkpoint()
>> start_of_checkpoint()
>> ctx->save.callbacks->suspend()
>> <memory>
>> end_of_checkpoint()
>> End of Checkpoint record
>> goto 'loop of checkpointed stream'
>>
>> end:
>> END record
>> /*If resotre side receives this record, input fd should be handed to
>> libxl*/
>>
>>
>> In order to keep it simple, we can keep the current
>> ctx->save.callbacks->checkpoint() as it is, which do the save_qemu
>> thing, call
>> Remus devices commit callbacks and then decide whether we need a
>> checkpoint. We
>> can also combine the ctx->save.callbacks->resume() with
>> ctx->save.callbacks->checkpoint(), with only one checkpoint()
>> callback, we do
>> the following things:
>>   - Call Remus devices preresume callbacks
>>   - Resume the primary
>>   - Save qemu records
>>   - Call Remus devices commit callbacks
>>   - Decide whether we need a checkpoint
>>
>> Overall, there are 3 options for the save flow:
>> 1. keep the current callbacks, rename postcopy() to resume()
>> 2. split the checkpoint() callback to save_qemu() and checkpoint()
>> 3. combine the current postcopy() and checkpoint()
>> Which one do you think is the best?
>
> I have a 4th alternative in mind, but would like your feedback from my
> comments in this email first.

So what's the 4th alternative?

>
> ~Andrew
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 00/10] Remus support for Migration-v2
  2015-05-11 10:48       ` Hongyang Yang
@ 2015-05-11 11:01         ` Andrew Cooper
  2015-05-12  8:12           ` Yang Hongyang
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2015-05-11 11:01 UTC (permalink / raw)
  To: Hongyang Yang, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram

On 11/05/15 11:48, Hongyang Yang wrote:
>
> On 05/11/2015 05:00 PM, Andrew Cooper wrote:
>> On 11/05/15 07:28, Hongyang Yang wrote:
>>> On 05/09/2015 02:12 AM, Andrew Cooper wrote:
>>>> On 08/05/15 10:33, Yang Hongyang wrote:
>>>>> This patchset implement the Remus support for Migration v2 but
>>>>> without
>>>>> memory compressing.
> [...]
>>
>>>
>>> <last iter of memory>
>>>
>>>> end_of_checkpoint()
>>>> Checkpoint record
>>>
>>> ctx->save.callbacks->postcopy()
>>> this callback should not be omitted, it do some necessary work before
>>> resume
>>> primary (such as call Remus devices preresume callbacks to ensure the
>>> disk
>>> data is consistent) and then resume the primary guest. I think this
>>> callback should be renamed to ctx->save.callbacks->resume().
>>
>> That looks to be a useful cleanup (and answers one of my questions of
>> what exactly postcopy was)
>>
>>>
>>>>               ctx->save.callbacks->checkpoint()
>>>>                                   libxl qemu record
>>>
>>> Maybe we should add another callback to send qemu record instead of
>>> using checkpoint callback. We can call it
>>> ctx->save.callbacks->save_qemu()
>>
>> This is another layering violation.  libxc should not prescribe what
>> libxl might or might not do.  One example we are experimenting with in
>> XenServer at the moment is support for multiple emulators attached to a
>> single domain, which would necessitate two LIBXL_EMULATOR records to be
>> sent per checkpoint.  libxl might also want to send an updated json blob
>> or such.
>
> Ok, so we'd better not introduce save_qemu callback.
>
>>
>>> Then in checkpoint callback, we only call remus devices commit
>>> callbacks(
>>> which will release the network buffer etc...) then decide whether we
>>> need to
>>> do another checkpoint or quit checkpointed stream.
>>> With Remus, checkpoint callback only wait for 200ms(can be specified
>>> by -i)
>>> then return.
>>> With COLO, checkpoint callback will ask COLO proxy if we need to do a
>>> checkpoint, will return when COLO proxy module indicate a checkpoint
>>> is needed.
>>
>> That sounds like COLO wants a should_checkpoint() callback which
>> separates the decision to make a checkpoint from the logic of
>> implementing a checkpoint.
>
> We use checkpoint callback to do should_checkpoint() thing currently.
> libxc will check the return value of checkpoint callback.

But that causes a chicken & egg problem.

I am planning to use a CHECKPOINT record to synchronise the transfer of
ownership of the FD between libxc and libxl.  Therefore, a CHECKPOINT
record must be in the stream ahead of the checkpoint() callback, as
libxl will then write/read some records in itself.

As a result, the checkpoint() callback itself can't be used to gate
whether a CHECKPOINT record is written by libxc.

>
>>
>>>
>>>>                                   ...
>>>>                                   libxl end-of-checkpoint record
>>>>               ctx->save.callbacks->checkpoint() returns
>>>> start_of_checkpoint()
>>>
>>> ctx->save.callbacks->suspend()
>>>
>>>> <memory>
>>>> end_of_checkpoint()
>>>> Checkpoint record
>>>> etc...
>>>>
>>>> This will eventually allow both libxc and libxl to send checkpoint
>>>> data
>>>> (and by the looks of it, remove the need for postcopy()).  With this
>>>> libxc/remus work it is fine to use XG_LIBXL_HVM_COMPAT to cover the
>>>> current qemu situation, but I would prefer not to be also retrofitting
>>>> libxc checkpoint records when doing the libxl/migv2 work.
>>>>
>>>> Does this look plausible in for Remus (and eventually COLO) support?
>>>
>>> With comments above, I would suggest the save flow as below:
>>>
>>> libxc writes:                   libxl writes:
>>>
>>> live migration:
>>> Image Header
>>> Domain Header
>>> start_of_stream()
>>> start_of_checkpoint()
>>> <live memory>
>>> ctx->save.callbacks->suspend()
>>> <last iter memory>
>>> end_of_checkpoint()
>>> if ( checkpointd )
>>>    End of Checkpoint record
>>>    /*If resotre side receives this record, input fd should be handed to
>>> libxl*/
>>> else
>>>    goto end
>>>
>>> loop of checkpointed stream:
>>> ctx->save.callbacks->resume()
>>> ctx->save.callbacks->save_qemu()
>>>                                  libxl qemu record
>>>                                  ...
>>>                                  libxl end-of-checkpoint record
>>> /*If resotre side receives this record, input fd should be handed to
>>> libxc*/
>>> ctx->save.callbacks->save_qemu() returns
>>> ctx->save.callbacks->checkpoint()
>>> start_of_checkpoint()
>>> ctx->save.callbacks->suspend()
>>> <memory>
>>> end_of_checkpoint()
>>> End of Checkpoint record
>>> goto 'loop of checkpointed stream'
>>>
>>> end:
>>> END record
>>> /*If resotre side receives this record, input fd should be handed to
>>> libxl*/
>>>
>>>
>>> In order to keep it simple, we can keep the current
>>> ctx->save.callbacks->checkpoint() as it is, which do the save_qemu
>>> thing, call
>>> Remus devices commit callbacks and then decide whether we need a
>>> checkpoint. We
>>> can also combine the ctx->save.callbacks->resume() with
>>> ctx->save.callbacks->checkpoint(), with only one checkpoint()
>>> callback, we do
>>> the following things:
>>>   - Call Remus devices preresume callbacks
>>>   - Resume the primary
>>>   - Save qemu records
>>>   - Call Remus devices commit callbacks
>>>   - Decide whether we need a checkpoint
>>>
>>> Overall, there are 3 options for the save flow:
>>> 1. keep the current callbacks, rename postcopy() to resume()
>>> 2. split the checkpoint() callback to save_qemu() and checkpoint()
>>> 3. combine the current postcopy() and checkpoint()
>>> Which one do you think is the best?
>>
>> I have a 4th alternative in mind, but would like your feedback from my
>> comments in this email first.
>
> So what's the 4th alternative?

I have some corrections to my patch series based on David's feedback,
and your comments.  After that, it should hopefully be far easier to
describe.

~Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 02/10] tools/libxc: introduce setup() and cleanup() on save
  2015-05-11  1:20         ` Hongyang Yang
@ 2015-05-11 11:47           ` Ian Campbell
  2015-05-11 11:49             ` Ian Campbell
  0 siblings, 1 reply; 40+ messages in thread
From: Ian Campbell @ 2015-05-11 11:47 UTC (permalink / raw)
  To: Hongyang Yang
  Cc: wei.liu2, eddie.dong, wency, Andrew Cooper, yunhong.jiang,
	ian.jackson, xen-devel, rshriram

On Mon, 2015-05-11 at 09:20 +0800, Hongyang Yang wrote:
> 
> On 05/08/2015 06:08 PM, Andrew Cooper wrote:
> > On 08/05/15 10:59, Hongyang Yang wrote:
> >>
> >>>
> >>> In general, a good change, but some comments...
> >>>
> >>>> ---
> >>>>    tools/libxc/xc_sr_save.c | 72
> >>>> +++++++++++++++++++++++++++++-------------------
> >>>>    1 file changed, 44 insertions(+), 28 deletions(-)
> >>>>
> >>>> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
> >>>> index cc3e6b1..2394bc4 100644
> >>>> --- a/tools/libxc/xc_sr_save.c
> >>>> +++ b/tools/libxc/xc_sr_save.c
> >>>> @@ -607,13 +607,10 @@ static int send_domain_memory_nonlive(struct
> >>>> xc_sr_context *ctx)
> >>>>        return rc;
> >>>>    }
> >>>>
> >>>> -/*
> >>>> - * Save a domain.
> >>>> - */
> >>>> -static int save(struct xc_sr_context *ctx, uint16_t guest_type)
> >>>> +static int setup(struct xc_sr_context *ctx)
> >>>>    {
> >>>>        xc_interface *xch = ctx->xch;
> >>>> -    int rc, saved_rc = 0, saved_errno = 0;
> >>>> +    int rc;
> >>>>        DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
> >>>>                                        (&ctx->save.dirty_bitmap_hbuf));
> >>>>
> >>>> @@ -632,13 +629,51 @@ static int save(struct xc_sr_context *ctx,
> >>>> uint16_t guest_type)
> >>>>            goto err;
> >>>>        }
> >>>>
> >>>> -    IPRINTF("Saving domain %d, type %s",
> >>>> -            ctx->domid, dhdr_type_to_str(guest_type));
> >>>> -
> >>>>        rc = ctx->save.ops.setup(ctx);
> >>>>        if ( rc )
> >>>>            goto err;
> >>>>
> >>>> +    rc = 0;
> >>>> +
> >>>> + err:
> >>>> +    return rc;
> >>>> +
> >>>> +}
> >>>> +
> >>>> +static void cleanup(struct xc_sr_context *ctx)
> >>>> +{
> >>>> +    xc_interface *xch = ctx->xch;
> >>>> +    DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
> >>>> +                                    (&ctx->save.dirty_bitmap_hbuf));
> >>>> +
> >>>> +    xc_shadow_control(xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_OFF,
> >>>> +                      NULL, 0, NULL, 0, NULL);
> >>>> +
> >>>> +    if ( ctx->save.ops.cleanup(ctx) )
> >>>> +        PERROR("Failed to clean up");
> >>>> +
> >>>> +    if ( dirty_bitmap )
> >>>> +        xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
> >>>> +
> >>>> NRPAGES(bitmap_size(ctx->save.p2m_size)));
> >>>
> >>> xc_hypercall_buffer_free_pages() if fine dealing with NULL, just like
> >>> free() is.  You can drop the conditional.
> >>
> >> Actually this is another trick that I need to deal with those
> >> hypercall macros.
> >> DECLARE_HYPERCALL_BUFFER_SHADOW will define a user pointer "dirty_bitmap"
> >> and a shadow buffer, although xc_hypercall_buffer_free_pages takes
> >> "dirty_bitmap" as an augument, but it is also a MACRO, without
> >> "if ( dirty_bitmap )", the compiler will report "dirty_bitmap" unused
> >> error...
> >
> > Ah, in which case you would be better using
> > xc__hypercall_buffer_free_pages() and not creating the local shadow in
> > the first place.
> 
> I thought we'd better use those MACROs which described in the comments...
> If it is OK to use xc__hypercall_buffer_free_pages(), I will fix it in
> the next version.

If anything I think I'd prefer for the if to move inside the
xc_hypercall_buffer_free_pages macro.

Ian.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 02/10] tools/libxc: introduce setup() and cleanup() on save
  2015-05-11 11:47           ` Ian Campbell
@ 2015-05-11 11:49             ` Ian Campbell
  2015-05-12  7:04               ` Yang Hongyang
  0 siblings, 1 reply; 40+ messages in thread
From: Ian Campbell @ 2015-05-11 11:49 UTC (permalink / raw)
  To: Hongyang Yang
  Cc: wei.liu2, eddie.dong, wency, Andrew Cooper, yunhong.jiang,
	ian.jackson, xen-devel, rshriram

On Mon, 2015-05-11 at 12:47 +0100, Ian Campbell wrote:
> > >> Actually this is another trick that I need to deal with those
> > >> hypercall macros.
> > >> DECLARE_HYPERCALL_BUFFER_SHADOW will define a user pointer "dirty_bitmap"
> > >> and a shadow buffer, although xc_hypercall_buffer_free_pages takes
> > >> "dirty_bitmap" as an augument, but it is also a MACRO, without
> > >> "if ( dirty_bitmap )", the compiler will report "dirty_bitmap" unused
> > >> error...
> > >
> > > Ah, in which case you would be better using
> > > xc__hypercall_buffer_free_pages() and not creating the local shadow in
> > > the first place.
> > 
> > I thought we'd better use those MACROs which described in the comments...
> > If it is OK to use xc__hypercall_buffer_free_pages(), I will fix it in
> > the next version.
> 
> If anything I think I'd prefer for the if to move inside the
> xc_hypercall_buffer_free_pages macro.

Or if you want to use the "raw" xc__hypercall_buffer variant you should
be consistent and do the same on allocation.

Ian.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 01/10] tools/libxc: adjust the memory allocation for migration
  2015-05-08  9:33 ` [PATCH Remus v2 01/10] tools/libxc: adjust the memory allocation for migration Yang Hongyang
  2015-05-08  9:51   ` Andrew Cooper
@ 2015-05-11 11:50   ` Ian Campbell
  2015-05-12  6:43     ` Hongyang Yang
  1 sibling, 1 reply; 40+ messages in thread
From: Ian Campbell @ 2015-05-11 11:50 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, eddie.dong, wency, andrew.cooper3, yunhong.jiang,
	ian.jackson, xen-devel, rshriram

On Fri, 2015-05-08 at 17:33 +0800, Yang Hongyang wrote:
> Move the memory allocation before the concrete live/nolive save
> in order to avoid the free/alloc memory loop when using Remus.
> Rename the to_send bitmap to dirty_bitmap.

Doing the rename in a precursor patch would make the functional change
much easier to see.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 04/10] tools/libxc: introduce DECLARE_HYPERCALL_BUFFER_USER_POINTER
  2015-05-08  9:33 ` [PATCH Remus v2 04/10] tools/libxc: introduce DECLARE_HYPERCALL_BUFFER_USER_POINTER Yang Hongyang
  2015-05-08 10:16   ` Andrew Cooper
@ 2015-05-11 11:53   ` Ian Campbell
  2015-05-12  7:18     ` Yang Hongyang
  1 sibling, 1 reply; 40+ messages in thread
From: Ian Campbell @ 2015-05-11 11:53 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, rshriram, ian.jackson

On Fri, 2015-05-08 at 17:33 +0800, Yang Hongyang wrote:
> Define a user pointer that can access the hypercall buffer data
> Useful when you only need to access the hypercall buffer data

Can you expand on this please, I think the context is a hypercall buffer
passed as an argument or as a member of a struct. Please describe why
DECLARE_HYPERCALL_BUFFER_ARGUMENT and/or DECLARE_HYPERCALL_BUFFER_SHADOW
are not usable here.

Ian.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 01/10] tools/libxc: adjust the memory allocation for migration
  2015-05-11 11:50   ` Ian Campbell
@ 2015-05-12  6:43     ` Hongyang Yang
  0 siblings, 0 replies; 40+ messages in thread
From: Hongyang Yang @ 2015-05-12  6:43 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, eddie.dong, wency, andrew.cooper3, yunhong.jiang,
	ian.jackson, xen-devel, rshriram



On 05/11/2015 07:50 PM, Ian Campbell wrote:
> On Fri, 2015-05-08 at 17:33 +0800, Yang Hongyang wrote:
>> Move the memory allocation before the concrete live/nolive save
>> in order to avoid the free/alloc memory loop when using Remus.
>> Rename the to_send bitmap to dirty_bitmap.
>
> Doing the rename in a precursor patch would make the functional change
> much easier to see.

Thank you! I've split the rename patch!

>
>
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 02/10] tools/libxc: introduce setup() and cleanup() on save
  2015-05-11 11:49             ` Ian Campbell
@ 2015-05-12  7:04               ` Yang Hongyang
  0 siblings, 0 replies; 40+ messages in thread
From: Yang Hongyang @ 2015-05-12  7:04 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, eddie.dong, wency, Andrew Cooper, yunhong.jiang,
	ian.jackson, xen-devel, rshriram

On 05/11/2015 07:49 PM, Ian Campbell wrote:
> On Mon, 2015-05-11 at 12:47 +0100, Ian Campbell wrote:
>>>>> Actually this is another trick that I need to deal with those
>>>>> hypercall macros.
>>>>> DECLARE_HYPERCALL_BUFFER_SHADOW will define a user pointer "dirty_bitmap"
>>>>> and a shadow buffer, although xc_hypercall_buffer_free_pages takes
>>>>> "dirty_bitmap" as an augument, but it is also a MACRO, without
>>>>> "if ( dirty_bitmap )", the compiler will report "dirty_bitmap" unused
>>>>> error...
>>>>
>>>> Ah, in which case you would be better using
>>>> xc__hypercall_buffer_free_pages() and not creating the local shadow in
>>>> the first place.
>>>
>>> I thought we'd better use those MACROs which described in the comments...
>>> If it is OK to use xc__hypercall_buffer_free_pages(), I will fix it in
>>> the next version.
>>
>> If anything I think I'd prefer for the if to move inside the
>> xc_hypercall_buffer_free_pages macro.
>
> Or if you want to use the "raw" xc__hypercall_buffer variant you should
> be consistent and do the same on allocation.

I think I will add a patch that move the if inside the
xc_hypercall_buffer_free_pages macro.

>
> Ian.
>
>
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 04/10] tools/libxc: introduce DECLARE_HYPERCALL_BUFFER_USER_POINTER
  2015-05-11 11:53   ` Ian Campbell
@ 2015-05-12  7:18     ` Yang Hongyang
  2015-05-12  8:19       ` Ian Campbell
  0 siblings, 1 reply; 40+ messages in thread
From: Yang Hongyang @ 2015-05-12  7:18 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, rshriram, ian.jackson



On 05/11/2015 07:53 PM, Ian Campbell wrote:
> On Fri, 2015-05-08 at 17:33 +0800, Yang Hongyang wrote:
>> Define a user pointer that can access the hypercall buffer data
>> Useful when you only need to access the hypercall buffer data
>
> Can you expand on this please, I think the context is a hypercall buffer
> passed as an argument or as a member of a struct. Please describe why
> DECLARE_HYPERCALL_BUFFER_ARGUMENT and/or DECLARE_HYPERCALL_BUFFER_SHADOW
> are not usable here.

There are cases that we only need to use the hypercall buffer data, and do
not use the xc_hypercall_buffer_t struct.
DECLARE_HYPERCALL_BUFFER_ARGUMENT will only define a xc_hypercall_buffer_t,
while DECLARE_HYPERCALL_BUFFER_SHADOW do define a user pointer that can allow
us to access the hypercall buffer data but it also define a
xc_hypercall_buffer_t that we don't use, the compiler will report arg unused
error.

The cases for example:
In send_all_pages(), we only need to use the haypercall buffer data which is a
dirty bitmap, we set the dirty bitmap to all dirty and call send_dirty_pages,
we will not use the xc_hypercall_buffer_t and hypercall to retrive the dirty
bitmap.
In send_some_pages(), we will also only need to use the dirty_bitmap. the
retrive dirty bitmap hypercall are done by the caller.

>
> Ian.
>
>
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 00/10] Remus support for Migration-v2
  2015-05-11 11:01         ` Andrew Cooper
@ 2015-05-12  8:12           ` Yang Hongyang
  2015-05-12  9:40             ` Andrew Cooper
  0 siblings, 1 reply; 40+ messages in thread
From: Yang Hongyang @ 2015-05-12  8:12 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram



On 05/11/2015 07:01 PM, Andrew Cooper wrote:
> On 11/05/15 11:48, Hongyang Yang wrote:
>>
>> On 05/11/2015 05:00 PM, Andrew Cooper wrote:
>>> On 11/05/15 07:28, Hongyang Yang wrote:
>>>> On 05/09/2015 02:12 AM, Andrew Cooper wrote:
>>>>> On 08/05/15 10:33, Yang Hongyang wrote:
>>>>>> This patchset implement the Remus support for Migration v2 but
>>>>>> without
>>>>>> memory compressing.
>> [...]
>>>
>>>>
>>>> <last iter of memory>
>>>>
>>>>> end_of_checkpoint()
>>>>> Checkpoint record
>>>>
>>>> ctx->save.callbacks->postcopy()
>>>> this callback should not be omitted, it do some necessary work before
>>>> resume
>>>> primary (such as call Remus devices preresume callbacks to ensure the
>>>> disk
>>>> data is consistent) and then resume the primary guest. I think this
>>>> callback should be renamed to ctx->save.callbacks->resume().
>>>
>>> That looks to be a useful cleanup (and answers one of my questions of
>>> what exactly postcopy was)
>>>
>>>>
>>>>>                ctx->save.callbacks->checkpoint()
>>>>>                                    libxl qemu record
>>>>
>>>> Maybe we should add another callback to send qemu record instead of
>>>> using checkpoint callback. We can call it
>>>> ctx->save.callbacks->save_qemu()
>>>
>>> This is another layering violation.  libxc should not prescribe what
>>> libxl might or might not do.  One example we are experimenting with in
>>> XenServer at the moment is support for multiple emulators attached to a
>>> single domain, which would necessitate two LIBXL_EMULATOR records to be
>>> sent per checkpoint.  libxl might also want to send an updated json blob
>>> or such.
>>
>> Ok, so we'd better not introduce save_qemu callback.
>>
>>>
>>>> Then in checkpoint callback, we only call remus devices commit
>>>> callbacks(
>>>> which will release the network buffer etc...) then decide whether we
>>>> need to
>>>> do another checkpoint or quit checkpointed stream.
>>>> With Remus, checkpoint callback only wait for 200ms(can be specified
>>>> by -i)
>>>> then return.
>>>> With COLO, checkpoint callback will ask COLO proxy if we need to do a
>>>> checkpoint, will return when COLO proxy module indicate a checkpoint
>>>> is needed.
>>>
>>> That sounds like COLO wants a should_checkpoint() callback which
>>> separates the decision to make a checkpoint from the logic of
>>> implementing a checkpoint.
>>
>> We use checkpoint callback to do should_checkpoint() thing currently.
>> libxc will check the return value of checkpoint callback.
>
> But that causes a chicken & egg problem.
>
> I am planning to use a CHECKPOINT record to synchronise the transfer of
> ownership of the FD between libxc and libxl.  Therefore, a CHECKPOINT
> record must be in the stream ahead of the checkpoint() callback, as
> libxl will then write/read some records in itself.

The record name CHECKPOINT seems do not match the thing what you are
planning to do, In this case I think END-OF-CHECKPOINT which represent the
END of libxc side checkpoint is better, when libxc side checkpoint end,
libxc should transfer the ownership of FD to libxl and let libxl to
handle the following stream. libxl side can also use END-OF-CHECKPOINT
as a sign to hand the ownership of the FD to libxc.

>
> As a result, the checkpoint() callback itself can't be used to gate
> whether a CHECKPOINT record is written by libxc.

I was wondering how you will do the FD transfer job?

>
>>
>>>
>>>>
>>>>>                                    ...
>>>>>                                    libxl end-of-checkpoint record
>>>>>                ctx->save.callbacks->checkpoint() returns
>>>>> start_of_checkpoint()
>>>>
>>>> ctx->save.callbacks->suspend()
>>>>
>>>>> <memory>
>>>>> end_of_checkpoint()
>>>>> Checkpoint record
>>>>> etc...
>>>>>
>>>>> This will eventually allow both libxc and libxl to send checkpoint
>>>>> data
>>>>> (and by the looks of it, remove the need for postcopy()).  With this
>>>>> libxc/remus work it is fine to use XG_LIBXL_HVM_COMPAT to cover the
>>>>> current qemu situation, but I would prefer not to be also retrofitting
>>>>> libxc checkpoint records when doing the libxl/migv2 work.
>>>>>
>>>>> Does this look plausible in for Remus (and eventually COLO) support?
>>>>
>>>> With comments above, I would suggest the save flow as below:
>>>>
>>>> libxc writes:                   libxl writes:
>>>>
>>>> live migration:
>>>> Image Header
>>>> Domain Header
>>>> start_of_stream()
>>>> start_of_checkpoint()
>>>> <live memory>
>>>> ctx->save.callbacks->suspend()
>>>> <last iter memory>
>>>> end_of_checkpoint()
>>>> if ( checkpointd )
>>>>     End of Checkpoint record
>>>>     /*If resotre side receives this record, input fd should be handed to
>>>> libxl*/
>>>> else
>>>>     goto end
>>>>
>>>> loop of checkpointed stream:
>>>> ctx->save.callbacks->resume()
>>>> ctx->save.callbacks->save_qemu()
>>>>                                   libxl qemu record
>>>>                                   ...
>>>>                                   libxl end-of-checkpoint record
>>>> /*If resotre side receives this record, input fd should be handed to
>>>> libxc*/
>>>> ctx->save.callbacks->save_qemu() returns
>>>> ctx->save.callbacks->checkpoint()
>>>> start_of_checkpoint()
>>>> ctx->save.callbacks->suspend()
>>>> <memory>
>>>> end_of_checkpoint()
>>>> End of Checkpoint record
>>>> goto 'loop of checkpointed stream'
>>>>
>>>> end:
>>>> END record
>>>> /*If resotre side receives this record, input fd should be handed to
>>>> libxl*/
>>>>
>>>>
>>>> In order to keep it simple, we can keep the current
>>>> ctx->save.callbacks->checkpoint() as it is, which do the save_qemu
>>>> thing, call
>>>> Remus devices commit callbacks and then decide whether we need a
>>>> checkpoint. We
>>>> can also combine the ctx->save.callbacks->resume() with
>>>> ctx->save.callbacks->checkpoint(), with only one checkpoint()
>>>> callback, we do
>>>> the following things:
>>>>    - Call Remus devices preresume callbacks
>>>>    - Resume the primary
>>>>    - Save qemu records
>>>>    - Call Remus devices commit callbacks
>>>>    - Decide whether we need a checkpoint
>>>>
>>>> Overall, there are 3 options for the save flow:
>>>> 1. keep the current callbacks, rename postcopy() to resume()
>>>> 2. split the checkpoint() callback to save_qemu() and checkpoint()
>>>> 3. combine the current postcopy() and checkpoint()
>>>> Which one do you think is the best?
>>>
>>> I have a 4th alternative in mind, but would like your feedback from my
>>> comments in this email first.
>>
>> So what's the 4th alternative?
>
> I have some corrections to my patch series based on David's feedback,
> and your comments.  After that, it should hopefully be far easier to
> describe.

OK, I've addressed all comments on my series and wait for your series
to continue :-)

>
> ~Andrew
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 04/10] tools/libxc: introduce DECLARE_HYPERCALL_BUFFER_USER_POINTER
  2015-05-12  7:18     ` Yang Hongyang
@ 2015-05-12  8:19       ` Ian Campbell
  2015-05-12  9:24         ` Yang Hongyang
  0 siblings, 1 reply; 40+ messages in thread
From: Ian Campbell @ 2015-05-12  8:19 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, rshriram, ian.jackson

On Tue, 2015-05-12 at 15:18 +0800, Yang Hongyang wrote:
> 
> On 05/11/2015 07:53 PM, Ian Campbell wrote:
> > On Fri, 2015-05-08 at 17:33 +0800, Yang Hongyang wrote:
> >> Define a user pointer that can access the hypercall buffer data
> >> Useful when you only need to access the hypercall buffer data
> >
> > Can you expand on this please, I think the context is a hypercall buffer
> > passed as an argument or as a member of a struct. Please describe why
> > DECLARE_HYPERCALL_BUFFER_ARGUMENT and/or DECLARE_HYPERCALL_BUFFER_SHADOW
> > are not usable here.
> 
> There are cases that we only need to use the hypercall buffer data, and do
> not use the xc_hypercall_buffer_t struct.
>
> DECLARE_HYPERCALL_BUFFER_ARGUMENT will only define a xc_hypercall_buffer_t,
> while DECLARE_HYPERCALL_BUFFER_SHADOW do define a user pointer that can allow
> us to access the hypercall buffer data but it also define a
> xc_hypercall_buffer_t that we don't use, the compiler will report arg unused
> error.
> 
> The cases for example:
> In send_all_pages(), we only need to use the haypercall buffer data which is a
> dirty bitmap, we set the dirty bitmap to all dirty and call send_dirty_pages,
> we will not use the xc_hypercall_buffer_t and hypercall to retrive the dirty
> bitmap.
> In send_some_pages(), we will also only need to use the dirty_bitmap. the
> retrive dirty bitmap hypercall are done by the caller.

Thanks. Please include the bulk of this description in the commit
message.

However, perhaps we should just mark the xc_hypercall_buffer_t as
potentially unused, by tagging it with  __attribute__((unused)), in some
or all of the DECLARE_HYPERCALL_* variants. Then I think you could use
DECLARE_HYPERCALL_BUFFER_SHADOW as is?

My argument here would be that the xc_hypercall_buffer_t created here is
a hidden part of the internal hcall buf infrastructure and should be
transparent to the user's code. The user's code only interacts with it
via the magic macros which take the "user pointer" name, not the mangled
xc_hypercall_buffer_t's name.

I'm wary of creating too many accessors macros, or of creating too many
back doors around the attempts to steer code away from using regular
memory to pass to hypercalls.

Ian.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 04/10] tools/libxc: introduce DECLARE_HYPERCALL_BUFFER_USER_POINTER
  2015-05-12  8:19       ` Ian Campbell
@ 2015-05-12  9:24         ` Yang Hongyang
  2015-05-12  9:43           ` Ian Campbell
  0 siblings, 1 reply; 40+ messages in thread
From: Yang Hongyang @ 2015-05-12  9:24 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, rshriram, ian.jackson



On 05/12/2015 04:19 PM, Ian Campbell wrote:
> On Tue, 2015-05-12 at 15:18 +0800, Yang Hongyang wrote:
>>
>> On 05/11/2015 07:53 PM, Ian Campbell wrote:
>>> On Fri, 2015-05-08 at 17:33 +0800, Yang Hongyang wrote:
>>>> Define a user pointer that can access the hypercall buffer data
>>>> Useful when you only need to access the hypercall buffer data
>>>
>>> Can you expand on this please, I think the context is a hypercall buffer
>>> passed as an argument or as a member of a struct. Please describe why
>>> DECLARE_HYPERCALL_BUFFER_ARGUMENT and/or DECLARE_HYPERCALL_BUFFER_SHADOW
>>> are not usable here.
>>
>> There are cases that we only need to use the hypercall buffer data, and do
>> not use the xc_hypercall_buffer_t struct.
>>
>> DECLARE_HYPERCALL_BUFFER_ARGUMENT will only define a xc_hypercall_buffer_t,
>> while DECLARE_HYPERCALL_BUFFER_SHADOW do define a user pointer that can allow
>> us to access the hypercall buffer data but it also define a
>> xc_hypercall_buffer_t that we don't use, the compiler will report arg unused
>> error.
>>
>> The cases for example:
>> In send_all_pages(), we only need to use the haypercall buffer data which is a
>> dirty bitmap, we set the dirty bitmap to all dirty and call send_dirty_pages,
>> we will not use the xc_hypercall_buffer_t and hypercall to retrive the dirty
>> bitmap.
>> In send_some_pages(), we will also only need to use the dirty_bitmap. the
>> retrive dirty bitmap hypercall are done by the caller.
>
> Thanks. Please include the bulk of this description in the commit
> message.
>
> However, perhaps we should just mark the xc_hypercall_buffer_t as
> potentially unused, by tagging it with  __attribute__((unused)), in some
> or all of the DECLARE_HYPERCALL_* variants. Then I think you could use
> DECLARE_HYPERCALL_BUFFER_SHADOW as is?

If __attribute__((unused)) won't cause other problems here, I also prefer
to add this to DECLARE_HYPERCALL_BUFFER_SHADOW instead of create another
macro, I think only DECLARE_HYPERCALL_BUFFER_SHADOW may need to set this
attribute because this is a shadow, we might not use the xc_hypercall_buffer_t
which already allocated.

>
> My argument here would be that the xc_hypercall_buffer_t created here is
> a hidden part of the internal hcall buf infrastructure and should be
> transparent to the user's code. The user's code only interacts with it
> via the magic macros which take the "user pointer" name, not the mangled
> xc_hypercall_buffer_t's name.
>
> I'm wary of creating too many accessors macros, or of creating too many
> back doors around the attempts to steer code away from using regular
> memory to pass to hypercalls.
>
> Ian.
>
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 00/10] Remus support for Migration-v2
  2015-05-12  8:12           ` Yang Hongyang
@ 2015-05-12  9:40             ` Andrew Cooper
  2015-05-12 10:02               ` Yang Hongyang
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Cooper @ 2015-05-12  9:40 UTC (permalink / raw)
  To: Yang Hongyang, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram

On 12/05/15 09:12, Yang Hongyang wrote:
>
>>>> That sounds like COLO wants a should_checkpoint() callback which
>>>> separates the decision to make a checkpoint from the logic of
>>>> implementing a checkpoint.
>>>
>>> We use checkpoint callback to do should_checkpoint() thing currently.
>>> libxc will check the return value of checkpoint callback.
>>
>> But that causes a chicken & egg problem.
>>
>> I am planning to use a CHECKPOINT record to synchronise the transfer of
>> ownership of the FD between libxc and libxl.  Therefore, a CHECKPOINT
>> record must be in the stream ahead of the checkpoint() callback, as
>> libxl will then write/read some records in itself.
>
> The record name CHECKPOINT seems do not match the thing what you are
> planning to do, In this case I think END-OF-CHECKPOINT which represent
> the
> END of libxc side checkpoint is better, when libxc side checkpoint end,
> libxc should transfer the ownership of FD to libxl and let libxl to
> handle the following stream. libxl side can also use END-OF-CHECKPOINT
> as a sign to hand the ownership of the FD to libxc.

END_OF_CHECKPOINT implies the presence of START_OF_CHECKPOINT.  The
current spec for CHECKPOINT is more of a sentinal value between
checkpoints of data.

>
>>
>> As a result, the checkpoint() callback itself can't be used to gate
>> whether a CHECKPOINT record is written by libxc.
>
> I was wondering how you will do the FD transfer job?

The FD needs to be readable/writable in both the libxl and
libxl-save-helper processes.  The CHECKPOINT record simply signals a
transfer of ownership.

>>>> I have a 4th alternative in mind, but would like your feedback from my
>>>> comments in this email first.
>>>
>>> So what's the 4th alternative?
>>
>> I have some corrections to my patch series based on David's feedback,
>> and your comments.  After that, it should hopefully be far easier to
>> describe.
>
> OK, I've addressed all comments on my series and wait for your series
> to continue :-)

Sent.  Sorry for the delay (I also have some XenServer issues I am
working on atm).

~Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 04/10] tools/libxc: introduce DECLARE_HYPERCALL_BUFFER_USER_POINTER
  2015-05-12  9:24         ` Yang Hongyang
@ 2015-05-12  9:43           ` Ian Campbell
  2015-05-12  9:48             ` Yang Hongyang
  0 siblings, 1 reply; 40+ messages in thread
From: Ian Campbell @ 2015-05-12  9:43 UTC (permalink / raw)
  To: Yang Hongyang
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, rshriram, ian.jackson

On Tue, 2015-05-12 at 17:24 +0800, Yang Hongyang wrote:
> 
> On 05/12/2015 04:19 PM, Ian Campbell wrote:
> > On Tue, 2015-05-12 at 15:18 +0800, Yang Hongyang wrote:
> >>
> >> On 05/11/2015 07:53 PM, Ian Campbell wrote:
> >>> On Fri, 2015-05-08 at 17:33 +0800, Yang Hongyang wrote:
> >>>> Define a user pointer that can access the hypercall buffer data
> >>>> Useful when you only need to access the hypercall buffer data
> >>>
> >>> Can you expand on this please, I think the context is a hypercall buffer
> >>> passed as an argument or as a member of a struct. Please describe why
> >>> DECLARE_HYPERCALL_BUFFER_ARGUMENT and/or DECLARE_HYPERCALL_BUFFER_SHADOW
> >>> are not usable here.
> >>
> >> There are cases that we only need to use the hypercall buffer data, and do
> >> not use the xc_hypercall_buffer_t struct.
> >>
> >> DECLARE_HYPERCALL_BUFFER_ARGUMENT will only define a xc_hypercall_buffer_t,
> >> while DECLARE_HYPERCALL_BUFFER_SHADOW do define a user pointer that can allow
> >> us to access the hypercall buffer data but it also define a
> >> xc_hypercall_buffer_t that we don't use, the compiler will report arg unused
> >> error.
> >>
> >> The cases for example:
> >> In send_all_pages(), we only need to use the haypercall buffer data which is a
> >> dirty bitmap, we set the dirty bitmap to all dirty and call send_dirty_pages,
> >> we will not use the xc_hypercall_buffer_t and hypercall to retrive the dirty
> >> bitmap.
> >> In send_some_pages(), we will also only need to use the dirty_bitmap. the
> >> retrive dirty bitmap hypercall are done by the caller.
> >
> > Thanks. Please include the bulk of this description in the commit
> > message.
> >
> > However, perhaps we should just mark the xc_hypercall_buffer_t as
> > potentially unused, by tagging it with  __attribute__((unused)), in some
> > or all of the DECLARE_HYPERCALL_* variants. Then I think you could use
> > DECLARE_HYPERCALL_BUFFER_SHADOW as is?
> 
> If __attribute__((unused)) won't cause other problems here, I also prefer
> to add this to DECLARE_HYPERCALL_BUFFER_SHADOW instead of create another
> macro, I think only DECLARE_HYPERCALL_BUFFER_SHADOW may need to set this
> attribute because this is a shadow, we might not use the xc_hypercall_buffer_t
> which already allocated.

We can certainly start from there and consider other such additions as
the need arises.

Ian.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 04/10] tools/libxc: introduce DECLARE_HYPERCALL_BUFFER_USER_POINTER
  2015-05-12  9:43           ` Ian Campbell
@ 2015-05-12  9:48             ` Yang Hongyang
  0 siblings, 0 replies; 40+ messages in thread
From: Yang Hongyang @ 2015-05-12  9:48 UTC (permalink / raw)
  To: Ian Campbell
  Cc: wei.liu2, wency, andrew.cooper3, yunhong.jiang, eddie.dong,
	xen-devel, rshriram, ian.jackson



On 05/12/2015 05:43 PM, Ian Campbell wrote:
> On Tue, 2015-05-12 at 17:24 +0800, Yang Hongyang wrote:
>>
>> On 05/12/2015 04:19 PM, Ian Campbell wrote:
>>> On Tue, 2015-05-12 at 15:18 +0800, Yang Hongyang wrote:
>>>>
>>>> On 05/11/2015 07:53 PM, Ian Campbell wrote:
>>>>> On Fri, 2015-05-08 at 17:33 +0800, Yang Hongyang wrote:
>>>>>> Define a user pointer that can access the hypercall buffer data
>>>>>> Useful when you only need to access the hypercall buffer data
>>>>>
>>>>> Can you expand on this please, I think the context is a hypercall buffer
>>>>> passed as an argument or as a member of a struct. Please describe why
>>>>> DECLARE_HYPERCALL_BUFFER_ARGUMENT and/or DECLARE_HYPERCALL_BUFFER_SHADOW
>>>>> are not usable here.
>>>>
>>>> There are cases that we only need to use the hypercall buffer data, and do
>>>> not use the xc_hypercall_buffer_t struct.
>>>>
>>>> DECLARE_HYPERCALL_BUFFER_ARGUMENT will only define a xc_hypercall_buffer_t,
>>>> while DECLARE_HYPERCALL_BUFFER_SHADOW do define a user pointer that can allow
>>>> us to access the hypercall buffer data but it also define a
>>>> xc_hypercall_buffer_t that we don't use, the compiler will report arg unused
>>>> error.
>>>>
>>>> The cases for example:
>>>> In send_all_pages(), we only need to use the haypercall buffer data which is a
>>>> dirty bitmap, we set the dirty bitmap to all dirty and call send_dirty_pages,
>>>> we will not use the xc_hypercall_buffer_t and hypercall to retrive the dirty
>>>> bitmap.
>>>> In send_some_pages(), we will also only need to use the dirty_bitmap. the
>>>> retrive dirty bitmap hypercall are done by the caller.
>>>
>>> Thanks. Please include the bulk of this description in the commit
>>> message.
>>>
>>> However, perhaps we should just mark the xc_hypercall_buffer_t as
>>> potentially unused, by tagging it with  __attribute__((unused)), in some
>>> or all of the DECLARE_HYPERCALL_* variants. Then I think you could use
>>> DECLARE_HYPERCALL_BUFFER_SHADOW as is?
>>
>> If __attribute__((unused)) won't cause other problems here, I also prefer
>> to add this to DECLARE_HYPERCALL_BUFFER_SHADOW instead of create another
>> macro, I think only DECLARE_HYPERCALL_BUFFER_SHADOW may need to set this
>> attribute because this is a shadow, we might not use the xc_hypercall_buffer_t
>> which already allocated.
>
> We can certainly start from there and consider other such additions as
> the need arises.

OK, thanks, will do in the next version.

>
> Ian.
>
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH Remus v2 00/10] Remus support for Migration-v2
  2015-05-12  9:40             ` Andrew Cooper
@ 2015-05-12 10:02               ` Yang Hongyang
  0 siblings, 0 replies; 40+ messages in thread
From: Yang Hongyang @ 2015-05-12 10:02 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: wei.liu2, ian.campbell, wency, ian.jackson, yunhong.jiang,
	eddie.dong, rshriram



On 05/12/2015 05:40 PM, Andrew Cooper wrote:
> On 12/05/15 09:12, Yang Hongyang wrote:
>>
>>>>> That sounds like COLO wants a should_checkpoint() callback which
>>>>> separates the decision to make a checkpoint from the logic of
>>>>> implementing a checkpoint.
>>>>
>>>> We use checkpoint callback to do should_checkpoint() thing currently.
>>>> libxc will check the return value of checkpoint callback.
>>>
>>> But that causes a chicken & egg problem.
>>>
>>> I am planning to use a CHECKPOINT record to synchronise the transfer of
>>> ownership of the FD between libxc and libxl.  Therefore, a CHECKPOINT
>>> record must be in the stream ahead of the checkpoint() callback, as
>>> libxl will then write/read some records in itself.
>>
>> The record name CHECKPOINT seems do not match the thing what you are
>> planning to do, In this case I think END-OF-CHECKPOINT which represent
>> the
>> END of libxc side checkpoint is better, when libxc side checkpoint end,
>> libxc should transfer the ownership of FD to libxl and let libxl to
>> handle the following stream. libxl side can also use END-OF-CHECKPOINT
>> as a sign to hand the ownership of the FD to libxc.
>
> END_OF_CHECKPOINT implies the presence of START_OF_CHECKPOINT.  The
> current spec for CHECKPOINT is more of a sentinal value between
> checkpoints of data.
>
>>
>>>
>>> As a result, the checkpoint() callback itself can't be used to gate
>>> whether a CHECKPOINT record is written by libxc.
>>
>> I was wondering how you will do the FD transfer job?
>
> The FD needs to be readable/writable in both the libxl and
> libxl-save-helper processes.  The CHECKPOINT record simply signals a
> transfer of ownership.
>
>>>>> I have a 4th alternative in mind, but would like your feedback from my
>>>>> comments in this email first.
>>>>
>>>> So what's the 4th alternative?
>>>
>>> I have some corrections to my patch series based on David's feedback,
>>> and your comments.  After that, it should hopefully be far easier to
>>> describe.
>>
>> OK, I've addressed all comments on my series and wait for your series
>> to continue :-)
>
> Sent.  Sorry for the delay (I also have some XenServer issues I am
> working on atm).

Never mind, and thank you very much for the quick turnaround! The design
looks more clear now, It really helps me a lot!

>
> ~Andrew
> .
>

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2015-05-12 10:02 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-08  9:33 [PATCH Remus v2 00/10] Remus support for Migration-v2 Yang Hongyang
2015-05-08  9:33 ` [PATCH Remus v2 01/10] tools/libxc: adjust the memory allocation for migration Yang Hongyang
2015-05-08  9:51   ` Andrew Cooper
2015-05-11 11:50   ` Ian Campbell
2015-05-12  6:43     ` Hongyang Yang
2015-05-08  9:33 ` [PATCH Remus v2 02/10] tools/libxc: introduce setup() and cleanup() on save Yang Hongyang
2015-05-08  9:45   ` Andrew Cooper
2015-05-08  9:59     ` Hongyang Yang
2015-05-08 10:08       ` Andrew Cooper
2015-05-11  1:20         ` Hongyang Yang
2015-05-11 11:47           ` Ian Campbell
2015-05-11 11:49             ` Ian Campbell
2015-05-12  7:04               ` Yang Hongyang
2015-05-08  9:33 ` [PATCH Remus v2 03/10] tools/libxc: rename send_some_pages to send_dirty_pages Yang Hongyang
2015-05-08 10:11   ` Andrew Cooper
2015-05-11  1:21     ` Hongyang Yang
2015-05-08  9:33 ` [PATCH Remus v2 04/10] tools/libxc: introduce DECLARE_HYPERCALL_BUFFER_USER_POINTER Yang Hongyang
2015-05-08 10:16   ` Andrew Cooper
2015-05-11  1:22     ` Hongyang Yang
2015-05-11 11:53   ` Ian Campbell
2015-05-12  7:18     ` Yang Hongyang
2015-05-12  8:19       ` Ian Campbell
2015-05-12  9:24         ` Yang Hongyang
2015-05-12  9:43           ` Ian Campbell
2015-05-12  9:48             ` Yang Hongyang
2015-05-08  9:33 ` [PATCH Remus v2 05/10] tools/libxc: reuse send_dirty_pages() in send_all_pages() Yang Hongyang
2015-05-08 10:17   ` Andrew Cooper
2015-05-08  9:33 ` [PATCH Remus v2 06/10] tools/libxc: introduce process_record() Yang Hongyang
2015-05-08  9:33 ` [PATCH Remus v2 07/10] tools/libxc: split read/handle qemu info Yang Hongyang
2015-05-08  9:33 ` [PATCH Remus v2 08/10] tools/libxc: implement Remus checkpointed save Yang Hongyang
2015-05-08  9:33 ` [PATCH Remus v2 09/10] tools/libxc: implement Remus checkpointed restore Yang Hongyang
2015-05-08  9:33 ` [PATCH Remus v2 10/10] tools/libxc: X86_PV_INFO can be sent multiple times under Remus Yang Hongyang
2015-05-08 18:12 ` [PATCH Remus v2 00/10] Remus support for Migration-v2 Andrew Cooper
2015-05-11  6:28   ` Hongyang Yang
2015-05-11  9:00     ` Andrew Cooper
2015-05-11 10:48       ` Hongyang Yang
2015-05-11 11:01         ` Andrew Cooper
2015-05-12  8:12           ` Yang Hongyang
2015-05-12  9:40             ` Andrew Cooper
2015-05-12 10:02               ` Yang Hongyang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.