All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Introduce migration precopy policy
@ 2017-09-14 15:33 Jennifer Herbert
  2017-09-14 15:33 ` [PATCH 1/2] Tidy libxc xc_domain_save Jennifer Herbert
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Jennifer Herbert @ 2017-09-14 15:33 UTC (permalink / raw)
  To: Ian Jackson, Wei Liu, xen-devel, jtotto; +Cc: Jennifer Herbert

Hi all,

Here I present a updated/modified subset of patch 7/20, part of
Joshua Otto's "Add postcopy live migration support." patch,
dated 27th March 2017.  As indicated on the original thread,
I wish to make use of this this within the XenServer product,
and hence am trying to get this subset  pulled in now. I also
hope this will aid Josh in pushing the remainder of his series.

Here I present two patches, the first which does some tidy up, removing
unused and unhelpful paramaters to xc_domain_save(), and the second
which allows a precopy callback to be specified, providing the
test for when to end the live phase of migration should end.
If none is provided, a default policy of the current behaviour
is used.

Cheers,

Jennifer Herbert (2):
  Tidy libxc xc_domain_save
  Introduce migration precopy policy

 tools/libxc/include/xenguest.h   |  23 +++++++-
 tools/libxc/xc_nomigrate.c       |   3 +-
 tools/libxc/xc_sr_common.h       |   7 ++-
 tools/libxc/xc_sr_save.c         | 110 ++++++++++++++++++++++++++-------------
 tools/libxl/libxl_save_callout.c |   4 +-
 tools/libxl/libxl_save_helper.c  |   7 +--
 6 files changed, 104 insertions(+), 50 deletions(-)

-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] Tidy libxc xc_domain_save
  2017-09-14 15:33 [PATCH 0/2] Introduce migration precopy policy Jennifer Herbert
@ 2017-09-14 15:33 ` Jennifer Herbert
  2017-09-18  8:46   ` Paul Durrant
  2017-09-14 15:33 ` [PATCH 2/2] Introduce migration precopy policy Jennifer Herbert
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Jennifer Herbert @ 2017-09-14 15:33 UTC (permalink / raw)
  To: Ian Jackson, Wei Liu, xen-devel, jtotto; +Cc: Jennifer Herbert

Tidy up libxc's xc_domain_save, removing unused paramaters
max_iters and max_factor, making matching changes to libxl.

Signed-off-by: Jennifer Herbert <Jennifer.Herbert@citrix.com>
---
 tools/libxc/include/xenguest.h   | 4 ++--
 tools/libxc/xc_nomigrate.c       | 3 +--
 tools/libxc/xc_sr_save.c         | 8 +++-----
 tools/libxl/libxl_save_callout.c | 4 ++--
 tools/libxl/libxl_save_helper.c  | 7 ++-----
 5 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 5cd8111..6626f0c 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -100,8 +100,8 @@ typedef enum {
  *        doesn't use checkpointing
  * @return 0 on success, -1 on failure
  */
-int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
-                   uint32_t max_factor, uint32_t flags /* XCFLAGS_xxx */,
+int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
+                   uint32_t flags /* XCFLAGS_xxx */,
                    struct save_callbacks* callbacks, int hvm,
                    xc_migration_stream_t stream_type, int recv_fd);
 
diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
index 317c8ce..fe8f68c 100644
--- a/tools/libxc/xc_nomigrate.c
+++ b/tools/libxc/xc_nomigrate.c
@@ -20,8 +20,7 @@
 #include <xenctrl.h>
 #include <xenguest.h>
 
-int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
-                   uint32_t max_factor, uint32_t flags,
+int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t flags,
                    struct save_callbacks* callbacks, int hvm,
                    xc_migration_stream_t stream_type, int recv_fd)
 {
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index ca6913b..1e7502d 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -916,9 +916,8 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
 };
 
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
-                   uint32_t max_iters, uint32_t max_factor, uint32_t flags,
-                   struct save_callbacks* callbacks, int hvm,
-                   xc_migration_stream_t stream_type, int recv_fd)
+                   uint32_t flags, struct save_callbacks* callbacks,
+                   int hvm, xc_migration_stream_t stream_type, int recv_fd)
 {
     struct xc_sr_context ctx =
         {
@@ -955,8 +954,7 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
     if ( ctx.save.checkpointed == XC_MIG_STREAM_COLO )
         assert(callbacks->wait_checkpoint);
 
-    DPRINTF("fd %d, dom %u, max_iters %u, max_factor %u, flags %u, hvm %d",
-            io_fd, dom, max_iters, max_factor, flags, hvm);
+    DPRINTF("fd %d, dom %u, flags %u, hvm %d", io_fd, dom, flags, hvm);
 
     if ( xc_domain_getinfo(xch, dom, 1, &ctx.dominfo) != 1 )
     {
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 891c669..6452d70 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -89,8 +89,8 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_save_state *dss,
         libxl__srm_callout_enumcallbacks_save(&shs->callbacks.save.a);
 
     const unsigned long argnums[] = {
-        dss->domid, 0, 0, dss->xcflags, dss->hvm,
-        cbflags, dss->checkpointed_stream,
+        dss->domid, dss->xcflags, dss->hvm, cbflags,
+        dss->checkpointed_stream,
     };
 
     shs->ao = ao;
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 1dece23..38089a0 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -251,8 +251,6 @@ int main(int argc, char **argv)
         io_fd =                             atoi(NEXTARG);
         recv_fd =                           atoi(NEXTARG);
         uint32_t dom =                      strtoul(NEXTARG,0,10);
-        uint32_t max_iters =                strtoul(NEXTARG,0,10);
-        uint32_t max_factor =               strtoul(NEXTARG,0,10);
         uint32_t flags =                    strtoul(NEXTARG,0,10);
         int hvm =                           atoi(NEXTARG);
         unsigned cbflags =                  strtoul(NEXTARG,0,10);
@@ -264,9 +262,8 @@ int main(int argc, char **argv)
         startup("save");
         setup_signals(save_signal_handler);
 
-        r = xc_domain_save(xch, io_fd, dom, max_iters, max_factor, flags,
-                           &helper_save_callbacks, hvm, stream_type,
-                           recv_fd);
+        r = xc_domain_save(xch, io_fd, dom, flags, &helper_save_callbacks,
+                           hvm, stream_type, recv_fd);
         complete(r);
 
     } else if (!strcmp(mode,"--restore-domain")) {
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] Introduce migration precopy policy
  2017-09-14 15:33 [PATCH 0/2] Introduce migration precopy policy Jennifer Herbert
  2017-09-14 15:33 ` [PATCH 1/2] Tidy libxc xc_domain_save Jennifer Herbert
@ 2017-09-14 15:33 ` Jennifer Herbert
  2017-09-18  9:05   ` Paul Durrant
                     ` (2 more replies)
  2017-09-14 15:33 ` [PATCH 0/2] " Jennifer Herbert
                   ` (2 subsequent siblings)
  4 siblings, 3 replies; 15+ messages in thread
From: Jennifer Herbert @ 2017-09-14 15:33 UTC (permalink / raw)
  To: Ian Jackson, Wei Liu, xen-devel, jtotto; +Cc: Jennifer Herbert

This Patch allows a migration precopy policy to be specified.

The precopy phase of the xc_domain_save() live migration algorithm has
historically been implemented to run until either a) (almost) no pages
are dirty or b) some fixed, hard-coded maximum number of precopy
iterations has been exceeded.  This policy and its implementation are
less than ideal for a few reasons:
- the logic of the policy is intertwined with the control flow of the
  mechanism of the precopy stage
- it can't take into account facts external to the immediate
  migration context, such external state transfer state, interactive
  user input, or the passage of wall-clock time.
- it does not permit the user to change their mind, over time, about
  what to do at the end of the precopy (they get an unconditional
  transition into the stop-and-copy phase of the migration)

To permit callers to implement arbitrary higher-level policies governing
when the live migration precopy phase should end, and what should be
done next:
- add a precopy_policy() callback to the xc_domain_save() user-supplied
  callbacks
- during the precopy phase of live migrations, consult this policy after
  each batch of pages transmitted and take the dictated action, which
  may be to a) abort the migration entirely, b) continue with the
  precopy, or c) proceed to the stop-and-copy phase.
- provide an implementation of the old policy, used when
  precopy_policy callback  is not provided.

Signed-off-by: Jennifer Herbert <Jennifer.Herbert@citrix.com>

---

This is updated/modified subset of patch 7/20, part of
Joshua Otto's "Add postcopy live migration support." patch,
dated 27th March 2017.  As indicated on the original thread,
I wish to make use of this this within the XenServer product.
I hope this will aid Josh in pushing the remainder of his series.
---
 tools/libxc/include/xenguest.h |  19 ++++++++
 tools/libxc/xc_sr_common.h     |   7 ++-
 tools/libxc/xc_sr_save.c       | 102 +++++++++++++++++++++++++++++------------
 3 files changed, 94 insertions(+), 34 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 6626f0c..d5908dc 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -39,6 +39,14 @@
  */
 struct xenevtchn_handle;
 
+/* For save's precopy_policy(). */
+struct precopy_stats
+{
+    unsigned iteration;
+    unsigned total_written;
+    long dirty_count; /* -1 if unknown */
+};
+
 /* callbacks provided by xc_domain_save */
 struct save_callbacks {
     /* Called after expiration of checkpoint interval,
@@ -46,6 +54,17 @@ struct save_callbacks {
      */
     int (*suspend)(void* data);
 
+    /* Called after every batch of page data sent during the precopy phase of a
+     * live migration to ask the caller what to do next based on the current
+     * state of the precopy migration.
+     */
+#define XGS_POLICY_ABORT          (-1) /* Abandon the migration entirely and
+                                        * tidy up. */
+#define XGS_POLICY_CONTINUE_PRECOPY 0  /* Remain in the precopy phase. */
+#define XGS_POLICY_STOP_AND_COPY    1  /* Immediately suspend and transmit the
+                                        * remaining dirty pages. */
+    int (*precopy_policy)(struct precopy_stats stats, void *data);
+
     /* Called after the guest's dirty pages have been
      *  copied into an output buffer.
      * Callback function resumes the guest & the device model,
diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index a83f22a..2bc261b 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -198,12 +198,11 @@ struct xc_sr_context
             /* Further debugging information in the stream. */
             bool debug;
 
-            /* Parameters for tweaking live migration. */
-            unsigned max_iterations;
-            unsigned dirty_threshold;
-
             unsigned long p2m_size;
 
+            struct precopy_stats stats;
+            int policy_decision;
+
             xen_pfn_t *batch_pfns;
             unsigned nr_batch_pfns;
             unsigned long *deferred_pages;
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index 1e7502d..03dfa61 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -452,8 +452,7 @@ static int update_progress_string(struct xc_sr_context *ctx,
     xc_interface *xch = ctx->xch;
     char *new_str = NULL;
 
-    if ( asprintf(&new_str, "Frames iteration %u of %u",
-                  iter, ctx->save.max_iterations) == -1 )
+    if ( asprintf(&new_str, "Frames iteration %u", iter) == -1 )
     {
         PERROR("Unable to allocate new progress string");
         return -1;
@@ -467,6 +466,25 @@ static int update_progress_string(struct xc_sr_context *ctx,
 }
 
 /*
+ * This is the live migration precopy policy - it's called periodically during
+ * the precopy phase of live migrations, and is responsible for deciding when
+ * the precopy phase should terminate and what should be done next.
+ *
+ * The policy implemented here behaves identically to the policy previously
+ * hard-coded into xc_domain_save() - it proceeds to the stop-and-copy phase of
+ * the live migration when there are either fewer than 50 dirty pages, or more
+ * than 5 precopy rounds have completed.
+ */
+static int simple_precopy_policy(
+    struct precopy_stats stats, void *user)
+{
+    return ((stats.dirty_count >= 0 && stats.dirty_count < 50) ||
+            stats.iteration >= 5)
+        ? XGS_POLICY_STOP_AND_COPY
+        : XGS_POLICY_CONTINUE_PRECOPY;
+}
+
+/*
  * Send memory while guest is running.
  */
 static int send_memory_live(struct xc_sr_context *ctx)
@@ -474,21 +492,62 @@ static int send_memory_live(struct xc_sr_context *ctx)
     xc_interface *xch = ctx->xch;
     xc_shadow_op_stats_t stats = { 0, ctx->save.p2m_size };
     char *progress_str = NULL;
-    unsigned x;
+    unsigned int x = 0;
     int rc;
+    int policy_decision;
+
+    DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
+                                    &ctx->save.dirty_bitmap_hbuf);
+
+    int (*precopy_policy)(struct precopy_stats, void *) =
+        ctx->save.callbacks->precopy_policy;
+    void *data = ctx->save.callbacks->data;
+
+    struct precopy_stats *policy_stats;
 
     rc = update_progress_string(ctx, &progress_str, 0);
     if ( rc )
         goto out;
 
-    rc = send_all_pages(ctx);
-    if ( rc )
-        goto out;
+    ctx->save.stats = (struct precopy_stats)
+        {
+            .iteration     = x,
+            .total_written = 0,
+            .dirty_count   = ctx->save.p2m_size
+        };
+    policy_stats = &ctx->save.stats;
+
+    if (precopy_policy == NULL)
+         precopy_policy = simple_precopy_policy;
+
+    bitmap_set(dirty_bitmap, ctx->save.p2m_size);
+
+    do {
+        policy_decision = precopy_policy(*policy_stats, data);
+        x++;
+
+        if ( stats.dirty_count > 0 && policy_decision != XGS_POLICY_ABORT ) {
+            rc = update_progress_string(ctx, &progress_str, x);
+            if ( rc )
+                goto out;
+
+            rc = send_dirty_pages(ctx, stats.dirty_count);
+            if ( rc )
+                goto out;
+        }
+
+        if (policy_decision != XGS_POLICY_CONTINUE_PRECOPY)
+            break;
+
+        policy_stats->iteration     = x;
+        policy_stats->total_written += policy_stats->dirty_count;
+        policy_stats->dirty_count   = -1;
+
+        policy_decision = precopy_policy(*policy_stats, data);
+
+        if (policy_decision != XGS_POLICY_CONTINUE_PRECOPY)
+           break;
 
-    for ( x = 1;
-          ((x < ctx->save.max_iterations) &&
-           (stats.dirty_count > ctx->save.dirty_threshold)); ++x )
-    {
         if ( xc_shadow_control(
                  xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_CLEAN,
                  &ctx->save.dirty_bitmap_hbuf, ctx->save.p2m_size,
@@ -499,17 +558,9 @@ static int send_memory_live(struct xc_sr_context *ctx)
             goto out;
         }
 
-        if ( stats.dirty_count == 0 )
-            break;
-
-        rc = update_progress_string(ctx, &progress_str, x);
-        if ( rc )
-            goto out;
+        policy_stats->dirty_count = stats.dirty_count;
 
-        rc = send_dirty_pages(ctx, stats.dirty_count);
-        if ( rc )
-            goto out;
-    }
+    } while (true);
 
  out:
     xc_set_progress_prefix(xch, NULL);
@@ -601,7 +652,7 @@ static int suspend_and_send_dirty(struct xc_sr_context *ctx)
     if ( ctx->save.live )
     {
         rc = update_progress_string(ctx, &progress_str,
-                                    ctx->save.max_iterations);
+                                    ctx->save.stats.iteration);
         if ( rc )
             goto out;
     }
@@ -937,15 +988,6 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
            stream_type == XC_MIG_STREAM_REMUS ||
            stream_type == XC_MIG_STREAM_COLO);
 
-    /*
-     * TODO: Find some time to better tweak the live migration algorithm.
-     *
-     * These parameters are better than the legacy algorithm especially for
-     * busy guests.
-     */
-    ctx.save.max_iterations = 5;
-    ctx.save.dirty_threshold = 50;
-
     /* Sanity checks for callbacks. */
     if ( hvm )
         assert(callbacks->switch_qemu_logdirty);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 0/2] Introduce migration precopy policy
  2017-09-14 15:33 [PATCH 0/2] Introduce migration precopy policy Jennifer Herbert
  2017-09-14 15:33 ` [PATCH 1/2] Tidy libxc xc_domain_save Jennifer Herbert
  2017-09-14 15:33 ` [PATCH 2/2] Introduce migration precopy policy Jennifer Herbert
@ 2017-09-14 15:33 ` Jennifer Herbert
  2017-09-14 15:34 ` [PATCH 1/2] Tidy libxc xc_domain_save Jennifer Herbert
  2017-09-14 15:34 ` [PATCH 2/2] Introduce migration precopy policy Jennifer Herbert
  4 siblings, 0 replies; 15+ messages in thread
From: Jennifer Herbert @ 2017-09-14 15:33 UTC (permalink / raw)
  To: Ian Jackson, Wei Liu, xen-devel, jtotto; +Cc: Jennifer Herbert

Hi all,

Here I present a updated/modified subset of patch 7/20, part of
Joshua Otto's "Add postcopy live migration support." patch,
dated 27th March 2017.  As indicated on the original thread,
I wish to make use of this this within the XenServer product,
and hence am trying to get this subset  pulled in now. I also
hope this will aid Josh in pushing the remainder of his series.

Here I present two patches, the first which does some tidy up, removing
unused and unhelpful paramaters to xc_domain_save(), and the second
which allows a precopy callback to be specified, providing the
test for when to end the live phase of migration should end.
If none is provided, a default policy of the current behaviour
is used.

Cheers,

Jennifer Herbert (2):
  Tidy libxc xc_domain_save
  Introduce migration precopy policy

 tools/libxc/include/xenguest.h   |  23 +++++++-
 tools/libxc/xc_nomigrate.c       |   3 +-
 tools/libxc/xc_sr_common.h       |   7 ++-
 tools/libxc/xc_sr_save.c         | 110 ++++++++++++++++++++++++++-------------
 tools/libxl/libxl_save_callout.c |   4 +-
 tools/libxl/libxl_save_helper.c  |   7 +--
 6 files changed, 104 insertions(+), 50 deletions(-)

-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] Tidy libxc xc_domain_save
  2017-09-14 15:33 [PATCH 0/2] Introduce migration precopy policy Jennifer Herbert
                   ` (2 preceding siblings ...)
  2017-09-14 15:33 ` [PATCH 0/2] " Jennifer Herbert
@ 2017-09-14 15:34 ` Jennifer Herbert
  2017-09-14 15:59   ` Wei Liu
  2017-09-14 15:34 ` [PATCH 2/2] Introduce migration precopy policy Jennifer Herbert
  4 siblings, 1 reply; 15+ messages in thread
From: Jennifer Herbert @ 2017-09-14 15:34 UTC (permalink / raw)
  To: Ian Jackson, Wei Liu, xen-devel, jtotto; +Cc: Jennifer Herbert

Tidy up libxc's xc_domain_save, removing unused paramaters
max_iters and max_factor, making matching changes to libxl.

Signed-off-by: Jennifer Herbert <Jennifer.Herbert@citrix.com>
---
 tools/libxc/include/xenguest.h   | 4 ++--
 tools/libxc/xc_nomigrate.c       | 3 +--
 tools/libxc/xc_sr_save.c         | 8 +++-----
 tools/libxl/libxl_save_callout.c | 4 ++--
 tools/libxl/libxl_save_helper.c  | 7 ++-----
 5 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 5cd8111..6626f0c 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -100,8 +100,8 @@ typedef enum {
  *        doesn't use checkpointing
  * @return 0 on success, -1 on failure
  */
-int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
-                   uint32_t max_factor, uint32_t flags /* XCFLAGS_xxx */,
+int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
+                   uint32_t flags /* XCFLAGS_xxx */,
                    struct save_callbacks* callbacks, int hvm,
                    xc_migration_stream_t stream_type, int recv_fd);
 
diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
index 317c8ce..fe8f68c 100644
--- a/tools/libxc/xc_nomigrate.c
+++ b/tools/libxc/xc_nomigrate.c
@@ -20,8 +20,7 @@
 #include <xenctrl.h>
 #include <xenguest.h>
 
-int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t max_iters,
-                   uint32_t max_factor, uint32_t flags,
+int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t flags,
                    struct save_callbacks* callbacks, int hvm,
                    xc_migration_stream_t stream_type, int recv_fd)
 {
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index ca6913b..1e7502d 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -916,9 +916,8 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
 };
 
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
-                   uint32_t max_iters, uint32_t max_factor, uint32_t flags,
-                   struct save_callbacks* callbacks, int hvm,
-                   xc_migration_stream_t stream_type, int recv_fd)
+                   uint32_t flags, struct save_callbacks* callbacks,
+                   int hvm, xc_migration_stream_t stream_type, int recv_fd)
 {
     struct xc_sr_context ctx =
         {
@@ -955,8 +954,7 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
     if ( ctx.save.checkpointed == XC_MIG_STREAM_COLO )
         assert(callbacks->wait_checkpoint);
 
-    DPRINTF("fd %d, dom %u, max_iters %u, max_factor %u, flags %u, hvm %d",
-            io_fd, dom, max_iters, max_factor, flags, hvm);
+    DPRINTF("fd %d, dom %u, flags %u, hvm %d", io_fd, dom, flags, hvm);
 
     if ( xc_domain_getinfo(xch, dom, 1, &ctx.dominfo) != 1 )
     {
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 891c669..6452d70 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -89,8 +89,8 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_save_state *dss,
         libxl__srm_callout_enumcallbacks_save(&shs->callbacks.save.a);
 
     const unsigned long argnums[] = {
-        dss->domid, 0, 0, dss->xcflags, dss->hvm,
-        cbflags, dss->checkpointed_stream,
+        dss->domid, dss->xcflags, dss->hvm, cbflags,
+        dss->checkpointed_stream,
     };
 
     shs->ao = ao;
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 1dece23..38089a0 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -251,8 +251,6 @@ int main(int argc, char **argv)
         io_fd =                             atoi(NEXTARG);
         recv_fd =                           atoi(NEXTARG);
         uint32_t dom =                      strtoul(NEXTARG,0,10);
-        uint32_t max_iters =                strtoul(NEXTARG,0,10);
-        uint32_t max_factor =               strtoul(NEXTARG,0,10);
         uint32_t flags =                    strtoul(NEXTARG,0,10);
         int hvm =                           atoi(NEXTARG);
         unsigned cbflags =                  strtoul(NEXTARG,0,10);
@@ -264,9 +262,8 @@ int main(int argc, char **argv)
         startup("save");
         setup_signals(save_signal_handler);
 
-        r = xc_domain_save(xch, io_fd, dom, max_iters, max_factor, flags,
-                           &helper_save_callbacks, hvm, stream_type,
-                           recv_fd);
+        r = xc_domain_save(xch, io_fd, dom, flags, &helper_save_callbacks,
+                           hvm, stream_type, recv_fd);
         complete(r);
 
     } else if (!strcmp(mode,"--restore-domain")) {
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] Introduce migration precopy policy
  2017-09-14 15:33 [PATCH 0/2] Introduce migration precopy policy Jennifer Herbert
                   ` (3 preceding siblings ...)
  2017-09-14 15:34 ` [PATCH 1/2] Tidy libxc xc_domain_save Jennifer Herbert
@ 2017-09-14 15:34 ` Jennifer Herbert
  4 siblings, 0 replies; 15+ messages in thread
From: Jennifer Herbert @ 2017-09-14 15:34 UTC (permalink / raw)
  To: Ian Jackson, Wei Liu, xen-devel, jtotto; +Cc: Jennifer Herbert

This Patch allows a migration precopy policy to be specified.

The precopy phase of the xc_domain_save() live migration algorithm has
historically been implemented to run until either a) (almost) no pages
are dirty or b) some fixed, hard-coded maximum number of precopy
iterations has been exceeded.  This policy and its implementation are
less than ideal for a few reasons:
- the logic of the policy is intertwined with the control flow of the
  mechanism of the precopy stage
- it can't take into account facts external to the immediate
  migration context, such external state transfer state, interactive
  user input, or the passage of wall-clock time.
- it does not permit the user to change their mind, over time, about
  what to do at the end of the precopy (they get an unconditional
  transition into the stop-and-copy phase of the migration)

To permit callers to implement arbitrary higher-level policies governing
when the live migration precopy phase should end, and what should be
done next:
- add a precopy_policy() callback to the xc_domain_save() user-supplied
  callbacks
- during the precopy phase of live migrations, consult this policy after
  each batch of pages transmitted and take the dictated action, which
  may be to a) abort the migration entirely, b) continue with the
  precopy, or c) proceed to the stop-and-copy phase.
- provide an implementation of the old policy, used when
  precopy_policy callback  is not provided.

Signed-off-by: Jennifer Herbert <Jennifer.Herbert@citrix.com>

---

This is updated/modified subset of patch 7/20, part of
Joshua Otto's "Add postcopy live migration support." patch,
dated 27th March 2017.  As indicated on the original thread,
I wish to make use of this this within the XenServer product.
I hope this will aid Josh in pushing the remainder of his series.
---
 tools/libxc/include/xenguest.h |  19 ++++++++
 tools/libxc/xc_sr_common.h     |   7 ++-
 tools/libxc/xc_sr_save.c       | 102 +++++++++++++++++++++++++++++------------
 3 files changed, 94 insertions(+), 34 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 6626f0c..d5908dc 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -39,6 +39,14 @@
  */
 struct xenevtchn_handle;
 
+/* For save's precopy_policy(). */
+struct precopy_stats
+{
+    unsigned iteration;
+    unsigned total_written;
+    long dirty_count; /* -1 if unknown */
+};
+
 /* callbacks provided by xc_domain_save */
 struct save_callbacks {
     /* Called after expiration of checkpoint interval,
@@ -46,6 +54,17 @@ struct save_callbacks {
      */
     int (*suspend)(void* data);
 
+    /* Called after every batch of page data sent during the precopy phase of a
+     * live migration to ask the caller what to do next based on the current
+     * state of the precopy migration.
+     */
+#define XGS_POLICY_ABORT          (-1) /* Abandon the migration entirely and
+                                        * tidy up. */
+#define XGS_POLICY_CONTINUE_PRECOPY 0  /* Remain in the precopy phase. */
+#define XGS_POLICY_STOP_AND_COPY    1  /* Immediately suspend and transmit the
+                                        * remaining dirty pages. */
+    int (*precopy_policy)(struct precopy_stats stats, void *data);
+
     /* Called after the guest's dirty pages have been
      *  copied into an output buffer.
      * Callback function resumes the guest & the device model,
diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index a83f22a..2bc261b 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -198,12 +198,11 @@ struct xc_sr_context
             /* Further debugging information in the stream. */
             bool debug;
 
-            /* Parameters for tweaking live migration. */
-            unsigned max_iterations;
-            unsigned dirty_threshold;
-
             unsigned long p2m_size;
 
+            struct precopy_stats stats;
+            int policy_decision;
+
             xen_pfn_t *batch_pfns;
             unsigned nr_batch_pfns;
             unsigned long *deferred_pages;
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index 1e7502d..03dfa61 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -452,8 +452,7 @@ static int update_progress_string(struct xc_sr_context *ctx,
     xc_interface *xch = ctx->xch;
     char *new_str = NULL;
 
-    if ( asprintf(&new_str, "Frames iteration %u of %u",
-                  iter, ctx->save.max_iterations) == -1 )
+    if ( asprintf(&new_str, "Frames iteration %u", iter) == -1 )
     {
         PERROR("Unable to allocate new progress string");
         return -1;
@@ -467,6 +466,25 @@ static int update_progress_string(struct xc_sr_context *ctx,
 }
 
 /*
+ * This is the live migration precopy policy - it's called periodically during
+ * the precopy phase of live migrations, and is responsible for deciding when
+ * the precopy phase should terminate and what should be done next.
+ *
+ * The policy implemented here behaves identically to the policy previously
+ * hard-coded into xc_domain_save() - it proceeds to the stop-and-copy phase of
+ * the live migration when there are either fewer than 50 dirty pages, or more
+ * than 5 precopy rounds have completed.
+ */
+static int simple_precopy_policy(
+    struct precopy_stats stats, void *user)
+{
+    return ((stats.dirty_count >= 0 && stats.dirty_count < 50) ||
+            stats.iteration >= 5)
+        ? XGS_POLICY_STOP_AND_COPY
+        : XGS_POLICY_CONTINUE_PRECOPY;
+}
+
+/*
  * Send memory while guest is running.
  */
 static int send_memory_live(struct xc_sr_context *ctx)
@@ -474,21 +492,62 @@ static int send_memory_live(struct xc_sr_context *ctx)
     xc_interface *xch = ctx->xch;
     xc_shadow_op_stats_t stats = { 0, ctx->save.p2m_size };
     char *progress_str = NULL;
-    unsigned x;
+    unsigned int x = 0;
     int rc;
+    int policy_decision;
+
+    DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
+                                    &ctx->save.dirty_bitmap_hbuf);
+
+    int (*precopy_policy)(struct precopy_stats, void *) =
+        ctx->save.callbacks->precopy_policy;
+    void *data = ctx->save.callbacks->data;
+
+    struct precopy_stats *policy_stats;
 
     rc = update_progress_string(ctx, &progress_str, 0);
     if ( rc )
         goto out;
 
-    rc = send_all_pages(ctx);
-    if ( rc )
-        goto out;
+    ctx->save.stats = (struct precopy_stats)
+        {
+            .iteration     = x,
+            .total_written = 0,
+            .dirty_count   = ctx->save.p2m_size
+        };
+    policy_stats = &ctx->save.stats;
+
+    if (precopy_policy == NULL)
+         precopy_policy = simple_precopy_policy;
+
+    bitmap_set(dirty_bitmap, ctx->save.p2m_size);
+
+    do {
+        policy_decision = precopy_policy(*policy_stats, data);
+        x++;
+
+        if ( stats.dirty_count > 0 && policy_decision != XGS_POLICY_ABORT ) {
+            rc = update_progress_string(ctx, &progress_str, x);
+            if ( rc )
+                goto out;
+
+            rc = send_dirty_pages(ctx, stats.dirty_count);
+            if ( rc )
+                goto out;
+        }
+
+        if (policy_decision != XGS_POLICY_CONTINUE_PRECOPY)
+            break;
+
+        policy_stats->iteration     = x;
+        policy_stats->total_written += policy_stats->dirty_count;
+        policy_stats->dirty_count   = -1;
+
+        policy_decision = precopy_policy(*policy_stats, data);
+
+        if (policy_decision != XGS_POLICY_CONTINUE_PRECOPY)
+           break;
 
-    for ( x = 1;
-          ((x < ctx->save.max_iterations) &&
-           (stats.dirty_count > ctx->save.dirty_threshold)); ++x )
-    {
         if ( xc_shadow_control(
                  xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_CLEAN,
                  &ctx->save.dirty_bitmap_hbuf, ctx->save.p2m_size,
@@ -499,17 +558,9 @@ static int send_memory_live(struct xc_sr_context *ctx)
             goto out;
         }
 
-        if ( stats.dirty_count == 0 )
-            break;
-
-        rc = update_progress_string(ctx, &progress_str, x);
-        if ( rc )
-            goto out;
+        policy_stats->dirty_count = stats.dirty_count;
 
-        rc = send_dirty_pages(ctx, stats.dirty_count);
-        if ( rc )
-            goto out;
-    }
+    } while (true);
 
  out:
     xc_set_progress_prefix(xch, NULL);
@@ -601,7 +652,7 @@ static int suspend_and_send_dirty(struct xc_sr_context *ctx)
     if ( ctx->save.live )
     {
         rc = update_progress_string(ctx, &progress_str,
-                                    ctx->save.max_iterations);
+                                    ctx->save.stats.iteration);
         if ( rc )
             goto out;
     }
@@ -937,15 +988,6 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
            stream_type == XC_MIG_STREAM_REMUS ||
            stream_type == XC_MIG_STREAM_COLO);
 
-    /*
-     * TODO: Find some time to better tweak the live migration algorithm.
-     *
-     * These parameters are better than the legacy algorithm especially for
-     * busy guests.
-     */
-    ctx.save.max_iterations = 5;
-    ctx.save.dirty_threshold = 50;
-
     /* Sanity checks for callbacks. */
     if ( hvm )
         assert(callbacks->switch_qemu_logdirty);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] Tidy libxc xc_domain_save
  2017-09-14 15:34 ` [PATCH 1/2] Tidy libxc xc_domain_save Jennifer Herbert
@ 2017-09-14 15:59   ` Wei Liu
  0 siblings, 0 replies; 15+ messages in thread
From: Wei Liu @ 2017-09-14 15:59 UTC (permalink / raw)
  To: Jennifer Herbert; +Cc: Wei Liu, Ian Jackson, jtotto, xen-devel

On Thu, Sep 14, 2017 at 04:34:00PM +0100, Jennifer Herbert wrote:
> Tidy up libxc's xc_domain_save, removing unused paramaters
> max_iters and max_factor, making matching changes to libxl.
> 
> Signed-off-by: Jennifer Herbert <Jennifer.Herbert@citrix.com>

Good riddance.

Acked-by: Wei Liu <wei.liu2@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] Tidy libxc xc_domain_save
  2017-09-14 15:33 ` [PATCH 1/2] Tidy libxc xc_domain_save Jennifer Herbert
@ 2017-09-18  8:46   ` Paul Durrant
  0 siblings, 0 replies; 15+ messages in thread
From: Paul Durrant @ 2017-09-18  8:46 UTC (permalink / raw)
  To: Ian Jackson, Wei Liu, xen-devel, jtotto; +Cc: Jennifer Herbert

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> Jennifer Herbert
> Sent: 14 September 2017 16:34
> To: Ian Jackson <Ian.Jackson@citrix.com>; Wei Liu <wei.liu2@citrix.com>;
> xen-devel@lists.xenproject.org; jtotto@uwaterloo.ca
> Cc: Jennifer Herbert <jennifer.herbert@citrix.com>
> Subject: [Xen-devel] [PATCH 1/2] Tidy libxc xc_domain_save
> 
> Tidy up libxc's xc_domain_save, removing unused paramaters
> max_iters and max_factor, making matching changes to libxl.
> 
> Signed-off-by: Jennifer Herbert <Jennifer.Herbert@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
>  tools/libxc/include/xenguest.h   | 4 ++--
>  tools/libxc/xc_nomigrate.c       | 3 +--
>  tools/libxc/xc_sr_save.c         | 8 +++-----
>  tools/libxl/libxl_save_callout.c | 4 ++--
>  tools/libxl/libxl_save_helper.c  | 7 ++-----
>  5 files changed, 10 insertions(+), 16 deletions(-)
> 
> diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
> index 5cd8111..6626f0c 100644
> --- a/tools/libxc/include/xenguest.h
> +++ b/tools/libxc/include/xenguest.h
> @@ -100,8 +100,8 @@ typedef enum {
>   *        doesn't use checkpointing
>   * @return 0 on success, -1 on failure
>   */
> -int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t
> max_iters,
> -                   uint32_t max_factor, uint32_t flags /* XCFLAGS_xxx */,
> +int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
> +                   uint32_t flags /* XCFLAGS_xxx */,
>                     struct save_callbacks* callbacks, int hvm,
>                     xc_migration_stream_t stream_type, int recv_fd);
> 
> diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
> index 317c8ce..fe8f68c 100644
> --- a/tools/libxc/xc_nomigrate.c
> +++ b/tools/libxc/xc_nomigrate.c
> @@ -20,8 +20,7 @@
>  #include <xenctrl.h>
>  #include <xenguest.h>
> 
> -int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t
> max_iters,
> -                   uint32_t max_factor, uint32_t flags,
> +int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t
> flags,
>                     struct save_callbacks* callbacks, int hvm,
>                     xc_migration_stream_t stream_type, int recv_fd)
>  {
> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
> index ca6913b..1e7502d 100644
> --- a/tools/libxc/xc_sr_save.c
> +++ b/tools/libxc/xc_sr_save.c
> @@ -916,9 +916,8 @@ static int save(struct xc_sr_context *ctx, uint16_t
> guest_type)
>  };
> 
>  int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom,
> -                   uint32_t max_iters, uint32_t max_factor, uint32_t flags,
> -                   struct save_callbacks* callbacks, int hvm,
> -                   xc_migration_stream_t stream_type, int recv_fd)
> +                   uint32_t flags, struct save_callbacks* callbacks,
> +                   int hvm, xc_migration_stream_t stream_type, int recv_fd)
>  {
>      struct xc_sr_context ctx =
>          {
> @@ -955,8 +954,7 @@ int xc_domain_save(xc_interface *xch, int io_fd,
> uint32_t dom,
>      if ( ctx.save.checkpointed == XC_MIG_STREAM_COLO )
>          assert(callbacks->wait_checkpoint);
> 
> -    DPRINTF("fd %d, dom %u, max_iters %u, max_factor %u, flags %u, hvm
> %d",
> -            io_fd, dom, max_iters, max_factor, flags, hvm);
> +    DPRINTF("fd %d, dom %u, flags %u, hvm %d", io_fd, dom, flags, hvm);
> 
>      if ( xc_domain_getinfo(xch, dom, 1, &ctx.dominfo) != 1 )
>      {
> diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
> index 891c669..6452d70 100644
> --- a/tools/libxl/libxl_save_callout.c
> +++ b/tools/libxl/libxl_save_callout.c
> @@ -89,8 +89,8 @@ void libxl__xc_domain_save(libxl__egc *egc,
> libxl__domain_save_state *dss,
>          libxl__srm_callout_enumcallbacks_save(&shs->callbacks.save.a);
> 
>      const unsigned long argnums[] = {
> -        dss->domid, 0, 0, dss->xcflags, dss->hvm,
> -        cbflags, dss->checkpointed_stream,
> +        dss->domid, dss->xcflags, dss->hvm, cbflags,
> +        dss->checkpointed_stream,
>      };
> 
>      shs->ao = ao;
> diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
> index 1dece23..38089a0 100644
> --- a/tools/libxl/libxl_save_helper.c
> +++ b/tools/libxl/libxl_save_helper.c
> @@ -251,8 +251,6 @@ int main(int argc, char **argv)
>          io_fd =                             atoi(NEXTARG);
>          recv_fd =                           atoi(NEXTARG);
>          uint32_t dom =                      strtoul(NEXTARG,0,10);
> -        uint32_t max_iters =                strtoul(NEXTARG,0,10);
> -        uint32_t max_factor =               strtoul(NEXTARG,0,10);
>          uint32_t flags =                    strtoul(NEXTARG,0,10);
>          int hvm =                           atoi(NEXTARG);
>          unsigned cbflags =                  strtoul(NEXTARG,0,10);
> @@ -264,9 +262,8 @@ int main(int argc, char **argv)
>          startup("save");
>          setup_signals(save_signal_handler);
> 
> -        r = xc_domain_save(xch, io_fd, dom, max_iters, max_factor, flags,
> -                           &helper_save_callbacks, hvm, stream_type,
> -                           recv_fd);
> +        r = xc_domain_save(xch, io_fd, dom, flags, &helper_save_callbacks,
> +                           hvm, stream_type, recv_fd);
>          complete(r);
> 
>      } else if (!strcmp(mode,"--restore-domain")) {
> --
> 1.8.3.1
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] Introduce migration precopy policy
  2017-09-14 15:33 ` [PATCH 2/2] Introduce migration precopy policy Jennifer Herbert
@ 2017-09-18  9:05   ` Paul Durrant
  2017-09-18 13:20     ` Wei Liu
  2017-09-18 13:30   ` Ian Jackson
  2017-09-18 13:49   ` Wei Liu
  2 siblings, 1 reply; 15+ messages in thread
From: Paul Durrant @ 2017-09-18  9:05 UTC (permalink / raw)
  To: Ian Jackson, Wei Liu, xen-devel, jtotto; +Cc: Jennifer Herbert

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> Jennifer Herbert
> Sent: 14 September 2017 16:34
> To: Ian Jackson <Ian.Jackson@citrix.com>; Wei Liu <wei.liu2@citrix.com>;
> xen-devel@lists.xenproject.org; jtotto@uwaterloo.ca
> Cc: Jennifer Herbert <jennifer.herbert@citrix.com>
> Subject: [Xen-devel] [PATCH 2/2] Introduce migration precopy policy
> 
> This Patch allows a migration precopy policy to be specified.
> 
> The precopy phase of the xc_domain_save() live migration algorithm has
> historically been implemented to run until either a) (almost) no pages
> are dirty or b) some fixed, hard-coded maximum number of precopy
> iterations has been exceeded.  This policy and its implementation are
> less than ideal for a few reasons:
> - the logic of the policy is intertwined with the control flow of the
>   mechanism of the precopy stage
> - it can't take into account facts external to the immediate
>   migration context, such external state transfer state, interactive
>   user input, or the passage of wall-clock time.
> - it does not permit the user to change their mind, over time, about
>   what to do at the end of the precopy (they get an unconditional
>   transition into the stop-and-copy phase of the migration)
> 
> To permit callers to implement arbitrary higher-level policies governing
> when the live migration precopy phase should end, and what should be
> done next:
> - add a precopy_policy() callback to the xc_domain_save() user-supplied
>   callbacks
> - during the precopy phase of live migrations, consult this policy after
>   each batch of pages transmitted and take the dictated action, which
>   may be to a) abort the migration entirely, b) continue with the
>   precopy, or c) proceed to the stop-and-copy phase.
> - provide an implementation of the old policy, used when
>   precopy_policy callback  is not provided.
> 
> Signed-off-by: Jennifer Herbert <Jennifer.Herbert@citrix.com>
> 
> ---
> 
> This is updated/modified subset of patch 7/20, part of
> Joshua Otto's "Add postcopy live migration support." patch,
> dated 27th March 2017.  As indicated on the original thread,
> I wish to make use of this this within the XenServer product.
> I hope this will aid Josh in pushing the remainder of his series.

Does this patch need to carry Joshua's s-o-b, or at least 'suggested-by'?

> ---
>  tools/libxc/include/xenguest.h |  19 ++++++++
>  tools/libxc/xc_sr_common.h     |   7 ++-
>  tools/libxc/xc_sr_save.c       | 102 +++++++++++++++++++++++++++++------
> ------
>  3 files changed, 94 insertions(+), 34 deletions(-)
> 
> diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
> index 6626f0c..d5908dc 100644
> --- a/tools/libxc/include/xenguest.h
> +++ b/tools/libxc/include/xenguest.h
> @@ -39,6 +39,14 @@
>   */
>  struct xenevtchn_handle;
> 
> +/* For save's precopy_policy(). */
> +struct precopy_stats
> +{
> +    unsigned iteration;
> +    unsigned total_written;
> +    long dirty_count; /* -1 if unknown */
> +};
> +
>  /* callbacks provided by xc_domain_save */
>  struct save_callbacks {
>      /* Called after expiration of checkpoint interval,
> @@ -46,6 +54,17 @@ struct save_callbacks {
>       */
>      int (*suspend)(void* data);
> 
> +    /* Called after every batch of page data sent during the precopy phase of
> a
> +     * live migration to ask the caller what to do next based on the current
> +     * state of the precopy migration.
> +     */
> +#define XGS_POLICY_ABORT          (-1) /* Abandon the migration entirely
> and
> +                                        * tidy up. */
> +#define XGS_POLICY_CONTINUE_PRECOPY 0  /* Remain in the precopy
> phase. */
> +#define XGS_POLICY_STOP_AND_COPY    1  /* Immediately suspend and
> transmit the
> +                                        * remaining dirty pages. */
> +    int (*precopy_policy)(struct precopy_stats stats, void *data);

Do we really want to be passing the struct, rather than a pointer to it?

> +
>      /* Called after the guest's dirty pages have been
>       *  copied into an output buffer.
>       * Callback function resumes the guest & the device model,
> diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
> index a83f22a..2bc261b 100644
> --- a/tools/libxc/xc_sr_common.h
> +++ b/tools/libxc/xc_sr_common.h
> @@ -198,12 +198,11 @@ struct xc_sr_context
>              /* Further debugging information in the stream. */
>              bool debug;
> 
> -            /* Parameters for tweaking live migration. */
> -            unsigned max_iterations;
> -            unsigned dirty_threshold;
> -
>              unsigned long p2m_size;
> 
> +            struct precopy_stats stats;
> +            int policy_decision;
> +
>              xen_pfn_t *batch_pfns;
>              unsigned nr_batch_pfns;
>              unsigned long *deferred_pages;
> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
> index 1e7502d..03dfa61 100644
> --- a/tools/libxc/xc_sr_save.c
> +++ b/tools/libxc/xc_sr_save.c
> @@ -452,8 +452,7 @@ static int update_progress_string(struct
> xc_sr_context *ctx,
>      xc_interface *xch = ctx->xch;
>      char *new_str = NULL;
> 
> -    if ( asprintf(&new_str, "Frames iteration %u of %u",
> -                  iter, ctx->save.max_iterations) == -1 )
> +    if ( asprintf(&new_str, "Frames iteration %u", iter) == -1 )
>      {
>          PERROR("Unable to allocate new progress string");
>          return -1;
> @@ -467,6 +466,25 @@ static int update_progress_string(struct
> xc_sr_context *ctx,
>  }
> 
>  /*
> + * This is the live migration precopy policy - it's called periodically during
> + * the precopy phase of live migrations, and is responsible for deciding
> when
> + * the precopy phase should terminate and what should be done next.
> + *
> + * The policy implemented here behaves identically to the policy previously
> + * hard-coded into xc_domain_save() - it proceeds to the stop-and-copy
> phase of
> + * the live migration when there are either fewer than 50 dirty pages, or
> more
> + * than 5 precopy rounds have completed.
> + */
> +static int simple_precopy_policy(
> +    struct precopy_stats stats, void *user)
> +{
> +    return ((stats.dirty_count >= 0 && stats.dirty_count < 50) ||
> +            stats.iteration >= 5)
> +        ? XGS_POLICY_STOP_AND_COPY
> +        : XGS_POLICY_CONTINUE_PRECOPY;
> +}
> +
> +/*
>   * Send memory while guest is running.
>   */
>  static int send_memory_live(struct xc_sr_context *ctx)
> @@ -474,21 +492,62 @@ static int send_memory_live(struct xc_sr_context
> *ctx)
>      xc_interface *xch = ctx->xch;
>      xc_shadow_op_stats_t stats = { 0, ctx->save.p2m_size };
>      char *progress_str = NULL;
> -    unsigned x;
> +    unsigned int x = 0;
>      int rc;
> +    int policy_decision;
> +
> +    DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
> +                                    &ctx->save.dirty_bitmap_hbuf);
> +
> +    int (*precopy_policy)(struct precopy_stats, void *) =
> +        ctx->save.callbacks->precopy_policy;
> +    void *data = ctx->save.callbacks->data;
> +
> +    struct precopy_stats *policy_stats;
> 
>      rc = update_progress_string(ctx, &progress_str, 0);
>      if ( rc )
>          goto out;
> 
> -    rc = send_all_pages(ctx);
> -    if ( rc )
> -        goto out;
> +    ctx->save.stats = (struct precopy_stats)
> +        {
> +            .iteration     = x,
> +            .total_written = 0,

Do the above two initializers need to be there? AFACT x == 0 at this point and c99 says that the struct will be zeroed apart from the specificied initializers.

  Paul

> +            .dirty_count   = ctx->save.p2m_size
> +        };
> +    policy_stats = &ctx->save.stats;
> +
> +    if (precopy_policy == NULL)
> +         precopy_policy = simple_precopy_policy;
> +
> +    bitmap_set(dirty_bitmap, ctx->save.p2m_size);
> +
> +    do {
> +        policy_decision = precopy_policy(*policy_stats, data);
> +        x++;
> +
> +        if ( stats.dirty_count > 0 && policy_decision != XGS_POLICY_ABORT ) {
> +            rc = update_progress_string(ctx, &progress_str, x);
> +            if ( rc )
> +                goto out;
> +
> +            rc = send_dirty_pages(ctx, stats.dirty_count);
> +            if ( rc )
> +                goto out;
> +        }
> +
> +        if (policy_decision != XGS_POLICY_CONTINUE_PRECOPY)
> +            break;
> +
> +        policy_stats->iteration     = x;
> +        policy_stats->total_written += policy_stats->dirty_count;
> +        policy_stats->dirty_count   = -1;
> +
> +        policy_decision = precopy_policy(*policy_stats, data);
> +
> +        if (policy_decision != XGS_POLICY_CONTINUE_PRECOPY)
> +           break;
> 
> -    for ( x = 1;
> -          ((x < ctx->save.max_iterations) &&
> -           (stats.dirty_count > ctx->save.dirty_threshold)); ++x )
> -    {
>          if ( xc_shadow_control(
>                   xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_CLEAN,
>                   &ctx->save.dirty_bitmap_hbuf, ctx->save.p2m_size,
> @@ -499,17 +558,9 @@ static int send_memory_live(struct xc_sr_context
> *ctx)
>              goto out;
>          }
> 
> -        if ( stats.dirty_count == 0 )
> -            break;
> -
> -        rc = update_progress_string(ctx, &progress_str, x);
> -        if ( rc )
> -            goto out;
> +        policy_stats->dirty_count = stats.dirty_count;
> 
> -        rc = send_dirty_pages(ctx, stats.dirty_count);
> -        if ( rc )
> -            goto out;
> -    }
> +    } while (true);
> 
>   out:
>      xc_set_progress_prefix(xch, NULL);
> @@ -601,7 +652,7 @@ static int suspend_and_send_dirty(struct
> xc_sr_context *ctx)
>      if ( ctx->save.live )
>      {
>          rc = update_progress_string(ctx, &progress_str,
> -                                    ctx->save.max_iterations);
> +                                    ctx->save.stats.iteration);
>          if ( rc )
>              goto out;
>      }
> @@ -937,15 +988,6 @@ int xc_domain_save(xc_interface *xch, int io_fd,
> uint32_t dom,
>             stream_type == XC_MIG_STREAM_REMUS ||
>             stream_type == XC_MIG_STREAM_COLO);
> 
> -    /*
> -     * TODO: Find some time to better tweak the live migration algorithm.
> -     *
> -     * These parameters are better than the legacy algorithm especially for
> -     * busy guests.
> -     */
> -    ctx.save.max_iterations = 5;
> -    ctx.save.dirty_threshold = 50;
> -
>      /* Sanity checks for callbacks. */
>      if ( hvm )
>          assert(callbacks->switch_qemu_logdirty);
> --
> 1.8.3.1
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] Introduce migration precopy policy
  2017-09-18  9:05   ` Paul Durrant
@ 2017-09-18 13:20     ` Wei Liu
  2017-09-18 13:21       ` Paul Durrant
  0 siblings, 1 reply; 15+ messages in thread
From: Wei Liu @ 2017-09-18 13:20 UTC (permalink / raw)
  To: Paul Durrant; +Cc: Ian Jackson, jtotto, Jennifer Herbert, Wei Liu, xen-devel

On Mon, Sep 18, 2017 at 10:05:07AM +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> > Jennifer Herbert
> > Sent: 14 September 2017 16:34
> > To: Ian Jackson <Ian.Jackson@citrix.com>; Wei Liu <wei.liu2@citrix.com>;
> > xen-devel@lists.xenproject.org; jtotto@uwaterloo.ca
> > Cc: Jennifer Herbert <jennifer.herbert@citrix.com>
> > Subject: [Xen-devel] [PATCH 2/2] Introduce migration precopy policy
> > 
> > This Patch allows a migration precopy policy to be specified.
> > 
> > The precopy phase of the xc_domain_save() live migration algorithm has
> > historically been implemented to run until either a) (almost) no pages
> > are dirty or b) some fixed, hard-coded maximum number of precopy
> > iterations has been exceeded.  This policy and its implementation are
> > less than ideal for a few reasons:
> > - the logic of the policy is intertwined with the control flow of the
> >   mechanism of the precopy stage
> > - it can't take into account facts external to the immediate
> >   migration context, such external state transfer state, interactive
> >   user input, or the passage of wall-clock time.
> > - it does not permit the user to change their mind, over time, about
> >   what to do at the end of the precopy (they get an unconditional
> >   transition into the stop-and-copy phase of the migration)
> > 
> > To permit callers to implement arbitrary higher-level policies governing
> > when the live migration precopy phase should end, and what should be
> > done next:
> > - add a precopy_policy() callback to the xc_domain_save() user-supplied
> >   callbacks
> > - during the precopy phase of live migrations, consult this policy after
> >   each batch of pages transmitted and take the dictated action, which
> >   may be to a) abort the migration entirely, b) continue with the
> >   precopy, or c) proceed to the stop-and-copy phase.
> > - provide an implementation of the old policy, used when
> >   precopy_policy callback  is not provided.
> > 
> > Signed-off-by: Jennifer Herbert <Jennifer.Herbert@citrix.com>
> > 
> > ---
> > 
> > This is updated/modified subset of patch 7/20, part of
> > Joshua Otto's "Add postcopy live migration support." patch,
> > dated 27th March 2017.  As indicated on the original thread,
> > I wish to make use of this this within the XenServer product.
> > I hope this will aid Josh in pushing the remainder of his series.
> 
> Does this patch need to carry Joshua's s-o-b, or at least 'suggested-by'?

I agree. We need to retain Joshua's s-o-b because this patch is based on
his.

> 
> > ---
> >  tools/libxc/include/xenguest.h |  19 ++++++++
> >  tools/libxc/xc_sr_common.h     |   7 ++-
> >  tools/libxc/xc_sr_save.c       | 102 +++++++++++++++++++++++++++++------
> > ------
> >  3 files changed, 94 insertions(+), 34 deletions(-)
> > 
> > diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
> > index 6626f0c..d5908dc 100644
> > --- a/tools/libxc/include/xenguest.h
> > +++ b/tools/libxc/include/xenguest.h
> > @@ -39,6 +39,14 @@
> >   */
> >  struct xenevtchn_handle;
> > 
> > +/* For save's precopy_policy(). */
> > +struct precopy_stats
> > +{
> > +    unsigned iteration;
> > +    unsigned total_written;
> > +    long dirty_count; /* -1 if unknown */
> > +};
> > +
> >  /* callbacks provided by xc_domain_save */
> >  struct save_callbacks {
> >      /* Called after expiration of checkpoint interval,
> > @@ -46,6 +54,17 @@ struct save_callbacks {
> >       */
> >      int (*suspend)(void* data);
> > 
> > +    /* Called after every batch of page data sent during the precopy phase of
> > a
> > +     * live migration to ask the caller what to do next based on the current
> > +     * state of the precopy migration.
> > +     */
> > +#define XGS_POLICY_ABORT          (-1) /* Abandon the migration entirely
> > and
> > +                                        * tidy up. */
> > +#define XGS_POLICY_CONTINUE_PRECOPY 0  /* Remain in the precopy
> > phase. */
> > +#define XGS_POLICY_STOP_AND_COPY    1  /* Immediately suspend and
> > transmit the
> > +                                        * remaining dirty pages. */
> > +    int (*precopy_policy)(struct precopy_stats stats, void *data);
> 
> Do we really want to be passing the struct, rather than a pointer to it?
> 

IIRC that was discussed in the past. We passed the struct because we
couldn't pass pointers across process boundary.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] Introduce migration precopy policy
  2017-09-18 13:20     ` Wei Liu
@ 2017-09-18 13:21       ` Paul Durrant
  0 siblings, 0 replies; 15+ messages in thread
From: Paul Durrant @ 2017-09-18 13:21 UTC (permalink / raw)
  Cc: Ian Jackson, jtotto, Jennifer Herbert, Wei Liu, xen-devel

> -----Original Message-----
> From: Wei Liu [mailto:wei.liu2@citrix.com]
> Sent: 18 September 2017 14:20
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Jennifer Herbert <jennifer.herbert@citrix.com>; Ian Jackson
> <Ian.Jackson@citrix.com>; Wei Liu <wei.liu2@citrix.com>; xen-
> devel@lists.xenproject.org; jtotto@uwaterloo.ca
> Subject: Re: [Xen-devel] [PATCH 2/2] Introduce migration precopy policy
> 
> On Mon, Sep 18, 2017 at 10:05:07AM +0100, Paul Durrant wrote:
> > > -----Original Message-----
> > > From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of
> > > Jennifer Herbert
> > > Sent: 14 September 2017 16:34
> > > To: Ian Jackson <Ian.Jackson@citrix.com>; Wei Liu <wei.liu2@citrix.com>;
> > > xen-devel@lists.xenproject.org; jtotto@uwaterloo.ca
> > > Cc: Jennifer Herbert <jennifer.herbert@citrix.com>
> > > Subject: [Xen-devel] [PATCH 2/2] Introduce migration precopy policy
> > >
> > > This Patch allows a migration precopy policy to be specified.
> > >
> > > The precopy phase of the xc_domain_save() live migration algorithm has
> > > historically been implemented to run until either a) (almost) no pages
> > > are dirty or b) some fixed, hard-coded maximum number of precopy
> > > iterations has been exceeded.  This policy and its implementation are
> > > less than ideal for a few reasons:
> > > - the logic of the policy is intertwined with the control flow of the
> > >   mechanism of the precopy stage
> > > - it can't take into account facts external to the immediate
> > >   migration context, such external state transfer state, interactive
> > >   user input, or the passage of wall-clock time.
> > > - it does not permit the user to change their mind, over time, about
> > >   what to do at the end of the precopy (they get an unconditional
> > >   transition into the stop-and-copy phase of the migration)
> > >
> > > To permit callers to implement arbitrary higher-level policies governing
> > > when the live migration precopy phase should end, and what should be
> > > done next:
> > > - add a precopy_policy() callback to the xc_domain_save() user-supplied
> > >   callbacks
> > > - during the precopy phase of live migrations, consult this policy after
> > >   each batch of pages transmitted and take the dictated action, which
> > >   may be to a) abort the migration entirely, b) continue with the
> > >   precopy, or c) proceed to the stop-and-copy phase.
> > > - provide an implementation of the old policy, used when
> > >   precopy_policy callback  is not provided.
> > >
> > > Signed-off-by: Jennifer Herbert <Jennifer.Herbert@citrix.com>
> > >
> > > ---
> > >
> > > This is updated/modified subset of patch 7/20, part of
> > > Joshua Otto's "Add postcopy live migration support." patch,
> > > dated 27th March 2017.  As indicated on the original thread,
> > > I wish to make use of this this within the XenServer product.
> > > I hope this will aid Josh in pushing the remainder of his series.
> >
> > Does this patch need to carry Joshua's s-o-b, or at least 'suggested-by'?
> 
> I agree. We need to retain Joshua's s-o-b because this patch is based on
> his.
> 
> >
> > > ---
> > >  tools/libxc/include/xenguest.h |  19 ++++++++
> > >  tools/libxc/xc_sr_common.h     |   7 ++-
> > >  tools/libxc/xc_sr_save.c       | 102 +++++++++++++++++++++++++++++--
> ----
> > > ------
> > >  3 files changed, 94 insertions(+), 34 deletions(-)
> > >
> > > diff --git a/tools/libxc/include/xenguest.h
> b/tools/libxc/include/xenguest.h
> > > index 6626f0c..d5908dc 100644
> > > --- a/tools/libxc/include/xenguest.h
> > > +++ b/tools/libxc/include/xenguest.h
> > > @@ -39,6 +39,14 @@
> > >   */
> > >  struct xenevtchn_handle;
> > >
> > > +/* For save's precopy_policy(). */
> > > +struct precopy_stats
> > > +{
> > > +    unsigned iteration;
> > > +    unsigned total_written;
> > > +    long dirty_count; /* -1 if unknown */
> > > +};
> > > +
> > >  /* callbacks provided by xc_domain_save */
> > >  struct save_callbacks {
> > >      /* Called after expiration of checkpoint interval,
> > > @@ -46,6 +54,17 @@ struct save_callbacks {
> > >       */
> > >      int (*suspend)(void* data);
> > >
> > > +    /* Called after every batch of page data sent during the precopy
> phase of
> > > a
> > > +     * live migration to ask the caller what to do next based on the current
> > > +     * state of the precopy migration.
> > > +     */
> > > +#define XGS_POLICY_ABORT          (-1) /* Abandon the migration entirely
> > > and
> > > +                                        * tidy up. */
> > > +#define XGS_POLICY_CONTINUE_PRECOPY 0  /* Remain in the precopy
> > > phase. */
> > > +#define XGS_POLICY_STOP_AND_COPY    1  /* Immediately suspend
> and
> > > transmit the
> > > +                                        * remaining dirty pages. */
> > > +    int (*precopy_policy)(struct precopy_stats stats, void *data);
> >
> > Do we really want to be passing the struct, rather than a pointer to it?
> >
> 
> IIRC that was discussed in the past. We passed the struct because we
> couldn't pass pointers across process boundary.

Ok, that's fine. As long as there is a good reason :-)

  Paul

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] Introduce migration precopy policy
  2017-09-14 15:33 ` [PATCH 2/2] Introduce migration precopy policy Jennifer Herbert
  2017-09-18  9:05   ` Paul Durrant
@ 2017-09-18 13:30   ` Ian Jackson
  2017-09-18 14:46     ` Jennifer Herbert
  2017-09-18 13:49   ` Wei Liu
  2 siblings, 1 reply; 15+ messages in thread
From: Ian Jackson @ 2017-09-18 13:30 UTC (permalink / raw)
  To: Jennifer Herbert; +Cc: xen-devel, Wei Liu, jtotto

Jennifer Herbert writes ("[PATCH 2/2] Introduce migration precopy policy"):
> This Patch allows a migration precopy policy to be specified.

But only for direct libxc users.  How do you think this should be
exposed via libxl ?

The general approach, so far, seems sound.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] Introduce migration precopy policy
  2017-09-14 15:33 ` [PATCH 2/2] Introduce migration precopy policy Jennifer Herbert
  2017-09-18  9:05   ` Paul Durrant
  2017-09-18 13:30   ` Ian Jackson
@ 2017-09-18 13:49   ` Wei Liu
  2 siblings, 0 replies; 15+ messages in thread
From: Wei Liu @ 2017-09-18 13:49 UTC (permalink / raw)
  To: Jennifer Herbert; +Cc: Wei Liu, Ian Jackson, jtotto, xen-devel

On Thu, Sep 14, 2017 at 04:33:58PM +0100, Jennifer Herbert wrote:
> This Patch allows a migration precopy policy to be specified.
> 
> The precopy phase of the xc_domain_save() live migration algorithm has
> historically been implemented to run until either a) (almost) no pages
> are dirty or b) some fixed, hard-coded maximum number of precopy
> iterations has been exceeded.  This policy and its implementation are
> less than ideal for a few reasons:
> - the logic of the policy is intertwined with the control flow of the
>   mechanism of the precopy stage
> - it can't take into account facts external to the immediate
>   migration context, such external state transfer state, interactive
>   user input, or the passage of wall-clock time.
> - it does not permit the user to change their mind, over time, about
>   what to do at the end of the precopy (they get an unconditional
>   transition into the stop-and-copy phase of the migration)
> 
> To permit callers to implement arbitrary higher-level policies governing
> when the live migration precopy phase should end, and what should be
> done next:
> - add a precopy_policy() callback to the xc_domain_save() user-supplied
>   callbacks
> - during the precopy phase of live migrations, consult this policy after
>   each batch of pages transmitted and take the dictated action, which
>   may be to a) abort the migration entirely, b) continue with the
>   precopy, or c) proceed to the stop-and-copy phase.
> - provide an implementation of the old policy, used when
>   precopy_policy callback  is not provided.
> 
> Signed-off-by: Jennifer Herbert <Jennifer.Herbert@citrix.com>
> 
> ---
> 
> This is updated/modified subset of patch 7/20, part of
> Joshua Otto's "Add postcopy live migration support." patch,
> dated 27th March 2017.  As indicated on the original thread,
> I wish to make use of this this within the XenServer product.
> I hope this will aid Josh in pushing the remainder of his series.
> ---
>  tools/libxc/include/xenguest.h |  19 ++++++++
>  tools/libxc/xc_sr_common.h     |   7 ++-
>  tools/libxc/xc_sr_save.c       | 102 +++++++++++++++++++++++++++++------------
>  3 files changed, 94 insertions(+), 34 deletions(-)
> 
> diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
> index 6626f0c..d5908dc 100644
> --- a/tools/libxc/include/xenguest.h
> +++ b/tools/libxc/include/xenguest.h
> @@ -39,6 +39,14 @@
>   */
>  struct xenevtchn_handle;
>  
> +/* For save's precopy_policy(). */
> +struct precopy_stats
> +{
> +    unsigned iteration;
> +    unsigned total_written;
> +    long dirty_count; /* -1 if unknown */
> +};
> +
>  /* callbacks provided by xc_domain_save */
>  struct save_callbacks {
>      /* Called after expiration of checkpoint interval,
> @@ -46,6 +54,17 @@ struct save_callbacks {
>       */
>      int (*suspend)(void* data);
>  
> +    /* Called after every batch of page data sent during the precopy phase of a
> +     * live migration to ask the caller what to do next based on the current
> +     * state of the precopy migration.
> +     */
> +#define XGS_POLICY_ABORT          (-1) /* Abandon the migration entirely and
> +                                        * tidy up. */
> +#define XGS_POLICY_CONTINUE_PRECOPY 0  /* Remain in the precopy phase. */
> +#define XGS_POLICY_STOP_AND_COPY    1  /* Immediately suspend and transmit the
> +                                        * remaining dirty pages. */
> +    int (*precopy_policy)(struct precopy_stats stats, void *data);
> +
>      /* Called after the guest's dirty pages have been
>       *  copied into an output buffer.
>       * Callback function resumes the guest & the device model,
> diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
> index a83f22a..2bc261b 100644
> --- a/tools/libxc/xc_sr_common.h
> +++ b/tools/libxc/xc_sr_common.h
> @@ -198,12 +198,11 @@ struct xc_sr_context
>              /* Further debugging information in the stream. */
>              bool debug;
>  
> -            /* Parameters for tweaking live migration. */
> -            unsigned max_iterations;
> -            unsigned dirty_threshold;
> -
>              unsigned long p2m_size;
>  
> +            struct precopy_stats stats;
> +            int policy_decision;
> +
>              xen_pfn_t *batch_pfns;
>              unsigned nr_batch_pfns;
>              unsigned long *deferred_pages;
> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
> index 1e7502d..03dfa61 100644
> --- a/tools/libxc/xc_sr_save.c
> +++ b/tools/libxc/xc_sr_save.c
> @@ -452,8 +452,7 @@ static int update_progress_string(struct xc_sr_context *ctx,
>      xc_interface *xch = ctx->xch;
>      char *new_str = NULL;
>  
> -    if ( asprintf(&new_str, "Frames iteration %u of %u",
> -                  iter, ctx->save.max_iterations) == -1 )
> +    if ( asprintf(&new_str, "Frames iteration %u", iter) == -1 )
>      {
>          PERROR("Unable to allocate new progress string");
>          return -1;
> @@ -467,6 +466,25 @@ static int update_progress_string(struct xc_sr_context *ctx,
>  }
>  
>  /*
> + * This is the live migration precopy policy - it's called periodically during
> + * the precopy phase of live migrations, and is responsible for deciding when
> + * the precopy phase should terminate and what should be done next.
> + *
> + * The policy implemented here behaves identically to the policy previously
> + * hard-coded into xc_domain_save() - it proceeds to the stop-and-copy phase of
> + * the live migration when there are either fewer than 50 dirty pages, or more
> + * than 5 precopy rounds have completed.
> + */
> +static int simple_precopy_policy(
> +    struct precopy_stats stats, void *user)

Please join these two lines together.

> +{
> +    return ((stats.dirty_count >= 0 && stats.dirty_count < 50) ||
> +            stats.iteration >= 5)
> +        ? XGS_POLICY_STOP_AND_COPY
> +        : XGS_POLICY_CONTINUE_PRECOPY;
> +}
> +
> +/*
>   * Send memory while guest is running.
>   */
>  static int send_memory_live(struct xc_sr_context *ctx)
> @@ -474,21 +492,62 @@ static int send_memory_live(struct xc_sr_context *ctx)
>      xc_interface *xch = ctx->xch;
>      xc_shadow_op_stats_t stats = { 0, ctx->save.p2m_size };
>      char *progress_str = NULL;
> -    unsigned x;
> +    unsigned int x = 0;
>      int rc;
> +    int policy_decision;
> +
> +    DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
> +                                    &ctx->save.dirty_bitmap_hbuf);
> +
> +    int (*precopy_policy)(struct precopy_stats, void *) =

Please provide typedef for the function pointer to reduce code
repetition.

> +        ctx->save.callbacks->precopy_policy;
> +    void *data = ctx->save.callbacks->data;
> +
> +    struct precopy_stats *policy_stats;
>  
>      rc = update_progress_string(ctx, &progress_str, 0);
>      if ( rc )
>          goto out;
>  
> -    rc = send_all_pages(ctx);
> -    if ( rc )
> -        goto out;
> +    ctx->save.stats = (struct precopy_stats)
> +        {
> +            .iteration     = x,
> +            .total_written = 0,
> +            .dirty_count   = ctx->save.p2m_size
> +        };
> +    policy_stats = &ctx->save.stats;
> +
> +    if (precopy_policy == NULL)

Coding style.

> +         precopy_policy = simple_precopy_policy;
> +
> +    bitmap_set(dirty_bitmap, ctx->save.p2m_size);
> +
> +    do {
> +        policy_decision = precopy_policy(*policy_stats, data);
> +        x++;
> +
> +        if ( stats.dirty_count > 0 && policy_decision != XGS_POLICY_ABORT ) {

Coding style: the bracket should be placed on a new line.

> +            rc = update_progress_string(ctx, &progress_str, x);
> +            if ( rc )
> +                goto out;
> +
> +            rc = send_dirty_pages(ctx, stats.dirty_count);
> +            if ( rc )
> +                goto out;
> +        }
> +
> +        if (policy_decision != XGS_POLICY_CONTINUE_PRECOPY)

Coding style.

> +            break;
> +
> +        policy_stats->iteration     = x;
> +        policy_stats->total_written += policy_stats->dirty_count;
> +        policy_stats->dirty_count   = -1;
> +
> +        policy_decision = precopy_policy(*policy_stats, data);
> +
> +        if (policy_decision != XGS_POLICY_CONTINUE_PRECOPY)

Coding style.

> +           break;
>  
> -    for ( x = 1;
> -          ((x < ctx->save.max_iterations) &&
> -           (stats.dirty_count > ctx->save.dirty_threshold)); ++x )
> -    {
>          if ( xc_shadow_control(
>                   xch, ctx->domid, XEN_DOMCTL_SHADOW_OP_CLEAN,
>                   &ctx->save.dirty_bitmap_hbuf, ctx->save.p2m_size,
> @@ -499,17 +558,9 @@ static int send_memory_live(struct xc_sr_context *ctx)
>              goto out;
>          }
>  
> -        if ( stats.dirty_count == 0 )
> -            break;
> -
> -        rc = update_progress_string(ctx, &progress_str, x);
> -        if ( rc )
> -            goto out;
> +        policy_stats->dirty_count = stats.dirty_count;
>  
> -        rc = send_dirty_pages(ctx, stats.dirty_count);
> -        if ( rc )
> -            goto out;
> -    }
> +    } while (true);
>  

Coding style.

This approach looks OK.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] Introduce migration precopy policy
  2017-09-18 13:30   ` Ian Jackson
@ 2017-09-18 14:46     ` Jennifer Herbert
  2017-09-18 15:43       ` Ian Jackson
  0 siblings, 1 reply; 15+ messages in thread
From: Jennifer Herbert @ 2017-09-18 14:46 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Wei Liu, jtotto


On 18/09/17 14:30, Ian Jackson wrote:
> Jennifer Herbert writes ("[PATCH 2/2] Introduce migration precopy policy"):
>> This Patch allows a migration precopy policy to be specified.
> But only for direct libxc users.  How do you think this should be
> exposed via libxl ?

The original patch had a small amount of code to pass this though such 
that libxl could provide one.  I'm not entity sure how the perl code 
worked, but I've not made any changes that wouldn't stop this working.  
That section would form a follow on patch.  Since I don't use or have 
much understanding of libxl, I didn't really feel I was the best person 
to do this.

-jenny

> The general approach, so far, seems sound.
>
> Thanks,
> Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/2] Introduce migration precopy policy
  2017-09-18 14:46     ` Jennifer Herbert
@ 2017-09-18 15:43       ` Ian Jackson
  0 siblings, 0 replies; 15+ messages in thread
From: Ian Jackson @ 2017-09-18 15:43 UTC (permalink / raw)
  To: Jennifer Herbert; +Cc: xen-devel, Wei Liu, jtotto

Jennifer Herbert writes ("Re: [PATCH 2/2] Introduce migration precopy policy"):
> The original patch had a small amount of code to pass this though such 
> that libxl could provide one.  I'm not entity sure how the perl code 
> worked, but I've not made any changes that wouldn't stop this working.  
> That section would form a follow on patch.  Since I don't use or have 
> much understanding of libxl, I didn't really feel I was the best person 
> to do this.

Can you post that extra bit of code as an RFC 3/2 ?  I think it would
help my understanding of the future path, and maybe I'd like to fix it
up for you :-).

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-09-18 15:43 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-14 15:33 [PATCH 0/2] Introduce migration precopy policy Jennifer Herbert
2017-09-14 15:33 ` [PATCH 1/2] Tidy libxc xc_domain_save Jennifer Herbert
2017-09-18  8:46   ` Paul Durrant
2017-09-14 15:33 ` [PATCH 2/2] Introduce migration precopy policy Jennifer Herbert
2017-09-18  9:05   ` Paul Durrant
2017-09-18 13:20     ` Wei Liu
2017-09-18 13:21       ` Paul Durrant
2017-09-18 13:30   ` Ian Jackson
2017-09-18 14:46     ` Jennifer Herbert
2017-09-18 15:43       ` Ian Jackson
2017-09-18 13:49   ` Wei Liu
2017-09-14 15:33 ` [PATCH 0/2] " Jennifer Herbert
2017-09-14 15:34 ` [PATCH 1/2] Tidy libxc xc_domain_save Jennifer Herbert
2017-09-14 15:59   ` Wei Liu
2017-09-14 15:34 ` [PATCH 2/2] Introduce migration precopy policy Jennifer Herbert

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.