* [PATCH 00/27]  Libxl migration v2
@ 2015-06-15 13:44 Andrew Cooper
  2015-06-15 13:44 ` [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children Andrew Cooper
                   ` (29 more replies)
  0 siblings, 30 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

This series adds support for the libxl migration v2 stream, and untangles the
existing layering violations of the toolstack and qemu records.

At the end of the series, legacy migration is no longer used.

Note: Remus support is broken and (RFC) fixed in separate patches in this
series.  It was too tangled to fix in a bisectable fashion.  Plain
suspend/migrate/resume however is (should be) bisectable along the entire
series.

There are a couple of outstanding questions:

1) What to do about the toolstack/xenstore record.  It is currently being
   passed around as a blob, but it might be better to split it out.

2) What (if any) ABI/API qualifications are needed? (Particularly in reference
   to patch 21)

The Remus code is untested by me, but is hopefully in the correct ballpark.
All other combinations of suspend/migrate/resume have been tested with PV and
HVM guests (qemu-trad and qemu-upstream), including 32 -> 64 bit migration
(which was the underlying bug causing us to write migration v2 in the first
place).

There are some further improvements which could be made.  In particular, it
appears that sending the toolstack record on each checkpoint is redundant, and
there is certainly room for some more pruning of the legacy migration code.

Anyway, thoughts/comments welcome.  Please test!

~Andrew


Andrew Cooper (22):
  tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
  tools/libxc: Always compile the compat qemu variables into xc_sr_context
  tools/libxl: Stash all restore parameters in domain_create_state
  tools/xl: Mandatory flag indicating the format of the migration stream
  tools/libxl: Introduce ROUNDUP()
  tools/libxl: Extra APIs for the save helper
  tools/libxl: Pass restore_fd as a parameter to libxl__xc_domain_restore()
  docs: Libxl migration v2 stream specification
  tools/python: Libxc migration v2 infrastructure
  tools/python: Libxl migration v2 infrastructure
  tools/python: Verification utility for v2 stream spec compliance
  tools/python: Conversion utility for legacy migration streams
  tools/libxl: Support converting a legacy stream to a v2 stream
  tools/libxl: Convert a legacy stream if needed
  tools/libxc+libxl+xl: Restore v2 streams
  tools/libxc+libxl+xl: Save v2 streams
  docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams
  tools/libxl: [RFC] Write checkpoint records into the stream
  tools/libx{c,l}: [RFC] Introduce restore_callbacks.checkpoint()
  tools/libxl: [RFC] Handle checkpoint records in a libxl migration v2 stream
  tools/libxc: Drop all XG_LIBXL_HVM_COMPAT code from libxc
  tools/libxl: Drop all knowledge of toolstack callbacks

Ian Jackson (2):
  libxl: cancellation: Preparations for save/restore cancellation
  libxl: cancellation: Handle SIGTERM in save/restore helper

Ross Lagerwall (3):
  tools/libxl: Migration v2 stream format
  tools/libxl: Infrastructure for reading a libxl migration v2 stream
  tools/libxl: Infrastructure for writing a v2 stream

 docs/specs/libxl-migration-stream.pandoc      |  218 ++++++++
 tools/libxc/Makefile                          |    2 -
 tools/libxc/include/xenguest.h                |    3 +
 tools/libxc/xc_sr_common.h                    |    5 -
 tools/libxc/xc_sr_restore.c                   |   33 +-
 tools/libxc/xc_sr_restore_x86_hvm.c           |  124 -----
 tools/libxc/xc_sr_save_x86_hvm.c              |   36 --
 tools/libxl/Makefile                          |    2 +
 tools/libxl/libxl_aoutils.c                   |    7 +
 tools/libxl/libxl_convert_callout.c           |  146 ++++++
 tools/libxl/libxl_create.c                    |   80 +--
 tools/libxl/libxl_dom.c                       |   61 +--
 tools/libxl/libxl_internal.h                  |  140 ++++-
 tools/libxl/libxl_save_callout.c              |   63 +--
 tools/libxl/libxl_save_helper.c               |   95 ++--
 tools/libxl/libxl_save_msgs_gen.pl            |    9 +-
 tools/libxl/libxl_sr_stream_format.h          |   58 +++
 tools/libxl/libxl_stream_read.c               |  663 ++++++++++++++++++++++++
 tools/libxl/libxl_stream_write.c              |  640 +++++++++++++++++++++++
 tools/libxl/libxl_types.idl                   |    2 +
 tools/libxl/xl_cmdimpl.c                      |    9 +-
 tools/python/Makefile                         |    4 +
 tools/python/scripts/convert-legacy-stream.py |  683 +++++++++++++++++++++++++
 tools/python/scripts/verify-stream-v2.py      |  174 +++++++
 tools/python/setup.py                         |    1 +
 tools/python/xen/migration/libxc.py           |  446 ++++++++++++++++
 tools/python/xen/migration/libxl.py           |  199 +++++++
 tools/python/xen/migration/tests.py           |   54 ++
 tools/python/xen/migration/verify.py          |   37 ++
 29 files changed, 3638 insertions(+), 356 deletions(-)
 create mode 100644 docs/specs/libxl-migration-stream.pandoc
 create mode 100644 tools/libxl/libxl_convert_callout.c
 create mode 100644 tools/libxl/libxl_sr_stream_format.h
 create mode 100644 tools/libxl/libxl_stream_read.c
 create mode 100644 tools/libxl/libxl_stream_write.c
 create mode 100755 tools/python/scripts/convert-legacy-stream.py
 create mode 100755 tools/python/scripts/verify-stream-v2.py
 create mode 100644 tools/python/xen/migration/__init__.py
 create mode 100644 tools/python/xen/migration/libxc.py
 create mode 100644 tools/python/xen/migration/libxl.py
 create mode 100644 tools/python/xen/migration/tests.py
 create mode 100644 tools/python/xen/migration/verify.py

-- 
1.7.10.4


* [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 13:21   ` Ian Campbell
  2015-06-15 13:44 ` [PATCH 02/27] tools/libxc: Always compile the compat qemu variables into xc_sr_context Andrew Cooper
                   ` (28 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

Shortly, libxl will be juggling multiple parallel operations, and will
possibly have to take error decisions before some tasks have been set up.

No child process of libxl will ever have a pid of 0, so gate
libxl__ev_child_inuse() on a pid strictly greater than 0.

This makes it safe to use on a zeroed structure of a task which has not yet
been set up.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>

---
This change does make libxl__ev_child_init() functionally useless.  I am
undecided between leaving it in place in case it is useful in the future, and
removing it completely.
---
 tools/libxl/libxl_internal.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index e96d6b5..6226c18 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -880,7 +880,7 @@ _hidden pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *childw_out,
 static inline void libxl__ev_child_init(libxl__ev_child *childw_out)
                 { childw_out->pid = -1; }
 static inline int libxl__ev_child_inuse(const libxl__ev_child *childw_out)
-                { return childw_out->pid >= 0; }
+                { return childw_out->pid > 0; }
 
 /* Useable (only) in the child to once more make the ctx useable for
  * xenstore operations.  logs failure in the form "what: <error
-- 
1.7.10.4
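
The effect of the changed comparison on a zeroed structure can be sketched in isolation.  This is only an illustration: struct ev_child_sketch, inuse_old() and inuse_new() are hypothetical stand-ins for libxl__ev_child and libxl__ev_child_inuse(), and only the pid field is modelled.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for libxl__ev_child; only pid matters here. */
struct ev_child_sketch {
    pid_t pid;
};

/* Check before this patch: a zeroed structure (pid == 0) looks in use. */
static int inuse_old(const struct ev_child_sketch *c)
{
    return c->pid >= 0;
}

/* Check after this patch: only a strictly positive pid counts as in use. */
static int inuse_new(const struct ev_child_sketch *c)
{
    return c->pid > 0;
}

static void ev_child_demo(void)
{
    struct ev_child_sketch c;

    memset(&c, 0, sizeof(c));     /* task not yet set up */
    assert(inuse_old(&c) == 1);   /* old check misreports it as in use */
    assert(inuse_new(&c) == 0);   /* new check reports it as idle */

    c.pid = -1;                   /* value written by libxl__ev_child_init() */
    assert(inuse_new(&c) == 0);

    c.pid = 1234;                 /* an arbitrary real child pid */
    assert(inuse_new(&c) == 1);
}
```

Both 0 and -1 now read as "not in use", which is why the note above observes that libxl__ev_child_init() becomes functionally redundant.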


* [PATCH 02/27] tools/libxc: Always compile the compat qemu variables into xc_sr_context
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
  2015-06-15 13:44 ` [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 13:22   ` Ian Campbell
  2015-06-15 13:44 ` [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state Andrew Cooper
                   ` (27 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

This is safe (as the variables will simply be unused), and is required for
correct compilation when midway through untangling the libxc/libxl
interaction.

The #define is left in place to highlight that the variables can be removed
once the untangling is complete.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/xc_sr_common.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 565c5da..08c66db 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -307,10 +307,10 @@ struct xc_sr_context
                     void *context;
                     size_t contextsz;
 
-#ifdef XG_LIBXL_HVM_COMPAT
+/* #ifdef XG_LIBXL_HVM_COMPAT */
                     uint32_t qlen;
                     void *qbuf;
-#endif
+/* #endif */
                 } restore;
             };
         } x86_hvm;
-- 
1.7.10.4


* [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
  2015-06-15 13:44 ` [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children Andrew Cooper
  2015-06-15 13:44 ` [PATCH 02/27] tools/libxc: Always compile the compat qemu variables into xc_sr_context Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 13:37   ` Ian Campbell
  2015-06-18  2:32   ` Yang Hongyang
  2015-06-15 13:44 ` [PATCH 04/27] tools/xl: Mandatory flag indicating the format of the migration stream Andrew Cooper
                   ` (26 subsequent siblings)
  29 siblings, 2 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

Shortly more parameters will appear, and this saves unboxing each one.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_create.c       |   12 ++++++------
 tools/libxl/libxl_internal.h     |    2 +-
 tools/libxl/libxl_save_callout.c |    2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 86384d2..385891c 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1577,8 +1577,8 @@ static void domain_create_cb(libxl__egc *egc,
                              int rc, uint32_t domid);
 
 static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
-                            uint32_t *domid,
-                            int restore_fd, int checkpointed_stream,
+                            uint32_t *domid, int restore_fd,
+                            const libxl_domain_restore_params *params,
                             const libxl_asyncop_how *ao_how,
                             const libxl_asyncprogress_how *aop_console_how)
 {
@@ -1591,8 +1591,8 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
     libxl_domain_config_init(&cdcs->dcs.guest_config_saved);
     libxl_domain_config_copy(ctx, &cdcs->dcs.guest_config_saved, d_config);
     cdcs->dcs.restore_fd = restore_fd;
+    if (params) cdcs->dcs.restore_params = *params;
     cdcs->dcs.callback = domain_create_cb;
-    cdcs->dcs.checkpointed_stream = checkpointed_stream;
     libxl__ao_progress_gethow(&cdcs->dcs.aop_console_how, aop_console_how);
     cdcs->domid_out = domid;
 
@@ -1619,7 +1619,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
                             const libxl_asyncop_how *ao_how,
                             const libxl_asyncprogress_how *aop_console_how)
 {
-    return do_domain_create(ctx, d_config, domid, -1, 0,
+    return do_domain_create(ctx, d_config, domid, -1, NULL,
                             ao_how, aop_console_how);
 }
 
@@ -1629,8 +1629,8 @@ int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
                                 const libxl_asyncop_how *ao_how,
                                 const libxl_asyncprogress_how *aop_console_how)
 {
-    return do_domain_create(ctx, d_config, domid, restore_fd,
-                            params->checkpointed_stream, ao_how, aop_console_how);
+    return do_domain_create(ctx, d_config, domid, restore_fd, params,
+                            ao_how, aop_console_how);
 }
 
 /*
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 6226c18..796bd21 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3122,11 +3122,11 @@ struct libxl__domain_create_state {
     libxl_domain_config *guest_config;
     libxl_domain_config guest_config_saved; /* vanilla config */
     int restore_fd;
+    libxl_domain_restore_params restore_params;
     libxl__domain_create_cb *callback;
     libxl_asyncprogress_how aop_console_how;
     /* private to domain_create */
     int guest_domid;
-    int checkpointed_stream;
     libxl__domain_build_state build_state;
     libxl__bootloader_state bl;
     libxl__stub_dm_spawn_state dmss;
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 40b25e4..3585a84 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -59,7 +59,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
         state->store_domid, state->console_port,
         state->console_domid,
         hvm, pae, superpages,
-        cbflags, dcs->checkpointed_stream,
+        cbflags, dcs->restore_params.checkpointed_stream,
     };
 
     dcs->shs.ao = ao;
-- 
1.7.10.4


* [PATCH 04/27] tools/xl: Mandatory flag indicating the format of the migration stream
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (2 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 13:39   ` Ian Campbell
  2015-06-15 13:44 ` [PATCH 05/27] tools/libxl: Introduce ROUNDUP() Andrew Cooper
                   ` (25 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

Introduced at this point so the python stream conversion code has a concrete
ABI to use.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/xl_cmdimpl.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index c858068..ddb293c 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -109,6 +109,7 @@
    */
 
 #define XL_MANDATORY_FLAG_JSON (1U << 0) /* config data is in JSON format */
+#define XL_MANDATORY_FLAG_STREAMv2 (1U << 1) /* stream is v2 */
 #define XL_MANDATORY_FLAG_ALL  (XL_MANDATORY_FLAG_JSON)
 struct save_file_header {
     char magic[32]; /* savefileheader_magic */
-- 
1.7.10.4
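
A minimal sketch of how a reader of the save file header might use these bits.  The two flag values are copied from the hunk above; flags_understood() and stream_is_v2() are hypothetical helpers for illustration, not functions in xl, and the exact check xl performs is an assumption here.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in the hunk above. */
#define XL_MANDATORY_FLAG_JSON     (1U << 0) /* config data is in JSON format */
#define XL_MANDATORY_FLAG_STREAMv2 (1U << 1) /* stream is v2 */

/* Hypothetical helper: nonzero if every mandatory bit set in 'flags'
 * is one the reader declares it understands. */
static int flags_understood(uint32_t flags, uint32_t understood)
{
    return (flags & ~understood) == 0;
}

/* Hypothetical helper: nonzero if the header marks the stream as v2. */
static int stream_is_v2(uint32_t flags)
{
    return (flags & XL_MANDATORY_FLAG_STREAMv2) != 0;
}

static void mandatory_flags_demo(void)
{
    uint32_t hdr = XL_MANDATORY_FLAG_JSON | XL_MANDATORY_FLAG_STREAMv2;

    assert(stream_is_v2(hdr));
    assert(!stream_is_v2(XL_MANDATORY_FLAG_JSON));

    /* A reader that only understands the JSON bit must reject this header. */
    assert(!flags_understood(hdr, XL_MANDATORY_FLAG_JSON));
    assert(flags_understood(hdr, XL_MANDATORY_FLAG_JSON |
                                 XL_MANDATORY_FLAG_STREAMv2));
}
```

Note that XL_MANDATORY_FLAG_ALL is left unchanged by this patch, so at this point in the series the new bit is defined but not yet part of the understood set.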


* [PATCH 05/27] tools/libxl: Introduce ROUNDUP()
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (3 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 04/27] tools/xl: Mandatory flag indicating the format of the migration stream Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 13:39   ` Ian Campbell
  2015-06-15 13:44 ` [PATCH 06/27] libxl: cancellation: Preparations for save/restore cancellation Andrew Cooper
                   ` (24 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

This is the same as is used by libxc.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_internal.h |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 796bd21..a4636ca 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -109,6 +109,9 @@
 
 #define ARRAY_SIZE(a) (sizeof(a) / sizeof(a[0]))
 
+#define ROUNDUP(_val, _order)                                           \
+    (((unsigned long)(_val)+(1UL<<(_order))-1) & ~((1UL<<(_order))-1))
+
 #define min(X, Y) ({                             \
             const typeof (X) _x = (X);           \
             const typeof (Y) _y = (Y);           \
-- 
1.7.10.4
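
A few worked values may help when reading the macro: the second argument is the log2 of the alignment, not the alignment itself, so an order of 3 rounds up to a multiple of 8 and an order of 12 to a multiple of 4096.  The macro below is copied from the hunk above; roundup_demo() is a hypothetical test function.

```c
#include <assert.h>

/* Copied from the hunk above. */
#define ROUNDUP(_val, _order)                                           \
    (((unsigned long)(_val)+(1UL<<(_order))-1) & ~((1UL<<(_order))-1))

static void roundup_demo(void)
{
    assert(ROUNDUP(0, 3) == 0);         /* zero stays zero */
    assert(ROUNDUP(5, 3) == 8);         /* 5 + 7 = 12, masked down to 8 */
    assert(ROUNDUP(8, 3) == 8);         /* already aligned, unchanged */
    assert(ROUNDUP(4097, 12) == 8192);  /* one byte past a 4096 boundary */
}
```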


* [PATCH 06/27] libxl: cancellation: Preparations for save/restore cancellation
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (4 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 05/27] tools/libxl: Introduce ROUNDUP() Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-15 13:44 ` [PATCH 07/27] libxl: cancellation: Handle SIGTERM in save/restore helper Andrew Cooper
                   ` (23 subsequent siblings)
  29 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

From: Ian Jackson <ian.jackson@eu.citrix.com>

Two unrelated non-functional changes, broken out into a pre-patch for
easier review:

Break out a function sendsig() in libxl_save_callout.c.

Move io_fd to be a global variable in libxl_save_helper.c.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

Turn sendsig() into libxl__kill(), as it is needed in other translation
units.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_aoutils.c      |    7 +++++++
 tools/libxl/libxl_internal.h     |    2 ++
 tools/libxl/libxl_save_callout.c |    4 +---
 tools/libxl/libxl_save_helper.c  |    5 +++--
 4 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl_aoutils.c b/tools/libxl/libxl_aoutils.c
index ef679dd..bea0282 100644
--- a/tools/libxl/libxl_aoutils.c
+++ b/tools/libxl/libxl_aoutils.c
@@ -593,3 +593,10 @@ bool libxl__async_exec_inuse(const libxl__async_exec_state *aes)
     assert(time_inuse == child_inuse);
     return child_inuse;
 }
+
+void libxl__kill(libxl__gc *gc, pid_t pid, int sig, const char *what)
+{
+    int r = kill(pid, sig);
+    if (r) LOGE(WARN, "failed to kill() %s [%lu] (signal %d)",
+                what, (unsigned long)pid, sig);
+}
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index a4636ca..4f204f9 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2161,6 +2161,8 @@ struct libxl__async_exec_state {
 int libxl__async_exec_start(libxl__gc *gc, libxl__async_exec_state *aes);
 bool libxl__async_exec_inuse(const libxl__async_exec_state *aes);
 
+void libxl__kill(libxl__gc *gc, pid_t pid, int sig, const char *what);
+
 /*----- device addition/removal -----*/
 
 typedef struct libxl__ao_device libxl__ao_device;
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 3585a84..231de2f 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -253,9 +253,7 @@ static void helper_failed(libxl__egc *egc, libxl__save_helper_state *shs,
         return;
     }
 
-    int r = kill(shs->child.pid, SIGKILL);
-    if (r) LOGE(WARN, "failed to kill save/restore helper [%lu]",
-                (unsigned long)shs->child.pid);
+    libxl__kill(gc, shs->child.pid, SIGKILL, "save/restore helper");
 }
 
 static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev,
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 74826a1..7514b2e 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -85,6 +85,7 @@ static void tellparent_destroy(struct xentoollog_logger *logger_in)
     tellparent_destroy,
 };
 static xc_interface *xch;
+static int io_fd;
 
 /*----- error handling -----*/
 
@@ -211,7 +212,7 @@ int main(int argc, char **argv)
 
     if (!strcmp(mode,"--save-domain")) {
 
-        int io_fd =                atoi(NEXTARG);
+        io_fd =                    atoi(NEXTARG);
         uint32_t dom =             strtoul(NEXTARG,0,10);
         uint32_t max_iters =       strtoul(NEXTARG,0,10);
         uint32_t max_factor =      strtoul(NEXTARG,0,10);
@@ -234,7 +235,7 @@ int main(int argc, char **argv)
 
     } else if (!strcmp(mode,"--restore-domain")) {
 
-        int io_fd =                atoi(NEXTARG);
+        io_fd =                    atoi(NEXTARG);
         uint32_t dom =             strtoul(NEXTARG,0,10);
         unsigned store_evtchn =    strtoul(NEXTARG,0,10);
         domid_t store_domid =      strtoul(NEXTARG,0,10);
-- 
1.7.10.4


* [PATCH 07/27] libxl: cancellation: Handle SIGTERM in save/restore helper
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (5 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 06/27] libxl: cancellation: Preparations for save/restore cancellation Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-15 13:44 ` [PATCH 08/27] tools/libxl: Extra APIs for the save helper Andrew Cooper
                   ` (22 subsequent siblings)
  29 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell

From: Ian Jackson <ian.jackson@eu.citrix.com>

During startup of the save/restore helper, set the disposition of
SIGTERM appropriately.

For restore, we can simply die immediately - there is no point trying
to do any kind of cleanup on what is now going to be a trashed domain.

For save, we want to arrange that libxc's cleanup code (eg turning off
logdirty) takes place.  So our signal handler replaces the fd with one
on which writes will fail, causing libxc's own loop to fail next time
it actually tries to do a write.

Currently this has only a minor beneficial effect: we don't send the
helper a SIGTERM ourselves, and if someone else contrives to send our
helper a SIGTERM they have probably sent one to libxl too in which
case things are going to be a bit messy anyway.

But in the next patch libxl is going to use SIGTERM itself on ao
cancellation.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_save_helper.c |   57 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 7514b2e..4b72f24 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -40,8 +40,10 @@
 #include <unistd.h>
 #include <assert.h>
 #include <inttypes.h>
+#include <fcntl.h>
 
 #include "libxl.h"
+#include "libxl_utils.h"
 
 #include "xenctrl.h"
 #include "xenguest.h"
@@ -120,6 +122,57 @@ static void *xmalloc(size_t sz)
     return r;
 }
 
+/*----- signal handling -----*/
+
+static int unwriteable_fd;
+
+static void save_signal_handler(int num)
+{
+    /*
+     * We want to be able to interrupt save.  But the code in libxc
+     * which does the actual saving is straight-through, and we need
+     * to execute its error path to put the guest back to sanity.
+     *
+     * So what we do is this: when we get the signal, we dup2
+     * the result of open("/dev/null",O_RDONLY) onto the output fd.
+     *
+     * This is guaranteed to 1. interrupt libxc's write (causing it to
+     * return short, or maybe EINTR); 2. make the next write give
+     * EBADF, so that: 3. at latest, libxc will notice when it next
+     * tries to write data and will then go into its cleanup path.
+     *
+     * We make no effort here to sanitise the resulting errors.
+     * That's libxl's job.
+     */
+    int esave = errno;
+
+    int r = dup2(unwriteable_fd, io_fd);
+    assert(r == io_fd); /* if not we can't write an xtl message because we
+                         * might end up interleaving on our control stream */
+
+    errno = esave;
+}
+
+static void setup_signals(void (*handler)(int))
+{
+    struct sigaction sa = { { 0 } };
+    sigset_t spmask;
+    int r;
+
+    unwriteable_fd = open("/dev/null",O_RDONLY);
+    if (unwriteable_fd < 0) fail(errno,"open /dev/null for reading");
+
+    sa.sa_handler = handler;
+    sigemptyset(&sa.sa_mask);
+    r = sigaction(SIGTERM, &sa, 0);
+    if (r) fail(errno,"sigaction SIGTERM failed");
+
+    sigemptyset(&spmask);
+    sigaddset(&spmask,SIGTERM);
+    r = sigprocmask(SIG_UNBLOCK,&spmask,0);
+    if (r) fail(errno,"sigprocmask unblock SIGTERM failed");
+}
+
 /*----- helper functions called by autogenerated stubs -----*/
 
 unsigned char * helper_allocbuf(int len, void *user)
@@ -229,6 +282,8 @@ int main(int argc, char **argv)
         helper_setcallbacks_save(&helper_save_callbacks, cbflags);
 
         startup("save");
+        setup_signals(save_signal_handler);
+
         r = xc_domain_save(xch, io_fd, dom, max_iters, max_factor, flags,
                            &helper_save_callbacks, hvm);
         complete(r);
@@ -254,6 +309,8 @@ int main(int argc, char **argv)
         unsigned long console_mfn = 0;
 
         startup("restore");
+        setup_signals(SIG_DFL);
+
         r = xc_domain_restore(xch, io_fd, dom, store_evtchn, &store_mfn,
                               store_domid, console_evtchn, &console_mfn,
                               console_domid, hvm, pae, superpages,
-- 
1.7.10.4
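
The fd replacement described in the handler comment can be tried outside the helper.  The sketch below covers only step 2 of that comment (the next write failing with EBADF); it does not install a signal handler, and a pipe stands in for the migration stream.  write_after_redirect() and redirect_demo() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * dup2() a read-only /dev/null fd over the write end of a pipe, then try
 * to write to it.  Returns the errno of the failed write, 0 if the write
 * unexpectedly succeeded, or -1 if the setup itself failed.
 */
static int write_after_redirect(void)
{
    int pipefd[2];
    int unwriteable_fd, result = -1;

    if (pipe(pipefd) != 0)
        return -1;

    unwriteable_fd = open("/dev/null", O_RDONLY);
    if (unwriteable_fd >= 0) {
        /* Same operation as save_signal_handler() performs on io_fd. */
        if (dup2(unwriteable_fd, pipefd[1]) == pipefd[1]) {
            ssize_t n = write(pipefd[1], "x", 1);
            result = (n < 0) ? errno : 0;
        }
        close(unwriteable_fd);
    }

    close(pipefd[0]);
    close(pipefd[1]);
    return result;
}

static void redirect_demo(void)
{
    /* The fd number is still valid, but it is no longer open for writing. */
    assert(write_after_redirect() == EBADF);
}
```

The point of the design is that the fd number seen by libxc stays valid, so nothing needs to be communicated to libxc: its next write simply fails and it takes its normal error path.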


* [PATCH 08/27] tools/libxl: Extra APIs for the save helper
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (6 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 07/27] libxl: cancellation: Handle SIGTERM in save/restore helper Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 13:50   ` Ian Campbell
  2015-06-15 13:44 ` [PATCH 09/27] tools/libxl: Pass restore_fd as a parameter to libxl__xc_domain_restore() Andrew Cooper
                   ` (21 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

With libxl migration v2, there will be other moving parts which might fail,
requiring the helper to be stopped for reasons which are not its fault.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_internal.h     |    8 ++++++++
 tools/libxl/libxl_save_callout.c |   16 ++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 4f204f9..3fcc37a 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3182,6 +3182,14 @@ _hidden void libxl__xc_domain_restore(libxl__egc *egc,
 _hidden void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
                                            int rc, int retval, int errnoval);
 
+_hidden void libxl__save_helper_abort(libxl__egc *egc,
+                                      libxl__save_helper_state *shs);
+
+static inline bool libxl__save_helper_inuse(const libxl__save_helper_state *shs)
+{
+    return libxl__ev_child_inuse(&shs->child);
+}
+
 /* Each time the dm needs to be saved, we must call suspend and then save */
 _hidden int libxl__domain_suspend_device_model(libxl__gc *gc,
                                            libxl__domain_suspend_state *dss);
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 231de2f..71de297 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -256,6 +256,22 @@ static void helper_failed(libxl__egc *egc, libxl__save_helper_state *shs,
     libxl__kill(gc, shs->child.pid, SIGKILL, "save/restore helper");
 }
 
+void libxl__save_helper_abort(libxl__egc *egc,
+                              libxl__save_helper_state *shs)
+{
+    STATE_AO_GC(shs->ao);
+
+    if (!libxl__ev_child_inuse(&shs->child)) {
+        helper_failed(egc, shs, ERROR_FAIL);
+        return;
+    }
+
+    if (!shs->rc)
+        shs->rc = ERROR_FAIL;
+
+    libxl__kill(gc, shs->child.pid, SIGTERM, "save/restore helper");
+}
+
 static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev,
                                    int fd, short events, short revents)
 {
-- 
1.7.10.4


* [PATCH 09/27] tools/libxl: Pass restore_fd as a parameter to libxl__xc_domain_restore()
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (7 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 08/27] tools/libxl: Extra APIs for the save helper Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 13:53   ` Ian Campbell
  2015-06-15 13:44 ` [PATCH 10/27] docs: Libxl migration v2 stream specification Andrew Cooper
                   ` (20 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

If a conversion of a legacy stream is needed, libxl__xc_domain_restore() will
need to use an fd other than the one found in the domain_create_state.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_create.c       |    2 +-
 tools/libxl/libxl_internal.h     |    1 +
 tools/libxl/libxl_save_callout.c |    2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 385891c..a37cdf8 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1057,7 +1057,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
         rc = ERROR_INVAL;
         goto out;
     }
-    libxl__xc_domain_restore(egc, dcs,
+    libxl__xc_domain_restore(egc, dcs, restore_fd,
                              hvm, pae, superpages);
     return;
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 3fcc37a..101994f 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3175,6 +3175,7 @@ _hidden int libxl__toolstack_save(uint32_t domid, uint8_t **buf,
 /* calls libxl__xc_domain_restore_done when done */
 _hidden void libxl__xc_domain_restore(libxl__egc *egc,
                                       libxl__domain_create_state *dcs,
+                                      int restore_fd,
                                       int hvm, int pae, int superpages);
 /* If rc==0 then retval is the return value from xc_domain_save
  * and errnoval is the errno value it provided.
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 71de297..0579372 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -41,13 +41,13 @@ static void helper_exited(libxl__egc *egc, libxl__ev_child *ch,
 /*----- entrypoints -----*/
 
 void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
+                              int restore_fd,
                               int hvm, int pae, int superpages)
 {
     STATE_AO_GC(dcs->ao);
 
     /* Convenience aliases */
     const uint32_t domid = dcs->guest_domid;
-    const int restore_fd = dcs->restore_fd;
     libxl__domain_build_state *const state = &dcs->build_state;
 
     unsigned cbflags = libxl__srm_callout_enumcallbacks_restore
-- 
1.7.10.4


* [PATCH 10/27] docs: Libxl migration v2 stream specification
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (8 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 09/27] tools/libxl: Pass restore_fd as a parameter to libxl__xc_domain_restore() Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 13:58   ` Ian Campbell
  2015-06-15 13:44 ` [PATCH 11/27] tools/python: Libxc migration v2 infrastructure Andrew Cooper
                   ` (19 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 docs/specs/libxl-migration-stream.pandoc |  205 ++++++++++++++++++++++++++++++
 1 file changed, 205 insertions(+)
 create mode 100644 docs/specs/libxl-migration-stream.pandoc

diff --git a/docs/specs/libxl-migration-stream.pandoc b/docs/specs/libxl-migration-stream.pandoc
new file mode 100644
index 0000000..7235317
--- /dev/null
+++ b/docs/specs/libxl-migration-stream.pandoc
@@ -0,0 +1,205 @@
+% LibXenLight Domain Image Format
+% Andrew Cooper <<andrew.cooper3@citrix.com>>
+% Draft B
+
+Introduction
+============
+
+For the purposes of this document, `xl` is used as a representation of any
+implementer of the `libxl` API.  `xl` should be considered completely
+interchangeable with alternates, such as `libvirt` or `xenopsd-xl`.
+
+Purpose
+-------
+
+The _domain image format_ is the context of a running domain used for
+snapshots of a domain or for transferring domains between hosts during
+migration.
+
+There are a number of problems with the domain image format used in Xen 4.5
+and earlier (the _legacy format_):
+
+* There is no `libxl` context information.  `xl` is required to send certain
+  pieces of `libxl` context itself.
+
+* The contents of the stream are passed directly through `libxl` to `libxc`.
+  The legacy `libxc` format contained some information which belonged at the
+  `libxl` level, resulting in an awkward layering violation to return the
+  information back to `libxl`.
+
+* The legacy `libxc` format was inextensible, causing inextensibility in the
+  legacy `libxl` handling.
+
+This design addresses the above points, allowing for a completely
+self-contained, extensible stream with each layer responsible for its own
+appropriate information.
+
+
+Not Yet Included
+----------------
+
+The following features are not yet fully specified and will be
+included in a future draft.
+
+* Remus
+
+* ARM
+
+
+Overview
+========
+
+The image format consists of a _Header_, followed by 1 or more _Records_.
+Each record consists of a type and length field, followed by any type-specific
+data.
+
+\clearpage
+
+Header
+======
+
+The header identifies the stream as a `libxl` stream, including the version of
+this specification that it complies with.
+
+All fields in this header shall be in _big-endian_ byte order, regardless of
+the setting of the endianness bit.
+
+     0     1     2     3     4     5     6     7 octet
+    +-------------------------------------------------+
+    | ident                                           |
+    +-----------------------+-------------------------+
+    | version               | options                 |
+    +-----------------------+-------------------------+
+
+--------------------------------------------------------------------
+Field       Description
+----------- --------------------------------------------------------
+ident       0x4c6962786c466d74 ("LibxlFmt" in ASCII).
+
+version     0x00000002.  The version of this specification.
+
+options     bit 0: Endianness.    0 = little-endian, 1 = big-endian.
+
+            bit 1: Legacy Format. If set, this stream was created by
+                                  the legacy conversion tool.
+
+            bits 2-31: Reserved.
+--------------------------------------------------------------------
+
+The endianness shall be 0 (little-endian) for images generated on an
+i386, x86_64, or arm host.
+
+\clearpage
+
+
+Records
+=======
+
+A record has a record header, type-specific data and trailing padding.  If
+`body_length` is not a multiple of 8, the body is padded with zeroes to align the
+end of the record on an 8 octet boundary.
+
+     0     1     2     3     4     5     6     7 octet
+    +-----------------------+-------------------------+
+    | type                  | body_length             |
+    +-----------+-----------+-------------------------+
+    | body...                                         |
+    ...
+    |           | padding (0 to 7 octets)             |
+    +-----------+-------------------------------------+
+
+--------------------------------------------------------------------
+Field        Description
+-----------  -------------------------------------------------------
+type         0x00000000: END
+
+             0x00000001: LIBXC_CONTEXT
+
+             0x00000002: XENSTORE_DATA
+
+             0x00000003: EMULATOR_CONTEXT
+
+             0x00000004 - 0x7FFFFFFF: Reserved for future _mandatory_
+             records.
+
+             0x80000000 - 0xFFFFFFFF: Reserved for future _optional_
+             records.
+
+body_length  Length in octets of the record body.
+
+body         Content of the record.
+
+padding      0 to 7 octets of zeros to pad the whole record to a multiple
+             of 8 octets.
+--------------------------------------------------------------------
+
+\clearpage
+
+END
+----
+
+An end record marks the end of the image, and shall be the final record
+in the stream.
+
+     0     1     2     3     4     5     6     7 octet
+    +-------------------------------------------------+
+
+The end record contains no fields; its body_length is 0.
+
+LIBXC\_CONTEXT
+--------------
+
+A libxc context record is a marker, indicating that the stream should be
+handed to `xc_domain_restore()`.  `libxc` shall be responsible for reading its
+own image format from the stream.
+
+     0     1     2     3     4     5     6     7 octet
+    +-------------------------------------------------+
+
+The libxc context record contains no fields; its body_length is 0[^1].
+
+
+[^1]: The sending side cannot calculate ahead of time how much data `libxc`
+might write into the stream, especially for live migration where the quantity
+of data is partially proportional to the elapsed time.
+
+XENSTORE\_DATA
+-------------
+
+A record containing xenstore key/value pairs of data.
+
+     0     1     2     3     4     5     6     7 octet
+    +-------------------------------------------------+
+    | xenstore key/value pairs                        |
+    ...
+    +-------------------------------------------------+
+
+EMULATOR\_CONTEXT
+----------------
+
+A context blob for a specific emulator associated with the domain.
+
+     0     1     2     3     4     5     6     7 octet
+    +------------------------+------------------------+
+    | emulator_id            | index                  |
+    +------------------------+------------------------+
+    | emulator_ctx                                    |
+    ...
+    +-------------------------------------------------+
+
+--------------------------------------------------------------------
+Field            Description
+------------     ---------------------------------------------------
+emulator_id      0x00000000: Unknown (In the case of a legacy stream)
+
+                 0x00000001: Qemu Traditional
+
+                 0x00000002: Qemu Upstream
+
+                 0x00000003 - 0xFFFFFFFF: Reserved for future emulators.
+
+index            Index of this emulator for the domain, if multiple
+                 emulators are in use.
+
+emulator_ctx     Emulator context blob.
+--------------------------------------------------------------------
-- 
1.7.10.4
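
As a rough illustration of the framing described in the specification above,
the header and record layout could be parsed along the following lines.  This
is only a sketch in Python 3, not code from this series: `parse_header`,
`read_record` and `_read_exact` are hypothetical names, and it assumes that
record headers follow the endianness bit from the stream header.  It does not
handle LIBXC_CONTEXT, after which the libxc stream follows in-band with a
length that is not known to the reader.

```python
import io
import struct

HDR_FORMAT = ">QII"                 # the header is always big-endian
HDR_IDENT = 0x4c6962786c466d74      # "LibxlFmt" in ASCII
HDR_VERSION = 2
HDR_OPT_BIG_ENDIAN = 1 << 0         # options bit 0
HDR_OPT_LEGACY = 1 << 1             # options bit 1
HDR_OPT_RESERVED = 0xfffffffc       # options bits 2-31

REC_TYPE_END = 0x00000000


def _read_exact(stream, size):
    """Read exactly size octets or fail (hypothetical helper)."""
    data = stream.read(size)
    if len(data) != size:
        raise EOFError("stream truncated")
    return data


def parse_header(stream):
    """Validate the 16-octet header and return (big_endian, legacy)."""
    raw = _read_exact(stream, struct.calcsize(HDR_FORMAT))
    ident, version, options = struct.unpack(HDR_FORMAT, raw)
    if ident != HDR_IDENT:
        raise ValueError("bad ident 0x%x" % ident)
    if version != HDR_VERSION:
        raise ValueError("unsupported version %d" % version)
    if options & HDR_OPT_RESERVED:
        raise ValueError("reserved option bits set")
    return bool(options & HDR_OPT_BIG_ENDIAN), bool(options & HDR_OPT_LEGACY)


def read_record(stream, big_endian=False):
    """Return (type, body) for one record, consuming and checking padding."""
    # Assumption: record headers use the endianness named in the header.
    fmt = ">II" if big_endian else "<II"
    rtype, body_length = struct.unpack(fmt, _read_exact(stream, 8))
    padded = (body_length + 7) & ~7     # round up to a multiple of 8 octets
    content = _read_exact(stream, padded)
    if any(content[body_length:]):
        raise ValueError("non-zero padding")
    return rtype, content[:body_length]


# Example: header, one XENSTORE_DATA record with a 5-octet body, then END.
image = (struct.pack(HDR_FORMAT, HDR_IDENT, HDR_VERSION, 0)
         + struct.pack("<II", 0x00000002, 5) + b"hello" + b"\x00" * 3
         + struct.pack("<II", REC_TYPE_END, 0))
stream = io.BytesIO(image)
big_endian, legacy = parse_header(stream)
records = []
while True:
    rtype, body = read_record(stream, big_endian)
    records.append((rtype, body))
    if rtype == REC_TYPE_END:
        break
```

Keeping the header big-endian while the records follow the endianness bit
means a reader can always identify the stream before it knows the sender's
byte order.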


* [PATCH 11/27] tools/python: Libxc migration v2 infrastructure
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (9 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 10/27] docs: Libxl migration v2 stream specification Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 14:01   ` Ian Campbell
  2015-06-15 13:44 ` [PATCH 12/27] tools/python: Libxl " Andrew Cooper
                   ` (18 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

Contains:
 * Python implementation of the libxc migration v2 records
 * Verification code for spec compliance
 * Unit tests

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/python/setup.py                |    1 +
 tools/python/xen/migration/libxc.py  |  446 ++++++++++++++++++++++++++++++++++
 tools/python/xen/migration/tests.py  |   41 ++++
 tools/python/xen/migration/verify.py |   37 +++
 4 files changed, 525 insertions(+)
 create mode 100644 tools/python/xen/migration/__init__.py
 create mode 100644 tools/python/xen/migration/libxc.py
 create mode 100644 tools/python/xen/migration/tests.py
 create mode 100644 tools/python/xen/migration/verify.py

diff --git a/tools/python/setup.py b/tools/python/setup.py
index 439c429..5bf81be 100644
--- a/tools/python/setup.py
+++ b/tools/python/setup.py
@@ -43,6 +43,7 @@ setup(name            = 'xen',
       version         = '3.0',
       description     = 'Xen',
       packages        = ['xen',
+                         'xen.migration',
                          'xen.lowlevel',
                         ],
       ext_package = "xen.lowlevel",
diff --git a/tools/python/xen/migration/__init__.py b/tools/python/xen/migration/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/tools/python/xen/migration/libxc.py b/tools/python/xen/migration/libxc.py
new file mode 100644
index 0000000..b0255ac
--- /dev/null
+++ b/tools/python/xen/migration/libxc.py
@@ -0,0 +1,446 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+"""
+Libxc Migration v2 streams
+
+Record structures as per docs/specs/libxc-migration-stream.pandoc, and
+verification routines.
+"""
+
+import sys
+
+from struct import calcsize, unpack
+
+from xen.migration.verify import StreamError, RecordError, VerifyBase
+
+# Image Header
+IHDR_FORMAT = "!QIIHHI"
+
+IHDR_MARKER  = 0xffffffffffffffff
+IHDR_IDENT   = 0x58454E46 # "XENF" in ASCII
+IHDR_VERSION = 2
+
+IHDR_OPT_BIT_ENDIAN = 0
+IHDR_OPT_LE = (0 << IHDR_OPT_BIT_ENDIAN)
+IHDR_OPT_BE = (1 << IHDR_OPT_BIT_ENDIAN)
+
+IHDR_OPT_RESZ_MASK = 0xfffe
+
+# Domain Header
+DHDR_FORMAT = "IHHII"
+
+DHDR_TYPE_x86_pv  = 0x00000001
+DHDR_TYPE_x86_hvm = 0x00000002
+DHDR_TYPE_x86_pvh = 0x00000003
+DHDR_TYPE_arm     = 0x00000004
+
+dhdr_type_to_str = {
+    DHDR_TYPE_x86_pv  : "x86 PV",
+    DHDR_TYPE_x86_hvm : "x86 HVM",
+    DHDR_TYPE_x86_pvh : "x86 PVH",
+    DHDR_TYPE_arm     : "ARM",
+}
+
+# Records
+RH_FORMAT = "II"
+
+REC_TYPE_end                  = 0x00000000
+REC_TYPE_page_data            = 0x00000001
+REC_TYPE_x86_pv_info          = 0x00000002
+REC_TYPE_x86_pv_p2m_frames    = 0x00000003
+REC_TYPE_x86_pv_vcpu_basic    = 0x00000004
+REC_TYPE_x86_pv_vcpu_extended = 0x00000005
+REC_TYPE_x86_pv_vcpu_xsave    = 0x00000006
+REC_TYPE_shared_info          = 0x00000007
+REC_TYPE_tsc_info             = 0x00000008
+REC_TYPE_hvm_context          = 0x00000009
+REC_TYPE_hvm_params           = 0x0000000a
+REC_TYPE_toolstack            = 0x0000000b
+REC_TYPE_x86_pv_vcpu_msrs     = 0x0000000c
+REC_TYPE_verify               = 0x0000000d
+REC_TYPE_checkpoint           = 0x0000000e
+
+rec_type_to_str = {
+    REC_TYPE_end                  : "End",
+    REC_TYPE_page_data            : "Page data",
+    REC_TYPE_x86_pv_info          : "x86 PV info",
+    REC_TYPE_x86_pv_p2m_frames    : "x86 PV P2M frames",
+    REC_TYPE_x86_pv_vcpu_basic    : "x86 PV vcpu basic",
+    REC_TYPE_x86_pv_vcpu_extended : "x86 PV vcpu extended",
+    REC_TYPE_x86_pv_vcpu_xsave    : "x86 PV vcpu xsave",
+    REC_TYPE_shared_info          : "Shared info",
+    REC_TYPE_tsc_info             : "TSC info",
+    REC_TYPE_hvm_context          : "HVM context",
+    REC_TYPE_hvm_params           : "HVM params",
+    REC_TYPE_toolstack            : "Toolstack",
+    REC_TYPE_x86_pv_vcpu_msrs     : "x86 PV vcpu msrs",
+    REC_TYPE_verify               : "Verify",
+    REC_TYPE_checkpoint           : "Checkpoint",
+}
+
+# page_data
+PAGE_DATA_FORMAT             = "II"
+PAGE_DATA_PFN_MASK           = (1L << 52) - 1
+PAGE_DATA_PFN_RESZ_MASK      = ((1L << 60) - 1) & ~((1L << 52) - 1)
+
+# flags from xen/public/domctl.h: XEN_DOMCTL_PFINFO_* shifted by 32 bits
+PAGE_DATA_TYPE_SHIFT         = 60
+PAGE_DATA_TYPE_LTABTYPE_MASK = (0x7L << PAGE_DATA_TYPE_SHIFT)
+PAGE_DATA_TYPE_LTAB_MASK     = (0xfL << PAGE_DATA_TYPE_SHIFT)
+PAGE_DATA_TYPE_LPINTAB       = (0x8L << PAGE_DATA_TYPE_SHIFT) # Pinned pagetable
+
+PAGE_DATA_TYPE_NOTAB         = (0x0L << PAGE_DATA_TYPE_SHIFT) # Regular page
+PAGE_DATA_TYPE_L1TAB         = (0x1L << PAGE_DATA_TYPE_SHIFT) # L1 pagetable
+PAGE_DATA_TYPE_L2TAB         = (0x2L << PAGE_DATA_TYPE_SHIFT) # L2 pagetable
+PAGE_DATA_TYPE_L3TAB         = (0x3L << PAGE_DATA_TYPE_SHIFT) # L3 pagetable
+PAGE_DATA_TYPE_L4TAB         = (0x4L << PAGE_DATA_TYPE_SHIFT) # L4 pagetable
+PAGE_DATA_TYPE_BROKEN        = (0xdL << PAGE_DATA_TYPE_SHIFT) # Broken
+PAGE_DATA_TYPE_XALLOC        = (0xeL << PAGE_DATA_TYPE_SHIFT) # Allocate-only
+PAGE_DATA_TYPE_XTAB          = (0xfL << PAGE_DATA_TYPE_SHIFT) # Invalid
+
+# x86_pv_info
+X86_PV_INFO_FORMAT        = "BBHI"
+
+X86_PV_P2M_FRAMES_FORMAT  = "II"
+
+# x86_pv_vcpu_{basic,extended,xsave,msrs}
+X86_PV_VCPU_HDR_FORMAT    = "II"
+
+# tsc_info
+TSC_INFO_FORMAT           = "IIQII"
+
+# hvm_params
+HVM_PARAMS_ENTRY_FORMAT   = "QQ"
+HVM_PARAMS_FORMAT         = "II"
+
+class VerifyLibxc(VerifyBase):
+    """ Verify a Libxc v2 stream """
+
+    def __init__(self, info, read):
+        VerifyBase.__init__(self, info, read)
+
+        self.squashed_pagedata_records = 0
+
+
+    def verify(self):
+        """ Verify a libxc stream """
+
+        self.verify_ihdr()
+        self.verify_dhdr()
+
+        while self.verify_record() != REC_TYPE_end:
+            pass
+
+
+    def verify_ihdr(self):
+        """ Verify an Image Header """
+        marker, ident, version, options, res1, res2 = \
+            self.unpack_exact(IHDR_FORMAT)
+
+        if marker != IHDR_MARKER:
+            raise StreamError("Bad image marker: Expected 0x%x, got 0x%x"
+                              % (IHDR_MARKER, marker))
+
+        if ident != IHDR_IDENT:
+            raise StreamError("Bad image id: Expected 0x%x, got 0x%x"
+                              % (IHDR_IDENT, ident))
+
+        if version != IHDR_VERSION:
+            raise StreamError("Unknown image version: Expected %d, got %d"
+                              % (IHDR_VERSION, version))
+
+        if options & IHDR_OPT_RESZ_MASK:
+            raise StreamError("Reserved bits set in image options field: 0x%x"
+                              % (options & IHDR_OPT_RESZ_MASK))
+
+        if res1 != 0 or res2 != 0:
+            raise StreamError("Reserved bits set in image header: 0x%04x:0x%08x"
+                              % (res1, res2))
+
+        if ( (sys.byteorder == "little") and
+             ((options & IHDR_OPT_BE) != IHDR_OPT_LE) ):
+            raise StreamError(
+                "Stream is not native endianness - unable to validate")
+
+        endian = ["little", "big"][options & IHDR_OPT_BE]
+        self.info("Libxc Image Header: %s endian" % (endian, ))
+
+
+    def verify_dhdr(self):
+        """ Verify a domain header """
+
+        gtype, page_shift, res1, major, minor = \
+            self.unpack_exact(DHDR_FORMAT)
+
+        if gtype not in dhdr_type_to_str:
+            raise StreamError("Unrecognised domain type 0x%x" % (gtype, ))
+
+        if res1 != 0:
+            raise StreamError("Reserved bits set in domain header 0x%04x"
+                              % (res1, ))
+
+        if page_shift != 12:
+            raise StreamError("Page shift expected to be 12.  Got %d"
+                              % (page_shift, ))
+
+        if major == 0:
+            self.info("Domain Header: legacy converted %s"
+                      % (dhdr_type_to_str[gtype], ))
+        else:
+            self.info("Domain Header: %s from Xen %d.%d"
+                      % (dhdr_type_to_str[gtype], major, minor))
+
+
+    def verify_record(self):
+        """ Verify an individual record """
+
+        rtype, length = self.unpack_exact(RH_FORMAT)
+
+        if rtype not in rec_type_to_str:
+            raise StreamError("Unrecognised record type 0x%x" % (rtype, ))
+
+        contentsz = (length + 7) & ~7
+        content = self.rdexact(contentsz)
+
+        if rtype != REC_TYPE_page_data:
+
+            if self.squashed_pagedata_records > 0:
+                self.info("Squashed %d Page Data records together"
+                          % (self.squashed_pagedata_records, ))
+                self.squashed_pagedata_records = 0
+
+            self.info("Libxc Record: %s, length %d"
+                      % (rec_type_to_str[rtype], length))
+
+        else:
+            self.squashed_pagedata_records += 1
+
+        padding = content[length:]
+        if padding != "\x00" * len(padding):
+            raise StreamError("Padding containing non-zero bytes found")
+
+        if rtype not in record_verifiers:
+            raise RuntimeError("No verification function for libxc record '%s'"
+                               % rec_type_to_str[rtype])
+        else:
+            record_verifiers[rtype](self, content[:length])
+
+        return rtype
+
+
+    def verify_record_end(self, content):
+        """ End record """
+
+        if len(content) != 0:
+            raise RecordError("End record with non-zero length")
+
+
+    def verify_record_page_data(self, content):
+        """ Page Data record """
+        minsz = calcsize(PAGE_DATA_FORMAT)
+
+        if len(content) <= minsz:
+            raise RecordError("PAGE_DATA record must be at least %d bytes long"
+                              % (minsz, ))
+
+        count, res1 = unpack(PAGE_DATA_FORMAT, content[:minsz])
+
+        if res1 != 0:
+            raise StreamError("Reserved bits set in PAGE_DATA record 0x%04x"
+                              % (res1, ))
+
+        pfnsz = count * 8
+        if (len(content) - minsz) < pfnsz:
+            raise RecordError("PAGE_DATA record must contain a pfn record for "
+                              "each count")
+
+        pfns = list(unpack("=%dQ" % (count,), content[minsz:minsz + pfnsz]))
+
+        nr_pages = 0
+        for idx, pfn in enumerate(pfns):
+
+            if pfn & PAGE_DATA_PFN_RESZ_MASK:
+                raise RecordError("Reserved bits set in pfn[%d]: 0x%016x"
+                                  % (idx, pfn & PAGE_DATA_PFN_RESZ_MASK))
+
+            if pfn >> PAGE_DATA_TYPE_SHIFT in (5, 6, 7, 8):
+                raise RecordError("Invalid type value in pfn[%d]: 0x%016x"
+                                  % (idx, pfn & PAGE_DATA_TYPE_LTAB_MASK))
+
+            # We expect page data for each normal page or pagetable
+            if PAGE_DATA_TYPE_NOTAB <= (pfn & PAGE_DATA_TYPE_LTABTYPE_MASK) \
+                    <= PAGE_DATA_TYPE_L4TAB:
+                nr_pages += 1
+
+        pagesz = nr_pages * 4096
+        if len(content) != minsz + pfnsz + pagesz:
+            raise RecordError("Expected %u + %u + %u, got %u"
+                              % (minsz, pfnsz, pagesz, len(content)))
+
+
+    def verify_record_x86_pv_info(self, content):
+        """ x86 PV Info record """
+
+        expectedsz = calcsize(X86_PV_INFO_FORMAT)
+        if len(content) != expectedsz:
+            raise RecordError("x86_pv_info: expected length of %d, got %d"
+                              % (expectedsz, len(content)))
+
+        width, levels, res1, res2 = unpack(X86_PV_INFO_FORMAT, content)
+
+        if width not in (4, 8):
+            raise RecordError("Expected width of 4 or 8, got %d" % (width, ))
+
+        if levels not in (3, 4):
+            raise RecordError("Expected levels of 3 or 4, got %d" % (levels, ))
+
+        if res1 != 0 or res2 != 0:
+            raise StreamError("Reserved bits set in X86_PV_INFO: 0x%04x 0x%08x"
+                              % (res1, res2))
+
+        bitness = {4:32, 8:64}[width]
+        self.info("  %sbit guest, %d levels of pagetables" % (bitness, levels))
+
+
+    def verify_record_x86_pv_p2m_frames(self, content):
+        """ x86 PV p2m frames record """
+
+        if len(content) % 8 != 0:
+            raise RecordError("Length expected to be a multiple of 8, not %d"
+                              % (len(content), ))
+
+        start, end = unpack("=II", content[:8])
+        self.info("  Start pfn 0x%x, End 0x%x" % (start, end))
+
+
+    def verify_record_x86_pv_vcpu_generic(self, content, name):
+        """ Generic for all REC_TYPE_x86_pv_vcpu_{basic,extended,xsave,msrs} """
+        minsz = calcsize(X86_PV_VCPU_HDR_FORMAT)
+
+        if len(content) <= minsz:
+            raise RecordError("X86_PV_VCPU_%s record length must be at least %d"
+                              " bytes long" % (name, minsz))
+
+        vcpuid, res1 = unpack(X86_PV_VCPU_HDR_FORMAT, content[:minsz])
+
+        if res1 != 0:
+            raise StreamError(
+                "Reserved bits set in x86_pv_vcpu_%s record 0x%04x"
+                              % (name, res1))
+
+        self.info("  vcpu%d %s context, %d bytes"
+                  % (vcpuid, name, len(content) - minsz))
+
+
+    def verify_record_shared_info(self, content):
+        """ shared info record """
+
+        if len(content) != 4096:
+            raise RecordError("Length expected to be 4096 bytes, not %d"
+                              % (len(content), ))
+
+
+    def verify_record_tsc_info(self, content):
+        """ tsc info record """
+
+        sz = calcsize(TSC_INFO_FORMAT)
+
+        if len(content) != sz:
+            raise RecordError("Length should be %u bytes" % (sz, ))
+
+        mode, khz, nsec, incarn, res1 = unpack(TSC_INFO_FORMAT, content)
+
+        if res1 != 0:
+            raise StreamError("Reserved bits set in TSC_INFO: 0x%08x"
+                              % (res1, ))
+
+        self.info("  Mode %u, %u kHz, %u ns, incarnation %d"
+                  % (mode, khz, nsec, incarn))
+
+
+    def verify_record_hvm_context(self, content):
+        """ hvm context record """
+
+        if len(content) == 0:
+            raise RecordError("Zero length HVM context")
+
+
+    def verify_record_hvm_params(self, content):
+        """ hvm params record """
+
+        sz = calcsize(HVM_PARAMS_FORMAT)
+
+        if len(content) < sz:
+            raise RecordError("Length should be at least %u bytes" % (sz, ))
+
+        count, rsvd = unpack(HVM_PARAMS_FORMAT, content[:sz])
+
+        if rsvd != 0:
+            raise RecordError("Reserved field not zero (0x%04x)" % (rsvd, ))
+
+        sz += count * calcsize(HVM_PARAMS_ENTRY_FORMAT)
+
+        if len(content) != sz:
+            raise RecordError("Length should be %u bytes" % (sz, ))
+
+
+    def verify_record_toolstack(self, _):
+        """ toolstack record """
+        raise DeprecationWarning("Found Toolstack record in stream")
+
+
+    def verify_record_verify(self, content):
+        """ verify record """
+
+        if len(content) != 0:
+            raise RecordError("Verify record with non-zero length")
+
+
+    def verify_record_checkpoint(self, content):
+        """ checkpoint record """
+
+        if len(content) != 0:
+            raise RecordError("Checkpoint record with non-zero length")
+
+
+record_verifiers = {
+    REC_TYPE_end:
+        VerifyLibxc.verify_record_end,
+    REC_TYPE_page_data:
+        VerifyLibxc.verify_record_page_data,
+
+    REC_TYPE_x86_pv_info:
+        VerifyLibxc.verify_record_x86_pv_info,
+    REC_TYPE_x86_pv_p2m_frames:
+        VerifyLibxc.verify_record_x86_pv_p2m_frames,
+
+    REC_TYPE_x86_pv_vcpu_basic:
+        lambda s, x:
+        VerifyLibxc.verify_record_x86_pv_vcpu_generic(s, x, "basic"),
+    REC_TYPE_x86_pv_vcpu_extended:
+        lambda s, x:
+        VerifyLibxc.verify_record_x86_pv_vcpu_generic(s, x, "extended"),
+    REC_TYPE_x86_pv_vcpu_xsave:
+        lambda s, x:
+        VerifyLibxc.verify_record_x86_pv_vcpu_generic(s, x, "xsave"),
+    REC_TYPE_x86_pv_vcpu_msrs:
+        lambda s, x:
+        VerifyLibxc.verify_record_x86_pv_vcpu_generic(s, x, "msrs"),
+
+    REC_TYPE_shared_info:
+        VerifyLibxc.verify_record_shared_info,
+    REC_TYPE_tsc_info:
+        VerifyLibxc.verify_record_tsc_info,
+
+    REC_TYPE_hvm_context:
+        VerifyLibxc.verify_record_hvm_context,
+    REC_TYPE_hvm_params:
+        VerifyLibxc.verify_record_hvm_params,
+    REC_TYPE_toolstack:
+        VerifyLibxc.verify_record_toolstack,
+    REC_TYPE_verify:
+        VerifyLibxc.verify_record_verify,
+    REC_TYPE_checkpoint:
+        VerifyLibxc.verify_record_checkpoint,
+    }
diff --git a/tools/python/xen/migration/tests.py b/tools/python/xen/migration/tests.py
new file mode 100644
index 0000000..3e97268
--- /dev/null
+++ b/tools/python/xen/migration/tests.py
@@ -0,0 +1,41 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+"""
+Unit tests for migration v2 streams
+"""
+
+import unittest
+
+from struct import calcsize
+
+from xen.migration import libxc, libxl
+
+class TestLibxc(unittest.TestCase):
+
+    def test_format_sizes(self):
+
+        for fmt, sz in ( (libxc.IHDR_FORMAT, 24),
+                         (libxc.DHDR_FORMAT, 16),
+                         (libxc.RH_FORMAT, 8),
+
+                         (libxc.PAGE_DATA_FORMAT, 8),
+                         (libxc.X86_PV_INFO_FORMAT, 8),
+                         (libxc.X86_PV_P2M_FRAMES_FORMAT, 8),
+                         (libxc.X86_PV_VCPU_HDR_FORMAT, 8),
+                         (libxc.TSC_INFO_FORMAT, 24),
+                         (libxc.HVM_PARAMS_ENTRY_FORMAT, 16),
+                         (libxc.HVM_PARAMS_FORMAT, 8),
+                         ):
+            self.assertEqual(calcsize(fmt), sz)
+
+
+def test_suite():
+    suite = unittest.TestSuite()
+
+    suite.addTest(unittest.makeSuite(TestLibxc))
+
+    return suite
+
+if __name__ == "__main__":
+    unittest.main()
diff --git a/tools/python/xen/migration/verify.py b/tools/python/xen/migration/verify.py
new file mode 100644
index 0000000..7a42dbf
--- /dev/null
+++ b/tools/python/xen/migration/verify.py
@@ -0,0 +1,37 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+"""
+Common verification infrastructure for v2 streams
+"""
+
+from struct import calcsize, unpack
+
+class StreamError(StandardError):
+    """Error with the stream"""
+    pass
+
+class RecordError(StandardError):
+    """Error with a record in the stream"""
+    pass
+
+
+class VerifyBase(object):
+
+    def __init__(self, info, read):
+
+        self.info = info
+        self.read = read
+
+    def rdexact(self, nr_bytes):
+        """Read exactly nr_bytes from the stream"""
+        _ = self.read(nr_bytes)
+        if len(_) != nr_bytes:
+            raise IOError("Stream truncated")
+        return _
+
+    def unpack_exact(self, fmt):
+        """Unpack a struct format string from the stream"""
+        sz = calcsize(fmt)
+        return unpack(fmt, self.rdexact(sz))
+
-- 
1.7.10.4
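
For reference, the bit layout that `verify_record_page_data()` checks in each
64-bit PFN entry (page type in bits 60-63, reserved bits 52-59, PFN in bits
0-51) can be sketched as below.  This is a Python 3 illustration only, not
part of the patch: `split_pfn`, `has_page_data` and `padded_length` are
hypothetical helper names, and the constants simply restate the masks defined
in libxc.py.

```python
PFN_MASK = (1 << 52) - 1                         # bits 0-51: the PFN itself
PFN_RESERVED_MASK = ((1 << 60) - 1) & ~PFN_MASK  # bits 52-59: must be zero
TYPE_SHIFT = 60                                  # bits 60-63: page type
INVALID_TYPES = (5, 6, 7, 8)                     # rejected by the verifier
PAGE_SIZE = 4096


def padded_length(length):
    """Round a record body length up to the next multiple of 8 octets."""
    return (length + 7) & ~7


def split_pfn(entry):
    """Split one 64-bit page_data entry into (type, pfn)."""
    if entry & PFN_RESERVED_MASK:
        raise ValueError("reserved bits set: 0x%016x" % entry)
    ptype = entry >> TYPE_SHIFT
    if ptype in INVALID_TYPES:
        raise ValueError("invalid page type 0x%x" % ptype)
    return ptype, entry & PFN_MASK


def has_page_data(ptype):
    """True for regular pages and L1-L4 pagetables, pinned or not."""
    # The low three type bits select NOTAB/L1..L4; bit 3 is the pinned flag.
    return (ptype & 0x7) <= 4


def expected_payload(entries):
    """Octets of page contents expected after the PFN array."""
    types = [split_pfn(e)[0] for e in entries]
    return sum(PAGE_SIZE for t in types if has_page_data(t))
```

Broken (0xd), allocate-only (0xe) and invalid (0xf) entries therefore carry a
PFN but contribute no page contents to the record.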


* [PATCH 12/27] tools/python: Libxl migration v2 infrastructure
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (10 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 11/27] tools/python: Libxc migration v2 infrastructure Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-15 13:44 ` [PATCH 13/27] tools/python: Verification utility for v2 stream spec compliance Andrew Cooper
                   ` (17 subsequent siblings)
  29 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

Contains:
 * Python implementation of the libxl migration v2 records
 * Verification code for spec compliance
 * Unit tests

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/python/xen/migration/libxl.py |  188 +++++++++++++++++++++++++++++++++++
 tools/python/xen/migration/tests.py |   13 +++
 2 files changed, 201 insertions(+)
 create mode 100644 tools/python/xen/migration/libxl.py

diff --git a/tools/python/xen/migration/libxl.py b/tools/python/xen/migration/libxl.py
new file mode 100644
index 0000000..4e1f4f8
--- /dev/null
+++ b/tools/python/xen/migration/libxl.py
@@ -0,0 +1,188 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+"""
+Libxl Migration v2 streams
+
+Record structures as per docs/specs/libxl-migration-stream.pandoc, and
+verification routines.
+"""
+
+import sys
+
+from struct import calcsize, unpack
+from xen.migration.verify import StreamError, RecordError, VerifyBase
+from xen.migration.libxc import VerifyLibxc
+
+# Header
+HDR_FORMAT = "!QII"
+
+HDR_IDENT = 0x4c6962786c466d74 # "LibxlFmt" in ASCII
+HDR_VERSION = 2
+
+HDR_OPT_BIT_ENDIAN = 0
+HDR_OPT_BIT_LEGACY = 1
+
+HDR_OPT_LE     = (0 << HDR_OPT_BIT_ENDIAN)
+HDR_OPT_BE     = (1 << HDR_OPT_BIT_ENDIAN)
+HDR_OPT_LEGACY = (1 << HDR_OPT_BIT_LEGACY)
+
+HDR_OPT_RESZ_MASK = 0xfffffffc
+
+# Records
+RH_FORMAT = "II"
+
+REC_TYPE_end              = 0x00000000
+REC_TYPE_libxc_context    = 0x00000001
+REC_TYPE_xenstore_data    = 0x00000002
+REC_TYPE_emulator_context = 0x00000003
+
+rec_type_to_str = {
+    REC_TYPE_end              : "End",
+    REC_TYPE_libxc_context    : "Libxc context",
+    REC_TYPE_xenstore_data    : "Xenstore data",
+    REC_TYPE_emulator_context : "Emulator context",
+}
+
+# emulator_context
+EMULATOR_CONTEXT_FORMAT = "II"
+
+EMULATOR_ID_unknown       = 0x00000000
+EMULATOR_ID_qemu_trad     = 0x00000001
+EMULATOR_ID_qemu_upstream = 0x00000002
+
+emulator_id_to_str = {
+    EMULATOR_ID_unknown       : "Unknown",
+    EMULATOR_ID_qemu_trad     : "Qemu Traditional",
+    EMULATOR_ID_qemu_upstream : "Qemu Upstream",
+}
+
+
+#
+# libxl format
+#
+
+LIBXL_QEMU_SIGNATURE = "DeviceModelRecord0002"
+LIBXL_QEMU_RECORD_HDR = "=%dsI" % (len(LIBXL_QEMU_SIGNATURE), )
+
+class VerifyLibxl(VerifyBase):
+    """ Verify a Libxl v2 stream """
+
+    def __init__(self, info, read):
+        VerifyBase.__init__(self, info, read)
+
+
+    def verify(self):
+        """ Verify a libxl stream """
+
+        self.verify_hdr()
+
+        while self.verify_record() != REC_TYPE_end:
+            pass
+
+
+    def verify_hdr(self):
+        """ Verify a Header """
+        ident, version, options = self.unpack_exact(HDR_FORMAT)
+
+        if ident != HDR_IDENT:
+            raise StreamError("Bad image id: Expected 0x%x, got 0x%x"
+                              % (HDR_IDENT, ident))
+
+        if version != HDR_VERSION:
+            raise StreamError("Unknown image version: Expected %d, got %d"
+                              % (HDR_VERSION, version))
+
+        if options & HDR_OPT_RESZ_MASK:
+            raise StreamError("Reserved bits set in image options field: 0x%x"
+                              % (options & HDR_OPT_RESZ_MASK))
+
+        if ( (sys.byteorder == "little") and
+             ((options & HDR_OPT_BE) != HDR_OPT_LE) ):
+            raise StreamError(
+                "Stream is not native endianness - unable to validate")
+
+        endian = ["little", "big"][options & HDR_OPT_BE]
+
+        if options & HDR_OPT_LEGACY:
+            self.info("Libxl Header: %s endian, legacy converted" % (endian, ))
+        else:
+            self.info("Libxl Header: %s endian" % (endian, ))
+
+
+    def verify_record(self):
+        """ Verify an individual record """
+        rtype, length = self.unpack_exact(RH_FORMAT)
+
+        if rtype not in rec_type_to_str:
+            raise StreamError("Unrecognised record type %x" % (rtype, ))
+
+        self.info("Libxl Record: %s, length %d"
+                  % (rec_type_to_str[rtype], length))
+
+        contentsz = (length + 7) & ~7
+        content = self.rdexact(contentsz)
+
+        padding = content[length:]
+        if padding != "\x00" * len(padding):
+            raise StreamError("Padding containing non-zero bytes found")
+
+        if rtype not in record_verifiers:
+            raise RuntimeError("No verification function for libxl record '%s'"
+                               % rec_type_to_str[rtype])
+        else:
+            record_verifiers[rtype](self, content[:length])
+
+        return rtype
+
+
+    def verify_record_end(self, content):
+        """ End record """
+
+        if len(content) != 0:
+            raise RecordError("End record with non-zero length")
+
+
+    def verify_record_libxc_context(self, content):
+        """ Libxc context record """
+
+        if len(content) != 0:
+            raise RecordError("Libxc context record with non-zero length")
+
+        # Verify the libxc stream, as we can't seek forwards through it
+        VerifyLibxc(self.info, self.read).verify()
+
+
+    def verify_record_xenstore_data(self, content):
+        """ Xenstore Data record """
+
+        if len(content) == 0:
+            raise RecordError("Xenstore data record with zero length")
+
+
+    def verify_record_emulator_context(self, content):
+        """ Emulator Context record """
+        minsz = calcsize(EMULATOR_CONTEXT_FORMAT)
+
+        if len(content) < minsz:
+            raise RecordError("Length must be at least %d bytes, got %d"
+                              % (minsz, len(content)))
+
+        emu_id, emu_idx = unpack(EMULATOR_CONTEXT_FORMAT, content[:minsz])
+
+        if emu_id not in emulator_id_to_str:
+            raise RecordError("Unrecognised emulator id 0x%x" % (emu_id, ))
+
+        self.info("  Index %d, type %s" % (emu_idx, emulator_id_to_str[emu_id]))
+
+
+record_verifiers = {
+    REC_TYPE_end:
+        VerifyLibxl.verify_record_end,
+    REC_TYPE_libxc_context:
+        VerifyLibxl.verify_record_libxc_context,
+    REC_TYPE_xenstore_data:
+        VerifyLibxl.verify_record_xenstore_data,
+    REC_TYPE_emulator_context:
+        VerifyLibxl.verify_record_emulator_context,
+}
diff --git a/tools/python/xen/migration/tests.py b/tools/python/xen/migration/tests.py
index 3e97268..91044cd 100644
--- a/tools/python/xen/migration/tests.py
+++ b/tools/python/xen/migration/tests.py
@@ -30,10 +30,23 @@ class TestLibxc(unittest.TestCase):
             self.assertEqual(calcsize(fmt), sz)
 
 
+class TestLibxl(unittest.TestCase):
+
+    def test_format_sizes(self):
+
+        for fmt, sz in ( (libxl.HDR_FORMAT, 16),
+                         (libxl.RH_FORMAT, 8),
+
+                         (libxl.EMULATOR_CONTEXT_FORMAT, 8),
+                         ):
+            self.assertEqual(calcsize(fmt), sz)
+
+
 def test_suite():
     suite = unittest.TestSuite()
 
     suite.addTest(unittest.makeSuite(TestLibxc))
+    suite.addTest(unittest.makeSuite(TestLibxl))
 
     return suite
 
-- 
1.7.10.4


* [PATCH 13/27] tools/python: Verification utility for v2 stream spec compliance
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (11 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 12/27] tools/python: Libxl " Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-15 13:44 ` [PATCH 14/27] tools/python: Conversion utility for legacy migration streams Andrew Cooper
                   ` (16 subsequent siblings)
  29 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>

---
This is exceedingly useful for development, but of no practical use when
installed into a production dom0.
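
As an illustration of the "xl" format handling in this script, here is a
minimal sketch of how the xl header decides which verifier to run.  It is
not the script itself: detect_format() is a hypothetical stand-in for
skip_xl_header(), it reads from any file-like object, and it uses bytes so
that it also runs on Python 3.  The first 32-bit field is passed as a
placeholder value because neither this sketch nor the script inspects it.

```python
import io
import struct

XL_MAGIC = b"Xen saved domain, xl format\n \0 \r"   # 32 bytes
XL_MANDATORY_FLAG_STREAMv2 = 2


def detect_format(stream):
    """Hypothetical helper: consume an xl header, return the inner format."""
    if stream.read(len(XL_MAGIC)) != XL_MAGIC:
        raise ValueError("No xl header")

    # Four native-order 32-bit fields; only the mandatory flags and the
    # length of the optional data are used here.
    _, mflags, _, optlen = struct.unpack("=IIII", stream.read(16))
    stream.read(optlen)                              # skip optional data

    if mflags & XL_MANDATORY_FLAG_STREAMv2:
        return "libxl"
    return "libxc"


# Build a header with the STREAMv2 flag set and 3 bytes of optional data.
hdr = XL_MAGIC + struct.pack("=IIII", 0, XL_MANDATORY_FLAG_STREAMv2, 0, 3)
stream = io.BytesIO(hdr + b"abc" + b"rest-of-stream")
print(detect_format(stream))
```

After the call, the stream is positioned at the start of the libxl (or
libxc) stream proper, which is what the verifier classes expect.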
---
 tools/python/scripts/verify-stream-v2.py |  174 ++++++++++++++++++++++++++++++
 1 file changed, 174 insertions(+)
 create mode 100755 tools/python/scripts/verify-stream-v2.py

diff --git a/tools/python/scripts/verify-stream-v2.py b/tools/python/scripts/verify-stream-v2.py
new file mode 100755
index 0000000..0cb1a4e
--- /dev/null
+++ b/tools/python/scripts/verify-stream-v2.py
@@ -0,0 +1,174 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+""" Verify a v2 format migration stream """
+
+import sys
+import struct
+import os, os.path
+import syslog
+import traceback
+
+from xen.migration.verify import StreamError, RecordError
+from xen.migration.libxc import VerifyLibxc
+from xen.migration.libxl import VerifyLibxl
+
+fin = None             # Input file/fd
+log_to_syslog = False  # Boolean - Log to syslog instead of stdout/err?
+verbose = False        # Boolean - Summarise stream contents
+quiet = False          # Boolean - Suppress error printing
+
+def info(msg):
+    """Info message, routed to appropriate destination"""
+    if not quiet and verbose:
+        if log_to_syslog:
+            for line in msg.split("\n"):
+                syslog.syslog(syslog.LOG_INFO, line)
+        else:
+            print msg
+
+def err(msg):
+    """Error message, routed to appropriate destination"""
+    if not quiet:
+        if log_to_syslog:
+            for line in msg.split("\n"):
+                syslog.syslog(syslog.LOG_ERR, line)
+        print >> sys.stderr, msg
+
+def stream_read(_ = None):
+    """Read from input"""
+    return fin.read(_)
+
+def rdexact(nr_bytes):
+    """Read exactly nr_bytes from fin"""
+    _ = stream_read(nr_bytes)
+    if len(_) != nr_bytes:
+        raise IOError("Stream truncated")
+    return _
+
+def unpack_exact(fmt):
+    """Unpack a format from fin"""
+    sz = struct.calcsize(fmt)
+    return struct.unpack(fmt, rdexact(sz))
+
+
+def skip_xl_header():
+    """Skip over an xl header in the stream"""
+
+    hdr = rdexact(32)
+    if hdr != "Xen saved domain, xl format\n \0 \r":
+        raise StreamError("No xl header")
+
+    _, mflags, _, optlen = unpack_exact("=IIII")
+    _ = rdexact(optlen)
+
+    info("Processed xl header")
+
+    if mflags & 2: # XL_MANDATORY_FLAG_STREAMv2
+        return "libxl"
+    else:
+        return "libxc"
+
+def read_stream(fmt):
+    """ Read an entire stream """
+
+    try:
+        if fmt == "xl":
+            fmt = skip_xl_header()
+
+        if fmt == "libxc":
+            VerifyLibxc(info, stream_read).verify()
+        else:
+            VerifyLibxl(info, stream_read).verify()
+
+    except (IOError, StreamError, RecordError):
+        err("Stream Error:")
+        err(traceback.format_exc())
+        return 1
+
+    except StandardError:
+        err("Script Error:")
+        err(traceback.format_exc())
+        err("Please fix me")
+        return 2
+
+    return 0
+
+def open_file_or_fd(val, mode, buffering):
+    """
+    If 'val' looks like a decimal integer, open it as an fd.  If not, try to
+    open it as a regular file.
+    """
+
+    fd = -1
+    try:
+        # Does it look like an integer?
+        try:
+            fd = int(val, 10)
+        except ValueError:
+            pass
+
+        # Try to open it...
+        if fd != -1:
+            return os.fdopen(fd, mode, buffering)
+        else:
+            return open(val, mode, buffering)
+
+    except StandardError, e:
+        if fd != -1:
+            err("Unable to open fd %d: %s: %s" %
+                (fd, e.__class__.__name__, e))
+        else:
+            err("Unable to open file '%s': %s: %s" %
+                (val, e.__class__.__name__, e))
+
+    raise SystemExit(2)
+
+def main():
+    """ main """
+    from optparse import OptionParser
+    global fin, quiet, verbose
+
+    # Change stdout to be line-buffered.
+    sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 1)
+
+    parser = OptionParser(usage = "%prog [options]",
+                          description =
+                          "Verify a stream according to the v2 spec")
+
+    # Optional options
+    parser.add_option("-i", "--in", dest = "fin", metavar = "<FD or FILE>",
+                      default = "0",
+                      help = "Stream to verify (defaults to stdin)")
+    parser.add_option("-v", "--verbose", action = "store_true", default = False,
+                      help = "Summarise stream contents")
+    parser.add_option("-q", "--quiet", action = "store_true", default = False,
+                      help = "Suppress all logging/errors")
+    parser.add_option("-f", "--format", dest = "format",
+                      metavar = "<libxc|libxl|xl>", default = "libxc",
+                      choices = ["libxc", "libxl", "xl"],
+                      help = "Format of the incoming stream (defaults to libxc)")
+    parser.add_option("--syslog", action = "store_true", default = False,
+                      help = "Log to syslog instead of stdout")
+
+    opts, _ = parser.parse_args()
+
+    if opts.syslog:
+        global log_to_syslog
+
+        syslog.openlog("verify-stream-v2", syslog.LOG_PID)
+        log_to_syslog = True
+
+    verbose = opts.verbose
+    quiet = opts.quiet
+    fin = open_file_or_fd(opts.fin, "rb", 0)
+
+    return read_stream(opts.format)
+
+if __name__ == "__main__":
+    try:
+        sys.exit(main())
+    except SystemExit, e:
+        sys.exit(e.code)
+    except KeyboardInterrupt:
+        sys.exit(2)
-- 
1.7.10.4


* [PATCH 14/27] tools/python: Conversion utility for legacy migration streams
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (12 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 13/27] tools/python: Verification utility for v2 stream spec compliance Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 14:01   ` Ian Campbell
  2015-06-15 13:44 ` [PATCH 15/27] tools/libxl: Migration v2 stream format Andrew Cooper
                   ` (15 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

This utility takes a legacy stream as input and produces a v2 stream as
output.  It is exec()'d by libxl to provide backwards compatibility.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
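Reviewer note, not part of the patch: a minimal sketch of the record framing
that write_record() emits and that the verifier's verify_record() expects,
namely a (type, length) header, the payload, and zero padding up to the next
8-byte boundary.  frame_record() and unframe_record() are hypothetical names
used only for this illustration, and the sketch uses bytes so that it also
runs on Python 3.

```python
import struct

RH_FORMAT = "II"   # record type, payload length (excluding padding)


def frame_record(rtype, payload=b""):
    """Hypothetical helper: header + payload + zero padding to 8 bytes."""
    length = len(payload)
    plen = (8 - (length & 7)) & 7          # 0..7 bytes of padding
    return struct.pack(RH_FORMAT, rtype, length) + payload + b"\x00" * plen


def unframe_record(buf):
    """Hypothetical helper: inverse of frame_record(), checking the padding."""
    hdrsz = struct.calcsize(RH_FORMAT)
    rtype, length = struct.unpack(RH_FORMAT, buf[:hdrsz])

    contentsz = (length + 7) & ~7          # payload rounded up to 8 bytes
    content = buf[hdrsz:hdrsz + contentsz]
    if len(content) != contentsz:
        raise ValueError("Record truncated")
    if content[length:] != b"\x00" * (contentsz - length):
        raise ValueError("Padding contains non-zero bytes")

    return rtype, content[:length]


record = frame_record(2, b"hello")         # 2 == REC_TYPE_xenstore_data
print(len(record), unframe_record(record))
```

The two padding expressions agree: (8 - (length & 7)) & 7 is the number of
bytes needed to reach the size computed by (length + 7) & ~7.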
 tools/python/Makefile                         |    4 +
 tools/python/scripts/convert-legacy-stream.py |  683 +++++++++++++++++++++++++
 2 files changed, 687 insertions(+)
 create mode 100755 tools/python/scripts/convert-legacy-stream.py

diff --git a/tools/python/Makefile b/tools/python/Makefile
index e933be8..531c862 100644
--- a/tools/python/Makefile
+++ b/tools/python/Makefile
@@ -17,9 +17,13 @@ build: genwrap.py $(XEN_ROOT)/tools/libxl/libxl_types.idl \
 
 .PHONY: install
 install:
+	$(INSTALL_DIR) $(DESTDIR)$(PRIVATE_BINDIR)
+
 	CC="$(CC)" CFLAGS="$(PY_CFLAGS)" $(PYTHON) setup.py install \
 		$(PYTHON_PREFIX_ARG) --root="$(DESTDIR)" --force
 
+	$(INSTALL_PROG) scripts/convert-legacy-stream.py $(DESTDIR)$(PRIVATE_BINDIR)
+
 .PHONY: test
 test:
 	export LD_LIBRARY_PATH=$$(readlink -f ../libxc):$$(readlink -f ../xenstore); $(PYTHON) test.py -b -u
diff --git a/tools/python/scripts/convert-legacy-stream.py b/tools/python/scripts/convert-legacy-stream.py
new file mode 100755
index 0000000..beda9e4
--- /dev/null
+++ b/tools/python/scripts/convert-legacy-stream.py
@@ -0,0 +1,683 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+import sys
+import os, os.path
+import syslog
+import traceback
+
+from struct import calcsize, unpack, pack
+
+from xen.migration import libxc, libxl
+
+__version__ = 1
+
+fin = None             # Input file/fd
+fout = None            # Output file/fd
+twidth = 0             # Legacy toolstack bitness (32 or 64)
+pv = None              # Boolean (pv or hvm)
+qemu = True            # Boolean - process qemu record?
+log_to_syslog = False  # Boolean - Log to syslog instead of stdout/err?
+verbose = False        # Boolean - Summarise stream contents
+
+def stream_read(_ = None):
+    return fin.read(_)
+
+def stream_write(_):
+    return fout.write(_)
+
+def info(msg):
+    """Info message, routed to appropriate destination"""
+    if verbose:
+        if log_to_syslog:
+            for line in msg.split("\n"):
+                syslog.syslog(syslog.LOG_INFO, line)
+        else:
+            print msg
+
+def err(msg):
+    """Error message, routed to appropriate destination"""
+    if log_to_syslog:
+        for line in msg.split("\n"):
+            syslog.syslog(syslog.LOG_ERR, line)
+    print >> sys.stderr, msg
+
+class StreamError(StandardError):
+    pass
+
+class VM(object):
+
+    def __init__(self, fmt):
+        # Common
+        self.p2m_size = 0
+
+        # PV
+        self.max_vcpu_id = 0
+        self.online_vcpu_map = []
+        self.width = 0
+        self.levels = 0
+        self.basic_len = 0
+        self.extd = False
+        self.xsave_len = 0
+
+        # libxl
+        self.libxl = fmt == "libxl"
+        self.xenstore = [] # Deferred "toolstack" records
+
+def write_libxc_ihdr():
+    stream_write(pack(libxc.IHDR_FORMAT,
+                      libxc.IHDR_MARKER,  # Marker
+                      libxc.IHDR_IDENT,   # Ident
+                      libxc.IHDR_VERSION, # Version
+                      libxc.IHDR_OPT_LE,  # Options
+                      0, 0))              # Reserved
+
+def write_libxc_dhdr():
+    if pv:
+        dtype = libxc.DHDR_TYPE_x86_pv
+    else:
+        dtype = libxc.DHDR_TYPE_x86_hvm
+
+    stream_write(pack(libxc.DHDR_FORMAT,
+                      dtype,        # Type
+                      12,           # Page size
+                      0,            # Reserved
+                      0,            # Xen major (converted)
+                      __version__)) # Xen minor (converted)
+
+def write_libxl_hdr():
+    stream_write(pack(libxl.HDR_FORMAT,
+                      libxl.HDR_IDENT,     # Ident
+                      libxl.HDR_VERSION,   # Version 2
+                      libxl.HDR_OPT_LE |   # Options
+                      libxl.HDR_OPT_LEGACY # Little Endian and Legacy
+                      ))
+
+def write_record(rt, *argl):
+    alldata = ''.join(argl)
+    length = len(alldata)
+
+    record = pack(libxc.RH_FORMAT, rt, length) + alldata
+    plen = (8 - (length & 7)) & 7
+    record += '\x00' * plen
+
+    stream_write(record)
+
+def write_libxc_pv_info(vm):
+    write_record(libxc.REC_TYPE_x86_pv_info,
+                 pack(libxc.X86_PV_INFO_FORMAT,
+                      vm.width, vm.levels, 0, 0))
+
+def write_libxc_pv_p2m_frames(vm, pfns):
+    write_record(libxc.REC_TYPE_x86_pv_p2m_frames,
+                 pack(libxc.X86_PV_P2M_FRAMES_FORMAT,
+                      0, vm.p2m_size - 1),
+                 pack("Q" * len(pfns), *pfns))
+
+def write_libxc_pv_vcpu_basic(vcpu_id, data):
+    write_record(libxc.REC_TYPE_x86_pv_vcpu_basic,
+                 pack(libxc.X86_PV_VCPU_HDR_FORMAT, vcpu_id, 0), data)
+
+def write_libxc_pv_vcpu_extd(vcpu_id, data):
+    write_record(libxc.REC_TYPE_x86_pv_vcpu_extended,
+                 pack(libxc.X86_PV_VCPU_HDR_FORMAT, vcpu_id, 0), data)
+
+def write_libxc_pv_vcpu_xsave(vcpu_id, data):
+    write_record(libxc.REC_TYPE_x86_pv_vcpu_xsave,
+                 pack(libxc.X86_PV_VCPU_HDR_FORMAT, vcpu_id, 0), data)
+
+def write_page_data(pfns, pages):
+    if fout is None: # Save copying 1M buffers around for no reason
+        return
+
+    new_pfns = [(((x & 0xf0000000) << 32) | (x & 0x0fffffff)) for x in pfns]
+
+    # Optimise the needless buffer copying in write_record()
+    stream_write(pack(libxc.RH_FORMAT,
+                      libxc.REC_TYPE_page_data,
+                      8 + (len(new_pfns) * 8) + len(pages)))
+    stream_write(pack(libxc.PAGE_DATA_FORMAT, len(new_pfns), 0))
+    stream_write(pack("Q" * len(new_pfns), *new_pfns))
+    stream_write(pages)
+
+def write_libxc_tsc_info(mode, khz, nsec, incarn):
+    write_record(libxc.REC_TYPE_tsc_info,
+                 pack(libxc.TSC_INFO_FORMAT,
+                      mode, khz, nsec, incarn, 0))
+
+def write_libxc_hvm_params(params):
+    if pv:
+        raise StreamError("HVM-only param in PV stream")
+    elif len(params) % 2:
+        raise RuntimeError("Expected even length list of hvm parameters")
+
+    write_record(libxc.REC_TYPE_hvm_params,
+                 pack(libxc.HVM_PARAMS_FORMAT, len(params) / 2, 0),
+                 pack("Q" * len(params), *params))
+
+def write_libxl_end():
+    write_record(libxl.REC_TYPE_end, "")
+
+def write_libxl_libxc_context():
+    write_record(libxl.REC_TYPE_libxc_context, "")
+
+def write_libxl_xenstore_data(data):
+    write_record(libxl.REC_TYPE_xenstore_data, data)
+
+def write_libxl_emulator_context(blob):
+    write_record(libxl.REC_TYPE_emulator_context,
+                 pack(libxl.EMULATOR_CONTEXT_FORMAT,
+                      libxl.EMULATOR_ID_unknown, 0) + blob)
+
+def rdexact(nr_bytes):
+    """Read exactly nr_bytes from fin"""
+    _ = stream_read(nr_bytes)
+    if len(_) != nr_bytes:
+        raise IOError("Stream truncated")
+    return _
+
+def unpack_exact(fmt):
+    """Unpack a format from fin"""
+    sz = calcsize(fmt)
+    return unpack(fmt, rdexact(sz))
+
+def unpack_ulongs(nr_ulongs):
+    if twidth == 32:
+        return unpack_exact("I" * nr_ulongs)
+    else:
+        return unpack_exact("Q" * nr_ulongs)
+
+def read_pv_extended_info(vm):
+
+    marker, = unpack_ulongs(1)
+
+    if twidth == 32:
+        expected = 0xffffffff
+    else:
+        expected = 0xffffffffffffffff
+
+    if marker != expected:
+        raise StreamError("Unexpected extended info marker 0x%x" % (marker, ))
+
+    total_length, = unpack_exact("I")
+    so_far = 0
+
+    info("Extended Info: length 0x%x" % (total_length, ))
+
+    while so_far < total_length:
+
+        blkid, datasz = unpack_exact("=4sI")
+        so_far += 8
+
+        info("  Record type: %s, size 0x%x" % (blkid, datasz))
+
+        data = rdexact(datasz)
+        so_far += datasz
+
+        # Eww, but this is how it is done :(
+        if blkid == "vcpu":
+
+            vm.basic_len = datasz
+
+            if datasz == 0x1430:
+                vm.width = 8
+                vm.levels = 4
+                info("    64bit domain, 4 levels")
+            elif datasz == 0xaf0:
+                vm.width = 4
+                vm.levels = 3
+                info("    32bit domain, 3 levels")
+            else:
+                raise StreamError("Unable to determine guest width/level")
+
+            write_libxc_pv_info(vm)
+
+        elif blkid == "extv":
+            vm.extd = True
+
+        elif blkid == "xcnt":
+            vm.xsave_len, = unpack("I", data[:4])
+            info("xcnt sz 0x%x" % (vm.xsave_len, ))
+
+        else:
+            raise StreamError("Unrecognised extended block")
+
+
+    if so_far != total_length:
+        raise StreamError("Overshot Extended Info size by %d bytes"
+                          % (so_far - total_length,))
+
+def read_pv_p2m_frames(vm):
+    fpp = 4096 / vm.width
+    p2m_frame_len = (vm.p2m_size - 1) / fpp + 1
+
+    info("P2M frames: fpp %d, p2m_frame_len %d" % (fpp, p2m_frame_len))
+    write_libxc_pv_p2m_frames(vm, unpack_ulongs(p2m_frame_len))
+
+def read_pv_tail(vm):
+
+    nr_unmapped_pfns, = unpack_exact("I")
+
+    if nr_unmapped_pfns != 0:
+        # "Unmapped" pfns are bogus
+        _ = unpack_ulongs(nr_unmapped_pfns)
+        info("discarding %d bogus 'unmapped pfns'" % (nr_unmapped_pfns, ))
+        #raise StreamError("Found bogus 'unmapped pfns'")
+
+    for vcpu_id in vm.online_vcpu_map:
+
+        basic = rdexact(vm.basic_len)
+        info("Got VCPU basic (size 0x%x)" % (vm.basic_len, ))
+        write_libxc_pv_vcpu_basic(vcpu_id, basic)
+
+        if vm.extd:
+            extd = rdexact(128)
+            info("Got VCPU extd (size 0x%x)" % (128, ))
+            write_libxc_pv_vcpu_extd(vcpu_id, extd)
+
+        if vm.xsave_len:
+            mask, size = unpack_exact("QQ")
+            assert vm.xsave_len - 16 == size
+
+            xsave = rdexact(size)
+            info("Got VCPU xsave (mask 0x%x, size 0x%x)" % (mask, size))
+            write_libxc_pv_vcpu_xsave(vcpu_id, xsave)
+
+    shinfo = rdexact(4096)
+    info("Got shinfo")
+
+    write_record(libxc.REC_TYPE_shared_info, shinfo)
+    write_record(libxc.REC_TYPE_end, "")
+
+
+def read_chunks(vm):
+
+    hvm_params = []
+
+    while True:
+
+        marker, = unpack_exact("=i")
+        if marker <= 0:
+            info("Chunk: type 0x%x" % (marker, ))
+
+        if marker == 0:
+            info("  End")
+
+            if hvm_params:
+                write_libxc_hvm_params(hvm_params)
+
+            return
+
+        elif marker > 0:
+
+            if marker > 1024:
+                raise StreamError("Page batch (%d) exceeded MAX_BATCH"
+                                  % (marker, ))
+            pfns = unpack_ulongs(marker)
+
+            # xc_domain_save() leaves many XEN_DOMCTL_PFINFO_XTAB records for
+            # sequences of pfns it can't map.  Drop these.
+            pfns = [ x for x in pfns if x != 0xf0000000 ]
+
+            if len(set(pfns)) != len(pfns):
+                raise StreamError("Duplicate pfns in batch")
+
+                # print "0x[",
+                # for pfn in pfns:
+                #     print "%x" % (pfn, ),
+                # print "]"
+
+            nr_pages = len([x for x in pfns if (x & 0xf0000000) < 0xd0000000])
+
+            #print "  Page Batch, %d PFNs, %d pages" % (marker, nr_pages)
+            pages = rdexact(nr_pages * 4096)
+
+            write_page_data(pfns, pages)
+
+        elif marker == -1: # XC_SAVE_ID_ENABLE_VERIFY_MODE
+            # Verify mode... Seemingly nothing to do...
+            pass
+
+        elif marker == -2: # XC_SAVE_ID_VCPU_INFO
+            max_id, = unpack_exact("i")
+
+            if max_id > 4095:
+                raise StreamError("Vcpu max_id out of range: %d > 4095"
+                                  % (max_id, ) )
+
+            vm.max_vcpu_id = max_id
+            bitmap = unpack_exact("Q" * ((max_id/64) + 1))
+
+            for idx, word in enumerate(bitmap):
+                bit_idx = 0
+
+                while word > 0:
+                    if word & 1:
+                        vm.online_vcpu_map.append((idx * 64) + bit_idx)
+
+                    bit_idx += 1
+                    word >>= 1
+
+            info("  Vcpu info: max_id %d, online map %s"
+                 % (vm.max_vcpu_id, vm.online_vcpu_map))
+
+        elif marker == -3: # XC_SAVE_ID_HVM_IDENT_PT
+            _, ident_pt = unpack_exact("=IQ")
+            info("  EPT Identity Pagetable: 0x%x" % (ident_pt, ))
+            hvm_params.extend([12, # HVM_PARAM_IDENT_PT
+                               ident_pt])
+
+        elif marker == -4: # XC_SAVE_ID_HVM_VM86_TSS
+            _, vm86_tss = unpack_exact("=IQ")
+            info("  VM86 TSS: 0x%x" % (vm86_tss, ))
+            hvm_params.extend([15, # HVM_PARAM_VM86_TSS
+                               vm86_tss])
+
+        elif marker == -5: # XC_SAVE_ID_TMEM
+            raise RuntimeError("todo")
+
+        elif marker == -6: # XC_SAVE_ID_TMEM_EXTRA
+            raise RuntimeError("todo")
+
+        elif marker == -7: # XC_SAVE_ID_TSC_INFO
+            mode, nsec, khz, incarn = unpack_exact("=IQII")
+            info("  TSC_INFO: mode %s, %d ns, %d khz, %d incarn"
+                 % (mode, nsec, khz, incarn))
+            write_libxc_tsc_info(mode, khz, nsec, incarn)
+
+        elif marker == -8: # XC_SAVE_ID_HVM_CONSOLE_PFN
+            _, console_pfn = unpack_exact("=IQ")
+            info("  Console pfn: 0x%x" % (console_pfn, ))
+            hvm_params.extend([17, # HVM_PARAM_CONSOLE_PFN
+                               console_pfn])
+
+        elif marker == -9: # XC_SAVE_ID_LAST_CHECKPOINT
+            info("  Last Checkpoint")
+            # Nothing to do
+
+        elif marker == -10: # XC_SAVE_ID_HVM_ACPI_IOPORTS_LOCATION
+            _, loc = unpack_exact("=IQ")
+            info("  ACPI ioport location: 0x%x" % (loc, ))
+            hvm_params.extend([19, # HVM_PARAM_ACPI_IOPORTS_LOCATION
+                               loc])
+
+        elif marker == -11: # XC_SAVE_ID_HVM_VIRIDIAN
+            _, loc = unpack_exact("=IQ")
+            info("  Viridian location: 0x%x" % (loc, ))
+            hvm_params.extend([9, # HVM_PARAM_VIRIDIAN
+                               loc])
+
+        elif marker == -12: # XC_SAVE_ID_COMPRESSED_DATA
+            sz, = unpack_exact("I")
+            data = rdexact(sz)
+            info("  Compressed Data: sz 0x%x" % (sz, ))
+            raise RuntimeError("todo")
+
+        elif marker == -13: # XC_SAVE_ID_ENABLE_COMPRESSION
+            raise RuntimeError("todo")
+
+        elif marker == -14: # XC_SAVE_ID_HVM_GENERATION_ID_ADDR
+            _, genid_loc = unpack_exact("=IQ")
+            info("  Generation ID Address: 0x%x" % (genid_loc, ))
+            hvm_params.extend([34, # HVM_PARAM_VM_GENERATION_ID_ADDR
+                               genid_loc])
+
+        elif marker == -15: # XC_SAVE_ID_HVM_PAGING_RING_PFN
+            _, paging_ring_pfn = unpack_exact("=IQ")
+            info("  Paging ring pfn: 0x%x" % (paging_ring_pfn, ))
+            hvm_params.extend([27, # HVM_PARAM_PAGING_RING_PFN
+                               paging_ring_pfn])
+
+        elif marker == -16: # XC_SAVE_ID_HVM_ACCESS_RING_PFN
+            _, access_ring_pfn = unpack_exact("=IQ")
+            info("  Access ring pfn: 0x%x" % (access_ring_pfn, ))
+            hvm_params.extend([28, # HVM_PARAM_ACCESS_RING_PFN
+                               access_ring_pfn])
+
+        elif marker == -17: # XC_SAVE_ID_HVM_SHARING_RING_PFN
+            _, sharing_ring_pfn = unpack_exact("=IQ")
+            info("  Sharing ring pfn: 0x%x" % (sharing_ring_pfn, ))
+            hvm_params.extend([29, # HVM_PARAM_SHARING_RING_PFN
+                               sharing_ring_pfn])
+
+        elif marker == -18:
+            sz, = unpack_exact("I")
+
+            if sz:
+                data = rdexact(sz)
+                info("  Toolstack Data: sz 0x%x" % (sz, ))
+
+                if vm.libxl:
+                    vm.xenstore.append(data)
+                else:
+                    info("    Discarding")
+
+        elif marker == -19: # XC_SAVE_ID_HVM_IOREQ_SERVER_PFN
+            _, ioreq_server_pfn = unpack_exact("=IQ")
+            info("  IOREQ server pfn: 0x%x" % (ioreq_server_pfn, ))
+            hvm_params.extend([32 , # HVM_PARAM_IOREQ_SERVER_PFN
+                               ioreq_server_pfn])
+
+        elif marker == -20: # XC_SAVE_ID_HVM_NR_IOREQ_SERVER_PAGES
+            _, nr_pages = unpack_exact("=IQ")
+            info("  IOREQ server pages: %d" % (nr_pages, ))
+            hvm_params.extend([33 , # HVM_PARAM_NR_IOREQ_SERVER_PAGES
+                               nr_pages])
+
+        else:
+            raise StreamError("Unrecognised chunk %d" % (marker,))
+
+def read_hvm_tail(vm):
+
+    io, bufio, store = unpack_exact("QQQ")
+    info("Magic pfns: 0x%x 0x%x 0x%x" % (io, bufio, store))
+    write_libxc_hvm_params([5, io,     # HVM_PARAM_IOREQ_PFN
+                            6, bufio,  # HVM_PARAM_BUFIOREQ_PFN
+                            1, store]) # HVM_PARAM_STORE_PFN
+
+    blobsz, = unpack_exact("I")
+    info("Got HVM Context (0x%x bytes)" % (blobsz, ))
+    blob = rdexact(blobsz)
+
+    write_record(libxc.REC_TYPE_hvm_context, blob)
+    write_record(libxc.REC_TYPE_end, "")
+
+
+
+def read_qemu(vm):
+
+    rawsig = rdexact(21)
+    sig, = unpack("21s", rawsig)
+    info("Qemu signature: %s" % (sig, ))
+
+    if sig == "DeviceModelRecord0002":
+        rawsz = rdexact(4)
+        sz, = unpack("I", rawsz)
+        qdata = rdexact(sz)
+
+        if vm.libxl:
+            write_libxl_emulator_context(qdata)
+        else:
+            stream_write(rawsig)
+            stream_write(rawsz)
+            stream_write(qdata)
+
+    else:
+        raise RuntimeError("Unrecognised Qemu sig '%s'" % (sig, ))
+
+
+def skip_xl_header(fmt):
+    """Skip over an xl header in the stream"""
+
+    hdr = rdexact(32)
+    if hdr != "Xen saved domain, xl format\n \0 \r":
+        raise StreamError("No xl header")
+
+    end, mflags, oflags, optlen = unpack_exact("=IIII")
+
+    if fmt == "libxl":
+        mflags |= 2 # XL_MANDATORY_FLAG_STREAMv2
+
+    opts = pack("=IIII", end, mflags, oflags, optlen)
+
+    optdata = rdexact(optlen)
+
+    info("Processed xl header")
+
+    stream_write(hdr)
+    stream_write(opts)
+    stream_write(optdata)
+
+def read_legacy_stream(vm):
+
+    try:
+        vm.p2m_size, = unpack_ulongs(1)
+        info("P2M Size: 0x%x" % (vm.p2m_size,))
+
+        if vm.libxl:
+            write_libxl_hdr()
+            write_libxl_libxc_context()
+
+        write_libxc_ihdr()
+        write_libxc_dhdr()
+
+        if pv:
+            read_pv_extended_info(vm)
+            read_pv_p2m_frames(vm)
+
+        read_chunks(vm)
+
+        if pv:
+            read_pv_tail(vm)
+        else:
+            read_hvm_tail(vm)
+
+        if vm.libxl:
+            for x in vm.xenstore:
+                write_libxl_xenstore_data(x)
+
+        if not pv and (vm.libxl or qemu):
+            read_qemu(vm)
+
+        if vm.libxl:
+            write_libxl_end()
+
+    except (IOError, StreamError):
+        err("Stream Error:")
+        err(traceback.format_exc())
+        return 1
+
+    except RuntimeError:
+        err("Script Error:")
+        err(traceback.format_exc())
+        err("Please fix me")
+        return 2
+    return 0
+
+def open_file_or_fd(val, mode):
+    """
+    If 'val' looks like a decimal integer, open it as an fd.  If not, try to
+    open it as a regular file.
+    """
+
+    fd = -1
+    try:
+        # Does it look like an integer?
+        try:
+            fd = int(val, 10)
+        except ValueError:
+            pass
+
+        # Try to open it...
+        if fd != -1:
+            return os.fdopen(fd, mode, 0)
+        else:
+            return open(val, mode, 0)
+
+    except StandardError, e:
+        if fd != -1:
+            err("Unable to open fd %d: %s" % (fd, e))
+        else:
+            err("Unable to open file '%s': %s" % (val, e))
+
+    raise SystemExit(1)
+
+
+def main(argv):
+    from optparse import OptionParser
+    global fin, fout, twidth, pv, qemu, verbose
+
+    # Change stdout to be line-buffered.
+    sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 1)
+
+    parser = OptionParser(version = __version__,
+                          usage = ("%prog [options] -i INPUT -o OUTPUT"
+                                   " -w WIDTH -g GUEST"),
+                          description =
+                          "Convert a legacy stream to a v2 stream")
+
+    # Required options
+    parser.add_option("-i", "--in", dest = "fin", metavar = "<FD or FILE>",
+                      help = "Legacy input to convert")
+    parser.add_option("-o", "--out", dest = "fout", metavar = "<FD or FILE>",
+                      help = "v2 format output")
+    parser.add_option("-w", "--width", dest = "twidth",
+                      metavar = "<32/64>", choices = ["32", "64"],
+                      help = "Legacy toolstack bitness")
+    parser.add_option("-g", "--guest-type", dest = "gtype",
+                      metavar = "<pv/hvm>", choices = ["pv", "hvm"],
+                      help = "Type of guest in stream")
+
+    # Optional options
+    parser.add_option("-f", "--format", dest = "format",
+                      metavar = "<libxc|libxl>", default = "libxc",
+                      choices = ["libxc", "libxl"],
+                      help = "Desired format of the outgoing stream (defaults to libxc)")
+    parser.add_option("-v", "--verbose", action = "store_true", default = False,
+                      help = "Summarise stream contents")
+    parser.add_option("-x", "--xl", action = "store_true", default = False,
+                      help = ("Is an `xl` header present in the stream?"
+                              " (default no)"))
+    parser.add_option("--skip-qemu", action = "store_true", default = False,
+                      help = ("Skip processing of the qemu tail?"
+                              " (default no)"))
+    parser.add_option("--syslog", action = "store_true", default = False,
+                      help = "Log to syslog instead of stdout")
+
+    opts, _ = parser.parse_args()
+
+    if (opts.fin is None or opts.fout is None or
+        opts.twidth is None or opts.gtype is None):
+
+        parser.print_help(sys.stderr)
+        raise SystemExit(1)
+
+    if opts.syslog:
+        global log_to_syslog
+
+        syslog.openlog("convert-legacy-stream", syslog.LOG_PID)
+        log_to_syslog = True
+
+    fin     = open_file_or_fd(opts.fin,  "rb")
+    fout    = open_file_or_fd(opts.fout, "wb")
+    twidth  = int(opts.twidth)
+    pv      = opts.gtype == "pv"
+    verbose = opts.verbose
+    if opts.skip_qemu:
+        qemu = False
+
+    if opts.xl:
+        skip_xl_header(opts.format)
+
+    rc = read_legacy_stream(VM(opts.format))
+    fout.close()
+
+    return rc
+
+if __name__ == "__main__":
+    try:
+        sys.exit(main(sys.argv))
+    except SystemExit, e:
+        sys.exit(e.code)
+    except KeyboardInterrupt:
+        sys.exit(1)
-- 
1.7.10.4

* [PATCH 15/27] tools/libxl: Migration v2 stream format
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (13 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 14/27] tools/python: Conversion utility for legacy migration streams Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 14:04   ` Ian Campbell
  2015-06-15 13:44 ` [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream Andrew Cooper
                   ` (14 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson,
	Ross Lagerwall, Yang Hongyang

From: Ross Lagerwall <ross.lagerwall@citrix.com>

C structures describing the Libxl migration v2 stream format
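
As a rough illustration of the layout these structures describe, here is a
minimal Python sketch that packs and parses the stream header and pads a
record to the 8 octet boundary.  The constants are copied from the header
in this patch; the helper names (pack_hdr, parse_hdr, padded_length,
pack_record) are hypothetical and exist only for this example.  Two
assumptions are made: record headers are little-endian (the reader does not
byte swap them and rejects RESTORE_OPT_BIG_ENDIAN), and padding octets are
zero.

```python
import struct

# Values copied from libxl_sr_stream_format.h in this patch.
RESTORE_STREAM_IDENT = 0x4c6962786c466d74    # ASCII "LibxlFmt"
RESTORE_STREAM_VERSION = 0x00000002
RESTORE_OPT_BIG_ENDIAN = 1 << 0
RESTORE_OPT_LEGACY = 1 << 1
REC_ALIGN_ORDER = 3

# The stream header is big-endian on the wire; the reader applies
# be64toh()/be32toh() to it.
HDR = struct.Struct(">QII")
# Assumption: record headers are little-endian (no byte swapping is
# done on them by the reader).
REC_HDR = struct.Struct("<II")


def pack_hdr(options=0):
    """Serialise a libxl_sr_hdr."""
    return HDR.pack(RESTORE_STREAM_IDENT, RESTORE_STREAM_VERSION, options)


def parse_hdr(buf):
    """Validate a libxl_sr_hdr and return (version, options)."""
    ident, version, options = HDR.unpack(buf)
    if ident != RESTORE_STREAM_IDENT:
        raise ValueError("Invalid ident 0x%016x" % ident)
    if version != RESTORE_STREAM_VERSION:
        raise ValueError("Unexpected version %u" % version)
    if options & RESTORE_OPT_BIG_ENDIAN:
        raise ValueError("Unable to handle big endian streams")
    return version, options


def padded_length(length):
    """Round a record body length up to the record alignment."""
    align = 1 << REC_ALIGN_ORDER
    return (length + align - 1) & ~(align - 1)


def pack_record(rec_type, body=b""):
    """Serialise a record: header, body, then padding.

    The length field holds the unpadded body length.  Zero-filled
    padding is an assumption of this sketch.
    """
    pad = padded_length(len(body)) - len(body)
    return REC_HDR.pack(rec_type, len(body)) + body + b"\0" * pad
```

The length field carries the unpadded size, so a consumer has to round it
up itself before looking for the next record header.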

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_sr_stream_format.h |   57 ++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)
 create mode 100644 tools/libxl/libxl_sr_stream_format.h

diff --git a/tools/libxl/libxl_sr_stream_format.h b/tools/libxl/libxl_sr_stream_format.h
new file mode 100644
index 0000000..487f9e2
--- /dev/null
+++ b/tools/libxl/libxl_sr_stream_format.h
@@ -0,0 +1,57 @@
+#ifndef LIBXL_SR_STREAM_FORMAT_H
+#define LIBXL_SR_STREAM_FORMAT_H
+
+/*
+ * C structures for the Migration v2 stream format.
+ * See docs/specs/libxl-migration-stream.pandoc
+ */
+
+#include <stdint.h>
+
+typedef struct libxl_sr_hdr
+{
+    uint64_t ident;
+    uint32_t version;
+    uint32_t options;
+} libxl_sr_hdr;
+
+#define RESTORE_STREAM_IDENT         0x4c6962786c466d74UL
+#define RESTORE_STREAM_VERSION       0x00000002U
+
+#define RESTORE_OPT_BIG_ENDIAN       (1 << 0)
+#define RESTORE_OPT_LEGACY           (1 << 1)
+
+
+typedef struct libxl_sr_rec_hdr
+{
+    uint32_t type;
+    uint32_t length;
+} libxl_sr_rec_hdr;
+
+/* All records must be aligned up to an 8 octet boundary */
+#define REC_ALIGN_ORDER              3U
+
+#define REC_TYPE_END                 0x00000000U
+#define REC_TYPE_LIBXC_CONTEXT       0x00000001U
+#define REC_TYPE_XENSTORE_DATA       0x00000002U
+#define REC_TYPE_EMULATOR_CONTEXT    0x00000003U
+
+typedef struct libxl_sr_emulator_hdr
+{
+    uint32_t id;
+    uint32_t index;
+} libxl_sr_emulator_hdr;
+
+#define EMULATOR_UNKNOWN             0x00000000U
+#define EMULATOR_QEMU_TRADITIONAL    0x00000001U
+#define EMULATOR_QEMU_UPSTREAM       0x00000002U
+
+#endif /* LIBXL_SR_STREAM_FORMAT_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.7.10.4

* [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (14 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 15/27] tools/libxl: Migration v2 stream format Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 14:31   ` Ian Campbell
                     ` (3 more replies)
  2015-06-15 13:44 ` [PATCH 17/27] tools/libxl: Support converting a legacy stream to a " Andrew Cooper
                   ` (13 subsequent siblings)
  29 siblings, 4 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson,
	Ross Lagerwall, Yang Hongyang

From: Ross Lagerwall <ross.lagerwall@citrix.com>

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/Makefile            |    1 +
 tools/libxl/libxl_internal.h    |   39 ++++
 tools/libxl/libxl_stream_read.c |  485 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 525 insertions(+)
 create mode 100644 tools/libxl/libxl_stream_read.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index cc9c152..c71c5fe 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -94,6 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
 			libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
 			libxl_internal.o libxl_utils.o libxl_uuid.o \
 			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
+			libxl_stream_read.o \
 			libxl_save_callout.o _libxl_save_msgs_callout.o \
 			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += libxl_genid.o
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 101994f..4f33cb8 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -19,6 +19,8 @@
 
 #include "libxl_osdeps.h" /* must come before any other headers */
 
+#include "libxl_sr_stream_format.h"
+
 #include <assert.h>
 #include <dirent.h>
 #include <errno.h>
@@ -3121,6 +3123,42 @@ typedef void libxl__domain_create_cb(libxl__egc *egc,
                                      libxl__domain_create_state*,
                                      int rc, uint32_t domid);
 
+/* State for manipulating a libxl migration v2 stream */
+typedef struct libxl__stream_read_state libxl__stream_read_state;
+
+struct libxl__stream_read_state {
+    /* filled by the user */
+    libxl__ao *ao;
+    int fd;
+    void (*completion_callback)(libxl__egc *egc,
+                                libxl__domain_create_state *dcs,
+                                int rc);
+    /* Private */
+    int rc;
+    bool running;
+    libxl__datacopier_state dc;
+    size_t expected_len;
+    libxl_sr_hdr hdr;
+    libxl_sr_rec_hdr rec_hdr;
+    void *rec_body;
+};
+
+_hidden void libxl__stream_read_start(libxl__egc *egc,
+                                      libxl__stream_read_state *stream);
+
+_hidden void libxl__stream_read_continue(libxl__egc *egc,
+                                         libxl__stream_read_state *stream);
+
+_hidden void libxl__stream_read_abort(libxl__egc *egc,
+                                      libxl__stream_read_state *stream, int rc);
+
+static inline bool libxl__stream_read_inuse(
+    const libxl__stream_read_state *stream)
+{
+    return stream->running;
+}
+
+
 struct libxl__domain_create_state {
     /* filled in by user */
     libxl__ao *ao;
@@ -3137,6 +3175,7 @@ struct libxl__domain_create_state {
     libxl__stub_dm_spawn_state dmss;
         /* If we're not doing stubdom, we use only dmss.dm,
          * for the non-stubdom device model. */
+    libxl__stream_read_state srs;
     libxl__save_helper_state shs;
     /* necessary if the domain creation failed and we have to destroy it */
     libxl__domain_destroy_state dds;
diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
new file mode 100644
index 0000000..9cdaadf
--- /dev/null
+++ b/tools/libxl/libxl_stream_read.c
@@ -0,0 +1,485 @@
+/*
+ * Copyright (C) 2015      Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+/*
+ * Infrastructure for reading and acting on the contents of a libxl migration
+ * stream. There are a lot of moving parts here.
+ *
+ * Entry points from outside:
+ *  - libxl__stream_read_start()
+ *     - Set up reading a stream from the start.
+ *
+ *  - libxl__stream_read_continue()
+ *     - Set up reading the next record from a started stream.
+ *
+ * The principal loop functionality involves reading the stream header, then
+ * reading a record at a time and acting upon it.  It follows the callbacks:
+ *
+ *  - stream_header_done()
+ *  - record_header_done()
+ *  - record_body_done()
+ *  - process_record()
+ *
+ * process_record() will choose the correct next action based upon the
+ * record.  Upon completion of the action, the next record header will be read
+ * from the stream.
+ */
+
+static void stream_success(libxl__egc *egc,
+                           libxl__stream_read_state *stream);
+static void stream_failed(libxl__egc *egc,
+                          libxl__stream_read_state *stream, int rc);
+static void stream_done(libxl__egc *egc,
+                        libxl__stream_read_state *stream);
+
+/* Event callbacks for main reading loop. */
+static void stream_header_done(libxl__egc *egc,
+                               libxl__datacopier_state *dc,
+                               int onwrite, int errnoval);
+static void record_header_done(libxl__egc *egc,
+                               libxl__datacopier_state *dc,
+                               int onwrite, int errnoval);
+static void record_body_done(libxl__egc *egc,
+                             libxl__datacopier_state *dc,
+                             int onwrite, int errnoval);
+static void process_record(libxl__egc *egc,
+                           libxl__stream_read_state *stream);
+
+/* Mini-event loop for splicing an emulator record out of the stream. */
+static void read_emulator_body(libxl__egc *egc,
+                               libxl__stream_read_state *stream);
+static void emulator_body_done(libxl__egc *egc,
+                               libxl__datacopier_state *dc,
+                               int onwrite, int errnoval);
+static void emulator_padding_done(libxl__egc *egc,
+                                  libxl__datacopier_state *dc,
+                                  int onwrite, int errnoval);
+
+void libxl__stream_read_start(libxl__egc *egc,
+                              libxl__stream_read_state *stream)
+{
+    libxl__datacopier_state *dc = &stream->dc;
+    int ret = 0;
+
+    /* State initialisation. */
+    assert(!stream->running);
+
+    memset(dc, 0, sizeof(*dc));
+    dc->ao = stream->ao;
+    dc->readfd = stream->fd;
+    dc->writefd = -1;
+
+    /* Start reading the stream header. */
+    dc->readwhat = "stream header";
+    dc->readbuf = &stream->hdr;
+    stream->expected_len = dc->bytes_to_read = sizeof(stream->hdr);
+    dc->used = 0;
+    dc->callback = stream_header_done;
+
+    ret = libxl__datacopier_start(dc);
+    if (ret)
+        goto err;
+
+    stream->running = true;
+    assert(!ret);
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+void libxl__stream_read_continue(libxl__egc *egc,
+                                 libxl__stream_read_state *stream)
+{
+    libxl__datacopier_state *dc = &stream->dc;
+    int ret = 0;
+
+    assert(stream->running);
+
+    /* Read a record header. */
+    dc->readwhat = "record header";
+    dc->readbuf = &stream->rec_hdr;
+    stream->expected_len = dc->bytes_to_read = sizeof(stream->rec_hdr);
+    dc->used = 0;
+    dc->callback = record_header_done;
+
+    ret = libxl__datacopier_start(dc);
+    if (ret)
+        goto err;
+
+    assert(!ret);
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+void libxl__stream_read_abort(libxl__egc *egc,
+                              libxl__stream_read_state *stream, int rc)
+{
+    stream_failed(egc, stream, rc);
+}
+
+static void stream_success(libxl__egc *egc, libxl__stream_read_state *stream)
+{
+    stream->rc = 0;
+    stream->running = false;
+
+    stream_done(egc, stream);
+}
+
+static void stream_failed(libxl__egc *egc,
+                          libxl__stream_read_state *stream, int rc)
+{
+    assert(rc);
+    stream->rc = rc;
+
+    if (stream->running) {
+        stream->running = false;
+        stream_done(egc, stream);
+    }
+}
+
+static void stream_done(libxl__egc *egc,
+                        libxl__stream_read_state *stream)
+{
+    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
+
+    assert(!stream->running);
+
+    stream->completion_callback(egc, dcs, stream->rc);
+}
+
+static void stream_header_done(libxl__egc *egc,
+                               libxl__datacopier_state *dc,
+                               int onwrite, int errnoval)
+{
+    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
+    libxl_sr_hdr *hdr = &stream->hdr;
+    STATE_AO_GC(dc->ao);
+    int ret = 0;
+
+    if (onwrite || dc->used != stream->expected_len) {
+        ret = ERROR_FAIL;
+        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
+            onwrite, errnoval, stream->expected_len, dc->used);
+        goto err;
+    }
+
+    hdr->ident   = be64toh(hdr->ident);
+    hdr->version = be32toh(hdr->version);
+    hdr->options = be32toh(hdr->options);
+
+    if (hdr->ident != RESTORE_STREAM_IDENT) {
+        ret = ERROR_FAIL;
+        LOG(ERROR,
+            "Invalid ident: expected 0x%016"PRIx64", got 0x%016"PRIx64,
+            RESTORE_STREAM_IDENT, hdr->ident);
+        goto err;
+    }
+    if (hdr->version != RESTORE_STREAM_VERSION) {
+        ret = ERROR_FAIL;
+        LOG(ERROR, "Unexpected Version: expected %u, got %u",
+            RESTORE_STREAM_VERSION, hdr->version);
+        goto err;
+    }
+    if (hdr->options & RESTORE_OPT_BIG_ENDIAN) {
+        ret = ERROR_FAIL;
+        LOG(ERROR, "Unable to handle big endian streams");
+        goto err;
+    }
+
+    LOG(INFO, "Stream v%u%s", hdr->version,
+        hdr->options & RESTORE_OPT_LEGACY ? " (from legacy)" : "");
+
+    libxl__stream_read_continue(egc, stream);
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+static void record_header_done(libxl__egc *egc,
+                               libxl__datacopier_state *dc,
+                               int onwrite, int errnoval)
+{
+    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
+    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
+    STATE_AO_GC(dc->ao);
+    int ret = 0;
+
+    if (onwrite || dc->used != stream->expected_len) {
+        ret = ERROR_FAIL;
+        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
+            onwrite, errnoval, stream->expected_len, dc->used);
+        goto err;
+    }
+
+    assert(stream->rec_body == NULL);
+
+    /* No body? Process straight away. */
+    if (rec_hdr->length == 0) {
+        process_record(egc, stream);
+        return;
+    }
+
+    /* Queue up reading the body. */
+    size_t bytes_to_read;
+
+    switch (rec_hdr->type) {
+        /*
+         * Emulator records want to retain the blob in the pipe, for a further
+         * datacopier call to move elsewhere.  Just read the emulator header.
+         */
+    case REC_TYPE_EMULATOR_CONTEXT:
+        bytes_to_read = sizeof(struct libxl_sr_emulator_hdr);
+        break;
+
+    default:
+        bytes_to_read = rec_hdr->length;
+        break;
+    }
+
+    bytes_to_read = ROUNDUP(bytes_to_read, REC_ALIGN_ORDER);
+
+    dc->readwhat = "record body";
+    stream->rec_body = dc->readbuf = libxl__malloc(NOGC, bytes_to_read);
+    stream->expected_len = dc->bytes_to_read = bytes_to_read;
+    dc->used = 0;
+    dc->callback = record_body_done;
+
+    ret = libxl__datacopier_start(dc);
+    if (ret)
+        goto err;
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+static void record_body_done(libxl__egc *egc,
+                             libxl__datacopier_state *dc,
+                             int onwrite, int errnoval)
+{
+    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
+    STATE_AO_GC(dc->ao);
+    int ret = 0;
+
+    if (onwrite || dc->used != stream->expected_len) {
+        ret = ERROR_FAIL;
+        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
+            onwrite, errnoval, stream->expected_len, dc->used);
+
+        free(stream->rec_body);
+        stream->rec_body = dc->readbuf = NULL;
+
+        goto err;
+    }
+
+    process_record(egc, stream);
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+static void process_record(libxl__egc *egc,
+                           libxl__stream_read_state *stream)
+{
+    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
+    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
+    STATE_AO_GC(stream->ao);
+    int ret = 0;
+
+    LOG(DEBUG, "Record: 0x%08x, length %u", rec_hdr->type, rec_hdr->length);
+
+    switch (rec_hdr->type) {
+
+    case REC_TYPE_END:
+        /* Handled later, after cleanup. */
+        break;
+
+    case REC_TYPE_XENSTORE_DATA:
+        ret = libxl__toolstack_restore(dcs->guest_domid, stream->rec_body,
+                                       rec_hdr->length, &dcs->shs);
+        if (ret)
+            goto err;
+
+        /*
+         * libxl__toolstack_restore() is a synchronous function.  Manually
+         * start looking for the next record.
+         */
+        libxl__stream_read_continue(egc, &dcs->srs);
+        break;
+
+    case REC_TYPE_EMULATOR_CONTEXT:
+        read_emulator_body(egc, stream);
+        break;
+
+    default:
+        LOG(ERROR, "Unrecognised record 0x%08x", rec_hdr->type);
+        ret = ERROR_FAIL;
+        goto err;
+    }
+
+    assert(!ret);
+    if (rec_hdr->length) {
+        free(stream->rec_body);
+        stream->rec_body = NULL;
+    }
+
+    if (rec_hdr->type == REC_TYPE_END)
+        stream_success(egc, stream);
+    return;
+
+ err:
+    assert(ret);
+    if (rec_hdr->length) {
+        free(stream->rec_body);
+        stream->rec_body = NULL;
+    }
+    stream_failed(egc, stream, ret);
+}
+
+static void read_emulator_body(libxl__egc *egc,
+                               libxl__stream_read_state *stream)
+{
+    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
+    libxl__datacopier_state *dc = &stream->dc;
+    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
+    libxl_sr_emulator_hdr *emu_hdr = stream->rec_body;
+    STATE_AO_GC(stream->ao);
+    char path[256];
+    int ret = 0;
+
+    sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE".%u", dcs->guest_domid);
+
+    dc->readwhat = "save/migration stream";
+    dc->copywhat = "emulator context";
+    dc->writewhat = "qemu save file";
+    dc->readbuf = NULL;
+    dc->writefd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
+    if (dc->writefd == -1) {
+        ret = ERROR_FAIL;
+        LOGE(ERROR, "Unable to open '%s'", path);
+        goto err;
+    }
+    dc->maxsz = dc->bytes_to_read = rec_hdr->length - sizeof(*emu_hdr);
+    stream->expected_len = dc->used = 0;
+    dc->callback = emulator_body_done;
+
+    ret = libxl__datacopier_start(dc);
+    if (ret)
+        goto err;
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+static void emulator_body_done(libxl__egc *egc,
+                               libxl__datacopier_state *dc,
+                               int onwrite, int errnoval)
+{
+    /* Safe to be static, as it is a write-only discard buffer. */
+    static char padding[1U << REC_ALIGN_ORDER];
+
+    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
+    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
+    STATE_AO_GC(dc->ao);
+    unsigned int nr_padding_bytes = (1U << REC_ALIGN_ORDER);
+    int ret = 0;
+
+    if (onwrite || dc->used != stream->expected_len) {
+        ret = ERROR_FAIL;
+        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
+            onwrite, errnoval, stream->expected_len, dc->used);
+        goto err;
+    }
+
+    /* Undo modifications for splicing the emulator context. */
+    memset(dc, 0, sizeof(*dc));
+    dc->ao = stream->ao;
+    dc->readfd = stream->fd;
+    dc->writefd = -1;
+
+    /* Do we need to eat some padding out of the stream? */
+    if (rec_hdr->length & (nr_padding_bytes - 1)) {
+        unsigned int bytes_to_discard =
+            nr_padding_bytes - (rec_hdr->length & (nr_padding_bytes - 1));
+
+        dc->readwhat = "padding bytes";
+        dc->readbuf = padding;
+        stream->expected_len = dc->bytes_to_read = bytes_to_discard;
+        dc->used = 0;
+        dc->callback = emulator_padding_done;
+
+        ret = libxl__datacopier_start(dc);
+        if (ret)
+            goto err;
+    }
+    else
+    {
+        stream->expected_len = dc->bytes_to_read = 0;
+        dc->used = 0;
+
+        emulator_padding_done(egc, dc, 0, 0);
+    }
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+static void emulator_padding_done(libxl__egc *egc,
+                                  libxl__datacopier_state *dc,
+                                  int onwrite, int errnoval)
+{
+    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
+    STATE_AO_GC(dc->ao);
+    int ret = 0;
+
+    if (onwrite || dc->used != stream->expected_len) {
+        ret = ERROR_FAIL;
+        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
+            onwrite, errnoval, stream->expected_len, dc->used);
+        goto err;
+    }
+
+    libxl__stream_read_continue(egc, stream);
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.7.10.4

* [PATCH 17/27] tools/libxl: Support converting a legacy stream to a v2 stream
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (15 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 14:38   ` Ian Campbell
  2015-06-15 13:44 ` [PATCH 18/27] tools/libxl: Convert a legacy stream if needed Andrew Cooper
                   ` (12 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

When a legacy stream is found, it needs to be converted to a v2 stream for the
reading logic.  This is done by exec()ing the python conversion utility.

One complication is that the caller of this interface needs to assume
ownership of the output fd, to prevent it being closed while still in use in a
datacopier.
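
The fd handling can be sketched outside libxl as follows.  This is only an
illustration of the pipe-and-child pattern, not the libxl code: the
function name spawn_converter and its argument convention (the two fd
numbers appended to argv) are hypothetical, and the sketch assumes a POSIX
host.  The parent closes its copy of the write end straight away and hands
the read end to the caller, who then owns it.

```python
import os
import subprocess


def spawn_converter(argv, legacy_fd):
    """Run argv + [str(legacy_fd), str(write_fd)] as a child process.

    Returns (proc, read_fd).  The caller assumes ownership of read_fd
    and is responsible for closing it once it has finished reading.
    """
    read_fd, write_fd = os.pipe()
    try:
        # pass_fds keeps both fds open, with the same numbers, in the child.
        proc = subprocess.Popen(argv + [str(legacy_fd), str(write_fd)],
                                pass_fds=(legacy_fd, write_fd))
    except Exception:
        os.close(read_fd)
        raise
    finally:
        # The parent must drop the write end, otherwise a reader of
        # read_fd would never see EOF when the child exits.
        os.close(write_fd)
    return proc, read_fd
```

If the read end were closed by the helper's cleanup path instead, anything
still reading from it would fail part way through, which is the situation
the ownership transfer is meant to avoid.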

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/Makefile                |    1 +
 tools/libxl/libxl_convert_callout.c |  146 +++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.h        |   32 ++++++++
 3 files changed, 179 insertions(+)
 create mode 100644 tools/libxl/libxl_convert_callout.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index c71c5fe..ca0ae3e 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -96,6 +96,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
 			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
 			libxl_stream_read.o \
 			libxl_save_callout.o _libxl_save_msgs_callout.o \
+			libxl_convert_callout.o \
 			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += libxl_genid.o
 LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
diff --git a/tools/libxl/libxl_convert_callout.c b/tools/libxl/libxl_convert_callout.c
new file mode 100644
index 0000000..9050bb9
--- /dev/null
+++ b/tools/libxl/libxl_convert_callout.c
@@ -0,0 +1,146 @@
+/*
+ * Copyright (C) 2014      Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h"
+
+#include "libxl_internal.h"
+
+/*
+ * Infrastructure for converting a legacy migration stream into a libxl v2
+ * stream.
+ *
+ * This is done by fork()ing the python conversion script, which takes in a
+ * legacy stream, and puts out a suitably-formatted v2 stream.
+ */
+
+static void helper_failed(libxl__egc *egc,
+                          libxl__conversion_helper_state *chs, int rc);
+static void helper_exited(libxl__egc *egc, libxl__ev_child *ch,
+                          pid_t pid, int status);
+static void helper_done(libxl__egc *egc,
+                        libxl__conversion_helper_state *chs);
+
+void libxl__convert_legacy_stream(libxl__egc *egc,
+                                  libxl__conversion_helper_state *chs)
+{
+    STATE_AO_GC(chs->ao);
+    int ret = 0;
+
+    chs->rc = 0;
+    libxl__ev_child_init(&chs->child);
+
+    libxl__carefd *child_in = NULL, *child_out = NULL;
+
+    if (chs->legacy_width == 0) {
+#ifdef __i386__
+        chs->legacy_width = 32;
+#else
+        chs->legacy_width = 64;
+#endif
+    }
+
+    libxl__carefd_begin();
+    int fds[2];
+    if (libxl_pipe(CTX, fds)) {
+        ret = ERROR_FAIL;
+        libxl__carefd_unlock();
+        goto err;
+    }
+    child_out = libxl__carefd_record(CTX, fds[0]);
+    child_in  = libxl__carefd_record(CTX, fds[1]);
+    libxl__carefd_unlock();
+
+    pid_t pid = libxl__ev_child_fork(gc, &chs->child, helper_exited);
+    if (!pid) {
+        char * const args[] =
+        {
+            getenv("LIBXL_CONVERT_HELPER") ?:
+                LIBEXEC_BIN "/convert-legacy-stream.py",
+            "--in",     GCSPRINTF("%d", chs->legacy_fd),
+            "--out",    GCSPRINTF("%d", fds[1]),
+            "--width",  GCSPRINTF("%u", chs->legacy_width),
+            "--guest",  chs->hvm ? "hvm" : "pv",
+            "--format", "libxl",
+            /* "--verbose", */
+            NULL,
+        };
+
+        libxl_fd_set_cloexec(CTX, chs->legacy_fd, 0);
+        libxl_fd_set_cloexec(CTX, libxl__carefd_fd(child_in), 0);
+
+        libxl__exec(gc,
+                    -1, -1, -1,
+                    args[0], args, NULL);
+    }
+
+    libxl__carefd_close(child_in);
+    chs->v2_carefd = child_out;
+
+    assert(!ret);
+    return;
+
+ err:
+    assert(ret);
+    helper_failed(egc, chs, ret);
+}
+
+void libxl__convert_legacy_stream_abort(libxl__egc *egc,
+                                        libxl__conversion_helper_state *chs,
+                                        int rc)
+{
+    helper_failed(egc, chs, rc);
+}
+
+static void helper_failed(libxl__egc *egc,
+                          libxl__conversion_helper_state *chs, int rc)
+{
+    STATE_AO_GC(chs->ao);
+
+    if (!chs->rc)
+        chs->rc = rc;
+
+    if (!libxl__ev_child_inuse(&chs->child)) {
+        helper_done(egc, chs);
+        return;
+    }
+
+    libxl__kill(gc, chs->child.pid, SIGKILL, "conversion helper");
+}
+
+static void helper_exited(libxl__egc *egc, libxl__ev_child *ch,
+                          pid_t pid, int status)
+{
+    libxl__conversion_helper_state *chs = CONTAINER_OF(ch, *chs, child);
+    STATE_AO_GC(chs->ao);
+
+    if (status) {
+        libxl_report_child_exitstatus(CTX, XTL_ERROR, "conversion helper",
+                                      pid, status);
+        chs->rc = ERROR_FAIL;
+    }
+    else
+        chs->rc = 0;
+
+    helper_done(egc, chs);
+}
+
+static void helper_done(libxl__egc *egc,
+                        libxl__conversion_helper_state *chs)
+{
+    STATE_AO_GC(chs->ao);
+
+    assert(!libxl__ev_child_inuse(&chs->child));
+
+    chs->completion_callback(egc, chs, chs->rc);
+}
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 4f33cb8..3e43cc6 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2828,6 +2828,37 @@ _hidden void libxl__remus_devices_commit(libxl__egc *egc,
                                          libxl__remus_devices_state *rds);
 _hidden int libxl__netbuffer_enabled(libxl__gc *gc);
 
+/*----- Legacy conversion helper -----*/
+typedef struct libxl__conversion_helper_state libxl__conversion_helper_state;
+
+struct libxl__conversion_helper_state {
+    /* public */
+    libxl__ao *ao;
+    int legacy_fd;
+    unsigned int legacy_width; /* Bitness (32/64) of legacy libxc. */
+    bool hvm;                  /* pv or hvm domain? */
+    libxl__carefd *v2_carefd;  /* Filled by successful call to
+                                * libxl__convert_legacy_stream().  Caller
+                                * assumes ownership of the fd. */
+    void (*completion_callback)(
+        libxl__egc *egc, libxl__conversion_helper_state *chs, int rc);
+    /* private */
+    int rc;
+    libxl__ev_child child;
+};
+
+_hidden void libxl__convert_legacy_stream(libxl__egc *egc,
+                                          libxl__conversion_helper_state *chs);
+_hidden void libxl__convert_legacy_stream_abort(
+    libxl__egc *egc, libxl__conversion_helper_state *chs, int rc);
+
+static inline bool libxl__convert_legacy_stream_inuse(
+    libxl__conversion_helper_state *chs)
+{
+    return libxl__ev_child_inuse(&chs->child);
+}
+
+
 /*----- Domain suspend (save) state structure -----*/
 
 typedef struct libxl__domain_suspend_state libxl__domain_suspend_state;
@@ -3177,6 +3208,7 @@ struct libxl__domain_create_state {
          * for the non-stubdom device model. */
     libxl__stream_read_state srs;
     libxl__save_helper_state shs;
+    libxl__conversion_helper_state chs;
     /* necessary if the domain creation failed and we have to destroy it */
     libxl__domain_destroy_state dds;
     libxl__multidev multidev;
-- 
1.7.10.4

* [PATCH 18/27] tools/libxl: Convert a legacy stream if needed
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (16 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 17/27] tools/libxl: Support converting a legacy stream to a " Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-15 13:44 ` [PATCH 19/27] tools/libxc+libxl+xl: Restore v2 streams Andrew Cooper
                   ` (11 subsequent siblings)
  29 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

For backwards compatibility, a legacy stream needs converting before it can be
read by the v2 stream logic.

As a result, the v2 stream logic has to juggle two parallel tasks: reading the
stream and running the conversion helper.  check_stream_finished() is
introduced to join the tasks in both the success and error cases.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_internal.h    |    3 ++
 tools/libxl/libxl_stream_read.c |   85 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 87 insertions(+), 1 deletion(-)
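
Note for readers: the join described in the commit message can be sketched
outside of libxl roughly as follows.  This is only an illustration, not the
code in this patch: struct task, struct joiner, task_finished() and
task_abort() are hypothetical names, and abort is modelled as synchronous,
whereas aborting the real conversion helper kills a child and completes later.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a task such as the stream reader or the
 * conversion helper: it is "in use" until it reports completion. */
struct task {
    const char *name;
    bool inuse;
};

struct joiner {
    struct task stream, conversion;
    int joined_rc;    /* first non-zero rc reported by any task */
    int completions;  /* number of times the final callback fired */
    int final_rc;     /* rc handed to the final callback */
};

static void task_finished(struct joiner *j, struct task *t, int rc);

/* Simplification: aborting a task makes it report back immediately. */
static void task_abort(struct joiner *j, struct task *t, int rc)
{
    assert(rc);
    task_finished(j, t, rc);
}

/* Every task funnels its result through here, on success and on error. */
static void check_finished(struct joiner *j, int rc)
{
    if (rc && !j->joined_rc) {
        bool skip = false;

        /* First reported failure: remember it and tear the rest down. */
        j->joined_rc = rc;

        if (j->stream.inuse) {
            skip = true;
            task_abort(j, &j->stream, rc);
        }
        if (j->conversion.inuse) {
            skip = true;
            task_abort(j, &j->conversion, rc);
        }

        /* The aborted tasks join via their own callbacks. */
        if (skip)
            return;
    }

    if (j->stream.inuse || j->conversion.inuse)
        return; /* still waiting for another task */

    /* Last task has joined: report exactly once. */
    j->completions++;
    j->final_rc = j->joined_rc;
}

static void task_finished(struct joiner *j, struct task *t, int rc)
{
    t->inuse = false;
    check_finished(j, rc);
}
```

Whichever task finishes or fails first, the final callback fires once, and it
carries the first failure that was reported.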

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 3e43cc6..5482950 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3161,11 +3161,14 @@ struct libxl__stream_read_state {
     /* filled by the user */
     libxl__ao *ao;
     int fd;
+    bool legacy;
     void (*completion_callback)(libxl__egc *egc,
                                 libxl__domain_create_state *dcs,
                                 int rc);
     /* Private */
+    libxl__carefd *v2_carefd;
     int rc;
+    int joined_rc;
     bool running;
     libxl__datacopier_state dc;
     size_t expected_len;
diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index 9cdaadf..87b9737 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -38,6 +38,10 @@
  * process_record() will choose the correct next action based upon the
  * record.  Upon completion of the action, the next record header will be read
  * from the stream.
+ *
+ * Depending on the contents of the stream, there are likely to be several
+ * parallel tasks being managed.  check_stream_finished() is used to join all
+ * tasks in both success and error cases.
  */
 
 static void stream_success(libxl__egc *egc,
@@ -47,6 +51,12 @@ static void stream_failed(libxl__egc *egc,
 static void stream_done(libxl__egc *egc,
                         libxl__stream_read_state *stream);
 
+static void conversion_done(libxl__egc *egc,
+                            libxl__conversion_helper_state *chs, int rc);
+static void check_stream_finished(libxl__egc *egc,
+                                  libxl__domain_create_state *dcs,
+                                  int rc, const char *what);
+
 /* Event callbacks for main reading loop. */
 static void stream_header_done(libxl__egc *egc,
                                libxl__datacopier_state *dc,
@@ -73,12 +83,33 @@ static void emulator_padding_done(libxl__egc *egc,
 void libxl__stream_read_start(libxl__egc *egc,
                               libxl__stream_read_state *stream)
 {
+    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
     libxl__datacopier_state *dc = &stream->dc;
+    STATE_AO_GC(stream->ao);
     int ret = 0;
 
     /* State initialisation. */
     assert(!stream->running);
 
+    if (stream->legacy) {
+        /* Convert a legacy stream, if needed. */
+        dcs->chs.ao = stream->ao;
+        dcs->chs.legacy_fd = stream->fd;
+        dcs->chs.legacy_width = dcs->restore_params.legacy_width;
+        dcs->chs.hvm =
+            (dcs->guest_config->b_info.type == LIBXL_DOMAIN_TYPE_HVM);
+        dcs->chs.v2_carefd = NULL;
+        dcs->chs.completion_callback = conversion_done;
+
+        libxl__convert_legacy_stream(egc, &dcs->chs);
+
+        assert(dcs->chs.v2_carefd);
+        stream->v2_carefd = dcs->chs.v2_carefd;
+        stream->fd = libxl__carefd_fd(dcs->chs.v2_carefd);
+    }
+
+    /* stream->fd is now guaranteed to be a v2 stream. */
+
     memset(dc, 0, sizeof(*dc));
     dc->ao = stream->ao;
     dc->readfd = stream->fd;
@@ -164,7 +195,50 @@ static void stream_done(libxl__egc *egc,
 
     assert(!stream->running);
 
-    stream->completion_callback(egc, dcs, stream->rc);
+    if (stream->v2_carefd)
+        libxl__carefd_close(stream->v2_carefd);
+
+    check_stream_finished(egc, dcs, stream->rc, "stream");
+}
+
+static void check_stream_finished(libxl__egc *egc,
+                                  libxl__domain_create_state *dcs,
+                                  int rc, const char *what)
+{
+    libxl__stream_read_state *stream = &dcs->srs;
+    STATE_AO_GC(dcs->ao);
+
+    LOG(INFO, "Task '%s' joining (rc %d)", what, rc);
+
+    if (rc && !stream->joined_rc) {
+        bool skip = false;
+        /* First reported failure from joining tasks.  Tear everything down */
+        stream->joined_rc = rc;
+
+        if (libxl__stream_read_inuse(&dcs->srs)) {
+            skip = true;
+            libxl__stream_read_abort(egc, &dcs->srs, rc);
+        }
+
+        if (libxl__convert_legacy_stream_inuse(&dcs->chs)) {
+            skip = true;
+            libxl__convert_legacy_stream_abort(egc, &dcs->chs, rc);
+        }
+
+        /* There is at least one more active task to join - wait for its
+           callback */
+        if ( skip )
+            return;
+    }
+
+    if (libxl__stream_read_inuse(&dcs->srs))
+        LOG(DEBUG, "stream still in use");
+    else if (libxl__convert_legacy_stream_inuse(&dcs->chs))
+        LOG(DEBUG, "conversion still in use");
+    else {
+        LOG(INFO, "Join complete: result %d", stream->joined_rc);
+        stream->completion_callback(egc, dcs, stream->joined_rc);
+    }
 }
 
 static void stream_header_done(libxl__egc *egc,
@@ -303,6 +377,15 @@ static void record_body_done(libxl__egc *egc,
     stream_failed(egc, stream, ret);
 }
 
+static void conversion_done(libxl__egc *egc,
+                            libxl__conversion_helper_state *chs, int rc)
+{
+    STATE_AO_GC(chs->ao);
+    libxl__domain_create_state *dcs = CONTAINER_OF(chs, *dcs, chs);
+
+    check_stream_finished(egc, dcs, rc, "conversion");
+}
+
 static void process_record(libxl__egc *egc,
                            libxl__stream_read_state *stream)
 {
-- 
1.7.10.4

* [PATCH 19/27] tools/libxc+libxl+xl: Restore v2 streams
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (17 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 18/27] tools/libxl: Convert a legacy stream if needed Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 14:53   ` Ian Campbell
  2015-06-15 13:44 ` [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream Andrew Cooper
                   ` (10 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

This is a complicated set of changes which must be done together for
bisectability.

 * libxl-save-helper is updated to unconditionally use libxc migration v2.
 * libxl compatibility workarounds in libxc are disabled for restore operations.
 * libxl__stream_read_start() is logically spliced into the event location
   where libxl__xc_domain_restore() used to reside.

The parameters 'hvm', 'pae', and 'superpages' were previously superfluous, and
are completely unused in migration v2. callbacks->toolstack_restore is handled
via a migration v2 record now, rather than via a callback from libxc.

NB: this change breaks Remus.  Further untangling needs to happen before Remus
will function.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/Makefile            |    4 ++--
 tools/libxl/libxl_create.c      |   47 ++++++++++++---------------------------
 tools/libxl/libxl_save_helper.c |    2 +-
 tools/libxl/libxl_stream_read.c |   33 +++++++++++++++++++++++++++
 tools/libxl/libxl_types.idl     |    2 ++
 tools/libxl/xl_cmdimpl.c        |    7 +++++-
 6 files changed, 58 insertions(+), 37 deletions(-)
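
Note for readers: the xl and libxl_create.c hunks in this patch pick the
restore path from the save header's mandatory flags.  A standalone sketch of
that decision, under the assumption that only versions 1 and 2 exist, might
look like this.  The two flag values are copied from xl_cmdimpl.c;
stream_version_from_flags() and stream_is_legacy() are hypothetical helper
names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in xl_cmdimpl.c. */
#define XL_MANDATORY_FLAG_JSON     (1U << 0) /* config data is in JSON format */
#define XL_MANDATORY_FLAG_STREAMv2 (1U << 1) /* stream is v2 */

/* Hypothetical helper: a save file without the STREAMv2 flag is treated
 * as a legacy (version 1) stream. */
static uint32_t stream_version_from_flags(uint32_t mandatory_flags)
{
    return (mandatory_flags & XL_MANDATORY_FLAG_STREAMv2) ? 2 : 1;
}

/* Hypothetical helper: version 1 streams need converting before the v2
 * reading logic can consume them. */
static bool stream_is_legacy(uint32_t stream_version)
{
    assert(stream_version == 1 || stream_version == 2);
    return stream_version == 1;
}
```

Since stream_version defaults to 1 in the IDL, a caller that never sets the
field ends up on the legacy conversion path.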

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 55782c8..0e65b88 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -64,8 +64,8 @@ GUEST_SRCS-$(CONFIG_X86) += xc_sr_save_x86_hvm.c
 GUEST_SRCS-y += xc_sr_restore.c
 GUEST_SRCS-y += xc_sr_save.c
 GUEST_SRCS-y += xc_offline_page.c xc_compression.c
-$(patsubst %.c,%.o,$(GUEST_SRCS-y)): CFLAGS += -DXG_LIBXL_HVM_COMPAT
-$(patsubst %.c,%.opic,$(GUEST_SRCS-y)): CFLAGS += -DXG_LIBXL_HVM_COMPAT
+xc_sr_save_x86_hvm.o: CFLAGS += -DXG_LIBXL_HVM_COMPAT
+xc_sr_save_x86_hvm.opic: CFLAGS += -DXG_LIBXL_HVM_COMPAT
 else
 GUEST_SRCS-y += xc_nomigrate.c
 endif
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index a37cdf8..7dd7130 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -779,6 +779,10 @@ static void domcreate_attach_dtdev(libxl__egc *egc,
 static void domcreate_console_available(libxl__egc *egc,
                                         libxl__domain_create_state *dcs);
 
+static void domcreate_stream_done(libxl__egc *egc,
+                                  libxl__domain_create_state *dcs,
+                                  int ret);
+
 static void domcreate_rebuild_done(libxl__egc *egc,
                                    libxl__domain_create_state *dcs,
                                    int ret);
@@ -1002,11 +1006,8 @@ static void domcreate_bootloader_done(libxl__egc *egc,
     /* convenience aliases */
     const uint32_t domid = dcs->guest_domid;
     libxl_domain_config *const d_config = dcs->guest_config;
-    libxl_domain_build_info *const info = &d_config->b_info;
     const int restore_fd = dcs->restore_fd;
     libxl__domain_build_state *const state = &dcs->build_state;
-    libxl__srm_restore_autogen_callbacks *const callbacks =
-        &dcs->shs.callbacks.restore.a;
 
     if (rc) {
         domcreate_rebuild_done(egc, dcs, rc);
@@ -1039,30 +1040,16 @@ static void domcreate_bootloader_done(libxl__egc *egc,
     if (rc)
         goto out;
 
-    /* read signature */
-    int hvm, pae, superpages;
-    switch (info->type) {
-    case LIBXL_DOMAIN_TYPE_HVM:
-        hvm = 1;
-        superpages = 1;
-        pae = libxl_defbool_val(info->u.hvm.pae);
-        callbacks->toolstack_restore = libxl__toolstack_restore;
-        break;
-    case LIBXL_DOMAIN_TYPE_PV:
-        hvm = 0;
-        superpages = 0;
-        pae = 1;
-        break;
-    default:
-        rc = ERROR_INVAL;
-        goto out;
-    }
-    libxl__xc_domain_restore(egc, dcs, restore_fd,
-                             hvm, pae, superpages);
+    dcs->srs.ao = ao;
+    dcs->srs.fd = restore_fd;
+    dcs->srs.legacy = (dcs->restore_params.stream_version == 1);
+    dcs->srs.completion_callback = domcreate_stream_done;
+
+    libxl__stream_read_start(egc, &dcs->srs);
     return;
 
  out:
-    libxl__xc_domain_restore_done(egc, dcs, rc, 0, 0);
+    domcreate_stream_done(egc, dcs, rc);
 }
 
 void libxl__srm_callout_callback_restore_results(unsigned long store_mfn,
@@ -1078,10 +1065,10 @@ void libxl__srm_callout_callback_restore_results(unsigned long store_mfn,
     shs->need_results =           0;
 }
 
-void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
-                                   int ret, int retval, int errnoval)
+static void domcreate_stream_done(libxl__egc *egc,
+                                  libxl__domain_create_state *dcs,
+                                  int ret)
 {
-    libxl__domain_create_state *dcs = dcs_void;
     STATE_AO_GC(dcs->ao);
     libxl_ctx *ctx = libxl__gc_owner(gc);
     char **vments = NULL, **localents = NULL;
@@ -1098,12 +1085,6 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
     if (ret)
         goto out;
 
-    if (retval) {
-        LOGEV(ERROR, errnoval, "restoring domain");
-        ret = ERROR_FAIL;
-        goto out;
-    }
-
     gettimeofday(&start_time, NULL);
 
     switch (info->type) {
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 4b72f24..851ba92 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -311,7 +311,7 @@ int main(int argc, char **argv)
         startup("restore");
         setup_signals(SIG_DFL);
 
-        r = xc_domain_restore(xch, io_fd, dom, store_evtchn, &store_mfn,
+        r = xc_domain_restore2(xch, io_fd, dom, store_evtchn, &store_mfn,
                               store_domid, console_evtchn, &console_mfn,
                               console_domid, hvm, pae, superpages,
                               checkpointed,
diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index 87b9737..a8cd2c3 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -225,6 +225,11 @@ static void check_stream_finished(libxl__egc *egc,
             libxl__convert_legacy_stream_abort(egc, &dcs->chs, rc);
         }
 
+        if (libxl__save_helper_inuse(&dcs->shs)) {
+            skip = true;
+            libxl__save_helper_abort(egc, &dcs->shs);
+        }
+
         /* There is at least one more active task to join - wait for its
            callback */
         if ( skip )
@@ -235,6 +240,8 @@ static void check_stream_finished(libxl__egc *egc,
         LOG(DEBUG, "stream still in use");
     else if (libxl__convert_legacy_stream_inuse(&dcs->chs))
         LOG(DEBUG, "conversion still in use");
+    else if (libxl__save_helper_inuse(&dcs->shs))
+        LOG(DEBUG, "save/restore still in use");
     else {
         LOG(INFO, "Join complete: result %d", stream->joined_rc);
         stream->completion_callback(egc, dcs, stream->joined_rc);
@@ -377,6 +384,28 @@ static void record_body_done(libxl__egc *egc,
     stream_failed(egc, stream, ret);
 }
 
+void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
+                                   int ret, int retval, int errnoval)
+{
+    libxl__domain_create_state *dcs = dcs_void;
+    STATE_AO_GC(dcs->ao);
+
+    if (ret)
+        goto err;
+
+    if (retval) {
+        LOGEV(ERROR, errnoval, "restoring domain");
+        ret = ERROR_FAIL;
+        goto err;
+    }
+
+    libxl__stream_read_continue(egc, &dcs->srs);
+    return;
+
+ err:
+    check_stream_finished(egc, dcs, ret, "save/restore helper");
+}
+
 static void conversion_done(libxl__egc *egc,
                             libxl__conversion_helper_state *chs, int rc)
 {
@@ -402,6 +431,10 @@ static void process_record(libxl__egc *egc,
         /* Handled later, after cleanup. */
         break;
 
+    case REC_TYPE_LIBXC_CONTEXT:
+        libxl__xc_domain_restore(egc, dcs, stream->fd, 0, 0, 0);
+        break;
+
     case REC_TYPE_XENSTORE_DATA:
         ret = libxl__toolstack_restore(dcs->guest_domid, stream->rec_body,
                                        rec_hdr->length, &dcs->shs);
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 23f27d4..7418d92 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -346,6 +346,8 @@ libxl_domain_create_info = Struct("domain_create_info",[
 
 libxl_domain_restore_params = Struct("domain_restore_params", [
     ("checkpointed_stream", integer),
+    ("stream_version", uint32, {'init_val': '1'}),
+    ("legacy_width", uint32),
     ])
 
 libxl_domain_sched_params = Struct("domain_sched_params",[
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index ddb293c..14d96c9 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -110,7 +110,9 @@
 
 #define XL_MANDATORY_FLAG_JSON (1U << 0) /* config data is in JSON format */
 #define XL_MANDATORY_FLAG_STREAMv2 (1U << 1) /* stream is v2 */
-#define XL_MANDATORY_FLAG_ALL  (XL_MANDATORY_FLAG_JSON)
+#define XL_MANDATORY_FLAG_ALL  (XL_MANDATORY_FLAG_JSON |        \
+                                XL_MANDATORY_FLAG_STREAMv2)
+
 struct save_file_header {
     char magic[32]; /* savefileheader_magic */
     /* All uint32_ts are in domain's byte order. */
@@ -2724,6 +2726,9 @@ static uint32_t create_domain(struct domain_create *dom_info)
         libxl_domain_restore_params_init(&params);
 
         params.checkpointed_stream = dom_info->checkpointed_stream;
+        params.stream_version =
+            (hdr.mandatory_flags & XL_MANDATORY_FLAG_STREAMv2) ? 2 : 1;
+
         ret = libxl_domain_create_restore(ctx, &d_config,
                                           &domid, restore_fd,
                                           &params,
-- 
1.7.10.4

* [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (18 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 19/27] tools/libxc+libxl+xl: Restore v2 streams Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 14:57   ` Ian Campbell
                     ` (6 more replies)
  2015-06-15 13:44 ` [PATCH 21/27] tools/libxc+libxl+xl: Save v2 streams Andrew Cooper
                   ` (9 subsequent siblings)
  29 siblings, 7 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Ian Campbell, Andrew Cooper, Ian Jackson,
	Ross Lagerwall, Yang Hongyang

From: Ross Lagerwall <ross.lagerwall@citrix.com>

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/Makefile             |    2 +-
 tools/libxl/libxl_internal.h     |   33 +++
 tools/libxl/libxl_stream_write.c |  536 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 570 insertions(+), 1 deletion(-)
 create mode 100644 tools/libxl/libxl_stream_write.c
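
Note for readers: write_toolstack_record() in the patch below pads each record
body with zeroes up to the record alignment.  The arithmetic can be sketched
standalone as follows.  Everything here is an assumption made for
illustration: REC_ALIGN_ORDER is taken to be 3 (8-byte alignment), the record
header is taken to be two uint32_t fields, and roundup_order() is a
hypothetical stand-in for libxl's ROUNDUP() macro that ignores overflow near
UINT32_MAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REC_ALIGN_ORDER 3u /* assumed: records are aligned to 1 << 3 bytes */

/* Assumed layout of a record header: type followed by body length. */
struct rec_hdr {
    uint32_t type;
    uint32_t length;
};

/* Round val up to the next multiple of (1 << order). */
static uint32_t roundup_order(uint32_t val, unsigned int order)
{
    uint32_t mask;

    assert(order < 32);
    mask = (UINT32_C(1) << order) - 1;
    return (val + mask) & ~mask;
}

/* Number of zero bytes to append after a body of body_len bytes. */
static uint32_t record_padding(uint32_t body_len)
{
    return roundup_order(body_len, REC_ALIGN_ORDER) - body_len;
}

/* Total bytes written for one record: header, body, then padding. */
static size_t record_wire_size(uint32_t body_len)
{
    return sizeof(struct rec_hdr) + body_len + record_padding(body_len);
}
```

The padding is always smaller than 1 << REC_ALIGN_ORDER, which is why a single
static zero_padding[] buffer of that size is enough for every record.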

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index ca0ae3e..63e32f7 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -94,7 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
 			libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
 			libxl_internal.o libxl_utils.o libxl_uuid.o \
 			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
-			libxl_stream_read.o \
+			libxl_stream_read.o libxl_stream_write.o \
 			libxl_save_callout.o _libxl_save_msgs_callout.o \
 			libxl_convert_callout.o \
 			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 5482950..82cd792 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2868,6 +2868,38 @@ typedef void libxl__domain_suspend_cb(libxl__egc*,
 typedef void libxl__save_device_model_cb(libxl__egc*,
                                          libxl__domain_suspend_state*, int rc);
 
+/* State for writing a libxl migration v2 stream */
+typedef struct libxl__stream_write_state libxl__stream_write_state;
+
+struct libxl__stream_write_state {
+    /* filled by the user */
+    libxl__ao *ao;
+    int fd;
+    uint32_t domid;
+    void (*completion_callback)(libxl__egc *egc,
+                                libxl__domain_suspend_state *dss,
+                                int rc);
+    /* Private */
+    int rc;
+    int joined_rc;
+    size_t padding;
+    bool running;
+    libxl__datacopier_state dc;
+};
+
+_hidden void libxl__stream_write_start(libxl__egc *egc,
+                                       libxl__stream_write_state *stream);
+
+_hidden void libxl__stream_write_abort(libxl__egc *egc,
+                                       libxl__stream_write_state *stream,
+                                       int rc);
+
+static inline bool libxl__stream_write_inuse(
+    const libxl__stream_write_state *stream)
+{
+    return stream->running;
+}
+
 typedef struct libxl__logdirty_switch {
     const char *cmd;
     const char *cmd_path;
@@ -2907,6 +2939,7 @@ struct libxl__domain_suspend_state {
     /* private for libxl__domain_save_device_model */
     libxl__save_device_model_cb *save_dm_callback;
     libxl__datacopier_state save_dm_datacopier;
+    libxl__stream_write_state sws;
 };
 
 
diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
new file mode 100644
index 0000000..856d72e
--- /dev/null
+++ b/tools/libxl/libxl_stream_write.c
@@ -0,0 +1,536 @@
+/*
+ * Copyright (C) 2015      Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+/*
+ * Infrastructure for writing a domain to a libxl migration v2 stream.
+ *
+ * Entry points from outside:
+ *  - libxl__stream_write_start()
+ *     - Start writing a stream from the start.
+ *
+ * In normal operation, there are two tasks running at once; this stream
+ * processing, and the libxl-save-helper.  check_stream_finished() is used
+ * to join all the tasks in both success and error cases.
+ *
+ * Nomenclature for event callbacks:
+ *  - $FOO_done(): Completion callback for $FOO
+ *  - write_$FOO(): Set up writing a $FOO
+ *  - $BAR_header(): A $BAR record header only
+ *  - $BAR_record(): A complete $BAR record with header and content
+ *
+ * The main loop for a plain VM writes:
+ *  - Stream header
+ *  - Libxc record
+ *  - Toolstack record
+ *  - if (hvm), Qemu record
+ *  - End record
+ */
+
+static const uint8_t zero_padding[1U << REC_ALIGN_ORDER] = { 0 };
+
+static void stream_success(libxl__egc *egc,
+                           libxl__stream_write_state *stream);
+static void stream_failed(libxl__egc *egc,
+                          libxl__stream_write_state *stream, int ret);
+static void stream_done(libxl__egc *egc,
+                        libxl__stream_write_state *stream);
+
+static void check_stream_finished(libxl__egc *egc,
+                                  libxl__domain_suspend_state *dss,
+                                  int rc, const char *what);
+
+/* Event callbacks for plain VM. */
+static void stream_header_done(libxl__egc *egc,
+                               libxl__datacopier_state *dc,
+                               int onwrite, int errnoval);
+static void libxc_header_done(libxl__egc *egc,
+                              libxl__datacopier_state *dc,
+                              int onwrite, int errnoval);
+/* libxl__xc_domain_save_done() lives here, event-order wise. */
+static void write_toolstack_record(libxl__egc *egc,
+                                   libxl__stream_write_state *stream);
+static void toolstack_record_done(libxl__egc *egc,
+                                  libxl__datacopier_state *dc,
+                                  int onwrite, int errnoval);
+static void write_emulator_record(libxl__egc *egc,
+                                  libxl__stream_write_state *stream);
+static void emulator_body_done(libxl__egc *egc,
+                               libxl__datacopier_state *dc,
+                               int onwrite, int errnoval);
+static void emulator_padding_done(libxl__egc *egc,
+                                  libxl__datacopier_state *dc,
+                                  int onwrite, int errnoval);
+static void write_end_record(libxl__egc *egc,
+                             libxl__stream_write_state *stream);
+static void end_record_done(libxl__egc *egc,
+                            libxl__datacopier_state *dc,
+                            int onwrite, int errnoval);
+
+void libxl__stream_write_start(libxl__egc *egc,
+                               libxl__stream_write_state *stream)
+{
+    libxl__datacopier_state *dc = &stream->dc;
+    STATE_AO_GC(stream->ao);
+    struct libxl_sr_hdr hdr = { 0 };
+    int ret = 0;
+
+    assert(!stream->running);
+    stream->running = true;
+
+    memset(dc, 0, sizeof(*dc));
+    dc->readwhat = "";
+    dc->copywhat = "suspend header";
+    dc->writewhat = "save/migration stream";
+    dc->ao = ao;
+    dc->readfd = -1;
+    dc->writefd = stream->fd;
+    dc->maxsz = INT_MAX;
+    dc->bytes_to_read = INT_MAX;
+    dc->callback = stream_header_done;
+
+    ret = libxl__datacopier_start(dc);
+    if (ret)
+        goto err;
+
+    hdr.ident   = htobe64(RESTORE_STREAM_IDENT);
+    hdr.version = htobe32(RESTORE_STREAM_VERSION);
+    hdr.options = htobe32(0);
+
+    libxl__datacopier_prefixdata(egc, dc, &hdr, sizeof(hdr));
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+void libxl__stream_write_abort(libxl__egc *egc,
+                               libxl__stream_write_state *stream, int rc)
+{
+    stream_failed(egc, stream, rc);
+}
+
+static void stream_success(libxl__egc *egc, libxl__stream_write_state *stream)
+{
+    stream->rc = 0;
+    stream->running = false;
+
+    stream_done(egc, stream);
+}
+
+static void stream_failed(libxl__egc *egc,
+                          libxl__stream_write_state *stream, int rc)
+{
+    assert(rc);
+    stream->rc = rc;
+
+    if (stream->running) {
+        stream->running = false;
+        stream_done(egc, stream);
+    }
+}
+
+static void stream_done(libxl__egc *egc,
+                        libxl__stream_write_state *stream)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
+
+    assert(!stream->running);
+
+    check_stream_finished(egc, dss, stream->rc, "stream");
+}
+
+static void check_stream_finished(libxl__egc *egc,
+                                  libxl__domain_suspend_state *dss,
+                                  int rc, const char *what)
+{
+    libxl__stream_write_state *stream = &dss->sws;
+    STATE_AO_GC(dss->ao);
+
+    LOG(INFO, "Task '%s' joining (rc %d)", what, rc);
+
+    if (rc && !stream->joined_rc) {
+        bool skip = false;
+        /* First reported failure from joining tasks.  Tear everything down */
+        stream->joined_rc = rc;
+
+        if (libxl__stream_write_inuse(&dss->sws)) {
+            skip = true;
+            libxl__stream_write_abort(egc, &dss->sws, rc);
+        }
+
+        if (libxl__save_helper_inuse(&dss->shs)) {
+            skip = true;
+            libxl__save_helper_abort(egc, &dss->shs);
+        }
+
+        /* There is at least one more active task to join - wait for its
+           callback */
+        if ( skip )
+            return;
+    }
+
+    if (libxl__stream_write_inuse(&dss->sws))
+        LOG(DEBUG, "stream still in use");
+    else if (libxl__save_helper_inuse(&dss->shs))
+        LOG(DEBUG, "save/restore still in use");
+    else {
+        LOG(INFO, "Join complete: result %d", stream->joined_rc);
+        stream->completion_callback(egc, dss, stream->joined_rc);
+    }
+}
+
+static void stream_header_done(libxl__egc *egc,
+                               libxl__datacopier_state *dc,
+                               int onwrite, int errnoval)
+{
+    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
+    STATE_AO_GC(stream->ao);
+    struct libxl_sr_rec_hdr rec = { REC_TYPE_LIBXC_CONTEXT, 0 };
+    int ret = 0;
+
+    if (onwrite || errnoval) {
+        ret = ERROR_FAIL;
+        goto err;
+    }
+
+    dc->copywhat = "suspend footer";
+    dc->writewhat = "save/migration stream";
+    dc->callback = libxc_header_done;
+
+    ret = libxl__datacopier_start(dc);
+    if (ret)
+        goto err;
+
+    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+static void libxc_header_done(libxl__egc *egc,
+                              libxl__datacopier_state *dc,
+                              int onwrite, int errnoval)
+{
+    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
+    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
+    STATE_AO_GC(stream->ao);
+    int ret = 0;
+
+    if (onwrite || errnoval) {
+        ret = ERROR_FAIL;
+        goto err;
+    }
+
+    libxl__xc_domain_save(egc, dss);
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+static void __attribute__((used))
+will_be_libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
+                                int rc, int retval, int errnoval)
+{
+    libxl__domain_suspend_state *dss = dss_void;
+    libxl__stream_write_state *stream = &dss->sws;
+    STATE_AO_GC(dss->ao);
+
+    if (rc)
+        goto err;
+
+    if (retval) {
+        LOGEV(ERROR, errnoval, "saving domain: %s",
+                         dss->guest_responded ?
+                         "domain responded to suspend request" :
+                         "domain did not respond to suspend request");
+        if ( !dss->guest_responded )
+            rc = ERROR_GUEST_TIMEDOUT;
+        else
+            rc = ERROR_FAIL;
+        goto err;
+    }
+
+    write_toolstack_record(egc, stream);
+    return;
+
+ err:
+    assert(rc);
+    check_stream_finished(egc, dss, rc, "save/restore helper");
+}
+
+static void write_toolstack_record(libxl__egc *egc,
+                                   libxl__stream_write_state *stream)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
+    libxl__datacopier_state *dc = &stream->dc;
+    STATE_AO_GC(stream->ao);
+    struct libxl_sr_rec_hdr rec = { REC_TYPE_XENSTORE_DATA, 0 };
+    int ret = 0;
+    uint8_t *toolstack_buf = NULL; /* We must free this. */
+    uint32_t toolstack_len, padding_len;
+
+    ret = libxl__toolstack_save(dss->domid, &toolstack_buf,
+                                &toolstack_len, dss);
+    if (ret)
+        goto err;
+
+    dc->copywhat = "toolstack record";
+    dc->writewhat = "save/migration stream";
+    dc->callback = toolstack_record_done;
+
+    ret = libxl__datacopier_start(dc);
+    if (ret)
+        goto err;
+
+    rec.length = toolstack_len;
+
+    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
+    libxl__datacopier_prefixdata(egc, dc, toolstack_buf, toolstack_len);
+
+    padding_len = ROUNDUP(rec.length, REC_ALIGN_ORDER) - rec.length;
+    if (padding_len)
+        libxl__datacopier_prefixdata(egc, dc, zero_padding, padding_len);
+
+    free(toolstack_buf);
+    return;
+
+ err:
+    assert(ret);
+    free(toolstack_buf);
+    stream_failed(egc, stream, ret);
+}
+
+static void toolstack_record_done(libxl__egc *egc,
+                                  libxl__datacopier_state *dc,
+                                  int onwrite, int errnoval)
+{
+    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
+    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
+    STATE_AO_GC(stream->ao);
+    int ret = 0;
+
+    if (onwrite || errnoval) {
+        ret = ERROR_FAIL;
+        goto err;
+    }
+
+    if (dss->type == LIBXL_DOMAIN_TYPE_HVM)
+        write_emulator_record(egc, stream);
+    else
+        write_end_record(egc, stream);
+
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+static void write_emulator_record(libxl__egc *egc,
+                                  libxl__stream_write_state *stream)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
+    libxl__datacopier_state *dc = &stream->dc;
+    STATE_AO_GC(stream->ao);
+    struct libxl_sr_rec_hdr rec = { REC_TYPE_EMULATOR_CONTEXT, 0 };
+    struct libxl_sr_emulator_hdr ehdr = { 0 };
+    struct stat st;
+    int ret = 0;
+    uint32_t qemu_state_len;
+
+    assert(dss->type == LIBXL_DOMAIN_TYPE_HVM);
+
+    /* Convenience aliases */
+    const char *const filename = dss->dm_savefile;
+    const uint32_t domid = dss->domid;
+
+    switch(libxl__device_model_version_running(gc, domid)) {
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+        ehdr.id = EMULATOR_QEMU_TRADITIONAL;
+        break;
+
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+        ehdr.id = EMULATOR_QEMU_UPSTREAM;
+        break;
+
+    default:
+        ret = ERROR_FAIL;
+        goto err;
+    }
+
+    ret = libxl__domain_suspend_device_model(gc, dss);
+    if (ret)
+        goto err;
+
+    dc->readwhat = GCSPRINTF("qemu save file %s", filename);
+    dc->copywhat = "emulator record";
+    dc->writewhat = "save/migration stream";
+    dc->callback = emulator_body_done;
+    ret = ERROR_FAIL; /* For the open/fstat/S_ISREG failure paths below. */
+    dc->readfd = open(filename, O_RDONLY);
+    if (dc->readfd < 0) {
+        LOGE(ERROR, "unable to open %s", dc->readwhat);
+        goto err;
+    }
+
+    if (fstat(dc->readfd, &st))
+    {
+        LOGE(ERROR, "unable to fstat %s", dc->readwhat);
+        goto err;
+    }
+
+    if (!S_ISREG(st.st_mode)) {
+        LOG(ERROR, "%s is not a plain file!", dc->readwhat);
+        goto err;
+    }
+
+    qemu_state_len = st.st_size;
+    rec.length = qemu_state_len + sizeof(ehdr);
+
+    ret = libxl__datacopier_start(dc);
+    if (ret)
+        goto err;
+
+    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
+    libxl__datacopier_prefixdata(egc, dc, &ehdr, sizeof(ehdr));
+
+    stream->padding = ROUNDUP(qemu_state_len, REC_ALIGN_ORDER) - qemu_state_len;
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+static void emulator_body_done(libxl__egc *egc,
+                               libxl__datacopier_state *dc,
+                               int onwrite, int errnoval)
+{
+    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
+    STATE_AO_GC(stream->ao);
+    int ret = 0;
+
+    if (onwrite || errnoval) {
+        ret = ERROR_FAIL;
+        goto err;
+    }
+
+    dc->readwhat = "";
+    dc->readfd = -1;
+
+    if (stream->padding) {
+        assert(stream->padding < (1U << REC_ALIGN_ORDER));
+
+        dc->copywhat = "emulator padding";
+        dc->writewhat = "save/migration stream";
+        dc->callback = emulator_padding_done;
+
+        ret = libxl__datacopier_start(dc);
+        if (ret)
+            goto err;
+
+        libxl__datacopier_prefixdata(egc, dc, zero_padding, stream->padding);
+        return;
+    }
+
+    emulator_padding_done(egc, dc, 0, 0);
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+static void emulator_padding_done(libxl__egc *egc,
+                                  libxl__datacopier_state *dc,
+                                  int onwrite, int errnoval)
+{
+    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
+    STATE_AO_GC(stream->ao);
+    int ret = 0;
+
+    if (onwrite || errnoval) {
+        ret = ERROR_FAIL;
+        goto err;
+    }
+
+    write_end_record(egc, stream);
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+static void write_end_record(libxl__egc *egc,
+                             libxl__stream_write_state *stream)
+{
+    libxl__datacopier_state *dc = &stream->dc;
+    STATE_AO_GC(stream->ao);
+    struct libxl_sr_rec_hdr rec = { REC_TYPE_END, 0 };
+    int ret = 0;
+
+    dc->copywhat = "suspend footer";
+    dc->writewhat = "save/migration stream";
+    dc->callback = end_record_done;
+
+    ret = libxl__datacopier_start(dc);
+    if (ret)
+        goto err;
+
+    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+static void end_record_done(libxl__egc *egc,
+                            libxl__datacopier_state *dc,
+                            int onwrite, int errnoval)
+{
+    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
+    STATE_AO_GC(stream->ao);
+    int ret = 0;
+
+    if (onwrite || errnoval) {
+        ret = ERROR_FAIL;
+        goto err;
+    }
+
+    stream_success(egc, stream);
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.7.10.4


* [PATCH 21/27] tools/libxc+libxl+xl: Save v2 streams
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (19 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 14:59   ` Ian Campbell
  2015-06-15 13:44 ` [PATCH 22/27] docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams Andrew Cooper
                   ` (8 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

This is a complicated set of changes which must be done together for
bisectability.

 * libxl-save-helper is updated to unconditionally use libxc migration v2.
 * libxl compatibility workarounds in libxc are disabled for save operations.
 * libxl__stream_write_start() is logically spliced into the event location
   where libxl__xc_domain_save() used to reside.
 * xl is updated to indicate that the stream is now v2.
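
The splice described in the third bullet can be sketched roughly as follows.
This is a simplified, synchronous model and not the libxl code: every
sketch_* name is hypothetical, the event loop and egc are left out, and the
"stream writer" here simply reports a caller-supplied rc straight away.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_dss;

/* Hypothetical stand-in for libxl__stream_write_state. */
struct sketch_sws {
    int fd;
    void (*completion_callback)(struct sketch_dss *dss, int rc);
};

/* Hypothetical stand-in for libxl__domain_suspend_state. */
struct sketch_dss {
    int fd;
    int done;               /* Set once the suspend has finished. */
    int rc;                 /* Result handed to the final callback. */
    struct sketch_sws sws;  /* Embedded, as dss->sws is in the patch. */
};

static void sketch_domain_suspend_done(struct sketch_dss *dss, int rc)
{
    dss->done = 1;
    dss->rc = rc;
}

/* Plays the role of stream_done(): forward the result unchanged. */
static void sketch_stream_done(struct sketch_dss *dss, int rc)
{
    sketch_domain_suspend_done(dss, rc);
}

/* Pretend every record was written and report rc to the owner. */
static void sketch_stream_write_start(struct sketch_sws *sws, int rc)
{
    /* Recover the containing state, as CONTAINER_OF() does in libxl. */
    struct sketch_dss *dss =
        (struct sketch_dss *)((char *)sws - offsetof(struct sketch_dss, sws));

    assert(sws->completion_callback);
    sws->completion_callback(dss, rc);
}

/* The save path now starts the stream writer rather than the helper. */
static void sketch_domain_suspend(struct sketch_dss *dss, int simulated_rc)
{
    dss->sws.fd = dss->fd;
    dss->sws.completion_callback = sketch_stream_done;

    sketch_stream_write_start(&dss->sws, simulated_rc);
}
```

In the real series the writer runs asynchronously and only later starts the
save helper itself; the sketch shows just the ownership of the callbacks.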

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>

---
RFC: What kind of ABI/API indication is appropriate here?  A LIBXL_HAVE*
isn't appropriate.
---
 tools/libxc/Makefile             |    2 --
 tools/libxl/libxl_dom.c          |   44 ++++++++------------------------------
 tools/libxl/libxl_save_helper.c  |    2 +-
 tools/libxl/libxl_stream_write.c |    3 +--
 tools/libxl/xl_cmdimpl.c         |    1 +
 5 files changed, 12 insertions(+), 40 deletions(-)

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 0e65b88..99fcf9f 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -64,8 +64,6 @@ GUEST_SRCS-$(CONFIG_X86) += xc_sr_save_x86_hvm.c
 GUEST_SRCS-y += xc_sr_restore.c
 GUEST_SRCS-y += xc_sr_save.c
 GUEST_SRCS-y += xc_offline_page.c xc_compression.c
-xc_sr_save_x86_hvm.o: CFLAGS += -DXG_LIBXL_HVM_COMPAT
-xc_sr_save_x86_hvm.opic: CFLAGS += -DXG_LIBXL_HVM_COMPAT
 else
 GUEST_SRCS-y += xc_nomigrate.c
 endif
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 867172a..06bfaab 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1107,6 +1107,8 @@ int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
 
 /*==================== Domain suspend (save) ====================*/
 
+static void stream_done(libxl__egc *egc,
+                        libxl__domain_suspend_state *dss, int rc);
 static void domain_suspend_done(libxl__egc *egc,
                         libxl__domain_suspend_state *dss, int rc);
 static void domain_suspend_callback_common_done(libxl__egc *egc,
@@ -2040,48 +2042,20 @@ void libxl__domain_suspend(libxl__egc *egc, libxl__domain_suspend_state *dss)
     callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty;
     dss->shs.callbacks.save.toolstack_save = libxl__toolstack_save;
 
-    libxl__xc_domain_save(egc, dss);
+    dss->sws.fd = dss->fd;
+    dss->sws.ao = dss->ao;
+    dss->sws.completion_callback = stream_done;
+
+    libxl__stream_write_start(egc, &dss->sws);
     return;
 
  out:
     domain_suspend_done(egc, dss, rc);
 }
 
-void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
-                                int rc, int retval, int errnoval)
+static void stream_done(libxl__egc *egc,
+                        libxl__domain_suspend_state *dss, int rc)
 {
-    libxl__domain_suspend_state *dss = dss_void;
-    STATE_AO_GC(dss->ao);
-
-    /* Convenience aliases */
-    const libxl_domain_type type = dss->type;
-
-    if (rc)
-        goto out;
-
-    if (retval) {
-        LOGEV(ERROR, errnoval, "saving domain: %s",
-                         dss->guest_responded ?
-                         "domain responded to suspend request" :
-                         "domain did not respond to suspend request");
-        if ( !dss->guest_responded )
-            rc = ERROR_GUEST_TIMEDOUT;
-        else
-            rc = ERROR_FAIL;
-        goto out;
-    }
-
-    if (type == LIBXL_DOMAIN_TYPE_HVM) {
-        rc = libxl__domain_suspend_device_model(gc, dss);
-        if (rc) goto out;
-
-        libxl__domain_save_device_model(egc, dss, domain_suspend_done);
-        return;
-    }
-
-    rc = 0;
-
-out:
     domain_suspend_done(egc, dss, rc);
 }
 
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 851ba92..4cc93a2 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -284,7 +284,7 @@ int main(int argc, char **argv)
         startup("save");
         setup_signals(save_signal_handler);
 
-        r = xc_domain_save(xch, io_fd, dom, max_iters, max_factor, flags,
+        r = xc_domain_save2(xch, io_fd, dom, max_iters, max_factor, flags,
                            &helper_save_callbacks, hvm);
         complete(r);
 
diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
index 856d72e..d28a8a5 100644
--- a/tools/libxl/libxl_stream_write.c
+++ b/tools/libxl/libxl_stream_write.c
@@ -247,8 +247,7 @@ static void libxc_header_done(libxl__egc *egc,
     stream_failed(egc, stream, ret);
 }
 
-static void __attribute__((used))
-will_be_libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
+void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
                                 int rc, int retval, int errnoval)
 {
     libxl__domain_suspend_state *dss = dss_void;
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 14d96c9..35bc26d 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -3889,6 +3889,7 @@ static void save_domain_core_writeconfig(int fd, const char *source,
     memset(&hdr, 0, sizeof(hdr));
     memcpy(hdr.magic, savefileheader_magic, sizeof(hdr.magic));
     hdr.byteorder = SAVEFILE_BYTEORDER_VALUE;
+    hdr.mandatory_flags = XL_MANDATORY_FLAG_STREAMv2;
 
     optdata_begin= 0;
 
-- 
1.7.10.4


* [PATCH 22/27] docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (20 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 21/27] tools/libxc+libxl+xl: Save v2 streams Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 15:00   ` Ian Campbell
  2015-06-17  3:30   ` Wen Congyang
  2015-06-15 13:44 ` [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream Andrew Cooper
                   ` (7 subsequent siblings)
  29 siblings, 2 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

In a Remus scenario, libxc will write a CHECKPOINT record, then hand ownership
of the fd to libxl.  Libxl then writes any records required and finishes with
a CHECKPOINT_END record, then hands ownership of the fd back to libxc.
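
As a rough illustration of the framing involved, the sketch below builds the
zero-length CHECKPOINT_END record and computes the padding a record body
would need.  The header layout (two 32-bit fields, type then length), the
alignment order of 3 and all sketch_* names are assumptions made for the
example; they are not copied from libxl_sr_stream_format.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REC_TYPE_CHECKPOINT_END 0x00000004U /* Value added by this patch. */
#define SKETCH_ALIGN_ORDER      3U          /* Assumed: 8-octet alignment. */

/* Assumed header layout: record type followed by body length. */
struct sketch_rec_hdr {
    uint32_t type;
    uint32_t length;
};

/* Number of zero octets required after a body of 'len' octets. */
static uint32_t sketch_padding(uint32_t len)
{
    uint32_t align = 1U << SKETCH_ALIGN_ORDER;

    return (align - (len & (align - 1))) & (align - 1);
}

/*
 * Serialise a CHECKPOINT_END record into buf.  Returns the number of octets
 * written, or 0 if buf is too small.  The body is empty, so only the header
 * is emitted and no padding follows.
 */
static size_t sketch_emit_checkpoint_end(uint8_t *buf, size_t size)
{
    struct sketch_rec_hdr hdr = { REC_TYPE_CHECKPOINT_END, 0 };

    assert(buf != NULL);

    if (size < sizeof(hdr))
        return 0;

    memcpy(buf, &hdr, sizeof(hdr));
    return sizeof(hdr) + sketch_padding(hdr.length);
}
```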

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 docs/specs/libxl-migration-stream.pandoc |   15 ++++++++++++++-
 tools/libxl/libxl_sr_stream_format.h     |    1 +
 tools/python/xen/migration/libxl.py      |   11 +++++++++++
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/docs/specs/libxl-migration-stream.pandoc b/docs/specs/libxl-migration-stream.pandoc
index 7235317..d41932a 100644
--- a/docs/specs/libxl-migration-stream.pandoc
+++ b/docs/specs/libxl-migration-stream.pandoc
@@ -119,7 +119,9 @@ type         0x00000000: END
 
              0x00000003: EMULATOR_CONTEXT
 
-             0x00000004 - 0x7FFFFFFF: Reserved for future _mandatory_
+             0x00000004: CHECKPOINT_END
+
+             0x00000005 - 0x7FFFFFFF: Reserved for future _mandatory_
              records.
 
              0x80000000 - 0xFFFFFFFF: Reserved for future _optional_
@@ -203,3 +205,14 @@ index            Index of this emulator for the domain, if multiple
 
 emulator_ctx     Emulator context blob.
 --------------------------------------------------------------------
+
+CHECKPOINT_END
+--------------
+
+A checkpoint end record marks the end of a checkpoint in the image.
+
+     0     1     2     3     4     5     6     7 octet
+    +-------------------------------------------------+
+
+The checkpoint end record contains no fields; its body_length is 0.
+
diff --git a/tools/libxl/libxl_sr_stream_format.h b/tools/libxl/libxl_sr_stream_format.h
index 487f9e2..5dfa55f 100644
--- a/tools/libxl/libxl_sr_stream_format.h
+++ b/tools/libxl/libxl_sr_stream_format.h
@@ -35,6 +35,7 @@
 #define REC_TYPE_LIBXC_CONTEXT       0x00000001U
 #define REC_TYPE_XENSTORE_DATA       0x00000002U
 #define REC_TYPE_EMULATOR_CONTEXT    0x00000003U
+#define REC_TYPE_CHECKPOINT_END      0x00000004U
 
 typedef struct libxl_sr_emulator_hdr
 {
diff --git a/tools/python/xen/migration/libxl.py b/tools/python/xen/migration/libxl.py
index 4e1f4f8..415502e 100644
--- a/tools/python/xen/migration/libxl.py
+++ b/tools/python/xen/migration/libxl.py
@@ -36,12 +36,14 @@ REC_TYPE_end              = 0x00000000
 REC_TYPE_libxc_context    = 0x00000001
 REC_TYPE_xenstore_data    = 0x00000002
 REC_TYPE_emulator_context = 0x00000003
+REC_TYPE_checkpoint_end   = 0x00000004
 
 rec_type_to_str = {
     REC_TYPE_end              : "End",
     REC_TYPE_libxc_context    : "Libxc context",
     REC_TYPE_xenstore_data    : "Xenstore data",
     REC_TYPE_emulator_context : "Emulator context",
+    REC_TYPE_checkpoint_end   : "Checkpoint end",
 }
 
 # emulator_context
@@ -176,6 +178,13 @@ class VerifyLibxl(VerifyBase):
         self.info("  Index %d, type %s" % (emu_idx, emulator_id_to_str[emu_id]))
 
 
+    def verify_record_checkpoint_end(self, content):
+        """ Checkpoint end record """
+
+        if len(content) != 0:
+            raise RecordError("Checkpoint end record with non-zero length")
+
+
 record_verifiers = {
     REC_TYPE_end:
         VerifyLibxl.verify_record_end,
@@ -185,4 +194,6 @@ record_verifiers = {
         VerifyLibxl.verify_record_xenstore_data,
     REC_TYPE_emulator_context:
         VerifyLibxl.verify_record_emulator_context,
+    REC_TYPE_checkpoint_end:
+        VerifyLibxl.verify_record_checkpoint_end,
 }
-- 
1.7.10.4


* [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (21 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 22/27] docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 15:03   ` Ian Campbell
  2015-06-18  3:13   ` Wen Congyang
  2015-06-15 13:44 ` [PATCH 24/27] tools/libx{c, l}: [RFC] Introduce restore_callbacks.checkpoint() Andrew Cooper
                   ` (6 subsequent siblings)
  29 siblings, 2 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

Checkpoint records are written when libxl is signalled to do so by
libxl__remus_domain_checkpoint_callback().
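
The choice between the two terminating records can be modelled roughly as
below.  This is a synchronous simplification with hypothetical sketch_*
names; in the patch the same decision is spread across datacopier callbacks
and keyed off stream->in_checkpoint.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_rec {
    SKETCH_TOOLSTACK,
    SKETCH_EMULATOR,
    SKETCH_END,
    SKETCH_CHECKPOINT_END,
};

/* Hypothetical subset of the stream writer state. */
struct sketch_stream {
    bool running;
    bool in_checkpoint;
    bool hvm;
};

/*
 * Record the sequence written after the libxc data: toolstack, emulator (HVM
 * only), then either END or CHECKPOINT_END.  Returns the number of records.
 */
static int sketch_write_tail(struct sketch_stream *s, enum sketch_rec out[3])
{
    int n = 0;

    out[n++] = SKETCH_TOOLSTACK;
    if (s->hvm)
        out[n++] = SKETCH_EMULATOR;

    if (s->in_checkpoint) {
        out[n++] = SKETCH_CHECKPOINT_END;
        s->in_checkpoint = false;   /* Mirrors checkpoint_done(). */
    } else {
        out[n++] = SKETCH_END;
        s->running = false;         /* Mirrors stream_success(). */
    }

    return n;
}

/* Mirrors the entry checks of libxl__stream_write_start_checkpoint(). */
static int sketch_start_checkpoint(struct sketch_stream *s,
                                   enum sketch_rec out[3])
{
    assert(s->running);
    assert(!s->in_checkpoint);
    s->in_checkpoint = true;

    return sketch_write_tail(s, out);
}
```

A checkpoint leaves the stream running so that further checkpoints can
follow; only the plain END record stops it.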

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_dom.c          |   16 +++---
 tools/libxl/libxl_internal.h     |    7 +++
 tools/libxl/libxl_stream_write.c |  111 ++++++++++++++++++++++++++++++++++++--
 3 files changed, 121 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 06bfaab..3597a91 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1867,8 +1867,8 @@ static void remus_devices_preresume_cb(libxl__egc *egc,
 
 /*----- remus asynchronous checkpoint callback -----*/
 
-static void remus_checkpoint_dm_saved(libxl__egc *egc,
-                                      libxl__domain_suspend_state *dss, int rc);
+static void remus_checkpoint_stream_written(
+    libxl__egc *egc, libxl__domain_suspend_state *dss, int rc);
 static void remus_devices_commit_cb(libxl__egc *egc,
                                     libxl__remus_devices_state *rds,
                                     int rc);
@@ -1882,16 +1882,11 @@ static void libxl__remus_domain_checkpoint_callback(void *data)
     libxl__egc *egc = dss->shs.egc;
     STATE_AO_GC(dss->ao);
 
-    /* This would go into tailbuf. */
-    if (dss->hvm) {
-        libxl__domain_save_device_model(egc, dss, remus_checkpoint_dm_saved);
-    } else {
-        remus_checkpoint_dm_saved(egc, dss, 0);
-    }
+    libxl__stream_write_start_checkpoint(egc, &dss->sws);
 }
 
-static void remus_checkpoint_dm_saved(libxl__egc *egc,
-                                      libxl__domain_suspend_state *dss, int rc)
+static void remus_checkpoint_stream_written(
+    libxl__egc *egc, libxl__domain_suspend_state *dss, int rc)
 {
     /* Convenience aliases */
     libxl__remus_devices_state *const rds = &dss->rds;
@@ -2036,6 +2031,7 @@ void libxl__domain_suspend(libxl__egc *egc, libxl__domain_suspend_state *dss)
         callbacks->suspend = libxl__remus_domain_suspend_callback;
         callbacks->postcopy = libxl__remus_domain_resume_callback;
         callbacks->checkpoint = libxl__remus_domain_checkpoint_callback;
+        dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
     } else
         callbacks->suspend = libxl__domain_suspend_callback;
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 82cd792..bf1c377 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2879,17 +2879,24 @@ struct libxl__stream_write_state {
     void (*completion_callback)(libxl__egc *egc,
                                 libxl__domain_suspend_state *dss,
                                 int rc);
+    void (*checkpoint_callback)(libxl__egc *egc,
+                                libxl__domain_suspend_state *dss,
+                                int rc);
     /* Private */
     int rc;
     int joined_rc;
     size_t padding;
     bool running;
+    bool in_checkpoint;
     libxl__datacopier_state dc;
 };
 
 _hidden void libxl__stream_write_start(libxl__egc *egc,
                                        libxl__stream_write_state *stream);
 
+_hidden void libxl__stream_write_start_checkpoint(
+    libxl__egc *egc, libxl__stream_write_state *stream);
+
 _hidden void libxl__stream_write_abort(libxl__egc *egc,
                                        libxl__stream_write_state *stream,
                                        int rc);
diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
index d28a8a5..40f2cb7 100644
--- a/tools/libxl/libxl_stream_write.c
+++ b/tools/libxl/libxl_stream_write.c
@@ -23,6 +23,9 @@
  *  - libxl__stream_write_start()
  *     - Start writing a stream from the start.
  *
+ *  - libxl__stream_write_start_checkpoint()
+ *     - Write the records which form a checkpoint into a stream.
+ *
  * In normal operation, there are two tasks running at once; this stream
  * processing, and the the libxl-save-helper.  check_stream_finished() is used
  * to join all the tasks in both success and error cases.
@@ -39,6 +42,12 @@
  *  - Toolstack record
  *  - if (hvm), Qemu record
  *  - End record
+ *
+ * For a checkpointed stream, there is a second loop which is triggered by a
+ * save-helper checkpoint callback.  It writes:
+ *  - Toolstack record
+ *  - if (hvm), Qemu record
+ *  - Checkpoint end record
  */
 
 static const uint8_t zero_padding[1U << REC_ALIGN_ORDER] = { 0 };
@@ -81,6 +90,16 @@ static void end_record_done(libxl__egc *egc,
                             libxl__datacopier_state *dc,
                             int onwrite, int errnoval);
 
+/* Event callbacks unique to checkpointed streams. */
+static void checkpoint_done(libxl__egc *egc,
+                            libxl__stream_write_state *stream,
+                            int rc);
+static void write_checkpoint_end_record(libxl__egc *egc,
+                                        libxl__stream_write_state *stream);
+static void checkpoint_end_record_done(libxl__egc *egc,
+                                       libxl__datacopier_state *dc,
+                                       int onwrite, int errnoval);
+
 void libxl__stream_write_start(libxl__egc *egc,
                                libxl__stream_write_state *stream)
 {
@@ -119,6 +138,16 @@ void libxl__stream_write_start(libxl__egc *egc,
     stream_failed(egc, stream, ret);
 }
 
+void libxl__stream_write_start_checkpoint(libxl__egc *egc,
+                                          libxl__stream_write_state *stream)
+{
+    assert(stream->running);
+    assert(!stream->in_checkpoint);
+    stream->in_checkpoint = true;
+
+    write_toolstack_record(egc, stream);
+}
+
 void libxl__stream_write_abort(libxl__egc *egc,
                                libxl__stream_write_state *stream, int rc)
 {
@@ -130,6 +159,7 @@ static void stream_success(libxl__egc *egc, libxl__stream_write_state *stream)
     stream->rc = 0;
     stream->running = false;
 
+    assert(!stream->in_checkpoint);
     stream_done(egc, stream);
 }
 
@@ -139,6 +169,15 @@ static void stream_failed(libxl__egc *egc,
     assert(rc);
     stream->rc = rc;
 
+    /*
+     * If we are in a checkpoint, pass the failure to libxc, which will come
+     * back around to us via libxl__xc_domain_save_done().
+     */
+    if (stream->in_checkpoint) {
+        checkpoint_done(egc, stream, rc);
+        return;
+    }
+
     if (stream->running) {
         stream->running = false;
         stream_done(egc, stream);
@@ -151,6 +190,7 @@ static void stream_done(libxl__egc *egc,
     libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
 
     assert(!stream->running);
+    assert(!stream->in_checkpoint);
 
     check_stream_finished(egc, dss, stream->rc, "stream");
 }
@@ -335,8 +375,12 @@ static void toolstack_record_done(libxl__egc *egc,
 
     if (dss->type == LIBXL_DOMAIN_TYPE_HVM)
         write_emulator_record(egc, stream);
-    else
-        write_end_record(egc, stream);
+    else {
+        if (stream->in_checkpoint)
+            write_checkpoint_end_record(egc, stream);
+        else
+            write_end_record(egc, stream);
+    }
 
     return;
 
@@ -473,7 +517,10 @@ static void emulator_padding_done(libxl__egc *egc,
         goto err;
     }
 
-    write_end_record(egc, stream);
+    if (stream->in_checkpoint)
+        write_checkpoint_end_record(egc, stream);
+    else
+        write_end_record(egc, stream);
     return;
 
  err:
@@ -526,6 +573,64 @@ static void end_record_done(libxl__egc *egc,
     stream_failed(egc, stream, ret);
 }
 
+static void checkpoint_done(libxl__egc *egc,
+                            libxl__stream_write_state *stream,
+                            int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
+
+    assert(stream->in_checkpoint);
+    stream->in_checkpoint = false;
+    stream->checkpoint_callback(egc, dss, rc);
+}
+
+static void write_checkpoint_end_record(libxl__egc *egc,
+                                        libxl__stream_write_state *stream)
+{
+    libxl__datacopier_state *dc = &stream->dc;
+    STATE_AO_GC(stream->ao);
+    struct libxl_sr_rec_hdr rec = { REC_TYPE_CHECKPOINT_END, 0 };
+    int ret = 0;
+
+    assert(stream->in_checkpoint);
+
+    dc->copywhat = "checkpoint record";
+    dc->writewhat = "save/migration stream";
+    dc->callback = checkpoint_end_record_done;
+
+    ret = libxl__datacopier_start(dc);
+    if (ret)
+        goto err;
+
+    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
+static void checkpoint_end_record_done(libxl__egc *egc,
+                                       libxl__datacopier_state *dc,
+                                       int onwrite, int errnoval)
+{
+    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
+    STATE_AO_GC(stream->ao);
+    int ret = 0;
+
+    if (onwrite || errnoval) {
+        ret = ERROR_FAIL;
+        goto err;
+    }
+
+    checkpoint_done(egc, stream, 0);
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
 /*
  * Local variables:
  * mode: C
-- 
1.7.10.4


* [PATCH 24/27] tools/libx{c, l}: [RFC] Introduce restore_callbacks.checkpoint()
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (22 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16  2:23   ` Yang Hongyang
  2015-06-17  8:20   ` Yang Hongyang
  2015-06-15 13:44 ` [PATCH 25/27] tools/libxl: [RFC] Handle checkpoint records in a libxl migration v2 stream Andrew Cooper
                   ` (5 subsequent siblings)
  29 siblings, 2 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

The new callback is invoked when a checkpoint record is found in the libxc
stream.
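
A minimal sketch of the calling convention follows, assuming only what the
diff below shows: the callback takes the opaque data pointer and any
non-zero return aborts the restore.  The sketch_* names are hypothetical
and the surrounding record parsing is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of struct restore_callbacks. */
struct sketch_restore_callbacks {
    int (*checkpoint)(void *data);
    void *data;
};

/* Returns 0 to carry on with the restore, or -1 if the callback failed. */
static int sketch_handle_checkpoint(const struct sketch_restore_callbacks *cb)
{
    /* A checkpointed stream must supply the callback. */
    assert(cb->checkpoint);

    if (cb->checkpoint(cb->data))
        return -1;

    return 0;
}

/* Example callback: count how many checkpoints have been seen. */
static int sketch_count_checkpoint(void *data)
{
    int *count = data;

    ++*count;
    return 0;
}

/* Example callback that always reports failure. */
static int sketch_fail_checkpoint(void *data)
{
    (void)data;
    return 1;
}
```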

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/include/xenguest.h     |    3 +++
 tools/libxc/xc_sr_restore.c        |   15 ++++++++++++++-
 tools/libxl/libxl_save_msgs_gen.pl |    2 +-
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 7581263..b0d27ed 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -102,6 +102,9 @@ struct restore_callbacks {
     int (*toolstack_restore)(uint32_t domid, const uint8_t *buf,
             uint32_t size, void* data);
 
+    /* A checkpoint record has been found in the stream */
+    int (*checkpoint)(void* data);
+
     /* to be provided as the last argument to each callback function */
     void* data;
 };
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 9e27dba..5e0f817 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -1,5 +1,7 @@
 #include <arpa/inet.h>
 
+#include <assert.h>
+
 #include "xc_sr_common.h"
 
 /*
@@ -472,7 +474,7 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
 static int handle_checkpoint(struct xc_sr_context *ctx)
 {
     xc_interface *xch = ctx->xch;
-    int rc = 0;
+    int rc = 0, ret;
     unsigned i;
 
     if ( !ctx->restore.checkpointed )
@@ -482,6 +484,13 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
         goto err;
     }
 
+    ret = ctx->restore.callbacks->checkpoint(ctx->restore.callbacks->data);
+    if ( ret )
+    {
+        rc = -1;
+        goto err;
+    }
+
     if ( ctx->restore.buffer_all_records )
     {
         IPRINTF("All records buffered");
@@ -735,6 +744,10 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
     ctx.restore.checkpointed = checkpointed_stream;
     ctx.restore.callbacks = callbacks;
 
+    /* Sanity checks for callbacks. */
+    if (checkpointed_stream)
+        assert(callbacks->checkpoint);
+
     IPRINTF("In experimental %s", __func__);
     DPRINTF("fd %d, dom %u, hvm %u, pae %u, superpages %d"
             ", checkpointed_stream %d", io_fd, dom, hvm, pae,
diff --git a/tools/libxl/libxl_save_msgs_gen.pl b/tools/libxl/libxl_save_msgs_gen.pl
index 6b4b65e..36b279e 100755
--- a/tools/libxl/libxl_save_msgs_gen.pl
+++ b/tools/libxl/libxl_save_msgs_gen.pl
@@ -25,7 +25,7 @@ our @msgs = (
                                                 'unsigned long', 'total'] ],
     [  3, 'scxA',   "suspend", [] ],
     [  4, 'scxA',   "postcopy", [] ],
-    [  5, 'scxA',   "checkpoint", [] ],
+    [  5, 'srcxA',   "checkpoint", [] ],
     [  6, 'scxA',   "switch_qemu_logdirty",  [qw(int domid
                                               unsigned enable)] ],
     #                toolstack_save          done entirely `by hand'
-- 
1.7.10.4


* [PATCH 25/27] tools/libxl: [RFC] Handle checkpoint records in a libxl migration v2 stream
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (23 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 24/27] tools/libx{c, l}: [RFC] Introduce restore_callbacks.checkpoint() Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-17  7:28   ` Wen Congyang
  2015-06-15 13:44 ` [PATCH 26/27] tools/libxc: Drop all XG_LIBXL_HVM_COMPAT code from libxc Andrew Cooper
                   ` (4 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

This is the final bit of untangling for Remus.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_create.c      |   25 ++++++++++++++++
 tools/libxl/libxl_internal.h    |    6 ++++
 tools/libxl/libxl_stream_read.c |   62 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 93 insertions(+)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 7dd7130..ac918bd 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -747,6 +747,27 @@ static int store_libxl_entry(libxl__gc *gc, uint32_t domid,
         libxl_device_model_version_to_string(b_info->device_model_version));
 }
 
+/*----- remus asynchronous checkpoint callback -----*/
+
+static void remus_checkpoint_stream_done(
+    libxl__egc *egc, libxl__domain_create_state *dcs, int rc);
+
+static void libxl__remus_domain_checkpoint_callback(void *data)
+{
+    libxl__save_helper_state *shs = data;
+    libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
+    libxl__egc *egc = dcs->shs.egc;
+    STATE_AO_GC(dcs->ao);
+
+    libxl__stream_read_start_checkpoint(egc, &dcs->srs);
+}
+
+static void remus_checkpoint_stream_done(
+    libxl__egc *egc, libxl__domain_create_state *dcs, int rc)
+{
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dcs->shs, rc);
+}
+
 /*----- main domain creation -----*/
 
 /* We have a linear control flow; only one event callback is
@@ -1008,6 +1029,8 @@ static void domcreate_bootloader_done(libxl__egc *egc,
     libxl_domain_config *const d_config = dcs->guest_config;
     const int restore_fd = dcs->restore_fd;
     libxl__domain_build_state *const state = &dcs->build_state;
+    libxl__srm_restore_autogen_callbacks *const callbacks =
+        &dcs->shs.callbacks.restore.a;
 
     if (rc) {
         domcreate_rebuild_done(egc, dcs, rc);
@@ -1035,6 +1058,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
     }
 
     /* Restore */
+    callbacks->checkpoint = libxl__remus_domain_checkpoint_callback;
 
     rc = libxl__build_pre(gc, domid, d_config, state);
     if (rc)
@@ -1044,6 +1068,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
     dcs->srs.fd = restore_fd;
     dcs->srs.legacy = (dcs->restore_params.stream_version == 1);
     dcs->srs.completion_callback = domcreate_stream_done;
+    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
 
     libxl__stream_read_start(egc, &dcs->srs);
     return;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index bf1c377..e271a0b 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3205,11 +3205,15 @@ struct libxl__stream_read_state {
     void (*completion_callback)(libxl__egc *egc,
                                 libxl__domain_create_state *dcs,
                                 int rc);
+    void (*checkpoint_callback)(libxl__egc *egc,
+                                libxl__domain_create_state *dcs,
+                                int rc);
     /* Private */
     libxl__carefd *v2_carefd;
     int rc;
     int joined_rc;
     bool running;
+    bool in_checkpoint;
     libxl__datacopier_state dc;
     size_t expected_len;
     libxl_sr_hdr hdr;
@@ -3222,6 +3226,8 @@ _hidden void libxl__stream_read_start(libxl__egc *egc,
 
 _hidden void libxl__stream_read_continue(libxl__egc *egc,
                                          libxl__stream_read_state *stream);
+_hidden void libxl__stream_read_start_checkpoint(
+    libxl__egc *egc, libxl__stream_read_state *stream);
 
 _hidden void libxl__stream_read_abort(libxl__egc *egc,
                                       libxl__stream_read_state *stream, int rc);
diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index a8cd2c3..09ef0aa 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -80,6 +80,10 @@ static void emulator_padding_done(libxl__egc *egc,
                                   libxl__datacopier_state *dc,
                                   int onwrite, int errnoval);
 
+/* Error handling for checkpoint mini-loop. */
+static void checkpoint_done(libxl__egc *egc,
+                            libxl__stream_read_state *stream, int rc);
+
 void libxl__stream_read_start(libxl__egc *egc,
                               libxl__stream_read_state *stream)
 {
@@ -162,6 +166,35 @@ void libxl__stream_read_continue(libxl__egc *egc,
     stream_failed(egc, stream, ret);
 }
 
+void libxl__stream_read_start_checkpoint(libxl__egc *egc,
+                                         libxl__stream_read_state *stream)
+{
+    libxl__datacopier_state *dc = &stream->dc;
+    int ret = 0;
+
+    assert(stream->running);
+    assert(!stream->in_checkpoint);
+    stream->in_checkpoint = true;
+
+    /* Read a record header. */
+    dc->readwhat = "record header";
+    dc->readbuf = &stream->rec_hdr;
+    stream->expected_len = dc->bytes_to_read = sizeof(stream->rec_hdr);
+    dc->used = 0;
+    dc->callback = record_header_done;
+
+    ret = libxl__datacopier_start(dc);
+    if (ret)
+        goto err;
+
+    assert(!ret);
+    return;
+
+ err:
+    assert(ret);
+    stream_failed(egc, stream, ret);
+}
+
 void libxl__stream_read_abort(libxl__egc *egc,
                               libxl__stream_read_state *stream, int rc)
 {
@@ -182,6 +215,15 @@ static void stream_failed(libxl__egc *egc,
     assert(rc);
     stream->rc = rc;
 
+    /*
+     * If we are in a checkpoint, pass the failure to libxc, which will come
+     * back around to us via libxl__xc_domain_restore_done().
+     */
+    if (stream->in_checkpoint) {
+        checkpoint_done(egc, stream, rc);
+        return;
+    }
+
     if (stream->running) {
         stream->running = false;
         stream_done(egc, stream);
@@ -194,6 +236,7 @@ static void stream_done(libxl__egc *egc,
     libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
 
     assert(!stream->running);
+    assert(!stream->in_checkpoint);
 
     if (stream->v2_carefd)
         libxl__carefd_close(stream->v2_carefd);
@@ -452,6 +495,15 @@ static void process_record(libxl__egc *egc,
         read_emulator_body(egc, stream);
         break;
 
+    case REC_TYPE_CHECKPOINT_END:
+        if (!stream->in_checkpoint) {
+            LOG(ERROR, "Unexpected CHECKPOINT_END record in stream");
+            ret = ERROR_FAIL;
+            goto err;
+        }
+        checkpoint_done(egc, stream, 0);
+        break;
+
     default:
         LOG(ERROR, "Unrecognised record 0x%08x", rec_hdr->type);
         ret = ERROR_FAIL;
@@ -592,6 +644,16 @@ static void emulator_padding_done(libxl__egc *egc,
     stream_failed(egc, stream, ret);
 }
 
+static void checkpoint_done(libxl__egc *egc,
+                            libxl__stream_read_state *stream, int rc)
+{
+    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
+
+    assert(stream->in_checkpoint);
+    stream->in_checkpoint = false;
+    stream->checkpoint_callback(egc, dcs, rc);
+}
+
 /*
  * Local variables:
  * mode: C
-- 
1.7.10.4


* [PATCH 26/27] tools/libxc: Drop all XG_LIBXL_HVM_COMPAT code from libxc
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (24 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 25/27] tools/libxl: [RFC] Handle checkpoint records in a libxl migration v2 stream Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 15:03   ` Ian Campbell
  2015-06-15 13:44 ` [PATCH 27/27] tools/libxl: Drop all knowledge of toolstack callbacks Andrew Cooper
                   ` (3 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

Libxl has now been fully adjusted not to need it.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/xc_sr_common.h          |    5 --
 tools/libxc/xc_sr_restore.c         |   18 -----
 tools/libxc/xc_sr_restore_x86_hvm.c |  124 -----------------------------------
 tools/libxc/xc_sr_save_x86_hvm.c    |   36 ----------
 4 files changed, 183 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 08c66db..42af55b 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -306,11 +306,6 @@ struct xc_sr_context
                     /* HVM context blob. */
                     void *context;
                     size_t contextsz;
-
-/* #ifdef XG_LIBXL_HVM_COMPAT */
-                    uint32_t qlen;
-                    void *qbuf;
-/* #endif */
                 } restore;
             };
         } x86_hvm;
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 5e0f817..fd45775 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -632,9 +632,6 @@ static void cleanup(struct xc_sr_context *ctx)
         PERROR("Failed to clean up");
 }
 
-#ifdef XG_LIBXL_HVM_COMPAT
-extern int read_qemu(struct xc_sr_context *ctx);
-#endif
 /*
  * Restore a domain.
  */
@@ -661,21 +658,6 @@ static int restore(struct xc_sr_context *ctx)
                 goto err;
         }
 
-#ifdef XG_LIBXL_HVM_COMPAT
-        if ( ctx->dominfo.hvm &&
-             (rec.type == REC_TYPE_END || rec.type == REC_TYPE_CHECKPOINT) )
-        {
-            rc = read_qemu(ctx);
-            if ( rc )
-            {
-                if ( ctx->restore.buffer_all_records )
-                    goto remus_failover;
-                else
-                    goto err;
-            }
-        }
-#endif
-
         if ( ctx->restore.buffer_all_records &&
              rec.type != REC_TYPE_END &&
              rec.type != REC_TYPE_CHECKPOINT )
diff --git a/tools/libxc/xc_sr_restore_x86_hvm.c b/tools/libxc/xc_sr_restore_x86_hvm.c
index 6f5af0e..49d22c7 100644
--- a/tools/libxc/xc_sr_restore_x86_hvm.c
+++ b/tools/libxc/xc_sr_restore_x86_hvm.c
@@ -3,24 +3,6 @@
 
 #include "xc_sr_common_x86.h"
 
-#ifdef XG_LIBXL_HVM_COMPAT
-static int handle_toolstack(struct xc_sr_context *ctx, struct xc_sr_record *rec)
-{
-    xc_interface *xch = ctx->xch;
-    int rc;
-
-    if ( !ctx->restore.callbacks || !ctx->restore.callbacks->toolstack_restore )
-        return 0;
-
-    rc = ctx->restore.callbacks->toolstack_restore(
-        ctx->domid, rec->data, rec->length, ctx->restore.callbacks->data);
-
-    if ( rc < 0 )
-        PERROR("restoring toolstack");
-    return rc;
-}
-#endif
-
 /*
  * Process an HVM_CONTEXT record from the stream.
  */
@@ -93,98 +75,6 @@ static int handle_hvm_params(struct xc_sr_context *ctx,
     return 0;
 }
 
-#ifdef XG_LIBXL_HVM_COMPAT
-int read_qemu(struct xc_sr_context *ctx);
-int read_qemu(struct xc_sr_context *ctx)
-{
-    xc_interface *xch = ctx->xch;
-    char qemusig[21];
-    uint32_t qlen;
-    void *qbuf = NULL;
-    int rc = -1;
-
-    if ( read_exact(ctx->fd, qemusig, sizeof(qemusig)) )
-    {
-        PERROR("Error reading QEMU signature");
-        goto out;
-    }
-
-    if ( !memcmp(qemusig, "DeviceModelRecord0002", sizeof(qemusig)) )
-    {
-        if ( read_exact(ctx->fd, &qlen, sizeof(qlen)) )
-        {
-            PERROR("Error reading QEMU record length");
-            goto out;
-        }
-
-        qbuf = malloc(qlen);
-        if ( !qbuf )
-        {
-            PERROR("no memory for device model state");
-            goto out;
-        }
-
-        if ( read_exact(ctx->fd, qbuf, qlen) )
-        {
-            PERROR("Error reading device model state");
-            goto out;
-        }
-    }
-    else
-    {
-        ERROR("Invalid device model state signature '%*.*s'",
-              (int)sizeof(qemusig), (int)sizeof(qemusig), qemusig);
-        goto out;
-    }
-
-    /* With Remus, this could be read many times */
-    if ( ctx->x86_hvm.restore.qbuf )
-        free(ctx->x86_hvm.restore.qbuf);
-    ctx->x86_hvm.restore.qbuf = qbuf;
-    ctx->x86_hvm.restore.qlen = qlen;
-    rc = 0;
-
-out:
-    if (rc)
-        free(qbuf);
-    return rc;
-}
-
-static int handle_qemu(struct xc_sr_context *ctx)
-{
-    xc_interface *xch = ctx->xch;
-    char path[256];
-    uint32_t qlen = ctx->x86_hvm.restore.qlen;
-    void *qbuf = ctx->x86_hvm.restore.qbuf;
-    int rc = -1;
-    FILE *fp = NULL;
-
-    sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE".%u", ctx->domid);
-    fp = fopen(path, "wb");
-    if ( !fp )
-    {
-        PERROR("Failed to open '%s' for writing", path);
-        goto out;
-    }
-
-    DPRINTF("Writing %u bytes of QEMU data", qlen);
-    if ( fwrite(qbuf, 1, qlen, fp) != qlen )
-    {
-        PERROR("Failed to write %u bytes of QEMU data", qlen);
-        goto out;
-    }
-
-    rc = 0;
-
- out:
-    if ( fp )
-        fclose(fp);
-    free(qbuf);
-
-    return rc;
-}
-#endif
-
 /* restore_ops function. */
 static bool x86_hvm_pfn_is_valid(const struct xc_sr_context *ctx, xen_pfn_t pfn)
 {
@@ -260,11 +150,6 @@ static int x86_hvm_process_record(struct xc_sr_context *ctx,
     case REC_TYPE_HVM_PARAMS:
         return handle_hvm_params(ctx, rec);
 
-#ifdef XG_LIBXL_HVM_COMPAT
-    case REC_TYPE_TOOLSTACK:
-        return handle_toolstack(ctx, rec);
-#endif
-
     default:
         return RECORD_NOT_PROCESSED;
     }
@@ -314,15 +199,6 @@ static int x86_hvm_stream_complete(struct xc_sr_context *ctx)
         return rc;
     }
 
-#ifdef XG_LIBXL_HVM_COMPAT
-    rc = handle_qemu(ctx);
-    if ( rc )
-    {
-        ERROR("Failed to dump qemu");
-        return rc;
-    }
-#endif
-
     return rc;
 }
 
diff --git a/tools/libxc/xc_sr_save_x86_hvm.c b/tools/libxc/xc_sr_save_x86_hvm.c
index f4604db..cdee774 100644
--- a/tools/libxc/xc_sr_save_x86_hvm.c
+++ b/tools/libxc/xc_sr_save_x86_hvm.c
@@ -118,36 +118,6 @@ static int write_hvm_params(struct xc_sr_context *ctx)
     return rc;
 }
 
-#ifdef XG_LIBXL_HVM_COMPAT
-static int write_toolstack(struct xc_sr_context *ctx)
-{
-    xc_interface *xch = ctx->xch;
-    struct xc_sr_record rec = {
-        .type = REC_TYPE_TOOLSTACK,
-        .length = 0,
-    };
-    uint8_t *buf;
-    uint32_t len;
-    int rc;
-
-    if ( !ctx->save.callbacks || !ctx->save.callbacks->toolstack_save )
-        return 0;
-
-    if ( ctx->save.callbacks->toolstack_save(
-             ctx->domid, &buf, &len, ctx->save.callbacks->data) < 0 )
-    {
-        PERROR("Error calling toolstack_save");
-        return -1;
-    }
-
-    rc = write_split_record(ctx, &rec, buf, len);
-    if ( rc < 0 )
-        PERROR("Error writing TOOLSTACK record");
-    free(buf);
-    return rc;
-}
-#endif
-
 static xen_pfn_t x86_hvm_pfn_to_gfn(const struct xc_sr_context *ctx,
                                     xen_pfn_t pfn)
 {
@@ -199,12 +169,6 @@ static int x86_hvm_end_of_checkpoint(struct xc_sr_context *ctx)
     if ( rc )
         return rc;
 
-#ifdef XG_LIBXL_HVM_COMPAT
-    rc = write_toolstack(ctx);
-    if ( rc )
-        return rc;
-#endif
-
     /* Write the HVM_CONTEXT record. */
     rc = write_hvm_context(ctx);
     if ( rc )
-- 
1.7.10.4


* [PATCH 27/27] tools/libxl: Drop all knowledge of toolstack callbacks
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (25 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 26/27] tools/libxc: Drop all XG_LIBXL_HVM_COMPAT code from libxc Andrew Cooper
@ 2015-06-15 13:44 ` Andrew Cooper
  2015-06-16 15:04   ` Ian Campbell
  2015-06-16  2:21 ` [PATCH 00/27] Libxl migration v2 Yang Hongyang
                   ` (2 subsequent siblings)
  29 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-15 13:44 UTC (permalink / raw)
  To: Xen-devel
  Cc: Wei Liu, Yang Hongyang, Ian Jackson, Ian Campbell, Andrew Cooper

Libxl has now been fully adjusted not to need them.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxl/libxl_dom.c            |    1 -
 tools/libxl/libxl_internal.h       |    2 --
 tools/libxl/libxl_save_callout.c   |   39 +-----------------------------------
 tools/libxl/libxl_save_helper.c    |   29 ---------------------------
 tools/libxl/libxl_save_msgs_gen.pl |    7 ++-----
 5 files changed, 3 insertions(+), 75 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 3597a91..43915a2 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -2036,7 +2036,6 @@ void libxl__domain_suspend(libxl__egc *egc, libxl__domain_suspend_state *dss)
         callbacks->suspend = libxl__domain_suspend_callback;
 
     callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty;
-    dss->shs.callbacks.save.toolstack_save = libxl__toolstack_save;
 
     dss->sws.fd = dss->fd;
     dss->sws.ao = dss->ao;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index e271a0b..e0f6e09 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2611,8 +2611,6 @@ _hidden void libxl__datacopier_prefixdata(libxl__egc*, libxl__datacopier_state*,
 
 typedef struct libxl__srm_save_callbacks {
     libxl__srm_save_autogen_callbacks a;
-    int (*toolstack_save)(uint32_t domid, uint8_t **buf,
-                          uint32_t *len, void *data);
 } libxl__srm_save_callbacks;
 
 typedef struct libxl__srm_restore_callbacks {
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 0579372..02e0190 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -77,41 +77,12 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
 void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
 {
     STATE_AO_GC(dss->ao);
-    int r, rc, toolstack_data_fd = -1;
-    uint32_t toolstack_data_len = 0;
-
-    /* Resources we need to free */
-    uint8_t *toolstack_data_buf = 0;
 
     unsigned cbflags = libxl__srm_callout_enumcallbacks_save
         (&dss->shs.callbacks.save.a);
 
-    if (dss->shs.callbacks.save.toolstack_save) {
-        r = dss->shs.callbacks.save.toolstack_save
-            (dss->domid, &toolstack_data_buf, &toolstack_data_len, dss);
-        if (r) { rc = ERROR_FAIL; goto out; }
-
-        dss->shs.toolstack_data_file = tmpfile();
-        if (!dss->shs.toolstack_data_file) {
-            LOGE(ERROR, "cannot create toolstack data tmpfile");
-            rc = ERROR_FAIL;
-            goto out;
-        }
-        toolstack_data_fd = fileno(dss->shs.toolstack_data_file);
-
-        r = libxl_write_exactly(CTX, toolstack_data_fd,
-                                toolstack_data_buf, toolstack_data_len,
-                                "toolstack data tmpfile", 0);
-        if (r) { rc = ERROR_FAIL; goto out; }
-
-        /* file position must be reset before passing to libxl-save-helper. */
-        r = lseek(toolstack_data_fd, 0, SEEK_SET);
-        if (r) { rc = ERROR_FAIL; goto out; }
-    }
-
     const unsigned long argnums[] = {
         dss->domid, 0, 0, dss->xcflags, dss->hvm,
-        toolstack_data_fd, toolstack_data_len,
         cbflags,
     };
 
@@ -122,18 +93,10 @@ void libxl__xc_domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
     dss->shs.caller_state = dss;
     dss->shs.need_results = 0;
 
-    free(toolstack_data_buf);
-
     run_helper(egc, &dss->shs, "--save-domain", dss->fd,
-               &toolstack_data_fd, 1,
+               NULL, 0,
                argnums, ARRAY_SIZE(argnums));
     return;
-
- out:
-    free(toolstack_data_buf);
-    if (dss->shs.toolstack_data_file) fclose(dss->shs.toolstack_data_file);
-
-    libxl__xc_domain_save_done(egc, dss, rc, 0, 0);
 }
 
 
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index 4cc93a2..b8f0390 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -211,32 +211,8 @@ int helper_getreply(void *user)
 
 /*----- other callbacks -----*/
 
-static int toolstack_save_fd;
-static uint32_t toolstack_save_len;
 static struct save_callbacks helper_save_callbacks;
 
-static int toolstack_save_cb(uint32_t domid, uint8_t **buf,
-                             uint32_t *len, void *data)
-{
-    int r;
-
-    assert(toolstack_save_fd > 0);
-
-    /* This is a hack for remus */
-    if (helper_save_callbacks.checkpoint) {
-        r = lseek(toolstack_save_fd, 0, SEEK_SET);
-        if (r) fail(errno,"rewind toolstack data tmpfile");
-    }
-
-    *buf = xmalloc(toolstack_save_len);
-    r = read_exactly(toolstack_save_fd, *buf, toolstack_save_len);
-    if (r<0) fail(errno,"read toolstack data");
-    if (r==0) fail(0,"read toolstack data eof");
-
-    *len = toolstack_save_len;
-    return 0;
-}
-
 static void startup(const char *op) {
     xtl_log(&logger,XTL_DEBUG,0,program,"starting %s",op);
 
@@ -271,14 +247,9 @@ int main(int argc, char **argv)
         uint32_t max_factor =      strtoul(NEXTARG,0,10);
         uint32_t flags =           strtoul(NEXTARG,0,10);
         int hvm =                  atoi(NEXTARG);
-        toolstack_save_fd  =       atoi(NEXTARG);
-        toolstack_save_len =       strtoul(NEXTARG,0,10);
         unsigned cbflags =         strtoul(NEXTARG,0,10);
         assert(!*++argv);
 
-        if (toolstack_save_fd >= 0)
-            helper_save_callbacks.toolstack_save = toolstack_save_cb;
-
         helper_setcallbacks_save(&helper_save_callbacks, cbflags);
 
         startup("save");
diff --git a/tools/libxl/libxl_save_msgs_gen.pl b/tools/libxl/libxl_save_msgs_gen.pl
index 36b279e..dc17c6b 100755
--- a/tools/libxl/libxl_save_msgs_gen.pl
+++ b/tools/libxl/libxl_save_msgs_gen.pl
@@ -28,12 +28,9 @@ our @msgs = (
     [  5, 'srcxA',   "checkpoint", [] ],
     [  6, 'scxA',   "switch_qemu_logdirty",  [qw(int domid
                                               unsigned enable)] ],
-    #                toolstack_save          done entirely `by hand'
-    [  7, 'rcxW',   "toolstack_restore",     [qw(uint32_t domid
-                                                BLOCK tsdata)] ],
-    [  8, 'r',      "restore_results",       ['unsigned long', 'store_mfn',
+    [  7, 'r',      "restore_results",       ['unsigned long', 'store_mfn',
                                               'unsigned long', 'console_mfn'] ],
-    [  9, 'srW',    "complete",              [qw(int retval
+    [  8, 'srW',    "complete",              [qw(int retval
                                                  int errnoval)] ],
 );
 
-- 
1.7.10.4


* Re: [PATCH 00/27]  Libxl migration v2
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (26 preceding siblings ...)
  2015-06-15 13:44 ` [PATCH 27/27] tools/libxl: Drop all knowledge of toolstack callbacks Andrew Cooper
@ 2015-06-16  2:21 ` Yang Hongyang
  2015-06-17  1:55 ` Wen Congyang
  2015-07-02  7:33 ` Yang Hongyang
  29 siblings, 0 replies; 107+ messages in thread
From: Yang Hongyang @ 2015-06-16  2:21 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Wei Liu, Ian Jackson, Ian Campbell



On 06/15/2015 09:44 PM, Andrew Cooper wrote:
> This series adds support for the libxl migration v2 stream, and untangles the
> existing layering violations of the toolstack and qemu records.
>
> At the end of the series, legacy migration is no longer used.
>
> Note: Remus support is broken and (RFC) fixed in separate patches in this
> series.  It was too tangled to fix in a bisectable fashon.  Plain
> suspend/migrate/resume however is (should be) bisectable along the entire
> series.

From a quick test on both PV and HVM, Remus support is still broken. The Remus
save/restore part works, but failover does not. To solve this:

On the libxl side:
1. Buffer the toolstack and qemu records at each checkpoint.
2. If the stream read fails on the xl side, drop the buffered records and
    return an error code that indicates a failover.
3. Once the whole stream has been buffered (xl side), process/apply the
    toolstack and qemu records and return success.
4. If applying the toolstack and qemu records fails, return an error.

On the libxc side:
Check the return value of the checkpoint callback; if it indicates a failover,
do the failover.
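
Roughly, the libxl-side steps above could look like the following. This is only
an untested sketch to illustrate the control flow: the names (ckpt_state,
ckpt_buffer_record, ckpt_finish, apply_nonnegative) and the int "records" are
made up for illustration and are not existing libxl code, and the result values
are just one possible convention.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes for the checkpoint callback. */
enum ckpt_result { CKPT_ERROR = 0, CKPT_SUCCESS = 1, CKPT_FAILOVER = 2 };

#define MAX_BUFFERED 16

/* Stand-in for the buffered toolstack/qemu records of one checkpoint. */
struct ckpt_state {
    int buffered[MAX_BUFFERED];
    size_t nr_buffered;
    size_t nr_applied;
};

/* Step 1: buffer a record instead of applying it straight away. */
static bool ckpt_buffer_record(struct ckpt_state *s, int rec)
{
    if (s->nr_buffered == MAX_BUFFERED)
        return false;
    s->buffered[s->nr_buffered++] = rec;
    return true;
}

/* Illustrative "apply" step: pretend negative records fail to apply. */
static bool apply_nonnegative(int rec)
{
    return rec >= 0;
}

/* Steps 2-4: decide what to report once the checkpoint has ended. */
static enum ckpt_result ckpt_finish(struct ckpt_state *s, bool stream_ok,
                                    bool (*apply)(int rec))
{
    size_t i;

    if (!stream_ok) {
        /* Step 2: stream read failed, drop the buffer and ask for failover. */
        s->nr_buffered = 0;
        return CKPT_FAILOVER;
    }

    for (i = 0; i < s->nr_buffered; i++) {
        if (!apply(s->buffered[i])) {
            /* Step 4: applying a buffered record failed. */
            s->nr_buffered = 0;
            return CKPT_ERROR;
        }
        s->nr_applied++;
    }

    /* Step 3: everything was buffered and applied. */
    s->nr_buffered = 0;
    return CKPT_SUCCESS;
}
```

The point of buffering is that a half-received checkpoint never touches the
domain, so the previous consistent state is still there to fail over to.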

>
> There are a couple of outstanding questions:
>
> 1) What to do about the toolstack/xenstore record.  It is currently by being
>     passed around as a blob, but it might be better to split it out.
>
> 2) What (if any) ABI/API qualifications are needed? (Particularly in reference
>     to patch 21)
>
> The Remus code is untested by me, but is hopefully in the correct ballpark.
> All other combinations of suspend/migrate/resume have been tested with PV and
> HVM guests (qemu-trad and qemu-upstream), including 32 -> 64 bit migration
> (which was the underlying bug causing us to write migration v2 in the first
> place).
>
> There are some further improvements which could be made.  In particular, it
> appears that sending the toolstack record on each checkpoint is redundant, and
> there is certainly room for some more pruning of the legacy migration code.
>
> Anyway, thoughts/comments welcome.  Please test!
>
> ~Andrew
>
>
> Andrew Cooper (22):
>    tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
>    tools/libxc: Always compile the compat qemu variables into xc_sr_context
>    tools/libxl: Stash all restore parameters in domain_create_state
>    tools/xl: Mandatory flag indicating the format of the migration stream
>    tools/libxl: Introduce ROUNDUP()
>    tools/libxl: Extra APIs for the save helper
>    tools/libxl: Pass restore_fd as a parameter to libxl__xc_domain_restore()
>    docs: Libxl migration v2 stream specification
>    tools/python: Libxc migration v2 infrastructure
>    tools/python: Libxl migration v2 infrastructure
>    tools/python: Verification utility for v2 stream spec compliance
>    tools/python: Conversion utility for legacy migration streams
>    tools/libxl: Support converting a legacy stream to a v2 stream
>    tools/libxl: Convert a legacy stream if needed
>    tools/libxc+libxl+xl: Restore v2 streams
>    tools/libxc+libxl+xl: Save v2 streams
>    docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams
>    tools/libxl: [RFC] Write checkpoint records into the stream
>    tools/libx{c,l}: [RFC] Introduce restore_callbacks.checkpoint()
>    tools/libxl: [RFC] Handle checkpoint records in a libxl migration v2 stream
>    tools/libxc: Drop all XG_LIBXL_HVM_COMPAT code from libxc
>    tools/libxl: Drop all knowledge of toolstack callbacks
>
> Ian Jackson (2):
>    libxl: cancellation: Preparations for save/restore cancellation
>    libxl: cancellation: Handle SIGTERM in save/restore helper
>
> Ross Lagerwall (3):
>    tools/libxl: Migration v2 stream format
>    tools/libxl: Infrastructure for reading a libxl migration v2 stream
>    tools/libxl: Infrastructure for writing a v2 stream
>
>   docs/specs/libxl-migration-stream.pandoc      |  218 ++++++++
>   tools/libxc/Makefile                          |    2 -
>   tools/libxc/include/xenguest.h                |    3 +
>   tools/libxc/xc_sr_common.h                    |    5 -
>   tools/libxc/xc_sr_restore.c                   |   33 +-
>   tools/libxc/xc_sr_restore_x86_hvm.c           |  124 -----
>   tools/libxc/xc_sr_save_x86_hvm.c              |   36 --
>   tools/libxl/Makefile                          |    2 +
>   tools/libxl/libxl_aoutils.c                   |    7 +
>   tools/libxl/libxl_convert_callout.c           |  146 ++++++
>   tools/libxl/libxl_create.c                    |   80 +--
>   tools/libxl/libxl_dom.c                       |   61 +--
>   tools/libxl/libxl_internal.h                  |  140 ++++-
>   tools/libxl/libxl_save_callout.c              |   63 +--
>   tools/libxl/libxl_save_helper.c               |   95 ++--
>   tools/libxl/libxl_save_msgs_gen.pl            |    9 +-
>   tools/libxl/libxl_sr_stream_format.h          |   58 +++
>   tools/libxl/libxl_stream_read.c               |  663 ++++++++++++++++++++++++
>   tools/libxl/libxl_stream_write.c              |  640 +++++++++++++++++++++++
>   tools/libxl/libxl_types.idl                   |    2 +
>   tools/libxl/xl_cmdimpl.c                      |    9 +-
>   tools/python/Makefile                         |    4 +
>   tools/python/scripts/convert-legacy-stream.py |  683 +++++++++++++++++++++++++
>   tools/python/scripts/verify-stream-v2.py      |  174 +++++++
>   tools/python/setup.py                         |    1 +
>   tools/python/xen/migration/libxc.py           |  446 ++++++++++++++++
>   tools/python/xen/migration/libxl.py           |  199 +++++++
>   tools/python/xen/migration/tests.py           |   54 ++
>   tools/python/xen/migration/verify.py          |   37 ++
>   29 files changed, 3638 insertions(+), 356 deletions(-)
>   create mode 100644 docs/specs/libxl-migration-stream.pandoc
>   create mode 100644 tools/libxl/libxl_convert_callout.c
>   create mode 100644 tools/libxl/libxl_sr_stream_format.h
>   create mode 100644 tools/libxl/libxl_stream_read.c
>   create mode 100644 tools/libxl/libxl_stream_write.c
>   create mode 100755 tools/python/scripts/convert-legacy-stream.py
>   create mode 100755 tools/python/scripts/verify-stream-v2.py
>   create mode 100644 tools/python/xen/migration/__init__.py
>   create mode 100644 tools/python/xen/migration/libxc.py
>   create mode 100644 tools/python/xen/migration/libxl.py
>   create mode 100644 tools/python/xen/migration/tests.py
>   create mode 100644 tools/python/xen/migration/verify.py
>

-- 
Thanks,
Yang.


* Re: [PATCH 24/27] tools/libx{c, l}: [RFC] Introduce restore_callbacks.checkpoint()
  2015-06-15 13:44 ` [PATCH 24/27] tools/libx{c, l}: [RFC] Introduce restore_callbacks.checkpoint() Andrew Cooper
@ 2015-06-16  2:23   ` Yang Hongyang
  2015-06-17  8:20   ` Yang Hongyang
  1 sibling, 0 replies; 107+ messages in thread
From: Yang Hongyang @ 2015-06-16  2:23 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Wei Liu, Ian Jackson, Ian Campbell



On 06/15/2015 09:44 PM, Andrew Cooper wrote:
> And call it when a checkpoint record is found in the libxc stream.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>   tools/libxc/include/xenguest.h     |    3 +++
>   tools/libxc/xc_sr_restore.c        |   15 ++++++++++++++-
>   tools/libxl/libxl_save_msgs_gen.pl |    2 +-
>   3 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
> index 7581263..b0d27ed 100644
> --- a/tools/libxc/include/xenguest.h
> +++ b/tools/libxc/include/xenguest.h
> @@ -102,6 +102,9 @@ struct restore_callbacks {
>       int (*toolstack_restore)(uint32_t domid, const uint8_t *buf,
>               uint32_t size, void* data);
>
> +    /* A checkpoint record has been found in the stream */

Please describe the return value, e.g.:
2 failover
1 success
0 error
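
As a rough illustration of that convention (untested, and all of the names
below are hypothetical rather than anything in xenguest.h), libxc could map the
callback's return value to an action like this:

```c
/* Hypothetical names; the values follow the 2/1/0 suggestion. */
enum checkpoint_result {
    CHECKPOINT_ERROR    = 0,
    CHECKPOINT_SUCCESS  = 1,
    CHECKPOINT_FAILOVER = 2,
};

/* What the restore loop would do next (also hypothetical). */
enum restore_action {
    RESTORE_ABORT,
    RESTORE_CONTINUE,
    RESTORE_FAILOVER,
};

/* Anything that is not an explicit success or failover is an error. */
static enum restore_action classify_checkpoint(int ret)
{
    switch (ret) {
    case CHECKPOINT_SUCCESS:
        return RESTORE_CONTINUE;
    case CHECKPOINT_FAILOVER:
        return RESTORE_FAILOVER;
    default:
        return RESTORE_ABORT;
    }
}
```

Treating unknown values as errors keeps the failover path limited to the case
where the toolstack explicitly asked for it.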

> +    int (*checkpoint)(void* data);
> +
>       /* to be provided as the last argument to each callback function */
>       void* data;
>   };
> diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
> index 9e27dba..5e0f817 100644
> --- a/tools/libxc/xc_sr_restore.c
> +++ b/tools/libxc/xc_sr_restore.c
> @@ -1,5 +1,7 @@
>   #include <arpa/inet.h>
>
> +#include <assert.h>
> +
>   #include "xc_sr_common.h"
>
>   /*
> @@ -472,7 +474,7 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
>   static int handle_checkpoint(struct xc_sr_context *ctx)
>   {
>       xc_interface *xch = ctx->xch;
> -    int rc = 0;
> +    int rc = 0, ret;
>       unsigned i;
>
>       if ( !ctx->restore.checkpointed )
> @@ -482,6 +484,13 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
>           goto err;
>       }
>
> +    ret = ctx->restore.callbacks->checkpoint(ctx->restore.callbacks->data);

Should check whether we need to failover.

> +    if ( ret )
> +    {
> +        rc = -1;
> +        goto err;
> +    }
> +
>       if ( ctx->restore.buffer_all_records )
>       {
>           IPRINTF("All records buffered");
> @@ -735,6 +744,10 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
>       ctx.restore.checkpointed = checkpointed_stream;
>       ctx.restore.callbacks = callbacks;
>
> +    /* Sanity checks for callbacks. */
> +    if (checkpointed_stream)
> +        assert(callbacks->checkpoint);
> +
>       IPRINTF("In experimental %s", __func__);
>       DPRINTF("fd %d, dom %u, hvm %u, pae %u, superpages %d"
>               ", checkpointed_stream %d", io_fd, dom, hvm, pae,
> diff --git a/tools/libxl/libxl_save_msgs_gen.pl b/tools/libxl/libxl_save_msgs_gen.pl
> index 6b4b65e..36b279e 100755
> --- a/tools/libxl/libxl_save_msgs_gen.pl
> +++ b/tools/libxl/libxl_save_msgs_gen.pl
> @@ -25,7 +25,7 @@ our @msgs = (
>                                                   'unsigned long', 'total'] ],
>       [  3, 'scxA',   "suspend", [] ],
>       [  4, 'scxA',   "postcopy", [] ],
> -    [  5, 'scxA',   "checkpoint", [] ],
> +    [  5, 'srcxA',   "checkpoint", [] ],
>       [  6, 'scxA',   "switch_qemu_logdirty",  [qw(int domid
>                                                 unsigned enable)] ],
>       #                toolstack_save          done entirely `by hand'
>
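
The three-way return convention suggested above could be sketched as
follows.  This is only an illustration: the enum names and the
classify_checkpoint() helper are hypothetical and are not part of libxc.
Note also that the patch as posted treats any non-zero return from the
callback as an error, so adopting this convention would mean changing
the check in handle_checkpoint().

```c
/* Hypothetical names for the three outcomes proposed in the review. */
enum checkpoint_result {
    CHECKPOINT_ERROR    = 0,
    CHECKPOINT_SUCCESS  = 1,
    CHECKPOINT_FAILOVER = 2,
};

/* Hypothetical actions the restore loop could take afterwards. */
enum restore_action {
    RESTORE_ABORT,     /* give up on the restore */
    RESTORE_CONTINUE,  /* keep processing records */
    RESTORE_FAILOVER,  /* stop reading and resume from the last checkpoint */
};

/* Map a callback return value onto an action for the caller. */
static enum restore_action classify_checkpoint(int ret)
{
    switch (ret) {
    case CHECKPOINT_SUCCESS:
        return RESTORE_CONTINUE;
    case CHECKPOINT_FAILOVER:
        return RESTORE_FAILOVER;
    default:
        /* 0 (error) and any unrecognised value are treated as fatal. */
        return RESTORE_ABORT;
    }
}
```

Treating unknown values as fatal keeps the caller safe if the set of
return codes grows later.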

-- 
Thanks,
Yang.

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
  2015-06-15 13:44 ` [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children Andrew Cooper
@ 2015-06-16 13:21   ` Ian Campbell
  2015-06-16 13:36     ` Andrew Cooper
  2015-06-16 13:39     ` Ian Jackson
  0 siblings, 2 replies; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 13:21 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> Shortly, libxl will be juggling multiple parallel operations, and will
> possibly have to take error decisions before some tasks have been set up.

It would be preferable, I think, to arrange to call libxl__ev_child_init
on all such libxl__ev_child structs either up front or certainly before
there is any possibility of needing to unwind them.

Such an init would at worst correspond to exactly the place where the
zeroed structure you refer to is zeroed.

> No child process of libxl will ever have a pid of 0, so gate
> libxl__ev_child_inuse() on a pid strictly greater than 0.
> 
> This makes it safe to use on a zeroed structure of a task which has not yet
> been set up.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> 
> ---
> This change does make libxl__ev_child_init() functionally useless.  I am
> undecided between leaving it in place in case it is useful in the future, or to
> remove it completely.
> ---
>  tools/libxl/libxl_internal.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index e96d6b5..6226c18 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -880,7 +880,7 @@ _hidden pid_t libxl__ev_child_fork(libxl__gc *gc, libxl__ev_child *childw_out,
>  static inline void libxl__ev_child_init(libxl__ev_child *childw_out)
>                  { childw_out->pid = -1; }
>  static inline int libxl__ev_child_inuse(const libxl__ev_child *childw_out)
> -                { return childw_out->pid >= 0; }
> +                { return childw_out->pid > 0; }
>  
>  /* Useable (only) in the child to once more make the ctx useable for
>   * xenstore operations.  logs failure in the form "what: <error
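
The effect of the one-character change in the hunk above can be shown
with a minimal sketch.  struct ev_child below is a hypothetical stand-in
for libxl__ev_child that keeps only the pid field; the two predicates
mirror the old and new bodies of libxl__ev_child_inuse().

```c
#include <sys/types.h>

/* Hypothetical stand-in for libxl__ev_child; only pid matters here. */
struct ev_child {
    pid_t pid;
};

/* Mirrors libxl__ev_child_init() in the quoted hunk: -1 means idle. */
static void ev_child_init(struct ev_child *c)
{
    c->pid = -1;
}

/* Predicate before the patch: a zeroed structure looks "in use". */
static int inuse_old(const struct ev_child *c)
{
    return c->pid >= 0;
}

/* Predicate after the patch: a zeroed structure looks idle. */
static int inuse_new(const struct ev_child *c)
{
    return c->pid > 0;
}
```

For a structure cleared with memset() and never passed to
ev_child_init(), inuse_old() returns 1 while inuse_new() returns 0.
Both agree for an initialised structure (pid of -1) and for a real
child pid.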

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 02/27] tools/libxc: Always compile the compat qemu variables into xc_sr_context
  2015-06-15 13:44 ` [PATCH 02/27] tools/libxc: Always compile the compat qemu variables into xc_sr_context Andrew Cooper
@ 2015-06-16 13:22   ` Ian Campbell
  0 siblings, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 13:22 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> This is safe (as the variable will simply be unused), and is required for
> correct compilation when midway through untangling the libxc/libxl
> interaction.
> 
> The #define is left in place to highlight that the variables can be removed
> once the untangling is complete.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
  2015-06-16 13:21   ` Ian Campbell
@ 2015-06-16 13:36     ` Andrew Cooper
  2015-06-16 13:47       ` Ian Jackson
  2015-06-16 15:24       ` Ian Campbell
  2015-06-16 13:39     ` Ian Jackson
  1 sibling, 2 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-16 13:36 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 16/06/15 14:21, Ian Campbell wrote:
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>> Shortly, libxl will be juggling multiple parallel operations, and will
>> possibly have to take error decisions before some tasks have been set up.
> It would be preferable, I think, to arrange to call libxl__ev_child_init
> on all such libxl__ev_child structs either up front or certainly before
> there is any possibility of needing to unwind them.
>
> Such an init would at worst correspond to exactly the place where the
> zeroed structure you refer to is zeroed.

It is possible that one bit fails before it can be calculated whether
the second bit needs to start or not.

At the moment, all bits in libxl in this area do initialisation
immediately before use; most bits are even initialised in the function
which starts their actions.  Some bits are initialised differently
depending on the path taken to get to the initialisation site. 

It would be non-trivial to initialise everything appropriately at the
very start.

~Andrew

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state
  2015-06-15 13:44 ` [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state Andrew Cooper
@ 2015-06-16 13:37   ` Ian Campbell
  2015-06-16 14:09     ` Andrew Cooper
  2015-06-18  2:32   ` Yang Hongyang
  1 sibling, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 13:37 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> Shortly more parameters will appear, and this saves unboxing each one.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/libxl_create.c       |   12 ++++++------
>  tools/libxl/libxl_internal.h     |    2 +-
>  tools/libxl/libxl_save_callout.c |    2 +-
>  3 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index 86384d2..385891c 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -1577,8 +1577,8 @@ static void domain_create_cb(libxl__egc *egc,
>                               int rc, uint32_t domid);
>  
>  static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
> -                            uint32_t *domid,
> -                            int restore_fd, int checkpointed_stream,
> +                            uint32_t *domid, int restore_fd,
> +                            const libxl_domain_restore_params *params,
>                              const libxl_asyncop_how *ao_how,
>                              const libxl_asyncprogress_how *aop_console_how)
>  {
> @@ -1591,8 +1591,8 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
>      libxl_domain_config_init(&cdcs->dcs.guest_config_saved);
>      libxl_domain_config_copy(ctx, &cdcs->dcs.guest_config_saved, d_config);
>      cdcs->dcs.restore_fd = restore_fd;
> +    if (params) cdcs->dcs.restore_params = *params;

Is this eventually going to become non-optional? I think not and its
validity is entirely intertwined with the validity of restore_fd (as I
suspect it was before, but I've not checked).

Perhaps an error check to that effect would be useful?

Anyway, I think what you've done here is correct, so:
        Acked-by: Ian Campbell <ian.campbell@citrix.com>
        
[...]
> @@ -3122,11 +3122,11 @@ struct libxl__domain_create_state {
>      libxl_domain_config *guest_config;
>      libxl_domain_config guest_config_saved; /* vanilla config */
>      int restore_fd;
> +    libxl_domain_restore_params restore_params;
>      libxl__domain_create_cb *callback;
>      libxl_asyncprogress_how aop_console_how;
>      /* private to domain_create */
>      int guest_domid;
> -    int checkpointed_stream;

This has, in effect moved from "private to domain_create" to "filled in
by user", I don't think the change here has actually changed its status,
but I suspect it was wrong before (alternatively restore_fd is in the
wrong place instead).

Ian.

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 04/27] tools/xl: Mandatory flag indicating the format of the migration stream
  2015-06-15 13:44 ` [PATCH 04/27] tools/xl: Mandatory flag indicating the format of the migration stream Andrew Cooper
@ 2015-06-16 13:39   ` Ian Campbell
  2015-06-16 14:10     ` Andrew Cooper
  0 siblings, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 13:39 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> Introduced at this point so the python stream conversion code has a concrete
> ABI to use.

Please could you also explicitly mention that it isn't added to FLAG_ALL
yet because we don't actually implement it yet and that it will be added
there later.

With that:
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>

(although I wish I had a better name in mind...)

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 05/27] tools/libxl: Introduce ROUNDUP()
  2015-06-15 13:44 ` [PATCH 05/27] tools/libxl: Introduce ROUNDUP() Andrew Cooper
@ 2015-06-16 13:39   ` Ian Campbell
  0 siblings, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 13:39 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> This is the same as is used by libxc.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
  2015-06-16 13:21   ` Ian Campbell
  2015-06-16 13:36     ` Andrew Cooper
@ 2015-06-16 13:39     ` Ian Jackson
  1 sibling, 0 replies; 107+ messages in thread
From: Ian Jackson @ 2015-06-16 13:39 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Andrew Cooper, Yang Hongyang, Wei Liu, Xen-devel

Ian Campbell writes ("Re: [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children"):
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> > Shortly, libxl will be juggling multiple parallel operations, and will
> > possibly have to take error decisions before some tasks have been set up.
> 
> It would be preferable, I think, to arrange to call libxl__ev_child_init
> on all such libxl__ev_child structs either up front or certainly before
> there is any possibility of needing to unwind them.

Yes.

> Such an init would at worst correspond to exactly the place where the
> zeroed structure you refer to is zeroed.

I would welcome a patch which caused an assertion failure if ->pid==0.
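
A rough sketch of such an assertion, assuming every structure is passed
through the init function (which sets pid to -1) before it is queried.
struct ev_child and ev_child_inuse_strict() are hypothetical names, not
the libxl ones.

```c
#include <assert.h>
#include <sys/types.h>

/* Hypothetical stand-in for libxl__ev_child; only pid matters here. */
struct ev_child {
    pid_t pid;
};

/*
 * Strict variant: a pid of exactly 0 can only come from a zeroed,
 * never-initialised structure, so treat it as a programming error
 * rather than silently reporting "not in use".
 */
static int ev_child_inuse_strict(const struct ev_child *c)
{
    assert(c->pid != 0);
    return c->pid > 0;
}
```

With this variant a forgotten init call aborts at the first query
instead of being papered over.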

Ian.

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
  2015-06-16 13:36     ` Andrew Cooper
@ 2015-06-16 13:47       ` Ian Jackson
  2015-06-16 14:05         ` Andrew Cooper
  2015-06-16 15:24       ` Ian Campbell
  1 sibling, 1 reply; 107+ messages in thread
From: Ian Jackson @ 2015-06-16 13:47 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Yang Hongyang, Wei Liu, Ian Campbell, Xen-devel

Andrew Cooper writes ("Re: [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children"):
> It is possible that one bit fails before it can be calculated whether
> the second bit needs to start or not.
> 
> At the moment, all bits in libxl in this area do initialisation
> immediately before use; most bits are even initialised in the function
> which starts their actions.  Some bits are initialised differently
> depending on the path taken to get to the initialisation site. 

As a rule of thumb a function libxl__initiate_foo_ which takes a
libxl__foo_state* should do this initialisation for the whole
libxl__foo_state.

I don't see why you can't do that.

Ian.

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 08/27] tools/libxl: Extra APIs for the save helper
  2015-06-15 13:44 ` [PATCH 08/27] tools/libxl: Extra APIs for the save helper Andrew Cooper
@ 2015-06-16 13:50   ` Ian Campbell
  2015-06-16 15:03     ` Andrew Cooper
  0 siblings, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 13:50 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> With libxl migration v2, there will be other moving parts which might fail,
> requiring the helper to be stopped for reasons which are not its fault.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/libxl_internal.h     |    8 ++++++++
>  tools/libxl/libxl_save_callout.c |   16 ++++++++++++++++
>  2 files changed, 24 insertions(+)
> 
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 4f204f9..3fcc37a 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -3182,6 +3182,14 @@ _hidden void libxl__xc_domain_restore(libxl__egc *egc,
>  _hidden void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
>                                             int rc, int retval, int errnoval);
>  
> +_hidden void libxl__save_helper_abort(libxl__egc *egc,
> +                                      libxl__save_helper_state *shs);
> +
> +static inline bool libxl__save_helper_inuse(const libxl__save_helper_state *shs)
> +{
> +    return libxl__ev_child_inuse(&shs->child);
> +}

Will this be used other than in libxl__save_helper_abort?

> +
>  /* Each time the dm needs to be saved, we must call suspend and then save */
>  _hidden int libxl__domain_suspend_device_model(libxl__gc *gc,
>                                             libxl__domain_suspend_state *dss);
> diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
> index 231de2f..71de297 100644
> --- a/tools/libxl/libxl_save_callout.c
> +++ b/tools/libxl/libxl_save_callout.c
> @@ -256,6 +256,22 @@ static void helper_failed(libxl__egc *egc, libxl__save_helper_state *shs,
>      libxl__kill(gc, shs->child.pid, SIGKILL, "save/restore helper");
>  }
>  
> +void libxl__save_helper_abort(libxl__egc *egc,
> +                              libxl__save_helper_state *shs)
> +{
> +    STATE_AO_GC(shs->ao);
> +
> +    if (!libxl__ev_child_inuse(&shs->child)) {
> +        helper_failed(egc, shs, ERROR_FAIL);

Did you mean helper_done here?

helper_failed deregisters the fd, and potentially sends SIGKILL,
although in this case it won't because it's not inuse.

So all this is doing is deregistering the fd, which AFAICT isn't
registered if there is no pid.

In fact, apart from the specific signal used, this function looks a lot
like helper_failed.

> +        return;
> +    }
> +
> +    if (!shs->rc)
> +        shs->rc = ERROR_FAIL;
> +
> +    libxl__kill(gc, shs->child.pid, SIGTERM, "save/restore helper");
> +}
> +
>  static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev,
>                                     int fd, short events, short revents)
>  {

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 09/27] tools/libxl: Pass restore_fd as a parameter to libxl__xc_domain_restore()
  2015-06-15 13:44 ` [PATCH 09/27] tools/libxl: Pass restore_fd as a parameter to libxl__xc_domain_restore() Andrew Cooper
@ 2015-06-16 13:53   ` Ian Campbell
  0 siblings, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 13:53 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> If a conversion of a legacy stream is needed, libxl__xc_domain_restore() will
> need to use an fd other than the one found in the domain_create_state.
> 
> No functional change.

It could be argued that the one in domain_create_state should always be
the correct one. If that means that it differs from the one ultimately
passed in by the user then the original ought to be saved in the state
associated with the filter as its input and the filter's output plumbed
into domain_create_state instead.

I'll reserve judgement until I see a bit more of how this pans out
through the series.

> @@ -41,13 +41,13 @@ static void helper_exited(libxl__egc *egc, libxl__ev_child *ch,
>  /*----- entrypoints -----*/
>  
>  void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
> +                              int restore_fd,

Assuming we go this way: can it remain const?


>                                int hvm, int pae, int superpages)
>  {
>      STATE_AO_GC(dcs->ao);
>  
>      /* Convenience aliases */
>      const uint32_t domid = dcs->guest_domid;
> -    const int restore_fd = dcs->restore_fd;
>      libxl__domain_build_state *const state = &dcs->build_state;
>  
>      unsigned cbflags = libxl__srm_callout_enumcallbacks_restore

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 10/27] docs: Libxl migration v2 stream specification
  2015-06-15 13:44 ` [PATCH 10/27] docs: Libxl migration v2 stream specification Andrew Cooper
@ 2015-06-16 13:58   ` Ian Campbell
  2015-07-08 13:49     ` Andrew Cooper
  0 siblings, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 13:58 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

> +EMULATOR\_CONTEXT
> +----------------
> +
> +A context blob for a specific emulator associated with the domain.
> +
> +     0     1     2     3     4     5     6     7 octet
> +    +------------------------+------------------------+
> +    | emulator_id            | index                  |
> +    +------------------------+------------------------+
> +    | emulator_ctx                                    |
> +    ...
> +    +-------------------------------------------------+
> +
> +--------------------------------------------------------------------
> +Field            Description
> +------------     ---------------------------------------------------
> +emulator_id      0x00000000: Unknown (In the case of a legacy stream)
> +
> +                 0x00000001: Qemu Traditional
> +
> +                 0x00000002: Qemu Upstream
> +
> +                 0x00000003 - 0xFFFFFFFF: Reserved for future emulators.

Would it be useful for future proofing to carve out some space for a
per-emulator version field too?

Otherwise LGTM.

One thought, it might be useful (here or elsewhere) to have an explicit
overview of the expected control flow (as in the ownership of the fd,
and/or nesting of the layers as you prefer to think about it) between
libxc, libxl and the next layer (i.e. xl).

Ian.

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 11/27] tools/python: Libxc migration v2 infrastructure
  2015-06-15 13:44 ` [PATCH 11/27] tools/python: Libxc migration v2 infrastructure Andrew Cooper
@ 2015-06-16 14:01   ` Ian Campbell
  0 siblings, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 14:01 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> Contains:
>  * Python implementation of the libxc migration v2 records
>  * Verification code for spec compliance
>  * Unit tests
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

For this, and the following libxl and verification patches:

Acked-by: Ian Campbell <ian.campbell@citrix.com>

(I've not read them closely on this pass, I recall looking at them
before)

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 14/27] tools/python: Conversion utility for legacy migration streams
  2015-06-15 13:44 ` [PATCH 14/27] tools/python: Conversion utility for legacy migration streams Andrew Cooper
@ 2015-06-16 14:01   ` Ian Campbell
  0 siblings, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 14:01 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> This utility will take a legacy stream as an input, and produce a v2 stream as
> an output.  It is exec()'d by libxl to provide backwards compatibility.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/python/Makefile                         |    4 +
>  tools/python/scripts/convert-legacy-stream.py |  683 +++++++++++++++++++++++++
>  2 files changed, 687 insertions(+)
>  create mode 100755 tools/python/scripts/convert-legacy-stream.py
> 
> diff --git a/tools/python/Makefile b/tools/python/Makefile
> index e933be8..531c862 100644
> --- a/tools/python/Makefile
> +++ b/tools/python/Makefile
> @@ -17,9 +17,13 @@ build: genwrap.py $(XEN_ROOT)/tools/libxl/libxl_types.idl \
>  
>  .PHONY: install
>  install:
> +	$(INSTALL_DIR) $(DESTDIR)$(PRIVATE_BINDIR)
> +
>  	CC="$(CC)" CFLAGS="$(PY_CFLAGS)" $(PYTHON) setup.py install \
>  		$(PYTHON_PREFIX_ARG) --root="$(DESTDIR)" --force
>  
> +	$(INSTALL_PROG) scripts/convert-legacy-stream.py $(DESTDIR)$(PRIVATE_BINDIR)

Please drop the .py on install, or from the source too if you prefer.
With that:

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 15/27] tools/libxl: Migration v2 stream format
  2015-06-15 13:44 ` [PATCH 15/27] tools/libxl: Migration v2 stream format Andrew Cooper
@ 2015-06-16 14:04   ` Ian Campbell
  0 siblings, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 14:04 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Ross Lagerwall, Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> From: Ross Lagerwall <ross.lagerwall@citrix.com>
> 
> C structures describing the Libxl migration v2 stream format

Do we think these should be internal or are they part of the library
API? I suppose it's a bit of a grey area, obviously the file format is
"ABI", but it's not one a user should ever interact with directly.

What I'm getting at is maybe most of this should be in the libxl__ rather
than libxl_ namespace (also some of it being un-namespaced would further
suggest these are strictly speaking internal, as would the fact it isn't
installed...)

> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/libxl_sr_stream_format.h |   57 ++++++++++++++++++++++++++++++++++
>  1 file changed, 57 insertions(+)
>  create mode 100644 tools/libxl/libxl_sr_stream_format.h
> 
> diff --git a/tools/libxl/libxl_sr_stream_format.h b/tools/libxl/libxl_sr_stream_format.h
> new file mode 100644
> index 0000000..487f9e2
> --- /dev/null
> +++ b/tools/libxl/libxl_sr_stream_format.h
> @@ -0,0 +1,57 @@
> +#ifndef LIBXL_SR_STREAM_FORMAT_H
> +#define LIBXL_SR_STREAM_FORMAT_H
> +
> +/*
> + * C structures for the Migration v2 stream format.
> + * See docs/specs/libxl-migration-stream.pandoc
> + */
> +
> +#include <stdint.h>
> +
> +typedef struct libxl_sr_hdr
> +{
> +    uint64_t ident;
> +    uint32_t version;
> +    uint32_t options;
> +} libxl_sr_hdr;
> +
> +#define RESTORE_STREAM_IDENT         0x4c6962786c466d74UL
> +#define RESTORE_STREAM_VERSION       0x00000002U
> +
> +#define RESTORE_OPT_BIG_ENDIAN       (1 << 0)
> +#define RESTORE_OPT_LEGACY           (1 << 1)
> +
> +
> +typedef struct libxl_sr_rec_hdr
> +{
> +    uint32_t type;
> +    uint32_t length;
> +} libxl_sr_rec_hdr;
> +
> +/* All records must be aligned up to an 8 octet boundary */
> +#define REC_ALIGN_ORDER              3U
> +
> +#define REC_TYPE_END                 0x00000000U
> +#define REC_TYPE_LIBXC_CONTEXT       0x00000001U
> +#define REC_TYPE_XENSTORE_DATA       0x00000002U
> +#define REC_TYPE_EMULATOR_CONTEXT    0x00000003U
> +
> +typedef struct libxl_sr_emulator_hdr
> +{
> +    uint32_t id;
> +    uint32_t index;
> +} libxl_sr_emulator_hdr;
> +
> +#define EMULATOR_UNKNOWN             0x00000000U
> +#define EMULATOR_QEMU_TRADITIONAL    0x00000001U
> +#define EMULATOR_QEMU_UPSTREAM       0x00000002U
> +
> +#endif /* LIBXL_SR_STREAM_FORMAT_H */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
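
Two details of the quoted header can be illustrated with a short sketch.
The constants are copied from the patch; rec_padded_length() and
ident_to_ascii() are hypothetical helpers written for this example and
do not exist in libxl.  The first pads a record body length up to the
8 octet boundary implied by REC_ALIGN_ORDER, and the second spells out
the ident constant one octet at a time, most significant first.

```c
#include <stdint.h>

/* Values copied from the quoted libxl_sr_stream_format.h. */
#define RESTORE_STREAM_IDENT   0x4c6962786c466d74ULL
#define REC_ALIGN_ORDER        3U

/*
 * Hypothetical helper: round a record body length up to the next
 * multiple of 1 << REC_ALIGN_ORDER.  The arithmetic is done in 64 bits
 * so that lengths close to UINT32_MAX do not wrap.
 */
static uint64_t rec_padded_length(uint32_t length)
{
    const uint64_t mask = (UINT64_C(1) << REC_ALIGN_ORDER) - 1;

    return ((uint64_t)length + mask) & ~mask;
}

/*
 * Hypothetical helper: write the eight octets of ident into out,
 * most significant octet first, followed by a NUL terminator.
 */
static void ident_to_ascii(uint64_t ident, char out[9])
{
    for (int i = 0; i < 8; i++)
        out[i] = (char)((ident >> (56 - 8 * i)) & 0xff);
    out[8] = '\0';
}
```

Applied to RESTORE_STREAM_IDENT, ident_to_ascii() yields the string
"LibxlFmt".  rec_padded_length() maps 0 to 0, 1 through 8 to 8, and
9 to 16.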

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
  2015-06-16 13:47       ` Ian Jackson
@ 2015-06-16 14:05         ` Andrew Cooper
  2015-06-16 15:26           ` Ian Campbell
  0 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-16 14:05 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Yang Hongyang, Wei Liu, Ian Campbell, Xen-devel

On 16/06/15 14:47, Ian Jackson wrote:
> Andrew Cooper writes ("Re: [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children"):
>> It is possible that one bit fails before it can be calculated whether
>> the second bit needs to start or not.
>>
>> At the moment, all bits in libxl in this area do initialisation
>> immediately before use; most bits are even initialised in the function
>> which starts their actions.  Some bits are initialised differently
>> depending on the path taken to get to the initialisation site. 
> As a rule of thumb a function libxl__initiate_foo_ which takes a
> libxl__foo_state* should do this initialisation for the whole
> libxl__foo_state.
>
> I don't see why you can't do that.

The only example of libxl__initiate_foo_ is
libxl__initiate_device_remove() which starts the first action involved
with removing a device.

I will see what I can do, but there are areas of this code which can't
have their initialisation brought any further forward.

~Andrew

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state
  2015-06-16 13:37   ` Ian Campbell
@ 2015-06-16 14:09     ` Andrew Cooper
  0 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-16 14:09 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 16/06/15 14:37, Ian Campbell wrote:
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>> Shortly more parameters will appear, and this saves unboxing each one.
>>
>> No functional change.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> ---
>>  tools/libxl/libxl_create.c       |   12 ++++++------
>>  tools/libxl/libxl_internal.h     |    2 +-
>>  tools/libxl/libxl_save_callout.c |    2 +-
>>  3 files changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
>> index 86384d2..385891c 100644
>> --- a/tools/libxl/libxl_create.c
>> +++ b/tools/libxl/libxl_create.c
>> @@ -1577,8 +1577,8 @@ static void domain_create_cb(libxl__egc *egc,
>>                               int rc, uint32_t domid);
>>  
>>  static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
>> -                            uint32_t *domid,
>> -                            int restore_fd, int checkpointed_stream,
>> +                            uint32_t *domid, int restore_fd,
>> +                            const libxl_domain_restore_params *params,
>>                              const libxl_asyncop_how *ao_how,
>>                              const libxl_asyncprogress_how *aop_console_how)
>>  {
>> @@ -1591,8 +1591,8 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
>>      libxl_domain_config_init(&cdcs->dcs.guest_config_saved);
>>      libxl_domain_config_copy(ctx, &cdcs->dcs.guest_config_saved, d_config);
>>      cdcs->dcs.restore_fd = restore_fd;
>> +    if (params) cdcs->dcs.restore_params = *params;
> Is this eventually going to become non-optional? I think not and its
> validity is entirely intertwined with the validity of restore_fd (as I
> suspect it was before, but I've not checked).
>
> Perhaps an error check to that effect would be useful?

It is mandatory for restore, and currently unused for plain create. 
restore_fd being > -1 does appear to be the canonical switch between a
restore and a create, so should be the qualification of validity.
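
The check being discussed could look roughly like the sketch below.
struct restore_params is a hypothetical stand-in for
libxl_domain_restore_params, and restore_args_valid() is an invented
name; the only rule encoded is the one stated above, that the
parameters are mandatory exactly when restore_fd is > -1.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for libxl_domain_restore_params. */
struct restore_params {
    int checkpointed_stream;
};

/*
 * Return true if the (restore_fd, params) pair is consistent:
 *  - restore_fd > -1 selects a restore, which needs params;
 *  - otherwise this is a plain create and params is simply unused.
 */
static bool restore_args_valid(int restore_fd,
                               const struct restore_params *params)
{
    if (restore_fd > -1)
        return params != NULL;

    return true;
}
```

A caller would be expected to fail early, before copying *params into
the create state, when this returns false.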

>
> Anyway, I think what you've done here is correct, so:
>         Acked-by: Ian Campbell <ian.campbell@citrix.com>
>         
> [...]
>> @@ -3122,11 +3122,11 @@ struct libxl__domain_create_state {
>>      libxl_domain_config *guest_config;
>>      libxl_domain_config guest_config_saved; /* vanilla config */
>>      int restore_fd;
>> +    libxl_domain_restore_params restore_params;
>>      libxl__domain_create_cb *callback;
>>      libxl_asyncprogress_how aop_console_how;
>>      /* private to domain_create */
>>      int guest_domid;
>> -    int checkpointed_stream;
> This has, in effect moved from "private to domain_create" to "filled in
> by user", I don't think the change here has actually changed its status,
> but I suspect it was wrong before (alternatively restore_fd is in the
> wrong place instead).

I think it was wrong before.  It was always a caller-provided parameter,
albeit implicit by virtue of essentially being a "remus" boolean.

~Andrew


* Re: [PATCH 04/27] tools/xl: Mandatory flag indicating the format of the migration stream
  2015-06-16 13:39   ` Ian Campbell
@ 2015-06-16 14:10     ` Andrew Cooper
  0 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-16 14:10 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 16/06/15 14:39, Ian Campbell wrote:
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>> Introduced at this point so the python stream conversion code has a concrete
>> ABI to use.
> Please could you also explicitly mention that it isn't added to FLAG_ALL
> yet because we don't actually implement it yet and that it will be added
> there later.

Certainly.

>
> With that:
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
>
> (although I wish I had a better name in mind...)

Any suggestions welcome.

~Andrew


* Re: [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
  2015-06-15 13:44 ` [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream Andrew Cooper
@ 2015-06-16 14:31   ` Ian Campbell
  2015-06-16 15:01     ` Andrew Cooper
  2015-06-17  3:09   ` Wen Congyang
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 14:31 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Ross Lagerwall, Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> From: Ross Lagerwall <ross.lagerwall@citrix.com>
> 
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>

Overall looks good, I've got some comments below and I think it almost
certainly wants eyes from Ian who knows more about the dc infra etc.

> +void libxl__stream_read_start(libxl__egc *egc,
> +                              libxl__stream_read_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    int ret = 0;
> +
> +    /* State initialisation. */
> +    assert(!stream->running);
> +
> +    memset(dc, 0, sizeof(*dc));

libxl__datacopier_init, please

> +    dc->ao = stream->ao;
> +    dc->readfd = stream->fd;
> +    dc->writefd = -1;
> +
> +    /* Start reading the stream header. */
> +    dc->readwhat = "stream header";
> +    dc->readbuf = &stream->hdr;
> +    stream->expected_len = dc->bytes_to_read = sizeof(stream->hdr);
> +    dc->used = 0;
> +    dc->callback = stream_header_done;

This pattern of resetting and reinitialising the dc occurs in multiple
places, I think a helper would be in order, some sort of
stream_next_record_init or something perhaps?

> +void libxl__stream_read_abort(libxl__egc *egc,
> +                              libxl__stream_read_state *stream, int rc)
> +{
> +    stream_failed(egc, stream, rc);
> +}
> +
> +static void stream_success(libxl__egc *egc, libxl__stream_read_state *stream)
> +{
> +    stream->rc = 0;
> +    stream->running = false;
> +
> +    stream_done(egc, stream);

Push the running = false into stream_done and flip the assert there?
Logically the stream is still running until it is done, so having done
assert it isn't running seems counter-intuitive.
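
A minimal sketch of that arrangement, assuming a cut-down state structure
(struct stream_sketch and its completions counter are hypothetical and only
stand in for the real stream state and completion callback):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, cut-down stand-in for libxl__stream_read_state. */
struct stream_sketch {
    bool running;
    int rc;
    int completions; /* stands in for invoking completion_callback */
};

/* The single place where the stream stops running. */
static void stream_done(struct stream_sketch *s)
{
    assert(s->running);   /* trips if stream_done() is reached twice */
    s->running = false;
    s->completions++;
}

static void stream_success(struct stream_sketch *s)
{
    s->rc = 0;
    stream_done(s);
}

static void stream_failed(struct stream_sketch *s, int rc)
{
    assert(rc);
    s->rc = rc;
    stream_done(s);
}
```

With the flag cleared inside stream_done(), the flipped assert still catches
a second call, because the first call leaves running == false.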

> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_read_state *stream)
> +{
> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
> +
> +    assert(!stream->running);
> +
> +    stream->completion_callback(egc, dcs, stream->rc);
> +}
> +
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl_sr_hdr *hdr = &stream->hdr;
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }

I think you need to check errnoval == 0 in the !onwrite case, otherwise
you may miss a read error?

Also it looks like onwrite can be -1, which is a separate error case.

> +
> +static void record_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }

Same comments wrt the arguments as the previous one.

Maybe a common helper to check (and log) the status at the head of each
callback? So you can effectively do if (!everything_ok(stream, dc)) goto
err?
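
Such a helper might look roughly like the sketch below.  struct dc_status is
a hypothetical, flattened view of the values the callbacks inspect, and the
logging goes to stderr rather than through LOG(); only the field names and
the conditions come from the code quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, flattened view of what each callback checks. */
struct dc_status {
    int onwrite;         /* 0 on the read side; -1 is a separate error case */
    int errnoval;        /* 0 on success, otherwise an errno value */
    size_t used;         /* bytes actually read */
    size_t expected_len; /* bytes the stream logic asked for */
};

/* Returns true only if the read completed cleanly and in full. */
static bool everything_ok(const struct dc_status *st, const char *what)
{
    assert(st != NULL);

    if (st->onwrite == 0 && st->errnoval == 0 &&
        st->used == st->expected_len)
        return true;

    fprintf(stderr, "%s: write %d, err %d, expected %zu, got %zu\n",
            what, st->onwrite, st->errnoval, st->expected_len, st->used);
    return false;
}
```

Testing onwrite == 0 rather than !onwrite being false covers the -1 case as
well, and the explicit errnoval test catches a read error even when the byte
counts happen to match.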

> +    assert(!ret);
> +    if (rec_hdr->length) {
> +        free(stream->rec_body);
> +        stream->rec_body = NULL;

reset length too?

> +static void read_emulator_body(libxl__egc *egc,
> +                               libxl__stream_read_state *stream)
> +{
> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
> +    libxl__datacopier_state *dc = &stream->dc;
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    libxl_sr_emulator_hdr *emu_hdr = stream->rec_body;
> +    STATE_AO_GC(stream->ao);
> +    char path[256];
> +    int ret = 0;
> +
> +    sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE".%u", dcs->guest_domid);
> +
> +    dc->readwhat = "save/migration stream";
> +    dc->copywhat = "emulator context";
> +    dc->writewhat = "qemu save file";
> +    dc->readbuf = NULL;
> +    dc->writefd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);

Since this is all done in the same process (or children of it) with
no setuid etc, I think 0600 would be better to avoid accidentally
leaving the save state world readable (just in case it matters).

Also, should consider whether this fd needs to be subject to the carefd
machinery.
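
As a standalone illustration of the permissions point (not the datacopier
path itself), the sketch below creates a file with mode 0600 and writes a
buffer to it.  write_private_file() is a hypothetical helper name; the
open() flags are the ones from the quoted code.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create (or truncate) path with owner-only permissions and write len
 * bytes from buf.  Returns 0 on success, -1 with errno set on failure.
 */
static int write_private_file(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int fd;

    assert(path != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n == -1) {
            int saved_errno = errno;

            if (saved_errno == EINTR)
                continue;
            close(fd);
            errno = saved_errno; /* report the write error, not close's */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    return close(fd);
}
```

Note that the mode argument only takes effect when the file is created; if
the file already exists, O_TRUNC keeps whatever permissions it had.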

Sharing the dc between all these differing usages is starting to rankle a
little, but I think it is necessary because it may have queued data from
a previous read which was larger than the current record, correct?

Hrm, isn't setting dc->used = 0 on each reset potentially throwing some
stuff away?

> +    if (dc->writefd == -1) {
> +        ret = ERROR_FAIL;
> +        LOGE(ERROR, "Unable to open '%s'", path);
> +        goto err;
> +    }
> +    dc->maxsz = dc->bytes_to_read = rec_hdr->length - sizeof(*emu_hdr);
> +    stream->expected_len = dc->used = 0;

expecting 0? This differs from the pattern common everywhere else and
I'm not sure why.

> +    dc->callback = emulator_body_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    /* Safe to be static, as it is a write-only discard buffer. */
> +    static char padding[1U << REC_ALIGN_ORDER];
> +
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    STATE_AO_GC(dc->ao);
> +    unsigned int nr_padding_bytes = (1U << REC_ALIGN_ORDER);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }
> +
> +    /* Undo modifications for splicing the emulator context. */

Hrm, not so much undo as nuke and rebuild. Is that really necessary,
can't you just reset what you need to in the inverse of the other thing?

If there isn't a problem with buffered stuff on callback, then perhaps
it would be clearer to use a separate dc, at least for the qemu side. Or
to _always_ teardown and restart the dc from scratch instead of doing it
partially in some places and fully in others.


> +    memset(dc, 0, sizeof(*dc));
> +    dc->ao = stream->ao;
> +    dc->readfd = stream->fd;
> +    dc->writefd = -1;
> +
> +    /* Do we need to eat some padding out of the stream? */

Why only now and not for e.g. the xenstore stuff (which doesn't appear
to be explicitly padded)?

And given that why not handle this in some central place rather than in
the emulator only place?
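
A central helper for this could be as small as the sketch below, which
computes how many padding bytes follow a record body of a given length.
The value 3 for REC_ALIGN_ORDER (8-octet alignment) is an assumption made
for the example, and rec_padding() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: records are padded to 8 octets. */
#define REC_ALIGN_ORDER 3U

/* Number of padding bytes that follow a record body of body_length bytes. */
static size_t rec_padding(uint32_t body_length)
{
    const uint32_t align = 1U << REC_ALIGN_ORDER;

    assert(align != 0);
    return (align - (body_length & (align - 1))) & (align - 1);
}
```

Any record reader could then consume rec_padding(rec_hdr->length) bytes
after the body, regardless of record type.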

Ian.


* Re: [PATCH 17/27] tools/libxl: Support converting a legacy stream to a v2 stream
  2015-06-15 13:44 ` [PATCH 17/27] tools/libxl: Support converting a legacy stream to a " Andrew Cooper
@ 2015-06-16 14:38   ` Ian Campbell
  2015-06-16 15:13     ` Andrew Cooper
  0 siblings, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 14:38 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> When a legacy stream is found, it needs to be converted to a v2 stream for the
> reading logic.  This is done by exec()ing the python conversion utility.
> 
> One complication is that the caller of this interface needs to assume
> ownership of the output fd, to prevent it being closed while still in use in a
> datacopier.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/Makefile                |    1 +
>  tools/libxl/libxl_convert_callout.c |  146 +++++++++++++++++++++++++++++++++++
>  tools/libxl/libxl_internal.h        |   32 ++++++++
>  3 files changed, 179 insertions(+)
>  create mode 100644 tools/libxl/libxl_convert_callout.c
> 
> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
> index c71c5fe..ca0ae3e 100644
> --- a/tools/libxl/Makefile
> +++ b/tools/libxl/Makefile
> @@ -96,6 +96,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
>  			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
>  			libxl_stream_read.o \
>  			libxl_save_callout.o _libxl_save_msgs_callout.o \
> +			libxl_convert_callout.o \

Could we arrange for this to be x86 only, please (both here while
compiling and at runtime)

>  			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
>  LIBXL_OBJS += libxl_genid.o
>  LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
> diff --git a/tools/libxl/libxl_convert_callout.c b/tools/libxl/libxl_convert_callout.c
> new file mode 100644
> index 0000000..9050bb9
> --- /dev/null
> +++ b/tools/libxl/libxl_convert_callout.c

> +
> +static void helper_failed(libxl__egc *egc,
> +                          libxl__conversion_helper_state *chs, int rc);
> +static void helper_exited(libxl__egc *egc, libxl__ev_child *ch,
> +                          pid_t pid, int status);
> +static void helper_done(libxl__egc *egc,
> +                        libxl__conversion_helper_state *chs);

A lot of this stuff looks a lot like the contents of
libxl_save_callout.c, is there no scope for sharing any of it?

Since we only support N->N+1 we could perhaps tolerate the duplication
if we agreed upon a reasonable schedule to remove all this compat stuff,
e.g. in 4.7 or 4.8.

Ian?


* Re: [PATCH 19/27] tools/libxc+libxl+xl: Restore v2 streams
  2015-06-15 13:44 ` [PATCH 19/27] tools/libxc+libxl+xl: Restore v2 streams Andrew Cooper
@ 2015-06-16 14:53   ` Ian Campbell
  2015-06-16 15:23     ` Andrew Cooper
  0 siblings, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 14:53 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> @@ -377,6 +384,28 @@ static void record_body_done(libxl__egc *egc,
>      stream_failed(egc, stream, ret);
>  }
>  
> +void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
> +                                   int ret, int retval, int errnoval)
> +{
> +    libxl__domain_create_state *dcs = dcs_void;
> +    STATE_AO_GC(dcs->ao);
> +
> +    if (ret)
> +        goto err;
> +
> +    if (retval) {
> +        LOGEV(ERROR, errnoval, "restoring domain");
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    libxl__stream_read_continue(egc, &dcs->srs);

continue? Is this something to do with checkpointing?

> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 23f27d4..7418d92 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -346,6 +346,8 @@ libxl_domain_create_info = Struct("domain_create_info",[
>  
>  libxl_domain_restore_params = Struct("domain_restore_params", [

At some point we will need a LIBXL_HAVE #define.

>      ("checkpointed_stream", integer),
> +    ("stream_version", uint32, {'init_val': '1'}),

If we aren't going to go for an IDL enum rather than a uint32 you
probably just want the bare integer 1.

But, I suspect we would prefer an enum, i.e. an explicit list of known
versions, rather than an integer?

I wonder when, if ever, we will be able to flip this to 2? I suppose
whenever the legacy conversion stuff gets pulled out.
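
On the C side, an explicit list of known versions might look like the sketch
below.  The enumerator and function names are hypothetical; the flag value
and the 2-or-1 selection mirror the xl code in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XL_MANDATORY_FLAG_STREAMv2 (1U << 1) /* stream is v2 */

/* Hypothetical enumerator names for the two versions discussed. */
typedef enum {
    STREAM_VERSION_1 = 1, /* legacy stream */
    STREAM_VERSION_2 = 2, /* migration v2 stream */
} stream_version;

/* Reject anything outside the explicit list. */
static bool stream_version_is_known(uint32_t v)
{
    return v == STREAM_VERSION_1 || v == STREAM_VERSION_2;
}

/* Same decision xl makes from the save file header's mandatory flags. */
static stream_version stream_version_from_flags(uint32_t mandatory_flags)
{
    return (mandatory_flags & XL_MANDATORY_FLAG_STREAMv2)
        ? STREAM_VERSION_2 : STREAM_VERSION_1;
}
```

Compared with a bare integer, the enum gives the restore path a closed set
to validate against before choosing between conversion and direct reading.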

> +    ("legacy_width", uint32),

From what I've seen so far this is never user provided but is internal
to libxl and detected[0] at runtime. As such it belongs somewhere else
other than in the public API.

[0] FVO "detected" == "hardcoded depending on the build arch"

>      ])
>  
>  libxl_domain_sched_params = Struct("domain_sched_params",[
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index ddb293c..14d96c9 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -110,7 +110,9 @@
>  
>  #define XL_MANDATORY_FLAG_JSON (1U << 0) /* config data is in JSON format */
>  #define XL_MANDATORY_FLAG_STREAMv2 (1U << 1) /* stream is v2 */
> -#define XL_MANDATORY_FLAG_ALL  (XL_MANDATORY_FLAG_JSON)
> +#define XL_MANDATORY_FLAG_ALL  (XL_MANDATORY_FLAG_JSON |        \
> +                                XL_MANDATORY_FLAG_STREAMv2)
> +
>  struct save_file_header {
>      char magic[32]; /* savefileheader_magic */
>      /* All uint32_ts are in domain's byte order. */
> @@ -2724,6 +2726,9 @@ static uint32_t create_domain(struct domain_create *dom_info)
>          libxl_domain_restore_params_init(&params);
>  
>          params.checkpointed_stream = dom_info->checkpointed_stream;
> +        params.stream_version =
> +            (hdr.mandatory_flags & XL_MANDATORY_FLAG_STREAMv2) ? 2 : 1;
> +
>          ret = libxl_domain_create_restore(ctx, &d_config,
>                                            &domid, restore_fd,
>                                            &params,


* Re: [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
  2015-06-15 13:44 ` [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream Andrew Cooper
@ 2015-06-16 14:57   ` Ian Campbell
  2015-06-16 15:28     ` Andrew Cooper
  2015-06-17  1:31   ` Yang Hongyang
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 14:57 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Ross Lagerwall, Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> From: Ross Lagerwall <ross.lagerwall@citrix.com>
> 
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/Makefile             |    2 +-
>  tools/libxl/libxl_internal.h     |   33 +++
>  tools/libxl/libxl_stream_write.c |  536 ++++++++++++++++++++++++++++++++++++++
>  3 files changed, 570 insertions(+), 1 deletion(-)
>  create mode 100644 tools/libxl/libxl_stream_write.c
> 
> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
> index ca0ae3e..63e32f7 100644
> --- a/tools/libxl/Makefile
> +++ b/tools/libxl/Makefile
> @@ -94,7 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
>  			libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
>  			libxl_internal.o libxl_utils.o libxl_uuid.o \
>  			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
> -			libxl_stream_read.o \
> +			libxl_stream_read.o libxl_stream_write.o \
>  			libxl_save_callout.o _libxl_save_msgs_callout.o \
>  			libxl_convert_callout.o \
>  			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 5482950..82cd792 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -2868,6 +2868,38 @@ typedef void libxl__domain_suspend_cb(libxl__egc*,
>  typedef void libxl__save_device_model_cb(libxl__egc*,
>                                           libxl__domain_suspend_state*, int rc);
>  
> +/* State for writing a libxl migration v2 stream */
> +typedef struct libxl__stream_write_state libxl__stream_write_state;
> +
> +struct libxl__stream_write_state {
> +    /* filled by the user */
> +    libxl__ao *ao;
> +    int fd;
> +    uint32_t domid;
> +    void (*completion_callback)(libxl__egc *egc,
> +                                libxl__domain_suspend_state *dss,
> +                                int rc);
> +    /* Private */
> +    int rc;
> +    int joined_rc;
> +    size_t padding;
> +    bool running;
> +    libxl__datacopier_state dc;
> +};
> +
> +_hidden void libxl__stream_write_start(libxl__egc *egc,
> +                                       libxl__stream_write_state *stream);
> +
> +_hidden void libxl__stream_write_abort(libxl__egc *egc,
> +                                       libxl__stream_write_state *stream,
> +                                       int rc);
> +
> +static inline bool libxl__stream_write_inuse(
> +    const libxl__stream_write_state *stream)
> +{
> +    return stream->running;
> +}
> +
>  typedef struct libxl__logdirty_switch {
>      const char *cmd;
>      const char *cmd_path;
> @@ -2907,6 +2939,7 @@ struct libxl__domain_suspend_state {
>      /* private for libxl__domain_save_device_model */
>      libxl__save_device_model_cb *save_dm_callback;
>      libxl__datacopier_state save_dm_datacopier;
> +    libxl__stream_write_state sws;
>  };
>  
> 
> diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
> new file mode 100644
> index 0000000..856d72e
> --- /dev/null
> +++ b/tools/libxl/libxl_stream_write.c
> @@ -0,0 +1,536 @@
> +/*
> + * Copyright (C) 2015      Citrix Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + */
> +
> +#include "libxl_osdeps.h" /* must come before any other headers */
> +
> +#include "libxl_internal.h"
> +
> +/*
> + * Infrastructure for writing a domain to a libxl migration v2 stream.
> + *
> + * Entry points from outside:
> + *  - libxl__stream_write_start()
> + *     - Start writing a stream from the start.
> + *
> + * In normal operation, there are two tasks running at once; this stream
> + * processing, and the the libxl-save-helper.  check_stream_finished() is used

"the the".

> + * to join all the tasks in both success and error cases.
> + *
> + * Nomenclature for event callbacks:
> + *  - $FOO_done(): Completion callback for $FOO
> + *  - write_$FOO(): Set up writing a $FOO

Set up or actually write?

> +
> +void libxl__stream_write_start(libxl__egc *egc,
> +                               libxl__stream_write_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_hdr hdr = { 0 };
> +    int ret = 0;
> +
> +    assert(!stream->running);
> +    stream->running = true;
> +
> +    memset(dc, 0, sizeof(*dc));

Please use the _init()


> +static void check_stream_finished(libxl__egc *egc,
> +                                  libxl__domain_suspend_state *dss,
> +                                  int rc, const char *what)
> +{
> +    libxl__stream_write_state *stream = &dss->sws;
> +    STATE_AO_GC(dss->ao);
> +
> +    LOG(INFO, "Task '%s' joining (rc %d)", what, rc);
> +
> +    if (rc && !stream->joined_rc) {
> +        bool skip = false;
> +        /* First reported failure from joining tasks.  Tear everything down */
> +        stream->joined_rc = rc;


This (not just this, but a bunch of the preceding helpers) all looks
rather familiar, can it be shared to some extent?


* Re: [PATCH 21/27] tools/libxc+libxl+xl: Save v2 streams
  2015-06-15 13:44 ` [PATCH 21/27] tools/libxc+libxl+xl: Save v2 streams Andrew Cooper
@ 2015-06-16 14:59   ` Ian Campbell
  0 siblings, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 14:59 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> This is a complicated set of changes which must be done together for
> bisectability.
> 
>  * libxl-save-helper is updated to unconditionally use libxc migration v2.
>  * libxl compatibility workarounds in libxc are disabled for save operations.
>  * libxl__stream_write_start() is logically spliced into the event location
>    where libxl__xc_domain_save() used to reside.
>  * xl is updated to indicate that the stream is now v2
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> 
> ---
> RFC: What kind of ABI/API indication is appropriate here.  A LIBXL_HAVE*
> isn't apppropriate.

Whether it has "HAVE" in the name or not I think some sort of #define,
or one each for SAVE, RESTORE and perhaps even LEGACY would be
appropriate. We should arrange that we can remove the LEGACY one in the
future without causing applications to think we've reverted to Xen 4.5
era.
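
To make the "remove LEGACY later" property concrete, here is a sketch of how
an application might interpret three separate defines.  All of the macro,
enum and function names are hypothetical and exist only for this example.

```c
#include <assert.h>

/* Hypothetical names, as they might appear in a public header. */
#define EXAMPLE_HAVE_STREAM_V2_SAVE        1
#define EXAMPLE_HAVE_STREAM_V2_RESTORE     1
#define EXAMPLE_HAVE_STREAM_LEGACY_RESTORE 1 /* could be dropped later */

enum restore_support {
    RESTORE_PRE_V2,        /* no v2 define at all: Xen 4.5 era */
    RESTORE_V2_ONLY,       /* v2 present, legacy conversion removed */
    RESTORE_V2_AND_LEGACY, /* v2 present, legacy streams still accepted */
};

/* Decide at compile time which situation the headers describe. */
static enum restore_support detect_restore_support(void)
{
#if !defined(EXAMPLE_HAVE_STREAM_V2_RESTORE)
    return RESTORE_PRE_V2;
#elif defined(EXAMPLE_HAVE_STREAM_LEGACY_RESTORE)
    return RESTORE_V2_AND_LEGACY;
#else
    return RESTORE_V2_ONLY;
#endif
}
```

Because the v2 defines are tested first, deleting only the LEGACY define
moves an application to RESTORE_V2_ONLY rather than back to RESTORE_PRE_V2.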


* Re: [PATCH 22/27] docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams
  2015-06-15 13:44 ` [PATCH 22/27] docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams Andrew Cooper
@ 2015-06-16 15:00   ` Ian Campbell
  2015-06-16 15:30     ` Andrew Cooper
  2015-06-17  3:30   ` Wen Congyang
  1 sibling, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 15:00 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> In a remus senario, libxc will write a CHECKPOINT record, then hand ownership

"scenario"

> of the fd to libxl.  Libxl then writes any records required and finishes with
> a CHECKPOINT_END record, then hands ownership of the fd back to libxc.

Seems like a plausible scheme to me, if that's what the RFC was for.
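
For reference, a reader-side view of the record type space after this patch
could be sketched as follows.  The record type values are those in the patch
and spec; REC_TYPE_END as a macro name, REC_TYPE_OPTIONAL_BASE and the
helper functions are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define REC_TYPE_END              0x00000000U
#define REC_TYPE_LIBXC_CONTEXT    0x00000001U
#define REC_TYPE_XENSTORE_DATA    0x00000002U
#define REC_TYPE_EMULATOR_CONTEXT 0x00000003U
#define REC_TYPE_CHECKPOINT_END   0x00000004U

/* Hypothetical name: first value of the range reserved for optional records. */
#define REC_TYPE_OPTIONAL_BASE    0x80000000U

/* True for types in the 0x80000000 - 0xFFFFFFFF (optional) range. */
static bool rec_type_is_optional(uint32_t type)
{
    return type >= REC_TYPE_OPTIONAL_BASE;
}

/* True for the record types defined so far. */
static bool rec_type_is_known(uint32_t type)
{
    return type <= REC_TYPE_CHECKPOINT_END;
}

/* A CHECKPOINT_END record carries no fields, so its body must be empty. */
static bool checkpoint_end_is_valid(uint32_t type, uint32_t body_length)
{
    return type == REC_TYPE_CHECKPOINT_END && body_length == 0;
}
```

The zero-length check is the C counterpart of verify_record_checkpoint_end()
in the python verifier.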

> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  docs/specs/libxl-migration-stream.pandoc |   15 ++++++++++++++-
>  tools/libxl/libxl_sr_stream_format.h     |    1 +
>  tools/python/xen/migration/libxl.py      |   11 +++++++++++
>  3 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/specs/libxl-migration-stream.pandoc b/docs/specs/libxl-migration-stream.pandoc
> index 7235317..d41932a 100644
> --- a/docs/specs/libxl-migration-stream.pandoc
> +++ b/docs/specs/libxl-migration-stream.pandoc
> @@ -119,7 +119,9 @@ type         0x00000000: END
>  
>               0x00000003: EMULATOR_CONTEXT
>  
> -             0x00000004 - 0x7FFFFFFF: Reserved for future _mandatory_
> +             0x00000004: CHECKPOINT_END
> +
> +             0x00000005 - 0x7FFFFFFF: Reserved for future _mandatory_
>               records.
>  
>               0x80000000 - 0xFFFFFFFF: Reserved for future _optional_
> @@ -203,3 +205,14 @@ index            Index of this emulator for the domain, if multiple
>  
>  emulator_ctx     Emulator context blob.
>  --------------------------------------------------------------------
> +
> +CHECKPOINT_END
> +--------------
> +
> +A checkpoint end record marks the end of a checkpoint in the image.
> +
> +     0     1     2     3     4     5     6     7 octet
> +    +-------------------------------------------------+
> +
> +The end record contains no fields; its body_length is 0.
> +
> diff --git a/tools/libxl/libxl_sr_stream_format.h b/tools/libxl/libxl_sr_stream_format.h
> index 487f9e2..5dfa55f 100644
> --- a/tools/libxl/libxl_sr_stream_format.h
> +++ b/tools/libxl/libxl_sr_stream_format.h
> @@ -35,6 +35,7 @@
>  #define REC_TYPE_LIBXC_CONTEXT       0x00000001U
>  #define REC_TYPE_XENSTORE_DATA       0x00000002U
>  #define REC_TYPE_EMULATOR_CONTEXT    0x00000003U
> +#define REC_TYPE_CHECKPOINT_END      0x00000004U
>  
>  typedef struct libxl_sr_emulator_hdr
>  {
> diff --git a/tools/python/xen/migration/libxl.py b/tools/python/xen/migration/libxl.py
> index 4e1f4f8..415502e 100644
> --- a/tools/python/xen/migration/libxl.py
> +++ b/tools/python/xen/migration/libxl.py
> @@ -36,12 +36,14 @@ REC_TYPE_end              = 0x00000000
>  REC_TYPE_libxc_context    = 0x00000001
>  REC_TYPE_xenstore_data    = 0x00000002
>  REC_TYPE_emulator_context = 0x00000003
> +REC_TYPE_checkpoint_end   = 0x00000004
>  
>  rec_type_to_str = {
>      REC_TYPE_end              : "End",
>      REC_TYPE_libxc_context    : "Libxc context",
>      REC_TYPE_xenstore_data    : "Xenstore data",
>      REC_TYPE_emulator_context : "Emulator context",
> +    REC_TYPE_checkpoint_end   : "Checkpoint end",
>  }
>  
>  # emulator_context
> @@ -176,6 +178,13 @@ class VerifyLibxl(VerifyBase):
>          self.info("  Index %d, type %s" % (emu_idx, emulator_id_to_str[emu_id]))
>  
> 
> +    def verify_record_checkpoint_end(self, content):
> +        """ Checkpoint end record """
> +
> +        if len(content) != 0:
> +            raise RecordError("Checkpoint end record with non-zero length")
> +
> +
>  record_verifiers = {
>      REC_TYPE_end:
>          VerifyLibxl.verify_record_end,
> @@ -185,4 +194,6 @@ record_verifiers = {
>          VerifyLibxl.verify_record_xenstore_data,
>      REC_TYPE_emulator_context:
>          VerifyLibxl.verify_record_emulator_context,
> +    REC_TYPE_checkpoint_end:
> +        VerifyLibxl.verify_record_checkpoint_end,
>  }


* Re: [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
  2015-06-16 14:31   ` Ian Campbell
@ 2015-06-16 15:01     ` Andrew Cooper
  2015-06-16 15:35       ` Ian Campbell
  0 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-16 15:01 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Ross Lagerwall, Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 16/06/15 15:31, Ian Campbell wrote:
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>> From: Ross Lagerwall <ross.lagerwall@citrix.com>
>>
>> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
> Overall looks good, I've got some comments below and I think it almost
> certainly wants eyes from Ian who knows more about the dc infra etc.
>
>> +void libxl__stream_read_start(libxl__egc *egc,
>> +                              libxl__stream_read_state *stream)
>> +{
>> +    libxl__datacopier_state *dc = &stream->dc;
>> +    int ret = 0;
>> +
>> +    /* State initialisation. */
>> +    assert(!stream->running);
>> +
>> +    memset(dc, 0, sizeof(*dc));
> libxl__datacopier_init, please

That call is made by libxl__datacopier_start() each and every time, and
unlike here, is matched with an equivalent _kill() call.

>
>> +    dc->ao = stream->ao;
>> +    dc->readfd = stream->fd;
>> +    dc->writefd = -1;
>> +
>> +    /* Start reading the stream header. */
>> +    dc->readwhat = "stream header";
>> +    dc->readbuf = &stream->hdr;
>> +    stream->expected_len = dc->bytes_to_read = sizeof(stream->hdr);
>> +    dc->used = 0;
>> +    dc->callback = stream_header_done;
> This pattern of resetting and reinitialising the dc occurs in multiple
> places, I think a helper would be in order, some sort of
> stream_next_record_init or something perhaps?

The only feasible helper would have to take everything as parameters; 
there is insufficient similarity between all users.

I dunno whether that would be harder to read...
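
For the sake of comparison, a helper that takes everything as parameters
might read like the sketch below.  The structures are hypothetical, cut-down
stand-ins that reuse the field names from the patch; stream_setup_read() and
start_reading_header() are invented names.

```c
#include <assert.h>
#include <stddef.h>

struct dc_sketch;
typedef void dc_callback(struct dc_sketch *dc);

/* Hypothetical subset of libxl__datacopier_state. */
struct dc_sketch {
    const char *readwhat;
    void *readbuf;
    size_t bytes_to_read;
    size_t used;
    dc_callback *callback;
};

/* Hypothetical subset of libxl__stream_read_state. */
struct stream_sketch {
    struct dc_sketch dc;
    size_t expected_len;
    char hdr[16]; /* placeholder for the stream header */
};

/* Prepare the dc to read exactly len bytes into buf, then call cb. */
static void stream_setup_read(struct stream_sketch *stream, const char *what,
                              void *buf, size_t len, dc_callback *cb)
{
    struct dc_sketch *dc = &stream->dc;

    assert(buf != NULL && cb != NULL);
    dc->readwhat = what;
    dc->readbuf = buf;
    stream->expected_len = dc->bytes_to_read = len;
    dc->used = 0;
    dc->callback = cb;
}

/* Empty completion callback, present only so the sketch is complete. */
static void stream_header_done(struct dc_sketch *dc)
{
    (void)dc;
}

/* What the header-reading call site would shrink to. */
static void start_reading_header(struct stream_sketch *stream)
{
    stream_setup_read(stream, "stream header", stream->hdr,
                      sizeof(stream->hdr), stream_header_done);
}
```

Whether five arguments at each call site reads better than five assignments
is the judgment call raised above; the sketch only shows the shape.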

>
>> +void libxl__stream_read_abort(libxl__egc *egc,
>> +                              libxl__stream_read_state *stream, int rc)
>> +{
>> +    stream_failed(egc, stream, rc);
>> +}
>> +
>> +static void stream_success(libxl__egc *egc, libxl__stream_read_state *stream)
>> +{
>> +    stream->rc = 0;
>> +    stream->running = false;
>> +
>> +    stream_done(egc, stream);
> Push the running = false into stream_done and flip the assert there?
> Logically the stream is still running until it is done, so having done
> assert it isn't running seems counter-intuitive.

This is more for peace of mind.  stream_done() may strictly only ever be
called once, hence its assert.

>
>> +static void stream_done(libxl__egc *egc,
>> +                        libxl__stream_read_state *stream)
>> +{
>> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
>> +
>> +    assert(!stream->running);
>> +
>> +    stream->completion_callback(egc, dcs, stream->rc);
>> +}
>> +
>> +static void stream_header_done(libxl__egc *egc,
>> +                               libxl__datacopier_state *dc,
>> +                               int onwrite, int errnoval)
>> +{
>> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
>> +    libxl_sr_hdr *hdr = &stream->hdr;
>> +    STATE_AO_GC(dc->ao);
>> +    int ret = 0;
>> +
>> +    if (onwrite || dc->used != stream->expected_len) {
>> +        ret = ERROR_FAIL;
>> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
>> +            onwrite, errnoval, stream->expected_len, dc->used);
>> +        goto err;
>> +    }
> I think you need to check errnoval == 0 in the !onwrite case, otherwise
> you may miss a read error?

"dc->used != stream->expected_len" covers all possible read errors, in
the "something went wrong" kind of way.

>
> Also it looks like onwrite can be -1, which is a separate error case.
>
>> +
>> +static void record_header_done(libxl__egc *egc,
>> +                               libxl__datacopier_state *dc,
>> +                               int onwrite, int errnoval)
>> +{
>> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
>> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
>> +    STATE_AO_GC(dc->ao);
>> +    int ret = 0;
>> +
>> +    if (onwrite || dc->used != stream->expected_len) {
>> +        ret = ERROR_FAIL;
>> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
>> +            onwrite, errnoval, stream->expected_len, dc->used);
>> +        goto err;
>> +    }
> Same comments wrt the arguments as the previous one.
>
> Maybe a common helper to check (and log) the status at the head of each
> callback? So you can effectively do if (!everything_ok(stream, dc)) goto
> err?

I will see what I can do.
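
A minimal sketch of what such a shared check might look like, using stand-in
types rather than the real libxl structures.  The names mock_dc, mock_stream
and copy_status_ok() are hypothetical and only carry the fields the check
needs; any non-zero onwrite (including -1) and any non-zero errnoval are
treated as failure, which is an assumption about the intended policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for the libxl types; only the fields the check needs. */
struct mock_dc { size_t used; };
struct mock_stream { size_t expected_len; };

/*
 * Returns true only if the copy completed cleanly: no write-side or
 * internal failure (onwrite != 0), no read error (errnoval != 0), and
 * exactly the expected number of bytes left in the local buffer.
 * Logs the same details as the open-coded checks on failure.
 */
static bool copy_status_ok(const struct mock_stream *stream,
                           const struct mock_dc *dc,
                           int onwrite, int errnoval)
{
    if (onwrite == 0 && errnoval == 0 && dc->used == stream->expected_len)
        return true;

    fprintf(stderr, "write %d, err %d, expected %zu, got %zu\n",
            onwrite, errnoval, stream->expected_len, dc->used);
    return false;
}
```

Each callback would then start with a single
"if (!copy_status_ok(stream, dc, onwrite, errnoval)) goto err;" line.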

>
>> +    assert(!ret);
>> +    if (rec_hdr->length) {
>> +        free(stream->rec_body);
>> +        stream->rec_body = NULL;
> reset length too?
>
>> +static void read_emulator_body(libxl__egc *egc,
>> +                               libxl__stream_read_state *stream)
>> +{
>> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
>> +    libxl__datacopier_state *dc = &stream->dc;
>> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
>> +    libxl_sr_emulator_hdr *emu_hdr = stream->rec_body;
>> +    STATE_AO_GC(stream->ao);
>> +    char path[256];
>> +    int ret = 0;
>> +
>> +    sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE".%u", dcs->guest_domid);
>> +
>> +    dc->readwhat = "save/migration stream";
>> +    dc->copywhat = "emulator context";
>> +    dc->writewhat = "qemu save file";
>> +    dc->readbuf = NULL;
>> +    dc->writefd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
> Since this is all done in the same process (or children of it) with
> no setuid etc, I think 0600 would be better to avoid accidentally
> leaving the save state world readable (just in case it matters).

Probably best.
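
For illustration, a small standalone sketch of opening the file with 0600
and confirming that no group/other bits end up set.  The helper names
open_private_save_file() and fd_permission_bits() are hypothetical and not
part of libxl.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>   /* close() and unlink() for callers */

/* Create (or truncate) a file accessible by its owner only (mode 0600). */
static int open_private_save_file(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
}

/* Report the permission bits of an open fd, or -1 if fstat() fails. */
static int fd_permission_bits(int fd)
{
    struct stat st;

    if (fstat(fd, &st) < 0)
        return -1;
    return (int)(st.st_mode & 0777);
}
```

Note that the mode argument only applies when O_CREAT actually creates the
file; a pre-existing file keeps whatever permissions it already had.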

>
> Also, should consider whether this fd needs to be subject to the carefd
> machinery.

Probably does.

>
> Sharing the dc between all these differing usages is starting to rankle a
> little, but I think it is necessary because it may have queued data from
> a previous read which was larger than the current record, correct?
>
> Hrm, isn't setting dc->used = 0 on each reset potentially throwing some
> stuff away?

We should never be in a case where we are setting up a new read/write
from the dc with any previous IO pending.

>
>> +    if (dc->writefd == -1) {
>> +        ret = ERROR_FAIL;
>> +        LOGE(ERROR, "Unable to open '%s'", path);
>> +        goto err;
>> +    }
>> +    dc->maxsz = dc->bytes_to_read = rec_hdr->length - sizeof(*emu_hdr);
>> +    stream->expected_len = dc->used = 0;
> expecting 0? This differs from the pattern common everywhere else and
> I'm not sure why.

The datacopier has been overloaded so many times, it is messy to use.

In this case, we are splicing from read fd to a write fd, rather than to
a local buffer.

Therefore, when the IO is complete, we expect 0 bytes in the local
buffer, as all data should end up in the fd.

>
>> +    dc->callback = emulator_body_done;
>> +
>> +    ret = libxl__datacopier_start(dc);
>> +    if (ret)
>> +        goto err;
>> +    return;
>> +
>> + err:
>> +    assert(ret);
>> +    stream_failed(egc, stream, ret);
>> +}
>> +
>> +static void emulator_body_done(libxl__egc *egc,
>> +                               libxl__datacopier_state *dc,
>> +                               int onwrite, int errnoval)
>> +{
>> +    /* Safe to be static, as it is a write-only discard buffer. */
>> +    static char padding[1U << REC_ALIGN_ORDER];
>> +
>> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
>> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
>> +    STATE_AO_GC(dc->ao);
>> +    unsigned int nr_padding_bytes = (1U << REC_ALIGN_ORDER);
>> +    int ret = 0;
>> +
>> +    if (onwrite || dc->used != stream->expected_len) {
>> +        ret = ERROR_FAIL;
>> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
>> +            onwrite, errnoval, stream->expected_len, dc->used);
>> +        goto err;
>> +    }
>> +
>> +    /* Undo modifications for splicing the emulator context. */
> Hrm, not so much undo as nuke and rebuild. Is that really necessary,
> can't you just reset what you need to in the inverse of the other thing?
>
> If there isn't a problem with buffered stuff on callback, then perhaps
> it would be clearer to use a separate dc, at least for the qemu side. Or
> to _always_ teardown and restart the dc from scratch instead of doing it
> partially in some places and fully in others.
>
>
>> +    memset(dc, 0, sizeof(*dc));
>> +    dc->ao = stream->ao;
>> +    dc->readfd = stream->fd;
>> +    dc->writefd = -1;
>> +
>> +    /* Do we need to eat some padding out of the stream? */
> Why only now and not for e.g. the xenstore stuff (which doesn't appear
> to be explicitly padded).

Any record which is read into a local buffer has the local buffer
aligned up, and the padding read onto the end.

>
> And given that why not handle this in some central place rather than in
> the emulator only place?

Experimentally, some versions of Qemu barf if they have trailing zeros
in the save file.  I think they expect to find eof() on a qemu record boundary.
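
As a rough sketch of the arithmetic involved: since the emulator body is
spliced straight into the qemu save file, the padding has to be eaten from
the stream separately, and the number of bytes to discard follows from the
record length.  The value of REC_ALIGN_ORDER used here (3, i.e. 8-byte
alignment) is an assumption for illustration, and record_padding() is a
hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration: records padded to 8-byte boundaries. */
#define REC_ALIGN_ORDER 3U

/* Number of padding bytes that follow a record body of `length` bytes. */
static size_t record_padding(size_t length)
{
    size_t align = (size_t)1 << REC_ALIGN_ORDER;

    /* Distance up to the next multiple of `align`; 0 if already aligned. */
    return (align - (length & (align - 1))) & (align - 1);
}
```

With these assumptions a 13-byte body is followed by 3 bytes of padding,
and an 8-byte body by none.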

~Andrew


* Re: [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream
  2015-06-15 13:44 ` [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream Andrew Cooper
@ 2015-06-16 15:03   ` Ian Campbell
  2015-06-16 15:53     ` Andrew Cooper
  2015-06-18  3:13   ` Wen Congyang
  1 sibling, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 15:03 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> when signalled to do so by libxl__remus_domain_checkpoint_callback()

I think I saw that Remus wasn't currently working, so I'll let you and
Hongyang thrash something out before I spend too much effort reviewing
these last few RFC bits. Unless you think it is worth my having a look
now?


* Re: [PATCH 26/27] tools/libxc: Drop all XG_LIBXL_HVM_COMPAT code from libxc
  2015-06-15 13:44 ` [PATCH 26/27] tools/libxc: Drop all XG_LIBXL_HVM_COMPAT code from libxc Andrew Cooper
@ 2015-06-16 15:03   ` Ian Campbell
  0 siblings, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 15:03 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> Libxl has now been fully adjusted not to need it.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>


* Re: [PATCH 08/27] tools/libxl: Extra APIs for the save helper
  2015-06-16 13:50   ` Ian Campbell
@ 2015-06-16 15:03     ` Andrew Cooper
  0 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-16 15:03 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 16/06/15 14:50, Ian Campbell wrote:
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>> With libxl migration v2, there will be other moving parts which might fail,
>> requiring the helper to be stopped for reasons which are not its fault.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> ---
>>  tools/libxl/libxl_internal.h     |    8 ++++++++
>>  tools/libxl/libxl_save_callout.c |   16 ++++++++++++++++
>>  2 files changed, 24 insertions(+)
>>
>> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
>> index 4f204f9..3fcc37a 100644
>> --- a/tools/libxl/libxl_internal.h
>> +++ b/tools/libxl/libxl_internal.h
>> @@ -3182,6 +3182,14 @@ _hidden void libxl__xc_domain_restore(libxl__egc *egc,
>>  _hidden void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
>>                                             int rc, int retval, int errnoval);
>>  
>> +_hidden void libxl__save_helper_abort(libxl__egc *egc,
>> +                                      libxl__save_helper_state *shs);
>> +
>> +static inline bool libxl__save_helper_inuse(const libxl__save_helper_state *shs)
>> +{
>> +    return libxl__ev_child_inuse(&shs->child);
>> +}
> Will this be used other than in libxl__save_helper_abort?

The two are typically used together, but inuse() does need to be used
without abort() as part of joining 3 parallel tasks.

~Andrew


* Re: [PATCH 27/27] tools/libxl: Drop all knowledge of toolstack callbacks
  2015-06-15 13:44 ` [PATCH 27/27] tools/libxl: Drop all knowledge of toolstack callbacks Andrew Cooper
@ 2015-06-16 15:04   ` Ian Campbell
  2015-06-16 15:06     ` Andrew Cooper
  0 siblings, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 15:04 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> Libxl has now been fully adjusted not to need them.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Ian Campbell <Ian.Campbell@citrix.com>

/me looks mournfully at the #28 shaped hole in this series which would
nuke all the migration v1 code from libxc :-)

Ian.


* Re: [PATCH 27/27] tools/libxl: Drop all knowledge of toolstack callbacks
  2015-06-16 15:04   ` Ian Campbell
@ 2015-06-16 15:06     ` Andrew Cooper
  2015-06-17 10:14       ` Ian Campbell
  0 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-16 15:06 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 16/06/15 16:04, Ian Campbell wrote:
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>> Libxl has now been fully adjusted not to need them.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
>
> /me looks mournfully at the #28 shaped hole in this series which would
> nuke all the migration v1 code from libxc :-)

I was going to slip that into v2.  I didn't want to delay posting v1 for
review, given the proximity of the 4.6 freeze.

I think I will transcribe the description of the legacy protocol from
xg_save_restore.h and code up the legacy protocol in python.

~Andrew


* Re: [PATCH 17/27] tools/libxl: Support converting a legacy stream to a v2 stream
  2015-06-16 14:38   ` Ian Campbell
@ 2015-06-16 15:13     ` Andrew Cooper
  2015-06-16 15:38       ` Ian Campbell
  0 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-16 15:13 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 16/06/15 15:38, Ian Campbell wrote:
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>> When a legacy stream is found, it needs to be converted to a v2 stream for the
>> reading logic.  This is done by exec()ing the python conversion utility.
>>
>> One complication is that the caller of this interface needs to assume
>> ownership of the output fd, to prevent it being closed while still in use in a
>> datacopier.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> ---
>>  tools/libxl/Makefile                |    1 +
>>  tools/libxl/libxl_convert_callout.c |  146 +++++++++++++++++++++++++++++++++++
>>  tools/libxl/libxl_internal.h        |   32 ++++++++
>>  3 files changed, 179 insertions(+)
>>  create mode 100644 tools/libxl/libxl_convert_callout.c
>>
>> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
>> index c71c5fe..ca0ae3e 100644
>> --- a/tools/libxl/Makefile
>> +++ b/tools/libxl/Makefile
>> @@ -96,6 +96,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
>>  			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
>>  			libxl_stream_read.o \
>>  			libxl_save_callout.o _libxl_save_msgs_callout.o \
>> +			libxl_convert_callout.o \
> Could we arrange for this to be x86 only, please (both here while
> compiling and at runtime)

Yes

>
>>  			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
>>  LIBXL_OBJS += libxl_genid.o
>>  LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
>> diff --git a/tools/libxl/libxl_convert_callout.c b/tools/libxl/libxl_convert_callout.c
>> new file mode 100644
>> index 0000000..9050bb9
>> --- /dev/null
>> +++ b/tools/libxl/libxl_convert_callout.c
>> +
>> +static void helper_failed(libxl__egc *egc,
>> +                          libxl__conversion_helper_state *chs, int rc);
>> +static void helper_exited(libxl__egc *egc, libxl__ev_child *ch,
>> +                          pid_t pid, int status);
>> +static void helper_done(libxl__egc *egc,
>> +                        libxl__conversion_helper_state *chs);
> A lot of this stuff looks a lot like the contents of
> libxl_save_callout.c, is there no scope for sharing any of it?

I don't believe so.  About the only thing they actually have in common
is that they are looking after a child process.  Starting, error and
termination conditions are all different, as well as ownership of
various bits of state.

>
> Since we only support N->N+1 we could perhaps tolerate the duplication
> if we agreed upon a reasonable schedule to remove all this compat stuff,
> e.g. in 4.7 or 4.8.

N->N+1 is another item on the "needs to be resolved before Xapi can use
libxl" list.

In XenServer, we still have support for importing VMs from XenServer
4.0, which is certainly older than Xen 3.2.

It might be reasonable to make a compile time option to omit the legacy
conversion, but conversion should absolutely be on by default for the
first release after this series being accepted.

~Andrew


* Re: [PATCH 19/27] tools/libxc+libxl+xl: Restore v2 streams
  2015-06-16 14:53   ` Ian Campbell
@ 2015-06-16 15:23     ` Andrew Cooper
  2015-06-16 15:39       ` Ian Campbell
  0 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-16 15:23 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 16/06/15 15:53, Ian Campbell wrote:
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>> @@ -377,6 +384,28 @@ static void record_body_done(libxl__egc *egc,
>>      stream_failed(egc, stream, ret);
>>  }
>>  
>> +void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
>> +                                   int ret, int retval, int errnoval)
>> +{
>> +    libxl__domain_create_state *dcs = dcs_void;
>> +    STATE_AO_GC(dcs->ao);
>> +
>> +    if (ret)
>> +        goto err;
>> +
>> +    if (retval) {
>> +        LOGEV(ERROR, errnoval, "restoring domain");
>> +        ret = ERROR_FAIL;
>> +        goto err;
>> +    }
>> +
>> +    libxl__stream_read_continue(egc, &dcs->srs);
> continue? Is this something to do with checkpointing?

No.  Simply "continue reading records from the stream".

The code has changed since the first iteration, where
libxl__stream_read_continue() was called from outside of this
translation unit.  As it currently stands, I should make it a static
function.

>
>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>> index 23f27d4..7418d92 100644
>> --- a/tools/libxl/libxl_types.idl
>> +++ b/tools/libxl/libxl_types.idl
>> @@ -346,6 +346,8 @@ libxl_domain_create_info = Struct("domain_create_info",[
>>  
>>  libxl_domain_restore_params = Struct("domain_restore_params", [
> At some point we will need a LIBXL_HAVE #define.
>
>>      ("checkpointed_stream", integer),
>> +    ("stream_version", uint32, {'init_val': '1'}),
> If we aren't going to go for an IDL enum rather than a uint32 you
> probably just want the bare integer 1.
>
> But, I suspect we would prefer an enum, i.e an explicit list of known
> versions, rather than an integer?

Ideally, this should match the version field in the libxl stream header
record.  Realistically, I never expect version 2 to need bumping to
version 3.

>
> I wonder when, if ever, we will be able to flip this to 2? I suppose
> whenever the legacy conversion stuff gets pulled out.
>
>> +    ("legacy_width", uint32),
> From what I've seen so far this is never user provided but is internal
> to libxl and detected[0] at runtime. As such it belongs somewhere else
> other than in the public API.
>
> [0] FVO "detected" == "hardcoded depending on the build arch"

The intention was originally to expose it to the user.  While this is
possible for `xl restore /path/to/legacy/save --was-32bit-toolstack`, it
is much harder for the `xl migrate` case where we cannot control the
sending side.

I am half tempted to say that anyone experiencing 32->64bit problems
should just use the conversion helper themselves.  It is deliberately
written to be usable on the command line as well as automatically.

~Andrew


* Re: [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
  2015-06-16 13:36     ` Andrew Cooper
  2015-06-16 13:47       ` Ian Jackson
@ 2015-06-16 15:24       ` Ian Campbell
  1 sibling, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 15:24 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Tue, 2015-06-16 at 14:36 +0100, Andrew Cooper wrote:
> On 16/06/15 14:21, Ian Campbell wrote:
> > On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> >> Shortly, libxl will be juggling multiple parallel operations, and will
> >> possibly have to take error decisions before some tasks have been set up.
> > It would be preferable, I think, to arrange to call libxl__ev_child_init
> > on all such libxl__ev_child structs either up front or certainly before
> > there is any possibility of needing to unwind them.
> >
> > Such an init would at worst correspond to exactly the place where the
> > zeroed structure you refer to is zeroed.
> 
> It is possible that one bit fails before it can be calculated whether
> the second bit needs to start or not.

You can call libxl__ev_child_init without needing to know that, i.e. you
can do them all at the point where you allocate/initialise the
containing struct.

> At the moment, all bits in libxl in this area do initialisation
> immediately before use; most bits are even initialised in the function
> which starts their actions.  Some bits are initialised differently
> depending on the path taken to get to the initialisation site. 
> 
> It would be non-trivial to initialise everything appropriately at the
> very start.

You don't need to fully init, just call libxl__ev_child_init in order to
arrange for correct behaviour from libxl__ev_child_inuse and friends.
Actually turning it into a useful child can stay where it is if it is
tricky to arrange for those things to happen at the same time.

Ian.


* Re: [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
  2015-06-16 14:05         ` Andrew Cooper
@ 2015-06-16 15:26           ` Ian Campbell
  0 siblings, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 15:26 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Tue, 2015-06-16 at 15:05 +0100, Andrew Cooper wrote:
> On 16/06/15 14:47, Ian Jackson wrote:
> > Andrew Cooper writes ("Re: [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children"):
> >> It is possible that one bit fails before it can be calculated whether
> >> the second bit needs to start or not.
> >>
> >> At the moment, all bits in libxl in this area do initialisation
> >> immediately before use; most bits are even initialised in the function
> >> which starts their actions.  Some bits are initialised differently
> >> depending on the path taken to get to the initialisation site. 
> > As a rule of thumb a function libxl__initiate_foo_ which takes a
> > libxl__foo_state* should do this initialisation for the whole
> > libxl__foo_state.
> >
> > I don't see why you can't do that.
> 
> The only example of libxl__initiate_foo_ is
> libxl__initiate_device_remove() which starts the first action involved
> with removing a device.
> 
> I will see what I can do, but there are areas of this code which can't
> have their initialisation brought any further forward.

I think you need to consider "make the struct into some known good
initial/default state" as something separate from "turn the struct into
a useful thing to achieve its goal". Only the first bit needs to move
IMHO (although if they both can that is nice too)
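
A toy model of that split, assuming a child record whose "idle" state is
pid == -1.  The mock_child type and its helpers are hypothetical stand-ins
and not the real libxl__ev_child code; the point is only that the cheap
"known good state" step can run long before the expensive "make it useful"
step.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Stand-in for a child-tracking struct; -1 is assumed to mean "idle". */
struct mock_child { pid_t pid; };

/* Cheap step: put the struct into a known idle state. */
static void mock_child_init(struct mock_child *ch)
{
    ch->pid = -1;
}

/* Valid to call at any point after mock_child_init(). */
static bool mock_child_inuse(const struct mock_child *ch)
{
    return ch->pid >= 0;
}

/* Expensive step: may happen much later, or not at all on error paths. */
static void mock_child_started(struct mock_child *ch, pid_t pid)
{
    ch->pid = pid;
}
```

In this model a merely zeroed struct (pid == 0) would read as "in use",
which is why the init call needs to happen before any error path can ask.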

Ian.


* Re: [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
  2015-06-16 14:57   ` Ian Campbell
@ 2015-06-16 15:28     ` Andrew Cooper
  0 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-16 15:28 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Ross Lagerwall, Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 16/06/15 15:57, Ian Campbell wrote:
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>> From: Ross Lagerwall <ross.lagerwall@citrix.com>
>>
>> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> CC: Ian Campbell <Ian.Campbell@citrix.com>
>> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
>> CC: Wei Liu <wei.liu2@citrix.com>
>> ---
>>  tools/libxl/Makefile             |    2 +-
>>  tools/libxl/libxl_internal.h     |   33 +++
>>  tools/libxl/libxl_stream_write.c |  536 ++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 570 insertions(+), 1 deletion(-)
>>  create mode 100644 tools/libxl/libxl_stream_write.c
>>
>> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
>> index ca0ae3e..63e32f7 100644
>> --- a/tools/libxl/Makefile
>> +++ b/tools/libxl/Makefile
>> @@ -94,7 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
>>  			libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
>>  			libxl_internal.o libxl_utils.o libxl_uuid.o \
>>  			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
>> -			libxl_stream_read.o \
>> +			libxl_stream_read.o libxl_stream_write.o \
>>  			libxl_save_callout.o _libxl_save_msgs_callout.o \
>>  			libxl_convert_callout.o \
>>  			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
>> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
>> index 5482950..82cd792 100644
>> --- a/tools/libxl/libxl_internal.h
>> +++ b/tools/libxl/libxl_internal.h
>> @@ -2868,6 +2868,38 @@ typedef void libxl__domain_suspend_cb(libxl__egc*,
>>  typedef void libxl__save_device_model_cb(libxl__egc*,
>>                                           libxl__domain_suspend_state*, int rc);
>>  
>> +/* State for writing a libxl migration v2 stream */
>> +typedef struct libxl__stream_write_state libxl__stream_write_state;
>> +
>> +struct libxl__stream_write_state {
>> +    /* filled by the user */
>> +    libxl__ao *ao;
>> +    int fd;
>> +    uint32_t domid;
>> +    void (*completion_callback)(libxl__egc *egc,
>> +                                libxl__domain_suspend_state *dss,
>> +                                int rc);
>> +    /* Private */
>> +    int rc;
>> +    int joined_rc;
>> +    size_t padding;
>> +    bool running;
>> +    libxl__datacopier_state dc;
>> +};
>> +
>> +_hidden void libxl__stream_write_start(libxl__egc *egc,
>> +                                       libxl__stream_write_state *stream);
>> +
>> +_hidden void libxl__stream_write_abort(libxl__egc *egc,
>> +                                       libxl__stream_write_state *stream,
>> +                                       int rc);
>> +
>> +static inline bool libxl__stream_write_inuse(
>> +    const libxl__stream_write_state *stream)
>> +{
>> +    return stream->running;
>> +}
>> +
>>  typedef struct libxl__logdirty_switch {
>>      const char *cmd;
>>      const char *cmd_path;
>> @@ -2907,6 +2939,7 @@ struct libxl__domain_suspend_state {
>>      /* private for libxl__domain_save_device_model */
>>      libxl__save_device_model_cb *save_dm_callback;
>>      libxl__datacopier_state save_dm_datacopier;
>> +    libxl__stream_write_state sws;
>>  };
>>  
>>
>> diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
>> new file mode 100644
>> index 0000000..856d72e
>> --- /dev/null
>> +++ b/tools/libxl/libxl_stream_write.c
>> @@ -0,0 +1,536 @@
>> +/*
>> + * Copyright (C) 2015      Citrix Ltd.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU Lesser General Public License as published
>> + * by the Free Software Foundation; version 2.1 only. with the special
>> + * exception on linking described in file LICENSE.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU Lesser General Public License for more details.
>> + */
>> +
>> +#include "libxl_osdeps.h" /* must come before any other headers */
>> +
>> +#include "libxl_internal.h"
>> +
>> +/*
>> + * Infrastructure for writing a domain to a libxl migration v2 stream.
>> + *
>> + * Entry points from outside:
>> + *  - libxl__stream_write_start()
>> + *     - Start writing a stream from the start.
>> + *
>> + * In normal operation, there are two tasks running at once; this stream
>> + * processing, and the the libxl-save-helper.  check_stream_finished() is used
> "the the".
>
>> + * to join all the tasks in both success and error cases.
>> + *
>> + * Nomenclature for event callbacks:
>> + *  - $FOO_done(): Completion callback for $FOO
>> + *  - write_$FOO(): Set up writing a $FOO
> Set up or actually write?

Set up the dc to write $FOO

The write doesn't actually happen until we get the dc callback.

>
>> +
>> +void libxl__stream_write_start(libxl__egc *egc,
>> +                               libxl__stream_write_state *stream)
>> +{
>> +    libxl__datacopier_state *dc = &stream->dc;
>> +    STATE_AO_GC(stream->ao);
>> +    struct libxl_sr_hdr hdr = { 0 };
>> +    int ret = 0;
>> +
>> +    assert(!stream->running);
>> +    stream->running = true;
>> +
>> +    memset(dc, 0, sizeof(*dc));
> Please use the _init()

_init() is already called for us.

>
>
>> +static void check_stream_finished(libxl__egc *egc,
>> +                                  libxl__domain_suspend_state *dss,
>> +                                  int rc, const char *what)
>> +{
>> +    libxl__stream_write_state *stream = &dss->sws;
>> +    STATE_AO_GC(dss->ao);
>> +
>> +    LOG(INFO, "Task '%s' joining (rc %d)", what, rc);
>> +
>> +    if (rc && !stream->joined_rc) {
>> +        bool skip = false;
>> +        /* First reported failure from joining tasks.  Tear everything down */
>> +        stream->joined_rc = rc;
>
> This (not just this, but a bunch of the preceding helpers) all looks
> rather familiar, can it be shared to some extent?

The types are all different, including the signature of the eventual
callback to use.  As COLO gets built on top, I expect these to diverge
further.
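
For readers following along, the joining logic under discussion can be
modelled roughly as follows.  This is a simplified sketch with hypothetical
types (mock_task, mock_join) and a synchronous abort; in the real code the
aborted task reports back asynchronously through its own callback.

```c
#include <assert.h>
#include <stdbool.h>

struct mock_task { bool running; };

struct mock_join {
    int joined_rc;              /* first failure reported, 0 if none */
    struct mock_task stream;    /* the stream processing task */
    struct mock_task helper;    /* the save helper task */
};

static bool task_inuse(const struct mock_task *t) { return t->running; }
static void task_abort(struct mock_task *t) { t->running = false; }

/*
 * Called when task `done` finishes with `rc`.  The first non-zero rc is
 * recorded and every task still in use is torn down.  Returns true once
 * all tasks have stopped, i.e. when the final callback may be invoked.
 */
static bool join_task(struct mock_join *j, struct mock_task *done, int rc)
{
    done->running = false;

    if (rc && !j->joined_rc) {
        j->joined_rc = rc;
        if (task_inuse(&j->stream))
            task_abort(&j->stream);
        if (task_inuse(&j->helper))
            task_abort(&j->helper);
    }

    return !task_inuse(&j->stream) && !task_inuse(&j->helper);
}
```

This also shows why inuse() is needed on its own: the join has to ask
whether a task is still running without necessarily aborting it.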

~Andrew


* Re: [PATCH 22/27] docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams
  2015-06-16 15:00   ` Ian Campbell
@ 2015-06-16 15:30     ` Andrew Cooper
  0 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-16 15:30 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 16/06/15 16:00, Ian Campbell wrote:
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>> In a remus senario, libxc will write a CHECKPOINT record, then hand ownership
> "scenario"
>
>> of the fd to libxl.  Libxl then writes any records required and finishes with
>> a CHECKPOINT_END record, then hands ownership of the fd back to libxc.
> Seems like a plausible scheme to me, if that's what the RFC was for.

The RFC was for all the "support remus" bits, where I wrote code but was
unable to test.  They will all be dropped for v2, once the suggested
adjustments are done.
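
A tiny sketch of the fd hand-off described in the commit message, modelled
as a two-state machine.  The enum names and values are purely illustrative
and are not the on-wire record type numbers.

```c
#include <assert.h>

/* Who currently owns the stream fd. */
enum fd_owner { OWNER_LIBXC, OWNER_LIBXL };

/* Illustrative record kinds only; not the real record type values. */
enum rec_kind { REC_OTHER, REC_CHECKPOINT, REC_CHECKPOINT_END };

/*
 * Ownership passes to libxl after libxc writes CHECKPOINT, and back to
 * libxc after libxl writes CHECKPOINT_END.  Any other record leaves the
 * owner unchanged.
 */
static enum fd_owner next_owner(enum fd_owner cur, enum rec_kind rec)
{
    if (cur == OWNER_LIBXC && rec == REC_CHECKPOINT)
        return OWNER_LIBXL;
    if (cur == OWNER_LIBXL && rec == REC_CHECKPOINT_END)
        return OWNER_LIBXC;
    return cur;
}
```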

~Andrew


* Re: [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
  2015-06-16 15:01     ` Andrew Cooper
@ 2015-06-16 15:35       ` Ian Campbell
  2015-06-16 15:46         ` Andrew Cooper
  0 siblings, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 15:35 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Ross Lagerwall, Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Tue, 2015-06-16 at 16:01 +0100, Andrew Cooper wrote:
> On 16/06/15 15:31, Ian Campbell wrote:
> > On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> >> From: Ross Lagerwall <ross.lagerwall@citrix.com>
> >>
> >> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> >> CC: Ian Campbell <Ian.Campbell@citrix.com>
> >> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> >> CC: Wei Liu <wei.liu2@citrix.com>
> > Overall looks good, I've got some comments below and I think it almost
> > certainly wants eyes from Ian who knows more about the dc infra etc.
> >
> >> +void libxl__stream_read_start(libxl__egc *egc,
> >> +                              libxl__stream_read_state *stream)
> >> +{
> >> +    libxl__datacopier_state *dc = &stream->dc;
> >> +    int ret = 0;
> >> +
> >> +    /* State initialisation. */
> >> +    assert(!stream->running);
> >> +
> >> +    memset(dc, 0, sizeof(*dc));
> > libxl__datacopier_init, please
> 
> That call is made by libxl__datacopier_start() each and every time, and
> unlike here, is matched with an equivalent _kill() call.

Hrm, I think I'd best defer to Ian J on what the right way to deal with
a dc being set up and reused in this way is.

> >> +    /* Start reading the stream header. */
> >> +    dc->readwhat = "stream header";
> >> +    dc->readbuf = &stream->hdr;
> >> +    stream->expected_len = dc->bytes_to_read = sizeof(stream->hdr);
> >> +    dc->used = 0;
> >> +    dc->callback = stream_header_done;
> > This pattern of resetting and reinitialising the dc occurs in multiple
> > places, I think a helper would be in order, some sort of
> > stream_next_record_init or something perhaps?
> 
> The only feasible helper would have to take everything as parameters; 
> there is insufficient similarity between all users.
> 
> I dunno whether that would be harder to read...

I was more concerned about ensuring everyone does everything (especially
if a new field gets added), having a function with parameters would
cause a compile failure when the addition was (hopefully) propagated to
that function and added as a parameter.
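
Something along these lines, sketched with a stand-in struct (mock_dc and
setup_next_read() are hypothetical names, and the field list is trimmed to
the ones set in the quoted hunk).  Every field is set in one place, so a
newly added field becomes a new parameter and every caller fails to compile
until it is updated.

```c
#include <assert.h>
#include <stddef.h>

struct mock_dc;
typedef void mock_dc_callback(struct mock_dc *dc, int onwrite, int errnoval);

/* Stand-in for the datacopier state; only the fields reset per record. */
struct mock_dc {
    const char *readwhat;
    void *readbuf;
    size_t bytes_to_read;
    size_t used;
    mock_dc_callback *callback;
};

/*
 * Prepare the dc for the next read into a local buffer.  Returns the
 * length the caller should record as its expected_len.
 */
static size_t setup_next_read(struct mock_dc *dc, const char *readwhat,
                              void *buf, size_t len, mock_dc_callback *cb)
{
    dc->readwhat = readwhat;
    dc->readbuf = buf;
    dc->bytes_to_read = len;
    dc->used = 0;
    dc->callback = cb;
    return len;
}
```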

Let's see what Ian thinks.

> 
> >
> >> +void libxl__stream_read_abort(libxl__egc *egc,
> >> +                              libxl__stream_read_state *stream, int rc)
> >> +{
> >> +    stream_failed(egc, stream, rc);
> >> +}
> >> +
> >> +static void stream_success(libxl__egc *egc, libxl__stream_read_state *stream)
> >> +{
> >> +    stream->rc = 0;
> >> +    stream->running = false;
> >> +
> >> +    stream_done(egc, stream);
> > Push the running = false into stream_done and flip the assert there?
> > Logically the stream is still running until it is done, so having done
> > assert it isn't running seems counter-intuitive.
> 
> This is more for peace of mind.  stream_done() may strictly only ever be
> called once, hence its assert.

assert(stream->running);
stream->running = false;

in stream_done() gives the same peace of mind, doesn't it?
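
Spelled out as a minimal sketch, using a hypothetical cut-down struct in
place of libxl__stream_read_state (the demo_* names and the done_calls
counter exist only for this illustration):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, cut-down stand-in for libxl__stream_read_state. */
struct demo_stream {
    bool running;
    int rc;
    int done_calls;   /* illustration only: counts completions */
};

/* The stream counts as running until it is done; a second call asserts. */
static void demo_stream_done(struct demo_stream *s)
{
    assert(s->running);
    s->running = false;
    s->done_calls++;
}

static void demo_stream_success(struct demo_stream *s)
{
    s->rc = 0;
    demo_stream_done(s);
}

static void demo_stream_failed(struct demo_stream *s, int rc)
{
    assert(rc);
    s->rc = rc;

    /* A failure reported after completion must not complete twice. */
    if (s->running)
        demo_stream_done(s);
}
```

The "called exactly once" property is still enforced, but the flag is
only cleared in one place.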

> >> +    if (onwrite || dc->used != stream->expected_len) {
> >> +        ret = ERROR_FAIL;
> >> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> >> +            onwrite, errnoval, stream->expected_len, dc->used);
> >> +        goto err;
> >> +    }
> > I think you need to check errnoval == 0 in the !onwrite case, otherwise
> > you may miss a read error?
> 
> "dc->used != stream->expected_len" covers all possible read errors, in
> the "something went wrong" kind of way.

With the current implementation, perhaps, but the doc doesn't seem to
say you can rely on it (an appropriate reaction might be to change the
doc rather than the code, I don't mind).

> > Sharing the dc between all these differing usages is starting to rankle a
> > little, but I think it is necessary because it may have queued data from
> > a previous read which was larger than the current record, correct?
> >
> > Hrm, isn't setting dc->used = 0 on each reset potentially throwing some
> > stuff away?
> 
> We should never be in a case where we are setting up a new read/write
> from the dc with any previous IO pending.

Essentially because dc uses its idea of bytes remaining to do the read,
the scenario I was imagining only comes about if the dc is reading large
blocks and segmenting them later, which isn't how it is used here.

> >> +    if (dc->writefd == -1) {
> >> +        ret = ERROR_FAIL;
> >> +        LOGE(ERROR, "Unable to open '%s'", path);
> >> +        goto err;
> >> +    }
> >> +    dc->maxsz = dc->bytes_to_read = rec_hdr->length - sizeof(*emu_hdr);
> >> +    stream->expected_len = dc->used = 0;
> > expecting 0? This differs from the pattern common everywhere else and
> > I'm not sure why.
> 
> The datacopier has been overloaded so many times, it is messy to use.
> 
> In this case, we are splicing from read fd to a write fd, rather than to
> a local buffer.
> 
> Therefore, when the IO is complete, we expect 0 bytes in the local
> buffer, as all data should end up in the fd.

I think using 2 or more data copiers to cover the different
configurations might help? You can still reuse one for the normal record
processing but a separate dedicated one for writing the emu to a file
might iron out a wrinkle.

> >
> >> +    memset(dc, 0, sizeof(*dc));
> >> +    dc->ao = stream->ao;
> >> +    dc->readfd = stream->fd;
> >> +    dc->writefd = -1;
> >> +
> >> +    /* Do we need to eat some padding out of the stream? */
> > Why only now and not for e.g. the xenstore stuff (which doesn't appear
> > to be explicitly padded).
> 
> Any record which is read into a local buffer has the local buffer
> aligned up, and the padding read onto the end.

OK.

> > And given that why not handle this in some central place rather than in
> > the emulator only place?
> 
> Experimentally, some versions of Qemu barf if they have trailing zeros
> in save file.  I think they expect to find eof() on a qemu record boundary.

What I was suggesting was to do the padding in the core, where it would
often be a zero nop, but would save mistakes (or duplication) if some
other record also needs such handling in the future.
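
As a rough sketch of what a central helper could compute (assuming an
8-byte record alignment, i.e. an align order of 3; the demo_* names are
invented for this example and the mask arithmetic mirrors the usual
ROUNDUP pattern rather than quoting libxl's macro):

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_REC_ALIGN_ORDER 3u   /* assumption: records are 8-byte aligned */

/* Round len up to the next multiple of (1 << order). */
static size_t demo_roundup(size_t len, unsigned order)
{
    size_t mask = ((size_t)1 << order) - 1;

    return (len + mask) & ~mask;
}

/* Padding bytes to consume (or emit) after a record body of body_len. */
static size_t demo_record_padding(size_t body_len)
{
    size_t pad = demo_roundup(body_len, DEMO_REC_ALIGN_ORDER) - body_len;

    assert(pad < ((size_t)1 << DEMO_REC_ALIGN_ORDER));
    return pad;
}
```

For records already read into an aligned buffer the result is simply 0,
so calling it unconditionally costs nothing.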

Ian.

* Re: [PATCH 17/27] tools/libxl: Support converting a legacy stream to a v2 stream
  2015-06-16 15:13     ` Andrew Cooper
@ 2015-06-16 15:38       ` Ian Campbell
  0 siblings, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 15:38 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Tue, 2015-06-16 at 16:13 +0100, Andrew Cooper wrote:

> > Since we only support N->N+1 we could perhaps tolerate the duplication
> > if we agreed upon a reasonable schedule to remove all this compat stuff,
> > e.g. in 4.7 or 4.8.
> 
> N->N+1 is another item on the "needs to be resolved before Xapi can use
> libxl" list.
> 
> In XenServer, we still have support for importing VMs from XenServer
> 4.0, which is certainly older than Xen 3.2

OK. FWIW I think we should consider importing (i.e. restore) separate to
migration, since the former can be subjected to more heavy duty offline
conversions if necessary.

It's not out of the question that this new migration stream format will
allow us to sensibly increase the +1 to some larger number for
migrations too.

Ian.

> It might be reasonable to make a compile time option to omit the legacy
> conversion, but conversion should absolutely be on by default for the
> first release after this series being accepted.
> 
> ~Andrew

* Re: [PATCH 19/27] tools/libxc+libxl+xl: Restore v2 streams
  2015-06-16 15:23     ` Andrew Cooper
@ 2015-06-16 15:39       ` Ian Campbell
  0 siblings, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-06-16 15:39 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Tue, 2015-06-16 at 16:23 +0100, Andrew Cooper wrote:
> I am half tempted to say that anyone experiencing 32->64bit problems
> should just use the conversion helper themselves.  It is deliberately
> written to be usable on the command line as well as automatically.

Yes, I think that's a good plan.

Ian.

* Re: [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
  2015-06-16 15:35       ` Ian Campbell
@ 2015-06-16 15:46         ` Andrew Cooper
  0 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-16 15:46 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Ross Lagerwall, Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 16/06/15 16:35, Ian Campbell wrote:
>
>>>> +    if (dc->writefd == -1) {
>>>> +        ret = ERROR_FAIL;
>>>> +        LOGE(ERROR, "Unable to open '%s'", path);
>>>> +        goto err;
>>>> +    }
>>>> +    dc->maxsz = dc->bytes_to_read = rec_hdr->length - sizeof(*emu_hdr);
>>>> +    stream->expected_len = dc->used = 0;
>>> expecting 0? This differs from the pattern common everywhere else and
>>> I'm not sure why.
>> The datacopier has been overloaded so many times, it is messy to use.
>>
>> In this case, we are splicing from read fd to a write fd, rather than to
>> a local buffer.
>>
>> Therefore, when the IO is complete, we expect 0 bytes in the local
>> buffer, as all data should end up in the fd.
> I think using 2 or more data copiers to cover the different
> configurations might help? You can still reuse one for the normal record
> processing but a separate dedicated one for writing the emu to a file
> might iron out a wrinkle.

I specifically do not want to risk setting two dc's running at the same
time with the same readfd.

As all of this code is reading from a single readfd, I have specifically
avoided having multiple dc structures lying around.

>
>>> And given that why not handle this in some central place rather than in
>>> the emulator only place?
>> Experimentally, some versions of Qemu barf if they have trailing zeros
>> in save file.  I think they expect to find eof() on a qemu record boundary.
> What I was suggesting was to do the padding in the core, where it would
> often be a zero nop, but would save mistakes (or duplication) if some
> other record also needs such handling in the future.

I don't see an easy way of doing that, given the current divergence in
setting the dcs up in the first place.

~Andrew

* Re: [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream
  2015-06-16 15:03   ` Ian Campbell
@ 2015-06-16 15:53     ` Andrew Cooper
  2015-06-17  7:30       ` Ian Campbell
  0 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-16 15:53 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 16/06/15 16:03, Ian Campbell wrote:
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>> when signalled to do so by libxl__remus_domain_checkpoint_callback()
> I think I saw that Remus wasn't currently working, so I'll let you and
> Hongyang thrash something out before I spend too much effort reviewing
> these last few RFC bits. Unless you think it is worth my having a look
> now?
>
>

Remus was broken by patch 19 in the series, and this patch forms part of
fixing it again.

I can't find a way of fixing the layering violation in both plain
migration and Remus, in a readable, bisectable way.

Remus requires identical source and destination toolstacks, and the
Remus maintainers are happy enough with the "break it and fix it up in
the same series" approach.

Now that the series is complete, there is some shuffling room to reduce
the window of breakage, but short of folding patches 19, 21, 23-25
together, Remus will break.

~Andrew

* Re: [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
  2015-06-15 13:44 ` [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream Andrew Cooper
  2015-06-16 14:57   ` Ian Campbell
@ 2015-06-17  1:31   ` Yang Hongyang
  2015-06-17  9:51     ` Andrew Cooper
  2015-06-17  1:39   ` Wen Congyang
                     ` (4 subsequent siblings)
  6 siblings, 1 reply; 107+ messages in thread
From: Yang Hongyang @ 2015-06-17  1:31 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Wei Liu, Ian Jackson, Ian Campbell, Ross Lagerwall



On 06/15/2015 09:44 PM, Andrew Cooper wrote:
[...]
> +
> +static void write_emulator_record(libxl__egc *egc,
> +                                  libxl__stream_write_state *stream)
> +{
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_EMULATOR_CONTEXT, 0 };
> +    struct libxl_sr_emulator_hdr ehdr = { 0 };
> +    struct stat st;
> +    int ret = 0;
> +    uint32_t qemu_state_len;
> +
> +    assert(dss->type == LIBXL_DOMAIN_TYPE_HVM);
> +
> +    /* Convenience aliases */
> +    const char *const filename = dss->dm_savefile;
> +    const uint32_t domid = dss->domid;
> +
> +    switch(libxl__device_model_version_running(gc, domid)) {
> +    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
> +        ehdr.id = EMULATOR_QEMU_TRADITIONAL;
> +        break;
> +
> +    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
> +        ehdr.id = EMULATOR_QEMU_UPSTREAM;
> +        break;
> +
> +    default:
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    ret = libxl__domain_suspend_device_model(gc, dss);

This call is no longer needed: the suspend callback has already called
this function, and the emulator context has already been saved to a file.

Under Remus, this call stops the primary's emulator.  The postcopy
callback resumes the primary, so we shouldn't suspend the device model
again in the checkpoint callback.
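
One possible shape for avoiding the second suspend, as a sketch only.
The dm_state_saved flag and all demo_* names are hypothetical; nothing
like this exists in the patch, and the real fix may simply be to drop
the call:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, cut-down stand-in for libxl__domain_suspend_state. */
struct demo_dss {
    bool dm_state_saved;   /* hypothetical: set once the save file exists */
    int suspend_calls;     /* illustration only */
};

/* Stand-in for the device model suspend done by the suspend callback. */
static int demo_suspend_device_model(struct demo_dss *dss)
{
    assert(dss);
    dss->suspend_calls++;
    dss->dm_state_saved = true;
    return 0;
}

/* Record-writing path: reuse the saved state rather than suspend again. */
static int demo_prepare_emulator_record(struct demo_dss *dss)
{
    assert(dss);
    if (dss->dm_state_saved)
        return 0;

    return demo_suspend_device_model(dss);
}
```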

> +    if (ret)
> +        goto err;
> +
> +    dc->readwhat = GCSPRINTF("qemu save file %s", filename);
> +    dc->copywhat = "emulator record";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = emulator_body_done;
> +
> +    dc->readfd = open(filename, O_RDONLY);
> +    if (dc->readfd < 0) {
> +        LOGE(ERROR, "unable to open %s", dc->readwhat);
> +        goto err;
> +    }
> +
> +    if (fstat(dc->readfd, &st))
> +    {
> +        LOGE(ERROR, "unable to fstat %s", dc->readwhat);
> +        goto err;
> +    }
> +
> +    if (!S_ISREG(st.st_mode)) {
> +        LOG(ERROR, "%s is not a plain file!", dc->readwhat);
> +        goto err;
> +    }
> +
> +    qemu_state_len = st.st_size;
> +    rec.length = qemu_state_len + sizeof(ehdr);
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    libxl__datacopier_prefixdata(egc, dc, &ehdr, sizeof(ehdr));
> +
> +    stream->padding = ROUNDUP(qemu_state_len, REC_ALIGN_ORDER) - qemu_state_len;
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    dc->readwhat = "";
> +    dc->readfd = -1;
> +
> +    if (stream->padding) {
> +        assert(stream->padding < (1U << REC_ALIGN_ORDER));
> +
> +        dc->copywhat = "emulator padding";
> +        dc->writewhat = "save/migration stream";
> +        dc->callback = emulator_padding_done;
> +
> +        ret = libxl__datacopier_start(dc);
> +        if (ret)
> +            goto err;
> +
> +        libxl__datacopier_prefixdata(egc, dc, zero_padding, stream->padding);
> +        return;
> +    }
> +
> +    emulator_padding_done(egc, dc, 0, 0);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_padding_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    write_end_record(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void write_end_record(libxl__egc *egc,
> +                             libxl__stream_write_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_END, 0 };
> +    int ret = 0;
> +
> +    dc->copywhat = "suspend footer";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = end_record_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void end_record_done(libxl__egc *egc,
> +                            libxl__datacopier_state *dc,
> +                            int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    stream_success(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
>

-- 
Thanks,
Yang.

* Re: [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
  2015-06-15 13:44 ` [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream Andrew Cooper
  2015-06-16 14:57   ` Ian Campbell
  2015-06-17  1:31   ` Yang Hongyang
@ 2015-06-17  1:39   ` Wen Congyang
  2015-06-17  2:24   ` Wen Congyang
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 107+ messages in thread
From: Wen Congyang @ 2015-06-17  1:39 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Ian Jackson, Yang Hongyang, Wei Liu, Ian Campbell, Ross Lagerwall

On 06/15/2015 09:44 PM, Andrew Cooper wrote:
> From: Ross Lagerwall <ross.lagerwall@citrix.com>
> 
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/Makefile             |    2 +-
>  tools/libxl/libxl_internal.h     |   33 +++
>  tools/libxl/libxl_stream_write.c |  536 ++++++++++++++++++++++++++++++++++++++
>  3 files changed, 570 insertions(+), 1 deletion(-)
>  create mode 100644 tools/libxl/libxl_stream_write.c
> 
> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
> index ca0ae3e..63e32f7 100644
> --- a/tools/libxl/Makefile
> +++ b/tools/libxl/Makefile
> @@ -94,7 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
>  			libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
>  			libxl_internal.o libxl_utils.o libxl_uuid.o \
>  			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
> -			libxl_stream_read.o \
> +			libxl_stream_read.o libxl_stream_write.o \
>  			libxl_save_callout.o _libxl_save_msgs_callout.o \
>  			libxl_convert_callout.o \
>  			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 5482950..82cd792 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -2868,6 +2868,38 @@ typedef void libxl__domain_suspend_cb(libxl__egc*,
>  typedef void libxl__save_device_model_cb(libxl__egc*,
>                                           libxl__domain_suspend_state*, int rc);
>  
> +/* State for writing a libxl migration v2 stream */
> +typedef struct libxl__stream_write_state libxl__stream_write_state;
> +
> +struct libxl__stream_write_state {
> +    /* filled by the user */
> +    libxl__ao *ao;
> +    int fd;
> +    uint32_t domid;
> +    void (*completion_callback)(libxl__egc *egc,
> +                                libxl__domain_suspend_state *dss,
> +                                int rc);
> +    /* Private */
> +    int rc;
> +    int joined_rc;
> +    size_t padding;
> +    bool running;
> +    libxl__datacopier_state dc;
> +};
> +
> +_hidden void libxl__stream_write_start(libxl__egc *egc,
> +                                       libxl__stream_write_state *stream);
> +
> +_hidden void libxl__stream_write_abort(libxl__egc *egc,
> +                                       libxl__stream_write_state *stream,
> +                                       int rc);
> +
> +static inline bool libxl__stream_write_inuse(
> +    const libxl__stream_write_state *stream)
> +{
> +    return stream->running;
> +}
> +
>  typedef struct libxl__logdirty_switch {
>      const char *cmd;
>      const char *cmd_path;
> @@ -2907,6 +2939,7 @@ struct libxl__domain_suspend_state {
>      /* private for libxl__domain_save_device_model */
>      libxl__save_device_model_cb *save_dm_callback;
>      libxl__datacopier_state save_dm_datacopier;
> +    libxl__stream_write_state sws;
>  };
>  
>  
> diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
> new file mode 100644
> index 0000000..856d72e
> --- /dev/null
> +++ b/tools/libxl/libxl_stream_write.c
> @@ -0,0 +1,536 @@
> +/*
> + * Copyright (C) 2015      Citrix Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + */
> +
> +#include "libxl_osdeps.h" /* must come before any other headers */
> +
> +#include "libxl_internal.h"
> +
> +/*
> + * Infrastructure for writing a domain to a libxl migration v2 stream.
> + *
> + * Entry points from outside:
> + *  - libxl__stream_write_start()
> + *     - Start writing a stream from the start.
> + *
> + * In normal operation, there are two tasks running at once; this stream
> + * processing, and the libxl-save-helper.  check_stream_finished() is used
> + * to join all the tasks in both success and error cases.
> + *
> + * Nomenclature for event callbacks:
> + *  - $FOO_done(): Completion callback for $FOO
> + *  - write_$FOO(): Set up writing a $FOO
> + *  - $BAR_header(): A $BAR record header only
> + *  - $BAR_record(): A complete $BAR record with header and content
> + *
> + * The main loop for a plain VM writes:
> + *  - Stream header
> + *  - Libxc record
> + *  - Toolstack record
> + *  - if (hvm), Qemu record
> + *  - End record
> + */
> +
> +static const uint8_t zero_padding[1U << REC_ALIGN_ORDER] = { 0 };
> +
> +static void stream_success(libxl__egc *egc,
> +                           libxl__stream_write_state *stream);
> +static void stream_failed(libxl__egc *egc,
> +                          libxl__stream_write_state *stream, int ret);
> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_write_state *stream);
> +
> +static void check_stream_finished(libxl__egc *egc,
> +                                  libxl__domain_suspend_state *dcs,
> +                                  int rc, const char *what);
> +
> +/* Event callbacks for plain VM. */
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void libxc_header_done(libxl__egc *egc,
> +                              libxl__datacopier_state *dc,
> +                              int onwrite, int errnoval);
> +/* libxl__xc_domain_save_done() lives here, event-order wise. */
> +static void write_toolstack_record(libxl__egc *egc,
> +                                   libxl__stream_write_state *stream);
> +static void toolstack_record_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval);
> +static void write_emulator_record(libxl__egc *egc,
> +                                  libxl__stream_write_state *stream);
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void emulator_padding_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval);
> +static void write_end_record(libxl__egc *egc,
> +                             libxl__stream_write_state *stream);
> +static void end_record_done(libxl__egc *egc,
> +                            libxl__datacopier_state *dc,
> +                            int onwrite, int errnoval);
> +
> +void libxl__stream_write_start(libxl__egc *egc,
> +                               libxl__stream_write_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_hdr hdr = { 0 };
> +    int ret = 0;
> +
> +    assert(!stream->running);
> +    stream->running = true;
> +
> +    memset(dc, 0, sizeof(*dc));
> +    dc->readwhat = "";
> +    dc->copywhat = "suspend header";
> +    dc->writewhat = "save/migration stream";
> +    dc->ao = ao;
> +    dc->readfd = -1;
> +    dc->writefd = stream->fd;
> +    dc->maxsz = INT_MAX;
> +    dc->bytes_to_read = INT_MAX;
> +    dc->callback = stream_header_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    hdr.ident   = htobe64(RESTORE_STREAM_IDENT);
> +    hdr.version = htobe32(RESTORE_STREAM_VERSION);
> +    hdr.options = htobe32(0);
> +
> +    libxl__datacopier_prefixdata(egc, dc, &hdr, sizeof(hdr));
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +void libxl__stream_write_abort(libxl__egc *egc,
> +                               libxl__stream_write_state *stream, int rc)
> +{
> +    stream_failed(egc, stream, rc);
> +}
> +
> +static void stream_success(libxl__egc *egc, libxl__stream_write_state *stream)
> +{
> +    stream->rc = 0;
> +    stream->running = false;
> +
> +    stream_done(egc, stream);
> +}
> +
> +static void stream_failed(libxl__egc *egc,
> +                          libxl__stream_write_state *stream, int rc)
> +{
> +    assert(rc);
> +    stream->rc = rc;
> +
> +    if (stream->running) {
> +        stream->running = false;
> +        stream_done(egc, stream);
> +    }
> +}
> +
> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_write_state *stream)
> +{
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +
> +    assert(!stream->running);
> +
> +    check_stream_finished(egc, dss, stream->rc, "stream");
> +}
> +
> +static void check_stream_finished(libxl__egc *egc,
> +                                  libxl__domain_suspend_state *dss,
> +                                  int rc, const char *what)
> +{
> +    libxl__stream_write_state *stream = &dss->sws;
> +    STATE_AO_GC(dss->ao);
> +
> +    LOG(INFO, "Task '%s' joining (rc %d)", what, rc);
> +
> +    if (rc && !stream->joined_rc) {
> +        bool skip = false;
> +        /* First reported failure from joining tasks.  Tear everything down */
> +        stream->joined_rc = rc;
> +
> +        if (libxl__stream_write_inuse(&dss->sws)) {
> +            skip = true;
> +            libxl__stream_write_abort(egc, &dss->sws, rc);
> +        }
> +
> +        if (libxl__save_helper_inuse(&dss->shs)) {
> +            skip = true;
> +            libxl__save_helper_abort(egc, &dss->shs);
> +        }
> +
> +        /* There is at least one more active task to join - wait for its
> +           callback */
> +        if ( skip )
> +            return;
> +    }
> +
> +    if (libxl__stream_write_inuse(&dss->sws))
> +        LOG(DEBUG, "stream still in use");
> +    else if (libxl__save_helper_inuse(&dss->shs))
> +        LOG(DEBUG, "save/restore still in use");
> +    else {
> +        LOG(INFO, "Join complete: result %d", stream->joined_rc);
> +        stream->completion_callback(egc, dss, stream->joined_rc);
> +    }
> +}
> +
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_LIBXC_CONTEXT, 0 };
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    dc->copywhat = "suspend footer";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = libxc_header_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void libxc_header_done(libxl__egc *egc,
> +                              libxl__datacopier_state *dc,
> +                              int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    libxl__xc_domain_save(egc, dss);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void __attribute__((used))
> +will_be_libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
> +                                int rc, int retval, int errnoval)
> +{
> +    libxl__domain_suspend_state *dss = dss_void;
> +    libxl__stream_write_state *stream = &dss->sws;
> +    STATE_AO_GC(dss->ao);
> +
> +    if (rc)
> +        goto err;
> +
> +    if (retval) {
> +        LOGEV(ERROR, errnoval, "saving domain: %s",
> +                         dss->guest_responded ?
> +                         "domain responded to suspend request" :
> +                         "domain did not respond to suspend request");
> +        if ( !dss->guest_responded )
> +            rc = ERROR_GUEST_TIMEDOUT;
> +        else
> +            rc = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    write_toolstack_record(egc, stream);
> +    return;
> +
> + err:
> +    assert(rc);
> +    check_stream_finished(egc, dss, rc, "save/restore helper");
> +}
> +
> +static void write_toolstack_record(libxl__egc *egc,
> +                                   libxl__stream_write_state *stream)
> +{
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_XENSTORE_DATA, 0 };
> +    int ret = 0;
> +    uint8_t *toolstack_buf = NULL; /* We must free this. */
> +    uint32_t toolstack_len, padding_len;
> +
> +    ret = libxl__toolstack_save(dss->domid, &toolstack_buf,
> +                                &toolstack_len, dss);
> +    if (ret)
> +        goto err;
> +
> +    dc->copywhat = "toolstack record";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = toolstack_record_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    rec.length = toolstack_len;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    libxl__datacopier_prefixdata(egc, dc, toolstack_buf, toolstack_len);
> +
> +    padding_len = ROUNDUP(rec.length, REC_ALIGN_ORDER) - rec.length;
> +    if (padding_len)
> +        libxl__datacopier_prefixdata(egc, dc, zero_padding, padding_len);
> +
> +    free(toolstack_buf);
> +    return;
> +
> + err:
> +    assert(ret);
> +    free(toolstack_buf);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void toolstack_record_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    if (dss->type == LIBXL_DOMAIN_TYPE_HVM)
> +        write_emulator_record(egc, stream);
> +    else
> +        write_end_record(egc, stream);
> +
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void write_emulator_record(libxl__egc *egc,
> +                                  libxl__stream_write_state *stream)
> +{
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_EMULATOR_CONTEXT, 0 };
> +    struct libxl_sr_emulator_hdr ehdr = { 0 };
> +    struct stat st;
> +    int ret = 0;
> +    uint32_t qemu_state_len;
> +
> +    assert(dss->type == LIBXL_DOMAIN_TYPE_HVM);
> +
> +    /* Convenience aliases */
> +    const char *const filename = dss->dm_savefile;
> +    const uint32_t domid = dss->domid;
> +
> +    switch(libxl__device_model_version_running(gc, domid)) {
> +    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
> +        ehdr.id = EMULATOR_QEMU_TRADITIONAL;
> +        break;
> +
> +    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
> +        ehdr.id = EMULATOR_QEMU_UPSTREAM;
> +        break;
> +
> +    default:
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    ret = libxl__domain_suspend_device_model(gc, dss);

We have already called this function when suspending the guest. For plain
migration, calling it twice is OK, but for Remus it causes a problem: the
guest is running, and its device model will be stopped by this function.
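
A rough sketch of one way to avoid the double call (not a proposed patch):
remember whether the device model has already been stopped for the current
checkpoint, and skip the second call if so. The struct, the dm_suspended
flag and all toy_* names below are hypothetical and do not exist in libxl;
the real fix may look quite different.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant bits of the suspend state. */
struct toy_suspend_state {
    bool dm_suspended;   /* assumption: set once the device model is stopped */
    int  suspend_calls;  /* counts how often the device model was stopped */
};

/* Stand-in for the real "stop the device model" operation. */
static int toy_suspend_device_model(struct toy_suspend_state *s)
{
    s->suspend_calls++;
    s->dm_suspended = true;
    return 0;
}

/* Stop the device model only if nobody did so for this checkpoint. */
static int toy_suspend_device_model_once(struct toy_suspend_state *s)
{
    if (s->dm_suspended)
        return 0;
    return toy_suspend_device_model(s);
}

/* Called when the guest (and its device model) is resumed. */
static void toy_resume(struct toy_suspend_state *s)
{
    s->dm_suspended = false;
}
```

With something like this, the stream writer could call the "once" variant
unconditionally, and the Remus path would not stop the device model a second
time while the guest is running.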

Thanks
Wen Congyang

> +    if (ret)
> +        goto err;
> +
> +    dc->readwhat = GCSPRINTF("qemu save file %s", filename);
> +    dc->copywhat = "emulator record";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = emulator_body_done;
> +
> +    dc->readfd = open(filename, O_RDONLY);
> +    if (dc->readfd < 0) {
> +        LOGE(ERROR, "unable to open %s", dc->readwhat);
> +        goto err;
> +    }
> +
> +    if (fstat(dc->readfd, &st))
> +    {
> +        LOGE(ERROR, "unable to fstat %s", dc->readwhat);
> +        goto err;
> +    }
> +
> +    if (!S_ISREG(st.st_mode)) {
> +        LOG(ERROR, "%s is not a plain file!", dc->readwhat);
> +        goto err;
> +    }
> +
> +    qemu_state_len = st.st_size;
> +    rec.length = qemu_state_len + sizeof(ehdr);
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    libxl__datacopier_prefixdata(egc, dc, &ehdr, sizeof(ehdr));
> +
> +    stream->padding = ROUNDUP(qemu_state_len, REC_ALIGN_ORDER) - qemu_state_len;
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    dc->readwhat = "";
> +    dc->readfd = -1;
> +
> +    if (stream->padding) {
> +        assert(stream->padding < (1U << REC_ALIGN_ORDER));
> +
> +        dc->copywhat = "emulator padding";
> +        dc->writewhat = "save/migration stream";
> +        dc->callback = emulator_padding_done;
> +
> +        ret = libxl__datacopier_start(dc);
> +        if (ret)
> +            goto err;
> +
> +        libxl__datacopier_prefixdata(egc, dc, zero_padding, stream->padding);
> +        return;
> +    }
> +
> +    emulator_padding_done(egc, dc, 0, 0);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_padding_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    write_end_record(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void write_end_record(libxl__egc *egc,
> +                             libxl__stream_write_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_END, 0 };
> +    int ret = 0;
> +
> +    dc->copywhat = "suspend footer";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = end_record_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void end_record_done(libxl__egc *egc,
> +                            libxl__datacopier_state *dc,
> +                            int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    stream_success(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> 

* Re: [PATCH 00/27]  Libxl migration v2
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (27 preceding siblings ...)
  2015-06-16  2:21 ` [PATCH 00/27] Libxl migration v2 Yang Hongyang
@ 2015-06-17  1:55 ` Wen Congyang
  2015-06-17  9:45   ` Andrew Cooper
  2015-07-02  7:33 ` Yang Hongyang
  29 siblings, 1 reply; 107+ messages in thread
From: Wen Congyang @ 2015-06-17  1:55 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Ian Jackson, Yang Hongyang, Wei Liu, Ian Campbell

On 06/15/2015 09:44 PM, Andrew Cooper wrote:
> This series adds support for the libxl migration v2 stream, and untangles the
> existing layering violations of the toolstack and qemu records.
> 
> At the end of the series, legacy migration is no longer used.
> 
> Note: Remus support is broken and (RFC) fixed in separate patches in this
> series.  It was too tangled to fix in a bisectable fashion.  Plain
> suspend/migrate/resume however is (should be) bisectable along the entire
> series.
> 
> There are a couple of outstanding questions:
> 
> 1) What to do about the toolstack/xenstore record.  It is currently being
>    passed around as a blob, but it might be better to split it out.
> 
> 2) What (if any) ABI/API qualifications are needed? (Particularly in reference
>    to patch 21)
> 
> The Remus code is untested by me, but is hopefully in the correct ballpark.
> All other combinations of suspend/migrate/resume have been tested with PV and
> HVM guests (qemu-trad and qemu-upstream), including 32 -> 64 bit migration
> (which was the underlying bug causing us to write migration v2 in the first
> place).
> 
> There are some further improvements which could be made.  In particular, it
> appears that sending the toolstack record on each checkpoint is redundant, and
> there is certainly room for some more pruning of the legacy migration code.

Do you mean that libxl__toolstack_save() is harmless and can be called while
the guest is running?

Thanks
Wen Congyang

> 
> Anyway, thoughts/comments welcome.  Please test!
> 
> ~Andrew
> 
> 
> Andrew Cooper (22):
>   tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
>   tools/libxc: Always compile the compat qemu variables into xc_sr_context
>   tools/libxl: Stash all restore parameters in domain_create_state
>   tools/xl: Mandatory flag indicating the format of the migration stream
>   tools/libxl: Introduce ROUNDUP()
>   tools/libxl: Extra APIs for the save helper
>   tools/libxl: Pass restore_fd as a parameter to libxl__xc_domain_restore()
>   docs: Libxl migration v2 stream specification
>   tools/python: Libxc migration v2 infrastructure
>   tools/python: Libxl migration v2 infrastructure
>   tools/python: Verification utility for v2 stream spec compliance
>   tools/python: Conversion utility for legacy migration streams
>   tools/libxl: Support converting a legacy stream to a v2 stream
>   tools/libxl: Convert a legacy stream if needed
>   tools/libxc+libxl+xl: Restore v2 streams
>   tools/libxc+libxl+xl: Save v2 streams
>   docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams
>   tools/libxl: [RFC] Write checkpoint records into the stream
>   tools/libx{c,l}: [RFC] Introduce restore_callbacks.checkpoint()
>   tools/libxl: [RFC] Handle checkpoint records in a libxl migration v2 stream
>   tools/libxc: Drop all XG_LIBXL_HVM_COMPAT code from libxc
>   tools/libxl: Drop all knowledge of toolstack callbacks
> 
> Ian Jackson (2):
>   libxl: cancellation: Preparations for save/restore cancellation
>   libxl: cancellation: Handle SIGTERM in save/restore helper
> 
> Ross Lagerwall (3):
>   tools/libxl: Migration v2 stream format
>   tools/libxl: Infrastructure for reading a libxl migration v2 stream
>   tools/libxl: Infrastructure for writing a v2 stream
> 
>  docs/specs/libxl-migration-stream.pandoc      |  218 ++++++++
>  tools/libxc/Makefile                          |    2 -
>  tools/libxc/include/xenguest.h                |    3 +
>  tools/libxc/xc_sr_common.h                    |    5 -
>  tools/libxc/xc_sr_restore.c                   |   33 +-
>  tools/libxc/xc_sr_restore_x86_hvm.c           |  124 -----
>  tools/libxc/xc_sr_save_x86_hvm.c              |   36 --
>  tools/libxl/Makefile                          |    2 +
>  tools/libxl/libxl_aoutils.c                   |    7 +
>  tools/libxl/libxl_convert_callout.c           |  146 ++++++
>  tools/libxl/libxl_create.c                    |   80 +--
>  tools/libxl/libxl_dom.c                       |   61 +--
>  tools/libxl/libxl_internal.h                  |  140 ++++-
>  tools/libxl/libxl_save_callout.c              |   63 +--
>  tools/libxl/libxl_save_helper.c               |   95 ++--
>  tools/libxl/libxl_save_msgs_gen.pl            |    9 +-
>  tools/libxl/libxl_sr_stream_format.h          |   58 +++
>  tools/libxl/libxl_stream_read.c               |  663 ++++++++++++++++++++++++
>  tools/libxl/libxl_stream_write.c              |  640 +++++++++++++++++++++++
>  tools/libxl/libxl_types.idl                   |    2 +
>  tools/libxl/xl_cmdimpl.c                      |    9 +-
>  tools/python/Makefile                         |    4 +
>  tools/python/scripts/convert-legacy-stream.py |  683 +++++++++++++++++++++++++
>  tools/python/scripts/verify-stream-v2.py      |  174 +++++++
>  tools/python/setup.py                         |    1 +
>  tools/python/xen/migration/libxc.py           |  446 ++++++++++++++++
>  tools/python/xen/migration/libxl.py           |  199 +++++++
>  tools/python/xen/migration/tests.py           |   54 ++
>  tools/python/xen/migration/verify.py          |   37 ++
>  29 files changed, 3638 insertions(+), 356 deletions(-)
>  create mode 100644 docs/specs/libxl-migration-stream.pandoc
>  create mode 100644 tools/libxl/libxl_convert_callout.c
>  create mode 100644 tools/libxl/libxl_sr_stream_format.h
>  create mode 100644 tools/libxl/libxl_stream_read.c
>  create mode 100644 tools/libxl/libxl_stream_write.c
>  create mode 100755 tools/python/scripts/convert-legacy-stream.py
>  create mode 100755 tools/python/scripts/verify-stream-v2.py
>  create mode 100644 tools/python/xen/migration/__init__.py
>  create mode 100644 tools/python/xen/migration/libxc.py
>  create mode 100644 tools/python/xen/migration/libxl.py
>  create mode 100644 tools/python/xen/migration/tests.py
>  create mode 100644 tools/python/xen/migration/verify.py
> 

* Re: [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
  2015-06-15 13:44 ` [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream Andrew Cooper
                     ` (2 preceding siblings ...)
  2015-06-17  1:39   ` Wen Congyang
@ 2015-06-17  2:24   ` Wen Congyang
  2015-06-17  7:38   ` Yang Hongyang
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 107+ messages in thread
From: Wen Congyang @ 2015-06-17  2:24 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Ian Jackson, Yang Hongyang, Wei Liu, Ian Campbell, Ross Lagerwall

On 06/15/2015 09:44 PM, Andrew Cooper wrote:
> From: Ross Lagerwall <ross.lagerwall@citrix.com>
> 
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/Makefile             |    2 +-
>  tools/libxl/libxl_internal.h     |   33 +++
>  tools/libxl/libxl_stream_write.c |  536 ++++++++++++++++++++++++++++++++++++++
>  3 files changed, 570 insertions(+), 1 deletion(-)
>  create mode 100644 tools/libxl/libxl_stream_write.c
> 
> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
> index ca0ae3e..63e32f7 100644
> --- a/tools/libxl/Makefile
> +++ b/tools/libxl/Makefile
> @@ -94,7 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
>  			libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
>  			libxl_internal.o libxl_utils.o libxl_uuid.o \
>  			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
> -			libxl_stream_read.o \
> +			libxl_stream_read.o libxl_stream_write.o \
>  			libxl_save_callout.o _libxl_save_msgs_callout.o \
>  			libxl_convert_callout.o \
>  			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 5482950..82cd792 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -2868,6 +2868,38 @@ typedef void libxl__domain_suspend_cb(libxl__egc*,
>  typedef void libxl__save_device_model_cb(libxl__egc*,
>                                           libxl__domain_suspend_state*, int rc);
>  
> +/* State for writing a libxl migration v2 stream */
> +typedef struct libxl__stream_write_state libxl__stream_write_state;
> +
> +struct libxl__stream_write_state {
> +    /* filled by the user */
> +    libxl__ao *ao;
> +    int fd;
> +    uint32_t domid;
> +    void (*completion_callback)(libxl__egc *egc,
> +                                libxl__domain_suspend_state *dss,
> +                                int rc);
> +    /* Private */
> +    int rc;
> +    int joined_rc;
> +    size_t padding;
> +    bool running;
> +    libxl__datacopier_state dc;
> +};
> +
> +_hidden void libxl__stream_write_start(libxl__egc *egc,
> +                                       libxl__stream_write_state *stream);
> +
> +_hidden void libxl__stream_write_abort(libxl__egc *egc,
> +                                       libxl__stream_write_state *stream,
> +                                       int rc);
> +
> +static inline bool libxl__stream_write_inuse(
> +    const libxl__stream_write_state *stream)
> +{
> +    return stream->running;
> +}
> +
>  typedef struct libxl__logdirty_switch {
>      const char *cmd;
>      const char *cmd_path;
> @@ -2907,6 +2939,7 @@ struct libxl__domain_suspend_state {
>      /* private for libxl__domain_save_device_model */
>      libxl__save_device_model_cb *save_dm_callback;
>      libxl__datacopier_state save_dm_datacopier;
> +    libxl__stream_write_state sws;
>  };
>  
>  
> diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
> new file mode 100644
> index 0000000..856d72e
> --- /dev/null
> +++ b/tools/libxl/libxl_stream_write.c
> @@ -0,0 +1,536 @@
> +/*
> + * Copyright (C) 2015      Citrix Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + */
> +
> +#include "libxl_osdeps.h" /* must come before any other headers */
> +
> +#include "libxl_internal.h"
> +
> +/*
> + * Infrastructure for writing a domain to a libxl migration v2 stream.
> + *
> + * Entry points from outside:
> + *  - libxl__stream_write_start()
> + *     - Start writing a stream from the start.
> + *
> + * In normal operation, there are two tasks running at once; this stream
> + * processing, and the libxl-save-helper.  check_stream_finished() is used
> + * to join all the tasks in both success and error cases.
> + *
> + * Nomenclature for event callbacks:
> + *  - $FOO_done(): Completion callback for $FOO
> + *  - write_$FOO(): Set up writing a $FOO
> + *  - $BAR_header(): A $BAR record header only
> + *  - $BAR_record(): A complete $BAR record with header and content
> + *
> + * The main loop for a plain VM writes:
> + *  - Stream header
> + *  - Libxc record
> + *  - Toolstack record
> + *  - if (hvm), Qemu record
> + *  - End record
> + */
> +
> +static const uint8_t zero_padding[1U << REC_ALIGN_ORDER] = { 0 };
> +
> +static void stream_success(libxl__egc *egc,
> +                           libxl__stream_write_state *stream);
> +static void stream_failed(libxl__egc *egc,
> +                          libxl__stream_write_state *stream, int ret);
> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_write_state *stream);
> +
> +static void check_stream_finished(libxl__egc *egc,
> +                                  libxl__domain_suspend_state *dcs,
> +                                  int rc, const char *what);
> +
> +/* Event callbacks for plain VM. */
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void libxc_header_done(libxl__egc *egc,
> +                              libxl__datacopier_state *dc,
> +                              int onwrite, int errnoval);
> +/* libxl__xc_domain_save_done() lives here, event-order wise. */
> +static void write_toolstack_record(libxl__egc *egc,
> +                                   libxl__stream_write_state *stream);
> +static void toolstack_record_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval);
> +static void write_emulator_record(libxl__egc *egc,
> +                                  libxl__stream_write_state *stream);
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void emulator_padding_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval);
> +static void write_end_record(libxl__egc *egc,
> +                             libxl__stream_write_state *stream);
> +static void end_record_done(libxl__egc *egc,
> +                            libxl__datacopier_state *dc,
> +                            int onwrite, int errnoval);
> +
> +void libxl__stream_write_start(libxl__egc *egc,
> +                               libxl__stream_write_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_hdr hdr = { 0 };
> +    int ret = 0;
> +
> +    assert(!stream->running);
> +    stream->running = true;
> +
> +    memset(dc, 0, sizeof(*dc));
> +    dc->readwhat = "";
> +    dc->copywhat = "suspend header";
> +    dc->writewhat = "save/migration stream";
> +    dc->ao = ao;
> +    dc->readfd = -1;
> +    dc->writefd = stream->fd;
> +    dc->maxsz = INT_MAX;
> +    dc->bytes_to_read = INT_MAX;
> +    dc->callback = stream_header_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    hdr.ident   = htobe64(RESTORE_STREAM_IDENT);
> +    hdr.version = htobe32(RESTORE_STREAM_VERSION);
> +    hdr.options = htobe32(0);
> +
> +    libxl__datacopier_prefixdata(egc, dc, &hdr, sizeof(hdr));
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +void libxl__stream_write_abort(libxl__egc *egc,
> +                               libxl__stream_write_state *stream, int rc)
> +{
> +    stream_failed(egc, stream, rc);
> +}
> +
> +static void stream_success(libxl__egc *egc, libxl__stream_write_state *stream)
> +{
> +    stream->rc = 0;
> +    stream->running = false;
> +
> +    stream_done(egc, stream);
> +}
> +
> +static void stream_failed(libxl__egc *egc,
> +                          libxl__stream_write_state *stream, int rc)
> +{
> +    assert(rc);
> +    stream->rc = rc;
> +
> +    if (stream->running) {
> +        stream->running = false;
> +        stream_done(egc, stream);
> +    }
> +}
> +
> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_write_state *stream)
> +{
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +
> +    assert(!stream->running);
> +
> +    check_stream_finished(egc, dss, stream->rc, "stream");
> +}
> +
> +static void check_stream_finished(libxl__egc *egc,
> +                                  libxl__domain_suspend_state *dss,
> +                                  int rc, const char *what)
> +{
> +    libxl__stream_write_state *stream = &dss->sws;
> +    STATE_AO_GC(dss->ao);
> +
> +    LOG(INFO, "Task '%s' joining (rc %d)", what, rc);
> +
> +    if (rc && !stream->joined_rc) {
> +        bool skip = false;
> +        /* First reported failure from joining tasks.  Tear everything down */
> +        stream->joined_rc = rc;
> +
> +        if (libxl__stream_write_inuse(&dss->sws)) {
> +            skip = true;
> +            libxl__stream_write_abort(egc, &dss->sws, rc);
> +        }
> +
> +        if (libxl__save_helper_inuse(&dss->shs)) {
> +            skip = true;
> +            libxl__save_helper_abort(egc, &dss->shs);
> +        }
> +
> +        /* There is at least one more active task to join - wait for its
> +           callback */
> +        if ( skip )
> +            return;
> +    }
> +
> +    if (libxl__stream_write_inuse(&dss->sws))
> +        LOG(DEBUG, "stream still in use");
> +    else if (libxl__save_helper_inuse(&dss->shs))
> +        LOG(DEBUG, "save/restore still in use");
> +    else {
> +        LOG(INFO, "Join complete: result %d", stream->joined_rc);
> +        stream->completion_callback(egc, dss, stream->joined_rc);
> +    }
> +}
> +
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_LIBXC_CONTEXT, 0 };
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    dc->copywhat = "suspend footer";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = libxc_header_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void libxc_header_done(libxl__egc *egc,
> +                              libxl__datacopier_state *dc,
> +                              int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    libxl__xc_domain_save(egc, dss);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void __attribute__((used))
> +will_be_libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
> +                                int rc, int retval, int errnoval)
> +{
> +    libxl__domain_suspend_state *dss = dss_void;
> +    libxl__stream_write_state *stream = &dss->sws;
> +    STATE_AO_GC(dss->ao);
> +
> +    if (rc)
> +        goto err;
> +
> +    if (retval) {
> +        LOGEV(ERROR, errnoval, "saving domain: %s",
> +                         dss->guest_responded ?
> +                         "domain responded to suspend request" :
> +                         "domain did not respond to suspend request");
> +        if ( !dss->guest_responded )
> +            rc = ERROR_GUEST_TIMEDOUT;
> +        else
> +            rc = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    write_toolstack_record(egc, stream);
> +    return;
> +
> + err:
> +    assert(rc);
> +    check_stream_finished(egc, dss, rc, "save/restore helper");
> +}
> +
> +static void write_toolstack_record(libxl__egc *egc,
> +                                   libxl__stream_write_state *stream)
> +{
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_XENSTORE_DATA, 0 };
> +    int ret = 0;
> +    uint8_t *toolstack_buf = NULL; /* We must free this. */
> +    uint32_t toolstack_len, padding_len;
> +
> +    ret = libxl__toolstack_save(dss->domid, &toolstack_buf,
> +                                &toolstack_len, dss);
> +    if (ret)
> +        goto err;
> +
> +    dc->copywhat = "toolstack record";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = toolstack_record_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    rec.length = toolstack_len;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    libxl__datacopier_prefixdata(egc, dc, toolstack_buf, toolstack_len);
> +
> +    padding_len = ROUNDUP(rec.length, REC_ALIGN_ORDER) - rec.length;
> +    if (padding_len)
> +        libxl__datacopier_prefixdata(egc, dc, zero_padding, padding_len);
> +
> +    free(toolstack_buf);
> +    return;
> +
> + err:
> +    assert(ret);
> +    free(toolstack_buf);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void toolstack_record_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    if (dss->type == LIBXL_DOMAIN_TYPE_HVM)
> +        write_emulator_record(egc, stream);
> +    else
> +        write_end_record(egc, stream);
> +
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void write_emulator_record(libxl__egc *egc,
> +                                  libxl__stream_write_state *stream)
> +{
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_EMULATOR_CONTEXT, 0 };
> +    struct libxl_sr_emulator_hdr ehdr = { 0 };
> +    struct stat st;
> +    int ret = 0;
> +    uint32_t qemu_state_len;
> +
> +    assert(dss->type == LIBXL_DOMAIN_TYPE_HVM);
> +
> +    /* Convenience aliases */
> +    const char *const filename = dss->dm_savefile;
> +    const uint32_t domid = dss->domid;
> +
> +    switch(libxl__device_model_version_running(gc, domid)) {
> +    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
> +        ehdr.id = EMULATOR_QEMU_TRADITIONAL;
> +        break;
> +
> +    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
> +        ehdr.id = EMULATOR_QEMU_UPSTREAM;
> +        break;
> +
> +    default:
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    ret = libxl__domain_suspend_device_model(gc, dss);
> +    if (ret)
> +        goto err;
> +
> +    dc->readwhat = GCSPRINTF("qemu save file %s", filename);
> +    dc->copywhat = "emulator record";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = emulator_body_done;
> +
> +    dc->readfd = open(filename, O_RDONLY);
> +    if (dc->readfd < 0) {
> +        LOGE(ERROR, "unable to open %s", dc->readwhat);
> +        goto err;
> +    }
> +
> +    if (fstat(dc->readfd, &st))
> +    {
> +        LOGE(ERROR, "unable to fstat %s", dc->readwhat);
> +        goto err;
> +    }
> +
> +    if (!S_ISREG(st.st_mode)) {
> +        LOG(ERROR, "%s is not a plain file!", dc->readwhat);
> +        goto err;
> +    }
> +
> +    qemu_state_len = st.st_size;
> +    rec.length = qemu_state_len + sizeof(ehdr);
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    libxl__datacopier_prefixdata(egc, dc, &ehdr, sizeof(ehdr));
> +
> +    stream->padding = ROUNDUP(qemu_state_len, REC_ALIGN_ORDER) - qemu_state_len;
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;

I think you forgot to close dc->readfd here.

Thanks
Wen Congyang

> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    dc->readwhat = "";
> +    dc->readfd = -1;
> +
> +    if (stream->padding) {
> +        assert(stream->padding < (1U << REC_ALIGN_ORDER));
> +
> +        dc->copywhat = "emulator padding";
> +        dc->writewhat = "save/migration stream";
> +        dc->callback = emulator_padding_done;
> +
> +        ret = libxl__datacopier_start(dc);
> +        if (ret)
> +            goto err;
> +
> +        libxl__datacopier_prefixdata(egc, dc, zero_padding, stream->padding);
> +        return;
> +    }
> +
> +    emulator_padding_done(egc, dc, 0, 0);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_padding_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    write_end_record(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void write_end_record(libxl__egc *egc,
> +                             libxl__stream_write_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_END, 0 };
> +    int ret = 0;
> +
> +    dc->copywhat = "suspend footer";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = end_record_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void end_record_done(libxl__egc *egc,
> +                            libxl__datacopier_state *dc,
> +                            int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    stream_success(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> 


* Re: [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
  2015-06-15 13:44 ` [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream Andrew Cooper
  2015-06-16 14:31   ` Ian Campbell
@ 2015-06-17  3:09   ` Wen Congyang
  2015-06-17 10:15     ` Ian Campbell
  2015-06-17  6:03   ` Wen Congyang
  2015-06-17  7:57   ` Wen Congyang
  3 siblings, 1 reply; 107+ messages in thread
From: Wen Congyang @ 2015-06-17  3:09 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Ian Jackson, Yang Hongyang, Wei Liu, Ian Campbell, Ross Lagerwall

On 06/15/2015 09:44 PM, Andrew Cooper wrote:
> From: Ross Lagerwall <ross.lagerwall@citrix.com>
> 
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/Makefile            |    1 +
>  tools/libxl/libxl_internal.h    |   39 ++++
>  tools/libxl/libxl_stream_read.c |  485 +++++++++++++++++++++++++++++++++++++++
>  3 files changed, 525 insertions(+)
>  create mode 100644 tools/libxl/libxl_stream_read.c
> 
> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
> index cc9c152..c71c5fe 100644
> --- a/tools/libxl/Makefile
> +++ b/tools/libxl/Makefile
> @@ -94,6 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
>  			libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
>  			libxl_internal.o libxl_utils.o libxl_uuid.o \
>  			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
> +			libxl_stream_read.o \
>  			libxl_save_callout.o _libxl_save_msgs_callout.o \
>  			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
>  LIBXL_OBJS += libxl_genid.o
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 101994f..4f33cb8 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -19,6 +19,8 @@
>  
>  #include "libxl_osdeps.h" /* must come before any other headers */
>  
> +#include "libxl_sr_stream_format.h"
> +
>  #include <assert.h>
>  #include <dirent.h>
>  #include <errno.h>
> @@ -3121,6 +3123,42 @@ typedef void libxl__domain_create_cb(libxl__egc *egc,
>                                       libxl__domain_create_state*,
>                                       int rc, uint32_t domid);
>  
> +/* State for manipulating a libxl migration v2 stream */
> +typedef struct libxl__stream_read_state libxl__stream_read_state;
> +
> +struct libxl__stream_read_state {
> +    /* filled by the user */
> +    libxl__ao *ao;
> +    int fd;
> +    void (*completion_callback)(libxl__egc *egc,
> +                                libxl__domain_create_state *dcs,
> +                                int rc);
> +    /* Private */
> +    int rc;
> +    bool running;
> +    libxl__datacopier_state dc;
> +    size_t expected_len;
> +    libxl_sr_hdr hdr;
> +    libxl_sr_rec_hdr rec_hdr;
> +    void *rec_body;
> +};
> +
> +_hidden void libxl__stream_read_start(libxl__egc *egc,
> +                                      libxl__stream_read_state *stream);
> +
> +_hidden void libxl__stream_read_continue(libxl__egc *egc,
> +                                         libxl__stream_read_state *stream);
> +
> +_hidden void libxl__stream_read_abort(libxl__egc *egc,
> +                                      libxl__stream_read_state *stream, int rc);
> +
> +static inline bool libxl__stream_read_inuse(
> +    const libxl__stream_read_state *stream)
> +{
> +    return stream->running;
> +}
> +
> +
>  struct libxl__domain_create_state {
>      /* filled in by user */
>      libxl__ao *ao;
> @@ -3137,6 +3175,7 @@ struct libxl__domain_create_state {
>      libxl__stub_dm_spawn_state dmss;
>          /* If we're not doing stubdom, we use only dmss.dm,
>           * for the non-stubdom device model. */
> +    libxl__stream_read_state srs;
>      libxl__save_helper_state shs;
>      /* necessary if the domain creation failed and we have to destroy it */
>      libxl__domain_destroy_state dds;
> diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
> new file mode 100644
> index 0000000..9cdaadf
> --- /dev/null
> +++ b/tools/libxl/libxl_stream_read.c
> @@ -0,0 +1,485 @@
> +/*
> + * Copyright (C) 2015      Citrix Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + */
> +
> +#include "libxl_osdeps.h" /* must come before any other headers */
> +
> +#include "libxl_internal.h"
> +
> +/*
> + * Infrastructure for reading and acting on the contents of a libxl migration
> + * stream. There are a lot of moving parts here.
> + *
> + * Entry points from outside:
> + *  - libxl__stream_read_start()
> + *     - Set up reading a stream from the start.
> + *
> + *  - libxl__stream_read_continue()
> + *     - Set up reading the next record from a started stream.
> + *
> + * The principal loop functionality involves reading the stream header, then
> + * reading a record at a time and acting upon it.  It follows the callbacks:
> + *
> + *  - stream_header_done()
> + *  - record_header_done()
> + *  - record_body_done()
> + *  - process_record()
> + *
> + * process_record() will choose the correct next action based upon the
> + * record.  Upon completion of the action, the next record header will be read
> + * from the stream.
> + */
> +
> +static void stream_success(libxl__egc *egc,
> +                           libxl__stream_read_state *stream);
> +static void stream_failed(libxl__egc *egc,
> +                          libxl__stream_read_state *stream, int rc);
> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_read_state *stream);
> +
> +/* Event callbacks for main reading loop. */
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void record_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void record_body_done(libxl__egc *egc,
> +                             libxl__datacopier_state *dc,
> +                             int onwrite, int errnoval);
> +static void process_record(libxl__egc *egc,
> +                           libxl__stream_read_state *stream);
> +
> +/* Mini-event loop for splicing an emulator record out of the stream. */
> +static void read_emulator_body(libxl__egc *egc,
> +                               libxl__stream_read_state *stream);
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void emulator_padding_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval);
> +
> +void libxl__stream_read_start(libxl__egc *egc,
> +                              libxl__stream_read_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    int ret = 0;
> +
> +    /* State initialisation. */
> +    assert(!stream->running);
> +
> +    memset(dc, 0, sizeof(*dc));
> +    dc->ao = stream->ao;
> +    dc->readfd = stream->fd;
> +    dc->writefd = -1;
> +
> +    /* Start reading the stream header. */
> +    dc->readwhat = "stream header";
> +    dc->readbuf = &stream->hdr;
> +    stream->expected_len = dc->bytes_to_read = sizeof(stream->hdr);
> +    dc->used = 0;
> +    dc->callback = stream_header_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    stream->running = true;
> +    assert(!ret);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +void libxl__stream_read_continue(libxl__egc *egc,
> +                                 libxl__stream_read_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    int ret = 0;
> +
> +    assert(stream->running);
> +
> +    /* Read a record header. */
> +    dc->readwhat = "record header";
> +    dc->readbuf = &stream->rec_hdr;
> +    stream->expected_len = dc->bytes_to_read = sizeof(stream->rec_hdr);
> +    dc->used = 0;
> +    dc->callback = record_header_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    assert(!ret);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +void libxl__stream_read_abort(libxl__egc *egc,
> +                              libxl__stream_read_state *stream, int rc)
> +{
> +    stream_failed(egc, stream, rc);
> +}
> +
> +static void stream_success(libxl__egc *egc, libxl__stream_read_state *stream)
> +{
> +    stream->rc = 0;
> +    stream->running = false;
> +
> +    stream_done(egc, stream);
> +}
> +
> +static void stream_failed(libxl__egc *egc,
> +                          libxl__stream_read_state *stream, int rc)
> +{
> +    assert(rc);
> +    stream->rc = rc;
> +
> +    if (stream->running) {
> +        stream->running = false;
> +        stream_done(egc, stream);
> +    }
> +}
> +
> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_read_state *stream)
> +{
> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
> +
> +    assert(!stream->running);
> +
> +    stream->completion_callback(egc, dcs, stream->rc);
> +}
> +
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl_sr_hdr *hdr = &stream->hdr;
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }
> +
> +    hdr->ident   = be64toh(hdr->ident);
> +    hdr->version = be32toh(hdr->version);
> +    hdr->options = be32toh(hdr->options);
> +
> +    if (hdr->ident != RESTORE_STREAM_IDENT) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR,
> +            "Invalid ident: expected 0x%016"PRIx64", got 0x%016"PRIx64,
> +            RESTORE_STREAM_IDENT, hdr->ident);
> +        goto err;
> +    }
> +    if (hdr->version != RESTORE_STREAM_VERSION) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "Unexpected Version: expected %u, got %u",
> +            RESTORE_STREAM_VERSION, hdr->version);
> +        goto err;
> +    }
> +    if (hdr->options & RESTORE_OPT_BIG_ENDIAN) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "Unable to handle big endian streams");
> +        goto err;

I think it would be better to check whether the host is big endian, rather
than rejecting every big-endian stream.  The source and target should have
the same endianness.

Thanks
Wen Congyang
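
As a rough illustration of the check suggested above, a reader could compare
the stream's endianness flag with the host's byte order instead of rejecting
every big-endian stream.  The sketch below is standalone C, not libxl code:
host_is_big_endian(), stream_endianness_matches() and SKETCH_OPT_BIG_ENDIAN
(including its bit value) are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag value; the real definition lives in libxl_sr_stream_format.h. */
#define SKETCH_OPT_BIG_ENDIAN (1u << 0)

/* Returns true when the machine running this code stores integers big endian. */
bool host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;

    /* The first byte in memory is 0x01 on big endian, 0x02 on little endian. */
    return *(const unsigned char *)&probe == 0x01;
}

/* Returns true when the stream's endianness flag agrees with the given host. */
bool stream_endianness_matches(uint32_t options, bool host_big_endian)
{
    bool stream_big_endian = (options & SKETCH_OPT_BIG_ENDIAN) != 0;

    return stream_big_endian == host_big_endian;
}
```

In stream_header_done() this would amount to failing only when the two
disagree.  What to do with a mismatched stream (for example byte swapping) is
outside the scope of this sketch.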

> +    }
> +
> +    LOG(INFO, "Stream v%u%s", hdr->version,
> +        hdr->options & RESTORE_OPT_LEGACY ? " (from legacy)" : "");
> +
> +    libxl__stream_read_continue(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void record_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }
> +
> +    assert(stream->rec_body == NULL);
> +
> +    /* No body? Process straight away. */
> +    if (rec_hdr->length == 0) {
> +        process_record(egc, stream);
> +        return;
> +    }
> +
> +    /* Queue up reading the body. */
> +    size_t bytes_to_read;
> +
> +    switch (rec_hdr->type) {
> +        /*
> +         * Emulator records want to retain the blob in the pipe, for a further
> +         * datacopier call to move elsewhere.  Just read the emulator header.
> +         */
> +    case REC_TYPE_EMULATOR_CONTEXT:
> +        bytes_to_read = sizeof(struct libxl_sr_emulator_hdr);
> +        break;
> +
> +    default:
> +        bytes_to_read = rec_hdr->length;
> +        break;
> +    }
> +
> +    bytes_to_read = ROUNDUP(bytes_to_read, REC_ALIGN_ORDER);
> +
> +    dc->readwhat = "record body";
> +    stream->rec_body = dc->readbuf = libxl__malloc(NOGC, bytes_to_read);
> +    stream->expected_len = dc->bytes_to_read = bytes_to_read;
> +    dc->used = 0;
> +    dc->callback = record_body_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void record_body_done(libxl__egc *egc,
> +                             libxl__datacopier_state *dc,
> +                             int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +
> +        free(stream->rec_body);
> +        stream->rec_body = dc->readbuf = NULL;
> +
> +        goto err;
> +    }
> +
> +    process_record(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void process_record(libxl__egc *egc,
> +                           libxl__stream_read_state *stream)
> +{
> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    LOG(DEBUG, "Record: 0x%08x, length %u", rec_hdr->type, rec_hdr->length);
> +
> +    switch (rec_hdr->type) {
> +
> +    case REC_TYPE_END:
> +        /* Handled later, after cleanup. */
> +        break;
> +
> +    case REC_TYPE_XENSTORE_DATA:
> +        ret = libxl__toolstack_restore(dcs->guest_domid, stream->rec_body,
> +                                       rec_hdr->length, &dcs->shs);
> +        if (ret)
> +            goto err;
> +
> +        /*
> +         * libxl__toolstack_restore() is a synchronous function.  Manually
> +         * start looking for the next record.
> +         */
> +        libxl__stream_read_continue(egc, &dcs->srs);
> +        break;
> +
> +    case REC_TYPE_EMULATOR_CONTEXT:
> +        read_emulator_body(egc, stream);
> +        break;
> +
> +    default:
> +        LOG(ERROR, "Unrecognised record 0x%08x", rec_hdr->type);
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    assert(!ret);
> +    if (rec_hdr->length) {
> +        free(stream->rec_body);
> +        stream->rec_body = NULL;
> +    }
> +
> +    if (rec_hdr->type == REC_TYPE_END)
> +        stream_success(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    if (rec_hdr->length) {
> +        free(stream->rec_body);
> +        stream->rec_body = NULL;
> +    }
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void read_emulator_body(libxl__egc *egc,
> +                               libxl__stream_read_state *stream)
> +{
> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
> +    libxl__datacopier_state *dc = &stream->dc;
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    libxl_sr_emulator_hdr *emu_hdr = stream->rec_body;
> +    STATE_AO_GC(stream->ao);
> +    char path[256];
> +    int ret = 0;
> +
> +    sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE".%u", dcs->guest_domid);
> +
> +    dc->readwhat = "save/migration stream";
> +    dc->copywhat = "emulator context";
> +    dc->writewhat = "qemu save file";
> +    dc->readbuf = NULL;
> +    dc->writefd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
> +    if (dc->writefd == -1) {
> +        ret = ERROR_FAIL;
> +        LOGE(ERROR, "Unable to open '%s'", path);
> +        goto err;
> +    }
> +    dc->maxsz = dc->bytes_to_read = rec_hdr->length - sizeof(*emu_hdr);
> +    stream->expected_len = dc->used = 0;
> +    dc->callback = emulator_body_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    /* Safe to be static, as it is a write-only discard buffer. */
> +    static char padding[1U << REC_ALIGN_ORDER];
> +
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    STATE_AO_GC(dc->ao);
> +    unsigned int nr_padding_bytes = (1U << REC_ALIGN_ORDER);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }
> +
> +    /* Undo modifications for splicing the emulator context. */
> +    memset(dc, 0, sizeof(*dc));
> +    dc->ao = stream->ao;
> +    dc->readfd = stream->fd;
> +    dc->writefd = -1;
> +
> +    /* Do we need to eat some padding out of the stream? */
> +    if (rec_hdr->length & (nr_padding_bytes - 1)) {
> +        unsigned int bytes_to_discard =
> +            nr_padding_bytes - (rec_hdr->length & (nr_padding_bytes - 1));
> +
> +        dc->readwhat = "padding bytes";
> +        dc->readbuf = padding;
> +        stream->expected_len = dc->bytes_to_read = bytes_to_discard;
> +        dc->used = 0;
> +        dc->callback = emulator_padding_done;
> +
> +        ret = libxl__datacopier_start(dc);
> +        if (ret)
> +            goto err;
> +    }
> +    else
> +    {
> +        stream->expected_len = dc->bytes_to_read = 0;
> +        dc->used = 0;
> +
> +        emulator_padding_done(egc, dc, 0, 0);
> +    }
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_padding_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }
> +
> +    libxl__stream_read_continue(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> 
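
For illustration, the padding arithmetic used by record_header_done() and
emulator_body_done() in the patch quoted above can be sketched in isolation.
This is a minimal standalone sketch, not libxl code: sketch_roundup(),
sketch_padding_bytes() and SKETCH_ALIGN_ORDER are hypothetical names, and the
value 3 (8-octet alignment) is an assumption about REC_ALIGN_ORDER.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed alignment order: record bodies are padded to 1 << 3 = 8 octets. */
#define SKETCH_ALIGN_ORDER 3u

/*
 * Round len up to the next multiple of (1 << order).
 * Assumes len + mask does not overflow size_t.
 */
size_t sketch_roundup(size_t len, unsigned int order)
{
    size_t mask;

    assert(order < sizeof(size_t) * 8);
    mask = ((size_t)1 << order) - 1;
    return (len + mask) & ~mask;
}

/* Number of padding octets that follow a record body of rec_length octets. */
size_t sketch_padding_bytes(uint32_t rec_length)
{
    return sketch_roundup(rec_length, SKETCH_ALIGN_ORDER) - rec_length;
}
```

A 13-octet body would be followed by 3 octets of padding under these
assumptions, while a 16-octet body needs none, which corresponds to the case
where emulator_body_done() calls emulator_padding_done() directly.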


* Re: [PATCH 22/27] docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams
  2015-06-15 13:44 ` [PATCH 22/27] docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams Andrew Cooper
  2015-06-16 15:00   ` Ian Campbell
@ 2015-06-17  3:30   ` Wen Congyang
  1 sibling, 0 replies; 107+ messages in thread
From: Wen Congyang @ 2015-06-17  3:30 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Ian Jackson, Yang Hongyang, Wei Liu, Ian Campbell

On 06/15/2015 09:44 PM, Andrew Cooper wrote:
> In a Remus scenario, libxc will write a CHECKPOINT record, then hand ownership
> of the fd to libxl.  Libxl then writes any records required and finishes with
> a CHECKPOINT_END record, then hands ownership of the fd back to libxc.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  docs/specs/libxl-migration-stream.pandoc |   15 ++++++++++++++-
>  tools/libxl/libxl_sr_stream_format.h     |    1 +
>  tools/python/xen/migration/libxl.py      |   11 +++++++++++
>  3 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/specs/libxl-migration-stream.pandoc b/docs/specs/libxl-migration-stream.pandoc
> index 7235317..d41932a 100644
> --- a/docs/specs/libxl-migration-stream.pandoc
> +++ b/docs/specs/libxl-migration-stream.pandoc
> @@ -119,7 +119,9 @@ type         0x00000000: END
>  
>               0x00000003: EMULATOR_CONTEXT
>  
> -             0x00000004 - 0x7FFFFFFF: Reserved for future _mandatory_
> +             0x00000004: CHECKPOINT_END
> +
> +             0x00000005 - 0x7FFFFFFF: Reserved for future _mandatory_
>               records.
>  
>               0x80000000 - 0xFFFFFFFF: Reserved for future _optional_
> @@ -203,3 +205,14 @@ index            Index of this emulator for the domain, if multiple
>  
>  emulator_ctx     Emulator context blob.
>  --------------------------------------------------------------------
> +
> +CHECKPOINT_END

Should this be CHECKPOINT\_END, with the underscore escaped for pandoc?

> +--------------
> +
> +A checkpoint end record marks the end of a checkpoint in the image.
> +
> +     0     1     2     3     4     5     6     7 octet
> +    +-------------------------------------------------+
> +
> +The checkpoint end record contains no fields; its body_length is 0.
> +
> diff --git a/tools/libxl/libxl_sr_stream_format.h b/tools/libxl/libxl_sr_stream_format.h
> index 487f9e2..5dfa55f 100644
> --- a/tools/libxl/libxl_sr_stream_format.h
> +++ b/tools/libxl/libxl_sr_stream_format.h
> @@ -35,6 +35,7 @@
>  #define REC_TYPE_LIBXC_CONTEXT       0x00000001U
>  #define REC_TYPE_XENSTORE_DATA       0x00000002U
>  #define REC_TYPE_EMULATOR_CONTEXT    0x00000003U
> +#define REC_TYPE_CHECKPOINT_END      0x00000004U
>  
>  typedef struct libxl_sr_emulator_hdr
>  {
> diff --git a/tools/python/xen/migration/libxl.py b/tools/python/xen/migration/libxl.py
> index 4e1f4f8..415502e 100644
> --- a/tools/python/xen/migration/libxl.py
> +++ b/tools/python/xen/migration/libxl.py
> @@ -36,12 +36,14 @@ REC_TYPE_end              = 0x00000000
>  REC_TYPE_libxc_context    = 0x00000001
>  REC_TYPE_xenstore_data    = 0x00000002
>  REC_TYPE_emulator_context = 0x00000003
> +REC_TYPE_checkpoint_end   = 0x00000004
>  
>  rec_type_to_str = {
>      REC_TYPE_end              : "End",
>      REC_TYPE_libxc_context    : "Libxc context",
>      REC_TYPE_xenstore_data    : "Xenstore data",
>      REC_TYPE_emulator_context : "Emulator context",
> +    REC_TYPE_checkpoint_end   : "Checkpoint end",
>  }
>  
>  # emulator_context
> @@ -176,6 +178,13 @@ class VerifyLibxl(VerifyBase):
>          self.info("  Index %d, type %s" % (emu_idx, emulator_id_to_str[emu_id]))
>  
>  
> +    def verify_record_checkpoint_end(self, content):
> +        """ Checkpoint end record """
> +
> +        if len(content) != 0:
> +            raise RecordError("Checkpoint end record with non-zero length")
> +
> +
>  record_verifiers = {
>      REC_TYPE_end:
>          VerifyLibxl.verify_record_end,
> @@ -185,4 +194,6 @@ record_verifiers = {
>          VerifyLibxl.verify_record_xenstore_data,
>      REC_TYPE_emulator_context:
>          VerifyLibxl.verify_record_emulator_context,
> +    REC_TYPE_checkpoint_end:
> +        VerifyLibxl.verify_record_checkpoint_end,
>  }
> 
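
The record-type layout in the quoted spec hunk, together with the zero-length
rule that verify_record_checkpoint_end() enforces, can be sketched in C as
follows.  This is only an illustration under stated assumptions: struct
sketch_rec_hdr and the SKETCH_* names are hypothetical stand-ins for the
libxl definitions, and the two-field header layout is inferred from how
rec_hdr->type and rec_hdr->length are used in the series.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_REC_TYPE_END            0x00000000u
#define SKETCH_REC_TYPE_CHECKPOINT_END 0x00000004u
#define SKETCH_REC_TYPE_OPTIONAL_BIT   0x80000000u

/* Hypothetical mirror of a record header: type, then body length. */
struct sketch_rec_hdr {
    uint32_t type;
    uint32_t length;
};

/* Types in 0x80000000 - 0xFFFFFFFF are reserved for optional records. */
bool sketch_rec_is_optional(uint32_t type)
{
    return (type & SKETCH_REC_TYPE_OPTIONAL_BIT) != 0;
}

/* A CHECKPOINT_END record is well formed only if it carries no body. */
bool sketch_checkpoint_end_valid(const struct sketch_rec_hdr *hdr)
{
    return hdr->type == SKETCH_REC_TYPE_CHECKPOINT_END && hdr->length == 0;
}
```

The validity check mirrors the Python verifier in the patch, which raises a
RecordError when a checkpoint end record has a non-empty body.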


* Re: [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
  2015-06-15 13:44 ` [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream Andrew Cooper
  2015-06-16 14:31   ` Ian Campbell
  2015-06-17  3:09   ` Wen Congyang
@ 2015-06-17  6:03   ` Wen Congyang
  2015-06-17  9:47     ` Andrew Cooper
  2015-06-17  7:57   ` Wen Congyang
  3 siblings, 1 reply; 107+ messages in thread
From: Wen Congyang @ 2015-06-17  6:03 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Ian Jackson, Yang Hongyang, Wei Liu, Ian Campbell, Ross Lagerwall

On 06/15/2015 09:44 PM, Andrew Cooper wrote:
> From: Ross Lagerwall <ross.lagerwall@citrix.com>
> 
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/Makefile            |    1 +
>  tools/libxl/libxl_internal.h    |   39 ++++
>  tools/libxl/libxl_stream_read.c |  485 +++++++++++++++++++++++++++++++++++++++
>  3 files changed, 525 insertions(+)
>  create mode 100644 tools/libxl/libxl_stream_read.c
> 
> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
> index cc9c152..c71c5fe 100644
> --- a/tools/libxl/Makefile
> +++ b/tools/libxl/Makefile
> @@ -94,6 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
>  			libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
>  			libxl_internal.o libxl_utils.o libxl_uuid.o \
>  			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
> +			libxl_stream_read.o \
>  			libxl_save_callout.o _libxl_save_msgs_callout.o \
>  			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
>  LIBXL_OBJS += libxl_genid.o
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 101994f..4f33cb8 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -19,6 +19,8 @@
>  
>  #include "libxl_osdeps.h" /* must come before any other headers */
>  
> +#include "libxl_sr_stream_format.h"
> +
>  #include <assert.h>
>  #include <dirent.h>
>  #include <errno.h>
> @@ -3121,6 +3123,42 @@ typedef void libxl__domain_create_cb(libxl__egc *egc,
>                                       libxl__domain_create_state*,
>                                       int rc, uint32_t domid);
>  
> +/* State for manipulating a libxl migration v2 stream */
> +typedef struct libxl__stream_read_state libxl__stream_read_state;
> +
> +struct libxl__stream_read_state {
> +    /* filled by the user */
> +    libxl__ao *ao;
> +    int fd;
> +    void (*completion_callback)(libxl__egc *egc,
> +                                libxl__domain_create_state *dcs,
> +                                int rc);
> +    /* Private */
> +    int rc;
> +    bool running;
> +    libxl__datacopier_state dc;
> +    size_t expected_len;
> +    libxl_sr_hdr hdr;
> +    libxl_sr_rec_hdr rec_hdr;
> +    void *rec_body;
> +};
> +
> +_hidden void libxl__stream_read_start(libxl__egc *egc,
> +                                      libxl__stream_read_state *stream);
> +
> +_hidden void libxl__stream_read_continue(libxl__egc *egc,
> +                                         libxl__stream_read_state *stream);
> +
> +_hidden void libxl__stream_read_abort(libxl__egc *egc,
> +                                      libxl__stream_read_state *stream, int rc);
> +
> +static inline bool libxl__stream_read_inuse(
> +    const libxl__stream_read_state *stream)
> +{
> +    return stream->running;
> +}
> +
> +
>  struct libxl__domain_create_state {
>      /* filled in by user */
>      libxl__ao *ao;
> @@ -3137,6 +3175,7 @@ struct libxl__domain_create_state {
>      libxl__stub_dm_spawn_state dmss;
>          /* If we're not doing stubdom, we use only dmss.dm,
>           * for the non-stubdom device model. */
> +    libxl__stream_read_state srs;
>      libxl__save_helper_state shs;
>      /* necessary if the domain creation failed and we have to destroy it */
>      libxl__domain_destroy_state dds;
> diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
> new file mode 100644
> index 0000000..9cdaadf
> --- /dev/null
> +++ b/tools/libxl/libxl_stream_read.c
> @@ -0,0 +1,485 @@
> +/*
> + * Copyright (C) 2015      Citrix Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + */
> +
> +#include "libxl_osdeps.h" /* must come before any other headers */
> +
> +#include "libxl_internal.h"
> +
> +/*
> + * Infrastructure for reading and acting on the contents of a libxl migration
> + * stream. There are a lot of moving parts here.
> + *
> + * Entry points from outside:
> + *  - libxl__stream_read_start()
> + *     - Set up reading a stream from the start.
> + *
> + *  - libxl__stream_read_continue()
> + *     - Set up reading the next record from a started stream.
> + *
> + * The principal loop functionality involves reading the stream header, then
> + * reading a record at a time and acting upon it.  It follows the callbacks:
> + *
> + *  - stream_header_done()
> + *  - record_header_done()
> + *  - record_body_done()
> + *  - process_record()
> + *
> + * process_record() will choose the correct next action based upon the
> + * record.  Upon completion of the action, the next record header will be read
> + * from the stream.
> + */
> +
> +static void stream_success(libxl__egc *egc,
> +                           libxl__stream_read_state *stream);
> +static void stream_failed(libxl__egc *egc,
> +                          libxl__stream_read_state *stream, int rc);
> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_read_state *stream);
> +
> +/* Event callbacks for main reading loop. */
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void record_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void record_body_done(libxl__egc *egc,
> +                             libxl__datacopier_state *dc,
> +                             int onwrite, int errnoval);
> +static void process_record(libxl__egc *egc,
> +                           libxl__stream_read_state *stream);
> +
> +/* Mini-event loop for splicing an emulator record out of the stream. */
> +static void read_emulator_body(libxl__egc *egc,
> +                               libxl__stream_read_state *stream);
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void emulator_padding_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval);
> +
> +void libxl__stream_read_start(libxl__egc *egc,
> +                              libxl__stream_read_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    int ret = 0;
> +
> +    /* State initialisation. */
> +    assert(!stream->running);
> +
> +    memset(dc, 0, sizeof(*dc));
> +    dc->ao = stream->ao;
> +    dc->readfd = stream->fd;
> +    dc->writefd = -1;
> +
> +    /* Start reading the stream header. */
> +    dc->readwhat = "stream header";
> +    dc->readbuf = &stream->hdr;
> +    stream->expected_len = dc->bytes_to_read = sizeof(stream->hdr);
> +    dc->used = 0;
> +    dc->callback = stream_header_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    stream->running = true;
> +    assert(!ret);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +void libxl__stream_read_continue(libxl__egc *egc,
> +                                 libxl__stream_read_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    int ret = 0;
> +
> +    assert(stream->running);
> +
> +    /* Read a record header. */
> +    dc->readwhat = "record header";
> +    dc->readbuf = &stream->rec_hdr;
> +    stream->expected_len = dc->bytes_to_read = sizeof(stream->rec_hdr);
> +    dc->used = 0;
> +    dc->callback = record_header_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    assert(!ret);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +void libxl__stream_read_abort(libxl__egc *egc,
> +                              libxl__stream_read_state *stream, int rc)
> +{
> +    stream_failed(egc, stream, rc);
> +}
> +
> +static void stream_success(libxl__egc *egc, libxl__stream_read_state *stream)
> +{
> +    stream->rc = 0;
> +    stream->running = false;
> +
> +    stream_done(egc, stream);
> +}
> +
> +static void stream_failed(libxl__egc *egc,
> +                          libxl__stream_read_state *stream, int rc)
> +{
> +    assert(rc);
> +    stream->rc = rc;

I have a question: is rc always less than 0?

Thanks
Wen Congyang
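
On the question above: assert(rc) only checks that rc is non-zero, so
stream_failed() itself does not enforce the sign. A minimal sketch,
assuming libxl's convention of negative error constants (the value -3 is
illustrative and not taken from this mail); both helper names are
hypothetical:

```c
#include <stdbool.h>

/* Assumption: modelled on libxl's negative error codes; value illustrative. */
#define MODEL_ERROR_FAIL (-3)

/* What assert(rc) in stream_failed() checks: any non-zero value passes. */
static bool passes_assert(int rc)
{
    return rc != 0;
}

/* A stricter, hypothetical check that also requires a negative value. */
static bool is_negative_error(int rc)
{
    return rc < 0;
}
```

With these helpers, a positive rc such as 1 satisfies passes_assert() but
not is_negative_error(), which is the gap the question points at.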

> +
> +    if (stream->running) {
> +        stream->running = false;
> +        stream_done(egc, stream);
> +    }
> +}
> +
> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_read_state *stream)
> +{
> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
> +
> +    assert(!stream->running);
> +
> +    stream->completion_callback(egc, dcs, stream->rc);
> +}
> +
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl_sr_hdr *hdr = &stream->hdr;
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }
> +
> +    hdr->ident   = be64toh(hdr->ident);
> +    hdr->version = be32toh(hdr->version);
> +    hdr->options = be32toh(hdr->options);
> +
> +    if (hdr->ident != RESTORE_STREAM_IDENT) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR,
> +            "Invalid ident: expected 0x%016"PRIx64", got 0x%016"PRIx64,
> +            RESTORE_STREAM_IDENT, hdr->ident);
> +        goto err;
> +    }
> +    if (hdr->version != RESTORE_STREAM_VERSION) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "Unexpected Version: expected %u, got %u",
> +            RESTORE_STREAM_VERSION, hdr->version);
> +        goto err;
> +    }
> +    if (hdr->options & RESTORE_OPT_BIG_ENDIAN) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "Unable to handle big endian streams");
> +        goto err;
> +    }
> +
> +    LOG(INFO, "Stream v%u%s", hdr->version,
> +        hdr->options & RESTORE_OPT_LEGACY ? " (from legacy)" : "");
> +
> +    libxl__stream_read_continue(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void record_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }
> +
> +    assert(stream->rec_body == NULL);
> +
> +    /* No body? Process straight away. */
> +    if (rec_hdr->length == 0) {
> +        process_record(egc, stream);
> +        return;
> +    }
> +
> +    /* Queue up reading the body. */
> +    size_t bytes_to_read;
> +
> +    switch (rec_hdr->type) {
> +        /*
> +         * Emulator records want to retain the blob in the pipe, for a further
> +         * datacopier call to move elsewhere.  Just read the emulator header.
> +         */
> +    case REC_TYPE_EMULATOR_CONTEXT:
> +        bytes_to_read = sizeof(struct libxl_sr_emulator_hdr);
> +        break;
> +
> +    default:
> +        bytes_to_read = rec_hdr->length;
> +        break;
> +    }
> +
> +    bytes_to_read = ROUNDUP(bytes_to_read, REC_ALIGN_ORDER);
> +
> +    dc->readwhat = "record body";
> +    stream->rec_body = dc->readbuf = libxl__malloc(NOGC, bytes_to_read);
> +    stream->expected_len = dc->bytes_to_read = bytes_to_read;
> +    dc->used = 0;
> +    dc->callback = record_body_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void record_body_done(libxl__egc *egc,
> +                             libxl__datacopier_state *dc,
> +                             int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +
> +        free(stream->rec_body);
> +        stream->rec_body = dc->readbuf = NULL;
> +
> +        goto err;
> +    }
> +
> +    process_record(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void process_record(libxl__egc *egc,
> +                           libxl__stream_read_state *stream)
> +{
> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    LOG(DEBUG, "Record: 0x%08x, length %u", rec_hdr->type, rec_hdr->length);
> +
> +    switch (rec_hdr->type) {
> +
> +    case REC_TYPE_END:
> +        /* Handled later, after cleanup. */
> +        break;
> +
> +    case REC_TYPE_XENSTORE_DATA:
> +        ret = libxl__toolstack_restore(dcs->guest_domid, stream->rec_body,
> +                                       rec_hdr->length, &dcs->shs);
> +        if (ret)
> +            goto err;
> +
> +        /*
> +         * libxl__toolstack_restore() is a synchronous function.  Manually
> +         * start looking for the next record.
> +         */
> +        libxl__stream_read_continue(egc, &dcs->srs);
> +        break;
> +
> +    case REC_TYPE_EMULATOR_CONTEXT:
> +        read_emulator_body(egc, stream);
> +        break;
> +
> +    default:
> +        LOG(ERROR, "Unrecognised record 0x%08x", rec_hdr->type);
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    assert(!ret);
> +    if (rec_hdr->length) {
> +        free(stream->rec_body);
> +        stream->rec_body = NULL;
> +    }
> +
> +    if (rec_hdr->type == REC_TYPE_END)
> +        stream_success(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    if (rec_hdr->length) {
> +        free(stream->rec_body);
> +        stream->rec_body = NULL;
> +    }
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void read_emulator_body(libxl__egc *egc,
> +                               libxl__stream_read_state *stream)
> +{
> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
> +    libxl__datacopier_state *dc = &stream->dc;
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    libxl_sr_emulator_hdr *emu_hdr = stream->rec_body;
> +    STATE_AO_GC(stream->ao);
> +    char path[256];
> +    int ret = 0;
> +
> +    sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE".%u", dcs->guest_domid);
> +
> +    dc->readwhat = "save/migration stream";
> +    dc->copywhat = "emulator context";
> +    dc->writewhat = "qemu save file";
> +    dc->readbuf = NULL;
> +    dc->writefd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
> +    if (dc->writefd == -1) {
> +        ret = ERROR_FAIL;
> +        LOGE(ERROR, "Unable to open '%s'", path);
> +        goto err;
> +    }
> +    dc->maxsz = dc->bytes_to_read = rec_hdr->length - sizeof(*emu_hdr);
> +    stream->expected_len = dc->used = 0;
> +    dc->callback = emulator_body_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    /* Safe to be static, as it is a write-only discard buffer. */
> +    static char padding[1U << REC_ALIGN_ORDER];
> +
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    STATE_AO_GC(dc->ao);
> +    unsigned int nr_padding_bytes = (1U << REC_ALIGN_ORDER);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }
> +
> +    /* Undo modifications for splicing the emulator context. */
> +    memset(dc, 0, sizeof(*dc));
> +    dc->ao = stream->ao;
> +    dc->readfd = stream->fd;
> +    dc->writefd = -1;
> +
> +    /* Do we need to eat some padding out of the stream? */
> +    if (rec_hdr->length & (nr_padding_bytes - 1)) {
> +        unsigned int bytes_to_discard =
> +            nr_padding_bytes - (rec_hdr->length & (nr_padding_bytes - 1));
> +
> +        dc->readwhat = "padding bytes";
> +        dc->readbuf = padding;
> +        stream->expected_len = dc->bytes_to_read = bytes_to_discard;
> +        dc->used = 0;
> +        dc->callback = emulator_padding_done;
> +
> +        ret = libxl__datacopier_start(dc);
> +        if (ret)
> +            goto err;
> +    }
> +    else
> +    {
> +        stream->expected_len = dc->bytes_to_read = 0;
> +        dc->used = 0;
> +
> +        emulator_padding_done(egc, dc, 0, 0);
> +    }
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_padding_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }
> +
> +    libxl__stream_read_continue(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> 
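
The padding arithmetic in emulator_body_done() and the ROUNDUP() call in
record_header_done() can be sketched in isolation. This is a minimal
sketch, assuming REC_ALIGN_ORDER is 3 (8-byte record alignment), a value
this mail does not show; record_roundup() and record_padding() are
hypothetical helper names, not part of libxl:

```c
#include <stdint.h>

/* Assumption: 8-byte record alignment; the real value is defined elsewhere. */
#define REC_ALIGN_ORDER 3U

/* Hypothetical helper: round len up to the next multiple of the alignment. */
static uint32_t record_roundup(uint32_t len)
{
    uint32_t align = 1U << REC_ALIGN_ORDER;
    return (len + align - 1) & ~(align - 1);
}

/* Hypothetical helper: padding bytes that follow a body of len bytes. */
static uint32_t record_padding(uint32_t len)
{
    uint32_t align = 1U << REC_ALIGN_ORDER;
    uint32_t rem = len & (align - 1);
    return rem ? align - rem : 0;
}
```

Under that assumption, a 13-byte body is followed by 3 padding bytes, and
record_roundup(len) - len equals record_padding(len) for any length that
does not overflow.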

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 25/27] tools/libxl: [RFC] Handle checkpoint records in a libxl migration v2 stream
  2015-06-15 13:44 ` [PATCH 25/27] tools/libxl: [RFC] Handle checkpoint records in a libxl migration v2 stream Andrew Cooper
@ 2015-06-17  7:28   ` Wen Congyang
  0 siblings, 0 replies; 107+ messages in thread
From: Wen Congyang @ 2015-06-17  7:28 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Ian Jackson, Yang Hongyang, Wei Liu, Ian Campbell

On 06/15/2015 09:44 PM, Andrew Cooper wrote:
> This is the final bit of untangling for Remus.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/libxl_create.c      |   25 ++++++++++++++++
>  tools/libxl/libxl_internal.h    |    6 ++++
>  tools/libxl/libxl_stream_read.c |   62 +++++++++++++++++++++++++++++++++++++++
>  3 files changed, 93 insertions(+)
> 
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index 7dd7130..ac918bd 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -747,6 +747,27 @@ static int store_libxl_entry(libxl__gc *gc, uint32_t domid,
>          libxl_device_model_version_to_string(b_info->device_model_version));
>  }
>  
> +/*----- remus asynchronous checkpoint callback -----*/
> +
> +static void remus_checkpoint_stream_done(
> +    libxl__egc *egc, libxl__domain_create_state *dcs, int rc);
> +
> +static void libxl__remus_domain_checkpoint_callback(void *data)
> +{
> +    libxl__save_helper_state *shs = data;
> +    libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
> +    libxl__egc *egc = dcs->shs.egc;
> +    STATE_AO_GC(dcs->ao);
> +
> +    libxl__stream_read_start_checkpoint(egc, &dcs->srs);
> +}
> +
> +static void remus_checkpoint_stream_done(
> +    libxl__egc *egc, libxl__domain_create_state *dcs, int rc)
> +{
> +    libxl__xc_domain_saverestore_async_callback_done(egc, &dcs->shs, rc);
> +}
> +
>  /*----- main domain creation -----*/
>  
>  /* We have a linear control flow; only one event callback is
> @@ -1008,6 +1029,8 @@ static void domcreate_bootloader_done(libxl__egc *egc,
>      libxl_domain_config *const d_config = dcs->guest_config;
>      const int restore_fd = dcs->restore_fd;
>      libxl__domain_build_state *const state = &dcs->build_state;
> +    libxl__srm_restore_autogen_callbacks *const callbacks =
> +        &dcs->shs.callbacks.restore.a;
>  
>      if (rc) {
>          domcreate_rebuild_done(egc, dcs, rc);
> @@ -1035,6 +1058,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
>      }
>  
>      /* Restore */
> +    callbacks->checkpoint = libxl__remus_domain_checkpoint_callback;
>  
>      rc = libxl__build_pre(gc, domid, d_config, state);
>      if (rc)
> @@ -1044,6 +1068,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
>      dcs->srs.fd = restore_fd;
>      dcs->srs.legacy = (dcs->restore_params.stream_version == 1);
>      dcs->srs.completion_callback = domcreate_stream_done;
> +    dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
>  
>      libxl__stream_read_start(egc, &dcs->srs);
>      return;
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index bf1c377..e271a0b 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -3205,11 +3205,15 @@ struct libxl__stream_read_state {
>      void (*completion_callback)(libxl__egc *egc,
>                                  libxl__domain_create_state *dcs,
>                                  int rc);
> +    void (*checkpoint_callback)(libxl__egc *egc,
> +                                libxl__domain_create_state *dcs,
> +                                int rc);
>      /* Private */
>      libxl__carefd *v2_carefd;
>      int rc;
>      int joined_rc;
>      bool running;
> +    bool in_checkpoint;
>      libxl__datacopier_state dc;
>      size_t expected_len;
>      libxl_sr_hdr hdr;
> @@ -3222,6 +3226,8 @@ _hidden void libxl__stream_read_start(libxl__egc *egc,
>  
>  _hidden void libxl__stream_read_continue(libxl__egc *egc,
>                                           libxl__stream_read_state *stream);
> +_hidden void libxl__stream_read_start_checkpoint(
> +    libxl__egc *egc, libxl__stream_read_state *stream);
>  
>  _hidden void libxl__stream_read_abort(libxl__egc *egc,
>                                        libxl__stream_read_state *stream, int rc);
> diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
> index a8cd2c3..09ef0aa 100644
> --- a/tools/libxl/libxl_stream_read.c
> +++ b/tools/libxl/libxl_stream_read.c
> @@ -80,6 +80,10 @@ static void emulator_padding_done(libxl__egc *egc,
>                                    libxl__datacopier_state *dc,
>                                    int onwrite, int errnoval);
>  
> +/* Error handling for checkpoint mini-loop. */
> +static void checkpoint_done(libxl__egc *egc,
> +                            libxl__stream_read_state *stream, int rc);
> +
>  void libxl__stream_read_start(libxl__egc *egc,
>                                libxl__stream_read_state *stream)
>  {
> @@ -162,6 +166,35 @@ void libxl__stream_read_continue(libxl__egc *egc,
>      stream_failed(egc, stream, ret);
>  }
>  
> +void libxl__stream_read_start_checkpoint(libxl__egc *egc,
> +                                         libxl__stream_read_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    int ret = 0;
> +
> +    assert(stream->running);
> +    assert(!stream->in_checkpoint);
> +    stream->in_checkpoint = true;
> +

I think you can call libxl__stream_read_continue() here.

Thanks
Wen Congyang

> +    /* Read a record header. */
> +    dc->readwhat = "record header";
> +    dc->readbuf = &stream->rec_hdr;
> +    stream->expected_len = dc->bytes_to_read = sizeof(stream->rec_hdr);
> +    dc->used = 0;
> +    dc->callback = record_header_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    assert(!ret);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
>  void libxl__stream_read_abort(libxl__egc *egc,
>                                libxl__stream_read_state *stream, int rc)
>  {
> @@ -182,6 +215,15 @@ static void stream_failed(libxl__egc *egc,
>      assert(rc);
>      stream->rc = rc;
>  
> +    /*
> +     * If we are in a checkpoint, pass the failure to libxc, which will come
> +     * back around to us via libxl__xc_domain_restore_done().
> +     */
> +    if (stream->in_checkpoint) {
> +        checkpoint_done(egc, stream, rc);
> +        return;
> +    }
> +
>      if (stream->running) {
>          stream->running = false;
>          stream_done(egc, stream);
> @@ -194,6 +236,7 @@ static void stream_done(libxl__egc *egc,
>      libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
>  
>      assert(!stream->running);
> +    assert(!stream->in_checkpoint);
>  
>      if (stream->v2_carefd)
>          libxl__carefd_close(stream->v2_carefd);
> @@ -452,6 +495,15 @@ static void process_record(libxl__egc *egc,
>          read_emulator_body(egc, stream);
>          break;
>  
> +    case REC_TYPE_CHECKPOINT_END:
> +        if (!stream->in_checkpoint) {
> +            LOG(ERROR, "Unexpected CHECKPOINT_END record in stream");
> +            ret = ERROR_FAIL;
> +            goto err;
> +        }
> +        checkpoint_done(egc, stream, 0);
> +        break;
> +
>      default:
>          LOG(ERROR, "Unrecognised record 0x%08x", rec_hdr->type);
>          ret = ERROR_FAIL;
> @@ -592,6 +644,16 @@ static void emulator_padding_done(libxl__egc *egc,
>      stream_failed(egc, stream, ret);
>  }
>  
> +static void checkpoint_done(libxl__egc *egc,
> +                            libxl__stream_read_state *stream, int rc)
> +{
> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
> +
> +    assert(stream->in_checkpoint);
> +    stream->in_checkpoint = false;
> +    stream->checkpoint_callback(egc, dcs, rc);
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> 
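
The suggestion above, that libxl__stream_read_start_checkpoint() could set
its flag and then defer to libxl__stream_read_continue(), can be sketched
with stand-in types. This is only a model: struct stream_model,
read_continue() and start_checkpoint() are hypothetical names, and the
datacopier is replaced by a simple counter:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for libxl__stream_read_state. */
struct stream_model {
    bool running;
    bool in_checkpoint;
    int headers_queued;   /* stands in for a queued datacopier read */
};

/* Models libxl__stream_read_continue(): queue a record header read. */
static void read_continue(struct stream_model *s)
{
    assert(s->running);
    s->headers_queued++;
}

/* Models the suggested start_checkpoint(): set the flag, reuse continue. */
static void start_checkpoint(struct stream_model *s)
{
    assert(s->running);
    assert(!s->in_checkpoint);
    s->in_checkpoint = true;
    read_continue(s);
}
```

The point of the shape is that the record-header setup lives in one place,
so the checkpoint entry point only adds its own state change.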

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream
  2015-06-16 15:53     ` Andrew Cooper
@ 2015-06-17  7:30       ` Ian Campbell
  2015-06-17  9:55         ` Andrew Cooper
  0 siblings, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-06-17  7:30 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Tue, 2015-06-16 at 16:53 +0100, Andrew Cooper wrote:
> On 16/06/15 16:03, Ian Campbell wrote:
> > On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> >> when signalled to do so by libxl__remus_domain_checkpoint_callback()
> > I think I saw that Remus wasn't currently working, so I'll let you and
> > Hongyang thrash something out before I spend too much effort reviewing
> > these last few RFC bits. Unless you think it is worth my having a look
> > now?
> >
> >
> 
> Remus was broken by patch 19 in the series, and this patch forms part of
> fixing it again.
> 
> I can't find a way of fixing the layering violation in both plain
> migration and Remus, in a readable, bisectable way.
> 
> Remus requires identical source and destination toolstacks, and the
> Remus maintainers are happy enough with the "break it and fix it up in
> the same series" approach.
> 
> > Now that the series is complete, there is some shuffling room to reduce
> the window of breakage, but short of folding patches 19, 21, 23-25
> together, Remus will break.

The report I thought I had seen was that Remus was still broken even
after the complete series was applied, i.e. there was still more to be
done.

I'm happy with the transient breakage in this series on this occasion,
but I was proposing not to review until Remus was thought to be working
OK at the end.

Ian.

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
  2015-06-15 13:44 ` [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream Andrew Cooper
                     ` (3 preceding siblings ...)
  2015-06-17  2:24   ` Wen Congyang
@ 2015-06-17  7:38   ` Yang Hongyang
  2015-06-17 10:14   ` Wen Congyang
  2015-07-10 10:55   ` Ian Campbell
  6 siblings, 0 replies; 107+ messages in thread
From: Yang Hongyang @ 2015-06-17  7:38 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Wei Liu, Ian Jackson, Ian Campbell, Ross Lagerwall



On 06/15/2015 09:44 PM, Andrew Cooper wrote:
[...]
> +
> +static void stream_success(libxl__egc *egc,
> +                           libxl__stream_write_state *stream);
> +static void stream_failed(libxl__egc *egc,
> +                          libxl__stream_write_state *stream, int ret);
> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_write_state *stream);
> +
> +static void check_stream_finished(libxl__egc *egc,
> +                                  libxl__domain_suspend_state *dcs,

s/dcs/dss/

> +                                  int rc, const char *what);
> +
> +/* Event callbacks for plain VM. */
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void libxc_header_done(libxl__egc *egc,
> +                              libxl__datacopier_state *dc,
> +                              int onwrite, int errnoval);
> +/* libxl__xc_domain_save_done() lives here, event-order wise. */
> +static void write_toolstack_record(libxl__egc *egc,
> +                                   libxl__stream_write_state *stream);
> +static void toolstack_record_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval);
> +static void write_emulator_record(libxl__egc *egc,
> +                                  libxl__stream_write_state *stream);
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void emulator_padding_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval);
> +static void write_end_record(libxl__egc *egc,
> +                             libxl__stream_write_state *stream);
> +static void end_record_done(libxl__egc *egc,
> +                            libxl__datacopier_state *dc,
> +                            int onwrite, int errnoval);
> +
> +void libxl__stream_write_start(libxl__egc *egc,
> +                               libxl__stream_write_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_hdr hdr = { 0 };
> +    int ret = 0;
> +
> +    assert(!stream->running);
> +    stream->running = true;
> +
> +    memset(dc, 0, sizeof(*dc));
> +    dc->readwhat = "";
> +    dc->copywhat = "suspend header";
> +    dc->writewhat = "save/migration stream";
> +    dc->ao = ao;
> +    dc->readfd = -1;
> +    dc->writefd = stream->fd;
> +    dc->maxsz = INT_MAX;
> +    dc->bytes_to_read = INT_MAX;
> +    dc->callback = stream_header_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    hdr.ident   = htobe64(RESTORE_STREAM_IDENT);
> +    hdr.version = htobe32(RESTORE_STREAM_VERSION);
> +    hdr.options = htobe32(0);
> +
> +    libxl__datacopier_prefixdata(egc, dc, &hdr, sizeof(hdr));
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +void libxl__stream_write_abort(libxl__egc *egc,
> +                               libxl__stream_write_state *stream, int rc)
> +{
> +    stream_failed(egc, stream, rc);
> +}
> +
> +static void stream_success(libxl__egc *egc, libxl__stream_write_state *stream)
> +{
> +    stream->rc = 0;
> +    stream->running = false;
> +
> +    stream_done(egc, stream);
> +}
> +
> +static void stream_failed(libxl__egc *egc,
> +                          libxl__stream_write_state *stream, int rc)
> +{
> +    assert(rc);
> +    stream->rc = rc;
> +
> +    if (stream->running) {
> +        stream->running = false;
> +        stream_done(egc, stream);
> +    }
> +}
> +
> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_write_state *stream)
> +{
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +
> +    assert(!stream->running);
> +
> +    check_stream_finished(egc, dss, stream->rc, "stream");
> +}
> +
> +static void check_stream_finished(libxl__egc *egc,
> +                                  libxl__domain_suspend_state *dss,
> +                                  int rc, const char *what)
> +{
> +    libxl__stream_write_state *stream = &dss->sws;
> +    STATE_AO_GC(dss->ao);
> +
> +    LOG(INFO, "Task '%s' joining (rc %d)", what, rc);
> +
> +    if (rc && !stream->joined_rc) {
> +        bool skip = false;
> +        /* First reported failure from joining tasks.  Tear everything down */
> +        stream->joined_rc = rc;
> +
> +        if (libxl__stream_write_inuse(&dss->sws)) {
> +            skip = true;
> +            libxl__stream_write_abort(egc, &dss->sws, rc);
> +        }
> +
> +        if (libxl__save_helper_inuse(&dss->shs)) {
> +            skip = true;
> +            libxl__save_helper_abort(egc, &dss->shs);
> +        }
> +
> +        /* There is at least one more active task to join - wait for its
> +           callback */
> +        if ( skip )
> +            return;
> +    }
> +
> +    if (libxl__stream_write_inuse(&dss->sws))
> +        LOG(DEBUG, "stream still in use");
> +    else if (libxl__save_helper_inuse(&dss->shs))
> +        LOG(DEBUG, "save/restore still in use");
> +    else {
> +        LOG(INFO, "Join complete: result %d", stream->joined_rc);
> +        stream->completion_callback(egc, dss, stream->joined_rc);
> +    }
> +}
> +
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_LIBXC_CONTEXT, 0 };
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    dc->copywhat = "suspend footer";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = libxc_header_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void libxc_header_done(libxl__egc *egc,
> +                              libxl__datacopier_state *dc,
> +                              int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    libxl__xc_domain_save(egc, dss);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void __attribute__((used))
> +will_be_libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
> +                                int rc, int retval, int errnoval)
> +{
> +    libxl__domain_suspend_state *dss = dss_void;
> +    libxl__stream_write_state *stream = &dss->sws;
> +    STATE_AO_GC(dss->ao);
> +
> +    if (rc)
> +        goto err;
> +
> +    if (retval) {
> +        LOGEV(ERROR, errnoval, "saving domain: %s",
> +                         dss->guest_responded ?
> +                         "domain responded to suspend request" :
> +                         "domain did not respond to suspend request");
> +        if ( !dss->guest_responded )
> +            rc = ERROR_GUEST_TIMEDOUT;
> +        else
> +            rc = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    write_toolstack_record(egc, stream);
> +    return;
> +
> + err:
> +    assert(rc);
> +    check_stream_finished(egc, dss, rc, "save/restore helper");
> +}
> +
> +static void write_toolstack_record(libxl__egc *egc,
> +                                   libxl__stream_write_state *stream)
> +{
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_XENSTORE_DATA, 0 };
> +    int ret = 0;
> +    uint8_t *toolstack_buf = NULL; /* We must free this. */
> +    uint32_t toolstack_len, padding_len;
> +
> +    ret = libxl__toolstack_save(dss->domid, &toolstack_buf,
> +                                &toolstack_len, dss);
> +    if (ret)
> +        goto err;
> +
> +    dc->copywhat = "toolstack record";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = toolstack_record_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    rec.length = toolstack_len;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    libxl__datacopier_prefixdata(egc, dc, toolstack_buf, toolstack_len);
> +
> +    padding_len = ROUNDUP(rec.length, REC_ALIGN_ORDER) - rec.length;
> +    if (padding_len)
> +        libxl__datacopier_prefixdata(egc, dc, zero_padding, padding_len);
> +
> +    free(toolstack_buf);
> +    return;
> +
> + err:
> +    assert(ret);
> +    free(toolstack_buf);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void toolstack_record_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    if (dss->type == LIBXL_DOMAIN_TYPE_HVM)
> +        write_emulator_record(egc, stream);
> +    else
> +        write_end_record(egc, stream);
> +
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void write_emulator_record(libxl__egc *egc,
> +                                  libxl__stream_write_state *stream)
> +{
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_EMULATOR_CONTEXT, 0 };
> +    struct libxl_sr_emulator_hdr ehdr = { 0 };
> +    struct stat st;
> +    int ret = 0;
> +    uint32_t qemu_state_len;
> +
> +    assert(dss->type == LIBXL_DOMAIN_TYPE_HVM);
> +
> +    /* Convenience aliases */
> +    const char *const filename = dss->dm_savefile;
> +    const uint32_t domid = dss->domid;
> +
> +    switch(libxl__device_model_version_running(gc, domid)) {
> +    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
> +        ehdr.id = EMULATOR_QEMU_TRADITIONAL;
> +        break;
> +
> +    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
> +        ehdr.id = EMULATOR_QEMU_UPSTREAM;
> +        break;
> +
> +    default:
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    ret = libxl__domain_suspend_device_model(gc, dss);
> +    if (ret)
> +        goto err;
> +
> +    dc->readwhat = GCSPRINTF("qemu save file %s", filename);
> +    dc->copywhat = "emulator record";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = emulator_body_done;
> +
> +    dc->readfd = open(filename, O_RDONLY);
> +    if (dc->readfd < 0) {
> +        LOGE(ERROR, "unable to open %s", dc->readwhat);
> +        goto err;
> +    }
> +
> +    if (fstat(dc->readfd, &st))
> +    {
> +        LOGE(ERROR, "unable to fstat %s", dc->readwhat);
> +        goto err;
> +    }
> +
> +    if (!S_ISREG(st.st_mode)) {
> +        LOG(ERROR, "%s is not a plain file!", dc->readwhat);
> +        goto err;
> +    }
> +
> +    qemu_state_len = st.st_size;
> +    rec.length = qemu_state_len + sizeof(ehdr);
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    libxl__datacopier_prefixdata(egc, dc, &ehdr, sizeof(ehdr));
> +
> +    stream->padding = ROUNDUP(qemu_state_len, REC_ALIGN_ORDER) - qemu_state_len;
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    dc->readwhat = "";
> +    dc->readfd = -1;
> +
> +    if (stream->padding) {
> +        assert(stream->padding < (1U << REC_ALIGN_ORDER));
> +
> +        dc->copywhat = "emulator padding";
> +        dc->writewhat = "save/migration stream";
> +        dc->callback = emulator_padding_done;
> +
> +        ret = libxl__datacopier_start(dc);
> +        if (ret)
> +            goto err;
> +
> +        libxl__datacopier_prefixdata(egc, dc, zero_padding, stream->padding);
> +        return;
> +    }
> +
> +    emulator_padding_done(egc, dc, 0, 0);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_padding_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    write_end_record(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void write_end_record(libxl__egc *egc,
> +                             libxl__stream_write_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_END, 0 };
> +    int ret = 0;
> +
> +    dc->copywhat = "suspend footer";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = end_record_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void end_record_done(libxl__egc *egc,
> +                            libxl__datacopier_state *dc,
> +                            int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    stream_success(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
>

-- 
Thanks,
Yang.


* Re: [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
  2015-06-15 13:44 ` [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream Andrew Cooper
                     ` (2 preceding siblings ...)
  2015-06-17  6:03   ` Wen Congyang
@ 2015-06-17  7:57   ` Wen Congyang
  2015-06-17  9:50     ` Andrew Cooper
  3 siblings, 1 reply; 107+ messages in thread
From: Wen Congyang @ 2015-06-17  7:57 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Ian Jackson, Yang Hongyang, Wei Liu, Ian Campbell, Ross Lagerwall

On 06/15/2015 09:44 PM, Andrew Cooper wrote:
> From: Ross Lagerwall <ross.lagerwall@citrix.com>
> 
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/Makefile            |    1 +
>  tools/libxl/libxl_internal.h    |   39 ++++
>  tools/libxl/libxl_stream_read.c |  485 +++++++++++++++++++++++++++++++++++++++
>  3 files changed, 525 insertions(+)
>  create mode 100644 tools/libxl/libxl_stream_read.c
> 
> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
> index cc9c152..c71c5fe 100644
> --- a/tools/libxl/Makefile
> +++ b/tools/libxl/Makefile
> @@ -94,6 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
>  			libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
>  			libxl_internal.o libxl_utils.o libxl_uuid.o \
>  			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
> +			libxl_stream_read.o \
>  			libxl_save_callout.o _libxl_save_msgs_callout.o \
>  			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
>  LIBXL_OBJS += libxl_genid.o
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 101994f..4f33cb8 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -19,6 +19,8 @@
>  
>  #include "libxl_osdeps.h" /* must come before any other headers */
>  
> +#include "libxl_sr_stream_format.h"
> +
>  #include <assert.h>
>  #include <dirent.h>
>  #include <errno.h>
> @@ -3121,6 +3123,42 @@ typedef void libxl__domain_create_cb(libxl__egc *egc,
>                                       libxl__domain_create_state*,
>                                       int rc, uint32_t domid);
>  
> +/* State for manipulating a libxl migration v2 stream */
> +typedef struct libxl__stream_read_state libxl__stream_read_state;
> +
> +struct libxl__stream_read_state {
> +    /* filled by the user */
> +    libxl__ao *ao;
> +    int fd;
> +    void (*completion_callback)(libxl__egc *egc,
> +                                libxl__domain_create_state *dcs,
> +                                int rc);
> +    /* Private */
> +    int rc;
> +    bool running;
> +    libxl__datacopier_state dc;
> +    size_t expected_len;
> +    libxl_sr_hdr hdr;
> +    libxl_sr_rec_hdr rec_hdr;
> +    void *rec_body;
> +};
> +
> +_hidden void libxl__stream_read_start(libxl__egc *egc,
> +                                      libxl__stream_read_state *stream);
> +
> +_hidden void libxl__stream_read_continue(libxl__egc *egc,
> +                                         libxl__stream_read_state *stream);
> +
> +_hidden void libxl__stream_read_abort(libxl__egc *egc,
> +                                      libxl__stream_read_state *stream, int rc);
> +
> +static inline bool libxl__stream_read_inuse(
> +    const libxl__stream_read_state *stream)
> +{
> +    return stream->running;
> +}
> +
> +
>  struct libxl__domain_create_state {
>      /* filled in by user */
>      libxl__ao *ao;
> @@ -3137,6 +3175,7 @@ struct libxl__domain_create_state {
>      libxl__stub_dm_spawn_state dmss;
>          /* If we're not doing stubdom, we use only dmss.dm,
>           * for the non-stubdom device model. */
> +    libxl__stream_read_state srs;
>      libxl__save_helper_state shs;
>      /* necessary if the domain creation failed and we have to destroy it */
>      libxl__domain_destroy_state dds;
> diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
> new file mode 100644
> index 0000000..9cdaadf
> --- /dev/null
> +++ b/tools/libxl/libxl_stream_read.c
> @@ -0,0 +1,485 @@
> +/*
> + * Copyright (C) 2015      Citrix Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + */
> +
> +#include "libxl_osdeps.h" /* must come before any other headers */
> +
> +#include "libxl_internal.h"
> +
> +/*
> + * Infrastructure for reading and acting on the contents of a libxl migration
> + * stream. There are a lot of moving parts here.
> + *
> + * Entry points from outside:
> + *  - libxl__stream_read_start()
> + *     - Set up reading a stream from the start.
> + *
> + *  - libxl__stream_read_continue()
> + *     - Set up reading the next record from a started stream.
> + *
> + * The principal loop functionality involves reading the stream header, then
> + * reading a record at a time and acting upon it.  It follows the callbacks:
> + *
> + *  - stream_header_done()
> + *  - record_header_done()
> + *  - record_body_done()
> + *  - process_record()
> + *
> + * process_record() will choose the correct next action based upon the
> + * record.  Upon completion of the action, the next record header will be read
> + * from the stream.
> + */
> +
> +static void stream_success(libxl__egc *egc,
> +                           libxl__stream_read_state *stream);
> +static void stream_failed(libxl__egc *egc,
> +                          libxl__stream_read_state *stream, int rc);
> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_read_state *stream);
> +
> +/* Event callbacks for main reading loop. */
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void record_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void record_body_done(libxl__egc *egc,
> +                             libxl__datacopier_state *dc,
> +                             int onwrite, int errnoval);
> +static void process_record(libxl__egc *egc,
> +                           libxl__stream_read_state *stream);
> +
> +/* Mini-event loop for splicing an emulator record out of the stream. */
> +static void read_emulator_body(libxl__egc *egc,
> +                               libxl__stream_read_state *stream);
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void emulator_padding_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval);
> +
> +void libxl__stream_read_start(libxl__egc *egc,
> +                              libxl__stream_read_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    int ret = 0;
> +
> +    /* State initialisation. */
> +    assert(!stream->running);
> +
> +    memset(dc, 0, sizeof(*dc));
> +    dc->ao = stream->ao;
> +    dc->readfd = stream->fd;
> +    dc->writefd = -1;
> +
> +    /* Start reading the stream header. */
> +    dc->readwhat = "stream header";
> +    dc->readbuf = &stream->hdr;
> +    stream->expected_len = dc->bytes_to_read = sizeof(stream->hdr);
> +    dc->used = 0;
> +    dc->callback = stream_header_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    stream->running = true;
> +    assert(!ret);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +void libxl__stream_read_continue(libxl__egc *egc,
> +                                 libxl__stream_read_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    int ret = 0;
> +
> +    assert(stream->running);
> +
> +    /* Read a record header. */
> +    dc->readwhat = "record header";
> +    dc->readbuf = &stream->rec_hdr;
> +    stream->expected_len = dc->bytes_to_read = sizeof(stream->rec_hdr);
> +    dc->used = 0;
> +    dc->callback = record_header_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    assert(!ret);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +void libxl__stream_read_abort(libxl__egc *egc,
> +                              libxl__stream_read_state *stream, int rc)
> +{
> +    stream_failed(egc, stream, rc);
> +}
> +
> +static void stream_success(libxl__egc *egc, libxl__stream_read_state *stream)
> +{
> +    stream->rc = 0;
> +    stream->running = false;
> +
> +    stream_done(egc, stream);
> +}
> +
> +static void stream_failed(libxl__egc *egc,
> +                          libxl__stream_read_state *stream, int rc)
> +{
> +    assert(rc);
> +    stream->rc = rc;
> +
> +    if (stream->running) {
> +        stream->running = false;
> +        stream_done(egc, stream);
> +    }
> +}
> +
> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_read_state *stream)
> +{
> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
> +
> +    assert(!stream->running);
> +
> +    stream->completion_callback(egc, dcs, stream->rc);
> +}
> +
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl_sr_hdr *hdr = &stream->hdr;
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }
> +
> +    hdr->ident   = be64toh(hdr->ident);
> +    hdr->version = be32toh(hdr->version);
> +    hdr->options = be32toh(hdr->options);
> +
> +    if (hdr->ident != RESTORE_STREAM_IDENT) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR,
> +            "Invalid ident: expected 0x%016"PRIx64", got 0x%016"PRIx64,
> +            RESTORE_STREAM_IDENT, hdr->ident);
> +        goto err;
> +    }
> +    if (hdr->version != RESTORE_STREAM_VERSION) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "Unexpected Version: expected %u, got %u",
> +            RESTORE_STREAM_VERSION, hdr->version);
> +        goto err;
> +    }
> +    if (hdr->options & RESTORE_OPT_BIG_ENDIAN) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "Unable to handle big endian streams");
> +        goto err;
> +    }
> +
> +    LOG(INFO, "Stream v%u%s", hdr->version,
> +        hdr->options & RESTORE_OPT_LEGACY ? " (from legacy)" : "");
> +
> +    libxl__stream_read_continue(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void record_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }
> +
> +    assert(stream->rec_body == NULL);
> +
> +    /* No body? Process straight away. */
> +    if (rec_hdr->length == 0) {
> +        process_record(egc, stream);
> +        return;
> +    }
> +
> +    /* Queue up reading the body. */
> +    size_t bytes_to_read;
> +
> +    switch (rec_hdr->type) {
> +        /*
> +         * Emulator records want to retain the blob in the pipe, for a further
> +         * datacopier call to move elsewhere.  Just read the emulator header.
> +         */

In this case, we should not call ROUNDUP().

> +    case REC_TYPE_EMULATOR_CONTEXT:
> +        bytes_to_read = sizeof(struct libxl_sr_emulator_hdr);
> +        break;
> +
> +    default:
> +        bytes_to_read = rec_hdr->length;
> +        break;
> +    }
> +
> +    bytes_to_read = ROUNDUP(bytes_to_read, REC_ALIGN_ORDER);

So I think it is better to move the ROUNDUP() into the default case.

Thanks
Wen Congyang
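
For illustration, the suggestion above can be sketched as a standalone
helper.  This is only a sketch under stated assumptions: the names
sketch_roundup() and sketch_bytes_to_read() are hypothetical, the alignment
order of 3 and the record type value are placeholders rather than values
taken from the patch, and the emulator header size is passed in as a
parameter because the real struct is not reproduced here.  If the emulator
header size is already a multiple of the alignment, both forms read the same
number of bytes; the difference is only that the rounding is applied where a
whole record body is consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration only (assumptions, not from the patch). */
#define SKETCH_REC_ALIGN_ORDER           3u
#define SKETCH_REC_TYPE_EMULATOR_CONTEXT 0x00000003u

/* Round val up to the next multiple of (1 << order). */
static size_t sketch_roundup(size_t val, unsigned int order)
{
    size_t mask;

    assert(order < sizeof(size_t) * 8);
    mask = ((size_t)1 << order) - 1;
    return (val + mask) & ~mask;
}

/*
 * Hypothetical helper: number of bytes to read after a record header.
 * Emulator records read only their sub-header and leave the blob (and any
 * padding) in the stream, so no rounding is applied in that case.
 */
static size_t sketch_bytes_to_read(uint32_t type, uint32_t length,
                                   size_t emu_hdr_size)
{
    switch (type) {
    case SKETCH_REC_TYPE_EMULATOR_CONTEXT:
        return emu_hdr_size;

    default:
        /* Whole body is consumed here, including its trailing padding. */
        return sketch_roundup(length, SKETCH_REC_ALIGN_ORDER);
    }
}
```

With these placeholder values, a 13-byte body of any other record type is
read as 16 bytes, while an emulator record reads exactly the header size it
is given.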

> +
> +    dc->readwhat = "record body";
> +    stream->rec_body = dc->readbuf = libxl__malloc(NOGC, bytes_to_read);
> +    stream->expected_len = dc->bytes_to_read = bytes_to_read;
> +    dc->used = 0;
> +    dc->callback = record_body_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void record_body_done(libxl__egc *egc,
> +                             libxl__datacopier_state *dc,
> +                             int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +
> +        free(stream->rec_body);
> +        stream->rec_body = dc->readbuf = NULL;
> +
> +        goto err;
> +    }
> +
> +    process_record(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void process_record(libxl__egc *egc,
> +                           libxl__stream_read_state *stream)
> +{
> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    LOG(DEBUG, "Record: 0x%08x, length %u", rec_hdr->type, rec_hdr->length);
> +
> +    switch (rec_hdr->type) {
> +
> +    case REC_TYPE_END:
> +        /* Handled later, after cleanup. */
> +        break;
> +
> +    case REC_TYPE_XENSTORE_DATA:
> +        ret = libxl__toolstack_restore(dcs->guest_domid, stream->rec_body,
> +                                       rec_hdr->length, &dcs->shs);
> +        if (ret)
> +            goto err;
> +
> +        /*
> +         * libxl__toolstack_restore() is a synchronous function.  Manually
> +         * start looking for the next record.
> +         */
> +        libxl__stream_read_continue(egc, &dcs->srs);
> +        break;
> +
> +    case REC_TYPE_EMULATOR_CONTEXT:
> +        read_emulator_body(egc, stream);
> +        break;
> +
> +    default:
> +        LOG(ERROR, "Unrecognised record 0x%08x", rec_hdr->type);
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    assert(!ret);
> +    if (rec_hdr->length) {
> +        free(stream->rec_body);
> +        stream->rec_body = NULL;
> +    }
> +
> +    if (rec_hdr->type == REC_TYPE_END)
> +        stream_success(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    if (rec_hdr->length) {
> +        free(stream->rec_body);
> +        stream->rec_body = NULL;
> +    }
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void read_emulator_body(libxl__egc *egc,
> +                               libxl__stream_read_state *stream)
> +{
> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
> +    libxl__datacopier_state *dc = &stream->dc;
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    libxl_sr_emulator_hdr *emu_hdr = stream->rec_body;
> +    STATE_AO_GC(stream->ao);
> +    char path[256];
> +    int ret = 0;
> +
> +    sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE".%u", dcs->guest_domid);
> +
> +    dc->readwhat = "save/migration stream";
> +    dc->copywhat = "emulator context";
> +    dc->writewhat = "qemu save file";
> +    dc->readbuf = NULL;
> +    dc->writefd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
> +    if (dc->writefd == -1) {
> +        ret = ERROR_FAIL;
> +        LOGE(ERROR, "Unable to open '%s'", path);
> +        goto err;
> +    }
> +    dc->maxsz = dc->bytes_to_read = rec_hdr->length - sizeof(*emu_hdr);
> +    stream->expected_len = dc->used = 0;
> +    dc->callback = emulator_body_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    /* Safe to be static, as it is a write-only discard buffer. */
> +    static char padding[1U << REC_ALIGN_ORDER];
> +
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    STATE_AO_GC(dc->ao);
> +    unsigned int nr_padding_bytes = (1U << REC_ALIGN_ORDER);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }
> +
> +    /* Undo modifications for splicing the emulator context. */
> +    memset(dc, 0, sizeof(*dc));
> +    dc->ao = stream->ao;
> +    dc->readfd = stream->fd;
> +    dc->writefd = -1;
> +
> +    /* Do we need to eat some padding out of the stream? */
> +    if (rec_hdr->length & (nr_padding_bytes - 1)) {
> +        unsigned int bytes_to_discard =
> +            nr_padding_bytes - (rec_hdr->length & (nr_padding_bytes - 1));
> +
> +        dc->readwhat = "padding bytes";
> +        dc->readbuf = padding;
> +        stream->expected_len = dc->bytes_to_read = bytes_to_discard;
> +        dc->used = 0;
> +        dc->callback = emulator_padding_done;
> +
> +        ret = libxl__datacopier_start(dc);
> +        if (ret)
> +            goto err;
> +    }
> +    else
> +    {
> +        stream->expected_len = dc->bytes_to_read = 0;
> +        dc->used = 0;
> +
> +        emulator_padding_done(egc, dc, 0, 0);
> +    }
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_padding_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }
> +
> +    libxl__stream_read_continue(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> 


* Re: [PATCH 24/27] tools/libx{c, l}: [RFC] Introduce restore_callbacks.checkpoint()
  2015-06-15 13:44 ` [PATCH 24/27] tools/libx{c, l}: [RFC] Introduce restore_callbacks.checkpoint() Andrew Cooper
  2015-06-16  2:23   ` Yang Hongyang
@ 2015-06-17  8:20   ` Yang Hongyang
  1 sibling, 0 replies; 107+ messages in thread
From: Yang Hongyang @ 2015-06-17  8:20 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Wei Liu, Ian Jackson, Ian Campbell



On 06/15/2015 09:44 PM, Andrew Cooper wrote:
> And call it when a checkpoint record is found in the libxc stream.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>   tools/libxc/include/xenguest.h     |    3 +++
>   tools/libxc/xc_sr_restore.c        |   15 ++++++++++++++-
>   tools/libxl/libxl_save_msgs_gen.pl |    2 +-
>   3 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
> index 7581263..b0d27ed 100644
> --- a/tools/libxc/include/xenguest.h
> +++ b/tools/libxc/include/xenguest.h
> @@ -102,6 +102,9 @@ struct restore_callbacks {
>       int (*toolstack_restore)(uint32_t domid, const uint8_t *buf,
>               uint32_t size, void* data);
>
> +    /* A checkpoint record has been found in the stream */
> +    int (*checkpoint)(void* data);
> +
>       /* to be provided as the last argument to each callback function */
>       void* data;
>   };
> diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
> index 9e27dba..5e0f817 100644
> --- a/tools/libxc/xc_sr_restore.c
> +++ b/tools/libxc/xc_sr_restore.c
> @@ -1,5 +1,7 @@
>   #include <arpa/inet.h>
>
> +#include <assert.h>
> +
>   #include "xc_sr_common.h"
>
>   /*
> @@ -472,7 +474,7 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
>   static int handle_checkpoint(struct xc_sr_context *ctx)
>   {
>       xc_interface *xch = ctx->xch;
> -    int rc = 0;
> +    int rc = 0, ret;
>       unsigned i;
>
>       if ( !ctx->restore.checkpointed )
> @@ -482,6 +484,13 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
>           goto err;
>       }
>
> +    ret = ctx->restore.callbacks->checkpoint(ctx->restore.callbacks->data);
> +    if ( ret )
> +    {
> +        rc = -1;
> +        goto err;
> +    }
> +
>       if ( ctx->restore.buffer_all_records )
>       {
>           IPRINTF("All records buffered");
> @@ -735,6 +744,10 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, uint32_t dom,
>       ctx.restore.checkpointed = checkpointed_stream;
>       ctx.restore.callbacks = callbacks;
>
> +    /* Sanity checks for callbacks. */
> +    if (checkpointed_stream)

Coding style: this file puts spaces inside the parentheses, i.e.
if ( checkpointed_stream )

> +        assert(callbacks->checkpoint);
> +
>       IPRINTF("In experimental %s", __func__);
>       DPRINTF("fd %d, dom %u, hvm %u, pae %u, superpages %d"
>               ", checkpointed_stream %d", io_fd, dom, hvm, pae,
> diff --git a/tools/libxl/libxl_save_msgs_gen.pl b/tools/libxl/libxl_save_msgs_gen.pl
> index 6b4b65e..36b279e 100755
> --- a/tools/libxl/libxl_save_msgs_gen.pl
> +++ b/tools/libxl/libxl_save_msgs_gen.pl
> @@ -25,7 +25,7 @@ our @msgs = (
>                                                   'unsigned long', 'total'] ],
>       [  3, 'scxA',   "suspend", [] ],
>       [  4, 'scxA',   "postcopy", [] ],
> -    [  5, 'scxA',   "checkpoint", [] ],
> +    [  5, 'srcxA',   "checkpoint", [] ],
>       [  6, 'scxA',   "switch_qemu_logdirty",  [qw(int domid
>                                                 unsigned enable)] ],
>       #                toolstack_save          done entirely `by hand'
>

-- 
Thanks,
Yang.


* Re: [PATCH 00/27]  Libxl migration v2
  2015-06-17  1:55 ` Wen Congyang
@ 2015-06-17  9:45   ` Andrew Cooper
  0 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-17  9:45 UTC (permalink / raw)
  To: xen-devel

On 17/06/15 02:55, Wen Congyang wrote:
> On 06/15/2015 09:44 PM, Andrew Cooper wrote:
>> This series adds support for the libxl migration v2 stream, and untangles the
>> existing layering violations of the toolstack and qemu records.
>>
>> At the end of the series, legacy migration is no longer used.
>>
>> Note: Remus support is broken and (RFC) fixed in separate patches in this
>> series.  It was too tangled to fix in a bisectable fashion.  Plain
>> suspend/migrate/resume however is (should be) bisectable along the entire
>> series.
>>
>> There are a couple of outstanding questions:
>>
>> 1) What to do about the toolstack/xenstore record.  It is currently being
>>    passed around as a blob, but it might be better to split it out.
>>
>> 2) What (if any) ABI/API qualifications are needed? (Particularly in reference
>>    to patch 21)
>>
>> The Remus code is untested by me, but is hopefully in the correct ballpark.
>> All other combinations of suspend/migrate/resume have been tested with PV and
>> HVM guests (qemu-trad and qemu-upstream), including 32 -> 64 bit migration
>> (which was the underlying bug causing us to write migration v2 in the first
>> place).
>>
>> There are some further improvements which could be made.  In particular, it
>> appears that sending the toolstack record on each checkpoint is redundant, and
>> there is certainly room for some more pruning of the legacy migration code.
> Do you mean: libxl__toolstack_save is harmless, and it can be called when the
> guest is running?
>
> Thanks
> Wen Congyang

It is harmless when a guest is running.

It contains the contents of the device model's "/physmap" xenstore tree,
which is empty for a PV or qemu-trad HVM guest and constant-after-setup
for qemu-upstream HVM guests.

(I don't see why this information isn't maintained in the Qemu record
itself because nothing else uses it).

~Andrew

* Re: [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
  2015-06-17  6:03   ` Wen Congyang
@ 2015-06-17  9:47     ` Andrew Cooper
  0 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-17  9:47 UTC (permalink / raw)
  To: Wen Congyang, Xen-devel
  Cc: Ian Jackson, Yang Hongyang, Wei Liu, Ian Campbell, Ross Lagerwall

On 17/06/15 07:03, Wen Congyang wrote:
>> +static void stream_failed(libxl__egc *egc,
>> +                          libxl__stream_read_state *stream, int rc)
>> +{
>> +    assert(rc);
>> +    stream->rc = rc;
> I have a question: is rc always less than 0?
>
> Thanks
> Wen Congyang
>

I believe so.  It should realistically always be an ERROR_$foo.

~Andrew

* Re: [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
  2015-06-17  7:57   ` Wen Congyang
@ 2015-06-17  9:50     ` Andrew Cooper
  2015-06-17 10:01       ` Wen Congyang
  0 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-17  9:50 UTC (permalink / raw)
  To: Wen Congyang, Xen-devel
  Cc: Ian Jackson, Yang Hongyang, Wei Liu, Ian Campbell, Ross Lagerwall

On 17/06/15 08:57, Wen Congyang wrote:
>> +    /* Queue up reading the body. */
>> +    size_t bytes_to_read;
>> +
>> +    switch (rec_hdr->type) {
>> +        /*
>> +         * Emulator records want to retain the blob in the pipe, for a further
>> +         * datacopier call to move elsewhere.  Just read the emulator header.
>> +         */
> In this case, we should not call ROUNDUP().
>
>> +    case REC_TYPE_EMULATOR_CONTEXT:
>> +        bytes_to_read = sizeof(struct libxl_sr_emulator_hdr);
>> +        break;
>> +
>> +    default:
>> +        bytes_to_read = rec_hdr->length;
>> +        break;
>> +    }
>> +
>> +    bytes_to_read = ROUNDUP(bytes_to_read, REC_ALIGN_ORDER);
> So, I think it is better to move ROUNDUP to the default case.
>
> Thanks
> Wen Congyang
>

sizeof(struct libxl_sr_emulator_hdr) is cunningly of the appropriate
order already.

I suppose it is probably better to move the roundup into the default
case and assert() appropriate alignment after the switch().
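
A minimal sketch of that rearrangement, written as standalone C rather than
the actual libxl code.  ROUNDUP and REC_ALIGN_ORDER are local stand-ins here
(assuming 8 byte alignment, i.e. order 3), and the record type codes and
struct emulator_hdr are hypothetical placeholders for the real definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; assumed to match "records start on an 8 byte boundary". */
#define REC_ALIGN_ORDER 3U
#define ROUNDUP(x, order) \
    (((x) + (((size_t)1 << (order)) - 1)) & ~(((size_t)1 << (order)) - 1))

/* Hypothetical type codes and sub-header layout, for illustration only. */
enum { REC_TYPE_EMULATOR_CONTEXT = 1, REC_TYPE_OTHER = 2 };
struct emulator_hdr { uint32_t id; uint32_t index; };

/* How many bytes to queue up for reading after a record header. */
static size_t body_bytes_to_read(uint32_t type, uint32_t length)
{
    size_t bytes_to_read;

    switch (type) {
    case REC_TYPE_EMULATOR_CONTEXT:
        /* Only the sub-header is read; the blob stays in the pipe. */
        bytes_to_read = sizeof(struct emulator_hdr);
        break;

    default:
        /* Whole body, including any trailing padding. */
        bytes_to_read = ROUNDUP((size_t)length, REC_ALIGN_ORDER);
        break;
    }

    /* Every path must leave the reader on an aligned boundary. */
    assert((bytes_to_read & (((size_t)1 << REC_ALIGN_ORDER) - 1)) == 0);
    return bytes_to_read;
}
```

The assert() makes the assumption about the emulator sub-header size explicit,
instead of relying on the roundup to hide a future change to the struct.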

~Andrew

* Re: [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
  2015-06-17  1:31   ` Yang Hongyang
@ 2015-06-17  9:51     ` Andrew Cooper
  0 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-17  9:51 UTC (permalink / raw)
  To: Yang Hongyang, Xen-devel
  Cc: Wei Liu, Ian Jackson, Ian Campbell, Ross Lagerwall

On 17/06/15 02:31, Yang Hongyang wrote:
>> +    default:
>> +        ret = ERROR_FAIL;
>> +        goto err;
>> +    }
>> +
>> +    ret = libxl__domain_suspend_device_model(gc, dss);
>
> This is no longer needed: the suspend callback has already called
> this function, and the emulator context has already been saved to a file.
>
> Under Remus, this call will cause the primary's emulator to stop.
> The postcopy callback resumes the primary, so in the checkpoint
> callback we shouldn't suspend the device model.

It is the result of copying how everything was done previously.  I will
drop it.

~Andrew

* Re: [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream
  2015-06-17  7:30       ` Ian Campbell
@ 2015-06-17  9:55         ` Andrew Cooper
  0 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-17  9:55 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 17/06/15 08:30, Ian Campbell wrote:
> On Tue, 2015-06-16 at 16:53 +0100, Andrew Cooper wrote:
>> On 16/06/15 16:03, Ian Campbell wrote:
>>> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>>>> when signalled to do so by libxl__remus_domain_checkpoint_callback()
>>> I think I saw that Remus wasn't currently working, so I'll let you and
>>> Hongyang thrash something out before I spend too much effort reviewing
>>> these last few RFC bits. Unless you think it is worth my having a look
>>> now?
>>>
>>>
>> Remus was broken by patch 19 in the series, and this patch forms part of
>> fixing it again.
>>
>> I can't find a way of fixing the layering violation in both plain
>> migration and Remus, in a readable, bisectable way.
>>
>> Remus requires identical source and destination toolstacks, and the
>> Remus maintainers are happy enough with the "break it and fix it up in
>> the same series" approach.
>>
>> Now that the series is complete, there is some shuffling room to reduce
>> the window of breakage, but short of folding patches 19, 21, 23-25
>> together, Remus will break.
> The report I was referring to thinking I'd seen was that Remus was still
> broken even after the complete series was applied i.e. there was still
> more to be done.

That is because I was blind-coding Remus support without an ability to test.

>
> I'm happy with the transient breakage in this series on this occasion,
> but I was proposing not to review until Remus was thought to be working
> OK at the end.

It is mostly fixed now.  I just need to fix the failover handling, and
have instructions on how to do so.

~Andrew

* Re: [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
  2015-06-17  9:50     ` Andrew Cooper
@ 2015-06-17 10:01       ` Wen Congyang
  2015-06-17 10:48         ` Andrew Cooper
  0 siblings, 1 reply; 107+ messages in thread
From: Wen Congyang @ 2015-06-17 10:01 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Ian Jackson, Yang Hongyang, Wei Liu, Ian Campbell, Ross Lagerwall

On 06/17/2015 05:50 PM, Andrew Cooper wrote:
> On 17/06/15 08:57, Wen Congyang wrote:
>>> +    /* Queue up reading the body. */
>>> +    size_t bytes_to_read;
>>> +
>>> +    switch (rec_hdr->type) {
>>> +        /*
>>> +         * Emulator records want to retain the blob in the pipe, for a further
>>> +         * datacopier call to move elsewhere.  Just read the emulator header.
>>> +         */
>> In this case, we should not call ROUNDUP().
>>
>>> +    case REC_TYPE_EMULATOR_CONTEXT:
>>> +        bytes_to_read = sizeof(struct libxl_sr_emulator_hdr);
>>> +        break;
>>> +
>>> +    default:
>>> +        bytes_to_read = rec_hdr->length;
>>> +        break;
>>> +    }
>>> +
>>> +    bytes_to_read = ROUNDUP(bytes_to_read, REC_ALIGN_ORDER);
>> So, I think it is better to move ROUNDUP to the default case.
>>
>> Thanks
>> Wen Congyang
>>
> 
> sizeof(struct libxl_sr_emulator_hdr) is cunningly of the appropriate
> order already.

Yes

> 
> I suppose it is probably better to move the roundup into the default
> case and assert() appropriate alignment after the switch()

Do you mean the sub-header must be aligned?

Thanks
Wen Congyang

> 
> ~Andrew
> .
> 

* Re: [PATCH 27/27] tools/libxl: Drop all knowledge of toolstack callbacks
  2015-06-16 15:06     ` Andrew Cooper
@ 2015-06-17 10:14       ` Ian Campbell
  2015-06-17 10:43         ` Andrew Cooper
  0 siblings, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-06-17 10:14 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Tue, 2015-06-16 at 16:06 +0100, Andrew Cooper wrote:
> On 16/06/15 16:04, Ian Campbell wrote:
> > On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> >> Libxl has now been fully adjusted not to need them.
> >>
> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> > Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
> >
> > /me looks mournfully at the #28 shaped hole in this series which would
> > nuke all the migration v1 code from libxc :-)
> 
> I was going to slip that into v2.  I didn't want to delay posting v1 for
> review, given the proximity of the 4.6 freeze.
> 
> I think I will transcribe the description of the legacy protocol from
> xg_save_restore.h

That would be good, thanks.

>  and code up the legacy protocol in python.

That would be above and beyond, but don't let me stop you ;-)

Ian.

* Re: [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
  2015-06-15 13:44 ` [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream Andrew Cooper
                     ` (4 preceding siblings ...)
  2015-06-17  7:38   ` Yang Hongyang
@ 2015-06-17 10:14   ` Wen Congyang
  2015-07-10 10:55   ` Ian Campbell
  6 siblings, 0 replies; 107+ messages in thread
From: Wen Congyang @ 2015-06-17 10:14 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Ian Jackson, Yang Hongyang, Wei Liu, Ian Campbell, Ross Lagerwall

On 06/15/2015 09:44 PM, Andrew Cooper wrote:
> From: Ross Lagerwall <ross.lagerwall@citrix.com>
> 
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/Makefile             |    2 +-
>  tools/libxl/libxl_internal.h     |   33 +++
>  tools/libxl/libxl_stream_write.c |  536 ++++++++++++++++++++++++++++++++++++++
>  3 files changed, 570 insertions(+), 1 deletion(-)
>  create mode 100644 tools/libxl/libxl_stream_write.c
> 
> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
> index ca0ae3e..63e32f7 100644
> --- a/tools/libxl/Makefile
> +++ b/tools/libxl/Makefile
> @@ -94,7 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
>  			libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
>  			libxl_internal.o libxl_utils.o libxl_uuid.o \
>  			libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
> -			libxl_stream_read.o \
> +			libxl_stream_read.o libxl_stream_write.o \
>  			libxl_save_callout.o _libxl_save_msgs_callout.o \
>  			libxl_convert_callout.o \
>  			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 5482950..82cd792 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -2868,6 +2868,38 @@ typedef void libxl__domain_suspend_cb(libxl__egc*,
>  typedef void libxl__save_device_model_cb(libxl__egc*,
>                                           libxl__domain_suspend_state*, int rc);
>  
> +/* State for writing a libxl migration v2 stream */
> +typedef struct libxl__stream_write_state libxl__stream_write_state;
> +
> +struct libxl__stream_write_state {
> +    /* filled by the user */
> +    libxl__ao *ao;
> +    int fd;
> +    uint32_t domid;
> +    void (*completion_callback)(libxl__egc *egc,
> +                                libxl__domain_suspend_state *dss,
> +                                int rc);
> +    /* Private */
> +    int rc;
> +    int joined_rc;
> +    size_t padding;
> +    bool running;
> +    libxl__datacopier_state dc;
> +};
> +
> +_hidden void libxl__stream_write_start(libxl__egc *egc,
> +                                       libxl__stream_write_state *stream);
> +
> +_hidden void libxl__stream_write_abort(libxl__egc *egc,
> +                                       libxl__stream_write_state *stream,
> +                                       int rc);
> +
> +static inline bool libxl__stream_write_inuse(
> +    const libxl__stream_write_state *stream)
> +{
> +    return stream->running;
> +}
> +
>  typedef struct libxl__logdirty_switch {
>      const char *cmd;
>      const char *cmd_path;
> @@ -2907,6 +2939,7 @@ struct libxl__domain_suspend_state {
>      /* private for libxl__domain_save_device_model */
>      libxl__save_device_model_cb *save_dm_callback;
>      libxl__datacopier_state save_dm_datacopier;
> +    libxl__stream_write_state sws;
>  };
>  
>  
> diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
> new file mode 100644
> index 0000000..856d72e
> --- /dev/null
> +++ b/tools/libxl/libxl_stream_write.c
> @@ -0,0 +1,536 @@
> +/*
> + * Copyright (C) 2015      Citrix Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU Lesser General Public License for more details.
> + */
> +
> +#include "libxl_osdeps.h" /* must come before any other headers */
> +
> +#include "libxl_internal.h"
> +
> +/*
> + * Infrastructure for writing a domain to a libxl migration v2 stream.
> + *
> + * Entry points from outside:
> + *  - libxl__stream_write_start()
> + *     - Start writing a stream from the start.
> + *
> + * In normal operation, there are two tasks running at once; this stream
> + * processing, and the libxl-save-helper.  check_stream_finished() is used
> + * to join all the tasks in both success and error cases.
> + *
> + * Nomenclature for event callbacks:
> + *  - $FOO_done(): Completion callback for $FOO
> + *  - write_$FOO(): Set up writing a $FOO
> + *  - $BAR_header(): A $BAR record header only
> + *  - $BAR_record(): A complete $BAR record with header and content
> + *
> + * The main loop for a plain VM writes:
> + *  - Stream header
> + *  - Libxc record
> + *  - Toolstack record
> + *  - if (hvm), Qemu record
> + *  - End record
> + */
> +
> +static const uint8_t zero_padding[1U << REC_ALIGN_ORDER] = { 0 };
> +
> +static void stream_success(libxl__egc *egc,
> +                           libxl__stream_write_state *stream);
> +static void stream_failed(libxl__egc *egc,
> +                          libxl__stream_write_state *stream, int ret);
> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_write_state *stream);
> +
> +static void check_stream_finished(libxl__egc *egc,
> +                                  libxl__domain_suspend_state *dcs,
> +                                  int rc, const char *what);
> +
> +/* Event callbacks for plain VM. */
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void libxc_header_done(libxl__egc *egc,
> +                              libxl__datacopier_state *dc,
> +                              int onwrite, int errnoval);
> +/* libxl__xc_domain_save_done() lives here, event-order wise. */
> +static void write_toolstack_record(libxl__egc *egc,
> +                                   libxl__stream_write_state *stream);
> +static void toolstack_record_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval);
> +static void write_emulator_record(libxl__egc *egc,
> +                                  libxl__stream_write_state *stream);
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval);
> +static void emulator_padding_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval);
> +static void write_end_record(libxl__egc *egc,
> +                             libxl__stream_write_state *stream);
> +static void end_record_done(libxl__egc *egc,
> +                            libxl__datacopier_state *dc,
> +                            int onwrite, int errnoval);
> +
> +void libxl__stream_write_start(libxl__egc *egc,
> +                               libxl__stream_write_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_hdr hdr = { 0 };
> +    int ret = 0;
> +
> +    assert(!stream->running);
> +    stream->running = true;
> +
> +    memset(dc, 0, sizeof(*dc));
> +    dc->readwhat = "";
> +    dc->copywhat = "suspend header";
> +    dc->writewhat = "save/migration stream";
> +    dc->ao = ao;
> +    dc->readfd = -1;
> +    dc->writefd = stream->fd;
> +    dc->maxsz = INT_MAX;
> +    dc->bytes_to_read = INT_MAX;
> +    dc->callback = stream_header_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    hdr.ident   = htobe64(RESTORE_STREAM_IDENT);
> +    hdr.version = htobe32(RESTORE_STREAM_VERSION);
> +    hdr.options = htobe32(0);
> +
> +    libxl__datacopier_prefixdata(egc, dc, &hdr, sizeof(hdr));
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +void libxl__stream_write_abort(libxl__egc *egc,
> +                               libxl__stream_write_state *stream, int rc)
> +{
> +    stream_failed(egc, stream, rc);
> +}
> +
> +static void stream_success(libxl__egc *egc, libxl__stream_write_state *stream)
> +{
> +    stream->rc = 0;
> +    stream->running = false;
> +
> +    stream_done(egc, stream);
> +}
> +
> +static void stream_failed(libxl__egc *egc,
> +                          libxl__stream_write_state *stream, int rc)
> +{
> +    assert(rc);
> +    stream->rc = rc;
> +
> +    if (stream->running) {
> +        stream->running = false;
> +        stream_done(egc, stream);
> +    }
> +}
> +
> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_write_state *stream)
> +{
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);

Sometimes we need to send a stream back from the restore side to the save
side. In that case there is no dss, so it would be better not to use dss
directly in libxl_stream_write.c.
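
One possible shape for that, sketched as standalone C and not as a proposal
for the actual patch.  The owner pointer, the callback signature and both
owner structs are hypothetical; the idea is only that the stream reports
completion through its own state rather than via CONTAINER_OF() into a dss:

```c
#include <assert.h>
#include <stddef.h>

struct stream_write_state;
typedef void stream_done_cb(struct stream_write_state *stream, int rc);

struct stream_write_state {
    int rc;
    int running;
    stream_done_cb *completion_callback; /* filled in by the owner */
    void *owner;                         /* hypothetical opaque owner pointer */
};

/* The stream never needs to know what kind of object embeds or owns it. */
static void stream_done(struct stream_write_state *stream)
{
    assert(!stream->running);
    stream->completion_callback(stream, stream->rc);
}

/* Two unrelated owners: a save-side state and a restore-side state. */
struct save_state { int last_rc; };
struct restore_state { int back_channel_rc; };

static void save_stream_done(struct stream_write_state *stream, int rc)
{
    struct save_state *save = stream->owner;
    save->last_rc = rc;
}

static void restore_stream_done(struct stream_write_state *stream, int rc)
{
    struct restore_state *restore = stream->owner;
    restore->back_channel_rc = rc;
}
```

With this layout the same writer could serve both directions, at the cost of
one extra pointer per stream.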

Thanks
Wen Congyang

> +
> +    assert(!stream->running);
> +
> +    check_stream_finished(egc, dss, stream->rc, "stream");
> +}
> +
> +static void check_stream_finished(libxl__egc *egc,
> +                                  libxl__domain_suspend_state *dss,
> +                                  int rc, const char *what)
> +{
> +    libxl__stream_write_state *stream = &dss->sws;
> +    STATE_AO_GC(dss->ao);
> +
> +    LOG(INFO, "Task '%s' joining (rc %d)", what, rc);
> +
> +    if (rc && !stream->joined_rc) {
> +        bool skip = false;
> +        /* First reported failure from joining tasks.  Tear everything down */
> +        stream->joined_rc = rc;
> +
> +        if (libxl__stream_write_inuse(&dss->sws)) {
> +            skip = true;
> +            libxl__stream_write_abort(egc, &dss->sws, rc);
> +        }
> +
> +        if (libxl__save_helper_inuse(&dss->shs)) {
> +            skip = true;
> +            libxl__save_helper_abort(egc, &dss->shs);
> +        }
> +
> +        /* There is at least one more active task to join - wait for its
> +           callback */
> +        if ( skip )
> +            return;
> +    }
> +
> +    if (libxl__stream_write_inuse(&dss->sws))
> +        LOG(DEBUG, "stream still in use");
> +    else if (libxl__save_helper_inuse(&dss->shs))
> +        LOG(DEBUG, "save/restore still in use");
> +    else {
> +        LOG(INFO, "Join complete: result %d", stream->joined_rc);
> +        stream->completion_callback(egc, dss, stream->joined_rc);
> +    }
> +}
> +
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_LIBXC_CONTEXT, 0 };
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    dc->copywhat = "suspend footer";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = libxc_header_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void libxc_header_done(libxl__egc *egc,
> +                              libxl__datacopier_state *dc,
> +                              int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    libxl__xc_domain_save(egc, dss);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void __attribute__((used))
> +will_be_libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
> +                                int rc, int retval, int errnoval)
> +{
> +    libxl__domain_suspend_state *dss = dss_void;
> +    libxl__stream_write_state *stream = &dss->sws;
> +    STATE_AO_GC(dss->ao);
> +
> +    if (rc)
> +        goto err;
> +
> +    if (retval) {
> +        LOGEV(ERROR, errnoval, "saving domain: %s",
> +                         dss->guest_responded ?
> +                         "domain responded to suspend request" :
> +                         "domain did not respond to suspend request");
> +        if ( !dss->guest_responded )
> +            rc = ERROR_GUEST_TIMEDOUT;
> +        else
> +            rc = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    write_toolstack_record(egc, stream);
> +    return;
> +
> + err:
> +    assert(rc);
> +    check_stream_finished(egc, dss, rc, "save/restore helper");
> +}
> +
> +static void write_toolstack_record(libxl__egc *egc,
> +                                   libxl__stream_write_state *stream)
> +{
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_XENSTORE_DATA, 0 };
> +    int ret = 0;
> +    uint8_t *toolstack_buf = NULL; /* We must free this. */
> +    uint32_t toolstack_len, padding_len;
> +
> +    ret = libxl__toolstack_save(dss->domid, &toolstack_buf,
> +                                &toolstack_len, dss);
> +    if (ret)
> +        goto err;
> +
> +    dc->copywhat = "toolstack record";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = toolstack_record_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    rec.length = toolstack_len;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    libxl__datacopier_prefixdata(egc, dc, toolstack_buf, toolstack_len);
> +
> +    padding_len = ROUNDUP(rec.length, REC_ALIGN_ORDER) - rec.length;
> +    if (padding_len)
> +        libxl__datacopier_prefixdata(egc, dc, zero_padding, padding_len);
> +
> +    free(toolstack_buf);
> +    return;
> +
> + err:
> +    assert(ret);
> +    free(toolstack_buf);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void toolstack_record_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    if (dss->type == LIBXL_DOMAIN_TYPE_HVM)
> +        write_emulator_record(egc, stream);
> +    else
> +        write_end_record(egc, stream);
> +
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void write_emulator_record(libxl__egc *egc,
> +                                  libxl__stream_write_state *stream)
> +{
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_EMULATOR_CONTEXT, 0 };
> +    struct libxl_sr_emulator_hdr ehdr = { 0 };
> +    struct stat st;
> +    int ret = 0;
> +    uint32_t qemu_state_len;
> +
> +    assert(dss->type == LIBXL_DOMAIN_TYPE_HVM);
> +
> +    /* Convenience aliases */
> +    const char *const filename = dss->dm_savefile;
> +    const uint32_t domid = dss->domid;
> +
> +    switch(libxl__device_model_version_running(gc, domid)) {
> +    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
> +        ehdr.id = EMULATOR_QEMU_TRADITIONAL;
> +        break;
> +
> +    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
> +        ehdr.id = EMULATOR_QEMU_UPSTREAM;
> +        break;
> +
> +    default:
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    ret = libxl__domain_suspend_device_model(gc, dss);
> +    if (ret)
> +        goto err;
> +
> +    dc->readwhat = GCSPRINTF("qemu save file %s", filename);
> +    dc->copywhat = "emulator record";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = emulator_body_done;
> +
> +    dc->readfd = open(filename, O_RDONLY);
> +    if (dc->readfd < 0) {
> +        LOGE(ERROR, "unable to open %s", dc->readwhat);
> +        goto err;
> +    }
> +
> +    if (fstat(dc->readfd, &st))
> +    {
> +        LOGE(ERROR, "unable to fstat %s", dc->readwhat);
> +        goto err;
> +    }
> +
> +    if (!S_ISREG(st.st_mode)) {
> +        LOG(ERROR, "%s is not a plain file!", dc->readwhat);
> +        goto err;
> +    }
> +
> +    qemu_state_len = st.st_size;
> +    rec.length = qemu_state_len + sizeof(ehdr);
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    libxl__datacopier_prefixdata(egc, dc, &ehdr, sizeof(ehdr));
> +
> +    stream->padding = ROUNDUP(qemu_state_len, REC_ALIGN_ORDER) - qemu_state_len;
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    dc->readwhat = "";
> +    dc->readfd = -1;
> +
> +    if (stream->padding) {
> +        assert(stream->padding < (1U << REC_ALIGN_ORDER));
> +
> +        dc->copywhat = "emulator padding";
> +        dc->writewhat = "save/migration stream";
> +        dc->callback = emulator_padding_done;
> +
> +        ret = libxl__datacopier_start(dc);
> +        if (ret)
> +            goto err;
> +
> +        libxl__datacopier_prefixdata(egc, dc, zero_padding, stream->padding);
> +        return;
> +    }
> +
> +    emulator_padding_done(egc, dc, 0, 0);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_padding_done(libxl__egc *egc,
> +                                  libxl__datacopier_state *dc,
> +                                  int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    write_end_record(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void write_end_record(libxl__egc *egc,
> +                             libxl__stream_write_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_END, 0 };
> +    int ret = 0;
> +
> +    dc->copywhat = "suspend footer";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = end_record_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void end_record_done(libxl__egc *egc,
> +                            libxl__datacopier_state *dc,
> +                            int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    stream_success(egc, stream);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> 

* Re: [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
  2015-06-17  3:09   ` Wen Congyang
@ 2015-06-17 10:15     ` Ian Campbell
  2015-06-17 10:49       ` Wen Congyang
  0 siblings, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-06-17 10:15 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Wei Liu, Andrew Cooper, Ian Jackson, Xen-devel, Ross Lagerwall,
	Yang Hongyang

On Wed, 2015-06-17 at 11:09 +0800, Wen Congyang wrote:
> > +    if (hdr->options & RESTORE_OPT_BIG_ENDIAN) {
> > +        ret = ERROR_FAIL;
> > +        LOG(ERROR, "Unable to handle big endian streams");
> > +        goto err;
> 
> I think it is better to check if the host is big endian or not.

It's not, there are no big endian ports of Xen today. I think encoding
that here is fine.
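
For comparison, a sketch of what the host-side check suggested above could
look like if a big endian port ever appeared.  This is standalone C, not libxl
code; the flag value is an assumption for illustration, and both helper
functions are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag value; the real one lives in the stream format header. */
#define RESTORE_OPT_BIG_ENDIAN (1u << 0)

/* Probe the byte order of the machine this code runs on. */
static bool host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    return *(const uint8_t *)&probe == 0x01;
}

/* Accept a stream only if its declared byte order matches the host. */
static bool stream_endianness_ok(uint32_t options)
{
    bool stream_big = (options & RESTORE_OPT_BIG_ENDIAN) != 0;
    return stream_big == host_is_big_endian();
}
```

On today's little endian hosts this reduces to the check in the patch:
reject any stream with RESTORE_OPT_BIG_ENDIAN set.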

* Re: [PATCH 27/27] tools/libxl: Drop all knowledge of toolstack callbacks
  2015-06-17 10:14       ` Ian Campbell
@ 2015-06-17 10:43         ` Andrew Cooper
  2015-06-17 10:53           ` Ian Campbell
  0 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-06-17 10:43 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 17/06/15 11:14, Ian Campbell wrote:
> On Tue, 2015-06-16 at 16:06 +0100, Andrew Cooper wrote:
>> On 16/06/15 16:04, Ian Campbell wrote:
>>> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>>>> Libxl has now been fully adjusted not to need them.
>>>>
>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>>> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
>>>
>>> /me looks mournfully at the #28 shaped hole in this series which would
>>> nuke all the migration v1 code from libxc :-)
>> I was going to slip that into v2.  I didn't want to delay posting v1 for
>> review, given the proximity of the 4.6 freeze.
>>
>> I think I will transcribe the description of the legacy protocol from
>> xg_save_restore.h
> That would be good, thanks.
>
>>  and code up the legacy protocol in python.
> That would be above and beyond, but don't let me stop you ;-)

Well - it is currently coded up with magic numbers in the conversion
script.  All I was planning to do was make some numbers less magic.

~Andrew

* Re: [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
  2015-06-17 10:01       ` Wen Congyang
@ 2015-06-17 10:48         ` Andrew Cooper
  0 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-17 10:48 UTC (permalink / raw)
  To: Wen Congyang, Xen-devel
  Cc: Ian Jackson, Yang Hongyang, Wei Liu, Ian Campbell, Ross Lagerwall

On 17/06/15 11:01, Wen Congyang wrote:
> On 06/17/2015 05:50 PM, Andrew Cooper wrote:
>> On 17/06/15 08:57, Wen Congyang wrote:
>>>> +    /* Queue up reading the body. */
>>>>> +    size_t bytes_to_read;
>>>>> +
>>>>> +    switch (rec_hdr->type) {
>>>>> +        /*
>>>>> +         * Emulator records want to retain the blob in the pipe, for a further
>>>>> +         * datacopier call to move elsewhere.  Just read the emulator header.
>>>>> +         */
>>> In this case, we should not call ROUNDUP().
>>>
>>>>> +    case REC_TYPE_EMULATOR_CONTEXT:
>>>>> +        bytes_to_read = sizeof(struct libxl_sr_emulator_hdr);
>>>>> +        break;
>>>>> +
>>>>> +    default:
>>>>> +        bytes_to_read = rec_hdr->length;
>>>>> +        break;
>>>>> +    }
>>>>> +
>>>>> +    bytes_to_read = ROUNDUP(bytes_to_read, REC_ALIGN_ORDER);
>>> So, I think it is better to move ROUNDUP to default case.
>>>
>>> Thanks
>>> Wen Congyang
>>>
>> sizeof(struct libxl_sr_emulator_hdr) is cunningly of the appropriate
>> order already.
> Yes
>
>> I suppose it is probably better to move the roundup into the default
>> case and assert() appropriate alignment after the switch()
> Do you mean the sub-header must be aligned?

The start of any record is required to be aligned.  It is the
responsibility of any record which is not aligned to insert padding
after the content so the following record starts on an 8 byte boundary.
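
To illustrate the padding rule, here is a minimal sketch.  The helper
names are made up for this example and are not the libxl ones, and
REC_ALIGN_ORDER is assumed to be 3, which matches the 8 byte boundary
mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value: records are aligned to 1 << 3 = 8 bytes. */
#define REC_ALIGN_ORDER 3u

/* Round val up to the next multiple of (1 << order). */
static size_t roundup(size_t val, unsigned int order)
{
    size_t mask = ((size_t)1 << order) - 1;

    return (val + mask) & ~mask;
}

/*
 * Number of zero bytes a writer would append after a record body of
 * `length` bytes so that the following record starts aligned.
 */
static size_t record_padding(size_t length)
{
    return roundup(length, REC_ALIGN_ORDER) - length;
}
```

With these definitions a 13 byte body would be followed by 3 bytes of
padding, and a body which is already a multiple of 8 bytes needs none.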

~Andrew

* Re: [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
  2015-06-17 10:15     ` Ian Campbell
@ 2015-06-17 10:49       ` Wen Congyang
  2015-06-17 10:55         ` Ian Campbell
  0 siblings, 1 reply; 107+ messages in thread
From: Wen Congyang @ 2015-06-17 10:49 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Wei Liu, Andrew Cooper, Ian Jackson, Xen-devel, Ross Lagerwall,
	Yang Hongyang

On 06/17/2015 06:15 PM, Ian Campbell wrote:
> On Wed, 2015-06-17 at 11:09 +0800, Wen Congyang wrote:
>>> +    if (hdr->options & RESTORE_OPT_BIG_ENDIAN) {
>>> +        ret = ERROR_FAIL;
>>> +        LOG(ERROR, "Unable to handle big endian streams");
>>> +        goto err;
>>
>> I think it is better to check if the host is big endian or not.
> 
> It's not, there are no big endian ports of Xen today. I think encoding
> that here is fine.

IIRC, arm supports big endian. Do we only use arm+little endian? If so,
it is OK now.

Thanks
Wen Congyang

* Re: [PATCH 27/27] tools/libxl: Drop all knowledge of toolstack callbacks
  2015-06-17 10:43         ` Andrew Cooper
@ 2015-06-17 10:53           ` Ian Campbell
  0 siblings, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-06-17 10:53 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Wed, 2015-06-17 at 11:43 +0100, Andrew Cooper wrote:
> On 17/06/15 11:14, Ian Campbell wrote:
> > On Tue, 2015-06-16 at 16:06 +0100, Andrew Cooper wrote:
> >> On 16/06/15 16:04, Ian Campbell wrote:
> >>> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> >>>> Libxl has now been fully adjusted not to need them.
> >>>>
> >>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> >>> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
> >>>
> >>> /me looks mournfully at the #28 shaped hole in this series which would
> >>> nuke all the migration v1 code from libxc :-)
> >> I was going to slip that into v2.  I didn't want to delay posting v1 for
> >> review, given the proximity of the 4.6 freeze.
> >>
> >> I think I will transcribe the description of the legacy protocol from
> >> xg_save_restore.h
> > That would be good, thanks.
> >
> >>  and code up the legacy protocol in python.
> > That would be above and beyond, but don't let me stop you ;-)
> 
> Well - it is currently coded up with magic numbers in the conversion
> script.  All I was planning to do was make some numbers less magic.

That does sound nice, thanks.

Ian.

* Re: [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream
  2015-06-17 10:49       ` Wen Congyang
@ 2015-06-17 10:55         ` Ian Campbell
  0 siblings, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-06-17 10:55 UTC (permalink / raw)
  To: Wen Congyang
  Cc: Wei Liu, Andrew Cooper, Ian Jackson, Xen-devel, Ross Lagerwall,
	Yang Hongyang

On Wed, 2015-06-17 at 18:49 +0800, Wen Congyang wrote:
> On 06/17/2015 06:15 PM, Ian Campbell wrote:
> > On Wed, 2015-06-17 at 11:09 +0800, Wen Congyang wrote:
> >>> +    if (hdr->options & RESTORE_OPT_BIG_ENDIAN) {
> >>> +        ret = ERROR_FAIL;
> >>> +        LOG(ERROR, "Unable to handle big endian streams");
> >>> +        goto err;
> >>
> >> I think it is better to check if the host is big endian or not.
> > 
> > It's not, there are no big endian ports of Xen today. I think encoding
> > that here is fine.
> 
> IIRC, arm supports big endian. Do we only use arm+little endian?

Currently, yes. There are plans afoot to support big endian _guests_ but
not big endian hypervisor (and by extension IMHO tools or at least the
migration stream should remain LE). That would be a lot of faff for very
little benefit IMHO.
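
For reference, the check under discussion boils down to something like
the sketch below.  The struct is an illustrative stand-in rather than
the real stream header, and the bit value chosen for
RESTORE_OPT_BIG_ENDIAN is an assumption made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit position; the real value lives in the stream format header. */
#define RESTORE_OPT_BIG_ENDIAN (UINT32_C(1) << 0)

/* Illustrative stand-in for the stream header; only `options` matters here. */
struct stream_hdr {
    uint32_t options;
};

/* Returns 0 if the stream can be consumed, -1 if it is marked big endian. */
static int check_stream_endianness(const struct stream_hdr *hdr)
{
    if (hdr->options & RESTORE_OPT_BIG_ENDIAN)
        return -1;

    return 0;
}
```

The receiver never inspects the endianness of the host it runs on; it
only refuses streams whose header says they were written big endian.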

Ian.

* Re: [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state
  2015-06-15 13:44 ` [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state Andrew Cooper
  2015-06-16 13:37   ` Ian Campbell
@ 2015-06-18  2:32   ` Yang Hongyang
  1 sibling, 0 replies; 107+ messages in thread
From: Yang Hongyang @ 2015-06-18  2:32 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Wei Liu, Ian Jackson, Ian Campbell



On 06/15/2015 09:44 PM, Andrew Cooper wrote:
> Shortly more parameters will appear, and this saves unboxing each one.
>
> No functional change.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Yang Hongyang <yanghy@cn.fujitsu.com>

> ---
>   tools/libxl/libxl_create.c       |   12 ++++++------
>   tools/libxl/libxl_internal.h     |    2 +-
>   tools/libxl/libxl_save_callout.c |    2 +-
>   3 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index 86384d2..385891c 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -1577,8 +1577,8 @@ static void domain_create_cb(libxl__egc *egc,
>                                int rc, uint32_t domid);
>
>   static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
> -                            uint32_t *domid,
> -                            int restore_fd, int checkpointed_stream,
> +                            uint32_t *domid, int restore_fd,
> +                            const libxl_domain_restore_params *params,
>                               const libxl_asyncop_how *ao_how,
>                               const libxl_asyncprogress_how *aop_console_how)
>   {
> @@ -1591,8 +1591,8 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
>       libxl_domain_config_init(&cdcs->dcs.guest_config_saved);
>       libxl_domain_config_copy(ctx, &cdcs->dcs.guest_config_saved, d_config);
>       cdcs->dcs.restore_fd = restore_fd;
> +    if (params) cdcs->dcs.restore_params = *params;
>       cdcs->dcs.callback = domain_create_cb;
> -    cdcs->dcs.checkpointed_stream = checkpointed_stream;
>       libxl__ao_progress_gethow(&cdcs->dcs.aop_console_how, aop_console_how);
>       cdcs->domid_out = domid;
>
> @@ -1619,7 +1619,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
>                               const libxl_asyncop_how *ao_how,
>                               const libxl_asyncprogress_how *aop_console_how)
>   {
> -    return do_domain_create(ctx, d_config, domid, -1, 0,
> +    return do_domain_create(ctx, d_config, domid, -1, NULL,
>                               ao_how, aop_console_how);
>   }
>
> @@ -1629,8 +1629,8 @@ int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config,
>                                   const libxl_asyncop_how *ao_how,
>                                   const libxl_asyncprogress_how *aop_console_how)
>   {
> -    return do_domain_create(ctx, d_config, domid, restore_fd,
> -                            params->checkpointed_stream, ao_how, aop_console_how);
> +    return do_domain_create(ctx, d_config, domid, restore_fd, params,
> +                            ao_how, aop_console_how);
>   }
>
>   /*
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 6226c18..796bd21 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -3122,11 +3122,11 @@ struct libxl__domain_create_state {
>       libxl_domain_config *guest_config;
>       libxl_domain_config guest_config_saved; /* vanilla config */
>       int restore_fd;
> +    libxl_domain_restore_params restore_params;
>       libxl__domain_create_cb *callback;
>       libxl_asyncprogress_how aop_console_how;
>       /* private to domain_create */
>       int guest_domid;
> -    int checkpointed_stream;
>       libxl__domain_build_state build_state;
>       libxl__bootloader_state bl;
>       libxl__stub_dm_spawn_state dmss;
> diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
> index 40b25e4..3585a84 100644
> --- a/tools/libxl/libxl_save_callout.c
> +++ b/tools/libxl/libxl_save_callout.c
> @@ -59,7 +59,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, libxl__domain_create_state *dcs,
>           state->store_domid, state->console_port,
>           state->console_domid,
>           hvm, pae, superpages,
> -        cbflags, dcs->checkpointed_stream,
> +        cbflags, dcs->restore_params.checkpointed_stream,
>       };
>
>       dcs->shs.ao = ao;
>

-- 
Thanks,
Yang.

* Re: [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream
  2015-06-15 13:44 ` [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream Andrew Cooper
  2015-06-16 15:03   ` Ian Campbell
@ 2015-06-18  3:13   ` Wen Congyang
  2015-06-18  9:44     ` Andrew Cooper
  1 sibling, 1 reply; 107+ messages in thread
From: Wen Congyang @ 2015-06-18  3:13 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Ian Jackson, Yang Hongyang, Wei Liu, Ian Campbell

On 06/15/2015 09:44 PM, Andrew Cooper wrote:
> when signalled to do so by libxl__remus_domain_checkpoint_callback()
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Ian Campbell <Ian.Campbell@citrix.com>
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxl/libxl_dom.c          |   16 +++---
>  tools/libxl/libxl_internal.h     |    7 +++
>  tools/libxl/libxl_stream_write.c |  111 ++++++++++++++++++++++++++++++++++++--
>  3 files changed, 121 insertions(+), 13 deletions(-)
> 
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index 06bfaab..3597a91 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -1867,8 +1867,8 @@ static void remus_devices_preresume_cb(libxl__egc *egc,
>  
>  /*----- remus asynchronous checkpoint callback -----*/
>  
> -static void remus_checkpoint_dm_saved(libxl__egc *egc,
> -                                      libxl__domain_suspend_state *dss, int rc);
> +static void remus_checkpoint_stream_written(
> +    libxl__egc *egc, libxl__domain_suspend_state *dss, int rc);
>  static void remus_devices_commit_cb(libxl__egc *egc,
>                                      libxl__remus_devices_state *rds,
>                                      int rc);
> @@ -1882,16 +1882,11 @@ static void libxl__remus_domain_checkpoint_callback(void *data)
>      libxl__egc *egc = dss->shs.egc;
>      STATE_AO_GC(dss->ao);
>  
> -    /* This would go into tailbuf. */
> -    if (dss->hvm) {
> -        libxl__domain_save_device_model(egc, dss, remus_checkpoint_dm_saved);
> -    } else {
> -        remus_checkpoint_dm_saved(egc, dss, 0);
> -    }
> +    libxl__stream_write_start_checkpoint(egc, &dss->sws);
>  }
>  
> -static void remus_checkpoint_dm_saved(libxl__egc *egc,
> -                                      libxl__domain_suspend_state *dss, int rc)
> +static void remus_checkpoint_stream_written(
> +    libxl__egc *egc, libxl__domain_suspend_state *dss, int rc)
>  {
>      /* Convenience aliases */
>      libxl__remus_devices_state *const rds = &dss->rds;
> @@ -2036,6 +2031,7 @@ void libxl__domain_suspend(libxl__egc *egc, libxl__domain_suspend_state *dss)
>          callbacks->suspend = libxl__remus_domain_suspend_callback;
>          callbacks->postcopy = libxl__remus_domain_resume_callback;
>          callbacks->checkpoint = libxl__remus_domain_checkpoint_callback;
> +        dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
>      } else
>          callbacks->suspend = libxl__domain_suspend_callback;
>  
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 82cd792..bf1c377 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -2879,17 +2879,24 @@ struct libxl__stream_write_state {
>      void (*completion_callback)(libxl__egc *egc,
>                                  libxl__domain_suspend_state *dss,
>                                  int rc);
> +    void (*checkpoint_callback)(libxl__egc *egc,
> +                                libxl__domain_suspend_state *dss,
> +                                int rc);
>      /* Private */
>      int rc;
>      int joined_rc;
>      size_t padding;
>      bool running;
> +    bool in_checkpoint;
>      libxl__datacopier_state dc;
>  };
>  
>  _hidden void libxl__stream_write_start(libxl__egc *egc,
>                                         libxl__stream_write_state *stream);
>  
> +_hidden void libxl__stream_write_start_checkpoint(
> +    libxl__egc *egc, libxl__stream_write_state *stream);
> +
>  _hidden void libxl__stream_write_abort(libxl__egc *egc,
>                                         libxl__stream_write_state *stream,
>                                         int rc);
> diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
> index d28a8a5..40f2cb7 100644
> --- a/tools/libxl/libxl_stream_write.c
> +++ b/tools/libxl/libxl_stream_write.c
> @@ -23,6 +23,9 @@
>   *  - libxl__stream_write_start()
>   *     - Start writing a stream from the start.
>   *
> + *  - libxl__stream_write_start_checkpoint()
> + *     - Write the records which form a checkpoint into a stream.
> + *
>   * In normal operation, there are two tasks running at once; this stream
>   * processing, and the the libxl-save-helper.  check_stream_finished() is used
>   * to join all the tasks in both success and error cases.
> @@ -39,6 +42,12 @@
>   *  - Toolstack record
>   *  - if (hvm), Qemu record
>   *  - End record
> + *
> + * For a checkpointed stream, there is a second loop which is triggered by a
> + * save-helper checkpoint callback.  It writes:
> + *  - Toolstack record
> + *  - if (hvm), Qemu record
> + *  - Checkpoint end record
>   */
>  
>  static const uint8_t zero_padding[1U << REC_ALIGN_ORDER] = { 0 };
> @@ -81,6 +90,16 @@ static void end_record_done(libxl__egc *egc,
>                              libxl__datacopier_state *dc,
>                              int onwrite, int errnoval);
>  
> +/* Event callbacks unique to checkpointed streams. */
> +static void checkpoint_done(libxl__egc *egc,
> +                            libxl__stream_write_state *stream,
> +                            int rc);
> +static void write_checkpoint_end_record(libxl__egc *egc,
> +                                        libxl__stream_write_state *stream);
> +static void checkpoint_end_record_done(libxl__egc *egc,
> +                                       libxl__datacopier_state *dc,
> +                                       int onwrite, int errnoval);
> +
>  void libxl__stream_write_start(libxl__egc *egc,
>                                 libxl__stream_write_state *stream)
>  {
> @@ -119,6 +138,16 @@ void libxl__stream_write_start(libxl__egc *egc,
>      stream_failed(egc, stream, ret);
>  }
>  
> +void libxl__stream_write_start_checkpoint(libxl__egc *egc,
> +                                          libxl__stream_write_state *stream)
> +{
> +    assert(stream->running);
> +    assert(!stream->in_checkpoint);
> +    stream->in_checkpoint = true;
> +
> +    write_toolstack_record(egc, stream);
> +}
> +
>  void libxl__stream_write_abort(libxl__egc *egc,
>                                 libxl__stream_write_state *stream, int rc)
>  {
> @@ -130,6 +159,7 @@ static void stream_success(libxl__egc *egc, libxl__stream_write_state *stream)
>      stream->rc = 0;
>      stream->running = false;
>  
> +    assert(!stream->in_checkpoint);
>      stream_done(egc, stream);
>  }
>  
> @@ -139,6 +169,15 @@ static void stream_failed(libxl__egc *egc,
>      assert(rc);
>      stream->rc = rc;
>  
> +    /*
> +     * If we are in a checkpoint, pass the failure to libxc, which will come
> +     * back around to us via libxl__xc_domain_save_done().
> +     */
> +    if (stream->in_checkpoint) {

I think we should set running to false here too.

Thanks
Wen Congyang

> +        checkpoint_done(egc, stream, rc);
> +        return;
> +    }
> +
>      if (stream->running) {
>          stream->running = false;
>          stream_done(egc, stream);
> @@ -151,6 +190,7 @@ static void stream_done(libxl__egc *egc,
>      libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
>  
>      assert(!stream->running);
> +    assert(!stream->in_checkpoint);
>  
>      check_stream_finished(egc, dss, stream->rc, "stream");
>  }
> @@ -335,8 +375,12 @@ static void toolstack_record_done(libxl__egc *egc,
>  
>      if (dss->type == LIBXL_DOMAIN_TYPE_HVM)
>          write_emulator_record(egc, stream);
> -    else
> -        write_end_record(egc, stream);
> +    else {
> +        if (stream->in_checkpoint)
> +            write_checkpoint_end_record(egc, stream);
> +        else
> +            write_end_record(egc, stream);
> +    }
>  
>      return;
>  
> @@ -473,7 +517,10 @@ static void emulator_padding_done(libxl__egc *egc,
>          goto err;
>      }
>  
> -    write_end_record(egc, stream);
> +    if (stream->in_checkpoint)
> +        write_checkpoint_end_record(egc, stream);
> +    else
> +        write_end_record(egc, stream);
>      return;
>  
>   err:
> @@ -526,6 +573,64 @@ static void end_record_done(libxl__egc *egc,
>      stream_failed(egc, stream, ret);
>  }
>  
> +static void checkpoint_done(libxl__egc *egc,
> +                            libxl__stream_write_state *stream,
> +                            int rc)
> +{
> +    libxl__domain_suspend_state *dss = CONTAINER_OF(stream, *dss, sws);
> +
> +    assert(stream->in_checkpoint);
> +    stream->in_checkpoint = false;
> +    stream->checkpoint_callback(egc, dss, rc);
> +}
> +
> +static void write_checkpoint_end_record(libxl__egc *egc,
> +                                        libxl__stream_write_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_rec_hdr rec = { REC_TYPE_CHECKPOINT_END, 0 };
> +    int ret = 0;
> +
> +    assert(stream->in_checkpoint);
> +
> +    dc->copywhat = "checkpoint record";
> +    dc->writewhat = "save/migration stream";
> +    dc->callback = checkpoint_end_record_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +
> +    libxl__datacopier_prefixdata(egc, dc, &rec, sizeof(rec));
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void checkpoint_end_record_done(libxl__egc *egc,
> +                                       libxl__datacopier_state *dc,
> +                                       int onwrite, int errnoval)
> +{
> +    libxl__stream_write_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    STATE_AO_GC(stream->ao);
> +    int ret = 0;
> +
> +    if (onwrite || errnoval) {
> +        ret = ERROR_FAIL;
> +        goto err;
> +    }
> +
> +    checkpoint_done(egc, stream, 0);
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> 

* Re: [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream
  2015-06-18  3:13   ` Wen Congyang
@ 2015-06-18  9:44     ` Andrew Cooper
  0 siblings, 0 replies; 107+ messages in thread
From: Andrew Cooper @ 2015-06-18  9:44 UTC (permalink / raw)
  To: Wen Congyang, Xen-devel; +Cc: Ian Jackson, Yang Hongyang, Wei Liu, Ian Campbell

On 18/06/15 04:13, Wen Congyang wrote:
>> @@ -139,6 +169,15 @@ static void stream_failed(libxl__egc *egc,
>> >      assert(rc);
>> >      stream->rc = rc;
>> >  
>> > +    /*
>> > +     * If we are in a checkpoint, pass the failure to libxc, which will come
>> > +     * back around to us via libxl__xc_domain_save_done().
>> > +     */
>> > +    if (stream->in_checkpoint) {
> I think we should set running to false here too.
>
> Thanks
> Wen Congyang
>

"running" encapsulates that a stream is being written, which includes
time when libxc is writing into the fd below us.

In particular, setting running to false here will prevent the stream
from being correctly torn down if the libxl-save-helper process
encounters an error.
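
As a rough model of the two flags (a simplified sketch, not the libxl
code; the struct and function names here are invented for
illustration):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the stream write state discussed above. */
struct stream_model {
    bool running;        /* a stream is being written, by libxl or libxc */
    bool in_checkpoint;  /* libxl is currently writing checkpoint records */
    int rc;              /* first failure recorded against the stream */
    int checkpoint_rc;   /* last value handed to the checkpoint callback */
    bool torn_down;      /* the stream has been joined and finished */
};

static void model_stream_failed(struct stream_model *s, int rc)
{
    s->rc = rc;

    if (s->in_checkpoint) {
        /*
         * Hand the failure to the checkpoint callback.  `running` is
         * deliberately left set so that a later failure can still tear
         * the stream down.
         */
        s->in_checkpoint = false;
        s->checkpoint_rc = rc;
        return;
    }

    if (s->running) {
        s->running = false;
        s->torn_down = true;
    }
}
```

In this model a failure inside a checkpoint only clears in_checkpoint.
A second failure, standing in for the save helper reporting its own
error, still finds running set and performs the teardown.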

~Andrew

* Re: [PATCH 00/27]  Libxl migration v2
  2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
                   ` (28 preceding siblings ...)
  2015-06-17  1:55 ` Wen Congyang
@ 2015-07-02  7:33 ` Yang Hongyang
  2015-07-02  9:26   ` Andrew Cooper
  29 siblings, 1 reply; 107+ messages in thread
From: Yang Hongyang @ 2015-07-02  7:33 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Wei Liu, Ian Jackson, Ian Campbell

Hi Andrew,

   Are there any updates to this series that I can check out and rebase mine onto?
:)

On 06/15/2015 09:44 PM, Andrew Cooper wrote:
> This series adds support for the libxl migration v2 stream, and untangles the
> existing layering violations of the toolstack and qemu records.
>
> At the end of the series, legacy migration is no longer used.
>
> Note: Remus support is broken and (RFC) fixed in separate patches in this
> series.  It was too tangled to fix in a bisectable fashion.  Plain
> suspend/migrate/resume however is (should be) bisectable along the entire
> series.
>
> There are a couple of outstanding questions:
>
> 1) What to do about the toolstack/xenstore record.  It is currently being
>     passed around as a blob, but it might be better to split it out.
>
> 2) What (if any) ABI/API qualifications are needed? (Particularly in reference
>     to patch 21)
>
> The Remus code is untested by me, but is hopefully in the correct ballpark.
> All other combinations of suspend/migrate/resume have been tested with PV and
> HVM guests (qemu-trad and qemu-upstream), including 32 -> 64 bit migration
> (which was the underlying bug causing us to write migration v2 in the first
> place).
>
> There are some further improvements which could be made.  In particular, it
> appears that sending the toolstack record on each checkpoint is redundant, and
> there is certainly room for some more pruning of the legacy migration code.
>
> Anyway, thoughts/comments welcome.  Please test!
>
> ~Andrew
>
>
> Andrew Cooper (22):
>    tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children
>    tools/libxc: Always compile the compat qemu variables into xc_sr_context
>    tools/libxl: Stash all restore parameters in domain_create_state
>    tools/xl: Mandatory flag indicating the format of the migration stream
>    tools/libxl: Introduce ROUNDUP()
>    tools/libxl: Extra APIs for the save helper
>    tools/libxl: Pass restore_fd as a parameter to libxl__xc_domain_restore()
>    docs: Libxl migration v2 stream specification
>    tools/python: Libxc migration v2 infrastructure
>    tools/python: Libxl migration v2 infrastructure
>    tools/python: Verification utility for v2 stream spec compliance
>    tools/python: Conversion utility for legacy migration streams
>    tools/libxl: Support converting a legacy stream to a v2 stream
>    tools/libxl: Convert a legacy stream if needed
>    tools/libxc+libxl+xl: Restore v2 streams
>    tools/libxc+libxl+xl: Save v2 streams
>    docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams
>    tools/libxl: [RFC] Write checkpoint records into the stream
>    tools/libx{c,l}: [RFC] Introduce restore_callbacks.checkpoint()
>    tools/libxl: [RFC] Handle checkpoint records in a libxl migration v2 stream
>    tools/libxc: Drop all XG_LIBXL_HVM_COMPAT code from libxc
>    tools/libxl: Drop all knowledge of toolstack callbacks
>
> Ian Jackson (2):
>    libxl: cancellation: Preparations for save/restore cancellation
>    libxl: cancellation: Handle SIGTERM in save/restore helper
>
> Ross Lagerwall (3):
>    tools/libxl: Migration v2 stream format
>    tools/libxl: Infrastructure for reading a libxl migration v2 stream
>    tools/libxl: Infrastructure for writing a v2 stream
>
>   docs/specs/libxl-migration-stream.pandoc      |  218 ++++++++
>   tools/libxc/Makefile                          |    2 -
>   tools/libxc/include/xenguest.h                |    3 +
>   tools/libxc/xc_sr_common.h                    |    5 -
>   tools/libxc/xc_sr_restore.c                   |   33 +-
>   tools/libxc/xc_sr_restore_x86_hvm.c           |  124 -----
>   tools/libxc/xc_sr_save_x86_hvm.c              |   36 --
>   tools/libxl/Makefile                          |    2 +
>   tools/libxl/libxl_aoutils.c                   |    7 +
>   tools/libxl/libxl_convert_callout.c           |  146 ++++++
>   tools/libxl/libxl_create.c                    |   80 +--
>   tools/libxl/libxl_dom.c                       |   61 +--
>   tools/libxl/libxl_internal.h                  |  140 ++++-
>   tools/libxl/libxl_save_callout.c              |   63 +--
>   tools/libxl/libxl_save_helper.c               |   95 ++--
>   tools/libxl/libxl_save_msgs_gen.pl            |    9 +-
>   tools/libxl/libxl_sr_stream_format.h          |   58 +++
>   tools/libxl/libxl_stream_read.c               |  663 ++++++++++++++++++++++++
>   tools/libxl/libxl_stream_write.c              |  640 +++++++++++++++++++++++
>   tools/libxl/libxl_types.idl                   |    2 +
>   tools/libxl/xl_cmdimpl.c                      |    9 +-
>   tools/python/Makefile                         |    4 +
>   tools/python/scripts/convert-legacy-stream.py |  683 +++++++++++++++++++++++++
>   tools/python/scripts/verify-stream-v2.py      |  174 +++++++
>   tools/python/setup.py                         |    1 +
>   tools/python/xen/migration/libxc.py           |  446 ++++++++++++++++
>   tools/python/xen/migration/libxl.py           |  199 +++++++
>   tools/python/xen/migration/tests.py           |   54 ++
>   tools/python/xen/migration/verify.py          |   37 ++
>   29 files changed, 3638 insertions(+), 356 deletions(-)
>   create mode 100644 docs/specs/libxl-migration-stream.pandoc
>   create mode 100644 tools/libxl/libxl_convert_callout.c
>   create mode 100644 tools/libxl/libxl_sr_stream_format.h
>   create mode 100644 tools/libxl/libxl_stream_read.c
>   create mode 100644 tools/libxl/libxl_stream_write.c
>   create mode 100755 tools/python/scripts/convert-legacy-stream.py
>   create mode 100755 tools/python/scripts/verify-stream-v2.py
>   create mode 100644 tools/python/xen/migration/__init__.py
>   create mode 100644 tools/python/xen/migration/libxc.py
>   create mode 100644 tools/python/xen/migration/libxl.py
>   create mode 100644 tools/python/xen/migration/tests.py
>   create mode 100644 tools/python/xen/migration/verify.py
>

-- 
Thanks,
Yang.

* Re: [PATCH 00/27]  Libxl migration v2
  2015-07-02  7:33 ` Yang Hongyang
@ 2015-07-02  9:26   ` Andrew Cooper
  2015-07-02  9:33     ` Yang Hongyang
  0 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-07-02  9:26 UTC (permalink / raw)
  To: Yang Hongyang, Xen-devel; +Cc: Wei Liu, Ian Jackson, Ian Campbell

On 02/07/15 08:33, Yang Hongyang wrote:
> Hi Andrew,
>
>   Are there any updates of this series that I can checkout and rebase
> mine onto?
> :)

Not yet - I am very sorry it is taking this long.  I am working on it
and am half way through, but rebasing over the AO Abort series is
proving far more complicated than I initially expected.  A lot of
functions which I had introduced or modified have had their number of
parameters and error semantics changed.

~Andrew

* Re: [PATCH 00/27]  Libxl migration v2
  2015-07-02  9:26   ` Andrew Cooper
@ 2015-07-02  9:33     ` Yang Hongyang
  0 siblings, 0 replies; 107+ messages in thread
From: Yang Hongyang @ 2015-07-02  9:33 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel; +Cc: Wei Liu, Ian Jackson, Ian Campbell



On 07/02/2015 05:26 PM, Andrew Cooper wrote:
> On 02/07/15 08:33, Yang Hongyang wrote:
>> Hi Andrew,
>>
>>    Are there any updates of this series that I can checkout and rebase
>> mine onto?
>> :)
>
> Not yet - I am very sorry it is taking this long.  I am working on it
> and am half way through, but rebasing over the AO Abort series is
> proving far more complicated than I initially expected.  A lot of
> functions which I had introduced or modified have had their number of
> parameters and error semantics changed.

I understand that's a pain...

>
> ~Andrew
> .
>

-- 
Thanks,
Yang.

* Re: [PATCH 10/27] docs: Libxl migration v2 stream specification
  2015-06-16 13:58   ` Ian Campbell
@ 2015-07-08 13:49     ` Andrew Cooper
  2015-07-08 13:58       ` Ian Campbell
  0 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-07-08 13:49 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 16/06/2015 14:58, Ian Campbell wrote:
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> +EMULATOR\_CONTEXT
>> +----------------
>> +
>> +A context blob for a specific emulator associated with the domain.
>> +
>> +     0     1     2     3     4     5     6     7 octet
>> +    +------------------------+------------------------+
>> +    | emulator_id            | index                  |
>> +    +------------------------+------------------------+
>> +    | emulator_ctx                                    |
>> +    ...
>> +    +-------------------------------------------------+
>> +
>> +--------------------------------------------------------------------
>> +Field            Description
>> +------------     ---------------------------------------------------
>> +emulator_id      0x00000000: Unknown (In the case of a legacy stream)
>> +
>> +                 0x00000001: Qemu Traditional
>> +
>> +                 0x00000002: Qemu Upstream
>> +
>> +                 0x00000003 - 0xFFFFFFFF: Reserved for future emulators.
> Would it be useful for future proofing to carve out some space for a
> per-emulator version field too?

What would that be useful for?  It is the emulator's problem/fault if it
can't read the blob it is given.

Superficially, I can see why the field would be nice for debugging
purposes, but not all emulators will have a consistent version scheme,
and we only install a single version of each emulator.  All I can see
happening is libxl starting to guess about emulator/blob compatibility,
which is absolutely not its place to do.
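
For reference, the EMULATOR_CONTEXT layout quoted above can be sketched in C
roughly as follows.  This is only an illustration: the names emu_ctx_hdr,
emu_ctx_parse() and emu_name() are hypothetical and not taken from the patch,
and the fields are assumed to be in host byte order.  The blob is left opaque
on purpose, in line with the point that interpreting it is the emulator's job.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Emulator IDs as listed in the quoted table. */
#define EMULATOR_UNKNOWN        0x00000000u
#define EMULATOR_QEMU_TRAD      0x00000001u
#define EMULATOR_QEMU_UPSTREAM  0x00000002u

/* Hypothetical view of the fixed 8-octet prefix of the record body. */
struct emu_ctx_hdr {
    uint32_t emulator_id;
    uint32_t index;
};

/*
 * Split a record body into its fixed prefix and the opaque emulator blob.
 * Returns 0 on success, -1 if the body is too short to hold the prefix.
 * Assumption: fields are in host byte order.
 */
static int emu_ctx_parse(const void *body, size_t len,
                         struct emu_ctx_hdr *hdr,
                         const uint8_t **ctx, size_t *ctx_len)
{
    if (len < sizeof(*hdr))
        return -1;

    memcpy(hdr, body, sizeof(*hdr));
    *ctx = (const uint8_t *)body + sizeof(*hdr);
    *ctx_len = len - sizeof(*hdr);
    return 0;
}

/* Map an emulator_id to a printable name, e.g. for debug logging. */
static const char *emu_name(uint32_t id)
{
    switch (id) {
    case EMULATOR_UNKNOWN:       return "Unknown (legacy stream)";
    case EMULATOR_QEMU_TRAD:     return "Qemu Traditional";
    case EMULATOR_QEMU_UPSTREAM: return "Qemu Upstream";
    default:                     return "Reserved";
    }
}
```

Note that the parser only checks the length and hands the blob on untouched;
it makes no attempt to judge emulator/blob compatibility.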

>
> Otherwise LGTM.
>
> One thought, it might be useful (here or elsewhere) to have an explicit
> overview of the expected control flow (as in the ownership of the fd,
> and/or nesting of the layers as you prefer to think about it) between
> libxc, libxl and the next layer (i.e. xl).

I will see what I can do, but the freeze is very imminent.

~Andrew


* Re: [PATCH 10/27] docs: Libxl migration v2 stream specification
  2015-07-08 13:49     ` Andrew Cooper
@ 2015-07-08 13:58       ` Ian Campbell
  0 siblings, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-07-08 13:58 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Wed, 2015-07-08 at 14:49 +0100, Andrew Cooper wrote:
> On 16/06/2015 14:58, Ian Campbell wrote:
> > On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> >> +EMULATOR\_CONTEXT
> >> +----------------
> >> +
> >> +A context blob for a specific emulator associated with the domain.
> >> +
> >> +     0     1     2     3     4     5     6     7 octet
> >> +    +------------------------+------------------------+
> >> +    | emulator_id            | index                  |
> >> +    +------------------------+------------------------+
> >> +    | emulator_ctx                                    |
> >> +    ...
> >> +    +-------------------------------------------------+
> >> +
> >> +--------------------------------------------------------------------
> >> +Field            Description
> >> +------------     ---------------------------------------------------
> >> +emulator_id      0x00000000: Unknown (In the case of a legacy stream)
> >> +
> >> +                 0x00000001: Qemu Traditional
> >> +
> >> +                 0x00000002: Qemu Upstream
> >> +
> >> +                 0x00000003 - 0xFFFFFFFF: Reserved for future emulators.
> > Would it be useful for future proofing to carve out some space for a
> > per-emulator version field too?
> 
> What would that be useful for?  It is the emulator's problem/fault if it
> can't read the blob it is given.
> 
> Superficially, I can see why the field would be nice for debugging
> purposes, but not all emulators will have a consistent version scheme,
> and we only install a single version of each emulator.  All I can see
> happening is libxl starting to guess about emulator/blob compatibility,
> which is absolutely not its place to do.

Good point (I also can't quite remember what I thought this was for).

> > Otherwise LGTM.
> >
> > One thought, it might be useful (here or elsewhere) to have an explicit
> > overview of the expected control flow (as in the ownership of the fd,
> > and/or nesting of the layers as you prefer to think about it) between
> > libxc, libxl and the next layer (i.e. xl).
> 
> I will see what I can do, but the freeze is very imminent.

Indeed, don't let this distract you.


* Re: [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
  2015-06-15 13:44 ` [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream Andrew Cooper
                     ` (5 preceding siblings ...)
  2015-06-17 10:14   ` Wen Congyang
@ 2015-07-10 10:55   ` Ian Campbell
  2015-07-10 11:03     ` Andrew Cooper
  6 siblings, 1 reply; 107+ messages in thread
From: Ian Campbell @ 2015-07-10 10:55 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Ross Lagerwall, Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> +void libxl__stream_write_start(libxl__egc *egc,
> +                               libxl__stream_write_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    STATE_AO_GC(stream->ao);
> +    struct libxl_sr_hdr hdr = { 0 };
> +    int ret = 0;
> +
> +    assert(!stream->running);

This has the same issue wrt who initialises this, and when, as the
restore side.
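
To make the concern concrete, here is a minimal sketch, assuming an explicit
init step is what ends up owning the flag.  The type stream_state and the
functions stream_init() and stream_start() are stand-ins invented for this
example, not the libxl definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stand-in for the stream state; only the fields needed here. */
struct stream_state {
    bool running;
    int fd;
};

/* Hypothetical init step: puts the state into a known idle condition. */
static void stream_init(struct stream_state *s)
{
    memset(s, 0, sizeof(*s));
    s->fd = -1;
}

/* The start path can only assert on 'running' if someone initialised it. */
static void stream_start(struct stream_state *s, int fd)
{
    assert(!s->running);
    s->running = true;
    s->fd = fd;
}
```

Without something like stream_init() having run first, the assert reads
whatever happened to be in memory.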

> +    stream->running = true;
> +
> +    memset(dc, 0, sizeof(*dc));
> +    dc->readwhat = "";
> +    dc->copywhat = "suspend header";
> +    dc->writewhat = "save/migration stream";
> +    dc->ao = ao;
> +    dc->readfd = -1;
> +    dc->writefd = stream->fd;
> +    dc->maxsz = INT_MAX;
> +    dc->bytes_to_read = INT_MAX;
> +    dc->callback = stream_header_done;

On the read side some of this was nicely encapsulated in a helper. Not a
blocker, just an observation for a potential future tidying.
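
As a rough illustration of that tidying, the setup in the hunk above could be
folded into a helper along these lines.  This is a sketch only:
struct datacopier_state is a cut-down stand-in for libxl__datacopier_state
holding just the fields set in the hunk, and setup_write_dc() and noop_done()
are hypothetical names.

```c
#include <limits.h>
#include <string.h>

struct datacopier_state;
typedef void datacopier_cb(struct datacopier_state *dc);

/* Stand-in with only the fields the quoted hunk touches. */
struct datacopier_state {
    void *ao;
    int readfd, writefd;
    int maxsz;
    int bytes_to_read;
    const char *readwhat, *copywhat, *writewhat;
    datacopier_cb *callback;
};

/* Hypothetical helper: prepare dc for a write-only copy to fd. */
static void setup_write_dc(struct datacopier_state *dc, void *ao, int fd,
                           const char *copywhat, datacopier_cb *cb)
{
    memset(dc, 0, sizeof(*dc));
    dc->readwhat = "";
    dc->copywhat = copywhat;
    dc->writewhat = "save/migration stream";
    dc->ao = ao;
    dc->readfd = -1;            /* nothing to read from, write side only */
    dc->writefd = fd;
    dc->maxsz = INT_MAX;
    dc->bytes_to_read = INT_MAX;
    dc->callback = cb;
}

/* Placeholder completion callback for the example. */
static void noop_done(struct datacopier_state *dc)
{
    (void)dc;
}
```

Each caller would then pass only what differs per record, such as the copywhat
string and the completion callback.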

So, the only immediate issue is the ->running one, which I suppose will
be discussed on the restore side patch and the same conclusion applied
here.

Ian.


* Re: [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
  2015-07-10 10:55   ` Ian Campbell
@ 2015-07-10 11:03     ` Andrew Cooper
  2015-07-10 11:05       ` Ian Campbell
  0 siblings, 1 reply; 107+ messages in thread
From: Andrew Cooper @ 2015-07-10 11:03 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Ross Lagerwall, Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On 10/07/15 11:55, Ian Campbell wrote:
> On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
>> +void libxl__stream_write_start(libxl__egc *egc,
>> +                               libxl__stream_write_state *stream)
>> +{
>> +    libxl__datacopier_state *dc = &stream->dc;
>> +    STATE_AO_GC(stream->ao);
>> +    struct libxl_sr_hdr hdr = { 0 };
>> +    int ret = 0;
>> +
>> +    assert(!stream->running);
> This has the same issue wrt who initialises this, and when, as the
> restore side.
>
>> +    stream->running = true;
>> +
>> +    memset(dc, 0, sizeof(*dc));
>> +    dc->readwhat = "";
>> +    dc->copywhat = "suspend header";
>> +    dc->writewhat = "save/migration stream";
>> +    dc->ao = ao;
>> +    dc->readfd = -1;
>> +    dc->writefd = stream->fd;
>> +    dc->maxsz = INT_MAX;
>> +    dc->bytes_to_read = INT_MAX;
>> +    dc->callback = stream_header_done;
> On the read side some of this was nicely encapsulated in a helper. Not a
> blocker, just an observation for a potential future tidying.
>
> So, the only immediate issue is the ->running one, which I suppose will
> be discussed on the restore side patch and the same conclusion applied
> here.

You realise you have moved back onto v1 of the series?  This is very
different in v2.

~Andrew


* Re: [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream
  2015-07-10 11:03     ` Andrew Cooper
@ 2015-07-10 11:05       ` Ian Campbell
  0 siblings, 0 replies; 107+ messages in thread
From: Ian Campbell @ 2015-07-10 11:05 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Ross Lagerwall, Wei Liu, Yang Hongyang, Ian Jackson, Xen-devel

On Fri, 2015-07-10 at 12:03 +0100, Andrew Cooper wrote:
> On 10/07/15 11:55, Ian Campbell wrote:
> > On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> >> +void libxl__stream_write_start(libxl__egc *egc,
> >> +                               libxl__stream_write_state *stream)
> >> +{
> >> +    libxl__datacopier_state *dc = &stream->dc;
> >> +    STATE_AO_GC(stream->ao);
> >> +    struct libxl_sr_hdr hdr = { 0 };
> >> +    int ret = 0;
> >> +
> >> +    assert(!stream->running);
> > This has the same issue wrt who initialises this, and when, as the
> > restore side.
> >
> >> +    stream->running = true;
> >> +
> >> +    memset(dc, 0, sizeof(*dc));
> >> +    dc->readwhat = "";
> >> +    dc->copywhat = "suspend header";
> >> +    dc->writewhat = "save/migration stream";
> >> +    dc->ao = ao;
> >> +    dc->readfd = -1;
> >> +    dc->writefd = stream->fd;
> >> +    dc->maxsz = INT_MAX;
> >> +    dc->bytes_to_read = INT_MAX;
> >> +    dc->callback = stream_header_done;
> > On the read side some of this was nicely encapsulated in a helper. Not a
> > blocker, just an observation for a potential future tidying.
> >
> > So, the only immediate issue is the ->running one, which I suppose will
> > be discussed on the restore side patch and the same conclusion applied
> > here.
> 
> You realise you have moved back onto v1 of the series?  This is very
> different in v2.

Oh c..k, no, I didn't. I shall go take another look at the correct
thing!

Ian.


end of thread, other threads:[~2015-07-10 11:05 UTC | newest]

Thread overview: 107+ messages
2015-06-15 13:44 [PATCH 00/27] Libxl migration v2 Andrew Cooper
2015-06-15 13:44 ` [PATCH 01/27] tools/libxl: Fix libxl__ev_child_inuse() check for not-yet-initialised children Andrew Cooper
2015-06-16 13:21   ` Ian Campbell
2015-06-16 13:36     ` Andrew Cooper
2015-06-16 13:47       ` Ian Jackson
2015-06-16 14:05         ` Andrew Cooper
2015-06-16 15:26           ` Ian Campbell
2015-06-16 15:24       ` Ian Campbell
2015-06-16 13:39     ` Ian Jackson
2015-06-15 13:44 ` [PATCH 02/27] tools/libxc: Always compile the compat qemu variables into xc_sr_context Andrew Cooper
2015-06-16 13:22   ` Ian Campbell
2015-06-15 13:44 ` [PATCH 03/27] tools/libxl: Stash all restore parameters in domain_create_state Andrew Cooper
2015-06-16 13:37   ` Ian Campbell
2015-06-16 14:09     ` Andrew Cooper
2015-06-18  2:32   ` Yang Hongyang
2015-06-15 13:44 ` [PATCH 04/27] tools/xl: Mandatory flag indicating the format of the migration stream Andrew Cooper
2015-06-16 13:39   ` Ian Campbell
2015-06-16 14:10     ` Andrew Cooper
2015-06-15 13:44 ` [PATCH 05/27] tools/libxl: Introduce ROUNDUP() Andrew Cooper
2015-06-16 13:39   ` Ian Campbell
2015-06-15 13:44 ` [PATCH 06/27] libxl: cancellation: Preparations for save/restore cancellation Andrew Cooper
2015-06-15 13:44 ` [PATCH 07/27] libxl: cancellation: Handle SIGTERM in save/restore helper Andrew Cooper
2015-06-15 13:44 ` [PATCH 08/27] tools/libxl: Extra APIs for the save helper Andrew Cooper
2015-06-16 13:50   ` Ian Campbell
2015-06-16 15:03     ` Andrew Cooper
2015-06-15 13:44 ` [PATCH 09/27] tools/libxl: Pass restore_fd as a parameter to libxl__xc_domain_restore() Andrew Cooper
2015-06-16 13:53   ` Ian Campbell
2015-06-15 13:44 ` [PATCH 10/27] docs: Libxl migration v2 stream specification Andrew Cooper
2015-06-16 13:58   ` Ian Campbell
2015-07-08 13:49     ` Andrew Cooper
2015-07-08 13:58       ` Ian Campbell
2015-06-15 13:44 ` [PATCH 11/27] tools/python: Libxc migration v2 infrastructure Andrew Cooper
2015-06-16 14:01   ` Ian Campbell
2015-06-15 13:44 ` [PATCH 12/27] tools/python: Libxl " Andrew Cooper
2015-06-15 13:44 ` [PATCH 13/27] tools/python: Verification utility for v2 stream spec compliance Andrew Cooper
2015-06-15 13:44 ` [PATCH 14/27] tools/python: Conversion utility for legacy migration streams Andrew Cooper
2015-06-16 14:01   ` Ian Campbell
2015-06-15 13:44 ` [PATCH 15/27] tools/libxl: Migration v2 stream format Andrew Cooper
2015-06-16 14:04   ` Ian Campbell
2015-06-15 13:44 ` [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream Andrew Cooper
2015-06-16 14:31   ` Ian Campbell
2015-06-16 15:01     ` Andrew Cooper
2015-06-16 15:35       ` Ian Campbell
2015-06-16 15:46         ` Andrew Cooper
2015-06-17  3:09   ` Wen Congyang
2015-06-17 10:15     ` Ian Campbell
2015-06-17 10:49       ` Wen Congyang
2015-06-17 10:55         ` Ian Campbell
2015-06-17  6:03   ` Wen Congyang
2015-06-17  9:47     ` Andrew Cooper
2015-06-17  7:57   ` Wen Congyang
2015-06-17  9:50     ` Andrew Cooper
2015-06-17 10:01       ` Wen Congyang
2015-06-17 10:48         ` Andrew Cooper
2015-06-15 13:44 ` [PATCH 17/27] tools/libxl: Support converting a legacy stream to a " Andrew Cooper
2015-06-16 14:38   ` Ian Campbell
2015-06-16 15:13     ` Andrew Cooper
2015-06-16 15:38       ` Ian Campbell
2015-06-15 13:44 ` [PATCH 18/27] tools/libxl: Convert a legacy stream if needed Andrew Cooper
2015-06-15 13:44 ` [PATCH 19/27] tools/libxc+libxl+xl: Restore v2 streams Andrew Cooper
2015-06-16 14:53   ` Ian Campbell
2015-06-16 15:23     ` Andrew Cooper
2015-06-16 15:39       ` Ian Campbell
2015-06-15 13:44 ` [PATCH 20/27] tools/libxl: Infrastructure for writing a v2 stream Andrew Cooper
2015-06-16 14:57   ` Ian Campbell
2015-06-16 15:28     ` Andrew Cooper
2015-06-17  1:31   ` Yang Hongyang
2015-06-17  9:51     ` Andrew Cooper
2015-06-17  1:39   ` Wen Congyang
2015-06-17  2:24   ` Wen Congyang
2015-06-17  7:38   ` Yang Hongyang
2015-06-17 10:14   ` Wen Congyang
2015-07-10 10:55   ` Ian Campbell
2015-07-10 11:03     ` Andrew Cooper
2015-07-10 11:05       ` Ian Campbell
2015-06-15 13:44 ` [PATCH 21/27] tools/libxc+libxl+xl: Save v2 streams Andrew Cooper
2015-06-16 14:59   ` Ian Campbell
2015-06-15 13:44 ` [PATCH 22/27] docs/libxl: [RFC] Introduce CHECKPOINT_END to support migration v2 remus streams Andrew Cooper
2015-06-16 15:00   ` Ian Campbell
2015-06-16 15:30     ` Andrew Cooper
2015-06-17  3:30   ` Wen Congyang
2015-06-15 13:44 ` [PATCH 23/27] tools/libxl: [RFC] Write checkpoint records into the stream Andrew Cooper
2015-06-16 15:03   ` Ian Campbell
2015-06-16 15:53     ` Andrew Cooper
2015-06-17  7:30       ` Ian Campbell
2015-06-17  9:55         ` Andrew Cooper
2015-06-18  3:13   ` Wen Congyang
2015-06-18  9:44     ` Andrew Cooper
2015-06-15 13:44 ` [PATCH 24/27] tools/libx{c, l}: [RFC] Introduce restore_callbacks.checkpoint() Andrew Cooper
2015-06-16  2:23   ` Yang Hongyang
2015-06-17  8:20   ` Yang Hongyang
2015-06-15 13:44 ` [PATCH 25/27] tools/libxl: [RFC] Handle checkpoint records in a libxl migration v2 stream Andrew Cooper
2015-06-17  7:28   ` Wen Congyang
2015-06-15 13:44 ` [PATCH 26/27] tools/libxc: Drop all XG_LIBXL_HVM_COMPAT code from libxc Andrew Cooper
2015-06-16 15:03   ` Ian Campbell
2015-06-15 13:44 ` [PATCH 27/27] tools/libxl: Drop all knowledge of toolstack callbacks Andrew Cooper
2015-06-16 15:04   ` Ian Campbell
2015-06-16 15:06     ` Andrew Cooper
2015-06-17 10:14       ` Ian Campbell
2015-06-17 10:43         ` Andrew Cooper
2015-06-17 10:53           ` Ian Campbell
2015-06-16  2:21 ` [PATCH 00/27] Libxl migration v2 Yang Hongyang
2015-06-17  1:55 ` Wen Congyang
2015-06-17  9:45   ` Andrew Cooper
2015-07-02  7:33 ` Yang Hongyang
2015-07-02  9:26   ` Andrew Cooper
2015-07-02  9:33     ` Yang Hongyang
