* [PATCH v4 0/5] migration/remus: bug fix and cleanup
@ 2016-01-18  5:40 Wen Congyang
  2016-01-18  5:40 ` [PATCH v4 1/5] remus: don't call stream_continue() when doing failover Wen Congyang
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Wen Congyang @ 2016-01-18  5:40 UTC (permalink / raw)
  To: xen devel, Andrew Cooper
  Cc: Changlong Xie, Wei Liu, Ian Campbell, Wen Congyang, Ian Jackson,
	Shriram Rajagopalan, Yang Hongyang

Wen Congyang (5):
  remus: don't call stream_continue() when doing failover
  remus: resume immediately if libxl__xc_domain_save_done() completes
  tools/libxc: don't send end record if remus fails
  tools/libxc: error handling for the postcopy() callback
  tools/libxl: remove unused function libxl__domain_save_device_model()

 tools/libxc/xc_sr_save.c         |  6 ++-
 tools/libxl/libxl.c              |  9 ++--
 tools/libxl/libxl_dom.c          | 91 ----------------------------------------
 tools/libxl/libxl_internal.h     |  6 ---
 tools/libxl/libxl_stream_read.c  | 35 +++++++++++++---
 tools/libxl/libxl_stream_write.c | 14 ++++++-
 6 files changed, 50 insertions(+), 111 deletions(-)

-- 
2.5.0

* [PATCH v4 1/5] remus: don't call stream_continue() when doing failover
  2016-01-18  5:40 [PATCH v4 0/5] migration/remus: bug fix and cleanup Wen Congyang
@ 2016-01-18  5:40 ` Wen Congyang
  2016-01-18 16:45   ` Ian Campbell
  2016-01-18  5:40 ` [PATCH v4 2/5] remus: resume immediately if libxl__xc_domain_save_done() completes Wen Congyang
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Wen Congyang @ 2016-01-18  5:40 UTC (permalink / raw)
  To: xen devel, Andrew Cooper
  Cc: Changlong Xie, Wei Liu, Ian Campbell, Wen Congyang, Ian Jackson,
	Shriram Rajagopalan, Yang Hongyang

stream_continue() is used for migration to read emulator
xenstore data and emulator context. For remus, if we do
failover, we have read it in the checkpoint cycle, and
we only need to complete the stream.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/libxl/libxl_stream_read.c | 35 ++++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index 258dec4..24305f4 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -101,6 +101,19 @@
  *    - stream_write_emulator_done()
  *    - stream_continue()
  *
+ * 4) Failover for remus
+ *    - we buffer all records until a CHECKPOINT_END record is received
+ *    - we will use the records when a CHECKPOINT_END record is received
+ *    - if we find some internal error, the rc or retval is not 0 in
+ *      libxl__xc_domain_restore_done(). In this case, we don't resume the
+ *      guest
+ *    - if we need to do failover from primary, the rc and retval are 0
+ *      in libxl__xc_domain_restore_done(). In this case, the buffered state
+ *      will be dropped, because we don't receive a CHECKPOINT_END record,
+ *      and it is a inconsistent state. In libxl__xc_domain_restore_done(),
+ *      we just complete the stream and stream->completion_callback() will
+ *      be called to resume the guest
+ *
  * Depending on the contents of the stream, there are likely to be several
  * parallel tasks being managed.  check_all_finished() is used to join all
  * tasks in both success and error cases.
@@ -758,6 +771,9 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
     libxl__stream_read_state *stream = &dcs->srs;
     STATE_AO_GC(dcs->ao);
 
+    /* convenience aliases */
+    const int checkpointed_stream = dcs->restore_params.checkpointed_stream;
+
     if (rc)
         goto err;
 
@@ -777,11 +793,20 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
      * If the stream is not still alive, we must not continue any work.
      */
     if (libxl__stream_read_inuse(stream)) {
-        /*
-         * Libxc has indicated that it is done with the stream.  Resume reading
-         * libxl records from it.
-         */
-        stream_continue(egc, stream);
+        if (checkpointed_stream) {
+            /*
+             * Failover from primary. Domain state is currently at a
+             * consistent checkpoint, complete the stream, and call
+             * stream->completion_callback() to resume the guest.
+             */
+            stream_complete(egc, stream, 0);
+        } else {
+            /*
+             * Libxc has indicated that it is done with the stream.
+             * Resume reading libxl records from it.
+             */
+            stream_continue(egc, stream);
+        }
     }
 }
 
-- 
2.5.0

* [PATCH v4 2/5] remus: resume immediately if libxl__xc_domain_save_done() completes
  2016-01-18  5:40 [PATCH v4 0/5] migration/remus: bug fix and cleanup Wen Congyang
  2016-01-18  5:40 ` [PATCH v4 1/5] remus: don't call stream_continue() when doing failover Wen Congyang
@ 2016-01-18  5:40 ` Wen Congyang
  2016-01-18 16:51   ` Ian Campbell
  2016-01-18  5:40 ` [PATCH v4 3/5] tools/libxc: don't send end record if remus fails Wen Congyang
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Wen Congyang @ 2016-01-18  5:40 UTC (permalink / raw)
  To: xen devel, Andrew Cooper
  Cc: Changlong Xie, Wei Liu, Ian Campbell, Wen Congyang, Ian Jackson,
	Shriram Rajagopalan, Yang Hongyang

For example: if the secondary host is down, and we fail to send the data to
the secondary host. xc_domain_save() returns 0. So in the function
libxl__xc_domain_save_done(), rc is 0(the helper program exits normally),
and retval is 0(it is xc_domain_save()'s return value). In such case, we
just need to complete the stream.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/libxl/libxl.c              |  5 ++++-
 tools/libxl/libxl_stream_write.c | 14 ++++++++++++--
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index abb2845..d50c3fb 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -884,7 +884,10 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
 
     assert(info);
 
-    /* Point of no return */
+    /*
+     * This function doesn't return until something is wrong, and
+     * we need to do failover from secondary.
+     */
     libxl__remus_setup(egc, dss);
     return AO_INPROGRESS;
 
diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
index 80d9208..2f077c5 100644
--- a/tools/libxl/libxl_stream_write.c
+++ b/tools/libxl/libxl_stream_write.c
@@ -354,8 +354,18 @@ void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
      * alive, and check_all_finished() may have torn it down around us.
      * If the stream is not still alive, we must not continue any work.
      */
-    if (libxl__stream_write_inuse(stream))
-        write_emulator_xenstore_record(egc, stream);
+    if (libxl__stream_write_inuse(stream)) {
+        if (dss->remus)
+            /*
+             * For remus, if libxl__xc_domain_save_done() completes,
+             * there was an error sending data to the secondary.
+             * Resume the primary ASAP. The caller doesn't care of the
+             * return value(Please refer to libxl__remus_teardown())
+             */
+            stream_complete(egc, stream, 0);
+        else
+            write_emulator_xenstore_record(egc, stream);
+    }
 }
 
 static void write_emulator_xenstore_record(libxl__egc *egc,
-- 
2.5.0

* [PATCH v4 3/5] tools/libxc: don't send end record if remus fails
  2016-01-18  5:40 [PATCH v4 0/5] migration/remus: bug fix and cleanup Wen Congyang
  2016-01-18  5:40 ` [PATCH v4 1/5] remus: don't call stream_continue() when doing failover Wen Congyang
  2016-01-18  5:40 ` [PATCH v4 2/5] remus: resume immediately if libxl__xc_domain_save_done() completes Wen Congyang
@ 2016-01-18  5:40 ` Wen Congyang
  2016-01-18 16:53   ` Ian Campbell
  2016-01-18  5:40 ` [PATCH v4 4/5] tools/libxc: error handling for the postcopy() callback Wen Congyang
  2016-01-18  5:40 ` [PATCH v4 5/5] tools/libxl: remove unused function libxl__domain_save_device_model() Wen Congyang
  4 siblings, 1 reply; 14+ messages in thread
From: Wen Congyang @ 2016-01-18  5:40 UTC (permalink / raw)
  To: xen devel, Andrew Cooper
  Cc: Changlong Xie, Wei Liu, Ian Campbell, Wen Congyang, Ian Jackson,
	Shriram Rajagopalan, Yang Hongyang

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/libxc/xc_sr_save.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index 88d85ef..e532168 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -795,7 +795,7 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
 
             rc = ctx->save.callbacks->checkpoint(ctx->save.callbacks->data);
             if ( rc <= 0 )
-                ctx->save.checkpointed = false;
+                goto err;
         }
     } while ( ctx->save.checkpointed );
 
-- 
2.5.0

* [PATCH v4 4/5] tools/libxc: error handling for the postcopy() callback
  2016-01-18  5:40 [PATCH v4 0/5] migration/remus: bug fix and cleanup Wen Congyang
                   ` (2 preceding siblings ...)
  2016-01-18  5:40 ` [PATCH v4 3/5] tools/libxc: don't send end record if remus fails Wen Congyang
@ 2016-01-18  5:40 ` Wen Congyang
  2016-01-18 16:53   ` Ian Campbell
  2016-01-18  5:40 ` [PATCH v4 5/5] tools/libxl: remove unused function libxl__domain_save_device_model() Wen Congyang
  4 siblings, 1 reply; 14+ messages in thread
From: Wen Congyang @ 2016-01-18  5:40 UTC (permalink / raw)
  To: xen devel, Andrew Cooper
  Cc: Changlong Xie, Wei Liu, Ian Campbell, Wen Congyang, Ian Jackson,
	Shriram Rajagopalan, Yang Hongyang

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 tools/libxc/xc_sr_save.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index e532168..e4ba560 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -791,7 +791,9 @@ static int save(struct xc_sr_context *ctx, uint16_t guest_type)
             if ( rc )
                 goto err;
 
-            ctx->save.callbacks->postcopy(ctx->save.callbacks->data);
+            rc = ctx->save.callbacks->postcopy(ctx->save.callbacks->data);
+            if ( rc <= 0 )
+                goto err;
 
             rc = ctx->save.callbacks->checkpoint(ctx->save.callbacks->data);
             if ( rc <= 0 )
-- 
2.5.0

* [PATCH v4 5/5] tools/libxl: remove unused function libxl__domain_save_device_model()
  2016-01-18  5:40 [PATCH v4 0/5] migration/remus: bug fix and cleanup Wen Congyang
                   ` (3 preceding siblings ...)
  2016-01-18  5:40 ` [PATCH v4 4/5] tools/libxc: error handling for the postcopy() callback Wen Congyang
@ 2016-01-18  5:40 ` Wen Congyang
  4 siblings, 0 replies; 14+ messages in thread
From: Wen Congyang @ 2016-01-18  5:40 UTC (permalink / raw)
  To: xen devel, Andrew Cooper
  Cc: Changlong Xie, Wei Liu, Ian Campbell, Wen Congyang, Ian Jackson,
	Shriram Rajagopalan, Yang Hongyang

After the commit d77570e7, libxl__domain_save_device_model() is
completely unused and can be dropped.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl.c          |  4 --
 tools/libxl/libxl_dom.c      | 91 --------------------------------------------
 tools/libxl/libxl_internal.h |  6 ---
 3 files changed, 101 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index d50c3fb..673dc6c 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1552,10 +1552,6 @@ static void stubdom_destroy_callback(libxl__egc *egc,
     dds->stubdom_finished = 1;
     savefile = libxl__device_model_savefile(gc, dis->domid);
     rc = libxl__remove_file(gc, savefile);
-    /*
-     * On suspend libxl__domain_save_device_model will have already
-     * unlinked the save file.
-     */
     if (rc) {
         LOG(ERROR, "failed to remove device-model savefile %s", savefile);
     }
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 47971a9..2269998 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1785,97 +1785,6 @@ static void stream_done(libxl__egc *egc,
     domain_save_done(egc, sws->dss, rc);
 }
 
-static void save_device_model_datacopier_done(libxl__egc *egc,
-     libxl__datacopier_state *dc, int rc, int onwrite, int errnoval);
-
-void libxl__domain_save_device_model(libxl__egc *egc,
-                                     libxl__domain_suspend_state *dss,
-                                     libxl__save_device_model_cb *callback)
-{
-    STATE_AO_GC(dss->ao);
-    struct stat st;
-    uint32_t qemu_state_len;
-    int rc;
-
-    dss->save_dm_callback = callback;
-
-    /* Convenience aliases */
-    const char *const filename = dss->dm_savefile;
-    const int fd = dss->fd;
-
-    libxl__datacopier_state *dc = &dss->save_dm_datacopier;
-    memset(dc, 0, sizeof(*dc));
-    dc->readwhat = GCSPRINTF("qemu save file %s", filename);
-    dc->ao = ao;
-    dc->readfd = -1;
-    dc->writefd = fd;
-    dc->maxsz = INT_MAX;
-    dc->bytes_to_read = -1;
-    dc->copywhat = GCSPRINTF("qemu save file for domain %"PRIu32, dss->domid);
-    dc->writewhat = "save/migration stream";
-    dc->callback = save_device_model_datacopier_done;
-
-    dc->readfd = open(filename, O_RDONLY);
-    if (dc->readfd < 0) {
-        LOGE(ERROR, "unable to open %s", dc->readwhat);
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    if (fstat(dc->readfd, &st))
-    {
-        LOGE(ERROR, "unable to fstat %s", dc->readwhat);
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    if (!S_ISREG(st.st_mode)) {
-        LOG(ERROR, "%s is not a plain file!", dc->readwhat);
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    qemu_state_len = st.st_size;
-    LOG(DEBUG, "%s is %d bytes", dc->readwhat, qemu_state_len);
-
-    rc = libxl__datacopier_start(dc);
-    if (rc) goto out;
-
-    libxl__datacopier_prefixdata(egc, dc,
-                                 QEMU_SIGNATURE, strlen(QEMU_SIGNATURE));
-
-    libxl__datacopier_prefixdata(egc, dc,
-                                 &qemu_state_len, sizeof(qemu_state_len));
-    return;
-
- out:
-    save_device_model_datacopier_done(egc, dc, rc, -1, EIO);
-}
-
-static void save_device_model_datacopier_done(libxl__egc *egc,
-     libxl__datacopier_state *dc, int our_rc, int onwrite, int errnoval)
-{
-    libxl__domain_suspend_state *dss =
-        CONTAINER_OF(dc, *dss, save_dm_datacopier);
-    STATE_AO_GC(dss->ao);
-
-    /* Convenience aliases */
-    const char *const filename = dss->dm_savefile;
-    int rc;
-
-    libxl__datacopier_kill(dc);
-
-    if (dc->readfd >= 0) {
-        close(dc->readfd);
-        dc->readfd = -1;
-    }
-
-    rc = libxl__remove_file(gc, filename);
-    if (!our_rc) our_rc = rc;
-
-    dss->save_dm_callback(egc, dss, our_rc);
-}
-
 static void libxl__remus_teardown(libxl__egc *egc,
                                   libxl__domain_suspend_state *dss,
                                   int rc);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index a556a38..233d44a 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3103,9 +3103,6 @@ struct libxl__domain_suspend_state {
     libxl__logdirty_switch logdirty;
     void (*callback_common_done)(libxl__egc*,
                                  struct libxl__domain_suspend_state*, int ok);
-    /* private for libxl__domain_save_device_model */
-    libxl__save_device_model_cb *save_dm_callback;
-    libxl__datacopier_state save_dm_datacopier;
 };
 
 
@@ -3498,9 +3495,6 @@ static inline bool libxl__save_helper_inuse(const libxl__save_helper_state *shs)
 /* Each time the dm needs to be saved, we must call suspend and then save */
 _hidden int libxl__domain_suspend_device_model(libxl__gc *gc,
                                            libxl__domain_suspend_state *dss);
-_hidden void libxl__domain_save_device_model(libxl__egc *egc,
-                                     libxl__domain_suspend_state *dss,
-                                     libxl__save_device_model_cb *callback);
 
 _hidden const char *libxl__device_model_savefile(libxl__gc *gc, uint32_t domid);
 
-- 
2.5.0

* Re: [PATCH v4 1/5] remus: don't call stream_continue() when doing failover
  2016-01-18  5:40 ` [PATCH v4 1/5] remus: don't call stream_continue() when doing failover Wen Congyang
@ 2016-01-18 16:45   ` Ian Campbell
  2016-01-19  1:05     ` Wen Congyang
  0 siblings, 1 reply; 14+ messages in thread
From: Ian Campbell @ 2016-01-18 16:45 UTC (permalink / raw)
  To: Wen Congyang, xen devel, Andrew Cooper
  Cc: Shriram Rajagopalan, Ian Jackson, Changlong Xie, Wei Liu, Yang Hongyang

On Mon, 2016-01-18 at 13:40 +0800, Wen Congyang wrote:
> stream_continue() is used for migration to read emulator
> xenstore data and emulator context. For remus, if we do
> failover, we have read it in the checkpoint cycle, and
> we only need to complete the stream.
> 
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  tools/libxl/libxl_stream_read.c | 35 ++++++++++++++++++++++++++++++-----
>  1 file changed, 30 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libxl/libxl_stream_read.c
> b/tools/libxl/libxl_stream_read.c
> index 258dec4..24305f4 100644
> --- a/tools/libxl/libxl_stream_read.c
> +++ b/tools/libxl/libxl_stream_read.c
> @@ -101,6 +101,19 @@
>   *    - stream_write_emulator_done()
>   *    - stream_continue()
>   *
> + * 4) Failover for remus

I don't think this is really #4 in the list which precedes it. I think a
section "Failover for remus" would be absolutely fine right at the end of
this comment block though, i.e. right after the "Depending on the
contents..." paragraph.

Andy?

> + *    - we buffer all records until a CHECKPOINT_END record is received
> + *    - we will use the records when a CHECKPOINT_END record is received

"we will consume the buffered records..."


> + *    - if we find some internal error, the rc or retval is not 0 in

s/the/then/

> + *      libxl__xc_domain_restore_done(). In this case, we don't resume the
> + *      guest
> + *    - if we need to do failover from primary, the rc and retval are 0

s/the/then/ again and I would say "are both 0" for clarity (assuming that
is indeed the requirement).

> + *      in libxl__xc_domain_restore_done(). In this case, the buffered state
> + *      will be dropped, because we don't receive a CHECKPOINT_END record,

                                       haven't received

> + *      and it is a inconsistent state. In libxl__xc_domain_restore_done(),

"an inconsistent".

I think I would say "... and therefore the buffered state is inconsistent".

> -        stream_continue(egc, stream);
> +        if (checkpointed_stream) {
> +            /*
> +             * Failover from primary. Domain state is currently at a
> +             * consistent checkpoint, complete the stream, and call
> +             * stream->completion_callback() to resume the guest.
> +             */
> +            stream_complete(egc, stream, 0);

Is it possible to get here having never received a single CHECKPOINT_END?

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

* Re: [PATCH v4 2/5] remus: resume immediately if libxl__xc_domain_save_done() completes
  2016-01-18  5:40 ` [PATCH v4 2/5] remus: resume immediately if libxl__xc_domain_save_done() completes Wen Congyang
@ 2016-01-18 16:51   ` Ian Campbell
  2016-01-19  1:01     ` Wen Congyang
  0 siblings, 1 reply; 14+ messages in thread
From: Ian Campbell @ 2016-01-18 16:51 UTC (permalink / raw)
  To: Wen Congyang, xen devel, Andrew Cooper
  Cc: Shriram Rajagopalan, Wei Liu, Changlong Xie, Ian Jackson, Yang Hongyang

On Mon, 2016-01-18 at 13:40 +0800, Wen Congyang wrote:
> For example: if the secondary host is down, and we fail to send the data to
> the secondary host. xc_domain_save() returns 0. So in the function
> libxl__xc_domain_save_done(), rc is 0(the helper program exits normally),
> and retval is 0(it is xc_domain_save()'s return value). In such case, we
> just need to complete the stream.

What if the secondary host isn't actually down but just communication has
failed for some reason? Won't both primary and secondary start their
respective versions of the domain? What are the consequences of that?
(Corruption?)

I suppose this is a consequence of the lack of STONITH or splitbrain
handling within Remus. Are there any plans to address this?

> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  tools/libxl/libxl.c              |  5 ++++-
>  tools/libxl/libxl_stream_write.c | 14 ++++++++++++--
>  2 files changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index abb2845..d50c3fb 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -884,7 +884,10 @@ int libxl_domain_remus_start(libxl_ctx *ctx,
> libxl_domain_remus_info *info,
>  
>      assert(info);
>  
> -    /* Point of no return */
> +    /*
> +     * This function doesn't return until something is wrong, and
> +     * we need to do failover from secondary.

I was actually hoping for user/API documentation (i.e. in a public header)
rather than a code comment, I suppose this will do though.


> +        if (dss->remus)
> +            /*
> +             * For remus, if libxl__xc_domain_save_done() completes,
> +             * there was an error sending data to the secondary.
> +             * Resume the primary ASAP. The caller doesn't care of the
> +             * return value(Please refer to libxl__remus_teardown())

There should usually be a space before a ( in text/prose (also in the
changelog).

> +             */
> +            stream_complete(egc, stream, 0);
> +        else
> +            write_emulator_xenstore_record(egc, stream);
> +    }
>  }
>  
>  static void write_emulator_xenstore_record(libxl__egc *egc,

* Re: [PATCH v4 3/5] tools/libxc: don't send end record if remus fails
  2016-01-18  5:40 ` [PATCH v4 3/5] tools/libxc: don't send end record if remus fails Wen Congyang
@ 2016-01-18 16:53   ` Ian Campbell
  2016-01-18 16:53     ` Ian Campbell
  0 siblings, 1 reply; 14+ messages in thread
From: Ian Campbell @ 2016-01-18 16:53 UTC (permalink / raw)
  To: Wen Congyang, xen devel, Andrew Cooper
  Cc: Shriram Rajagopalan, Ian Jackson, Changlong Xie, Wei Liu, Yang Hongyang

On Mon, 2016-01-18 at 13:40 +0800, Wen Congyang wrote:
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

What was the outcome of Andy's comment on v1 regarding error handling for
->postcopy?

> ---
>  tools/libxc/xc_sr_save.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
> index 88d85ef..e532168 100644
> --- a/tools/libxc/xc_sr_save.c
> +++ b/tools/libxc/xc_sr_save.c
> @@ -795,7 +795,7 @@ static int save(struct xc_sr_context *ctx, uint16_t
> guest_type)
>  
>              rc = ctx->save.callbacks->checkpoint(ctx->save.callbacks-
> >data);
>              if ( rc <= 0 )
> -                ctx->save.checkpointed = false;
> +                goto err;
>          }
>      } while ( ctx->save.checkpointed );
>  

* Re: [PATCH v4 4/5] tools/libxc: error handling for the postcopy() callback
  2016-01-18  5:40 ` [PATCH v4 4/5] tools/libxc: error handling for the postcopy() callback Wen Congyang
@ 2016-01-18 16:53   ` Ian Campbell
  0 siblings, 0 replies; 14+ messages in thread
From: Ian Campbell @ 2016-01-18 16:53 UTC (permalink / raw)
  To: Wen Congyang, xen devel, Andrew Cooper
  Cc: Shriram Rajagopalan, Wei Liu, Changlong Xie, Ian Jackson, Yang Hongyang

On Mon, 2016-01-18 at 13:40 +0800, Wen Congyang wrote:
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

* Re: [PATCH v4 3/5] tools/libxc: don't send end record if remus fails
  2016-01-18 16:53   ` Ian Campbell
@ 2016-01-18 16:53     ` Ian Campbell
  0 siblings, 0 replies; 14+ messages in thread
From: Ian Campbell @ 2016-01-18 16:53 UTC (permalink / raw)
  To: Wen Congyang, xen devel, Andrew Cooper
  Cc: Shriram Rajagopalan, Ian Jackson, Changlong Xie, Wei Liu, Yang Hongyang

On Mon, 2016-01-18 at 16:53 +0000, Ian Campbell wrote:
> 
> What was the outcome of Andy's comment on v1 regarding error handling
> for ->postcopy?

Ignore this, I saw just too late that it was in #4.

Ian.


* Re: [PATCH v4 2/5] remus: resume immediately if libxl__xc_domain_save_done() completes
  2016-01-18 16:51   ` Ian Campbell
@ 2016-01-19  1:01     ` Wen Congyang
  2016-01-19 11:01       ` Ian Campbell
  0 siblings, 1 reply; 14+ messages in thread
From: Wen Congyang @ 2016-01-19  1:01 UTC (permalink / raw)
  To: Ian Campbell, xen devel, Andrew Cooper
  Cc: Shriram Rajagopalan, Wei Liu, Changlong Xie, Ian Jackson, Yang Hongyang

On 01/19/2016 12:51 AM, Ian Campbell wrote:
> On Mon, 2016-01-18 at 13:40 +0800, Wen Congyang wrote:
>> For example: if the secondary host is down, and we fail to send the data to
>> the secondary host. xc_domain_save() returns 0. So in the function
>> libxl__xc_domain_save_done(), rc is 0(the helper program exits normally),
>> and retval is 0(it is xc_domain_save()'s return value). In such case, we
>> just need to complete the stream.
> 
> What if the secondary host isn't actually down but just communication has
> failed for some reason? Won't both primary and secondary start their
> respective versions of the domain? What are the consequences of that?
> (Corruption?)
> 
> I suppose this is a consequence of the lack of STONITH or splitbrain
> handling within Remus. Are there any plans to address this?

IIRC, Shriram Rajagopalan has some ideas about it (checking an external
heartbeat?). There is no way to avoid split brain unless we have more than two
hosts (at least three). If we want to avoid split brain, we may need to destroy
both the primary and the secondary guests.

An example:
            judge host
            /        \
           1          2
          /            \
primary host  <-3->  secondary host

If connection 3 has a problem:
case A: connections 1 and 2 are both OK. We can select one guest to run, and
        the other one will be destroyed (we have a judge host to decide).
case B: only one of connections 1 and 2 is OK. The guest on the host that can
        still connect to the judge host continues to run; the other guest is
        destroyed.
case C: both connections 1 and 2 have a problem. Both guests are destroyed.
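
The three cases above can be written down as a small decision function. This
is only a sketch of the idea, not code from libxl or Remus: every name in it
(arbitrate(), struct verdict, prefer_primary) is made up for illustration, and
it assumes the judge host has some policy for picking a side in case A.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout: nothing here exists in libxl or Remus. */
enum guest_action { GUEST_RUN, GUEST_DESTROY };

struct verdict {
    enum guest_action primary;
    enum guest_action secondary;
};

/*
 * Decide what happens to each guest once connection 3 (primary <-> secondary)
 * has failed.
 *   link1_ok:       the primary host can still reach the judge host
 *   link2_ok:       the secondary host can still reach the judge host
 *   prefer_primary: the judge's pick when both hosts are reachable (case A)
 */
static struct verdict arbitrate(bool link1_ok, bool link2_ok,
                                bool prefer_primary)
{
    /* Case C is the default: nobody reaches the judge, both are destroyed. */
    struct verdict v = { GUEST_DESTROY, GUEST_DESTROY };

    if (link1_ok && link2_ok) {
        /* Case A: the judge selects exactly one guest to keep running. */
        if (prefer_primary)
            v.primary = GUEST_RUN;
        else
            v.secondary = GUEST_RUN;
    } else if (link1_ok) {
        /* Case B: only the primary can reach the judge. */
        v.primary = GUEST_RUN;
    } else if (link2_ok) {
        /* Case B: only the secondary can reach the judge. */
        v.secondary = GUEST_RUN;
    }

    /* The property that rules out split brain: never two running guests. */
    assert(!(v.primary == GUEST_RUN && v.secondary == GUEST_RUN));
    return v;
}
```

The point of the sketch is the final assertion: whatever the link states are,
at most one guest survives, which is exactly what two hosts alone cannot
guarantee.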

> 
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>>  tools/libxl/libxl.c              |  5 ++++-
>>  tools/libxl/libxl_stream_write.c | 14 ++++++++++++--
>>  2 files changed, 16 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
>> index abb2845..d50c3fb 100644
>> --- a/tools/libxl/libxl.c
>> +++ b/tools/libxl/libxl.c
>> @@ -884,7 +884,10 @@ int libxl_domain_remus_start(libxl_ctx *ctx,
>> libxl_domain_remus_info *info,
>>  
>>      assert(info);
>>  
>> -    /* Point of no return */
>> +    /*
>> +     * This function doesn't return until something is wrong, and
>> +     * we need to do failover from secondary.
> 
> I was actually hoping for user/API documentation (i.e. in a public header)
> rather than a code comment, I suppose this will do though.

OK, I will fix it.

> 
> 
>> +        if (dss->remus)
>> +            /*
>> +             * For remus, if libxl__xc_domain_save_done() completes,
>> +             * there was an error sending data to the secondary.
>> +             * Resume the primary ASAP. The caller doesn't care of the
>> +             * return value(Please refer to libxl__remus_teardown())
> 
> There should usually be a space before a ( in text/prose (also in the
> changelog).

OK, I will fix it.

> 
>> +             */
>> +            stream_complete(egc, stream, 0);
>> +        else
>> +            write_emulator_xenstore_record(egc, stream);
>> +    }
>>  }
>>  
>>  static void write_emulator_xenstore_record(libxl__egc *egc,
> 
> 
> .
> 

* Re: [PATCH v4 1/5] remus: don't call stream_continue() when doing failover
  2016-01-18 16:45   ` Ian Campbell
@ 2016-01-19  1:05     ` Wen Congyang
  0 siblings, 0 replies; 14+ messages in thread
From: Wen Congyang @ 2016-01-19  1:05 UTC (permalink / raw)
  To: Ian Campbell, xen devel, Andrew Cooper
  Cc: Shriram Rajagopalan, Ian Jackson, Changlong Xie, Wei Liu, Yang Hongyang

On 01/19/2016 12:45 AM, Ian Campbell wrote:
> On Mon, 2016-01-18 at 13:40 +0800, Wen Congyang wrote:
>> stream_continue() is used for migration to read emulator
>> xenstore data and emulator context. For remus, if we do
>> failover, we have read it in the checkpoint cycle, and
>> we only need to complete the stream.
>>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> ---
>>  tools/libxl/libxl_stream_read.c | 35 ++++++++++++++++++++++++++++++-----
>>  1 file changed, 30 insertions(+), 5 deletions(-)
>>
>> diff --git a/tools/libxl/libxl_stream_read.c
>> b/tools/libxl/libxl_stream_read.c
>> index 258dec4..24305f4 100644
>> --- a/tools/libxl/libxl_stream_read.c
>> +++ b/tools/libxl/libxl_stream_read.c
>> @@ -101,6 +101,19 @@
>>   *    - stream_write_emulator_done()
>>   *    - stream_continue()
>>   *
>> + * 4) Failover for remus
> 
> I don't think this is really #4 in the list which precedes it. I think a
> section "Failover for remus" would be absolutely fine right at the end of
> this comment block though, i.e. right after the "Depending on the
> contents..." paragraph.
> 
> Andy?
> 
>> + *    - we buffer all records until a CHECKPOINT_END record is received
>> + *    - we will use the records when a CHECKPOINT_END record is received
> 
> "we will consume the buffered records..."
> 
> 
>> + *    - if we find some internal error, the rc or retval is not 0 in
> 
> s/the/then/
> 
>> + *      libxl__xc_domain_restore_done(). In this case, we don't resume the
>> + *      guest
>> + *    - if we need to do failover from primary, the rc and retval are 0
> 
> s/the/then/ again and I would say "are both 0" for clarity (assuming that
> is indeed the requirement).
> 
>> + *      in libxl__xc_domain_restore_done(). In this case, the buffered state
>> + *      will be dropped, because we don't receive a CHECKPOINT_END record,
> 
>                                        haven't received
> 
>> + *      and it is a inconsistent state. In libxl__xc_domain_restore_done(),
> 
> "an inconsistent".
> 
> I think I would say "... and therefore the buffered state is inconsistent".

Sorry for my poor English... I will fix it in the next version.

> 
>> -        stream_continue(egc, stream);
>> +        if (checkpointed_stream) {
>> +            /*
>> +             * Failover from primary. Domain state is currently at a
>> +             * consistent checkpoint, complete the stream, and call
>> +             * stream->completion_callback() to resume the guest.
>> +             */
>> +            stream_complete(egc, stream, 0);
> 
> Is it possible to get here having never received a single CHECKPOINT_END?

I will check the code first. I think xc_domain_restore() should not return
0 if it doesn't receive a single CHECKPOINT_END.

Thanks
Wen Congyang

> 
> Ian.
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread
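
The failover rules in the comment block quoted above can be modelled in a few lines of C. This is only a sketch under stated assumptions, not libxl code: every name in it (model_stream, model_receive, model_restore_done, the ACTION_* values) is hypothetical, and the two integer arguments merely stand in for the rc and retval seen by libxl__xc_domain_restore_done().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the rules discussed in the thread; not libxl code. */
enum record_type { REC_DATA, REC_CHECKPOINT_END };

enum restore_action {
    ACTION_FAIL,      /* rc or retval non-zero: do not resume the guest */
    ACTION_FAILOVER,  /* both zero on a checkpointed stream: complete it */
    ACTION_CONTINUE   /* both zero on a plain migration: keep reading */
};

struct model_stream {
    bool checkpointed;   /* true for a Remus (checkpointed) stream */
    size_t buffered;     /* records held back since the last checkpoint */
    size_t applied;      /* records already consumed into guest state */
    size_t checkpoints;  /* CHECKPOINT_END records seen so far */
};

/* Buffer records until CHECKPOINT_END, then consume the whole batch. */
static void model_receive(struct model_stream *s, enum record_type t)
{
    assert(s != NULL);
    if (!s->checkpointed) {
        /* Plain migration: nothing is held back. */
        if (t == REC_DATA)
            s->applied++;
        return;
    }
    if (t == REC_CHECKPOINT_END) {
        s->applied += s->buffered;
        s->buffered = 0;
        s->checkpoints++;
    } else {
        s->buffered++;
    }
}

/* Decide what to do once the restore helper has finished. */
static enum restore_action model_restore_done(struct model_stream *s,
                                              int rc, int retval)
{
    assert(s != NULL);
    if (rc != 0 || retval != 0)
        return ACTION_FAIL;
    if (!s->checkpointed)
        return ACTION_CONTINUE;
    /*
     * Failover: anything still buffered arrived without a closing
     * CHECKPOINT_END, so it is inconsistent and gets dropped.
     */
    s->buffered = 0;
    return ACTION_FAILOVER;
}
```

The sketch deliberately does not guard the failover branch on checkpoints being non-zero. Whether that case can actually be reached is the open question in this message, so the model leaves it visible rather than deciding it.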

* Re: [PATCH v4 2/5] remus: resume immediately if libxl__xc_domain_save_done() completes
  2016-01-19  1:01     ` Wen Congyang
@ 2016-01-19 11:01       ` Ian Campbell
  0 siblings, 0 replies; 14+ messages in thread
From: Ian Campbell @ 2016-01-19 11:01 UTC (permalink / raw)
  To: Wen Congyang, xen devel, Andrew Cooper
  Cc: Shriram Rajagopalan, Wei Liu, Changlong Xie, Ian Jackson, Yang Hongyang

On Tue, 2016-01-19 at 09:01 +0800, Wen Congyang wrote:
> On 01/19/2016 12:51 AM, Ian Campbell wrote:
> > On Mon, 2016-01-18 at 13:40 +0800, Wen Congyang wrote:
> > > For example: if the secondary host is down and we fail to send the
> > > data to the secondary host, xc_domain_save() returns 0. So in
> > > libxl__xc_domain_save_done(), rc is 0 (the helper program exits
> > > normally) and retval is 0 (it is xc_domain_save()'s return value).
> > > In that case, we just need to complete the stream.
> > 
> > What if the secondary host isn't actually down but just communication
> > has failed for some reason? Won't both primary and secondary start
> > their respective versions of the domain? What are the consequences of
> > that? (Corruption?)
> > 
> > I suppose this is a consequence of the lack of STONITH or splitbrain
> > handling within Remus. Are there any plans to address this?
> 
> IIRC, Shriram Rajagopalan has some ideas about it (check the external heartbeat?).
> There is no way to avoid splitbrain unless we have more than two hosts (at least
> three hosts). If we want to avoid splitbrain, we may need to destroy both primary
> and secondary guests.

I think there are plenty of existing systems for taking care of this side of
fault-tolerance/HA (e.g. linux-ha, Pacemaker, Corosync, etc.); we don't need
(or want) to reinvent that particular wheel here.

I think we just need a story on how one would integrate with such a system
in order to say that Remus is properly usable in real world scenarios (i.e.
before we can remove the "proof-of-concept" wording from the man page).

That might just be a documentation exercise, or it might require adding some
hooks etc. to (lib)xl in order to allow such integrations; I'm not sure
what's needed.

IIRC Ian expressed a similar sentiment when Remus support was first added
to libxl.

Ian.

^ permalink raw reply	[flat|nested] 14+ messages in thread
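
The remark that split-brain cannot be avoided with only two hosts can be illustrated with a strict-majority check. This is a generic sketch, not something Remus, Pacemaker or Corosync is claimed to implement in this form; has_quorum and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: a partition may run the guest only if it can
 * reach a strict majority of the cluster (counting itself).
 */
static bool has_quorum(unsigned reachable, unsigned cluster_size)
{
    assert(reachable <= cluster_size);
    return 2 * reachable > cluster_size;
}
```

With two hosts, a lost link leaves each side seeing 1 of 2, so has_quorum(1, 2) is false on both and neither can safely continue alone. With three hosts, the side that still sees 2 of 3 has a majority and the isolated host does not, which is why at least three participants are needed.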

end of thread, other threads:[~2016-01-19 11:01 UTC | newest]

Thread overview: 14+ messages
2016-01-18  5:40 [PATCH v4 0/5] migration/remus: bug fix and cleanup Wen Congyang
2016-01-18  5:40 ` [PATCH v4 1/5] remus: don't call stream_continue() when doing failover Wen Congyang
2016-01-18 16:45   ` Ian Campbell
2016-01-19  1:05     ` Wen Congyang
2016-01-18  5:40 ` [PATCH v4 2/5] remus: resume immediately if libxl__xc_domain_save_done() completes Wen Congyang
2016-01-18 16:51   ` Ian Campbell
2016-01-19  1:01     ` Wen Congyang
2016-01-19 11:01       ` Ian Campbell
2016-01-18  5:40 ` [PATCH v4 3/5] tools/libxc: don't send end record if remus fails Wen Congyang
2016-01-18 16:53   ` Ian Campbell
2016-01-18 16:53     ` Ian Campbell
2016-01-18  5:40 ` [PATCH v4 4/5] tools/libxc: error handling for the postcopy() callback Wen Congyang
2016-01-18 16:53   ` Ian Campbell
2016-01-18  5:40 ` [PATCH v4 5/5] tools/libxl: remove unused function libxl__domain_save_device_model() Wen Congyang
