xen-devel.lists.xenproject.org archive mirror
* [PATCH for-4.7] tools/libxl: Fix legacy migration following COLO backchannel breakage
@ 2016-04-14 19:54 Andrew Cooper
  2016-04-15  0:43 ` Wen Congyang
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Andrew Cooper @ 2016-04-14 19:54 UTC (permalink / raw)
  To: Xen-devel
  Cc: Olaf Hering, Changlong Xie, Wei Liu, Wen Congyang, Andrew Cooper,
	Ian Jackson, Yang Hongyang

c/s f5d947bf1b "tools/libxl: add back channel support to read stream"
made a bogus adjustment to libxl__stream_read_start(), including
removing the comment hinting at what was going on, which breaks
conversion of a legacy migration stream.

Symptoms look like:

  root@anonymi:~ # xl migrate domU host
  migration target: Ready to receive domain.
  Saving to migration stream new xl format (info 0x1/0x0/2677)
  xc: error: error polling suspend notification channel: -1: Internal error
  Loading new save file <incoming migration stream> (new xl fmt info 0x1/0x0/2677)
   Savefile contains xl domain config in JSON format
  Parsing config from <saved>
  libxl: error: libxl_stream_read.c:327:stream_header_done: Invalid ident: expected 0x4c6962786c466d74, got 0x01f00f0000000000
  libxl: error: libxl_utils.c:430:libxl_read_exactly: file/stream truncated reading ipc msg header from domain 1 save/restore helper stdout pipe

The adjustment is not required for backchannel support (as there is no
interaction between back channels and legacy conversion), and caused
stream->fd to be latched in the datacopier before legacy conversion
substitutes it for the fd which is the output of the conversion script.

This causes libxl to consume data from the legacy stream rather than the
v2 stream, and for the conversion script to encounter an error as the
legacy stream appears to skip ahead.

Undo the adjustments to libxl__stream_read_start(), and introduce a
better description of what is going on.  Introduce some extra assertions
to try and catch similar breakage in the future.
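
To illustrate the ordering problem described above, here is a minimal
standalone sketch.  It is not libxl code: struct stream, struct copier,
preloaded_pipe() and first_read() are hypothetical stand-ins, and two
preloaded pipes play the roles of the legacy stream and the conversion
script's v2 output.  It only shows that a reader which latches the fd
before the substitution ends up consuming the legacy data.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-ins for the libxl stream and datacopier state. */
struct stream { int fd; int legacy; };
struct copier { int readfd; };

/* Create a pipe preloaded with `tag`; return its read end, or -1 on error. */
static int preloaded_pipe(const char *tag)
{
    int p[2];
    size_t len = strlen(tag);

    if (pipe(p) < 0)
        return -1;
    if (write(p[1], tag, len) != (ssize_t)len) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    close(p[1]);
    return p[0];
}

/*
 * Set up a "legacy" stream and report which data the copier reads first.
 * latch_early != 0 mimics the broken ordering (copier set up before the
 * fd substitution); latch_early == 0 mimics the fixed ordering.
 * Stores a NUL-terminated string in out; returns 0 on success, -1 on error.
 */
static int first_read(int latch_early, char *out, size_t len)
{
    struct stream s = { .fd = -1, .legacy = 1 };
    struct copier dc = { .readfd = -1 };
    int legacy_fd = preloaded_pipe("legacy");
    int v2_fd = preloaded_pipe("v2");
    int rc = -1;
    ssize_t n;

    if (legacy_fd < 0 || v2_fd < 0 || len == 0)
        goto out;

    s.fd = legacy_fd;
    if (latch_early)
        dc.readfd = s.fd;      /* latched too soon: still the legacy fd */
    if (s.legacy)
        s.fd = v2_fd;          /* conversion output replaces the stream fd */
    if (!latch_early)
        dc.readfd = s.fd;      /* latched after substitution: the v2 fd */

    assert(dc.readfd >= 0);
    n = read(dc.readfd, out, len - 1);
    if (n < 0)
        goto out;
    out[n] = '\0';
    rc = 0;

 out:
    if (legacy_fd >= 0) close(legacy_fd);
    if (v2_fd >= 0) close(v2_fd);
    return rc;
}
```

Calling first_read(1, buf, sizeof(buf)) leaves the legacy pipe's data in
buf, while first_read(0, buf, sizeof(buf)) leaves the v2 pipe's data,
which corresponds to the ordering this patch restores.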

Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Olaf Hering <olaf@aepfle.de>
CC: Yang Hongyang <hongyang.yang@easystack.cn>
CC: Wen Congyang <wency@cn.fujitsu.com>
CC: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
---
 tools/libxl/libxl_stream_read.c | 33 ++++++++++++++++++++++++---------
 1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index 9659051..89c2f21 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -234,16 +234,16 @@ void libxl__stream_read_start(libxl__egc *egc,
     stream->running = true;
     stream->phase   = SRS_PHASE_NORMAL;
 
-    dc->ao       = stream->ao;
-    dc->copywhat = "restore v2 stream";
-    dc->readfd = stream->fd;
-    dc->writefd  = -1;
-
-    if (stream->back_channel)
-        return;
-
     if (stream->legacy) {
-        /* Convert the legacy stream. */
+        /*
+         * Convert the legacy stream.
+         *
+         * This results in a fork()/exec() of conversion helper script.  It is
+         * passed the existing stream->fd as an input, and returns the
+         * transformed stream via a new pipe.  The fd of this new pipe then
+         * replaces stream->fd, to make the rest of the stream read code
+         * agnostic to whether legacy conversion is happening or not.
+         */
         libxl__conversion_helper_state *chs = &stream->chs;
 
         chs->legacy_fd = stream->fd;
@@ -258,10 +258,25 @@ void libxl__stream_read_start(libxl__egc *egc,
             goto err;
         }
 
+        /* There should be no interaction of COLO backchannels and legacy
+         * stream conversion. */
+        assert(!stream->back_channel);
+
+        /* Confirm *dc is still zeroed out, while we shuffle stream->fd. */
+        assert(dc->ao == NULL);
         assert(stream->chs.v2_carefd);
         stream->fd = libxl__carefd_fd(stream->chs.v2_carefd);
         stream->dcs->libxc_fd = stream->fd;
     }
+    /* stream->fd is now a v2 stream. */
+
+    dc->ao       = stream->ao;
+    dc->copywhat = "restore v2 stream";
+    dc->readfd   = stream->fd;
+    dc->writefd  = -1;
+
+    if (stream->back_channel)
+        return;
 
     /* Start reading the stream header. */
     rc = setup_read(stream, "stream header",
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* Re: [PATCH for-4.7] tools/libxl: Fix legacy migration following COLO backchannel breakage
  2016-04-14 19:54 [PATCH for-4.7] tools/libxl: Fix legacy migration following COLO backchannel breakage Andrew Cooper
@ 2016-04-15  0:43 ` Wen Congyang
  2016-04-15  9:00 ` Wei Liu
  2016-04-15  9:01 ` Olaf Hering
  2 siblings, 0 replies; 5+ messages in thread
From: Wen Congyang @ 2016-04-15  0:43 UTC (permalink / raw)
  To: Andrew Cooper, Xen-devel
  Cc: Wei Liu, Olaf Hering, Changlong Xie, Ian Jackson, Yang Hongyang

On 04/15/2016 03:54 AM, Andrew Cooper wrote:
> c/s f5d947bf1b "tools/libxl: add back channel support to read stream"
> made a bogus adjustment to libxl__stream_read_start(), including
> removing the comment hinting at what was going on, which breaks
> conversion of a legacy migration stream.
> 
> Symptoms look like:
> 
>   root@anonymi:~ # xl migrate domU host
>   migration target: Ready to receive domain.
>   Saving to migration stream new xl format (info 0x1/0x0/2677)
>   xc: error: error polling suspend notification channel: -1: Internal error
>   Loading new save file <incoming migration stream> (new xl fmt info 0x1/0x0/2677)
>    Savefile contains xl domain config in JSON format
>   Parsing config from <saved>
>   libxl: error: libxl_stream_read.c:327:stream_header_done: Invalid ident: expected 0x4c6962786c466d74, got 0x01f00f0000000000
>   libxl: error: libxl_utils.c:430:libxl_read_exactly: file/stream truncated reading ipc msg header from domain 1 save/restore helper stdout pipe
> 
> The adjustment is not required for backchannel support (as there is no
> interaction between back channels and legacy conversion), and caused
> stream->fd to be latched in the datacopier before legacy conversion
> substitutes it for the fd which is the output of the conversion script.
> 
> This causes libxl to consume data from the legacy stream rather than the
> v2 stream, and for the conversion script to encounter an error as the
> legacy stream appears to skip ahead.
> 
> Undo the adjustments to libxl__stream_read_start(), and introduce a
> better description of what is going on.  Introduce some extra assertions
> to try and catch similar breakage in the future.
> 
> Reported-by: Olaf Hering <olaf@aepfle.de>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>

> ---
> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> CC: Olaf Hering <olaf@aepfle.de>
> CC: Yang Hongyang <hongyang.yang@easystack.cn>
> CC: Wen Congyang <wency@cn.fujitsu.com>
> CC: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
> ---
>  tools/libxl/libxl_stream_read.c | 33 ++++++++++++++++++++++++---------
>  1 file changed, 24 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
> index 9659051..89c2f21 100644
> --- a/tools/libxl/libxl_stream_read.c
> +++ b/tools/libxl/libxl_stream_read.c
> @@ -234,16 +234,16 @@ void libxl__stream_read_start(libxl__egc *egc,
>      stream->running = true;
>      stream->phase   = SRS_PHASE_NORMAL;
>  
> -    dc->ao       = stream->ao;
> -    dc->copywhat = "restore v2 stream";
> -    dc->readfd = stream->fd;
> -    dc->writefd  = -1;
> -
> -    if (stream->back_channel)
> -        return;
> -
>      if (stream->legacy) {
> -        /* Convert the legacy stream. */
> +        /*
> +         * Convert the legacy stream.
> +         *
> +         * This results in a fork()/exec() of conversion helper script.  It is
> +         * passed the existing stream->fd as an input, and returns the
> +         * transformed stream via a new pipe.  The fd of this new pipe then
> +         * replaces stream->fd, to make the rest of the stream read code
> +         * agnostic to whether legacy conversion is happening or not.
> +         */
>          libxl__conversion_helper_state *chs = &stream->chs;
>  
>          chs->legacy_fd = stream->fd;
> @@ -258,10 +258,25 @@ void libxl__stream_read_start(libxl__egc *egc,
>              goto err;
>          }
>  
> +        /* There should be no interaction of COLO backchannels and legacy
> +         * stream conversion. */
> +        assert(!stream->back_channel);
> +
> +        /* Confirm *dc is still zeroed out, while we shuffle stream->fd. */
> +        assert(dc->ao == NULL);
>          assert(stream->chs.v2_carefd);
>          stream->fd = libxl__carefd_fd(stream->chs.v2_carefd);
>          stream->dcs->libxc_fd = stream->fd;
>      }
> +    /* stream->fd is now a v2 stream. */
> +
> +    dc->ao       = stream->ao;
> +    dc->copywhat = "restore v2 stream";
> +    dc->readfd   = stream->fd;
> +    dc->writefd  = -1;
> +
> +    if (stream->back_channel)
> +        return;
>  
>      /* Start reading the stream header. */
>      rc = setup_read(stream, "stream header",
> 





* Re: [PATCH for-4.7] tools/libxl: Fix legacy migration following COLO backchannel breakage
  2016-04-14 19:54 [PATCH for-4.7] tools/libxl: Fix legacy migration following COLO backchannel breakage Andrew Cooper
  2016-04-15  0:43 ` Wen Congyang
@ 2016-04-15  9:00 ` Wei Liu
  2016-04-15 11:02   ` Ian Jackson
  2016-04-15  9:01 ` Olaf Hering
  2 siblings, 1 reply; 5+ messages in thread
From: Wei Liu @ 2016-04-15  9:00 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Olaf Hering, Changlong Xie, Wei Liu, Wen Congyang, Ian Jackson,
	Xen-devel, Yang Hongyang

On Thu, Apr 14, 2016 at 08:54:15PM +0100, Andrew Cooper wrote:
> c/s f5d947bf1b "tools/libxl: add back channel support to read stream"
> made a bogus adjustment to libxl__stream_read_start(), including
> removing the comment hinting at what was going on, which breaks
> conversion of a legacy migration stream.
> 
> Symptoms look like:
> 
>   root@anonymi:~ # xl migrate domU host
>   migration target: Ready to receive domain.
>   Saving to migration stream new xl format (info 0x1/0x0/2677)
>   xc: error: error polling suspend notification channel: -1: Internal error
>   Loading new save file <incoming migration stream> (new xl fmt info 0x1/0x0/2677)
>    Savefile contains xl domain config in JSON format
>   Parsing config from <saved>
>   libxl: error: libxl_stream_read.c:327:stream_header_done: Invalid ident: expected 0x4c6962786c466d74, got 0x01f00f0000000000
>   libxl: error: libxl_utils.c:430:libxl_read_exactly: file/stream truncated reading ipc msg header from domain 1 save/restore helper stdout pipe
> 
> The adjustment is not required for backchannel support (as there is no
> interaction between back channels and legacy conversion), and caused
> stream->fd to be latched in the datacopier before legacy conversion
> substitutes it for the fd which is the output of the conversion script.
> 
> This causes libxl to consume data from the legacy stream rather than the
> v2 stream, and for the conversion script to encounter an error as the
> legacy stream appears to skip ahead.
> 
> Undo the adjustments to libxl__stream_read_start(), and introduce a
> better description of what is going on.  Introduce some extra assertions
> to try and catch similar breakage in the future.
> 
> Reported-by: Olaf Hering <olaf@aepfle.de>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Release-acked-by: Wei Liu <wei.liu2@citrix.com>

Thank you for fixing this.


* Re: [PATCH for-4.7] tools/libxl: Fix legacy migration following COLO backchannel breakage
  2016-04-14 19:54 [PATCH for-4.7] tools/libxl: Fix legacy migration following COLO backchannel breakage Andrew Cooper
  2016-04-15  0:43 ` Wen Congyang
  2016-04-15  9:00 ` Wei Liu
@ 2016-04-15  9:01 ` Olaf Hering
  2 siblings, 0 replies; 5+ messages in thread
From: Olaf Hering @ 2016-04-15  9:01 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Changlong Xie, Wei Liu, Wen Congyang, Ian Jackson, Xen-devel,
	Yang Hongyang

On Thu, Apr 14, Andrew Cooper wrote:

> c/s f5d947bf1b "tools/libxl: add back channel support to read stream"
> made a bogus adjustment to libxl__stream_read_start(), including
> removing the comment hinting at what was going on, which breaks
> conversion of a legacy migration stream.

Thanks!

Tested-by: Olaf Hering <olaf@aepfle.de>


root@anonymi:~ # xl migrate  domU host
migration target: Ready to receive domain.
Saving to migration stream new xl format (info 0x1/0x0/2675)
Loading new save file <incoming migration stream> (new xl fmt info 0x1/0x0/2675)
 Savefile contains xl domain config in JSON format
Parsing config from <saved>
xc: info: Found x86 HVM domain, converted from legacy stream format
xc: info: Restoring domain
xc: info: Restore successful
xc: info: XenStore: mfn 0xfeffc, dom 0, evt 1
xc: info: Console: mfn 0xfefff, dom 0, evt 2
libxl: warning: libxl_dm.c:1486:libxl__build_device_model_args_new: Could not find user xen-qemuuser-shared, starting QEMU as root
migration target: Transfer complete, requesting permission to start domain.
migration sender: Target has acknowledged transfer.
migration sender: Giving target permission to start.
migration target: Got permission, starting domain.
migration target: Domain started successsfully.
migration sender: Target reports successful startup.
libxl: info: libxl.c:1698:devices_destroy_cb: forked pid 2034 for destroy of domain 1
Migration successful.
root@anonymi:~ #

Olaf


* Re: [PATCH for-4.7] tools/libxl: Fix legacy migration following COLO backchannel breakage
  2016-04-15  9:00 ` Wei Liu
@ 2016-04-15 11:02   ` Ian Jackson
  0 siblings, 0 replies; 5+ messages in thread
From: Ian Jackson @ 2016-04-15 11:02 UTC (permalink / raw)
  To: Wei Liu
  Cc: Olaf Hering, Changlong Xie, Wen Congyang, Andrew Cooper,
	Xen-devel, Yang Hongyang

Wei Liu writes ("Re: [PATCH for-4.7] tools/libxl: Fix legacy migration following COLO backchannel breakage"):
> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
> 
> Thank you for fixing this.

Indeed, thanks everyone.  Queued.

Ian.

