* [Xen-devel] live migration from 4.12 to 4.13 fails due to qemu-xen bug
@ 2020-01-13 10:36 Olaf Hering
  2020-01-13 17:26 ` Olaf Hering
  2020-01-27 11:30 ` Olaf Hering
  0 siblings, 2 replies; 7+ messages in thread
From: Olaf Hering @ 2020-01-13 10:36 UTC
  To: xen-devel


I did not find anything in the Xen 4.13 release notes, so I'm asking here:

This HVM domU fails to live migrate from staging-4.12 to staging-4.13:

name='hvm'
serial='pty'
vcpus='4'
memory='888'
disk=[ 'vdev=xvda, format=raw, access=rw, target=/netshare/disk0.raw' ]
vif=[ 'bridge=br0,mac=3c:27:63:58:ca:35' ]
builder="hvm"
device_model_version="qemu-xen"

The receiving qemu fails like this:

char device redirected to /dev/pts/3 (label serial0)
xen_ram_alloc: do not alloc 37000000 bytes of ram at 0 when runstate is INMIGRATE
xen_ram_alloc: do not alloc 800000 bytes of ram at 37000000 when runstate is INMIGRATE
xen_ram_alloc: do not alloc 10000 bytes of ram at 37800000 when runstate is INMIGRATE
xen_ram_alloc: do not alloc 40000 bytes of ram at 37840000 when runstate is INMIGRATE
VNC server running on 127.0.0.1:5900
qemu-system-i386: Unknown savevm section type 111
qemu-system-i386: load of migration failed: Invalid argument

A web search turns up nothing for this specific error.
In fact, every upstream qemu-3.x fails to migrate to qemu-4.x.
Does anyone know which incompatible change was made in qemu.git?
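
For readers trying to place the "Unknown savevm section type 111" message: the incoming migration stream is a sequence of sections, each introduced by a one-byte type, and the loader reports this error when that byte is none of the types it knows. The sketch below is not QEMU code. The constant values are believed to mirror QEMU's QEMU_VM_* definitions and should be checked against the tree in use, and is_known_section_type() is a hypothetical helper. The value 111 (0x6f) is not among them, which would be consistent with the loader having lost its position in the stream, for example because the sender emitted device state the receiver did not expect.

```c
#include <stdint.h>

/* Section type bytes, assumed to match QEMU's QEMU_VM_* constants. */
enum {
    QEMU_VM_EOF            = 0x00,
    QEMU_VM_SECTION_START  = 0x01,
    QEMU_VM_SECTION_PART   = 0x02,
    QEMU_VM_SECTION_END    = 0x03,
    QEMU_VM_SECTION_FULL   = 0x04,
    QEMU_VM_SUBSECTION     = 0x05,
    QEMU_VM_VMDESCRIPTION  = 0x06,
    QEMU_VM_CONFIGURATION  = 0x07,
    QEMU_VM_COMMAND        = 0x08,
    QEMU_VM_SECTION_FOOTER = 0x7e
};

/* Hypothetical helper: returns 1 if the byte is a known section type,
 * 0 if a loader would have to give up with "Unknown savevm section type". */
static int is_known_section_type(uint8_t type)
{
    switch (type) {
    case QEMU_VM_EOF:
    case QEMU_VM_SECTION_START:
    case QEMU_VM_SECTION_PART:
    case QEMU_VM_SECTION_END:
    case QEMU_VM_SECTION_FULL:
    case QEMU_VM_SUBSECTION:
    case QEMU_VM_VMDESCRIPTION:
    case QEMU_VM_CONFIGURATION:
    case QEMU_VM_COMMAND:
    case QEMU_VM_SECTION_FOOTER:
        return 1;
    default:
        return 0;
    }
}
```

Under these assumptions, is_known_section_type(111) returns 0, so the byte from the log cannot be a legitimate section header.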

Olaf


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

* Re: [Xen-devel] live migration from 4.12 to 4.13 fails due to qemu-xen bug
  2020-01-13 10:36 [Xen-devel] live migration from 4.12 to 4.13 fails due to qemu-xen bug Olaf Hering
@ 2020-01-13 17:26 ` Olaf Hering
  2020-01-27 11:30 ` Olaf Hering
  1 sibling, 0 replies; 7+ messages in thread
From: Olaf Hering @ 2020-01-13 17:26 UTC
  To: xen-devel


On Mon, 13 Jan 2020 11:36:27 +0100,
Olaf Hering <olaf@aepfle.de> wrote:

> qemu-system-i386: Unknown savevm section type 111

It looks like xenfv_machine_options() in hw/i386/pc_piix.c must set m->smbus_no_migration_support to true.
Not sure why this went unnoticed for so long.

Olaf

* Re: [Xen-devel] live migration from 4.12 to 4.13 fails due to qemu-xen bug
  2020-01-13 10:36 [Xen-devel] live migration from 4.12 to 4.13 fails due to qemu-xen bug Olaf Hering
  2020-01-13 17:26 ` Olaf Hering
@ 2020-01-27 11:30 ` Olaf Hering
  2020-01-27 11:54   ` Durrant, Paul
  1 sibling, 1 reply; 7+ messages in thread
From: Olaf Hering @ 2020-01-27 11:30 UTC
  To: xen-devel


On Mon, 13 Jan 2020 11:36:27 +0100,
Olaf Hering <olaf@aepfle.de> wrote:

> This HVM domU fails to live migrate from staging-4.12 to staging-4.13:

It turned out that libxl does not configure qemu correctly at runtime:
libxl__build_device_model_args_new() uses 'qemu -machine xenfv', perhaps on the assumption that 'xenfv' does the right thing. Unfortunately, 'xenfv' entirely ignores the compatibility of "pc-i440fx" between qemu versions; it just maps to 'pc', aka 'the latest'. Instead of 'qemu -machine xenfv', libxl should run 'qemu -machine pc-i440fx-3.0 -device xen-platform -accel xen' to make sure the domU can be migrated safely to future versions of qemu.

Maybe there should also be a way to select a specific variant of "pc-i440fx". Currently the only way to do that is to use device_model_args_hvm= in xl.cfg. Unfortunately libvirt does not support "b_info->extra*".

Should the string "pc-i440fx-3.0" become a configure option?



I think this (untested) patch has to be applied to staging-4.13:


--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -1715,23 +1715,20 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
     for (i = 0; b_info->extra && b_info->extra[i] != NULL; i++)
         flexarray_append(dm_args, b_info->extra[i]);
 
-    flexarray_append(dm_args, "-machine");
     switch (b_info->type) {
     case LIBXL_DOMAIN_TYPE_PVH:
     case LIBXL_DOMAIN_TYPE_PV:
+        flexarray_append(dm_args, "-machine");
         flexarray_append(dm_args, "xenpv");
         for (i = 0; b_info->extra_pv && b_info->extra_pv[i] != NULL; i++)
             flexarray_append(dm_args, b_info->extra_pv[i]);
         break;
     case LIBXL_DOMAIN_TYPE_HVM:
-        if (!libxl_defbool_val(b_info->u.hvm.xen_platform_pci)) {
-            /* Switching here to the machine "pc" which does not add
-             * the xen-platform device instead of the default "xenfv" machine.
-             */
-            machinearg = libxl__strdup(gc, "pc,accel=xen");
-        } else {
-            machinearg = libxl__strdup(gc, "xenfv");
+        if (libxl_defbool_val(b_info->u.hvm.xen_platform_pci)) {
+            flexarray_append(dm_args, "-device");
+            flexarray_append(dm_args, "xen-platform");
         }
+        machinearg = libxl__strdup(gc, "pc-i440fx-3.0,accel=xen");
         if (b_info->u.hvm.mmio_hole_memkb) {
             uint64_t max_ram_below_4g = (1ULL << 32) -
                 (b_info->u.hvm.mmio_hole_memkb << 10);
@@ -1762,6 +1759,7 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
             }
         }
 
+        flexarray_append(dm_args, "-machine");
         flexarray_append(dm_args, machinearg);
         for (i = 0; b_info->extra_hvm && b_info->extra_hvm[i] != NULL; i++)
             flexarray_append(dm_args, b_info->extra_hvm[i]);



Olaf

* Re: [Xen-devel] live migration from 4.12 to 4.13 fails due to qemu-xen bug
  2020-01-27 11:30 ` Olaf Hering
@ 2020-01-27 11:54   ` Durrant, Paul
  2020-01-27 12:59     ` Olaf Hering
  2020-02-17 15:04     ` Olaf Hering
  0 siblings, 2 replies; 7+ messages in thread
From: Durrant, Paul @ 2020-01-27 11:54 UTC
  To: Olaf Hering, xen-devel

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On Behalf
> Of Olaf Hering
> Sent: 27 January 2020 11:30
> To: xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] live migration from 4.12 to 4.13 fails due to
> qemu-xen bug
> 
> On Mon, 13 Jan 2020 11:36:27 +0100,
> Olaf Hering <olaf@aepfle.de> wrote:
> 
> > This HVM domU fails to live migrate from staging-4.12 to staging-4.13:
> 
> It turned out libxl does not configure qemu correctly at runtime:
> libxl__build_device_model_args_new() uses 'qemu -machine xenfv', perhaps
> with the assumption that 'xenfv' does the right thing. Unfortunately,
> 'xenfv' entirely ignores compatibility of "pc-i440fx" between qemu
> versions, 'xenfv' just maps to 'pc' aka 'the latest'. Instead of
> 'qemu -machine xenfv', libxl should run
> 'qemu -machine pc-i440fx-3.0 -device xen-platform -accel xen' to make sure
> the domU can be migrated safely to future versions of qemu.

Agreed, I think the use of xenfv needs to be dropped and xl/libxl ought to specify the pc version it wants, as you suggest. For compatibility, though, if the pc version is not specified in xl.cfg we'd need a mechanism to scan the versions supported by the installed qemu and pick the latest, so that it gets baked into the JSON blob for save/restore/migration purposes.

> 
> Maybe there should also be a way to select a specific variant of
> "pc-i440fx". Currently the only way to do that is to use
> device_model_args_hvm= in xl.cfg. Unfortunately libvirt does not support
> "b_info->extra*".
> 

Yeah, it should be a first class config option.

> Should the string "pc-i440fx-3.0" become a configure option?
> 

I suppose. Could we have "pc-i440fx" as the default, which libxl prefix-matches against qemu's supported versions to select the latest? I guess that would work.
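
A minimal sketch of that prefix match, assuming the list of machine type names has already been obtained from the installed qemu (for example via the QMP query-machines command). The function names parse_version() and pick_latest_machine() are hypothetical and are not part of libxl. Versions are compared numerically, so that 2.12 sorts after 2.9.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Parse "<prefix>-<major>.<minor>". Returns 1 on success, 0 otherwise. */
static int parse_version(const char *name, const char *prefix,
                         int *major, int *minor)
{
    size_t plen = strlen(prefix);
    char tail;

    if (strncmp(name, prefix, plen) != 0 || name[plen] != '-')
        return 0;
    if (!isdigit((unsigned char)name[plen + 1]))
        return 0;
    /* Exactly two conversions must succeed; a third means trailing text. */
    return sscanf(name + plen + 1, "%d.%d%c", major, minor, &tail) == 2;
}

/* Return the entry with the highest version for the given prefix,
 * or NULL if no entry matches. */
static const char *pick_latest_machine(const char *const *names, size_t n,
                                       const char *prefix)
{
    const char *best = NULL;
    int best_major = -1, best_minor = -1;

    for (size_t i = 0; i < n; i++) {
        int major, minor;

        if (!parse_version(names[i], prefix, &major, &minor))
            continue;
        if (major > best_major ||
            (major == best_major && minor > best_minor)) {
            best = names[i];
            best_major = major;
            best_minor = minor;
        }
    }
    return best;
}
```

Whatever name is selected this way would still have to be recorded with the domain, otherwise a later restore on a host with a newer qemu would pick a different machine type.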

Functionally your code looks fine, but I don't think hard-coding 3.0 is the right thing to do. What happens if someone is trying to use an older version of qemu? I think it's going to cause unexpected breakage.

  Paul


* Re: [Xen-devel] live migration from 4.12 to 4.13 fails due to qemu-xen bug
  2020-01-27 11:54   ` Durrant, Paul
@ 2020-01-27 12:59     ` Olaf Hering
  2020-01-27 13:18       ` Durrant, Paul
  2020-02-17 15:04     ` Olaf Hering
  1 sibling, 1 reply; 7+ messages in thread
From: Olaf Hering @ 2020-01-27 12:59 UTC
  To: Durrant, Paul; +Cc: xen-devel


On Mon, 27 Jan 2020 11:54:37 +0000,
"Durrant, Paul" <pdurrant@amazon.co.uk> wrote:

> > Should the string "pc-i440fx-3.0" become a configure option?
> I suppose. Could we have "pc-i440fx" as the default, which libxl prefix matches against qemu's supported versions to select the latest?

I think the qemu machine variant must become a property of the running domU, so that it does not get lost during migration. For incoming domUs without such a property, libxl must select some default. At runtime libxl has no information about what the initial qemu command line was, so this fallback must become a compile-time or runtime knob as well. I am not sure whether it would be too cumbersome for host admins to apply the equivalent of "device_model_args_hvm=" to a five- or six-digit number of running domUs during or prior to their migration.

There should be a --qemuu-hvm-machine, which may just default to 'pc-1.0' if not specified. That string should go to domain_build_info.u.hvm.qemuu_machine, so that it becomes part of the domU properties.
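
The fallback logic proposed here could be sketched as follows. This is an illustration only: struct hvm_build_info stands in for the relevant part of libxl's domain_build_info, the qemuu_machine field does not exist in libxl today, and resolve_qemuu_machine() and format_machine_arg() are hypothetical helpers.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for domain_build_info.u.hvm. */
struct hvm_build_info {
    const char *qemuu_machine; /* NULL if the domU predates the property */
};

/* Prefer the machine type recorded with the domU; otherwise use the
 * build-time or runtime fallback supplied by the caller. */
static const char *resolve_qemuu_machine(const struct hvm_build_info *hvm,
                                         const char *fallback)
{
    if (hvm->qemuu_machine != NULL && hvm->qemuu_machine[0] != '\0')
        return hvm->qemuu_machine;
    return fallback;
}

/* Build the value passed after "-machine". Returns 1 on success,
 * 0 if the buffer is too small or formatting failed. */
static int format_machine_arg(char *buf, size_t size, const char *machine)
{
    int n = snprintf(buf, size, "%s,accel=xen", machine);

    return n >= 0 && (size_t)n < size;
}
```

With a fallback of 'pc-1.0', a domU that carries no property resolves to "pc-1.0,accel=xen", while one created with "pc-i440fx-3.0" keeps that machine type across migrations.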


Olaf

* Re: [Xen-devel] live migration from 4.12 to 4.13 fails due to qemu-xen bug
  2020-01-27 12:59     ` Olaf Hering
@ 2020-01-27 13:18       ` Durrant, Paul
  0 siblings, 0 replies; 7+ messages in thread
From: Durrant, Paul @ 2020-01-27 13:18 UTC
  To: Olaf Hering; +Cc: Anthony Perard, Ian Jackson, Wei Liu, xen-devel

> -----Original Message-----
> From: Olaf Hering [mailto:olaf@aepfle.de]
> Sent: 27 January 2020 13:00
> To: Durrant, Paul <pdurrant@amazon.co.uk>
> Cc: xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] live migration from 4.12 to 4.13 fails due to
> qemu-xen bug
> 
> On Mon, 27 Jan 2020 11:54:37 +0000,
> "Durrant, Paul" <pdurrant@amazon.co.uk> wrote:
> 
> > > Should the string "pc-i440fx-3.0" become a configure option?
> > I suppose. Could we have "pc-i440fx" as the default, which libxl prefix
> > matches against qemu's supported versions to select the latest?
> 
> I think the qemu machine variant must become a property of the running
> domU, so that it will not get lost during migration. For incoming domUs
> without such property some default must be selected by libxl. libxl at
> runtime has no info what the initial qemu command was. So this fallback
> must become a compile or runtime knob as well. Not sure if it would be too
> cumbersome for host admins to apply the equivalent of
> "device_model_args_hvm=" to a five or six digit number of running domUs
> during or prior their migration.
> 
> There should be a --qemuu-hvm-machine, which may just default to 'pc-1.0'
> if not specified. That string should go to
> domain_build_info.u.hvm.qemuu_machine, so that it becomes part of the domU
> properties.
> 

Could we have an opinion from a toolstack maintainer (cc-ed), please?

  Paul

* Re: [Xen-devel] live migration from 4.12 to 4.13 fails due to qemu-xen bug
  2020-01-27 11:54   ` Durrant, Paul
  2020-01-27 12:59     ` Olaf Hering
@ 2020-02-17 15:04     ` Olaf Hering
  1 sibling, 0 replies; 7+ messages in thread
From: Olaf Hering @ 2020-02-17 15:04 UTC
  To: Durrant, Paul; +Cc: xen-devel


On Mon, 27 Jan 2020 11:54:37 +0000,
"Durrant, Paul" <pdurrant@amazon.co.uk> wrote:

> I suppose. Could we have "pc-i440fx" as the default, which libxl prefix matches against qemu's supported versions to select the latest? I guess that would work.

This cannot be fixed in libxl because libxl cannot possibly know what is inside the domU.

With '-machine xenfv' the PCI device is at 0000:00:02.0/platform, while with '-machine pc-i440fx-3.1,accel=xen -device xen-platform' the PCI device is somewhere else. As a result the receiving host rejects this approach:

qemu-system-i386: Unknown savevm section or instance '0000:00:02.0/platform' 0. Make sure that your current VM setup matches your saved VM setup, including any hotplugged devices
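
The failure mode can be modelled in a few lines, assuming the receiver looks up each incoming section by its id string and instance id among the devices it has instantiated. The types and the function name below are hypothetical and this is not QEMU code. The slot 0000:00:03.0 is an invented placeholder, since all that is known here is that the device ends up "somewhere else".

```c
#include <stddef.h>
#include <string.h>

/* One device state handler registered on the receiving side. */
struct state_entry {
    const char *idstr;   /* e.g. "<pci address>/<device name>" */
    int instance_id;
};

/* Return the entry matching an incoming section header, or NULL if the
 * receiver has no device under that id string and instance id. */
static const struct state_entry *
find_state_entry(const struct state_entry *entries, size_t n,
                 const char *idstr, int instance_id)
{
    for (size_t i = 0; i < n; i++) {
        if (entries[i].instance_id == instance_id &&
            strcmp(entries[i].idstr, idstr) == 0)
            return &entries[i];
    }
    return NULL;
}
```

A receiver whose xen-platform device sits in a different slot has no entry for "0000:00:02.0/platform", the lookup returns NULL, and the load is aborted, which matches the error above.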

In my earlier testing I forced -machine pc-i440fx* on the sending side, and did not spot the flaw in this patch for libxl.

In short: we are doomed...


Olaf
