* HVM Migration of domU on Qemu-upstream DM loses ACPI data in xenstore
       [not found] <514405924.8634267.1368537117719.JavaMail.root@zimbra002>
@ 2013-05-14 13:12 ` Diana Crisan
  2013-05-17 17:33   ` Ian Campbell
  0 siblings, 1 reply; 9+ messages in thread
From: Diana Crisan @ 2013-05-14 13:12 UTC (permalink / raw)
  To: xen-devel

This is problem 2 of 3 problems we are having with live migration and/or ACPI on Xen-4.3 and Xen-4.2.

Any help would be appreciated.

Detailed description of problem:

We are using Xen-4.3-rc1 with dom0 running Ubuntu Precise and 3.5.0-23-generic kernel, and domU running Ubuntu Precise (12.04) cloud images running 3.2.0-39-virtual. We are using the xl.conf below with qemu-upstream-dm and HVM, on two identical sending and receiving machines (same hardware and software).

When live migration is instigated between two identical hardware configurations using 'xl migrate', the migrate completes but the xenstore entries on the sending and receiving side differ. 

Prior to issuing a migration the lines 'platform = ""', 'acpi = "1"', 'acpi_s3 = "1"' and 'acpi_s4 = "1"' are present in xenstore (despite the fact xl.conf does not explicitly specify them). However, after the migration succeeds on the receiving side, those lines are missing. 

This is replicable every time.

How to replicate:

1. Take two machines with identical hardware and software, running the xen-4.3-rc1 version of Xen on Ubuntu Precise with 3.5.0-23-generic kernel.
2. Use the xl.conf below as a configuration file.
3. Create a VM using Ubuntu Precise and 3.5.0-23 generic.
4. Start the VM
5. Do xenstore-ls and save it to a file for later comparison.
6. xl migrate from one machine to the other
7. wait until it resumes on the receiving side
8. Do xenstore-ls on the receiving side and save it to a file.
9. Compare the two files and notice that the lines referring to the platform ACPI configuration in xenstore are missing. 
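
Steps 5 to 9 can be scripted. The following is a minimal Python sketch, not part of the original report. It assumes both dumps were taken on the domain's own subtree (for example with xenstore-ls /local/domain/<domid>, so that paths do not embed the differing domain IDs) and that xenstore-ls shows nesting with one space of indentation per level. The function names parse_xenstore_ls and missing_after_migration are hypothetical.

```python
def parse_xenstore_ls(text):
    """Turn indented `xenstore-ls` output into a {path: value} dict.

    Assumption: each nesting level is indented by exactly one space.
    """
    entries = {}
    stack = []  # path components of the current line and its ancestors
    for line in text.splitlines():
        if " = " not in line:
            continue  # skip blank or unrecognised lines
        depth = len(line) - len(line.lstrip(" "))
        key, _, value = line.strip().partition(" = ")
        del stack[depth:]  # drop components deeper than this line
        stack.append(key)
        entries["/".join(stack)] = value.strip('"')
    return entries


def missing_after_migration(before_text, after_text):
    """Return the sorted paths present before migration but absent after."""
    before = parse_xenstore_ls(before_text)
    after = parse_xenstore_ls(after_text)
    return sorted(set(before) - set(after))
```

Feeding it the contents of the two saved files should list the platform entries described above if they are indeed the ones that disappear.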

Expected results:

The platform ACPI details should be present in xenstore after a migration.

Actual results:

The platform ACPI details are missing in xenstore after a migration.

Notes:

On xen-4.2, a similar thing happens.

--xl.conf--

builder='hvm'
memory = 512
name = "416-vm"
vcpus=1
disk = [ 'tap:qcow2:/root/diana.qcow2,xvda,w' ]
vif = ['mac=00:16:3f:1d:6a:c0, bridge=defaultbr']
sdl=0
opengl=1
vnc=1
vnclisten="0.0.0.0"
vncdisplay=0
vncunused=0
vncpasswd='p'
stdvga=0
serial='pty'


* Re: HVM Migration of domU on Qemu-upstream DM loses ACPI data in xenstore
  2013-05-14 13:12 ` HVM Migration of domU on Qemu-upstream DM loses ACPI data in xenstore Diana Crisan
@ 2013-05-17 17:33   ` Ian Campbell
  2013-05-18  9:52     ` Alex Bligh
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Campbell @ 2013-05-17 17:33 UTC (permalink / raw)
  To: Diana Crisan; +Cc: xen-devel

On Tue, 2013-05-14 at 14:12 +0100, Diana Crisan wrote:
> Prior to issuing a migration the lines 'platform = ""', 'acpi = "1"',
> 'acpi_s3 = "1"' and 'acpi_s4 = "1"' are present in xenstore (despite
> the fact xl.conf does not explicitly specify them). However, after the
> migration succeeds on the receiving side, those lines are missing. 

Is the lack of these keys causing you a problem? IIRC they are used by
the builder to communicate with hvmloader (the pre-BIOS loader used in
HVM guests) so it can setup ACPI tables etc as appropriate. Nothing else
should be using them. They are documented as INTERNAL in
docs/misc/xenstore-paths.markdown.

Ian.


* Re: HVM Migration of domU on Qemu-upstream DM loses ACPI data in xenstore
  2013-05-17 17:33   ` Ian Campbell
@ 2013-05-18  9:52     ` Alex Bligh
  2013-05-18 11:02       ` Ian Campbell
  0 siblings, 1 reply; 9+ messages in thread
From: Alex Bligh @ 2013-05-18  9:52 UTC (permalink / raw)
  To: Ian Campbell, Diana Crisan; +Cc: Alex Bligh, xen-devel

Ian,

--On 17 May 2013 18:33:48 +0100 Ian Campbell <ian.campbell@citrix.com> 
wrote:

> On Tue, 2013-05-14 at 14:12 +0100, Diana Crisan wrote:
>> Prior to issuing a migration the lines 'platform = ""', 'acpi = "1"',
>> 'acpi_s3 = "1"' and 'acpi_s4 = "1"' are present in xenstore (despite
>> the fact xl.conf does not explicitly specify them). However, after the
>> migration succeeds on the receiving side, those lines are missing.
>
> Is the lack of these keys causing you a problem? IIRC they are used by
> the builder to communicate with hvmloader (the pre-BIOS loader used in
> HVM guests) so it can setup ACPI tables etc as appropriate. Nothing else
> should be using them. They are documented as INTERNAL in
> docs/misc/xenstore-paths.markdown.

(Diana is my colleague)

We don't know whether it causes a problem, but we were looking to
find something that might explain the stuck clock on migration
Diana reported alongside this on ACPI enabled hvm:
 http://lists.xen.org/archives/html/xen-devel/2013-05/msg01472.html

We figured if ACPI wasn't being set up right on the recipient (migrated)
domain, this might be the problem (given the stuck clock only appears
if you use ACPI).

How does the recipient upstream QEMU / Xen know whether to emulate
ACPI if this is not transferred?

-- 
Alex Bligh


* Re: HVM Migration of domU on Qemu-upstream DM loses ACPI data in xenstore
  2013-05-18  9:52     ` Alex Bligh
@ 2013-05-18 11:02       ` Ian Campbell
  2013-05-18 11:17         ` Alex Bligh
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Campbell @ 2013-05-18 11:02 UTC (permalink / raw)
  To: Alex Bligh; +Cc: Diana Crisan, xen-devel

On Sat, 2013-05-18 at 10:52 +0100, Alex Bligh wrote:
> Ian,
> 
> --On 17 May 2013 18:33:48 +0100 Ian Campbell <ian.campbell@citrix.com> 
> wrote:
> 
> > On Tue, 2013-05-14 at 14:12 +0100, Diana Crisan wrote:
> >> Prior to issuing a migration the lines 'platform = ""', 'acpi = "1"',
> >> 'acpi_s3 = "1"' and 'acpi_s4 = "1"' are present in xenstore (despite
> >> the fact xl.conf does not explicitly specify them). However, after the
> >> migration succeeds on the receiving side, those lines are missing.
> >
> > Is the lack of these keys causing you a problem? IIRC they are used by
> > the builder to communicate with hvmloader (the pre-BIOS loader used in
> > HVM guests) so it can setup ACPI tables etc as appropriate. Nothing else
> > should be using them. They are documented as INTERNAL in
> > docs/misc/xenstore-paths.markdown.
> 
> (Diana is my colleague)
> 
> We don't know whether it causes a problem, but we were looking to
> find something that might explain the stuck clock on migration
> Diana reported alongside this on ACPI enabled hvm:
>  http://lists.xen.org/archives/html/xen-devel/2013-05/msg01472.html
> 
> We figured if ACPI wasn't being set up right on the recipient (migrated)
> domain, this might be the problem (given the stuck clock only appears
> if you use ACPI).
> 
> How does the recipient upstream QEMU / Xen know whether to emulate
> ACPI if this is not transferred?

These keys have nothing to do with that, all they do is cause hvmloader
to expose ACPI tables to the guest or to tweak the content of those
tables. That state is preserved as part of the memory image of the
guest. The qemu state is also pickled as part of the save image.

ACPI is just a set of tables describing the hardware, there's no
"emulation" to turn off and on. Whatever magic I/O ports the ACPI AML
references are always on, the setting just controls whether the guest
gets to see that via the AML.

Ian.


* Re: HVM Migration of domU on Qemu-upstream DM loses ACPI data in xenstore
  2013-05-18 11:02       ` Ian Campbell
@ 2013-05-18 11:17         ` Alex Bligh
  2013-05-20  8:40           ` Ian Campbell
  0 siblings, 1 reply; 9+ messages in thread
From: Alex Bligh @ 2013-05-18 11:17 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Diana Crisan, Alex Bligh, xen-devel

Ian,

--On 18 May 2013 12:02:23 +0100 Ian Campbell <Ian.Campbell@citrix.com> 
wrote:

> These keys have nothing to do with that, all they do is cause hvmloader
> to expose ACPI tables to the guest or to tweak the content of those
> tables. That state is preserved as part of the memory image of the
> guest. The qemu state is also pickled as part of the save image.
>
> ACPI is just a set of tables describing the hardware, there's no
> "emulation" to turn off and on. Whatever magic I/O ports the ACPI AML
> references are always on, the setting just controls whether the guest
> gets to see that via the AML.

Thanks. So it could not be that the guest gets to see that via the AML
pre migration, but not post migration?

In that case I can only conclude that some part of the qemu state
is not migrating correctly, and the fact that the stuck clock
doesn't happen if ACPI is disabled in xl.conf is only relevant as
it influences how the guest (linux in this case) chooses its clock
source (i.e. it's broken in any case, just the guest does not notice
if the relevant ACPI tables aren't exposed).

Any ideas on how to debug this further? It is odd that the date command
(used to set a date) will unstick the clock.

-- 
Alex Bligh


* Re: HVM Migration of domU on Qemu-upstream DM loses ACPI data in xenstore
  2013-05-18 11:17         ` Alex Bligh
@ 2013-05-20  8:40           ` Ian Campbell
  2013-05-20 11:50             ` Alex Bligh
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Campbell @ 2013-05-20  8:40 UTC (permalink / raw)
  To: Alex Bligh; +Cc: Diana Crisan, xen-devel

On Sat, 2013-05-18 at 12:17 +0100, Alex Bligh wrote:
> Ian,
> 
> --On 18 May 2013 12:02:23 +0100 Ian Campbell <Ian.Campbell@citrix.com> 
> wrote:
> 
> > These keys have nothing to do with that, all they do is cause hvmloader
> > to expose ACPI tables to the guest or to tweak the content of those
> > tables. That state is preserved as part of the memory image of the
> > guest. The qemu state is also pickled as part of the save image.
> >
> > ACPI is just a set of tables describing the hardware, there's no
> > "emulation" to turn off and on. Whatever magic I/O ports the ACPI AML
> > references are always on, the setting just controls whether the guest
> > gets to see that via the AML.
> 
> Thanks. So it could not be that the guest gets to see that via the AML
> pre migration, but not post migration?

AML lives in guest RAM, so no.

> In that case I can only conclude that some part of the qemu state
> is not migrating correctly, and the fact that the stuck clock
> doesn't happen if ACPI is disabled in xl.conf is only relevant as
> it influences how the guest (linux in this case) chooses its clock
> source (i.e. it's broken in any case, just the guest does not notice
> if the relevant ACPI tables aren't exposed).

You could perhaps verify this somewhat by playing with the kernel's
clocksource= option.

Most (all?) of the clocks are actually emulated by the hypervisor rather
than qemu for performance reasons, but that state is also pickled over a
migrate.

> Any ideas on how to debug this further? It is odd that the date command
> (used to set a date) will unstick the clock.

I'd have thought that would only poke the RTC, but to be honest I'm not
sure.

Ian.


* Re: HVM Migration of domU on Qemu-upstream DM loses ACPI data in xenstore
  2013-05-20  8:40           ` Ian Campbell
@ 2013-05-20 11:50             ` Alex Bligh
  2013-05-20 12:01               ` Ian Campbell
  0 siblings, 1 reply; 9+ messages in thread
From: Alex Bligh @ 2013-05-20 11:50 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Diana Crisan, Alex Bligh, xen-devel

Ian,

--On 20 May 2013 09:40:01 +0100 Ian Campbell <Ian.Campbell@citrix.com> 
wrote:

> You could perhaps verify this somewhat by playing with the kernel's
> clocksource= option.

clocksource=[hpet|pit|tsc|acpi_pm|cyclone|scx200_hrt]

I'm guessing clocksource=tsc is the least dependent on 'other stuff'.

We'll have a play.

>> Any ideas on how to debug this further? It is odd that the date command
>> (used to set a date) will unstick the clock.
>
> I'd have thought that would only poke the RTC, but to be honest I'm not
> sure.

I'm pretty sure date itself only changes the wallclock (the clock command
normally being needed to write to the RTC). If the images are running
ntp, that may notice the wallclock change and write to CMOS I guess.

-- 
Alex Bligh


* Re: HVM Migration of domU on Qemu-upstream DM loses ACPI data in xenstore
  2013-05-20 11:50             ` Alex Bligh
@ 2013-05-20 12:01               ` Ian Campbell
  2013-05-21 10:08                 ` Diana Crisan
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Campbell @ 2013-05-20 12:01 UTC (permalink / raw)
  To: Alex Bligh; +Cc: Diana Crisan, xen-devel

On Mon, 2013-05-20 at 12:50 +0100, Alex Bligh wrote:
> Ian,
> 
> --On 20 May 2013 09:40:01 +0100 Ian Campbell <Ian.Campbell@citrix.com> 
> wrote:
> 
> > You could perhaps verify this somewhat by playing with the kernel's
> > clocksource= option.
> 
> clocksource=[hpet|pit|tsc|acpi_pm|cyclone|scx200_hrt]
> 
> I'm guessing clocksource=tsc is the least dependent on 'other stuff'.

It'd be a good one to start with. You should be able to confirm under
sysfs which one is used now, I'm guessing it is acpi_pm.

It'd be worth trying each of the first 4 though. cyclone and scx200 seem
a bit specific...
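
For reference, the sysfs check could be sketched in Python as below. This is an illustration only: it assumes the usual Linux location /sys/devices/system/clocksource/clocksource0 inside the guest, and the helper name read_clocksources is made up for this example.

```python
from pathlib import Path

# Usual sysfs directory for the kernel's clocksource selection (assumed
# to be present in the guest).
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources as reported by sysfs.

    `base` is a parameter so the function can be pointed at a copy of
    the directory; by default it reads the live sysfs files.
    """
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available
```

Calling read_clocksources() before and after a migrate would show whether the guest kept the clocksource requested on the kernel command line.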

> We'll have a play.
> 
> >> Any ideas on how to debug this further? It is odd that the date command
> >> (used to set a date) will unstick the clock.
> >
> > I'd have thought that would only poke the RTC, but to be honest I'm not
> > sure.
> 
> I'm pretty sure date itself only changes the wallclock (the clock command
> normally being needed to write to the RTC).

Yes, I think you are right.

>  If the images are running
> ntp, that may notice the wallclock change and write to CMOS I guess.


* Re: HVM Migration of domU on Qemu-upstream DM loses ACPI data in xenstore
  2013-05-20 12:01               ` Ian Campbell
@ 2013-05-21 10:08                 ` Diana Crisan
  0 siblings, 0 replies; 9+ messages in thread
From: Diana Crisan @ 2013-05-21 10:08 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Alex Bligh, xen-devel

Hello Ian,

>On Mon, 2013-05-20 at 12:50 +0100, Alex Bligh wrote:
>> Ian,
>> 
>> --On 20 May 2013 09:40:01 +0100 Ian Campbell <Ian.Campbell@citrix.com> 
>> wrote:
>> 
>> > You could perhaps verify this somewhat by playing with the kernel's
>> > clocksource= option.
>> 
>> clocksource=[hpet|pit|tsc|acpi_pm|cyclone|scx200_hrt]
>> 
>> I'm guessing clocksource=tsc is the least dependent on 'other stuff'.

>It'd be a good one to start with. You should be able to confirm under
>sysfs which one is used now, I'm guessing it is acpi_pm.

>It'd be worth trying each of the first 4 though. cyclone and scx200 seem
>a bit specific...

I tested the first 4 options with both xen 4.2 and xen 4.3. Please see my findings below:

Under 4.2:
tsc: cannot be used as it is not HRT compatible (can't switch while in HRT/NOHZ mode).
hpet: activated according to dmesg; after migration I got this error:
"i8042 no controller found; pm: device i8042 failed to restore: error -19"
and, possibly because of this error (??), the clock got stuck again.
pit: no error, but the request appears to have been ignored: dmesg shows the kernel reading my request to switch to clocksource=pit but then choosing the xen clocksource.


Under 4.3:
tsc: as above, not HRT compatible so cannot be used.
hpet: the clock still gets stuck after several migrations (3-4 migrations).
pit: as above, it is ignored.
acpi_pm: the problem seems easier to replicate with this clocksource; I got it on the first migrate.


Notes: the wallclock (date) and RTC (clock -r) on the hosts were in sync within a second.
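
The wallclock/RTC comparison in the note above can also be scripted. The sketch below is hypothetical and was not used for these tests: it assumes the RTC is exposed at /sys/class/rtc/rtc0/since_epoch and is kept in UTC, and the names rtc_offset_seconds and clocks_in_sync are invented for illustration.

```python
import time
from pathlib import Path

# sysfs attribute giving the RTC time in seconds since the epoch
# (assumed path; assumes the RTC is kept in UTC).
RTC_SINCE_EPOCH = Path("/sys/class/rtc/rtc0/since_epoch")


def rtc_offset_seconds(rtc_path=RTC_SINCE_EPOCH, now=None):
    """Return wallclock time minus RTC time, in seconds."""
    rtc = int(Path(rtc_path).read_text().strip())
    if now is None:
        now = time.time()  # system wallclock, as shown by `date`
    return now - rtc


def clocks_in_sync(rtc_path=RTC_SINCE_EPOCH, now=None, tolerance=1.0):
    """True if wallclock and RTC differ by at most `tolerance` seconds."""
    return abs(rtc_offset_seconds(rtc_path, now)) <= tolerance
```

A one-second tolerance matches the "within a second" observation; since the RTC value has one-second resolution, a tighter tolerance would not be meaningful.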


>> We'll have a play.
>> 
>> >> Any ideas on how to debug this further? It is odd that the date command
>> >> (used to set a date) will unstick the clock.
>> >
>> > I'd have thought that would only poke the RTC, but to be honest I'm not
>> > sure.
>> 
>> I'm pretty sure date itself only changes the wallclock (the clock command
>> normally being needed to write to the RTC).

>Yes, I think you are right.

>>  If the images are running
>> ntp, that may notice the wallclock change and write to CMOS I guess.


Please let me know if you need any more details.

-- 
Diana Crisan 

