* Re: [PATCH] xen: update machine_to_phys_order on resume
@ 2011-07-13 14:29 Jan Beulich
  2011-07-14 10:26 ` [Xen-devel] " Olaf Hering
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2011-07-13 14:29 UTC (permalink / raw)
  To: Ian.Campbell, konrad.wilk, keir; +Cc: olaf, xen-devel, linux-kernel



>>> Ian Campbell  07/13/11 11:12 AM >>>
>It's not so much an objection to this patch but this issue seems to have
>been caused by Xen cset 20892:d311d1efc25e which looks to me like a
>subtle ABI breakage for guests. Perhaps we should introduce a feature
>flag to indicate that a guest can cope with the m2p changing size over
>migration like this?

Indeed - migration was completely beyond my consideration when
submitting this. A feature flag seems the right way to go to me.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

* Re: [Xen-devel] [PATCH] xen: update machine_to_phys_order on resume
  2011-07-15 18:23 ` Keir Fraser
@ 2011-07-18  7:05 Jan Beulich
  2011-07-18  7:27 ` Keir Fraser
  -1 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2011-07-18  7:05 UTC (permalink / raw)
  To: Ian.Campbell, Keir Fraser, Konrad Rzeszutek Wilk, keir
  Cc: olaf, xen-devel, linux-kernel

>>> On 15.07.11 at 20:23, Keir Fraser <keir.xen@gmail.com> wrote:
> On 15/07/2011 18:30, "Jan Beulich" <JBeulich@novell.com> wrote:
> 
>> Actually, one more thought: What's the purpose of this hypercall if
>> it is set in stone what values it ought to return? Isn't a guest using
>> it (supposed to be) advertising that it can deal with the values being
>> variable (and it was just overlooked so far that this doesn't only
>> include varying values from boot to boot, but also migration)? Or in
>> other words, if we found a need to relocate the M2P table or grow
>> its static maximum size, it would be impossible to migrate guests
>> from an old to a new hypervisor.
> 
> Fair point. There has to be a static fallback set of return values for old
> guests.

Hmm, in my reading the two sentences sort of contradict each other.
That is, I'm not certain what route we want to go here: Keep things
the way they are after 23706:3dd399873c9e, and introduce a
completely new discovery mechanism if we find it necessary to change
the M2P table's location and/or size, including a mechanism for a guest
to announce it's capable of dealing with that? If so, I think we ought
to add a comment to the hypercall implementation documenting that
its return values must not be changed (and why).

Jan


* Re: [PATCH] xen: update machine_to_phys_order on resume
@ 2011-07-15 17:30 Jan Beulich
  2011-07-15 18:23 ` Keir Fraser
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2011-07-15 17:30 UTC (permalink / raw)
  To: Ian.Campbell, konrad.wilk, keir; +Cc: olaf, xen-devel, linux-kernel



>>> "Jan Beulich"  07/15/11 6:07 PM >>>
>>>> On 13.07.11 at 11:12, Ian Campbell  wrote:
>> It's not so much an objection to this patch but this issue seems to have
>> been caused by Xen cset 20892:d311d1efc25e which looks to me like a
>> subtle ABI breakage for guests. Perhaps we should introduce a feature
>> flag to indicate that a guest can cope with the m2p changing size over
>> migration like this?
>
>That's actually not straightforward, as the hypervisor can't see the ELF
>note specified features of a DomU kernel. Passing this information
>down from the tools or from the guest kernel itself otoh doesn't
>necessarily seem worth it. Instead a guest that can deal with the
>upper bound of the M2P table changing can easily obtain the
>desired information through XENMEM_maximum_ram_page. So I
>think on the hypervisor side we're good with the patch I sent
>earlier today.

Actually, one more thought: What's the purpose of this hypercall if
it is set in stone what values it ought to return? Isn't a guest using
it (supposed to be) advertising that it can deal with the values being
variable (and it was just overlooked so far that this doesn't only
include varying values from boot to boot, but also migration)? Or in
other words, if we found a need to relocate the M2P table or grow
its static maximum size, it would be impossible to migrate guests
from an old to a new hypervisor.

Jan




* migration of pv guest fails from small to large host
@ 2011-07-01 10:41 Olaf Hering
  2011-07-12 16:43 ` [PATCH] xen: update machine_to_phys_order on resume Olaf Hering
  0 siblings, 1 reply; 9+ messages in thread
From: Olaf Hering @ 2011-07-01 10:41 UTC (permalink / raw)
  To: xen-devel

This issue was initially reported to happen on different sized HP ProLiant
systems running SLES11SP1 on dom0 and domU.

Migration of pv guests fails: the guest crashes on the target host once it is
unpaused after transit. It happens when the guest is started on a small system
and then migrated to a large system. If the guest is started on a large system,
migrated to a small system and then back to the large system, the migration
succeeds.


The symptoms on the target host differ with the systems I have access to,
which are listed below. It is not possible to take a core dump.
The pv guest has one vcpu and 256MB, one network interface and a disk.


I currently have no idea what to look for. The xenctx patch for dumping
pagetables showed no differences between the src/dst guest after transit to the
target host (I still have to verify this on my hosts).


involved hardware:

bolen: ProLiant DL580 G7, 32GB, CPU E7540 @ 2.00GHz
falla: ProLiant DL360 G6, 8GB, CPU E5540 @ 2.53GHz 
drnek: ProLiant DL170h G6, 6GB, CPU E5504 @ 2.00GHz
gubaidulina: Intel SDV S3E37, 192GB, CPU 000 @ 2.40GHz (unknown cpu 0x206f1)

(Other target hosts from different vendors with large amounts of memory were reported to fail as well.)
I am still trying to test a non-HP system as the source host.

involved software:
host: sles11sp1, xen 4.0; also xen-unstable 4.2, hg rev 23640
pv guest: sles11sp1


migration with this command on bolen, falla, drnek:
"xm migrate sles11sp1_para_1 gubaidulina" fails on gubaidulina:

[2011-06-30 21:21:32 21858] WARNING (XendDomainInfo:2061) Domain has crashed: name=sles11sp1_para_1 id=1.
[2011-06-30 21:21:32 21858] ERROR (XendDomainInfo:2318) core dump failed: id = 1 name = sles11sp1_para_1: (1, 'Internal error', "Couldn't map p2m_frame_list_list (errno 1) (1 = Operation not permitted)")
[2011-06-30 21:21:32 21858] DEBUG (XendDomainInfo:3084) XendDomainInfo.destroy: domid=1
[2011-06-30 21:21:32 21858] DEBUG (XendDomainInfo:2403) Destroying device model
[2011-06-30 21:21:32 21858] INFO (image:702) sles11sp1_para_1 device model terminated

xm dmesg shows no errors.

notes from a "bisect" with limiting Xen memory:
gubaidulina booted with mem=64G, migration from bolen succeeds.
gubaidulina booted with mem=96G, migration from bolen fails.
gubaidulina booted with mem=80G, migration from bolen fails.
gubaidulina booted with mem=72G, migration from bolen fails.
gubaidulina booted with mem=68G, migration from bolen fails.
gubaidulina booted with mem=65G, migration from bolen succeeds.
further testing after migration:
gubaidulina booted with mem=66G, migration from bolen fails, no coredump message, no coredump
gubaidulina booted with mem=66G, second migration from bolen succeeds. xm shutdown crashes guest, no coredump
gubaidulina booted with mem=65G, migration from bolen succeeds. xm shutdown succeeds

Olaf


Thread overview: 9+ messages
2011-07-13 14:29 [PATCH] xen: update machine_to_phys_order on resume Jan Beulich
2011-07-14 10:26 ` [Xen-devel] " Olaf Hering
2011-07-15  8:32   ` Jan Beulich
2011-07-15  8:32     ` Jan Beulich
  -- strict thread matches above, loose matches on Subject: below --
2011-07-18  7:05 Jan Beulich
2011-07-18  7:27 ` Keir Fraser
2011-07-18  8:31   ` [Xen-devel] " Ian Campbell
2011-07-18  8:47     ` Jan Beulich
2011-07-15 17:30 Jan Beulich
2011-07-15 18:23 ` Keir Fraser
2011-07-01 10:41 migration of pv guest fails from small to large host Olaf Hering
2011-07-12 16:43 ` [PATCH] xen: update machine_to_phys_order on resume Olaf Hering
