* support for more than 32 VCPUs when migrating PVHVM guest
@ 2015-02-02 10:47 Vitaly Kuznetsov
  2015-02-02 10:58 ` Andrew Cooper
  0 siblings, 1 reply; 5+ messages in thread
From: Vitaly Kuznetsov @ 2015-02-02 10:47 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

Hi Konrad,

I just hit an issue with PVHVM guests after save/restore (or migration):
if a PVHVM guest has > 32 VCPUs, it hangs. It turns out you saw this
almost a year ago and even wrote patches to call
VCPUOP_register_vcpu_info after resume. Unfortunately, these patches
never made it into the kernel. Do you have a plan to pick this up? What
were the arguments against your suggestion?

Thanks,

-- 
  Vitaly


* Re: support for more than 32 VCPUs when migrating PVHVM guest
  2015-02-02 10:47 support for more than 32 VCPUs when migrating PVHVM guest Vitaly Kuznetsov
@ 2015-02-02 10:58 ` Andrew Cooper
  2015-02-02 11:03   ` Vitaly Kuznetsov
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Cooper @ 2015-02-02 10:58 UTC (permalink / raw)
  To: Vitaly Kuznetsov, Konrad Rzeszutek Wilk; +Cc: xen-devel

On 02/02/15 10:47, Vitaly Kuznetsov wrote:
> Hi Konrad,
>
> I just hit an issue with PVHVM guests after save/restore (or migration):
> if a PVHVM guest has > 32 VCPUs, it hangs. It turns out you saw this
> almost a year ago and even wrote patches to call
> VCPUOP_register_vcpu_info after resume. Unfortunately, these patches
> never made it into the kernel. Do you have a plan to pick this up? What
> were the arguments against your suggestion?

32 VCPUs is the legacy limit for HVM guests, but should not have any
remaining artefacts these days.
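
(For context, the limit comes from the legacy shared_info layout; the
sketch below is paraphrased from memory of the public headers rather
than copied verbatim, so treat the details as approximate.)

/* Only the first 32 VCPUs get a vcpu_info slot in the shared_info page;
 * any VCPU beyond that has to relocate its vcpu_info with
 * VCPUOP_register_vcpu_info before it can receive events. */
#define XEN_LEGACY_MAX_VCPUS 32                 /* x86 value */

struct shared_info {
    struct vcpu_info vcpu_info[XEN_LEGACY_MAX_VCPUS];  /* VCPUs 0..31 only */
    /* ... event channel bitmaps, wallclock, arch-specific fields ... */
};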

Do you know why the hang occurs?  I can't spot anything in the legacy
migration code which would enforce such a limit.

What is the subject of the thread you reference so I can search for it?

~Andrew


* Re: support for more than 32 VCPUs when migrating PVHVM guest
  2015-02-02 10:58 ` Andrew Cooper
@ 2015-02-02 11:03   ` Vitaly Kuznetsov
  2015-02-02 14:21     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 5+ messages in thread
From: Vitaly Kuznetsov @ 2015-02-02 11:03 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel

Andrew Cooper <andrew.cooper3@citrix.com> writes:

> On 02/02/15 10:47, Vitaly Kuznetsov wrote:
>> Hi Konrad,
>>
>> I just hit an issue with PVHVM guests after save/restore (or migration):
>> if a PVHVM guest has > 32 VCPUs, it hangs. It turns out you saw this
>> almost a year ago and even wrote patches to call
>> VCPUOP_register_vcpu_info after resume. Unfortunately, these patches
>> never made it into the kernel. Do you have a plan to pick this up? What
>> were the arguments against your suggestion?
>
> 32 VCPUs is the legacy limit for HVM guests, but should not have any
> remaining artefacts these days.
>
> Do you know why the hang occurs?  I can't spot anything in the legacy
> migration code which would enforce such a limit.
>
> What is the subject of the thread you reference so I can search for it?
>

Sorry, I should have sent the link:

http://lists.xen.org/archives/html/xen-devel/2014-04/msg00794.html

Konrad's patches:

http://lists.xen.org/archives/html/xen-devel/2014-04/msg01199.html

The issue is that we don't call VCPUOP_register_vcpu_info after
suspend/resume (or migration) and it is mandatory.
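
A minimal sketch of what the re-registration might look like on the
Linux side (illustrative only, not Konrad's actual patch; the per-cpu
variable name, includes and error handling are simplified from memory):
after restore the hypervisor has no record of where each VCPU's
vcpu_info was registered, and VCPUs above the legacy limit of 32 have no
fallback slot in shared_info, so every VCPU has to redo something like:

#include <xen/interface/vcpu.h>         /* struct vcpu_register_vcpu_info */
#include <asm/xen/hypercall.h>          /* HYPERVISOR_vcpu_op()           */

/* Sketch: point Xen at this VCPU's vcpu_info again after resume. */
static int xen_reregister_vcpu_info(int cpu)
{
    struct vcpu_info *vcpup = &per_cpu(xen_vcpu_info, cpu);
    struct vcpu_register_vcpu_info info = {
        .mfn    = arbitrary_virt_to_mfn(vcpup),
        .offset = offset_in_page(vcpup),
    };

    /* Without this, VCPUs >= 32 are left with no usable vcpu_info
     * and never see another event/IPI - hence the hang. */
    return HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
}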

-- 
  Vitaly


* Re: support for more than 32 VCPUs when migrating PVHVM guest
  2015-02-02 11:03   ` Vitaly Kuznetsov
@ 2015-02-02 14:21     ` Konrad Rzeszutek Wilk
  2015-02-03  9:38       ` Vitaly Kuznetsov
  0 siblings, 1 reply; 5+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-02-02 14:21 UTC (permalink / raw)
  To: Vitaly Kuznetsov; +Cc: Andrew Cooper, xen-devel

On Mon, Feb 02, 2015 at 12:03:28PM +0100, Vitaly Kuznetsov wrote:
> Andrew Cooper <andrew.cooper3@citrix.com> writes:
> 
> > On 02/02/15 10:47, Vitaly Kuznetsov wrote:
> >> Hi Konrad,
> >>
> >> I just hit an issue with PVHVM guests after save/restore (or migration):
> >> if a PVHVM guest has > 32 VCPUs, it hangs. It turns out you saw this
> >> almost a year ago and even wrote patches to call
> >> VCPUOP_register_vcpu_info after resume. Unfortunately, these patches
> >> never made it into the kernel. Do you have a plan to pick this up? What
> >> were the arguments against your suggestion?
> >
> > 32 VCPUs is the legacy limit for HVM guests, but should not have any
> > remaining artefacts these days.
> >
> > Do you know why the hang occurs?  I can't spot anything in the legacy
> > migration code which would enforce such a limit.
> >
> > What is the subject of the thread you reference so I can search for it?
> >
> 
> Sorry, I should have sent the link:
> 
> http://lists.xen.org/archives/html/xen-devel/2014-04/msg00794.html
> 
> Konrad's patches:
> 
> http://lists.xen.org/archives/html/xen-devel/2014-04/msg01199.html
> 
> The issue is that we don't call VCPUOP_register_vcpu_info after
> suspend/resume (or migration) and it is mandatory.

The issue I saw was that with that enabled (which is what Jan requested)
everything seemed to work - except that I... ah, here it is:

http://lists.xen.org/archives/html/xen-devel/2014-04/msg02875.html
err:

http://lists.xen.org/archives/html/xen-devel/2014-04/msg02945.html

	> The VCPUOP_send_nmi did cause the HVM guest to get an NMI and it spat out
	> 'Dazed and confused'. The guest also noticed corruption:
	> 
	> [    3.611742] Corrupted low memory at c000fffc (fffc phys) = 00029b00
	> [    2.386785] Corrupted low memory at ffff88000000fff8 (fff8 phys) = 
	> 2990000000000
	> 
	> Which is odd, because there does not seem to be anything in the
	> hypervisor's path that would cause this.

	Indeed. This looks a little like a segment descriptor got modified here
	with a descriptor table base of zero and a selector of 0xfff8. That
	corruption needs to be hunted down in any case before enabling
	VCPUOP_send_nmi for HVM.
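
(For reference, the NMI in that experiment would have been injected via
the plain vcpu_op hypercall - roughly the call below; this is purely
illustrative and may not match the exact caller used in the original
thread.)

	/* Ask Xen to deliver an NMI to the given VCPU; no argument struct. */
	rc = HYPERVISOR_vcpu_op(VCPUOP_send_nmi, cpu, NULL);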


I did not get a chance to "hunt down" that pesky issue. That is the only
thing holding up this patchset.

Said patch is in my queue of patches to upstream (amongst 30 other ones).
I am working through the reviews/issues, but it will take me quite some
time - so if you feel like taking a stab at this, please do!


* Re: support for more than 32 VCPUs when migrating PVHVM guest
  2015-02-02 14:21     ` Konrad Rzeszutek Wilk
@ 2015-02-03  9:38       ` Vitaly Kuznetsov
  0 siblings, 0 replies; 5+ messages in thread
From: Vitaly Kuznetsov @ 2015-02-03  9:38 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Andrew Cooper, xen-devel

Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> writes:

> On Mon, Feb 02, 2015 at 12:03:28PM +0100, Vitaly Kuznetsov wrote:
>> Andrew Cooper <andrew.cooper3@citrix.com> writes:
>> 
>> > On 02/02/15 10:47, Vitaly Kuznetsov wrote:
>> >> Hi Konrad,
>> >>
>> >> I just hit an issue with PVHVM guests after save/restore (or migration):
>> >> if a PVHVM guest has > 32 VCPUs, it hangs. It turns out you saw this
>> >> almost a year ago and even wrote patches to call
>> >> VCPUOP_register_vcpu_info after resume. Unfortunately, these patches
>> >> never made it into the kernel. Do you have a plan to pick this up? What
>> >> were the arguments against your suggestion?
>> >
>> > 32 VCPUs is the legacy limit for HVM guests, but should not have any
>> > remaining artefacts these days.
>> >
>> > Do you know why the hang occurs?  I can't spot anything in the legacy
>> > migration code which would enforce such a limit.
>> >
>> > What is the subject of the thread you reference so I can search for it?
>> >
>> 
>> Sorry, I should have sent the link:
>> 
>> http://lists.xen.org/archives/html/xen-devel/2014-04/msg00794.html
>> 
>> Konrad's patches:
>> 
>> http://lists.xen.org/archives/html/xen-devel/2014-04/msg01199.html
>> 
>> The issue is that we don't call VCPUOP_register_vcpu_info after
>> suspend/resume (or migration) and it is mandatory.
>
> The issue I saw was that with that enabled (which is what Jan requested)
> everything seemed to work - except that I... ah, here it is:
>
> http://lists.xen.org/archives/html/xen-devel/2014-04/msg02875.html
> err:
>
> http://lists.xen.org/archives/html/xen-devel/2014-04/msg02945.html
>
> 	> The VCPUOP_send_nmi did cause the HVM guest to get an NMI and it spat out
> 	> 'Dazed and confused'. The guest also noticed corruption:
> 	> 
> 	> [    3.611742] Corrupted low memory at c000fffc (fffc phys) = 00029b00
> 	> [    2.386785] Corrupted low memory at ffff88000000fff8 (fff8 phys) = 
> 	> 2990000000000
> 	> 
> 	> Which is odd, because there does not seem to be anything in the
> 	> hypervisor's path that would cause this.
>
> 	Indeed. This looks a little like a segment descriptor got modified here
> 	with a descriptor table base of zero and a selector of 0xfff8. That
> 	corruption needs to be hunted down in any case before enabling
> 	VCPUOP_send_nmi for HVM.
>
> I did not get a chance to "hunt down" that pesky issue. That is the only
> thing holding up this patchset.
>
> Said patch is in my queue of patches to upstream (amongst 30 other ones).
> I am working through the reviews/issues, but it will take me quite some
> time - so if you feel like taking a stab at this, please do!

Thanks for summing this up for me. If anything pops up wrt this
corruption issue, I'll report back.

-- 
  Vitaly
