* xen-4.7
From: Stefan Bader @ 2016-08-25 15:48 UTC (permalink / raw)
  To: xen-devel; +Cc: Juergen Gross



When I try to save a PV guest with 4G of memory using xen-4.7 I get the
following error:

II: Guest memory 4096 MB
II: Saving guest state to file...
Saving to /tmp/pvguest.save new xl format (info 0x3/0x0/1131)
xc: info: Saving domain 23, type x86 PV
xc: error: Bad mfn in p2m_frame_list[0]: Internal error
xc: error: mfn 0x4eb1c8, max 0x820000: Internal error
xc: error:   m2p[0x4eb1c8] = 0xff7c8, max_pfn 0xbffff: Internal error
xc: error: Save failed (34 = Numerical result out of range): Internal error
libxl: error: libxl_stream_write.c:355:libxl__xc_domain_save_done: saving
domain: domain did not respond to suspend request: Numerical result out of range
Failed to save domain, resuming domain
xc: error: Dom 23 not suspended: (shutdown 0, reason 255): Internal error
libxl: error: libxl_dom_suspend.c:460:libxl__domain_resume: xc_domain_resume
failed for domain 23: Invalid argument
EE: Guest not off after save!
FAIL

From dmesg inside the guest:
[    0.000000] e820: last_pfn = 0x100000 max_arch_pfn = 0x400000000

Somehow I am slightly suspicious about

commit 91e204d37f44913913776d0a89279721694f8b32
  libxc: try to find last used pfn when migrating

since that seems to potentially lower ctx->x86_pv.max_pfn which is checked
against in mfn_in_pseudophysmap(). Is that a known problem?
With xen-4.6 and the same dom0/guest kernel version combination this does work.

-Stefan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: xen-4.7 regression when saving a PV guest
From: Stefan Bader @ 2016-08-25 16:06 UTC (permalink / raw)
  To: xen-devel



Sorry for the incomplete subject. Got interrupted while writing the email and
then forgot to complete it... :/

On 25.08.2016 17:48, Stefan Bader wrote:
> When I try to save a PV guest with 4G of memory using xen-4.7 I get the
> following error:
> 
> II: Guest memory 4096 MB
> II: Saving guest state to file...
> Saving to /tmp/pvguest.save new xl format (info 0x3/0x0/1131)
> xc: info: Saving domain 23, type x86 PV
> xc: error: Bad mfn in p2m_frame_list[0]: Internal error
> xc: error: mfn 0x4eb1c8, max 0x820000: Internal error
> xc: error:   m2p[0x4eb1c8] = 0xff7c8, max_pfn 0xbffff: Internal error
> xc: error: Save failed (34 = Numerical result out of range): Internal error
> libxl: error: libxl_stream_write.c:355:libxl__xc_domain_save_done: saving
> domain: domain did not respond to suspend request: Numerical result out of range
> Failed to save domain, resuming domain
> xc: error: Dom 23 not suspended: (shutdown 0, reason 255): Internal error
> libxl: error: libxl_dom_suspend.c:460:libxl__domain_resume: xc_domain_resume
> failed for domain 23: Invalid argument
> EE: Guest not off after save!
> FAIL
> 
> From dmesg inside the guest:
> [    0.000000] e820: last_pfn = 0x100000 max_arch_pfn = 0x400000000
> 
> Somehow I am slightly suspicious about
> 
> commit 91e204d37f44913913776d0a89279721694f8b32
>   libxc: try to find last used pfn when migrating
> 
> since that seems to potentially lower ctx->x86_pv.max_pfn which is checked
> against in mfn_in_pseudophysmap(). Is that a known problem?
> With xen-4.6 and the same dom0/guest kernel version combination this does work.
> 
> -Stefan
> 






* Re: xen-4.7
From: Juergen Gross @ 2016-08-25 17:31 UTC (permalink / raw)
  To: Stefan Bader, xen-devel

On 25/08/16 17:48, Stefan Bader wrote:
> When I try to save a PV guest with 4G of memory using xen-4.7 I get the
> following error:
> 
> II: Guest memory 4096 MB
> II: Saving guest state to file...
> Saving to /tmp/pvguest.save new xl format (info 0x3/0x0/1131)
> xc: info: Saving domain 23, type x86 PV
> xc: error: Bad mfn in p2m_frame_list[0]: Internal error

So the first mfn of the memory containing the p2m information is bogus.
Weird.

> xc: error: mfn 0x4eb1c8, max 0x820000: Internal error
> xc: error:   m2p[0x4eb1c8] = 0xff7c8, max_pfn 0xbffff: Internal error
> xc: error: Save failed (34 = Numerical result out of range): Internal error
> libxl: error: libxl_stream_write.c:355:libxl__xc_domain_save_done: saving
> domain: domain did not respond to suspend request: Numerical result out of range
> Failed to save domain, resuming domain
> xc: error: Dom 23 not suspended: (shutdown 0, reason 255): Internal error
> libxl: error: libxl_dom_suspend.c:460:libxl__domain_resume: xc_domain_resume
> failed for domain 23: Invalid argument
> EE: Guest not off after save!
> FAIL
> 
> From dmesg inside the guest:
> [    0.000000] e820: last_pfn = 0x100000 max_arch_pfn = 0x400000000
> 
> Somehow I am slightly suspicious about
> 
> commit 91e204d37f44913913776d0a89279721694f8b32
>   libxc: try to find last used pfn when migrating
> 
> since that seems to potentially lower ctx->x86_pv.max_pfn which is checked
> against in mfn_in_pseudophysmap(). Is that a known problem?
> With xen-4.6 and the same dom0/guest kernel version combination this does work.

Can you please share some more information? Especially:

- guest kernel version?
- any patches in kernel not being upstream, especially in Xen-specific
  boot path?
- dmesg from guest with E820 map?
- guest configuration?

The same error would occur when trying to live migrate the guest. And
this has been tested a lot since the above commit, so I suspect something
is very special in your case.


Juergen



* Re: xen-4.7 regression when saving a pv guest
From: Stefan Bader @ 2016-08-26 10:52 UTC (permalink / raw)
  To: Juergen Gross, xen-devel



On 25.08.2016 19:31, Juergen Gross wrote:
> On 25/08/16 17:48, Stefan Bader wrote:
>> When I try to save a PV guest with 4G of memory using xen-4.7 I get the
>> following error:
>>
>> II: Guest memory 4096 MB
>> II: Saving guest state to file...
>> Saving to /tmp/pvguest.save new xl format (info 0x3/0x0/1131)
>> xc: info: Saving domain 23, type x86 PV
>> xc: error: Bad mfn in p2m_frame_list[0]: Internal error
> 
> So the first mfn of the memory containing the p2m information is bogus.
> Weird.

Hm, not sure how bogus. From below the first mfn is 0x4eb1c8 and points to
pfn=0xff7c8 which is above the current max of 0xbffff. But then the dmesg inside
the guest said: "last_pfn = 0x100000" which would be larger than the pfn causing
the error.

> 
>> xc: error: mfn 0x4eb1c8, max 0x820000: Internal error
>> xc: error:   m2p[0x4eb1c8] = 0xff7c8, max_pfn 0xbffff: Internal error
>> xc: error: Save failed (34 = Numerical result out of range): Internal error
>> libxl: error: libxl_stream_write.c:355:libxl__xc_domain_save_done: saving
>> domain: domain did not respond to suspend request: Numerical result out of range
>> Failed to save domain, resuming domain
>> xc: error: Dom 23 not suspended: (shutdown 0, reason 255): Internal error
>> libxl: error: libxl_dom_suspend.c:460:libxl__domain_resume: xc_domain_resume
>> failed for domain 23: Invalid argument
>> EE: Guest not off after save!
>> FAIL
>>
>> From dmesg inside the guest:
>> [    0.000000] e820: last_pfn = 0x100000 max_arch_pfn = 0x400000000
>>
>> Somehow I am slightly suspicious about
>>
>> commit 91e204d37f44913913776d0a89279721694f8b32
>>   libxc: try to find last used pfn when migrating
>>
>> since that seems to potentially lower ctx->x86_pv.max_pfn which is checked
>> against in mfn_in_pseudophysmap(). Is that a known problem?
>> With xen-4.6 and the same dom0/guest kernel version combination this does work.
> 
> Can you please share some more information? Especially:
> 
> - guest kernel version?
Hm, apparently 4.4 and 4.6 with stable updates. I just tried a much older guest
kernel (3.2) environment and that works. So it is the combination of switching
from xen-4.6 to 4.7 and guest kernels running 4.4 and later. And while the exact
mfn/pfn which gets dumped varies a little, the offending mapping always points
to 0xffxxx which would be below last_pfn.

Xen version		4.6		4.7
Guest Kernel
3.13.x			ok		ok
4.2.x			ok		ok
4.4.15			ok		fail
4.6.7			ok		fail

I will try 4.7 and 4.8 based guest kernels with xen-4.7 in a bit, too.

> - any patches in kernel not being upstream, especially in Xen-specific
None I know of.

>   boot path?
With affected kernels both direct kernel load and pvgrub.

> - dmesg from guest with E820 map?

From 4.4.x kernel:
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.000000] e820: last_pfn = 0x100000 max_arch_pfn = 0x400000000
[    0.000000] e820: cannot find a gap in the 32bit address range
               e820: PCI devices with unassigned 32bit BARs may break!
[    0.000000] e820: [mem 0x100100000-0x1004fffff] available for PCI devices

Old 3.13 kernel (I see nothing different here):
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.000000] e820: last_pfn = 0x100000 max_arch_pfn = 0x400000000
[    0.000000] e820: cannot find a gap in the 32bit address range
[    0.000000] e820: PCI devices with unassigned 32bit BARs may break!
[    0.000000] e820: [mem 0x100100000-0x1004fffff] available for PCI devices

> - guest configuration?

Rather simple (some of it is for historic reasons; I also tried an externally
supplied kernel and initrd):

name     = "testpv"
kernel   = "/root/boot/pv-grub-hd0--x86_64.gz"
memory   = 4096
vcpus    = 4
disk     = [
 		'file:/root/img/testpv.img,xvda1,w'
]
vif      = [ 'mac=xx:xx:xx:xx:xx:xx, bridge=br0' ]
on_crash = "coredump-destroy"

> 
> The same error would occur when trying to live migrate the guest. And
> this has been tested a lot since above commit, so I suspect something
> is very special in your case.
> 
> 
> Juergen
> 






* Re: xen-4.7 regression when saving a pv guest
From: Juergen Gross @ 2016-08-26 11:53 UTC (permalink / raw)
  To: Stefan Bader, xen-devel, Ian Jackson

On 26/08/16 12:52, Stefan Bader wrote:
> On 25.08.2016 19:31, Juergen Gross wrote:
>> On 25/08/16 17:48, Stefan Bader wrote:
>>> When I try to save a PV guest with 4G of memory using xen-4.7 I get the
>>> following error:
>>>
>>> II: Guest memory 4096 MB
>>> II: Saving guest state to file...
>>> Saving to /tmp/pvguest.save new xl format (info 0x3/0x0/1131)
>>> xc: info: Saving domain 23, type x86 PV
>>> xc: error: Bad mfn in p2m_frame_list[0]: Internal error
>>
>> So the first mfn of the memory containing the p2m information is bogus.
>> Weird.
> 
> Hm, not sure how bogus. From below the first mfn is 0x4eb1c8 and points to
> pfn=0xff7c8 which is above the current max of 0xbffff. But then the dmesg inside
> the guest said: "last_pfn = 0x100000" which would be larger than the pfn causing
> the error.
> 
>>
>>> xc: error: mfn 0x4eb1c8, max 0x820000: Internal error
>>> xc: error:   m2p[0x4eb1c8] = 0xff7c8, max_pfn 0xbffff: Internal error
>>> xc: error: Save failed (34 = Numerical result out of range): Internal error
>>> libxl: error: libxl_stream_write.c:355:libxl__xc_domain_save_done: saving
>>> domain: domain did not respond to suspend request: Numerical result out of range
>>> Failed to save domain, resuming domain
>>> xc: error: Dom 23 not suspended: (shutdown 0, reason 255): Internal error
>>> libxl: error: libxl_dom_suspend.c:460:libxl__domain_resume: xc_domain_resume
>>> failed for domain 23: Invalid argument
>>> EE: Guest not off after save!
>>> FAIL
>>>
>>> From dmesg inside the guest:
>>> [    0.000000] e820: last_pfn = 0x100000 max_arch_pfn = 0x400000000
>>>
>>> Somehow I am slightly suspicious about
>>>
>>> commit 91e204d37f44913913776d0a89279721694f8b32
>>>   libxc: try to find last used pfn when migrating
>>>
>>> since that seems to potentially lower ctx->x86_pv.max_pfn which is checked
>>> against in mfn_in_pseudophysmap(). Is that a known problem?
>>> With xen-4.6 and the same dom0/guest kernel version combination this does work.
>>
>> Can you please share some more information? Especially:
>>
>> - guest kernel version?
> Hm, apparently 4.4 and 4.6 with stable updates. I just tried a much older guest
> kernel (3.2) environment and that works. So it is the combination of switching
> from xen-4.6 to 4.7 and guest kernels running 4.4 and later. And while the exact
> mfn/pfn which gets dumped varies a little, the offending mapping always points
> to 0xffxxx which would be below last_pfn.

Aah, okay. The problem seems to be specific to the linear p2m list
handling.

Trying on my system... Yep, seeing your problem, too.

Weird that nobody else stumbled over it.
Ian, don't we have any test in OSSTEST which should catch this problem?
A 4GB 64-bit pv-domain with Linux kernel 4.3 or newer can't be saved
currently.

Following upstream patch fixes it for me:

diff --git a/tools/libxc/xc_sr_save_x86_pv.c b/tools/libxc/xc_sr_save_x86_pv.c
index 4a29460..7043409 100644
--- a/tools/libxc/xc_sr_save_x86_pv.c
+++ b/tools/libxc/xc_sr_save_x86_pv.c
@@ -430,6 +430,8 @@ static int map_p2m_list(struct xc_sr_context *ctx, uint64_t p2m_cr3)

         if ( level == 2 )
         {
+            if ( saved_idx == idx_end )
+                saved_idx++;
             max_pfn = ((xen_pfn_t)saved_idx << 9) * fpp - 1;
             if ( max_pfn < ctx->x86_pv.max_pfn )
             {


Juergen



* Re: xen-4.7 regression when saving a pv guest
From: Ian Jackson @ 2016-08-26 12:11 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel, Stefan Bader

Juergen Gross writes ("Re: xen-4.7 regression when saving a pv guest"):
> Weird that nobody else stumbled over it.
> Ian, don't we have any test in OSSTEST which should catch this problem?
> A 4GB 64-bit pv-domain with Linux kernel 4.3 or newer can't be saved
> currently.

I don't think we have any such tests right now.

It would probably be worth putting in some tests of larger domains.
At least some of our test boxes could cope...

Ian.



* Re: xen-4.7 regression when saving a pv guest
From: Juergen Gross @ 2016-08-26 12:23 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Stefan Bader

On 26/08/16 14:11, Ian Jackson wrote:
> Juergen Gross writes ("Re: xen-4.7 regression when saving a pv guest"):
>> Weird that nobody else stumbled over it.
>> Ian, don't we have any test in OSSTEST which should catch this problem?
>> A 4GB 64-bit pv-domain with Linux kernel 4.3 or newer can't be saved
>> currently.
> 
> I don't think we have any such tests right now.
> 
> It would probably be worth putting in some tests of larger domains.
> At least some of our test boxes could cope...

The buggy code suggests any domain larger than 1GB will hit the problem.
Just tested it with 1025MB and voilà: xl save failed.


Juergen




* Re: xen-4.7 regression when saving a pv guest
From: Stefan Bader @ 2016-08-26 12:55 UTC (permalink / raw)
  To: Juergen Gross, xen-devel, Ian Jackson



On 26.08.2016 13:53, Juergen Gross wrote:
> On 26/08/16 12:52, Stefan Bader wrote:
>> On 25.08.2016 19:31, Juergen Gross wrote:
>>> On 25/08/16 17:48, Stefan Bader wrote:
>>>> When I try to save a PV guest with 4G of memory using xen-4.7 I get the
>>>> following error:
>>>>
>>>> II: Guest memory 4096 MB
>>>> II: Saving guest state to file...
>>>> Saving to /tmp/pvguest.save new xl format (info 0x3/0x0/1131)
>>>> xc: info: Saving domain 23, type x86 PV
>>>> xc: error: Bad mfn in p2m_frame_list[0]: Internal error
>>>
>>> So the first mfn of the memory containing the p2m information is bogus.
>>> Weird.
>>
>> Hm, not sure how bogus. From below the first mfn is 0x4eb1c8 and points to
>> pfn=0xff7c8 which is above the current max of 0xbffff. But then the dmesg inside
>> the guest said: "last_pfn = 0x100000" which would be larger than the pfn causing
>> the error.
>>
>>>
>>>> xc: error: mfn 0x4eb1c8, max 0x820000: Internal error
>>>> xc: error:   m2p[0x4eb1c8] = 0xff7c8, max_pfn 0xbffff: Internal error
>>>> xc: error: Save failed (34 = Numerical result out of range): Internal error
>>>> libxl: error: libxl_stream_write.c:355:libxl__xc_domain_save_done: saving
>>>> domain: domain did not respond to suspend request: Numerical result out of range
>>>> Failed to save domain, resuming domain
>>>> xc: error: Dom 23 not suspended: (shutdown 0, reason 255): Internal error
>>>> libxl: error: libxl_dom_suspend.c:460:libxl__domain_resume: xc_domain_resume
>>>> failed for domain 23: Invalid argument
>>>> EE: Guest not off after save!
>>>> FAIL
>>>>
>>>> From dmesg inside the guest:
>>>> [    0.000000] e820: last_pfn = 0x100000 max_arch_pfn = 0x400000000
>>>>
>>>> Somehow I am slightly suspicious about
>>>>
>>>> commit 91e204d37f44913913776d0a89279721694f8b32
>>>>   libxc: try to find last used pfn when migrating
>>>>
>>>> since that seems to potentially lower ctx->x86_pv.max_pfn which is checked
>>>> against in mfn_in_pseudophysmap(). Is that a known problem?
>>>> With xen-4.6 and the same dom0/guest kernel version combination this does work.
>>>
>>> Can you please share some more information? Especially:
>>>
>>> - guest kernel version?
>> Hm, apparently 4.4 and 4.6 with stable updates. I just tried a much older guest
>> kernel (3.2) environment and that works. So it is the combination of switching
>> from xen-4.6 to 4.7 and guest kernels running 4.4 and later. And while the exact
>> mfn/pfn which gets dumped varies a little, the offending mapping always points
>> to 0xffxxx which would be below last_pfn.
> 
> Aah, okay. The problem seems to be specific to the linear p2m list
> handling.
> 
> Trying on my system... Yep, seeing your problem, too.
> 
> Weird that nobody else stumbled over it.
> Ian, don't we have any test in OSSTEST which should catch this problem?
> A 4GB 64-bit pv-domain with Linux kernel 4.3 or newer can't be saved
> currently.
> 
> Following upstream patch fixes it for me:

Ah! :) Thanks. I applied the below locally, too. And save works with a 4.6 guest
kernel.

-Stefan

> 
> diff --git a/tools/libxc/xc_sr_save_x86_pv.c b/tools/libxc/xc_sr_save_x86_pv.c
> index 4a29460..7043409 100644
> --- a/tools/libxc/xc_sr_save_x86_pv.c
> +++ b/tools/libxc/xc_sr_save_x86_pv.c
> @@ -430,6 +430,8 @@ static int map_p2m_list(struct xc_sr_context *ctx, uint64_t p2m_cr3)
> 
>          if ( level == 2 )
>          {
> +            if ( saved_idx == idx_end )
> +                saved_idx++;
>              max_pfn = ((xen_pfn_t)saved_idx << 9) * fpp - 1;
>              if ( max_pfn < ctx->x86_pv.max_pfn )
>              {
> 
> 
> Juergen
> 






* Re: xen-4.7 regression when saving a pv guest
From: Wei Liu @ 2016-08-26 13:07 UTC (permalink / raw)
  To: Stefan Bader; +Cc: Juergen Gross, Wei Liu, xen-devel, Ian Jackson

On Fri, Aug 26, 2016 at 02:55:06PM +0200, Stefan Bader wrote:
> On 26.08.2016 13:53, Juergen Gross wrote:
> > On 26/08/16 12:52, Stefan Bader wrote:
> >> On 25.08.2016 19:31, Juergen Gross wrote:
> >>> On 25/08/16 17:48, Stefan Bader wrote:
> >>>> When I try to save a PV guest with 4G of memory using xen-4.7 I get the
> >>>> following error:
> >>>>
> >>>> II: Guest memory 4096 MB
> >>>> II: Saving guest state to file...
> >>>> Saving to /tmp/pvguest.save new xl format (info 0x3/0x0/1131)
> >>>> xc: info: Saving domain 23, type x86 PV
> >>>> xc: error: Bad mfn in p2m_frame_list[0]: Internal error
> >>>
> >>> So the first mfn of the memory containing the p2m information is bogus.
> >>> Weird.
> >>
> >> Hm, not sure how bogus. From below the first mfn is 0x4eb1c8 and points to
> >> pfn=0xff7c8 which is above the current max of 0xbffff. But then the dmesg inside
> >> the guest said: "last_pfn = 0x100000" which would be larger than the pfn causing
> >> the error.
> >>
> >>>
> >>>> xc: error: mfn 0x4eb1c8, max 0x820000: Internal error
> >>>> xc: error:   m2p[0x4eb1c8] = 0xff7c8, max_pfn 0xbffff: Internal error
> >>>> xc: error: Save failed (34 = Numerical result out of range): Internal error
> >>>> libxl: error: libxl_stream_write.c:355:libxl__xc_domain_save_done: saving
> >>>> domain: domain did not respond to suspend request: Numerical result out of range
> >>>> Failed to save domain, resuming domain
> >>>> xc: error: Dom 23 not suspended: (shutdown 0, reason 255): Internal error
> >>>> libxl: error: libxl_dom_suspend.c:460:libxl__domain_resume: xc_domain_resume
> >>>> failed for domain 23: Invalid argument
> >>>> EE: Guest not off after save!
> >>>> FAIL
> >>>>
> >>>> From dmesg inside the guest:
> >>>> [    0.000000] e820: last_pfn = 0x100000 max_arch_pfn = 0x400000000
> >>>>
> >>>> Somehow I am slightly suspicious about
> >>>>
> >>>> commit 91e204d37f44913913776d0a89279721694f8b32
> >>>>   libxc: try to find last used pfn when migrating
> >>>>
> >>>> since that seems to potentially lower ctx->x86_pv.max_pfn which is checked
> >>>> against in mfn_in_pseudophysmap(). Is that a known problem?
> >>>> With xen-4.6 and the same dom0/guest kernel version combination this does work.
> >>>
> >>> Can you please share some more information? Especially:
> >>>
> >>> - guest kernel version?
> >> Hm, apparently 4.4 and 4.6 with stable updates. I just tried a much older guest
> >> kernel (3.2) environment and that works. So it is the combination of switching
> >> from xen-4.6 to 4.7 and guest kernels running 4.4 and later. And while the exact
> >> mfn/pfn which gets dumped varies a little, the offending mapping always points
> >> to 0xffxxx which would be below last_pfn.
> > 
> > Aah, okay. The problem seems to be specific to the linear p2m list
> > handling.
> > 
> > Trying on my system... Yep, seeing your problem, too.
> > 
> > Weird that nobody else stumbled over it.
> > Ian, don't we have any test in OSSTEST which should catch this problem?
> > A 4GB 64-bit pv-domain with Linux kernel 4.3 or newer can't be saved
> > currently.
> > 
> > Following upstream patch fixes it for me:
> 
> Ah! :) Thanks. I applied the below locally, too. And save works with a 4.6 guest
> kernel.
> 

I'm going to translate this into a Tested-by tag in the proper patch [0].

Wei.

[0] <1472212735-27445-1-git-send-email-jgross@suse.com>


