* 32bit PAE PV guest on 64bit hypervisor
From: Mukesh Rathor @ 2009-04-14 3:39 UTC (permalink / raw)
To: xen-devel
Hi,

I've been chasing down this message from a guest boot:

    (XEN) mm.c:1841:d1 Error pfn 7f36a: rd=ffff8300cea28080, od=0000000000000000, caf=00000000, taf=0000000000000000
    (XEN) mm.c:730:d1 Error getting mfn 7f36a (pfn 5555555555555555) from L1 entry 000000007f36a025 for dom1
    (XEN) mm.c:3700:d1 ptwr_emulate: fixing up invalid PAE PTE 000000007f36a025

First, on a system with more than 64GB of memory, it looks like a 32bit guest
can be given an mfn above 64GB.

The message above appears when the PV guest does its WP check. For that,
test_wp_bit() uses set_pte to map a temporary page (swapper_pg_dir) at a
fixmap slot:
    __set_fixmap(FIX_WP_TEST, __pa_symbol(&swapper_pg_dir), PAGE_READONLY);
    boot_cpu_data.wp_works_ok = do_test_wp_bit();
    clear_fixmap(FIX_WP_TEST);
    ...
    /* use writable pagetables */
    static inline void set_pte(pte_t *ptep, pte_t pte)
    {
        ptep->pte_high = pte.pte_high;
        smp_wmb();
        ptep->pte_low = pte.pte_low;
    }
During clear_fixmap(), the pte_high write clears the upper 32 bits of
the pte/mfn; as a result, on the pte_low write the hypervisor gets the
wrong mfn, 7f36a instead of 1f7f36a.

I understand writable page tables allow the guest to do this, but I assume
they are meant for mapping user pages, not kernel pages, in which case we
should be doing a hypercall here? Or would switching the order, setting
pte_low first and then pte_high, work?
Thanks,
Mukesh
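The intermediate state described above can be reproduced with plain integer arithmetic. The sketch below is not kernel code: pae_pte_t, pte_val(), pte_mfn() and clear_high_then_low() are hypothetical names, and the starting value 0x1f7f36a025 is an assumption reconstructed from the mfn 1f7f36a and the flag bits 025 in the log. It only shows what the entry looks like between the two 32-bit stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a PAE PTE: two 32-bit halves of one 64-bit entry. */
typedef struct {
    uint32_t pte_low;   /* present bit, flags, low bits of the frame number */
    uint32_t pte_high;  /* upper bits of the frame number */
} pae_pte_t;

#define PAGE_SHIFT 12

/* The entry as an observer would read it at this instant. */
static uint64_t pte_val(const pae_pte_t *p)
{
    return ((uint64_t)p->pte_high << 32) | p->pte_low;
}

/* Frame number encoded in a raw entry; bits 12..51 hold the frame. */
static uint64_t pte_mfn(uint64_t val)
{
    return (val & UINT64_C(0x000ffffffffff000)) >> PAGE_SHIFT;
}

/*
 * Replay the high-then-low store order of the quoted set_pte() when the
 * new value is zero, and return the entry as seen between the two stores.
 */
static uint64_t clear_high_then_low(pae_pte_t *p)
{
    uint64_t seen;

    assert(p != NULL);
    p->pte_high = 0;    /* first store: upper frame bits vanish */
    seen = pte_val(p);  /* low word untouched, so still marked present */
    p->pte_low = 0;     /* second store: entry is finally not present */
    return seen;
}
```

Starting from pte_high = 0x1f and pte_low = 0x7f36a025, clear_high_then_low() returns 0x7f36a025: a present entry whose frame number is 7f36a, which is the value shown in the (XEN) messages.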
* Re: 32bit PAE PV guest on 64bit hypervisor
From: Keir Fraser @ 2009-04-14 6:28 UTC (permalink / raw)
To: mukesh.rathor, xen-devel
On 14/04/2009 04:39, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
> During clear_fixmap(), the pte_high write clears the upper 32 bits of
> the pte/mfn; as a result, on the pte_low write the hypervisor gets the
> wrong mfn, 7f36a instead of 1f7f36a.
>
> I understand writable page tables allow the guest to do this, but I assume
> they are meant for mapping user pages, not kernel pages, in which case we
> should be doing a hypercall here? Or would switching the order, setting
> pte_low first and then pte_high, work?
Implementing clear_fixmap() with set_pte() is not correct, even on native.
Since it clears high then low, it temporarily leaves you with a possibly
invalid present PTE -- even on native this can cause problems if e.g., the
invalid PTE maps uncacheable I/O memory.
In our kernel we simply solved this by implementing __set_fixmap() with a
hypercall that could update all 64 bits at once. An alternative is indeed to
clear low then high. Basically, clearing a pte has to be done the opposite
way round to setting a pte.
-- Keir
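The ordering rule in this reply can be sketched as follows. This is an illustration under assumptions, not the fix that went into any kernel: demo_set_pte(), demo_pte_clear() and demo_wmb() are hypothetical names, and a C11 release fence stands in for smp_wmb(). Each function returns the entry as it looks between its two stores, so the intermediate state can be inspected.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a PAE PTE split into two 32-bit words. */
typedef struct {
    uint32_t pte_low;   /* present bit, flags, low frame bits */
    uint32_t pte_high;  /* high frame bits */
} pae_pte_t;

#define PTE_PRESENT 0x1u

/* Assumption: a release fence plays the role of smp_wmb() in this sketch. */
#define demo_wmb() atomic_thread_fence(memory_order_release)

static uint64_t pte_val(const pae_pte_t *p)
{
    return ((uint64_t)p->pte_high << 32) | p->pte_low;
}

/*
 * Install an entry into a slot that is currently not present:
 * high word first, then the low word carrying the present bit.
 */
static uint64_t demo_set_pte(pae_pte_t *p, uint64_t val)
{
    uint64_t mid;

    assert(p != NULL && (p->pte_low & PTE_PRESENT) == 0);
    p->pte_high = (uint32_t)(val >> 32);
    demo_wmb();
    mid = pte_val(p);           /* not present yet */
    p->pte_low = (uint32_t)val;
    return mid;
}

/* Tear an entry down in the opposite order: low word first, then high. */
static uint64_t demo_pte_clear(pae_pte_t *p)
{
    uint64_t mid;

    assert(p != NULL);
    p->pte_low = 0;             /* present bit goes away first */
    demo_wmb();
    mid = pte_val(p);           /* stale high bits, but not present */
    p->pte_high = 0;
    return mid;
}
```

In both directions the value seen between the two stores has the present bit clear, so no observer can act on a half-written frame number. A hypercall that writes all 64 bits at once avoids the intermediate state altogether.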
* Re: 32bit PAE PV guest on 64bit hypervisor
From: Jeremy Fitzhardinge @ 2009-04-14 15:59 UTC (permalink / raw)
To: mukesh.rathor; +Cc: xen-devel
Mukesh Rathor wrote:
> Hi,
>
> I've been chasing down this message from a guest boot:
>
>     (XEN) mm.c:1841:d1 Error pfn 7f36a: rd=ffff8300cea28080, od=0000000000000000, caf=00000000, taf=0000000000000000
>     (XEN) mm.c:730:d1 Error getting mfn 7f36a (pfn 5555555555555555) from L1 entry 000000007f36a025 for dom1
>     (XEN) mm.c:3700:d1 ptwr_emulate: fixing up invalid PAE PTE 000000007f36a025
>
> First, on a system with more than 64GB of memory, it looks like a 32bit guest
> can be given an mfn above 64GB.
>
> The message above appears when the PV guest does its WP check. For that,
> test_wp_bit() uses set_pte to map a temporary page (swapper_pg_dir) at a
> fixmap slot:
>
>     __set_fixmap(FIX_WP_TEST, __pa_symbol(&swapper_pg_dir), PAGE_READONLY);
>     boot_cpu_data.wp_works_ok = do_test_wp_bit();
>     clear_fixmap(FIX_WP_TEST);
>
>     ...
>     /* use writable pagetables */
>     static inline void set_pte(pte_t *ptep, pte_t pte)
>     {
>         ptep->pte_high = pte.pte_high;
>         smp_wmb();
>         ptep->pte_low = pte.pte_low;
>     }
What kernel version is this?
J
* Re: 32bit PAE PV guest on 64bit hypervisor
From: Mukesh Rathor @ 2009-04-14 18:06 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: xen-devel
Jeremy Fitzhardinge wrote:
> What kernel version is this?
>
> J
I was debugging 2.6.18-92, and I see the code is the same in 2.6.18-128.
The Xen version, just in case, is 3.3.1.
Thanks,
Mukesh
* Re: 32bit PAE PV guest on 64bit hypervisor
From: Mukesh Rathor @ 2009-04-23 1:57 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel
Keir Fraser wrote:
> On 14/04/2009 04:39, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
>
..
> Implementing clear_fixmap() with set_pte() is not correct, even on native.
> Since it clears high then low, it temporarily leaves you with a possibly
> invalid present PTE -- even on native this can cause problems if e.g., the
> invalid PTE maps uncacheable I/O memory.
>
> In our kernel we simply solved this by implementing __set_fixmap() with a
> hypercall that could update all 64 bits at once. An alternative is indeed to
> clear low then high. Basically, clearing a pte has to be done the opposite
> way round to setting a pte.
>
> -- Keir
Just a quick update: I changed it to use a hypercall and it worked. BTW, I also
had to increase __PHYSICAL_MASK_SHIFT in the guest (to 40), as I'm on a system
with 128GB. With both changes in the 32bit PAE guest, it's doing OK now.
Thanks for the help.
Mukesh
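The need for the larger mask follows from the frame number in the first message. Below is a rough check, assuming the guest's previous __PHYSICAL_MASK_SHIFT was 36 (the classic PAE limit of 64GB; the thread does not state the old value). The helpers phys_mask(), mfn_fits() and masked_mfn() are hypothetical names used only for this arithmetic.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Mask covering physical address bits [0, shift), in the style of __PHYSICAL_MASK. */
static uint64_t phys_mask(unsigned shift)
{
    assert(shift > 0 && shift < 64);
    return (UINT64_C(1) << shift) - 1;
}

/* Nonzero when every byte of machine frame mfn lies below 2^shift. */
static int mfn_fits(uint64_t mfn, unsigned shift)
{
    uint64_t last_byte = (mfn << PAGE_SHIFT) | ((UINT64_C(1) << PAGE_SHIFT) - 1);

    return (last_byte & ~phys_mask(shift)) == 0;
}

/* Frame number that survives when the frame's address is masked to shift bits. */
static uint64_t masked_mfn(uint64_t mfn, unsigned shift)
{
    return ((mfn << PAGE_SHIFT) & phys_mask(shift)) >> PAGE_SHIFT;
}
```

With these helpers, mfn 0x1f7f36a does not fit under a 36-bit mask (masking would yield frame 0xf7f36a), but it fits under a 40-bit mask, which covers addresses up to 1TB.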