* [PATCH] kvm: x86/svm/nested: Cache PDPTEs for nested NPT in PAE paging mode
@ 2022-04-15 10:34 Lai Jiangshan
  2022-05-13  8:54 ` Lai Jiangshan
  2022-05-16 20:45 ` Sean Christopherson
  0 siblings, 2 replies; 9+ messages in thread
From: Lai Jiangshan @ 2022-04-15 10:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, Marcelo Tosatti, Avi Kivity, kvm

From: Lai Jiangshan <jiangshan.ljs@antgroup.com>

When NPT is enabled and L1 uses PAE paging, vcpu->arch.mmu->get_pdptrs(),
which is nested_svm_get_tdp_pdptr(), reads the guest NPT's PDPTEs from
memory unconditionally on every call.

The guest PAE root page is not write-protected.

The mmu->get_pdptrs() in FNAME(walk_addr_generic) might therefore return
different values on each call, or a value different from what
mmu->get_pdptrs() returned in mmu_alloc_shadow_roots().

This can cause FNAME(fetch) to install an spte in the wrong sp,
or to link an sp to the wrong parent, since FNAME(gpte_changed)
cannot detect these kinds of changes.

Cache the PDPTEs to resolve the problem.  The guest is responsible for
informing the host when its PAE root page is updated, which causes a
nested vmexit; the host then updates the cache on the next nested run.

Commit e4e517b4be01 ("KVM: MMU: Do not unconditionally read PDPTE
from guest memory") fixed the same problem for the non-nested case.

Fixes: e4e517b4be01 ("KVM: MMU: Do not unconditionally read PDPTE from guest memory")
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
---
 arch/x86/kvm/svm/nested.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index d736ec6514ca..a34983d2dc07 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -72,18 +72,22 @@ static void svm_inject_page_fault_nested(struct kvm_vcpu *vcpu, struct x86_excep
        }
 }
 
-static u64 nested_svm_get_tdp_pdptr(struct kvm_vcpu *vcpu, int index)
+static void nested_svm_cache_tdp_pdptrs(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	u64 cr3 = svm->nested.ctl.nested_cr3;
-	u64 pdpte;
+	u64 *pdptrs = vcpu->arch.mmu->pdptrs;
 	int ret;
 
-	ret = kvm_vcpu_read_guest_page(vcpu, gpa_to_gfn(cr3), &pdpte,
-				       offset_in_page(cr3) + index * 8, 8);
+	ret = kvm_vcpu_read_guest_page(vcpu, gpa_to_gfn(cr3), pdptrs,
+				       offset_in_page(cr3), 8 * 4);
 	if (ret)
-		return 0;
-	return pdpte;
+		memset(pdptrs, 0, 8 * 4);
+}
+
+static u64 nested_svm_get_tdp_pdptr(struct kvm_vcpu *vcpu, int index)
+{
+	return vcpu->arch.mmu->pdptrs[index];
 }
 
 static unsigned long nested_svm_get_tdp_cr3(struct kvm_vcpu *vcpu)
@@ -109,6 +113,8 @@ static void nested_svm_init_mmu_context(struct kvm_vcpu *vcpu)
 	kvm_init_shadow_npt_mmu(vcpu, X86_CR0_PG, svm->vmcb01.ptr->save.cr4,
 				svm->vmcb01.ptr->save.efer,
 				svm->nested.ctl.nested_cr3);
+	if (vcpu->arch.mmu->root_level == PT32E_ROOT_LEVEL)
+		nested_svm_cache_tdp_pdptrs(vcpu);
 	vcpu->arch.mmu->get_guest_pgd     = nested_svm_get_tdp_cr3;
 	vcpu->arch.mmu->get_pdptr         = nested_svm_get_tdp_pdptr;
 	vcpu->arch.mmu->inject_page_fault = nested_svm_inject_npf_exit;
-- 
2.19.1.6.gb485710b


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] kvm: x86/svm/nested: Cache PDPTEs for nested NPT in PAE paging mode
  2022-04-15 10:34 [PATCH] kvm: x86/svm/nested: Cache PDPTEs for nested NPT in PAE paging mode Lai Jiangshan
@ 2022-05-13  8:54 ` Lai Jiangshan
  2022-05-16 20:45 ` Sean Christopherson
  1 sibling, 0 replies; 9+ messages in thread
From: Lai Jiangshan @ 2022-05-13  8:54 UTC (permalink / raw)
  To: LKML
  Cc: Lai Jiangshan, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	X86 ML, H. Peter Anvin, Marcelo Tosatti, Avi Kivity,
	open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)

On Fri, Apr 15, 2022 at 6:33 PM Lai Jiangshan <jiangshanlai@gmail.com> wrote:
>
> From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
>
> When NPT enabled L1 is PAE paging, vcpu->arch.mmu->get_pdptrs() which
> is nested_svm_get_tdp_pdptr() reads the guest NPT's PDPTE from memroy
> unconditionally for each call.
>
> The guest PAE root page is not write-protected.
>
> The mmu->get_pdptrs() in FNAME(walk_addr_generic) might get different
> values every time or it is different from the return value of
> mmu->get_pdptrs() in mmu_alloc_shadow_roots().
>
> And it will cause FNAME(fetch) installs the spte in a wrong sp
> or links a sp to a wrong parent since FNAME(gpte_changed) can't
> check these kind of changes.
>
> Cache the PDPTEs and the problem is resolved.  The guest is responsible
> to info the host if its PAE root page is updated which will cause
> nested vmexit and the host updates the cache when next nested run.
>
> The commit e4e517b4be01 ("KVM: MMU: Do not unconditionally read PDPTE
> from guest memory") fixs the same problem for non-nested case.
>
> Fixes: e4e517b4be01 ("KVM: MMU: Do not unconditionally read PDPTE from guest memory")
> Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>

Ping


* Re: [PATCH] kvm: x86/svm/nested: Cache PDPTEs for nested NPT in PAE paging mode
  2022-04-15 10:34 [PATCH] kvm: x86/svm/nested: Cache PDPTEs for nested NPT in PAE paging mode Lai Jiangshan
  2022-05-13  8:54 ` Lai Jiangshan
@ 2022-05-16 20:45 ` Sean Christopherson
  2022-05-17  1:02   ` Lai Jiangshan
  2022-05-25 21:45   ` David Matlack
  1 sibling, 2 replies; 9+ messages in thread
From: Sean Christopherson @ 2022-05-16 20:45 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: linux-kernel, Lai Jiangshan, Paolo Bonzini, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Marcelo Tosatti, Avi Kivity, kvm

On Fri, Apr 15, 2022, Lai Jiangshan wrote:
> From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> 
> When NPT enabled L1 is PAE paging, vcpu->arch.mmu->get_pdptrs() which
> is nested_svm_get_tdp_pdptr() reads the guest NPT's PDPTE from memroy
> unconditionally for each call.
> 
> The guest PAE root page is not write-protected.
> 
> The mmu->get_pdptrs() in FNAME(walk_addr_generic) might get different
> values every time or it is different from the return value of
> mmu->get_pdptrs() in mmu_alloc_shadow_roots().
> 
> And it will cause FNAME(fetch) installs the spte in a wrong sp
> or links a sp to a wrong parent since FNAME(gpte_changed) can't
> check these kind of changes.
> 
> Cache the PDPTEs and the problem is resolved.  The guest is responsible
> to info the host if its PAE root page is updated which will cause
> nested vmexit and the host updates the cache when next nested run.

Hmm, no, the guest is responsible for invalidating translations that can be
cached in the TLB, but the guest is not responsible for a full reload of PDPTEs.
Per the APM, the PDPTEs can be cached like regular PTEs:

  Under SVM, however, when the processor is in guest mode with PAE enabled, the
  guest PDPT entries are not cached or validated at this point, but instead are
  loaded and checked on demand in the normal course of address translation, just
  like page directory and page table entries. Any reserved bit violations are
  detected at the point of use, and result in a page-fault (#PF) exception rather
  than a general-protection (#GP) exception.

So if L1 modifies a PDPTE from !PRESENT (or RESERVED) to PRESENT (and valid), then
any active L2 vCPUs should recognize the new PDPTE without a nested VM-Exit because
the old entry can't have been cached in the TLB.

In practice, snapshotting at nested VMRUN would likely work, but architecturally
it's wrong and could cause problems if L1+L2 are engaged in paravirt shenanigans,
e.g. async #PF comes to mind.

I believe the correct way to fix this is to write-protect nNPT PDPTEs like all other
shadow pages, which shouldn't be too awful to do as part of your series to route
PDPTEs through kvm_mmu_get_page().


* Re: [PATCH] kvm: x86/svm/nested: Cache PDPTEs for nested NPT in PAE paging mode
  2022-05-16 20:45 ` Sean Christopherson
@ 2022-05-17  1:02   ` Lai Jiangshan
  2022-05-17  1:10     ` Lai Jiangshan
  2022-05-25 21:45   ` David Matlack
  1 sibling, 1 reply; 9+ messages in thread
From: Lai Jiangshan @ 2022-05-17  1:02 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: LKML, Lai Jiangshan, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin,
	Marcelo Tosatti, Avi Kivity,
	open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)

On Tue, May 17, 2022 at 4:45 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Apr 15, 2022, Lai Jiangshan wrote:
> > From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> >
> > When NPT enabled L1 is PAE paging, vcpu->arch.mmu->get_pdptrs() which
> > is nested_svm_get_tdp_pdptr() reads the guest NPT's PDPTE from memroy
> > unconditionally for each call.
> >
> > The guest PAE root page is not write-protected.
> >
> > The mmu->get_pdptrs() in FNAME(walk_addr_generic) might get different
> > values every time or it is different from the return value of
> > mmu->get_pdptrs() in mmu_alloc_shadow_roots().
> >
> > And it will cause FNAME(fetch) installs the spte in a wrong sp
> > or links a sp to a wrong parent since FNAME(gpte_changed) can't
> > check these kind of changes.
> >
> > Cache the PDPTEs and the problem is resolved.  The guest is responsible
> > to info the host if its PAE root page is updated which will cause
> > nested vmexit and the host updates the cache when next nested run.
>
> Hmm, no, the guest is responsible for invalidating translations that can be
> cached in the TLB, but the guest is not responsible for a full reload of PDPTEs.
> Per the APM, the PDPTEs can be cached like regular PTEs:
>
>   Under SVM, however, when the processor is in guest mode with PAE enabled, the
>   guest PDPT entries are not cached or validated at this point, but instead are
>   loaded and checked on demand in the normal course of address translation, just
>   like page directory and page table entries. Any reserved bit violations are
>   detected at the point of use, and result in a page-fault (#PF) exception rather
>   than a general-protection (#GP) exception.
>
> So if L1 modifies a PDPTE from !PRESENT (or RESERVED) to PRESENT (and valid), then
> any active L2 vCPUs should recognize the new PDPTE without a nested VM-Exit because
> the old entry can't have been cached in the TLB.

In this case, it is still !PRESENT in the shadow page, and it will cause
a vmexit when L2 tries to use the translation.  I can't see anything wrong
in the TLB or the vTLB (shadow pages).

But I think some code is needed to reload the cached PDPTEs
when guest_mmu->get_pdptrs() returns !PRESENT, and to reload the MMU if
the cache has changed (plus code to avoid an infinite loop).

The patch fails to reload the MMU when the cache changes, which
leaves the problem described in the changelog only partially
resolved.

Maybe we need to add the mmu parameter back to load_pdptrs() for this.

>
> In practice, snapshotting at nested VMRUN would likely work, but architecturally
> it's wrong and could cause problems if L1+L2 are engaged in paravirt shenanigans,
> e.g. async #PF comes to mind.
>
> I believe the correct way to fix this is to write-protect nNPT PDPTEs like all other
> shadow pages, which shouldn't be too awful to do as part of your series to route
> PDPTEs through kvm_mmu_get_page().

In the one-off special shadow page patchset (to be renamed "one-off local
shadow page"), PAE PDPTEs are not write-protected.  Write-protecting them
leads to nasty code.


* Re: [PATCH] kvm: x86/svm/nested: Cache PDPTEs for nested NPT in PAE paging mode
  2022-05-17  1:02   ` Lai Jiangshan
@ 2022-05-17  1:10     ` Lai Jiangshan
  0 siblings, 0 replies; 9+ messages in thread
From: Lai Jiangshan @ 2022-05-17  1:10 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: LKML, Lai Jiangshan, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin,
	Marcelo Tosatti, Avi Kivity,
	open list:KERNEL VIRTUAL MACHINE FOR MIPS (KVM/mips)

On Tue, May 17, 2022 at 9:02 AM Lai Jiangshan <jiangshanlai@gmail.com> wrote:
>
> On Tue, May 17, 2022 at 4:45 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Apr 15, 2022, Lai Jiangshan wrote:
> > > From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> > >
> > > When NPT enabled L1 is PAE paging, vcpu->arch.mmu->get_pdptrs() which
> > > is nested_svm_get_tdp_pdptr() reads the guest NPT's PDPTE from memroy
> > > unconditionally for each call.
> > >
> > > The guest PAE root page is not write-protected.
> > >
> > > The mmu->get_pdptrs() in FNAME(walk_addr_generic) might get different
> > > values every time or it is different from the return value of
> > > mmu->get_pdptrs() in mmu_alloc_shadow_roots().
> > >
> > > And it will cause FNAME(fetch) installs the spte in a wrong sp
> > > or links a sp to a wrong parent since FNAME(gpte_changed) can't
> > > check these kind of changes.
> > >
> > > Cache the PDPTEs and the problem is resolved.  The guest is responsible
> > > to info the host if its PAE root page is updated which will cause
> > > nested vmexit and the host updates the cache when next nested run.
> >
> > Hmm, no, the guest is responsible for invalidating translations that can be
> > cached in the TLB, but the guest is not responsible for a full reload of PDPTEs.
> > Per the APM, the PDPTEs can be cached like regular PTEs:
> >
> >   Under SVM, however, when the processor is in guest mode with PAE enabled, the
> >   guest PDPT entries are not cached or validated at this point, but instead are
> >   loaded and checked on demand in the normal course of address translation, just
> >   like page directory and page table entries. Any reserved bit violations are
> >   detected at the point of use, and result in a page-fault (#PF) exception rather
> >   than a general-protection (#GP) exception.
> >
> > So if L1 modifies a PDPTE from !PRESENT (or RESERVED) to PRESENT (and valid), then
> > any active L2 vCPUs should recognize the new PDPTE without a nested VM-Exit because
> > the old entry can't have been cached in the TLB.
>
> In this case, it is still !PRESENT in the shadow page, and it will cause
> a vmexit when L2 tries to use the translation.  I can't see anything wrong
> in TLB or vTLB(shadow pages).
>
> But I think some code is needed to reload the cached PDPTEs
> when guest_mmu->get_pdptrs() returns !PRESENT and reload mmu if
> the cache is changed. (and add code to avoid infinite loop)
>
> The patch fails to reload mmu if the cache is changed which
> leaves the problem described in the changelog partial resolved
> only.
>
> Maybe we need to add mmu parameter back in load_pdptrs() for it.
>

It is still too complicated; I will try to do a direct check
in FNAME(fetch) instead of (re)using the cache.

> >
> > In practice, snapshotting at nested VMRUN would likely work, but architecturally
> > it's wrong and could cause problems if L1+L2 are engaged in paravirt shenanigans,
> > e.g. async #PF comes to mind.
> >
> > I believe the correct way to fix this is to write-protect nNPT PDPTEs like all other
> > shadow pages, which shouldn't be too awful to do as part of your series to route
> > PDPTEs through kvm_mmu_get_page().
>
> In the one-off special shadow page (will be renamed to one-off local
> shadow page)
> patchsets, PAE PDPTEs are not write-protected.  Write-protecting them causes nasty
> code.


* Re: [PATCH] kvm: x86/svm/nested: Cache PDPTEs for nested NPT in PAE paging mode
  2022-05-16 20:45 ` Sean Christopherson
  2022-05-17  1:02   ` Lai Jiangshan
@ 2022-05-25 21:45   ` David Matlack
  2022-05-26  8:38     ` Lai Jiangshan
  1 sibling, 1 reply; 9+ messages in thread
From: David Matlack @ 2022-05-25 21:45 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Lai Jiangshan, LKML, Lai Jiangshan, Paolo Bonzini,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	X86 ML, H. Peter Anvin, Marcelo Tosatti, Avi Kivity, kvm list

On Mon, May 16, 2022 at 2:06 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Apr 15, 2022, Lai Jiangshan wrote:
> > From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> >
> > When NPT enabled L1 is PAE paging, vcpu->arch.mmu->get_pdptrs() which
> > is nested_svm_get_tdp_pdptr() reads the guest NPT's PDPTE from memroy
> > unconditionally for each call.
> >
> > The guest PAE root page is not write-protected.
> >
> > The mmu->get_pdptrs() in FNAME(walk_addr_generic) might get different
> > values every time or it is different from the return value of
> > mmu->get_pdptrs() in mmu_alloc_shadow_roots().
> >
> > And it will cause FNAME(fetch) installs the spte in a wrong sp
> > or links a sp to a wrong parent since FNAME(gpte_changed) can't
> > check these kind of changes.
> >
> > Cache the PDPTEs and the problem is resolved.  The guest is responsible
> > to info the host if its PAE root page is updated which will cause
> > nested vmexit and the host updates the cache when next nested run.
>
> Hmm, no, the guest is responsible for invalidating translations that can be
> cached in the TLB, but the guest is not responsible for a full reload of PDPTEs.
> Per the APM, the PDPTEs can be cached like regular PTEs:
>
>   Under SVM, however, when the processor is in guest mode with PAE enabled, the
>   guest PDPT entries are not cached or validated at this point, but instead are
>   loaded and checked on demand in the normal course of address translation, just
>   like page directory and page table entries. Any reserved bit violations are
>   detected at the point of use, and result in a page-fault (#PF) exception rather
>   than a general-protection (#GP) exception.

This paragraph from the APM describes the behavior of CR3 loads while
in SVM guest-mode. But this patch is changing how KVM emulates SVM
host-mode (i.e. L1), right? It seems like AMD makes no guarantee
whether or not CR3 loads pre-load PDPTEs while in SVM host-mode.
(Although the APM does say that "modern processors" do not pre-load
PDPTEs.)

>
> So if L1 modifies a PDPTE from !PRESENT (or RESERVED) to PRESENT (and valid), then
> any active L2 vCPUs should recognize the new PDPTE without a nested VM-Exit because
> the old entry can't have been cached in the TLB.
>
> In practice, snapshotting at nested VMRUN would likely work, but architecturally
> it's wrong and could cause problems if L1+L2 are engaged in paravirt shenanigans,
> e.g. async #PF comes to mind.
>
> I believe the correct way to fix this is to write-protect nNPT PDPTEs like all other
> shadow pages, which shouldn't be too awful to do as part of your series to route
> PDPTEs through kvm_mmu_get_page().


* Re: [PATCH] kvm: x86/svm/nested: Cache PDPTEs for nested NPT in PAE paging mode
  2022-05-25 21:45   ` David Matlack
@ 2022-05-26  8:38     ` Lai Jiangshan
  2022-05-26  9:33       ` Paolo Bonzini
  0 siblings, 1 reply; 9+ messages in thread
From: Lai Jiangshan @ 2022-05-26  8:38 UTC (permalink / raw)
  To: David Matlack
  Cc: Sean Christopherson, LKML, Lai Jiangshan, Paolo Bonzini,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	X86 ML, H. Peter Anvin, Marcelo Tosatti, Avi Kivity, kvm list

On Thu, May 26, 2022 at 5:45 AM David Matlack <dmatlack@google.com> wrote:
>
> On Mon, May 16, 2022 at 2:06 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Apr 15, 2022, Lai Jiangshan wrote:
> > > From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> > >
> > > When NPT enabled L1 is PAE paging, vcpu->arch.mmu->get_pdptrs() which
> > > is nested_svm_get_tdp_pdptr() reads the guest NPT's PDPTE from memroy
> > > unconditionally for each call.
> > >
> > > The guest PAE root page is not write-protected.
> > >
> > > The mmu->get_pdptrs() in FNAME(walk_addr_generic) might get different
> > > values every time or it is different from the return value of
> > > mmu->get_pdptrs() in mmu_alloc_shadow_roots().
> > >
> > > And it will cause FNAME(fetch) installs the spte in a wrong sp
> > > or links a sp to a wrong parent since FNAME(gpte_changed) can't
> > > check these kind of changes.
> > >
> > > Cache the PDPTEs and the problem is resolved.  The guest is responsible
> > > to info the host if its PAE root page is updated which will cause
> > > nested vmexit and the host updates the cache when next nested run.
> >
> > Hmm, no, the guest is responsible for invalidating translations that can be
> > cached in the TLB, but the guest is not responsible for a full reload of PDPTEs.
> > Per the APM, the PDPTEs can be cached like regular PTEs:
> >
> >   Under SVM, however, when the processor is in guest mode with PAE enabled, the
> >   guest PDPT entries are not cached or validated at this point, but instead are
> >   loaded and checked on demand in the normal course of address translation, just
> >   like page directory and page table entries. Any reserved bit violations are
> >   detected at the point of use, and result in a page-fault (#PF) exception rather
> >   than a general-protection (#GP) exception.
>
> This paragraph from the APM describes the behavior of CR3 loads while
> in SVM guest-mode. But this patch is changing how KVM emulates SVM
> host-mode (i.e. L1), right? It seems like AMD makes no guarantee
> whether or not CR3 loads pre-load PDPTEs while in SVM host-mode.
> (Although the APM does say that "modern processors" do not pre-load
> PDPTEs.)

Oh, I also missed the fact that L1 is the host when emulating it.

The code is for host-mode (L1)'s nested_cr3, which uses the
traditional PAE PDPTE loading and checking.

So using caches is the only correct way, right?

If so, I can update this patch on its own (without adding it to the one-off
local shadow page patchset) and add some checks to see whether the loaded
cache has changed.

Maybe I will just leave it, since I'm not familiar enough with SVM.
I hope it serves as a bug report.

Thanks
Lai


* Re: [PATCH] kvm: x86/svm/nested: Cache PDPTEs for nested NPT in PAE paging mode
  2022-05-26  8:38     ` Lai Jiangshan
@ 2022-05-26  9:33       ` Paolo Bonzini
  2022-05-26  9:52         ` Lai Jiangshan
  0 siblings, 1 reply; 9+ messages in thread
From: Paolo Bonzini @ 2022-05-26  9:33 UTC (permalink / raw)
  To: Lai Jiangshan, David Matlack
  Cc: Sean Christopherson, LKML, Lai Jiangshan, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML,
	H. Peter Anvin, Marcelo Tosatti, Avi Kivity, kvm list

On 5/26/22 10:38, Lai Jiangshan wrote:
>> (Although the APM does say that "modern processors" do not pre-load
>> PDPTEs.)

This changed between the Oct 2020 and Nov 2021 revisions of the APM, so I
suppose the change was made in Zen 3.

> Oh, I also missed the fact that L1 is the host when emulating it.
> 
> The code is for host-mode (L1)'s nested_cr3 which is using the
> traditional PAE PDPTEs loading and checking.
> 
> So using caches is the only correct way, right?

The caching behavior for NPT PDPTEs does not matter too much.  What 
matters is that a PDPTE with reserved bits should cause a #NPF at usage 
time rather than a VMentry failure or a #NPF immediately after VMentry.

Paolo



* Re: [PATCH] kvm: x86/svm/nested: Cache PDPTEs for nested NPT in PAE paging mode
  2022-05-26  9:33       ` Paolo Bonzini
@ 2022-05-26  9:52         ` Lai Jiangshan
  0 siblings, 0 replies; 9+ messages in thread
From: Lai Jiangshan @ 2022-05-26  9:52 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: David Matlack, Sean Christopherson, LKML, Lai Jiangshan,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	X86 ML, H. Peter Anvin, Marcelo Tosatti, Avi Kivity, kvm list

On Thu, May 26, 2022 at 5:33 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 5/26/22 10:38, Lai Jiangshan wrote:
> >> (Although the APM does say that "modern processors" do not pre-load
> >> PDPTEs.)
>
> This changed between the Oct 2020 and Nov 2021, so I suppose the change
> was done in Zen 3.
>
> > Oh, I also missed the fact that L1 is the host when emulating it.
> >
> > The code is for host-mode (L1)'s nested_cr3 which is using the
> > traditional PAE PDPTEs loading and checking.
> >
> > So using caches is the only correct way, right?
>
> The caching behavior for NPT PDPTEs does not matter too much.  What
> matters is that a PDPTE with reserved bits should cause a #NPF at usage
> time rather than a VMentry failure or a #NPF immediately after VMentry.
>

Since there is an mmu->get_pdptrs() call in mmu_alloc_shadow_roots(), we
can't conform to this now.

It will be easier to raise the #NPF only at usage time after the one-off
local shadow page patchset.

