KVM Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v2 0/2] KVM: x86/mmu: Optimize rsvd pte checks
@ 2020-01-09 23:06 Sean Christopherson
  2020-01-09 23:06 ` [PATCH v2 1/2] KVM: x86/mmu: Reorder the reserved bit check in prefetch_invalid_gpte() Sean Christopherson
  2020-01-09 23:06 ` [PATCH v2 2/2] KVM: x86/mmu: Micro-optimize nEPT's bad memptype/XWR checks Sean Christopherson
  0 siblings, 2 replies; 8+ messages in thread
From: Sean Christopherson @ 2020-01-09 23:06 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, David Laight, Arvind Sankar

Two patches to optimize the reserved bit and illegal EPT memtype/XWR
checks, and more importantly, clean up an undocumented, subtle bitwise-OR
in the code.

v2: Rewrite everything.

v1: https://lkml.kernel.org/r/20200108001859.25254-1-sean.j.christopherson@intel.com

Sean Christopherson (2):
  KVM: x86/mmu: Reorder the reserved bit check in
    prefetch_invalid_gpte()
  KVM: x86/mmu: Micro-optimize nEPT's bad memptype/XWR checks

 arch/x86/kvm/mmu/mmu.c         | 26 ++++++++++++++------------
 arch/x86/kvm/mmu/paging_tmpl.h | 23 +++++++++++++++++++----
 2 files changed, 33 insertions(+), 16 deletions(-)

-- 
2.24.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/2] KVM: x86/mmu: Reorder the reserved bit check in prefetch_invalid_gpte()
  2020-01-09 23:06 [PATCH v2 0/2] KVM: x86/mmu: Optimize rsvd pte checks Sean Christopherson
@ 2020-01-09 23:06 ` Sean Christopherson
  2020-01-10 11:37   ` Vitaly Kuznetsov
  2020-01-09 23:06 ` [PATCH v2 2/2] KVM: x86/mmu: Micro-optimize nEPT's bad memptype/XWR checks Sean Christopherson
  1 sibling, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2020-01-09 23:06 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, David Laight, Arvind Sankar

Move the !PRESENT and !ACCESSED checks in FNAME(prefetch_invalid_gpte)
above the call to is_rsvd_bits_set().  For a well behaved guest, the
!PRESENT and !ACCESSED are far more likely to evaluate true than the
reserved bit checks, and they do not require additional memory accesses.

Before:
 Dump of assembler code for function paging32_prefetch_invalid_gpte:
   0x0000000000044240 <+0>:     callq  0x44245 <paging32_prefetch_invalid_gpte+5>
   0x0000000000044245 <+5>:     mov    %rcx,%rax
   0x0000000000044248 <+8>:     shr    $0x7,%rax
   0x000000000004424c <+12>:    and    $0x1,%eax
   0x000000000004424f <+15>:    lea    0x0(,%rax,4),%r8
   0x0000000000044257 <+23>:    add    %r8,%rax
   0x000000000004425a <+26>:    mov    %rcx,%r8
   0x000000000004425d <+29>:    and    0x120(%rsi,%rax,8),%r8
   0x0000000000044265 <+37>:    mov    0x170(%rsi),%rax
   0x000000000004426c <+44>:    shr    %cl,%rax
   0x000000000004426f <+47>:    and    $0x1,%eax
   0x0000000000044272 <+50>:    or     %rax,%r8
   0x0000000000044275 <+53>:    jne    0x4427c <paging32_prefetch_invalid_gpte+60>
   0x0000000000044277 <+55>:    test   $0x1,%cl
   0x000000000004427a <+58>:    jne    0x4428a <paging32_prefetch_invalid_gpte+74>
   0x000000000004427c <+60>:    mov    %rdx,%rsi
   0x000000000004427f <+63>:    callq  0x44080 <drop_spte>
   0x0000000000044284 <+68>:    mov    $0x1,%eax
   0x0000000000044289 <+73>:    retq
   0x000000000004428a <+74>:    xor    %eax,%eax
   0x000000000004428c <+76>:    and    $0x20,%ecx
   0x000000000004428f <+79>:    jne    0x44289 <paging32_prefetch_invalid_gpte+73>
   0x0000000000044291 <+81>:    mov    %rdx,%rsi
   0x0000000000044294 <+84>:    callq  0x44080 <drop_spte>
   0x0000000000044299 <+89>:    mov    $0x1,%eax
   0x000000000004429e <+94>:    jmp    0x44289 <paging32_prefetch_invalid_gpte+73>
 End of assembler dump.

After:
 Dump of assembler code for function paging32_prefetch_invalid_gpte:
   0x0000000000044240 <+0>:     callq  0x44245 <paging32_prefetch_invalid_gpte+5>
   0x0000000000044245 <+5>:     test   $0x1,%cl
   0x0000000000044248 <+8>:     je     0x4424f <paging32_prefetch_invalid_gpte+15>
   0x000000000004424a <+10>:    test   $0x20,%cl
   0x000000000004424d <+13>:    jne    0x4425d <paging32_prefetch_invalid_gpte+29>
   0x000000000004424f <+15>:    mov    %rdx,%rsi
   0x0000000000044252 <+18>:    callq  0x44080 <drop_spte>
   0x0000000000044257 <+23>:    mov    $0x1,%eax
   0x000000000004425c <+28>:    retq
   0x000000000004425d <+29>:    mov    %rcx,%rax
   0x0000000000044260 <+32>:    mov    (%rsi),%rsi
   0x0000000000044263 <+35>:    shr    $0x7,%rax
   0x0000000000044267 <+39>:    and    $0x1,%eax
   0x000000000004426a <+42>:    lea    0x0(,%rax,4),%r8
   0x0000000000044272 <+50>:    add    %r8,%rax
   0x0000000000044275 <+53>:    mov    %rcx,%r8
   0x0000000000044278 <+56>:    and    0x120(%rsi,%rax,8),%r8
   0x0000000000044280 <+64>:    mov    0x170(%rsi),%rax
   0x0000000000044287 <+71>:    shr    %cl,%rax
   0x000000000004428a <+74>:    and    $0x1,%eax
   0x000000000004428d <+77>:    mov    %rax,%rcx
   0x0000000000044290 <+80>:    xor    %eax,%eax
   0x0000000000044292 <+82>:    or     %rcx,%r8
   0x0000000000044295 <+85>:    je     0x4425c <paging32_prefetch_invalid_gpte+28>
   0x0000000000044297 <+87>:    mov    %rdx,%rsi
   0x000000000004429a <+90>:    callq  0x44080 <drop_spte>
   0x000000000004429f <+95>:    mov    $0x1,%eax
   0x00000000000442a4 <+100>:   jmp    0x4425c <paging32_prefetch_invalid_gpte+28>
 End of assembler dump.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/mmu/paging_tmpl.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index b53bed3c901c..1fde6a1c506d 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -175,9 +175,6 @@ static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu,
 				  struct kvm_mmu_page *sp, u64 *spte,
 				  u64 gpte)
 {
-	if (is_rsvd_bits_set(vcpu->arch.mmu, gpte, PT_PAGE_TABLE_LEVEL))
-		goto no_present;
-
 	if (!FNAME(is_present_gpte)(gpte))
 		goto no_present;
 
@@ -186,6 +183,9 @@ static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu,
 	    !(gpte & PT_GUEST_ACCESSED_MASK))
 		goto no_present;
 
+	if (is_rsvd_bits_set(vcpu->arch.mmu, gpte, PT_PAGE_TABLE_LEVEL))
+		goto no_present;
+
 	return false;
 
 no_present:
-- 
2.24.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 2/2] KVM: x86/mmu: Micro-optimize nEPT's bad memptype/XWR checks
  2020-01-09 23:06 [PATCH v2 0/2] KVM: x86/mmu: Optimize rsvd pte checks Sean Christopherson
  2020-01-09 23:06 ` [PATCH v2 1/2] KVM: x86/mmu: Reorder the reserved bit check in prefetch_invalid_gpte() Sean Christopherson
@ 2020-01-09 23:06 ` Sean Christopherson
  2020-01-10 11:37   ` Vitaly Kuznetsov
  1 sibling, 1 reply; 8+ messages in thread
From: Sean Christopherson @ 2020-01-09 23:06 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, David Laight, Arvind Sankar

Rework the handling of nEPT's bad memtype/XWR checks to micro-optimize
the checks as much as possible.  Move the check to a separate helper,
__is_bad_mt_xwr(), which allows the guest_rsvd_check usage in
paging_tmpl.h to omit the check entirely for paging32/64 (bad_mt_xwr is
always zero for non-nEPT) while retaining the bitwise-OR of the current
code for the shadow_zero_check in walk_shadow_page_get_mmio_spte().

Add a comment for the bitwise-OR usage in the mmio spte walk to avoid
future attempts to "fix" the code, which is what prompted this
optimization in the first place[*].

Opportunistically remove the superfluous '!= 0' and parantheses, and
use BIT_ULL() instead of open coding its equivalent.

The net effect is that code generation is largely unhanged for
walk_shadow_page_get_mmio_spte(), marginally better for
ept_prefetch_invalid_gpte(), and significantly improved for
paging32/64_prefetch_invalid_gpte().

Note, walk_shadow_page_get_mmio_spte() can't use a templated version of
the memtype/XRW as it works on the host's shadow PTEs, e.g. checks that
KVM hasn't borked its EPT tables.  Even if it could be templated, the
benefits of having a single implementation far outweight the few uops
that would be saved for NPT or non-TDP paging, e.g. most compilers
inline it all the way to up kvm_mmu_page_fault().

[*] https://lkml.kernel.org/r/20200108001859.25254-1-sean.j.christopherson@intel.com

Cc: Jim Mattson <jmattson@google.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/mmu/mmu.c         | 26 ++++++++++++++------------
 arch/x86/kvm/mmu/paging_tmpl.h | 19 +++++++++++++++++--
 2 files changed, 31 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7269130ea5e2..2992ff7b42a7 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3968,20 +3968,14 @@ static gpa_t nonpaging_gva_to_gpa_nested(struct kvm_vcpu *vcpu, gpa_t vaddr,
 static bool
 __is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level)
 {
-	int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f;
+	int bit7 = (pte >> 7) & 1;
 
-	return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) |
-		((rsvd_check->bad_mt_xwr & (1ull << low6)) != 0);
+	return pte & rsvd_check->rsvd_bits_mask[bit7][level-1];
 }
 
-static bool is_rsvd_bits_set(struct kvm_mmu *mmu, u64 gpte, int level)
+static bool __is_bad_mt_xwr(struct rsvd_bits_validate *rsvd_check, u64 pte)
 {
-	return __is_rsvd_bits_set(&mmu->guest_rsvd_check, gpte, level);
-}
-
-static bool is_shadow_zero_bits_set(struct kvm_mmu *mmu, u64 spte, int level)
-{
-	return __is_rsvd_bits_set(&mmu->shadow_zero_check, spte, level);
+	return rsvd_check->bad_mt_xwr & BIT_ULL(pte & 0x3f);
 }
 
 static bool mmio_info_in_cache(struct kvm_vcpu *vcpu, u64 addr, bool direct)
@@ -4005,9 +3999,12 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 {
 	struct kvm_shadow_walk_iterator iterator;
 	u64 sptes[PT64_ROOT_MAX_LEVEL], spte = 0ull;
+	struct rsvd_bits_validate *rsvd_check;
 	int root, leaf;
 	bool reserved = false;
 
+	rsvd_check = &vcpu->arch.mmu->shadow_zero_check;
+
 	walk_shadow_page_lockless_begin(vcpu);
 
 	for (shadow_walk_init(&iterator, vcpu, addr),
@@ -4022,8 +4019,13 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 		if (!is_shadow_present_pte(spte))
 			break;
 
-		reserved |= is_shadow_zero_bits_set(vcpu->arch.mmu, spte,
-						    iterator.level);
+		/*
+		 * Use a bitwise-OR instead of a logical-OR to aggregate the
+		 * reserved bit and EPT's invalid memtype/XWR checks to avoid
+		 * adding a Jcc in the loop.
+		 */
+		reserved |= __is_bad_mt_xwr(rsvd_check, spte) |
+			    __is_rsvd_bits_set(rsvd_check, spte, iterator.level);
 	}
 
 	walk_shadow_page_lockless_end(vcpu);
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 1fde6a1c506d..eaa00c4daeb1 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -128,6 +128,21 @@ static inline int FNAME(is_present_gpte)(unsigned long pte)
 #endif
 }
 
+static bool FNAME(is_bad_mt_xwr)(struct rsvd_bits_validate *rsvd_check, u64 gpte)
+{
+#if PTTYPE != PTTYPE_EPT
+	return false;
+#else
+	return __is_bad_mt_xwr(rsvd_check, gpte);
+#endif
+}
+
+static bool FNAME(is_rsvd_bits_set)(struct kvm_mmu *mmu, u64 gpte, int level)
+{
+	return __is_rsvd_bits_set(&mmu->guest_rsvd_check, gpte, level) ||
+	       FNAME(is_bad_mt_xwr)(&mmu->guest_rsvd_check, gpte);
+}
+
 static int FNAME(cmpxchg_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 			       pt_element_t __user *ptep_user, unsigned index,
 			       pt_element_t orig_pte, pt_element_t new_pte)
@@ -183,7 +198,7 @@ static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu,
 	    !(gpte & PT_GUEST_ACCESSED_MASK))
 		goto no_present;
 
-	if (is_rsvd_bits_set(vcpu->arch.mmu, gpte, PT_PAGE_TABLE_LEVEL))
+	if (FNAME(is_rsvd_bits_set)(vcpu->arch.mmu, gpte, PT_PAGE_TABLE_LEVEL))
 		goto no_present;
 
 	return false;
@@ -400,7 +415,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 		if (unlikely(!FNAME(is_present_gpte)(pte)))
 			goto error;
 
-		if (unlikely(is_rsvd_bits_set(mmu, pte, walker->level))) {
+		if (unlikely(FNAME(is_rsvd_bits_set)(mmu, pte, walker->level))) {
 			errcode = PFERR_RSVD_MASK | PFERR_PRESENT_MASK;
 			goto error;
 		}
-- 
2.24.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] KVM: x86/mmu: Reorder the reserved bit check in prefetch_invalid_gpte()
  2020-01-09 23:06 ` [PATCH v2 1/2] KVM: x86/mmu: Reorder the reserved bit check in prefetch_invalid_gpte() Sean Christopherson
@ 2020-01-10 11:37   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 8+ messages in thread
From: Vitaly Kuznetsov @ 2020-01-10 11:37 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, David Laight, Arvind Sankar

Sean Christopherson <sean.j.christopherson@intel.com> writes:

> Move the !PRESENT and !ACCESSED checks in FNAME(prefetch_invalid_gpte)
> above the call to is_rsvd_bits_set().  For a well behaved guest, the
> !PRESENT and !ACCESSED are far more likely to evaluate true than the
> reserved bit checks, and they do not require additional memory accesses.
>
> Before:
>  Dump of assembler code for function paging32_prefetch_invalid_gpte:
>    0x0000000000044240 <+0>:     callq  0x44245 <paging32_prefetch_invalid_gpte+5>
>    0x0000000000044245 <+5>:     mov    %rcx,%rax
>    0x0000000000044248 <+8>:     shr    $0x7,%rax
>    0x000000000004424c <+12>:    and    $0x1,%eax
>    0x000000000004424f <+15>:    lea    0x0(,%rax,4),%r8
>    0x0000000000044257 <+23>:    add    %r8,%rax
>    0x000000000004425a <+26>:    mov    %rcx,%r8
>    0x000000000004425d <+29>:    and    0x120(%rsi,%rax,8),%r8
>    0x0000000000044265 <+37>:    mov    0x170(%rsi),%rax
>    0x000000000004426c <+44>:    shr    %cl,%rax
>    0x000000000004426f <+47>:    and    $0x1,%eax
>    0x0000000000044272 <+50>:    or     %rax,%r8
>    0x0000000000044275 <+53>:    jne    0x4427c <paging32_prefetch_invalid_gpte+60>
>    0x0000000000044277 <+55>:    test   $0x1,%cl
>    0x000000000004427a <+58>:    jne    0x4428a <paging32_prefetch_invalid_gpte+74>
>    0x000000000004427c <+60>:    mov    %rdx,%rsi
>    0x000000000004427f <+63>:    callq  0x44080 <drop_spte>
>    0x0000000000044284 <+68>:    mov    $0x1,%eax
>    0x0000000000044289 <+73>:    retq
>    0x000000000004428a <+74>:    xor    %eax,%eax
>    0x000000000004428c <+76>:    and    $0x20,%ecx
>    0x000000000004428f <+79>:    jne    0x44289 <paging32_prefetch_invalid_gpte+73>
>    0x0000000000044291 <+81>:    mov    %rdx,%rsi
>    0x0000000000044294 <+84>:    callq  0x44080 <drop_spte>
>    0x0000000000044299 <+89>:    mov    $0x1,%eax
>    0x000000000004429e <+94>:    jmp    0x44289 <paging32_prefetch_invalid_gpte+73>
>  End of assembler dump.
>
> After:
>  Dump of assembler code for function paging32_prefetch_invalid_gpte:
>    0x0000000000044240 <+0>:     callq  0x44245 <paging32_prefetch_invalid_gpte+5>
>    0x0000000000044245 <+5>:     test   $0x1,%cl
>    0x0000000000044248 <+8>:     je     0x4424f <paging32_prefetch_invalid_gpte+15>
>    0x000000000004424a <+10>:    test   $0x20,%cl
>    0x000000000004424d <+13>:    jne    0x4425d <paging32_prefetch_invalid_gpte+29>
>    0x000000000004424f <+15>:    mov    %rdx,%rsi
>    0x0000000000044252 <+18>:    callq  0x44080 <drop_spte>
>    0x0000000000044257 <+23>:    mov    $0x1,%eax
>    0x000000000004425c <+28>:    retq
>    0x000000000004425d <+29>:    mov    %rcx,%rax
>    0x0000000000044260 <+32>:    mov    (%rsi),%rsi
>    0x0000000000044263 <+35>:    shr    $0x7,%rax
>    0x0000000000044267 <+39>:    and    $0x1,%eax
>    0x000000000004426a <+42>:    lea    0x0(,%rax,4),%r8
>    0x0000000000044272 <+50>:    add    %r8,%rax
>    0x0000000000044275 <+53>:    mov    %rcx,%r8
>    0x0000000000044278 <+56>:    and    0x120(%rsi,%rax,8),%r8
>    0x0000000000044280 <+64>:    mov    0x170(%rsi),%rax
>    0x0000000000044287 <+71>:    shr    %cl,%rax
>    0x000000000004428a <+74>:    and    $0x1,%eax
>    0x000000000004428d <+77>:    mov    %rax,%rcx
>    0x0000000000044290 <+80>:    xor    %eax,%eax
>    0x0000000000044292 <+82>:    or     %rcx,%r8
>    0x0000000000044295 <+85>:    je     0x4425c <paging32_prefetch_invalid_gpte+28>
>    0x0000000000044297 <+87>:    mov    %rdx,%rsi
>    0x000000000004429a <+90>:    callq  0x44080 <drop_spte>
>    0x000000000004429f <+95>:    mov    $0x1,%eax
>    0x00000000000442a4 <+100>:   jmp    0x4425c <paging32_prefetch_invalid_gpte+28>
>  End of assembler dump.
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/kvm/mmu/paging_tmpl.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index b53bed3c901c..1fde6a1c506d 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -175,9 +175,6 @@ static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu,
>  				  struct kvm_mmu_page *sp, u64 *spte,
>  				  u64 gpte)
>  {
> -	if (is_rsvd_bits_set(vcpu->arch.mmu, gpte, PT_PAGE_TABLE_LEVEL))
> -		goto no_present;
> -
>  	if (!FNAME(is_present_gpte)(gpte))
>  		goto no_present;
>  
> @@ -186,6 +183,9 @@ static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu,
>  	    !(gpte & PT_GUEST_ACCESSED_MASK))
>  		goto no_present;
>  
> +	if (is_rsvd_bits_set(vcpu->arch.mmu, gpte, PT_PAGE_TABLE_LEVEL))
> +		goto no_present;
> +
>  	return false;
>  
>  no_present:

FWIW,
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] KVM: x86/mmu: Micro-optimize nEPT's bad memptype/XWR checks
  2020-01-09 23:06 ` [PATCH v2 2/2] KVM: x86/mmu: Micro-optimize nEPT's bad memptype/XWR checks Sean Christopherson
@ 2020-01-10 11:37   ` Vitaly Kuznetsov
  2020-01-10 16:04     ` Sean Christopherson
  0 siblings, 1 reply; 8+ messages in thread
From: Vitaly Kuznetsov @ 2020-01-10 11:37 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, David Laight, Arvind Sankar

Sean Christopherson <sean.j.christopherson@intel.com> writes:

> Rework the handling of nEPT's bad memtype/XWR checks to micro-optimize
> the checks as much as possible.  Move the check to a separate helper,
> __is_bad_mt_xwr(), which allows the guest_rsvd_check usage in
> paging_tmpl.h to omit the check entirely for paging32/64 (bad_mt_xwr is
> always zero for non-nEPT) while retaining the bitwise-OR of the current
> code for the shadow_zero_check in walk_shadow_page_get_mmio_spte().
>
> Add a comment for the bitwise-OR usage in the mmio spte walk to avoid
> future attempts to "fix" the code, which is what prompted this
> optimization in the first place[*].
>
> Opportunistically remove the superfluous '!= 0' and parantheses, and
> use BIT_ULL() instead of open coding its equivalent.
>
> The net effect is that code generation is largely unhanged

unchanged? (or who's gonna hang that code? :-) )

> for walk_shadow_page_get_mmio_spte(), marginally better for
> ept_prefetch_invalid_gpte(), and significantly improved for
> paging32/64_prefetch_invalid_gpte().
>
> Note, walk_shadow_page_get_mmio_spte() can't use a templated version of
> the memtype/XRW as it works on the host's shadow PTEs, e.g. checks that
> KVM hasn't borked its EPT tables.  Even if it could be templated, the
> benefits of having a single implementation far outweight the few uops
> that would be saved for NPT or non-TDP paging, e.g. most compilers
> inline it all the way to up kvm_mmu_page_fault().
>
> [*] https://lkml.kernel.org/r/20200108001859.25254-1-sean.j.christopherson@intel.com
>
> Cc: Jim Mattson <jmattson@google.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> Cc: David Laight <David.Laight@ACULAB.COM>
> Cc: Arvind Sankar <nivedita@alum.mit.edu>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/kvm/mmu/mmu.c         | 26 ++++++++++++++------------
>  arch/x86/kvm/mmu/paging_tmpl.h | 19 +++++++++++++++++--
>  2 files changed, 31 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 7269130ea5e2..2992ff7b42a7 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3968,20 +3968,14 @@ static gpa_t nonpaging_gva_to_gpa_nested(struct kvm_vcpu *vcpu, gpa_t vaddr,
>  static bool
>  __is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level)
>  {
> -	int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f;
> +	int bit7 = (pte >> 7) & 1;
>  
> -	return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) |
> -		((rsvd_check->bad_mt_xwr & (1ull << low6)) != 0);
> +	return pte & rsvd_check->rsvd_bits_mask[bit7][level-1];
>  }
>  
> -static bool is_rsvd_bits_set(struct kvm_mmu *mmu, u64 gpte, int level)
> +static bool __is_bad_mt_xwr(struct rsvd_bits_validate *rsvd_check, u64 pte)
>  {
> -	return __is_rsvd_bits_set(&mmu->guest_rsvd_check, gpte, level);
> -}
> -
> -static bool is_shadow_zero_bits_set(struct kvm_mmu *mmu, u64 spte, int level)
> -{
> -	return __is_rsvd_bits_set(&mmu->shadow_zero_check, spte, level);
> +	return rsvd_check->bad_mt_xwr & BIT_ULL(pte & 0x3f);
>  }
>  
>  static bool mmio_info_in_cache(struct kvm_vcpu *vcpu, u64 addr, bool direct)
> @@ -4005,9 +3999,12 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
>  {
>  	struct kvm_shadow_walk_iterator iterator;
>  	u64 sptes[PT64_ROOT_MAX_LEVEL], spte = 0ull;
> +	struct rsvd_bits_validate *rsvd_check;
>  	int root, leaf;
>  	bool reserved = false;
>  
> +	rsvd_check = &vcpu->arch.mmu->shadow_zero_check;
> +
>  	walk_shadow_page_lockless_begin(vcpu);
>  
>  	for (shadow_walk_init(&iterator, vcpu, addr),
> @@ -4022,8 +4019,13 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
>  		if (!is_shadow_present_pte(spte))
>  			break;
>  
> -		reserved |= is_shadow_zero_bits_set(vcpu->arch.mmu, spte,
> -						    iterator.level);
> +		/*
> +		 * Use a bitwise-OR instead of a logical-OR to aggregate the
> +		 * reserved bit and EPT's invalid memtype/XWR checks to avoid
> +		 * adding a Jcc in the loop.
> +		 */
> +		reserved |= __is_bad_mt_xwr(rsvd_check, spte) |
> +			    __is_rsvd_bits_set(rsvd_check, spte, iterator.level);
>  	}
>  
>  	walk_shadow_page_lockless_end(vcpu);
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index 1fde6a1c506d..eaa00c4daeb1 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -128,6 +128,21 @@ static inline int FNAME(is_present_gpte)(unsigned long pte)
>  #endif
>  }
>  
> +static bool FNAME(is_bad_mt_xwr)(struct rsvd_bits_validate *rsvd_check, u64 gpte)
> +{
> +#if PTTYPE != PTTYPE_EPT
> +	return false;
> +#else
> +	return __is_bad_mt_xwr(rsvd_check, gpte);
> +#endif
> +}
> +
> +static bool FNAME(is_rsvd_bits_set)(struct kvm_mmu *mmu, u64 gpte, int level)
> +{
> +	return __is_rsvd_bits_set(&mmu->guest_rsvd_check, gpte, level) ||
> +	       FNAME(is_bad_mt_xwr)(&mmu->guest_rsvd_check, gpte);
> +}
> +

Not sure if it would make sense/difference (well, this is famous KVM
MMU!) but as FNAME(is_bad_mt_xwr)

has only one call site we could've as well merged it, something like:

static bool FNAME(is_rsvd_bits_set)(struct kvm_mmu *mmu, u64 gpte, int level)
{
#if PTTYPE == PTTYPE_EPT
	bool res =  __is_bad_mt_xwr(&mmu->guest_rsvd_check, gpte);
#else
	bool res = false;
#endif
	return __is_rsvd_bits_set(&mmu->guest_rsvd_check, gpte, level) || res;
}

but keeping it in-line with __is_rsvd_bits_set()/__is_bad_mt_xwr() in
mmu.c likely has greater benefits.

>  static int FNAME(cmpxchg_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
>  			       pt_element_t __user *ptep_user, unsigned index,
>  			       pt_element_t orig_pte, pt_element_t new_pte)
> @@ -183,7 +198,7 @@ static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu,
>  	    !(gpte & PT_GUEST_ACCESSED_MASK))
>  		goto no_present;
>  
> -	if (is_rsvd_bits_set(vcpu->arch.mmu, gpte, PT_PAGE_TABLE_LEVEL))
> +	if (FNAME(is_rsvd_bits_set)(vcpu->arch.mmu, gpte, PT_PAGE_TABLE_LEVEL))
>  		goto no_present;
>  
>  	return false;
> @@ -400,7 +415,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
>  		if (unlikely(!FNAME(is_present_gpte)(pte)))
>  			goto error;
>  
> -		if (unlikely(is_rsvd_bits_set(mmu, pte, walker->level))) {
> +		if (unlikely(FNAME(is_rsvd_bits_set)(mmu, pte, walker->level))) {
>  			errcode = PFERR_RSVD_MASK | PFERR_PRESENT_MASK;
>  			goto error;
>  		}

FWIW,

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

I've also smoke-tested with with a Hyper-V guest (just in case) and
nothing seems to be broken.

-- 
Vitaly


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] KVM: x86/mmu: Micro-optimize nEPT's bad memptype/XWR checks
  2020-01-10 11:37   ` Vitaly Kuznetsov
@ 2020-01-10 16:04     ` Sean Christopherson
  2020-01-10 16:13       ` David Laight
  2020-01-15 18:17       ` Paolo Bonzini
  0 siblings, 2 replies; 8+ messages in thread
From: Sean Christopherson @ 2020-01-10 16:04 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, David Laight, Arvind Sankar

On Fri, Jan 10, 2020 at 12:37:33PM +0100, Vitaly Kuznetsov wrote:
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
> > --- a/arch/x86/kvm/mmu/paging_tmpl.h
> > +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> > @@ -128,6 +128,21 @@ static inline int FNAME(is_present_gpte)(unsigned long pte)
> >  #endif
> >  }
> >  
> > +static bool FNAME(is_bad_mt_xwr)(struct rsvd_bits_validate *rsvd_check, u64 gpte)
> > +{
> > +#if PTTYPE != PTTYPE_EPT
> > +	return false;
> > +#else
> > +	return __is_bad_mt_xwr(rsvd_check, gpte);
> > +#endif
> > +}
> > +
> > +static bool FNAME(is_rsvd_bits_set)(struct kvm_mmu *mmu, u64 gpte, int level)
> > +{
> > +	return __is_rsvd_bits_set(&mmu->guest_rsvd_check, gpte, level) ||
> > +	       FNAME(is_bad_mt_xwr)(&mmu->guest_rsvd_check, gpte);
> > +}
> > +
> 
> Not sure if it would make sense/difference (well, this is famous KVM
> MMU!) but as FNAME(is_bad_mt_xwr)
> 
> has only one call site we could've as well merged it, something like:
> 
> static bool FNAME(is_rsvd_bits_set)(struct kvm_mmu *mmu, u64 gpte, int level)
> {
> #if PTTYPE == PTTYPE_EPT
> 	bool res =  __is_bad_mt_xwr(&mmu->guest_rsvd_check, gpte);
> #else
> 	bool res = false;
> #endif
> 	return __is_rsvd_bits_set(&mmu->guest_rsvd_check, gpte, level) || res;
> }
> 
> but keeping it in-line with __is_rsvd_bits_set()/__is_bad_mt_xwr() in
> mmu.c likely has greater benefits.

Ya, I don't love the code, but it was the least awful of the options I
tried, in that it's the most readable without being too obnoxious.


Similar to your suggestion, but it avoids evaluating __is_bad_mt_xwr() if
reserved bits are set, which is admittedly rare.

	return __is_rsvd_bits_set(&mmu->guest_rsvd_check, gpte, level)
#if PTTYPE == PTTYPE_EPT
	       || __is_bad_mt_xwr(&mmu->guest_rsvd_check, gpte)
#endif
	       ;

Or macrofying the call to keep the call site clean, but IMO this obfuscates
things too much.

	return __is_rsvd_bits_set(&mmu->guest_rsvd_check, gpte, level) ||
	       IS_BAD_MT_XWR(&mmu->guest_rsvd_check, gpte);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH v2 2/2] KVM: x86/mmu: Micro-optimize nEPT's bad memptype/XWR checks
  2020-01-10 16:04     ` Sean Christopherson
@ 2020-01-10 16:13       ` David Laight
  2020-01-15 18:17       ` Paolo Bonzini
  1 sibling, 0 replies; 8+ messages in thread
From: David Laight @ 2020-01-10 16:13 UTC (permalink / raw)
  To: Sean Christopherson, Vitaly Kuznetsov
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Arvind Sankar

From: Sean Christopherson <sean.j.christopherson@intel.com>
> Sent: 10 January 2020 16:05
...
> Similar to your suggestion, but it avoids evaluating __is_bad_mt_xwr() if
> reserved bits are set, which is admittedly rare.
> 
> 	return __is_rsvd_bits_set(&mmu->guest_rsvd_check, gpte, level)
> #if PTTYPE == PTTYPE_EPT
> 	       || __is_bad_mt_xwr(&mmu->guest_rsvd_check, gpte)
> #endif
> 	       ;

Or:
	return __is_rsvd_bits_set(&mmu->guest_rsvd_check, gpte, level) ||
		(PTTYPE == PTTYPE_EPT && __is_bad_mt_xwr(&mmu->guest_rsvd_check, gpte));

Relying in the compiler to optimise it away.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] KVM: x86/mmu: Micro-optimize nEPT's bad memptype/XWR checks
  2020-01-10 16:04     ` Sean Christopherson
  2020-01-10 16:13       ` David Laight
@ 2020-01-15 18:17       ` Paolo Bonzini
  1 sibling, 0 replies; 8+ messages in thread
From: Paolo Bonzini @ 2020-01-15 18:17 UTC (permalink / raw)
  To: Sean Christopherson, Vitaly Kuznetsov
  Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel,
	David Laight, Arvind Sankar

On 10/01/20 17:04, Sean Christopherson wrote:
> Ya, I don't love the code, but it was the least awful of the options I
> tried, in that it's the most readable without being too obnoxious.
> 
> 
> Similar to your suggestion, but it avoids evaluating __is_bad_mt_xwr() if
> reserved bits are set, which is admittedly rare.
> 
> 	return __is_rsvd_bits_set(&mmu->guest_rsvd_check, gpte, level)
> #if PTTYPE == PTTYPE_EPT
> 	       || __is_bad_mt_xwr(&mmu->guest_rsvd_check, gpte)
> #endif
> 	       ;
> 
> Or macrofying the call to keep the call site clean, but IMO this obfuscates
> things too much.
> 
> 	return __is_rsvd_bits_set(&mmu->guest_rsvd_check, gpte, level) ||
> 	       IS_BAD_MT_XWR(&mmu->guest_rsvd_check, gpte);

I think what you posted is the best (David's comes second).

Queued, thanks.

Paolo


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, back to index

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-09 23:06 [PATCH v2 0/2] KVM: x86/mmu: Optimize rsvd pte checks Sean Christopherson
2020-01-09 23:06 ` [PATCH v2 1/2] KVM: x86/mmu: Reorder the reserved bit check in prefetch_invalid_gpte() Sean Christopherson
2020-01-10 11:37   ` Vitaly Kuznetsov
2020-01-09 23:06 ` [PATCH v2 2/2] KVM: x86/mmu: Micro-optimize nEPT's bad memptype/XWR checks Sean Christopherson
2020-01-10 11:37   ` Vitaly Kuznetsov
2020-01-10 16:04     ` Sean Christopherson
2020-01-10 16:13       ` David Laight
2020-01-15 18:17       ` Paolo Bonzini

KVM Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/kvm/0 kvm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 kvm kvm/ https://lore.kernel.org/kvm \
		kvm@vger.kernel.org
	public-inbox-index kvm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.kvm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git