linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/23] KVM: MMU: MMU role refactoring
@ 2022-02-04 11:56 Paolo Bonzini
  2022-02-04 11:56 ` [PATCH 01/23] KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask Paolo Bonzini
                   ` (24 more replies)
  0 siblings, 25 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:56 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

The TDP MMU has a performance regression compared to the legacy
MMU when CR0 changes often.  This was reported for the grsecurity
kernel, which uses CR0.WP to implement kernel W^X.  In that case,
each change to CR0.WP unloads the MMU and causes a lot of unnecessary
work.  When running nested, this can even cause the L1 guest to make
hardly any progress, as the L0 hypervisor is overwhelmed by the amount
of MMU work that is needed.

The root cause of the issue is that the "MMU role" in KVM is a mess
that mixes the CPU setup (CR0/CR4/EFER, SMM, guest mode, etc.)
and the shadow page table format.  Whenever something is different
between the MMU and the CPU, it is stored as an extra field in struct
kvm_mmu---and for extra bonus complication, sometimes the same thing
is stored in both the role and an extra field.

So, this is the "no functional change intended" part of the changes
required to fix the performance regression.  It neatly separates the
shadow page table format ("MMU role") from the guest page table
format ("CPU role"), and removes the duplicate fields.  The next
step then is to avoid unloading the MMU as long as the MMU role
stays the same.
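
To make the end state concrete, here is a minimal sketch of what the
init path looks like once both roles exist (simplified from the later
patches; init_mmu_sketch is a made-up name, while cpu_role, mmu_role,
kvm_calc_cpu_role and kvm_calc_tdp_mmu_root_page_role are the names the
series actually introduces):

  /*
   * cpu_role describes how the guest walks its page tables (CR0/CR4/EFER,
   * SMM, guest mode); mmu_role describes the format of KVM's own page
   * tables.  The MMU only needs to be rebuilt when one of them changes.
   */
  static void init_mmu_sketch(struct kvm_vcpu *vcpu,
                              const struct kvm_mmu_role_regs *regs)
  {
          struct kvm_mmu *context = &vcpu->arch.root_mmu;
          union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
          union kvm_mmu_role mmu_role =
                  kvm_calc_tdp_mmu_root_page_role(vcpu, cpu_role);

          if (cpu_role.as_u64 == context->cpu_role.as_u64 &&
              mmu_role.as_u64 == context->mmu_role.as_u64)
                  return;

          context->cpu_role.as_u64 = cpu_role.as_u64;
          context->mmu_role.as_u64 = mmu_role.as_u64;
          /* ... reinitialize callbacks and guest paging metadata ... */
  }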

Please review!

Paolo

Paolo Bonzini (23):
  KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask
  KVM: MMU: nested EPT cannot be used in SMM
  KVM: MMU: remove valid from extended role
  KVM: MMU: constify uses of struct kvm_mmu_role_regs
  KVM: MMU: pull computation of kvm_mmu_role_regs to kvm_init_mmu
  KVM: MMU: load new PGD once nested two-dimensional paging is
    initialized
  KVM: MMU: remove kvm_mmu_calc_root_page_role
  KVM: MMU: rephrase unclear comment
  KVM: MMU: remove "bool base_only" arguments
  KVM: MMU: split cpu_role from mmu_role
  KVM: MMU: do not recompute root level from kvm_mmu_role_regs
  KVM: MMU: remove ept_ad field
  KVM: MMU: remove kvm_calc_shadow_root_page_role_common
  KVM: MMU: cleanup computation of MMU roles for two-dimensional paging
  KVM: MMU: cleanup computation of MMU roles for shadow paging
  KVM: MMU: remove extended bits from mmu_role
  KVM: MMU: remove redundant bits from extended role
  KVM: MMU: fetch shadow EFER.NX from MMU role
  KVM: MMU: simplify and/or inline computation of shadow MMU roles
  KVM: MMU: pull CPU role computation to kvm_init_mmu
  KVM: MMU: store shadow_root_level into mmu_role
  KVM: MMU: use cpu_role for root_level
  KVM: MMU: replace direct_map with mmu_role.direct

 arch/x86/include/asm/kvm_host.h |  13 +-
 arch/x86/kvm/mmu.h              |   2 +-
 arch/x86/kvm/mmu/mmu.c          | 408 ++++++++++++--------------------
 arch/x86/kvm/mmu/mmu_audit.c    |   6 +-
 arch/x86/kvm/mmu/paging_tmpl.h  |  12 +-
 arch/x86/kvm/mmu/tdp_mmu.c      |   4 +-
 arch/x86/kvm/svm/svm.c          |   2 +-
 arch/x86/kvm/vmx/vmx.c          |   2 +-
 arch/x86/kvm/x86.c              |  12 +-
 10 files changed, 178 insertions(+), 284 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH 01/23] KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
@ 2022-02-04 11:56 ` Paolo Bonzini
  2022-02-04 17:59   ` David Matlack
  2022-02-04 11:56 ` [PATCH 02/23] KVM: MMU: nested EPT cannot be used in SMM Paolo Bonzini
                   ` (23 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:56 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

reset_shadow_zero_bits_mask has a very unintuitive way of deciding
whether the shadow pages will use the NX bit.  The function is used in
two cases, shadow paging and shadow NPT; shadow paging needs EFER.NX
forced on unconditionally, while shadow NPT only needs it depending on
L1's setting.

The actual root problem here is that is_efer_nx, despite being part
of the "base" role, only matches the format of the shadow pages in the
NPT case.  For now, just remove the ugly variable initialization and move
the call to reset_shadow_zero_bits_mask out of shadow_mmu_init_context.
The parameter can then be removed after the root problem in the role
is fixed.
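
In other words, after this patch the two callers spell out the decision
explicitly; condensed from the hunks below:

  /* kvm_init_shadow_mmu: shadow paging can always use NX, since KVM
   * forces EFER.NX=1 when TDP is disabled.
   */
  reset_shadow_zero_bits_mask(vcpu, context, true);

  /* kvm_init_shadow_npt_mmu: follow L1's EFER.NX. */
  reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));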

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 296f8723f9ae..9424ae90f1ef 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4410,18 +4410,9 @@ static inline u64 reserved_hpa_bits(void)
  * follow the features in guest.
  */
 static void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
-					struct kvm_mmu *context)
+					struct kvm_mmu *context,
+					bool uses_nx)
 {
-	/*
-	 * KVM uses NX when TDP is disabled to handle a variety of scenarios,
-	 * notably for huge SPTEs if iTLB multi-hit mitigation is enabled and
-	 * to generate correct permissions for CR0.WP=0/CR4.SMEP=1/EFER.NX=0.
-	 * The iTLB multi-hit workaround can be toggled at any time, so assume
-	 * NX can be used by any non-nested shadow MMU to avoid having to reset
-	 * MMU contexts.  Note, KVM forces EFER.NX=1 when TDP is disabled.
-	 */
-	bool uses_nx = is_efer_nx(context) || !tdp_enabled;
-
 	/* @amd adds a check on bit of SPTEs, which KVM shouldn't use anyways. */
 	bool is_amd = true;
 	/* KVM doesn't use 2-level page tables for the shadow MMU. */
@@ -4829,8 +4820,6 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
 
 	reset_guest_paging_metadata(vcpu, context);
 	context->shadow_root_level = new_role.base.level;
-
-	reset_shadow_zero_bits_mask(vcpu, context);
 }
 
 static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
@@ -4841,6 +4830,16 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
 		kvm_calc_shadow_mmu_root_page_role(vcpu, regs, false);
 
 	shadow_mmu_init_context(vcpu, context, regs, new_role);
+
+	/*
+	 * KVM uses NX when TDP is disabled to handle a variety of scenarios,
+	 * notably for huge SPTEs if iTLB multi-hit mitigation is enabled and
+	 * to generate correct permissions for CR0.WP=0/CR4.SMEP=1/EFER.NX=0.
+	 * The iTLB multi-hit workaround can be toggled at any time, so assume
+	 * NX can be used by any non-nested shadow MMU to avoid having to reset
+	 * MMU contexts.  Note, KVM forces EFER.NX=1 when TDP is disabled.
+	 */
+	reset_shadow_zero_bits_mask(vcpu, context, true);
 }
 
 static union kvm_mmu_role
@@ -4872,6 +4871,7 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
 	__kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
 
 	shadow_mmu_init_context(vcpu, context, &regs, new_role);
+	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_npt_mmu);
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 02/23] KVM: MMU: nested EPT cannot be used in SMM
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
  2022-02-04 11:56 ` [PATCH 01/23] KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask Paolo Bonzini
@ 2022-02-04 11:56 ` Paolo Bonzini
  2022-02-04 18:16   ` David Matlack
  2022-02-04 11:56 ` [PATCH 03/23] KVM: MMU: remove valid from extended role Paolo Bonzini
                   ` (22 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:56 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

The role.base.smm flag is always zero, so do not bother copying it
over from vcpu->arch.root_mmu.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 9424ae90f1ef..b0065ae3cea8 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4881,9 +4881,6 @@ kvm_calc_shadow_ept_root_page_role(struct kvm_vcpu *vcpu, bool accessed_dirty,
 {
 	union kvm_mmu_role role = {0};
 
-	/* SMM flag is inherited from root_mmu */
-	role.base.smm = vcpu->arch.root_mmu.mmu_role.base.smm;
-
 	role.base.level = level;
 	role.base.has_4_byte_gpte = false;
 	role.base.direct = false;
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 03/23] KVM: MMU: remove valid from extended role
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
  2022-02-04 11:56 ` [PATCH 01/23] KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask Paolo Bonzini
  2022-02-04 11:56 ` [PATCH 02/23] KVM: MMU: nested EPT cannot be used in SMM Paolo Bonzini
@ 2022-02-04 11:56 ` Paolo Bonzini
  2022-02-04 18:32   ` David Matlack
  2022-02-09 22:54   ` Sean Christopherson
  2022-02-04 11:56 ` [PATCH 04/23] KVM: MMU: constify uses of struct kvm_mmu_role_regs Paolo Bonzini
                   ` (21 subsequent siblings)
  24 siblings, 2 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:56 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

The level field of the MMU role can act as a marker for validity
instead: it is guaranteed to be nonzero, so a zero value means the role
is invalid and the MMU properties will be computed again.
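
As a sketch of the invariant this relies on (mmu_role_is_valid is a
made-up helper for illustration, not something added by the patch):

  static bool mmu_role_is_valid(union kvm_mmu_role role)
  {
          /*
           * Every role computed for an MMU that is in use has a nonzero
           * base.level, so a role whose level has been zeroed can never
           * match a freshly computed one and kvm_init_mmu() reconfigures
           * the MMU instead of taking the "role unchanged" early return.
           */
          return role.base.level != 0;
  }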

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/kvm_host.h | 4 +---
 arch/x86/kvm/mmu/mmu.c          | 9 +++------
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e7e5bd9a984d..4ec7d1e3aa36 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -342,8 +342,7 @@ union kvm_mmu_page_role {
  * kvm_mmu_extended_role complements kvm_mmu_page_role, tracking properties
  * relevant to the current MMU configuration.   When loading CR0, CR4, or EFER,
  * including on nested transitions, if nothing in the full role changes then
- * MMU re-configuration can be skipped. @valid bit is set on first usage so we
- * don't treat all-zero structure as valid data.
+ * MMU re-configuration can be skipped.
  *
  * The properties that are tracked in the extended role but not the page role
  * are for things that either (a) do not affect the validity of the shadow page
@@ -360,7 +359,6 @@ union kvm_mmu_page_role {
 union kvm_mmu_extended_role {
 	u32 word;
 	struct {
-		unsigned int valid:1;
 		unsigned int execonly:1;
 		unsigned int cr0_pg:1;
 		unsigned int cr4_pae:1;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b0065ae3cea8..0039b2f21286 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4683,8 +4683,6 @@ static union kvm_mmu_extended_role kvm_calc_mmu_role_ext(struct kvm_vcpu *vcpu,
 		ext.efer_lma = ____is_efer_lma(regs);
 	}
 
-	ext.valid = 1;
-
 	return ext;
 }
 
@@ -4891,7 +4889,6 @@ kvm_calc_shadow_ept_root_page_role(struct kvm_vcpu *vcpu, bool accessed_dirty,
 	/* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. */
 	role.ext.word = 0;
 	role.ext.execonly = execonly;
-	role.ext.valid = 1;
 
 	return role;
 }
@@ -5039,9 +5036,9 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * problem is swept under the rug; KVM's CPUID API is horrific and
 	 * it's all but impossible to solve it without introducing a new API.
 	 */
-	vcpu->arch.root_mmu.mmu_role.ext.valid = 0;
-	vcpu->arch.guest_mmu.mmu_role.ext.valid = 0;
-	vcpu->arch.nested_mmu.mmu_role.ext.valid = 0;
+	vcpu->arch.root_mmu.mmu_role.base.level = 0;
+	vcpu->arch.guest_mmu.mmu_role.base.level = 0;
+	vcpu->arch.nested_mmu.mmu_role.base.level = 0;
 	kvm_mmu_reset_context(vcpu);
 
 	/*
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 04/23] KVM: MMU: constify uses of struct kvm_mmu_role_regs
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (2 preceding siblings ...)
  2022-02-04 11:56 ` [PATCH 03/23] KVM: MMU: remove valid from extended role Paolo Bonzini
@ 2022-02-04 11:56 ` Paolo Bonzini
  2022-02-04 18:41   ` David Matlack
  2022-02-09 22:57   ` Sean Christopherson
  2022-02-04 11:57 ` [PATCH 05/23] KVM: MMU: pull computation of kvm_mmu_role_regs to kvm_init_mmu Paolo Bonzini
                   ` (20 subsequent siblings)
  24 siblings, 2 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:56 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

struct kvm_mmu_role_regs is computed just once and then accessed.  Use
const to enforce this.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 0039b2f21286..3add9d8b0630 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -208,7 +208,7 @@ struct kvm_mmu_role_regs {
  * the single source of truth for the MMU's state.
  */
 #define BUILD_MMU_ROLE_REGS_ACCESSOR(reg, name, flag)			\
-static inline bool __maybe_unused ____is_##reg##_##name(struct kvm_mmu_role_regs *regs)\
+static inline bool __maybe_unused ____is_##reg##_##name(const struct kvm_mmu_role_regs *regs)\
 {									\
 	return !!(regs->reg & flag);					\
 }
@@ -255,7 +255,7 @@ static struct kvm_mmu_role_regs vcpu_to_role_regs(struct kvm_vcpu *vcpu)
 	return regs;
 }
 
-static int role_regs_to_root_level(struct kvm_mmu_role_regs *regs)
+static int role_regs_to_root_level(const struct kvm_mmu_role_regs *regs)
 {
 	if (!____is_cr0_pg(regs))
 		return 0;
@@ -4666,7 +4666,7 @@ static void paging32_init_context(struct kvm_mmu *context)
 }
 
 static union kvm_mmu_extended_role kvm_calc_mmu_role_ext(struct kvm_vcpu *vcpu,
-							 struct kvm_mmu_role_regs *regs)
+							 const struct kvm_mmu_role_regs *regs)
 {
 	union kvm_mmu_extended_role ext = {0};
 
@@ -4687,7 +4687,7 @@ static union kvm_mmu_extended_role kvm_calc_mmu_role_ext(struct kvm_vcpu *vcpu,
 }
 
 static union kvm_mmu_role kvm_calc_mmu_role_common(struct kvm_vcpu *vcpu,
-						   struct kvm_mmu_role_regs *regs,
+						   const struct kvm_mmu_role_regs *regs,
 						   bool base_only)
 {
 	union kvm_mmu_role role = {0};
@@ -4723,7 +4723,8 @@ static inline int kvm_mmu_get_tdp_level(struct kvm_vcpu *vcpu)
 
 static union kvm_mmu_role
 kvm_calc_tdp_mmu_root_page_role(struct kvm_vcpu *vcpu,
-				struct kvm_mmu_role_regs *regs, bool base_only)
+				const struct kvm_mmu_role_regs *regs,
+				bool base_only)
 {
 	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, regs, base_only);
 
@@ -4769,7 +4770,8 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
 
 static union kvm_mmu_role
 kvm_calc_shadow_root_page_role_common(struct kvm_vcpu *vcpu,
-				      struct kvm_mmu_role_regs *regs, bool base_only)
+				      const struct kvm_mmu_role_regs *regs,
+				      bool base_only)
 {
 	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, regs, base_only);
 
@@ -4782,7 +4784,8 @@ kvm_calc_shadow_root_page_role_common(struct kvm_vcpu *vcpu,
 
 static union kvm_mmu_role
 kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
-				   struct kvm_mmu_role_regs *regs, bool base_only)
+				   const struct kvm_mmu_role_regs *regs,
+				   bool base_only)
 {
 	union kvm_mmu_role role =
 		kvm_calc_shadow_root_page_role_common(vcpu, regs, base_only);
@@ -4940,7 +4943,7 @@ static void init_kvm_softmmu(struct kvm_vcpu *vcpu)
 }
 
 static union kvm_mmu_role
-kvm_calc_nested_mmu_role(struct kvm_vcpu *vcpu, struct kvm_mmu_role_regs *regs)
+kvm_calc_nested_mmu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
 {
 	union kvm_mmu_role role;
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 05/23] KVM: MMU: pull computation of kvm_mmu_role_regs to kvm_init_mmu
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (3 preceding siblings ...)
  2022-02-04 11:56 ` [PATCH 04/23] KVM: MMU: constify uses of struct kvm_mmu_role_regs Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-04 18:45   ` David Matlack
  2022-02-04 11:57 ` [PATCH 06/23] KVM: MMU: load new PGD once nested two-dimensional paging is initialized Paolo Bonzini
                   ` (19 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

The init_kvm_*mmu functions, with the exception of shadow NPT,
do not need to know the full values of CR0/CR4/EFER; they only
need to know the bits that make up the "role".  This cleanup
however will take quite a few incremental steps.  As a start,
pull the common computation of the struct kvm_mmu_role_regs
into their caller: all of them extract the struct from the vcpu
as the very first step.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3add9d8b0630..577e70509510 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4736,12 +4736,12 @@ kvm_calc_tdp_mmu_root_page_role(struct kvm_vcpu *vcpu,
 	return role;
 }
 
-static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
+static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
+			     const struct kvm_mmu_role_regs *regs)
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
-	struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
 	union kvm_mmu_role new_role =
-		kvm_calc_tdp_mmu_root_page_role(vcpu, &regs, false);
+		kvm_calc_tdp_mmu_root_page_role(vcpu, regs, false);
 
 	if (new_role.as_u64 == context->mmu_role.as_u64)
 		return;
@@ -4755,7 +4755,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
 	context->get_guest_pgd = get_cr3;
 	context->get_pdptr = kvm_pdptr_read;
 	context->inject_page_fault = kvm_inject_page_fault;
-	context->root_level = role_regs_to_root_level(&regs);
+	context->root_level = role_regs_to_root_level(regs);
 
 	if (!is_cr0_pg(context))
 		context->gva_to_gpa = nonpaging_gva_to_gpa;
@@ -4803,7 +4803,7 @@ kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
 }
 
 static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *context,
-				    struct kvm_mmu_role_regs *regs,
+				    const struct kvm_mmu_role_regs *regs,
 				    union kvm_mmu_role new_role)
 {
 	if (new_role.as_u64 == context->mmu_role.as_u64)
@@ -4824,7 +4824,7 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
 }
 
 static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
-				struct kvm_mmu_role_regs *regs)
+				const struct kvm_mmu_role_regs *regs)
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
 	union kvm_mmu_role new_role =
@@ -4845,7 +4845,7 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
 
 static union kvm_mmu_role
 kvm_calc_shadow_npt_root_page_role(struct kvm_vcpu *vcpu,
-				   struct kvm_mmu_role_regs *regs)
+				   const struct kvm_mmu_role_regs *regs)
 {
 	union kvm_mmu_role role =
 		kvm_calc_shadow_root_page_role_common(vcpu, regs, false);
@@ -4930,12 +4930,12 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu);
 
-static void init_kvm_softmmu(struct kvm_vcpu *vcpu)
+static void init_kvm_softmmu(struct kvm_vcpu *vcpu,
+			     const struct kvm_mmu_role_regs *regs)
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
-	struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
 
-	kvm_init_shadow_mmu(vcpu, &regs);
+	kvm_init_shadow_mmu(vcpu, regs);
 
 	context->get_guest_pgd     = get_cr3;
 	context->get_pdptr         = kvm_pdptr_read;
@@ -4959,10 +4959,9 @@ kvm_calc_nested_mmu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *
 	return role;
 }
 
-static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu)
+static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
 {
-	struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
-	union kvm_mmu_role new_role = kvm_calc_nested_mmu_role(vcpu, &regs);
+	union kvm_mmu_role new_role = kvm_calc_nested_mmu_role(vcpu, regs);
 	struct kvm_mmu *g_context = &vcpu->arch.nested_mmu;
 
 	if (new_role.as_u64 == g_context->mmu_role.as_u64)
@@ -5002,12 +5001,14 @@ static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu)
 
 void kvm_init_mmu(struct kvm_vcpu *vcpu)
 {
+	struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
+
 	if (mmu_is_nested(vcpu))
-		init_kvm_nested_mmu(vcpu);
+		init_kvm_nested_mmu(vcpu, &regs);
 	else if (tdp_enabled)
-		init_kvm_tdp_mmu(vcpu);
+		init_kvm_tdp_mmu(vcpu, &regs);
 	else
-		init_kvm_softmmu(vcpu);
+		init_kvm_softmmu(vcpu, &regs);
 }
 EXPORT_SYMBOL_GPL(kvm_init_mmu);
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 06/23] KVM: MMU: load new PGD once nested two-dimensional paging is initialized
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (4 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 05/23] KVM: MMU: pull computation of kvm_mmu_role_regs to kvm_init_mmu Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-04 19:18   ` David Matlack
  2022-02-04 11:57 ` [PATCH 07/23] KVM: MMU: remove kvm_mmu_calc_root_page_role Paolo Bonzini
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

__kvm_mmu_new_pgd looks at the MMU's root_level and shadow_root_level
via fast_pgd_switch.  It makes no sense to call it before updating
these fields, even though it was done like that ever since nested
VMX grew the ability to use fast CR3 switch (commit 50c28f21d045,
"kvm: x86: Use fast CR3 switch for nested VMX").

Pull it to the end of the initialization of the shadow nested MMUs.
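
Conceptually the ordering changes as follows (condensed sketch of the
nested NPT path; the shadow EPT path gets the same treatment):

  /* Before: fast_pgd_switch() reads root_level/shadow_root_level that
   * still describe the previous MMU configuration.
   */
  __kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
  shadow_mmu_init_context(vcpu, context, &regs, new_role);

  /* After: initialize the MMU first, then switch the PGD. */
  shadow_mmu_init_context(vcpu, context, &regs, new_role);
  __kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);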

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 41 +++++++++++++++++++----------------------
 1 file changed, 19 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 577e70509510..b8ab16323629 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4869,10 +4869,9 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
 
 	new_role = kvm_calc_shadow_npt_root_page_role(vcpu, &regs);
 
-	__kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
-
 	shadow_mmu_init_context(vcpu, context, &regs, new_role);
 	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
+	__kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_npt_mmu);
 
@@ -4906,27 +4905,25 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 		kvm_calc_shadow_ept_root_page_role(vcpu, accessed_dirty,
 						   execonly, level);
 
-	__kvm_mmu_new_pgd(vcpu, new_eptp, new_role.base);
-
-	if (new_role.as_u64 == context->mmu_role.as_u64)
-		return;
-
-	context->mmu_role.as_u64 = new_role.as_u64;
-
-	context->shadow_root_level = level;
-
-	context->ept_ad = accessed_dirty;
-	context->page_fault = ept_page_fault;
-	context->gva_to_gpa = ept_gva_to_gpa;
-	context->sync_page = ept_sync_page;
-	context->invlpg = ept_invlpg;
-	context->root_level = level;
-	context->direct_map = false;
+	if (new_role.as_u64 != context->mmu_role.as_u64) {
+		context->mmu_role.as_u64 = new_role.as_u64;
+
+		context->shadow_root_level = level;
+
+		context->ept_ad = accessed_dirty;
+		context->page_fault = ept_page_fault;
+		context->gva_to_gpa = ept_gva_to_gpa;
+		context->sync_page = ept_sync_page;
+		context->invlpg = ept_invlpg;
+		context->root_level = level;
+		context->direct_map = false;
+		update_permission_bitmask(context, true);
+		context->pkru_mask = 0;
+		reset_rsvds_bits_mask_ept(vcpu, context, execonly, huge_page_level);
+		reset_ept_shadow_zero_bits_mask(context, execonly);
+	}
 
-	update_permission_bitmask(context, true);
-	context->pkru_mask = 0;
-	reset_rsvds_bits_mask_ept(vcpu, context, execonly, huge_page_level);
-	reset_ept_shadow_zero_bits_mask(context, execonly);
+	__kvm_mmu_new_pgd(vcpu, new_eptp, new_role.base);
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu);
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 07/23] KVM: MMU: remove kvm_mmu_calc_root_page_role
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (5 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 06/23] KVM: MMU: load new PGD once nested two-dimensional paging is initialized Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-04 19:32   ` David Matlack
  2022-02-10  0:47   ` Sean Christopherson
  2022-02-04 11:57 ` [PATCH 08/23] KVM: MMU: rephrase unclear comment Paolo Bonzini
                   ` (17 subsequent siblings)
  24 siblings, 2 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

Since the guest PGD is now loaded after the MMU has been set up
completely, the desired role for a cache hit is simply the current
mmu_role.  There is no need to compute it again, so __kvm_mmu_new_pgd
can be folded into kvm_mmu_new_pgd.

As an aside, the !tdp_enabled case in the function was dead code,
and that also gets mopped up as a side effect.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 29 ++++-------------------------
 1 file changed, 4 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b8ab16323629..42475e4c2a48 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -190,8 +190,6 @@ struct kmem_cache *mmu_page_header_cache;
 static struct percpu_counter kvm_total_used_mmu_pages;
 
 static void mmu_spte_set(u64 *sptep, u64 spte);
-static union kvm_mmu_page_role
-kvm_mmu_calc_root_page_role(struct kvm_vcpu *vcpu);
 
 struct kvm_mmu_role_regs {
 	const unsigned long cr0;
@@ -4159,9 +4157,9 @@ static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t new_pgd,
 	return false;
 }
 
-static void __kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd,
-			      union kvm_mmu_page_role new_role)
+void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd)
 {
+	union kvm_mmu_page_role new_role = vcpu->arch.mmu->mmu_role.base;
 	if (!fast_pgd_switch(vcpu, new_pgd, new_role)) {
 		kvm_mmu_free_roots(vcpu, vcpu->arch.mmu, KVM_MMU_ROOT_CURRENT);
 		return;
@@ -4196,11 +4194,6 @@ static void __kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd,
 		__clear_sp_write_flooding_count(
 				to_shadow_page(vcpu->arch.mmu->root_hpa));
 }
-
-void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd)
-{
-	__kvm_mmu_new_pgd(vcpu, new_pgd, kvm_mmu_calc_root_page_role(vcpu));
-}
 EXPORT_SYMBOL_GPL(kvm_mmu_new_pgd);
 
 static unsigned long get_cr3(struct kvm_vcpu *vcpu)
@@ -4871,7 +4864,7 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
 
 	shadow_mmu_init_context(vcpu, context, &regs, new_role);
 	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
-	__kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
+	kvm_mmu_new_pgd(vcpu, nested_cr3);
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_npt_mmu);
 
@@ -4923,7 +4916,7 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 		reset_ept_shadow_zero_bits_mask(context, execonly);
 	}
 
-	__kvm_mmu_new_pgd(vcpu, new_eptp, new_role.base);
+	kvm_mmu_new_pgd(vcpu, new_eptp);
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu);
 
@@ -5009,20 +5002,6 @@ void kvm_init_mmu(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_init_mmu);
 
-static union kvm_mmu_page_role
-kvm_mmu_calc_root_page_role(struct kvm_vcpu *vcpu)
-{
-	struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
-	union kvm_mmu_role role;
-
-	if (tdp_enabled)
-		role = kvm_calc_tdp_mmu_root_page_role(vcpu, &regs, true);
-	else
-		role = kvm_calc_shadow_mmu_root_page_role(vcpu, &regs, true);
-
-	return role.base;
-}
-
 void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 {
 	/*
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 08/23] KVM: MMU: rephrase unclear comment
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (6 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 07/23] KVM: MMU: remove kvm_mmu_calc_root_page_role Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-04 19:38   ` David Matlack
  2022-02-04 11:57 ` [PATCH 09/23] KVM: MMU: remove "bool base_only" arguments Paolo Bonzini
                   ` (16 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

If accessed bits are not supported, there simply isn't any distinction
between accessed and non-accessed gPTEs, so the comment does not make
much sense.  Rephrase it in terms of what happens if accessed bits
*are* supported.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/paging_tmpl.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 5b5bdac97c7b..6bb9a377bf89 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -193,7 +193,7 @@ static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu,
 	if (!FNAME(is_present_gpte)(gpte))
 		goto no_present;
 
-	/* if accessed bit is not supported prefetch non accessed gpte */
+	/* if accessed bit is supported, prefetch only accessed gpte */
 	if (PT_HAVE_ACCESSED_DIRTY(vcpu->arch.mmu) &&
 	    !(gpte & PT_GUEST_ACCESSED_MASK))
 		goto no_present;
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 09/23] KVM: MMU: remove "bool base_only" arguments
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (7 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 08/23] KVM: MMU: rephrase unclear comment Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-04 19:41   ` David Matlack
  2022-02-04 11:57 ` [PATCH 10/23] KVM: MMU: split cpu_role from mmu_role Paolo Bonzini
                   ` (15 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

The argument is always false now that kvm_mmu_calc_root_page_role has
been removed.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 63 +++++++++++++++---------------------------
 1 file changed, 22 insertions(+), 41 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 42475e4c2a48..dd69cfc8c4f6 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4658,46 +4658,30 @@ static void paging32_init_context(struct kvm_mmu *context)
 	context->direct_map = false;
 }
 
-static union kvm_mmu_extended_role kvm_calc_mmu_role_ext(struct kvm_vcpu *vcpu,
-							 const struct kvm_mmu_role_regs *regs)
-{
-	union kvm_mmu_extended_role ext = {0};
-
-	if (____is_cr0_pg(regs)) {
-		ext.cr0_pg = 1;
-		ext.cr4_pae = ____is_cr4_pae(regs);
-		ext.cr4_smep = ____is_cr4_smep(regs);
-		ext.cr4_smap = ____is_cr4_smap(regs);
-		ext.cr4_pse = ____is_cr4_pse(regs);
-
-		/* PKEY and LA57 are active iff long mode is active. */
-		ext.cr4_pke = ____is_efer_lma(regs) && ____is_cr4_pke(regs);
-		ext.cr4_la57 = ____is_efer_lma(regs) && ____is_cr4_la57(regs);
-		ext.efer_lma = ____is_efer_lma(regs);
-	}
-
-	return ext;
-}
-
 static union kvm_mmu_role kvm_calc_mmu_role_common(struct kvm_vcpu *vcpu,
-						   const struct kvm_mmu_role_regs *regs,
-						   bool base_only)
+						   const struct kvm_mmu_role_regs *regs)
 {
 	union kvm_mmu_role role = {0};
 
 	role.base.access = ACC_ALL;
 	if (____is_cr0_pg(regs)) {
+		role.ext.cr0_pg = 1;
 		role.base.efer_nx = ____is_efer_nx(regs);
 		role.base.cr0_wp = ____is_cr0_wp(regs);
+
+		role.ext.cr4_pae = ____is_cr4_pae(regs);
+		role.ext.cr4_smep = ____is_cr4_smep(regs);
+		role.ext.cr4_smap = ____is_cr4_smap(regs);
+		role.ext.cr4_pse = ____is_cr4_pse(regs);
+
+		/* PKEY and LA57 are active iff long mode is active. */
+		role.ext.cr4_pke = ____is_efer_lma(regs) && ____is_cr4_pke(regs);
+		role.ext.cr4_la57 = ____is_efer_lma(regs) && ____is_cr4_la57(regs);
+		role.ext.efer_lma = ____is_efer_lma(regs);
 	}
 	role.base.smm = is_smm(vcpu);
 	role.base.guest_mode = is_guest_mode(vcpu);
 
-	if (base_only)
-		return role;
-
-	role.ext = kvm_calc_mmu_role_ext(vcpu, regs);
-
 	return role;
 }
 
@@ -4716,10 +4700,9 @@ static inline int kvm_mmu_get_tdp_level(struct kvm_vcpu *vcpu)
 
 static union kvm_mmu_role
 kvm_calc_tdp_mmu_root_page_role(struct kvm_vcpu *vcpu,
-				const struct kvm_mmu_role_regs *regs,
-				bool base_only)
+				const struct kvm_mmu_role_regs *regs)
 {
-	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, regs, base_only);
+	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, regs);
 
 	role.base.ad_disabled = (shadow_accessed_mask == 0);
 	role.base.level = kvm_mmu_get_tdp_level(vcpu);
@@ -4734,7 +4717,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
 	union kvm_mmu_role new_role =
-		kvm_calc_tdp_mmu_root_page_role(vcpu, regs, false);
+		kvm_calc_tdp_mmu_root_page_role(vcpu, regs);
 
 	if (new_role.as_u64 == context->mmu_role.as_u64)
 		return;
@@ -4763,10 +4746,9 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
 
 static union kvm_mmu_role
 kvm_calc_shadow_root_page_role_common(struct kvm_vcpu *vcpu,
-				      const struct kvm_mmu_role_regs *regs,
-				      bool base_only)
+				      const struct kvm_mmu_role_regs *regs)
 {
-	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, regs, base_only);
+	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, regs);
 
 	role.base.smep_andnot_wp = role.ext.cr4_smep && !____is_cr0_wp(regs);
 	role.base.smap_andnot_wp = role.ext.cr4_smap && !____is_cr0_wp(regs);
@@ -4777,11 +4759,10 @@ kvm_calc_shadow_root_page_role_common(struct kvm_vcpu *vcpu,
 
 static union kvm_mmu_role
 kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
-				   const struct kvm_mmu_role_regs *regs,
-				   bool base_only)
+				   const struct kvm_mmu_role_regs *regs)
 {
 	union kvm_mmu_role role =
-		kvm_calc_shadow_root_page_role_common(vcpu, regs, base_only);
+		kvm_calc_shadow_root_page_role_common(vcpu, regs);
 
 	role.base.direct = !____is_cr0_pg(regs);
 
@@ -4821,7 +4802,7 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
 	union kvm_mmu_role new_role =
-		kvm_calc_shadow_mmu_root_page_role(vcpu, regs, false);
+		kvm_calc_shadow_mmu_root_page_role(vcpu, regs);
 
 	shadow_mmu_init_context(vcpu, context, regs, new_role);
 
@@ -4841,7 +4822,7 @@ kvm_calc_shadow_npt_root_page_role(struct kvm_vcpu *vcpu,
 				   const struct kvm_mmu_role_regs *regs)
 {
 	union kvm_mmu_role role =
-		kvm_calc_shadow_root_page_role_common(vcpu, regs, false);
+		kvm_calc_shadow_root_page_role_common(vcpu, regs);
 
 	role.base.direct = false;
 	role.base.level = kvm_mmu_get_tdp_level(vcpu);
@@ -4937,7 +4918,7 @@ kvm_calc_nested_mmu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *
 {
 	union kvm_mmu_role role;
 
-	role = kvm_calc_shadow_root_page_role_common(vcpu, regs, false);
+	role = kvm_calc_shadow_root_page_role_common(vcpu, regs);
 
 	/*
 	 * Nested MMUs are used only for walking L2's gva->gpa, they never have
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 10/23] KVM: MMU: split cpu_role from mmu_role
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (8 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 09/23] KVM: MMU: remove "bool base_only" arguments Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-04 21:57   ` David Matlack
  2022-02-04 11:57 ` [PATCH 11/23] KVM: MMU: do not recompute root level from kvm_mmu_role_regs Paolo Bonzini
                   ` (14 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

Snapshot the state of the processor registers that govern page walk into
a new field of struct kvm_mmu.  This is a more natural representation
than having it *mostly* in mmu_role but not exclusively; the delta
right now is represented in other fields, such as root_level.  For
example, already in this patch we can replace role_regs_to_root_level
with the "level" field of the CPU role.

The nested MMU now has only the CPU role; and in fact the new function
kvm_calc_cpu_role is analogous to the previous kvm_calc_nested_mmu_role,
except that it has role.base.direct equal to !CR0.PG.  It is not clear
what the code meant by "setting role.base.direct to true to detect bogus
usage of the nested MMU".

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |   1 +
 arch/x86/kvm/mmu/mmu.c          | 100 ++++++++++++++++++++------------
 arch/x86/kvm/mmu/paging_tmpl.h  |   2 +-
 3 files changed, 64 insertions(+), 39 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4ec7d1e3aa36..427ee486309c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -432,6 +432,7 @@ struct kvm_mmu {
 	void (*invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa);
 	hpa_t root_hpa;
 	gpa_t root_pgd;
+	union kvm_mmu_role cpu_role;
 	union kvm_mmu_role mmu_role;
 	u8 root_level;
 	u8 shadow_root_level;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index dd69cfc8c4f6..f98444e1d834 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -230,7 +230,7 @@ BUILD_MMU_ROLE_REGS_ACCESSOR(efer, lma, EFER_LMA);
 #define BUILD_MMU_ROLE_ACCESSOR(base_or_ext, reg, name)		\
 static inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu)	\
 {								\
-	return !!(mmu->mmu_role. base_or_ext . reg##_##name);	\
+	return !!(mmu->cpu_role. base_or_ext . reg##_##name);	\
 }
 BUILD_MMU_ROLE_ACCESSOR(ext,  cr0, pg);
 BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp);
@@ -4658,6 +4658,38 @@ static void paging32_init_context(struct kvm_mmu *context)
 	context->direct_map = false;
 }
 
+static union kvm_mmu_role
+kvm_calc_cpu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
+{
+	union kvm_mmu_role role = {0};
+
+	role.base.access = ACC_ALL;
+	role.base.smm = is_smm(vcpu);
+	role.base.guest_mode = is_guest_mode(vcpu);
+	role.base.direct = !____is_cr0_pg(regs);
+	if (!role.base.direct) {
+		role.base.efer_nx = ____is_efer_nx(regs);
+		role.base.cr0_wp = ____is_cr0_wp(regs);
+		role.base.smep_andnot_wp = ____is_cr4_smep(regs) && !____is_cr0_wp(regs);
+		role.base.smap_andnot_wp = ____is_cr4_smap(regs) && !____is_cr0_wp(regs);
+		role.base.has_4_byte_gpte = !____is_cr4_pae(regs);
+		role.base.level = role_regs_to_root_level(regs);
+
+		role.ext.cr0_pg = 1;
+		role.ext.cr4_pae = ____is_cr4_pae(regs);
+		role.ext.cr4_smep = ____is_cr4_smep(regs);
+		role.ext.cr4_smap = ____is_cr4_smap(regs);
+		role.ext.cr4_pse = ____is_cr4_pse(regs);
+
+		/* PKEY and LA57 are active iff long mode is active. */
+		role.ext.cr4_pke = ____is_efer_lma(regs) && ____is_cr4_pke(regs);
+		role.ext.cr4_la57 = ____is_efer_lma(regs) && ____is_cr4_la57(regs);
+		role.ext.efer_lma = ____is_efer_lma(regs);
+	}
+
+	return role;
+}
+
 static union kvm_mmu_role kvm_calc_mmu_role_common(struct kvm_vcpu *vcpu,
 						   const struct kvm_mmu_role_regs *regs)
 {
@@ -4716,13 +4748,16 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
 			     const struct kvm_mmu_role_regs *regs)
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
-	union kvm_mmu_role new_role =
+	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
+	union kvm_mmu_role mmu_role =
 		kvm_calc_tdp_mmu_root_page_role(vcpu, regs);
 
-	if (new_role.as_u64 == context->mmu_role.as_u64)
+	if (cpu_role.as_u64 == context->cpu_role.as_u64 &&
+	    mmu_role.as_u64 == context->mmu_role.as_u64)
 		return;
 
-	context->mmu_role.as_u64 = new_role.as_u64;
+	context->cpu_role.as_u64 = cpu_role.as_u64;
+	context->mmu_role.as_u64 = mmu_role.as_u64;
 	context->page_fault = kvm_tdp_page_fault;
 	context->sync_page = nonpaging_sync_page;
 	context->invlpg = NULL;
@@ -4777,13 +4812,15 @@ kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
 }
 
 static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *context,
-				    const struct kvm_mmu_role_regs *regs,
-				    union kvm_mmu_role new_role)
+				    union kvm_mmu_role cpu_role,
+				    union kvm_mmu_role mmu_role)
 {
-	if (new_role.as_u64 == context->mmu_role.as_u64)
+	if (cpu_role.as_u64 == context->cpu_role.as_u64 &&
+	    mmu_role.as_u64 == context->mmu_role.as_u64)
 		return;
 
-	context->mmu_role.as_u64 = new_role.as_u64;
+	context->cpu_role.as_u64 = cpu_role.as_u64;
+	context->mmu_role.as_u64 = mmu_role.as_u64;
 
 	if (!is_cr0_pg(context))
 		nonpaging_init_context(context);
@@ -4791,20 +4828,21 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
 		paging64_init_context(context);
 	else
 		paging32_init_context(context);
-	context->root_level = role_regs_to_root_level(regs);
+	context->root_level = cpu_role.base.level;
 
 	reset_guest_paging_metadata(vcpu, context);
-	context->shadow_root_level = new_role.base.level;
+	context->shadow_root_level = mmu_role.base.level;
 }
 
 static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
 				const struct kvm_mmu_role_regs *regs)
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
-	union kvm_mmu_role new_role =
+	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
+	union kvm_mmu_role mmu_role =
 		kvm_calc_shadow_mmu_root_page_role(vcpu, regs);
 
-	shadow_mmu_init_context(vcpu, context, regs, new_role);
+	shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
 
 	/*
 	 * KVM uses NX when TDP is disabled to handle a variety of scenarios,
@@ -4839,11 +4877,10 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
 		.cr4 = cr4 & ~X86_CR4_PKE,
 		.efer = efer,
 	};
-	union kvm_mmu_role new_role;
-
-	new_role = kvm_calc_shadow_npt_root_page_role(vcpu, &regs);
+	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, &regs);
+	union kvm_mmu_role mmu_role = kvm_calc_shadow_npt_root_page_role(vcpu, &regs);
 
-	shadow_mmu_init_context(vcpu, context, &regs, new_role);
+	shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
 	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
 	kvm_mmu_new_pgd(vcpu, nested_cr3);
 }
@@ -4862,7 +4899,6 @@ kvm_calc_shadow_ept_root_page_role(struct kvm_vcpu *vcpu, bool accessed_dirty,
 	role.base.guest_mode = true;
 	role.base.access = ACC_ALL;
 
-	/* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. */
 	role.ext.word = 0;
 	role.ext.execonly = execonly;
 
@@ -4879,7 +4915,9 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 		kvm_calc_shadow_ept_root_page_role(vcpu, accessed_dirty,
 						   execonly, level);
 
-	if (new_role.as_u64 != context->mmu_role.as_u64) {
+	if (new_role.as_u64 != context->cpu_role.as_u64) {
+		/* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. */
+		context->cpu_role.as_u64 = new_role.as_u64;
 		context->mmu_role.as_u64 = new_role.as_u64;
 
 		context->shadow_root_level = level;
@@ -4913,32 +4951,15 @@ static void init_kvm_softmmu(struct kvm_vcpu *vcpu,
 	context->inject_page_fault = kvm_inject_page_fault;
 }
 
-static union kvm_mmu_role
-kvm_calc_nested_mmu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
-{
-	union kvm_mmu_role role;
-
-	role = kvm_calc_shadow_root_page_role_common(vcpu, regs);
-
-	/*
-	 * Nested MMUs are used only for walking L2's gva->gpa, they never have
-	 * shadow pages of their own and so "direct" has no meaning.   Set it
-	 * to "true" to try to detect bogus usage of the nested MMU.
-	 */
-	role.base.direct = true;
-	role.base.level = role_regs_to_root_level(regs);
-	return role;
-}
-
 static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
 {
-	union kvm_mmu_role new_role = kvm_calc_nested_mmu_role(vcpu, regs);
+	union kvm_mmu_role new_role = kvm_calc_cpu_role(vcpu, regs);
 	struct kvm_mmu *g_context = &vcpu->arch.nested_mmu;
 
-	if (new_role.as_u64 == g_context->mmu_role.as_u64)
+	if (new_role.as_u64 == g_context->cpu_role.as_u64)
 		return;
 
-	g_context->mmu_role.as_u64 = new_role.as_u64;
+	g_context->cpu_role.as_u64 = new_role.as_u64;
 	g_context->get_guest_pgd     = get_cr3;
 	g_context->get_pdptr         = kvm_pdptr_read;
 	g_context->inject_page_fault = kvm_inject_page_fault;
@@ -4997,6 +5018,9 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 * problem is swept under the rug; KVM's CPUID API is horrific and
 	 * it's all but impossible to solve it without introducing a new API.
 	 */
+	vcpu->arch.root_mmu.cpu_role.base.level = 0;
+	vcpu->arch.guest_mmu.cpu_role.base.level = 0;
+	vcpu->arch.nested_mmu.cpu_role.base.level = 0;
 	vcpu->arch.root_mmu.mmu_role.base.level = 0;
 	vcpu->arch.guest_mmu.mmu_role.base.level = 0;
 	vcpu->arch.nested_mmu.mmu_role.base.level = 0;
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 6bb9a377bf89..b9f472f27077 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -323,7 +323,7 @@ static inline bool FNAME(is_last_gpte)(struct kvm_mmu *mmu,
 	 * is not reserved and does not indicate a large page at this level,
 	 * so clear PT_PAGE_SIZE_MASK in gpte if that is the case.
 	 */
-	gpte &= level - (PT32_ROOT_LEVEL + mmu->mmu_role.ext.cr4_pse);
+	gpte &= level - (PT32_ROOT_LEVEL + mmu->cpu_role.ext.cr4_pse);
 #endif
 	/*
 	 * PG_LEVEL_4K always terminates.  The RHS has bit 7 set
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 11/23] KVM: MMU: do not recompute root level from kvm_mmu_role_regs
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (9 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 10/23] KVM: MMU: split cpu_role from mmu_role Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-07 22:10   ` David Matlack
  2022-02-04 11:57 ` [PATCH 12/23] KVM: MMU: remove ept_ad field Paolo Bonzini
                   ` (13 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

The root_level can be found in the cpu_role (in fact the field
is superfluous and could be removed, but one thing at a time).
Since there is only one usage left of role_regs_to_root_level,
inline it into kvm_calc_cpu_role.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 23 ++++++++---------------
 1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index f98444e1d834..74789295f922 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -253,19 +253,6 @@ static struct kvm_mmu_role_regs vcpu_to_role_regs(struct kvm_vcpu *vcpu)
 	return regs;
 }
 
-static int role_regs_to_root_level(const struct kvm_mmu_role_regs *regs)
-{
-	if (!____is_cr0_pg(regs))
-		return 0;
-	else if (____is_efer_lma(regs))
-		return ____is_cr4_la57(regs) ? PT64_ROOT_5LEVEL :
-					       PT64_ROOT_4LEVEL;
-	else if (____is_cr4_pae(regs))
-		return PT32E_ROOT_LEVEL;
-	else
-		return PT32_ROOT_LEVEL;
-}
-
 static inline bool kvm_available_flush_tlb_with_range(void)
 {
 	return kvm_x86_ops.tlb_remote_flush_with_range;
@@ -4673,7 +4660,13 @@ kvm_calc_cpu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
 		role.base.smep_andnot_wp = ____is_cr4_smep(regs) && !____is_cr0_wp(regs);
 		role.base.smap_andnot_wp = ____is_cr4_smap(regs) && !____is_cr0_wp(regs);
 		role.base.has_4_byte_gpte = !____is_cr4_pae(regs);
-		role.base.level = role_regs_to_root_level(regs);
+
+		if (____is_efer_lma(regs))
+			role.base.level = ____is_cr4_la57(regs) ? PT64_ROOT_5LEVEL : PT64_ROOT_4LEVEL;
+		else if (____is_cr4_pae(regs))
+			role.base.level = PT32E_ROOT_LEVEL;
+		else
+			role.base.level = PT32_ROOT_LEVEL;
 
 		role.ext.cr0_pg = 1;
 		role.ext.cr4_pae = ____is_cr4_pae(regs);
@@ -4766,7 +4759,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
 	context->get_guest_pgd = get_cr3;
 	context->get_pdptr = kvm_pdptr_read;
 	context->inject_page_fault = kvm_inject_page_fault;
-	context->root_level = role_regs_to_root_level(regs);
+	context->root_level = cpu_role.base.level;
 
 	if (!is_cr0_pg(context))
 		context->gva_to_gpa = nonpaging_gva_to_gpa;
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 12/23] KVM: MMU: remove ept_ad field
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (10 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 11/23] KVM: MMU: do not recompute root level from kvm_mmu_role_regs Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-04 11:57 ` [PATCH 13/23] KVM: MMU: remove kvm_calc_shadow_root_page_role_common Paolo Bonzini
                   ` (12 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

The similar field in the CPU role, ad_disabled, is initialized to the
opposite value for shadow EPT, and left zero for "normal" EPT because
guest page tables always have A/D bits.  So, read it from the CPU role,
like other page-format fields; it just has to be inverted to account
for the different polarity.

In the MMU role, instead, the ad_disabled bit is set according to
shadow_accessed_mask, so it would have been incorrect to replace
PT_HAVE_ACCESSED_DIRTY with just !mmu->mmu_role.base.ad_disabled.
However, with the separation of CPU and MMU roles, we might even get
rid of the PT_HAVE_ACCESSED_DIRTY macro altogether.  I am not doing this
because the macro has a small effect in terms of dead code elimination:

   text	   data	    bss	    dec	    hex
 103544	  16665	    112	 120321	  1d601    # as of this patch
 103746	  16665	    112	 120523	  1d6cb    # without PT_HAVE_ACCESSED_DIRTY
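
For shadow EPT the old and new expressions are equivalent, because the
CPU role records ad_disabled as the inverse of accessed_dirty; as a
sketch (the _OLD/_NEW names are only for illustration):

  #define PT_HAVE_ACCESSED_DIRTY_OLD(mmu) ((mmu)->ept_ad)
  #define PT_HAVE_ACCESSED_DIRTY_NEW(mmu) (!(mmu)->cpu_role.base.ad_disabled)
  /*
   * Shadow EPT sets ept_ad = accessed_dirty and
   * cpu_role.base.ad_disabled = !accessed_dirty, so both macros
   * evaluate to the same value.
   */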

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/kvm_host.h | 1 -
 arch/x86/kvm/mmu/mmu.c          | 1 -
 arch/x86/kvm/mmu/paging_tmpl.h  | 2 +-
 3 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 427ee486309c..795b345361c8 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -436,7 +436,6 @@ struct kvm_mmu {
 	union kvm_mmu_role mmu_role;
 	u8 root_level;
 	u8 shadow_root_level;
-	u8 ept_ad;
 	bool direct_map;
 	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];
 
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 74789295f922..d6b5d8c1c0dc 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4915,7 +4915,6 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 
 		context->shadow_root_level = level;
 
-		context->ept_ad = accessed_dirty;
 		context->page_fault = ept_page_fault;
 		context->gva_to_gpa = ept_gva_to_gpa;
 		context->sync_page = ept_sync_page;
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index b9f472f27077..1b5c7d03f94b 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -64,7 +64,7 @@
 	#define PT_LEVEL_BITS PT64_LEVEL_BITS
 	#define PT_GUEST_DIRTY_SHIFT 9
 	#define PT_GUEST_ACCESSED_SHIFT 8
-	#define PT_HAVE_ACCESSED_DIRTY(mmu) ((mmu)->ept_ad)
+	#define PT_HAVE_ACCESSED_DIRTY(mmu) (!(mmu)->cpu_role.base.ad_disabled)
 	#define CMPXCHG cmpxchg64
 	#define PT_MAX_FULL_LEVELS PT64_ROOT_MAX_LEVEL
 #else
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 13/23] KVM: MMU: remove kvm_calc_shadow_root_page_role_common
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (11 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 12/23] KVM: MMU: remove ept_ad field Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-07 22:25   ` David Matlack
  2022-02-04 11:57 ` [PATCH 14/23] KVM: MMU: cleanup computation of MMU roles for two-dimensional paging Paolo Bonzini
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

kvm_calc_shadow_root_page_role_common is the same as
kvm_calc_cpu_role except for the level, which is overwritten
afterwards in kvm_calc_shadow_mmu_root_page_role
and kvm_calc_shadow_npt_root_page_role.

role.base.direct is already set correctly for the CPU role,
and CR0.PG=1 is required for VMRUN so it will also be
correct for nested NPT.
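
Spelled out, the reasoning for dropping the explicit assignments is
(sketch, following the argument above):

  /*
   * Shadow MMU:  the removed code set direct = !CR0.PG, and the CPU
   *              role already computes direct = !CR0.PG.
   * Nested NPT:  the removed code set direct = false; VMRUN requires
   *              CR0.PG=1 in L1, so !CR0.PG is false here as well.
   */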

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 21 ++-------------------
 1 file changed, 2 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d6b5d8c1c0dc..19abf1e4cee9 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4772,27 +4772,11 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
 	reset_tdp_shadow_zero_bits_mask(context);
 }
 
-static union kvm_mmu_role
-kvm_calc_shadow_root_page_role_common(struct kvm_vcpu *vcpu,
-				      const struct kvm_mmu_role_regs *regs)
-{
-	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, regs);
-
-	role.base.smep_andnot_wp = role.ext.cr4_smep && !____is_cr0_wp(regs);
-	role.base.smap_andnot_wp = role.ext.cr4_smap && !____is_cr0_wp(regs);
-	role.base.has_4_byte_gpte = ____is_cr0_pg(regs) && !____is_cr4_pae(regs);
-
-	return role;
-}
-
 static union kvm_mmu_role
 kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
 				   const struct kvm_mmu_role_regs *regs)
 {
-	union kvm_mmu_role role =
-		kvm_calc_shadow_root_page_role_common(vcpu, regs);
-
-	role.base.direct = !____is_cr0_pg(regs);
+	union kvm_mmu_role role = kvm_calc_cpu_role(vcpu, regs);
 
 	if (!____is_efer_lma(regs))
 		role.base.level = PT32E_ROOT_LEVEL;
@@ -4853,9 +4837,8 @@ kvm_calc_shadow_npt_root_page_role(struct kvm_vcpu *vcpu,
 				   const struct kvm_mmu_role_regs *regs)
 {
 	union kvm_mmu_role role =
-		kvm_calc_shadow_root_page_role_common(vcpu, regs);
+		kvm_calc_cpu_role(vcpu, regs);
 
-	role.base.direct = false;
 	role.base.level = kvm_mmu_get_tdp_level(vcpu);
 
 	return role;
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 14/23] KVM: MMU: cleanup computation of MMU roles for two-dimensional paging
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (12 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 13/23] KVM: MMU: remove kvm_calc_shadow_root_page_role_common Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-04 11:57 ` [PATCH 15/23] KVM: MMU: cleanup computation of MMU roles for shadow paging Paolo Bonzini
                   ` (10 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

Inline kvm_calc_mmu_role_common into its sole caller, and simplify it
by removing the computation of unnecessary bits.

Extended bits are unnecessary because page walking uses the CPU role,
and EFER.NX/CR0.WP can be set to one unconditionally---matching the
format of shadow pages rather than the format of guest pages.

The MMU role for two-dimensional paging still depends on the CPU role,
even if only barely so, due to SMM and guest mode; for consistency,
pass it down to kvm_calc_tdp_mmu_root_page_role instead of querying
the vcpu with is_smm or is_guest_mode.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 39 ++++++++-------------------------------
 1 file changed, 8 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 19abf1e4cee9..1650fc291284 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4683,33 +4683,6 @@ kvm_calc_cpu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
 	return role;
 }
 
-static union kvm_mmu_role kvm_calc_mmu_role_common(struct kvm_vcpu *vcpu,
-						   const struct kvm_mmu_role_regs *regs)
-{
-	union kvm_mmu_role role = {0};
-
-	role.base.access = ACC_ALL;
-	if (____is_cr0_pg(regs)) {
-		role.ext.cr0_pg = 1;
-		role.base.efer_nx = ____is_efer_nx(regs);
-		role.base.cr0_wp = ____is_cr0_wp(regs);
-
-		role.ext.cr4_pae = ____is_cr4_pae(regs);
-		role.ext.cr4_smep = ____is_cr4_smep(regs);
-		role.ext.cr4_smap = ____is_cr4_smap(regs);
-		role.ext.cr4_pse = ____is_cr4_pse(regs);
-
-		/* PKEY and LA57 are active iff long mode is active. */
-		role.ext.cr4_pke = ____is_efer_lma(regs) && ____is_cr4_pke(regs);
-		role.ext.cr4_la57 = ____is_efer_lma(regs) && ____is_cr4_la57(regs);
-		role.ext.efer_lma = ____is_efer_lma(regs);
-	}
-	role.base.smm = is_smm(vcpu);
-	role.base.guest_mode = is_guest_mode(vcpu);
-
-	return role;
-}
-
 static inline int kvm_mmu_get_tdp_level(struct kvm_vcpu *vcpu)
 {
 	/* tdp_root_level is architecture forced level, use it if nonzero */
@@ -4725,10 +4698,15 @@ static inline int kvm_mmu_get_tdp_level(struct kvm_vcpu *vcpu)
 
 static union kvm_mmu_role
 kvm_calc_tdp_mmu_root_page_role(struct kvm_vcpu *vcpu,
-				const struct kvm_mmu_role_regs *regs)
+				union kvm_mmu_role cpu_role)
 {
-	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, regs);
+	union kvm_mmu_role role = {0};
 
+	role.base.access = ACC_ALL;
+	role.base.cr0_wp = true;
+	role.base.efer_nx = true;
+	role.base.smm = cpu_role.base.smm;
+	role.base.guest_mode = cpu_role.base.guest_mode;
 	role.base.ad_disabled = (shadow_accessed_mask == 0);
 	role.base.level = kvm_mmu_get_tdp_level(vcpu);
 	role.base.direct = true;
@@ -4742,8 +4720,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
 	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
-	union kvm_mmu_role mmu_role =
-		kvm_calc_tdp_mmu_root_page_role(vcpu, regs);
+	union kvm_mmu_role mmu_role = kvm_calc_tdp_mmu_root_page_role(vcpu, cpu_role);
 
 	if (cpu_role.as_u64 == context->cpu_role.as_u64 &&
 	    mmu_role.as_u64 == context->mmu_role.as_u64)
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 15/23] KVM: MMU: cleanup computation of MMU roles for shadow paging
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (13 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 14/23] KVM: MMU: cleanup computation of MMU roles for two-dimensional paging Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-04 11:57 ` [PATCH 16/23] KVM: MMU: remove extended bits from mmu_role Paolo Bonzini
                   ` (9 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

Pass the already-computed CPU role instead of recomputing it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 1650fc291284..817e6cc916fc 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4751,13 +4751,11 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
 
 static union kvm_mmu_role
 kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
-				   const struct kvm_mmu_role_regs *regs)
+				   union kvm_mmu_role role)
 {
-	union kvm_mmu_role role = kvm_calc_cpu_role(vcpu, regs);
-
-	if (!____is_efer_lma(regs))
+	if (!role.ext.efer_lma)
 		role.base.level = PT32E_ROOT_LEVEL;
-	else if (____is_cr4_la57(regs))
+	else if (role.ext.cr4_la57)
 		role.base.level = PT64_ROOT_5LEVEL;
 	else
 		role.base.level = PT64_ROOT_4LEVEL;
@@ -4794,7 +4792,7 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
 	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
 	union kvm_mmu_role mmu_role =
-		kvm_calc_shadow_mmu_root_page_role(vcpu, regs);
+		kvm_calc_shadow_mmu_root_page_role(vcpu, cpu_role);
 
 	shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
 
@@ -4811,11 +4809,8 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
 
 static union kvm_mmu_role
 kvm_calc_shadow_npt_root_page_role(struct kvm_vcpu *vcpu,
-				   const struct kvm_mmu_role_regs *regs)
+				   union kvm_mmu_role role)
 {
-	union kvm_mmu_role role =
-               kvm_calc_cpu_role(vcpu, regs);
-
 	role.base.level = kvm_mmu_get_tdp_level(vcpu);
 
 	return role;
@@ -4831,7 +4826,7 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
 		.efer = efer,
 	};
 	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, &regs);
-	union kvm_mmu_role mmu_role = kvm_calc_shadow_npt_root_page_role(vcpu, &regs);;
+	union kvm_mmu_role mmu_role = kvm_calc_shadow_npt_root_page_role(vcpu, cpu_role);
 
 	shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
 	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 16/23] KVM: MMU: remove extended bits from mmu_role
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (14 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 15/23] KVM: MMU: cleanup computation of MMU roles for shadow paging Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-04 11:57 ` [PATCH 17/23] KVM: MMU: remove redundant bits from extended role Paolo Bonzini
                   ` (8 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

mmu_role represents the role of the root of the page tables.
It does not need any extended bits, because those only affect how
KVM walks the guest page tables; the is_* accessors used for that
walk always read the CPU role.
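
For orientation, a trimmed sketch of the two unions involved (not the real
definitions, which live in arch/x86/include/asm/kvm_host.h; what matters
here is the containment relation):

	union kvm_mmu_page_role {
		u32 word;
		struct {
			unsigned level:4;
			unsigned direct:1;
			/* ... the other shadow-page-format bits ... */
		};
	};

	union kvm_mmu_extended_role {
		u32 word;
		struct {
			unsigned int cr4_smep:1;
			unsigned int cr4_smap:1;
			/* ... the other guest-register bits ... */
		};
	};

	union kvm_mmu_role {
		u64 as_u64;
		struct {
			union kvm_mmu_page_role base;
			union kvm_mmu_extended_role ext;
		};
	};

After this patch, mmu_role shrinks from union kvm_mmu_role (base + ext) to
union kvm_mmu_page_role alone, and the 64-bit as_u64 comparisons on it
become 32-bit word comparisons.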

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/mmu/mmu.c          | 63 ++++++++++++++++-----------------
 arch/x86/kvm/mmu/paging_tmpl.h  |  2 +-
 arch/x86/kvm/mmu/tdp_mmu.c      |  2 +-
 4 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 795b345361c8..121eefdb9991 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -433,7 +433,7 @@ struct kvm_mmu {
 	hpa_t root_hpa;
 	gpa_t root_pgd;
 	union kvm_mmu_role cpu_role;
-	union kvm_mmu_role mmu_role;
+	union kvm_mmu_page_role mmu_role;
 	u8 root_level;
 	u8 shadow_root_level;
 	bool direct_map;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 817e6cc916fc..0cb46a74e561 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2045,7 +2045,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 	int collisions = 0;
 	LIST_HEAD(invalid_list);
 
-	role = vcpu->arch.mmu->mmu_role.base;
+	role = vcpu->arch.mmu->mmu_role;
 	role.level = level;
 	role.direct = direct;
 	role.access = access;
@@ -3278,7 +3278,7 @@ void kvm_mmu_free_guest_mode_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
 	 * This should not be called while L2 is active, L2 can't invalidate
 	 * _only_ its own roots, e.g. INVVPID unconditionally exits.
 	 */
-	WARN_ON_ONCE(mmu->mmu_role.base.guest_mode);
+	WARN_ON_ONCE(mmu->mmu_role.guest_mode);
 
 	for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) {
 		root_hpa = mmu->prev_roots[i].hpa;
@@ -4146,7 +4146,7 @@ static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t new_pgd,
 
 void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd)
 {
-	union kvm_mmu_page_role new_role = vcpu->arch.mmu->mmu_role.base;
+	union kvm_mmu_page_role new_role = vcpu->arch.mmu->mmu_role;
 	if (!fast_pgd_switch(vcpu, new_pgd, new_role)) {
 		kvm_mmu_free_roots(vcpu, vcpu->arch.mmu, KVM_MMU_ROOT_CURRENT);
 		return;
@@ -4696,21 +4696,21 @@ static inline int kvm_mmu_get_tdp_level(struct kvm_vcpu *vcpu)
 	return max_tdp_level;
 }
 
-static union kvm_mmu_role
+static union kvm_mmu_page_role
 kvm_calc_tdp_mmu_root_page_role(struct kvm_vcpu *vcpu,
 				union kvm_mmu_role cpu_role)
 {
-	union kvm_mmu_role role = {0};
+	union kvm_mmu_page_role role = {0};
 
-	role.base.access = ACC_ALL;
-	role.base.cr0_wp = true;
-	role.base.efer_nx = true;
-	role.base.smm = cpu_role.base.smm;
-	role.base.guest_mode = cpu_role.base.guest_mode;
-	role.base.ad_disabled = (shadow_accessed_mask == 0);
-	role.base.level = kvm_mmu_get_tdp_level(vcpu);
-	role.base.direct = true;
-	role.base.has_4_byte_gpte = false;
+	role.access = ACC_ALL;
+	role.cr0_wp = true;
+	role.efer_nx = true;
+	role.smm = cpu_role.base.smm;
+	role.guest_mode = cpu_role.base.guest_mode;
+	role.ad_disabled = (shadow_accessed_mask == 0);
+	role.level = kvm_mmu_get_tdp_level(vcpu);
+	role.direct = true;
+	role.has_4_byte_gpte = false;
 
 	return role;
 }
@@ -4720,14 +4720,14 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
 	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
-	union kvm_mmu_role mmu_role = kvm_calc_tdp_mmu_root_page_role(vcpu, cpu_role);
+	union kvm_mmu_page_role mmu_role = kvm_calc_tdp_mmu_root_page_role(vcpu, cpu_role);
 
 	if (cpu_role.as_u64 == context->cpu_role.as_u64 &&
-	    mmu_role.as_u64 == context->mmu_role.as_u64)
+	    mmu_role.word == context->mmu_role.word)
 		return;
 
 	context->cpu_role.as_u64 = cpu_role.as_u64;
-	context->mmu_role.as_u64 = mmu_role.as_u64;
+	context->mmu_role.word = mmu_role.word;
 	context->page_fault = kvm_tdp_page_fault;
 	context->sync_page = nonpaging_sync_page;
 	context->invlpg = NULL;
@@ -4749,7 +4749,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
 	reset_tdp_shadow_zero_bits_mask(context);
 }
 
-static union kvm_mmu_role
+static union kvm_mmu_page_role
 kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
 				   union kvm_mmu_role role)
 {
@@ -4760,19 +4760,19 @@ kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
 	else
 		role.base.level = PT64_ROOT_4LEVEL;
 
-	return role;
+	return role.base;
 }
 
 static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *context,
 				    union kvm_mmu_role cpu_role,
-				    union kvm_mmu_role mmu_role)
+				    union kvm_mmu_page_role mmu_role)
 {
 	if (cpu_role.as_u64 == context->cpu_role.as_u64 &&
-	    mmu_role.as_u64 == context->mmu_role.as_u64)
+	    mmu_role.word == context->mmu_role.word)
 		return;
 
 	context->cpu_role.as_u64 = cpu_role.as_u64;
-	context->mmu_role.as_u64 = mmu_role.as_u64;
+	context->mmu_role.word = mmu_role.word;
 
 	if (!is_cr0_pg(context))
 		nonpaging_init_context(context);
@@ -4783,7 +4783,7 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
 	context->root_level = cpu_role.base.level;
 
 	reset_guest_paging_metadata(vcpu, context);
-	context->shadow_root_level = mmu_role.base.level;
+	context->shadow_root_level = mmu_role.level;
 }
 
 static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
@@ -4791,7 +4791,7 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
 	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
-	union kvm_mmu_role mmu_role =
+	union kvm_mmu_page_role mmu_role =
 		kvm_calc_shadow_mmu_root_page_role(vcpu, cpu_role);
 
 	shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
@@ -4807,13 +4807,12 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
 	reset_shadow_zero_bits_mask(vcpu, context, true);
 }
 
-static union kvm_mmu_role
+static union kvm_mmu_page_role
 kvm_calc_shadow_npt_root_page_role(struct kvm_vcpu *vcpu,
 				   union kvm_mmu_role role)
 {
 	role.base.level = kvm_mmu_get_tdp_level(vcpu);
-
-	return role;
+	return role.base;
 }
 
 void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
@@ -4826,7 +4825,7 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
 		.efer = efer,
 	};
 	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, &regs);
-	union kvm_mmu_role mmu_role = kvm_calc_shadow_npt_root_page_role(vcpu, cpu_role);
+	union kvm_mmu_page_role mmu_role = kvm_calc_shadow_npt_root_page_role(vcpu, cpu_role);
 
 	shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
 	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
@@ -4866,7 +4865,7 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 	if (new_role.as_u64 != context->cpu_role.as_u64) {
 		/* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. */
 		context->cpu_role.as_u64 = new_role.as_u64;
-		context->mmu_role.as_u64 = new_role.as_u64;
+		context->mmu_role.word = new_role.base.word;
 
 		context->shadow_root_level = level;
 
@@ -4968,9 +4967,9 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	vcpu->arch.root_mmu.cpu_role.base.level = 0;
 	vcpu->arch.guest_mmu.cpu_role.base.level = 0;
 	vcpu->arch.nested_mmu.cpu_role.base.level = 0;
-	vcpu->arch.root_mmu.mmu_role.base.level = 0;
-	vcpu->arch.guest_mmu.mmu_role.base.level = 0;
-	vcpu->arch.nested_mmu.mmu_role.base.level = 0;
+	vcpu->arch.root_mmu.mmu_role.level = 0;
+	vcpu->arch.guest_mmu.mmu_role.level = 0;
+	vcpu->arch.nested_mmu.mmu_role.level = 0;
 	kvm_mmu_reset_context(vcpu);
 
 	/*
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 1b5c7d03f94b..847c4339e4d9 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -1025,7 +1025,7 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
  */
 static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
 {
-	union kvm_mmu_page_role mmu_role = vcpu->arch.mmu->mmu_role.base;
+	union kvm_mmu_page_role mmu_role = vcpu->arch.mmu->mmu_role;
 	int i;
 	bool host_writable;
 	gpa_t first_pte_gpa;
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 8def8f810cb0..dd4c78833016 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -209,7 +209,7 @@ static void tdp_mmu_init_child_sp(struct kvm_mmu_page *child_sp,
 
 hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu)
 {
-	union kvm_mmu_page_role role = vcpu->arch.mmu->mmu_role.base;
+	union kvm_mmu_page_role role = vcpu->arch.mmu->mmu_role;
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_mmu_page *root;
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 17/23] KVM: MMU: remove redundant bits from extended role
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (15 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 16/23] KVM: MMU: remove extended bits from mmu_role Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-04 11:57 ` [PATCH 18/23] KVM: MMU: fetch shadow EFER.NX from MMU role Paolo Bonzini
                   ` (7 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

Before the separation of the CPU and the MMU role, CR0.PG was not
available in the base MMU role, because two-dimensional paging always
used direct=1 in the MMU role.  However, now that the raw role is
snapshotted in mmu->cpu_role, CR0.PG *can* be found (though inverted)
as !cpu_role.base.direct.  There is no need to store it again in union
kvm_mmu_extended_role; instead, write an is_cr0_pg accessor by hand that
takes care of the inversion.

Likewise, CR4.PAE is now always present in the CPU role as
!cpu_role.base.has_4_byte_gpte.  The inversion makes certain tests on
the MMU role easier, and is easily hidden by the is_cr4_pae accessor
when operating on the CPU role.
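
For readers unfamiliar with BUILD_MMU_ROLE_ACCESSOR (visible in the hunk
below), a generated accessor expands to roughly the following; the
hand-written is_cr0_pg and is_cr4_pae mirror this shape, with the inversion
applied:

	/* Roughly what BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp) expands to. */
	static inline bool __maybe_unused is_cr0_wp(struct kvm_mmu *mmu)
	{
		return !!(mmu->cpu_role.base.cr0_wp);
	}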

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  2 --
 arch/x86/kvm/mmu/mmu.c          | 14 ++++++++++----
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 121eefdb9991..b0085c54786c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -360,8 +360,6 @@ union kvm_mmu_extended_role {
 	u32 word;
 	struct {
 		unsigned int execonly:1;
-		unsigned int cr0_pg:1;
-		unsigned int cr4_pae:1;
 		unsigned int cr4_pse:1;
 		unsigned int cr4_pke:1;
 		unsigned int cr4_smap:1;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 0cb46a74e561..b3856551607d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -232,16 +232,24 @@ static inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu)	\
 {								\
 	return !!(mmu->cpu_role. base_or_ext . reg##_##name);	\
 }
-BUILD_MMU_ROLE_ACCESSOR(ext,  cr0, pg);
 BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp);
 BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, pse);
-BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, pae);
 BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, smep);
 BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, smap);
 BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, pke);
 BUILD_MMU_ROLE_ACCESSOR(ext,  cr4, la57);
 BUILD_MMU_ROLE_ACCESSOR(base, efer, nx);
 
+static inline bool is_cr0_pg(struct kvm_mmu *mmu)
+{
+	return !(mmu->cpu_role.base.direct);
+}
+
+static inline bool is_cr4_pae(struct kvm_mmu *mmu)
+{
+	return !(mmu->cpu_role.base.has_4_byte_gpte);
+}
+
 static struct kvm_mmu_role_regs vcpu_to_role_regs(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu_role_regs regs = {
@@ -4668,8 +4676,6 @@ kvm_calc_cpu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
 		else
 			role.base.level = PT32_ROOT_LEVEL;
 
-		role.ext.cr0_pg = 1;
-		role.ext.cr4_pae = ____is_cr4_pae(regs);
 		role.ext.cr4_smep = ____is_cr4_smep(regs);
 		role.ext.cr4_smap = ____is_cr4_smap(regs);
 		role.ext.cr4_pse = ____is_cr4_pse(regs);
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 18/23] KVM: MMU: fetch shadow EFER.NX from MMU role
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (16 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 17/23] KVM: MMU: remove redundant bits from extended role Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-04 11:57 ` [PATCH 19/23] KVM: MMU: simplify and/or inline computation of shadow MMU roles Paolo Bonzini
                   ` (6 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

Now that the MMU role is separate from the CPU role, it contains a
truthful description of the format of the shadow pages.  This includes
whether the shadow pages use the NX bit, so use the MMU role instead of
hardcoding it in the callers of reset_shadow_zero_bits_mask.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b3856551607d..bba712d1a6d7 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4398,13 +4398,13 @@ static inline u64 reserved_hpa_bits(void)
  * follow the features in guest.
  */
 static void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
-					struct kvm_mmu *context,
-					bool uses_nx)
+					struct kvm_mmu *context)
 {
 	/* @amd adds a check on bit of SPTEs, which KVM shouldn't use anyways. */
 	bool is_amd = true;
 	/* KVM doesn't use 2-level page tables for the shadow MMU. */
 	bool is_pse = false;
+	bool uses_nx = context->mmu_role.efer_nx;
 	struct rsvd_bits_validate *shadow_zero_check;
 	int i;
 
@@ -4810,7 +4810,7 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
 	 * NX can be used by any non-nested shadow MMU to avoid having to reset
 	 * MMU contexts.  Note, KVM forces EFER.NX=1 when TDP is disabled.
 	 */
-	reset_shadow_zero_bits_mask(vcpu, context, true);
+	reset_shadow_zero_bits_mask(vcpu, context);
 }
 
 static union kvm_mmu_page_role
@@ -4834,7 +4834,7 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
 	union kvm_mmu_page_role mmu_role = kvm_calc_shadow_npt_root_page_role(vcpu, cpu_role);
 
 	shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
-	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
+	reset_shadow_zero_bits_mask(vcpu, context);
 	kvm_mmu_new_pgd(vcpu, nested_cr3);
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_npt_mmu);
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 19/23] KVM: MMU: simplify and/or inline computation of shadow MMU roles
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (17 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 18/23] KVM: MMU: fetch shadow EFER.NX from MMU role Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-04 11:57 ` [PATCH 20/23] KVM: MMU: pull CPU role computation to kvm_init_mmu Paolo Bonzini
                   ` (5 subsequent siblings)
  24 siblings, 0 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

Shadow MMUs can compute their role from cpu_role.base, simply by adjusting
the root level.  It's one line of code, so do not place it in a separate
function.
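
The equivalence between the removed helper and the single max_t() line in
kvm_init_shadow_mmu may not be obvious at a glance.  A sketch of the
mapping, assuming the CPU role level is computed as earlier in the series
and left at 0 when CR0.PG=0:

	/*
	 * mmu_role.level = max_t(u32, cpu_role.base.level, PT32E_ROOT_LEVEL)
	 *
	 *   CR0.PG=0          0                    -> PT32E_ROOT_LEVEL (3)
	 *   32-bit paging     PT32_ROOT_LEVEL  (2) -> PT32E_ROOT_LEVEL (3)
	 *   PAE paging        PT32E_ROOT_LEVEL (3) -> PT32E_ROOT_LEVEL (3)
	 *   4-level paging    PT64_ROOT_4LEVEL (4) -> unchanged
	 *   5-level paging    PT64_ROOT_5LEVEL (5) -> unchanged
	 */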

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 32 +++++++-------------------------
 1 file changed, 7 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index bba712d1a6d7..01027da82e23 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4755,20 +4755,6 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
 	reset_tdp_shadow_zero_bits_mask(context);
 }
 
-static union kvm_mmu_page_role
-kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
-				   union kvm_mmu_role role)
-{
-	if (!role.ext.efer_lma)
-		role.base.level = PT32E_ROOT_LEVEL;
-	else if (role.ext.cr4_la57)
-		role.base.level = PT64_ROOT_5LEVEL;
-	else
-		role.base.level = PT64_ROOT_4LEVEL;
-
-	return role.base;
-}
-
 static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *context,
 				    union kvm_mmu_role cpu_role,
 				    union kvm_mmu_page_role mmu_role)
@@ -4797,9 +4783,10 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
 	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
-	union kvm_mmu_page_role mmu_role =
-		kvm_calc_shadow_mmu_root_page_role(vcpu, cpu_role);
+	union kvm_mmu_page_role mmu_role;
 
+	mmu_role = cpu_role.base;
+	mmu_role.level = max_t(u32, mmu_role.level, PT32E_ROOT_LEVEL);
 	shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
 
 	/*
@@ -4813,14 +4800,6 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
 	reset_shadow_zero_bits_mask(vcpu, context);
 }
 
-static union kvm_mmu_page_role
-kvm_calc_shadow_npt_root_page_role(struct kvm_vcpu *vcpu,
-				   union kvm_mmu_role role)
-{
-	role.base.level = kvm_mmu_get_tdp_level(vcpu);
-	return role.base;
-}
-
 void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
 			     unsigned long cr4, u64 efer, gpa_t nested_cr3)
 {
@@ -4831,7 +4810,10 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
 		.efer = efer,
 	};
 	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, &regs);
-	union kvm_mmu_page_role mmu_role = kvm_calc_shadow_npt_root_page_role(vcpu, cpu_role);
+	union kvm_mmu_page_role mmu_role;
+
+	mmu_role = cpu_role.base;
+	mmu_role.level = kvm_mmu_get_tdp_level(vcpu);
 
 	shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
 	reset_shadow_zero_bits_mask(vcpu, context);
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 20/23] KVM: MMU: pull CPU role computation to kvm_init_mmu
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (18 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 19/23] KVM: MMU: simplify and/or inline computation of shadow MMU roles Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-07 22:42   ` David Matlack
  2022-02-04 11:57 ` [PATCH 21/23] KVM: MMU: store shadow_root_level into mmu_role Paolo Bonzini
                   ` (4 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

Do not lead init_kvm_*mmu into the temptation of poking into
struct kvm_mmu_role_regs: pass the CPU role to it directly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 01027da82e23..6f9d876ce429 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4721,11 +4721,9 @@ kvm_calc_tdp_mmu_root_page_role(struct kvm_vcpu *vcpu,
 	return role;
 }
 
-static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
-			     const struct kvm_mmu_role_regs *regs)
+static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu, union kvm_mmu_role cpu_role)
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
-	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
 	union kvm_mmu_page_role mmu_role = kvm_calc_tdp_mmu_root_page_role(vcpu, cpu_role);
 
 	if (cpu_role.as_u64 == context->cpu_role.as_u64 &&
@@ -4779,10 +4777,9 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
 }
 
 static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
-				const struct kvm_mmu_role_regs *regs)
+				union kvm_mmu_role cpu_role)
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
-	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
 	union kvm_mmu_page_role mmu_role;
 
 	mmu_role = cpu_role.base;
@@ -4874,20 +4871,19 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu);
 
 static void init_kvm_softmmu(struct kvm_vcpu *vcpu,
-			     const struct kvm_mmu_role_regs *regs)
+			     union kvm_mmu_role cpu_role)
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
 
-	kvm_init_shadow_mmu(vcpu, regs);
+	kvm_init_shadow_mmu(vcpu, cpu_role);
 
 	context->get_guest_pgd     = get_cr3;
 	context->get_pdptr         = kvm_pdptr_read;
 	context->inject_page_fault = kvm_inject_page_fault;
 }
 
-static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
+static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, union kvm_mmu_role new_role)
 {
-	union kvm_mmu_role new_role = kvm_calc_cpu_role(vcpu, regs);
 	struct kvm_mmu *g_context = &vcpu->arch.nested_mmu;
 
 	if (new_role.as_u64 == g_context->cpu_role.as_u64)
@@ -4928,13 +4924,14 @@ static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, const struct kvm_mmu_role
 void kvm_init_mmu(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
+	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, &regs);
 
 	if (mmu_is_nested(vcpu))
-		init_kvm_nested_mmu(vcpu, &regs);
+		init_kvm_nested_mmu(vcpu, cpu_role);
 	else if (tdp_enabled)
-		init_kvm_tdp_mmu(vcpu, &regs);
+		init_kvm_tdp_mmu(vcpu, cpu_role);
 	else
-		init_kvm_softmmu(vcpu, &regs);
+		init_kvm_softmmu(vcpu, cpu_role);
 }
 EXPORT_SYMBOL_GPL(kvm_init_mmu);
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 21/23] KVM: MMU: store shadow_root_level into mmu_role
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (19 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 20/23] KVM: MMU: pull CPU role computation to kvm_init_mmu Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-07 23:00   ` David Matlack
  2022-02-04 11:57 ` [PATCH 22/23] KVM: MMU: use cpu_role for root_level Paolo Bonzini
                   ` (3 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

mmu_role.level always has the same value as shadow_root_level:

- kvm_mmu_get_tdp_level(vcpu) when going through init_kvm_tdp_mmu

- the level argument when going through kvm_init_shadow_ept_mmu

- mmu_role.level itself, assigned directly, when going through
  shadow_mmu_init_context

Remove the duplication and get the level directly from the role.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  1 -
 arch/x86/kvm/mmu.h              |  2 +-
 arch/x86/kvm/mmu/mmu.c          | 36 +++++++++++++++------------------
 arch/x86/kvm/mmu/tdp_mmu.c      |  2 +-
 arch/x86/kvm/svm/svm.c          |  2 +-
 arch/x86/kvm/vmx/vmx.c          |  2 +-
 6 files changed, 20 insertions(+), 25 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b0085c54786c..867fc82f1de5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -433,7 +433,6 @@ struct kvm_mmu {
 	union kvm_mmu_role cpu_role;
 	union kvm_mmu_page_role mmu_role;
 	u8 root_level;
-	u8 shadow_root_level;
 	bool direct_map;
 	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];
 
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 51faa2c76ca5..43b99308cb0e 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -112,7 +112,7 @@ static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
 		return;
 
 	static_call(kvm_x86_load_mmu_pgd)(vcpu, root_hpa,
-					  vcpu->arch.mmu->shadow_root_level);
+					  vcpu->arch.mmu->mmu_role.level);
 }
 
 struct kvm_page_fault {
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 6f9d876ce429..4d1fa87718f8 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2143,7 +2143,7 @@ static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterato
 {
 	iterator->addr = addr;
 	iterator->shadow_addr = root;
-	iterator->level = vcpu->arch.mmu->shadow_root_level;
+	iterator->level = vcpu->arch.mmu->mmu_role.level;
 
 	if (iterator->level >= PT64_ROOT_4LEVEL &&
 	    vcpu->arch.mmu->root_level < PT64_ROOT_4LEVEL &&
@@ -3254,7 +3254,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 					   &invalid_list);
 
 	if (free_active_root) {
-		if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
+		if (mmu->mmu_role.level >= PT64_ROOT_4LEVEL &&
 		    (mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
 			mmu_free_root_page(kvm, &mmu->root_hpa, &invalid_list);
 		} else if (mmu->pae_root) {
@@ -3329,7 +3329,7 @@ static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, gva_t gva,
 static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
-	u8 shadow_root_level = mmu->shadow_root_level;
+	u8 shadow_root_level = mmu->mmu_role.level;
 	hpa_t root;
 	unsigned i;
 	int r;
@@ -3479,7 +3479,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	 */
 	if (mmu->root_level >= PT64_ROOT_4LEVEL) {
 		root = mmu_alloc_root(vcpu, root_gfn, 0,
-				      mmu->shadow_root_level, false);
+				      mmu->mmu_role.level, false);
 		mmu->root_hpa = root;
 		goto set_root_pgd;
 	}
@@ -3495,7 +3495,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	 * the shadow page table may be a PAE or a long mode page table.
 	 */
 	pm_mask = PT_PRESENT_MASK | shadow_me_mask;
-	if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL) {
+	if (mmu->mmu_role.level >= PT64_ROOT_4LEVEL) {
 		pm_mask |= PT_ACCESSED_MASK | PT_WRITABLE_MASK | PT_USER_MASK;
 
 		if (WARN_ON_ONCE(!mmu->pml4_root)) {
@@ -3504,7 +3504,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 		}
 		mmu->pml4_root[0] = __pa(mmu->pae_root) | pm_mask;
 
-		if (mmu->shadow_root_level == PT64_ROOT_5LEVEL) {
+		if (mmu->mmu_role.level == PT64_ROOT_5LEVEL) {
 			if (WARN_ON_ONCE(!mmu->pml5_root)) {
 				r = -EIO;
 				goto out_unlock;
@@ -3529,9 +3529,9 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 		mmu->pae_root[i] = root | pm_mask;
 	}
 
-	if (mmu->shadow_root_level == PT64_ROOT_5LEVEL)
+	if (mmu->mmu_role.level == PT64_ROOT_5LEVEL)
 		mmu->root_hpa = __pa(mmu->pml5_root);
-	else if (mmu->shadow_root_level == PT64_ROOT_4LEVEL)
+	else if (mmu->mmu_role.level == PT64_ROOT_4LEVEL)
 		mmu->root_hpa = __pa(mmu->pml4_root);
 	else
 		mmu->root_hpa = __pa(mmu->pae_root);
@@ -3547,7 +3547,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
-	bool need_pml5 = mmu->shadow_root_level > PT64_ROOT_4LEVEL;
+	bool need_pml5 = mmu->mmu_role.level > PT64_ROOT_4LEVEL;
 	u64 *pml5_root = NULL;
 	u64 *pml4_root = NULL;
 	u64 *pae_root;
@@ -3559,7 +3559,7 @@ static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
 	 * on demand, as running a 32-bit L1 VMM on 64-bit KVM is very rare.
 	 */
 	if (mmu->direct_map || mmu->root_level >= PT64_ROOT_4LEVEL ||
-	    mmu->shadow_root_level < PT64_ROOT_4LEVEL)
+	    mmu->mmu_role.level < PT64_ROOT_4LEVEL)
 		return 0;
 
 	/*
@@ -4145,7 +4145,7 @@ static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t new_pgd,
 	 * having to deal with PDPTEs. We may add support for 32-bit hosts/VMs
 	 * later if necessary.
 	 */
-	if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
+	if (mmu->mmu_role.level >= PT64_ROOT_4LEVEL &&
 	    mmu->root_level >= PT64_ROOT_4LEVEL)
 		return cached_root_available(vcpu, new_pgd, new_role);
 
@@ -4408,17 +4408,17 @@ static void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
 	struct rsvd_bits_validate *shadow_zero_check;
 	int i;
 
-	WARN_ON_ONCE(context->shadow_root_level < PT32E_ROOT_LEVEL);
+	WARN_ON_ONCE(context->mmu_role.level < PT32E_ROOT_LEVEL);
 
 	shadow_zero_check = &context->shadow_zero_check;
 	__reset_rsvds_bits_mask(shadow_zero_check, reserved_hpa_bits(),
-				context->shadow_root_level, uses_nx,
+				context->mmu_role.level, uses_nx,
 				guest_can_use_gbpages(vcpu), is_pse, is_amd);
 
 	if (!shadow_me_mask)
 		return;
 
-	for (i = context->shadow_root_level; --i >= 0;) {
+	for (i = context->mmu_role.level; --i >= 0;) {
 		shadow_zero_check->rsvd_bits_mask[0][i] &= ~shadow_me_mask;
 		shadow_zero_check->rsvd_bits_mask[1][i] &= ~shadow_me_mask;
 	}
@@ -4445,7 +4445,7 @@ reset_tdp_shadow_zero_bits_mask(struct kvm_mmu *context)
 
 	if (boot_cpu_is_amd())
 		__reset_rsvds_bits_mask(shadow_zero_check, reserved_hpa_bits(),
-					context->shadow_root_level, false,
+					context->mmu_role.level, false,
 					boot_cpu_has(X86_FEATURE_GBPAGES),
 					false, true);
 	else
@@ -4456,7 +4456,7 @@ reset_tdp_shadow_zero_bits_mask(struct kvm_mmu *context)
 	if (!shadow_me_mask)
 		return;
 
-	for (i = context->shadow_root_level; --i >= 0;) {
+	for (i = context->mmu_role.level; --i >= 0;) {
 		shadow_zero_check->rsvd_bits_mask[0][i] &= ~shadow_me_mask;
 		shadow_zero_check->rsvd_bits_mask[1][i] &= ~shadow_me_mask;
 	}
@@ -4735,7 +4735,6 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu, union kvm_mmu_role cpu_role)
 	context->page_fault = kvm_tdp_page_fault;
 	context->sync_page = nonpaging_sync_page;
 	context->invlpg = NULL;
-	context->shadow_root_level = kvm_mmu_get_tdp_level(vcpu);
 	context->direct_map = true;
 	context->get_guest_pgd = get_cr3;
 	context->get_pdptr = kvm_pdptr_read;
@@ -4773,7 +4772,6 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
 	context->root_level = cpu_role.base.level;
 
 	reset_guest_paging_metadata(vcpu, context);
-	context->shadow_root_level = mmu_role.level;
 }
 
 static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
@@ -4852,8 +4850,6 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 		context->cpu_role.as_u64 = new_role.as_u64;
 		context->mmu_role.word = new_role.base.word;
 
-		context->shadow_root_level = level;
-
 		context->page_fault = ept_page_fault;
 		context->gva_to_gpa = ept_gva_to_gpa;
 		context->sync_page = ept_sync_page;
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index dd4c78833016..9fb6d983bae9 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1697,7 +1697,7 @@ int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
 	gfn_t gfn = addr >> PAGE_SHIFT;
 	int leaf = -1;
 
-	*root_level = vcpu->arch.mmu->shadow_root_level;
+	*root_level = vcpu->arch.mmu->mmu_role.level;
 
 	tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
 		leaf = iter.level;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7b5345a66117..5a1d552b535b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3815,7 +3815,7 @@ static void svm_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa,
 		hv_track_root_tdp(vcpu, root_hpa);
 
 		cr3 = vcpu->arch.cr3;
-	} else if (vcpu->arch.mmu->shadow_root_level >= PT64_ROOT_4LEVEL) {
+	} else if (vcpu->arch.mmu->mmu_role.level >= PT64_ROOT_4LEVEL) {
 		cr3 = __sme_set(root_hpa) | kvm_get_active_pcid(vcpu);
 	} else {
 		/* PCID in the guest should be impossible with a 32-bit MMU. */
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 8ac5a6fa7720..5e2c865a04ff 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2965,7 +2965,7 @@ static void vmx_flush_tlb_current(struct kvm_vcpu *vcpu)
 
 	if (enable_ept)
 		ept_sync_context(construct_eptp(vcpu, root_hpa,
-						mmu->shadow_root_level));
+						mmu->mmu_role.level));
 	else
 		vpid_sync_context(vmx_get_current_vpid(vcpu));
 }
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 22/23] KVM: MMU: use cpu_role for root_level
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (20 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 21/23] KVM: MMU: store shadow_root_level into mmu_role Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-07 23:01   ` David Matlack
  2022-02-04 11:57 ` [PATCH 23/23] KVM: MMU: replace direct_map with mmu_role.direct Paolo Bonzini
                   ` (2 subsequent siblings)
  24 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

Remove another duplicate field of struct kvm_mmu.  This time it's the
root level for the guest page table walk: it was already being initialized
to cpu_role.base.level (or to an equal value in the nested EPT case), so
drop the field and read the level from the CPU role instead.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  1 -
 arch/x86/kvm/mmu/mmu.c          | 22 +++++++++-------------
 arch/x86/kvm/mmu/mmu_audit.c    |  6 +++---
 arch/x86/kvm/mmu/paging_tmpl.h  |  4 ++--
 4 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 867fc82f1de5..c86a2beee92a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -432,7 +432,6 @@ struct kvm_mmu {
 	gpa_t root_pgd;
 	union kvm_mmu_role cpu_role;
 	union kvm_mmu_page_role mmu_role;
-	u8 root_level;
 	bool direct_map;
 	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];
 
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 4d1fa87718f8..5a6541d6a424 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2146,7 +2146,7 @@ static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterato
 	iterator->level = vcpu->arch.mmu->mmu_role.level;
 
 	if (iterator->level >= PT64_ROOT_4LEVEL &&
-	    vcpu->arch.mmu->root_level < PT64_ROOT_4LEVEL &&
+	    vcpu->arch.mmu->cpu_role.base.level < PT64_ROOT_4LEVEL &&
 	    !vcpu->arch.mmu->direct_map)
 		iterator->level = PT32E_ROOT_LEVEL;
 
@@ -3255,7 +3255,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 
 	if (free_active_root) {
 		if (mmu->mmu_role.level >= PT64_ROOT_4LEVEL &&
-		    (mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
+		    (mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
 			mmu_free_root_page(kvm, &mmu->root_hpa, &invalid_list);
 		} else if (mmu->pae_root) {
 			for (i = 0; i < 4; ++i) {
@@ -3453,7 +3453,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	 * On SVM, reading PDPTRs might access guest memory, which might fault
 	 * and thus might sleep.  Grab the PDPTRs before acquiring mmu_lock.
 	 */
-	if (mmu->root_level == PT32E_ROOT_LEVEL) {
+	if (mmu->cpu_role.base.level == PT32E_ROOT_LEVEL) {
 		for (i = 0; i < 4; ++i) {
 			pdptrs[i] = mmu->get_pdptr(vcpu, i);
 			if (!(pdptrs[i] & PT_PRESENT_MASK))
@@ -3477,7 +3477,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	 * Do we shadow a long mode page table? If so we need to
 	 * write-protect the guests page table root.
 	 */
-	if (mmu->root_level >= PT64_ROOT_4LEVEL) {
+	if (mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL) {
 		root = mmu_alloc_root(vcpu, root_gfn, 0,
 				      mmu->mmu_role.level, false);
 		mmu->root_hpa = root;
@@ -3516,7 +3516,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	for (i = 0; i < 4; ++i) {
 		WARN_ON_ONCE(IS_VALID_PAE_ROOT(mmu->pae_root[i]));
 
-		if (mmu->root_level == PT32E_ROOT_LEVEL) {
+		if (mmu->cpu_role.base.level == PT32E_ROOT_LEVEL) {
 			if (!(pdptrs[i] & PT_PRESENT_MASK)) {
 				mmu->pae_root[i] = INVALID_PAE_ROOT;
 				continue;
@@ -3558,7 +3558,7 @@ static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
 	 * equivalent level in the guest's NPT to shadow.  Allocate the tables
 	 * on demand, as running a 32-bit L1 VMM on 64-bit KVM is very rare.
 	 */
-	if (mmu->direct_map || mmu->root_level >= PT64_ROOT_4LEVEL ||
+	if (mmu->direct_map || mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL ||
 	    mmu->mmu_role.level < PT64_ROOT_4LEVEL)
 		return 0;
 
@@ -3655,7 +3655,7 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
 
 	vcpu_clear_mmio_info(vcpu, MMIO_GVA_ANY);
 
-	if (vcpu->arch.mmu->root_level >= PT64_ROOT_4LEVEL) {
+	if (vcpu->arch.mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL) {
 		hpa_t root = vcpu->arch.mmu->root_hpa;
 		sp = to_shadow_page(root);
 
@@ -4146,7 +4146,7 @@ static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t new_pgd,
 	 * later if necessary.
 	 */
 	if (mmu->mmu_role.level >= PT64_ROOT_4LEVEL &&
-	    mmu->root_level >= PT64_ROOT_4LEVEL)
+	    mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL)
 		return cached_root_available(vcpu, new_pgd, new_role);
 
 	return false;
@@ -4335,7 +4335,7 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
 {
 	__reset_rsvds_bits_mask(&context->guest_rsvd_check,
 				vcpu->arch.reserved_gpa_bits,
-				context->root_level, is_efer_nx(context),
+				context->cpu_role.base.level, is_efer_nx(context),
 				guest_can_use_gbpages(vcpu),
 				is_cr4_pse(context),
 				guest_cpuid_is_amd_or_hygon(vcpu));
@@ -4739,7 +4739,6 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu, union kvm_mmu_role cpu_role)
 	context->get_guest_pgd = get_cr3;
 	context->get_pdptr = kvm_pdptr_read;
 	context->inject_page_fault = kvm_inject_page_fault;
-	context->root_level = cpu_role.base.level;
 
 	if (!is_cr0_pg(context))
 		context->gva_to_gpa = nonpaging_gva_to_gpa;
@@ -4769,7 +4768,6 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
 		paging64_init_context(context);
 	else
 		paging32_init_context(context);
-	context->root_level = cpu_role.base.level;
 
 	reset_guest_paging_metadata(vcpu, context);
 }
@@ -4854,7 +4852,6 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 		context->gva_to_gpa = ept_gva_to_gpa;
 		context->sync_page = ept_sync_page;
 		context->invlpg = ept_invlpg;
-		context->root_level = level;
 		context->direct_map = false;
 		update_permission_bitmask(context, true);
 		context->pkru_mask = 0;
@@ -4889,7 +4886,6 @@ static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, union kvm_mmu_role new_ro
 	g_context->get_guest_pgd     = get_cr3;
 	g_context->get_pdptr         = kvm_pdptr_read;
 	g_context->inject_page_fault = kvm_inject_page_fault;
-	g_context->root_level        = new_role.base.level;
 
 	/*
 	 * L2 page tables are never shadowed, so there is no need to sync
diff --git a/arch/x86/kvm/mmu/mmu_audit.c b/arch/x86/kvm/mmu/mmu_audit.c
index f31fdb874f1f..eb9c59fcb957 100644
--- a/arch/x86/kvm/mmu/mmu_audit.c
+++ b/arch/x86/kvm/mmu/mmu_audit.c
@@ -59,11 +59,11 @@ static void mmu_spte_walk(struct kvm_vcpu *vcpu, inspect_spte_fn fn)
 	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa))
 		return;
 
-	if (vcpu->arch.mmu->root_level >= PT64_ROOT_4LEVEL) {
+	if (vcpu->arch.mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL) {
 		hpa_t root = vcpu->arch.mmu->root_hpa;
 
 		sp = to_shadow_page(root);
-		__mmu_spte_walk(vcpu, sp, fn, vcpu->arch.mmu->root_level);
+		__mmu_spte_walk(vcpu, sp, fn, vcpu->arch.mmu->cpu_role.base.level);
 		return;
 	}
 
@@ -119,7 +119,7 @@ static void audit_mappings(struct kvm_vcpu *vcpu, u64 *sptep, int level)
 	hpa =  pfn << PAGE_SHIFT;
 	if ((*sptep & PT64_BASE_ADDR_MASK) != hpa)
 		audit_printk(vcpu->kvm, "levels %d pfn %llx hpa %llx "
-			     "ent %llxn", vcpu->arch.mmu->root_level, pfn,
+			     "ent %llxn", vcpu->arch.mmu->cpu_role.base.level, pfn,
 			     hpa, *sptep);
 }
 
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 847c4339e4d9..dd0b6f83171f 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -361,7 +361,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 
 	trace_kvm_mmu_pagetable_walk(addr, access);
 retry_walk:
-	walker->level = mmu->root_level;
+	walker->level = mmu->cpu_role.base.level;
 	pte           = mmu->get_guest_pgd(vcpu);
 	have_ad       = PT_HAVE_ACCESSED_DIRTY(mmu);
 
@@ -656,7 +656,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 	WARN_ON_ONCE(gw->gfn != base_gfn);
 	direct_access = gw->pte_access;
 
-	top_level = vcpu->arch.mmu->root_level;
+	top_level = vcpu->arch.mmu->cpu_role.base.level;
 	if (top_level == PT32E_ROOT_LEVEL)
 		top_level = PT32_ROOT_LEVEL;
 	/*
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH 23/23] KVM: MMU: replace direct_map with mmu_role.direct
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (21 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 22/23] KVM: MMU: use cpu_role for root_level Paolo Bonzini
@ 2022-02-04 11:57 ` Paolo Bonzini
  2022-02-07 23:02   ` David Matlack
  2022-02-07 23:08 ` [PATCH 00/23] KVM: MMU: MMU role refactoring David Matlack
  2022-02-09 22:31 ` Sean Christopherson
  24 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-04 11:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: dmatlack, seanjc, vkuznets

direct_map is always equal to the role's direct field (a condensed sketch
of the three cases follows the list):

- for shadow paging, direct_map is true if and only if CR0.PG=0, and
mmu_role.direct is copied from cpu_role.base.direct, which is likewise
!CR0.PG

- for TDP, it is always true and mmu_role.direct is also always true

- for shadow EPT, it is always false and mmu_role.direct is also always
false
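
A condensed sketch of where each value comes from, paraphrasing the
relevant assignments elsewhere in the series (not verbatim quotes):

	/* Shadow paging, kvm_init_shadow_mmu(): */
	mmu_role = cpu_role.base;	/* .direct == !CR0.PG, like direct_map */

	/* TDP, kvm_calc_tdp_mmu_root_page_role(): */
	role.direct = true;		/* direct_map was unconditionally true */

	/*
	 * Shadow EPT, kvm_init_shadow_ept_mmu(): .direct is never set, so it
	 * stays false, matching the old direct_map = false.
	 */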

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  1 -
 arch/x86/kvm/mmu/mmu.c          | 30 ++++++++++++++----------------
 arch/x86/kvm/x86.c              | 12 ++++++------
 3 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c86a2beee92a..647b3f6d02d0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -432,7 +432,6 @@ struct kvm_mmu {
 	gpa_t root_pgd;
 	union kvm_mmu_role cpu_role;
 	union kvm_mmu_page_role mmu_role;
-	bool direct_map;
 	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];
 
 	/*
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 5a6541d6a424..ce55fad99671 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2045,7 +2045,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 					     int direct,
 					     unsigned int access)
 {
-	bool direct_mmu = vcpu->arch.mmu->direct_map;
+	bool direct_mmu = vcpu->arch.mmu->mmu_role.direct;
 	union kvm_mmu_page_role role;
 	struct hlist_head *sp_list;
 	unsigned quadrant;
@@ -2147,7 +2147,7 @@ static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterato
 
 	if (iterator->level >= PT64_ROOT_4LEVEL &&
 	    vcpu->arch.mmu->cpu_role.base.level < PT64_ROOT_4LEVEL &&
-	    !vcpu->arch.mmu->direct_map)
+	    !vcpu->arch.mmu->mmu_role.direct)
 		iterator->level = PT32E_ROOT_LEVEL;
 
 	if (iterator->level == PT32E_ROOT_LEVEL) {
@@ -2523,7 +2523,7 @@ static int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva)
 	gpa_t gpa;
 	int r;
 
-	if (vcpu->arch.mmu->direct_map)
+	if (vcpu->arch.mmu->mmu_role.direct)
 		return 0;
 
 	gpa = kvm_mmu_gva_to_gpa_read(vcpu, gva, NULL);
@@ -3255,7 +3255,8 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 
 	if (free_active_root) {
 		if (mmu->mmu_role.level >= PT64_ROOT_4LEVEL &&
-		    (mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
+		    (mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL ||
+		     mmu->mmu_role.direct)) {
 			mmu_free_root_page(kvm, &mmu->root_hpa, &invalid_list);
 		} else if (mmu->pae_root) {
 			for (i = 0; i < 4; ++i) {
@@ -3558,7 +3559,8 @@ static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
 	 * equivalent level in the guest's NPT to shadow.  Allocate the tables
 	 * on demand, as running a 32-bit L1 VMM on 64-bit KVM is very rare.
 	 */
-	if (mmu->direct_map || mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL ||
+	if (mmu->mmu_role.direct ||
+	    mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL ||
 	    mmu->mmu_role.level < PT64_ROOT_4LEVEL)
 		return 0;
 
@@ -3647,7 +3649,7 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
 	int i;
 	struct kvm_mmu_page *sp;
 
-	if (vcpu->arch.mmu->direct_map)
+	if (vcpu->arch.mmu->mmu_role.direct)
 		return;
 
 	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa))
@@ -3872,7 +3874,7 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 
 	arch.token = (vcpu->arch.apf.id++ << 12) | vcpu->vcpu_id;
 	arch.gfn = gfn;
-	arch.direct_map = vcpu->arch.mmu->direct_map;
+	arch.direct_map = vcpu->arch.mmu->mmu_role.direct;
 	arch.cr3 = vcpu->arch.mmu->get_guest_pgd(vcpu);
 
 	return kvm_setup_async_pf(vcpu, cr2_or_gpa,
@@ -4090,7 +4092,6 @@ static void nonpaging_init_context(struct kvm_mmu *context)
 	context->gva_to_gpa = nonpaging_gva_to_gpa;
 	context->sync_page = nonpaging_sync_page;
 	context->invlpg = NULL;
-	context->direct_map = true;
 }
 
 static inline bool is_root_usable(struct kvm_mmu_root_info *root, gpa_t pgd,
@@ -4641,7 +4642,6 @@ static void paging64_init_context(struct kvm_mmu *context)
 	context->gva_to_gpa = paging64_gva_to_gpa;
 	context->sync_page = paging64_sync_page;
 	context->invlpg = paging64_invlpg;
-	context->direct_map = false;
 }
 
 static void paging32_init_context(struct kvm_mmu *context)
@@ -4650,7 +4650,6 @@ static void paging32_init_context(struct kvm_mmu *context)
 	context->gva_to_gpa = paging32_gva_to_gpa;
 	context->sync_page = paging32_sync_page;
 	context->invlpg = paging32_invlpg;
-	context->direct_map = false;
 }
 
 static union kvm_mmu_role
@@ -4735,7 +4734,6 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu, union kvm_mmu_role cpu_role)
 	context->page_fault = kvm_tdp_page_fault;
 	context->sync_page = nonpaging_sync_page;
 	context->invlpg = NULL;
-	context->direct_map = true;
 	context->get_guest_pgd = get_cr3;
 	context->get_pdptr = kvm_pdptr_read;
 	context->inject_page_fault = kvm_inject_page_fault;
@@ -4852,7 +4850,7 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 		context->gva_to_gpa = ept_gva_to_gpa;
 		context->sync_page = ept_sync_page;
 		context->invlpg = ept_invlpg;
-		context->direct_map = false;
+
 		update_permission_bitmask(context, true);
 		context->pkru_mask = 0;
 		reset_rsvds_bits_mask_ept(vcpu, context, execonly, huge_page_level);
@@ -4967,13 +4965,13 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
 {
 	int r;
 
-	r = mmu_topup_memory_caches(vcpu, !vcpu->arch.mmu->direct_map);
+	r = mmu_topup_memory_caches(vcpu, !vcpu->arch.mmu->mmu_role.direct);
 	if (r)
 		goto out;
 	r = mmu_alloc_special_roots(vcpu);
 	if (r)
 		goto out;
-	if (vcpu->arch.mmu->direct_map)
+	if (vcpu->arch.mmu->mmu_role.direct)
 		r = mmu_alloc_direct_roots(vcpu);
 	else
 		r = mmu_alloc_shadow_roots(vcpu);
@@ -5176,7 +5174,7 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
 		       void *insn, int insn_len)
 {
 	int r, emulation_type = EMULTYPE_PF;
-	bool direct = vcpu->arch.mmu->direct_map;
+	bool direct = vcpu->arch.mmu->mmu_role.direct;
 
 	if (WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa)))
 		return RET_PF_RETRY;
@@ -5207,7 +5205,7 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
 	 * paging in both guests. If true, we simply unprotect the page
 	 * and resume the guest.
 	 */
-	if (vcpu->arch.mmu->direct_map &&
+	if (vcpu->arch.mmu->mmu_role.direct &&
 	    (error_code & PFERR_NESTED_GUEST_PAGE) == PFERR_NESTED_GUEST_PAGE) {
 		kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(cr2_or_gpa));
 		return 1;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 657aa646871e..b910fa34e57e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7978,7 +7978,7 @@ static bool reexecute_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	    WARN_ON_ONCE(!(emulation_type & EMULTYPE_PF)))
 		return false;
 
-	if (!vcpu->arch.mmu->direct_map) {
+	if (!vcpu->arch.mmu->mmu_role.direct) {
 		/*
 		 * Write permission should be allowed since only
 		 * write access need to be emulated.
@@ -8011,7 +8011,7 @@ static bool reexecute_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	kvm_release_pfn_clean(pfn);
 
 	/* The instructions are well-emulated on direct mmu. */
-	if (vcpu->arch.mmu->direct_map) {
+	if (vcpu->arch.mmu->mmu_role.direct) {
 		unsigned int indirect_shadow_pages;
 
 		write_lock(&vcpu->kvm->mmu_lock);
@@ -8079,7 +8079,7 @@ static bool retry_instruction(struct x86_emulate_ctxt *ctxt,
 	vcpu->arch.last_retry_eip = ctxt->eip;
 	vcpu->arch.last_retry_addr = cr2_or_gpa;
 
-	if (!vcpu->arch.mmu->direct_map)
+	if (!vcpu->arch.mmu->mmu_role.direct)
 		gpa = kvm_mmu_gva_to_gpa_write(vcpu, cr2_or_gpa, NULL);
 
 	kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(gpa));
@@ -8359,7 +8359,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 		ctxt->exception.address = cr2_or_gpa;
 
 		/* With shadow page tables, cr2 contains a GVA or nGPA. */
-		if (vcpu->arch.mmu->direct_map) {
+		if (vcpu->arch.mmu->mmu_role.direct) {
 			ctxt->gpa_available = true;
 			ctxt->gpa_val = cr2_or_gpa;
 		}
@@ -12196,7 +12196,7 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
 {
 	int r;
 
-	if ((vcpu->arch.mmu->direct_map != work->arch.direct_map) ||
+	if ((vcpu->arch.mmu->mmu_role.direct != work->arch.direct_map) ||
 	      work->wakeup_all)
 		return;
 
@@ -12204,7 +12204,7 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
 	if (unlikely(r))
 		return;
 
-	if (!vcpu->arch.mmu->direct_map &&
+	if (!vcpu->arch.mmu->mmu_role.direct &&
 	      work->arch.cr3 != vcpu->arch.mmu->get_guest_pgd(vcpu))
 		return;
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH 01/23] KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask
  2022-02-04 11:56 ` [PATCH 01/23] KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask Paolo Bonzini
@ 2022-02-04 17:59   ` David Matlack
  2022-02-05 14:52     ` Paolo Bonzini
  2022-02-10  0:30     ` Sean Christopherson
  0 siblings, 2 replies; 72+ messages in thread
From: David Matlack @ 2022-02-04 17:59 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:56:56AM -0500, Paolo Bonzini wrote:
> reset_shadow_zero_bits_mask has a very unintuitive way of deciding
> whether the shadow pages will use the NX bit.  The function is used in
> two cases, shadow paging and shadow NPT; shadow paging has a use for
> EFER.NX and needs to force it enabled, while shadow NPT only needs it
> depending on L1's setting.
> 
> The actual root problem here is that is_efer_nx, despite being part
> of the "base" role, only matches the format of the shadow pages in the
> NPT case.  For now, just remove the ugly variable initialization and move
> the call to reset_shadow_zero_bits_mask out of shadow_mmu_init_context.
> The parameter can then be removed after the root problem in the role
> is fixed.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: David Matlack <dmatlack@google.com>

(I agree this commit makes no functional change.)

> ---
>  arch/x86/kvm/mmu/mmu.c | 26 +++++++++++++-------------
>  1 file changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 296f8723f9ae..9424ae90f1ef 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4410,18 +4410,9 @@ static inline u64 reserved_hpa_bits(void)
>   * follow the features in guest.
>   */
>  static void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
> -					struct kvm_mmu *context)
> +					struct kvm_mmu *context,
> +					bool uses_nx)
>  {
> -	/*
> -	 * KVM uses NX when TDP is disabled to handle a variety of scenarios,
> -	 * notably for huge SPTEs if iTLB multi-hit mitigation is enabled and
> -	 * to generate correct permissions for CR0.WP=0/CR4.SMEP=1/EFER.NX=0.
> -	 * The iTLB multi-hit workaround can be toggled at any time, so assume
> -	 * NX can be used by any non-nested shadow MMU to avoid having to reset
> -	 * MMU contexts.  Note, KVM forces EFER.NX=1 when TDP is disabled.
> -	 */
> -	bool uses_nx = is_efer_nx(context) || !tdp_enabled;
> -
>  	/* @amd adds a check on bit of SPTEs, which KVM shouldn't use anyways. */
>  	bool is_amd = true;
>  	/* KVM doesn't use 2-level page tables for the shadow MMU. */
> @@ -4829,8 +4820,6 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
>  
>  	reset_guest_paging_metadata(vcpu, context);
>  	context->shadow_root_level = new_role.base.level;
> -
> -	reset_shadow_zero_bits_mask(vcpu, context);
>  }
>  
>  static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
> @@ -4841,6 +4830,16 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
>  		kvm_calc_shadow_mmu_root_page_role(vcpu, regs, false);
>  
>  	shadow_mmu_init_context(vcpu, context, regs, new_role);
> +
> +	/*
> +	 * KVM uses NX when TDP is disabled to handle a variety of scenarios,
> +	 * notably for huge SPTEs if iTLB multi-hit mitigation is enabled and
> +	 * to generate correct permissions for CR0.WP=0/CR4.SMEP=1/EFER.NX=0.
> +	 * The iTLB multi-hit workaround can be toggled at any time, so assume
> +	 * NX can be used by any non-nested shadow MMU to avoid having to reset
> +	 * MMU contexts.  Note, KVM forces EFER.NX=1 when TDP is disabled.
> +	 */
> +	reset_shadow_zero_bits_mask(vcpu, context, true);
>  }
>  
>  static union kvm_mmu_role
> @@ -4872,6 +4871,7 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
>  	__kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
>  
>  	shadow_mmu_init_context(vcpu, context, &regs, new_role);
> +	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));

Out of curiosity, how does KVM mitigate iTLB multi-hit when shadowing
NPT and the guest has not enabled EFER.NX?

>  }
>  EXPORT_SYMBOL_GPL(kvm_init_shadow_npt_mmu);
>  
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 02/23] KVM: MMU: nested EPT cannot be used in SMM
  2022-02-04 11:56 ` [PATCH 02/23] KVM: MMU: nested EPT cannot be used in SMM Paolo Bonzini
@ 2022-02-04 18:16   ` David Matlack
  2022-02-09 22:43     ` Sean Christopherson
  0 siblings, 1 reply; 72+ messages in thread
From: David Matlack @ 2022-02-04 18:16 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:56:57AM -0500, Paolo Bonzini wrote:
> The role.base.smm flag is always zero, do not bother copying it over
> from vcpu->arch.root_mmu.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/mmu/mmu.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 9424ae90f1ef..b0065ae3cea8 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4881,9 +4881,6 @@ kvm_calc_shadow_ept_root_page_role(struct kvm_vcpu *vcpu, bool accessed_dirty,
>  {
>  	union kvm_mmu_role role = {0};
>  
> -	/* SMM flag is inherited from root_mmu */
> -	role.base.smm = vcpu->arch.root_mmu.mmu_role.base.smm;

nit: Retaining a comment here and/or warning here would be useful.

	/* EPT is not allowed in SMM */
	WARN_ON_ONCE(vcpu->arch.root_mmu.mmu_role.base.smm);

(Although I imagine it would just get removed later in the series.)

> -
>  	role.base.level = level;
>  	role.base.has_4_byte_gpte = false;
>  	role.base.direct = false;
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 03/23] KVM: MMU: remove valid from extended role
  2022-02-04 11:56 ` [PATCH 03/23] KVM: MMU: remove valid from extended role Paolo Bonzini
@ 2022-02-04 18:32   ` David Matlack
  2022-02-05 14:50     ` Paolo Bonzini
  2022-02-09 22:54   ` Sean Christopherson
  1 sibling, 1 reply; 72+ messages in thread
From: David Matlack @ 2022-02-04 18:32 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:56:58AM -0500, Paolo Bonzini wrote:
> The level field of the MMU role can act as a marker for validity
> instead: it is guaranteed to be nonzero so a zero value means the role
> is invalid and the MMU properties will be computed again.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 4 +---
>  arch/x86/kvm/mmu/mmu.c          | 9 +++------
>  2 files changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index e7e5bd9a984d..4ec7d1e3aa36 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -342,8 +342,7 @@ union kvm_mmu_page_role {
>   * kvm_mmu_extended_role complements kvm_mmu_page_role, tracking properties
>   * relevant to the current MMU configuration.   When loading CR0, CR4, or EFER,
>   * including on nested transitions, if nothing in the full role changes then
> - * MMU re-configuration can be skipped. @valid bit is set on first usage so we
> - * don't treat all-zero structure as valid data.
> + * MMU re-configuration can be skipped.
>   *
>   * The properties that are tracked in the extended role but not the page role
>   * are for things that either (a) do not affect the validity of the shadow page
> @@ -360,7 +359,6 @@ union kvm_mmu_page_role {
>  union kvm_mmu_extended_role {
>  	u32 word;
>  	struct {
> -		unsigned int valid:1;
>  		unsigned int execonly:1;
>  		unsigned int cr0_pg:1;
>  		unsigned int cr4_pae:1;
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index b0065ae3cea8..0039b2f21286 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4683,8 +4683,6 @@ static union kvm_mmu_extended_role kvm_calc_mmu_role_ext(struct kvm_vcpu *vcpu,
>  		ext.efer_lma = ____is_efer_lma(regs);
>  	}
>  
> -	ext.valid = 1;
> -
>  	return ext;
>  }
>  
> @@ -4891,7 +4889,6 @@ kvm_calc_shadow_ept_root_page_role(struct kvm_vcpu *vcpu, bool accessed_dirty,
>  	/* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. */
>  	role.ext.word = 0;
>  	role.ext.execonly = execonly;
> -	role.ext.valid = 1;
>  
>  	return role;
>  }
> @@ -5039,9 +5036,9 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	 * problem is swept under the rug; KVM's CPUID API is horrific and
>  	 * it's all but impossible to solve it without introducing a new API.
>  	 */
> -	vcpu->arch.root_mmu.mmu_role.ext.valid = 0;
> -	vcpu->arch.guest_mmu.mmu_role.ext.valid = 0;
> -	vcpu->arch.nested_mmu.mmu_role.ext.valid = 0;
> +	vcpu->arch.root_mmu.mmu_role.base.level = 0;
> +	vcpu->arch.guest_mmu.mmu_role.base.level = 0;
> +	vcpu->arch.nested_mmu.mmu_role.base.level = 0;

I agree this will work but I think it makes the code more difficult to
follow (and I start worrying that some code that relies on level being
accurate will creep in in the future). At minimum we should extend the
comment here to describe why level is being changed.

I did a half-assed attempt to pass something like "bool force_role_reset"
down to the MMU initialization functions as an alternative but it very
quickly got out of hand.

What about just changing `valid` to `cpuid_stale` and flipping the meaning?
kvm_mmu_after_set_cpuid() would set the cpuid_stale bit and then reset
the MMUs.
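
Roughly what I have in mind (hypothetical sketch, not from this series):

	/* in kvm_mmu_after_set_cpuid(), instead of zeroing the levels: */
	vcpu->arch.root_mmu.mmu_role.ext.cpuid_stale = 1;
	vcpu->arch.guest_mmu.mmu_role.ext.cpuid_stale = 1;
	vcpu->arch.nested_mmu.mmu_role.ext.cpuid_stale = 1;
	kvm_mmu_reset_context(vcpu);

A freshly computed role always has cpuid_stale == 0, so the
"new_role.as_u64 == context->mmu_role.as_u64" early returns can never
match a stale context, and level keeps its real value.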

>  	kvm_mmu_reset_context(vcpu);
>  
>  	/*
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 04/23] KVM: MMU: constify uses of struct kvm_mmu_role_regs
  2022-02-04 11:56 ` [PATCH 04/23] KVM: MMU: constify uses of struct kvm_mmu_role_regs Paolo Bonzini
@ 2022-02-04 18:41   ` David Matlack
  2022-02-09 22:57   ` Sean Christopherson
  1 sibling, 0 replies; 72+ messages in thread
From: David Matlack @ 2022-02-04 18:41 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:56:59AM -0500, Paolo Bonzini wrote:
> struct kvm_mmu_role_regs is computed just once and then accessed.  Use
> const to enforce this.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/mmu/mmu.c | 19 +++++++++++--------
>  1 file changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 0039b2f21286..3add9d8b0630 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -208,7 +208,7 @@ struct kvm_mmu_role_regs {
>   * the single source of truth for the MMU's state.
>   */
>  #define BUILD_MMU_ROLE_REGS_ACCESSOR(reg, name, flag)			\
> -static inline bool __maybe_unused ____is_##reg##_##name(struct kvm_mmu_role_regs *regs)\
> +static inline bool __maybe_unused ____is_##reg##_##name(const struct kvm_mmu_role_regs *regs)\
>  {									\
>  	return !!(regs->reg & flag);					\
>  }
> @@ -255,7 +255,7 @@ static struct kvm_mmu_role_regs vcpu_to_role_regs(struct kvm_vcpu *vcpu)
>  	return regs;
>  }
>  
> -static int role_regs_to_root_level(struct kvm_mmu_role_regs *regs)
> +static int role_regs_to_root_level(const struct kvm_mmu_role_regs *regs)
>  {
>  	if (!____is_cr0_pg(regs))
>  		return 0;
> @@ -4666,7 +4666,7 @@ static void paging32_init_context(struct kvm_mmu *context)
>  }
>  
>  static union kvm_mmu_extended_role kvm_calc_mmu_role_ext(struct kvm_vcpu *vcpu,
> -							 struct kvm_mmu_role_regs *regs)
> +							 const struct kvm_mmu_role_regs *regs)
>  {
>  	union kvm_mmu_extended_role ext = {0};
>  
> @@ -4687,7 +4687,7 @@ static union kvm_mmu_extended_role kvm_calc_mmu_role_ext(struct kvm_vcpu *vcpu,
>  }
>  
>  static union kvm_mmu_role kvm_calc_mmu_role_common(struct kvm_vcpu *vcpu,
> -						   struct kvm_mmu_role_regs *regs,
> +						   const struct kvm_mmu_role_regs *regs,
>  						   bool base_only)
>  {
>  	union kvm_mmu_role role = {0};
> @@ -4723,7 +4723,8 @@ static inline int kvm_mmu_get_tdp_level(struct kvm_vcpu *vcpu)
>  
>  static union kvm_mmu_role
>  kvm_calc_tdp_mmu_root_page_role(struct kvm_vcpu *vcpu,
> -				struct kvm_mmu_role_regs *regs, bool base_only)
> +				const struct kvm_mmu_role_regs *regs,
> +				bool base_only)
>  {
>  	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, regs, base_only);
>  
> @@ -4769,7 +4770,8 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
>  
>  static union kvm_mmu_role
>  kvm_calc_shadow_root_page_role_common(struct kvm_vcpu *vcpu,
> -				      struct kvm_mmu_role_regs *regs, bool base_only)
> +				      const struct kvm_mmu_role_regs *regs,
> +				      bool base_only)
>  {
>  	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, regs, base_only);
>  
> @@ -4782,7 +4784,8 @@ kvm_calc_shadow_root_page_role_common(struct kvm_vcpu *vcpu,
>  
>  static union kvm_mmu_role
>  kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
> -				   struct kvm_mmu_role_regs *regs, bool base_only)
> +				   const struct kvm_mmu_role_regs *regs,
> +				   bool base_only)
>  {
>  	union kvm_mmu_role role =
>  		kvm_calc_shadow_root_page_role_common(vcpu, regs, base_only);
> @@ -4940,7 +4943,7 @@ static void init_kvm_softmmu(struct kvm_vcpu *vcpu)
>  }
>  
>  static union kvm_mmu_role
> -kvm_calc_nested_mmu_role(struct kvm_vcpu *vcpu, struct kvm_mmu_role_regs *regs)
> +kvm_calc_nested_mmu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
>  {
>  	union kvm_mmu_role role;
>  
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 05/23] KVM: MMU: pull computation of kvm_mmu_role_regs to kvm_init_mmu
  2022-02-04 11:57 ` [PATCH 05/23] KVM: MMU: pull computation of kvm_mmu_role_regs to kvm_init_mmu Paolo Bonzini
@ 2022-02-04 18:45   ` David Matlack
  0 siblings, 0 replies; 72+ messages in thread
From: David Matlack @ 2022-02-04 18:45 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:57:00AM -0500, Paolo Bonzini wrote:
> The init_kvm_*mmu functions, with the exception of shadow NPT,
> do not need to know the full values of CR0/CR4/EFER; they only
> need to know the bits that make up the "role".  This cleanup
> however will take quite a few incremental steps.  As a start,
> pull the common computation of the struct kvm_mmu_role_regs
> into their caller: all of them extract the struct from the vcpu
> as the very first step.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 33 +++++++++++++++++----------------
>  1 file changed, 17 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 3add9d8b0630..577e70509510 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4736,12 +4736,12 @@ kvm_calc_tdp_mmu_root_page_role(struct kvm_vcpu *vcpu,
>  	return role;
>  }
>  
> -static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
> +static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
> +			     const struct kvm_mmu_role_regs *regs)
>  {
>  	struct kvm_mmu *context = &vcpu->arch.root_mmu;
> -	struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
>  	union kvm_mmu_role new_role =
> -		kvm_calc_tdp_mmu_root_page_role(vcpu, &regs, false);
> +		kvm_calc_tdp_mmu_root_page_role(vcpu, regs, false);
>  
>  	if (new_role.as_u64 == context->mmu_role.as_u64)
>  		return;
> @@ -4755,7 +4755,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
>  	context->get_guest_pgd = get_cr3;
>  	context->get_pdptr = kvm_pdptr_read;
>  	context->inject_page_fault = kvm_inject_page_fault;
> -	context->root_level = role_regs_to_root_level(&regs);
> +	context->root_level = role_regs_to_root_level(regs);
>  
>  	if (!is_cr0_pg(context))
>  		context->gva_to_gpa = nonpaging_gva_to_gpa;
> @@ -4803,7 +4803,7 @@ kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
>  }
>  
>  static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *context,
> -				    struct kvm_mmu_role_regs *regs,
> +				    const struct kvm_mmu_role_regs *regs,
>  				    union kvm_mmu_role new_role)
>  {
>  	if (new_role.as_u64 == context->mmu_role.as_u64)
> @@ -4824,7 +4824,7 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
>  }
>  
>  static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
> -				struct kvm_mmu_role_regs *regs)
> +				const struct kvm_mmu_role_regs *regs)
>  {
>  	struct kvm_mmu *context = &vcpu->arch.root_mmu;
>  	union kvm_mmu_role new_role =
> @@ -4845,7 +4845,7 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
>  
>  static union kvm_mmu_role
>  kvm_calc_shadow_npt_root_page_role(struct kvm_vcpu *vcpu,
> -				   struct kvm_mmu_role_regs *regs)
> +				   const struct kvm_mmu_role_regs *regs)

Should these go in the previous commit? Aside from that,

Reviewed-by: David Matlack <dmatlack@google.com>

>  {
>  	union kvm_mmu_role role =
>  		kvm_calc_shadow_root_page_role_common(vcpu, regs, false);
> @@ -4930,12 +4930,12 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
>  }
>  EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu);
>  
> -static void init_kvm_softmmu(struct kvm_vcpu *vcpu)
> +static void init_kvm_softmmu(struct kvm_vcpu *vcpu,
> +			     const struct kvm_mmu_role_regs *regs)
>  {
>  	struct kvm_mmu *context = &vcpu->arch.root_mmu;
> -	struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
>  
> -	kvm_init_shadow_mmu(vcpu, &regs);
> +	kvm_init_shadow_mmu(vcpu, regs);
>  
>  	context->get_guest_pgd     = get_cr3;
>  	context->get_pdptr         = kvm_pdptr_read;
> @@ -4959,10 +4959,9 @@ kvm_calc_nested_mmu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *
>  	return role;
>  }
>  
> -static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu)
> +static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
>  {
> -	struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
> -	union kvm_mmu_role new_role = kvm_calc_nested_mmu_role(vcpu, &regs);
> +	union kvm_mmu_role new_role = kvm_calc_nested_mmu_role(vcpu, regs);
>  	struct kvm_mmu *g_context = &vcpu->arch.nested_mmu;
>  
>  	if (new_role.as_u64 == g_context->mmu_role.as_u64)
> @@ -5002,12 +5001,14 @@ static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu)
>  
>  void kvm_init_mmu(struct kvm_vcpu *vcpu)
>  {
> +	struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
> +
>  	if (mmu_is_nested(vcpu))
> -		init_kvm_nested_mmu(vcpu);
> +		init_kvm_nested_mmu(vcpu, &regs);
>  	else if (tdp_enabled)
> -		init_kvm_tdp_mmu(vcpu);
> +		init_kvm_tdp_mmu(vcpu, &regs);
>  	else
> -		init_kvm_softmmu(vcpu);
> +		init_kvm_softmmu(vcpu, &regs);
>  }
>  EXPORT_SYMBOL_GPL(kvm_init_mmu);
>  
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 06/23] KVM: MMU: load new PGD once nested two-dimensional paging is initialized
  2022-02-04 11:57 ` [PATCH 06/23] KVM: MMU: load new PGD once nested two-dimensional paging is initialized Paolo Bonzini
@ 2022-02-04 19:18   ` David Matlack
  2022-02-07 13:50     ` Paolo Bonzini
  2022-02-09 12:34     ` Paolo Bonzini
  0 siblings, 2 replies; 72+ messages in thread
From: David Matlack @ 2022-02-04 19:18 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:57:01AM -0500, Paolo Bonzini wrote:
> __kvm_mmu_new_pgd looks at the MMU's root_level and shadow_root_level
> via fast_pgd_switch.

Those checks are just for performance, correct (to skip iterating through
the list of roots)?

Either way, it's probably worth including a Fixes tag below.

> It makes no sense to call it before updating
> these fields, even though it was done like that ever since nested
> VMX grew the ability to use fast CR3 switch (commit 50c28f21d045,
> "kvm: x86: Use fast CR3 switch for nested VMX").
> 
> Pull it to the end of the initialization of the shadow nested MMUs.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/mmu/mmu.c | 41 +++++++++++++++++++----------------------
>  1 file changed, 19 insertions(+), 22 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 577e70509510..b8ab16323629 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4869,10 +4869,9 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
>  
>  	new_role = kvm_calc_shadow_npt_root_page_role(vcpu, &regs);
>  
> -	__kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
> -
>  	shadow_mmu_init_context(vcpu, context, &regs, new_role);
>  	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
> +	__kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
>  }
>  EXPORT_SYMBOL_GPL(kvm_init_shadow_npt_mmu);
>  
> @@ -4906,27 +4905,25 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
>  		kvm_calc_shadow_ept_root_page_role(vcpu, accessed_dirty,
>  						   execonly, level);
>  
> -	__kvm_mmu_new_pgd(vcpu, new_eptp, new_role.base);
> -
> -	if (new_role.as_u64 == context->mmu_role.as_u64)
> -		return;
> -
> -	context->mmu_role.as_u64 = new_role.as_u64;
> -
> -	context->shadow_root_level = level;
> -
> -	context->ept_ad = accessed_dirty;
> -	context->page_fault = ept_page_fault;
> -	context->gva_to_gpa = ept_gva_to_gpa;
> -	context->sync_page = ept_sync_page;
> -	context->invlpg = ept_invlpg;
> -	context->root_level = level;
> -	context->direct_map = false;
> +	if (new_role.as_u64 != context->mmu_role.as_u64) {
> +		context->mmu_role.as_u64 = new_role.as_u64;
> +
> +		context->shadow_root_level = level;
> +
> +		context->ept_ad = accessed_dirty;
> +		context->page_fault = ept_page_fault;
> +		context->gva_to_gpa = ept_gva_to_gpa;
> +		context->sync_page = ept_sync_page;
> +		context->invlpg = ept_invlpg;
> +		context->root_level = level;
> +		context->direct_map = false;
> +		update_permission_bitmask(context, true);
> +		context->pkru_mask = 0;
> +		reset_rsvds_bits_mask_ept(vcpu, context, execonly, huge_page_level);
> +		reset_ept_shadow_zero_bits_mask(context, execonly);
> +	}
>  
> -	update_permission_bitmask(context, true);
> -	context->pkru_mask = 0;
> -	reset_rsvds_bits_mask_ept(vcpu, context, execonly, huge_page_level);
> -	reset_ept_shadow_zero_bits_mask(context, execonly);
> +	__kvm_mmu_new_pgd(vcpu, new_eptp, new_role.base);
>  }
>  EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu);
>  
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 07/23] KVM: MMU: remove kvm_mmu_calc_root_page_role
  2022-02-04 11:57 ` [PATCH 07/23] KVM: MMU: remove kvm_mmu_calc_root_page_role Paolo Bonzini
@ 2022-02-04 19:32   ` David Matlack
  2022-02-05 14:46     ` Paolo Bonzini
  2022-02-10  0:47   ` Sean Christopherson
  1 sibling, 1 reply; 72+ messages in thread
From: David Matlack @ 2022-02-04 19:32 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:57:02AM -0500, Paolo Bonzini wrote:
> Since the guest PGD is now loaded after the MMU has been set up
> completely, the desired role for a cache hit is simply the current
> mmu_role.  There is no need to compute it again, so __kvm_mmu_new_pgd
> can be folded in kvm_mmu_new_pgd.
> 
> As an aside, the !tdp_enabled case in the function was dead code,
> and that also gets mopped up as a side effect.

Couldn't the !tdp_enabled case be called via kvm_set_cr3() ->
kvm_mmu_new_pgd()?

> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 29 ++++-------------------------
>  1 file changed, 4 insertions(+), 25 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index b8ab16323629..42475e4c2a48 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -190,8 +190,6 @@ struct kmem_cache *mmu_page_header_cache;
>  static struct percpu_counter kvm_total_used_mmu_pages;
>  
>  static void mmu_spte_set(u64 *sptep, u64 spte);
> -static union kvm_mmu_page_role
> -kvm_mmu_calc_root_page_role(struct kvm_vcpu *vcpu);
>  
>  struct kvm_mmu_role_regs {
>  	const unsigned long cr0;
> @@ -4159,9 +4157,9 @@ static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t new_pgd,
>  	return false;
>  }
>  
> -static void __kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd,
> -			      union kvm_mmu_page_role new_role)
> +void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd)
>  {
> +	union kvm_mmu_page_role new_role = vcpu->arch.mmu->mmu_role.base;

nit: Blank line after variable declarations.

>  	if (!fast_pgd_switch(vcpu, new_pgd, new_role)) {
>  		kvm_mmu_free_roots(vcpu, vcpu->arch.mmu, KVM_MMU_ROOT_CURRENT);
>  		return;
> @@ -4196,11 +4194,6 @@ static void __kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd,
>  		__clear_sp_write_flooding_count(
>  				to_shadow_page(vcpu->arch.mmu->root_hpa));
>  }
> -
> -void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd)
> -{
> -	__kvm_mmu_new_pgd(vcpu, new_pgd, kvm_mmu_calc_root_page_role(vcpu));
> -}
>  EXPORT_SYMBOL_GPL(kvm_mmu_new_pgd);
>  
>  static unsigned long get_cr3(struct kvm_vcpu *vcpu)
> @@ -4871,7 +4864,7 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
>  
>  	shadow_mmu_init_context(vcpu, context, &regs, new_role);
>  	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
> -	__kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
> +	kvm_mmu_new_pgd(vcpu, nested_cr3);
>  }
>  EXPORT_SYMBOL_GPL(kvm_init_shadow_npt_mmu);
>  
> @@ -4923,7 +4916,7 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
>  		reset_ept_shadow_zero_bits_mask(context, execonly);
>  	}
>  
> -	__kvm_mmu_new_pgd(vcpu, new_eptp, new_role.base);
> +	kvm_mmu_new_pgd(vcpu, new_eptp);
>  }
>  EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu);
>  
> @@ -5009,20 +5002,6 @@ void kvm_init_mmu(struct kvm_vcpu *vcpu)
>  }
>  EXPORT_SYMBOL_GPL(kvm_init_mmu);
>  
> -static union kvm_mmu_page_role
> -kvm_mmu_calc_root_page_role(struct kvm_vcpu *vcpu)
> -{
> -	struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
> -	union kvm_mmu_role role;
> -
> -	if (tdp_enabled)
> -		role = kvm_calc_tdp_mmu_root_page_role(vcpu, &regs, true);
> -	else
> -		role = kvm_calc_shadow_mmu_root_page_role(vcpu, &regs, true);
> -
> -	return role.base;
> -}
> -
>  void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  {
>  	/*
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 08/23] KVM: MMU: rephrase unclear comment
  2022-02-04 11:57 ` [PATCH 08/23] KVM: MMU: rephrase unclear comment Paolo Bonzini
@ 2022-02-04 19:38   ` David Matlack
  0 siblings, 0 replies; 72+ messages in thread
From: David Matlack @ 2022-02-04 19:38 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:57:03AM -0500, Paolo Bonzini wrote:
> If accessed bits are not supported there simply isn't any distinction
> between accessed and non-accessed gPTEs, so the comment does not make
> much sense.  Rephrase it in terms of what happens if accessed bits
> *are* supported.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/mmu/paging_tmpl.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index 5b5bdac97c7b..6bb9a377bf89 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -193,7 +193,7 @@ static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu,
>  	if (!FNAME(is_present_gpte)(gpte))
>  		goto no_present;
>  
> -	/* if accessed bit is not supported prefetch non accessed gpte */
> +	/* if accessed bit is supported, prefetch only accessed gpte */
>  	if (PT_HAVE_ACCESSED_DIRTY(vcpu->arch.mmu) &&
>  	    !(gpte & PT_GUEST_ACCESSED_MASK))
>  		goto no_present;
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 09/23] KVM: MMU: remove "bool base_only" arguments
  2022-02-04 11:57 ` [PATCH 09/23] KVM: MMU: remove "bool base_only" arguments Paolo Bonzini
@ 2022-02-04 19:41   ` David Matlack
  0 siblings, 0 replies; 72+ messages in thread
From: David Matlack @ 2022-02-04 19:41 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:57:04AM -0500, Paolo Bonzini wrote:
> The argument is always false now that kvm_mmu_calc_root_page_role has
> been removed.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/mmu/mmu.c | 63 +++++++++++++++---------------------------
>  1 file changed, 22 insertions(+), 41 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 42475e4c2a48..dd69cfc8c4f6 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4658,46 +4658,30 @@ static void paging32_init_context(struct kvm_mmu *context)
>  	context->direct_map = false;
>  }
>  
> -static union kvm_mmu_extended_role kvm_calc_mmu_role_ext(struct kvm_vcpu *vcpu,
> -							 const struct kvm_mmu_role_regs *regs)
> -{
> -	union kvm_mmu_extended_role ext = {0};
> -
> -	if (____is_cr0_pg(regs)) {
> -		ext.cr0_pg = 1;
> -		ext.cr4_pae = ____is_cr4_pae(regs);
> -		ext.cr4_smep = ____is_cr4_smep(regs);
> -		ext.cr4_smap = ____is_cr4_smap(regs);
> -		ext.cr4_pse = ____is_cr4_pse(regs);
> -
> -		/* PKEY and LA57 are active iff long mode is active. */
> -		ext.cr4_pke = ____is_efer_lma(regs) && ____is_cr4_pke(regs);
> -		ext.cr4_la57 = ____is_efer_lma(regs) && ____is_cr4_la57(regs);
> -		ext.efer_lma = ____is_efer_lma(regs);
> -	}
> -
> -	return ext;
> -}
> -
>  static union kvm_mmu_role kvm_calc_mmu_role_common(struct kvm_vcpu *vcpu,
> -						   const struct kvm_mmu_role_regs *regs,
> -						   bool base_only)
> +						   const struct kvm_mmu_role_regs *regs)
>  {
>  	union kvm_mmu_role role = {0};
>  
>  	role.base.access = ACC_ALL;
>  	if (____is_cr0_pg(regs)) {
> +		role.ext.cr0_pg = 1;
>  		role.base.efer_nx = ____is_efer_nx(regs);
>  		role.base.cr0_wp = ____is_cr0_wp(regs);
> +
> +		role.ext.cr4_pae = ____is_cr4_pae(regs);
> +		role.ext.cr4_smep = ____is_cr4_smep(regs);
> +		role.ext.cr4_smap = ____is_cr4_smap(regs);
> +		role.ext.cr4_pse = ____is_cr4_pse(regs);
> +
> +		/* PKEY and LA57 are active iff long mode is active. */
> +		role.ext.cr4_pke = ____is_efer_lma(regs) && ____is_cr4_pke(regs);
> +		role.ext.cr4_la57 = ____is_efer_lma(regs) && ____is_cr4_la57(regs);
> +		role.ext.efer_lma = ____is_efer_lma(regs);
>  	}
>  	role.base.smm = is_smm(vcpu);
>  	role.base.guest_mode = is_guest_mode(vcpu);
>  
> -	if (base_only)
> -		return role;
> -
> -	role.ext = kvm_calc_mmu_role_ext(vcpu, regs);
> -
>  	return role;
>  }
>  
> @@ -4716,10 +4700,9 @@ static inline int kvm_mmu_get_tdp_level(struct kvm_vcpu *vcpu)
>  
>  static union kvm_mmu_role
>  kvm_calc_tdp_mmu_root_page_role(struct kvm_vcpu *vcpu,
> -				const struct kvm_mmu_role_regs *regs,
> -				bool base_only)
> +				const struct kvm_mmu_role_regs *regs)
>  {
> -	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, regs, base_only);
> +	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, regs);
>  
>  	role.base.ad_disabled = (shadow_accessed_mask == 0);
>  	role.base.level = kvm_mmu_get_tdp_level(vcpu);
> @@ -4734,7 +4717,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
>  {
>  	struct kvm_mmu *context = &vcpu->arch.root_mmu;
>  	union kvm_mmu_role new_role =
> -		kvm_calc_tdp_mmu_root_page_role(vcpu, regs, false);
> +		kvm_calc_tdp_mmu_root_page_role(vcpu, regs);
>  
>  	if (new_role.as_u64 == context->mmu_role.as_u64)
>  		return;
> @@ -4763,10 +4746,9 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
>  
>  static union kvm_mmu_role
>  kvm_calc_shadow_root_page_role_common(struct kvm_vcpu *vcpu,
> -				      const struct kvm_mmu_role_regs *regs,
> -				      bool base_only)
> +				      const struct kvm_mmu_role_regs *regs)
>  {
> -	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, regs, base_only);
> +	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, regs);
>  
>  	role.base.smep_andnot_wp = role.ext.cr4_smep && !____is_cr0_wp(regs);
>  	role.base.smap_andnot_wp = role.ext.cr4_smap && !____is_cr0_wp(regs);
> @@ -4777,11 +4759,10 @@ kvm_calc_shadow_root_page_role_common(struct kvm_vcpu *vcpu,
>  
>  static union kvm_mmu_role
>  kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
> -				   const struct kvm_mmu_role_regs *regs,
> -				   bool base_only)
> +				   const struct kvm_mmu_role_regs *regs)
>  {
>  	union kvm_mmu_role role =
> -		kvm_calc_shadow_root_page_role_common(vcpu, regs, base_only);
> +		kvm_calc_shadow_root_page_role_common(vcpu, regs);
>  
>  	role.base.direct = !____is_cr0_pg(regs);
>  
> @@ -4821,7 +4802,7 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
>  {
>  	struct kvm_mmu *context = &vcpu->arch.root_mmu;
>  	union kvm_mmu_role new_role =
> -		kvm_calc_shadow_mmu_root_page_role(vcpu, regs, false);
> +		kvm_calc_shadow_mmu_root_page_role(vcpu, regs);
>  
>  	shadow_mmu_init_context(vcpu, context, regs, new_role);
>  
> @@ -4841,7 +4822,7 @@ kvm_calc_shadow_npt_root_page_role(struct kvm_vcpu *vcpu,
>  				   const struct kvm_mmu_role_regs *regs)
>  {
>  	union kvm_mmu_role role =
> -		kvm_calc_shadow_root_page_role_common(vcpu, regs, false);
> +		kvm_calc_shadow_root_page_role_common(vcpu, regs);
>  
>  	role.base.direct = false;
>  	role.base.level = kvm_mmu_get_tdp_level(vcpu);
> @@ -4937,7 +4918,7 @@ kvm_calc_nested_mmu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *
>  {
>  	union kvm_mmu_role role;
>  
> -	role = kvm_calc_shadow_root_page_role_common(vcpu, regs, false);
> +	role = kvm_calc_shadow_root_page_role_common(vcpu, regs);
>  
>  	/*
>  	 * Nested MMUs are used only for walking L2's gva->gpa, they never have
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 10/23] KVM: MMU: split cpu_role from mmu_role
  2022-02-04 11:57 ` [PATCH 10/23] KVM: MMU: split cpu_role from mmu_role Paolo Bonzini
@ 2022-02-04 21:57   ` David Matlack
  2022-02-05 14:49     ` Paolo Bonzini
  0 siblings, 1 reply; 72+ messages in thread
From: David Matlack @ 2022-02-04 21:57 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:57:05AM -0500, Paolo Bonzini wrote:
> Snapshot the state of the processor registers that govern page walk into
> a new field of struct kvm_mmu.  This is a more natural representation
> than having it *mostly* in mmu_role but not exclusively; the delta
> right now is represented in other fields, such as root_level.  For
> example, already in this patch we can replace role_regs_to_root_level
> with the "level" field of the CPU role.
> 
> The nested MMU now has only the CPU role; and in fact the new function
> kvm_calc_cpu_role is analogous to the previous kvm_calc_nested_mmu_role,
> except that it has role.base.direct equal to !CR0.PG.  It is not clear
> what the code meant by "setting role.base.direct to true to detect bogus
> usage of the nested MMU".
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/include/asm/kvm_host.h |   1 +
>  arch/x86/kvm/mmu/mmu.c          | 100 ++++++++++++++++++++------------
>  arch/x86/kvm/mmu/paging_tmpl.h  |   2 +-
>  3 files changed, 64 insertions(+), 39 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 4ec7d1e3aa36..427ee486309c 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -432,6 +432,7 @@ struct kvm_mmu {
>  	void (*invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa);
>  	hpa_t root_hpa;
>  	gpa_t root_pgd;
> +	union kvm_mmu_role cpu_role;
>  	union kvm_mmu_role mmu_role;
>  	u8 root_level;
>  	u8 shadow_root_level;
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index dd69cfc8c4f6..f98444e1d834 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -230,7 +230,7 @@ BUILD_MMU_ROLE_REGS_ACCESSOR(efer, lma, EFER_LMA);
>  #define BUILD_MMU_ROLE_ACCESSOR(base_or_ext, reg, name)		\
>  static inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu)	\
>  {								\
> -	return !!(mmu->mmu_role. base_or_ext . reg##_##name);	\
> +	return !!(mmu->cpu_role. base_or_ext . reg##_##name);	\
>  }
>  BUILD_MMU_ROLE_ACCESSOR(ext,  cr0, pg);
>  BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp);
> @@ -4658,6 +4658,38 @@ static void paging32_init_context(struct kvm_mmu *context)
>  	context->direct_map = false;
>  }
>  
> +static union kvm_mmu_role
> +kvm_calc_cpu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
> +{
> +	union kvm_mmu_role role = {0};
> +
> +	role.base.access = ACC_ALL;
> +	role.base.smm = is_smm(vcpu);
> +	role.base.guest_mode = is_guest_mode(vcpu);
> +	role.base.direct = !____is_cr0_pg(regs);
> +	if (!role.base.direct) {
> +		role.base.efer_nx = ____is_efer_nx(regs);
> +		role.base.cr0_wp = ____is_cr0_wp(regs);
> +		role.base.smep_andnot_wp = ____is_cr4_smep(regs) && !____is_cr0_wp(regs);
> +		role.base.smap_andnot_wp = ____is_cr4_smap(regs) && !____is_cr0_wp(regs);
> +		role.base.has_4_byte_gpte = !____is_cr4_pae(regs);
> +		role.base.level = role_regs_to_root_level(regs);
> +
> +		role.ext.cr0_pg = 1;
> +		role.ext.cr4_pae = ____is_cr4_pae(regs);
> +		role.ext.cr4_smep = ____is_cr4_smep(regs);
> +		role.ext.cr4_smap = ____is_cr4_smap(regs);
> +		role.ext.cr4_pse = ____is_cr4_pse(regs);
> +
> +		/* PKEY and LA57 are active iff long mode is active. */
> +		role.ext.cr4_pke = ____is_efer_lma(regs) && ____is_cr4_pke(regs);
> +		role.ext.cr4_la57 = ____is_efer_lma(regs) && ____is_cr4_la57(regs);
> +		role.ext.efer_lma = ____is_efer_lma(regs);
> +	}
> +
> +	return role;
> +}
> +
>  static union kvm_mmu_role kvm_calc_mmu_role_common(struct kvm_vcpu *vcpu,
>  						   const struct kvm_mmu_role_regs *regs)
>  {
> @@ -4716,13 +4748,16 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
>  			     const struct kvm_mmu_role_regs *regs)
>  {
>  	struct kvm_mmu *context = &vcpu->arch.root_mmu;
> -	union kvm_mmu_role new_role =
> +	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
> +	union kvm_mmu_role mmu_role =
>  		kvm_calc_tdp_mmu_root_page_role(vcpu, regs);
>  
> -	if (new_role.as_u64 == context->mmu_role.as_u64)
> +	if (cpu_role.as_u64 == context->cpu_role.as_u64 &&
> +	    mmu_role.as_u64 == context->mmu_role.as_u64)
>  		return;
>  
> -	context->mmu_role.as_u64 = new_role.as_u64;
> +	context->cpu_role.as_u64 = cpu_role.as_u64;
> +	context->mmu_role.as_u64 = mmu_role.as_u64;
>  	context->page_fault = kvm_tdp_page_fault;
>  	context->sync_page = nonpaging_sync_page;
>  	context->invlpg = NULL;
> @@ -4777,13 +4812,15 @@ kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
>  }
>  
>  static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *context,
> -				    const struct kvm_mmu_role_regs *regs,
> -				    union kvm_mmu_role new_role)
> +				    union kvm_mmu_role cpu_role,
> +				    union kvm_mmu_role mmu_role)
>  {
> -	if (new_role.as_u64 == context->mmu_role.as_u64)
> +	if (cpu_role.as_u64 == context->cpu_role.as_u64 &&
> +	    mmu_role.as_u64 == context->mmu_role.as_u64)
>  		return;
>  
> -	context->mmu_role.as_u64 = new_role.as_u64;
> +	context->cpu_role.as_u64 = cpu_role.as_u64;
> +	context->mmu_role.as_u64 = mmu_role.as_u64;
>  
>  	if (!is_cr0_pg(context))
>  		nonpaging_init_context(context);
> @@ -4791,20 +4828,21 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
>  		paging64_init_context(context);
>  	else
>  		paging32_init_context(context);
> -	context->root_level = role_regs_to_root_level(regs);
> +	context->root_level = cpu_role.base.level;
>  
>  	reset_guest_paging_metadata(vcpu, context);
> -	context->shadow_root_level = new_role.base.level;
> +	context->shadow_root_level = mmu_role.base.level;
>  }
>  
>  static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
>  				const struct kvm_mmu_role_regs *regs)
>  {
>  	struct kvm_mmu *context = &vcpu->arch.root_mmu;
> -	union kvm_mmu_role new_role =
> +	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
> +	union kvm_mmu_role mmu_role =
>  		kvm_calc_shadow_mmu_root_page_role(vcpu, regs);
>  
> -	shadow_mmu_init_context(vcpu, context, regs, new_role);
> +	shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
>  
>  	/*
>  	 * KVM uses NX when TDP is disabled to handle a variety of scenarios,
> @@ -4839,11 +4877,10 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
>  		.cr4 = cr4 & ~X86_CR4_PKE,
>  		.efer = efer,
>  	};
> -	union kvm_mmu_role new_role;
> -
> -	new_role = kvm_calc_shadow_npt_root_page_role(vcpu, &regs);
> +	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, &regs);
> +	union kvm_mmu_role mmu_role = kvm_calc_shadow_npt_root_page_role(vcpu, &regs);
>  
> -	shadow_mmu_init_context(vcpu, context, &regs, new_role);
> +	shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
>  	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
>  	kvm_mmu_new_pgd(vcpu, nested_cr3);
>  }
> @@ -4862,7 +4899,6 @@ kvm_calc_shadow_ept_root_page_role(struct kvm_vcpu *vcpu, bool accessed_dirty,
>  	role.base.guest_mode = true;
>  	role.base.access = ACC_ALL;
>  
> -	/* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. */
>  	role.ext.word = 0;
>  	role.ext.execonly = execonly;
>  
> @@ -4879,7 +4915,9 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
>  		kvm_calc_shadow_ept_root_page_role(vcpu, accessed_dirty,
>  						   execonly, level);
>  
> -	if (new_role.as_u64 != context->mmu_role.as_u64) {
> +	if (new_role.as_u64 != context->cpu_role.as_u64) {
> +		/* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. */
> +		context->cpu_role.as_u64 = new_role.as_u64;
>  		context->mmu_role.as_u64 = new_role.as_u64;
>  
>  		context->shadow_root_level = level;
> @@ -4913,32 +4951,15 @@ static void init_kvm_softmmu(struct kvm_vcpu *vcpu,
>  	context->inject_page_fault = kvm_inject_page_fault;
>  }
>  
> -static union kvm_mmu_role
> -kvm_calc_nested_mmu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
> -{
> -	union kvm_mmu_role role;
> -
> -	role = kvm_calc_shadow_root_page_role_common(vcpu, regs);
> -
> -	/*
> -	 * Nested MMUs are used only for walking L2's gva->gpa, they never have
> -	 * shadow pages of their own and so "direct" has no meaning.   Set it
> -	 * to "true" to try to detect bogus usage of the nested MMU.
> -	 */
> -	role.base.direct = true;
> -	role.base.level = role_regs_to_root_level(regs);
> -	return role;
> -}
> -
>  static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
>  {
> -	union kvm_mmu_role new_role = kvm_calc_nested_mmu_role(vcpu, regs);
> +	union kvm_mmu_role new_role = kvm_calc_cpu_role(vcpu, regs);
>  	struct kvm_mmu *g_context = &vcpu->arch.nested_mmu;
>  
> -	if (new_role.as_u64 == g_context->mmu_role.as_u64)
> +	if (new_role.as_u64 == g_context->cpu_role.as_u64)
>  		return;
>  
> -	g_context->mmu_role.as_u64 = new_role.as_u64;
> +	g_context->cpu_role.as_u64 = new_role.as_u64;
>  	g_context->get_guest_pgd     = get_cr3;
>  	g_context->get_pdptr         = kvm_pdptr_read;
>  	g_context->inject_page_fault = kvm_inject_page_fault;
> @@ -4997,6 +5018,9 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	 * problem is swept under the rug; KVM's CPUID API is horrific and
>  	 * it's all but impossible to solve it without introducing a new API.
>  	 */
> +	vcpu->arch.root_mmu.cpu_role.base.level = 0;
> +	vcpu->arch.guest_mmu.cpu_role.base.level = 0;
> +	vcpu->arch.nested_mmu.cpu_role.base.level = 0;

Will cpu_role.base.level already be 0 if CR0.PG=0 && !tdp_enabled? i.e.
setting cpu_role.base.level to 0 might not have the desired effect.

It might not matter in practice since the shadow_mmu_init_context() and
kvm_calc_mmu_role_common() check both the mmu_role and cpu_role, but does
make this reset code confusing.
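
To spell out the concern, condensing kvm_calc_cpu_role() from the hunk
above:

	union kvm_mmu_role role = {0};

	role.base.direct = !____is_cr0_pg(regs);
	if (!role.base.direct) {
		/* ... */
		role.base.level = role_regs_to_root_level(regs);
	}

With CR0.PG=0 the recomputed cpu_role.base.level is also 0, so zeroing
only the cached cpu_role.base.level would not force a mismatch by itself.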

>  	vcpu->arch.root_mmu.mmu_role.base.level = 0;
>  	vcpu->arch.guest_mmu.mmu_role.base.level = 0;
>  	vcpu->arch.nested_mmu.mmu_role.base.level = 0;
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index 6bb9a377bf89..b9f472f27077 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -323,7 +323,7 @@ static inline bool FNAME(is_last_gpte)(struct kvm_mmu *mmu,
>  	 * is not reserved and does not indicate a large page at this level,
>  	 * so clear PT_PAGE_SIZE_MASK in gpte if that is the case.
>  	 */
> -	gpte &= level - (PT32_ROOT_LEVEL + mmu->mmu_role.ext.cr4_pse);
> +	gpte &= level - (PT32_ROOT_LEVEL + mmu->cpu_role.ext.cr4_pse);
>  #endif
>  	/*
>  	 * PG_LEVEL_4K always terminates.  The RHS has bit 7 set
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 07/23] KVM: MMU: remove kvm_mmu_calc_root_page_role
  2022-02-04 19:32   ` David Matlack
@ 2022-02-05 14:46     ` Paolo Bonzini
  0 siblings, 0 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-05 14:46 UTC (permalink / raw)
  To: David Matlack; +Cc: linux-kernel, kvm, seanjc, vkuznets

On 2/4/22 20:32, David Matlack wrote:
> On Fri, Feb 04, 2022 at 06:57:02AM -0500, Paolo Bonzini wrote:
>> Since the guest PGD is now loaded after the MMU has been set up
>> completely, the desired role for a cache hit is simply the current
>> mmu_role.  There is no need to compute it again, so __kvm_mmu_new_pgd
>> can be folded in kvm_mmu_new_pgd.
>>
>> As an aside, the !tdp_enabled case in the function was dead code,
>> and that also gets mopped up as a side effect.
> Couldn't the !tdp_enabled case be called via kvm_set_cr3() ->
> kvm_mmu_new_pgd()?
> 

Right, it's not dead code.  Rather, there never was a need for 
kvm_mmu_calc_root_page_role(vcpu) as opposed to just using what is 
already in vcpu->arch.mmu.

Paolo


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 10/23] KVM: MMU: split cpu_role from mmu_role
  2022-02-04 21:57   ` David Matlack
@ 2022-02-05 14:49     ` Paolo Bonzini
  2022-02-07 21:38       ` David Matlack
  0 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-05 14:49 UTC (permalink / raw)
  To: David Matlack; +Cc: linux-kernel, kvm, seanjc, vkuznets

On 2/4/22 22:57, David Matlack wrote:
>> +	vcpu->arch.root_mmu.cpu_role.base.level = 0;
>> +	vcpu->arch.guest_mmu.cpu_role.base.level = 0;
>> +	vcpu->arch.nested_mmu.cpu_role.base.level = 0;
> Will cpu_role.base.level already be 0 if CR0.PG=0 && !tdp_enabled? i.e.
> setting cpu_role.base.level to 0 might not have the desired effect.
> 
> It might not matter in practice since the shadow_mmu_init_context() and
> kvm_calc_mmu_role_common() check both the mmu_role and cpu_role, but does
> make this reset code confusing.
> 

Good point.  The (still unrealized) purpose of this series is to be able 
to check mmu_role only, so for now I'll just keep the valid bit in the 
ext part of the cpu_role.  The mmu_role's level however is never zero, 
so I can already use the level when I remove the ext part from the mmu_role.

I'll remove the valid bit of the ext part only after the cpu_role check 
is removed, because then it can trivially go.
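
Something along these lines, just as a sketch of the intermediate step
(not final code):

	/* cpu_role keeps ext.valid for now, so clearing it is enough. */
	vcpu->arch.root_mmu.cpu_role.ext.valid = 0;
	vcpu->arch.guest_mmu.cpu_role.ext.valid = 0;
	vcpu->arch.nested_mmu.cpu_role.ext.valid = 0;

	/*
	 * Once mmu_role loses the ext part, its level (never zero when
	 * valid) becomes the marker there.
	 */
	vcpu->arch.root_mmu.mmu_role.base.level = 0;
	vcpu->arch.guest_mmu.mmu_role.base.level = 0;
	vcpu->arch.nested_mmu.mmu_role.base.level = 0;
	kvm_mmu_reset_context(vcpu);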

Paolo


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 03/23] KVM: MMU: remove valid from extended role
  2022-02-04 18:32   ` David Matlack
@ 2022-02-05 14:50     ` Paolo Bonzini
  0 siblings, 0 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-05 14:50 UTC (permalink / raw)
  To: David Matlack; +Cc: linux-kernel, kvm, seanjc, vkuznets

On 2/4/22 19:32, David Matlack wrote:
>> -	vcpu->arch.root_mmu.mmu_role.ext.valid = 0;
>> -	vcpu->arch.guest_mmu.mmu_role.ext.valid = 0;
>> -	vcpu->arch.nested_mmu.mmu_role.ext.valid = 0;
>> +	vcpu->arch.root_mmu.mmu_role.base.level = 0;
>> +	vcpu->arch.guest_mmu.mmu_role.base.level = 0;
>> +	vcpu->arch.nested_mmu.mmu_role.base.level = 0;
> I agree this will work but I think it makes the code more difficult to
> follow (and I start worrying that some code that relies on level being
> accurate will creep in in the future). At minimum we should extend the
> comment here to describe why level is being changed.
> 
> I did a half-assed attempt to pass something like "bool force_role_reset"
> down to the MMU initialization functions as an alternative but it very
> quickly got out of hand.
> 
> What about just changing `valid` to `cpuid_stale` and flip the meaning?
> kvm_mmu_after_set_cpuid() would set the cpuid_stale bit and then reset
> the MMUs.
> 

For now I'll swap this patch with one that clears the whole word, but 
keep the ext bit as described in my other reply.

Paolo


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 01/23] KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask
  2022-02-04 17:59   ` David Matlack
@ 2022-02-05 14:52     ` Paolo Bonzini
  2022-02-07 16:09       ` Sean Christopherson
  2022-02-10  0:30     ` Sean Christopherson
  1 sibling, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-05 14:52 UTC (permalink / raw)
  To: David Matlack; +Cc: linux-kernel, kvm, seanjc, vkuznets

On 2/4/22 18:59, David Matlack wrote:
>> +	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
> 
> Out of curiosity, how does KVM mitigate iTLB multi-hit when shadowing
> NPT and the guest has not enabled EFER.NX?

You got me worried for a second but iTLB multihit is Intel-only, isn't it?

Paolo


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 06/23] KVM: MMU: load new PGD once nested two-dimensional paging is initialized
  2022-02-04 19:18   ` David Matlack
@ 2022-02-07 13:50     ` Paolo Bonzini
  2022-02-07 14:35       ` Paolo Bonzini
  2022-02-09 12:34     ` Paolo Bonzini
  1 sibling, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-07 13:50 UTC (permalink / raw)
  To: David Matlack; +Cc: linux-kernel, kvm, seanjc, vkuznets

On 2/4/22 20:18, David Matlack wrote:
> On Fri, Feb 04, 2022 at 06:57:01AM -0500, Paolo Bonzini wrote:
>> __kvm_mmu_new_pgd looks at the MMU's root_level and shadow_root_level
>> via fast_pgd_switch.
> Those checks are just for performance correct (to skip iterating through
> the list of roots)?
> 
> Either way, it's probably worth including a Fixes tag below.
> 

There's no bug because __kvm_mmu_new_pgd is passed a correct new_role. 
But it's unnecessarily complex as shown by patches 7 and 9.

Paolo


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 06/23] KVM: MMU: load new PGD once nested two-dimensional paging is initialized
  2022-02-07 13:50     ` Paolo Bonzini
@ 2022-02-07 14:35       ` Paolo Bonzini
  0 siblings, 0 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-07 14:35 UTC (permalink / raw)
  To: David Matlack; +Cc: linux-kernel, kvm, seanjc, vkuznets

On 2/7/22 14:50, Paolo Bonzini wrote:
>>>
>> Those checks are just for performance correct (to skip iterating through
>> the list of roots)?
>>
>> Either way, it's probably worth including a Fixes tag below.
> 
> There's no bug because __kvm_mmu_new_pgd is passed a correct new_role. 
> But it's unnecessarily complex as shown by patches 7 and 9.

root_level is not part of the role, but we're still safe because 
changing it requires flipping CR0.PG to 0 and back to 1.  This calls 
kvm_mmu_reset_context and unloads all roots, so it's not possible to 
have a cached PGD with the wrong root level.  On the first call the 
root_level might be incorrect, but then again there's no cached PGD.

(Also, shadow_root_level is either always or never >= PT64_ROOT_4LEVEL, 
so it's okay; the only exception is a guest_mmu that was never set up, 
and in that case there is no cached PGD anyway).

Paolo


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 01/23] KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask
  2022-02-05 14:52     ` Paolo Bonzini
@ 2022-02-07 16:09       ` Sean Christopherson
  2022-02-07 21:50         ` David Matlack
  0 siblings, 1 reply; 72+ messages in thread
From: Sean Christopherson @ 2022-02-07 16:09 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: David Matlack, linux-kernel, kvm, vkuznets

On Sat, Feb 05, 2022, Paolo Bonzini wrote:
> On 2/4/22 18:59, David Matlack wrote:
> > > +	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
> > 
> > Out of curiosity, how does KVM mitigate iTLB multi-hit when shadowing
> > NPT and the guest has not enabled EFER.NX?
> 
> You got me worried for a second but iTLB multihit is Intel-only, isn't it?

AFAIK, yes, big Core only.  arch/x86/kernel/cpu/common.c sets NO_ITLB_MULTIHIT
for all AMD, Hygon, and Atom CPUs.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 10/23] KVM: MMU: split cpu_role from mmu_role
  2022-02-05 14:49     ` Paolo Bonzini
@ 2022-02-07 21:38       ` David Matlack
  0 siblings, 0 replies; 72+ messages in thread
From: David Matlack @ 2022-02-07 21:38 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: LKML, kvm list, Sean Christopherson, Vitaly Kuznetsov

On Sat, Feb 5, 2022 at 6:49 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 2/4/22 22:57, David Matlack wrote:
> >> +    vcpu->arch.root_mmu.cpu_role.base.level = 0;
> >> +    vcpu->arch.guest_mmu.cpu_role.base.level = 0;
> >> +    vcpu->arch.nested_mmu.cpu_role.base.level = 0;
> > Will cpu_role.base.level already be 0 if CR0.PG=0 && !tdp_enabled? i.e.
> > setting cpu_role.base.level to 0 might not have the desired effect.
> >
> > It might not matter in practice since the shadow_mmu_init_context() and
> > kvm_calc_mmu_role_common() check both the mmu_role and cpu_role, but does
> > make this reset code confusing.
> >
>
> Good point.  The (still unrealized) purpose of this series is to be able
> to check mmu_role only, so for now I'll just keep the valid bit in the
> ext part of the cpu_role.  The mmu_role's level however is never zero,
> so I can already use the level when I remove the ext part from the mmu_role.

Agreed.

>
> I'll remove the valid bit of the ext part only after the cpu_role check
> is removed, because then it can trivially go.

Ok sounds good.

>
> Paolo
>

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 01/23] KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask
  2022-02-07 16:09       ` Sean Christopherson
@ 2022-02-07 21:50         ` David Matlack
  0 siblings, 0 replies; 72+ messages in thread
From: David Matlack @ 2022-02-07 21:50 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, LKML, kvm list, Vitaly Kuznetsov

On Mon, Feb 7, 2022 at 8:10 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Sat, Feb 05, 2022, Paolo Bonzini wrote:
> > On 2/4/22 18:59, David Matlack wrote:
> > > > + reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
> > >
> > > Out of curiosity, how does KVM mitigate iTLB multi-hit when shadowing
> > > NPT and the guest has not enabled EFER.NX?
> >
> > You got me worried for a second but iTLB multihit is Intel-only, isn't it?
>
> AFAIK, yes, big Core only.  arch/x86/kernel/cpu/common.c sets NO_ITLB_MULTIHIT
> for all AMD, Hygon, and Atom CPUs.

Ah that's right, it's Intel-only. Thanks!

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 11/23] KVM: MMU: do not recompute root level from kvm_mmu_role_regs
  2022-02-04 11:57 ` [PATCH 11/23] KVM: MMU: do not recompute root level from kvm_mmu_role_regs Paolo Bonzini
@ 2022-02-07 22:10   ` David Matlack
  2022-02-07 22:17     ` David Matlack
  0 siblings, 1 reply; 72+ messages in thread
From: David Matlack @ 2022-02-07 22:10 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:57:06AM -0500, Paolo Bonzini wrote:
> The root_level can be found in the cpu_role (in fact the field
> is superfluous and could be removed, but one thing at a time).
> Since there is only one usage left of role_regs_to_root_level,
> inline it into kvm_calc_cpu_role.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 23 ++++++++---------------
>  1 file changed, 8 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index f98444e1d834..74789295f922 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -253,19 +253,6 @@ static struct kvm_mmu_role_regs vcpu_to_role_regs(struct kvm_vcpu *vcpu)
>  	return regs;
>  }
>  
> -static int role_regs_to_root_level(const struct kvm_mmu_role_regs *regs)
> -{
> -	if (!____is_cr0_pg(regs))
> -		return 0;
> -	else if (____is_efer_lma(regs))
> -		return ____is_cr4_la57(regs) ? PT64_ROOT_5LEVEL :
> -					       PT64_ROOT_4LEVEL;
> -	else if (____is_cr4_pae(regs))
> -		return PT32E_ROOT_LEVEL;
> -	else
> -		return PT32_ROOT_LEVEL;
> -}
> -
>  static inline bool kvm_available_flush_tlb_with_range(void)
>  {
>  	return kvm_x86_ops.tlb_remote_flush_with_range;
> @@ -4673,7 +4660,13 @@ kvm_calc_cpu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
>  		role.base.smep_andnot_wp = ____is_cr4_smep(regs) && !____is_cr0_wp(regs);
>  		role.base.smap_andnot_wp = ____is_cr4_smap(regs) && !____is_cr0_wp(regs);
>  		role.base.has_4_byte_gpte = !____is_cr4_pae(regs);
> -		role.base.level = role_regs_to_root_level(regs);
> +
> +		if (____is_efer_lma(regs))
> +			role.base.level = ____is_cr4_la57(regs) ? PT64_ROOT_5LEVEL : PT64_ROOT_4LEVEL;
> +		else if (____is_cr4_pae(regs))
> +			role.base.level = PT32E_ROOT_LEVEL;
> +		else
> +			role.base.level = PT32_ROOT_LEVEL;

Did you mean to drop the !CR0.PG case?

>  
>  		role.ext.cr0_pg = 1;
>  		role.ext.cr4_pae = ____is_cr4_pae(regs);
> @@ -4766,7 +4759,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
>  	context->get_guest_pgd = get_cr3;
>  	context->get_pdptr = kvm_pdptr_read;
>  	context->inject_page_fault = kvm_inject_page_fault;
> -	context->root_level = role_regs_to_root_level(regs);
> +	context->root_level = cpu_role.base.level;
>  
>  	if (!is_cr0_pg(context))
>  		context->gva_to_gpa = nonpaging_gva_to_gpa;
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 11/23] KVM: MMU: do not recompute root level from kvm_mmu_role_regs
  2022-02-07 22:10   ` David Matlack
@ 2022-02-07 22:17     ` David Matlack
  0 siblings, 0 replies; 72+ messages in thread
From: David Matlack @ 2022-02-07 22:17 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Mon, Feb 07, 2022 at 10:10:39PM +0000, David Matlack wrote:
> On Fri, Feb 04, 2022 at 06:57:06AM -0500, Paolo Bonzini wrote:
> > The root_level can be found in the cpu_role (in fact the field
> > is superfluous and could be removed, but one thing at a time).
> > Since there is only one usage left of role_regs_to_root_level,
> > inline it into kvm_calc_cpu_role.
> > 
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 23 ++++++++---------------
> >  1 file changed, 8 insertions(+), 15 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index f98444e1d834..74789295f922 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -253,19 +253,6 @@ static struct kvm_mmu_role_regs vcpu_to_role_regs(struct kvm_vcpu *vcpu)
> >  	return regs;
> >  }
> >  
> > -static int role_regs_to_root_level(const struct kvm_mmu_role_regs *regs)
> > -{
> > -	if (!____is_cr0_pg(regs))
> > -		return 0;
> > -	else if (____is_efer_lma(regs))
> > -		return ____is_cr4_la57(regs) ? PT64_ROOT_5LEVEL :
> > -					       PT64_ROOT_4LEVEL;
> > -	else if (____is_cr4_pae(regs))
> > -		return PT32E_ROOT_LEVEL;
> > -	else
> > -		return PT32_ROOT_LEVEL;
> > -}
> > -
> >  static inline bool kvm_available_flush_tlb_with_range(void)
> >  {
> >  	return kvm_x86_ops.tlb_remote_flush_with_range;
> > @@ -4673,7 +4660,13 @@ kvm_calc_cpu_role(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
> >  		role.base.smep_andnot_wp = ____is_cr4_smep(regs) && !____is_cr0_wp(regs);
> >  		role.base.smap_andnot_wp = ____is_cr4_smap(regs) && !____is_cr0_wp(regs);
> >  		role.base.has_4_byte_gpte = !____is_cr4_pae(regs);
> > -		role.base.level = role_regs_to_root_level(regs);
> > +
> > +		if (____is_efer_lma(regs))
> > +			role.base.level = ____is_cr4_la57(regs) ? PT64_ROOT_5LEVEL : PT64_ROOT_4LEVEL;
> > +		else if (____is_cr4_pae(regs))
> > +			role.base.level = PT32E_ROOT_LEVEL;
> > +		else
> > +			role.base.level = PT32_ROOT_LEVEL;
> 
> Did you mean to drop the !CR0.PG case?

Oh never mind I see the !CR0.PG case is handled above.
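
To spell it out for anyone else reading the diff: the paging-dependent
fields in kvm_calc_cpu_role() are computed under a CR0.PG check that the
hunk context doesn't show, so role.base.level simply stays 0 when paging is
disabled.  Rough shape only, not the exact patch text:

    union kvm_mmu_role role = {0};

    if (____is_cr0_pg(regs)) {
        /* CR0/CR4/EFER-derived bits, including the level
         * computation shown in the hunk above. */
        ...
    }
    /* With CR0.PG=0, level is left at 0, matching what
     * role_regs_to_root_level() used to return. */
    return role;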

Reviewed-by: David Matlack <dmatlack@google.com>

> 
> >  
> >  		role.ext.cr0_pg = 1;
> >  		role.ext.cr4_pae = ____is_cr4_pae(regs);
> > @@ -4766,7 +4759,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
> >  	context->get_guest_pgd = get_cr3;
> >  	context->get_pdptr = kvm_pdptr_read;
> >  	context->inject_page_fault = kvm_inject_page_fault;
> > -	context->root_level = role_regs_to_root_level(regs);
> > +	context->root_level = cpu_role.base.level;
> >  
> >  	if (!is_cr0_pg(context))
> >  		context->gva_to_gpa = nonpaging_gva_to_gpa;
> > -- 
> > 2.31.1
> > 
> > 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 13/23] KVM: MMU: remove kvm_calc_shadow_root_page_role_common
  2022-02-04 11:57 ` [PATCH 13/23] KVM: MMU: remove kvm_calc_shadow_root_page_role_common Paolo Bonzini
@ 2022-02-07 22:25   ` David Matlack
  0 siblings, 0 replies; 72+ messages in thread
From: David Matlack @ 2022-02-07 22:25 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:57:08AM -0500, Paolo Bonzini wrote:
> kvm_calc_shadow_root_page_role_common is the same as
> kvm_calc_cpu_role except for the level, which is overwritten
> afterwards in kvm_calc_shadow_mmu_root_page_role
> and kvm_calc_shadow_npt_root_page_role.
> 
> role.base.direct is already set correctly for the CPU role,
> and CR0.PG=1 is required for VMRUN so it will also be
> correct for nested NPT.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/mmu/mmu.c | 21 ++-------------------
>  1 file changed, 2 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index d6b5d8c1c0dc..19abf1e4cee9 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4772,27 +4772,11 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
>  	reset_tdp_shadow_zero_bits_mask(context);
>  }
>  
> -static union kvm_mmu_role
> -kvm_calc_shadow_root_page_role_common(struct kvm_vcpu *vcpu,
> -				      const struct kvm_mmu_role_regs *regs)
> -{
> -	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, regs);
> -
> -	role.base.smep_andnot_wp = role.ext.cr4_smep && !____is_cr0_wp(regs);
> -	role.base.smap_andnot_wp = role.ext.cr4_smap && !____is_cr0_wp(regs);
> -	role.base.has_4_byte_gpte = ____is_cr0_pg(regs) && !____is_cr4_pae(regs);
> -
> -	return role;
> -}
> -
>  static union kvm_mmu_role
>  kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu,
>  				   const struct kvm_mmu_role_regs *regs)
>  {
> -	union kvm_mmu_role role =
> -		kvm_calc_shadow_root_page_role_common(vcpu, regs);
> -
> -	role.base.direct = !____is_cr0_pg(regs);
> +	union kvm_mmu_role role = kvm_calc_cpu_role(vcpu, regs);
>  
>  	if (!____is_efer_lma(regs))
>  		role.base.level = PT32E_ROOT_LEVEL;
> @@ -4853,9 +4837,8 @@ kvm_calc_shadow_npt_root_page_role(struct kvm_vcpu *vcpu,
>  				   const struct kvm_mmu_role_regs *regs)
>  {
>  	union kvm_mmu_role role =
> -		kvm_calc_shadow_root_page_role_common(vcpu, regs);
> +               kvm_calc_cpu_role(vcpu, regs);
>  
> -	role.base.direct = false;
>  	role.base.level = kvm_mmu_get_tdp_level(vcpu);
>  
>  	return role;
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 20/23] KVM: MMU: pull CPU role computation to kvm_init_mmu
  2022-02-04 11:57 ` [PATCH 20/23] KVM: MMU: pull CPU role computation to kvm_init_mmu Paolo Bonzini
@ 2022-02-07 22:42   ` David Matlack
  0 siblings, 0 replies; 72+ messages in thread
From: David Matlack @ 2022-02-07 22:42 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:57:15AM -0500, Paolo Bonzini wrote:
> Do not lead init_kvm_*mmu into the temptation of poking
> into struct kvm_mmu_role_regs, by passing to it directly
> the CPU role.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 21 +++++++++------------
>  1 file changed, 9 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 01027da82e23..6f9d876ce429 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4721,11 +4721,9 @@ kvm_calc_tdp_mmu_root_page_role(struct kvm_vcpu *vcpu,
>  	return role;
>  }
>  
> -static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
> -			     const struct kvm_mmu_role_regs *regs)
> +static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu, union kvm_mmu_role cpu_role)
>  {
>  	struct kvm_mmu *context = &vcpu->arch.root_mmu;
> -	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
>  	union kvm_mmu_page_role mmu_role = kvm_calc_tdp_mmu_root_page_role(vcpu, cpu_role);
>  
>  	if (cpu_role.as_u64 == context->cpu_role.as_u64 &&
> @@ -4779,10 +4777,9 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
>  }
>  
>  static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
> -				const struct kvm_mmu_role_regs *regs)
> +				union kvm_mmu_role cpu_role)
>  {
>  	struct kvm_mmu *context = &vcpu->arch.root_mmu;
> -	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, regs);
>  	union kvm_mmu_page_role mmu_role;
>  
>  	mmu_role = cpu_role.base;
> @@ -4874,20 +4871,19 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
>  EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu);
>  
>  static void init_kvm_softmmu(struct kvm_vcpu *vcpu,
> -			     const struct kvm_mmu_role_regs *regs)
> +			     union kvm_mmu_role cpu_role)
>  {
>  	struct kvm_mmu *context = &vcpu->arch.root_mmu;
>  
> -	kvm_init_shadow_mmu(vcpu, regs);
> +	kvm_init_shadow_mmu(vcpu, cpu_role);
>  
>  	context->get_guest_pgd     = get_cr3;
>  	context->get_pdptr         = kvm_pdptr_read;
>  	context->inject_page_fault = kvm_inject_page_fault;
>  }
>  
> -static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, const struct kvm_mmu_role_regs *regs)
> +static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, union kvm_mmu_role new_role)
>  {
> -	union kvm_mmu_role new_role = kvm_calc_cpu_role(vcpu, regs);
>  	struct kvm_mmu *g_context = &vcpu->arch.nested_mmu;
>  
>  	if (new_role.as_u64 == g_context->cpu_role.as_u64)
> @@ -4928,13 +4924,14 @@ static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, const struct kvm_mmu_role
>  void kvm_init_mmu(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
> +	union kvm_mmu_role cpu_role = kvm_calc_cpu_role(vcpu, &regs);

WDYT about also inlining vcpu_to_role_regs() in kvm_calc_cpu_role()?
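
Something along these lines (untested sketch, assuming no caller still
needs to build its own kvm_mmu_role_regs):

    static union kvm_mmu_role kvm_calc_cpu_role(struct kvm_vcpu *vcpu)
    {
        struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
        union kvm_mmu_role role = {0};

        /* ... same computation as today, just reading &regs locally ... */

        return role;
    }

so kvm_init_mmu() would only have to pass the vcpu.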

>  
>  	if (mmu_is_nested(vcpu))
> -		init_kvm_nested_mmu(vcpu, &regs);
> +		init_kvm_nested_mmu(vcpu, cpu_role);
>  	else if (tdp_enabled)
> -		init_kvm_tdp_mmu(vcpu, &regs);
> +		init_kvm_tdp_mmu(vcpu, cpu_role);
>  	else
> -		init_kvm_softmmu(vcpu, &regs);
> +		init_kvm_softmmu(vcpu, cpu_role);
>  }
>  EXPORT_SYMBOL_GPL(kvm_init_mmu);
>  
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 21/23] KVM: MMU: store shadow_root_level into mmu_role
  2022-02-04 11:57 ` [PATCH 21/23] KVM: MMU: store shadow_root_level into mmu_role Paolo Bonzini
@ 2022-02-07 23:00   ` David Matlack
  0 siblings, 0 replies; 72+ messages in thread
From: David Matlack @ 2022-02-07 23:00 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:57:16AM -0500, Paolo Bonzini wrote:
> mmu_role.level is always the same value as shadow_level:
> 
> - kvm_mmu_get_tdp_level(vcpu) when going through init_kvm_tdp_mmu
> 
> - the level argument when going through kvm_init_shadow_ept_mmu
> 
> - it's assigned directly from new_role.base.level when going
>   through shadow_mmu_init_context
> 
> Remove the duplication and get the level directly from the role.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

nit: How about the following for the shortlog?

KVM: MMU: Replace shadow_root_level with mmu_role.level

Otherwise,

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/include/asm/kvm_host.h |  1 -
>  arch/x86/kvm/mmu.h              |  2 +-
>  arch/x86/kvm/mmu/mmu.c          | 36 +++++++++++++++------------------
>  arch/x86/kvm/mmu/tdp_mmu.c      |  2 +-
>  arch/x86/kvm/svm/svm.c          |  2 +-
>  arch/x86/kvm/vmx/vmx.c          |  2 +-
>  6 files changed, 20 insertions(+), 25 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index b0085c54786c..867fc82f1de5 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -433,7 +433,6 @@ struct kvm_mmu {
>  	union kvm_mmu_role cpu_role;
>  	union kvm_mmu_page_role mmu_role;
>  	u8 root_level;
> -	u8 shadow_root_level;
>  	bool direct_map;
>  	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];
>  
> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index 51faa2c76ca5..43b99308cb0e 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -112,7 +112,7 @@ static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
>  		return;
>  
>  	static_call(kvm_x86_load_mmu_pgd)(vcpu, root_hpa,
> -					  vcpu->arch.mmu->shadow_root_level);
> +					  vcpu->arch.mmu->mmu_role.level);
>  }
>  
>  struct kvm_page_fault {
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 6f9d876ce429..4d1fa87718f8 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -2143,7 +2143,7 @@ static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterato
>  {
>  	iterator->addr = addr;
>  	iterator->shadow_addr = root;
> -	iterator->level = vcpu->arch.mmu->shadow_root_level;
> +	iterator->level = vcpu->arch.mmu->mmu_role.level;
>  
>  	if (iterator->level >= PT64_ROOT_4LEVEL &&
>  	    vcpu->arch.mmu->root_level < PT64_ROOT_4LEVEL &&
> @@ -3254,7 +3254,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
>  					   &invalid_list);
>  
>  	if (free_active_root) {
> -		if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
> +		if (mmu->mmu_role.level >= PT64_ROOT_4LEVEL &&
>  		    (mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
>  			mmu_free_root_page(kvm, &mmu->root_hpa, &invalid_list);
>  		} else if (mmu->pae_root) {
> @@ -3329,7 +3329,7 @@ static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, gva_t gva,
>  static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_mmu *mmu = vcpu->arch.mmu;
> -	u8 shadow_root_level = mmu->shadow_root_level;
> +	u8 shadow_root_level = mmu->mmu_role.level;
>  	hpa_t root;
>  	unsigned i;
>  	int r;
> @@ -3479,7 +3479,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>  	 */
>  	if (mmu->root_level >= PT64_ROOT_4LEVEL) {
>  		root = mmu_alloc_root(vcpu, root_gfn, 0,
> -				      mmu->shadow_root_level, false);
> +				      mmu->mmu_role.level, false);
>  		mmu->root_hpa = root;
>  		goto set_root_pgd;
>  	}
> @@ -3495,7 +3495,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>  	 * the shadow page table may be a PAE or a long mode page table.
>  	 */
>  	pm_mask = PT_PRESENT_MASK | shadow_me_mask;
> -	if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL) {
> +	if (mmu->mmu_role.level >= PT64_ROOT_4LEVEL) {
>  		pm_mask |= PT_ACCESSED_MASK | PT_WRITABLE_MASK | PT_USER_MASK;
>  
>  		if (WARN_ON_ONCE(!mmu->pml4_root)) {
> @@ -3504,7 +3504,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>  		}
>  		mmu->pml4_root[0] = __pa(mmu->pae_root) | pm_mask;
>  
> -		if (mmu->shadow_root_level == PT64_ROOT_5LEVEL) {
> +		if (mmu->mmu_role.level == PT64_ROOT_5LEVEL) {
>  			if (WARN_ON_ONCE(!mmu->pml5_root)) {
>  				r = -EIO;
>  				goto out_unlock;
> @@ -3529,9 +3529,9 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>  		mmu->pae_root[i] = root | pm_mask;
>  	}
>  
> -	if (mmu->shadow_root_level == PT64_ROOT_5LEVEL)
> +	if (mmu->mmu_role.level == PT64_ROOT_5LEVEL)
>  		mmu->root_hpa = __pa(mmu->pml5_root);
> -	else if (mmu->shadow_root_level == PT64_ROOT_4LEVEL)
> +	else if (mmu->mmu_role.level == PT64_ROOT_4LEVEL)
>  		mmu->root_hpa = __pa(mmu->pml4_root);
>  	else
>  		mmu->root_hpa = __pa(mmu->pae_root);
> @@ -3547,7 +3547,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>  static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_mmu *mmu = vcpu->arch.mmu;
> -	bool need_pml5 = mmu->shadow_root_level > PT64_ROOT_4LEVEL;
> +	bool need_pml5 = mmu->mmu_role.level > PT64_ROOT_4LEVEL;
>  	u64 *pml5_root = NULL;
>  	u64 *pml4_root = NULL;
>  	u64 *pae_root;
> @@ -3559,7 +3559,7 @@ static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
>  	 * on demand, as running a 32-bit L1 VMM on 64-bit KVM is very rare.
>  	 */
>  	if (mmu->direct_map || mmu->root_level >= PT64_ROOT_4LEVEL ||
> -	    mmu->shadow_root_level < PT64_ROOT_4LEVEL)
> +	    mmu->mmu_role.level < PT64_ROOT_4LEVEL)
>  		return 0;
>  
>  	/*
> @@ -4145,7 +4145,7 @@ static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t new_pgd,
>  	 * having to deal with PDPTEs. We may add support for 32-bit hosts/VMs
>  	 * later if necessary.
>  	 */
> -	if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
> +	if (mmu->mmu_role.level >= PT64_ROOT_4LEVEL &&
>  	    mmu->root_level >= PT64_ROOT_4LEVEL)
>  		return cached_root_available(vcpu, new_pgd, new_role);
>  
> @@ -4408,17 +4408,17 @@ static void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
>  	struct rsvd_bits_validate *shadow_zero_check;
>  	int i;
>  
> -	WARN_ON_ONCE(context->shadow_root_level < PT32E_ROOT_LEVEL);
> +	WARN_ON_ONCE(context->mmu_role.level < PT32E_ROOT_LEVEL);
>  
>  	shadow_zero_check = &context->shadow_zero_check;
>  	__reset_rsvds_bits_mask(shadow_zero_check, reserved_hpa_bits(),
> -				context->shadow_root_level, uses_nx,
> +				context->mmu_role.level, uses_nx,
>  				guest_can_use_gbpages(vcpu), is_pse, is_amd);
>  
>  	if (!shadow_me_mask)
>  		return;
>  
> -	for (i = context->shadow_root_level; --i >= 0;) {
> +	for (i = context->mmu_role.level; --i >= 0;) {
>  		shadow_zero_check->rsvd_bits_mask[0][i] &= ~shadow_me_mask;
>  		shadow_zero_check->rsvd_bits_mask[1][i] &= ~shadow_me_mask;
>  	}
> @@ -4445,7 +4445,7 @@ reset_tdp_shadow_zero_bits_mask(struct kvm_mmu *context)
>  
>  	if (boot_cpu_is_amd())
>  		__reset_rsvds_bits_mask(shadow_zero_check, reserved_hpa_bits(),
> -					context->shadow_root_level, false,
> +					context->mmu_role.level, false,
>  					boot_cpu_has(X86_FEATURE_GBPAGES),
>  					false, true);
>  	else
> @@ -4456,7 +4456,7 @@ reset_tdp_shadow_zero_bits_mask(struct kvm_mmu *context)
>  	if (!shadow_me_mask)
>  		return;
>  
> -	for (i = context->shadow_root_level; --i >= 0;) {
> +	for (i = context->mmu_role.level; --i >= 0;) {
>  		shadow_zero_check->rsvd_bits_mask[0][i] &= ~shadow_me_mask;
>  		shadow_zero_check->rsvd_bits_mask[1][i] &= ~shadow_me_mask;
>  	}
> @@ -4735,7 +4735,6 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu, union kvm_mmu_role cpu_role)
>  	context->page_fault = kvm_tdp_page_fault;
>  	context->sync_page = nonpaging_sync_page;
>  	context->invlpg = NULL;
> -	context->shadow_root_level = kvm_mmu_get_tdp_level(vcpu);
>  	context->direct_map = true;
>  	context->get_guest_pgd = get_cr3;
>  	context->get_pdptr = kvm_pdptr_read;
> @@ -4773,7 +4772,6 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
>  	context->root_level = cpu_role.base.level;
>  
>  	reset_guest_paging_metadata(vcpu, context);
> -	context->shadow_root_level = mmu_role.level;
>  }
>  
>  static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
> @@ -4852,8 +4850,6 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
>  		context->cpu_role.as_u64 = new_role.as_u64;
>  		context->mmu_role.word = new_role.base.word;
>  
> -		context->shadow_root_level = level;
> -
>  		context->page_fault = ept_page_fault;
>  		context->gva_to_gpa = ept_gva_to_gpa;
>  		context->sync_page = ept_sync_page;
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index dd4c78833016..9fb6d983bae9 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -1697,7 +1697,7 @@ int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
>  	gfn_t gfn = addr >> PAGE_SHIFT;
>  	int leaf = -1;
>  
> -	*root_level = vcpu->arch.mmu->shadow_root_level;
> +	*root_level = vcpu->arch.mmu->mmu_role.level;
>  
>  	tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
>  		leaf = iter.level;
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 7b5345a66117..5a1d552b535b 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3815,7 +3815,7 @@ static void svm_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa,
>  		hv_track_root_tdp(vcpu, root_hpa);
>  
>  		cr3 = vcpu->arch.cr3;
> -	} else if (vcpu->arch.mmu->shadow_root_level >= PT64_ROOT_4LEVEL) {
> +	} else if (vcpu->arch.mmu->mmu_role.level >= PT64_ROOT_4LEVEL) {
>  		cr3 = __sme_set(root_hpa) | kvm_get_active_pcid(vcpu);
>  	} else {
>  		/* PCID in the guest should be impossible with a 32-bit MMU. */
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 8ac5a6fa7720..5e2c865a04ff 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2965,7 +2965,7 @@ static void vmx_flush_tlb_current(struct kvm_vcpu *vcpu)
>  
>  	if (enable_ept)
>  		ept_sync_context(construct_eptp(vcpu, root_hpa,
> -						mmu->shadow_root_level));
> +						mmu->mmu_role.level));
>  	else
>  		vpid_sync_context(vmx_get_current_vpid(vcpu));
>  }
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 22/23] KVM: MMU: use cpu_role for root_level
  2022-02-04 11:57 ` [PATCH 22/23] KVM: MMU: use cpu_role for root_level Paolo Bonzini
@ 2022-02-07 23:01   ` David Matlack
  0 siblings, 0 replies; 72+ messages in thread
From: David Matlack @ 2022-02-07 23:01 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:57:17AM -0500, Paolo Bonzini wrote:
> Remove another duplicate field of struct kvm_mmu.  This time it's
> the root level for page table walking; we were already initializing
> it mostly as cpu_role.base.level, but the field still existed;
> remove it.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

nit: How about the following for the shortlog?

KVM: MMU: Replace root_level with cpu_role.base.level

Otherwise,

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/include/asm/kvm_host.h |  1 -
>  arch/x86/kvm/mmu/mmu.c          | 22 +++++++++-------------
>  arch/x86/kvm/mmu/mmu_audit.c    |  6 +++---
>  arch/x86/kvm/mmu/paging_tmpl.h  |  4 ++--
>  4 files changed, 14 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 867fc82f1de5..c86a2beee92a 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -432,7 +432,6 @@ struct kvm_mmu {
>  	gpa_t root_pgd;
>  	union kvm_mmu_role cpu_role;
>  	union kvm_mmu_page_role mmu_role;
> -	u8 root_level;
>  	bool direct_map;
>  	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];
>  
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 4d1fa87718f8..5a6541d6a424 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -2146,7 +2146,7 @@ static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterato
>  	iterator->level = vcpu->arch.mmu->mmu_role.level;
>  
>  	if (iterator->level >= PT64_ROOT_4LEVEL &&
> -	    vcpu->arch.mmu->root_level < PT64_ROOT_4LEVEL &&
> +	    vcpu->arch.mmu->cpu_role.base.level < PT64_ROOT_4LEVEL &&
>  	    !vcpu->arch.mmu->direct_map)
>  		iterator->level = PT32E_ROOT_LEVEL;
>  
> @@ -3255,7 +3255,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
>  
>  	if (free_active_root) {
>  		if (mmu->mmu_role.level >= PT64_ROOT_4LEVEL &&
> -		    (mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
> +		    (mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
>  			mmu_free_root_page(kvm, &mmu->root_hpa, &invalid_list);
>  		} else if (mmu->pae_root) {
>  			for (i = 0; i < 4; ++i) {
> @@ -3453,7 +3453,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>  	 * On SVM, reading PDPTRs might access guest memory, which might fault
>  	 * and thus might sleep.  Grab the PDPTRs before acquiring mmu_lock.
>  	 */
> -	if (mmu->root_level == PT32E_ROOT_LEVEL) {
> +	if (mmu->cpu_role.base.level == PT32E_ROOT_LEVEL) {
>  		for (i = 0; i < 4; ++i) {
>  			pdptrs[i] = mmu->get_pdptr(vcpu, i);
>  			if (!(pdptrs[i] & PT_PRESENT_MASK))
> @@ -3477,7 +3477,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>  	 * Do we shadow a long mode page table? If so we need to
>  	 * write-protect the guests page table root.
>  	 */
> -	if (mmu->root_level >= PT64_ROOT_4LEVEL) {
> +	if (mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL) {
>  		root = mmu_alloc_root(vcpu, root_gfn, 0,
>  				      mmu->mmu_role.level, false);
>  		mmu->root_hpa = root;
> @@ -3516,7 +3516,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>  	for (i = 0; i < 4; ++i) {
>  		WARN_ON_ONCE(IS_VALID_PAE_ROOT(mmu->pae_root[i]));
>  
> -		if (mmu->root_level == PT32E_ROOT_LEVEL) {
> +		if (mmu->cpu_role.base.level == PT32E_ROOT_LEVEL) {
>  			if (!(pdptrs[i] & PT_PRESENT_MASK)) {
>  				mmu->pae_root[i] = INVALID_PAE_ROOT;
>  				continue;
> @@ -3558,7 +3558,7 @@ static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
>  	 * equivalent level in the guest's NPT to shadow.  Allocate the tables
>  	 * on demand, as running a 32-bit L1 VMM on 64-bit KVM is very rare.
>  	 */
> -	if (mmu->direct_map || mmu->root_level >= PT64_ROOT_4LEVEL ||
> +	if (mmu->direct_map || mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL ||
>  	    mmu->mmu_role.level < PT64_ROOT_4LEVEL)
>  		return 0;
>  
> @@ -3655,7 +3655,7 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
>  
>  	vcpu_clear_mmio_info(vcpu, MMIO_GVA_ANY);
>  
> -	if (vcpu->arch.mmu->root_level >= PT64_ROOT_4LEVEL) {
> +	if (vcpu->arch.mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL) {
>  		hpa_t root = vcpu->arch.mmu->root_hpa;
>  		sp = to_shadow_page(root);
>  
> @@ -4146,7 +4146,7 @@ static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t new_pgd,
>  	 * later if necessary.
>  	 */
>  	if (mmu->mmu_role.level >= PT64_ROOT_4LEVEL &&
> -	    mmu->root_level >= PT64_ROOT_4LEVEL)
> +	    mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL)
>  		return cached_root_available(vcpu, new_pgd, new_role);
>  
>  	return false;
> @@ -4335,7 +4335,7 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
>  {
>  	__reset_rsvds_bits_mask(&context->guest_rsvd_check,
>  				vcpu->arch.reserved_gpa_bits,
> -				context->root_level, is_efer_nx(context),
> +				context->cpu_role.base.level, is_efer_nx(context),
>  				guest_can_use_gbpages(vcpu),
>  				is_cr4_pse(context),
>  				guest_cpuid_is_amd_or_hygon(vcpu));
> @@ -4739,7 +4739,6 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu, union kvm_mmu_role cpu_role)
>  	context->get_guest_pgd = get_cr3;
>  	context->get_pdptr = kvm_pdptr_read;
>  	context->inject_page_fault = kvm_inject_page_fault;
> -	context->root_level = cpu_role.base.level;
>  
>  	if (!is_cr0_pg(context))
>  		context->gva_to_gpa = nonpaging_gva_to_gpa;
> @@ -4769,7 +4768,6 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
>  		paging64_init_context(context);
>  	else
>  		paging32_init_context(context);
> -	context->root_level = cpu_role.base.level;
>  
>  	reset_guest_paging_metadata(vcpu, context);
>  }
> @@ -4854,7 +4852,6 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
>  		context->gva_to_gpa = ept_gva_to_gpa;
>  		context->sync_page = ept_sync_page;
>  		context->invlpg = ept_invlpg;
> -		context->root_level = level;
>  		context->direct_map = false;
>  		update_permission_bitmask(context, true);
>  		context->pkru_mask = 0;
> @@ -4889,7 +4886,6 @@ static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, union kvm_mmu_role new_ro
>  	g_context->get_guest_pgd     = get_cr3;
>  	g_context->get_pdptr         = kvm_pdptr_read;
>  	g_context->inject_page_fault = kvm_inject_page_fault;
> -	g_context->root_level        = new_role.base.level;
>  
>  	/*
>  	 * L2 page tables are never shadowed, so there is no need to sync
> diff --git a/arch/x86/kvm/mmu/mmu_audit.c b/arch/x86/kvm/mmu/mmu_audit.c
> index f31fdb874f1f..eb9c59fcb957 100644
> --- a/arch/x86/kvm/mmu/mmu_audit.c
> +++ b/arch/x86/kvm/mmu/mmu_audit.c
> @@ -59,11 +59,11 @@ static void mmu_spte_walk(struct kvm_vcpu *vcpu, inspect_spte_fn fn)
>  	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa))
>  		return;
>  
> -	if (vcpu->arch.mmu->root_level >= PT64_ROOT_4LEVEL) {
> +	if (vcpu->arch.mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL) {
>  		hpa_t root = vcpu->arch.mmu->root_hpa;
>  
>  		sp = to_shadow_page(root);
> -		__mmu_spte_walk(vcpu, sp, fn, vcpu->arch.mmu->root_level);
> +		__mmu_spte_walk(vcpu, sp, fn, vcpu->arch.mmu->cpu_role.base.level);
>  		return;
>  	}
>  
> @@ -119,7 +119,7 @@ static void audit_mappings(struct kvm_vcpu *vcpu, u64 *sptep, int level)
>  	hpa =  pfn << PAGE_SHIFT;
>  	if ((*sptep & PT64_BASE_ADDR_MASK) != hpa)
>  		audit_printk(vcpu->kvm, "levels %d pfn %llx hpa %llx "
> -			     "ent %llxn", vcpu->arch.mmu->root_level, pfn,
> +			     "ent %llxn", vcpu->arch.mmu->cpu_role.base.level, pfn,
>  			     hpa, *sptep);
>  }
>  
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index 847c4339e4d9..dd0b6f83171f 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -361,7 +361,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
>  
>  	trace_kvm_mmu_pagetable_walk(addr, access);
>  retry_walk:
> -	walker->level = mmu->root_level;
> +	walker->level = mmu->cpu_role.base.level;
>  	pte           = mmu->get_guest_pgd(vcpu);
>  	have_ad       = PT_HAVE_ACCESSED_DIRTY(mmu);
>  
> @@ -656,7 +656,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
>  	WARN_ON_ONCE(gw->gfn != base_gfn);
>  	direct_access = gw->pte_access;
>  
> -	top_level = vcpu->arch.mmu->root_level;
> +	top_level = vcpu->arch.mmu->cpu_role.base.level;
>  	if (top_level == PT32E_ROOT_LEVEL)
>  		top_level = PT32_ROOT_LEVEL;
>  	/*
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 23/23] KVM: MMU: replace direct_map with mmu_role.direct
  2022-02-04 11:57 ` [PATCH 23/23] KVM: MMU: replace direct_map with mmu_role.direct Paolo Bonzini
@ 2022-02-07 23:02   ` David Matlack
  0 siblings, 0 replies; 72+ messages in thread
From: David Matlack @ 2022-02-07 23:02 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:57:18AM -0500, Paolo Bonzini wrote:
> direct_map is always equal to the role's direct field:
> 
> - for shadow paging, direct_map is true if CR0.PG=0 and mmu_role.direct is
> copied from cpu_role.base.direct
> 
> - for TDP, it is always true and mmu_role.direct is also always true
> 
> - for shadow EPT, it is always false and mmu_role.direct is also always
> false
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/include/asm/kvm_host.h |  1 -
>  arch/x86/kvm/mmu/mmu.c          | 30 ++++++++++++++----------------
>  arch/x86/kvm/x86.c              | 12 ++++++------
>  4 files changed, 21 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index c86a2beee92a..647b3f6d02d0 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -432,7 +432,6 @@ struct kvm_mmu {
>  	gpa_t root_pgd;
>  	union kvm_mmu_role cpu_role;
>  	union kvm_mmu_page_role mmu_role;
> -	bool direct_map;
>  	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];
>  
>  	/*
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 5a6541d6a424..ce55fad99671 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -2045,7 +2045,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
>  					     int direct,
>  					     unsigned int access)
>  {
> -	bool direct_mmu = vcpu->arch.mmu->direct_map;
> +	bool direct_mmu = vcpu->arch.mmu->mmu_role.direct;
>  	union kvm_mmu_page_role role;
>  	struct hlist_head *sp_list;
>  	unsigned quadrant;
> @@ -2147,7 +2147,7 @@ static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterato
>  
>  	if (iterator->level >= PT64_ROOT_4LEVEL &&
>  	    vcpu->arch.mmu->cpu_role.base.level < PT64_ROOT_4LEVEL &&
> -	    !vcpu->arch.mmu->direct_map)
> +	    !vcpu->arch.mmu->mmu_role.direct)
>  		iterator->level = PT32E_ROOT_LEVEL;
>  
>  	if (iterator->level == PT32E_ROOT_LEVEL) {
> @@ -2523,7 +2523,7 @@ static int kvm_mmu_unprotect_page_virt(struct kvm_vcpu *vcpu, gva_t gva)
>  	gpa_t gpa;
>  	int r;
>  
> -	if (vcpu->arch.mmu->direct_map)
> +	if (vcpu->arch.mmu->mmu_role.direct)
>  		return 0;
>  
>  	gpa = kvm_mmu_gva_to_gpa_read(vcpu, gva, NULL);
> @@ -3255,7 +3255,8 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
>  
>  	if (free_active_root) {
>  		if (mmu->mmu_role.level >= PT64_ROOT_4LEVEL &&
> -		    (mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
> +		    (mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL ||
> +		     mmu->mmu_role.direct)) {
>  			mmu_free_root_page(kvm, &mmu->root_hpa, &invalid_list);
>  		} else if (mmu->pae_root) {
>  			for (i = 0; i < 4; ++i) {
> @@ -3558,7 +3559,8 @@ static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
>  	 * equivalent level in the guest's NPT to shadow.  Allocate the tables
>  	 * on demand, as running a 32-bit L1 VMM on 64-bit KVM is very rare.
>  	 */
> -	if (mmu->direct_map || mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL ||
> +	if (mmu->mmu_role.direct ||
> +	    mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL ||
>  	    mmu->mmu_role.level < PT64_ROOT_4LEVEL)
>  		return 0;
>  
> @@ -3647,7 +3649,7 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
>  	int i;
>  	struct kvm_mmu_page *sp;
>  
> -	if (vcpu->arch.mmu->direct_map)
> +	if (vcpu->arch.mmu->mmu_role.direct)
>  		return;
>  
>  	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa))
> @@ -3872,7 +3874,7 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>  
>  	arch.token = (vcpu->arch.apf.id++ << 12) | vcpu->vcpu_id;
>  	arch.gfn = gfn;
> -	arch.direct_map = vcpu->arch.mmu->direct_map;
> +	arch.direct_map = vcpu->arch.mmu->mmu_role.direct;
>  	arch.cr3 = vcpu->arch.mmu->get_guest_pgd(vcpu);
>  
>  	return kvm_setup_async_pf(vcpu, cr2_or_gpa,
> @@ -4090,7 +4092,6 @@ static void nonpaging_init_context(struct kvm_mmu *context)
>  	context->gva_to_gpa = nonpaging_gva_to_gpa;
>  	context->sync_page = nonpaging_sync_page;
>  	context->invlpg = NULL;
> -	context->direct_map = true;
>  }
>  
>  static inline bool is_root_usable(struct kvm_mmu_root_info *root, gpa_t pgd,
> @@ -4641,7 +4642,6 @@ static void paging64_init_context(struct kvm_mmu *context)
>  	context->gva_to_gpa = paging64_gva_to_gpa;
>  	context->sync_page = paging64_sync_page;
>  	context->invlpg = paging64_invlpg;
> -	context->direct_map = false;
>  }
>  
>  static void paging32_init_context(struct kvm_mmu *context)
> @@ -4650,7 +4650,6 @@ static void paging32_init_context(struct kvm_mmu *context)
>  	context->gva_to_gpa = paging32_gva_to_gpa;
>  	context->sync_page = paging32_sync_page;
>  	context->invlpg = paging32_invlpg;
> -	context->direct_map = false;
>  }
>  
>  static union kvm_mmu_role
> @@ -4735,7 +4734,6 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu, union kvm_mmu_role cpu_role)
>  	context->page_fault = kvm_tdp_page_fault;
>  	context->sync_page = nonpaging_sync_page;
>  	context->invlpg = NULL;
> -	context->direct_map = true;
>  	context->get_guest_pgd = get_cr3;
>  	context->get_pdptr = kvm_pdptr_read;
>  	context->inject_page_fault = kvm_inject_page_fault;
> @@ -4852,7 +4850,7 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
>  		context->gva_to_gpa = ept_gva_to_gpa;
>  		context->sync_page = ept_sync_page;
>  		context->invlpg = ept_invlpg;
> -		context->direct_map = false;
> +
>  		update_permission_bitmask(context, true);
>  		context->pkru_mask = 0;
>  		reset_rsvds_bits_mask_ept(vcpu, context, execonly, huge_page_level);
> @@ -4967,13 +4965,13 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
>  {
>  	int r;
>  
> -	r = mmu_topup_memory_caches(vcpu, !vcpu->arch.mmu->direct_map);
> +	r = mmu_topup_memory_caches(vcpu, !vcpu->arch.mmu->mmu_role.direct);
>  	if (r)
>  		goto out;
>  	r = mmu_alloc_special_roots(vcpu);
>  	if (r)
>  		goto out;
> -	if (vcpu->arch.mmu->direct_map)
> +	if (vcpu->arch.mmu->mmu_role.direct)
>  		r = mmu_alloc_direct_roots(vcpu);
>  	else
>  		r = mmu_alloc_shadow_roots(vcpu);
> @@ -5176,7 +5174,7 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
>  		       void *insn, int insn_len)
>  {
>  	int r, emulation_type = EMULTYPE_PF;
> -	bool direct = vcpu->arch.mmu->direct_map;
> +	bool direct = vcpu->arch.mmu->mmu_role.direct;
>  
>  	if (WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa)))
>  		return RET_PF_RETRY;
> @@ -5207,7 +5205,7 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
>  	 * paging in both guests. If true, we simply unprotect the page
>  	 * and resume the guest.
>  	 */
> -	if (vcpu->arch.mmu->direct_map &&
> +	if (vcpu->arch.mmu->mmu_role.direct &&
>  	    (error_code & PFERR_NESTED_GUEST_PAGE) == PFERR_NESTED_GUEST_PAGE) {
>  		kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(cr2_or_gpa));
>  		return 1;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 657aa646871e..b910fa34e57e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7978,7 +7978,7 @@ static bool reexecute_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>  	    WARN_ON_ONCE(!(emulation_type & EMULTYPE_PF)))
>  		return false;
>  
> -	if (!vcpu->arch.mmu->direct_map) {
> +	if (!vcpu->arch.mmu->mmu_role.direct) {
>  		/*
>  		 * Write permission should be allowed since only
>  		 * write access need to be emulated.
> @@ -8011,7 +8011,7 @@ static bool reexecute_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>  	kvm_release_pfn_clean(pfn);
>  
>  	/* The instructions are well-emulated on direct mmu. */
> -	if (vcpu->arch.mmu->direct_map) {
> +	if (vcpu->arch.mmu->mmu_role.direct) {
>  		unsigned int indirect_shadow_pages;
>  
>  		write_lock(&vcpu->kvm->mmu_lock);
> @@ -8079,7 +8079,7 @@ static bool retry_instruction(struct x86_emulate_ctxt *ctxt,
>  	vcpu->arch.last_retry_eip = ctxt->eip;
>  	vcpu->arch.last_retry_addr = cr2_or_gpa;
>  
> -	if (!vcpu->arch.mmu->direct_map)
> +	if (!vcpu->arch.mmu->mmu_role.direct)
>  		gpa = kvm_mmu_gva_to_gpa_write(vcpu, cr2_or_gpa, NULL);
>  
>  	kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(gpa));
> @@ -8359,7 +8359,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>  		ctxt->exception.address = cr2_or_gpa;
>  
>  		/* With shadow page tables, cr2 contains a GVA or nGPA. */
> -		if (vcpu->arch.mmu->direct_map) {
> +		if (vcpu->arch.mmu->mmu_role.direct) {
>  			ctxt->gpa_available = true;
>  			ctxt->gpa_val = cr2_or_gpa;
>  		}
> @@ -12196,7 +12196,7 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
>  {
>  	int r;
>  
> -	if ((vcpu->arch.mmu->direct_map != work->arch.direct_map) ||
> +	if ((vcpu->arch.mmu->mmu_role.direct != work->arch.direct_map) ||
>  	      work->wakeup_all)
>  		return;
>  
> @@ -12204,7 +12204,7 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
>  	if (unlikely(r))
>  		return;
>  
> -	if (!vcpu->arch.mmu->direct_map &&
> +	if (!vcpu->arch.mmu->mmu_role.direct &&
>  	      work->arch.cr3 != vcpu->arch.mmu->get_guest_pgd(vcpu))
>  		return;
>  
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 00/23] KVM: MMU: MMU role refactoring
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (22 preceding siblings ...)
  2022-02-04 11:57 ` [PATCH 23/23] KVM: MMU: replace direct_map with mmu_role.direct Paolo Bonzini
@ 2022-02-07 23:08 ` David Matlack
  2022-02-07 23:27   ` Sean Christopherson
  2022-02-09 22:31 ` Sean Christopherson
  24 siblings, 1 reply; 72+ messages in thread
From: David Matlack @ 2022-02-07 23:08 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, seanjc, vkuznets

On Fri, Feb 04, 2022 at 06:56:55AM -0500, Paolo Bonzini wrote:
> The TDP MMU has a performance regression compared to the legacy
> MMU when CR0 changes often.  This was reported for the grsecurity
> kernel, which uses CR0.WP to implement kernel W^X.  In that case,
> each change to CR0.WP unloads the MMU and causes a lot of unnecessary
> work.  When running nested, this can even cause the L1 to hardly
> make progress, as the L0 hypervisor is overwhelmed by the amount
> of MMU work that is needed.
> 
> The root cause of the issue is that the "MMU role" in KVM is a mess
> that mixes the CPU setup (CR0/CR4/EFER, SMM, guest mode, etc.)
> and the shadow page table format.  Whenever something is different
> between the MMU and the CPU, it is stored as an extra field in struct
> kvm_mmu---and for extra bonus complication, sometimes the same thing
> is stored in both the role and an extra field.
> 
> So, this is the "no functional change intended" part of the changes
> required to fix the performance regression.  It separates neatly
> the shadow page table format ("MMU role") from the guest page table
> format ("CPU role"), and removes the duplicate fields.

What do you think about calling this the guest_role instead of cpu_role?
There is a bit of a precedent for using "guest" instead of "cpu" already
for this type of concept (e.g. guest_walker), and I find it more
intuitive.

> The next
> step then is to avoid unloading the MMU as long as the MMU role
> stays the same.
> 
> Please review!
> 
> Paolo
> 
> Paolo Bonzini (23):
>   KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask
>   KVM: MMU: nested EPT cannot be used in SMM
>   KVM: MMU: remove valid from extended role
>   KVM: MMU: constify uses of struct kvm_mmu_role_regs
>   KVM: MMU: pull computation of kvm_mmu_role_regs to kvm_init_mmu
>   KVM: MMU: load new PGD once nested two-dimensional paging is
>     initialized
>   KVM: MMU: remove kvm_mmu_calc_root_page_role
>   KVM: MMU: rephrase unclear comment
>   KVM: MMU: remove "bool base_only" arguments
>   KVM: MMU: split cpu_role from mmu_role
>   KVM: MMU: do not recompute root level from kvm_mmu_role_regs
>   KVM: MMU: remove ept_ad field
>   KVM: MMU: remove kvm_calc_shadow_root_page_role_common
>   KVM: MMU: cleanup computation of MMU roles for two-dimensional paging
>   KVM: MMU: cleanup computation of MMU roles for shadow paging
>   KVM: MMU: remove extended bits from mmu_role
>   KVM: MMU: remove redundant bits from extended role
>   KVM: MMU: fetch shadow EFER.NX from MMU role
>   KVM: MMU: simplify and/or inline computation of shadow MMU roles
>   KVM: MMU: pull CPU role computation to kvm_init_mmu
>   KVM: MMU: store shadow_root_level into mmu_role
>   KVM: MMU: use cpu_role for root_level
>   KVM: MMU: replace direct_map with mmu_role.direct
> 
>  arch/x86/include/asm/kvm_host.h |  13 +-
>  arch/x86/kvm/mmu.h              |   2 +-
>  arch/x86/kvm/mmu/mmu.c          | 408 ++++++++++++--------------------
>  arch/x86/kvm/mmu/mmu_audit.c    |   6 +-
>  arch/x86/kvm/mmu/paging_tmpl.h  |  12 +-
>  arch/x86/kvm/mmu/tdp_mmu.c      |   4 +-
>  arch/x86/kvm/svm/svm.c          |   2 +-
>  arch/x86/kvm/vmx/vmx.c          |   2 +-
>  arch/x86/kvm/x86.c              |  12 +-
>  10 files changed, 178 insertions(+), 284 deletions(-)
> 
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 00/23] KVM: MMU: MMU role refactoring
  2022-02-07 23:08 ` [PATCH 00/23] KVM: MMU: MMU role refactoring David Matlack
@ 2022-02-07 23:27   ` Sean Christopherson
  2022-02-07 23:53     ` David Matlack
  0 siblings, 1 reply; 72+ messages in thread
From: Sean Christopherson @ 2022-02-07 23:27 UTC (permalink / raw)
  To: David Matlack; +Cc: Paolo Bonzini, linux-kernel, kvm, vkuznets

On Mon, Feb 07, 2022, David Matlack wrote:
> On Fri, Feb 04, 2022 at 06:56:55AM -0500, Paolo Bonzini wrote:
> > The TDP MMU has a performance regression compared to the legacy
> > MMU when CR0 changes often.  This was reported for the grsecurity
> > kernel, which uses CR0.WP to implement kernel W^X.  In that case,
> > each change to CR0.WP unloads the MMU and causes a lot of unnecessary
> > work.  When running nested, this can even cause the L1 to hardly
> > make progress, as the L0 hypervisor is overwhelmed by the amount
> > of MMU work that is needed.
> > 
> > The root cause of the issue is that the "MMU role" in KVM is a mess
> > that mixes the CPU setup (CR0/CR4/EFER, SMM, guest mode, etc.)
> > and the shadow page table format.  Whenever something is different
> > between the MMU and the CPU, it is stored as an extra field in struct
> > kvm_mmu---and for extra bonus complication, sometimes the same thing
> > is stored in both the role and an extra field.
> > 
> > So, this is the "no functional change intended" part of the changes
> > required to fix the performance regression.  It separates neatly
> > the shadow page table format ("MMU role") from the guest page table
> > format ("CPU role"), and removes the duplicate fields.
> 
> What do you think about calling this the guest_role instead of cpu_role?
> There is a bit of a precedent for using "guest" instead of "cpu" already
> for this type of concept (e.g. guest_walker), and I find it more
> intuitive.

Haven't looked at the series yet, but I'd prefer not to use guest_role, it's
too similar to is_guest_mode() and kvm_mmu_role.guest_mode.  E.g. we'd end up with

  static union kvm_mmu_role kvm_calc_guest_role(struct kvm_vcpu *vcpu,
  					      const struct kvm_mmu_role_regs *regs)
  {
	union kvm_mmu_role role = {0};

	role.base.access = ACC_ALL;
	role.base.smm = is_smm(vcpu);
	role.base.guest_mode = is_guest_mode(vcpu);
	role.base.direct = !____is_cr0_pg(regs);

	...
  }

and possibly

	if (guest_role.guest_mode)
		...

which would be quite messy.  Maybe vcpu_role if cpu_role isn't intuitive?

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 00/23] KVM: MMU: MMU role refactoring
  2022-02-07 23:27   ` Sean Christopherson
@ 2022-02-07 23:53     ` David Matlack
  2022-02-10  1:11       ` Sean Christopherson
  0 siblings, 1 reply; 72+ messages in thread
From: David Matlack @ 2022-02-07 23:53 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, LKML, kvm list, Vitaly Kuznetsov

On Mon, Feb 7, 2022 at 3:27 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, Feb 07, 2022, David Matlack wrote:
> > On Fri, Feb 04, 2022 at 06:56:55AM -0500, Paolo Bonzini wrote:
> > > The TDP MMU has a performance regression compared to the legacy
> > > MMU when CR0 changes often.  This was reported for the grsecurity
> > > kernel, which uses CR0.WP to implement kernel W^X.  In that case,
> > > each change to CR0.WP unloads the MMU and causes a lot of unnecessary
> > > work.  When running nested, this can even cause the L1 to hardly
> > > make progress, as the L0 hypervisor is overwhelmed by the amount
> > > of MMU work that is needed.
> > >
> > > The root cause of the issue is that the "MMU role" in KVM is a mess
> > > that mixes the CPU setup (CR0/CR4/EFER, SMM, guest mode, etc.)
> > > and the shadow page table format.  Whenever something is different
> > > between the MMU and the CPU, it is stored as an extra field in struct
> > > kvm_mmu---and for extra bonus complication, sometimes the same thing
> > > is stored in both the role and an extra field.
> > >
> > > So, this is the "no functional change intended" part of the changes
> > > required to fix the performance regression.  It separates neatly
> > > the shadow page table format ("MMU role") from the guest page table
> > > format ("CPU role"), and removes the duplicate fields.
> >
> > What do you think about calling this the guest_role instead of cpu_role?
> > There is a bit of a precedent for using "guest" instead of "cpu" already
> > for this type of concept (e.g. guest_walker), and I find it more
> > intuitive.
>
> Haven't looked at the series yet, but I'd prefer not to use guest_role, it's
> too similar to is_guest_mode() and kvm_mmu_role.guest_mode.  E.g. we'd end up with
>
>   static union kvm_mmu_role kvm_calc_guest_role(struct kvm_vcpu *vcpu,
>                                               const struct kvm_mmu_role_regs *regs)
>   {
>         union kvm_mmu_role role = {0};
>
>         role.base.access = ACC_ALL;
>         role.base.smm = is_smm(vcpu);
>         role.base.guest_mode = is_guest_mode(vcpu);
>         role.base.direct = !____is_cr0_pg(regs);
>
>         ...
>   }
>
> and possibly
>
>         if (guest_role.guest_mode)
>                 ...
>
> which would be quite messy.  Maybe vcpu_role if cpu_role isn't intuitive?

I agree it's a little odd. But actually it's somewhat intuitive (the
guest is in guest-mode, i.e. we're running a nested guest).

Ok I'm stretching a little bit :). But if the trade-off is just that
"guest_role.guest_mode" requires a clarifying comment while the rest of
the code gets more readable (cpu_role is used a lot more than
role.guest_mode), it still might be worth it.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 06/23] KVM: MMU: load new PGD once nested two-dimensional paging is initialized
  2022-02-04 19:18   ` David Matlack
  2022-02-07 13:50     ` Paolo Bonzini
@ 2022-02-09 12:34     ` Paolo Bonzini
  1 sibling, 0 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-09 12:34 UTC (permalink / raw)
  To: David Matlack; +Cc: linux-kernel, kvm, seanjc, vkuznets

On 2/4/22 20:18, David Matlack wrote:
> On Fri, Feb 04, 2022 at 06:57:01AM -0500, Paolo Bonzini wrote:
>> __kvm_mmu_new_pgd looks at the MMU's root_level and shadow_root_level
>> via fast_pgd_switch.
> 
> Those checks are just for performance correct (to skip iterating through
> the list of roots)?
> 
> Either way, it's probably worth including a Fixes tag below.

It's actually a much bigger mess---though it's working as intended, 
except that in some cases the caching is suboptimal.

Basically, you have to call __kvm_mmu_new_pgd *before* kvm_init_mmu 
because of how fast_pgd_switch takes care of stashing the current root 
in the cache.   A PAE root is never placed in the cache exactly because 
mmu->root_level and mmu->shadow_root_level hold the value for the 
_current_ root.  In that case, fast_pgd_switch does not look for a 
cached PGD (even if according to the role it could be there).

__kvm_mmu_new_pgd then frees the current root using kvm_mmu_free_roots. 
  kvm_mmu_free_roots *also* uses mmu->root_level and 
mmu->shadow_root_level to distinguish whether the page table uses a 
single root or 4 PAE roots.  Because kvm_init_mmu can overwrite 
mmu->root_level, kvm_mmu_free_roots must also be called before kvm_init_mmu.
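
In other words the required ordering is roughly (sketch, not actual code):

    /* Reads mmu->root_level/shadow_root_level and frees the old roots. */
    __kvm_mmu_new_pgd(vcpu, new_pgd, new_role);
    /* Only now may the role fields be overwritten. */
    kvm_init_mmu(vcpu);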

I do wonder if the use of mmu->shadow_root_level was intentional (it 
certainly escaped me when first reviewing fast PGD switching), but 
fortunately none of this is necessary.  PAE roots can be identified from 
!to_shadow_page(root_hpa), so the better fix is to do that:

- in fast_pgd_switch/cached_root_available, you need to account for two 
possible transformations of the cache: either the old entry becomes the 
MRU of the prev_roots as in the current code, or the old entry cannot be 
cached.  This is 20-30 more lines of code.

- in kvm_mmu_free_roots, just use to_shadow_page instead of 
mmu->root_level and mmu->shadow_root_level.
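
For the kvm_mmu_free_roots part, a rough sketch of what I mean (untested;
the pae_root page is a plain allocated page with no struct kvm_mmu_page
behind it, so to_shadow_page() returns NULL for it):

    if (free_active_root) {
        if (to_shadow_page(mmu->root_hpa)) {
            /* One root page covers the whole table (64-bit or TDP). */
            mmu_free_root_page(kvm, &mmu->root_hpa, &invalid_list);
        } else if (mmu->pae_root) {
            /* Special PAE root: free the four sub-roots as today. */
            for (i = 0; i < 4; ++i) {
                ...
            }
        }
    }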

Once this is in place, the original bug is fixed simply by calling 
kvm_mmu_new_pgd after kvm_init_mmu.  kvm_mmu_reset_context is not doing 
now to avoid having to figure out the new role, but it can do that 
easily after the above change.

I'll keep this cpu_role refactoring around, but strictly speaking it 
becomes a separate change than the optimization.

Paolo

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 00/23] KVM: MMU: MMU role refactoring
  2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
                   ` (23 preceding siblings ...)
  2022-02-07 23:08 ` [PATCH 00/23] KVM: MMU: MMU role refactoring David Matlack
@ 2022-02-09 22:31 ` Sean Christopherson
  2022-02-10  9:54   ` Paolo Bonzini
  2022-02-14 18:14   ` David Matlack
  24 siblings, 2 replies; 72+ messages in thread
From: Sean Christopherson @ 2022-02-09 22:31 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, dmatlack, vkuznets

On Fri, Feb 04, 2022, Paolo Bonzini wrote:
> Paolo Bonzini (23):
>   KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask
>   KVM: MMU: nested EPT cannot be used in SMM
>   KVM: MMU: remove valid from extended role
>   KVM: MMU: constify uses of struct kvm_mmu_role_regs
>   KVM: MMU: pull computation of kvm_mmu_role_regs to kvm_init_mmu
>   KVM: MMU: load new PGD once nested two-dimensional paging is
>     initialized
>   KVM: MMU: remove kvm_mmu_calc_root_page_role
>   KVM: MMU: rephrase unclear comment
>   KVM: MMU: remove "bool base_only" arguments
>   KVM: MMU: split cpu_role from mmu_role
>   KVM: MMU: do not recompute root level from kvm_mmu_role_regs
>   KVM: MMU: remove ept_ad field
>   KVM: MMU: remove kvm_calc_shadow_root_page_role_common
>   KVM: MMU: cleanup computation of MMU roles for two-dimensional paging
>   KVM: MMU: cleanup computation of MMU roles for shadow paging
>   KVM: MMU: remove extended bits from mmu_role
>   KVM: MMU: remove redundant bits from extended role
>   KVM: MMU: fetch shadow EFER.NX from MMU role
>   KVM: MMU: simplify and/or inline computation of shadow MMU roles
>   KVM: MMU: pull CPU role computation to kvm_init_mmu
>   KVM: MMU: store shadow_root_level into mmu_role
>   KVM: MMU: use cpu_role for root_level
>   KVM: MMU: replace direct_map with mmu_role.direct

Heresy!  Everyone knows the one true way is "KVM: x86/mmu:"

  $ glo | grep "KVM: MMU:" | wc -l
  740
  $ glo | grep "KVM: x86/mmu:" | wc -l
  403

Dammit, I'm the heathen...

I do think we should use x86/mmu though.  VMX and SVM (and nVMX and nSVM) are ok
because they're unlikely to collide with other architectures, but every arch has
an MMU...

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 02/23] KVM: MMU: nested EPT cannot be used in SMM
  2022-02-04 18:16   ` David Matlack
@ 2022-02-09 22:43     ` Sean Christopherson
  0 siblings, 0 replies; 72+ messages in thread
From: Sean Christopherson @ 2022-02-09 22:43 UTC (permalink / raw)
  To: David Matlack; +Cc: Paolo Bonzini, linux-kernel, kvm, vkuznets

On Fri, Feb 04, 2022, David Matlack wrote:
> On Fri, Feb 04, 2022 at 06:56:57AM -0500, Paolo Bonzini wrote:
> > The role.base.smm flag is always zero, do not bother copying it over
> > from vcpu->arch.root_mmu.
> > 
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> Reviewed-by: David Matlack <dmatlack@google.com>
> 
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 3 ---
> >  1 file changed, 3 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 9424ae90f1ef..b0065ae3cea8 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -4881,9 +4881,6 @@ kvm_calc_shadow_ept_root_page_role(struct kvm_vcpu *vcpu, bool accessed_dirty,
> >  {
> >  	union kvm_mmu_role role = {0};
> >  
> > -	/* SMM flag is inherited from root_mmu */
> > -	role.base.smm = vcpu->arch.root_mmu.mmu_role.base.smm;
> 
> nit: Retaining a comment here and/or warning here would be useful.
> 
> 	/* EPT is not allowed in SMM */

EPT _is_ allowed in SMM, KVM just doesn't support it.  Specifically, KVM doesn't
emulate Parallel SMM, a.k.a. Dual-Monitor Treatment of SMIs.  Probably worth
calling that out in the changelog.  If there's a WARN, then we don't really need
a comment as blame will get someone to the "why" if they're really curious, and
most people probably would only be confused about parallel SMM comments.

> 	WARN_ON_ONCE(vcpu->arch.root_mmu.mmu_role.base.smm);

+1 to a WARN, if only to provide a paper trail for git blame.  Finding when
something is purely deleted is painful.

> 
> (Although I imagine it would just get removed later in the series.)
> 
> > -
> >  	role.base.level = level;
> >  	role.base.has_4_byte_gpte = false;
> >  	role.base.direct = false;
> > -- 
> > 2.31.1
> > 
> > 

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 03/23] KVM: MMU: remove valid from extended role
  2022-02-04 11:56 ` [PATCH 03/23] KVM: MMU: remove valid from extended role Paolo Bonzini
  2022-02-04 18:32   ` David Matlack
@ 2022-02-09 22:54   ` Sean Christopherson
  2022-02-10  9:53     ` Paolo Bonzini
  1 sibling, 1 reply; 72+ messages in thread
From: Sean Christopherson @ 2022-02-09 22:54 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, dmatlack, vkuznets

On Fri, Feb 04, 2022, Paolo Bonzini wrote:
> The level field of the MMU role can act as a marker for validity
> instead: it is guaranteed to be nonzero so a zero value means the role
> is invalid and the MMU properties will be computed again.

Nope, it's not guaranteed to be non-zero:

static int role_regs_to_root_level(struct kvm_mmu_role_regs *regs)
{
	if (!____is_cr0_pg(regs))
		return 0; <=============================================
	else if (____is_efer_lma(regs))
		return ____is_cr4_la57(regs) ? PT64_ROOT_5LEVEL :
					       PT64_ROOT_4LEVEL;
	else if (____is_cr4_pae(regs))
		return PT32E_ROOT_LEVEL;
	else
		return PT32_ROOT_LEVEL;
}


static union kvm_mmu_role
kvm_calc_nested_mmu_role(struct kvm_vcpu *vcpu, struct kvm_mmu_role_regs *regs)
{
	union kvm_mmu_role role;

	role = kvm_calc_shadow_root_page_role_common(vcpu, regs, false);

	/*
	 * Nested MMUs are used only for walking L2's gva->gpa, they never have
	 * shadow pages of their own and so "direct" has no meaning.   Set it
	 * to "true" to try to detect bogus usage of the nested MMU.
	 */
	role.base.direct = true;
	role.base.level = role_regs_to_root_level(regs);
	return role;
}


static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu)
{
	struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
	union kvm_mmu_role new_role = kvm_calc_nested_mmu_role(vcpu, &regs);
	struct kvm_mmu *g_context = &vcpu->arch.nested_mmu;

	if (new_role.as_u64 == g_context->mmu_role.as_u64) <== theoretically can get a false positive
		return;

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 04/23] KVM: MMU: constify uses of struct kvm_mmu_role_regs
  2022-02-04 11:56 ` [PATCH 04/23] KVM: MMU: constify uses of struct kvm_mmu_role_regs Paolo Bonzini
  2022-02-04 18:41   ` David Matlack
@ 2022-02-09 22:57   ` Sean Christopherson
  1 sibling, 0 replies; 72+ messages in thread
From: Sean Christopherson @ 2022-02-09 22:57 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, dmatlack, vkuznets

On Fri, Feb 04, 2022, Paolo Bonzini wrote:
> struct kvm_mmu_role_regs is computed just once and then accessed.  Use
> const to enforce this.

It's not new enforcement, just syntactic sugar (though it's tasty sugar).  All fields
in struct kvm_mmu_role_regs are const specifically to prevent such a struct from
being changed regardless of how a pointer was obtained.

struct kvm_mmu_role_regs {
	const unsigned long cr0;
	const unsigned long cr4;
	const u64 efer;
};
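
E.g. even through a plain, non-const pointer this already fails to build
(hypothetical function, purely to illustrate the point):

	static void clobber_regs(struct kvm_mmu_role_regs *regs)
	{
		/* error: assignment of read-only member 'cr0' */
		regs->cr0 |= X86_CR0_WP;
	}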

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 01/23] KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask
  2022-02-04 17:59   ` David Matlack
  2022-02-05 14:52     ` Paolo Bonzini
@ 2022-02-10  0:30     ` Sean Christopherson
  2022-02-10 16:38       ` Paolo Bonzini
  1 sibling, 1 reply; 72+ messages in thread
From: Sean Christopherson @ 2022-02-10  0:30 UTC (permalink / raw)
  To: David Matlack; +Cc: Paolo Bonzini, linux-kernel, kvm, vkuznets

On Fri, Feb 04, 2022, David Matlack wrote:
> On Fri, Feb 04, 2022 at 06:56:56AM -0500, Paolo Bonzini wrote:
> > reset_shadow_zero_bits_mask has a very unintuitive way of deciding
> > whether the shadow pages will use the NX bit.  The function is used in
> > two cases, shadow paging and shadow NPT; shadow paging has a use for
> > EFER.NX and needs to force it enabled, while shadow NPT only needs it
> > depending on L1's setting.
> > 
> > The actual root problem here is that is_efer_nx, despite being part
> > of the "base" role, only matches the format of the shadow pages in the
> > NPT case.  For now, just remove the ugly variable initialization and move
> > the call to reset_shadow_zero_bits_mask out of shadow_mmu_init_context.
> > The parameter can then be removed after the root problem in the role
> > is fixed.
> > 
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> Reviewed-by: David Matlack <dmatlack@google.com>
> 
> (I agree this commit makes no functional change.)

There may not be a functional change, but it drops an optimization and contributes
to making future code/patches more fragile due to making it harder to understand
the relationship between shadow_mmu_init_context() and __kvm_mmu_new_pgd().

> > @@ -4829,8 +4820,6 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
> >  
> >  	reset_guest_paging_metadata(vcpu, context);
> >  	context->shadow_root_level = new_role.base.level;
> > -
> > -	reset_shadow_zero_bits_mask(vcpu, context);

This is guarded by:

	if (new_role.as_u64 == context->mmu_role.as_u64)
		return;

> >  }
> >  
> >  static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
> > @@ -4841,6 +4830,16 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
> >  		kvm_calc_shadow_mmu_root_page_role(vcpu, regs, false);
> >  
> >  	shadow_mmu_init_context(vcpu, context, regs, new_role);
> > +
> > +	/*
> > +	 * KVM uses NX when TDP is disabled to handle a variety of scenarios,
> > +	 * notably for huge SPTEs if iTLB multi-hit mitigation is enabled and
> > +	 * to generate correct permissions for CR0.WP=0/CR4.SMEP=1/EFER.NX=0.
> > +	 * The iTLB multi-hit workaround can be toggled at any time, so assume
> > +	 * NX can be used by any non-nested shadow MMU to avoid having to reset
> > +	 * MMU contexts.  Note, KVM forces EFER.NX=1 when TDP is disabled.
> > +	 */
> > +	reset_shadow_zero_bits_mask(vcpu, context, true);

Whereas this will compute the mask even if the role doesn't change.  I say that
matters later on because then this sequence:

	shadow_mmu_init_context(vcpu, context, &regs, new_role);
	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
	__kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);

becomes even more difficult to untangle.

And looking at where this series ends up, I don't understand the purpose of this
change.  Patch 18 essentially reverts this patch, and I see nothing in between
that will break without the temporary change.   That patch becomes:

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 02e6d256805d..f9c96de1189d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4408,7 +4408,7 @@ static void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
         * NX can be used by any non-nested shadow MMU to avoid having to reset
         * MMU contexts.  Note, KVM forces EFER.NX=1 when TDP is disabled.
         */
-       bool uses_nx = is_efer_nx(context) || !tdp_enabled;
+       bool uses_nx = context->mmu_role.efer_nx;

        /* @amd adds a check on bit of SPTEs, which KVM shouldn't use anyways. */
        bool is_amd = true;

though it needs to update the comment as well.

> >  }
> >  
> >  static union kvm_mmu_role
> > @@ -4872,6 +4871,7 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
> >  	__kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
> >  
> >  	shadow_mmu_init_context(vcpu, context, &regs, new_role);
> > +	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
> 
> Out of curiousity, how does KVM mitigate iTLB multi-hit when shadowing
> NPT and the guest has not enabled EFER.NX?
> 
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_init_shadow_npt_mmu);
> >  
> > -- 
> > 2.31.1
> > 
> > 

^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH 07/23] KVM: MMU: remove kvm_mmu_calc_root_page_role
  2022-02-04 11:57 ` [PATCH 07/23] KVM: MMU: remove kvm_mmu_calc_root_page_role Paolo Bonzini
  2022-02-04 19:32   ` David Matlack
@ 2022-02-10  0:47   ` Sean Christopherson
  2022-02-10  9:52     ` Paolo Bonzini
  1 sibling, 1 reply; 72+ messages in thread
From: Sean Christopherson @ 2022-02-10  0:47 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, dmatlack, vkuznets

On Fri, Feb 04, 2022, Paolo Bonzini wrote:
> Since the guest PGD is now loaded after the MMU has been set up
> completely, the desired role for a cache hit is simply the current
> mmu_role.  There is no need to compute it again, so __kvm_mmu_new_pgd
> can be folded in kvm_mmu_new_pgd.
> 
> As an aside, the !tdp_enabled case in the function was dead code,
> and that also gets mopped up as a side effect.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---

...

> @@ -4871,7 +4864,7 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
>  
>  	shadow_mmu_init_context(vcpu, context, &regs, new_role);
>  	reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
> -	__kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
> +	kvm_mmu_new_pgd(vcpu, nested_cr3);

I'm not a fan of this change.  Conceptually it makes perfect sense, but I really,
really don't like the hidden dependency between shadow_mmu_init_context() and
kvm_mmu_new_pgd().

It's less bad once patch 01 is jettisoned, but I still don't love.  Yes, it's
silly to call __kvm_mmu_new_pgd() before the role is set, but it is robust.  And
either way we end up with an incoherent state, e.g. with this change, the role of
the shadow page pointed at by mmu->root_hpa doesn't match the role of the mmu itself.

Furthermore, the next patch to take advantage of this change is subtly broken/wrong.
It drops kvm_mmu_calc_root_page_role() from kvm_mmu_new_pgd() under the guise that
kvm_mmu_new_pgd() is called with an up-to-date mmu_role, but that is incorrect for
both nested VM-Enter with TDP disabled in the host and nested VM-Exit (regardless of
TDP enabling).

Both nested_vmx_load_cr3() and nested_svm_load_cr3() load a new CR3 before updating
the mmu.  Why, I have no idea.  Probably a historical wart.

The nested mess is likely easily solved, I don't see any obvious issue with swapping
the order.  But I still don't love the subtlety.  I do like shaving cycles, just
not the subtlety...

If we do rework things to have kvm_mmu_new_pgd() pull the role from the mmu, then
we should first add a long overdue audit/warn that KVM never runs with a mmu_role
that isn't consistent with respect to its root SP's role.  E.g. finally get rid of
mmu_audit.c, add a proper CONFIG_KVM_AUDIT_MMU (which I'll happily turn on for all
my kernels), and start adding WARNs and other assertions.
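
E.g. a rough sketch of the kind of config-gated assertion macro that the
mmu_role vs. root SP consistency checks could use; CONFIG_KVM_AUDIT_MMU
doesn't exist yet, so all of the names below are placeholders:

	#ifdef CONFIG_KVM_AUDIT_MMU
	#define KVM_MMU_AUDIT_WARN_ON_ONCE(x)	WARN_ON_ONCE(x)
	#else
	#define KVM_MMU_AUDIT_WARN_ON_ONCE(x)	do { } while (0)
	#endif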

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 00/23] KVM: MMU: MMU role refactoring
  2022-02-07 23:53     ` David Matlack
@ 2022-02-10  1:11       ` Sean Christopherson
  2022-02-10 11:58         ` Paolo Bonzini
  0 siblings, 1 reply; 72+ messages in thread
From: Sean Christopherson @ 2022-02-10  1:11 UTC (permalink / raw)
  To: David Matlack; +Cc: Paolo Bonzini, LKML, kvm list, Vitaly Kuznetsov

On Mon, Feb 07, 2022, David Matlack wrote:
> On Mon, Feb 7, 2022 at 3:27 PM Sean Christopherson <seanjc@google.com> wrote:
> > > What do you think about calling this the guest_role instead of cpu_role?
> > > There is a bit of a precedent for using "guest" instead of "cpu" already
> > > for this type of concept (e.g. guest_walker), and I find it more
> > > intuitive.
> >
> > Haven't looked at the series yet, but I'd prefer not to use guest_role, it's
> > too similar to is_guest_mode() and kvm_mmu_role.guest_mode.  E.g. we'd end up with
> >
> >   static union kvm_mmu_role kvm_calc_guest_role(struct kvm_vcpu *vcpu,
> >                                               const struct kvm_mmu_role_regs *regs)
> >   {
> >         union kvm_mmu_role role = {0};
> >
> >         role.base.access = ACC_ALL;
> >         role.base.smm = is_smm(vcpu);
> >         role.base.guest_mode = is_guest_mode(vcpu);
> >         role.base.direct = !____is_cr0_pg(regs);
> >
> >         ...
> >   }
> >
> > and possibly
> >
> >         if (guest_role.guest_mode)
> >                 ...
> >
> > which would be quite messy.  Maybe vcpu_role if cpu_role isn't intuitive?
> 
> I agree it's a little odd. But actually it's somewhat intuitive (the
> guest is in guest-mode, i.e. we're running a nested guest).
> 
> Ok I'm stretching a little bit :). But if the trade-off is just
> "guest_role.guest_mode" requires a clarifying comment, but the rest of
> the code gets more readable (cpu_role is used a lot more than
> role.guest_mode), it still might be worth it.

It's not just guest_mode, we also have guest_mmu, e.g. we'd end up with

	vcpu->arch.root_mmu.guest_role.base.level
	vcpu->arch.guest_mmu.guest_role.base.level
	vcpu->arch.nested_mmu.guest_role.base.level

In a vacuum, I 100% agree that guest_role is better than cpu_role or vcpu_role,
but the term "guest" has already been claimed for "L2" in far too many places.

While we're behind the bikeshed... the resulting:

	union kvm_mmu_role cpu_role;
	union kvm_mmu_page_role mmu_role;

is a mess.  Again, I really like "mmu_role" in a vacuum, but juxtaposed with
	
	union kvm_mmu_role cpu_role;

it's super confusing, e.g. I expected

	union kvm_mmu_role mmu_role;

Nested EPT is a good example of complete confusion, because we compute kvm_mmu_role,
compare it to cpu_role, then shove it into both cpu_role and mmu_role.  It makes
sense once you reason about what it's doing, but on the surface it's confusing.

	struct kvm_mmu *context = &vcpu->arch.guest_mmu;
	u8 level = vmx_eptp_page_walk_level(new_eptp);
	union kvm_mmu_role new_role =
		kvm_calc_shadow_ept_root_page_role(vcpu, accessed_dirty,
						   execonly, level);

	if (new_role.as_u64 != context->cpu_role.as_u64) {
		/* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. */
		context->cpu_role.as_u64 = new_role.as_u64;
		context->mmu_role.word = new_role.base.word;

Maybe this?

	union kvm_mmu_vcpu_role vcpu_role;
	union kvm_mmu_page_role mmu_role;

and some sample usage?

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d25f8cb2e62b..9f9b97c88738 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4836,13 +4836,16 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 {
        struct kvm_mmu *context = &vcpu->arch.guest_mmu;
        u8 level = vmx_eptp_page_walk_level(new_eptp);
-       union kvm_mmu_role new_role =
+       union kvm_mmu_vcpu_role new_role =
                kvm_calc_shadow_ept_root_page_role(vcpu, accessed_dirty,
                                                   execonly, level);

-       if (new_role.as_u64 != context->cpu_role.as_u64) {
-               /* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. */
-               context->cpu_role.as_u64 = new_role.as_u64;
+       if (new_role.as_u64 != context->vcpu_role.as_u64) {
+               /*
+                * EPT, and thus nested EPT, does not consume CR0, CR4, nor
+                * EFER, so the mmu_role is a strict subset of the vcpu_role.
+               */
+               context->vcpu_role.as_u64 = new_role.as_u64;
                context->mmu_role.word = new_role.base.word;

                context->page_fault = ept_page_fault;



And while I'm on a soapbox....  am I the only one that absolutely detests the use
of "context" and "g_context"?  I'd be all in favor of renaming those to "mmu"
throughout the code as a prep to this series.

I also think we should move the initializing of guest_mmu => mmu into the MMU
helpers.  Pulling the mmu from guest_mmu but then relying on the caller to wire
up guest_mmu => mmu so that e.g. kvm_mmu_new_pgd() works is gross and confused
the heck out of me.  E.g.

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d25f8cb2e62b..4e7fe9758ce8 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4794,7 +4794,7 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
 void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
                             unsigned long cr4, u64 efer, gpa_t nested_cr3)
 {
-       struct kvm_mmu *context = &vcpu->arch.guest_mmu;
+       struct kvm_mmu *mmu = &vcpu->arch.guest_mmu;
        struct kvm_mmu_role_regs regs = {
                .cr0 = cr0,
                .cr4 = cr4 & ~X86_CR4_PKE,
@@ -4806,6 +4806,8 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
        mmu_role = cpu_role.base;
        mmu_role.level = kvm_mmu_get_tdp_level(vcpu);

+       vcpu->arch.mmu = &vcpu->arch.guest_mmu;
+
        shadow_mmu_init_context(vcpu, context, cpu_role, mmu_role);
        kvm_mmu_new_pgd(vcpu, nested_cr3);
 }
@@ -4834,12 +4836,14 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
                             int huge_page_level, bool accessed_dirty,
                             gpa_t new_eptp)
 {
-       struct kvm_mmu *context = &vcpu->arch.guest_mmu;
+       struct kvm_mmu *mmu = &vcpu->arch.guest_mmu;
        u8 level = vmx_eptp_page_walk_level(new_eptp);
        union kvm_mmu_role new_role =
                kvm_calc_shadow_ept_root_page_role(vcpu, accessed_dirty,
                                                   execonly, level);

+       vcpu->arch.mmu = mmu;
+
        if (new_role.as_u64 != context->cpu_role.as_u64) {
                /* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. */
                context->cpu_role.as_u64 = new_role.as_u64;
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 1218b5a342fc..d0f8eddb32be 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -98,8 +98,6 @@ static void nested_svm_init_mmu_context(struct kvm_vcpu *vcpu)

        WARN_ON(mmu_is_nested(vcpu));

-       vcpu->arch.mmu = &vcpu->arch.guest_mmu;
-
        /*
         * The NPT format depends on L1's CR4 and EFER, which is in vmcb01.  Note,
         * when called via KVM_SET_NESTED_STATE, that state may _not_ match current




^ permalink raw reply related	[flat|nested] 72+ messages in thread

* Re: [PATCH 07/23] KVM: MMU: remove kvm_mmu_calc_root_page_role
  2022-02-10  0:47   ` Sean Christopherson
@ 2022-02-10  9:52     ` Paolo Bonzini
  2022-02-10 17:29       ` Sean Christopherson
  0 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-10  9:52 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: linux-kernel, kvm, dmatlack, vkuznets

On 2/10/22 01:47, Sean Christopherson wrote:
> The nested mess is likely easily solved, I don't see any obvious issue with swapping
> the order.  But I still don't love the subtlety.  I do like shaving cycles, just
> not the subtlety...

Not so easily, but it's doable and it's essentially what I did in the 
other series (the one that reworks the root cache).

Quick spoiler: there's a complicated dependency between the _old_ values 
in kvm_mmu and the root cache, so that the root cache code currently 
needs both the old MMU state (especially shadow_root_level/root_level) 
and the new role.

kvm_mmu_reset_context does the expensive kvm_mmu_unload to cop out of 
having to know in advance the new role; the crux of the other series is 
to remove that need, so that kvm_mmu_reset_context does not have to cop 
out anymore.
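
(For reference, today kvm_mmu_reset_context() is essentially just

	void kvm_mmu_reset_context(struct kvm_vcpu *vcpu)
	{
		kvm_mmu_unload(vcpu);
		kvm_init_mmu(vcpu);
	}

so every role change pays for a full teardown and reload of the roots.)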

> If we do rework things to have kvm_mmu_new_pgd() pull the role from the mmu, then
> we should first add a long overdue audit/warn that KVM never runs with a mmu_role
> that isn't consistent with respect to its root SP's role.

There's a much cheaper check that can be done to enforce the invariant 
that kvm_mmu_new_pgd must follow kvm_init_mmu: kvm_init_mmu sets a 
not_ready flag, kvm_mmu_new_pgd clears it, and kvm_mmu_reload screams if 
it sees not_ready == 1.
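
Rough sketch, with a made-up field name and everything else elided:

	/* at the end of kvm_init_mmu() */
	vcpu->arch.mmu->pgd_not_ready = true;

	/* in kvm_mmu_new_pgd() */
	vcpu->arch.mmu->pgd_not_ready = false;

	/* in kvm_mmu_reload() */
	WARN_ON_ONCE(vcpu->arch.mmu->pgd_not_ready);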

Paolo


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 03/23] KVM: MMU: remove valid from extended role
  2022-02-09 22:54   ` Sean Christopherson
@ 2022-02-10  9:53     ` Paolo Bonzini
  0 siblings, 0 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-10  9:53 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: linux-kernel, kvm, dmatlack, vkuznets

On 2/9/22 23:54, Sean Christopherson wrote:
> Nope, it's not guaranteed to be non-zero:
> 
> static int role_regs_to_root_level(struct kvm_mmu_role_regs *regs)
> {
> 	if (!____is_cr0_pg(regs))
> 		return 0; <=============================================
> 	else if (____is_efer_lma(regs))
> 		return ____is_cr4_la57(regs) ? PT64_ROOT_5LEVEL :
> 					       PT64_ROOT_4LEVEL;
> 	else if (____is_cr4_pae(regs))
> 		return PT32E_ROOT_LEVEL;
> 	else
> 		return PT32_ROOT_LEVEL;
> }
> 

Yes, see my reply to David.  At the end of the series the assumption is 
correct:

- level is always nonzero in mmu_role

- one of level or direct (which is !CR0.PG) is always nonzero in cpu_role

So the patch can be kept but it has to be moved much later.
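
IOW, modulo how the fields end up being spelled after the rework, something
like these (untested) assertions should hold for any computed role at that
point:

	/* the shadow page table role always has a nonzero level */
	WARN_ON_ONCE(!mmu_role.base.level);

	/* the cpu role is either direct (CR0.PG=0) or has a nonzero level */
	WARN_ON_ONCE(!cpu_role.base.level && !cpu_role.base.direct);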

Paolo


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 00/23] KVM: MMU: MMU role refactoring
  2022-02-09 22:31 ` Sean Christopherson
@ 2022-02-10  9:54   ` Paolo Bonzini
  2022-02-14 18:14   ` David Matlack
  1 sibling, 0 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-10  9:54 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: linux-kernel, kvm, dmatlack, vkuznets

On 2/9/22 23:31, Sean Christopherson wrote:
> Heresy!  Everyone knows the one true way is "KVM: x86/mmu:"
> 
>    $ glo | grep "KVM: MMU:" | wc -l
>    740
>    $ glo | grep "KVM: x86/mmu:" | wc -l
>    403
> 
> Dammit, I'm the heathen...
> 
> I do think we should use x86/mmu though.  VMX and SVM (and nVMX and nSVM) are ok
> because they're unlikely to collide with other architectures, but every arch has
> an MMU...
> 

Sure, I can adjust my habits.

Paolo


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 00/23] KVM: MMU: MMU role refactoring
  2022-02-10  1:11       ` Sean Christopherson
@ 2022-02-10 11:58         ` Paolo Bonzini
  2022-02-10 16:55           ` Sean Christopherson
  0 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-10 11:58 UTC (permalink / raw)
  To: Sean Christopherson, David Matlack; +Cc: LKML, kvm list, Vitaly Kuznetsov

On 2/10/22 02:11, Sean Christopherson wrote:
> In a vacuum, I 100% agree that guest_role is better than cpu_role or vcpu_role,
> but the term "guest" has already been claimed for "L2" in far too many places.
> 
> While we're behind the bikeshed... the resulting:
> 
> 	union kvm_mmu_role cpu_role;
> 	union kvm_mmu_page_role mmu_role;
> 
> is a mess.  Again, I really like "mmu_role" in a vacuum, but juxtaposed with
> 	
> 	union kvm_mmu_role cpu_role;
> 
> it's super confusing, e.g. I expected
> 
> 	union kvm_mmu_role mmu_role;

What about

	union kvm_mmu_page_role root_role;
	union kvm_mmu_paging_mode cpu_mode;

?  I already have to remove ".base" from all accesses to mmu_role, so 
it's not much extra churn.

Paolo


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 01/23] KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask
  2022-02-10  0:30     ` Sean Christopherson
@ 2022-02-10 16:38       ` Paolo Bonzini
  0 siblings, 0 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-10 16:38 UTC (permalink / raw)
  To: Sean Christopherson, David Matlack; +Cc: linux-kernel, kvm, vkuznets

On 2/10/22 01:30, Sean Christopherson wrote:
> There may not be a functional change, but it drops an optimization and contributes
> to making future code/patches more fragile due to making it harder to understand
> the relationship between shadow_mmu_init_context() and __kvm_mmu_new_pgd().

Fair enough, I'll drop this patch.

Regarding the relationship between shadow_mmu_init_context() and 
kvm_mmu_new_pgd(), it's a lot more subtle than it looks.  Please take a 
look first at the optimization series, since it removes that subtlety.

Paolo


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 00/23] KVM: MMU: MMU role refactoring
  2022-02-10 11:58         ` Paolo Bonzini
@ 2022-02-10 16:55           ` Sean Christopherson
  2022-02-10 17:30             ` Paolo Bonzini
  0 siblings, 1 reply; 72+ messages in thread
From: Sean Christopherson @ 2022-02-10 16:55 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: David Matlack, LKML, kvm list, Vitaly Kuznetsov

On Thu, Feb 10, 2022, Paolo Bonzini wrote:
> On 2/10/22 02:11, Sean Christopherson wrote:
> > In a vacuum, I 100% agree that guest_role is better than cpu_role or vcpu_role,
> > but the term "guest" has already been claimed for "L2" in far too many places.
> > 
> > While we're behind the bikeshed... the resulting:
> > 
> > 	union kvm_mmu_role cpu_role;
> > 	union kvm_mmu_page_role mmu_role;
> > 
> > is a mess.  Again, I really like "mmu_role" in a vacuum, but juxtaposed with
> > 	
> > 	union kvm_mmu_role cpu_role;
> > 
> > it's super confusing, e.g. I expected
> > 
> > 	union kvm_mmu_role mmu_role;
> 
> What about
> 
> 	union kvm_mmu_page_role root_role;
> 	union kvm_mmu_paging_mode cpu_mode;
> 
> ?  I already have to remove ".base" from all accesses to mmu_role, so it's
> not much extra churn.

I'd prefer to not use "paging mode", the SDM uses that terminology to refer to
the four paging modes.  My expectation given the name is that the union would
track only CR0.PG, EFER.LME, CR4.PAE, and CR4.PSE[*].

I'm out of ideas at the moment, I'll keep chewing on this while reviewing...

[*] Someone at Intel rewrote the SDM and eliminated Mode B, a.k.a. PSE 36-bit
physical paging; it's now just part of "32-bit paging".  But 5-level paging is
considered its own paging mode?!?!  Lame.  I guess they really want to have
exactly four paging modes...

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 07/23] KVM: MMU: remove kvm_mmu_calc_root_page_role
  2022-02-10  9:52     ` Paolo Bonzini
@ 2022-02-10 17:29       ` Sean Christopherson
  2022-02-10 17:43         ` Paolo Bonzini
  0 siblings, 1 reply; 72+ messages in thread
From: Sean Christopherson @ 2022-02-10 17:29 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, dmatlack, vkuznets

On Thu, Feb 10, 2022, Paolo Bonzini wrote:
> On 2/10/22 01:47, Sean Christopherson wrote:
> > The nested mess is likely easily solved, I don't see any obvious issue with swapping
> > the order.  But I still don't love the subtlety.  I do like shaving cycles, just
> > not the subtlety...
> 
> Not so easily, but it's doable and it's essentially what I did in the other
> series (the one that reworks the root cache).

Sounds like I should review that series first?

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 00/23] KVM: MMU: MMU role refactoring
  2022-02-10 16:55           ` Sean Christopherson
@ 2022-02-10 17:30             ` Paolo Bonzini
  2022-02-10 19:28               ` Sean Christopherson
  0 siblings, 1 reply; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-10 17:30 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: David Matlack, LKML, kvm list, Vitaly Kuznetsov

On 2/10/22 17:55, Sean Christopherson wrote:
>> 	union kvm_mmu_page_role root_role;
>> 	union kvm_mmu_paging_mode cpu_mode;
>
> I'd prefer to not use "paging mode", the SDM uses that terminology to refer to
> the four paging modes.  My expectation given the name is that the union would
> track only CR0.PG, EFER.LME, CR4.PAE, and CR4.PSE[*].

Yeah, I had started with kvm_mmu_paging_flags, but cpu_flags was an even 
worse name than kvm_mmu_paging_mode.

Anyway, now that I have done _some_ replacement, it's a matter of sed -i 
on the patch files once you or someone else come up with a good moniker.

I take it that "root_role" passed your filter successfully.

Paolo

> I'm out of ideas at the moment, I'll keep chewing on this while reviewing...
> 
> [*] Someone at Intel rewrote the SDM and eliminated Mode B, a.k.a. PSE 36-bit
> physical paging, it's now just part of "32-bit paging".  But 5-level paging is
> considered it's own paging mode?!?!  Lame.  I guess they really want to have
> exactly four paging modes...


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 07/23] KVM: MMU: remove kvm_mmu_calc_root_page_role
  2022-02-10 17:29       ` Sean Christopherson
@ 2022-02-10 17:43         ` Paolo Bonzini
  0 siblings, 0 replies; 72+ messages in thread
From: Paolo Bonzini @ 2022-02-10 17:43 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: linux-kernel, kvm, dmatlack, vkuznets

On 2/10/22 18:29, Sean Christopherson wrote:
> On Thu, Feb 10, 2022, Paolo Bonzini wrote:
>> On 2/10/22 01:47, Sean Christopherson wrote:
>>> The nested mess is likely easily solved, I don't see any obvious issue with swapping
>>> the order.  But I still don't love the subtlety.  I do like shaving cycles, just
>>> not the subtlety...
>>
>> Not so easily, but it's doable and it's essentially what I did in the other
>> series (the one that reworks the root cache).
> 
> Sounds like I should reveiw that series first?

Yeah, this one is still a nice step in the direction of separating the guest 
pt walk from the shadow pt build(*), but with no immediate use for TDP MMU root reuse.

The original idea was to use the MMU role to decide whether to do 
kvm_mmu_unload(), but that would have still been a bandaid---inefficient 
and a bad idea overall.  Patches 6+7 of this series (once fixed, because 
they were buggy as hell) turned out to be enough to tame the PGD cache 
and remove kvm_mmu_unload() altogether from kvm_mmu_reset_context().

Paolo

(*) Your idea of detecting stale roots is quite easily implemented on 
top of these, for example, because a root is stale if and only if the 
root_role changes.
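
E.g. something along these lines, untested and using the proposed root_role
name together with the existing to_shadow_page() helper:

	static bool root_role_is_stale(struct kvm_mmu *mmu)
	{
		struct kvm_mmu_page *sp;

		if (!VALID_PAGE(mmu->root_hpa))
			return false;

		sp = to_shadow_page(mmu->root_hpa);

		/* special roots without a backing sp would need extra care */
		return sp && sp->role.word != mmu->root_role.word;
	}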


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 00/23] KVM: MMU: MMU role refactoring
  2022-02-10 17:30             ` Paolo Bonzini
@ 2022-02-10 19:28               ` Sean Christopherson
  0 siblings, 0 replies; 72+ messages in thread
From: Sean Christopherson @ 2022-02-10 19:28 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: David Matlack, LKML, kvm list, Vitaly Kuznetsov

On Thu, Feb 10, 2022, Paolo Bonzini wrote:
> On 2/10/22 17:55, Sean Christopherson wrote:
> > > 	union kvm_mmu_page_role root_role;
> > > 	union kvm_mmu_paging_mode cpu_mode;
> > 
> > I'd prefer to not use "paging mode", the SDM uses that terminology to refer to
> > the four paging modes.  My expectation given the name is that the union would
> > track only CR0.PG, EFER.LME, CR4.PAE, and CR4.PSE[*].
> 
> Yeah, I had started with kvm_mmu_paging_flags, but cpu_flags was an even
> worse method than kvm_mmu_paging_mode.

We could always do s/is_guest_mode/is_nested_mode or something to that effect.
It would take some retraining, but I feel like we've been fighting the whole
"guest mode" thing over and over.

> Anyway, now that I have done _some_ replacement, it's a matter of sed -i on
> the patch files once you or someone else come up with a good moniker.
> 
> I take it that "root_role" passed your filter successfully.

Yep, works for me.  I almost suggested it, too, but decided I liked mmu_role
marginally better.  I like root_role because it ties in with root_hpa and root_pgd.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH 00/23] KVM: MMU: MMU role refactoring
  2022-02-09 22:31 ` Sean Christopherson
  2022-02-10  9:54   ` Paolo Bonzini
@ 2022-02-14 18:14   ` David Matlack
  1 sibling, 0 replies; 72+ messages in thread
From: David Matlack @ 2022-02-14 18:14 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, LKML, kvm list, Vitaly Kuznetsov

On Wed, Feb 9, 2022 at 2:31 PM Sean Christopherson <seanjc@google.com> wrote:
> On Fri, Feb 04, 2022, Paolo Bonzini wrote:
> >   KVM: MMU: replace direct_map with mmu_role.direct
>
> Heresy!  Everyone knows the one true way is "KVM: x86/mmu:"
>
>   $ glo | grep "KVM: MMU:" | wc -l
>   740
>   $ glo | grep "KVM: x86/mmu:" | wc -l
>   403
>
> Dammit, I'm the heathen...
>
> I do think we should use x86/mmu though.  VMX and SVM (and nVMX and nSVM) are ok
> because they're unlikely to collide with other architectures, but every arch has
> an MMU...

Can you document these rules/preferences somewhere? Even better if we
can enforce them with checkpatch :)

^ permalink raw reply	[flat|nested] 72+ messages in thread

end of thread

Thread overview: 72+ messages
2022-02-04 11:56 [PATCH 00/23] KVM: MMU: MMU role refactoring Paolo Bonzini
2022-02-04 11:56 ` [PATCH 01/23] KVM: MMU: pass uses_nx directly to reset_shadow_zero_bits_mask Paolo Bonzini
2022-02-04 17:59   ` David Matlack
2022-02-05 14:52     ` Paolo Bonzini
2022-02-07 16:09       ` Sean Christopherson
2022-02-07 21:50         ` David Matlack
2022-02-10  0:30     ` Sean Christopherson
2022-02-10 16:38       ` Paolo Bonzini
2022-02-04 11:56 ` [PATCH 02/23] KVM: MMU: nested EPT cannot be used in SMM Paolo Bonzini
2022-02-04 18:16   ` David Matlack
2022-02-09 22:43     ` Sean Christopherson
2022-02-04 11:56 ` [PATCH 03/23] KVM: MMU: remove valid from extended role Paolo Bonzini
2022-02-04 18:32   ` David Matlack
2022-02-05 14:50     ` Paolo Bonzini
2022-02-09 22:54   ` Sean Christopherson
2022-02-10  9:53     ` Paolo Bonzini
2022-02-04 11:56 ` [PATCH 04/23] KVM: MMU: constify uses of struct kvm_mmu_role_regs Paolo Bonzini
2022-02-04 18:41   ` David Matlack
2022-02-09 22:57   ` Sean Christopherson
2022-02-04 11:57 ` [PATCH 05/23] KVM: MMU: pull computation of kvm_mmu_role_regs to kvm_init_mmu Paolo Bonzini
2022-02-04 18:45   ` David Matlack
2022-02-04 11:57 ` [PATCH 06/23] KVM: MMU: load new PGD once nested two-dimensional paging is initialized Paolo Bonzini
2022-02-04 19:18   ` David Matlack
2022-02-07 13:50     ` Paolo Bonzini
2022-02-07 14:35       ` Paolo Bonzini
2022-02-09 12:34     ` Paolo Bonzini
2022-02-04 11:57 ` [PATCH 07/23] KVM: MMU: remove kvm_mmu_calc_root_page_role Paolo Bonzini
2022-02-04 19:32   ` David Matlack
2022-02-05 14:46     ` Paolo Bonzini
2022-02-10  0:47   ` Sean Christopherson
2022-02-10  9:52     ` Paolo Bonzini
2022-02-10 17:29       ` Sean Christopherson
2022-02-10 17:43         ` Paolo Bonzini
2022-02-04 11:57 ` [PATCH 08/23] KVM: MMU: rephrase unclear comment Paolo Bonzini
2022-02-04 19:38   ` David Matlack
2022-02-04 11:57 ` [PATCH 09/23] KVM: MMU: remove "bool base_only" arguments Paolo Bonzini
2022-02-04 19:41   ` David Matlack
2022-02-04 11:57 ` [PATCH 10/23] KVM: MMU: split cpu_role from mmu_role Paolo Bonzini
2022-02-04 21:57   ` David Matlack
2022-02-05 14:49     ` Paolo Bonzini
2022-02-07 21:38       ` David Matlack
2022-02-04 11:57 ` [PATCH 11/23] KVM: MMU: do not recompute root level from kvm_mmu_role_regs Paolo Bonzini
2022-02-07 22:10   ` David Matlack
2022-02-07 22:17     ` David Matlack
2022-02-04 11:57 ` [PATCH 12/23] KVM: MMU: remove ept_ad field Paolo Bonzini
2022-02-04 11:57 ` [PATCH 13/23] KVM: MMU: remove kvm_calc_shadow_root_page_role_common Paolo Bonzini
2022-02-07 22:25   ` David Matlack
2022-02-04 11:57 ` [PATCH 14/23] KVM: MMU: cleanup computation of MMU roles for two-dimensional paging Paolo Bonzini
2022-02-04 11:57 ` [PATCH 15/23] KVM: MMU: cleanup computation of MMU roles for shadow paging Paolo Bonzini
2022-02-04 11:57 ` [PATCH 16/23] KVM: MMU: remove extended bits from mmu_role Paolo Bonzini
2022-02-04 11:57 ` [PATCH 17/23] KVM: MMU: remove redundant bits from extended role Paolo Bonzini
2022-02-04 11:57 ` [PATCH 18/23] KVM: MMU: fetch shadow EFER.NX from MMU role Paolo Bonzini
2022-02-04 11:57 ` [PATCH 19/23] KVM: MMU: simplify and/or inline computation of shadow MMU roles Paolo Bonzini
2022-02-04 11:57 ` [PATCH 20/23] KVM: MMU: pull CPU role computation to kvm_init_mmu Paolo Bonzini
2022-02-07 22:42   ` David Matlack
2022-02-04 11:57 ` [PATCH 21/23] KVM: MMU: store shadow_root_level into mmu_role Paolo Bonzini
2022-02-07 23:00   ` David Matlack
2022-02-04 11:57 ` [PATCH 22/23] KVM: MMU: use cpu_role for root_level Paolo Bonzini
2022-02-07 23:01   ` David Matlack
2022-02-04 11:57 ` [PATCH 23/23] KVM: MMU: replace direct_map with mmu_role.direct Paolo Bonzini
2022-02-07 23:02   ` David Matlack
2022-02-07 23:08 ` [PATCH 00/23] KVM: MMU: MMU role refactoring David Matlack
2022-02-07 23:27   ` Sean Christopherson
2022-02-07 23:53     ` David Matlack
2022-02-10  1:11       ` Sean Christopherson
2022-02-10 11:58         ` Paolo Bonzini
2022-02-10 16:55           ` Sean Christopherson
2022-02-10 17:30             ` Paolo Bonzini
2022-02-10 19:28               ` Sean Christopherson
2022-02-09 22:31 ` Sean Christopherson
2022-02-10  9:54   ` Paolo Bonzini
2022-02-14 18:14   ` David Matlack
