linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/7] KVM: x86/mmu: nVMX: 5-level paging fixes and enabling
@ 2020-02-07 17:37 Sean Christopherson
  2020-02-07 17:37 ` [PATCH v2 1/7] KVM: nVMX: Use correct root level for nested EPT shadow page tables Sean Christopherson
                   ` (7 more replies)
  0 siblings, 8 replies; 13+ messages in thread
From: Sean Christopherson @ 2020-02-07 17:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

Two fixes for 5-level paging bugs with a 100% fatality rate, a patch to
enable 5-level EPT in L1, and additional clean up on top (mostly renames
of functions/variables that caused me no end of confusion when trying to
figure out what was broken).

Tested fixed kernels at L0, L1 and L2, with most combinations of EPT,
shadow paging, 4-level and 5-level.  EPT kvm-unit-tests runs clean in L0.
Patches for kvm-unit-tests incoming to play nice with 5-level nested EPT.

Ideally patches 1 and 2 would get into 5.6, 5-level paging is quite
broken without them.

v2:
  - Increase the nested EPT array sizes to accomodate 5-level paging in
    the patch that adds support for 5-level nested EPT, not in the bug
    fix for 5-level shadow paging.

Sean Christopherson (7):
  KVM: nVMX: Use correct root level for nested EPT shadow page tables
  KVM: x86/mmu: Fix struct guest_walker arrays for 5-level paging
  KVM: nVMX: Allow L1 to use 5-level page walks for nested EPT
  KVM: nVMX: Rename nested_ept_get_cr3() to nested_ept_get_eptp()
  KVM: nVMX: Rename EPTP validity helper and associated variables
  KVM: x86/mmu: Rename kvm_mmu->get_cr3() to ->get_guest_cr3_or_eptp()
  KVM: nVMX: Drop unnecessary check on ept caps for execute-only

 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/include/asm/vmx.h      | 12 +++++++
 arch/x86/kvm/mmu/mmu.c          | 35 ++++++++++----------
 arch/x86/kvm/mmu/paging_tmpl.h  |  6 ++--
 arch/x86/kvm/svm.c              | 10 +++---
 arch/x86/kvm/vmx/nested.c       | 58 ++++++++++++++++++++-------------
 arch/x86/kvm/vmx/nested.h       |  4 +--
 arch/x86/kvm/vmx/vmx.c          |  2 ++
 arch/x86/kvm/x86.c              |  2 +-
 9 files changed, 79 insertions(+), 52 deletions(-)

-- 
2.24.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/7] KVM: nVMX: Use correct root level for nested EPT shadow page tables
  2020-02-07 17:37 [PATCH v2 0/7] KVM: x86/mmu: nVMX: 5-level paging fixes and enabling Sean Christopherson
@ 2020-02-07 17:37 ` Sean Christopherson
  2020-02-07 17:37 ` [PATCH v2 2/7] KVM: x86/mmu: Fix struct guest_walker arrays for 5-level paging Sean Christopherson
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2020-02-07 17:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

Hardcode the EPT page-walk level for L2 to be 4 levels, as KVM's MMU
currently also hardcodes the page walk level for nested EPT to be 4
levels.  The L2 guest is all but guaranteed to soft hang on its first
instruction when L1 is using EPT, as KVM will construct 4-level page
tables and then tell hardware to use 5-level page tables.

Fixes: 855feb673640 ("KVM: MMU: Add 5 level EPT & Shadow page table support.")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9a6664886f2e..ed1d41f5f505 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2947,6 +2947,9 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 
 static int get_ept_level(struct kvm_vcpu *vcpu)
 {
+	/* Nested EPT currently only supports 4-level walks. */
+	if (is_guest_mode(vcpu) && nested_cpu_has_ept(get_vmcs12(vcpu)))
+		return 4;
 	if (cpu_has_vmx_ept_5levels() && (cpuid_maxphyaddr(vcpu) > 48))
 		return 5;
 	return 4;
-- 
2.24.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 2/7] KVM: x86/mmu: Fix struct guest_walker arrays for 5-level paging
  2020-02-07 17:37 [PATCH v2 0/7] KVM: x86/mmu: nVMX: 5-level paging fixes and enabling Sean Christopherson
  2020-02-07 17:37 ` [PATCH v2 1/7] KVM: nVMX: Use correct root level for nested EPT shadow page tables Sean Christopherson
@ 2020-02-07 17:37 ` Sean Christopherson
  2020-02-07 17:37 ` [PATCH v2 3/7] KVM: nVMX: Allow L1 to use 5-level page walks for nested EPT Sean Christopherson
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2020-02-07 17:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

Define PT_MAX_FULL_LEVELS as PT64_ROOT_MAX_LEVEL, i.e. 5, to fix shadow
paging for 5-level guest page tables.  PT_MAX_FULL_LEVELS is used to
size the arrays that track guest pages table information, i.e. using a
"max levels" of 4 causes KVM to access garbage beyond the end of an
array when querying state for level 5 entries.  E.g. FNAME(gpte_changed)
will read garbage and most likely return %true for a level 5 entry,
soft-hanging the guest because FNAME(fetch) will restart the guest
instead of creating SPTEs because it thinks the guest PTE has changed.

Note, KVM doesn't yet support 5-level nested EPT, so PT_MAX_FULL_LEVELS
gets to stay "4" for the PTTYPE_EPT case.

Fixes: 855feb673640 ("KVM: MMU: Add 5 level EPT & Shadow page table support.")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/mmu/paging_tmpl.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 4e1ef0473663..e4c8a4cbf407 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -33,7 +33,7 @@
 	#define PT_GUEST_ACCESSED_SHIFT PT_ACCESSED_SHIFT
 	#define PT_HAVE_ACCESSED_DIRTY(mmu) true
 	#ifdef CONFIG_X86_64
-	#define PT_MAX_FULL_LEVELS 4
+	#define PT_MAX_FULL_LEVELS PT64_ROOT_MAX_LEVEL
 	#define CMPXCHG cmpxchg
 	#else
 	#define CMPXCHG cmpxchg64
-- 
2.24.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 3/7] KVM: nVMX: Allow L1 to use 5-level page walks for nested EPT
  2020-02-07 17:37 [PATCH v2 0/7] KVM: x86/mmu: nVMX: 5-level paging fixes and enabling Sean Christopherson
  2020-02-07 17:37 ` [PATCH v2 1/7] KVM: nVMX: Use correct root level for nested EPT shadow page tables Sean Christopherson
  2020-02-07 17:37 ` [PATCH v2 2/7] KVM: x86/mmu: Fix struct guest_walker arrays for 5-level paging Sean Christopherson
@ 2020-02-07 17:37 ` Sean Christopherson
  2020-02-07 17:37 ` [PATCH v2 4/7] KVM: nVMX: Rename nested_ept_get_cr3() to nested_ept_get_eptp() Sean Christopherson
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2020-02-07 17:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

Add support for 5-level nested EPT, and advertise said support in the
EPT capabilities MSR.  KVM's MMU can already handle 5-level legacy page
tables, there's no reason to force an L1 VMM to use shadow paging if it
wants to employ 5-level page tables.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/include/asm/vmx.h     | 12 ++++++++++++
 arch/x86/kvm/mmu/mmu.c         | 11 ++++++-----
 arch/x86/kvm/mmu/paging_tmpl.h |  2 +-
 arch/x86/kvm/vmx/nested.c      | 21 +++++++++++++++++----
 arch/x86/kvm/vmx/vmx.c         |  3 +--
 5 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 2a85287b3685..bcd93fe07991 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -500,6 +500,18 @@ enum vmcs_field {
 						 VMX_EPT_EXECUTABLE_MASK)
 #define VMX_EPT_MT_MASK				(7ull << VMX_EPT_MT_EPTE_SHIFT)
 
+static inline u8 vmx_eptp_page_walk_level(u64 eptp)
+{
+	u64 encoded_level = eptp & VMX_EPTP_PWL_MASK;
+
+	if (encoded_level == VMX_EPTP_PWL_5)
+		return 5;
+
+	/* @eptp must be pre-validated by the caller. */
+	WARN_ON_ONCE(encoded_level != VMX_EPTP_PWL_4);
+	return 4;
+}
+
 /* The mask to use to trigger an EPT Misconfiguration in order to track MMIO */
 #define VMX_EPT_MISCONFIG_WX_VALUE		(VMX_EPT_WRITABLE_MASK |       \
 						 VMX_EPT_EXECUTABLE_MASK)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7011a4e54866..70f67bcab7db 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5012,14 +5012,14 @@ EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu);
 
 static union kvm_mmu_role
 kvm_calc_shadow_ept_root_page_role(struct kvm_vcpu *vcpu, bool accessed_dirty,
-				   bool execonly)
+				   bool execonly, u8 level)
 {
 	union kvm_mmu_role role = {0};
 
 	/* SMM flag is inherited from root_mmu */
 	role.base.smm = vcpu->arch.root_mmu.mmu_role.base.smm;
 
-	role.base.level = PT64_ROOT_4LEVEL;
+	role.base.level = level;
 	role.base.gpte_is_8_bytes = true;
 	role.base.direct = false;
 	role.base.ad_disabled = !accessed_dirty;
@@ -5043,9 +5043,10 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 			     bool accessed_dirty, gpa_t new_eptp)
 {
 	struct kvm_mmu *context = vcpu->arch.mmu;
+	u8 level = vmx_eptp_page_walk_level(new_eptp);
 	union kvm_mmu_role new_role =
 		kvm_calc_shadow_ept_root_page_role(vcpu, accessed_dirty,
-						   execonly);
+						   execonly, level);
 
 	__kvm_mmu_new_cr3(vcpu, new_eptp, new_role.base, false);
 
@@ -5053,7 +5054,7 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 	if (new_role.as_u64 == context->mmu_role.as_u64)
 		return;
 
-	context->shadow_root_level = PT64_ROOT_4LEVEL;
+	context->shadow_root_level = level;
 
 	context->nx = true;
 	context->ept_ad = accessed_dirty;
@@ -5062,7 +5063,7 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 	context->sync_page = ept_sync_page;
 	context->invlpg = ept_invlpg;
 	context->update_pte = ept_update_pte;
-	context->root_level = PT64_ROOT_4LEVEL;
+	context->root_level = level;
 	context->direct_map = false;
 	context->mmu_role.as_u64 = new_role.as_u64;
 
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index e4c8a4cbf407..6b15b58f3ecc 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -66,7 +66,7 @@
 	#define PT_GUEST_ACCESSED_SHIFT 8
 	#define PT_HAVE_ACCESSED_DIRTY(mmu) ((mmu)->ept_ad)
 	#define CMPXCHG cmpxchg64
-	#define PT_MAX_FULL_LEVELS 4
+	#define PT_MAX_FULL_LEVELS PT64_ROOT_MAX_LEVEL
 #else
 	#error Invalid PTTYPE value
 #endif
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 657c2eda357c..d5fc4bfea0e2 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2581,9 +2581,19 @@ static bool valid_ept_address(struct kvm_vcpu *vcpu, u64 address)
 		return false;
 	}
 
-	/* only 4 levels page-walk length are valid */
-	if (CC((address & VMX_EPTP_PWL_MASK) != VMX_EPTP_PWL_4))
+	/* Page-walk levels validity. */
+	switch (address & VMX_EPTP_PWL_MASK) {
+	case VMX_EPTP_PWL_5:
+		if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPT_PAGE_WALK_5_BIT)))
+			return false;
+		break;
+	case VMX_EPTP_PWL_4:
+		if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPT_PAGE_WALK_4_BIT)))
+			return false;
+		break;
+	default:
 		return false;
+	}
 
 	/* Reserved bits should not be set */
 	if (CC(address >> maxphyaddr || ((address >> 7) & 0x1f)))
@@ -6057,8 +6067,11 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps,
 		/* nested EPT: emulate EPT also to L1 */
 		msrs->secondary_ctls_high |=
 			SECONDARY_EXEC_ENABLE_EPT;
-		msrs->ept_caps = VMX_EPT_PAGE_WALK_4_BIT |
-			 VMX_EPTP_WB_BIT | VMX_EPT_INVEPT_BIT;
+		msrs->ept_caps =
+			VMX_EPT_PAGE_WALK_4_BIT |
+			VMX_EPT_PAGE_WALK_5_BIT |
+			VMX_EPTP_WB_BIT |
+			VMX_EPT_INVEPT_BIT;
 		if (cpu_has_vmx_ept_execute_only())
 			msrs->ept_caps |=
 				VMX_EPT_EXECUTE_ONLY_BIT;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ed1d41f5f505..e6d5c9277ba5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2947,9 +2947,8 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 
 static int get_ept_level(struct kvm_vcpu *vcpu)
 {
-	/* Nested EPT currently only supports 4-level walks. */
 	if (is_guest_mode(vcpu) && nested_cpu_has_ept(get_vmcs12(vcpu)))
-		return 4;
+		return vmx_eptp_page_walk_level(nested_ept_get_cr3(vcpu));
 	if (cpu_has_vmx_ept_5levels() && (cpuid_maxphyaddr(vcpu) > 48))
 		return 5;
 	return 4;
-- 
2.24.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 4/7] KVM: nVMX: Rename nested_ept_get_cr3() to nested_ept_get_eptp()
  2020-02-07 17:37 [PATCH v2 0/7] KVM: x86/mmu: nVMX: 5-level paging fixes and enabling Sean Christopherson
                   ` (2 preceding siblings ...)
  2020-02-07 17:37 ` [PATCH v2 3/7] KVM: nVMX: Allow L1 to use 5-level page walks for nested EPT Sean Christopherson
@ 2020-02-07 17:37 ` Sean Christopherson
  2020-02-07 17:37 ` [PATCH v2 5/7] KVM: nVMX: Rename EPTP validity helper and associated variables Sean Christopherson
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2020-02-07 17:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

Rename the accessor for vmcs12.EPTP to use "eptp" instead of "cr3".  The
accessor has no relation to cr3 whatsoever, other than it being assigned
to the also poorly named kvm_mmu->get_cr3() hook.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c | 4 ++--
 arch/x86/kvm/vmx/nested.h | 4 ++--
 arch/x86/kvm/vmx/vmx.c    | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index d5fc4bfea0e2..1a5db5f64352 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -353,9 +353,9 @@ static void nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
 			to_vmx(vcpu)->nested.msrs.ept_caps &
 			VMX_EPT_EXECUTE_ONLY_BIT,
 			nested_ept_ad_enabled(vcpu),
-			nested_ept_get_cr3(vcpu));
+			nested_ept_get_eptp(vcpu));
 	vcpu->arch.mmu->set_cr3           = vmx_set_cr3;
-	vcpu->arch.mmu->get_cr3           = nested_ept_get_cr3;
+	vcpu->arch.mmu->get_cr3           = nested_ept_get_eptp;
 	vcpu->arch.mmu->inject_page_fault = nested_ept_inject_page_fault;
 	vcpu->arch.mmu->get_pdptr         = kvm_pdptr_read;
 
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index fc874d4ead0f..9d1c2fc81221 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -59,7 +59,7 @@ static inline int vmx_has_valid_vmcs12(struct kvm_vcpu *vcpu)
 		vmx->nested.hv_evmcs;
 }
 
-static inline unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu)
+static inline unsigned long nested_ept_get_eptp(struct kvm_vcpu *vcpu)
 {
 	/* return the page table to be shadowed - in our case, EPT12 */
 	return get_vmcs12(vcpu)->ept_pointer;
@@ -67,7 +67,7 @@ static inline unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu)
 
 static inline bool nested_ept_ad_enabled(struct kvm_vcpu *vcpu)
 {
-	return nested_ept_get_cr3(vcpu) & VMX_EPTP_AD_ENABLE_BIT;
+	return nested_ept_get_eptp(vcpu) & VMX_EPTP_AD_ENABLE_BIT;
 }
 
 /*
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e6d5c9277ba5..5b4aea535958 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2948,7 +2948,7 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 static int get_ept_level(struct kvm_vcpu *vcpu)
 {
 	if (is_guest_mode(vcpu) && nested_cpu_has_ept(get_vmcs12(vcpu)))
-		return vmx_eptp_page_walk_level(nested_ept_get_cr3(vcpu));
+		return vmx_eptp_page_walk_level(nested_ept_get_eptp(vcpu));
 	if (cpu_has_vmx_ept_5levels() && (cpuid_maxphyaddr(vcpu) > 48))
 		return 5;
 	return 4;
-- 
2.24.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 5/7] KVM: nVMX: Rename EPTP validity helper and associated variables
  2020-02-07 17:37 [PATCH v2 0/7] KVM: x86/mmu: nVMX: 5-level paging fixes and enabling Sean Christopherson
                   ` (3 preceding siblings ...)
  2020-02-07 17:37 ` [PATCH v2 4/7] KVM: nVMX: Rename nested_ept_get_cr3() to nested_ept_get_eptp() Sean Christopherson
@ 2020-02-07 17:37 ` Sean Christopherson
  2020-02-07 17:37 ` [PATCH v2 6/7] KVM: x86/mmu: Rename kvm_mmu->get_cr3() to ->get_guest_cr3_or_eptp() Sean Christopherson
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2020-02-07 17:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

Rename valid_ept_address() to nested_vmx_check_eptp() to follow the nVMX
nomenclature and to reflect that the function now checks a lot more than
just the address contained in the EPTP.  Rename address to new_eptp in
associated code.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 1a5db5f64352..4fb05c0e29fe 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2562,13 +2562,13 @@ static int nested_vmx_check_nmi_controls(struct vmcs12 *vmcs12)
 	return 0;
 }
 
-static bool valid_ept_address(struct kvm_vcpu *vcpu, u64 address)
+static bool nested_vmx_check_eptp(struct kvm_vcpu *vcpu, u64 new_eptp)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	int maxphyaddr = cpuid_maxphyaddr(vcpu);
 
 	/* Check for memory type validity */
-	switch (address & VMX_EPTP_MT_MASK) {
+	switch (new_eptp & VMX_EPTP_MT_MASK) {
 	case VMX_EPTP_MT_UC:
 		if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPTP_UC_BIT)))
 			return false;
@@ -2582,7 +2582,7 @@ static bool valid_ept_address(struct kvm_vcpu *vcpu, u64 address)
 	}
 
 	/* Page-walk levels validity. */
-	switch (address & VMX_EPTP_PWL_MASK) {
+	switch (new_eptp & VMX_EPTP_PWL_MASK) {
 	case VMX_EPTP_PWL_5:
 		if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPT_PAGE_WALK_5_BIT)))
 			return false;
@@ -2596,11 +2596,11 @@ static bool valid_ept_address(struct kvm_vcpu *vcpu, u64 address)
 	}
 
 	/* Reserved bits should not be set */
-	if (CC(address >> maxphyaddr || ((address >> 7) & 0x1f)))
+	if (CC(new_eptp >> maxphyaddr || ((new_eptp >> 7) & 0x1f)))
 		return false;
 
 	/* AD, if set, should be supported */
-	if (address & VMX_EPTP_AD_ENABLE_BIT) {
+	if (new_eptp & VMX_EPTP_AD_ENABLE_BIT) {
 		if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPT_AD_BIT)))
 			return false;
 	}
@@ -2649,7 +2649,7 @@ static int nested_check_vm_execution_controls(struct kvm_vcpu *vcpu,
 		return -EINVAL;
 
 	if (nested_cpu_has_ept(vmcs12) &&
-	    CC(!valid_ept_address(vcpu, vmcs12->ept_pointer)))
+	    CC(!nested_vmx_check_eptp(vcpu, vmcs12->ept_pointer)))
 		return -EINVAL;
 
 	if (nested_cpu_has_vmfunc(vmcs12)) {
@@ -5188,7 +5188,7 @@ static int nested_vmx_eptp_switching(struct kvm_vcpu *vcpu,
 				     struct vmcs12 *vmcs12)
 {
 	u32 index = kvm_rcx_read(vcpu);
-	u64 address;
+	u64 new_eptp;
 	bool accessed_dirty;
 	struct kvm_mmu *mmu = vcpu->arch.walk_mmu;
 
@@ -5201,23 +5201,23 @@ static int nested_vmx_eptp_switching(struct kvm_vcpu *vcpu,
 
 
 	if (kvm_vcpu_read_guest_page(vcpu, vmcs12->eptp_list_address >> PAGE_SHIFT,
-				     &address, index * 8, 8))
+				     &new_eptp, index * 8, 8))
 		return 1;
 
-	accessed_dirty = !!(address & VMX_EPTP_AD_ENABLE_BIT);
+	accessed_dirty = !!(new_eptp & VMX_EPTP_AD_ENABLE_BIT);
 
 	/*
 	 * If the (L2) guest does a vmfunc to the currently
 	 * active ept pointer, we don't have to do anything else
 	 */
-	if (vmcs12->ept_pointer != address) {
-		if (!valid_ept_address(vcpu, address))
+	if (vmcs12->ept_pointer != new_eptp) {
+		if (!nested_vmx_check_eptp(vcpu, new_eptp))
 			return 1;
 
 		kvm_mmu_unload(vcpu);
 		mmu->ept_ad = accessed_dirty;
 		mmu->mmu_role.base.ad_disabled = !accessed_dirty;
-		vmcs12->ept_pointer = address;
+		vmcs12->ept_pointer = new_eptp;
 		/*
 		 * TODO: Check what's the correct approach in case
 		 * mmu reload fails. Currently, we just let the next
-- 
2.24.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 6/7] KVM: x86/mmu: Rename kvm_mmu->get_cr3() to ->get_guest_cr3_or_eptp()
  2020-02-07 17:37 [PATCH v2 0/7] KVM: x86/mmu: nVMX: 5-level paging fixes and enabling Sean Christopherson
                   ` (4 preceding siblings ...)
  2020-02-07 17:37 ` [PATCH v2 5/7] KVM: nVMX: Rename EPTP validity helper and associated variables Sean Christopherson
@ 2020-02-07 17:37 ` Sean Christopherson
  2020-02-12 12:00   ` Paolo Bonzini
  2020-02-07 17:37 ` [PATCH v2 7/7] KVM: nVMX: Drop unnecessary check on ept caps for execute-only Sean Christopherson
  2020-02-12 12:03 ` [PATCH v2 0/7] KVM: x86/mmu: nVMX: 5-level paging fixes and enabling Paolo Bonzini
  7 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2020-02-07 17:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

Rename kvm_mmu->get_cr3() to call out that it is retrieving a guest
value, as opposed to kvm_mmu->set_cr3(), which sets a host value, and to
note that it will return L1's EPTP when nested EPT is in use.  Hopefully
the new name will also make it more obvious that L1's nested_cr3 is
returned in SVM's nested NPT case.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/mmu/mmu.c          | 24 ++++++++++++------------
 arch/x86/kvm/mmu/paging_tmpl.h  |  2 +-
 arch/x86/kvm/svm.c              | 10 +++++-----
 arch/x86/kvm/vmx/nested.c       |  8 ++++----
 arch/x86/kvm/x86.c              |  2 +-
 6 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4dffbc10d3f8..d3d69ad2e969 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -383,7 +383,7 @@ struct kvm_mmu_root_info {
  */
 struct kvm_mmu {
 	void (*set_cr3)(struct kvm_vcpu *vcpu, unsigned long root);
-	unsigned long (*get_cr3)(struct kvm_vcpu *vcpu);
+	unsigned long (*get_guest_cr3_or_eptp)(struct kvm_vcpu *vcpu);
 	u64 (*get_pdptr)(struct kvm_vcpu *vcpu, int index);
 	int (*page_fault)(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u32 err,
 			  bool prefault);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 70f67bcab7db..13df4b4a5649 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3731,7 +3731,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 		vcpu->arch.mmu->root_hpa = __pa(vcpu->arch.mmu->pae_root);
 	} else
 		BUG();
-	vcpu->arch.mmu->root_cr3 = vcpu->arch.mmu->get_cr3(vcpu);
+	vcpu->arch.mmu->root_cr3 = vcpu->arch.mmu->get_guest_cr3_or_eptp(vcpu);
 
 	return 0;
 }
@@ -3743,7 +3743,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	gfn_t root_gfn, root_cr3;
 	int i;
 
-	root_cr3 = vcpu->arch.mmu->get_cr3(vcpu);
+	root_cr3 = vcpu->arch.mmu->get_guest_cr3_or_eptp(vcpu);
 	root_gfn = root_cr3 >> PAGE_SHIFT;
 
 	if (mmu_check_root(vcpu, root_gfn))
@@ -4080,7 +4080,7 @@ static int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	arch.token = (vcpu->arch.apf.id++ << 12) | vcpu->vcpu_id;
 	arch.gfn = gfn;
 	arch.direct_map = vcpu->arch.mmu->direct_map;
-	arch.cr3 = vcpu->arch.mmu->get_cr3(vcpu);
+	arch.cr3 = vcpu->arch.mmu->get_guest_cr3_or_eptp(vcpu);
 
 	return kvm_setup_async_pf(vcpu, cr2_or_gpa,
 				  kvm_vcpu_gfn_to_hva(vcpu, gfn), &arch);
@@ -4932,7 +4932,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
 	context->shadow_root_level = kvm_x86_ops->get_tdp_level(vcpu);
 	context->direct_map = true;
 	context->set_cr3 = kvm_x86_ops->set_tdp_cr3;
-	context->get_cr3 = get_cr3;
+	context->get_guest_cr3_or_eptp = get_cr3;
 	context->get_pdptr = kvm_pdptr_read;
 	context->inject_page_fault = kvm_inject_page_fault;
 
@@ -5080,10 +5080,10 @@ static void init_kvm_softmmu(struct kvm_vcpu *vcpu)
 	struct kvm_mmu *context = vcpu->arch.mmu;
 
 	kvm_init_shadow_mmu(vcpu);
-	context->set_cr3           = kvm_x86_ops->set_cr3;
-	context->get_cr3           = get_cr3;
-	context->get_pdptr         = kvm_pdptr_read;
-	context->inject_page_fault = kvm_inject_page_fault;
+	context->set_cr3	       = kvm_x86_ops->set_cr3;
+	context->get_guest_cr3_or_eptp = get_cr3;
+	context->get_pdptr	       = kvm_pdptr_read;
+	context->inject_page_fault     = kvm_inject_page_fault;
 }
 
 static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu)
@@ -5095,10 +5095,10 @@ static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu)
 	if (new_role.as_u64 == g_context->mmu_role.as_u64)
 		return;
 
-	g_context->mmu_role.as_u64 = new_role.as_u64;
-	g_context->get_cr3           = get_cr3;
-	g_context->get_pdptr         = kvm_pdptr_read;
-	g_context->inject_page_fault = kvm_inject_page_fault;
+	g_context->mmu_role.as_u64	 = new_role.as_u64;
+	g_context->get_guest_cr3_or_eptp = get_cr3;
+	g_context->get_pdptr		 = kvm_pdptr_read;
+	g_context->inject_page_fault	 = kvm_inject_page_fault;
 
 	/*
 	 * Note that arch.mmu->gva_to_gpa translates l2_gpa to l1_gpa using
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 6b15b58f3ecc..24dfa0fcba56 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -333,7 +333,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 	trace_kvm_mmu_pagetable_walk(addr, access);
 retry_walk:
 	walker->level = mmu->root_level;
-	pte           = mmu->get_cr3(vcpu);
+	pte           = mmu->get_guest_cr3_or_eptp(vcpu);
 	have_ad       = PT_HAVE_ACCESSED_DIRTY(mmu);
 
 #if PTTYPE == 64
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a3e32d61d60c..1e2f05a79417 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3026,11 +3026,11 @@ static void nested_svm_init_mmu_context(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.mmu = &vcpu->arch.guest_mmu;
 	kvm_init_shadow_mmu(vcpu);
-	vcpu->arch.mmu->set_cr3           = nested_svm_set_tdp_cr3;
-	vcpu->arch.mmu->get_cr3           = nested_svm_get_tdp_cr3;
-	vcpu->arch.mmu->get_pdptr         = nested_svm_get_tdp_pdptr;
-	vcpu->arch.mmu->inject_page_fault = nested_svm_inject_npf_exit;
-	vcpu->arch.mmu->shadow_root_level = get_npt_level(vcpu);
+	vcpu->arch.mmu->set_cr3		      = nested_svm_set_tdp_cr3;
+	vcpu->arch.mmu->get_guest_cr3_or_eptp = nested_svm_get_tdp_cr3;
+	vcpu->arch.mmu->get_pdptr	      = nested_svm_get_tdp_pdptr;
+	vcpu->arch.mmu->inject_page_fault     = nested_svm_inject_npf_exit;
+	vcpu->arch.mmu->shadow_root_level     = get_npt_level(vcpu);
 	reset_shadow_zero_bits_mask(vcpu, vcpu->arch.mmu);
 	vcpu->arch.walk_mmu              = &vcpu->arch.nested_mmu;
 }
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4fb05c0e29fe..2d7b87b532f5 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -354,10 +354,10 @@ static void nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
 			VMX_EPT_EXECUTE_ONLY_BIT,
 			nested_ept_ad_enabled(vcpu),
 			nested_ept_get_eptp(vcpu));
-	vcpu->arch.mmu->set_cr3           = vmx_set_cr3;
-	vcpu->arch.mmu->get_cr3           = nested_ept_get_eptp;
-	vcpu->arch.mmu->inject_page_fault = nested_ept_inject_page_fault;
-	vcpu->arch.mmu->get_pdptr         = kvm_pdptr_read;
+	vcpu->arch.mmu->set_cr3		      = vmx_set_cr3;
+	vcpu->arch.mmu->get_guest_cr3_or_eptp = nested_ept_get_eptp;
+	vcpu->arch.mmu->inject_page_fault     = nested_ept_inject_page_fault;
+	vcpu->arch.mmu->get_pdptr	      = kvm_pdptr_read;
 
 	vcpu->arch.walk_mmu              = &vcpu->arch.nested_mmu;
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fbabb2f06273..1e6d8766fbdd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10179,7 +10179,7 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
 		return;
 
 	if (!vcpu->arch.mmu->direct_map &&
-	      work->arch.cr3 != vcpu->arch.mmu->get_cr3(vcpu))
+	      work->arch.cr3 != vcpu->arch.mmu->get_guest_cr3_or_eptp(vcpu))
 		return;
 
 	vcpu->arch.mmu->page_fault(vcpu, work->cr2_or_gpa, 0, true);
-- 
2.24.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 7/7] KVM: nVMX: Drop unnecessary check on ept caps for execute-only
  2020-02-07 17:37 [PATCH v2 0/7] KVM: x86/mmu: nVMX: 5-level paging fixes and enabling Sean Christopherson
                   ` (5 preceding siblings ...)
  2020-02-07 17:37 ` [PATCH v2 6/7] KVM: x86/mmu: Rename kvm_mmu->get_cr3() to ->get_guest_cr3_or_eptp() Sean Christopherson
@ 2020-02-07 17:37 ` Sean Christopherson
  2020-02-12 12:03 ` [PATCH v2 0/7] KVM: x86/mmu: nVMX: 5-level paging fixes and enabling Paolo Bonzini
  7 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2020-02-07 17:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

Drop the call to cpu_has_vmx_ept_execute_only() when calculating which
EPT capabilities will be exposed to L1 for nested EPT.  The resulting
configuration is immediately sanitized by the passed in @ept_caps, and
except for the call from vmx_check_processor_compat(), @ept_caps is the
capabilities that are queried by cpu_has_vmx_ept_execute_only().  For
vmx_check_processor_compat(), KVM *wants* to ignore vmx_capability.ept
so that a divergence in EPT capabilities between CPUs is detected.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/nested.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 2d7b87b532f5..fe7da5e2fc59 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6071,10 +6071,9 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps,
 			VMX_EPT_PAGE_WALK_4_BIT |
 			VMX_EPT_PAGE_WALK_5_BIT |
 			VMX_EPTP_WB_BIT |
-			VMX_EPT_INVEPT_BIT;
-		if (cpu_has_vmx_ept_execute_only())
-			msrs->ept_caps |=
-				VMX_EPT_EXECUTE_ONLY_BIT;
+			VMX_EPT_INVEPT_BIT |
+			VMX_EPT_EXECUTE_ONLY_BIT;
+
 		msrs->ept_caps &= ept_caps;
 		msrs->ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT |
 			VMX_EPT_EXTENT_CONTEXT_BIT | VMX_EPT_2MB_PAGE_BIT |
-- 
2.24.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 6/7] KVM: x86/mmu: Rename kvm_mmu->get_cr3() to ->get_guest_cr3_or_eptp()
  2020-02-07 17:37 ` [PATCH v2 6/7] KVM: x86/mmu: Rename kvm_mmu->get_cr3() to ->get_guest_cr3_or_eptp() Sean Christopherson
@ 2020-02-12 12:00   ` Paolo Bonzini
  2020-02-12 16:28     ` Sean Christopherson
  0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2020-02-12 12:00 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel

On 07/02/20 18:37, Sean Christopherson wrote:
> Rename kvm_mmu->get_cr3() to call out that it is retrieving a guest
> value, as opposed to kvm_mmu->set_cr3(), which sets a host value, and to
> note that it will return L1's EPTP when nested EPT is in use.  Hopefully
> the new name will also make it more obvious that L1's nested_cr3 is
> returned in SVM's nested NPT case.
> 
> No functional change intended.

Should we call it "get_pgd", since that is how Linux calls the top-level
directory?  I always get confused by PUD/PMD, but as long as we only
keep one /p.d/ moniker it should be fine.

Paolo

> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  2 +-
>  arch/x86/kvm/mmu/mmu.c          | 24 ++++++++++++------------
>  arch/x86/kvm/mmu/paging_tmpl.h  |  2 +-
>  arch/x86/kvm/svm.c              | 10 +++++-----
>  arch/x86/kvm/vmx/nested.c       |  8 ++++----
>  arch/x86/kvm/x86.c              |  2 +-
>  6 files changed, 24 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 4dffbc10d3f8..d3d69ad2e969 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -383,7 +383,7 @@ struct kvm_mmu_root_info {
>   */
>  struct kvm_mmu {
>  	void (*set_cr3)(struct kvm_vcpu *vcpu, unsigned long root);
> -	unsigned long (*get_cr3)(struct kvm_vcpu *vcpu);
> +	unsigned long (*get_guest_cr3_or_eptp)(struct kvm_vcpu *vcpu);
>  	u64 (*get_pdptr)(struct kvm_vcpu *vcpu, int index);
>  	int (*page_fault)(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u32 err,
>  			  bool prefault);
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 70f67bcab7db..13df4b4a5649 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3731,7 +3731,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
>  		vcpu->arch.mmu->root_hpa = __pa(vcpu->arch.mmu->pae_root);
>  	} else
>  		BUG();
> -	vcpu->arch.mmu->root_cr3 = vcpu->arch.mmu->get_cr3(vcpu);
> +	vcpu->arch.mmu->root_cr3 = vcpu->arch.mmu->get_guest_cr3_or_eptp(vcpu);
>  
>  	return 0;
>  }
> @@ -3743,7 +3743,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>  	gfn_t root_gfn, root_cr3;
>  	int i;
>  
> -	root_cr3 = vcpu->arch.mmu->get_cr3(vcpu);
> +	root_cr3 = vcpu->arch.mmu->get_guest_cr3_or_eptp(vcpu);
>  	root_gfn = root_cr3 >> PAGE_SHIFT;
>  
>  	if (mmu_check_root(vcpu, root_gfn))
> @@ -4080,7 +4080,7 @@ static int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>  	arch.token = (vcpu->arch.apf.id++ << 12) | vcpu->vcpu_id;
>  	arch.gfn = gfn;
>  	arch.direct_map = vcpu->arch.mmu->direct_map;
> -	arch.cr3 = vcpu->arch.mmu->get_cr3(vcpu);
> +	arch.cr3 = vcpu->arch.mmu->get_guest_cr3_or_eptp(vcpu);
>  
>  	return kvm_setup_async_pf(vcpu, cr2_or_gpa,
>  				  kvm_vcpu_gfn_to_hva(vcpu, gfn), &arch);
> @@ -4932,7 +4932,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
>  	context->shadow_root_level = kvm_x86_ops->get_tdp_level(vcpu);
>  	context->direct_map = true;
>  	context->set_cr3 = kvm_x86_ops->set_tdp_cr3;
> -	context->get_cr3 = get_cr3;
> +	context->get_guest_cr3_or_eptp = get_cr3;
>  	context->get_pdptr = kvm_pdptr_read;
>  	context->inject_page_fault = kvm_inject_page_fault;
>  
> @@ -5080,10 +5080,10 @@ static void init_kvm_softmmu(struct kvm_vcpu *vcpu)
>  	struct kvm_mmu *context = vcpu->arch.mmu;
>  
>  	kvm_init_shadow_mmu(vcpu);
> -	context->set_cr3           = kvm_x86_ops->set_cr3;
> -	context->get_cr3           = get_cr3;
> -	context->get_pdptr         = kvm_pdptr_read;
> -	context->inject_page_fault = kvm_inject_page_fault;
> +	context->set_cr3	       = kvm_x86_ops->set_cr3;
> +	context->get_guest_cr3_or_eptp = get_cr3;
> +	context->get_pdptr	       = kvm_pdptr_read;
> +	context->inject_page_fault     = kvm_inject_page_fault;
>  }
>  
>  static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu)
> @@ -5095,10 +5095,10 @@ static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu)
>  	if (new_role.as_u64 == g_context->mmu_role.as_u64)
>  		return;
>  
> -	g_context->mmu_role.as_u64 = new_role.as_u64;
> -	g_context->get_cr3           = get_cr3;
> -	g_context->get_pdptr         = kvm_pdptr_read;
> -	g_context->inject_page_fault = kvm_inject_page_fault;
> +	g_context->mmu_role.as_u64	 = new_role.as_u64;
> +	g_context->get_guest_cr3_or_eptp = get_cr3;
> +	g_context->get_pdptr		 = kvm_pdptr_read;
> +	g_context->inject_page_fault	 = kvm_inject_page_fault;
>  
>  	/*
>  	 * Note that arch.mmu->gva_to_gpa translates l2_gpa to l1_gpa using
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index 6b15b58f3ecc..24dfa0fcba56 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -333,7 +333,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
>  	trace_kvm_mmu_pagetable_walk(addr, access);
>  retry_walk:
>  	walker->level = mmu->root_level;
> -	pte           = mmu->get_cr3(vcpu);
> +	pte           = mmu->get_guest_cr3_or_eptp(vcpu);
>  	have_ad       = PT_HAVE_ACCESSED_DIRTY(mmu);
>  
>  #if PTTYPE == 64
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index a3e32d61d60c..1e2f05a79417 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -3026,11 +3026,11 @@ static void nested_svm_init_mmu_context(struct kvm_vcpu *vcpu)
>  
>  	vcpu->arch.mmu = &vcpu->arch.guest_mmu;
>  	kvm_init_shadow_mmu(vcpu);
> -	vcpu->arch.mmu->set_cr3           = nested_svm_set_tdp_cr3;
> -	vcpu->arch.mmu->get_cr3           = nested_svm_get_tdp_cr3;
> -	vcpu->arch.mmu->get_pdptr         = nested_svm_get_tdp_pdptr;
> -	vcpu->arch.mmu->inject_page_fault = nested_svm_inject_npf_exit;
> -	vcpu->arch.mmu->shadow_root_level = get_npt_level(vcpu);
> +	vcpu->arch.mmu->set_cr3		      = nested_svm_set_tdp_cr3;
> +	vcpu->arch.mmu->get_guest_cr3_or_eptp = nested_svm_get_tdp_cr3;
> +	vcpu->arch.mmu->get_pdptr	      = nested_svm_get_tdp_pdptr;
> +	vcpu->arch.mmu->inject_page_fault     = nested_svm_inject_npf_exit;
> +	vcpu->arch.mmu->shadow_root_level     = get_npt_level(vcpu);
>  	reset_shadow_zero_bits_mask(vcpu, vcpu->arch.mmu);
>  	vcpu->arch.walk_mmu              = &vcpu->arch.nested_mmu;
>  }
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 4fb05c0e29fe..2d7b87b532f5 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -354,10 +354,10 @@ static void nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
>  			VMX_EPT_EXECUTE_ONLY_BIT,
>  			nested_ept_ad_enabled(vcpu),
>  			nested_ept_get_eptp(vcpu));
> -	vcpu->arch.mmu->set_cr3           = vmx_set_cr3;
> -	vcpu->arch.mmu->get_cr3           = nested_ept_get_eptp;
> -	vcpu->arch.mmu->inject_page_fault = nested_ept_inject_page_fault;
> -	vcpu->arch.mmu->get_pdptr         = kvm_pdptr_read;
> +	vcpu->arch.mmu->set_cr3		      = vmx_set_cr3;
> +	vcpu->arch.mmu->get_guest_cr3_or_eptp = nested_ept_get_eptp;
> +	vcpu->arch.mmu->inject_page_fault     = nested_ept_inject_page_fault;
> +	vcpu->arch.mmu->get_pdptr	      = kvm_pdptr_read;
>  
>  	vcpu->arch.walk_mmu              = &vcpu->arch.nested_mmu;
>  }
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index fbabb2f06273..1e6d8766fbdd 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10179,7 +10179,7 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
>  		return;
>  
>  	if (!vcpu->arch.mmu->direct_map &&
> -	      work->arch.cr3 != vcpu->arch.mmu->get_cr3(vcpu))
> +	      work->arch.cr3 != vcpu->arch.mmu->get_guest_cr3_or_eptp(vcpu))
>  		return;
>  
>  	vcpu->arch.mmu->page_fault(vcpu, work->cr2_or_gpa, 0, true);
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 0/7] KVM: x86/mmu: nVMX: 5-level paging fixes and enabling
  2020-02-07 17:37 [PATCH v2 0/7] KVM: x86/mmu: nVMX: 5-level paging fixes and enabling Sean Christopherson
                   ` (6 preceding siblings ...)
  2020-02-07 17:37 ` [PATCH v2 7/7] KVM: nVMX: Drop unnecessary check on ept caps for execute-only Sean Christopherson
@ 2020-02-12 12:03 ` Paolo Bonzini
  7 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2020-02-12 12:03 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel

On 07/02/20 18:37, Sean Christopherson wrote:
> Two fixes for 5-level paging bugs with a 100% fatality rate, a patch to
> enable 5-level EPT in L1, and additional clean up on top (mostly renames
> of functions/variables that caused me no end of confusion when trying to
> figure out what was broken).
> 
> Tested fixed kernels at L0, L1 and L2, with most combinations of EPT,
> shadow paging, 4-level and 5-level.  EPT kvm-unit-tests runs clean in L0.
> Patches for kvm-unit-tests incoming to play nice with 5-level nested EPT.
> 
> Ideally patches 1 and 2 would get into 5.6, 5-level paging is quite
> broken without them.
> 
> v2:
>   - Increase the nested EPT array sizes to accomodate 5-level paging in
>     the patch that adds support for 5-level nested EPT, not in the bug
>     fix for 5-level shadow paging.
> 
> Sean Christopherson (7):
>   KVM: nVMX: Use correct root level for nested EPT shadow page tables
>   KVM: x86/mmu: Fix struct guest_walker arrays for 5-level paging
>   KVM: nVMX: Allow L1 to use 5-level page walks for nested EPT
>   KVM: nVMX: Rename nested_ept_get_cr3() to nested_ept_get_eptp()
>   KVM: nVMX: Rename EPTP validity helper and associated variables
>   KVM: x86/mmu: Rename kvm_mmu->get_cr3() to ->get_guest_cr3_or_eptp()
>   KVM: nVMX: Drop unnecessary check on ept caps for execute-only
> 
>  arch/x86/include/asm/kvm_host.h |  2 +-
>  arch/x86/include/asm/vmx.h      | 12 +++++++
>  arch/x86/kvm/mmu/mmu.c          | 35 ++++++++++----------
>  arch/x86/kvm/mmu/paging_tmpl.h  |  6 ++--
>  arch/x86/kvm/svm.c              | 10 +++---
>  arch/x86/kvm/vmx/nested.c       | 58 ++++++++++++++++++++-------------
>  arch/x86/kvm/vmx/nested.h       |  4 +--
>  arch/x86/kvm/vmx/vmx.c          |  2 ++
>  arch/x86/kvm/x86.c              |  2 +-
>  9 files changed, 79 insertions(+), 52 deletions(-)
> 

Queued 1-2-4-5-7 (for 5.6), thanks!

Paolo


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 6/7] KVM: x86/mmu: Rename kvm_mmu->get_cr3() to ->get_guest_cr3_or_eptp()
  2020-02-12 12:00   ` Paolo Bonzini
@ 2020-02-12 16:28     ` Sean Christopherson
  2020-02-12 16:42       ` Paolo Bonzini
  0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2020-02-12 16:28 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel

On Wed, Feb 12, 2020 at 01:00:59PM +0100, Paolo Bonzini wrote:
> On 07/02/20 18:37, Sean Christopherson wrote:
> > Rename kvm_mmu->get_cr3() to call out that it is retrieving a guest
> > value, as opposed to kvm_mmu->set_cr3(), which sets a host value, and to
> > note that it will return L1's EPTP when nested EPT is in use.  Hopefully
> > the new name will also make it more obvious that L1's nested_cr3 is
> > returned in SVM's nested NPT case.
> > 
> > No functional change intended.
> 
> Should we call it "get_pgd", since that is how Linux calls the top-level
> directory?  I always get confused by PUD/PMD, but as long as we only
> keep one /p.d/ moniker it should be fine.

Heh, I have the exact same sentiment.  get_pgd() works for me.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 6/7] KVM: x86/mmu: Rename kvm_mmu->get_cr3() to ->get_guest_cr3_or_eptp()
  2020-02-12 16:28     ` Sean Christopherson
@ 2020-02-12 16:42       ` Paolo Bonzini
  2020-03-01 17:49         ` Sean Christopherson
  0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2020-02-12 16:42 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel

On 12/02/20 17:28, Sean Christopherson wrote:
> On Wed, Feb 12, 2020 at 01:00:59PM +0100, Paolo Bonzini wrote:
>> On 07/02/20 18:37, Sean Christopherson wrote:
>>> Rename kvm_mmu->get_cr3() to call out that it is retrieving a guest
>>> value, as opposed to kvm_mmu->set_cr3(), which sets a host value, and to
>>> note that it will return L1's EPTP when nested EPT is in use.  Hopefully
>>> the new name will also make it more obvious that L1's nested_cr3 is
>>> returned in SVM's nested NPT case.
>>>
>>> No functional change intended.
>>
>> Should we call it "get_pgd", since that is how Linux calls the top-level
>> directory?  I always get confused by PUD/PMD, but as long as we only
>> keep one /p.d/ moniker it should be fine.
> 
> Heh, I have the exact same sentiment.  get_pgd() works for me.

Ok, I'll post a patch that uses get_guest_pgd() as soon as I open
kvm/next for 5.7 material.

Paolo


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 6/7] KVM: x86/mmu: Rename kvm_mmu->get_cr3() to ->get_guest_cr3_or_eptp()
  2020-02-12 16:42       ` Paolo Bonzini
@ 2020-03-01 17:49         ` Sean Christopherson
  0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2020-03-01 17:49 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel

On Wed, Feb 12, 2020 at 05:42:33PM +0100, Paolo Bonzini wrote:
> On 12/02/20 17:28, Sean Christopherson wrote:
> > On Wed, Feb 12, 2020 at 01:00:59PM +0100, Paolo Bonzini wrote:
> >> On 07/02/20 18:37, Sean Christopherson wrote:
> >>> Rename kvm_mmu->get_cr3() to call out that it is retrieving a guest
> >>> value, as opposed to kvm_mmu->set_cr3(), which sets a host value, and to
> >>> note that it will return L1's EPTP when nested EPT is in use.  Hopefully
> >>> the new name will also make it more obvious that L1's nested_cr3 is
> >>> returned in SVM's nested NPT case.
> >>>
> >>> No functional change intended.
> >>
> >> Should we call it "get_pgd", since that is how Linux calls the top-level
> >> directory?  I always get confused by PUD/PMD, but as long as we only
> >> keep one /p.d/ moniker it should be fine.
> > 
> > Heh, I have the exact same sentiment.  get_pgd() works for me.
> 
> Ok, I'll post a patch that uses get_guest_pgd() as soon as I open
> kvm/next for 5.7 material.

I need to resend the 5-level nested EPT support, I'll include this change.
Should I also include patches 4, 5 and 7 when I send v3 of that series?
Your earlier mail said they were queued for 5.6, but AFAICT only patches
1 and 2 made it into 5.6 (which is not a big deal at all).

On Wed, Feb 12, 2020 at 01:03:03PM +0100, Paolo Bonzini wrote:
> On 07/02/20 18:37, Sean Christopherson wrote:
> > Sean Christopherson (7):
> >   KVM: nVMX: Use correct root level for nested EPT shadow page tables
> >   KVM: x86/mmu: Fix struct guest_walker arrays for 5-level paging
> >   KVM: nVMX: Allow L1 to use 5-level page walks for nested EPT
> >   KVM: nVMX: Rename nested_ept_get_cr3() to nested_ept_get_eptp()
> >   KVM: nVMX: Rename EPTP validity helper and associated variables
> >   KVM: x86/mmu: Rename kvm_mmu->get_cr3() to ->get_guest_cr3_or_eptp()
> >   KVM: nVMX: Drop unnecessary check on ept caps for execute-only
> >
>
> Queued 1-2-4-5-7 (for 5.6), thanks!

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-03-01 17:49 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-07 17:37 [PATCH v2 0/7] KVM: x86/mmu: nVMX: 5-level paging fixes and enabling Sean Christopherson
2020-02-07 17:37 ` [PATCH v2 1/7] KVM: nVMX: Use correct root level for nested EPT shadow page tables Sean Christopherson
2020-02-07 17:37 ` [PATCH v2 2/7] KVM: x86/mmu: Fix struct guest_walker arrays for 5-level paging Sean Christopherson
2020-02-07 17:37 ` [PATCH v2 3/7] KVM: nVMX: Allow L1 to use 5-level page walks for nested EPT Sean Christopherson
2020-02-07 17:37 ` [PATCH v2 4/7] KVM: nVMX: Rename nested_ept_get_cr3() to nested_ept_get_eptp() Sean Christopherson
2020-02-07 17:37 ` [PATCH v2 5/7] KVM: nVMX: Rename EPTP validity helper and associated variables Sean Christopherson
2020-02-07 17:37 ` [PATCH v2 6/7] KVM: x86/mmu: Rename kvm_mmu->get_cr3() to ->get_guest_cr3_or_eptp() Sean Christopherson
2020-02-12 12:00   ` Paolo Bonzini
2020-02-12 16:28     ` Sean Christopherson
2020-02-12 16:42       ` Paolo Bonzini
2020-03-01 17:49         ` Sean Christopherson
2020-02-07 17:37 ` [PATCH v2 7/7] KVM: nVMX: Drop unnecessary check on ept caps for execute-only Sean Christopherson
2020-02-12 12:03 ` [PATCH v2 0/7] KVM: x86/mmu: nVMX: 5-level paging fixes and enabling Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).