linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/6] KVM: few more SMM fixes
@ 2021-08-30 12:55 Maxim Levitsky
  2021-08-30 12:55 ` [PATCH v2 1/6] KVM: SVM: restore the L1 host state prior to resuming nested guest on SMM exit Maxim Levitsky
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Maxim Levitsky @ 2021-08-30 12:55 UTC (permalink / raw)
  To: kvm
  Cc: Jim Mattson, Sean Christopherson, Paolo Bonzini,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Vitaly Kuznetsov, Thomas Gleixner, Maxim Levitsky,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Joerg Roedel, Wanpeng Li, H. Peter Anvin, Borislav Petkov

These are few SMM fixes I was working on last week.

(V2: I merged here VMX's SMM fixes, while SVM SMM fixes are unchanged from V1)

* Patch 1 fixes a minor issue that remained after
  commit 37be407b2ce8 ("KVM: nSVM: Fix L1 state corruption upon return from SMM")

  While now, returns to guest mode from SMM work due to restored state from HSAVE
  area, the guest entry still sees incorrect HSAVE state.

  This for example breaks return from SMM when the guest is 32 bit, due to PDPTRs
  loading which are done using incorrect MMU state which is incorrect,
  because it was setup with incorrect L1 HSAVE state.

* Patch 2 fixes a theoretical issue that I introduced with my SREGS2 patchset,
  which Sean Christopherson pointed out.

  The issue is that KVM_REQ_GET_NESTED_STATE_PAGES request is not only used
  for completing the load of the nested state, but it is also used to complete
  exit from SMM to guest mode, and my compatibility hack of pdptrs_from_userspace
  was done assuming that this is not done.

  While it is safe to just reset 'pdptrs_from_userspace' on each VM entry,
  I don't want to slow down the common code for this very rare hack.
  Instead I explicitly zero this variable when SMM exit to guest mode is done,
  because in this case PDPTRs do need to be reloaded from memory always.

  Note that this is a theoretical issue only, because after 'vendor' return from
  smm code (aka .leave_smm) is done, even when it returned to the guest mode,
  which loads some of L2 CPU state, we still load again all of the L2 cpu state
  captured in SMRAM which includes CR3, at which point guest PDPTRs are re-loaded
  anyway.

  Also note that across SMI entries the CR3 seems not to be updated, and Intel's
  SDM notes that it saved value in SMRAM isn't writable, thus it is possible
  that if SMM handler didn't change CR3, the pdptrs would not be touched.

  I guess that means that a SMI handler can in theory preserve PDPTRs by never
  touching CR3, but since recently we removed that code that didn't update PDPTRs
  if CR3 didn't change, I guess it won't work.

  Anyway I don't think any OS bothers to have PDPTRs not synced with whatever
  page CR3 points at, thus I didn't bother to try and test what the real hardware
  does in this case.

* 3rd patch makes SVM SMM exit to be a bit more similar to how VMX does it
  by also raising KVM_REQ_GET_NESTED_STATE_PAGES requests.

  I do have doubts about why we need to do this on VMX though. The initial
  justification for this comes from

  7f7f1ba33cf2 ("KVM: x86: do not load vmcs12 pages while still in SMM")

  With all the MMU changes, I am not sure that we can still have a case
  of not up to date MMU when we enter the nested guest from SMM.
  On SVM it does seem to work anyway without this.

* Patch 3 fixes guest emulation failure when unrestricted_guest=0 and we reach
  handle_exception_nmi_irqoff.
  That function takes stale values from current vmcs and fails not taking into account
  the fact that we are emulating invalid guest state now, and thus no VM exit happened.

* Patch 4 fixed a corner case where return from SMM is slightly corrupting
  the L2 segment register state when unrestricted_guest=0 due to real mode segement
  caching register logic, but later it restores it correctly from SMMRAM.
  Fix this by not failing nested_vmx_enter_non_root_mode and delaying this
  failure to the next nested VM entry.

* Patch 5 fixes another corner case where emulation_required was not updated
  correctly on nested VMexit when restoring the L1 segement registers.

I still track 2 SMM issues:

1. When HyperV guest is running nested, and uses SMM enabled OVMF, it crashes and
   reboots during the boot process.

2. Nested migration on VMX is still broken when L1 floods itself with SMIs.

Best regards,
	Maxim Levitsky

Maxim Levitsky (6):
  KVM: SVM: restore the L1 host state prior to resuming nested guest on
    SMM exit
  KVM: x86: force PDPTR reload on SMM exit
  KVM: nSVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode
  KVM: VMX: synthesize invalid VM exit when emulating invalid guest
    state
  KVM: nVMX: don't fail nested VM entry on invalid guest state if
    !from_vmentry
  KVM: nVMX: re-evaluate emulation_required on nested VM exit

 arch/x86/kvm/svm/nested.c | 10 +++++++---
 arch/x86/kvm/svm/svm.c    | 27 ++++++++++++++++++---------
 arch/x86/kvm/svm/svm.h    |  3 ++-
 arch/x86/kvm/vmx/nested.c |  9 ++++++++-
 arch/x86/kvm/vmx/vmx.c    | 35 ++++++++++++++++++++++++++++-------
 arch/x86/kvm/vmx/vmx.h    |  1 +
 6 files changed, 64 insertions(+), 21 deletions(-)

-- 
2.26.3



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 1/6] KVM: SVM: restore the L1 host state prior to resuming nested guest on SMM exit
  2021-08-30 12:55 [PATCH v2 0/6] KVM: few more SMM fixes Maxim Levitsky
@ 2021-08-30 12:55 ` Maxim Levitsky
  2021-08-30 12:55 ` [PATCH v2 2/6] KVM: x86: force PDPTR reload " Maxim Levitsky
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Maxim Levitsky @ 2021-08-30 12:55 UTC (permalink / raw)
  To: kvm
  Cc: Jim Mattson, Sean Christopherson, Paolo Bonzini,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Vitaly Kuznetsov, Thomas Gleixner, Maxim Levitsky,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Joerg Roedel, Wanpeng Li, H. Peter Anvin, Borislav Petkov

Otherwise guest entry code might see incorrect L1 state (e.g paging state).

Fixes: 37be407b2ce8 ("KVM: nSVM: Fix L1 state corruption upon return from SMM")

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/svm.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 1a70e11f0487..4aa269a587d0 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4347,27 +4347,30 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
 					 gpa_to_gfn(vmcb12_gpa), &map) == -EINVAL)
 				return 1;
 
-			if (svm_allocate_nested(svm))
+			if (kvm_vcpu_map(vcpu, gpa_to_gfn(svm->nested.hsave_msr),
+					 &map_save) == -EINVAL)
 				return 1;
 
-			vmcb12 = map.hva;
-
-			nested_load_control_from_vmcb12(svm, &vmcb12->control);
-
-			ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12);
-			kvm_vcpu_unmap(vcpu, &map, true);
+			if (svm_allocate_nested(svm))
+				return 1;
 
 			/*
 			 * Restore L1 host state from L1 HSAVE area as VMCB01 was
 			 * used during SMM (see svm_enter_smm())
 			 */
-			if (kvm_vcpu_map(vcpu, gpa_to_gfn(svm->nested.hsave_msr),
-					 &map_save) == -EINVAL)
-				return 1;
 
 			svm_copy_vmrun_state(&svm->vmcb01.ptr->save,
 					     map_save.hva + 0x400);
 
+			/*
+			 * Restore L2 state
+			 * */
+
+			vmcb12 = map.hva;
+			nested_load_control_from_vmcb12(svm, &vmcb12->control);
+			ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12);
+
+			kvm_vcpu_unmap(vcpu, &map, true);
 			kvm_vcpu_unmap(vcpu, &map_save, true);
 		}
 	}
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 2/6] KVM: x86: force PDPTR reload on SMM exit
  2021-08-30 12:55 [PATCH v2 0/6] KVM: few more SMM fixes Maxim Levitsky
  2021-08-30 12:55 ` [PATCH v2 1/6] KVM: SVM: restore the L1 host state prior to resuming nested guest on SMM exit Maxim Levitsky
@ 2021-08-30 12:55 ` Maxim Levitsky
  2021-08-30 12:55 ` [PATCH v2 3/6] KVM: nSVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode Maxim Levitsky
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Maxim Levitsky @ 2021-08-30 12:55 UTC (permalink / raw)
  To: kvm
  Cc: Jim Mattson, Sean Christopherson, Paolo Bonzini,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Vitaly Kuznetsov, Thomas Gleixner, Maxim Levitsky,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Joerg Roedel, Wanpeng Li, H. Peter Anvin, Borislav Petkov

KVM_REQ_GET_NESTED_STATE_PAGES is also used on VM entries
that happen on exit from SMM mode, and in this case PDPTRS
must be always reloaded

Fixes: 0f85722341b0 ("KVM: nVMX: delay loading of PDPTRs to KVM_REQ_GET_NESTED_STATE_PAGES")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index fada1055f325..4194fbf5e5d6 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7504,6 +7504,13 @@ static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
 	}
 
 	if (vmx->nested.smm.guest_mode) {
+
+		/* Exit from the SMM to the non root mode also uses
+		 * the KVM_REQ_GET_NESTED_STATE_PAGES request,
+		 * but in this case the pdptrs must be always reloaded
+		 */
+		vcpu->arch.pdptrs_from_userspace = false;
+
 		ret = nested_vmx_enter_non_root_mode(vcpu, false);
 		if (ret)
 			return ret;
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 3/6] KVM: nSVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode
  2021-08-30 12:55 [PATCH v2 0/6] KVM: few more SMM fixes Maxim Levitsky
  2021-08-30 12:55 ` [PATCH v2 1/6] KVM: SVM: restore the L1 host state prior to resuming nested guest on SMM exit Maxim Levitsky
  2021-08-30 12:55 ` [PATCH v2 2/6] KVM: x86: force PDPTR reload " Maxim Levitsky
@ 2021-08-30 12:55 ` Maxim Levitsky
  2021-08-30 12:55 ` [PATCH v2 4/6] KVM: VMX: synthesize invalid VM exit when emulating invalid guest state Maxim Levitsky
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Maxim Levitsky @ 2021-08-30 12:55 UTC (permalink / raw)
  To: kvm
  Cc: Jim Mattson, Sean Christopherson, Paolo Bonzini,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Vitaly Kuznetsov, Thomas Gleixner, Maxim Levitsky,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Joerg Roedel, Wanpeng Li, H. Peter Anvin, Borislav Petkov

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/nested.c | 10 +++++++---
 arch/x86/kvm/svm/svm.c    |  8 +++++++-
 arch/x86/kvm/svm/svm.h    |  3 ++-
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 5e13357da21e..e9c326ea9847 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -572,7 +572,7 @@ static void nested_svm_copy_common_state(struct vmcb *from_vmcb, struct vmcb *to
 }
 
 int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa,
-			 struct vmcb *vmcb12)
+			 struct vmcb *vmcb12, bool from_entry)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	int ret;
@@ -602,13 +602,17 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa,
 	nested_vmcb02_prepare_save(svm, vmcb12);
 
 	ret = nested_svm_load_cr3(&svm->vcpu, vmcb12->save.cr3,
-				  nested_npt_enabled(svm), true);
+				  nested_npt_enabled(svm), from_entry);
 	if (ret)
 		return ret;
 
 	if (!npt_enabled)
 		vcpu->arch.mmu->inject_page_fault = svm_inject_page_fault_nested;
 
+	if (!from_entry) {
+		kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
+	}
+
 	svm_set_gif(svm, true);
 
 	return 0;
@@ -674,7 +678,7 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
 
 	svm->nested.nested_run_pending = 1;
 
-	if (enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12))
+	if (enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12, true))
 		goto out_exit_err;
 
 	if (nested_svm_vmrun_msrpm(svm))
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 4aa269a587d0..05b25a627c03 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4354,6 +4354,12 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
 			if (svm_allocate_nested(svm))
 				return 1;
 
+			/* Exit from the SMM to the non root mode also uses
+			 * the KVM_REQ_GET_NESTED_STATE_PAGES request,
+			 * but in this case the pdptrs must be always reloaded
+			 */
+			vcpu->arch.pdptrs_from_userspace = false;
+
 			/*
 			 * Restore L1 host state from L1 HSAVE area as VMCB01 was
 			 * used during SMM (see svm_enter_smm())
@@ -4368,7 +4374,7 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
 
 			vmcb12 = map.hva;
 			nested_load_control_from_vmcb12(svm, &vmcb12->control);
-			ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12);
+			ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12, false);
 
 			kvm_vcpu_unmap(vcpu, &map, true);
 			kvm_vcpu_unmap(vcpu, &map_save, true);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 524d943f3efc..51ffa46ab257 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -459,7 +459,8 @@ static inline bool nested_exit_on_nmi(struct vcpu_svm *svm)
 	return vmcb_is_intercept(&svm->nested.ctl, INTERCEPT_NMI);
 }
 
-int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb_gpa, struct vmcb *vmcb12);
+int enter_svm_guest_mode(struct kvm_vcpu *vcpu,
+		u64 vmcb_gpa, struct vmcb *vmcb12, bool from_entry);
 void svm_leave_nested(struct vcpu_svm *svm);
 void svm_free_nested(struct vcpu_svm *svm);
 int svm_allocate_nested(struct vcpu_svm *svm);
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 4/6] KVM: VMX: synthesize invalid VM exit when emulating invalid guest state
  2021-08-30 12:55 [PATCH v2 0/6] KVM: few more SMM fixes Maxim Levitsky
                   ` (2 preceding siblings ...)
  2021-08-30 12:55 ` [PATCH v2 3/6] KVM: nSVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode Maxim Levitsky
@ 2021-08-30 12:55 ` Maxim Levitsky
  2021-08-30 12:55 ` [PATCH v2 5/6] KVM: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry Maxim Levitsky
  2021-08-30 12:55 ` [PATCH v2 6/6] KVM: nVMX: re-evaluate emulation_required on nested VM exit Maxim Levitsky
  5 siblings, 0 replies; 9+ messages in thread
From: Maxim Levitsky @ 2021-08-30 12:55 UTC (permalink / raw)
  To: kvm
  Cc: Jim Mattson, Sean Christopherson, Paolo Bonzini,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Vitaly Kuznetsov, Thomas Gleixner, Maxim Levitsky,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Joerg Roedel, Wanpeng Li, H. Peter Anvin, Borislav Petkov

Since no actual VM entry happened, the VM exit information is stale.
To avoid this, synthesize an invalid VM guest state VM exit.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4194fbf5e5d6..1c113195c846 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6618,10 +6618,21 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
 		     vmx->loaded_vmcs->soft_vnmi_blocked))
 		vmx->loaded_vmcs->entry_time = ktime_get();
 
-	/* Don't enter VMX if guest state is invalid, let the exit handler
-	   start emulation until we arrive back to a valid state */
-	if (vmx->emulation_required)
+	/*
+	 * Don't enter VMX if guest state is invalid, let the exit handler
+	 * start emulation until we arrive back to a valid state.  Synthesize a
+	 * consistency check VM-Exit due to invalid guest state and bail.
+	 */
+	if (unlikely(vmx->emulation_required)) {
+		vmx->fail = 0;
+		vmx->exit_reason.full = EXIT_REASON_INVALID_STATE;
+		vmx->exit_reason.failed_vmentry = 1;
+		kvm_register_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_1);
+		vmx->exit_qualification = ENTRY_FAIL_DEFAULT;
+		kvm_register_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_2);
+		vmx->exit_intr_info = 0;
 		return EXIT_FASTPATH_NONE;
+	}
 
 	trace_kvm_entry(vcpu);
 
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 5/6] KVM: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry
  2021-08-30 12:55 [PATCH v2 0/6] KVM: few more SMM fixes Maxim Levitsky
                   ` (3 preceding siblings ...)
  2021-08-30 12:55 ` [PATCH v2 4/6] KVM: VMX: synthesize invalid VM exit when emulating invalid guest state Maxim Levitsky
@ 2021-08-30 12:55 ` Maxim Levitsky
  2021-10-08 23:37   ` Sean Christopherson
  2021-08-30 12:55 ` [PATCH v2 6/6] KVM: nVMX: re-evaluate emulation_required on nested VM exit Maxim Levitsky
  5 siblings, 1 reply; 9+ messages in thread
From: Maxim Levitsky @ 2021-08-30 12:55 UTC (permalink / raw)
  To: kvm
  Cc: Jim Mattson, Sean Christopherson, Paolo Bonzini,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Vitaly Kuznetsov, Thomas Gleixner, Maxim Levitsky,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Joerg Roedel, Wanpeng Li, H. Peter Anvin, Borislav Petkov

It is possible that when non root mode is entered via special entry
(!from_vmentry), that is from SMM or from loading the nested state,
the L2 state could be invalid in regard to non unrestricted guest mode,
but later it can become valid.

(for example when RSM emulation restores segment registers from SMRAM)

Thus delay the check to VM entry, where we will check this and fail.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 7 ++++++-
 arch/x86/kvm/vmx/vmx.c    | 5 ++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index bc6327950657..1a05ae83dae5 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2546,8 +2546,13 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
 	 * Guest state is invalid and unrestricted guest is disabled,
 	 * which means L1 attempted VMEntry to L2 with invalid state.
 	 * Fail the VMEntry.
+	 *
+	 * However when force loading the guest state (SMM exit or
+	 * loading nested state after migration, it is possible to
+	 * have invalid guest state now, which will be later fixed by
+	 * restoring L2 register state
 	 */
-	if (CC(!vmx_guest_state_valid(vcpu))) {
+	if (CC(from_vmentry && !vmx_guest_state_valid(vcpu))) {
 		*entry_failure_code = ENTRY_FAIL_DEFAULT;
 		return -EINVAL;
 	}
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 1c113195c846..02d061f5956a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6624,7 +6624,10 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	 * consistency check VM-Exit due to invalid guest state and bail.
 	 */
 	if (unlikely(vmx->emulation_required)) {
-		vmx->fail = 0;
+
+		/* We don't emulate invalid state of a nested guest */
+		vmx->fail = is_guest_mode(vcpu);
+
 		vmx->exit_reason.full = EXIT_REASON_INVALID_STATE;
 		vmx->exit_reason.failed_vmentry = 1;
 		kvm_register_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_1);
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 6/6] KVM: nVMX: re-evaluate emulation_required on nested VM exit
  2021-08-30 12:55 [PATCH v2 0/6] KVM: few more SMM fixes Maxim Levitsky
                   ` (4 preceding siblings ...)
  2021-08-30 12:55 ` [PATCH v2 5/6] KVM: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry Maxim Levitsky
@ 2021-08-30 12:55 ` Maxim Levitsky
  2021-09-21 22:55   ` Sean Christopherson
  5 siblings, 1 reply; 9+ messages in thread
From: Maxim Levitsky @ 2021-08-30 12:55 UTC (permalink / raw)
  To: kvm
  Cc: Jim Mattson, Sean Christopherson, Paolo Bonzini,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Vitaly Kuznetsov, Thomas Gleixner, Maxim Levitsky,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Joerg Roedel, Wanpeng Li, H. Peter Anvin, Borislav Petkov

If L1 had invalid state on VM entry (can happen on SMM transactions
when we enter from real mode, straight to nested guest),

then after we load 'host' state from VMCS12, the state has to become
valid again, but since we load the segment registers with
__vmx_set_segment we weren't always updating emulation_required.

Update emulation_required explicitly at end of load_vmcs12_host_state.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 2 ++
 arch/x86/kvm/vmx/vmx.c    | 8 ++++----
 arch/x86/kvm/vmx/vmx.h    | 1 +
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 1a05ae83dae5..f915e1ac589c 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4319,6 +4319,8 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
 	if (nested_vmx_load_msr(vcpu, vmcs12->vm_exit_msr_load_addr,
 				vmcs12->vm_exit_msr_load_count))
 		nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_MSR_FAIL);
+
+	to_vmx(vcpu)->emulation_required = vmx_emulation_required(vcpu);
 }
 
 static inline u64 nested_vmx_get_vmcs01_guest_efer(struct vcpu_vmx *vmx)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 02d061f5956a..7ff1e1daeb0a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1323,7 +1323,7 @@ static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
 	vmx_prepare_switch_to_host(to_vmx(vcpu));
 }
 
-static bool emulation_required(struct kvm_vcpu *vcpu)
+bool vmx_emulation_required(struct kvm_vcpu *vcpu)
 {
 	return emulate_invalid_guest_state && !vmx_guest_state_valid(vcpu);
 }
@@ -1367,7 +1367,7 @@ void vmx_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
 	vmcs_writel(GUEST_RFLAGS, rflags);
 
 	if ((old_rflags ^ vmx->rflags) & X86_EFLAGS_VM)
-		vmx->emulation_required = emulation_required(vcpu);
+		vmx->emulation_required = vmx_emulation_required(vcpu);
 }
 
 u32 vmx_get_interrupt_shadow(struct kvm_vcpu *vcpu)
@@ -3077,7 +3077,7 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 	}
 
 	/* depends on vcpu->arch.cr0 to be set to a new value */
-	vmx->emulation_required = emulation_required(vcpu);
+	vmx->emulation_required = vmx_emulation_required(vcpu);
 }
 
 static int vmx_get_max_tdp_level(void)
@@ -3330,7 +3330,7 @@ static void vmx_set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int
 {
 	__vmx_set_segment(vcpu, var, seg);
 
-	to_vmx(vcpu)->emulation_required = emulation_required(vcpu);
+	to_vmx(vcpu)->emulation_required = vmx_emulation_required(vcpu);
 }
 
 static void vmx_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 4858c5fd95f2..3a587c51a8d1 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -359,6 +359,7 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu);
 void vmx_set_host_fs_gs(struct vmcs_host_state *host, u16 fs_sel, u16 gs_sel,
 			unsigned long fs_base, unsigned long gs_base);
 int vmx_get_cpl(struct kvm_vcpu *vcpu);
+bool vmx_emulation_required(struct kvm_vcpu *vcpu);
 unsigned long vmx_get_rflags(struct kvm_vcpu *vcpu);
 void vmx_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
 u32 vmx_get_interrupt_shadow(struct kvm_vcpu *vcpu);
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 6/6] KVM: nVMX: re-evaluate emulation_required on nested VM exit
  2021-08-30 12:55 ` [PATCH v2 6/6] KVM: nVMX: re-evaluate emulation_required on nested VM exit Maxim Levitsky
@ 2021-09-21 22:55   ` Sean Christopherson
  0 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2021-09-21 22:55 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: kvm, Jim Mattson, Paolo Bonzini,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Vitaly Kuznetsov, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Joerg Roedel, Wanpeng Li, H. Peter Anvin, Borislav Petkov

On Mon, Aug 30, 2021, Maxim Levitsky wrote:
> If L1 had invalid state on VM entry (can happen on SMM transactions
> when we enter from real mode, straight to nested guest),
> 
> then after we load 'host' state from VMCS12, the state has to become
> valid again, but since we load the segment registers with
> __vmx_set_segment we weren't always updating emulation_required.

Because I'm an idiot.

> Update emulation_required explicitly at end of load_vmcs12_host_state.

Can you also add

  Reported-by: kernel test robot <oliver.sang@intel.com>

which is how I ended up here, sort of.

Fixes: 816be9e9be8d ("KVM: nVMX: Don't evaluate "emulation required" on nested VM-Exit")

> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---

The below is overkill.  The state after VM-Exit _can't_ be invalid, which was the
whole point of moving to __vmx_set_segment(), I just forgot the minor detail of
clearing emulation_required.  So just:

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index bc6327950657..55ac7211fb37 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4314,6 +4314,12 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
        if (nested_vmx_load_msr(vcpu, vmcs12->vm_exit_msr_load_addr,
                                vmcs12->vm_exit_msr_load_count))
                nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_MSR_FAIL);
+
+       /*
+        * All relevant vmcs12 host state is checked prior to VM-Entry, thus
+        * L1 guest can never be invalid after VM-Exit.
+        */
+       to_vmx(vcpu)->emulation_required = false;
 }
 
 static inline u64 nested_vmx_get_vmcs01_guest_efer(struct vcpu_vmx *vmx)

>  arch/x86/kvm/vmx/nested.c | 2 ++
>  arch/x86/kvm/vmx/vmx.c    | 8 ++++----
>  arch/x86/kvm/vmx/vmx.h    | 1 +
>  3 files changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 1a05ae83dae5..f915e1ac589c 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4319,6 +4319,8 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
>  	if (nested_vmx_load_msr(vcpu, vmcs12->vm_exit_msr_load_addr,
>  				vmcs12->vm_exit_msr_load_count))
>  		nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_MSR_FAIL);
> +
> +	to_vmx(vcpu)->emulation_required = vmx_emulation_required(vcpu);
>  }
>  
>  static inline u64 nested_vmx_get_vmcs01_guest_efer(struct vcpu_vmx *vmx)
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 02d061f5956a..7ff1e1daeb0a 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1323,7 +1323,7 @@ static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
>  	vmx_prepare_switch_to_host(to_vmx(vcpu));
>  }
>  
> -static bool emulation_required(struct kvm_vcpu *vcpu)
> +bool vmx_emulation_required(struct kvm_vcpu *vcpu)
>  {
>  	return emulate_invalid_guest_state && !vmx_guest_state_valid(vcpu);
>  }
> @@ -1367,7 +1367,7 @@ void vmx_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
>  	vmcs_writel(GUEST_RFLAGS, rflags);
>  
>  	if ((old_rflags ^ vmx->rflags) & X86_EFLAGS_VM)
> -		vmx->emulation_required = emulation_required(vcpu);
> +		vmx->emulation_required = vmx_emulation_required(vcpu);
>  }
>  
>  u32 vmx_get_interrupt_shadow(struct kvm_vcpu *vcpu)
> @@ -3077,7 +3077,7 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
>  	}
>  
>  	/* depends on vcpu->arch.cr0 to be set to a new value */
> -	vmx->emulation_required = emulation_required(vcpu);
> +	vmx->emulation_required = vmx_emulation_required(vcpu);
>  }
>  
>  static int vmx_get_max_tdp_level(void)
> @@ -3330,7 +3330,7 @@ static void vmx_set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int
>  {
>  	__vmx_set_segment(vcpu, var, seg);
>  
> -	to_vmx(vcpu)->emulation_required = emulation_required(vcpu);
> +	to_vmx(vcpu)->emulation_required = vmx_emulation_required(vcpu);
>  }
>  
>  static void vmx_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 4858c5fd95f2..3a587c51a8d1 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -359,6 +359,7 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu);
>  void vmx_set_host_fs_gs(struct vmcs_host_state *host, u16 fs_sel, u16 gs_sel,
>  			unsigned long fs_base, unsigned long gs_base);
>  int vmx_get_cpl(struct kvm_vcpu *vcpu);
> +bool vmx_emulation_required(struct kvm_vcpu *vcpu);
>  unsigned long vmx_get_rflags(struct kvm_vcpu *vcpu);
>  void vmx_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
>  u32 vmx_get_interrupt_shadow(struct kvm_vcpu *vcpu);
> -- 
> 2.26.3
> 

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 5/6] KVM: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry
  2021-08-30 12:55 ` [PATCH v2 5/6] KVM: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry Maxim Levitsky
@ 2021-10-08 23:37   ` Sean Christopherson
  0 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2021-10-08 23:37 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: kvm, Jim Mattson, Paolo Bonzini,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Vitaly Kuznetsov, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Joerg Roedel, Wanpeng Li, H. Peter Anvin, Borislav Petkov

On Mon, Aug 30, 2021, Maxim Levitsky wrote:
> It is possible that when non root mode is entered via special entry
> (!from_vmentry), that is from SMM or from loading the nested state,
> the L2 state could be invalid in regard to non unrestricted guest mode,
> but later it can become valid.
> 
> (for example when RSM emulation restores segment registers from SMRAM)
> 
> Thus delay the check to VM entry, where we will check this and fail.

And then do what?  Won't invalidate state send KVM into handle_invalid_guest_state(),
which we very much don't want to do for L2?  E.g. this is meant to fire, but won't
because nested_run_pending is false for the !from_vmentry paths.

	/*
	 * We should never reach this point with a pending nested VM-Enter, and
	 * more specifically emulation of L2 due to invalid guest state (see
	 * below) should never happen as that means we incorrectly allowed a
	 * nested VM-Enter with an invalid vmcs12.
	 */
	if (KVM_BUG_ON(vmx->nested.nested_run_pending, vcpu->kvm))
		return -EIO;

> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 7 ++++++-
>  arch/x86/kvm/vmx/vmx.c    | 5 ++++-
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index bc6327950657..1a05ae83dae5 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2546,8 +2546,13 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
>  	 * Guest state is invalid and unrestricted guest is disabled,
>  	 * which means L1 attempted VMEntry to L2 with invalid state.
>  	 * Fail the VMEntry.
> +	 *
> +	 * However when force loading the guest state (SMM exit or
> +	 * loading nested state after migration, it is possible to
> +	 * have invalid guest state now, which will be later fixed by
> +	 * restoring L2 register state
>  	 */
> -	if (CC(!vmx_guest_state_valid(vcpu))) {
> +	if (CC(from_vmentry && !vmx_guest_state_valid(vcpu))) {
>  		*entry_failure_code = ENTRY_FAIL_DEFAULT;
>  		return -EINVAL;
>  	}
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 1c113195c846..02d061f5956a 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6624,7 +6624,10 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  	 * consistency check VM-Exit due to invalid guest state and bail.
>  	 */
>  	if (unlikely(vmx->emulation_required)) {
> -		vmx->fail = 0;
> +
> +		/* We don't emulate invalid state of a nested guest */
> +		vmx->fail = is_guest_mode(vcpu);

This is contradictory and wrong.  (a) it's impossible to have both a VM-Fail and
VM-Exit, (b) vmcs.EXIT_REASON is not modified on VM-Fail, and (c) emulation_required
refers to guest state and guest state checks are always VM-Exits, not VM-Fails.

I don't understand this change, AFAICT vmx->fail won't actually be consumed as
either the above KVM_BUG_ON() will be hit or KVM will incorrectly emulate L2
state.

> +
>  		vmx->exit_reason.full = EXIT_REASON_INVALID_STATE;
>  		vmx->exit_reason.failed_vmentry = 1;
>  		kvm_register_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_1);
> -- 
> 2.26.3
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-10-08 23:37 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-30 12:55 [PATCH v2 0/6] KVM: few more SMM fixes Maxim Levitsky
2021-08-30 12:55 ` [PATCH v2 1/6] KVM: SVM: restore the L1 host state prior to resuming nested guest on SMM exit Maxim Levitsky
2021-08-30 12:55 ` [PATCH v2 2/6] KVM: x86: force PDPTR reload " Maxim Levitsky
2021-08-30 12:55 ` [PATCH v2 3/6] KVM: nSVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode Maxim Levitsky
2021-08-30 12:55 ` [PATCH v2 4/6] KVM: VMX: synthesize invalid VM exit when emulating invalid guest state Maxim Levitsky
2021-08-30 12:55 ` [PATCH v2 5/6] KVM: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry Maxim Levitsky
2021-10-08 23:37   ` Sean Christopherson
2021-08-30 12:55 ` [PATCH v2 6/6] KVM: nVMX: re-evaluate emulation_required on nested VM exit Maxim Levitsky
2021-09-21 22:55   ` Sean Christopherson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).