* [PATCH RESEND 01/30] KVM: x86: SVM: don't pass through SMAP/SMEP/PKE bits in !NPT && !gCR0.PG case
From: Maxim Levitsky @ 2022-02-07 15:54 UTC (permalink / raw)
  To: kvm
  Cc: Tony Luck, Chang S. Bae, Thomas Gleixner, Wanpeng Li,
	Ingo Molnar, Vitaly Kuznetsov, Pawan Gupta, Dave Hansen,
	Paolo Bonzini, linux-kernel, Rodrigo Vivi, H. Peter Anvin,
	intel-gvt-dev, Joonas Lahtinen, Joerg Roedel,
	Sean Christopherson, David Airlie, Zhi Wang, Brijesh Singh,
	Jim Mattson, x86, Daniel Vetter, Borislav Petkov, Zhenyu Wang,
	Kan Liang, Jani Nikula, Maxim Levitsky, stable

When the guest doesn't enable paging and NPT/EPT is disabled, we
use the guest's paging CR3 as KVM's shadow paging pointer, and we
are technically in direct mode, as if we were using NPT/EPT.

In direct mode we create SPTEs with user-mode permissions, because
in direct mode the NPT/EPT usually doesn't need to restrict access
based on the guest's CPL
(the MBE/GMET extensions exist for that, but KVM doesn't use them).

In this special "use guest paging as direct" mode, however,
if CR4.SMAP/CR4.SMEP are enabled, the CPU will fault on each access
and KVM will enter an endless loop of page faults.

Since page protection doesn't have any meaning in the !PG case,
just don't pass these bits through.
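
In other words, the CR4 value that reaches hardware should be computed
along these lines (a minimal sketch with an illustrative helper name;
the actual change is in svm_set_cr4() in the diff below):

	/*
	 * Sketch: strip the permission-checking bits from the CR4 value
	 * loaded into hardware while shadow paging backs a non-paging
	 * guest; "guest_paging" stands in for is_paging(vcpu).
	 */
	static unsigned long shadow_cr4(unsigned long cr4, bool guest_paging)
	{
		cr4 |= X86_CR4_PAE;	/* shadow page tables are PAE format */
		if (!guest_paging)
			cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
		return cr4;
	}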

The fix is the same as was done for VMX in commit 656ec4a4928a
("KVM: VMX: fix SMEP and SMAP without EPT").

This fixes the boot of Windows 10 without NPT for good.
(Without this patch, the BSP boots, but the APs get stuck in an
endless loop of page faults, causing the VM to boot with only 1 CPU.)

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
---
 arch/x86/kvm/svm/svm.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 975be872cd1a3..995c203a62fd9 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1596,6 +1596,7 @@ void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	u64 hcr0 = cr0;
+	bool old_paging = is_paging(vcpu);
 
 #ifdef CONFIG_X86_64
 	if (vcpu->arch.efer & EFER_LME && !vcpu->arch.guest_state_protected) {
@@ -1612,8 +1613,11 @@ void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 #endif
 	vcpu->arch.cr0 = cr0;
 
-	if (!npt_enabled)
+	if (!npt_enabled) {
 		hcr0 |= X86_CR0_PG | X86_CR0_WP;
+		if (old_paging != is_paging(vcpu))
+			svm_set_cr4(vcpu, kvm_read_cr4(vcpu));
+	}
 
 	/*
 	 * re-enable caching here because the QEMU bios
@@ -1657,8 +1661,12 @@ void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 		svm_flush_tlb_current(vcpu);
 
 	vcpu->arch.cr4 = cr4;
-	if (!npt_enabled)
+	if (!npt_enabled) {
 		cr4 |= X86_CR4_PAE;
+
+		if (!is_paging(vcpu))
+			cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
+	}
 	cr4 |= host_cr4_mce;
 	to_svm(vcpu)->vmcb->save.cr4 = cr4;
 	vmcb_mark_dirty(to_svm(vcpu)->vmcb, VMCB_CR);
-- 
2.26.3



* [PATCH RESEND 02/30] KVM: x86: nSVM: fix potential NULL dereference on nested migration
From: Maxim Levitsky @ 2022-02-07 15:54 UTC (permalink / raw)
  To: kvm
  Cc: Tony Luck, Chang S. Bae, Thomas Gleixner, Wanpeng Li,
	Ingo Molnar, Vitaly Kuznetsov, Pawan Gupta, Dave Hansen,
	Paolo Bonzini, linux-kernel, Rodrigo Vivi, H. Peter Anvin,
	intel-gvt-dev, Joonas Lahtinen, Joerg Roedel,
	Sean Christopherson, David Airlie, Zhi Wang, Brijesh Singh,
	Jim Mattson, x86, Daniel Vetter, Borislav Petkov, Zhenyu Wang,
	Kan Liang, Jani Nikula, Maxim Levitsky, stable

It turns out that due to review feedback and/or rebases I accidentally
moved the call to nested_svm_load_cr3 too early, before nested NPT is
enabled, which is very wrong to do.

KVM can't even access guest memory at that point, as nested NPT is
needed for that, and of course it won't initialize the walk_mmu,
which is the main issue the patch was addressing.

Fix this for real.
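
The corrected ordering, abbreviated from the diff below into a short
sketch, is:

	/*
	 * Sketch of svm_set_nested_state() after this patch: switch to
	 * vmcb02 and prepare the nested control state (which enables
	 * nested NPT) *before* reloading CR3, so that the MMU is set up
	 * and guest memory is reachable when nested_svm_load_cr3() runs.
	 */
	svm_switch_vmcb(svm, &svm->nested.vmcb02);
	nested_vmcb02_prepare_control(svm);
	ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3,
				  nested_npt_enabled(svm), false);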

Fixes: 232f75d3b4b5 ("KVM: nSVM: call nested_svm_load_cr3 on nested state load")
Cc: stable@vger.kernel.org

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/nested.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 1218b5a342fc8..39d280e7e80ef 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1457,18 +1457,6 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 	    !__nested_vmcb_check_save(vcpu, &save_cached))
 		goto out_free;
 
-	/*
-	 * While the nested guest CR3 is already checked and set by
-	 * KVM_SET_SREGS, it was set when nested state was yet loaded,
-	 * thus MMU might not be initialized correctly.
-	 * Set it again to fix this.
-	 */
-
-	ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3,
-				  nested_npt_enabled(svm), false);
-	if (WARN_ON_ONCE(ret))
-		goto out_free;
-
 
 	/*
 	 * All checks done, we can enter guest mode. Userspace provides
@@ -1494,6 +1482,20 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 
 	svm_switch_vmcb(svm, &svm->nested.vmcb02);
 	nested_vmcb02_prepare_control(svm);
+
+	/*
+	 * While the nested guest CR3 is already checked and set by
+	 * KVM_SET_SREGS, it was set when the nested state was not yet
+	 * loaded, thus the MMU might not be initialized correctly.
+	 * Set it again to fix this.
+	 */
+
+	ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3,
+				  nested_npt_enabled(svm), false);
+	if (WARN_ON_ONCE(ret))
+		goto out_free;
+
+
 	kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
 	ret = 0;
 out_free:
-- 
2.26.3



* [PATCH RESEND 03/30] KVM: x86: nSVM: mark vmcb01 as dirty when restoring SMM saved state
From: Maxim Levitsky @ 2022-02-07 15:54 UTC (permalink / raw)
  To: kvm
  Cc: Tony Luck, Chang S. Bae, Thomas Gleixner, Wanpeng Li,
	Ingo Molnar, Vitaly Kuznetsov, Pawan Gupta, Dave Hansen,
	Paolo Bonzini, linux-kernel, Rodrigo Vivi, H. Peter Anvin,
	intel-gvt-dev, Joonas Lahtinen, Joerg Roedel,
	Sean Christopherson, David Airlie, Zhi Wang, Brijesh Singh,
	Jim Mattson, x86, Daniel Vetter, Borislav Petkov, Zhenyu Wang,
	Kan Liang, Jani Nikula, Maxim Levitsky, stable

While usually restoring SMM state makes KVM enter the nested guest,
and thus a different vmcb (vmcb02 vs. vmcb01), KVM should still mark
vmcb01 as dirty, since hardware can in theory cache multiple vmcbs.

Failure to do so, combined with not setting nested_run_pending
(which is fixed in the next patch), might make KVM re-enter vmcb01,
which was just exited from, with a completely different set of guest
state registers (SMM vs. non-SMM) and without the proper dirty bits
set. This results in the CPU reusing a stale IDTR pointer, which
leads to a guest shutdown on any interrupt.

On real hardware this usually doesn't happen, but when running
nested, L0's KVM does check and honour a few dirty bits, causing
this issue to happen.

This patch fixes boot of Hyper-V and SMM-enabled Windows VMs
running nested on KVM.
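
For reference, marking the vmcb fully dirty just zeroes its clean
bits, which tells the CPU (or L0's KVM, when running nested) that no
cached vmcb state, the IDTR included, may be reused on the next
VMRUN. Roughly (a sketch of the existing helper, not part of this
patch):

	static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)
	{
		/* 0 == every state category must be reloaded */
		vmcb->control.clean = 0;
	}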

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
---
 arch/x86/kvm/svm/svm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 995c203a62fd9..3f1d11e652123 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4267,6 +4267,8 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
 	 * Enter the nested guest now
 	 */
 
+	vmcb_mark_all_dirty(svm->vmcb01.ptr);
+
 	vmcb12 = map.hva;
 	nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
 	nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
-- 
2.26.3



* [PATCH RESEND 04/30] KVM: x86: nSVM/nVMX: set nested_run_pending on VM entry which is a result of RSM
From: Maxim Levitsky @ 2022-02-07 15:54 UTC (permalink / raw)
  To: kvm
  Cc: Tony Luck, Chang S. Bae, Thomas Gleixner, Wanpeng Li,
	Ingo Molnar, Vitaly Kuznetsov, Pawan Gupta, Dave Hansen,
	Paolo Bonzini, linux-kernel, Rodrigo Vivi, H. Peter Anvin,
	intel-gvt-dev, Joonas Lahtinen, Joerg Roedel,
	Sean Christopherson, David Airlie, Zhi Wang, Brijesh Singh,
	Jim Mattson, x86, Daniel Vetter, Borislav Petkov, Zhenyu Wang,
	Kan Liang, Jani Nikula, Maxim Levitsky, stable

While RSM-induced VM entries are not full VM entries, they still
need to be followed by an actual VM entry to complete them, unlike
setting the nested state.

This patch fixes boot of Hyper-V and SMM-enabled Windows VMs
running nested on KVM, which fail due to this issue combined with
the lack of dirty bit setting.
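
Concretely, once RSM has restored the L2 state and guest mode has
been re-entered, the pending-run flag has to be raised so the vcpu
run loop completes the entry; in sketch form (mirroring the SVM hunk
below):

	/*
	 * Finish this like a real nested VM entry, not like restored
	 * nested state.
	 */
	ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12, false);
	if (!ret)
		svm->nested.nested_run_pending = 1;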

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
---
 arch/x86/kvm/svm/svm.c | 5 +++++
 arch/x86/kvm/vmx/vmx.c | 1 +
 2 files changed, 6 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 3f1d11e652123..71bfa52121622 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4274,6 +4274,11 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
 	nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
 	ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12, false);
 
+	if (ret)
+		goto unmap_save;
+
+	svm->nested.nested_run_pending = 1;
+
 unmap_save:
 	kvm_vcpu_unmap(vcpu, &map_save, true);
 unmap_map:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 8ac5a6fa77203..fc9c4eca90a78 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7659,6 +7659,7 @@ static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
 		if (ret)
 			return ret;
 
+		vmx->nested.nested_run_pending = 1;
 		vmx->nested.smm.guest_mode = false;
 	}
 	return 0;
-- 
2.26.3


