* [PATCH 0/8] KVM: SVM: fixes for vmentry code
@ 2022-11-07 14:54 Paolo Bonzini
  2022-11-07 14:54 ` [PATCH 1/8] KVM: SVM: extract VMCB accessors to a new file Paolo Bonzini
                   ` (8 more replies)
  0 siblings, 9 replies; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-07 14:54 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson, seanjc

This series comprises two related fixes:

- the FILL_RETURN_BUFFER macro in -next needs to access percpu data,
  hence the GS segment base needs to be loaded before FILL_RETURN_BUFFER.
  This means moving guest vmload/vmsave and host vmload to assembly
  (patches 4 and 6).

- because AMD wants the OS to set STIBP to 1 before executing the
  return thunk (un)training sequence, IA32_SPEC_CTRL must be restored
  before UNTRAIN_RET, too.  This must also be moved to assembly and,
  for consistency, the guest SPEC_CTRL is also loaded in there
  (patch 7).

Neither is particularly hard; however, because of 32-bit systems one needs
to keep the number of arguments to __svm_vcpu_run to three or fewer.
One is taken for whether IA32_SPEC_CTRL is intercepted, and one for the
host save area, so all accesses to the vcpu_svm struct have to be done
from assembly too.  This is done in patches 2, 3 and 5 and it turns out
not to be that bad; in fact I don't think the code is much harder to
follow than before despite doing a lot more stuff.  Care has been taken
to keep the "normal" and SEV-ES code as similar as possible, too.
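
For reference, this is how the prototypes end up at the end of the
series (a summary of the svm.h declarations after patch 7, not new
code); with three or fewer arguments, both the 64-bit calling
convention and the kernel's 32-bit regparm(3) convention pass
everything in registers:

  void __svm_sev_es_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted);
  void __svm_vcpu_run(struct vcpu_svm *svm, unsigned long hsave_pa,
                      bool spec_ctrl_intercepted);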

The above summary leaves out the more mundane patches 1 and 8.  They
are respectively preparation for adding more asm-offsets, and dead
code removal.  Most of the scary diffstat comes from patch 1, which
purely moves inline functions out of svm.h into a separate header file.

Peter Zijlstra had already sent a similar patch for the first issue last
Friday.  Unfortunately it did not take care of the 32-bit issue with the
number of arguments.  This series is independent of his, but I did steal
his organization of the exception fixup code because it's pretty.

Tested on 64-bit bare metal including SEV-ES, and on 32-bit nested.  On
top of this I also spent way too much time comparing the compiler-generated
code before the patch with the assembly code after.

Paolo

Supersedes: <20221028230723.3254250-1-pbonzini@redhat.com>

Paolo Bonzini (8):
  KVM: SVM: extract VMCB accessors to a new file
  KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm
  KVM: SVM: adjust register allocation for __svm_vcpu_run
  KVM: SVM: move guest vmsave/vmload to assembly
  KVM: SVM: retrieve VMCB from assembly
  KVM: SVM: restore host save area from assembly
  KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and
    callers

 arch/x86/include/asm/spec-ctrl.h |  10 +-
 arch/x86/kernel/asm-offsets.c    |  10 ++
 arch/x86/kernel/cpu/bugs.c       |  15 +-
 arch/x86/kvm/svm/avic.c          |   1 +
 arch/x86/kvm/svm/nested.c        |   1 +
 arch/x86/kvm/svm/sev.c           |   1 +
 arch/x86/kvm/svm/svm.c           |  54 +++-----
 arch/x86/kvm/svm/svm.h           | 204 +--------------------------
 arch/x86/kvm/svm/svm_onhyperv.c  |   1 +
 arch/x86/kvm/svm/svm_ops.h       |   5 -
 arch/x86/kvm/svm/vmcb.h          | 211 ++++++++++++++++++++++++++++
 arch/x86/kvm/svm/vmenter.S       | 231 ++++++++++++++++++++++++-------
 12 files changed, 434 insertions(+), 310 deletions(-)
 create mode 100644 arch/x86/kvm/svm/vmcb.h

-- 
2.31.1



* [PATCH 1/8] KVM: SVM: extract VMCB accessors to a new file
  2022-11-07 14:54 [PATCH 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
@ 2022-11-07 14:54 ` Paolo Bonzini
  2022-11-07 17:08   ` Sean Christopherson
  2022-11-07 14:54 ` [PATCH 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm Paolo Bonzini
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-07 14:54 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

Having inline functions confuses the compilation of asm-offsets.c,
which cannot find kvm_cache_regs.h because arch/x86/kvm is not in
asm-offsets.c's include path.  Just extract the functions to a
new file.

No functional change intended.

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/avic.c         |   1 +
 arch/x86/kvm/svm/nested.c       |   1 +
 arch/x86/kvm/svm/sev.c          |   1 +
 arch/x86/kvm/svm/svm.c          |   1 +
 arch/x86/kvm/svm/svm.h          | 200 ------------------------------
 arch/x86/kvm/svm/svm_onhyperv.c |   1 +
 arch/x86/kvm/svm/vmcb.h         | 211 ++++++++++++++++++++++++++++++++
 7 files changed, 216 insertions(+), 200 deletions(-)
 create mode 100644 arch/x86/kvm/svm/vmcb.h

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 6919dee69f18..cc651a3310b1 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -26,6 +26,7 @@
 #include "x86.h"
 #include "irq.h"
 #include "svm.h"
+#include "vmcb.h"
 
 /* AVIC GATAG is encoded using VM and VCPU IDs */
 #define AVIC_VCPU_ID_BITS		8
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 4c620999d230..6a90aefb7a8e 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -28,6 +28,7 @@
 #include "cpuid.h"
 #include "lapic.h"
 #include "svm.h"
+#include "vmcb.h"
 #include "hyperv.h"
 
 #define CC KVM_NESTED_VMENTER_CONSISTENCY_CHECK
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 28064060413a..73a229a9975b 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -25,6 +25,7 @@
 #include "mmu.h"
 #include "x86.h"
 #include "svm.h"
+#include "vmcb.h"
 #include "svm_ops.h"
 #include "cpuid.h"
 #include "trace.h"
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 58f0077d9357..cd71f53590b2 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -43,6 +43,7 @@
 #include "trace.h"
 
 #include "svm.h"
+#include "vmcb.h"
 #include "svm_ops.h"
 
 #include "kvm_onhyperv.h"
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6a7686bf6900..222856788153 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -22,8 +22,6 @@
 #include <asm/svm.h>
 #include <asm/sev-common.h>
 
-#include "kvm_cache_regs.h"
-
 #define __sme_page_pa(x) __sme_set(page_to_pfn(x) << PAGE_SHIFT)
 
 #define	IOPM_SIZE PAGE_SIZE * 3
@@ -327,27 +325,6 @@ static __always_inline bool sev_es_guest(struct kvm *kvm)
 #endif
 }
 
-static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)
-{
-	vmcb->control.clean = 0;
-}
-
-static inline void vmcb_mark_all_clean(struct vmcb *vmcb)
-{
-	vmcb->control.clean = VMCB_ALL_CLEAN_MASK
-			       & ~VMCB_ALWAYS_DIRTY_MASK;
-}
-
-static inline void vmcb_mark_dirty(struct vmcb *vmcb, int bit)
-{
-	vmcb->control.clean &= ~(1 << bit);
-}
-
-static inline bool vmcb_is_dirty(struct vmcb *vmcb, int bit)
-{
-        return !test_bit(bit, (unsigned long *)&vmcb->control.clean);
-}
-
 static __always_inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
 {
 	return container_of(vcpu, struct vcpu_svm, vcpu);
@@ -363,161 +340,6 @@ static __always_inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
  */
 #define SVM_REGS_LAZY_LOAD_SET	(1 << VCPU_EXREG_PDPTR)
 
-static inline void vmcb_set_intercept(struct vmcb_control_area *control, u32 bit)
-{
-	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
-	__set_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline void vmcb_clr_intercept(struct vmcb_control_area *control, u32 bit)
-{
-	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
-	__clear_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline bool vmcb_is_intercept(struct vmcb_control_area *control, u32 bit)
-{
-	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
-	return test_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u32 bit)
-{
-	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
-	return test_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline void set_dr_intercepts(struct vcpu_svm *svm)
-{
-	struct vmcb *vmcb = svm->vmcb01.ptr;
-
-	if (!sev_es_guest(svm->vcpu.kvm)) {
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE);
-	}
-
-	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
-	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
-
-	recalc_intercepts(svm);
-}
-
-static inline void clr_dr_intercepts(struct vcpu_svm *svm)
-{
-	struct vmcb *vmcb = svm->vmcb01.ptr;
-
-	vmcb->control.intercepts[INTERCEPT_DR] = 0;
-
-	/* DR7 access must remain intercepted for an SEV-ES guest */
-	if (sev_es_guest(svm->vcpu.kvm)) {
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
-	}
-
-	recalc_intercepts(svm);
-}
-
-static inline void set_exception_intercept(struct vcpu_svm *svm, u32 bit)
-{
-	struct vmcb *vmcb = svm->vmcb01.ptr;
-
-	WARN_ON_ONCE(bit >= 32);
-	vmcb_set_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
-
-	recalc_intercepts(svm);
-}
-
-static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit)
-{
-	struct vmcb *vmcb = svm->vmcb01.ptr;
-
-	WARN_ON_ONCE(bit >= 32);
-	vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
-
-	recalc_intercepts(svm);
-}
-
-static inline void svm_set_intercept(struct vcpu_svm *svm, int bit)
-{
-	struct vmcb *vmcb = svm->vmcb01.ptr;
-
-	vmcb_set_intercept(&vmcb->control, bit);
-
-	recalc_intercepts(svm);
-}
-
-static inline void svm_clr_intercept(struct vcpu_svm *svm, int bit)
-{
-	struct vmcb *vmcb = svm->vmcb01.ptr;
-
-	vmcb_clr_intercept(&vmcb->control, bit);
-
-	recalc_intercepts(svm);
-}
-
-static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit)
-{
-	return vmcb_is_intercept(&svm->vmcb->control, bit);
-}
-
-static inline bool nested_vgif_enabled(struct vcpu_svm *svm)
-{
-	return svm->vgif_enabled && (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK);
-}
-
-static inline struct vmcb *get_vgif_vmcb(struct vcpu_svm *svm)
-{
-	if (!vgif)
-		return NULL;
-
-	if (is_guest_mode(&svm->vcpu) && !nested_vgif_enabled(svm))
-		return svm->nested.vmcb02.ptr;
-	else
-		return svm->vmcb01.ptr;
-}
-
-static inline void enable_gif(struct vcpu_svm *svm)
-{
-	struct vmcb *vmcb = get_vgif_vmcb(svm);
-
-	if (vmcb)
-		vmcb->control.int_ctl |= V_GIF_MASK;
-	else
-		svm->vcpu.arch.hflags |= HF_GIF_MASK;
-}
-
-static inline void disable_gif(struct vcpu_svm *svm)
-{
-	struct vmcb *vmcb = get_vgif_vmcb(svm);
-
-	if (vmcb)
-		vmcb->control.int_ctl &= ~V_GIF_MASK;
-	else
-		svm->vcpu.arch.hflags &= ~HF_GIF_MASK;
-}
-
-static inline bool gif_set(struct vcpu_svm *svm)
-{
-	struct vmcb *vmcb = get_vgif_vmcb(svm);
-
-	if (vmcb)
-		return !!(vmcb->control.int_ctl & V_GIF_MASK);
-	else
-		return !!(svm->vcpu.arch.hflags & HF_GIF_MASK);
-}
-
 static inline bool nested_npt_enabled(struct vcpu_svm *svm)
 {
 	return svm->nested.ctl.nested_ctl & SVM_NESTED_CTL_NP_ENABLE;
@@ -567,28 +389,6 @@ void svm_complete_interrupt_delivery(struct kvm_vcpu *vcpu, int delivery_mode,
 #define NESTED_EXIT_DONE	1	/* Exit caused nested vmexit  */
 #define NESTED_EXIT_CONTINUE	2	/* Further checks needed      */
 
-static inline bool nested_svm_virtualize_tpr(struct kvm_vcpu *vcpu)
-{
-	struct vcpu_svm *svm = to_svm(vcpu);
-
-	return is_guest_mode(vcpu) && (svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK);
-}
-
-static inline bool nested_exit_on_smi(struct vcpu_svm *svm)
-{
-	return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_SMI);
-}
-
-static inline bool nested_exit_on_intr(struct vcpu_svm *svm)
-{
-	return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_INTR);
-}
-
-static inline bool nested_exit_on_nmi(struct vcpu_svm *svm)
-{
-	return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_NMI);
-}
-
 int enter_svm_guest_mode(struct kvm_vcpu *vcpu,
 			 u64 vmcb_gpa, struct vmcb *vmcb12, bool from_vmrun);
 void svm_leave_nested(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm/svm_onhyperv.c b/arch/x86/kvm/svm/svm_onhyperv.c
index 8cdc62c74a96..ae0a101329e6 100644
--- a/arch/x86/kvm/svm/svm_onhyperv.c
+++ b/arch/x86/kvm/svm/svm_onhyperv.c
@@ -8,6 +8,7 @@
 #include <asm/mshyperv.h>
 
 #include "svm.h"
+#include "vmcb.h"
 #include "svm_ops.h"
 
 #include "hyperv.h"
diff --git a/arch/x86/kvm/svm/vmcb.h b/arch/x86/kvm/svm/vmcb.h
new file mode 100644
index 000000000000..8757cda27e3a
--- /dev/null
+++ b/arch/x86/kvm/svm/vmcb.h
@@ -0,0 +1,211 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Kernel-based Virtual Machine driver for Linux
+ *
+ * AMD SVM support - VMCB accessors
+ */
+
+#ifndef __SVM_VMCB_H
+#define __SVM_VMCB_H
+
+#include "kvm_cache_regs.h"
+
+static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)
+{
+	vmcb->control.clean = 0;
+}
+
+static inline void vmcb_mark_all_clean(struct vmcb *vmcb)
+{
+	vmcb->control.clean = VMCB_ALL_CLEAN_MASK
+			       & ~VMCB_ALWAYS_DIRTY_MASK;
+}
+
+static inline void vmcb_mark_dirty(struct vmcb *vmcb, int bit)
+{
+	vmcb->control.clean &= ~(1 << bit);
+}
+
+static inline bool vmcb_is_dirty(struct vmcb *vmcb, int bit)
+{
+        return !test_bit(bit, (unsigned long *)&vmcb->control.clean);
+}
+
+static inline void vmcb_set_intercept(struct vmcb_control_area *control, u32 bit)
+{
+	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+	__set_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline void vmcb_clr_intercept(struct vmcb_control_area *control, u32 bit)
+{
+	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+	__clear_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline bool vmcb_is_intercept(struct vmcb_control_area *control, u32 bit)
+{
+	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+	return test_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u32 bit)
+{
+	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+	return test_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline void set_dr_intercepts(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = svm->vmcb01.ptr;
+
+	if (!sev_es_guest(svm->vcpu.kvm)) {
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE);
+	}
+
+	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
+	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
+
+	recalc_intercepts(svm);
+}
+
+static inline void clr_dr_intercepts(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = svm->vmcb01.ptr;
+
+	vmcb->control.intercepts[INTERCEPT_DR] = 0;
+
+	/* DR7 access must remain intercepted for an SEV-ES guest */
+	if (sev_es_guest(svm->vcpu.kvm)) {
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
+	}
+
+	recalc_intercepts(svm);
+}
+
+static inline void set_exception_intercept(struct vcpu_svm *svm, u32 bit)
+{
+	struct vmcb *vmcb = svm->vmcb01.ptr;
+
+	WARN_ON_ONCE(bit >= 32);
+	vmcb_set_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
+
+	recalc_intercepts(svm);
+}
+
+static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit)
+{
+	struct vmcb *vmcb = svm->vmcb01.ptr;
+
+	WARN_ON_ONCE(bit >= 32);
+	vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
+
+	recalc_intercepts(svm);
+}
+
+static inline void svm_set_intercept(struct vcpu_svm *svm, int bit)
+{
+	struct vmcb *vmcb = svm->vmcb01.ptr;
+
+	vmcb_set_intercept(&vmcb->control, bit);
+
+	recalc_intercepts(svm);
+}
+
+static inline void svm_clr_intercept(struct vcpu_svm *svm, int bit)
+{
+	struct vmcb *vmcb = svm->vmcb01.ptr;
+
+	vmcb_clr_intercept(&vmcb->control, bit);
+
+	recalc_intercepts(svm);
+}
+
+static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit)
+{
+	return vmcb_is_intercept(&svm->vmcb->control, bit);
+}
+
+static inline bool nested_vgif_enabled(struct vcpu_svm *svm)
+{
+	return svm->vgif_enabled && (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK);
+}
+
+static inline struct vmcb *get_vgif_vmcb(struct vcpu_svm *svm)
+{
+	if (!vgif)
+		return NULL;
+
+	if (is_guest_mode(&svm->vcpu) && !nested_vgif_enabled(svm))
+		return svm->nested.vmcb02.ptr;
+	else
+		return svm->vmcb01.ptr;
+}
+
+static inline void enable_gif(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = get_vgif_vmcb(svm);
+
+	if (vmcb)
+		vmcb->control.int_ctl |= V_GIF_MASK;
+	else
+		svm->vcpu.arch.hflags |= HF_GIF_MASK;
+}
+
+static inline void disable_gif(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = get_vgif_vmcb(svm);
+
+	if (vmcb)
+		vmcb->control.int_ctl &= ~V_GIF_MASK;
+	else
+		svm->vcpu.arch.hflags &= ~HF_GIF_MASK;
+}
+
+static inline bool gif_set(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = get_vgif_vmcb(svm);
+
+	if (vmcb)
+		return !!(vmcb->control.int_ctl & V_GIF_MASK);
+	else
+		return !!(svm->vcpu.arch.hflags & HF_GIF_MASK);
+}
+
+static inline bool nested_svm_virtualize_tpr(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	return is_guest_mode(vcpu) && (svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK);
+}
+
+static inline bool nested_exit_on_smi(struct vcpu_svm *svm)
+{
+	return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_SMI);
+}
+
+static inline bool nested_exit_on_intr(struct vcpu_svm *svm)
+{
+	return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_INTR);
+}
+
+static inline bool nested_exit_on_nmi(struct vcpu_svm *svm)
+{
+	return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_NMI);
+}
+
+#endif
-- 
2.31.1




* [PATCH 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm
  2022-11-07 14:54 [PATCH 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
  2022-11-07 14:54 ` [PATCH 1/8] KVM: SVM: extract VMCB accessors to a new file Paolo Bonzini
@ 2022-11-07 14:54 ` Paolo Bonzini
  2022-11-07 17:10   ` Sean Christopherson
  2022-11-07 14:54 ` [PATCH 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run Paolo Bonzini
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-07 14:54 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

Since registers are reachable through vcpu_svm, and we will
need to access more fields of that struct, pass it instead
of the regs[] array.

No functional change intended.

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kernel/asm-offsets.c |  6 ++++++
 arch/x86/kvm/svm/svm.c        |  2 +-
 arch/x86/kvm/svm/svm.h        |  2 +-
 arch/x86/kvm/svm/vmenter.S    | 36 +++++++++++++++++------------------
 4 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index cb50589a7102..85de7e4fe59a 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -20,6 +20,7 @@
 #include <asm/tlbflush.h>
 #include <asm/tdx.h>
 #include "../kvm/vmx/vmx.h"
+#include "../kvm/svm/svm.h"
 
 #ifdef CONFIG_XEN
 #include <xen/interface/xen.h>
@@ -109,6 +110,11 @@ static void __used common(void)
 	OFFSET(TSS_sp1, tss_struct, x86_tss.sp1);
 	OFFSET(TSS_sp2, tss_struct, x86_tss.sp2);
 
+	if (IS_ENABLED(CONFIG_KVM_AMD)) {
+		BLANK();
+		OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
+	}
+
 	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
 		BLANK();
 		OFFSET(VMX_spec_ctrl, vcpu_vmx, spec_ctrl);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index cd71f53590b2..4cfa62e66a0e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3931,7 +3931,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 		 * vmcb02 when switching vmcbs for nested virtualization.
 		 */
 		vmload(svm->vmcb01.pa);
-		__svm_vcpu_run(vmcb_pa, (unsigned long *)&vcpu->arch.regs);
+		__svm_vcpu_run(vmcb_pa, svm);
 		vmsave(svm->vmcb01.pa);
 
 		vmload(__sme_page_pa(sd->save_area));
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 222856788153..5f8dfc9cd9a7 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -484,6 +484,6 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 /* vmenter.S */
 
 void __svm_sev_es_vcpu_run(unsigned long vmcb_pa);
-void __svm_vcpu_run(unsigned long vmcb_pa, unsigned long *regs);
+void __svm_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm);
 
 #endif
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 723f8534986c..8fac744361e5 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -8,23 +8,23 @@
 #define WORD_SIZE (BITS_PER_LONG / 8)
 
 /* Intentionally omit RAX as it's context switched by hardware */
-#define VCPU_RCX	__VCPU_REGS_RCX * WORD_SIZE
-#define VCPU_RDX	__VCPU_REGS_RDX * WORD_SIZE
-#define VCPU_RBX	__VCPU_REGS_RBX * WORD_SIZE
+#define VCPU_RCX	(SVM_vcpu_arch_regs + __VCPU_REGS_RCX * WORD_SIZE)
+#define VCPU_RDX	(SVM_vcpu_arch_regs + __VCPU_REGS_RDX * WORD_SIZE)
+#define VCPU_RBX	(SVM_vcpu_arch_regs + __VCPU_REGS_RBX * WORD_SIZE)
 /* Intentionally omit RSP as it's context switched by hardware */
-#define VCPU_RBP	__VCPU_REGS_RBP * WORD_SIZE
-#define VCPU_RSI	__VCPU_REGS_RSI * WORD_SIZE
-#define VCPU_RDI	__VCPU_REGS_RDI * WORD_SIZE
+#define VCPU_RBP	(SVM_vcpu_arch_regs + __VCPU_REGS_RBP * WORD_SIZE)
+#define VCPU_RSI	(SVM_vcpu_arch_regs + __VCPU_REGS_RSI * WORD_SIZE)
+#define VCPU_RDI	(SVM_vcpu_arch_regs + __VCPU_REGS_RDI * WORD_SIZE)
 
 #ifdef CONFIG_X86_64
-#define VCPU_R8		__VCPU_REGS_R8  * WORD_SIZE
-#define VCPU_R9		__VCPU_REGS_R9  * WORD_SIZE
-#define VCPU_R10	__VCPU_REGS_R10 * WORD_SIZE
-#define VCPU_R11	__VCPU_REGS_R11 * WORD_SIZE
-#define VCPU_R12	__VCPU_REGS_R12 * WORD_SIZE
-#define VCPU_R13	__VCPU_REGS_R13 * WORD_SIZE
-#define VCPU_R14	__VCPU_REGS_R14 * WORD_SIZE
-#define VCPU_R15	__VCPU_REGS_R15 * WORD_SIZE
+#define VCPU_R8		(SVM_vcpu_arch_regs + __VCPU_REGS_R8  * WORD_SIZE)
+#define VCPU_R9		(SVM_vcpu_arch_regs + __VCPU_REGS_R9  * WORD_SIZE)
+#define VCPU_R10	(SVM_vcpu_arch_regs + __VCPU_REGS_R10 * WORD_SIZE)
+#define VCPU_R11	(SVM_vcpu_arch_regs + __VCPU_REGS_R11 * WORD_SIZE)
+#define VCPU_R12	(SVM_vcpu_arch_regs + __VCPU_REGS_R12 * WORD_SIZE)
+#define VCPU_R13	(SVM_vcpu_arch_regs + __VCPU_REGS_R13 * WORD_SIZE)
+#define VCPU_R14	(SVM_vcpu_arch_regs + __VCPU_REGS_R14 * WORD_SIZE)
+#define VCPU_R15	(SVM_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
 #endif
 
 .section .noinstr.text, "ax"
@@ -32,7 +32,7 @@
 /**
  * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
  * @vmcb_pa:	unsigned long
- * @regs:	unsigned long * (to guest registers)
+ * @svm:	struct vcpu_svm *
  */
 SYM_FUNC_START(__svm_vcpu_run)
 	push %_ASM_BP
@@ -47,13 +47,13 @@ SYM_FUNC_START(__svm_vcpu_run)
 #endif
 	push %_ASM_BX
 
-	/* Save @regs. */
+	/* Save @svm. */
 	push %_ASM_ARG2
 
 	/* Save @vmcb. */
 	push %_ASM_ARG1
 
-	/* Move @regs to RAX. */
+	/* Move @svm to RAX. */
 	mov %_ASM_ARG2, %_ASM_AX
 
 	/* Load guest registers. */
@@ -89,7 +89,7 @@ SYM_FUNC_START(__svm_vcpu_run)
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
 #endif
 
-	/* "POP" @regs to RAX. */
+	/* "POP" @svm to RAX. */
 	pop %_ASM_AX
 
 	/* Save all guest registers.  */
-- 
2.31.1




* [PATCH 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run
  2022-11-07 14:54 [PATCH 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
  2022-11-07 14:54 ` [PATCH 1/8] KVM: SVM: extract VMCB accessors to a new file Paolo Bonzini
  2022-11-07 14:54 ` [PATCH 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm Paolo Bonzini
@ 2022-11-07 14:54 ` Paolo Bonzini
  2022-11-07 14:54 ` [PATCH 4/8] KVM: SVM: move guest vmsave/vmload to assembly Paolo Bonzini
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-07 14:54 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

In preparation for moving vmload/vmsave to __svm_vcpu_run,
keep the pointer to the struct vcpu_svm in %rdi.  This way
it is possible to load svm->vmcb01.pa in %rax without
clobbering the pointer to svm itself.

No functional change intended.

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/vmenter.S | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 8fac744361e5..dc558d0a589e 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -53,29 +53,29 @@ SYM_FUNC_START(__svm_vcpu_run)
 	/* Save @vmcb. */
 	push %_ASM_ARG1
 
-	/* Move @svm to RAX. */
-	mov %_ASM_ARG2, %_ASM_AX
+	/* Move @svm to RDI. */
+	mov %_ASM_ARG2, %_ASM_DI
+
+	/* "POP" @vmcb to RAX. */
+	pop %_ASM_AX
 
 	/* Load guest registers. */
-	mov VCPU_RCX(%_ASM_AX), %_ASM_CX
-	mov VCPU_RDX(%_ASM_AX), %_ASM_DX
-	mov VCPU_RBX(%_ASM_AX), %_ASM_BX
-	mov VCPU_RBP(%_ASM_AX), %_ASM_BP
-	mov VCPU_RSI(%_ASM_AX), %_ASM_SI
-	mov VCPU_RDI(%_ASM_AX), %_ASM_DI
+	mov VCPU_RCX(%_ASM_DI), %_ASM_CX
+	mov VCPU_RDX(%_ASM_DI), %_ASM_DX
+	mov VCPU_RBX(%_ASM_DI), %_ASM_BX
+	mov VCPU_RBP(%_ASM_DI), %_ASM_BP
+	mov VCPU_RSI(%_ASM_DI), %_ASM_SI
 #ifdef CONFIG_X86_64
-	mov VCPU_R8 (%_ASM_AX),  %r8
-	mov VCPU_R9 (%_ASM_AX),  %r9
-	mov VCPU_R10(%_ASM_AX), %r10
-	mov VCPU_R11(%_ASM_AX), %r11
-	mov VCPU_R12(%_ASM_AX), %r12
-	mov VCPU_R13(%_ASM_AX), %r13
-	mov VCPU_R14(%_ASM_AX), %r14
-	mov VCPU_R15(%_ASM_AX), %r15
+	mov VCPU_R8 (%_ASM_DI),  %r8
+	mov VCPU_R9 (%_ASM_DI),  %r9
+	mov VCPU_R10(%_ASM_DI), %r10
+	mov VCPU_R11(%_ASM_DI), %r11
+	mov VCPU_R12(%_ASM_DI), %r12
+	mov VCPU_R13(%_ASM_DI), %r13
+	mov VCPU_R14(%_ASM_DI), %r14
+	mov VCPU_R15(%_ASM_DI), %r15
 #endif
-
-	/* "POP" @vmcb to RAX. */
-	pop %_ASM_AX
+	mov VCPU_RDI(%_ASM_DI), %_ASM_DI
 
 	/* Enter guest mode */
 	sti
-- 
2.31.1




* [PATCH 4/8] KVM: SVM: move guest vmsave/vmload to assembly
  2022-11-07 14:54 [PATCH 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (2 preceding siblings ...)
  2022-11-07 14:54 ` [PATCH 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run Paolo Bonzini
@ 2022-11-07 14:54 ` Paolo Bonzini
  2022-11-07 15:23   ` Peter Zijlstra
  2022-11-07 15:32   ` Andrew Cooper
  2022-11-07 14:54 ` [PATCH 5/8] KVM: SVM: retrieve VMCB from assembly Paolo Bonzini
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-07 14:54 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

FILL_RETURN_BUFFER can access percpu data; therefore, the vmload of the
host save area must be executed first.  As a first step, move the
guest VMCB vmsave/vmload to assembly.

The idea on how to number the exception tables is stolen from
a prototype patch by Peter Zijlstra.
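
Concretely, the scheme pairs each faulting instruction label N with a
fixup label N0 that checks kvm_rebooting, so new fault sites can be
added later without renumbering (a restatement of the hunk below, not
additional code):

  1: vmload %_ASM_AX    ->  10: cmpb $0, kvm_rebooting; jne 2b; ud2
  3: vmrun  %_ASM_AX    ->  30: cmpb $0, kvm_rebooting; jne 4b; ud2
  5: vmsave %_ASM_AX    ->  50: cmpb $0, kvm_rebooting; jne 6b; ud2

  _ASM_EXTABLE(1b, 10b)
  _ASM_EXTABLE(3b, 30b)
  _ASM_EXTABLE(5b, 50b)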

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Link: <https://lore.kernel.org/all/f571e404-e625-bae1-10e9-449b2eb4cbd8@citrix.com/>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kernel/asm-offsets.c |  2 ++
 arch/x86/kvm/svm/svm.c        |  9 -------
 arch/x86/kvm/svm/vmenter.S    | 50 +++++++++++++++++++++++++++--------
 3 files changed, 41 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 85de7e4fe59a..f01293a1e594 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -113,6 +113,8 @@ static void __used common(void)
 	if (IS_ENABLED(CONFIG_KVM_AMD)) {
 		BLANK();
 		OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
+		OFFSET(SVM_vmcb01, vcpu_svm, vmcb01);
+		OFFSET(KVM_VMCB_pa, kvm_vmcb_info, pa);
 	}
 
 	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 4cfa62e66a0e..ae65cdcab660 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3924,16 +3924,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 	} else {
 		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
-		/*
-		 * Use a single vmcb (vmcb01 because it's always valid) for
-		 * context switching guest state via VMLOAD/VMSAVE, that way
-		 * the state doesn't need to be copied between vmcb01 and
-		 * vmcb02 when switching vmcbs for nested virtualization.
-		 */
-		vmload(svm->vmcb01.pa);
 		__svm_vcpu_run(vmcb_pa, svm);
-		vmsave(svm->vmcb01.pa);
-
 		vmload(__sme_page_pa(sd->save_area));
 	}
 
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index dc558d0a589e..4709bc8868d7 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include <linux/linkage.h>
 #include <asm/asm.h>
+#include <asm/asm-offsets.h>
 #include <asm/bitsperlong.h>
 #include <asm/kvm_vcpu_regs.h>
 #include <asm/nospec-branch.h>
@@ -27,6 +28,8 @@
 #define VCPU_R15	(SVM_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
 #endif
 
+#define SVM_vmcb01_pa	(SVM_vmcb01 + KVM_VMCB_pa)
+
 .section .noinstr.text, "ax"
 
 /**
@@ -56,6 +59,16 @@ SYM_FUNC_START(__svm_vcpu_run)
 	/* Move @svm to RDI. */
 	mov %_ASM_ARG2, %_ASM_DI
 
+	/*
+	 * Use a single vmcb (vmcb01 because it's always valid) for
+	 * context switching guest state via VMLOAD/VMSAVE, that way
+	 * the state doesn't need to be copied between vmcb01 and
+	 * vmcb02 when switching vmcbs for nested virtualization.
+	 */
+	mov SVM_vmcb01_pa(%_ASM_DI), %_ASM_AX
+1:	vmload %_ASM_AX
+2:
+
 	/* "POP" @vmcb to RAX. */
 	pop %_ASM_AX
 
@@ -80,16 +93,11 @@ SYM_FUNC_START(__svm_vcpu_run)
 	/* Enter guest mode */
 	sti
 
-1:	vmrun %_ASM_AX
-
-2:	cli
-
-#ifdef CONFIG_RETPOLINE
-	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
-	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
-#endif
+3:	vmrun %_ASM_AX
+4:
+	cli
 
-	/* "POP" @svm to RAX. */
+	/* Pop @svm to RAX while it's the only available register. */
 	pop %_ASM_AX
 
 	/* Save all guest registers.  */
@@ -110,6 +118,18 @@ SYM_FUNC_START(__svm_vcpu_run)
 	mov %r15, VCPU_R15(%_ASM_AX)
 #endif
 
+	/* @svm can stay in RDI from now on.  */
+	mov %_ASM_AX, %_ASM_DI
+
+	mov SVM_vmcb01_pa(%_ASM_DI), %_ASM_AX
+5:	vmsave %_ASM_AX
+6:
+
+#ifdef CONFIG_RETPOLINE
+	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
+	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
+#endif
+
 	/*
 	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
 	 * untrained as soon as we exit the VM and are back to the
@@ -159,11 +179,19 @@ SYM_FUNC_START(__svm_vcpu_run)
 	pop %_ASM_BP
 	RET
 
-3:	cmpb $0, kvm_rebooting
+10:	cmpb $0, kvm_rebooting
 	jne 2b
 	ud2
+30:	cmpb $0, kvm_rebooting
+	jne 4b
+	ud2
+50:	cmpb $0, kvm_rebooting
+	jne 6b
+	ud2
 
-	_ASM_EXTABLE(1b, 3b)
+	_ASM_EXTABLE(1b, 10b)
+	_ASM_EXTABLE(3b, 30b)
+	_ASM_EXTABLE(5b, 50b)
 
 SYM_FUNC_END(__svm_vcpu_run)
 
-- 
2.31.1




* [PATCH 5/8] KVM: SVM: retrieve VMCB from assembly
  2022-11-07 14:54 [PATCH 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (3 preceding siblings ...)
  2022-11-07 14:54 ` [PATCH 4/8] KVM: SVM: move guest vmsave/vmload to assembly Paolo Bonzini
@ 2022-11-07 14:54 ` Paolo Bonzini
  2022-11-07 14:54 ` [PATCH 6/8] KVM: SVM: restore host save area " Paolo Bonzini
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-07 14:54 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

This is needed in order to keep the number of arguments to three or fewer,
so that they are all passed in registers on either 32-bit or 64-bit
builds.

It is not strictly necessary for __svm_sev_es_vcpu_run, but staying
consistent is a good idea since it makes __svm_sev_es_vcpu_run a
stripped-down version of __svm_vcpu_run.

No functional change intended.
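
In C terms, the assembly now computes the equivalent of the following
itself instead of receiving it as an argument (just an illustration;
the actual change is in the hunk below):

  vmcb_pa = svm->current_vmcb->pa;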

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kernel/asm-offsets.c |  1 +
 arch/x86/kvm/svm/svm.c        |  5 ++---
 arch/x86/kvm/svm/svm.h        |  4 ++--
 arch/x86/kvm/svm/vmenter.S    | 20 ++++++++++----------
 4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index f01293a1e594..69d1fed51086 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -114,6 +114,7 @@ static void __used common(void)
 		BLANK();
 		OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
 		OFFSET(SVM_vmcb01, vcpu_svm, vmcb01);
+		OFFSET(SVM_current_vmcb, vcpu_svm, current_vmcb);
 		OFFSET(KVM_VMCB_pa, kvm_vmcb_info, pa);
 	}
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index ae65cdcab660..550a364be8d3 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3915,16 +3915,15 @@ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
 static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
-	unsigned long vmcb_pa = svm->current_vmcb->pa;
 
 	guest_state_enter_irqoff();
 
 	if (sev_es_guest(vcpu->kvm)) {
-		__svm_sev_es_vcpu_run(vmcb_pa);
+		__svm_sev_es_vcpu_run(svm);
 	} else {
 		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
-		__svm_vcpu_run(vmcb_pa, svm);
+		__svm_vcpu_run(svm);
 		vmload(__sme_page_pa(sd->save_area));
 	}
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 5f8dfc9cd9a7..c5b8ec370108 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -483,7 +483,7 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 
 /* vmenter.S */
 
-void __svm_sev_es_vcpu_run(unsigned long vmcb_pa);
-void __svm_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm);
+void __svm_sev_es_vcpu_run(struct vcpu_svm *svm);
+void __svm_vcpu_run(struct vcpu_svm *svm);
 
 #endif
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 4709bc8868d7..9738ce41fac9 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -34,7 +34,6 @@
 
 /**
  * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
- * @vmcb_pa:	unsigned long
  * @svm:	struct vcpu_svm *
  */
 SYM_FUNC_START(__svm_vcpu_run)
@@ -51,13 +50,12 @@ SYM_FUNC_START(__svm_vcpu_run)
 	push %_ASM_BX
 
 	/* Save @svm. */
-	push %_ASM_ARG2
-
-	/* Save @vmcb. */
 	push %_ASM_ARG1
 
+.ifnc _ASM_ARG1, _ASM_DI
 	/* Move @svm to RDI. */
-	mov %_ASM_ARG2, %_ASM_DI
+	mov %_ASM_ARG1, %_ASM_DI
+.endif
 
 	/*
 	 * Use a single vmcb (vmcb01 because it's always valid) for
@@ -69,8 +67,9 @@ SYM_FUNC_START(__svm_vcpu_run)
 1:	vmload %_ASM_AX
 2:
 
-	/* "POP" @vmcb to RAX. */
-	pop %_ASM_AX
+	/* Get svm->current_vmcb->pa into RAX. */
+	mov SVM_current_vmcb(%_ASM_DI), %_ASM_AX
+	mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX
 
 	/* Load guest registers. */
 	mov VCPU_RCX(%_ASM_DI), %_ASM_CX
@@ -197,7 +196,7 @@ SYM_FUNC_END(__svm_vcpu_run)
 
 /**
  * __svm_sev_es_vcpu_run - Run a SEV-ES vCPU via a transition to SVM guest mode
- * @vmcb_pa:	unsigned long
+ * @svm:	struct vcpu_svm *
  */
 SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	push %_ASM_BP
@@ -212,8 +211,9 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 #endif
 	push %_ASM_BX
 
-	/* Move @vmcb to RAX. */
-	mov %_ASM_ARG1, %_ASM_AX
+	/* Get svm->current_vmcb->pa into RAX. */
+	mov SVM_current_vmcb(%_ASM_ARG1), %_ASM_AX
+	mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX
 
 	/* Enter guest mode */
 	sti
-- 
2.31.1




* [PATCH 6/8] KVM: SVM: restore host save area from assembly
  2022-11-07 14:54 [PATCH 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (4 preceding siblings ...)
  2022-11-07 14:54 ` [PATCH 5/8] KVM: SVM: retrieve VMCB from assembly Paolo Bonzini
@ 2022-11-07 14:54 ` Paolo Bonzini
  2022-11-07 14:54 ` [PATCH 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-07 14:54 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

This is needed so that FILL_RETURN_BUFFER has access to the
percpu area via the GS segment base.

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Analyzed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/svm.c     |  3 +--
 arch/x86/kvm/svm/svm.h     |  2 +-
 arch/x86/kvm/svm/svm_ops.h |  5 -----
 arch/x86/kvm/svm/vmenter.S | 13 +++++++++++++
 4 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 550a364be8d3..381c7dcffe25 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3923,8 +3923,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 	} else {
 		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
-		__svm_vcpu_run(svm);
-		vmload(__sme_page_pa(sd->save_area));
+		__svm_vcpu_run(svm, __sme_page_pa(sd->save_area));
 	}
 
 	guest_state_exit_irqoff();
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index c5b8ec370108..99410651f2a5 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -484,6 +484,6 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 /* vmenter.S */
 
 void __svm_sev_es_vcpu_run(struct vcpu_svm *svm);
-void __svm_vcpu_run(struct vcpu_svm *svm);
+void __svm_vcpu_run(struct vcpu_svm *svm, unsigned long hsave_pa);
 
 #endif
diff --git a/arch/x86/kvm/svm/svm_ops.h b/arch/x86/kvm/svm/svm_ops.h
index 9430d6437c9f..36c8af87a707 100644
--- a/arch/x86/kvm/svm/svm_ops.h
+++ b/arch/x86/kvm/svm/svm_ops.h
@@ -61,9 +61,4 @@ static __always_inline void vmsave(unsigned long pa)
 	svm_asm1(vmsave, "a" (pa), "memory");
 }
 
-static __always_inline void vmload(unsigned long pa)
-{
-	svm_asm1(vmload, "a" (pa), "memory");
-}
-
 #endif /* __KVM_X86_SVM_OPS_H */
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 9738ce41fac9..45a4bd002494 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -35,6 +35,7 @@
 /**
  * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
  * @svm:	struct vcpu_svm *
+ * @hsave_pa:	unsigned long
  */
 SYM_FUNC_START(__svm_vcpu_run)
 	push %_ASM_BP
@@ -49,6 +50,9 @@ SYM_FUNC_START(__svm_vcpu_run)
 #endif
 	push %_ASM_BX
 
+	/* @hsave_pa is needed last after vmexit, save it first.  */
+	push %_ASM_ARG2
+
 	/* Save @svm. */
 	push %_ASM_ARG1
 
@@ -124,6 +128,11 @@ SYM_FUNC_START(__svm_vcpu_run)
 5:	vmsave %_ASM_AX
 6:
 
+	/* Pop @hsave_pa and restore GSBASE, allowing access to percpu data.  */
+	pop %_ASM_AX
+7:	vmload %_ASM_AX
+8:
+
 #ifdef CONFIG_RETPOLINE
 	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
@@ -187,10 +196,14 @@ SYM_FUNC_START(__svm_vcpu_run)
 50:	cmpb $0, kvm_rebooting
 	jne 6b
 	ud2
+70:	cmpb $0, kvm_rebooting
+	jne 8b
+	ud2
 
 	_ASM_EXTABLE(1b, 10b)
 	_ASM_EXTABLE(3b, 30b)
 	_ASM_EXTABLE(5b, 50b)
+	_ASM_EXTABLE(7b, 70b)
 
 SYM_FUNC_END(__svm_vcpu_run)
 
-- 
2.31.1




* [PATCH 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  2022-11-07 14:54 [PATCH 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (5 preceding siblings ...)
  2022-11-07 14:54 ` [PATCH 6/8] KVM: SVM: restore host save area " Paolo Bonzini
@ 2022-11-07 14:54 ` Paolo Bonzini
  2022-11-07 18:45   ` Jim Mattson
  2022-11-07 14:54 ` [PATCH 8/8] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers Paolo Bonzini
  2022-11-07 15:33 ` [PATCH 0/8] KVM: SVM: fixes for vmentry code Peter Zijlstra
  8 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-07 14:54 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

Restoration of the host IA32_SPEC_CTRL value is probably too late
with respect to the return thunk training sequence.

With respect to the user/kernel boundary, AMD says, "If software chooses
to toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
exit), software should set STIBP to 1 before executing the return thunk
training sequence." I assume the same requirements apply to the guest/host
boundary. The return thunk training sequence is in vmenter.S, quite close
to the VM-exit. On hosts without V_SPEC_CTRL, however, the host's
IA32_SPEC_CTRL value is not restored until much later.

To avoid this, move the restoration of host SPEC_CTRL to assembly and,
for consistency, move the restoration of the guest SPEC_CTRL as well.
This is not particularly difficult, apart from some care to cover both
32- and 64-bit, and to share code between SEV-ES and normal vmentry.
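
Putting it together, on hosts without V_SPEC_CTRL the vmexit path now
proceeds roughly as follows (a paraphrase of the assembly below, not
additional code):

  vmexit -> cli
    -> save guest GPRs
    -> vmsave vmcb01; vmload <host save area>  (restores host GSBASE)
    -> FILL_RETURN_BUFFER                      (needs percpu data)
    -> RESTORE_HOST_SPEC_CTRL                  (host STIBP in place)
    -> UNTRAIN_RET                             (return thunk training)
    -> restore host GPRs and return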

Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kernel/asm-offsets.c |  1 +
 arch/x86/kernel/cpu/bugs.c    | 13 ++---
 arch/x86/kvm/svm/svm.c        | 38 ++++++---------
 arch/x86/kvm/svm/svm.h        |  4 +-
 arch/x86/kvm/svm/vmenter.S    | 92 ++++++++++++++++++++++++++++++++++-
 5 files changed, 111 insertions(+), 37 deletions(-)

diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 69d1fed51086..d0bd68af0a5a 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -115,6 +115,7 @@ static void __used common(void)
 		OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
 		OFFSET(SVM_vmcb01, vcpu_svm, vmcb01);
 		OFFSET(SVM_current_vmcb, vcpu_svm, current_vmcb);
+		OFFSET(SVM_spec_ctrl, vcpu_svm, spec_ctrl);
 		OFFSET(KVM_VMCB_pa, kvm_vmcb_info, pa);
 	}
 
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index da7c361f47e0..6ec0b7ce7453 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -196,22 +196,15 @@ void __init check_bugs(void)
 }
 
 /*
- * NOTE: This function is *only* called for SVM.  VMX spec_ctrl handling is
- * done in vmenter.S.
+ * NOTE: This function is *only* called for SVM, since Intel uses
+ * MSR_IA32_SPEC_CTRL for SSBD.
  */
 void
 x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
 {
-	u64 msrval, guestval = guest_spec_ctrl, hostval = spec_ctrl_current();
+	u64 guestval, hostval;
 	struct thread_info *ti = current_thread_info();
 
-	if (static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) {
-		if (hostval != guestval) {
-			msrval = setguest ? guestval : hostval;
-			wrmsrl(MSR_IA32_SPEC_CTRL, msrval);
-		}
-	}
-
 	/*
 	 * If SSBD is not handled in MSR_SPEC_CTRL on AMD, update
 	 * MSR_AMD64_L2_CFG or MSR_VIRT_SPEC_CTRL if supported.
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 381c7dcffe25..31aa158a2e10 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -731,6 +731,15 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
 	u32 offset;
 	u32 *msrpm;
 
+	/*
+	 * For non-nested case:
+	 * If the L01 MSR bitmap does not intercept the MSR, then we need to
+	 * save it.
+	 *
+	 * For nested case:
+	 * If the L02 MSR bitmap does not intercept the MSR, then we need to
+	 * save it.
+	 */
 	msrpm = is_guest_mode(vcpu) ? to_svm(vcpu)->nested.msrpm:
 				      to_svm(vcpu)->msrpm;
 
@@ -3912,18 +3921,19 @@ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
 	return EXIT_FASTPATH_NONE;
 }
 
-static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
+static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_intercepted)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
 	guest_state_enter_irqoff();
 
 	if (sev_es_guest(vcpu->kvm)) {
-		__svm_sev_es_vcpu_run(svm);
+		__svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted);
 	} else {
 		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
-		__svm_vcpu_run(svm, __sme_page_pa(sd->save_area));
+		__svm_vcpu_run(svm, __sme_page_pa(sd->save_area),
+			       spec_ctrl_intercepted);
 	}
 
 	guest_state_exit_irqoff();
@@ -3932,6 +3942,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+	bool spec_ctrl_intercepted = msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);
 
 	trace_kvm_entry(vcpu);
 
@@ -3990,26 +4001,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 	if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
 		x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
 
-	svm_vcpu_enter_exit(vcpu);
-
-	/*
-	 * We do not use IBRS in the kernel. If this vCPU has used the
-	 * SPEC_CTRL MSR it may have left it on; save the value and
-	 * turn it off. This is much more efficient than blindly adding
-	 * it to the atomic save/restore list. Especially as the former
-	 * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
-	 *
-	 * For non-nested case:
-	 * If the L01 MSR bitmap does not intercept the MSR, then we need to
-	 * save it.
-	 *
-	 * For nested case:
-	 * If the L02 MSR bitmap does not intercept the MSR, then we need to
-	 * save it.
-	 */
-	if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL) &&
-	    unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
-		svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
+	svm_vcpu_enter_exit(vcpu, spec_ctrl_intercepted);
 
 	if (!sev_es_guest(vcpu->kvm))
 		reload_tss(vcpu);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 99410651f2a5..9d940d8736f0 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -483,7 +483,7 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 
 /* vmenter.S */
 
-void __svm_sev_es_vcpu_run(struct vcpu_svm *svm);
-void __svm_vcpu_run(struct vcpu_svm *svm, unsigned long hsave_pa);
+void __svm_sev_es_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted);
+void __svm_vcpu_run(struct vcpu_svm *svm, unsigned long hsave_pa, bool spec_ctrl_intercepted);
 
 #endif
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 45a4bd002494..9e381386ffdc 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -32,10 +32,64 @@
 
 .section .noinstr.text, "ax"
 
+.macro RESTORE_GUEST_SPEC_CTRL
+	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
+	ALTERNATIVE_2 "jmp 999f", \
+		"", X86_FEATURE_MSR_SPEC_CTRL, \
+		"jmp 999f", X86_FEATURE_V_SPEC_CTRL
+
+	/*
+	 * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
+	 * host's, write the MSR.
+	 *
+	 * IMPORTANT: To avoid RSB underflow attacks and any other nastiness,
+	 * there must not be any returns or indirect branches between this code
+	 * and vmentry.
+	 */
+	movl SVM_spec_ctrl(%_ASM_DI), %eax
+	cmp PER_CPU_VAR(x86_spec_ctrl_current), %eax
+	je 999f
+	mov $MSR_IA32_SPEC_CTRL, %ecx
+	xor %edx, %edx
+	wrmsr
+999:
+
+.endm
+
+.macro RESTORE_HOST_SPEC_CTRL
+	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
+	ALTERNATIVE_2 "jmp 999f", \
+		"", X86_FEATURE_MSR_SPEC_CTRL, \
+		"jmp 999f", X86_FEATURE_V_SPEC_CTRL
+
+	mov $MSR_IA32_SPEC_CTRL, %ecx
+
+	/*
+	 * Load the value that the guest had written into MSR_IA32_SPEC_CTRL,
+	 * if it was not intercepted during guest execution.
+	 */
+	cmpb $0, (%_ASM_SP)
+	jnz 998f
+	rdmsr
+	movl %eax, SVM_spec_ctrl(%_ASM_DI)
+998:
+
+	/* Now restore the host value of the MSR if different from the guest's.  */
+	movl PER_CPU_VAR(x86_spec_ctrl_current), %eax
+	cmp SVM_spec_ctrl(%_ASM_DI), %eax
+	je 999f
+	xor %edx, %edx
+	wrmsr
+999:
+
+.endm
+
+
 /**
  * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
  * @svm:	struct vcpu_svm *
  * @hsave_pa:	unsigned long
+ * @spec_ctrl_intercepted: bool
  */
 SYM_FUNC_START(__svm_vcpu_run)
 	push %_ASM_BP
@@ -50,7 +104,12 @@ SYM_FUNC_START(__svm_vcpu_run)
 #endif
 	push %_ASM_BX
 
-	/* @hsave_pa is needed last after vmexit, save it first.  */
+	/*
+	 * Both @spec_ctrl_intercepted and @hsave_pa are used only after vmexit.
+	 * @spec_ctrl_intercepted is needed later and accessed directly from
+	 * the stack in RESTORE_HOST_SPEC_CTRL, so save it first.
+	 */
+	push %_ASM_ARG3
 	push %_ASM_ARG2
 
 	/* Save @svm. */
@@ -61,6 +120,8 @@ SYM_FUNC_START(__svm_vcpu_run)
 	mov %_ASM_ARG1, %_ASM_DI
 .endif
 
+	RESTORE_GUEST_SPEC_CTRL
+
 	/*
 	 * Use a single vmcb (vmcb01 because it's always valid) for
 	 * context switching guest state via VMLOAD/VMSAVE, that way
@@ -138,6 +199,8 @@ SYM_FUNC_START(__svm_vcpu_run)
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
 #endif
 
+	RESTORE_HOST_SPEC_CTRL
+
 	/*
 	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
 	 * untrained as soon as we exit the VM and are back to the
@@ -173,6 +236,9 @@ SYM_FUNC_START(__svm_vcpu_run)
 	xor %r15d, %r15d
 #endif
 
+	/* "Pop" @spec_ctrl_intercepted.  */
+	pop %_ASM_BX
+
 	pop %_ASM_BX
 
 #ifdef CONFIG_X86_64
@@ -210,6 +276,7 @@ SYM_FUNC_END(__svm_vcpu_run)
 /**
  * __svm_sev_es_vcpu_run - Run a SEV-ES vCPU via a transition to SVM guest mode
  * @svm:	struct vcpu_svm *
+ * @spec_ctrl_intercepted: bool
  */
 SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	push %_ASM_BP
@@ -224,8 +291,21 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 #endif
 	push %_ASM_BX
 
+	/* Save @spec_ctrl_intercepted for RESTORE_HOST_SPEC_CTRL. */
+	push %_ASM_ARG2
+
+	/* Save @svm. */
+	push %_ASM_ARG1
+
+.ifnc _ASM_ARG1, _ASM_DI
+	/* Move @svm to RDI for RESTORE_GUEST_SPEC_CTRL. */
+	mov %_ASM_ARG1, %_ASM_DI
+.endif
+
+	RESTORE_GUEST_SPEC_CTRL
+
 	/* Get svm->current_vmcb->pa into RAX. */
-	mov SVM_current_vmcb(%_ASM_ARG1), %_ASM_AX
+	mov SVM_current_vmcb(%_ASM_DI), %_ASM_AX
 	mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX
 
 	/* Enter guest mode */
@@ -235,11 +315,16 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 
 2:	cli
 
+	/* Pop @svm to RDI, guest registers have been saved already. */
+	pop %_ASM_DI
+
 #ifdef CONFIG_RETPOLINE
 	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
 #endif
 
+	RESTORE_HOST_SPEC_CTRL
+
 	/*
 	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
 	 * untrained as soon as we exit the VM and are back to the
@@ -249,6 +334,9 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	 */
 	UNTRAIN_RET
 
+	/* "Pop" @spec_ctrl_intercepted.  */
+	pop %_ASM_BX
+
 	pop %_ASM_BX
 
 #ifdef CONFIG_X86_64
-- 
2.31.1




* [PATCH 8/8] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers
  2022-11-07 14:54 [PATCH 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (6 preceding siblings ...)
  2022-11-07 14:54 ` [PATCH 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
@ 2022-11-07 14:54 ` Paolo Bonzini
  2022-11-07 15:33 ` [PATCH 0/8] KVM: SVM: fixes for vmentry code Peter Zijlstra
  8 siblings, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-07 14:54 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson, seanjc

x86_virt_spec_ctrl only deals with the paravirtualized
MSR_IA32_VIRT_SPEC_CTRL now and does not handle MSR_IA32_SPEC_CTRL
anymore; remove the corresponding, unused argument.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/spec-ctrl.h | 10 +++++-----
 arch/x86/kernel/cpu/bugs.c       |  2 +-
 arch/x86/kvm/svm/svm.c           |  4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h
index 5393babc0598..cb0386fc4dc3 100644
--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -13,7 +13,7 @@
  * Takes the guest view of SPEC_CTRL MSR as a parameter and also
  * the guest's version of VIRT_SPEC_CTRL, if emulated.
  */
-extern void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool guest);
+extern void x86_virt_spec_ctrl(u64 guest_virt_spec_ctrl, bool guest);
 
 /**
  * x86_spec_ctrl_set_guest - Set speculation control registers for the guest
@@ -24,9 +24,9 @@ extern void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bo
  * Avoids writing to the MSR if the content/bits are the same
  */
 static inline
-void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
+void x86_spec_ctrl_set_guest(u64 guest_virt_spec_ctrl)
 {
-	x86_virt_spec_ctrl(guest_spec_ctrl, guest_virt_spec_ctrl, true);
+	x86_virt_spec_ctrl(guest_virt_spec_ctrl, true);
 }
 
 /**
@@ -38,9 +38,9 @@ void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
  * Avoids writing to the MSR if the content/bits are the same
  */
 static inline
-void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
+void x86_spec_ctrl_restore_host(u64 guest_virt_spec_ctrl)
 {
-	x86_virt_spec_ctrl(guest_spec_ctrl, guest_virt_spec_ctrl, false);
+	x86_virt_spec_ctrl(guest_virt_spec_ctrl, false);
 }
 
 /* AMD specific Speculative Store Bypass MSR data */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 6ec0b7ce7453..3e3230cccaa7 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -200,7 +200,7 @@ void __init check_bugs(void)
  * MSR_IA32_SPEC_CTRL for SSBD.
  */
 void
-x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
+x86_virt_spec_ctrl(u64 guest_virt_spec_ctrl, bool setguest)
 {
 	u64 guestval, hostval;
 	struct thread_info *ti = current_thread_info();
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 31aa158a2e10..e95684cbc194 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3999,7 +3999,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 	 * being speculatively taken.
 	 */
 	if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
-		x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
+		x86_spec_ctrl_set_guest(svm->virt_spec_ctrl);
 
 	svm_vcpu_enter_exit(vcpu, spec_ctrl_intercepted);
 
@@ -4007,7 +4007,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 		reload_tss(vcpu);
 
 	if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
-		x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl);
+		x86_spec_ctrl_restore_host(svm->virt_spec_ctrl);
 
 	if (!sev_es_guest(vcpu->kvm)) {
 		vcpu->arch.cr2 = svm->vmcb->save.cr2;
-- 
2.31.1



* Re: [PATCH 4/8] KVM: SVM: move guest vmsave/vmload to assembly
  2022-11-07 14:54 ` [PATCH 4/8] KVM: SVM: move guest vmsave/vmload to assembly Paolo Bonzini
@ 2022-11-07 15:23   ` Peter Zijlstra
  2022-11-07 15:40     ` Paolo Bonzini
  2022-11-07 15:32   ` Andrew Cooper
  1 sibling, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2022-11-07 15:23 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	jmattson, seanjc, stable

On Mon, Nov 07, 2022 at 09:54:32AM -0500, Paolo Bonzini wrote:
> @@ -56,6 +59,16 @@ SYM_FUNC_START(__svm_vcpu_run)
>  	/* Move @svm to RDI. */
>  	mov %_ASM_ARG2, %_ASM_DI
>  
> +	/*
> +	 * Use a single vmcb (vmcb01 because it's always valid) for
> +	 * context switching guest state via VMLOAD/VMSAVE, that way
> +	 * the state doesn't need to be copied between vmcb01 and
> +	 * vmcb02 when switching vmcbs for nested virtualization.
> +	 */
> +	mov SVM_vmcb01_pa(%_ASM_DI), %_ASM_AX
> +1:	vmload %_ASM_AX
> +2:
> +
>  	/* "POP" @vmcb to RAX. */
>  	pop %_ASM_AX
>  
> @@ -80,16 +93,11 @@ SYM_FUNC_START(__svm_vcpu_run)
>  	/* Enter guest mode */
>  	sti
>  
> +3:	vmrun %_ASM_AX
> +4:
> +	cli
>  
> +	/* Pop @svm to RAX while it's the only available register. */
>  	pop %_ASM_AX
>  
>  	/* Save all guest registers.  */

So Andrew noted that once the vmload has executed, any exception taken
(say at 3) will crash and burn because %gs is scribbled.

Might be good to make a record of this in the code so it can be cleaned
up some day.

> @@ -159,11 +179,19 @@ SYM_FUNC_START(__svm_vcpu_run)
>  	pop %_ASM_BP
>  	RET
>  
> +10:	cmpb $0, kvm_rebooting
>  	jne 2b
>  	ud2
> +30:	cmpb $0, kvm_rebooting
> +	jne 4b
> +	ud2
> +50:	cmpb $0, kvm_rebooting
> +	jne 6b
> +	ud2
>  
> +	_ASM_EXTABLE(1b, 10b)
> +	_ASM_EXTABLE(3b, 30b)
> +	_ASM_EXTABLE(5b, 50b)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/8] KVM: SVM: move guest vmsave/vmload to assembly
  2022-11-07 14:54 ` [PATCH 4/8] KVM: SVM: move guest vmsave/vmload to assembly Paolo Bonzini
  2022-11-07 15:23   ` Peter Zijlstra
@ 2022-11-07 15:32   ` Andrew Cooper
  2022-11-07 15:37     ` Paolo Bonzini
  1 sibling, 1 reply; 25+ messages in thread
From: Andrew Cooper @ 2022-11-07 15:32 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: nathan, thomas.lendacky, peterz, jmattson, seanjc, stable, Andrew Cooper

On 07/11/2022 14:54, Paolo Bonzini wrote:
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 4cfa62e66a0e..ae65cdcab660 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3924,16 +3924,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
>  	} else {
>  		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
>  
> -		/*
> -		 * Use a single vmcb (vmcb01 because it's always valid) for
> -		 * context switching guest state via VMLOAD/VMSAVE, that way
> -		 * the state doesn't need to be copied between vmcb01 and
> -		 * vmcb02 when switching vmcbs for nested virtualization.
> -		 */
> -		vmload(svm->vmcb01.pa);
>  		__svm_vcpu_run(vmcb_pa, svm);
> -		vmsave(svm->vmcb01.pa);
> -
>  		vmload(__sme_page_pa(sd->save_area));

%gs is still the guest's until this vmload has completed.  It needs to
move down into asm too.

~Andrew

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/8] KVM: SVM: fixes for vmentry code
  2022-11-07 14:54 [PATCH 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (7 preceding siblings ...)
  2022-11-07 14:54 ` [PATCH 8/8] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers Paolo Bonzini
@ 2022-11-07 15:33 ` Peter Zijlstra
  8 siblings, 0 replies; 25+ messages in thread
From: Peter Zijlstra @ 2022-11-07 15:33 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	jmattson, seanjc

On Mon, Nov 07, 2022 at 09:54:28AM -0500, Paolo Bonzini wrote:

> Paolo Bonzini (8):
>   KVM: SVM: extract VMCB accessors to a new file
>   KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm
>   KVM: SVM: adjust register allocation for __svm_vcpu_run
>   KVM: SVM: move guest vmsave/vmload to assembly
>   KVM: SVM: retrieve VMCB from assembly
>   KVM: SVM: restore host save area from assembly
>   KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
>   x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and
>     callers

Nice!

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/8] KVM: SVM: move guest vmsave/vmload to assembly
  2022-11-07 15:32   ` Andrew Cooper
@ 2022-11-07 15:37     ` Paolo Bonzini
  2022-11-07 15:47       ` Andrew Cooper
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-07 15:37 UTC (permalink / raw)
  To: Andrew Cooper, linux-kernel, kvm
  Cc: nathan, thomas.lendacky, peterz, jmattson, seanjc, stable

On 11/7/22 16:32, Andrew Cooper wrote:
>> -		vmload(svm->vmcb01.pa);
>>   		__svm_vcpu_run(vmcb_pa, svm);
>> -		vmsave(svm->vmcb01.pa);
>> -
>>   		vmload(__sme_page_pa(sd->save_area));
>
> %gs is still the guest's until this vmload has completed.  It needs to
> move down into asm too.

Sure, that's patch 6 in the series.  See also cover letter: "this means 
moving guest vmload/vmsave and host vmload to assembly".

Paolo


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/8] KVM: SVM: move guest vmsave/vmload to assembly
  2022-11-07 15:23   ` Peter Zijlstra
@ 2022-11-07 15:40     ` Paolo Bonzini
  0 siblings, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-07 15:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	jmattson, seanjc, stable

On 11/7/22 16:23, Peter Zijlstra wrote:
>>   
>> +3:	vmrun %_ASM_AX
>> +4:
>> +	cli
>>   
>> +	/* Pop @svm to RAX while it's the only available register. */
>>   	pop %_ASM_AX
>>   
>>   	/* Save all guest registers.  */
> So Andrew noted that once the vmload has executed any exception taken
> (say at 3) will crash and burn because %gs is scribbled.
> 
> Might be good to make a record of this in the code so it can be cleaned
> up some day.
> 

Yeah, it won't happen because clgi/stgi blocks setting kvm_rebooting, so
I thought of killing the three exception fixups after the first.  In the
end I kept them for simplicity and to keep the normal/SEV-ES versions as
similar as possible.

Paolo


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/8] KVM: SVM: move guest vmsave/vmload to assembly
  2022-11-07 15:37     ` Paolo Bonzini
@ 2022-11-07 15:47       ` Andrew Cooper
  0 siblings, 0 replies; 25+ messages in thread
From: Andrew Cooper @ 2022-11-07 15:47 UTC (permalink / raw)
  To: Paolo Bonzini, linux-kernel, kvm
  Cc: nathan, thomas.lendacky, peterz, jmattson, seanjc, stable

On 07/11/2022 15:37, Paolo Bonzini wrote:
> On 11/7/22 16:32, Andrew Cooper wrote:
>>> -        vmload(svm->vmcb01.pa);
>>>           __svm_vcpu_run(vmcb_pa, svm);
>>> -        vmsave(svm->vmcb01.pa);
>>> -
>>>           vmload(__sme_page_pa(sd->save_area));
>>
>> %gs is still the guest's until this vmload has completed.  It needs to
>> move down into asm too.
>
> Sure, that's patch 6 in the series.  See also cover letter: "this
> means moving guest vmload/vmsave and host vmload to assembly".

Oh, ok.  I missed that it was split across two patches.

Sorry for the noise.  The end result looks ok.

~Andrew

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/8] KVM: SVM: extract VMCB accessors to a new file
  2022-11-07 14:54 ` [PATCH 1/8] KVM: SVM: extract VMCB accessors to a new file Paolo Bonzini
@ 2022-11-07 17:08   ` Sean Christopherson
  2022-11-07 17:36     ` Paolo Bonzini
  0 siblings, 1 reply; 25+ messages in thread
From: Sean Christopherson @ 2022-11-07 17:08 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On Mon, Nov 07, 2022, Paolo Bonzini wrote:
> Having inline functions confuses the compilation of asm-offsets.c,
> which cannot find kvm_cache_regs.h because arch/x86/kvm is not in
> asm-offset.c's include path.  Just extract the functions to a
> new file.
> 
> No functional change intended.
> 
> Cc: stable@vger.kernel.org
> Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/svm/avic.c         |   1 +
>  arch/x86/kvm/svm/nested.c       |   1 +
>  arch/x86/kvm/svm/sev.c          |   1 +
>  arch/x86/kvm/svm/svm.c          |   1 +
>  arch/x86/kvm/svm/svm.h          | 200 ------------------------------
>  arch/x86/kvm/svm/svm_onhyperv.c |   1 +
>  arch/x86/kvm/svm/vmcb.h         | 211 ++++++++++++++++++++++++++++++++

I don't think vmcb.h is a good name.  The logical inclusion sequence would be for
svm.h to include vmcb.h, since SVM requires knowledge about VMCBs; here, however,
vmcb.h needs to include svm.h in order to dereference "struct vcpu_svm".

Unlike VMX's vmcs.h, the new file isn't a "pure" VMCB helper, it also holds a
decent amount of KVM's SVM logic.

What about making KVM self-sufficient?  The includes in asm-offsets.c are quite
ugly

 #include "../kvm/vmx/vmx.h"
 #include "../kvm/svm/svm.h"

or as a stopgap to make backporting easier, just include kvm_cache_regs.h?

>  void svm_leave_nested(struct kvm_vcpu *vcpu);
> diff --git a/arch/x86/kvm/svm/svm_onhyperv.c b/arch/x86/kvm/svm/svm_onhyperv.c
> index 8cdc62c74a96..ae0a101329e6 100644
> --- a/arch/x86/kvm/svm/svm_onhyperv.c
> +++ b/arch/x86/kvm/svm/svm_onhyperv.c
> @@ -8,6 +8,7 @@
>  #include <asm/mshyperv.h>
>  
>  #include "svm.h"
> +#include "vmcb.h"
>  #include "svm_ops.h"
>  
>  #include "hyperv.h"
> diff --git a/arch/x86/kvm/svm/vmcb.h b/arch/x86/kvm/svm/vmcb.h
> new file mode 100644
> index 000000000000..8757cda27e3a
> --- /dev/null
> +++ b/arch/x86/kvm/svm/vmcb.h
> @@ -0,0 +1,211 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Kernel-based Virtual Machine driver for Linux
> + *
> + * AMD SVM support - VMCB accessors
> + */
> +
> +#ifndef __SVM_VMCB_H
> +#define __SVM_VMCB_H
> +
> +#include "kvm_cache_regs.h"

This should include "svm.h" instead of relying on the parent to include said file.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm
  2022-11-07 14:54 ` [PATCH 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm Paolo Bonzini
@ 2022-11-07 17:10   ` Sean Christopherson
  2022-11-07 17:22     ` Paolo Bonzini
  0 siblings, 1 reply; 25+ messages in thread
From: Sean Christopherson @ 2022-11-07 17:10 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On Mon, Nov 07, 2022, Paolo Bonzini wrote:
> diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
> index 723f8534986c..8fac744361e5 100644
> --- a/arch/x86/kvm/svm/vmenter.S
> +++ b/arch/x86/kvm/svm/vmenter.S

Needs to include asm/asm-offsets.h, otherwise the compiler may think that
SVM_vcpu_arch_regs is a symbol.

  ERROR: modpost: "SVM_vcpu_arch_regs" [arch/x86/kvm/kvm-amd.ko] undefined!

diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 8fac744361e5..8d0b0781462e 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include <linux/linkage.h>
 #include <asm/asm.h>
+#include <asm/asm-offsets.h>
 #include <asm/bitsperlong.h>
 #include <asm/kvm_vcpu_regs.h>
 #include <asm/nospec-branch.h>

> @@ -8,23 +8,23 @@
>  #define WORD_SIZE (BITS_PER_LONG / 8)
>  
>  /* Intentionally omit RAX as it's context switched by hardware */
> -#define VCPU_RCX	__VCPU_REGS_RCX * WORD_SIZE
> -#define VCPU_RDX	__VCPU_REGS_RDX * WORD_SIZE
> -#define VCPU_RBX	__VCPU_REGS_RBX * WORD_SIZE
> +#define VCPU_RCX	(SVM_vcpu_arch_regs + __VCPU_REGS_RCX * WORD_SIZE)
> +#define VCPU_RDX	(SVM_vcpu_arch_regs + __VCPU_REGS_RDX * WORD_SIZE)
> +#define VCPU_RBX	(SVM_vcpu_arch_regs + __VCPU_REGS_RBX * WORD_SIZE)
>  /* Intentionally omit RSP as it's context switched by hardware */
> -#define VCPU_RBP	__VCPU_REGS_RBP * WORD_SIZE
> -#define VCPU_RSI	__VCPU_REGS_RSI * WORD_SIZE
> -#define VCPU_RDI	__VCPU_REGS_RDI * WORD_SIZE
> +#define VCPU_RBP	(SVM_vcpu_arch_regs + __VCPU_REGS_RBP * WORD_SIZE)
> +#define VCPU_RSI	(SVM_vcpu_arch_regs + __VCPU_REGS_RSI * WORD_SIZE)
> +#define VCPU_RDI	(SVM_vcpu_arch_regs + __VCPU_REGS_RDI * WORD_SIZE)
>  
>  #ifdef CONFIG_X86_64
> -#define VCPU_R8		__VCPU_REGS_R8  * WORD_SIZE
> -#define VCPU_R9		__VCPU_REGS_R9  * WORD_SIZE
> -#define VCPU_R10	__VCPU_REGS_R10 * WORD_SIZE
> -#define VCPU_R11	__VCPU_REGS_R11 * WORD_SIZE
> -#define VCPU_R12	__VCPU_REGS_R12 * WORD_SIZE
> -#define VCPU_R13	__VCPU_REGS_R13 * WORD_SIZE
> -#define VCPU_R14	__VCPU_REGS_R14 * WORD_SIZE
> -#define VCPU_R15	__VCPU_REGS_R15 * WORD_SIZE
> +#define VCPU_R8		(SVM_vcpu_arch_regs + __VCPU_REGS_R8  * WORD_SIZE)
> +#define VCPU_R9		(SVM_vcpu_arch_regs + __VCPU_REGS_R9  * WORD_SIZE)
> +#define VCPU_R10	(SVM_vcpu_arch_regs + __VCPU_REGS_R10 * WORD_SIZE)
> +#define VCPU_R11	(SVM_vcpu_arch_regs + __VCPU_REGS_R11 * WORD_SIZE)
> +#define VCPU_R12	(SVM_vcpu_arch_regs + __VCPU_REGS_R12 * WORD_SIZE)
> +#define VCPU_R13	(SVM_vcpu_arch_regs + __VCPU_REGS_R13 * WORD_SIZE)
> +#define VCPU_R14	(SVM_vcpu_arch_regs + __VCPU_REGS_R14 * WORD_SIZE)
> +#define VCPU_R15	(SVM_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
>  #endif

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm
  2022-11-07 17:10   ` Sean Christopherson
@ 2022-11-07 17:22     ` Paolo Bonzini
  0 siblings, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-07 17:22 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On 11/7/22 18:10, Sean Christopherson wrote:
> Needs to include asm/asm-offsets.h, otherwise the compiler may think that
> SVM_vcpu_arch_regs is a symbol.
> 
>    ERROR: modpost: "SVM_vcpu_arch_regs" [arch/x86/kvm/kvm-amd.ko] undefined!
> 
> diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
> index 8fac744361e5..8d0b0781462e 100644
> --- a/arch/x86/kvm/svm/vmenter.S
> +++ b/arch/x86/kvm/svm/vmenter.S
> @@ -1,6 +1,7 @@
>   /* SPDX-License-Identifier: GPL-2.0 */
>   #include <linux/linkage.h>
>   #include <asm/asm.h>
> +#include <asm/asm-offsets.h>
>   #include <asm/bitsperlong.h>
>   #include <asm/kvm_vcpu_regs.h>
>   #include <asm/nospec-branch.h>

Yeah, it's included slightly later (I did test each patch independently, 
but I'm not sure how it ended up disappearing from this one).

Paolo


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/8] KVM: SVM: extract VMCB accessors to a new file
  2022-11-07 17:08   ` Sean Christopherson
@ 2022-11-07 17:36     ` Paolo Bonzini
  2022-11-07 18:14       ` Sean Christopherson
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-07 17:36 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On 11/7/22 18:08, Sean Christopherson wrote:
> On Mon, Nov 07, 2022, Paolo Bonzini wrote:
>> Having inline functions confuses the compilation of asm-offsets.c,
>> which cannot find kvm_cache_regs.h because arch/x86/kvm is not in
>> asm-offset.c's include path.  Just extract the functions to a
>> new file.
>>
>> No functional change intended.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>   arch/x86/kvm/svm/avic.c         |   1 +
>>   arch/x86/kvm/svm/nested.c       |   1 +
>>   arch/x86/kvm/svm/sev.c          |   1 +
>>   arch/x86/kvm/svm/svm.c          |   1 +
>>   arch/x86/kvm/svm/svm.h          | 200 ------------------------------
>>   arch/x86/kvm/svm/svm_onhyperv.c |   1 +
>>   arch/x86/kvm/svm/vmcb.h         | 211 ++++++++++++++++++++++++++++++++
> 
> I don't think vmcb.h is a good name.  The logical inclusion sequence would be for
> svm.h to include vmcb.h, e.g. SVM requires knowledge about VMCBs, but this requires
> vmcb.h to include svm.h to dereference "struct vcpu_svm".
> Unlike VMX's vmcs.h, the new file isn't a "pure" VMCB helper, it also holds a
> decent amount of KVM's SVM logic.

Yes, it's basically the wrappers that KVM uses to access the VMCB fields.

> What about making KVM self-sufficient?

You mean having a different asm-offsets.h file just for arch/x86/kvm/?

> The includes in asm-offsets.c are quite ugly
> 
>   #include "../kvm/vmx/vmx.h"
>   #include "../kvm/svm/svm.h"
> 
> or as a stopgap to make backporting easier, just include kvm_cache_regs.h?

The problem is that the _existing_ include of kvm_cache_regs.h in svm.h 
fails, with

arch/x86/kernel/../kvm/svm/svm.h:25:10: fatal error: kvm_cache_regs.h: 
No such file or directory
    25 | #include "kvm_cache_regs.h"
       |          ^~~~~~~~~~~~~~~~~~
compilation terminated.

The other two solutions here are:

1) move kvm_cache_regs.h to arch/x86/include/asm/ so it can be included 
normally

2) extract the structs to arch/x86/kvm/svm/svm_types.h and include that 
from asm-offsets.h, basically the opposite of this patch.

(2) is my preference if having a different asm-offsets.h file turns out 
to be too complex.  We can do the same for VMX as well.
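
For illustration, option (2) could look roughly like this (the file name,
guard and exact split are hypothetical, not an actual patch):

/* arch/x86/kvm/svm/svm_types.h (hypothetical): plain struct definitions, no inline code */
#ifndef __SVM_TYPES_H
#define __SVM_TYPES_H

#include <linux/kvm_host.h>

struct kvm_vmcb_info {
	struct vmcb *ptr;
	unsigned long pa;
	/* ... */
};

struct vcpu_svm {
	struct kvm_vcpu vcpu;
	/* ... rest of the definition moved here unchanged ... */
};

#endif

svm.h would then include svm_types.h and keep kvm_cache_regs.h together
with the inline accessors, while the asm-offsets code would only need

	#include "../kvm/svm/svm_types.h"

and never see kvm_cache_regs.h at all.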

Paolo

>>   void svm_leave_nested(struct kvm_vcpu *vcpu);
>> diff --git a/arch/x86/kvm/svm/svm_onhyperv.c b/arch/x86/kvm/svm/svm_onhyperv.c
>> index 8cdc62c74a96..ae0a101329e6 100644
>> --- a/arch/x86/kvm/svm/svm_onhyperv.c
>> +++ b/arch/x86/kvm/svm/svm_onhyperv.c
>> @@ -8,6 +8,7 @@
>>   #include <asm/mshyperv.h>
>>   
>>   #include "svm.h"
>> +#include "vmcb.h"
>>   #include "svm_ops.h"
>>   
>>   #include "hyperv.h"
>> diff --git a/arch/x86/kvm/svm/vmcb.h b/arch/x86/kvm/svm/vmcb.h
>> new file mode 100644
>> index 000000000000..8757cda27e3a
>> --- /dev/null
>> +++ b/arch/x86/kvm/svm/vmcb.h
>> @@ -0,0 +1,211 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Kernel-based Virtual Machine driver for Linux
>> + *
>> + * AMD SVM support - VMCB accessors
>> + */
>> +
>> +#ifndef __SVM_VMCB_H
>> +#define __SVM_VMCB_H
>> +
>> +#include "kvm_cache_regs.h"
> 
> This should include "svm.h" instead of relying on the parent to include said file.
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/8] KVM: SVM: extract VMCB accessors to a new file
  2022-11-07 17:36     ` Paolo Bonzini
@ 2022-11-07 18:14       ` Sean Christopherson
  2022-11-07 18:51         ` Paolo Bonzini
  2022-11-08  8:52         ` Paolo Bonzini
  0 siblings, 2 replies; 25+ messages in thread
From: Sean Christopherson @ 2022-11-07 18:14 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On Mon, Nov 07, 2022, Paolo Bonzini wrote:
> On 11/7/22 18:08, Sean Christopherson wrote:
> > What about making KVM self-sufficient?
> 
> You mean having a different asm-offsets.h file just for arch/x86/kvm/?

Yeah.

> > The includes in asm-offsets.c are quite ugly
> > 
> >   #include "../kvm/vmx/vmx.h"
> >   #include "../kvm/svm/svm.h"
> > 
> > or as a stopgap to make backporting easier, just include kvm_cache_regs.h?
> 
> The problem is that the _existing_ include of kvm_cache_regs.h in svm.h
> fails, with
> 
> arch/x86/kernel/../kvm/svm/svm.h:25:10: fatal error: kvm_cache_regs.h: No
> such file or directory
>    25 | #include "kvm_cache_regs.h"
>       |          ^~~~~~~~~~~~~~~~~~
> compilation terminated.

Duh.  Now the changelog makes more sense...

> The other two solutions here are:
> 
> 1) move kvm_cache_regs.h to arch/x86/include/asm/ so it can be included
> normally
> 
> 2) extract the structs to arch/x86/kvm/svm/svm_types.h and include that from
> asm-offsets.h, basically the opposite of this patch.
> 
> (2) is my preference if having a different asm-offsets.h file turns out to
> be too complex.  We can do the same for VMX as well.

What about adding dedicated structs to hold the non-regs params for VM-Enter and
VMRUN?  Grabbing stuff willy-nilly in the assembly code makes the flows difficult
to read as there's nothing in the C code that describes what fields are actually
used.  And due to 32-bit's restriction on the number of params, maintaining the
ad hoc approach will be annoying as passing in new info will require shuffling,
and any KVM refactorings will need updates to asm-offsets.c, e.g. "spec_ctrl"
should really live in "struct kvm_vcpu" since it's common to both AMD and Intel.
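
(Purely for illustration, a hypothetical layout inferred from the usage in
the sketch further down -- not an actual proposal:

/* hypothetical sketch; the real layout is whatever the asm code ends up needing */
struct kvm_vmrun_params {
	unsigned long *regs;		/* vcpu->arch.regs */
	struct vmcb *vmcb01;		/* guest state for VMLOAD/VMSAVE */
	struct vmcb *vmcb;		/* current VMCB, fed to VMRUN */
	unsigned long save_area_pa;	/* host save area, non-SEV-ES only */
	u64 spec_ctrl;			/* guest view of IA32_SPEC_CTRL */
	bool spec_ctrl_intercepted;
};

with kvm_vmenter_params carrying just the launched/spec_ctrl subset on the
VMX side.)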

That would also allow fixing the bugs introduced by commit bb06650634d3 ("KVM:
VMX: Convert launched argument to flags").  nested_vmx_check_vmentry_hw() never
fully enters the guest; at worst, it triggers a "VM-Exit on VM-Enter" consistency
check.  Thus there's no need to load the guest's spec control and zero chance that
the guest can write to spec control.

E.g. as a very rough starting point

---
 arch/x86/include/asm/kvm_host.h |  8 ++++++++
 arch/x86/kvm/svm/svm.c          | 13 ++++++++++---
 arch/x86/kvm/svm/svm.h          |  4 ++--
 arch/x86/kvm/vmx/nested.c       |  8 ++++++--
 arch/x86/kvm/vmx/run_flags.h    |  8 --------
 arch/x86/kvm/vmx/vmx.c          | 32 +++++++++-----------------------
 arch/x86/kvm/vmx/vmx.h          |  4 +---
 7 files changed, 36 insertions(+), 41 deletions(-)
 delete mode 100644 arch/x86/kvm/vmx/run_flags.h

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 415113dea951..d56fe6151656 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -204,6 +204,14 @@ enum exit_fastpath_completion {
 };
 typedef enum exit_fastpath_completion fastpath_t;
 
+struct kvm_vmrun_params {
+	...
+};
+
+struct kvm_vmenter_params {
+	...
+};
+
 struct x86_emulate_ctxt;
 struct x86_exception;
 union kvm_smram;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 627c126cd9bb..7df9ea3ad3f1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3925,16 +3925,23 @@ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
 static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_intercepted)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+	struct kvm_vmrun_params params = {
+		.regs = vcpu->arch.regs,
+		.vmcb01 = svm->vmcb01.ptr,
+		.vmcb = svm->current_vmcb->ptr,
+		.spec_ctrl = svm->spec_ctrl,
+		.spec_ctrl_intercepted = spec_ctrl_intercepted,
+	};
 
 	guest_state_enter_irqoff();
 
 	if (sev_es_guest(vcpu->kvm)) {
-		__svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted);
+		__svm_sev_es_vcpu_run(&params);
 	} else {
 		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
-		__svm_vcpu_run(svm, __sme_page_pa(sd->save_area),
-			       spec_ctrl_intercepted);
+		params.save_area_pa = __sme_page_pa(sd->save_area);
+		__svm_vcpu_run(vcpu->arch.regs, &params);
 	}
 
 	guest_state_exit_irqoff();
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 9d940d8736f0..bf321b755a15 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -483,7 +483,7 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 
 /* vmenter.S */
 
-void __svm_sev_es_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted);
-void __svm_vcpu_run(struct vcpu_svm *svm, unsigned long hsave_pa, bool spec_ctrl_intercepted);
+void __svm_sev_es_vcpu_run(struct kvm_vmrun_params *params);
+void __svm_vcpu_run(unsigned long *regs, struct kvm_vmrun_params *params);
 
 #endif
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 61a2e551640a..da6c9b8c3a4f 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3058,6 +3058,11 @@ static int nested_vmx_check_guest_state(struct kvm_vcpu *vcpu,
 
 static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
 {
+	struct kvm_vmenter_params params = {
+		.launched = vmx->loaded_vmcs->launched,
+		.spec_ctrl = this_cpu_read(x86_spec_ctrl_current),
+		.spec_ctrl_intercepted = true,
+	};
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	unsigned long cr3, cr4;
 	bool vm_fail;
@@ -3094,8 +3099,7 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
 		vmx->loaded_vmcs->host_state.cr4 = cr4;
 	}
 
-	vm_fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
-				 __vmx_vcpu_run_flags(vmx));
+	vm_fail = __vmx_vcpu_run(vcpu->arch.regs, &params);
 
 	if (vmx->msr_autoload.host.nr)
 		vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
diff --git a/arch/x86/kvm/vmx/run_flags.h b/arch/x86/kvm/vmx/run_flags.h
deleted file mode 100644
index edc3f16cc189..000000000000
--- a/arch/x86/kvm/vmx/run_flags.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __KVM_X86_VMX_RUN_FLAGS_H
-#define __KVM_X86_VMX_RUN_FLAGS_H
-
-#define VMX_RUN_VMRESUME	(1 << 0)
-#define VMX_RUN_SAVE_SPEC_CTRL	(1 << 1)
-
-#endif /* __KVM_X86_VMX_RUN_FLAGS_H */
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 05a747c9a9ff..307380cd2000 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -847,24 +847,6 @@ static bool msr_write_intercepted(struct vcpu_vmx *vmx, u32 msr)
 	return vmx_test_msr_bitmap_write(vmx->loaded_vmcs->msr_bitmap, msr);
 }
 
-unsigned int __vmx_vcpu_run_flags(struct vcpu_vmx *vmx)
-{
-	unsigned int flags = 0;
-
-	if (vmx->loaded_vmcs->launched)
-		flags |= VMX_RUN_VMRESUME;
-
-	/*
-	 * If writes to the SPEC_CTRL MSR aren't intercepted, the guest is free
-	 * to change it directly without causing a vmexit.  In that case read
-	 * it after vmexit and store it in vmx->spec_ctrl.
-	 */
-	if (unlikely(!msr_write_intercepted(vmx, MSR_IA32_SPEC_CTRL)))
-		flags |= VMX_RUN_SAVE_SPEC_CTRL;
-
-	return flags;
-}
-
 static __always_inline void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
 		unsigned long entry, unsigned long exit)
 {
@@ -7065,9 +7047,14 @@ static fastpath_t vmx_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
 }
 
 static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
-					struct vcpu_vmx *vmx,
-					unsigned long flags)
+					struct vcpu_vmx *vmx)
 {
+	struct kvm_vmenter_params params = {
+		.launched = vmx->loaded_vmcs->launched,
+		.spec_ctrl = vmx->spec_ctrl,
+		.spec_ctrl_intercepted = msr_write_intercepted(vmx, MSR_IA32_SPEC_CTRL),
+	};
+
 	guest_state_enter_irqoff();
 
 	/* L1D Flush includes CPU buffer clear to mitigate MDS */
@@ -7084,8 +7071,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 	if (vcpu->arch.cr2 != native_read_cr2())
 		native_write_cr2(vcpu->arch.cr2);
 
-	vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
-				   flags);
+	vmx->fail = __vmx_vcpu_run(vcpu->arch.regs, &params);
 
 	vcpu->arch.cr2 = native_read_cr2();
 
@@ -7185,7 +7171,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	kvm_wait_lapic_expire(vcpu);
 
 	/* The actual VMENTER/EXIT is in the .noinstr.text section. */
-	vmx_vcpu_enter_exit(vcpu, vmx, __vmx_vcpu_run_flags(vmx));
+	vmx_vcpu_enter_exit(vcpu, vmx);
 
 	/* All fields are clean at this point */
 	if (static_branch_unlikely(&enable_evmcs)) {
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index a3da84f4ea45..4eb196f88b47 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -421,9 +421,7 @@ struct vmx_uret_msr *vmx_find_uret_msr(struct vcpu_vmx *vmx, u32 msr);
 void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu);
 void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp);
 void vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx, unsigned int flags);
-unsigned int __vmx_vcpu_run_flags(struct vcpu_vmx *vmx);
-bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned long *regs,
-		    unsigned int flags);
+bool __vmx_vcpu_run(unsigned long *regs, struct kvm_vmenter_params *params);
 int vmx_find_loadstore_msr_slot(struct vmx_msrs *m, u32 msr);
 void vmx_ept_load_pdptrs(struct kvm_vcpu *vcpu);
 

base-commit: 0443d79faa4575a5871b54801ed4a36eecce32e3
-- 

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  2022-11-07 14:54 ` [PATCH 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
@ 2022-11-07 18:45   ` Jim Mattson
  2022-11-07 19:08     ` Paolo Bonzini
  0 siblings, 1 reply; 25+ messages in thread
From: Jim Mattson @ 2022-11-07 18:45 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, seanjc, stable

On Mon, Nov 7, 2022 at 6:54 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> Restoration of the host IA32_SPEC_CTRL value is probably too late
> with respect to the return thunk training sequence.
>
> With respect to the user/kernel boundary, AMD says, "If software chooses
> to toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
> exit), software should set STIBP to 1 before executing the return thunk
> training sequence." I assume the same requirements apply to the guest/host
> boundary. The return thunk training sequence is in vmenter.S, quite close
> to the VM-exit. On hosts without V_SPEC_CTRL, however, the host's
> IA32_SPEC_CTRL value is not restored until much later.
>
> To avoid this, move the restoration of host SPEC_CTRL to assembly and,
> for consistency, move the restoration of the guest SPEC_CTRL as well.
> This is not particularly difficult, apart from some care to cover both
> 32- and 64-bit, and to share code between SEV-ES and normal vmentry.
>
> Cc: stable@vger.kernel.org
> Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
> Suggested-by: Jim Mattson <jmattson@google.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kernel/asm-offsets.c |  1 +
>  arch/x86/kernel/cpu/bugs.c    | 13 ++---
>  arch/x86/kvm/svm/svm.c        | 38 ++++++---------
>  arch/x86/kvm/svm/svm.h        |  4 +-
>  arch/x86/kvm/svm/vmenter.S    | 92 ++++++++++++++++++++++++++++++++++-
>  5 files changed, 111 insertions(+), 37 deletions(-)
>
> diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> index 69d1fed51086..d0bd68af0a5a 100644
> --- a/arch/x86/kernel/asm-offsets.c
> +++ b/arch/x86/kernel/asm-offsets.c
> @@ -115,6 +115,7 @@ static void __used common(void)
>                 OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
>                 OFFSET(SVM_vmcb01, vcpu_svm, vmcb01);
>                 OFFSET(SVM_current_vmcb, vcpu_svm, current_vmcb);
> +               OFFSET(SVM_spec_ctrl, vcpu_svm, spec_ctrl);
>                 OFFSET(KVM_VMCB_pa, kvm_vmcb_info, pa);
>         }
>
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index da7c361f47e0..6ec0b7ce7453 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -196,22 +196,15 @@ void __init check_bugs(void)
>  }
>
>  /*
> - * NOTE: This function is *only* called for SVM.  VMX spec_ctrl handling is
> - * done in vmenter.S.
> + * NOTE: This function is *only* called for SVM, since Intel uses
> + * MSR_IA32_SPEC_CTRL for SSBD.
>   */
>  void
>  x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
>  {
> -       u64 msrval, guestval = guest_spec_ctrl, hostval = spec_ctrl_current();
> +       u64 guestval, hostval;
>         struct thread_info *ti = current_thread_info();
>
> -       if (static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) {
> -               if (hostval != guestval) {
> -                       msrval = setguest ? guestval : hostval;
> -                       wrmsrl(MSR_IA32_SPEC_CTRL, msrval);
> -               }
> -       }
> -
>         /*
>          * If SSBD is not handled in MSR_SPEC_CTRL on AMD, update
>          * MSR_AMD64_L2_CFG or MSR_VIRT_SPEC_CTRL if supported.
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 381c7dcffe25..31aa158a2e10 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -731,6 +731,15 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
>         u32 offset;
>         u32 *msrpm;
>
> +       /*
> +        * For non-nested case:
> +        * If the L01 MSR bitmap does not intercept the MSR, then we need to
> +        * save it.
> +        *
> +        * For nested case:
> +        * If the L02 MSR bitmap does not intercept the MSR, then we need to
> +        * save it.
> +        */
>         msrpm = is_guest_mode(vcpu) ? to_svm(vcpu)->nested.msrpm:
>                                       to_svm(vcpu)->msrpm;
>
> @@ -3912,18 +3921,19 @@ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
>         return EXIT_FASTPATH_NONE;
>  }
>
> -static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
> +static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_intercepted)
>  {
>         struct vcpu_svm *svm = to_svm(vcpu);
>
>         guest_state_enter_irqoff();
>
>         if (sev_es_guest(vcpu->kvm)) {
> -               __svm_sev_es_vcpu_run(svm);
> +               __svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted);
>         } else {
>                 struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
>
> -               __svm_vcpu_run(svm, __sme_page_pa(sd->save_area));
> +               __svm_vcpu_run(svm, __sme_page_pa(sd->save_area),
> +                              spec_ctrl_intercepted);
>         }
>
>         guest_state_exit_irqoff();
> @@ -3932,6 +3942,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
>  static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
>  {
>         struct vcpu_svm *svm = to_svm(vcpu);
> +       bool spec_ctrl_intercepted = msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);
>
>         trace_kvm_entry(vcpu);
>
> @@ -3990,26 +4001,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
>         if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
>                 x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
>
> -       svm_vcpu_enter_exit(vcpu);
> -
> -       /*
> -        * We do not use IBRS in the kernel. If this vCPU has used the
> -        * SPEC_CTRL MSR it may have left it on; save the value and
> -        * turn it off. This is much more efficient than blindly adding
> -        * it to the atomic save/restore list. Especially as the former
> -        * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
> -        *
> -        * For non-nested case:
> -        * If the L01 MSR bitmap does not intercept the MSR, then we need to
> -        * save it.
> -        *
> -        * For nested case:
> -        * If the L02 MSR bitmap does not intercept the MSR, then we need to
> -        * save it.
> -        */
> -       if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL) &&
> -           unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
> -               svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
> +       svm_vcpu_enter_exit(vcpu, spec_ctrl_intercepted);
>
>         if (!sev_es_guest(vcpu->kvm))
>                 reload_tss(vcpu);
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 99410651f2a5..9d940d8736f0 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -483,7 +483,7 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
>
>  /* vmenter.S */
>
> -void __svm_sev_es_vcpu_run(struct vcpu_svm *svm);
> -void __svm_vcpu_run(struct vcpu_svm *svm, unsigned long hsave_pa);
> +void __svm_sev_es_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted);
> +void __svm_vcpu_run(struct vcpu_svm *svm, unsigned long hsave_pa, bool spec_ctrl_intercepted);
>
>  #endif
> diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
> index 45a4bd002494..9e381386ffdc 100644
> --- a/arch/x86/kvm/svm/vmenter.S
> +++ b/arch/x86/kvm/svm/vmenter.S
> @@ -32,10 +32,64 @@
>
>  .section .noinstr.text, "ax"
>
> +.macro RESTORE_GUEST_SPEC_CTRL
> +       /* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
> +       ALTERNATIVE_2 "jmp 999f", \
> +               "", X86_FEATURE_MSR_SPEC_CTRL, \
> +               "jmp 999f", X86_FEATURE_V_SPEC_CTRL
> +
> +       /*
> +        * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
> +        * host's, write the MSR.
> +        *
> +        * IMPORTANT: To avoid RSB underflow attacks and any other nastiness,
> +        * there must not be any returns or indirect branches between this code
> +        * and vmentry.
> +        */
> +       movl SVM_spec_ctrl(%_ASM_DI), %eax
> +       cmp PER_CPU_VAR(x86_spec_ctrl_current), %eax
> +       je 999f
> +       mov $MSR_IA32_SPEC_CTRL, %ecx
> +       xor %edx, %edx
> +       wrmsr
> +999:
> +
> +.endm
> +
> +.macro RESTORE_HOST_SPEC_CTRL
> +       /* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
> +       ALTERNATIVE_2 "jmp 999f", \
> +               "", X86_FEATURE_MSR_SPEC_CTRL, \
> +               "jmp 999f", X86_FEATURE_V_SPEC_CTRL
> +
> +       mov $MSR_IA32_SPEC_CTRL, %ecx
> +
> +       /*
> +        * Load the value that the guest had written into MSR_IA32_SPEC_CTRL,
> +        * if it was not intercepted during guest execution.
> +        */
> +       cmpb $0, (%_ASM_SP)
> +       jnz 998f
> +       rdmsr
> +       movl %eax, SVM_spec_ctrl(%_ASM_DI)
> +998:
> +
> +       /* Now restore the host value of the MSR if different from the guest's.  */
> +       movl PER_CPU_VAR(x86_spec_ctrl_current), %eax
> +       cmp SVM_spec_ctrl(%_ASM_DI), %eax
> +       je 999f
> +       xor %edx, %edx
> +       wrmsr
> +999:
> +
> +.endm
> +
> +

It seems unfortunate to have the unconditional branches in the more
common cases.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/8] KVM: SVM: extract VMCB accessors to a new file
  2022-11-07 18:14       ` Sean Christopherson
@ 2022-11-07 18:51         ` Paolo Bonzini
  2022-11-08  8:52         ` Paolo Bonzini
  1 sibling, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-07 18:51 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On 11/7/22 19:14, Sean Christopherson wrote:
> On Mon, Nov 07, 2022, Paolo Bonzini wrote:
>> On 11/7/22 18:08, Sean Christopherson wrote:
>>> What about making KVM self-sufficient?
>>
>> You mean having a different asm-offsets.h file just for arch/x86/kvm/?
> 
> Yeah.

I'll try tomorrow, wish me luck. :)

>> The problem is that the _existing_ include of kvm_cache_regs.h in svm.h
>> fails, with
>>
>> arch/x86/kernel/../kvm/svm/svm.h:25:10: fatal error: kvm_cache_regs.h: No
>> such file or directory
>>     25 | #include "kvm_cache_regs.h"
>>        |          ^~~~~~~~~~~~~~~~~~
>> compilation terminated.
> 
> Duh.  Now the changelog makes more sense...

I'll add that to the commit message, though I also would not mind getting
rid of this patch.

>> The other two solutions here are:
>>
>> 1) move kvm_cache_regs.h to arch/x86/include/asm/ so it can be included
>> normally
>>
>> 2) extract the structs to arch/x86/kvm/svm/svm_types.h and include that from
>> asm-offsets.h, basically the opposite of this patch.
>>
>> (2) is my preference if having a different asm-offsets.h file turns out to
>> be too complex.  We can do the same for VMX as well.
> 
> What about adding dedicated structs to hold the non-regs params for VM-Enter and
> VMRUN?  Grabbing stuff willy-nilly in the assembly code makes the flows difficult
> to read as there's nothing in the C code that describes what fields are actually
> used.

The answer to "what fields are actually used" is, as with any other
function, "potentially all of them: you'll have to read the source code,
and in fact you can just read asm-offsets.c instead".  What I mean is, I
cannot offhand see or remember what fields are touched by
svm_prepare_switch_to_guest either, so why would __svm_vcpu_run be any
different?  And the new SVM assembly code is quite readable overall.

> And due to 32-bit's restriction on the number of params, maintaining the
> ad hoc approach will be annoying as passing in new info will require shuffling,
> and any KVM refactorings will need updates to asm-offsets.c, e.g. "spec_ctrl"
> should really live in "struct kvm_vcpu" since it's common to both AMD and Intel.

So what? :)  We're talking of 2 fields (regs, spec_ctrl) for VMX and 4 
(regs, spec_ctrl, svm->current_vmcb and svm->vmcb01---for simplicity all 
of them) for SVM, and they don't change very often at all.  If 
hypothetically KVM had used similar assembly code from the get go, there 
would have been 3 changes in 15 years: spec_ctrl was added for SSBD, and 
the other two fields correspond to two changes to the nested VMCB 
handling (svm->current_vmcb was first introduced together with vmcb02, 
and later VMSAVE/VMLOAD started always using vmcb01).  That's not too bad.

> That would also allow fixing the bugs introduced by commit bb06650634d3 ("KVM:
> VMX: Convert launched argument to flags").  nested_vmx_check_vmentry_hw() never
> fully enters the guest; at worst, it triggers a "VM-Exit on VM-Enter" consistency
> check.  Thus there's no need to load the guest's spec control and zero chance that
> the guest can write to spec control.

Hmm, I'm not sure how it's related to this series.  Is the problem also 
with the three-argument limit?

Paolo

> E.g. as a very rough starting point
> 
> ---
>   arch/x86/include/asm/kvm_host.h |  8 ++++++++
>   arch/x86/kvm/svm/svm.c          | 13 ++++++++++---
>   arch/x86/kvm/svm/svm.h          |  4 ++--
>   arch/x86/kvm/vmx/nested.c       |  8 ++++++--
>   arch/x86/kvm/vmx/run_flags.h    |  8 --------
>   arch/x86/kvm/vmx/vmx.c          | 32 +++++++++-----------------------
>   arch/x86/kvm/vmx/vmx.h          |  4 +---
>   7 files changed, 36 insertions(+), 41 deletions(-)
>   delete mode 100644 arch/x86/kvm/vmx/run_flags.h
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 415113dea951..d56fe6151656 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -204,6 +204,14 @@ enum exit_fastpath_completion {
>   };
>   typedef enum exit_fastpath_completion fastpath_t;
>   
> +struct kvm_vmrun_params {
> +	...
> +};
> +
> +struct kvm_vmenter_params {
> +	...
> +};
> +
>   struct x86_emulate_ctxt;
>   struct x86_exception;
>   union kvm_smram;
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 627c126cd9bb..7df9ea3ad3f1 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3925,16 +3925,23 @@ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
>   static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_intercepted)
>   {
>   	struct vcpu_svm *svm = to_svm(vcpu);
> +	struct kvm_vmrun_params params = {
> +		.regs = vcpu->arch.regs,
> +		.vmcb01 = svm->vmcb01.ptr,
> +		.vmcb = svm->current_vmcb->ptr,
> +		.spec_ctrl = svm->spec_ctrl,
> +		.spec_ctrl_intercepted = spec_ctrl_intercepted,
> +	};
>   
>   	guest_state_enter_irqoff();
>   
>   	if (sev_es_guest(vcpu->kvm)) {
> -		__svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted);
> +		__svm_sev_es_vcpu_run(&params);
>   	} else {
>   		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
>   
> -		__svm_vcpu_run(svm, __sme_page_pa(sd->save_area),
> -			       spec_ctrl_intercepted);
> +		params.save_area_pa = __sme_page_pa(sd->save_area);
> +		__svm_vcpu_run(vcpu->arch.regs, &params);
>   	}
>   
>   	guest_state_exit_irqoff();
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 9d940d8736f0..bf321b755a15 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -483,7 +483,7 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
>   
>   /* vmenter.S */
>   
> -void __svm_sev_es_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted);
> -void __svm_vcpu_run(struct vcpu_svm *svm, unsigned long hsave_pa, bool spec_ctrl_intercepted);
> +void __svm_sev_es_vcpu_run(struct kvm_vmrun_params *params);
> +void __svm_vcpu_run(unsigned long *regs, struct kvm_vmrun_params *params);
>   
>   #endif
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 61a2e551640a..da6c9b8c3a4f 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -3058,6 +3058,11 @@ static int nested_vmx_check_guest_state(struct kvm_vcpu *vcpu,
>   
>   static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
>   {
> +	struct kvm_vmenter_params params = {
> +		.launched = vmx->loaded_vmcs->launched,
> +		.spec_ctrl = this_cpu_read(x86_spec_ctrl_current),
> +		.spec_ctrl_intercepted = true,
> +	};
>   	struct vcpu_vmx *vmx = to_vmx(vcpu);
>   	unsigned long cr3, cr4;
>   	bool vm_fail;
> @@ -3094,8 +3099,7 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
>   		vmx->loaded_vmcs->host_state.cr4 = cr4;
>   	}
>   
> -	vm_fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
> -				 __vmx_vcpu_run_flags(vmx));
> +	vm_fail = __vmx_vcpu_run(vcpu->arch.regs, &params);
>   
>   	if (vmx->msr_autoload.host.nr)
>   		vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
> diff --git a/arch/x86/kvm/vmx/run_flags.h b/arch/x86/kvm/vmx/run_flags.h
> deleted file mode 100644
> index edc3f16cc189..000000000000
> --- a/arch/x86/kvm/vmx/run_flags.h
> +++ /dev/null
> @@ -1,8 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -#ifndef __KVM_X86_VMX_RUN_FLAGS_H
> -#define __KVM_X86_VMX_RUN_FLAGS_H
> -
> -#define VMX_RUN_VMRESUME	(1 << 0)
> -#define VMX_RUN_SAVE_SPEC_CTRL	(1 << 1)
> -
> -#endif /* __KVM_X86_VMX_RUN_FLAGS_H */
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 05a747c9a9ff..307380cd2000 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -847,24 +847,6 @@ static bool msr_write_intercepted(struct vcpu_vmx *vmx, u32 msr)
>   	return vmx_test_msr_bitmap_write(vmx->loaded_vmcs->msr_bitmap, msr);
>   }
>   
> -unsigned int __vmx_vcpu_run_flags(struct vcpu_vmx *vmx)
> -{
> -	unsigned int flags = 0;
> -
> -	if (vmx->loaded_vmcs->launched)
> -		flags |= VMX_RUN_VMRESUME;
> -
> -	/*
> -	 * If writes to the SPEC_CTRL MSR aren't intercepted, the guest is free
> -	 * to change it directly without causing a vmexit.  In that case read
> -	 * it after vmexit and store it in vmx->spec_ctrl.
> -	 */
> -	if (unlikely(!msr_write_intercepted(vmx, MSR_IA32_SPEC_CTRL)))
> -		flags |= VMX_RUN_SAVE_SPEC_CTRL;
> -
> -	return flags;
> -}
> -
>   static __always_inline void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
>   		unsigned long entry, unsigned long exit)
>   {
> @@ -7065,9 +7047,14 @@ static fastpath_t vmx_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
>   }
>   
>   static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
> -					struct vcpu_vmx *vmx,
> -					unsigned long flags)
> +					struct vcpu_vmx *vmx)
>   {
> +	struct kvm_vmenter_params params = {
> +		.launched = vmx->loaded_vmcs->launched,
> +		.spec_ctrl = vmx->spec_ctrl,
> +		.spec_ctrl_intercepted = msr_write_intercepted(vmx, MSR_IA32_SPEC_CTRL),
> +	};
> +
>   	guest_state_enter_irqoff();
>   
>   	/* L1D Flush includes CPU buffer clear to mitigate MDS */
> @@ -7084,8 +7071,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
>   	if (vcpu->arch.cr2 != native_read_cr2())
>   		native_write_cr2(vcpu->arch.cr2);
>   
> -	vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
> -				   flags);
> +	vmx->fail = __vmx_vcpu_run(vcpu->arch.regs, &params);
>   
>   	vcpu->arch.cr2 = native_read_cr2();
>   
> @@ -7185,7 +7171,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
>   	kvm_wait_lapic_expire(vcpu);
>   
>   	/* The actual VMENTER/EXIT is in the .noinstr.text section. */
> -	vmx_vcpu_enter_exit(vcpu, vmx, __vmx_vcpu_run_flags(vmx));
> +	vmx_vcpu_enter_exit(vcpu, vmx);
>   
>   	/* All fields are clean at this point */
>   	if (static_branch_unlikely(&enable_evmcs)) {
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index a3da84f4ea45..4eb196f88b47 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -421,9 +421,7 @@ struct vmx_uret_msr *vmx_find_uret_msr(struct vcpu_vmx *vmx, u32 msr);
>   void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu);
>   void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp);
>   void vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx, unsigned int flags);
> -unsigned int __vmx_vcpu_run_flags(struct vcpu_vmx *vmx);
> -bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned long *regs,
> -		    unsigned int flags);
> +bool __vmx_vcpu_run(unsigned long *regs, struct kvm_vmenter_params *params);
>   int vmx_find_loadstore_msr_slot(struct vmx_msrs *m, u32 msr);
>   void vmx_ept_load_pdptrs(struct kvm_vcpu *vcpu);
>   
> 
> base-commit: 0443d79faa4575a5871b54801ed4a36eecce32e3


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  2022-11-07 18:45   ` Jim Mattson
@ 2022-11-07 19:08     ` Paolo Bonzini
  2022-11-09 21:23       ` Jim Mattson
  0 siblings, 1 reply; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-07 19:08 UTC (permalink / raw)
  To: Jim Mattson
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, seanjc, stable

On 11/7/22 19:45, Jim Mattson wrote:
>> +.macro RESTORE_GUEST_SPEC_CTRL
>> +       /* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
>> +       ALTERNATIVE_2 "jmp 999f", \
>> +               "", X86_FEATURE_MSR_SPEC_CTRL, \
>> +               "jmp 999f", X86_FEATURE_V_SPEC_CTRL
>> +
>> +       /*
>> +        * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
>> +        * host's, write the MSR.
>> +        *
>> +        * IMPORTANT: To avoid RSB underflow attacks and any other nastiness,
>> +        * there must not be any returns or indirect branches between this code
>> +        * and vmentry.
>> +        */
>> +       movl SVM_spec_ctrl(%_ASM_DI), %eax
>> +       cmp PER_CPU_VAR(x86_spec_ctrl_current), %eax
>> +       je 999f
>> +       mov $MSR_IA32_SPEC_CTRL, %ecx
>> +       xor %edx, %edx
>> +       wrmsr
>> +999:
>> +
>> +.endm
>> +
>> +.macro RESTORE_HOST_SPEC_CTRL
>> +       /* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
>> +       ALTERNATIVE_2 "jmp 999f", \
>> +               "", X86_FEATURE_MSR_SPEC_CTRL, \
>> +               "jmp 999f", X86_FEATURE_V_SPEC_CTRL
>> +
>> +       mov $MSR_IA32_SPEC_CTRL, %ecx
>> +
>> +       /*
>> +        * Load the value that the guest had written into MSR_IA32_SPEC_CTRL,
>> +        * if it was not intercepted during guest execution.
>> +        */
>> +       cmpb $0, (%_ASM_SP)
>> +       jnz 998f
>> +       rdmsr
>> +       movl %eax, SVM_spec_ctrl(%_ASM_DI)
>> +998:
>> +
>> +       /* Now restore the host value of the MSR if different from the guest's.  */
>> +       movl PER_CPU_VAR(x86_spec_ctrl_current), %eax
>> +       cmp SVM_spec_ctrl(%_ASM_DI), %eax
>> +       je 999f
>> +       xor %edx, %edx
>> +       wrmsr
>> +999:
>> +
>> +.endm
>> +
>> +
> 
> It seems unfortunate to have the unconditional branches in the more
> common cases.

One way to do it could be something like

.macro RESTORE_HOST_SPEC_CTRL
        ALTERNATIVE_2 "", \
                "jmp 900f", X86_FEATURE_MSR_SPEC_CTRL, \
                "", X86_FEATURE_V_SPEC_CTRL \
901:
.endm

.macro RESTORE_SPEC_CTRL_BODY
800:
	/* restore guest spec ctrl ... */
	jmp 801b

900:
	/* save guest spec ctrl + restore host ... */
	jmp 901b
.endm

The cmp/je pair can also jump back to 801b/901b.

What do you think?  I'll check if objtool is happy and if so include it 
in v2.

Paolo


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/8] KVM: SVM: extract VMCB accessors to a new file
  2022-11-07 18:14       ` Sean Christopherson
  2022-11-07 18:51         ` Paolo Bonzini
@ 2022-11-08  8:52         ` Paolo Bonzini
  1 sibling, 0 replies; 25+ messages in thread
From: Paolo Bonzini @ 2022-11-08  8:52 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On 11/7/22 19:14, Sean Christopherson wrote:
> On Mon, Nov 07, 2022, Paolo Bonzini wrote:
>> On 11/7/22 18:08, Sean Christopherson wrote:
>>> What about making KVM self-sufficient?
>>
>> You mean having a different asm-offsets.h file just for arch/x86/kvm/?
> 
> Yeah.

Doh, it would have been enough to add #ifdef COMPILE_OFFSETS to 
svm/svm.h, but it was also pretty easy to generate a separate 
asm-offsets file so why not.
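
For the record, the #ifdef variant would have been something like this
(rough sketch only, not what was actually tested):

/* in arch/x86/kvm/svm/svm.h */
#ifndef COMPILE_OFFSETS		/* defined while generating asm-offsets */
#include "kvm_cache_regs.h"
/* ... any inline helpers that need it would sit under the same guard ... */
#endif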

Paolo


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  2022-11-07 19:08     ` Paolo Bonzini
@ 2022-11-09 21:23       ` Jim Mattson
  0 siblings, 0 replies; 25+ messages in thread
From: Jim Mattson @ 2022-11-09 21:23 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, seanjc, stable

On Mon, Nov 7, 2022 at 11:08 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 11/7/22 19:45, Jim Mattson wrote:
> >> +.macro RESTORE_GUEST_SPEC_CTRL
> >> +       /* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
> >> +       ALTERNATIVE_2 "jmp 999f", \
> >> +               "", X86_FEATURE_MSR_SPEC_CTRL, \
> >> +               "jmp 999f", X86_FEATURE_V_SPEC_CTRL
> >> +
> >> +       /*
> >> +        * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
> >> +        * host's, write the MSR.
> >> +        *
> >> +        * IMPORTANT: To avoid RSB underflow attacks and any other nastiness,
> >> +        * there must not be any returns or indirect branches between this code
> >> +        * and vmentry.
> >> +        */
> >> +       movl SVM_spec_ctrl(%_ASM_DI), %eax
> >> +       cmp PER_CPU_VAR(x86_spec_ctrl_current), %eax
> >> +       je 999f
> >> +       mov $MSR_IA32_SPEC_CTRL, %ecx
> >> +       xor %edx, %edx
> >> +       wrmsr
> >> +999:
> >> +
> >> +.endm
> >> +
> >> +.macro RESTORE_HOST_SPEC_CTRL
> >> +       /* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
> >> +       ALTERNATIVE_2 "jmp 999f", \
> >> +               "", X86_FEATURE_MSR_SPEC_CTRL, \
> >> +               "jmp 999f", X86_FEATURE_V_SPEC_CTRL
> >> +
> >> +       mov $MSR_IA32_SPEC_CTRL, %ecx
> >> +
> >> +       /*
> >> +        * Load the value that the guest had written into MSR_IA32_SPEC_CTRL,
> >> +        * if it was not intercepted during guest execution.
> >> +        */
> >> +       cmpb $0, (%_ASM_SP)
> >> +       jnz 998f
> >> +       rdmsr
> >> +       movl %eax, SVM_spec_ctrl(%_ASM_DI)
> >> +998:
> >> +
> >> +       /* Now restore the host value of the MSR if different from the guest's.  */
> >> +       movl PER_CPU_VAR(x86_spec_ctrl_current), %eax
> >> +       cmp SVM_spec_ctrl(%_ASM_DI), %eax
> >> +       je 999f
> >> +       xor %edx, %edx
> >> +       wrmsr
> >> +999:
> >> +
> >> +.endm
> >> +
> >> +
> >
> > It seems unfortunate to have the unconditional branches in the more
> > common cases.
>
> One way to do it could be something like
>
> .macro RESTORE_HOST_SPEC_CTRL
>         ALTERNATIVE_2 "", \
>                 "jmp 900f", X86_FEATURE_MSR_SPEC_CTRL, \
>                 "", X86_FEATURE_V_SPEC_CTRL \
> 901:
> .endm
>
> .macro RESTORE_SPEC_CTRL_BODY \
> 800:
>         /* restore guest spec ctrl ... */
>         jmp 801b
>
> 900:
>         /* save guest spec ctrl + restore host ... */
>         jmp 901b
> .endm
>
> The cmp/je pair can also jump back to 801b/901b.
>
> What do you think?  I'll check if objtool is happy and if so include it
> in v2.
>
> Paolo
>
This seems reasonable, if you trust a direct branch prior to the IBPB.

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread

Thread overview: 25+ messages
2022-11-07 14:54 [PATCH 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
2022-11-07 14:54 ` [PATCH 1/8] KVM: SVM: extract VMCB accessors to a new file Paolo Bonzini
2022-11-07 17:08   ` Sean Christopherson
2022-11-07 17:36     ` Paolo Bonzini
2022-11-07 18:14       ` Sean Christopherson
2022-11-07 18:51         ` Paolo Bonzini
2022-11-08  8:52         ` Paolo Bonzini
2022-11-07 14:54 ` [PATCH 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm Paolo Bonzini
2022-11-07 17:10   ` Sean Christopherson
2022-11-07 17:22     ` Paolo Bonzini
2022-11-07 14:54 ` [PATCH 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run Paolo Bonzini
2022-11-07 14:54 ` [PATCH 4/8] KVM: SVM: move guest vmsave/vmload to assembly Paolo Bonzini
2022-11-07 15:23   ` Peter Zijlstra
2022-11-07 15:40     ` Paolo Bonzini
2022-11-07 15:32   ` Andrew Cooper
2022-11-07 15:37     ` Paolo Bonzini
2022-11-07 15:47       ` Andrew Cooper
2022-11-07 14:54 ` [PATCH 5/8] KVM: SVM: retrieve VMCB from assembly Paolo Bonzini
2022-11-07 14:54 ` [PATCH 6/8] KVM: SVM: restore host save area " Paolo Bonzini
2022-11-07 14:54 ` [PATCH 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
2022-11-07 18:45   ` Jim Mattson
2022-11-07 19:08     ` Paolo Bonzini
2022-11-09 21:23       ` Jim Mattson
2022-11-07 14:54 ` [PATCH 8/8] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers Paolo Bonzini
2022-11-07 15:33 ` [PATCH 0/8] KVM: SVM: fixes for vmentry code Peter Zijlstra
