* [PATCH v2 0/8] KVM: SVM: fixes for vmentry code
@ 2022-11-08 15:15 Paolo Bonzini
  2022-11-08 15:15 ` [PATCH v2 1/8] KVM: x86: use a separate asm-offsets.c file Paolo Bonzini
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson, seanjc

This series comprises two related fixes:

- the FILL_RETURN_BUFFER macro in -next needs to access percpu data,
  hence the GS segment base needs to be loaded before FILL_RETURN_BUFFER.
  This means moving guest vmload/vmsave and host vmload to assembly
  (patches 5 and 6).

- because AMD wants the OS to set STIBP to 1 before executing the
  return thunk (un)training sequence, IA32_SPEC_CTRL must be restored
  before UNTRAIN_RET, too.  This must also be moved to assembly and,
  for consistency, the guest SPEC_CTRL is also loaded in there
  (patch 7).

Neither is particularly hard; however, because of 32-bit systems, one needs
to keep the number of arguments to __svm_vcpu_run to three or fewer.
One is taken for whether IA32_SPEC_CTRL is intercepted, and one for the
host save area, so all accesses to the vcpu_svm struct have to be done
from assembly too.  This is done in patches 2 to 4, and it turns out
not to be that bad; in fact I think the code is simpler than before
after these prerequisites, and even at the end of the series it is not
much harder to follow despite doing a lot more stuff.  Care has been
taken to keep the "normal" and SEV-ES code as similar as possible,
even though the latter would not hit the three-argument barrier.
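
For reference, these are the prototypes as they stand at the end of
the series (patch 7):

  void __svm_sev_es_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted);
  void __svm_vcpu_run(struct vcpu_svm *svm, unsigned long hsave_pa, bool spec_ctrl_intercepted);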

The above summary leaves out the more mundane patches 1 and 8.  The
former introduces a separate asm-offsets.c file for KVM, so that
kernel/asm-offsets.c does not have to do ugly includes with ../ paths.
The latter is dead code removal.

Thanks,

Paolo

v1->v2: use a separate asm-offsets.c file instead of hacking around
	the arch/x86/kvm/svm/svm.h file; this could also have been done
	with just a "#ifndef COMPILE_OFFSETS", but Sean's suggestion
	is cleaner and there is a precedent in drivers/memory/ for
	private asm-offsets files

	keep preparatory cleanups together at the beginning of the
	series

	move SPEC_CTRL save/restore out of line [Jim]

Paolo Bonzini (8):
  KVM: x86: use a separate asm-offsets.c file
  KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm
  KVM: SVM: adjust register allocation for __svm_vcpu_run
  KVM: SVM: retrieve VMCB from assembly
  KVM: SVM: move guest vmsave/vmload to assembly
  KVM: SVM: restore host save area from assembly
  KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and
    callers

 arch/x86/include/asm/spec-ctrl.h |  10 +-
 arch/x86/kernel/asm-offsets.c    |   6 -
 arch/x86/kernel/cpu/bugs.c       |  15 +-
 arch/x86/kvm/Makefile            |  12 ++
 arch/x86/kvm/kvm-asm-offsets.c   |  28 ++++
 arch/x86/kvm/svm/svm.c           |  53 +++----
 arch/x86/kvm/svm/svm.h           |   4 +-
 arch/x86/kvm/svm/svm_ops.h       |   5 -
 arch/x86/kvm/svm/vmenter.S       | 241 ++++++++++++++++++++++++-------
 arch/x86/kvm/vmx/vmenter.S       |   2 +-
 10 files changed, 259 insertions(+), 117 deletions(-)
 create mode 100644 arch/x86/kvm/kvm-asm-offsets.c

-- 
2.31.1



* [PATCH v2 1/8] KVM: x86: use a separate asm-offsets.c file
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-08 15:15 ` [PATCH v2 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm Paolo Bonzini
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson, seanjc

This already removes the ugly #includes from asm-offsets.c, but more
importantly it avoids a future build error once asm-offsets tries to
include svm/svm.h.

That include would not work from kernel/asm-offsets.c, because svm/svm.h
includes kvm_cache_regs.h, which is not in the include path when
compiling asm-offsets.c.  The problem does not arise if the .c file is
in arch/x86/kvm.

Tested with both in-tree and separate-objtree compilation.
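
For reference, the generated kvm-asm-offsets.h looks roughly like this
(the numeric offset below is made up for illustration):

  #ifndef __KVM_ASM_OFFSETS_H__
  #define __KVM_ASM_OFFSETS_H__
  /*
   * DO NOT MODIFY.
   * This file was generated by Kbuild
   */

  #define VMX_spec_ctrl 9528 /* offsetof(struct vcpu_vmx, spec_ctrl) */

  #endif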

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kernel/asm-offsets.c  |  6 ------
 arch/x86/kvm/Makefile          |  9 +++++++++
 arch/x86/kvm/kvm-asm-offsets.c | 18 ++++++++++++++++++
 arch/x86/kvm/vmx/vmenter.S     |  2 +-
 4 files changed, 28 insertions(+), 7 deletions(-)
 create mode 100644 arch/x86/kvm/kvm-asm-offsets.c

diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index cb50589a7102..437308004ef2 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -19,7 +19,6 @@
 #include <asm/suspend.h>
 #include <asm/tlbflush.h>
 #include <asm/tdx.h>
-#include "../kvm/vmx/vmx.h"
 
 #ifdef CONFIG_XEN
 #include <xen/interface/xen.h>
@@ -108,9 +107,4 @@ static void __used common(void)
 	OFFSET(TSS_sp0, tss_struct, x86_tss.sp0);
 	OFFSET(TSS_sp1, tss_struct, x86_tss.sp1);
 	OFFSET(TSS_sp2, tss_struct, x86_tss.sp2);
-
-	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
-		BLANK();
-		OFFSET(VMX_spec_ctrl, vcpu_vmx, spec_ctrl);
-	}
 }
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 30f244b64523..a02cf9baacc8 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -34,3 +34,12 @@ endif
 obj-$(CONFIG_KVM)	+= kvm.o
 obj-$(CONFIG_KVM_INTEL)	+= kvm-intel.o
 obj-$(CONFIG_KVM_AMD)	+= kvm-amd.o
+
+AFLAGS_vmx/vmenter.o    := -iquote $(obj)
+$(obj)/vmx/vmenter.o: $(obj)/kvm-asm-offsets.h
+
+$(obj)/kvm-asm-offsets.h: $(obj)/kvm-asm-offsets.s FORCE
+	$(call filechk,offsets,__KVM_ASM_OFFSETS_H__)
+
+targets += kvm-asm-offsets.s
+clean-files += kvm-asm-offsets.h
diff --git a/arch/x86/kvm/kvm-asm-offsets.c b/arch/x86/kvm/kvm-asm-offsets.c
new file mode 100644
index 000000000000..9d84f2b32d7f
--- /dev/null
+++ b/arch/x86/kvm/kvm-asm-offsets.c
@@ -0,0 +1,18 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Generate definitions needed by assembly language modules.
+ * This code generates raw asm output which is post-processed to extract
+ * and format the required data.
+ */
+#define COMPILE_OFFSETS
+
+#include <linux/kbuild.h>
+#include "vmx/vmx.h"
+
+static void __used common(void)
+{
+	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
+		BLANK();
+		OFFSET(VMX_spec_ctrl, vcpu_vmx, spec_ctrl);
+	}
+}
diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index 8477d8bdd69c..0b5db4de4d09 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -1,12 +1,12 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include <linux/linkage.h>
 #include <asm/asm.h>
-#include <asm/asm-offsets.h>
 #include <asm/bitsperlong.h>
 #include <asm/kvm_vcpu_regs.h>
 #include <asm/nospec-branch.h>
 #include <asm/percpu.h>
 #include <asm/segment.h>
+#include "kvm-asm-offsets.h"
 #include "run_flags.h"
 
 #define WORD_SIZE (BITS_PER_LONG / 8)
-- 
2.31.1




* [PATCH v2 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
  2022-11-08 15:15 ` [PATCH v2 1/8] KVM: x86: use a separate asm-offsets.c file Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-08 20:54   ` Sean Christopherson
  2022-11-08 15:15 ` [PATCH v2 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run Paolo Bonzini
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

Since registers are reachable through vcpu_svm, and we will
need to access more fields of that struct, pass it instead
of the regs[] array.

No functional change intended.

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/Makefile          |  3 +++
 arch/x86/kvm/kvm-asm-offsets.c |  6 ++++++
 arch/x86/kvm/svm/svm.c         |  2 +-
 arch/x86/kvm/svm/svm.h         |  2 +-
 arch/x86/kvm/svm/vmenter.S     | 37 +++++++++++++++++-----------------
 5 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index a02cf9baacc8..f453a0f96e24 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -35,6 +35,9 @@ obj-$(CONFIG_KVM)	+= kvm.o
 obj-$(CONFIG_KVM_INTEL)	+= kvm-intel.o
 obj-$(CONFIG_KVM_AMD)	+= kvm-amd.o
 
+AFLAGS_svm/vmenter.o    := -iquote $(obj)
+$(obj)/svm/vmenter.o: $(obj)/kvm-asm-offsets.h
+
 AFLAGS_vmx/vmenter.o    := -iquote $(obj)
 $(obj)/vmx/vmenter.o: $(obj)/kvm-asm-offsets.h
 
diff --git a/arch/x86/kvm/kvm-asm-offsets.c b/arch/x86/kvm/kvm-asm-offsets.c
index 9d84f2b32d7f..30db96852e2d 100644
--- a/arch/x86/kvm/kvm-asm-offsets.c
+++ b/arch/x86/kvm/kvm-asm-offsets.c
@@ -8,9 +8,15 @@
 
 #include <linux/kbuild.h>
 #include "vmx/vmx.h"
+#include "svm/svm.h"
 
 static void __used common(void)
 {
+	if (IS_ENABLED(CONFIG_KVM_AMD)) {
+		BLANK();
+		OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
+	}
+
 	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
 		BLANK();
 		OFFSET(VMX_spec_ctrl, vcpu_vmx, spec_ctrl);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 58f0077d9357..b412bc5773c5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3930,7 +3930,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 		 * vmcb02 when switching vmcbs for nested virtualization.
 		 */
 		vmload(svm->vmcb01.pa);
-		__svm_vcpu_run(vmcb_pa, (unsigned long *)&vcpu->arch.regs);
+		__svm_vcpu_run(vmcb_pa, svm);
 		vmsave(svm->vmcb01.pa);
 
 		vmload(__sme_page_pa(sd->save_area));
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6a7686bf6900..447e25c9101a 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -684,6 +684,6 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 /* vmenter.S */
 
 void __svm_sev_es_vcpu_run(unsigned long vmcb_pa);
-void __svm_vcpu_run(unsigned long vmcb_pa, unsigned long *regs);
+void __svm_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm);
 
 #endif
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 723f8534986c..f0ff41103e4c 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -4,27 +4,28 @@
 #include <asm/bitsperlong.h>
 #include <asm/kvm_vcpu_regs.h>
 #include <asm/nospec-branch.h>
+#include "kvm-asm-offsets.h"
 
 #define WORD_SIZE (BITS_PER_LONG / 8)
 
 /* Intentionally omit RAX as it's context switched by hardware */
-#define VCPU_RCX	__VCPU_REGS_RCX * WORD_SIZE
-#define VCPU_RDX	__VCPU_REGS_RDX * WORD_SIZE
-#define VCPU_RBX	__VCPU_REGS_RBX * WORD_SIZE
+#define VCPU_RCX	(SVM_vcpu_arch_regs + __VCPU_REGS_RCX * WORD_SIZE)
+#define VCPU_RDX	(SVM_vcpu_arch_regs + __VCPU_REGS_RDX * WORD_SIZE)
+#define VCPU_RBX	(SVM_vcpu_arch_regs + __VCPU_REGS_RBX * WORD_SIZE)
 /* Intentionally omit RSP as it's context switched by hardware */
-#define VCPU_RBP	__VCPU_REGS_RBP * WORD_SIZE
-#define VCPU_RSI	__VCPU_REGS_RSI * WORD_SIZE
-#define VCPU_RDI	__VCPU_REGS_RDI * WORD_SIZE
+#define VCPU_RBP	(SVM_vcpu_arch_regs + __VCPU_REGS_RBP * WORD_SIZE)
+#define VCPU_RSI	(SVM_vcpu_arch_regs + __VCPU_REGS_RSI * WORD_SIZE)
+#define VCPU_RDI	(SVM_vcpu_arch_regs + __VCPU_REGS_RDI * WORD_SIZE)
 
 #ifdef CONFIG_X86_64
-#define VCPU_R8		__VCPU_REGS_R8  * WORD_SIZE
-#define VCPU_R9		__VCPU_REGS_R9  * WORD_SIZE
-#define VCPU_R10	__VCPU_REGS_R10 * WORD_SIZE
-#define VCPU_R11	__VCPU_REGS_R11 * WORD_SIZE
-#define VCPU_R12	__VCPU_REGS_R12 * WORD_SIZE
-#define VCPU_R13	__VCPU_REGS_R13 * WORD_SIZE
-#define VCPU_R14	__VCPU_REGS_R14 * WORD_SIZE
-#define VCPU_R15	__VCPU_REGS_R15 * WORD_SIZE
+#define VCPU_R8		(SVM_vcpu_arch_regs + __VCPU_REGS_R8  * WORD_SIZE)
+#define VCPU_R9		(SVM_vcpu_arch_regs + __VCPU_REGS_R9  * WORD_SIZE)
+#define VCPU_R10	(SVM_vcpu_arch_regs + __VCPU_REGS_R10 * WORD_SIZE)
+#define VCPU_R11	(SVM_vcpu_arch_regs + __VCPU_REGS_R11 * WORD_SIZE)
+#define VCPU_R12	(SVM_vcpu_arch_regs + __VCPU_REGS_R12 * WORD_SIZE)
+#define VCPU_R13	(SVM_vcpu_arch_regs + __VCPU_REGS_R13 * WORD_SIZE)
+#define VCPU_R14	(SVM_vcpu_arch_regs + __VCPU_REGS_R14 * WORD_SIZE)
+#define VCPU_R15	(SVM_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
 #endif
 
 .section .noinstr.text, "ax"
@@ -32,7 +33,7 @@
 /**
  * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
  * @vmcb_pa:	unsigned long
- * @regs:	unsigned long * (to guest registers)
+ * @svm:	struct vcpu_svm *
  */
 SYM_FUNC_START(__svm_vcpu_run)
 	push %_ASM_BP
@@ -47,13 +48,13 @@ SYM_FUNC_START(__svm_vcpu_run)
 #endif
 	push %_ASM_BX
 
-	/* Save @regs. */
+	/* Save @svm. */
 	push %_ASM_ARG2
 
 	/* Save @vmcb. */
 	push %_ASM_ARG1
 
-	/* Move @regs to RAX. */
+	/* Move @svm to RAX. */
 	mov %_ASM_ARG2, %_ASM_AX
 
 	/* Load guest registers. */
@@ -89,7 +90,7 @@ SYM_FUNC_START(__svm_vcpu_run)
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
 #endif
 
-	/* "POP" @regs to RAX. */
+	/* "POP" @svm to RAX. */
 	pop %_ASM_AX
 
 	/* Save all guest registers.  */
-- 
2.31.1




* [PATCH v2 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
  2022-11-08 15:15 ` [PATCH v2 1/8] KVM: x86: use a separate asm-offsets.c file Paolo Bonzini
  2022-11-08 15:15 ` [PATCH v2 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-08 20:55   ` Sean Christopherson
  2022-11-08 15:15 ` [PATCH v2 4/8] KVM: SVM: retrieve VMCB from assembly Paolo Bonzini
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

In preparation for moving vmload/vmsave to __svm_vcpu_run,
keep the pointer to the struct vcpu_svm in %rdi.  This way
it is possible to load svm->vmcb01.pa in %rax without
clobbering the pointer to svm itself.

No functional change intended.

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/vmenter.S | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index f0ff41103e4c..531510ab6072 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -54,29 +54,29 @@ SYM_FUNC_START(__svm_vcpu_run)
 	/* Save @vmcb. */
 	push %_ASM_ARG1
 
-	/* Move @svm to RAX. */
-	mov %_ASM_ARG2, %_ASM_AX
+	/* Move @svm to RDI. */
+	mov %_ASM_ARG2, %_ASM_DI
+
+	/* "POP" @vmcb to RAX. */
+	pop %_ASM_AX
 
 	/* Load guest registers. */
-	mov VCPU_RCX(%_ASM_AX), %_ASM_CX
-	mov VCPU_RDX(%_ASM_AX), %_ASM_DX
-	mov VCPU_RBX(%_ASM_AX), %_ASM_BX
-	mov VCPU_RBP(%_ASM_AX), %_ASM_BP
-	mov VCPU_RSI(%_ASM_AX), %_ASM_SI
-	mov VCPU_RDI(%_ASM_AX), %_ASM_DI
+	mov VCPU_RCX(%_ASM_DI), %_ASM_CX
+	mov VCPU_RDX(%_ASM_DI), %_ASM_DX
+	mov VCPU_RBX(%_ASM_DI), %_ASM_BX
+	mov VCPU_RBP(%_ASM_DI), %_ASM_BP
+	mov VCPU_RSI(%_ASM_DI), %_ASM_SI
 #ifdef CONFIG_X86_64
-	mov VCPU_R8 (%_ASM_AX),  %r8
-	mov VCPU_R9 (%_ASM_AX),  %r9
-	mov VCPU_R10(%_ASM_AX), %r10
-	mov VCPU_R11(%_ASM_AX), %r11
-	mov VCPU_R12(%_ASM_AX), %r12
-	mov VCPU_R13(%_ASM_AX), %r13
-	mov VCPU_R14(%_ASM_AX), %r14
-	mov VCPU_R15(%_ASM_AX), %r15
+	mov VCPU_R8 (%_ASM_DI),  %r8
+	mov VCPU_R9 (%_ASM_DI),  %r9
+	mov VCPU_R10(%_ASM_DI), %r10
+	mov VCPU_R11(%_ASM_DI), %r11
+	mov VCPU_R12(%_ASM_DI), %r12
+	mov VCPU_R13(%_ASM_DI), %r13
+	mov VCPU_R14(%_ASM_DI), %r14
+	mov VCPU_R15(%_ASM_DI), %r15
 #endif
-
-	/* "POP" @vmcb to RAX. */
-	pop %_ASM_AX
+	mov VCPU_RDI(%_ASM_DI), %_ASM_DI
 
 	/* Enter guest mode */
 	sti
-- 
2.31.1




* [PATCH v2 4/8] KVM: SVM: retrieve VMCB from assembly
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (2 preceding siblings ...)
  2022-11-08 15:15 ` [PATCH v2 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-09  0:53   ` Sean Christopherson
  2022-11-08 15:15 ` [PATCH v2 5/8] KVM: SVM: move guest vmsave/vmload to assembly Paolo Bonzini
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

This is needed in order to keep the number of arguments to 3 or fewer,
after adding hsave_pa and spec_ctrl_intercepted.  32-bit builds only
support passing three arguments in registers; fortunately, all other
data is reachable from the vcpu_svm struct.

It is not strictly necessary for __svm_sev_es_vcpu_run, but staying
consistent is a good idea since it makes __svm_sev_es_vcpu_run a
stripped-down version of __svm_vcpu_run.

No functional change intended.

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/kvm-asm-offsets.c |  2 ++
 arch/x86/kvm/svm/svm.c         |  5 ++---
 arch/x86/kvm/svm/svm.h         |  4 ++--
 arch/x86/kvm/svm/vmenter.S     | 20 ++++++++++----------
 4 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/kvm-asm-offsets.c b/arch/x86/kvm/kvm-asm-offsets.c
index 30db96852e2d..f1b694e431ae 100644
--- a/arch/x86/kvm/kvm-asm-offsets.c
+++ b/arch/x86/kvm/kvm-asm-offsets.c
@@ -15,6 +15,8 @@ static void __used common(void)
 	if (IS_ENABLED(CONFIG_KVM_AMD)) {
 		BLANK();
 		OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
+		OFFSET(SVM_current_vmcb, vcpu_svm, current_vmcb);
+		OFFSET(KVM_VMCB_pa, kvm_vmcb_info, pa);
 	}
 
 	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b412bc5773c5..0c86c435c51f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3914,12 +3914,11 @@ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
 static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
-	unsigned long vmcb_pa = svm->current_vmcb->pa;
 
 	guest_state_enter_irqoff();
 
 	if (sev_es_guest(vcpu->kvm)) {
-		__svm_sev_es_vcpu_run(vmcb_pa);
+		__svm_sev_es_vcpu_run(svm);
 	} else {
 		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
@@ -3930,7 +3929,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 		 * vmcb02 when switching vmcbs for nested virtualization.
 		 */
 		vmload(svm->vmcb01.pa);
-		__svm_vcpu_run(vmcb_pa, svm);
+		__svm_vcpu_run(svm);
 		vmsave(svm->vmcb01.pa);
 
 		vmload(__sme_page_pa(sd->save_area));
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 447e25c9101a..7ff1879e73c5 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -683,7 +683,7 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 
 /* vmenter.S */
 
-void __svm_sev_es_vcpu_run(unsigned long vmcb_pa);
-void __svm_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm);
+void __svm_sev_es_vcpu_run(struct vcpu_svm *svm);
+void __svm_vcpu_run(struct vcpu_svm *svm);
 
 #endif
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 531510ab6072..d07bac1952c5 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -32,7 +32,6 @@
 
 /**
  * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
- * @vmcb_pa:	unsigned long
  * @svm:	struct vcpu_svm *
  */
 SYM_FUNC_START(__svm_vcpu_run)
@@ -49,16 +48,16 @@ SYM_FUNC_START(__svm_vcpu_run)
 	push %_ASM_BX
 
 	/* Save @svm. */
-	push %_ASM_ARG2
-
-	/* Save @vmcb. */
 	push %_ASM_ARG1
 
+.ifnc _ASM_ARG1, _ASM_DI
 	/* Move @svm to RDI. */
-	mov %_ASM_ARG2, %_ASM_DI
+	mov %_ASM_ARG1, %_ASM_DI
+.endif
 
-	/* "POP" @vmcb to RAX. */
-	pop %_ASM_AX
+	/* Get svm->current_vmcb->pa into RAX. */
+	mov SVM_current_vmcb(%_ASM_DI), %_ASM_AX
+	mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX
 
 	/* Load guest registers. */
 	mov VCPU_RCX(%_ASM_DI), %_ASM_CX
@@ -170,7 +169,7 @@ SYM_FUNC_END(__svm_vcpu_run)
 
 /**
  * __svm_sev_es_vcpu_run - Run a SEV-ES vCPU via a transition to SVM guest mode
- * @vmcb_pa:	unsigned long
+ * @svm:	struct vcpu_svm *
  */
 SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	push %_ASM_BP
@@ -185,8 +184,9 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 #endif
 	push %_ASM_BX
 
-	/* Move @vmcb to RAX. */
-	mov %_ASM_ARG1, %_ASM_AX
+	/* Get svm->current_vmcb->pa into RAX. */
+	mov SVM_current_vmcb(%_ASM_ARG1), %_ASM_AX
+	mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX
 
 	/* Enter guest mode */
 	sti
-- 
2.31.1




* [PATCH v2 5/8] KVM: SVM: move guest vmsave/vmload to assembly
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (3 preceding siblings ...)
  2022-11-08 15:15 ` [PATCH v2 4/8] KVM: SVM: retrieve VMCB from assembly Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-08 15:15 ` [PATCH v2 6/8] KVM: SVM: restore host save area from assembly Paolo Bonzini
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

FILL_RETURN_BUFFER can access percpu data; therefore, vmload of the
host save area must be executed first.  As a first step, move the
guest VMCB vmsave/vmload to assembly.

The idea of how to number the exception table entries is stolen from
a prototype patch by Peter Zijlstra.

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Link: <https://lore.kernel.org/all/f571e404-e625-bae1-10e9-449b2eb4cbd8@citrix.com/>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/kvm-asm-offsets.c |  1 +
 arch/x86/kvm/svm/svm.c         |  9 -------
 arch/x86/kvm/svm/vmenter.S     | 49 ++++++++++++++++++++++++++--------
 3 files changed, 39 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/kvm-asm-offsets.c b/arch/x86/kvm/kvm-asm-offsets.c
index f1b694e431ae..f83e88b85bf2 100644
--- a/arch/x86/kvm/kvm-asm-offsets.c
+++ b/arch/x86/kvm/kvm-asm-offsets.c
@@ -16,6 +16,7 @@ static void __used common(void)
 		BLANK();
 		OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
 		OFFSET(SVM_current_vmcb, vcpu_svm, current_vmcb);
+		OFFSET(SVM_vmcb01, vcpu_svm, vmcb01);
 		OFFSET(KVM_VMCB_pa, kvm_vmcb_info, pa);
 	}
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 0c86c435c51f..0ba4feb19cb6 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3922,16 +3922,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 	} else {
 		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
-		/*
-		 * Use a single vmcb (vmcb01 because it's always valid) for
-		 * context switching guest state via VMLOAD/VMSAVE, that way
-		 * the state doesn't need to be copied between vmcb01 and
-		 * vmcb02 when switching vmcbs for nested virtualization.
-		 */
-		vmload(svm->vmcb01.pa);
 		__svm_vcpu_run(svm);
-		vmsave(svm->vmcb01.pa);
-
 		vmload(__sme_page_pa(sd->save_area));
 	}
 
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index d07bac1952c5..5bc2ed7d79c0 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -28,6 +28,8 @@
 #define VCPU_R15	(SVM_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
 #endif
 
+#define SVM_vmcb01_pa	(SVM_vmcb01 + KVM_VMCB_pa)
+
 .section .noinstr.text, "ax"
 
 /**
@@ -55,6 +57,16 @@ SYM_FUNC_START(__svm_vcpu_run)
 	mov %_ASM_ARG1, %_ASM_DI
 .endif
 
+	/*
+	 * Use a single vmcb (vmcb01 because it's always valid) for
+	 * context switching guest state via VMLOAD/VMSAVE, that way
+	 * the state doesn't need to be copied between vmcb01 and
+	 * vmcb02 when switching vmcbs for nested virtualization.
+	 */
+	mov SVM_vmcb01_pa(%_ASM_DI), %_ASM_AX
+1:	vmload %_ASM_AX
+2:
+
 	/* Get svm->current_vmcb->pa into RAX. */
 	mov SVM_current_vmcb(%_ASM_DI), %_ASM_AX
 	mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX
@@ -80,16 +92,11 @@ SYM_FUNC_START(__svm_vcpu_run)
 	/* Enter guest mode */
 	sti
 
-1:	vmrun %_ASM_AX
-
-2:	cli
-
-#ifdef CONFIG_RETPOLINE
-	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
-	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
-#endif
+3:	vmrun %_ASM_AX
+4:
+	cli
 
-	/* "POP" @svm to RAX. */
+	/* Pop @svm to RAX while it's the only available register. */
 	pop %_ASM_AX
 
 	/* Save all guest registers.  */
@@ -110,6 +117,18 @@ SYM_FUNC_START(__svm_vcpu_run)
 	mov %r15, VCPU_R15(%_ASM_AX)
 #endif
 
+	/* @svm can stay in RDI from now on.  */
+	mov %_ASM_AX, %_ASM_DI
+
+	mov SVM_vmcb01_pa(%_ASM_DI), %_ASM_AX
+5:	vmsave %_ASM_AX
+6:
+
+#ifdef CONFIG_RETPOLINE
+	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
+	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
+#endif
+
 	/*
 	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
 	 * untrained as soon as we exit the VM and are back to the
@@ -159,11 +178,19 @@ SYM_FUNC_START(__svm_vcpu_run)
 	pop %_ASM_BP
 	RET
 
-3:	cmpb $0, kvm_rebooting
+10:	cmpb $0, kvm_rebooting
 	jne 2b
 	ud2
+30:	cmpb $0, kvm_rebooting
+	jne 4b
+	ud2
+50:	cmpb $0, kvm_rebooting
+	jne 6b
+	ud2
 
-	_ASM_EXTABLE(1b, 3b)
+	_ASM_EXTABLE(1b, 10b)
+	_ASM_EXTABLE(3b, 30b)
+	_ASM_EXTABLE(5b, 50b)
 
 SYM_FUNC_END(__svm_vcpu_run)
 
-- 
2.31.1




* [PATCH v2 6/8] KVM: SVM: restore host save area from assembly
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (4 preceding siblings ...)
  2022-11-08 15:15 ` [PATCH v2 5/8] KVM: SVM: move guest vmsave/vmload to assembly Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-08 15:15 ` [PATCH v2 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

This is needed so that FILL_RETURN_BUFFER has access to the
percpu area via the GS segment base.

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Analyzed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/svm.c     |  3 +--
 arch/x86/kvm/svm/svm.h     |  2 +-
 arch/x86/kvm/svm/svm_ops.h |  5 -----
 arch/x86/kvm/svm/vmenter.S | 13 +++++++++++++
 4 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 0ba4feb19cb6..e15f6ea9e5cc 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3922,8 +3922,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 	} else {
 		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
-		__svm_vcpu_run(svm);
-		vmload(__sme_page_pa(sd->save_area));
+		__svm_vcpu_run(svm, __sme_page_pa(sd->save_area));
 	}
 
 	guest_state_exit_irqoff();
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 7ff1879e73c5..932f26be5675 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -684,6 +684,6 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 /* vmenter.S */
 
 void __svm_sev_es_vcpu_run(struct vcpu_svm *svm);
-void __svm_vcpu_run(struct vcpu_svm *svm);
+void __svm_vcpu_run(struct vcpu_svm *svm, unsigned long hsave_pa);
 
 #endif
diff --git a/arch/x86/kvm/svm/svm_ops.h b/arch/x86/kvm/svm/svm_ops.h
index 9430d6437c9f..36c8af87a707 100644
--- a/arch/x86/kvm/svm/svm_ops.h
+++ b/arch/x86/kvm/svm/svm_ops.h
@@ -61,9 +61,4 @@ static __always_inline void vmsave(unsigned long pa)
 	svm_asm1(vmsave, "a" (pa), "memory");
 }
 
-static __always_inline void vmload(unsigned long pa)
-{
-	svm_asm1(vmload, "a" (pa), "memory");
-}
-
 #endif /* __KVM_X86_SVM_OPS_H */
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 5bc2ed7d79c0..0a4272faf80f 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -35,6 +35,7 @@
 /**
  * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
  * @svm:	struct vcpu_svm *
+ * @hsave_pa:	unsigned long
  */
 SYM_FUNC_START(__svm_vcpu_run)
 	push %_ASM_BP
@@ -49,6 +50,9 @@ SYM_FUNC_START(__svm_vcpu_run)
 #endif
 	push %_ASM_BX
 
+	/* @hsave_pa is needed last after vmexit, save it first.  */
+	push %_ASM_ARG2
+
 	/* Save @svm. */
 	push %_ASM_ARG1
 
@@ -124,6 +128,11 @@ SYM_FUNC_START(__svm_vcpu_run)
 5:	vmsave %_ASM_AX
 6:
 
+	/* Pop @hsave_pa and restore GSBASE, allowing access to percpu data.  */
+	pop %_ASM_AX
+7:	vmload %_ASM_AX
+8:
+
 #ifdef CONFIG_RETPOLINE
 	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
@@ -187,10 +196,14 @@ SYM_FUNC_START(__svm_vcpu_run)
 50:	cmpb $0, kvm_rebooting
 	jne 6b
 	ud2
+70:	cmpb $0, kvm_rebooting
+	jne 8b
+	ud2
 
 	_ASM_EXTABLE(1b, 10b)
 	_ASM_EXTABLE(3b, 30b)
 	_ASM_EXTABLE(5b, 50b)
+	_ASM_EXTABLE(7b, 70b)
 
 SYM_FUNC_END(__svm_vcpu_run)
 
-- 
2.31.1




* [PATCH v2 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (5 preceding siblings ...)
  2022-11-08 15:15 ` [PATCH v2 6/8] KVM: SVM: restore host save area from assembly Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-09  1:14   ` Sean Christopherson
  2022-11-08 15:15 ` [PATCH v2 8/8] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers Paolo Bonzini
  2022-11-08 19:43 ` [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Nathan Chancellor
  8 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

Restoration of the host IA32_SPEC_CTRL value is probably too late
with respect to the return thunk training sequence.

With respect to the user/kernel boundary, AMD says, "If software chooses
to toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
exit), software should set STIBP to 1 before executing the return thunk
training sequence." I assume the same requirements apply to the guest/host
boundary. The return thunk training sequence is in vmenter.S, quite close
to the VM-exit. On hosts without V_SPEC_CTRL, however, the host's
IA32_SPEC_CTRL value is not restored until much later.

To avoid this, move the restoration of host SPEC_CTRL to assembly and,
for consistency, move the restoration of the guest SPEC_CTRL as well.
This is not particularly difficult, apart from some care to cover both
32- and 64-bit, and to share code between SEV-ES and normal vmentry.
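
Putting this together with patches 5 and 6, the intended ordering
after VM-exit is roughly:

  vmsave  <vmcb01 pa>           # save guest state
  vmload  <host save area pa>   # GS base valid again, percpu data usable
  FILL_RETURN_BUFFER            # needs percpu data
  RESTORE_HOST_SPEC_CTRL        # restore host IA32_SPEC_CTRL before UNTRAIN_RET
  UNTRAIN_RET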

Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kernel/cpu/bugs.c     |  13 +----
 arch/x86/kvm/kvm-asm-offsets.c |   1 +
 arch/x86/kvm/svm/svm.c         |  38 +++++-------
 arch/x86/kvm/svm/svm.h         |   4 +-
 arch/x86/kvm/svm/vmenter.S     | 102 ++++++++++++++++++++++++++++++++-
 5 files changed, 121 insertions(+), 37 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index da7c361f47e0..6ec0b7ce7453 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -196,22 +196,15 @@ void __init check_bugs(void)
 }
 
 /*
- * NOTE: This function is *only* called for SVM.  VMX spec_ctrl handling is
- * done in vmenter.S.
+ * NOTE: This function is *only* called for SVM, since Intel uses
+ * MSR_IA32_SPEC_CTRL for SSBD.
  */
 void
 x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
 {
-	u64 msrval, guestval = guest_spec_ctrl, hostval = spec_ctrl_current();
+	u64 guestval, hostval;
 	struct thread_info *ti = current_thread_info();
 
-	if (static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) {
-		if (hostval != guestval) {
-			msrval = setguest ? guestval : hostval;
-			wrmsrl(MSR_IA32_SPEC_CTRL, msrval);
-		}
-	}
-
 	/*
 	 * If SSBD is not handled in MSR_SPEC_CTRL on AMD, update
 	 * MSR_AMD64_L2_CFG or MSR_VIRT_SPEC_CTRL if supported.
diff --git a/arch/x86/kvm/kvm-asm-offsets.c b/arch/x86/kvm/kvm-asm-offsets.c
index f83e88b85bf2..b2877c2c8df1 100644
--- a/arch/x86/kvm/kvm-asm-offsets.c
+++ b/arch/x86/kvm/kvm-asm-offsets.c
@@ -16,6 +16,7 @@ static void __used common(void)
 		BLANK();
 		OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
 		OFFSET(SVM_current_vmcb, vcpu_svm, current_vmcb);
+		OFFSET(SVM_spec_ctrl, vcpu_svm, spec_ctrl);
 		OFFSET(SVM_vmcb01, vcpu_svm, vmcb01);
 		OFFSET(KVM_VMCB_pa, kvm_vmcb_info, pa);
 	}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e15f6ea9e5cc..512bc06a4ba1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -730,6 +730,15 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
 	u32 offset;
 	u32 *msrpm;
 
+	/*
+	 * For non-nested case:
+	 * If the L01 MSR bitmap does not intercept the MSR, then we need to
+	 * save it.
+	 *
+	 * For nested case:
+	 * If the L02 MSR bitmap does not intercept the MSR, then we need to
+	 * save it.
+	 */
 	msrpm = is_guest_mode(vcpu) ? to_svm(vcpu)->nested.msrpm:
 				      to_svm(vcpu)->msrpm;
 
@@ -3911,18 +3920,19 @@ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
 	return EXIT_FASTPATH_NONE;
 }
 
-static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
+static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_intercepted)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
 	guest_state_enter_irqoff();
 
 	if (sev_es_guest(vcpu->kvm)) {
-		__svm_sev_es_vcpu_run(svm);
+		__svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted);
 	} else {
 		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
-		__svm_vcpu_run(svm, __sme_page_pa(sd->save_area));
+		__svm_vcpu_run(svm, __sme_page_pa(sd->save_area),
+			       spec_ctrl_intercepted);
 	}
 
 	guest_state_exit_irqoff();
@@ -3931,6 +3941,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+	bool spec_ctrl_intercepted = msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);
 
 	trace_kvm_entry(vcpu);
 
@@ -3989,26 +4000,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 	if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
 		x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
 
-	svm_vcpu_enter_exit(vcpu);
-
-	/*
-	 * We do not use IBRS in the kernel. If this vCPU has used the
-	 * SPEC_CTRL MSR it may have left it on; save the value and
-	 * turn it off. This is much more efficient than blindly adding
-	 * it to the atomic save/restore list. Especially as the former
-	 * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
-	 *
-	 * For non-nested case:
-	 * If the L01 MSR bitmap does not intercept the MSR, then we need to
-	 * save it.
-	 *
-	 * For nested case:
-	 * If the L02 MSR bitmap does not intercept the MSR, then we need to
-	 * save it.
-	 */
-	if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL) &&
-	    unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
-		svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
+	svm_vcpu_enter_exit(vcpu, spec_ctrl_intercepted);
 
 	if (!sev_es_guest(vcpu->kvm))
 		reload_tss(vcpu);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 932f26be5675..bf9ff39dc420 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -683,7 +683,7 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 
 /* vmenter.S */
 
-void __svm_sev_es_vcpu_run(struct vcpu_svm *svm);
-void __svm_vcpu_run(struct vcpu_svm *svm, unsigned long hsave_pa);
+void __svm_sev_es_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted);
+void __svm_vcpu_run(struct vcpu_svm *svm, unsigned long hsave_pa, bool spec_ctrl_intercepted);
 
 #endif
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 0a4272faf80f..a02eef724379 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -32,10 +32,70 @@
 
 .section .noinstr.text, "ax"
 
+.macro RESTORE_GUEST_SPEC_CTRL
+	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
+	ALTERNATIVE_2 "", \
+		"jmp 800f", X86_FEATURE_MSR_SPEC_CTRL, \
+		"", X86_FEATURE_V_SPEC_CTRL
+801:
+.endm
+
+.macro RESTORE_HOST_SPEC_CTRL
+	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
+	ALTERNATIVE_2 "", \
+		"jmp 900f", X86_FEATURE_MSR_SPEC_CTRL, \
+		"", X86_FEATURE_V_SPEC_CTRL
+901:
+.endm
+
+.macro RESTORE_SPEC_CTRL_BODY
+800:
+	/*
+	 * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
+	 * host's, write the MSR.  This is kept out-of-line so that the common
+	 * case does not have to jump.
+	 *
+	 * IMPORTANT: To avoid RSB underflow attacks and any other nastiness,
+	 * there must not be any returns or indirect branches between this code
+	 * and vmentry.
+	 */
+	movl SVM_spec_ctrl(%_ASM_DI), %eax
+	cmp PER_CPU_VAR(x86_spec_ctrl_current), %eax
+	je 801b
+	mov $MSR_IA32_SPEC_CTRL, %ecx
+	xor %edx, %edx
+	wrmsr
+	jmp 801b
+
+900:
+	/* Same for after vmexit.  */
+	mov $MSR_IA32_SPEC_CTRL, %ecx
+
+	/*
+	 * Load the value that the guest had written into MSR_IA32_SPEC_CTRL,
+	 * if it was not intercepted during guest execution.
+	 */
+	cmpb $0, (%_ASM_SP)
+	jnz 998f
+	rdmsr
+	movl %eax, SVM_spec_ctrl(%_ASM_DI)
+998:
+
+	/* Now restore the host value of the MSR if different from the guest's.  */
+	movl PER_CPU_VAR(x86_spec_ctrl_current), %eax
+	cmp SVM_spec_ctrl(%_ASM_DI), %eax
+	je 901b
+	xor %edx, %edx
+	wrmsr
+	jmp 901b
+.endm
+
+
 /**
  * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
  * @svm:	struct vcpu_svm *
  * @hsave_pa:	unsigned long
+ * @spec_ctrl_intercepted: bool
  */
 SYM_FUNC_START(__svm_vcpu_run)
 	push %_ASM_BP
@@ -50,7 +110,12 @@ SYM_FUNC_START(__svm_vcpu_run)
 #endif
 	push %_ASM_BX
 
-	/* @hsave_pa is needed last after vmexit, save it first.  */
+	/*
+	 * Both @spec_ctrl_intercepted and @hsave_pa are used only after vmexit.
+	 * @spec_ctrl_intercepted is needed later and accessed directly from
+	 * the stack in RESTORE_HOST_SPEC_CTRL, so save it first.
+	 */
+	push %_ASM_ARG3
 	push %_ASM_ARG2
 
 	/* Save @svm. */
@@ -61,6 +126,8 @@ SYM_FUNC_START(__svm_vcpu_run)
 	mov %_ASM_ARG1, %_ASM_DI
 .endif
 
+	RESTORE_GUEST_SPEC_CTRL
+
 	/*
 	 * Use a single vmcb (vmcb01 because it's always valid) for
 	 * context switching guest state via VMLOAD/VMSAVE, that way
@@ -138,6 +205,8 @@ SYM_FUNC_START(__svm_vcpu_run)
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
 #endif
 
+	RESTORE_HOST_SPEC_CTRL
+
 	/*
 	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
 	 * untrained as soon as we exit the VM and are back to the
@@ -173,6 +242,9 @@ SYM_FUNC_START(__svm_vcpu_run)
 	xor %r15d, %r15d
 #endif
 
+	/* "Pop" @spec_ctrl_intercepted.  */
+	pop %_ASM_BX
+
 	pop %_ASM_BX
 
 #ifdef CONFIG_X86_64
@@ -187,6 +259,8 @@ SYM_FUNC_START(__svm_vcpu_run)
 	pop %_ASM_BP
 	RET
 
+	RESTORE_SPEC_CTRL_BODY
+
 10:	cmpb $0, kvm_rebooting
 	jne 2b
 	ud2
@@ -210,6 +284,7 @@ SYM_FUNC_END(__svm_vcpu_run)
 /**
  * __svm_sev_es_vcpu_run - Run a SEV-ES vCPU via a transition to SVM guest mode
  * @svm:	struct vcpu_svm *
+ * @spec_ctrl_intercepted: bool
  */
 SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	push %_ASM_BP
@@ -224,8 +299,21 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 #endif
 	push %_ASM_BX
 
+	/* Save @spec_ctrl_intercepted for RESTORE_HOST_SPEC_CTRL. */
+	push %_ASM_ARG2
+
+	/* Save @svm. */
+	push %_ASM_ARG1
+
+.ifnc _ASM_ARG1, _ASM_DI
+	/* Move @svm to RDI for RESTORE_GUEST_SPEC_CTRL. */
+	mov %_ASM_ARG1, %_ASM_DI
+.endif
+
+	RESTORE_GUEST_SPEC_CTRL
+
 	/* Get svm->current_vmcb->pa into RAX. */
-	mov SVM_current_vmcb(%_ASM_ARG1), %_ASM_AX
+	mov SVM_current_vmcb(%_ASM_DI), %_ASM_AX
 	mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX
 
 	/* Enter guest mode */
@@ -235,11 +323,16 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 
 2:	cli
 
+	/* Pop @svm to RDI, guest registers have been saved already. */
+	pop %_ASM_DI
+
 #ifdef CONFIG_RETPOLINE
 	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
 #endif
 
+	RESTORE_HOST_SPEC_CTRL
+
 	/*
 	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
 	 * untrained as soon as we exit the VM and are back to the
@@ -249,6 +342,9 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	 */
 	UNTRAIN_RET
 
+	/* "Pop" @spec_ctrl_intercepted.  */
+	pop %_ASM_BX
+
 	pop %_ASM_BX
 
 #ifdef CONFIG_X86_64
@@ -263,6 +359,8 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	pop %_ASM_BP
 	RET
 
+	RESTORE_SPEC_CTRL_BODY
+
 3:	cmpb $0, kvm_rebooting
 	jne 2b
 	ud2
-- 
2.31.1




* [PATCH v2 8/8] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (6 preceding siblings ...)
  2022-11-08 15:15 ` [PATCH v2 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-08 19:43 ` [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Nathan Chancellor
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson, seanjc

x86_virt_spec_ctrl only deals with the paravirtualized
MSR_IA32_VIRT_SPEC_CTRL now and does not handle MSR_IA32_SPEC_CTRL
anymore; remove the corresponding, unused argument.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/spec-ctrl.h | 10 +++++-----
 arch/x86/kernel/cpu/bugs.c       |  2 +-
 arch/x86/kvm/svm/svm.c           |  4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h
index 5393babc0598..cb0386fc4dc3 100644
--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -13,7 +13,7 @@
  * Takes the guest view of SPEC_CTRL MSR as a parameter and also
  * the guest's version of VIRT_SPEC_CTRL, if emulated.
  */
-extern void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool guest);
+extern void x86_virt_spec_ctrl(u64 guest_virt_spec_ctrl, bool guest);
 
 /**
  * x86_spec_ctrl_set_guest - Set speculation control registers for the guest
@@ -24,9 +24,9 @@ extern void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bo
  * Avoids writing to the MSR if the content/bits are the same
  */
 static inline
-void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
+void x86_spec_ctrl_set_guest(u64 guest_virt_spec_ctrl)
 {
-	x86_virt_spec_ctrl(guest_spec_ctrl, guest_virt_spec_ctrl, true);
+	x86_virt_spec_ctrl(guest_virt_spec_ctrl, true);
 }
 
 /**
@@ -38,9 +38,9 @@ void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
  * Avoids writing to the MSR if the content/bits are the same
  */
 static inline
-void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
+void x86_spec_ctrl_restore_host(u64 guest_virt_spec_ctrl)
 {
-	x86_virt_spec_ctrl(guest_spec_ctrl, guest_virt_spec_ctrl, false);
+	x86_virt_spec_ctrl(guest_virt_spec_ctrl, false);
 }
 
 /* AMD specific Speculative Store Bypass MSR data */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 6ec0b7ce7453..3e3230cccaa7 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -200,7 +200,7 @@ void __init check_bugs(void)
  * MSR_IA32_SPEC_CTRL for SSBD.
  */
 void
-x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
+x86_virt_spec_ctrl(u64 guest_virt_spec_ctrl, bool setguest)
 {
 	u64 guestval, hostval;
 	struct thread_info *ti = current_thread_info();
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 512bc06a4ba1..e6706fd88cfa 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3998,7 +3998,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 	 * being speculatively taken.
 	 */
 	if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
-		x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
+		x86_spec_ctrl_set_guest(svm->virt_spec_ctrl);
 
 	svm_vcpu_enter_exit(vcpu, spec_ctrl_intercepted);
 
@@ -4006,7 +4006,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 		reload_tss(vcpu);
 
 	if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
-		x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl);
+		x86_spec_ctrl_restore_host(svm->virt_spec_ctrl);
 
 	if (!sev_es_guest(vcpu->kvm)) {
 		vcpu->arch.cr2 = svm->vmcb->save.cr2;
-- 
2.31.1



* Re: [PATCH v2 0/8] KVM: SVM: fixes for vmentry code
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (7 preceding siblings ...)
  2022-11-08 15:15 ` [PATCH v2 8/8] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers Paolo Bonzini
@ 2022-11-08 19:43 ` Nathan Chancellor
  8 siblings, 0 replies; 17+ messages in thread
From: Nathan Chancellor @ 2022-11-08 19:43 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, thomas.lendacky, andrew.cooper3, peterz,
	jmattson, seanjc

On Tue, Nov 08, 2022 at 10:15:24AM -0500, Paolo Bonzini wrote:
> This series comprises two related fixes:
> 
> - the FILL_RETURN_BUFFER macro in -next needs to access percpu data,
>   hence the GS segment base needs to be loaded before FILL_RETURN_BUFFER.
>   This means moving guest vmload/vmsave and host vmload to assembly
>   (patches 5 and 6).
> 
> - because AMD wants the OS to set STIBP to 1 before executing the
>   return thunk (un)training sequence, IA32_SPEC_CTRL must be restored
>   before UNTRAIN_RET, too.  This must also be moved to assembly and,
>   for consistency, the guest SPEC_CTRL is also loaded in there
>   (patch 7).
> 
> Neither is particularly hard; however, because of 32-bit systems, one needs
> to keep the number of arguments to __svm_vcpu_run to three or fewer.
> One is taken for whether IA32_SPEC_CTRL is intercepted, and one for the
> host save area, so all accesses to the vcpu_svm struct have to be done
> from assembly too.  This is done in patches 2 to 4, and it turns out
> not to be that bad; in fact I think the code is simpler than before
> after these prerequisites, and even at the end of the series it is not
> much harder to follow despite doing a lot more stuff.  Care has been
> taken to keep the "normal" and SEV-ES code as similar as possible,
> even though the latter would not hit the three-argument barrier.
> 
> The above summary leaves out the more mundane patches 1 and 8.  The
> former introduces a separate asm-offsets.c file for KVM, so that
> kernel/asm-offsets.c does not have to do ugly includes with ../ paths.
> The latter is dead code removal.
> 
> Thanks,
> 
> Paolo
> 
> v1->v2: use a separate asm-offsets.c file instead of hacking around
> 	the arch/x86/kvm/svm/svm.h file; this could also have been done
> 	with just a "#ifndef COMPILE_OFFSETS", but Sean's suggestion
> 	is cleaner and there is a precedent in drivers/memory/ for
> 	private asm-offsets files
> 
> 	keep preparatory cleanups together at the beginning of the
> 	series
> 
> 	move SPEC_CTRL save/restore out of line [Jim]
> 
> Paolo Bonzini (8):
>   KVM: x86: use a separate asm-offsets.c file
>   KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm
>   KVM: SVM: adjust register allocation for __svm_vcpu_run
>   KVM: SVM: retrieve VMCB from assembly
>   KVM: SVM: move guest vmsave/vmload to assembly
>   KVM: SVM: restore host save area from assembly
>   KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
>   x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and
>     callers
> 
>  arch/x86/include/asm/spec-ctrl.h |  10 +-
>  arch/x86/kernel/asm-offsets.c    |   6 -
>  arch/x86/kernel/cpu/bugs.c       |  15 +-
>  arch/x86/kvm/Makefile            |  12 ++
>  arch/x86/kvm/kvm-asm-offsets.c   |  28 ++++
>  arch/x86/kvm/svm/svm.c           |  53 +++----
>  arch/x86/kvm/svm/svm.h           |   4 +-
>  arch/x86/kvm/svm/svm_ops.h       |   5 -
>  arch/x86/kvm/svm/vmenter.S       | 241 ++++++++++++++++++++++++-------
>  arch/x86/kvm/vmx/vmenter.S       |   2 +-
>  10 files changed, 259 insertions(+), 117 deletions(-)
>  create mode 100644 arch/x86/kvm/kvm-asm-offsets.c
> 
> -- 
> 2.31.1
> 
> 

I applied this series on next-20221108, which has the call depth
tracking patches, and I no longer see the panic when starting a guest on
my AMD test system.  I can still start a simple nested guest without
any problems, which is about the extent of my regular KVM testing.  I
also tested the same kernel on my Intel systems and saw no problems
there, but that seems expected given the diffstat.  Thank you for the
quick response and fixes!

Tested-by: Nathan Chancellor <nathan@kernel.org>

One small nit I noticed: kvm-asm-offsets.h should be added to a
.gitignore file in arch/x86/kvm.

$ git status --short
?? arch/x86/kvm/kvm-asm-offsets.h
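
Something like this would do (a sketch; the exact file layout is just
a suggestion):

  $ cat arch/x86/kvm/.gitignore
  /kvm-asm-offsets.h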

Cheers,
Nathan


* Re: [PATCH v2 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm
  2022-11-08 15:15 ` [PATCH v2 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm Paolo Bonzini
@ 2022-11-08 20:54   ` Sean Christopherson
  2022-11-09 10:35     ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2022-11-08 20:54 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On Tue, Nov 08, 2022, Paolo Bonzini wrote:
> Since registers are reachable through vcpu_svm, and we will
> need to access more fields of that struct, pass it instead
> of the regs[] array.
> 
> No functional change intended.
> 
> Cc: stable@vger.kernel.org
> Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")

I don't think a Fixes: tag for a pure nop patch is fair to the original commit.


* Re: [PATCH v2 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run
  2022-11-08 15:15 ` [PATCH v2 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run Paolo Bonzini
@ 2022-11-08 20:55   ` Sean Christopherson
  0 siblings, 0 replies; 17+ messages in thread
From: Sean Christopherson @ 2022-11-08 20:55 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On Tue, Nov 08, 2022, Paolo Bonzini wrote:
> In preparation for moving vmload/vmsave to __svm_vcpu_run,
> keep the pointer to the struct vcpu_svm in %rdi.  This way
> it is possible to load svm->vmcb01.pa in %rax without
> clobbering the pointer to svm itself.
> 
> No functional change intended.
> 
> Cc: stable@vger.kernel.org
> Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")

Same nit: Fixes: shouldn't be provided.


* Re: [PATCH v2 4/8] KVM: SVM: retrieve VMCB from assembly
  2022-11-08 15:15 ` [PATCH v2 4/8] KVM: SVM: retrieve VMCB from assembly Paolo Bonzini
@ 2022-11-09  0:53   ` Sean Christopherson
  2022-11-09  9:09     ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2022-11-09  0:53 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

[-- Attachment #1: Type: text/plain, Size: 3005 bytes --]

On Tue, Nov 08, 2022, Paolo Bonzini wrote:
> This is needed in order to keep the number of arguments to 3 or less,
> after adding hsave_pa and spec_ctrl_intercepted.  32-bit builds only
> support passing three arguments in registers, fortunately all other
> data is reachable from the vcpu_svm struct.

Is it actually a problem if parameters are passed on the stack?  The assembly
code mostly creates a stack frame, i.e. %ebp can be used to pull values off the
stack.

I don't think it will matter in the end (more below), but hypothetically if we
ended up with

  void __svm_vcpu_run(struct kvm_vcpu *vcpu, unsigned long vmcb_pa,
		      unsigned long gsave_pa, unsigned long hsave_pa,
		      bool spec_ctrl_intercepted);

then the asm prologue would be something like:

	/*
	 * Save @vcpu, @gsave_pa, @hsave_pa, and @spec_ctrl_intercepted, all of
	 * which are needed after VM-Exit.
	 */
	push %_ASM_ARG1
	push %_ASM_ARG3

  #ifdef CONFIG_X86_64
	push %_ASM_ARG4
	push %_ASM_ARG5
  #else
	push %_ASM_ARG4_EBP(%ebp)
	push %_ASM_ARG5_EBP(%ebp)
  #endif

which is a few extra memory accesses, especially for 32-bit, but no one cares
about 32-bit and I highly doubt a few extra PUSH+POP instructions will be noticeable.

Threading in yesterday's conversation...

> > What about adding dedicated structs to hold the non-regs params for VM-Enter and
> > VMRUN?  Grabbing stuff willy-nilly in the assembly code makes the flows difficult
> > to read as there's nothing in the C code that describes what fields are actually
> > used.
>
> What fields are actually used is (like with any other function)
> "potentially all, you'll have to read the source code and in fact you
> can just read asm-offsets.c instead".  What I mean is, I cannot offhand
> see or remember what fields are touched by svm_prepare_switch_to_guest,
> why would __svm_vcpu_run be any different?

It's different because if it were a normal C function, it would simply take
@vcpu, and maybe @spec_ctrl_intercepted to shave cycles after CLGI.  But because
it's assembly and doesn't have to_svm() readily available (among other restrictions),
__svm_vcpu_run() ends up taking a mishmash of parameters, which for me makes it
rather difficult to understand what to expect.

Oooh, and after much staring I realized that the address of the host save area
is passed in because grabbing it after VM-Exit can't work.  That's subtle, and
passing it in isn't strictly necessary; there's no reason the assembly code can't
grab it and stash it on the stack.

What about killing a few birds with one stone?  Move the host save area PA to
its own per-CPU variable, and then grab that from assembly as well.  Then it's
a bit more obvious why the address needs to be saved on the stack across VMRUN,
and my whining about the prototype being funky goes away.  __svm_vcpu_run() and
__svm_sev_es_vcpu_run() would have identical prototypes too.
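
As a sketch, assuming the per-CPU variable from the attached patch
(everything else in __svm_vcpu_run elided):

	/* Stash the host save area PA while percpu data is reachable. */
	mov PER_CPU_VAR(svm_host_save_area_pa), %_ASM_AX
	push %_ASM_AX

	...

	/* After VM-Exit, vmload the host save area before percpu accesses. */
	pop %_ASM_AX
	vmload %_ASM_AX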

Attached patches would slot in early in the series.  Tested SVM and SME-enabled
kernels, didn't test the SEV-ES bits.

[-- Attachment #2: 0001-KVM-SVM-Add-a-helper-to-convert-a-SME-aware-PA-back-.patch --]
[-- Type: text/x-diff, Size: 3055 bytes --]

From 59a4b14ec509e30e614beaa20ceb920c181e3739 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Tue, 8 Nov 2022 15:25:41 -0800
Subject: [PATCH 1/2] KVM: SVM: Add a helper to convert a SME-aware PA back to
 a struct page

Add __sme_pa_to_page() to pair with __sme_page_pa() and use it to replace
open coded equivalents, including for "iopm_base", which previously
avoided having to do __sme_clr() by storing the raw PA in the global
variable.

Opportunistically convert __sme_page_pa() to a helper to provide type
safety.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c |  9 ++++-----
 arch/x86/kvm/svm/svm.h | 16 +++++++++++++++-
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 58f0077d9357..bb7427fd1242 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1065,8 +1065,7 @@ static void svm_hardware_unsetup(void)
 	for_each_possible_cpu(cpu)
 		svm_cpu_uninit(cpu);
 
-	__free_pages(pfn_to_page(iopm_base >> PAGE_SHIFT),
-	get_order(IOPM_SIZE));
+	__free_pages(__sme_pa_to_page(iopm_base), get_order(IOPM_SIZE));
 	iopm_base = 0;
 }
 
@@ -1243,7 +1242,7 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 	if (!kvm_hlt_in_guest(vcpu->kvm))
 		svm_set_intercept(svm, INTERCEPT_HLT);
 
-	control->iopm_base_pa = __sme_set(iopm_base);
+	control->iopm_base_pa = iopm_base;
 	control->msrpm_base_pa = __sme_set(__pa(svm->msrpm));
 	control->int_ctl = V_INTR_MASKING_MASK;
 
@@ -1443,7 +1442,7 @@ static void svm_vcpu_free(struct kvm_vcpu *vcpu)
 
 	sev_free_vcpu(vcpu);
 
-	__free_page(pfn_to_page(__sme_clr(svm->vmcb01.pa) >> PAGE_SHIFT));
+	__free_page(__sme_pa_to_page(svm->vmcb01.pa));
 	__free_pages(virt_to_page(svm->msrpm), get_order(MSRPM_SIZE));
 }
 
@@ -4970,7 +4969,7 @@ static __init int svm_hardware_setup(void)
 
 	iopm_va = page_address(iopm_pages);
 	memset(iopm_va, 0xff, PAGE_SIZE * (1 << order));
-	iopm_base = page_to_pfn(iopm_pages) << PAGE_SHIFT;
+	iopm_base = __sme_page_pa(iopm_pages);
 
 	init_msrpm_offsets();
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6a7686bf6900..9a2567803006 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -24,7 +24,21 @@
 
 #include "kvm_cache_regs.h"
 
-#define __sme_page_pa(x) __sme_set(page_to_pfn(x) << PAGE_SHIFT)
+/*
+ * Helpers to convert to/from physical addresses for pages whose address is
+ * consumed directly by hardware.  Even though it's a physical address, SVM
+ * often restricts the address to the natural width, hence 'unsigned long'
+ * instead of 'hpa_t'.
+ */
+static inline unsigned long __sme_page_pa(struct page *page)
+{
+	return __sme_set(page_to_pfn(page) << PAGE_SHIFT);
+}
+
+static inline struct page *__sme_pa_to_page(unsigned long pa)
+{
+	return pfn_to_page(__sme_clr(pa) >> PAGE_SHIFT);
+}
 
 #define	IOPM_SIZE PAGE_SIZE * 3
 #define	MSRPM_SIZE PAGE_SIZE * 2

base-commit: 88cd4a037496682f164e7ae8dac13cd4ec8edc2b
-- 
2.38.1.431.g37b22c650d-goog


[-- Attachment #3: 0002-KVM-SVM-Snapshot-host-save-area-PA-in-dedicated-per-.patch --]
[-- Type: text/x-diff, Size: 4804 bytes --]

From e9a4f9af27d7d384dc36ad66a9856f9decb8ce02 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Tue, 8 Nov 2022 15:32:58 -0800
Subject: [PATCH 2/2] KVM: SVM: Snapshot host save area PA in dedicated per-CPU
 variable

Track the SME-aware physical address of the host save area outside of
"struct svm_cpu_data" so that a future patch can easily reference the
address from assembly code.

The "overhead" of always allocating the per-CPU data is more or less
meaningless in the grand scheme.  And when KVM AMD is built as a module,
it's actually more costly (by a single pointer) to dynamically allocate
the struct, as the per-CPU data is allocated only when the module is
loaded.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c | 30 ++++++++++++++++--------------
 arch/x86/kvm/svm/svm.h |  2 +-
 2 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index bb7427fd1242..8fc02bef4f85 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -246,6 +246,7 @@ struct kvm_ldttss_desc {
 } __attribute__((packed));
 
 DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
+DEFINE_PER_CPU(unsigned long, svm_host_save_area_pa);
 
 /*
  * Only MSR_TSC_AUX is switched via the user return hook.  EFER is switched via
@@ -597,7 +598,7 @@ static int svm_hardware_enable(void)
 
 	wrmsrl(MSR_EFER, efer | EFER_SVME);
 
-	wrmsrl(MSR_VM_HSAVE_PA, __sme_page_pa(sd->save_area));
+	wrmsrl(MSR_VM_HSAVE_PA, per_cpu(svm_host_save_area_pa, me));
 
 	if (static_cpu_has(X86_FEATURE_TSCRATEMSR)) {
 		/*
@@ -653,12 +654,13 @@ static void svm_cpu_uninit(int cpu)
 
 	per_cpu(svm_data, cpu) = NULL;
 	kfree(sd->sev_vmcbs);
-	__free_page(sd->save_area);
+	__free_page(__sme_pa_to_page(per_cpu(svm_host_save_area_pa, cpu)));
 	kfree(sd);
 }
 
 static int svm_cpu_init(int cpu)
 {
+	struct page *host_save_area;
 	struct svm_cpu_data *sd;
 	int ret = -ENOMEM;
 
@@ -666,20 +668,20 @@ static int svm_cpu_init(int cpu)
 	if (!sd)
 		return ret;
 	sd->cpu = cpu;
-	sd->save_area = alloc_page(GFP_KERNEL | __GFP_ZERO);
-	if (!sd->save_area)
-		goto free_cpu_data;
 
 	ret = sev_cpu_init(sd);
 	if (ret)
-		goto free_save_area;
+		goto free_cpu_data;
+
+	host_save_area = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	if (!host_save_area)
+		goto free_cpu_data;
 
 	per_cpu(svm_data, cpu) = sd;
+	per_cpu(svm_host_save_area_pa, cpu) = __sme_page_pa(host_save_area);
 
 	return 0;
 
-free_save_area:
-	__free_page(sd->save_area);
 free_cpu_data:
 	kfree(sd);
 	return ret;
@@ -1449,7 +1451,6 @@ static void svm_vcpu_free(struct kvm_vcpu *vcpu)
 static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
-	struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
 	if (sev_es_guest(vcpu->kvm))
 		sev_es_unmap_ghcb(svm);
@@ -1461,10 +1462,13 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 	 * Save additional host state that will be restored on VMEXIT (sev-es)
 	 * or subsequent vmload of host save area.
 	 */
-	vmsave(__sme_page_pa(sd->save_area));
+	vmsave(per_cpu(svm_host_save_area_pa, vcpu->cpu));
 	if (sev_es_guest(vcpu->kvm)) {
 		struct sev_es_save_area *hostsa;
-		hostsa = (struct sev_es_save_area *)(page_address(sd->save_area) + 0x400);
+		struct page *hostsa_page;
+
+		hostsa_page = __sme_pa_to_page(per_cpu(svm_host_save_area_pa, vcpu->cpu));
+		hostsa = (struct sev_es_save_area *)(page_address(hostsa_page) + 0x400);
 
 		sev_es_prepare_switch_to_guest(hostsa);
 	}
@@ -3920,8 +3924,6 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 	if (sev_es_guest(vcpu->kvm)) {
 		__svm_sev_es_vcpu_run(vmcb_pa);
 	} else {
-		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
-
 		/*
 		 * Use a single vmcb (vmcb01 because it's always valid) for
 		 * context switching guest state via VMLOAD/VMSAVE, that way
@@ -3932,7 +3934,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 		__svm_vcpu_run(vmcb_pa, (unsigned long *)&vcpu->arch.regs);
 		vmsave(svm->vmcb01.pa);
 
-		vmload(__sme_page_pa(sd->save_area));
+		vmload(per_cpu(svm_host_save_area_pa, vcpu->cpu));
 	}
 
 	guest_state_exit_irqoff();
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 9a2567803006..d9ec8ea69a56 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -303,7 +303,6 @@ struct svm_cpu_data {
 	u32 min_asid;
 	struct kvm_ldttss_desc *tss_desc;
 
-	struct page *save_area;
 	struct vmcb *current_vmcb;
 
 	/* index = sev_asid, value = vmcb pointer */
@@ -311,6 +310,7 @@ struct svm_cpu_data {
 };
 
 DECLARE_PER_CPU(struct svm_cpu_data *, svm_data);
+DECLARE_PER_CPU(unsigned long, svm_host_save_area_pa);
 
 void recalc_intercepts(struct vcpu_svm *svm);
 
-- 
2.38.1.431.g37b22c650d-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  2022-11-08 15:15 ` [PATCH v2 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
@ 2022-11-09  1:14   ` Sean Christopherson
  2022-11-09  9:29     ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2022-11-09  1:14 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On Tue, Nov 08, 2022, Paolo Bonzini wrote:
> diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
> index 0a4272faf80f..a02eef724379 100644
> --- a/arch/x86/kvm/svm/vmenter.S
> +++ b/arch/x86/kvm/svm/vmenter.S
> @@ -32,10 +32,70 @@
>  
>  .section .noinstr.text, "ax"
>  
> +.macro RESTORE_GUEST_SPEC_CTRL
> +	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
> +	ALTERNATIVE_2 "", \
> +		"jmp 800f", X86_FEATURE_MSR_SPEC_CTRL, \
> +		"", X86_FEATURE_V_SPEC_CTRL
> +801:
> +.endm
> +
> +.macro RESTORE_HOST_SPEC_CTRL
> +	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
> +	ALTERNATIVE_2 "", \
> +		"jmp 900f", X86_FEATURE_MSR_SPEC_CTRL, \
> +		"", X86_FEATURE_V_SPEC_CTRL
> +901:
> +.endm
> +
> +.macro RESTORE_SPEC_CTRL_BODY

Can we split these into separate macros?  It's a bit more typing, but it's not
immediately obvious that these are two independent chunks (I was expecting a JMP
from the 800 section into the 900 section).

E.g. RESTORE_GUEST_SPEC_CTRL_BODY and RESTORE_HOST_SPEC_CTRL_BODY

> +800:

Ugh, the multiple users make it somewhat ugly, but rather than arbitrary numbers,
what about using named labels to make it easier to understand the branches?

E.g.

---
 arch/x86/kvm/svm/vmenter.S | 43 +++++++++++++++++++++-----------------
 1 file changed, 24 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index a02eef724379..23fd7353f0d0 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -32,24 +32,24 @@
 
 .section .noinstr.text, "ax"
 
-.macro RESTORE_GUEST_SPEC_CTRL
+.macro RESTORE_GUEST_SPEC_CTRL name
 	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
 	ALTERNATIVE_2 "", \
-		"jmp 800f", X86_FEATURE_MSR_SPEC_CTRL, \
+		"jmp .Lrestore_guest_spec_ctrl\name", X86_FEATURE_MSR_SPEC_CTRL, \
 		"", X86_FEATURE_V_SPEC_CTRL
-801:
+.Lpost_restore_guest_spec_ctrl\name:
 .endm
 
-.macro RESTORE_HOST_SPEC_CTRL
+.macro RESTORE_HOST_SPEC_CTRL name
 	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
 	ALTERNATIVE_2 "", \
-		"jmp 900f", X86_FEATURE_MSR_SPEC_CTRL, \
+		"jmp .Lrestore_host_spec_ctrl\name", X86_FEATURE_MSR_SPEC_CTRL, \
 		"", X86_FEATURE_V_SPEC_CTRL
-901:
+.Lpost_restore_host_spec_ctrl\name:
 .endm
 
-.macro RESTORE_SPEC_CTRL_BODY
-800:
+.macro RESTORE_GUEST_SPEC_CTRL_BODY name
+.Lrestore_guest_spec_ctrl\name:
 	/*
 	 * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
 	 * host's, write the MSR.  This is kept out-of-line so that the common
@@ -61,13 +61,16 @@
 	 */
 	movl SVM_spec_ctrl(%_ASM_DI), %eax
 	cmp PER_CPU_VAR(x86_spec_ctrl_current), %eax
-	je 801b
+	je .Lpost_restore_guest_spec_ctrl\name
+
 	mov $MSR_IA32_SPEC_CTRL, %ecx
 	xor %edx, %edx
 	wrmsr
-	jmp 801b
+	jmp .Lpost_restore_guest_spec_ctrl\name
+.endm
 
-900:
+.macro RESTORE_HOST_SPEC_CTRL_BODY name
+.Lrestore_host_spec_ctrl\name:
 	/* Same for after vmexit.  */
 	mov $MSR_IA32_SPEC_CTRL, %ecx
 
@@ -76,18 +79,18 @@
 	 * if it was not intercepted during guest execution.
 	 */
 	cmpb $0, (%_ASM_SP)
-	jnz 998f
+	jnz .Lskip_save_guest_spec_ctrl\name
 	rdmsr
 	movl %eax, SVM_spec_ctrl(%_ASM_DI)
-998:
 
+.Lskip_save_guest_spec_ctrl\name:
 	/* Now restore the host value of the MSR if different from the guest's.  */
 	movl PER_CPU_VAR(x86_spec_ctrl_current), %eax
 	cmp SVM_spec_ctrl(%_ASM_DI), %eax
-	je 901b
+	je .Lpost_restore_host_spec_ctrl\name
 	xor %edx, %edx
 	wrmsr
-	jmp 901b
+	jmp .Lpost_restore_host_spec_ctrl\name
 .endm
 
 
@@ -259,7 +262,8 @@ SYM_FUNC_START(__svm_vcpu_run)
 	pop %_ASM_BP
 	RET
 
-	RESTORE_SPEC_CTRL_BODY
+	RESTORE_GUEST_SPEC_CTRL_BODY
+	RESTORE_HOST_SPEC_CTRL_BODY
 
 10:	cmpb $0, kvm_rebooting
 	jne 2b
@@ -310,7 +314,7 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	mov %_ASM_ARG1, %_ASM_DI
 .endif
 
-	RESTORE_GUEST_SPEC_CTRL
+	RESTORE_GUEST_SPEC_CTRL sev_es
 
 	/* Get svm->current_vmcb->pa into RAX. */
 	mov SVM_current_vmcb(%_ASM_DI), %_ASM_AX
@@ -331,7 +335,7 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
 #endif
 
-	RESTORE_HOST_SPEC_CTRL
+	RESTORE_HOST_SPEC_CTRL sev_es
 
 	/*
 	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
@@ -359,7 +363,8 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	pop %_ASM_BP
 	RET
 
-	RESTORE_SPEC_CTRL_BODY
+	RESTORE_GUEST_SPEC_CTRL_BODY sev_es
+	RESTORE_HOST_SPEC_CTRL_BODY sev_es
 
 3:	cmpb $0, kvm_rebooting
 	jne 2b

base-commit: 0b242ada175d97a556866c48c80310860f634579
-- 

> @@ -61,6 +126,8 @@ SYM_FUNC_START(__svm_vcpu_run)
>  	mov %_ASM_ARG1, %_ASM_DI
>  .endif
>

Can you add a comment to all of these to call out that RAX, RCX, and RDX might
get clobbered?  It's easy to overlook that detail, and it matters on 32-bit where
the arguments use ECX and EDX.  And because the clobber is conditional, it wouldn't
be that hard for a bug to escape initial testing.

	/* This might clobber RAX, RCX, and RDX! */

> +	RESTORE_GUEST_SPEC_CTRL

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/8] KVM: SVM: retrieve VMCB from assembly
  2022-11-09  0:53   ` Sean Christopherson
@ 2022-11-09  9:09     ` Paolo Bonzini
  0 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-09  9:09 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On 11/9/22 01:53, Sean Christopherson wrote:
> On Tue, Nov 08, 2022, Paolo Bonzini wrote:
>> This is needed in order to keep the number of arguments to 3 or less,
>> after adding hsave_pa and spec_ctrl_intercepted.  32-bit builds only
>> support passing three arguments in registers, fortunately all other
>> data is reachable from the vcpu_svm struct.
> 
> Is it actually a problem if parameters are passed on the stack?  The assembly
> code mostly creates a stack frame, i.e. %ebp can be used to pull values off the
> stack.

It's not, but given how little love 32-bit KVM receives, I prefer to 
stick to the subset of the ABI that is "equivalent" to 64-bit.

> no one cares about 32-bit and I highly doubt a few extra PUSH+POP
> instructions will be noticeable.

Same reasoning (no one cares about 32-bits), different conclusions...

>> What fields are actually used is (like with any other function)
>> "potentially all, you'll have to read the source code and in fact you
>> can just read asm-offsets.c instead".  What I mean is, I cannot offhand
>> see or remember what fields are touched by svm_prepare_switch_to_guest,
>> why would __svm_vcpu_run be any different?
> 
> It's different because if it were a normal C function, it would simply take
> @vcpu, and maybe @spec_ctrl_intercepted to shave cycles after CLGI.

Not just for that, but especially to avoid making 
msr_write_intercepted() noinstr.
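
That is, the flag gets computed once in C, outside the noinstr region, and
passed in; roughly (a sketch of the call site, using the svm.c helper):

	bool spec_ctrl_intercepted = msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);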

> But because
> it's assembly and doesn't have to_svm() readily available (among other restrictions),
> __svm_vcpu_run() ends up taking a mishmash of parameters, which for me makes it
> rather difficult to understand what to expect.

Yeah, there could be three reasons to have parameters in assembly:

* you just need them (@svm)

* it's too much of a pain to compute it in assembly 
(@spec_ctrl_intercepted, @hsave_pa)

* it needs to be computed outside the clgi/stgi region (not happening 
here, only mentioned for completeness)

As this patch shows, @vmcb is not much of a pain to compute in assembly: 
it is just two instructions, and not passing it in simplifies register 
allocation (the weird push/pop goes away) because all the arguments 
except @svm/_ASM_ARG1 are needed only after vmexit.
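
In C terms those two instructions are just a double dereference, i.e.
something like:

	/* what the assembly computes, using offsets from kvm-asm-offsets.c */
	unsigned long vmcb_pa = svm->current_vmcb->pa;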

> Oooh, and after much staring I realized that the address of the host save area
> is passed in because grabbing it after VM-Exit can't work.  That's subtle, and
> passing it in isn't strictly necessary; there's no reason the assembly code can't
> grab it and stash it on the stack.

Right, in fact that's not the reason why it's passed in---it's just to 
avoid coding page_to_pfn() in assembly, and to limit the differences 
between the regular and SEV-ES cases.  But using a per-CPU variable is 
fine (either in addition to the struct page, which "wastes" 8 bytes per 
CPU, or as a replacement).

> What about killing a few birds with one stone?  Move the host save area PA to
> its own per-CPU variable, and then grab that from assembly as well.

I would still place it in struct svm_cpu_data itself; I'll see how it 
looks and possibly post v3.
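
Something along these lines, just to give the idea (field name hypothetical
until v3):

	struct svm_cpu_data {
		/* ...existing fields unchanged... */
		struct page *save_area;
		/* SME-aware PA of save_area, for VMSAVE/VMLOAD and vmenter.S */
		unsigned long save_area_pa;
	};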

Paolo


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  2022-11-09  1:14   ` Sean Christopherson
@ 2022-11-09  9:29     ` Paolo Bonzini
  0 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-09  9:29 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On 11/9/22 02:14, Sean Christopherson wrote:
>>
>> +.macro RESTORE_SPEC_CTRL_BODY
> 
> Can we split these into separate macros?  It's a bit more typing, but it's not
> immediately obvious that these are two independent chunks (I was expecting a JMP
> from the 800 section into the 900 section).
> 
> E.g. RESTORE_GUEST_SPEC_CTRL_BODY and RESTORE_HOST_SPEC_CTRL_BODY

Sure, I had it like that in an earlier version.  I didn't see much 
benefit, but it is indeed a bit more readable if you order the macros like this:

.macro RESTORE_GUEST_SPEC_CTRL
.macro RESTORE_GUEST_SPEC_CTRL_BODY
.macro RESTORE_HOST_SPEC_CTRL
.macro RESTORE_HOST_SPEC_CTRL_BODY

>> +800:
>
> Ugh, the multiple users makes it somewhat ugly, but rather than arbitrary numbers,
> what about using named labels to make it easier to understand the branches?

I think it's okay if we separate the macros.

Paolo


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm
  2022-11-08 20:54   ` Sean Christopherson
@ 2022-11-09 10:35     ` Paolo Bonzini
  0 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-09 10:35 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On 11/8/22 21:54, Sean Christopherson wrote:
> On Tue, Nov 08, 2022, Paolo Bonzini wrote:
>> Since registers are reachable through vcpu_svm, and we will
>> need to access more fields of that struct, pass it instead
>> of the regs[] array.
>>
>> No functional change intended.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
> 
> I don't think a Fixes: tag for a pure nop patch is fair to the original commit.

That's for the sake of correct tracking in stable@; still, it's not the 
right commit to point at:

- f14eec0a3203 did not move the RSB stuffing before the GSBASE restore; 
the code before was:

	vmexit_fill_RSB();
  #ifdef CONFIG_X86_64
  	wrmsrl(MSR_GS_BASE, svm->host.gs_base);
  #else

- anyway it wasn't really buggy at the time: it's only in -next that 
FILL_RETURN_BUFFER uses percpu data, because alternatives take care of 
the X86_FEATURE_* checks

The real reason to do all this is to access the percpu host spec_ctrl, 
which in turn is needed for retbleed.  I'll point the Fixes tag to 
a149180fbcf3 ("x86: Add magic AMD return-thunk") instead, again just for 
the sake of tracking releases that need the change to fully fix retbleed.

Paolo


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread

Thread overview: 17+ messages
2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
2022-11-08 15:15 ` [PATCH v2 1/8] KVM: x86: use a separate asm-offsets.c file Paolo Bonzini
2022-11-08 15:15 ` [PATCH v2 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm Paolo Bonzini
2022-11-08 20:54   ` Sean Christopherson
2022-11-09 10:35     ` Paolo Bonzini
2022-11-08 15:15 ` [PATCH v2 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run Paolo Bonzini
2022-11-08 20:55   ` Sean Christopherson
2022-11-08 15:15 ` [PATCH v2 4/8] KVM: SVM: retrieve VMCB from assembly Paolo Bonzini
2022-11-09  0:53   ` Sean Christopherson
2022-11-09  9:09     ` Paolo Bonzini
2022-11-08 15:15 ` [PATCH v2 5/8] KVM: SVM: move guest vmsave/vmload to assembly Paolo Bonzini
2022-11-08 15:15 ` [PATCH v2 6/8] KVM: SVM: restore host save area from assembly Paolo Bonzini
2022-11-08 15:15 ` [PATCH v2 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
2022-11-09  1:14   ` Sean Christopherson
2022-11-09  9:29     ` Paolo Bonzini
2022-11-08 15:15 ` [PATCH v2 8/8] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers Paolo Bonzini
2022-11-08 19:43 ` [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Nathan Chancellor
