linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/8] KVM: SVM: fixes for vmentry code
@ 2022-11-08 15:15 Paolo Bonzini
  2022-11-08 15:15 ` [PATCH v2 1/8] KVM: x86: use a separate asm-offsets.c file Paolo Bonzini
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson, seanjc

This series comprises two related fixes:

- the FILL_RETURN_BUFFER macro in -next needs to access percpu data,
  hence the GS segment base needs to be loaded before FILL_RETURN_BUFFER.
  This means moving guest vmload/vmsave and host vmload to assembly
  (patches 5 and 6).

- because AMD wants the OS to set STIBP to 1 before executing the
  return thunk (un)training sequence, IA32_SPEC_CTRL must be restored
  before UNTRAIN_RET, too.  This must also be moved to assembly and,
  for consistency, the guest SPEC_CTRL is also loaded in there
  (patch 7).

Neither is particularly hard, however because of 32-bit systems one needs
to keep the number of arguments to __svm_vcpu_run to three or fewer.
One is taken for whether IA32_SPEC_CTRL is intercepted, and one for the
host save area, so all accesses to the vcpu_svm struct have to be done
from assembly too.  This is done in patches 2 to 4, and it turns out
not to be that bad; in fact I think the code is simpler than before
after these prerequisites, and even at the end of the series it is not
much harder to follow despite doing a lot more stuff.  Care has been
taken to keep the "normal" and SEV-ES code as similar as possible,
even though the latter would not hit the three argument barrier.

The above summary leaves out the more mundane patches 1 and 8.  The
former introduces a separate asm-offsets.c file for KVM, so that
kernel/asm-offsets.c does not have to do ugly includes with ../ paths.
The latter is dead code removal.

Thanks,

Paolo

v1->v2: use a separate asm-offsets.c file instead of hacking around
	the arch/x86/kvm/svm/svm.h file; this could have been done
	also with just a "#ifndef COMPILE_OFFSETS", but Sean's
	suggestion is cleaner and there is a precedent in
	drivers/memory/ for private asm-offsets files

	keep preparatory cleanups together at the beginning of the
	series

	move SPEC_CTRL save/restore out of line [Jim]

Paolo Bonzini (8):
  KVM: x86: use a separate asm-offsets.c file
  KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm
  KVM: SVM: adjust register allocation for __svm_vcpu_run
  KVM: SVM: retrieve VMCB from assembly
  KVM: SVM: move guest vmsave/vmload to assembly
  KVM: SVM: restore host save area from assembly
  KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and
    callers

 arch/x86/include/asm/spec-ctrl.h |  10 +-
 arch/x86/kernel/asm-offsets.c    |   6 -
 arch/x86/kernel/cpu/bugs.c       |  15 +-
 arch/x86/kvm/Makefile            |  12 ++
 arch/x86/kvm/kvm-asm-offsets.c   |  28 ++++
 arch/x86/kvm/svm/svm.c           |  53 +++----
 arch/x86/kvm/svm/svm.h           |   4 +-
 arch/x86/kvm/svm/svm_ops.h       |   5 -
 arch/x86/kvm/svm/vmenter.S       | 241 ++++++++++++++++++++++++-------
 arch/x86/kvm/vmx/vmenter.S       |   2 +-
 10 files changed, 259 insertions(+), 117 deletions(-)
 create mode 100644 arch/x86/kvm/kvm-asm-offsets.c

-- 
2.31.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/8] KVM: x86: use a separate asm-offsets.c file
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-08 15:15 ` [PATCH v2 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm Paolo Bonzini
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson, seanjc

This already removes the ugly #includes from asm-offsets.c, but especially
it avoids a future error when asm-offsets will try to include svm/svm.h.

This would not work for kernel/asm-offsets.c, because svm/svm.h
includes kvm_cache_regs.h which is not in the include path when
compiling asm-offsets.c.  The problem is not there if the .c file is
in arch/x86/kvm.

Tested with both in-tree and separate-objtree compilation.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kernel/asm-offsets.c  |  6 ------
 arch/x86/kvm/Makefile          |  9 +++++++++
 arch/x86/kvm/kvm-asm-offsets.c | 18 ++++++++++++++++++
 arch/x86/kvm/vmx/vmenter.S     |  2 +-
 4 files changed, 28 insertions(+), 7 deletions(-)
 create mode 100644 arch/x86/kvm/kvm-asm-offsets.c

diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index cb50589a7102..437308004ef2 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -19,7 +19,6 @@
 #include <asm/suspend.h>
 #include <asm/tlbflush.h>
 #include <asm/tdx.h>
-#include "../kvm/vmx/vmx.h"
 
 #ifdef CONFIG_XEN
 #include <xen/interface/xen.h>
@@ -108,9 +107,4 @@ static void __used common(void)
 	OFFSET(TSS_sp0, tss_struct, x86_tss.sp0);
 	OFFSET(TSS_sp1, tss_struct, x86_tss.sp1);
 	OFFSET(TSS_sp2, tss_struct, x86_tss.sp2);
-
-	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
-		BLANK();
-		OFFSET(VMX_spec_ctrl, vcpu_vmx, spec_ctrl);
-	}
 }
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 30f244b64523..a02cf9baacc8 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -34,3 +34,12 @@ endif
 obj-$(CONFIG_KVM)	+= kvm.o
 obj-$(CONFIG_KVM_INTEL)	+= kvm-intel.o
 obj-$(CONFIG_KVM_AMD)	+= kvm-amd.o
+
+AFLAGS_vmx/vmenter.o    := -iquote $(obj)
+$(obj)/vmx/vmenter.o: $(obj)/kvm-asm-offsets.h
+
+$(obj)/kvm-asm-offsets.h: $(obj)/kvm-asm-offsets.s FORCE
+	$(call filechk,offsets,__KVM_ASM_OFFSETS_H__)
+
+targets += kvm-asm-offsets.s
+clean-files += kvm-asm-offsets.h
diff --git a/arch/x86/kvm/kvm-asm-offsets.c b/arch/x86/kvm/kvm-asm-offsets.c
new file mode 100644
index 000000000000..9d84f2b32d7f
--- /dev/null
+++ b/arch/x86/kvm/kvm-asm-offsets.c
@@ -0,0 +1,18 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Generate definitions needed by assembly language modules.
+ * This code generates raw asm output which is post-processed to extract
+ * and format the required data.
+ */
+#define COMPILE_OFFSETS
+
+#include <linux/kbuild.h>
+#include "vmx/vmx.h"
+
+static void __used common(void)
+{
+	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
+		BLANK();
+		OFFSET(VMX_spec_ctrl, vcpu_vmx, spec_ctrl);
+	}
+}
diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index 8477d8bdd69c..0b5db4de4d09 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -1,12 +1,12 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include <linux/linkage.h>
 #include <asm/asm.h>
-#include <asm/asm-offsets.h>
 #include <asm/bitsperlong.h>
 #include <asm/kvm_vcpu_regs.h>
 #include <asm/nospec-branch.h>
 #include <asm/percpu.h>
 #include <asm/segment.h>
+#include "kvm-asm-offsets.h"
 #include "run_flags.h"
 
 #define WORD_SIZE (BITS_PER_LONG / 8)
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
  2022-11-08 15:15 ` [PATCH v2 1/8] KVM: x86: use a separate asm-offsets.c file Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-08 20:54   ` Sean Christopherson
  2022-11-08 15:15 ` [PATCH v2 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run Paolo Bonzini
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

Since registers are reachable through vcpu_svm, and we will
need to access more fields of that struct, pass it instead
of the regs[] array.

No functional change intended.

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/Makefile          |  3 +++
 arch/x86/kvm/kvm-asm-offsets.c |  6 ++++++
 arch/x86/kvm/svm/svm.c         |  2 +-
 arch/x86/kvm/svm/svm.h         |  2 +-
 arch/x86/kvm/svm/vmenter.S     | 37 +++++++++++++++++-----------------
 5 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index a02cf9baacc8..f453a0f96e24 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -35,6 +35,9 @@ obj-$(CONFIG_KVM)	+= kvm.o
 obj-$(CONFIG_KVM_INTEL)	+= kvm-intel.o
 obj-$(CONFIG_KVM_AMD)	+= kvm-amd.o
 
+AFLAGS_svm/vmenter.o    := -iquote $(obj)
+$(obj)/svm/vmenter.o: $(obj)/kvm-asm-offsets.h
+
 AFLAGS_vmx/vmenter.o    := -iquote $(obj)
 $(obj)/vmx/vmenter.o: $(obj)/kvm-asm-offsets.h
 
diff --git a/arch/x86/kvm/kvm-asm-offsets.c b/arch/x86/kvm/kvm-asm-offsets.c
index 9d84f2b32d7f..30db96852e2d 100644
--- a/arch/x86/kvm/kvm-asm-offsets.c
+++ b/arch/x86/kvm/kvm-asm-offsets.c
@@ -8,9 +8,15 @@
 
 #include <linux/kbuild.h>
 #include "vmx/vmx.h"
+#include "svm/svm.h"
 
 static void __used common(void)
 {
+	if (IS_ENABLED(CONFIG_KVM_AMD)) {
+		BLANK();
+		OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
+	}
+
 	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
 		BLANK();
 		OFFSET(VMX_spec_ctrl, vcpu_vmx, spec_ctrl);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 58f0077d9357..b412bc5773c5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3930,7 +3930,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 		 * vmcb02 when switching vmcbs for nested virtualization.
 		 */
 		vmload(svm->vmcb01.pa);
-		__svm_vcpu_run(vmcb_pa, (unsigned long *)&vcpu->arch.regs);
+		__svm_vcpu_run(vmcb_pa, svm);
 		vmsave(svm->vmcb01.pa);
 
 		vmload(__sme_page_pa(sd->save_area));
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6a7686bf6900..447e25c9101a 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -684,6 +684,6 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 /* vmenter.S */
 
 void __svm_sev_es_vcpu_run(unsigned long vmcb_pa);
-void __svm_vcpu_run(unsigned long vmcb_pa, unsigned long *regs);
+void __svm_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm);
 
 #endif
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 723f8534986c..f0ff41103e4c 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -4,27 +4,28 @@
 #include <asm/bitsperlong.h>
 #include <asm/kvm_vcpu_regs.h>
 #include <asm/nospec-branch.h>
+#include "kvm-asm-offsets.h"
 
 #define WORD_SIZE (BITS_PER_LONG / 8)
 
 /* Intentionally omit RAX as it's context switched by hardware */
-#define VCPU_RCX	__VCPU_REGS_RCX * WORD_SIZE
-#define VCPU_RDX	__VCPU_REGS_RDX * WORD_SIZE
-#define VCPU_RBX	__VCPU_REGS_RBX * WORD_SIZE
+#define VCPU_RCX	(SVM_vcpu_arch_regs + __VCPU_REGS_RCX * WORD_SIZE)
+#define VCPU_RDX	(SVM_vcpu_arch_regs + __VCPU_REGS_RDX * WORD_SIZE)
+#define VCPU_RBX	(SVM_vcpu_arch_regs + __VCPU_REGS_RBX * WORD_SIZE)
 /* Intentionally omit RSP as it's context switched by hardware */
-#define VCPU_RBP	__VCPU_REGS_RBP * WORD_SIZE
-#define VCPU_RSI	__VCPU_REGS_RSI * WORD_SIZE
-#define VCPU_RDI	__VCPU_REGS_RDI * WORD_SIZE
+#define VCPU_RBP	(SVM_vcpu_arch_regs + __VCPU_REGS_RBP * WORD_SIZE)
+#define VCPU_RSI	(SVM_vcpu_arch_regs + __VCPU_REGS_RSI * WORD_SIZE)
+#define VCPU_RDI	(SVM_vcpu_arch_regs + __VCPU_REGS_RDI * WORD_SIZE)
 
 #ifdef CONFIG_X86_64
-#define VCPU_R8		__VCPU_REGS_R8  * WORD_SIZE
-#define VCPU_R9		__VCPU_REGS_R9  * WORD_SIZE
-#define VCPU_R10	__VCPU_REGS_R10 * WORD_SIZE
-#define VCPU_R11	__VCPU_REGS_R11 * WORD_SIZE
-#define VCPU_R12	__VCPU_REGS_R12 * WORD_SIZE
-#define VCPU_R13	__VCPU_REGS_R13 * WORD_SIZE
-#define VCPU_R14	__VCPU_REGS_R14 * WORD_SIZE
-#define VCPU_R15	__VCPU_REGS_R15 * WORD_SIZE
+#define VCPU_R8		(SVM_vcpu_arch_regs + __VCPU_REGS_R8  * WORD_SIZE)
+#define VCPU_R9		(SVM_vcpu_arch_regs + __VCPU_REGS_R9  * WORD_SIZE)
+#define VCPU_R10	(SVM_vcpu_arch_regs + __VCPU_REGS_R10 * WORD_SIZE)
+#define VCPU_R11	(SVM_vcpu_arch_regs + __VCPU_REGS_R11 * WORD_SIZE)
+#define VCPU_R12	(SVM_vcpu_arch_regs + __VCPU_REGS_R12 * WORD_SIZE)
+#define VCPU_R13	(SVM_vcpu_arch_regs + __VCPU_REGS_R13 * WORD_SIZE)
+#define VCPU_R14	(SVM_vcpu_arch_regs + __VCPU_REGS_R14 * WORD_SIZE)
+#define VCPU_R15	(SVM_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
 #endif
 
 .section .noinstr.text, "ax"
@@ -32,7 +33,7 @@
 /**
  * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
  * @vmcb_pa:	unsigned long
- * @regs:	unsigned long * (to guest registers)
+ * @svm:	struct vcpu_svm *
  */
 SYM_FUNC_START(__svm_vcpu_run)
 	push %_ASM_BP
@@ -47,13 +48,13 @@ SYM_FUNC_START(__svm_vcpu_run)
 #endif
 	push %_ASM_BX
 
-	/* Save @regs. */
+	/* Save @svm. */
 	push %_ASM_ARG2
 
 	/* Save @vmcb. */
 	push %_ASM_ARG1
 
-	/* Move @regs to RAX. */
+	/* Move @svm to RAX. */
 	mov %_ASM_ARG2, %_ASM_AX
 
 	/* Load guest registers. */
@@ -89,7 +90,7 @@ SYM_FUNC_START(__svm_vcpu_run)
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
 #endif
 
-	/* "POP" @regs to RAX. */
+	/* "POP" @svm to RAX. */
 	pop %_ASM_AX
 
 	/* Save all guest registers.  */
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
  2022-11-08 15:15 ` [PATCH v2 1/8] KVM: x86: use a separate asm-offsets.c file Paolo Bonzini
  2022-11-08 15:15 ` [PATCH v2 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-08 20:55   ` Sean Christopherson
  2022-11-08 15:15 ` [PATCH v2 4/8] KVM: SVM: retrieve VMCB from assembly Paolo Bonzini
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

In preparation for moving vmload/vmsave to __svm_vcpu_run,
keep the pointer to the struct vcpu_svm in %rdi.  This way
it is possible to load svm->vmcb01.pa in %rax without
clobbering the pointer to svm itself.

No functional change intended.

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/vmenter.S | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index f0ff41103e4c..531510ab6072 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -54,29 +54,29 @@ SYM_FUNC_START(__svm_vcpu_run)
 	/* Save @vmcb. */
 	push %_ASM_ARG1
 
-	/* Move @svm to RAX. */
-	mov %_ASM_ARG2, %_ASM_AX
+	/* Move @svm to RDI. */
+	mov %_ASM_ARG2, %_ASM_DI
+
+	/* "POP" @vmcb to RAX. */
+	pop %_ASM_AX
 
 	/* Load guest registers. */
-	mov VCPU_RCX(%_ASM_AX), %_ASM_CX
-	mov VCPU_RDX(%_ASM_AX), %_ASM_DX
-	mov VCPU_RBX(%_ASM_AX), %_ASM_BX
-	mov VCPU_RBP(%_ASM_AX), %_ASM_BP
-	mov VCPU_RSI(%_ASM_AX), %_ASM_SI
-	mov VCPU_RDI(%_ASM_AX), %_ASM_DI
+	mov VCPU_RCX(%_ASM_DI), %_ASM_CX
+	mov VCPU_RDX(%_ASM_DI), %_ASM_DX
+	mov VCPU_RBX(%_ASM_DI), %_ASM_BX
+	mov VCPU_RBP(%_ASM_DI), %_ASM_BP
+	mov VCPU_RSI(%_ASM_DI), %_ASM_SI
 #ifdef CONFIG_X86_64
-	mov VCPU_R8 (%_ASM_AX),  %r8
-	mov VCPU_R9 (%_ASM_AX),  %r9
-	mov VCPU_R10(%_ASM_AX), %r10
-	mov VCPU_R11(%_ASM_AX), %r11
-	mov VCPU_R12(%_ASM_AX), %r12
-	mov VCPU_R13(%_ASM_AX), %r13
-	mov VCPU_R14(%_ASM_AX), %r14
-	mov VCPU_R15(%_ASM_AX), %r15
+	mov VCPU_R8 (%_ASM_DI),  %r8
+	mov VCPU_R9 (%_ASM_DI),  %r9
+	mov VCPU_R10(%_ASM_DI), %r10
+	mov VCPU_R11(%_ASM_DI), %r11
+	mov VCPU_R12(%_ASM_DI), %r12
+	mov VCPU_R13(%_ASM_DI), %r13
+	mov VCPU_R14(%_ASM_DI), %r14
+	mov VCPU_R15(%_ASM_DI), %r15
 #endif
-
-	/* "POP" @vmcb to RAX. */
-	pop %_ASM_AX
+	mov VCPU_RDI(%_ASM_DI), %_ASM_DI
 
 	/* Enter guest mode */
 	sti
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 4/8] KVM: SVM: retrieve VMCB from assembly
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (2 preceding siblings ...)
  2022-11-08 15:15 ` [PATCH v2 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-09  0:53   ` Sean Christopherson
  2022-11-08 15:15 ` [PATCH v2 5/8] KVM: SVM: move guest vmsave/vmload to assembly Paolo Bonzini
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

This is needed in order to keep the number of arguments to 3 or less,
after adding hsave_pa and spec_ctrl_intercepted.  32-bit builds only
support passing three arguments in registers, fortunately all other
data is reachable from the vcpu_svm struct.

It is not strictly necessary for __svm_sev_es_vcpu_run, but staying
consistent is a good idea since it makes __svm_sev_es_vcpu_run a
stripped version of _svm_vcpu_run.

No functional change intended.

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/kvm-asm-offsets.c |  2 ++
 arch/x86/kvm/svm/svm.c         |  5 ++---
 arch/x86/kvm/svm/svm.h         |  4 ++--
 arch/x86/kvm/svm/vmenter.S     | 20 ++++++++++----------
 4 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/kvm-asm-offsets.c b/arch/x86/kvm/kvm-asm-offsets.c
index 30db96852e2d..f1b694e431ae 100644
--- a/arch/x86/kvm/kvm-asm-offsets.c
+++ b/arch/x86/kvm/kvm-asm-offsets.c
@@ -15,6 +15,8 @@ static void __used common(void)
 	if (IS_ENABLED(CONFIG_KVM_AMD)) {
 		BLANK();
 		OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
+		OFFSET(SVM_current_vmcb, vcpu_svm, current_vmcb);
+		OFFSET(KVM_VMCB_pa, kvm_vmcb_info, pa);
 	}
 
 	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b412bc5773c5..0c86c435c51f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3914,12 +3914,11 @@ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
 static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
-	unsigned long vmcb_pa = svm->current_vmcb->pa;
 
 	guest_state_enter_irqoff();
 
 	if (sev_es_guest(vcpu->kvm)) {
-		__svm_sev_es_vcpu_run(vmcb_pa);
+		__svm_sev_es_vcpu_run(svm);
 	} else {
 		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
@@ -3930,7 +3929,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 		 * vmcb02 when switching vmcbs for nested virtualization.
 		 */
 		vmload(svm->vmcb01.pa);
-		__svm_vcpu_run(vmcb_pa, svm);
+		__svm_vcpu_run(svm);
 		vmsave(svm->vmcb01.pa);
 
 		vmload(__sme_page_pa(sd->save_area));
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 447e25c9101a..7ff1879e73c5 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -683,7 +683,7 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 
 /* vmenter.S */
 
-void __svm_sev_es_vcpu_run(unsigned long vmcb_pa);
-void __svm_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm);
+void __svm_sev_es_vcpu_run(struct vcpu_svm *svm);
+void __svm_vcpu_run(struct vcpu_svm *svm);
 
 #endif
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 531510ab6072..d07bac1952c5 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -32,7 +32,6 @@
 
 /**
  * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
- * @vmcb_pa:	unsigned long
  * @svm:	struct vcpu_svm *
  */
 SYM_FUNC_START(__svm_vcpu_run)
@@ -49,16 +48,16 @@ SYM_FUNC_START(__svm_vcpu_run)
 	push %_ASM_BX
 
 	/* Save @svm. */
-	push %_ASM_ARG2
-
-	/* Save @vmcb. */
 	push %_ASM_ARG1
 
+.ifnc _ASM_ARG1, _ASM_DI
 	/* Move @svm to RDI. */
-	mov %_ASM_ARG2, %_ASM_DI
+	mov %_ASM_ARG1, %_ASM_DI
+.endif
 
-	/* "POP" @vmcb to RAX. */
-	pop %_ASM_AX
+	/* Get svm->current_vmcb->pa into RAX. */
+	mov SVM_current_vmcb(%_ASM_DI), %_ASM_AX
+	mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX
 
 	/* Load guest registers. */
 	mov VCPU_RCX(%_ASM_DI), %_ASM_CX
@@ -170,7 +169,7 @@ SYM_FUNC_END(__svm_vcpu_run)
 
 /**
  * __svm_sev_es_vcpu_run - Run a SEV-ES vCPU via a transition to SVM guest mode
- * @vmcb_pa:	unsigned long
+ * @svm:	struct vcpu_svm *
  */
 SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	push %_ASM_BP
@@ -185,8 +184,9 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 #endif
 	push %_ASM_BX
 
-	/* Move @vmcb to RAX. */
-	mov %_ASM_ARG1, %_ASM_AX
+	/* Get svm->current_vmcb->pa into RAX. */
+	mov SVM_current_vmcb(%_ASM_ARG1), %_ASM_AX
+	mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX
 
 	/* Enter guest mode */
 	sti
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 5/8] KVM: SVM: move guest vmsave/vmload to assembly
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (3 preceding siblings ...)
  2022-11-08 15:15 ` [PATCH v2 4/8] KVM: SVM: retrieve VMCB from assembly Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-08 15:15 ` [PATCH v2 6/8] KVM: SVM: restore host save area from assembly Paolo Bonzini
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

FILL_RETURN_BUFFER can access percpu data, therefore vmload of the
host save area must be executed first.  First of all, move the
VMCB vmsave/vmload to assembly.

The idea on how to number the exception tables is stolen from
a prototype patch by Peter Zijlstra.

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Link: <https://lore.kernel.org/all/f571e404-e625-bae1-10e9-449b2eb4cbd8@citrix.com/>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/kvm-asm-offsets.c |  1 +
 arch/x86/kvm/svm/svm.c         |  9 -------
 arch/x86/kvm/svm/vmenter.S     | 49 ++++++++++++++++++++++++++--------
 3 files changed, 39 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kvm/kvm-asm-offsets.c b/arch/x86/kvm/kvm-asm-offsets.c
index f1b694e431ae..f83e88b85bf2 100644
--- a/arch/x86/kvm/kvm-asm-offsets.c
+++ b/arch/x86/kvm/kvm-asm-offsets.c
@@ -16,6 +16,7 @@ static void __used common(void)
 		BLANK();
 		OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
 		OFFSET(SVM_current_vmcb, vcpu_svm, current_vmcb);
+		OFFSET(SVM_vmcb01, vcpu_svm, vmcb01);
 		OFFSET(KVM_VMCB_pa, kvm_vmcb_info, pa);
 	}
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 0c86c435c51f..0ba4feb19cb6 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3922,16 +3922,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 	} else {
 		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
-		/*
-		 * Use a single vmcb (vmcb01 because it's always valid) for
-		 * context switching guest state via VMLOAD/VMSAVE, that way
-		 * the state doesn't need to be copied between vmcb01 and
-		 * vmcb02 when switching vmcbs for nested virtualization.
-		 */
-		vmload(svm->vmcb01.pa);
 		__svm_vcpu_run(svm);
-		vmsave(svm->vmcb01.pa);
-
 		vmload(__sme_page_pa(sd->save_area));
 	}
 
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index d07bac1952c5..5bc2ed7d79c0 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -28,6 +28,8 @@
 #define VCPU_R15	(SVM_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
 #endif
 
+#define SVM_vmcb01_pa	(SVM_vmcb01 + KVM_VMCB_pa)
+
 .section .noinstr.text, "ax"
 
 /**
@@ -55,6 +57,16 @@ SYM_FUNC_START(__svm_vcpu_run)
 	mov %_ASM_ARG1, %_ASM_DI
 .endif
 
+	/*
+	 * Use a single vmcb (vmcb01 because it's always valid) for
+	 * context switching guest state via VMLOAD/VMSAVE, that way
+	 * the state doesn't need to be copied between vmcb01 and
+	 * vmcb02 when switching vmcbs for nested virtualization.
+	 */
+	mov SVM_vmcb01_pa(%_ASM_DI), %_ASM_AX
+1:	vmload %_ASM_AX
+2:
+
 	/* Get svm->current_vmcb->pa into RAX. */
 	mov SVM_current_vmcb(%_ASM_DI), %_ASM_AX
 	mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX
@@ -80,16 +92,11 @@ SYM_FUNC_START(__svm_vcpu_run)
 	/* Enter guest mode */
 	sti
 
-1:	vmrun %_ASM_AX
-
-2:	cli
-
-#ifdef CONFIG_RETPOLINE
-	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
-	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
-#endif
+3:	vmrun %_ASM_AX
+4:
+	cli
 
-	/* "POP" @svm to RAX. */
+	/* Pop @svm to RAX while it's the only available register. */
 	pop %_ASM_AX
 
 	/* Save all guest registers.  */
@@ -110,6 +117,18 @@ SYM_FUNC_START(__svm_vcpu_run)
 	mov %r15, VCPU_R15(%_ASM_AX)
 #endif
 
+	/* @svm can stay in RDI from now on.  */
+	mov %_ASM_AX, %_ASM_DI
+
+	mov SVM_vmcb01_pa(%_ASM_DI), %_ASM_AX
+5:	vmsave %_ASM_AX
+6:
+
+#ifdef CONFIG_RETPOLINE
+	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
+	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
+#endif
+
 	/*
 	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
 	 * untrained as soon as we exit the VM and are back to the
@@ -159,11 +178,19 @@ SYM_FUNC_START(__svm_vcpu_run)
 	pop %_ASM_BP
 	RET
 
-3:	cmpb $0, kvm_rebooting
+10:	cmpb $0, kvm_rebooting
 	jne 2b
 	ud2
+30:	cmpb $0, kvm_rebooting
+	jne 4b
+	ud2
+50:	cmpb $0, kvm_rebooting
+	jne 6b
+	ud2
 
-	_ASM_EXTABLE(1b, 3b)
+	_ASM_EXTABLE(1b, 10b)
+	_ASM_EXTABLE(3b, 30b)
+	_ASM_EXTABLE(5b, 50b)
 
 SYM_FUNC_END(__svm_vcpu_run)
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 6/8] KVM: SVM: restore host save area from assembly
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (4 preceding siblings ...)
  2022-11-08 15:15 ` [PATCH v2 5/8] KVM: SVM: move guest vmsave/vmload to assembly Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-08 15:15 ` [PATCH v2 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

This is needed so that FILL_RETURN_BUFFER has access to the
percpu area via the GS segment base.

Cc: stable@vger.kernel.org
Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Analyzed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/svm.c     |  3 +--
 arch/x86/kvm/svm/svm.h     |  2 +-
 arch/x86/kvm/svm/svm_ops.h |  5 -----
 arch/x86/kvm/svm/vmenter.S | 13 +++++++++++++
 4 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 0ba4feb19cb6..e15f6ea9e5cc 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3922,8 +3922,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 	} else {
 		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
-		__svm_vcpu_run(svm);
-		vmload(__sme_page_pa(sd->save_area));
+		__svm_vcpu_run(svm, __sme_page_pa(sd->save_area));
 	}
 
 	guest_state_exit_irqoff();
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 7ff1879e73c5..932f26be5675 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -684,6 +684,6 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 /* vmenter.S */
 
 void __svm_sev_es_vcpu_run(struct vcpu_svm *svm);
-void __svm_vcpu_run(struct vcpu_svm *svm);
+void __svm_vcpu_run(struct vcpu_svm *svm, unsigned long hsave_pa);
 
 #endif
diff --git a/arch/x86/kvm/svm/svm_ops.h b/arch/x86/kvm/svm/svm_ops.h
index 9430d6437c9f..36c8af87a707 100644
--- a/arch/x86/kvm/svm/svm_ops.h
+++ b/arch/x86/kvm/svm/svm_ops.h
@@ -61,9 +61,4 @@ static __always_inline void vmsave(unsigned long pa)
 	svm_asm1(vmsave, "a" (pa), "memory");
 }
 
-static __always_inline void vmload(unsigned long pa)
-{
-	svm_asm1(vmload, "a" (pa), "memory");
-}
-
 #endif /* __KVM_X86_SVM_OPS_H */
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 5bc2ed7d79c0..0a4272faf80f 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -35,6 +35,7 @@
 /**
  * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
  * @svm:	struct vcpu_svm *
+ * @hsave_pa:	unsigned long
  */
 SYM_FUNC_START(__svm_vcpu_run)
 	push %_ASM_BP
@@ -49,6 +50,9 @@ SYM_FUNC_START(__svm_vcpu_run)
 #endif
 	push %_ASM_BX
 
+	/* @hsave_pa is needed last after vmexit, save it first.  */
+	push %_ASM_ARG2
+
 	/* Save @svm. */
 	push %_ASM_ARG1
 
@@ -124,6 +128,11 @@ SYM_FUNC_START(__svm_vcpu_run)
 5:	vmsave %_ASM_AX
 6:
 
+	/* Pop @hsave_pa and restore GSBASE, allowing access to percpu data.  */
+	pop %_ASM_AX
+7:	vmload %_ASM_AX
+8:
+
 #ifdef CONFIG_RETPOLINE
 	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
@@ -187,10 +196,14 @@ SYM_FUNC_START(__svm_vcpu_run)
 50:	cmpb $0, kvm_rebooting
 	jne 6b
 	ud2
+70:	cmpb $0, kvm_rebooting
+	jne 8b
+	ud2
 
 	_ASM_EXTABLE(1b, 10b)
 	_ASM_EXTABLE(3b, 30b)
 	_ASM_EXTABLE(5b, 50b)
+	_ASM_EXTABLE(7b, 70b)
 
 SYM_FUNC_END(__svm_vcpu_run)
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (5 preceding siblings ...)
  2022-11-08 15:15 ` [PATCH v2 6/8] KVM: SVM: restore host save area from assembly Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-09  1:14   ` Sean Christopherson
  2022-11-08 15:15 ` [PATCH v2 8/8] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers Paolo Bonzini
  2022-11-08 19:43 ` [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Nathan Chancellor
  8 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson,
	seanjc, stable

Restoration of the host IA32_SPEC_CTRL value is probably too late
with respect to the return thunk training sequence.

With respect to the user/kernel boundary, AMD says, "If software chooses
to toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
exit), software should set STIBP to 1 before executing the return thunk
training sequence." I assume the same requirements apply to the guest/host
boundary. The return thunk training sequence is in vmenter.S, quite close
to the VM-exit. On hosts without V_SPEC_CTRL, however, the host's
IA32_SPEC_CTRL value is not restored until much later.

To avoid this, move the restoration of host SPEC_CTRL to assembly and,
for consistency, move the restoration of the guest SPEC_CTRL as well.
This is not particularly difficult, apart from some care to cover both
32- and 64-bit, and to share code between SEV-ES and normal vmentry.

Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kernel/cpu/bugs.c     |  13 +----
 arch/x86/kvm/kvm-asm-offsets.c |   1 +
 arch/x86/kvm/svm/svm.c         |  38 +++++-------
 arch/x86/kvm/svm/svm.h         |   4 +-
 arch/x86/kvm/svm/vmenter.S     | 102 ++++++++++++++++++++++++++++++++-
 5 files changed, 121 insertions(+), 37 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index da7c361f47e0..6ec0b7ce7453 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -196,22 +196,15 @@ void __init check_bugs(void)
 }
 
 /*
- * NOTE: This function is *only* called for SVM.  VMX spec_ctrl handling is
- * done in vmenter.S.
+ * NOTE: This function is *only* called for SVM, since Intel uses
+ * MSR_IA32_SPEC_CTRL for SSBD.
  */
 void
 x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
 {
-	u64 msrval, guestval = guest_spec_ctrl, hostval = spec_ctrl_current();
+	u64 guestval, hostval;
 	struct thread_info *ti = current_thread_info();
 
-	if (static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) {
-		if (hostval != guestval) {
-			msrval = setguest ? guestval : hostval;
-			wrmsrl(MSR_IA32_SPEC_CTRL, msrval);
-		}
-	}
-
 	/*
 	 * If SSBD is not handled in MSR_SPEC_CTRL on AMD, update
 	 * MSR_AMD64_L2_CFG or MSR_VIRT_SPEC_CTRL if supported.
diff --git a/arch/x86/kvm/kvm-asm-offsets.c b/arch/x86/kvm/kvm-asm-offsets.c
index f83e88b85bf2..b2877c2c8df1 100644
--- a/arch/x86/kvm/kvm-asm-offsets.c
+++ b/arch/x86/kvm/kvm-asm-offsets.c
@@ -16,6 +16,7 @@ static void __used common(void)
 		BLANK();
 		OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
 		OFFSET(SVM_current_vmcb, vcpu_svm, current_vmcb);
+		OFFSET(SVM_spec_ctrl, vcpu_svm, spec_ctrl);
 		OFFSET(SVM_vmcb01, vcpu_svm, vmcb01);
 		OFFSET(KVM_VMCB_pa, kvm_vmcb_info, pa);
 	}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e15f6ea9e5cc..512bc06a4ba1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -730,6 +730,15 @@ static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
 	u32 offset;
 	u32 *msrpm;
 
+	/*
+	 * For non-nested case:
+	 * If the L01 MSR bitmap does not intercept the MSR, then we need to
+	 * save it.
+	 *
+	 * For nested case:
+	 * If the L02 MSR bitmap does not intercept the MSR, then we need to
+	 * save it.
+	 */
 	msrpm = is_guest_mode(vcpu) ? to_svm(vcpu)->nested.msrpm:
 				      to_svm(vcpu)->msrpm;
 
@@ -3911,18 +3920,19 @@ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
 	return EXIT_FASTPATH_NONE;
 }
 
-static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
+static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_intercepted)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
 	guest_state_enter_irqoff();
 
 	if (sev_es_guest(vcpu->kvm)) {
-		__svm_sev_es_vcpu_run(svm);
+		__svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted);
 	} else {
 		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
-		__svm_vcpu_run(svm, __sme_page_pa(sd->save_area));
+		__svm_vcpu_run(svm, __sme_page_pa(sd->save_area),
+			       spec_ctrl_intercepted);
 	}
 
 	guest_state_exit_irqoff();
@@ -3931,6 +3941,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+	bool spec_ctrl_intercepted = msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);
 
 	trace_kvm_entry(vcpu);
 
@@ -3989,26 +4000,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 	if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
 		x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
 
-	svm_vcpu_enter_exit(vcpu);
-
-	/*
-	 * We do not use IBRS in the kernel. If this vCPU has used the
-	 * SPEC_CTRL MSR it may have left it on; save the value and
-	 * turn it off. This is much more efficient than blindly adding
-	 * it to the atomic save/restore list. Especially as the former
-	 * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
-	 *
-	 * For non-nested case:
-	 * If the L01 MSR bitmap does not intercept the MSR, then we need to
-	 * save it.
-	 *
-	 * For nested case:
-	 * If the L02 MSR bitmap does not intercept the MSR, then we need to
-	 * save it.
-	 */
-	if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL) &&
-	    unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
-		svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
+	svm_vcpu_enter_exit(vcpu, spec_ctrl_intercepted);
 
 	if (!sev_es_guest(vcpu->kvm))
 		reload_tss(vcpu);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 932f26be5675..bf9ff39dc420 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -683,7 +683,7 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 
 /* vmenter.S */
 
-void __svm_sev_es_vcpu_run(struct vcpu_svm *svm);
-void __svm_vcpu_run(struct vcpu_svm *svm, unsigned long hsave_pa);
+void __svm_sev_es_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted);
+void __svm_vcpu_run(struct vcpu_svm *svm, unsigned long hsave_pa, bool spec_ctrl_intercepted);
 
 #endif
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 0a4272faf80f..a02eef724379 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -32,10 +32,70 @@
 
 .section .noinstr.text, "ax"
 
+.macro RESTORE_GUEST_SPEC_CTRL
+	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
+	ALTERNATIVE_2 "", \
+		"jmp 800f", X86_FEATURE_MSR_SPEC_CTRL, \
+		"", X86_FEATURE_V_SPEC_CTRL
+801:
+.endm
+
+.macro RESTORE_HOST_SPEC_CTRL
+	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
+	ALTERNATIVE_2 "", \
+		"jmp 900f", X86_FEATURE_MSR_SPEC_CTRL, \
+		"", X86_FEATURE_V_SPEC_CTRL
+901:
+.endm
+
+.macro RESTORE_SPEC_CTRL_BODY
+800:
+	/*
+	 * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
+	 * host's, write the MSR.  This is kept out-of-line so that the common
+	 * case does not have to jump.
+	 *
+	 * IMPORTANT: To avoid RSB underflow attacks and any other nastiness,
+	 * there must not be any returns or indirect branches between this code
+	 * and vmentry.
+	 */
+	movl SVM_spec_ctrl(%_ASM_DI), %eax
+	cmp PER_CPU_VAR(x86_spec_ctrl_current), %eax
+	je 801b
+	mov $MSR_IA32_SPEC_CTRL, %ecx
+	xor %edx, %edx
+	wrmsr
+	jmp 801b
+
+900:
+	/* Same for after vmexit.  */
+	mov $MSR_IA32_SPEC_CTRL, %ecx
+
+	/*
+	 * Load the value that the guest had written into MSR_IA32_SPEC_CTRL,
+	 * if it was not intercepted during guest execution.
+	 */
+	cmpb $0, (%_ASM_SP)
+	jnz 998f
+	rdmsr
+	movl %eax, SVM_spec_ctrl(%_ASM_DI)
+998:
+
+	/* Now restore the host value of the MSR if different from the guest's.  */
+	movl PER_CPU_VAR(x86_spec_ctrl_current), %eax
+	cmp SVM_spec_ctrl(%_ASM_DI), %eax
+	je 901b
+	xor %edx, %edx
+	wrmsr
+	jmp 901b
+.endm
+
+
 /**
  * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
  * @svm:	struct vcpu_svm *
  * @hsave_pa:	unsigned long
+ * @spec_ctrl_intercepted: bool
  */
 SYM_FUNC_START(__svm_vcpu_run)
 	push %_ASM_BP
@@ -50,7 +110,12 @@ SYM_FUNC_START(__svm_vcpu_run)
 #endif
 	push %_ASM_BX
 
-	/* @hsave_pa is needed last after vmexit, save it first.  */
+	/*
+	 * Both @spec_ctrl_intercepted and @hsave_pa are used only after vmexit.
+	 * @spec_ctrl_intercepted is needed later and accessed directly from
+	 * the stack in RESTORE_HOST_SPEC_CTRL, so save it first.
+	 */
+	push %_ASM_ARG3
 	push %_ASM_ARG2
 
 	/* Save @svm. */
@@ -61,6 +126,8 @@ SYM_FUNC_START(__svm_vcpu_run)
 	mov %_ASM_ARG1, %_ASM_DI
 .endif
 
+	RESTORE_GUEST_SPEC_CTRL
+
 	/*
 	 * Use a single vmcb (vmcb01 because it's always valid) for
 	 * context switching guest state via VMLOAD/VMSAVE, that way
@@ -138,6 +205,8 @@ SYM_FUNC_START(__svm_vcpu_run)
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
 #endif
 
+	RESTORE_HOST_SPEC_CTRL
+
 	/*
 	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
 	 * untrained as soon as we exit the VM and are back to the
@@ -173,6 +242,9 @@ SYM_FUNC_START(__svm_vcpu_run)
 	xor %r15d, %r15d
 #endif
 
+	/* "Pop" @spec_ctrl_intercepted.  */
+	pop %_ASM_BX
+
 	pop %_ASM_BX
 
 #ifdef CONFIG_X86_64
@@ -187,6 +259,8 @@ SYM_FUNC_START(__svm_vcpu_run)
 	pop %_ASM_BP
 	RET
 
+	RESTORE_SPEC_CTRL_BODY
+
 10:	cmpb $0, kvm_rebooting
 	jne 2b
 	ud2
@@ -210,6 +284,7 @@ SYM_FUNC_END(__svm_vcpu_run)
 /**
  * __svm_sev_es_vcpu_run - Run a SEV-ES vCPU via a transition to SVM guest mode
  * @svm:	struct vcpu_svm *
+ * @spec_ctrl_intercepted: bool
  */
 SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	push %_ASM_BP
@@ -224,8 +299,21 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 #endif
 	push %_ASM_BX
 
+	/* Save @spec_ctrl_intercepted for RESTORE_HOST_SPEC_CTRL. */
+	push %_ASM_ARG2
+
+	/* Save @svm. */
+	push %_ASM_ARG1
+
+.ifnc _ASM_ARG1, _ASM_DI
+	/* Move @svm to RDI for RESTORE_GUEST_SPEC_CTRL. */
+	mov %_ASM_ARG1, %_ASM_DI
+.endif
+
+	RESTORE_GUEST_SPEC_CTRL
+
 	/* Get svm->current_vmcb->pa into RAX. */
-	mov SVM_current_vmcb(%_ASM_ARG1), %_ASM_AX
+	mov SVM_current_vmcb(%_ASM_DI), %_ASM_AX
 	mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX
 
 	/* Enter guest mode */
@@ -235,11 +323,16 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 
 2:	cli
 
+	/* Pop @svm to RDI, guest registers have been saved already. */
+	pop %_ASM_DI
+
 #ifdef CONFIG_RETPOLINE
 	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
 #endif
 
+	RESTORE_HOST_SPEC_CTRL
+
 	/*
 	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
 	 * untrained as soon as we exit the VM and are back to the
@@ -249,6 +342,9 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	 */
 	UNTRAIN_RET
 
+	/* "Pop" @spec_ctrl_intercepted.  */
+	pop %_ASM_BX
+
 	pop %_ASM_BX
 
 #ifdef CONFIG_X86_64
@@ -263,6 +359,8 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	pop %_ASM_BP
 	RET
 
+	RESTORE_SPEC_CTRL_BODY
+
 3:	cmpb $0, kvm_rebooting
 	jne 2b
 	ud2
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 8/8] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (6 preceding siblings ...)
  2022-11-08 15:15 ` [PATCH v2 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
@ 2022-11-08 15:15 ` Paolo Bonzini
  2022-11-08 19:43 ` [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Nathan Chancellor
  8 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-08 15:15 UTC (permalink / raw)
  To: linux-kernel, kvm
  Cc: nathan, thomas.lendacky, andrew.cooper3, peterz, jmattson, seanjc

x86_virt_spec_ctrl only deals with the paravirtualized
MSR_IA32_VIRT_SPEC_CTRL now and does not handle MSR_IA32_SPEC_CTRL
anymore; remove the corresponding, unused argument.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/spec-ctrl.h | 10 +++++-----
 arch/x86/kernel/cpu/bugs.c       |  2 +-
 arch/x86/kvm/svm/svm.c           |  4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h
index 5393babc0598..cb0386fc4dc3 100644
--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -13,7 +13,7 @@
  * Takes the guest view of SPEC_CTRL MSR as a parameter and also
  * the guest's version of VIRT_SPEC_CTRL, if emulated.
  */
-extern void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool guest);
+extern void x86_virt_spec_ctrl(u64 guest_virt_spec_ctrl, bool guest);
 
 /**
  * x86_spec_ctrl_set_guest - Set speculation control registers for the guest
@@ -24,9 +24,9 @@ extern void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bo
  * Avoids writing to the MSR if the content/bits are the same
  */
 static inline
-void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
+void x86_spec_ctrl_set_guest(u64 guest_virt_spec_ctrl)
 {
-	x86_virt_spec_ctrl(guest_spec_ctrl, guest_virt_spec_ctrl, true);
+	x86_virt_spec_ctrl(guest_virt_spec_ctrl, true);
 }
 
 /**
@@ -38,9 +38,9 @@ void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
  * Avoids writing to the MSR if the content/bits are the same
  */
 static inline
-void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
+void x86_spec_ctrl_restore_host(u64 guest_virt_spec_ctrl)
 {
-	x86_virt_spec_ctrl(guest_spec_ctrl, guest_virt_spec_ctrl, false);
+	x86_virt_spec_ctrl(guest_virt_spec_ctrl, false);
 }
 
 /* AMD specific Speculative Store Bypass MSR data */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 6ec0b7ce7453..3e3230cccaa7 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -200,7 +200,7 @@ void __init check_bugs(void)
  * MSR_IA32_SPEC_CTRL for SSBD.
  */
 void
-x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
+x86_virt_spec_ctrl(u64 guest_virt_spec_ctrl, bool setguest)
 {
 	u64 guestval, hostval;
 	struct thread_info *ti = current_thread_info();
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 512bc06a4ba1..e6706fd88cfa 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3998,7 +3998,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 	 * being speculatively taken.
 	 */
 	if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
-		x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
+		x86_spec_ctrl_set_guest(svm->virt_spec_ctrl);
 
 	svm_vcpu_enter_exit(vcpu, spec_ctrl_intercepted);
 
@@ -4006,7 +4006,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 		reload_tss(vcpu);
 
 	if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
-		x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl);
+		x86_spec_ctrl_restore_host(svm->virt_spec_ctrl);
 
 	if (!sev_es_guest(vcpu->kvm)) {
 		vcpu->arch.cr2 = svm->vmcb->save.cr2;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 0/8] KVM: SVM: fixes for vmentry code
  2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
                   ` (7 preceding siblings ...)
  2022-11-08 15:15 ` [PATCH v2 8/8] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers Paolo Bonzini
@ 2022-11-08 19:43 ` Nathan Chancellor
  8 siblings, 0 replies; 17+ messages in thread
From: Nathan Chancellor @ 2022-11-08 19:43 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, thomas.lendacky, andrew.cooper3, peterz,
	jmattson, seanjc

On Tue, Nov 08, 2022 at 10:15:24AM -0500, Paolo Bonzini wrote:
> This series comprises two related fixes:
> 
> - the FILL_RETURN_BUFFER macro in -next needs to access percpu data,
>   hence the GS segment base needs to be loaded before FILL_RETURN_BUFFER.
>   This means moving guest vmload/vmsave and host vmload to assembly
>   (patches 5 and 6).
> 
> - because AMD wants the OS to set STIBP to 1 before executing the
>   return thunk (un)training sequence, IA32_SPEC_CTRL must be restored
>   before UNTRAIN_RET, too.  This must also be moved to assembly and,
>   for consistency, the guest SPEC_CTRL is also loaded in there
>   (patch 7).
> 
> Neither is particularly hard, however because of 32-bit systems one needs
> to keep the number of arguments to __svm_vcpu_run to three or fewer.
> One is taken for whether IA32_SPEC_CTRL is intercepted, and one for the
> host save area, so all accesses to the vcpu_svm struct have to be done
> from assembly too.  This is done in patches 2 to 4, and it turns out
> not to be that bad; in fact I think the code is simpler than before
> after these prerequisites, and even at the end of the series it is not
> much harder to follow despite doing a lot more stuff.  Care has been
> taken to keep the "normal" and SEV-ES code as similar as possible,
> even though the latter would not hit the three argument barrier.
> 
> The above summary leaves out the more mundane patches 1 and 8.  The
> former introduces a separate asm-offsets.c file for KVM, so that
> kernel/asm-offsets.c does not have to do ugly includes with ../ paths.
> The latter is dead code removal.
> 
> Thanks,
> 
> Paolo
> 
> v1->v2: use a separate asm-offsets.c file instead of hacking around
> 	the arch/x86/kvm/svm/svm.h file; this could have been done
> 	also with just a "#ifndef COMPILE_OFFSETS", but Sean's
> 	suggestion is cleaner and there is a precedent in
> 	drivers/memory/ for private asm-offsets files
> 
> 	keep preparatory cleanups together at the beginning of the
> 	series
> 
> 	move SPEC_CTRL save/restore out of line [Jim]
> 
> Paolo Bonzini (8):
>   KVM: x86: use a separate asm-offsets.c file
>   KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm
>   KVM: SVM: adjust register allocation for __svm_vcpu_run
>   KVM: SVM: retrieve VMCB from assembly
>   KVM: SVM: move guest vmsave/vmload to assembly
>   KVM: SVM: restore host save area from assembly
>   KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
>   x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and
>     callers
> 
>  arch/x86/include/asm/spec-ctrl.h |  10 +-
>  arch/x86/kernel/asm-offsets.c    |   6 -
>  arch/x86/kernel/cpu/bugs.c       |  15 +-
>  arch/x86/kvm/Makefile            |  12 ++
>  arch/x86/kvm/kvm-asm-offsets.c   |  28 ++++
>  arch/x86/kvm/svm/svm.c           |  53 +++----
>  arch/x86/kvm/svm/svm.h           |   4 +-
>  arch/x86/kvm/svm/svm_ops.h       |   5 -
>  arch/x86/kvm/svm/vmenter.S       | 241 ++++++++++++++++++++++++-------
>  arch/x86/kvm/vmx/vmenter.S       |   2 +-
>  10 files changed, 259 insertions(+), 117 deletions(-)
>  create mode 100644 arch/x86/kvm/kvm-asm-offsets.c
> 
> -- 
> 2.31.1
> 
> 

I applied this series on next-20221108, which has the call depth
tracking patches, and I no longer see the panic when starting a guest on
my AMD test system and I can still start a simple nested guest without
any problems, which is about the extent of my regular KVM testing.  I
did test the same kernel on my Intel systems and saw no problems there
but that seems expected given the diffstat. Thank you for the quick
response and fixes!

Tested-by: Nathan Chancellor <nathan@kernel.org>

One small nit I noticed: kvm-asm-offsets.h should be added to a
.gitignore file in arch/x86/kvm.

$ git status --short
?? arch/x86/kvm/kvm-asm-offsets.h

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm
  2022-11-08 15:15 ` [PATCH v2 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm Paolo Bonzini
@ 2022-11-08 20:54   ` Sean Christopherson
  2022-11-09 10:35     ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2022-11-08 20:54 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On Tue, Nov 08, 2022, Paolo Bonzini wrote:
> Since registers are reachable through vcpu_svm, and we will
> need to access more fields of that struct, pass it instead
> of the regs[] array.
> 
> No functional change intended.
> 
> Cc: stable@vger.kernel.org
> Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")

I don't think a Fixes: tag for a pure nop patch is fair to the original commit.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run
  2022-11-08 15:15 ` [PATCH v2 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run Paolo Bonzini
@ 2022-11-08 20:55   ` Sean Christopherson
  0 siblings, 0 replies; 17+ messages in thread
From: Sean Christopherson @ 2022-11-08 20:55 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On Tue, Nov 08, 2022, Paolo Bonzini wrote:
> In preparation for moving vmload/vmsave to __svm_vcpu_run,
> keep the pointer to the struct vcpu_svm in %rdi.  This way
> it is possible to load svm->vmcb01.pa in %rax without
> clobbering the pointer to svm itself.
> 
> No functional change intended.
> 
> Cc: stable@vger.kernel.org
> Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")

Same nit, Fixes: shouldn't be provided.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/8] KVM: SVM: retrieve VMCB from assembly
  2022-11-08 15:15 ` [PATCH v2 4/8] KVM: SVM: retrieve VMCB from assembly Paolo Bonzini
@ 2022-11-09  0:53   ` Sean Christopherson
  2022-11-09  9:09     ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2022-11-09  0:53 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

[-- Attachment #1: Type: text/plain, Size: 3005 bytes --]

On Tue, Nov 08, 2022, Paolo Bonzini wrote:
> This is needed in order to keep the number of arguments to 3 or less,
> after adding hsave_pa and spec_ctrl_intercepted.  32-bit builds only
> support passing three arguments in registers, fortunately all other
> data is reachable from the vcpu_svm struct.

Is it actually a problem if parameters are passed on the stack?  The assembly
code mostly creates a stack frame, i.e. %ebp can be used to pull values off the
stack.

I dont think it will matter in the end (more below), but hypothetically if we
ended up with

  void __svm_vcpu_run(struct kvm_vcpu *vcpu, unsigned long vmcb_pa,
		      unsigned long gsave_pa, unsigned long hsave_pa,
		      bool spec_ctrl_intercepted);

then the asm prologue would be something like:

	/*
	 * Save @vcpu, @gsave_pa, @hsave_pa, and @spec_ctrl_intercepted, all of
	 * which are needed after VM-Exit.
	 */
	push %_ASM_ARG1
	push %_ASM_ARG3

  #ifdef CONFIG_X86_64
	push %_ASM_ARG4
	push %_ASM_ARG5
  #else
	push %_ASM_ARG4_EBP(%ebp)
	push %_ASM_ARG5_EBP(%ebp)
  #endif

which is a few extra memory accesses, especially for 32-bit, but no one cares
about 32-bit and I highly doubt a few extra PUSH+POP instructions will be noticeable.

Threading in yesterday's conversation...

> > What about adding dedicated structs to hold the non-regs params for VM-Enter and
> > VMRUN?  Grabbing stuff willy-nilly in the assembly code makes the flows difficult
> > to read as there's nothing in the C code that describes what fields are actually
> > used.
>
> What fields are actually used is (like with any other function)
> "potentially all, you'll have to read the source code and in fact you
> can just read asm-offsets.c instead".  What I mean is, I cannot offhand
> see or remember what fields are touched by svm_prepare_switch_to_guest,
> why would __svm_vcpu_run be any different?

It's different because if it were a normal C function, it would simply take
@vcpu, and maybe @spec_ctrl_intercepted to shave cycles after CLGI.  But because
it's assembly and doesn't have to_svm() readily available (among other restrictions),
__svm_vcpu_run() ends up taking a mishmash of parameters, which for me makes it
rather difficult to understand what to expect.

Oooh, and after much staring I realized that the address of the host save area
is passed in because grabbing it after VM-Exit can't work.  That's subtle, and
passing it in isn't strictly necessary; there's no reason the assembly code can't
grab it and stash it on the stack.

What about killing a few birds with one stone?  Move the host save area PA to
its own per-CPU variable, and then grab that from assembly as well.  Then it's
a bit more obvious why the address needs to be saved on the stack across VMRUN,
and my whining about the prototype being funky goes away.  __svm_vcpu_run() and
__svm_sev_es_vcpu_run() would have identical prototypes too.

Attached patches would slot in early in the series.  Tested SVM and SME-enabled
kernels, didn't test the SEV-ES bits.

[-- Attachment #2: 0001-KVM-SVM-Add-a-helper-to-convert-a-SME-aware-PA-back-.patch --]
[-- Type: text/x-diff, Size: 3055 bytes --]

From 59a4b14ec509e30e614beaa20ceb920c181e3739 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Tue, 8 Nov 2022 15:25:41 -0800
Subject: [PATCH 1/2] KVM: SVM: Add a helper to convert a SME-aware PA back to
 a struct page

Add __sme_pa_to_page() to pair with __sme_page_pa() and use it to replace
open coded equivalents, including for "iopm_base", which previously
avoided having to do __sme_clr() by storing the raw PA in the global
variable.

Opportunistically convert __sme_page_pa() to a helper to provide type
safety.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c |  9 ++++-----
 arch/x86/kvm/svm/svm.h | 16 +++++++++++++++-
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 58f0077d9357..bb7427fd1242 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1065,8 +1065,7 @@ static void svm_hardware_unsetup(void)
 	for_each_possible_cpu(cpu)
 		svm_cpu_uninit(cpu);
 
-	__free_pages(pfn_to_page(iopm_base >> PAGE_SHIFT),
-	get_order(IOPM_SIZE));
+	__free_pages(__sme_pa_to_page(iopm_base), get_order(IOPM_SIZE));
 	iopm_base = 0;
 }
 
@@ -1243,7 +1242,7 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 	if (!kvm_hlt_in_guest(vcpu->kvm))
 		svm_set_intercept(svm, INTERCEPT_HLT);
 
-	control->iopm_base_pa = __sme_set(iopm_base);
+	control->iopm_base_pa = iopm_base;
 	control->msrpm_base_pa = __sme_set(__pa(svm->msrpm));
 	control->int_ctl = V_INTR_MASKING_MASK;
 
@@ -1443,7 +1442,7 @@ static void svm_vcpu_free(struct kvm_vcpu *vcpu)
 
 	sev_free_vcpu(vcpu);
 
-	__free_page(pfn_to_page(__sme_clr(svm->vmcb01.pa) >> PAGE_SHIFT));
+	__free_page(__sme_pa_to_page(svm->vmcb01.pa));
 	__free_pages(virt_to_page(svm->msrpm), get_order(MSRPM_SIZE));
 }
 
@@ -4970,7 +4969,7 @@ static __init int svm_hardware_setup(void)
 
 	iopm_va = page_address(iopm_pages);
 	memset(iopm_va, 0xff, PAGE_SIZE * (1 << order));
-	iopm_base = page_to_pfn(iopm_pages) << PAGE_SHIFT;
+	iopm_base = __sme_page_pa(iopm_pages);
 
 	init_msrpm_offsets();
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6a7686bf6900..9a2567803006 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -24,7 +24,21 @@
 
 #include "kvm_cache_regs.h"
 
-#define __sme_page_pa(x) __sme_set(page_to_pfn(x) << PAGE_SHIFT)
+/*
+ * Helpers to convert to/from physical addresses for pages whose address is
+ * consumed directly by hardware.  Even though it's a physical address, SVM
+ * often restricts the address to the natural width, hence 'unsigned long'
+ * instead of 'hpa_t'.
+ */
+static inline unsigned long __sme_page_pa(struct page *page)
+{
+	return __sme_set(page_to_pfn(page) << PAGE_SHIFT);
+}
+
+static inline struct page *__sme_pa_to_page(unsigned long pa)
+{
+	return pfn_to_page(__sme_clr(pa) >> PAGE_SHIFT);
+}
 
 #define	IOPM_SIZE PAGE_SIZE * 3
 #define	MSRPM_SIZE PAGE_SIZE * 2

base-commit: 88cd4a037496682f164e7ae8dac13cd4ec8edc2b
-- 
2.38.1.431.g37b22c650d-goog


[-- Attachment #3: 0002-KVM-SVM-Snapshot-host-save-area-PA-in-dedicated-per-.patch --]
[-- Type: text/x-diff, Size: 4804 bytes --]

From e9a4f9af27d7d384dc36ad66a9856f9decb8ce02 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Tue, 8 Nov 2022 15:32:58 -0800
Subject: [PATCH 2/2] KVM: SVM: Snapshot host save area PA in dedicated per-CPU
 variable

Track the SME-aware physical address of the host save area outside of
"struct svm_cpu_data" so that a future patch can easily reference the
address from assembly code.

The "overhead" of always allocating the per-CPU data is more or less
meaningless in the grand scheme.  And when KVM AMD is built as a module,
it's actually more costly (by a single pointer) to dynamically allocate
the struct, as the per-CPU data is allocated only when the module is
loaded.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c | 30 ++++++++++++++++--------------
 arch/x86/kvm/svm/svm.h |  2 +-
 2 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index bb7427fd1242..8fc02bef4f85 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -246,6 +246,7 @@ struct kvm_ldttss_desc {
 } __attribute__((packed));
 
 DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
+DEFINE_PER_CPU(unsigned long, svm_host_save_area_pa);
 
 /*
  * Only MSR_TSC_AUX is switched via the user return hook.  EFER is switched via
@@ -597,7 +598,7 @@ static int svm_hardware_enable(void)
 
 	wrmsrl(MSR_EFER, efer | EFER_SVME);
 
-	wrmsrl(MSR_VM_HSAVE_PA, __sme_page_pa(sd->save_area));
+	wrmsrl(MSR_VM_HSAVE_PA, per_cpu(svm_host_save_area_pa, me));
 
 	if (static_cpu_has(X86_FEATURE_TSCRATEMSR)) {
 		/*
@@ -653,12 +654,13 @@ static void svm_cpu_uninit(int cpu)
 
 	per_cpu(svm_data, cpu) = NULL;
 	kfree(sd->sev_vmcbs);
-	__free_page(sd->save_area);
+	__free_page(__sme_pa_to_page(per_cpu(svm_host_save_area_pa, cpu)));
 	kfree(sd);
 }
 
 static int svm_cpu_init(int cpu)
 {
+	struct page *host_save_area;
 	struct svm_cpu_data *sd;
 	int ret = -ENOMEM;
 
@@ -666,20 +668,20 @@ static int svm_cpu_init(int cpu)
 	if (!sd)
 		return ret;
 	sd->cpu = cpu;
-	sd->save_area = alloc_page(GFP_KERNEL | __GFP_ZERO);
-	if (!sd->save_area)
-		goto free_cpu_data;
 
 	ret = sev_cpu_init(sd);
 	if (ret)
-		goto free_save_area;
+		goto free_cpu_data;
+
+	host_save_area = alloc_page(GFP_KERNEL | __GFP_ZERO);
+	if (!host_save_area)
+		goto free_cpu_data;
 
 	per_cpu(svm_data, cpu) = sd;
+	per_cpu(svm_host_save_area_pa, cpu) = __sme_page_pa(host_save_area);
 
 	return 0;
 
-free_save_area:
-	__free_page(sd->save_area);
 free_cpu_data:
 	kfree(sd);
 	return ret;
@@ -1449,7 +1451,6 @@ static void svm_vcpu_free(struct kvm_vcpu *vcpu)
 static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
-	struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
 	if (sev_es_guest(vcpu->kvm))
 		sev_es_unmap_ghcb(svm);
@@ -1461,10 +1462,13 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 	 * Save additional host state that will be restored on VMEXIT (sev-es)
 	 * or subsequent vmload of host save area.
 	 */
-	vmsave(__sme_page_pa(sd->save_area));
+	vmsave(per_cpu(svm_host_save_area_pa, vcpu->cpu));
 	if (sev_es_guest(vcpu->kvm)) {
 		struct sev_es_save_area *hostsa;
-		hostsa = (struct sev_es_save_area *)(page_address(sd->save_area) + 0x400);
+		struct page *hostsa_page;
+
+		hostsa_page = __sme_pa_to_page(per_cpu(svm_host_save_area_pa, vcpu->cpu));
+		hostsa = (struct sev_es_save_area *)(page_address(hostsa_page) + 0x400);
 
 		sev_es_prepare_switch_to_guest(hostsa);
 	}
@@ -3920,8 +3924,6 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 	if (sev_es_guest(vcpu->kvm)) {
 		__svm_sev_es_vcpu_run(vmcb_pa);
 	} else {
-		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
-
 		/*
 		 * Use a single vmcb (vmcb01 because it's always valid) for
 		 * context switching guest state via VMLOAD/VMSAVE, that way
@@ -3932,7 +3934,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 		__svm_vcpu_run(vmcb_pa, (unsigned long *)&vcpu->arch.regs);
 		vmsave(svm->vmcb01.pa);
 
-		vmload(__sme_page_pa(sd->save_area));
+		vmload(per_cpu(svm_host_save_area_pa, vcpu->cpu));
 	}
 
 	guest_state_exit_irqoff();
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 9a2567803006..d9ec8ea69a56 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -303,7 +303,6 @@ struct svm_cpu_data {
 	u32 min_asid;
 	struct kvm_ldttss_desc *tss_desc;
 
-	struct page *save_area;
 	struct vmcb *current_vmcb;
 
 	/* index = sev_asid, value = vmcb pointer */
@@ -311,6 +310,7 @@ struct svm_cpu_data {
 };
 
 DECLARE_PER_CPU(struct svm_cpu_data *, svm_data);
+DECLARE_PER_CPU(unsigned long, svm_host_save_area_pa);
 
 void recalc_intercepts(struct vcpu_svm *svm);
 
-- 
2.38.1.431.g37b22c650d-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  2022-11-08 15:15 ` [PATCH v2 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
@ 2022-11-09  1:14   ` Sean Christopherson
  2022-11-09  9:29     ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2022-11-09  1:14 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On Tue, Nov 08, 2022, Paolo Bonzini wrote:
> diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
> index 0a4272faf80f..a02eef724379 100644
> --- a/arch/x86/kvm/svm/vmenter.S
> +++ b/arch/x86/kvm/svm/vmenter.S
> @@ -32,10 +32,70 @@
>  
>  .section .noinstr.text, "ax"
>  
> +.macro RESTORE_GUEST_SPEC_CTRL
> +	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
> +	ALTERNATIVE_2 "", \
> +		"jmp 800f", X86_FEATURE_MSR_SPEC_CTRL, \
> +		"", X86_FEATURE_V_SPEC_CTRL
> +801:
> +.endm
> +
> +.macro RESTORE_HOST_SPEC_CTRL
> +	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
> +	ALTERNATIVE_2 "", \
> +		"jmp 900f", X86_FEATURE_MSR_SPEC_CTRL, \
> +		"", X86_FEATURE_V_SPEC_CTRL
> +901:
> +.endm
> +
> +.macro RESTORE_SPEC_CTRL_BODY

Can we split these into separate macros?  It's a bit more typing, but it's not
immediately obvious that these are two independent chunks (I was expecting a JMP
from the 800 section into the 900 section).

E.g. RESTORE_GUEST_SPEC_CTRL_BODY and RESTORE_HOST_SPEC_CTRL_BODY

> +800:

Ugh, the multiple users makes it somewhat ugly, but rather than arbitrary numbers,
what about using named labels to make it easier to understand the branches?

E.g.

---
 arch/x86/kvm/svm/vmenter.S | 43 +++++++++++++++++++++-----------------
 1 file changed, 24 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index a02eef724379..23fd7353f0d0 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -32,24 +32,24 @@
 
 .section .noinstr.text, "ax"
 
-.macro RESTORE_GUEST_SPEC_CTRL
+.macro RESTORE_GUEST_SPEC_CTRL name
 	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
 	ALTERNATIVE_2 "", \
-		"jmp 800f", X86_FEATURE_MSR_SPEC_CTRL, \
+		"jmp .Lrestore_guest_spec_ctrl\name", X86_FEATURE_MSR_SPEC_CTRL, \
 		"", X86_FEATURE_V_SPEC_CTRL
-801:
+.Lpost_restore_guest_spec_ctrl\name:
 .endm
 
-.macro RESTORE_HOST_SPEC_CTRL
+.macro RESTORE_HOST_SPEC_CTRL name
 	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
 	ALTERNATIVE_2 "", \
-		"jmp 900f", X86_FEATURE_MSR_SPEC_CTRL, \
+		"jmp .Lrestore_host_spec_ctrl\name", X86_FEATURE_MSR_SPEC_CTRL, \
 		"", X86_FEATURE_V_SPEC_CTRL
-901:
+.Lpost_restore_host_spec_ctrl\name:
 .endm
 
-.macro RESTORE_SPEC_CTRL_BODY
-800:
+.macro RESTORE_GUEST_SPEC_CTRL_BODY name
+.Lrestore_guest_spec_ctrl\name:
 	/*
 	 * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
 	 * host's, write the MSR.  This is kept out-of-line so that the common
@@ -61,13 +61,16 @@
 	 */
 	movl SVM_spec_ctrl(%_ASM_DI), %eax
 	cmp PER_CPU_VAR(x86_spec_ctrl_current), %eax
-	je 801b
+	je .Lpost_restore_guest_spec_ctrl\name
+
 	mov $MSR_IA32_SPEC_CTRL, %ecx
 	xor %edx, %edx
 	wrmsr
-	jmp 801b
+	jmp .Lpost_restore_guest_spec_ctrl\name
+.endm
 
-900:
+.macro RESTORE_HOST_SPEC_CTRL_BODY name
+.Lrestore_host_spec_ctrl\name:
 	/* Same for after vmexit.  */
 	mov $MSR_IA32_SPEC_CTRL, %ecx
 
@@ -76,18 +79,18 @@
 	 * if it was not intercepted during guest execution.
 	 */
 	cmpb $0, (%_ASM_SP)
-	jnz 998f
+	jnz .Lskip_save_guest_spec_ctrl\name
 	rdmsr
 	movl %eax, SVM_spec_ctrl(%_ASM_DI)
-998:
 
+.Lskip_save_guest_spec_ctrl\name:
 	/* Now restore the host value of the MSR if different from the guest's.  */
 	movl PER_CPU_VAR(x86_spec_ctrl_current), %eax
 	cmp SVM_spec_ctrl(%_ASM_DI), %eax
-	je 901b
+	je .Lpost_restore_host_spec_ctrl\name
 	xor %edx, %edx
 	wrmsr
-	jmp 901b
+	jmp .Lpost_restore_host_spec_ctrl\name
 .endm
 
 
@@ -259,7 +262,8 @@ SYM_FUNC_START(__svm_vcpu_run)
 	pop %_ASM_BP
 	RET
 
-	RESTORE_SPEC_CTRL_BODY
+	RESTORE_GUEST_SPEC_CTRL_BODY
+	RESTORE_HOST_SPEC_CTRL_BODY
 
 10:	cmpb $0, kvm_rebooting
 	jne 2b
@@ -310,7 +314,7 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	mov %_ASM_ARG1, %_ASM_DI
 .endif
 
-	RESTORE_GUEST_SPEC_CTRL
+	RESTORE_GUEST_SPEC_CTRL sev_es
 
 	/* Get svm->current_vmcb->pa into RAX. */
 	mov SVM_current_vmcb(%_ASM_DI), %_ASM_AX
@@ -331,7 +335,7 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
 #endif
 
-	RESTORE_HOST_SPEC_CTRL
+	RESTORE_HOST_SPEC_CTRL sev_es
 
 	/*
 	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
@@ -359,7 +363,8 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	pop %_ASM_BP
 	RET
 
-	RESTORE_SPEC_CTRL_BODY
+	RESTORE_GUEST_SPEC_CTRL_BODY sev_es
+	RESTORE_HOST_SPEC_CTRL_BODY sev_es
 
 3:	cmpb $0, kvm_rebooting
 	jne 2b

base-commit: 0b242ada175d97a556866c48c80310860f634579
-- 

> @@ -61,6 +126,8 @@ SYM_FUNC_START(__svm_vcpu_run)
>  	mov %_ASM_ARG1, %_ASM_DI
>  .endif
>

Can you add a comment to all of these to call out that RAX, RCX, and RDX might
get clobbered?  It's easy to overlook that detail, and it matters on 32-bit where
the arguments use RCX and RDX.  And because the clobber is conditional, it wouldn't
be that hard for a bug to escape initial testing.

	/* This might clobber RAX, RCX, and RDX! */

> +	RESTORE_GUEST_SPEC_CTRL

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/8] KVM: SVM: retrieve VMCB from assembly
  2022-11-09  0:53   ` Sean Christopherson
@ 2022-11-09  9:09     ` Paolo Bonzini
  0 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-09  9:09 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On 11/9/22 01:53, Sean Christopherson wrote:
> On Tue, Nov 08, 2022, Paolo Bonzini wrote:
>> This is needed in order to keep the number of arguments to 3 or less,
>> after adding hsave_pa and spec_ctrl_intercepted.  32-bit builds only
>> support passing three arguments in registers, fortunately all other
>> data is reachable from the vcpu_svm struct.
> 
> Is it actually a problem if parameters are passed on the stack?  The assembly
> code mostly creates a stack frame, i.e. %ebp can be used to pull values off the
> stack.

It's not, but given how little love 32-bit KVM receives, I prefer to 
stick to the subset of the ABI that is "equivalent" to 64-bit.

> no one cares about 32-bit and I highly doubt a few extra PUSH+POP
> instructions will  be noticeable.

Same reasoning (no one cares about 32-bits), different conclusions...

>> What fields are actually used is (like with any other function)
>> "potentially all, you'll have to read the source code and in fact you
>> can just read asm-offsets.c instead".  What I mean is, I cannot offhand
>> see or remember what fields are touched by svm_prepare_switch_to_guest,
>> why would __svm_vcpu_run be any different?
> 
> It's different because if it were a normal C function, it would simply take
> @vcpu, and maybe @spec_ctrl_intercepted to shave cycles after CLGI.

Not just for that, but especially to avoid making 
msr_write_intercepted() noinstr.

> But because
> it's assembly and doesn't have to_svm() readily available (among other restrictions),
> __svm_vcpu_run() ends up taking a mishmash of parameters, which for me makes it
> rather difficult to understand what to expect.

Yeah, there could be three reasons to have parameters in assembly:

* you just need them (@svm)

* it's too much of a pain to compute it in assembly 
(@spec_ctrl_intercepted, @hsave_pa)

* it needs to be computed outside the clgi/stgi region (not happening 
here, only mentioned for completeness)

As this patch shows, @vmcb is not much of a pain to compute in assembly: 
it is just two instructions, and not passing it in simplifies register 
allocation (the weird push/pop goes away) because all the arguments 
except @svm/_ASM_ARG1 are needed only after vmexit.

> Oooh, and after much staring I realized that the address of the host save area
> is passed in because grabbing it after VM-Exit can't work.  That's subtle, and
> passing it in isn't strictly necessary; there's no reason the assembly code can't
> grab it and stash it on the stack.

Right, in fact that's not the reason why it's passed in---it's just to 
avoid coding page_to_pfn() in assembly, and to limit the differences 
between the regular and SEV-ES cases.  But using a per-CPU variable is 
fine (either in addition to the struct page, which "wastes" 8 bytes per 
CPU, or as a replacement).

> What about killing a few birds with one stone?  Move the host save area PA to
> its own per-CPU variable, and then grab that from assembly as well.

I would still place it in struct svm_cpu_data itself, I'll see how it 
looks and possibly post v3.

Paolo


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  2022-11-09  1:14   ` Sean Christopherson
@ 2022-11-09  9:29     ` Paolo Bonzini
  0 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-09  9:29 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On 11/9/22 02:14, Sean Christopherson wrote:
>>
>> +.macro RESTORE_SPEC_CTRL_BODY
> 
> Can we split these into separate macros?  It's a bit more typing, but it's not
> immediately obvious that these are two independent chunks (I was expecting a JMP
> from the 800 section into the 900 section).
> 
> E.g. RESTORE_GUEST_SPEC_CTRL_BODY and RESTORE_HOST_SPEC_CTRL_BODY

Sure, I had it like that in an earlier version.  I didn't see much 
benefit but it is indeed a bit more readable if you order the macros like

.macro RESTORE_GUEST_SPEC_CTRL
.macro RESTORE_GUEST_SPEC_CTRL_BODY
.macro RESTORE_HOST_SPEC_CTRL
.macro RESTORE_HOST_SPEC_CTRL_BODY

>> +800:
>
> Ugh, the multiple users makes it somewhat ugly, but rather than arbitrary numbers,
> what about using named labels to make it easier to understand the branches?

I think it's okay if we separate the macros.

Paolo


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm
  2022-11-08 20:54   ` Sean Christopherson
@ 2022-11-09 10:35     ` Paolo Bonzini
  0 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2022-11-09 10:35 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: linux-kernel, kvm, nathan, thomas.lendacky, andrew.cooper3,
	peterz, jmattson, stable

On 11/8/22 21:54, Sean Christopherson wrote:
> On Tue, Nov 08, 2022, Paolo Bonzini wrote:
>> Since registers are reachable through vcpu_svm, and we will
>> need to access more fields of that struct, pass it instead
>> of the regs[] array.
>>
>> No functional change intended.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: f14eec0a3203 ("KVM: SVM: move more vmentry code to assembly")
> 
> I don't think a Fixes: tag for a pure nop patch is fair to the original commit.

That's for the sake of correct tracking in stable@, still it's not the 
right commit to point at:

- f14eec0a3203 did not move the RSB stuffing before the GSBASE restore, 
the code before was:

	vmexit_fill_RSB();
  #ifdef CONFIG_X86_64
  	wrmsrl(MSR_GS_BASE, svm->host.gs_base);
  #else

- anyway it wasn't really buggy at the time: it's only in -next that 
FILL_RETURN_BUFFER uses percpu data, because alternatives take care of 
the X86_FEATURE_* checks

The real reason to do all this is to access the percpu host spec_ctrl, 
which in turn is needed for retbleed.  I'll point the Fixes tag to 
a149180fbcf3 ("x86: Add magic AMD return-thunk") instead, again just for 
the sake of tracking releases that need the change to full fix retbleed.

Paolo


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-11-09 10:36 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-08 15:15 [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Paolo Bonzini
2022-11-08 15:15 ` [PATCH v2 1/8] KVM: x86: use a separate asm-offsets.c file Paolo Bonzini
2022-11-08 15:15 ` [PATCH v2 2/8] KVM: SVM: replace regs argument of __svm_vcpu_run with vcpu_svm Paolo Bonzini
2022-11-08 20:54   ` Sean Christopherson
2022-11-09 10:35     ` Paolo Bonzini
2022-11-08 15:15 ` [PATCH v2 3/8] KVM: SVM: adjust register allocation for __svm_vcpu_run Paolo Bonzini
2022-11-08 20:55   ` Sean Christopherson
2022-11-08 15:15 ` [PATCH v2 4/8] KVM: SVM: retrieve VMCB from assembly Paolo Bonzini
2022-11-09  0:53   ` Sean Christopherson
2022-11-09  9:09     ` Paolo Bonzini
2022-11-08 15:15 ` [PATCH v2 5/8] KVM: SVM: move guest vmsave/vmload to assembly Paolo Bonzini
2022-11-08 15:15 ` [PATCH v2 6/8] KVM: SVM: restore host save area from assembly Paolo Bonzini
2022-11-08 15:15 ` [PATCH v2 7/8] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
2022-11-09  1:14   ` Sean Christopherson
2022-11-09  9:29     ` Paolo Bonzini
2022-11-08 15:15 ` [PATCH v2 8/8] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers Paolo Bonzini
2022-11-08 19:43 ` [PATCH v2 0/8] KVM: SVM: fixes for vmentry code Nathan Chancellor

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).