kvm.vger.kernel.org archive mirror
* [RFC PATCH 0/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
@ 2022-10-28 23:07 Paolo Bonzini
  2022-10-28 23:07 ` [PATCH 1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run Paolo Bonzini
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Paolo Bonzini @ 2022-10-28 23:07 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, seanjc, jpoimboe

On the Intel side, restoration of the guest's IA32_SPEC_CTRL is done
as late as possible before vmentry, with the comment:

* IMPORTANT: To avoid RSB underflow attacks and any other nastiness,
* there must not be any returns or indirect branches between this code
* and vmentry.

On AMD, there is no need to avoid returns or indirect branches between
wrmsr and vmrun because Linux doesn't use IBRS; however, restoration
of the host IA32_SPEC_CTRL value is definitely way too late. With
respect to the user/kernel boundary, AMD says, "If software chooses to
toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
exit), software should set STIBP to 1 before executing the return thunk
training sequence." Assuming the same requirements apply to the guest/host
boundary, KVM does not respect this recommendation: the return thunk
training sequence is in vmenter.S, quite close to the VM-exit, while
the host's IA32_SPEC_CTRL value is only restored much later for hosts
without V_SPEC_CTRL.
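
For reference, the VM-exit ordering that this series aims for is roughly
the following (a sketch of the intent, not the literal instruction
sequence):

  1. VMEXIT
  2. save the guest's SPEC_CTRL with RDMSR, if the MSR is not intercepted
  3. restore the host's SPEC_CTRL with WRMSR, if it differs from the guest's
  4. run the return thunk training sequence (UNTRAIN_RET)
  5. only then allow returns and indirect branches again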

In the absence of clarification from AMD, move all the SPEC_CTRL
handling to assembly code and, in passing, make the Intel and AMD
code a bit more similar to each other.

Patches 1-2 are the Intel side, which is just a cleanup.

Patch 3 prepares arch/x86/kvm/svm/svm.h for the addition of asm-offsets.c
entries, and patches 4-5 are a cleanup similar to the earlier VMX ones.

Patch 6 is the bulk of the change, and finally patch 7 removes now-dead
code in asm/spec-ctrl.h and arch/x86/kernel/.

This is an RFC because I haven't tested SEV-ES or 32-bit yet.

Paolo

Paolo Bonzini (7):
  KVM: VMX: remove regs argument of __vmx_vcpu_run
  KVM: VMX: more cleanups to __vmx_vcpu_run
  KVM: SVM: extract VMCB accessors to a new file
  KVM: SVM: replace argument of __svm_vcpu_run with vcpu_svm
  KVM: SVM: adjust register allocation for __svm_vcpu_run
  KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and
    callers

 arch/x86/include/asm/spec-ctrl.h |  10 +-
 arch/x86/kernel/asm-offsets.c    |   8 ++
 arch/x86/kernel/cpu/bugs.c       |  15 +--
 arch/x86/kvm/svm/avic.c          |   1 +
 arch/x86/kvm/svm/nested.c        |   1 +
 arch/x86/kvm/svm/sev.c           |   1 +
 arch/x86/kvm/svm/svm.c           |  39 +++---
 arch/x86/kvm/svm/svm.h           | 204 +-----------------------------
 arch/x86/kvm/svm/svm_onhyperv.c  |   1 +
 arch/x86/kvm/svm/vmcb.h          | 211 +++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/vmenter.S       | 164 ++++++++++++++++++------
 arch/x86/kvm/vmx/nested.c        |   3 +-
 arch/x86/kvm/vmx/vmenter.S       |  92 ++++++--------
 arch/x86/kvm/vmx/vmx.c           |   3 +-
 arch/x86/kvm/vmx/vmx.h           |   3 +-
 15 files changed, 419 insertions(+), 337 deletions(-)
 create mode 100644 arch/x86/kvm/svm/vmcb.h

-- 
2.31.1



* [PATCH 1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run
  2022-10-28 23:07 [RFC PATCH 0/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
@ 2022-10-28 23:07 ` Paolo Bonzini
  2022-10-31 17:37   ` Sean Christopherson
  2022-10-28 23:07 ` [PATCH 2/7] KVM: VMX: more cleanups to __vmx_vcpu_run Paolo Bonzini
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2022-10-28 23:07 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, seanjc, jpoimboe

Registers are reachable through vcpu_vmx, no need to pass
a separate pointer to the regs[] array.
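
In C terms, the assembly now reaches the registers through the single
@vmx argument, conceptually like this (a sketch; the generated assembly
uses the new VMX_vcpu_arch_regs asm-offsets constant instead):

	struct vcpu_vmx *vmx = to_vmx(vcpu);
	unsigned long guest_rax = vmx->vcpu.arch.regs[VCPU_REGS_RAX];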

No functional change intended.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kernel/asm-offsets.c |  1 +
 arch/x86/kvm/vmx/nested.c     |  3 +-
 arch/x86/kvm/vmx/vmenter.S    | 58 +++++++++++++++--------------------
 arch/x86/kvm/vmx/vmx.c        |  3 +-
 arch/x86/kvm/vmx/vmx.h        |  3 +-
 5 files changed, 29 insertions(+), 39 deletions(-)

diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index cb50589a7102..90da275ad223 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -111,6 +111,7 @@ static void __used common(void)
 
 	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
 		BLANK();
+		OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
 		OFFSET(VMX_spec_ctrl, vcpu_vmx, spec_ctrl);
 	}
 }
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 61a2e551640a..3f62bdaffb0b 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3094,8 +3094,7 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
 		vmx->loaded_vmcs->host_state.cr4 = cr4;
 	}
 
-	vm_fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
-				 __vmx_vcpu_run_flags(vmx));
+	vm_fail = __vmx_vcpu_run(vmx, __vmx_vcpu_run_flags(vmx));
 
 	if (vmx->msr_autoload.host.nr)
 		vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index 63b4ad54331b..1362fe5859f9 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -11,24 +11,24 @@
 
 #define WORD_SIZE (BITS_PER_LONG / 8)
 
-#define VCPU_RAX	__VCPU_REGS_RAX * WORD_SIZE
-#define VCPU_RCX	__VCPU_REGS_RCX * WORD_SIZE
-#define VCPU_RDX	__VCPU_REGS_RDX * WORD_SIZE
-#define VCPU_RBX	__VCPU_REGS_RBX * WORD_SIZE
+#define VCPU_RAX	(VMX_vcpu_arch_regs + __VCPU_REGS_RAX * WORD_SIZE)
+#define VCPU_RCX	(VMX_vcpu_arch_regs + __VCPU_REGS_RCX * WORD_SIZE)
+#define VCPU_RDX	(VMX_vcpu_arch_regs + __VCPU_REGS_RDX * WORD_SIZE)
+#define VCPU_RBX	(VMX_vcpu_arch_regs + __VCPU_REGS_RBX * WORD_SIZE)
 /* Intentionally omit RSP as it's context switched by hardware */
-#define VCPU_RBP	__VCPU_REGS_RBP * WORD_SIZE
-#define VCPU_RSI	__VCPU_REGS_RSI * WORD_SIZE
-#define VCPU_RDI	__VCPU_REGS_RDI * WORD_SIZE
+#define VCPU_RBP	(VMX_vcpu_arch_regs + __VCPU_REGS_RBP * WORD_SIZE)
+#define VCPU_RSI	(VMX_vcpu_arch_regs + __VCPU_REGS_RSI * WORD_SIZE)
+#define VCPU_RDI	(VMX_vcpu_arch_regs + __VCPU_REGS_RDI * WORD_SIZE)
 
 #ifdef CONFIG_X86_64
-#define VCPU_R8		__VCPU_REGS_R8  * WORD_SIZE
-#define VCPU_R9		__VCPU_REGS_R9  * WORD_SIZE
-#define VCPU_R10	__VCPU_REGS_R10 * WORD_SIZE
-#define VCPU_R11	__VCPU_REGS_R11 * WORD_SIZE
-#define VCPU_R12	__VCPU_REGS_R12 * WORD_SIZE
-#define VCPU_R13	__VCPU_REGS_R13 * WORD_SIZE
-#define VCPU_R14	__VCPU_REGS_R14 * WORD_SIZE
-#define VCPU_R15	__VCPU_REGS_R15 * WORD_SIZE
+#define VCPU_R8		(VMX_vcpu_arch_regs + __VCPU_REGS_R8  * WORD_SIZE)
+#define VCPU_R9		(VMX_vcpu_arch_regs + __VCPU_REGS_R9  * WORD_SIZE)
+#define VCPU_R10	(VMX_vcpu_arch_regs + __VCPU_REGS_R10 * WORD_SIZE)
+#define VCPU_R11	(VMX_vcpu_arch_regs + __VCPU_REGS_R11 * WORD_SIZE)
+#define VCPU_R12	(VMX_vcpu_arch_regs + __VCPU_REGS_R12 * WORD_SIZE)
+#define VCPU_R13	(VMX_vcpu_arch_regs + __VCPU_REGS_R13 * WORD_SIZE)
+#define VCPU_R14	(VMX_vcpu_arch_regs + __VCPU_REGS_R14 * WORD_SIZE)
+#define VCPU_R15	(VMX_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
 #endif
 
 .section .noinstr.text, "ax"
@@ -36,7 +36,6 @@
 /**
  * __vmx_vcpu_run - Run a vCPU via a transition to VMX guest mode
  * @vmx:	struct vcpu_vmx *
- * @regs:	unsigned long * (to guest registers)
  * @flags:	VMX_RUN_VMRESUME:	use VMRESUME instead of VMLAUNCH
  *		VMX_RUN_SAVE_SPEC_CTRL: save guest SPEC_CTRL into vmx->spec_ctrl
  *
@@ -61,22 +60,19 @@ SYM_FUNC_START(__vmx_vcpu_run)
 	push %_ASM_ARG1
 
 	/* Save @flags for SPEC_CTRL handling */
-	push %_ASM_ARG3
-
-	/*
-	 * Save @regs, _ASM_ARG2 may be modified by vmx_update_host_rsp() and
-	 * @regs is needed after VM-Exit to save the guest's register values.
-	 */
 	push %_ASM_ARG2
 
-	/* Copy @flags to BL, _ASM_ARG3 is volatile. */
-	mov %_ASM_ARG3B, %bl
+	/* Copy @flags to BL, _ASM_ARG2 is volatile. */
+	mov %_ASM_ARG2B, %bl
 
 	lea (%_ASM_SP), %_ASM_ARG2
 	call vmx_update_host_rsp
 
 	ALTERNATIVE "jmp .Lspec_ctrl_done", "", X86_FEATURE_MSR_SPEC_CTRL
 
+	/* Reload @vmx, _ASM_ARG1 may be modified by vmx_update_host_rsp().  */
+	mov WORD_SIZE(%_ASM_SP), %_ASM_DI
+
 	/*
 	 * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
 	 * host's, write the MSR.
@@ -85,7 +81,6 @@ SYM_FUNC_START(__vmx_vcpu_run)
 	 * there must not be any returns or indirect branches between this code
 	 * and vmentry.
 	 */
-	mov 2*WORD_SIZE(%_ASM_SP), %_ASM_DI
 	movl VMX_spec_ctrl(%_ASM_DI), %edi
 	movl PER_CPU_VAR(x86_spec_ctrl_current), %esi
 	cmp %edi, %esi
@@ -102,8 +97,8 @@ SYM_FUNC_START(__vmx_vcpu_run)
 	 * an LFENCE to stop speculation from skipping the wrmsr.
 	 */
 
-	/* Load @regs to RAX. */
-	mov (%_ASM_SP), %_ASM_AX
+	/* Load @vmx to RAX. */
+	mov WORD_SIZE(%_ASM_SP), %_ASM_AX
 
 	/* Check if vmlaunch or vmresume is needed */
 	testb $VMX_RUN_VMRESUME, %bl
@@ -125,7 +120,7 @@ SYM_FUNC_START(__vmx_vcpu_run)
 	mov VCPU_R14(%_ASM_AX), %r14
 	mov VCPU_R15(%_ASM_AX), %r15
 #endif
-	/* Load guest RAX.  This kills the @regs pointer! */
+	/* Load guest RAX.  This kills the @vmx pointer! */
 	mov VCPU_RAX(%_ASM_AX), %_ASM_AX
 
 	/* Check EFLAGS.ZF from 'testb' above */
@@ -163,8 +158,8 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
 	/* Temporarily save guest's RAX. */
 	push %_ASM_AX
 
-	/* Reload @regs to RAX. */
-	mov WORD_SIZE(%_ASM_SP), %_ASM_AX
+	/* Reload @vmx to RAX. */
+	mov 2*WORD_SIZE(%_ASM_SP), %_ASM_AX
 
 	/* Save all guest registers, including RAX from the stack */
 	pop           VCPU_RAX(%_ASM_AX)
@@ -189,9 +184,6 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
 	xor %ebx, %ebx
 
 .Lclear_regs:
-	/* Discard @regs.  The register is irrelevant, it just can't be RBX. */
-	pop %_ASM_AX
-
 	/*
 	 * Clear all general purpose registers except RSP and RBX to prevent
 	 * speculative use of the guest's values, even those that are reloaded
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 05a747c9a9ff..42cda7a5c009 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7084,8 +7084,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 	if (vcpu->arch.cr2 != native_read_cr2())
 		native_write_cr2(vcpu->arch.cr2);
 
-	vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
-				   flags);
+	vmx->fail = __vmx_vcpu_run(vmx, flags);
 
 	vcpu->arch.cr2 = native_read_cr2();
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index a3da84f4ea45..d90cdbea0e4c 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -422,8 +422,7 @@ void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu);
 void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp);
 void vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx, unsigned int flags);
 unsigned int __vmx_vcpu_run_flags(struct vcpu_vmx *vmx);
-bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned long *regs,
-		    unsigned int flags);
+bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned int flags);
 int vmx_find_loadstore_msr_slot(struct vmx_msrs *m, u32 msr);
 void vmx_ept_load_pdptrs(struct kvm_vcpu *vcpu);
 
-- 
2.31.1




* [PATCH 2/7] KVM: VMX: more cleanups to __vmx_vcpu_run
  2022-10-28 23:07 [RFC PATCH 0/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
  2022-10-28 23:07 ` [PATCH 1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run Paolo Bonzini
@ 2022-10-28 23:07 ` Paolo Bonzini
  2022-10-28 23:07 ` [PATCH 3/7] KVM: SVM: extract VMCB accessors to a new file Paolo Bonzini
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2022-10-28 23:07 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, seanjc, jpoimboe

Slightly improve register allocation by loading the @vmx pointer only
once before vmlaunch/vmresume.
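
The one ordering constraint worth noting, sketched from the diff below:
%rdi holds @vmx while the guest registers are loaded, so guest RDI must
be loaded last, just as guest RAX had to be loaded last when the pointer
lived in RAX:

	mov VCPU_RAX(%_ASM_DI), %_ASM_AX
	...
	mov VCPU_RDI(%_ASM_DI), %_ASM_DI	/* kills the @vmx pointer */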

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/vmx/vmenter.S | 40 +++++++++++++++++---------------------
 1 file changed, 18 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index 1362fe5859f9..0aea6b348a96 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -81,13 +81,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
 	 * there must not be any returns or indirect branches between this code
 	 * and vmentry.
 	 */
-	movl VMX_spec_ctrl(%_ASM_DI), %edi
+	movl VMX_spec_ctrl(%_ASM_DI), %eax
 	movl PER_CPU_VAR(x86_spec_ctrl_current), %esi
-	cmp %edi, %esi
+	cmp %eax, %esi
 	je .Lspec_ctrl_done
 	mov $MSR_IA32_SPEC_CTRL, %ecx
 	xor %edx, %edx
-	mov %edi, %eax
 	wrmsr
 
 .Lspec_ctrl_done:
@@ -97,31 +96,28 @@ SYM_FUNC_START(__vmx_vcpu_run)
 	 * an LFENCE to stop speculation from skipping the wrmsr.
 	 */
 
-	/* Load @vmx to RAX. */
-	mov WORD_SIZE(%_ASM_SP), %_ASM_AX
-
 	/* Check if vmlaunch or vmresume is needed */
 	testb $VMX_RUN_VMRESUME, %bl
 
 	/* Load guest registers.  Don't clobber flags. */
-	mov VCPU_RCX(%_ASM_AX), %_ASM_CX
-	mov VCPU_RDX(%_ASM_AX), %_ASM_DX
-	mov VCPU_RBX(%_ASM_AX), %_ASM_BX
-	mov VCPU_RBP(%_ASM_AX), %_ASM_BP
-	mov VCPU_RSI(%_ASM_AX), %_ASM_SI
-	mov VCPU_RDI(%_ASM_AX), %_ASM_DI
+	mov VCPU_RAX(%_ASM_DI), %_ASM_AX
+	mov VCPU_RCX(%_ASM_DI), %_ASM_CX
+	mov VCPU_RDX(%_ASM_DI), %_ASM_DX
+	mov VCPU_RBX(%_ASM_DI), %_ASM_BX
+	mov VCPU_RBP(%_ASM_DI), %_ASM_BP
+	mov VCPU_RSI(%_ASM_DI), %_ASM_SI
 #ifdef CONFIG_X86_64
-	mov VCPU_R8 (%_ASM_AX),  %r8
-	mov VCPU_R9 (%_ASM_AX),  %r9
-	mov VCPU_R10(%_ASM_AX), %r10
-	mov VCPU_R11(%_ASM_AX), %r11
-	mov VCPU_R12(%_ASM_AX), %r12
-	mov VCPU_R13(%_ASM_AX), %r13
-	mov VCPU_R14(%_ASM_AX), %r14
-	mov VCPU_R15(%_ASM_AX), %r15
+	mov VCPU_R8 (%_ASM_DI),  %r8
+	mov VCPU_R9 (%_ASM_DI),  %r9
+	mov VCPU_R10(%_ASM_DI), %r10
+	mov VCPU_R11(%_ASM_DI), %r11
+	mov VCPU_R12(%_ASM_DI), %r12
+	mov VCPU_R13(%_ASM_DI), %r13
+	mov VCPU_R14(%_ASM_DI), %r14
+	mov VCPU_R15(%_ASM_DI), %r15
 #endif
-	/* Load guest RAX.  This kills the @vmx pointer! */
-	mov VCPU_RAX(%_ASM_AX), %_ASM_AX
+	/* Load guest RDI.  This kills the @vmx pointer! */
+	mov VCPU_RDI(%_ASM_DI), %_ASM_DI
 
 	/* Check EFLAGS.ZF from 'testb' above */
 	jz .Lvmlaunch
-- 
2.31.1




* [PATCH 3/7] KVM: SVM: extract VMCB accessors to a new file
  2022-10-28 23:07 [RFC PATCH 0/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
  2022-10-28 23:07 ` [PATCH 1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run Paolo Bonzini
  2022-10-28 23:07 ` [PATCH 2/7] KVM: VMX: more cleanups to __vmx_vcpu_run Paolo Bonzini
@ 2022-10-28 23:07 ` Paolo Bonzini
  2022-10-28 23:07 ` [PATCH 4/7] KVM: SVM: replace argument of __svm_vcpu_run with vcpu_svm Paolo Bonzini
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2022-10-28 23:07 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, seanjc, jpoimboe

Having inline functions in svm.h confuses the compilation of
asm-offsets.c, which cannot find kvm_cache_regs.h because arch/x86/kvm
is not in asm-offsets.c's include path.  Just extract the functions to
a new file.
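
For context, asm-offsets.c generates assembler-visible constants roughly
as follows (a simplified sketch of the kbuild mechanism; see
include/linux/kbuild.h for the real macros):

	#define DEFINE(sym, val) \
		asm volatile("\n.ascii \"->" #sym " %0\"" : : "i" (val))
	#define OFFSET(sym, str, mem) \
		DEFINE(sym, offsetof(struct str, mem))

so every type named in an OFFSET() must be fully defined, with all of
its headers reachable, when asm-offsets.c is compiled.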

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/avic.c         |   1 +
 arch/x86/kvm/svm/nested.c       |   1 +
 arch/x86/kvm/svm/sev.c          |   1 +
 arch/x86/kvm/svm/svm.c          |   1 +
 arch/x86/kvm/svm/svm.h          | 200 ------------------------------
 arch/x86/kvm/svm/svm_onhyperv.c |   1 +
 arch/x86/kvm/svm/vmcb.h         | 211 ++++++++++++++++++++++++++++++++
 7 files changed, 216 insertions(+), 200 deletions(-)
 create mode 100644 arch/x86/kvm/svm/vmcb.h

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 6919dee69f18..cc651a3310b1 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -26,6 +26,7 @@
 #include "x86.h"
 #include "irq.h"
 #include "svm.h"
+#include "vmcb.h"
 
 /* AVIC GATAG is encoded using VM and VCPU IDs */
 #define AVIC_VCPU_ID_BITS		8
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index b258d6988f5d..365f5ef55b53 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -29,6 +29,7 @@
 #include "cpuid.h"
 #include "lapic.h"
 #include "svm.h"
+#include "vmcb.h"
 #include "hyperv.h"
 
 #define CC KVM_NESTED_VMENTER_CONSISTENCY_CHECK
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index c0c9ed5e279c..549f35ded880 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -25,6 +25,7 @@
 #include "mmu.h"
 #include "x86.h"
 #include "svm.h"
+#include "vmcb.h"
 #include "svm_ops.h"
 #include "cpuid.h"
 #include "trace.h"
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d22a809d9233..b793cfdce68d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -44,6 +44,7 @@
 #include "trace.h"
 
 #include "svm.h"
+#include "vmcb.h"
 #include "svm_ops.h"
 
 #include "kvm_onhyperv.h"
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6a7686bf6900..222856788153 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -22,8 +22,6 @@
 #include <asm/svm.h>
 #include <asm/sev-common.h>
 
-#include "kvm_cache_regs.h"
-
 #define __sme_page_pa(x) __sme_set(page_to_pfn(x) << PAGE_SHIFT)
 
 #define	IOPM_SIZE PAGE_SIZE * 3
@@ -327,27 +325,6 @@ static __always_inline bool sev_es_guest(struct kvm *kvm)
 #endif
 }
 
-static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)
-{
-	vmcb->control.clean = 0;
-}
-
-static inline void vmcb_mark_all_clean(struct vmcb *vmcb)
-{
-	vmcb->control.clean = VMCB_ALL_CLEAN_MASK
-			       & ~VMCB_ALWAYS_DIRTY_MASK;
-}
-
-static inline void vmcb_mark_dirty(struct vmcb *vmcb, int bit)
-{
-	vmcb->control.clean &= ~(1 << bit);
-}
-
-static inline bool vmcb_is_dirty(struct vmcb *vmcb, int bit)
-{
-        return !test_bit(bit, (unsigned long *)&vmcb->control.clean);
-}
-
 static __always_inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
 {
 	return container_of(vcpu, struct vcpu_svm, vcpu);
@@ -363,161 +340,6 @@ static __always_inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
  */
 #define SVM_REGS_LAZY_LOAD_SET	(1 << VCPU_EXREG_PDPTR)
 
-static inline void vmcb_set_intercept(struct vmcb_control_area *control, u32 bit)
-{
-	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
-	__set_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline void vmcb_clr_intercept(struct vmcb_control_area *control, u32 bit)
-{
-	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
-	__clear_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline bool vmcb_is_intercept(struct vmcb_control_area *control, u32 bit)
-{
-	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
-	return test_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u32 bit)
-{
-	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
-	return test_bit(bit, (unsigned long *)&control->intercepts);
-}
-
-static inline void set_dr_intercepts(struct vcpu_svm *svm)
-{
-	struct vmcb *vmcb = svm->vmcb01.ptr;
-
-	if (!sev_es_guest(svm->vcpu.kvm)) {
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE);
-	}
-
-	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
-	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
-
-	recalc_intercepts(svm);
-}
-
-static inline void clr_dr_intercepts(struct vcpu_svm *svm)
-{
-	struct vmcb *vmcb = svm->vmcb01.ptr;
-
-	vmcb->control.intercepts[INTERCEPT_DR] = 0;
-
-	/* DR7 access must remain intercepted for an SEV-ES guest */
-	if (sev_es_guest(svm->vcpu.kvm)) {
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
-		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
-	}
-
-	recalc_intercepts(svm);
-}
-
-static inline void set_exception_intercept(struct vcpu_svm *svm, u32 bit)
-{
-	struct vmcb *vmcb = svm->vmcb01.ptr;
-
-	WARN_ON_ONCE(bit >= 32);
-	vmcb_set_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
-
-	recalc_intercepts(svm);
-}
-
-static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit)
-{
-	struct vmcb *vmcb = svm->vmcb01.ptr;
-
-	WARN_ON_ONCE(bit >= 32);
-	vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
-
-	recalc_intercepts(svm);
-}
-
-static inline void svm_set_intercept(struct vcpu_svm *svm, int bit)
-{
-	struct vmcb *vmcb = svm->vmcb01.ptr;
-
-	vmcb_set_intercept(&vmcb->control, bit);
-
-	recalc_intercepts(svm);
-}
-
-static inline void svm_clr_intercept(struct vcpu_svm *svm, int bit)
-{
-	struct vmcb *vmcb = svm->vmcb01.ptr;
-
-	vmcb_clr_intercept(&vmcb->control, bit);
-
-	recalc_intercepts(svm);
-}
-
-static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit)
-{
-	return vmcb_is_intercept(&svm->vmcb->control, bit);
-}
-
-static inline bool nested_vgif_enabled(struct vcpu_svm *svm)
-{
-	return svm->vgif_enabled && (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK);
-}
-
-static inline struct vmcb *get_vgif_vmcb(struct vcpu_svm *svm)
-{
-	if (!vgif)
-		return NULL;
-
-	if (is_guest_mode(&svm->vcpu) && !nested_vgif_enabled(svm))
-		return svm->nested.vmcb02.ptr;
-	else
-		return svm->vmcb01.ptr;
-}
-
-static inline void enable_gif(struct vcpu_svm *svm)
-{
-	struct vmcb *vmcb = get_vgif_vmcb(svm);
-
-	if (vmcb)
-		vmcb->control.int_ctl |= V_GIF_MASK;
-	else
-		svm->vcpu.arch.hflags |= HF_GIF_MASK;
-}
-
-static inline void disable_gif(struct vcpu_svm *svm)
-{
-	struct vmcb *vmcb = get_vgif_vmcb(svm);
-
-	if (vmcb)
-		vmcb->control.int_ctl &= ~V_GIF_MASK;
-	else
-		svm->vcpu.arch.hflags &= ~HF_GIF_MASK;
-}
-
-static inline bool gif_set(struct vcpu_svm *svm)
-{
-	struct vmcb *vmcb = get_vgif_vmcb(svm);
-
-	if (vmcb)
-		return !!(vmcb->control.int_ctl & V_GIF_MASK);
-	else
-		return !!(svm->vcpu.arch.hflags & HF_GIF_MASK);
-}
-
 static inline bool nested_npt_enabled(struct vcpu_svm *svm)
 {
 	return svm->nested.ctl.nested_ctl & SVM_NESTED_CTL_NP_ENABLE;
@@ -567,28 +389,6 @@ void svm_complete_interrupt_delivery(struct kvm_vcpu *vcpu, int delivery_mode,
 #define NESTED_EXIT_DONE	1	/* Exit caused nested vmexit  */
 #define NESTED_EXIT_CONTINUE	2	/* Further checks needed      */
 
-static inline bool nested_svm_virtualize_tpr(struct kvm_vcpu *vcpu)
-{
-	struct vcpu_svm *svm = to_svm(vcpu);
-
-	return is_guest_mode(vcpu) && (svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK);
-}
-
-static inline bool nested_exit_on_smi(struct vcpu_svm *svm)
-{
-	return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_SMI);
-}
-
-static inline bool nested_exit_on_intr(struct vcpu_svm *svm)
-{
-	return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_INTR);
-}
-
-static inline bool nested_exit_on_nmi(struct vcpu_svm *svm)
-{
-	return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_NMI);
-}
-
 int enter_svm_guest_mode(struct kvm_vcpu *vcpu,
 			 u64 vmcb_gpa, struct vmcb *vmcb12, bool from_vmrun);
 void svm_leave_nested(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm/svm_onhyperv.c b/arch/x86/kvm/svm/svm_onhyperv.c
index 8cdc62c74a96..ae0a101329e6 100644
--- a/arch/x86/kvm/svm/svm_onhyperv.c
+++ b/arch/x86/kvm/svm/svm_onhyperv.c
@@ -8,6 +8,7 @@
 #include <asm/mshyperv.h>
 
 #include "svm.h"
+#include "vmcb.h"
 #include "svm_ops.h"
 
 #include "hyperv.h"
diff --git a/arch/x86/kvm/svm/vmcb.h b/arch/x86/kvm/svm/vmcb.h
new file mode 100644
index 000000000000..8757cda27e3a
--- /dev/null
+++ b/arch/x86/kvm/svm/vmcb.h
@@ -0,0 +1,211 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Kernel-based Virtual Machine driver for Linux
+ *
+ * AMD SVM support - VMCB accessors
+ */
+
+#ifndef __SVM_VMCB_H
+#define __SVM_VMCB_H
+
+#include "kvm_cache_regs.h"
+
+static inline void vmcb_mark_all_dirty(struct vmcb *vmcb)
+{
+	vmcb->control.clean = 0;
+}
+
+static inline void vmcb_mark_all_clean(struct vmcb *vmcb)
+{
+	vmcb->control.clean = VMCB_ALL_CLEAN_MASK
+			       & ~VMCB_ALWAYS_DIRTY_MASK;
+}
+
+static inline void vmcb_mark_dirty(struct vmcb *vmcb, int bit)
+{
+	vmcb->control.clean &= ~(1 << bit);
+}
+
+static inline bool vmcb_is_dirty(struct vmcb *vmcb, int bit)
+{
+        return !test_bit(bit, (unsigned long *)&vmcb->control.clean);
+}
+
+static inline void vmcb_set_intercept(struct vmcb_control_area *control, u32 bit)
+{
+	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+	__set_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline void vmcb_clr_intercept(struct vmcb_control_area *control, u32 bit)
+{
+	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+	__clear_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline bool vmcb_is_intercept(struct vmcb_control_area *control, u32 bit)
+{
+	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+	return test_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u32 bit)
+{
+	WARN_ON_ONCE(bit >= 32 * MAX_INTERCEPT);
+	return test_bit(bit, (unsigned long *)&control->intercepts);
+}
+
+static inline void set_dr_intercepts(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = svm->vmcb01.ptr;
+
+	if (!sev_es_guest(svm->vcpu.kvm)) {
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE);
+	}
+
+	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
+	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
+
+	recalc_intercepts(svm);
+}
+
+static inline void clr_dr_intercepts(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = svm->vmcb01.ptr;
+
+	vmcb->control.intercepts[INTERCEPT_DR] = 0;
+
+	/* DR7 access must remain intercepted for an SEV-ES guest */
+	if (sev_es_guest(svm->vcpu.kvm)) {
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
+	}
+
+	recalc_intercepts(svm);
+}
+
+static inline void set_exception_intercept(struct vcpu_svm *svm, u32 bit)
+{
+	struct vmcb *vmcb = svm->vmcb01.ptr;
+
+	WARN_ON_ONCE(bit >= 32);
+	vmcb_set_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
+
+	recalc_intercepts(svm);
+}
+
+static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit)
+{
+	struct vmcb *vmcb = svm->vmcb01.ptr;
+
+	WARN_ON_ONCE(bit >= 32);
+	vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit);
+
+	recalc_intercepts(svm);
+}
+
+static inline void svm_set_intercept(struct vcpu_svm *svm, int bit)
+{
+	struct vmcb *vmcb = svm->vmcb01.ptr;
+
+	vmcb_set_intercept(&vmcb->control, bit);
+
+	recalc_intercepts(svm);
+}
+
+static inline void svm_clr_intercept(struct vcpu_svm *svm, int bit)
+{
+	struct vmcb *vmcb = svm->vmcb01.ptr;
+
+	vmcb_clr_intercept(&vmcb->control, bit);
+
+	recalc_intercepts(svm);
+}
+
+static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit)
+{
+	return vmcb_is_intercept(&svm->vmcb->control, bit);
+}
+
+static inline bool nested_vgif_enabled(struct vcpu_svm *svm)
+{
+	return svm->vgif_enabled && (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK);
+}
+
+static inline struct vmcb *get_vgif_vmcb(struct vcpu_svm *svm)
+{
+	if (!vgif)
+		return NULL;
+
+	if (is_guest_mode(&svm->vcpu) && !nested_vgif_enabled(svm))
+		return svm->nested.vmcb02.ptr;
+	else
+		return svm->vmcb01.ptr;
+}
+
+static inline void enable_gif(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = get_vgif_vmcb(svm);
+
+	if (vmcb)
+		vmcb->control.int_ctl |= V_GIF_MASK;
+	else
+		svm->vcpu.arch.hflags |= HF_GIF_MASK;
+}
+
+static inline void disable_gif(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = get_vgif_vmcb(svm);
+
+	if (vmcb)
+		vmcb->control.int_ctl &= ~V_GIF_MASK;
+	else
+		svm->vcpu.arch.hflags &= ~HF_GIF_MASK;
+}
+
+static inline bool gif_set(struct vcpu_svm *svm)
+{
+	struct vmcb *vmcb = get_vgif_vmcb(svm);
+
+	if (vmcb)
+		return !!(vmcb->control.int_ctl & V_GIF_MASK);
+	else
+		return !!(svm->vcpu.arch.hflags & HF_GIF_MASK);
+}
+
+static inline bool nested_svm_virtualize_tpr(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	return is_guest_mode(vcpu) && (svm->nested.ctl.int_ctl & V_INTR_MASKING_MASK);
+}
+
+static inline bool nested_exit_on_smi(struct vcpu_svm *svm)
+{
+	return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_SMI);
+}
+
+static inline bool nested_exit_on_intr(struct vcpu_svm *svm)
+{
+	return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_INTR);
+}
+
+static inline bool nested_exit_on_nmi(struct vcpu_svm *svm)
+{
+	return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_NMI);
+}
+
+#endif
-- 
2.31.1




* [PATCH 4/7] KVM: SVM: replace argument of __svm_vcpu_run with vcpu_svm
  2022-10-28 23:07 [RFC PATCH 0/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
                   ` (2 preceding siblings ...)
  2022-10-28 23:07 ` [PATCH 3/7] KVM: SVM: extract VMCB accessors to a new file Paolo Bonzini
@ 2022-10-28 23:07 ` Paolo Bonzini
  2022-10-28 23:07 ` [PATCH 5/7] KVM: SVM: adjust register allocation for __svm_vcpu_run Paolo Bonzini
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2022-10-28 23:07 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, seanjc, jpoimboe

Since registers are reachable through vcpu_svm, and we will
need to access more fields of that struct, pass it instead
of the regs[] array.
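
With the new SVM_vcpu_arch_regs constant, every VCPU_* define below is a
compile-time offset relative to the @svm pointer; as a worked expansion
(assuming 64-bit, so WORD_SIZE == 8):

	VCPU_RCX = SVM_vcpu_arch_regs + __VCPU_REGS_RCX * WORD_SIZE
	         = offsetof(struct vcpu_svm, vcpu.arch.regs[VCPU_REGS_RCX])

so "mov VCPU_RCX(%_ASM_AX), %_ASM_CX" loads the guest's RCX while RAX
holds @svm.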

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kernel/asm-offsets.c |  6 ++++++
 arch/x86/kvm/svm/svm.c        |  2 +-
 arch/x86/kvm/svm/svm.h        |  2 +-
 arch/x86/kvm/svm/vmenter.S    | 36 +++++++++++++++++------------------
 4 files changed, 26 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 90da275ad223..7f1dd1138117 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -20,6 +20,7 @@
 #include <asm/tlbflush.h>
 #include <asm/tdx.h>
 #include "../kvm/vmx/vmx.h"
+#include "../kvm/svm/svm.h"
 
 #ifdef CONFIG_XEN
 #include <xen/interface/xen.h>
@@ -109,6 +110,11 @@ static void __used common(void)
 	OFFSET(TSS_sp1, tss_struct, x86_tss.sp1);
 	OFFSET(TSS_sp2, tss_struct, x86_tss.sp2);
 
+	if (IS_ENABLED(CONFIG_KVM_AMD)) {
+		BLANK();
+		OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
+	}
+
 	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
 		BLANK();
 		OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b793cfdce68d..64f5f0544b4f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3932,7 +3932,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 		 * vmcb02 when switching vmcbs for nested virtualization.
 		 */
 		vmload(svm->vmcb01.pa);
-		__svm_vcpu_run(vmcb_pa, (unsigned long *)&vcpu->arch.regs);
+		__svm_vcpu_run(vmcb_pa, svm);
 		vmsave(svm->vmcb01.pa);
 
 		vmload(__sme_page_pa(sd->save_area));
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 222856788153..5f8dfc9cd9a7 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -484,6 +484,6 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 /* vmenter.S */
 
 void __svm_sev_es_vcpu_run(unsigned long vmcb_pa);
-void __svm_vcpu_run(unsigned long vmcb_pa, unsigned long *regs);
+void __svm_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm);
 
 #endif
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 723f8534986c..8fac744361e5 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -8,23 +8,23 @@
 #define WORD_SIZE (BITS_PER_LONG / 8)
 
 /* Intentionally omit RAX as it's context switched by hardware */
-#define VCPU_RCX	__VCPU_REGS_RCX * WORD_SIZE
-#define VCPU_RDX	__VCPU_REGS_RDX * WORD_SIZE
-#define VCPU_RBX	__VCPU_REGS_RBX * WORD_SIZE
+#define VCPU_RCX	(SVM_vcpu_arch_regs + __VCPU_REGS_RCX * WORD_SIZE)
+#define VCPU_RDX	(SVM_vcpu_arch_regs + __VCPU_REGS_RDX * WORD_SIZE)
+#define VCPU_RBX	(SVM_vcpu_arch_regs + __VCPU_REGS_RBX * WORD_SIZE)
 /* Intentionally omit RSP as it's context switched by hardware */
-#define VCPU_RBP	__VCPU_REGS_RBP * WORD_SIZE
-#define VCPU_RSI	__VCPU_REGS_RSI * WORD_SIZE
-#define VCPU_RDI	__VCPU_REGS_RDI * WORD_SIZE
+#define VCPU_RBP	(SVM_vcpu_arch_regs + __VCPU_REGS_RBP * WORD_SIZE)
+#define VCPU_RSI	(SVM_vcpu_arch_regs + __VCPU_REGS_RSI * WORD_SIZE)
+#define VCPU_RDI	(SVM_vcpu_arch_regs + __VCPU_REGS_RDI * WORD_SIZE)
 
 #ifdef CONFIG_X86_64
-#define VCPU_R8		__VCPU_REGS_R8  * WORD_SIZE
-#define VCPU_R9		__VCPU_REGS_R9  * WORD_SIZE
-#define VCPU_R10	__VCPU_REGS_R10 * WORD_SIZE
-#define VCPU_R11	__VCPU_REGS_R11 * WORD_SIZE
-#define VCPU_R12	__VCPU_REGS_R12 * WORD_SIZE
-#define VCPU_R13	__VCPU_REGS_R13 * WORD_SIZE
-#define VCPU_R14	__VCPU_REGS_R14 * WORD_SIZE
-#define VCPU_R15	__VCPU_REGS_R15 * WORD_SIZE
+#define VCPU_R8		(SVM_vcpu_arch_regs + __VCPU_REGS_R8  * WORD_SIZE)
+#define VCPU_R9		(SVM_vcpu_arch_regs + __VCPU_REGS_R9  * WORD_SIZE)
+#define VCPU_R10	(SVM_vcpu_arch_regs + __VCPU_REGS_R10 * WORD_SIZE)
+#define VCPU_R11	(SVM_vcpu_arch_regs + __VCPU_REGS_R11 * WORD_SIZE)
+#define VCPU_R12	(SVM_vcpu_arch_regs + __VCPU_REGS_R12 * WORD_SIZE)
+#define VCPU_R13	(SVM_vcpu_arch_regs + __VCPU_REGS_R13 * WORD_SIZE)
+#define VCPU_R14	(SVM_vcpu_arch_regs + __VCPU_REGS_R14 * WORD_SIZE)
+#define VCPU_R15	(SVM_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
 #endif
 
 .section .noinstr.text, "ax"
@@ -32,7 +32,7 @@
 /**
  * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
  * @vmcb_pa:	unsigned long
- * @regs:	unsigned long * (to guest registers)
+ * @svm:	struct vcpu_svm *
  */
 SYM_FUNC_START(__svm_vcpu_run)
 	push %_ASM_BP
@@ -47,13 +47,13 @@ SYM_FUNC_START(__svm_vcpu_run)
 #endif
 	push %_ASM_BX
 
-	/* Save @regs. */
+	/* Save @svm. */
 	push %_ASM_ARG2
 
 	/* Save @vmcb. */
 	push %_ASM_ARG1
 
-	/* Move @regs to RAX. */
+	/* Move @svm to RAX. */
 	mov %_ASM_ARG2, %_ASM_AX
 
 	/* Load guest registers. */
@@ -89,7 +89,7 @@ SYM_FUNC_START(__svm_vcpu_run)
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
 #endif
 
-	/* "POP" @regs to RAX. */
+	/* "POP" @svm to RAX. */
 	pop %_ASM_AX
 
 	/* Save all guest registers.  */
-- 
2.31.1




* [PATCH 5/7] KVM: SVM: adjust register allocation for __svm_vcpu_run
  2022-10-28 23:07 [RFC PATCH 0/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
                   ` (3 preceding siblings ...)
  2022-10-28 23:07 ` [PATCH 4/7] KVM: SVM: replace argument of __svm_vcpu_run with vcpu_svm Paolo Bonzini
@ 2022-10-28 23:07 ` Paolo Bonzini
  2022-10-28 23:07 ` [PATCH 6/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
  2022-10-28 23:07 ` [PATCH 7/7] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers Paolo Bonzini
  6 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2022-10-28 23:07 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, seanjc, jpoimboe

In preparation for moving SPEC_CTRL access to __svm_vcpu_run,
keep the pointer to the struct vcpu_svm in %rdi, which is not
used by rdmsr/wrmsr.
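
For reference, RDMSR and WRMSR have fixed register operands, which is
the constraint being designed around (sketch):

	mov $MSR_IA32_SPEC_CTRL, %ecx	/* MSR index must be in ECX */
	rdmsr				/* value returned in EDX:EAX */
	wrmsr				/* value written from EDX:EAX */

so EAX, ECX and EDX are all clobbered around an MSR access, while RDI
survives untouched.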

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/vmenter.S | 39 +++++++++++++++++++-------------------
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 8fac744361e5..51a63a47d74f 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -53,30 +53,31 @@ SYM_FUNC_START(__svm_vcpu_run)
 	/* Save @vmcb. */
 	push %_ASM_ARG1
 
-	/* Move @svm to RAX. */
-	mov %_ASM_ARG2, %_ASM_AX
+	/* Move @svm to RDI. */
+	mov %_ASM_ARG2, %_ASM_DI
 
-	/* Load guest registers. */
-	mov VCPU_RCX(%_ASM_AX), %_ASM_CX
-	mov VCPU_RDX(%_ASM_AX), %_ASM_DX
-	mov VCPU_RBX(%_ASM_AX), %_ASM_BX
-	mov VCPU_RBP(%_ASM_AX), %_ASM_BP
-	mov VCPU_RSI(%_ASM_AX), %_ASM_SI
-	mov VCPU_RDI(%_ASM_AX), %_ASM_DI
-#ifdef CONFIG_X86_64
-	mov VCPU_R8 (%_ASM_AX),  %r8
-	mov VCPU_R9 (%_ASM_AX),  %r9
-	mov VCPU_R10(%_ASM_AX), %r10
-	mov VCPU_R11(%_ASM_AX), %r11
-	mov VCPU_R12(%_ASM_AX), %r12
-	mov VCPU_R13(%_ASM_AX), %r13
-	mov VCPU_R14(%_ASM_AX), %r14
-	mov VCPU_R15(%_ASM_AX), %r15
-#endif
 
 	/* "POP" @vmcb to RAX. */
 	pop %_ASM_AX
 
+	/* Load guest registers. */
+	mov VCPU_RCX(%_ASM_DI), %_ASM_CX
+	mov VCPU_RDX(%_ASM_DI), %_ASM_DX
+	mov VCPU_RBX(%_ASM_DI), %_ASM_BX
+	mov VCPU_RBP(%_ASM_DI), %_ASM_BP
+	mov VCPU_RSI(%_ASM_DI), %_ASM_SI
+#ifdef CONFIG_X86_64
+	mov VCPU_R8 (%_ASM_DI),  %r8
+	mov VCPU_R9 (%_ASM_DI),  %r9
+	mov VCPU_R10(%_ASM_DI), %r10
+	mov VCPU_R11(%_ASM_DI), %r11
+	mov VCPU_R12(%_ASM_DI), %r12
+	mov VCPU_R13(%_ASM_DI), %r13
+	mov VCPU_R14(%_ASM_DI), %r14
+	mov VCPU_R15(%_ASM_DI), %r15
+#endif
+	mov VCPU_RDI(%_ASM_DI), %_ASM_DI
+
 	/* Enter guest mode */
 	sti
 
-- 
2.31.1




* [PATCH 6/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  2022-10-28 23:07 [RFC PATCH 0/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
                   ` (4 preceding siblings ...)
  2022-10-28 23:07 ` [PATCH 5/7] KVM: SVM: adjust register allocation for __svm_vcpu_run Paolo Bonzini
@ 2022-10-28 23:07 ` Paolo Bonzini
  2022-11-02 15:28   ` Josh Poimboeuf
  2022-10-28 23:07 ` [PATCH 7/7] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers Paolo Bonzini
  6 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2022-10-28 23:07 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, seanjc, jpoimboe

Restoration of the host IA32_SPEC_CTRL value is probably too late
with respect to the return thunk training sequence.

With respect to the user/kernel boundary, AMD says, "If software chooses
to toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
exit), software should set STIBP to 1 before executing the return thunk
training sequence." I assume the same requirements apply to the guest/host
boundary. The return thunk training sequence is in vmenter.S, quite close
to the VM-exit. On hosts without V_SPEC_CTRL, however, the host's
IA32_SPEC_CTRL value is not restored until much later.

To avoid this, move the restoration of host SPEC_CTRL to assembly and,
for consistency, move the restoration of the guest SPEC_CTRL as well.
This is not particularly difficult, apart from some care to cover both
32- and 64-bit, and to share code between SEV-ES and normal vmentry.
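
In C-like terms, the logic that moves into the new assembly macros is
roughly the following (a sketch only; the helper names are illustrative,
and the real code must stay in assembly precisely so that no returns or
indirect branches occur between the MSR accesses and vmentry or the
return thunk):

	/* before VMRUN; skipped if V_SPEC_CTRL or if !MSR_SPEC_CTRL */
	if (svm->spec_ctrl != this_cpu_read(x86_spec_ctrl_current))
		wrmsr(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);

	/* ... VMRUN, then VM-exit ... */

	/* after VM-exit; same feature checks apply */
	if (!spec_ctrl_intercepted)
		svm->spec_ctrl = rdmsr(MSR_IA32_SPEC_CTRL);
	if (svm->spec_ctrl != this_cpu_read(x86_spec_ctrl_current))
		wrmsr(MSR_IA32_SPEC_CTRL, this_cpu_read(x86_spec_ctrl_current));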

Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kernel/asm-offsets.c |  1 +
 arch/x86/kernel/cpu/bugs.c    | 13 ++---
 arch/x86/kvm/svm/svm.c        | 34 +++++--------
 arch/x86/kvm/svm/svm.h        |  4 +-
 arch/x86/kvm/svm/vmenter.S    | 93 +++++++++++++++++++++++++++++++++--
 5 files changed, 109 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 7f1dd1138117..9dad8849c1ef 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -113,6 +113,7 @@ static void __used common(void)
 	if (IS_ENABLED(CONFIG_KVM_AMD)) {
 		BLANK();
 		OFFSET(SVM_vcpu_arch_regs, vcpu_svm, vcpu.arch.regs);
+		OFFSET(SVM_spec_ctrl, vcpu_svm, spec_ctrl);
 	}
 
 	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index da7c361f47e0..6ec0b7ce7453 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -196,22 +196,15 @@ void __init check_bugs(void)
 }
 
 /*
- * NOTE: This function is *only* called for SVM.  VMX spec_ctrl handling is
- * done in vmenter.S.
+ * NOTE: This function is *only* called for SVM, since Intel uses
+ * MSR_IA32_SPEC_CTRL for SSBD.
  */
 void
 x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
 {
-	u64 msrval, guestval = guest_spec_ctrl, hostval = spec_ctrl_current();
+	u64 guestval, hostval;
 	struct thread_info *ti = current_thread_info();
 
-	if (static_cpu_has(X86_FEATURE_MSR_SPEC_CTRL)) {
-		if (hostval != guestval) {
-			msrval = setguest ? guestval : hostval;
-			wrmsrl(MSR_IA32_SPEC_CTRL, msrval);
-		}
-	}
-
 	/*
 	 * If SSBD is not handled in MSR_SPEC_CTRL on AMD, update
 	 * MSR_AMD64_L2_CFG or MSR_VIRT_SPEC_CTRL if supported.
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 64f5f0544b4f..a79cdeebc181 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3918,10 +3918,21 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 	struct vcpu_svm *svm = to_svm(vcpu);
 	unsigned long vmcb_pa = svm->current_vmcb->pa;
 
+	/*
+	 * For non-nested case:
+	 * If the L01 MSR bitmap does not intercept the MSR, then we need to
+	 * save it.
+	 *
+	 * For nested case:
+	 * If the L02 MSR bitmap does not intercept the MSR, then we need to
+	 * save it.
+	 */
+	bool spec_ctrl_intercepted = msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);
+
 	guest_state_enter_irqoff();
 
 	if (sev_es_guest(vcpu->kvm)) {
-		__svm_sev_es_vcpu_run(vmcb_pa);
+		__svm_sev_es_vcpu_run(vmcb_pa, svm, spec_ctrl_intercepted);
 	} else {
 		struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
 
@@ -3932,7 +3943,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 		 * vmcb02 when switching vmcbs for nested virtualization.
 		 */
 		vmload(svm->vmcb01.pa);
-		__svm_vcpu_run(vmcb_pa, svm);
+		__svm_vcpu_run(vmcb_pa, svm, spec_ctrl_intercepted);
 		vmsave(svm->vmcb01.pa);
 
 		vmload(__sme_page_pa(sd->save_area));
@@ -4004,25 +4015,6 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 
 	svm_vcpu_enter_exit(vcpu);
 
-	/*
-	 * We do not use IBRS in the kernel. If this vCPU has used the
-	 * SPEC_CTRL MSR it may have left it on; save the value and
-	 * turn it off. This is much more efficient than blindly adding
-	 * it to the atomic save/restore list. Especially as the former
-	 * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
-	 *
-	 * For non-nested case:
-	 * If the L01 MSR bitmap does not intercept the MSR, then we need to
-	 * save it.
-	 *
-	 * For nested case:
-	 * If the L02 MSR bitmap does not intercept the MSR, then we need to
-	 * save it.
-	 */
-	if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL) &&
-	    unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
-		svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
-
 	if (!sev_es_guest(vcpu->kvm))
 		reload_tss(vcpu);
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 5f8dfc9cd9a7..f61c05116ea5 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -483,7 +483,7 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 
 /* vmenter.S */
 
-void __svm_sev_es_vcpu_run(unsigned long vmcb_pa);
-void __svm_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm);
+void __svm_sev_es_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm, bool spec_ctrl_intercepted);
+void __svm_vcpu_run(unsigned long vmcb_pa, struct vcpu_svm *svm, bool spec_ctrl_intercepted);
 
 #endif
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 51a63a47d74f..89402e450071 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -29,10 +29,65 @@
 
 .section .noinstr.text, "ax"
 
+.macro RESTORE_GUEST_SPEC_CTRL
+	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
+	ALTERNATIVE_2 "jmp 999f", \
+		"", X86_FEATURE_MSR_SPEC_CTRL, \
+		"jmp 999f", X86_FEATURE_V_SPEC_CTRL
+
+	/*
+	 * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
+	 * host's, write the MSR.
+	 *
+	 * IMPORTANT: To avoid RSB underflow attacks and any other nastiness,
+	 * there must not be any returns or indirect branches between this code
+	 * and vmentry.
+	 */
+	movl SVM_spec_ctrl(%_ASM_DI), %eax
+	cmp PER_CPU_VAR(x86_spec_ctrl_current), %eax
+	je 999f
+	mov $MSR_IA32_SPEC_CTRL, %ecx
+	xor %edx, %edx
+	wrmsr
+999:
+
+.endm
+
+.macro RESTORE_HOST_SPEC_CTRL
+	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
+	ALTERNATIVE_2 "jmp 999f", \
+		"", X86_FEATURE_MSR_SPEC_CTRL, \
+		"jmp 999f", X86_FEATURE_V_SPEC_CTRL
+
+	mov $MSR_IA32_SPEC_CTRL, %ecx
+
+	/*
+	 * Load the value that the guest had written into MSR_IA32_SPEC_CTRL,
+	 * if it was not intercepted during guest execution.
+	 */
+	cmpb $0, (%_ASM_SP)
+	jnz 998f
+	rdmsr
+	movl %eax, SVM_spec_ctrl(%_ASM_DI)
+998:
+
+	/* Now restore the host value of the MSR if different from the guest's.  */
+	movl PER_CPU_VAR(x86_spec_ctrl_current), %eax
+	cmp SVM_spec_ctrl(%_ASM_DI), %eax
+	je 999f
+	xor %edx, %edx
+	wrmsr
+999:
+
+.endm
+
+
+
 /**
  * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
  * @vmcb_pa:	unsigned long
  * @svm:	struct vcpu_svm *
+ * @spec_ctrl_intercepted: bool
  */
 SYM_FUNC_START(__svm_vcpu_run)
 	push %_ASM_BP
@@ -47,6 +102,9 @@ SYM_FUNC_START(__svm_vcpu_run)
 #endif
 	push %_ASM_BX
 
+	/* Save @spec_ctrl_intercepted. */
+	push %_ASM_ARG3
+
 	/* Save @svm. */
 	push %_ASM_ARG2
 
@@ -56,6 +114,7 @@ SYM_FUNC_START(__svm_vcpu_run)
 	/* Move @svm to RDI. */
 	mov %_ASM_ARG2, %_ASM_DI
 
+	RESTORE_GUEST_SPEC_CTRL
 
 	/* "POP" @vmcb to RAX. */
 	pop %_ASM_AX
@@ -90,7 +149,7 @@ SYM_FUNC_START(__svm_vcpu_run)
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
 #endif
 
-	/* "POP" @svm to RAX. */
+	/* "POP" @svm to RAX while it's the only available register. */
 	pop %_ASM_AX
 
 	/* Save all guest registers.  */
@@ -111,6 +170,9 @@ SYM_FUNC_START(__svm_vcpu_run)
 	mov %r15, VCPU_R15(%_ASM_AX)
 #endif
 
+	mov %_ASM_AX, %_ASM_DI
+	RESTORE_HOST_SPEC_CTRL
+
 	/*
 	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
 	 * untrained as soon as we exit the VM and are back to the
@@ -146,6 +208,9 @@ SYM_FUNC_START(__svm_vcpu_run)
 	xor %r15d, %r15d
 #endif
 
+	/* "Pop" @spec_ctrl_intercepted.  */
+	pop %_ASM_BX
+
 	pop %_ASM_BX
 
 #ifdef CONFIG_X86_64
@@ -171,6 +236,8 @@ SYM_FUNC_END(__svm_vcpu_run)
 /**
  * __svm_sev_es_vcpu_run - Run a SEV-ES vCPU via a transition to SVM guest mode
  * @vmcb_pa:	unsigned long
+ * @svm:	struct vcpu_svm *
+ * @spec_ctrl_intercepted: bool
  */
 SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	push %_ASM_BP
@@ -185,8 +252,22 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 #endif
 	push %_ASM_BX
 
-	/* Move @vmcb to RAX. */
-	mov %_ASM_ARG1, %_ASM_AX
+	/* Save @spec_ctrl_intercepted. */
+	push %_ASM_ARG3
+
+	/* Save @svm. */
+	push %_ASM_ARG2
+
+	/* Save @vmcb. */
+	push %_ASM_ARG1
+
+	/* Move @svm to RDI for RESTORE_GUEST_SPEC_CTRL. */
+	mov %_ASM_ARG2, %_ASM_DI
+
+	RESTORE_GUEST_SPEC_CTRL
+
+	/* Pop @vmcb to RAX. */
+	pop %_ASM_AX
 
 	/* Enter guest mode */
 	sti
@@ -200,6 +281,9 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
 #endif
 
+	pop %_ASM_DI
+	RESTORE_HOST_SPEC_CTRL
+
 	/*
 	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
 	 * untrained as soon as we exit the VM and are back to the
@@ -209,6 +293,9 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	 */
 	UNTRAIN_RET
 
+	/* "Pop" @spec_ctrl_intercepted.  */
+	pop %_ASM_BX
+
 	pop %_ASM_BX
 
 #ifdef CONFIG_X86_64
-- 
2.31.1




* [PATCH 7/7] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers
  2022-10-28 23:07 [RFC PATCH 0/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
                   ` (5 preceding siblings ...)
  2022-10-28 23:07 ` [PATCH 6/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
@ 2022-10-28 23:07 ` Paolo Bonzini
  6 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2022-10-28 23:07 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: jmattson, seanjc, jpoimboe

x86_virt_spec_ctrl only deals with the paravirtualized
MSR_IA32_VIRT_SPEC_CTRL now and does not handle MSR_IA32_SPEC_CTRL
anymore; remove the corresponding, unused argument.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/spec-ctrl.h | 10 +++++-----
 arch/x86/kernel/cpu/bugs.c       |  2 +-
 arch/x86/kvm/svm/svm.c           |  4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h
index 5393babc0598..cb0386fc4dc3 100644
--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -13,7 +13,7 @@
  * Takes the guest view of SPEC_CTRL MSR as a parameter and also
  * the guest's version of VIRT_SPEC_CTRL, if emulated.
  */
-extern void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool guest);
+extern void x86_virt_spec_ctrl(u64 guest_virt_spec_ctrl, bool guest);
 
 /**
  * x86_spec_ctrl_set_guest - Set speculation control registers for the guest
@@ -24,9 +24,9 @@ extern void x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bo
  * Avoids writing to the MSR if the content/bits are the same
  */
 static inline
-void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
+void x86_spec_ctrl_set_guest(u64 guest_virt_spec_ctrl)
 {
-	x86_virt_spec_ctrl(guest_spec_ctrl, guest_virt_spec_ctrl, true);
+	x86_virt_spec_ctrl(guest_virt_spec_ctrl, true);
 }
 
 /**
@@ -38,9 +38,9 @@ void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
  * Avoids writing to the MSR if the content/bits are the same
  */
 static inline
-void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl)
+void x86_spec_ctrl_restore_host(u64 guest_virt_spec_ctrl)
 {
-	x86_virt_spec_ctrl(guest_spec_ctrl, guest_virt_spec_ctrl, false);
+	x86_virt_spec_ctrl(guest_virt_spec_ctrl, false);
 }
 
 /* AMD specific Speculative Store Bypass MSR data */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 6ec0b7ce7453..3e3230cccaa7 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -200,7 +200,7 @@ void __init check_bugs(void)
  * MSR_IA32_SPEC_CTRL for SSBD.
  */
 void
-x86_virt_spec_ctrl(u64 guest_spec_ctrl, u64 guest_virt_spec_ctrl, bool setguest)
+x86_virt_spec_ctrl(u64 guest_virt_spec_ctrl, bool setguest)
 {
 	u64 guestval, hostval;
 	struct thread_info *ti = current_thread_info();
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a79cdeebc181..7b87a4e43708 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4011,7 +4011,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 	 * being speculatively taken.
 	 */
 	if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
-		x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
+		x86_spec_ctrl_set_guest(svm->virt_spec_ctrl);
 
 	svm_vcpu_enter_exit(vcpu);
 
@@ -4019,7 +4019,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 		reload_tss(vcpu);
 
 	if (!static_cpu_has(X86_FEATURE_V_SPEC_CTRL))
-		x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl);
+		x86_spec_ctrl_restore_host(svm->virt_spec_ctrl);
 
 	if (!sev_es_guest(vcpu->kvm)) {
 		vcpu->arch.cr2 = svm->vmcb->save.cr2;
-- 
2.31.1



* Re: [PATCH 1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run
  2022-10-28 23:07 ` [PATCH 1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run Paolo Bonzini
@ 2022-10-31 17:37   ` Sean Christopherson
  2022-11-01 17:32     ` Josh Poimboeuf
  2022-11-02 17:42     ` Paolo Bonzini
  0 siblings, 2 replies; 14+ messages in thread
From: Sean Christopherson @ 2022-10-31 17:37 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, jmattson, jpoimboe

On Fri, Oct 28, 2022, Paolo Bonzini wrote:
> Registers are reachable through vcpu_vmx, no need to pass
> a separate pointer to the regs[] array.
> 
> No functional change intended.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kernel/asm-offsets.c |  1 +
>  arch/x86/kvm/vmx/nested.c     |  3 +-
>  arch/x86/kvm/vmx/vmenter.S    | 58 +++++++++++++++--------------------
>  arch/x86/kvm/vmx/vmx.c        |  3 +-
>  arch/x86/kvm/vmx/vmx.h        |  3 +-
>  5 files changed, 29 insertions(+), 39 deletions(-)
> 
> diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> index cb50589a7102..90da275ad223 100644
> --- a/arch/x86/kernel/asm-offsets.c
> +++ b/arch/x86/kernel/asm-offsets.c
> @@ -111,6 +111,7 @@ static void __used common(void)
>  
>  	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
>  		BLANK();
> +		OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);

Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
outside of KVM?  We (Google) want to explore loading multiple instances of KVM,
i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
migration between versions of KVM to upgrade/rollback KVM without changing the
kernel (RFC coming soon-ish).  IIRC, asm-offsets is the only place where I haven't
been able to figure out a simple way to avoid exposing KVM's internal structures
outside of KVM (so that the structures can change across KVM instances without
breaking kernel code).


* Re: [PATCH 1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run
  2022-10-31 17:37   ` Sean Christopherson
@ 2022-11-01 17:32     ` Josh Poimboeuf
  2022-11-01 18:03       ` Sean Christopherson
  2022-11-02 17:42     ` Paolo Bonzini
  1 sibling, 1 reply; 14+ messages in thread
From: Josh Poimboeuf @ 2022-11-01 17:32 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, linux-kernel, kvm, jmattson

On Mon, Oct 31, 2022 at 05:37:46PM +0000, Sean Christopherson wrote:
> On Fri, Oct 28, 2022, Paolo Bonzini wrote:
> > Registers are reachable through vcpu_vmx, no need to pass
> > a separate pointer to the regs[] array.
> > 
> > No functional change intended.
> > 
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  arch/x86/kernel/asm-offsets.c |  1 +
> >  arch/x86/kvm/vmx/nested.c     |  3 +-
> >  arch/x86/kvm/vmx/vmenter.S    | 58 +++++++++++++++--------------------
> >  arch/x86/kvm/vmx/vmx.c        |  3 +-
> >  arch/x86/kvm/vmx/vmx.h        |  3 +-
> >  5 files changed, 29 insertions(+), 39 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> > index cb50589a7102..90da275ad223 100644
> > --- a/arch/x86/kernel/asm-offsets.c
> > +++ b/arch/x86/kernel/asm-offsets.c
> > @@ -111,6 +111,7 @@ static void __used common(void)
> >  
> >  	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
> >  		BLANK();
> > +		OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
> 
> Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
> outside of KVM?  We (Google) want to explore loading multiple instances of KVM,
> i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
> migration between versions of KVM to upgrade/rollback KVM without changing the
> kernel (RFC coming soon-ish).  IIRC, asm-offsets is the only place where I haven't
> been able to figure out a simple way to avoid exposing KVM's internal structures
> outside of KVM (so that the structures can change across KVM instances without
> breaking kernel code).

Is that really a problem?  Would it even make sense for non-KVM kernel
code to use 'vcpu_vmx' anyway?  It already seems to be private.
asm-offsets.c has to "cheat" to get access to it by including
"../kvm/vmx/vmx.h".

So the only concern is exposing the asm offsets, right?  But it seems
highly unlikely any non-KVM code would be using those either.

And, that would be a bug anyway: module code is subject to change and
could always get recompiled.  The kernel proper shouldn't be making any
assumptions about the layouts of module-owned structs.

-- 
Josh


* Re: [PATCH 1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run
  2022-11-01 17:32     ` Josh Poimboeuf
@ 2022-11-01 18:03       ` Sean Christopherson
  0 siblings, 0 replies; 14+ messages in thread
From: Sean Christopherson @ 2022-11-01 18:03 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: Paolo Bonzini, linux-kernel, kvm, jmattson

On Tue, Nov 01, 2022, Josh Poimboeuf wrote:
> On Mon, Oct 31, 2022 at 05:37:46PM +0000, Sean Christopherson wrote:
> > > diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> > > index cb50589a7102..90da275ad223 100644
> > > --- a/arch/x86/kernel/asm-offsets.c
> > > +++ b/arch/x86/kernel/asm-offsets.c
> > > @@ -111,6 +111,7 @@ static void __used common(void)
> > >  
> > >  	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
> > >  		BLANK();
> > > +		OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
> > 
> > Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
> > outside of KVM?  We (Google) want to explore loading multiple instances of KVM,
> > i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
> > migration between versions of KVM to upgrade/rollback KVM without changing the
> > kernel (RFC coming soon-ish).  IIRC, asm-offsets is the only place where I haven't
> > been able to figure out a simple way to avoid exposing KVM's internal structures
> > outside of KVM (so that the structures can change across KVM instances without
> > breaking kernel code).
> 
> Is that really a problem?  Would it even make sense for non-KVM kernel
> code to use 'vcpu_vmx' anyway?

vcpu_vmx itself isn't a problem as non-KVM kernel code _shouldn't_ be using
vcpu_vmx, but I want to go beyond "shouldn't" and make it all-but-impossible for
non-KVM code to reference internal KVM structures/state, e.g. I want to bury all
kvm_host.h headers in kvm/ code instead of exposing them in include/asm/.


* Re: [PATCH 6/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  2022-10-28 23:07 ` [PATCH 6/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
@ 2022-11-02 15:28   ` Josh Poimboeuf
  2022-11-02 16:02     ` Paolo Bonzini
  0 siblings, 1 reply; 14+ messages in thread
From: Josh Poimboeuf @ 2022-11-02 15:28 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, jmattson, seanjc

On Fri, Oct 28, 2022 at 07:07:22PM -0400, Paolo Bonzini wrote:
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3918,10 +3918,21 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
>  	struct vcpu_svm *svm = to_svm(vcpu);
>  	unsigned long vmcb_pa = svm->current_vmcb->pa;
>  
> +	/*
> +	 * For non-nested case:
> +	 * If the L01 MSR bitmap does not intercept the MSR, then we need to
> +	 * save it.
> +	 *
> +	 * For nested case:
> +	 * If the L02 MSR bitmap does not intercept the MSR, then we need to
> +	 * save it.
> +	 */
> +	bool spec_ctrl_intercepted = msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);

This triggers a warning:

  vmlinux.o: warning: objtool: svm_vcpu_enter_exit+0x3d: call to svm_msrpm_offset() leaves .noinstr.text section

svm_vcpu_enter_exit() is noinstr, but it's calling
msr_write_intercepted() which is not.

That's why in the VMX code I did the call to msr_write_intercepted() (in
__vmx_vcpu_run_flags) before calling vmx_vcpu_enter_exit().
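
For reference, the VMX-side pattern looks roughly like this (abridged
from arch/x86/kvm/vmx/vmx.c, so treat the details as illustrative):

unsigned int __vmx_vcpu_run_flags(struct vcpu_vmx *vmx)
{
	unsigned int flags = 0;

	if (vmx->loaded_vmcs->launched)
		flags |= VMX_RUN_VMRESUME;

	/*
	 * If SPEC_CTRL writes aren't intercepted, the guest can change
	 * the MSR freely, so it has to be read back after VM-exit.
	 */
	if (!msr_write_intercepted(vmx, MSR_IA32_SPEC_CTRL))
		flags |= VMX_RUN_SAVE_SPEC_CTRL;

	return flags;
}

The flags are computed in instrumentable code and handed to the noinstr
entry path, so nothing non-noinstr has to be called once the noinstr
section begins.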

-- 
Josh


* Re: [PATCH 6/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
  2022-11-02 15:28   ` Josh Poimboeuf
@ 2022-11-02 16:02     ` Paolo Bonzini
  0 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2022-11-02 16:02 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: linux-kernel, kvm, jmattson, seanjc

On 11/2/22 16:28, Josh Poimboeuf wrote:
> On Fri, Oct 28, 2022 at 07:07:22PM -0400, Paolo Bonzini wrote:
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -3918,10 +3918,21 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
>>   	struct vcpu_svm *svm = to_svm(vcpu);
>>   	unsigned long vmcb_pa = svm->current_vmcb->pa;
>>   
>> +	/*
>> +	 * For non-nested case:
>> +	 * If the L01 MSR bitmap does not intercept the MSR, then we need to
>> +	 * save it.
>> +	 *
>> +	 * For nested case:
>> +	 * If the L02 MSR bitmap does not intercept the MSR, then we need to
>> +	 * save it.
>> +	 */
>> +	bool spec_ctrl_intercepted = msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);
> 
> This triggers a warning:
> 
>    vmlinux.o: warning: objtool: svm_vcpu_enter_exit+0x3d: call to svm_msrpm_offset() leaves .noinstr.text section
> 
> svm_vcpu_enter_exit() is noinstr, but it's calling
> msr_write_intercepted() which is not.

I suspect I didn't see the warning because msr_write_intercepted() gets 
inlined here, but it does indeed have to be fixed.

> That's why in the VMX code I did the call to msr_write_intercepted() (in
> __vmx_vcpu_run_flags) before calling vmx_vcpu_enter_exit().

Yes, it's easy to do the same here: compute the flag in svm_vcpu_run() 
and pass it down to svm_vcpu_enter_exit(). A rough, untested sketch 
(signatures illustrative):
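
static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu,
					bool spec_ctrl_intercepted)
{
	...
}

static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
{
	...
	/* Instrumentable code, so the non-noinstr helper is fine here. */
	bool spec_ctrl_intercepted =
		msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL);

	svm_vcpu_enter_exit(vcpu, spec_ctrl_intercepted);
	...
}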

Paolo



* Re: [PATCH 1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run
  2022-10-31 17:37   ` Sean Christopherson
  2022-11-01 17:32     ` Josh Poimboeuf
@ 2022-11-02 17:42     ` Paolo Bonzini
  1 sibling, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2022-11-02 17:42 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: linux-kernel, kvm, jmattson, jpoimboe

On 10/31/22 18:37, Sean Christopherson wrote:
>>   	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
>>   		BLANK();
>> +		OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
> Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
> outside of KVM?  We (Google) want to explore loading multiple instances of KVM,
> i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
> migration between versions of KVM to upgrade/rollback KVM without changing the
> kernel (RFC coming soon-ish).  IIRC, asm-offsets is the only place where I haven't
> been able to figure out a simple way to avoid exposing KVM's internal structures
> outside of KVM (so that the structures can change across KVM instances without
> breaking kernel code).
> 

It's possible to create our own asm-offsets.h file using something 
similar to the logic in Kbuild:

#####
# Generate asm-offsets.h

offsets-file := include/generated/asm-offsets.h

always-y += $(offsets-file)
targets += arch/$(SRCARCH)/kernel/asm-offsets.s

arch/$(SRCARCH)/kernel/asm-offsets.s: $(timeconst-file) $(bounds-file)

$(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s FORCE
         $(call filechk,offsets,__ASM_OFFSETS_H__)
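
With that, KVM could have its own kvm-asm-offsets.c compiled under
arch/x86/kvm/, so vcpu_vmx never has to be visible outside KVM. A
hypothetical sketch (file and symbol names made up):

/* arch/x86/kvm/kvm-asm-offsets.c */
#include <linux/kbuild.h>
#include "vmx/vmx.h"

static void __used common(void)
{
	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
		BLANK();
		OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
	}
}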

Paolo



Thread overview: 14+ messages
2022-10-28 23:07 [RFC PATCH 0/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
2022-10-28 23:07 ` [PATCH 1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run Paolo Bonzini
2022-10-31 17:37   ` Sean Christopherson
2022-11-01 17:32     ` Josh Poimboeuf
2022-11-01 18:03       ` Sean Christopherson
2022-11-02 17:42     ` Paolo Bonzini
2022-10-28 23:07 ` [PATCH 2/7] KVM: VMX: more cleanups to __vmx_vcpu_run Paolo Bonzini
2022-10-28 23:07 ` [PATCH 3/7] KVM: SVM: extract VMCB accessors to a new file Paolo Bonzini
2022-10-28 23:07 ` [PATCH 4/7] KVM: SVM: replace argument of __svm_vcpu_run with vcpu_svm Paolo Bonzini
2022-10-28 23:07 ` [PATCH 5/7] KVM: SVM: adjust register allocation for __svm_vcpu_run Paolo Bonzini
2022-10-28 23:07 ` [PATCH 6/7] KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly Paolo Bonzini
2022-11-02 15:28   ` Josh Poimboeuf
2022-11-02 16:02     ` Paolo Bonzini
2022-10-28 23:07 ` [PATCH 7/7] x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers Paolo Bonzini
