* [patch 0/7] x86/KVM: Async #PF and instrumentation protection
@ 2020-05-19 20:31 Thomas Gleixner
2020-05-19 20:31 ` [patch 1/7] x86/kvm: Move context tracking where it belongs Thomas Gleixner
` (7 more replies)
0 siblings, 8 replies; 13+ messages in thread
From: Thomas Gleixner @ 2020-05-19 20:31 UTC (permalink / raw)
To: LKML
Cc: x86, Paolo Bonzini, kvm, Alexandre Chartre, Peter Zijlstra,
Juergen Gross, Tom Lendacky
Folks,
this series is the KVM side of the ongoing quest to confine instrumentation
to safe places and ensure that RCU and context tracking state is correct.
The async #PF changes are already in the tip tree as they conflict with the
entry code rework. The minimal set of commits required to carry these has
been isolated and tagged:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git noinstr-x86-kvm-2020-05-16
Paolo, please pull this into your next branch to avoid conflicts in
next. The prerequisites for the following KVM specific changes come with
that tag so that you have no merge dependencies.
The tag has also been merged into
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/entry
where the x86 core #PF entry code changes will be queued soon as well.
The KVM specific patches which deal with the RCU and context tracking state
and the protection against instrumentation in sensitive places have been
split out from the larger entry/noinstr series:
https://lore.kernel.org/r/20200505134112.272268764@linutronix.de
The patches deal with:
- Placing the guest_enter/exit() calls at the correct place
- Moving the sensitive VMENTER/EXIT code into the non-instrumentable code
  section
- Fixing up the tracing code to comply with the non-instrumentation rules
- Using native functions to access CR2 and the GS base MSR in the critical
  code paths to prevent them from being instrumented
The patches apply on top of
git://git.kernel.org/pub/scm/linux/kernel/git/kvm/kvm.git next
with the noinstr-x86-kvm-2020-05-16 tag from the tip tree merged in.
For reference the whole lot is available from:
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git kvm/noinstr
Thanks,
tglx
---
arch/x86/include/asm/hardirq.h  |  4 +-
arch/x86/include/asm/kvm_host.h |  8 +++++
arch/x86/kvm/svm/svm.c          | 65 ++++++++++++++++++++++++++++++++++------
arch/x86/kvm/svm/vmenter.S      |  2 -
arch/x86/kvm/vmx/ops.h          |  4 ++
arch/x86/kvm/vmx/vmenter.S      |  5 ++-
arch/x86/kvm/vmx/vmx.c          | 78 ++++++++++++++++++++++++++++++++++-----
arch/x86/kvm/x86.c              |  4 --
8 files changed, 137 insertions(+), 33 deletions(-)
^ permalink raw reply [flat|nested] 13+ messages in thread
* [patch 1/7] x86/kvm: Move context tracking where it belongs
From: Thomas Gleixner @ 2020-05-19 20:31 UTC (permalink / raw)
To: LKML
Cc: x86, Paolo Bonzini, kvm, Alexandre Chartre, Peter Zijlstra,
Juergen Gross, Tom Lendacky
Context tracking for KVM happens way too early in the vcpu_run()
code. Anything after guest_enter_irqoff() and before guest_exit_irqoff()
cannot use RCU and should also not be instrumented.
The current way of doing this covers way too much code. Move it closer to
the actual vmenter/exit code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200505134341.379326289@linutronix.de
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 4e9cd2a73ad0..40242e0af20d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3389,6 +3389,14 @@ static fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
*/
x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
+ /*
+ * Tell context tracking that this CPU is about to enter guest
+ * mode. This has to be after x86_spec_ctrl_set_guest() because
+ * that can take locks (lockdep needs RCU) and calls into world and
+ * some more.
+ */
+ guest_enter_irqoff();
+
__svm_vcpu_run(svm->vmcb_pa, (unsigned long *)&svm->vcpu.arch.regs);
#ifdef CONFIG_X86_64
@@ -3399,6 +3407,14 @@ static fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
loadsegment(gs, svm->host.gs);
#endif
#endif
+ /*
+ * Tell context tracking that this CPU is back.
+ *
+ * This needs to be done before the below as native_read_msr()
+ * contains a tracepoint and x86_spec_ctrl_restore_host() calls
+ * into world and some more.
+ */
+ guest_exit_irqoff();
/*
* We do not use IBRS in the kernel. If this vCPU has used the
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6a03c27ff314..ad0159f68ce4 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6724,6 +6724,11 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
*/
x86_spec_ctrl_set_guest(vmx->spec_ctrl, 0);
+ /*
+ * Tell context tracking that this CPU is about to enter guest mode.
+ */
+ guest_enter_irqoff();
+
/* L1D Flush includes CPU buffer clear to mitigate MDS */
if (static_branch_unlikely(&vmx_l1d_should_flush))
vmx_l1d_flush(vcpu);
@@ -6739,6 +6744,11 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
vcpu->arch.cr2 = read_cr2();
/*
+ * Tell context tracking that this CPU is back.
+ */
+ guest_exit_irqoff();
+
+ /*
* We do not use IBRS in the kernel. If this vCPU has used the
* SPEC_CTRL MSR it may have left it on; save the value and
* turn it off. This is much more efficient than blindly adding
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 471fccf7f850..28663a6688d1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8438,7 +8438,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
}
trace_kvm_entry(vcpu->vcpu_id);
- guest_enter_irqoff();
fpregs_assert_state_consistent();
if (test_thread_flag(TIF_NEED_FPU_LOAD))
@@ -8500,7 +8499,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
local_irq_disable();
kvm_after_interrupt(vcpu);
- guest_exit_irqoff();
if (lapic_in_kernel(vcpu)) {
s64 delta = vcpu->arch.apic->lapic_timer.advance_expire_delta;
if (delta != S64_MIN) {
* [patch 2/7] x86/kvm/vmx: Add hardirq tracing to guest enter/exit
From: Thomas Gleixner @ 2020-05-19 20:31 UTC (permalink / raw)
To: LKML
Cc: x86, Paolo Bonzini, kvm, Alexandre Chartre, Peter Zijlstra,
Juergen Gross, Tom Lendacky
Entering guest mode is more or less the same as returning to user
space. From an instrumentation point of view both leave kernel mode and the
transition to guest or user mode reenables interrupts on the host. In user
mode an interrupt is served directly and in guest mode it causes a VM exit
which then handles or reinjects the interrupt.
The transition from guest mode or user mode to kernel mode disables
interrupts, which needs to be recorded in instrumentation to set the
correct state again.
This is important for e.g. latency analysis because otherwise the execution
time in guest or user mode would be wrongly accounted as interrupt disabled
and could trigger false positives.
Add hardirq tracing to guest enter/exit functions in the same way as it
is done in the user mode enter/exit code, respecting the RCU requirements.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200505134341.471542318@linutronix.de
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ad0159f68ce4..afdb6f489ab2 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6725,9 +6725,21 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
x86_spec_ctrl_set_guest(vmx->spec_ctrl, 0);
/*
- * Tell context tracking that this CPU is about to enter guest mode.
+ * VMENTER enables interrupts (host state), but the kernel state is
+ * interrupts disabled when this is invoked. Also tell RCU about
+ * it. This is the same logic as for exit_to_user_mode().
+ *
+ * This ensures that e.g. latency analysis on the host observes
+ * guest mode as interrupt enabled.
+ *
+ * guest_enter_irqoff() informs context tracking about the
+ * transition to guest mode and if enabled adjusts RCU state
+ * accordingly.
*/
+ trace_hardirqs_on_prepare();
+ lockdep_hardirqs_on_prepare(CALLER_ADDR0);
guest_enter_irqoff();
+ lockdep_hardirqs_on(CALLER_ADDR0);
/* L1D Flush includes CPU buffer clear to mitigate MDS */
if (static_branch_unlikely(&vmx_l1d_should_flush))
@@ -6744,9 +6756,20 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
vcpu->arch.cr2 = read_cr2();
/*
- * Tell context tracking that this CPU is back.
+ * VMEXIT disables interrupts (host state), but tracing and lockdep
+ * have them in state 'on' as recorded before entering guest mode.
+ * Same as enter_from_user_mode().
+ *
+ * guest_exit_irqoff() restores host context and reinstates RCU if
+ * enabled and required.
+ *
+ * This needs to be done before the below as native_read_msr()
+ * contains a tracepoint and x86_spec_ctrl_restore_host() calls
+ * into world and some more.
*/
+ lockdep_hardirqs_off(CALLER_ADDR0);
guest_exit_irqoff();
+ trace_hardirqs_off_prepare();
/*
* We do not use IBRS in the kernel. If this vCPU has used the
* [patch 3/7] x86/kvm/svm: Add hardirq tracing on guest enter/exit
From: Thomas Gleixner @ 2020-05-19 20:31 UTC (permalink / raw)
To: LKML
Cc: x86, Paolo Bonzini, kvm, Alexandre Chartre, Peter Zijlstra,
Juergen Gross, Tom Lendacky
Entering guest mode is more or less the same as returning to user
space. From an instrumentation point of view both leave kernel mode and the
transition to guest or user mode reenables interrupts on the host. In user
mode an interrupt is served directly and in guest mode it causes a VM exit
which then handles or reinjects the interrupt.
The transition from guest mode or user mode to kernel mode disables
interrupts, which needs to be recorded in instrumentation to set the
correct state again.
This is important for e.g. latency analysis because otherwise the execution
time in guest or user mode would be wrongly accounted as interrupt disabled
and could trigger false positives.
Add hardirq tracing to guest enter/exit functions in the same way as it
is done in the user mode enter/exit code, respecting the RCU requirements.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200505134341.579034898@linutronix.de
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 40242e0af20d..46d69567faab 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3390,12 +3390,21 @@ static fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
/*
- * Tell context tracking that this CPU is about to enter guest
- * mode. This has to be after x86_spec_ctrl_set_guest() because
- * that can take locks (lockdep needs RCU) and calls into world and
- * some more.
+ * VMENTER enables interrupts (host state), but the kernel state is
+ * interrupts disabled when this is invoked. Also tell RCU about
+ * it. This is the same logic as for exit_to_user_mode().
+ *
+ * This ensures that e.g. latency analysis on the host observes
+ * guest mode as interrupt enabled.
+ *
+ * guest_enter_irqoff() informs context tracking about the
+ * transition to guest mode and if enabled adjusts RCU state
+ * accordingly.
*/
+ trace_hardirqs_on_prepare();
+ lockdep_hardirqs_on_prepare(CALLER_ADDR0);
guest_enter_irqoff();
+ lockdep_hardirqs_on(CALLER_ADDR0);
__svm_vcpu_run(svm->vmcb_pa, (unsigned long *)&svm->vcpu.arch.regs);
@@ -3407,14 +3416,22 @@ static fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
loadsegment(gs, svm->host.gs);
#endif
#endif
+
/*
- * Tell context tracking that this CPU is back.
+ * VMEXIT disables interrupts (host state), but tracing and lockdep
+ * have them in state 'on' as recorded before entering guest mode.
+ * Same as enter_from_user_mode().
+ *
+ * guest_exit_irqoff() restores host context and reinstates RCU if
+ * enabled and required.
*
* This needs to be done before the below as native_read_msr()
* contains a tracepoint and x86_spec_ctrl_restore_host() calls
* into world and some more.
*/
+ lockdep_hardirqs_off(CALLER_ADDR0);
guest_exit_irqoff();
+ trace_hardirqs_off_prepare();
/*
* We do not use IBRS in the kernel. If this vCPU has used the
* [patch 4/7] x86/kvm/vmx: Move guest enter/exit into .noinstr.text
From: Thomas Gleixner @ 2020-05-19 20:31 UTC (permalink / raw)
To: LKML
Cc: x86, Paolo Bonzini, kvm, Alexandre Chartre, Peter Zijlstra,
Juergen Gross, Tom Lendacky
Move the functions which are inside the RCU off region into the
non-instrumentable text section.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200505134341.781667216@linutronix.de
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 07533795b8d2..275e7fd20310 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -67,12 +67,12 @@ static inline void kvm_set_cpu_l1tf_flush_l1d(void)
__this_cpu_write(irq_stat.kvm_cpu_l1tf_flush_l1d, 1);
}
-static inline void kvm_clear_cpu_l1tf_flush_l1d(void)
+static __always_inline void kvm_clear_cpu_l1tf_flush_l1d(void)
{
__this_cpu_write(irq_stat.kvm_cpu_l1tf_flush_l1d, 0);
}
-static inline bool kvm_get_cpu_l1tf_flush_l1d(void)
+static __always_inline bool kvm_get_cpu_l1tf_flush_l1d(void)
{
return __this_cpu_read(irq_stat.kvm_cpu_l1tf_flush_l1d);
}
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fd78bd44b2d6..b7492c804c30 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1620,7 +1620,15 @@ asmlinkage void kvm_spurious_fault(void);
insn "\n\t" \
"jmp 668f \n\t" \
"667: \n\t" \
+ "1: \n\t" \
+ ".pushsection .discard.instr_begin \n\t" \
+ ".long 1b - . \n\t" \
+ ".popsection \n\t" \
"call kvm_spurious_fault \n\t" \
+ "1: \n\t" \
+ ".pushsection .discard.instr_end \n\t" \
+ ".long 1b - . \n\t" \
+ ".popsection \n\t" \
"668: \n\t" \
_ASM_EXTABLE(666b, 667b)
diff --git a/arch/x86/kvm/vmx/ops.h b/arch/x86/kvm/vmx/ops.h
index 5f1ac002b4b6..692b0c31c9c8 100644
--- a/arch/x86/kvm/vmx/ops.h
+++ b/arch/x86/kvm/vmx/ops.h
@@ -146,7 +146,9 @@ do { \
: : op1 : "cc" : error, fault); \
return; \
error: \
+ instrumentation_begin(); \
insn##_error(error_args); \
+ instrumentation_end(); \
return; \
fault: \
kvm_spurious_fault(); \
@@ -161,7 +163,9 @@ do { \
: : op1, op2 : "cc" : error, fault); \
return; \
error: \
+ instrumentation_begin(); \
insn##_error(error_args); \
+ instrumentation_end(); \
return; \
fault: \
kvm_spurious_fault(); \
diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index e0a182cb3cdd..799db084a336 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -27,7 +27,7 @@
#define VCPU_R15 __VCPU_REGS_R15 * WORD_SIZE
#endif
- .text
+.section .noinstr.text, "ax"
/**
* vmx_vmenter - VM-Enter the current loaded VMCS
@@ -234,6 +234,9 @@ SYM_FUNC_START(__vmx_vcpu_run)
jmp 1b
SYM_FUNC_END(__vmx_vcpu_run)
+
+.section .text, "ax"
+
/**
* vmread_error_trampoline - Trampoline from inline asm to vmread_error()
* @field: VMCS field encoding that failed
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index afdb6f489ab2..0a751a6ddf6e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6090,7 +6090,7 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
* information but as all relevant affected CPUs have 32KiB L1D cache size
* there is no point in doing so.
*/
-static void vmx_l1d_flush(struct kvm_vcpu *vcpu)
+static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu)
{
int size = PAGE_SIZE << L1D_CACHE_ORDER;
@@ -6123,7 +6123,7 @@ static void vmx_l1d_flush(struct kvm_vcpu *vcpu)
vcpu->stat.l1d_flush++;
if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) {
- wrmsrl(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
+ native_wrmsrl(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
return;
}
@@ -6626,7 +6626,7 @@ static void vmx_update_hv_timer(struct kvm_vcpu *vcpu)
}
}
-void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp)
+void noinstr vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp)
{
if (unlikely(host_rsp != vmx->loaded_vmcs->host_state.rsp)) {
vmx->loaded_vmcs->host_state.rsp = host_rsp;
@@ -6648,6 +6648,63 @@ static fastpath_t vmx_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned long *regs, bool launched);
+static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
+ struct vcpu_vmx *vmx)
+{
+ /*
+ * VMENTER enables interrupts (host state), but the kernel state is
+ * interrupts disabled when this is invoked. Also tell RCU about
+ * it. This is the same logic as for exit_to_user_mode().
+ *
+ * This ensures that e.g. latency analysis on the host observes
+ * guest mode as interrupt enabled.
+ *
+ * guest_enter_irqoff() informs context tracking about the
+ * transition to guest mode and if enabled adjusts RCU state
+ * accordingly.
+ */
+ instrumentation_begin();
+ trace_hardirqs_on_prepare();
+ lockdep_hardirqs_on_prepare(CALLER_ADDR0);
+ instrumentation_end();
+
+ guest_enter_irqoff();
+ lockdep_hardirqs_on(CALLER_ADDR0);
+
+ /* L1D Flush includes CPU buffer clear to mitigate MDS */
+ if (static_branch_unlikely(&vmx_l1d_should_flush))
+ vmx_l1d_flush(vcpu);
+ else if (static_branch_unlikely(&mds_user_clear))
+ mds_clear_cpu_buffers();
+
+ if (vcpu->arch.cr2 != read_cr2())
+ write_cr2(vcpu->arch.cr2);
+
+ vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
+ vmx->loaded_vmcs->launched);
+
+ vcpu->arch.cr2 = read_cr2();
+
+ /*
+ * VMEXIT disables interrupts (host state), but tracing and lockdep
+ * have them in state 'on' as recorded before entering guest mode.
+ * Same as enter_from_user_mode().
+ *
+ * guest_exit_irqoff() restores host context and reinstates RCU if
+ * enabled and required.
+ *
+ * This needs to be done before the below as native_read_msr()
+ * contains a tracepoint and x86_spec_ctrl_restore_host() calls
+ * into world and some more.
+ */
+ lockdep_hardirqs_off(CALLER_ADDR0);
+ guest_exit_irqoff();
+
+ instrumentation_begin();
+ trace_hardirqs_off_prepare();
+ instrumentation_end();
+}
+
static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
{
fastpath_t exit_fastpath;
@@ -6724,52 +6781,8 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
*/
x86_spec_ctrl_set_guest(vmx->spec_ctrl, 0);
- /*
- * VMENTER enables interrupts (host state), but the kernel state is
- * interrupts disabled when this is invoked. Also tell RCU about
- * it. This is the same logic as for exit_to_user_mode().
- *
- * This ensures that e.g. latency analysis on the host observes
- * guest mode as interrupt enabled.
- *
- * guest_enter_irqoff() informs context tracking about the
- * transition to guest mode and if enabled adjusts RCU state
- * accordingly.
- */
- trace_hardirqs_on_prepare();
- lockdep_hardirqs_on_prepare(CALLER_ADDR0);
- guest_enter_irqoff();
- lockdep_hardirqs_on(CALLER_ADDR0);
-
- /* L1D Flush includes CPU buffer clear to mitigate MDS */
- if (static_branch_unlikely(&vmx_l1d_should_flush))
- vmx_l1d_flush(vcpu);
- else if (static_branch_unlikely(&mds_user_clear))
- mds_clear_cpu_buffers();
-
- if (vcpu->arch.cr2 != read_cr2())
- write_cr2(vcpu->arch.cr2);
-
- vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
- vmx->loaded_vmcs->launched);
-
- vcpu->arch.cr2 = read_cr2();
-
- /*
- * VMEXIT disables interrupts (host state), but tracing and lockdep
- * have them in state 'on' as recorded before entering guest mode.
- * Same as enter_from_user_mode().
- *
- * guest_exit_irqoff() restores host context and reinstates RCU if
- * enabled and required.
- *
- * This needs to be done before the below as native_read_msr()
- * contains a tracepoint and x86_spec_ctrl_restore_host() calls
- * into world and some more.
- */
- lockdep_hardirqs_off(CALLER_ADDR0);
- guest_exit_irqoff();
- trace_hardirqs_off_prepare();
+ /* The actual VMENTER/EXIT is in the .noinstr.text section. */
+ vmx_vcpu_enter_exit(vcpu, vmx);
/*
* We do not use IBRS in the kernel. If this vCPU has used the
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 28663a6688d1..c1257100ecbb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -379,7 +379,7 @@ int kvm_set_apic_base(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
}
EXPORT_SYMBOL_GPL(kvm_set_apic_base);
-asmlinkage __visible void kvm_spurious_fault(void)
+asmlinkage __visible noinstr void kvm_spurious_fault(void)
{
/* Fault while not rebooting. We want the trace. */
BUG_ON(!kvm_rebooting);
* [patch 5/7] x86/kvm/svm: Move guest enter/exit into .noinstr.text
From: Thomas Gleixner @ 2020-05-19 20:31 UTC (permalink / raw)
To: LKML
Cc: x86, Paolo Bonzini, kvm, Alexandre Chartre, Peter Zijlstra,
Juergen Gross, Tom Lendacky
Move the functions which are inside the RCU off region into the
non-instrumentable text section.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20200505134341.873785437@linutronix.de
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 46d69567faab..35bd95dd64dd 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3327,6 +3327,60 @@ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
void __svm_vcpu_run(unsigned long vmcb_pa, unsigned long *regs);
+static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu,
+ struct vcpu_svm *svm)
+{
+ /*
+ * VMENTER enables interrupts (host state), but the kernel state is
+ * interrupts disabled when this is invoked. Also tell RCU about
+ * it. This is the same logic as for exit_to_user_mode().
+ *
+ * This ensures that e.g. latency analysis on the host observes
+ * guest mode as interrupt enabled.
+ *
+ * guest_enter_irqoff() informs context tracking about the
+ * transition to guest mode and if enabled adjusts RCU state
+ * accordingly.
+ */
+ instrumentation_begin();
+ trace_hardirqs_on_prepare();
+ lockdep_hardirqs_on_prepare(CALLER_ADDR0);
+ instrumentation_end();
+
+ guest_enter_irqoff();
+ lockdep_hardirqs_on(CALLER_ADDR0);
+
+ __svm_vcpu_run(svm->vmcb_pa, (unsigned long *)&svm->vcpu.arch.regs);
+
+#ifdef CONFIG_X86_64
+ wrmsrl(MSR_GS_BASE, svm->host.gs_base);
+#else
+ loadsegment(fs, svm->host.fs);
+#ifndef CONFIG_X86_32_LAZY_GS
+ loadsegment(gs, svm->host.gs);
+#endif
+#endif
+
+ /*
+ * VMEXIT disables interrupts (host state), but tracing and lockdep
+ * have them in state 'on' as recorded before entering guest mode.
+ * Same as enter_from_user_mode().
+ *
+ * guest_exit_irqoff() restores host context and reinstates RCU if
+ * enabled and required.
+ *
+ * This needs to be done before the below as native_read_msr()
+ * contains a tracepoint and x86_spec_ctrl_restore_host() calls
+ * into world and some more.
+ */
+ lockdep_hardirqs_off(CALLER_ADDR0);
+ guest_exit_irqoff();
+
+ instrumentation_begin();
+ trace_hardirqs_off_prepare();
+ instrumentation_end();
+}
+
static fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
{
fastpath_t exit_fastpath;
@@ -3389,49 +3443,7 @@ static fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
*/
x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
- /*
- * VMENTER enables interrupts (host state), but the kernel state is
- * interrupts disabled when this is invoked. Also tell RCU about
- * it. This is the same logic as for exit_to_user_mode().
- *
- * This ensures that e.g. latency analysis on the host observes
- * guest mode as interrupt enabled.
- *
- * guest_enter_irqoff() informs context tracking about the
- * transition to guest mode and if enabled adjusts RCU state
- * accordingly.
- */
- trace_hardirqs_on_prepare();
- lockdep_hardirqs_on_prepare(CALLER_ADDR0);
- guest_enter_irqoff();
- lockdep_hardirqs_on(CALLER_ADDR0);
-
- __svm_vcpu_run(svm->vmcb_pa, (unsigned long *)&svm->vcpu.arch.regs);
-
-#ifdef CONFIG_X86_64
- wrmsrl(MSR_GS_BASE, svm->host.gs_base);
-#else
- loadsegment(fs, svm->host.fs);
-#ifndef CONFIG_X86_32_LAZY_GS
- loadsegment(gs, svm->host.gs);
-#endif
-#endif
-
- /*
- * VMEXIT disables interrupts (host state), but tracing and lockdep
- * have them in state 'on' as recorded before entering guest mode.
- * Same as enter_from_user_mode().
- *
- * guest_exit_irqoff() restores host context and reinstates RCU if
- * enabled and required.
- *
- * This needs to be done before the below as native_read_msr()
- * contains a tracepoint and x86_spec_ctrl_restore_host() calls
- * into world and some more.
- */
- lockdep_hardirqs_off(CALLER_ADDR0);
- guest_exit_irqoff();
- trace_hardirqs_off_prepare();
+ svm_vcpu_enter_exit(vcpu, svm);
/*
* We do not use IBRS in the kernel. If this vCPU has used the
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index bf944334003a..1ec1ac40e328 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -27,7 +27,7 @@
#define VCPU_R15 __VCPU_REGS_R15 * WORD_SIZE
#endif
- .text
+.section .noinstr.text, "ax"
/**
* __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
* [patch 6/7] x86/kvm/svm: Use uninstrumented wrmsrl() to restore GS
From: Thomas Gleixner @ 2020-05-19 20:31 UTC (permalink / raw)
To: LKML
Cc: x86, Paolo Bonzini, kvm, Juergen Gross, Tom Lendacky,
Alexandre Chartre, Peter Zijlstra
On guest exit MSR_GS_BASE contains whatever the guest wrote to it and the
first action after returning from the ASM code is to set it to the host
kernel value. This uses wrmsrl(), which is interesting, to say the least.
wrmsrl() is either using native_write_msr() or the paravirt variant. The
XEN_PV code is uninteresting as nested SVM in a XEN_PV guest does not work.
But native_write_msr() can be placed out of line by the compiler especially
when paravirtualization is enabled in the kernel configuration. The
function is marked notrace, but still can be probed if
CONFIG_KPROBE_EVENTS_ON_NOTRACE is enabled.
That would be a fatal problem as kprobe events use per-CPU variables which
are GS based and would be accessed with the guest GS value. Depending on the
GS value this would either explode in colorful ways or lead to completely
undebuggable data corruption.
Aside from that, native_write_msr() contains a tracepoint which objtool
complains about as it is invoked from the noinstr section.
As this cannot run inside a XEN_PV guest there is no point in using
wrmsrl(). Use native_wrmsrl() instead which is just a plain native WRMSR
without tracing or anything else attached.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 35bd95dd64dd..663333d34c84 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3353,7 +3353,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu,
__svm_vcpu_run(svm->vmcb_pa, (unsigned long *)&svm->vcpu.arch.regs);
#ifdef CONFIG_X86_64
- wrmsrl(MSR_GS_BASE, svm->host.gs_base);
+ native_wrmsrl(MSR_GS_BASE, svm->host.gs_base);
#else
loadsegment(fs, svm->host.fs);
#ifndef CONFIG_X86_32_LAZY_GS
* [patch 7/7] x86/kvm/vmx: Use native read/write_cr2()
From: Thomas Gleixner @ 2020-05-19 20:31 UTC (permalink / raw)
To: LKML
Cc: x86, Paolo Bonzini, kvm, Juergen Gross, Alexandre Chartre,
Peter Zijlstra, Tom Lendacky
read/write_cr2() go through the paravirt XXL indirection, but nested VMX in
a XEN_PV guest is not supported.
Use the native variants.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0a751a6ddf6e..9a5b193fe0b3 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6677,13 +6677,13 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
else if (static_branch_unlikely(&mds_user_clear))
mds_clear_cpu_buffers();
- if (vcpu->arch.cr2 != read_cr2())
- write_cr2(vcpu->arch.cr2);
+ if (vcpu->arch.cr2 != native_read_cr2())
+ native_write_cr2(vcpu->arch.cr2);
vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
vmx->loaded_vmcs->launched);
- vcpu->arch.cr2 = read_cr2();
+ vcpu->arch.cr2 = native_read_cr2();
/*
* VMEXIT disables interrupts (host state), but tracing and lockdep
* Re: [patch 2/7] x86/kvm/vmx: Add hardirq tracing to guest enter/exit
2020-05-19 20:31 ` [patch 2/7] x86/kvm/vmx: Add hardirq tracing to guest enter/exit Thomas Gleixner
@ 2020-05-20 5:48 ` kbuild test robot
0 siblings, 0 replies; 13+ messages in thread
From: kbuild test robot @ 2020-05-20 5:48 UTC (permalink / raw)
To: kbuild-all
[-- Attachment #1: Type: text/plain, Size: 9166 bytes --]
Hi Thomas,
I love your patch! Yet something to improve:
[auto build test ERROR on kvm/linux-next]
[also build test ERROR on tip/auto-latest linus/master v5.7-rc6 next-20200519]
[cannot apply to linux/master]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Thomas-Gleixner/x86-KVM-Async-PF-and-instrumentation-protection/20200520-051526
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
config: x86_64-allyesconfig (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project e6658079aca6d971b4e9d7137a3a2ecbc9c34aec)
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>
All errors (new ones prefixed by >>, old ones prefixed by <<):
<< In file included from arch/x86/kvm/vmx/vmx.c:50:
>> arch/x86/kvm/vmx/vmx.c:6739:2: error: implicit declaration of function 'trace_hardirqs_on_prepare' [-Werror,-Wimplicit-function-declaration]
trace_hardirqs_on_prepare();
^
arch/x86/kvm/vmx/vmx.c:6739:2: note: did you mean 'trace_hardirqs_on'?
include/linux/irqflags.h:32:15: note: 'trace_hardirqs_on' declared here
extern void trace_hardirqs_on(void);
^
<< In file included from arch/x86/kvm/vmx/vmx.c:50:
>> arch/x86/kvm/vmx/vmx.c:6740:2: error: implicit declaration of function 'lockdep_hardirqs_on_prepare' [-Werror,-Wimplicit-function-declaration]
lockdep_hardirqs_on_prepare(CALLER_ADDR0);
^
arch/x86/kvm/vmx/vmx.c:6740:2: note: did you mean 'trace_hardirqs_on_prepare'?
arch/x86/kvm/vmx/vmx.c:6739:2: note: 'trace_hardirqs_on_prepare' declared here
trace_hardirqs_on_prepare();
^
<< In file included from arch/x86/kvm/vmx/vmx.c:50:
>> arch/x86/kvm/vmx/vmx.c:6772:2: error: implicit declaration of function 'trace_hardirqs_off_prepare' [-Werror,-Wimplicit-function-declaration]
trace_hardirqs_off_prepare();
^
arch/x86/kvm/vmx/vmx.c:6772:2: note: did you mean 'trace_hardirqs_on_prepare'?
arch/x86/kvm/vmx/vmx.c:6739:2: note: 'trace_hardirqs_on_prepare' declared here
trace_hardirqs_on_prepare();
^
3 errors generated.
vim +/trace_hardirqs_on_prepare +6739 arch/x86/kvm/vmx/vmx.c
6650
6651 static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
6652 {
6653 fastpath_t exit_fastpath;
6654 struct vcpu_vmx *vmx = to_vmx(vcpu);
6655 unsigned long cr3, cr4;
6656
6657 reenter_guest:
6658 /* Record the guest's net vcpu time for enforced NMI injections. */
6659 if (unlikely(!enable_vnmi &&
6660 vmx->loaded_vmcs->soft_vnmi_blocked))
6661 vmx->loaded_vmcs->entry_time = ktime_get();
6662
6663 /* Don't enter VMX if guest state is invalid, let the exit handler
6664 start emulation until we arrive back to a valid state */
6665 if (vmx->emulation_required)
6666 return EXIT_FASTPATH_NONE;
6667
6668 if (vmx->ple_window_dirty) {
6669 vmx->ple_window_dirty = false;
6670 vmcs_write32(PLE_WINDOW, vmx->ple_window);
6671 }
6672
6673 /*
6674 * We did this in prepare_switch_to_guest, because it needs to
6675 * be within srcu_read_lock.
6676 */
6677 WARN_ON_ONCE(vmx->nested.need_vmcs12_to_shadow_sync);
6678
6679 if (kvm_register_is_dirty(vcpu, VCPU_REGS_RSP))
6680 vmcs_writel(GUEST_RSP, vcpu->arch.regs[VCPU_REGS_RSP]);
6681 if (kvm_register_is_dirty(vcpu, VCPU_REGS_RIP))
6682 vmcs_writel(GUEST_RIP, vcpu->arch.regs[VCPU_REGS_RIP]);
6683
6684 cr3 = __get_current_cr3_fast();
6685 if (unlikely(cr3 != vmx->loaded_vmcs->host_state.cr3)) {
6686 vmcs_writel(HOST_CR3, cr3);
6687 vmx->loaded_vmcs->host_state.cr3 = cr3;
6688 }
6689
6690 cr4 = cr4_read_shadow();
6691 if (unlikely(cr4 != vmx->loaded_vmcs->host_state.cr4)) {
6692 vmcs_writel(HOST_CR4, cr4);
6693 vmx->loaded_vmcs->host_state.cr4 = cr4;
6694 }
6695
6696 /* When single-stepping over STI and MOV SS, we must clear the
6697 * corresponding interruptibility bits in the guest state. Otherwise
6698 * vmentry fails as it then expects bit 14 (BS) in pending debug
6699 * exceptions being set, but that's not correct for the guest debugging
6700 * case. */
6701 if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
6702 vmx_set_interrupt_shadow(vcpu, 0);
6703
6704 kvm_load_guest_xsave_state(vcpu);
6705
6706 pt_guest_enter(vmx);
6707
6708 if (vcpu_to_pmu(vcpu)->version)
6709 atomic_switch_perf_msrs(vmx);
6710 atomic_switch_umwait_control_msr(vmx);
6711
6712 if (enable_preemption_timer)
6713 vmx_update_hv_timer(vcpu);
6714
6715 if (lapic_in_kernel(vcpu) &&
6716 vcpu->arch.apic->lapic_timer.timer_advance_ns)
6717 kvm_wait_lapic_expire(vcpu);
6718
6719 /*
6720 * If this vCPU has touched SPEC_CTRL, restore the guest's value if
6721 * it's non-zero. Since vmentry is serialising on affected CPUs, there
6722 * is no need to worry about the conditional branch over the wrmsr
6723 * being speculatively taken.
6724 */
6725 x86_spec_ctrl_set_guest(vmx->spec_ctrl, 0);
6726
6727 /*
6728 * VMENTER enables interrupts (host state), but the kernel state is
6729 * interrupts disabled when this is invoked. Also tell RCU about
6730 * it. This is the same logic as for exit_to_user_mode().
6731 *
6732 * This ensures that e.g. latency analysis on the host observes
6733 * guest mode as interrupt enabled.
6734 *
6735 * guest_enter_irqoff() informs context tracking about the
6736 * transition to guest mode and if enabled adjusts RCU state
6737 * accordingly.
6738 */
> 6739 trace_hardirqs_on_prepare();
> 6740 lockdep_hardirqs_on_prepare(CALLER_ADDR0);
6741 guest_enter_irqoff();
6742 lockdep_hardirqs_on(CALLER_ADDR0);
6743
6744 /* L1D Flush includes CPU buffer clear to mitigate MDS */
6745 if (static_branch_unlikely(&vmx_l1d_should_flush))
6746 vmx_l1d_flush(vcpu);
6747 else if (static_branch_unlikely(&mds_user_clear))
6748 mds_clear_cpu_buffers();
6749
6750 if (vcpu->arch.cr2 != read_cr2())
6751 write_cr2(vcpu->arch.cr2);
6752
6753 vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
6754 vmx->loaded_vmcs->launched);
6755
6756 vcpu->arch.cr2 = read_cr2();
6757
6758 /*
6759 * VMEXIT disables interrupts (host state), but tracing and lockdep
6760 * have them in state 'on' as recorded before entering guest mode.
6761 * Same as enter_from_user_mode().
6762 *
6763 * guest_exit_irqoff() restores host context and reinstates RCU if
6764 * enabled and required.
6765 *
6766 * This needs to be done before the below as native_read_msr()
6767 * contains a tracepoint and x86_spec_ctrl_restore_host() calls
6768 * into world and some more.
6769 */
6770 lockdep_hardirqs_off(CALLER_ADDR0);
6771 guest_exit_irqoff();
> 6772 trace_hardirqs_off_prepare();
6773
6774 /*
6775 * We do not use IBRS in the kernel. If this vCPU has used the
6776 * SPEC_CTRL MSR it may have left it on; save the value and
6777 * turn it off. This is much more efficient than blindly adding
6778 * it to the atomic save/restore list. Especially as the former
6779 * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
6780 *
6781 * For non-nested case:
6782 * If the L01 MSR bitmap does not intercept the MSR, then we need to
6783 * save it.
6784 *
6785 * For nested case:
6786 * If the L02 MSR bitmap does not intercept the MSR, then we need to
6787 * save it.
6788 */
6789 if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
6790 vmx->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
6791
6792 x86_spec_ctrl_restore_host(vmx->spec_ctrl, 0);
6793
6794 /* All fields are clean at this point */
6795 if (static_branch_unlikely(&enable_evmcs))
6796 current_evmcs->hv_clean_fields |=
6797 HV_VMX_ENLIGHTENED_CLEAN_FIELD_ALL;
6798
6799 if (static_branch_unlikely(&enable_evmcs))
6800 current_evmcs->hv_vp_id = vcpu->arch.hyperv.vp_index;
6801
6802 /* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */
6803 if (vmx->host_debugctlmsr)
6804 update_debugctlmsr(vmx->host_debugctlmsr);
6805
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 73433 bytes --]
* Re: [patch 0/7] x86/KVM: Async #PF and instrumentation protection
2020-05-19 20:31 [patch 0/7] x86/KVM: Async #PF and instrumentation protection Thomas Gleixner
` (6 preceding siblings ...)
2020-05-19 20:31 ` [patch 7/7] x86/kvm/vmx: Use native read/write_cr2() Thomas Gleixner
@ 2020-05-20 7:41 ` Paolo Bonzini
7 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2020-05-20 7:41 UTC (permalink / raw)
To: Thomas Gleixner, LKML
Cc: x86, kvm, Alexandre Chartre, Peter Zijlstra, Juergen Gross, Tom Lendacky
On 19/05/20 22:31, Thomas Gleixner wrote:
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git noinstr-x86-kvm-2020-05-16
Pulled, thanks.
Paolo
* Re: [patch 3/7] x86/kvm/svm: Add hardirq tracing on guest enter/exit
2020-05-19 20:31 ` [patch 3/7] x86/kvm/svm: Add hardirq tracing on " Thomas Gleixner
@ 2020-05-20 8:19 ` kbuild test robot
0 siblings, 0 replies; 13+ messages in thread
From: kbuild test robot @ 2020-05-20 8:19 UTC (permalink / raw)
To: kbuild-all
[-- Attachment #1: Type: text/plain, Size: 9330 bytes --]
Hi Thomas,
I love your patch! Yet something to improve:
[auto build test ERROR on kvm/linux-next]
[also build test ERROR on tip/auto-latest linus/master v5.7-rc6 next-20200519]
[cannot apply to linux/master]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Thomas-Gleixner/x86-KVM-Async-PF-and-instrumentation-protection/20200520-051526
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
config: x86_64-allyesconfig (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project e6658079aca6d971b4e9d7137a3a2ecbc9c34aec)
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>
All errors (new ones prefixed by >>, old ones prefixed by <<):
>> arch/x86/kvm/svm/svm.c:3404:2: error: implicit declaration of function 'trace_hardirqs_on_prepare' [-Werror,-Wimplicit-function-declaration]
trace_hardirqs_on_prepare();
^
arch/x86/kvm/svm/svm.c:3404:2: note: did you mean 'trace_hardirqs_on'?
include/linux/irqflags.h:32:15: note: 'trace_hardirqs_on' declared here
extern void trace_hardirqs_on(void);
^
>> arch/x86/kvm/svm/svm.c:3405:2: error: implicit declaration of function 'lockdep_hardirqs_on_prepare' [-Werror,-Wimplicit-function-declaration]
lockdep_hardirqs_on_prepare(CALLER_ADDR0);
^
arch/x86/kvm/svm/svm.c:3405:2: note: did you mean 'trace_hardirqs_on_prepare'?
arch/x86/kvm/svm/svm.c:3404:2: note: 'trace_hardirqs_on_prepare' declared here
trace_hardirqs_on_prepare();
^
>> arch/x86/kvm/svm/svm.c:3434:2: error: implicit declaration of function 'trace_hardirqs_off_prepare' [-Werror,-Wimplicit-function-declaration]
trace_hardirqs_off_prepare();
^
arch/x86/kvm/svm/svm.c:3434:2: note: did you mean 'trace_hardirqs_on_prepare'?
arch/x86/kvm/svm/svm.c:3404:2: note: 'trace_hardirqs_on_prepare' declared here
trace_hardirqs_on_prepare();
^
3 errors generated.
vim +/trace_hardirqs_on_prepare +3404 arch/x86/kvm/svm/svm.c
3329
3330 static fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
3331 {
3332 fastpath_t exit_fastpath;
3333 struct vcpu_svm *svm = to_svm(vcpu);
3334
3335 svm->vmcb->save.rax = vcpu->arch.regs[VCPU_REGS_RAX];
3336 svm->vmcb->save.rsp = vcpu->arch.regs[VCPU_REGS_RSP];
3337 svm->vmcb->save.rip = vcpu->arch.regs[VCPU_REGS_RIP];
3338
3339 /*
3340 * A vmexit emulation is required before the vcpu can be executed
3341 * again.
3342 */
3343 if (unlikely(svm->nested.exit_required))
3344 return EXIT_FASTPATH_NONE;
3345
3346 /*
3347 * Disable singlestep if we're injecting an interrupt/exception.
3348 * We don't want our modified rflags to be pushed on the stack where
3349 * we might not be able to easily reset them if we disabled NMI
3350 * singlestep later.
3351 */
3352 if (svm->nmi_singlestep && svm->vmcb->control.event_inj) {
3353 /*
3354 * Event injection happens before external interrupts cause a
3355 * vmexit and interrupts are disabled here, so smp_send_reschedule
3356 * is enough to force an immediate vmexit.
3357 */
3358 disable_nmi_singlestep(svm);
3359 smp_send_reschedule(vcpu->cpu);
3360 }
3361
3362 pre_svm_run(svm);
3363
3364 sync_lapic_to_cr8(vcpu);
3365
3366 svm->vmcb->save.cr2 = vcpu->arch.cr2;
3367
3368 /*
3369 * Run with all-zero DR6 unless needed, so that we can get the exact cause
3370 * of a #DB.
3371 */
3372 if (unlikely(svm->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT))
3373 svm_set_dr6(svm, vcpu->arch.dr6);
3374 else
3375 svm_set_dr6(svm, DR6_FIXED_1 | DR6_RTM);
3376
3377 clgi();
3378 kvm_load_guest_xsave_state(vcpu);
3379
3380 if (lapic_in_kernel(vcpu) &&
3381 vcpu->arch.apic->lapic_timer.timer_advance_ns)
3382 kvm_wait_lapic_expire(vcpu);
3383
3384 /*
3385 * If this vCPU has touched SPEC_CTRL, restore the guest's value if
3386 * it's non-zero. Since vmentry is serialising on affected CPUs, there
3387 * is no need to worry about the conditional branch over the wrmsr
3388 * being speculatively taken.
3389 */
3390 x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
3391
3392 /*
3393 * VMENTER enables interrupts (host state), but the kernel state is
3394 * interrupts disabled when this is invoked. Also tell RCU about
3395 * it. This is the same logic as for exit_to_user_mode().
3396 *
3397 * This ensures that e.g. latency analysis on the host observes
3398 * guest mode as interrupt enabled.
3399 *
3400 * guest_enter_irqoff() informs context tracking about the
3401 * transition to guest mode and if enabled adjusts RCU state
3402 * accordingly.
3403 */
> 3404 trace_hardirqs_on_prepare();
> 3405 lockdep_hardirqs_on_prepare(CALLER_ADDR0);
3406 guest_enter_irqoff();
3407 lockdep_hardirqs_on(CALLER_ADDR0);
3408
3409 __svm_vcpu_run(svm->vmcb_pa, (unsigned long *)&svm->vcpu.arch.regs);
3410
3411 #ifdef CONFIG_X86_64
3412 wrmsrl(MSR_GS_BASE, svm->host.gs_base);
3413 #else
3414 loadsegment(fs, svm->host.fs);
3415 #ifndef CONFIG_X86_32_LAZY_GS
3416 loadsegment(gs, svm->host.gs);
3417 #endif
3418 #endif
3419
3420 /*
3421 * VMEXIT disables interrupts (host state), but tracing and lockdep
3422 * have them in state 'on' as recorded before entering guest mode.
3423 * Same as enter_from_user_mode().
3424 *
3425 * guest_exit_irqoff() restores host context and reinstates RCU if
3426 * enabled and required.
3427 *
3428 * This needs to be done before the below as native_read_msr()
3429 * contains a tracepoint and x86_spec_ctrl_restore_host() calls
3430 * into world and some more.
3431 */
3432 lockdep_hardirqs_off(CALLER_ADDR0);
3433 guest_exit_irqoff();
> 3434 trace_hardirqs_off_prepare();
3435
3436 /*
3437 * We do not use IBRS in the kernel. If this vCPU has used the
3438 * SPEC_CTRL MSR it may have left it on; save the value and
3439 * turn it off. This is much more efficient than blindly adding
3440 * it to the atomic save/restore list. Especially as the former
3441 * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
3442 *
3443 * For non-nested case:
3444 * If the L01 MSR bitmap does not intercept the MSR, then we need to
3445 * save it.
3446 *
3447 * For nested case:
3448 * If the L02 MSR bitmap does not intercept the MSR, then we need to
3449 * save it.
3450 */
3451 if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
3452 svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
3453
3454 reload_tss(vcpu);
3455
3456 x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl);
3457
3458 vcpu->arch.cr2 = svm->vmcb->save.cr2;
3459 vcpu->arch.regs[VCPU_REGS_RAX] = svm->vmcb->save.rax;
3460 vcpu->arch.regs[VCPU_REGS_RSP] = svm->vmcb->save.rsp;
3461 vcpu->arch.regs[VCPU_REGS_RIP] = svm->vmcb->save.rip;
3462
3463 if (unlikely(svm->vmcb->control.exit_code == SVM_EXIT_NMI))
3464 kvm_before_interrupt(&svm->vcpu);
3465
3466 kvm_load_host_xsave_state(vcpu);
3467 stgi();
3468
3469 /* Any pending NMI will happen here */
3470 exit_fastpath = svm_exit_handlers_fastpath(vcpu);
3471
3472 if (unlikely(svm->vmcb->control.exit_code == SVM_EXIT_NMI))
3473 kvm_after_interrupt(&svm->vcpu);
3474
3475 sync_cr8_to_lapic(vcpu);
3476
3477 svm->next_rip = 0;
3478 svm->nested.nested_run_pending = 0;
3479
3480 svm->vmcb->control.tlb_ctl = TLB_CONTROL_DO_NOTHING;
3481
3482 /* if exit due to PF check for async PF */
3483 if (svm->vmcb->control.exit_code == SVM_EXIT_EXCP_BASE + PF_VECTOR)
3484 svm->vcpu.arch.apf.host_apf_reason = kvm_read_and_reset_pf_reason();
3485
3486 if (npt_enabled) {
3487 vcpu->arch.regs_avail &= ~(1 << VCPU_EXREG_PDPTR);
3488 vcpu->arch.regs_dirty &= ~(1 << VCPU_EXREG_PDPTR);
3489 }
3490
3491 /*
3492 * We need to handle MC intercepts here before the vcpu has a chance to
3493 * change the physical cpu
3494 */
3495 if (unlikely(svm->vmcb->control.exit_code ==
3496 SVM_EXIT_EXCP_BASE + MC_VECTOR))
3497 svm_handle_mce(svm);
3498
3499 mark_all_clean(svm->vmcb);
3500 return exit_fastpath;
3501 }
3502
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 73433 bytes --]
* Re: [patch 5/7] x86/kvm/svm: Move guest enter/exit into .noinstr.text
2020-05-19 20:31 ` [patch 5/7] x86/kvm/svm: " Thomas Gleixner
@ 2020-05-20 9:20 ` kbuild test robot
2020-05-20 13:45 ` kbuild test robot
1 sibling, 0 replies; 13+ messages in thread
From: kbuild test robot @ 2020-05-20 9:20 UTC (permalink / raw)
To: kbuild-all
[-- Attachment #1: Type: text/plain, Size: 4349 bytes --]
Hi Thomas,
I love your patch! Yet something to improve:
[auto build test ERROR on kvm/linux-next]
[cannot apply to tip/auto-latest linus/master linux/master v5.7-rc6 next-20200519]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Thomas-Gleixner/x86-KVM-Async-PF-and-instrumentation-protection/20200520-051526
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
config: x86_64-allyesconfig (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project e6658079aca6d971b4e9d7137a3a2ecbc9c34aec)
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>
All errors (new ones prefixed by >>, old ones prefixed by <<):
>> arch/x86/kvm/svm/svm.c:3330:8: error: unknown type name 'noinstr'
static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu,
^
>> arch/x86/kvm/svm/svm.c:3345:2: error: implicit declaration of function 'instrumentation_begin' [-Werror,-Wimplicit-function-declaration]
instrumentation_begin();
^
arch/x86/kvm/svm/svm.c:3346:2: error: implicit declaration of function 'trace_hardirqs_on_prepare' [-Werror,-Wimplicit-function-declaration]
trace_hardirqs_on_prepare();
^
arch/x86/kvm/svm/svm.c:3346:2: note: did you mean 'trace_hardirqs_on'?
include/linux/irqflags.h:32:15: note: 'trace_hardirqs_on' declared here
extern void trace_hardirqs_on(void);
^
arch/x86/kvm/svm/svm.c:3347:2: error: implicit declaration of function 'lockdep_hardirqs_on_prepare' [-Werror,-Wimplicit-function-declaration]
lockdep_hardirqs_on_prepare(CALLER_ADDR0);
^
arch/x86/kvm/svm/svm.c:3347:2: note: did you mean 'trace_hardirqs_on_prepare'?
arch/x86/kvm/svm/svm.c:3346:2: note: 'trace_hardirqs_on_prepare' declared here
trace_hardirqs_on_prepare();
^
>> arch/x86/kvm/svm/svm.c:3348:2: error: implicit declaration of function 'instrumentation_end' [-Werror,-Wimplicit-function-declaration]
instrumentation_end();
^
arch/x86/kvm/svm/svm.c:3348:2: note: did you mean 'instrumentation_begin'?
arch/x86/kvm/svm/svm.c:3345:2: note: 'instrumentation_begin' declared here
instrumentation_begin();
^
arch/x86/kvm/svm/svm.c:3380:2: error: implicit declaration of function 'trace_hardirqs_off_prepare' [-Werror,-Wimplicit-function-declaration]
trace_hardirqs_off_prepare();
^
arch/x86/kvm/svm/svm.c:3380:2: note: did you mean 'trace_hardirqs_on_prepare'?
arch/x86/kvm/svm/svm.c:3346:2: note: 'trace_hardirqs_on_prepare' declared here
trace_hardirqs_on_prepare();
^
6 errors generated.
vim +/noinstr +3330 arch/x86/kvm/svm/svm.c
3329
> 3330 static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu,
3331 struct vcpu_svm *svm)
3332 {
3333 /*
3334 * VMENTER enables interrupts (host state), but the kernel state is
3335 * interrupts disabled when this is invoked. Also tell RCU about
3336 * it. This is the same logic as for exit_to_user_mode().
3337 *
3338 * This ensures that e.g. latency analysis on the host observes
3339 * guest mode as interrupt enabled.
3340 *
3341 * guest_enter_irqoff() informs context tracking about the
3342 * transition to guest mode and if enabled adjusts RCU state
3343 * accordingly.
3344 */
> 3345 instrumentation_begin();
3346 trace_hardirqs_on_prepare();
3347 lockdep_hardirqs_on_prepare(CALLER_ADDR0);
> 3348 instrumentation_end();
3349
3350 guest_enter_irqoff();
3351 lockdep_hardirqs_on(CALLER_ADDR0);
3352
3353 __svm_vcpu_run(svm->vmcb_pa, (unsigned long *)&svm->vcpu.arch.regs);
3354
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 73533 bytes --]
* Re: [patch 5/7] x86/kvm/svm: Move guest enter/exit into .noinstr.text
2020-05-19 20:31 ` [patch 5/7] x86/kvm/svm: " Thomas Gleixner
2020-05-20 9:20 ` kbuild test robot
@ 2020-05-20 13:45 ` kbuild test robot
1 sibling, 0 replies; 13+ messages in thread
From: kbuild test robot @ 2020-05-20 13:45 UTC (permalink / raw)
To: kbuild-all
[-- Attachment #1: Type: text/plain, Size: 8453 bytes --]
Hi Thomas,
I love your patch! Yet something to improve:
[auto build test ERROR on kvm/linux-next]
[cannot apply to tip/auto-latest linus/master linux/master v5.7-rc6 next-20200519]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Thomas-Gleixner/x86-KVM-Async-PF-and-instrumentation-protection/20200520-051526
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
config: i386-allyesconfig (attached as .config)
If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>
All errors (new ones prefixed by >>, old ones prefixed by <<):
>> arch/x86/kvm/svm/svm.c:3330:16: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'void'
static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu,
^~~~
arch/x86/kvm/svm/svm.c: In function 'svm_vcpu_run':
>> arch/x86/kvm/svm/svm.c:3446:2: error: implicit declaration of function 'svm_vcpu_enter_exit'; did you mean 'kvm_vcpu_mtrr_init'? [-Werror=implicit-function-declaration]
svm_vcpu_enter_exit(vcpu, svm);
^~~~~~~~~~~~~~~~~~~
kvm_vcpu_mtrr_init
cc1: some warnings being treated as errors
vim +3330 arch/x86/kvm/svm/svm.c
3329
> 3330 static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu,
3331 struct vcpu_svm *svm)
3332 {
3333 /*
3334 * VMENTER enables interrupts (host state), but the kernel state is
3335 * interrupts disabled when this is invoked. Also tell RCU about
3336 * it. This is the same logic as for exit_to_user_mode().
3337 *
3338 * This ensures that e.g. latency analysis on the host observes
3339 * guest mode as interrupt enabled.
3340 *
3341 * guest_enter_irqoff() informs context tracking about the
3342 * transition to guest mode and if enabled adjusts RCU state
3343 * accordingly.
3344 */
3345 instrumentation_begin();
3346 trace_hardirqs_on_prepare();
3347 lockdep_hardirqs_on_prepare(CALLER_ADDR0);
3348 instrumentation_end();
3349
3350 guest_enter_irqoff();
3351 lockdep_hardirqs_on(CALLER_ADDR0);
3352
3353 __svm_vcpu_run(svm->vmcb_pa, (unsigned long *)&svm->vcpu.arch.regs);
3354
3355 #ifdef CONFIG_X86_64
3356 wrmsrl(MSR_GS_BASE, svm->host.gs_base);
3357 #else
3358 loadsegment(fs, svm->host.fs);
3359 #ifndef CONFIG_X86_32_LAZY_GS
3360 loadsegment(gs, svm->host.gs);
3361 #endif
3362 #endif
3363
3364 /*
3365 * VMEXIT disables interrupts (host state), but tracing and lockdep
3366 * have them in state 'on' as recorded before entering guest mode.
3367 * Same as enter_from_user_mode().
3368 *
3369 * guest_exit_irqoff() restores host context and reinstates RCU if
3370 * enabled and required.
3371 *
3372 * This needs to be done before the below as native_read_msr()
3373 * contains a tracepoint and x86_spec_ctrl_restore_host() calls
3374 * into world and some more.
3375 */
3376 lockdep_hardirqs_off(CALLER_ADDR0);
3377 guest_exit_irqoff();
3378
3379 instrumentation_begin();
3380 trace_hardirqs_off_prepare();
3381 instrumentation_end();
3382 }
3383
3384 static fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
3385 {
3386 fastpath_t exit_fastpath;
3387 struct vcpu_svm *svm = to_svm(vcpu);
3388
3389 svm->vmcb->save.rax = vcpu->arch.regs[VCPU_REGS_RAX];
3390 svm->vmcb->save.rsp = vcpu->arch.regs[VCPU_REGS_RSP];
3391 svm->vmcb->save.rip = vcpu->arch.regs[VCPU_REGS_RIP];
3392
3393 /*
3394 * A vmexit emulation is required before the vcpu can be executed
3395 * again.
3396 */
3397 if (unlikely(svm->nested.exit_required))
3398 return EXIT_FASTPATH_NONE;
3399
3400 /*
3401 * Disable singlestep if we're injecting an interrupt/exception.
3402 * We don't want our modified rflags to be pushed on the stack where
3403 * we might not be able to easily reset them if we disabled NMI
3404 * singlestep later.
3405 */
3406 if (svm->nmi_singlestep && svm->vmcb->control.event_inj) {
3407 /*
3408 * Event injection happens before external interrupts cause a
3409 * vmexit and interrupts are disabled here, so smp_send_reschedule
3410 * is enough to force an immediate vmexit.
3411 */
3412 disable_nmi_singlestep(svm);
3413 smp_send_reschedule(vcpu->cpu);
3414 }
3415
3416 pre_svm_run(svm);
3417
3418 sync_lapic_to_cr8(vcpu);
3419
3420 svm->vmcb->save.cr2 = vcpu->arch.cr2;
3421
3422 /*
3423 * Run with all-zero DR6 unless needed, so that we can get the exact cause
3424 * of a #DB.
3425 */
3426 if (unlikely(svm->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT))
3427 svm_set_dr6(svm, vcpu->arch.dr6);
3428 else
3429 svm_set_dr6(svm, DR6_FIXED_1 | DR6_RTM);
3430
3431 clgi();
3432 kvm_load_guest_xsave_state(vcpu);
3433
3434 if (lapic_in_kernel(vcpu) &&
3435 vcpu->arch.apic->lapic_timer.timer_advance_ns)
3436 kvm_wait_lapic_expire(vcpu);
3437
3438 /*
3439 * If this vCPU has touched SPEC_CTRL, restore the guest's value if
3440 * it's non-zero. Since vmentry is serialising on affected CPUs, there
3441 * is no need to worry about the conditional branch over the wrmsr
3442 * being speculatively taken.
3443 */
3444 x86_spec_ctrl_set_guest(svm->spec_ctrl, svm->virt_spec_ctrl);
3445
> 3446 svm_vcpu_enter_exit(vcpu, svm);
3447
3448 /*
3449 * We do not use IBRS in the kernel. If this vCPU has used the
3450 * SPEC_CTRL MSR it may have left it on; save the value and
3451 * turn it off. This is much more efficient than blindly adding
3452 * it to the atomic save/restore list. Especially as the former
3453 * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
3454 *
3455 * For non-nested case:
3456 * If the L01 MSR bitmap does not intercept the MSR, then we need to
3457 * save it.
3458 *
3459 * For nested case:
3460 * If the L02 MSR bitmap does not intercept the MSR, then we need to
3461 * save it.
3462 */
3463 if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
3464 svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
3465
3466 reload_tss(vcpu);
3467
3468 x86_spec_ctrl_restore_host(svm->spec_ctrl, svm->virt_spec_ctrl);
3469
3470 vcpu->arch.cr2 = svm->vmcb->save.cr2;
3471 vcpu->arch.regs[VCPU_REGS_RAX] = svm->vmcb->save.rax;
3472 vcpu->arch.regs[VCPU_REGS_RSP] = svm->vmcb->save.rsp;
3473 vcpu->arch.regs[VCPU_REGS_RIP] = svm->vmcb->save.rip;
3474
3475 if (unlikely(svm->vmcb->control.exit_code == SVM_EXIT_NMI))
3476 kvm_before_interrupt(&svm->vcpu);
3477
3478 kvm_load_host_xsave_state(vcpu);
3479 stgi();
3480
3481 /* Any pending NMI will happen here */
3482 exit_fastpath = svm_exit_handlers_fastpath(vcpu);
3483
3484 if (unlikely(svm->vmcb->control.exit_code == SVM_EXIT_NMI))
3485 kvm_after_interrupt(&svm->vcpu);
3486
3487 sync_cr8_to_lapic(vcpu);
3488
3489 svm->next_rip = 0;
3490 svm->nested.nested_run_pending = 0;
3491
3492 svm->vmcb->control.tlb_ctl = TLB_CONTROL_DO_NOTHING;
3493
3494 /* if exit due to PF check for async PF */
3495 if (svm->vmcb->control.exit_code == SVM_EXIT_EXCP_BASE + PF_VECTOR)
3496 svm->vcpu.arch.apf.host_apf_reason = kvm_read_and_reset_pf_reason();
3497
3498 if (npt_enabled) {
3499 vcpu->arch.regs_avail &= ~(1 << VCPU_EXREG_PDPTR);
3500 vcpu->arch.regs_dirty &= ~(1 << VCPU_EXREG_PDPTR);
3501 }
3502
3503 /*
3504 * We need to handle MC intercepts here before the vcpu has a chance to
3505 * change the physical cpu
3506 */
3507 if (unlikely(svm->vmcb->control.exit_code ==
3508 SVM_EXIT_EXCP_BASE + MC_VECTOR))
3509 svm_handle_mce(svm);
3510
3511 mark_all_clean(svm->vmcb);
3512 return exit_fastpath;
3513 }
3514
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 72535 bytes --]
end of thread, other threads:[~2020-05-20 13:45 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-19 20:31 [patch 0/7] x86/KVM: Async #PF and instrumentation protection Thomas Gleixner
2020-05-19 20:31 ` [patch 1/7] x86/kvm: Move context tracking where it belongs Thomas Gleixner
2020-05-19 20:31 ` [patch 2/7] x86/kvm/vmx: Add hardirq tracing to guest enter/exit Thomas Gleixner
2020-05-20 5:48 ` kbuild test robot
2020-05-19 20:31 ` [patch 3/7] x86/kvm/svm: Add hardirq tracing on " Thomas Gleixner
2020-05-20 8:19 ` kbuild test robot
2020-05-19 20:31 ` [patch 4/7] x86/kvm/vmx: Move guest enter/exit into .noinstr.text Thomas Gleixner
2020-05-19 20:31 ` [patch 5/7] x86/kvm/svm: " Thomas Gleixner
2020-05-20 9:20 ` kbuild test robot
2020-05-20 13:45 ` kbuild test robot
2020-05-19 20:31 ` [patch 6/7] x86/kvm/svm: Use uninstrumented wrmsrl() to restore GS Thomas Gleixner
2020-05-19 20:31 ` [patch 7/7] x86/kvm/vmx: Use native read/write_cr2() Thomas Gleixner
2020-05-20 7:41 ` [patch 0/7] x86/KVM: Async #PF and instrumentation protection Paolo Bonzini