linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 0/4] Retpoline / IBRS_ALL
@ 2018-02-07  0:03 David Woodhouse
  2018-02-07  0:03 ` [RFC PATCH 1/4] Revert "x86/speculation: Simplify indirect_branch_prediction_barrier()" David Woodhouse
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: David Woodhouse @ 2018-02-07  0:03 UTC (permalink / raw)
  To: tglx, torvalds, x86, linux-kernel, bp, peterz, tim.c.chen,
	dave.hansen, arjan.van.de.ven

Using retpoline ensures the kernel itself is safe because it contains no
indirect branches, but firmware still can contain them, and we make calls
into firmware at runtime. Where IBRS microcode support is available, use
it before calling into firmware.

While doing that, I noticed that we were calling C functions without
telling the compiler about the call-clobbered registers. Stop that.

This also contains the always_inline fix for the performance problem 
introduced by retpoline in KVM code, and finally adds IBRS_ALL support 
for future CPUs, where we can disable the retpoline but still want to 
use IBPB on context switch etc. I'll repeat the comment from that commit 
here, for clarity:

This does not actually *set* the IBRS bit in the SPEC_CTRL register,
because Intel's documentation is wrong. Not wrong in the sense of
"does not accurately describe Intel's planned future hardware", but
more in the sense that if Intel make hardware like that, then they
are Doing It Wrong™.

With IBRS_ALL advertised in IA32_ARCH_CAPABILITIES, the IBRS bit in
the MSR should do *nothing*. The safe mode where the CPU honours the
tags in the BTB/RSB should be enabled *unconditionally*.


David Woodhouse (4):
  Revert "x86/speculation: Simplify indirect_branch_prediction_barrier()"
  KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()
  x86/speculation: Use IBRS if available before calling into firmware
  x86/speculation: Support "Enhanced IBRS" on future CPUs

 arch/x86/include/asm/apm.h           |  6 ++++++
 arch/x86/include/asm/cpufeatures.h   |  1 +
 arch/x86/include/asm/efi.h           | 13 +++++++++++--
 arch/x86/include/asm/nospec-branch.h | 34 +++++++++++++++++++++++++++++-----
 arch/x86/include/asm/processor.h     |  3 ---
 arch/x86/kernel/cpu/bugs.c           | 29 ++++++++++++++++++++++-------
 arch/x86/kvm/mmu.c                   | 10 +++++-----
 drivers/watchdog/hpwdt.c             |  3 +++
 8 files changed, 77 insertions(+), 22 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC PATCH 1/4] Revert "x86/speculation: Simplify indirect_branch_prediction_barrier()"
  2018-02-07  0:03 [RFC PATCH 0/4] Retpoline / IBRS_ALL David Woodhouse
@ 2018-02-07  0:03 ` David Woodhouse
  2018-02-07  0:03 ` [RFC PATCH 2/4] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range() David Woodhouse
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: David Woodhouse @ 2018-02-07  0:03 UTC (permalink / raw)
  To: tglx, torvalds, x86, linux-kernel, bp, peterz, tim.c.chen,
	dave.hansen, arjan.van.de.ven

This reverts commit 64e16720ea0879f8ab4547e3b9758936d483909b.

We cannot call C functions like that, without marking all the
call-clobbered registers as, well, clobbered. We might have got away
with it for now because the __ibp_barrier() function was *fairly*
unlikely to actually use any other registers. But no. Just no.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/include/asm/nospec-branch.h | 13 +++++++++----
 arch/x86/include/asm/processor.h     |  3 ---
 arch/x86/kernel/cpu/bugs.c           |  6 ------
 3 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 4d57894..300cc15 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -164,10 +164,15 @@ static inline void vmexit_fill_RSB(void)
 
 static inline void indirect_branch_prediction_barrier(void)
 {
-	alternative_input("",
-			  "call __ibp_barrier",
-			  X86_FEATURE_USE_IBPB,
-			  ASM_NO_INPUT_CLOBBER("eax", "ecx", "edx", "memory"));
+	asm volatile(ALTERNATIVE("",
+				 "movl %[msr], %%ecx\n\t"
+				 "movl %[val], %%eax\n\t"
+				 "movl $0, %%edx\n\t"
+				 "wrmsr",
+				 X86_FEATURE_USE_IBPB)
+		     : : [msr] "i" (MSR_IA32_PRED_CMD),
+			 [val] "i" (PRED_CMD_IBPB)
+		     : "eax", "ecx", "edx", "memory");
 }
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 513f960..99799fb 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -969,7 +969,4 @@ bool xen_set_default_idle(void);
 
 void stop_this_cpu(void *dummy);
 void df_debug(struct pt_regs *regs, long error_code);
-
-void __ibp_barrier(void);
-
 #endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 71949bf..61152aa 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -337,9 +337,3 @@ ssize_t cpu_show_spectre_v2(struct device *dev,
 		       spectre_v2_module_string());
 }
 #endif
-
-void __ibp_barrier(void)
-{
-	__wrmsr(MSR_IA32_PRED_CMD, PRED_CMD_IBPB, 0);
-}
-EXPORT_SYMBOL_GPL(__ibp_barrier);
-- 
2.7.4


* [RFC PATCH 2/4] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()
  2018-02-07  0:03 [RFC PATCH 0/4] Retpoline / IBRS_ALL David Woodhouse
  2018-02-07  0:03 ` [RFC PATCH 1/4] Revert "x86/speculation: Simplify indirect_branch_prediction_barrier()" David Woodhouse
@ 2018-02-07  0:03 ` David Woodhouse
       [not found]   ` <1518092274.3677.177.camel@infradead.org>
  2018-02-07  0:03 ` [RFC PATCH 3/4] x86/speculation: Use IBRS if available before calling into firmware David Woodhouse
  2018-02-07  0:03 ` [RFC PATCH 4/4] x86/speculation: Support "Enhanced IBRS" on future CPUs David Woodhouse
  3 siblings, 1 reply; 8+ messages in thread
From: David Woodhouse @ 2018-02-07  0:03 UTC (permalink / raw)
  To: tglx, torvalds, x86, linux-kernel, bp, peterz, tim.c.chen,
	dave.hansen, arjan.van.de.ven

With retpoline, tight loops of "call this function for every XXX" are
very much pessimised by taking a prediction miss *every* time. This one
showed up very high in our early testing.

By marking the iterator slot_handle_…() functions always_inline, we can
ensure that the indirect function call is optimised away into a direct
call. This actually generates slightly smaller code too, because some of
the other conditionals can also be optimised away.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kvm/mmu.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 2b8eb4d..cc83bdc 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -5058,7 +5058,7 @@ void kvm_mmu_uninit_vm(struct kvm *kvm)
 typedef bool (*slot_level_handler) (struct kvm *kvm, struct kvm_rmap_head *rmap_head);
 
 /* The caller should hold mmu-lock before calling this function. */
-static bool
+static __always_inline bool
 slot_handle_level_range(struct kvm *kvm, struct kvm_memory_slot *memslot,
 			slot_level_handler fn, int start_level, int end_level,
 			gfn_t start_gfn, gfn_t end_gfn, bool lock_flush_tlb)
@@ -5088,7 +5088,7 @@ slot_handle_level_range(struct kvm *kvm, struct kvm_memory_slot *memslot,
 	return flush;
 }
 
-static bool
+static __always_inline bool
 slot_handle_level(struct kvm *kvm, struct kvm_memory_slot *memslot,
 		  slot_level_handler fn, int start_level, int end_level,
 		  bool lock_flush_tlb)
@@ -5099,7 +5099,7 @@ slot_handle_level(struct kvm *kvm, struct kvm_memory_slot *memslot,
 			lock_flush_tlb);
 }
 
-static bool
+static __always_inline bool
 slot_handle_all_level(struct kvm *kvm, struct kvm_memory_slot *memslot,
 		      slot_level_handler fn, bool lock_flush_tlb)
 {
@@ -5107,7 +5107,7 @@ slot_handle_all_level(struct kvm *kvm, struct kvm_memory_slot *memslot,
 				 PT_MAX_HUGEPAGE_LEVEL, lock_flush_tlb);
 }
 
-static bool
+static __always_inline bool
 slot_handle_large_level(struct kvm *kvm, struct kvm_memory_slot *memslot,
 			slot_level_handler fn, bool lock_flush_tlb)
 {
@@ -5115,7 +5115,7 @@ slot_handle_large_level(struct kvm *kvm, struct kvm_memory_slot *memslot,
 				 PT_MAX_HUGEPAGE_LEVEL, lock_flush_tlb);
 }
 
-static bool
+static __always_inline bool
 slot_handle_leaf(struct kvm *kvm, struct kvm_memory_slot *memslot,
 		 slot_level_handler fn, bool lock_flush_tlb)
 {
-- 
2.7.4


* [RFC PATCH 3/4] x86/speculation: Use IBRS if available before calling into firmware
  2018-02-07  0:03 [RFC PATCH 0/4] Retpoline / IBRS_ALL David Woodhouse
  2018-02-07  0:03 ` [RFC PATCH 1/4] Revert "x86/speculation: Simplify indirect_branch_prediction_barrier()" David Woodhouse
  2018-02-07  0:03 ` [RFC PATCH 2/4] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range() David Woodhouse
@ 2018-02-07  0:03 ` David Woodhouse
  2018-02-07 11:17   ` Borislav Petkov
  2018-02-07  0:03 ` [RFC PATCH 4/4] x86/speculation: Support "Enhanced IBRS" on future CPUs David Woodhouse
  3 siblings, 1 reply; 8+ messages in thread
From: David Woodhouse @ 2018-02-07  0:03 UTC (permalink / raw)
  To: tglx, torvalds, x86, linux-kernel, bp, peterz, tim.c.chen,
	dave.hansen, arjan.van.de.ven

Retpoline means the kernel is safe because it has no indirect branches.
But firmware isn't, so use IBRS for firmware calls if it's available.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/include/asm/apm.h           |  6 ++++++
 arch/x86/include/asm/cpufeatures.h   |  1 +
 arch/x86/include/asm/efi.h           | 13 +++++++++++--
 arch/x86/include/asm/nospec-branch.h | 37 +++++++++++++++++++++++++++---------
 arch/x86/kernel/cpu/bugs.c           | 12 +++++++++++-
 drivers/watchdog/hpwdt.c             |  3 +++
 6 files changed, 60 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/apm.h b/arch/x86/include/asm/apm.h
index 4d4015d..4483616 100644
--- a/arch/x86/include/asm/apm.h
+++ b/arch/x86/include/asm/apm.h
@@ -7,6 +7,8 @@
 #ifndef _ASM_X86_MACH_DEFAULT_APM_H
 #define _ASM_X86_MACH_DEFAULT_APM_H
 
+#include <asm/spec_ctrl.h>
+
 #ifdef APM_ZERO_SEGS
 #	define APM_DO_ZERO_SEGS \
 		"pushl %%ds\n\t" \
@@ -32,6 +34,7 @@ static inline void apm_bios_call_asm(u32 func, u32 ebx_in, u32 ecx_in,
 	 * N.B. We do NOT need a cld after the BIOS call
 	 * because we always save and restore the flags.
 	 */
+	firmware_restrict_branch_speculation_start();
 	__asm__ __volatile__(APM_DO_ZERO_SEGS
 		"pushl %%edi\n\t"
 		"pushl %%ebp\n\t"
@@ -44,6 +47,7 @@ static inline void apm_bios_call_asm(u32 func, u32 ebx_in, u32 ecx_in,
 		  "=S" (*esi)
 		: "a" (func), "b" (ebx_in), "c" (ecx_in)
 		: "memory", "cc");
+	firmware_restrict_branch_speculation_end();
 }
 
 static inline bool apm_bios_call_simple_asm(u32 func, u32 ebx_in,
@@ -56,6 +60,7 @@ static inline bool apm_bios_call_simple_asm(u32 func, u32 ebx_in,
 	 * N.B. We do NOT need a cld after the BIOS call
 	 * because we always save and restore the flags.
 	 */
+	firmware_restrict_branch_speculation_start();
 	__asm__ __volatile__(APM_DO_ZERO_SEGS
 		"pushl %%edi\n\t"
 		"pushl %%ebp\n\t"
@@ -68,6 +73,7 @@ static inline bool apm_bios_call_simple_asm(u32 func, u32 ebx_in,
 		  "=S" (si)
 		: "a" (func), "b" (ebx_in), "c" (ecx_in)
 		: "memory", "cc");
+	firmware_restrict_branch_speculation_end();
 	return error;
 }
 
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 73b5fff..66c1434 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -211,6 +211,7 @@
 #define X86_FEATURE_RSB_CTXSW		( 7*32+19) /* "" Fill RSB on context switches */
 
 #define X86_FEATURE_USE_IBPB		( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled */
+#define X86_FEATURE_USE_IBRS_FW		( 7*32+22) /* "" Use IBRS during runtime firmware calls */
 
 /* Virtualization flags: Linux defined, word 8 */
 #define X86_FEATURE_TPR_SHADOW		( 8*32+ 0) /* Intel TPR Shadow */
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 85f6ccb..6b27828 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -6,6 +6,7 @@
 #include <asm/pgtable.h>
 #include <asm/processor-flags.h>
 #include <asm/tlb.h>
+#include <asm/nospec-branch.h>
 
 /*
  * We map the EFI regions needed for runtime services non-contiguously,
@@ -36,8 +37,14 @@
 
 extern asmlinkage unsigned long efi_call_phys(void *, ...);
 
-#define arch_efi_call_virt_setup()	kernel_fpu_begin()
-#define arch_efi_call_virt_teardown()	kernel_fpu_end()
+#define arch_efi_call_virt_setup()				\
+({	kernel_fpu_begin();					\
+	firmware_restrict_branch_speculation_start(); })
+
+#define arch_efi_call_virt_teardown()				\
+({	firmware_restrict_branch_speculation_end();		\
+	kernel_fpu_end(); })
+
 
 /*
  * Wrap all the virtual calls in a way that forces the parameters on the stack.
@@ -73,6 +80,7 @@ struct efi_scratch {
 	efi_sync_low_kernel_mappings();					\
 	preempt_disable();						\
 	__kernel_fpu_begin();						\
+	firmware_restrict_branch_speculation_start();			\
 									\
 	if (efi_scratch.use_pgd) {					\
 		efi_scratch.prev_cr3 = __read_cr3();			\
@@ -91,6 +99,7 @@ struct efi_scratch {
 		__flush_tlb_all();					\
 	}								\
 									\
+	firmware_restrict_branch_speculation_end();			\
 	__kernel_fpu_end();						\
 	preempt_enable();						\
 })
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 300cc15..788c4da 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -162,17 +162,36 @@ static inline void vmexit_fill_RSB(void)
 #endif
 }
 
+#define alternative_msr_write(_msr, _val, _feature)		\
+	asm volatile(ALTERNATIVE("",				\
+				 "movl %[msr], %%ecx\n\t"	\
+				 "movl %[val], %%eax\n\t"	\
+				 "movl $0, %%edx\n\t"		\
+				 "wrmsr",			\
+				 _feature)			\
+		     : : [msr] "i" (_msr), [val] "i" (_val)	\
+		     : "eax", "ecx", "edx", "memory")
+
 static inline void indirect_branch_prediction_barrier(void)
 {
-	asm volatile(ALTERNATIVE("",
-				 "movl %[msr], %%ecx\n\t"
-				 "movl %[val], %%eax\n\t"
-				 "movl $0, %%edx\n\t"
-				 "wrmsr",
-				 X86_FEATURE_USE_IBPB)
-		     : : [msr] "i" (MSR_IA32_PRED_CMD),
-			 [val] "i" (PRED_CMD_IBPB)
-		     : "eax", "ecx", "edx", "memory");
+	alternative_msr_write(MSR_IA32_PRED_CMD, PRED_CMD_IBPB,
+			      X86_FEATURE_USE_IBPB);
+}
+
+/*
+ * With retpoline, we must use IBRS to restrict branch prediction
+ * before calling into firmware.
+ */
+static inline void firmware_restrict_branch_speculation_start(void)
+{
+	alternative_msr_write(MSR_IA32_SPEC_CTRL, SPEC_CTRL_IBRS,
+			      X86_FEATURE_USE_IBRS_FW);
+}
+
+static inline void firmware_restrict_branch_speculation_end(void)
+{
+	alternative_msr_write(MSR_IA32_SPEC_CTRL, 0,
+			      X86_FEATURE_USE_IBRS_FW);
 }
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 61152aa..6f6d763 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -303,6 +303,15 @@ static void __init spectre_v2_select_mitigation(void)
 		setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
 		pr_info("Enabling Indirect Branch Prediction Barrier\n");
 	}
+
+	/*
+	 * Retpoline means the kernel is safe because it has no indirect
+	 * branches. But firmware isn't, so use IBRS to protect that.
+	 */
+	if (boot_cpu_has(X86_FEATURE_IBRS)) {
+		setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW);
+		pr_info("Enabling Restricted Speculation for firmware calls\n");
+	}
 }
 
 #undef pr_fmt
@@ -332,8 +341,9 @@ ssize_t cpu_show_spectre_v2(struct device *dev,
 	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
 		return sprintf(buf, "Not affected\n");
 
-	return sprintf(buf, "%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+	return sprintf(buf, "%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
 		       boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
+		       boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
 		       spectre_v2_module_string());
 }
 #endif
diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
index 67fbe35..bab3721 100644
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -38,6 +38,7 @@
 #endif /* CONFIG_HPWDT_NMI_DECODING */
 #include <asm/nmi.h>
 #include <asm/frame.h>
+#include <asm/nospec-branch.h>
 
 #define HPWDT_VERSION			"1.4.0"
 #define SECS_TO_TICKS(secs)		((secs) * 1000 / 128)
@@ -486,11 +487,13 @@ static int hpwdt_pretimeout(unsigned int ulReason, struct pt_regs *regs)
 	if (!hpwdt_nmi_decoding)
 		return NMI_DONE;
 
+	firmware_restrict_branch_speculation_start();
 	spin_lock_irqsave(&rom_lock, rom_pl);
 	if (!die_nmi_called && !is_icru && !is_uefi)
 		asminline_call(&cmn_regs, cru_rom_addr);
 	die_nmi_called = 1;
 	spin_unlock_irqrestore(&rom_lock, rom_pl);
+	firmware_restrict_branch_speculation_end();
 
 	if (allow_kdump)
 		hpwdt_stop();
-- 
2.7.4


* [RFC PATCH 4/4] x86/speculation: Support "Enhanced IBRS" on future CPUs
  2018-02-07  0:03 [RFC PATCH 0/4] Retpoline / IBRS_ALL David Woodhouse
                   ` (2 preceding siblings ...)
  2018-02-07  0:03 ` [RFC PATCH 3/4] x86/speculation: Use IBRS if available before calling into firmware David Woodhouse
@ 2018-02-07  0:03 ` David Woodhouse
  3 siblings, 0 replies; 8+ messages in thread
From: David Woodhouse @ 2018-02-07  0:03 UTC (permalink / raw)
  To: tglx, torvalds, x86, linux-kernel, bp, peterz, tim.c.chen,
	dave.hansen, arjan.van.de.ven

The original IBRS hack in microcode is horribly slow. For the next
generation of CPUs, as a stopgap until we get a proper fix, Intel
promise an "Enhanced IBRS" which will be fast.

The assumption is that predictions in the BTB/RSB will be tagged with
the VMX mode and ring that they were learned in, and thus we can avoid
consuming unsafe predictions without a performance penalty.

This does not actually *set* the IBRS bit in the SPEC_CTRL register,
because Intel's documentation is wrong. Not wrong in the sense of
"does not accurately describe Intel's planned future hardware", but
more in the sense that if Intel make hardware like that, then they
are Doing It Wrong™.

With IBRS_ALL advertised in IA32_ARCH_CAPABILITIES, the IBRS bit in
the MSR should do *nothing*. The safe mode where the CPU honours the
tags in the BTB/RSB should be enabled *unconditionally*.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/include/asm/nospec-branch.h |  2 +-
 arch/x86/kernel/cpu/bugs.c           | 13 ++++++++++++-
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 788c4da..9b65211 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -140,7 +140,7 @@ enum spectre_v2_mitigation {
 	SPECTRE_V2_RETPOLINE_MINIMAL_AMD,
 	SPECTRE_V2_RETPOLINE_GENERIC,
 	SPECTRE_V2_RETPOLINE_AMD,
-	SPECTRE_V2_IBRS,
+	SPECTRE_V2_IBRS_ALL,
 };
 
 extern char __indirect_thunk_start[];
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 6f6d763..8dbbeda 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -88,6 +88,7 @@ static const char *spectre_v2_strings[] = {
 	[SPECTRE_V2_RETPOLINE_MINIMAL_AMD]	= "Vulnerable: Minimal AMD ASM retpoline",
 	[SPECTRE_V2_RETPOLINE_GENERIC]		= "Mitigation: Full generic retpoline",
 	[SPECTRE_V2_RETPOLINE_AMD]		= "Mitigation: Full AMD retpoline",
+	[SPECTRE_V2_IBRS_ALL]			= "Mitigation: Enhanced IBRS",
 };
 
 #undef pr_fmt
@@ -240,6 +241,15 @@ static void __init spectre_v2_select_mitigation(void)
 
 	case SPECTRE_V2_CMD_FORCE:
 	case SPECTRE_V2_CMD_AUTO:
+		if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES)) {
+			u64 ia32_cap = 0;
+
+			rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
+			if (ia32_cap & ARCH_CAP_IBRS_ALL) {
+				mode = SPECTRE_V2_IBRS_ALL;
+				goto ibrs_all;
+			}
+		}
 		if (IS_ENABLED(CONFIG_RETPOLINE))
 			goto retpoline_auto;
 		break;
@@ -277,6 +287,7 @@ static void __init spectre_v2_select_mitigation(void)
 		setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
 	}
 
+ ibrs_all:
 	spectre_v2_enabled = mode;
 	pr_info("%s\n", spectre_v2_strings[mode]);
 
@@ -308,7 +319,7 @@ static void __init spectre_v2_select_mitigation(void)
 	 * Retpoline means the kernel is safe because it has no indirect
 	 * branches. But firmware isn't, so use IBRS to protect that.
 	 */
-	if (boot_cpu_has(X86_FEATURE_IBRS)) {
+	if (mode != SPECTRE_V2_IBRS_ALL && boot_cpu_has(X86_FEATURE_IBRS)) {
 		setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW);
 		pr_info("Enabling Restricted Speculation for firmware calls\n");
 	}
-- 
2.7.4


* Re: [RFC PATCH 3/4] x86/speculation: Use IBRS if available before calling into firmware
  2018-02-07  0:03 ` [RFC PATCH 3/4] x86/speculation: Use IBRS if available before calling into firmware David Woodhouse
@ 2018-02-07 11:17   ` Borislav Petkov
  2018-02-07 11:29     ` David Woodhouse
  0 siblings, 1 reply; 8+ messages in thread
From: Borislav Petkov @ 2018-02-07 11:17 UTC (permalink / raw)
  To: David Woodhouse
  Cc: tglx, torvalds, x86, linux-kernel, peterz, tim.c.chen,
	dave.hansen, arjan.van.de.ven

Just some random thoughts:

On Wed, Feb 07, 2018 at 12:03:13AM +0000, David Woodhouse wrote:
> +#define alternative_msr_write(_msr, _val, _feature)		\
> +	asm volatile(ALTERNATIVE("",				\
> +				 "movl %[msr], %%ecx\n\t"	\
> +				 "movl %[val], %%eax\n\t"	\
> +				 "movl $0, %%edx\n\t"		\

We'll never write anything except 0 in %edx?

> +				 "wrmsr",			\
> +				 _feature)			\
> +		     : : [msr] "i" (_msr), [val] "i" (_val)	\
> +		     : "eax", "ecx", "edx", "memory")

So I'm not crazy about making it a separate macro because TBH it doesn't
look too generic to do that but then again what do I know, considering
recent history. :-)

	 [ Maybe I need to not look at the spectral meltdown for a
	   couple of weeks and simply take a break. ]

I guess it is fine if it is in nospec-branch.h and not prefix it with
alternative_ so that it doesn't give people ideas.

Oh and firmware_restrict_branch_speculation_start/end() is just too long
a name.

All IMHO, of course.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [RFC PATCH 3/4] x86/speculation: Use IBRS if available before calling into firmware
  2018-02-07 11:17   ` Borislav Petkov
@ 2018-02-07 11:29     ` David Woodhouse
  0 siblings, 0 replies; 8+ messages in thread
From: David Woodhouse @ 2018-02-07 11:29 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: tglx, torvalds, x86, linux-kernel, peterz, tim.c.chen,
	dave.hansen, arjan.van.de.ven


On Wed, 2018-02-07 at 12:17 +0100, Borislav Petkov wrote:
> Just some random thoughts:
> 
> On Wed, Feb 07, 2018 at 12:03:13AM +0000, David Woodhouse wrote:
> > 
> > +#define alternative_msr_write(_msr, _val, _feature)		\
> > +	asm volatile(ALTERNATIVE("",				\
> > +				 "movl %[msr], %%ecx\n\t"	\
> > +				 "movl %[val], %%eax\n\t"	\
> > +				 "movl $0, %%edx\n\t"		\
> We'll never write anything except 0 in %edx?

For these uses, no.

> > 
> > +				 "wrmsr",			\
> > +				 _feature)			\
> > +		     : : [msr] "i" (_msr), [val] "i" (_val)	\
> > +		     : "eax", "ecx", "edx", "memory")
>
> So I'm not crazy about making it a separate macro because TBH it doesn't
> look too generic to do that but then again what do I know, considering
> recent history. :-)

It was mostly split out so that if you want to keep playing silly
buggers with it, you can do so in only one place and not three.

If you don't want to keep messing with it, I'm happy to not bother ;)

> 	 [ Maybe I need to not look at the spectral meltdown for a
> 	   couple of weeks and simply take a break. ]
> 
> I guess it is fine if it is in nospec-branch.h and not prefix it with
> alternative_ so that it doesn't give people ideas.

Right. It wasn't really being proposed as a generic alternative.

> Oh and firmware_restrict_branch_speculation_start/end() is just too long
> a name.

Yeah... Thomas wanted descriptive names; I'm not really convinced.



* Re: [Fwd: [RFC PATCH 2/4] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()]
       [not found]   ` <1518092274.3677.177.camel@infradead.org>
@ 2018-02-08 12:28     ` Sironi, Filippo
  0 siblings, 0 replies; 8+ messages in thread
From: Sironi, Filippo @ 2018-02-08 12:28 UTC (permalink / raw)
  To: David Woodhouse
  Cc: tglx, Linus Torvalds, x86, LKML, bp, Peter Zijlstra, tim.c.chen,
	dave.hansen, arjan.van.de.ven, KVM list


> On 8. Feb 2018, at 13:17, David Woodhouse <dwmw2@infradead.org> wrote:
> 
> 
> From: David Woodhouse <dwmw@amazon.co.uk>
> Subject: [RFC PATCH 2/4] KVM: x86: Reduce retpoline performance impact in slot_handle_level_range()
> Date: 7. February 2018 at 01:03:12 GMT+1
> To: tglx@linutronix.de, torvalds@linux-foundation.org, x86@kernel.org, linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org, tim.c.chen@linux.intel.com, dave.hansen@intel.com, arjan.van.de.ven@intel.com
> 
> 
> With retpoline, tight loops of "call this function for every XXX" are
> very much pessimised by taking a prediction miss *every* time. This one
> showed up very high in our early testing.
> 
> By marking the iterator slot_handle_…() functions always_inline, we can
> ensure that the indirect function call can be optimised away into a
> direct call and it actually generates slightly smaller code because
> some of the other conditionals can get optimised away too.
> 
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
> arch/x86/kvm/mmu.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 2b8eb4d..cc83bdc 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -5058,7 +5058,7 @@ void kvm_mmu_uninit_vm(struct kvm *kvm)
> typedef bool (*slot_level_handler) (struct kvm *kvm, struct kvm_rmap_head *rmap_head);
> 
> /* The caller should hold mmu-lock before calling this function. */
> -static bool
> +static __always_inline bool
> slot_handle_level_range(struct kvm *kvm, struct kvm_memory_slot *memslot,
> 			slot_level_handler fn, int start_level, int end_level,
> 			gfn_t start_gfn, gfn_t end_gfn, bool lock_flush_tlb)
> @@ -5088,7 +5088,7 @@ slot_handle_level_range(struct kvm *kvm, struct kvm_memory_slot *memslot,
> 	return flush;
> }
> 
> -static bool
> +static __always_inline bool
> slot_handle_level(struct kvm *kvm, struct kvm_memory_slot *memslot,
> 		  slot_level_handler fn, int start_level, int end_level,
> 		  bool lock_flush_tlb)
> @@ -5099,7 +5099,7 @@ slot_handle_level(struct kvm *kvm, struct kvm_memory_slot *memslot,
> 			lock_flush_tlb);
> }
> 
> -static bool
> +static __always_inline bool
> slot_handle_all_level(struct kvm *kvm, struct kvm_memory_slot *memslot,
> 		      slot_level_handler fn, bool lock_flush_tlb)
> {
> @@ -5107,7 +5107,7 @@ slot_handle_all_level(struct kvm *kvm, struct kvm_memory_slot *memslot,
> 				 PT_MAX_HUGEPAGE_LEVEL, lock_flush_tlb);
> }
> 
> -static bool
> +static __always_inline bool
> slot_handle_large_level(struct kvm *kvm, struct kvm_memory_slot *memslot,
> 			slot_level_handler fn, bool lock_flush_tlb)
> {
> @@ -5115,7 +5115,7 @@ slot_handle_large_level(struct kvm *kvm, struct kvm_memory_slot *memslot,
> 				 PT_MAX_HUGEPAGE_LEVEL, lock_flush_tlb);
> }
> 
> -static bool
> +static __always_inline bool
> slot_handle_leaf(struct kvm *kvm, struct kvm_memory_slot *memslot,
> 		 slot_level_handler fn, bool lock_flush_tlb)
> {
> -- 
> 2.7.4

+kvm@vger.kernel.org

With this patch, launches of "large instances" are pretty close to what we see with
nospectre_v2 (within tens of milliseconds).

Reviewed-by: Filippo Sironi <sironi@amazon.de>
Tested-by: Filippo Sironi <sironi@amazon.de>

Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B

