* [RFC][PATCH 00/17] Fix up the recent SRSO patches
@ 2023-08-09  7:12 Peter Zijlstra
  2023-08-09  7:12 ` [RFC][PATCH 01/17] x86/alternative: Unconditional custom return thunk Peter Zijlstra
                   ` (18 more replies)
  0 siblings, 19 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

Since I wasn't invited to the party (even though I did retbleed), I get to
clean things up afterwards :/

Anyway, this here overhauls the SRSO patches in a big way.

I claim that AMD retbleed (also called Speculative-Type-Confusion -- not to be
confused with Intel retbleed, which is an entirely different bug) is
fundamentally the same as this SRSO -- which is also caused by STC. And the
mitigations are so similar they should all be controlled from a single spot and
not conflated like they are now.

As such, at the end of the ride the new spec_rstack_overflow kernel command
line option and sysfs file are no more, and all we're left with is a few
extra retbleed options.
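
For illustration, the knobs end up looking roughly like this (option names
as added by the patches in this series):

  retbleed=unret         # original BTB untrain + return thunk
  retbleed=srso          # Zen1/2 SRSO safe-ret thunk
  retbleed=srso_alias    # Zen3/4 SRSO alias thunk
  retbleed=ibpb          # IBPB on kernel entry (covers SRSO too)
  retbleed=ibpb_vmexit   # IBPB on VMEXIT only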

Aside from that, this deals with a few implementation issues -- but not all known
issues. Josh and Andrew are telling me there's a problem when running inside
virt due to how this checks the microcode. I'm hoping either of those two gents
will add a patch to address this.





* [RFC][PATCH 01/17] x86/alternative: Unconditional custom return thunk
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-09  9:31   ` Nikolay Borisov
  2023-08-10 11:37   ` Borislav Petkov
  2023-08-09  7:12 ` [RFC][PATCH 02/17] x86/cpu: Clean up SRSO return thunk mess Peter Zijlstra
                   ` (17 subsequent siblings)
  18 siblings, 2 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

There is infrastructure to rewrite return thunks to point to any
random thunk one desires; unwrap that from CALL_THUNKS, which up to
now was its sole user.
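
That is, roughly (a sketch only; the actual rewriting is done by
apply_returns() in alternative.c, see the hunk below):

	/* default return thunk, possibly overridden by the mitigation code at boot */
	void (*x86_return_thunk)(void) __ro_after_init = &__x86_return_thunk;

	/*
	 * apply_returns() rewrites every compiler generated
	 * 'jmp __x86_return_thunk' site to target whatever
	 * x86_return_thunk points at.
	 */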

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |    4 ----
 arch/x86/kernel/alternative.c        |    2 --
 2 files changed, 6 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -347,11 +347,7 @@ extern void srso_untrain_ret(void);
 extern void srso_untrain_ret_alias(void);
 extern void entry_ibpb(void);
 
-#ifdef CONFIG_CALL_THUNKS
 extern void (*x86_return_thunk)(void);
-#else
-#define x86_return_thunk	(&__x86_return_thunk)
-#endif
 
 #ifdef CONFIG_CALL_DEPTH_TRACKING
 extern void __x86_return_skl(void);
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -698,9 +698,7 @@ void __init_or_module noinline apply_ret
 
 #ifdef CONFIG_RETHUNK
 
-#ifdef CONFIG_CALL_THUNKS
 void (*x86_return_thunk)(void) __ro_after_init = &__x86_return_thunk;
-#endif
 
 /*
  * Rewrite the compiler generated return thunk tail-calls.




* [RFC][PATCH 02/17] x86/cpu: Clean up SRSO return thunk mess
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
  2023-08-09  7:12 ` [RFC][PATCH 01/17] x86/alternative: Unconditional custom return thunk Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-09 15:45   ` Nikolay Borisov
  2023-08-10 11:51   ` Borislav Petkov
  2023-08-09  7:12 ` [RFC][PATCH 03/17] x86/cpu: Make srso_untrain_ret consistent Peter Zijlstra
                   ` (16 subsequent siblings)
  18 siblings, 2 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

Use the existing configurable return thunk. There is absolutely no
justification for having created this __x86_return_thunk alternative.

To clarify, the whole thing looks like:

Zen3/4 does:

  srso_alias_untrain_ret:
	  nop2
	  lfence
	  jmp srso_alias_return_thunk
	  int3

  srso_alias_safe_ret: // aliases srso_alias_untrain_ret just so
	  add $8, %rsp
	  ret
	  int3

  srso_alias_return_thunk:
	  call srso_alias_safe_ret
	  ud2

While Zen1/2 does:

  srso_untrain_ret:
	  movabs $foo, %rax
	  lfence
	  call srso_safe_ret           (jmp srso_return_thunk ?)
	  int3

  srso_safe_ret: // embedded in movabs immediate
	  add $8,%rsp
          ret
          int3

  srso_return_thunk:
	  call srso_safe_ret
	  ud2

While retbleed does:

  zen_untrain_ret:
	  test $0xcc, %bl
	  lfence
	  jmp zen_return_thunk
          int3

  zen_return_thunk: // embedded in the test instruction
	  ret
          int3

Where Zen1/2 flush the BTB using the instruction decoder trick
(test, movabs), Zen3/4 use instruction aliasing. SRSO adds RSB (RAP
in AMD speak) stuffing to force a return mis-predict.

That is, the AMD retbleed is a form of Speculative-Type-Confusion
where the branch predictor is trained to use the BTB to predict the
RET address, while AMD inception/SRSO is a form of
Speculative-Type-Confusion where another instruction is trained to be
treated like a CALL instruction and poison the RSB (RAP).

Pick one of three options at boot.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |    4 +++
 arch/x86/kernel/cpu/bugs.c           |    7 ++++--
 arch/x86/kernel/vmlinux.lds.S        |    2 -
 arch/x86/lib/retpoline.S             |   37 ++++++++++++++++++++++++-----------
 4 files changed, 36 insertions(+), 14 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -342,9 +342,13 @@ extern retpoline_thunk_t __x86_indirect_
 extern retpoline_thunk_t __x86_indirect_jump_thunk_array[];
 
 extern void __x86_return_thunk(void);
+extern void srso_return_thunk(void);
+extern void srso_alias_return_thunk(void);
+
 extern void zen_untrain_ret(void);
 extern void srso_untrain_ret(void);
 extern void srso_untrain_ret_alias(void);
+
 extern void entry_ibpb(void);
 
 extern void (*x86_return_thunk)(void);
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2305,10 +2305,13 @@ static void __init srso_select_mitigatio
 			 */
 			setup_force_cpu_cap(X86_FEATURE_RETHUNK);
 
-			if (boot_cpu_data.x86 == 0x19)
+			if (boot_cpu_data.x86 == 0x19) {
 				setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);
-			else
+				x86_return_thunk = srso_alias_return_thunk;
+			} else {
 				setup_force_cpu_cap(X86_FEATURE_SRSO);
+				x86_return_thunk = srso_return_thunk;
+			}
 			srso_mitigation = SRSO_MITIGATION_SAFE_RET;
 		} else {
 			pr_err("WARNING: kernel not compiled with CPU_SRSO.\n");
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -523,7 +523,7 @@ INIT_PER_CPU(irq_stack_backing_store);
 #endif
 
 #ifdef CONFIG_RETHUNK
-. = ASSERT((__ret & 0x3f) == 0, "__ret not cacheline-aligned");
+. = ASSERT((__x86_return_thunk & 0x3f) == 0, "__x86_return_thunk not cacheline-aligned");
 . = ASSERT((srso_safe_ret & 0x3f) == 0, "srso_safe_ret not cacheline-aligned");
 #endif
 
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -151,10 +151,11 @@ SYM_CODE_END(__x86_indirect_jump_thunk_a
 	.section .text.__x86.rethunk_untrain
 
 SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
 	ASM_NOP2
 	lfence
-	jmp __x86_return_thunk
+	jmp srso_alias_return_thunk
 SYM_FUNC_END(srso_untrain_ret_alias)
 __EXPORT_THUNK(srso_untrain_ret_alias)
 
@@ -184,7 +185,7 @@ SYM_FUNC_END(srso_safe_ret_alias)
  *    from re-poisioning the BTB prediction.
  */
 	.align 64
-	.skip 64 - (__ret - zen_untrain_ret), 0xcc
+	.skip 64 - (__x86_return_thunk - zen_untrain_ret), 0xcc
 SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_NOENDBR
 	/*
@@ -216,10 +217,10 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL,
 	 * evicted, __x86_return_thunk will suffer Straight Line Speculation
 	 * which will be contained safely by the INT3.
 	 */
-SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)
+SYM_INNER_LABEL(__x86_return_thunk, SYM_L_GLOBAL)
 	ret
 	int3
-SYM_CODE_END(__ret)
+SYM_CODE_END(__x86_return_thunk)
 
 	/*
 	 * Ensure the TEST decoding / BTB invalidation is complete.
@@ -230,11 +231,13 @@ SYM_CODE_END(__ret)
 	 * Jump back and execute the RET in the middle of the TEST instruction.
 	 * INT3 is for SLS protection.
 	 */
-	jmp __ret
+	jmp __x86_return_thunk
 	int3
 SYM_FUNC_END(zen_untrain_ret)
 __EXPORT_THUNK(zen_untrain_ret)
 
+EXPORT_SYMBOL(__x86_return_thunk)
+
 /*
  * SRSO untraining sequence for Zen1/2, similar to zen_untrain_ret()
  * above. On kernel entry, srso_untrain_ret() is executed which is a
@@ -257,6 +260,7 @@ SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLO
 	int3
 	int3
 	int3
+	/* end of movabs */
 	lfence
 	call srso_safe_ret
 	int3
@@ -264,12 +268,23 @@ SYM_CODE_END(srso_safe_ret)
 SYM_FUNC_END(srso_untrain_ret)
 __EXPORT_THUNK(srso_untrain_ret)
 
-SYM_FUNC_START(__x86_return_thunk)
-	ALTERNATIVE_2 "jmp __ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
-			"call srso_safe_ret_alias", X86_FEATURE_SRSO_ALIAS
-	int3
-SYM_CODE_END(__x86_return_thunk)
-EXPORT_SYMBOL(__x86_return_thunk)
+/*
+ * Both these do an unbalanced CALL to mess up the RSB, terminate with UD2
+ * to indicate noreturn.
+ */
+SYM_CODE_START(srso_return_thunk)
+	UNWIND_HINT_FUNC
+	ANNOTATE_NOENDBR
+	call srso_safe_ret
+	ud2
+SYM_CODE_END(srso_return_thunk)
+
+SYM_CODE_START(srso_alias_return_thunk)
+	UNWIND_HINT_FUNC
+	ANNOTATE_NOENDBR
+	call srso_safe_ret_alias
+	ud2
+SYM_CODE_END(srso_alias_return_thunk)
 
 #endif /* CONFIG_RETHUNK */
 




* [RFC][PATCH 03/17] x86/cpu: Make srso_untrain_ret consistent
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
  2023-08-09  7:12 ` [RFC][PATCH 01/17] x86/alternative: Unconditional custom return thunk Peter Zijlstra
  2023-08-09  7:12 ` [RFC][PATCH 02/17] x86/cpu: Clean up SRSO return thunk mess Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-10 12:00   ` Borislav Petkov
  2023-08-09  7:12 ` [RFC][PATCH 04/17] objtool/x86: Fix SRSO mess Peter Zijlstra
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

This does change srso_untrain_ret a little to be more consistent with
srso_alias_untrain_ret (and zen_untrain_ret). Specifically, I made
srso_untrain_ret tail-call the srso_return_thunk, instead of doing the
call directly. This matches how srso_alias_untrain_ret and
zen_untrain_ret also tail-call their respective return_thunk.

If this is a problem, it can easily be fixed and a comment added to
explain -- but this way they all end with a tail-call to their own
return-thunk, which is nice and consistent.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/lib/retpoline.S |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -262,7 +262,7 @@ SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLO
 	int3
 	/* end of movabs */
 	lfence
-	call srso_safe_ret
+	jmp srso_return_thunk
 	int3
 SYM_CODE_END(srso_safe_ret)
 SYM_FUNC_END(srso_untrain_ret)




* [RFC][PATCH 04/17] objtool/x86: Fix SRSO mess
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (2 preceding siblings ...)
  2023-08-09  7:12 ` [RFC][PATCH 03/17] x86/cpu: Make srso_untrain_ret consistent Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-10 12:06   ` Borislav Petkov
  2023-08-09  7:12 ` [RFC][PATCH 05/17] x86/cpu: Cleanup the untrain mess Peter Zijlstra
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

arch_is_rethunk() indicates which functions are to be considered
equivalent to INSN_RETURN; any call site to them will be added to the
.return_sites section.

Notably: srso_untrain_ret, srso_safe_ret and __ret do not qualify.

The only thing that needs consideration is srso_safe_ret(), since that
is, like __x86_return_thunk, inside another instruction. Skip
validating that.
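
For reference, a simplified sketch of how objtool consumes this (per
classify_symbols(); not the complete logic):

	if (arch_is_rethunk(func))
		func->return_thunk = true;

	/* call/jmp sites to a return_thunk symbol are later collected
	 * and emitted into the .return_sites section. */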

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/arch/x86/decode.c |    5 +----
 tools/objtool/check.c           |    2 +-
 2 files changed, 2 insertions(+), 5 deletions(-)

--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -824,8 +824,5 @@ bool arch_is_retpoline(struct symbol *sy
 
 bool arch_is_rethunk(struct symbol *sym)
 {
-	return !strcmp(sym->name, "__x86_return_thunk") ||
-	       !strcmp(sym->name, "srso_untrain_ret") ||
-	       !strcmp(sym->name, "srso_safe_ret") ||
-	       !strcmp(sym->name, "__ret");
+	return !strcmp(sym->name, "__x86_return_thunk");
 }
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -455,7 +455,7 @@ static int decode_instructions(struct ob
 				return -1;
 			}
 
-			if (func->return_thunk || func->alias != func)
+			if (func->return_thunk || !strcmp(func->name, "srso_safe_ret") || func->alias != func)
 				continue;
 
 			if (!find_insn(file, sec, func->offset)) {




* [RFC][PATCH 05/17] x86/cpu: Cleanup the untrain mess
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (3 preceding siblings ...)
  2023-08-09  7:12 ` [RFC][PATCH 04/17] objtool/x86: Fix SRSO mess Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-09 12:51   ` Josh Poimboeuf
  2023-08-09  7:12 ` [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed= Peter Zijlstra
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

Since there can only be one active return_thunk, there only needs to
be one (matching) untrain_ret. It fundamentally doesn't make sense to
allow multiple untrain_ret at the same time.

Fold all three different untrain methods into a single (temporary)
helper stub.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |   19 +++++--------------
 arch/x86/lib/retpoline.S             |    7 +++++++
 2 files changed, 12 insertions(+), 14 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -272,9 +272,9 @@
 .endm
 
 #ifdef CONFIG_CPU_UNRET_ENTRY
-#define CALL_ZEN_UNTRAIN_RET	"call zen_untrain_ret"
+#define CALL_UNTRAIN_RET	"call entry_untrain_ret"
 #else
-#define CALL_ZEN_UNTRAIN_RET	""
+#define CALL_UNTRAIN_RET	""
 #endif
 
 /*
@@ -293,15 +293,10 @@
 	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
 	VALIDATE_UNRET_END
 	ALTERNATIVE_3 "",						\
-		      CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET,		\
+		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
 		      "call entry_ibpb", X86_FEATURE_ENTRY_IBPB,	\
 		      __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
 #endif
-
-#ifdef CONFIG_CPU_SRSO
-	ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
-			  "call srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
-#endif
 .endm
 
 .macro UNTRAIN_RET_FROM_CALL
@@ -309,15 +304,10 @@
 	defined(CONFIG_CALL_DEPTH_TRACKING)
 	VALIDATE_UNRET_END
 	ALTERNATIVE_3 "",						\
-		      CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET,		\
+		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
 		      "call entry_ibpb", X86_FEATURE_ENTRY_IBPB,	\
 		      __stringify(RESET_CALL_DEPTH_FROM_CALL), X86_FEATURE_CALL_DEPTH
 #endif
-
-#ifdef CONFIG_CPU_SRSO
-	ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
-			  "call srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
-#endif
 .endm
 
 
@@ -349,6 +339,7 @@ extern void zen_untrain_ret(void);
 extern void srso_untrain_ret(void);
 extern void srso_untrain_ret_alias(void);
 
+extern void entry_untrain_ret(void);
 extern void entry_ibpb(void);
 
 extern void (*x86_return_thunk)(void);
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -268,6 +268,13 @@ SYM_CODE_END(srso_safe_ret)
 SYM_FUNC_END(srso_untrain_ret)
 __EXPORT_THUNK(srso_untrain_ret)
 
+SYM_FUNC_START(entry_untrain_ret)
+	ALTERNATIVE_2 "jmp zen_untrain_ret", \
+		      "jmp srso_untrain_ret", X86_FEATURE_SRSO, \
+		      "jmp srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
+SYM_FUNC_END(entry_untrain_ret)
+__EXPORT_THUNK(entry_untrain_ret)
+
 /*
  * Both these do an unbalanced CALL to mess up the RSB, terminate with UD2
  * to indicate noreturn.




* [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (4 preceding siblings ...)
  2023-08-09  7:12 ` [RFC][PATCH 05/17] x86/cpu: Cleanup the untrain mess Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-09 13:42   ` Josh Poimboeuf
  2023-08-10 15:44   ` Borislav Petkov
  2023-08-09  7:12 ` [RFC][PATCH 07/17] x86/cpu/kvm: Provide UNTRAIN_RET_VM Peter Zijlstra
                   ` (12 subsequent siblings)
  18 siblings, 2 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

Since it is now readily apparent that the two SRSO
untrain_ret+return_thunk variants are exactly the same mechanism as
the existing (retbleed) zen untrain_ret+return_thunk, add them to the
existing retbleed options.

This avoids all confusion as to which of the three -- if any -- ought
to be active: there's a single point of control and no funny
interactions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/bugs.c |   87 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 76 insertions(+), 11 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -748,6 +748,8 @@ enum spectre_v2_mitigation spectre_v2_en
 enum retbleed_mitigation {
 	RETBLEED_MITIGATION_NONE,
 	RETBLEED_MITIGATION_UNRET,
+	RETBLEED_MITIGATION_UNRET_SRSO,
+	RETBLEED_MITIGATION_UNRET_SRSO_ALIAS,
 	RETBLEED_MITIGATION_IBPB,
 	RETBLEED_MITIGATION_IBRS,
 	RETBLEED_MITIGATION_EIBRS,
@@ -758,17 +760,21 @@ enum retbleed_mitigation_cmd {
 	RETBLEED_CMD_OFF,
 	RETBLEED_CMD_AUTO,
 	RETBLEED_CMD_UNRET,
+	RETBLEED_CMD_UNRET_SRSO,
+	RETBLEED_CMD_UNRET_SRSO_ALIAS,
 	RETBLEED_CMD_IBPB,
 	RETBLEED_CMD_STUFF,
 };
 
 static const char * const retbleed_strings[] = {
-	[RETBLEED_MITIGATION_NONE]	= "Vulnerable",
-	[RETBLEED_MITIGATION_UNRET]	= "Mitigation: untrained return thunk",
-	[RETBLEED_MITIGATION_IBPB]	= "Mitigation: IBPB",
-	[RETBLEED_MITIGATION_IBRS]	= "Mitigation: IBRS",
-	[RETBLEED_MITIGATION_EIBRS]	= "Mitigation: Enhanced IBRS",
-	[RETBLEED_MITIGATION_STUFF]	= "Mitigation: Stuffing",
+	[RETBLEED_MITIGATION_NONE]		= "Vulnerable",
+	[RETBLEED_MITIGATION_UNRET]		= "Mitigation: untrained return thunk",
+	[RETBLEED_MITIGATION_UNRET_SRSO]	= "Mitigation: srso untrained return thunk",
+	[RETBLEED_MITIGATION_UNRET_SRSO_ALIAS]	= "Mitigation: srso alias untrained return thunk",
+	[RETBLEED_MITIGATION_IBPB]		= "Mitigation: IBPB",
+	[RETBLEED_MITIGATION_IBRS]		= "Mitigation: IBRS",
+	[RETBLEED_MITIGATION_EIBRS]		= "Mitigation: Enhanced IBRS",
+	[RETBLEED_MITIGATION_STUFF]		= "Mitigation: Stuffing",
 };
 
 static enum retbleed_mitigation retbleed_mitigation __ro_after_init =
@@ -796,6 +802,10 @@ static int __init retbleed_parse_cmdline
 			retbleed_cmd = RETBLEED_CMD_AUTO;
 		} else if (!strcmp(str, "unret")) {
 			retbleed_cmd = RETBLEED_CMD_UNRET;
+		} else if (!strcmp(str, "srso")) {
+			retbleed_cmd = RETBLEED_CMD_UNRET_SRSO;
+		} else if (!strcmp(str, "srso_alias")) {
+			retbleed_cmd = RETBLEED_CMD_UNRET_SRSO_ALIAS;
 		} else if (!strcmp(str, "ibpb")) {
 			retbleed_cmd = RETBLEED_CMD_IBPB;
 		} else if (!strcmp(str, "stuff")) {
@@ -817,21 +827,54 @@ early_param("retbleed", retbleed_parse_c
 
 #define RETBLEED_UNTRAIN_MSG "WARNING: BTB untrained return thunk mitigation is only effective on AMD/Hygon!\n"
 #define RETBLEED_INTEL_MSG "WARNING: Spectre v2 mitigation leaves CPU vulnerable to RETBleed attacks, data leaks possible!\n"
+#define RETBLEED_SRSO_NOTICE "WARNING: See https://kernel.org/doc/html/latest/admin-guide/hw-vuln/srso.html for mitigation options."
 
 static void __init retbleed_select_mitigation(void)
 {
 	bool mitigate_smt = false;
+	bool has_microcode = false;
 
-	if (!boot_cpu_has_bug(X86_BUG_RETBLEED) || cpu_mitigations_off())
+	if ((!boot_cpu_has_bug(X86_BUG_RETBLEED) && !boot_cpu_has_bug(X86_BUG_SRSO)) ||
+	    cpu_mitigations_off())
 		return;
 
+	if (boot_cpu_has_bug(X86_BUG_SRSO)) {
+		has_microcode = boot_cpu_has(X86_FEATURE_IBPB_BRTYPE) || cpu_has_ibpb_brtype_microcode();
+		if (!has_microcode) {
+			pr_warn("IBPB-extending microcode not applied!\n");
+			pr_warn(RETBLEED_SRSO_NOTICE);
+		} else {
+			/*
+			 * Enable the synthetic (even if in a real CPUID leaf)
+			 * flags for guests.
+			 */
+			setup_force_cpu_cap(X86_FEATURE_IBPB_BRTYPE);
+			setup_force_cpu_cap(X86_FEATURE_SBPB);
+
+			/*
+			 * Zen1/2 with SMT off aren't vulnerable after the right
+			 * IBPB microcode has been applied.
+			 */
+			if ((boot_cpu_data.x86 < 0x19) &&
+			    (cpu_smt_control == CPU_SMT_DISABLED))
+				setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
+		}
+	}
+
 	switch (retbleed_cmd) {
 	case RETBLEED_CMD_OFF:
 		return;
 
 	case RETBLEED_CMD_UNRET:
+	case RETBLEED_CMD_UNRET_SRSO:
+	case RETBLEED_CMD_UNRET_SRSO_ALIAS:
 		if (IS_ENABLED(CONFIG_CPU_UNRET_ENTRY)) {
-			retbleed_mitigation = RETBLEED_MITIGATION_UNRET;
+			if (retbleed_cmd == RETBLEED_CMD_UNRET)
+				retbleed_mitigation = RETBLEED_MITIGATION_UNRET;
+			if (retbleed_cmd == RETBLEED_CMD_UNRET_SRSO)
+				retbleed_mitigation = RETBLEED_MITIGATION_UNRET_SRSO;
+			if (retbleed_cmd == RETBLEED_CMD_UNRET_SRSO_ALIAS)
+				retbleed_mitigation = RETBLEED_MITIGATION_UNRET_SRSO_ALIAS;
 		} else {
 			pr_err("WARNING: kernel not compiled with CPU_UNRET_ENTRY.\n");
 			goto do_cmd_auto;
@@ -843,6 +886,8 @@ static void __init retbleed_select_mitig
 			pr_err("WARNING: CPU does not support IBPB.\n");
 			goto do_cmd_auto;
 		} else if (IS_ENABLED(CONFIG_CPU_IBPB_ENTRY)) {
+			if (boot_cpu_has_bug(X86_BUG_SRSO) && !has_microcode)
+				pr_err("IBPB-extending microcode not applied; SRSO NOT mitigated\n");
 			retbleed_mitigation = RETBLEED_MITIGATION_IBPB;
 		} else {
 			pr_err("WARNING: kernel not compiled with CPU_IBPB_ENTRY.\n");
@@ -870,8 +915,17 @@ static void __init retbleed_select_mitig
 	default:
 		if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
 		    boot_cpu_data.x86_vendor == X86_VENDOR_HYGON) {
-			if (IS_ENABLED(CONFIG_CPU_UNRET_ENTRY))
-				retbleed_mitigation = RETBLEED_MITIGATION_UNRET;
+			if (IS_ENABLED(CONFIG_CPU_UNRET_ENTRY)) {
+				if (boot_cpu_has_bug(X86_BUG_RETBLEED))
+					retbleed_mitigation = RETBLEED_MITIGATION_UNRET;
+
+				if (boot_cpu_has_bug(X86_BUG_SRSO) && !boot_cpu_has(X86_FEATURE_SRSO_NO)) {
+					if (boot_cpu_data.x86 == 0x19)
+						retbleed_mitigation = RETBLEED_MITIGATION_UNRET_SRSO_ALIAS;
+					else
+						retbleed_mitigation = RETBLEED_MITIGATION_UNRET_SRSO;
+				}
+			}
 			else if (IS_ENABLED(CONFIG_CPU_IBPB_ENTRY) && boot_cpu_has(X86_FEATURE_IBPB))
 				retbleed_mitigation = RETBLEED_MITIGATION_IBPB;
 		}
@@ -886,9 +940,20 @@ static void __init retbleed_select_mitig
 	}
 
 	switch (retbleed_mitigation) {
+	case RETBLEED_MITIGATION_UNRET_SRSO_ALIAS:
+		setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);
+		x86_return_thunk = srso_alias_return_thunk;
+		goto do_rethunk;
+
+	case RETBLEED_MITIGATION_UNRET_SRSO:
+		setup_force_cpu_cap(X86_FEATURE_SRSO);
+		x86_return_thunk = srso_return_thunk;
+		goto do_rethunk;
+
 	case RETBLEED_MITIGATION_UNRET:
-		setup_force_cpu_cap(X86_FEATURE_RETHUNK);
 		setup_force_cpu_cap(X86_FEATURE_UNRET);
+do_rethunk:
+		setup_force_cpu_cap(X86_FEATURE_RETHUNK);
 
 		if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
 		    boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)




* [RFC][PATCH 07/17] x86/cpu/kvm: Provide UNTRAIN_RET_VM
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (5 preceding siblings ...)
  2023-08-09  7:12 ` [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed= Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-09 13:50   ` Josh Poimboeuf
  2023-08-13 10:36   ` Borislav Petkov
  2023-08-09  7:12 ` [RFC][PATCH 08/17] x86/cpu: Add IBPB on VMEXIT to retbleed= Peter Zijlstra
                   ` (11 subsequent siblings)
  18 siblings, 2 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

Provide UNTRAIN_RET_VM, which differs from UNTRAIN_RET only in using
X86_FEATURE_IBPB_ON_VMEXIT instead of X86_FEATURE_ENTRY_IBPB.

This cures the VMEXIT path potentially doing both unret and IBPB, or a
double IBPB.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |   11 +++++++++++
 arch/x86/kernel/cpu/bugs.c           |   17 ++++++++++++++++-
 arch/x86/kvm/svm/vmenter.S           |    7 ++-----
 3 files changed, 29 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -299,6 +299,17 @@
 #endif
 .endm
 
+.macro UNTRAIN_RET_VM
+#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
+	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
+	VALIDATE_UNRET_END
+	ALTERNATIVE_3 "",						\
+		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
+		      "call entry_ibpb", X86_FEATURE_IBPB_ON_VMEXIT,	\
+		      __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
+#endif
+.endm
+
 .macro UNTRAIN_RET_FROM_CALL
 #if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
 	defined(CONFIG_CALL_DEPTH_TRACKING)
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -222,10 +222,7 @@ SYM_FUNC_START(__svm_vcpu_run)
 	 * because interrupt handlers won't sanitize 'ret' if the return is
 	 * from the kernel.
 	 */
-	UNTRAIN_RET
-
-	/* SRSO */
-	ALTERNATIVE "", "call entry_ibpb", X86_FEATURE_IBPB_ON_VMEXIT
+	UNTRAIN_RET_VM
 
 	/*
 	 * Clear all general purpose registers except RSP and RAX to prevent
@@ -362,7 +359,7 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	 * because interrupt handlers won't sanitize RET if the return is
 	 * from the kernel.
 	 */
-	UNTRAIN_RET
+	UNTRAIN_RET_VM
 
 	/* "Pop" @spec_ctrl_intercepted.  */
 	pop %_ASM_BX




* [RFC][PATCH 08/17] x86/cpu: Add IBPB on VMEXIT to retbleed=
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (6 preceding siblings ...)
  2023-08-09  7:12 ` [RFC][PATCH 07/17] x86/cpu/kvm: Provide UNTRAIN_RET_VM Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-09  7:12 ` [RFC][PATCH 09/17] x86: Remove CONFIG_CPU_SRSO Peter Zijlstra
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

Since IBPB-on-VMEXIT is an obvious variant of retbleed=ibpb, add it
as such.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/bugs.c |   17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -751,6 +751,7 @@ enum retbleed_mitigation {
 	RETBLEED_MITIGATION_UNRET_SRSO,
 	RETBLEED_MITIGATION_UNRET_SRSO_ALIAS,
 	RETBLEED_MITIGATION_IBPB,
+	RETBLEED_MITIGATION_IBPB_VMEXIT,
 	RETBLEED_MITIGATION_IBRS,
 	RETBLEED_MITIGATION_EIBRS,
 	RETBLEED_MITIGATION_STUFF,
@@ -763,6 +764,7 @@ enum retbleed_mitigation_cmd {
 	RETBLEED_CMD_UNRET_SRSO,
 	RETBLEED_CMD_UNRET_SRSO_ALIAS,
 	RETBLEED_CMD_IBPB,
+	RETBLEED_CMD_IBPB_VMEXIT,
 	RETBLEED_CMD_STUFF,
 };
 
@@ -772,6 +774,7 @@ static const char * const retbleed_strin
 	[RETBLEED_MITIGATION_UNRET_SRSO]	= "Mitigation: srso untrained return thunk",
 	[RETBLEED_MITIGATION_UNRET_SRSO_ALIAS]	= "Mitigation: srso alias untrained return thunk",
 	[RETBLEED_MITIGATION_IBPB]		= "Mitigation: IBPB",
+	[RETBLEED_MITIGATION_IBPB_VMEXIT]	= "Mitigation: IBPB on VMEXIT only",
 	[RETBLEED_MITIGATION_IBRS]		= "Mitigation: IBRS",
 	[RETBLEED_MITIGATION_EIBRS]		= "Mitigation: Enhanced IBRS",
 	[RETBLEED_MITIGATION_STUFF]		= "Mitigation: Stuffing",
@@ -808,6 +811,8 @@ static int __init retbleed_parse_cmdline
 			retbleed_cmd = RETBLEED_CMD_UNRET_SRSO_ALIAS;
 		} else if (!strcmp(str, "ibpb")) {
 			retbleed_cmd = RETBLEED_CMD_IBPB;
+		} else if (!strcmp(str, "ibpb_vmexit")) {
+			retbleed_cmd = RETBLEED_CMD_IBPB_VMEXIT;
 		} else if (!strcmp(str, "stuff")) {
 			retbleed_cmd = RETBLEED_CMD_STUFF;
 		} else if (!strcmp(str, "nosmt")) {
@@ -881,13 +886,17 @@ static void __init retbleed_select_mitig
 		break;
 
 	case RETBLEED_CMD_IBPB:
+	case RETBLEED_CMD_IBPB_VMEXIT:
 		if (!boot_cpu_has(X86_FEATURE_IBPB)) {
 			pr_err("WARNING: CPU does not support IBPB.\n");
 			goto do_cmd_auto;
 		} else if (IS_ENABLED(CONFIG_CPU_IBPB_ENTRY)) {
 			if (boot_cpu_has_bug(X86_BUG_SRSO) && !has_microcode)
 				pr_err("IBPB-extending microcode not applied; SRSO NOT mitigated\n");
-			retbleed_mitigation = RETBLEED_MITIGATION_IBPB;
+			if (retbleed_cmd == RETBLEED_CMD_IBPB)
+				retbleed_mitigation = RETBLEED_MITIGATION_IBPB;
+			if (retbleed_cmd == RETBLEED_CMD_IBPB_VMEXIT)
+				retbleed_mitigation = RETBLEED_MITIGATION_IBPB_VMEXIT;
 		} else {
 			pr_err("WARNING: kernel not compiled with CPU_IBPB_ENTRY.\n");
 			goto do_cmd_auto;
@@ -961,6 +970,12 @@ static void __init retbleed_select_mitig
 
 	case RETBLEED_MITIGATION_IBPB:
 		setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);
+		setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
+		mitigate_smt = true;
+		break;
+
+	case RETBLEED_MITIGATION_IBPB_VMEXIT:
+		setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
 		mitigate_smt = true;
 		break;
 




* [RFC][PATCH 09/17] x86: Remove CONFIG_CPU_SRSO
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (7 preceding siblings ...)
  2023-08-09  7:12 ` [RFC][PATCH 08/17] x86/cpu: Add IBPB on VMEXIT to retbleed= Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-09 13:57   ` Josh Poimboeuf
  2023-08-09  7:12 ` [RFC][PATCH 10/17] x86: Remove CPU_IBPB_ENTRY Peter Zijlstra
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

It doesn't do anything useful anymore (it never really did); remove
it, and while at it, put all the untrain and return thunk mess under
CPU_UNRET_ENTRY.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/Kconfig                     |    7 -------
 arch/x86/include/asm/nospec-branch.h |    9 ++++-----
 arch/x86/kernel/vmlinux.lds.S        |    9 +++------
 arch/x86/lib/retpoline.S             |   23 +++++++++++++++++------
 4 files changed, 24 insertions(+), 24 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2593,13 +2593,6 @@ config CPU_IBRS_ENTRY
 	  This mitigates both spectre_v2 and retbleed at great cost to
 	  performance.
 
-config CPU_SRSO
-	bool "Mitigate speculative RAS overflow on AMD"
-	depends on CPU_SUP_AMD && X86_64 && RETHUNK
-	default y
-	help
-	  Enable the SRSO mitigation needed on AMD Zen1-4 machines.
-
 config SLS
 	bool "Mitigate Straight-Line-Speculation"
 	depends on CC_HAS_SLS && X86_64
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -211,8 +211,7 @@
  * eventually turn into it's own annotation.
  */
 .macro VALIDATE_UNRET_END
-#if defined(CONFIG_NOINSTR_VALIDATION) && \
-	(defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_SRSO))
+#if defined(CONFIG_NOINSTR_VALIDATION) && defined(CONFIG_CPU_UNRET_ENTRY)
 	ANNOTATE_RETPOLINE_SAFE
 	nop
 #endif
@@ -290,7 +289,7 @@
  */
 .macro UNTRAIN_RET
 #if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
-	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
+    defined(CONFIG_CALL_DEPTH_TRACKING)
 	VALIDATE_UNRET_END
 	ALTERNATIVE_3 "",						\
 		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
@@ -301,7 +300,7 @@
 
 .macro UNTRAIN_RET_VM
 #if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
-	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
+    defined(CONFIG_CALL_DEPTH_TRACKING)
 	VALIDATE_UNRET_END
 	ALTERNATIVE_3 "",						\
 		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
@@ -312,7 +311,7 @@
 
 .macro UNTRAIN_RET_FROM_CALL
 #if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
-	defined(CONFIG_CALL_DEPTH_TRACKING)
+    defined(CONFIG_CALL_DEPTH_TRACKING)
 	VALIDATE_UNRET_END
 	ALTERNATIVE_3 "",						\
 		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -141,13 +141,13 @@ SECTIONS
 		STATIC_CALL_TEXT
 
 		ALIGN_ENTRY_TEXT_BEGIN
-#ifdef CONFIG_CPU_SRSO
+#ifdef CONFIG_CPU_UNRET_ENTRY
 		*(.text.__x86.rethunk_untrain)
 #endif
 
 		ENTRY_TEXT
 
-#ifdef CONFIG_CPU_SRSO
+#ifdef CONFIG_CPU_UNRET_ENTRY
 		/*
 		 * See the comment above srso_untrain_ret_alias()'s
 		 * definition.
@@ -522,12 +522,9 @@ INIT_PER_CPU(irq_stack_backing_store);
            "fixed_percpu_data is not at start of per-cpu area");
 #endif
 
-#ifdef CONFIG_RETHUNK
+#ifdef CONFIG_CPU_UNRET_ENTRY
 . = ASSERT((__x86_return_thunk & 0x3f) == 0, "__x86_return_thunk not cacheline-aligned");
 . = ASSERT((srso_safe_ret & 0x3f) == 0, "srso_safe_ret not cacheline-aligned");
-#endif
-
-#ifdef CONFIG_CPU_SRSO
 /*
  * GNU ld cannot do XOR so do: (A | B) - (A & B) in order to compute the XOR
  * of the two function addresses:
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -132,6 +132,7 @@ SYM_CODE_END(__x86_indirect_jump_thunk_a
  */
 #ifdef CONFIG_RETHUNK
 
+#ifdef CONFIG_CPU_UNRET_ENTRY
 /*
  * srso_untrain_ret_alias() and srso_safe_ret_alias() are placed at
  * special addresses:
@@ -147,7 +148,6 @@ SYM_CODE_END(__x86_indirect_jump_thunk_a
  *
  * As a result, srso_safe_ret_alias() becomes a safe return.
  */
-#ifdef CONFIG_CPU_SRSO
 	.section .text.__x86.rethunk_untrain
 
 SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
@@ -160,14 +160,11 @@ SYM_FUNC_END(srso_untrain_ret_alias)
 __EXPORT_THUNK(srso_untrain_ret_alias)
 
 	.section .text.__x86.rethunk_safe
-#endif
 
 /* Needs a definition for the __x86_return_thunk alternative below. */
 SYM_START(srso_safe_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
-#ifdef CONFIG_CPU_SRSO
 	add $8, %_ASM_SP
 	UNWIND_HINT_FUNC
-#endif
 	ANNOTATE_UNRET_SAFE
 	ret
 	int3
@@ -236,8 +233,6 @@ SYM_CODE_END(__x86_return_thunk)
 SYM_FUNC_END(zen_untrain_ret)
 __EXPORT_THUNK(zen_untrain_ret)
 
-EXPORT_SYMBOL(__x86_return_thunk)
-
 /*
  * SRSO untraining sequence for Zen1/2, similar to zen_untrain_ret()
  * above. On kernel entry, srso_untrain_ret() is executed which is a
@@ -293,6 +288,22 @@ SYM_CODE_START(srso_alias_return_thunk)
 	ud2
 SYM_CODE_END(srso_alias_return_thunk)
 
+#else /* CONFIG_CPU_UNRET_ENTRY */
+
+	.section .text.__x86.return_thunk
+
+SYM_CODE_START(__x86_return_thunk)
+	UNWIND_HINT_FUNC
+	ANNOTATE_NOENDBR
+	ANNOTATE_UNRET_SAFE
+	ret
+	int3
+SYM_CODE_END(__x86_return_thunk)
+
+#endif /* CONFIG_CPU_UNRET_ENTRY */
+
+__EXPORT_THUNK(__x86_return_thunk)
+
 #endif /* CONFIG_RETHUNK */
 
 #ifdef CONFIG_CALL_DEPTH_TRACKING




* [RFC][PATCH 10/17] x86: Remove CPU_IBPB_ENTRY
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (8 preceding siblings ...)
  2023-08-09  7:12 ` [RFC][PATCH 09/17] x86: Remove CONFIG_CPU_SRSO Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-09  7:12 ` [RFC][PATCH 11/17] x86/cpu: Remove all SRSO interface nonsense Peter Zijlstra
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

It was a pointless .config knob; it didn't even cause entry_ibpb to
be omitted from the build.

Our Kconfig space is definitely too big to carry pointless ones.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/Kconfig                     |    7 -------
 arch/x86/include/asm/nospec-branch.h |    9 +++------
 arch/x86/kernel/cpu/bugs.c           |   18 +++++-------------
 3 files changed, 8 insertions(+), 26 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2577,13 +2577,6 @@ config CALL_THUNKS_DEBUG
 	  Only enable this when you are debugging call thunks as this
 	  creates a noticeable runtime overhead. If unsure say N.
 
-config CPU_IBPB_ENTRY
-	bool "Enable IBPB on kernel entry"
-	depends on CPU_SUP_AMD && X86_64
-	default y
-	help
-	  Compile the kernel with support for the retbleed=ibpb mitigation.
-
 config CPU_IBRS_ENTRY
 	bool "Enable IBRS on kernel entry"
 	depends on CPU_SUP_INTEL && X86_64
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -288,8 +288,7 @@
  * where we have a stack but before any RET instruction.
  */
 .macro UNTRAIN_RET
-#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
-    defined(CONFIG_CALL_DEPTH_TRACKING)
+#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CALL_DEPTH_TRACKING)
 	VALIDATE_UNRET_END
 	ALTERNATIVE_3 "",						\
 		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
@@ -299,8 +298,7 @@
 .endm
 
 .macro UNTRAIN_RET_VM
-#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
-    defined(CONFIG_CALL_DEPTH_TRACKING)
+#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CALL_DEPTH_TRACKING)
 	VALIDATE_UNRET_END
 	ALTERNATIVE_3 "",						\
 		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
@@ -310,8 +308,7 @@
 .endm
 
 .macro UNTRAIN_RET_FROM_CALL
-#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
-    defined(CONFIG_CALL_DEPTH_TRACKING)
+#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CALL_DEPTH_TRACKING)
 	VALIDATE_UNRET_END
 	ALTERNATIVE_3 "",						\
 		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -890,16 +890,13 @@ static void __init retbleed_select_mitig
 		if (!boot_cpu_has(X86_FEATURE_IBPB)) {
 			pr_err("WARNING: CPU does not support IBPB.\n");
 			goto do_cmd_auto;
-		} else if (IS_ENABLED(CONFIG_CPU_IBPB_ENTRY)) {
+		} else {
 			if (boot_cpu_has_bug(X86_BUG_SRSO) && !has_microcode)
 				pr_err("IBPB-extending microcode not applied; SRSO NOT mitigated\n");
 			if (retbleed_cmd == RETBLEED_CMD_IBPB)
 				retbleed_mitigation = RETBLEED_MITIGATION_IBPB;
 			if (retbleed_cmd == RETBLEED_CMD_IBPB_VMEXIT)
 				retbleed_mitigation = RETBLEED_MITIGATION_IBPB_VMEXIT;
-		} else {
-			pr_err("WARNING: kernel not compiled with CPU_IBPB_ENTRY.\n");
-			goto do_cmd_auto;
 		}
 		break;
 
@@ -932,7 +929,7 @@ static void __init retbleed_select_mitig
 						retbleed_mitigation = RETBLEED_MITIGATION_UNRET_SRSO;
 				}
 			}
-			else if (IS_ENABLED(CONFIG_CPU_IBPB_ENTRY) && boot_cpu_has(X86_FEATURE_IBPB))
+			else if (boot_cpu_has(X86_FEATURE_IBPB))
 				retbleed_mitigation = RETBLEED_MITIGATION_IBPB;
 		}
 
@@ -2397,14 +2394,9 @@ static void __init srso_select_mitigatio
 		break;
 
 	case SRSO_CMD_IBPB:
-		if (IS_ENABLED(CONFIG_CPU_IBPB_ENTRY)) {
-			if (has_microcode) {
-				setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);
-				srso_mitigation = SRSO_MITIGATION_IBPB;
-			}
-		} else {
-			pr_err("WARNING: kernel not compiled with CPU_IBPB_ENTRY.\n");
-			goto pred_cmd;
+		if (has_microcode) {
+			setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);
+			srso_mitigation = SRSO_MITIGATION_IBPB;
 		}
 		break;
 




* [RFC][PATCH 11/17] x86/cpu: Remove all SRSO interface nonsense
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (9 preceding siblings ...)
  2023-08-09  7:12 ` [RFC][PATCH 10/17] x86: Remove CPU_IBPB_ENTRY Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-09 13:10   ` Andrew.Cooper3
                     ` (2 more replies)
  2023-08-09  7:12 ` [RFC][PATCH 12/17] x86/cpu: Rename original retbleed return thunk Peter Zijlstra
                   ` (7 subsequent siblings)
  18 siblings, 3 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

Now that retbleed= can do everything the srso= knob did, without the
dubious interactions between the two selections, remove it.
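
SRSO state is then reported through the existing retbleed sysfs file,
e.g. (illustrative output only):

	$ cat /sys/devices/system/cpu/vulnerabilities/retbleed
	Mitigation: srso untrained return thunk; SMT enabled with STIBP protection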

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/bugs.c |  188 ++-------------------------------------------
 drivers/base/cpu.c         |    8 -
 include/linux/cpu.h        |    2 
 3 files changed, 10 insertions(+), 188 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -47,7 +47,6 @@ static void __init taa_select_mitigation
 static void __init mmio_select_mitigation(void);
 static void __init srbds_select_mitigation(void);
 static void __init l1d_flush_select_mitigation(void);
-static void __init srso_select_mitigation(void);
 
 /* The base value of the SPEC_CTRL MSR without task-specific bits set */
 u64 x86_spec_ctrl_base;
@@ -164,7 +163,6 @@ void __init cpu_select_mitigations(void)
 	md_clear_select_mitigation();
 	srbds_select_mitigation();
 	l1d_flush_select_mitigation();
-	srso_select_mitigation();
 }
 
 /*
@@ -2267,164 +2265,6 @@ static int __init l1tf_cmdline(char *str
 early_param("l1tf", l1tf_cmdline);
 
 #undef pr_fmt
-#define pr_fmt(fmt)	"Speculative Return Stack Overflow: " fmt
-
-enum srso_mitigation {
-	SRSO_MITIGATION_NONE,
-	SRSO_MITIGATION_MICROCODE,
-	SRSO_MITIGATION_SAFE_RET,
-	SRSO_MITIGATION_IBPB,
-	SRSO_MITIGATION_IBPB_ON_VMEXIT,
-};
-
-enum srso_mitigation_cmd {
-	SRSO_CMD_OFF,
-	SRSO_CMD_MICROCODE,
-	SRSO_CMD_SAFE_RET,
-	SRSO_CMD_IBPB,
-	SRSO_CMD_IBPB_ON_VMEXIT,
-};
-
-static const char * const srso_strings[] = {
-	[SRSO_MITIGATION_NONE]           = "Vulnerable",
-	[SRSO_MITIGATION_MICROCODE]      = "Mitigation: microcode",
-	[SRSO_MITIGATION_SAFE_RET]	 = "Mitigation: safe RET",
-	[SRSO_MITIGATION_IBPB]		 = "Mitigation: IBPB",
-	[SRSO_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT only"
-};
-
-static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
-static enum srso_mitigation_cmd srso_cmd __ro_after_init = SRSO_CMD_SAFE_RET;
-
-static int __init srso_parse_cmdline(char *str)
-{
-	if (!str)
-		return -EINVAL;
-
-	if (!strcmp(str, "off"))
-		srso_cmd = SRSO_CMD_OFF;
-	else if (!strcmp(str, "microcode"))
-		srso_cmd = SRSO_CMD_MICROCODE;
-	else if (!strcmp(str, "safe-ret"))
-		srso_cmd = SRSO_CMD_SAFE_RET;
-	else if (!strcmp(str, "ibpb"))
-		srso_cmd = SRSO_CMD_IBPB;
-	else if (!strcmp(str, "ibpb-vmexit"))
-		srso_cmd = SRSO_CMD_IBPB_ON_VMEXIT;
-	else
-		pr_err("Ignoring unknown SRSO option (%s).", str);
-
-	return 0;
-}
-early_param("spec_rstack_overflow", srso_parse_cmdline);
-
-#define SRSO_NOTICE "WARNING: See https://kernel.org/doc/html/latest/admin-guide/hw-vuln/srso.html for mitigation options."
-
-static void __init srso_select_mitigation(void)
-{
-	bool has_microcode;
-
-	if (!boot_cpu_has_bug(X86_BUG_SRSO) || cpu_mitigations_off())
-		goto pred_cmd;
-
-	/*
-	 * The first check is for the kernel running as a guest in order
-	 * for guests to verify whether IBPB is a viable mitigation.
-	 */
-	has_microcode = boot_cpu_has(X86_FEATURE_IBPB_BRTYPE) || cpu_has_ibpb_brtype_microcode();
-	if (!has_microcode) {
-		pr_warn("IBPB-extending microcode not applied!\n");
-		pr_warn(SRSO_NOTICE);
-	} else {
-		/*
-		 * Enable the synthetic (even if in a real CPUID leaf)
-		 * flags for guests.
-		 */
-		setup_force_cpu_cap(X86_FEATURE_IBPB_BRTYPE);
-		setup_force_cpu_cap(X86_FEATURE_SBPB);
-
-		/*
-		 * Zen1/2 with SMT off aren't vulnerable after the right
-		 * IBPB microcode has been applied.
-		 */
-		if ((boot_cpu_data.x86 < 0x19) &&
-		    (cpu_smt_control == CPU_SMT_DISABLED))
-			setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
-	}
-
-	if (retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
-		if (has_microcode) {
-			pr_err("Retbleed IBPB mitigation enabled, using same for SRSO\n");
-			srso_mitigation = SRSO_MITIGATION_IBPB;
-			goto pred_cmd;
-		}
-	}
-
-	switch (srso_cmd) {
-	case SRSO_CMD_OFF:
-		return;
-
-	case SRSO_CMD_MICROCODE:
-		if (has_microcode) {
-			srso_mitigation = SRSO_MITIGATION_MICROCODE;
-			pr_warn(SRSO_NOTICE);
-		}
-		break;
-
-	case SRSO_CMD_SAFE_RET:
-		if (IS_ENABLED(CONFIG_CPU_SRSO)) {
-			/*
-			 * Enable the return thunk for generated code
-			 * like ftrace, static_call, etc.
-			 */
-			setup_force_cpu_cap(X86_FEATURE_RETHUNK);
-
-			if (boot_cpu_data.x86 == 0x19) {
-				setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);
-				x86_return_thunk = srso_alias_return_thunk;
-			} else {
-				setup_force_cpu_cap(X86_FEATURE_SRSO);
-				x86_return_thunk = srso_return_thunk;
-			}
-			srso_mitigation = SRSO_MITIGATION_SAFE_RET;
-		} else {
-			pr_err("WARNING: kernel not compiled with CPU_SRSO.\n");
-			goto pred_cmd;
-		}
-		break;
-
-	case SRSO_CMD_IBPB:
-		if (has_microcode) {
-			setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);
-			srso_mitigation = SRSO_MITIGATION_IBPB;
-		}
-		break;
-
-	case SRSO_CMD_IBPB_ON_VMEXIT:
-		if (IS_ENABLED(CONFIG_CPU_SRSO)) {
-			if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
-				setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
-				srso_mitigation = SRSO_MITIGATION_IBPB_ON_VMEXIT;
-			}
-		} else {
-			pr_err("WARNING: kernel not compiled with CPU_SRSO.\n");
-			goto pred_cmd;
-                }
-		break;
-
-	default:
-		break;
-	}
-
-	pr_info("%s%s\n", srso_strings[srso_mitigation], (has_microcode ? "" : ", no microcode"));
-
-pred_cmd:
-	if (boot_cpu_has(X86_FEATURE_SRSO_NO) ||
-	    srso_cmd == SRSO_CMD_OFF)
-		x86_pred_cmd = PRED_CMD_SBPB;
-}
-
-#undef pr_fmt
 #define pr_fmt(fmt) fmt
 
 #ifdef CONFIG_SYSFS
@@ -2607,26 +2447,26 @@ static ssize_t srbds_show_state(char *bu
 static ssize_t retbleed_show_state(char *buf)
 {
 	if (retbleed_mitigation == RETBLEED_MITIGATION_UNRET ||
+	    retbleed_mitigation == RETBLEED_MITIGATION_UNRET_SRSO ||
+	    retbleed_mitigation == RETBLEED_MITIGATION_UNRET_SRSO_ALIAS ||
 	    retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
+
 		if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
 		    boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
 			return sysfs_emit(buf, "Vulnerable: untrained return thunk / IBPB on non-AMD based uarch\n");
 
-		return sysfs_emit(buf, "%s; SMT %s\n", retbleed_strings[retbleed_mitigation],
+		return sysfs_emit(buf, "%s; SMT %s%s\n", retbleed_strings[retbleed_mitigation],
 				  !sched_smt_active() ? "disabled" :
 				  spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT ||
 				  spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED ?
-				  "enabled with STIBP protection" : "vulnerable");
-	}
+				  "enabled with STIBP protection" : "vulnerable",
+				  cpu_has_ibpb_brtype_microcode() ? "" : ", no SRSO microcode");
 
-	return sysfs_emit(buf, "%s\n", retbleed_strings[retbleed_mitigation]);
-}
+	}
 
-static ssize_t srso_show_state(char *buf)
-{
-	return sysfs_emit(buf, "%s%s\n",
-			  srso_strings[srso_mitigation],
-			  (cpu_has_ibpb_brtype_microcode() ? "" : ", no microcode"));
+	return sysfs_emit(buf, "%s%s\n", retbleed_strings[retbleed_mitigation],
+			  (boot_cpu_data.x86_vendor != X86_VENDOR_AMD ||
+			   cpu_has_ibpb_brtype_microcode()) ? "" : ", no SRSO microcode");
 }
 
 static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
@@ -2678,9 +2518,6 @@ static ssize_t cpu_show_common(struct de
 	case X86_BUG_RETBLEED:
 		return retbleed_show_state(buf);
 
-	case X86_BUG_SRSO:
-		return srso_show_state(buf);
-
 	default:
 		break;
 	}
@@ -2745,9 +2582,4 @@ ssize_t cpu_show_retbleed(struct device
 {
 	return cpu_show_common(dev, attr, buf, X86_BUG_RETBLEED);
 }
-
-ssize_t cpu_show_spec_rstack_overflow(struct device *dev, struct device_attribute *attr, char *buf)
-{
-	return cpu_show_common(dev, attr, buf, X86_BUG_SRSO);
-}
 #endif
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -577,12 +577,6 @@ ssize_t __weak cpu_show_retbleed(struct
 	return sysfs_emit(buf, "Not affected\n");
 }
 
-ssize_t __weak cpu_show_spec_rstack_overflow(struct device *dev,
-					     struct device_attribute *attr, char *buf)
-{
-	return sysfs_emit(buf, "Not affected\n");
-}
-
 static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
 static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
 static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
@@ -594,7 +588,6 @@ static DEVICE_ATTR(itlb_multihit, 0444,
 static DEVICE_ATTR(srbds, 0444, cpu_show_srbds, NULL);
 static DEVICE_ATTR(mmio_stale_data, 0444, cpu_show_mmio_stale_data, NULL);
 static DEVICE_ATTR(retbleed, 0444, cpu_show_retbleed, NULL);
-static DEVICE_ATTR(spec_rstack_overflow, 0444, cpu_show_spec_rstack_overflow, NULL);
 
 static struct attribute *cpu_root_vulnerabilities_attrs[] = {
 	&dev_attr_meltdown.attr,
@@ -608,7 +601,6 @@ static struct attribute *cpu_root_vulner
 	&dev_attr_srbds.attr,
 	&dev_attr_mmio_stale_data.attr,
 	&dev_attr_retbleed.attr,
-	&dev_attr_spec_rstack_overflow.attr,
 	NULL
 };
 
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -70,8 +70,6 @@ extern ssize_t cpu_show_mmio_stale_data(
 					char *buf);
 extern ssize_t cpu_show_retbleed(struct device *dev,
 				 struct device_attribute *attr, char *buf);
-extern ssize_t cpu_show_spec_rstack_overflow(struct device *dev,
-					     struct device_attribute *attr, char *buf);
 
 extern __printf(4, 5)
 struct device *cpu_device_create(struct device *parent, void *drvdata,




* [RFC][PATCH 12/17] x86/cpu: Rename original retbleed return thunk
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (10 preceding siblings ...)
  2023-08-09  7:12 ` [RFC][PATCH 11/17] x86/cpu: Remove all SRSO interface nonsense Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-09 14:20   ` Josh Poimboeuf
  2023-08-09  7:12 ` [RFC][PATCH 13/17] objtool/x86: Add arch_is_offset_insn() Peter Zijlstra
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

Rename the original retbleed return thunk from __x86_return_thunk to
zen_return_thunk, matching zen_untrain_ret.

Pull the dummy __x86_return_thunk from the !CPU_UNRET_ENTRY case and
explicitly set zen_return_thunk in the retbleed=unret case.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |    2 ++
 arch/x86/kernel/cpu/bugs.c           |    1 +
 arch/x86/kernel/vmlinux.lds.S        |    2 +-
 arch/x86/lib/retpoline.S             |   25 +++++++++++--------------
 tools/objtool/check.c                |    9 +++++++--
 5 files changed, 22 insertions(+), 17 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -339,6 +339,8 @@ extern retpoline_thunk_t __x86_indirect_
 extern retpoline_thunk_t __x86_indirect_jump_thunk_array[];
 
 extern void __x86_return_thunk(void);
+
+extern void zen_return_thunk(void);
 extern void srso_return_thunk(void);
 extern void srso_alias_return_thunk(void);
 
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -953,6 +953,7 @@ static void __init retbleed_select_mitig
 
 	case RETBLEED_MITIGATION_UNRET:
 		setup_force_cpu_cap(X86_FEATURE_UNRET);
+		x86_return_thunk = zen_return_thunk;
 do_rethunk:
 		setup_force_cpu_cap(X86_FEATURE_RETHUNK);
 
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -523,7 +523,7 @@ INIT_PER_CPU(irq_stack_backing_store);
 #endif
 
 #ifdef CONFIG_CPU_UNRET_ENTRY
-. = ASSERT((__x86_return_thunk & 0x3f) == 0, "__x86_return_thunk not cacheline-aligned");
+. = ASSERT((zen_return_thunk & 0x3f) == 0, "zen_return_thunk not cacheline-aligned");
 . = ASSERT((srso_safe_ret & 0x3f) == 0, "srso_safe_ret not cacheline-aligned");
 /*
  * GNU ld cannot do XOR so do: (A | B) - (A & B) in order to compute the XOR
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -161,7 +161,7 @@ __EXPORT_THUNK(srso_untrain_ret_alias)
 
 	.section .text.__x86.rethunk_safe
 
-/* Needs a definition for the __x86_return_thunk alternative below. */
+/* Needs a definition for the zen_return_thunk alternative below. */
 SYM_START(srso_safe_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
 	add $8, %_ASM_SP
 	UNWIND_HINT_FUNC
@@ -174,7 +174,7 @@ SYM_FUNC_END(srso_safe_ret_alias)
 
 /*
  * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
- * 1) The RET at __x86_return_thunk must be on a 64 byte boundary, for
+ * 1) The RET at zen_return_thunk must be on a 64 byte boundary, for
  *    alignment within the BTB.
  * 2) The instruction at zen_untrain_ret must contain, and not
  *    end with, the 0xc3 byte of the RET.
@@ -182,7 +182,7 @@ SYM_FUNC_END(srso_safe_ret_alias)
  *    from re-poisioning the BTB prediction.
  */
 	.align 64
-	.skip 64 - (__x86_return_thunk - zen_untrain_ret), 0xcc
+	.skip 64 - (zen_return_thunk - zen_untrain_ret), 0xcc
 SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_NOENDBR
 	/*
@@ -190,16 +190,16 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL,
 	 *
 	 *   TEST $0xcc, %bl
 	 *   LFENCE
-	 *   JMP __x86_return_thunk
+	 *   JMP zen_return_thunk
 	 *
 	 * Executing the TEST instruction has a side effect of evicting any BTB
 	 * prediction (potentially attacker controlled) attached to the RET, as
-	 * __x86_return_thunk + 1 isn't an instruction boundary at the moment.
+	 * zen_return_thunk + 1 isn't an instruction boundary at the moment.
 	 */
 	.byte	0xf6
 
 	/*
-	 * As executed from __x86_return_thunk, this is a plain RET.
+	 * As executed from zen_return_thunk, this is a plain RET.
 	 *
 	 * As part of the TEST above, RET is the ModRM byte, and INT3 the imm8.
 	 *
@@ -211,13 +211,13 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL,
 	 * With SMT enabled and STIBP active, a sibling thread cannot poison
 	 * RET's prediction to a type of its choice, but can evict the
 	 * prediction due to competitive sharing. If the prediction is
-	 * evicted, __x86_return_thunk will suffer Straight Line Speculation
+	 * evicted, zen_return_thunk will suffer Straight Line Speculation
 	 * which will be contained safely by the INT3.
 	 */
-SYM_INNER_LABEL(__x86_return_thunk, SYM_L_GLOBAL)
+SYM_INNER_LABEL(zen_return_thunk, SYM_L_GLOBAL)
 	ret
 	int3
-SYM_CODE_END(__x86_return_thunk)
+SYM_CODE_END(zen_return_thunk)
 
 	/*
 	 * Ensure the TEST decoding / BTB invalidation is complete.
@@ -228,7 +228,7 @@ SYM_CODE_END(__x86_return_thunk)
 	 * Jump back and execute the RET in the middle of the TEST instruction.
 	 * INT3 is for SLS protection.
 	 */
-	jmp __x86_return_thunk
+	jmp zen_return_thunk
 	int3
 SYM_FUNC_END(zen_untrain_ret)
 __EXPORT_THUNK(zen_untrain_ret)
@@ -288,7 +288,7 @@ SYM_CODE_START(srso_alias_return_thunk)
 	ud2
 SYM_CODE_END(srso_alias_return_thunk)
 
-#else /* CONFIG_CPU_UNRET_ENTRY */
+#endif /* CONFIG_CPU_UNRET_ENTRY */
 
 	.section .text.__x86.return_thunk
 
@@ -299,9 +299,6 @@ SYM_CODE_START(__x86_return_thunk)
 	ret
 	int3
 SYM_CODE_END(__x86_return_thunk)
-
-#endif /* CONFIG_CPU_UNRET_ENTRY */
-
 __EXPORT_THUNK(__x86_return_thunk)
 
 #endif /* CONFIG_RETHUNK */
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -455,7 +455,12 @@ static int decode_instructions(struct ob
 				return -1;
 			}
 
-			if (func->return_thunk || !strcmp(func->name, "srso_safe_ret") || func->alias != func)
+			/*
+			 * Both zen_return_thunk() and srso_safe_ret() are embedded inside
+			 * another instruction and objtool doesn't grok that. Skip validating them.
+			 */
+			if (!strcmp(func->name, "zen_return_thunk") ||
+			    !strcmp(func->name, "srso_safe_ret") || func->alias != func)
 				continue;
 
 			if (!find_insn(file, sec, func->offset)) {
@@ -1583,7 +1588,7 @@ static int add_jump_destinations(struct
 			 * middle of another instruction.  Objtool only
 			 * knows about the outer instruction.
 			 */
-			if (sym && sym->return_thunk) {
+			if (sym && !strcmp(sym->name, "zen_return_thunk")) {
 				add_return_call(file, insn, false);
 				continue;
 			}



^ permalink raw reply	[flat|nested] 80+ messages in thread

* [RFC][PATCH 13/17] objtool/x86: Add arch_is_offset_insn()
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (11 preceding siblings ...)
  2023-08-09  7:12 ` [RFC][PATCH 12/17] x86/cpu: Rename original retbleed return thunk Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-09  9:56   ` Nikolay Borisov
  2023-08-09 14:34   ` Josh Poimboeuf
  2023-08-09  7:12 ` [RFC][PATCH 14/17] objtool: Add comments to the arch_is_$foo() magic symbols Peter Zijlstra
                   ` (5 subsequent siblings)
  18 siblings, 2 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

Add a little wrappery to identify the magic symbols that are actually
inside another instruction -- yay for variable length instruction
encoding.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/arch/x86/decode.c      |    6 ++++++
 tools/objtool/check.c                |   13 ++++++++++---
 tools/objtool/include/objtool/arch.h |    1 +
 tools/objtool/include/objtool/elf.h  |    1 +
 4 files changed, 18 insertions(+), 3 deletions(-)

--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -826,3 +826,9 @@ bool arch_is_rethunk(struct symbol *sym)
 {
 	return !strcmp(sym->name, "__x86_return_thunk");
 }
+
+bool arch_is_offset_insn(struct symbol *sym)
+{
+	return !strcmp(sym->name, "zen_return_thunk") ||
+	       !strcmp(sym->name, "srso_safe_ret");
+}
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -459,8 +459,7 @@ static int decode_instructions(struct ob
 			 * Both zen_return_thunk() and srso_safe_ret() are embedded inside
 			 * another instruction and objtool doesn't grok that. Skip validating them.
 			 */
-			if (!strcmp(func->name, "zen_return_thunk") ||
-			    !strcmp(func->name, "srso_safe_ret") || func->alias != func)
+			if (func->offset_insn || func->alias != func)
 				continue;
 
 			if (!find_insn(file, sec, func->offset)) {
@@ -1303,6 +1302,11 @@ __weak bool arch_is_rethunk(struct symbo
 	return false;
 }
 
+__weak bool arch_is_offset_insn(struct symbol *sym)
+{
+	return false;
+}
+
 static struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn)
 {
 	struct reloc *reloc;
@@ -1588,7 +1592,7 @@ static int add_jump_destinations(struct
 			 * middle of another instruction.  Objtool only
 			 * knows about the outer instruction.
 			 */
-			if (sym && !strcmp(sym->name, "zen_return_thunk")) {
+			if (sym && sym->offset_insn) {
 				add_return_call(file, insn, false);
 				continue;
 			}
@@ -2507,6 +2511,9 @@ static int classify_symbols(struct objto
 		if (arch_is_rethunk(func))
 			func->return_thunk = true;
 
+		if (arch_is_offset_insn(func))
+			func->offset_insn = true;
+
 		if (arch_ftrace_match(func->name))
 			func->fentry = true;
 
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -90,6 +90,7 @@ int arch_decode_hint_reg(u8 sp_reg, int
 
 bool arch_is_retpoline(struct symbol *sym);
 bool arch_is_rethunk(struct symbol *sym);
+bool arch_is_offset_insn(struct symbol *sym);
 
 int arch_rewrite_retpolines(struct objtool_file *file);
 
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -66,6 +66,7 @@ struct symbol {
 	u8 fentry            : 1;
 	u8 profiling_func    : 1;
 	u8 warned	     : 1;
+	u8 offset_insn       : 1;
 	struct list_head pv_target;
 	struct reloc *relocs;
 };



^ permalink raw reply	[flat|nested] 80+ messages in thread

* [RFC][PATCH 14/17] objtool: Add comments to the arch_is_$foo() magic symbols
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (12 preceding siblings ...)
  2023-08-09  7:12 ` [RFC][PATCH 13/17] objtool/x86: Add arch_is_offset_insn() Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-09  7:12 ` [RFC][PATCH 15/17] x86/cpu: Rename srso_(.*)_alias to srso_alias_\1 Peter Zijlstra
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

Clarification was asked for...

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/check.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1292,16 +1292,28 @@ static int add_ignore_alternatives(struc
 	return 0;
 }
 
+/*
+ * Symbols that replace INSN_CALL_DYNAMIC, every (tail) call to such a symbol
+ * will be added to the .retpoline_sites section.
+ */
 __weak bool arch_is_retpoline(struct symbol *sym)
 {
 	return false;
 }
 
+/*
+ * Symbols that replace INSN_RETURN, every (tail) call to such a symbol
+ * will be added to the .return_sites section.
+ */
 __weak bool arch_is_rethunk(struct symbol *sym)
 {
 	return false;
 }
 
+/*
+ * Symbols that are embedded inside other instructions, because sometimes crazy
+ * code exists. These are mostly ignored for validation purposes.
+ */
 __weak bool arch_is_offset_insn(struct symbol *sym)
 {
 	return false;



^ permalink raw reply	[flat|nested] 80+ messages in thread

* [RFC][PATCH 15/17] x86/cpu: Rename srso_(.*)_alias to srso_alias_\1
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (13 preceding siblings ...)
  2023-08-09  7:12 ` [RFC][PATCH 14/17] objtool: Add comments to the arch_is_$foo() magic symbols Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-09  7:12 ` [RFC][PATCH 16/17] x86/alternatives: Simplify ALTERNATIVE_n() Peter Zijlstra
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

For a more consistent namespace.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |    2 +-
 arch/x86/kernel/vmlinux.lds.S        |    8 ++++----
 arch/x86/lib/retpoline.S             |   24 ++++++++++++------------
 3 files changed, 17 insertions(+), 17 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -346,7 +346,7 @@ extern void srso_alias_return_thunk(void
 
 extern void zen_untrain_ret(void);
 extern void srso_untrain_ret(void);
-extern void srso_untrain_ret_alias(void);
+extern void srso_alias_untrain_ret(void);
 
 extern void entry_untrain_ret(void);
 extern void entry_ibpb(void);
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -149,10 +149,10 @@ SECTIONS
 
 #ifdef CONFIG_CPU_UNRET_ENTRY
 		/*
-		 * See the comment above srso_untrain_ret_alias()'s
+		 * See the comment above srso_alias_untrain_ret()'s
 		 * definition.
 		 */
-		. = srso_untrain_ret_alias | (1 << 2) | (1 << 8) | (1 << 14) | (1 << 20);
+		. = srso_alias_untrain_ret | (1 << 2) | (1 << 8) | (1 << 14) | (1 << 20);
 		*(.text.__x86.rethunk_safe)
 #endif
 		ALIGN_ENTRY_TEXT_END
@@ -529,8 +529,8 @@ INIT_PER_CPU(irq_stack_backing_store);
  * GNU ld cannot do XOR so do: (A | B) - (A & B) in order to compute the XOR
  * of the two function addresses:
  */
-. = ASSERT(((srso_untrain_ret_alias | srso_safe_ret_alias) -
-		(srso_untrain_ret_alias & srso_safe_ret_alias)) == ((1 << 2) | (1 << 8) | (1 << 14) | (1 << 20)),
+. = ASSERT(((srso_alias_untrain_ret | srso_alias_safe_ret) -
+	    (srso_alias_untrain_ret & srso_alias_safe_ret)) == ((1 << 2) | (1 << 8) | (1 << 14) | (1 << 20)),
 		"SRSO function pair won't alias");
 #endif
 
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -134,41 +134,41 @@ SYM_CODE_END(__x86_indirect_jump_thunk_a
 
 #ifdef CONFIG_CPU_UNRET_ENTRY
 /*
- * srso_untrain_ret_alias() and srso_safe_ret_alias() are placed at
+ * srso_alias_untrain_ret() and srso_alias_safe_ret() are placed at
  * special addresses:
  *
- * - srso_untrain_ret_alias() is 2M aligned
- * - srso_safe_ret_alias() is also in the same 2M page but bits 2, 8, 14
+ * - srso_alias_untrain_ret() is 2M aligned
+ * - srso_alias_safe_ret() is also in the same 2M page but bits 2, 8, 14
  * and 20 in its virtual address are set (while those bits in the
- * srso_untrain_ret_alias() function are cleared).
+ * srso_alias_untrain_ret() function are cleared).
  *
  * This guarantees that those two addresses will alias in the branch
  * target buffer of Zen3/4 generations, leading to any potential
  * poisoned entries at that BTB slot to get evicted.
  *
- * As a result, srso_safe_ret_alias() becomes a safe return.
+ * As a result, srso_alias_safe_ret() becomes a safe return.
  */
 	.section .text.__x86.rethunk_untrain
 
-SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+SYM_START(srso_alias_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
 	ASM_NOP2
 	lfence
 	jmp srso_alias_return_thunk
-SYM_FUNC_END(srso_untrain_ret_alias)
-__EXPORT_THUNK(srso_untrain_ret_alias)
+SYM_FUNC_END(srso_alias_untrain_ret)
+__EXPORT_THUNK(srso_alias_untrain_ret)
 
 	.section .text.__x86.rethunk_safe
 
 /* Needs a definition for the zen_return_thunk alternative below. */
-SYM_START(srso_safe_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+SYM_START(srso_alias_safe_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	add $8, %_ASM_SP
 	UNWIND_HINT_FUNC
 	ANNOTATE_UNRET_SAFE
 	ret
 	int3
-SYM_FUNC_END(srso_safe_ret_alias)
+SYM_FUNC_END(srso_alias_safe_ret)
 
 	.section .text.__x86.return_thunk
 
@@ -266,7 +266,7 @@ __EXPORT_THUNK(srso_untrain_ret)
 SYM_FUNC_START(entry_untrain_ret)
 	ALTERNATIVE_2 "jmp zen_untrain_ret", \
 		      "jmp srso_untrain_ret", X86_FEATURE_SRSO, \
-		      "jmp srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
+		      "jmp srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
 SYM_FUNC_END(entry_untrain_ret)
 __EXPORT_THUNK(entry_untrain_ret)
 
@@ -284,7 +284,7 @@ SYM_CODE_END(srso_return_thunk)
 SYM_CODE_START(srso_alias_return_thunk)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
-	call srso_safe_ret_alias
+	call srso_alias_safe_ret
 	ud2
 SYM_CODE_END(srso_alias_return_thunk)
 



^ permalink raw reply	[flat|nested] 80+ messages in thread

* [RFC][PATCH 16/17] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (14 preceding siblings ...)
  2023-08-09  7:12 ` [RFC][PATCH 15/17] x86/cpu: Rename srso_(.*)_alias to srso_alias_\1 Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-09  7:12 ` [RFC][PATCH 17/17] x86/cpu: Use fancy alternatives to get rid of entry_untrain_ret() Peter Zijlstra
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

Instead of making increasingly complicated ALTERNATIVE_n()
implementations, use a nested alternative expression.

The only difference between:

  ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2)

and

  ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1),
              newinst2, flag2)

is that the outer alternative can add additional padding when the
inner alternative is the shorter one, which then results in
alt_instr::instrlen being inconsistent.

However, this is easily remedied since the alt_instr entries will be
consecutive and it is trivial to compute the max(alt_instr::instrlen)
at runtime while patching.

Specifically, after this patch the ALTERNATIVE_2 macro, after CPP
expansion (and manual layout), looks like this:

  .macro ALTERNATIVE_2 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2
   140:

     140: \oldinstr ;
     141: .skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90 ;
     142: .pushsection .altinstructions,"a" ;
	  altinstr_entry 140b,143f,\ft_flags1,142b-140b,144f-143f ;
	  .popsection ; .pushsection .altinstr_replacement,"ax" ;
     143: \newinstr1 ;
     144: .popsection ; ;

   141: .skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90 ;
   142: .pushsection .altinstructions,"a" ;
	altinstr_entry 140b,143f,\ft_flags2,142b-140b,144f-143f ;
	.popsection ;
	.pushsection .altinstr_replacement,"ax" ;
   143: \newinstr2 ;
   144: .popsection ;
  .endm

The only label that is ambiguous is 140; however, all of its instances
reference the same spot, so that doesn't matter.

NOTE: obviously only @oldinstr may itself be an alternative; making
@newinstr an alternative would mean patching .altinstr_replacement,
which is very likely not what is intended; the labels would also get
confused in that case.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230628104952.GA2439977@hirez.programming.kicks-ass.net
---
 arch/x86/include/asm/alternative.h |  206 ++++++++++---------------------------
 arch/x86/kernel/alternative.c      |   13 ++
 tools/objtool/arch/x86/special.c   |   23 ++++
 tools/objtool/special.c            |   16 +-
 4 files changed, 100 insertions(+), 158 deletions(-)

--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -150,102 +150,60 @@ static inline int alternatives_text_rese
 }
 #endif	/* CONFIG_SMP */
 
-#define b_replacement(num)	"664"#num
-#define e_replacement(num)	"665"#num
-
-#define alt_end_marker		"663"
 #define alt_slen		"662b-661b"
-#define alt_total_slen		alt_end_marker"b-661b"
-#define alt_rlen(num)		e_replacement(num)"f-"b_replacement(num)"f"
+#define alt_total_slen		"663b-661b"
+#define alt_rlen		"665f-664f"
 
-#define OLDINSTR(oldinstr, num)						\
+#define OLDINSTR(oldinstr)						\
 	"# ALT: oldnstr\n"						\
 	"661:\n\t" oldinstr "\n662:\n"					\
 	"# ALT: padding\n"						\
-	".skip -(((" alt_rlen(num) ")-(" alt_slen ")) > 0) * "		\
-		"((" alt_rlen(num) ")-(" alt_slen ")),0x90\n"		\
-	alt_end_marker ":\n"
-
-/*
- * gas compatible max based on the idea from:
- * http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
- *
- * The additional "-" is needed because gas uses a "true" value of -1.
- */
-#define alt_max_short(a, b)	"((" a ") ^ (((" a ") ^ (" b ")) & -(-((" a ") < (" b ")))))"
+	".skip -(((" alt_rlen ")-(" alt_slen ")) > 0) * "		\
+		"((" alt_rlen ")-(" alt_slen ")),0x90\n"		\
+	"663:\n"
 
-/*
- * Pad the second replacement alternative with additional NOPs if it is
- * additionally longer than the first replacement alternative.
- */
-#define OLDINSTR_2(oldinstr, num1, num2) \
-	"# ALT: oldinstr2\n"									\
-	"661:\n\t" oldinstr "\n662:\n"								\
-	"# ALT: padding2\n"									\
-	".skip -((" alt_max_short(alt_rlen(num1), alt_rlen(num2)) " - (" alt_slen ")) > 0) * "	\
-		"(" alt_max_short(alt_rlen(num1), alt_rlen(num2)) " - (" alt_slen ")), 0x90\n"	\
-	alt_end_marker ":\n"
-
-#define OLDINSTR_3(oldinsn, n1, n2, n3)								\
-	"# ALT: oldinstr3\n"									\
-	"661:\n\t" oldinsn "\n662:\n"								\
-	"# ALT: padding3\n"									\
-	".skip -((" alt_max_short(alt_max_short(alt_rlen(n1), alt_rlen(n2)), alt_rlen(n3))	\
-		" - (" alt_slen ")) > 0) * "							\
-		"(" alt_max_short(alt_max_short(alt_rlen(n1), alt_rlen(n2)), alt_rlen(n3))	\
-		" - (" alt_slen ")), 0x90\n"							\
-	alt_end_marker ":\n"
-
-#define ALTINSTR_ENTRY(ft_flags, num)					      \
+#define ALTINSTR_ENTRY(ft_flags)					      \
+	".pushsection .altinstructions,\"a\"\n"				      \
 	" .long 661b - .\n"				/* label           */ \
-	" .long " b_replacement(num)"f - .\n"		/* new instruction */ \
+	" .long 664f - .\n"				/* new instruction */ \
 	" .4byte " __stringify(ft_flags) "\n"		/* feature + flags */ \
 	" .byte " alt_total_slen "\n"			/* source len      */ \
-	" .byte " alt_rlen(num) "\n"			/* replacement len */
-
-#define ALTINSTR_REPLACEMENT(newinstr, num)		/* replacement */	\
-	"# ALT: replacement " #num "\n"						\
-	b_replacement(num)":\n\t" newinstr "\n" e_replacement(num) ":\n"
-
-/* alternative assembly primitive: */
-#define ALTERNATIVE(oldinstr, newinstr, ft_flags)			\
-	OLDINSTR(oldinstr, 1)						\
-	".pushsection .altinstructions,\"a\"\n"				\
-	ALTINSTR_ENTRY(ft_flags, 1)					\
-	".popsection\n"							\
-	".pushsection .altinstr_replacement, \"ax\"\n"			\
-	ALTINSTR_REPLACEMENT(newinstr, 1)				\
+	" .byte " alt_rlen "\n"				/* replacement len */ \
 	".popsection\n"
 
-#define ALTERNATIVE_2(oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2) \
-	OLDINSTR_2(oldinstr, 1, 2)					\
-	".pushsection .altinstructions,\"a\"\n"				\
-	ALTINSTR_ENTRY(ft_flags1, 1)					\
-	ALTINSTR_ENTRY(ft_flags2, 2)					\
-	".popsection\n"							\
-	".pushsection .altinstr_replacement, \"ax\"\n"			\
-	ALTINSTR_REPLACEMENT(newinstr1, 1)				\
-	ALTINSTR_REPLACEMENT(newinstr2, 2)				\
+#define ALTINSTR_REPLACEMENT(newinstr)			/* replacement */	\
+	".pushsection .altinstr_replacement, \"ax\"\n"				\
+	"# ALT: replacement \n"							\
+	"664:\n\t" newinstr "\n 665:\n"						\
 	".popsection\n"
 
+/*
+ * Define an alternative between two instructions. If @ft_flags is
+ * present, early code in apply_alternatives() replaces @oldinstr with
+ * @newinstr. ".skip" directive takes care of proper instruction padding
+ * in case @newinstr is longer than @oldinstr.
+ *
+ * Notably: @oldinstr may be an ALTERNATIVE() itself, also see
+ *          apply_alternatives()
+ */
+#define ALTERNATIVE(oldinstr, newinstr, ft_flags)			\
+	OLDINSTR(oldinstr)						\
+	ALTINSTR_ENTRY(ft_flags)					\
+	ALTINSTR_REPLACEMENT(newinstr)
+
+#define ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2)	\
+	ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1),		\
+		    newinst2, flag2)
+
 /* If @feature is set, patch in @newinstr_yes, otherwise @newinstr_no. */
 #define ALTERNATIVE_TERNARY(oldinstr, ft_flags, newinstr_yes, newinstr_no) \
 	ALTERNATIVE_2(oldinstr, newinstr_no, X86_FEATURE_ALWAYS,	\
 		      newinstr_yes, ft_flags)
 
-#define ALTERNATIVE_3(oldinsn, newinsn1, ft_flags1, newinsn2, ft_flags2, \
-			newinsn3, ft_flags3)				\
-	OLDINSTR_3(oldinsn, 1, 2, 3)					\
-	".pushsection .altinstructions,\"a\"\n"				\
-	ALTINSTR_ENTRY(ft_flags1, 1)					\
-	ALTINSTR_ENTRY(ft_flags2, 2)					\
-	ALTINSTR_ENTRY(ft_flags3, 3)					\
-	".popsection\n"							\
-	".pushsection .altinstr_replacement, \"ax\"\n"			\
-	ALTINSTR_REPLACEMENT(newinsn1, 1)				\
-	ALTINSTR_REPLACEMENT(newinsn2, 2)				\
-	ALTINSTR_REPLACEMENT(newinsn3, 3)				\
-	".popsection\n"
+#define ALTERNATIVE_3(oldinst, newinst1, flag1, newinst2, flag2,	\
+		      newinst3, flag3)					\
+	ALTERNATIVE(ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2), \
+		    newinst3, flag3)
 
 /*
  * Alternative instructions for different CPU types or capabilities.
@@ -370,6 +328,21 @@ static inline int alternatives_text_rese
 	.byte \alt_len
 .endm
 
+#define __ALTERNATIVE(oldinst, newinst, flag)				\
+140:									\
+	oldinst	;							\
+141:									\
+	.skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90	;\
+142:									\
+	.pushsection .altinstructions,"a" ;				\
+	altinstr_entry 140b,143f,flag,142b-140b,144f-143f ;		\
+	.popsection ;							\
+	.pushsection .altinstr_replacement,"ax"	;			\
+143:									\
+	newinst	;							\
+144:									\
+	.popsection ;
+
 /*
  * Define an alternative between two instructions. If @feature is
  * present, early code in apply_alternatives() replaces @oldinstr with
@@ -377,88 +350,23 @@ static inline int alternatives_text_rese
  * in case @newinstr is longer than @oldinstr.
  */
 .macro ALTERNATIVE oldinstr, newinstr, ft_flags
-140:
-	\oldinstr
-141:
-	.skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90
-142:
-
-	.pushsection .altinstructions,"a"
-	altinstr_entry 140b,143f,\ft_flags,142b-140b,144f-143f
-	.popsection
-
-	.pushsection .altinstr_replacement,"ax"
-143:
-	\newinstr
-144:
-	.popsection
+	__ALTERNATIVE(\oldinstr, \newinstr, \ft_flags)
 .endm
 
-#define old_len			141b-140b
-#define new_len1		144f-143f
-#define new_len2		145f-144f
-#define new_len3		146f-145f
-
-/*
- * gas compatible max based on the idea from:
- * http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
- *
- * The additional "-" is needed because gas uses a "true" value of -1.
- */
-#define alt_max_2(a, b)		((a) ^ (((a) ^ (b)) & -(-((a) < (b)))))
-#define alt_max_3(a, b, c)	(alt_max_2(alt_max_2(a, b), c))
-
-
 /*
  * Same as ALTERNATIVE macro above but for two alternatives. If CPU
  * has @feature1, it replaces @oldinstr with @newinstr1. If CPU has
  * @feature2, it replaces @oldinstr with @feature2.
  */
 .macro ALTERNATIVE_2 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2
-140:
-	\oldinstr
-141:
-	.skip -((alt_max_2(new_len1, new_len2) - (old_len)) > 0) * \
-		(alt_max_2(new_len1, new_len2) - (old_len)),0x90
-142:
-
-	.pushsection .altinstructions,"a"
-	altinstr_entry 140b,143f,\ft_flags1,142b-140b,144f-143f
-	altinstr_entry 140b,144f,\ft_flags2,142b-140b,145f-144f
-	.popsection
-
-	.pushsection .altinstr_replacement,"ax"
-143:
-	\newinstr1
-144:
-	\newinstr2
-145:
-	.popsection
+	__ALTERNATIVE(__ALTERNATIVE(\oldinstr, \newinstr1, \ft_flags1),
+		      \newinstr2, \ft_flags2)
 .endm
 
 .macro ALTERNATIVE_3 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2, newinstr3, ft_flags3
-140:
-	\oldinstr
-141:
-	.skip -((alt_max_3(new_len1, new_len2, new_len3) - (old_len)) > 0) * \
-		(alt_max_3(new_len1, new_len2, new_len3) - (old_len)),0x90
-142:
-
-	.pushsection .altinstructions,"a"
-	altinstr_entry 140b,143f,\ft_flags1,142b-140b,144f-143f
-	altinstr_entry 140b,144f,\ft_flags2,142b-140b,145f-144f
-	altinstr_entry 140b,145f,\ft_flags3,142b-140b,146f-145f
-	.popsection
-
-	.pushsection .altinstr_replacement,"ax"
-143:
-	\newinstr1
-144:
-	\newinstr2
-145:
-	\newinstr3
-146:
-	.popsection
+	__ALTERNATIVE(__ALTERNATIVE(__ALTERNATIVE(\oldinstr, \newinstr1, \ft_flags1),
+				    \newinstr2, \ft_flags2),
+		      \newinstr3, \ft_flags3)
 .endm
 
 /* If @feature is set, patch in @newinstr_yes, otherwise @newinstr_no. */
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -398,7 +398,7 @@ apply_relocation(u8 *buf, size_t len, u8
 void __init_or_module noinline apply_alternatives(struct alt_instr *start,
 						  struct alt_instr *end)
 {
-	struct alt_instr *a;
+	struct alt_instr *a, *b;
 	u8 *instr, *replacement;
 	u8 insn_buff[MAX_PATCH_LEN];
 
@@ -415,6 +415,17 @@ void __init_or_module noinline apply_alt
 	for (a = start; a < end; a++) {
 		int insn_buff_sz = 0;
 
+		/*
+		 * In case of nested ALTERNATIVE()s the outer alternative might
+		 * add more padding. To ensure consistent patching find the max
+		 * padding for all alt_instr entries for this site (nested
+		 * alternatives result in consecutive entries).
+		 */
+		for (b = a+1; b < end && b->instr_offset == a->instr_offset; b++) {
+			u8 len = max(a->instrlen, b->instrlen);
+			a->instrlen = b->instrlen = len;
+		}
+
 		instr = (u8 *)&a->instr_offset + a->instr_offset;
 		replacement = (u8 *)&a->repl_offset + a->repl_offset;
 		BUG_ON(a->instrlen > sizeof(insn_buff));
--- a/tools/objtool/arch/x86/special.c
+++ b/tools/objtool/arch/x86/special.c
@@ -9,6 +9,29 @@
 
 void arch_handle_alternative(unsigned short feature, struct special_alt *alt)
 {
+	static struct special_alt *group, *prev;
+
+	/*
+	 * Recompute orig_len for nested ALTERNATIVE()s.
+	 */
+	if (group && group->orig_sec == alt->orig_sec &&
+	             group->orig_off == alt->orig_off) {
+
+		struct special_alt *iter = group;
+		for (;;) {
+			unsigned int len = max(iter->orig_len, alt->orig_len);
+			iter->orig_len = alt->orig_len = len;
+
+			if (iter == prev)
+				break;
+
+			iter = list_next_entry(iter, list);
+		}
+
+	} else group = alt;
+
+	prev = alt;
+
 	switch (feature) {
 	case X86_FEATURE_SMAP:
 		/*
--- a/tools/objtool/special.c
+++ b/tools/objtool/special.c
@@ -84,6 +84,14 @@ static int get_alt_entry(struct elf *elf
 						  entry->new_len);
 	}
 
+	orig_reloc = find_reloc_by_dest(elf, sec, offset + entry->orig);
+	if (!orig_reloc) {
+		WARN_FUNC("can't find orig reloc", sec, offset + entry->orig);
+		return -1;
+	}
+
+	reloc_to_sec_off(orig_reloc, &alt->orig_sec, &alt->orig_off);
+
 	if (entry->feature) {
 		unsigned short feature;
 
@@ -94,14 +102,6 @@ static int get_alt_entry(struct elf *elf
 		arch_handle_alternative(feature, alt);
 	}
 
-	orig_reloc = find_reloc_by_dest(elf, sec, offset + entry->orig);
-	if (!orig_reloc) {
-		WARN_FUNC("can't find orig reloc", sec, offset + entry->orig);
-		return -1;
-	}
-
-	reloc_to_sec_off(orig_reloc, &alt->orig_sec, &alt->orig_off);
-
 	if (!entry->group || alt->new_len) {
 		new_reloc = find_reloc_by_dest(elf, sec, offset + entry->new);
 		if (!new_reloc) {



^ permalink raw reply	[flat|nested] 80+ messages in thread

* [RFC][PATCH 17/17] x86/cpu: Use fancy alternatives to get rid of entry_untrain_ret()
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (15 preceding siblings ...)
  2023-08-09  7:12 ` [RFC][PATCH 16/17] x86/alternatives: Simplify ALTERNATIVE_n() Peter Zijlstra
@ 2023-08-09  7:12 ` Peter Zijlstra
  2023-08-09  9:04 ` [RFC][PATCH 00/17] Fix up the recent SRSO patches Nikolay Borisov
  2023-08-09 10:04 ` Andrew.Cooper3
  18 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09  7:12 UTC (permalink / raw)
  To: x86; +Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

Use the new nested alternatives to create what is effectively
ALTERNATIVE_5 and merge the dummy entry_untrain_ret stub into
UNTRAIN_RET properly.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |   33 ++++++++++++++++++---------------
 arch/x86/lib/retpoline.S             |    7 -------
 2 files changed, 18 insertions(+), 22 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -271,11 +271,15 @@
 .endm
 
 #ifdef CONFIG_CPU_UNRET_ENTRY
-#define CALL_UNTRAIN_RET	"call entry_untrain_ret"
+#define ALT_UNRET(old)	\
+	__ALTERNATIVE(__ALTERNATIVE(__ALTERNATIVE(old, call zen_untrain_ret, X86_FEATURE_UNRET), \
+				    call srso_untrain_ret, X86_FEATURE_SRSO), \
+		      call srso_alias_untrain_ret, X86_FEATURE_SRSO_ALIAS)
 #else
-#define CALL_UNTRAIN_RET	""
+#define ALT_UNRET(old)	old
 #endif
 
+
 /*
  * Mitigate RETBleed for AMD/Hygon Zen uarch. Requires KERNEL CR3 because the
  * return thunk isn't mapped into the userspace tables (then again, AMD
@@ -290,30 +294,30 @@
 .macro UNTRAIN_RET
 #if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CALL_DEPTH_TRACKING)
 	VALIDATE_UNRET_END
-	ALTERNATIVE_3 "",						\
-		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
-		      "call entry_ibpb", X86_FEATURE_ENTRY_IBPB,	\
-		      __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
+
+	__ALTERNATIVE(__ALTERNATIVE(ALT_UNRET(;),
+				    call entry_ibpb, X86_FEATURE_ENTRY_IBPB),
+		      RESET_CALL_DEPTH, X86_FEATURE_CALL_DEPTH)
 #endif
 .endm
 
 .macro UNTRAIN_RET_VM
 #if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CALL_DEPTH_TRACKING)
 	VALIDATE_UNRET_END
-	ALTERNATIVE_3 "",						\
-		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
-		      "call entry_ibpb", X86_FEATURE_IBPB_ON_VMEXIT,	\
-		      __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
+
+	__ALTERNATIVE(__ALTERNATIVE(ALT_UNRET(;),
+				    call entry_ibpb, X86_FEATURE_IBPB_ON_VMEXIT),
+		      RESET_CALL_DEPTH, X86_FEATURE_CALL_DEPTH)
 #endif
 .endm
 
 .macro UNTRAIN_RET_FROM_CALL
 #if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CALL_DEPTH_TRACKING)
 	VALIDATE_UNRET_END
-	ALTERNATIVE_3 "",						\
-		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
-		      "call entry_ibpb", X86_FEATURE_ENTRY_IBPB,	\
-		      __stringify(RESET_CALL_DEPTH_FROM_CALL), X86_FEATURE_CALL_DEPTH
+
+	__ALTERNATIVE(__ALTERNATIVE(ALT_UNRET(;),
+				    call entry_ibpb, X86_FEATURE_ENTRY_IBPB),
+		      RESET_CALL_DEPTH_FROM_CALL, X86_FEATURE_CALL_DEPTH)
 #endif
 .endm
 
@@ -348,7 +352,6 @@ extern void zen_untrain_ret(void);
 extern void srso_untrain_ret(void);
 extern void srso_alias_untrain_ret(void);
 
-extern void entry_untrain_ret(void);
 extern void entry_ibpb(void);
 
 extern void (*x86_return_thunk)(void);
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -263,13 +263,6 @@ SYM_CODE_END(srso_safe_ret)
 SYM_FUNC_END(srso_untrain_ret)
 __EXPORT_THUNK(srso_untrain_ret)
 
-SYM_FUNC_START(entry_untrain_ret)
-	ALTERNATIVE_2 "jmp zen_untrain_ret", \
-		      "jmp srso_untrain_ret", X86_FEATURE_SRSO, \
-		      "jmp srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
-SYM_FUNC_END(entry_untrain_ret)
-__EXPORT_THUNK(entry_untrain_ret)
-
 /*
  * Both these do an unbalanced CALL to mess up the RSB, terminate with UD2
  * to indicate noreturn.



^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (16 preceding siblings ...)
  2023-08-09  7:12 ` [RFC][PATCH 17/17] x86/cpu: Use fancy alternatives to get rid of entry_untrain_ret() Peter Zijlstra
@ 2023-08-09  9:04 ` Nikolay Borisov
  2023-08-09 10:04 ` Andrew.Cooper3
  18 siblings, 0 replies; 80+ messages in thread
From: Nikolay Borisov @ 2023-08-09  9:04 UTC (permalink / raw)
  To: Peter Zijlstra, x86
  Cc: linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh



On 9.08.23 г. 10:12 ч., Peter Zijlstra wrote:
> Since I wasn't invited to the party (even though I did retbleed), I get to
> clean things up afterwards :/
> 
> Anyway, this here overhauls the SRSO patches in a big way.
> 
> I claim that AMD retbleed (also called Speculative-Type-Confusion -- not to be
> confused with Intel retbleed, which is an entirely different bug) is
> fundamentally the same as this SRSO -- which is also caused by STC. And the
> mitigations are so similar they should all be controlled from a single spot and
> not conflated like they are now.
> 
> As such, at the end of the ride the new kernel command line and srso sysfs
> files are no more and all we're left with is a few extra retbleed options.
> 
> Aside of that; this deals with a few implementation issues -- but not all known
> issues. Josh and Andrew are telling me there's a problem when running inside
> virt due to how this checks the microcode. I'm hoping either of those two gents
> will add a patch to address this.

The microcode issue should already have been fixed: Boris added a safe_wrmsr 
call which checks for the presence of the SBPB bit on Zen3/4.


> 
> 
> 

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 01/17] x86/alternative: Unconditional custom return thunk
  2023-08-09  7:12 ` [RFC][PATCH 01/17] x86/alternative: Unconditional custom return thunk Peter Zijlstra
@ 2023-08-09  9:31   ` Nikolay Borisov
  2023-08-10 11:37   ` Borislav Petkov
  1 sibling, 0 replies; 80+ messages in thread
From: Nikolay Borisov @ 2023-08-09  9:31 UTC (permalink / raw)
  To: Peter Zijlstra, x86
  Cc: linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh



On 9.08.23 г. 10:12 ч., Peter Zijlstra wrote:
> There is infrastructure to rewrite return thunks to point to any
> random thunk one desires, unwrap that from CALL_THUNKS, which up to
> now was the sole user of that.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>   arch/x86/include/asm/nospec-branch.h |    4 ----
>   arch/x86/kernel/alternative.c        |    2 --
>   2 files changed, 6 deletions(-)
> 
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -347,11 +347,7 @@ extern void srso_untrain_ret(void);
>   extern void srso_untrain_ret_alias(void);
>   extern void entry_ibpb(void);
>   
> -#ifdef CONFIG_CALL_THUNKS
>   extern void (*x86_return_thunk)(void);
> -#else
> -#define x86_return_thunk	(&__x86_return_thunk)
> -#endif
>   
>   #ifdef CONFIG_CALL_DEPTH_TRACKING
>   extern void __x86_return_skl(void);
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -698,9 +698,7 @@ void __init_or_module noinline apply_ret
>   
>   #ifdef CONFIG_RETHUNK
>   
> -#ifdef CONFIG_CALL_THUNKS
>   void (*x86_return_thunk)(void) __ro_after_init = &__x86_return_thunk;
> -#endif
>   
>   /*
>    * Rewrite the compiler generated return thunk tail-calls.
> 
> 


Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 13/17] objtool/x86: Add arch_is_offset_insn()
  2023-08-09  7:12 ` [RFC][PATCH 13/17] objtool/x86: Add arch_is_offset_insn() Peter Zijlstra
@ 2023-08-09  9:56   ` Nikolay Borisov
  2023-08-09 14:34   ` Josh Poimboeuf
  1 sibling, 0 replies; 80+ messages in thread
From: Nikolay Borisov @ 2023-08-09  9:56 UTC (permalink / raw)
  To: Peter Zijlstra, x86
  Cc: linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh



On 9.08.23 г. 10:12 ч., Peter Zijlstra wrote:
> Add a little wrappery to identify the magic symbols that are actually
> inside another instruction -- yay for variable length instruction
> encoding.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>   tools/objtool/arch/x86/decode.c      |    6 ++++++
>   tools/objtool/check.c                |   13 ++++++++++---
>   tools/objtool/include/objtool/arch.h |    1 +
>   tools/objtool/include/objtool/elf.h  |    1 +
>   4 files changed, 18 insertions(+), 3 deletions(-)
> 
> --- a/tools/objtool/arch/x86/decode.c
> +++ b/tools/objtool/arch/x86/decode.c
> @@ -826,3 +826,9 @@ bool arch_is_rethunk(struct symbol *sym)
>   {
>   	return !strcmp(sym->name, "__x86_return_thunk");
>   }
> +
> +bool arch_is_offset_insn(struct symbol *sym)
> +{
> +	return !strcmp(sym->name, "zen_return_thunk") ||
> +	       !strcmp(sym->name, "srso_safe_ret");
> +}
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -459,8 +459,7 @@ static int decode_instructions(struct ob
>   			 * Both zen_return_thunk() and srso_safe_ret() are embedded inside
>   			 * another instruction and objtool doesn't grok that. Skip validating them.
>   			 */
> -			if (!strcmp(func->name, "zen_return_thunk") ||
> -			    !strcmp(func->name, "srso_safe_ret") || func->alias != func)
> +			if (func->offset_insn || func->alias != func)
>   				continue;
>   
>   			if (!find_insn(file, sec, func->offset)) {
> @@ -1303,6 +1302,11 @@ __weak bool arch_is_rethunk(struct symbo
>   	return false;
>   }
>   
> +__weak bool arch_is_offset_insn(struct symbol *sym)
> +{
> +	return false;
> +}
> +
>   static struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn)
>   {
>   	struct reloc *reloc;
> @@ -1588,7 +1592,7 @@ static int add_jump_destinations(struct
>   			 * middle of another instruction.  Objtool only
>   			 * knows about the outer instruction.
>   			 */
> -			if (sym && !strcmp(sym->name, "zen_return_thunk")) {
> +			if (sym && sym->offset_insn) {
>   				add_return_call(file, insn, false);
>   				continue;
>   			}
> @@ -2507,6 +2511,9 @@ static int classify_symbols(struct objto
>   		if (arch_is_rethunk(func))
>   			func->return_thunk = true;
>   
> +		if (arch_is_offset_insn(func))
> +			func->offset_insn = true;
> +
>   		if (arch_ftrace_match(func->name))
>   			func->fentry = true;

nit: Why go through this if statement when one can simply do:

func->foo = arch_is_foo(bar) ?
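
A sketch of what I mean (untested; since these are single-bit flags the
bool result should assign cleanly):

	func->return_thunk = arch_is_rethunk(func);
	func->offset_insn  = arch_is_offset_insn(func);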

>   
> --- a/tools/objtool/include/objtool/arch.h
> +++ b/tools/objtool/include/objtool/arch.h
> @@ -90,6 +90,7 @@ int arch_decode_hint_reg(u8 sp_reg, int
>   
>   bool arch_is_retpoline(struct symbol *sym);
>   bool arch_is_rethunk(struct symbol *sym);
> +bool arch_is_offset_insn(struct symbol *sym);
>   
>   int arch_rewrite_retpolines(struct objtool_file *file);
>   
> --- a/tools/objtool/include/objtool/elf.h
> +++ b/tools/objtool/include/objtool/elf.h
> @@ -66,6 +66,7 @@ struct symbol {
>   	u8 fentry            : 1;
>   	u8 profiling_func    : 1;
>   	u8 warned	     : 1;
> +	u8 offset_insn       : 1;
>   	struct list_head pv_target;
>   	struct reloc *relocs;
>   };
> 
> 

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
                   ` (17 preceding siblings ...)
  2023-08-09  9:04 ` [RFC][PATCH 00/17] Fix up the recent SRSO patches Nikolay Borisov
@ 2023-08-09 10:04 ` Andrew.Cooper3
  2023-08-09 11:58   ` Peter Zijlstra
  18 siblings, 1 reply; 80+ messages in thread
From: Andrew.Cooper3 @ 2023-08-09 10:04 UTC (permalink / raw)
  To: Peter Zijlstra, x86; +Cc: linux-kernel, David.Kaplan, jpoimboe, gregkh

On 09/08/2023 8:12 am, Peter Zijlstra wrote:
> Since I wasn't invited to the party (even though I did retbleed), I get to
> clean things up afterwards :/
>
> Anyway, this here overhauls the SRSO patches in a big way.
>
> I claim that AMD retbleed (also called Speculative-Type-Confusion

Branch Type Confusion.

Speculative Type Confusion is something else; generally Spectre v1 or v2
around a logical type check, usually ending up confusing pointers and
integer.

It appears that you might be suffering from Type-of-Speculative-Bug
Confusion, an affliction brought on by the chronic lack of documentation
and consistency, the fact that almost everything has at least 2 names,
and that 6 years in this horror show it's not showing any sign of
slowing down.

>  -- not to be
> confused with Intel retbleed, which is an entirely different bug) is
> fundamentally the same as this SRSO -- which is also caused by STC. And the
> mitigations are so similar they should all be controlled from a single spot and
> not conflated like they are now.

BTC and SRSO are certainly related, but they're not the same.

With BTC, an attacker poisons a branch type prediction to say "that
thing (which isn't actually a ret) is a ret".

With SRSO, an attacker leaves a poisoned infinite-call-loop prediction. 
Later, a real function (that is architecturally correct execution and
will retire) trips over the predicted infinite loop, which overflows the
RSB/RAS/RAP, replacing the correct prediction on top with the
attacker's choice of value.

So while branch type confusion is used to poison the top-of-RSB value,
the ret that actually goes wrong needs a correct type=ret prediction for
the SRSO attack to succeed.


Both issues can be mitigated with IBPB-on-entry (given up-to-date
microcode in some cases).

Both issues have a software sequence that tries to make the contents of
a __x86_return_thunk sequence safe to use.  For BTC, it's simply a case
of ensuring the type prediction of the one ret is good.  For SRSO, it's
something more complicated and I don't know the uarch details fully.

~Andrew

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 00/17] Fix up the recent SRSO patches
  2023-08-09 10:04 ` Andrew.Cooper3
@ 2023-08-09 11:58   ` Peter Zijlstra
  0 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09 11:58 UTC (permalink / raw)
  To: Andrew.Cooper3; +Cc: x86, linux-kernel, David.Kaplan, jpoimboe, gregkh

On Wed, Aug 09, 2023 at 11:04:15AM +0100, Andrew.Cooper3@citrix.com wrote:
> On 09/08/2023 8:12 am, Peter Zijlstra wrote:
> > Since I wasn't invited to the party (even though I did retbleed), I get to
> > clean things up afterwards :/
> >
> > Anyway, this here overhauls the SRSO patches in a big way.
> >
> > I claim that AMD retbleed (also called Speculative-Type-Confusion
> 
> Branch Type Confusion.

Durr, I should've double-checked, and yes, too damn many different things
and not enough sleep.

> >  -- not to be
> > confused with Intel retbleed, which is an entirely different bug) is
> > fundamentally the same as this SRSO -- which is also caused by STC. And the
> > mitigations are so similar they should all be controlled from a single spot and
> > not conflated like they are now.
> 
> BTC and SRSO are certainly related, but they're not the same.
> 
> With BTC, an attacker poisons a branch type prediction to say "that
> thing (which isn't actually a ret) is a ret".
> 
> With SRSO, an attacker leaves a poisoned infinite-call-loop prediction. 
> Later, a real function (that is architecturally correct execution and
> will retire) trips over the predicted infinite loop, which overflows the
> RSB/RAS/RAP replacing the correct prediction on the top with the
> attackers choice of value.
> 
> So while branch type confusion is used to poison the top-of-RSB value,
> the ret that actually goes wrong needs a correct type=ret prediction for
> the SRSO attack to succeed.

Yes, this is what I meant, and I clearly failed to express it well.
The point was that branch-type-confusion is involved with both,
just in different ways.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 05/17] x86/cpu: Cleanup the untrain mess
  2023-08-09  7:12 ` [RFC][PATCH 05/17] x86/cpu: Cleanup the untrain mess Peter Zijlstra
@ 2023-08-09 12:51   ` Josh Poimboeuf
  2023-08-09 13:12     ` Peter Zijlstra
  0 siblings, 1 reply; 80+ messages in thread
From: Josh Poimboeuf @ 2023-08-09 12:51 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 09:12:23AM +0200, Peter Zijlstra wrote:
> Since there can only be one active return_thunk, there only needs be
> one (matching) untrain_ret. It fundamentally doesn't make sense to
> allow multiple untrain_ret at the same time.
> 
> Fold all the 3 different untrain methods into a single (temporary)
> helper stub.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/include/asm/nospec-branch.h |   19 +++++--------------
>  arch/x86/lib/retpoline.S             |    7 +++++++
>  2 files changed, 12 insertions(+), 14 deletions(-)
> 
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -272,9 +272,9 @@
>  .endm
>  
>  #ifdef CONFIG_CPU_UNRET_ENTRY
> -#define CALL_ZEN_UNTRAIN_RET	"call zen_untrain_ret"
> +#define CALL_UNTRAIN_RET	"call entry_untrain_ret"
>  #else
> -#define CALL_ZEN_UNTRAIN_RET	""
> +#define CALL_UNTRAIN_RET	""
>  #endif
>  
>  /*
> @@ -293,15 +293,10 @@
>  	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
>  	VALIDATE_UNRET_END
>  	ALTERNATIVE_3 "",						\
> -		      CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET,		\
> +		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\

SRSO doesn't have X86_FEATURE_UNRET set.

-- 
Josh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 11/17] x86/cpu: Remove all SRSO interface nonsense
  2023-08-09  7:12 ` [RFC][PATCH 11/17] x86/cpu: Remove all SRSO interface nonsense Peter Zijlstra
@ 2023-08-09 13:10   ` Andrew.Cooper3
  2023-08-09 13:36     ` Peter Zijlstra
  2023-08-09 14:05   ` Josh Poimboeuf
  2023-08-09 15:34   ` Josh Poimboeuf
  2 siblings, 1 reply; 80+ messages in thread
From: Andrew.Cooper3 @ 2023-08-09 13:10 UTC (permalink / raw)
  To: Peter Zijlstra, x86
  Cc: linux-kernel, David.Kaplan, jpoimboe, gregkh,
	Sean Christopherson, Paolo Bonzini

On 09/08/2023 8:12 am, Peter Zijlstra wrote:
> Now that retbleed can do all that the srso knob did, and without the
> dubious interactions with retbleed selections, remove it.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/kernel/cpu/bugs.c |  188 ++-------------------------------------------
>  drivers/base/cpu.c         |    8 -
>  include/linux/cpu.h        |    2 
>  3 files changed, 10 insertions(+), 188 deletions(-)

Not all of this can go, because ...

> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> ...
> -static void __init srso_select_mitigation(void)
> -{
> -	bool has_microcode;
> -
> -	if (!boot_cpu_has_bug(X86_BUG_SRSO) || cpu_mitigations_off())
> -		goto pred_cmd;
> -
> -	/*
> -	 * The first check is for the kernel running as a guest in order
> -	 * for guests to verify whether IBPB is a viable mitigation.
> -	 */
> -	has_microcode = boot_cpu_has(X86_FEATURE_IBPB_BRTYPE) || cpu_has_ibpb_brtype_microcode();
> -	if (!has_microcode) {
> -		pr_warn("IBPB-extending microcode not applied!\n");
> -		pr_warn(SRSO_NOTICE);
> -	} else {
> -		/*
> -		 * Enable the synthetic (even if in a real CPUID leaf)
> -		 * flags for guests.
> -		 */
> -		setup_force_cpu_cap(X86_FEATURE_IBPB_BRTYPE);
> -		setup_force_cpu_cap(X86_FEATURE_SBPB);

... these (minus the virt bug caused by probing for microcode behaviour
even when virtualised, and the enumeration bug caused by ignoring
synthesis if host mitigations are off) are necessary for KVM.

https://www.amd.com/content/dam/amd/en/documents/corporate/cr/speculative-return-stack-overflow-whitepaper.pdf

and here's one I prepared earlier
https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=2280b0ee2aed6e0fd4af3fa31bf99bc04d038bfe

but these bits need to get into guests for the guests to be able to
figure out what to do.

~Andrew

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 05/17] x86/cpu: Cleanup the untrain mess
  2023-08-09 12:51   ` Josh Poimboeuf
@ 2023-08-09 13:12     ` Peter Zijlstra
  2023-08-09 13:26       ` Peter Zijlstra
  0 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09 13:12 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 08:51:01AM -0400, Josh Poimboeuf wrote:
> On Wed, Aug 09, 2023 at 09:12:23AM +0200, Peter Zijlstra wrote:
> > Since there can only be one active return_thunk, there only needs be
> > one (matching) untrain_ret. It fundamentally doesn't make sense to
> > allow multiple untrain_ret at the same time.
> > 
> > Fold all the 3 different untrain methods into a single (temporary)
> > helper stub.
> > 
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
> >  arch/x86/include/asm/nospec-branch.h |   19 +++++--------------
> >  arch/x86/lib/retpoline.S             |    7 +++++++
> >  2 files changed, 12 insertions(+), 14 deletions(-)
> > 
> > --- a/arch/x86/include/asm/nospec-branch.h
> > +++ b/arch/x86/include/asm/nospec-branch.h
> > @@ -272,9 +272,9 @@
> >  .endm
> >  
> >  #ifdef CONFIG_CPU_UNRET_ENTRY
> > -#define CALL_ZEN_UNTRAIN_RET	"call zen_untrain_ret"
> > +#define CALL_UNTRAIN_RET	"call entry_untrain_ret"
> >  #else
> > -#define CALL_ZEN_UNTRAIN_RET	""
> > +#define CALL_UNTRAIN_RET	""
> >  #endif
> >  
> >  /*
> > @@ -293,15 +293,10 @@
> >  	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
> >  	VALIDATE_UNRET_END
> >  	ALTERNATIVE_3 "",						\
> > -		      CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET,		\
> > +		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
> 
> SRSO doesn't have X86_FEATURE_UNRET set.

Argh.. this stuff doesn't exist at the end anymore, but yeah, that's
unfortunate.

I'll see if I can find another intermediate step.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 05/17] x86/cpu: Cleanup the untrain mess
  2023-08-09 13:12     ` Peter Zijlstra
@ 2023-08-09 13:26       ` Peter Zijlstra
  2023-08-12 18:30         ` Borislav Petkov
  0 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09 13:26 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 03:12:43PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 09, 2023 at 08:51:01AM -0400, Josh Poimboeuf wrote:
> > On Wed, Aug 09, 2023 at 09:12:23AM +0200, Peter Zijlstra wrote:
> > > Since there can only be one active return_thunk, there only needs be
> > > one (matching) untrain_ret. It fundamentally doesn't make sense to
> > > allow multiple untrain_ret at the same time.
> > > 
> > > Fold all the 3 different untrain methods into a single (temporary)
> > > helper stub.
> > > 
> > > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > > ---
> > >  arch/x86/include/asm/nospec-branch.h |   19 +++++--------------
> > >  arch/x86/lib/retpoline.S             |    7 +++++++
> > >  2 files changed, 12 insertions(+), 14 deletions(-)
> > > 
> > > --- a/arch/x86/include/asm/nospec-branch.h
> > > +++ b/arch/x86/include/asm/nospec-branch.h
> > > @@ -272,9 +272,9 @@
> > >  .endm
> > >  
> > >  #ifdef CONFIG_CPU_UNRET_ENTRY
> > > -#define CALL_ZEN_UNTRAIN_RET	"call zen_untrain_ret"
> > > +#define CALL_UNTRAIN_RET	"call entry_untrain_ret"
> > >  #else
> > > -#define CALL_ZEN_UNTRAIN_RET	""
> > > +#define CALL_UNTRAIN_RET	""
> > >  #endif
> > >  
> > >  /*
> > > @@ -293,15 +293,10 @@
> > >  	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
> > >  	VALIDATE_UNRET_END
> > >  	ALTERNATIVE_3 "",						\
> > > -		      CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET,		\
> > > +		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
> > 
> > SRSO doesn't have X86_FEATURE_UNRET set.
> 
> Argh.. this stuff doesn't exist at the end anymore, but yeah, that's
> unfortunate.
> 
> I'll see if I can find another intermediate step.

I think simply setting UNRET for SRSO at this point will be sufficient.
That ensures the entry_untrain_ret thing gets called, and the
alternative there DTRT.

The feature isn't used anywhere else afaict.

Then later, after the fancy alternatives happen, this can be cleaned up
again.
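
I.e. an untested sketch of the interim step, somewhere in the SRSO
mitigation selection (exact spot to be decided):

	setup_force_cpu_cap(X86_FEATURE_UNRET);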

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 11/17] x86/cpu: Remove all SRSO interface nonsense
  2023-08-09 13:10   ` Andrew.Cooper3
@ 2023-08-09 13:36     ` Peter Zijlstra
  0 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09 13:36 UTC (permalink / raw)
  To: Andrew.Cooper3
  Cc: x86, linux-kernel, David.Kaplan, jpoimboe, gregkh,
	Sean Christopherson, Paolo Bonzini

On Wed, Aug 09, 2023 at 02:10:42PM +0100, Andrew.Cooper3@citrix.com wrote:
> On 09/08/2023 8:12 am, Peter Zijlstra wrote:
> > Now that retbleed can do all that the srso knob did, and without the
> > dubious interactions with retbleed selections, remove it.
> >
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
> >  arch/x86/kernel/cpu/bugs.c |  188 ++-------------------------------------------
> >  drivers/base/cpu.c         |    8 -
> >  include/linux/cpu.h        |    2 
> >  3 files changed, 10 insertions(+), 188 deletions(-)
> 
> Not all of this can go, because ...
> 
> > --- a/arch/x86/kernel/cpu/bugs.c
> > +++ b/arch/x86/kernel/cpu/bugs.c
> > ...
> > -static void __init srso_select_mitigation(void)
> > -{
> > -	bool has_microcode;
> > -
> > -	if (!boot_cpu_has_bug(X86_BUG_SRSO) || cpu_mitigations_off())
> > -		goto pred_cmd;
> > -
> > -	/*
> > -	 * The first check is for the kernel running as a guest in order
> > -	 * for guests to verify whether IBPB is a viable mitigation.
> > -	 */
> > -	has_microcode = boot_cpu_has(X86_FEATURE_IBPB_BRTYPE) || cpu_has_ibpb_brtype_microcode();
> > -	if (!has_microcode) {
> > -		pr_warn("IBPB-extending microcode not applied!\n");
> > -		pr_warn(SRSO_NOTICE);
> > -	} else {
> > -		/*
> > -		 * Enable the synthetic (even if in a real CPUID leaf)
> > -		 * flags for guests.
> > -		 */
> > -		setup_force_cpu_cap(X86_FEATURE_IBPB_BRTYPE);
> > -		setup_force_cpu_cap(X86_FEATURE_SBPB);
> 
> ... these (minus the virt bug caused by probing for microcode behaviour
> even when virtualised, and the enumeration bug caused by ignoring
> synthesis if host mitigations are off) are necessary for KVM.
> 
> https://www.amd.com/content/dam/amd/en/documents/corporate/cr/speculative-return-stack-overflow-whitepaper.pdf
> 
> and here's one I prepared earlier
> https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=2280b0ee2aed6e0fd4af3fa31bf99bc04d038bfe
> 
> but these bits need to get into guests for the guests to be able to
> figure out what to do.

Patch 6 adds these feature bits to retbleed_select_mitigation().



^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-09  7:12 ` [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed= Peter Zijlstra
@ 2023-08-09 13:42   ` Josh Poimboeuf
  2023-08-09 14:06     ` Peter Zijlstra
  2023-08-09 14:31     ` Andrew.Cooper3
  2023-08-10 15:44   ` Borislav Petkov
  1 sibling, 2 replies; 80+ messages in thread
From: Josh Poimboeuf @ 2023-08-09 13:42 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 09:12:24AM +0200, Peter Zijlstra wrote:
>  static enum retbleed_mitigation retbleed_mitigation __ro_after_init =
> @@ -796,6 +802,10 @@ static int __init retbleed_parse_cmdline
>  			retbleed_cmd = RETBLEED_CMD_AUTO;
>  		} else if (!strcmp(str, "unret")) {
>  			retbleed_cmd = RETBLEED_CMD_UNRET;
> +		} else if (!strcmp(str, "srso")) {
> +			retbleed_cmd = RETBLEED_CMD_UNRET_SRSO;
> +		} else if (!strcmp(str, "srso_alias")) {
> +			retbleed_cmd = RETBLEED_CMD_UNRET_SRSO_ALIAS;

It doesn't make sense for "srso_alias" to be a separate cmdline option,
as that option is a model-dependent variant of the SRSO mitigation.

> +	if (boot_cpu_has_bug(X86_BUG_SRSO)) {
> +		has_microcode = boot_cpu_has(X86_FEATURE_IBPB_BRTYPE) || cpu_has_ibpb_brtype_microcode();
> +		if (!has_microcode) {
> +			pr_warn("IBPB-extending microcode not applied!\n");
> +			pr_warn(RETBLEED_SRSO_NOTICE);
> +		} else {
> +			/*
> +			 * Enable the synthetic (even if in a real CPUID leaf)
> +			 * flags for guests.
> +			 */
> +			setup_force_cpu_cap(X86_FEATURE_IBPB_BRTYPE);
> +			setup_force_cpu_cap(X86_FEATURE_SBPB);
> +
> +			/*
> +			 * Zen1/2 with SMT off aren't vulnerable after the right
> +			 * IBPB microcode has been applied.
> +			 */
> +			if ((boot_cpu_data.x86 < 0x19) &&
> +			    (cpu_smt_control == CPU_SMT_DISABLED))
> +				setup_force_cpu_cap(X86_FEATURE_SRSO_NO);

The rumor I heard was that SMT had to be disabled specifically by BIOS
for this condition to be true.  Can somebody from AMD confirm?

The above check doesn't even remotely do that; in fact, it does the
opposite.  Regardless, at the very least it should be checking
cpu_smt_possible().  I guess that would be a separate patch as it's a
bug fix.
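
Something like this, I'd think (untested sketch, assuming cpu_smt_possible()
is the right predicate here):

	if (boot_cpu_data.x86 < 0x19 && !cpu_smt_possible())
		setup_force_cpu_cap(X86_FEATURE_SRSO_NO);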

> @@ -870,8 +915,17 @@ static void __init retbleed_select_mitig
>  	default:
>  		if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
>  		    boot_cpu_data.x86_vendor == X86_VENDOR_HYGON) {
> -			if (IS_ENABLED(CONFIG_CPU_UNRET_ENTRY))
> -				retbleed_mitigation = RETBLEED_MITIGATION_UNRET;
> +			if (IS_ENABLED(CONFIG_CPU_UNRET_ENTRY)) {
> +				if (boot_cpu_has_bug(X86_BUG_RETBLEED))
> +					retbleed_mitigation = RETBLEED_MITIGATION_UNRET;
> +
> +				if (boot_cpu_has_bug(X86_BUG_SRSO) && !boot_cpu_has(X86_FEATURE_SRSO_NO)) {
> +					if (boot_cpu_data.x86 == 0x19)
> +						retbleed_mitigation = RETBLEED_MITIGATION_UNRET_SRSO_ALIAS;
> +					else
> +						retbleed_mitigation = RETBLEED_MITIGATION_UNRET_SRSO;

It would be great to get confirmation from somebody at AMD that the SRSO
mitigations supersede the Retbleed one, i.e., that the SRSO mitigations
also fix Retbleed.

Yes, the original patches might imply that, but they also seem confused
since they do the double untraining for both Retbleed and SRSO.  And I
was given conflicting information.

Even better, we could really use some official AMD documentation for
this mitigation, since right now it all feels very unofficial and
unsubstantiated.

> +				}
> +			}
>  			else if (IS_ENABLED(CONFIG_CPU_IBPB_ENTRY) && boot_cpu_has(X86_FEATURE_IBPB))
>  				retbleed_mitigation = RETBLEED_MITIGATION_IBPB;

Here should have the microcode check too:

				if (boot_cpu_has_bug(X86_BUG_SRSO) && !has_microcode)
					pr_err("IBPB-extending microcode not applied; SRSO NOT mitigated\n");


-- 
Josh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 07/17] x86/cpu/kvm: Provide UNTRAIN_RET_VM
  2023-08-09  7:12 ` [RFC][PATCH 07/17] x86/cpu/kvm: Provide UNTRAIN_RET_VM Peter Zijlstra
@ 2023-08-09 13:50   ` Josh Poimboeuf
  2023-08-09 14:06     ` Peter Zijlstra
  2023-08-13 10:36   ` Borislav Petkov
  1 sibling, 1 reply; 80+ messages in thread
From: Josh Poimboeuf @ 2023-08-09 13:50 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 09:12:25AM +0200, Peter Zijlstra wrote:
> With the difference being that UNTRAIN_RET_VM uses
> X86_FEATURE_IBPB_ON_VMEXIT instead of X86_FEATURE_ENTRY_IBPB.
> 
> This cures VMEXIT doing potentially unret+IBPB or double IBPB.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/include/asm/nospec-branch.h |   11 +++++++++++
>  arch/x86/kernel/cpu/bugs.c           |   17 ++++++++++++++++-
>  arch/x86/kvm/svm/vmenter.S           |    7 ++-----
>  3 files changed, 29 insertions(+), 6 deletions(-)
> 
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -299,6 +299,17 @@
>  #endif
>  .endm
>  
> +.macro UNTRAIN_RET_VM
> +#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
> +	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)

Maybe can be simplified?

#if defined(CONFIG_RETHUNK) || defined(CONFIG_CPU_IBPB_ENTRY)

-- 
Josh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 09/17] x86: Remove CONFIG_CPU_SRSO
  2023-08-09  7:12 ` [RFC][PATCH 09/17] x86: Remove CONFIG_CPU_SRSO Peter Zijlstra
@ 2023-08-09 13:57   ` Josh Poimboeuf
  0 siblings, 0 replies; 80+ messages in thread
From: Josh Poimboeuf @ 2023-08-09 13:57 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 09:12:27AM +0200, Peter Zijlstra wrote:
> -#endif
>  
>  /* Needs a definition for the __x86_return_thunk alternative below. */
>  SYM_START(srso_safe_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
> -#ifdef CONFIG_CPU_SRSO

The comment is no longer needed.

-- 
Josh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 11/17] x86/cpu: Remove all SRSO interface nonsense
  2023-08-09  7:12 ` [RFC][PATCH 11/17] x86/cpu: Remove all SRSO interface nonsense Peter Zijlstra
  2023-08-09 13:10   ` Andrew.Cooper3
@ 2023-08-09 14:05   ` Josh Poimboeuf
  2023-08-09 14:43     ` Peter Zijlstra
  2023-08-09 15:34   ` Josh Poimboeuf
  2 siblings, 1 reply; 80+ messages in thread
From: Josh Poimboeuf @ 2023-08-09 14:05 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 09:12:29AM +0200, Peter Zijlstra wrote:
> @@ -2607,26 +2447,26 @@ static ssize_t srbds_show_state(char *bu
>  static ssize_t retbleed_show_state(char *buf)
>  {
>  	if (retbleed_mitigation == RETBLEED_MITIGATION_UNRET ||
> +	    retbleed_mitigation == RETBLEED_MITIGATION_UNRET_SRSO ||
> +	    retbleed_mitigation == RETBLEED_MITIGATION_UNRET_SRSO_ALIAS ||
>  	    retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {

These retbleed_show_state() changes probably belong in that other patch
which adds the retbleed= cmdline options.

> +
>  		if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
>  		    boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
>  			return sysfs_emit(buf, "Vulnerable: untrained return thunk / IBPB on non-AMD based uarch\n");
>  
> -		return sysfs_emit(buf, "%s; SMT %s\n", retbleed_strings[retbleed_mitigation],
> +		return sysfs_emit(buf, "%s; SMT %s%s\n", retbleed_strings[retbleed_mitigation],
>  				  !sched_smt_active() ? "disabled" :
>  				  spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT ||
>  				  spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED ?
> -				  "enabled with STIBP protection" : "vulnerable");
> -	}
> +				  "enabled with STIBP protection" : "vulnerable",
> +				  cpu_has_ibpb_brtype_microcode() ? "" : ", no SRSO microcode");

Hm?  What does missing microcode have to do with SMT?

-- 
Josh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-09 13:42   ` Josh Poimboeuf
@ 2023-08-09 14:06     ` Peter Zijlstra
  2023-08-09 14:28       ` Josh Poimboeuf
  2023-08-09 14:31     ` Andrew.Cooper3
  1 sibling, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09 14:06 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 09:42:33AM -0400, Josh Poimboeuf wrote:
> On Wed, Aug 09, 2023 at 09:12:24AM +0200, Peter Zijlstra wrote:
> >  static enum retbleed_mitigation retbleed_mitigation __ro_after_init =
> > @@ -796,6 +802,10 @@ static int __init retbleed_parse_cmdline
> >  			retbleed_cmd = RETBLEED_CMD_AUTO;
> >  		} else if (!strcmp(str, "unret")) {
> >  			retbleed_cmd = RETBLEED_CMD_UNRET;
> > +		} else if (!strcmp(str, "srso")) {
> > +			retbleed_cmd = RETBLEED_CMD_UNRET_SRSO;
> > +		} else if (!strcmp(str, "srso_alias")) {
> > +			retbleed_cmd = RETBLEED_CMD_UNRET_SRSO_ALIAS;
> 
> It doesn't make sense for "srso_alias" to be a separate cmdline option,
> as that option is a model-dependent variant of the SRSO mitigation.

so what I did with retbleed, and what should be fixed here too (I
forgot) is run with:

  retbleed=force,unret

on any random machine (typically Intel, because I have a distinct lack
of AMD machines :-() and look at the live kernel image to see if all the
patching worked.

I suppose I should add:

  setup_force_cpu_bug(X86_BUG_SRSO);

to the 'force' option, then:

  retbleed=force,srso_alias

should function the same, irrespective of the hardware.

I'm also of the opinion that the kernel should do as told, even if it
doesn't make sense. If you tell it nonsense, you get to keep the pieces.

So in that light, yes I think we should have separate options.
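
For illustration, a minimal sketch of what that could look like (this
assumes the existing 'force' handler in retbleed_parse_cmdline() already
does setup_force_cpu_bug(X86_BUG_RETBLEED); it is a sketch, not part of
this series):

		} else if (!strcmp(str, "force")) {
			/* pretend the bugs are there so the patching can be eyeballed */
			setup_force_cpu_bug(X86_BUG_RETBLEED);
			setup_force_cpu_bug(X86_BUG_SRSO);

With that, retbleed=force,srso_alias would behave the same on any box,
purely for inspecting the live patched image.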

> > +	if (boot_cpu_has_bug(X86_BUG_SRSO)) {
> > +		has_microcode = boot_cpu_has(X86_FEATURE_IBPB_BRTYPE) || cpu_has_ibpb_brtype_microcode();
> > +		if (!has_microcode) {
> > +			pr_warn("IBPB-extending microcode not applied!\n");
> > +			pr_warn(RETBLEED_SRSO_NOTICE);
> > +		} else {
> > +			/*
> > +			 * Enable the synthetic (even if in a real CPUID leaf)
> > +			 * flags for guests.
> > +			 */
> > +			setup_force_cpu_cap(X86_FEATURE_IBPB_BRTYPE);
> > +			setup_force_cpu_cap(X86_FEATURE_SBPB);
> > +
> > +			/*
> > +			 * Zen1/2 with SMT off aren't vulnerable after the right
> > +			 * IBPB microcode has been applied.
> > +			 */
> > +			if ((boot_cpu_data.x86 < 0x19) &&
> > +			    (cpu_smt_control == CPU_SMT_DISABLED))
> > +				setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
> 
> The rumor I heard was that SMT had to be disabled specifically by BIOS
> for this condition to be true.  Can somebody from AMD confirm?

David Kaplan is on Cc, I was hoping he would enlighten us -- once he's
back from PTO.

> The above check doesn't even remotely do that, in fact it does the
> opposite.  Regardless, at the very least it should be checking
> cpu_smt_possible().  I guess that would be a separate patch as it's a
> bug fix.
> 
> > @@ -870,8 +915,17 @@ static void __init retbleed_select_mitig
> >  	default:
> >  		if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
> >  		    boot_cpu_data.x86_vendor == X86_VENDOR_HYGON) {
> > -			if (IS_ENABLED(CONFIG_CPU_UNRET_ENTRY))
> > -				retbleed_mitigation = RETBLEED_MITIGATION_UNRET;
> > +			if (IS_ENABLED(CONFIG_CPU_UNRET_ENTRY)) {
> > +				if (boot_cpu_has_bug(X86_BUG_RETBLEED))
> > +					retbleed_mitigation = RETBLEED_MITIGATION_UNRET;
> > +
> > +				if (boot_cpu_has_bug(X86_BUG_SRSO) && !boot_cpu_has(X86_FEATURE_SRSO_NO)) {
> > +					if (boot_cpu_data.x86 == 0x19)
> > +						retbleed_mitigation = RETBLEED_MITIGATION_UNRET_SRSO_ALIAS;
> > +					else
> > +						retbleed_mitigation = RETBLEED_MITIGATION_UNRET_SRSO;
> 
> It would be great to get confirmation from somebody at AMD that the SRSO
> mitigations supersede the Retbleed one, i.e., that the SRSO mitigations
> also fix Retbleed.

They should, the discussions we had back then explained the Zen1/2
retbleed case in quite some detail and the srso case matches that
exactly with the movabs. A larger instruction is used because we need a
larger embedded sequence of instructions, but otherwise it is identical.

The comments provided for srso_alias state the BTB is untrained using
the explicit aliasing.

That is to say, AFAIU any of this, yes both srso options untrain the BTB
and mitigate the earlier retbleed thing.

SRSO then goes one step further with the RAP/RSB clobber.

> Yes, the original patches might imply that, but they also seem confused
> since they do the double untraining for both Retbleed and SRSO.  And I
> was given conflicting information.

Which makes no bloody sense, since the double untrain is for another
address, which is not at all used.

> Even better, we could really use some official AMD documentation for
> this mitigation, since right now it all feels very unofficial and
> unsubstantiated.

Seconded. David, is there anything you can do to clarify this stuff?

> 
> > +				}
> > +			}
> >  			else if (IS_ENABLED(CONFIG_CPU_IBPB_ENTRY) && boot_cpu_has(X86_FEATURE_IBPB))
> >  				retbleed_mitigation = RETBLEED_MITIGATION_IBPB;
> 
> This should have the microcode check too:
> 
> 				if (boot_cpu_has_bug(X86_BUG_SRSO) && !has_microcode)
> 					pr_err("IBPB-extending microcode not applied; SRSO NOT mitigated\n");

That earlier printk is printed regardless of the selected mitigation; do you
really want it printed again?


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 07/17] x86/cpu/kvm: Provide UNTRAIN_RET_VM
  2023-08-09 13:50   ` Josh Poimboeuf
@ 2023-08-09 14:06     ` Peter Zijlstra
  2023-08-09 14:30       ` Josh Poimboeuf
  0 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09 14:06 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 09:50:04AM -0400, Josh Poimboeuf wrote:
> On Wed, Aug 09, 2023 at 09:12:25AM +0200, Peter Zijlstra wrote:
> > With the difference being that UNTRAIN_RET_VM uses
> > X86_FEATURE_IBPB_ON_VMEXIT instead of X86_FEATURE_ENTRY_IBPB.
> > 
> > This cures VMEXIT doing potentially unret+IBPB or double IBPB.
> > 
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
> >  arch/x86/include/asm/nospec-branch.h |   11 +++++++++++
> >  arch/x86/kernel/cpu/bugs.c           |   17 ++++++++++++++++-
> >  arch/x86/kvm/svm/vmenter.S           |    7 ++-----
> >  3 files changed, 29 insertions(+), 6 deletions(-)
> > 
> > --- a/arch/x86/include/asm/nospec-branch.h
> > +++ b/arch/x86/include/asm/nospec-branch.h
> > @@ -299,6 +299,17 @@
> >  #endif
> >  .endm
> >  
> > +.macro UNTRAIN_RET_VM
> > +#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
> > +	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
> 
> Maybe can be simplified?
> 

See patches 9 and 10 :-)

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 12/17] x86/cpu: Rename original retbleed return thunk
  2023-08-09  7:12 ` [RFC][PATCH 12/17] x86/cpu: Rename original retbleed return thunk Peter Zijlstra
@ 2023-08-09 14:20   ` Josh Poimboeuf
  2023-08-09 14:22     ` Peter Zijlstra
  0 siblings, 1 reply; 80+ messages in thread
From: Josh Poimboeuf @ 2023-08-09 14:20 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 09:12:30AM +0200, Peter Zijlstra wrote:
> +++ b/tools/objtool/check.c
> @@ -455,7 +455,12 @@ static int decode_instructions(struct ob
>  				return -1;
>  			}
>  
> -			if (func->return_thunk || !strcmp(func->name, "srso_safe_ret") || func->alias != func)
> +			/*
> +			 * Both zen_return_thunk() and srso_safe_ret() are embedded inside
> +			 * another instruction and objtool doesn't grok that. Skip validating them.
> +			 */
> +			if (!strcmp(func->name, "zen_return_thunk") ||
> +			    !strcmp(func->name, "srso_safe_ret") || func->alias != func)

Hm, speaking of renaming they should probably be called
retbleed_return_thunk() and srso_return_thunk().

-- 
Josh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 12/17] x86/cpu: Rename original retbleed return thunk
  2023-08-09 14:20   ` Josh Poimboeuf
@ 2023-08-09 14:22     ` Peter Zijlstra
  2023-08-10 11:06       ` Andrew.Cooper3
  0 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09 14:22 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 10:20:31AM -0400, Josh Poimboeuf wrote:
> On Wed, Aug 09, 2023 at 09:12:30AM +0200, Peter Zijlstra wrote:
> > +++ b/tools/objtool/check.c
> > @@ -455,7 +455,12 @@ static int decode_instructions(struct ob
> >  				return -1;
> >  			}
> >  
> > -			if (func->return_thunk || !strcmp(func->name, "srso_safe_ret") || func->alias != func)
> > +			/*
> > +			 * Both zen_return_thunk() and srso_safe_ret() are embedded inside
> > +			 * another instruction and objtool doesn't grok that. Skip validating them.
> > +			 */
> > +			if (!strcmp(func->name, "zen_return_thunk") ||
> > +			    !strcmp(func->name, "srso_safe_ret") || func->alias != func)
> 
> Hm, speaking of renaming they should probably be called
> retbleed_return_thunk() and srso_return_thunk().

Yes, clearly naming is better in daylight. Let me regex that.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-09 14:06     ` Peter Zijlstra
@ 2023-08-09 14:28       ` Josh Poimboeuf
  2023-08-09 15:08         ` Peter Zijlstra
  0 siblings, 1 reply; 80+ messages in thread
From: Josh Poimboeuf @ 2023-08-09 14:28 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 04:06:03PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 09, 2023 at 09:42:33AM -0400, Josh Poimboeuf wrote:
> > On Wed, Aug 09, 2023 at 09:12:24AM +0200, Peter Zijlstra wrote:
> > >  static enum retbleed_mitigation retbleed_mitigation __ro_after_init =
> > > @@ -796,6 +802,10 @@ static int __init retbleed_parse_cmdline
> > >  			retbleed_cmd = RETBLEED_CMD_AUTO;
> > >  		} else if (!strcmp(str, "unret")) {
> > >  			retbleed_cmd = RETBLEED_CMD_UNRET;
> > > +		} else if (!strcmp(str, "srso")) {
> > > +			retbleed_cmd = RETBLEED_CMD_UNRET_SRSO;
> > > +		} else if (!strcmp(str, "srso_alias")) {
> > > +			retbleed_cmd = RETBLEED_CMD_UNRET_SRSO_ALIAS;
> > 
> > It doesn't make sense for "srso_alias" to be a separate cmdline option,
> > as that option is a model-dependent variant of the SRSO mitigation.
> 
> so what I did with retbleed, and what should be fixed here too (I
> forgot) is run with:
> 
>   retbleed=force,unret
> 
> on any random machine (typically Intel, because I have a distinct lack
> of AMD machines :-() and look at the live kernel image to see if all the
> patching worked.
> 
> I suppose I should add:
> 
>   setup_force_cpu_bug(X86_BUG_SRSO);
> 
> to the 'force' option, then:
> 
>   retbleed=force,srso_alias
> 
> should function the same, irrespective of the hardware.
> 
> I'm also of the opinion that the kernel should do as told, even if it
> doesn't make sense. If you tell it nonsense, you get to keep the pieces.
> 
> So in that light, yes I think we should have separate options.

What if I want the SRSO mitigation regardless of CPU model?

> > > @@ -870,8 +915,17 @@ static void __init retbleed_select_mitig
> > >  	default:
> > >  		if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
> > >  		    boot_cpu_data.x86_vendor == X86_VENDOR_HYGON) {
> > > -			if (IS_ENABLED(CONFIG_CPU_UNRET_ENTRY))
> > > -				retbleed_mitigation = RETBLEED_MITIGATION_UNRET;
> > > +			if (IS_ENABLED(CONFIG_CPU_UNRET_ENTRY)) {
> > > +				if (boot_cpu_has_bug(X86_BUG_RETBLEED))
> > > +					retbleed_mitigation = RETBLEED_MITIGATION_UNRET;
> > > +
> > > +				if (boot_cpu_has_bug(X86_BUG_SRSO) && !boot_cpu_has(X86_FEATURE_SRSO_NO)) {
> > > +					if (boot_cpu_data.x86 == 0x19)
> > > +						retbleed_mitigation = RETBLEED_MITIGATION_UNRET_SRSO_ALIAS;
> > > +					else
> > > +						retbleed_mitigation = RETBLEED_MITIGATION_UNRET_SRSO;
> > 
> > It would be great to get confirmation from somebody at AMD that the SRSO
> > mitigations supersede the Retbleed one, i.e., that the SRSO mitigations
> > also fix Retbleed.
> 
> They should, the discussions we had back then explained the Zen1/2
> retbleed case in quite some detail and the srso case matches that
> exactly with the movabs. A larger instruction is used because we need a
> larger embedded sequence of instructions, but otherwise it is identical.
> 
> The comments provided for srso_alias state the BTB is untrained using
> the explicit aliasing.
> 
> That is to say, AFAIU any of this, yes both srso options untrain the BTB
> and mitigate the earlier retbleed thing.
> 
> SRSO then goes one step further with the RAP/RSB clobber.

Ah, nice.  Please add that information somewhere (e.g., one of the
commit logs).

> > > +				}
> > > +			}
> > >  			else if (IS_ENABLED(CONFIG_CPU_IBPB_ENTRY) && boot_cpu_has(X86_FEATURE_IBPB))
> > >  				retbleed_mitigation = RETBLEED_MITIGATION_IBPB;
> > 
> > This should have the microcode check too:
> > 
> > 				if (boot_cpu_has_bug(X86_BUG_SRSO) && !has_microcode)
> > 					pr_err("IBPB-extending microcode not applied; SRSO NOT mitigated\n");
> 
> That earlier printk is printed regardless of the selected mitigation; do you
> really want it printed again?

Hm, if you don't want it printed twice then remove it from the
RETBLEED_CMD_IBPB case too.

-- 
Josh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 07/17] x86/cpu/kvm: Provide UNTRAIN_RET_VM
  2023-08-09 14:06     ` Peter Zijlstra
@ 2023-08-09 14:30       ` Josh Poimboeuf
  2023-08-09 15:10         ` Peter Zijlstra
  0 siblings, 1 reply; 80+ messages in thread
From: Josh Poimboeuf @ 2023-08-09 14:30 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 04:06:44PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 09, 2023 at 09:50:04AM -0400, Josh Poimboeuf wrote:
> > On Wed, Aug 09, 2023 at 09:12:25AM +0200, Peter Zijlstra wrote:
> > > With the difference being that UNTRAIN_RET_VM uses
> > > X86_FEATURE_IBPB_ON_VMEXIT instead of X86_FEATURE_ENTRY_IBPB.
> > > 
> > > This cures VMEXIT doing potentially unret+IBPB or double IBPB.
> > > 
> > > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > > ---
> > >  arch/x86/include/asm/nospec-branch.h |   11 +++++++++++
> > >  arch/x86/kernel/cpu/bugs.c           |   17 ++++++++++++++++-
> > >  arch/x86/kvm/svm/vmenter.S           |    7 ++-----
> > >  3 files changed, 29 insertions(+), 6 deletions(-)
> > > 
> > > --- a/arch/x86/include/asm/nospec-branch.h
> > > +++ b/arch/x86/include/asm/nospec-branch.h
> > > @@ -299,6 +299,17 @@
> > >  #endif
> > >  .endm
> > >  
> > > +.macro UNTRAIN_RET_VM
> > > +#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
> > > +	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
> > 
> > Maybe can be simplified?
> > 
> 
> See patches 9 and 10 :-)

Can still be further simplified to CONFIG_RETHUNK?
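
I.e. something like this (just a sketch; it assumes patches 9/10 have
already dropped the extra config symbols and that the macro body itself
stays untouched):

	.macro UNTRAIN_RET_VM
	#ifdef CONFIG_RETHUNK
		/* existing untrain / IBPB-on-VMEXIT alternatives, unchanged */
	#endif
	.endm

Whether the IBPB-on-VMEXIT leg still wants CONFIG_CPU_IBPB_ENTRY on its
own is the open question here.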

-- 
Josh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-09 13:42   ` Josh Poimboeuf
  2023-08-09 14:06     ` Peter Zijlstra
@ 2023-08-09 14:31     ` Andrew.Cooper3
  2023-08-09 14:39       ` Josh Poimboeuf
  1 sibling, 1 reply; 80+ messages in thread
From: Andrew.Cooper3 @ 2023-08-09 14:31 UTC (permalink / raw)
  To: Josh Poimboeuf, Peter Zijlstra; +Cc: x86, linux-kernel, David.Kaplan, gregkh

On 09/08/2023 2:42 pm, Josh Poimboeuf wrote:
> On Wed, Aug 09, 2023 at 09:12:24AM +0200, Peter Zijlstra wrote:
>> +	if (boot_cpu_has_bug(X86_BUG_SRSO)) {
>> +		has_microcode = boot_cpu_has(X86_FEATURE_IBPB_BRTYPE) || cpu_has_ibpb_brtype_microcode();
>> +		if (!has_microcode) {
>> +			pr_warn("IBPB-extending microcode not applied!\n");
>> +			pr_warn(RETBLEED_SRSO_NOTICE);
>> +		} else {
>> +			/*
>> +			 * Enable the synthetic (even if in a real CPUID leaf)
>> +			 * flags for guests.
>> +			 */
>> +			setup_force_cpu_cap(X86_FEATURE_IBPB_BRTYPE);
>> +			setup_force_cpu_cap(X86_FEATURE_SBPB);
>> +
>> +			/*
>> +			 * Zen1/2 with SMT off aren't vulnerable after the right
>> +			 * IBPB microcode has been applied.
>> +			 */
>> +			if ((boot_cpu_data.x86 < 0x19) &&
>> +			    (cpu_smt_control == CPU_SMT_DISABLED))
>> +				setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
> The rumor I heard was that SMT had to be disabled specifically by BIOS
> for this condition to be true.  Can somebody from AMD confirm?

It's Complicated.

On Zen1/2, uarch constraints mitigate SRSO when the core is in 1T mode,
where such an attack would succeed in 2T mode.  Specifically, it is
believed that the SRSO infinite-call-loop can poison more than 16
RSB/RAS/RAP entries, but can't poison 32 entries.

The RSB dynamically repartitions depending on the idleness of the
sibling.  Therefore, offlining/parking the siblings should make you
safe.  (Assuming you can handwave away the NMI hitting the parked thread
case as outside of an attacker's control.)


In Xen, I decided that synthesizing SRSO_NO was only safe when SMT was
disabled by firmware, because that's the only case where it can't cease
being true later by admin action.

If it were just Xen's safety that mattered here it might be ok to allow
the OS SMT=0 cases, but this bit needs to get into guests, you can't
credibly tell the guest SRSO_NO and then make it unsafe at a later point.

~Andrew

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 13/17] objtool/x86: Add arch_is_offset_insn()
  2023-08-09  7:12 ` [RFC][PATCH 13/17] objtool/x86: Add arch_is_offset_insn() Peter Zijlstra
  2023-08-09  9:56   ` Nikolay Borisov
@ 2023-08-09 14:34   ` Josh Poimboeuf
  1 sibling, 0 replies; 80+ messages in thread
From: Josh Poimboeuf @ 2023-08-09 14:34 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 09:12:31AM +0200, Peter Zijlstra wrote:
> Add a little wrappery to identify the magic symbols that are actually
> inside another instruction -- yay for variable length instruction
> encoding.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  tools/objtool/arch/x86/decode.c      |    6 ++++++
>  tools/objtool/check.c                |   13 ++++++++++---
>  tools/objtool/include/objtool/arch.h |    1 +
>  tools/objtool/include/objtool/elf.h  |    1 +
>  4 files changed, 18 insertions(+), 3 deletions(-)
> 
> --- a/tools/objtool/arch/x86/decode.c
> +++ b/tools/objtool/arch/x86/decode.c
> @@ -826,3 +826,9 @@ bool arch_is_rethunk(struct symbol *sym)
>  {
>  	return !strcmp(sym->name, "__x86_return_thunk");
>  }
> +
> +bool arch_is_offset_insn(struct symbol *sym)
> +{
> +	return !strcmp(sym->name, "zen_return_thunk") ||
> +	       !strcmp(sym->name, "srso_safe_ret");
> +}

arch_is_embedded_in_insn()?

-- 
Josh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-09 14:31     ` Andrew.Cooper3
@ 2023-08-09 14:39       ` Josh Poimboeuf
  0 siblings, 0 replies; 80+ messages in thread
From: Josh Poimboeuf @ 2023-08-09 14:39 UTC (permalink / raw)
  To: Andrew.Cooper3; +Cc: Peter Zijlstra, x86, linux-kernel, David.Kaplan, gregkh

On Wed, Aug 09, 2023 at 03:31:20PM +0100, Andrew.Cooper3@citrix.com wrote:
> On 09/08/2023 2:42 pm, Josh Poimboeuf wrote:
> > On Wed, Aug 09, 2023 at 09:12:24AM +0200, Peter Zijlstra wrote:
> >> +	if (boot_cpu_has_bug(X86_BUG_SRSO)) {
> >> +		has_microcode = boot_cpu_has(X86_FEATURE_IBPB_BRTYPE) || cpu_has_ibpb_brtype_microcode();
> >> +		if (!has_microcode) {
> >> +			pr_warn("IBPB-extending microcode not applied!\n");
> >> +			pr_warn(RETBLEED_SRSO_NOTICE);
> >> +		} else {
> >> +			/*
> >> +			 * Enable the synthetic (even if in a real CPUID leaf)
> >> +			 * flags for guests.
> >> +			 */
> >> +			setup_force_cpu_cap(X86_FEATURE_IBPB_BRTYPE);
> >> +			setup_force_cpu_cap(X86_FEATURE_SBPB);
> >> +
> >> +			/*
> >> +			 * Zen1/2 with SMT off aren't vulnerable after the right
> >> +			 * IBPB microcode has been applied.
> >> +			 */
> >> +			if ((boot_cpu_data.x86 < 0x19) &&
> >> +			    (cpu_smt_control == CPU_SMT_DISABLED))
> >> +				setup_force_cpu_cap(X86_FEATURE_SRSO_NO);
> > The rumor I heard was that SMT had to be disabled specifically by BIOS
> > for this condition to be true.  Can somebody from AMD confirm?
> 
> It's Complicated.
> 
> On Zen1/2, uarch constraints mitigate SRSO when the core is in 1T mode,
> where such an attack would succeed in 2T mode.  Specifically, it is
> believed that the SRSO infinite-call-loop can poison more than 16
> RSB/RAS/RAP entries, but can't poison 32 entries.
> 
> The RSB dynamically repartitions depending on the idleness of the
> sibling.  Therefore, offlining/parking the siblings should make you
> safe.  (Assuming you can handwave away the NMI hitting the parked thread
> case as outside of an attacker's control.)
> 
> 
> In Xen, I decided that synthesizing SRSO_NO was only safe when SMT was
> disabled by firmware, because that's the only case where it can't cease
> being true later by admin action.
> 
> If it were just Xen's safety that mattered here it might be ok to allow
> the OS SMT=0 cases, but this bit needs to get into guests, you can't
> credibly tell the guest SRSO_NO and then make it unsafe at a later point.

Thanks for that explanation.  It sounds like we can use
!cpu_smt_possible() here.
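
I.e. roughly this (a sketch only; where exactly it lands in
retbleed_select_mitigation() is an assumption):

	/*
	 * Zen1/2 with the IBPB microcode are only safe from SRSO if SMT
	 * can never become active again, not merely offlined right now.
	 */
	if (boot_cpu_data.x86 < 0x19 && !cpu_smt_possible())
		setup_force_cpu_cap(X86_FEATURE_SRSO_NO);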

-- 
Josh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 11/17] x86/cpu: Remove all SRSO interface nonsense
  2023-08-09 14:05   ` Josh Poimboeuf
@ 2023-08-09 14:43     ` Peter Zijlstra
  2023-08-09 14:51       ` Josh Poimboeuf
  0 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09 14:43 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 10:05:30AM -0400, Josh Poimboeuf wrote:
> On Wed, Aug 09, 2023 at 09:12:29AM +0200, Peter Zijlstra wrote:
> > @@ -2607,26 +2447,26 @@ static ssize_t srbds_show_state(char *bu
> >  static ssize_t retbleed_show_state(char *buf)
> >  {
> >  	if (retbleed_mitigation == RETBLEED_MITIGATION_UNRET ||
> > +	    retbleed_mitigation == RETBLEED_MITIGATION_UNRET_SRSO ||
> > +	    retbleed_mitigation == RETBLEED_MITIGATION_UNRET_SRSO_ALIAS ||
> >  	    retbleed_mitigation == RETBLEED_MITIGATION_IBPB) {
> 
> These retbleed_show_state() changes probably belong in that other patch
> which adds the retbleed= cmdline options.

Ah yes, that hunk got lost. Let me move it there.

> > +
> >  		if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
> >  		    boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
> >  			return sysfs_emit(buf, "Vulnerable: untrained return thunk / IBPB on non-AMD based uarch\n");
> >  
> > -		return sysfs_emit(buf, "%s; SMT %s\n", retbleed_strings[retbleed_mitigation],
> > +		return sysfs_emit(buf, "%s; SMT %s%s\n", retbleed_strings[retbleed_mitigation],
> >  				  !sched_smt_active() ? "disabled" :
> >  				  spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT ||
> >  				  spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED ?
> > -				  "enabled with STIBP protection" : "vulnerable");
> > -	}
> > +				  "enabled with STIBP protection" : "vulnerable",
> > +				  cpu_has_ibpb_brtype_microcode() ? "" : ", no SRSO microcode");
> 
> Hm?  What does missing microcode have to do with SMT?

A semicolon then, instead of a comma?

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 11/17] x86/cpu: Remove all SRSO interface nonsense
  2023-08-09 14:43     ` Peter Zijlstra
@ 2023-08-09 14:51       ` Josh Poimboeuf
  0 siblings, 0 replies; 80+ messages in thread
From: Josh Poimboeuf @ 2023-08-09 14:51 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 04:43:35PM +0200, Peter Zijlstra wrote:
> > >  		if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
> > >  		    boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
> > >  			return sysfs_emit(buf, "Vulnerable: untrained return thunk / IBPB on non-AMD based uarch\n");
> > >  
> > > -		return sysfs_emit(buf, "%s; SMT %s\n", retbleed_strings[retbleed_mitigation],
> > > +		return sysfs_emit(buf, "%s; SMT %s%s\n", retbleed_strings[retbleed_mitigation],
> > >  				  !sched_smt_active() ? "disabled" :
> > >  				  spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT ||
> > >  				  spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED ?
> > > -				  "enabled with STIBP protection" : "vulnerable");
> > > -	}
> > > +				  "enabled with STIBP protection" : "vulnerable",
> > > +				  cpu_has_ibpb_brtype_microcode() ? "" : ", no SRSO microcode");
> > 
> > Hm?  What does missing microcode have to do with SMT?
> 
> A semicolon then, instead of a comma?

Nm, I was confused.  Comma is fine.

-- 
Josh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-09 14:28       ` Josh Poimboeuf
@ 2023-08-09 15:08         ` Peter Zijlstra
  2023-08-09 15:43           ` Josh Poimboeuf
  0 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09 15:08 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 10:28:47AM -0400, Josh Poimboeuf wrote:
> On Wed, Aug 09, 2023 at 04:06:03PM +0200, Peter Zijlstra wrote:
> > On Wed, Aug 09, 2023 at 09:42:33AM -0400, Josh Poimboeuf wrote:
> > > On Wed, Aug 09, 2023 at 09:12:24AM +0200, Peter Zijlstra wrote:
> > > >  static enum retbleed_mitigation retbleed_mitigation __ro_after_init =
> > > > @@ -796,6 +802,10 @@ static int __init retbleed_parse_cmdline
> > > >  			retbleed_cmd = RETBLEED_CMD_AUTO;
> > > >  		} else if (!strcmp(str, "unret")) {
> > > >  			retbleed_cmd = RETBLEED_CMD_UNRET;
> > > > +		} else if (!strcmp(str, "srso")) {
> > > > +			retbleed_cmd = RETBLEED_CMD_UNRET_SRSO;
> > > > +		} else if (!strcmp(str, "srso_alias")) {
> > > > +			retbleed_cmd = RETBLEED_CMD_UNRET_SRSO_ALIAS;
> > > 
> > > It doesn't make sense for "srso_alias" to be a separate cmdline option,
> > > as that option is a model-dependent variant of the SRSO mitigation.
> > 
> > so what I did with retbleed, and what should be fixed here too (I
> > forgot) is run with:
> > 
> >   retbleed=force,unret
> > 
> > on any random machine (typically Intel, because I have a distinct lack
> > of AMD machines :-() and look at the life kernel image to see if all the
> > patching worked.
> > 
> > I suppose I should add:
> > 
> >   setup_force_cpu_bug(X86_BUG_SRSO);
> > 
> > to the 'force' option, then:
> > 
> >   retbleed=force,srso_alias
> > 
> > should function the same, irrespective of the hardware.
> > 
> > I'm also of the opinion that the kernel should do as told, even if it
> > doesn't make sense. If you tell it nonsense, you get to keep the pieces.
> > 
> > So in that light, yes I think we should have separate options.
> 
> What if I want the SRSO mitigation regardless of CPU model?

You mean, you want to say 'any of the SRSO things, you pick the right
version?'

Which means the user has an AMD machine, but can't be arsed to figure
out which and somehow doesn't want to use AUTO?




> > They should, the discussions we had back then explained the Zen1/2
> > retbleed case in quite some detail and the srso case matches that
> > exactly with the movabs. A larger instruction is used because we need a
> > larger embedded sequence of instructions, but otherwise it is identical.
> > 
> > The comments provided for srso_alias state the BTB is untrained using
> > the explicit aliasing.
> > 
> > That is to say, AFAIU any of this, yes both srso options untrain the BTB
> > and mitigate the earlier retbleed thing.
> > 
> > SRSO then goes one step further with the RAP/RSB clobber.
> 
> Ah, nice.  Please add that information somewhere (e.g., one of the
> commit logs).

The comment above zen_untrain_ret (or retbleed_untrain_ret as you've
christened it) not clear enough?

/*
 * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
 * 1) The RET at retbleed_return_thunk must be on a 64 byte boundary, for
 *    alignment within the BTB.
 * 2) The instruction at retbleed_untrain_ret must contain, and not
 *    end with, the 0xc3 byte of the RET.
 * 3) STIBP must be enabled, or SMT disabled, to prevent the sibling thread
 *    from re-poisioning the BTB prediction.
 */


Hmm, when compared to:

	.align 64
	.skip 64 - (srso_safe_ret - srso_untrain_ret), 0xcc
SYM_START(srso_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
ANNOTATE_NOENDBR
	.byte 0x48, 0xb8

SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
	add $8, %_ASM_SP
	ret
	int3
	int3
	int3
	/* end of movabs */
	lfence
	jmp srso_return_thunk
	int3
SYM_CODE_END(srso_safe_ret)
SYM_FUNC_END(srso_untrain_ret)

Then we match 2, srso_safe_ret is strictly *inside* the movabs, that is,
it is not the first nor the last byte of the outer instruction.

However, we fail at 1, 'add $8, %rsp' sits at the boundary, not ret.

Anybody, help ?

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 07/17] x86/cpu/kvm: Provide UNTRAIN_RET_VM
  2023-08-09 14:30       ` Josh Poimboeuf
@ 2023-08-09 15:10         ` Peter Zijlstra
  0 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-09 15:10 UTC (permalink / raw)
  To: Josh Poimboeuf; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 10:30:00AM -0400, Josh Poimboeuf wrote:
> On Wed, Aug 09, 2023 at 04:06:44PM +0200, Peter Zijlstra wrote:
> > On Wed, Aug 09, 2023 at 09:50:04AM -0400, Josh Poimboeuf wrote:
> > > On Wed, Aug 09, 2023 at 09:12:25AM +0200, Peter Zijlstra wrote:
> > > > With the difference being that UNTRAIN_RET_VM uses
> > > > X86_FEATURE_IBPB_ON_VMEXIT instead of X86_FEATURE_ENTRY_IBPB.
> > > > 
> > > > This cures VMEXIT doing potentially unret+IBPB or double IBPB.
> > > > 
> > > > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > > > ---
> > > >  arch/x86/include/asm/nospec-branch.h |   11 +++++++++++
> > > >  arch/x86/kernel/cpu/bugs.c           |   17 ++++++++++++++++-
> > > >  arch/x86/kvm/svm/vmenter.S           |    7 ++-----
> > > >  3 files changed, 29 insertions(+), 6 deletions(-)
> > > > 
> > > > --- a/arch/x86/include/asm/nospec-branch.h
> > > > +++ b/arch/x86/include/asm/nospec-branch.h
> > > > @@ -299,6 +299,17 @@
> > > >  #endif
> > > >  .endm
> > > >  
> > > > +.macro UNTRAIN_RET_VM
> > > > +#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
> > > > +	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
> > > 
> > > Maybe can be simplified?
> > > 
> > 
> > See patches 9 and 10 :-)
> 
> Can still be further simplified to CONFIG_RETHUNK?

Hmm, probably, but my brain seems to be giving out. Let me do the whole
dinner and kids to bed thing and I'll look again later.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 11/17] x86/cpu: Remove all SRSO interface nonsense
  2023-08-09  7:12 ` [RFC][PATCH 11/17] x86/cpu: Remove all SRSO interface nonsense Peter Zijlstra
  2023-08-09 13:10   ` Andrew.Cooper3
  2023-08-09 14:05   ` Josh Poimboeuf
@ 2023-08-09 15:34   ` Josh Poimboeuf
  2 siblings, 0 replies; 80+ messages in thread
From: Josh Poimboeuf @ 2023-08-09 15:34 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 09:12:29AM +0200, Peter Zijlstra wrote:
> Now that retbleed can do all that the srso knob did, and without the
> dubious interactions with retbleed selections, remove it.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/kernel/cpu/bugs.c |  188 ++-------------------------------------------
>  drivers/base/cpu.c         |    8 -
>  include/linux/cpu.h        |    2 
>  3 files changed, 10 insertions(+), 188 deletions(-)

Documentation/admin-guide/hw-vuln/srso.rst also needs to be updated...

-- 
Josh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-09 15:08         ` Peter Zijlstra
@ 2023-08-09 15:43           ` Josh Poimboeuf
  0 siblings, 0 replies; 80+ messages in thread
From: Josh Poimboeuf @ 2023-08-09 15:43 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 05:08:52PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 09, 2023 at 10:28:47AM -0400, Josh Poimboeuf wrote:
> > On Wed, Aug 09, 2023 at 04:06:03PM +0200, Peter Zijlstra wrote:
> > > On Wed, Aug 09, 2023 at 09:42:33AM -0400, Josh Poimboeuf wrote:
> > > > On Wed, Aug 09, 2023 at 09:12:24AM +0200, Peter Zijlstra wrote:
> > > > >  static enum retbleed_mitigation retbleed_mitigation __ro_after_init =
> > > > > @@ -796,6 +802,10 @@ static int __init retbleed_parse_cmdline
> > > > >  			retbleed_cmd = RETBLEED_CMD_AUTO;
> > > > >  		} else if (!strcmp(str, "unret")) {
> > > > >  			retbleed_cmd = RETBLEED_CMD_UNRET;
> > > > > +		} else if (!strcmp(str, "srso")) {
> > > > > +			retbleed_cmd = RETBLEED_CMD_UNRET_SRSO;
> > > > > +		} else if (!strcmp(str, "srso_alias")) {
> > > > > +			retbleed_cmd = RETBLEED_CMD_UNRET_SRSO_ALIAS;
> > > > 
> > > > It doesn't make sense for "srso_alias" to be a separate cmdline option,
> > > > as that option is a model-dependent variant of the SRSO mitigation.
> > > 
> > > so what I did with retbleed, and what should be fixed here too (I
> > > forgot) is run with:
> > > 
> > >   retbleed=force,unret
> > > 
> > > on any random machine (typically Intel, because I have a distinct lack
> > > of AMD machines :-() and look at the live kernel image to see if all the
> > > patching worked.
> > > 
> > > I suppose I should add:
> > > 
> > >   setup_force_cpu_bug(X86_BUG_SRSO);
> > > 
> > > to the 'force' option, then:
> > > 
> > >   retbleed=force,srso_alias
> > > 
> > > should function the same, irrespective of the hardware.
> > > 
> > > I'm also of the opinion that the kernel should do as told, even if it
> > > doesn't make sense. If you tell it nonsense, you get to keep the pieces.
> > > 
> > > So in that light, yes I think we should have separate options.
> > 
> > What if I want the SRSO mitigation regardless of CPU model?
> 
> You mean, you want to say 'any of the SRSO things, you pick the right
> version?'
> 
> Which means the user has an AMD machine, but can't be arsed to figure
> out which and somehow doesn't want to use AUTO?

Well, nobody's going to use any of these options anyway so it doesn't
matter regardless.

> > > They should, the discussions we had back then explained the Zen1/2
> > > retbleed case in quite some detail and the srso case matches that
> > > exactly with the movabs. A larger instruction is used because we need a
> > > larger embedded sequence of instructions, but otherwise it is identical.
> > > 
> > > The comments provided for srso_alias state the BTB is untrained using
> > > the explicit aliasing.
> > > 
> > > That is to say, AFAIU any of this, yes both srso options untrain the BTB
> > > and mitigate the earlier retbleed thing.
> > > 
> > > SRSO then goes one step further with the RAP/RSB clobber.
> > 
> > Ah, nice.  Please add that information somewhere (e.g., one of the
> > commit logs).
> 
> The comment above zen_untrain_ret (or retbleed_untrain_ret as you've
> christened it) not clear enough?
> 
> /*
>  * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
>  * 1) The RET at retbleed_return_thunk must be on a 64 byte boundary, for
>  *    alignment within the BTB.
>  * 2) The instruction at retbleed_untrain_ret must contain, and not
>  *    end with, the 0xc3 byte of the RET.
>  * 3) STIBP must be enabled, or SMT disabled, to prevent the sibling thread
>  *    from re-poisioning the BTB prediction.
>  */

To me, it's only clear now that you connected the dots.

> Hmm, when compared to:
> 
> 	.align 64
> 	.skip 64 - (srso_safe_ret - srso_untrain_ret), 0xcc
> SYM_START(srso_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
> ANNOTATE_NOENDBR
> 	.byte 0x48, 0xb8
> 
> SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
> 	add $8, %_ASM_SP
> 	ret
> 	int3
> 	int3
> 	int3
> 	/* end of movabs */
> 	lfence
> 	jmp srso_return_thunk
> 	int3
> SYM_CODE_END(srso_safe_ret)
> SYM_FUNC_END(srso_untrain_ret)
> 
> Then we match 2, srso_safe_ret is strictly *inside* the movabs, that is,
> it is not the first nor the last byte of the outer instruction.
> 
> However, we fail at 1, 'add $8, %rsp' sits at the boundary, not ret.
> 
> Anybody, help ?

Um... yeah, doesn't look right.

-- 
Josh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 02/17] x86/cpu: Clean up SRSO return thunk mess
  2023-08-09  7:12 ` [RFC][PATCH 02/17] x86/cpu: Clean up SRSO return thunk mess Peter Zijlstra
@ 2023-08-09 15:45   ` Nikolay Borisov
  2023-08-10 11:51   ` Borislav Petkov
  1 sibling, 0 replies; 80+ messages in thread
From: Nikolay Borisov @ 2023-08-09 15:45 UTC (permalink / raw)
  To: Peter Zijlstra, x86
  Cc: linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh



On 9.08.23 at 10:12, Peter Zijlstra wrote:
> Use the existing configurable return thunk. There is absolute no
> justification for having created this __x86_return_thunk alternative.
> 
> To clarify, the whole thing looks like:
> 
> Zen3/4 does:
> 
>    srso_alias_untrain_ret:
> 	  nop2
> 	  lfence
> 	  jmp srso_alias_return_thunk
> 	  int3
> 
>    srso_alias_safe_ret: // aliasses srso_alias_untrain_ret just so
> 	  add $8, %rsp
> 	  ret
> 	  int3
> 
>    srso_alias_return_thunk:
> 	  call srso_alias_safe_ret
> 	  ud2
> 
> While Zen1/2 does:
> 
>    srso_untrain_ret:
> 	  movabs $foo, %rax
> 	  lfence
> 	  call srso_safe_ret           (jmp srso_return_thunk ?)
> 	  int3
> 
>    srso_safe_ret: // embedded in movabs immediate
> 	  add $8,%rsp
>            ret
>            int3
> 
>    srso_return_thunk:
> 	  call srso_safe_ret
> 	  ud2
> 
> While retbleed does:
> 
>    zen_untrain_ret:
> 	  test $0xcc, %bl
> 	  lfence
> 	  jmp zen_return_thunk
>            int3
> 
>    zen_return_thunk: // embedded in the test instruction
> 	  ret
>            int3
> 
> Where Zen1/2 flush the BTB using the instruction decoder trick
> (test,movabs) Zen3/4 use instruction aliasing. SRSO adds RSB (RAP in
> AMD speak) stuffing to force a return mis-predict.
> 
> That is; the AMD retbleed is a form of Speculative-Type-Confusion
> where the branch predictor is trained to use the BTB to predict the
> RET address, while AMD inception/SRSO is a form of
> Speculative-Type-Confusion where another instruction is trained to be
> treated like a CALL instruction and poison the RSB (RAP).
> 
> Pick one of three options at boot.
> 


So this boils down to simply removing one level of indirection: instead
of patching the body of __x86_return_thunk, you directly patch the return
sites with the correct thunk.


Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 12/17] x86/cpu: Rename original retbleed return thunk
  2023-08-09 14:22     ` Peter Zijlstra
@ 2023-08-10 11:06       ` Andrew.Cooper3
  2023-08-10 13:02         ` Peter Zijlstra
  0 siblings, 1 reply; 80+ messages in thread
From: Andrew.Cooper3 @ 2023-08-10 11:06 UTC (permalink / raw)
  To: Peter Zijlstra, Josh Poimboeuf; +Cc: x86, linux-kernel, David.Kaplan, gregkh

On 09/08/2023 3:22 pm, Peter Zijlstra wrote:
> On Wed, Aug 09, 2023 at 10:20:31AM -0400, Josh Poimboeuf wrote:
>> On Wed, Aug 09, 2023 at 09:12:30AM +0200, Peter Zijlstra wrote:
>>> +++ b/tools/objtool/check.c
>>> @@ -455,7 +455,12 @@ static int decode_instructions(struct ob
>>>  				return -1;
>>>  			}
>>>  
>>> -			if (func->return_thunk || !strcmp(func->name, "srso_safe_ret") || func->alias != func)
>>> +			/*
>>> +			 * Both zen_return_thunk() and srso_safe_ret() are embedded inside
>>> +			 * another instruction and objtool doesn't grok that. Skip validating them.
>>> +			 */
>>> +			if (!strcmp(func->name, "zen_return_thunk") ||
>>> +			    !strcmp(func->name, "srso_safe_ret") || func->alias != func)
>> Hm, speaking of renaming they should probably be called
>> retbleed_return_thunk() and srso_return_thunk().
> Yes, clearly naming is better in daylight. Let me regex that.

btc_*, not retbleed_*.

That way it matches the terminology you'll find in the AMD whitepaper
about what's going on, and there's already an entirely different issue
called Retbleed.

~Andrew

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 01/17] x86/alternative: Unconditional custom return thunk
  2023-08-09  7:12 ` [RFC][PATCH 01/17] x86/alternative: Unconditional custom return thunk Peter Zijlstra
  2023-08-09  9:31   ` Nikolay Borisov
@ 2023-08-10 11:37   ` Borislav Petkov
  1 sibling, 0 replies; 80+ messages in thread
From: Borislav Petkov @ 2023-08-10 11:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

On Wed, Aug 09, 2023 at 09:12:19AM +0200, Peter Zijlstra wrote:

Please add a verb to the subject:

"Make the custom return thunk unconditional"

or so.

> There is infrastructure to rewrite return thunks to point to any
> random thunk one desires, unwrap that from CALL_THUNKS, which up to
> now was the sole user of that.

And yes, provided we can do the thing in the next patch which makes
sense to me, ack.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 02/17] x86/cpu: Clean up SRSO return thunk mess
  2023-08-09  7:12 ` [RFC][PATCH 02/17] x86/cpu: Clean up SRSO return thunk mess Peter Zijlstra
  2023-08-09 15:45   ` Nikolay Borisov
@ 2023-08-10 11:51   ` Borislav Petkov
  2023-08-10 12:37     ` Peter Zijlstra
  1 sibling, 1 reply; 80+ messages in thread
From: Borislav Petkov @ 2023-08-10 11:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

On Wed, Aug 09, 2023 at 09:12:20AM +0200, Peter Zijlstra wrote:
> Where Zen1/2 flush the BTB using the instruction decoder trick
> (test,movabs) Zen3/4 use instruction aliasing. SRSO adds RSB (RAP in

BTB aliasing.

> AMD speak) stuffing to force a return mis-predict.

No it doesn't. It causes BTB aliasing which evicts any potentially
poisoned entries.

> That is; the AMD retbleed is a form of Speculative-Type-Confusion
> where the branch predictor is trained to use the BTB to predict the
> RET address, while AMD inception/SRSO is a form of
> Speculative-Type-Confusion where another instruction is trained to be
> treated like a CALL instruction and poison the RSB (RAP).

Nope, Andy explained it already in the 0th message.

> Pick one of three options at boot.

Yes, provided microarchitecturally that works, I'm all for removing the
__ret alternative.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 03/17] x86/cpu: Make srso_untrain_ret consistent
  2023-08-09  7:12 ` [RFC][PATCH 03/17] x86/cpu: Make srso_untrain_ret consistent Peter Zijlstra
@ 2023-08-10 12:00   ` Borislav Petkov
  0 siblings, 0 replies; 80+ messages in thread
From: Borislav Petkov @ 2023-08-10 12:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

On Wed, Aug 09, 2023 at 09:12:21AM +0200, Peter Zijlstra wrote:
> This does change srso_untrain_ret a little to be more consistent with
> srso_alias_untrain_ret (and zen_untrain_ret). Specifically I made
> srso_untrain_ret tail-call the srso_return_thunk, instead of doing the
> call directly. This matches how srso_alias_untrain_ret amd
> zen_untrain_ret also tail-call their respective return_thunk.
> 
> If this is a problem this can be easily fixed and a comment added to
> explain -- but this way they all end with a tail-call to their own
> return-thunk, which is nice and consistent.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/lib/retpoline.S |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- a/arch/x86/lib/retpoline.S
> +++ b/arch/x86/lib/retpoline.S
> @@ -262,7 +262,7 @@ SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLO
>  	int3
>  	/* end of movabs */
>  	lfence
> -	call srso_safe_ret
> +	jmp srso_return_thunk
>  	int3
>  SYM_CODE_END(srso_safe_ret)
>  SYM_FUNC_END(srso_untrain_ret)

I don't see a problem with it but I'd let David comment here.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 04/17] objtool/x86: Fix SRSO mess
  2023-08-09  7:12 ` [RFC][PATCH 04/17] objtool/x86: Fix SRSO mess Peter Zijlstra
@ 2023-08-10 12:06   ` Borislav Petkov
  2023-08-10 12:48     ` Peter Zijlstra
  0 siblings, 1 reply; 80+ messages in thread
From: Borislav Petkov @ 2023-08-10 12:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

On Wed, Aug 09, 2023 at 09:12:22AM +0200, Peter Zijlstra wrote:
> arch_is_rethunk() indicates those functions that will be considered
> equivalent to INSN_RETURN and any callsite will be added to the
> .return_sites section.
> 
> Notably: srso_untrain_ret, srso_safe_ret and __ret do not qualify.

They do not qualify *anymore*. Before your changes, the last two would
call RET.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 02/17] x86/cpu: Clean up SRSO return thunk mess
  2023-08-10 11:51   ` Borislav Petkov
@ 2023-08-10 12:37     ` Peter Zijlstra
  2023-08-10 12:56       ` Borislav Petkov
  2023-08-11  7:01       ` Peter Zijlstra
  0 siblings, 2 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-10 12:37 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

On Thu, Aug 10, 2023 at 01:51:48PM +0200, Borislav Petkov wrote:
> On Wed, Aug 09, 2023 at 09:12:20AM +0200, Peter Zijlstra wrote:
> > Where Zen1/2 flush the BTB using the instruction decoder trick
> > (test,movabs) Zen3/4 use instruction aliasing. SRSO adds RSB (RAP in
> 
> BTB aliasing.
> 
> > AMD speak) stuffing to force a return mis-predict.
> 
> No it doesn't. It causes BTB aliasing which evicts any potentially
> poisoned entries.

It does; so zen1/2 use the decoder thing to flush BTB entry of the RET,
both retbleed and srso do.

Then zen3/4 use the aliasing trick to flush the BTB entry of the RET.

Then both srso options use RSB/RAP stuffing to force a mispredict there.
Retbleed doesn't do this.

retbleed is about BTB, srso does both BTB and RSB/RAP.

> > That is; the AMD retbleed is a form of Speculative-Type-Confusion
> > where the branch predictor is trained to use the BTB to predict the
> > RET address, while AMD inception/SRSO is a form of
> > Speculative-Type-Confusion where another instruction is trained to be
> > treated like a CALL instruction and poison the RSB (RAP).
> 
> Nope, Andy explained it already in the 0th message.

I'm still of the opinion that branch-type-confusion is an integral part
of setting up the srso RSB/RAP trickery. It just targets a different
predictor, RSB/RAP vs BTB.

> > Pick one of three options at boot.
> 
> Yes, provided microarchitecturally that works, I'm all for removing the
> __ret alternative.

So this patch doesn't actually change anything except one layer of
indirection.

Your thing does:

SYM_FUNC_START(foo)
	...
	ALTERNATIVE "ret; int3",
		    "jmp __x86_return_thunk", X86_FEATURE_RETHUNK
SYM_FUNC_END(foo)

SYM_FUNC_START(__x86_return_thunk)
	ALTERNATIVE("jmp __ret",
		    "call srso_safe_ret", X86_FEATURE_SRSO,
		    "call srso_alias_safe_ret", X86_FEATURE_SRSO_ALIAS);
	int3
SYM_FUNC_END(__x86_return_thunk)


So what was a RET jumps to __x86_return_thunk, which then jumps to the
actual return thunk.

After this patch things look equivalent to:

SYM_FUNC_START(foo)
	...
	ALTERNATIVE "ret; int3"
		    "jmp __x86_return_thunk", X86_FEATURE_RETHUNK
		    "jmp srso_return_thunk, X86_FEATURE_SRSO
		    "jmp srsi_alias_return_thunk", X86_FEATURE_SRSO_ALIAS
SYM_FUNC_END(foo)

SYM_CODE_START(srso_return_thunk)
	UNWIND_HINT_FUNC
	ANNOTATE_NOENDBR
	call srso_safe_ret;
	ud2
SYM_CODE_END(srso_return_thunk)

SYM_CODE_START(srso_alias_return_thunk)
	UNWIND_HINT_FUNC
	ANNOTATE_NOENDBR
	call srso_alias_safe_ret;
	ud2
SYM_CODE_END(srso_alias_return_thunk)


Except of course we don't have an actual ALTERNATIVE at the ret site,
but .return_sites and rewriting things to either "ret; int3" or whatever
function is in x86_return_thunk.


Before this patch, only one ret thunk is used at any one time; after
this patch still only one ret thunk is used.

fundamentally, you can only ever use one ret.

IOW this patch changes nothing for SRSO, it still does a jump to a call.
But it does clean up retbleed, which you had as a jump to a jump, back
to just a jump, and it does get rid of that extra alternative layer yo
had by using the one we already have at .return_sites rewrite.


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 04/17] objtool/x86: Fix SRSO mess
  2023-08-10 12:06   ` Borislav Petkov
@ 2023-08-10 12:48     ` Peter Zijlstra
  2023-08-10 12:50       ` Peter Zijlstra
  0 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-10 12:48 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

On Thu, Aug 10, 2023 at 02:06:15PM +0200, Borislav Petkov wrote:
> On Wed, Aug 09, 2023 at 09:12:22AM +0200, Peter Zijlstra wrote:
> > arch_is_rethunk() indicates those functions that will be considered
> > equivalent to INSN_RETURN and any callsite will be added to the
> > .return_sites section.
> > 
> > Notably: srso_untrain_ret, srso_safe_ret and __ret do not qualify.
> 
> They do not qualify *anymore*. Before your changes, the last two would
> call RET.

They were wrong before too. From the previous email (slightly
refined), you had:

SYM_FUNC_START(foo)
	...
	ALTERNATIVE "jmp __x86_return_thunk",
		    "ret; int3" ALT_NOT(X86_FEATURE_RETHUNK)
SYM_FUNC_END(foo)

SYM_FUNC_START(__x86_return_thunk)
	ALTERNATIVE "jmp __ret",
		    "call srso_safe_ret", X86_FEATURE_SRSO,
		    "call srso_alias_safe_ret", X86_FEATURE_SRSO_ALIAS
	int3
SYM_FUNC_END(__x86_return_thunk)


so we want foo's "jmp __x86_return_thunk" to end up in .return_sites.

But what you had would also add __x86_return_thunk's 'jmp __ret' to
.return_sites. And .altinstr_replacement's 'call srso_safe_ret' in
.return_sites if we're unlucky.

You do *NOT* want those in .return_sites.


objtool did two things:

 - generate .return_sites to rewrite all the compiler generated 'jmp
   __x86_return_thunk' sites to either 'ret; int3' or a jump to your
   function of choice (default __x86_return_thunk).

 - not get confused by the fact that __x86_return_thunk is in the middle
   of another instruction and zen_untrain_ret's validation fails to find
   the jump target because it doesn't know this instruction -- because
   it's inside another instruction.

Before all this they were both __x86_return_thunk, there was no
distinction needed.

But now there is, you still only want 'jmp __x86_return_thunk' to end up
in .return_sites. As per the above, you very much do *NOT* want any
other things on there.

But you added one more magic function inside another instruction
instance and had it identified as return_thunk, which is a fail.

Worse, you also added srso_untrain_ret, even though that really
shouldn't have been added since it doesn't suffer any of the above two
things.



^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 04/17] objtool/x86: Fix SRSO mess
  2023-08-10 12:48     ` Peter Zijlstra
@ 2023-08-10 12:50       ` Peter Zijlstra
  2023-08-10 15:02         ` Borislav Petkov
  0 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-10 12:50 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

On Thu, Aug 10, 2023 at 02:48:59PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 10, 2023 at 02:06:15PM +0200, Borislav Petkov wrote:
> > On Wed, Aug 09, 2023 at 09:12:22AM +0200, Peter Zijlstra wrote:
> > > arch_is_rethunk() indicates those functions that will be considered
> > > equivalent to INSN_RETURN and any callsite will be added to the
> > > .return_sites section.
> > > 
> > > Notably: srso_untrain_ret, srso_safe_ret and __ret do not qualify.
> > 
> > They do not qualify *anymore*. Before your changes, the last two would
> > call RET.

Notably 'call RET' is not what this is about. I hope the below
clarifies.

> They were wrong before too. From the previous email (slightly
> refined), you had:
> 
> SYM_FUNC_START(foo)
> 	...
> 	ALTERNATIVE "jmp __x86_return_thunk",
> 		    "ret; int3" ALT_NOT(X86_FEATURE_RETHUNK)
> SYM_FUNC_END(foo)
> 
> SYM_FUNC_START(__x86_return_thunk)
> 	ALTERNATIVE "jmp __ret",
> 		    "call srso_safe_ret", X86_FEATURE_SRSO,
> 		    "call srso_alias_safe_ret", X86_FEATURE_SRSO_ALIAS
> 	int3
> SYM_FUNC_END(__x86_return_thunk)
> 
> 
> so we want foo's "jmp __x86_return_thunk" to end up in .return_sites.
> 
> But what you had would also add __x86_return_thunk's 'jmp __ret' to
> .return_sites. And .altinstr_replacement's 'call srso_safe_ret' in
> .return_sites if we're unlucky.
> 
> You do *NOT* want those in .return_sites.
> 
> 
> objtool did two things:
> 
>  - generate .return_sites to rewrite all the compiler generated 'jmp
>    __x86_return_thunk' sites to either 'ret; int3' or a jump to your
>    function of choice (default __x86_return_thunk).
> 
>  - not get confused by the fact that __x86_return_thunk is in the middle
>    of another instruction and zen_untrain_ret's validation fails to find
>    the jump target because it doesn't know this instruction -- because
>    it's inside another instruction.
> 
> Before all this they were both __x86_return_thunk, there was no
> distinction needed.
> 
> But now there is, you still only want 'jmp __x86_return_thunk' to end up
> in .return_sites. As per the above, you very much do *NOT* want any
> other things on there.
> 
> But you added one more magic function inside another instruction
> instance and had it identified as a return_thunk, which is a fail.
> 
> Worse, you also added srso_untrain_ret, even though that really
> shouldn't have been added since it doesn't suffer any of the above two
> things.
> 
> 

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 02/17] x86/cpu: Clean up SRSO return thunk mess
  2023-08-10 12:37     ` Peter Zijlstra
@ 2023-08-10 12:56       ` Borislav Petkov
  2023-08-10 13:22         ` Peter Zijlstra
  2023-08-11  7:01       ` Peter Zijlstra
  1 sibling, 1 reply; 80+ messages in thread
From: Borislav Petkov @ 2023-08-10 12:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

On Thu, Aug 10, 2023 at 02:37:56PM +0200, Peter Zijlstra wrote:
> It does; so zen1/2 use the decoder thing to flush BTB entry of the RET,
> both retbleed and srso do.
> 
> Then zen3/4 use the aliasing trick to flush the BTB entry of the RET.

Yes, I was correcting your "instruction aliasing". It is "BTB aliasing"
by causing those bits in the VAs to XOR.

> Then both srso options use RSB/RAP stuffing to force a mispredict there.

They cause the RETs to mispredict - no stuffing. That's the add $8,
%rsp in the zen3/4 case which causes the RET to mispredict. We're not
doing a bunch of CALLs to stuff something.

> Retbleed doesn't do this.
> 
> retbleed is about BTB, srso does both BTB and RSB/RAP.

Yes.

> So this patch doesn't actually change anything except one layer of
> indirection.

I agree with everything from here on to the end. Provided we can do that
and there's no microarchitectural catch there, I'm all for removing
the __ret alternative.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 12/17] x86/cpu: Rename original retbleed return thunk
  2023-08-10 11:06       ` Andrew.Cooper3
@ 2023-08-10 13:02         ` Peter Zijlstra
  2023-08-13 15:23           ` Andrew.Cooper3
  0 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-10 13:02 UTC (permalink / raw)
  To: Andrew.Cooper3; +Cc: Josh Poimboeuf, x86, linux-kernel, David.Kaplan, gregkh

On Thu, Aug 10, 2023 at 12:06:17PM +0100, Andrew.Cooper3@citrix.com wrote:
> On 09/08/2023 3:22 pm, Peter Zijlstra wrote:
> > On Wed, Aug 09, 2023 at 10:20:31AM -0400, Josh Poimboeuf wrote:
> >> On Wed, Aug 09, 2023 at 09:12:30AM +0200, Peter Zijlstra wrote:
> >>> +++ b/tools/objtool/check.c
> >>> @@ -455,7 +455,12 @@ static int decode_instructions(struct ob
> >>>  				return -1;
> >>>  			}
> >>>  
> >>> -			if (func->return_thunk || !strcmp(func->name, "srso_safe_ret") || func->alias != func)
> >>> +			/*
> >>> +			 * Both zen_return_thunk() and srso_safe_ret() are embedded inside
> >>> +			 * another instruction and objtool doesn't grok that. Skip validating them.
> >>> +			 */
> >>> +			if (!strcmp(func->name, "zen_return_thunk") ||
> >>> +			    !strcmp(func->name, "srso_safe_ret") || func->alias != func)
> >> Hm, speaking of renaming they should probably be called
> >> retbleed_return_thunk() and srso_return_thunk().
> > Yes, clearly naming is better in daylight. Let me regex that.
> 
> btc_*, not retbleed_*.
> 
> That way it matches the terminology you'll find in the AMD whitepaper
> about what's going on, and there's already an entirely different issue
> called Retbleed.

So BTC as a whole is the fact that AMD predicts the type of an
instruction and then picks a predictor to predict the target of that
instruction, no?

The retbleed issue is that BTC is used to trick the thing into thinking
the RET is not a return but an indirect branch (type confusion) and then
pick the BTB predictor for a target, which got trained to point at
a target of choice.

The SRSO thing employs the same type confusion to train the RSB/RAP and
have RET, now correctly predicted to be a RET and using said RSB/RAP, to
jump to target of choice.

So in that regard, I don't think BTC is a good name for the thing.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 02/17] x86/cpu: Clean up SRSO return thunk mess
  2023-08-10 12:56       ` Borislav Petkov
@ 2023-08-10 13:22         ` Peter Zijlstra
  0 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-10 13:22 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

On Thu, Aug 10, 2023 at 02:56:31PM +0200, Borislav Petkov wrote:

> > Then both srso options use RSB/RAP stuffing to force a mispredict there.
> 
> They cause the RETs to mispredict - no stuffing. That's the add $8,
> %rsp in the zen3/4 case which causes the RET to mispredict. There's no
> doing a bunch of CALLs to stuff something.

This is what is called RSB stuffing; we've been doing it for ages on the
Intel side, and the code in nospec-branch.h has a number of variants of
this.

	CALL	srso_safe_ret	// push addr of UD2 into RSB -- aka 'stuff'
	UD2
srso_safe_ret:
	ADD $8, %RSP		// skip over the return to UD2
	RET			// pop RSB, speculate into UD2, miss like a beast


Now compare to __FILL_ONE_RETURN, which has the comment 'Stuff a single
RSB slot.' That expands to:

	call	772f
	int3
772:	add	$8, %rsp
	lfence

Which is the same sequence and causes the next RET to speculate into
that int3.


So RSB stuffing is sticking the addresses of traps into the RSB so that
subsequent predictions go into said traps instead of potentially user
controlled targets.


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 04/17] objtool/x86: Fix SRSO mess
  2023-08-10 12:50       ` Peter Zijlstra
@ 2023-08-10 15:02         ` Borislav Petkov
  2023-08-10 15:22           ` Peter Zijlstra
  0 siblings, 1 reply; 80+ messages in thread
From: Borislav Petkov @ 2023-08-10 15:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

On Thu, Aug 10, 2023 at 02:50:05PM +0200, Peter Zijlstra wrote:
> Notably 'call RET' is not what this is about. I hope the below
> clarifies.

Yah, it does, now that you've taken the time to explain it. Looking at
the commit which added it:

  15e67227c49a ("x86: Undo return-thunk damage")

it doesn't even begin to explain it. Can you please put the gist of what
you've explained here in a comment over patch_return() as it is not
really obvious - at least to me it isn't - what the rules for the thunks
are.

I mean, they're clear now but I think we should nail them down
explicitly. Especially with all the crazy patching we're doing now.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 04/17] objtool/x86: Fix SRSO mess
  2023-08-10 15:02         ` Borislav Petkov
@ 2023-08-10 15:22           ` Peter Zijlstra
  0 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-10 15:22 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

On Thu, Aug 10, 2023 at 05:02:01PM +0200, Borislav Petkov wrote:
> On Thu, Aug 10, 2023 at 02:50:05PM +0200, Peter Zijlstra wrote:
> > Notably 'call RET' is not what this is about. I hope the below
> > clarifies.
> 
> Yah, it does, now that you've taken the time to explain it. Looking at
> the commit which added it:
> 
>   15e67227c49a ("x86: Undo return-thunk damage")
> 
> it doesn't even begin to explain it. Can you please put the gist of what
> you've explained here in a comment over patch_return() as it is not
> really obvious - at least to me it isn't - what the rules for the thunks
> are.
> 
> I mean, they're clear now but I think we should nail them down
> explicitly. Especially with all the crazy patching we're doing now.

Will update.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-09  7:12 ` [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed= Peter Zijlstra
  2023-08-09 13:42   ` Josh Poimboeuf
@ 2023-08-10 15:44   ` Borislav Petkov
  2023-08-10 16:10     ` Josh Poimboeuf
  1 sibling, 1 reply; 80+ messages in thread
From: Borislav Petkov @ 2023-08-10 15:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

On Wed, Aug 09, 2023 at 09:12:24AM +0200, Peter Zijlstra wrote:
> Since it is now readily apparent that the two SRSO
> untrain_ret+return_thunk variants are exactly the same mechanism as
> the existing (retbleed) zen untrain_ret+return_thunk, add them to the
> existing retbleed options.

Except that Zen3/4 are not affected by retbleed, according to my current
state of information.

I don't like this lumping together of the issues even if their
mitigations are more or less similar.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-10 15:44   ` Borislav Petkov
@ 2023-08-10 16:10     ` Josh Poimboeuf
  2023-08-11 10:27       ` Borislav Petkov
  2023-08-12 11:24       ` Peter Zijlstra
  0 siblings, 2 replies; 80+ messages in thread
From: Josh Poimboeuf @ 2023-08-10 16:10 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Peter Zijlstra, x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Thu, Aug 10, 2023 at 05:44:04PM +0200, Borislav Petkov wrote:
> On Wed, Aug 09, 2023 at 09:12:24AM +0200, Peter Zijlstra wrote:
> > Since it is now readily apparent that the two SRSO
> > untrain_ret+return_thunk variants are exactly the same mechanism as
> > the existing (retbleed) zen untrain_ret+return_thunk, add them to the
> > existing retbleed options.
> 
> Except that Zen3/4 are not affected by retbleed, according to my current
> state of information.
> 
> I don't like this lumping together of the issues even if their
> mitigations are more or less similar.

I tend to agree that SRSO is a new issue and should have its own sysfs
and cmdline options (though a separate CONFIG option is overkill IMO).

The mitigations are unfortunately intertwined, but we've been in that
situation several times before (e.g., spectre_v2 + intel retbleed).

-- 
Josh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 02/17] x86/cpu: Clean up SRSO return thunk mess
  2023-08-10 12:37     ` Peter Zijlstra
  2023-08-10 12:56       ` Borislav Petkov
@ 2023-08-11  7:01       ` Peter Zijlstra
  2023-08-11 17:00         ` Nick Desaulniers
  1 sibling, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-11  7:01 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, matz, Nick Desaulniers, joao.moreira, samitolvanen

On Thu, Aug 10, 2023 at 02:37:56PM +0200, Peter Zijlstra wrote:

> After this patch things look equivalent to:
> 
> SYM_FUNC_START(foo)
> 	...
> 	ALTERNATIVE "ret; int3"
> 		    "jmp __x86_return_thunk", X86_FEATURE_RETHUNK
> 		    "jmp srso_return_thunk", X86_FEATURE_SRSO
> 		    "jmp srso_alias_return_thunk", X86_FEATURE_SRSO_ALIAS
> SYM_FUNC_END(foo)
> 
> SYM_CODE_START(srso_return_thunk)
> 	UNWIND_HINT_FUNC
> 	ANNOTATE_NOENDBR
> 	call srso_safe_ret;
> 	ud2
> SYM_CODE_END(srso_return_thunk)
> 
> SYM_CODE_START(srso_alias_return_thunk)
> 	UNWIND_HINT_FUNC
> 	ANNOTATE_NOENDBR
> 	call srso_alias_safe_ret;
> 	ud2
> SYM_CODE_END(srso_alias_return_thunk)
> 

So it looks like the compilers are still not emitting int3 after jmp,
even with the SLS options enabled :/

This means the tail end of functions compiled with:

  -mharden-sls=all -mfunction-return=thunk-extern

Is still a regular: jmp __x86_return_thunk, no trailing trap.

  https://godbolt.org/z/Ecqv76YbE
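
To make that concrete, a minimal sketch of the tail in question (assumed
shape, not copied from the godbolt output):

	// what -mharden-sls=all -mfunction-return=thunk-extern emits today:
	foo:
		...
		jmp	__x86_return_thunk	// no trailing trap

	// what it would have to emit for the rewrite below to be safe:
	foo:
		...
		jmp	__x86_return_thunk
		int3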

If we all could please finally fix that, then I can rewrite the above to
effectively be:

SYM_FUNC_START(foo)
	...
	ALTERNATIVE "ret; int3"
		    "jmp __x86_return_thunk", X86_FEATURE_RETHUNK
		    "call srso_safe_ret, X86_FEATURE_SRSO
		    "call srso_alias_safe_ret", X86_FEATURE_SRSO_ALIAS
	int3 //  <--- *MISSING*
SYM_FUNC_END(foo)

Bonus points if I can tell at compile time whether a compiler DTRT; a
feature flag or what have you in the preprocessor would be awesome.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-10 16:10     ` Josh Poimboeuf
@ 2023-08-11 10:27       ` Borislav Petkov
  2023-08-12 11:32         ` Peter Zijlstra
  2023-08-12 11:24       ` Peter Zijlstra
  1 sibling, 1 reply; 80+ messages in thread
From: Borislav Petkov @ 2023-08-11 10:27 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Thu, Aug 10, 2023 at 12:10:03PM -0400, Josh Poimboeuf wrote:
> I tend to agree that SRSO is a new issue and should have its own sysfs
> and cmdline options (though a separate CONFIG option is overkill IMO).

Yeah, there's a patch floating around adding a config option for every
mitigation. Apparently people want to build-time disable them all.

> The mitigations are unfortunately intertwined, but we've been in that
> situation several times before (e.g., spectre_v2 + intel retbleed).

Yap.

And if you recall, telling Intel Retbleed and AMD Retbleed apart was
already a PITA at the time, so adding yet another one behind that flag
would be madness.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 02/17] x86/cpu: Clean up SRSO return thunk mess
  2023-08-11  7:01       ` Peter Zijlstra
@ 2023-08-11 17:00         ` Nick Desaulniers
  2023-08-12 11:20           ` Peter Zijlstra
  0 siblings, 1 reply; 80+ messages in thread
From: Nick Desaulniers @ 2023-08-11 17:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Borislav Petkov, x86, linux-kernel, David.Kaplan, Andrew.Cooper3,
	jpoimboe, gregkh, matz, joao.moreira, samitolvanen

On Fri, Aug 11, 2023 at 12:01 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Aug 10, 2023 at 02:37:56PM +0200, Peter Zijlstra wrote:
>
> > After this patch things look equivalent to:
> >
> > SYM_FUNC_START(foo)
> >       ...
> >       ALTERNATIVE "ret; int3"
> >                   "jmp __x86_return_thunk", X86_FEATURE_RETHUNK
> >                   "jmp srso_return_thunk", X86_FEATURE_SRSO
> >                   "jmp srso_alias_return_thunk", X86_FEATURE_SRSO_ALIAS
> > SYM_FUNC_END(foo)
> >
> > SYM_CODE_START(srso_return_thunk)
> >       UNWIND_HINT_FUNC
> >       ANNOTATE_NOENDBR
> >       call srso_safe_ret;
> >       ud2
> > SYM_CODE_END(srso_return_thunk)
> >
> > SYM_CODE_START(srso_alias_return_thunk)
> >       UNWIND_HINT_FUNC
> >       ANNOTATE_NOENDBR
> >       call srso_alias_safe_ret;
> >       ud2
> > SYM_CODE_END(srso_alias_return_thunk)
> >
>
> So it looks like the compilers are still not emitting int3 after jmp,
> even with the SLS options enabled :/
>
> This means the tail end of functions compiled with:
>
>   -mharden-sls=all -mfunction-return=thunk-extern
>
> Is still a regular: jmp __x86_return_thunk, no trailing trap.
>
>   https://godbolt.org/z/Ecqv76YbE

I don't have time to finish this today, but
https://reviews.llvm.org/D157734 should do what you're looking for, I
think.

>
> If we all could please finally fix that, then I can rewrite the above to
> effectively be:
>
> SYM_FUNC_START(foo)
>         ...
>         ALTERNATIVE "ret; int3"
>                     "jmp __x86_return_thunk", X86_FEATURE_RETHUNK
>                     "call srso_safe_ret", X86_FEATURE_SRSO
>                     "call srso_alias_safe_ret", X86_FEATURE_SRSO_ALIAS
>         int3 //  <--- *MISSING*
> SYM_FUNC_END(foo)
>
> Bonus points if I can compile time tell if a compiler DTRT, feature flag
> or what have you in the preprocessor would be awesome.

Probably not a preprocessor token; in the past I have made that
suggestion and the old guard informed me "no, too many preprocessor
tokens to lex, no more!"  I still disagree but that is a viewpoint I
can sympathize with, slightly.

Probably version checks for now on the SLS config (or version checks
on a new kconfig CONFIG_IMPROVED_SLS)

-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 02/17] x86/cpu: Clean up SRSO return thunk mess
  2023-08-11 17:00         ` Nick Desaulniers
@ 2023-08-12 11:20           ` Peter Zijlstra
  0 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-12 11:20 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Borislav Petkov, x86, linux-kernel, David.Kaplan, Andrew.Cooper3,
	jpoimboe, gregkh, matz, joao.moreira, samitolvanen

On Fri, Aug 11, 2023 at 10:00:31AM -0700, Nick Desaulniers wrote:
> On Fri, Aug 11, 2023 at 12:01 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Aug 10, 2023 at 02:37:56PM +0200, Peter Zijlstra wrote:
> >
> > > After this patch things look equivalent to:
> > >
> > > SYM_FUNC_START(foo)
> > >       ...
> > >       ALTERNATIVE "ret; int3"
> > >                   "jmp __x86_return_thunk", X86_FEATURE_RETHUNK
> > >                   "jmp srso_return_thunk", X86_FEATURE_SRSO
> > >                   "jmp srso_alias_return_thunk", X86_FEATURE_SRSO_ALIAS
> > > SYM_FUNC_END(foo)
> > >
> > > SYM_CODE_START(srso_return_thunk)
> > >       UNWIND_HINT_FUNC
> > >       ANNOTATE_NOENDBR
> > >       call srso_safe_ret;
> > >       ud2
> > > SYM_CODE_END(srso_return_thunk)
> > >
> > > SYM_CODE_START(srso_alias_return_thunk)
> > >       UNWIND_HINT_FUNC
> > >       ANNOTATE_NOENDBR
> > >       call srso_alias_safe_ret;
> > >       ud2
> > > SYM_CODE_END(srso_alias_return_thunk)
> > >
> >
> > So it looks like the compilers are still not emitting int3 after jmp,
> > even with the SLS options enabled :/
> >
> > This means the tail end of functions compiled with:
> >
> >   -mharden-sls=all -mfunction-return=thunk-extern
> >
> > Is still a regular: jmp __x86_return_thunk, no trailing trap.
> >
> >   https://godbolt.org/z/Ecqv76YbE
> 
> I don't have time to finish this today, but
> https://reviews.llvm.org/D157734 should do what you're looking for, I
> think.

Hmm, so your wording seems to imply regular SLS would already emit INT3
after jump, but I'm not seeing that in clang-16 output. Should I upgrade
my llvm?

[[edit]] Oooh, now I see, regular SLS would emit RET; INT3, but what I
was alluding to was that sls=all should also emit INT3 after every JMP due
to AMD BTC. This is an SLS option that seems to have gone missing in
both compilers for a long while.


And yesterday I only quickly looked at bigger gcc output and not clang.
But when I look at clang-16 output I see things like:

1053:       2e e8 00 00 00 00       cs call 1059 <yield_to+0xe9>    1055: R_X86_64_PLT32    __x86_indirect_thunk_r11-0x4
1059:       84 c0                   test   %al,%al
105b:       74 1c                   je     1079 <yield_to+0x109>
105d:       eb 6e                   jmp    10cd <yield_to+0x15d>

No INT3

105f:       41 bc 01 00 00 00       mov    $0x1,%r12d
1065:       80 7c 24 04 00          cmpb   $0x0,0x4(%rsp)
106a:       74 0d                   je     1079 <yield_to+0x109>
106c:       4d 39 fe                cmp    %r15,%r14
106f:       74 08                   je     1079 <yield_to+0x109>
1071:       4c 89 ff                mov    %r15,%rdi
1074:       e8 00 00 00 00          call   1079 <yield_to+0x109>    1075: R_X86_64_PLT32    resched_curr-0x4
1079:       4d 39 fe                cmp    %r15,%r14
107c:       74 08                   je     1086 <yield_to+0x116>
107e:       4c 89 ff                mov    %r15,%rdi
1081:       e8 00 00 00 00          call   1086 <yield_to+0x116>    1082: R_X86_64_PLT32    _raw_spin_unlock-0x4
1086:       4c 89 f7                mov    %r14,%rdi
1089:       e8 00 00 00 00          call   108e <yield_to+0x11e>    108a: R_X86_64_PLT32    _raw_spin_unlock-0x4
108e:       f7 c3 00 02 00 00       test   $0x200,%ebx
1094:       74 06                   je     109c <yield_to+0x12c>
1096:       ff 15 00 00 00 00       call   *0x0(%rip)        # 109c <yield_to+0x12c>        1098: R_X86_64_PC32     pv_ops+0xfc
109c:       45 85 e4                test   %r12d,%r12d
109f:       7e 05                   jle    10a6 <yield_to+0x136>
10a1:       e8 00 00 00 00          call   10a6 <yield_to+0x136>    10a2: R_X86_64_PLT32    schedule-0x4
10a6:       44 89 e0                mov    %r12d,%eax
10a9:       48 83 c4 08             add    $0x8,%rsp
10ad:       5b                      pop    %rbx
10ae:       41 5c                   pop    %r12
10b0:       41 5d                   pop    %r13
10b2:       41 5e                   pop    %r14
10b4:       41 5f                   pop    %r15
10b6:       5d                      pop    %rbp
10b7:       2e e9 00 00 00 00       cs jmp 10bd <yield_to+0x14d>    10b9: R_X86_64_PLT32    __x86_return_thunk-0x4

CS padding!!

10bd:       41 bc fd ff ff ff       mov    $0xfffffffd,%r12d
10c3:       f7 c3 00 02 00 00       test   $0x200,%ebx


So since you (surprisingly!) CS pad the return thunk, I *could* pull it
off there; 6 bytes is enough space to write: 'CALL foo; INT3'.
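
As a byte-level sketch of that in-place patch (illustrative only, with
srso_safe_ret as the example target):

	2e e9 xx xx xx xx	cs jmp __x86_return_thunk	// the padded site

	// rewritten within the same 6 bytes:
	e8 xx xx xx xx		call srso_safe_ret		// 5-byte near CALL
	cc			int3				// 1-byte trap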

But really SLS *should* put INT3 after every JMP instruction -- of
course including the return thunk one.


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-10 16:10     ` Josh Poimboeuf
  2023-08-11 10:27       ` Borislav Petkov
@ 2023-08-12 11:24       ` Peter Zijlstra
  2023-08-12 12:10         ` Borislav Petkov
  1 sibling, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-12 11:24 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Borislav Petkov, x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Thu, Aug 10, 2023 at 12:10:03PM -0400, Josh Poimboeuf wrote:
> On Thu, Aug 10, 2023 at 05:44:04PM +0200, Borislav Petkov wrote:
> > On Wed, Aug 09, 2023 at 09:12:24AM +0200, Peter Zijlstra wrote:
> > > Since it is now readily apparent that the two SRSO
> > > untrain_ret+return_thunk variants are exactly the same mechanism as
> > > the existing (retbleed) zen untrain_ret+return_thunk, add them to the
> > > existing retbleed options.
> > 
> > Except that Zen3/4 are not affected by retbleed, according to my current
> > state of information.
> > 
> > I don't like this lumping together of the issues even if their
> > mitigations are more or less similar.
> 
> I tend to agree that SRSO is a new issue and should have its own sysfs
> and cmdline options (though a separate CONFIG option is overkill IMO).
> 
> The mitigations are unfortunately intertwined, but we've been in that
> situation several times before (e.g., spectre_v2 + intel retbleed).

That very experience makes me want to avoid doing it again :-/ But if
you all really want to keep the parameter, can we at least rename it to
something you can remember how to type, like 'srso=' instead of this
horrific 'spec_rstack_overflow=' thing?

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-11 10:27       ` Borislav Petkov
@ 2023-08-12 11:32         ` Peter Zijlstra
  2023-08-12 12:12           ` Borislav Petkov
  2023-08-14 15:45           ` David Laight
  0 siblings, 2 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-12 11:32 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Josh Poimboeuf, x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Fri, Aug 11, 2023 at 12:27:48PM +0200, Borislav Petkov wrote:
> On Thu, Aug 10, 2023 at 12:10:03PM -0400, Josh Poimboeuf wrote:
> > I tend to agree that SRSO is a new issue and should have its own sysfs
> > and cmdline options (though a separate CONFIG option is overkill IMO).
> 
> Yeah, there's a patch floating around adding a config option for every
> mitigation. Apparently people want to build-time disable them all.

So I really hate that Kconfig explosion, that's the very last thing we
need :-( More random options that can cause build time trouble.

I might see value in one knob that kills all speculation nonsense at
build time, but not one per mitigation, that's madness.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-12 11:24       ` Peter Zijlstra
@ 2023-08-12 12:10         ` Borislav Petkov
  2023-08-14 10:56           ` Peter Zijlstra
  0 siblings, 1 reply; 80+ messages in thread
From: Borislav Petkov @ 2023-08-12 12:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Sat, Aug 12, 2023 at 01:24:04PM +0200, Peter Zijlstra wrote:
> That very experience makes me want to avoid doing it again :-/ But if
> you all really want to keep the parameter, can we at least rename it to
> something you can remember how to type, like 'srso=' instead of this
> horrific 'spec_rstack_overflow=' thing?

I'm all for short'n'sweet but last time I did that, Linus said we should
have option names which aren't abbreviations which don't mean anything.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-12 11:32         ` Peter Zijlstra
@ 2023-08-12 12:12           ` Borislav Petkov
  2023-08-14 15:45           ` David Laight
  1 sibling, 0 replies; 80+ messages in thread
From: Borislav Petkov @ 2023-08-12 12:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Sat, Aug 12, 2023 at 01:32:56PM +0200, Peter Zijlstra wrote:
> So I really hate that Kconfig explosion, that's the very last thing we
> need :-( More random options that can cause build time trouble.
> 
> I might see value in one knob that kills all speculation nonsense at
> build time, but not one per mitigation, that's madness.

Yap, I hate them too. Linus wanted them last time and you added them.

And now we have a subset of mitigations with Kconfig options and
a subset without.

;-\

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 05/17] x86/cpu: Cleanup the untrain mess
  2023-08-09 13:26       ` Peter Zijlstra
@ 2023-08-12 18:30         ` Borislav Petkov
  0 siblings, 0 replies; 80+ messages in thread
From: Borislav Petkov @ 2023-08-12 18:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Wed, Aug 09, 2023 at 03:26:35PM +0200, Peter Zijlstra wrote:
> I think simply setting UNRET for SRSO at this point will be sufficient.
> That ensures the entry_untrain_ret thing gets called, and the
> alternative there DTRT.
> 
> The feature isn't used anywhere else afaict.
> 
> Then later, after the fancy alternatives happen, this can be cleaned up
> again.

Yes, this fixes it for >= Zen3:

---
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 4b0a770fbacb..611d048f6415 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2448,6 +2448,7 @@ static void __init srso_select_mitigation(void)
 			 * like ftrace, static_call, etc.
 			 */
 			setup_force_cpu_cap(X86_FEATURE_RETHUNK);
+			setup_force_cpu_cap(X86_FEATURE_UNRET);
 
 			if (boot_cpu_data.x86 == 0x19) {
 				setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 07/17] x86/cpu/kvm: Provide UNTRAIN_RET_VM
  2023-08-09  7:12 ` [RFC][PATCH 07/17] x86/cpu/kvm: Provide UNTRAIN_RET_VM Peter Zijlstra
  2023-08-09 13:50   ` Josh Poimboeuf
@ 2023-08-13 10:36   ` Borislav Petkov
  2023-08-14 10:35     ` Peter Zijlstra
  1 sibling, 1 reply; 80+ messages in thread
From: Borislav Petkov @ 2023-08-13 10:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

On Wed, Aug 09, 2023 at 09:12:25AM +0200, Peter Zijlstra wrote:
> With the difference being that UNTRAIN_RET_VM uses
> X86_FEATURE_IBPB_ON_VMEXIT instead of X86_FEATURE_ENTRY_IBPB.
> 
> This cures VMEXIT doing potentially unret+IBPB or double IBPB.

Can't - I have a separate flag for that and I set it only when !IBPB:

        case SRSO_CMD_IBPB_ON_VMEXIT:
                if (IS_ENABLED(CONFIG_CPU_SRSO)) {
                        if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
                                setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);


But I like the separate macro.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 12/17] x86/cpu: Rename original retbleed return thunk
  2023-08-10 13:02         ` Peter Zijlstra
@ 2023-08-13 15:23           ` Andrew.Cooper3
  2023-08-14 10:34             ` Peter Zijlstra
  0 siblings, 1 reply; 80+ messages in thread
From: Andrew.Cooper3 @ 2023-08-13 15:23 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Josh Poimboeuf, x86, linux-kernel, David.Kaplan, gregkh

On 10/08/2023 2:02 pm, Peter Zijlstra wrote:
> On Thu, Aug 10, 2023 at 12:06:17PM +0100, Andrew.Cooper3@citrix.com wrote:
>> On 09/08/2023 3:22 pm, Peter Zijlstra wrote:
>>> On Wed, Aug 09, 2023 at 10:20:31AM -0400, Josh Poimboeuf wrote:
>>>> On Wed, Aug 09, 2023 at 09:12:30AM +0200, Peter Zijlstra wrote:
>>>>> +++ b/tools/objtool/check.c
>>>>> @@ -455,7 +455,12 @@ static int decode_instructions(struct ob
>>>>>  				return -1;
>>>>>  			}
>>>>>  
>>>>> -			if (func->return_thunk || !strcmp(func->name, "srso_safe_ret") || func->alias != func)
>>>>> +			/*
>>>>> +			 * Both zen_return_thunk() and srso_safe_ret() are embedded inside
>>>>> +			 * another instruction and objtool doesn't grok that. Skip validating them.
>>>>> +			 */
>>>>> +			if (!strcmp(func->name, "zen_return_thunk") ||
>>>>> +			    !strcmp(func->name, "srso_safe_ret") || func->alias != func)
>>>> Hm, speaking of renaming they should probably be called
>>>> retbleed_return_thunk() and srso_return_thunk().
>>> Yes, clearly naming is better in daylight. Let me regex that.
>> btc_*, not retbleed_*.
>>
>> That way it matches the terminology you'll find in the AMD whitepaper
>> about what's going on, and there's already an entirely different issue
>> called Retbleed.
> So BTC as a whole is the fact that AMD predicts the type of an
> instruction and then picks a predictor to predict the target of that
> instruction, no?

No.

"Branch Type Confusion" is the technical name AMD gave last year's
issue.  Hence the name of the whitepaper about it,
https://www.amd.com/system/files/documents/technical-guidance-for-mitigating-branch-type-confusion.pdf

Each of these issues has been given a technical name by one of the
vendors.  Spectre-v1 was Bounds Check Bypass, Spectre-v2 was Branch
Target Injection, etc.  We can debate at some other point whether the
names are ideal or not, but the point is that these are the names that
cross-reference with the vendor documentation.

The naming scheme either needs to be BTC+SRSO (AMD's names), *or*
Retbleed+Inception (EthZ's names, remembering that Retbleed is still
ambiguous), but not a mix of the two.

~Andrew

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 12/17] x86/cpu: Rename original retbleed return thunk
  2023-08-13 15:23           ` Andrew.Cooper3
@ 2023-08-14 10:34             ` Peter Zijlstra
  2023-08-14 11:31               ` Andrew.Cooper3
  0 siblings, 1 reply; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-14 10:34 UTC (permalink / raw)
  To: Andrew.Cooper3; +Cc: Josh Poimboeuf, x86, linux-kernel, David.Kaplan, gregkh

On Sun, Aug 13, 2023 at 04:23:27PM +0100, Andrew.Cooper3@citrix.com wrote:
> On 10/08/2023 2:02 pm, Peter Zijlstra wrote:

> > So BTC as a whole is the fact that AMD predicts the type of an
> > instruction and then picks a predictor to predict the target of that
> > instruction, no?
> 
> No.
> 
> "Branch Type Confusion" is the technical name AMD gave last year's
> issue.  Hence the name of the whitepaper about it,
> https://www.amd.com/system/files/documents/technical-guidance-for-mitigating-branch-type-confusion.pdf

Bah, then what do we call the actual underlying issue that the AMD
branch predictor starts by predicting the next instruction type --
before it has been decoded -- meaning it can predict it wrong, which
then leads to a ton of other issues, including but not limited to:

 SLS through JMP (or pretty much anything else)
 RET from BTB

?

Calling *THAT* branch-type-confusion makes a heap more sense to me.


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 07/17] x86/cpu/kvm: Provide UNTRAIN_RET_VM
  2023-08-13 10:36   ` Borislav Petkov
@ 2023-08-14 10:35     ` Peter Zijlstra
  0 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-14 10:35 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

On Sun, Aug 13, 2023 at 12:36:57PM +0200, Borislav Petkov wrote:
> On Wed, Aug 09, 2023 at 09:12:25AM +0200, Peter Zijlstra wrote:
> > With the difference being that UNTRAIN_RET_VM uses
> > X86_FEATURE_IBPB_ON_VMEXIT instead of X86_FEATURE_ENTRY_IBPB.
> > 
> > This cures VMEXIT doing potentially unret+IBPB or double IBPB.
> 
> Can't - I have a separate flag for that and I set it only when !IBPB:
> 
>         case SRSO_CMD_IBPB_ON_VMEXIT:
>                 if (IS_ENABLED(CONFIG_CPU_SRSO)) {
>                         if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
>                                 setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
> 

Of course you can, just also set it with regular IBPB :-)
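
A minimal sketch of that, extending the bugs.c snippet quoted above
(illustrative only, not a tested patch):

        case SRSO_CMD_IBPB:
                if (has_microcode) {
                        setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);
                        /*
                         * Also set the VMEXIT flavour so UNTRAIN_RET_VM has
                         * a single flag to key off of and VMEXIT does exactly
                         * one IBPB, never unret+IBPB or a double IBPB.
                         */
                        setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
                }
                break;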

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-12 12:10         ` Borislav Petkov
@ 2023-08-14 10:56           ` Peter Zijlstra
  0 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-14 10:56 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Josh Poimboeuf, x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

On Sat, Aug 12, 2023 at 02:10:34PM +0200, Borislav Petkov wrote:
> On Sat, Aug 12, 2023 at 01:24:04PM +0200, Peter Zijlstra wrote:
> > That very experience makes me want to avoid doing it again :-/ But if
> > you all really want to keep the parameter, can we at least rename it to
> > something you can remember how to type, like 'srso=' instead of this
> > horrific 'spec_rstack_overflow=' thing?
> 
> I'm all for short'n'sweet but last time I did that, Linus said we should
> have option names which aren't abbreviations which don't mean anything.

So:

 1) do you guys really want to keep this extra argument?

 2) if so, can we *PLEASE* rename it, because the current naming *SUCKS*.


I really don't see the need for an extra feature, we can trivially fold
the whole thing into retbleed, that's already 2 issues, might as well
make it 3 :-)


If we're going to rename, how about we simply call it 'inception' then
we haz both 'retbleed=' and 'inception=' and we're consistent here. Then
I'll make a compromise and do:

 's/zen_\(untrain_ret\|return_thunk\)/btc_\1/g'

so that the actual mitigations have the official amd name on them --
however much I disagree with calling this branch-type-confusion.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 12/17] x86/cpu: Rename original retbleed return thunk
  2023-08-14 10:34             ` Peter Zijlstra
@ 2023-08-14 11:31               ` Andrew.Cooper3
  2023-08-14 12:06                 ` Peter Zijlstra
  0 siblings, 1 reply; 80+ messages in thread
From: Andrew.Cooper3 @ 2023-08-14 11:31 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Josh Poimboeuf, x86, linux-kernel, David.Kaplan, gregkh

On 14/08/2023 11:34 am, Peter Zijlstra wrote:
> On Sun, Aug 13, 2023 at 04:23:27PM +0100, Andrew.Cooper3@citrix.com wrote:
>> On 10/08/2023 2:02 pm, Peter Zijlstra wrote:
>>> So BTC as a whole is the fact that AMD predicts the type of an
>>> instruction and then picks a predictor to predict the target of that
>>> instruction, no?
>> No.
>>
>> "Branch Type Confusion" is the technical name AMD gave last year's
>> issue.  Hence the name of the whitepaper about it,
>> https://www.amd.com/system/files/documents/technical-guidance-for-mitigating-branch-type-confusion.pdf
> Bah, then what do we call the actual underlying issue that the AMD
> branch predictor starts by predicting the next instruction type --
> before it has been decoded -- meaning it can predict it wrong, which
> then leads to a ton of other issues, including but not limited to:
>
>  SLS through JMP (or pretty much anything else)
>  RET from BTB
>
> ?
>
> Calling *THAT* branch-type-confusion makes a heap more sense to me.

You know the branch predictor being ahead of decode is a property that
exists in both vendors' CPUs and has done for more than a decade
already?  Bad branch type predictions are relevant for a number of the
Intel issues too - they just don't talk as openly about it.

The thing that is missing from AMD Zen2-and-older CPUs is the early stall
in decode when the branch type prediction is discovered to be wrong.
Intel have this early feedback cycle, as do AMD Zen3 and later.

And yes - this is why SRSO is not an extension of BTC.  The
micro-architectural details are very different.

~Andrew

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [RFC][PATCH 12/17] x86/cpu: Rename original retbleed return thunk
  2023-08-14 11:31               ` Andrew.Cooper3
@ 2023-08-14 12:06                 ` Peter Zijlstra
  0 siblings, 0 replies; 80+ messages in thread
From: Peter Zijlstra @ 2023-08-14 12:06 UTC (permalink / raw)
  To: Andrew.Cooper3; +Cc: Josh Poimboeuf, x86, linux-kernel, David.Kaplan, gregkh

On Mon, Aug 14, 2023 at 12:31:04PM +0100, Andrew.Cooper3@citrix.com wrote:
> On 14/08/2023 11:34 am, Peter Zijlstra wrote:
> > On Sun, Aug 13, 2023 at 04:23:27PM +0100, Andrew.Cooper3@citrix.com wrote:
> >> On 10/08/2023 2:02 pm, Peter Zijlstra wrote:
> >>> So BTC as a whole is the fact that AMD predicts the type of an
> >>> instruction and then picks a predictor to predict the target of that
> >>> instruction, no?
> >> No.
> >>
> >> "Branch Type Confusion" is the technical name AMD gave last year's
> >> issue.  Hence the name of the whitepaper about it,
> >> https://www.amd.com/system/files/documents/technical-guidance-for-mitigating-branch-type-confusion.pdf
> > Bah, then what do we call the actual underlying issue that the AMD
> > branch predictor starts by predicting the next instruction type --
> > before it has been decoded -- meaning it can predict it wrong, which
> > then leads to a ton of other issues, including but not limited to:
> >
> >  SLS through JMP (or pretty much anything else)
> >  RET from BTB
> >
> > ?
> >
> > Calling *THAT* branch-type-confusion makes a heap more sense to me.
> 
> You know the branch predictor being ahead of decode is a property that
> exists in both vendors' CPUs and has done for more than a decade
> already?  Bad branch type predictions are relevant for a number of the
> Intel issues too - they just don't talk as openly about it.
> 
> The thing that is missing from AMD Zen2-and-older CPUs is the early stall
> in decode when the branch type prediction is discovered to be wrong.
> Intel have this early feedback cycle, as do AMD Zen3 and later.

This early stall keeps the actual type confusion from escaping
(mostly).

> And yes - this is why SRSO is not an extension of BTC.  The
> micro-architectural details are very different.

Yeah, it is more related to intel-retbleed, both exhaust the return
stack... /me runs :-)

^ permalink raw reply	[flat|nested] 80+ messages in thread

* RE: [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed=
  2023-08-12 11:32         ` Peter Zijlstra
  2023-08-12 12:12           ` Borislav Petkov
@ 2023-08-14 15:45           ` David Laight
  1 sibling, 0 replies; 80+ messages in thread
From: David Laight @ 2023-08-14 15:45 UTC (permalink / raw)
  To: 'Peter Zijlstra', Borislav Petkov
  Cc: Josh Poimboeuf, x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh

From: Peter Zijlstra
> Sent: 12 August 2023 12:33
> 
> On Fri, Aug 11, 2023 at 12:27:48PM +0200, Borislav Petkov wrote:
> > On Thu, Aug 10, 2023 at 12:10:03PM -0400, Josh Poimboeuf wrote:
> > > I tend to agree that SRSO is a new issue and should have its own sysfs
> > > and cmdline options (though a separate CONFIG option is overkill IMO).
> >
> > Yeah, there's a patch floating around adding a config option for every
> > mitigation. Apparently people want to build-time disable them all.
> 
> So I really hate that Kconfig explosion, that's the very last thing we
> need :-( More random options that can cause build time trouble.
> 
> I might see value in one knob that kills all speculation nonsense at
> build time, but not one per mitigation, that's madness.

Or a very limited number of options for things that are
pretty separate.
Maybe the call/indirect jump ones are separate enough from the
return ones?

And one big KNOB to disable them all (depends on !NO_MITIGATIONS ?).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 80+ messages in thread

end of thread, other threads:[~2023-08-14 15:46 UTC | newest]

Thread overview: 80+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-09  7:12 [RFC][PATCH 00/17] Fix up the recent SRSO patches Peter Zijlstra
2023-08-09  7:12 ` [RFC][PATCH 01/17] x86/alternative: Unconditional custom return thunk Peter Zijlstra
2023-08-09  9:31   ` Nikolay Borisov
2023-08-10 11:37   ` Borislav Petkov
2023-08-09  7:12 ` [RFC][PATCH 02/17] x86/cpu: Clean up SRSO return thunk mess Peter Zijlstra
2023-08-09 15:45   ` Nikolay Borisov
2023-08-10 11:51   ` Borislav Petkov
2023-08-10 12:37     ` Peter Zijlstra
2023-08-10 12:56       ` Borislav Petkov
2023-08-10 13:22         ` Peter Zijlstra
2023-08-11  7:01       ` Peter Zijlstra
2023-08-11 17:00         ` Nick Desaulniers
2023-08-12 11:20           ` Peter Zijlstra
2023-08-09  7:12 ` [RFC][PATCH 03/17] x86/cpu: Make srso_untrain_ret consistent Peter Zijlstra
2023-08-10 12:00   ` Borislav Petkov
2023-08-09  7:12 ` [RFC][PATCH 04/17] objtool/x86: Fix SRSO mess Peter Zijlstra
2023-08-10 12:06   ` Borislav Petkov
2023-08-10 12:48     ` Peter Zijlstra
2023-08-10 12:50       ` Peter Zijlstra
2023-08-10 15:02         ` Borislav Petkov
2023-08-10 15:22           ` Peter Zijlstra
2023-08-09  7:12 ` [RFC][PATCH 05/17] x86/cpu: Cleanup the untrain mess Peter Zijlstra
2023-08-09 12:51   ` Josh Poimboeuf
2023-08-09 13:12     ` Peter Zijlstra
2023-08-09 13:26       ` Peter Zijlstra
2023-08-12 18:30         ` Borislav Petkov
2023-08-09  7:12 ` [RFC][PATCH 06/17] x86/cpu: Add SRSO untrain to retbleed= Peter Zijlstra
2023-08-09 13:42   ` Josh Poimboeuf
2023-08-09 14:06     ` Peter Zijlstra
2023-08-09 14:28       ` Josh Poimboeuf
2023-08-09 15:08         ` Peter Zijlstra
2023-08-09 15:43           ` Josh Poimboeuf
2023-08-09 14:31     ` Andrew.Cooper3
2023-08-09 14:39       ` Josh Poimboeuf
2023-08-10 15:44   ` Borislav Petkov
2023-08-10 16:10     ` Josh Poimboeuf
2023-08-11 10:27       ` Borislav Petkov
2023-08-12 11:32         ` Peter Zijlstra
2023-08-12 12:12           ` Borislav Petkov
2023-08-14 15:45           ` David Laight
2023-08-12 11:24       ` Peter Zijlstra
2023-08-12 12:10         ` Borislav Petkov
2023-08-14 10:56           ` Peter Zijlstra
2023-08-09  7:12 ` [RFC][PATCH 07/17] x86/cpu/kvm: Provide UNTRAIN_RET_VM Peter Zijlstra
2023-08-09 13:50   ` Josh Poimboeuf
2023-08-09 14:06     ` Peter Zijlstra
2023-08-09 14:30       ` Josh Poimboeuf
2023-08-09 15:10         ` Peter Zijlstra
2023-08-13 10:36   ` Borislav Petkov
2023-08-14 10:35     ` Peter Zijlstra
2023-08-09  7:12 ` [RFC][PATCH 08/17] x86/cpu: Add IBPB on VMEXIT to retbleed= Peter Zijlstra
2023-08-09  7:12 ` [RFC][PATCH 09/17] x86: Remove CONFIG_CPU_SRSO Peter Zijlstra
2023-08-09 13:57   ` Josh Poimboeuf
2023-08-09  7:12 ` [RFC][PATCH 10/17] x86: Remove CPU_IBPB_ENTRY Peter Zijlstra
2023-08-09  7:12 ` [RFC][PATCH 11/17] x86/cpu: Remove all SRSO interface nonsense Peter Zijlstra
2023-08-09 13:10   ` Andrew.Cooper3
2023-08-09 13:36     ` Peter Zijlstra
2023-08-09 14:05   ` Josh Poimboeuf
2023-08-09 14:43     ` Peter Zijlstra
2023-08-09 14:51       ` Josh Poimboeuf
2023-08-09 15:34   ` Josh Poimboeuf
2023-08-09  7:12 ` [RFC][PATCH 12/17] x86/cpu: Rename original retbleed return thunk Peter Zijlstra
2023-08-09 14:20   ` Josh Poimboeuf
2023-08-09 14:22     ` Peter Zijlstra
2023-08-10 11:06       ` Andrew.Cooper3
2023-08-10 13:02         ` Peter Zijlstra
2023-08-13 15:23           ` Andrew.Cooper3
2023-08-14 10:34             ` Peter Zijlstra
2023-08-14 11:31               ` Andrew.Cooper3
2023-08-14 12:06                 ` Peter Zijlstra
2023-08-09  7:12 ` [RFC][PATCH 13/17] objtool/x86: Add arch_is_offset_insn() Peter Zijlstra
2023-08-09  9:56   ` Nikolay Borisov
2023-08-09 14:34   ` Josh Poimboeuf
2023-08-09  7:12 ` [RFC][PATCH 14/17] objtool: Add comments to the arch_is_$foo() magic symbols Peter Zijlstra
2023-08-09  7:12 ` [RFC][PATCH 15/17] x86/cpu: Rename srso_(.*)_alias to srso_alias_\1 Peter Zijlstra
2023-08-09  7:12 ` [RFC][PATCH 16/17] x86/alternatives: Simplify ALTERNATIVE_n() Peter Zijlstra
2023-08-09  7:12 ` [RFC][PATCH 17/17] x86/cpu: Use fancy alternatives to get rid of entry_untrain_ret() Peter Zijlstra
2023-08-09  9:04 ` [RFC][PATCH 00/17] Fix up the recent SRSO patches Nikolay Borisov
2023-08-09 10:04 ` Andrew.Cooper3
2023-08-09 11:58   ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).