* [PATCH v2 00/11] Fix up SRSO stuff
@ 2023-08-14 11:44 Peter Zijlstra
  2023-08-14 11:44 ` [PATCH v2 01/11] x86/cpu: Fixup __x86_return_thunk Peter Zijlstra
                   ` (11 more replies)
  0 siblings, 12 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-14 11:44 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

Hi!

Second version of the SRSO fixes/cleanup.

I've redone some, reordered most and left out the interface bits entirely
for now, although I do strongly feel the extra interface is superfluous
(and ugly).

This is based on top of current tip/x86/urgent 833fd800bf56.

The one open technical issue I have with the mitigation is the alignment of
the RET inside srso_safe_ret(). The details given for retbleed stated that
the RET should be on a 64-byte boundary, which is not the case here.
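
Purely as a sketch of how that concern could be made explicit at link time,
in the style of the existing cacheline asserts in vmlinux.lds.S -- note this
assumes a hypothetical inner label (here called srso_safe_ret_ret, which does
not exist in the series) placed directly at the RET:

  /* hypothetical addition to arch/x86/kernel/vmlinux.lds.S */
  . = ASSERT((srso_safe_ret_ret & 0x3f) == 0,
             "RET inside srso_safe_ret() not 64-byte aligned");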

I'll go prod at bringing the rest of the patches forward after I stare at some
other email.



* [PATCH v2 01/11] x86/cpu: Fixup __x86_return_thunk
  2023-08-14 11:44 [PATCH v2 00/11] Fix up SRSO stuff Peter Zijlstra
@ 2023-08-14 11:44 ` Peter Zijlstra
  2023-08-16  7:55   ` [tip: x86/urgent] x86/cpu: Fix __x86_return_thunk symbol type tip-bot2 for Peter Zijlstra
  2023-08-14 11:44 ` [PATCH v2 02/11] x86/cpu: Fix up srso_safe_ret() and __x86_return_thunk() Peter Zijlstra
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-14 11:44 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

Commit fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow
mitigation") reimplemented __x86_return_thunk with a mix of
SYM_FUNC_START and SYM_CODE_END; this is not a sane combination.

Since nothing should ever actually 'CALL' this, make it consistently
CODE.

Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/lib/retpoline.S |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -264,7 +264,9 @@ SYM_CODE_END(srso_safe_ret)
 SYM_FUNC_END(srso_untrain_ret)
 __EXPORT_THUNK(srso_untrain_ret)
 
-SYM_FUNC_START(__x86_return_thunk)
+SYM_CODE_START(__x86_return_thunk)
+	UNWIND_HINT_FUNC
+	ANNOTATE_NOENDBR
 	ALTERNATIVE_2 "jmp __ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
 			"call srso_safe_ret_alias", X86_FEATURE_SRSO_ALIAS
 	int3




* [PATCH v2 02/11] x86/cpu: Fix up srso_safe_ret() and __x86_return_thunk()
  2023-08-14 11:44 [PATCH v2 00/11] Fix up SRSO stuff Peter Zijlstra
  2023-08-14 11:44 ` [PATCH v2 01/11] x86/cpu: Fixup __x86_return_thunk Peter Zijlstra
@ 2023-08-14 11:44 ` Peter Zijlstra
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
  2023-08-14 11:44 ` [PATCH v2 03/11] objtool/x86: Fix SRSO mess Peter Zijlstra
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-14 11:44 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

vmlinux.o: warning: objtool: srso_untrain_ret() falls through to next function __x86_return_skl()
vmlinux.o: warning: objtool: __x86_return_thunk() falls through to next function __x86_return_skl()

This is because these functions (can) end with a CALL, which objtool
does not consider a terminating instruction. Therefore, replace the
trailing INT3 instruction (which is a non-fatal trap) with UD2 (which
is a fatal trap).

This indicates that execution will not continue past this point.
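
As a stand-alone toy model of the rule being relied on here (this is
emphatically not objtool's actual code or instruction taxonomy, just an
illustration of "a CALL does not terminate, a fatal trap does"):

  #include <stdbool.h>
  #include <stdio.h>

  enum toy_insn { TOY_CALL, TOY_UD2 };

  /* Only the fatal trap ends control flow for good. */
  static bool toy_may_fall_through(enum toy_insn t)
  {
  	return t != TOY_UD2;
  }

  int main(void)
  {
  	printf("call falls through: %d\n", toy_may_fall_through(TOY_CALL));
  	printf("ud2  falls through: %d\n", toy_may_fall_through(TOY_UD2));
  	return 0;
  }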

Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/lib/retpoline.S |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -259,7 +259,7 @@ SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLO
 	int3
 	lfence
 	call srso_safe_ret
-	int3
+	ud2
 SYM_CODE_END(srso_safe_ret)
 SYM_FUNC_END(srso_untrain_ret)
 __EXPORT_THUNK(srso_untrain_ret)
@@ -269,7 +269,7 @@ SYM_CODE_START(__x86_return_thunk)
 	ANNOTATE_NOENDBR
 	ALTERNATIVE_2 "jmp __ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
 			"call srso_safe_ret_alias", X86_FEATURE_SRSO_ALIAS
-	int3
+	ud2
 SYM_CODE_END(__x86_return_thunk)
 EXPORT_SYMBOL(__x86_return_thunk)
 




* [PATCH v2 03/11] objtool/x86: Fix SRSO mess
  2023-08-14 11:44 [PATCH v2 00/11] Fix up SRSO stuff Peter Zijlstra
  2023-08-14 11:44 ` [PATCH v2 01/11] x86/cpu: Fixup __x86_return_thunk Peter Zijlstra
  2023-08-14 11:44 ` [PATCH v2 02/11] x86/cpu: Fix up srso_safe_ret() and __x86_return_thunk() Peter Zijlstra
@ 2023-08-14 11:44 ` Peter Zijlstra
  2023-08-14 12:54   ` Andrew.Cooper3
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
  2023-08-14 11:44 ` [PATCH v2 04/11] x86/alternative: Make custom return thunk unconditional Peter Zijlstra
                   ` (8 subsequent siblings)
  11 siblings, 2 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-14 11:44 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

Objtool --rethunk does two things:

 - it collects all (tail) calls of __x86_return_thunk and places them
   into .return_sites. These are typically compiler generated, but the
   RET macro also emits the same.

 - it fudges the validation of the __x86_return_thunk symbol; because
   this symbol is inside another instruction, objtool can't actually find
   the instruction pointed to by the symbol offset and gets upset.

Because both things pertained to the same symbol, there was no
pressing need to treat them separately.

However, alas, along comes SRSO and we get more crazy things to deal
with.

The SRSO patch itself added the following symbol names to be identified
as rethunks:

  'srso_untrain_ret', 'srso_safe_ret' and '__ret'

Where '__ret' is the old retbleed return thunk, 'srso_safe_ret' is a
new, similarly embedded return thunk, and 'srso_untrain_ret' is
completely unrelated to anything the above does (and was only included
because of the INT3 vs UD2 issue fixed previously).

Clear things up by adding a second category for these
embedded-instruction symbols.

Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/arch/x86/decode.c      |   11 +++++++----
 tools/objtool/check.c                |   24 ++++++++++++++++++++++--
 tools/objtool/include/objtool/arch.h |    1 +
 tools/objtool/include/objtool/elf.h  |    1 +
 4 files changed, 31 insertions(+), 6 deletions(-)

--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -824,8 +824,11 @@ bool arch_is_retpoline(struct symbol *sy
 
 bool arch_is_rethunk(struct symbol *sym)
 {
-	return !strcmp(sym->name, "__x86_return_thunk") ||
-	       !strcmp(sym->name, "srso_untrain_ret") ||
-	       !strcmp(sym->name, "srso_safe_ret") ||
-	       !strcmp(sym->name, "__ret");
+	return !strcmp(sym->name, "__x86_return_thunk");
+}
+
+bool arch_is_embedded_insn(struct symbol *sym)
+{
+	return !strcmp(sym->name, "__ret") ||
+	       !strcmp(sym->name, "srso_safe_ret");
 }
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -455,7 +455,7 @@ static int decode_instructions(struct ob
 				return -1;
 			}
 
-			if (func->return_thunk || func->alias != func)
+			if (func->embedded_insn || func->alias != func)
 				continue;
 
 			if (!find_insn(file, sec, func->offset)) {
@@ -1288,16 +1288,33 @@ static int add_ignore_alternatives(struc
 	return 0;
 }
 
+/*
+ * Symbols that replace INSN_CALL_DYNAMIC, every (tail) call to such a symbol
+ * will be added to the .retpoline_sites section.
+ */
 __weak bool arch_is_retpoline(struct symbol *sym)
 {
 	return false;
 }
 
+/*
+ * Symbols that replace INSN_RETURN, every (tail) call to such a symbol
+ * will be added to the .return_sites section.
+ */
 __weak bool arch_is_rethunk(struct symbol *sym)
 {
 	return false;
 }
 
+/*
+ * Symbols that are embedded inside other instructions, because sometimes crazy
+ * code exists. These are mostly ignored for validation purposes.
+ */
+__weak bool arch_is_embedded_insn(struct symbol *sym)
+{
+	return false;
+}
+
 static struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn)
 {
 	struct reloc *reloc;
@@ -1583,7 +1600,7 @@ static int add_jump_destinations(struct
 			 * middle of another instruction.  Objtool only
 			 * knows about the outer instruction.
 			 */
-			if (sym && sym->return_thunk) {
+			if (sym && sym->embedded_insn) {
 				add_return_call(file, insn, false);
 				continue;
 			}
@@ -2502,6 +2519,9 @@ static int classify_symbols(struct objto
 		if (arch_is_rethunk(func))
 			func->return_thunk = true;
 
+		if (arch_is_embedded_insn(func))
+			func->embedded_insn = true;
+
 		if (arch_ftrace_match(func->name))
 			func->fentry = true;
 
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -90,6 +90,7 @@ int arch_decode_hint_reg(u8 sp_reg, int
 
 bool arch_is_retpoline(struct symbol *sym);
 bool arch_is_rethunk(struct symbol *sym);
+bool arch_is_embedded_insn(struct symbol *sym);
 
 int arch_rewrite_retpolines(struct objtool_file *file);
 
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -66,6 +66,7 @@ struct symbol {
 	u8 fentry            : 1;
 	u8 profiling_func    : 1;
 	u8 warned	     : 1;
+	u8 embedded_insn     : 1;
 	struct list_head pv_target;
 	struct reloc *relocs;
 };




* [PATCH v2 04/11] x86/alternative: Make custom return thunk unconditional
  2023-08-14 11:44 [PATCH v2 00/11] Fix up SRSO stuff Peter Zijlstra
                   ` (2 preceding siblings ...)
  2023-08-14 11:44 ` [PATCH v2 03/11] objtool/x86: Fix SRSO mess Peter Zijlstra
@ 2023-08-14 11:44 ` Peter Zijlstra
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
  2023-08-14 11:44 ` [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess Peter Zijlstra
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-14 11:44 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

There is infrastructure to rewrite return thunks to point to any
random thunk one desires; unwrap that from CALL_THUNKS, which up to
now was its sole user.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |    4 ----
 arch/x86/kernel/alternative.c        |    2 --
 2 files changed, 6 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -347,11 +347,7 @@ extern void srso_untrain_ret(void);
 extern void srso_untrain_ret_alias(void);
 extern void entry_ibpb(void);
 
-#ifdef CONFIG_CALL_THUNKS
 extern void (*x86_return_thunk)(void);
-#else
-#define x86_return_thunk	(&__x86_return_thunk)
-#endif
 
 #ifdef CONFIG_CALL_DEPTH_TRACKING
 extern void __x86_return_skl(void);
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -687,9 +687,7 @@ void __init_or_module noinline apply_ret
 
 #ifdef CONFIG_RETHUNK
 
-#ifdef CONFIG_CALL_THUNKS
 void (*x86_return_thunk)(void) __ro_after_init = &__x86_return_thunk;
-#endif
 
 /*
  * Rewrite the compiler generated return thunk tail-calls.




* [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess
  2023-08-14 11:44 [PATCH v2 00/11] Fix up SRSO stuff Peter Zijlstra
                   ` (3 preceding siblings ...)
  2023-08-14 11:44 ` [PATCH v2 04/11] x86/alternative: Make custom return thunk unconditional Peter Zijlstra
@ 2023-08-14 11:44 ` Peter Zijlstra
  2023-08-14 13:02   ` Borislav Petkov
                     ` (4 more replies)
  2023-08-14 11:44 ` [PATCH v2 06/11] x86/cpu: Rename original retbleed methods Peter Zijlstra
                   ` (6 subsequent siblings)
  11 siblings, 5 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-14 11:44 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

Use the existing configurable return thunk. There is absolutely no
justification for having created this __x86_return_thunk alternative.

To clarify, the whole thing looks like:

Zen3/4 does:

  srso_alias_untrain_ret:
	  nop2
	  lfence
	  jmp srso_alias_return_thunk
	  int3

  srso_alias_safe_ret: // aliases srso_alias_untrain_ret just so
	  add $8, %rsp
	  ret
	  int3

  srso_alias_return_thunk:
	  call srso_alias_safe_ret
	  ud2

While Zen1/2 does:

  srso_untrain_ret:
	  movabs $foo, %rax
	  lfence
	  call srso_safe_ret           (jmp srso_return_thunk ?)
	  int3

  srso_safe_ret: // embedded in movabs instruction
	  add $8,%rsp
          ret
          int3

  srso_return_thunk:
	  call srso_safe_ret
	  ud2

While retbleed does:

  zen_untrain_ret:
	  test $0xcc, %bl
	  lfence
	  jmp zen_return_thunk
          int3

  zen_return_thunk: // embedded in the test instruction
	  ret
          int3

Where Zen1/2 flush the BTB entry using the instruction decoder trick
(test, movabs), Zen3/4 use instruction aliasing. SRSO adds RSB (RAP in
AMD speak) stuffing to force speculation into a trap and cause a
mis-predict.
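
As an aside, the Zen3/4 aliasing depends on the untrain/safe pair differing
in exactly bits 2, 8, 14 and 20 of their virtual addresses (the linker
script asserts this); a tiny user-space sketch of that relation, using
made-up addresses:

  #include <stdint.h>
  #include <stdio.h>

  #define SRSO_ALIAS_BITS ((1UL << 2) | (1UL << 8) | (1UL << 14) | (1UL << 20))

  int main(void)
  {
  	uint64_t untrain = 0xffffffff82000000UL;	/* 2M aligned, made up */
  	uint64_t safe    = untrain | SRSO_ALIAS_BITS;

  	printf("pair aliases in the BTB slot: %s\n",
  	       (untrain ^ safe) == SRSO_ALIAS_BITS ? "yes" : "no");
  	return 0;
  }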

Pick one of the three options at boot (every function can only ever
return once).

Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |    6 ++++
 arch/x86/kernel/cpu/bugs.c           |    8 ++++--
 arch/x86/kernel/vmlinux.lds.S        |    2 -
 arch/x86/lib/retpoline.S             |   45 ++++++++++++++++++++++-------------
 tools/objtool/arch/x86/decode.c      |    2 -
 5 files changed, 43 insertions(+), 20 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -342,9 +342,15 @@ extern retpoline_thunk_t __x86_indirect_
 extern retpoline_thunk_t __x86_indirect_jump_thunk_array[];
 
 extern void __x86_return_thunk(void);
+
+extern void zen_return_thunk(void);
+extern void srso_return_thunk(void);
+extern void srso_alias_return_thunk(void);
+
 extern void zen_untrain_ret(void);
 extern void srso_untrain_ret(void);
 extern void srso_untrain_ret_alias(void);
+
 extern void entry_ibpb(void);
 
 extern void (*x86_return_thunk)(void);
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1034,6 +1034,7 @@ static void __init retbleed_select_mitig
 	case RETBLEED_MITIGATION_UNRET:
 		setup_force_cpu_cap(X86_FEATURE_RETHUNK);
 		setup_force_cpu_cap(X86_FEATURE_UNRET);
+		x86_return_thunk = zen_return_thunk;
 
 		if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
 		    boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
@@ -2451,10 +2452,13 @@ static void __init srso_select_mitigatio
 			 */
 			setup_force_cpu_cap(X86_FEATURE_RETHUNK);
 
-			if (boot_cpu_data.x86 == 0x19)
+			if (boot_cpu_data.x86 == 0x19) {
 				setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);
-			else
+				x86_return_thunk = srso_alias_return_thunk;
+			} else {
 				setup_force_cpu_cap(X86_FEATURE_SRSO);
+				x86_return_thunk = srso_return_thunk;
+			}
 			srso_mitigation = SRSO_MITIGATION_SAFE_RET;
 		} else {
 			pr_err("WARNING: kernel not compiled with CPU_SRSO.\n");
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -521,7 +521,7 @@ INIT_PER_CPU(irq_stack_backing_store);
 #endif
 
 #ifdef CONFIG_RETHUNK
-. = ASSERT((__ret & 0x3f) == 0, "__ret not cacheline-aligned");
+. = ASSERT((zen_return_thunk & 0x3f) == 0, "zen_return_thunk not cacheline-aligned");
 . = ASSERT((srso_safe_ret & 0x3f) == 0, "srso_safe_ret not cacheline-aligned");
 #endif
 
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -151,22 +151,20 @@ SYM_CODE_END(__x86_indirect_jump_thunk_a
 	.section .text..__x86.rethunk_untrain
 
 SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
 	ASM_NOP2
 	lfence
-	jmp __x86_return_thunk
+	jmp srso_alias_return_thunk
 SYM_FUNC_END(srso_untrain_ret_alias)
 __EXPORT_THUNK(srso_untrain_ret_alias)
 
 	.section .text..__x86.rethunk_safe
 #endif
 
-/* Needs a definition for the __x86_return_thunk alternative below. */
 SYM_START(srso_safe_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
-#ifdef CONFIG_CPU_SRSO
 	lea 8(%_ASM_SP), %_ASM_SP
 	UNWIND_HINT_FUNC
-#endif
 	ANNOTATE_UNRET_SAFE
 	ret
 	int3
@@ -174,9 +172,16 @@ SYM_FUNC_END(srso_safe_ret_alias)
 
 	.section .text..__x86.return_thunk
 
+SYM_CODE_START(srso_alias_return_thunk)
+	UNWIND_HINT_FUNC
+	ANNOTATE_NOENDBR
+	call srso_safe_ret_alias
+	ud2
+SYM_CODE_END(srso_alias_return_thunk)
+
 /*
  * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
- * 1) The RET at __x86_return_thunk must be on a 64 byte boundary, for
+ * 1) The RET at zen_return_thunk must be on a 64 byte boundary, for
  *    alignment within the BTB.
  * 2) The instruction at zen_untrain_ret must contain, and not
  *    end with, the 0xc3 byte of the RET.
@@ -184,7 +189,7 @@ SYM_FUNC_END(srso_safe_ret_alias)
  *    from re-poisioning the BTB prediction.
  */
 	.align 64
-	.skip 64 - (__ret - zen_untrain_ret), 0xcc
+	.skip 64 - (zen_return_thunk - zen_untrain_ret), 0xcc
 SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_NOENDBR
 	/*
@@ -192,16 +197,16 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL,
 	 *
 	 *   TEST $0xcc, %bl
 	 *   LFENCE
-	 *   JMP __x86_return_thunk
+	 *   JMP zen_return_thunk
 	 *
 	 * Executing the TEST instruction has a side effect of evicting any BTB
 	 * prediction (potentially attacker controlled) attached to the RET, as
-	 * __x86_return_thunk + 1 isn't an instruction boundary at the moment.
+	 * zen_return_thunk + 1 isn't an instruction boundary at the moment.
 	 */
 	.byte	0xf6
 
 	/*
-	 * As executed from __x86_return_thunk, this is a plain RET.
+	 * As executed from zen_return_thunk, this is a plain RET.
 	 *
 	 * As part of the TEST above, RET is the ModRM byte, and INT3 the imm8.
 	 *
@@ -213,13 +218,13 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL,
 	 * With SMT enabled and STIBP active, a sibling thread cannot poison
 	 * RET's prediction to a type of its choice, but can evict the
 	 * prediction due to competitive sharing. If the prediction is
-	 * evicted, __x86_return_thunk will suffer Straight Line Speculation
+	 * evicted, zen_return_thunk will suffer Straight Line Speculation
 	 * which will be contained safely by the INT3.
 	 */
-SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)
+SYM_INNER_LABEL(zen_return_thunk, SYM_L_GLOBAL)
 	ret
 	int3
-SYM_CODE_END(__ret)
+SYM_CODE_END(zen_return_thunk)
 
 	/*
 	 * Ensure the TEST decoding / BTB invalidation is complete.
@@ -230,7 +235,7 @@ SYM_CODE_END(__ret)
 	 * Jump back and execute the RET in the middle of the TEST instruction.
 	 * INT3 is for SLS protection.
 	 */
-	jmp __ret
+	jmp zen_return_thunk
 	int3
 SYM_FUNC_END(zen_untrain_ret)
 __EXPORT_THUNK(zen_untrain_ret)
@@ -256,6 +261,7 @@ SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLO
 	ret
 	int3
 	int3
+	/* end of movabs */
 	lfence
 	call srso_safe_ret
 	ud2
@@ -263,12 +269,19 @@ SYM_CODE_END(srso_safe_ret)
 SYM_FUNC_END(srso_untrain_ret)
 __EXPORT_THUNK(srso_untrain_ret)
 
-SYM_CODE_START(__x86_return_thunk)
+SYM_CODE_START(srso_return_thunk)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
-	ALTERNATIVE_2 "jmp __ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
-			"call srso_safe_ret_alias", X86_FEATURE_SRSO_ALIAS
+	call srso_safe_ret
 	ud2
+SYM_CODE_END(srso_return_thunk)
+
+SYM_CODE_START(__x86_return_thunk)
+	UNWIND_HINT_FUNC
+	ANNOTATE_NOENDBR
+	ANNOTATE_UNRET_SAFE
+	ret
+	int3
 SYM_CODE_END(__x86_return_thunk)
 EXPORT_SYMBOL(__x86_return_thunk)
 
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -829,6 +829,6 @@ bool arch_is_rethunk(struct symbol *sym)
 
 bool arch_is_embedded_insn(struct symbol *sym)
 {
-	return !strcmp(sym->name, "__ret") ||
+	return !strcmp(sym->name, "zen_return_thunk") ||
 	       !strcmp(sym->name, "srso_safe_ret");
 }




* [PATCH v2 06/11] x86/cpu: Rename original retbleed methods
  2023-08-14 11:44 [PATCH v2 00/11] Fix up SRSO stuff Peter Zijlstra
                   ` (4 preceding siblings ...)
  2023-08-14 11:44 ` [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess Peter Zijlstra
@ 2023-08-14 11:44 ` Peter Zijlstra
  2023-08-14 19:41   ` Josh Poimboeuf
                     ` (2 more replies)
  2023-08-14 11:44 ` [PATCH v2 07/11] x86/cpu: Rename srso_(.*)_alias to srso_alias_\1 Peter Zijlstra
                   ` (5 subsequent siblings)
  11 siblings, 3 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-14 11:44 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

Rename the original retbleed return thunk and untrain_ret to
retbleed_return_thunk and retbleed_untrain_ret.

Andrew wants to call this btc_*; do we have a poll?

Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |    8 ++++----
 arch/x86/kernel/cpu/bugs.c           |    2 +-
 arch/x86/kernel/vmlinux.lds.S        |    2 +-
 arch/x86/lib/retpoline.S             |   30 +++++++++++++++---------------
 tools/objtool/arch/x86/decode.c      |    2 +-
 tools/objtool/check.c                |    2 +-
 6 files changed, 23 insertions(+), 23 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -272,7 +272,7 @@
 .endm
 
 #ifdef CONFIG_CPU_UNRET_ENTRY
-#define CALL_ZEN_UNTRAIN_RET	"call zen_untrain_ret"
+#define CALL_ZEN_UNTRAIN_RET	"call retbleed_untrain_ret"
 #else
 #define CALL_ZEN_UNTRAIN_RET	""
 #endif
@@ -282,7 +282,7 @@
  * return thunk isn't mapped into the userspace tables (then again, AMD
  * typically has NO_MELTDOWN).
  *
- * While zen_untrain_ret() doesn't clobber anything but requires stack,
+ * While retbleed_untrain_ret() doesn't clobber anything but requires stack,
  * entry_ibpb() will clobber AX, CX, DX.
  *
  * As such, this must be placed after every *SWITCH_TO_KERNEL_CR3 at a point
@@ -343,11 +343,11 @@ extern retpoline_thunk_t __x86_indirect_
 
 extern void __x86_return_thunk(void);
 
-extern void zen_return_thunk(void);
+extern void retbleed_return_thunk(void);
 extern void srso_return_thunk(void);
 extern void srso_alias_return_thunk(void);
 
-extern void zen_untrain_ret(void);
+extern void retbleed_untrain_ret(void);
 extern void srso_untrain_ret(void);
 extern void srso_untrain_ret_alias(void);
 
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1034,7 +1034,7 @@ static void __init retbleed_select_mitig
 	case RETBLEED_MITIGATION_UNRET:
 		setup_force_cpu_cap(X86_FEATURE_RETHUNK);
 		setup_force_cpu_cap(X86_FEATURE_UNRET);
-		x86_return_thunk = zen_return_thunk;
+		x86_return_thunk = retbleed_return_thunk;
 
 		if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
 		    boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -521,7 +521,7 @@ INIT_PER_CPU(irq_stack_backing_store);
 #endif
 
 #ifdef CONFIG_RETHUNK
-. = ASSERT((zen_return_thunk & 0x3f) == 0, "zen_return_thunk not cacheline-aligned");
+. = ASSERT((retbleed_return_thunk & 0x3f) == 0, "retbleed_return_thunk not cacheline-aligned");
 . = ASSERT((srso_safe_ret & 0x3f) == 0, "srso_safe_ret not cacheline-aligned");
 #endif
 
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -181,32 +181,32 @@ SYM_CODE_END(srso_alias_return_thunk)
 
 /*
  * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
- * 1) The RET at zen_return_thunk must be on a 64 byte boundary, for
+ * 1) The RET at retbleed_return_thunk must be on a 64 byte boundary, for
  *    alignment within the BTB.
- * 2) The instruction at zen_untrain_ret must contain, and not
+ * 2) The instruction at retbleed_untrain_ret must contain, and not
  *    end with, the 0xc3 byte of the RET.
  * 3) STIBP must be enabled, or SMT disabled, to prevent the sibling thread
  *    from re-poisioning the BTB prediction.
  */
 	.align 64
-	.skip 64 - (zen_return_thunk - zen_untrain_ret), 0xcc
-SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
+	.skip 64 - (retbleed_return_thunk - retbleed_untrain_ret), 0xcc
+SYM_START(retbleed_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_NOENDBR
 	/*
-	 * As executed from zen_untrain_ret, this is:
+	 * As executed from retbleed_untrain_ret, this is:
 	 *
 	 *   TEST $0xcc, %bl
 	 *   LFENCE
-	 *   JMP zen_return_thunk
+	 *   JMP retbleed_return_thunk
 	 *
 	 * Executing the TEST instruction has a side effect of evicting any BTB
 	 * prediction (potentially attacker controlled) attached to the RET, as
-	 * zen_return_thunk + 1 isn't an instruction boundary at the moment.
+	 * retbleed_return_thunk + 1 isn't an instruction boundary at the moment.
 	 */
 	.byte	0xf6
 
 	/*
-	 * As executed from zen_return_thunk, this is a plain RET.
+	 * As executed from retbleed_return_thunk, this is a plain RET.
 	 *
 	 * As part of the TEST above, RET is the ModRM byte, and INT3 the imm8.
 	 *
@@ -218,13 +218,13 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL,
 	 * With SMT enabled and STIBP active, a sibling thread cannot poison
 	 * RET's prediction to a type of its choice, but can evict the
 	 * prediction due to competitive sharing. If the prediction is
-	 * evicted, zen_return_thunk will suffer Straight Line Speculation
+	 * evicted, retbleed_return_thunk will suffer Straight Line Speculation
 	 * which will be contained safely by the INT3.
 	 */
-SYM_INNER_LABEL(zen_return_thunk, SYM_L_GLOBAL)
+SYM_INNER_LABEL(retbleed_return_thunk, SYM_L_GLOBAL)
 	ret
 	int3
-SYM_CODE_END(zen_return_thunk)
+SYM_CODE_END(retbleed_return_thunk)
 
 	/*
 	 * Ensure the TEST decoding / BTB invalidation is complete.
@@ -235,13 +235,13 @@ SYM_CODE_END(zen_return_thunk)
 	 * Jump back and execute the RET in the middle of the TEST instruction.
 	 * INT3 is for SLS protection.
 	 */
-	jmp zen_return_thunk
+	jmp retbleed_return_thunk
 	int3
-SYM_FUNC_END(zen_untrain_ret)
-__EXPORT_THUNK(zen_untrain_ret)
+SYM_FUNC_END(retbleed_untrain_ret)
+__EXPORT_THUNK(retbleed_untrain_ret)
 
 /*
- * SRSO untraining sequence for Zen1/2, similar to zen_untrain_ret()
+ * SRSO untraining sequence for Zen1/2, similar to retbleed_untrain_ret()
  * above. On kernel entry, srso_untrain_ret() is executed which is a
  *
  * movabs $0xccccc30824648d48,%rax
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -829,6 +829,6 @@ bool arch_is_rethunk(struct symbol *sym)
 
 bool arch_is_embedded_insn(struct symbol *sym)
 {
-	return !strcmp(sym->name, "zen_return_thunk") ||
+	return !strcmp(sym->name, "retbleed_return_thunk") ||
 	       !strcmp(sym->name, "srso_safe_ret");
 }
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1593,7 +1593,7 @@ static int add_jump_destinations(struct
 			struct symbol *sym = find_symbol_by_offset(dest_sec, dest_off);
 
 			/*
-			 * This is a special case for zen_untrain_ret().
+			 * This is a special case for retbleed_untrain_ret().
 			 * It jumps to __x86_return_thunk(), but objtool
 			 * can't find the thunk's starting RET
 			 * instruction, because the RET is also in the




* [PATCH v2 07/11] x86/cpu: Rename srso_(.*)_alias to srso_alias_\1
  2023-08-14 11:44 [PATCH v2 00/11] Fix up SRSO stuff Peter Zijlstra
                   ` (5 preceding siblings ...)
  2023-08-14 11:44 ` [PATCH v2 06/11] x86/cpu: Rename original retbleed methods Peter Zijlstra
@ 2023-08-14 11:44 ` Peter Zijlstra
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
  2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
  2023-08-14 11:44 ` [PATCH v2 08/11] x86/cpu: Cleanup the untrain mess Peter Zijlstra
                   ` (4 subsequent siblings)
  11 siblings, 2 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-14 11:44 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

For a more consistent namespace.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |    6 +++---
 arch/x86/kernel/vmlinux.lds.S        |    8 ++++----
 arch/x86/lib/retpoline.S             |   22 +++++++++++-----------
 3 files changed, 18 insertions(+), 18 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -300,7 +300,7 @@
 
 #ifdef CONFIG_CPU_SRSO
 	ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
-			  "call srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
+			  "call srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
 #endif
 .endm
 
@@ -316,7 +316,7 @@
 
 #ifdef CONFIG_CPU_SRSO
 	ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
-			  "call srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
+			  "call srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
 #endif
 .endm
 
@@ -349,7 +349,7 @@ extern void srso_alias_return_thunk(void
 
 extern void retbleed_untrain_ret(void);
 extern void srso_untrain_ret(void);
-extern void srso_untrain_ret_alias(void);
+extern void srso_alias_untrain_ret(void);
 
 extern void entry_ibpb(void);
 
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -147,10 +147,10 @@ SECTIONS
 
 #ifdef CONFIG_CPU_SRSO
 		/*
-		 * See the comment above srso_untrain_ret_alias()'s
+		 * See the comment above srso_alias_untrain_ret()'s
 		 * definition.
 		 */
-		. = srso_untrain_ret_alias | (1 << 2) | (1 << 8) | (1 << 14) | (1 << 20);
+		. = srso_alias_untrain_ret | (1 << 2) | (1 << 8) | (1 << 14) | (1 << 20);
 		*(.text..__x86.rethunk_safe)
 #endif
 		ALIGN_ENTRY_TEXT_END
@@ -536,8 +536,8 @@ INIT_PER_CPU(irq_stack_backing_store);
  * Instead do: (A | B) - (A & B) in order to compute the XOR
  * of the two function addresses:
  */
-. = ASSERT(((ABSOLUTE(srso_untrain_ret_alias) | srso_safe_ret_alias) -
-		(ABSOLUTE(srso_untrain_ret_alias) & srso_safe_ret_alias)) == ((1 << 2) | (1 << 8) | (1 << 14) | (1 << 20)),
+. = ASSERT(((ABSOLUTE(srso_alias_untrain_ret) | srso_alias_safe_ret) -
+		(ABSOLUTE(srso_alias_untrain_ret) & srso_alias_safe_ret)) == ((1 << 2) | (1 << 8) | (1 << 14) | (1 << 20)),
 		"SRSO function pair won't alias");
 #endif
 
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -133,49 +133,49 @@ SYM_CODE_END(__x86_indirect_jump_thunk_a
 #ifdef CONFIG_RETHUNK
 
 /*
- * srso_untrain_ret_alias() and srso_safe_ret_alias() are placed at
+ * srso_alias_untrain_ret() and srso_alias_safe_ret() are placed at
  * special addresses:
  *
- * - srso_untrain_ret_alias() is 2M aligned
- * - srso_safe_ret_alias() is also in the same 2M page but bits 2, 8, 14
+ * - srso_alias_untrain_ret() is 2M aligned
+ * - srso_alias_safe_ret() is also in the same 2M page but bits 2, 8, 14
  * and 20 in its virtual address are set (while those bits in the
- * srso_untrain_ret_alias() function are cleared).
+ * srso_alias_untrain_ret() function are cleared).
  *
  * This guarantees that those two addresses will alias in the branch
  * target buffer of Zen3/4 generations, leading to any potential
  * poisoned entries at that BTB slot to get evicted.
  *
- * As a result, srso_safe_ret_alias() becomes a safe return.
+ * As a result, srso_alias_safe_ret() becomes a safe return.
  */
 #ifdef CONFIG_CPU_SRSO
 	.section .text..__x86.rethunk_untrain
 
-SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+SYM_START(srso_alias_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
 	ASM_NOP2
 	lfence
 	jmp srso_alias_return_thunk
-SYM_FUNC_END(srso_untrain_ret_alias)
-__EXPORT_THUNK(srso_untrain_ret_alias)
+SYM_FUNC_END(srso_alias_untrain_ret)
+__EXPORT_THUNK(srso_alias_untrain_ret)
 
 	.section .text..__x86.rethunk_safe
 #endif
 
-SYM_START(srso_safe_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+SYM_START(srso_alias_safe_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	lea 8(%_ASM_SP), %_ASM_SP
 	UNWIND_HINT_FUNC
 	ANNOTATE_UNRET_SAFE
 	ret
 	int3
-SYM_FUNC_END(srso_safe_ret_alias)
+SYM_FUNC_END(srso_alias_safe_ret)
 
 	.section .text..__x86.return_thunk
 
 SYM_CODE_START(srso_alias_return_thunk)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
-	call srso_safe_ret_alias
+	call srso_alias_safe_ret
 	ud2
 SYM_CODE_END(srso_alias_return_thunk)
 




* [PATCH v2 08/11] x86/cpu: Cleanup the untrain mess
  2023-08-14 11:44 [PATCH v2 00/11] Fix up SRSO stuff Peter Zijlstra
                   ` (6 preceding siblings ...)
  2023-08-14 11:44 ` [PATCH v2 07/11] x86/cpu: Rename srso_(.*)_alias to srso_alias_\1 Peter Zijlstra
@ 2023-08-14 11:44 ` Peter Zijlstra
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
  2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
  2023-08-14 11:44 ` [PATCH v2 09/11] x86/cpu/kvm: Provide UNTRAIN_RET_VM Peter Zijlstra
                   ` (3 subsequent siblings)
  11 siblings, 2 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-14 11:44 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

Since there can only be one active return_thunk, there only needs to
be one (matching) untrain_ret. It fundamentally doesn't make sense to
allow multiple untrain_ret variants at the same time.

Fold all three different untrain methods into a single (temporary)
helper stub.

Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |   19 +++++--------------
 arch/x86/kernel/cpu/bugs.c           |    1 +
 arch/x86/lib/retpoline.S             |    7 +++++++
 3 files changed, 13 insertions(+), 14 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -272,9 +272,9 @@
 .endm
 
 #ifdef CONFIG_CPU_UNRET_ENTRY
-#define CALL_ZEN_UNTRAIN_RET	"call retbleed_untrain_ret"
+#define CALL_UNTRAIN_RET	"call entry_untrain_ret"
 #else
-#define CALL_ZEN_UNTRAIN_RET	""
+#define CALL_UNTRAIN_RET	""
 #endif
 
 /*
@@ -293,15 +293,10 @@
 	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
 	VALIDATE_UNRET_END
 	ALTERNATIVE_3 "",						\
-		      CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET,		\
+		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
 		      "call entry_ibpb", X86_FEATURE_ENTRY_IBPB,	\
 		      __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
 #endif
-
-#ifdef CONFIG_CPU_SRSO
-	ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
-			  "call srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
-#endif
 .endm
 
 .macro UNTRAIN_RET_FROM_CALL
@@ -309,15 +304,10 @@
 	defined(CONFIG_CALL_DEPTH_TRACKING)
 	VALIDATE_UNRET_END
 	ALTERNATIVE_3 "",						\
-		      CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET,		\
+		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
 		      "call entry_ibpb", X86_FEATURE_ENTRY_IBPB,	\
 		      __stringify(RESET_CALL_DEPTH_FROM_CALL), X86_FEATURE_CALL_DEPTH
 #endif
-
-#ifdef CONFIG_CPU_SRSO
-	ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
-			  "call srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
-#endif
 .endm
 
 
@@ -351,6 +341,7 @@ extern void retbleed_untrain_ret(void);
 extern void srso_untrain_ret(void);
 extern void srso_alias_untrain_ret(void);
 
+extern void entry_untrain_ret(void);
 extern void entry_ibpb(void);
 
 extern void (*x86_return_thunk)(void);
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2449,6 +2449,7 @@ static void __init srso_select_mitigatio
 			 * like ftrace, static_call, etc.
 			 */
 			setup_force_cpu_cap(X86_FEATURE_RETHUNK);
+			setup_force_cpu_cap(X86_FEATURE_UNRET);
 
 			if (boot_cpu_data.x86 == 0x19) {
 				setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -277,6 +277,13 @@ SYM_CODE_START(srso_return_thunk)
 	ud2
 SYM_CODE_END(srso_return_thunk)
 
+SYM_FUNC_START(entry_untrain_ret)
+	ALTERNATIVE_2 "jmp retbleed_untrain_ret", \
+		      "jmp srso_untrain_ret", X86_FEATURE_SRSO, \
+		      "jmp srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
+SYM_FUNC_END(entry_untrain_ret)
+__EXPORT_THUNK(entry_untrain_ret)
+
 SYM_CODE_START(__x86_return_thunk)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR




* [PATCH v2 09/11] x86/cpu/kvm: Provide UNTRAIN_RET_VM
  2023-08-14 11:44 [PATCH v2 00/11] Fix up SRSO stuff Peter Zijlstra
                   ` (7 preceding siblings ...)
  2023-08-14 11:44 ` [PATCH v2 08/11] x86/cpu: Cleanup the untrain mess Peter Zijlstra
@ 2023-08-14 11:44 ` Peter Zijlstra
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
  2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
  2023-08-14 11:44 ` [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n() Peter Zijlstra
                   ` (2 subsequent siblings)
  11 siblings, 2 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-14 11:44 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

Similar to how it doesn't make sense for UNTRAIN_RET to have two
untrain calls, it also doesn't make sense for VMEXIT to have an extra
IBPB call.

This cures VMEXIT potentially doing unret+IBPB or a double IBPB.
Also, the (SEV) VMEXIT case seems to have been overlooked.

Redefine the meaning of the synthetic IBPB flags to:

 - ENTRY_IBPB     -- issue IBPB on entry  (was: entry + VMEXIT)
 - IBPB_ON_VMEXIT -- issue IBPB on VMEXIT

And have 'retbleed=ibpb' set *BOTH* feature flags to ensure it retains
the previous behaviour and issues IBPB on entry+VMEXIT.

The new 'srso=ibpb_vmexit' option only sets IBPB_ON_VMEXIT.

Create UNTRAIN_RET_VM specifically for the VMEXIT case, and have that
check IBPB_ON_VMEXIT.

All this avoids the VMEXIT case having to check both ENTRY_IBPB and
IBPB_ON_VMEXIT, and it simplifies the alternatives.

Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |   11 +++++++++++
 arch/x86/kernel/cpu/bugs.c           |    1 +
 arch/x86/kvm/svm/vmenter.S           |    7 ++-----
 3 files changed, 14 insertions(+), 5 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -299,6 +299,17 @@
 #endif
 .endm
 
+.macro UNTRAIN_RET_VM
+#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
+	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
+	VALIDATE_UNRET_END
+	ALTERNATIVE_3 "",						\
+		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
+		      "call entry_ibpb", X86_FEATURE_IBPB_ON_VMEXIT,	\
+		      __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
+#endif
+.endm
+
 .macro UNTRAIN_RET_FROM_CALL
 #if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
 	defined(CONFIG_CALL_DEPTH_TRACKING)
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1045,6 +1045,7 @@ static void __init retbleed_select_mitig
 
 	case RETBLEED_MITIGATION_IBPB:
 		setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);
+		setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
 		mitigate_smt = true;
 		break;
 
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -222,10 +222,7 @@ SYM_FUNC_START(__svm_vcpu_run)
 	 * because interrupt handlers won't sanitize 'ret' if the return is
 	 * from the kernel.
 	 */
-	UNTRAIN_RET
-
-	/* SRSO */
-	ALTERNATIVE "", "call entry_ibpb", X86_FEATURE_IBPB_ON_VMEXIT
+	UNTRAIN_RET_VM
 
 	/*
 	 * Clear all general purpose registers except RSP and RAX to prevent
@@ -362,7 +359,7 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	 * because interrupt handlers won't sanitize RET if the return is
 	 * from the kernel.
 	 */
-	UNTRAIN_RET
+	UNTRAIN_RET_VM
 
 	/* "Pop" @spec_ctrl_intercepted.  */
 	pop %_ASM_BX




* [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-08-14 11:44 [PATCH v2 00/11] Fix up SRSO stuff Peter Zijlstra
                   ` (8 preceding siblings ...)
  2023-08-14 11:44 ` [PATCH v2 09/11] x86/cpu/kvm: Provide UNTRAIN_RET_VM Peter Zijlstra
@ 2023-08-14 11:44 ` Peter Zijlstra
  2023-08-15 20:49   ` Nikolay Borisov
  2023-09-07  8:31   ` Borislav Petkov
  2023-08-14 11:44 ` [PATCH v2 11/11] x86/cpu: Use fancy alternatives to get rid of entry_untrain_ret() Peter Zijlstra
  2023-08-14 16:44 ` [PATCH v2 00/11] Fix up SRSO stuff Borislav Petkov
  11 siblings, 2 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-14 11:44 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

Instead of making increasingly complicated ALTERNATIVE_n()
implementations, use a nested alternative expression.

The only difference between:

  ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2)

and

  ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1),
              newinst2, flag2)

is that the outer alternative can add additional padding when the
inner alternative is the shorter one, which then results in
alt_instr::instrlen being inconsistent.

However, this is easily remedied since the alt_instr entries will be
consecutive and it is trivial to compute the max(alt_instr::instrlen)
at runtime while patching.
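
As a worked example with made-up sizes: take a 2 byte @oldinst, a 5 byte
@newinst1 and a 7 byte @newinst2. The inner ALTERNATIVE() pads the site to
5 bytes and records instrlen = 5 in its alt_instr entry, while the outer
one pads the site further to 7 bytes and records instrlen = 7. Since the
two entries are consecutive and describe the same site, the patching loop
below simply raises both to max(5, 7) = 7.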

Specifically, after this patch the ALTERNATIVE_2 macro, after CPP
expansion (and manual layout), looks like this:

  .macro ALTERNATIVE_2 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2
   140:

     140: \oldinstr ;
     141: .skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90 ;
     142: .pushsection .altinstructions,"a" ;
	  altinstr_entry 140b,143f,\ft_flags1,142b-140b,144f-143f ;
	  .popsection ; .pushsection .altinstr_replacement,"ax" ;
     143: \newinstr1 ;
     144: .popsection ; ;

   141: .skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90 ;
   142: .pushsection .altinstructions,"a" ;
	altinstr_entry 140b,143f,\ft_flags2,142b-140b,144f-143f ;
	.popsection ;
	.pushsection .altinstr_replacement,"ax" ;
   143: \newinstr2 ;
   144: .popsection ;
  .endm

The only ambiguous label is 140; however, all of its instances reference
the same spot, so that doesn't matter.

NOTE: obviously only @oldinstr may itself be an alternative; making
@newinstr an alternative would mean patching .altinstr_replacement,
which very likely isn't what is intended. The labels would also get
confused in that case.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230628104952.GA2439977@hirez.programming.kicks-ass.net
---
 arch/x86/include/asm/alternative.h |  206 ++++++++++---------------------------
 arch/x86/kernel/alternative.c      |   13 ++
 tools/objtool/arch/x86/special.c   |   23 ++++
 tools/objtool/special.c            |   16 +-
 4 files changed, 100 insertions(+), 158 deletions(-)

--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -150,102 +150,60 @@ static inline int alternatives_text_rese
 }
 #endif	/* CONFIG_SMP */
 
-#define b_replacement(num)	"664"#num
-#define e_replacement(num)	"665"#num
-
-#define alt_end_marker		"663"
 #define alt_slen		"662b-661b"
-#define alt_total_slen		alt_end_marker"b-661b"
-#define alt_rlen(num)		e_replacement(num)"f-"b_replacement(num)"f"
+#define alt_total_slen		"663b-661b"
+#define alt_rlen		"665f-664f"
 
-#define OLDINSTR(oldinstr, num)						\
+#define OLDINSTR(oldinstr)						\
 	"# ALT: oldnstr\n"						\
 	"661:\n\t" oldinstr "\n662:\n"					\
 	"# ALT: padding\n"						\
-	".skip -(((" alt_rlen(num) ")-(" alt_slen ")) > 0) * "		\
-		"((" alt_rlen(num) ")-(" alt_slen ")),0x90\n"		\
-	alt_end_marker ":\n"
-
-/*
- * gas compatible max based on the idea from:
- * http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
- *
- * The additional "-" is needed because gas uses a "true" value of -1.
- */
-#define alt_max_short(a, b)	"((" a ") ^ (((" a ") ^ (" b ")) & -(-((" a ") < (" b ")))))"
+	".skip -(((" alt_rlen ")-(" alt_slen ")) > 0) * "		\
+		"((" alt_rlen ")-(" alt_slen ")),0x90\n"		\
+	"663:\n"
 
-/*
- * Pad the second replacement alternative with additional NOPs if it is
- * additionally longer than the first replacement alternative.
- */
-#define OLDINSTR_2(oldinstr, num1, num2) \
-	"# ALT: oldinstr2\n"									\
-	"661:\n\t" oldinstr "\n662:\n"								\
-	"# ALT: padding2\n"									\
-	".skip -((" alt_max_short(alt_rlen(num1), alt_rlen(num2)) " - (" alt_slen ")) > 0) * "	\
-		"(" alt_max_short(alt_rlen(num1), alt_rlen(num2)) " - (" alt_slen ")), 0x90\n"	\
-	alt_end_marker ":\n"
-
-#define OLDINSTR_3(oldinsn, n1, n2, n3)								\
-	"# ALT: oldinstr3\n"									\
-	"661:\n\t" oldinsn "\n662:\n"								\
-	"# ALT: padding3\n"									\
-	".skip -((" alt_max_short(alt_max_short(alt_rlen(n1), alt_rlen(n2)), alt_rlen(n3))	\
-		" - (" alt_slen ")) > 0) * "							\
-		"(" alt_max_short(alt_max_short(alt_rlen(n1), alt_rlen(n2)), alt_rlen(n3))	\
-		" - (" alt_slen ")), 0x90\n"							\
-	alt_end_marker ":\n"
-
-#define ALTINSTR_ENTRY(ft_flags, num)					      \
+#define ALTINSTR_ENTRY(ft_flags)					      \
+	".pushsection .altinstructions,\"a\"\n"				      \
 	" .long 661b - .\n"				/* label           */ \
-	" .long " b_replacement(num)"f - .\n"		/* new instruction */ \
+	" .long 664f - .\n"				/* new instruction */ \
 	" .4byte " __stringify(ft_flags) "\n"		/* feature + flags */ \
 	" .byte " alt_total_slen "\n"			/* source len      */ \
-	" .byte " alt_rlen(num) "\n"			/* replacement len */
-
-#define ALTINSTR_REPLACEMENT(newinstr, num)		/* replacement */	\
-	"# ALT: replacement " #num "\n"						\
-	b_replacement(num)":\n\t" newinstr "\n" e_replacement(num) ":\n"
-
-/* alternative assembly primitive: */
-#define ALTERNATIVE(oldinstr, newinstr, ft_flags)			\
-	OLDINSTR(oldinstr, 1)						\
-	".pushsection .altinstructions,\"a\"\n"				\
-	ALTINSTR_ENTRY(ft_flags, 1)					\
-	".popsection\n"							\
-	".pushsection .altinstr_replacement, \"ax\"\n"			\
-	ALTINSTR_REPLACEMENT(newinstr, 1)				\
+	" .byte " alt_rlen "\n"				/* replacement len */ \
 	".popsection\n"
 
-#define ALTERNATIVE_2(oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2) \
-	OLDINSTR_2(oldinstr, 1, 2)					\
-	".pushsection .altinstructions,\"a\"\n"				\
-	ALTINSTR_ENTRY(ft_flags1, 1)					\
-	ALTINSTR_ENTRY(ft_flags2, 2)					\
-	".popsection\n"							\
-	".pushsection .altinstr_replacement, \"ax\"\n"			\
-	ALTINSTR_REPLACEMENT(newinstr1, 1)				\
-	ALTINSTR_REPLACEMENT(newinstr2, 2)				\
+#define ALTINSTR_REPLACEMENT(newinstr)			/* replacement */	\
+	".pushsection .altinstr_replacement, \"ax\"\n"				\
+	"# ALT: replacement \n"							\
+	"664:\n\t" newinstr "\n 665:\n"						\
 	".popsection\n"
 
+/*
+ * Define an alternative between two instructions. If @ft_flags is
+ * present, early code in apply_alternatives() replaces @oldinstr with
+ * @newinstr. ".skip" directive takes care of proper instruction padding
+ * in case @newinstr is longer than @oldinstr.
+ *
+ * Notably: @oldinstr may be an ALTERNATIVE() itself, also see
+ *          apply_alternatives()
+ */
+#define ALTERNATIVE(oldinstr, newinstr, ft_flags)			\
+	OLDINSTR(oldinstr)						\
+	ALTINSTR_ENTRY(ft_flags)					\
+	ALTINSTR_REPLACEMENT(newinstr)
+
+#define ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2)	\
+	ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1),		\
+		    newinst2, flag2)
+
 /* If @feature is set, patch in @newinstr_yes, otherwise @newinstr_no. */
 #define ALTERNATIVE_TERNARY(oldinstr, ft_flags, newinstr_yes, newinstr_no) \
 	ALTERNATIVE_2(oldinstr, newinstr_no, X86_FEATURE_ALWAYS,	\
 		      newinstr_yes, ft_flags)
 
-#define ALTERNATIVE_3(oldinsn, newinsn1, ft_flags1, newinsn2, ft_flags2, \
-			newinsn3, ft_flags3)				\
-	OLDINSTR_3(oldinsn, 1, 2, 3)					\
-	".pushsection .altinstructions,\"a\"\n"				\
-	ALTINSTR_ENTRY(ft_flags1, 1)					\
-	ALTINSTR_ENTRY(ft_flags2, 2)					\
-	ALTINSTR_ENTRY(ft_flags3, 3)					\
-	".popsection\n"							\
-	".pushsection .altinstr_replacement, \"ax\"\n"			\
-	ALTINSTR_REPLACEMENT(newinsn1, 1)				\
-	ALTINSTR_REPLACEMENT(newinsn2, 2)				\
-	ALTINSTR_REPLACEMENT(newinsn3, 3)				\
-	".popsection\n"
+#define ALTERNATIVE_3(oldinst, newinst1, flag1, newinst2, flag2,	\
+		      newinst3, flag3)					\
+	ALTERNATIVE(ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2), \
+		    newinst3, flag3)
 
 /*
  * Alternative instructions for different CPU types or capabilities.
@@ -370,6 +328,21 @@ static inline int alternatives_text_rese
 	.byte \alt_len
 .endm
 
+#define __ALTERNATIVE(oldinst, newinst, flag)				\
+140:									\
+	oldinst	;							\
+141:									\
+	.skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90	;\
+142:									\
+	.pushsection .altinstructions,"a" ;				\
+	altinstr_entry 140b,143f,flag,142b-140b,144f-143f ;		\
+	.popsection ;							\
+	.pushsection .altinstr_replacement,"ax"	;			\
+143:									\
+	newinst	;							\
+144:									\
+	.popsection ;
+
 /*
  * Define an alternative between two instructions. If @feature is
  * present, early code in apply_alternatives() replaces @oldinstr with
@@ -377,88 +350,23 @@ static inline int alternatives_text_rese
  * in case @newinstr is longer than @oldinstr.
  */
 .macro ALTERNATIVE oldinstr, newinstr, ft_flags
-140:
-	\oldinstr
-141:
-	.skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90
-142:
-
-	.pushsection .altinstructions,"a"
-	altinstr_entry 140b,143f,\ft_flags,142b-140b,144f-143f
-	.popsection
-
-	.pushsection .altinstr_replacement,"ax"
-143:
-	\newinstr
-144:
-	.popsection
+	__ALTERNATIVE(\oldinstr, \newinstr, \ft_flags)
 .endm
 
-#define old_len			141b-140b
-#define new_len1		144f-143f
-#define new_len2		145f-144f
-#define new_len3		146f-145f
-
-/*
- * gas compatible max based on the idea from:
- * http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
- *
- * The additional "-" is needed because gas uses a "true" value of -1.
- */
-#define alt_max_2(a, b)		((a) ^ (((a) ^ (b)) & -(-((a) < (b)))))
-#define alt_max_3(a, b, c)	(alt_max_2(alt_max_2(a, b), c))
-
-
 /*
  * Same as ALTERNATIVE macro above but for two alternatives. If CPU
  * has @feature1, it replaces @oldinstr with @newinstr1. If CPU has
  * @feature2, it replaces @oldinstr with @feature2.
  */
 .macro ALTERNATIVE_2 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2
-140:
-	\oldinstr
-141:
-	.skip -((alt_max_2(new_len1, new_len2) - (old_len)) > 0) * \
-		(alt_max_2(new_len1, new_len2) - (old_len)),0x90
-142:
-
-	.pushsection .altinstructions,"a"
-	altinstr_entry 140b,143f,\ft_flags1,142b-140b,144f-143f
-	altinstr_entry 140b,144f,\ft_flags2,142b-140b,145f-144f
-	.popsection
-
-	.pushsection .altinstr_replacement,"ax"
-143:
-	\newinstr1
-144:
-	\newinstr2
-145:
-	.popsection
+	__ALTERNATIVE(__ALTERNATIVE(\oldinstr, \newinstr1, \ft_flags1),
+		      \newinstr2, \ft_flags2)
 .endm
 
 .macro ALTERNATIVE_3 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2, newinstr3, ft_flags3
-140:
-	\oldinstr
-141:
-	.skip -((alt_max_3(new_len1, new_len2, new_len3) - (old_len)) > 0) * \
-		(alt_max_3(new_len1, new_len2, new_len3) - (old_len)),0x90
-142:
-
-	.pushsection .altinstructions,"a"
-	altinstr_entry 140b,143f,\ft_flags1,142b-140b,144f-143f
-	altinstr_entry 140b,144f,\ft_flags2,142b-140b,145f-144f
-	altinstr_entry 140b,145f,\ft_flags3,142b-140b,146f-145f
-	.popsection
-
-	.pushsection .altinstr_replacement,"ax"
-143:
-	\newinstr1
-144:
-	\newinstr2
-145:
-	\newinstr3
-146:
-	.popsection
+	__ALTERNATIVE(__ALTERNATIVE(__ALTERNATIVE(\oldinstr, \newinstr1, \ft_flags1),
+				    \newinstr2, \ft_flags2),
+		      \newinstr3, \ft_flags3)
 .endm
 
 /* If @feature is set, patch in @newinstr_yes, otherwise @newinstr_no. */
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -398,7 +398,7 @@ apply_relocation(u8 *buf, size_t len, u8
 void __init_or_module noinline apply_alternatives(struct alt_instr *start,
 						  struct alt_instr *end)
 {
-	struct alt_instr *a;
+	struct alt_instr *a, *b;
 	u8 *instr, *replacement;
 	u8 insn_buff[MAX_PATCH_LEN];
 
@@ -415,6 +415,17 @@ void __init_or_module noinline apply_alt
 	for (a = start; a < end; a++) {
 		int insn_buff_sz = 0;
 
+		/*
+		 * In case of nested ALTERNATIVE()s the outer alternative might
+		 * add more padding. To ensure consistent patching find the max
+		 * padding for all alt_instr entries for this site (nested
+		 * alternatives result in consecutive entries).
+		 */
+		for (b = a+1; b < end && b->instr_offset == a->instr_offset; b++) {
+			u8 len = max(a->instrlen, b->instrlen);
+			a->instrlen = b->instrlen = len;
+		}
+
 		instr = (u8 *)&a->instr_offset + a->instr_offset;
 		replacement = (u8 *)&a->repl_offset + a->repl_offset;
 		BUG_ON(a->instrlen > sizeof(insn_buff));
--- a/tools/objtool/arch/x86/special.c
+++ b/tools/objtool/arch/x86/special.c
@@ -9,6 +9,29 @@
 
 void arch_handle_alternative(unsigned short feature, struct special_alt *alt)
 {
+	static struct special_alt *group, *prev;
+
+	/*
+	 * Recompute orig_len for nested ALTERNATIVE()s.
+	 */
+	if (group && group->orig_sec == alt->orig_sec &&
+	             group->orig_off == alt->orig_off) {
+
+		struct special_alt *iter = group;
+		for (;;) {
+			unsigned int len = max(iter->orig_len, alt->orig_len);
+			iter->orig_len = alt->orig_len = len;
+
+			if (iter == prev)
+				break;
+
+			iter = list_next_entry(iter, list);
+		}
+
+	} else group = alt;
+
+	prev = alt;
+
 	switch (feature) {
 	case X86_FEATURE_SMAP:
 		/*
--- a/tools/objtool/special.c
+++ b/tools/objtool/special.c
@@ -84,6 +84,14 @@ static int get_alt_entry(struct elf *elf
 						  entry->new_len);
 	}
 
+	orig_reloc = find_reloc_by_dest(elf, sec, offset + entry->orig);
+	if (!orig_reloc) {
+		WARN_FUNC("can't find orig reloc", sec, offset + entry->orig);
+		return -1;
+	}
+
+	reloc_to_sec_off(orig_reloc, &alt->orig_sec, &alt->orig_off);
+
 	if (entry->feature) {
 		unsigned short feature;
 
@@ -94,14 +102,6 @@ static int get_alt_entry(struct elf *elf
 		arch_handle_alternative(feature, alt);
 	}
 
-	orig_reloc = find_reloc_by_dest(elf, sec, offset + entry->orig);
-	if (!orig_reloc) {
-		WARN_FUNC("can't find orig reloc", sec, offset + entry->orig);
-		return -1;
-	}
-
-	reloc_to_sec_off(orig_reloc, &alt->orig_sec, &alt->orig_off);
-
 	if (!entry->group || alt->new_len) {
 		new_reloc = find_reloc_by_dest(elf, sec, offset + entry->new);
 		if (!new_reloc) {




* [PATCH v2 11/11] x86/cpu: Use fancy alternatives to get rid of entry_untrain_ret()
  2023-08-14 11:44 [PATCH v2 00/11] Fix up SRSO stuff Peter Zijlstra
                   ` (9 preceding siblings ...)
  2023-08-14 11:44 ` [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n() Peter Zijlstra
@ 2023-08-14 11:44 ` Peter Zijlstra
  2023-08-14 16:44 ` [PATCH v2 00/11] Fix up SRSO stuff Borislav Petkov
  11 siblings, 0 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-14 11:44 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

Use the new nested alternatives to create what is effectively
ALTERNATIVE_5 and merge the dummy entry_untrain_ret stub into
UNTRAIN_RET properly.
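
For reference, ignoring the Kconfig guards, UNTRAIN_RET now effectively
expands to a single alternative site with five candidate replacements
(layout sketch only):

  __ALTERNATIVE(__ALTERNATIVE(__ALTERNATIVE(__ALTERNATIVE(__ALTERNATIVE(;,
		call retbleed_untrain_ret, X86_FEATURE_UNRET),
		call srso_untrain_ret, X86_FEATURE_SRSO),
		call srso_alias_untrain_ret, X86_FEATURE_SRSO_ALIAS),
		call entry_ibpb, X86_FEATURE_ENTRY_IBPB),
		RESET_CALL_DEPTH, X86_FEATURE_CALL_DEPTH)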

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/nospec-branch.h |   33 ++++++++++++++++++---------------
 arch/x86/kernel/cpu/bugs.c           |    1 -
 arch/x86/lib/retpoline.S             |    7 -------
 3 files changed, 18 insertions(+), 23 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -272,11 +272,15 @@
 .endm
 
 #ifdef CONFIG_CPU_UNRET_ENTRY
-#define CALL_UNTRAIN_RET	"call entry_untrain_ret"
+#define ALT_UNRET(old)	\
+	__ALTERNATIVE(__ALTERNATIVE(__ALTERNATIVE(old, call retbleed_untrain_ret, X86_FEATURE_UNRET), \
+				    call srso_untrain_ret, X86_FEATURE_SRSO), \
+		      call srso_alias_untrain_ret, X86_FEATURE_SRSO_ALIAS)
 #else
-#define CALL_UNTRAIN_RET	""
+#define ALT_UNRET(old)	old
 #endif
 
+
 /*
  * Mitigate RETBleed for AMD/Hygon Zen uarch. Requires KERNEL CR3 because the
  * return thunk isn't mapped into the userspace tables (then again, AMD
@@ -292,10 +296,10 @@
 #if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
 	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
 	VALIDATE_UNRET_END
-	ALTERNATIVE_3 "",						\
-		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
-		      "call entry_ibpb", X86_FEATURE_ENTRY_IBPB,	\
-		      __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
+
+	__ALTERNATIVE(__ALTERNATIVE(ALT_UNRET(;),
+				    call entry_ibpb, X86_FEATURE_ENTRY_IBPB),
+		      RESET_CALL_DEPTH, X86_FEATURE_CALL_DEPTH)
 #endif
 .endm
 
@@ -303,10 +307,10 @@
 #if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
 	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
 	VALIDATE_UNRET_END
-	ALTERNATIVE_3 "",						\
-		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
-		      "call entry_ibpb", X86_FEATURE_IBPB_ON_VMEXIT,	\
-		      __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
+
+	__ALTERNATIVE(__ALTERNATIVE(ALT_UNRET(;),
+				    call entry_ibpb, X86_FEATURE_IBPB_ON_VMEXIT),
+		      RESET_CALL_DEPTH, X86_FEATURE_CALL_DEPTH)
 #endif
 .endm
 
@@ -314,10 +318,10 @@
 #if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
 	defined(CONFIG_CALL_DEPTH_TRACKING)
 	VALIDATE_UNRET_END
-	ALTERNATIVE_3 "",						\
-		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
-		      "call entry_ibpb", X86_FEATURE_ENTRY_IBPB,	\
-		      __stringify(RESET_CALL_DEPTH_FROM_CALL), X86_FEATURE_CALL_DEPTH
+
+	__ALTERNATIVE(__ALTERNATIVE(ALT_UNRET(;),
+				    call entry_ibpb, X86_FEATURE_ENTRY_IBPB),
+		      RESET_CALL_DEPTH_FROM_CALL, X86_FEATURE_CALL_DEPTH)
 #endif
 .endm
 
@@ -352,7 +356,6 @@ extern void retbleed_untrain_ret(void);
 extern void srso_untrain_ret(void);
 extern void srso_alias_untrain_ret(void);
 
-extern void entry_untrain_ret(void);
 extern void entry_ibpb(void);
 
 extern void (*x86_return_thunk)(void);
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2450,7 +2450,6 @@ static void __init srso_select_mitigatio
 			 * like ftrace, static_call, etc.
 			 */
 			setup_force_cpu_cap(X86_FEATURE_RETHUNK);
-			setup_force_cpu_cap(X86_FEATURE_UNRET);
 
 			if (boot_cpu_data.x86 == 0x19) {
 				setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -277,13 +277,6 @@ SYM_CODE_START(srso_return_thunk)
 	ud2
 SYM_CODE_END(srso_return_thunk)
 
-SYM_FUNC_START(entry_untrain_ret)
-	ALTERNATIVE_2 "jmp retbleed_untrain_ret", \
-		      "jmp srso_untrain_ret", X86_FEATURE_SRSO, \
-		      "jmp srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
-SYM_FUNC_END(entry_untrain_ret)
-__EXPORT_THUNK(entry_untrain_ret)
-
 SYM_CODE_START(__x86_return_thunk)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 03/11] objtool/x86: Fix SRSO mess
  2023-08-14 11:44 ` [PATCH v2 03/11] objtool/x86: Fix SRSO mess Peter Zijlstra
@ 2023-08-14 12:54   ` Andrew.Cooper3
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 74+ messages in thread
From: Andrew.Cooper3 @ 2023-08-14 12:54 UTC (permalink / raw)
  To: Peter Zijlstra, x86
  Cc: linux-kernel, David.Kaplan, jpoimboe, gregkh, nik.borisov

On 14/08/2023 12:44 pm, Peter Zijlstra wrote:
> <snip>
> +/*
> + * Symbols that are embedded inside other instructions, because sometimes crazy
> + * code exists. These are mostly ignored for validation purposes.

I feel this comment still doesn't get across the sweat and tears
involved with the fixes, so I offer this alternative for consideration:

", because sometimes the only thing more crazy than hardware behaviour
is what we have to do in software to mitigate"

~Andrew

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess
  2023-08-14 11:44 ` [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess Peter Zijlstra
@ 2023-08-14 13:02   ` Borislav Petkov
  2023-08-14 17:48   ` Borislav Petkov
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 74+ messages in thread
From: Borislav Petkov @ 2023-08-14 13:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Mon, Aug 14, 2023 at 01:44:31PM +0200, Peter Zijlstra wrote:
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -1034,6 +1034,7 @@ static void __init retbleed_select_mitig
>  	case RETBLEED_MITIGATION_UNRET:
>  		setup_force_cpu_cap(X86_FEATURE_RETHUNK);
>  		setup_force_cpu_cap(X86_FEATURE_UNRET);
> +		x86_return_thunk = zen_return_thunk;
>  
>  		if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
>  		    boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
> @@ -2451,10 +2452,13 @@ static void __init srso_select_mitigatio

Note to self: When applying, add a comment that srso_select_mitigation()
depends on and must run after retbleed_select_mitigation().

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 00/11] Fix up SRSO stuff
  2023-08-14 11:44 [PATCH v2 00/11] Fix up SRSO stuff Peter Zijlstra
                   ` (10 preceding siblings ...)
  2023-08-14 11:44 ` [PATCH v2 11/11] x86/cpu: Use fancy alternatives to get rid of entry_untrain_ret() Peter Zijlstra
@ 2023-08-14 16:44 ` Borislav Petkov
  2023-08-14 19:51   ` Josh Poimboeuf
  11 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2023-08-14 16:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Mon, Aug 14, 2023 at 01:44:26PM +0200, Peter Zijlstra wrote:
> The one open techinical issue I have with the mitigation is the alignment of
> the RET inside srso_safe_ret(). The details given for retbleed stated that RET
> should be on a 64byte boundary, which is not the case here.

I have written this in the hope to make this more clear:

/*
 * Some generic notes on the untraining sequences:
 *
 * They are interchangeable when it comes to flushing potentially wrong
 * RET predictions from the BTB.
 *
 * The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
 * Retbleed sequence because the return sequence done there
 * (srso_safe_ret()) is longer and the return sequence must fully nest
 * (end before) the untraining sequence. Therefore, the untraining
 * sequence must overlap the return sequence.
 *
 * Regarding alignment - the instructions which need to be untrained,
 * must all start at a cacheline boundary for Zen1/2 generations. That
 * is, both the ret in zen_untrain_ret() and srso_safe_ret() in the
 * srso_untrain_ret() must both be placed at the beginning of
 * a cacheline.
 */

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess
  2023-08-14 11:44 ` [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess Peter Zijlstra
  2023-08-14 13:02   ` Borislav Petkov
@ 2023-08-14 17:48   ` Borislav Petkov
  2023-08-15 21:29   ` Nathan Chancellor
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 74+ messages in thread
From: Borislav Petkov @ 2023-08-14 17:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Mon, Aug 14, 2023 at 01:44:31PM +0200, Peter Zijlstra wrote:
> Where Zen1/2 flush the BTB entry using the instruction decoder trick
> (test,movabs) Zen3/4 use instruction aliasing. SRSO adds RSB (RAP in

I'll change that "instruction aliasing" to "BTB aliasing".

> AMD speak) stuffing to force speculation into a trap and cause a
> mis-predict.

I'll change that to the much more precise:

"SRSO adds a return sequence (srso_safe_ret()) which forces the function
return instruction to speculate into a trap (UD2).  This RET will then
mispredict and execution will continue at the return site read from the
top of the stack."
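
For reference, the sequence in question is roughly (a sketch of the
retpoline.S pieces as they look later in this series, return path only):

SYM_CODE_START(srso_return_thunk)
	call srso_safe_ret		/* pushes the address of the UD2 below */
	ud2				/* speculation trap */
SYM_CODE_END(srso_return_thunk)

SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
	lea 8(%_ASM_SP), %_ASM_SP	/* drop the return address pushed by the CALL */
	ret				/* architecturally returns to the original
					   return site read from the top of the stack;
					   the *predicted* target is the UD2 above */
	int3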

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 06/11] x86/cpu: Rename original retbleed methods
  2023-08-14 11:44 ` [PATCH v2 06/11] x86/cpu: Rename original retbleed methods Peter Zijlstra
@ 2023-08-14 19:41   ` Josh Poimboeuf
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
  2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
  2 siblings, 0 replies; 74+ messages in thread
From: Josh Poimboeuf @ 2023-08-14 19:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh, nik.borisov

On Mon, Aug 14, 2023 at 01:44:32PM +0200, Peter Zijlstra wrote:
> Rename the original retbleed return thunk and untrain_ret to
> retbleed_return_thunk and retbleed_untrain_ret.
> 
> Andrew wants to call this btc_*, do we have a poll?

It should stay retbleed because:

1) It doesn't mitigate all possible manifestations of BTC.  It only
   mitigates BTC-RET, aka AMD "retbleed".

2) It should match the naming of the user interfaces which aren't going
   to change at this point.

-- 
Josh

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 00/11] Fix up SRSO stuff
  2023-08-14 16:44 ` [PATCH v2 00/11] Fix up SRSO stuff Borislav Petkov
@ 2023-08-14 19:51   ` Josh Poimboeuf
  2023-08-14 19:57     ` Borislav Petkov
  2023-08-14 20:01     ` Josh Poimboeuf
  0 siblings, 2 replies; 74+ messages in thread
From: Josh Poimboeuf @ 2023-08-14 19:51 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Peter Zijlstra, x86, linux-kernel, David.Kaplan, Andrew.Cooper3,
	gregkh, nik.borisov

On Mon, Aug 14, 2023 at 06:44:47PM +0200, Borislav Petkov wrote:
> On Mon, Aug 14, 2023 at 01:44:26PM +0200, Peter Zijlstra wrote:
> > The one open techinical issue I have with the mitigation is the alignment of
> > the RET inside srso_safe_ret(). The details given for retbleed stated that RET
> > should be on a 64byte boundary, which is not the case here.
> 
> I have written this in the hope to make this more clear:
> 
> /*
>  * Some generic notes on the untraining sequences:
>  *
>  * They are interchangeable when it comes to flushing potentially wrong
>  * RET predictions from the BTB.
>  *
>  * The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
>  * Retbleed sequence because the return sequence done there
>  * (srso_safe_ret()) is longer and the return sequence must fully nest
>  * (end before) the untraining sequence. Therefore, the untraining
>  * sequence must overlap the return sequence.
>  *
>  * Regarding alignment - the instructions which need to be untrained,
>  * must all start at a cacheline boundary for Zen1/2 generations. That
>  * is, both the ret in zen_untrain_ret() and srso_safe_ret() in the
>  * srso_untrain_ret() must both be placed at the beginning of
>  * a cacheline.
>  */

It's a good comment, but RET in srso_safe_ret() is still misaligned.
Don't we need something like so?

diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 9bc19deacad1..373ac128a30a 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -251,13 +251,14 @@ __EXPORT_THUNK(retbleed_untrain_ret)
  * thus a "safe" one to use.
  */
 	.align 64
-	.skip 64 - (srso_safe_ret - srso_untrain_ret), 0xcc
+	.skip 64 - (.Lsrso_ret - srso_untrain_ret), 0xcc
 SYM_START(srso_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_NOENDBR
 	.byte 0x48, 0xb8
 
 SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
 	lea 8(%_ASM_SP), %_ASM_SP
+.Lsrso_ret:
 	ret
 	int3
 	int3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 00/11] Fix up SRSO stuff
  2023-08-14 19:51   ` Josh Poimboeuf
@ 2023-08-14 19:57     ` Borislav Petkov
  2023-08-14 20:01     ` Josh Poimboeuf
  1 sibling, 0 replies; 74+ messages in thread
From: Borislav Petkov @ 2023-08-14 19:57 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, x86, linux-kernel, David.Kaplan, Andrew.Cooper3,
	gregkh, nik.borisov

On Mon, Aug 14, 2023 at 12:51:53PM -0700, Josh Poimboeuf wrote:
> >  * Regarding alignment - the instructions which need to be untrained,
> >  * must all start at a cacheline boundary for Zen1/2 generations. That
> >  * is, both the ret in zen_untrain_ret() and srso_safe_ret() in the
> >  * srso_untrain_ret() must both be placed at the beginning of
> >  * a cacheline.
> >  */
> 
> It's a good comment, but RET in srso_safe_ret() is still misaligned.
> Don't we need something like so?

Well, I guess that comment is still not good enough. It's not the RET that
must be cacheline-aligned but the function return sequences.

IOW, we need this:

<--- cacheline begin
SYM_INNER_LABEL(retbleed_return_thunk, SYM_L_GLOBAL)
        ret
        int3


and

<--- cacheline begin
SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
        lea 8(%_ASM_SP), %_ASM_SP
        ret
        int3

I'll improve on it before I apply it.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 00/11] Fix up SRSO stuff
  2023-08-14 19:51   ` Josh Poimboeuf
  2023-08-14 19:57     ` Borislav Petkov
@ 2023-08-14 20:01     ` Josh Poimboeuf
  2023-08-14 20:09       ` Borislav Petkov
  1 sibling, 1 reply; 74+ messages in thread
From: Josh Poimboeuf @ 2023-08-14 20:01 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Peter Zijlstra, x86, linux-kernel, David.Kaplan, Andrew.Cooper3,
	gregkh, nik.borisov

On Mon, Aug 14, 2023 at 12:51:55PM -0700, Josh Poimboeuf wrote:
> On Mon, Aug 14, 2023 at 06:44:47PM +0200, Borislav Petkov wrote:
> > On Mon, Aug 14, 2023 at 01:44:26PM +0200, Peter Zijlstra wrote:
> > > The one open techinical issue I have with the mitigation is the alignment of
> > > the RET inside srso_safe_ret(). The details given for retbleed stated that RET
> > > should be on a 64byte boundary, which is not the case here.
> > 
> > I have written this in the hope to make this more clear:
> > 
> > /*
> >  * Some generic notes on the untraining sequences:
> >  *
> >  * They are interchangeable when it comes to flushing potentially wrong
> >  * RET predictions from the BTB.
> >  *
> >  * The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
> >  * Retbleed sequence because the return sequence done there
> >  * (srso_safe_ret()) is longer and the return sequence must fully nest
> >  * (end before) the untraining sequence. Therefore, the untraining
> >  * sequence must overlap the return sequence.
> >  *
> >  * Regarding alignment - the instructions which need to be untrained,
> >  * must all start at a cacheline boundary for Zen1/2 generations. That
> >  * is, both the ret in zen_untrain_ret() and srso_safe_ret() in the
> >  * srso_untrain_ret() must both be placed at the beginning of
> >  * a cacheline.
> >  */
> 
> It's a good comment, but RET in srso_safe_ret() is still misaligned.
> Don't we need something like so?

Scratch that, I guess I misread the confusingly worded comment:

  "both the ret in zen_untrain_ret() and srso_safe_ret()..."

to mean the RET in each function.

How about:

  "both the RET in zen_untrain_ret() and the LEA in srso_untrain_ret()"

?

-- 
Josh

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 00/11] Fix up SRSO stuff
  2023-08-14 20:01     ` Josh Poimboeuf
@ 2023-08-14 20:09       ` Borislav Petkov
  2023-08-15 14:26         ` [PATCH] x86/srso: Explain the untraining sequences a bit more Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2023-08-14 20:09 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, x86, linux-kernel, David.Kaplan, Andrew.Cooper3,
	gregkh, nik.borisov

On Mon, Aug 14, 2023 at 01:01:28PM -0700, Josh Poimboeuf wrote:
> How about:
> 
>   "both the RET in zen_untrain_ret() and the LEA in srso_untrain_ret()"
> 
> ?

Yeah, or the "instruction sequences starting at srso_safe_ret and
retbleed_return_thunk" (that's what it's called now) "must start at
a cacheline boundary."

Because the LEA used to be an ADD but that changed, so saying "the
instruction sequences" just works.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH] x86/srso: Explain the untraining sequences a bit more
  2023-08-14 20:09       ` Borislav Petkov
@ 2023-08-15 14:26         ` Borislav Petkov
  2023-08-15 15:41           ` Nikolay Borisov
  0 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2023-08-15 14:26 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, x86, linux-kernel, David.Kaplan, Andrew.Cooper3,
	gregkh, nik.borisov

From: "Borislav Petkov (AMD)" <bp@alien8.de>
Date: Mon, 14 Aug 2023 21:29:50 +0200

The goal is to eventually have proper documentation about all this.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>

diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 915c4fe17718..e59c46581bbb 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -183,6 +183,25 @@ SYM_CODE_START(srso_alias_return_thunk)
 	ud2
 SYM_CODE_END(srso_alias_return_thunk)
 
+/*
+ * Some generic notes on the untraining sequences:
+ *
+ * They are interchangeable when it comes to flushing potentially wrong
+ * RET predictions from the BTB.
+ *
+ * The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
+ * Retbleed sequence because the return sequence done there
+ * (srso_safe_ret()) is longer and the return sequence must fully nest
+ * (end before) the untraining sequence. Therefore, the untraining
+ * sequence must fully overlap the return sequence.
+ *
+ * Regarding alignment - the instructions which need to be untrained,
+ * must all start at a cacheline boundary for Zen1/2 generations. That
+ * is, instruction sequences starting at srso_safe_ret() and
+ * the respective instruction sequences at retbleed_return_thunk()
+ * must start at a cacheline boundary.
+ */
+
 /*
  * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
  * 1) The RET at retbleed_return_thunk must be on a 64 byte boundary, for
-- 
2.42.0.rc0.25.ga82fb66fed25

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH] x86/srso: Explain the untraining sequences a bit more
  2023-08-15 14:26         ` [PATCH] x86/srso: Explain the untraining sequences a bit more Borislav Petkov
@ 2023-08-15 15:41           ` Nikolay Borisov
  0 siblings, 0 replies; 74+ messages in thread
From: Nikolay Borisov @ 2023-08-15 15:41 UTC (permalink / raw)
  To: Borislav Petkov, Josh Poimboeuf
  Cc: Peter Zijlstra, x86, linux-kernel, David.Kaplan, Andrew.Cooper3, gregkh



On 15.08.23 г. 17:26 ч., Borislav Petkov wrote:
> From: "Borislav Petkov (AMD)" <bp@alien8.de>
> Date: Mon, 14 Aug 2023 21:29:50 +0200
> 
> The goal is to eventually have a proper documentation about all this.
> 
> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
> 
> diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
> index 915c4fe17718..e59c46581bbb 100644
> --- a/arch/x86/lib/retpoline.S
> +++ b/arch/x86/lib/retpoline.S
> @@ -183,6 +183,25 @@ SYM_CODE_START(srso_alias_return_thunk)
>   	ud2
>   SYM_CODE_END(srso_alias_return_thunk)
>   
> +/*
> + * Some generic notes on the untraining sequences:
> + *
> + * They are interchangeable when it comes to flushing potentially wrong
> + * RET predictions from the BTB.
> + *
> + * The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
> + * Retbleed sequence because the return sequence done there
> + * (srso_safe_ret()) is longer and the return sequence must fully nest
> + * (end before) the untraining sequence. Therefore, the untraining
> + * sequence must fully overlap the return sequence.
> + *
> + * Regarding alignment - the instructions which need to be untrained,
> + * must all start at a cacheline boundary for Zen1/2 generations. That
> + * is, instruction sequences starting at srso_safe_ret() and
> + * the respective instruction sequences at retbleed_return_thunk()
> + * must start at a cacheline boundary.
> + */

Are there any salient generic details about Zen 3/4?
> +
>   /*
>    * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
>    * 1) The RET at retbleed_return_thunk must be on a 64 byte boundary, for

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-08-14 11:44 ` [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n() Peter Zijlstra
@ 2023-08-15 20:49   ` Nikolay Borisov
  2023-08-15 22:44     ` Peter Zijlstra
  2023-09-07  8:31   ` Borislav Petkov
  1 sibling, 1 reply; 74+ messages in thread
From: Nikolay Borisov @ 2023-08-15 20:49 UTC (permalink / raw)
  To: Peter Zijlstra, x86
  Cc: linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh



On 14.08.23 г. 14:44 ч., Peter Zijlstra wrote:
> Instead of making increasingly complicated ALTERNATIVE_n()
> implementations, use a nested alternative expression.
> 
> The only difference between:
> 
>    ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2)
> 
> and
> 
>    ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1),
>                newinst2, flag2)
> 
> is that the outer alternative can add additional padding when the
> inner alternative is the shorter one, which then results in
> alt_instr::instrlen being inconsistent.
> 
> However, this is easily remedied since the alt_instr entries will be
> consecutive and it is trivial to compute the max(alt_instr::instrlen)
> at runtime while patching.
> 
> Specifically, after this patch the ALTERNATIVE_2 macro, after CPP
> expansion (and manual layout), looks like this:
> 
>    .macro ALTERNATIVE_2 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2
>     140:
> 
>       140: \oldinstr ;
>       141: .skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90 ;
>       142: .pushsection .altinstructions,"a" ;
> 	  altinstr_entry 140b,143f,\ft_flags1,142b-140b,144f-143f ;
> 	  .popsection ; .pushsection .altinstr_replacement,"ax" ;
>       143: \newinstr1 ;
>       144: .popsection ; ;
> 
>     141: .skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90 ;
>     142: .pushsection .altinstructions,"a" ;
> 	altinstr_entry 140b,143f,\ft_flags2,142b-140b,144f-143f ;
> 	.popsection ;
> 	.pushsection .altinstr_replacement,"ax" ;
>     143: \newinstr2 ;
>     144: .popsection ;
>    .endm
> 
> The only label that is ambiguous is 140, however they all reference
> the same spot, so that doesn't matter.
> 
> NOTE: obviously only @oldinstr may be an alternative; making @newinstr
> an alternative would mean patching .altinstr_replacement which very
> likely isn't what is intended, also the labels will be confused in
> that case.
> 

Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>

Ps. I feel very "enlightened" knowing that GAS uses -1 to represent true 
...
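
E.g. a standalone toy (not from the patch) showing what the .skip
expression above relies on -- GAS relational operators evaluate to -1
for true and 0 for false:

	.skip -((5 - 3) > 0) * (5 - 3), 0x90	# (5-3) > 0 is -1 -> pads 2 bytes
	.skip -((3 - 5) > 0) * (3 - 5), 0x90	# (3-5) > 0 is  0 -> pads nothing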

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess
  2023-08-14 11:44 ` [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess Peter Zijlstra
  2023-08-14 13:02   ` Borislav Petkov
  2023-08-14 17:48   ` Borislav Petkov
@ 2023-08-15 21:29   ` Nathan Chancellor
  2023-08-15 22:43     ` Peter Zijlstra
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
  2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
  4 siblings, 1 reply; 74+ messages in thread
From: Nathan Chancellor @ 2023-08-15 21:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

Hi Peter,

On Mon, Aug 14, 2023 at 01:44:31PM +0200, Peter Zijlstra wrote:

<snip>

>  arch/x86/include/asm/nospec-branch.h |    6 ++++
>  arch/x86/kernel/cpu/bugs.c           |    8 ++++--
>  arch/x86/kernel/vmlinux.lds.S        |    2 -
>  arch/x86/lib/retpoline.S             |   45 ++++++++++++++++++++++-------------
>  tools/objtool/arch/x86/decode.c      |    2 -
>  5 files changed, 43 insertions(+), 20 deletions(-)
> 
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -342,9 +342,15 @@ extern retpoline_thunk_t __x86_indirect_
>  extern retpoline_thunk_t __x86_indirect_jump_thunk_array[];
>  
>  extern void __x86_return_thunk(void);
> +
> +extern void zen_return_thunk(void);
> +extern void srso_return_thunk(void);
> +extern void srso_alias_return_thunk(void);
> +
>  extern void zen_untrain_ret(void);
>  extern void srso_untrain_ret(void);
>  extern void srso_untrain_ret_alias(void);
> +
>  extern void entry_ibpb(void);
>  
>  extern void (*x86_return_thunk)(void);
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -1034,6 +1034,7 @@ static void __init retbleed_select_mitig
>  	case RETBLEED_MITIGATION_UNRET:
>  		setup_force_cpu_cap(X86_FEATURE_RETHUNK);
>  		setup_force_cpu_cap(X86_FEATURE_UNRET);
> +		x86_return_thunk = zen_return_thunk;
>  
>  		if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
>  		    boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
> @@ -2451,10 +2452,13 @@ static void __init srso_select_mitigatio
>  			 */
>  			setup_force_cpu_cap(X86_FEATURE_RETHUNK);
>  
> -			if (boot_cpu_data.x86 == 0x19)
> +			if (boot_cpu_data.x86 == 0x19) {
>  				setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);
> -			else
> +				x86_return_thunk = srso_alias_return_thunk;
> +			} else {
>  				setup_force_cpu_cap(X86_FEATURE_SRSO);
> +				x86_return_thunk = srso_return_thunk;
> +			}
>  			srso_mitigation = SRSO_MITIGATION_SAFE_RET;
>  		} else {
>  			pr_err("WARNING: kernel not compiled with CPU_SRSO.\n");
> --- a/arch/x86/kernel/vmlinux.lds.S
> +++ b/arch/x86/kernel/vmlinux.lds.S
> @@ -521,7 +521,7 @@ INIT_PER_CPU(irq_stack_backing_store);
>  #endif
>  
>  #ifdef CONFIG_RETHUNK
> -. = ASSERT((__ret & 0x3f) == 0, "__ret not cacheline-aligned");
> +. = ASSERT((zen_return_thunk & 0x3f) == 0, "zen_return_thunk not cacheline-aligned");
>  . = ASSERT((srso_safe_ret & 0x3f) == 0, "srso_safe_ret not cacheline-aligned");
>  #endif
>  
> --- a/arch/x86/lib/retpoline.S
> +++ b/arch/x86/lib/retpoline.S
> @@ -151,22 +151,20 @@ SYM_CODE_END(__x86_indirect_jump_thunk_a
>  	.section .text..__x86.rethunk_untrain
>  
>  SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
> +	UNWIND_HINT_FUNC
>  	ANNOTATE_NOENDBR
>  	ASM_NOP2
>  	lfence
> -	jmp __x86_return_thunk
> +	jmp srso_alias_return_thunk
>  SYM_FUNC_END(srso_untrain_ret_alias)
>  __EXPORT_THUNK(srso_untrain_ret_alias)
>  
>  	.section .text..__x86.rethunk_safe
>  #endif
>  
> -/* Needs a definition for the __x86_return_thunk alternative below. */
>  SYM_START(srso_safe_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
> -#ifdef CONFIG_CPU_SRSO
>  	lea 8(%_ASM_SP), %_ASM_SP
>  	UNWIND_HINT_FUNC
> -#endif
>  	ANNOTATE_UNRET_SAFE
>  	ret
>  	int3
> @@ -174,9 +172,16 @@ SYM_FUNC_END(srso_safe_ret_alias)
>  
>  	.section .text..__x86.return_thunk
>  
> +SYM_CODE_START(srso_alias_return_thunk)
> +	UNWIND_HINT_FUNC
> +	ANNOTATE_NOENDBR
> +	call srso_safe_ret_alias
> +	ud2
> +SYM_CODE_END(srso_alias_return_thunk)
> +
>  /*
>   * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
> - * 1) The RET at __x86_return_thunk must be on a 64 byte boundary, for
> + * 1) The RET at zen_return_thunk must be on a 64 byte boundary, for
>   *    alignment within the BTB.
>   * 2) The instruction at zen_untrain_ret must contain, and not
>   *    end with, the 0xc3 byte of the RET.
> @@ -184,7 +189,7 @@ SYM_FUNC_END(srso_safe_ret_alias)
>   *    from re-poisioning the BTB prediction.
>   */
>  	.align 64
> -	.skip 64 - (__ret - zen_untrain_ret), 0xcc
> +	.skip 64 - (zen_return_thunk - zen_untrain_ret), 0xcc
>  SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
>  	ANNOTATE_NOENDBR
>  	/*
> @@ -192,16 +197,16 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL,
>  	 *
>  	 *   TEST $0xcc, %bl
>  	 *   LFENCE
> -	 *   JMP __x86_return_thunk
> +	 *   JMP zen_return_thunk
>  	 *
>  	 * Executing the TEST instruction has a side effect of evicting any BTB
>  	 * prediction (potentially attacker controlled) attached to the RET, as
> -	 * __x86_return_thunk + 1 isn't an instruction boundary at the moment.
> +	 * zen_return_thunk + 1 isn't an instruction boundary at the moment.
>  	 */
>  	.byte	0xf6
>  
>  	/*
> -	 * As executed from __x86_return_thunk, this is a plain RET.
> +	 * As executed from zen_return_thunk, this is a plain RET.
>  	 *
>  	 * As part of the TEST above, RET is the ModRM byte, and INT3 the imm8.
>  	 *
> @@ -213,13 +218,13 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL,
>  	 * With SMT enabled and STIBP active, a sibling thread cannot poison
>  	 * RET's prediction to a type of its choice, but can evict the
>  	 * prediction due to competitive sharing. If the prediction is
> -	 * evicted, __x86_return_thunk will suffer Straight Line Speculation
> +	 * evicted, zen_return_thunk will suffer Straight Line Speculation
>  	 * which will be contained safely by the INT3.
>  	 */
> -SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)
> +SYM_INNER_LABEL(zen_return_thunk, SYM_L_GLOBAL)
>  	ret
>  	int3
> -SYM_CODE_END(__ret)
> +SYM_CODE_END(zen_return_thunk)
>  
>  	/*
>  	 * Ensure the TEST decoding / BTB invalidation is complete.
> @@ -230,7 +235,7 @@ SYM_CODE_END(__ret)
>  	 * Jump back and execute the RET in the middle of the TEST instruction.
>  	 * INT3 is for SLS protection.
>  	 */
> -	jmp __ret
> +	jmp zen_return_thunk
>  	int3
>  SYM_FUNC_END(zen_untrain_ret)
>  __EXPORT_THUNK(zen_untrain_ret)
> @@ -256,6 +261,7 @@ SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLO
>  	ret
>  	int3
>  	int3
> +	/* end of movabs */
>  	lfence
>  	call srso_safe_ret
>  	ud2
> @@ -263,12 +269,19 @@ SYM_CODE_END(srso_safe_ret)
>  SYM_FUNC_END(srso_untrain_ret)
>  __EXPORT_THUNK(srso_untrain_ret)
>  
> -SYM_CODE_START(__x86_return_thunk)
> +SYM_CODE_START(srso_return_thunk)
>  	UNWIND_HINT_FUNC
>  	ANNOTATE_NOENDBR
> -	ALTERNATIVE_2 "jmp __ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
> -			"call srso_safe_ret_alias", X86_FEATURE_SRSO_ALIAS
> +	call srso_safe_ret
>  	ud2
> +SYM_CODE_END(srso_return_thunk)
> +
> +SYM_CODE_START(__x86_return_thunk)
> +	UNWIND_HINT_FUNC
> +	ANNOTATE_NOENDBR
> +	ANNOTATE_UNRET_SAFE
> +	ret
> +	int3
>  SYM_CODE_END(__x86_return_thunk)
>  EXPORT_SYMBOL(__x86_return_thunk)
>  
> --- a/tools/objtool/arch/x86/decode.c
> +++ b/tools/objtool/arch/x86/decode.c
> @@ -829,6 +829,6 @@ bool arch_is_rethunk(struct symbol *sym)
>  
>  bool arch_is_embedded_insn(struct symbol *sym)
>  {
> -	return !strcmp(sym->name, "__ret") ||
> +	return !strcmp(sym->name, "zen_return_thunk") ||
>  	       !strcmp(sym->name, "srso_safe_ret");
>  }
> 
> 

I applied this change on top of -tip master and linux-next, where it
appears to break i386_defconfig (I see this error in other
configurations too but defconfig is obviously a simple target) with both
GCC and LLVM:

  i386-linux-ld: arch/x86/kernel/cpu/bugs.o: in function `cpu_select_mitigations':
  bugs.c:(.init.text+0xe61): undefined reference to `zen_return_thunk'
  i386-linux-ld: bugs.c:(.init.text+0xe66): undefined reference to `x86_return_thunk'

  ld.lld: error: undefined symbol: x86_return_thunk
  >>> referenced by bugs.c
  >>>               arch/x86/kernel/cpu/bugs.o:(retbleed_select_mitigation) in archive vmlinux.a

  ld.lld: error: undefined symbol: zen_return_thunk
  >>> referenced by bugs.c
  >>>               arch/x86/kernel/cpu/bugs.o:(retbleed_select_mitigation) in archive vmlinux.a

It is still present at the head of the series, just with the function
rename.

  i386-linux-ld: arch/x86/kernel/cpu/bugs.o: in function `cpu_select_mitigations':
  bugs.c:(.init.text+0xe61): undefined reference to `retbleed_return_thunk'
  i386-linux-ld: bugs.c:(.init.text+0xe66): undefined reference to `x86_return_thunk'

  ld.lld: error: undefined symbol: x86_return_thunk
  >>> referenced by bugs.c
  >>>               arch/x86/kernel/cpu/bugs.o:(retbleed_select_mitigation) in archive vmlinux.a

  ld.lld: error: undefined symbol: retbleed_return_thunk
  >>> referenced by bugs.c
  >>>               arch/x86/kernel/cpu/bugs.o:(retbleed_select_mitigation) in archive vmlinux.a

This configuration does have

  # CONFIG_RETHUNK is not set

but turning it on does not resolve the x86_return_thunk error...

  i386-linux-ld: arch/x86/kernel/static_call.o: in function `__static_call_transform':
  static_call.c:(.ref.text+0x4a): undefined reference to `x86_return_thunk'
  i386-linux-ld: static_call.c:(.ref.text+0x137): undefined reference to `x86_return_thunk'
  i386-linux-ld: arch/x86/kernel/cpu/bugs.o: in function `cpu_select_mitigations':
  bugs.c:(.init.text+0xef2): undefined reference to `x86_return_thunk'

  ld.lld: error: undefined symbol: x86_return_thunk
  >>> referenced by static_call.c
  >>>               arch/x86/kernel/static_call.o:(__static_call_transform) in archive vmlinux.a
  >>> referenced by static_call.c
  >>>               arch/x86/kernel/static_call.o:(__static_call_transform) in archive vmlinux.a
  >>> referenced by bugs.c
  >>>               arch/x86/kernel/cpu/bugs.o:(retbleed_select_mitigation) in archive vmlinux.a

I'd keep digging but I am running out of time for the day, hence just
the report rather than a fix.

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess
  2023-08-15 21:29   ` Nathan Chancellor
@ 2023-08-15 22:43     ` Peter Zijlstra
  2023-08-16  7:38       ` Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-15 22:43 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Tue, Aug 15, 2023 at 02:29:31PM -0700, Nathan Chancellor wrote:

> I applied this change on top of -tip master and linux-next, where it
> appears to break i386_defconfig (I see this error in other
> configurations too but defconfig is obviously a simple target) with both
> GCC and LLVM:

Yeah, Boris and I fixed that yesterday evening or so. I'm not sure
I still have the diffs, but Boris should have them all somewhere.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-08-15 20:49   ` Nikolay Borisov
@ 2023-08-15 22:44     ` Peter Zijlstra
  0 siblings, 0 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-15 22:44 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe, gregkh

On Tue, Aug 15, 2023 at 11:49:16PM +0300, Nikolay Borisov wrote:
> 
> 
> On 14.08.23 г. 14:44 ч., Peter Zijlstra wrote:
> > Instead of making increasingly complicated ALTERNATIVE_n()
> > implementations, use a nested alternative expression.
> > 
> > The only difference between:
> > 
> >    ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2)
> > 
> > and
> > 
> >    ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1),
> >                newinst2, flag2)
> > 
> > is that the outer alternative can add additional padding when the
> > inner alternative is the shorter one, which then results in
> > alt_instr::instrlen being inconsistent.
> > 
> > However, this is easily remedied since the alt_instr entries will be
> > consecutive and it is trivial to compute the max(alt_instr::instrlen)
> > at runtime while patching.
> > 
> > Specifically, after this patch the ALTERNATIVE_2 macro, after CPP
> > expansion (and manual layout), looks like this:
> > 
> >    .macro ALTERNATIVE_2 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2
> >     140:
> > 
> >       140: \oldinstr ;
> >       141: .skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90 ;
> >       142: .pushsection .altinstructions,"a" ;
> > 	  altinstr_entry 140b,143f,\ft_flags1,142b-140b,144f-143f ;
> > 	  .popsection ; .pushsection .altinstr_replacement,"ax" ;
> >       143: \newinstr1 ;
> >       144: .popsection ; ;
> > 
> >     141: .skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90 ;
> >     142: .pushsection .altinstructions,"a" ;
> > 	altinstr_entry 140b,143f,\ft_flags2,142b-140b,144f-143f ;
> > 	.popsection ;
> > 	.pushsection .altinstr_replacement,"ax" ;
> >     143: \newinstr2 ;
> >     144: .popsection ;
> >    .endm
> > 
> > The only label that is ambiguous is 140, however they all reference
> > the same spot, so that doesn't matter.
> > 
> > NOTE: obviously only @oldinstr may be an alternative; making @newinstr
> > an alternative would mean patching .altinstr_replacement which very
> > likely isn't what is intended, also the labels will be confused in
> > that case.
> > 
> 
> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
> 
> Ps. I feel very "enlightened" knowing that GAS uses -1 to represent true ...

Ah, but only sometimes ;-)

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess
  2023-08-15 22:43     ` Peter Zijlstra
@ 2023-08-16  7:38       ` Borislav Petkov
  2023-08-16 14:52         ` Nathan Chancellor
  0 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2023-08-16  7:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nathan Chancellor, x86, linux-kernel, David.Kaplan,
	Andrew.Cooper3, jpoimboe, gregkh, nik.borisov

On Wed, Aug 16, 2023 at 12:43:48AM +0200, Peter Zijlstra wrote:
> Yeah, Boris and me fixed that yesterday evening or so. I'm not sure
> I still have the diffs, but Boris should have them all somewhere.

Even better - all the urgent fixes I've accumulated so far are coming up
in tip's x86/urgent.  I'd appreciate people testing it.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [tip: x86/urgent] x86/cpu/kvm: Provide UNTRAIN_RET_VM
  2023-08-14 11:44 ` [PATCH v2 09/11] x86/cpu/kvm: Provide UNTRAIN_RET_VM Peter Zijlstra
@ 2023-08-16  7:55   ` tip-bot2 for Peter Zijlstra
  2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 74+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-08-16  7:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     ad63073765fa394665bcb54660cc997f05b704d4
Gitweb:        https://git.kernel.org/tip/ad63073765fa394665bcb54660cc997f05b704d4
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 14 Aug 2023 13:44:35 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 16 Aug 2023 09:39:16 +02:00

x86/cpu/kvm: Provide UNTRAIN_RET_VM

Similar to how it doesn't make sense to have UNTRAIN_RET have two
untrain calls, it also doesn't make sense for VMEXIT to have an extra
IBPB call.

This cures VMEXIT doing potentially unret+IBPB or double IBPB.
Also, the (SEV) VMEXIT case seems to have been overlooked.

Redefine the meaning of the synthetic IBPB flags to:

 - ENTRY_IBPB     -- issue IBPB on entry  (was: entry + VMEXIT)
 - IBPB_ON_VMEXIT -- issue IBPB on VMEXIT

And have 'retbleed=ibpb' set *BOTH* feature flags to ensure it retains
the previous behaviour and issues IBPB on entry+VMEXIT.

The new 'srso=ibpb_vmexit' option only sets IBPB_ON_VMEXIT.

Create UNTRAIN_RET_VM specifically for the VMEXIT case, and have that
check IBPB_ON_VMEXIT.

All this avoids having the VMEXIT case having to check both ENTRY_IBPB
and IBPB_ON_VMEXIT and simplifies the alternatives.

Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121149.109557833@infradead.org
---
 arch/x86/include/asm/nospec-branch.h | 11 +++++++++++
 arch/x86/kernel/cpu/bugs.c           |  1 +
 arch/x86/kvm/svm/vmenter.S           |  7 ++-----
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 5285c8e..c55cc24 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -299,6 +299,17 @@
 #endif
 .endm
 
+.macro UNTRAIN_RET_VM
+#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
+	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
+	VALIDATE_UNRET_END
+	ALTERNATIVE_3 "",						\
+		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
+		      "call entry_ibpb", X86_FEATURE_IBPB_ON_VMEXIT,	\
+		      __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
+#endif
+.endm
+
 .macro UNTRAIN_RET_FROM_CALL
 #if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
 	defined(CONFIG_CALL_DEPTH_TRACKING)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 6f3e195..9026e3f 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1054,6 +1054,7 @@ do_cmd_auto:
 
 	case RETBLEED_MITIGATION_IBPB:
 		setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);
+		setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
 		mitigate_smt = true;
 		break;
 
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 265452f..ef2ebab 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -222,10 +222,7 @@ SYM_FUNC_START(__svm_vcpu_run)
 	 * because interrupt handlers won't sanitize 'ret' if the return is
 	 * from the kernel.
 	 */
-	UNTRAIN_RET
-
-	/* SRSO */
-	ALTERNATIVE "", "call entry_ibpb", X86_FEATURE_IBPB_ON_VMEXIT
+	UNTRAIN_RET_VM
 
 	/*
 	 * Clear all general purpose registers except RSP and RAX to prevent
@@ -362,7 +359,7 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	 * because interrupt handlers won't sanitize RET if the return is
 	 * from the kernel.
 	 */
-	UNTRAIN_RET
+	UNTRAIN_RET_VM
 
 	/* "Pop" @spec_ctrl_intercepted.  */
 	pop %_ASM_BX

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip: x86/urgent] x86/cpu: Cleanup the untrain mess
  2023-08-14 11:44 ` [PATCH v2 08/11] x86/cpu: Cleanup the untrain mess Peter Zijlstra
@ 2023-08-16  7:55   ` tip-bot2 for Peter Zijlstra
  2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 74+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-08-16  7:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     4854a36d877a3aef0afdd241cccb68c1825234d0
Gitweb:        https://git.kernel.org/tip/4854a36d877a3aef0afdd241cccb68c1825234d0
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 14 Aug 2023 13:44:34 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 16 Aug 2023 09:39:16 +02:00

x86/cpu: Cleanup the untrain mess

Since there can only be one active return_thunk, there only needs to be
one (matching) untrain_ret. It fundamentally doesn't make sense to
allow multiple untrain_ret at the same time.

Fold all the 3 different untrain methods into a single (temporary)
helper stub.

Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121149.042774962@infradead.org
---
 arch/x86/include/asm/nospec-branch.h | 19 +++++--------------
 arch/x86/kernel/cpu/bugs.c           |  1 +
 arch/x86/lib/retpoline.S             |  7 +++++++
 3 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index f7c3375..5285c8e 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -272,9 +272,9 @@
 .endm
 
 #ifdef CONFIG_CPU_UNRET_ENTRY
-#define CALL_ZEN_UNTRAIN_RET	"call retbleed_untrain_ret"
+#define CALL_UNTRAIN_RET	"call entry_untrain_ret"
 #else
-#define CALL_ZEN_UNTRAIN_RET	""
+#define CALL_UNTRAIN_RET	""
 #endif
 
 /*
@@ -293,15 +293,10 @@
 	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
 	VALIDATE_UNRET_END
 	ALTERNATIVE_3 "",						\
-		      CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET,		\
+		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
 		      "call entry_ibpb", X86_FEATURE_ENTRY_IBPB,	\
 		      __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
 #endif
-
-#ifdef CONFIG_CPU_SRSO
-	ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
-			  "call srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
-#endif
 .endm
 
 .macro UNTRAIN_RET_FROM_CALL
@@ -309,15 +304,10 @@
 	defined(CONFIG_CALL_DEPTH_TRACKING)
 	VALIDATE_UNRET_END
 	ALTERNATIVE_3 "",						\
-		      CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET,		\
+		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
 		      "call entry_ibpb", X86_FEATURE_ENTRY_IBPB,	\
 		      __stringify(RESET_CALL_DEPTH_FROM_CALL), X86_FEATURE_CALL_DEPTH
 #endif
-
-#ifdef CONFIG_CPU_SRSO
-	ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
-			  "call srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
-#endif
 .endm
 
 
@@ -355,6 +345,7 @@ extern void retbleed_untrain_ret(void);
 extern void srso_untrain_ret(void);
 extern void srso_alias_untrain_ret(void);
 
+extern void entry_untrain_ret(void);
 extern void entry_ibpb(void);
 
 extern void (*x86_return_thunk)(void);
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index bbbbda9..6f3e195 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2460,6 +2460,7 @@ static void __init srso_select_mitigation(void)
 			 * like ftrace, static_call, etc.
 			 */
 			setup_force_cpu_cap(X86_FEATURE_RETHUNK);
+			setup_force_cpu_cap(X86_FEATURE_UNRET);
 
 			if (boot_cpu_data.x86 == 0x19) {
 				setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index d37e5ab..5e85da1 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -289,6 +289,13 @@ SYM_CODE_START(srso_return_thunk)
 	ud2
 SYM_CODE_END(srso_return_thunk)
 
+SYM_FUNC_START(entry_untrain_ret)
+	ALTERNATIVE_2 "jmp retbleed_untrain_ret", \
+		      "jmp srso_untrain_ret", X86_FEATURE_SRSO, \
+		      "jmp srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
+SYM_FUNC_END(entry_untrain_ret)
+__EXPORT_THUNK(entry_untrain_ret)
+
 SYM_CODE_START(__x86_return_thunk)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip: x86/urgent] x86/cpu: Rename srso_(.*)_alias to srso_alias_\1
  2023-08-14 11:44 ` [PATCH v2 07/11] x86/cpu: Rename srso_(.*)_alias to srso_alias_\1 Peter Zijlstra
@ 2023-08-16  7:55   ` tip-bot2 for Peter Zijlstra
  2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 74+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-08-16  7:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     a3fd3ac0a605e27484b1e8aaa9560972800e6706
Gitweb:        https://git.kernel.org/tip/a3fd3ac0a605e27484b1e8aaa9560972800e6706
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 14 Aug 2023 13:44:33 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 16 Aug 2023 09:39:16 +02:00

x86/cpu: Rename srso_(.*)_alias to srso_alias_\1

For a more consistent namespace.

  [ bp: Fixup names in the doc too. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.976236447@infradead.org
---
 Documentation/admin-guide/hw-vuln/srso.rst |  4 ++--
 arch/x86/include/asm/nospec-branch.h       |  6 ++---
 arch/x86/kernel/vmlinux.lds.S              |  8 +++----
 arch/x86/lib/retpoline.S                   | 24 ++++++++++-----------
 4 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
index af59a93..b6cfb51 100644
--- a/Documentation/admin-guide/hw-vuln/srso.rst
+++ b/Documentation/admin-guide/hw-vuln/srso.rst
@@ -141,8 +141,8 @@ sequence.
 To ensure the safety of this mitigation, the kernel must ensure that the
 safe return sequence is itself free from attacker interference.  In Zen3
 and Zen4, this is accomplished by creating a BTB alias between the
-untraining function srso_untrain_ret_alias() and the safe return
-function srso_safe_ret_alias() which results in evicting a potentially
+untraining function srso_alias_untrain_ret() and the safe return
+function srso_alias_safe_ret() which results in evicting a potentially
 poisoned BTB entry and using that safe one for all function returns.
 
 In older Zen1 and Zen2, this is accomplished using a reinterpretation
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 8a0d4c5..f7c3375 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -300,7 +300,7 @@
 
 #ifdef CONFIG_CPU_SRSO
 	ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
-			  "call srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
+			  "call srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
 #endif
 .endm
 
@@ -316,7 +316,7 @@
 
 #ifdef CONFIG_CPU_SRSO
 	ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
-			  "call srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
+			  "call srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
 #endif
 .endm
 
@@ -353,7 +353,7 @@ extern void srso_alias_return_thunk(void);
 
 extern void retbleed_untrain_ret(void);
 extern void srso_untrain_ret(void);
-extern void srso_untrain_ret_alias(void);
+extern void srso_alias_untrain_ret(void);
 
 extern void entry_ibpb(void);
 
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 7c0e2b4..83d41c2 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -147,10 +147,10 @@ SECTIONS
 
 #ifdef CONFIG_CPU_SRSO
 		/*
-		 * See the comment above srso_untrain_ret_alias()'s
+		 * See the comment above srso_alias_untrain_ret()'s
 		 * definition.
 		 */
-		. = srso_untrain_ret_alias | (1 << 2) | (1 << 8) | (1 << 14) | (1 << 20);
+		. = srso_alias_untrain_ret | (1 << 2) | (1 << 8) | (1 << 14) | (1 << 20);
 		*(.text..__x86.rethunk_safe)
 #endif
 		ALIGN_ENTRY_TEXT_END
@@ -536,8 +536,8 @@ INIT_PER_CPU(irq_stack_backing_store);
  * Instead do: (A | B) - (A & B) in order to compute the XOR
  * of the two function addresses:
  */
-. = ASSERT(((ABSOLUTE(srso_untrain_ret_alias) | srso_safe_ret_alias) -
-		(ABSOLUTE(srso_untrain_ret_alias) & srso_safe_ret_alias)) == ((1 << 2) | (1 << 8) | (1 << 14) | (1 << 20)),
+. = ASSERT(((ABSOLUTE(srso_alias_untrain_ret) | srso_alias_safe_ret) -
+		(ABSOLUTE(srso_alias_untrain_ret) & srso_alias_safe_ret)) == ((1 << 2) | (1 << 8) | (1 << 14) | (1 << 20)),
 		"SRSO function pair won't alias");
 #endif
 
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 2cf7c51..d37e5ab 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -133,56 +133,56 @@ SYM_CODE_END(__x86_indirect_jump_thunk_array)
 #ifdef CONFIG_RETHUNK
 
 /*
- * srso_untrain_ret_alias() and srso_safe_ret_alias() are placed at
+ * srso_alias_untrain_ret() and srso_alias_safe_ret() are placed at
  * special addresses:
  *
- * - srso_untrain_ret_alias() is 2M aligned
- * - srso_safe_ret_alias() is also in the same 2M page but bits 2, 8, 14
+ * - srso_alias_untrain_ret() is 2M aligned
+ * - srso_alias_safe_ret() is also in the same 2M page but bits 2, 8, 14
  * and 20 in its virtual address are set (while those bits in the
- * srso_untrain_ret_alias() function are cleared).
+ * srso_alias_untrain_ret() function are cleared).
  *
  * This guarantees that those two addresses will alias in the branch
  * target buffer of Zen3/4 generations, leading to any potential
  * poisoned entries at that BTB slot to get evicted.
  *
- * As a result, srso_safe_ret_alias() becomes a safe return.
+ * As a result, srso_alias_safe_ret() becomes a safe return.
  */
 #ifdef CONFIG_CPU_SRSO
 	.section .text..__x86.rethunk_untrain
 
-SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+SYM_START(srso_alias_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
 	ASM_NOP2
 	lfence
 	jmp srso_alias_return_thunk
-SYM_FUNC_END(srso_untrain_ret_alias)
-__EXPORT_THUNK(srso_untrain_ret_alias)
+SYM_FUNC_END(srso_alias_untrain_ret)
+__EXPORT_THUNK(srso_alias_untrain_ret)
 
 	.section .text..__x86.rethunk_safe
 #else
 /* dummy definition for alternatives */
-SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+SYM_START(srso_alias_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_UNRET_SAFE
 	ret
 	int3
 SYM_FUNC_END(srso_alias_untrain_ret)
 #endif
 
-SYM_START(srso_safe_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+SYM_START(srso_alias_safe_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	lea 8(%_ASM_SP), %_ASM_SP
 	UNWIND_HINT_FUNC
 	ANNOTATE_UNRET_SAFE
 	ret
 	int3
-SYM_FUNC_END(srso_safe_ret_alias)
+SYM_FUNC_END(srso_alias_safe_ret)
 
 	.section .text..__x86.return_thunk
 
 SYM_CODE_START(srso_alias_return_thunk)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
-	call srso_safe_ret_alias
+	call srso_alias_safe_ret
 	ud2
 SYM_CODE_END(srso_alias_return_thunk)
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip: x86/urgent] x86/cpu: Rename original retbleed methods
  2023-08-14 11:44 ` [PATCH v2 06/11] x86/cpu: Rename original retbleed methods Peter Zijlstra
  2023-08-14 19:41   ` Josh Poimboeuf
@ 2023-08-16  7:55   ` tip-bot2 for Peter Zijlstra
  2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
  2 siblings, 0 replies; 74+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-08-16  7:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Josh Poimboeuf, Peter Zijlstra (Intel), Borislav Petkov (AMD),
	x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     24ed3c4c42c326a3054f68008aa4e54fa000a400
Gitweb:        https://git.kernel.org/tip/24ed3c4c42c326a3054f68008aa4e54fa000a400
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 14 Aug 2023 13:44:32 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 16 Aug 2023 09:39:16 +02:00

x86/cpu: Rename original retbleed methods

Rename the original retbleed return thunk and untrain_ret to
retbleed_return_thunk() and retbleed_untrain_ret().

No functional changes.

Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.909378169@infradead.org
---
 arch/x86/include/asm/nospec-branch.h |  8 +++----
 arch/x86/kernel/cpu/bugs.c           |  2 +-
 arch/x86/kernel/vmlinux.lds.S        |  2 +-
 arch/x86/lib/retpoline.S             | 30 +++++++++++++--------------
 tools/objtool/arch/x86/decode.c      |  2 +-
 tools/objtool/check.c                |  2 +-
 6 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 5ed78ad..8a0d4c5 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -272,7 +272,7 @@
 .endm
 
 #ifdef CONFIG_CPU_UNRET_ENTRY
-#define CALL_ZEN_UNTRAIN_RET	"call zen_untrain_ret"
+#define CALL_ZEN_UNTRAIN_RET	"call retbleed_untrain_ret"
 #else
 #define CALL_ZEN_UNTRAIN_RET	""
 #endif
@@ -282,7 +282,7 @@
  * return thunk isn't mapped into the userspace tables (then again, AMD
  * typically has NO_MELTDOWN).
  *
- * While zen_untrain_ret() doesn't clobber anything but requires stack,
+ * While retbleed_untrain_ret() doesn't clobber anything but requires stack,
  * entry_ibpb() will clobber AX, CX, DX.
  *
  * As such, this must be placed after every *SWITCH_TO_KERNEL_CR3 at a point
@@ -347,11 +347,11 @@ extern void __x86_return_thunk(void);
 static inline void __x86_return_thunk(void) {}
 #endif
 
-extern void zen_return_thunk(void);
+extern void retbleed_return_thunk(void);
 extern void srso_return_thunk(void);
 extern void srso_alias_return_thunk(void);
 
-extern void zen_untrain_ret(void);
+extern void retbleed_untrain_ret(void);
 extern void srso_untrain_ret(void);
 extern void srso_untrain_ret_alias(void);
 
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 56cf250..bbbbda9 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1043,7 +1043,7 @@ do_cmd_auto:
 		setup_force_cpu_cap(X86_FEATURE_UNRET);
 
 		if (IS_ENABLED(CONFIG_RETHUNK))
-			x86_return_thunk = zen_return_thunk;
+			x86_return_thunk = retbleed_return_thunk;
 
 		if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
 		    boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index d3b02d6..7c0e2b4 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -521,7 +521,7 @@ INIT_PER_CPU(irq_stack_backing_store);
 #endif
 
 #ifdef CONFIG_RETHUNK
-. = ASSERT((zen_return_thunk & 0x3f) == 0, "zen_return_thunk not cacheline-aligned");
+. = ASSERT((retbleed_return_thunk & 0x3f) == 0, "retbleed_return_thunk not cacheline-aligned");
 . = ASSERT((srso_safe_ret & 0x3f) == 0, "srso_safe_ret not cacheline-aligned");
 #endif
 
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index fb81895..2cf7c51 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -188,32 +188,32 @@ SYM_CODE_END(srso_alias_return_thunk)
 
 /*
  * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
- * 1) The RET at zen_return_thunk must be on a 64 byte boundary, for
+ * 1) The RET at retbleed_return_thunk must be on a 64 byte boundary, for
  *    alignment within the BTB.
- * 2) The instruction at zen_untrain_ret must contain, and not
+ * 2) The instruction at retbleed_untrain_ret must contain, and not
  *    end with, the 0xc3 byte of the RET.
  * 3) STIBP must be enabled, or SMT disabled, to prevent the sibling thread
  *    from re-poisioning the BTB prediction.
  */
 	.align 64
-	.skip 64 - (zen_return_thunk - zen_untrain_ret), 0xcc
-SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
+	.skip 64 - (retbleed_return_thunk - retbleed_untrain_ret), 0xcc
+SYM_START(retbleed_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_NOENDBR
 	/*
-	 * As executed from zen_untrain_ret, this is:
+	 * As executed from retbleed_untrain_ret, this is:
 	 *
 	 *   TEST $0xcc, %bl
 	 *   LFENCE
-	 *   JMP zen_return_thunk
+	 *   JMP retbleed_return_thunk
 	 *
 	 * Executing the TEST instruction has a side effect of evicting any BTB
 	 * prediction (potentially attacker controlled) attached to the RET, as
-	 * zen_return_thunk + 1 isn't an instruction boundary at the moment.
+	 * retbleed_return_thunk + 1 isn't an instruction boundary at the moment.
 	 */
 	.byte	0xf6
 
 	/*
-	 * As executed from zen_return_thunk, this is a plain RET.
+	 * As executed from retbleed_return_thunk, this is a plain RET.
 	 *
 	 * As part of the TEST above, RET is the ModRM byte, and INT3 the imm8.
 	 *
@@ -225,13 +225,13 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	 * With SMT enabled and STIBP active, a sibling thread cannot poison
 	 * RET's prediction to a type of its choice, but can evict the
 	 * prediction due to competitive sharing. If the prediction is
-	 * evicted, zen_return_thunk will suffer Straight Line Speculation
+	 * evicted, retbleed_return_thunk will suffer Straight Line Speculation
 	 * which will be contained safely by the INT3.
 	 */
-SYM_INNER_LABEL(zen_return_thunk, SYM_L_GLOBAL)
+SYM_INNER_LABEL(retbleed_return_thunk, SYM_L_GLOBAL)
 	ret
 	int3
-SYM_CODE_END(zen_return_thunk)
+SYM_CODE_END(retbleed_return_thunk)
 
 	/*
 	 * Ensure the TEST decoding / BTB invalidation is complete.
@@ -242,13 +242,13 @@ SYM_CODE_END(zen_return_thunk)
 	 * Jump back and execute the RET in the middle of the TEST instruction.
 	 * INT3 is for SLS protection.
 	 */
-	jmp zen_return_thunk
+	jmp retbleed_return_thunk
 	int3
-SYM_FUNC_END(zen_untrain_ret)
-__EXPORT_THUNK(zen_untrain_ret)
+SYM_FUNC_END(retbleed_untrain_ret)
+__EXPORT_THUNK(retbleed_untrain_ret)
 
 /*
- * SRSO untraining sequence for Zen1/2, similar to zen_untrain_ret()
+ * SRSO untraining sequence for Zen1/2, similar to retbleed_untrain_ret()
  * above. On kernel entry, srso_untrain_ret() is executed which is a
  *
  * movabs $0xccccc30824648d48,%rax
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index c55f3bb..c0f25d0 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -829,6 +829,6 @@ bool arch_is_rethunk(struct symbol *sym)
 
 bool arch_is_embedded_insn(struct symbol *sym)
 {
-	return !strcmp(sym->name, "zen_return_thunk") ||
+	return !strcmp(sym->name, "retbleed_return_thunk") ||
 	       !strcmp(sym->name, "srso_safe_ret");
 }
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 191656e..7a9aaf4 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1593,7 +1593,7 @@ static int add_jump_destinations(struct objtool_file *file)
 			struct symbol *sym = find_symbol_by_offset(dest_sec, dest_off);
 
 			/*
-			 * This is a special case for zen_untrain_ret().
+			 * This is a special case for retbleed_untrain_ret().
 			 * It jumps to __x86_return_thunk(), but objtool
 			 * can't find the thunk's starting RET
 			 * instruction, because the RET is also in the

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip: x86/urgent] x86/cpu: Clean up SRSO return thunk mess
  2023-08-14 11:44 ` [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess Peter Zijlstra
                     ` (2 preceding siblings ...)
  2023-08-15 21:29   ` Nathan Chancellor
@ 2023-08-16  7:55   ` tip-bot2 for Peter Zijlstra
  2023-08-16 18:58     ` Nathan Chancellor
  2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
  4 siblings, 1 reply; 74+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-08-16  7:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     9010e01a8efffa0d14972b79fbe87bd329d79bfd
Gitweb:        https://git.kernel.org/tip/9010e01a8efffa0d14972b79fbe87bd329d79bfd
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 14 Aug 2023 13:44:31 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 16 Aug 2023 09:39:16 +02:00

x86/cpu: Clean up SRSO return thunk mess

Use the existing configurable return thunk. There is absolutely no
justification for having created this __x86_return_thunk alternative.

To clarify, the whole thing looks like:

Zen3/4 does:

  srso_alias_untrain_ret:
	  nop2
	  lfence
	  jmp srso_alias_return_thunk
	  int3

  srso_alias_safe_ret: // aliases srso_alias_untrain_ret just so
	  add $8, %rsp
	  ret
	  int3

  srso_alias_return_thunk:
	  call srso_alias_safe_ret
	  ud2

While Zen1/2 does:

  srso_untrain_ret:
	  movabs $foo, %rax
	  lfence
	  call srso_safe_ret           (jmp srso_return_thunk ?)
	  int3

  srso_safe_ret: // embedded in movabs instruction
	  add $8,%rsp
          ret
          int3

  srso_return_thunk:
	  call srso_safe_ret
	  ud2

While retbleed does:

  zen_untrain_ret:
	  test $0xcc, %bl
	  lfence
	  jmp zen_return_thunk
          int3

  zen_return_thunk: // embedded in the test instruction
	  ret
          int3

Whereas Zen1/2 flush the BTB entry using the instruction decoder trick
(test, movabs), Zen3/4 use BTB aliasing. SRSO adds a return sequence
(srso_safe_ret()) which forces the function return instruction to
speculate into a trap (UD2).  This RET will then mispredict and
execution will continue at the return site read from the top of the
stack.

Pick one of three options at boot (every function can only ever return
once).
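
To make the selection concrete, here is a condensed C sketch of the
boot-time logic implied by the bugs.c hunks below. The real code does
this in two separate *_select_mitigation() functions, with
srso_select_mitigation() running after retbleed_select_mitigation()
and possibly overriding its choice; retbleed_uses_unret is a
placeholder for "retbleed picked the UNRET mitigation":

	void (*x86_return_thunk)(void) __ro_after_init = &__x86_return_thunk;

	if (srso_mitigation == SRSO_MITIGATION_SAFE_RET) {
		if (boot_cpu_data.x86 == 0x19)		/* Zen3/4: BTB aliasing */
			x86_return_thunk = srso_alias_return_thunk;
		else					/* Zen1/2: decoder trick */
			x86_return_thunk = srso_return_thunk;
	} else if (IS_ENABLED(CONFIG_RETHUNK) && retbleed_uses_unret) {
		x86_return_thunk = zen_return_thunk;
	}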

  [ bp: Fixup commit message uarch details and add them in a comment in
    the code too. Add a comment about the srso_select_mitigation()
    dependency on retbleed_select_mitigation(). Add moar ifdeffery for
    32-bit builds. Add a dummy srso_untrain_ret_alias() definition for
    32-bit alternatives needing the symbol. ]

Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.842775684@infradead.org
---
 arch/x86/include/asm/nospec-branch.h |  5 ++-
 arch/x86/kernel/cpu/bugs.c           | 15 ++++++-
 arch/x86/kernel/vmlinux.lds.S        |  2 +-
 arch/x86/lib/retpoline.S             | 58 +++++++++++++++++++--------
 tools/objtool/arch/x86/decode.c      |  2 +-
 5 files changed, 62 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index b3625cc..5ed78ad 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -347,9 +347,14 @@ extern void __x86_return_thunk(void);
 static inline void __x86_return_thunk(void) {}
 #endif
 
+extern void zen_return_thunk(void);
+extern void srso_return_thunk(void);
+extern void srso_alias_return_thunk(void);
+
 extern void zen_untrain_ret(void);
 extern void srso_untrain_ret(void);
 extern void srso_untrain_ret_alias(void);
+
 extern void entry_ibpb(void);
 
 extern void (*x86_return_thunk)(void);
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 3bc0d14..56cf250 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -167,6 +167,11 @@ void __init cpu_select_mitigations(void)
 	md_clear_select_mitigation();
 	srbds_select_mitigation();
 	l1d_flush_select_mitigation();
+
+	/*
+	 * srso_select_mitigation() depends and must run after
+	 * retbleed_select_mitigation().
+	 */
 	srso_select_mitigation();
 	gds_select_mitigation();
 }
@@ -1037,6 +1042,9 @@ do_cmd_auto:
 		setup_force_cpu_cap(X86_FEATURE_RETHUNK);
 		setup_force_cpu_cap(X86_FEATURE_UNRET);
 
+		if (IS_ENABLED(CONFIG_RETHUNK))
+			x86_return_thunk = zen_return_thunk;
+
 		if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
 		    boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
 			pr_err(RETBLEED_UNTRAIN_MSG);
@@ -2453,10 +2461,13 @@ static void __init srso_select_mitigation(void)
 			 */
 			setup_force_cpu_cap(X86_FEATURE_RETHUNK);
 
-			if (boot_cpu_data.x86 == 0x19)
+			if (boot_cpu_data.x86 == 0x19) {
 				setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);
-			else
+				x86_return_thunk = srso_alias_return_thunk;
+			} else {
 				setup_force_cpu_cap(X86_FEATURE_SRSO);
+				x86_return_thunk = srso_return_thunk;
+			}
 			srso_mitigation = SRSO_MITIGATION_SAFE_RET;
 		} else {
 			pr_err("WARNING: kernel not compiled with CPU_SRSO.\n");
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 8e2a306..d3b02d6 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -521,7 +521,7 @@ INIT_PER_CPU(irq_stack_backing_store);
 #endif
 
 #ifdef CONFIG_RETHUNK
-. = ASSERT((__ret & 0x3f) == 0, "__ret not cacheline-aligned");
+. = ASSERT((zen_return_thunk & 0x3f) == 0, "zen_return_thunk not cacheline-aligned");
 . = ASSERT((srso_safe_ret & 0x3f) == 0, "srso_safe_ret not cacheline-aligned");
 #endif
 
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index a478eb5..fb81895 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -151,22 +151,27 @@ SYM_CODE_END(__x86_indirect_jump_thunk_array)
 	.section .text..__x86.rethunk_untrain
 
 SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
 	ASM_NOP2
 	lfence
-	jmp __x86_return_thunk
+	jmp srso_alias_return_thunk
 SYM_FUNC_END(srso_untrain_ret_alias)
 __EXPORT_THUNK(srso_untrain_ret_alias)
 
 	.section .text..__x86.rethunk_safe
+#else
+/* dummy definition for alternatives */
+SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+	ANNOTATE_UNRET_SAFE
+	ret
+	int3
+SYM_FUNC_END(srso_alias_untrain_ret)
 #endif
 
-/* Needs a definition for the __x86_return_thunk alternative below. */
 SYM_START(srso_safe_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
-#ifdef CONFIG_CPU_SRSO
 	lea 8(%_ASM_SP), %_ASM_SP
 	UNWIND_HINT_FUNC
-#endif
 	ANNOTATE_UNRET_SAFE
 	ret
 	int3
@@ -174,9 +179,16 @@ SYM_FUNC_END(srso_safe_ret_alias)
 
 	.section .text..__x86.return_thunk
 
+SYM_CODE_START(srso_alias_return_thunk)
+	UNWIND_HINT_FUNC
+	ANNOTATE_NOENDBR
+	call srso_safe_ret_alias
+	ud2
+SYM_CODE_END(srso_alias_return_thunk)
+
 /*
  * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
- * 1) The RET at __x86_return_thunk must be on a 64 byte boundary, for
+ * 1) The RET at zen_return_thunk must be on a 64 byte boundary, for
  *    alignment within the BTB.
  * 2) The instruction at zen_untrain_ret must contain, and not
  *    end with, the 0xc3 byte of the RET.
@@ -184,7 +196,7 @@ SYM_FUNC_END(srso_safe_ret_alias)
  *    from re-poisioning the BTB prediction.
  */
 	.align 64
-	.skip 64 - (__ret - zen_untrain_ret), 0xcc
+	.skip 64 - (zen_return_thunk - zen_untrain_ret), 0xcc
 SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_NOENDBR
 	/*
@@ -192,16 +204,16 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	 *
 	 *   TEST $0xcc, %bl
 	 *   LFENCE
-	 *   JMP __x86_return_thunk
+	 *   JMP zen_return_thunk
 	 *
 	 * Executing the TEST instruction has a side effect of evicting any BTB
 	 * prediction (potentially attacker controlled) attached to the RET, as
-	 * __x86_return_thunk + 1 isn't an instruction boundary at the moment.
+	 * zen_return_thunk + 1 isn't an instruction boundary at the moment.
 	 */
 	.byte	0xf6
 
 	/*
-	 * As executed from __x86_return_thunk, this is a plain RET.
+	 * As executed from zen_return_thunk, this is a plain RET.
 	 *
 	 * As part of the TEST above, RET is the ModRM byte, and INT3 the imm8.
 	 *
@@ -213,13 +225,13 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	 * With SMT enabled and STIBP active, a sibling thread cannot poison
 	 * RET's prediction to a type of its choice, but can evict the
 	 * prediction due to competitive sharing. If the prediction is
-	 * evicted, __x86_return_thunk will suffer Straight Line Speculation
+	 * evicted, zen_return_thunk will suffer Straight Line Speculation
 	 * which will be contained safely by the INT3.
 	 */
-SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)
+SYM_INNER_LABEL(zen_return_thunk, SYM_L_GLOBAL)
 	ret
 	int3
-SYM_CODE_END(__ret)
+SYM_CODE_END(zen_return_thunk)
 
 	/*
 	 * Ensure the TEST decoding / BTB invalidation is complete.
@@ -230,7 +242,7 @@ SYM_CODE_END(__ret)
 	 * Jump back and execute the RET in the middle of the TEST instruction.
 	 * INT3 is for SLS protection.
 	 */
-	jmp __ret
+	jmp zen_return_thunk
 	int3
 SYM_FUNC_END(zen_untrain_ret)
 __EXPORT_THUNK(zen_untrain_ret)
@@ -251,11 +263,18 @@ SYM_START(srso_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_NOENDBR
 	.byte 0x48, 0xb8
 
+/*
+ * This forces the function return instruction to speculate into a trap
+ * (UD2 in srso_return_thunk() below).  This RET will then mispredict
+ * and execution will continue at the return site read from the top of
+ * the stack.
+ */
 SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
 	lea 8(%_ASM_SP), %_ASM_SP
 	ret
 	int3
 	int3
+	/* end of movabs */
 	lfence
 	call srso_safe_ret
 	ud2
@@ -263,12 +282,19 @@ SYM_CODE_END(srso_safe_ret)
 SYM_FUNC_END(srso_untrain_ret)
 __EXPORT_THUNK(srso_untrain_ret)
 
-SYM_CODE_START(__x86_return_thunk)
+SYM_CODE_START(srso_return_thunk)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
-	ALTERNATIVE_2 "jmp __ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
-			"call srso_safe_ret_alias", X86_FEATURE_SRSO_ALIAS
+	call srso_safe_ret
 	ud2
+SYM_CODE_END(srso_return_thunk)
+
+SYM_CODE_START(__x86_return_thunk)
+	UNWIND_HINT_FUNC
+	ANNOTATE_NOENDBR
+	ANNOTATE_UNRET_SAFE
+	ret
+	int3
 SYM_CODE_END(__x86_return_thunk)
 EXPORT_SYMBOL(__x86_return_thunk)
 
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index cba8a7b..c55f3bb 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -829,6 +829,6 @@ bool arch_is_rethunk(struct symbol *sym)
 
 bool arch_is_embedded_insn(struct symbol *sym)
 {
-	return !strcmp(sym->name, "__ret") ||
+	return !strcmp(sym->name, "zen_return_thunk") ||
 	       !strcmp(sym->name, "srso_safe_ret");
 }

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip: x86/urgent] x86/alternative: Make custom return thunk unconditional
  2023-08-14 11:44 ` [PATCH v2 04/11] x86/alternative: Make custom return thunk unconditional Peter Zijlstra
@ 2023-08-16  7:55   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 74+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-08-16  7:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     095b8303f3835c68ac4a8b6d754ca1c3b6230711
Gitweb:        https://git.kernel.org/tip/095b8303f3835c68ac4a8b6d754ca1c3b6230711
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 14 Aug 2023 13:44:30 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 16 Aug 2023 09:39:16 +02:00

x86/alternative: Make custom return thunk unconditional

There is infrastructure to rewrite return thunks to point to any
random thunk one desires; unwrap that from CALL_THUNKS, which up to
now was its sole user.
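
For illustration, the rewrite boils down to something like the
following pseudo-C. This is a sketch only: for_each_return_site() and
emit_jmp()/emit_ret() are made-up helper names, and the real logic
lives next to apply_retpolines() in arch/x86/kernel/alternative.c:

	/* Retarget every recorded return site to the chosen thunk. */
	for_each_return_site(site) {		/* entries in .return_sites */
		if (cpu_feature_enabled(X86_FEATURE_RETHUNK))
			emit_jmp(site, x86_return_thunk);
		else
			emit_ret(site);		/* plain RET, no thunk */
	}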

  [ bp: Make the thunks visible on 32-bit and add ifdeffery for the
    32-bit builds. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.775293785@infradead.org
---
 arch/x86/include/asm/nospec-branch.h |  9 +++++----
 arch/x86/kernel/alternative.c        |  4 ----
 arch/x86/kernel/cpu/bugs.c           |  2 ++
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index e50db53..b3625cc 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -341,17 +341,18 @@ extern retpoline_thunk_t __x86_indirect_thunk_array[];
 extern retpoline_thunk_t __x86_indirect_call_thunk_array[];
 extern retpoline_thunk_t __x86_indirect_jump_thunk_array[];
 
+#ifdef CONFIG_RETHUNK
 extern void __x86_return_thunk(void);
+#else
+static inline void __x86_return_thunk(void) {}
+#endif
+
 extern void zen_untrain_ret(void);
 extern void srso_untrain_ret(void);
 extern void srso_untrain_ret_alias(void);
 extern void entry_ibpb(void);
 
-#ifdef CONFIG_CALL_THUNKS
 extern void (*x86_return_thunk)(void);
-#else
-#define x86_return_thunk	(&__x86_return_thunk)
-#endif
 
 #ifdef CONFIG_CALL_DEPTH_TRACKING
 extern void __x86_return_skl(void);
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 2dcf3a0..099d58d 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -687,10 +687,6 @@ void __init_or_module noinline apply_retpolines(s32 *start, s32 *end)
 
 #ifdef CONFIG_RETHUNK
 
-#ifdef CONFIG_CALL_THUNKS
-void (*x86_return_thunk)(void) __ro_after_init = &__x86_return_thunk;
-#endif
-
 /*
  * Rewrite the compiler generated return thunk tail-calls.
  *
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 6c04aef..3bc0d14 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -63,6 +63,8 @@ EXPORT_SYMBOL_GPL(x86_pred_cmd);
 
 static DEFINE_MUTEX(spec_ctrl_mutex);
 
+void (*x86_return_thunk)(void) __ro_after_init = &__x86_return_thunk;
+
 /* Update SPEC_CTRL MSR and its cached copy unconditionally */
 static void update_spec_ctrl(u64 val)
 {

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip: x86/urgent] objtool/x86: Fix SRSO mess
  2023-08-14 11:44 ` [PATCH v2 03/11] objtool/x86: Fix SRSO mess Peter Zijlstra
  2023-08-14 12:54   ` Andrew.Cooper3
@ 2023-08-16  7:55   ` tip-bot2 for Peter Zijlstra
  2023-08-16 11:59     ` Peter Zijlstra
  1 sibling, 1 reply; 74+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-08-16  7:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     4ae68b26c3ab5a82aa271e6e9fc9b1a06e1d6b40
Gitweb:        https://git.kernel.org/tip/4ae68b26c3ab5a82aa271e6e9fc9b1a06e1d6b40
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 14 Aug 2023 13:44:29 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 16 Aug 2023 09:39:16 +02:00

objtool/x86: Fix SRSO mess

Objtool --rethunk does two things:

 - it collects all (tail) calls of __x86_return_thunk and places them
   into .return_sites. These are typically compiler generated, but the
   RET assembler macro emits the same thing.

 - it fudges the validation of the __x86_return_thunk symbol; because
   this symbol is inside another instruction, objtool can't actually
   find the instruction the symbol offset points to and gets upset.

Because both behaviours were keyed to the same symbol, there was no
pressing need to separate them.

However, alas, along came SRSO and more crazy things to deal with
appeared.

The SRSO patch itself added the following symbol names to identify as
rethunk:

  'srso_untrain_ret', 'srso_safe_ret' and '__ret'

Where '__ret' is the old retbleed return thunk, 'srso_safe_ret' is a
new similarly embedded return thunk, and 'srso_untrain_ret' is
completely unrelated to anything the above does (and was only included
because of that INT3 vs UD2 issue fixed previously).

Clear things up by adding a second category for the embedded instruction
thing.
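
Concretely, the two categories now reduce to the following (condensed
from the objtool diff below; the function bodies are copied from it):

	bool arch_is_rethunk(struct symbol *sym)
	{
		/* only (tail) calls to this end up in .return_sites */
		return !strcmp(sym->name, "__x86_return_thunk");
	}

	bool arch_is_embedded_insn(struct symbol *sym)
	{
		/* symbols living inside another instruction; mostly
		   ignored for validation */
		return !strcmp(sym->name, "__ret") ||
		       !strcmp(sym->name, "srso_safe_ret");
	}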

Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.704502245@infradead.org
---
 tools/objtool/arch/x86/decode.c      | 11 +++++++----
 tools/objtool/check.c                | 24 ++++++++++++++++++++++--
 tools/objtool/include/objtool/arch.h |  1 +
 tools/objtool/include/objtool/elf.h  |  1 +
 4 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 2d51fa8..cba8a7b 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -824,8 +824,11 @@ bool arch_is_retpoline(struct symbol *sym)
 
 bool arch_is_rethunk(struct symbol *sym)
 {
-	return !strcmp(sym->name, "__x86_return_thunk") ||
-	       !strcmp(sym->name, "srso_untrain_ret") ||
-	       !strcmp(sym->name, "srso_safe_ret") ||
-	       !strcmp(sym->name, "__ret");
+	return !strcmp(sym->name, "__x86_return_thunk");
+}
+
+bool arch_is_embedded_insn(struct symbol *sym)
+{
+	return !strcmp(sym->name, "__ret") ||
+	       !strcmp(sym->name, "srso_safe_ret");
 }
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index e2ee10c..191656e 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -455,7 +455,7 @@ static int decode_instructions(struct objtool_file *file)
 				return -1;
 			}
 
-			if (func->return_thunk || func->alias != func)
+			if (func->embedded_insn || func->alias != func)
 				continue;
 
 			if (!find_insn(file, sec, func->offset)) {
@@ -1288,16 +1288,33 @@ static int add_ignore_alternatives(struct objtool_file *file)
 	return 0;
 }
 
+/*
+ * Symbols that replace INSN_CALL_DYNAMIC, every (tail) call to such a symbol
+ * will be added to the .retpoline_sites section.
+ */
 __weak bool arch_is_retpoline(struct symbol *sym)
 {
 	return false;
 }
 
+/*
+ * Symbols that replace INSN_RETURN, every (tail) call to such a symbol
+ * will be added to the .return_sites section.
+ */
 __weak bool arch_is_rethunk(struct symbol *sym)
 {
 	return false;
 }
 
+/*
+ * Symbols that are embedded inside other instructions, because sometimes crazy
+ * code exists. These are mostly ignored for validation purposes.
+ */
+__weak bool arch_is_embedded_insn(struct symbol *sym)
+{
+	return false;
+}
+
 static struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn)
 {
 	struct reloc *reloc;
@@ -1583,7 +1600,7 @@ static int add_jump_destinations(struct objtool_file *file)
 			 * middle of another instruction.  Objtool only
 			 * knows about the outer instruction.
 			 */
-			if (sym && sym->return_thunk) {
+			if (sym && sym->embedded_insn) {
 				add_return_call(file, insn, false);
 				continue;
 			}
@@ -2502,6 +2519,9 @@ static int classify_symbols(struct objtool_file *file)
 		if (arch_is_rethunk(func))
 			func->return_thunk = true;
 
+		if (arch_is_embedded_insn(func))
+			func->embedded_insn = true;
+
 		if (arch_ftrace_match(func->name))
 			func->fentry = true;
 
diff --git a/tools/objtool/include/objtool/arch.h b/tools/objtool/include/objtool/arch.h
index 2b6d2ce..0b303eb 100644
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -90,6 +90,7 @@ int arch_decode_hint_reg(u8 sp_reg, int *base);
 
 bool arch_is_retpoline(struct symbol *sym);
 bool arch_is_rethunk(struct symbol *sym);
+bool arch_is_embedded_insn(struct symbol *sym);
 
 int arch_rewrite_retpolines(struct objtool_file *file);
 
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index c532d70..9f71e98 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -66,6 +66,7 @@ struct symbol {
 	u8 fentry            : 1;
 	u8 profiling_func    : 1;
 	u8 warned	     : 1;
+	u8 embedded_insn     : 1;
 	struct list_head pv_target;
 	struct reloc *relocs;
 };

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip: x86/urgent] x86/cpu: Fix __x86_return_thunk symbol type
  2023-08-14 11:44 ` [PATCH v2 01/11] x86/cpu: Fixup __x86_return_thunk Peter Zijlstra
@ 2023-08-16  7:55   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 74+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-08-16  7:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     77f67119004296a9b2503b377d610e08b08afc2a
Gitweb:        https://git.kernel.org/tip/77f67119004296a9b2503b377d610e08b08afc2a
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 14 Aug 2023 13:44:27 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 16 Aug 2023 09:39:16 +02:00

x86/cpu: Fix __x86_return_thunk symbol type

Commit

  fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")

reimplemented __x86_return_thunk with a mix of SYM_FUNC_START and
SYM_CODE_END, which is not a sane combination.

Since nothing should ever actually 'CALL' this, make it consistently
CODE.

Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.571027074@infradead.org
---
 arch/x86/lib/retpoline.S | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 8db74d8..9427480 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -263,7 +263,9 @@ SYM_CODE_END(srso_safe_ret)
 SYM_FUNC_END(srso_untrain_ret)
 __EXPORT_THUNK(srso_untrain_ret)
 
-SYM_FUNC_START(__x86_return_thunk)
+SYM_CODE_START(__x86_return_thunk)
+	UNWIND_HINT_FUNC
+	ANNOTATE_NOENDBR
 	ALTERNATIVE_2 "jmp __ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
 			"call srso_safe_ret_alias", X86_FEATURE_SRSO_ALIAS
 	int3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip: x86/urgent] x86/cpu: Fix up srso_safe_ret() and __x86_return_thunk()
  2023-08-14 11:44 ` [PATCH v2 02/11] x86/cpu: Fix up srso_safe_ret() and __x86_return_thunk() Peter Zijlstra
@ 2023-08-16  7:55   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 74+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-08-16  7:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     af023ef335f13c8b579298fc432daeef609a9e60
Gitweb:        https://git.kernel.org/tip/af023ef335f13c8b579298fc432daeef609a9e60
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 14 Aug 2023 13:44:28 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 16 Aug 2023 09:39:16 +02:00

x86/cpu: Fix up srso_safe_ret() and __x86_return_thunk()

  vmlinux.o: warning: objtool: srso_untrain_ret() falls through to next function __x86_return_skl()
  vmlinux.o: warning: objtool: __x86_return_thunk() falls through to next function __x86_return_skl()

This is because these functions (can) end with CALL, which objtool
does not consider a terminating instruction. Therefore, replace the
INT3 instruction (which is a non-fatal trap) with UD2 (which is a
fatal trap).

This indicates execution will not continue past this point.
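
After the change, each affected tail ends like this (taken from the
diff below):

	call srso_safe_ret
	ud2		/* objtool: execution cannot continue past here */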

Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.637802730@infradead.org
---
 arch/x86/lib/retpoline.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 9427480..a478eb5 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -258,7 +258,7 @@ SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
 	int3
 	lfence
 	call srso_safe_ret
-	int3
+	ud2
 SYM_CODE_END(srso_safe_ret)
 SYM_FUNC_END(srso_untrain_ret)
 __EXPORT_THUNK(srso_untrain_ret)
@@ -268,7 +268,7 @@ SYM_CODE_START(__x86_return_thunk)
 	ANNOTATE_NOENDBR
 	ALTERNATIVE_2 "jmp __ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
 			"call srso_safe_ret_alias", X86_FEATURE_SRSO_ALIAS
-	int3
+	ud2
 SYM_CODE_END(__x86_return_thunk)
 EXPORT_SYMBOL(__x86_return_thunk)
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [tip: x86/urgent] objtool/x86: Fix SRSO mess
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
@ 2023-08-16 11:59     ` Peter Zijlstra
  2023-08-16 20:31       ` Josh Poimboeuf
  2023-08-17  8:39       ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
  0 siblings, 2 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-16 11:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-tip-commits, Borislav Petkov (AMD), x86

On Wed, Aug 16, 2023 at 07:55:17AM -0000, tip-bot2 for Peter Zijlstra wrote:
> The following commit has been merged into the x86/urgent branch of tip:
> 
> Commit-ID:     4ae68b26c3ab5a82aa271e6e9fc9b1a06e1d6b40
> Gitweb:        https://git.kernel.org/tip/4ae68b26c3ab5a82aa271e6e9fc9b1a06e1d6b40
> Author:        Peter Zijlstra <peterz@infradead.org>
> AuthorDate:    Mon, 14 Aug 2023 13:44:29 +02:00
> Committer:     Borislav Petkov (AMD) <bp@alien8.de>
> CommitterDate: Wed, 16 Aug 2023 09:39:16 +02:00
> 
> objtool/x86: Fix SRSO mess
> 
> Objtool --rethunk does two things:
> 
>  - it collects all (tail) call's of __x86_return_thunk and places them
>    into .return_sites. These are typically compiler generated, but
>    RET also emits this same.
> 
>  - it fudges the validation of the __x86_return_thunk symbol; because
>    this symbol is inside another instruction, it can't actually find
>    the instruction pointed to by the symbol offset and gets upset.
> 
> Because these two things pertained to the same symbol, there was no
> pressing need to separate these two separate things.
> 
> However, alas, along comes SRSO and more crazy things to deal with
> appeared.
> 
> The SRSO patch itself added the following symbol names to identify as
> rethunk:
> 
>   'srso_untrain_ret', 'srso_safe_ret' and '__ret'
> 
> Where '__ret' is the old retbleed return thunk, 'srso_safe_ret' is a
> new similarly embedded return thunk, and 'srso_untrain_ret' is
> completely unrelated to anything the above does (and was only included
> because of that INT3 vs UD2 issue fixed previous).
> 
> Clear things up by adding a second category for the embedded instruction
> thing.
> 
> Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
> Link: https://lore.kernel.org/r/20230814121148.704502245@infradead.org

Turns out I forgot to build with FRAME_POINTER=y; that still gives:

vmlinux.o: warning: objtool: srso_untrain_ret+0xd: call without frame pointer save/setup

the below seems to cure this.

---
 tools/objtool/check.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 7a9aaf400873..1384090530db 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2650,12 +2650,17 @@ static int decode_sections(struct objtool_file *file)
 	return 0;
 }
 
-static bool is_fentry_call(struct instruction *insn)
+static bool is_special_call(struct instruction *insn)
 {
-	if (insn->type == INSN_CALL &&
-	    insn_call_dest(insn) &&
-	    insn_call_dest(insn)->fentry)
-		return true;
+	if (insn->type == INSN_CALL) {
+		struct symbol *dest = insn_call_dest(insn);
+
+		if (!dest)
+			return false;
+
+		if (dest->fentry || dest->embedded_insn)
+			return true;
+	}
 
 	return false;
 }
@@ -3656,7 +3661,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
 			if (ret)
 				return ret;
 
-			if (opts.stackval && func && !is_fentry_call(insn) &&
+			if (opts.stackval && func && !is_special_call(insn) &&
 			    !has_valid_stack_frame(&state)) {
 				WARN_INSN(insn, "call without frame pointer save/setup");
 				return 1;

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess
  2023-08-16  7:38       ` Borislav Petkov
@ 2023-08-16 14:52         ` Nathan Chancellor
  2023-08-16 15:08           ` Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Nathan Chancellor @ 2023-08-16 14:52 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Peter Zijlstra, x86, linux-kernel, David.Kaplan, Andrew.Cooper3,
	jpoimboe, gregkh, nik.borisov

On Wed, Aug 16, 2023 at 09:38:28AM +0200, Borislav Petkov wrote:
> On Wed, Aug 16, 2023 at 12:43:48AM +0200, Peter Zijlstra wrote:
> > Yeah, Boris and me fixed that yesterday evening or so. I'm not sure
> > I still have the diffs, but Boris should have them all somewhere.
> 
> Even better - all the urgent fixes I've accumulated so far are coming up
> in tip's x86/urgent.  I'd appreciate people testing it.

All my configurations build and run cleanly in QEMU at commit
d80c3c9de067 ("x86/srso: Explain the untraining sequences a bit more")
so I think we should be good here.

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess
  2023-08-16 14:52         ` Nathan Chancellor
@ 2023-08-16 15:08           ` Borislav Petkov
  0 siblings, 0 replies; 74+ messages in thread
From: Borislav Petkov @ 2023-08-16 15:08 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Peter Zijlstra, x86, linux-kernel, David.Kaplan, Andrew.Cooper3,
	jpoimboe, gregkh, nik.borisov

On Wed, Aug 16, 2023 at 07:52:32AM -0700, Nathan Chancellor wrote:
> All my configurations build and run cleanly in QEMU at commit
> d80c3c9de067 ("x86/srso: Explain the untraining sequences a bit more")
> so I think we should be good here.

Phew!

Thanks a lot for testing!

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [tip: x86/urgent] x86/cpu: Clean up SRSO return thunk mess
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
@ 2023-08-16 18:58     ` Nathan Chancellor
  2023-08-16 19:24       ` Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Nathan Chancellor @ 2023-08-16 18:58 UTC (permalink / raw)
  To: Peter Zijlstra, Borislav Petkov; +Cc: linux-tip-commits, linux-kernel, x86

On Wed, Aug 16, 2023 at 07:55:16AM -0000, tip-bot2 for Peter Zijlstra wrote:
> The following commit has been merged into the x86/urgent branch of tip:
> 
> Commit-ID:     9010e01a8efffa0d14972b79fbe87bd329d79bfd
> Gitweb:        https://git.kernel.org/tip/9010e01a8efffa0d14972b79fbe87bd329d79bfd
> Author:        Peter Zijlstra <peterz@infradead.org>
> AuthorDate:    Mon, 14 Aug 2023 13:44:31 +02:00
> Committer:     Borislav Petkov (AMD) <bp@alien8.de>
> CommitterDate: Wed, 16 Aug 2023 09:39:16 +02:00
> 
> x86/cpu: Clean up SRSO return thunk mess

<snip>

> diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
> index a478eb5..fb81895 100644
> --- a/arch/x86/lib/retpoline.S
> +++ b/arch/x86/lib/retpoline.S
> @@ -151,22 +151,27 @@ SYM_CODE_END(__x86_indirect_jump_thunk_array)
>  	.section .text..__x86.rethunk_untrain
>  
>  SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
> +	UNWIND_HINT_FUNC
>  	ANNOTATE_NOENDBR
>  	ASM_NOP2
>  	lfence
> -	jmp __x86_return_thunk
> +	jmp srso_alias_return_thunk
>  SYM_FUNC_END(srso_untrain_ret_alias)
>  __EXPORT_THUNK(srso_untrain_ret_alias)
>  
>  	.section .text..__x86.rethunk_safe
> +#else
> +/* dummy definition for alternatives */
> +SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
> +	ANNOTATE_UNRET_SAFE
> +	ret
> +	int3
> +SYM_FUNC_END(srso_alias_untrain_ret)

Just a heads up, this series will have a small bisectability issue
because of this hunk; it needs

diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index fb818957955b..7df8582fb64e 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -166,7 +166,7 @@ SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_UNRET_SAFE
 	ret
 	int3
-SYM_FUNC_END(srso_alias_untrain_ret)
+SYM_FUNC_END(srso_untrain_ret_alias)
 #endif
 
 SYM_START(srso_safe_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)

but it obviously gets fixed by commit a3fd3ac0a605 ("x86/cpu: Rename
srso_(.*)_alias to srso_alias_\1") so it is probably fine. I only
noticed it because I cherry-picked the first five changes to my patched
-next tree.

Cheers,
Nathan

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [tip: x86/urgent] x86/cpu: Clean up SRSO return thunk mess
  2023-08-16 18:58     ` Nathan Chancellor
@ 2023-08-16 19:24       ` Borislav Petkov
  2023-08-16 19:30         ` Nathan Chancellor
  0 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2023-08-16 19:24 UTC (permalink / raw)
  To: Nathan Chancellor; +Cc: Peter Zijlstra, linux-tip-commits, linux-kernel, x86

On Wed, Aug 16, 2023 at 11:58:39AM -0700, Nathan Chancellor wrote:
> but it obviously gets fixed by commit a3fd3ac0a605 ("x86/cpu: Rename
> srso_(.*)_alias to srso_alias_\1") so it is probably fine. I only
> noticed it because I cherry-picked the first five changes to my patched
> -next tree.

Gah, and I meant to merge that hunk into the right one when fixing the
32-bit builds.

So how did you trigger it? You do builds of every patch? Because that's
the !CONFIG_CPU_SRSO case.

Oh well, lemme rebase and fix it.

Thx for letting me know.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [tip: x86/urgent] x86/cpu: Clean up SRSO return thunk mess
  2023-08-16 19:24       ` Borislav Petkov
@ 2023-08-16 19:30         ` Nathan Chancellor
  2023-08-16 19:42           ` Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Nathan Chancellor @ 2023-08-16 19:30 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Peter Zijlstra, linux-tip-commits, linux-kernel, x86

On Wed, Aug 16, 2023 at 09:24:13PM +0200, Borislav Petkov wrote:
> On Wed, Aug 16, 2023 at 11:58:39AM -0700, Nathan Chancellor wrote:
> > but it obviously gets fixed by commit a3fd3ac0a605 ("x86/cpu: Rename
> > srso_(.*)_alias to srso_alias_\1") so it is probably fine. I only
> > noticed it because I cherry-picked the first five changes to my patched
> > -next tree.
> 
> Gah, and I meant to merge that hunk into the right one when fixing the
> 32-bit builds.

Heh, fixups are always hard to get right across multiple patches, been
there, done that...

> So how did you trigger it? You do builds of every patch? Because that's
> the !CONFIG_CPU_SRSO case.

Just ARCH=i386 allmodconfig. CONFIG_CPU_SRSO depends on X86_64 so I
guess that is how it got triggered. I did not build between every patch,
just this one (since it should fix the runtime warning folks have been
noticing) and the final one (as I reported earlier).

> Oh well, lemme rebase and fix it.
> 
> Thx for letting me know.

No problem, hopefully most of the hard work around SRSO is behind us :)

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [tip: x86/urgent] x86/cpu: Clean up SRSO return thunk mess
  2023-08-16 19:30         ` Nathan Chancellor
@ 2023-08-16 19:42           ` Borislav Petkov
  2023-08-16 19:57             ` Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2023-08-16 19:42 UTC (permalink / raw)
  To: Nathan Chancellor; +Cc: Peter Zijlstra, linux-tip-commits, linux-kernel, x86

On Wed, Aug 16, 2023 at 12:30:11PM -0700, Nathan Chancellor wrote:
> Heh, fixups are always hard to get right across multiple patches, been
> there, done that...

Oh yeah.

> Just ARCH=i386 allmodconfig. CONFIG_CPU_SRSO depends on X86_64 so I
> guess that is how it got triggered. I did not build between every patch,
> just this one (since it should fix the runtime warning folks have been
> noticing) and the final one (as I reported earlier).

I see.

> No problem, hopefully most of the hard work around SRSO is behind us :)

Yeah, I'm pretty sure Murphy will visit us. He always does.

But we'll see. :-)

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [tip: x86/urgent] x86/cpu: Clean up SRSO return thunk mess
  2023-08-16 19:42           ` Borislav Petkov
@ 2023-08-16 19:57             ` Borislav Petkov
  0 siblings, 0 replies; 74+ messages in thread
From: Borislav Petkov @ 2023-08-16 19:57 UTC (permalink / raw)
  To: Nathan Chancellor; +Cc: Peter Zijlstra, linux-tip-commits, linux-kernel, x86

Ok,

now I know what the problem was: I fixed it up but then the rebasing
didn't pick it up when it came to

"x86/cpu: Rename srso_(.*)_alias to srso_alias_\1"

so I have to explicitly select that one and fix it up.

Hohumm, nasty.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [tip: x86/urgent] objtool/x86: Fix SRSO mess
  2023-08-16 11:59     ` Peter Zijlstra
@ 2023-08-16 20:31       ` Josh Poimboeuf
  2023-08-16 22:08         ` [PATCH] objtool/x86: Fixup frame-pointer vs rethunk Peter Zijlstra
  2023-08-17  8:39       ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
  1 sibling, 1 reply; 74+ messages in thread
From: Josh Poimboeuf @ 2023-08-16 20:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-tip-commits, Borislav Petkov (AMD), x86

On Wed, Aug 16, 2023 at 01:59:21PM +0200, Peter Zijlstra wrote:
> Turns out I forgot to build with FRAME_POINTER=y, that still gives:
> 
> vmlinux.o: warning: objtool: srso_untrain_ret+0xd: call without frame pointer save/setup
> 
> the below seems to cure this.

LGTM

-- 
Josh

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [tip: x86/urgent] x86/cpu/kvm: Provide UNTRAIN_RET_VM
  2023-08-14 11:44 ` [PATCH v2 09/11] x86/cpu/kvm: Provide UNTRAIN_RET_VM Peter Zijlstra
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
@ 2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 74+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-08-16 21:20 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     864bcaa38ee44ec6c0e43f79c2d2997b977e26b2
Gitweb:        https://git.kernel.org/tip/864bcaa38ee44ec6c0e43f79c2d2997b977e26b2
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 14 Aug 2023 13:44:35 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 16 Aug 2023 21:58:59 +02:00

x86/cpu/kvm: Provide UNTRAIN_RET_VM

Similar to how it doesn't make sense to have UNTRAIN_RET have two
untrain calls, it also doesn't make sense for VMEXIT to have an extra
IBPB call.

This cures VMEXIT doing potentially unret+IBPB or double IBPB.
Also, the (SEV) VMEXIT case seems to have been overlooked.

Redefine the meaning of the synthetic IBPB flags to:

 - ENTRY_IBPB     -- issue IBPB on entry  (was: entry + VMEXIT)
 - IBPB_ON_VMEXIT -- issue IBPB on VMEXIT

And have 'retbleed=ibpb' set *BOTH* feature flags to ensure it retains
the previous behaviour and issues IBPB on entry+VMEXIT.

The new 'srso=ibpb_vmexit' option only sets IBPB_ON_VMEXIT.

Create UNTRAIN_RET_VM specifically for the VMEXIT case, and have that
check IBPB_ON_VMEXIT.

All this avoids having the VMEXIT case having to check both ENTRY_IBPB
and IBPB_ON_VMEXIT and simplifies the alternatives.
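
As a compact illustration of the flag plumbing described above (the
SRSO_MITIGATION_IBPB_ON_VMEXIT enum name is an assumption for this
sketch; the retbleed case is taken from the bugs.c hunk below):

	/* retbleed=ibpb: keep the old behaviour, IBPB on entry *and* VMEXIT */
	case RETBLEED_MITIGATION_IBPB:
		setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);
		setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
		break;

	/* srso=ibpb_vmexit: IBPB on VMEXIT only */
	case SRSO_MITIGATION_IBPB_ON_VMEXIT:
		setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
		break;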

Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121149.109557833@infradead.org
---
 arch/x86/include/asm/nospec-branch.h | 11 +++++++++++
 arch/x86/kernel/cpu/bugs.c           |  1 +
 arch/x86/kvm/svm/vmenter.S           |  7 ++-----
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 5285c8e..c55cc24 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -299,6 +299,17 @@
 #endif
 .endm
 
+.macro UNTRAIN_RET_VM
+#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
+	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
+	VALIDATE_UNRET_END
+	ALTERNATIVE_3 "",						\
+		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
+		      "call entry_ibpb", X86_FEATURE_IBPB_ON_VMEXIT,	\
+		      __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
+#endif
+.endm
+
 .macro UNTRAIN_RET_FROM_CALL
 #if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
 	defined(CONFIG_CALL_DEPTH_TRACKING)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 6f3e195..9026e3f 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1054,6 +1054,7 @@ do_cmd_auto:
 
 	case RETBLEED_MITIGATION_IBPB:
 		setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);
+		setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
 		mitigate_smt = true;
 		break;
 
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 265452f..ef2ebab 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -222,10 +222,7 @@ SYM_FUNC_START(__svm_vcpu_run)
 	 * because interrupt handlers won't sanitize 'ret' if the return is
 	 * from the kernel.
 	 */
-	UNTRAIN_RET
-
-	/* SRSO */
-	ALTERNATIVE "", "call entry_ibpb", X86_FEATURE_IBPB_ON_VMEXIT
+	UNTRAIN_RET_VM
 
 	/*
 	 * Clear all general purpose registers except RSP and RAX to prevent
@@ -362,7 +359,7 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
 	 * because interrupt handlers won't sanitize RET if the return is
 	 * from the kernel.
 	 */
-	UNTRAIN_RET
+	UNTRAIN_RET_VM
 
 	/* "Pop" @spec_ctrl_intercepted.  */
 	pop %_ASM_BX

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip: x86/urgent] x86/cpu: Cleanup the untrain mess
  2023-08-14 11:44 ` [PATCH v2 08/11] x86/cpu: Cleanup the untrain mess Peter Zijlstra
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
@ 2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 74+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-08-16 21:20 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     e7c25c441e9e0fa75b4c83e0b26306b702cfe90d
Gitweb:        https://git.kernel.org/tip/e7c25c441e9e0fa75b4c83e0b26306b702cfe90d
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 14 Aug 2023 13:44:34 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 16 Aug 2023 21:58:59 +02:00

x86/cpu: Cleanup the untrain mess

Since there can only be one active return_thunk, there only needs to
be one (matching) untrain_ret. It fundamentally doesn't make sense to
allow multiple untrain_ret at the same time.

Fold all the 3 different untrain methods into a single (temporary)
helper stub.
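
Schematically, the untraining path then becomes a single two-level
dispatch, both levels patched via alternatives at boot so that only
one target is ever live (a sketch based on the hunks below, not
literal disassembly):

	UNTRAIN_RET:
		call entry_untrain_ret			// X86_FEATURE_UNRET

	entry_untrain_ret:
		jmp retbleed_untrain_ret		// default
		jmp srso_untrain_ret			// X86_FEATURE_SRSO
		jmp srso_alias_untrain_ret		// X86_FEATURE_SRSO_ALIAS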

Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121149.042774962@infradead.org
---
 arch/x86/include/asm/nospec-branch.h | 19 +++++--------------
 arch/x86/kernel/cpu/bugs.c           |  1 +
 arch/x86/lib/retpoline.S             |  7 +++++++
 3 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index f7c3375..5285c8e 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -272,9 +272,9 @@
 .endm
 
 #ifdef CONFIG_CPU_UNRET_ENTRY
-#define CALL_ZEN_UNTRAIN_RET	"call retbleed_untrain_ret"
+#define CALL_UNTRAIN_RET	"call entry_untrain_ret"
 #else
-#define CALL_ZEN_UNTRAIN_RET	""
+#define CALL_UNTRAIN_RET	""
 #endif
 
 /*
@@ -293,15 +293,10 @@
 	defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
 	VALIDATE_UNRET_END
 	ALTERNATIVE_3 "",						\
-		      CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET,		\
+		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
 		      "call entry_ibpb", X86_FEATURE_ENTRY_IBPB,	\
 		      __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
 #endif
-
-#ifdef CONFIG_CPU_SRSO
-	ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
-			  "call srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
-#endif
 .endm
 
 .macro UNTRAIN_RET_FROM_CALL
@@ -309,15 +304,10 @@
 	defined(CONFIG_CALL_DEPTH_TRACKING)
 	VALIDATE_UNRET_END
 	ALTERNATIVE_3 "",						\
-		      CALL_ZEN_UNTRAIN_RET, X86_FEATURE_UNRET,		\
+		      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,		\
 		      "call entry_ibpb", X86_FEATURE_ENTRY_IBPB,	\
 		      __stringify(RESET_CALL_DEPTH_FROM_CALL), X86_FEATURE_CALL_DEPTH
 #endif
-
-#ifdef CONFIG_CPU_SRSO
-	ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
-			  "call srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
-#endif
 .endm
 
 
@@ -355,6 +345,7 @@ extern void retbleed_untrain_ret(void);
 extern void srso_untrain_ret(void);
 extern void srso_alias_untrain_ret(void);
 
+extern void entry_untrain_ret(void);
 extern void entry_ibpb(void);
 
 extern void (*x86_return_thunk)(void);
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index bbbbda9..6f3e195 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2460,6 +2460,7 @@ static void __init srso_select_mitigation(void)
 			 * like ftrace, static_call, etc.
 			 */
 			setup_force_cpu_cap(X86_FEATURE_RETHUNK);
+			setup_force_cpu_cap(X86_FEATURE_UNRET);
 
 			if (boot_cpu_data.x86 == 0x19) {
 				setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index d37e5ab..5e85da1 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -289,6 +289,13 @@ SYM_CODE_START(srso_return_thunk)
 	ud2
 SYM_CODE_END(srso_return_thunk)
 
+SYM_FUNC_START(entry_untrain_ret)
+	ALTERNATIVE_2 "jmp retbleed_untrain_ret", \
+		      "jmp srso_untrain_ret", X86_FEATURE_SRSO, \
+		      "jmp srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
+SYM_FUNC_END(entry_untrain_ret)
+__EXPORT_THUNK(entry_untrain_ret)
+
 SYM_CODE_START(__x86_return_thunk)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip: x86/urgent] x86/cpu: Rename srso_(.*)_alias to srso_alias_\1
  2023-08-14 11:44 ` [PATCH v2 07/11] x86/cpu: Rename srso_(.*)_alias to srso_alias_\1 Peter Zijlstra
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
@ 2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 74+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-08-16 21:20 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     42be649dd1f2eee6b1fb185f1a231b9494cf095f
Gitweb:        https://git.kernel.org/tip/42be649dd1f2eee6b1fb185f1a231b9494cf095f
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 14 Aug 2023 13:44:33 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 16 Aug 2023 21:58:53 +02:00

x86/cpu: Rename srso_(.*)_alias to srso_alias_\1

For a more consistent namespace.

  [ bp: Fixup names in the doc too. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.976236447@infradead.org
---
 Documentation/admin-guide/hw-vuln/srso.rst |  4 +--
 arch/x86/include/asm/nospec-branch.h       |  6 ++---
 arch/x86/kernel/vmlinux.lds.S              |  8 +++---
 arch/x86/lib/retpoline.S                   | 26 ++++++++++-----------
 4 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
index af59a93..b6cfb51 100644
--- a/Documentation/admin-guide/hw-vuln/srso.rst
+++ b/Documentation/admin-guide/hw-vuln/srso.rst
@@ -141,8 +141,8 @@ sequence.
 To ensure the safety of this mitigation, the kernel must ensure that the
 safe return sequence is itself free from attacker interference.  In Zen3
 and Zen4, this is accomplished by creating a BTB alias between the
-untraining function srso_untrain_ret_alias() and the safe return
-function srso_safe_ret_alias() which results in evicting a potentially
+untraining function srso_alias_untrain_ret() and the safe return
+function srso_alias_safe_ret() which results in evicting a potentially
 poisoned BTB entry and using that safe one for all function returns.
 
 In older Zen1 and Zen2, this is accomplished using a reinterpretation
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 8a0d4c5..f7c3375 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -300,7 +300,7 @@
 
 #ifdef CONFIG_CPU_SRSO
 	ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
-			  "call srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
+			  "call srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
 #endif
 .endm
 
@@ -316,7 +316,7 @@
 
 #ifdef CONFIG_CPU_SRSO
 	ALTERNATIVE_2 "", "call srso_untrain_ret", X86_FEATURE_SRSO, \
-			  "call srso_untrain_ret_alias", X86_FEATURE_SRSO_ALIAS
+			  "call srso_alias_untrain_ret", X86_FEATURE_SRSO_ALIAS
 #endif
 .endm
 
@@ -353,7 +353,7 @@ extern void srso_alias_return_thunk(void);
 
 extern void retbleed_untrain_ret(void);
 extern void srso_untrain_ret(void);
-extern void srso_untrain_ret_alias(void);
+extern void srso_alias_untrain_ret(void);
 
 extern void entry_ibpb(void);
 
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 7c0e2b4..83d41c2 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -147,10 +147,10 @@ SECTIONS
 
 #ifdef CONFIG_CPU_SRSO
 		/*
-		 * See the comment above srso_untrain_ret_alias()'s
+		 * See the comment above srso_alias_untrain_ret()'s
 		 * definition.
 		 */
-		. = srso_untrain_ret_alias | (1 << 2) | (1 << 8) | (1 << 14) | (1 << 20);
+		. = srso_alias_untrain_ret | (1 << 2) | (1 << 8) | (1 << 14) | (1 << 20);
 		*(.text..__x86.rethunk_safe)
 #endif
 		ALIGN_ENTRY_TEXT_END
@@ -536,8 +536,8 @@ INIT_PER_CPU(irq_stack_backing_store);
  * Instead do: (A | B) - (A & B) in order to compute the XOR
  * of the two function addresses:
  */
-. = ASSERT(((ABSOLUTE(srso_untrain_ret_alias) | srso_safe_ret_alias) -
-		(ABSOLUTE(srso_untrain_ret_alias) & srso_safe_ret_alias)) == ((1 << 2) | (1 << 8) | (1 << 14) | (1 << 20)),
+. = ASSERT(((ABSOLUTE(srso_alias_untrain_ret) | srso_alias_safe_ret) -
+		(ABSOLUTE(srso_alias_untrain_ret) & srso_alias_safe_ret)) == ((1 << 2) | (1 << 8) | (1 << 14) | (1 << 20)),
 		"SRSO function pair won't alias");
 #endif
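
Since (A | B) - (A & B) is just A ^ B, the ASSERT boils down to: the two
symbols' addresses differ in exactly bits 2, 8, 14 and 20 (and hence stay
within the same 2M page). A quick userspace sketch of that check, with
made-up addresses standing in for the two symbols, not part of the patch:

#include <stdint.h>
#include <stdio.h>

/* the bits that must differ, per the ASSERT above */
#define SRSO_ALIAS_BITS	((1UL << 2) | (1UL << 8) | (1UL << 14) | (1UL << 20))

static int srso_addrs_alias(uint64_t untrain, uint64_t safe)
{
	/* (A | B) - (A & B) == A ^ B */
	if ((untrain ^ safe) != SRSO_ALIAS_BITS)
		return 0;

	/* same 2M page; implied by the XOR mask, spelled out for clarity */
	return (untrain >> 21) == (safe >> 21);
}

int main(void)
{
	/* made-up addresses: a 2M aligned untrain symbol and its alias */
	uint64_t untrain = 0xffffffff82000000ULL;
	uint64_t safe    = untrain | SRSO_ALIAS_BITS;

	printf("%d\n", srso_addrs_alias(untrain, safe));
	return 0;
}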
 
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index adabd07..d37e5ab 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -133,56 +133,56 @@ SYM_CODE_END(__x86_indirect_jump_thunk_array)
 #ifdef CONFIG_RETHUNK
 
 /*
- * srso_untrain_ret_alias() and srso_safe_ret_alias() are placed at
+ * srso_alias_untrain_ret() and srso_alias_safe_ret() are placed at
  * special addresses:
  *
- * - srso_untrain_ret_alias() is 2M aligned
- * - srso_safe_ret_alias() is also in the same 2M page but bits 2, 8, 14
+ * - srso_alias_untrain_ret() is 2M aligned
+ * - srso_alias_safe_ret() is also in the same 2M page but bits 2, 8, 14
  * and 20 in its virtual address are set (while those bits in the
- * srso_untrain_ret_alias() function are cleared).
+ * srso_alias_untrain_ret() function are cleared).
  *
  * This guarantees that those two addresses will alias in the branch
  * target buffer of Zen3/4 generations, leading to any potential
  * poisoned entries at that BTB slot to get evicted.
  *
- * As a result, srso_safe_ret_alias() becomes a safe return.
+ * As a result, srso_alias_safe_ret() becomes a safe return.
  */
 #ifdef CONFIG_CPU_SRSO
 	.section .text..__x86.rethunk_untrain
 
-SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+SYM_START(srso_alias_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
 	ASM_NOP2
 	lfence
 	jmp srso_alias_return_thunk
-SYM_FUNC_END(srso_untrain_ret_alias)
-__EXPORT_THUNK(srso_untrain_ret_alias)
+SYM_FUNC_END(srso_alias_untrain_ret)
+__EXPORT_THUNK(srso_alias_untrain_ret)
 
 	.section .text..__x86.rethunk_safe
 #else
 /* dummy definition for alternatives */
-SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+SYM_START(srso_alias_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_UNRET_SAFE
 	ret
 	int3
-SYM_FUNC_END(srso_untrain_ret_alias)
+SYM_FUNC_END(srso_alias_untrain_ret)
 #endif
 
-SYM_START(srso_safe_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+SYM_START(srso_alias_safe_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	lea 8(%_ASM_SP), %_ASM_SP
 	UNWIND_HINT_FUNC
 	ANNOTATE_UNRET_SAFE
 	ret
 	int3
-SYM_FUNC_END(srso_safe_ret_alias)
+SYM_FUNC_END(srso_alias_safe_ret)
 
 	.section .text..__x86.return_thunk
 
 SYM_CODE_START(srso_alias_return_thunk)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
-	call srso_safe_ret_alias
+	call srso_alias_safe_ret
 	ud2
 SYM_CODE_END(srso_alias_return_thunk)
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip: x86/urgent] x86/cpu: Rename original retbleed methods
  2023-08-14 11:44 ` [PATCH v2 06/11] x86/cpu: Rename original retbleed methods Peter Zijlstra
  2023-08-14 19:41   ` Josh Poimboeuf
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
@ 2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
  2 siblings, 0 replies; 74+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-08-16 21:20 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Josh Poimboeuf, Peter Zijlstra (Intel), Borislav Petkov (AMD),
	x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     d025b7bac07a6e90b6b98b487f88854ad9247c39
Gitweb:        https://git.kernel.org/tip/d025b7bac07a6e90b6b98b487f88854ad9247c39
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 14 Aug 2023 13:44:32 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 16 Aug 2023 21:47:53 +02:00

x86/cpu: Rename original retbleed methods

Rename the original retbleed return thunk and untrain_ret to
retbleed_return_thunk() and retbleed_untrain_ret().

No functional changes.

Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.909378169@infradead.org
---
 arch/x86/include/asm/nospec-branch.h |  8 +++----
 arch/x86/kernel/cpu/bugs.c           |  2 +-
 arch/x86/kernel/vmlinux.lds.S        |  2 +-
 arch/x86/lib/retpoline.S             | 30 +++++++++++++--------------
 tools/objtool/arch/x86/decode.c      |  2 +-
 tools/objtool/check.c                |  2 +-
 6 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 5ed78ad..8a0d4c5 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -272,7 +272,7 @@
 .endm
 
 #ifdef CONFIG_CPU_UNRET_ENTRY
-#define CALL_ZEN_UNTRAIN_RET	"call zen_untrain_ret"
+#define CALL_ZEN_UNTRAIN_RET	"call retbleed_untrain_ret"
 #else
 #define CALL_ZEN_UNTRAIN_RET	""
 #endif
@@ -282,7 +282,7 @@
  * return thunk isn't mapped into the userspace tables (then again, AMD
  * typically has NO_MELTDOWN).
  *
- * While zen_untrain_ret() doesn't clobber anything but requires stack,
+ * While retbleed_untrain_ret() doesn't clobber anything but requires stack,
  * entry_ibpb() will clobber AX, CX, DX.
  *
  * As such, this must be placed after every *SWITCH_TO_KERNEL_CR3 at a point
@@ -347,11 +347,11 @@ extern void __x86_return_thunk(void);
 static inline void __x86_return_thunk(void) {}
 #endif
 
-extern void zen_return_thunk(void);
+extern void retbleed_return_thunk(void);
 extern void srso_return_thunk(void);
 extern void srso_alias_return_thunk(void);
 
-extern void zen_untrain_ret(void);
+extern void retbleed_untrain_ret(void);
 extern void srso_untrain_ret(void);
 extern void srso_untrain_ret_alias(void);
 
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 56cf250..bbbbda9 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1043,7 +1043,7 @@ do_cmd_auto:
 		setup_force_cpu_cap(X86_FEATURE_UNRET);
 
 		if (IS_ENABLED(CONFIG_RETHUNK))
-			x86_return_thunk = zen_return_thunk;
+			x86_return_thunk = retbleed_return_thunk;
 
 		if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
 		    boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index d3b02d6..7c0e2b4 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -521,7 +521,7 @@ INIT_PER_CPU(irq_stack_backing_store);
 #endif
 
 #ifdef CONFIG_RETHUNK
-. = ASSERT((zen_return_thunk & 0x3f) == 0, "zen_return_thunk not cacheline-aligned");
+. = ASSERT((retbleed_return_thunk & 0x3f) == 0, "retbleed_return_thunk not cacheline-aligned");
 . = ASSERT((srso_safe_ret & 0x3f) == 0, "srso_safe_ret not cacheline-aligned");
 #endif
 
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 7df8582..adabd07 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -188,32 +188,32 @@ SYM_CODE_END(srso_alias_return_thunk)
 
 /*
  * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
- * 1) The RET at zen_return_thunk must be on a 64 byte boundary, for
+ * 1) The RET at retbleed_return_thunk must be on a 64 byte boundary, for
  *    alignment within the BTB.
- * 2) The instruction at zen_untrain_ret must contain, and not
+ * 2) The instruction at retbleed_untrain_ret must contain, and not
  *    end with, the 0xc3 byte of the RET.
  * 3) STIBP must be enabled, or SMT disabled, to prevent the sibling thread
  *    from re-poisioning the BTB prediction.
  */
 	.align 64
-	.skip 64 - (zen_return_thunk - zen_untrain_ret), 0xcc
-SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
+	.skip 64 - (retbleed_return_thunk - retbleed_untrain_ret), 0xcc
+SYM_START(retbleed_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_NOENDBR
 	/*
-	 * As executed from zen_untrain_ret, this is:
+	 * As executed from retbleed_untrain_ret, this is:
 	 *
 	 *   TEST $0xcc, %bl
 	 *   LFENCE
-	 *   JMP zen_return_thunk
+	 *   JMP retbleed_return_thunk
 	 *
 	 * Executing the TEST instruction has a side effect of evicting any BTB
 	 * prediction (potentially attacker controlled) attached to the RET, as
-	 * zen_return_thunk + 1 isn't an instruction boundary at the moment.
+	 * retbleed_return_thunk + 1 isn't an instruction boundary at the moment.
 	 */
 	.byte	0xf6
 
 	/*
-	 * As executed from zen_return_thunk, this is a plain RET.
+	 * As executed from retbleed_return_thunk, this is a plain RET.
 	 *
 	 * As part of the TEST above, RET is the ModRM byte, and INT3 the imm8.
 	 *
@@ -225,13 +225,13 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	 * With SMT enabled and STIBP active, a sibling thread cannot poison
 	 * RET's prediction to a type of its choice, but can evict the
 	 * prediction due to competitive sharing. If the prediction is
-	 * evicted, zen_return_thunk will suffer Straight Line Speculation
+	 * evicted, retbleed_return_thunk will suffer Straight Line Speculation
 	 * which will be contained safely by the INT3.
 	 */
-SYM_INNER_LABEL(zen_return_thunk, SYM_L_GLOBAL)
+SYM_INNER_LABEL(retbleed_return_thunk, SYM_L_GLOBAL)
 	ret
 	int3
-SYM_CODE_END(zen_return_thunk)
+SYM_CODE_END(retbleed_return_thunk)
 
 	/*
 	 * Ensure the TEST decoding / BTB invalidation is complete.
@@ -242,13 +242,13 @@ SYM_CODE_END(zen_return_thunk)
 	 * Jump back and execute the RET in the middle of the TEST instruction.
 	 * INT3 is for SLS protection.
 	 */
-	jmp zen_return_thunk
+	jmp retbleed_return_thunk
 	int3
-SYM_FUNC_END(zen_untrain_ret)
-__EXPORT_THUNK(zen_untrain_ret)
+SYM_FUNC_END(retbleed_untrain_ret)
+__EXPORT_THUNK(retbleed_untrain_ret)
 
 /*
- * SRSO untraining sequence for Zen1/2, similar to zen_untrain_ret()
+ * SRSO untraining sequence for Zen1/2, similar to retbleed_untrain_ret()
  * above. On kernel entry, srso_untrain_ret() is executed which is a
  *
  * movabs $0xccccc30824648d48,%rax
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index c55f3bb..c0f25d0 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -829,6 +829,6 @@ bool arch_is_rethunk(struct symbol *sym)
 
 bool arch_is_embedded_insn(struct symbol *sym)
 {
-	return !strcmp(sym->name, "zen_return_thunk") ||
+	return !strcmp(sym->name, "retbleed_return_thunk") ||
 	       !strcmp(sym->name, "srso_safe_ret");
 }
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 191656e..7a9aaf4 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1593,7 +1593,7 @@ static int add_jump_destinations(struct objtool_file *file)
 			struct symbol *sym = find_symbol_by_offset(dest_sec, dest_off);
 
 			/*
-			 * This is a special case for zen_untrain_ret().
+			 * This is a special case for retbleed_untrain_ret().
 			 * It jumps to __x86_return_thunk(), but objtool
 			 * can't find the thunk's starting RET
 			 * instruction, because the RET is also in the

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip: x86/urgent] x86/cpu: Clean up SRSO return thunk mess
  2023-08-14 11:44 ` [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess Peter Zijlstra
                     ` (3 preceding siblings ...)
  2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
@ 2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
  4 siblings, 0 replies; 74+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-08-16 21:20 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov (AMD), x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     d43490d0ab824023e11d0b57d0aeec17a6e0ca13
Gitweb:        https://git.kernel.org/tip/d43490d0ab824023e11d0b57d0aeec17a6e0ca13
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 14 Aug 2023 13:44:31 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 16 Aug 2023 21:47:24 +02:00

x86/cpu: Clean up SRSO return thunk mess

Use the existing configurable return thunk. There is absolutely no
justification for having created this __x86_return_thunk alternative.

To clarify, the whole thing looks like:

Zen3/4 does:

  srso_alias_untrain_ret:
	  nop2
	  lfence
	  jmp srso_alias_return_thunk
	  int3

  srso_alias_safe_ret: // aliases srso_alias_untrain_ret just so
	  add $8, %rsp
	  ret
	  int3

  srso_alias_return_thunk:
	  call srso_alias_safe_ret
	  ud2

While Zen1/2 does:

  srso_untrain_ret:
	  movabs $foo, %rax
	  lfence
	  call srso_safe_ret           (jmp srso_return_thunk ?)
	  int3

  srso_safe_ret: // embedded in movabs instruction
	  add $8,%rsp
          ret
          int3

  srso_return_thunk:
	  call srso_safe_ret
	  ud2

While retbleed does:

  zen_untrain_ret:
	  test $0xcc, %bl
	  lfence
	  jmp zen_return_thunk
          int3

  zen_return_thunk: // embedded in the test instruction
	  ret
          int3

Where Zen1/2 flush the BTB entry using the instruction decoder trick
(test, movabs), Zen3/4 use BTB aliasing. SRSO adds a return sequence
(srso_safe_ret()) which forces the function return instruction to
speculate into a trap (UD2).  This RET will then mispredict and
execution will continue at the return site read from the top of the
stack.

Pick one of three options at boot (every function can only ever return
once).
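
Roughly, the bugs.c hunk below ends up doing (config and cmdline handling
omitted, this is only a condensed sketch of the selection):

  /* retbleed_select_mitigation(), UNRET case */
  x86_return_thunk = zen_return_thunk;

  /* srso_select_mitigation() runs later and overrides it for SAFE_RET */
  if (boot_cpu_data.x86 == 0x19)
	  x86_return_thunk = srso_alias_return_thunk;
  else
	  x86_return_thunk = srso_return_thunk;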

  [ bp: Fixup commit message uarch details and add them in a comment in
    the code too. Add a comment about the srso_select_mitigation()
    dependency on retbleed_select_mitigation(). Add moar ifdeffery for
    32-bit builds. Add a dummy srso_untrain_ret_alias() definition for
    32-bit alternatives needing the symbol. ]

Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230814121148.842775684@infradead.org
---
 arch/x86/include/asm/nospec-branch.h |  5 ++-
 arch/x86/kernel/cpu/bugs.c           | 15 ++++++-
 arch/x86/kernel/vmlinux.lds.S        |  2 +-
 arch/x86/lib/retpoline.S             | 58 +++++++++++++++++++--------
 tools/objtool/arch/x86/decode.c      |  2 +-
 5 files changed, 62 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index b3625cc..5ed78ad 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -347,9 +347,14 @@ extern void __x86_return_thunk(void);
 static inline void __x86_return_thunk(void) {}
 #endif
 
+extern void zen_return_thunk(void);
+extern void srso_return_thunk(void);
+extern void srso_alias_return_thunk(void);
+
 extern void zen_untrain_ret(void);
 extern void srso_untrain_ret(void);
 extern void srso_untrain_ret_alias(void);
+
 extern void entry_ibpb(void);
 
 extern void (*x86_return_thunk)(void);
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 3bc0d14..56cf250 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -167,6 +167,11 @@ void __init cpu_select_mitigations(void)
 	md_clear_select_mitigation();
 	srbds_select_mitigation();
 	l1d_flush_select_mitigation();
+
+	/*
+	 * srso_select_mitigation() depends and must run after
+	 * retbleed_select_mitigation().
+	 */
 	srso_select_mitigation();
 	gds_select_mitigation();
 }
@@ -1037,6 +1042,9 @@ do_cmd_auto:
 		setup_force_cpu_cap(X86_FEATURE_RETHUNK);
 		setup_force_cpu_cap(X86_FEATURE_UNRET);
 
+		if (IS_ENABLED(CONFIG_RETHUNK))
+			x86_return_thunk = zen_return_thunk;
+
 		if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
 		    boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
 			pr_err(RETBLEED_UNTRAIN_MSG);
@@ -2453,10 +2461,13 @@ static void __init srso_select_mitigation(void)
 			 */
 			setup_force_cpu_cap(X86_FEATURE_RETHUNK);
 
-			if (boot_cpu_data.x86 == 0x19)
+			if (boot_cpu_data.x86 == 0x19) {
 				setup_force_cpu_cap(X86_FEATURE_SRSO_ALIAS);
-			else
+				x86_return_thunk = srso_alias_return_thunk;
+			} else {
 				setup_force_cpu_cap(X86_FEATURE_SRSO);
+				x86_return_thunk = srso_return_thunk;
+			}
 			srso_mitigation = SRSO_MITIGATION_SAFE_RET;
 		} else {
 			pr_err("WARNING: kernel not compiled with CPU_SRSO.\n");
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 8e2a306..d3b02d6 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -521,7 +521,7 @@ INIT_PER_CPU(irq_stack_backing_store);
 #endif
 
 #ifdef CONFIG_RETHUNK
-. = ASSERT((__ret & 0x3f) == 0, "__ret not cacheline-aligned");
+. = ASSERT((zen_return_thunk & 0x3f) == 0, "zen_return_thunk not cacheline-aligned");
 . = ASSERT((srso_safe_ret & 0x3f) == 0, "srso_safe_ret not cacheline-aligned");
 #endif
 
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index a478eb5..7df8582 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -151,22 +151,27 @@ SYM_CODE_END(__x86_indirect_jump_thunk_array)
 	.section .text..__x86.rethunk_untrain
 
 SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
 	ASM_NOP2
 	lfence
-	jmp __x86_return_thunk
+	jmp srso_alias_return_thunk
 SYM_FUNC_END(srso_untrain_ret_alias)
 __EXPORT_THUNK(srso_untrain_ret_alias)
 
 	.section .text..__x86.rethunk_safe
+#else
+/* dummy definition for alternatives */
+SYM_START(srso_untrain_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
+	ANNOTATE_UNRET_SAFE
+	ret
+	int3
+SYM_FUNC_END(srso_untrain_ret_alias)
 #endif
 
-/* Needs a definition for the __x86_return_thunk alternative below. */
 SYM_START(srso_safe_ret_alias, SYM_L_GLOBAL, SYM_A_NONE)
-#ifdef CONFIG_CPU_SRSO
 	lea 8(%_ASM_SP), %_ASM_SP
 	UNWIND_HINT_FUNC
-#endif
 	ANNOTATE_UNRET_SAFE
 	ret
 	int3
@@ -174,9 +179,16 @@ SYM_FUNC_END(srso_safe_ret_alias)
 
 	.section .text..__x86.return_thunk
 
+SYM_CODE_START(srso_alias_return_thunk)
+	UNWIND_HINT_FUNC
+	ANNOTATE_NOENDBR
+	call srso_safe_ret_alias
+	ud2
+SYM_CODE_END(srso_alias_return_thunk)
+
 /*
  * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
- * 1) The RET at __x86_return_thunk must be on a 64 byte boundary, for
+ * 1) The RET at zen_return_thunk must be on a 64 byte boundary, for
  *    alignment within the BTB.
  * 2) The instruction at zen_untrain_ret must contain, and not
  *    end with, the 0xc3 byte of the RET.
@@ -184,7 +196,7 @@ SYM_FUNC_END(srso_safe_ret_alias)
  *    from re-poisioning the BTB prediction.
  */
 	.align 64
-	.skip 64 - (__ret - zen_untrain_ret), 0xcc
+	.skip 64 - (zen_return_thunk - zen_untrain_ret), 0xcc
 SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_NOENDBR
 	/*
@@ -192,16 +204,16 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	 *
 	 *   TEST $0xcc, %bl
 	 *   LFENCE
-	 *   JMP __x86_return_thunk
+	 *   JMP zen_return_thunk
 	 *
 	 * Executing the TEST instruction has a side effect of evicting any BTB
 	 * prediction (potentially attacker controlled) attached to the RET, as
-	 * __x86_return_thunk + 1 isn't an instruction boundary at the moment.
+	 * zen_return_thunk + 1 isn't an instruction boundary at the moment.
 	 */
 	.byte	0xf6
 
 	/*
-	 * As executed from __x86_return_thunk, this is a plain RET.
+	 * As executed from zen_return_thunk, this is a plain RET.
 	 *
 	 * As part of the TEST above, RET is the ModRM byte, and INT3 the imm8.
 	 *
@@ -213,13 +225,13 @@ SYM_START(zen_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	 * With SMT enabled and STIBP active, a sibling thread cannot poison
 	 * RET's prediction to a type of its choice, but can evict the
 	 * prediction due to competitive sharing. If the prediction is
-	 * evicted, __x86_return_thunk will suffer Straight Line Speculation
+	 * evicted, zen_return_thunk will suffer Straight Line Speculation
 	 * which will be contained safely by the INT3.
 	 */
-SYM_INNER_LABEL(__ret, SYM_L_GLOBAL)
+SYM_INNER_LABEL(zen_return_thunk, SYM_L_GLOBAL)
 	ret
 	int3
-SYM_CODE_END(__ret)
+SYM_CODE_END(zen_return_thunk)
 
 	/*
 	 * Ensure the TEST decoding / BTB invalidation is complete.
@@ -230,7 +242,7 @@ SYM_CODE_END(__ret)
 	 * Jump back and execute the RET in the middle of the TEST instruction.
 	 * INT3 is for SLS protection.
 	 */
-	jmp __ret
+	jmp zen_return_thunk
 	int3
 SYM_FUNC_END(zen_untrain_ret)
 __EXPORT_THUNK(zen_untrain_ret)
@@ -251,11 +263,18 @@ SYM_START(srso_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_NOENDBR
 	.byte 0x48, 0xb8
 
+/*
+ * This forces the function return instruction to speculate into a trap
+ * (UD2 in srso_return_thunk() below).  This RET will then mispredict
+ * and execution will continue at the return site read from the top of
+ * the stack.
+ */
 SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
 	lea 8(%_ASM_SP), %_ASM_SP
 	ret
 	int3
 	int3
+	/* end of movabs */
 	lfence
 	call srso_safe_ret
 	ud2
@@ -263,12 +282,19 @@ SYM_CODE_END(srso_safe_ret)
 SYM_FUNC_END(srso_untrain_ret)
 __EXPORT_THUNK(srso_untrain_ret)
 
-SYM_CODE_START(__x86_return_thunk)
+SYM_CODE_START(srso_return_thunk)
 	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
-	ALTERNATIVE_2 "jmp __ret", "call srso_safe_ret", X86_FEATURE_SRSO, \
-			"call srso_safe_ret_alias", X86_FEATURE_SRSO_ALIAS
+	call srso_safe_ret
 	ud2
+SYM_CODE_END(srso_return_thunk)
+
+SYM_CODE_START(__x86_return_thunk)
+	UNWIND_HINT_FUNC
+	ANNOTATE_NOENDBR
+	ANNOTATE_UNRET_SAFE
+	ret
+	int3
 SYM_CODE_END(__x86_return_thunk)
 EXPORT_SYMBOL(__x86_return_thunk)
 
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index cba8a7b..c55f3bb 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -829,6 +829,6 @@ bool arch_is_rethunk(struct symbol *sym)
 
 bool arch_is_embedded_insn(struct symbol *sym)
 {
-	return !strcmp(sym->name, "__ret") ||
+	return !strcmp(sym->name, "zen_return_thunk") ||
 	       !strcmp(sym->name, "srso_safe_ret");
 }

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH] objtool/x86: Fixup frame-pointer vs rethunk
  2023-08-16 20:31       ` Josh Poimboeuf
@ 2023-08-16 22:08         ` Peter Zijlstra
  2023-08-16 22:22           ` Josh Poimboeuf
  0 siblings, 1 reply; 74+ messages in thread
From: Peter Zijlstra @ 2023-08-16 22:08 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: linux-kernel, linux-tip-commits, Borislav Petkov (AMD), x86

On Wed, Aug 16, 2023 at 01:31:52PM -0700, Josh Poimboeuf wrote:
> On Wed, Aug 16, 2023 at 01:59:21PM +0200, Peter Zijlstra wrote:
> > Turns out I forgot to build with FRAME_POINTER=y, that still gives:
> > 
> > vmlinux.o: warning: objtool: srso_untrain_ret+0xd: call without frame pointer save/setup
> > 
> > the below seems to cure this.
> 
> LGTM

OK, with Changelog below.

---
Subject: objtool/x86: Fixup frame-pointer vs rethunk
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 16 Aug 2023 13:59:21 +0200

For stack-validation of a frame-pointer build, objtool validates that
every CALL instruction is preceded by a frame-setup. The new SRSO
return thunks violate this with their RSB stuffing trickery.

Extend the __fentry__ exception to also cover the embedded_insn case
used for this. This cures:

vmlinux.o: warning: objtool: srso_untrain_ret+0xd: call without frame pointer save/setup

Fixes: 4ae68b26c3ab ("objtool/x86: Fix SRSO mess")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/check.c |   17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2630,12 +2630,17 @@ static int decode_sections(struct objtoo
 	return 0;
 }
 
-static bool is_fentry_call(struct instruction *insn)
+static bool is_special_call(struct instruction *insn)
 {
-	if (insn->type == INSN_CALL &&
-	    insn_call_dest(insn) &&
-	    insn_call_dest(insn)->fentry)
-		return true;
+	if (insn->type == INSN_CALL) {
+		struct symbol *dest = insn_call_dest(insn);
+
+		if (!dest)
+			return false;
+
+		if (dest->fentry || dest->embedded_insn)
+			return true;
+	}
 
 	return false;
 }
@@ -3636,7 +3641,7 @@ static int validate_branch(struct objtoo
 			if (ret)
 				return ret;
 
-			if (opts.stackval && func && !is_fentry_call(insn) &&
+			if (opts.stackval && func && !is_special_call(insn) &&
 			    !has_valid_stack_frame(&state)) {
 				WARN_INSN(insn, "call without frame pointer save/setup");
 				return 1;

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH] objtool/x86: Fixup frame-pointer vs rethunk
  2023-08-16 22:08         ` [PATCH] objtool/x86: Fixup frame-pointer vs rethunk Peter Zijlstra
@ 2023-08-16 22:22           ` Josh Poimboeuf
  0 siblings, 0 replies; 74+ messages in thread
From: Josh Poimboeuf @ 2023-08-16 22:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-tip-commits, Borislav Petkov (AMD), x86

On Thu, Aug 17, 2023 at 12:08:40AM +0200, Peter Zijlstra wrote:
> On Wed, Aug 16, 2023 at 01:31:52PM -0700, Josh Poimboeuf wrote:
> > On Wed, Aug 16, 2023 at 01:59:21PM +0200, Peter Zijlstra wrote:
> > > Turns out I forgot to build with FRAME_POINTER=y, that still gives:
> > > 
> > > vmlinux.o: warning: objtool: srso_untrain_ret+0xd: call without frame pointer save/setup
> > > 
> > > the below seems to cure this.
> > 
> > LGTM
> 
> OK, with Changelog below.
> 
> ---
> Subject: objtool/x86: Fixup frame-pointer vs rethunk
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Wed, 16 Aug 2023 13:59:21 +0200
> 
> For stack-validation of a frame-pointer build, objtool validates that
> every CALL instruction is preceded by a frame-setup. The new SRSO
> return thunks violate this with their RSB stuffing trickery.
> 
> Extend the __fentry__ exception to also cover the embedded_insn case
> used for this. This cures:
> 
> vmlinux.o: warning: objtool: srso_untrain_ret+0xd: call without frame pointer save/setup
> 
> Fixes: 4ae68b26c3ab ("objtool/x86: Fix SRSO mess")
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>

-- 
Josh

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [tip: x86/urgent] objtool/x86: Fixup frame-pointer vs rethunk
  2023-08-16 11:59     ` Peter Zijlstra
  2023-08-16 20:31       ` Josh Poimboeuf
@ 2023-08-17  8:39       ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 74+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2023-08-17  8:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov (AMD),
	Josh Poimboeuf, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     dbf46008775516f7f25c95b7760041c286299783
Gitweb:        https://git.kernel.org/tip/dbf46008775516f7f25c95b7760041c286299783
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 16 Aug 2023 13:59:21 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Thu, 17 Aug 2023 00:44:35 +02:00

objtool/x86: Fixup frame-pointer vs rethunk

For stack-validation of a frame-pointer build, objtool validates that
every CALL instruction is preceded by a frame-setup. The new SRSO
return thunks violate this with their RSB stuffing trickery.

Extend the __fentry__ exception to also cover the embedded_insn case
used for this. This cures:

  vmlinux.o: warning: objtool: srso_untrain_ret+0xd: call without frame pointer save/setup

Fixes: 4ae68b26c3ab ("objtool/x86: Fix SRSO mess")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/20230816115921.GH980931@hirez.programming.kicks-ass.net
---
 tools/objtool/check.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 7a9aaf4..1384090 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2650,12 +2650,17 @@ static int decode_sections(struct objtool_file *file)
 	return 0;
 }
 
-static bool is_fentry_call(struct instruction *insn)
+static bool is_special_call(struct instruction *insn)
 {
-	if (insn->type == INSN_CALL &&
-	    insn_call_dest(insn) &&
-	    insn_call_dest(insn)->fentry)
-		return true;
+	if (insn->type == INSN_CALL) {
+		struct symbol *dest = insn_call_dest(insn);
+
+		if (!dest)
+			return false;
+
+		if (dest->fentry || dest->embedded_insn)
+			return true;
+	}
 
 	return false;
 }
@@ -3656,7 +3661,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
 			if (ret)
 				return ret;
 
-			if (opts.stackval && func && !is_fentry_call(insn) &&
+			if (opts.stackval && func && !is_special_call(insn) &&
 			    !has_valid_stack_frame(&state)) {
 				WARN_INSN(insn, "call without frame pointer save/setup");
 				return 1;

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-08-14 11:44 ` [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n() Peter Zijlstra
  2023-08-15 20:49   ` Nikolay Borisov
@ 2023-09-07  8:31   ` Borislav Petkov
  2023-09-07 11:09     ` Peter Zijlstra
  1 sibling, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2023-09-07  8:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Mon, Aug 14, 2023 at 01:44:36PM +0200, Peter Zijlstra wrote:
> Instead of making increasingly complicated ALTERNATIVE_n()
> implementations, use a nested alternative expression.
> 
> The only difference between:
> 
>   ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2)
> 
> and
> 
>   ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1),
>               newinst2, flag2)

Hmm, one more problem I see with this. You're handling it, it seems, but
the whole thing doesn't feel clean to me.

Here's an exemplary eval:

> #APP
> # 53 "./arch/x86/include/asm/page_64.h" 1
> 	# ALT: oldnstr
> 661:
> 	# ALT: oldnstr
> 661:

<--- X

> 	call clear_page_orig	#
> 662:
> # ALT: padding
> .skip -(((665f-664f)-(662b-661b)) > 0) * ((665f-664f)-(662b-661b)),0x90
> 663:
> .pushsection .altinstructions,"a"
>  .long 661b - .
>  .long 664f - .
>  .4byte ( 3*32+16)
>  .byte 663b-661b
>  .byte 665f-664f
> .popsection
> .pushsection .altinstr_replacement, "ax"
> # ALT: replacement 
> 664:
> 	call clear_page_rep	#
>  665:
> .popsection
> 
> 662:
> # ALT: padding
> .skip -(((665f-664f)-(662b-661b)) > 0) * ((665f-664f)-(662b-661b)),0x90
> 663:

<--- Z

So here it would add the padding again, unnecessarily.

> .pushsection .altinstructions,"a"
>  .long 661b - .

This refers to the 661 label; if you count backwards, it would be the
second 661 label at my marker X above.

>  .long 664f - .

This is the 664 label at my marker Y below.

>  .4byte ( 9*32+ 9)
>  .byte 663b-661b

And here's where it gets interesting. That's the source length. The
backwards 663 label is at marker Z, which includes the second padding.

So if we do a lot of padding, that might grow vmlinux. Not a big deal
but still... Have you measured how much allyesconfig builds grow with
this patch?

>  .byte 665f-664f
> .popsection
> .pushsection .altinstr_replacement, "ax"
> # ALT: replacement 
> 664:

<--- Y

> 	call clear_page_erms	#
>  665:
> .popsection

In any case, I'd still like to solve this in a clean way, without the
fixup and unnecessary padding addition.

Lemme play some more with the preprocessor...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-07  8:31   ` Borislav Petkov
@ 2023-09-07 11:09     ` Peter Zijlstra
  2023-09-07 11:11       ` Peter Zijlstra
  2023-09-07 15:06       ` Borislav Petkov
  0 siblings, 2 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-09-07 11:09 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Thu, Sep 07, 2023 at 10:31:58AM +0200, Borislav Petkov wrote:
> On Mon, Aug 14, 2023 at 01:44:36PM +0200, Peter Zijlstra wrote:
> > Instead of making increasingly complicated ALTERNATIVE_n()
> > implementations, use a nested alternative expression.
> > 
> > The only difference between:
> > 
> >   ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2)
> > 
> > and
> > 
> >   ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1),
> >               newinst2, flag2)
> 
> Hmm, one more problem I see with this. You're handling it, it seems, but
> the whole thing doesn't feel clean to me.
> 
> Here's an exemplary eval:
> 
> > #APP
> > # 53 "./arch/x86/include/asm/page_64.h" 1
> > 	# ALT: oldnstr
> > 661:
> > 	# ALT: oldnstr
> > 661:
> 
> <--- X
> 
> > 	call clear_page_orig	#
> > 662:
> > # ALT: padding
> > .skip -(((665f-664f)-(662b-661b)) > 0) * ((665f-664f)-(662b-661b)),0x90

665f-664f = 5 (rep)
662b-661b = 5 (orig)

5-5 > 0 = 0

so no padding

> > 663:
> > .pushsection .altinstructions,"a"
> >  .long 661b - .
> >  .long 664f - .
> >  .4byte ( 3*32+16)
> >  .byte 663b-661b
> >  .byte 665f-664f
> > .popsection
> > .pushsection .altinstr_replacement, "ax"
> > # ALT: replacement 
> > 664:
> > 	call clear_page_rep	#
> >  665:
> > .popsection
> > 
> > 662:
> > # ALT: padding
> > .skip -(((665f-664f)-(662b-661b)) > 0) * ((665f-664f)-(662b-661b)),0x90
> > 663:
> 
> <--- Z
> 
> So here it would add the padding again, unnecessarily.

665f-664f = 5 (erms)
662b-661b = 5 (orig + padding)

5-5 > 0 = 0

no padding. Also, since, as you note, 661b is the first label, we include
all previous padding, and the .skip will only add additional padding if
the new sequence is longer still.
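
In C terms the chained .skip expressions amount to a running max over the
replacement lengths; a little sketch (not kernel code, slen/rlen stand for
the label differences above):

#include <stdio.h>

static unsigned int alt_total_len(unsigned int orig_len,
				  const unsigned int *repl_len,
				  unsigned int nr)
{
	unsigned int cur = orig_len;	/* 662b-661b of the innermost level */
	unsigned int i;

	for (i = 0; i < nr; i++) {
		/* .skip -(((rlen)-(slen)) > 0) * ((rlen)-(slen)), 0x90 */
		if (repl_len[i] > cur)
			cur = repl_len[i];	/* pad up with NOPs */
	}

	return cur;	/* == max(orig, repl[0], ..., repl[nr-1]) */
}

int main(void)
{
	/* the 1, 3, 2 example below: push %rbp, mov %rsp,%rbp, push %r12 */
	unsigned int repl[] = { 3, 2 };

	printf("%u\n", alt_total_len(1, repl, 2));	/* prints 3 */
	return 0;
}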

So, no I'm not seeing it. Doubly not with this example where all 3
variants are 5 bytes.

Notably, the following nonsense alternative with 1, 2 and 3 byte
instructions:

	asm volatile (
		ALTERNATIVE_2("push %rbp",
			"push %r12", X86_FEATURE_ALWAYS,
			"mov %rsp,%rbp", X86_FEATURE_ALWAYS));

ends up as:

0004  204:      55                      push   %rbp
0005  205:      90                      nop
0006  206:      90                      nop

If you flip the 3 and 2 byte instructions the result is the same. No
extra padding.

And no, I had not actually tested this before, because clearly this is
all obvious ;-)

Anyway, the 1,3,2 variant spelled out reads like:

#APP
# 1563 "../arch/x86/kernel/alternative.c" 1
# ALT: oldnstr
661:
# ALT: oldnstr
661:
push %rbp
662:
# ALT: padding
.skip -(((665f-664f)-(662b-661b)) > 0) * ((665f-664f)-(662b-661b)),0x90

 #   Which evaluates like:
 #     665f-664f = 3
 #     662b-661b = 1
 #     3-1 > 0 = -1
 #     --1 * (3-1) = 2
 #
 #   so two single byte nops get emitted here.

663:
.pushsection .altinstructions,"a"
.long 661b - .
.long 664f - .
.4byte ( 3*32+21)
.byte 663b-661b
.byte 665f-664f
.popsection
.pushsection .altinstr_replacement, "ax"
# ALT: replacement
664:
mov %rsp,%rbp
665:
.popsection

662:
# ALT: padding
.skip -(((665f-664f)-(662b-661b)) > 0) * ((665f-664f)-(662b-661b)),0x90

 #   And this evaluates to:
 #     665f-664f = 2
 #     662b-661b = 3 (because it includes the original 1 byte instruction and 2 bytes padding)
 #     3-1 > 0 = 0
 #     0 * (3-1) = 0
 #
 #   so no extra padding

663:
.pushsection .altinstructions,"a"
.long 661b - .
.long 664f - .
.4byte ( 3*32+21)
.byte 663b-661b
.byte 665f-664f
.popsection
.pushsection .altinstr_replacement, "ax"
# ALT: replacement
664:
push %r12
665:
.popsection

# 0 "" 2
# ../arch/x86/kernel/alternative.c:1569:        int3_selftest();
#NO_APP

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-07 11:09     ` Peter Zijlstra
@ 2023-09-07 11:11       ` Peter Zijlstra
  2023-09-07 11:16         ` Peter Zijlstra
  2023-09-07 15:06       ` Borislav Petkov
  1 sibling, 1 reply; 74+ messages in thread
From: Peter Zijlstra @ 2023-09-07 11:11 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Thu, Sep 07, 2023 at 01:09:17PM +0200, Peter Zijlstra wrote:

> Anyway, the 1,3,2 variant spelled out reads like:
> 
> #APP
> # 1563 "../arch/x86/kernel/alternative.c" 1
> # ALT: oldnstr
> 661:
> # ALT: oldnstr
> 661:
> push %rbp
> 662:
> # ALT: padding
> .skip -(((665f-664f)-(662b-661b)) > 0) * ((665f-664f)-(662b-661b)),0x90
> 
>  #   Which evaluates like:
>  #     665f-664f = 3
>  #     662b-661b = 1
>  #     3-1 > 0 = -1
>  #     --1 * (3-1) = 2
>  #
>  #   so two single byte nops get emitted here.
> 
> 663:
> .pushsection .altinstructions,"a"
> .long 661b - .
> .long 664f - .
> .4byte ( 3*32+21)
> .byte 663b-661b
> .byte 665f-664f
> .popsection
> .pushsection .altinstr_replacement, "ax"
> # ALT: replacement
> 664:
> mov %rsp,%rbp
> 665:
> .popsection
> 
> 662:
> # ALT: padding
> .skip -(((665f-664f)-(662b-661b)) > 0) * ((665f-664f)-(662b-661b)),0x90
> 
>  #   And this evaluates to:
>  #     665f-664f = 2
>  #     662b-661b = 3 (because it includes the original 1 byte instruction and 2 bytes padding)
>  #     3-1 > 0 = 0
>  #     0 * (3-1) = 0

copy-paste fail, that needs to read:

	3-3 > 0 = 0
	0 * (3-3) = 0

>  #
>  #   so no extra padding
> 
> 663:
> .pushsection .altinstructions,"a"
> .long 661b - .
> .long 664f - .
> .4byte ( 3*32+21)
> .byte 663b-661b
> .byte 665f-664f
> .popsection
> .pushsection .altinstr_replacement, "ax"
> # ALT: replacement
> 664:
> push %r12
> 665:
> .popsection
> 
> # 0 "" 2
> # ../arch/x86/kernel/alternative.c:1569:        int3_selftest();
> #NO_APP

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-07 11:11       ` Peter Zijlstra
@ 2023-09-07 11:16         ` Peter Zijlstra
  0 siblings, 0 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-09-07 11:16 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Thu, Sep 07, 2023 at 01:11:00PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 07, 2023 at 01:09:17PM +0200, Peter Zijlstra wrote:
> 
> > Anyway, the 1,3,2 variant spelled out reads like:
> > 
> > #APP
> > # 1563 "../arch/x86/kernel/alternative.c" 1
> > # ALT: oldnstr
> > 661:
> > # ALT: oldnstr
> > 661:
> > push %rbp
> > 662:
> > # ALT: padding
> > .skip -(((665f-664f)-(662b-661b)) > 0) * ((665f-664f)-(662b-661b)),0x90
> > 
> >  #   Which evaluates like:
> >  #     665f-664f = 3
> >  #     662b-661b = 1
> >  #     3-1 > 0 = -1
> >  #     --1 * (3-1) = 2
> >  #
> >  #   so two single byte nops get emitted here.
> > 
> > 663:
> > .pushsection .altinstructions,"a"
> > .long 661b - .
> > .long 664f - .
> > .4byte ( 3*32+21)
> > .byte 663b-661b
> > .byte 665f-664f
> > .popsection
> > .pushsection .altinstr_replacement, "ax"
> > # ALT: replacement
> > 664:
> > mov %rsp,%rbp
> > 665:
> > .popsection
> > 
> > 662:
> > # ALT: padding
> > .skip -(((665f-664f)-(662b-661b)) > 0) * ((665f-664f)-(662b-661b)),0x90
> > 
> >  #   And this evaluates to:
> >  #     665f-664f = 2
> >  #     662b-661b = 3 (because it includes the original 1 byte instruction and 2 bytes padding)
> >  #     3-1 > 0 = 0
> >  #     0 * (3-1) = 0
> 
> copy-paste fail, that needs to read:
> 
> 	3-3 > 0 = 0
> 	0 * (3-3) = 0

I'm a moron, of course:

	2-3

> 
> >  #
> >  #   so no extra padding
> > 
> > 663:
> > .pushsection .altinstructions,"a"
> > .long 661b - .
> > .long 664f - .
> > .4byte ( 3*32+21)
> > .byte 663b-661b
> > .byte 665f-664f
> > .popsection
> > .pushsection .altinstr_replacement, "ax"
> > # ALT: replacement
> > 664:
> > push %r12
> > 665:
> > .popsection
> > 
> > # 0 "" 2
> > # ../arch/x86/kernel/alternative.c:1569:        int3_selftest();
> > #NO_APP

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-07 11:09     ` Peter Zijlstra
  2023-09-07 11:11       ` Peter Zijlstra
@ 2023-09-07 15:06       ` Borislav Petkov
  2023-09-07 15:30         ` Borislav Petkov
  1 sibling, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2023-09-07 15:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Thu, Sep 07, 2023 at 01:09:17PM +0200, Peter Zijlstra wrote:
> If you flip the 3 and 2 byte instructions the result is the same. No
> extra padding.
>
> And no, I had not actually tested this before, because clearly this is
> all obvious ;-)

IKR.

So I take that extra padding thing back - we actually *must* have that
padding so that it actually works correctly. I just did a silly example
but nothing says one cannot do one like that today:

		alternative_2("", "pop %%rax", X86_FEATURE_ALWAYS,
			      "call clear_page_orig", X86_FEATURE_ALWAYS);

An order of insns which grows in size: 0, then 1, then 5.

It turns into:

> # arch/x86/mm/init.c:163: 		alternative_2("", "pop %%rax", X86_FEATURE_ALWAYS,
> # 163 "arch/x86/mm/init.c" 1
> 	# ALT: oldnstr
> 661:
> 	# ALT: oldnstr
> 661:
> 	
> 662:
> # ALT: padding
> .skip -(((665f-664f)-(662b-661b)) > 0) * ((665f-664f)-(662b-661b)),0x90

IINM, this turns into:

.skip 1 * (1 - 0) = 1.

because "pop %rax" is one byte. The original insn is of size 0.

So we end up with a 0x90 here.

> 663:
> .pushsection .altinstructions,"a"
>  .long 661b - .
>  .long 664f - .
>  .4byte ( 3*32+21)
>  .byte 663b-661b
>  .byte 665f-664f
> .popsection
> .pushsection .altinstr_replacement, "ax"
> # ALT: replacement 
> 664:
> 	pop %rax
>  665:
> .popsection
> 
> 662:

<--- X

> # ALT: padding
> .skip -(((665f-664f)-(662b-661b)) > 0) * ((665f-664f)-(662b-661b)),0x90

Now the second guy comes in. That turns into:

.skip 1 * (5 - 1) = 4

Because, IINM, the 662 label above is the *second* one at marker X (we
go backwards) and the 661 is the second one too.

So between those two labels you have the 0x90 - one byte padding from
the first .skip.

And now it adds 4 more bytes to accommodate the CALL.

So we need to have that padding back-to-back in case the second
replacement is longer.
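
Summed up: the first .skip pads max(0, 1 - 0) = 1 byte, the second pads
max(0, 5 - (0 + 1)) = 4 more, so the site converges to max(0, 1, 5) = 5
bytes, all of it one contiguous run of padding.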

Ok, I guess the only thing that's bothering me is:

>       # ALT: oldnstr
> 661:
>       # ALT: oldnstr 
> 661:

I'll keep on playing with this.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-07 15:06       ` Borislav Petkov
@ 2023-09-07 15:30         ` Borislav Petkov
  2023-09-09  7:50           ` Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2023-09-07 15:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Thu, Sep 07, 2023 at 05:06:32PM +0200, Borislav Petkov wrote:
> >       # ALT: oldnstr
> > 661:
> >       # ALT: oldnstr 
> > 661:
> 
> I'll keep on playing with this.

Ok, below's what I've been thinking. It looks ok but I'll keep staring
at it for a while to make sure I'm not missing an angle.

We simply pass a number to the ALTERNATIVE macro, starting from 0: 0 is
the innermost invocation, 1 is the one around it and so on. That way the
labels are unique and the sizes are correct. 0 is hardcoded to mean the
innermost macro invocation and is used for sizing the original insn.

But I might be missing something so lemme poke at it more. Below is
a userspace program which makes this a lot easier to experiment with:

---
#include <stdio.h>

#define __stringify_1(x...)	#x
#define __stringify(x...)	__stringify_1(x)

#define alt_slen		"662b-6610b"
#define alt_total_slen		"663b-6610b"
#define alt_rlen		"665f-664f"

#define OLDINSTR(oldinstr, n)						\
	"# ALT: oldnstr\n"						\
	"661" #n ":\n\t" oldinstr "\n662:\n"					\
	"# ALT: padding\n"						\
	".skip -(((" alt_rlen ")-(" alt_slen ")) > 0) * "		\
		"((" alt_rlen ")-(" alt_slen ")),0x90\n"		\
	"663:\n"

#define ALTINSTR_ENTRY(ft_flags)					      \
	".pushsection .altinstructions,\"a\"\n"				      \
	" .long 6610b - .\n"				/* label           */ \
	" .long 664f - .\n"				/* new instruction */ \
	" .4byte " __stringify(ft_flags) "\n"		/* feature + flags */ \
	" .byte " alt_total_slen "\n"			/* source len      */ \
	" .byte " alt_rlen "\n"				/* replacement len */ \
	".popsection\n"

#define ALTINSTR_REPLACEMENT(newinstr)			/* replacement */	\
	".pushsection .altinstr_replacement, \"ax\"\n"				\
	"# ALT: replacement \n"							\
	"664:\n\t" newinstr "\n 665:\n"						\
	".popsection\n"

/*
 * Define an alternative between two instructions. If @ft_flags is
 * present, early code in apply_alternatives() replaces @oldinstr with
 * @newinstr. ".skip" directive takes care of proper instruction padding
 * in case @newinstr is longer than @oldinstr.
 *
 * Notably: @oldinstr may be an ALTERNATIVE() itself, also see
 *          apply_alternatives()
 */
#define __ALTERNATIVE(oldinstr, newinstr, ft_flags, n)			\
	OLDINSTR(oldinstr, n)						\
	ALTINSTR_ENTRY(ft_flags)					\
	ALTINSTR_REPLACEMENT(newinstr)

#define ALTERNATIVE(oldinstr, newinstr, ft_flags)			\
	__ALTERNATIVE(oldinstr, newinstr, ft_flags, 0)

#define ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2)	\
	__ALTERNATIVE(__ALTERNATIVE(oldinst, newinst1, flag1, 0),		\
		    newinst2, flag2, 1)

#define alternative_2(oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2) \
	asm __inline volatile(ALTERNATIVE_2(oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2) ::: "memory")

int main(void)
{
	alternative_2("", "pop %%rax", 1, "call main", 1);
	return 0;
}
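
Building that with something like "gcc -S" and reading the generated
assembly is the quickest way to see the labels and .skip expressions each
nesting level ends up emitting.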



-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-07 15:30         ` Borislav Petkov
@ 2023-09-09  7:50           ` Borislav Petkov
  2023-09-09  9:25             ` Peter Zijlstra
  0 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2023-09-09  7:50 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Thu, Sep 07, 2023 at 05:30:36PM +0200, Borislav Petkov wrote:
> But I might be missing something so lemme poke at it more.

So my guest boots with the below diff on top of yours. That doesn't mean
a whole lot but it looks like it DTRT. (Btw, the DPRINTK hunk is hoisted
up only for debugging - not part of the change).

And I've backed out your handling of the additional padding because, as
we've established, that's not really additional padding but the padding
which is missing when a subsequent sequence is longer.

It all ends up being a single consecutive region of padding as it should
be.

Building that says:

arch/x86/entry/entry_64.o: warning: objtool: entry_SYSCALL_64+0x91: weirdly overlapping alternative! 5 != 16
arch/x86/entry/entry_64_compat.o: warning: objtool: entry_SYSENTER_compat+0x80: weirdly overlapping alternative! 5 != 16

but that warning is bogus because the code in question is the
UNTRAIN_RET macro which has an empty orig insn, then two CALLs of size
5 and then the RESET_CALL_DEPTH sequence which is 16 bytes.

At build time it looks like this:

ffffffff81c000d1:       90                      nop
ffffffff81c000d2:       90                      nop
ffffffff81c000d3:       90                      nop
ffffffff81c000d4:       90                      nop
ffffffff81c000d5:       90                      nop
ffffffff81c000d6:       90                      nop
ffffffff81c000d7:       90                      nop
ffffffff81c000d8:       90                      nop
ffffffff81c000d9:       90                      nop
ffffffff81c000da:       90                      nop
ffffffff81c000db:       90                      nop
ffffffff81c000dc:       90                      nop
ffffffff81c000dd:       90                      nop
ffffffff81c000de:       90                      nop
ffffffff81c000df:       90                      nop
ffffffff81c000e0:       90                      nop

and those are 16 contiguous NOPs of padding.

At boot time, it does:

[    0.679523] SMP alternatives: feat: 11*32+15, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 5), repl: (ffffffff833a362b, len: 5)
[    0.683516] SMP alternatives: ffffffff81c000d1: [0:5) optimized NOPs: 0f 1f 44 00 00

That first one is X86_FEATURE_UNRET and the alt_instr descriptor simply
says that the replacement is 5 bytes long, which is the CALL that can
potentially be poked in. It doesn't care about the following 11 bytes of
padding because it doesn't matter - it wants 5 bytes only for the CALL.

[    0.687514] SMP alternatives: feat: 11*32+10, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 5), repl: (ffffffff833a3630, len: 5)
[    0.691521] SMP alternatives: ffffffff81c000d1: [0:5) optimized NOPs: 0f 1f 44 00 00

This is X86_FEATURE_ENTRY_IBPB. Same thing.

[    0.695515] SMP alternatives: feat: 11*32+19, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 16), repl: (ffffffff833a3635, len: 16)
[    0.699516] SMP alternatives: ffffffff81c000d1: [0:16) optimized NOPs: eb 0e cc cc cc cc cc cc cc cc cc cc cc cc cc cc

And this is X86_FEATURE_CALL_DEPTH; here the alt_instr descriptor has a
replacement length of 16, and that is all still OK as it starts at the
same address and contains the first 5 bytes from the previous entries,
which overlap here.

So address-wise we're good, the alt_instr patching descriptors are
correct and we should be good.

Thoughts?

---

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index df128ff49d60..de612307ed1e 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -150,13 +150,13 @@ static inline int alternatives_text_reserved(void *start, void *end)
 }
 #endif	/* CONFIG_SMP */
 
-#define alt_slen		"662b-661b"
-#define alt_total_slen		"663b-661b"
+#define alt_slen		"662b-6610b"
+#define alt_total_slen		"663b-6610b"
 #define alt_rlen		"665f-664f"
 
-#define OLDINSTR(oldinstr)						\
-	"# ALT: oldnstr\n"						\
-	"661:\n\t" oldinstr "\n662:\n"					\
+#define OLDINSTR(oldinstr, n)						\
+	"# ALT: oldinstr\n"						\
+	"661" #n ":\n\t" oldinstr "\n662:\n"					\
 	"# ALT: padding\n"						\
 	".skip -(((" alt_rlen ")-(" alt_slen ")) > 0) * "		\
 		"((" alt_rlen ")-(" alt_slen ")),0x90\n"		\
@@ -164,7 +164,7 @@ static inline int alternatives_text_reserved(void *start, void *end)
 
 #define ALTINSTR_ENTRY(ft_flags)					      \
 	".pushsection .altinstructions,\"a\"\n"				      \
-	" .long 661b - .\n"				/* label           */ \
+	" .long 6610b - .\n"				/* label           */ \
 	" .long 664f - .\n"				/* new instruction */ \
 	" .4byte " __stringify(ft_flags) "\n"		/* feature + flags */ \
 	" .byte " alt_total_slen "\n"			/* source len      */ \
@@ -185,15 +185,25 @@ static inline int alternatives_text_reserved(void *start, void *end)
  *
  * Notably: @oldinstr may be an ALTERNATIVE() itself, also see
  *          apply_alternatives()
+ *
+ * @n: nesting level. Because those macros are nested, in order to
+ * compute the source length and the total source length including the
+ * padding, the nesting level is used to define unique labels. The
+ * nesting level increases from the innermost macro invocation outwards,
+ * starting with 0. Thus, the correct starting label of oldinstr is
+ * 6610 which is hardcoded in the macros above.
  */
-#define ALTERNATIVE(oldinstr, newinstr, ft_flags)			\
-	OLDINSTR(oldinstr)						\
+#define __ALTERNATIVE(oldinstr, newinstr, ft_flags, n)			\
+	OLDINSTR(oldinstr, n)						\
 	ALTINSTR_ENTRY(ft_flags)					\
 	ALTINSTR_REPLACEMENT(newinstr)
 
+#define ALTERNATIVE(oldinstr, newinstr, ft_flags)			\
+	__ALTERNATIVE(oldinstr, newinstr, ft_flags, 0)
+
 #define ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2)	\
-	ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1),		\
-		    newinst2, flag2)
+	__ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1),		\
+		    newinst2, flag2, 1)
 
 /* If @feature is set, patch in @newinstr_yes, otherwise @newinstr_no. */
 #define ALTERNATIVE_TERNARY(oldinstr, ft_flags, newinstr_yes, newinstr_no) \
@@ -202,8 +212,8 @@ static inline int alternatives_text_reserved(void *start, void *end)
 
 #define ALTERNATIVE_3(oldinst, newinst1, flag1, newinst2, flag2,	\
 		      newinst3, flag3)					\
-	ALTERNATIVE(ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2), \
-		    newinst3, flag3)
+	__ALTERNATIVE(ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2), \
+		    newinst3, flag3, 2)
 
 /*
  * Alternative instructions for different CPU types or capabilities.
@@ -328,14 +338,18 @@ static inline int alternatives_text_reserved(void *start, void *end)
 	.byte \alt_len
 .endm
 
-#define __ALTERNATIVE(oldinst, newinst, flag)				\
-140:									\
+/*
+ * Make sure the innermost macro invocation passes in as label "1400"
+ * as it is used for @oldinst sizing further down here.
+ */
+#define __ALTERNATIVE(oldinst, newinst, flag, label)			\
+label:									\
 	oldinst	;							\
 141:									\
-	.skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90	;\
+	.skip -(((144f-143f)-(141b-1400b)) > 0) * ((144f-143f)-(141b-1400b)),0x90	;\
 142:									\
 	.pushsection .altinstructions,"a" ;				\
-	altinstr_entry 140b,143f,flag,142b-140b,144f-143f ;		\
+	altinstr_entry 1400b,143f,flag,142b-1400b,144f-143f ;		\
 	.popsection ;							\
 	.pushsection .altinstr_replacement,"ax"	;			\
 143:									\
@@ -350,7 +364,7 @@ static inline int alternatives_text_reserved(void *start, void *end)
  * in case @newinstr is longer than @oldinstr.
  */
 .macro ALTERNATIVE oldinstr, newinstr, ft_flags
-	__ALTERNATIVE(\oldinstr, \newinstr, \ft_flags)
+	__ALTERNATIVE(\oldinstr, \newinstr, \ft_flags, 1400)
 .endm
 
 /*
@@ -359,14 +373,14 @@ static inline int alternatives_text_reserved(void *start, void *end)
  * @feature2, it replaces @oldinstr with @feature2.
  */
 .macro ALTERNATIVE_2 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2
-	__ALTERNATIVE(__ALTERNATIVE(\oldinstr, \newinstr1, \ft_flags1),
-		      \newinstr2, \ft_flags2)
+	__ALTERNATIVE(__ALTERNATIVE(\oldinstr, \newinstr1, \ft_flags1, 1400),
+		      \newinstr2, \ft_flags2, 1401)
 .endm
 
 .macro ALTERNATIVE_3 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2, newinstr3, ft_flags3
-	__ALTERNATIVE(__ALTERNATIVE(__ALTERNATIVE(\oldinstr, \newinstr1, \ft_flags1),
-				    \newinstr2, \ft_flags2),
-		      \newinstr3, \ft_flags3)
+	__ALTERNATIVE(__ALTERNATIVE(__ALTERNATIVE(\oldinstr, \newinstr1, \ft_flags1, 1400),
+				    \newinstr2, \ft_flags2, 1401),
+		      \newinstr3, \ft_flags3, 1402)
 .endm
 
 /* If @feature is set, patch in @newinstr_yes, otherwise @newinstr_no. */
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index fa9eb5f1ff1e..aa0ea0317127 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -398,7 +398,7 @@ apply_relocation(u8 *buf, size_t len, u8 *dest, u8 *src, size_t src_len)
 void __init_or_module noinline apply_alternatives(struct alt_instr *start,
 						  struct alt_instr *end)
 {
-	struct alt_instr *a, *b;
+	struct alt_instr *a;
 	u8 *instr, *replacement;
 	u8 insn_buff[MAX_PATCH_LEN];
 
@@ -415,22 +415,18 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
 	for (a = start; a < end; a++) {
 		int insn_buff_sz = 0;
 
-		/*
-		 * In case of nested ALTERNATIVE()s the outer alternative might
-		 * add more padding. To ensure consistent patching find the max
-		 * padding for all alt_instr entries for this site (nested
-		 * alternatives result in consecutive entries).
-		 */
-		for (b = a+1; b < end && b->instr_offset == a->instr_offset; b++) {
-			u8 len = max(a->instrlen, b->instrlen);
-			a->instrlen = b->instrlen = len;
-		}
-
 		instr = (u8 *)&a->instr_offset + a->instr_offset;
 		replacement = (u8 *)&a->repl_offset + a->repl_offset;
 		BUG_ON(a->instrlen > sizeof(insn_buff));
 		BUG_ON(a->cpuid >= (NCAPINTS + NBUGINTS) * 32);
 
+		DPRINTK(ALT, "feat: %s%d*32+%d, old: (%pS (%px) len: %d), repl: (%px, len: %d)",
+			(a->flags & ALT_FLAG_NOT) ? "!" : "",
+			a->cpuid >> 5,
+			a->cpuid & 0x1f,
+			instr, instr, a->instrlen,
+			replacement, a->replacementlen);
+
 		/*
 		 * Patch if either:
 		 * - feature is present
diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
index a4ecb04d8d64..37328ffc72bb 100644
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -109,7 +109,7 @@ static inline u64 xfeatures_mask_independent(void)
  *
  * We use XSAVE as a fallback.
  *
- * The 661 label is defined in the ALTERNATIVE* macros as the address of the
+ * The 6610 label is defined in the ALTERNATIVE* macros as the address of the
  * original instruction which gets replaced. We need to use it here as the
  * address of the instruction where we might get an exception at.
  */
@@ -121,7 +121,7 @@ static inline u64 xfeatures_mask_independent(void)
 		     "\n"						\
 		     "xor %[err], %[err]\n"				\
 		     "3:\n"						\
-		     _ASM_EXTABLE_TYPE_REG(661b, 3b, EX_TYPE_EFAULT_REG, %[err]) \
+		     _ASM_EXTABLE_TYPE_REG(6610b, 3b, EX_TYPE_EFAULT_REG, %[err]) \
 		     : [err] "=r" (err)					\
 		     : "D" (st), "m" (*st), "a" (lmask), "d" (hmask)	\
 		     : "memory")
@@ -135,7 +135,7 @@ static inline u64 xfeatures_mask_independent(void)
 				 XRSTORS, X86_FEATURE_XSAVES)		\
 		     "\n"						\
 		     "3:\n"						\
-		     _ASM_EXTABLE_TYPE(661b, 3b, EX_TYPE_FPU_RESTORE)	\
+		     _ASM_EXTABLE_TYPE(6610b, 3b, EX_TYPE_FPU_RESTORE)	\
 		     :							\
 		     : "D" (st), "m" (*st), "a" (lmask), "d" (hmask)	\
 		     : "memory")
diff --git a/tools/objtool/arch/x86/special.c b/tools/objtool/arch/x86/special.c
index 7145920a7aba..29e949579ede 100644
--- a/tools/objtool/arch/x86/special.c
+++ b/tools/objtool/arch/x86/special.c
@@ -9,29 +9,6 @@
 
 void arch_handle_alternative(unsigned short feature, struct special_alt *alt)
 {
-	static struct special_alt *group, *prev;
-
-	/*
-	 * Recompute orig_len for nested ALTERNATIVE()s.
-	 */
-	if (group && group->orig_sec == alt->orig_sec &&
-	             group->orig_off == alt->orig_off) {
-
-		struct special_alt *iter = group;
-		for (;;) {
-			unsigned int len = max(iter->orig_len, alt->orig_len);
-			iter->orig_len = alt->orig_len = len;
-
-			if (iter == prev)
-				break;
-
-			iter = list_next_entry(iter, list);
-		}
-
-	} else group = alt;
-
-	prev = alt;
-
 	switch (feature) {
 	case X86_FEATURE_SMAP:
 		/*

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-09  7:50           ` Borislav Petkov
@ 2023-09-09  9:25             ` Peter Zijlstra
  2023-09-09  9:42               ` Peter Zijlstra
  2023-09-10 14:42               ` Borislav Petkov
  0 siblings, 2 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-09-09  9:25 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Sat, Sep 09, 2023 at 09:50:09AM +0200, Borislav Petkov wrote:
> On Thu, Sep 07, 2023 at 05:30:36PM +0200, Borislav Petkov wrote:
> > But I might be missing something so lemme poke at it more.
> 
> So my guest boots with the below diff ontop of yours. That doesn't mean
> a whole lot but it looks like it DTRT. (Btw, the DPRINTK hunk is hoisted
> up only for debugging - not part of the change).
> 
> And I've backed out your handling of the additional padding because, as
> we've established, that's not really additional padding but the padding
> which is missing when a subsequent sequence is longer.
> 
> It all ends up being a single consecutive region of padding as it should
> be.
> 
> Building that says:
> 
> arch/x86/entry/entry_64.o: warning: objtool: entry_SYSCALL_64+0x91: weirdly overlapping alternative! 5 != 16
> arch/x86/entry/entry_64_compat.o: warning: objtool: entry_SYSENTER_compat+0x80: weirdly overlapping alternative! 5 != 16
> 
> but that warning is bogus because the code in question is the
> UNTRAIN_RET macro which has an empty orig insn, then two CALLs of size
> 5 and then the RESET_CALL_DEPTH sequence which is 16 bytes.
> 
> At build time it looks like this:
> 
> ffffffff81c000d1:       90                      nop
> ffffffff81c000d2:       90                      nop
> ffffffff81c000d3:       90                      nop
> ffffffff81c000d4:       90                      nop
> ffffffff81c000d5:       90                      nop
> ffffffff81c000d6:       90                      nop
> ffffffff81c000d7:       90                      nop
> ffffffff81c000d8:       90                      nop
> ffffffff81c000d9:       90                      nop
> ffffffff81c000da:       90                      nop
> ffffffff81c000db:       90                      nop
> ffffffff81c000dc:       90                      nop
> ffffffff81c000dd:       90                      nop
> ffffffff81c000de:       90                      nop
> ffffffff81c000df:       90                      nop
> ffffffff81c000e0:       90                      nop
> 
> and those are 16 contiguous NOPs of padding.
> 
> At boot time, it does:
> 
> [    0.679523] SMP alternatives: feat: 11*32+15, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 5), repl: (ffffffff833a362b, len: 5)
> [    0.683516] SMP alternatives: ffffffff81c000d1: [0:5) optimized NOPs: 0f 1f 44 00 00
> 
> That first one is X86_FEATURE_UNRET and the alt_instr descriptor simply
> says that the replacement is 5 bytes long, which is the CALL that can
> potentially be poked in. It doesn't care about the following 11 bytes of
> padding because it doesn't matter - it wants 5 bytes only for the CALL.
> 
> [    0.687514] SMP alternatives: feat: 11*32+10, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 5), repl: (ffffffff833a3630, len: 5)
> [    0.691521] SMP alternatives: ffffffff81c000d1: [0:5) optimized NOPs: 0f 1f 44 00 00
> 
> This is X86_FEATURE_ENTRY_IBPB. Same thing.
> 
> [    0.695515] SMP alternatives: feat: 11*32+19, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 16), repl: (ffffffff833a3635, len: 16)
> [    0.699516] SMP alternatives: ffffffff81c000d1: [0:16) optimized NOPs: eb 0e cc cc cc cc cc cc cc cc cc cc cc cc cc cc
> 
> And this is X86_FEATURE_CALL_DEPTH and here the alt_instr descriptor has
> replacement length of 16 and that is all still ok as it starts at the
> same address and contains the first 5 bytes from the previous entries
> which overlap here.
> 
> So address-wise we're good, the alt_instr patching descriptors are
> correct and we should be good.
> 
> Thoughts?
> 
> ---
> 

> @@ -415,22 +415,18 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
>  	for (a = start; a < end; a++) {
>  		int insn_buff_sz = 0;
>  
> -		/*
> -		 * In case of nested ALTERNATIVE()s the outer alternative might
> -		 * add more padding. To ensure consistent patching find the max
> -		 * padding for all alt_instr entries for this site (nested
> -		 * alternatives result in consecutive entries).
> -		 */
> -		for (b = a+1; b < end && b->instr_offset == a->instr_offset; b++) {
> -			u8 len = max(a->instrlen, b->instrlen);
> -			a->instrlen = b->instrlen = len;
> -		}
> -
>  		instr = (u8 *)&a->instr_offset + a->instr_offset;
>  		replacement = (u8 *)&a->repl_offset + a->repl_offset;
>  		BUG_ON(a->instrlen > sizeof(insn_buff));
>  		BUG_ON(a->cpuid >= (NCAPINTS + NBUGINTS) * 32);
>  

> diff --git a/tools/objtool/arch/x86/special.c b/tools/objtool/arch/x86/special.c
> index 7145920a7aba..29e949579ede 100644
> --- a/tools/objtool/arch/x86/special.c
> +++ b/tools/objtool/arch/x86/special.c
> @@ -9,29 +9,6 @@
>  
>  void arch_handle_alternative(unsigned short feature, struct special_alt *alt)
>  {
> -	static struct special_alt *group, *prev;
> -
> -	/*
> -	 * Recompute orig_len for nested ALTERNATIVE()s.
> -	 */
> -	if (group && group->orig_sec == alt->orig_sec &&
> -	             group->orig_off == alt->orig_off) {
> -
> -		struct special_alt *iter = group;
> -		for (;;) {
> -			unsigned int len = max(iter->orig_len, alt->orig_len);
> -			iter->orig_len = alt->orig_len = len;
> -
> -			if (iter == prev)
> -				break;
> -
> -			iter = list_next_entry(iter, list);
> -		}
> -
> -	} else group = alt;
> -
> -	prev = alt;
> -
>  	switch (feature) {
>  	case X86_FEATURE_SMAP:
>  		/*

Yeah, that wasn't optional.

So what you end up with is:

661:
  "one byte orig insn"
  "one nop because alt1 is 2 bytes"
  "one nop because alt2 is 3 bytes"

right?

But your alt_instr are:

  alt_instr1 = {
 	.instr_offset = 661b-.; /* .text location */
	.repl_offset = 664f-.;  /* .altinstr_replacement location */

	/* .ft_flags */

	.instrlen = 2;
	.replacementlen = 2;
  }

  alt_instr2 = {
  	.instr_offset = 661b-.;
	.repl_offset = 664f-.;

	/* .ft_flags */

	.instrlen = 3;
	.replacementlen = 3;
  }


So if you patch alt2, you will only patch 2 bytes of the original text,
even though that has 3 bytes of 'space'.


This becomes more of a problem with your example above where the
respective lengths are 0, 5, 16. In that case, when you patch 5, you'll
leave 11 single nops in there.

So what that code you deleted does is look for all alternatives that
start at the same point and computes the max replacementlen, because
that is the amount of bytes in the source text that has been reserved
for this alternative.

That is not optional.
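
To spell it out, a standalone toy version of that fixup -- essentially
the loop from the hunk you removed, with made-up data; instr_offset here
merely stands in for "same patch site":

#include <stdio.h>

struct alt_instr { int instr_offset; unsigned char instrlen; };

/* Nested ALTERNATIVE()s emit consecutive alt_instr entries for the same
 * site; widen all of them to the max instrlen so the whole reserved
 * range gets patched/NOP-padded consistently. */
static void fixup_site_lengths(struct alt_instr *start, struct alt_instr *end)
{
	for (struct alt_instr *a = start; a < end; a++) {
		for (struct alt_instr *b = a + 1;
		     b < end && b->instr_offset == a->instr_offset; b++) {
			unsigned char len = a->instrlen > b->instrlen ?
					    a->instrlen : b->instrlen;
			a->instrlen = b->instrlen = len;
		}
	}
}

int main(void)
{
	/* three entries for one site, lengths 5, 5, 16 -- the UNTRAIN_RET case */
	struct alt_instr alt[] = { { 0x100, 5 }, { 0x100, 5 }, { 0x100, 16 } };

	fixup_site_lengths(alt, alt + 3);

	for (int i = 0; i < 3; i++)
		printf("entry %d: instrlen %d\n", i, alt[i].instrlen);

	return 0;	/* all three entries end up with instrlen 16 */
}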

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-09  9:25             ` Peter Zijlstra
@ 2023-09-09  9:42               ` Peter Zijlstra
  2023-09-10 14:42               ` Borislav Petkov
  1 sibling, 0 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-09-09  9:42 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Sat, Sep 09, 2023 at 11:25:54AM +0200, Peter Zijlstra wrote:
> This becomes more of a problem with your example above where the
> respective lengths are 0, 5, 16. In that case, when you patch 5, you'll
> leave 11 single nops in there.
> 
> So what that code you deleted does is look for all alternatives that
> start at the same point and computes the max replacementlen, because
> that is the amount of bytes in the source text that has been reserved
> for this alternative.
> 
> That is not optional.

Note that the original alternatives did this with the alt_max_*() macro
at build time.
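
FWIW, that is the branchless integer max from the bithacks link in the
old macro comments; in plain C it reads roughly like the below -- the gas
flavour only needs the extra '-' because gas' "true" is -1:

#include <assert.h>

/* If a < b the mask is all ones and the XORs select b, otherwise the
 * mask is zero and a comes back unchanged. */
static unsigned int alt_max(unsigned int a, unsigned int b)
{
	unsigned int mask = -(unsigned int)(a < b);

	return a ^ ((a ^ b) & mask);
}

int main(void)
{
	assert(alt_max(2, 3) == 3);
	assert(alt_max(16, 5) == 16);
	assert(alt_max(7, 7) == 7);
	return 0;
}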

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-09  9:25             ` Peter Zijlstra
  2023-09-09  9:42               ` Peter Zijlstra
@ 2023-09-10 14:42               ` Borislav Petkov
  2023-09-12  9:27                 ` Peter Zijlstra
  1 sibling, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2023-09-10 14:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Sat, Sep 09, 2023 at 11:25:54AM +0200, Peter Zijlstra wrote:
> So what you end up with is:
> 
> 661:
>   "one byte orig insn"
>   "one nop because alt1 is 2 bytes"
>   "one nop because alt2 is 3 bytes"
> 
> right?

Right.

> This becomes more of a problem with your example above where the
> respective lengths are 0, 5, 16. In that case, when you patch 5, you'll
> leave 11 single nops in there.

Well, I know what you mean but the code handles that gracefully and it
works. Watch this:

I made it always apply the second one and not apply the third, the
longest one:

.macro UNTRAIN_RET
#if defined(CONFIG_CPU_UNRET_ENTRY) || defined(CONFIG_CPU_IBPB_ENTRY) || \
        defined(CONFIG_CALL_DEPTH_TRACKING) || defined(CONFIG_CPU_SRSO)
        VALIDATE_UNRET_END
        ALTERNATIVE_3 "",                                               \
                      CALL_UNTRAIN_RET, X86_FEATURE_UNRET,              \
                      "call entry_ibpb", X86_FEATURE_ALWAYS,            \
                      __stringify(RESET_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
#endif
.endm

So it comes in and pokes in the padding for the first one:
X86_FEATURE_UNRET

[    0.903506] SMP alternatives: feat: 11*32+15, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 5), repl: (ffffffff833a362b, len: 5)
[    0.911256] SMP alternatives: ffffffff81c000d1: [0:5) optimized NOPs: 0f 1f 44 00 00

Then patches in the entry_ibpb call:

[    0.916849] SMP alternatives: feat: 3*32+21, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 5), repl: (ffffffff833a3630, len: 5)
[    0.924844] SMP alternatives: ffffffff81c000d1:   old_insn: 0f 1f 44 00 00
[    0.928842] SMP alternatives: ffffffff833a3630:   rpl_insn: e8 5b 9e 81 fe
[    0.932849] SMP alternatives: ffffffff81c000d1: final_insn: e8 ba d3 fb ff

and now it comes to the call depth thing which is of size 16:

[    0.936845] SMP alternatives: feat: 11*32+19, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 16), repl: (ffffffff833a3635, len: 16)
[    0.940844] SMP alternatives: __optimize_nops: next: 5, insn len: 5

and this is why it works: __optimize_nops() is cautious enough to do
insn_is_nop(), and since what sits there is the CALL insn e8 ba d3 fb ff,
it skips over it:

[    0.944852] SMP alternatives: __optimize_nops: next: 6, insn len: 1

The next one is a NOP and it patches the rest of it, resulting in an
11-byte NOP:

[    0.950758] SMP alternatives: ffffffff81c000d1: [5:16) optimized NOPs: e8 ba d3 fb ff 66 66 2e 0f 1f 84 00 00 00 00 00

So we're good here without this max(repl_len) thing even if it is the
right thing to do.
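
To visualize that step, a toy model of the coalescing -- not the real
__optimize_nops(), which decodes actual x86 instructions, just the idea
of skipping non-NOPs and merging NOP runs:

#include <stdio.h>

/* Toy instruction stream: length in bytes plus whether it is a NOP. */
struct insn { int len; int is_nop; };

static void optimize_nops_model(struct insn *in, int n)
{
	for (int i = 0; i < n; i++) {
		int run, j;

		/* leave non-NOPs (e.g. the freshly patched CALL) alone */
		if (!in[i].is_nop)
			continue;

		/* collapse a run of NOPs into one long NOP of the combined length */
		run = in[i].len;
		for (j = i + 1; j < n && in[j].is_nop; j++)
			run += in[j].len;

		printf("insns %d..%d become one %d-byte NOP\n", i, j - 1, run);
		i = j - 1;
	}
}

int main(void)
{
	/* the site above after the entry_ibpb patch: a 5-byte CALL + 11 NOPs */
	struct insn site[12] = { { 5, 0 } };

	for (int i = 1; i < 12; i++)
		site[i] = (struct insn){ 1, 1 };

	optimize_nops_model(site, 12);	/* insns 1..11 become one 11-byte NOP */
	return 0;
}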

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-10 14:42               ` Borislav Petkov
@ 2023-09-12  9:27                 ` Peter Zijlstra
  2023-09-12  9:44                   ` Peter Zijlstra
  2023-09-13  4:24                   ` Borislav Petkov
  0 siblings, 2 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-09-12  9:27 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Sun, Sep 10, 2023 at 04:42:27PM +0200, Borislav Petkov wrote:
> On Sat, Sep 09, 2023 at 11:25:54AM +0200, Peter Zijlstra wrote:
> > So what you end up with is:
> > 
> > 661:
> >   "one byte orig insn"
> >   "one nop because alt1 is 2 bytes"
> >   "one nop because alt2 is 3 bytes"
> > 
> > right?
> 
> Right.
> 
> > This becomes more of a problem with your example above where the
> > respective lengths are 0, 5, 16. In that case, when you patch 5, you'll
> > leave 11 single nops in there.
> 
> Well, I know what you mean but the code handles that gracefully and it
> works. Watch this:

Aah, because we run optimize_nops() for all alternatives, irrespective
of it being selected. And thus also for the longest and then that'll fix
things up.

OK, let me check on objtool.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-12  9:27                 ` Peter Zijlstra
@ 2023-09-12  9:44                   ` Peter Zijlstra
  2023-09-13  4:37                     ` Borislav Petkov
  2023-09-13  4:24                   ` Borislav Petkov
  1 sibling, 1 reply; 74+ messages in thread
From: Peter Zijlstra @ 2023-09-12  9:44 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Tue, Sep 12, 2023 at 11:27:09AM +0200, Peter Zijlstra wrote:
> On Sun, Sep 10, 2023 at 04:42:27PM +0200, Borislav Petkov wrote:
> > On Sat, Sep 09, 2023 at 11:25:54AM +0200, Peter Zijlstra wrote:
> > > So what you end up with is:
> > > 
> > > 661:
> > >   "one byte orig insn"
> > >   "one nop because alt1 is 2 bytes"
> > >   "one nop because alt2 is 3 bytes"
> > > 
> > > right?
> > 
> > Right.
> > 
> > > This becomes more of a problem with your example above where the
> > > respective lengths are 0, 5, 16. In that case, when you patch 5, you'll
> > > leave 11 single nops in there.
> > 
> > Well, I know what you mean but the code handles that gracefully and it
> > works. Watch this:
> 
> Aah, because we run optimize_nops() for all alternatives, irrespective
> of it being selected. And thus also for the longest and then that'll fix
> things up.
> 
> OK, let me check on objtool.

OK, I think objtool really does need the hunk you took out.

The problem there is that we're having to create ORC data that is valid
for all possible alternatives -- there is only one ORC table (unless we
go dynamically patch the ORC table too, but so far we've managed to
avoid doing that).

The constraint we have is that for every address the ORC data must match
between the alternatives, but because x86 is a variable length
instruction encoding we can (and do) play games. As long as the
instruction addresses do not line up, they can have different ORC data.

One place where this matters is the tail, if we consider this a string
of single byte nops, that forces a bunch of ORC state to match. So what
we do is that we assume the tail is a single large NOP, this way we get
minimal overlap / ORC conflicts.

As such, we need to know the max length when constructing the
alternatives, otherwise you get short alternatives jumping to somewhere
in the middle of the actual range and well, see above.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-12  9:27                 ` Peter Zijlstra
  2023-09-12  9:44                   ` Peter Zijlstra
@ 2023-09-13  4:24                   ` Borislav Petkov
  1 sibling, 0 replies; 74+ messages in thread
From: Borislav Petkov @ 2023-09-13  4:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Tue, Sep 12, 2023 at 11:27:09AM +0200, Peter Zijlstra wrote:
> Aah, because we run optimize_nops() for all alternatives, irrespective
> of it being selected.

Yeah, and I remember us talking about it the last time you did it and
how it would be a good idea to do that but be careful about it...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-12  9:44                   ` Peter Zijlstra
@ 2023-09-13  4:37                     ` Borislav Petkov
  2023-09-13  8:46                       ` Peter Zijlstra
  0 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2023-09-13  4:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Tue, Sep 12, 2023 at 11:44:41AM +0200, Peter Zijlstra wrote:
> OK, I think objtool really does need the hunk you took out.
> 
> The problem there is that we're having to create ORC data that is valid
> for all possible alternatives -- there is only one ORC table (unless we
> go dynamically patch the ORC table too, but so far we've managed to
> avoid doing that).
> 
> The constraint we have is that for every address the ORC data must match
> between the alternatives, but because x86 is a variable length
> instruction encoding we can (and do) play games. As long as the
> instruction addresses do not line up, they can have different ORC data.
> 
> One place where this matters is the tail, if we consider this a string
> of single byte nops, that forces a bunch of ORC state to match. So what
> we do is that we assume the tail is a single large NOP, this way we get
> minimal overlap / ORC conflicts.
> 
> As such, we need to know the max length when constructing the
> alternatives, otherwise you get short alternatives jumping to somewhere
> in the middle of the actual range and well, see above.

Lemme make sure I understand this correctly. We have a three-way
alternative in our example with the descriptors saying this:

feat: 11*32+15, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 5), repl: (ffffffff833a362b, len: 5)
feat: 3*32+21, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 5), repl: (ffffffff833a3630, len: 5)
feat: 11*32+19, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 16), repl: (ffffffff833a3635, len: 16)

i.e., the address to patch each time is ffffffff81c000d1, and the length
is different - 5, 5 and 16.

So that ORC data is tracking the starting address and the length?

I guess I don't fully understand the "middle of the actual range" thing
because you don't really have a middle - you have the starting address
and a length.

Or are you saying that the differing length would cause ORC conflicts?

In any case, I guess I could extend your commit with what we've figured
out in this thread, send a new version of what I think it should look
like, and start testing it on my pile of hw next week when I get
back...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-13  4:37                     ` Borislav Petkov
@ 2023-09-13  8:46                       ` Peter Zijlstra
  2023-09-13 14:38                         ` Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Peter Zijlstra @ 2023-09-13  8:46 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Wed, Sep 13, 2023 at 06:37:38AM +0200, Borislav Petkov wrote:
> On Tue, Sep 12, 2023 at 11:44:41AM +0200, Peter Zijlstra wrote:
> > OK, I think objtool really does need the hunk you took out.
> > 
> > The problem there is that we're having to create ORC data that is valid
> > for all possible alternatives -- there is only one ORC table (unless we
> > go dynamically patch the ORC table too, but so far we've managed to
> > avoid doing that).
> > 
> > The constraint we have is that for every address the ORC data must match
> > between the alternatives, but because x86 is a variable length
> > instruction encoding we can (and do) play games. As long as the
> > instruction addresses do not line up, they can have different ORC data.
> > 
> > One place where this matters is the tail, if we consider this a string
> > of single byte nops, that forces a bunch of ORC state to match. So what
> > we do is that we assume the tail is a single large NOP, this way we get
> > minimal overlap / ORC conflicts.
> > 
> > As such, we need to know the max length when constructing the
> > alternatives, otherwise you get short alternatives jumping to somewhere
> > in the middle of the actual range and well, see above.
> 
> Lemme make sure I understand this correctly. We have a three-way
> alternative in our example with the descriptors saying this:
> 
> feat: 11*32+15, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 5), repl: (ffffffff833a362b, len: 5)
> feat: 3*32+21, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 5), repl: (ffffffff833a3630, len: 5)
> feat: 11*32+19, old: (entry_SYSCALL_64_after_hwframe+0x59/0xd8 (ffffffff81c000d1) len: 16), repl: (ffffffff833a3635, len: 16)
> 
> i.e., the address to patch each time is ffffffff81c000d1, and the length
> is different - 5, 5 and 16.
> 
> So that ORC data is tracking the starting address and the length?

No, ORC data tracks the address of every instruction that can possibly
exist in that range -- with the constraint that if two instructions have
the same address, the ORC data must match.

To reduce instruction edges in that range, we make sure the tail is a
single large instruction to the end of the alternative.

But since we now have variable-length alternatives, we must find the
max length.

> I guess I don't fully understand the "middle of the actual range" thing
> because you don't really have a middle - you have the starting address
> and a length.

The alternative in the source location is of size max-length, because
there must be room to patch in the longest alternative.

If you allow short alternatives you get:

	CALL entry_untrain_ret
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	
Which is significantly different from:

	CALL entry_untrain_ret
	nop11

In that it has about 10 fewer ORC entries. But in order to build that
nop11 we must know the max size.

> Or are you saying that the differing length would cause ORC conflicts?

Yes, see above, the short alternative will want to continue at +5, but
we have a string of 1 byte nops there, and this will then constrain
things.

What objtool does/wants is to make them all the same size so all the
tails are a single instruction to +16, so that we can disregard what is
in the actual tail.

We've gone over this multiple times already, also see commit
6c480f222128. That made sure the kernel side adhered to this scheme by
making the tail a single instruction irrespective of the length.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-13  8:46                       ` Peter Zijlstra
@ 2023-09-13 14:38                         ` Borislav Petkov
  2023-09-13 16:14                           ` Peter Zijlstra
  2023-09-15  7:46                           ` Peter Zijlstra
  0 siblings, 2 replies; 74+ messages in thread
From: Borislav Petkov @ 2023-09-13 14:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Wed, Sep 13, 2023 at 10:46:58AM +0200, Peter Zijlstra wrote:
> We've gone over this multiple times already, also see commit
> 6c480f222128. That made sure the kernel side adhered to this scheme by
> making the tail a single instruction irrespective of the length.

Yes, sorry about that. I've been putting off digging deep into objtool
internals for a while now and there are no more excuses so lemme finally
read that whole thing and what it does in detail. And thanks for
explaining again. :-\

As to the patch at hand, how does the below look like?

Thx.

---

From: Peter Zijlstra <peterz@infradead.org>
Date: Mon, 14 Aug 2023 13:44:36 +0200
Subject: [PATCH] x86/alternatives: Simplify ALTERNATIVE_n()

Instead of making increasingly complicated ALTERNATIVE_n()
implementations, use a nested alternative expression.

The only difference between:

  ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2)

and

  ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1),
              newinst2, flag2)

is that the outer alternative can add additional padding - padding which
is needed when newinst2 is longer than oldinst and newinst1 combined
- which then results in alt_instr::instrlen being inconsistent.

However, this is easily remedied since the alt_instr entries will be
consecutive and it is trivial to compute the max(alt_instr::instrlen)
at runtime while patching.

The correct max length of all the alternative insn variants is needed
for ORC unwinding CFI tracking data to be correct. For details, see

  6c480f222128 ("x86/alternative: Rewrite optimize_nops() some")

  [ bp: Make labels unique and thus all sizing use unambiguous labels.
    Add more info. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lkml.kernel.org/r/20230628104952.GA2439977@hirez.programming.kicks-ass.net
---
 arch/x86/include/asm/alternative.h | 226 ++++++++++-------------------
 arch/x86/kernel/alternative.c      |  18 ++-
 arch/x86/kernel/fpu/xstate.h       |   6 +-
 tools/objtool/arch/x86/special.c   |  23 +++
 tools/objtool/special.c            |  16 +-
 5 files changed, 125 insertions(+), 164 deletions(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 9c4da699e11a..bcdce6026301 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -150,102 +150,70 @@ static inline int alternatives_text_reserved(void *start, void *end)
 }
 #endif	/* CONFIG_SMP */
 
-#define b_replacement(num)	"664"#num
-#define e_replacement(num)	"665"#num
+#define alt_slen		"662b-6610b"
+#define alt_total_slen		"663b-6610b"
+#define alt_rlen		"665f-664f"
 
-#define alt_end_marker		"663"
-#define alt_slen		"662b-661b"
-#define alt_total_slen		alt_end_marker"b-661b"
-#define alt_rlen(num)		e_replacement(num)"f-"b_replacement(num)"f"
-
-#define OLDINSTR(oldinstr, num)						\
-	"# ALT: oldnstr\n"						\
-	"661:\n\t" oldinstr "\n662:\n"					\
+#define OLDINSTR(oldinstr, n)						\
+	"# ALT: oldinstr\n"						\
+	"661" #n ":\n\t" oldinstr "\n662:\n"					\
 	"# ALT: padding\n"						\
-	".skip -(((" alt_rlen(num) ")-(" alt_slen ")) > 0) * "		\
-		"((" alt_rlen(num) ")-(" alt_slen ")),0x90\n"		\
-	alt_end_marker ":\n"
+	".skip -(((" alt_rlen ")-(" alt_slen ")) > 0) * "		\
+		"((" alt_rlen ")-(" alt_slen ")),0x90\n"		\
+	"663:\n"
+
+#define ALTINSTR_ENTRY(ft_flags)					      \
+	".pushsection .altinstructions,\"a\"\n"				      \
+	" .long 6610b - .\n"				/* label           */ \
+	" .long 664f - .\n"				/* new instruction */ \
+	" .4byte " __stringify(ft_flags) "\n"		/* feature + flags */ \
+	" .byte " alt_total_slen "\n"			/* source len      */ \
+	" .byte " alt_rlen "\n"				/* replacement len */ \
+	".popsection\n"
 
-/*
- * gas compatible max based on the idea from:
- * http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
- *
- * The additional "-" is needed because gas uses a "true" value of -1.
- */
-#define alt_max_short(a, b)	"((" a ") ^ (((" a ") ^ (" b ")) & -(-((" a ") < (" b ")))))"
+#define ALTINSTR_REPLACEMENT(newinstr)			/* replacement */	\
+	".pushsection .altinstr_replacement, \"ax\"\n"				\
+	"# ALT: replacement \n"							\
+	"664:\n\t" newinstr "\n 665:\n"						\
+	".popsection\n"
 
 /*
- * Pad the second replacement alternative with additional NOPs if it is
- * additionally longer than the first replacement alternative.
+ * Define an alternative between two instructions. If @ft_flags is
+ * present, early code in apply_alternatives() replaces @oldinstr with
+ * @newinstr. ".skip" directive takes care of proper instruction padding
+ * in case @newinstr is longer than @oldinstr.
+ *
+ * Notably: @oldinstr may be an ALTERNATIVE() itself, also see
+ *          apply_alternatives()
+ *
+ * @n: nesting level. Because those macros can be nested, in order to
+ * compute the source length and the total source length including the
+ * padding, the nesting level is used to define unique labels. The
+ * nesting level increases from the innermost macro invocation outwards,
+ * starting with 0. Thus, the correct starting label of oldinstr is
+ * 6610 which is hardcoded in the macros above.
  */
-#define OLDINSTR_2(oldinstr, num1, num2) \
-	"# ALT: oldinstr2\n"									\
-	"661:\n\t" oldinstr "\n662:\n"								\
-	"# ALT: padding2\n"									\
-	".skip -((" alt_max_short(alt_rlen(num1), alt_rlen(num2)) " - (" alt_slen ")) > 0) * "	\
-		"(" alt_max_short(alt_rlen(num1), alt_rlen(num2)) " - (" alt_slen ")), 0x90\n"	\
-	alt_end_marker ":\n"
-
-#define OLDINSTR_3(oldinsn, n1, n2, n3)								\
-	"# ALT: oldinstr3\n"									\
-	"661:\n\t" oldinsn "\n662:\n"								\
-	"# ALT: padding3\n"									\
-	".skip -((" alt_max_short(alt_max_short(alt_rlen(n1), alt_rlen(n2)), alt_rlen(n3))	\
-		" - (" alt_slen ")) > 0) * "							\
-		"(" alt_max_short(alt_max_short(alt_rlen(n1), alt_rlen(n2)), alt_rlen(n3))	\
-		" - (" alt_slen ")), 0x90\n"							\
-	alt_end_marker ":\n"
-
-#define ALTINSTR_ENTRY(ft_flags, num)					      \
-	" .long 661b - .\n"				/* label           */ \
-	" .long " b_replacement(num)"f - .\n"		/* new instruction */ \
-	" .4byte " __stringify(ft_flags) "\n"		/* feature + flags */ \
-	" .byte " alt_total_slen "\n"			/* source len      */ \
-	" .byte " alt_rlen(num) "\n"			/* replacement len */
+#define __ALTERNATIVE(oldinstr, newinstr, ft_flags, n)			\
+	OLDINSTR(oldinstr, n)						\
+	ALTINSTR_ENTRY(ft_flags)					\
+	ALTINSTR_REPLACEMENT(newinstr)
 
-#define ALTINSTR_REPLACEMENT(newinstr, num)		/* replacement */	\
-	"# ALT: replacement " #num "\n"						\
-	b_replacement(num)":\n\t" newinstr "\n" e_replacement(num) ":\n"
-
-/* alternative assembly primitive: */
 #define ALTERNATIVE(oldinstr, newinstr, ft_flags)			\
-	OLDINSTR(oldinstr, 1)						\
-	".pushsection .altinstructions,\"a\"\n"				\
-	ALTINSTR_ENTRY(ft_flags, 1)					\
-	".popsection\n"							\
-	".pushsection .altinstr_replacement, \"ax\"\n"			\
-	ALTINSTR_REPLACEMENT(newinstr, 1)				\
-	".popsection\n"
+	__ALTERNATIVE(oldinstr, newinstr, ft_flags, 0)
 
-#define ALTERNATIVE_2(oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2) \
-	OLDINSTR_2(oldinstr, 1, 2)					\
-	".pushsection .altinstructions,\"a\"\n"				\
-	ALTINSTR_ENTRY(ft_flags1, 1)					\
-	ALTINSTR_ENTRY(ft_flags2, 2)					\
-	".popsection\n"							\
-	".pushsection .altinstr_replacement, \"ax\"\n"			\
-	ALTINSTR_REPLACEMENT(newinstr1, 1)				\
-	ALTINSTR_REPLACEMENT(newinstr2, 2)				\
-	".popsection\n"
+#define ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2)	\
+	__ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1),		\
+		    newinst2, flag2, 1)
 
 /* If @feature is set, patch in @newinstr_yes, otherwise @newinstr_no. */
 #define ALTERNATIVE_TERNARY(oldinstr, ft_flags, newinstr_yes, newinstr_no) \
 	ALTERNATIVE_2(oldinstr, newinstr_no, X86_FEATURE_ALWAYS,	\
 		      newinstr_yes, ft_flags)
 
-#define ALTERNATIVE_3(oldinsn, newinsn1, ft_flags1, newinsn2, ft_flags2, \
-			newinsn3, ft_flags3)				\
-	OLDINSTR_3(oldinsn, 1, 2, 3)					\
-	".pushsection .altinstructions,\"a\"\n"				\
-	ALTINSTR_ENTRY(ft_flags1, 1)					\
-	ALTINSTR_ENTRY(ft_flags2, 2)					\
-	ALTINSTR_ENTRY(ft_flags3, 3)					\
-	".popsection\n"							\
-	".pushsection .altinstr_replacement, \"ax\"\n"			\
-	ALTINSTR_REPLACEMENT(newinsn1, 1)				\
-	ALTINSTR_REPLACEMENT(newinsn2, 2)				\
-	ALTINSTR_REPLACEMENT(newinsn3, 3)				\
-	".popsection\n"
+#define ALTERNATIVE_3(oldinst, newinst1, flag1, newinst2, flag2,	\
+		      newinst3, flag3)					\
+	__ALTERNATIVE(ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2), \
+		    newinst3, flag3, 2)
 
 /*
  * Alternative instructions for different CPU types or capabilities.
@@ -370,6 +338,25 @@ static inline int alternatives_text_reserved(void *start, void *end)
 	.byte \alt_len
 .endm
 
+/*
+ * Make sure the innermost macro invocation passes in as label "1400"
+ * as it is used for @oldinst sizing further down here.
+ */
+#define __ALTERNATIVE(oldinst, newinst, flag, label)			\
+label:									\
+	oldinst	;							\
+141:									\
+	.skip -(((144f-143f)-(141b-1400b)) > 0) * ((144f-143f)-(141b-1400b)),0x90	;\
+142:									\
+	.pushsection .altinstructions,"a" ;				\
+	altinstr_entry 1400b,143f,flag,142b-1400b,144f-143f ;		\
+	.popsection ;							\
+	.pushsection .altinstr_replacement,"ax"	;			\
+143:									\
+	newinst	;							\
+144:									\
+	.popsection ;
+
 /*
  * Define an alternative between two instructions. If @feature is
  * present, early code in apply_alternatives() replaces @oldinstr with
@@ -377,88 +364,23 @@ static inline int alternatives_text_reserved(void *start, void *end)
  * in case @newinstr is longer than @oldinstr.
  */
 .macro ALTERNATIVE oldinstr, newinstr, ft_flags
-140:
-	\oldinstr
-141:
-	.skip -(((144f-143f)-(141b-140b)) > 0) * ((144f-143f)-(141b-140b)),0x90
-142:
-
-	.pushsection .altinstructions,"a"
-	altinstr_entry 140b,143f,\ft_flags,142b-140b,144f-143f
-	.popsection
-
-	.pushsection .altinstr_replacement,"ax"
-143:
-	\newinstr
-144:
-	.popsection
+	__ALTERNATIVE(\oldinstr, \newinstr, \ft_flags, 1400)
 .endm
 
-#define old_len			141b-140b
-#define new_len1		144f-143f
-#define new_len2		145f-144f
-#define new_len3		146f-145f
-
-/*
- * gas compatible max based on the idea from:
- * http://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax
- *
- * The additional "-" is needed because gas uses a "true" value of -1.
- */
-#define alt_max_2(a, b)		((a) ^ (((a) ^ (b)) & -(-((a) < (b)))))
-#define alt_max_3(a, b, c)	(alt_max_2(alt_max_2(a, b), c))
-
-
 /*
  * Same as ALTERNATIVE macro above but for two alternatives. If CPU
  * has @feature1, it replaces @oldinstr with @newinstr1. If CPU has
  * @feature2, it replaces @oldinstr with @feature2.
  */
 .macro ALTERNATIVE_2 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2
-140:
-	\oldinstr
-141:
-	.skip -((alt_max_2(new_len1, new_len2) - (old_len)) > 0) * \
-		(alt_max_2(new_len1, new_len2) - (old_len)),0x90
-142:
-
-	.pushsection .altinstructions,"a"
-	altinstr_entry 140b,143f,\ft_flags1,142b-140b,144f-143f
-	altinstr_entry 140b,144f,\ft_flags2,142b-140b,145f-144f
-	.popsection
-
-	.pushsection .altinstr_replacement,"ax"
-143:
-	\newinstr1
-144:
-	\newinstr2
-145:
-	.popsection
+	__ALTERNATIVE(__ALTERNATIVE(\oldinstr, \newinstr1, \ft_flags1, 1400),
+		      \newinstr2, \ft_flags2, 1401)
 .endm
 
 .macro ALTERNATIVE_3 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2, newinstr3, ft_flags3
-140:
-	\oldinstr
-141:
-	.skip -((alt_max_3(new_len1, new_len2, new_len3) - (old_len)) > 0) * \
-		(alt_max_3(new_len1, new_len2, new_len3) - (old_len)),0x90
-142:
-
-	.pushsection .altinstructions,"a"
-	altinstr_entry 140b,143f,\ft_flags1,142b-140b,144f-143f
-	altinstr_entry 140b,144f,\ft_flags2,142b-140b,145f-144f
-	altinstr_entry 140b,145f,\ft_flags3,142b-140b,146f-145f
-	.popsection
-
-	.pushsection .altinstr_replacement,"ax"
-143:
-	\newinstr1
-144:
-	\newinstr2
-145:
-	\newinstr3
-146:
-	.popsection
+	__ALTERNATIVE(__ALTERNATIVE(__ALTERNATIVE(\oldinstr, \newinstr1, \ft_flags1, 1400),
+				    \newinstr2, \ft_flags2, 1401),
+		      \newinstr3, \ft_flags3, 1402)
 .endm
 
 /* If @feature is set, patch in @newinstr_yes, otherwise @newinstr_no. */
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index a5ead6a6d233..bcbef8ce9d94 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -398,7 +398,7 @@ apply_relocation(u8 *buf, size_t len, u8 *dest, u8 *src, size_t src_len)
 void __init_or_module noinline apply_alternatives(struct alt_instr *start,
 						  struct alt_instr *end)
 {
-	struct alt_instr *a;
+	struct alt_instr *a, *b;
 	u8 *instr, *replacement;
 	u8 insn_buff[MAX_PATCH_LEN];
 
@@ -415,6 +415,22 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
 	for (a = start; a < end; a++) {
 		int insn_buff_sz = 0;
 
+		/*
+		 * In case of nested ALTERNATIVE()s the outer alternative might
+		 * add more padding. To ensure consistent patching find the max
+		 * padding for all alt_instr entries for this site (nested
+		 * alternatives result in consecutive entries).
+		 *
+		 * Patching works even without it but ORC unwinder CFI tracking
+		 * turns the last trailing NOP into a max size instruction
+		 * so that edge data can be kept at a minimum, see
+		 * comment over add_nop().
+		 */
+		for (b = a+1; b < end && b->instr_offset == a->instr_offset; b++) {
+			u8 len = max(a->instrlen, b->instrlen);
+			a->instrlen = b->instrlen = len;
+		}
+
 		instr = (u8 *)&a->instr_offset + a->instr_offset;
 		replacement = (u8 *)&a->repl_offset + a->repl_offset;
 		BUG_ON(a->instrlen > sizeof(insn_buff));
diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
index a4ecb04d8d64..37328ffc72bb 100644
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -109,7 +109,7 @@ static inline u64 xfeatures_mask_independent(void)
  *
  * We use XSAVE as a fallback.
  *
- * The 661 label is defined in the ALTERNATIVE* macros as the address of the
+ * The 6610 label is defined in the ALTERNATIVE* macros as the address of the
  * original instruction which gets replaced. We need to use it here as the
  * address of the instruction where we might get an exception at.
  */
@@ -121,7 +121,7 @@ static inline u64 xfeatures_mask_independent(void)
 		     "\n"						\
 		     "xor %[err], %[err]\n"				\
 		     "3:\n"						\
-		     _ASM_EXTABLE_TYPE_REG(661b, 3b, EX_TYPE_EFAULT_REG, %[err]) \
+		     _ASM_EXTABLE_TYPE_REG(6610b, 3b, EX_TYPE_EFAULT_REG, %[err]) \
 		     : [err] "=r" (err)					\
 		     : "D" (st), "m" (*st), "a" (lmask), "d" (hmask)	\
 		     : "memory")
@@ -135,7 +135,7 @@ static inline u64 xfeatures_mask_independent(void)
 				 XRSTORS, X86_FEATURE_XSAVES)		\
 		     "\n"						\
 		     "3:\n"						\
-		     _ASM_EXTABLE_TYPE(661b, 3b, EX_TYPE_FPU_RESTORE)	\
+		     _ASM_EXTABLE_TYPE(6610b, 3b, EX_TYPE_FPU_RESTORE)	\
 		     :							\
 		     : "D" (st), "m" (*st), "a" (lmask), "d" (hmask)	\
 		     : "memory")
diff --git a/tools/objtool/arch/x86/special.c b/tools/objtool/arch/x86/special.c
index 29e949579ede..7145920a7aba 100644
--- a/tools/objtool/arch/x86/special.c
+++ b/tools/objtool/arch/x86/special.c
@@ -9,6 +9,29 @@
 
 void arch_handle_alternative(unsigned short feature, struct special_alt *alt)
 {
+	static struct special_alt *group, *prev;
+
+	/*
+	 * Recompute orig_len for nested ALTERNATIVE()s.
+	 */
+	if (group && group->orig_sec == alt->orig_sec &&
+	             group->orig_off == alt->orig_off) {
+
+		struct special_alt *iter = group;
+		for (;;) {
+			unsigned int len = max(iter->orig_len, alt->orig_len);
+			iter->orig_len = alt->orig_len = len;
+
+			if (iter == prev)
+				break;
+
+			iter = list_next_entry(iter, list);
+		}
+
+	} else group = alt;
+
+	prev = alt;
+
 	switch (feature) {
 	case X86_FEATURE_SMAP:
 		/*
diff --git a/tools/objtool/special.c b/tools/objtool/special.c
index 91b1950f5bd8..097a69db82a0 100644
--- a/tools/objtool/special.c
+++ b/tools/objtool/special.c
@@ -84,6 +84,14 @@ static int get_alt_entry(struct elf *elf, const struct special_entry *entry,
 						  entry->new_len);
 	}
 
+	orig_reloc = find_reloc_by_dest(elf, sec, offset + entry->orig);
+	if (!orig_reloc) {
+		WARN_FUNC("can't find orig reloc", sec, offset + entry->orig);
+		return -1;
+	}
+
+	reloc_to_sec_off(orig_reloc, &alt->orig_sec, &alt->orig_off);
+
 	if (entry->feature) {
 		unsigned short feature;
 
@@ -94,14 +102,6 @@ static int get_alt_entry(struct elf *elf, const struct special_entry *entry,
 		arch_handle_alternative(feature, alt);
 	}
 
-	orig_reloc = find_reloc_by_dest(elf, sec, offset + entry->orig);
-	if (!orig_reloc) {
-		WARN_FUNC("can't find orig reloc", sec, offset + entry->orig);
-		return -1;
-	}
-
-	reloc_to_sec_off(orig_reloc, &alt->orig_sec, &alt->orig_off);
-
 	if (!entry->group || alt->new_len) {
 		new_reloc = find_reloc_by_dest(elf, sec, offset + entry->new);
 		if (!new_reloc) {
-- 
2.23.0


---

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-13 14:38                         ` Borislav Petkov
@ 2023-09-13 16:14                           ` Peter Zijlstra
  2023-09-15  7:46                           ` Peter Zijlstra
  1 sibling, 0 replies; 74+ messages in thread
From: Peter Zijlstra @ 2023-09-13 16:14 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Wed, Sep 13, 2023 at 04:38:47PM +0200, Borislav Petkov wrote:
> On Wed, Sep 13, 2023 at 10:46:58AM +0200, Peter Zijlstra wrote:
> > We've gone over this multiple times already, also see commit
> > 6c480f222128. That made sure the kernel side adhered to this scheme by
> > making the tail a single instruction irrespective of the length.
> 
> Yes, sorry about that. I've been putting off digging deep into objtool
> internals for a while now and there are no more excuses so lemme finally
> read that whole thing and what it does in detail. And thanks for
> explaining again. :-\

No worries, I always seem to forget a detail or two myself :-)

> As to the patch at hand, how does the below look like?

Brain is fried, I'll give it a look tomorrow.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-13 14:38                         ` Borislav Petkov
  2023-09-13 16:14                           ` Peter Zijlstra
@ 2023-09-15  7:46                           ` Peter Zijlstra
  2023-09-15  7:51                             ` Peter Zijlstra
  1 sibling, 1 reply; 74+ messages in thread
From: Peter Zijlstra @ 2023-09-15  7:46 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Wed, Sep 13, 2023 at 04:38:47PM +0200, Borislav Petkov wrote:

>   [ bp: Make labels unique and thus all sizing use unambiguous labels.
>     Add more info. ]

> +#define __ALTERNATIVE(oldinstr, newinstr, ft_flags, n)			\
> +	OLDINSTR(oldinstr, n)						\
> +	ALTINSTR_ENTRY(ft_flags)					\
> +	ALTINSTR_REPLACEMENT(newinstr)

> +#define ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2)	\
> +	__ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1),		\
> +		    newinst2, flag2, 1)

> +#define ALTERNATIVE_3(oldinst, newinst1, flag1, newinst2, flag2,	\
> +		      newinst3, flag3)					\
> +	__ALTERNATIVE(ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2), \
> +		    newinst3, flag3, 2)


So I see what you did with that @n argument, but urgh, do we really need
this? I mean, it just makes things harder to use and it doesn't actually
fix anything.. :/

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-15  7:46                           ` Peter Zijlstra
@ 2023-09-15  7:51                             ` Peter Zijlstra
  2023-09-15 12:05                               ` Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Peter Zijlstra @ 2023-09-15  7:51 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Fri, Sep 15, 2023 at 09:46:47AM +0200, Peter Zijlstra wrote:
> On Wed, Sep 13, 2023 at 04:38:47PM +0200, Borislav Petkov wrote:
> 
> >   [ bp: Make labels unique and thus all sizing use unambiguous labels.
> >     Add more info. ]
> 
> > +#define __ALTERNATIVE(oldinstr, newinstr, ft_flags, n)			\
> > +	OLDINSTR(oldinstr, n)						\
> > +	ALTINSTR_ENTRY(ft_flags)					\
> > +	ALTINSTR_REPLACEMENT(newinstr)
> 
> > +#define ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2)	\
> > +	__ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1),		\
> > +		    newinst2, flag2, 1)
> 
> > +#define ALTERNATIVE_3(oldinst, newinst1, flag1, newinst2, flag2,	\
> > +		      newinst3, flag3)					\
> > +	__ALTERNATIVE(ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2), \
> > +		    newinst3, flag3, 2)
> 
> 
> So I see what you did with that @n argument, but urgh, do we really need
> this? I mean, it just makes things harder to use and it doesn't actually
> fix anything.. :/

That is, if we can magic this using __COUNTER__ without a user interface
penalty, then sure. But the last time I tried that I failed utterly and
ended up with labels like:

  .Lalt_old___COUNTER__:

no matter how many layers of CPP macro eval I stuck in it. So clearly I
wasn't having a good day ....
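
For the record, in plain C the usual deferred-paste idiom does work --
a generic CPP sketch, not the alternative macros themselves:

#define PASTE_(a, b)	a##b
#define PASTE(a, b)	PASTE_(a, b)	/* extra layer so b expands before ## */
#define UNIQUE(prefix)	PASTE(prefix, __COUNTER__)

int main(void)
{
	int UNIQUE(alt_old_) = 1;	/* e.g. alt_old_0 */
	int UNIQUE(alt_old_) = 2;	/* e.g. alt_old_1 -- distinct name, no redefinition */

	/* Pasting directly via PASTE_() would produce the literal
	 * alt_old___COUNTER__ token both times and fail to compile. */
	return 0;
}

Of course every expansion of __COUNTER__ increments it, so referring to
the same generated label from several places inside one macro expansion
is its own problem on top of that.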

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n()
  2023-09-15  7:51                             ` Peter Zijlstra
@ 2023-09-15 12:05                               ` Borislav Petkov
  0 siblings, 0 replies; 74+ messages in thread
From: Borislav Petkov @ 2023-09-15 12:05 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, David.Kaplan, Andrew.Cooper3, jpoimboe,
	gregkh, nik.borisov

On Fri, Sep 15, 2023 at 09:51:06AM +0200, Peter Zijlstra wrote:
> > So I see what you did with that @n argument, but urgh, do we really need
> > this? I mean, it just makes things harder to use and it doesn't actually
> > fix anything.. :/

It only addresses this repeating of the 661 labels:

# 53 "./arch/x86/include/asm/page_64.h" 1
	# ALT: oldnstr
661:
	# ALT: oldnstr
661:
	call clear_page_orig	#
662:

but this is only the produced asm, which no one but you and me looks at,
so I guess it is not worth the effort.

I still think, though, that adding the comments explaining the situation
more is worth it because we will forget.

> That is, if we can magic this using __COUNTER__ without a user interface
> penalty, then sure. But the last time I tried that I failed utterly and
> ended up with labels like:
> 
>   .Lalt_old___COUNTER__:
> 
> no matter how many layers of CPP macro eval I stuck in it. So clearly I
> wasn't having a good day ....

Yeah, I tried it too because Matz said it should work with it but
I failed too. Reportedly, the approach should be to do that in CPP and
use CPP even for the asm macro but my CPP-fu is basic, to say the least.

I'll poke him next time we meet - I might've missed an aspect.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2023-09-15 12:08 UTC | newest]

Thread overview: 74+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-14 11:44 [PATCH v2 00/11] Fix up SRSO stuff Peter Zijlstra
2023-08-14 11:44 ` [PATCH v2 01/11] x86/cpu: Fixup __x86_return_thunk Peter Zijlstra
2023-08-16  7:55   ` [tip: x86/urgent] x86/cpu: Fix __x86_return_thunk symbol type tip-bot2 for Peter Zijlstra
2023-08-14 11:44 ` [PATCH v2 02/11] x86/cpu: Fix up srso_safe_ret() and __x86_return_thunk() Peter Zijlstra
2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
2023-08-14 11:44 ` [PATCH v2 03/11] objtool/x86: Fix SRSO mess Peter Zijlstra
2023-08-14 12:54   ` Andrew.Cooper3
2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
2023-08-16 11:59     ` Peter Zijlstra
2023-08-16 20:31       ` Josh Poimboeuf
2023-08-16 22:08         ` [PATCH] objtool/x86: Fixup frame-pointer vs rethunk Peter Zijlstra
2023-08-16 22:22           ` Josh Poimboeuf
2023-08-17  8:39       ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
2023-08-14 11:44 ` [PATCH v2 04/11] x86/alternative: Make custom return thunk unconditional Peter Zijlstra
2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
2023-08-14 11:44 ` [PATCH v2 05/11] x86/cpu: Clean up SRSO return thunk mess Peter Zijlstra
2023-08-14 13:02   ` Borislav Petkov
2023-08-14 17:48   ` Borislav Petkov
2023-08-15 21:29   ` Nathan Chancellor
2023-08-15 22:43     ` Peter Zijlstra
2023-08-16  7:38       ` Borislav Petkov
2023-08-16 14:52         ` Nathan Chancellor
2023-08-16 15:08           ` Borislav Petkov
2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
2023-08-16 18:58     ` Nathan Chancellor
2023-08-16 19:24       ` Borislav Petkov
2023-08-16 19:30         ` Nathan Chancellor
2023-08-16 19:42           ` Borislav Petkov
2023-08-16 19:57             ` Borislav Petkov
2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
2023-08-14 11:44 ` [PATCH v2 06/11] x86/cpu: Rename original retbleed methods Peter Zijlstra
2023-08-14 19:41   ` Josh Poimboeuf
2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
2023-08-14 11:44 ` [PATCH v2 07/11] x86/cpu: Rename srso_(.*)_alias to srso_alias_\1 Peter Zijlstra
2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
2023-08-14 11:44 ` [PATCH v2 08/11] x86/cpu: Cleanup the untrain mess Peter Zijlstra
2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
2023-08-14 11:44 ` [PATCH v2 09/11] x86/cpu/kvm: Provide UNTRAIN_RET_VM Peter Zijlstra
2023-08-16  7:55   ` [tip: x86/urgent] " tip-bot2 for Peter Zijlstra
2023-08-16 21:20   ` tip-bot2 for Peter Zijlstra
2023-08-14 11:44 ` [PATCH v2 10/11] x86/alternatives: Simplify ALTERNATIVE_n() Peter Zijlstra
2023-08-15 20:49   ` Nikolay Borisov
2023-08-15 22:44     ` Peter Zijlstra
2023-09-07  8:31   ` Borislav Petkov
2023-09-07 11:09     ` Peter Zijlstra
2023-09-07 11:11       ` Peter Zijlstra
2023-09-07 11:16         ` Peter Zijlstra
2023-09-07 15:06       ` Borislav Petkov
2023-09-07 15:30         ` Borislav Petkov
2023-09-09  7:50           ` Borislav Petkov
2023-09-09  9:25             ` Peter Zijlstra
2023-09-09  9:42               ` Peter Zijlstra
2023-09-10 14:42               ` Borislav Petkov
2023-09-12  9:27                 ` Peter Zijlstra
2023-09-12  9:44                   ` Peter Zijlstra
2023-09-13  4:37                     ` Borislav Petkov
2023-09-13  8:46                       ` Peter Zijlstra
2023-09-13 14:38                         ` Borislav Petkov
2023-09-13 16:14                           ` Peter Zijlstra
2023-09-15  7:46                           ` Peter Zijlstra
2023-09-15  7:51                             ` Peter Zijlstra
2023-09-15 12:05                               ` Borislav Petkov
2023-09-13  4:24                   ` Borislav Petkov
2023-08-14 11:44 ` [PATCH v2 11/11] x86/cpu: Use fancy alternatives to get rid of entry_untrain_ret() Peter Zijlstra
2023-08-14 16:44 ` [PATCH v2 00/11] Fix up SRSO stuff Borislav Petkov
2023-08-14 19:51   ` Josh Poimboeuf
2023-08-14 19:57     ` Borislav Petkov
2023-08-14 20:01     ` Josh Poimboeuf
2023-08-14 20:09       ` Borislav Petkov
2023-08-15 14:26         ` [PATCH] x86/srso: Explain the untraining sequences a bit more Borislav Petkov
2023-08-15 15:41           ` Nikolay Borisov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).