linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE
@ 2021-03-26 15:11 Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 01/16] x86: Add insn_decode_kernel() Peter Zijlstra
                   ` (16 more replies)
  0 siblings, 17 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:11 UTC
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

Hi, another week, another update :-)

Respin of the !RETPOLINE optimization patches.

Boris, the first 3 should probably go into tip/x86/core. It's an ungodly tangle,
since they rely on the insn decoder patches in tip/x86/core, the NOP patches in
tip/x86/cpu and the alternative patches in tip/x86/alternatives.

Just to make life easy I'd suggest merging everything in x86/core and
forgetting about the other topic branches (that's what I ended up doing locally).

The remaining 13 patches depend on the first 3 as well as on the work in
tip/objtool/core, just to make life more interesting still ;-)

All except the last 4 patches should be fairly uncontroversial (I hope...).

There's a fair number of new patches and another few have been completely
rewritten, but it all seems to work nicely.



* [PATCH v3 01/16] x86: Add insn_decode_kernel()
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 02/16] x86/alternatives: Optimize optimize_nops() Peter Zijlstra
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

Add a helper to decode kernel instructions; there's no point in
endlessly repeating those last two arguments.
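
As an illustration only, here is a minimal user-space sketch of the
wrapper pattern. The struct insn layout, the enum values and the
insn_decode() body below are stand-ins (assumptions made so the example
compiles on its own), not the kernel's decoder; decode_one() is a
hypothetical caller. Only the macro matches the one added by this patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INSN_SIZE 15

/* Stand-in types: the real definitions live in asm/insn.h. */
enum insn_mode { INSN_MODE_32, INSN_MODE_64, INSN_MODE_KERN };

struct insn {
	const void *kaddr;
	int buf_len;
	enum insn_mode mode;
};

/* Stand-in decoder: records its arguments and decodes nothing. */
static int insn_decode(struct insn *insn, const void *kaddr, int buf_len,
		       enum insn_mode m)
{
	if (!kaddr || buf_len <= 0)
		return -1;

	insn->kaddr = kaddr;
	insn->buf_len = buf_len;
	insn->mode = m;
	return 0;
}

/* The helper from this patch: fixes the last two arguments. */
#define insn_decode_kernel(_insn, _ptr) insn_decode((_insn), (_ptr), MAX_INSN_SIZE, INSN_MODE_KERN)

/* Hypothetical call site, shaped like the converted ones in the diff. */
static int decode_one(struct insn *insn, const unsigned char *p)
{
	assert(insn != NULL);
	return insn_decode_kernel(insn, p);
}
```

Every converted call site passes only the insn and the pointer; the
buffer length and decode mode are implied by the macro.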

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/insn.h        |    2 ++
 arch/x86/kernel/alternative.c      |    2 +-
 arch/x86/kernel/cpu/mce/severity.c |    2 +-
 arch/x86/kernel/kprobes/core.c     |    4 ++--
 arch/x86/kernel/kprobes/opt.c      |    2 +-
 arch/x86/kernel/traps.c            |    2 +-
 tools/arch/x86/include/asm/insn.h  |    4 +++-
 7 files changed, 11 insertions(+), 7 deletions(-)

--- a/arch/x86/include/asm/insn.h
+++ b/arch/x86/include/asm/insn.h
@@ -150,6 +150,8 @@ enum insn_mode {
 
 extern int insn_decode(struct insn *insn, const void *kaddr, int buf_len, enum insn_mode m);
 
+#define insn_decode_kernel(_insn, _ptr) insn_decode((_insn), (_ptr), MAX_INSN_SIZE, INSN_MODE_KERN)
+
 /* Attribute will be determined after getting ModRM (for opcode groups) */
 static inline void insn_get_attribute(struct insn *insn)
 {
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1160,7 +1160,7 @@ static void text_poke_loc_init(struct te
 	if (!emulate)
 		emulate = opcode;
 
-	ret = insn_decode(&insn, emulate, MAX_INSN_SIZE, INSN_MODE_KERN);
+	ret = insn_decode_kernel(&insn, emulate);
 
 	BUG_ON(ret < 0);
 	BUG_ON(len != insn.length);
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -225,7 +225,7 @@ static bool is_copy_from_user(struct pt_
 	if (copy_from_kernel_nofault(insn_buf, (void *)regs->ip, MAX_INSN_SIZE))
 		return false;
 
-	ret = insn_decode(&insn, insn_buf, MAX_INSN_SIZE, INSN_MODE_KERN);
+	ret = insn_decode_kernel(&insn, insn_buf);
 	if (ret < 0)
 		return false;
 
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -279,7 +279,7 @@ static int can_probe(unsigned long paddr
 		if (!__addr)
 			return 0;
 
-		ret = insn_decode(&insn, (void *)__addr, MAX_INSN_SIZE, INSN_MODE_KERN);
+		ret = insn_decode_kernel(&insn, (void *)__addr);
 		if (ret < 0)
 			return 0;
 
@@ -316,7 +316,7 @@ int __copy_instruction(u8 *dest, u8 *src
 			MAX_INSN_SIZE))
 		return 0;
 
-	ret = insn_decode(insn, dest, MAX_INSN_SIZE, INSN_MODE_KERN);
+	ret = insn_decode_kernel(insn, dest);
 	if (ret < 0)
 		return 0;
 
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -324,7 +324,7 @@ static int can_optimize(unsigned long pa
 		if (!recovered_insn)
 			return 0;
 
-		ret = insn_decode(&insn, (void *)recovered_insn, MAX_INSN_SIZE, INSN_MODE_KERN);
+		ret = insn_decode_kernel(&insn, (void *)recovered_insn);
 		if (ret < 0)
 			return 0;
 
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -504,7 +504,7 @@ static enum kernel_gp_hint get_kernel_gp
 			MAX_INSN_SIZE))
 		return GP_NO_HINT;
 
-	ret = insn_decode(&insn, insn_buf, MAX_INSN_SIZE, INSN_MODE_KERN);
+	ret = insn_decode_kernel(&insn, insn_buf);
 	if (ret < 0)
 		return GP_NO_HINT;
 
--- a/tools/arch/x86/include/asm/insn.h
+++ b/tools/arch/x86/include/asm/insn.h
@@ -150,6 +150,8 @@ enum insn_mode {
 
 extern int insn_decode(struct insn *insn, const void *kaddr, int buf_len, enum insn_mode m);
 
+#define insn_decode_kernel(_insn, _ptr) insn_decode((_insn), (_ptr), MAX_INSN_SIZE, INSN_MODE_KERN)
+
 /* Attribute will be determined after getting ModRM (for opcode groups) */
 static inline void insn_get_attribute(struct insn *insn)
 {




* [PATCH v3 02/16] x86/alternatives: Optimize optimize_nops()
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 01/16] x86: Add insn_decode_kernel() Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2021-04-03 11:11   ` tip-bot2 for Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 03/16] x86/retpoline: Simplify retpolines Peter Zijlstra
                   ` (14 subsequent siblings)
  16 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

Currently optimize_nops() scans to see if the alternative starts with
NOPs. However, the emit pattern is:

  141:	\oldinstr
  142:	.skip (len-(142b-141b)), 0x90

That is, when oldinstr is short, we pad the tail with NOPs. This case
never gets optimized.

Rewrite optimize_nops() to replace any string of NOPs inside the
alternative with larger NOPs. Also run it irrespective of patching,
replacing NOPs in both the original and replaced code.

A direct consequence is that padlen becomes superfluous, so remove it.
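
A rough user-space sketch of the tail-padding case, for illustration.
It assumes the caller already knows the offset of the first NOP (the
patch finds it by decoding instructions, which this sketch skips), and
fill_nops() is a hypothetical stand-in for add_nops() using the common
multi-byte NOP encodings up to 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NOP 8

/* Multi-byte x86 NOP encodings, indexed by length in bytes. */
static const unsigned char nops[MAX_NOP + 1][MAX_NOP] = {
	{ 0 },
	{ 0x90 },
	{ 0x66, 0x90 },
	{ 0x0f, 0x1f, 0x00 },
	{ 0x0f, 0x1f, 0x40, 0x00 },
	{ 0x0f, 0x1f, 0x44, 0x00, 0x00 },
	{ 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00 },
	{ 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00 },
	{ 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
};

/* Fill buf[0..len) with as few NOP instructions as possible. */
static void fill_nops(unsigned char *buf, size_t len)
{
	while (len > 0) {
		size_t n = len > MAX_NOP ? MAX_NOP : len;

		memcpy(buf, nops[n], n);
		buf += n;
		len -= n;
	}
}

/*
 * Replace the run of single-byte NOPs (0x90) that starts at offset
 * 'nop' with larger NOPs. Returns the number of bytes rewritten.
 */
static size_t optimize_nop_run(unsigned char *instr, size_t nop, size_t instrlen)
{
	size_t i = nop;

	assert(instr != NULL);

	while (i < instrlen && instr[i] == 0x90)
		i++;

	fill_nops(instr + nop, i - nop);
	return i - nop;
}
```

For a 2-byte "jmp *%rax" followed by three 0x90 bytes, calling
optimize_nop_run(buf, 2, 5) turns the tail into one 3-byte NOP.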

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/alternative.h            |   14 ++------
 arch/x86/kernel/alternative.c                 |   45 +++++++++++++++-----------
 tools/objtool/arch/x86/include/arch/special.h |    2 -
 3 files changed, 33 insertions(+), 28 deletions(-)

--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -65,7 +65,6 @@ struct alt_instr {
 	u16 cpuid;		/* cpuid bit set for replacement */
 	u8  instrlen;		/* length of original instruction */
 	u8  replacementlen;	/* length of new instruction */
-	u8  padlen;		/* length of build-time padding */
 } __packed;
 
 /*
@@ -104,7 +103,6 @@ static inline int alternatives_text_rese
 
 #define alt_end_marker		"663"
 #define alt_slen		"662b-661b"
-#define alt_pad_len		alt_end_marker"b-662b"
 #define alt_total_slen		alt_end_marker"b-661b"
 #define alt_rlen(num)		e_replacement(num)"f-"b_replacement(num)"f"
 
@@ -151,8 +149,7 @@ static inline int alternatives_text_rese
 	" .long " b_replacement(num)"f - .\n"		/* new instruction */ \
 	" .word " __stringify(feature) "\n"		/* feature bit     */ \
 	" .byte " alt_total_slen "\n"			/* source len      */ \
-	" .byte " alt_rlen(num) "\n"			/* replacement len */ \
-	" .byte " alt_pad_len "\n"			/* pad len */
+	" .byte " alt_rlen(num) "\n"			/* replacement len */
 
 #define ALTINSTR_REPLACEMENT(newinstr, num)		/* replacement */	\
 	"# ALT: replacement " #num "\n"						\
@@ -315,13 +312,12 @@ static inline int alternatives_text_rese
  * enough information for the alternatives patching code to patch an
  * instruction. See apply_alternatives().
  */
-.macro altinstruction_entry orig alt feature orig_len alt_len pad_len
+.macro altinstruction_entry orig alt feature orig_len alt_len
 	.long \orig - .
 	.long \alt - .
 	.word \feature
 	.byte \orig_len
 	.byte \alt_len
-	.byte \pad_len
 .endm
 
 /*
@@ -338,7 +334,7 @@ static inline int alternatives_text_rese
 142:
 
 	.pushsection .altinstructions,"a"
-	altinstruction_entry 140b,143f,\feature,142b-140b,144f-143f,142b-141b
+	altinstruction_entry 140b,143f,\feature,142b-140b,144f-143f
 	.popsection
 
 	.pushsection .altinstr_replacement,"ax"
@@ -375,8 +371,8 @@ static inline int alternatives_text_rese
 142:
 
 	.pushsection .altinstructions,"a"
-	altinstruction_entry 140b,143f,\feature1,142b-140b,144f-143f,142b-141b
-	altinstruction_entry 140b,144f,\feature2,142b-140b,145f-144f,142b-141b
+	altinstruction_entry 140b,143f,\feature1,142b-140b,144f-143f
+	altinstruction_entry 140b,144f,\feature2,142b-140b,145f-144f
 	.popsection
 
 	.pushsection .altinstr_replacement,"ax"
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -189,19 +189,31 @@ recompute_jump(struct alt_instr *a, u8 *
 static void __init_or_module noinline optimize_nops(struct alt_instr *a, u8 *instr)
 {
 	unsigned long flags;
-	int i;
+	struct insn insn;
+	int nop, i = 0;
 
-	for (i = 0; i < a->padlen; i++) {
-		if (instr[i] != 0x90)
+	for (;;) {
+		if (insn_decode_kernel(&insn, &instr[i]))
 			return;
+
+		if (insn.length == 1 && insn.opcode.bytes[0] == 0x90)
+			break;
+
+		if ((i += insn.length) >= a->instrlen)
+			return;
+	}
+
+	for (nop = i; i < a->instrlen; i++) {
+		if (WARN_ONCE(instr[i] != 0x90, "Not a NOP at 0x%px\n", &instr[i]))
+			break;
 	}
 
 	local_irq_save(flags);
-	add_nops(instr + (a->instrlen - a->padlen), a->padlen);
+	add_nops(instr + nop, i - nop);
 	local_irq_restore(flags);
 
 	DUMP_BYTES(instr, a->instrlen, "%px: [%d:%d) optimized NOPs: ",
-		   instr, a->instrlen - a->padlen, a->padlen);
+		   instr, nop, i - nop);
 }
 
 /*
@@ -247,19 +259,15 @@ void __init_or_module noinline apply_alt
 		 * - feature not present but ALTINSTR_FLAG_INV is set to mean,
 		 *   patch if feature is *NOT* present.
 		 */
-		if (!boot_cpu_has(feature) == !(a->cpuid & ALTINSTR_FLAG_INV)) {
-			if (a->padlen > 1)
-				optimize_nops(a, instr);
+		if (!boot_cpu_has(feature) == !(a->cpuid & ALTINSTR_FLAG_INV))
+			goto next;
 
-			continue;
-		}
-
-		DPRINTK("feat: %s%d*32+%d, old: (%pS (%px) len: %d), repl: (%px, len: %d), pad: %d",
+		DPRINTK("feat: %s%d*32+%d, old: (%pS (%px) len: %d), repl: (%px, len: %d)",
 			(a->cpuid & ALTINSTR_FLAG_INV) ? "!" : "",
 			feature >> 5,
 			feature & 0x1f,
 			instr, instr, a->instrlen,
-			replacement, a->replacementlen, a->padlen);
+			replacement, a->replacementlen);
 
 		DUMP_BYTES(instr, a->instrlen, "%px: old_insn: ", instr);
 		DUMP_BYTES(replacement, a->replacementlen, "%px: rpl_insn: ", replacement);
@@ -283,14 +291,15 @@ void __init_or_module noinline apply_alt
 		if (a->replacementlen && is_jmp(replacement[0]))
 			recompute_jump(a, instr, replacement, insn_buff);
 
-		if (a->instrlen > a->replacementlen) {
-			add_nops(insn_buff + a->replacementlen,
-				 a->instrlen - a->replacementlen);
-			insn_buff_sz += a->instrlen - a->replacementlen;
-		}
+		for (; insn_buff_sz < a->instrlen; insn_buff_sz++)
+			insn_buff[insn_buff_sz] = 0x90;
+
 		DUMP_BYTES(insn_buff, insn_buff_sz, "%px: final_insn: ", instr);
 
 		text_poke_early(instr, insn_buff, insn_buff_sz);
+
+next:
+		optimize_nops(a, instr);
 	}
 }
 
--- a/tools/objtool/arch/x86/include/arch/special.h
+++ b/tools/objtool/arch/x86/include/arch/special.h
@@ -10,7 +10,7 @@
 #define JUMP_ORIG_OFFSET	0
 #define JUMP_NEW_OFFSET		4
 
-#define ALT_ENTRY_SIZE		13
+#define ALT_ENTRY_SIZE		12
 #define ALT_ORIG_OFFSET		0
 #define ALT_NEW_OFFSET		4
 #define ALT_FEATURE_OFFSET	8




* [PATCH v3 03/16] x86/retpoline: Simplify retpolines
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 01/16] x86: Add insn_decode_kernel() Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 02/16] x86/alternatives: Optimize optimize_nops() Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 04/16] objtool: Correctly handle retpoline thunk calls Peter Zijlstra
                   ` (13 subsequent siblings)
  16 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

Due to commit c9c324dc22aa ("objtool: Support stack layout changes
in alternatives"), it is possible to simplify the retpolines.

Currently our retpolines consist of 2 symbols,
__x86_indirect_thunk_\reg, which is the compiler target, and
__x86_retpoline_\reg, which is the actual retpoline. Both are
consecutive in code and aligned such that for any one register they
both live in the same cacheline:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop

  0000000000000005 <__x86_retpoline_rax>:
   5:   e8 07 00 00 00          callq  11 <__x86_retpoline_rax+0xc>
   a:   f3 90                   pause
   c:   0f ae e8                lfence
   f:   eb f9                   jmp    a <__x86_retpoline_rax+0x5>
  11:   48 89 04 24             mov    %rax,(%rsp)
  15:   c3                      retq
  16:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)

The thunk is an alternative_2, where one option is a jmp to the
retpoline. This was done so that objtool didn't need to deal with
alternatives with stack ops. But that problem has been solved, so now
it is possible to fold the entire retpoline into the alternative to
simplify and consolidate unused bytes:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop
   5:   90                      nop
   6:   90                      nop
   7:   90                      nop
   8:   90                      nop
   9:   90                      nop
   a:   90                      nop
   b:   90                      nop
   c:   90                      nop
   d:   90                      nop
   e:   90                      nop
   f:   90                      nop
  10:   90                      nop
  11:   66 66 2e 0f 1f 84 00 00 00 00 00        data16 nopw %cs:0x0(%rax,%rax,1)
  1c:   0f 1f 40 00             nopl   0x0(%rax)

Notice that since the longest alternative sequence is now:

   0:   e8 07 00 00 00          callq  c <.altinstr_replacement+0xc>
   5:   f3 90                   pause
   7:   0f ae e8                lfence
   a:   eb f9                   jmp    5 <.altinstr_replacement+0x5>
   c:   48 89 04 24             mov    %rax,(%rsp)
  10:   c3                      retq

17 bytes, we have 15 bytes of NOPs at the end of our 32-byte slot. (IOW,
if we can shrink the retpoline by 1 byte we can pack it more densely.)
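
The byte counting above can be checked with a small sketch. The
instruction lengths are read off the listing; the 16-byte alignment
used for the "more dense" case is an assumption about what a tighter
packing would look like, not something this patch does, and the helper
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define THUNK_SLOT 32	/* .align 32 in the THUNK macro */

/* Lengths of: call, pause, lfence, jmp, mov %reg,(%rsp), ret */
static const size_t retpoline_insn_len[] = { 5, 2, 3, 2, 4, 1 };

/* Total length of the retpoline sequence in bytes. */
static size_t retpoline_len(void)
{
	size_t i, len = 0;

	for (i = 0; i < sizeof(retpoline_insn_len) / sizeof(retpoline_insn_len[0]); i++)
		len += retpoline_insn_len[i];

	return len;
}

/* Round len up to the next multiple of align. */
static size_t align_up(size_t len, size_t align)
{
	assert(align != 0);
	return (len + align - 1) / align * align;
}
```

With these numbers retpoline_len() is 17, leaving 32 - 17 = 15 bytes of
padding; a 16-byte sequence would fit a 16-byte slot exactly, while 17
bytes still rounds up to 32.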

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/asm-prototypes.h |    7 -------
 arch/x86/include/asm/nospec-branch.h  |    6 +++---
 arch/x86/lib/retpoline.S              |   34 +++++++++++++++++-----------------
 tools/objtool/check.c                 |    3 +--
 4 files changed, 21 insertions(+), 29 deletions(-)

--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -22,15 +22,8 @@ extern void cmpxchg8b_emu(void);
 #define DECL_INDIRECT_THUNK(reg) \
 	extern asmlinkage void __x86_indirect_thunk_ ## reg (void);
 
-#define DECL_RETPOLINE(reg) \
-	extern asmlinkage void __x86_retpoline_ ## reg (void);
-
 #undef GEN
 #define GEN(reg) DECL_INDIRECT_THUNK(reg)
 #include <asm/GEN-for-each-reg.h>
 
-#undef GEN
-#define GEN(reg) DECL_RETPOLINE(reg)
-#include <asm/GEN-for-each-reg.h>
-
 #endif /* CONFIG_RETPOLINE */
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -81,7 +81,7 @@
 .macro JMP_NOSPEC reg:req
 #ifdef CONFIG_RETPOLINE
 	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \
-		      __stringify(jmp __x86_retpoline_\reg), X86_FEATURE_RETPOLINE, \
+		      __stringify(jmp __x86_indirect_thunk_\reg), X86_FEATURE_RETPOLINE, \
 		      __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), X86_FEATURE_RETPOLINE_AMD
 #else
 	jmp	*%\reg
@@ -91,7 +91,7 @@
 .macro CALL_NOSPEC reg:req
 #ifdef CONFIG_RETPOLINE
 	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg), \
-		      __stringify(call __x86_retpoline_\reg), X86_FEATURE_RETPOLINE, \
+		      __stringify(call __x86_indirect_thunk_\reg), X86_FEATURE_RETPOLINE, \
 		      __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *%\reg), X86_FEATURE_RETPOLINE_AMD
 #else
 	call	*%\reg
@@ -129,7 +129,7 @@
 	ALTERNATIVE_2(						\
 	ANNOTATE_RETPOLINE_SAFE					\
 	"call *%[thunk_target]\n",				\
-	"call __x86_retpoline_%V[thunk_target]\n",		\
+	"call __x86_indirect_thunk_%V[thunk_target]\n",		\
 	X86_FEATURE_RETPOLINE,					\
 	"lfence;\n"						\
 	ANNOTATE_RETPOLINE_SAFE					\
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -10,27 +10,31 @@
 #include <asm/unwind_hints.h>
 #include <asm/frame.h>
 
-.macro THUNK reg
-	.section .text.__x86.indirect_thunk
-
-	.align 32
-SYM_FUNC_START(__x86_indirect_thunk_\reg)
-	JMP_NOSPEC \reg
-SYM_FUNC_END(__x86_indirect_thunk_\reg)
-
-SYM_FUNC_START_NOALIGN(__x86_retpoline_\reg)
+.macro RETPOLINE reg
 	ANNOTATE_INTRA_FUNCTION_CALL
-	call	.Ldo_rop_\@
+	call    .Ldo_rop_\@
 .Lspec_trap_\@:
 	UNWIND_HINT_EMPTY
 	pause
 	lfence
-	jmp	.Lspec_trap_\@
+	jmp .Lspec_trap_\@
 .Ldo_rop_\@:
-	mov	%\reg, (%_ASM_SP)
+	mov     %\reg, (%_ASM_SP)
 	UNWIND_HINT_FUNC
 	ret
-SYM_FUNC_END(__x86_retpoline_\reg)
+.endm
+
+.macro THUNK reg
+	.section .text.__x86.indirect_thunk
+
+	.align 32
+SYM_FUNC_START(__x86_indirect_thunk_\reg)
+
+	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \
+		      __stringify(RETPOLINE \reg), X86_FEATURE_RETPOLINE, \
+		      __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), X86_FEATURE_RETPOLINE_AMD
+
+SYM_FUNC_END(__x86_indirect_thunk_\reg)
 
 .endm
 
@@ -48,7 +52,6 @@ SYM_FUNC_END(__x86_retpoline_\reg)
 
 #define __EXPORT_THUNK(sym)	_ASM_NOKPROBE(sym); EXPORT_SYMBOL(sym)
 #define EXPORT_THUNK(reg)	__EXPORT_THUNK(__x86_indirect_thunk_ ## reg)
-#define EXPORT_RETPOLINE(reg)  __EXPORT_THUNK(__x86_retpoline_ ## reg)
 
 #undef GEN
 #define GEN(reg) THUNK reg
@@ -58,6 +61,3 @@ SYM_FUNC_END(__x86_retpoline_\reg)
 #define GEN(reg) EXPORT_THUNK(reg)
 #include <asm/GEN-for-each-reg.h>
 
-#undef GEN
-#define GEN(reg) EXPORT_RETPOLINE(reg)
-#include <asm/GEN-for-each-reg.h>
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -800,8 +800,7 @@ static int add_jump_destinations(struct
 		} else if (reloc->sym->type == STT_SECTION) {
 			dest_sec = reloc->sym->sec;
 			dest_off = arch_dest_reloc_offset(reloc->addend);
-		} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21) ||
-			   !strncmp(reloc->sym->name, "__x86_retpoline_", 16)) {
+		} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21)) {
 			/*
 			 * Retpoline jumps are really dynamic jumps in
 			 * disguise, so convert them accordingly.




* [PATCH v3 04/16] objtool: Correctly handle retpoline thunk calls
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
                   ` (2 preceding siblings ...)
  2021-03-26 15:12 ` [PATCH v3 03/16] x86/retpoline: Simplify retpolines Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 05/16] objtool: Per arch retpoline naming Peter Zijlstra
                   ` (12 subsequent siblings)
  16 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

Just like JMP handling, convert a direct CALL to a retpoline thunk
into a retpoline safe indirect CALL.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/check.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -953,6 +953,18 @@ static int add_call_destinations(struct
 					  dest_off);
 				return -1;
 			}
+
+		} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21)) {
+			/*
+			 * Retpoline calls are really dynamic calls in
+			 * disguise, so convert them accordingly.
+			 */
+			insn->type = INSN_CALL_DYNAMIC;
+			insn->retpoline_safe = true;
+
+			remove_insn_ops(insn);
+			continue;
+
 		} else
 			insn->call_dest = reloc->sym;
 




* [PATCH v3 05/16] objtool: Per arch retpoline naming
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
                   ` (3 preceding siblings ...)
  2021-03-26 15:12 ` [PATCH v3 04/16] objtool: Correctly handle retpoline thunk calls Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] objtool: Handle per " tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 06/16] objtool: Fix static_call list generation Peter Zijlstra
                   ` (11 subsequent siblings)
  16 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

The __x86_indirect_ naming is obviously not generic. Shorten to allow
matching some additional magic names later.
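
A stand-alone sketch of the shortened prefix match, for illustration.
struct symbol is reduced to just a name here (an assumption), and the
prefix length comes from sizeof rather than the literal 15 used in the
patch. The generic code keeps a __weak default that returns false; that
part is left out, since a weak and a strong definition cannot share one
translation unit.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Reduced stand-in for objtool's struct symbol. */
struct symbol {
	const char *name;
};

#define RETPOLINE_PREFIX "__x86_indirect_"

/* x86 flavour: any symbol starting with the prefix is a retpoline. */
static bool arch_is_retpoline(const struct symbol *sym)
{
	assert(sym != NULL && sym->name != NULL);
	return !strncmp(sym->name, RETPOLINE_PREFIX,
			sizeof(RETPOLINE_PREFIX) - 1);
}
```

The shorter prefix still matches __x86_indirect_thunk_rax, and it would
also match any other name that starts with __x86_indirect_.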

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/arch/x86/decode.c      |    5 +++++
 tools/objtool/check.c                |    9 +++++++--
 tools/objtool/include/objtool/arch.h |    2 ++
 3 files changed, 14 insertions(+), 2 deletions(-)

--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -692,3 +692,8 @@ int arch_decode_hint_reg(struct instruct
 
 	return 0;
 }
+
+bool arch_is_retpoline(struct symbol *sym)
+{
+	return !strncmp(sym->name, "__x86_indirect_", 15);
+}
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -850,6 +850,11 @@ static int add_ignore_alternatives(struc
 	return 0;
 }
 
+__weak bool arch_is_retpoline(struct symbol *sym)
+{
+	return false;
+}
+
 /*
  * Find the destination instructions for all jumps.
  */
@@ -872,7 +877,7 @@ static int add_jump_destinations(struct
 		} else if (reloc->sym->type == STT_SECTION) {
 			dest_sec = reloc->sym->sec;
 			dest_off = arch_dest_reloc_offset(reloc->addend);
-		} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21)) {
+		} else if (arch_is_retpoline(reloc->sym)) {
 			/*
 			 * Retpoline jumps are really dynamic jumps in
 			 * disguise, so convert them accordingly.
@@ -1026,7 +1031,7 @@ static int add_call_destinations(struct
 				return -1;
 			}
 
-		} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21)) {
+		} else if (arch_is_retpoline(reloc->sym)) {
 			/*
 			 * Retpoline calls are really dynamic calls in
 			 * disguise, so convert them accordingly.
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -85,4 +85,6 @@ const char *arch_nop_insn(int len);
 
 int arch_decode_hint_reg(struct instruction *insn, u8 sp_reg);
 
+bool arch_is_retpoline(struct symbol *sym);
+
 #endif /* _ARCH_H */




* [PATCH v3 06/16] objtool: Fix static_call list generation
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
                   ` (4 preceding siblings ...)
  2021-03-26 15:12 ` [PATCH v3 05/16] objtool: Per arch retpoline naming Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 07/16] objtool: Rework rebuild_reloc logic Peter Zijlstra
                   ` (10 subsequent siblings)
  16 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

Currently objtool generates tail call entries in
add_jump_destinations() but waits until validate_branch() to generate
the regular call entries. Move the latter to add_call_destinations()
for consistency.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/check.c |   18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1045,6 +1045,11 @@ static int add_call_destinations(struct
 		} else
 			insn->call_dest = reloc->sym;
 
+		if (insn->call_dest && insn->call_dest->static_call_tramp) {
+			list_add_tail(&insn->static_call_node,
+				      &file->static_call_list);
+		}
+
 		/*
 		 * Many compilers cannot disable KCOV with a function attribute
 		 * so they need a little help, NOP out any KCOV calls from noinstr
@@ -1788,6 +1794,9 @@ static int decode_sections(struct objtoo
 	if (ret)
 		return ret;
 
+	/*
+	 * Must be before add_{jump_call}_destination.
+	 */
 	ret = read_static_call_tramps(file);
 	if (ret)
 		return ret;
@@ -1800,6 +1809,10 @@ static int decode_sections(struct objtoo
 	if (ret)
 		return ret;
 
+	/*
+	 * Must be before add_call_destination(); it changes INSN_CALL to
+	 * INSN_JUMP.
+	 */
 	ret = read_intra_function_calls(file);
 	if (ret)
 		return ret;
@@ -2745,11 +2758,6 @@ static int validate_branch(struct objtoo
 			if (dead_end_function(file, insn->call_dest))
 				return 0;
 
-			if (insn->type == INSN_CALL && insn->call_dest->static_call_tramp) {
-				list_add_tail(&insn->static_call_node,
-					      &file->static_call_list);
-			}
-
 			break;
 
 		case INSN_JUMP_CONDITIONAL:




* [PATCH v3 07/16] objtool: Rework rebuild_reloc logic
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
                   ` (5 preceding siblings ...)
  2021-03-26 15:12 ` [PATCH v3 06/16] objtool: Fix static_call list generation Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` [tip: x86/core] objtool: Rework the elf_rebuild_reloc_section() logic tip-bot2 for Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 08/16] objtool: Add elf_create_reloc() helper Peter Zijlstra
                   ` (9 subsequent siblings)
  16 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

Instead of manually calling elf_rebuild_reloc_section() on sections
we've called elf_add_reloc() on, have elf_write() DTRT.

This makes it easier to add random relocations in places without
carefully tracking when we're done and which section needs flushing.
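
The idea is a plain dirty flag with the rebuild deferred to write-out.
A toy sketch follows, with made-up structures and names (add_reloc(),
write_elf() and the counters are hypothetical; none of this is the
objtool ELF code): adding a relocation only marks the section changed,
and the write pass rebuilds each changed section once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SECTIONS 4

/* Toy section: counts relocs added, relocs written, and rebuilds. */
struct section {
	int nr_relocs;
	int nr_written;
	int rebuilds;
	bool changed;
};

struct elf {
	struct section secs[MAX_SECTIONS];
	bool changed;
};

/* Adding a reloc only marks the section dirty. */
static void add_reloc(struct section *sec)
{
	assert(sec != NULL);
	sec->nr_relocs++;
	sec->changed = true;
}

/* Write-out rebuilds every dirty section exactly once. */
static int write_elf(struct elf *elf)
{
	size_t i;

	for (i = 0; i < MAX_SECTIONS; i++) {
		struct section *sec = &elf->secs[i];

		if (!sec->changed)
			continue;

		sec->nr_written = sec->nr_relocs;	/* the "rebuild" */
		sec->rebuilds++;
		sec->changed = false;
		elf->changed = true;
	}

	return 0;
}
```

Callers can add relocations in any order and to any section; nothing
has to remember which sections still need a rebuild.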

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/check.c               |    6 ------
 tools/objtool/elf.c                 |   20 ++++++++++++++------
 tools/objtool/include/objtool/elf.h |    1 -
 tools/objtool/orc_gen.c             |    3 ---
 4 files changed, 14 insertions(+), 16 deletions(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -542,9 +542,6 @@ static int create_static_call_sections(s
 		idx++;
 	}
 
-	if (elf_rebuild_reloc_section(file->elf, reloc_sec))
-		return -1;
-
 	return 0;
 }
 
@@ -614,9 +611,6 @@ static int create_mcount_loc_sections(st
 		idx++;
 	}
 
-	if (elf_rebuild_reloc_section(file->elf, reloc_sec))
-		return -1;
-
 	return 0;
 }
 
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -479,6 +479,8 @@ void elf_add_reloc(struct elf *elf, stru
 
 	list_add_tail(&reloc->list, &sec->reloc_list);
 	elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc));
+
+	sec->changed = true;
 }
 
 static int read_rel_reloc(struct section *sec, int i, struct reloc *reloc, unsigned int *symndx)
@@ -558,7 +560,9 @@ static int read_relocs(struct elf *elf)
 				return -1;
 			}
 
-			elf_add_reloc(elf, reloc);
+			list_add_tail(&reloc->list, &sec->reloc_list);
+			elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc));
+
 			nr_reloc++;
 		}
 		max_reloc = max(max_reloc, nr_reloc);
@@ -873,14 +877,11 @@ static int elf_rebuild_rela_reloc_sectio
 	return 0;
 }
 
-int elf_rebuild_reloc_section(struct elf *elf, struct section *sec)
+static int elf_rebuild_reloc_section(struct elf *elf, struct section *sec)
 {
 	struct reloc *reloc;
 	int nr;
 
-	sec->changed = true;
-	elf->changed = true;
-
 	nr = 0;
 	list_for_each_entry(reloc, &sec->reloc_list, list)
 		nr++;
@@ -944,9 +945,15 @@ int elf_write(struct elf *elf)
 	struct section *sec;
 	Elf_Scn *s;
 
-	/* Update section headers for changed sections: */
+	/* Update changed relocation sections and section headers: */
 	list_for_each_entry(sec, &elf->sections, list) {
 		if (sec->changed) {
+			if (sec->base &&
+			    elf_rebuild_reloc_section(elf, sec)) {
+				WARN("elf_rebuild_reloc_section");
+				return -1;
+			}
+
 			s = elf_getscn(elf->elf, sec->idx);
 			if (!s) {
 				WARN_ELF("elf_getscn");
@@ -958,6 +965,7 @@ int elf_write(struct elf *elf)
 			}
 
 			sec->changed = false;
+			elf->changed = true;
 		}
 	}
 
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -142,7 +142,6 @@ struct reloc *find_reloc_by_dest_range(c
 struct symbol *find_func_containing(struct section *sec, unsigned long offset);
 void insn_to_reloc_sym_addend(struct section *sec, unsigned long offset,
 			      struct reloc *reloc);
-int elf_rebuild_reloc_section(struct elf *elf, struct section *sec);
 
 #define for_each_sec(file, sec)						\
 	list_for_each_entry(sec, &file->elf->sections, list)
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -254,8 +254,5 @@ int orc_create(struct objtool_file *file
 			return -1;
 	}
 
-	if (elf_rebuild_reloc_section(file->elf, ip_rsec))
-		return -1;
-
 	return 0;
 }




* [PATCH v3 08/16] objtool: Add elf_create_reloc() helper
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
                   ` (6 preceding siblings ...)
  2021-03-26 15:12 ` [PATCH v3 07/16] objtool: Rework rebuild_reloc logic Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 09/16] objtool: Implicitly create reloc sections Peter Zijlstra
                   ` (8 subsequent siblings)
  16 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

We have 4 instances of adding a relocation. Create a common helper
to avoid growing even more.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/check.c               |   78 ++++++--------------------------
 tools/objtool/elf.c                 |   86 +++++++++++++++++++++++-------------
 tools/objtool/include/objtool/elf.h |   10 ++--
 tools/objtool/orc_gen.c             |   30 ++----------
 4 files changed, 85 insertions(+), 119 deletions(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -433,8 +433,7 @@ static int add_dead_ends(struct objtool_
 
 static int create_static_call_sections(struct objtool_file *file)
 {
-	struct section *sec, *reloc_sec;
-	struct reloc *reloc;
+	struct section *sec;
 	struct static_call_site *site;
 	struct instruction *insn;
 	struct symbol *key_sym;
@@ -460,8 +459,7 @@ static int create_static_call_sections(s
 	if (!sec)
 		return -1;
 
-	reloc_sec = elf_create_reloc_section(file->elf, sec, SHT_RELA);
-	if (!reloc_sec)
+	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
 		return -1;
 
 	idx = 0;
@@ -471,25 +469,11 @@ static int create_static_call_sections(s
 		memset(site, 0, sizeof(struct static_call_site));
 
 		/* populate reloc for 'addr' */
-		reloc = malloc(sizeof(*reloc));
-
-		if (!reloc) {
-			perror("malloc");
+		if (elf_add_reloc_to_insn(file->elf, sec,
+					  idx * sizeof(struct static_call_site),
+					  R_X86_64_PC32,
+					  insn->sec, insn->offset))
 			return -1;
-		}
-		memset(reloc, 0, sizeof(*reloc));
-
-		insn_to_reloc_sym_addend(insn->sec, insn->offset, reloc);
-		if (!reloc->sym) {
-			WARN_FUNC("static call tramp: missing containing symbol",
-				  insn->sec, insn->offset);
-			return -1;
-		}
-
-		reloc->type = R_X86_64_PC32;
-		reloc->offset = idx * sizeof(struct static_call_site);
-		reloc->sec = reloc_sec;
-		elf_add_reloc(file->elf, reloc);
 
 		/* find key symbol */
 		key_name = strdup(insn->call_dest->name);
@@ -526,18 +510,11 @@ static int create_static_call_sections(s
 		free(key_name);
 
 		/* populate reloc for 'key' */
-		reloc = malloc(sizeof(*reloc));
-		if (!reloc) {
-			perror("malloc");
+		if (elf_add_reloc(file->elf, sec,
+				  idx * sizeof(struct static_call_site) + 4,
+				  R_X86_64_PC32, key_sym,
+				  is_sibling_call(insn) * STATIC_CALL_SITE_TAIL))
 			return -1;
-		}
-		memset(reloc, 0, sizeof(*reloc));
-		reloc->sym = key_sym;
-		reloc->addend = is_sibling_call(insn) ? STATIC_CALL_SITE_TAIL : 0;
-		reloc->type = R_X86_64_PC32;
-		reloc->offset = idx * sizeof(struct static_call_site) + 4;
-		reloc->sec = reloc_sec;
-		elf_add_reloc(file->elf, reloc);
 
 		idx++;
 	}
@@ -547,8 +524,7 @@ static int create_static_call_sections(s
 
 static int create_mcount_loc_sections(struct objtool_file *file)
 {
-	struct section *sec, *reloc_sec;
-	struct reloc *reloc;
+	struct section *sec;
 	unsigned long *loc;
 	struct instruction *insn;
 	int idx;
@@ -571,8 +547,7 @@ static int create_mcount_loc_sections(st
 	if (!sec)
 		return -1;
 
-	reloc_sec = elf_create_reloc_section(file->elf, sec, SHT_RELA);
-	if (!reloc_sec)
+	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
 		return -1;
 
 	idx = 0;
@@ -581,32 +556,11 @@ static int create_mcount_loc_sections(st
 		loc = (unsigned long *)sec->data->d_buf + idx;
 		memset(loc, 0, sizeof(unsigned long));
 
-		reloc = malloc(sizeof(*reloc));
-		if (!reloc) {
-			perror("malloc");
+		if (elf_add_reloc_to_insn(file->elf, sec,
+					  idx * sizeof(unsigned long),
+					  R_X86_64_64,
+					  insn->sec, insn->offset))
 			return -1;
-		}
-		memset(reloc, 0, sizeof(*reloc));
-
-		if (insn->sec->sym) {
-			reloc->sym = insn->sec->sym;
-			reloc->addend = insn->offset;
-		} else {
-			reloc->sym = find_symbol_containing(insn->sec, insn->offset);
-
-			if (!reloc->sym) {
-				WARN("missing symbol for insn at offset 0x%lx\n",
-				     insn->offset);
-				return -1;
-			}
-
-			reloc->addend = insn->offset - reloc->sym->offset;
-		}
-
-		reloc->type = R_X86_64_64;
-		reloc->offset = idx * sizeof(unsigned long);
-		reloc->sec = reloc_sec;
-		elf_add_reloc(file->elf, reloc);
 
 		idx++;
 	}
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -211,32 +211,6 @@ struct reloc *find_reloc_by_dest(const s
 	return find_reloc_by_dest_range(elf, sec, offset, 1);
 }
 
-void insn_to_reloc_sym_addend(struct section *sec, unsigned long offset,
-			      struct reloc *reloc)
-{
-	if (sec->sym) {
-		reloc->sym = sec->sym;
-		reloc->addend = offset;
-		return;
-	}
-
-	/*
-	 * The Clang assembler strips section symbols, so we have to reference
-	 * the function symbol instead:
-	 */
-	reloc->sym = find_symbol_containing(sec, offset);
-	if (!reloc->sym) {
-		/*
-		 * Hack alert.  This happens when we need to reference the NOP
-		 * pad insn immediately after the function.
-		 */
-		reloc->sym = find_symbol_containing(sec, offset - 1);
-	}
-
-	if (reloc->sym)
-		reloc->addend = offset - reloc->sym->offset;
-}
-
 static int read_sections(struct elf *elf)
 {
 	Elf_Scn *s = NULL;
@@ -473,14 +447,66 @@ static int read_symbols(struct elf *elf)
 	return -1;
 }
 
-void elf_add_reloc(struct elf *elf, struct reloc *reloc)
+int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
+		  unsigned int type, struct symbol *sym, int addend)
 {
-	struct section *sec = reloc->sec;
+	struct reloc *reloc;
 
-	list_add_tail(&reloc->list, &sec->reloc_list);
+	reloc = malloc(sizeof(*reloc));
+	if (!reloc) {
+		perror("malloc");
+		return -1;
+	}
+	memset(reloc, 0, sizeof(*reloc));
+
+	reloc->sec = sec->reloc;
+	reloc->offset = offset;
+	reloc->type = type;
+	reloc->sym = sym;
+	reloc->addend = addend;
+
+	list_add_tail(&reloc->list, &sec->reloc->reloc_list);
 	elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc));
 
-	sec->changed = true;
+	sec->reloc->changed = true;
+
+	return 0;
+}
+
+int elf_add_reloc_to_insn(struct elf *elf, struct section *sec,
+			  unsigned long offset, unsigned int type,
+			  struct section *insn_sec, unsigned long insn_off)
+{
+	struct symbol *sym;
+	int addend;
+
+	if (insn_sec->sym) {
+		sym = insn_sec->sym;
+		addend = insn_off;
+
+	} else {
+		/*
+		 * The Clang assembler strips section symbols, so we have to
+		 * reference the function symbol instead:
+		 */
+		sym = find_symbol_containing(insn_sec, insn_off);
+		if (!sym) {
+			/*
+			 * Hack alert.  This happens when we need to reference
+			 * the NOP pad insn immediately after the function.
+			 */
+			sym = find_symbol_containing(insn_sec, insn_off - 1);
+		}
+
+		if (!sym) {
+			WARN("can't find symbol containing %s+0x%lx", insn_sec->name, insn_off);
+			return -1;
+		}
+
+		addend = insn_off - sym->offset;
+	}
+
+	return elf_add_reloc(elf, sec, offset, type, sym, addend);
 }
 
 static int read_rel_reloc(struct section *sec, int i, struct reloc *reloc, unsigned int *symndx)
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -123,7 +123,13 @@ static inline u32 reloc_hash(struct relo
 struct elf *elf_open_read(const char *name, int flags);
 struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr);
 struct section *elf_create_reloc_section(struct elf *elf, struct section *base, int reltype);
-void elf_add_reloc(struct elf *elf, struct reloc *reloc);
+
+int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
+		  unsigned int type, struct symbol *sym, int addend);
+int elf_add_reloc_to_insn(struct elf *elf, struct section *sec,
+			  unsigned long offset, unsigned int type,
+			  struct section *insn_sec, unsigned long insn_off);
+
 int elf_write_insn(struct elf *elf, struct section *sec,
 		   unsigned long offset, unsigned int len,
 		   const char *insn);
@@ -140,8 +146,6 @@ struct reloc *find_reloc_by_dest(const s
 struct reloc *find_reloc_by_dest_range(const struct elf *elf, struct section *sec,
 				     unsigned long offset, unsigned int len);
 struct symbol *find_func_containing(struct section *sec, unsigned long offset);
-void insn_to_reloc_sym_addend(struct section *sec, unsigned long offset,
-			      struct reloc *reloc);
 
 #define for_each_sec(file, sec)						\
 	list_for_each_entry(sec, &file->elf->sections, list)
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -82,12 +82,11 @@ static int init_orc_entry(struct orc_ent
 }
 
 static int write_orc_entry(struct elf *elf, struct section *orc_sec,
-			   struct section *ip_rsec, unsigned int idx,
+			   struct section *ip_sec, unsigned int idx,
 			   struct section *insn_sec, unsigned long insn_off,
 			   struct orc_entry *o)
 {
 	struct orc_entry *orc;
-	struct reloc *reloc;
 
 	/* populate ORC data */
 	orc = (struct orc_entry *)orc_sec->data->d_buf + idx;
@@ -96,25 +95,9 @@ static int write_orc_entry(struct elf *e
 	orc->bp_offset = bswap_if_needed(orc->bp_offset);
 
 	/* populate reloc for ip */
-	reloc = malloc(sizeof(*reloc));
-	if (!reloc) {
-		perror("malloc");
+	if (elf_add_reloc_to_insn(elf, ip_sec, idx * sizeof(int), R_X86_64_PC32,
+				  insn_sec, insn_off))
 		return -1;
-	}
-	memset(reloc, 0, sizeof(*reloc));
-
-	insn_to_reloc_sym_addend(insn_sec, insn_off, reloc);
-	if (!reloc->sym) {
-		WARN("missing symbol for insn at offset 0x%lx",
-		     insn_off);
-		return -1;
-	}
-
-	reloc->type = R_X86_64_PC32;
-	reloc->offset = idx * sizeof(int);
-	reloc->sec = ip_rsec;
-
-	elf_add_reloc(elf, reloc);
 
 	return 0;
 }
@@ -153,7 +136,7 @@ static unsigned long alt_group_len(struc
 
 int orc_create(struct objtool_file *file)
 {
-	struct section *sec, *ip_rsec, *orc_sec;
+	struct section *sec, *orc_sec;
 	unsigned int nr = 0, idx = 0;
 	struct orc_list_entry *entry;
 	struct list_head orc_list;
@@ -242,13 +225,12 @@ int orc_create(struct objtool_file *file
 	sec = elf_create_section(file->elf, ".orc_unwind_ip", 0, sizeof(int), nr);
 	if (!sec)
 		return -1;
-	ip_rsec = elf_create_reloc_section(file->elf, sec, SHT_RELA);
-	if (!ip_rsec)
+	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
 		return -1;
 
 	/* Write ORC entries to sections: */
 	list_for_each_entry(entry, &orc_list, list) {
-		if (write_orc_entry(file->elf, orc_sec, ip_rsec, idx++,
+		if (write_orc_entry(file->elf, orc_sec, sec, idx++,
 				    entry->insn_sec, entry->insn_off,
 				    &entry->orc))
 			return -1;
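
For readers following the thread, the symbol/addend selection that elf_add_reloc_to_insn() performs in the diff above can be sketched in isolation. This is only an illustration: struct sym_sketch, struct sec_sketch, containing() and resolve_insn() are hypothetical stand-ins, not objtool's real types or functions, and the linear search replaces objtool's rbtree lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for objtool's struct symbol and struct section. */
struct sym_sketch {
	const char *name;
	unsigned long offset;	/* start of the symbol within its section */
	unsigned long len;	/* size of the symbol in bytes */
};

struct sec_sketch {
	const struct sym_sketch *sec_sym;	/* section symbol, NULL if stripped */
	const struct sym_sketch *syms;		/* symbols defined in this section */
	size_t nr_syms;
};

/* Linear stand-in for find_symbol_containing(). */
static const struct sym_sketch *containing(const struct sec_sketch *sec,
					   unsigned long off)
{
	for (size_t i = 0; i < sec->nr_syms; i++) {
		const struct sym_sketch *s = &sec->syms[i];

		if (off >= s->offset && off < s->offset + s->len)
			return s;
	}
	return NULL;
}

/*
 * Resolve a section offset to a (symbol, addend) pair.
 * Returns 0 on success, -1 if no symbol covers the offset.
 */
static int resolve_insn(const struct sec_sketch *sec, unsigned long off,
			const struct sym_sketch **sym, long *addend)
{
	assert(sec && sym && addend);

	/* With a section symbol, the addend is simply the section offset. */
	if (sec->sec_sym) {
		*sym = sec->sec_sym;
		*addend = (long)off;
		return 0;
	}

	/* Otherwise fall back to the containing function symbol ... */
	*sym = containing(sec, off);
	if (!*sym)
		/* ... or the one ending right before a trailing pad insn. */
		*sym = containing(sec, off - 1);
	if (!*sym)
		return -1;

	*addend = (long)(off - (*sym)->offset);
	return 0;
}
```

With a section symbol present the addend equals the instruction offset; without one it becomes the offset relative to the containing function, which is what the Clang comment in the patch is about.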




* [PATCH v3 09/16] objtool: Implicitly create reloc sections
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
                   ` (7 preceding siblings ...)
  2021-03-26 15:12 ` [PATCH v3 08/16] objtool: Add elf_create_reloc() helper Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` [tip: x86/core] objtool: Create reloc sections implicitly tip-bot2 for Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 10/16] objtool: Extract elf_strtab_concat() Peter Zijlstra
                   ` (7 subsequent siblings)
  16 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC (permalink / raw)
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

Have elf_add_reloc() create the relocation section implicitly.

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/check.c               |    6 ------
 tools/objtool/elf.c                 |    9 ++++++++-
 tools/objtool/include/objtool/elf.h |    1 -
 tools/objtool/orc_gen.c             |    2 --
 4 files changed, 8 insertions(+), 10 deletions(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -459,9 +459,6 @@ static int create_static_call_sections(s
 	if (!sec)
 		return -1;
 
-	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
-		return -1;
-
 	idx = 0;
 	list_for_each_entry(insn, &file->static_call_list, static_call_node) {
 
@@ -547,9 +544,6 @@ static int create_mcount_loc_sections(st
 	if (!sec)
 		return -1;
 
-	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
-		return -1;
-
 	idx = 0;
 	list_for_each_entry(insn, &file->mcount_loc_list, mcount_loc_node) {
 
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -447,11 +447,18 @@ static int read_symbols(struct elf *elf)
 	return -1;
 }
 
+static struct section *elf_create_reloc_section(struct elf *elf,
+						struct section *base,
+						int reltype);
+
 int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
 		  unsigned int type, struct symbol *sym, int addend)
 {
 	struct reloc *reloc;
 
+	if (!sec->reloc && !elf_create_reloc_section(elf, sec, SHT_RELA))
+		return -1;
+
 	reloc = malloc(sizeof(*reloc));
 	if (!reloc) {
 		perror("malloc");
@@ -829,7 +836,7 @@ static struct section *elf_create_rela_r
 	return sec;
 }
 
-struct section *elf_create_reloc_section(struct elf *elf,
+static struct section *elf_create_reloc_section(struct elf *elf,
 					 struct section *base,
 					 int reltype)
 {
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -122,7 +122,6 @@ static inline u32 reloc_hash(struct relo
 
 struct elf *elf_open_read(const char *name, int flags);
 struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr);
-struct section *elf_create_reloc_section(struct elf *elf, struct section *base, int reltype);
 
 int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
 		  unsigned int type, struct symbol *sym, int addend);
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -225,8 +225,6 @@ int orc_create(struct objtool_file *file
 	sec = elf_create_section(file->elf, ".orc_unwind_ip", 0, sizeof(int), nr);
 	if (!sec)
 		return -1;
-	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
-		return -1;
 
 	/* Write ORC entries to sections: */
 	list_for_each_entry(entry, &orc_list, list) {
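
The create-on-first-use pattern this patch moves into elf_add_reloc() can be shown with a tiny stand-alone sketch. The names below (base_sketch, rsec_sketch, create_rsec(), add_reloc()) are made up for illustration and only mirror the shape of the `!sec->reloc && !elf_create_reloc_section(...)` check; they are not objtool code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a relocation section. */
struct rsec_sketch {
	unsigned int nr_relocs;
};

/* Hypothetical stand-in for the base section that owns it. */
struct base_sketch {
	struct rsec_sketch *reloc;	/* NULL until the first reloc is added */
};

/* Allocate the relocation section and attach it to its base section. */
static struct rsec_sketch *create_rsec(struct base_sketch *base)
{
	base->reloc = calloc(1, sizeof(*base->reloc));
	return base->reloc;
}

/* Add one relocation, creating the relocation section on first use. */
static int add_reloc(struct base_sketch *base)
{
	assert(base);

	if (!base->reloc && !create_rsec(base))
		return -1;

	base->reloc->nr_relocs++;
	return 0;
}
```

Callers no longer have to remember to create the relocation section first, which is why the three explicit elf_create_reloc_section() calls can be dropped in the diff above.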




* [PATCH v3 10/16] objtool: Extract elf_strtab_concat()
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
                   ` (8 preceding siblings ...)
  2021-03-26 15:12 ` [PATCH v3 09/16] objtool: Implicitly create reloc sections Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 11/16] objtool: Extract elf_symbol_add() Peter Zijlstra
                   ` (6 subsequent siblings)
  16 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC (permalink / raw)
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

Create a common helper to append strings to a strtab.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/elf.c |   60 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 38 insertions(+), 22 deletions(-)

--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -666,13 +666,48 @@ struct elf *elf_open_read(const char *na
 	return NULL;
 }
 
+static int elf_add_string(struct elf *elf, struct section *strtab, char *str)
+{
+	Elf_Data *data;
+	Elf_Scn *s;
+	int len;
+
+	if (!strtab)
+		strtab = find_section_by_name(elf, ".strtab");
+	if (!strtab) {
+		WARN("can't find .strtab section");
+		return -1;
+	}
+
+	s = elf_getscn(elf->elf, strtab->idx);
+	if (!s) {
+		WARN_ELF("elf_getscn");
+		return -1;
+	}
+
+	data = elf_newdata(s);
+	if (!data) {
+		WARN_ELF("elf_newdata");
+		return -1;
+	}
+
+	data->d_buf = str;
+	data->d_size = strlen(str) + 1;
+	data->d_align = 1;
+
+	len = strtab->len;
+	strtab->len += data->d_size;
+	strtab->changed = true;
+
+	return len;
+}
+
 struct section *elf_create_section(struct elf *elf, const char *name,
 				   unsigned int sh_flags, size_t entsize, int nr)
 {
 	struct section *sec, *shstrtab;
 	size_t size = entsize * nr;
 	Elf_Scn *s;
-	Elf_Data *data;
 
 	sec = malloc(sizeof(*sec));
 	if (!sec) {
@@ -729,7 +764,6 @@ struct section *elf_create_section(struc
 	sec->sh.sh_addralign = 1;
 	sec->sh.sh_flags = SHF_ALLOC | sh_flags;
 
-
 	/* Add section name to .shstrtab (or .strtab for Clang) */
 	shstrtab = find_section_by_name(elf, ".shstrtab");
 	if (!shstrtab)
@@ -738,27 +772,9 @@ struct section *elf_create_section(struc
 		WARN("can't find .shstrtab or .strtab section");
 		return NULL;
 	}
-
-	s = elf_getscn(elf->elf, shstrtab->idx);
-	if (!s) {
-		WARN_ELF("elf_getscn");
-		return NULL;
-	}
-
-	data = elf_newdata(s);
-	if (!data) {
-		WARN_ELF("elf_newdata");
+	sec->sh.sh_name = elf_add_string(elf, shstrtab, sec->name);
+	if (sec->sh.sh_name == -1)
 		return NULL;
-	}
-
-	data->d_buf = sec->name;
-	data->d_size = strlen(name) + 1;
-	data->d_align = 1;
-
-	sec->sh.sh_name = shstrtab->len;
-
-	shstrtab->len += strlen(name) + 1;
-	shstrtab->changed = true;
 
 	list_add_tail(&sec->list, &elf->sections);
 	elf_hash_add(elf->section_hash, &sec->hash, sec->idx);
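
The contract of elf_add_string(), append a NUL-terminated string and return the offset at which it starts, can be illustrated without libelf. Note the swap: the patch appends a new Elf_Data chunk via elf_newdata(), whereas this sketch grows one flat buffer with realloc(); strtab_sketch and strtab_add() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flat string table: concatenated NUL-terminated strings. */
struct strtab_sketch {
	char *buf;
	size_t len;
};

/* Append str (including its NUL); return its starting offset, or -1. */
static long strtab_add(struct strtab_sketch *tab, const char *str)
{
	size_t size = strlen(str) + 1;
	char *buf;
	long off;

	assert(tab);

	buf = realloc(tab->buf, tab->len + size);
	if (!buf)
		return -1;

	memcpy(buf + tab->len, str, size);
	tab->buf = buf;

	/* The old length is the offset of the new string. */
	off = (long)tab->len;
	tab->len += size;
	return off;
}
```

The returned offset is what ends up in sh_name (and, in a later patch, st_name), so it has to be sampled before the length is bumped, exactly as the helper does with `len = strtab->len`.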




* [PATCH v3 11/16] objtool: Extract elf_symbol_add()
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
                   ` (9 preceding siblings ...)
  2021-03-26 15:12 ` [PATCH v3 10/16] objtool: Extract elf_strtab_concat() Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 12/16] objtool: Add elf_create_undef_symbol() Peter Zijlstra
                   ` (5 subsequent siblings)
  16 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC (permalink / raw)
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

Create a common helper to add symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/elf.c |   56 ++++++++++++++++++++++++++++------------------------
 1 file changed, 31 insertions(+), 25 deletions(-)

--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -290,12 +290,39 @@ static int read_sections(struct elf *elf
 	return 0;
 }
 
+static void elf_add_symbol(struct elf *elf, struct symbol *sym)
+{
+	struct list_head *entry;
+	struct rb_node *pnode;
+
+	sym->type = GELF_ST_TYPE(sym->sym.st_info);
+	sym->bind = GELF_ST_BIND(sym->sym.st_info);
+
+	sym->offset = sym->sym.st_value;
+	sym->len = sym->sym.st_size;
+
+	rb_add(&sym->node, &sym->sec->symbol_tree, symbol_to_offset);
+	pnode = rb_prev(&sym->node);
+	if (pnode)
+		entry = &rb_entry(pnode, struct symbol, node)->list;
+	else
+		entry = &sym->sec->symbol_list;
+	list_add(&sym->list, entry);
+	elf_hash_add(elf->symbol_hash, &sym->hash, sym->idx);
+	elf_hash_add(elf->symbol_name_hash, &sym->name_hash, str_hash(sym->name));
+
+	/*
+	 * Don't store empty STT_NOTYPE symbols in the rbtree.  They
+	 * can exist within a function, confusing the sorting.
+	 */
+	if (!sym->len)
+		rb_erase(&sym->node, &sym->sec->symbol_tree);
+}
+
 static int read_symbols(struct elf *elf)
 {
 	struct section *symtab, *symtab_shndx, *sec;
 	struct symbol *sym, *pfunc;
-	struct list_head *entry;
-	struct rb_node *pnode;
 	int symbols_nr, i;
 	char *coldstr;
 	Elf_Data *shndx_data = NULL;
@@ -340,9 +367,6 @@ static int read_symbols(struct elf *elf)
 			goto err;
 		}
 
-		sym->type = GELF_ST_TYPE(sym->sym.st_info);
-		sym->bind = GELF_ST_BIND(sym->sym.st_info);
-
 		if ((sym->sym.st_shndx > SHN_UNDEF &&
 		     sym->sym.st_shndx < SHN_LORESERVE) ||
 		    (shndx_data && sym->sym.st_shndx == SHN_XINDEX)) {
@@ -355,32 +379,14 @@ static int read_symbols(struct elf *elf)
 				     sym->name);
 				goto err;
 			}
-			if (sym->type == STT_SECTION) {
+			if (GELF_ST_TYPE(sym->sym.st_info) == STT_SECTION) {
 				sym->name = sym->sec->name;
 				sym->sec->sym = sym;
 			}
 		} else
 			sym->sec = find_section_by_index(elf, 0);
 
-		sym->offset = sym->sym.st_value;
-		sym->len = sym->sym.st_size;
-
-		rb_add(&sym->node, &sym->sec->symbol_tree, symbol_to_offset);
-		pnode = rb_prev(&sym->node);
-		if (pnode)
-			entry = &rb_entry(pnode, struct symbol, node)->list;
-		else
-			entry = &sym->sec->symbol_list;
-		list_add(&sym->list, entry);
-		elf_hash_add(elf->symbol_hash, &sym->hash, sym->idx);
-		elf_hash_add(elf->symbol_name_hash, &sym->name_hash, str_hash(sym->name));
-
-		/*
-		 * Don't store empty STT_NOTYPE symbols in the rbtree.  They
-		 * can exist within a function, confusing the sorting.
-		 */
-		if (!sym->len)
-			rb_erase(&sym->node, &sym->sec->symbol_tree);
+		elf_add_symbol(elf, sym);
 	}
 
 	if (stats)




* [PATCH v3 12/16] objtool: Add elf_create_undef_symbol()
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
                   ` (10 preceding siblings ...)
  2021-03-26 15:12 ` [PATCH v3 11/16] objtool: Extract elf_symbol_add() Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 13/16] objtool: Keep track of retpoline call sites Peter Zijlstra
                   ` (4 subsequent siblings)
  16 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC (permalink / raw)
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

Allow objtool to create undefined symbols; this allows creating
relocations to symbols not currently in the symbol table.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/elf.c                 |   60 ++++++++++++++++++++++++++++++++++++
 tools/objtool/include/objtool/elf.h |    1 
 2 files changed, 61 insertions(+)

--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -715,6 +715,66 @@ static int elf_add_string(struct elf *el
 	return len;
 }
 
+struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name)
+{
+	struct section *symtab;
+	struct symbol *sym;
+	Elf_Data *data;
+	Elf_Scn *s;
+
+	sym = malloc(sizeof(*sym));
+	if (!sym) {
+		perror("malloc");
+		return NULL;
+	}
+	memset(sym, 0, sizeof(*sym));
+
+	sym->name = strdup(name);
+
+	sym->sym.st_name = elf_add_string(elf, NULL, sym->name);
+	if (sym->sym.st_name == -1)
+		return NULL;
+
+	sym->sym.st_info = GELF_ST_INFO(STB_GLOBAL, STT_NOTYPE);
+	// st_other 0
+	// st_shndx 0
+	// st_value 0
+	// st_size 0
+
+	symtab = find_section_by_name(elf, ".symtab");
+	if (!symtab) {
+		WARN("can't find .symtab");
+		return NULL;
+	}
+
+	s = elf_getscn(elf->elf, symtab->idx);
+	if (!s) {
+		WARN_ELF("elf_getscn");
+		return NULL;
+	}
+
+	data = elf_newdata(s);
+	if (!data) {
+		WARN_ELF("elf_newdata");
+		return NULL;
+	}
+
+	data->d_buf = &sym->sym;
+	data->d_size = sizeof(sym->sym);
+	data->d_align = 1;
+
+	sym->idx = symtab->len / sizeof(sym->sym);
+
+	symtab->len += data->d_size;
+	symtab->changed = true;
+
+	sym->sec = find_section_by_index(elf, 0);
+
+	elf_add_symbol(elf, sym);
+
+	return sym;
+}
+
 struct section *elf_create_section(struct elf *elf, const char *name,
 				   unsigned int sh_flags, size_t entsize, int nr)
 {
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -133,6 +133,7 @@ int elf_write_insn(struct elf *elf, stru
 		   unsigned long offset, unsigned int len,
 		   const char *insn);
 int elf_write_reloc(struct elf *elf, struct reloc *reloc);
+struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name);
 int elf_write(struct elf *elf);
 void elf_close(struct elf *elf);
 

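Two details of elf_create_undef_symbol() are easy to check by hand: the st_info byte for a global, untyped symbol, and the index of a symbol appended at the end of .symtab. The sketch below assumes the ELF64 symbol layout and uses SK_-prefixed macros and struct sk_elf64_sym as hypothetical local copies of the definitions in <elf.h>, so that it compiles without that header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the ELF64 binding/type encoding (see <elf.h>). */
#define SK_STB_GLOBAL		1
#define SK_STT_NOTYPE		0
#define SK_ST_INFO(bind, type)	(((bind) << 4) + ((type) & 0xf))
#define SK_ST_BIND(info)	((info) >> 4)
#define SK_ST_TYPE(info)	((info) & 0xf)

/* Same field layout as Elf64_Sym. */
struct sk_elf64_sym {
	uint32_t st_name;	/* offset of the name in .strtab */
	unsigned char st_info;	/* binding in the high nibble, type in the low */
	unsigned char st_other;
	uint16_t st_shndx;	/* 0 (SHN_UNDEF) for an undefined symbol */
	uint64_t st_value;
	uint64_t st_size;
};

/* Index of a symbol appended to a symbol table of symtab_len bytes. */
static size_t next_sym_idx(size_t symtab_len)
{
	assert(symtab_len % sizeof(struct sk_elf64_sym) == 0);
	return symtab_len / sizeof(struct sk_elf64_sym);
}
```

This is the same computation as `sym->idx = symtab->len / sizeof(sym->sym)` in the patch, taken before the section length is increased.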



* [PATCH v3 13/16] objtool: Keep track of retpoline call sites
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
                   ` (11 preceding siblings ...)
  2021-03-26 15:12 ` [PATCH v3 12/16] objtool: Add elf_create_undef_symbol() Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 14/16] objtool: Cache instruction relocs Peter Zijlstra
                   ` (3 subsequent siblings)
  16 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC (permalink / raw)
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

Provide infrastructure for architectures to rewrite/augment compiler
generated retpoline calls. Similar to what we do for static_call()s,
keep track of the instructions that are retpoline calls.

Use the same list_head, since a retpoline call cannot also be a
static_call.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/check.c                   |   34 +++++++++++++++++++++++++++-----
 tools/objtool/include/objtool/arch.h    |    2 +
 tools/objtool/include/objtool/check.h   |    2 -
 tools/objtool/include/objtool/objtool.h |    1 
 tools/objtool/objtool.c                 |    1 
 5 files changed, 34 insertions(+), 6 deletions(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -451,7 +451,7 @@ static int create_static_call_sections(s
 		return 0;
 
 	idx = 0;
-	list_for_each_entry(insn, &file->static_call_list, static_call_node)
+	list_for_each_entry(insn, &file->static_call_list, call_node)
 		idx++;
 
 	sec = elf_create_section(file->elf, ".static_call_sites", SHF_WRITE,
@@ -460,7 +460,7 @@ static int create_static_call_sections(s
 		return -1;
 
 	idx = 0;
-	list_for_each_entry(insn, &file->static_call_list, static_call_node) {
+	list_for_each_entry(insn, &file->static_call_list, call_node) {
 
 		site = (struct static_call_site *)sec->data->d_buf + idx;
 		memset(site, 0, sizeof(struct static_call_site));
@@ -829,13 +829,16 @@ static int add_jump_destinations(struct
 			else
 				insn->type = INSN_JUMP_DYNAMIC_CONDITIONAL;
 
+			list_add_tail(&insn->call_node,
+				      &file->retpoline_call_list);
+
 			insn->retpoline_safe = true;
 			continue;
 		} else if (insn->func) {
 			/* internal or external sibling call (with reloc) */
 			insn->call_dest = reloc->sym;
 			if (insn->call_dest->static_call_tramp) {
-				list_add_tail(&insn->static_call_node,
+				list_add_tail(&insn->call_node,
 					      &file->static_call_list);
 			}
 			continue;
@@ -897,7 +900,7 @@ static int add_jump_destinations(struct
 				/* internal sibling call (without reloc) */
 				insn->call_dest = insn->jump_dest->func;
 				if (insn->call_dest->static_call_tramp) {
-					list_add_tail(&insn->static_call_node,
+					list_add_tail(&insn->call_node,
 						      &file->static_call_list);
 				}
 			}
@@ -981,6 +984,9 @@ static int add_call_destinations(struct
 			insn->type = INSN_CALL_DYNAMIC;
 			insn->retpoline_safe = true;
 
+			list_add_tail(&insn->call_node,
+				      &file->retpoline_call_list);
+
 			remove_insn_ops(insn);
 			continue;
 
@@ -988,7 +994,7 @@ static int add_call_destinations(struct
 			insn->call_dest = reloc->sym;
 
 		if (insn->call_dest && insn->call_dest->static_call_tramp) {
-			list_add_tail(&insn->static_call_node,
+			list_add_tail(&insn->call_node,
 				      &file->static_call_list);
 		}
 
@@ -1714,6 +1720,11 @@ static void mark_rodata(struct objtool_f
 	file->rodata = found;
 }
 
+__weak int arch_rewrite_retpolines(struct objtool_file *file)
+{
+	return 0;
+}
+
 static int decode_sections(struct objtool_file *file)
 {
 	int ret;
@@ -1742,6 +1753,10 @@ static int decode_sections(struct objtoo
 	if (ret)
 		return ret;
 
+	/*
+	 * Must be before add_special_section_alts() as that depends on
+	 * jump_dest being set.
+	 */
 	ret = add_jump_destinations(file);
 	if (ret)
 		return ret;
@@ -1778,6 +1793,15 @@ static int decode_sections(struct objtoo
 	if (ret)
 		return ret;
 
+	/*
+	 * Must be after add_special_section_alts(), since this will emit
+	 * alternatives. Must be after add_{jump,call}_destination(), since
+	 * those create the call insn lists.
+	 */
+	ret = arch_rewrite_retpolines(file);
+	if (ret)
+		return ret;
+
 	return 0;
 }
 
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -87,4 +87,6 @@ int arch_decode_hint_reg(struct instruct
 
 bool arch_is_retpoline(struct symbol *sym);
 
+int arch_rewrite_retpolines(struct objtool_file *file);
+
 #endif /* _ARCH_H */
--- a/tools/objtool/include/objtool/check.h
+++ b/tools/objtool/include/objtool/check.h
@@ -39,7 +39,7 @@ struct alt_group {
 struct instruction {
 	struct list_head list;
 	struct hlist_node hash;
-	struct list_head static_call_node;
+	struct list_head call_node;
 	struct list_head mcount_loc_node;
 	struct section *sec;
 	unsigned long offset;
--- a/tools/objtool/include/objtool/objtool.h
+++ b/tools/objtool/include/objtool/objtool.h
@@ -18,6 +18,7 @@ struct objtool_file {
 	struct elf *elf;
 	struct list_head insn_list;
 	DECLARE_HASHTABLE(insn_hash, 20);
+	struct list_head retpoline_call_list;
 	struct list_head static_call_list;
 	struct list_head mcount_loc_list;
 	bool ignore_unreachables, c_file, hints, rodata;
--- a/tools/objtool/objtool.c
+++ b/tools/objtool/objtool.c
@@ -125,6 +125,7 @@ struct objtool_file *objtool_open_read(c
 
 	INIT_LIST_HEAD(&file.insn_list);
 	hash_init(file.insn_hash);
+	INIT_LIST_HEAD(&file.retpoline_call_list);
 	INIT_LIST_HEAD(&file.static_call_list);
 	INIT_LIST_HEAD(&file.mcount_loc_list);
 	file.c_file = !vmlinux && find_section_by_name(file.elf, ".comment");




* [PATCH v3 14/16] objtool: Cache instruction relocs
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
                   ` (12 preceding siblings ...)
  2021-03-26 15:12 ` [PATCH v3 13/16] objtool: Keep track of retpoline call sites Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 15/16] objtool: Skip magical retpoline .altinstr_replacement Peter Zijlstra
                   ` (2 subsequent siblings)
  16 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC (permalink / raw)
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

Track the reloc of instructions to avoid having to look them up again
later.

(Technically x86 instructions can have two relocations, but not jumps
and calls, for which we're using this.)

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/check.c                 |   28 ++++++++++++++++++++++------
 tools/objtool/include/objtool/check.h |    1 +
 2 files changed, 23 insertions(+), 6 deletions(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -803,6 +803,25 @@ __weak bool arch_is_retpoline(struct sym
 	return false;
 }
 
+#define NEGATIVE_RELOC	((void *)-1L)
+
+static struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn)
+{
+	if (insn->reloc == NEGATIVE_RELOC)
+		return NULL;
+
+	if (!insn->reloc) {
+		insn->reloc = find_reloc_by_dest_range(file->elf, insn->sec,
+						       insn->offset, insn->len);
+		if (!insn->reloc) {
+			insn->reloc = NEGATIVE_RELOC;
+			return NULL;
+		}
+	}
+
+	return insn->reloc;
+}
+
 /*
  * Find the destination instructions for all jumps.
  */
@@ -817,8 +836,7 @@ static int add_jump_destinations(struct
 		if (!is_static_jump(insn))
 			continue;
 
-		reloc = find_reloc_by_dest_range(file->elf, insn->sec,
-						 insn->offset, insn->len);
+		reloc = insn_reloc(file, insn);
 		if (!reloc) {
 			dest_sec = insn->sec;
 			dest_off = arch_jump_destination(insn);
@@ -950,8 +968,7 @@ static int add_call_destinations(struct
 		if (insn->type != INSN_CALL)
 			continue;
 
-		reloc = find_reloc_by_dest_range(file->elf, insn->sec,
-					       insn->offset, insn->len);
+		reloc = insn_reloc(file, insn);
 		if (!reloc) {
 			dest_off = arch_jump_destination(insn);
 			insn->call_dest = find_call_destination(insn->sec, dest_off);
@@ -1151,8 +1168,7 @@ static int handle_group_alt(struct objto
 		 * alternatives code can adjust the relative offsets
 		 * accordingly.
 		 */
-		alt_reloc = find_reloc_by_dest_range(file->elf, insn->sec,
-						   insn->offset, insn->len);
+		alt_reloc = insn_reloc(file, insn);
 		if (alt_reloc &&
 		    !arch_support_alt_relocation(special_alt, insn, alt_reloc)) {
 
--- a/tools/objtool/include/objtool/check.h
+++ b/tools/objtool/include/objtool/check.h
@@ -56,6 +56,7 @@ struct instruction {
 	struct instruction *jump_dest;
 	struct instruction *first_jump_src;
 	struct reloc *jump_table;
+	struct reloc *reloc;
 	struct list_head alts;
 	struct symbol *func;
 	struct list_head stack_ops;



^ permalink raw reply	[flat|nested] 82+ messages in thread
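
The caching in insn_reloc() above relies on a three-state pointer: NULL means "not looked up yet", a sentinel means "looked up, nothing there", and anything else is the cached hit. A minimal standalone sketch of that pattern follows. The struct layouts, slow_find(), cached_reloc() and the nr_lookups counter are stand-ins invented for illustration; they are not objtool's real definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; objtool's real structs carry far more state. */
struct reloc {
	unsigned long offset;
};

struct insn {
	unsigned long offset;
	unsigned int len;
	struct reloc *reloc;	/* NULL: not looked up yet */
};

/* Sentinel meaning "looked up, nothing found"; it is never dereferenced. */
#define NEGATIVE_RELOC	((struct reloc *)-1L)

static unsigned int nr_lookups;	/* counts trips through the slow path */

/* Slow path: linear scan for a reloc inside [offset, offset + len). */
static struct reloc *slow_find(struct reloc *tbl, size_t n,
			       unsigned long offset, unsigned int len)
{
	nr_lookups++;
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].offset >= offset && tbl[i].offset < offset + len)
			return &tbl[i];
	}
	return NULL;
}

/* Same control flow as insn_reloc(): hits and misses are both cached. */
static struct reloc *cached_reloc(struct reloc *tbl, size_t n,
				  struct insn *insn)
{
	assert(insn->len > 0);

	if (insn->reloc == NEGATIVE_RELOC)
		return NULL;

	if (!insn->reloc) {
		insn->reloc = slow_find(tbl, n, insn->offset, insn->len);
		if (!insn->reloc) {
			insn->reloc = NEGATIVE_RELOC;
			return NULL;
		}
	}

	return insn->reloc;
}
```

Without the sentinel, every instruction that has no relocation would repeat the full search on each call, since NULL alone cannot distinguish "unknown" from "absent".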

* [PATCH v3 15/16] objtool: Skip magical retpoline .altinstr_replacement
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
                   ` (13 preceding siblings ...)
  2021-03-26 15:12 ` [PATCH v3 14/16] objtool: Cache instruction relocs Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  2021-03-26 15:12 ` [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls Peter Zijlstra
  2021-03-30 15:02 ` [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Miroslav Benes
  16 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC (permalink / raw)
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

When the .altinstr_replacement is a retpoline, skip the alternative.
We already special case retpolines anyway.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/special.c |   12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

--- a/tools/objtool/special.c
+++ b/tools/objtool/special.c
@@ -106,6 +106,14 @@ static int get_alt_entry(struct elf *elf
 			return -1;
 		}
 
+		/*
+		 * Skip retpoline .altinstr_replacement... we already rewrite the
+		 * instructions for retpolines anyway, see arch_is_retpoline()
+		 * usage in add_{call,jump}_destinations().
+		 */
+		if (arch_is_retpoline(new_reloc->sym))
+			return 1;
+
 		alt->new_sec = new_reloc->sym->sec;
 		alt->new_off = (unsigned int)new_reloc->addend;
 
@@ -154,7 +162,9 @@ int special_get_alts(struct elf *elf, st
 			memset(alt, 0, sizeof(*alt));
 
 			ret = get_alt_entry(elf, entry, sec, idx, alt);
-			if (ret)
+			if (ret > 0)
+				continue;
+			if (ret < 0)
 				return ret;
 
 			list_add_tail(&alt->list, alts);



^ permalink raw reply	[flat|nested] 82+ messages in thread
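
The change to special_get_alts() above turns the return value of get_alt_entry() into a three-way signal: negative for an error, zero for "entry filled in", positive for "valid entry, but skip it". A small sketch of that convention, outside of objtool, might look like the following. struct entry, get_entry() and collect() are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry type; only the fields this sketch needs. */
struct entry {
	int corrupt;		/* parse failure: abort the whole walk */
	int is_retpoline;	/* valid entry that the caller should skip */
	int value;
};

/* Returns <0 on error, 0 when *out was filled in, >0 when skipped. */
static int get_entry(const struct entry *e, int *out)
{
	if (e->corrupt)
		return -1;
	if (e->is_retpoline)
		return 1;

	*out = e->value;
	return 0;
}

/* Collects up to cap values; returns the count kept or a negative error. */
static int collect(const struct entry *tbl, size_t n, int *kept, size_t cap)
{
	size_t nr = 0;

	assert(kept != NULL);

	for (size_t i = 0; i < n; i++) {
		int val = 0;
		int ret;

		ret = get_entry(&tbl[i], &val);
		if (ret > 0)
			continue;	/* skipped, not an error */
		if (ret < 0)
			return ret;	/* propagate the failure */

		if (nr < cap)
			kept[nr++] = val;
	}

	return (int)nr;
}
```

The point of splitting the old "if (ret)" into two tests is that a skip must not abort the walk, while an error still must.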

* [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
                   ` (14 preceding siblings ...)
  2021-03-26 15:12 ` [PATCH v3 15/16] objtool: Skip magical retpoline .altinstr_replacement Peter Zijlstra
@ 2021-03-26 15:12 ` Peter Zijlstra
  2021-03-29 16:38   ` Josh Poimboeuf
                     ` (2 more replies)
  2021-03-30 15:02 ` [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Miroslav Benes
  16 siblings, 3 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-03-26 15:12 UTC (permalink / raw)
  To: x86, jpoimboe, jgross, mbenes; +Cc: linux-kernel, peterz

When the compiler emits: "CALL __x86_indirect_thunk_\reg" for an
indirect call, have objtool rewrite it to:

	ALTERNATIVE "call __x86_indirect_thunk_\reg",
		    "call *%reg", ALT_NOT(X86_FEATURE_RETPOLINE)

Additionally, in order to not emit endless identical
.altinstr_replacement chunks, use a global symbol for them, see
__x86_indirect_alt_*.

This also saves objtool from having to do code generation.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/asm-prototypes.h |   12 ++-
 arch/x86/lib/retpoline.S              |   42 +++++++++++
 tools/objtool/arch/x86/decode.c       |  122 ++++++++++++++++++++++++++++++++++
 3 files changed, 173 insertions(+), 3 deletions(-)

--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -19,11 +19,19 @@ extern void cmpxchg8b_emu(void);
 
 #ifdef CONFIG_RETPOLINE
 
-#define DECL_INDIRECT_THUNK(reg) \
+#undef GEN
+#define GEN(reg) \
 	extern asmlinkage void __x86_indirect_thunk_ ## reg (void);
+#include <asm/GEN-for-each-reg.h>
+
+#undef GEN
+#define GEN(reg) \
+	extern asmlinkage void __x86_indirect_alt_call_ ## reg (void);
+#include <asm/GEN-for-each-reg.h>
 
 #undef GEN
-#define GEN(reg) DECL_INDIRECT_THUNK(reg)
+#define GEN(reg) \
+	extern asmlinkage void __x86_indirect_alt_jmp_ ## reg (void);
 #include <asm/GEN-for-each-reg.h>
 
 #endif /* CONFIG_RETPOLINE */
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -10,6 +10,8 @@
 #include <asm/unwind_hints.h>
 #include <asm/frame.h>
 
+	.section .text.__x86.indirect_thunk
+
 .macro RETPOLINE reg
 	ANNOTATE_INTRA_FUNCTION_CALL
 	call    .Ldo_rop_\@
@@ -25,9 +27,9 @@
 .endm
 
 .macro THUNK reg
-	.section .text.__x86.indirect_thunk
 
 	.align 32
+
 SYM_FUNC_START(__x86_indirect_thunk_\reg)
 
 	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \
@@ -39,6 +41,32 @@ SYM_FUNC_END(__x86_indirect_thunk_\reg)
 .endm
 
 /*
+ * This generates .altinstr_replacement symbols for use by objtool. They,
+ * however, must not actually live in .altinstr_replacement since that will be
+ * discarded after init, but module alternatives will also reference these
+ * symbols.
+ *
+ * Their names match the "__x86_indirect_" prefix to mark them as retpolines.
+ */
+.macro ALT_THUNK reg
+
+	.align 1
+
+SYM_FUNC_START_NOALIGN(__x86_indirect_alt_call_\reg)
+	ANNOTATE_RETPOLINE_SAFE
+1:	call	*%\reg
+2:	.skip	5-(2b-1b), 0x90
+SYM_FUNC_END(__x86_indirect_alt_call_\reg)
+
+SYM_FUNC_START_NOALIGN(__x86_indirect_alt_jmp_\reg)
+	ANNOTATE_RETPOLINE_SAFE
+1:	jmp	*%\reg
+2:	.skip	5-(2b-1b), 0x90
+SYM_FUNC_END(__x86_indirect_alt_jmp_\reg)
+
+.endm
+
+/*
  * Despite being an assembler file we can't just use .irp here
  * because __KSYM_DEPS__ only uses the C preprocessor and would
  * only see one instance of "__x86_indirect_thunk_\reg" rather
@@ -61,3 +89,15 @@ SYM_FUNC_END(__x86_indirect_thunk_\reg)
 #define GEN(reg) EXPORT_THUNK(reg)
 #include <asm/GEN-for-each-reg.h>
 
+#undef GEN
+#define GEN(reg) ALT_THUNK reg
+#include <asm/GEN-for-each-reg.h>
+
+#undef GEN
+#define GEN(reg) __EXPORT_THUNK(__x86_indirect_alt_call_ ## reg)
+#include <asm/GEN-for-each-reg.h>
+
+#undef GEN
+#define GEN(reg) __EXPORT_THUNK(__x86_indirect_alt_jmp_ ## reg)
+#include <asm/GEN-for-each-reg.h>
+
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -19,6 +19,7 @@
 #include <objtool/elf.h>
 #include <objtool/arch.h>
 #include <objtool/warn.h>
+#include <arch/elf.h>
 
 static int is_x86_64(const struct elf *elf)
 {
@@ -657,6 +658,122 @@ const char *arch_nop_insn(int len)
 	return nops[len-1];
 }
 
+/* asm/alternative.h ? */
+
+#define ALTINSTR_FLAG_INV	(1 << 15)
+#define ALT_NOT(feat)		((feat) | ALTINSTR_FLAG_INV)
+
+struct alt_instr {
+	s32 instr_offset;	/* original instruction */
+	s32 repl_offset;	/* offset to replacement instruction */
+	u16 cpuid;		/* cpuid bit set for replacement */
+	u8  instrlen;		/* length of original instruction */
+	u8  replacementlen;	/* length of new instruction */
+} __packed;
+
+static int elf_add_alternative(struct elf *elf,
+			       struct instruction *orig, struct symbol *sym,
+			       int cpuid, u8 orig_len, u8 repl_len)
+{
+	const int size = sizeof(struct alt_instr);
+	struct alt_instr *alt;
+	struct section *sec;
+	Elf_Scn *s;
+
+	sec = find_section_by_name(elf, ".altinstructions");
+	if (!sec) {
+		sec = elf_create_section(elf, ".altinstructions",
+					 SHF_WRITE, size, 0);
+
+		if (!sec) {
+			WARN_ELF("elf_create_section");
+			return -1;
+		}
+	}
+
+	s = elf_getscn(elf->elf, sec->idx);
+	if (!s) {
+		WARN_ELF("elf_getscn");
+		return -1;
+	}
+
+	sec->data = elf_newdata(s);
+	if (!sec->data) {
+		WARN_ELF("elf_newdata");
+		return -1;
+	}
+
+	sec->data->d_size = size;
+	sec->data->d_align = 1;
+
+	alt = sec->data->d_buf = malloc(size);
+	if (!sec->data->d_buf) {
+		perror("malloc");
+		return -1;
+	}
+	memset(sec->data->d_buf, 0, size);
+
+	if (elf_add_reloc_to_insn(elf, sec, sec->sh.sh_size,
+				  R_X86_64_PC32, orig->sec, orig->offset)) {
+		WARN("elf_create_reloc: alt_instr::instr_offset");
+		return -1;
+	}
+
+	if (elf_add_reloc(elf, sec, sec->sh.sh_size + 4,
+			  R_X86_64_PC32, sym, 0)) {
+		WARN("elf_create_reloc: alt_instr::repl_offset");
+		return -1;
+	}
+
+	alt->cpuid = cpuid;
+	alt->instrlen = orig_len;
+	alt->replacementlen = repl_len;
+
+	sec->sh.sh_size += size;
+	sec->changed = true;
+
+	return 0;
+}
+
+#define X86_FEATURE_RETPOLINE                ( 7*32+12)
+
+int arch_rewrite_retpolines(struct objtool_file *file)
+{
+	struct instruction *insn;
+	struct reloc *reloc;
+	struct symbol *sym;
+	char name[32] = "";
+
+	list_for_each_entry(insn, &file->retpoline_call_list, call_node) {
+
+		if (!strcmp(insn->sec->name, ".text.__x86.indirect_thunk"))
+			continue;
+
+		reloc = insn->reloc;
+
+		sprintf(name, "__x86_indirect_alt_%s_%s",
+			insn->type == INSN_JUMP_DYNAMIC ? "jmp" : "call",
+			reloc->sym->name + 21);
+
+		sym = find_symbol_by_name(file->elf, name);
+		if (!sym) {
+			sym = elf_create_undef_symbol(file->elf, name);
+			if (!sym) {
+				WARN("elf_create_undef_symbol");
+				return -1;
+			}
+		}
+
+		if (elf_add_alternative(file->elf, insn, sym,
+					ALT_NOT(X86_FEATURE_RETPOLINE), 5, 5)) {
+			WARN("elf_add_alternative");
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
 int arch_decode_hint_reg(struct instruction *insn, u8 sp_reg)
 {
 	struct cfi_reg *cfa = &insn->cfi.cfa;



^ permalink raw reply	[flat|nested] 82+ messages in thread
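
In arch_rewrite_retpolines() above, the "+ 21" skips the "__x86_indirect_thunk_" prefix (21 characters) so that only the register suffix is appended to the new symbol name. The sketch below spells that out with a named prefix and a bounds-checked snprintf(). alt_thunk_name() and THUNK_PREFIX are hypothetical names for illustration, not something the patch defines.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define THUNK_PREFIX	"__x86_indirect_thunk_"

/*
 * Build "__x86_indirect_alt_{jmp,call}_<reg>" from a thunk symbol name.
 * Returns 0 on success, -1 if the name lacks the prefix or does not fit.
 * Hypothetical helper, not part of objtool.
 */
static int alt_thunk_name(char *buf, size_t size, const char *thunk,
			  int is_jmp)
{
	const size_t plen = strlen(THUNK_PREFIX);	/* 21 */
	int n;

	assert(buf != NULL && thunk != NULL);

	if (strncmp(thunk, THUNK_PREFIX, plen) != 0)
		return -1;

	n = snprintf(buf, size, "__x86_indirect_alt_%s_%s",
		     is_jmp ? "jmp" : "call", thunk + plen);
	if (n < 0 || (size_t)n >= size)
		return -1;	/* output would have been truncated */

	return 0;
}
```

With the register names used on x86 the longest result, such as "__x86_indirect_alt_call_r15", is 27 characters, so the 32-byte buffer in the patch is sufficient; the check here only guards against unexpected input.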

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-03-26 15:12 ` [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls Peter Zijlstra
@ 2021-03-29 16:38   ` Josh Poimboeuf
  2021-06-02 15:51     ` Lukasz Majczak
  2021-04-01 15:08   ` [tip: x86/core] objtool/x86: " tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  2 siblings, 1 reply; 82+ messages in thread
From: Josh Poimboeuf @ 2021-03-29 16:38 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, jgross, mbenes, linux-kernel

On Fri, Mar 26, 2021 at 04:12:15PM +0100, Peter Zijlstra wrote:
> @@ -61,3 +89,15 @@ SYM_FUNC_END(__x86_indirect_thunk_\reg)
>  #define GEN(reg) EXPORT_THUNK(reg)
>  #include <asm/GEN-for-each-reg.h>
>  
> +#undef GEN
> +#define GEN(reg) ALT_THUNK reg
> +#include <asm/GEN-for-each-reg.h>
> +
> +#undef GEN
> +#define GEN(reg) __EXPORT_THUNK(__x86_indirect_alt_call_ ## reg)
> +#include <asm/GEN-for-each-reg.h>
> +
> +#undef GEN
> +#define GEN(reg) __EXPORT_THUNK(__x86_indirect_alt_jmp_ ## reg)
> +#include <asm/GEN-for-each-reg.h>
> +

Git complains about this last newline.

Otherwise everything looks pretty good to me.  Let me run it through the
test matrix.

-- 
Josh


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE
  2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
                   ` (15 preceding siblings ...)
  2021-03-26 15:12 ` [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls Peter Zijlstra
@ 2021-03-30 15:02 ` Miroslav Benes
  16 siblings, 0 replies; 82+ messages in thread
From: Miroslav Benes @ 2021-03-30 15:02 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: x86, jpoimboe, jgross, linux-kernel

On Fri, 26 Mar 2021, Peter Zijlstra wrote:

> Hi, another week, another update :-)
> 
> Respin of the !RETPOLINE optimization patches.
> 
> Boris, the first 3 should probably go into tip/x86/core, it's an ungodly tangle
> since it relies on the insn decoder patches in tip/x86/core, the NOP patches in
> tip/x86/cpu and the alternative patches in tip/x86/alternatives.
> 
> Just to make life easy I'd suggest merging everything in x86/core and
> forgetting about the other topic branches (that's what I ended up doing locally).
> 
> The remaining 13 patches depend on the first 3 as well as on the work in
> tip/objtool/core, just to make life more interesting still ;-)
> 
> All except the last 4 patches should be fairly uncontroversial (I hope...).
> 
> There's a fair number of new patches and another few have been completely
> rewritten, but it all seems to work nicely.

Reviewed-by: Miroslav Benes <mbenes@suse.cz>

for the objtool changes. All looks much better in this version.

I have only one minor thing. There are only two call sites of 
elf_add_string(). The one in elf_create_section() passes shstrtab, the 
other one in elf_create_undef_symbol() NULL. elf_add_string() then 
retrieves it itself. I think it would be nicer to just call 
find_section_by_name() in elf_create_undef_symbol(), pass it down and make 
it consistent. Might be a matter of taste.

Miroslav

^ permalink raw reply	[flat|nested] 82+ messages in thread

* [tip: x86/core] objtool/x86: Rewrite retpoline thunk calls
  2021-03-26 15:12 ` [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls Peter Zijlstra
  2021-03-29 16:38   ` Josh Poimboeuf
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  2 siblings, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     f31390437ce984118215169d75570e365457ec23
Gitweb:        https://git.kernel.org/tip/f31390437ce984118215169d75570e365457ec23
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:15 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 01 Apr 2021 14:30:45 +02:00

objtool/x86: Rewrite retpoline thunk calls

When the compiler emits: "CALL __x86_indirect_thunk_\reg" for an
indirect call, have objtool rewrite it to:

	ALTERNATIVE "call __x86_indirect_thunk_\reg",
		    "call *%reg", ALT_NOT(X86_FEATURE_RETPOLINE)

Additionally, in order to not emit endless identical
.altinstr_replacement chunks, use a global symbol for them, see
__x86_indirect_alt_*.

This also saves objtool from having to do code generation.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.320177914@infradead.org
---
 arch/x86/include/asm/asm-prototypes.h |  12 ++-
 arch/x86/lib/retpoline.S              |  41 ++++++++-
 tools/objtool/arch/x86/decode.c       | 117 +++++++++++++++++++++++++-
 3 files changed, 167 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h
index 0545b07..4cb726c 100644
--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -19,11 +19,19 @@ extern void cmpxchg8b_emu(void);
 
 #ifdef CONFIG_RETPOLINE
 
-#define DECL_INDIRECT_THUNK(reg) \
+#undef GEN
+#define GEN(reg) \
 	extern asmlinkage void __x86_indirect_thunk_ ## reg (void);
+#include <asm/GEN-for-each-reg.h>
+
+#undef GEN
+#define GEN(reg) \
+	extern asmlinkage void __x86_indirect_alt_call_ ## reg (void);
+#include <asm/GEN-for-each-reg.h>
 
 #undef GEN
-#define GEN(reg) DECL_INDIRECT_THUNK(reg)
+#define GEN(reg) \
+	extern asmlinkage void __x86_indirect_alt_jmp_ ## reg (void);
 #include <asm/GEN-for-each-reg.h>
 
 #endif /* CONFIG_RETPOLINE */
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index d2c0d14..4d32cb0 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -10,6 +10,8 @@
 #include <asm/unwind_hints.h>
 #include <asm/frame.h>
 
+	.section .text.__x86.indirect_thunk
+
 .macro RETPOLINE reg
 	ANNOTATE_INTRA_FUNCTION_CALL
 	call    .Ldo_rop_\@
@@ -25,9 +27,9 @@
 .endm
 
 .macro THUNK reg
-	.section .text.__x86.indirect_thunk
 
 	.align 32
+
 SYM_FUNC_START(__x86_indirect_thunk_\reg)
 
 	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \
@@ -39,6 +41,32 @@ SYM_FUNC_END(__x86_indirect_thunk_\reg)
 .endm
 
 /*
+ * This generates .altinstr_replacement symbols for use by objtool. They,
+ * however, must not actually live in .altinstr_replacement since that will be
+ * discarded after init, but module alternatives will also reference these
+ * symbols.
+ *
+ * Their names match the "__x86_indirect_" prefix to mark them as retpolines.
+ */
+.macro ALT_THUNK reg
+
+	.align 1
+
+SYM_FUNC_START_NOALIGN(__x86_indirect_alt_call_\reg)
+	ANNOTATE_RETPOLINE_SAFE
+1:	call	*%\reg
+2:	.skip	5-(2b-1b), 0x90
+SYM_FUNC_END(__x86_indirect_alt_call_\reg)
+
+SYM_FUNC_START_NOALIGN(__x86_indirect_alt_jmp_\reg)
+	ANNOTATE_RETPOLINE_SAFE
+1:	jmp	*%\reg
+2:	.skip	5-(2b-1b), 0x90
+SYM_FUNC_END(__x86_indirect_alt_jmp_\reg)
+
+.endm
+
+/*
  * Despite being an assembler file we can't just use .irp here
  * because __KSYM_DEPS__ only uses the C preprocessor and would
  * only see one instance of "__x86_indirect_thunk_\reg" rather
@@ -61,3 +89,14 @@ SYM_FUNC_END(__x86_indirect_thunk_\reg)
 #define GEN(reg) EXPORT_THUNK(reg)
 #include <asm/GEN-for-each-reg.h>
 
+#undef GEN
+#define GEN(reg) ALT_THUNK reg
+#include <asm/GEN-for-each-reg.h>
+
+#undef GEN
+#define GEN(reg) __EXPORT_THUNK(__x86_indirect_alt_call_ ## reg)
+#include <asm/GEN-for-each-reg.h>
+
+#undef GEN
+#define GEN(reg) __EXPORT_THUNK(__x86_indirect_alt_jmp_ ## reg)
+#include <asm/GEN-for-each-reg.h>
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index e5fa3a5..44375fa 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -16,6 +16,7 @@
 #include <objtool/elf.h>
 #include <objtool/arch.h>
 #include <objtool/warn.h>
+#include <arch/elf.h>
 
 static unsigned char op_to_cfi_reg[][2] = {
 	{CFI_AX, CFI_R8},
@@ -610,6 +611,122 @@ const char *arch_nop_insn(int len)
 	return nops[len-1];
 }
 
+/* asm/alternative.h ? */
+
+#define ALTINSTR_FLAG_INV	(1 << 15)
+#define ALT_NOT(feat)		((feat) | ALTINSTR_FLAG_INV)
+
+struct alt_instr {
+	s32 instr_offset;	/* original instruction */
+	s32 repl_offset;	/* offset to replacement instruction */
+	u16 cpuid;		/* cpuid bit set for replacement */
+	u8  instrlen;		/* length of original instruction */
+	u8  replacementlen;	/* length of new instruction */
+} __packed;
+
+static int elf_add_alternative(struct elf *elf,
+			       struct instruction *orig, struct symbol *sym,
+			       int cpuid, u8 orig_len, u8 repl_len)
+{
+	const int size = sizeof(struct alt_instr);
+	struct alt_instr *alt;
+	struct section *sec;
+	Elf_Scn *s;
+
+	sec = find_section_by_name(elf, ".altinstructions");
+	if (!sec) {
+		sec = elf_create_section(elf, ".altinstructions",
+					 SHF_WRITE, size, 0);
+
+		if (!sec) {
+			WARN_ELF("elf_create_section");
+			return -1;
+		}
+	}
+
+	s = elf_getscn(elf->elf, sec->idx);
+	if (!s) {
+		WARN_ELF("elf_getscn");
+		return -1;
+	}
+
+	sec->data = elf_newdata(s);
+	if (!sec->data) {
+		WARN_ELF("elf_newdata");
+		return -1;
+	}
+
+	sec->data->d_size = size;
+	sec->data->d_align = 1;
+
+	alt = sec->data->d_buf = malloc(size);
+	if (!sec->data->d_buf) {
+		perror("malloc");
+		return -1;
+	}
+	memset(sec->data->d_buf, 0, size);
+
+	if (elf_add_reloc_to_insn(elf, sec, sec->sh.sh_size,
+				  R_X86_64_PC32, orig->sec, orig->offset)) {
+		WARN("elf_create_reloc: alt_instr::instr_offset");
+		return -1;
+	}
+
+	if (elf_add_reloc(elf, sec, sec->sh.sh_size + 4,
+			  R_X86_64_PC32, sym, 0)) {
+		WARN("elf_create_reloc: alt_instr::repl_offset");
+		return -1;
+	}
+
+	alt->cpuid = cpuid;
+	alt->instrlen = orig_len;
+	alt->replacementlen = repl_len;
+
+	sec->sh.sh_size += size;
+	sec->changed = true;
+
+	return 0;
+}
+
+#define X86_FEATURE_RETPOLINE                ( 7*32+12)
+
+int arch_rewrite_retpolines(struct objtool_file *file)
+{
+	struct instruction *insn;
+	struct reloc *reloc;
+	struct symbol *sym;
+	char name[32] = "";
+
+	list_for_each_entry(insn, &file->retpoline_call_list, call_node) {
+
+		if (!strcmp(insn->sec->name, ".text.__x86.indirect_thunk"))
+			continue;
+
+		reloc = insn->reloc;
+
+		sprintf(name, "__x86_indirect_alt_%s_%s",
+			insn->type == INSN_JUMP_DYNAMIC ? "jmp" : "call",
+			reloc->sym->name + 21);
+
+		sym = find_symbol_by_name(file->elf, name);
+		if (!sym) {
+			sym = elf_create_undef_symbol(file->elf, name);
+			if (!sym) {
+				WARN("elf_create_undef_symbol");
+				return -1;
+			}
+		}
+
+		if (elf_add_alternative(file->elf, insn, sym,
+					ALT_NOT(X86_FEATURE_RETPOLINE), 5, 5)) {
+			WARN("elf_add_alternative");
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
 int arch_decode_hint_reg(struct instruction *insn, u8 sp_reg)
 {
 	struct cfi_reg *cfa = &insn->cfi.cfa;

^ permalink raw reply related	[flat|nested] 82+ messages in thread
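
Two details of elf_add_alternative() above are easy to check in isolation: the second relocation is placed at "sh_size + 4" because repl_offset sits 4 bytes into the packed struct, and ALT_NOT() simply sets bit 15 of the 16-bit cpuid field. The sketch below mirrors the struct as declared in this patch, using stdint types in place of the kernel's s32/u16/u8. fill_alt(), alt_is_inverted() and alt_feature() are hypothetical helpers for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALTINSTR_FLAG_INV	(1 << 15)
#define ALT_NOT(feat)		((feat) | ALTINSTR_FLAG_INV)
#define X86_FEATURE_RETPOLINE	(7*32 + 12)

/* Same field order and packing as the struct in the patch. */
struct alt_instr {
	int32_t  instr_offset;	/* original instruction */
	int32_t  repl_offset;	/* offset to replacement instruction */
	uint16_t cpuid;		/* feature bit, optionally inverted */
	uint8_t  instrlen;	/* length of original instruction */
	uint8_t  replacementlen;	/* length of new instruction */
} __attribute__((packed));

/* Fill the non-relocated fields of one entry. */
static void fill_alt(struct alt_instr *alt, int cpuid,
		     uint8_t orig_len, uint8_t repl_len)
{
	assert(cpuid >= 0 && cpuid <= UINT16_MAX);

	alt->cpuid = (uint16_t)cpuid;
	alt->instrlen = orig_len;
	alt->replacementlen = repl_len;
}

/* True when the entry applies if the feature is NOT set. */
static int alt_is_inverted(const struct alt_instr *alt)
{
	return !!(alt->cpuid & ALTINSTR_FLAG_INV);
}

/* Feature number with the inversion flag masked off. */
static unsigned int alt_feature(const struct alt_instr *alt)
{
	return (unsigned int)(alt->cpuid & ~ALTINSTR_FLAG_INV);
}
```

The two offsets are left at zero here because, in the patch, they are filled in by the R_X86_64_PC32 relocations rather than by direct stores.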

* [tip: x86/core] objtool: Skip magical retpoline .altinstr_replacement
  2021-03-26 15:12 ` [PATCH v3 15/16] objtool: Skip magical retpoline .altinstr_replacement Peter Zijlstra
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     68a59124f4c6363de619fea63231a97dd220a12c
Gitweb:        https://git.kernel.org/tip/68a59124f4c6363de619fea63231a97dd220a12c
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:14 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 01 Apr 2021 13:29:40 +02:00

objtool: Skip magical retpoline .altinstr_replacement

When the .altinstr_replacement is a retpoline, skip the alternative.
We already special case retpolines anyway.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.259429287@infradead.org
---
 tools/objtool/special.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/tools/objtool/special.c b/tools/objtool/special.c
index 2c7fbda..07b21cf 100644
--- a/tools/objtool/special.c
+++ b/tools/objtool/special.c
@@ -106,6 +106,14 @@ static int get_alt_entry(struct elf *elf, struct special_entry *entry,
 			return -1;
 		}
 
+		/*
+		 * Skip retpoline .altinstr_replacement... we already rewrite the
+		 * instructions for retpolines anyway, see arch_is_retpoline()
+		 * usage in add_{call,jump}_destinations().
+		 */
+		if (arch_is_retpoline(new_reloc->sym))
+			return 1;
+
 		alt->new_sec = new_reloc->sym->sec;
 		alt->new_off = (unsigned int)new_reloc->addend;
 
@@ -154,7 +162,9 @@ int special_get_alts(struct elf *elf, struct list_head *alts)
 			memset(alt, 0, sizeof(*alt));
 
 			ret = get_alt_entry(elf, entry, sec, idx, alt);
-			if (ret)
+			if (ret > 0)
+				continue;
+			if (ret < 0)
 				return ret;
 
 			list_add_tail(&alt->list, alts);

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [tip: x86/core] objtool: Cache instruction relocs
  2021-03-26 15:12 ` [PATCH v3 14/16] objtool: Cache instruction relocs Peter Zijlstra
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     4ecdc0265dc911adba0772fd6e816d48da678fe7
Gitweb:        https://git.kernel.org/tip/4ecdc0265dc911adba0772fd6e816d48da678fe7
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:13 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 01 Apr 2021 13:25:38 +02:00

objtool: Cache instruction relocs

Track the reloc of instructions to avoid having to look them up again
later.

(Technically x86 instructions can have two relocations, but not jumps
and calls, for which we're using this.)

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.195441549@infradead.org
---
 tools/objtool/check.c                 | 28 ++++++++++++++++++++------
 tools/objtool/include/objtool/check.h |  1 +-
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 77074db..1f4154f 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -797,6 +797,25 @@ __weak bool arch_is_retpoline(struct symbol *sym)
 	return false;
 }
 
+#define NEGATIVE_RELOC	((void *)-1L)
+
+static struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn)
+{
+	if (insn->reloc == NEGATIVE_RELOC)
+		return NULL;
+
+	if (!insn->reloc) {
+		insn->reloc = find_reloc_by_dest_range(file->elf, insn->sec,
+						       insn->offset, insn->len);
+		if (!insn->reloc) {
+			insn->reloc = NEGATIVE_RELOC;
+			return NULL;
+		}
+	}
+
+	return insn->reloc;
+}
+
 /*
  * Find the destination instructions for all jumps.
  */
@@ -811,8 +830,7 @@ static int add_jump_destinations(struct objtool_file *file)
 		if (!is_static_jump(insn))
 			continue;
 
-		reloc = find_reloc_by_dest_range(file->elf, insn->sec,
-						 insn->offset, insn->len);
+		reloc = insn_reloc(file, insn);
 		if (!reloc) {
 			dest_sec = insn->sec;
 			dest_off = arch_jump_destination(insn);
@@ -944,8 +962,7 @@ static int add_call_destinations(struct objtool_file *file)
 		if (insn->type != INSN_CALL)
 			continue;
 
-		reloc = find_reloc_by_dest_range(file->elf, insn->sec,
-					       insn->offset, insn->len);
+		reloc = insn_reloc(file, insn);
 		if (!reloc) {
 			dest_off = arch_jump_destination(insn);
 			insn->call_dest = find_call_destination(insn->sec, dest_off);
@@ -1144,8 +1161,7 @@ static int handle_group_alt(struct objtool_file *file,
 		 * alternatives code can adjust the relative offsets
 		 * accordingly.
 		 */
-		alt_reloc = find_reloc_by_dest_range(file->elf, insn->sec,
-						   insn->offset, insn->len);
+		alt_reloc = insn_reloc(file, insn);
 		if (alt_reloc &&
 		    !arch_support_alt_relocation(special_alt, insn, alt_reloc)) {
 
diff --git a/tools/objtool/include/objtool/check.h b/tools/objtool/include/objtool/check.h
index e5528ce..56d50bc 100644
--- a/tools/objtool/include/objtool/check.h
+++ b/tools/objtool/include/objtool/check.h
@@ -56,6 +56,7 @@ struct instruction {
 	struct instruction *jump_dest;
 	struct instruction *first_jump_src;
 	struct reloc *jump_table;
+	struct reloc *reloc;
 	struct list_head alts;
 	struct symbol *func;
 	struct list_head stack_ops;

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [tip: x86/core] objtool: Keep track of retpoline call sites
  2021-03-26 15:12 ` [PATCH v3 13/16] objtool: Keep track of retpoline call sites Peter Zijlstra
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     7e57a6bc5a22145429d3a232619b0637c312397a
Gitweb:        https://git.kernel.org/tip/7e57a6bc5a22145429d3a232619b0637c312397a
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:12 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 01 Apr 2021 13:20:21 +02:00

objtool: Keep track of retpoline call sites

Provide infrastructure for architectures to rewrite/augment compiler
generated retpoline calls. Similar to what we do for static_call()s,
keep track of the instructions that are retpoline calls.

Use the same list_head, since a retpoline call cannot also be a
static_call.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.130805730@infradead.org
---
 tools/objtool/check.c                   | 34 ++++++++++++++++++++----
 tools/objtool/include/objtool/arch.h    |  2 +-
 tools/objtool/include/objtool/check.h   |  2 +-
 tools/objtool/include/objtool/objtool.h |  1 +-
 tools/objtool/objtool.c                 |  1 +-
 5 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 600fa67..77074db 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -451,7 +451,7 @@ static int create_static_call_sections(struct objtool_file *file)
 		return 0;
 
 	idx = 0;
-	list_for_each_entry(insn, &file->static_call_list, static_call_node)
+	list_for_each_entry(insn, &file->static_call_list, call_node)
 		idx++;
 
 	sec = elf_create_section(file->elf, ".static_call_sites", SHF_WRITE,
@@ -460,7 +460,7 @@ static int create_static_call_sections(struct objtool_file *file)
 		return -1;
 
 	idx = 0;
-	list_for_each_entry(insn, &file->static_call_list, static_call_node) {
+	list_for_each_entry(insn, &file->static_call_list, call_node) {
 
 		site = (struct static_call_site *)sec->data->d_buf + idx;
 		memset(site, 0, sizeof(struct static_call_site));
@@ -829,13 +829,16 @@ static int add_jump_destinations(struct objtool_file *file)
 			else
 				insn->type = INSN_JUMP_DYNAMIC_CONDITIONAL;
 
+			list_add_tail(&insn->call_node,
+				      &file->retpoline_call_list);
+
 			insn->retpoline_safe = true;
 			continue;
 		} else if (insn->func) {
 			/* internal or external sibling call (with reloc) */
 			insn->call_dest = reloc->sym;
 			if (insn->call_dest->static_call_tramp) {
-				list_add_tail(&insn->static_call_node,
+				list_add_tail(&insn->call_node,
 					      &file->static_call_list);
 			}
 			continue;
@@ -897,7 +900,7 @@ static int add_jump_destinations(struct objtool_file *file)
 				/* internal sibling call (without reloc) */
 				insn->call_dest = insn->jump_dest->func;
 				if (insn->call_dest->static_call_tramp) {
-					list_add_tail(&insn->static_call_node,
+					list_add_tail(&insn->call_node,
 						      &file->static_call_list);
 				}
 			}
@@ -981,6 +984,9 @@ static int add_call_destinations(struct objtool_file *file)
 			insn->type = INSN_CALL_DYNAMIC;
 			insn->retpoline_safe = true;
 
+			list_add_tail(&insn->call_node,
+				      &file->retpoline_call_list);
+
 			remove_insn_ops(insn);
 			continue;
 
@@ -988,7 +994,7 @@ static int add_call_destinations(struct objtool_file *file)
 			insn->call_dest = reloc->sym;
 
 		if (insn->call_dest && insn->call_dest->static_call_tramp) {
-			list_add_tail(&insn->static_call_node,
+			list_add_tail(&insn->call_node,
 				      &file->static_call_list);
 		}
 
@@ -1714,6 +1720,11 @@ static void mark_rodata(struct objtool_file *file)
 	file->rodata = found;
 }
 
+__weak int arch_rewrite_retpolines(struct objtool_file *file)
+{
+	return 0;
+}
+
 static int decode_sections(struct objtool_file *file)
 {
 	int ret;
@@ -1742,6 +1753,10 @@ static int decode_sections(struct objtool_file *file)
 	if (ret)
 		return ret;
 
+	/*
+	 * Must be before add_special_section_alts() as that depends on
+	 * jump_dest being set.
+	 */
 	ret = add_jump_destinations(file);
 	if (ret)
 		return ret;
@@ -1778,6 +1793,15 @@ static int decode_sections(struct objtool_file *file)
 	if (ret)
 		return ret;
 
+	/*
+	 * Must be after add_special_section_alts(), since this will emit
+	 * alternatives. Must be after add_{jump,call}_destination(), since
+	 * those create the call insn lists.
+	 */
+	ret = arch_rewrite_retpolines(file);
+	if (ret)
+		return ret;
+
 	return 0;
 }
 
diff --git a/tools/objtool/include/objtool/arch.h b/tools/objtool/include/objtool/arch.h
index bb30993..48b540a 100644
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -88,4 +88,6 @@ int arch_decode_hint_reg(struct instruction *insn, u8 sp_reg);
 
 bool arch_is_retpoline(struct symbol *sym);
 
+int arch_rewrite_retpolines(struct objtool_file *file);
+
 #endif /* _ARCH_H */
diff --git a/tools/objtool/include/objtool/check.h b/tools/objtool/include/objtool/check.h
index f5be798..e5528ce 100644
--- a/tools/objtool/include/objtool/check.h
+++ b/tools/objtool/include/objtool/check.h
@@ -39,7 +39,7 @@ struct alt_group {
 struct instruction {
 	struct list_head list;
 	struct hlist_node hash;
-	struct list_head static_call_node;
+	struct list_head call_node;
 	struct list_head mcount_loc_node;
 	struct section *sec;
 	unsigned long offset;
diff --git a/tools/objtool/include/objtool/objtool.h b/tools/objtool/include/objtool/objtool.h
index e68e374..e4084af 100644
--- a/tools/objtool/include/objtool/objtool.h
+++ b/tools/objtool/include/objtool/objtool.h
@@ -18,6 +18,7 @@ struct objtool_file {
 	struct elf *elf;
 	struct list_head insn_list;
 	DECLARE_HASHTABLE(insn_hash, 20);
+	struct list_head retpoline_call_list;
 	struct list_head static_call_list;
 	struct list_head mcount_loc_list;
 	bool ignore_unreachables, c_file, hints, rodata;
diff --git a/tools/objtool/objtool.c b/tools/objtool/objtool.c
index 7b97ce4..3a3ea1b 100644
--- a/tools/objtool/objtool.c
+++ b/tools/objtool/objtool.c
@@ -61,6 +61,7 @@ struct objtool_file *objtool_open_read(const char *_objname)
 
 	INIT_LIST_HEAD(&file.insn_list);
 	hash_init(file.insn_hash);
+	INIT_LIST_HEAD(&file.retpoline_call_list);
 	INIT_LIST_HEAD(&file.static_call_list);
 	INIT_LIST_HEAD(&file.mcount_loc_list);
 	file.c_file = !vmlinux && find_section_by_name(file.elf, ".comment");

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [tip: x86/core] objtool: Add elf_create_undef_symbol()
  2021-03-26 15:12 ` [PATCH v3 12/16] objtool: Add elf_create_undef_symbol() Peter Zijlstra
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     993b477acdb652c6134e5faae05e8a378911cbb3
Gitweb:        https://git.kernel.org/tip/993b477acdb652c6134e5faae05e8a378911cbb3
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:11 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 01 Apr 2021 13:12:48 +02:00

objtool: Add elf_create_undef_symbol()

Allow objtool to create undefined symbols; this allows creating
relocations to symbols not currently in the symbol table.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.064743095@infradead.org
---
 tools/objtool/elf.c                 | 60 +++++++++++++++++++++++++++++
 tools/objtool/include/objtool/elf.h |  1 +
 2 files changed, 61 insertions(+)
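
Note (not part of the commit): a minimal sketch of the idea behind the
helper, assuming a toy in-memory symbol table rather than objtool's
libelf-backed one. The toy_* names are hypothetical; only the field layout
and the constant values follow the ELF specification. It shows that the new
symbol's index is the table's byte length divided by the entry size, and
that an undefined symbol has section index 0.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins (hypothetical names); the values follow the ELF spec. */
enum { TOY_STB_GLOBAL = 1, TOY_STT_NOTYPE = 0, TOY_SHN_UNDEF = 0 };

struct toy_sym {
	uint32_t st_name;	/* offset of the name in the string table */
	uint8_t  st_info;	/* binding in the high nibble, type in the low */
	uint8_t  st_other;
	uint16_t st_shndx;	/* section index; 0 means undefined */
	uint64_t st_value;
	uint64_t st_size;
};

struct toy_symtab {
	struct toy_sym *syms;
	size_t len;		/* table size in bytes, like symtab->len */
};

/* Append an undefined global symbol; return its index, or -1 on failure. */
static int toy_create_undef_symbol(struct toy_symtab *tab, uint32_t name_off)
{
	size_t idx = tab->len / sizeof(struct toy_sym);
	struct toy_sym *grown;

	/* The byte length must always be a whole number of entries. */
	assert(tab->len % sizeof(struct toy_sym) == 0);

	grown = realloc(tab->syms, tab->len + sizeof(struct toy_sym));
	if (!grown)
		return -1;
	tab->syms = grown;

	/* Everything except the name and the binding/type stays zero. */
	memset(&grown[idx], 0, sizeof(grown[idx]));
	grown[idx].st_name = name_off;
	grown[idx].st_info = (TOY_STB_GLOBAL << 4) | (TOY_STT_NOTYPE & 0xf);
	grown[idx].st_shndx = TOY_SHN_UNDEF;

	tab->len += sizeof(struct toy_sym);
	return (int)idx;
}
```

The real helper hands the new entry to libelf via elf_newdata() instead of
growing a flat array; the sketch only mirrors the bookkeeping.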

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 8457218..d08f5f3 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -715,6 +715,66 @@ static int elf_add_string(struct elf *elf, struct section *strtab, char *str)
 	return len;
 }
 
+struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name)
+{
+	struct section *symtab;
+	struct symbol *sym;
+	Elf_Data *data;
+	Elf_Scn *s;
+
+	sym = malloc(sizeof(*sym));
+	if (!sym) {
+		perror("malloc");
+		return NULL;
+	}
+	memset(sym, 0, sizeof(*sym));
+
+	sym->name = strdup(name);
+
+	sym->sym.st_name = elf_add_string(elf, NULL, sym->name);
+	if (sym->sym.st_name == -1)
+		return NULL;
+
+	sym->sym.st_info = GELF_ST_INFO(STB_GLOBAL, STT_NOTYPE);
+	// st_other 0
+	// st_shndx 0
+	// st_value 0
+	// st_size 0
+
+	symtab = find_section_by_name(elf, ".symtab");
+	if (!symtab) {
+		WARN("can't find .symtab");
+		return NULL;
+	}
+
+	s = elf_getscn(elf->elf, symtab->idx);
+	if (!s) {
+		WARN_ELF("elf_getscn");
+		return NULL;
+	}
+
+	data = elf_newdata(s);
+	if (!data) {
+		WARN_ELF("elf_newdata");
+		return NULL;
+	}
+
+	data->d_buf = &sym->sym;
+	data->d_size = sizeof(sym->sym);
+	data->d_align = 1;
+
+	sym->idx = symtab->len / sizeof(sym->sym);
+
+	symtab->len += data->d_size;
+	symtab->changed = true;
+
+	sym->sec = find_section_by_index(elf, 0);
+
+	elf_add_symbol(elf, sym);
+
+	return sym;
+}
+
 struct section *elf_create_section(struct elf *elf, const char *name,
 				   unsigned int sh_flags, size_t entsize, int nr)
 {
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index 463f329..45e5ede 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -133,6 +133,7 @@ int elf_write_insn(struct elf *elf, struct section *sec,
 		   unsigned long offset, unsigned int len,
 		   const char *insn);
 int elf_write_reloc(struct elf *elf, struct reloc *reloc);
+struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name);
 int elf_write(struct elf *elf);
 void elf_close(struct elf *elf);
 

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [tip: x86/core] objtool: Extract elf_strtab_concat()
  2021-03-26 15:12 ` [PATCH v3 10/16] objtool: Extract elf_strtab_concat() Peter Zijlstra
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     557c25be3588971caf21364b6fd240769e37c47c
Gitweb:        https://git.kernel.org/tip/557c25be3588971caf21364b6fd240769e37c47c
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:09 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 01 Apr 2021 13:05:50 +02:00

objtool: Extract elf_strtab_concat()

Create a common helper to append strings to a strtab.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.941474004@infradead.org
---
 tools/objtool/elf.c | 60 +++++++++++++++++++++++++++-----------------
 1 file changed, 38 insertions(+), 22 deletions(-)
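
Note (not part of the commit): a rough sketch of what appending to a string
table amounts to, assuming a flat heap buffer instead of objtool's
libelf-backed section. The toy_* names are hypothetical. The value returned
is the table length before the append, which is the offset later stored in
sh_name or st_name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_strtab {
	char *buf;
	size_t len;	/* bytes used, including every terminating NUL */
};

/* Append str with its NUL; return its starting offset, or -1 on failure. */
static long toy_add_string(struct toy_strtab *tab, const char *str)
{
	size_t size, off = tab->len;
	char *grown;

	assert(str);
	size = strlen(str) + 1;

	grown = realloc(tab->buf, tab->len + size);
	if (!grown)
		return -1;
	tab->buf = grown;

	memcpy(tab->buf + off, str, size);
	tab->len += size;

	return (long)off;
}
```

Returning the old length, rather than the new one, is what lets callers use
the result directly as the name offset.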

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 7b65ae3..c278a04 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -673,13 +673,48 @@ err:
 	return NULL;
 }
 
+static int elf_add_string(struct elf *elf, struct section *strtab, char *str)
+{
+	Elf_Data *data;
+	Elf_Scn *s;
+	int len;
+
+	if (!strtab)
+		strtab = find_section_by_name(elf, ".strtab");
+	if (!strtab) {
+		WARN("can't find .strtab section");
+		return -1;
+	}
+
+	s = elf_getscn(elf->elf, strtab->idx);
+	if (!s) {
+		WARN_ELF("elf_getscn");
+		return -1;
+	}
+
+	data = elf_newdata(s);
+	if (!data) {
+		WARN_ELF("elf_newdata");
+		return -1;
+	}
+
+	data->d_buf = str;
+	data->d_size = strlen(str) + 1;
+	data->d_align = 1;
+
+	len = strtab->len;
+	strtab->len += data->d_size;
+	strtab->changed = true;
+
+	return len;
+}
+
 struct section *elf_create_section(struct elf *elf, const char *name,
 				   unsigned int sh_flags, size_t entsize, int nr)
 {
 	struct section *sec, *shstrtab;
 	size_t size = entsize * nr;
 	Elf_Scn *s;
-	Elf_Data *data;
 
 	sec = malloc(sizeof(*sec));
 	if (!sec) {
@@ -736,7 +771,6 @@ struct section *elf_create_section(struct elf *elf, const char *name,
 	sec->sh.sh_addralign = 1;
 	sec->sh.sh_flags = SHF_ALLOC | sh_flags;
 
-
 	/* Add section name to .shstrtab (or .strtab for Clang) */
 	shstrtab = find_section_by_name(elf, ".shstrtab");
 	if (!shstrtab)
@@ -745,27 +779,9 @@ struct section *elf_create_section(struct elf *elf, const char *name,
 		WARN("can't find .shstrtab or .strtab section");
 		return NULL;
 	}
-
-	s = elf_getscn(elf->elf, shstrtab->idx);
-	if (!s) {
-		WARN_ELF("elf_getscn");
+	sec->sh.sh_name = elf_add_string(elf, shstrtab, sec->name);
+	if (sec->sh.sh_name == -1)
 		return NULL;
-	}
-
-	data = elf_newdata(s);
-	if (!data) {
-		WARN_ELF("elf_newdata");
-		return NULL;
-	}
-
-	data->d_buf = sec->name;
-	data->d_size = strlen(name) + 1;
-	data->d_align = 1;
-
-	sec->sh.sh_name = shstrtab->len;
-
-	shstrtab->len += strlen(name) + 1;
-	shstrtab->changed = true;
 
 	list_add_tail(&sec->list, &elf->sections);
 	elf_hash_add(elf->section_hash, &sec->hash, sec->idx);

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [tip: x86/core] objtool: Extract elf_symbol_add()
  2021-03-26 15:12 ` [PATCH v3 11/16] objtool: Extract elf_symbol_add() Peter Zijlstra
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     d56a3568827ec4b8efcbcfc46fdc944995b6dcf1
Gitweb:        https://git.kernel.org/tip/d56a3568827ec4b8efcbcfc46fdc944995b6dcf1
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:10 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 01 Apr 2021 13:08:52 +02:00

objtool: Extract elf_symbol_add()

Create a common helper to add symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.003468981@infradead.org
---
 tools/objtool/elf.c | 56 ++++++++++++++++++++++++--------------------
 1 file changed, 31 insertions(+), 25 deletions(-)

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index c278a04..8457218 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -290,12 +290,39 @@ static int read_sections(struct elf *elf)
 	return 0;
 }
 
+static void elf_add_symbol(struct elf *elf, struct symbol *sym)
+{
+	struct list_head *entry;
+	struct rb_node *pnode;
+
+	sym->type = GELF_ST_TYPE(sym->sym.st_info);
+	sym->bind = GELF_ST_BIND(sym->sym.st_info);
+
+	sym->offset = sym->sym.st_value;
+	sym->len = sym->sym.st_size;
+
+	rb_add(&sym->node, &sym->sec->symbol_tree, symbol_to_offset);
+	pnode = rb_prev(&sym->node);
+	if (pnode)
+		entry = &rb_entry(pnode, struct symbol, node)->list;
+	else
+		entry = &sym->sec->symbol_list;
+	list_add(&sym->list, entry);
+	elf_hash_add(elf->symbol_hash, &sym->hash, sym->idx);
+	elf_hash_add(elf->symbol_name_hash, &sym->name_hash, str_hash(sym->name));
+
+	/*
+	 * Don't store empty STT_NOTYPE symbols in the rbtree.  They
+	 * can exist within a function, confusing the sorting.
+	 */
+	if (!sym->len)
+		rb_erase(&sym->node, &sym->sec->symbol_tree);
+}
+
 static int read_symbols(struct elf *elf)
 {
 	struct section *symtab, *symtab_shndx, *sec;
 	struct symbol *sym, *pfunc;
-	struct list_head *entry;
-	struct rb_node *pnode;
 	int symbols_nr, i;
 	char *coldstr;
 	Elf_Data *shndx_data = NULL;
@@ -340,9 +367,6 @@ static int read_symbols(struct elf *elf)
 			goto err;
 		}
 
-		sym->type = GELF_ST_TYPE(sym->sym.st_info);
-		sym->bind = GELF_ST_BIND(sym->sym.st_info);
-
 		if ((sym->sym.st_shndx > SHN_UNDEF &&
 		     sym->sym.st_shndx < SHN_LORESERVE) ||
 		    (shndx_data && sym->sym.st_shndx == SHN_XINDEX)) {
@@ -355,32 +379,14 @@ static int read_symbols(struct elf *elf)
 				     sym->name);
 				goto err;
 			}
-			if (sym->type == STT_SECTION) {
+			if (GELF_ST_TYPE(sym->sym.st_info) == STT_SECTION) {
 				sym->name = sym->sec->name;
 				sym->sec->sym = sym;
 			}
 		} else
 			sym->sec = find_section_by_index(elf, 0);
 
-		sym->offset = sym->sym.st_value;
-		sym->len = sym->sym.st_size;
-
-		rb_add(&sym->node, &sym->sec->symbol_tree, symbol_to_offset);
-		pnode = rb_prev(&sym->node);
-		if (pnode)
-			entry = &rb_entry(pnode, struct symbol, node)->list;
-		else
-			entry = &sym->sec->symbol_list;
-		list_add(&sym->list, entry);
-		elf_hash_add(elf->symbol_hash, &sym->hash, sym->idx);
-		elf_hash_add(elf->symbol_name_hash, &sym->name_hash, str_hash(sym->name));
-
-		/*
-		 * Don't store empty STT_NOTYPE symbols in the rbtree.  They
-		 * can exist within a function, confusing the sorting.
-		 */
-		if (!sym->len)
-			rb_erase(&sym->node, &sym->sec->symbol_tree);
+		elf_add_symbol(elf, sym);
 	}
 
 	if (stats)

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [tip: x86/core] objtool: Implicitly create reloc sections
  2021-03-26 15:12 ` [PATCH v3 09/16] objtool: Implicitly create reloc sections Peter Zijlstra
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` [tip: x86/core] objtool: Create reloc sections implicitly tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Josh Poimboeuf, Peter Zijlstra (Intel),
	Borislav Petkov, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     aef0f13e96db08f31be6b96d28e761df46d86ff4
Gitweb:        https://git.kernel.org/tip/aef0f13e96db08f31be6b96d28e761df46d86ff4
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:08 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 01 Apr 2021 13:01:15 +02:00

objtool: Implicitly create reloc sections

Have elf_add_reloc() create the relocation section implicitly.

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.880174448@infradead.org
---
 tools/objtool/check.c               |  6 ------
 tools/objtool/elf.c                 |  9 ++++++++-
 tools/objtool/include/objtool/elf.h |  1 -
 tools/objtool/orc_gen.c             |  2 --
 4 files changed, 8 insertions(+), 10 deletions(-)
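
Note (not part of the commit): the change boils down to lazy creation. A
minimal sketch, assuming toy types with hypothetical toy_* names and a
counter that exists only to make the behaviour observable: the first
relocation added to a section creates its companion relocation section,
later ones reuse it.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_section {
	struct toy_section *reloc;	/* companion reloc section, or NULL */
	int nr_relocs;
};

static int toy_sections_created;	/* illustration only */

static struct toy_section *toy_create_reloc_section(struct toy_section *base)
{
	struct toy_section *rsec = calloc(1, sizeof(*rsec));

	if (!rsec)
		return NULL;

	base->reloc = rsec;
	toy_sections_created++;
	return rsec;
}

/* Add one relocation against sec, creating sec->reloc on first use. */
static int toy_add_reloc(struct toy_section *sec)
{
	assert(sec);

	if (!sec->reloc && !toy_create_reloc_section(sec))
		return -1;

	sec->reloc->nr_relocs++;
	return 0;
}
```

Callers no longer need to remember to create the relocation section up
front, which is why the explicit calls are removed in the diff below.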

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 61fe29a..600fa67 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -459,9 +459,6 @@ static int create_static_call_sections(struct objtool_file *file)
 	if (!sec)
 		return -1;
 
-	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
-		return -1;
-
 	idx = 0;
 	list_for_each_entry(insn, &file->static_call_list, static_call_node) {
 
@@ -547,9 +544,6 @@ static int create_mcount_loc_sections(struct objtool_file *file)
 	if (!sec)
 		return -1;
 
-	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
-		return -1;
-
 	idx = 0;
 	list_for_each_entry(insn, &file->mcount_loc_list, mcount_loc_node) {
 
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 0ab52ac..7b65ae3 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -447,11 +447,18 @@ err:
 	return -1;
 }
 
+static struct section *elf_create_reloc_section(struct elf *elf,
+						struct section *base,
+						int reltype);
+
 int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
 		  unsigned int type, struct symbol *sym, int addend)
 {
 	struct reloc *reloc;
 
+	if (!sec->reloc && !elf_create_reloc_section(elf, sec, SHT_RELA))
+		return -1;
+
 	reloc = malloc(sizeof(*reloc));
 	if (!reloc) {
 		perror("malloc");
@@ -829,7 +836,7 @@ static struct section *elf_create_rela_reloc_section(struct elf *elf, struct sec
 	return sec;
 }
 
-struct section *elf_create_reloc_section(struct elf *elf,
+static struct section *elf_create_reloc_section(struct elf *elf,
 					 struct section *base,
 					 int reltype)
 {
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index 825ad32..463f329 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -122,7 +122,6 @@ static inline u32 reloc_hash(struct reloc *reloc)
 
 struct elf *elf_open_read(const char *name, int flags);
 struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr);
-struct section *elf_create_reloc_section(struct elf *elf, struct section *base, int reltype);
 
 int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
 		  unsigned int type, struct symbol *sym, int addend);
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index 1b57be6..dc9b7dd 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -225,8 +225,6 @@ int orc_create(struct objtool_file *file)
 	sec = elf_create_section(file->elf, ".orc_unwind_ip", 0, sizeof(int), nr);
 	if (!sec)
 		return -1;
-	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
-		return -1;
 
 	/* Write ORC entries to sections: */
 	list_for_each_entry(entry, &orc_list, list) {

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [tip: x86/core] objtool: Add elf_create_reloc() helper
  2021-03-26 15:12 ` [PATCH v3 08/16] objtool: Add elf_create_reloc() helper Peter Zijlstra
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     7508e2958a82675e75e34221c26ad4242d4ef283
Gitweb:        https://git.kernel.org/tip/7508e2958a82675e75e34221c26ad4242d4ef283
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:07 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 01 Apr 2021 12:55:55 +02:00

objtool: Add elf_create_reloc() helper

We have four open-coded instances of adding a relocation. Create a common
helper so that this number does not grow any further.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.817438847@infradead.org
---
 tools/objtool/check.c               | 78 +++++--------------------
 tools/objtool/elf.c                 | 86 ++++++++++++++++++----------
 tools/objtool/include/objtool/elf.h | 10 ++-
 tools/objtool/orc_gen.c             | 30 ++--------
 4 files changed, 85 insertions(+), 119 deletions(-)
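
Note (not part of the commit): a sketch of how a relocation target for an
instruction is chosen, assuming toy symbol tables with hypothetical toy_*
names. With a section symbol the addend is the plain section offset;
without one (the Clang case mentioned in the diff) the containing function
symbol is used, with a retry at offset - 1 for the padding immediately
after a function. The guard against offset 0 is an addition of this sketch.

```c
#include <assert.h>
#include <stddef.h>

struct toy_symbol {
	const char *name;
	unsigned long offset;
	unsigned long len;
};

struct toy_text_section {
	const struct toy_symbol *sec_sym;	/* NULL if stripped */
	const struct toy_symbol *funcs;
	size_t nr_funcs;
};

static const struct toy_symbol *
toy_find_containing(const struct toy_text_section *sec, unsigned long off)
{
	for (size_t i = 0; i < sec->nr_funcs; i++) {
		const struct toy_symbol *s = &sec->funcs[i];

		if (off >= s->offset && off < s->offset + s->len)
			return s;
	}
	return NULL;
}

/* Pick symbol and addend for the instruction at off; -1 if none fits. */
static int toy_resolve(const struct toy_text_section *sec, unsigned long off,
		       const struct toy_symbol **sym, long *addend)
{
	assert(sec && sym && addend);

	if (sec->sec_sym) {
		*sym = sec->sec_sym;
		*addend = (long)off;
		return 0;
	}

	*sym = toy_find_containing(sec, off);
	if (!*sym && off > 0)	/* padding right after a function */
		*sym = toy_find_containing(sec, off - 1);
	if (!*sym)
		return -1;

	*addend = (long)(off - (*sym)->offset);
	return 0;
}
```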

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 1d0415b..61fe29a 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -433,8 +433,7 @@ reachable:
 
 static int create_static_call_sections(struct objtool_file *file)
 {
-	struct section *sec, *reloc_sec;
-	struct reloc *reloc;
+	struct section *sec;
 	struct static_call_site *site;
 	struct instruction *insn;
 	struct symbol *key_sym;
@@ -460,8 +459,7 @@ static int create_static_call_sections(struct objtool_file *file)
 	if (!sec)
 		return -1;
 
-	reloc_sec = elf_create_reloc_section(file->elf, sec, SHT_RELA);
-	if (!reloc_sec)
+	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
 		return -1;
 
 	idx = 0;
@@ -471,25 +469,11 @@ static int create_static_call_sections(struct objtool_file *file)
 		memset(site, 0, sizeof(struct static_call_site));
 
 		/* populate reloc for 'addr' */
-		reloc = malloc(sizeof(*reloc));
-
-		if (!reloc) {
-			perror("malloc");
-			return -1;
-		}
-		memset(reloc, 0, sizeof(*reloc));
-
-		insn_to_reloc_sym_addend(insn->sec, insn->offset, reloc);
-		if (!reloc->sym) {
-			WARN_FUNC("static call tramp: missing containing symbol",
-				  insn->sec, insn->offset);
+		if (elf_add_reloc_to_insn(file->elf, sec,
+					  idx * sizeof(struct static_call_site),
+					  R_X86_64_PC32,
+					  insn->sec, insn->offset))
 			return -1;
-		}
-
-		reloc->type = R_X86_64_PC32;
-		reloc->offset = idx * sizeof(struct static_call_site);
-		reloc->sec = reloc_sec;
-		elf_add_reloc(file->elf, reloc);
 
 		/* find key symbol */
 		key_name = strdup(insn->call_dest->name);
@@ -526,18 +510,11 @@ static int create_static_call_sections(struct objtool_file *file)
 		free(key_name);
 
 		/* populate reloc for 'key' */
-		reloc = malloc(sizeof(*reloc));
-		if (!reloc) {
-			perror("malloc");
+		if (elf_add_reloc(file->elf, sec,
+				  idx * sizeof(struct static_call_site) + 4,
+				  R_X86_64_PC32, key_sym,
+				  is_sibling_call(insn) * STATIC_CALL_SITE_TAIL))
 			return -1;
-		}
-		memset(reloc, 0, sizeof(*reloc));
-		reloc->sym = key_sym;
-		reloc->addend = is_sibling_call(insn) ? STATIC_CALL_SITE_TAIL : 0;
-		reloc->type = R_X86_64_PC32;
-		reloc->offset = idx * sizeof(struct static_call_site) + 4;
-		reloc->sec = reloc_sec;
-		elf_add_reloc(file->elf, reloc);
 
 		idx++;
 	}
@@ -547,8 +524,7 @@ static int create_static_call_sections(struct objtool_file *file)
 
 static int create_mcount_loc_sections(struct objtool_file *file)
 {
-	struct section *sec, *reloc_sec;
-	struct reloc *reloc;
+	struct section *sec;
 	unsigned long *loc;
 	struct instruction *insn;
 	int idx;
@@ -571,8 +547,7 @@ static int create_mcount_loc_sections(struct objtool_file *file)
 	if (!sec)
 		return -1;
 
-	reloc_sec = elf_create_reloc_section(file->elf, sec, SHT_RELA);
-	if (!reloc_sec)
+	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
 		return -1;
 
 	idx = 0;
@@ -581,32 +556,11 @@ static int create_mcount_loc_sections(struct objtool_file *file)
 		loc = (unsigned long *)sec->data->d_buf + idx;
 		memset(loc, 0, sizeof(unsigned long));
 
-		reloc = malloc(sizeof(*reloc));
-		if (!reloc) {
-			perror("malloc");
+		if (elf_add_reloc_to_insn(file->elf, sec,
+					  idx * sizeof(unsigned long),
+					  R_X86_64_64,
+					  insn->sec, insn->offset))
 			return -1;
-		}
-		memset(reloc, 0, sizeof(*reloc));
-
-		if (insn->sec->sym) {
-			reloc->sym = insn->sec->sym;
-			reloc->addend = insn->offset;
-		} else {
-			reloc->sym = find_symbol_containing(insn->sec, insn->offset);
-
-			if (!reloc->sym) {
-				WARN("missing symbol for insn at offset 0x%lx\n",
-				     insn->offset);
-				return -1;
-			}
-
-			reloc->addend = insn->offset - reloc->sym->offset;
-		}
-
-		reloc->type = R_X86_64_64;
-		reloc->offset = idx * sizeof(unsigned long);
-		reloc->sec = reloc_sec;
-		elf_add_reloc(file->elf, reloc);
 
 		idx++;
 	}
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 374813e..0ab52ac 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -211,32 +211,6 @@ struct reloc *find_reloc_by_dest(const struct elf *elf, struct section *sec, uns
 	return find_reloc_by_dest_range(elf, sec, offset, 1);
 }
 
-void insn_to_reloc_sym_addend(struct section *sec, unsigned long offset,
-			      struct reloc *reloc)
-{
-	if (sec->sym) {
-		reloc->sym = sec->sym;
-		reloc->addend = offset;
-		return;
-	}
-
-	/*
-	 * The Clang assembler strips section symbols, so we have to reference
-	 * the function symbol instead:
-	 */
-	reloc->sym = find_symbol_containing(sec, offset);
-	if (!reloc->sym) {
-		/*
-		 * Hack alert.  This happens when we need to reference the NOP
-		 * pad insn immediately after the function.
-		 */
-		reloc->sym = find_symbol_containing(sec, offset - 1);
-	}
-
-	if (reloc->sym)
-		reloc->addend = offset - reloc->sym->offset;
-}
-
 static int read_sections(struct elf *elf)
 {
 	Elf_Scn *s = NULL;
@@ -473,14 +447,66 @@ err:
 	return -1;
 }
 
-void elf_add_reloc(struct elf *elf, struct reloc *reloc)
+int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
+		  unsigned int type, struct symbol *sym, int addend)
 {
-	struct section *sec = reloc->sec;
+	struct reloc *reloc;
+
+	reloc = malloc(sizeof(*reloc));
+	if (!reloc) {
+		perror("malloc");
+		return -1;
+	}
+	memset(reloc, 0, sizeof(*reloc));
 
-	list_add_tail(&reloc->list, &sec->reloc_list);
+	reloc->sec = sec->reloc;
+	reloc->offset = offset;
+	reloc->type = type;
+	reloc->sym = sym;
+	reloc->addend = addend;
+
+	list_add_tail(&reloc->list, &sec->reloc->reloc_list);
 	elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc));
 
-	sec->changed = true;
+	sec->reloc->changed = true;
+
+	return 0;
+}
+
+int elf_add_reloc_to_insn(struct elf *elf, struct section *sec,
+			  unsigned long offset, unsigned int type,
+			  struct section *insn_sec, unsigned long insn_off)
+{
+	struct symbol *sym;
+	int addend;
+
+	if (insn_sec->sym) {
+		sym = insn_sec->sym;
+		addend = insn_off;
+
+	} else {
+		/*
+		 * The Clang assembler strips section symbols, so we have to
+		 * reference the function symbol instead:
+		 */
+		sym = find_symbol_containing(insn_sec, insn_off);
+		if (!sym) {
+			/*
+			 * Hack alert.  This happens when we need to reference
+			 * the NOP pad insn immediately after the function.
+			 */
+			sym = find_symbol_containing(insn_sec, insn_off - 1);
+		}
+
+		if (!sym) {
+			WARN("can't find symbol containing %s+0x%lx", insn_sec->name, insn_off);
+			return -1;
+		}
+
+		addend = insn_off - sym->offset;
+	}
+
+	return elf_add_reloc(elf, sec, offset, type, sym, addend);
 }
 
 static int read_rel_reloc(struct section *sec, int i, struct reloc *reloc, unsigned int *symndx)
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index fc576ed..825ad32 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -123,7 +123,13 @@ static inline u32 reloc_hash(struct reloc *reloc)
 struct elf *elf_open_read(const char *name, int flags);
 struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr);
 struct section *elf_create_reloc_section(struct elf *elf, struct section *base, int reltype);
-void elf_add_reloc(struct elf *elf, struct reloc *reloc);
+
+int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
+		  unsigned int type, struct symbol *sym, int addend);
+int elf_add_reloc_to_insn(struct elf *elf, struct section *sec,
+			  unsigned long offset, unsigned int type,
+			  struct section *insn_sec, unsigned long insn_off);
+
 int elf_write_insn(struct elf *elf, struct section *sec,
 		   unsigned long offset, unsigned int len,
 		   const char *insn);
@@ -140,8 +146,6 @@ struct reloc *find_reloc_by_dest(const struct elf *elf, struct section *sec, uns
 struct reloc *find_reloc_by_dest_range(const struct elf *elf, struct section *sec,
 				     unsigned long offset, unsigned int len);
 struct symbol *find_func_containing(struct section *sec, unsigned long offset);
-void insn_to_reloc_sym_addend(struct section *sec, unsigned long offset,
-			      struct reloc *reloc);
 
 #define for_each_sec(file, sec)						\
 	list_for_each_entry(sec, &file->elf->sections, list)
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index f534708..1b57be6 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -82,12 +82,11 @@ static int init_orc_entry(struct orc_entry *orc, struct cfi_state *cfi)
 }
 
 static int write_orc_entry(struct elf *elf, struct section *orc_sec,
-			   struct section *ip_rsec, unsigned int idx,
+			   struct section *ip_sec, unsigned int idx,
 			   struct section *insn_sec, unsigned long insn_off,
 			   struct orc_entry *o)
 {
 	struct orc_entry *orc;
-	struct reloc *reloc;
 
 	/* populate ORC data */
 	orc = (struct orc_entry *)orc_sec->data->d_buf + idx;
@@ -96,25 +95,9 @@ static int write_orc_entry(struct elf *elf, struct section *orc_sec,
 	orc->bp_offset = bswap_if_needed(orc->bp_offset);
 
 	/* populate reloc for ip */
-	reloc = malloc(sizeof(*reloc));
-	if (!reloc) {
-		perror("malloc");
+	if (elf_add_reloc_to_insn(elf, ip_sec, idx * sizeof(int), R_X86_64_PC32,
+				  insn_sec, insn_off))
 		return -1;
-	}
-	memset(reloc, 0, sizeof(*reloc));
-
-	insn_to_reloc_sym_addend(insn_sec, insn_off, reloc);
-	if (!reloc->sym) {
-		WARN("missing symbol for insn at offset 0x%lx",
-		     insn_off);
-		return -1;
-	}
-
-	reloc->type = R_X86_64_PC32;
-	reloc->offset = idx * sizeof(int);
-	reloc->sec = ip_rsec;
-
-	elf_add_reloc(elf, reloc);
 
 	return 0;
 }
@@ -153,7 +136,7 @@ static unsigned long alt_group_len(struct alt_group *alt_group)
 
 int orc_create(struct objtool_file *file)
 {
-	struct section *sec, *ip_rsec, *orc_sec;
+	struct section *sec, *orc_sec;
 	unsigned int nr = 0, idx = 0;
 	struct orc_list_entry *entry;
 	struct list_head orc_list;
@@ -242,13 +225,12 @@ int orc_create(struct objtool_file *file)
 	sec = elf_create_section(file->elf, ".orc_unwind_ip", 0, sizeof(int), nr);
 	if (!sec)
 		return -1;
-	ip_rsec = elf_create_reloc_section(file->elf, sec, SHT_RELA);
-	if (!ip_rsec)
+	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
 		return -1;
 
 	/* Write ORC entries to sections: */
 	list_for_each_entry(entry, &orc_list, list) {
-		if (write_orc_entry(file->elf, orc_sec, ip_rsec, idx++,
+		if (write_orc_entry(file->elf, orc_sec, sec, idx++,
 				    entry->insn_sec, entry->insn_off,
 				    &entry->orc))
 			return -1;

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [tip: x86/core] objtool: Rework rebuild_reloc logic
  2021-03-26 15:12 ` [PATCH v3 07/16] objtool: Rework rebuild_reloc logic Peter Zijlstra
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` [tip: x86/core] objtool: Rework the elf_rebuild_reloc_section() logic tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     98ce4d014ad4c1c4afcc427fc3f0002674315cb9
Gitweb:        https://git.kernel.org/tip/98ce4d014ad4c1c4afcc427fc3f0002674315cb9
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:06 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 01 Apr 2021 12:51:35 +02:00

objtool: Rework rebuild_reloc logic

Instead of manually calling elf_rebuild_reloc_section() on sections
we've called elf_add_reloc() on, have elf_write() do the right thing
and rebuild every changed relocation section itself.

This makes it easier to add relocations anywhere without carefully
tracking when we're done and which section needs to be flushed.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.754213408@infradead.org
---
 tools/objtool/check.c               |  6 ------
 tools/objtool/elf.c                 | 20 ++++++++++++++------
 tools/objtool/include/objtool/elf.h |  1 -
 tools/objtool/orc_gen.c             |  3 ---
 4 files changed, 14 insertions(+), 16 deletions(-)
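
Note (not part of the commit): the rework is a dirty-flag pattern. A minimal
sketch, assuming toy types with hypothetical toy_* names: adding a
relocation only marks the section as changed, and the single write step
rebuilds whatever is marked, so callers never flush by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_rsec {
	bool changed;
	int pending;	/* relocations added since the last rebuild */
	int rebuilt;	/* how many times this section was rebuilt */
};

/* Adding a relocation only records that the section is now dirty. */
static void toy_queue_reloc(struct toy_rsec *sec)
{
	sec->pending++;
	sec->changed = true;
}

/* Rebuild every changed section; return how many were rebuilt. */
static int toy_write(struct toy_rsec *secs, size_t nr)
{
	int flushed = 0;

	assert(secs || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		if (!secs[i].changed)
			continue;

		secs[i].pending = 0;
		secs[i].rebuilt++;
		secs[i].changed = false;
		flushed++;
	}

	return flushed;
}
```

A second write with nothing queued in between rebuilds nothing, which is
the property that makes the explicit rebuild calls unnecessary.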

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 8618d03..1d0415b 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -542,9 +542,6 @@ static int create_static_call_sections(struct objtool_file *file)
 		idx++;
 	}
 
-	if (elf_rebuild_reloc_section(file->elf, reloc_sec))
-		return -1;
-
 	return 0;
 }
 
@@ -614,9 +611,6 @@ static int create_mcount_loc_sections(struct objtool_file *file)
 		idx++;
 	}
 
-	if (elf_rebuild_reloc_section(file->elf, reloc_sec))
-		return -1;
-
 	return 0;
 }
 
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 93fa833..374813e 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -479,6 +479,8 @@ void elf_add_reloc(struct elf *elf, struct reloc *reloc)
 
 	list_add_tail(&reloc->list, &sec->reloc_list);
 	elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc));
+
+	sec->changed = true;
 }
 
 static int read_rel_reloc(struct section *sec, int i, struct reloc *reloc, unsigned int *symndx)
@@ -558,7 +560,9 @@ static int read_relocs(struct elf *elf)
 				return -1;
 			}
 
-			elf_add_reloc(elf, reloc);
+			list_add_tail(&reloc->list, &sec->reloc_list);
+			elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc));
+
 			nr_reloc++;
 		}
 		max_reloc = max(max_reloc, nr_reloc);
@@ -873,14 +877,11 @@ static int elf_rebuild_rela_reloc_section(struct section *sec, int nr)
 	return 0;
 }
 
-int elf_rebuild_reloc_section(struct elf *elf, struct section *sec)
+static int elf_rebuild_reloc_section(struct elf *elf, struct section *sec)
 {
 	struct reloc *reloc;
 	int nr;
 
-	sec->changed = true;
-	elf->changed = true;
-
 	nr = 0;
 	list_for_each_entry(reloc, &sec->reloc_list, list)
 		nr++;
@@ -944,9 +945,15 @@ int elf_write(struct elf *elf)
 	struct section *sec;
 	Elf_Scn *s;
 
-	/* Update section headers for changed sections: */
+	/* Update changed relocation sections and section headers: */
 	list_for_each_entry(sec, &elf->sections, list) {
 		if (sec->changed) {
+			if (sec->base &&
+			    elf_rebuild_reloc_section(elf, sec)) {
+				WARN("elf_rebuild_reloc_section");
+				return -1;
+			}
+
 			s = elf_getscn(elf->elf, sec->idx);
 			if (!s) {
 				WARN_ELF("elf_getscn");
@@ -958,6 +965,7 @@ int elf_write(struct elf *elf)
 			}
 
 			sec->changed = false;
+			elf->changed = true;
 		}
 	}
 
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index e6890cc..fc576ed 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -142,7 +142,6 @@ struct reloc *find_reloc_by_dest_range(const struct elf *elf, struct section *se
 struct symbol *find_func_containing(struct section *sec, unsigned long offset);
 void insn_to_reloc_sym_addend(struct section *sec, unsigned long offset,
 			      struct reloc *reloc);
-int elf_rebuild_reloc_section(struct elf *elf, struct section *sec);
 
 #define for_each_sec(file, sec)						\
 	list_for_each_entry(sec, &file->elf->sections, list)
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index 738aa50..f534708 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -254,8 +254,5 @@ int orc_create(struct objtool_file *file)
 			return -1;
 	}
 
-	if (elf_rebuild_reloc_section(file->elf, ip_rsec))
-		return -1;
-
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 82+ messages in thread
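
The deferred rebuild described in "objtool: Rework rebuild_reloc logic" above
can be sketched as a dirty-flag pattern: adding a relocation only marks its
section as changed, and a single write-out pass rebuilds whatever is marked.
This is a minimal illustration, not objtool's code; every sketch_* type and
function below is hypothetical and greatly simplified.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for objtool's section and elf types. */
struct sketch_section {
	int nr_relocs;   /* relocations queued in memory */
	int nr_written;  /* relocations serialized so far */
	bool changed;    /* queued state differs from serialized state */
};

struct sketch_elf {
	struct sketch_section *secs;
	size_t nr_secs;
	int rebuilds;    /* how many section rebuilds have happened */
};

/* Adding a relocation only queues it and marks the section dirty. */
static void sketch_add_reloc(struct sketch_section *sec)
{
	sec->nr_relocs++;
	sec->changed = true;
}

/* The single write-out pass rebuilds every dirty section exactly once. */
static int sketch_write(struct sketch_elf *elf)
{
	size_t i;

	for (i = 0; i < elf->nr_secs; i++) {
		struct sketch_section *sec = &elf->secs[i];

		if (!sec->changed)
			continue;

		sec->nr_written = sec->nr_relocs; /* stands in for the rebuild */
		sec->changed = false;
		elf->rebuilds++;
	}

	return 0;
}
```

With this shape, callers can queue relocations in any order and never need to
know when a section is "done"; an untouched section costs nothing at write time.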

* [tip: x86/core] objtool: Handle per arch retpoline naming
  2021-03-26 15:12 ` [PATCH v3 05/16] objtool: Per arch retpoline naming Peter Zijlstra
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     3b652980a250c1ed9e0c361750f029781831cdc3
Gitweb:        https://git.kernel.org/tip/3b652980a250c1ed9e0c361750f029781831cdc3
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:04 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 01 Apr 2021 11:36:52 +02:00

objtool: Handle per arch retpoline naming

The __x86_indirect_ naming is obviously not generic. Shorten to allow
matching some additional magic names later.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.630296706@infradead.org
---
 tools/objtool/arch/x86/decode.c      |  5 +++++
 tools/objtool/check.c                |  9 +++++++--
 tools/objtool/include/objtool/arch.h |  2 ++
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 8380d0b..e5fa3a5 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -645,3 +645,8 @@ int arch_decode_hint_reg(struct instruction *insn, u8 sp_reg)
 
 	return 0;
 }
+
+bool arch_is_retpoline(struct symbol *sym)
+{
+	return !strncmp(sym->name, "__x86_indirect_", 15);
+}
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 519af4b..6fbc001 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -850,6 +850,11 @@ static int add_ignore_alternatives(struct objtool_file *file)
 	return 0;
 }
 
+__weak bool arch_is_retpoline(struct symbol *sym)
+{
+	return false;
+}
+
 /*
  * Find the destination instructions for all jumps.
  */
@@ -872,7 +877,7 @@ static int add_jump_destinations(struct objtool_file *file)
 		} else if (reloc->sym->type == STT_SECTION) {
 			dest_sec = reloc->sym->sec;
 			dest_off = arch_dest_reloc_offset(reloc->addend);
-		} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21)) {
+		} else if (arch_is_retpoline(reloc->sym)) {
 			/*
 			 * Retpoline jumps are really dynamic jumps in
 			 * disguise, so convert them accordingly.
@@ -1026,7 +1031,7 @@ static int add_call_destinations(struct objtool_file *file)
 				return -1;
 			}
 
-		} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21)) {
+		} else if (arch_is_retpoline(reloc->sym)) {
 			/*
 			 * Retpoline calls are really dynamic calls in
 			 * disguise, so convert them accordingly.
diff --git a/tools/objtool/include/objtool/arch.h b/tools/objtool/include/objtool/arch.h
index 6ff0685..bb30993 100644
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -86,4 +86,6 @@ const char *arch_nop_insn(int len);
 
 int arch_decode_hint_reg(struct instruction *insn, u8 sp_reg);
 
+bool arch_is_retpoline(struct symbol *sym);
+
 #endif /* _ARCH_H */

^ permalink raw reply related	[flat|nested] 82+ messages in thread
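
As a rough illustration of what shortening the prefix in "objtool: Handle per
arch retpoline naming" buys, the sketch below compares the old 21-character
match with the new 15-character one. The function names are made up for this
example, and "__x86_indirect_example_rax" is a hypothetical symbol standing in
for the "additional magic names" the commit message alludes to.

```c
#include <stdbool.h>
#include <string.h>

/* Old check: only names starting with the full thunk prefix match. */
static bool matches_thunk_prefix(const char *name)
{
	return !strncmp(name, "__x86_indirect_thunk_", 21);
}

/* New check: the shorter prefix also admits other __x86_indirect_* names. */
static bool matches_indirect_prefix(const char *name)
{
	return !strncmp(name, "__x86_indirect_", 15);
}
```

In the patch itself the x86 matcher lives in arch code and overrides a __weak
default that returns false, so architectures without retpolines need no
changes. The sketch leaves that linkage detail out because a weak and a strong
definition cannot share one translation unit.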

* [tip: x86/core] objtool: Fix static_call list generation
  2021-03-26 15:12 ` [PATCH v3 06/16] objtool: Fix static_call list generation Peter Zijlstra
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     b62b63571e4be0ce31984ce83b04853f2cba678b
Gitweb:        https://git.kernel.org/tip/b62b63571e4be0ce31984ce83b04853f2cba678b
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:05 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 01 Apr 2021 11:43:16 +02:00

objtool: Fix static_call list generation

Currently, objtool generates tail call entries in add_jump_destinations()
but waits until validate_branch() to generate the regular call entries.
Move these to add_call_destinations() for consistency.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.691529901@infradead.org
---
 tools/objtool/check.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 6fbc001..8618d03 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1045,6 +1045,11 @@ static int add_call_destinations(struct objtool_file *file)
 		} else
 			insn->call_dest = reloc->sym;
 
+		if (insn->call_dest && insn->call_dest->static_call_tramp) {
+			list_add_tail(&insn->static_call_node,
+				      &file->static_call_list);
+		}
+
 		/*
 		 * Many compilers cannot disable KCOV with a function attribute
 		 * so they need a little help, NOP out any KCOV calls from noinstr
@@ -1788,6 +1793,9 @@ static int decode_sections(struct objtool_file *file)
 	if (ret)
 		return ret;
 
+	/*
+	 * Must be before add_{jump_call}_destination.
+	 */
 	ret = read_static_call_tramps(file);
 	if (ret)
 		return ret;
@@ -1800,6 +1808,10 @@ static int decode_sections(struct objtool_file *file)
 	if (ret)
 		return ret;
 
+	/*
+	 * Must be before add_call_destination(); it changes INSN_CALL to
+	 * INSN_JUMP.
+	 */
 	ret = read_intra_function_calls(file);
 	if (ret)
 		return ret;
@@ -2762,11 +2774,6 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
 			if (dead_end_function(file, insn->call_dest))
 				return 0;
 
-			if (insn->type == INSN_CALL && insn->call_dest->static_call_tramp) {
-				list_add_tail(&insn->static_call_node,
-					      &file->static_call_list);
-			}
-
 			break;
 
 		case INSN_JUMP_CONDITIONAL:

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [tip: x86/core] x86/retpoline: Simplify retpolines
  2021-03-26 15:12 ` [PATCH v3 03/16] x86/retpoline: Simplify retpolines Peter Zijlstra
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     2077915516ebb06d36e03cb542ccb833a8b0a3eb
Gitweb:        https://git.kernel.org/tip/2077915516ebb06d36e03cb542ccb833a8b0a3eb
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:02 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 31 Mar 2021 22:31:57 +02:00

x86/retpoline: Simplify retpolines

Due to

  c9c324dc22aa ("objtool: Support stack layout changes in alternatives")

it is possible to simplify the retpolines.

Currently our retpolines consist of 2 symbols:

 - __x86_indirect_thunk_\reg: the compiler target
 - __x86_retpoline_\reg:  the actual retpoline.

Both are consecutive in code and aligned such that for any one register
they both live in the same cacheline:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop

  0000000000000005 <__x86_retpoline_rax>:
   5:   e8 07 00 00 00          callq  11 <__x86_retpoline_rax+0xc>
   a:   f3 90                   pause
   c:   0f ae e8                lfence
   f:   eb f9                   jmp    a <__x86_retpoline_rax+0x5>
  11:   48 89 04 24             mov    %rax,(%rsp)
  15:   c3                      retq
  16:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)

The thunk is an alternative_2, where one option is a jmp to the
retpoline. This was done so that objtool didn't need to deal with
alternatives with stack ops. But that problem has been solved, so now
it is possible to fold the entire retpoline into the alternative to
simplify and consolidate unused bytes:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop
   5:   90                      nop
   6:   90                      nop
   7:   90                      nop
   8:   90                      nop
   9:   90                      nop
   a:   90                      nop
   b:   90                      nop
   c:   90                      nop
   d:   90                      nop
   e:   90                      nop
   f:   90                      nop
  10:   90                      nop
  11:   66 66 2e 0f 1f 84 00 00 00 00 00        data16 nopw %cs:0x0(%rax,%rax,1)
  1c:   0f 1f 40 00             nopl   0x0(%rax)

Notice that since the longest alternative sequence is now:

   0:   e8 07 00 00 00          callq  c <.altinstr_replacement+0xc>
   5:   f3 90                   pause
   7:   0f ae e8                lfence
   a:   eb f9                   jmp    5 <.altinstr_replacement+0x5>
   c:   48 89 04 24             mov    %rax,(%rsp)
  10:   c3                      retq

17 bytes, we have 15 bytes NOP at the end of our 32 byte slot. (IOW, if
we can shrink the retpoline by 1 byte we can pack it more densely).

 [ bp: Massage commit message. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210326151259.506071949@infradead.org
---
 arch/x86/include/asm/asm-prototypes.h |  7 +-----
 arch/x86/include/asm/nospec-branch.h  |  6 ++---
 arch/x86/lib/retpoline.S              | 34 +++++++++++++-------------
 tools/objtool/check.c                 |  3 +--
 4 files changed, 21 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h
index 51e2bf2..0545b07 100644
--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -22,15 +22,8 @@ extern void cmpxchg8b_emu(void);
 #define DECL_INDIRECT_THUNK(reg) \
 	extern asmlinkage void __x86_indirect_thunk_ ## reg (void);
 
-#define DECL_RETPOLINE(reg) \
-	extern asmlinkage void __x86_retpoline_ ## reg (void);
-
 #undef GEN
 #define GEN(reg) DECL_INDIRECT_THUNK(reg)
 #include <asm/GEN-for-each-reg.h>
 
-#undef GEN
-#define GEN(reg) DECL_RETPOLINE(reg)
-#include <asm/GEN-for-each-reg.h>
-
 #endif /* CONFIG_RETPOLINE */
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 529f8e9..664be73 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -80,7 +80,7 @@
 .macro JMP_NOSPEC reg:req
 #ifdef CONFIG_RETPOLINE
 	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \
-		      __stringify(jmp __x86_retpoline_\reg), X86_FEATURE_RETPOLINE, \
+		      __stringify(jmp __x86_indirect_thunk_\reg), X86_FEATURE_RETPOLINE, \
 		      __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), X86_FEATURE_RETPOLINE_AMD
 #else
 	jmp	*%\reg
@@ -90,7 +90,7 @@
 .macro CALL_NOSPEC reg:req
 #ifdef CONFIG_RETPOLINE
 	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg), \
-		      __stringify(call __x86_retpoline_\reg), X86_FEATURE_RETPOLINE, \
+		      __stringify(call __x86_indirect_thunk_\reg), X86_FEATURE_RETPOLINE, \
 		      __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *%\reg), X86_FEATURE_RETPOLINE_AMD
 #else
 	call	*%\reg
@@ -128,7 +128,7 @@
 	ALTERNATIVE_2(						\
 	ANNOTATE_RETPOLINE_SAFE					\
 	"call *%[thunk_target]\n",				\
-	"call __x86_retpoline_%V[thunk_target]\n",		\
+	"call __x86_indirect_thunk_%V[thunk_target]\n",		\
 	X86_FEATURE_RETPOLINE,					\
 	"lfence;\n"						\
 	ANNOTATE_RETPOLINE_SAFE					\
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 6bb74b5..d2c0d14 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -10,27 +10,31 @@
 #include <asm/unwind_hints.h>
 #include <asm/frame.h>
 
-.macro THUNK reg
-	.section .text.__x86.indirect_thunk
-
-	.align 32
-SYM_FUNC_START(__x86_indirect_thunk_\reg)
-	JMP_NOSPEC \reg
-SYM_FUNC_END(__x86_indirect_thunk_\reg)
-
-SYM_FUNC_START_NOALIGN(__x86_retpoline_\reg)
+.macro RETPOLINE reg
 	ANNOTATE_INTRA_FUNCTION_CALL
-	call	.Ldo_rop_\@
+	call    .Ldo_rop_\@
 .Lspec_trap_\@:
 	UNWIND_HINT_EMPTY
 	pause
 	lfence
-	jmp	.Lspec_trap_\@
+	jmp .Lspec_trap_\@
 .Ldo_rop_\@:
-	mov	%\reg, (%_ASM_SP)
+	mov     %\reg, (%_ASM_SP)
 	UNWIND_HINT_FUNC
 	ret
-SYM_FUNC_END(__x86_retpoline_\reg)
+.endm
+
+.macro THUNK reg
+	.section .text.__x86.indirect_thunk
+
+	.align 32
+SYM_FUNC_START(__x86_indirect_thunk_\reg)
+
+	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \
+		      __stringify(RETPOLINE \reg), X86_FEATURE_RETPOLINE, \
+		      __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), X86_FEATURE_RETPOLINE_AMD
+
+SYM_FUNC_END(__x86_indirect_thunk_\reg)
 
 .endm
 
@@ -48,7 +52,6 @@ SYM_FUNC_END(__x86_retpoline_\reg)
 
 #define __EXPORT_THUNK(sym)	_ASM_NOKPROBE(sym); EXPORT_SYMBOL(sym)
 #define EXPORT_THUNK(reg)	__EXPORT_THUNK(__x86_indirect_thunk_ ## reg)
-#define EXPORT_RETPOLINE(reg)  __EXPORT_THUNK(__x86_retpoline_ ## reg)
 
 #undef GEN
 #define GEN(reg) THUNK reg
@@ -58,6 +61,3 @@ SYM_FUNC_END(__x86_retpoline_\reg)
 #define GEN(reg) EXPORT_THUNK(reg)
 #include <asm/GEN-for-each-reg.h>
 
-#undef GEN
-#define GEN(reg) EXPORT_RETPOLINE(reg)
-#include <asm/GEN-for-each-reg.h>
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 5e5388a..d45f018 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -872,8 +872,7 @@ static int add_jump_destinations(struct objtool_file *file)
 		} else if (reloc->sym->type == STT_SECTION) {
 			dest_sec = reloc->sym->sec;
 			dest_off = arch_dest_reloc_offset(reloc->addend);
-		} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21) ||
-			   !strncmp(reloc->sym->name, "__x86_retpoline_", 16)) {
+		} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21)) {
 			/*
 			 * Retpoline jumps are really dynamic jumps in
 			 * disguise, so convert them accordingly.

^ permalink raw reply related	[flat|nested] 82+ messages in thread
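
The slot arithmetic in "x86/retpoline: Simplify retpolines" above (17 bytes of
code, 15 bytes of NOP, 32-byte slot) can be checked with a small round-up
helper. The helper names are invented for this sketch. The 16-byte case assumes
that "pack it more densely" means dropping to 16-byte alignment, which the
commit message does not spell out.

```c
#include <stddef.h>

/* Round size up to the next multiple of align (align: power of two). */
static size_t slot_size(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}

/* Number of padding bytes needed to fill the slot. */
static size_t pad_bytes(size_t size, size_t align)
{
	return slot_size(size, align) - size;
}
```

Under that assumption, a 17-byte sequence needs a 32-byte slot at either
alignment, while a 16-byte sequence would fit a 16-byte slot with no padding.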

* [tip: x86/core] objtool: Correctly handle retpoline thunk calls
  2021-03-26 15:12 ` [PATCH v3 04/16] objtool: Correctly handle retpoline thunk calls Peter Zijlstra
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     db9d1dd670d7f3f146c654f289f20968af6a12de
Gitweb:        https://git.kernel.org/tip/db9d1dd670d7f3f146c654f289f20968af6a12de
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:03 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 01 Apr 2021 11:34:01 +02:00

objtool: Correctly handle retpoline thunk calls

Just like JMP handling, convert a direct CALL to a retpoline thunk
into a retpoline safe indirect CALL.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.567568238@infradead.org
---
 tools/objtool/check.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index d45f018..519af4b 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1025,6 +1025,18 @@ static int add_call_destinations(struct objtool_file *file)
 					  dest_off);
 				return -1;
 			}
+
+		} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21)) {
+			/*
+			 * Retpoline calls are really dynamic calls in
+			 * disguise, so convert them accordingly.
+			 */
+			insn->type = INSN_CALL_DYNAMIC;
+			insn->retpoline_safe = true;
+
+			remove_insn_ops(insn);
+			continue;
+
 		} else
 			insn->call_dest = reloc->sym;
 

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [tip: x86/core] x86/alternatives: Optimize optimize_nops()
  2021-03-26 15:12 ` [PATCH v3 02/16] x86/alternatives: Optimize optimize_nops() Peter Zijlstra
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  2021-04-03 11:11   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     b4da5166b084f3fac01d68e0e67cbf3bf78a3e12
Gitweb:        https://git.kernel.org/tip/b4da5166b084f3fac01d68e0e67cbf3bf78a3e12
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:01 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 31 Mar 2021 20:30:04 +02:00

x86/alternatives: Optimize optimize_nops()

Currently, optimize_nops() scans to see if the alternative starts with
NOPs. However, the emit pattern is:

  141:	\oldinstr
  142:	.skip (len-(142b-141b)), 0x90

That is, when oldinstr is short, the tail is padded with NOPs. This case
never gets optimized.

Rewrite optimize_nops() to replace any trailing string of NOPs inside
the alternative with larger NOPs. Also run it irrespective of patching,
replacing NOPs in both the original and replaced code.

A direct consequence is that padlen becomes superfluous, so remove it.

 [ bp:
   - Adjust commit message
   - remove a stale comment about needing to pad
   - add a comment in optimize_nops()
   - exit early if the NOP verif. loop catches a mismatch - function
     should not add NOPs in that case
   - fix the "optimized NOPs" offsets output ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210326151259.442992235@infradead.org
---
 arch/x86/include/asm/alternative.h            | 17 +-----
 arch/x86/kernel/alternative.c                 | 49 +++++++++++-------
 tools/objtool/arch/x86/include/arch/special.h |  2 +-
 3 files changed, 37 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 17b3609..a3c2315 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -65,7 +65,6 @@ struct alt_instr {
 	u16 cpuid;		/* cpuid bit set for replacement */
 	u8  instrlen;		/* length of original instruction */
 	u8  replacementlen;	/* length of new instruction */
-	u8  padlen;		/* length of build-time padding */
 } __packed;
 
 /*
@@ -104,7 +103,6 @@ static inline int alternatives_text_reserved(void *start, void *end)
 
 #define alt_end_marker		"663"
 #define alt_slen		"662b-661b"
-#define alt_pad_len		alt_end_marker"b-662b"
 #define alt_total_slen		alt_end_marker"b-661b"
 #define alt_rlen(num)		e_replacement(num)"f-"b_replacement(num)"f"
 
@@ -151,8 +149,7 @@ static inline int alternatives_text_reserved(void *start, void *end)
 	" .long " b_replacement(num)"f - .\n"		/* new instruction */ \
 	" .word " __stringify(feature) "\n"		/* feature bit     */ \
 	" .byte " alt_total_slen "\n"			/* source len      */ \
-	" .byte " alt_rlen(num) "\n"			/* replacement len */ \
-	" .byte " alt_pad_len "\n"			/* pad len */
+	" .byte " alt_rlen(num) "\n"			/* replacement len */
 
 #define ALTINSTR_REPLACEMENT(newinstr, num)		/* replacement */	\
 	"# ALT: replacement " #num "\n"						\
@@ -224,9 +221,6 @@ static inline int alternatives_text_reserved(void *start, void *end)
  * Peculiarities:
  * No memory clobber here.
  * Argument numbers start with 1.
- * Best is to use constraints that are fixed size (like (%1) ... "r")
- * If you use variable sized constraints like "m" or "g" in the
- * replacement make sure to pad to the worst case length.
  * Leaving an unused argument 0 to keep API compatibility.
  */
 #define alternative_input(oldinstr, newinstr, feature, input...)	\
@@ -315,13 +309,12 @@ static inline int alternatives_text_reserved(void *start, void *end)
  * enough information for the alternatives patching code to patch an
  * instruction. See apply_alternatives().
  */
-.macro altinstruction_entry orig alt feature orig_len alt_len pad_len
+.macro altinstruction_entry orig alt feature orig_len alt_len
 	.long \orig - .
 	.long \alt - .
 	.word \feature
 	.byte \orig_len
 	.byte \alt_len
-	.byte \pad_len
 .endm
 
 /*
@@ -338,7 +331,7 @@ static inline int alternatives_text_reserved(void *start, void *end)
 142:
 
 	.pushsection .altinstructions,"a"
-	altinstruction_entry 140b,143f,\feature,142b-140b,144f-143f,142b-141b
+	altinstruction_entry 140b,143f,\feature,142b-140b,144f-143f
 	.popsection
 
 	.pushsection .altinstr_replacement,"ax"
@@ -375,8 +368,8 @@ static inline int alternatives_text_reserved(void *start, void *end)
 142:
 
 	.pushsection .altinstructions,"a"
-	altinstruction_entry 140b,143f,\feature1,142b-140b,144f-143f,142b-141b
-	altinstruction_entry 140b,144f,\feature2,142b-140b,145f-144f,142b-141b
+	altinstruction_entry 140b,143f,\feature1,142b-140b,144f-143f
+	altinstruction_entry 140b,144f,\feature2,142b-140b,145f-144f
 	.popsection
 
 	.pushsection .altinstr_replacement,"ax"
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index f902f28..1298d58 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -345,19 +345,35 @@ done:
 static void __init_or_module noinline optimize_nops(struct alt_instr *a, u8 *instr)
 {
 	unsigned long flags;
-	int i;
+	struct insn insn;
+	int nop, i = 0;
+
+	/*
+	 * Jump over the non-NOP insns, the remaining bytes must be single-byte
+	 * NOPs, optimize them.
+	 */
+	for (;;) {
+		if (insn_decode_kernel(&insn, &instr[i]))
+			return;
+
+		if (insn.length == 1 && insn.opcode.bytes[0] == 0x90)
+			break;
+
+		if ((i += insn.length) >= a->instrlen)
+			return;
+	}
 
-	for (i = 0; i < a->padlen; i++) {
-		if (instr[i] != 0x90)
+	for (nop = i; i < a->instrlen; i++) {
+		if (WARN_ONCE(instr[i] != 0x90, "Not a NOP at 0x%px\n", &instr[i]))
 			return;
 	}
 
 	local_irq_save(flags);
-	add_nops(instr + (a->instrlen - a->padlen), a->padlen);
+	add_nops(instr + nop, i - nop);
 	local_irq_restore(flags);
 
 	DUMP_BYTES(instr, a->instrlen, "%px: [%d:%d) optimized NOPs: ",
-		   instr, a->instrlen - a->padlen, a->padlen);
+		   instr, nop, a->instrlen);
 }
 
 /*
@@ -403,19 +419,15 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
 		 * - feature not present but ALTINSTR_FLAG_INV is set to mean,
 		 *   patch if feature is *NOT* present.
 		 */
-		if (!boot_cpu_has(feature) == !(a->cpuid & ALTINSTR_FLAG_INV)) {
-			if (a->padlen > 1)
-				optimize_nops(a, instr);
-
-			continue;
-		}
+		if (!boot_cpu_has(feature) == !(a->cpuid & ALTINSTR_FLAG_INV))
+			goto next;
 
-		DPRINTK("feat: %s%d*32+%d, old: (%pS (%px) len: %d), repl: (%px, len: %d), pad: %d",
+		DPRINTK("feat: %s%d*32+%d, old: (%pS (%px) len: %d), repl: (%px, len: %d)",
 			(a->cpuid & ALTINSTR_FLAG_INV) ? "!" : "",
 			feature >> 5,
 			feature & 0x1f,
 			instr, instr, a->instrlen,
-			replacement, a->replacementlen, a->padlen);
+			replacement, a->replacementlen);
 
 		DUMP_BYTES(instr, a->instrlen, "%px: old_insn: ", instr);
 		DUMP_BYTES(replacement, a->replacementlen, "%px: rpl_insn: ", replacement);
@@ -439,14 +451,15 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
 		if (a->replacementlen && is_jmp(replacement[0]))
 			recompute_jump(a, instr, replacement, insn_buff);
 
-		if (a->instrlen > a->replacementlen) {
-			add_nops(insn_buff + a->replacementlen,
-				 a->instrlen - a->replacementlen);
-			insn_buff_sz += a->instrlen - a->replacementlen;
-		}
+		for (; insn_buff_sz < a->instrlen; insn_buff_sz++)
+			insn_buff[insn_buff_sz] = 0x90;
+
 		DUMP_BYTES(insn_buff, insn_buff_sz, "%px: final_insn: ", instr);
 
 		text_poke_early(instr, insn_buff, insn_buff_sz);
+
+next:
+		optimize_nops(a, instr);
 	}
 }
 
diff --git a/tools/objtool/arch/x86/include/arch/special.h b/tools/objtool/arch/x86/include/arch/special.h
index d818b2b..14271cc 100644
--- a/tools/objtool/arch/x86/include/arch/special.h
+++ b/tools/objtool/arch/x86/include/arch/special.h
@@ -10,7 +10,7 @@
 #define JUMP_ORIG_OFFSET	0
 #define JUMP_NEW_OFFSET		4
 
-#define ALT_ENTRY_SIZE		13
+#define ALT_ENTRY_SIZE		12
 #define ALT_ORIG_OFFSET		0
 #define ALT_NEW_OFFSET		4
 #define ALT_FEATURE_OFFSET	8

^ permalink raw reply related	[flat|nested] 82+ messages in thread
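
The scan that "x86/alternatives: Optimize optimize_nops()" introduces can be
sketched in userspace as follows: step over whole instructions until a
single-byte NOP is found, then require every remaining byte to be 0x90. This is
only a sketch under stated assumptions. toy_insn_len() is a hypothetical
stand-in for the kernel's instruction decoder that knows just three opcodes,
and trailing_nop_start() is an invented name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder stand-in: instruction length at p, or 0 if unknown. */
typedef size_t (*insn_len_fn)(const uint8_t *p);

/* Toy decoder: NOP (1 byte), FF /r (2 bytes), MOV imm32 (5 bytes). */
static size_t toy_insn_len(const uint8_t *p)
{
	switch (p[0]) {
	case 0x90: return 1;
	case 0xff: return 2;
	case 0xb8: return 5;
	default:   return 0;
	}
}

/*
 * Return the offset where the trailing run of single-byte NOPs starts,
 * or -1 if there is no such run or the tail is not purely NOPs.
 */
static long trailing_nop_start(const uint8_t *instr, size_t len,
			       insn_len_fn insn_len)
{
	size_t i = 0, nop;

	if (!len)
		return -1;

	/* Jump over the non-NOP instructions. */
	for (;;) {
		size_t l = insn_len(&instr[i]);

		if (!l)
			return -1;

		if (l == 1 && instr[i] == 0x90)
			break;

		i += l;
		if (i >= len)
			return -1;
	}

	/* Everything from here to the end must be a single-byte NOP. */
	for (nop = i; i < len; i++) {
		if (instr[i] != 0x90)
			return -1;
	}

	return (long)nop;
}
```

Decoding forward rather than scanning bytes backward matters because 0x90 can
also appear as an immediate or displacement byte: b8 90 90 90 90 is one MOV,
not a MOV followed by NOPs.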

* [tip: x86/core] x86: Add insn_decode_kernel()
  2021-03-26 15:12 ` [PATCH v3 01/16] x86: Add insn_decode_kernel() Peter Zijlstra
@ 2021-04-01 15:08   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-01 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     52fa82c21f64e900a72437269a5cc9e0034b424e
Gitweb:        https://git.kernel.org/tip/52fa82c21f64e900a72437269a5cc9e0034b424e
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:00 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 31 Mar 2021 16:20:22 +02:00

x86: Add insn_decode_kernel()

Add a helper to decode kernel instructions; there's no point in
endlessly repeating those last two arguments.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210326151259.379242587@infradead.org
---
 arch/x86/include/asm/insn.h        | 2 ++
 arch/x86/kernel/alternative.c      | 2 +-
 arch/x86/kernel/cpu/mce/severity.c | 2 +-
 arch/x86/kernel/kprobes/core.c     | 4 ++--
 arch/x86/kernel/kprobes/opt.c      | 2 +-
 arch/x86/kernel/traps.c            | 2 +-
 tools/arch/x86/include/asm/insn.h  | 2 ++
 7 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/insn.h b/arch/x86/include/asm/insn.h
index f03b6ca..05a6ab9 100644
--- a/arch/x86/include/asm/insn.h
+++ b/arch/x86/include/asm/insn.h
@@ -150,6 +150,8 @@ enum insn_mode {
 
 extern int insn_decode(struct insn *insn, const void *kaddr, int buf_len, enum insn_mode m);
 
+#define insn_decode_kernel(_insn, _ptr) insn_decode((_insn), (_ptr), MAX_INSN_SIZE, INSN_MODE_KERN)
+
 /* Attribute will be determined after getting ModRM (for opcode groups) */
 static inline void insn_get_attribute(struct insn *insn)
 {
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index ce28c5c..ff359b3 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1280,7 +1280,7 @@ static void text_poke_loc_init(struct text_poke_loc *tp, void *addr,
 	if (!emulate)
 		emulate = opcode;
 
-	ret = insn_decode(&insn, emulate, MAX_INSN_SIZE, INSN_MODE_KERN);
+	ret = insn_decode_kernel(&insn, emulate);
 
 	BUG_ON(ret < 0);
 	BUG_ON(len != insn.length);
diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index a2136ce..abdd2e4 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -225,7 +225,7 @@ static bool is_copy_from_user(struct pt_regs *regs)
 	if (copy_from_kernel_nofault(insn_buf, (void *)regs->ip, MAX_INSN_SIZE))
 		return false;
 
-	ret = insn_decode(&insn, insn_buf, MAX_INSN_SIZE, INSN_MODE_KERN);
+	ret = insn_decode_kernel(&insn, insn_buf);
 	if (ret < 0)
 		return false;
 
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index dd09021..1319ff4 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -285,7 +285,7 @@ static int can_probe(unsigned long paddr)
 		if (!__addr)
 			return 0;
 
-		ret = insn_decode(&insn, (void *)__addr, MAX_INSN_SIZE, INSN_MODE_KERN);
+		ret = insn_decode_kernel(&insn, (void *)__addr);
 		if (ret < 0)
 			return 0;
 
@@ -322,7 +322,7 @@ int __copy_instruction(u8 *dest, u8 *src, u8 *real, struct insn *insn)
 			MAX_INSN_SIZE))
 		return 0;
 
-	ret = insn_decode(insn, dest, MAX_INSN_SIZE, INSN_MODE_KERN);
+	ret = insn_decode_kernel(insn, dest);
 	if (ret < 0)
 		return 0;
 
diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index 4299fc8..71425eb 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -324,7 +324,7 @@ static int can_optimize(unsigned long paddr)
 		if (!recovered_insn)
 			return 0;
 
-		ret = insn_decode(&insn, (void *)recovered_insn, MAX_INSN_SIZE, INSN_MODE_KERN);
+		ret = insn_decode_kernel(&insn, (void *)recovered_insn);
 		if (ret < 0)
 			return 0;
 
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index a5d2540..034f27f 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -504,7 +504,7 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
 			MAX_INSN_SIZE))
 		return GP_NO_HINT;
 
-	ret = insn_decode(&insn, insn_buf, MAX_INSN_SIZE, INSN_MODE_KERN);
+	ret = insn_decode_kernel(&insn, insn_buf);
 	if (ret < 0)
 		return GP_NO_HINT;
 
diff --git a/tools/arch/x86/include/asm/insn.h b/tools/arch/x86/include/asm/insn.h
index c9f3eee..dc632b4 100644
--- a/tools/arch/x86/include/asm/insn.h
+++ b/tools/arch/x86/include/asm/insn.h
@@ -150,6 +150,8 @@ enum insn_mode {
 
 extern int insn_decode(struct insn *insn, const void *kaddr, int buf_len, enum insn_mode m);
 
+#define insn_decode_kernel(_insn, _ptr) insn_decode((_insn), (_ptr), MAX_INSN_SIZE, INSN_MODE_KERN)
+
 /* Attribute will be determined after getting ModRM (for opcode groups) */
 static inline void insn_get_attribute(struct insn *insn)
 {


* [tip: x86/core] objtool/x86: Rewrite retpoline thunk calls
  2021-03-26 15:12 ` [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls Peter Zijlstra
  2021-03-29 16:38   ` Josh Poimboeuf
  2021-04-01 15:08   ` [tip: x86/core] objtool/x86: " tip-bot2 for Peter Zijlstra
@ 2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  2 siblings, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-03 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Ingo Molnar, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     9bc0bb50727c8ac69fbb33fb937431cf3518ff37
Gitweb:        https://git.kernel.org/tip/9bc0bb50727c8ac69fbb33fb937431cf3518ff37
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:15 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 02 Apr 2021 12:47:28 +02:00

objtool/x86: Rewrite retpoline thunk calls

When the compiler emits: "CALL __x86_indirect_thunk_\reg" for an
indirect call, have objtool rewrite it to:

	ALTERNATIVE "call __x86_indirect_thunk_\reg",
		    "call *%reg", ALT_NOT(X86_FEATURE_RETPOLINE)

Additionally, in order to not emit endless identical
.altinst_replacement chunks, use a global symbol for them, see
__x86_indirect_alt_*.

This also saves objtool from having to do code generation.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.320177914@infradead.org
---
 arch/x86/include/asm/asm-prototypes.h |  12 ++-
 arch/x86/lib/retpoline.S              |  41 ++++++++-
 tools/objtool/arch/x86/decode.c       | 117 +++++++++++++++++++++++++-
 3 files changed, 167 insertions(+), 3 deletions(-)
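
Illustration only, not part of the commit: a minimal sketch of how the
replacement symbol name and the cpuid word are derived. The three macro
values are copied from the diff below; THUNK_PREFIX and alt_symbol_name()
are hypothetical names used here to show where the hardcoded "+ 21" in
arch_rewrite_retpolines() comes from.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Values copied from the patch below. */
#define ALTINSTR_FLAG_INV	(1 << 15)
#define ALT_NOT(feat)		((feat) | ALTINSTR_FLAG_INV)
#define X86_FEATURE_RETPOLINE	(7*32+12)

/* Hypothetical name: the patch skips this prefix with a literal 21. */
#define THUNK_PREFIX		"__x86_indirect_thunk_"

/*
 * Hypothetical helper: build the name of the replacement symbol for a
 * thunk symbol. Returns 0 on success, -1 if @thunk does not carry the
 * thunk prefix or @buf is too small.
 */
static int alt_symbol_name(char *buf, size_t len, const char *thunk, int is_jmp)
{
	const size_t plen = strlen(THUNK_PREFIX);
	int n;

	assert(buf && thunk);

	if (strncmp(thunk, THUNK_PREFIX, plen))
		return -1;

	/* Everything after the prefix is the register name. */
	n = snprintf(buf, len, "__x86_indirect_alt_%s_%s",
		     is_jmp ? "jmp" : "call", thunk + plen);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Unlike the sprintf() in the patch, the sketch checks the prefix and the
buffer size; the patch can skip both because it only walks instructions
already classified as retpoline calls.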

diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h
index 0545b07..4cb726c 100644
--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -19,11 +19,19 @@ extern void cmpxchg8b_emu(void);
 
 #ifdef CONFIG_RETPOLINE
 
-#define DECL_INDIRECT_THUNK(reg) \
+#undef GEN
+#define GEN(reg) \
 	extern asmlinkage void __x86_indirect_thunk_ ## reg (void);
+#include <asm/GEN-for-each-reg.h>
+
+#undef GEN
+#define GEN(reg) \
+	extern asmlinkage void __x86_indirect_alt_call_ ## reg (void);
+#include <asm/GEN-for-each-reg.h>
 
 #undef GEN
-#define GEN(reg) DECL_INDIRECT_THUNK(reg)
+#define GEN(reg) \
+	extern asmlinkage void __x86_indirect_alt_jmp_ ## reg (void);
 #include <asm/GEN-for-each-reg.h>
 
 #endif /* CONFIG_RETPOLINE */
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index d2c0d14..4d32cb0 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -10,6 +10,8 @@
 #include <asm/unwind_hints.h>
 #include <asm/frame.h>
 
+	.section .text.__x86.indirect_thunk
+
 .macro RETPOLINE reg
 	ANNOTATE_INTRA_FUNCTION_CALL
 	call    .Ldo_rop_\@
@@ -25,9 +27,9 @@
 .endm
 
 .macro THUNK reg
-	.section .text.__x86.indirect_thunk
 
 	.align 32
+
 SYM_FUNC_START(__x86_indirect_thunk_\reg)
 
 	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \
@@ -39,6 +41,32 @@ SYM_FUNC_END(__x86_indirect_thunk_\reg)
 .endm
 
 /*
+ * This generates .altinstr_replacement symbols for use by objtool. They,
+ * however, must not actually live in .altinstr_replacement since that will be
+ * discarded after init, but module alternatives will also reference these
+ * symbols.
+ *
+ * Their names match the "__x86_indirect_" prefix to mark them as retpolines.
+ */
+.macro ALT_THUNK reg
+
+	.align 1
+
+SYM_FUNC_START_NOALIGN(__x86_indirect_alt_call_\reg)
+	ANNOTATE_RETPOLINE_SAFE
+1:	call	*%\reg
+2:	.skip	5-(2b-1b), 0x90
+SYM_FUNC_END(__x86_indirect_alt_call_\reg)
+
+SYM_FUNC_START_NOALIGN(__x86_indirect_alt_jmp_\reg)
+	ANNOTATE_RETPOLINE_SAFE
+1:	jmp	*%\reg
+2:	.skip	5-(2b-1b), 0x90
+SYM_FUNC_END(__x86_indirect_alt_jmp_\reg)
+
+.endm
+
+/*
  * Despite being an assembler file we can't just use .irp here
  * because __KSYM_DEPS__ only uses the C preprocessor and would
  * only see one instance of "__x86_indirect_thunk_\reg" rather
@@ -61,3 +89,14 @@ SYM_FUNC_END(__x86_indirect_thunk_\reg)
 #define GEN(reg) EXPORT_THUNK(reg)
 #include <asm/GEN-for-each-reg.h>
 
+#undef GEN
+#define GEN(reg) ALT_THUNK reg
+#include <asm/GEN-for-each-reg.h>
+
+#undef GEN
+#define GEN(reg) __EXPORT_THUNK(__x86_indirect_alt_call_ ## reg)
+#include <asm/GEN-for-each-reg.h>
+
+#undef GEN
+#define GEN(reg) __EXPORT_THUNK(__x86_indirect_alt_jmp_ ## reg)
+#include <asm/GEN-for-each-reg.h>
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 782894e..7e8b5be 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -19,6 +19,7 @@
 #include <objtool/elf.h>
 #include <objtool/arch.h>
 #include <objtool/warn.h>
+#include <arch/elf.h>
 
 static unsigned char op_to_cfi_reg[][2] = {
 	{CFI_AX, CFI_R8},
@@ -613,6 +614,122 @@ const char *arch_nop_insn(int len)
 	return nops[len-1];
 }
 
+/* asm/alternative.h ? */
+
+#define ALTINSTR_FLAG_INV	(1 << 15)
+#define ALT_NOT(feat)		((feat) | ALTINSTR_FLAG_INV)
+
+struct alt_instr {
+	s32 instr_offset;	/* original instruction */
+	s32 repl_offset;	/* offset to replacement instruction */
+	u16 cpuid;		/* cpuid bit set for replacement */
+	u8  instrlen;		/* length of original instruction */
+	u8  replacementlen;	/* length of new instruction */
+} __packed;
+
+static int elf_add_alternative(struct elf *elf,
+			       struct instruction *orig, struct symbol *sym,
+			       int cpuid, u8 orig_len, u8 repl_len)
+{
+	const int size = sizeof(struct alt_instr);
+	struct alt_instr *alt;
+	struct section *sec;
+	Elf_Scn *s;
+
+	sec = find_section_by_name(elf, ".altinstructions");
+	if (!sec) {
+		sec = elf_create_section(elf, ".altinstructions",
+					 SHF_WRITE, size, 0);
+
+		if (!sec) {
+			WARN_ELF("elf_create_section");
+			return -1;
+		}
+	}
+
+	s = elf_getscn(elf->elf, sec->idx);
+	if (!s) {
+		WARN_ELF("elf_getscn");
+		return -1;
+	}
+
+	sec->data = elf_newdata(s);
+	if (!sec->data) {
+		WARN_ELF("elf_newdata");
+		return -1;
+	}
+
+	sec->data->d_size = size;
+	sec->data->d_align = 1;
+
+	alt = sec->data->d_buf = malloc(size);
+	if (!sec->data->d_buf) {
+		perror("malloc");
+		return -1;
+	}
+	memset(sec->data->d_buf, 0, size);
+
+	if (elf_add_reloc_to_insn(elf, sec, sec->sh.sh_size,
+				  R_X86_64_PC32, orig->sec, orig->offset)) {
+		WARN("elf_create_reloc: alt_instr::instr_offset");
+		return -1;
+	}
+
+	if (elf_add_reloc(elf, sec, sec->sh.sh_size + 4,
+			  R_X86_64_PC32, sym, 0)) {
+		WARN("elf_create_reloc: alt_instr::repl_offset");
+		return -1;
+	}
+
+	alt->cpuid = cpuid;
+	alt->instrlen = orig_len;
+	alt->replacementlen = repl_len;
+
+	sec->sh.sh_size += size;
+	sec->changed = true;
+
+	return 0;
+}
+
+#define X86_FEATURE_RETPOLINE                ( 7*32+12)
+
+int arch_rewrite_retpolines(struct objtool_file *file)
+{
+	struct instruction *insn;
+	struct reloc *reloc;
+	struct symbol *sym;
+	char name[32] = "";
+
+	list_for_each_entry(insn, &file->retpoline_call_list, call_node) {
+
+		if (!strcmp(insn->sec->name, ".text.__x86.indirect_thunk"))
+			continue;
+
+		reloc = insn->reloc;
+
+		sprintf(name, "__x86_indirect_alt_%s_%s",
+			insn->type == INSN_JUMP_DYNAMIC ? "jmp" : "call",
+			reloc->sym->name + 21);
+
+		sym = find_symbol_by_name(file->elf, name);
+		if (!sym) {
+			sym = elf_create_undef_symbol(file->elf, name);
+			if (!sym) {
+				WARN("elf_create_undef_symbol");
+				return -1;
+			}
+		}
+
+		if (elf_add_alternative(file->elf, insn, sym,
+					ALT_NOT(X86_FEATURE_RETPOLINE), 5, 5)) {
+			WARN("elf_add_alternative");
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
 int arch_decode_hint_reg(struct instruction *insn, u8 sp_reg)
 {
 	struct cfi_reg *cfa = &insn->cfi.cfa;


* [tip: x86/core] objtool: Skip magical retpoline .altinstr_replacement
  2021-03-26 15:12 ` [PATCH v3 15/16] objtool: Skip magical retpoline .altinstr_replacement Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
@ 2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-03 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Ingo Molnar, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     50e7b4a1a1b264fc7df0698f2defb93cadf19a7b
Gitweb:        https://git.kernel.org/tip/50e7b4a1a1b264fc7df0698f2defb93cadf19a7b
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:14 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 02 Apr 2021 12:46:57 +02:00

objtool: Skip magical retpoline .altinstr_replacement

When the .altinstr_replacement is a retpoline, skip the alternative.
We already special case retpolines anyway.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.259429287@infradead.org
---
 tools/objtool/special.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
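
Illustration only, not part of the commit: a minimal sketch of the
three-way return convention this patch introduces (negative is an error,
positive means "skip this entry", zero means "keep it"). All names below
(struct entry, classify(), collect(), the ALT_* constants) are hypothetical
stand-ins for get_alt_entry() and special_get_alts().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three outcomes. */
enum { ALT_ERROR = -1, ALT_KEEP = 0, ALT_SKIP = 1 };

/* Hypothetical stand-in for one .altinstructions entry. */
struct entry {
	int broken;		/* entry cannot be parsed */
	int is_retpoline;	/* replacement is a retpoline symbol */
};

/* Plays the role of get_alt_entry(). */
static int classify(const struct entry *e)
{
	if (e->broken)
		return ALT_ERROR;
	if (e->is_retpoline)
		return ALT_SKIP;
	return ALT_KEEP;
}

/*
 * Plays the role of the loop in special_get_alts(). Returns the number
 * of entries kept, or a negative value on the first error.
 */
static int collect(const struct entry *entries, size_t nr)
{
	int kept = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		int ret = classify(&entries[i]);

		if (ret > 0)
			continue;	/* skipped, not an error */
		if (ret < 0)
			return ret;	/* hard failure, stop */
		kept++;
	}

	return kept;
}
```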

diff --git a/tools/objtool/special.c b/tools/objtool/special.c
index 2c7fbda..07b21cf 100644
--- a/tools/objtool/special.c
+++ b/tools/objtool/special.c
@@ -106,6 +106,14 @@ static int get_alt_entry(struct elf *elf, struct special_entry *entry,
 			return -1;
 		}
 
+		/*
+		 * Skip retpoline .altinstr_replacement... we already rewrite the
+		 * instructions for retpolines anyway, see arch_is_retpoline()
+		 * usage in add_{call,jump}_destinations().
+		 */
+		if (arch_is_retpoline(new_reloc->sym))
+			return 1;
+
 		alt->new_sec = new_reloc->sym->sec;
 		alt->new_off = (unsigned int)new_reloc->addend;
 
@@ -154,7 +162,9 @@ int special_get_alts(struct elf *elf, struct list_head *alts)
 			memset(alt, 0, sizeof(*alt));
 
 			ret = get_alt_entry(elf, entry, sec, idx, alt);
-			if (ret)
+			if (ret > 0)
+				continue;
+			if (ret < 0)
 				return ret;
 
 			list_add_tail(&alt->list, alts);


* [tip: x86/core] objtool: Cache instruction relocs
  2021-03-26 15:12 ` [PATCH v3 14/16] objtool: Cache instruction relocs Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
@ 2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-03 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Ingo Molnar, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     7bd2a600f3e9d27286bbf23c83d599e9cc7cf245
Gitweb:        https://git.kernel.org/tip/7bd2a600f3e9d27286bbf23c83d599e9cc7cf245
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:13 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 02 Apr 2021 12:46:15 +02:00

objtool: Cache instruction relocs

Track the reloc of instructions in the new instruction->reloc field
to avoid having to look them up again later.

( Technically x86 instructions can have two relocations, but not jumps
  and calls, for which we're using this. )

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.195441549@infradead.org
---
 tools/objtool/check.c                 | 28 ++++++++++++++++++++------
 tools/objtool/include/objtool/check.h |  1 +-
 2 files changed, 23 insertions(+), 6 deletions(-)
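
Illustration only, not part of the commit: a minimal sketch of the
negative-caching idea behind insn_reloc(). A sentinel pointer records
"already looked, nothing there" so that misses are cached as well as hits.
The structs are cut down to the bare minimum, and slow_lookup() with its
nr_lookups counter is a hypothetical stand-in for find_reloc_by_dest_range().

```c
#include <assert.h>
#include <stddef.h>

struct reloc { int dummy; };

struct insn {
	struct reloc *reloc;	/* NULL: not looked up yet */
	unsigned long offset;
};

/* Sentinel for a cached miss; the patch uses ((void *)-1L). */
#define NEGATIVE_RELOC	((struct reloc *)-1L)

static int nr_lookups;		/* counts how often the slow path runs */
static struct reloc the_reloc;

/* Hypothetical slow path: only offset 0x10 has a relocation. */
static struct reloc *slow_lookup(const struct insn *insn)
{
	nr_lookups++;
	return insn->offset == 0x10 ? &the_reloc : NULL;
}

static struct reloc *insn_reloc(struct insn *insn)
{
	if (insn->reloc == NEGATIVE_RELOC)
		return NULL;		/* cached miss */

	if (!insn->reloc) {
		insn->reloc = slow_lookup(insn);
		if (!insn->reloc) {
			insn->reloc = NEGATIVE_RELOC;
			return NULL;
		}
	}

	return insn->reloc;		/* cached hit */
}
```

Without the sentinel, a NULL field could not tell "never looked" apart
from "looked and found nothing", and every miss would pay for the lookup
again.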

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 77074db..1f4154f 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -797,6 +797,25 @@ __weak bool arch_is_retpoline(struct symbol *sym)
 	return false;
 }
 
+#define NEGATIVE_RELOC	((void *)-1L)
+
+static struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn)
+{
+	if (insn->reloc == NEGATIVE_RELOC)
+		return NULL;
+
+	if (!insn->reloc) {
+		insn->reloc = find_reloc_by_dest_range(file->elf, insn->sec,
+						       insn->offset, insn->len);
+		if (!insn->reloc) {
+			insn->reloc = NEGATIVE_RELOC;
+			return NULL;
+		}
+	}
+
+	return insn->reloc;
+}
+
 /*
  * Find the destination instructions for all jumps.
  */
@@ -811,8 +830,7 @@ static int add_jump_destinations(struct objtool_file *file)
 		if (!is_static_jump(insn))
 			continue;
 
-		reloc = find_reloc_by_dest_range(file->elf, insn->sec,
-						 insn->offset, insn->len);
+		reloc = insn_reloc(file, insn);
 		if (!reloc) {
 			dest_sec = insn->sec;
 			dest_off = arch_jump_destination(insn);
@@ -944,8 +962,7 @@ static int add_call_destinations(struct objtool_file *file)
 		if (insn->type != INSN_CALL)
 			continue;
 
-		reloc = find_reloc_by_dest_range(file->elf, insn->sec,
-					       insn->offset, insn->len);
+		reloc = insn_reloc(file, insn);
 		if (!reloc) {
 			dest_off = arch_jump_destination(insn);
 			insn->call_dest = find_call_destination(insn->sec, dest_off);
@@ -1144,8 +1161,7 @@ static int handle_group_alt(struct objtool_file *file,
 		 * alternatives code can adjust the relative offsets
 		 * accordingly.
 		 */
-		alt_reloc = find_reloc_by_dest_range(file->elf, insn->sec,
-						   insn->offset, insn->len);
+		alt_reloc = insn_reloc(file, insn);
 		if (alt_reloc &&
 		    !arch_support_alt_relocation(special_alt, insn, alt_reloc)) {
 
diff --git a/tools/objtool/include/objtool/check.h b/tools/objtool/include/objtool/check.h
index e5528ce..56d50bc 100644
--- a/tools/objtool/include/objtool/check.h
+++ b/tools/objtool/include/objtool/check.h
@@ -56,6 +56,7 @@ struct instruction {
 	struct instruction *jump_dest;
 	struct instruction *first_jump_src;
 	struct reloc *jump_table;
+	struct reloc *reloc;
 	struct list_head alts;
 	struct symbol *func;
 	struct list_head stack_ops;


* [tip: x86/core] objtool: Keep track of retpoline call sites
  2021-03-26 15:12 ` [PATCH v3 13/16] objtool: Keep track of retpoline call sites Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
@ 2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-03 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Ingo Molnar, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     43d5430ad74ef5156353af7aec352426ec7a8e57
Gitweb:        https://git.kernel.org/tip/43d5430ad74ef5156353af7aec352426ec7a8e57
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:12 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 02 Apr 2021 12:45:27 +02:00

objtool: Keep track of retpoline call sites

Provide infrastructure for architectures to rewrite/augment compiler
generated retpoline calls. Similar to what we do for static_call()s,
keep track of the instructions that are retpoline calls.

Use the same list_head, since a retpoline call cannot also be a
static_call.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.130805730@infradead.org
---
 tools/objtool/check.c                   | 34 ++++++++++++++++++++----
 tools/objtool/include/objtool/arch.h    |  2 +-
 tools/objtool/include/objtool/check.h   |  2 +-
 tools/objtool/include/objtool/objtool.h |  1 +-
 tools/objtool/objtool.c                 |  1 +-
 5 files changed, 34 insertions(+), 6 deletions(-)
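
Illustration only, not part of the commit: a minimal sketch of the weak
default hook pattern used for arch_rewrite_retpolines(). It assumes a
GCC or Clang toolchain for the weak attribute. ARCH_WEAK and decode_tail()
are hypothetical names; objtool itself uses its __weak macro and calls the
hook at the end of decode_sections().

```c
#include <assert.h>
#include <stddef.h>

struct objtool_file;	/* opaque for the purpose of this sketch */

/* Hypothetical macro name; assumes GCC/Clang attribute syntax. */
#define ARCH_WEAK	__attribute__((weak))

/*
 * Generic default: architectures that do not rewrite retpolines need
 * not provide anything. A non-weak definition of the same function in
 * an arch file overrides this one at link time.
 */
ARCH_WEAK int arch_rewrite_retpolines(struct objtool_file *file)
{
	(void)file;
	return 0;
}

/* Hypothetical caller, mirroring the tail of decode_sections(). */
static int decode_tail(struct objtool_file *file)
{
	int ret;

	ret = arch_rewrite_retpolines(file);
	if (ret)
		return ret;

	return 0;
}
```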

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 600fa67..77074db 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -451,7 +451,7 @@ static int create_static_call_sections(struct objtool_file *file)
 		return 0;
 
 	idx = 0;
-	list_for_each_entry(insn, &file->static_call_list, static_call_node)
+	list_for_each_entry(insn, &file->static_call_list, call_node)
 		idx++;
 
 	sec = elf_create_section(file->elf, ".static_call_sites", SHF_WRITE,
@@ -460,7 +460,7 @@ static int create_static_call_sections(struct objtool_file *file)
 		return -1;
 
 	idx = 0;
-	list_for_each_entry(insn, &file->static_call_list, static_call_node) {
+	list_for_each_entry(insn, &file->static_call_list, call_node) {
 
 		site = (struct static_call_site *)sec->data->d_buf + idx;
 		memset(site, 0, sizeof(struct static_call_site));
@@ -829,13 +829,16 @@ static int add_jump_destinations(struct objtool_file *file)
 			else
 				insn->type = INSN_JUMP_DYNAMIC_CONDITIONAL;
 
+			list_add_tail(&insn->call_node,
+				      &file->retpoline_call_list);
+
 			insn->retpoline_safe = true;
 			continue;
 		} else if (insn->func) {
 			/* internal or external sibling call (with reloc) */
 			insn->call_dest = reloc->sym;
 			if (insn->call_dest->static_call_tramp) {
-				list_add_tail(&insn->static_call_node,
+				list_add_tail(&insn->call_node,
 					      &file->static_call_list);
 			}
 			continue;
@@ -897,7 +900,7 @@ static int add_jump_destinations(struct objtool_file *file)
 				/* internal sibling call (without reloc) */
 				insn->call_dest = insn->jump_dest->func;
 				if (insn->call_dest->static_call_tramp) {
-					list_add_tail(&insn->static_call_node,
+					list_add_tail(&insn->call_node,
 						      &file->static_call_list);
 				}
 			}
@@ -981,6 +984,9 @@ static int add_call_destinations(struct objtool_file *file)
 			insn->type = INSN_CALL_DYNAMIC;
 			insn->retpoline_safe = true;
 
+			list_add_tail(&insn->call_node,
+				      &file->retpoline_call_list);
+
 			remove_insn_ops(insn);
 			continue;
 
@@ -988,7 +994,7 @@ static int add_call_destinations(struct objtool_file *file)
 			insn->call_dest = reloc->sym;
 
 		if (insn->call_dest && insn->call_dest->static_call_tramp) {
-			list_add_tail(&insn->static_call_node,
+			list_add_tail(&insn->call_node,
 				      &file->static_call_list);
 		}
 
@@ -1714,6 +1720,11 @@ static void mark_rodata(struct objtool_file *file)
 	file->rodata = found;
 }
 
+__weak int arch_rewrite_retpolines(struct objtool_file *file)
+{
+	return 0;
+}
+
 static int decode_sections(struct objtool_file *file)
 {
 	int ret;
@@ -1742,6 +1753,10 @@ static int decode_sections(struct objtool_file *file)
 	if (ret)
 		return ret;
 
+	/*
+	 * Must be before add_special_section_alts() as that depends on
+	 * jump_dest being set.
+	 */
 	ret = add_jump_destinations(file);
 	if (ret)
 		return ret;
@@ -1778,6 +1793,15 @@ static int decode_sections(struct objtool_file *file)
 	if (ret)
 		return ret;
 
+	/*
+	 * Must be after add_special_section_alts(), since this will emit
+	 * alternatives. Must be after add_{jump,call}_destination(), since
+	 * those create the call insn lists.
+	 */
+	ret = arch_rewrite_retpolines(file);
+	if (ret)
+		return ret;
+
 	return 0;
 }
 
diff --git a/tools/objtool/include/objtool/arch.h b/tools/objtool/include/objtool/arch.h
index bb30993..48b540a 100644
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -88,4 +88,6 @@ int arch_decode_hint_reg(struct instruction *insn, u8 sp_reg);
 
 bool arch_is_retpoline(struct symbol *sym);
 
+int arch_rewrite_retpolines(struct objtool_file *file);
+
 #endif /* _ARCH_H */
diff --git a/tools/objtool/include/objtool/check.h b/tools/objtool/include/objtool/check.h
index f5be798..e5528ce 100644
--- a/tools/objtool/include/objtool/check.h
+++ b/tools/objtool/include/objtool/check.h
@@ -39,7 +39,7 @@ struct alt_group {
 struct instruction {
 	struct list_head list;
 	struct hlist_node hash;
-	struct list_head static_call_node;
+	struct list_head call_node;
 	struct list_head mcount_loc_node;
 	struct section *sec;
 	unsigned long offset;
diff --git a/tools/objtool/include/objtool/objtool.h b/tools/objtool/include/objtool/objtool.h
index e68e374..e4084af 100644
--- a/tools/objtool/include/objtool/objtool.h
+++ b/tools/objtool/include/objtool/objtool.h
@@ -18,6 +18,7 @@ struct objtool_file {
 	struct elf *elf;
 	struct list_head insn_list;
 	DECLARE_HASHTABLE(insn_hash, 20);
+	struct list_head retpoline_call_list;
 	struct list_head static_call_list;
 	struct list_head mcount_loc_list;
 	bool ignore_unreachables, c_file, hints, rodata;
diff --git a/tools/objtool/objtool.c b/tools/objtool/objtool.c
index 7b97ce4..3a3ea1b 100644
--- a/tools/objtool/objtool.c
+++ b/tools/objtool/objtool.c
@@ -61,6 +61,7 @@ struct objtool_file *objtool_open_read(const char *_objname)
 
 	INIT_LIST_HEAD(&file.insn_list);
 	hash_init(file.insn_hash);
+	INIT_LIST_HEAD(&file.retpoline_call_list);
 	INIT_LIST_HEAD(&file.static_call_list);
 	INIT_LIST_HEAD(&file.mcount_loc_list);
 	file.c_file = !vmlinux && find_section_by_name(file.elf, ".comment");


* [tip: x86/core] objtool: Add elf_create_undef_symbol()
  2021-03-26 15:12 ` [PATCH v3 12/16] objtool: Add elf_create_undef_symbol() Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
@ 2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-03 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Ingo Molnar, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     2f2f7e47f0525cbaad5dd9675fd9d8aa8da12046
Gitweb:        https://git.kernel.org/tip/2f2f7e47f0525cbaad5dd9675fd9d8aa8da12046
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:11 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 02 Apr 2021 12:45:05 +02:00

objtool: Add elf_create_undef_symbol()

Allow objtool to create undefined symbols; this allows creating
relocations to symbols not currently in the symbol table.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.064743095@infradead.org
---
 tools/objtool/elf.c                 | 60 ++++++++++++++++++++++++++++-
 tools/objtool/include/objtool/elf.h |  1 +-
 2 files changed, 61 insertions(+)
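
Illustration only, not part of the commit: a minimal sketch of what an
undefined symbol table entry looks like and how the new symbol's index
follows from the current .symtab length. It assumes a Linux system with
glibc's <elf.h> and uses ELF64_ST_INFO() where the patch uses libelf's
GELF_ST_INFO(). make_undef_sym() and next_sym_index() are hypothetical
helpers.

```c
#include <assert.h>
#include <elf.h>
#include <string.h>

/*
 * Hypothetical helper: an undefined symbol is an otherwise all-zero
 * entry (st_shndx == SHN_UNDEF, no value, no size) with a name offset
 * and global binding.
 */
static Elf64_Sym make_undef_sym(Elf64_Word name_off)
{
	Elf64_Sym sym;

	memset(&sym, 0, sizeof(sym));
	sym.st_name = name_off;
	sym.st_info = ELF64_ST_INFO(STB_GLOBAL, STT_NOTYPE);

	return sym;
}

/*
 * Hypothetical helper: the index of an entry appended to .symtab is
 * the old byte length divided by the entry size.
 */
static size_t next_sym_index(size_t symtab_len)
{
	assert(symtab_len % sizeof(Elf64_Sym) == 0);
	return symtab_len / sizeof(Elf64_Sym);
}
```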

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 8457218..d08f5f3 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -715,6 +715,66 @@ static int elf_add_string(struct elf *elf, struct section *strtab, char *str)
 	return len;
 }
 
+struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name)
+{
+	struct section *symtab;
+	struct symbol *sym;
+	Elf_Data *data;
+	Elf_Scn *s;
+
+	sym = malloc(sizeof(*sym));
+	if (!sym) {
+		perror("malloc");
+		return NULL;
+	}
+	memset(sym, 0, sizeof(*sym));
+
+	sym->name = strdup(name);
+
+	sym->sym.st_name = elf_add_string(elf, NULL, sym->name);
+	if (sym->sym.st_name == -1)
+		return NULL;
+
+	sym->sym.st_info = GELF_ST_INFO(STB_GLOBAL, STT_NOTYPE);
+	// st_other 0
+	// st_shndx 0
+	// st_value 0
+	// st_size 0
+
+	symtab = find_section_by_name(elf, ".symtab");
+	if (!symtab) {
+		WARN("can't find .symtab");
+		return NULL;
+	}
+
+	s = elf_getscn(elf->elf, symtab->idx);
+	if (!s) {
+		WARN_ELF("elf_getscn");
+		return NULL;
+	}
+
+	data = elf_newdata(s);
+	if (!data) {
+		WARN_ELF("elf_newdata");
+		return NULL;
+	}
+
+	data->d_buf = &sym->sym;
+	data->d_size = sizeof(sym->sym);
+	data->d_align = 1;
+
+	sym->idx = symtab->len / sizeof(sym->sym);
+
+	symtab->len += data->d_size;
+	symtab->changed = true;
+
+	sym->sec = find_section_by_index(elf, 0);
+
+	elf_add_symbol(elf, sym);
+
+	return sym;
+}
+
 struct section *elf_create_section(struct elf *elf, const char *name,
 				   unsigned int sh_flags, size_t entsize, int nr)
 {
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index 463f329..45e5ede 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -133,6 +133,7 @@ int elf_write_insn(struct elf *elf, struct section *sec,
 		   unsigned long offset, unsigned int len,
 		   const char *insn);
 int elf_write_reloc(struct elf *elf, struct reloc *reloc);
+struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name);
 int elf_write(struct elf *elf);
 void elf_close(struct elf *elf);
 


* [tip: x86/core] objtool: Extract elf_symbol_add()
  2021-03-26 15:12 ` [PATCH v3 11/16] objtool: Extract elf_symbol_add() Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
@ 2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-03 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Ingo Molnar, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     9a7827b7789c630c1efdb121daa42c6e77dce97f
Gitweb:        https://git.kernel.org/tip/9a7827b7789c630c1efdb121daa42c6e77dce97f
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:10 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 02 Apr 2021 12:45:01 +02:00

objtool: Extract elf_symbol_add()

Create a common helper to add symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151300.003468981@infradead.org
---
 tools/objtool/elf.c | 56 ++++++++++++++++++++++++--------------------
 1 file changed, 31 insertions(+), 25 deletions(-)

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index c278a04..8457218 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -290,12 +290,39 @@ static int read_sections(struct elf *elf)
 	return 0;
 }
 
+static void elf_add_symbol(struct elf *elf, struct symbol *sym)
+{
+	struct list_head *entry;
+	struct rb_node *pnode;
+
+	sym->type = GELF_ST_TYPE(sym->sym.st_info);
+	sym->bind = GELF_ST_BIND(sym->sym.st_info);
+
+	sym->offset = sym->sym.st_value;
+	sym->len = sym->sym.st_size;
+
+	rb_add(&sym->node, &sym->sec->symbol_tree, symbol_to_offset);
+	pnode = rb_prev(&sym->node);
+	if (pnode)
+		entry = &rb_entry(pnode, struct symbol, node)->list;
+	else
+		entry = &sym->sec->symbol_list;
+	list_add(&sym->list, entry);
+	elf_hash_add(elf->symbol_hash, &sym->hash, sym->idx);
+	elf_hash_add(elf->symbol_name_hash, &sym->name_hash, str_hash(sym->name));
+
+	/*
+	 * Don't store empty STT_NOTYPE symbols in the rbtree.  They
+	 * can exist within a function, confusing the sorting.
+	 */
+	if (!sym->len)
+		rb_erase(&sym->node, &sym->sec->symbol_tree);
+}
+
 static int read_symbols(struct elf *elf)
 {
 	struct section *symtab, *symtab_shndx, *sec;
 	struct symbol *sym, *pfunc;
-	struct list_head *entry;
-	struct rb_node *pnode;
 	int symbols_nr, i;
 	char *coldstr;
 	Elf_Data *shndx_data = NULL;
@@ -340,9 +367,6 @@ static int read_symbols(struct elf *elf)
 			goto err;
 		}
 
-		sym->type = GELF_ST_TYPE(sym->sym.st_info);
-		sym->bind = GELF_ST_BIND(sym->sym.st_info);
-
 		if ((sym->sym.st_shndx > SHN_UNDEF &&
 		     sym->sym.st_shndx < SHN_LORESERVE) ||
 		    (shndx_data && sym->sym.st_shndx == SHN_XINDEX)) {
@@ -355,32 +379,14 @@ static int read_symbols(struct elf *elf)
 				     sym->name);
 				goto err;
 			}
-			if (sym->type == STT_SECTION) {
+			if (GELF_ST_TYPE(sym->sym.st_info) == STT_SECTION) {
 				sym->name = sym->sec->name;
 				sym->sec->sym = sym;
 			}
 		} else
 			sym->sec = find_section_by_index(elf, 0);
 
-		sym->offset = sym->sym.st_value;
-		sym->len = sym->sym.st_size;
-
-		rb_add(&sym->node, &sym->sec->symbol_tree, symbol_to_offset);
-		pnode = rb_prev(&sym->node);
-		if (pnode)
-			entry = &rb_entry(pnode, struct symbol, node)->list;
-		else
-			entry = &sym->sec->symbol_list;
-		list_add(&sym->list, entry);
-		elf_hash_add(elf->symbol_hash, &sym->hash, sym->idx);
-		elf_hash_add(elf->symbol_name_hash, &sym->name_hash, str_hash(sym->name));
-
-		/*
-		 * Don't store empty STT_NOTYPE symbols in the rbtree.  They
-		 * can exist within a function, confusing the sorting.
-		 */
-		if (!sym->len)
-			rb_erase(&sym->node, &sym->sec->symbol_tree);
+		elf_add_symbol(elf, sym);
 	}
 
 	if (stats)


* [tip: x86/core] objtool: Extract elf_strtab_concat()
  2021-03-26 15:12 ` [PATCH v3 10/16] objtool: Extract elf_strtab_concat() Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
@ 2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-03 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Ingo Molnar, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     417a4dc91e559f92404c2544f785b02ce75784c3
Gitweb:        https://git.kernel.org/tip/417a4dc91e559f92404c2544f785b02ce75784c3
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:09 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 02 Apr 2021 12:44:56 +02:00

objtool: Extract elf_strtab_concat()

Create a common helper to append strings to a strtab.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.941474004@infradead.org
---
 tools/objtool/elf.c | 60 +++++++++++++++++++++++++++-----------------
 1 file changed, 38 insertions(+), 22 deletions(-)
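
Illustration only, not part of the commit: a minimal sketch of the string
table convention that elf_add_string() relies on. A string's offset is the
table length before the append, and the length grows by strlen() + 1 to
cover the terminating NUL. The sketch swaps libelf's elf_newdata() for a
plain realloc() buffer; struct strtab and strtab_add() are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory string table. */
struct strtab {
	char *buf;
	size_t len;
};

/*
 * Append @str including its NUL terminator. Returns the offset of the
 * new string inside the table, or -1 on allocation failure.
 */
static long strtab_add(struct strtab *tab, const char *str)
{
	size_t n = strlen(str) + 1;
	char *buf;
	long off;

	buf = realloc(tab->buf, tab->len + n);
	if (!buf)
		return -1;

	memcpy(buf + tab->len, str, n);

	off = (long)tab->len;	/* old length is the new string's offset */
	tab->buf = buf;
	tab->len += n;

	return off;
}
```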

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 7b65ae3..c278a04 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -673,13 +673,48 @@ err:
 	return NULL;
 }
 
+static int elf_add_string(struct elf *elf, struct section *strtab, char *str)
+{
+	Elf_Data *data;
+	Elf_Scn *s;
+	int len;
+
+	if (!strtab)
+		strtab = find_section_by_name(elf, ".strtab");
+	if (!strtab) {
+		WARN("can't find .strtab section");
+		return -1;
+	}
+
+	s = elf_getscn(elf->elf, strtab->idx);
+	if (!s) {
+		WARN_ELF("elf_getscn");
+		return -1;
+	}
+
+	data = elf_newdata(s);
+	if (!data) {
+		WARN_ELF("elf_newdata");
+		return -1;
+	}
+
+	data->d_buf = str;
+	data->d_size = strlen(str) + 1;
+	data->d_align = 1;
+
+	len = strtab->len;
+	strtab->len += data->d_size;
+	strtab->changed = true;
+
+	return len;
+}
+
 struct section *elf_create_section(struct elf *elf, const char *name,
 				   unsigned int sh_flags, size_t entsize, int nr)
 {
 	struct section *sec, *shstrtab;
 	size_t size = entsize * nr;
 	Elf_Scn *s;
-	Elf_Data *data;
 
 	sec = malloc(sizeof(*sec));
 	if (!sec) {
@@ -736,7 +771,6 @@ struct section *elf_create_section(struct elf *elf, const char *name,
 	sec->sh.sh_addralign = 1;
 	sec->sh.sh_flags = SHF_ALLOC | sh_flags;
 
-
 	/* Add section name to .shstrtab (or .strtab for Clang) */
 	shstrtab = find_section_by_name(elf, ".shstrtab");
 	if (!shstrtab)
@@ -745,27 +779,9 @@ struct section *elf_create_section(struct elf *elf, const char *name,
 		WARN("can't find .shstrtab or .strtab section");
 		return NULL;
 	}
-
-	s = elf_getscn(elf->elf, shstrtab->idx);
-	if (!s) {
-		WARN_ELF("elf_getscn");
+	sec->sh.sh_name = elf_add_string(elf, shstrtab, sec->name);
+	if (sec->sh.sh_name == -1)
 		return NULL;
-	}
-
-	data = elf_newdata(s);
-	if (!data) {
-		WARN_ELF("elf_newdata");
-		return NULL;
-	}
-
-	data->d_buf = sec->name;
-	data->d_size = strlen(name) + 1;
-	data->d_align = 1;
-
-	sec->sh.sh_name = shstrtab->len;
-
-	shstrtab->len += strlen(name) + 1;
-	shstrtab->changed = true;
 
 	list_add_tail(&sec->list, &elf->sections);
 	elf_hash_add(elf->section_hash, &sec->hash, sec->idx);
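
As a rough illustration of what "append a string to a strtab" means, here is
a minimal standalone sketch. It does not use libelf or objtool's types;
struct toy_strtab and toy_strtab_add() are hypothetical names invented for
this example. Like the helper in the patch, it returns the offset at which
the new string starts, which is the value that ends up in sh_name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy string table; not objtool's struct section. */
struct toy_strtab {
	char	*buf;	/* concatenated NUL-terminated strings */
	size_t	 len;	/* bytes currently in use */
};

/*
 * Append str, including its terminating NUL, and return the offset at
 * which it was stored, or -1 if the allocation fails.
 */
static long toy_strtab_add(struct toy_strtab *tab, const char *str)
{
	size_t size, off;
	char *buf;

	assert(tab && str);

	size = strlen(str) + 1;
	off = tab->len;

	buf = realloc(tab->buf, tab->len + size);
	if (!buf)
		return -1;

	memcpy(buf + off, str, size);
	tab->buf = buf;
	tab->len += size;

	return (long)off;
}
```

The returned offset is the old table length, so it has to be read before the
length is bumped; the patch does the same with strtab->len.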


* [tip: x86/core] objtool: Add elf_create_reloc() helper
  2021-03-26 15:12 ` [PATCH v3 08/16] objtool: Add elf_create_reloc() helper Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
@ 2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-03 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Ingo Molnar, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     ef47cc01cb4abcd760d8ac66b9361d6ade4d0846
Gitweb:        https://git.kernel.org/tip/ef47cc01cb4abcd760d8ac66b9361d6ade4d0846
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:07 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 02 Apr 2021 12:44:18 +02:00

objtool: Add elf_create_reloc() helper

We have 4 instances of adding a relocation. Create a common helper
to avoid growing even more.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.817438847@infradead.org
---
 tools/objtool/check.c               | 78 +++++--------------------
 tools/objtool/elf.c                 | 86 ++++++++++++++++++----------
 tools/objtool/include/objtool/elf.h | 10 ++-
 tools/objtool/orc_gen.c             | 30 ++--------
 4 files changed, 85 insertions(+), 119 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 1d0415b..61fe29a 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -433,8 +433,7 @@ reachable:
 
 static int create_static_call_sections(struct objtool_file *file)
 {
-	struct section *sec, *reloc_sec;
-	struct reloc *reloc;
+	struct section *sec;
 	struct static_call_site *site;
 	struct instruction *insn;
 	struct symbol *key_sym;
@@ -460,8 +459,7 @@ static int create_static_call_sections(struct objtool_file *file)
 	if (!sec)
 		return -1;
 
-	reloc_sec = elf_create_reloc_section(file->elf, sec, SHT_RELA);
-	if (!reloc_sec)
+	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
 		return -1;
 
 	idx = 0;
@@ -471,25 +469,11 @@ static int create_static_call_sections(struct objtool_file *file)
 		memset(site, 0, sizeof(struct static_call_site));
 
 		/* populate reloc for 'addr' */
-		reloc = malloc(sizeof(*reloc));
-
-		if (!reloc) {
-			perror("malloc");
-			return -1;
-		}
-		memset(reloc, 0, sizeof(*reloc));
-
-		insn_to_reloc_sym_addend(insn->sec, insn->offset, reloc);
-		if (!reloc->sym) {
-			WARN_FUNC("static call tramp: missing containing symbol",
-				  insn->sec, insn->offset);
+		if (elf_add_reloc_to_insn(file->elf, sec,
+					  idx * sizeof(struct static_call_site),
+					  R_X86_64_PC32,
+					  insn->sec, insn->offset))
 			return -1;
-		}
-
-		reloc->type = R_X86_64_PC32;
-		reloc->offset = idx * sizeof(struct static_call_site);
-		reloc->sec = reloc_sec;
-		elf_add_reloc(file->elf, reloc);
 
 		/* find key symbol */
 		key_name = strdup(insn->call_dest->name);
@@ -526,18 +510,11 @@ static int create_static_call_sections(struct objtool_file *file)
 		free(key_name);
 
 		/* populate reloc for 'key' */
-		reloc = malloc(sizeof(*reloc));
-		if (!reloc) {
-			perror("malloc");
+		if (elf_add_reloc(file->elf, sec,
+				  idx * sizeof(struct static_call_site) + 4,
+				  R_X86_64_PC32, key_sym,
+				  is_sibling_call(insn) * STATIC_CALL_SITE_TAIL))
 			return -1;
-		}
-		memset(reloc, 0, sizeof(*reloc));
-		reloc->sym = key_sym;
-		reloc->addend = is_sibling_call(insn) ? STATIC_CALL_SITE_TAIL : 0;
-		reloc->type = R_X86_64_PC32;
-		reloc->offset = idx * sizeof(struct static_call_site) + 4;
-		reloc->sec = reloc_sec;
-		elf_add_reloc(file->elf, reloc);
 
 		idx++;
 	}
@@ -547,8 +524,7 @@ static int create_static_call_sections(struct objtool_file *file)
 
 static int create_mcount_loc_sections(struct objtool_file *file)
 {
-	struct section *sec, *reloc_sec;
-	struct reloc *reloc;
+	struct section *sec;
 	unsigned long *loc;
 	struct instruction *insn;
 	int idx;
@@ -571,8 +547,7 @@ static int create_mcount_loc_sections(struct objtool_file *file)
 	if (!sec)
 		return -1;
 
-	reloc_sec = elf_create_reloc_section(file->elf, sec, SHT_RELA);
-	if (!reloc_sec)
+	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
 		return -1;
 
 	idx = 0;
@@ -581,32 +556,11 @@ static int create_mcount_loc_sections(struct objtool_file *file)
 		loc = (unsigned long *)sec->data->d_buf + idx;
 		memset(loc, 0, sizeof(unsigned long));
 
-		reloc = malloc(sizeof(*reloc));
-		if (!reloc) {
-			perror("malloc");
+		if (elf_add_reloc_to_insn(file->elf, sec,
+					  idx * sizeof(unsigned long),
+					  R_X86_64_64,
+					  insn->sec, insn->offset))
 			return -1;
-		}
-		memset(reloc, 0, sizeof(*reloc));
-
-		if (insn->sec->sym) {
-			reloc->sym = insn->sec->sym;
-			reloc->addend = insn->offset;
-		} else {
-			reloc->sym = find_symbol_containing(insn->sec, insn->offset);
-
-			if (!reloc->sym) {
-				WARN("missing symbol for insn at offset 0x%lx\n",
-				     insn->offset);
-				return -1;
-			}
-
-			reloc->addend = insn->offset - reloc->sym->offset;
-		}
-
-		reloc->type = R_X86_64_64;
-		reloc->offset = idx * sizeof(unsigned long);
-		reloc->sec = reloc_sec;
-		elf_add_reloc(file->elf, reloc);
 
 		idx++;
 	}
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 374813e..0ab52ac 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -211,32 +211,6 @@ struct reloc *find_reloc_by_dest(const struct elf *elf, struct section *sec, uns
 	return find_reloc_by_dest_range(elf, sec, offset, 1);
 }
 
-void insn_to_reloc_sym_addend(struct section *sec, unsigned long offset,
-			      struct reloc *reloc)
-{
-	if (sec->sym) {
-		reloc->sym = sec->sym;
-		reloc->addend = offset;
-		return;
-	}
-
-	/*
-	 * The Clang assembler strips section symbols, so we have to reference
-	 * the function symbol instead:
-	 */
-	reloc->sym = find_symbol_containing(sec, offset);
-	if (!reloc->sym) {
-		/*
-		 * Hack alert.  This happens when we need to reference the NOP
-		 * pad insn immediately after the function.
-		 */
-		reloc->sym = find_symbol_containing(sec, offset - 1);
-	}
-
-	if (reloc->sym)
-		reloc->addend = offset - reloc->sym->offset;
-}
-
 static int read_sections(struct elf *elf)
 {
 	Elf_Scn *s = NULL;
@@ -473,14 +447,66 @@ err:
 	return -1;
 }
 
-void elf_add_reloc(struct elf *elf, struct reloc *reloc)
+int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
+		  unsigned int type, struct symbol *sym, int addend)
 {
-	struct section *sec = reloc->sec;
+	struct reloc *reloc;
+
+	reloc = malloc(sizeof(*reloc));
+	if (!reloc) {
+		perror("malloc");
+		return -1;
+	}
+	memset(reloc, 0, sizeof(*reloc));
 
-	list_add_tail(&reloc->list, &sec->reloc_list);
+	reloc->sec = sec->reloc;
+	reloc->offset = offset;
+	reloc->type = type;
+	reloc->sym = sym;
+	reloc->addend = addend;
+
+	list_add_tail(&reloc->list, &sec->reloc->reloc_list);
 	elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc));
 
-	sec->changed = true;
+	sec->reloc->changed = true;
+
+	return 0;
+}
+
+int elf_add_reloc_to_insn(struct elf *elf, struct section *sec,
+			  unsigned long offset, unsigned int type,
+			  struct section *insn_sec, unsigned long insn_off)
+{
+	struct symbol *sym;
+	int addend;
+
+	if (insn_sec->sym) {
+		sym = insn_sec->sym;
+		addend = insn_off;
+
+	} else {
+		/*
+		 * The Clang assembler strips section symbols, so we have to
+		 * reference the function symbol instead:
+		 */
+		sym = find_symbol_containing(insn_sec, insn_off);
+		if (!sym) {
+			/*
+			 * Hack alert.  This happens when we need to reference
+			 * the NOP pad insn immediately after the function.
+			 */
+			sym = find_symbol_containing(insn_sec, insn_off - 1);
+		}
+
+		if (!sym) {
+			WARN("can't find symbol containing %s+0x%lx", insn_sec->name, insn_off);
+			return -1;
+		}
+
+		addend = insn_off - sym->offset;
+	}
+
+	return elf_add_reloc(elf, sec, offset, type, sym, addend);
 }
 
 static int read_rel_reloc(struct section *sec, int i, struct reloc *reloc, unsigned int *symndx)
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index fc576ed..825ad32 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -123,7 +123,13 @@ static inline u32 reloc_hash(struct reloc *reloc)
 struct elf *elf_open_read(const char *name, int flags);
 struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr);
 struct section *elf_create_reloc_section(struct elf *elf, struct section *base, int reltype);
-void elf_add_reloc(struct elf *elf, struct reloc *reloc);
+
+int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
+		  unsigned int type, struct symbol *sym, int addend);
+int elf_add_reloc_to_insn(struct elf *elf, struct section *sec,
+			  unsigned long offset, unsigned int type,
+			  struct section *insn_sec, unsigned long insn_off);
+
 int elf_write_insn(struct elf *elf, struct section *sec,
 		   unsigned long offset, unsigned int len,
 		   const char *insn);
@@ -140,8 +146,6 @@ struct reloc *find_reloc_by_dest(const struct elf *elf, struct section *sec, uns
 struct reloc *find_reloc_by_dest_range(const struct elf *elf, struct section *sec,
 				     unsigned long offset, unsigned int len);
 struct symbol *find_func_containing(struct section *sec, unsigned long offset);
-void insn_to_reloc_sym_addend(struct section *sec, unsigned long offset,
-			      struct reloc *reloc);
 
 #define for_each_sec(file, sec)						\
 	list_for_each_entry(sec, &file->elf->sections, list)
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index f534708..1b57be6 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -82,12 +82,11 @@ static int init_orc_entry(struct orc_entry *orc, struct cfi_state *cfi)
 }
 
 static int write_orc_entry(struct elf *elf, struct section *orc_sec,
-			   struct section *ip_rsec, unsigned int idx,
+			   struct section *ip_sec, unsigned int idx,
 			   struct section *insn_sec, unsigned long insn_off,
 			   struct orc_entry *o)
 {
 	struct orc_entry *orc;
-	struct reloc *reloc;
 
 	/* populate ORC data */
 	orc = (struct orc_entry *)orc_sec->data->d_buf + idx;
@@ -96,25 +95,9 @@ static int write_orc_entry(struct elf *elf, struct section *orc_sec,
 	orc->bp_offset = bswap_if_needed(orc->bp_offset);
 
 	/* populate reloc for ip */
-	reloc = malloc(sizeof(*reloc));
-	if (!reloc) {
-		perror("malloc");
+	if (elf_add_reloc_to_insn(elf, ip_sec, idx * sizeof(int), R_X86_64_PC32,
+				  insn_sec, insn_off))
 		return -1;
-	}
-	memset(reloc, 0, sizeof(*reloc));
-
-	insn_to_reloc_sym_addend(insn_sec, insn_off, reloc);
-	if (!reloc->sym) {
-		WARN("missing symbol for insn at offset 0x%lx",
-		     insn_off);
-		return -1;
-	}
-
-	reloc->type = R_X86_64_PC32;
-	reloc->offset = idx * sizeof(int);
-	reloc->sec = ip_rsec;
-
-	elf_add_reloc(elf, reloc);
 
 	return 0;
 }
@@ -153,7 +136,7 @@ static unsigned long alt_group_len(struct alt_group *alt_group)
 
 int orc_create(struct objtool_file *file)
 {
-	struct section *sec, *ip_rsec, *orc_sec;
+	struct section *sec, *orc_sec;
 	unsigned int nr = 0, idx = 0;
 	struct orc_list_entry *entry;
 	struct list_head orc_list;
@@ -242,13 +225,12 @@ int orc_create(struct objtool_file *file)
 	sec = elf_create_section(file->elf, ".orc_unwind_ip", 0, sizeof(int), nr);
 	if (!sec)
 		return -1;
-	ip_rsec = elf_create_reloc_section(file->elf, sec, SHT_RELA);
-	if (!ip_rsec)
+	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
 		return -1;
 
 	/* Write ORC entries to sections: */
 	list_for_each_entry(entry, &orc_list, list) {
-		if (write_orc_entry(file->elf, orc_sec, ip_rsec, idx++,
+		if (write_orc_entry(file->elf, orc_sec, sec, idx++,
 				    entry->insn_sec, entry->insn_off,
 				    &entry->orc))
 			return -1;
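
The symbol/addend selection that elf_add_reloc_to_insn() performs when there
is no section symbol can be sketched in isolation. This is only a toy model:
struct toy_sym, toy_find_containing() and toy_resolve() are hypothetical
names, symbols are a flat array rather than objtool's lookup structures, and
the "off > 0" guard is an addition of this sketch, not something the patch
does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a function symbol covering [offset, offset + len). */
struct toy_sym {
	const char	*name;
	unsigned long	 offset;
	unsigned long	 len;
};

/* Linear search for the symbol whose range contains off. */
static const struct toy_sym *toy_find_containing(const struct toy_sym *syms,
						 size_t nr, unsigned long off)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (off >= syms[i].offset &&
		    off < syms[i].offset + syms[i].len)
			return &syms[i];
	}

	return NULL;
}

/*
 * Resolve a section offset to (symbol, addend).  If nothing contains
 * off, retry with off - 1 so that an offset immediately after a
 * function still resolves to that function.
 */
static int toy_resolve(const struct toy_sym *syms, size_t nr,
		       unsigned long off,
		       const struct toy_sym **sym, long *addend)
{
	const struct toy_sym *s;

	assert(syms && sym && addend);

	s = toy_find_containing(syms, nr, off);
	if (!s && off > 0)
		s = toy_find_containing(syms, nr, off - 1);
	if (!s)
		return -1;

	*sym = s;
	*addend = (long)(off - s->offset);

	return 0;
}
```

Note that in the fallback case the addend is still computed from the
original offset, so it equals the symbol's length rather than pointing
inside it.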


* [tip: x86/core] objtool: Create reloc sections implicitly
  2021-03-26 15:12 ` [PATCH v3 09/16] objtool: Implicitly create reloc sections Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
@ 2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-03 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Josh Poimboeuf, Peter Zijlstra (Intel),
	Borislav Petkov, Ingo Molnar, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     d0c5c4cc73da0b05b0d9e5f833f2d859e1b45f8e
Gitweb:        https://git.kernel.org/tip/d0c5c4cc73da0b05b0d9e5f833f2d859e1b45f8e
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:08 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 02 Apr 2021 12:44:37 +02:00

objtool: Create reloc sections implicitly

Have elf_add_reloc() create the relocation section implicitly.

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.880174448@infradead.org
---
 tools/objtool/check.c               |  6 ------
 tools/objtool/elf.c                 |  9 ++++++++-
 tools/objtool/include/objtool/elf.h |  1 -
 tools/objtool/orc_gen.c             |  2 --
 4 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 61fe29a..600fa67 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -459,9 +459,6 @@ static int create_static_call_sections(struct objtool_file *file)
 	if (!sec)
 		return -1;
 
-	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
-		return -1;
-
 	idx = 0;
 	list_for_each_entry(insn, &file->static_call_list, static_call_node) {
 
@@ -547,9 +544,6 @@ static int create_mcount_loc_sections(struct objtool_file *file)
 	if (!sec)
 		return -1;
 
-	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
-		return -1;
-
 	idx = 0;
 	list_for_each_entry(insn, &file->mcount_loc_list, mcount_loc_node) {
 
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 0ab52ac..7b65ae3 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -447,11 +447,18 @@ err:
 	return -1;
 }
 
+static struct section *elf_create_reloc_section(struct elf *elf,
+						struct section *base,
+						int reltype);
+
 int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
 		  unsigned int type, struct symbol *sym, int addend)
 {
 	struct reloc *reloc;
 
+	if (!sec->reloc && !elf_create_reloc_section(elf, sec, SHT_RELA))
+		return -1;
+
 	reloc = malloc(sizeof(*reloc));
 	if (!reloc) {
 		perror("malloc");
@@ -829,7 +836,7 @@ static struct section *elf_create_rela_reloc_section(struct elf *elf, struct sec
 	return sec;
 }
 
-struct section *elf_create_reloc_section(struct elf *elf,
+static struct section *elf_create_reloc_section(struct elf *elf,
 					 struct section *base,
 					 int reltype)
 {
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index 825ad32..463f329 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -122,7 +122,6 @@ static inline u32 reloc_hash(struct reloc *reloc)
 
 struct elf *elf_open_read(const char *name, int flags);
 struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr);
-struct section *elf_create_reloc_section(struct elf *elf, struct section *base, int reltype);
 
 int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
 		  unsigned int type, struct symbol *sym, int addend);
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index 1b57be6..dc9b7dd 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -225,8 +225,6 @@ int orc_create(struct objtool_file *file)
 	sec = elf_create_section(file->elf, ".orc_unwind_ip", 0, sizeof(int), nr);
 	if (!sec)
 		return -1;
-	if (!elf_create_reloc_section(file->elf, sec, SHT_RELA))
-		return -1;
 
 	/* Write ORC entries to sections: */
 	list_for_each_entry(entry, &orc_list, list) {
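
The pattern here is plain create-on-first-use. A minimal sketch of the idea,
with hypothetical names (struct toy_section, toy_add_reloc()) and a counter
standing in for the real relocation list:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy section; ->reloc is the companion relocation section. */
struct toy_section {
	struct toy_section	*reloc;		/* NULL until first reloc is added */
	int			 nr_relocs;
	int			 changed;
};

/*
 * Record one relocation against sec, creating the companion section on
 * first use.  Returns 0 on success, -1 if the allocation fails.
 */
static int toy_add_reloc(struct toy_section *sec)
{
	assert(sec);

	if (!sec->reloc) {
		sec->reloc = calloc(1, sizeof(*sec->reloc));
		if (!sec->reloc)
			return -1;
	}

	sec->reloc->nr_relocs++;
	sec->reloc->changed = 1;

	return 0;
}
```

Callers no longer need to know whether a relocation section already exists,
which is what lets the patch drop the explicit elf_create_reloc_section()
calls from check.c and orc_gen.c.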


* [tip: x86/core] objtool: Fix static_call list generation
  2021-03-26 15:12 ` [PATCH v3 06/16] objtool: Fix static_call list generation Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
@ 2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-03 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Ingo Molnar, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     a958c4fea768d2c378c89032ab41d38da2a24422
Gitweb:        https://git.kernel.org/tip/a958c4fea768d2c378c89032ab41d38da2a24422
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:05 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 02 Apr 2021 12:43:19 +02:00

objtool: Fix static_call list generation

Currently, objtool generates tail call entries in add_jump_destination()
but waits until validate_branch() to generate the regular call entries.
Move these to add_call_destination() for consistency.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.691529901@infradead.org
---
 tools/objtool/check.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 6fbc001..8618d03 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1045,6 +1045,11 @@ static int add_call_destinations(struct objtool_file *file)
 		} else
 			insn->call_dest = reloc->sym;
 
+		if (insn->call_dest && insn->call_dest->static_call_tramp) {
+			list_add_tail(&insn->static_call_node,
+				      &file->static_call_list);
+		}
+
 		/*
 		 * Many compilers cannot disable KCOV with a function attribute
 		 * so they need a little help, NOP out any KCOV calls from noinstr
@@ -1788,6 +1793,9 @@ static int decode_sections(struct objtool_file *file)
 	if (ret)
 		return ret;
 
+	/*
+	 * Must be before add_{jump_call}_destination.
+	 */
 	ret = read_static_call_tramps(file);
 	if (ret)
 		return ret;
@@ -1800,6 +1808,10 @@ static int decode_sections(struct objtool_file *file)
 	if (ret)
 		return ret;
 
+	/*
+	 * Must be before add_call_destination(); it changes INSN_CALL to
+	 * INSN_JUMP.
+	 */
 	ret = read_intra_function_calls(file);
 	if (ret)
 		return ret;
@@ -2762,11 +2774,6 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
 			if (dead_end_function(file, insn->call_dest))
 				return 0;
 
-			if (insn->type == INSN_CALL && insn->call_dest->static_call_tramp) {
-				list_add_tail(&insn->static_call_node,
-					      &file->static_call_list);
-			}
-
 			break;
 
 		case INSN_JUMP_CONDITIONAL:


* [tip: x86/core] objtool: Rework the elf_rebuild_reloc_section() logic
  2021-03-26 15:12 ` [PATCH v3 07/16] objtool: Rework rebuild_reloc logic Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
@ 2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-03 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Ingo Molnar, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     3a647607b57ad8346e659ddd3b951ac292c83690
Gitweb:        https://git.kernel.org/tip/3a647607b57ad8346e659ddd3b951ac292c83690
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:06 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 02 Apr 2021 12:43:32 +02:00

objtool: Rework the elf_rebuild_reloc_section() logic

Instead of manually calling elf_rebuild_reloc_section() on sections
we've called elf_add_reloc() on, have elf_write() DTRT.

This makes it easier to add random relocations in places without
carefully tracking when we're done and need to flush what section.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.754213408@infradead.org
---
 tools/objtool/check.c               |  6 ------
 tools/objtool/elf.c                 | 20 ++++++++++++++------
 tools/objtool/include/objtool/elf.h |  1 -
 tools/objtool/orc_gen.c             |  3 ---
 4 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 8618d03..1d0415b 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -542,9 +542,6 @@ static int create_static_call_sections(struct objtool_file *file)
 		idx++;
 	}
 
-	if (elf_rebuild_reloc_section(file->elf, reloc_sec))
-		return -1;
-
 	return 0;
 }
 
@@ -614,9 +611,6 @@ static int create_mcount_loc_sections(struct objtool_file *file)
 		idx++;
 	}
 
-	if (elf_rebuild_reloc_section(file->elf, reloc_sec))
-		return -1;
-
 	return 0;
 }
 
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 93fa833..374813e 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -479,6 +479,8 @@ void elf_add_reloc(struct elf *elf, struct reloc *reloc)
 
 	list_add_tail(&reloc->list, &sec->reloc_list);
 	elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc));
+
+	sec->changed = true;
 }
 
 static int read_rel_reloc(struct section *sec, int i, struct reloc *reloc, unsigned int *symndx)
@@ -558,7 +560,9 @@ static int read_relocs(struct elf *elf)
 				return -1;
 			}
 
-			elf_add_reloc(elf, reloc);
+			list_add_tail(&reloc->list, &sec->reloc_list);
+			elf_hash_add(elf->reloc_hash, &reloc->hash, reloc_hash(reloc));
+
 			nr_reloc++;
 		}
 		max_reloc = max(max_reloc, nr_reloc);
@@ -873,14 +877,11 @@ static int elf_rebuild_rela_reloc_section(struct section *sec, int nr)
 	return 0;
 }
 
-int elf_rebuild_reloc_section(struct elf *elf, struct section *sec)
+static int elf_rebuild_reloc_section(struct elf *elf, struct section *sec)
 {
 	struct reloc *reloc;
 	int nr;
 
-	sec->changed = true;
-	elf->changed = true;
-
 	nr = 0;
 	list_for_each_entry(reloc, &sec->reloc_list, list)
 		nr++;
@@ -944,9 +945,15 @@ int elf_write(struct elf *elf)
 	struct section *sec;
 	Elf_Scn *s;
 
-	/* Update section headers for changed sections: */
+	/* Update changed relocation sections and section headers: */
 	list_for_each_entry(sec, &elf->sections, list) {
 		if (sec->changed) {
+			if (sec->base &&
+			    elf_rebuild_reloc_section(elf, sec)) {
+				WARN("elf_rebuild_reloc_section");
+				return -1;
+			}
+
 			s = elf_getscn(elf->elf, sec->idx);
 			if (!s) {
 				WARN_ELF("elf_getscn");
@@ -958,6 +965,7 @@ int elf_write(struct elf *elf)
 			}
 
 			sec->changed = false;
+			elf->changed = true;
 		}
 	}
 
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index e6890cc..fc576ed 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -142,7 +142,6 @@ struct reloc *find_reloc_by_dest_range(const struct elf *elf, struct section *se
 struct symbol *find_func_containing(struct section *sec, unsigned long offset);
 void insn_to_reloc_sym_addend(struct section *sec, unsigned long offset,
 			      struct reloc *reloc);
-int elf_rebuild_reloc_section(struct elf *elf, struct section *sec);
 
 #define for_each_sec(file, sec)						\
 	list_for_each_entry(sec, &file->elf->sections, list)
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index 738aa50..f534708 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -254,8 +254,5 @@ int orc_create(struct objtool_file *file)
 			return -1;
 	}
 
-	if (elf_rebuild_reloc_section(file->elf, ip_rsec))
-		return -1;
-
 	return 0;
 }
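
In other words, adding a relocation only marks the section dirty, and the
rebuild happens once, at write time. A toy model of that flow follows; the
names (struct toy_sec, toy_mark_reloc(), toy_write()) are hypothetical, and
is_reloc stands in for the sec->base test used in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy section with a dirty flag. */
struct toy_sec {
	int is_reloc;	/* nonzero for a relocation section */
	int changed;	/* set when contents were modified */
	int rebuilt;	/* how many times the section was rebuilt */
};

/* Adding a relocation only marks the section as changed. */
static void toy_mark_reloc(struct toy_sec *rsec)
{
	assert(rsec);
	rsec->changed = 1;
}

/*
 * Flush all changed sections: rebuild relocation sections, clear the
 * dirty flag, and report whether anything in the file changed.
 */
static int toy_write(struct toy_sec *secs, size_t nr)
{
	int file_changed = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!secs[i].changed)
			continue;

		if (secs[i].is_reloc)
			secs[i].rebuilt++;

		secs[i].changed = 0;
		file_changed = 1;
	}

	return file_changed;
}
```

However many relocations are added to a section, it is rebuilt once per
write, and untouched sections are not rebuilt at all.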


* [tip: x86/core] objtool: Handle per arch retpoline naming
  2021-03-26 15:12 ` [PATCH v3 05/16] objtool: Per arch retpoline naming Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] objtool: Handle per " tip-bot2 for Peter Zijlstra
@ 2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-03 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Ingo Molnar, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     530b4ddd9dd92b263081f5c7786d39a8129c8b2d
Gitweb:        https://git.kernel.org/tip/530b4ddd9dd92b263081f5c7786d39a8129c8b2d
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:04 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 02 Apr 2021 12:43:02 +02:00

objtool: Handle per arch retpoline naming

The __x86_indirect_ naming is obviously not generic. Shorten to allow
matching some additional magic names later.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.630296706@infradead.org
---
 tools/objtool/arch/x86/decode.c      |  5 +++++
 tools/objtool/check.c                |  9 +++++++--
 tools/objtool/include/objtool/arch.h |  2 ++
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index ba9ebff..782894e 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -648,3 +648,8 @@ int arch_decode_hint_reg(struct instruction *insn, u8 sp_reg)
 
 	return 0;
 }
+
+bool arch_is_retpoline(struct symbol *sym)
+{
+	return !strncmp(sym->name, "__x86_indirect_", 15);
+}
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 519af4b..6fbc001 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -850,6 +850,11 @@ static int add_ignore_alternatives(struct objtool_file *file)
 	return 0;
 }
 
+__weak bool arch_is_retpoline(struct symbol *sym)
+{
+	return false;
+}
+
 /*
  * Find the destination instructions for all jumps.
  */
@@ -872,7 +877,7 @@ static int add_jump_destinations(struct objtool_file *file)
 		} else if (reloc->sym->type == STT_SECTION) {
 			dest_sec = reloc->sym->sec;
 			dest_off = arch_dest_reloc_offset(reloc->addend);
-		} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21)) {
+		} else if (arch_is_retpoline(reloc->sym)) {
 			/*
 			 * Retpoline jumps are really dynamic jumps in
 			 * disguise, so convert them accordingly.
@@ -1026,7 +1031,7 @@ static int add_call_destinations(struct objtool_file *file)
 				return -1;
 			}
 
-		} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21)) {
+		} else if (arch_is_retpoline(reloc->sym)) {
 			/*
 			 * Retpoline calls are really dynamic calls in
 			 * disguise, so convert them accordingly.
diff --git a/tools/objtool/include/objtool/arch.h b/tools/objtool/include/objtool/arch.h
index 6ff0685..bb30993 100644
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -86,4 +86,6 @@ const char *arch_nop_insn(int len);
 
 int arch_decode_hint_reg(struct instruction *insn, u8 sp_reg);
 
+bool arch_is_retpoline(struct symbol *sym);
+
 #endif /* _ARCH_H */
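
The effect of shortening the prefix from "__x86_indirect_thunk_" (21 bytes)
to "__x86_indirect_" (15 bytes) can be checked with a small standalone
sketch. toy_has_prefix() and toy_is_retpoline() are hypothetical names, and
"__x86_indirect_other_rax" is a made-up symbol used only to show a name that
the shorter prefix accepts and the longer one would not.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if name starts with prefix. */
static bool toy_has_prefix(const char *name, const char *prefix)
{
	assert(name && prefix);
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

/* Same test as the x86 arch_is_retpoline() in the patch, on a bare name. */
static bool toy_is_retpoline(const char *name)
{
	return toy_has_prefix(name, "__x86_indirect_");
}
```

Using strlen() on the prefix avoids hard-coding the length; the patch spells
out 15 instead. The __weak default in check.c returns false, so an
architecture that does not provide its own arch_is_retpoline() matches
nothing.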


* [tip: x86/core] objtool: Correctly handle retpoline thunk calls
  2021-03-26 15:12 ` [PATCH v3 04/16] objtool: Correctly handle retpoline thunk calls Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
@ 2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-03 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel),
	Borislav Petkov, Ingo Molnar, Miroslav Benes, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     bcb1b6ff39da7e8a6a986eb08126fba2b5e13c32
Gitweb:        https://git.kernel.org/tip/bcb1b6ff39da7e8a6a986eb08126fba2b5e13c32
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:03 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 02 Apr 2021 12:42:54 +02:00

objtool: Correctly handle retpoline thunk calls

Just like JMP handling, convert a direct CALL to a retpoline thunk
into a retpoline safe indirect CALL.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/20210326151259.567568238@infradead.org
---
 tools/objtool/check.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index d45f018..519af4b 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1025,6 +1025,18 @@ static int add_call_destinations(struct objtool_file *file)
 					  dest_off);
 				return -1;
 			}
+
+		} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21)) {
+			/*
+			 * Retpoline calls are really dynamic calls in
+			 * disguise, so convert them accordingly.
+			 */
+			insn->type = INSN_CALL_DYNAMIC;
+			insn->retpoline_safe = true;
+
+			remove_insn_ops(insn);
+			continue;
+
 		} else
 			insn->call_dest = reloc->sym;
 

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [tip: x86/core] x86/retpoline: Simplify retpolines
  2021-03-26 15:12 ` [PATCH v3 03/16] x86/retpoline: Simplify retpolines Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
@ 2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
  2021-04-06  8:56     ` David Laight
  1 sibling, 1 reply; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-03 11:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov, Ingo Molnar, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     119251855f9adf9421cb5eb409933092141ab2c7
Gitweb:        https://git.kernel.org/tip/119251855f9adf9421cb5eb409933092141ab2c7
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:02 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 02 Apr 2021 12:42:04 +02:00

x86/retpoline: Simplify retpolines

Due to:

  c9c324dc22aa ("objtool: Support stack layout changes in alternatives")

it is now possible to simplify the retpolines.

Currently our retpolines consist of 2 symbols:

 - __x86_indirect_thunk_\reg: the compiler target
 - __x86_retpoline_\reg:  the actual retpoline.

Both are consecutive in code and aligned such that for any one register
they both live in the same cacheline:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop

  0000000000000005 <__x86_retpoline_rax>:
   5:   e8 07 00 00 00          callq  11 <__x86_retpoline_rax+0xc>
   a:   f3 90                   pause
   c:   0f ae e8                lfence
   f:   eb f9                   jmp    a <__x86_retpoline_rax+0x5>
  11:   48 89 04 24             mov    %rax,(%rsp)
  15:   c3                      retq
  16:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)

The thunk is an alternative_2, where one option is a JMP to the
retpoline. This was done so that objtool didn't need to deal with
alternatives with stack ops. But that problem has been solved, so now
it is possible to fold the entire retpoline into the alternative to
simplify and consolidate unused bytes:

  0000000000000000 <__x86_indirect_thunk_rax>:
   0:   ff e0                   jmpq   *%rax
   2:   90                      nop
   3:   90                      nop
   4:   90                      nop
   5:   90                      nop
   6:   90                      nop
   7:   90                      nop
   8:   90                      nop
   9:   90                      nop
   a:   90                      nop
   b:   90                      nop
   c:   90                      nop
   d:   90                      nop
   e:   90                      nop
   f:   90                      nop
  10:   90                      nop
  11:   66 66 2e 0f 1f 84 00 00 00 00 00        data16 nopw %cs:0x0(%rax,%rax,1)
  1c:   0f 1f 40 00             nopl   0x0(%rax)

Notice that since the longest alternative sequence is now:

   0:   e8 07 00 00 00          callq  c <.altinstr_replacement+0xc>
   5:   f3 90                   pause
   7:   0f ae e8                lfence
   a:   eb f9                   jmp    5 <.altinstr_replacement+0x5>
   c:   48 89 04 24             mov    %rax,(%rsp)
  10:   c3                      retq

17 bytes, we have 15 bytes NOP at the end of our 32 byte slot. (IOW, if
we can shrink the retpoline by 1 byte we can pack it more densely).
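
The byte counts above can be checked with a small sketch. The instruction
lengths are read off the disassembly; the helper names are made up for
illustration, and the 16-byte case is only one reading of "pack it more
densely".

```c
#include <assert.h>
#include <stddef.h>

/* Instruction lengths in bytes, read off the disassembly above. */
static const unsigned char retpoline_insn_len[] = {
	5,	/* callq  c                */
	2,	/* pause                   */
	3,	/* lfence                  */
	2,	/* jmp    5                */
	4,	/* mov    %rax,(%rsp)      */
	1,	/* retq                    */
};

/* Total length of the retpoline sequence. */
static size_t retpoline_len(void)
{
	size_t i, sum = 0;

	for (i = 0; i < sizeof(retpoline_insn_len); i++)
		sum += retpoline_insn_len[i];
	return sum;
}

/* NOP bytes needed to pad 'len' bytes up to a multiple of 'slot_size'. */
static size_t slot_padding(size_t len, size_t slot_size)
{
	assert(slot_size != 0);
	return (slot_size - len % slot_size) % slot_size;
}
```

With these numbers, retpoline_len() is 17 and slot_padding(17, 32) is 15; a
16-byte sequence would need no padding in a 16-byte slot.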

 [ bp: Massage commit message. ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210326151259.506071949@infradead.org
---
 arch/x86/include/asm/asm-prototypes.h |  7 +-----
 arch/x86/include/asm/nospec-branch.h  |  6 ++---
 arch/x86/lib/retpoline.S              | 34 +++++++++++++-------------
 tools/objtool/check.c                 |  3 +--
 4 files changed, 21 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h
index 51e2bf2..0545b07 100644
--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -22,15 +22,8 @@ extern void cmpxchg8b_emu(void);
 #define DECL_INDIRECT_THUNK(reg) \
 	extern asmlinkage void __x86_indirect_thunk_ ## reg (void);
 
-#define DECL_RETPOLINE(reg) \
-	extern asmlinkage void __x86_retpoline_ ## reg (void);
-
 #undef GEN
 #define GEN(reg) DECL_INDIRECT_THUNK(reg)
 #include <asm/GEN-for-each-reg.h>
 
-#undef GEN
-#define GEN(reg) DECL_RETPOLINE(reg)
-#include <asm/GEN-for-each-reg.h>
-
 #endif /* CONFIG_RETPOLINE */
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 529f8e9..664be73 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -80,7 +80,7 @@
 .macro JMP_NOSPEC reg:req
 #ifdef CONFIG_RETPOLINE
 	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \
-		      __stringify(jmp __x86_retpoline_\reg), X86_FEATURE_RETPOLINE, \
+		      __stringify(jmp __x86_indirect_thunk_\reg), X86_FEATURE_RETPOLINE, \
 		      __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), X86_FEATURE_RETPOLINE_AMD
 #else
 	jmp	*%\reg
@@ -90,7 +90,7 @@
 .macro CALL_NOSPEC reg:req
 #ifdef CONFIG_RETPOLINE
 	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg), \
-		      __stringify(call __x86_retpoline_\reg), X86_FEATURE_RETPOLINE, \
+		      __stringify(call __x86_indirect_thunk_\reg), X86_FEATURE_RETPOLINE, \
 		      __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *%\reg), X86_FEATURE_RETPOLINE_AMD
 #else
 	call	*%\reg
@@ -128,7 +128,7 @@
 	ALTERNATIVE_2(						\
 	ANNOTATE_RETPOLINE_SAFE					\
 	"call *%[thunk_target]\n",				\
-	"call __x86_retpoline_%V[thunk_target]\n",		\
+	"call __x86_indirect_thunk_%V[thunk_target]\n",		\
 	X86_FEATURE_RETPOLINE,					\
 	"lfence;\n"						\
 	ANNOTATE_RETPOLINE_SAFE					\
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 6bb74b5..d2c0d14 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -10,27 +10,31 @@
 #include <asm/unwind_hints.h>
 #include <asm/frame.h>
 
-.macro THUNK reg
-	.section .text.__x86.indirect_thunk
-
-	.align 32
-SYM_FUNC_START(__x86_indirect_thunk_\reg)
-	JMP_NOSPEC \reg
-SYM_FUNC_END(__x86_indirect_thunk_\reg)
-
-SYM_FUNC_START_NOALIGN(__x86_retpoline_\reg)
+.macro RETPOLINE reg
 	ANNOTATE_INTRA_FUNCTION_CALL
-	call	.Ldo_rop_\@
+	call    .Ldo_rop_\@
 .Lspec_trap_\@:
 	UNWIND_HINT_EMPTY
 	pause
 	lfence
-	jmp	.Lspec_trap_\@
+	jmp .Lspec_trap_\@
 .Ldo_rop_\@:
-	mov	%\reg, (%_ASM_SP)
+	mov     %\reg, (%_ASM_SP)
 	UNWIND_HINT_FUNC
 	ret
-SYM_FUNC_END(__x86_retpoline_\reg)
+.endm
+
+.macro THUNK reg
+	.section .text.__x86.indirect_thunk
+
+	.align 32
+SYM_FUNC_START(__x86_indirect_thunk_\reg)
+
+	ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \
+		      __stringify(RETPOLINE \reg), X86_FEATURE_RETPOLINE, \
+		      __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), X86_FEATURE_RETPOLINE_AMD
+
+SYM_FUNC_END(__x86_indirect_thunk_\reg)
 
 .endm
 
@@ -48,7 +52,6 @@ SYM_FUNC_END(__x86_retpoline_\reg)
 
 #define __EXPORT_THUNK(sym)	_ASM_NOKPROBE(sym); EXPORT_SYMBOL(sym)
 #define EXPORT_THUNK(reg)	__EXPORT_THUNK(__x86_indirect_thunk_ ## reg)
-#define EXPORT_RETPOLINE(reg)  __EXPORT_THUNK(__x86_retpoline_ ## reg)
 
 #undef GEN
 #define GEN(reg) THUNK reg
@@ -58,6 +61,3 @@ SYM_FUNC_END(__x86_retpoline_\reg)
 #define GEN(reg) EXPORT_THUNK(reg)
 #include <asm/GEN-for-each-reg.h>
 
-#undef GEN
-#define GEN(reg) EXPORT_RETPOLINE(reg)
-#include <asm/GEN-for-each-reg.h>
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 5e5388a..d45f018 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -872,8 +872,7 @@ static int add_jump_destinations(struct objtool_file *file)
 		} else if (reloc->sym->type == STT_SECTION) {
 			dest_sec = reloc->sym->sec;
 			dest_off = arch_dest_reloc_offset(reloc->addend);
-		} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21) ||
-			   !strncmp(reloc->sym->name, "__x86_retpoline_", 16)) {
+		} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21)) {
 			/*
 			 * Retpoline jumps are really dynamic jumps in
 			 * disguise, so convert them accordingly.

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [tip: x86/core] x86/alternatives: Optimize optimize_nops()
  2021-03-26 15:12 ` [PATCH v3 02/16] x86/alternatives: Optimize optimize_nops() Peter Zijlstra
  2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
@ 2021-04-03 11:11   ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 82+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-04-03 11:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov, Ingo Molnar, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     23c1ad538f4f371bdb67d8a112314842d5db7e5a
Gitweb:        https://git.kernel.org/tip/23c1ad538f4f371bdb67d8a112314842d5db7e5a
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 26 Mar 2021 16:12:01 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 02 Apr 2021 12:41:17 +02:00

x86/alternatives: Optimize optimize_nops()

Currently, optimize_nops() scans to see if the alternative starts with
NOPs. However, the emit pattern is:

  141:	\oldinstr
  142:	.skip (len-(142b-141b)), 0x90

That is, when 'oldinstr' is short, the tail is padded with NOPs. This case
never gets optimized.

Rewrite optimize_nops() to replace any trailing string of NOPs inside
the alternative with larger NOPs. Also run it irrespective of patching,
replacing NOPs in both the original and replaced code.
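
A user-space sketch of the idea follows. It is a simplification, not the
kernel code: the real function finds the start of the padding with the
instruction decoder, whereas here the caller passes that offset in
(insn_end). The function names and the NOP table are assumptions of this
sketch; the table holds the multi-byte NOP encodings recommended in the
Intel SDM, and the kernel's own table may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NOP 8

/* Recommended multi-byte NOP encodings, indexed by length (index 0 unused). */
static const unsigned char long_nops[MAX_NOP + 1][MAX_NOP] = {
	[1] = { 0x90 },
	[2] = { 0x66, 0x90 },
	[3] = { 0x0f, 0x1f, 0x00 },
	[4] = { 0x0f, 0x1f, 0x40, 0x00 },
	[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 },
	[6] = { 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00 },
	[7] = { 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00 },
	[8] = { 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
};

/* Fill 'len' bytes with as few NOP instructions as the table allows. */
static void fill_nops(unsigned char *buf, size_t len)
{
	while (len) {
		size_t n = len > MAX_NOP ? MAX_NOP : len;

		memcpy(buf, long_nops[n], n);
		buf += n;
		len -= n;
	}
}

/*
 * Replace the single-byte NOP padding in [insn_end, instrlen) with longer
 * NOPs. Returns the number of bytes rewritten, or 0 if the tail contains
 * anything other than 0x90 (in which case the buffer is left untouched).
 */
static size_t optimize_trailing_nops(unsigned char *instr, size_t instrlen,
				     size_t insn_end)
{
	size_t i;

	assert(insn_end <= instrlen);

	for (i = insn_end; i < instrlen; i++) {
		if (instr[i] != 0x90)
			return 0;
	}

	fill_nops(instr + insn_end, instrlen - insn_end);
	return instrlen - insn_end;
}
```

For a 5-byte slot holding "ff e0 90 90 90", passing insn_end = 2 rewrites
the three padding bytes to the single 3-byte NOP "0f 1f 00".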

A direct consequence is that 'padlen' becomes superfluous, so remove it.
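
The effect on the entry size (objtool's ALT_ENTRY_SIZE going from 13 to 12)
can be sketched with two packed structs. The layout follows the
.long/.long/.word/.byte/.byte sequence emitted by altinstruction_entry; the
struct names are invented for this sketch, the first two field names are
assumed, and __attribute__((packed)) assumes GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Layout before the patch: two offsets, feature bit, two lengths, padlen. */
struct alt_instr_old {
	int32_t  instr_offset;		/* original instruction (assumed name) */
	int32_t  repl_offset;		/* replacement instruction (assumed name) */
	uint16_t cpuid;			/* cpuid bit set for replacement */
	uint8_t  instrlen;		/* length of original instruction */
	uint8_t  replacementlen;	/* length of new instruction */
	uint8_t  padlen;		/* length of build-time padding */
} __attribute__((packed));

/* Layout after the patch: identical, minus padlen. */
struct alt_instr_new {
	int32_t  instr_offset;
	int32_t  repl_offset;
	uint16_t cpuid;
	uint8_t  instrlen;
	uint8_t  replacementlen;
} __attribute__((packed));

/* 4 + 4 + 2 + 1 + 1 (+ 1) bytes, matching the objtool constants. */
static_assert(sizeof(struct alt_instr_old) == 13, "old entry is 13 bytes");
static_assert(sizeof(struct alt_instr_new) == 12, "new entry is 12 bytes");
```

Because objtool parses .altinstructions by fixed offsets, its constant has
to change together with the struct, which is why special.h is part of the
same patch.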

 [ bp:
   - Adjust commit message
   - remove a stale comment about needing to pad
   - add a comment in optimize_nops()
   - exit early if the NOP verif. loop catches a mismatch - function
     should not add NOPs in that case
   - fix the "optimized NOPs" offsets output ]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210326151259.442992235@infradead.org
---
 arch/x86/include/asm/alternative.h            | 17 +-----
 arch/x86/kernel/alternative.c                 | 49 +++++++++++-------
 tools/objtool/arch/x86/include/arch/special.h |  2 +-
 3 files changed, 37 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 17b3609..a3c2315 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -65,7 +65,6 @@ struct alt_instr {
 	u16 cpuid;		/* cpuid bit set for replacement */
 	u8  instrlen;		/* length of original instruction */
 	u8  replacementlen;	/* length of new instruction */
-	u8  padlen;		/* length of build-time padding */
 } __packed;
 
 /*
@@ -104,7 +103,6 @@ static inline int alternatives_text_reserved(void *start, void *end)
 
 #define alt_end_marker		"663"
 #define alt_slen		"662b-661b"
-#define alt_pad_len		alt_end_marker"b-662b"
 #define alt_total_slen		alt_end_marker"b-661b"
 #define alt_rlen(num)		e_replacement(num)"f-"b_replacement(num)"f"
 
@@ -151,8 +149,7 @@ static inline int alternatives_text_reserved(void *start, void *end)
 	" .long " b_replacement(num)"f - .\n"		/* new instruction */ \
 	" .word " __stringify(feature) "\n"		/* feature bit     */ \
 	" .byte " alt_total_slen "\n"			/* source len      */ \
-	" .byte " alt_rlen(num) "\n"			/* replacement len */ \
-	" .byte " alt_pad_len "\n"			/* pad len */
+	" .byte " alt_rlen(num) "\n"			/* replacement len */
 
 #define ALTINSTR_REPLACEMENT(newinstr, num)		/* replacement */	\
 	"# ALT: replacement " #num "\n"						\
@@ -224,9 +221,6 @@ static inline int alternatives_text_reserved(void *start, void *end)
  * Peculiarities:
  * No memory clobber here.
  * Argument numbers start with 1.
- * Best is to use constraints that are fixed size (like (%1) ... "r")
- * If you use variable sized constraints like "m" or "g" in the
- * replacement make sure to pad to the worst case length.
  * Leaving an unused argument 0 to keep API compatibility.
  */
 #define alternative_input(oldinstr, newinstr, feature, input...)	\
@@ -315,13 +309,12 @@ static inline int alternatives_text_reserved(void *start, void *end)
  * enough information for the alternatives patching code to patch an
  * instruction. See apply_alternatives().
  */
-.macro altinstruction_entry orig alt feature orig_len alt_len pad_len
+.macro altinstruction_entry orig alt feature orig_len alt_len
 	.long \orig - .
 	.long \alt - .
 	.word \feature
 	.byte \orig_len
 	.byte \alt_len
-	.byte \pad_len
 .endm
 
 /*
@@ -338,7 +331,7 @@ static inline int alternatives_text_reserved(void *start, void *end)
 142:
 
 	.pushsection .altinstructions,"a"
-	altinstruction_entry 140b,143f,\feature,142b-140b,144f-143f,142b-141b
+	altinstruction_entry 140b,143f,\feature,142b-140b,144f-143f
 	.popsection
 
 	.pushsection .altinstr_replacement,"ax"
@@ -375,8 +368,8 @@ static inline int alternatives_text_reserved(void *start, void *end)
 142:
 
 	.pushsection .altinstructions,"a"
-	altinstruction_entry 140b,143f,\feature1,142b-140b,144f-143f,142b-141b
-	altinstruction_entry 140b,144f,\feature2,142b-140b,145f-144f,142b-141b
+	altinstruction_entry 140b,143f,\feature1,142b-140b,144f-143f
+	altinstruction_entry 140b,144f,\feature2,142b-140b,145f-144f
 	.popsection
 
 	.pushsection .altinstr_replacement,"ax"
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 80adf5a..84ec0ba 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -189,19 +189,35 @@ done:
 static void __init_or_module noinline optimize_nops(struct alt_instr *a, u8 *instr)
 {
 	unsigned long flags;
-	int i;
+	struct insn insn;
+	int nop, i = 0;
+
+	/*
+	 * Jump over the non-NOP insns, the remaining bytes must be single-byte
+	 * NOPs, optimize them.
+	 */
+	for (;;) {
+		if (insn_decode_kernel(&insn, &instr[i]))
+			return;
+
+		if (insn.length == 1 && insn.opcode.bytes[0] == 0x90)
+			break;
+
+		if ((i += insn.length) >= a->instrlen)
+			return;
+	}
 
-	for (i = 0; i < a->padlen; i++) {
-		if (instr[i] != 0x90)
+	for (nop = i; i < a->instrlen; i++) {
+		if (WARN_ONCE(instr[i] != 0x90, "Not a NOP at 0x%px\n", &instr[i]))
 			return;
 	}
 
 	local_irq_save(flags);
-	add_nops(instr + (a->instrlen - a->padlen), a->padlen);
+	add_nops(instr + nop, i - nop);
 	local_irq_restore(flags);
 
 	DUMP_BYTES(instr, a->instrlen, "%px: [%d:%d) optimized NOPs: ",
-		   instr, a->instrlen - a->padlen, a->padlen);
+		   instr, nop, a->instrlen);
 }
 
 /*
@@ -247,19 +263,15 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
 		 * - feature not present but ALTINSTR_FLAG_INV is set to mean,
 		 *   patch if feature is *NOT* present.
 		 */
-		if (!boot_cpu_has(feature) == !(a->cpuid & ALTINSTR_FLAG_INV)) {
-			if (a->padlen > 1)
-				optimize_nops(a, instr);
-
-			continue;
-		}
+		if (!boot_cpu_has(feature) == !(a->cpuid & ALTINSTR_FLAG_INV))
+			goto next;
 
-		DPRINTK("feat: %s%d*32+%d, old: (%pS (%px) len: %d), repl: (%px, len: %d), pad: %d",
+		DPRINTK("feat: %s%d*32+%d, old: (%pS (%px) len: %d), repl: (%px, len: %d)",
 			(a->cpuid & ALTINSTR_FLAG_INV) ? "!" : "",
 			feature >> 5,
 			feature & 0x1f,
 			instr, instr, a->instrlen,
-			replacement, a->replacementlen, a->padlen);
+			replacement, a->replacementlen);
 
 		DUMP_BYTES(instr, a->instrlen, "%px: old_insn: ", instr);
 		DUMP_BYTES(replacement, a->replacementlen, "%px: rpl_insn: ", replacement);
@@ -283,14 +295,15 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
 		if (a->replacementlen && is_jmp(replacement[0]))
 			recompute_jump(a, instr, replacement, insn_buff);
 
-		if (a->instrlen > a->replacementlen) {
-			add_nops(insn_buff + a->replacementlen,
-				 a->instrlen - a->replacementlen);
-			insn_buff_sz += a->instrlen - a->replacementlen;
-		}
+		for (; insn_buff_sz < a->instrlen; insn_buff_sz++)
+			insn_buff[insn_buff_sz] = 0x90;
+
 		DUMP_BYTES(insn_buff, insn_buff_sz, "%px: final_insn: ", instr);
 
 		text_poke_early(instr, insn_buff, insn_buff_sz);
+
+next:
+		optimize_nops(a, instr);
 	}
 }
 
diff --git a/tools/objtool/arch/x86/include/arch/special.h b/tools/objtool/arch/x86/include/arch/special.h
index d818b2b..14271cc 100644
--- a/tools/objtool/arch/x86/include/arch/special.h
+++ b/tools/objtool/arch/x86/include/arch/special.h
@@ -10,7 +10,7 @@
 #define JUMP_ORIG_OFFSET	0
 #define JUMP_NEW_OFFSET		4
 
-#define ALT_ENTRY_SIZE		13
+#define ALT_ENTRY_SIZE		12
 #define ALT_ORIG_OFFSET		0
 #define ALT_NEW_OFFSET		4
 #define ALT_FEATURE_OFFSET	8

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* RE: [tip: x86/core] x86/retpoline: Simplify retpolines
  2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
@ 2021-04-06  8:56     ` David Laight
  0 siblings, 0 replies; 82+ messages in thread
From: David Laight @ 2021-04-06  8:56 UTC (permalink / raw)
  To: linux-kernel, linux-tip-commits
  Cc: Peter Zijlstra (Intel), Borislav Petkov, Ingo Molnar, x86

From: tip-bot2@linutronix.de
> Sent: 03 April 2021 12:11
...
> Notice that since the longest alternative sequence is now:
> 
>    0:   e8 07 00 00 00          callq  c <.altinstr_replacement+0xc>
>    5:   f3 90                   pause
>    7:   0f ae e8                lfence
>    a:   eb f9                   jmp    5 <.altinstr_replacement+0x5>
>    c:   48 89 04 24             mov    %rax,(%rsp)
>   10:   c3                      retq
> 
> 17 bytes, we have 15 bytes NOP at the end of our 32 byte slot. (IOW, if
> we can shrink the retpoline by 1 byte we can pack it more densely).

Every time I see this I can't help feeling that doing something
(aka anything) to get the 'mov' and 'retq' into the same 16 byte
code fetch/decode block would be advantageous.

Even something like:
	call	1f
	pause
	jmp 	2f
1:	mov	%rax,(%rsp)
	retq
2:	pause
	lfence
	jmp	2b
Might meet all the requirements for the retpoline while
allowing the 'mov' and 'retq' to be decoded in the same clock.
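
The placement argument can be checked with a small sketch. It assumes the
thunk starts on a 32-byte boundary (as the .align 32 in the patch suggests)
and that 'jmp 2f' assembles to the 2-byte short form; the helper name is
made up for illustration, and the 16-byte fetch block is the premise of the
suggestion rather than something verified here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * True if the bytes [start, start + len) all fall inside one aligned
 * 16-byte block, with offsets measured from a 16-byte aligned base.
 */
static bool same_fetch_block(size_t start, size_t len)
{
	assert(len > 0);
	return start / 16 == (start + len - 1) / 16;
}
```

In the merged sequence 'mov' sits at 0xc and 'retq' at 0x10, so the 5-byte
mov+retq pair straddles a boundary and same_fetch_block(0xc, 5) is false.
In the proposed ordering the pair would start at offset 9 (5-byte call,
2-byte pause, 2-byte jmp), and same_fetch_block(9, 5) is true.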

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-03-29 16:38   ` Josh Poimboeuf
@ 2021-06-02 15:51     ` Lukasz Majczak
  2021-06-02 16:56       ` Peter Zijlstra
                         ` (2 more replies)
  0 siblings, 3 replies; 82+ messages in thread
From: Lukasz Majczak @ 2021-06-02 15:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, x86, jgross, mbenes, linux-kernel, upstream,
	Radosław Biernacki, Łukasz Bartosik, Guenter Roeck

Hi Peter,

This patch seems to crash on Tigerlake platform (Chromebook delbin), I
got the following error:

[    2.103054] pcieport 0000:00:1c.0: PME: Signaling with IRQ 122
[    2.110148] pcieport 0000:00:1c.0: pciehp: Slot #7 AttnBtn-
PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl+
IbPresDis- LLActRep+
[    2.126754] pcieport 0000:00:1d.0: PME: Signaling with IRQ 123
[    2.133946] ACPI: \_SB_.CP00: Found 3 idle states
[    2.139708] BUG: kernel NULL pointer dereference, address: 000000000000012b
[    2.140704] #PF: supervisor read access in kernel mode
[    2.140704] #PF: error_code(0x0000) - not-present page
[    2.140704] PGD 0 P4D 0
[    2.140704] Oops: 0000 [#1] PREEMPT SMP NOPTI
[    2.140704] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G     U
  5.13.0-rc1 #31
[    2.140704] Hardware name: Google Delbin/Delbin, BIOS
Google_Delbin.13672.156.3 05/14/2021
[    2.140704] RIP: 0010:cpuidle_poll_time+0x9/0x6a
[    2.140704] Code: 44 00 00 85 f6 78 19 55 48 89 e5 48 8b 05 16 44
44 01 4c 8b 58 40 4d 85 db 5d 41 ff d3 66 90 00 c3 0f 1f 44 00 00 55
48 89 e5 <48> 8b 46 20 48 85 c0 75 56 4c 63 87 28 04 00 00 b8 24 f49
[    2.140704] RSP: 0000:ffffffff9cc03ea8 EFLAGS: 00010282
[    2.140704] RAX: 0000000000008e7d RBX: ffffffff9cc1c5fd RCX: 000000007f894e5a
[    2.140704] RDX: 000000007f894d4f RSI: 000000000000010b RDI: 0000000002fa1cf6
[    2.140704] RBP: ffffffff9cc03ea8 R08: 0000000000000000 R09: 00000000ca948246
[    2.140704] R10: 0000000000000000 R11: ffffffff9bf132cb R12: 0000000000000003
[    2.140704] R13: ffffbbfdffc21960 R14: 0000000000000000 R15: ffffffff9cdba638
[    2.140704] FS:  0000000000000000(0000) GS:ffff928280000000(0000)
knlGS:0000000000000000
[    2.140704] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    2.140704] CR2: 000000000000012b CR3: 000000027e414001 CR4: 0000000000770ef0
[    2.140704] PKRU: 55555554
[    2.140704] Call Trace:
[    2.140704]  do_idle+0x175/0x1f6
[    2.140704]  cpu_startup_entry+0x1d/0x1f
[    2.140704]  start_kernel+0x3be/0x420
[    2.140704]  secondary_startup_64_no_verify+0xb0/0xbb
[    2.140704] Modules linked in:
[    2.140704] CR2: 000000000000012b
[    2.140704] ---[ end trace d15839e2bd509f00 ]---
[    2.140704] RIP: 0010:cpuidle_poll_time+0x9/0x6a
[    2.140704] Code: 44 00 00 85 f6 78 19 55 48 89 e5 48 8b 05 16 44
44 01 4c 8b 58 40 4d 85 db 5d 41 ff d3 66 90 00 c3 0f 1f 44 00 00 55
48 89 e5 <48> 8b 46 20 48 85 c0 75 56 4c 63 87 28 04 00 00 b8 24 f49
[    2.140704] RSP: 0000:ffffffff9cc03ea8 EFLAGS: 00010282
[    2.140704] RAX: 0000000000008e7d RBX: ffffffff9cc1c5fd RCX: 000000007f894e5a
[    2.140704] RDX: 000000007f894d4f RSI: 000000000000010b RDI: 0000000002fa1cf6
[    2.140704] RBP: ffffffff9cc03ea8 R08: 0000000000000000 R09: 00000000ca948246
[    2.140704] R10: 0000000000000000 R11: ffffffff9bf132cb R12: 0000000000000003
[    2.140704] R13: ffffbbfdffc21960 R14: 0000000000000000 R15: ffffffff9cdba638
[    2.140704] FS:  0000000000000000(0000) GS:ffff928280000000(0000)
knlGS:0000000000000000
[    2.140704] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    2.140704] CR2: 000000000000012b CR3: 000000027e414001 CR4: 0000000000770ef0
[    2.140704] PKRU: 55555554
[    2.140704] Kernel panic - not syncing: Fatal exception
[    2.140704] Kernel Offset: 0x1a600000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[    2.140704] ACPI MEMORY or I/O RESET_REG.

Git bisect pointed to this commit:

9bc0bb50727c8ac69fbb33fb937431cf3518ff37 is the first bad commit
commit 9bc0bb50727c8ac69fbb33fb937431cf3518ff37
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Fri Mar 26 16:12:15 2021 +0100

    objtool/x86: Rewrite retpoline thunk calls

If there is anything I could do to help debug this issue (additional
debugs, logs etc.), please let me know.
Best regards,
Lukasz

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-02 15:51     ` Lukasz Majczak
@ 2021-06-02 16:56       ` Peter Zijlstra
  2021-06-02 17:10         ` Peter Zijlstra
  2021-06-02 20:43       ` Josh Poimboeuf
  2021-06-04 20:50       ` Nick Desaulniers
  2 siblings, 1 reply; 82+ messages in thread
From: Peter Zijlstra @ 2021-06-02 16:56 UTC (permalink / raw)
  To: Lukasz Majczak
  Cc: Josh Poimboeuf, x86, jgross, mbenes, linux-kernel, upstream,
	Radosław Biernacki, Łukasz Bartosik, Guenter Roeck

On Wed, Jun 02, 2021 at 05:51:01PM +0200, Lukasz Majczak wrote:
> Hi Peter,
> 
> This patch seems to crash on Tigerlake platform (Chromebook delbin), I
> got the following error:
> 
> [    2.103054] pcieport 0000:00:1c.0: PME: Signaling with IRQ 122
> [    2.110148] pcieport 0000:00:1c.0: pciehp: Slot #7 AttnBtn-
> PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl+
> IbPresDis- LLActRep+
> [    2.126754] pcieport 0000:00:1d.0: PME: Signaling with IRQ 123
> [    2.133946] ACPI: \_SB_.CP00: Found 3 idle states
> [    2.139708] BUG: kernel NULL pointer dereference, address: 000000000000012b
> [    2.140704] #PF: supervisor read access in kernel mode
> [    2.140704] #PF: error_code(0x0000) - not-present page
> [    2.140704] PGD 0 P4D 0
> [    2.140704] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [    2.140704] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G     U
>   5.13.0-rc1 #31
> [    2.140704] Hardware name: Google Delbin/Delbin, BIOS
> Google_Delbin.13672.156.3 05/14/2021
> [    2.140704] RIP: 0010:cpuidle_poll_time+0x9/0x6a
> [    2.140704] Code: 44 00 00 85 f6 78 19 55 48 89 e5 48 8b 05 16 44
> 44 01 4c 8b 58 40 4d 85 db 5d 41 ff d3 66 90 00 c3 0f 1f 44 00 00 55
> 48 89 e5 <48> 8b 46 20 48 85 c0 75 56 4c 63 87 28 04 00 00 b8 24 f49

All code
========
 0:   44 00 00                add    %r8b,(%rax)
 3:   85 f6                   test   %esi,%esi
 5:   78 19                   js     0x20
 7:   55                      push   %rbp
 8:   48 89 e5                mov    %rsp,%rbp
 b:   48 8b 05 16 44 44 01    mov    0x1444416(%rip),%rax        # 0x1444428
12:   4c 8b 58 40             mov    0x40(%rax),%r11
16:   4d 85 db                test   %r11,%r11
19:   5d                      pop    %rbp
1a:   41 ff d3                callq  *%r11
1d:   66 90                   xchg   %ax,%ax
1f:   00 c3                   add    %al,%bl
21:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
26:   55                      push   %rbp
27:   48 89 e5                mov    %rsp,%rbp
2a:*  48 8b 46 20             mov    0x20(%rsi),%rax          <-- trapping instruction
2e:   48 85 c0                test   %rax,%rax
31:   75 56                   jne    0x89
33:   4c 63 87 28 04 00 00    movslq 0x428(%rdi),%r8
3a:   b8                      .byte 0xb8
3b:   24 49                   and    $0x49,%al

What does something like:

OBJ=vmlinux.o FUNC=0010:cpuidle_poll_time objdump -wdr $@ $OBJ | awk "/^\$/ { P=0; } /$FUNC[^>]*>:\$/ { P=1; O=strtonum(\"0x\" \$1); } { if (P) { o=strtonum(\"0x\" \$1); printf(\"%04x \", o-O); print \$0; } }"

look like for that build?

The 1d,1f instructions look exactly like what the alternative would've
written.

> [    2.140704] RSP: 0000:ffffffff9cc03ea8 EFLAGS: 00010282
> [    2.140704] RAX: 0000000000008e7d RBX: ffffffff9cc1c5fd RCX: 000000007f894e5a
> [    2.140704] RDX: 000000007f894d4f RSI: 000000000000010b RDI: 0000000002fa1cf6

That said, your RSI is buggered, and 0x20(%rsi) rightfully blows up.



^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-02 16:56       ` Peter Zijlstra
@ 2021-06-02 17:10         ` Peter Zijlstra
  0 siblings, 0 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-06-02 17:10 UTC (permalink / raw)
  To: Lukasz Majczak
  Cc: Josh Poimboeuf, x86, jgross, mbenes, linux-kernel, upstream,
	Radosław Biernacki, Łukasz Bartosik, Guenter Roeck

On Wed, Jun 02, 2021 at 06:56:51PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 02, 2021 at 05:51:01PM +0200, Lukasz Majczak wrote:
> > Hi Peter,
> > 
> > This patch seems to crash on Tigerlake platform (Chromebook delbin), I
> > got the following error:
> > 
> > [    2.103054] pcieport 0000:00:1c.0: PME: Signaling with IRQ 122
> > [    2.110148] pcieport 0000:00:1c.0: pciehp: Slot #7 AttnBtn-
> > PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl+
> > IbPresDis- LLActRep+
> > [    2.126754] pcieport 0000:00:1d.0: PME: Signaling with IRQ 123
> > [    2.133946] ACPI: \_SB_.CP00: Found 3 idle states
> > [    2.139708] BUG: kernel NULL pointer dereference, address: 000000000000012b
> > [    2.140704] #PF: supervisor read access in kernel mode
> > [    2.140704] #PF: error_code(0x0000) - not-present page
> > [    2.140704] PGD 0 P4D 0
> > [    2.140704] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [    2.140704] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G     U
> >   5.13.0-rc1 #31
> > [    2.140704] Hardware name: Google Delbin/Delbin, BIOS
> > Google_Delbin.13672.156.3 05/14/2021
> > [    2.140704] RIP: 0010:cpuidle_poll_time+0x9/0x6a
> > [    2.140704] Code: 44 00 00 85 f6 78 19 55 48 89 e5 48 8b 05 16 44
> > 44 01 4c 8b 58 40 4d 85 db 5d 41 ff d3 66 90 00 c3 0f 1f 44 00 00 55
> > 48 89 e5 <48> 8b 46 20 48 85 c0 75 56 4c 63 87 28 04 00 00 b8 24 f49
> 
> All code
> ========
>  0:   44 00 00                add    %r8b,(%rax)
>  3:   85 f6                   test   %esi,%esi
>  5:   78 19                   js     0x20
>  7:   55                      push   %rbp
>  8:   48 89 e5                mov    %rsp,%rbp
>  b:   48 8b 05 16 44 44 01    mov    0x1444416(%rip),%rax        # 0x1444428
> 12:   4c 8b 58 40             mov    0x40(%rax),%r11
> 16:   4d 85 db                test   %r11,%r11
> 19:   5d                      pop    %rbp
> 1a:   41 ff d3                callq  *%r11
> 1d:   66 90                   xchg   %ax,%ax
> 1f:   00 c3                   add    %al,%bl
> 21:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> 26:   55                      push   %rbp
> 27:   48 89 e5                mov    %rsp,%rbp
> 2a:*  48 8b 46 20             mov    0x20(%rsi),%rax          <-- trapping instruction
> 2e:   48 85 c0                test   %rax,%rax
> 31:   75 56                   jne    0x89
> 33:   4c 63 87 28 04 00 00    movslq 0x428(%rdi),%r8
> 3a:   b8                      .byte 0xb8
> 3b:   24 49                   and    $0x49,%al
> 
> What does something like:
> 
> OBJ=vmlinux.o FUNC=0010:cpuidle_poll_time objdump -wdr $@ $OBJ | awk "/^\$/ { P=0; } /$FUNC[^>]*>:\$/ { P=1; O=strtonum(\"0x\" \$1); } { if (P) { o=strtonum(\"0x\" \$1); printf(\"%04x \", o-O); print \$0; } }"
> 
> look like for that build?

I'm being daft; we build debug stuff for this.

Can you please do something like:

$ touch drivers/cpuidle/cpuidle.c
$ OBJTOOL_ARGS="--backup" make drivers/cpuidle/cpuidle.o

and send me both: drivers/cpuidle/cpuidle.o{,.orig}



^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-02 15:51     ` Lukasz Majczak
  2021-06-02 16:56       ` Peter Zijlstra
@ 2021-06-02 20:43       ` Josh Poimboeuf
  2021-06-04 20:50       ` Nick Desaulniers
  2 siblings, 0 replies; 82+ messages in thread
From: Josh Poimboeuf @ 2021-06-02 20:43 UTC (permalink / raw)
  To: Lukasz Majczak
  Cc: Peter Zijlstra, x86, jgross, mbenes, linux-kernel, upstream,
	Radosław Biernacki, Łukasz Bartosik, Guenter Roeck

On Wed, Jun 02, 2021 at 05:51:01PM +0200, Lukasz Majczak wrote:
> Hi Peter,
> 
> This patch seems to crash on Tigerlake platform (Chromebook delbin), I
> got the following error:
> 
> [    2.103054] pcieport 0000:00:1c.0: PME: Signaling with IRQ 122
> [    2.110148] pcieport 0000:00:1c.0: pciehp: Slot #7 AttnBtn-
> PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl+
> IbPresDis- LLActRep+
> [    2.126754] pcieport 0000:00:1d.0: PME: Signaling with IRQ 123
> [    2.133946] ACPI: \_SB_.CP00: Found 3 idle states
> [    2.139708] BUG: kernel NULL pointer dereference, address: 000000000000012b
> [    2.140704] #PF: supervisor read access in kernel mode
> [    2.140704] #PF: error_code(0x0000) - not-present page
> [    2.140704] PGD 0 P4D 0
> [    2.140704] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [    2.140704] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G     U
>   5.13.0-rc1 #31
> [    2.140704] Hardware name: Google Delbin/Delbin, BIOS
> Google_Delbin.13672.156.3 05/14/2021
> [    2.140704] RIP: 0010:cpuidle_poll_time+0x9/0x6a
> [    2.140704] Code: 44 00 00 85 f6 78 19 55 48 89 e5 48 8b 05 16 44
> 44 01 4c 8b 58 40 4d 85 db 5d 41 ff d3 66 90 00 c3 0f 1f 44 00 00 55
> 48 89 e5 <48> 8b 46 20 48 85 c0 75 56 4c 63 87 28 04 00 00 b8 24 f49
> [    2.140704] RSP: 0000:ffffffff9cc03ea8 EFLAGS: 00010282
> [    2.140704] RAX: 0000000000008e7d RBX: ffffffff9cc1c5fd RCX: 000000007f894e5a
> [    2.140704] RDX: 000000007f894d4f RSI: 000000000000010b RDI: 0000000002fa1cf6
> [    2.140704] RBP: ffffffff9cc03ea8 R08: 0000000000000000 R09: 00000000ca948246
> [    2.140704] R10: 0000000000000000 R11: ffffffff9bf132cb R12: 0000000000000003
> [    2.140704] R13: ffffbbfdffc21960 R14: 0000000000000000 R15: ffffffff9cdba638
> [    2.140704] FS:  0000000000000000(0000) GS:ffff928280000000(0000)
> knlGS:0000000000000000
> [    2.140704] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    2.140704] CR2: 000000000000012b CR3: 000000027e414001 CR4: 0000000000770ef0
> [    2.140704] PKRU: 55555554
> [    2.140704] Call Trace:
> [    2.140704]  do_idle+0x175/0x1f6
> [    2.140704]  cpu_startup_entry+0x1d/0x1f
> [    2.140704]  start_kernel+0x3be/0x420
> [    2.140704]  secondary_startup_64_no_verify+0xb0/0xbb

Assuming I'm looking at the right code, this is weird.

cpuidle_poll_time()'s only caller is poll_idle(), which isn't even
listed in the stack trace.  Maybe the function before
cpuidle_poll_time() fell through into it somehow.  Or execution got
otherwise hosed.  That would also explain the bad function argument.

In addition to the data Peter requested, it would also be interesting to
see the disassembly of do_idle() with objdump -dr, to see which function
got called before it went off the rails.

-- 
Josh


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-02 15:51     ` Lukasz Majczak
  2021-06-02 16:56       ` Peter Zijlstra
  2021-06-02 20:43       ` Josh Poimboeuf
@ 2021-06-04 20:50       ` Nick Desaulniers
  2021-06-04 23:27         ` Nick Desaulniers
  2 siblings, 1 reply; 82+ messages in thread
From: Nick Desaulniers @ 2021-06-04 20:50 UTC (permalink / raw)
  To: peterz
  Cc: jpoimboe, lma, groeck, jgross, lb, linux-kernel, mbenes, rad,
	upstream, x86, clang-built-linux, nathan

(Manually replying to https://lore.kernel.org/lkml/CAFJ_xbq06nfaEWtVNLtg7XCJrQeQ9wCs4Zsoi5Y_HP3Dx0iTRA@mail.gmail.com/)

Hi Peter,
We're also tracking 2 recent regressions that look like they've come from this
patch.

https://github.com/ClangBuiltLinux/linux/issues/1384
https://github.com/ClangBuiltLinux/linux/issues/1388

(Both in linux-next at the moment).

The first looks like a boot failure. The second is a warning from the
linker on a kernel module; even readelf seems unhappy with the results of the
output from objtool.

I can more easily reproduce the latter, so I'm working on getting a smaller
reproducer. I'll let you know when I have it, but wanted to report it ASAP.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-04 20:50       ` Nick Desaulniers
@ 2021-06-04 23:27         ` Nick Desaulniers
  2021-06-04 23:50           ` Fangrui Song
  0 siblings, 1 reply; 82+ messages in thread
From: Nick Desaulniers @ 2021-06-04 23:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, lma, Guenter Roeck, Juergen Gross, lb, LKML,
	mbenes, rad, upstream,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Nathan Chancellor, Sami Tolvanen

On Fri, Jun 4, 2021 at 1:50 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> (Manually replying to https://lore.kernel.org/lkml/CAFJ_xbq06nfaEWtVNLtg7XCJrQeQ9wCs4Zsoi5Y_HP3Dx0iTRA@mail.gmail.com/)
>
> Hi Peter,
> We're also tracking 2 recent regressions that look like they've come from this
> patch.
>
> https://github.com/ClangBuiltLinux/linux/issues/1384
> https://github.com/ClangBuiltLinux/linux/issues/1388
>
> (Both in linux-next at the moment).
>
> The first, it looks like a boot failure. The second is a warning from the
> linker on a kernel module; even readelf seems unhappy with the results of the
> output from objtool.
>
> I can more easily reproduce the latter, so I'm working on getting a smaller
> reproducer. I'll let you know when I have it, but wanted to report it ASAP.

Sent a pretty big attachment, privately.  I was able to capture the
before/after with:

$ echo 'CONFIG_GCOV_KERNEL=n
CONFIG_KASAN=n
CONFIG_LTO_CLANG_THIN=y' >allmod.config
$ OBJTOOL_ARGS="--backup" make -kj"$(nproc)" KCONFIG_ALLCONFIG=1
LLVM=1 LLVM_IAS=1 all

It looks like

$ ./tools/objtool/objtool orc generate  --module  --no-fp
--no-unreachable  --retpoline  --uaccess  --mcount
drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o; ld.lld -r -m elf_x86_64
-plugin-opt=-code-model=kernel -plugin-opt=-stack-alignment=8
--thinlto-cache-dir=.thinlto-cache -mllvm -import-instr-limit=5
-plugin-opt=-warn-stack-size=2048 --build-id=sha1  -T
scripts/module.lds -o drivers/gpu/drm/amd/amdgpu/amdgpu.ko
drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o
drivers/gpu/drm/amd/amdgpu/amdgpu.mod.o

is producing the linker error:

ld.lld: error: drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o:
SHT_SYMTAB_SHNDX has 79581 entries, but the symbol table associated
has 79582

Readelf having issues with the output:
$ readelf -s amdgpu.lto.o.orig
<works fine>
$ readelf -s amdgpu.lto.o
readelf: Error: Reading 73014451695 bytes extends past end of file for
string table
$ llvm-readelf -s amdgpu.lto.o
llvm-readelf: error: 'amdgpu.lto.o': unable to continue dumping, the
file is corrupt: section table goes past the end of file

`file` having issues:
$ file drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o
drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o: ELF 64-bit LSB relocatable,
x86-64, version 1 (SYSV), no section header

for comparison:
$ file ./drivers/spi/spi-ath79.lto.o
./drivers/spi/spi-ath79.lto.o: ELF 64-bit LSB relocatable, x86-64,
version 1 (SYSV), not stripped
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-04 23:27         ` Nick Desaulniers
@ 2021-06-04 23:50           ` Fangrui Song
  2021-06-05 10:38             ` Peter Zijlstra
  0 siblings, 1 reply; 82+ messages in thread
From: Fangrui Song @ 2021-06-04 23:50 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Peter Zijlstra, Josh Poimboeuf, lma, Guenter Roeck,
	Juergen Gross, lb, LKML, mbenes, rad, upstream,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Nathan Chancellor, Sami Tolvanen

On 2021-06-04, 'Nick Desaulniers' via Clang Built Linux wrote:
>On Fri, Jun 4, 2021 at 1:50 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
>>
>> (Manually replying to https://lore.kernel.org/lkml/CAFJ_xbq06nfaEWtVNLtg7XCJrQeQ9wCs4Zsoi5Y_HP3Dx0iTRA@mail.gmail.com/)
>>
>> Hi Peter,
>> We're also tracking 2 recent regressions that look like they've come from this
>> patch.
>>
>> https://github.com/ClangBuiltLinux/linux/issues/1384
>> https://github.com/ClangBuiltLinux/linux/issues/1388
>>
>> (Both in linux-next at the moment).
>>
>> The first, it looks like a boot failure. The second is a warning from the
>> linker on a kernel module; even readelf seems unhappy with the results of the
>> output from objtool.
>>
>> I can more easily reproduce the latter, so I'm working on getting a smaller
>> reproducer. I'll let you know when I have it, but wanted to report it ASAP.
>
>Sent a pretty big attachment, privately.  I was able to capture the
>before/after with:
>
>$ echo 'CONFIG_GCOV_KERNEL=n
>CONFIG_KASAN=n
>CONFIG_LTO_CLANG_THIN=y' >allmod.config
>$ OBJTOOL_ARGS="--backup" make -kj"$(nproc)" KCONFIG_ALLCONFIG=1
>LLVM=1 LLVM_IAS=1 all
>
>It looks like
>
>$ ./tools/objtool/objtool orc generate  --module  --no-fp
>--no-unreachable  --retpoline  --uaccess  --mcount
>drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o; ld.lld -r -m elf_x86_64
>-plugin-opt=-code-model=kernel -plugin-opt=-stack-alignment=8
>--thinlto-cache-dir=.thinlto-cache -mllvm -import-instr-limit=5
>-plugin-opt=-warn-stack-size=2048 --build-id=sha1  -T
>scripts/module.lds -o drivers/gpu/drm/amd/amdgpu/amdgpu.ko
>drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o
>drivers/gpu/drm/amd/amdgpu/amdgpu.mod.o
>
>is producing the linker error:
>
>ld.lld: error: drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o:
>SHT_SYMTAB_SHNDX has 79581 entries, but the symbol table associated
>has 79582
>
>Readelf having issues with the output:
>$ readelf -s amdgpu.lto.o.orig
><works fine>
>$ readelf -s amdgpu.lto.o
>readelf: Error: Reading 73014451695 bytes extends past end of file for
>string table
>$ llvm-readelf -s amdgpu.lto.o
>llvm-readelf: error: 'amdgpu.lto.o': unable to continue dumping, the
>file is corrupt: section table goes past the end of file
>
>`file` having issues:
>$ file drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o
>drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o: ELF 64-bit LSB relocatable,
>x86-64, version 1 (SYSV), no section header
>
>for comparison:
>$ file ./drivers/spi/spi-ath79.lto.o
>./drivers/spi/spi-ath79.lto.o: ELF 64-bit LSB relocatable, x86-64,
>version 1 (SYSV), not stripped

tools/objtool/elf.c:elf_add_symbol may not update .symtab_shndx .
Speaking of llvm-objcopy, it finalizes the content of .symtab_shndx when .symtab
is finalized. objtool may want to adopt a similar approach.

read_symbols searches for the section ".symtab_shndx". It'd be better to
use the section type SHT_SYMTAB_SHNDX.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-04 23:50           ` Fangrui Song
@ 2021-06-05 10:38             ` Peter Zijlstra
  2021-06-06  1:58               ` Fāng-ruì Sòng
  0 siblings, 1 reply; 82+ messages in thread
From: Peter Zijlstra @ 2021-06-05 10:38 UTC (permalink / raw)
  To: Fangrui Song
  Cc: Nick Desaulniers, Josh Poimboeuf, lma, Guenter Roeck,
	Juergen Gross, lb, LKML, mbenes, rad, upstream,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Nathan Chancellor, Sami Tolvanen

On Fri, Jun 04, 2021 at 04:50:46PM -0700, Fangrui Song wrote:
> On 2021-06-04, 'Nick Desaulniers' via Clang Built Linux wrote:

> > is producing the linker error:
> > 
> > ld.lld: error: drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o:
> > SHT_SYMTAB_SHNDX has 79581 entries, but the symbol table associated
> > has 79582
> > 
> > Readelf having issues with the output:
> > $ readelf -s amdgpu.lto.o.orig
> > <works fine>
> > $ readelf -s amdgpu.lto.o
> > readelf: Error: Reading 73014451695 bytes extends past end of file for
> > string table
> > $ llvm-readelf -s amdgpu.lto.o
> > llvm-readelf: error: 'amdgpu.lto.o': unable to continue dumping, the
> > file is corrupt: section table goes past the end of file
> > 

> tools/objtool/elf.c:elf_add_symbol may not update .symtab_shndx .
> Speaking of llvm-objcopy, it finalizes the content of .symtab_shndx when .symtab
> is finalized. objtool may want to adopt a similar approach.
> 
> read_symbols searches for the section ".symtab_shndx". It'd be better to
> use the section type SHT_SYMTAB_SHNDX.

I think you've absolutely nailed it; but would you have more information
or a code reference to what you're speaking about? My complete ELF
and libelf knowledge is very limited and as demonstrated here, I'm not
at all sure how all that extended index stuff is supposed to work.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-05 10:38             ` Peter Zijlstra
@ 2021-06-06  1:58               ` Fāng-ruì Sòng
  2021-06-07  7:56                 ` Peter Zijlstra
  2021-06-07 18:19                 ` Peter Zijlstra
  0 siblings, 2 replies; 82+ messages in thread
From: Fāng-ruì Sòng @ 2021-06-06  1:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Desaulniers, Josh Poimboeuf, lma, Guenter Roeck,
	Juergen Gross, lb, LKML, mbenes, Radosław Biernacki,
	upstream, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Nathan Chancellor, Sami Tolvanen

On Sat, Jun 5, 2021 at 3:39 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Jun 04, 2021 at 04:50:46PM -0700, Fangrui Song wrote:
> > On 2021-06-04, 'Nick Desaulniers' via Clang Built Linux wrote:
>
> > > is producing the linker error:
> > >
> > > ld.lld: error: drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o:
> > > SHT_SYMTAB_SHNDX has 79581 entries, but the symbol table associated
> > > has 79582
> > >
> > > Readelf having issues with the output:
> > > $ readelf -s amdgpu.lto.o.orig
> > > <works fine>
> > > $ readelf -s amdgpu.lto.o
> > > readelf: Error: Reading 73014451695 bytes extends past end of file for
> > > string table
> > > $ llvm-readelf -s amdgpu.lto.o
> > > llvm-readelf: error: 'amdgpu.lto.o': unable to continue dumping, the
> > > file is corrupt: section table goes past the end of file
> > >
>
> > tools/objtool/elf.c:elf_add_symbol may not update .symtab_shndx .
> > Speaking of llvm-objcopy, it finalizes the content of .symtab_shndx when .symtab
> > is finalized. objtool may want to adopt a similar approach.
> >
> > read_symbols searches for the section ".symtab_shndx". It'd be better to
> > use the section type SHT_SYMTAB_SHNDX.
>
> I think you've absolutely nailed it; but would you have more information
> or a code reference to what you're speaking about? My complete ELF
> and libelf knowledge is very limited and as demonstrated here, I'm not
> at all sure how all that extended index stuff is supposed to work.

The section index field of an Elf{32,64}_Sym (st_shndx) is 16-bit, so
it cannot represent a section index greater than 0xffff.
ELF actually reserves values in 0xff00~0xffff for other purposes, so
st_shndx cannot represent a section whose index is greater than or equal
to 0xff00.
To overcome the 16-bit section index limitation, .symtab_shndx was designed.

http://www.sco.com/developers/gabi/latest/ch4.symtab.html says

> SHN_XINDEX
> This value is an escape value. It indicates that the symbol refers to a specific location within a section, but that the section header index for that section is too large to be represented directly in the symbol table entry. The actual section header index is found in the associated SHT_SYMTAB_SHNDX section. The entries in that section correspond one to one with the entries in the symbol table. Only those entries in SHT_SYMTAB_SHNDX that correspond to symbol table entries with SHN_XINDEX will hold valid section header indexes; all other entries will have value 0.

You may use https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-objcopy/ELF/Object.cpp#L843
as a reference.
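
As a minimal sketch of the lookup the gABI text describes (not taken
from objtool or llvm-objcopy; the helper name sym_section_index and its
parameters are made up for illustration), a reader resolves a symbol's
real section index roughly like this:

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the section header index of symbol number
 * sym_idx.
 *
 * shndx_tab is the content of the SHT_SYMTAB_SHNDX section, or NULL when
 * the object has none; shndx_cnt is its number of 32-bit entries.
 * Other reserved values (SHN_ABS, SHN_COMMON, ...) are returned as-is.
 */
static uint32_t sym_section_index(const Elf64_Sym *symtab, size_t sym_idx,
				  const Elf32_Word *shndx_tab, size_t shndx_cnt)
{
	uint16_t shndx = symtab[sym_idx].st_shndx;

	/* Only the escape value redirects to the extended table. */
	if (shndx != SHN_XINDEX)
		return shndx;

	/* Malformed input: escape value without a matching entry. */
	if (!shndx_tab || sym_idx >= shndx_cnt)
		return SHN_UNDEF;

	/* Entry i of the extended table belongs to symbol i. */
	return shndx_tab[sym_idx];
}
```

The table is indexed by symbol number, which is why it has to stay the
same length as the symbol table.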

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-06  1:58               ` Fāng-ruì Sòng
@ 2021-06-07  7:56                 ` Peter Zijlstra
  2021-06-07  9:22                   ` Peter Zijlstra
  2021-06-07 18:19                 ` Peter Zijlstra
  1 sibling, 1 reply; 82+ messages in thread
From: Peter Zijlstra @ 2021-06-07  7:56 UTC (permalink / raw)
  To: Fāng-ruì Sòng
  Cc: Nick Desaulniers, Josh Poimboeuf, lma, Guenter Roeck,
	Juergen Gross, lb, LKML, mbenes, Radosław Biernacki,
	upstream, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Nathan Chancellor, Sami Tolvanen

On Sat, Jun 05, 2021 at 06:58:39PM -0700, Fāng-ruì Sòng wrote:
> On Sat, Jun 5, 2021 at 3:39 AM Peter Zijlstra <peterz@infradead.org> wrote:

> > I think you've absolutely nailed it; but would you have more information
> > or a code reference to what you're speaking about? My complete ELF
> > and libelf knowledge is very limited and as demonstrated here, I'm not
> > at all sure how all that extended index stuff is supposed to work.
> 
> The section index field of an Elf{32,64}_Sym (st_shndx) is 16-bit, so
> it cannot represent a section index greater than 0xffff.
> ELF actually reserves values in 0xff00~0xffff for other purposes, so
> st_shndx cannot represent a section whose index is greater than or equal
> to 0xff00.

Right, that's about as far as I got, but never could find details on how
the extension worked in detail, and I clearly muddled it :/

> To overcome the 16-bit section index limitation, .symtab_shndx was designed.
> 
> http://www.sco.com/developers/gabi/latest/ch4.symtab.html says
> 
> > SHN_XINDEX This value is an escape value. It indicates that the
> > symbol refers to a specific location within a section, but that the
> > section header index for that section is too large to be represented
> > directly in the symbol table entry. The actual section header index
> > is found in the associated SHT_SYMTAB_SHNDX section. The entries in
> > that section correspond one to one with the entries in the symbol
> > table. Only those entries in SHT_SYMTAB_SHNDX that correspond to
> > symbol table entries with SHN_XINDEX will hold valid section header
> > indexes; all other entries will have value 0.
> 
> You may use https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-objcopy/ELF/Object.cpp#L843
> as a reference.

Excellent, lemme go read up and attempt to fix this.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-07  7:56                 ` Peter Zijlstra
@ 2021-06-07  9:22                   ` Peter Zijlstra
  2021-06-07  9:45                     ` Peter Zijlstra
  0 siblings, 1 reply; 82+ messages in thread
From: Peter Zijlstra @ 2021-06-07  9:22 UTC (permalink / raw)
  To: Fāng-ruì Sòng
  Cc: Nick Desaulniers, Josh Poimboeuf, lma, Guenter Roeck,
	Juergen Gross, lb, LKML, mbenes, Radosław Biernacki,
	upstream, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Nathan Chancellor, Sami Tolvanen

On Mon, Jun 07, 2021 at 09:56:48AM +0200, Peter Zijlstra wrote:
> On Sat, Jun 05, 2021 at 06:58:39PM -0700, Fāng-ruì Sòng wrote:
> > On Sat, Jun 5, 2021 at 3:39 AM Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > > I think you've absolutely nailed it; but would you have more information
> > > or a code reference to what you're speaking about? My complete ELF
> > > and libelf knowledge is very limited and as demonstrated here, I'm not
> > > at all sure how all that extended index stuff is supposed to work.
> > 
> > The section index field of an Elf{32,64}_Sym (st_shndx) is 16-bit, so
> > it cannot represent a section index greater than 0xffff.
> > ELF actually reserves values in 0xff00~0xffff for other purposes, so
> > st_shndx cannot represent a section whose index is greater than or equal
> > to 0xff00.
> 
> Right, that's about as far as I got, but never could find details on how
> the extension worked in detail, and I clearly muddled it :/

OK, so I'm all confused again...

So a .symtab entry has:

	st_name  -- strtab offset for the name string
	st_value -- where this symbol lives
	st_size  -- size of symbol in bytes
	st_shndx -- section index to interpret the @st_value above
	st_info  -- type+bind
	st_other -- visibility

The thing is, we're adding UNDEF symbols, for the linker to resolve.
UNDEF has:

	st_value := 0
	st_size  := 0
	st_shndx := 0
	st_info  := GLOBAL + NOTYPE
	st_other := 0

Per that, st_shndx isn't >= SHN_LORESERVE, and I figured we're all good.


Is the problem that .symtab_shndx is expected to contain the exact same
number of entries as .symtab? And I'm adding to .symtab and not to
.symtab_shndx, hence getting them out of sync?

Let me try adding 0s to .symtab_shndx. See if that makes readelf
happier.
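
The suspected mismatch can be sketched as a simple size check. This is
only an illustration, not objtool code: the function name
symtab_shndx_in_sync is made up, and it assumes the caller passes the
sh_size and sh_entsize of .symtab and the sh_size of .symtab_shndx.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: do the symbol table and its extended index table
 * hold the same number of entries? Each extended index entry is one
 * 32-bit word.
 */
static bool symtab_shndx_in_sync(uint64_t symtab_size, uint64_t sym_entsize,
				 uint64_t shndx_size)
{
	/* Reject sizes that are not a whole number of entries. */
	if (!sym_entsize || symtab_size % sym_entsize ||
	    shndx_size % sizeof(uint32_t))
		return false;

	return symtab_size / sym_entsize == shndx_size / sizeof(uint32_t);
}
```

With the counts from the ld.lld error (79582 symbols against 79581
index entries) this returns false.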


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-07  9:22                   ` Peter Zijlstra
@ 2021-06-07  9:45                     ` Peter Zijlstra
  2021-06-07 17:23                       ` Fāng-ruì Sòng
  2021-06-07 20:54                       ` Nick Desaulniers
  0 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-06-07  9:45 UTC (permalink / raw)
  To: Fāng-ruì Sòng
  Cc: Nick Desaulniers, Josh Poimboeuf, lma, Guenter Roeck,
	Juergen Gross, lb, LKML, mbenes, Radosław Biernacki,
	upstream, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Nathan Chancellor, Sami Tolvanen

On Mon, Jun 07, 2021 at 11:22:11AM +0200, Peter Zijlstra wrote:
> On Mon, Jun 07, 2021 at 09:56:48AM +0200, Peter Zijlstra wrote:
> > On Sat, Jun 05, 2021 at 06:58:39PM -0700, Fāng-ruì Sòng wrote:
> > > On Sat, Jun 5, 2021 at 3:39 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > 
> > > > I think you've absolutely nailed it; but would you have more information
> > > > or a code reference to what you're speaking about? My complete ELF
> > > > and libelf knowledge is very limited and as demonstrated here, I'm not
> > > > at all sure how all that extended index stuff is supposed to work.
> > > 
> > > The section index field of an Elf{32,64}_Sym (st_shndx) is 16-bit, so
> > > it cannot represent a section index greater than 0xffff.
> > > ELF actually reserves values in 0xff00~0xffff for other purposes, so
> > > st_shndx cannot represent a section whose index is greater than or equal
> > > to 0xff00.
> > 
> > Right, that's about as far as I got, but never could find details on how
> > the extension worked in detail, and I clearly muddled it :/
> 
> OK, so I'm all confused again...
> 
> So a .symtab entry has:
> 
> 	st_name  -- strtab offset for the name string
> 	st_value -- where this symbol lives
> 	st_size  -- size of symbol in bytes
> 	st_shndx -- section index to interpret the @st_value above
> 	st_info  -- type+bind
> 	st_other -- visibility
> 
> The thing is, we're adding UNDEF symbols, for the linker to resolve.
> UNDEF has:
> 
> 	st_value := 0
> 	st_size  := 0
> 	st_shndx := 0
> 	st_info  := GLOBAL + NOTYPE
> 	st_other := 0
> 
> Per that, st_shndx isn't >= SHN_LORESERVE, and I figured we're all good.
> 
> 
> Is the problem that .symtab_shndx is expected to contain the exact same
> number of entries as .symtab? And I'm adding to .symtab and not to
> .symtab_shndx, hence getting them out of sync?
> 
> Let me try adding 0s to .symtab_shndx. See if that makes readelf
> happier.

That does indeed seem to do the trick. Bit daft if you ask me, anybody
reading that file ought to have a handy bucket of 0s available, but
whatever.

---
 tools/objtool/elf.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 743c2e9d0f56..41bca1d13d8e 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -717,7 +717,7 @@ static int elf_add_string(struct elf *elf, struct section *strtab, char *str)
 
 struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name)
 {
-	struct section *symtab;
+	struct section *symtab, *symtab_shndx;
 	struct symbol *sym;
 	Elf_Data *data;
 	Elf_Scn *s;
@@ -769,6 +769,29 @@ struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name)
 	symtab->len += data->d_size;
 	symtab->changed = true;
 
+	symtab_shndx = find_section_by_name(elf, ".symtab_shndx");
+	if (symtab_shndx) {
+		s = elf_getscn(elf->elf, symtab_shndx->idx);
+		if (!s) {
+			WARN_ELF("elf_getscn");
+			return NULL;
+		}
+
+		data = elf_newdata(s);
+		if (!data) {
+			WARN_ELF("elf_newdata");
+			return NULL;
+		}
+
+		data->d_buf = &sym->sym.st_size; /* conveniently 0 */
+		data->d_size = sizeof(Elf32_Word);
+		data->d_align = 4;
+		data->d_type = ELF_T_WORD;
+
+		symtab_shndx->len += 4;
+		symtab_shndx->changed = true;
+	}
+
 	sym->sec = find_section_by_index(elf, 0);
 
 	elf_add_symbol(elf, sym);

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-07  9:45                     ` Peter Zijlstra
@ 2021-06-07 17:23                       ` Fāng-ruì Sòng
  2021-06-07 18:25                         ` Peter Zijlstra
  2021-06-07 20:54                       ` Nick Desaulniers
  1 sibling, 1 reply; 82+ messages in thread
From: Fāng-ruì Sòng @ 2021-06-07 17:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Desaulniers, Josh Poimboeuf, lma, Guenter Roeck,
	Juergen Gross, lb, LKML, mbenes, Radosław Biernacki,
	upstream, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Nathan Chancellor, Sami Tolvanen

On 2021-06-07, Peter Zijlstra wrote:
>On Mon, Jun 07, 2021 at 11:22:11AM +0200, Peter Zijlstra wrote:
>> On Mon, Jun 07, 2021 at 09:56:48AM +0200, Peter Zijlstra wrote:
>> > On Sat, Jun 05, 2021 at 06:58:39PM -0700, Fāng-ruì Sòng wrote:
>> > > On Sat, Jun 5, 2021 at 3:39 AM Peter Zijlstra <peterz@infradead.org> wrote:
>> >
>> > > > I think you've absolutely nailed it; but would you have more information
>> > > > or a code reference to what you're speaking about? My complete ELF
>> > > > and libelf knowledge is very limited and as demonstrated here, I'm not
>> > > > at all sure how all that extended index stuff is supposed to work.
>> > >
>> > > The section index field of an Elf{32,64}_Sym (st_shndx) is 16-bit, so
>> > > it cannot represent a section index greater than 0xffff.
>> > > ELF actually reserves values in 0xff00~0xffff for other purposes, so
>> > > st_shndx cannot represent a section whose index is greater than or equal
>> > > to 0xff00.
>> >
>> > Right, that's about as far as I got, but never could find details on how
>> > the extension worked in detail, and I clearly muddled it :/
>>
>> OK, so I'm all confused again...
>>
>> So a .symtab entry has:
>>
>> 	st_name  -- strtab offset for the name string
>> 	st_value -- where this symbol lives
>> 	st_size  -- size of symbol in bytes
>> 	st_shndx -- section index to interpret the @st_value above
>> 	st_info  -- type+bind
>> 	st_other -- visibility
>>
>> The thing is, we're adding UNDEF symbols, for the linker to resolve.
>> UNDEF has:
>>
>> 	st_value := 0
>> 	st_size  := 0
>> 	st_shndx := 0
>> 	st_info  := GLOBAL + NOTYPE
>> 	st_other := 0
>>
>> Per that, st_shndx isn't >= SHN_LORESERVE, and I figured we're all good.
>>
>>
>> Is the problem that .symtab_shndx is expected to contain the exact same
>> number of entries as .symtab? And I'm adding to .symtab and not to
>> .symtab_shndx, hence getting them out of sync?

Yes. http://www.sco.com/developers/gabi/latest/ch4.sheader.html says
"Each value corresponds one to one with a symbol table entry and appear in the same order as those entries."

>> Let me try adding 0s to .symtab_shndx. See if that makes readelf
>> happier.
>
>That does indeed seem to do the trick. Bit daft if you ask me, anybody
>reading that file ought to have a handy bucket of 0s available, but
>whatever.

Does the representation use the section index directly? (sym->sym.st_shndx)
This can be fragile when the number of sections changes..., e.g. elf_add_section

So in llvm-objcopy's representation, the section index is represented as
the section object.

struct Symbol {
   ...
   SectionBase *DefinedIn = nullptr;
   ...
};

In the writer stage, sections are assigned 32-bit indexes and the writer
knows that an SHN_XINDEX for a symbol is needed if the index is >= 0xff00.
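
A rough C sketch of that writer-side decision follows. It is not
llvm-objcopy's actual code; struct shndx_pair and
encode_section_index are hypothetical names used only here.

```c
#include <elf.h>
#include <stdint.h>

/* What gets written for one symbol: the 16-bit field and the extra word. */
struct shndx_pair {
	uint16_t st_shndx;	/* goes into the Elf{32,64}_Sym entry */
	uint32_t xindex;	/* goes into the SHT_SYMTAB_SHNDX entry */
};

/* Hypothetical helper: split a 32-bit section index into both fields. */
static struct shndx_pair encode_section_index(uint32_t sec_idx)
{
	struct shndx_pair p;

	if (sec_idx >= SHN_LORESERVE) {
		/* Too large for st_shndx: use the escape value. */
		p.st_shndx = SHN_XINDEX;
		p.xindex = sec_idx;
	} else {
		/* Fits directly; the extended entry must be 0. */
		p.st_shndx = (uint16_t)sec_idx;
		p.xindex = 0;
	}
	return p;
}
```

An undefined symbol (section 0) therefore gets st_shndx 0 and an
extended entry of 0.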

>---
> tools/objtool/elf.c | 25 ++++++++++++++++++++++++-
> 1 file changed, 24 insertions(+), 1 deletion(-)
>
>diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
>index 743c2e9d0f56..41bca1d13d8e 100644
>--- a/tools/objtool/elf.c
>+++ b/tools/objtool/elf.c
>@@ -717,7 +717,7 @@ static int elf_add_string(struct elf *elf, struct section *strtab, char *str)
>
> struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name)
> {
>-	struct section *symtab;
>+	struct section *symtab, *symtab_shndx;
> 	struct symbol *sym;
> 	Elf_Data *data;
> 	Elf_Scn *s;
>@@ -769,6 +769,29 @@ struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name)
> 	symtab->len += data->d_size;
> 	symtab->changed = true;
>
>+	symtab_shndx = find_section_by_name(elf, ".symtab_shndx");
>+	if (symtab_shndx) {
>+		s = elf_getscn(elf->elf, symtab_shndx->idx);
>+		if (!s) {
>+			WARN_ELF("elf_getscn");
>+			return NULL;
>+		}
>+
>+		data = elf_newdata(s);
>+		if (!data) {
>+			WARN_ELF("elf_newdata");
>+			return NULL;
>+		}
>+
>+		data->d_buf = &sym->sym.st_size; /* conveniently 0 */
>+		data->d_size = sizeof(Elf32_Word);
>+		data->d_align = 4;
>+		data->d_type = ELF_T_WORD;
>+
>+		symtab_shndx->len += 4;
>+		symtab_shndx->changed = true;
>+	}
>+
> 	sym->sec = find_section_by_index(elf, 0);
>
> 	elf_add_symbol(elf, sym);

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-06  1:58               ` Fāng-ruì Sòng
  2021-06-07  7:56                 ` Peter Zijlstra
@ 2021-06-07 18:19                 ` Peter Zijlstra
  2021-06-07 18:27                   ` Fāng-ruì Sòng
  1 sibling, 1 reply; 82+ messages in thread
From: Peter Zijlstra @ 2021-06-07 18:19 UTC (permalink / raw)
  To: Fāng-ruì Sòng
  Cc: Nick Desaulniers, Josh Poimboeuf, lma, Guenter Roeck,
	Juergen Gross, lb, LKML, mbenes, Radosław Biernacki,
	upstream, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Nathan Chancellor, Sami Tolvanen

On Sat, Jun 05, 2021 at 06:58:39PM -0700, Fāng-ruì Sòng wrote:

> You may use https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-objcopy/ELF/Object.cpp#L843
> as a reference.

BTW, Error::success(), is that a successful error, or an erroneous
success? :-))

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-07 17:23                       ` Fāng-ruì Sòng
@ 2021-06-07 18:25                         ` Peter Zijlstra
  0 siblings, 0 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-06-07 18:25 UTC (permalink / raw)
  To: Fāng-ruì Sòng
  Cc: Nick Desaulniers, Josh Poimboeuf, lma, Guenter Roeck,
	Juergen Gross, lb, LKML, mbenes, Radosław Biernacki,
	upstream, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Nathan Chancellor, Sami Tolvanen

On Mon, Jun 07, 2021 at 10:23:11AM -0700, Fāng-ruì Sòng wrote:
> On 2021-06-07, Peter Zijlstra wrote:

> > That does indeed seem to do the trick. Bit daft if you ask me, anybody
> > reading that file ought to have a handy bucket of 0s available, but
> > whatever.
> 
> Does the representation use the section index directly? (sym->sym.st_shndx)
> This can be fragile when the number of sections changes..., e.g. elf_add_section

No, things are supposed to use sym->sec, which is a pointer to our
struct section representation.

> So in llvm-objcopy's representation, the section index is represented as
> the section object.
> 
> struct Symbol {
>   ...
>   SectionBase *DefinedIn = nullptr;
>   ...
> };

Somewhat like that.

> In the writer stage, sections are assigned 32-bit indexes and the writer
> knows that an SHN_XINDEX for a symbol is needed if the index is >= 0xff00.

I think we only ever append sections, so pre-existing section numbers
stay correct. If libelf somehow does something else, we rely on it to
then keep the section numbers internally consistent.

And the only symbol write is this append of undef symbols, which are
always on section 0.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-07 18:19                 ` Peter Zijlstra
@ 2021-06-07 18:27                   ` Fāng-ruì Sòng
  2021-06-07 18:47                     ` Peter Zijlstra
  0 siblings, 1 reply; 82+ messages in thread
From: Fāng-ruì Sòng @ 2021-06-07 18:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Desaulniers, Josh Poimboeuf, lma, Guenter Roeck,
	Juergen Gross, lb, LKML, mbenes, Radosław Biernacki,
	upstream, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Nathan Chancellor, Sami Tolvanen

On Mon, Jun 7, 2021 at 11:19 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Sat, Jun 05, 2021 at 06:58:39PM -0700, Fāng-ruì Sòng wrote:
>
> > You may use https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-objcopy/ELF/Object.cpp#L843
> > as a reference.
>
> BTW, Error::success(), is that a successful error, or an erroneous
> success? :-))

A success (no error). Error::success() is a factory member function.
Its purpose is to create an "unchecked" Error instance and require the
caller to explicitly check for the error state.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-07 18:27                   ` Fāng-ruì Sòng
@ 2021-06-07 18:47                     ` Peter Zijlstra
  0 siblings, 0 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-06-07 18:47 UTC (permalink / raw)
  To: Fāng-ruì Sòng
  Cc: Nick Desaulniers, Josh Poimboeuf, lma, Guenter Roeck,
	Juergen Gross, lb, LKML, mbenes, Radosław Biernacki,
	upstream, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Nathan Chancellor, Sami Tolvanen

On Mon, Jun 07, 2021 at 11:27:27AM -0700, Fāng-ruì Sòng wrote:
> On Mon, Jun 7, 2021 at 11:19 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Sat, Jun 05, 2021 at 06:58:39PM -0700, Fāng-ruì Sòng wrote:
> >
> > > You may use https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-objcopy/ELF/Object.cpp#L843
> > > as a reference.
> >
> > BTW, Error::success(), is that a successful error, or an erroneous
> > success? :-))
> 
> A success (no error). Error::success() is a factory member function.
> Its purpose is to create an "unchecked" Error instance and require the
> caller to explicitly check for the error state.

I got that (see the smiley face), but it reads really weird when you're
not used to it.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-07  9:45                     ` Peter Zijlstra
  2021-06-07 17:23                       ` Fāng-ruì Sòng
@ 2021-06-07 20:54                       ` Nick Desaulniers
  2021-06-08  9:56                         ` Peter Zijlstra
  2021-06-08 16:58                         ` Nathan Chancellor
  1 sibling, 2 replies; 82+ messages in thread
From: Nick Desaulniers @ 2021-06-07 20:54 UTC (permalink / raw)
  To: Peter Zijlstra, Nathan Chancellor
  Cc: Fāng-ruì Sòng, Josh Poimboeuf, lma, Guenter Roeck,
	Juergen Gross, lb, LKML, mbenes, Radosław Biernacki,
	upstream, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Sami Tolvanen

On Mon, Jun 7, 2021 at 2:46 AM Peter Zijlstra <peterz@infradead.org> wrote:
>

Thanks, the below diff resolves the linker error reported in
https://github.com/ClangBuiltLinux/linux/issues/1388

Both readelf implementations seem happy with the results, too.

Tested-by: Nick Desaulniers <ndesaulniers@google.com>

Nathan,
Can you please test the below diff and see if that resolves your boot
issue reported in:
https://github.com/ClangBuiltLinux/linux/issues/1384

> ---
>  tools/objtool/elf.c | 25 ++++++++++++++++++++++++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
> index 743c2e9d0f56..41bca1d13d8e 100644
> --- a/tools/objtool/elf.c
> +++ b/tools/objtool/elf.c
> @@ -717,7 +717,7 @@ static int elf_add_string(struct elf *elf, struct section *strtab, char *str)
>
>  struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name)
>  {
> -       struct section *symtab;
> +       struct section *symtab, *symtab_shndx;
>         struct symbol *sym;
>         Elf_Data *data;
>         Elf_Scn *s;
> @@ -769,6 +769,29 @@ struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name)
>         symtab->len += data->d_size;
>         symtab->changed = true;
>
> +       symtab_shndx = find_section_by_name(elf, ".symtab_shndx");
> +       if (symtab_shndx) {
> +               s = elf_getscn(elf->elf, symtab_shndx->idx);
> +               if (!s) {
> +                       WARN_ELF("elf_getscn");
> +                       return NULL;
> +               }
> +
> +               data = elf_newdata(s);
> +               if (!data) {
> +                       WARN_ELF("elf_newdata");
> +                       return NULL;
> +               }
> +
> +               data->d_buf = &sym->sym.st_size; /* conveniently 0 */
> +               data->d_size = sizeof(Elf32_Word);
> +               data->d_align = 4;
> +               data->d_type = ELF_T_WORD;
> +
> +               symtab_shndx->len += 4;
> +               symtab_shndx->changed = true;
> +       }
> +
>         sym->sec = find_section_by_index(elf, 0);
>
>         elf_add_symbol(elf, sym);


The only thing that's still different is that the `file` command still
prints "no section header."

$ find . -name \*.lto.o | xargs file | rev | cut -d , -f 1 | rev |
sort | uniq -c
      1  no section header
   8377  not stripped
      1  too many section headers (33683)
      1  too many section headers (50758)
$ file --version
file-5.39

Those are drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o, fs/xfs/xfs.lto.o, and
drivers/gpu/drm/i915/i915.lto.o, respectively.  I'm not sure that's a
problem yet, or whether 9bc0bb50727c8ac69fbb33fb937431cf3518ff37 is
even related; those might just be huge drivers, and I figured it was
worth reporting somewhere in case it ever comes up again.  CONFIG_LTO
implies -ffunction-sections -fdata-sections, and
CONFIG_LD_DEAD_CODE_DATA_ELIMINATION explicitly sets those, too.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-07 20:54                       ` Nick Desaulniers
@ 2021-06-08  9:56                         ` Peter Zijlstra
  2021-06-08 16:58                         ` Nathan Chancellor
  1 sibling, 0 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-06-08  9:56 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Nathan Chancellor, Fāng-ruì Sòng, Josh Poimboeuf,
	lma, Guenter Roeck, Juergen Gross, lb, LKML, mbenes,
	Radosław Biernacki, upstream,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Sami Tolvanen

On Mon, Jun 07, 2021 at 01:54:37PM -0700, Nick Desaulniers wrote:
> The only thing that's still different is that the `file` command still
> prints "no section header."
> 
> $ find . -name \*.lto.o | xargs file | rev | cut -d , -f 1 | rev |
> sort | uniq -c
>       1  no section header

That's not due to objtool, is it?

$ file amdgpu.lto.o.orig
amdgpu.lto.o.orig: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), no section header

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-07 20:54                       ` Nick Desaulniers
  2021-06-08  9:56                         ` Peter Zijlstra
@ 2021-06-08 16:58                         ` Nathan Chancellor
  2021-06-08 17:22                           ` Peter Zijlstra
  1 sibling, 1 reply; 82+ messages in thread
From: Nathan Chancellor @ 2021-06-08 16:58 UTC (permalink / raw)
  To: Nick Desaulniers, Peter Zijlstra
  Cc: Fāng-ruì Sòng, Josh Poimboeuf, lma, Guenter Roeck,
	Juergen Gross, lb, LKML, mbenes, Radosław Biernacki,
	upstream, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Sami Tolvanen

On 6/7/2021 1:54 PM, 'Nick Desaulniers' via Clang Built Linux wrote:
> On Mon, Jun 7, 2021 at 2:46 AM Peter Zijlstra <peterz@infradead.org> wrote:
>>
> 
> Thanks, the below diff resolves the linker error reported in
> https://github.com/ClangBuiltLinux/linux/issues/1388
> 
> Both readelf implementations seem happy with the results, too.
> 
> Tested-by: Nick Desaulniers <ndesaulniers@google.com>
> 
> Nathan,
> Can you please test the below diff and see if that resolves your boot
> issue reported in:
> https://github.com/ClangBuiltLinux/linux/issues/1384

Unfortunately, it does not appear to resolve that issue.

$ git log -2 --decorate=no --oneline
eea6a9d6d277 Peter's fix
614124bea77e Linux 5.13-rc5

$ strings /mnt/c/Users/natec/Linux/kernel-investigation | grep microsoft
5.13.0-rc5-microsoft-standard-WSL2-00001-geea6a9d6d277 
(nathan@archlinux-ax161) #3 SMP Tue Jun 8 09:46:19 MST 2021

My VM still never makes it to userspace.

>> ---
>>   tools/objtool/elf.c | 25 ++++++++++++++++++++++++-
>>   1 file changed, 24 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
>> index 743c2e9d0f56..41bca1d13d8e 100644
>> --- a/tools/objtool/elf.c
>> +++ b/tools/objtool/elf.c
>> @@ -717,7 +717,7 @@ static int elf_add_string(struct elf *elf, struct section *strtab, char *str)
>>
>>   struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name)
>>   {
>> -       struct section *symtab;
>> +       struct section *symtab, *symtab_shndx;
>>          struct symbol *sym;
>>          Elf_Data *data;
>>          Elf_Scn *s;
>> @@ -769,6 +769,29 @@ struct symbol *elf_create_undef_symbol(struct elf *elf, const char *name)
>>          symtab->len += data->d_size;
>>          symtab->changed = true;
>>
>> +       symtab_shndx = find_section_by_name(elf, ".symtab_shndx");
>> +       if (symtab_shndx) {
>> +               s = elf_getscn(elf->elf, symtab_shndx->idx);
>> +               if (!s) {
>> +                       WARN_ELF("elf_getscn");
>> +                       return NULL;
>> +               }
>> +
>> +               data = elf_newdata(s);
>> +               if (!data) {
>> +                       WARN_ELF("elf_newdata");
>> +                       return NULL;
>> +               }
>> +
>> +               data->d_buf = &sym->sym.st_size; /* conveniently 0 */
>> +               data->d_size = sizeof(Elf32_Word);
>> +               data->d_align = 4;
>> +               data->d_type = ELF_T_WORD;
>> +
>> +               symtab_shndx->len += 4;
>> +               symtab_shndx->changed = true;
>> +       }
>> +
>>          sym->sec = find_section_by_index(elf, 0);
>>
>>          elf_add_symbol(elf, sym);
> 
> 
> The only thing that's still different is that the `file` command still
> prints "no section header."
> 
> $ find . -name \*.lto.o | xargs file | rev | cut -d , -f 1 | rev |
> sort | uniq -c
>        1  no section header
>     8377  not stripped
>        1  too many section headers (33683)
>        1  too many section headers (50758)
> $ file --version
> file-5.39
> 
> Those are drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o, fs/xfs/xfs.lto.o, and
> drivers/gpu/drm/i915/i915.lto.o, respectively.  I'm not sure that's a
> problem yet, or whether 9bc0bb50727c8ac69fbb33fb937431cf3518ff37 is
> even related; those might just be huge drivers, and I figured it was
> worth reporting somewhere in case it ever comes up again.  CONFIG_LTO
> implies -ffunction-sections -fdata-sections, and
> CONFIG_LD_DEAD_CODE_DATA_ELIMINATION explicitly sets those, too.
> 

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-08 16:58                         ` Nathan Chancellor
@ 2021-06-08 17:22                           ` Peter Zijlstra
  2021-06-08 17:29                             ` Nathan Chancellor
  0 siblings, 1 reply; 82+ messages in thread
From: Peter Zijlstra @ 2021-06-08 17:22 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Nick Desaulniers, Fāng-ruì Sòng, Josh Poimboeuf,
	lma, Guenter Roeck, Juergen Gross, lb, LKML, mbenes,
	Radosław Biernacki, upstream,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Sami Tolvanen

On Tue, Jun 08, 2021 at 09:58:03AM -0700, Nathan Chancellor wrote:
> On 6/7/2021 1:54 PM, 'Nick Desaulniers' via Clang Built Linux wrote:
> > On Mon, Jun 7, 2021 at 2:46 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > 
> > 
> > Thanks, the below diff resolves the linker error reported in
> > https://github.com/ClangBuiltLinux/linux/issues/1388
> > 
> > Both readelf implementations seem happy with the results, too.
> > 
> > Tested-by: Nick Desaulniers <ndesaulniers@google.com>
> > 
> > Nathan,
> > Can you please test the below diff and see if that resolves your boot
> > issue reported in:
> > https://github.com/ClangBuiltLinux/linux/issues/1384
> 
> Unfortunately, it does not appear to resolve that issue.
> 
> $ git log -2 --decorate=no --oneline
> eea6a9d6d277 Peter's fix
> 614124bea77e Linux 5.13-rc5
> 
> $ strings /mnt/c/Users/natec/Linux/kernel-investigation | grep microsoft
> 5.13.0-rc5-microsoft-standard-WSL2-00001-geea6a9d6d277
> (nathan@archlinux-ax161) #3 SMP Tue Jun 8 09:46:19 MST 2021
> 
> My VM still never makes it to userspace.

Since it's a VM, can you use the gdb-stub to ask it where it's stuck?

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-08 17:22                           ` Peter Zijlstra
@ 2021-06-08 17:29                             ` Nathan Chancellor
  2021-06-08 18:17                               ` Peter Zijlstra
  2021-06-08 18:18                               ` Nick Desaulniers
  0 siblings, 2 replies; 82+ messages in thread
From: Nathan Chancellor @ 2021-06-08 17:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Desaulniers, Fāng-ruì Sòng, Josh Poimboeuf,
	lma, Guenter Roeck, Juergen Gross, lb, LKML, mbenes,
	Radosław Biernacki, upstream,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Sami Tolvanen

On 6/8/2021 10:22 AM, Peter Zijlstra wrote:
> On Tue, Jun 08, 2021 at 09:58:03AM -0700, Nathan Chancellor wrote:
>> On 6/7/2021 1:54 PM, 'Nick Desaulniers' via Clang Built Linux wrote:
>>> On Mon, Jun 7, 2021 at 2:46 AM Peter Zijlstra <peterz@infradead.org> wrote:
>>>>
>>>
>>> Thanks, the below diff resolves the linker error reported in
>>> https://github.com/ClangBuiltLinux/linux/issues/1388
>>>
>>> Both readelf implementations seem happy with the results, too.
>>>
>>> Tested-by: Nick Desaulniers <ndesaulniers@google.com>
>>>
>>> Nathan,
>>> Can you please test the below diff and see if that resolves your boot
>>> issue reported in:
>>> https://github.com/ClangBuiltLinux/linux/issues/1384
>>
>> Unfortunately, it does not appear to resolve that issue.
>>
>> $ git log -2 --decorate=no --oneline
>> eea6a9d6d277 Peter's fix
>> 614124bea77e Linux 5.13-rc5
>>
>> $ strings /mnt/c/Users/natec/Linux/kernel-investigation | grep microsoft
>> 5.13.0-rc5-microsoft-standard-WSL2-00001-geea6a9d6d277
>> (nathan@archlinux-ax161) #3 SMP Tue Jun 8 09:46:19 MST 2021
>>
>> My VM still never makes it to userspace.
> 
> Since it's a VM, can you use the gdb-stub to ask it where it's stuck?
> 

Unfortunately, this is the VM provided by the Windows Subsystem for 
Linux so examining it is nigh-impossible :/ I am considering bisecting 
the transforms that objtool does to try and figure out the one that 
causes the machine to fail to boot or try to reproduce in a different 
hypervisor, unless you have any other ideas.

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-08 17:29                             ` Nathan Chancellor
@ 2021-06-08 18:17                               ` Peter Zijlstra
  2021-06-08 18:49                                 ` Nathan Chancellor
  2021-06-08 18:18                               ` Nick Desaulniers
  1 sibling, 1 reply; 82+ messages in thread
From: Peter Zijlstra @ 2021-06-08 18:17 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Nick Desaulniers, Fāng-ruì Sòng, Josh Poimboeuf,
	lma, Guenter Roeck, Juergen Gross, lb, LKML, mbenes,
	Radosław Biernacki, upstream,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Sami Tolvanen

On Tue, Jun 08, 2021 at 10:29:56AM -0700, Nathan Chancellor wrote:
> On 6/8/2021 10:22 AM, Peter Zijlstra wrote:

> > Since it's a VM, can you use the gdb-stub to ask it where it's stuck?
> > 
> 
> Unfortunately, this is the VM provided by the Windows Subsystem for Linux so
> examining it is nigh-impossible :/ I am considering bisecting the transforms
> that objtool does to try and figure out the one that causes the machine to
> fail to boot or try to reproduce in a different hypervisor, unless you have
> any other ideas.

Does breaking Windows earn points similar to breaking the binary
drivers? :-) :-)

The below should kill this latest transform and would quickly confirm
whether that is what is causing your problem. If that's not it, what was
your last known working version?


diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index e5947fbb9e7a..d0f231b9c5a1 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1857,10 +1857,10 @@ static int decode_sections(struct objtool_file *file)
 	 * Must be after add_special_section_alts(), since this will emit
 	 * alternatives. Must be after add_{jump,call}_destination(), since
 	 * those create the call insn lists.
-	 */
 	ret = arch_rewrite_retpolines(file);
 	if (ret)
 		return ret;
+	 */
 
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-08 17:29                             ` Nathan Chancellor
  2021-06-08 18:17                               ` Peter Zijlstra
@ 2021-06-08 18:18                               ` Nick Desaulniers
  1 sibling, 0 replies; 82+ messages in thread
From: Nick Desaulniers @ 2021-06-08 18:18 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Peter Zijlstra, Fāng-ruì Sòng, Josh Poimboeuf,
	lma, Guenter Roeck, Juergen Gross, lb, LKML, mbenes,
	Radosław Biernacki, upstream,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Sami Tolvanen

On Tue, Jun 8, 2021 at 10:30 AM Nathan Chancellor <nathan@kernel.org> wrote:
>
> On 6/8/2021 10:22 AM, Peter Zijlstra wrote:
> > On Tue, Jun 08, 2021 at 09:58:03AM -0700, Nathan Chancellor wrote:
> >> On 6/7/2021 1:54 PM, 'Nick Desaulniers' via Clang Built Linux wrote:
> >>> Nathan,
> >>> Can you please test the below diff and see if that resolves your boot
> >>> issue reported in:
> >>> https://github.com/ClangBuiltLinux/linux/issues/1384
> >>
> >> Unfortunately, it does not appear to resolve that issue.
> >>
> >> $ git log -2 --decorate=no --oneline
> >> eea6a9d6d277 Peter's fix
> >> 614124bea77e Linux 5.13-rc5
> >>
> >> $ strings /mnt/c/Users/natec/Linux/kernel-investigation | grep microsoft
> >> 5.13.0-rc5-microsoft-standard-WSL2-00001-geea6a9d6d277
> >> (nathan@archlinux-ax161) #3 SMP Tue Jun 8 09:46:19 MST 2021
> >>
> >> My VM still never makes it to userspace.
> >
> > Since it's a VM, can you use the gdb-stub to ask it where it's stuck?
> >
>
> Unfortunately, this is the VM provided by the Windows Subsystem for
> Linux so examining it is nigh-impossible :/ I am considering bisecting
> the transforms that objtool does to try and figure out the one that
> causes the machine to fail to boot or try to reproduce in a different
> hypervisor, unless you have any other ideas.

Assuming this is an optimization and not required to boot/run, you
could test that quickly by putting a return statement as the first
statement in the list_for_each_entry loop in arch_rewrite_retpolines.
If that works, you could instead use a counter to try to see which
symbol is bad; once you bisect a counter value where things start/stop
booting, you could try to print the corresponding symbol (ie `name`).
This is the "optimization fuel" technique.  (Sorry if any of that is
unclear, let's follow up off thread if so.)  Maybe that symbol will
give us further clues?  I think that would tell us whether it's a
problematic jump vs call, and via which register.
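
A rough, self-contained sketch of the counter idea, in case it helps.
Everything here is hypothetical (the ex_ names, the array of site names);
in objtool the loop body would be the actual rewrite, and the budget
would come from wherever is convenient, e.g. an environment variable:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fuel budget: only the first `remaining` sites get rewritten. */
struct ex_fuel {
	long remaining;
};

/* Consume one unit of fuel; returns 0 once the budget is exhausted. */
static int ex_fuel_take(struct ex_fuel *fuel)
{
	assert(fuel != NULL);

	if (fuel->remaining <= 0)
		return 0;
	fuel->remaining--;
	return 1;
}

/*
 * Walk the candidate sites in a fixed order and "rewrite" them while fuel
 * lasts. Returns how many sites were rewritten and stores the name of the
 * last one in *last (NULL if none), which is the suspect once the budget
 * has been bisected down to the first value that fails to boot.
 */
static size_t ex_rewrite_sites(const char *const *names, size_t count,
			       struct ex_fuel *fuel, const char **last)
{
	size_t done = 0;

	*last = NULL;
	for (size_t i = 0; i < count; i++) {
		if (!ex_fuel_take(fuel))
			break;
		/* The real transform for names[i] would go here. */
		*last = names[i];
		done++;
	}
	return done;
}
```

This only works if the sites are visited in the same order on every
build, so that a given budget always selects the same prefix of rewrites.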
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-08 18:17                               ` Peter Zijlstra
@ 2021-06-08 18:49                                 ` Nathan Chancellor
  2021-06-09  7:11                                   ` Lukasz Majczak
  0 siblings, 1 reply; 82+ messages in thread
From: Nathan Chancellor @ 2021-06-08 18:49 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Desaulniers, Fāng-ruì Sòng, Josh Poimboeuf,
	lma, Guenter Roeck, Juergen Gross, lb, LKML, mbenes,
	Radosław Biernacki, upstream,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Sami Tolvanen

On 6/8/2021 11:17 AM, Peter Zijlstra wrote:
> On Tue, Jun 08, 2021 at 10:29:56AM -0700, Nathan Chancellor wrote:
>> Unfortunately, this is the VM provided by the Windows Subsystem for Linux so
>> examining it is nigh-impossible :/ I am considering bisecting the transforms
>> that objtool does to try and figure out the one that causes the machine to
>> fail to boot or try to reproduce in a different hypervisor, unless you have
>> any other ideas.
> 
> Does breaking Windows earn points similar to breaking the binary
> drivers? :-) :-)

:)

> The below should kill this latest transform and would quickly confirm
> whether that is what is causing your problem. If that's not it, what was
> your last known working version?

Yes, that diff gets me back to booting. I will see if I can figure out 
the exact rewrite that blows everything up.

> diff --git a/tools/objtool/check.c b/tools/objtool/check.c
> index e5947fbb9e7a..d0f231b9c5a1 100644
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -1857,10 +1857,10 @@ static int decode_sections(struct objtool_file *file)
>   	 * Must be after add_special_section_alts(), since this will emit
>   	 * alternatives. Must be after add_{jump,call}_destination(), since
>   	 * those create the call insn lists.
> -	 */
>   	ret = arch_rewrite_retpolines(file);
>   	if (ret)
>   		return ret;
> +	 */
>   
>   	return 0;
>   }
> 

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-08 18:49                                 ` Nathan Chancellor
@ 2021-06-09  7:11                                   ` Lukasz Majczak
  2021-06-09  7:20                                     ` Peter Zijlstra
  0 siblings, 1 reply; 82+ messages in thread
From: Lukasz Majczak @ 2021-06-09  7:11 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Peter Zijlstra, Nick Desaulniers, Fāng-ruì Sòng,
	Josh Poimboeuf, Guenter Roeck, Juergen Gross,
	Łukasz Bartosik, LKML, mbenes, Radosław Biernacki,
	upstream, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Sami Tolvanen

wt., 8 cze 2021 o 20:49 Nathan Chancellor <nathan@kernel.org> napisał(a):
>
> On 6/8/2021 11:17 AM, Peter Zijlstra wrote:
> > On Tue, Jun 08, 2021 at 10:29:56AM -0700, Nathan Chancellor wrote:
> >> Unfortunately, this is the VM provided by the Windows Subsystem for Linux so
> >> examining it is nigh-impossible :/ I am considering bisecting the transforms
> >> that objtool does to try and figure out the one that causes the machine to
> >> fail to boot or try to reproduce in a different hypervisor, unless you have
> >> any other ideas.
> >
> > Does breaking Windows earn points similar to breaking the binary
> > drivers? :-) :-)
>
> :)
>
> > The below should kill this latest transform and would quickly confirm
> > whether that is what is causing your problem. If that's not it, what was
> > your last known working version?
>
> Yes, that diff gets me back to booting. I will see if I can figure out
> the exact rewrite that blows everything up.
>
> > diff --git a/tools/objtool/check.c b/tools/objtool/check.c
> > index e5947fbb9e7a..d0f231b9c5a1 100644
> > --- a/tools/objtool/check.c
> > +++ b/tools/objtool/check.c
> > @@ -1857,10 +1857,10 @@ static int decode_sections(struct objtool_file *file)
> >        * Must be after add_special_section_alts(), since this will emit
> >        * alternatives. Must be after add_{jump,call}_destination(), since
> >        * those create the call insn lists.
> > -      */
> >       ret = arch_rewrite_retpolines(file);
> >       if (ret)
> >               return ret;
> > +      */
> >
> >       return 0;
> >   }
> >
>
> Cheers,
> Nathan

Hi Peter,

I'm sorry I was on vacation last week - do you still need the requested debugs?

Best regards,
Lukasz

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-09  7:11                                   ` Lukasz Majczak
@ 2021-06-09  7:20                                     ` Peter Zijlstra
  2021-06-09 12:23                                       ` Lukasz Majczak
  0 siblings, 1 reply; 82+ messages in thread
From: Peter Zijlstra @ 2021-06-09  7:20 UTC (permalink / raw)
  To: Lukasz Majczak
  Cc: Nathan Chancellor, Nick Desaulniers,
	Fāng-ruì Sòng, Josh Poimboeuf, Guenter Roeck,
	Juergen Gross, Łukasz Bartosik, LKML, mbenes,
	Radosław Biernacki, upstream,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Sami Tolvanen

On Wed, Jun 09, 2021 at 09:11:18AM +0200, Lukasz Majczak wrote:

> I'm sorry I was on vacation last week - do you still need the requested debugs?

If the patch here:

  https://lkml.kernel.org/r/YL3q1qFO9QIRL/BA@hirez.programming.kicks-ass.net

does not fix things for you (don't think it actually will), then yes,
please send me the information requested.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-09  7:20                                     ` Peter Zijlstra
@ 2021-06-09 12:23                                       ` Lukasz Majczak
  2021-06-09 15:08                                         ` Peter Zijlstra
  0 siblings, 1 reply; 82+ messages in thread
From: Lukasz Majczak @ 2021-06-09 12:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nathan Chancellor, Nick Desaulniers,
	Fāng-ruì Sòng, Josh Poimboeuf, Guenter Roeck,
	Juergen Gross, Łukasz Bartosik, LKML, mbenes,
	Radosław Biernacki, upstream,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Sami Tolvanen

śr., 9 cze 2021 o 09:20 Peter Zijlstra <peterz@infradead.org> napisał(a):
>
> On Wed, Jun 09, 2021 at 09:11:18AM +0200, Lukasz Majczak wrote:
>
> > I'm sorry I was on vacation last week - do you still need the requested debugs?
>
> If the patch here:
>
>   https://lkml.kernel.org/r/YL3q1qFO9QIRL/BA@hirez.programming.kicks-ass.net
>
> does not fix things for you (don't think it actually will), then yes,
> please send me the information requested.

Ok, it didn't help. Peter, Josh, I have sent you a private email with
the requested information.

Best regards
Lukasz

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-09 12:23                                       ` Lukasz Majczak
@ 2021-06-09 15:08                                         ` Peter Zijlstra
  2021-06-09 15:11                                           ` Peter Zijlstra
  2021-06-09 15:56                                           ` Nathan Chancellor
  0 siblings, 2 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-06-09 15:08 UTC (permalink / raw)
  To: Lukasz Majczak
  Cc: Nathan Chancellor, Nick Desaulniers,
	Fāng-ruì Sòng, Josh Poimboeuf, Guenter Roeck,
	Juergen Gross, Łukasz Bartosik, LKML, mbenes,
	Radosław Biernacki, upstream,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Sami Tolvanen

On Wed, Jun 09, 2021 at 02:23:28PM +0200, Lukasz Majczak wrote:
> śr., 9 cze 2021 o 09:20 Peter Zijlstra <peterz@infradead.org> napisał(a):
> >
> > On Wed, Jun 09, 2021 at 09:11:18AM +0200, Lukasz Majczak wrote:
> >
> > > I'm sorry I was on vacation last week - do you still need the requested debugs?
> >
> > If the patch here:
> >
> >   https://lkml.kernel.org/r/YL3q1qFO9QIRL/BA@hirez.programming.kicks-ass.net
> >
> > does not fix things for you (don't think it actually will), then yes,
> > please send me the information requested.
> 
> > Ok, it didn't help. Peter, Josh, I have sent you a private email with
> > the requested information.

OK, I think I've found it. Check this one:

 5d5:   0f 85 00 00 00 00       jne    5db <cpuidle_reflect+0x22>       5d7: R_X86_64_PLT32     __x86_indirect_thunk_r11-0x4


+Relocation section '.rela.altinstructions' at offset 0 contains 14 entries:
+    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend

+0000000000000018  0000000200000002 R_X86_64_PC32          0000000000000000 .text + 5d5
+000000000000001c  0000009200000002 R_X86_64_PC32          0000000000000000 __x86_indirect_alt_call_r11 + 0

Apparently we get conditional branches to retpoline thunks and objtool
completely messes that up. I'm betting this also explains the problems
Nathan is having.

*groan*,.. not sure what to do about this, except return to having
objtool generate code, which everybody hated on. For now I'll make it
skip the conditional branches.

I wonder if the compiler will also generate conditional tail calls, and
what that does with static_call... now I have to check all that.
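
To make the failure mode concrete, here is a small stand-alone sketch
that classifies the three rel32 branch encodings involved: call (e8),
jmp (e9) and jcc (0f 80 through 0f 8f, such as the jne above). It works
on raw bytes and ignores prefixes; objtool itself looks at the decoded
insn->type instead, so treat the ex_ names and the byte-level approach as
assumptions made for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum ex_branch_kind {
	EX_OTHER,
	EX_CALL_REL32, /* e8 rel32, 5 bytes */
	EX_JMP_REL32,  /* e9 rel32, 5 bytes */
	EX_JCC_REL32,  /* 0f 8x rel32, 6 bytes, conditional */
};

/* Classify a branch by its leading opcode bytes (no prefix handling). */
static enum ex_branch_kind ex_classify(const uint8_t *bytes, size_t len)
{
	assert(bytes != NULL);

	if (len >= 5 && bytes[0] == 0xe8)
		return EX_CALL_REL32;
	if (len >= 5 && bytes[0] == 0xe9)
		return EX_JMP_REL32;
	if (len >= 6 && bytes[0] == 0x0f && (bytes[1] & 0xf0) == 0x80)
		return EX_JCC_REL32;
	return EX_OTHER;
}

/*
 * Only unconditional call/jmp sites can be swapped for an indirect
 * call/jmp through the register; a conditional branch has no such
 * one-instruction equivalent, so it must be left pointing at the thunk.
 */
static int ex_may_rewrite(enum ex_branch_kind kind)
{
	return kind == EX_CALL_REL32 || kind == EX_JMP_REL32;
}
```

Fed the bytes 0f 85 00 00 00 00 from the dump above, ex_classify()
returns EX_JCC_REL32 and ex_may_rewrite() rejects the site.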

---

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 24295d39713b..523aa4157f80 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -747,6 +747,10 @@ int arch_rewrite_retpolines(struct objtool_file *file)
 
 	list_for_each_entry(insn, &file->retpoline_call_list, call_node) {
 
+		if (insn->type != INSN_JUMP_DYNAMIC &&
+		    insn->type != INSN_CALL_DYNAMIC)
+			continue;
+
 		if (!strcmp(insn->sec->name, ".text.__x86.indirect_thunk"))
 			continue;
 

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-09 15:08                                         ` Peter Zijlstra
@ 2021-06-09 15:11                                           ` Peter Zijlstra
  2021-06-09 15:56                                           ` Nathan Chancellor
  1 sibling, 0 replies; 82+ messages in thread
From: Peter Zijlstra @ 2021-06-09 15:11 UTC (permalink / raw)
  To: Lukasz Majczak
  Cc: Nathan Chancellor, Nick Desaulniers,
	Fāng-ruì Sòng, Josh Poimboeuf, Guenter Roeck,
	Juergen Gross, Łukasz Bartosik, LKML, mbenes,
	Radosław Biernacki, upstream,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Sami Tolvanen

On Wed, Jun 09, 2021 at 05:08:05PM +0200, Peter Zijlstra wrote:
> I wonder if the compiler will also generate conditional tail calls, and
> what that does with static_call... now I have to check all that.

OK.. static call patching infra will give us a nice WARN before it dies.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls
  2021-06-09 15:08                                         ` Peter Zijlstra
  2021-06-09 15:11                                           ` Peter Zijlstra
@ 2021-06-09 15:56                                           ` Nathan Chancellor
  1 sibling, 0 replies; 82+ messages in thread
From: Nathan Chancellor @ 2021-06-09 15:56 UTC (permalink / raw)
  To: Peter Zijlstra, Lukasz Majczak
  Cc: Nick Desaulniers, Fāng-ruì Sòng, Josh Poimboeuf,
	Guenter Roeck, Juergen Gross, Łukasz Bartosik, LKML, mbenes,
	Radosław Biernacki, upstream,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	clang-built-linux, Sami Tolvanen

On 6/9/2021 8:08 AM, Peter Zijlstra wrote:
> On Wed, Jun 09, 2021 at 02:23:28PM +0200, Lukasz Majczak wrote:
>> śr., 9 cze 2021 o 09:20 Peter Zijlstra <peterz@infradead.org> napisał(a):
>>>
>>> On Wed, Jun 09, 2021 at 09:11:18AM +0200, Lukasz Majczak wrote:
>>>
>>>> I'm sorry I was on vacation last week - do you still need the requested debugs?
>>>
>>> If the patch here:
>>>
>>>    https://lkml.kernel.org/r/YL3q1qFO9QIRL/BA@hirez.programming.kicks-ass.net
>>>
>>> does not fix things for you (don't think it actually will), then yes,
>>> please send me the information requested.
>>
>> Ok, it didn't help. Peter, Josh, I have sent you a private email with
>> the requested information.
> 
> OK, I think I've found it. Check this one:
> 
>   5d5:   0f 85 00 00 00 00       jne    5db <cpuidle_reflect+0x22>       5d7: R_X86_64_PLT32     __x86_indirect_thunk_r11-0x4
> 
> 
> +Relocation section '.rela.altinstructions' at offset 0 contains 14 entries:
> +    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 
> +0000000000000018  0000000200000002 R_X86_64_PC32          0000000000000000 .text + 5d5
> +000000000000001c  0000009200000002 R_X86_64_PC32          0000000000000000 __x86_indirect_alt_call_r11 + 0
> 
> Apparently we get conditional branches to retpoline thunks and objtool
> completely messes that up. I'm betting this also explains the problems
> Nathan is having.

Yes, the below patch gets my kernel back to booting so it seems the root 
cause is the same.
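
For readers following along: the byte pattern in the objdump line quoted above is what marks the branch as conditional. On x86-64, 0f 80 through 0f 8f are the Jcc rel32 encodings (0f 85 is jne), while e8 and e9 are the direct call and jmp forms. Below is a minimal sketch in C of that distinction. The enum and function names are hypothetical, made up for illustration, and this is not objtool's decoder.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification, for illustration only (not objtool's types). */
enum branch_kind {
	BRANCH_NONE,		/* not a rel32 branch this sketch knows about */
	BRANCH_CALL,		/* e8 rel32 */
	BRANCH_JUMP,		/* e9 rel32 */
	BRANCH_JUMP_COND,	/* 0f 80..8f rel32 (Jcc) */
};

/* Classify the rel32 branch, if any, found at the start of buf. */
static enum branch_kind classify_rel32_branch(const uint8_t *buf, size_t len)
{
	if (len >= 5 && buf[0] == 0xe8)
		return BRANCH_CALL;
	if (len >= 5 && buf[0] == 0xe9)
		return BRANCH_JUMP;
	if (len >= 6 && buf[0] == 0x0f && (buf[1] & 0xf0) == 0x80)
		return BRANCH_JUMP_COND;
	return BRANCH_NONE;
}

/* Offset of the rel32 field within the instruction, or 0 if none. */
static size_t rel32_offset(enum branch_kind kind)
{
	switch (kind) {
	case BRANCH_CALL:
	case BRANCH_JUMP:
		return 1;	/* one opcode byte, then rel32 */
	case BRANCH_JUMP_COND:
		return 2;	/* two opcode bytes, then rel32 */
	default:
		return 0;
	}
}
```

Given the six bytes at 5d5 (0f 85 00 00 00 00), the sketch reports a conditional jump with the rel32 field at offset 2, which lines up with the relocation shown at 5d7.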

> *groan*,.. not sure what to do about this, except return to having
> objtool generate code, which everybody hated on. For now I'll make it
> skip the conditional branches.
> 
> I wonder if the compiler will also generate conditional tail calls, and
> what that does with static_call... now I have to check all that.
> 
> ---

Tested-by: Nathan Chancellor <nathan@kernel.org>

> diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
> index 24295d39713b..523aa4157f80 100644
> --- a/tools/objtool/arch/x86/decode.c
> +++ b/tools/objtool/arch/x86/decode.c
> @@ -747,6 +747,10 @@ int arch_rewrite_retpolines(struct objtool_file *file)
>   
>   	list_for_each_entry(insn, &file->retpoline_call_list, call_node) {
>   
> +		if (insn->type != INSN_JUMP_DYNAMIC &&
> +		    insn->type != INSN_CALL_DYNAMIC)
> +			continue;
> +
>   		if (!strcmp(insn->sec->name, ".text.__x86.indirect_thunk"))
>   			continue;
>   
> 
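
The effect of the added check can be sketched outside objtool. What follows is a minimal, self-contained illustration in C. It assumes a plain array in place of file->retpoline_call_list and a reduced stand-in enum; only the names INSN_JUMP_DYNAMIC and INSN_CALL_DYNAMIC come from the patch, everything else is hypothetical.

```c
#include <stddef.h>

/*
 * Reduced stand-in for objtool's instruction types. Only the two
 * *_DYNAMIC names are taken from the patch; INSN_OTHER_BRANCH is a
 * hypothetical placeholder for anything else on the list, such as a
 * conditional branch to a retpoline thunk.
 */
enum insn_type {
	INSN_JUMP_DYNAMIC,
	INSN_CALL_DYNAMIC,
	INSN_OTHER_BRANCH,
};

/* Hypothetical, simplified call-site record. */
struct insn {
	enum insn_type type;
	int rewritten;		/* set when the sketch "rewrites" the site */
};

/* Same condition as the patch: only plain indirect jumps/calls qualify. */
static int should_rewrite(const struct insn *insn)
{
	return insn->type == INSN_JUMP_DYNAMIC ||
	       insn->type == INSN_CALL_DYNAMIC;
}

/* Walk the call sites, mark the qualifying ones, return how many. */
static size_t rewrite_sites(struct insn *insns, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		if (!should_rewrite(&insns[i]))
			continue;	/* leave e.g. conditional branches alone */
		insns[i].rewritten = 1;
		count++;
	}
	return count;
}
```

With one indirect call, one conditional branch and one indirect jump on the list, only the first and last are marked; the conditional branch is left as the compiler emitted it, which is the behaviour the patch aims for.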

Thread overview: 82+ messages
2021-03-26 15:11 [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Peter Zijlstra
2021-03-26 15:12 ` [PATCH v3 01/16] x86: Add insn_decode_kernel() Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2021-03-26 15:12 ` [PATCH v3 02/16] x86/alternatives: Optimize optimize_nops() Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2021-04-03 11:11   ` tip-bot2 for Peter Zijlstra
2021-03-26 15:12 ` [PATCH v3 03/16] x86/retpoline: Simplify retpolines Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
2021-04-06  8:56     ` David Laight
2021-03-26 15:12 ` [PATCH v3 04/16] objtool: Correctly handle retpoline thunk calls Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
2021-03-26 15:12 ` [PATCH v3 05/16] objtool: Per arch retpoline naming Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] objtool: Handle per " tip-bot2 for Peter Zijlstra
2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
2021-03-26 15:12 ` [PATCH v3 06/16] objtool: Fix static_call list generation Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
2021-03-26 15:12 ` [PATCH v3 07/16] objtool: Rework rebuild_reloc logic Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2021-04-03 11:10   ` [tip: x86/core] objtool: Rework the elf_rebuild_reloc_section() logic tip-bot2 for Peter Zijlstra
2021-03-26 15:12 ` [PATCH v3 08/16] objtool: Add elf_create_reloc() helper Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
2021-03-26 15:12 ` [PATCH v3 09/16] objtool: Implicitly create reloc sections Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2021-04-03 11:10   ` [tip: x86/core] objtool: Create reloc sections implicitly tip-bot2 for Peter Zijlstra
2021-03-26 15:12 ` [PATCH v3 10/16] objtool: Extract elf_strtab_concat() Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
2021-03-26 15:12 ` [PATCH v3 11/16] objtool: Extract elf_symbol_add() Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
2021-03-26 15:12 ` [PATCH v3 12/16] objtool: Add elf_create_undef_symbol() Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
2021-03-26 15:12 ` [PATCH v3 13/16] objtool: Keep track of retpoline call sites Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
2021-03-26 15:12 ` [PATCH v3 14/16] objtool: Cache instruction relocs Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
2021-03-26 15:12 ` [PATCH v3 15/16] objtool: Skip magical retpoline .altinstr_replacement Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
2021-03-26 15:12 ` [PATCH v3 16/16] objtool,x86: Rewrite retpoline thunk calls Peter Zijlstra
2021-03-29 16:38   ` Josh Poimboeuf
2021-06-02 15:51     ` Lukasz Majczak
2021-06-02 16:56       ` Peter Zijlstra
2021-06-02 17:10         ` Peter Zijlstra
2021-06-02 20:43       ` Josh Poimboeuf
2021-06-04 20:50       ` Nick Desaulniers
2021-06-04 23:27         ` Nick Desaulniers
2021-06-04 23:50           ` Fangrui Song
2021-06-05 10:38             ` Peter Zijlstra
2021-06-06  1:58               ` Fāng-ruì Sòng
2021-06-07  7:56                 ` Peter Zijlstra
2021-06-07  9:22                   ` Peter Zijlstra
2021-06-07  9:45                     ` Peter Zijlstra
2021-06-07 17:23                       ` Fāng-ruì Sòng
2021-06-07 18:25                         ` Peter Zijlstra
2021-06-07 20:54                       ` Nick Desaulniers
2021-06-08  9:56                         ` Peter Zijlstra
2021-06-08 16:58                         ` Nathan Chancellor
2021-06-08 17:22                           ` Peter Zijlstra
2021-06-08 17:29                             ` Nathan Chancellor
2021-06-08 18:17                               ` Peter Zijlstra
2021-06-08 18:49                                 ` Nathan Chancellor
2021-06-09  7:11                                   ` Lukasz Majczak
2021-06-09  7:20                                     ` Peter Zijlstra
2021-06-09 12:23                                       ` Lukasz Majczak
2021-06-09 15:08                                         ` Peter Zijlstra
2021-06-09 15:11                                           ` Peter Zijlstra
2021-06-09 15:56                                           ` Nathan Chancellor
2021-06-08 18:18                               ` Nick Desaulniers
2021-06-07 18:19                 ` Peter Zijlstra
2021-06-07 18:27                   ` Fāng-ruì Sòng
2021-06-07 18:47                     ` Peter Zijlstra
2021-04-01 15:08   ` [tip: x86/core] objtool/x86: " tip-bot2 for Peter Zijlstra
2021-04-03 11:10   ` tip-bot2 for Peter Zijlstra
2021-03-30 15:02 ` [PATCH v3 00/16] x86,objtool: Optimize !RETPOLINE Miroslav Benes
