All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] x86/kallsyms: Fix the call-thunk kallsyms fail
@ 2022-10-28 19:40 Peter Zijlstra
  2022-10-28 19:40 ` [PATCH 1/5] kallsyms: Revert "Take callthunks into account" Peter Zijlstra
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Peter Zijlstra @ 2022-10-28 19:40 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, djwong, yujie.liu, tglx, jpoimboe,
	joao.moreira, samitolvanen

Hi all,

As reported here:

  https://lkml.kernel.org/r/202210241614.2ae4c1f5-yujie.liu@intel.com

The kallsyms change for call-thunks doesn't really work out as expected. These
patches revert that change and propose an alternative.





^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/5] kallsyms: Revert "Take callthunks into account"
  2022-10-28 19:40 [PATCH 0/5] x86/kallsyms: Fix the call-thunk kallsyms fail Peter Zijlstra
@ 2022-10-28 19:40 ` Peter Zijlstra
  2022-10-28 19:40 ` [PATCH 2/5] objtool: Slice up elf_create_section_symbol() Peter Zijlstra
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Peter Zijlstra @ 2022-10-28 19:40 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, djwong, yujie.liu, tglx, jpoimboe,
	joao.moreira, samitolvanen

This is a full revert of commit:

  f1389181622a ("kallsyms: Take callthunks into account")

The commit assumes a number of things that are not quite right.
Notably it assumes every symbol has PADDING_BYTES in front of it that
are not claimed by another symbol.

This is not true; even when compiled with:

  -fpatchable-function-entry=${PADDING_BYTES},${PADDING_BYTES}

Notably things like .cold subfunctions do not need to adhere to this
change in ABI. It it also not true when build with CFI_CLANG, which
claims these PADDING_BYTES in the __cfi_##name symbol.

Once the prefix bytes are not consistent and or otherwise claimed the
approach this patch takes goes out the window and kallsym resolution
will report invalid symbol names.

Therefore revert this to make room for another approach.

Reported-by: Reported-by: kernel test robot <yujie.liu@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/202210241614.2ae4c1f5-yujie.liu@intel.com
---
 kernel/kallsyms.c |   45 +++++----------------------------------------
 1 file changed, 5 insertions(+), 40 deletions(-)

--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -293,12 +293,6 @@ static unsigned long get_symbol_pos(unsi
 	return low;
 }
 
-#ifdef CONFIG_FUNCTION_PADDING_BYTES
-#define PADDING_BYTES	CONFIG_FUNCTION_PADDING_BYTES
-#else
-#define PADDING_BYTES	0
-#endif
-
 /*
  * Lookup an address but don't bother to find any names.
  */
@@ -306,25 +300,13 @@ int kallsyms_lookup_size_offset(unsigned
 				unsigned long *offset)
 {
 	char namebuf[KSYM_NAME_LEN];
-	int ret;
-
-	addr += PADDING_BYTES;
 
 	if (is_ksym_addr(addr)) {
 		get_symbol_pos(addr, symbolsize, offset);
-		ret = 1;
-		goto found;
+		return 1;
 	}
-
-	ret = !!module_address_lookup(addr, symbolsize, offset, NULL, NULL, namebuf);
-	if (!ret) {
-		ret = !!__bpf_address_lookup(addr, symbolsize,
-					     offset, namebuf);
-	}
-found:
-	if (ret && offset)
-		*offset -= PADDING_BYTES;
-	return ret;
+	return !!module_address_lookup(addr, symbolsize, offset, NULL, NULL, namebuf) ||
+	       !!__bpf_address_lookup(addr, symbolsize, offset, namebuf);
 }
 
 static const char *kallsyms_lookup_buildid(unsigned long addr,
@@ -337,8 +319,6 @@ static const char *kallsyms_lookup_build
 	namebuf[KSYM_NAME_LEN - 1] = 0;
 	namebuf[0] = 0;
 
-	addr += PADDING_BYTES;
-
 	if (is_ksym_addr(addr)) {
 		unsigned long pos;
 
@@ -368,8 +348,6 @@ static const char *kallsyms_lookup_build
 
 found:
 	cleanup_symbol_name(namebuf);
-	if (ret && offset)
-		*offset -= PADDING_BYTES;
 	return ret;
 }
 
@@ -396,8 +374,6 @@ int lookup_symbol_name(unsigned long add
 	symname[0] = '\0';
 	symname[KSYM_NAME_LEN - 1] = '\0';
 
-	addr += PADDING_BYTES;
-
 	if (is_ksym_addr(addr)) {
 		unsigned long pos;
 
@@ -425,8 +401,6 @@ int lookup_symbol_attrs(unsigned long ad
 	name[0] = '\0';
 	name[KSYM_NAME_LEN - 1] = '\0';
 
-	addr += PADDING_BYTES;
-
 	if (is_ksym_addr(addr)) {
 		unsigned long pos;
 
@@ -443,8 +417,6 @@ int lookup_symbol_attrs(unsigned long ad
 		return res;
 
 found:
-	if (offset)
-		*offset -= PADDING_BYTES;
 	cleanup_symbol_name(name);
 	return 0;
 }
@@ -470,15 +442,8 @@ static int __sprint_symbol(char *buffer,
 	len = strlen(buffer);
 	offset -= symbol_offset;
 
-	if (add_offset) {
-		char s = '+';
-
-		if ((long)offset < 0) {
-			s = '-';
-			offset = 0UL - offset;
-		}
-		len += sprintf(buffer + len, "%c%#lx/%#lx", s, offset, size);
-	}
+	if (add_offset)
+		len += sprintf(buffer + len, "+%#lx/%#lx", offset, size);
 
 	if (modname) {
 		len += sprintf(buffer + len, " [%s", modname);



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 2/5] objtool: Slice up elf_create_section_symbol()
  2022-10-28 19:40 [PATCH 0/5] x86/kallsyms: Fix the call-thunk kallsyms fail Peter Zijlstra
  2022-10-28 19:40 ` [PATCH 1/5] kallsyms: Revert "Take callthunks into account" Peter Zijlstra
@ 2022-10-28 19:40 ` Peter Zijlstra
  2022-11-02  9:20   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2022-10-28 19:40 ` [PATCH 3/5] objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf Peter Zijlstra
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2022-10-28 19:40 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, djwong, yujie.liu, tglx, jpoimboe,
	joao.moreira, samitolvanen

In order to facilitate creation of more symbol types, slice up
elf_create_section_symbol() to extract a generic helper that deals
with adding ELF symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/elf.c |   56 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 35 insertions(+), 21 deletions(-)

--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -717,11 +717,11 @@ static int elf_update_symbol(struct elf
 }
 
 static struct symbol *
-elf_create_section_symbol(struct elf *elf, struct section *sec)
+__elf_create_symbol(struct elf *elf, struct symbol *sym)
 {
 	struct section *symtab, *symtab_shndx;
 	Elf32_Word first_non_local, new_idx;
-	struct symbol *sym, *old;
+	struct symbol *old;
 
 	symtab = find_section_by_name(elf, ".symtab");
 	if (symtab) {
@@ -731,27 +731,16 @@ elf_create_section_symbol(struct elf *el
 		return NULL;
 	}
 
-	sym = calloc(1, sizeof(*sym));
-	if (!sym) {
-		perror("malloc");
-		return NULL;
-	}
+	new_idx = symtab->sh.sh_size / symtab->sh.sh_entsize;
 
-	sym->name = sec->name;
-	sym->sec = sec;
-
-	// st_name 0
-	sym->sym.st_info = GELF_ST_INFO(STB_LOCAL, STT_SECTION);
-	// st_other 0
-	// st_value 0
-	// st_size 0
+	if (GELF_ST_BIND(sym->sym.st_info) != STB_LOCAL)
+		goto non_local;
 
 	/*
 	 * Move the first global symbol, as per sh_info, into a new, higher
 	 * symbol index. This fees up a spot for a new local symbol.
 	 */
 	first_non_local = symtab->sh.sh_info;
-	new_idx = symtab->sh.sh_size / symtab->sh.sh_entsize;
 	old = find_symbol_by_index(elf, first_non_local);
 	if (old) {
 		old->idx = new_idx;
@@ -769,18 +758,43 @@ elf_create_section_symbol(struct elf *el
 		new_idx = first_non_local;
 	}
 
+	/*
+	 * Either way, we will add a LOCAL symbol.
+	 */
+	symtab->sh.sh_info += 1;
+
+non_local:
 	sym->idx = new_idx;
 	if (elf_update_symbol(elf, symtab, symtab_shndx, sym)) {
 		WARN("elf_update_symbol");
 		return NULL;
 	}
 
-	/*
-	 * Either way, we added a LOCAL symbol.
-	 */
-	symtab->sh.sh_info += 1;
+	return sym;
+}
+
+static struct symbol *
+elf_create_section_symbol(struct elf *elf, struct section *sec)
+{
+	struct symbol *sym = calloc(1, sizeof(*sym));
+
+	if (!sym) {
+		perror("malloc");
+		return NULL;
+	}
+
+	sym->name = sec->name;
+	sym->sec = sec;
+
+	// st_name 0
+	sym->sym.st_info = GELF_ST_INFO(STB_LOCAL, STT_SECTION);
+	// st_other 0
+	// st_value 0
+	// st_size 0
 
-	elf_add_symbol(elf, sym);
+	sym = __elf_create_symbol(elf, sym);
+	if (sym)
+		elf_add_symbol(elf, sym);
 
 	return sym;
 }



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 3/5] objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf
  2022-10-28 19:40 [PATCH 0/5] x86/kallsyms: Fix the call-thunk kallsyms fail Peter Zijlstra
  2022-10-28 19:40 ` [PATCH 1/5] kallsyms: Revert "Take callthunks into account" Peter Zijlstra
  2022-10-28 19:40 ` [PATCH 2/5] objtool: Slice up elf_create_section_symbol() Peter Zijlstra
@ 2022-10-28 19:40 ` Peter Zijlstra
  2022-11-02  9:20   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2022-11-02 22:22   ` [PATCH 3/5] " Josh Poimboeuf
  2022-10-28 19:40 ` [PATCH 4/5] objtool: Add option to generate prefix symbols Peter Zijlstra
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 17+ messages in thread
From: Peter Zijlstra @ 2022-10-28 19:40 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, djwong, yujie.liu, tglx, jpoimboe,
	joao.moreira, samitolvanen

Due to how gelf_update_sym*() requires an Elf_Data pointer, and how
libelf keeps Elf_Data in a linked list per section,
elf_update_symbol() ends up having to iterate this list on each
update to find the correct Elf_Data for the index'ed symbol.

By allocating one Elf_Data per new symbol, the list grows per new
symbol, giving an effective O(n^2) insertion time. This is obviously
bloody terrible.

Therefore over-allocate the Elf_Data when an extention is needed.
Except it turns out libelf disregards Elf_Scn::sh_size in favour of
the sum of Elf_Data::d_size. IOW it will happily write out all the
unused space and fill it with:

  0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND

entries (aka zeros). Which obviously violates the STB_LOCAL placement
rule, and is a general pain in the backside for not being the desired
behaviour.

Manually fix-up the Elf_Data size to avoid this problem before calling
elf_update().

This significantly improves performance when adding a significant
number of symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/elf.c                 |   89 +++++++++++++++++++++++++++++++++---
 tools/objtool/include/objtool/elf.h |    2 
 2 files changed, 84 insertions(+), 7 deletions(-)

--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -634,6 +634,12 @@ static int elf_update_symbol(struct elf
 
 		/* end-of-list */
 		if (!symtab_data) {
+			/*
+			 * Over-allocate to avoid O(n^2) symbol creation
+			 * behaviour.  The down side is that libelf doesn't
+			 * like this; see elf_truncate_section() for the fixup.
+			 */
+			int num = max(1U, sym->idx/3);
 			void *buf;
 
 			if (idx) {
@@ -647,28 +653,34 @@ static int elf_update_symbol(struct elf
 			if (t)
 				shndx_data = elf_newdata(t);
 
-			buf = calloc(1, entsize);
+			buf = calloc(num, entsize);
 			if (!buf) {
 				WARN("malloc");
 				return -1;
 			}
 
 			symtab_data->d_buf = buf;
-			symtab_data->d_size = entsize;
+			symtab_data->d_size = num * entsize;
 			symtab_data->d_align = 1;
 			symtab_data->d_type = ELF_T_SYM;
 
-			symtab->sh.sh_size += entsize;
 			symtab->changed = true;
+			symtab->truncate = true;
 
 			if (t) {
-				shndx_data->d_buf = &sym->sec->idx;
-				shndx_data->d_size = sizeof(Elf32_Word);
+				buf = calloc(num, sizeof(Elf32_Word));
+				if (!buf) {
+					WARN("malloc");
+					return -1;
+				}
+
+				shndx_data->d_buf = buf;
+				shndx_data->d_size = num * sizeof(Elf32_Word);
 				shndx_data->d_align = sizeof(Elf32_Word);
 				shndx_data->d_type = ELF_T_WORD;
 
-				symtab_shndx->sh.sh_size += sizeof(Elf32_Word);
 				symtab_shndx->changed = true;
+				symtab_shndx->truncate = true;
 			}
 
 			break;
@@ -770,6 +782,14 @@ __elf_create_symbol(struct elf *elf, str
 		return NULL;
 	}
 
+	symtab->sh.sh_size += symtab->sh.sh_entsize;
+	symtab->changed = true;
+
+	if (symtab_shndx) {
+		symtab_shndx->sh.sh_size += sizeof(Elf32_Word);
+		symtab_shndx->changed = true;
+	}
+
 	return sym;
 }
 
@@ -1286,6 +1306,60 @@ int elf_write_reloc(struct elf *elf, str
 	return 0;
 }
 
+/*
+ * When Elf_Scn::sh_size is smaller than the combined Elf_Data::d_size
+ * do you:
+ *
+ *   A) adhere to the section header and truncate the data, or
+ *   B) ignore the section header and write out all the data you've got?
+ *
+ * Yes, libelf sucks and we need to manually truncate if we over-allocate data.
+ */
+static int elf_truncate_section(struct elf *elf, struct section *sec)
+{
+	u64 size = sec->sh.sh_size;
+	bool truncated = false;
+	Elf_Data *data = NULL;
+	Elf_Scn *s;
+
+	s = elf_getscn(elf->elf, sec->idx);
+	if (!s) {
+		WARN_ELF("elf_getscn");
+		return -1;
+	}
+
+	for (;;) {
+		/* get next data descriptor for the relevant section */
+		data = elf_getdata(s, data);
+
+		if (!data) {
+			if (size) {
+				WARN("end of section data but non-zero size left\n");
+				return -1;
+			}
+			return 0;
+		}
+
+		if (truncated) {
+			/* when we remove symbols */
+			WARN("truncated; but more data\n");
+			return -1;
+		}
+
+		if (!data->d_size) {
+			WARN("zero size data");
+			return -1;
+		}
+
+		if (data->d_size > size) {
+			truncated = true;
+			data->d_size = size;
+		}
+
+		size -= data->d_size;
+	}
+}
+
 int elf_write(struct elf *elf)
 {
 	struct section *sec;
@@ -1296,6 +1370,9 @@ int elf_write(struct elf *elf)
 
 	/* Update changed relocation sections and section headers: */
 	list_for_each_entry(sec, &elf->sections, list) {
+		if (sec->truncate)
+			elf_truncate_section(elf, sec);
+
 		if (sec->changed) {
 			s = elf_getscn(elf->elf, sec->idx);
 			if (!s) {
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -38,7 +38,7 @@ struct section {
 	Elf_Data *data;
 	char *name;
 	int idx;
-	bool changed, text, rodata, noinstr, init;
+	bool changed, text, rodata, noinstr, init, truncate;
 };
 
 struct symbol {



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 4/5] objtool: Add option to generate prefix symbols
  2022-10-28 19:40 [PATCH 0/5] x86/kallsyms: Fix the call-thunk kallsyms fail Peter Zijlstra
                   ` (2 preceding siblings ...)
  2022-10-28 19:40 ` [PATCH 3/5] objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf Peter Zijlstra
@ 2022-10-28 19:40 ` Peter Zijlstra
  2022-11-02  9:20   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2022-11-02 23:41   ` [PATCH 4/5] " Josh Poimboeuf
  2022-10-28 19:40 ` [PATCH 5/5] x86: Add prefix symbols for function padding Peter Zijlstra
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 17+ messages in thread
From: Peter Zijlstra @ 2022-10-28 19:40 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, djwong, yujie.liu, tglx, jpoimboe,
	joao.moreira, samitolvanen

When code is compiled with:

  -fpatchable-function-entry=${PADDING_BYTES},${PADDING_BYTES}

functions will have PADDING_BYTES of NOP in front of them. Unwinders
and other things that symbolize code locations will typically
attribute these bytes to the preceding function.

Given that these bytes nominally belong to the following symbol this
mis-attribution is confusing.

Inspired by the fact that CFI_CLANG emits __cfi_##name symbols to
claim these bytes, allow objtool to emit __pfx_##name symbols to do
the same.

Therefore add the objtool --prefix=N argument, to conditionally place
a __pfx_##name symbol at N bytes ahead of symbol 'name' when: all
these preceding bytes are NOP and name-N is an instruction boundary.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/builtin-check.c           |    1 +
 tools/objtool/check.c                   |   27 +++++++++++++++++++++++++++
 tools/objtool/elf.c                     |   30 ++++++++++++++++++++++++++++++
 tools/objtool/include/objtool/builtin.h |    1 +
 tools/objtool/include/objtool/elf.h     |    2 ++
 5 files changed, 61 insertions(+)

--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -75,6 +75,7 @@ const struct option check_options[] = {
 	OPT_BOOLEAN('r', "retpoline", &opts.retpoline, "validate and annotate retpoline usage"),
 	OPT_BOOLEAN(0,   "rethunk", &opts.rethunk, "validate and annotate rethunk usage"),
 	OPT_BOOLEAN(0,   "unret", &opts.unret, "validate entry unret placement"),
+	OPT_INTEGER(0,   "prefix", &opts.prefix, "generate prefix symbols"),
 	OPT_BOOLEAN('l', "sls", &opts.sls, "validate straight-line-speculation mitigations"),
 	OPT_BOOLEAN('s', "stackval", &opts.stackval, "validate frame pointer rules"),
 	OPT_BOOLEAN('t', "static-call", &opts.static_call, "annotate static calls"),
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -3972,6 +3972,31 @@ static bool ignore_unreachable_insn(stru
 	return false;
 }
 
+static int add_prefix_symbol(struct objtool_file *file, struct symbol *func,
+			     struct instruction *insn)
+{
+	if (!opts.prefix)
+		return 0;
+
+	for (;;) {
+		struct instruction *prev = list_prev_entry(insn, list);
+
+		if (&prev->list == &file->insn_list)
+			break;
+
+		if (prev->type != INSN_NOP)
+			break;
+
+		insn = prev;
+		if (func->offset - prev->offset == opts.prefix) {
+			elf_create_prefix_symbol(file->elf, func, -opts.prefix);
+			break;
+		}
+	}
+
+	return 0;
+}
+
 static int validate_symbol(struct objtool_file *file, struct section *sec,
 			   struct symbol *sym, struct insn_state *state)
 {
@@ -3990,6 +4015,8 @@ static int validate_symbol(struct objtoo
 	if (!insn || insn->ignore || insn->visited)
 		return 0;
 
+	add_prefix_symbol(file, sym, insn);
+
 	state->uaccess = sym->uaccess_safe;
 
 	ret = validate_branch(file, insn_func(insn), insn, *state);
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -819,6 +819,36 @@ elf_create_section_symbol(struct elf *el
 	return sym;
 }
 
+static int elf_add_string(struct elf *elf, struct section *strtab, char *str);
+
+struct symbol *
+elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long addend)
+{
+	struct symbol *sym = calloc(1, sizeof(*sym));
+	size_t namelen = strlen(orig->name) + sizeof("__pfx_");
+	char *name = malloc(namelen);
+
+	if (!sym || !name) {
+		perror("malloc");
+		return NULL;
+	}
+
+	snprintf(name, namelen, "__pfx_%s", orig->name);
+
+	sym->name = name;
+	sym->sec = orig->sec;
+
+	sym->sym.st_name = elf_add_string(elf, NULL, name);
+	sym->sym.st_info = orig->sym.st_info;
+	sym->sym.st_value = orig->sym.st_value + addend;
+
+	sym = __elf_create_symbol(elf, sym);
+	if (sym)
+		elf_add_symbol(elf, sym);
+
+	return sym;
+}
+
 int elf_add_reloc_to_insn(struct elf *elf, struct section *sec,
 			  unsigned long offset, unsigned int type,
 			  struct section *insn_sec, unsigned long insn_off)
--- a/tools/objtool/include/objtool/builtin.h
+++ b/tools/objtool/include/objtool/builtin.h
@@ -26,6 +26,7 @@ struct opts {
 	bool stackval;
 	bool static_call;
 	bool uaccess;
+	int prefix;
 
 	/* options: */
 	bool backtrace;
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -146,6 +146,8 @@ static inline bool has_multiple_files(st
 struct elf *elf_open_read(const char *name, int flags);
 struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr);
 
+struct symbol *elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long addend);
+
 int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
 		  unsigned int type, struct symbol *sym, s64 addend);
 int elf_add_reloc_to_insn(struct elf *elf, struct section *sec,



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 5/5] x86: Add prefix symbols for function padding
  2022-10-28 19:40 [PATCH 0/5] x86/kallsyms: Fix the call-thunk kallsyms fail Peter Zijlstra
                   ` (3 preceding siblings ...)
  2022-10-28 19:40 ` [PATCH 4/5] objtool: Add option to generate prefix symbols Peter Zijlstra
@ 2022-10-28 19:40 ` Peter Zijlstra
  2022-11-02  9:20   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  2022-10-30  9:15 ` [PATCH 0/5] x86/kallsyms: Fix the call-thunk kallsyms fail Peter Zijlstra
  2022-11-02 21:46 ` [PATCH 6/5] objtool: Optimize elf_dirty_reloc_sym() Peter Zijlstra
  6 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2022-10-28 19:40 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, djwong, yujie.liu, tglx, jpoimboe,
	joao.moreira, samitolvanen

When code is compiled with:

  -fpatchable-function-entry=${PADDING_BYTES},${PADDING_BYTES}

functions will have PADDING_BYTES of NOP in front of them. Unwinders
and other things that symbolize code locations will typically
attribute these bytes to the preceding function.

Given that these bytes nominally belong to the following symbol this
mis-attribution is confusing.

Inspired by the fact that CFI_CLANG emits __cfi_##name symbols to
claim these bytes, use objtool to emit __pfx_##name symbols to do
the same when CFI_CLANG is not used.

This then shows the callthunk for symbol 'name' as:

  __pfx_##name+0x6/0x10

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/Kconfig     |    4 ++++
 scripts/Makefile.lib |    1 +
 2 files changed, 5 insertions(+)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2472,6 +2472,10 @@ config CALL_THUNKS
 	def_bool n
 	select FUNCTION_ALIGNMENT_16B
 
+config PREFIX_SYMBOLS
+	def_bool y
+	depends on CALL_THUNKS && !CFI_CLANG
+
 menuconfig SPECULATION_MITIGATIONS
 	bool "Mitigations for speculative execution vulnerabilities"
 	default y
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -265,6 +265,7 @@ objtool-args-$(CONFIG_STACK_VALIDATION)
 objtool-args-$(CONFIG_HAVE_STATIC_CALL_INLINE)		+= --static-call
 objtool-args-$(CONFIG_HAVE_UACCESS_VALIDATION)		+= --uaccess
 objtool-args-$(CONFIG_GCOV_KERNEL)			+= --no-unreachable
+objtool-args-$(CONFIG_PREFIX_SYMBOLS)			+= --prefix=$(CONFIG_FUNCTION_PADDING_BYTES)
 
 objtool-args = $(objtool-args-y)					\
 	$(if $(delay-objtool), --link)					\



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/5] x86/kallsyms: Fix the call-thunk kallsyms fail
  2022-10-28 19:40 [PATCH 0/5] x86/kallsyms: Fix the call-thunk kallsyms fail Peter Zijlstra
                   ` (4 preceding siblings ...)
  2022-10-28 19:40 ` [PATCH 5/5] x86: Add prefix symbols for function padding Peter Zijlstra
@ 2022-10-30  9:15 ` Peter Zijlstra
  2022-10-31  6:18   ` Yujie Liu
  2022-11-02 21:46 ` [PATCH 6/5] objtool: Optimize elf_dirty_reloc_sym() Peter Zijlstra
  6 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2022-10-30  9:15 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, djwong, yujie.liu, tglx, jpoimboe, joao.moreira,
	samitolvanen

On Fri, Oct 28, 2022 at 09:40:22PM +0200, Peter Zijlstra wrote:
> Hi all,
> 
> As reported here:
> 
>   https://lkml.kernel.org/r/202210241614.2ae4c1f5-yujie.liu@intel.com
> 
> The kallsyms change for call-thunks doesn't really work out as expected. These
> patches revert that change and propose an alternative.

Robot found it needed the below; have folded back and pushed out again.

---
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index cf743520a786..55066c493570 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -3479,7 +3479,8 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
 
 		if (func && insn_func(insn) && func != insn_func(insn)->pfunc) {
 			/* Ignore KCFI type preambles, which always fall through */
-			if (!strncmp(func->name, "__cfi_", 6))
+			if (!strncmp(func->name, "__cfi_", 6) ||
+			    !strncmp(func->name, "__pfx_", 6))
 				return 0;
 
 			WARN("%s() falls through to next function %s()",
@@ -4042,6 +4043,7 @@ static int add_prefix_symbol(struct objtool_file *file, struct symbol *func,
 
 	for (;;) {
 		struct instruction *prev = list_prev_entry(insn, list);
+		u64 offset;
 
 		if (&prev->list == &file->insn_list)
 			break;
@@ -4049,11 +4051,13 @@ static int add_prefix_symbol(struct objtool_file *file, struct symbol *func,
 		if (prev->type != INSN_NOP)
 			break;
 
-		insn = prev;
-		if (func->offset - prev->offset == opts.prefix) {
-			elf_create_prefix_symbol(file->elf, func, -opts.prefix);
+		offset = func->offset - prev->offset;
+		if (offset >= opts.prefix) {
+			if (offset == opts.prefix)
+				elf_create_prefix_symbol(file->elf, func, opts.prefix);
 			break;
 		}
+		insn = prev;
 	}
 
 	return 0;
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 9fbdad7a565d..3d636d12d679 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -822,7 +822,7 @@ elf_create_section_symbol(struct elf *elf, struct section *sec)
 static int elf_add_string(struct elf *elf, struct section *strtab, char *str);
 
 struct symbol *
-elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long addend)
+elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long size)
 {
 	struct symbol *sym = calloc(1, sizeof(*sym));
 	size_t namelen = strlen(orig->name) + sizeof("__pfx_");
@@ -840,7 +840,8 @@ elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long addend)
 
 	sym->sym.st_name = elf_add_string(elf, NULL, name);
 	sym->sym.st_info = orig->sym.st_info;
-	sym->sym.st_value = orig->sym.st_value + addend;
+	sym->sym.st_value = orig->sym.st_value - size;
+	sym->sym.st_size = size;
 
 	sym = __elf_create_symbol(elf, sym);
 	if (sym)
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index 9e3bd4717a11..b6974e3173aa 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -146,7 +146,7 @@ static inline bool has_multiple_files(struct elf *elf)
 struct elf *elf_open_read(const char *name, int flags);
 struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr);
 
-struct symbol *elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long addend);
+struct symbol *elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long size);
 
 int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
 		  unsigned int type, struct symbol *sym, s64 addend);

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/5] x86/kallsyms: Fix the call-thunk kallsyms fail
  2022-10-30  9:15 ` [PATCH 0/5] x86/kallsyms: Fix the call-thunk kallsyms fail Peter Zijlstra
@ 2022-10-31  6:18   ` Yujie Liu
  0 siblings, 0 replies; 17+ messages in thread
From: Yujie Liu @ 2022-10-31  6:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, djwong, tglx, jpoimboe, joao.moreira,
	samitolvanen, lkp

On Sun, Oct 30, 2022 at 10:15:22AM +0100, Peter Zijlstra wrote:
> On Fri, Oct 28, 2022 at 09:40:22PM +0200, Peter Zijlstra wrote:
> > Hi all,
> > 
> > As reported here:
> > 
> >   https://lkml.kernel.org/r/202210241614.2ae4c1f5-yujie.liu@intel.com
> > 
> > The kallsyms change for call-thunks doesn't really work out as expected. These
> > patches revert that change and propose an alternative.
> 
> Robot found it needed the below; have folded back and pushed out again.

Hi Peter,

We applied the five patches in this patchset plus below extra fixup
patch on top of x86/core branch of tip. The xfstests failure and the
issue of changed dmesg WARNING are gone. Thanks.

  Tested-by: Yujie Liu <yujie.liu@intel.com>

=========================================================================================
compiler/disk/fs/kconfig/rootfs/tbox_group/test/testcase:
  gcc-11/4HDD/xfs/x86_64-rhel-8.3-func/debian-11.1-x86_64-20220510.cgz/lkp-ivb-d04/xfs-no-bug-assert/xfstests

commit:
  f1389181622a0 ("kallsyms: Take callthunks into account")
  9b10b976b8e2f ("x86/kallsyms: Fix the call-thunk kallsyms fail")

f1389181622a08d6 9b10b976b8e2f61d861befc114f
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
         12:12        -100%            :12    xfstests.xfs.098.fail
         12:12        -100%            :12    xfstests.xfs.439.fail
           :12          92%          11:12    dmesg.RIP:assfail[xfs]
         12:12        -100%            :12    dmesg.WARNING:CPU:#PID:#at_fs/xfs/xfs_message.c:#xfs_buf_alert_ratelimited.cold-#[xfs]
           :12          92%          11:12    dmesg.WARNING:at_fs/xfs/xfs_message.c:#assfail[xfs]


Best Regards,
Yujie

> 
> ---
> diff --git a/tools/objtool/check.c b/tools/objtool/check.c
> index cf743520a786..55066c493570 100644
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -3479,7 +3479,8 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
>  
>  		if (func && insn_func(insn) && func != insn_func(insn)->pfunc) {
>  			/* Ignore KCFI type preambles, which always fall through */
> -			if (!strncmp(func->name, "__cfi_", 6))
> +			if (!strncmp(func->name, "__cfi_", 6) ||
> +			    !strncmp(func->name, "__pfx_", 6))
>  				return 0;
>  
>  			WARN("%s() falls through to next function %s()",
> @@ -4042,6 +4043,7 @@ static int add_prefix_symbol(struct objtool_file *file, struct symbol *func,
>  
>  	for (;;) {
>  		struct instruction *prev = list_prev_entry(insn, list);
> +		u64 offset;
>  
>  		if (&prev->list == &file->insn_list)
>  			break;
> @@ -4049,11 +4051,13 @@ static int add_prefix_symbol(struct objtool_file *file, struct symbol *func,
>  		if (prev->type != INSN_NOP)
>  			break;
>  
> -		insn = prev;
> -		if (func->offset - prev->offset == opts.prefix) {
> -			elf_create_prefix_symbol(file->elf, func, -opts.prefix);
> +		offset = func->offset - prev->offset;
> +		if (offset >= opts.prefix) {
> +			if (offset == opts.prefix)
> +				elf_create_prefix_symbol(file->elf, func, opts.prefix);
>  			break;
>  		}
> +		insn = prev;
>  	}
>  
>  	return 0;
> diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
> index 9fbdad7a565d..3d636d12d679 100644
> --- a/tools/objtool/elf.c
> +++ b/tools/objtool/elf.c
> @@ -822,7 +822,7 @@ elf_create_section_symbol(struct elf *elf, struct section *sec)
>  static int elf_add_string(struct elf *elf, struct section *strtab, char *str);
>  
>  struct symbol *
> -elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long addend)
> +elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long size)
>  {
>  	struct symbol *sym = calloc(1, sizeof(*sym));
>  	size_t namelen = strlen(orig->name) + sizeof("__pfx_");
> @@ -840,7 +840,8 @@ elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long addend)
>  
>  	sym->sym.st_name = elf_add_string(elf, NULL, name);
>  	sym->sym.st_info = orig->sym.st_info;
> -	sym->sym.st_value = orig->sym.st_value + addend;
> +	sym->sym.st_value = orig->sym.st_value - size;
> +	sym->sym.st_size = size;
>  
>  	sym = __elf_create_symbol(elf, sym);
>  	if (sym)
> diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
> index 9e3bd4717a11..b6974e3173aa 100644
> --- a/tools/objtool/include/objtool/elf.h
> +++ b/tools/objtool/include/objtool/elf.h
> @@ -146,7 +146,7 @@ static inline bool has_multiple_files(struct elf *elf)
>  struct elf *elf_open_read(const char *name, int flags);
>  struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr);
>  
> -struct symbol *elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long addend);
> +struct symbol *elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long size);
>  
>  int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
>  		  unsigned int type, struct symbol *sym, s64 addend);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [tip: x86/core] x86: Add prefix symbols for function padding
  2022-10-28 19:40 ` [PATCH 5/5] x86: Add prefix symbols for function padding Peter Zijlstra
@ 2022-11-02  9:20   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 17+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2022-11-02  9:20 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Peter Zijlstra (Intel), Yujie Liu, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     b341b20d648bb7e9a3307c33163e7399f0913e66
Gitweb:        https://git.kernel.org/tip/b341b20d648bb7e9a3307c33163e7399f0913e66
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 28 Oct 2022 21:08:19 +02:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 01 Nov 2022 13:44:09 +01:00

x86: Add prefix symbols for function padding

When code is compiled with:

  -fpatchable-function-entry=${PADDING_BYTES},${PADDING_BYTES}

functions will have PADDING_BYTES of NOP in front of them. Unwinders
and other things that symbolize code locations will typically
attribute these bytes to the preceding function.

Given that these bytes nominally belong to the following symbol this
mis-attribution is confusing.

Inspired by the fact that CFI_CLANG emits __cfi_##name symbols to
claim these bytes, use objtool to emit __pfx_##name symbols to do
the same when CFI_CLANG is not used.

This then shows the callthunk for symbol 'name' as:

  __pfx_##name+0x6/0x10

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Yujie Liu <yujie.liu@intel.com>
Link: https://lkml.kernel.org/r/20221028194453.592512209@infradead.org
---
 arch/x86/Kconfig     | 4 ++++
 scripts/Makefile.lib | 1 +
 2 files changed, 5 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b52ad13..32818aa 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2471,6 +2471,10 @@ config CALL_THUNKS
 	def_bool n
 	select FUNCTION_ALIGNMENT_16B
 
+config PREFIX_SYMBOLS
+	def_bool y
+	depends on CALL_THUNKS && !CFI_CLANG
+
 menuconfig SPECULATION_MITIGATIONS
 	bool "Mitigations for speculative execution vulnerabilities"
 	default y
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 85f0275..2e03bcb 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -265,6 +265,7 @@ objtool-args-$(CONFIG_STACK_VALIDATION)			+= --stackval
 objtool-args-$(CONFIG_HAVE_STATIC_CALL_INLINE)		+= --static-call
 objtool-args-$(CONFIG_HAVE_UACCESS_VALIDATION)		+= --uaccess
 objtool-args-$(CONFIG_GCOV_KERNEL)			+= --no-unreachable
+objtool-args-$(CONFIG_PREFIX_SYMBOLS)			+= --prefix=$(CONFIG_FUNCTION_PADDING_BYTES)
 
 objtool-args = $(objtool-args-y)					\
 	$(if $(delay-objtool), --link)					\

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [tip: x86/core] objtool: Add option to generate prefix symbols
  2022-10-28 19:40 ` [PATCH 4/5] objtool: Add option to generate prefix symbols Peter Zijlstra
@ 2022-11-02  9:20   ` tip-bot2 for Peter Zijlstra
  2022-11-02 23:41   ` [PATCH 4/5] " Josh Poimboeuf
  1 sibling, 0 replies; 17+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2022-11-02  9:20 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Peter Zijlstra (Intel), Yujie Liu, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     9f2899fe36a623885d8576604cb582328ad32b3c
Gitweb:        https://git.kernel.org/tip/9f2899fe36a623885d8576604cb582328ad32b3c
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 28 Oct 2022 15:50:42 +02:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 01 Nov 2022 13:44:09 +01:00

objtool: Add option to generate prefix symbols

When code is compiled with:

  -fpatchable-function-entry=${PADDING_BYTES},${PADDING_BYTES}

functions will have PADDING_BYTES of NOP in front of them. Unwinders
and other things that symbolize code locations will typically
attribute these bytes to the preceding function.

Given that these bytes nominally belong to the following symbol this
mis-attribution is confusing.

Inspired by the fact that CFI_CLANG emits __cfi_##name symbols to
claim these bytes, allow objtool to emit __pfx_##name symbols to do
the same.

Therefore add the objtool --prefix=N argument, to conditionally place
a __pfx_##name symbol at N bytes ahead of symbol 'name' when: all
these preceding bytes are NOP and name-N is an instruction boundary.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Yujie Liu <yujie.liu@intel.com>
Link: https://lkml.kernel.org/r/20221028194453.526899822@infradead.org
---
 tools/objtool/builtin-check.c           |  1 +-
 tools/objtool/check.c                   | 33 +++++++++++++++++++++++-
 tools/objtool/elf.c                     | 31 +++++++++++++++++++++++-
 tools/objtool/include/objtool/builtin.h |  1 +-
 tools/objtool/include/objtool/elf.h     |  2 +-
 5 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index 0a04f8e..95fcece 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -75,6 +75,7 @@ const struct option check_options[] = {
 	OPT_BOOLEAN('r', "retpoline", &opts.retpoline, "validate and annotate retpoline usage"),
 	OPT_BOOLEAN(0,   "rethunk", &opts.rethunk, "validate and annotate rethunk usage"),
 	OPT_BOOLEAN(0,   "unret", &opts.unret, "validate entry unret placement"),
+	OPT_INTEGER(0,   "prefix", &opts.prefix, "generate prefix symbols"),
 	OPT_BOOLEAN('l', "sls", &opts.sls, "validate straight-line-speculation mitigations"),
 	OPT_BOOLEAN('s', "stackval", &opts.stackval, "validate frame pointer rules"),
 	OPT_BOOLEAN('t', "static-call", &opts.static_call, "annotate static calls"),
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 7936312..27f35f5 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -3417,7 +3417,8 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
 
 		if (func && insn_func(insn) && func != insn_func(insn)->pfunc) {
 			/* Ignore KCFI type preambles, which always fall through */
-			if (!strncmp(func->name, "__cfi_", 6))
+			if (!strncmp(func->name, "__cfi_", 6) ||
+			    !strncmp(func->name, "__pfx_", 6))
 				return 0;
 
 			WARN("%s() falls through to next function %s()",
@@ -3972,6 +3973,34 @@ static bool ignore_unreachable_insn(struct objtool_file *file, struct instructio
 	return false;
 }
 
+static int add_prefix_symbol(struct objtool_file *file, struct symbol *func,
+			     struct instruction *insn)
+{
+	if (!opts.prefix)
+		return 0;
+
+	for (;;) {
+		struct instruction *prev = list_prev_entry(insn, list);
+		u64 offset;
+
+		if (&prev->list == &file->insn_list)
+			break;
+
+		if (prev->type != INSN_NOP)
+			break;
+
+		offset = func->offset - prev->offset;
+		if (offset >= opts.prefix) {
+			if (offset == opts.prefix)
+				elf_create_prefix_symbol(file->elf, func, opts.prefix);
+			break;
+		}
+		insn = prev;
+	}
+
+	return 0;
+}
+
 static int validate_symbol(struct objtool_file *file, struct section *sec,
 			   struct symbol *sym, struct insn_state *state)
 {
@@ -3990,6 +4019,8 @@ static int validate_symbol(struct objtool_file *file, struct section *sec,
 	if (!insn || insn->ignore || insn->visited)
 		return 0;
 
+	add_prefix_symbol(file, sym, insn);
+
 	state->uaccess = sym->uaccess_safe;
 
 	ret = validate_branch(file, insn_func(insn), insn, *state);
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 36dc787..3d636d1 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -819,6 +819,37 @@ elf_create_section_symbol(struct elf *elf, struct section *sec)
 	return sym;
 }
 
+static int elf_add_string(struct elf *elf, struct section *strtab, char *str);
+
+struct symbol *
+elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long size)
+{
+	struct symbol *sym = calloc(1, sizeof(*sym));
+	size_t namelen = strlen(orig->name) + sizeof("__pfx_");
+	char *name = malloc(namelen);
+
+	if (!sym || !name) {
+		perror("malloc");
+		return NULL;
+	}
+
+	snprintf(name, namelen, "__pfx_%s", orig->name);
+
+	sym->name = name;
+	sym->sec = orig->sec;
+
+	sym->sym.st_name = elf_add_string(elf, NULL, name);
+	sym->sym.st_info = orig->sym.st_info;
+	sym->sym.st_value = orig->sym.st_value - size;
+	sym->sym.st_size = size;
+
+	sym = __elf_create_symbol(elf, sym);
+	if (sym)
+		elf_add_symbol(elf, sym);
+
+	return sym;
+}
+
 int elf_add_reloc_to_insn(struct elf *elf, struct section *sec,
 			  unsigned long offset, unsigned int type,
 			  struct section *insn_sec, unsigned long insn_off)
diff --git a/tools/objtool/include/objtool/builtin.h b/tools/objtool/include/objtool/builtin.h
index 22092a9..f341b62 100644
--- a/tools/objtool/include/objtool/builtin.h
+++ b/tools/objtool/include/objtool/builtin.h
@@ -26,6 +26,7 @@ struct opts {
 	bool stackval;
 	bool static_call;
 	bool uaccess;
+	int prefix;
 
 	/* options: */
 	bool backtrace;
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index 9e96a61..b6974e3 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -146,6 +146,8 @@ static inline bool has_multiple_files(struct elf *elf)
 struct elf *elf_open_read(const char *name, int flags);
 struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr);
 
+struct symbol *elf_create_prefix_symbol(struct elf *elf, struct symbol *orig, long size);
+
 int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
 		  unsigned int type, struct symbol *sym, s64 addend);
 int elf_add_reloc_to_insn(struct elf *elf, struct section *sec,

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [tip: x86/core] objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf
  2022-10-28 19:40 ` [PATCH 3/5] objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf Peter Zijlstra
@ 2022-11-02  9:20   ` tip-bot2 for Peter Zijlstra
  2022-11-02 22:22   ` [PATCH 3/5] " Josh Poimboeuf
  1 sibling, 0 replies; 17+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2022-11-02  9:20 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Peter Zijlstra (Intel), Yujie Liu, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     13f60e80e15dd0657c90bcca372ba045630ed9de
Gitweb:        https://git.kernel.org/tip/13f60e80e15dd0657c90bcca372ba045630ed9de
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 28 Oct 2022 20:29:51 +02:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 01 Nov 2022 13:44:08 +01:00

objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf

Due to how gelf_update_sym*() requires an Elf_Data pointer, and how
libelf keeps Elf_Data in a linked list per section,
elf_update_symbol() ends up having to iterate this list on each
update to find the correct Elf_Data for the index'ed symbol.

By allocating one Elf_Data per new symbol, the list grows per new
symbol, giving an effective O(n^2) insertion time. This is obviously
bloody terrible.

Therefore over-allocate the Elf_Data when an extention is needed.
Except it turns out libelf disregards Elf_Scn::sh_size in favour of
the sum of Elf_Data::d_size. IOW it will happily write out all the
unused space and fill it with:

  0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND

entries (aka zeros). Which obviously violates the STB_LOCAL placement
rule, and is a general pain in the backside for not being the desired
behaviour.

Manually fix-up the Elf_Data size to avoid this problem before calling
elf_update().

This significantly improves performance when adding a significant
number of symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Yujie Liu <yujie.liu@intel.com>
Link: https://lkml.kernel.org/r/20221028194453.461658986@infradead.org
---
 tools/objtool/elf.c                 | 89 ++++++++++++++++++++++++++--
 tools/objtool/include/objtool/elf.h |  2 +-
 2 files changed, 84 insertions(+), 7 deletions(-)

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 3ad89d9..36dc787 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -634,6 +634,12 @@ static int elf_update_symbol(struct elf *elf, struct section *symtab,
 
 		/* end-of-list */
 		if (!symtab_data) {
+			/*
+			 * Over-allocate to avoid O(n^2) symbol creation
+			 * behaviour.  The down side is that libelf doesn't
+			 * like this; see elf_truncate_section() for the fixup.
+			 */
+			int num = max(1U, sym->idx/3);
 			void *buf;
 
 			if (idx) {
@@ -647,28 +653,34 @@ static int elf_update_symbol(struct elf *elf, struct section *symtab,
 			if (t)
 				shndx_data = elf_newdata(t);
 
-			buf = calloc(1, entsize);
+			buf = calloc(num, entsize);
 			if (!buf) {
 				WARN("malloc");
 				return -1;
 			}
 
 			symtab_data->d_buf = buf;
-			symtab_data->d_size = entsize;
+			symtab_data->d_size = num * entsize;
 			symtab_data->d_align = 1;
 			symtab_data->d_type = ELF_T_SYM;
 
-			symtab->sh.sh_size += entsize;
 			symtab->changed = true;
+			symtab->truncate = true;
 
 			if (t) {
-				shndx_data->d_buf = &sym->sec->idx;
-				shndx_data->d_size = sizeof(Elf32_Word);
+				buf = calloc(num, sizeof(Elf32_Word));
+				if (!buf) {
+					WARN("malloc");
+					return -1;
+				}
+
+				shndx_data->d_buf = buf;
+				shndx_data->d_size = num * sizeof(Elf32_Word);
 				shndx_data->d_align = sizeof(Elf32_Word);
 				shndx_data->d_type = ELF_T_WORD;
 
-				symtab_shndx->sh.sh_size += sizeof(Elf32_Word);
 				symtab_shndx->changed = true;
+				symtab_shndx->truncate = true;
 			}
 
 			break;
@@ -770,6 +782,14 @@ non_local:
 		return NULL;
 	}
 
+	symtab->sh.sh_size += symtab->sh.sh_entsize;
+	symtab->changed = true;
+
+	if (symtab_shndx) {
+		symtab_shndx->sh.sh_size += sizeof(Elf32_Word);
+		symtab_shndx->changed = true;
+	}
+
 	return sym;
 }
 
@@ -1286,6 +1306,60 @@ int elf_write_reloc(struct elf *elf, struct reloc *reloc)
 	return 0;
 }
 
+/*
+ * When Elf_Scn::sh_size is smaller than the combined Elf_Data::d_size
+ * do you:
+ *
+ *   A) adhere to the section header and truncate the data, or
+ *   B) ignore the section header and write out all the data you've got?
+ *
+ * Yes, libelf sucks and we need to manually truncate if we over-allocate data.
+ */
+static int elf_truncate_section(struct elf *elf, struct section *sec)
+{
+	u64 size = sec->sh.sh_size;
+	bool truncated = false;
+	Elf_Data *data = NULL;
+	Elf_Scn *s;
+
+	s = elf_getscn(elf->elf, sec->idx);
+	if (!s) {
+		WARN_ELF("elf_getscn");
+		return -1;
+	}
+
+	for (;;) {
+		/* get next data descriptor for the relevant section */
+		data = elf_getdata(s, data);
+
+		if (!data) {
+			if (size) {
+				WARN("end of section data but non-zero size left\n");
+				return -1;
+			}
+			return 0;
+		}
+
+		if (truncated) {
+			/* when we remove symbols */
+			WARN("truncated; but more data\n");
+			return -1;
+		}
+
+		if (!data->d_size) {
+			WARN("zero size data");
+			return -1;
+		}
+
+		if (data->d_size > size) {
+			truncated = true;
+			data->d_size = size;
+		}
+
+		size -= data->d_size;
+	}
+}
+
 int elf_write(struct elf *elf)
 {
 	struct section *sec;
@@ -1296,6 +1370,9 @@ int elf_write(struct elf *elf)
 
 	/* Update changed relocation sections and section headers: */
 	list_for_each_entry(sec, &elf->sections, list) {
+		if (sec->truncate)
+			elf_truncate_section(elf, sec);
+
 		if (sec->changed) {
 			s = elf_getscn(elf->elf, sec->idx);
 			if (!s) {
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index d285331..9e96a61 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -38,7 +38,7 @@ struct section {
 	Elf_Data *data;
 	char *name;
 	int idx;
-	bool changed, text, rodata, noinstr, init;
+	bool changed, text, rodata, noinstr, init, truncate;
 };
 
 struct symbol {

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [tip: x86/core] objtool: Slice up elf_create_section_symbol()
  2022-10-28 19:40 ` [PATCH 2/5] objtool: Slice up elf_create_section_symbol() Peter Zijlstra
@ 2022-11-02  9:20   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 17+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2022-11-02  9:20 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Peter Zijlstra (Intel), Yujie Liu, x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     4c91be8e926c6b3734d59b9348e305431484d42b
Gitweb:        https://git.kernel.org/tip/4c91be8e926c6b3734d59b9348e305431484d42b
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Fri, 28 Oct 2022 15:49:26 +02:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 01 Nov 2022 13:44:08 +01:00

objtool: Slice up elf_create_section_symbol()

In order to facilitate creation of more symbol types, slice up
elf_create_section_symbol() to extract a generic helper that deals
with adding ELF symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Yujie Liu <yujie.liu@intel.com>
Link: https://lkml.kernel.org/r/20221028194453.396634875@infradead.org
---
 tools/objtool/elf.c | 56 +++++++++++++++++++++++++++-----------------
 1 file changed, 35 insertions(+), 21 deletions(-)

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 89b37cd..3ad89d9 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -717,11 +717,11 @@ static int elf_update_symbol(struct elf *elf, struct section *symtab,
 }
 
 static struct symbol *
-elf_create_section_symbol(struct elf *elf, struct section *sec)
+__elf_create_symbol(struct elf *elf, struct symbol *sym)
 {
 	struct section *symtab, *symtab_shndx;
 	Elf32_Word first_non_local, new_idx;
-	struct symbol *sym, *old;
+	struct symbol *old;
 
 	symtab = find_section_by_name(elf, ".symtab");
 	if (symtab) {
@@ -731,27 +731,16 @@ elf_create_section_symbol(struct elf *elf, struct section *sec)
 		return NULL;
 	}
 
-	sym = calloc(1, sizeof(*sym));
-	if (!sym) {
-		perror("malloc");
-		return NULL;
-	}
-
-	sym->name = sec->name;
-	sym->sec = sec;
+	new_idx = symtab->sh.sh_size / symtab->sh.sh_entsize;
 
-	// st_name 0
-	sym->sym.st_info = GELF_ST_INFO(STB_LOCAL, STT_SECTION);
-	// st_other 0
-	// st_value 0
-	// st_size 0
+	if (GELF_ST_BIND(sym->sym.st_info) != STB_LOCAL)
+		goto non_local;
 
 	/*
 	 * Move the first global symbol, as per sh_info, into a new, higher
 	 * symbol index. This fees up a spot for a new local symbol.
 	 */
 	first_non_local = symtab->sh.sh_info;
-	new_idx = symtab->sh.sh_size / symtab->sh.sh_entsize;
 	old = find_symbol_by_index(elf, first_non_local);
 	if (old) {
 		old->idx = new_idx;
@@ -769,18 +758,43 @@ elf_create_section_symbol(struct elf *elf, struct section *sec)
 		new_idx = first_non_local;
 	}
 
+	/*
+	 * Either way, we will add a LOCAL symbol.
+	 */
+	symtab->sh.sh_info += 1;
+
+non_local:
 	sym->idx = new_idx;
 	if (elf_update_symbol(elf, symtab, symtab_shndx, sym)) {
 		WARN("elf_update_symbol");
 		return NULL;
 	}
 
-	/*
-	 * Either way, we added a LOCAL symbol.
-	 */
-	symtab->sh.sh_info += 1;
+	return sym;
+}
+
+static struct symbol *
+elf_create_section_symbol(struct elf *elf, struct section *sec)
+{
+	struct symbol *sym = calloc(1, sizeof(*sym));
+
+	if (!sym) {
+		perror("malloc");
+		return NULL;
+	}
 
-	elf_add_symbol(elf, sym);
+	sym->name = sec->name;
+	sym->sec = sec;
+
+	// st_name 0
+	sym->sym.st_info = GELF_ST_INFO(STB_LOCAL, STT_SECTION);
+	// st_other 0
+	// st_value 0
+	// st_size 0
+
+	sym = __elf_create_symbol(elf, sym);
+	if (sym)
+		elf_add_symbol(elf, sym);
 
 	return sym;
 }

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 6/5] objtool: Optimize elf_dirty_reloc_sym()
  2022-10-28 19:40 [PATCH 0/5] x86/kallsyms: Fix the call-thunk kallsyms fail Peter Zijlstra
                   ` (5 preceding siblings ...)
  2022-10-30  9:15 ` [PATCH 0/5] x86/kallsyms: Fix the call-thunk kallsyms fail Peter Zijlstra
@ 2022-11-02 21:46 ` Peter Zijlstra
  2022-11-05 10:36   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
  6 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2022-11-02 21:46 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, djwong, yujie.liu, tglx, jpoimboe, joao.moreira,
	samitolvanen

Subject: objtool: Optimize elf_dirty_reloc_sym()
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed Nov  2 22:31:19 CET 2022

When moving a symbol in the symtab its index changes and any reloc
referring that symtol-table-index will need to be rewritten too.

In order to facilitate this, objtool simply marks the whole reloc
section 'changed' which will cause the whole section to be
re-generated.

However, finding the relocs that use any given symbol is implemented
rather crudely -- a fully iteration of all sections and their relocs.
Given that some builds have over 20k sections (kallsyms etc..)
iterating all that for *each* symbol moved takes a bit of time.

Instead have each symbol keep a list of relocs that reference it.

This *vastly* improves build times for certain configs.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/elf.c                 |   27 ++++++++++-----------------
 tools/objtool/include/objtool/elf.h |    2 ++
 2 files changed, 12 insertions(+), 17 deletions(-)

--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -356,6 +356,7 @@ static void elf_add_symbol(struct elf *e
 	struct rb_node *pnode;
 	struct symbol *iter;
 
+	INIT_LIST_HEAD(&sym->reloc_list);
 	INIT_LIST_HEAD(&sym->pv_target);
 	sym->alias = sym;
 
@@ -557,6 +558,7 @@ int elf_add_reloc(struct elf *elf, struc
 	reloc->sym = sym;
 	reloc->addend = addend;
 
+	list_add_tail(&reloc->sym_reloc_entry, &sym->reloc_list);
 	list_add_tail(&reloc->list, &sec->reloc->reloc_list);
 	elf_hash_add(reloc, &reloc->hash, reloc_hash(reloc));
 
@@ -573,21 +575,10 @@ int elf_add_reloc(struct elf *elf, struc
  */
 static void elf_dirty_reloc_sym(struct elf *elf, struct symbol *sym)
 {
-	struct section *sec;
-
-	list_for_each_entry(sec, &elf->sections, list) {
-		struct reloc *reloc;
-
-		if (sec->changed)
-			continue;
+	struct reloc *reloc;
 
-		list_for_each_entry(reloc, &sec->reloc_list, list) {
-			if (reloc->sym == sym) {
-				sec->changed = true;
-				break;
-			}
-		}
-	}
+	list_for_each_entry(reloc, &sym->reloc_list, sym_reloc_entry)
+		reloc->sec->changed = true;
 }
 
 /*
@@ -902,11 +893,12 @@ static int read_rela_reloc(struct sectio
 
 static int read_relocs(struct elf *elf)
 {
+	unsigned long nr_reloc, max_reloc = 0, tot_reloc = 0;
 	struct section *sec;
 	struct reloc *reloc;
-	int i;
 	unsigned int symndx;
-	unsigned long nr_reloc, max_reloc = 0, tot_reloc = 0;
+	struct symbol *sym;
+	int i;
 
 	if (!elf_alloc_hash(reloc, elf->text_size / 16))
 		return -1;
@@ -947,13 +939,14 @@ static int read_relocs(struct elf *elf)
 
 			reloc->sec = sec;
 			reloc->idx = i;
-			reloc->sym = find_symbol_by_index(elf, symndx);
+			reloc->sym = sym = find_symbol_by_index(elf, symndx);
 			if (!reloc->sym) {
 				WARN("can't find reloc entry symbol %d for %s",
 				     symndx, sec->name);
 				return -1;
 			}
 
+			list_add_tail(&reloc->sym_reloc_entry, &sym->reloc_list);
 			list_add_tail(&reloc->list, &sec->reloc_list);
 			elf_hash_add(reloc, &reloc->hash, reloc_hash(reloc));
 
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -62,6 +62,7 @@ struct symbol {
 	u8 fentry            : 1;
 	u8 profiling_func    : 1;
 	struct list_head pv_target;
+	struct list_head reloc_list;
 };
 
 struct reloc {
@@ -73,6 +74,7 @@ struct reloc {
 	};
 	struct section *sec;
 	struct symbol *sym;
+	struct list_head sym_reloc_entry;
 	unsigned long offset;
 	unsigned int type;
 	s64 addend;

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/5] objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf
  2022-10-28 19:40 ` [PATCH 3/5] objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf Peter Zijlstra
  2022-11-02  9:20   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
@ 2022-11-02 22:22   ` Josh Poimboeuf
  2022-11-03  8:46     ` Peter Zijlstra
  1 sibling, 1 reply; 17+ messages in thread
From: Josh Poimboeuf @ 2022-11-02 22:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, djwong, yujie.liu, tglx, joao.moreira, samitolvanen

On Fri, Oct 28, 2022 at 09:40:25PM +0200, Peter Zijlstra wrote:
> Due to how gelf_update_sym*() requires an Elf_Data pointer, and how
> libelf keeps Elf_Data in a linked list per section,
> elf_update_symbol() ends up having to iterate this list on each
> update to find the correct Elf_Data for the index'ed symbol.
> 
> By allocating one Elf_Data per new symbol, the list grows per new
> symbol, giving an effective O(n^2) insertion time. This is obviously
> bloody terrible.
> 
> Therefore over-allocate the Elf_Data when an extention is needed.
> Except it turns out libelf disregards Elf_Scn::sh_size in favour of
> the sum of Elf_Data::d_size. IOW it will happily write out all the
> unused space and fill it with:
> 
>   0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
> 
> entries (aka zeros). Which obviously violates the STB_LOCAL placement
> rule, and is a general pain in the backside for not being the desired
> behaviour.
> 
> Manually fix-up the Elf_Data size to avoid this problem before calling
> elf_update().
> 
> This significantly improves performance when adding a significant
> number of symbols.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Instead of going through libelf to add each symbol, and
adjusting/shifting/reallocating the d_buf one symbol at a time, it would
probably be a lot easier (and faster) to just manually do all that at
the end, just before writing the file.

See for example what kpatch does:

  https://github.com/dynup/kpatch/blob/master/kpatch-build/kpatch-elf.c#L725

-- 
Josh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/5] objtool: Add option to generate prefix symbols
  2022-10-28 19:40 ` [PATCH 4/5] objtool: Add option to generate prefix symbols Peter Zijlstra
  2022-11-02  9:20   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
@ 2022-11-02 23:41   ` Josh Poimboeuf
  1 sibling, 0 replies; 17+ messages in thread
From: Josh Poimboeuf @ 2022-11-02 23:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, djwong, yujie.liu, tglx, joao.moreira, samitolvanen

On Fri, Oct 28, 2022 at 09:40:26PM +0200, Peter Zijlstra wrote:
> When code is compiled with:
> 
>   -fpatchable-function-entry=${PADDING_BYTES},${PADDING_BYTES}
> 
> functions will have PADDING_BYTES of NOP in front of them. Unwinders
> and other things that symbolize code locations will typically
> attribute these bytes to the preceding function.
> 
> Given that these bytes nominally belong to the following symbol this
> mis-attribution is confusing.
> 
> Inspired by the fact that CFI_CLANG emits __cfi_##name symbols to
> claim these bytes, allow objtool to emit __pfx_##name symbols to do
> the same.
> 
> Therefore add the objtool --prefix=N argument, to conditionally place
> a __pfx_##name symbol at N bytes ahead of symbol 'name' when: all
> these preceding bytes are NOP and name-N is an instruction boundary.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  tools/objtool/builtin-check.c           |    1 +
>  tools/objtool/check.c                   |   27 +++++++++++++++++++++++++++
>  tools/objtool/elf.c                     |   30 ++++++++++++++++++++++++++++++
>  tools/objtool/include/objtool/builtin.h |    1 +
>  tools/objtool/include/objtool/elf.h     |    2 ++
>  5 files changed, 61 insertions(+)
> 
> --- a/tools/objtool/builtin-check.c
> +++ b/tools/objtool/builtin-check.c
> @@ -75,6 +75,7 @@ const struct option check_options[] = {
>  	OPT_BOOLEAN('r', "retpoline", &opts.retpoline, "validate and annotate retpoline usage"),
>  	OPT_BOOLEAN(0,   "rethunk", &opts.rethunk, "validate and annotate rethunk usage"),
>  	OPT_BOOLEAN(0,   "unret", &opts.unret, "validate entry unret placement"),
> +	OPT_INTEGER(0,   "prefix", &opts.prefix, "generate prefix symbols"),

"generate prefix symbols for function padding with size 'n' bytes" ?

-- 
Josh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/5] objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf
  2022-11-02 22:22   ` [PATCH 3/5] " Josh Poimboeuf
@ 2022-11-03  8:46     ` Peter Zijlstra
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Zijlstra @ 2022-11-03  8:46 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, djwong, yujie.liu, tglx, joao.moreira, samitolvanen

On Wed, Nov 02, 2022 at 03:22:22PM -0700, Josh Poimboeuf wrote:
> On Fri, Oct 28, 2022 at 09:40:25PM +0200, Peter Zijlstra wrote:
> > Due to how gelf_update_sym*() requires an Elf_Data pointer, and how
> > libelf keeps Elf_Data in a linked list per section,
> > elf_update_symbol() ends up having to iterate this list on each
> > update to find the correct Elf_Data for the index'ed symbol.
> > 
> > By allocating one Elf_Data per new symbol, the list grows per new
> > symbol, giving an effective O(n^2) insertion time. This is obviously
> > bloody terrible.
> > 
> > Therefore over-allocate the Elf_Data when an extention is needed.
> > Except it turns out libelf disregards Elf_Scn::sh_size in favour of
> > the sum of Elf_Data::d_size. IOW it will happily write out all the
> > unused space and fill it with:
> > 
> >   0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
> > 
> > entries (aka zeros). Which obviously violates the STB_LOCAL placement
> > rule, and is a general pain in the backside for not being the desired
> > behaviour.
> > 
> > Manually fix-up the Elf_Data size to avoid this problem before calling
> > elf_update().
> > 
> > This significantly improves performance when adding a significant
> > number of symbols.
> > 
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> 
> Instead of going through libelf to add each symbol, and
> adjusting/shifting/reallocating the d_buf one symbol at a time, it would
> probably be a lot easier (and faster) to just manually do all that at
> the end, just before writing the file.

Yeah, I've been >< close to doing that at least twice now. The only
consideration is that we then also must rewrite all relocs but I think
we end up doing that anyway.

I'll go do the patch to see what it looks like.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [tip: x86/core] objtool: Optimize elf_dirty_reloc_sym()
  2022-11-02 21:46 ` [PATCH 6/5] objtool: Optimize elf_dirty_reloc_sym() Peter Zijlstra
@ 2022-11-05 10:36   ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 17+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2022-11-05 10:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Borislav Petkov, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     19526717f768bf2f89ca01bd2a595728ebe57540
Gitweb:        https://git.kernel.org/tip/19526717f768bf2f89ca01bd2a595728ebe57540
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 02 Nov 2022 22:31:19 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Sat, 05 Nov 2022 11:28:02 +01:00

objtool: Optimize elf_dirty_reloc_sym()

When moving a symbol in the symtab its index changes and any reloc
referring that symtol-table-index will need to be rewritten too.

In order to facilitate this, objtool simply marks the whole reloc
section 'changed' which will cause the whole section to be
re-generated.

However, finding the relocs that use any given symbol is implemented
rather crudely -- a fully iteration of all sections and their relocs.
Given that some builds have over 20k sections (kallsyms etc..)
iterating all that for *each* symbol moved takes a bit of time.

Instead have each symbol keep a list of relocs that reference it.

This *vastly* improves build times for certain configs.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Y2LlRA7x+8UsE1xf@hirez.programming.kicks-ass.net
---
 tools/objtool/elf.c                 | 27 ++++++++++-----------------
 tools/objtool/include/objtool/elf.h |  2 ++
 2 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 3d636d1..8cd7f01 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -356,6 +356,7 @@ static void elf_add_symbol(struct elf *elf, struct symbol *sym)
 	struct rb_node *pnode;
 	struct symbol *iter;
 
+	INIT_LIST_HEAD(&sym->reloc_list);
 	INIT_LIST_HEAD(&sym->pv_target);
 	sym->alias = sym;
 
@@ -557,6 +558,7 @@ int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
 	reloc->sym = sym;
 	reloc->addend = addend;
 
+	list_add_tail(&reloc->sym_reloc_entry, &sym->reloc_list);
 	list_add_tail(&reloc->list, &sec->reloc->reloc_list);
 	elf_hash_add(reloc, &reloc->hash, reloc_hash(reloc));
 
@@ -573,21 +575,10 @@ int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
  */
 static void elf_dirty_reloc_sym(struct elf *elf, struct symbol *sym)
 {
-	struct section *sec;
-
-	list_for_each_entry(sec, &elf->sections, list) {
-		struct reloc *reloc;
-
-		if (sec->changed)
-			continue;
+	struct reloc *reloc;
 
-		list_for_each_entry(reloc, &sec->reloc_list, list) {
-			if (reloc->sym == sym) {
-				sec->changed = true;
-				break;
-			}
-		}
-	}
+	list_for_each_entry(reloc, &sym->reloc_list, sym_reloc_entry)
+		reloc->sec->changed = true;
 }
 
 /*
@@ -902,11 +893,12 @@ static int read_rela_reloc(struct section *sec, int i, struct reloc *reloc, unsi
 
 static int read_relocs(struct elf *elf)
 {
+	unsigned long nr_reloc, max_reloc = 0, tot_reloc = 0;
 	struct section *sec;
 	struct reloc *reloc;
-	int i;
 	unsigned int symndx;
-	unsigned long nr_reloc, max_reloc = 0, tot_reloc = 0;
+	struct symbol *sym;
+	int i;
 
 	if (!elf_alloc_hash(reloc, elf->text_size / 16))
 		return -1;
@@ -947,13 +939,14 @@ static int read_relocs(struct elf *elf)
 
 			reloc->sec = sec;
 			reloc->idx = i;
-			reloc->sym = find_symbol_by_index(elf, symndx);
+			reloc->sym = sym = find_symbol_by_index(elf, symndx);
 			if (!reloc->sym) {
 				WARN("can't find reloc entry symbol %d for %s",
 				     symndx, sec->name);
 				return -1;
 			}
 
+			list_add_tail(&reloc->sym_reloc_entry, &sym->reloc_list);
 			list_add_tail(&reloc->list, &sec->reloc_list);
 			elf_hash_add(reloc, &reloc->hash, reloc_hash(reloc));
 
diff --git a/tools/objtool/include/objtool/elf.h b/tools/objtool/include/objtool/elf.h
index b6974e3..bca719b 100644
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -62,6 +62,7 @@ struct symbol {
 	u8 fentry            : 1;
 	u8 profiling_func    : 1;
 	struct list_head pv_target;
+	struct list_head reloc_list;
 };
 
 struct reloc {
@@ -73,6 +74,7 @@ struct reloc {
 	};
 	struct section *sec;
 	struct symbol *sym;
+	struct list_head sym_reloc_entry;
 	unsigned long offset;
 	unsigned int type;
 	s64 addend;

^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-11-05 10:36 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-28 19:40 [PATCH 0/5] x86/kallsyms: Fix the call-thunk kallsyms fail Peter Zijlstra
2022-10-28 19:40 ` [PATCH 1/5] kallsyms: Revert "Take callthunks into account" Peter Zijlstra
2022-10-28 19:40 ` [PATCH 2/5] objtool: Slice up elf_create_section_symbol() Peter Zijlstra
2022-11-02  9:20   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2022-10-28 19:40 ` [PATCH 3/5] objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf Peter Zijlstra
2022-11-02  9:20   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2022-11-02 22:22   ` [PATCH 3/5] " Josh Poimboeuf
2022-11-03  8:46     ` Peter Zijlstra
2022-10-28 19:40 ` [PATCH 4/5] objtool: Add option to generate prefix symbols Peter Zijlstra
2022-11-02  9:20   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2022-11-02 23:41   ` [PATCH 4/5] " Josh Poimboeuf
2022-10-28 19:40 ` [PATCH 5/5] x86: Add prefix symbols for function padding Peter Zijlstra
2022-11-02  9:20   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra
2022-10-30  9:15 ` [PATCH 0/5] x86/kallsyms: Fix the call-thunk kallsyms fail Peter Zijlstra
2022-10-31  6:18   ` Yujie Liu
2022-11-02 21:46 ` [PATCH 6/5] objtool: Optimize elf_dirty_reloc_sym() Peter Zijlstra
2022-11-05 10:36   ` [tip: x86/core] " tip-bot2 for Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.