All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] ftrace/recordmcount: Handle object files without section symbols
@ 2022-04-27  9:31 ` Naveen N. Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Naveen N. Rao @ 2022-04-27  9:31 UTC (permalink / raw)
  To: Michael Ellerman, Steven Rostedt
  Cc: Nathan Chancellor, Nick Desaulniers, linux-kernel, linuxppc-dev, llvm

This solves a build issue on powerpc with binutils v2.36 and newer [1].
Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
symbols") [2], binutils started dropping section symbols that it thought
were unused. Due to this, in certain scenarios, recordmcount is unable 
to find a non-weak symbol to generate a relocation record against.

Clang integrated assembler is also aggressive in dropping section 
symbols [3].

In the past, there have been various workarounds to address this. See 
commits 55d5b7dd6451b5 ("initramfs: fix clang build failure") and 
6e7b64b9dd6d96 ("elfcore: fix building with clang") and a recent patch:
https://lore.kernel.org/linuxppc-dev/20220425174128.11455-1-naveen.n.rao@linux.vnet.ibm.com/T/#u

Fix this issue by using the weak symbol in the relocation record. This 
can result in duplicate locations in the mcount table if those weak 
functions are overridden, so have ftrace skip dupicate entries.

Objtool already follows this approach, so patch 2 updates recordmcount 
to do the same. Patch 1 updates ftrace to skip duplicate entries.

- Naveen


[1] https://github.com/linuxppc/issues/issues/388
[2] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
[3] https://github.com/ClangBuiltLinux/linux/issues/981


Naveen N. Rao (2):
  ftrace: Drop duplicate mcount locations
  recordmcount: Handle sections with no non-weak symbols

 kernel/trace/ftrace.c  | 13 ++++++-
 scripts/recordmcount.h | 86 +++++++++++++++++++++++++++++++++++-------
 2 files changed, 85 insertions(+), 14 deletions(-)


base-commit: 83d8a0d166119de813cad27ae7d61f54f9aea707
-- 
2.35.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 0/2] ftrace/recordmcount: Handle object files without section symbols
@ 2022-04-27  9:31 ` Naveen N. Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Naveen N. Rao @ 2022-04-27  9:31 UTC (permalink / raw)
  To: Michael Ellerman, Steven Rostedt
  Cc: Nathan Chancellor, linuxppc-dev, llvm, Nick Desaulniers, linux-kernel

This solves a build issue on powerpc with binutils v2.36 and newer [1].
Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
symbols") [2], binutils started dropping section symbols that it thought
were unused. Due to this, in certain scenarios, recordmcount is unable 
to find a non-weak symbol to generate a relocation record against.

Clang integrated assembler is also aggressive in dropping section 
symbols [3].

In the past, there have been various workarounds to address this. See 
commits 55d5b7dd6451b5 ("initramfs: fix clang build failure") and 
6e7b64b9dd6d96 ("elfcore: fix building with clang") and a recent patch:
https://lore.kernel.org/linuxppc-dev/20220425174128.11455-1-naveen.n.rao@linux.vnet.ibm.com/T/#u

Fix this issue by using the weak symbol in the relocation record. This 
can result in duplicate locations in the mcount table if those weak 
functions are overridden, so have ftrace skip dupicate entries.

Objtool already follows this approach, so patch 2 updates recordmcount 
to do the same. Patch 1 updates ftrace to skip duplicate entries.

- Naveen


[1] https://github.com/linuxppc/issues/issues/388
[2] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
[3] https://github.com/ClangBuiltLinux/linux/issues/981


Naveen N. Rao (2):
  ftrace: Drop duplicate mcount locations
  recordmcount: Handle sections with no non-weak symbols

 kernel/trace/ftrace.c  | 13 ++++++-
 scripts/recordmcount.h | 86 +++++++++++++++++++++++++++++++++++-------
 2 files changed, 85 insertions(+), 14 deletions(-)


base-commit: 83d8a0d166119de813cad27ae7d61f54f9aea707
-- 
2.35.1


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 1/2] ftrace: Drop duplicate mcount locations
  2022-04-27  9:31 ` Naveen N. Rao
@ 2022-04-27  9:31   ` Naveen N. Rao
  -1 siblings, 0 replies; 36+ messages in thread
From: Naveen N. Rao @ 2022-04-27  9:31 UTC (permalink / raw)
  To: Michael Ellerman, Steven Rostedt
  Cc: Nathan Chancellor, Nick Desaulniers, linux-kernel, linuxppc-dev, llvm

In the absence of section symbols [1], objtool (today) and recordmcount
(with a subsequent patch) generate __mcount_loc relocation records with
weak symbols as the base. This works fine as long as those weak symbols
are not overridden, but if they are, these can result in duplicate
entries in the final vmlinux mcount location table. This will cause
ftrace to fail when trying to patch the same location twice. Fix this by
dropping duplicate locations during ftrace init.

[1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 kernel/trace/ftrace.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 4f1d2f5e726341..8bc4f282bb3ff4 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -6496,7 +6496,7 @@ static int ftrace_process_locs(struct module *mod,
 	struct dyn_ftrace *rec;
 	unsigned long count;
 	unsigned long *p;
-	unsigned long addr;
+	unsigned long addr, prev_addr = 0;
 	unsigned long flags = 0; /* Shut up gcc */
 	int ret = -ENOMEM;
 
@@ -6550,6 +6550,16 @@ static int ftrace_process_locs(struct module *mod,
 	while (p < end) {
 		unsigned long end_offset;
 		addr = ftrace_call_adjust(*p++);
+
+		/*
+		 * Drop duplicate entries, which can happen when weak
+		 * functions are overridden, and __mcount_loc relocation
+		 * records were generated against function names due to
+		 * absence of non-weak section symbols
+		 */
+		if (addr == prev_addr)
+			addr = 0;
+
 		/*
 		 * Some architecture linkers will pad between
 		 * the different mcount_loc sections of different
@@ -6569,6 +6579,7 @@ static int ftrace_process_locs(struct module *mod,
 
 		rec = &pg->records[pg->index++];
 		rec->ip = addr;
+		prev_addr = addr;
 	}
 
 	/* We should have used all pages */
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 1/2] ftrace: Drop duplicate mcount locations
@ 2022-04-27  9:31   ` Naveen N. Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Naveen N. Rao @ 2022-04-27  9:31 UTC (permalink / raw)
  To: Michael Ellerman, Steven Rostedt
  Cc: Nathan Chancellor, linuxppc-dev, llvm, Nick Desaulniers, linux-kernel

In the absence of section symbols [1], objtool (today) and recordmcount
(with a subsequent patch) generate __mcount_loc relocation records with
weak symbols as the base. This works fine as long as those weak symbols
are not overridden, but if they are, these can result in duplicate
entries in the final vmlinux mcount location table. This will cause
ftrace to fail when trying to patch the same location twice. Fix this by
dropping duplicate locations during ftrace init.

[1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 kernel/trace/ftrace.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 4f1d2f5e726341..8bc4f282bb3ff4 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -6496,7 +6496,7 @@ static int ftrace_process_locs(struct module *mod,
 	struct dyn_ftrace *rec;
 	unsigned long count;
 	unsigned long *p;
-	unsigned long addr;
+	unsigned long addr, prev_addr = 0;
 	unsigned long flags = 0; /* Shut up gcc */
 	int ret = -ENOMEM;
 
@@ -6550,6 +6550,16 @@ static int ftrace_process_locs(struct module *mod,
 	while (p < end) {
 		unsigned long end_offset;
 		addr = ftrace_call_adjust(*p++);
+
+		/*
+		 * Drop duplicate entries, which can happen when weak
+		 * functions are overridden, and __mcount_loc relocation
+		 * records were generated against function names due to
+		 * absence of non-weak section symbols
+		 */
+		if (addr == prev_addr)
+			addr = 0;
+
 		/*
 		 * Some architecture linkers will pad between
 		 * the different mcount_loc sections of different
@@ -6569,6 +6579,7 @@ static int ftrace_process_locs(struct module *mod,
 
 		rec = &pg->records[pg->index++];
 		rec->ip = addr;
+		prev_addr = addr;
 	}
 
 	/* We should have used all pages */
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
  2022-04-27  9:31 ` Naveen N. Rao
@ 2022-04-27  9:31   ` Naveen N. Rao
  -1 siblings, 0 replies; 36+ messages in thread
From: Naveen N. Rao @ 2022-04-27  9:31 UTC (permalink / raw)
  To: Michael Ellerman, Steven Rostedt
  Cc: Nathan Chancellor, Nick Desaulniers, linux-kernel, linuxppc-dev, llvm

Kernel builds on powerpc are failing with the below error [1]:
      CC      kernel/kexec_file.o
    Cannot find symbol for section 9: .text.unlikely.
    kernel/kexec_file.o: failed

Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
symbols") [2], binutils started dropping section symbols that it thought
were unused.  This isn't an issue in general, but with kexec_file.c, gcc
is placing kexec_arch_apply_relocations[_add] into a separate
.text.unlikely section and the section symbol ".text.unlikely" is being
dropped. Due to this, recordmcount is unable to find a non-weak symbol
in .text.unlikely to generate a relocation record against.

Handle this by falling back to a weak symbol, similar to what objtool
does in commit 44f6a7c0755d8d ("objtool: Fix seg fault with Clang
non-section symbols"). Note that this approach can result in duplicate
addresses in the final vmlinux mcount location table. A previous commit
updated ftrace to skip such duplicate entries.

After this commit, relocation records for __mcount_loc for kexec_file.o
now include two entries with the weak functions
arch_kexec_apply_relocations() and arch_kexec_apply_relocation_add() as
the relocation bases:

  ...
  0000000000000080 R_PPC64_ADDR64    .text+0x0000000000001d34
  0000000000000088 R_PPC64_ADDR64    .text+0x0000000000001fec
  0000000000000090 R_PPC64_ADDR64    arch_kexec_apply_relocations_add+0x000000000000000c
  0000000000000098 R_PPC64_ADDR64    arch_kexec_apply_relocations+0x000000000000000c

Powerpc does not override these functions today, so these get converted
to correct offsets in the mcount location table in vmlinux.

If one or both of these weak functions are overridden in future, in the
final vmlinux mcount table, references to these will change over to the
non-weak variant which has its own mcount location entry. As such, there
will now be two entries for these functions, both pointing to the same
non-weak location. We don't need the non-weak locations since they will
never be executed. Ftrace skips the duplicate entries due to a previous
commit.

[1] https://github.com/linuxppc/issues/issues/388
[2] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 scripts/recordmcount.h | 86 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 73 insertions(+), 13 deletions(-)

diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
index 1e9baa5c4fc6ef..0c79a2d2b47493 100644
--- a/scripts/recordmcount.h
+++ b/scripts/recordmcount.h
@@ -25,6 +25,7 @@
 #undef sift_rel_mcount
 #undef nop_mcount
 #undef find_secsym_ndx
+#undef find_sym_ndx
 #undef __has_rel_mcount
 #undef has_rel_mcount
 #undef tot_relsize
@@ -60,6 +61,7 @@
 # define sift_rel_mcount	sift64_rel_mcount
 # define nop_mcount		nop_mcount_64
 # define find_secsym_ndx	find64_secsym_ndx
+# define find_sym_ndx		find64_sym_ndx
 # define __has_rel_mcount	__has64_rel_mcount
 # define has_rel_mcount		has64_rel_mcount
 # define tot_relsize		tot64_relsize
@@ -98,6 +100,7 @@
 # define sift_rel_mcount	sift32_rel_mcount
 # define nop_mcount		nop_mcount_32
 # define find_secsym_ndx	find32_secsym_ndx
+# define find_sym_ndx		find32_sym_ndx
 # define __has_rel_mcount	__has32_rel_mcount
 # define has_rel_mcount		has32_rel_mcount
 # define tot_relsize		tot32_relsize
@@ -392,6 +395,51 @@ static void get_sym_str_and_relp(Elf_Shdr const *const relhdr,
 	*relp = rel0;
 }
 
+/*
+ * Find a symbol in the given section containing the instruction offset passed
+ * in r_offset, to be used in generating the relocation record for the mcount
+ * location. This is used if there were no local/global symbols in the given
+ * section to be used as the base. Weak symbols are ok, and are expected to
+ * result in duplicate mcount locations which get dropped by ftrace.
+ */
+static int find_sym_ndx(unsigned const txtndx,
+			 char const *const txtname,
+			 uint_t *const recvalp,
+			 unsigned int *sym_index,
+			 Elf_Shdr const *const symhdr,
+			 Elf32_Word const *symtab,
+			 Elf32_Word const *symtab_shndx,
+			 unsigned const r_offset,
+			 Elf_Ehdr const *const ehdr)
+{
+	Elf_Sym const *const sym0 = (Elf_Sym const *)(_w(symhdr->sh_offset)
+		+ (void *)ehdr);
+	unsigned const nsym = _w(symhdr->sh_size) / _w(symhdr->sh_entsize);
+	Elf_Sym const *symp;
+	unsigned t;
+
+	for (symp = sym0, t = nsym; t; --t, ++symp) {
+		if (txtndx == get_symindex(symp, symtab, symtab_shndx)) {
+			/* function symbols on ARM have quirks, avoid them */
+			if (w2(ehdr->e_machine) == EM_ARM &&
+			    ELF_ST_TYPE(symp->st_info) == STT_FUNC)
+				continue;
+
+			if (r_offset >= _w(symp->st_value) &&
+			    r_offset < _w(symp->st_value) + _w(symp->st_size)) {
+				*recvalp = _w(symp->st_value);
+				*sym_index = symp - sym0;
+				return 0;
+			}
+		}
+	}
+
+	fprintf(stderr, "Cannot find symbol containing offset %u for section %u: %s.\n",
+		r_offset, txtndx, txtname);
+
+	return -1;
+}
+
 /*
  * Look at the relocations in order to find the calls to mcount.
  * Accumulate the section offsets that are found, and their relocation info,
@@ -402,9 +450,14 @@ static uint_t *sift_rel_mcount(uint_t *mlocp,
 			       Elf_Rel **const mrelpp,
 			       Elf_Shdr const *const relhdr,
 			       Elf_Ehdr const *const ehdr,
-			       unsigned const recsym,
-			       uint_t const recval,
-			       unsigned const reltype)
+			       unsigned recsym,
+			       uint_t recval,
+			       unsigned const reltype,
+			       unsigned int no_secsym,
+			       char const *const txtname,
+			       Elf_Shdr const *const shdr0,
+			       Elf32_Word const *symtab,
+			       Elf32_Word const *symtab_shndx)
 {
 	uint_t *const mloc0 = mlocp;
 	Elf_Rel *mrelp = *mrelpp;
@@ -415,6 +468,7 @@ static uint_t *sift_rel_mcount(uint_t *mlocp,
 	unsigned const nrel = _w(relhdr->sh_size) / rel_entsize;
 	unsigned mcountsym = 0;
 	unsigned t;
+	uint_t addend;
 
 	get_sym_str_and_relp(relhdr, ehdr, &sym0, &str0, &relp);
 
@@ -424,8 +478,13 @@ static uint_t *sift_rel_mcount(uint_t *mlocp,
 
 		if (mcountsym && mcountsym == Elf_r_sym(relp) &&
 				!is_fake_mcount(relp)) {
-			uint_t const addend =
-				_w(_w(relp->r_offset) - recval + mcount_adjust);
+			if (no_secsym && find_sym_ndx(w(relhdr->sh_info),
+						      txtname, &recval, &recsym,
+						      &shdr0[w(relhdr->sh_link)],
+						      symtab, symtab_shndx,
+						      _w(relp->r_offset), ehdr))
+				return 0;
+			addend = _w(_w(relp->r_offset) - recval + mcount_adjust);
 			mrelp->r_offset = _w(offbase
 				+ ((void *)mlocp - (void *)mloc0));
 			Elf_r_info(mrelp, recsym, reltype);
@@ -544,8 +603,6 @@ static int find_secsym_ndx(unsigned const txtndx,
 			return 0;
 		}
 	}
-	fprintf(stderr, "Cannot find symbol for section %u: %s.\n",
-		txtndx, txtname);
 	return -1;
 }
 
@@ -660,22 +717,25 @@ static int do_func(Elf_Ehdr *const ehdr, char const *const fname,
 			goto out; /* Nothing to be done; don't append! */
 		}
 		if (txtname && is_mcounted_section_name(txtname)) {
-			unsigned int recsym;
+			unsigned int recsym = 0, no_secsym = 0;
 			uint_t recval = 0;
 
 			symsec_sh_link = w(relhdr->sh_link);
-			result = find_secsym_ndx(w(relhdr->sh_info), txtname,
+			if (find_secsym_ndx(w(relhdr->sh_info), txtname,
 						&recval, &recsym,
 						&shdr0[symsec_sh_link],
 						symtab, symtab_shndx,
-						ehdr);
-			if (result)
-				goto out;
+						ehdr))
+				no_secsym = 1;
 
 			rel_entsize = _w(relhdr->sh_entsize);
 			mlocp = sift_rel_mcount(mlocp,
 				(void *)mlocp - (void *)mloc0, &mrelp,
-				relhdr, ehdr, recsym, recval, reltype);
+				relhdr, ehdr, recsym, recval, reltype,
+				no_secsym, txtname, shdr0, symtab,
+				symtab_shndx);
+			if (!mlocp)
+				goto out;
 		} else if (txtname && (warn_on_notrace_sect || make_nop)) {
 			/*
 			 * This section is ignored by ftrace, but still
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
@ 2022-04-27  9:31   ` Naveen N. Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Naveen N. Rao @ 2022-04-27  9:31 UTC (permalink / raw)
  To: Michael Ellerman, Steven Rostedt
  Cc: Nathan Chancellor, linuxppc-dev, llvm, Nick Desaulniers, linux-kernel

Kernel builds on powerpc are failing with the below error [1]:
      CC      kernel/kexec_file.o
    Cannot find symbol for section 9: .text.unlikely.
    kernel/kexec_file.o: failed

Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
symbols") [2], binutils started dropping section symbols that it thought
were unused.  This isn't an issue in general, but with kexec_file.c, gcc
is placing kexec_arch_apply_relocations[_add] into a separate
.text.unlikely section and the section symbol ".text.unlikely" is being
dropped. Due to this, recordmcount is unable to find a non-weak symbol
in .text.unlikely to generate a relocation record against.

Handle this by falling back to a weak symbol, similar to what objtool
does in commit 44f6a7c0755d8d ("objtool: Fix seg fault with Clang
non-section symbols"). Note that this approach can result in duplicate
addresses in the final vmlinux mcount location table. A previous commit
updated ftrace to skip such duplicate entries.

After this commit, relocation records for __mcount_loc for kexec_file.o
now include two entries with the weak functions
arch_kexec_apply_relocations() and arch_kexec_apply_relocation_add() as
the relocation bases:

  ...
  0000000000000080 R_PPC64_ADDR64    .text+0x0000000000001d34
  0000000000000088 R_PPC64_ADDR64    .text+0x0000000000001fec
  0000000000000090 R_PPC64_ADDR64    arch_kexec_apply_relocations_add+0x000000000000000c
  0000000000000098 R_PPC64_ADDR64    arch_kexec_apply_relocations+0x000000000000000c

Powerpc does not override these functions today, so these get converted
to correct offsets in the mcount location table in vmlinux.

If one or both of these weak functions are overridden in future, in the
final vmlinux mcount table, references to these will change over to the
non-weak variant which has its own mcount location entry. As such, there
will now be two entries for these functions, both pointing to the same
non-weak location. We don't need the non-weak locations since they will
never be executed. Ftrace skips the duplicate entries due to a previous
commit.

[1] https://github.com/linuxppc/issues/issues/388
[2] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 scripts/recordmcount.h | 86 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 73 insertions(+), 13 deletions(-)

diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
index 1e9baa5c4fc6ef..0c79a2d2b47493 100644
--- a/scripts/recordmcount.h
+++ b/scripts/recordmcount.h
@@ -25,6 +25,7 @@
 #undef sift_rel_mcount
 #undef nop_mcount
 #undef find_secsym_ndx
+#undef find_sym_ndx
 #undef __has_rel_mcount
 #undef has_rel_mcount
 #undef tot_relsize
@@ -60,6 +61,7 @@
 # define sift_rel_mcount	sift64_rel_mcount
 # define nop_mcount		nop_mcount_64
 # define find_secsym_ndx	find64_secsym_ndx
+# define find_sym_ndx		find64_sym_ndx
 # define __has_rel_mcount	__has64_rel_mcount
 # define has_rel_mcount		has64_rel_mcount
 # define tot_relsize		tot64_relsize
@@ -98,6 +100,7 @@
 # define sift_rel_mcount	sift32_rel_mcount
 # define nop_mcount		nop_mcount_32
 # define find_secsym_ndx	find32_secsym_ndx
+# define find_sym_ndx		find32_sym_ndx
 # define __has_rel_mcount	__has32_rel_mcount
 # define has_rel_mcount		has32_rel_mcount
 # define tot_relsize		tot32_relsize
@@ -392,6 +395,51 @@ static void get_sym_str_and_relp(Elf_Shdr const *const relhdr,
 	*relp = rel0;
 }
 
+/*
+ * Find a symbol in the given section containing the instruction offset passed
+ * in r_offset, to be used in generating the relocation record for the mcount
+ * location. This is used if there were no local/global symbols in the given
+ * section to be used as the base. Weak symbols are ok, and are expected to
+ * result in duplicate mcount locations which get dropped by ftrace.
+ */
+static int find_sym_ndx(unsigned const txtndx,
+			 char const *const txtname,
+			 uint_t *const recvalp,
+			 unsigned int *sym_index,
+			 Elf_Shdr const *const symhdr,
+			 Elf32_Word const *symtab,
+			 Elf32_Word const *symtab_shndx,
+			 unsigned const r_offset,
+			 Elf_Ehdr const *const ehdr)
+{
+	Elf_Sym const *const sym0 = (Elf_Sym const *)(_w(symhdr->sh_offset)
+		+ (void *)ehdr);
+	unsigned const nsym = _w(symhdr->sh_size) / _w(symhdr->sh_entsize);
+	Elf_Sym const *symp;
+	unsigned t;
+
+	for (symp = sym0, t = nsym; t; --t, ++symp) {
+		if (txtndx == get_symindex(symp, symtab, symtab_shndx)) {
+			/* function symbols on ARM have quirks, avoid them */
+			if (w2(ehdr->e_machine) == EM_ARM &&
+			    ELF_ST_TYPE(symp->st_info) == STT_FUNC)
+				continue;
+
+			if (r_offset >= _w(symp->st_value) &&
+			    r_offset < _w(symp->st_value) + _w(symp->st_size)) {
+				*recvalp = _w(symp->st_value);
+				*sym_index = symp - sym0;
+				return 0;
+			}
+		}
+	}
+
+	fprintf(stderr, "Cannot find symbol containing offset %u for section %u: %s.\n",
+		r_offset, txtndx, txtname);
+
+	return -1;
+}
+
 /*
  * Look at the relocations in order to find the calls to mcount.
  * Accumulate the section offsets that are found, and their relocation info,
@@ -402,9 +450,14 @@ static uint_t *sift_rel_mcount(uint_t *mlocp,
 			       Elf_Rel **const mrelpp,
 			       Elf_Shdr const *const relhdr,
 			       Elf_Ehdr const *const ehdr,
-			       unsigned const recsym,
-			       uint_t const recval,
-			       unsigned const reltype)
+			       unsigned recsym,
+			       uint_t recval,
+			       unsigned const reltype,
+			       unsigned int no_secsym,
+			       char const *const txtname,
+			       Elf_Shdr const *const shdr0,
+			       Elf32_Word const *symtab,
+			       Elf32_Word const *symtab_shndx)
 {
 	uint_t *const mloc0 = mlocp;
 	Elf_Rel *mrelp = *mrelpp;
@@ -415,6 +468,7 @@ static uint_t *sift_rel_mcount(uint_t *mlocp,
 	unsigned const nrel = _w(relhdr->sh_size) / rel_entsize;
 	unsigned mcountsym = 0;
 	unsigned t;
+	uint_t addend;
 
 	get_sym_str_and_relp(relhdr, ehdr, &sym0, &str0, &relp);
 
@@ -424,8 +478,13 @@ static uint_t *sift_rel_mcount(uint_t *mlocp,
 
 		if (mcountsym && mcountsym == Elf_r_sym(relp) &&
 				!is_fake_mcount(relp)) {
-			uint_t const addend =
-				_w(_w(relp->r_offset) - recval + mcount_adjust);
+			if (no_secsym && find_sym_ndx(w(relhdr->sh_info),
+						      txtname, &recval, &recsym,
+						      &shdr0[w(relhdr->sh_link)],
+						      symtab, symtab_shndx,
+						      _w(relp->r_offset), ehdr))
+				return 0;
+			addend = _w(_w(relp->r_offset) - recval + mcount_adjust);
 			mrelp->r_offset = _w(offbase
 				+ ((void *)mlocp - (void *)mloc0));
 			Elf_r_info(mrelp, recsym, reltype);
@@ -544,8 +603,6 @@ static int find_secsym_ndx(unsigned const txtndx,
 			return 0;
 		}
 	}
-	fprintf(stderr, "Cannot find symbol for section %u: %s.\n",
-		txtndx, txtname);
 	return -1;
 }
 
@@ -660,22 +717,25 @@ static int do_func(Elf_Ehdr *const ehdr, char const *const fname,
 			goto out; /* Nothing to be done; don't append! */
 		}
 		if (txtname && is_mcounted_section_name(txtname)) {
-			unsigned int recsym;
+			unsigned int recsym = 0, no_secsym = 0;
 			uint_t recval = 0;
 
 			symsec_sh_link = w(relhdr->sh_link);
-			result = find_secsym_ndx(w(relhdr->sh_info), txtname,
+			if (find_secsym_ndx(w(relhdr->sh_info), txtname,
 						&recval, &recsym,
 						&shdr0[symsec_sh_link],
 						symtab, symtab_shndx,
-						ehdr);
-			if (result)
-				goto out;
+						ehdr))
+				no_secsym = 1;
 
 			rel_entsize = _w(relhdr->sh_entsize);
 			mlocp = sift_rel_mcount(mlocp,
 				(void *)mlocp - (void *)mloc0, &mrelp,
-				relhdr, ehdr, recsym, recval, reltype);
+				relhdr, ehdr, recsym, recval, reltype,
+				no_secsym, txtname, shdr0, symtab,
+				symtab_shndx);
+			if (!mlocp)
+				goto out;
 		} else if (txtname && (warn_on_notrace_sect || make_nop)) {
 			/*
 			 * This section is ignored by ftrace, but still
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/2] ftrace: Drop duplicate mcount locations
  2022-04-27  9:31   ` Naveen N. Rao
@ 2022-04-27 13:46     ` Steven Rostedt
  -1 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-04-27 13:46 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: Michael Ellerman, Nathan Chancellor, Nick Desaulniers,
	linux-kernel, linuxppc-dev, llvm

On Wed, 27 Apr 2022 15:01:21 +0530
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:

> In the absence of section symbols [1], objtool (today) and recordmcount
> (with a subsequent patch) generate __mcount_loc relocation records with
> weak symbols as the base. This works fine as long as those weak symbols
> are not overridden, but if they are, these can result in duplicate
> entries in the final vmlinux mcount location table. This will cause
> ftrace to fail when trying to patch the same location twice. Fix this by
> dropping duplicate locations during ftrace init.
> 
> [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
> 
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> ---
>  kernel/trace/ftrace.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 4f1d2f5e726341..8bc4f282bb3ff4 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -6496,7 +6496,7 @@ static int ftrace_process_locs(struct module *mod,
>  	struct dyn_ftrace *rec;
>  	unsigned long count;
>  	unsigned long *p;
> -	unsigned long addr;
> +	unsigned long addr, prev_addr = 0;
>  	unsigned long flags = 0; /* Shut up gcc */
>  	int ret = -ENOMEM;
>  
> @@ -6550,6 +6550,16 @@ static int ftrace_process_locs(struct module *mod,
>  	while (p < end) {
>  		unsigned long end_offset;
>  		addr = ftrace_call_adjust(*p++);
> +
> +		/*
> +		 * Drop duplicate entries, which can happen when weak
> +		 * functions are overridden, and __mcount_loc relocation
> +		 * records were generated against function names due to
> +		 * absence of non-weak section symbols
> +		 */
> +		if (addr == prev_addr)
> +			addr = 0;

Please don't use the side effect of addr == 0 causing the loop to continue
for this logic. The two are not related. Simply call continue.

		if (addr == prev_addr)
			continue;


-- Steve


> +
>  		/*
>  		 * Some architecture linkers will pad between
>  		 * the different mcount_loc sections of different
> @@ -6569,6 +6579,7 @@ static int ftrace_process_locs(struct module *mod,
>  
>  		rec = &pg->records[pg->index++];
>  		rec->ip = addr;
> +		prev_addr = addr;
>  	}
>  
>  	/* We should have used all pages */


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/2] ftrace: Drop duplicate mcount locations
@ 2022-04-27 13:46     ` Steven Rostedt
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-04-27 13:46 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: llvm, Nick Desaulniers, linux-kernel, Nathan Chancellor, linuxppc-dev

On Wed, 27 Apr 2022 15:01:21 +0530
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:

> In the absence of section symbols [1], objtool (today) and recordmcount
> (with a subsequent patch) generate __mcount_loc relocation records with
> weak symbols as the base. This works fine as long as those weak symbols
> are not overridden, but if they are, these can result in duplicate
> entries in the final vmlinux mcount location table. This will cause
> ftrace to fail when trying to patch the same location twice. Fix this by
> dropping duplicate locations during ftrace init.
> 
> [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
> 
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> ---
>  kernel/trace/ftrace.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 4f1d2f5e726341..8bc4f282bb3ff4 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -6496,7 +6496,7 @@ static int ftrace_process_locs(struct module *mod,
>  	struct dyn_ftrace *rec;
>  	unsigned long count;
>  	unsigned long *p;
> -	unsigned long addr;
> +	unsigned long addr, prev_addr = 0;
>  	unsigned long flags = 0; /* Shut up gcc */
>  	int ret = -ENOMEM;
>  
> @@ -6550,6 +6550,16 @@ static int ftrace_process_locs(struct module *mod,
>  	while (p < end) {
>  		unsigned long end_offset;
>  		addr = ftrace_call_adjust(*p++);
> +
> +		/*
> +		 * Drop duplicate entries, which can happen when weak
> +		 * functions are overridden, and __mcount_loc relocation
> +		 * records were generated against function names due to
> +		 * absence of non-weak section symbols
> +		 */
> +		if (addr == prev_addr)
> +			addr = 0;

Please don't use the side effect of addr == 0 causing the loop to continue
for this logic. The two are not related. Simply call continue.

		if (addr == prev_addr)
			continue;


-- Steve


> +
>  		/*
>  		 * Some architecture linkers will pad between
>  		 * the different mcount_loc sections of different
> @@ -6569,6 +6579,7 @@ static int ftrace_process_locs(struct module *mod,
>  
>  		rec = &pg->records[pg->index++];
>  		rec->ip = addr;
> +		prev_addr = addr;
>  	}
>  
>  	/* We should have used all pages */


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
  2022-04-27  9:31   ` Naveen N. Rao
@ 2022-04-27 13:54     ` Steven Rostedt
  -1 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-04-27 13:54 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: Michael Ellerman, Nathan Chancellor, Nick Desaulniers,
	linux-kernel, linuxppc-dev, llvm

On Wed, 27 Apr 2022 15:01:22 +0530
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:

> If one or both of these weak functions are overridden in future, in the
> final vmlinux mcount table, references to these will change over to the
> non-weak variant which has its own mcount location entry. As such, there
> will now be two entries for these functions, both pointing to the same
> non-weak location.

But is that really true in all cases? x86 uses fentry these days, and other
archs do things differently too. But the original mcount (-pg) call
happened *after* the frame setup. That means the offset of the mcount call
would be at different offsets wrt the start of the function. If you have
one of these architectures that still use mcount, and the weak function
doesn't have the same size frame setup as the overriding function, then the
addresses will not be the same.

-- Steve


> We don't need the non-weak locations since they will
> never be executed. Ftrace skips the duplicate entries due to a previous
> commit.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
@ 2022-04-27 13:54     ` Steven Rostedt
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-04-27 13:54 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: llvm, Nick Desaulniers, linux-kernel, Nathan Chancellor, linuxppc-dev

On Wed, 27 Apr 2022 15:01:22 +0530
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:

> If one or both of these weak functions are overridden in future, in the
> final vmlinux mcount table, references to these will change over to the
> non-weak variant which has its own mcount location entry. As such, there
> will now be two entries for these functions, both pointing to the same
> non-weak location.

But is that really true in all cases? x86 uses fentry these days, and other
archs do things differently too. But the original mcount (-pg) call
happened *after* the frame setup. That means the offset of the mcount call
would be at different offsets wrt the start of the function. If you have
one of these architectures that still use mcount, and the weak function
doesn't have the same size frame setup as the overriding function, then the
addresses will not be the same.

-- Steve


> We don't need the non-weak locations since they will
> never be executed. Ftrace skips the duplicate entries due to a previous
> commit.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
  2022-04-27 13:54     ` Steven Rostedt
@ 2022-04-28  7:45       ` Naveen N. Rao
  -1 siblings, 0 replies; 36+ messages in thread
From: Naveen N. Rao @ 2022-04-28  7:45 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linuxppc-dev, llvm, Michael Ellerman,
	Nathan Chancellor, Nick Desaulniers

Steven Rostedt wrote:
> On Wed, 27 Apr 2022 15:01:22 +0530
> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
> 
>> If one or both of these weak functions are overridden in future, in the
>> final vmlinux mcount table, references to these will change over to the
>> non-weak variant which has its own mcount location entry. As such, there
>> will now be two entries for these functions, both pointing to the same
>> non-weak location.
> 
> But is that really true in all cases? x86 uses fentry these days, and other
> archs do things differently too. But the original mcount (-pg) call
> happened *after* the frame setup. That means the offset of the mcount call
> would be at different offsets wrt the start of the function. If you have
> one of these architectures that still use mcount, and the weak function
> doesn't have the same size frame setup as the overriding function, then the
> addresses will not be the same.

Indeed, plain old -pg will be a problem. I'm not sure there is a generic 
way to address this. I suppose architectures will have to validate the 
mcount locations, something like this?

diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
index d83758acd1c7c3..d8b104ed2fdf38 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -12,13 +12,7 @@
 
 #ifndef __ASSEMBLY__
 extern void _mcount(void);
-
-static inline unsigned long ftrace_call_adjust(unsigned long addr)
-{
-       /* relocation of mcount call site is the same as the address */
-       return addr;
-}
-
+unsigned long ftrace_call_adjust(unsigned long addr);
 unsigned long prepare_ftrace_return(unsigned long parent, unsigned long ip,
                                    unsigned long sp);
 
diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
index 4ee04aacf9f13c..976c08cd0573f7 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -858,6 +858,17 @@ void arch_ftrace_update_code(int command)
        ftrace_modify_all_code(command);
 }
 
+unsigned long ftrace_call_adjust(unsigned long addr)
+{
+       ppc_inst_t op = ppc_inst_read((u32 *)addr);
+
+       if (!is_bl_op(op))
+               return 0;
+
+       /* relocation of mcount call site is the same as the address */
+       return addr;
+}
+
 #ifdef CONFIG_PPC64
 #define PACATOC offsetof(struct paca_struct, kernel_toc)
 

We can tighten those checks as necessary, but it will be upto the 
architectures to validate the mcount locations. This all will have to be 
opt-in so that only architectures doing necessary validation will allow 
mcount relocations against weak symbols.


- Naveen

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
@ 2022-04-28  7:45       ` Naveen N. Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Naveen N. Rao @ 2022-04-28  7:45 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: llvm, Nick Desaulniers, linux-kernel, Nathan Chancellor, linuxppc-dev

Steven Rostedt wrote:
> On Wed, 27 Apr 2022 15:01:22 +0530
> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
> 
>> If one or both of these weak functions are overridden in future, in the
>> final vmlinux mcount table, references to these will change over to the
>> non-weak variant which has its own mcount location entry. As such, there
>> will now be two entries for these functions, both pointing to the same
>> non-weak location.
> 
> But is that really true in all cases? x86 uses fentry these days, and other
> archs do things differently too. But the original mcount (-pg) call
> happened *after* the frame setup. That means the offset of the mcount call
> would be at different offsets wrt the start of the function. If you have
> one of these architectures that still use mcount, and the weak function
> doesn't have the same size frame setup as the overriding function, then the
> addresses will not be the same.

Indeed, plain old -pg will be a problem. I'm not sure there is a generic 
way to address this. I suppose architectures will have to validate the 
mcount locations, something like this?

diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
index d83758acd1c7c3..d8b104ed2fdf38 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -12,13 +12,7 @@
 
 #ifndef __ASSEMBLY__
 extern void _mcount(void);
-
-static inline unsigned long ftrace_call_adjust(unsigned long addr)
-{
-       /* relocation of mcount call site is the same as the address */
-       return addr;
-}
-
+unsigned long ftrace_call_adjust(unsigned long addr);
 unsigned long prepare_ftrace_return(unsigned long parent, unsigned long ip,
                                    unsigned long sp);
 
diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
index 4ee04aacf9f13c..976c08cd0573f7 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -858,6 +858,17 @@ void arch_ftrace_update_code(int command)
        ftrace_modify_all_code(command);
 }
 
+unsigned long ftrace_call_adjust(unsigned long addr)
+{
+       ppc_inst_t op = ppc_inst_read((u32 *)addr);
+
+       if (!is_bl_op(op))
+               return 0;
+
+       /* relocation of mcount call site is the same as the address */
+       return addr;
+}
+
 #ifdef CONFIG_PPC64
 #define PACATOC offsetof(struct paca_struct, kernel_toc)
 

We can tighten those checks as necessary, but it will be upto the 
architectures to validate the mcount locations. This all will have to be 
opt-in so that only architectures doing necessary validation will allow 
mcount relocations against weak symbols.


- Naveen

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
  2022-04-28  7:45       ` Naveen N. Rao
@ 2022-04-28 14:06         ` Steven Rostedt
  -1 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-04-28 14:06 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: linux-kernel, linuxppc-dev, llvm, Michael Ellerman,
	Nathan Chancellor, Nick Desaulniers

On Thu, 28 Apr 2022 13:15:22 +0530
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:

> Indeed, plain old -pg will be a problem. I'm not sure there is a generic 
> way to address this. I suppose architectures will have to validate the 
> mcount locations, something like this?

Perhaps another solution is to make the mcount locations after the linking
is done. The main downside to that is that it takes time to go over the
entire vmlinux, and will slow down a compile that only modified a couple of
files.

-- Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
@ 2022-04-28 14:06         ` Steven Rostedt
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-04-28 14:06 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: llvm, Nick Desaulniers, linux-kernel, Nathan Chancellor, linuxppc-dev

On Thu, 28 Apr 2022 13:15:22 +0530
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:

> Indeed, plain old -pg will be a problem. I'm not sure there is a generic 
> way to address this. I suppose architectures will have to validate the 
> mcount locations, something like this?

Perhaps another solution is to make the mcount locations after the linking
is done. The main downside to that is that it takes time to go over the
entire vmlinux, and will slow down a compile that only modified a couple of
files.

-- Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
  2022-04-28 14:06         ` Steven Rostedt
@ 2022-04-28 17:24           ` Naveen N. Rao
  -1 siblings, 0 replies; 36+ messages in thread
From: Naveen N. Rao @ 2022-04-28 17:24 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linuxppc-dev, llvm, Michael Ellerman,
	Nathan Chancellor, Nick Desaulniers

Steven Rostedt wrote:
> On Thu, 28 Apr 2022 13:15:22 +0530
> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
> 
>> Indeed, plain old -pg will be a problem. I'm not sure there is a generic 
>> way to address this. I suppose architectures will have to validate the 
>> mcount locations, something like this?
> 
> Perhaps another solution is to make the mcount locations after the linking
> is done. The main downside to that is that it takes time to go over the
> entire vmlinux, and will slow down a compile that only modified a couple of
> files.

Yes, and I think that is also very useful with LTO. So, that would be a 
good one to consider in the longer term.

For now, I have posted a v2 of this series with your comments addressed.  
It is working well in my tests on powerpc in the different 
configurations, including the older elf abi v1 that uses -pg. If it 
looks ok to you, we can use this approach for now.


Thanks,
Naveen

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
@ 2022-04-28 17:24           ` Naveen N. Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Naveen N. Rao @ 2022-04-28 17:24 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: llvm, Nick Desaulniers, linux-kernel, Nathan Chancellor, linuxppc-dev

Steven Rostedt wrote:
> On Thu, 28 Apr 2022 13:15:22 +0530
> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
> 
>> Indeed, plain old -pg will be a problem. I'm not sure there is a generic 
>> way to address this. I suppose architectures will have to validate the 
>> mcount locations, something like this?
> 
> Perhaps another solution is to make the mcount locations after the linking
> is done. The main downside to that is that it takes time to go over the
> entire vmlinux, and will slow down a compile that only modified a couple of
> files.

Yes, and I think that is also very useful with LTO. So, that would be a 
good one to consider in the longer term.

For now, I have posted a v2 of this series with your comments addressed.  
It is working well in my tests on powerpc in the different 
configurations, including the older elf abi v1 that uses -pg. If it 
looks ok to you, we can use this approach for now.


Thanks,
Naveen

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
  2022-04-28 14:06         ` Steven Rostedt
@ 2022-04-28 17:31           ` Naveen N. Rao
  -1 siblings, 0 replies; 36+ messages in thread
From: Naveen N. Rao @ 2022-04-28 17:31 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linuxppc-dev, llvm, Michael Ellerman,
	Nathan Chancellor, Nick Desaulniers

Steven Rostedt wrote:
> On Thu, 28 Apr 2022 13:15:22 +0530
> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
> 
>> Indeed, plain old -pg will be a problem. I'm not sure there is a generic 
>> way to address this. I suppose architectures will have to validate the 
>> mcount locations, something like this?
> 
> Perhaps another solution is to make the mcount locations after the linking
> is done. The main downside to that is that it takes time to go over the
> entire vmlinux, and will slow down a compile that only modified a couple of
> files.

Yes, and I think that is also very useful with LTO. So, that would be 
good to consider in the longer term.

For now, I have posted a v2 of this series with your comments addressed.  
It is working well in my tests on powerpc in the different 
configurations, including the older elf v1 abi with -pg. If it looks ok 
to you, we can go with this approach for now.


Thanks,
Naveen

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
@ 2022-04-28 17:31           ` Naveen N. Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Naveen N. Rao @ 2022-04-28 17:31 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: llvm, Nick Desaulniers, linux-kernel, Nathan Chancellor, linuxppc-dev

Steven Rostedt wrote:
> On Thu, 28 Apr 2022 13:15:22 +0530
> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
> 
>> Indeed, plain old -pg will be a problem. I'm not sure there is a generic 
>> way to address this. I suppose architectures will have to validate the 
>> mcount locations, something like this?
> 
> Perhaps another solution is to make the mcount locations after the linking
> is done. The main downside to that is that it takes time to go over the
> entire vmlinux, and will slow down a compile that only modified a couple of
> files.

Yes, and I think that is also very useful with LTO. So, that would be 
good to consider in the longer term.

For now, I have posted a v2 of this series with your comments addressed.  
It is working well in my tests on powerpc in the different 
configurations, including the older elf v1 abi with -pg. If it looks ok 
to you, we can go with this approach for now.


Thanks,
Naveen

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
  2022-04-28 14:06         ` Steven Rostedt
@ 2022-05-02 14:44           ` Christophe Leroy
  -1 siblings, 0 replies; 36+ messages in thread
From: Christophe Leroy @ 2022-05-02 14:44 UTC (permalink / raw)
  To: Steven Rostedt, Naveen N. Rao
  Cc: llvm, Nick Desaulniers, linux-kernel, Nathan Chancellor, linuxppc-dev



Le 28/04/2022 à 16:06, Steven Rostedt a écrit :
> On Thu, 28 Apr 2022 13:15:22 +0530
> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
> 
>> Indeed, plain old -pg will be a problem. I'm not sure there is a generic
>> way to address this. I suppose architectures will have to validate the
>> mcount locations, something like this?
> 
> Perhaps another solution is to make the mcount locations after the linking
> is done. The main downside to that is that it takes time to go over the
> entire vmlinux, and will slow down a compile that only modified a couple of
> files.
> 

If we do that after the linking, won't it be a nightmare with the 
trampolines installed by the linker when the destination is over the 24 
bits limit ?

Christophe

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
@ 2022-05-02 14:44           ` Christophe Leroy
  0 siblings, 0 replies; 36+ messages in thread
From: Christophe Leroy @ 2022-05-02 14:44 UTC (permalink / raw)
  To: Steven Rostedt, Naveen N. Rao
  Cc: Nathan Chancellor, linuxppc-dev, llvm, Nick Desaulniers, linux-kernel



Le 28/04/2022 à 16:06, Steven Rostedt a écrit :
> On Thu, 28 Apr 2022 13:15:22 +0530
> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
> 
>> Indeed, plain old -pg will be a problem. I'm not sure there is a generic
>> way to address this. I suppose architectures will have to validate the
>> mcount locations, something like this?
> 
> Perhaps another solution is to make the mcount locations after the linking
> is done. The main downside to that is that it takes time to go over the
> entire vmlinux, and will slow down a compile that only modified a couple of
> files.
> 

If we do that after the linking, won't it be a nightmare with the 
trampolines installed by the linker when the destination is over the 24 
bits limit ?

Christophe

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
  2022-05-02 14:44           ` Christophe Leroy
@ 2022-05-02 23:52             ` Steven Rostedt
  -1 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-05-02 23:52 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Naveen N. Rao, llvm, Nick Desaulniers, linux-kernel,
	Nathan Chancellor, linuxppc-dev

On Mon, 2 May 2022 14:44:56 +0000
Christophe Leroy <christophe.leroy@csgroup.eu> wrote:

> If we do that after the linking, won't it be a nightmare with the 
> trampolines installed by the linker when the destination is over the 24 
> bits limit ?

Not sure what you mean. The locations I'm talking about is the full
address saved in the __mcount_loc table (data section).

-- Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
@ 2022-05-02 23:52             ` Steven Rostedt
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-05-02 23:52 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: llvm, Nick Desaulniers, linux-kernel, Nathan Chancellor,
	Naveen N. Rao, linuxppc-dev

On Mon, 2 May 2022 14:44:56 +0000
Christophe Leroy <christophe.leroy@csgroup.eu> wrote:

> If we do that after the linking, won't it be a nightmare with the 
> trampolines installed by the linker when the destination is over the 24 
> bits limit ?

Not sure what you mean. The locations I'm talking about is the full
address saved in the __mcount_loc table (data section).

-- Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
  2022-05-02 23:52             ` Steven Rostedt
@ 2022-05-03 11:20               ` Christophe Leroy
  -1 siblings, 0 replies; 36+ messages in thread
From: Christophe Leroy @ 2022-05-03 11:20 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Naveen N. Rao, llvm, Nick Desaulniers, linux-kernel,
	Nathan Chancellor, linuxppc-dev



Le 03/05/2022 à 01:52, Steven Rostedt a écrit :
> On Mon, 2 May 2022 14:44:56 +0000
> Christophe Leroy <christophe.leroy@csgroup.eu> wrote:
> 
>> If we do that after the linking, won't it be a nightmare with the
>> trampolines installed by the linker when the destination is over the 24
>> bits limit ?
> 
> Not sure what you mean. The locations I'm talking about is the full
> address saved in the __mcount_loc table (data section).
> 

Maybe I misunderstood. When you say 'after linking', do you mean vmlinux 
or vmlinux.o ?

In vmlinux, the addresses to be saved in __mcount_loc table might not 
contain anymore a call to _mcount but a call to a trampoline that jumps 
to _mcount, in case _mcount is too far from the said location at link 
time. That's what I meant.

Christophe

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
@ 2022-05-03 11:20               ` Christophe Leroy
  0 siblings, 0 replies; 36+ messages in thread
From: Christophe Leroy @ 2022-05-03 11:20 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: llvm, Nick Desaulniers, linux-kernel, Nathan Chancellor,
	Naveen N. Rao, linuxppc-dev



Le 03/05/2022 à 01:52, Steven Rostedt a écrit :
> On Mon, 2 May 2022 14:44:56 +0000
> Christophe Leroy <christophe.leroy@csgroup.eu> wrote:
> 
>> If we do that after the linking, won't it be a nightmare with the
>> trampolines installed by the linker when the destination is over the 24
>> bits limit ?
> 
> Not sure what you mean. The locations I'm talking about is the full
> address saved in the __mcount_loc table (data section).
> 

Maybe I misunderstood. When you say 'after linking', do you mean vmlinux 
or vmlinux.o ?

In vmlinux, the addresses to be saved in __mcount_loc table might not 
contain anymore a call to _mcount but a call to a trampoline that jumps 
to _mcount, in case _mcount is too far from the said location at link 
time. That's what I meant.

Christophe

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
  2022-05-03 11:20               ` Christophe Leroy
@ 2022-05-03 16:25                 ` Steven Rostedt
  -1 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-05-03 16:25 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Naveen N. Rao, llvm, Nick Desaulniers, linux-kernel,
	Nathan Chancellor, linuxppc-dev

On Tue, 3 May 2022 11:20:22 +0000
Christophe Leroy <christophe.leroy@csgroup.eu> wrote:

> Maybe I misunderstood. When you say 'after linking', do you mean vmlinux 
> or vmlinux.o ?

Whichever ;-)

> 
> In vmlinux, the addresses to be saved in __mcount_loc table might not 
> contain anymore a call to _mcount but a call to a trampoline that jumps 
> to _mcount, in case _mcount is too far from the said location at link 
> time. That's what I meant.

But how is that different than what is done today? And at linking,
everything still calls mcount. It's not until runtime things change.

The point I'm talking about is that after linking, if the linker
removed unused code (which would include unused weak functions,
right?), then the calls to mcount that were in the weak functions would
be gone too, and they would not be added by recordmcount, and for those
that are still around, then using their symbols as the reference point
would also not be an issue.

The problem we have right now is that the only symbol we have is a weak
function to reference the mcount call location in the __mcount_loc
section. But if all the global entries are not used and the linker
removes them, then the references using those symbols in the
__mcount_loc section will be "undefined". After linking, everything in
the vmlinux(.o) is set, and we are free to use that as a reference
point for the call sites.

-- Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
@ 2022-05-03 16:25                 ` Steven Rostedt
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-05-03 16:25 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: llvm, Nick Desaulniers, linux-kernel, Nathan Chancellor,
	Naveen N. Rao, linuxppc-dev

On Tue, 3 May 2022 11:20:22 +0000
Christophe Leroy <christophe.leroy@csgroup.eu> wrote:

> Maybe I misunderstood. When you say 'after linking', do you mean vmlinux 
> or vmlinux.o ?

Whichever ;-)

> 
> In vmlinux, the addresses to be saved in __mcount_loc table might not 
> contain anymore a call to _mcount but a call to a trampoline that jumps 
> to _mcount, in case _mcount is too far from the said location at link 
> time. That's what I meant.

But how is that different than what is done today? And at linking,
everything still calls mcount. It's not until runtime things change.

The point I'm talking about is that after linking, if the linker
removed unused code (which would include unused weak functions,
right?), then the calls to mcount that were in the weak functions would
be gone too, and they would not be added by recordmcount, and for those
that are still around, then using their symbols as the reference point
would also not be an issue.

The problem we have right now is that the only symbol we have is a weak
function to reference the mcount call location in the __mcount_loc
section. But if all the global entries are not used and the linker
removes them, then the references using those symbols in the
__mcount_loc section will be "undefined". After linking, everything in
the vmlinux(.o) is set, and we are free to use that as a reference
point for the call sites.

-- Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
  2022-05-03 16:25                 ` Steven Rostedt
@ 2022-05-04 16:50                   ` Christophe Leroy
  -1 siblings, 0 replies; 36+ messages in thread
From: Christophe Leroy @ 2022-05-04 16:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Naveen N. Rao, llvm, Nick Desaulniers, linux-kernel,
	Nathan Chancellor, linuxppc-dev



Le 03/05/2022 à 18:25, Steven Rostedt a écrit :
> On Tue, 3 May 2022 11:20:22 +0000
> Christophe Leroy <christophe.leroy@csgroup.eu> wrote:
> 
>> Maybe I misunderstood. When you say 'after linking', do you mean vmlinux
>> or vmlinux.o ?
> 
> Whichever ;-)
> 
>>
>> In vmlinux, the addresses to be saved in __mcount_loc table might not
>> contain anymore a call to _mcount but a call to a trampoline that jumps
>> to _mcount, in case _mcount is too far from the said location at link
>> time. That's what I meant.
> 
> But how is that different than what is done today? And at linking,
> everything still calls mcount. It's not until runtime things change.

Everything call _mcount but not directly.

In vmlinux, relocations are resolved, trampolines are installed for 
unreachable destinations and you don't anymore have a section with all 
the relocations to mcount. It means 'recordmcount' or whatever tool we 
use will have to read the code to find all direct calls to mcount, then 
find all trampolines to mcount then find all calls to those trampolines.

See below some code extracted from vmlinux on a allyesconfig powerpc64le 
build:

c000000000012300 <__traceiter_initcall_level>:
c000000000012300:       4c 1a 4c 3c     addis   r2,r12,6732
c000000000012304:       00 be 42 38     addi    r2,r2,-16896
c000000000012308:       a6 02 08 7c     mflr    r0
c00000000001230c:       4d 60 0d 48     bl      c0000000000e8358 <_mcount>
c000000000012310:       a6 02 08 7c     mflr    r0
...
c0000000020e8740 <get_cur_path>:
c0000000020e8740:       3e 18 4c 3c     addis   r2,r12,6206
c0000000020e8744:       c0 59 42 38     addi    r2,r2,22976
c0000000020e8748:       a6 02 08 7c     mflr    r0
c0000000020e874c:       c5 ff 7c 48     bl      c0000000028b8710 
<0003af2b.plt_branch._mcount>
c0000000020e8750:       a6 02 08 7c     mflr    r0
...
c0000000028b8710 <0003af2b.plt_branch._mcount>:
c0000000028b8710:       ff ff 82 3d     addis   r12,r2,-1
c0000000028b8714:       98 8e 8c e9     ld      r12,-29032(r12)
c0000000028b8718:       a6 03 89 7d     mtctr   r12
c0000000028b871c:       20 04 80 4e     bctr
...
c0000000044bdcc0 <handle_lcd_irq.isra.0>:
c0000000044bdcc0:       01 16 4c 3c     addis   r2,r12,5633
c0000000044bdcc4:       40 04 42 38     addi    r2,r2,1088
c0000000044bdcc8:       a6 02 08 7c     mflr    r0
c0000000044bdccc:       fd 2d c0 49     bl      c0000000060c0ac8 
<000a751f.plt_branch._mcount>
c0000000044bdcd0:       a6 02 08 7c     mflr    r0
...
c0000000060c0ac8 <000a751f.plt_branch._mcount>:
c0000000060c0ac8:       ff ff 82 3d     addis   r12,r2,-1
c0000000060c0acc:       98 8e 8c e9     ld      r12,-29032(r12)
c0000000060c0ad0:       a6 03 89 7d     mtctr   r12
c0000000060c0ad4:       20 04 80 4e     bctr
...


Christophe

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
@ 2022-05-04 16:50                   ` Christophe Leroy
  0 siblings, 0 replies; 36+ messages in thread
From: Christophe Leroy @ 2022-05-04 16:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: llvm, Nick Desaulniers, linux-kernel, Nathan Chancellor,
	Naveen N. Rao, linuxppc-dev



Le 03/05/2022 à 18:25, Steven Rostedt a écrit :
> On Tue, 3 May 2022 11:20:22 +0000
> Christophe Leroy <christophe.leroy@csgroup.eu> wrote:
> 
>> Maybe I misunderstood. When you say 'after linking', do you mean vmlinux
>> or vmlinux.o ?
> 
> Whichever ;-)
> 
>>
>> In vmlinux, the addresses to be saved in __mcount_loc table might not
>> contain anymore a call to _mcount but a call to a trampoline that jumps
>> to _mcount, in case _mcount is too far from the said location at link
>> time. That's what I meant.
> 
> But how is that different than what is done today? And at linking,
> everything still calls mcount. It's not until runtime things change.

Everything call _mcount but not directly.

In vmlinux, relocations are resolved, trampolines are installed for 
unreachable destinations and you don't anymore have a section with all 
the relocations to mcount. It means 'recordmcount' or whatever tool we 
use will have to read the code to find all direct calls to mcount, then 
find all trampolines to mcount then find all calls to those trampolines.

See below some code extracted from vmlinux on a allyesconfig powerpc64le 
build:

c000000000012300 <__traceiter_initcall_level>:
c000000000012300:       4c 1a 4c 3c     addis   r2,r12,6732
c000000000012304:       00 be 42 38     addi    r2,r2,-16896
c000000000012308:       a6 02 08 7c     mflr    r0
c00000000001230c:       4d 60 0d 48     bl      c0000000000e8358 <_mcount>
c000000000012310:       a6 02 08 7c     mflr    r0
...
c0000000020e8740 <get_cur_path>:
c0000000020e8740:       3e 18 4c 3c     addis   r2,r12,6206
c0000000020e8744:       c0 59 42 38     addi    r2,r2,22976
c0000000020e8748:       a6 02 08 7c     mflr    r0
c0000000020e874c:       c5 ff 7c 48     bl      c0000000028b8710 
<0003af2b.plt_branch._mcount>
c0000000020e8750:       a6 02 08 7c     mflr    r0
...
c0000000028b8710 <0003af2b.plt_branch._mcount>:
c0000000028b8710:       ff ff 82 3d     addis   r12,r2,-1
c0000000028b8714:       98 8e 8c e9     ld      r12,-29032(r12)
c0000000028b8718:       a6 03 89 7d     mtctr   r12
c0000000028b871c:       20 04 80 4e     bctr
...
c0000000044bdcc0 <handle_lcd_irq.isra.0>:
c0000000044bdcc0:       01 16 4c 3c     addis   r2,r12,5633
c0000000044bdcc4:       40 04 42 38     addi    r2,r2,1088
c0000000044bdcc8:       a6 02 08 7c     mflr    r0
c0000000044bdccc:       fd 2d c0 49     bl      c0000000060c0ac8 
<000a751f.plt_branch._mcount>
c0000000044bdcd0:       a6 02 08 7c     mflr    r0
...
c0000000060c0ac8 <000a751f.plt_branch._mcount>:
c0000000060c0ac8:       ff ff 82 3d     addis   r12,r2,-1
c0000000060c0acc:       98 8e 8c e9     ld      r12,-29032(r12)
c0000000060c0ad0:       a6 03 89 7d     mtctr   r12
c0000000060c0ad4:       20 04 80 4e     bctr
...


Christophe

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
  2022-05-04 16:50                   ` Christophe Leroy
@ 2022-05-04 17:06                     ` Steven Rostedt
  -1 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-05-04 17:06 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Naveen N. Rao, llvm, Nick Desaulniers, linux-kernel,
	Nathan Chancellor, linuxppc-dev

On Wed, 4 May 2022 16:50:58 +0000
Christophe Leroy <christophe.leroy@csgroup.eu> wrote:

> In vmlinux, relocations are resolved, trampolines are installed for 
> unreachable destinations and you don't anymore have a section with all 
> the relocations to mcount. It means 'recordmcount' or whatever tool we 
> use will have to read the code to find all direct calls to mcount, then 
> find all trampolines to mcount then find all calls to those trampolines.

OK, so what you are saying is that in the object file, we can see the
site that calls mcount, but when it is linked, it may not call mcount,
but instead it will call a trampoline that will call mcount, thus the
tool will need to find these calls to the trampolines that call mcount
as well as the locations that call mcount directly.

Did I get that right?

-- Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
@ 2022-05-04 17:06                     ` Steven Rostedt
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-05-04 17:06 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: llvm, Nick Desaulniers, linux-kernel, Nathan Chancellor,
	Naveen N. Rao, linuxppc-dev

On Wed, 4 May 2022 16:50:58 +0000
Christophe Leroy <christophe.leroy@csgroup.eu> wrote:

> In vmlinux, relocations are resolved, trampolines are installed for 
> unreachable destinations and you don't anymore have a section with all 
> the relocations to mcount. It means 'recordmcount' or whatever tool we 
> use will have to read the code to find all direct calls to mcount, then 
> find all trampolines to mcount then find all calls to those trampolines.

OK, so what you are saying is that in the object file, we can see the
site that calls mcount, but when it is linked, it may not call mcount,
but instead it will call a trampoline that will call mcount, thus the
tool will need to find these calls to the trampolines that call mcount
as well as the locations that call mcount directly.

Did I get that right?

-- Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
  2022-05-04 17:06                     ` Steven Rostedt
@ 2022-05-04 17:10                       ` Christophe Leroy
  -1 siblings, 0 replies; 36+ messages in thread
From: Christophe Leroy @ 2022-05-04 17:10 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: llvm, Nick Desaulniers, linux-kernel, Nathan Chancellor,
	Naveen N. Rao, linuxppc-dev



Le 04/05/2022 à 19:06, Steven Rostedt a écrit :
> On Wed, 4 May 2022 16:50:58 +0000
> Christophe Leroy <christophe.leroy@csgroup.eu> wrote:
> 
>> In vmlinux, relocations are resolved, trampolines are installed for
>> unreachable destinations and you don't anymore have a section with all
>> the relocations to mcount. It means 'recordmcount' or whatever tool we
>> use will have to read the code to find all direct calls to mcount, then
>> find all trampolines to mcount then find all calls to those trampolines.
> 
> OK, so what you are saying is that in the object file, we can see the
> site that calls mcount, but when it is linked, it may not call mcount,
> but instead it will call a trampoline that will call mcount, thus the
> tool will need to find these calls to the trampolines that call mcount
> as well as the locations that call mcount directly.
> 
> Did I get that right?
> 

Yes it is what I'm trying to say.

Christophe

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols
@ 2022-05-04 17:10                       ` Christophe Leroy
  0 siblings, 0 replies; 36+ messages in thread
From: Christophe Leroy @ 2022-05-04 17:10 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Naveen N. Rao, llvm, Nick Desaulniers, linux-kernel,
	Nathan Chancellor, linuxppc-dev



Le 04/05/2022 à 19:06, Steven Rostedt a écrit :
> On Wed, 4 May 2022 16:50:58 +0000
> Christophe Leroy <christophe.leroy@csgroup.eu> wrote:
> 
>> In vmlinux, relocations are resolved, trampolines are installed for
>> unreachable destinations and you don't anymore have a section with all
>> the relocations to mcount. It means 'recordmcount' or whatever tool we
>> use will have to read the code to find all direct calls to mcount, then
>> find all trampolines to mcount then find all calls to those trampolines.
> 
> OK, so what you are saying is that in the object file, we can see the
> site that calls mcount, but when it is linked, it may not call mcount,
> but instead it will call a trampoline that will call mcount, thus the
> tool will need to find these calls to the trampolines that call mcount
> as well as the locations that call mcount directly.
> 
> Did I get that right?
> 

Yes it is what I'm trying to say.

Christophe

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/2] ftrace/recordmcount: Handle object files without section symbols
  2022-04-27  9:31 ` Naveen N. Rao
@ 2022-08-16 14:04   ` Steven Rostedt
  -1 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-08-16 14:04 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: Michael Ellerman, Nathan Chancellor, Nick Desaulniers,
	linux-kernel, linuxppc-dev, llvm

On Wed, 27 Apr 2022 15:01:20 +0530
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:

> This solves a build issue on powerpc with binutils v2.36 and newer [1].
> Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
> symbols") [2], binutils started dropping section symbols that it thought
> were unused. Due to this, in certain scenarios, recordmcount is unable 
> to find a non-weak symbol to generate a relocation record against.
> 
> Clang integrated assembler is also aggressive in dropping section 
> symbols [3].
> 
> In the past, there have been various workarounds to address this. See 
> commits 55d5b7dd6451b5 ("initramfs: fix clang build failure") and 
> 6e7b64b9dd6d96 ("elfcore: fix building with clang") and a recent patch:
> https://lore.kernel.org/linuxppc-dev/20220425174128.11455-1-naveen.n.rao@linux.vnet.ibm.com/T/#u
> 
> Fix this issue by using the weak symbol in the relocation record. This 
> can result in duplicate locations in the mcount table if those weak 
> functions are overridden, so have ftrace skip dupicate entries.
> 
> Objtool already follows this approach, so patch 2 updates recordmcount 
> to do the same. Patch 1 updates ftrace to skip duplicate entries.
> 
> - Naveen
> 
> 
> [1] https://github.com/linuxppc/issues/issues/388
> [2] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
> [3] https://github.com/ClangBuiltLinux/linux/issues/981
> 
>

There's been work to handle weak functions, but I'm not sure that work
handled the issues here. Are these patches still needed, or was there
another workaround to handle the problems this addressed?

-- Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/2] ftrace/recordmcount: Handle object files without section symbols
@ 2022-08-16 14:04   ` Steven Rostedt
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-08-16 14:04 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: llvm, Nick Desaulniers, linux-kernel, Nathan Chancellor, linuxppc-dev

On Wed, 27 Apr 2022 15:01:20 +0530
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:

> This solves a build issue on powerpc with binutils v2.36 and newer [1].
> Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
> symbols") [2], binutils started dropping section symbols that it thought
> were unused. Due to this, in certain scenarios, recordmcount is unable 
> to find a non-weak symbol to generate a relocation record against.
> 
> Clang integrated assembler is also aggressive in dropping section 
> symbols [3].
> 
> In the past, there have been various workarounds to address this. See 
> commits 55d5b7dd6451b5 ("initramfs: fix clang build failure") and 
> 6e7b64b9dd6d96 ("elfcore: fix building with clang") and a recent patch:
> https://lore.kernel.org/linuxppc-dev/20220425174128.11455-1-naveen.n.rao@linux.vnet.ibm.com/T/#u
> 
> Fix this issue by using the weak symbol in the relocation record. This 
> can result in duplicate locations in the mcount table if those weak 
> functions are overridden, so have ftrace skip dupicate entries.
> 
> Objtool already follows this approach, so patch 2 updates recordmcount 
> to do the same. Patch 1 updates ftrace to skip duplicate entries.
> 
> - Naveen
> 
> 
> [1] https://github.com/linuxppc/issues/issues/388
> [2] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
> [3] https://github.com/ClangBuiltLinux/linux/issues/981
> 
>

There's been work to handle weak functions, but I'm not sure that work
handled the issues here. Are these patches still needed, or was there
another workaround to handle the problems this addressed?

-- Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/2] ftrace/recordmcount: Handle object files without section symbols
  2022-08-16 14:04   ` Steven Rostedt
@ 2022-08-16 17:05     ` Naveen N. Rao
  -1 siblings, 0 replies; 36+ messages in thread
From: Naveen N. Rao @ 2022-08-16 17:05 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, linuxppc-dev, llvm, Michael Ellerman,
	Nathan Chancellor, Nick Desaulniers

Hi Steven,

Steven Rostedt wrote:
> On Wed, 27 Apr 2022 15:01:20 +0530
> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
> 
>> This solves a build issue on powerpc with binutils v2.36 and newer [1].
>> Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
>> symbols") [2], binutils started dropping section symbols that it thought
>> were unused. Due to this, in certain scenarios, recordmcount is unable 
>> to find a non-weak symbol to generate a relocation record against.
>> 
>> Clang integrated assembler is also aggressive in dropping section 
>> symbols [3].
>> 
>> In the past, there have been various workarounds to address this. See 
>> commits 55d5b7dd6451b5 ("initramfs: fix clang build failure") and 
>> 6e7b64b9dd6d96 ("elfcore: fix building with clang") and a recent patch:
>> https://lore.kernel.org/linuxppc-dev/20220425174128.11455-1-naveen.n.rao@linux.vnet.ibm.com/T/#u
>> 
>> Fix this issue by using the weak symbol in the relocation record. This 
>> can result in duplicate locations in the mcount table if those weak 
>> functions are overridden, so have ftrace skip dupicate entries.
>> 
>> Objtool already follows this approach, so patch 2 updates recordmcount 
>> to do the same. Patch 1 updates ftrace to skip duplicate entries.
>> 
>> - Naveen
>> 
>> 
>> [1] https://github.com/linuxppc/issues/issues/388
>> [2] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
>> [3] https://github.com/ClangBuiltLinux/linux/issues/981
>> 
>>
> 
> There's been work to handle weak functions, but I'm not sure that work
> handled the issues here. Are these patches still needed, or was there
> another workaround to handle the problems this addressed?

I'm afraid these patches are still needed to address issues in 
recordmcount.

I submitted patches to remove use of weak functions in the kexec 
subsystem, but those have only enabled building ppc64le defconfig 
without errors:
https://lore.kernel.org/all/20220519091237.676736-1-naveen.n.rao@linux.vnet.ibm.com/
https://lore.kernel.org/all/cover.1656659357.git.naveen.n.rao@linux.vnet.ibm.com/

The patch adding support for FTRACE_MCOUNT_MAX_OFFSET to powerpc only 
helps ignore weak functions during runtime:
https://lore.kernel.org/all/20220809105425.424045-1-naveen.n.rao@linux.vnet.ibm.com/

We still see errors from recordmcount when trying to build certain 
powerpc configs.

We are pursuing support for objtool, which doesn't have the same issues:
https://lore.kernel.org/all/20220808114908.240813-1-sv@linux.ibm.com/


- Naveen

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/2] ftrace/recordmcount: Handle object files without section symbols
@ 2022-08-16 17:05     ` Naveen N. Rao
  0 siblings, 0 replies; 36+ messages in thread
From: Naveen N. Rao @ 2022-08-16 17:05 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: llvm, Nick Desaulniers, linux-kernel, Nathan Chancellor, linuxppc-dev

Hi Steven,

Steven Rostedt wrote:
> On Wed, 27 Apr 2022 15:01:20 +0530
> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
> 
>> This solves a build issue on powerpc with binutils v2.36 and newer [1].
>> Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
>> symbols") [2], binutils started dropping section symbols that it thought
>> were unused. Due to this, in certain scenarios, recordmcount is unable 
>> to find a non-weak symbol to generate a relocation record against.
>> 
>> Clang integrated assembler is also aggressive in dropping section 
>> symbols [3].
>> 
>> In the past, there have been various workarounds to address this. See 
>> commits 55d5b7dd6451b5 ("initramfs: fix clang build failure") and 
>> 6e7b64b9dd6d96 ("elfcore: fix building with clang") and a recent patch:
>> https://lore.kernel.org/linuxppc-dev/20220425174128.11455-1-naveen.n.rao@linux.vnet.ibm.com/T/#u
>> 
>> Fix this issue by using the weak symbol in the relocation record. This 
>> can result in duplicate locations in the mcount table if those weak 
>> functions are overridden, so have ftrace skip dupicate entries.
>> 
>> Objtool already follows this approach, so patch 2 updates recordmcount 
>> to do the same. Patch 1 updates ftrace to skip duplicate entries.
>> 
>> - Naveen
>> 
>> 
>> [1] https://github.com/linuxppc/issues/issues/388
>> [2] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
>> [3] https://github.com/ClangBuiltLinux/linux/issues/981
>> 
>>
> 
> There's been work to handle weak functions, but I'm not sure that work
> handled the issues here. Are these patches still needed, or was there
> another workaround to handle the problems this addressed?

I'm afraid these patches are still needed to address issues in 
recordmcount.

I submitted patches to remove use of weak functions in the kexec 
subsystem, but those have only enabled building ppc64le defconfig 
without errors:
https://lore.kernel.org/all/20220519091237.676736-1-naveen.n.rao@linux.vnet.ibm.com/
https://lore.kernel.org/all/cover.1656659357.git.naveen.n.rao@linux.vnet.ibm.com/

The patch adding support for FTRACE_MCOUNT_MAX_OFFSET to powerpc only 
helps ignore weak functions during runtime:
https://lore.kernel.org/all/20220809105425.424045-1-naveen.n.rao@linux.vnet.ibm.com/

We still see errors from recordmcount when trying to build certain 
powerpc configs.

We are pursuing support for objtool, which doesn't have the same issues:
https://lore.kernel.org/all/20220808114908.240813-1-sv@linux.ibm.com/


- Naveen

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2022-08-16 17:39 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-27  9:31 [PATCH 0/2] ftrace/recordmcount: Handle object files without section symbols Naveen N. Rao
2022-04-27  9:31 ` Naveen N. Rao
2022-04-27  9:31 ` [PATCH 1/2] ftrace: Drop duplicate mcount locations Naveen N. Rao
2022-04-27  9:31   ` Naveen N. Rao
2022-04-27 13:46   ` Steven Rostedt
2022-04-27 13:46     ` Steven Rostedt
2022-04-27  9:31 ` [PATCH 2/2] recordmcount: Handle sections with no non-weak symbols Naveen N. Rao
2022-04-27  9:31   ` Naveen N. Rao
2022-04-27 13:54   ` Steven Rostedt
2022-04-27 13:54     ` Steven Rostedt
2022-04-28  7:45     ` Naveen N. Rao
2022-04-28  7:45       ` Naveen N. Rao
2022-04-28 14:06       ` Steven Rostedt
2022-04-28 14:06         ` Steven Rostedt
2022-04-28 17:24         ` Naveen N. Rao
2022-04-28 17:24           ` Naveen N. Rao
2022-04-28 17:31         ` Naveen N. Rao
2022-04-28 17:31           ` Naveen N. Rao
2022-05-02 14:44         ` Christophe Leroy
2022-05-02 14:44           ` Christophe Leroy
2022-05-02 23:52           ` Steven Rostedt
2022-05-02 23:52             ` Steven Rostedt
2022-05-03 11:20             ` Christophe Leroy
2022-05-03 11:20               ` Christophe Leroy
2022-05-03 16:25               ` Steven Rostedt
2022-05-03 16:25                 ` Steven Rostedt
2022-05-04 16:50                 ` Christophe Leroy
2022-05-04 16:50                   ` Christophe Leroy
2022-05-04 17:06                   ` Steven Rostedt
2022-05-04 17:06                     ` Steven Rostedt
2022-05-04 17:10                     ` Christophe Leroy
2022-05-04 17:10                       ` Christophe Leroy
2022-08-16 14:04 ` [PATCH 0/2] ftrace/recordmcount: Handle object files without section symbols Steven Rostedt
2022-08-16 14:04   ` Steven Rostedt
2022-08-16 17:05   ` Naveen N. Rao
2022-08-16 17:05     ` Naveen N. Rao

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.