All of lore.kernel.org
 help / color / mirror / Atom feed
* The trouble with __weak and objtool got worse
@ 2022-04-15 11:19 Peter Zijlstra
  2022-04-15 14:12 ` Steven Rostedt
                   ` (4 more replies)
  0 siblings, 5 replies; 29+ messages in thread
From: Peter Zijlstra @ 2022-04-15 11:19 UTC (permalink / raw)
  To: x86, Josh Poimboeuf
  Cc: hjl.tools, ndesaulniers, mbenes, rostedt, linux-toolchains

Hi,

After chasing my tail for a good few hours this morning I finally
figured out why code patching is exploding for me in weird and wonderful
ways... :-(

This problem affects everything where we have external tools (mostly
objtool but possible also recordmcount) generate location sections for
us, notably things like:

  .static_call_sites
  .retpoline_sites
  __mcount_loc
  .orc_unwind
  .orc_unwind_ip

from individual translation-unit runs.

Consider:

foo-weak.c:

extern void __SCT__foo(void);

__attribute__((weak)) void foo(void)
{
        return __SCT__foo();
}

foo.c:

extern void __SCT__foo(void);
extern void my_foo(void);

void foo(void)
{
        my_foo();
        return __SCT__foo();
}

These generate the obvious code
(gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -c foo*.c):

foo-weak.o:
0000000000000000 <foo>:
   0:   e9 00 00 00 00          jmpq   5 <foo+0x5>      1: R_X86_64_PLT32       __SCT__foo-0x4

foo.o:
0000000000000000 <foo>:
   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   e8 00 00 00 00          callq  9 <foo+0x9>      5: R_X86_64_PLT32       my_foo-0x4
   9:   48 83 c4 08             add    $0x8,%rsp
   d:   e9 00 00 00 00          jmpq   12 <foo+0x12>    e: R_X86_64_PLT32       __SCT__foo-0x4


Now, when we link these two files together, you get something like (ld -r -o foos.o foo-weak.o foo.o):

foos.o:
0000000000000000 <foo-0x10>:
   0:   e9 00 00 00 00          jmpq   5 <foo-0xb>      1: R_X86_64_PLT32       __SCT__foo-0x4
   5:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)
   f:   90                      nop

0000000000000010 <foo>:
  10:   48 83 ec 08             sub    $0x8,%rsp
  14:   e8 00 00 00 00          callq  19 <foo+0x9>     15: R_X86_64_PLT32      my_foo-0x4
  19:   48 83 c4 08             add    $0x8,%rsp
  1d:   e9 00 00 00 00          jmpq   22 <foo+0x12>    1e: R_X86_64_PLT32      __SCT__foo-0x4

Noting that ld preserves the weak function text, but strips the symbol
off of it (hence objdump doing that funny negative offset thing). This
does lead to 'interesting' unused code issues with objtool when ran on
linked objects, but that seems to be working (fingers crossed).

So far so good.. Now lets consider the objtool static_call output
section (readelf output, old binutils):

foo-weak.o:

Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foo.o:

Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + d
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foos.o:

Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 1d
000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

So we have two patch sites, one in the dead code of the weak foo and one
in the real foo. All is well.

*HOWEVER*, sometimes it generates things like this (using new enough
binutils):

foo-weak.o:

Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foo.o:

Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + d
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foos.o:

Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 foo + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 foo + d
000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

And now we can see how that foos.o .static_call_sites goes side-ways, we
now have _two_ patch sites in foo. One for the weak symbol at foo+0
(which is no longer a static_call site!) and one at foo+d which is in
fact the right location.

This seems to happen when objtool cannot find a section symbol, in which
case it falls back to any other symbol to key off of, however in this
case that goes terribly wrong!

Now, if we do IBT/LTO builds, and run objtool on linked objects only,
this obvious doesn't happen, because the weak stuff has been resolved by
then. But ideally we'd not force that since it has a build time
penalty..

The other option seems to be to have objtool add section symbols it
needs, however due to ELF being a total PITA and requiring all LOCAL
symbols to be before GLOBAL symbols, this would mean re-ordering the
whole symbol table (I have the code somewhere :-/).

Alternatively:

  https://sourceware.org/pipermail/binutils/2020-December/114671.html

seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
work, except I'm getting:

$ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,--generate-unused-section-symbols=yes -c foo*.c
as: unrecognized option '--generate-unused-section-symbols=yes'
as: unrecognized option '--generate-unused-section-symbols=yes'

$ as --version
GNU assembler (GNU Binutils for Debian) 2.38


Opinions?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 11:19 The trouble with __weak and objtool got worse Peter Zijlstra
@ 2022-04-15 14:12 ` Steven Rostedt
  2022-04-15 15:31   ` Peter Zijlstra
  2022-04-15 15:10 ` Josh Poimboeuf
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2022-04-15 14:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, Josh Poimboeuf, hjl.tools, ndesaulniers, mbenes, linux-toolchains

On Fri, 15 Apr 2022 13:19:04 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> This problem affects everything where we have external tools (mostly
> objtool but possible also recordmcount) generate location sections for
> us, notably things like:
> 
>   .static_call_sites
>   .retpoline_sites
>   __mcount_loc
>   .orc_unwind
>   .orc_unwind_ip

But this is only an issue with newer compilers/linkers right? For the
case of recordmcount(), that is done pretty much entirely via the
compiler these days, so it is likely not an issue. We probably need to
have a way to tell the compiler about external references. Or perhaps
have an external tool to look at the weak functions that are overridden
and be able to handle it appropriately?

-- Steve

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 11:19 The trouble with __weak and objtool got worse Peter Zijlstra
  2022-04-15 14:12 ` Steven Rostedt
@ 2022-04-15 15:10 ` Josh Poimboeuf
  2022-04-15 15:15   ` Josh Poimboeuf
  2022-04-15 15:26 ` Peter Zijlstra
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 29+ messages in thread
From: Josh Poimboeuf @ 2022-04-15 15:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, hjl.tools, ndesaulniers, mbenes, rostedt, linux-toolchains

On Fri, Apr 15, 2022 at 01:19:04PM +0200, Peter Zijlstra wrote:
> Now, if we do IBT/LTO builds, and run objtool on linked objects only,
> this obvious doesn't happen, because the weak stuff has been resolved by
> then. But ideally we'd not force that since it has a build time
> penalty..
> 
> The other option seems to be to have objtool add section symbols it
> needs, however due to ELF being a total PITA and requiring all LOCAL
> symbols to be before GLOBAL symbols, this would mean re-ordering the
> whole symbol table (I have the code somewhere :-/).
> 
> Alternatively:
> 
>   https://sourceware.org/pipermail/binutils/2020-December/114671.html
> 
> seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> work, except I'm getting:
> 
> $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,--generate-unused-section-symbols=yes -c foo*.c
> as: unrecognized option '--generate-unused-section-symbols=yes'
> as: unrecognized option '--generate-unused-section-symbols=yes'
> 
> $ as --version
> GNU assembler (GNU Binutils for Debian) 2.38

Can we trick the assembler into creating function symbols for all the
static call sites by having some inline asm (or _THIS_IP_) in the static
call macro with a dummy reference to a code location in the function?

-- 
Josh


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 15:10 ` Josh Poimboeuf
@ 2022-04-15 15:15   ` Josh Poimboeuf
  0 siblings, 0 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2022-04-15 15:15 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, hjl.tools, ndesaulniers, mbenes, rostedt, linux-toolchains

On Fri, Apr 15, 2022 at 08:10:18AM -0700, Josh Poimboeuf wrote:
> On Fri, Apr 15, 2022 at 01:19:04PM +0200, Peter Zijlstra wrote:
> > Now, if we do IBT/LTO builds, and run objtool on linked objects only,
> > this obvious doesn't happen, because the weak stuff has been resolved by
> > then. But ideally we'd not force that since it has a build time
> > penalty..
> > 
> > The other option seems to be to have objtool add section symbols it
> > needs, however due to ELF being a total PITA and requiring all LOCAL
> > symbols to be before GLOBAL symbols, this would mean re-ordering the
> > whole symbol table (I have the code somewhere :-/).
> > 
> > Alternatively:
> > 
> >   https://sourceware.org/pipermail/binutils/2020-December/114671.html
> > 
> > seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> > work, except I'm getting:
> > 
> > $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,--generate-unused-section-symbols=yes -c foo*.c
> > as: unrecognized option '--generate-unused-section-symbols=yes'
> > as: unrecognized option '--generate-unused-section-symbols=yes'
> > 
> > $ as --version
> > GNU assembler (GNU Binutils for Debian) 2.38
> 
> Can we trick the assembler into creating function symbols for all the

I meant s/function symbols/section symbols/ obviously.

> static call sites by having some inline asm (or _THIS_IP_) in the static
> call macro with a dummy reference to a code location in the function?

-- 
Josh


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 11:19 The trouble with __weak and objtool got worse Peter Zijlstra
  2022-04-15 14:12 ` Steven Rostedt
  2022-04-15 15:10 ` Josh Poimboeuf
@ 2022-04-15 15:26 ` Peter Zijlstra
  2022-04-15 17:40   ` Nick Desaulniers
  2022-04-15 18:22 ` Segher Boessenkool
  2022-04-15 21:04 ` H.J. Lu
  4 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2022-04-15 15:26 UTC (permalink / raw)
  To: x86, Josh Poimboeuf
  Cc: hjl.tools, ndesaulniers, mbenes, rostedt, linux-toolchains

On Fri, Apr 15, 2022 at 01:19:04PM +0200, Peter Zijlstra wrote:
> Alternatively:
> 
>   https://sourceware.org/pipermail/binutils/2020-December/114671.html
> 
> seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> work, except I'm getting:
> 
> $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,--generate-unused-section-symbols=yes -c foo*.c
> as: unrecognized option '--generate-unused-section-symbols=yes'
> as: unrecognized option '--generate-unused-section-symbols=yes'

Reading so hard...

 $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,-generate-unused-section-symbols=yes -c foo*.c

seems to actually work.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 14:12 ` Steven Rostedt
@ 2022-04-15 15:31   ` Peter Zijlstra
  0 siblings, 0 replies; 29+ messages in thread
From: Peter Zijlstra @ 2022-04-15 15:31 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: x86, Josh Poimboeuf, hjl.tools, ndesaulniers, mbenes, linux-toolchains

On Fri, Apr 15, 2022 at 10:12:17AM -0400, Steven Rostedt wrote:
> On Fri, 15 Apr 2022 13:19:04 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > This problem affects everything where we have external tools (mostly
> > objtool but possible also recordmcount) generate location sections for
> > us, notably things like:
> > 
> >   .static_call_sites
> >   .retpoline_sites
> >   __mcount_loc
> >   .orc_unwind
> >   .orc_unwind_ip
> 
> But this is only an issue with newer compilers/linkers right? For the
> case of recordmcount(), that is done pretty much entirely via the
> compiler these days, so it is likely not an issue. We probably need to
> have a way to tell the compiler about external references. Or perhaps
> have an external tool to look at the weak functions that are overridden
> and be able to handle it appropriately?

It is possible to run an ancient compiler without -mrecord-mcount on a
binutils version that strips unused symbols, at which point recordmcount
will go funny too.

Somewhat unlikely, but quite possible.

Anyway, it seems the solution is to use one less '-' than I tried this
morning.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 15:26 ` Peter Zijlstra
@ 2022-04-15 17:40   ` Nick Desaulniers
  2022-04-15 18:21     ` Josh Poimboeuf
  0 siblings, 1 reply; 29+ messages in thread
From: Nick Desaulniers @ 2022-04-15 17:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, Josh Poimboeuf, hjl.tools, mbenes, rostedt,
	linux-toolchains, clang-built-linux

On Fri, Apr 15, 2022 at 8:26 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Apr 15, 2022 at 01:19:04PM +0200, Peter Zijlstra wrote:
> > Alternatively:
> >
> >   https://sourceware.org/pipermail/binutils/2020-December/114671.html
> >
> > seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> > work, except I'm getting:
> >
> > $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,--generate-unused-section-symbols=yes -c foo*.c
> > as: unrecognized option '--generate-unused-section-symbols=yes'
> > as: unrecognized option '--generate-unused-section-symbols=yes'
>
> Reading so hard...
>
>  $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,-generate-unused-section-symbols=yes -c foo*.c
>
> seems to actually work.

$ clang hello.c -Wa,-generate-unused-section-symbols=yes
clang-15: error: unsupported argument
'-generate-unused-section-symbols=yes' to option '-Wa,'

:)

I recall LLVM not emitting STT_SECTION symbols being problematic for
objtool in the past.
Old pre-lore archive:
https://groups.google.com/g/clang-built-linux/c/1C6YoJKBsQQ/m/a8IS1NjGAgAJ
via https://github.com/ClangBuiltLinux/linux/issues/1209.
(It doesn't look too bad for me to implement)
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 17:40   ` Nick Desaulniers
@ 2022-04-15 18:21     ` Josh Poimboeuf
  2022-04-15 18:23       ` Josh Poimboeuf
                         ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2022-04-15 18:21 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Peter Zijlstra, x86, hjl.tools, mbenes, rostedt,
	linux-toolchains, clang-built-linux

On Fri, Apr 15, 2022 at 10:40:22AM -0700, Nick Desaulniers wrote:
> On Fri, Apr 15, 2022 at 8:26 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Fri, Apr 15, 2022 at 01:19:04PM +0200, Peter Zijlstra wrote:
> > > Alternatively:
> > >
> > >   https://sourceware.org/pipermail/binutils/2020-December/114671.html
> > >
> > > seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> > > work, except I'm getting:
> > >
> > > $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,--generate-unused-section-symbols=yes -c foo*.c
> > > as: unrecognized option '--generate-unused-section-symbols=yes'
> > > as: unrecognized option '--generate-unused-section-symbols=yes'
> >
> > Reading so hard...
> >
> >  $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,-generate-unused-section-symbols=yes -c foo*.c
> >
> > seems to actually work.
> 
> $ clang hello.c -Wa,-generate-unused-section-symbols=yes
> clang-15: error: unsupported argument
> '-generate-unused-section-symbols=yes' to option '-Wa,'
> 
> :)
> 
> I recall LLVM not emitting STT_SECTION symbols being problematic for
> objtool in the past.
> Old pre-lore archive:
> https://groups.google.com/g/clang-built-linux/c/1C6YoJKBsQQ/m/a8IS1NjGAgAJ
> via https://github.com/ClangBuiltLinux/linux/issues/1209.
> (It doesn't look too bad for me to implement)

Right, and that objtool "fix" -- silently falling back to
non-section-symbol relocations instead of erroring out -- is causing
this issue, when it's done for weak symbols.

If LLVM assembler doesn't support this option then we may have to go
with something like this?  I can't seem to recreate so I'm not able to
test.

We'd also need some objtool checks to make sure the non-section-symbol
fallback isn't being done for a weak symbol.  Or just remove the
fallback altogether and force section symbols whereever they're needed,
similar to the below (untested).


diff --git a/include/linux/static_call_types.h b/include/linux/static_call_types.h
index 5a00b8b2cf9f..77040ce575fa 100644
--- a/include/linux/static_call_types.h
+++ b/include/linux/static_call_types.h
@@ -52,9 +52,14 @@ struct static_call_site {
 #define __STATIC_CALL_ADDRESSABLE(name) \
 	__ADDRESSABLE(STATIC_CALL_KEY(name))
 
+extern unsigned long __addressable_ip;
+
+#define __STATIC_CALL_SITE_ADDRESSABLE() __addressable_ip = _THIS_IP_;
+
 #define __static_call(name)						\
 ({									\
 	__STATIC_CALL_ADDRESSABLE(name);				\
+	__STATIC_CALL_SITE_ADDRESSABLE();				\
 	__raw_static_call(name);					\
 })
 
diff --git a/kernel/static_call_inline.c b/kernel/static_call_inline.c
index dc5665b62814..f6e3e0463efb 100644
--- a/kernel/static_call_inline.c
+++ b/kernel/static_call_inline.c
@@ -17,6 +17,8 @@ extern struct static_call_tramp_key __start_static_call_tramp_key[],
 
 static bool static_call_initialized;
 
+unsigned long __section(".discard.addressable") __addressable_ip;
+
 /* mutex to protect key modules/sites */
 static DEFINE_MUTEX(static_call_mutex);
 


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 11:19 The trouble with __weak and objtool got worse Peter Zijlstra
                   ` (2 preceding siblings ...)
  2022-04-15 15:26 ` Peter Zijlstra
@ 2022-04-15 18:22 ` Segher Boessenkool
  2022-04-15 18:36   ` Nick Desaulniers
  2022-04-16 10:59   ` Peter Zijlstra
  2022-04-15 21:04 ` H.J. Lu
  4 siblings, 2 replies; 29+ messages in thread
From: Segher Boessenkool @ 2022-04-15 18:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, Josh Poimboeuf, hjl.tools, ndesaulniers, mbenes, rostedt,
	linux-toolchains

Hi!

On Fri, Apr 15, 2022 at 01:19:04PM +0200, Peter Zijlstra wrote:
[ huge snip ]

> This seems to happen when objtool cannot find a section symbol, in which
> case it falls back to any other symbol to key off of, however in this
> case that goes terribly wrong!

That is an objtool problem then.  Fix objtool so it looks at symbols
directly?

> The other option seems to be to have objtool add section symbols it
> needs, however due to ELF being a total PITA and requiring all LOCAL
> symbols to be before GLOBAL symbols, this would mean re-ordering the
> whole symbol table (I have the code somewhere :-/).

ELF requires no such thing?  Where do you see this?

This is the order you get as output from GNU LD, is that what you mean?
Expecting a specific order when there is in fact total freedom is not a
PITA, it is just the cost of doing business :-)

> Alternatively:
> 
>   https://sourceware.org/pipermail/binutils/2020-December/114671.html
> 
> seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> work, except I'm getting:

That email is for a proposed patch.  Did anything further ever happen
with it?


Segher

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 18:21     ` Josh Poimboeuf
@ 2022-04-15 18:23       ` Josh Poimboeuf
  2022-04-15 20:36       ` Nick Desaulniers
  2022-04-16 10:48       ` Peter Zijlstra
  2 siblings, 0 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2022-04-15 18:23 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Peter Zijlstra, x86, hjl.tools, mbenes, rostedt,
	linux-toolchains, clang-built-linux

On Fri, Apr 15, 2022 at 11:21:34AM -0700, Josh Poimboeuf wrote:
> On Fri, Apr 15, 2022 at 10:40:22AM -0700, Nick Desaulniers wrote:
> > On Fri, Apr 15, 2022 at 8:26 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Fri, Apr 15, 2022 at 01:19:04PM +0200, Peter Zijlstra wrote:
> > > > Alternatively:
> > > >
> > > >   https://sourceware.org/pipermail/binutils/2020-December/114671.html
> > > >
> > > > seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> > > > work, except I'm getting:
> > > >
> > > > $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,--generate-unused-section-symbols=yes -c foo*.c
> > > > as: unrecognized option '--generate-unused-section-symbols=yes'
> > > > as: unrecognized option '--generate-unused-section-symbols=yes'
> > >
> > > Reading so hard...
> > >
> > >  $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,-generate-unused-section-symbols=yes -c foo*.c
> > >
> > > seems to actually work.
> > 
> > $ clang hello.c -Wa,-generate-unused-section-symbols=yes
> > clang-15: error: unsupported argument
> > '-generate-unused-section-symbols=yes' to option '-Wa,'
> > 
> > :)
> > 
> > I recall LLVM not emitting STT_SECTION symbols being problematic for
> > objtool in the past.
> > Old pre-lore archive:
> > https://groups.google.com/g/clang-built-linux/c/1C6YoJKBsQQ/m/a8IS1NjGAgAJ
> > via https://github.com/ClangBuiltLinux/linux/issues/1209.
> > (It doesn't look too bad for me to implement)
> 
> Right, and that objtool "fix" -- silently falling back to
> non-section-symbol relocations instead of erroring out -- is causing
> this issue, when it's done for weak symbols.
> 
> If LLVM assembler doesn't support this option then we may have to go
> with something like this?  I can't seem to recreate so I'm not able to
> test.
> 
> We'd also need some objtool checks to make sure the non-section-symbol
> fallback isn't being done for a weak symbol.  Or just remove the
> fallback altogether and force section symbols whereever they're needed,
> similar to the below (untested).

(If the patch works, we could hopefully come up with something better,
using asm goto, that doesn't involve actual instructions)

-- 
Josh


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 18:22 ` Segher Boessenkool
@ 2022-04-15 18:36   ` Nick Desaulniers
  2022-04-15 20:07     ` Segher Boessenkool
  2022-04-16 10:59   ` Peter Zijlstra
  1 sibling, 1 reply; 29+ messages in thread
From: Nick Desaulniers @ 2022-04-15 18:36 UTC (permalink / raw)
  To: Segher Boessenkool, hjl.tools
  Cc: Peter Zijlstra, x86, Josh Poimboeuf, mbenes, rostedt,
	linux-toolchains, Fangrui Song

On Fri, Apr 15, 2022 at 11:27 AM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> > Alternatively:
> >
> >   https://sourceware.org/pipermail/binutils/2020-December/114671.html
> >
> > seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> > work, except I'm getting:
>
> That email is for a proposed patch.  Did anything further ever happen
> with it?

$ gcc hello.c -Wa,--generate-unused-section-symbols=yes
as: unrecognized option '--generate-unused-section-symbols=yes'
$ gcc hello.c -Wa,-generate-unused-section-symbols=yes
$ gcc --version
gcc (Debian 11.2.0-16) 11.2.0

Uh, the email says -- prefix, but reality shows a single prefix? What
happened there? Maybe that link is to an earlier version than what
landed?
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 18:36   ` Nick Desaulniers
@ 2022-04-15 20:07     ` Segher Boessenkool
  2022-04-15 20:31       ` Nick Desaulniers
  2022-04-16 11:09       ` Peter Zijlstra
  0 siblings, 2 replies; 29+ messages in thread
From: Segher Boessenkool @ 2022-04-15 20:07 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: hjl.tools, Peter Zijlstra, x86, Josh Poimboeuf, mbenes, rostedt,
	linux-toolchains, Fangrui Song

On Fri, Apr 15, 2022 at 11:36:32AM -0700, Nick Desaulniers wrote:
> On Fri, Apr 15, 2022 at 11:27 AM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> >
> > > Alternatively:
> > >
> > >   https://sourceware.org/pipermail/binutils/2020-December/114671.html
> > >
> > > seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> > > work, except I'm getting:
> >
> > That email is for a proposed patch.  Did anything further ever happen
> > with it?
> 
> $ gcc hello.c -Wa,--generate-unused-section-symbols=yes
> as: unrecognized option '--generate-unused-section-symbols=yes'
> $ gcc hello.c -Wa,-generate-unused-section-symbols=yes
> $ gcc --version
> gcc (Debian 11.2.0-16) 11.2.0
> 
> Uh, the email says -- prefix, but reality shows a single prefix? What
> happened there? Maybe that link is to an earlier version than what
> landed?

Neither has landed.  You get the -g option, which can take options of
itself, parsed by the target or file format code.  "as -gobbledygook"
works fine as well :-)


Segher

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 20:07     ` Segher Boessenkool
@ 2022-04-15 20:31       ` Nick Desaulniers
  2022-04-15 21:17         ` Fangrui Song
  2022-04-16 11:09       ` Peter Zijlstra
  1 sibling, 1 reply; 29+ messages in thread
From: Nick Desaulniers @ 2022-04-15 20:31 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: hjl.tools, Peter Zijlstra, x86, Josh Poimboeuf, mbenes, rostedt,
	linux-toolchains, Fangrui Song

On Fri, Apr 15, 2022 at 1:11 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> On Fri, Apr 15, 2022 at 11:36:32AM -0700, Nick Desaulniers wrote:
> > On Fri, Apr 15, 2022 at 11:27 AM Segher Boessenkool
> > <segher@kernel.crashing.org> wrote:
> > >
> > > > Alternatively:
> > > >
> > > >   https://sourceware.org/pipermail/binutils/2020-December/114671.html
> > > >
> > > > seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> > > > work, except I'm getting:
> > >
> > > That email is for a proposed patch.  Did anything further ever happen
> > > with it?
> >
> > $ gcc hello.c -Wa,--generate-unused-section-symbols=yes
> > as: unrecognized option '--generate-unused-section-symbols=yes'
> > $ gcc hello.c -Wa,-generate-unused-section-symbols=yes
> > $ gcc --version
> > gcc (Debian 11.2.0-16) 11.2.0
> >
> > Uh, the email says -- prefix, but reality shows a single prefix? What
> > happened there? Maybe that link is to an earlier version than what
> > landed?
>
> Neither has landed.  You get the -g option, which can take options of
> itself, parsed by the target or file format code.  "as -gobbledygook"
> works fine as well :-)

https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1408485ce69f844dcd7ded093a8
Looks like it's been in binutils since 2.36 (IIUC the ChangeLog correctly).

-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 18:21     ` Josh Poimboeuf
  2022-04-15 18:23       ` Josh Poimboeuf
@ 2022-04-15 20:36       ` Nick Desaulniers
  2022-04-16 10:49         ` Peter Zijlstra
  2022-04-16 10:48       ` Peter Zijlstra
  2 siblings, 1 reply; 29+ messages in thread
From: Nick Desaulniers @ 2022-04-15 20:36 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, x86, hjl.tools, mbenes, rostedt,
	linux-toolchains, clang-built-linux, Fangrui Song,
	Segher Boessenkool

On Fri, Apr 15, 2022 at 11:21 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Fri, Apr 15, 2022 at 10:40:22AM -0700, Nick Desaulniers wrote:
> > On Fri, Apr 15, 2022 at 8:26 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Fri, Apr 15, 2022 at 01:19:04PM +0200, Peter Zijlstra wrote:
> > > > Alternatively:
> > > >
> > > >   https://sourceware.org/pipermail/binutils/2020-December/114671.html
> > > >
> > > > seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> > > > work, except I'm getting:
> > > >
> > > > $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,--generate-unused-section-symbols=yes -c foo*.c
> > > > as: unrecognized option '--generate-unused-section-symbols=yes'
> > > > as: unrecognized option '--generate-unused-section-symbols=yes'
> > >
> > > Reading so hard...
> > >
> > >  $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,-generate-unused-section-symbols=yes -c foo*.c
> > >
> > > seems to actually work.
> >
> > $ clang hello.c -Wa,-generate-unused-section-symbols=yes
> > clang-15: error: unsupported argument
> > '-generate-unused-section-symbols=yes' to option '-Wa,'
> >
> > :)
> >
> > I recall LLVM not emitting STT_SECTION symbols being problematic for
> > objtool in the past.
> > Old pre-lore archive:
> > https://groups.google.com/g/clang-built-linux/c/1C6YoJKBsQQ/m/a8IS1NjGAgAJ
> > via https://github.com/ClangBuiltLinux/linux/issues/1209.
> > (It doesn't look too bad for me to implement)

https://reviews.llvm.org/D123874

>
> Right, and that objtool "fix" -- silently falling back to
> non-section-symbol relocations instead of erroring out -- is causing
> this issue, when it's done for weak symbols.
>
> If LLVM assembler doesn't support this option then we may have to go
> with something like this?  I can't seem to recreate so I'm not able to
> test.

FWIW, I found this note that retaining STT_SECTION will have adverse
binary size effects for LTO.
https://reviews.llvm.org/D93783#2470728
Perhaps food for thought here.
--
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 11:19 The trouble with __weak and objtool got worse Peter Zijlstra
                   ` (3 preceding siblings ...)
  2022-04-15 18:22 ` Segher Boessenkool
@ 2022-04-15 21:04 ` H.J. Lu
  2022-04-16 11:25   ` Peter Zijlstra
  4 siblings, 1 reply; 29+ messages in thread
From: H.J. Lu @ 2022-04-15 21:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: the arch/x86 maintainers, Josh Poimboeuf, Nick Desaulniers,
	Miroslav Benes, rostedt, linux-toolchains

On Fri, Apr 15, 2022 at 4:19 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> Hi,
>
> After chasing my tail for a good few hours this morning I finally
> figured out why code patching is exploding for me in weird and wonderful
> ways... :-(
>
> This problem affects everything where we have external tools (mostly
> objtool but possible also recordmcount) generate location sections for
> us, notably things like:
>
>   .static_call_sites
>   .retpoline_sites
>   __mcount_loc
>   .orc_unwind
>   .orc_unwind_ip
>
> from individual translation-unit runs.
>
> Consider:
>
> foo-weak.c:
>
> extern void __SCT__foo(void);
>
> __attribute__((weak)) void foo(void)
> {
>         return __SCT__foo();
> }
>
> foo.c:
>
> extern void __SCT__foo(void);
> extern void my_foo(void);
>
> void foo(void)
> {
>         my_foo();
>         return __SCT__foo();
> }
>
> These generate the obvious code
> (gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -c foo*.c):
>
> foo-weak.o:
> 0000000000000000 <foo>:
>    0:   e9 00 00 00 00          jmpq   5 <foo+0x5>      1: R_X86_64_PLT32       __SCT__foo-0x4
>
> foo.o:
> 0000000000000000 <foo>:
>    0:   48 83 ec 08             sub    $0x8,%rsp
>    4:   e8 00 00 00 00          callq  9 <foo+0x9>      5: R_X86_64_PLT32       my_foo-0x4
>    9:   48 83 c4 08             add    $0x8,%rsp
>    d:   e9 00 00 00 00          jmpq   12 <foo+0x12>    e: R_X86_64_PLT32       __SCT__foo-0x4
>
>
> Now, when we link these two files together, you get something like (ld -r -o foos.o foo-weak.o foo.o):
>
> foos.o:
> 0000000000000000 <foo-0x10>:
>    0:   e9 00 00 00 00          jmpq   5 <foo-0xb>      1: R_X86_64_PLT32       __SCT__foo-0x4
>    5:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)
>    f:   90                      nop
>
> 0000000000000010 <foo>:
>   10:   48 83 ec 08             sub    $0x8,%rsp
>   14:   e8 00 00 00 00          callq  19 <foo+0x9>     15: R_X86_64_PLT32      my_foo-0x4
>   19:   48 83 c4 08             add    $0x8,%rsp
>   1d:   e9 00 00 00 00          jmpq   22 <foo+0x12>    1e: R_X86_64_PLT32      __SCT__foo-0x4
>
> Noting that ld preserves the weak function text, but strips the symbol
> off of it (hence objdump doing that funny negative offset thing). This
> does lead to 'interesting' unused code issues with objtool when ran on
> linked objects, but that seems to be working (fingers crossed).
>
> So far so good.. Now lets consider the objtool static_call output
> section (readelf output, old binutils):
>
> foo-weak.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + 0
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> foo.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + d
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> foos.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 0
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
> 0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 1d
> 000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> So we have two patch sites, one in the dead code of the weak foo and one
> in the real foo. All is well.
>
> *HOWEVER*, sometimes it generates things like this (using new enough
> binutils):
>
> foo-weak.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + 0
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> foo.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + d
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> foos.o:
>
> Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
>     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> 0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 foo + 0
> 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
> 0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 foo + d
> 000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
>
> And now we can see how that foos.o .static_call_sites goes side-ways, we
> now have _two_ patch sites in foo. One for the weak symbol at foo+0
> (which is no longer a static_call site!) and one at foo+d which is in
> fact the right location.

This can't be right.  "foo" must point to the non-weak function.
Please file a linker bug report.

> This seems to happen when objtool cannot find a section symbol, in which
> case it falls back to any other symbol to key off of, however in this
> case that goes terribly wrong!
>
> Now, if we do IBT/LTO builds, and run objtool on linked objects only,
> this obvious doesn't happen, because the weak stuff has been resolved by
> then. But ideally we'd not force that since it has a build time
> penalty..
>
> The other option seems to be to have objtool add section symbols it
> needs, however due to ELF being a total PITA and requiring all LOCAL
> symbols to be before GLOBAL symbols, this would mean re-ordering the
> whole symbol table (I have the code somewhere :-/).
>
> Alternatively:
>
>   https://sourceware.org/pipermail/binutils/2020-December/114671.html
>
> seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> work, except I'm getting:
>
> $ gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -Wa,--generate-unused-section-symbols=yes -c foo*.c
> as: unrecognized option '--generate-unused-section-symbols=yes'
> as: unrecognized option '--generate-unused-section-symbols=yes'
>
> $ as --version
> GNU assembler (GNU Binutils for Debian) 2.38
>
>
> Opinions?



-- 
H.J.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 20:31       ` Nick Desaulniers
@ 2022-04-15 21:17         ` Fangrui Song
  2022-04-15 21:41           ` Segher Boessenkool
  0 siblings, 1 reply; 29+ messages in thread
From: Fangrui Song @ 2022-04-15 21:17 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Segher Boessenkool, hjl.tools, Peter Zijlstra, x86,
	Josh Poimboeuf, mbenes, rostedt, linux-toolchains

On 2022-04-15, Nick Desaulniers wrote:
>On Fri, Apr 15, 2022 at 1:11 PM Segher Boessenkool
><segher@kernel.crashing.org> wrote:
>>
>> On Fri, Apr 15, 2022 at 11:36:32AM -0700, Nick Desaulniers wrote:
>> > On Fri, Apr 15, 2022 at 11:27 AM Segher Boessenkool
>> > <segher@kernel.crashing.org> wrote:
>> > >
>> > > > Alternatively:
>> > > >
>> > > >   https://sourceware.org/pipermail/binutils/2020-December/114671.html
>> > > >
>> > > > seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
>> > > > work, except I'm getting:
>> > >
>> > > That email is for a proposed patch.  Did anything further ever happen
>> > > with it?
>> >
>> > $ gcc hello.c -Wa,--generate-unused-section-symbols=yes
>> > as: unrecognized option '--generate-unused-section-symbols=yes'
>> > $ gcc hello.c -Wa,-generate-unused-section-symbols=yes
>> > $ gcc --version
>> > gcc (Debian 11.2.0-16) 11.2.0
>> >
>> > Uh, the email says -- prefix, but reality shows a single prefix? What
>> > happened there? Maybe that link is to an earlier version than what
>> > landed?
>>
>> Neither has landed.  You get the -g option, which can take options of
>> itself, parsed by the target or file format code.  "as -gobbledygook"
>> works fine as well :-)
>
>https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1408485ce69f844dcd7ded093a8
>Looks like it's been in binutils since 2.36 (IIUC the ChangeLog correctly).

(Came to this thread from https://reviews.llvm.org/D123874)

H.J.'s patch discards unused STT_SECTION symbols.

The ability to add back such symbols (--generate-unused-section-symbols={yes|no}) is not in binutils.
Neither 2.36 nor master branch has --generate-unused-section-symbols={yes|no}.

as -generate-unused-section-symbols=yes is just a GNU as quirk that it
interprets nearly all -g* options as -g, which generates .debug_arange
and .debug_info compile units. The feature is very different from
restoring STT_SECTION symbols.

I know little about the kernel requirement but I think it shouldn't be
too difficult to teach a relocatable object file consumer to not assume
existence of STT_SECTION. The converse (add the feature to GNU as and
LLVM integrated assembler) would be much more unideal.

See also
https://sourceware.org/pipermail/binutils/2022-March/119940.html

> This problem was fixed in the kernel (and backported by distros to their 
> kernels if binutils was updated); it's fairly simplistic changes.
> 
> I don't see the need for returning (optionally) to the old behaviour 
> again.  I mean, at the time, when 2.36 came out, sure, the patch might 
> have made sense, but now?


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 21:17         ` Fangrui Song
@ 2022-04-15 21:41           ` Segher Boessenkool
  0 siblings, 0 replies; 29+ messages in thread
From: Segher Boessenkool @ 2022-04-15 21:41 UTC (permalink / raw)
  To: Fangrui Song
  Cc: Nick Desaulniers, hjl.tools, Peter Zijlstra, x86, Josh Poimboeuf,
	mbenes, rostedt, linux-toolchains

On Fri, Apr 15, 2022 at 02:17:40PM -0700, Fangrui Song wrote:
> On 2022-04-15, Nick Desaulniers wrote:
> >>> Uh, the email says -- prefix, but reality shows a single prefix? What
> >>> happened there? Maybe that link is to an earlier version than what
> >>> landed?
> >>
> >>Neither has landed.  You get the -g option, which can take options of
> >>itself, parsed by the target or file format code.  "as -gobbledygook"
> >>works fine as well :-)
> >
> >https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1408485ce69f844dcd7ded093a8
> >Looks like it's been in binutils since 2.36 (IIUC the ChangeLog correctly).
> 
> (Came to this thread from https://reviews.llvm.org/D123874)
> 
> H.J.'s patch discards unused STT_SECTION symbols.

Yup, it doesn't add any command line option, it is an internals patch.

> The ability to add back such symbols 
> (--generate-unused-section-symbols={yes|no}) is not in binutils.
> Neither 2.36 nor master branch has 
> --generate-unused-section-symbols={yes|no}.

Which is a good thing really, it is a horrible option name.

> as -generate-unused-section-symbols=yes is just a GNU as quirk that it
> interprets nearly all -g* options as -g, which generates .debug_arange
> and .debug_info compile units.

As -gdwarf-2 in fact, for most ELF targets (specific targets can
override it still, although hopefully they don't :-) )

> The feature is very different from
> restoring STT_SECTION symbols.
> 
> I know little about the kernel requirement but I think it shouldn't be
> too difficult to teach a relocatable object file consumer to not assume
> existence of STT_SECTION. The converse (add the feature to GNU as and
> LLVM integrated assembler) would be much more unideal.

Yeah, there are sections already (those *are* required in ELF), so
objtool should just use those, instead of relying on coincidences that
were true in old tools versions, but were never guaranteed :-)


Segher

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 18:21     ` Josh Poimboeuf
  2022-04-15 18:23       ` Josh Poimboeuf
  2022-04-15 20:36       ` Nick Desaulniers
@ 2022-04-16 10:48       ` Peter Zijlstra
  2022-04-16 16:07         ` Josh Poimboeuf
  2 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2022-04-16 10:48 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Nick Desaulniers, x86, hjl.tools, mbenes, rostedt,
	linux-toolchains, clang-built-linux

On Fri, Apr 15, 2022 at 11:21:30AM -0700, Josh Poimboeuf wrote:

> If LLVM assembler doesn't support this option then we may have to go
> with something like this?  I can't seem to recreate so I'm not able to
> test.
> 
> We'd also need some objtool checks to make sure the non-section-symbol
> fallback isn't being done for a weak symbol.  Or just remove the
> fallback altogether and force section symbols whereever they're needed,
> similar to the below (untested).
> 
> 
> diff --git a/include/linux/static_call_types.h b/include/linux/static_call_types.h
> index 5a00b8b2cf9f..77040ce575fa 100644
> --- a/include/linux/static_call_types.h
> +++ b/include/linux/static_call_types.h
> @@ -52,9 +52,14 @@ struct static_call_site {
>  #define __STATIC_CALL_ADDRESSABLE(name) \
>  	__ADDRESSABLE(STATIC_CALL_KEY(name))
>  
> +extern unsigned long __addressable_ip;
> +
> +#define __STATIC_CALL_SITE_ADDRESSABLE() __addressable_ip = _THIS_IP_;
> +
>  #define __static_call(name)						\
>  ({									\
>  	__STATIC_CALL_ADDRESSABLE(name);				\
> +	__STATIC_CALL_SITE_ADDRESSABLE();				\
>  	__raw_static_call(name);					\
>  })
>  
> diff --git a/kernel/static_call_inline.c b/kernel/static_call_inline.c
> index dc5665b62814..f6e3e0463efb 100644
> --- a/kernel/static_call_inline.c
> +++ b/kernel/static_call_inline.c
> @@ -17,6 +17,8 @@ extern struct static_call_tramp_key __start_static_call_tramp_key[],
>  
>  static bool static_call_initialized;
>  
> +unsigned long __section(".discard.addressable") __addressable_ip;
> +
>  /* mutex to protect key modules/sites */
>  static DEFINE_MUTEX(static_call_mutex);

This *might* work for static_call. But what about things like
.retpoline_sites?

If we get an indirect inside the weak function and at
a different place inside the non-weak function, then we end up with two
patch sites in the non-weak function.

And since all of that is compiler generated, we don't have anything to
stick anything in to alleviate trouble.

(in fact, I didn't observe this with static_call, I just picked it for
the example because it was somewhat easier)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 20:36       ` Nick Desaulniers
@ 2022-04-16 10:49         ` Peter Zijlstra
  0 siblings, 0 replies; 29+ messages in thread
From: Peter Zijlstra @ 2022-04-16 10:49 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Josh Poimboeuf, x86, hjl.tools, mbenes, rostedt,
	linux-toolchains, clang-built-linux, Fangrui Song,
	Segher Boessenkool

On Fri, Apr 15, 2022 at 01:36:12PM -0700, Nick Desaulniers wrote:
> FWIW, I found this note that retaining STT_SECTION will have adverse
> binary size effects for LTO.
> https://reviews.llvm.org/D93783#2470728
> Perhaps food for thought here.

LTO doesn't suffer this problem since then we run objtool after linking
and it never sees the weak stuff to get confused about.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 18:22 ` Segher Boessenkool
  2022-04-15 18:36   ` Nick Desaulniers
@ 2022-04-16 10:59   ` Peter Zijlstra
  2022-04-16 13:20     ` Segher Boessenkool
  1 sibling, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2022-04-16 10:59 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: x86, Josh Poimboeuf, hjl.tools, ndesaulniers, mbenes, rostedt,
	linux-toolchains

On Fri, Apr 15, 2022 at 01:22:29PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Apr 15, 2022 at 01:19:04PM +0200, Peter Zijlstra wrote:
> [ huge snip ]
> 
> > This seems to happen when objtool cannot find a section symbol, in which
> > case it falls back to any other symbol to key off of, however in this
> > case that goes terribly wrong!
> 
> That is an objtool problem then.  Fix objtool so it looks at symbols
> directly?

I'm not sure what you're saying. It *does* look at the symbols, but at
the time we run objtool (pre link) there's multiple copies of a symbol
(weak and non-weak), and if we then generate symbol relative relocs, we
end up, post-link, with the relocs resolved wrong.

The weak symbol relocs get applied to the non-weak symbol, and fail
happens.

> > The other option seems to be to have objtool add section symbols it
> > needs, however due to ELF being a total PITA and requiring all LOCAL
> > symbols to be before GLOBAL symbols, this would mean re-ordering the
> > whole symbol table (I have the code somewhere :-/).
> 
> ELF requires no such thing?  Where do you see this?

The first google hit:

  https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-79797.html

which has:

  "In each symbol table, all symbols with STB_LOCAL binding precede the
  weak symbols and global symbols. As Sections describes, a symbol table
  section's sh_info section header member holds the symbol table index for
  the first non-local symbol."

I ran into that when I violated that contraint at some point and
the various elf tools screamed at me.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 20:07     ` Segher Boessenkool
  2022-04-15 20:31       ` Nick Desaulniers
@ 2022-04-16 11:09       ` Peter Zijlstra
  1 sibling, 0 replies; 29+ messages in thread
From: Peter Zijlstra @ 2022-04-16 11:09 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Nick Desaulniers, hjl.tools, x86, Josh Poimboeuf, mbenes,
	rostedt, linux-toolchains, Fangrui Song

On Fri, Apr 15, 2022 at 03:07:14PM -0500, Segher Boessenkool wrote:
> On Fri, Apr 15, 2022 at 11:36:32AM -0700, Nick Desaulniers wrote:
> > On Fri, Apr 15, 2022 at 11:27 AM Segher Boessenkool
> > <segher@kernel.crashing.org> wrote:
> > >
> > > > Alternatively:
> > > >
> > > >   https://sourceware.org/pipermail/binutils/2020-December/114671.html
> > > >
> > > > seems to suggest: -Wa,--generate-unused-section-symbols=yes, ought to
> > > > work, except I'm getting:
> > >
> > > That email is for a proposed patch.  Did anything further ever happen
> > > with it?
> > 
> > $ gcc hello.c -Wa,--generate-unused-section-symbols=yes
> > as: unrecognized option '--generate-unused-section-symbols=yes'
> > $ gcc hello.c -Wa,-generate-unused-section-symbols=yes
> > $ gcc --version
> > gcc (Debian 11.2.0-16) 11.2.0
> > 
> > Uh, the email says -- prefix, but reality shows a single prefix? What
> > happened there? Maybe that link is to an earlier version than what
> > landed?
> 
> Neither has landed.  You get the -g option, which can take options of
> itself, parsed by the target or file format code.  "as -gobbledygook"
> works fine as well :-)

Urgh :-/

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-15 21:04 ` H.J. Lu
@ 2022-04-16 11:25   ` Peter Zijlstra
  2022-04-16 16:27     ` H.J. Lu
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2022-04-16 11:25 UTC (permalink / raw)
  To: H.J. Lu
  Cc: the arch/x86 maintainers, Josh Poimboeuf, Nick Desaulniers,
	Miroslav Benes, rostedt, linux-toolchains

On Fri, Apr 15, 2022 at 02:04:32PM -0700, H.J. Lu wrote:
> On Fri, Apr 15, 2022 at 4:19 AM Peter Zijlstra <peterz@infradead.org> wrote:

> > *HOWEVER*, sometimes it generates things like this (using new enough
> > binutils):
> >
> > foo-weak.o:
> >
> > Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
> >     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> > 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + 0
> > 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
> >
> > foo.o:
> >
> > Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
> >     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> > 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + d
> > 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
> >
> > foos.o:
> >
> > Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
> >     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> > 0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 foo + 0
> > 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
> > 0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 foo + d
> > 000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
> >
> > And now we can see how that foos.o .static_call_sites goes side-ways, we
> > now have _two_ patch sites in foo. One for the weak symbol at foo+0
> > (which is no longer a static_call site!) and one at foo+d which is in
> > fact the right location.
> 
> This can't be right.  "foo" must point to the non-weak function.
> Please file a linker bug report.

It does point to the non-weak symbol; they both do, even the relocs that
started out as pointing to the weak symbol.

Are you saying the linker *should* have removed the relocs that were
based on the weak symbol?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-16 10:59   ` Peter Zijlstra
@ 2022-04-16 13:20     ` Segher Boessenkool
  2022-04-16 17:59       ` Segher Boessenkool
  0 siblings, 1 reply; 29+ messages in thread
From: Segher Boessenkool @ 2022-04-16 13:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, Josh Poimboeuf, hjl.tools, ndesaulniers, mbenes, rostedt,
	linux-toolchains

On Sat, Apr 16, 2022 at 12:59:05PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 15, 2022 at 01:22:29PM -0500, Segher Boessenkool wrote:
> > On Fri, Apr 15, 2022 at 01:19:04PM +0200, Peter Zijlstra wrote:
> > [ huge snip ]
> > 
> > > This seems to happen when objtool cannot find a section symbol, in which
> > > case it falls back to any other symbol to key off of, however in this
> > > case that goes terribly wrong!
> > 
> > That is an objtool problem then.  Fix objtool so it looks at symbols
> > directly?
> 
> I'm not sure what you're saying.

So sorry.  "So that it looks at *sections* directly", I meant to say.

> > > The other option seems to be to have objtool add section symbols it
> > > needs, however due to ELF being a total PITA and requiring all LOCAL
> > > symbols to be before GLOBAL symbols, this would mean re-ordering the
> > > whole symbol table (I have the code somewhere :-/).
> > 
> > ELF requires no such thing?  Where do you see this?
> 
> The first google hit:
> 
>   https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-79797.html
> 
> which has:
> 
>   "In each symbol table, all symbols with STB_LOCAL binding precede the
>   weak symbols and global symbols. As Sections describes, a symbol table
>   section's sh_info section header member holds the symbol table index for
>   the first non-local symbol."

That is not the ELF spec, and it is incorrect afaics.  Do you have any
authorative reference?

> I ran into that when I violated that contraint at some point and
> the various elf tools screamed at me.

Huh.  Can you show anything that reproduces that?


Segher

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-16 10:48       ` Peter Zijlstra
@ 2022-04-16 16:07         ` Josh Poimboeuf
  2022-04-16 16:32           ` H.J. Lu
  2022-04-17 15:44           ` Peter Zijlstra
  0 siblings, 2 replies; 29+ messages in thread
From: Josh Poimboeuf @ 2022-04-16 16:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Desaulniers, x86, hjl.tools, mbenes, rostedt,
	linux-toolchains, clang-built-linux

On Sat, Apr 16, 2022 at 12:48:09PM +0200, Peter Zijlstra wrote:
> This *might* work for static_call. But what about things like
> .retpoline_sites?
> 
> If we get an indirect inside the weak function and at
> a different place inside the non-weak function, then we end up with two
> patch sites in the non-weak function.
> 
> And since all of that is compiler generated, we don't have anything to
> stick anything in to alleviate trouble.
> 
> (in fact, I didn't observe this with static_call, I just picked it for
> the example because it was somewhat easier)

Ok, right.  And ORC entries will be broken.  I found the ones below by
disabling CONFIG_DEBUG_INFO and looking for "sym->bind == STB_WEAK" in
the elf_add_reloc_to_insn() fallback.

I guess objtool will need to bite the bullet and create section symbols.

  lib/clz_ctz.o: warning: objtool: .orc_unwind_ip+0x0: missing section symbol for reference to weak function text
  lib/clz_ctz.o: warning: objtool: .orc_unwind_ip+0x4: missing section symbol for reference to weak function text
  lib/clz_ctz.o: warning: objtool: .orc_unwind_ip+0x8: missing section symbol for reference to weak function text
  lib/clz_ctz.o: warning: objtool: .orc_unwind_ip+0xc: missing section symbol for reference to weak function text
  lib/clz_ctz.o: warning: objtool: .orc_unwind_ip+0x10: missing section symbol for reference to weak function text
  lib/clz_ctz.o: warning: objtool: .orc_unwind_ip+0x14: missing section symbol for reference to weak function text
  lib/iomap_copy.o: warning: objtool: .orc_unwind_ip+0x0: missing section symbol for reference to weak function text
  lib/iomap_copy.o: warning: objtool: .orc_unwind_ip+0x4: missing section symbol for reference to weak function text
  lib/iomap_copy.o: warning: objtool: .orc_unwind_ip+0x10: missing section symbol for reference to weak function text
  lib/iomap_copy.o: warning: objtool: .orc_unwind_ip+0x14: missing section symbol for reference to weak function text
  lib/crc32.o: warning: objtool: .orc_unwind_ip+0x14: missing section symbol for reference to weak function text
  lib/crc32.o: warning: objtool: .orc_unwind_ip+0x18: missing section symbol for reference to weak function text
  lib/crc32.o: warning: objtool: .orc_unwind_ip+0x1c: missing section symbol for reference to weak function text
  lib/crc32.o: warning: objtool: .orc_unwind_ip+0x20: missing section symbol for reference to weak function text
  lib/crc32.o: warning: objtool: .orc_unwind_ip+0x24: missing section symbol for reference to weak function text
  lib/crc32.o: warning: objtool: .orc_unwind_ip+0x28: missing section symbol for reference to weak function text

-- 
Josh


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-16 11:25   ` Peter Zijlstra
@ 2022-04-16 16:27     ` H.J. Lu
  0 siblings, 0 replies; 29+ messages in thread
From: H.J. Lu @ 2022-04-16 16:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: the arch/x86 maintainers, Josh Poimboeuf, Nick Desaulniers,
	Miroslav Benes, rostedt, linux-toolchains

On Sat, Apr 16, 2022 at 4:25 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Apr 15, 2022 at 02:04:32PM -0700, H.J. Lu wrote:
> > On Fri, Apr 15, 2022 at 4:19 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> > > *HOWEVER*, sometimes it generates things like this (using new enough
> > > binutils):
> > >
> > > foo-weak.o:
> > >
> > > Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
> > >     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> > > 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + 0
> > > 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
> > >
> > > foo.o:
> > >
> > > Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
> > >     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> > > 0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + d
> > > 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
> > >
> > > foos.o:
> > >
> > > Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
> > >     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
> > > 0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 foo + 0
> > > 0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
> > > 0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 foo + d
> > > 000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
> > >
> > > And now we can see how that foos.o .static_call_sites goes side-ways, we
> > > now have _two_ patch sites in foo. One for the weak symbol at foo+0
> > > (which is no longer a static_call site!) and one at foo+d which is in
> > > fact the right location.
> >
> > This can't be right.  "foo" must point to the non-weak function.
> > Please file a linker bug report.
>
> It does point to the non-weak symbol; they both do, even the relocs that
> started out as pointing to the weak symbol.

The relocation is against the symbol, foo, and linker will decide where
foo is defined.   We only know what the final foo definition is at link-time.
You can't tell the final foo definition is weak or non-weak before link-time.

> Are you saying the linker *should* have removed the relocs that were
> based on the weak symbol?

No, relocations aren't removed.  Linker ignores the weak definition of
foo if there is a non-weak definition of foo when resolving relocations
against symbol foo.

-- 
H.J.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-16 16:07         ` Josh Poimboeuf
@ 2022-04-16 16:32           ` H.J. Lu
  2022-04-17 15:44           ` Peter Zijlstra
  1 sibling, 0 replies; 29+ messages in thread
From: H.J. Lu @ 2022-04-16 16:32 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, Nick Desaulniers, the arch/x86 maintainers,
	Miroslav Benes, rostedt, linux-toolchains, clang-built-linux

On Sat, Apr 16, 2022 at 9:07 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Sat, Apr 16, 2022 at 12:48:09PM +0200, Peter Zijlstra wrote:
> > This *might* work for static_call. But what about things like
> > .retpoline_sites?
> >
> > If we get an indirect inside the weak function and at
> > a different place inside the non-weak function, then we end up with two
> > patch sites in the non-weak function.
> >
> > And since all of that is compiler generated, we don't have anything to
> > stick anything in to alleviate trouble.
> >
> > (in fact, I didn't observe this with static_call, I just picked it for
> > the example because it was somewhat easier)
>
> Ok, right.  And ORC entries will be broken.  I found the ones below by
> disabling CONFIG_DEBUG_INFO and looking for "sym->bind == STB_WEAK" in
> the elf_add_reloc_to_insn() fallback.
>
> I guess objtool will need to bite the bullet and create section symbols.
>
>   lib/clz_ctz.o: warning: objtool: .orc_unwind_ip+0x0: missing section symbol for reference to weak function text
>   lib/clz_ctz.o: warning: objtool: .orc_unwind_ip+0x4: missing section symbol for reference to weak function text
>   lib/clz_ctz.o: warning: objtool: .orc_unwind_ip+0x8: missing section symbol for reference to weak function text
>   lib/clz_ctz.o: warning: objtool: .orc_unwind_ip+0xc: missing section symbol for reference to weak function text
>   lib/clz_ctz.o: warning: objtool: .orc_unwind_ip+0x10: missing section symbol for reference to weak function text
>   lib/clz_ctz.o: warning: objtool: .orc_unwind_ip+0x14: missing section symbol for reference to weak function text
>   lib/iomap_copy.o: warning: objtool: .orc_unwind_ip+0x0: missing section symbol for reference to weak function text
>   lib/iomap_copy.o: warning: objtool: .orc_unwind_ip+0x4: missing section symbol for reference to weak function text
>   lib/iomap_copy.o: warning: objtool: .orc_unwind_ip+0x10: missing section symbol for reference to weak function text
>   lib/iomap_copy.o: warning: objtool: .orc_unwind_ip+0x14: missing section symbol for reference to weak function text
>   lib/crc32.o: warning: objtool: .orc_unwind_ip+0x14: missing section symbol for reference to weak function text
>   lib/crc32.o: warning: objtool: .orc_unwind_ip+0x18: missing section symbol for reference to weak function text
>   lib/crc32.o: warning: objtool: .orc_unwind_ip+0x1c: missing section symbol for reference to weak function text
>   lib/crc32.o: warning: objtool: .orc_unwind_ip+0x20: missing section symbol for reference to weak function text
>   lib/crc32.o: warning: objtool: .orc_unwind_ip+0x24: missing section symbol for reference to weak function text
>   lib/crc32.o: warning: objtool: .orc_unwind_ip+0x28: missing section symbol for reference to weak function text

STB_WEAK is meaningful when deciding if a symbol definition should
be ignored only when there is no STB_GLOBAL at link-time.   Tools
must keep the section index of a weak symbol definition if there is no
non-weak definition.

-- 
H.J.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-16 13:20     ` Segher Boessenkool
@ 2022-04-16 17:59       ` Segher Boessenkool
  0 siblings, 0 replies; 29+ messages in thread
From: Segher Boessenkool @ 2022-04-16 17:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, Josh Poimboeuf, hjl.tools, ndesaulniers, mbenes, rostedt,
	linux-toolchains

On Sat, Apr 16, 2022 at 08:20:06AM -0500, Segher Boessenkool wrote:
> On Sat, Apr 16, 2022 at 12:59:05PM +0200, Peter Zijlstra wrote:
> > On Fri, Apr 15, 2022 at 01:22:29PM -0500, Segher Boessenkool wrote:
> > > On Fri, Apr 15, 2022 at 01:19:04PM +0200, Peter Zijlstra wrote:
> > > > The other option seems to be to have objtool add section symbols it
> > > > needs, however due to ELF being a total PITA and requiring all LOCAL
> > > > symbols to be before GLOBAL symbols, this would mean re-ordering the
> > > > whole symbol table (I have the code somewhere :-/).
> > > 
> > > ELF requires no such thing?  Where do you see this?
> > 
> > The first google hit:
> > 
> >   https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-79797.html
> > 
> > which has:
> > 
> >   "In each symbol table, all symbols with STB_LOCAL binding precede the
> >   weak symbols and global symbols. As Sections describes, a symbol table
> >   section's sh_info section header member holds the symbol table index for
> >   the first non-local symbol."
> 
> That is not the ELF spec, and it is incorrect afaics.  Do you have any
> authorative reference?

It is on page 1-18 of the ELF specification version 1.1, at the bottom
of the page.  How unfortunate.  It says:
  In each symbol table, all symbols with STB_LOCAL binding preced the
  weak and global symbols.  As ``Sections'' above describes, a symbol
  table section's sh_info section header member holds the symbol table
  index for the first non-local symbol.
so it is an optimisation, assuming that there are significantly fewer
local than global symbols.


Segher

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-16 16:07         ` Josh Poimboeuf
  2022-04-16 16:32           ` H.J. Lu
@ 2022-04-17 15:44           ` Peter Zijlstra
  2022-04-17 15:46             ` Peter Zijlstra
  1 sibling, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2022-04-17 15:44 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Nick Desaulniers, x86, hjl.tools, mbenes, rostedt,
	linux-toolchains, clang-built-linux

On Sat, Apr 16, 2022 at 09:07:42AM -0700, Josh Poimboeuf wrote:

> I guess objtool will need to bite the bullet and create section symbols.

---
Subject: objtool: Fix relocs vs weak symbols
From: Peter Zijlstra <peterz@infradead.org>
Date: Sun Apr 17 17:03:36 CEST 2022

Occasionally objtool driven code patching (think .static_call_sites
.retpoline_sites etc..) goes sideways and it tries to patch an
instruction that doesn't match.

Much head-scatching and cursing later the problem is as follows:

Consider:

foo-weak.c:

extern void __SCT__foo(void);

__attribute__((weak)) void foo(void)
{
        return __SCT__foo();
}

foo.c:

extern void __SCT__foo(void);
extern void my_foo(void);

void foo(void)
{
        my_foo();
        return __SCT__foo();
}

These generate the obvious code
(gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -c foo*.c):

foo-weak.o:
0000000000000000 <foo>:
   0:   e9 00 00 00 00          jmpq   5 <foo+0x5>      1: R_X86_64_PLT32       __SCT__foo-0x4

foo.o:
0000000000000000 <foo>:
   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   e8 00 00 00 00          callq  9 <foo+0x9>      5: R_X86_64_PLT32       my_foo-0x4
   9:   48 83 c4 08             add    $0x8,%rsp
   d:   e9 00 00 00 00          jmpq   12 <foo+0x12>    e: R_X86_64_PLT32       __SCT__foo-0x4

Now, when we link these two files together, you get something like (ld -r -o foos.o foo-weak.o foo.o):

foos.o:
0000000000000000 <foo-0x10>:
   0:   e9 00 00 00 00          jmpq   5 <foo-0xb>      1: R_X86_64_PLT32       __SCT__foo-0x4
   5:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)
   f:   90                      nop

0000000000000010 <foo>:
  10:   48 83 ec 08             sub    $0x8,%rsp
  14:   e8 00 00 00 00          callq  19 <foo+0x9>     15: R_X86_64_PLT32      my_foo-0x4
  19:   48 83 c4 08             add    $0x8,%rsp
  1d:   e9 00 00 00 00          jmpq   22 <foo+0x12>    1e: R_X86_64_PLT32      __SCT__foo-0x4

Noting that ld preserves the weak function text, but strips the symbol
off of it (hence objdump doing that funny negative offset thing). This
does lead to 'interesting' unused code issues with objtool when ran on
linked objects, but that seems to be working (fingers crossed).

So far so good.. Now lets consider the objtool static_call output
section (readelf output, old binutils):

foo-weak.o:

Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foo.o:

Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + d
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foos.o:

Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 1d
000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

So we have two patch sites, one in the dead code of the weak foo and one
in the real foo. All is well.

*HOWEVER*, sometimes it generates things like this (using new enough
binutils):

foo-weak.o:

Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foo.o:

Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + d
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foos.o:

Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 foo + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 foo + d
000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

And now we can see how that foos.o .static_call_sites goes side-ways, we
now have _two_ patch sites in foo. One for the weak symbol at foo+0
(which is no longer a static_call site!) and one at foo+d which is in
fact the right location.

This seems to happen when objtool cannot find a section symbol, in which
case it falls back to any other symbol to key off of, however in this
case that goes terribly wrong!

As such, teach objtool to create a section symbol when there isn't
one.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/elf.c |  182 +++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 160 insertions(+), 22 deletions(-)

--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -575,37 +575,175 @@ int elf_add_reloc(struct elf *elf, struc
 	return 0;
 }
 
-int elf_add_reloc_to_insn(struct elf *elf, struct section *sec,
-			  unsigned long offset, unsigned int type,
-			  struct section *insn_sec, unsigned long insn_off)
+/*
+ * Ensure that any reloc section containing references to @sym is marked
+ * changed such that it will get re-generated in elf_rebuild_reloc_sections()
+ * with the new symbol index.
+ */
+static void elf_dirty_reloc_sym(struct elf *elf, struct symbol *sym)
 {
+	struct section *sec;
+
+	list_for_each_entry(sec, &elf->sections, list) {
+		struct reloc *reloc;
+
+		if (sec->changed)
+			continue;
+
+		list_for_each_entry(reloc, &sec->reloc_list, list) {
+			if (reloc->sym == sym) {
+				sec->changed = true;
+				break;
+			}
+		}
+	}
+}
+
+/*
+ * Move the first global symbol, as per sh_info, into a new, higher symbol
+ * index. This fees up the shndx for a new local symbol.
+ */
+static int elf_move_global_symbol(struct elf *elf, struct section *symtab,
+				  struct section *symtab_shndx)
+{
+	Elf_Data *data, *shndx_data = NULL;
+	Elf32_Word first_non_local;
 	struct symbol *sym;
-	int addend;
+	Elf_Scn *s;
 
-	if (insn_sec->sym) {
-		sym = insn_sec->sym;
-		addend = insn_off;
+	first_non_local = symtab->sh.sh_info;
 
-	} else {
-		/*
-		 * The Clang assembler strips section symbols, so we have to
-		 * reference the function symbol instead:
-		 */
-		sym = find_symbol_containing(insn_sec, insn_off);
-		if (!sym) {
-			/*
-			 * Hack alert.  This happens when we need to reference
-			 * the NOP pad insn immediately after the function.
-			 */
-			sym = find_symbol_containing(insn_sec, insn_off - 1);
+	sym = find_symbol_by_index(elf, first_non_local);
+	if (!sym) {
+		WARN("no non-local symbols !?");
+		return first_non_local;
+	}
+
+	s = elf_getscn(elf->elf, symtab->idx);
+	if (!s) {
+		WARN_ELF("elf_getscn");
+		return -1;
+	}
+
+	data = elf_newdata(s);
+	if (!data) {
+		WARN_ELF("elf_newdata");
+		return -1;
+	}
+
+	data->d_buf = &sym->sym;
+	data->d_size = sizeof(sym->sym);
+	data->d_align = 1;
+	data->d_type = ELF_T_SYM;
+
+	sym->idx = symtab->sh.sh_size / sizeof(sym->sym);
+	elf_dirty_reloc_sym(elf, sym);
+
+	symtab->sh.sh_info += 1;
+	symtab->sh.sh_size += data->d_size;
+	symtab->changed = true;
+
+	/* XXX create ? */
+	if (symtab_shndx) {
+		s = elf_getscn(elf->elf, symtab_shndx->idx);
+		if (!s) {
+			WARN_ELF("elf_getscn");
+			return -1;
 		}
 
-		if (!sym) {
-			WARN("can't find symbol containing %s+0x%lx", insn_sec->name, insn_off);
+		shndx_data = elf_newdata(s);
+		if (!shndx_data) {
+			WARN_ELF("elf_newshndx_data");
 			return -1;
 		}
 
-		addend = insn_off - sym->offset;
+		shndx_data->d_buf = &sym->sec->idx;
+		shndx_data->d_size = sizeof(Elf32_Word);
+		shndx_data->d_align = 4;
+		shndx_data->d_type = ELF_T_WORD;
+
+		symtab_shndx->sh.sh_size += 4;
+		symtab_shndx->changed = true;
+	}
+
+	return first_non_local;
+}
+
+static struct symbol *
+elf_create_section_symbol(struct elf *elf, struct section *sec)
+{
+	struct section *symtab, *symtab_shndx;
+	Elf_Data *shndx_data = NULL;
+	struct symbol *sym;
+	Elf32_Word shndx;
+
+	symtab = find_section_by_name(elf, ".symtab");
+	if (symtab) {
+		symtab_shndx = find_section_by_name(elf, ".symtab_shndx");
+		if (symtab_shndx)
+			shndx_data = symtab_shndx->data;
+	} else {
+		WARN("no .symtab");
+		return NULL;
+	}
+
+	sym = malloc(sizeof(*sym));
+	if (!sym) {
+		perror("malloc");
+		return NULL;
+	}
+	memset(sym, 0, sizeof(*sym));
+
+	sym->idx = elf_move_global_symbol(elf, symtab, symtab_shndx);
+	if (sym->idx < 0) {
+		WARN("elf_move_global_symbol");
+		return NULL;
+	}
+
+	sym->name = sec->name;
+	sym->sec = sec;
+
+	// st_name 0
+	sym->sym.st_info = GELF_ST_INFO(STB_LOCAL, STT_SECTION);
+	// st_other 0
+	// st_value 0
+	// st_size 0
+	shndx = sec->idx;
+	if (shndx >= SHN_UNDEF && shndx < SHN_LORESERVE) {
+		sym->sym.st_shndx = shndx;
+		if (!shndx_data)
+			shndx = 0;
+	} else {
+		sym->sym.st_shndx = SHN_XINDEX;
+		if (!shndx_data) {
+			WARN("no .symtab_shndx");
+			return NULL;
+		}
+	}
+
+	if (!gelf_update_symshndx(symtab->data, shndx_data, sym->idx, &sym->sym, shndx)) {
+		WARN_ELF("gelf_update_symshndx");
+		return NULL;
+	}
+
+	elf_add_symbol(elf, sym);
+
+	return sym;
+}
+
+int elf_add_reloc_to_insn(struct elf *elf, struct section *sec,
+			  unsigned long offset, unsigned int type,
+			  struct section *insn_sec, unsigned long insn_off)
+{
+	struct symbol *sym = insn_sec->sym;
+	int addend = insn_off;
+
+	if (!sym) {
+		sym = elf_create_section_symbol(elf, insn_sec);
+		if (!sym)
+			return -1;
+
+		insn_sec->sym = sym;
 	}
 
 	return elf_add_reloc(elf, sec, offset, type, sym, addend);

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: The trouble with __weak and objtool got worse
  2022-04-17 15:44           ` Peter Zijlstra
@ 2022-04-17 15:46             ` Peter Zijlstra
  0 siblings, 0 replies; 29+ messages in thread
From: Peter Zijlstra @ 2022-04-17 15:46 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Nick Desaulniers, x86, hjl.tools, mbenes, rostedt,
	linux-toolchains, clang-built-linux

On Sun, Apr 17, 2022 at 05:44:22PM +0200, Peter Zijlstra wrote:
> On Sat, Apr 16, 2022 at 09:07:42AM -0700, Josh Poimboeuf wrote:
> 
> > I guess objtool will need to bite the bullet and create section symbols.
> 

Which then led to..

I'll properly post the patches on Tue, once this easter thing has run
it's course and we're actually working again, ha!

---
Subject: objtool: Fix type of reloc::addend
From: Peter Zijlstra <peterz@infradead.org>
Date: Sun Apr 17 17:03:40 CEST 2022

Elf{32,64}_Rela::r_addend is of type: Elf{32,64}_Sword, that means
that out reloc::addend needs to be long or face tuncation issues when
we do elf_rebuild_rel_reloc_sections():

  - 107:  48 b8 00 00 00 00 00 00 00 00   movabs $0x0,%rax        109: R_X86_64_64        level4_kernel_pgt+0x80000067
  + 107:  48 b8 00 00 00 00 00 00 00 00   movabs $0x0,%rax        109: R_X86_64_64        level4_kernel_pgt-0x7fffff99

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/check.c               |    8 ++++----
 tools/objtool/elf.c                 |    2 +-
 tools/objtool/include/objtool/elf.h |    4 ++--
 3 files changed, 7 insertions(+), 7 deletions(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -560,12 +560,12 @@ static int add_dead_ends(struct objtool_
 		else if (reloc->addend == reloc->sym->sec->sh.sh_size) {
 			insn = find_last_insn(file, reloc->sym->sec);
 			if (!insn) {
-				WARN("can't find unreachable insn at %s+0x%x",
+				WARN("can't find unreachable insn at %s+0x%lx",
 				     reloc->sym->sec->name, reloc->addend);
 				return -1;
 			}
 		} else {
-			WARN("can't find unreachable insn at %s+0x%x",
+			WARN("can't find unreachable insn at %s+0x%lx",
 			     reloc->sym->sec->name, reloc->addend);
 			return -1;
 		}
@@ -595,12 +595,12 @@ static int add_dead_ends(struct objtool_
 		else if (reloc->addend == reloc->sym->sec->sh.sh_size) {
 			insn = find_last_insn(file, reloc->sym->sec);
 			if (!insn) {
-				WARN("can't find reachable insn at %s+0x%x",
+				WARN("can't find reachable insn at %s+0x%lx",
 				     reloc->sym->sec->name, reloc->addend);
 				return -1;
 			}
 		} else {
-			WARN("can't find reachable insn at %s+0x%x",
+			WARN("can't find reachable insn at %s+0x%lx",
 			     reloc->sym->sec->name, reloc->addend);
 			return -1;
 		}
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -546,7 +546,7 @@ static struct section *elf_create_reloc_
 						int reltype);
 
 int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
-		  unsigned int type, struct symbol *sym, int addend)
+		  unsigned int type, struct symbol *sym, long addend)
 {
 	struct reloc *reloc;
 
--- a/tools/objtool/include/objtool/elf.h
+++ b/tools/objtool/include/objtool/elf.h
@@ -73,7 +73,7 @@ struct reloc {
 	struct symbol *sym;
 	unsigned long offset;
 	unsigned int type;
-	int addend;
+	long addend;
 	int idx;
 	bool jump_table_start;
 };
@@ -135,7 +135,7 @@ struct elf *elf_open_read(const char *na
 struct section *elf_create_section(struct elf *elf, const char *name, unsigned int sh_flags, size_t entsize, int nr);
 
 int elf_add_reloc(struct elf *elf, struct section *sec, unsigned long offset,
-		  unsigned int type, struct symbol *sym, int addend);
+		  unsigned int type, struct symbol *sym, long addend);
 int elf_add_reloc_to_insn(struct elf *elf, struct section *sec,
 			  unsigned long offset, unsigned int type,
 			  struct section *insn_sec, unsigned long insn_off);

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2022-04-17 15:46 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-15 11:19 The trouble with __weak and objtool got worse Peter Zijlstra
2022-04-15 14:12 ` Steven Rostedt
2022-04-15 15:31   ` Peter Zijlstra
2022-04-15 15:10 ` Josh Poimboeuf
2022-04-15 15:15   ` Josh Poimboeuf
2022-04-15 15:26 ` Peter Zijlstra
2022-04-15 17:40   ` Nick Desaulniers
2022-04-15 18:21     ` Josh Poimboeuf
2022-04-15 18:23       ` Josh Poimboeuf
2022-04-15 20:36       ` Nick Desaulniers
2022-04-16 10:49         ` Peter Zijlstra
2022-04-16 10:48       ` Peter Zijlstra
2022-04-16 16:07         ` Josh Poimboeuf
2022-04-16 16:32           ` H.J. Lu
2022-04-17 15:44           ` Peter Zijlstra
2022-04-17 15:46             ` Peter Zijlstra
2022-04-15 18:22 ` Segher Boessenkool
2022-04-15 18:36   ` Nick Desaulniers
2022-04-15 20:07     ` Segher Boessenkool
2022-04-15 20:31       ` Nick Desaulniers
2022-04-15 21:17         ` Fangrui Song
2022-04-15 21:41           ` Segher Boessenkool
2022-04-16 11:09       ` Peter Zijlstra
2022-04-16 10:59   ` Peter Zijlstra
2022-04-16 13:20     ` Segher Boessenkool
2022-04-16 17:59       ` Segher Boessenkool
2022-04-15 21:04 ` H.J. Lu
2022-04-16 11:25   ` Peter Zijlstra
2022-04-16 16:27     ` H.J. Lu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.