* [PATCH bpf-next v2] libbpf: ignore .eh_frame sections when parsing elf files
@ 2021-08-26 12:09 Toke Høiland-Jørgensen
  2021-08-30 21:49 ` Andrii Nakryiko
  0 siblings, 1 reply; 12+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-08-26 12:09 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: Toke Høiland-Jørgensen, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, bpf

When .eh_frame and .rel.eh_frame sections are present in BPF object files,
libbpf produces errors like this when loading the file:

libbpf: elf: skipping unrecognized data section(32) .eh_frame
libbpf: elf: skipping relo section(33) .rel.eh_frame for section(32) .eh_frame

It is possible to get rid of the .eh_frame section by adding
-fno-asynchronous-unwind-tables to the compilation, but we have seen
multiple examples of these sections appearing in BPF files in the wild,
most recently in samples/bpf, fixed by:
5a0ae9872d5c ("bpf, samples: Add -fno-asynchronous-unwind-tables to BPF Clang invocation")

While the errors are technically harmless, they look odd and confuse users.
So let's make libbpf filter out those sections, by adding .eh_frame to the
filter check in is_sec_name_dwarf().
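
As a rough illustration of what the prefix check accepts, here is a
standalone sketch. The has_prefix() and should_skip_quietly() helpers are
hypothetical names used only for this example and are not libbpf code; the
sketch assumes the caller strips a leading ".rel" before applying the check,
which is what lets .rel.eh_frame be filtered along with .eh_frame.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if `name` begins with `prefix`. */
static bool has_prefix(const char *name, const char *prefix)
{
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

/* Same logic as the patched check: .debug_* plus .eh_frame*. */
static bool is_sec_name_dwarf(const char *name)
{
	return has_prefix(name, ".debug_") || has_prefix(name, ".eh_frame");
}

/*
 * Hypothetical wrapper, for illustration only: drop a leading ".rel" so
 * that the relocation section of a filtered section is filtered as well.
 */
static bool should_skip_quietly(const char *name)
{
	if (has_prefix(name, ".rel"))
		name += strlen(".rel");
	return is_sec_name_dwarf(name);
}
```

Because this is a prefix match, names such as .eh_frame_hdr are accepted
too, while .rel.eh_frame is only matched once the ".rel" part is removed.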

v2:
- Expand explanation in the commit message

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 tools/lib/bpf/libbpf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 88d8825fc6f6..b1dc97b95965 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -2909,7 +2909,8 @@ static Elf_Data *elf_sec_data(const struct bpf_object *obj, Elf_Scn *scn)
 static bool is_sec_name_dwarf(const char *name)
 {
 	/* approximation, but the actual list is too long */
-	return strncmp(name, ".debug_", sizeof(".debug_") - 1) == 0;
+	return (strncmp(name, ".debug_", sizeof(".debug_") - 1) == 0 ||
+		strncmp(name, ".eh_frame", sizeof(".eh_frame") - 1) == 0);
 }
 
 static bool ignore_elf_section(GElf_Shdr *hdr, const char *name)
-- 
2.33.0



* Re: [PATCH bpf-next v2] libbpf: ignore .eh_frame sections when parsing elf files
  2021-08-26 12:09 [PATCH bpf-next v2] libbpf: ignore .eh_frame sections when parsing elf files Toke Høiland-Jørgensen
@ 2021-08-30 21:49 ` Andrii Nakryiko
  2021-08-31 10:28   ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 12+ messages in thread
From: Andrii Nakryiko @ 2021-08-30 21:49 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, bpf

On Thu, Aug 26, 2021 at 5:10 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> When .eh_frame and .rel.eh_frame sections are present in BPF object files,
> libbpf produces errors like this when loading the file:
>
> libbpf: elf: skipping unrecognized data section(32) .eh_frame
> libbpf: elf: skipping relo section(33) .rel.eh_frame for section(32) .eh_frame
>
> It is possible to get rid of the .eh_frame section by adding
> -fno-asynchronous-unwind-tables to the compilation, but we have seen
> multiple examples of these sections appearing in BPF files in the wild,
> most recently in samples/bpf, fixed by:
> 5a0ae9872d5c ("bpf, samples: Add -fno-asynchronous-unwind-tables to BPF Clang invocation")
>
> While the errors are technically harmless, they look odd and confuse users.

These warnings point out an invalid set of compiler flags used for
compiling BPF object files, though, which is a good thing: it should
incentivize anyone getting those warnings to check and fix how they do
BPF compilation. Those .eh_frame sections shouldn't be present in BPF
object files at all, and that's what libbpf is trying to say.

I don't know exactly in which situations that .eh_frame section is
added, but looking at our selftests (and now samples/bpf as well),
where we use -target bpf, we don't need
-fno-asynchronous-unwind-tables at all.

So instead of hiding the problem, let's use this as an opportunity to
fix those users' compilation flags.

> So let's make libbpf filter out those sections, by adding .eh_frame to the
> filter check in is_sec_name_dwarf().
>
> v2:
> - Expand explanation in the commit message
>
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> ---
>  tools/lib/bpf/libbpf.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 88d8825fc6f6..b1dc97b95965 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -2909,7 +2909,8 @@ static Elf_Data *elf_sec_data(const struct bpf_object *obj, Elf_Scn *scn)
>  static bool is_sec_name_dwarf(const char *name)
>  {
>         /* approximation, but the actual list is too long */
> -       return strncmp(name, ".debug_", sizeof(".debug_") - 1) == 0;
> +       return (strncmp(name, ".debug_", sizeof(".debug_") - 1) == 0 ||
> +               strncmp(name, ".eh_frame", sizeof(".eh_frame") - 1) == 0);
>  }
>
>  static bool ignore_elf_section(GElf_Shdr *hdr, const char *name)
> --
> 2.33.0
>


* Re: [PATCH bpf-next v2] libbpf: ignore .eh_frame sections when parsing elf files
  2021-08-30 21:49 ` Andrii Nakryiko
@ 2021-08-31 10:28   ` Toke Høiland-Jørgensen
  2021-08-31 23:11     ` Andrii Nakryiko
  2021-09-02  2:48     ` Yonghong Song
  0 siblings, 2 replies; 12+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-08-31 10:28 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, bpf

Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:

> On Thu, Aug 26, 2021 at 5:10 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> When .eh_frame and .rel.eh_frame sections are present in BPF object files,
>> libbpf produces errors like this when loading the file:
>>
>> libbpf: elf: skipping unrecognized data section(32) .eh_frame
>> libbpf: elf: skipping relo section(33) .rel.eh_frame for section(32) .eh_frame
>>
>> It is possible to get rid of the .eh_frame section by adding
>> -fno-asynchronous-unwind-tables to the compilation, but we have seen
>> multiple examples of these sections appearing in BPF files in the wild,
>> most recently in samples/bpf, fixed by:
>> 5a0ae9872d5c ("bpf, samples: Add -fno-asynchronous-unwind-tables to BPF Clang invocation")
>>
>> While the errors are technically harmless, they look odd and confuse users.
>
> These warnings point out invalid set of compiler flags used for
> compiling BPF object files, though. Which is a good thing and should
> incentivize anyone getting those warnings to check and fix how they do
> BPF compilation. Those .eh_frame sections shouldn't be present in BPF
> object files at all, and that's what libbpf is trying to say.

Apart from triggering that warning, what effect does this have, though?
The programs seem to work just fine (as evidenced by the fact that
samples/bpf has been built this way for years, for instance)...

Also, how is a user supposed to go from that cryptic error message to
figuring out that it has something to do with compiler flags?

> I don't know exactly in which situations that .eh_frame section is
> added, but looking at our selftests (and now samples/bpf as well),
> where we use -target bpf, we don't need
> -fno-asynchronous-unwind-tables at all.

This seems to at least be compiler-dependent. We ran into this with
bpftool as well (for the internal BPF programs it loads whenever it
runs), which already had '-target bpf' in the Makefile. We're carrying
an internal RHEL patch adding -fno-asynchronous-unwind-tables to the
bpftool build to fix this...

> So instead of hiding the problem, let's use this as an opportunity to
> fix those user's compilation flags instead.

This really doesn't seem like something that's helping anyone, it's just
annoying and confusing users...

-Toke



* Re: [PATCH bpf-next v2] libbpf: ignore .eh_frame sections when parsing elf files
  2021-08-31 10:28   ` Toke Høiland-Jørgensen
@ 2021-08-31 23:11     ` Andrii Nakryiko
  2021-09-02  2:48     ` Yonghong Song
  1 sibling, 0 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2021-08-31 23:11 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, bpf

On Tue, Aug 31, 2021 at 3:28 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>
> > On Thu, Aug 26, 2021 at 5:10 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> When .eh_frame and .rel.eh_frame sections are present in BPF object files,
> >> libbpf produces errors like this when loading the file:
> >>
> >> libbpf: elf: skipping unrecognized data section(32) .eh_frame
> >> libbpf: elf: skipping relo section(33) .rel.eh_frame for section(32) .eh_frame
> >>
> >> It is possible to get rid of the .eh_frame section by adding
> >> -fno-asynchronous-unwind-tables to the compilation, but we have seen
> >> multiple examples of these sections appearing in BPF files in the wild,
> >> most recently in samples/bpf, fixed by:
> >> 5a0ae9872d5c ("bpf, samples: Add -fno-asynchronous-unwind-tables to BPF Clang invocation")
> >>
> >> While the errors are technically harmless, they look odd and confuse users.
> >
> > These warnings point out invalid set of compiler flags used for
> > compiling BPF object files, though. Which is a good thing and should
> > incentivize anyone getting those warnings to check and fix how they do
> > BPF compilation. Those .eh_frame sections shouldn't be present in BPF
> > object files at all, and that's what libbpf is trying to say.
>
> Apart from triggering that warning, what effect does this have, though?
> The programs seem to work just fine (as evidenced by the fact that
> samples/bpf has been built this way for years, for instance)...
>
> Also, how is a user supposed to go from that cryptic error message to
> figuring out that it has something to do with compiler flags?

By googling and finding discussions like this one? I don't think libbpf
error messages have to include an intro to DWARF and .eh_frame.

Just googling ".eh_frame" gives me [0] as a first link, which seems to
describe what it is and how to get rid of it.

  [0] https://stackoverflow.com/questions/26300819/why-gcc-compiled-c-program-needs-eh-frame-section
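
For anyone who wants to confirm that an object file really carries the
section before changing compiler flags, a small section-name scan is enough.
The following is only a sketch, not how libbpf does it (libbpf uses libelf):
elf64_has_section() is a hypothetical name, and the code assumes a
same-endian ELF64 image already read into memory and the Linux <elf.h>
header.

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if the ELF64 image in buf[0..len) has a section called `name`.
 * Assumes the file has the host's byte order; anything malformed or
 * truncated simply yields false.
 */
static bool elf64_has_section(const unsigned char *buf, size_t len,
			      const char *name)
{
	Elf64_Ehdr eh;
	Elf64_Shdr strtab, sh;
	size_t i, name_len = strlen(name);

	if (len < sizeof(eh))
		return false;
	memcpy(&eh, buf, sizeof(eh));
	if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
	    eh.e_ident[EI_CLASS] != ELFCLASS64)
		return false;
	if (eh.e_shentsize != sizeof(sh) || eh.e_shstrndx >= eh.e_shnum)
		return false;
	/* The whole section header table must lie inside the buffer. */
	if (eh.e_shoff > len || (len - eh.e_shoff) / sizeof(sh) < eh.e_shnum)
		return false;

	/* Section names live in the string table at index e_shstrndx. */
	memcpy(&strtab, buf + eh.e_shoff + (size_t)eh.e_shstrndx * sizeof(sh),
	       sizeof(sh));
	if (strtab.sh_offset > len || strtab.sh_size > len - strtab.sh_offset)
		return false;

	for (i = 0; i < eh.e_shnum; i++) {
		memcpy(&sh, buf + eh.e_shoff + i * sizeof(sh), sizeof(sh));
		/* Need room for the name plus its terminating NUL. */
		if (sh.sh_name >= strtab.sh_size ||
		    strtab.sh_size - sh.sh_name <= name_len)
			continue;
		if (memcmp(buf + strtab.sh_offset + sh.sh_name, name,
			   name_len + 1) == 0)
			return true;
	}
	return false;
}
```

Calling it with ".eh_frame" on the contents of a compiled BPF object tells
you whether the warning is expected for that file.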

>
> > I don't know exactly in which situations that .eh_frame section is
> > added, but looking at our selftests (and now samples/bpf as well),
> > where we use -target bpf, we don't need
> > -fno-asynchronous-unwind-tables at all.
>
> This seems to at least be compiler-dependent. We ran into this with
> bpftool as well (for the internal BPF programs it loads whenever it
> runs), which already had '-target bpf' in the Makefile. We're carrying
> an internal RHEL patch adding -fno-asynchronous-unwind-tables to the
> bpftool build to fix this...

So instead of figuring out why your compilers cause .eh_frame
generation (while they shouldn't), you are trying to hide the warning
in libbpf? This hasn't been a problem in production apps at
Facebook, nor with libbpf-tools or libbpf-bootstrap apps, which just
makes me want to keep this warning even more. Once we support multiple
.rodata/.data/.bss sections in libbpf, I think I'll turn all those
unrecognized sections into actual errors. I'd rather not have unknown
sections just being ignored by libbpf. Someday we might actually use
.eh_frame with BPF objects, and that's when this will stop being an
error or warning.

>
> > So instead of hiding the problem, let's use this as an opportunity to
> > fix those user's compilation flags instead.
>
> This really doesn't seem like something that's helping anyone, it's just
> annoying and confusing users...

Warnings like "libbpf: elf: skipping unrecognized data section(4)
.rodata.str1.1" annoy me as well, and that's one of the reasons I'll
add support for multiple .rodata sections. So annoying is fine, it
raises awareness and incentivizes fixing the problem.

>
> -Toke
>


* Re: [PATCH bpf-next v2] libbpf: ignore .eh_frame sections when parsing elf files
  2021-08-31 10:28   ` Toke Høiland-Jørgensen
  2021-08-31 23:11     ` Andrii Nakryiko
@ 2021-09-02  2:48     ` Yonghong Song
  2021-09-02 17:08       ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 12+ messages in thread
From: Yonghong Song @ 2021-09-02  2:48 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, John Fastabend, KP Singh,
	Stanislav Fomichev, bpf



On 8/31/21 3:28 AM, Toke Høiland-Jørgensen wrote:
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
> 
>> On Thu, Aug 26, 2021 at 5:10 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>>
>>> When .eh_frame and .rel.eh_frame sections are present in BPF object files,
>>> libbpf produces errors like this when loading the file:
>>>
>>> libbpf: elf: skipping unrecognized data section(32) .eh_frame
>>> libbpf: elf: skipping relo section(33) .rel.eh_frame for section(32) .eh_frame
>>>
>>> It is possible to get rid of the .eh_frame section by adding
>>> -fno-asynchronous-unwind-tables to the compilation, but we have seen
>>> multiple examples of these sections appearing in BPF files in the wild,
>>> most recently in samples/bpf, fixed by:
>>> 5a0ae9872d5c ("bpf, samples: Add -fno-asynchronous-unwind-tables to BPF Clang invocation")
>>>
>>> While the errors are technically harmless, they look odd and confuse users.
>>
>> These warnings point out invalid set of compiler flags used for
>> compiling BPF object files, though. Which is a good thing and should
>> incentivize anyone getting those warnings to check and fix how they do
>> BPF compilation. Those .eh_frame sections shouldn't be present in BPF
>> object files at all, and that's what libbpf is trying to say.
> 
> Apart from triggering that warning, what effect does this have, though?
> The programs seem to work just fine (as evidenced by the fact that
> samples/bpf has been built this way for years, for instance)...
> 
> Also, how is a user supposed to go from that cryptic error message to
> figuring out that it has something to do with compiler flags?
> 
>> I don't know exactly in which situations that .eh_frame section is
>> added, but looking at our selftests (and now samples/bpf as well),
>> where we use -target bpf, we don't need
>> -fno-asynchronous-unwind-tables at all.
> 
> This seems to at least be compiler-dependent. We ran into this with
> bpftool as well (for the internal BPF programs it loads whenever it
> runs), which already had '-target bpf' in the Makefile. We're carrying
> an internal RHEL patch adding -fno-asynchronous-unwind-tables to the
> bpftool build to fix this...

I haven't seen an instance of .eh_frame with -target bpf either.
Do you have a reproducible test case? I would like to investigate
the possible cause and whether we could do something in llvm
to prevent its generation. Thanks!

> 
>> So instead of hiding the problem, let's use this as an opportunity to
>> fix those user's compilation flags instead.
> 
> This really doesn't seem like something that's helping anyone, it's just
> annoying and confusing users...
> 
> -Toke
> 


* Re: [PATCH bpf-next v2] libbpf: ignore .eh_frame sections when parsing elf files
  2021-09-02  2:48     ` Yonghong Song
@ 2021-09-02 17:08       ` Toke Høiland-Jørgensen
  2021-09-02 19:32         ` Alexei Starovoitov
  0 siblings, 1 reply; 12+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-09-02 17:08 UTC (permalink / raw)
  To: Yonghong Song, Andrii Nakryiko
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, John Fastabend, KP Singh,
	Stanislav Fomichev, bpf

Yonghong Song <yhs@fb.com> writes:

> On 8/31/21 3:28 AM, Toke Høiland-Jørgensen wrote:
>> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>> 
>>> On Thu, Aug 26, 2021 at 5:10 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>>>
>>>> When .eh_frame and .rel.eh_frame sections are present in BPF object files,
>>>> libbpf produces errors like this when loading the file:
>>>>
>>>> libbpf: elf: skipping unrecognized data section(32) .eh_frame
>>>> libbpf: elf: skipping relo section(33) .rel.eh_frame for section(32) .eh_frame
>>>>
>>>> It is possible to get rid of the .eh_frame section by adding
>>>> -fno-asynchronous-unwind-tables to the compilation, but we have seen
>>>> multiple examples of these sections appearing in BPF files in the wild,
>>>> most recently in samples/bpf, fixed by:
>>>> 5a0ae9872d5c ("bpf, samples: Add -fno-asynchronous-unwind-tables to BPF Clang invocation")
>>>>
>>>> While the errors are technically harmless, they look odd and confuse users.
>>>
>>> These warnings point out invalid set of compiler flags used for
>>> compiling BPF object files, though. Which is a good thing and should
>>> incentivize anyone getting those warnings to check and fix how they do
>>> BPF compilation. Those .eh_frame sections shouldn't be present in BPF
>>> object files at all, and that's what libbpf is trying to say.
>> 
>> Apart from triggering that warning, what effect does this have, though?
>> The programs seem to work just fine (as evidenced by the fact that
>> samples/bpf has been built this way for years, for instance)...
>> 
>> Also, how is a user supposed to go from that cryptic error message to
>> figuring out that it has something to do with compiler flags?
>> 
>>> I don't know exactly in which situations that .eh_frame section is
>>> added, but looking at our selftests (and now samples/bpf as well),
>>> where we use -target bpf, we don't need
>>> -fno-asynchronous-unwind-tables at all.
>> 
>> This seems to at least be compiler-dependent. We ran into this with
>> bpftool as well (for the internal BPF programs it loads whenever it
>> runs), which already had '-target bpf' in the Makefile. We're carrying
>> an internal RHEL patch adding -fno-asynchronous-unwind-tables to the
>> bpftool build to fix this...
>
> I haven't seen an instance of .eh_frame as well with -target bpf.
> Do you have a reproducible test case? I would like to investigate
> what is the possible cause and whether we could do something in llvm
> to prevent its generation. Thanks!

We found this in the RHEL builds of bpftool. I don't think we're doing
anything special, other than maybe building with a clang version that's
a few versions behind:

# clang --version
clang version 11.0.0 (Red Hat 11.0.0-1.module+el8.4.0+8598+a071fcd5)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

So I suppose it may resolve itself once we upgrade LLVM?

-Toke



* Re: [PATCH bpf-next v2] libbpf: ignore .eh_frame sections when parsing elf files
  2021-09-02 17:08       ` Toke Høiland-Jørgensen
@ 2021-09-02 19:32         ` Alexei Starovoitov
  2021-09-02 21:54           ` Yonghong Song
  0 siblings, 1 reply; 12+ messages in thread
From: Alexei Starovoitov @ 2021-09-02 19:32 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Yonghong Song, Andrii Nakryiko, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	John Fastabend, KP Singh, Stanislav Fomichev, bpf

On Thu, Sep 2, 2021 at 10:08 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Yonghong Song <yhs@fb.com> writes:
>
> > On 8/31/21 3:28 AM, Toke Høiland-Jørgensen wrote:
> >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
> >>
> >>> On Thu, Aug 26, 2021 at 5:10 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>>>
> >>>> When .eh_frame and .rel.eh_frame sections are present in BPF object files,
> >>>> libbpf produces errors like this when loading the file:
> >>>>
> >>>> libbpf: elf: skipping unrecognized data section(32) .eh_frame
> >>>> libbpf: elf: skipping relo section(33) .rel.eh_frame for section(32) .eh_frame
> >>>>
> >>>> It is possible to get rid of the .eh_frame section by adding
> >>>> -fno-asynchronous-unwind-tables to the compilation, but we have seen
> >>>> multiple examples of these sections appearing in BPF files in the wild,
> >>>> most recently in samples/bpf, fixed by:
> >>>> 5a0ae9872d5c ("bpf, samples: Add -fno-asynchronous-unwind-tables to BPF Clang invocation")
> >>>>
> >>>> While the errors are technically harmless, they look odd and confuse users.
> >>>
> >>> These warnings point out invalid set of compiler flags used for
> >>> compiling BPF object files, though. Which is a good thing and should
> >>> incentivize anyone getting those warnings to check and fix how they do
> >>> BPF compilation. Those .eh_frame sections shouldn't be present in BPF
> >>> object files at all, and that's what libbpf is trying to say.
> >>
> >> Apart from triggering that warning, what effect does this have, though?
> >> The programs seem to work just fine (as evidenced by the fact that
> >> samples/bpf has been built this way for years, for instance)...
> >>
> >> Also, how is a user supposed to go from that cryptic error message to
> >> figuring out that it has something to do with compiler flags?
> >>
> >>> I don't know exactly in which situations that .eh_frame section is
> >>> added, but looking at our selftests (and now samples/bpf as well),
> >>> where we use -target bpf, we don't need
> >>> -fno-asynchronous-unwind-tables at all.
> >>
> >> This seems to at least be compiler-dependent. We ran into this with
> >> bpftool as well (for the internal BPF programs it loads whenever it
> >> runs), which already had '-target bpf' in the Makefile. We're carrying
> >> an internal RHEL patch adding -fno-asynchronous-unwind-tables to the
> >> bpftool build to fix this...
> >
> > I haven't seen an instance of .eh_frame as well with -target bpf.
> > Do you have a reproducible test case? I would like to investigate
> > what is the possible cause and whether we could do something in llvm
> > to prevent its generation. Thanks!
>
> We found this in the RHEL builds of bpftool. I don't think we're doing
> anything special, other than maybe building with a clang version that's
> a few versions behind:
>
> # clang --version
> clang version 11.0.0 (Red Hat 11.0.0-1.module+el8.4.0+8598+a071fcd5)
> Target: x86_64-unknown-linux-gnu
> Thread model: posix
> InstalledDir: /usr/bin
>
> So I suppose it may resolve itself once we upgrade LLVM?

That's odd. I don't think I've seen this issue even with clang 11
(but I built it myself).
If there is indeed a fix, let's backport it to llvm 11. The user
experience matters.
It could be the llvm configuration too.
I'm guessing some build flags might influence the default settings
for unwind tables.

Yonghong, can we make the bpf backend ignore needsUnwindTableEntry?


* Re: [PATCH bpf-next v2] libbpf: ignore .eh_frame sections when parsing elf files
  2021-09-02 19:32         ` Alexei Starovoitov
@ 2021-09-02 21:54           ` Yonghong Song
  2021-09-02 22:08             ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 12+ messages in thread
From: Yonghong Song @ 2021-09-02 21:54 UTC (permalink / raw)
  To: Alexei Starovoitov, Toke Høiland-Jørgensen
  Cc: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Stanislav Fomichev, bpf



On 9/2/21 12:32 PM, Alexei Starovoitov wrote:
> On Thu, Sep 2, 2021 at 10:08 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Yonghong Song <yhs@fb.com> writes:
>>
>>> On 8/31/21 3:28 AM, Toke Høiland-Jørgensen wrote:
>>>> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>>>>
>>>>> On Thu, Aug 26, 2021 at 5:10 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>>>>>
>>>>>> When .eh_frame and .rel.eh_frame sections are present in BPF object files,
>>>>>> libbpf produces errors like this when loading the file:
>>>>>>
>>>>>> libbpf: elf: skipping unrecognized data section(32) .eh_frame
>>>>>> libbpf: elf: skipping relo section(33) .rel.eh_frame for section(32) .eh_frame
>>>>>>
>>>>>> It is possible to get rid of the .eh_frame section by adding
>>>>>> -fno-asynchronous-unwind-tables to the compilation, but we have seen
>>>>>> multiple examples of these sections appearing in BPF files in the wild,
>>>>>> most recently in samples/bpf, fixed by:
>>>>>> 5a0ae9872d5c ("bpf, samples: Add -fno-asynchronous-unwind-tables to BPF Clang invocation")
>>>>>>
>>>>>> While the errors are technically harmless, they look odd and confuse users.
>>>>>
>>>>> These warnings point out invalid set of compiler flags used for
>>>>> compiling BPF object files, though. Which is a good thing and should
>>>>> incentivize anyone getting those warnings to check and fix how they do
>>>>> BPF compilation. Those .eh_frame sections shouldn't be present in BPF
>>>>> object files at all, and that's what libbpf is trying to say.
>>>>
>>>> Apart from triggering that warning, what effect does this have, though?
>>>> The programs seem to work just fine (as evidenced by the fact that
>>>> samples/bpf has been built this way for years, for instance)...
>>>>
>>>> Also, how is a user supposed to go from that cryptic error message to
>>>> figuring out that it has something to do with compiler flags?
>>>>
>>>>> I don't know exactly in which situations that .eh_frame section is
>>>>> added, but looking at our selftests (and now samples/bpf as well),
>>>>> where we use -target bpf, we don't need
>>>>> -fno-asynchronous-unwind-tables at all.
>>>>
>>>> This seems to at least be compiler-dependent. We ran into this with
>>>> bpftool as well (for the internal BPF programs it loads whenever it
>>>> runs), which already had '-target bpf' in the Makefile. We're carrying
>>>> an internal RHEL patch adding -fno-asynchronous-unwind-tables to the
>>>> bpftool build to fix this...
>>>
>>> I haven't seen an instance of .eh_frame as well with -target bpf.
>>> Do you have a reproducible test case? I would like to investigate
>>> what is the possible cause and whether we could do something in llvm
>>> to prevent its generation. Thanks!
>>
>> We found this in the RHEL builds of bpftool. I don't think we're doing
>> anything special, other than maybe building with a clang version that's
>> a few versions behind:
>>
>> # clang --version
>> clang version 11.0.0 (Red Hat 11.0.0-1.module+el8.4.0+8598+a071fcd5)
>> Target: x86_64-unknown-linux-gnu
>> Thread model: posix
>> InstalledDir: /usr/bin
>>
>> So I suppose it may resolve itself once we upgrade LLVM?
> 
> That's odd. I don't think I've seen this issue even with clang 11
> (but I built it myself).

I cannot reproduce it myself with a self-built llvm (11, 12, 13, 14),
but I can reproduce it with an upstream-built llvm12.

/bin/clang \
         -I. \
         -I/home/yhs/work/bpf-next/tools/include/uapi/ \
         -I/home/yhs/work/bpf-next/tools/lib/bpf/ \
         -I/home/yhs/work/bpf-next/tools/lib \
         -g -O2 -Wall -target bpf -c skeleton/pid_iter.bpf.c -o pid_iter.bpf.o && llvm-strip -g pid_iter.bpf.o
   GEN     pid_iter.skel.h
libbpf: elf: skipping unrecognized data section(11) .eh_frame
libbpf: elf: skipping relo section(12) .rel.eh_frame for section(11) .eh_frame

> If there is a fix indeed let's backport it to llvm 11. The user
> experience matters.
> It could be llvm configuration too.
> I'm guessing some build flags might influence default settings
> for unwind tables.
> 
> Yonghong, can we make bpf backend to ignore needsUnwindTableEntry ?

Sure. I will try to get upstream build flags, reproduce and fix it
in llvm.


* Re: [PATCH bpf-next v2] libbpf: ignore .eh_frame sections when parsing elf files
  2021-09-02 21:54           ` Yonghong Song
@ 2021-09-02 22:08             ` Toke Høiland-Jørgensen
  2021-09-07 19:15               ` Yonghong Song
  0 siblings, 1 reply; 12+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-09-02 22:08 UTC (permalink / raw)
  To: Yonghong Song, Alexei Starovoitov
  Cc: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Stanislav Fomichev, bpf

Yonghong Song <yhs@fb.com> writes:

> On 9/2/21 12:32 PM, Alexei Starovoitov wrote:
>> On Thu, Sep 2, 2021 at 10:08 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>>
>>> Yonghong Song <yhs@fb.com> writes:
>>>
>>>> On 8/31/21 3:28 AM, Toke Høiland-Jørgensen wrote:
>>>>> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>>>>>
>>>>>> On Thu, Aug 26, 2021 at 5:10 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>>>>>>
>>>>>>> When .eh_frame and .rel.eh_frame sections are present in BPF object files,
>>>>>>> libbpf produces errors like this when loading the file:
>>>>>>>
>>>>>>> libbpf: elf: skipping unrecognized data section(32) .eh_frame
>>>>>>> libbpf: elf: skipping relo section(33) .rel.eh_frame for section(32) .eh_frame
>>>>>>>
>>>>>>> It is possible to get rid of the .eh_frame section by adding
>>>>>>> -fno-asynchronous-unwind-tables to the compilation, but we have seen
>>>>>>> multiple examples of these sections appearing in BPF files in the wild,
>>>>>>> most recently in samples/bpf, fixed by:
>>>>>>> 5a0ae9872d5c ("bpf, samples: Add -fno-asynchronous-unwind-tables to BPF Clang invocation")
>>>>>>>
>>>>>>> While the errors are technically harmless, they look odd and confuse users.
>>>>>>
>>>>>> These warnings point out invalid set of compiler flags used for
>>>>>> compiling BPF object files, though. Which is a good thing and should
>>>>>> incentivize anyone getting those warnings to check and fix how they do
>>>>>> BPF compilation. Those .eh_frame sections shouldn't be present in BPF
>>>>>> object files at all, and that's what libbpf is trying to say.
>>>>>
>>>>> Apart from triggering that warning, what effect does this have, though?
>>>>> The programs seem to work just fine (as evidenced by the fact that
>>>>> samples/bpf has been built this way for years, for instance)...
>>>>>
>>>>> Also, how is a user supposed to go from that cryptic error message to
>>>>> figuring out that it has something to do with compiler flags?
>>>>>
>>>>>> I don't know exactly in which situations that .eh_frame section is
>>>>>> added, but looking at our selftests (and now samples/bpf as well),
>>>>>> where we use -target bpf, we don't need
>>>>>> -fno-asynchronous-unwind-tables at all.
>>>>>
>>>>> This seems to at least be compiler-dependent. We ran into this with
>>>>> bpftool as well (for the internal BPF programs it loads whenever it
>>>>> runs), which already had '-target bpf' in the Makefile. We're carrying
>>>>> an internal RHEL patch adding -fno-asynchronous-unwind-tables to the
>>>>> bpftool build to fix this...
>>>>
>>>> I haven't seen an instance of .eh_frame as well with -target bpf.
>>>> Do you have a reproducible test case? I would like to investigate
>>>> what is the possible cause and whether we could do something in llvm
>>>> to prevent its generation. Thanks!
>>>
>>> We found this in the RHEL builds of bpftool. I don't think we're doing
>>> anything special, other than maybe building with a clang version that's
>>> a few versions behind:
>>>
>>> # clang --version
>>> clang version 11.0.0 (Red Hat 11.0.0-1.module+el8.4.0+8598+a071fcd5)
>>> Target: x86_64-unknown-linux-gnu
>>> Thread model: posix
>>> InstalledDir: /usr/bin
>>>
>>> So I suppose it may resolve itself once we upgrade LLVM?
>> 
>> That's odd. I don't think I've seen this issue even with clang 11
>> (but I built it myself).
>
> I cannot reproduce it myself with self-built llvm (11, 12, 13, 14).
> But I can reproduce it with an upstream-built llvm12.
>
> /bin/clang \
>          -I. \
>          -I/home/yhs/work/bpf-next/tools/include/uapi/ \
>          -I/home/yhs/work/bpf-next/tools/lib/bpf/ \
>          -I/home/yhs/work/bpf-next/tools/lib \
>          -g -O2 -Wall -target bpf -c skeleton/pid_iter.bpf.c \
>          -o pid_iter.bpf.o && llvm-strip -g pid_iter.bpf.o
>    GEN     pid_iter.skel.h
> libbpf: elf: skipping unrecognized data section(11) .eh_frame
> libbpf: elf: skipping relo section(12) .rel.eh_frame for section(11) .eh_frame

Ah, that's interesting!

>> If there is indeed a fix, let's backport it to llvm 11. The user
>> experience matters.
>> It could be the llvm configuration too.
>> I'm guessing some build flags might influence default settings
>> for unwind tables.
>> 
>> Yonghong, can we make the bpf backend ignore needsUnwindTableEntry?
>
> Sure. I will try to get upstream build flags, reproduce and fix it
> in llvm.

Awesome, thanks for looking at this! :)

-Toke



* Re: [PATCH bpf-next v2] libbpf: ignore .eh_frame sections when parsing elf files
  2021-09-02 22:08             ` Toke Høiland-Jørgensen
@ 2021-09-07 19:15               ` Yonghong Song
  2021-09-07 19:36                 ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 12+ messages in thread
From: Yonghong Song @ 2021-09-07 19:15 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Alexei Starovoitov
  Cc: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Stanislav Fomichev, bpf



On 9/2/21 3:08 PM, Toke Høiland-Jørgensen wrote:
> Yonghong Song <yhs@fb.com> writes:
> 
>> On 9/2/21 12:32 PM, Alexei Starovoitov wrote:
>>> On Thu, Sep 2, 2021 at 10:08 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>>>
>>>> Yonghong Song <yhs@fb.com> writes:
>>>>
>>>>> On 8/31/21 3:28 AM, Toke Høiland-Jørgensen wrote:
>>>>>> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>>>>>>
>>>>>>> On Thu, Aug 26, 2021 at 5:10 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>>>>>>>
>>>>>>>> When .eh_frame and .rel.eh_frame sections are present in BPF object files,
>>>>>>>> libbpf produces errors like this when loading the file:
>>>>>>>>
>>>>>>>> libbpf: elf: skipping unrecognized data section(32) .eh_frame
>>>>>>>> libbpf: elf: skipping relo section(33) .rel.eh_frame for section(32) .eh_frame
>>>>>>>>
>>>>>>>> It is possible to get rid of the .eh_frame section by adding
>>>>>>>> -fno-asynchronous-unwind-tables to the compilation, but we have seen
>>>>>>>> multiple examples of these sections appearing in BPF files in the wild,
>>>>>>>> most recently in samples/bpf, fixed by:
>>>>>>>> 5a0ae9872d5c ("bpf, samples: Add -fno-asynchronous-unwind-tables to BPF Clang invocation")
>>>>>>>>
>>>>>>>> While the errors are technically harmless, they look odd and confuse users.
>>>>>>>
>>>>>>> These warnings point out an invalid set of compiler flags used for
>>>>>>> compiling BPF object files, though. Which is a good thing and should
>>>>>>> incentivize anyone getting those warnings to check and fix how they do
>>>>>>> BPF compilation. Those .eh_frame sections shouldn't be present in BPF
>>>>>>> object files at all, and that's what libbpf is trying to say.
>>>>>>
>>>>>> Apart from triggering that warning, what effect does this have, though?
>>>>>> The programs seem to work just fine (as evidenced by the fact that
>>>>>> samples/bpf has been built this way for years, for instance)...
>>>>>>
>>>>>> Also, how is a user supposed to go from that cryptic error message to
>>>>>> figuring out that it has something to do with compiler flags?
>>>>>>
>>>>>>> I don't know exactly in which situations that .eh_frame section is
>>>>>>> added, but looking at our selftests (and now samples/bpf as well),
>>>>>>> where we use -target bpf, we don't need
>>>>>>> -fno-asynchronous-unwind-tables at all.
>>>>>>
>>>>>> This seems to at least be compiler-dependent. We ran into this with
>>>>>> bpftool as well (for the internal BPF programs it loads whenever it
>>>>>> runs), which already had '-target bpf' in the Makefile. We're carrying
>>>>>> an internal RHEL patch adding -fno-asynchronous-unwind-tables to the
>>>>>> bpftool build to fix this...
>>>>>
>>>>> I haven't seen an instance of .eh_frame with -target bpf either.
>>>>> Do you have a reproducible test case? I would like to investigate
>>>>> what the possible cause is and whether we could do something in llvm
>>>>> to prevent its generation. Thanks!
>>>>
>>>> We found this in the RHEL builds of bpftool. I don't think we're doing
>>>> anything special, other than maybe building with a clang version that's
>>>> a few versions behind:
>>>>
>>>> # clang --version
>>>> clang version 11.0.0 (Red Hat 11.0.0-1.module+el8.4.0+8598+a071fcd5)
>>>> Target: x86_64-unknown-linux-gnu
>>>> Thread model: posix
>>>> InstalledDir: /usr/bin
>>>>
>>>> So I suppose it may resolve itself once we upgrade LLVM?
>>>
>>> That's odd. I don't think I've seen this issue even with clang 11
>>> (but I built it myself).
>>
>> I cannot reproduce it myself with self-built llvm (11, 12, 13, 14).
>> But I can reproduce it with an upstream-built llvm12.
>>
>> /bin/clang \
>>           -I. \
>>           -I/home/yhs/work/bpf-next/tools/include/uapi/ \
>>           -I/home/yhs/work/bpf-next/tools/lib/bpf/ \
>>           -I/home/yhs/work/bpf-next/tools/lib \
>>           -g -O2 -Wall -target bpf -c skeleton/pid_iter.bpf.c \
>>           -o pid_iter.bpf.o && llvm-strip -g pid_iter.bpf.o
>>     GEN     pid_iter.skel.h
>> libbpf: elf: skipping unrecognized data section(11) .eh_frame
>> libbpf: elf: skipping relo section(12) .rel.eh_frame for section(11) .eh_frame
> 
> Ah, that's interesting!
> 
>>> If there is indeed a fix, let's backport it to llvm 11. The user
>>> experience matters.
>>> It could be the llvm configuration too.
>>> I'm guessing some build flags might influence default settings
>>> for unwind tables.
>>>
>>> Yonghong, can we make the bpf backend ignore needsUnwindTableEntry?
>>
>> Sure. I will try to get upstream build flags, reproduce and fix it
>> in llvm.

I did some investigation, and this is due to a CentOS private patch:
https://git.centos.org/rpms/clang/blob/b99d8d4a38320329e10570f308c3e2d8cf295c78/f/SOURCES/0002-PATCH-clang-Make-funwind-tables-the-default-on-all-a.patch

In that build, the original llvm-project source is patched with
several private patches before the rpm is built.
https://koji.mbox.centos.org/pkgs/packages/clang/12.0.1/1.module_el8.5.0+892+54d791e1/data/logs/x86_64/build.log

The above private patch enables unwind tables (the .eh_frame section)
by default for ALL architectures, and bpf is a victim of this.

I filed a Red Hat Bugzilla bug to get their private patch fixed.

https://bugzilla.redhat.com/show_bug.cgi?id=2002024

Hopefully future compiler builds won't have this issue.
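
For reference, the name check proposed in the patch at the top of this
thread can be sketched as a standalone function. This is only a rough
illustration: starts_with() and sec_name_is_ignorable() are hypothetical
names made up for this example, not libbpf API, and the code is not the
actual libbpf implementation.

```c
#include <stdbool.h>
#include <string.h>

/* Illustrative helper (not libbpf API): true if name begins with prefix. */
static bool starts_with(const char *name, const char *prefix)
{
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

/*
 * Hypothetical standalone version of the check proposed in the patch:
 * section names starting with ".debug_" or ".eh_frame" are treated as
 * sections to skip silently instead of warning about them.
 */
static bool sec_name_is_ignorable(const char *name)
{
	return starts_with(name, ".debug_") || starts_with(name, ".eh_frame");
}
```

Note that ".rel.eh_frame" does not match this prefix test on its own, so
a caller of such a helper would have to strip a leading ".rel" before
calling it if the relocation section should be skipped as well.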

> 
> Awesome, thanks for looking at this! :)
> 
> -Toke
> 


* Re: [PATCH bpf-next v2] libbpf: ignore .eh_frame sections when parsing elf files
  2021-09-07 19:15               ` Yonghong Song
@ 2021-09-07 19:36                 ` Toke Høiland-Jørgensen
  2021-09-07 22:24                   ` Yonghong Song
  0 siblings, 1 reply; 12+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-09-07 19:36 UTC (permalink / raw)
  To: Yonghong Song, Alexei Starovoitov
  Cc: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Stanislav Fomichev, bpf

Yonghong Song <yhs@fb.com> writes:

> On 9/2/21 3:08 PM, Toke Høiland-Jørgensen wrote:
>> Yonghong Song <yhs@fb.com> writes:
>> 
>>> On 9/2/21 12:32 PM, Alexei Starovoitov wrote:
>>>> On Thu, Sep 2, 2021 at 10:08 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>>>>
>>>>> Yonghong Song <yhs@fb.com> writes:
>>>>>
>>>>>> On 8/31/21 3:28 AM, Toke Høiland-Jørgensen wrote:
>>>>>>> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>>>>>>>
>>>>>>>> On Thu, Aug 26, 2021 at 5:10 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>>>>>>>>
>>>>>>>>> When .eh_frame and .rel.eh_frame sections are present in BPF object files,
>>>>>>>>> libbpf produces errors like this when loading the file:
>>>>>>>>>
>>>>>>>>> libbpf: elf: skipping unrecognized data section(32) .eh_frame
>>>>>>>>> libbpf: elf: skipping relo section(33) .rel.eh_frame for section(32) .eh_frame
>>>>>>>>>
>>>>>>>>> It is possible to get rid of the .eh_frame section by adding
>>>>>>>>> -fno-asynchronous-unwind-tables to the compilation, but we have seen
>>>>>>>>> multiple examples of these sections appearing in BPF files in the wild,
>>>>>>>>> most recently in samples/bpf, fixed by:
>>>>>>>>> 5a0ae9872d5c ("bpf, samples: Add -fno-asynchronous-unwind-tables to BPF Clang invocation")
>>>>>>>>>
>>>>>>>>> While the errors are technically harmless, they look odd and confuse users.
>>>>>>>>
>>>>>>>> These warnings point out an invalid set of compiler flags used for
>>>>>>>> compiling BPF object files, though. Which is a good thing and should
>>>>>>>> incentivize anyone getting those warnings to check and fix how they do
>>>>>>>> BPF compilation. Those .eh_frame sections shouldn't be present in BPF
>>>>>>>> object files at all, and that's what libbpf is trying to say.
>>>>>>>
>>>>>>> Apart from triggering that warning, what effect does this have, though?
>>>>>>> The programs seem to work just fine (as evidenced by the fact that
>>>>>>> samples/bpf has been built this way for years, for instance)...
>>>>>>>
>>>>>>> Also, how is a user supposed to go from that cryptic error message to
>>>>>>> figuring out that it has something to do with compiler flags?
>>>>>>>
>>>>>>>> I don't know exactly in which situations that .eh_frame section is
>>>>>>>> added, but looking at our selftests (and now samples/bpf as well),
>>>>>>>> where we use -target bpf, we don't need
>>>>>>>> -fno-asynchronous-unwind-tables at all.
>>>>>>>
>>>>>>> This seems to at least be compiler-dependent. We ran into this with
>>>>>>> bpftool as well (for the internal BPF programs it loads whenever it
>>>>>>> runs), which already had '-target bpf' in the Makefile. We're carrying
>>>>>>> an internal RHEL patch adding -fno-asynchronous-unwind-tables to the
>>>>>>> bpftool build to fix this...
>>>>>>
>>>>>> I haven't seen an instance of .eh_frame with -target bpf either.
>>>>>> Do you have a reproducible test case? I would like to investigate
>>>>>> what the possible cause is and whether we could do something in llvm
>>>>>> to prevent its generation. Thanks!
>>>>>
>>>>> We found this in the RHEL builds of bpftool. I don't think we're doing
>>>>> anything special, other than maybe building with a clang version that's
>>>>> a few versions behind:
>>>>>
>>>>> # clang --version
>>>>> clang version 11.0.0 (Red Hat 11.0.0-1.module+el8.4.0+8598+a071fcd5)
>>>>> Target: x86_64-unknown-linux-gnu
>>>>> Thread model: posix
>>>>> InstalledDir: /usr/bin
>>>>>
>>>>> So I suppose it may resolve itself once we upgrade LLVM?
>>>>
>>>> That's odd. I don't think I've seen this issue even with clang 11
>>>> (but I built it myself).
>>>
>>> I cannot reproduce it myself with self-built llvm (11, 12, 13, 14).
>>> But I can reproduce it with an upstream-built llvm12.
>>>
>>> /bin/clang \
>>>           -I. \
>>>           -I/home/yhs/work/bpf-next/tools/include/uapi/ \
>>>           -I/home/yhs/work/bpf-next/tools/lib/bpf/ \
>>>           -I/home/yhs/work/bpf-next/tools/lib \
>>>           -g -O2 -Wall -target bpf -c skeleton/pid_iter.bpf.c \
>>>           -o pid_iter.bpf.o && llvm-strip -g pid_iter.bpf.o
>>>     GEN     pid_iter.skel.h
>>> libbpf: elf: skipping unrecognized data section(11) .eh_frame
>>> libbpf: elf: skipping relo section(12) .rel.eh_frame for section(11) .eh_frame
>> 
>> Ah, that's interesting!
>> 
>>>> If there is indeed a fix, let's backport it to llvm 11. The user
>>>> experience matters.
>>>> It could be the llvm configuration too.
>>>> I'm guessing some build flags might influence default settings
>>>> for unwind tables.
>>>>
>>>> Yonghong, can we make the bpf backend ignore needsUnwindTableEntry?
>>>
>>> Sure. I will try to get upstream build flags, reproduce and fix it
>>> in llvm.
>
> I did some investigation, and this is due to a CentOS private patch:
> https://git.centos.org/rpms/clang/blob/b99d8d4a38320329e10570f308c3e2d8cf295c78/f/SOURCES/0002-PATCH-clang-Make-funwind-tables-the-default-on-all-a.patch
>
> In that build, the original llvm-project source is patched with
> several private patches before the rpm is built.
> https://koji.mbox.centos.org/pkgs/packages/clang/12.0.1/1.module_el8.5.0+892+54d791e1/data/logs/x86_64/build.log
>
> The above private patch enables unwind tables (the .eh_frame section)
> by default for ALL architectures, and bpf is a victim of this.

Ah, doh! I had no idea we were doing this :/

> I filed a Red Hat Bugzilla bug to get their private patch fixed.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=2002024
>
> Hopefully future compiler builds won't have this issue.

Thank you for finding the root cause of this! I'll follow up internally
and make sure we get this fixed...

-Toke



* Re: [PATCH bpf-next v2] libbpf: ignore .eh_frame sections when parsing elf files
  2021-09-07 19:36                 ` Toke Høiland-Jørgensen
@ 2021-09-07 22:24                   ` Yonghong Song
  0 siblings, 0 replies; 12+ messages in thread
From: Yonghong Song @ 2021-09-07 22:24 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Alexei Starovoitov
  Cc: Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Stanislav Fomichev, bpf



On 9/7/21 12:36 PM, Toke Høiland-Jørgensen wrote:
> Yonghong Song <yhs@fb.com> writes:
> 
>> On 9/2/21 3:08 PM, Toke Høiland-Jørgensen wrote:
>>> Yonghong Song <yhs@fb.com> writes:
>>>
>>>> On 9/2/21 12:32 PM, Alexei Starovoitov wrote:
>>>>> On Thu, Sep 2, 2021 at 10:08 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>>>>>
>>>>>> Yonghong Song <yhs@fb.com> writes:
>>>>>>
>>>>>>> On 8/31/21 3:28 AM, Toke Høiland-Jørgensen wrote:
>>>>>>>> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>>>>>>>>
>>>>>>>>> On Thu, Aug 26, 2021 at 5:10 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>>>>>>>>>
>>>>>>>>>> When .eh_frame and .rel.eh_frame sections are present in BPF object files,
>>>>>>>>>> libbpf produces errors like this when loading the file:
>>>>>>>>>>
>>>>>>>>>> libbpf: elf: skipping unrecognized data section(32) .eh_frame
>>>>>>>>>> libbpf: elf: skipping relo section(33) .rel.eh_frame for section(32) .eh_frame
>>>>>>>>>>
>>>>>>>>>> It is possible to get rid of the .eh_frame section by adding
>>>>>>>>>> -fno-asynchronous-unwind-tables to the compilation, but we have seen
>>>>>>>>>> multiple examples of these sections appearing in BPF files in the wild,
>>>>>>>>>> most recently in samples/bpf, fixed by:
>>>>>>>>>> 5a0ae9872d5c ("bpf, samples: Add -fno-asynchronous-unwind-tables to BPF Clang invocation")
>>>>>>>>>>
>>>>>>>>>> While the errors are technically harmless, they look odd and confuse users.
>>>>>>>>>
>>>>>>>>> These warnings point out an invalid set of compiler flags used for
>>>>>>>>> compiling BPF object files, though. Which is a good thing and should
>>>>>>>>> incentivize anyone getting those warnings to check and fix how they do
>>>>>>>>> BPF compilation. Those .eh_frame sections shouldn't be present in BPF
>>>>>>>>> object files at all, and that's what libbpf is trying to say.
>>>>>>>>
>>>>>>>> Apart from triggering that warning, what effect does this have, though?
>>>>>>>> The programs seem to work just fine (as evidenced by the fact that
>>>>>>>> samples/bpf has been built this way for years, for instance)...
>>>>>>>>
>>>>>>>> Also, how is a user supposed to go from that cryptic error message to
>>>>>>>> figuring out that it has something to do with compiler flags?
>>>>>>>>
>>>>>>>>> I don't know exactly in which situations that .eh_frame section is
>>>>>>>>> added, but looking at our selftests (and now samples/bpf as well),
>>>>>>>>> where we use -target bpf, we don't need
>>>>>>>>> -fno-asynchronous-unwind-tables at all.
>>>>>>>>
>>>>>>>> This seems to at least be compiler-dependent. We ran into this with
>>>>>>>> bpftool as well (for the internal BPF programs it loads whenever it
>>>>>>>> runs), which already had '-target bpf' in the Makefile. We're carrying
>>>>>>>> an internal RHEL patch adding -fno-asynchronous-unwind-tables to the
>>>>>>>> bpftool build to fix this...
>>>>>>>
>>>>>>> I haven't seen an instance of .eh_frame with -target bpf either.
>>>>>>> Do you have a reproducible test case? I would like to investigate
>>>>>>> what the possible cause is and whether we could do something in llvm
>>>>>>> to prevent its generation. Thanks!
>>>>>>
>>>>>> We found this in the RHEL builds of bpftool. I don't think we're doing
>>>>>> anything special, other than maybe building with a clang version that's
>>>>>> a few versions behind:
>>>>>>
>>>>>> # clang --version
>>>>>> clang version 11.0.0 (Red Hat 11.0.0-1.module+el8.4.0+8598+a071fcd5)
>>>>>> Target: x86_64-unknown-linux-gnu
>>>>>> Thread model: posix
>>>>>> InstalledDir: /usr/bin
>>>>>>
>>>>>> So I suppose it may resolve itself once we upgrade LLVM?
>>>>>
>>>>> That's odd. I don't think I've seen this issue even with clang 11
>>>>> (but I built it myself).
>>>>
>>>> I cannot reproduce it myself with self-built llvm (11, 12, 13, 14).
>>>> But I can reproduce it with an upstream-built llvm12.
>>>>
>>>> /bin/clang \
>>>>            -I. \
>>>>            -I/home/yhs/work/bpf-next/tools/include/uapi/ \
>>>>            -I/home/yhs/work/bpf-next/tools/lib/bpf/ \
>>>>            -I/home/yhs/work/bpf-next/tools/lib \
>>>>            -g -O2 -Wall -target bpf -c skeleton/pid_iter.bpf.c \
>>>>            -o pid_iter.bpf.o && llvm-strip -g pid_iter.bpf.o
>>>>      GEN     pid_iter.skel.h
>>>> libbpf: elf: skipping unrecognized data section(11) .eh_frame
>>>> libbpf: elf: skipping relo section(12) .rel.eh_frame for section(11) .eh_frame
>>>
>>> Ah, that's interesting!
>>>
>>>>> If there is indeed a fix, let's backport it to llvm 11. The user
>>>>> experience matters.
>>>>> It could be the llvm configuration too.
>>>>> I'm guessing some build flags might influence default settings
>>>>> for unwind tables.
>>>>>
>>>>> Yonghong, can we make the bpf backend ignore needsUnwindTableEntry?
>>>>
>>>> Sure. I will try to get upstream build flags, reproduce and fix it
>>>> in llvm.
>>
>> I did some investigation, and this is due to a CentOS private patch:
>> https://git.centos.org/rpms/clang/blob/b99d8d4a38320329e10570f308c3e2d8cf295c78/f/SOURCES/0002-PATCH-clang-Make-funwind-tables-the-default-on-all-a.patch
>>
>> In that build, the original llvm-project source is patched with
>> several private patches before the rpm is built.
>> https://koji.mbox.centos.org/pkgs/packages/clang/12.0.1/1.module_el8.5.0+892+54d791e1/data/logs/x86_64/build.log
>>
>> The above private patch enables unwind tables (the .eh_frame section)
>> by default for ALL architectures, and bpf is a victim of this.
> 
> Ah, doh! I had no idea we were doing this :/
> 
>> I filed a Red Hat Bugzilla bug to get their private patch fixed.
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=2002024
>>
>> Hopefully future compiler builds won't have this issue.
> 
> Thank you for finding the root cause of this! I'll follow up internally
> and make sure we get this fixed...

Thanks! Hopefully this can be resolved soon.

> 
> -Toke
> 


end of thread, other threads:[~2021-09-07 22:24 UTC | newest]

Thread overview: 12+ messages
2021-08-26 12:09 [PATCH bpf-next v2] libbpf: ignore .eh_frame sections when parsing elf files Toke Høiland-Jørgensen
2021-08-30 21:49 ` Andrii Nakryiko
2021-08-31 10:28   ` Toke Høiland-Jørgensen
2021-08-31 23:11     ` Andrii Nakryiko
2021-09-02  2:48     ` Yonghong Song
2021-09-02 17:08       ` Toke Høiland-Jørgensen
2021-09-02 19:32         ` Alexei Starovoitov
2021-09-02 21:54           ` Yonghong Song
2021-09-02 22:08             ` Toke Høiland-Jørgensen
2021-09-07 19:15               ` Yonghong Song
2021-09-07 19:36                 ` Toke Høiland-Jørgensen
2021-09-07 22:24                   ` Yonghong Song
