All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Yonghong Song <yhs@fb.com>, bpf <bpf@vger.kernel.org>
Subject: Re: Latest libbpf fails to load programs compiled with old LLVM
Date: Tue, 08 Dec 2020 14:41:28 +0100	[thread overview]
Message-ID: <87czzkifef.fsf@toke.dk> (raw)
In-Reply-To: <CAEf4Bzb=zi6ew3fgAg29ZB0tcBw8xfEX-pRuMeAyYBiXp5ewTw@mail.gmail.com>

Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:

> On Mon, Dec 7, 2020 at 3:00 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>>
>> > On Fri, Dec 4, 2020 at 9:55 AM Yonghong Song <yhs@fb.com> wrote:
>> >>
>> >>
>> >>
>> >> On 12/4/20 1:34 AM, Toke Høiland-Jørgensen wrote:
>> >> > Yonghong Song <yhs@fb.com> writes:
>> >> >
>> >> >> On 12/3/20 9:55 AM, Toke Høiland-Jørgensen wrote:
>> >> >>> Hi Andrii
>> >> >>>
>> >> >>> I noticed that recent libbpf versions fail to load BPF files compiled
>> >> >>> with old versions of LLVM. E.g., if I compile xdp-tools with LLVM 7 I
>> >> >>> get:
>> >> >>>
>> >> >>> $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv
>> >> >>> Loading 1 files on interface 'testns'.
>> >> >>> libbpf: loading ../lib/testing/xdp_drop.o
>> >> >>> libbpf: elf: section(3) prog, size 16, link 0, flags 6, type=1
>> >> >>> libbpf: sec 'prog': failed to find program symbol at offset 0
>> >> >>> Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid
>> >> >>>
>> >> >>> The 'failed to find program symbol' error seems to have been introduced
>> >> >>> with commit c112239272c6 ("libbpf: Parse multi-function sections into
>> >> >>> multiple BPF programs").
>> >> >>>
>> >> >>> Looking at the object file in question, indeed it seems to not have any
>> >> >>> function symbols defined:
>> >> >>>
>> >> >>> $  llvm-objdump --syms ../lib/testing/xdp_drop.o
>> >> >>>
>> >> >>> ../lib/testing/xdp_drop.o:  file format elf64-bpf
>> >> >>>
>> >> >>> SYMBOL TABLE:
>> >> >>> 0000000000000000 l       .debug_str 0000000000000000
>> >> >>> 0000000000000037 l       .debug_str 0000000000000000
>> >> >>> 0000000000000042 l       .debug_str 0000000000000000
>> >> >>> 0000000000000068 l       .debug_str 0000000000000000
>> >> >>> 0000000000000071 l       .debug_str 0000000000000000
>> >> >>> 0000000000000076 l       .debug_str 0000000000000000
>> >> >>> 000000000000008a l       .debug_str 0000000000000000
>> >> >>> 0000000000000097 l       .debug_str 0000000000000000
>> >> >>> 00000000000000a3 l       .debug_str 0000000000000000
>> >> >>> 00000000000000ac l       .debug_str 0000000000000000
>> >> >>> 00000000000000b5 l       .debug_str 0000000000000000
>> >> >>> 00000000000000bc l       .debug_str 0000000000000000
>> >> >>> 00000000000000c9 l       .debug_str 0000000000000000
>> >> >>> 00000000000000d4 l       .debug_str 0000000000000000
>> >> >>> 00000000000000dd l       .debug_str 0000000000000000
>> >> >>> 00000000000000e1 l       .debug_str 0000000000000000
>> >> >>> 00000000000000e5 l       .debug_str 0000000000000000
>> >> >>> 00000000000000ea l       .debug_str 0000000000000000
>> >> >>> 00000000000000f0 l       .debug_str 0000000000000000
>> >> >>> 00000000000000f9 l       .debug_str 0000000000000000
>> >> >>> 0000000000000103 l       .debug_str 0000000000000000
>> >> >>> 0000000000000113 l       .debug_str 0000000000000000
>> >> >>> 0000000000000122 l       .debug_str 0000000000000000
>> >> >>> 0000000000000131 l       .debug_str 0000000000000000
>> >> >>> 0000000000000000 l    d  prog       0000000000000000 prog
>> >> >>> 0000000000000000 l    d  .debug_abbrev      0000000000000000 .debug_abbrev
>> >> >>> 0000000000000000 l    d  .debug_info        0000000000000000 .debug_info
>> >> >>> 0000000000000000 l    d  .debug_frame       0000000000000000 .debug_frame
>> >> >>> 0000000000000000 l    d  .debug_line        0000000000000000 .debug_line
>> >> >>> 0000000000000000 g       license    0000000000000000 _license
>> >> >>> 0000000000000000 g       prog       0000000000000000 xdp_drop
>> >> >>>
>> >> >>>
>> >> >>> I assume this is because old LLVM versions simply don't emit that symbol
>> >> >>> information?
>> >>
>> >> Thanks for the below instruction and xdp_drop.c file. I can reproduce
>> >> the issue now.
>> >>
>> >> I added another function 'xdp_drop1' in the same thing. Below is the
>> >> symbol table with llvm7 vs. llvm12.
>> >>
>> >> -bash-4.4$ llvm-readelf -symbols xdp-7.o | grep xdp_drop
>> >>      32: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT     3 xdp_drop
>> >>      33: 0000000000000010     0 NOTYPE  GLOBAL DEFAULT     3 xdp_drop1
>> >>
>> >>    [ 3] prog              PROGBITS        0000000000000000 000040 000020
>> >> 00  AX  0   0  8
>> >>
>> >> -bash-4.4$ llvm-readelf -symbols xdp-12.o | grep xdp_drop
>> >>      32: 0000000000000000    16 FUNC    GLOBAL DEFAULT     3 xdp_drop
>> >>      33: 0000000000000010    16 FUNC    GLOBAL DEFAULT     3 xdp_drop1
>> >> -bash-4.4$
>> >>
>> >>    [ 3] prog              PROGBITS        0000000000000000 000040 000020
>> >> 00  AX  0   0  8
>> >>
>> >>
>> >> Yes, llvm7 does not encode type and size for FUNC's. I guess libbpf can
>> >> change to recognize NOTYPE and use the symbol value (representing the
>> >> offset from the start of the section) and section size to
>> >> calculate the individual function size. This is more complicated than
>> >> elf file providing FUNC type and symbol size directly.
>> >
>> > I think we should just face the fact that LLVM7 is way too old to
>> > produce a sensible BPF ELF file layout. We can extend:
>> >
>> > libbpf: sec 'prog': failed to find program symbol at offset 0
>> > Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid
>> >
>> > with a suggestion to upgrade Clang/LLVM to something more recent, if
>> > that would be helpful.
>> >
>> > But I don't want to add error-prone checks and assumptions in the
>> > already quite complicated logic. Even the kernel itself maintains that
>> > Clang 10+ needs to be used for its compilation. BPF CO-RE is also not
>> > working with older than Clang10, so lots of people have already
>> > upgraded way beyond that.
>>
>> Wait, what? This is a regression that *breaks people's programs* on
>> compiler versions that are still very much in the wild! I mean, fine if
>> you don't want to support new features on such files, but then surely we
>> can at least revert back to the old behaviour?
>
> This is clearly a bug in LLVM7, which didn't produce correct ELF
> symbols, do we agree on that? libbpf used to handle such invalid ELF
> files *by accident* until it changed its internal logic to be more
> strict in v0.2. It became more strict and doesn't work with such
> invalid ELF files anymore. Does it need to add extra quirks to support
> such broken ELF? I don't think so.

I don't know enough about the intricacies of the ELF format to say, but
I believe you when you say it's a bug. However, that doesn't change the
fact that from a user's PoV, something that was working before is now
broken, with the only change being a newer libbpf.

This is not a theoretical concern, BTW, I discovered this due to
feedback from a partner that we've been pushing to adopt libbpf. When
they finally tried it out, the first thing they noticed is that their
programs wouldn't load due to this issue.

Sure, I can tell them to just upgrade their toolchain (and I will), but
that still means we're back to "in order to use this library, you should
expect to keep chasing the latest version of the entire toolchain". And
this is a much harder sell than "this is a stable library and upstream
takes backwards compatibility very serious", which I *thought* was the
expectation.

> Surely, users that can't upgrade LLVM7 to something less ancient, can
> stick to libbpf v0.1, that was lenient enough to accept such invalid
> ELF files. libbpf v0.2 was released more than a month ago, and so far
> you are the only one who noticed this "regression". So hopefully it's
> not super annoying to people and they would be accommodating enough to
> use more up to date compiler (and save themselves lots of trouble
> along the way).

Oh, boy, do I envy your adoption rate for new versions! In my world I
would expect that by one month a few people who are very early adopters
have started noticing and maybe thinking about testing the new version :)

>> > Speaking of legacy. Toke, can you please update all the samples in
>> > your xdp-tools repo to not use arbitrary sections names. I see
>> > SEC("prog"), where it should really be SEC("xdp"). It sets a bad
>> > example for newcomers, IMO.
>>
>> I used "prog" because that's what iproute2 looks for if you don't supply
>
> Ok.

Fixed now, BTW:
https://github.com/xdp-project/xdp-tools/commit/83ab8aa1c29408aac842bebe704aa47ec5dc5bc3

>> a section name, so it makes it convenient to load programs with 'ip'
>> without supplying the section name. However, I do realise this is not
>> the best of reasons, and I am not opposed to changing it. However...
>>
>> > I'm also going to emit warnings in libbpf soon for section names that
>> > don't follow proper libbpf naming pattern, so it would be good if you
>> > could get ahead of the curve.
>>
>> ...this sounds like just another way to annoy users by breaking things
>> that were working before? :/
>
> It won't break, libbpf will emit a warning about the need to use
> proper section name format, which will start to be enforced only with
> major version bump. So that will give users plenty of time to make
> sure their BPF programs are compatible with stricter libbpf.

Well see above re: different expectations for "plenty of time". But OK,
maybe this isn't as bad as I figured at first glance :)

-Toke


  reply	other threads:[~2020-12-08 13:43 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-03 17:55 Latest libbpf fails to load programs compiled with old LLVM Toke Høiland-Jørgensen
2020-12-04  1:42 ` Yonghong Song
2020-12-04  9:34   ` Toke Høiland-Jørgensen
2020-12-04 17:54     ` Yonghong Song
2020-12-04 19:23       ` Andrii Nakryiko
2020-12-07 10:59         ` Toke Høiland-Jørgensen
2020-12-07 15:55           ` Alexei Starovoitov
2020-12-07 16:15             ` Toke Høiland-Jørgensen
2020-12-07 16:20               ` Alexei Starovoitov
2020-12-07 16:51                 ` Toke Høiland-Jørgensen
2020-12-07 17:16                   ` Alexei Starovoitov
2020-12-07 22:18                     ` Toke Høiland-Jørgensen
2020-12-08  1:20                       ` Alexei Starovoitov
2020-12-07 18:02                 ` David Ahern
2020-12-07 18:14                   ` Alexei Starovoitov
2020-12-07 18:59                     ` David Ahern
2020-12-08  2:47           ` Andrii Nakryiko
2020-12-08 13:41             ` Toke Høiland-Jørgensen [this message]
2020-12-08 18:39               ` Andrii Nakryiko
2020-12-08 22:38                 ` Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87czzkifef.fsf@toke.dk \
    --to=toke@redhat.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.