* Latest libbpf fails to load programs compiled with old LLVM @ 2020-12-03 17:55 Toke Høiland-Jørgensen 2020-12-04 1:42 ` Yonghong Song 0 siblings, 1 reply; 20+ messages in thread From: Toke Høiland-Jørgensen @ 2020-12-03 17:55 UTC (permalink / raw) To: Andrii Nakryiko; +Cc: bpf Hi Andrii I noticed that recent libbpf versions fail to load BPF files compiled with old versions of LLVM. E.g., if I compile xdp-tools with LLVM 7 I get: $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv Loading 1 files on interface 'testns'. libbpf: loading ../lib/testing/xdp_drop.o libbpf: elf: section(3) prog, size 16, link 0, flags 6, type=1 libbpf: sec 'prog': failed to find program symbol at offset 0 Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid The 'failed to find program symbol' error seems to have been introduced with commit c112239272c6 ("libbpf: Parse multi-function sections into multiple BPF programs"). Looking at the object file in question, indeed it seems to not have any function symbols defined: $ llvm-objdump --syms ../lib/testing/xdp_drop.o ../lib/testing/xdp_drop.o: file format elf64-bpf SYMBOL TABLE: 0000000000000000 l .debug_str 0000000000000000 0000000000000037 l .debug_str 0000000000000000 0000000000000042 l .debug_str 0000000000000000 0000000000000068 l .debug_str 0000000000000000 0000000000000071 l .debug_str 0000000000000000 0000000000000076 l .debug_str 0000000000000000 000000000000008a l .debug_str 0000000000000000 0000000000000097 l .debug_str 0000000000000000 00000000000000a3 l .debug_str 0000000000000000 00000000000000ac l .debug_str 0000000000000000 00000000000000b5 l .debug_str 0000000000000000 00000000000000bc l .debug_str 0000000000000000 00000000000000c9 l .debug_str 0000000000000000 00000000000000d4 l .debug_str 0000000000000000 00000000000000dd l .debug_str 0000000000000000 00000000000000e1 l .debug_str 0000000000000000 00000000000000e5 l .debug_str 0000000000000000 00000000000000ea l .debug_str 0000000000000000 00000000000000f0 l .debug_str 0000000000000000 00000000000000f9 l .debug_str 0000000000000000 0000000000000103 l .debug_str 0000000000000000 0000000000000113 l .debug_str 0000000000000000 0000000000000122 l .debug_str 0000000000000000 0000000000000131 l .debug_str 0000000000000000 0000000000000000 l d prog 0000000000000000 prog 0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev 0000000000000000 l d .debug_info 0000000000000000 .debug_info 0000000000000000 l d .debug_frame 0000000000000000 .debug_frame 0000000000000000 l d .debug_line 0000000000000000 .debug_line 0000000000000000 g license 0000000000000000 _license 0000000000000000 g prog 0000000000000000 xdp_drop I assume this is because old LLVM versions simply don't emit that symbol information? Anyhow, the patch series that introduced this restructures the program parsing some, so I wanted to get your input to make sure I don't break things when fixing this regression. So what's the best way to fix it? Just assume that the whole section is one program if no symbols are present, or is the some subtle reason why that would break any of the other logic for BPF-to-BPF calls? -Toke ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-03 17:55 Latest libbpf fails to load programs compiled with old LLVM Toke Høiland-Jørgensen @ 2020-12-04 1:42 ` Yonghong Song 2020-12-04 9:34 ` Toke Høiland-Jørgensen 0 siblings, 1 reply; 20+ messages in thread From: Yonghong Song @ 2020-12-04 1:42 UTC (permalink / raw) To: Toke Høiland-Jørgensen, Andrii Nakryiko; +Cc: bpf On 12/3/20 9:55 AM, Toke Høiland-Jørgensen wrote: > Hi Andrii > > I noticed that recent libbpf versions fail to load BPF files compiled > with old versions of LLVM. E.g., if I compile xdp-tools with LLVM 7 I > get: > > $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv > Loading 1 files on interface 'testns'. > libbpf: loading ../lib/testing/xdp_drop.o > libbpf: elf: section(3) prog, size 16, link 0, flags 6, type=1 > libbpf: sec 'prog': failed to find program symbol at offset 0 > Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid > > The 'failed to find program symbol' error seems to have been introduced > with commit c112239272c6 ("libbpf: Parse multi-function sections into > multiple BPF programs"). > > Looking at the object file in question, indeed it seems to not have any > function symbols defined: > > $ llvm-objdump --syms ../lib/testing/xdp_drop.o > > ../lib/testing/xdp_drop.o: file format elf64-bpf > > SYMBOL TABLE: > 0000000000000000 l .debug_str 0000000000000000 > 0000000000000037 l .debug_str 0000000000000000 > 0000000000000042 l .debug_str 0000000000000000 > 0000000000000068 l .debug_str 0000000000000000 > 0000000000000071 l .debug_str 0000000000000000 > 0000000000000076 l .debug_str 0000000000000000 > 000000000000008a l .debug_str 0000000000000000 > 0000000000000097 l .debug_str 0000000000000000 > 00000000000000a3 l .debug_str 0000000000000000 > 00000000000000ac l .debug_str 0000000000000000 > 00000000000000b5 l .debug_str 0000000000000000 > 00000000000000bc l .debug_str 0000000000000000 > 00000000000000c9 l .debug_str 0000000000000000 > 00000000000000d4 l .debug_str 0000000000000000 > 00000000000000dd l .debug_str 0000000000000000 > 00000000000000e1 l .debug_str 0000000000000000 > 00000000000000e5 l .debug_str 0000000000000000 > 00000000000000ea l .debug_str 0000000000000000 > 00000000000000f0 l .debug_str 0000000000000000 > 00000000000000f9 l .debug_str 0000000000000000 > 0000000000000103 l .debug_str 0000000000000000 > 0000000000000113 l .debug_str 0000000000000000 > 0000000000000122 l .debug_str 0000000000000000 > 0000000000000131 l .debug_str 0000000000000000 > 0000000000000000 l d prog 0000000000000000 prog > 0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev > 0000000000000000 l d .debug_info 0000000000000000 .debug_info > 0000000000000000 l d .debug_frame 0000000000000000 .debug_frame > 0000000000000000 l d .debug_line 0000000000000000 .debug_line > 0000000000000000 g license 0000000000000000 _license > 0000000000000000 g prog 0000000000000000 xdp_drop > > > I assume this is because old LLVM versions simply don't emit that symbol > information? Could you share xdp_drop.c or other test which I can compile and check to understand the issue? > > Anyhow, the patch series that introduced this restructures the program > parsing some, so I wanted to get your input to make sure I don't break > things when fixing this regression. So what's the best way to fix it? > Just assume that the whole section is one program if no symbols are > present, or is the some subtle reason why that would break any of the > other logic for BPF-to-BPF calls? > > -Toke > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-04 1:42 ` Yonghong Song @ 2020-12-04 9:34 ` Toke Høiland-Jørgensen 2020-12-04 17:54 ` Yonghong Song 0 siblings, 1 reply; 20+ messages in thread From: Toke Høiland-Jørgensen @ 2020-12-04 9:34 UTC (permalink / raw) To: Yonghong Song, Andrii Nakryiko; +Cc: bpf Yonghong Song <yhs@fb.com> writes: > On 12/3/20 9:55 AM, Toke Høiland-Jørgensen wrote: >> Hi Andrii >> >> I noticed that recent libbpf versions fail to load BPF files compiled >> with old versions of LLVM. E.g., if I compile xdp-tools with LLVM 7 I >> get: >> >> $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv >> Loading 1 files on interface 'testns'. >> libbpf: loading ../lib/testing/xdp_drop.o >> libbpf: elf: section(3) prog, size 16, link 0, flags 6, type=1 >> libbpf: sec 'prog': failed to find program symbol at offset 0 >> Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid >> >> The 'failed to find program symbol' error seems to have been introduced >> with commit c112239272c6 ("libbpf: Parse multi-function sections into >> multiple BPF programs"). >> >> Looking at the object file in question, indeed it seems to not have any >> function symbols defined: >> >> $ llvm-objdump --syms ../lib/testing/xdp_drop.o >> >> ../lib/testing/xdp_drop.o: file format elf64-bpf >> >> SYMBOL TABLE: >> 0000000000000000 l .debug_str 0000000000000000 >> 0000000000000037 l .debug_str 0000000000000000 >> 0000000000000042 l .debug_str 0000000000000000 >> 0000000000000068 l .debug_str 0000000000000000 >> 0000000000000071 l .debug_str 0000000000000000 >> 0000000000000076 l .debug_str 0000000000000000 >> 000000000000008a l .debug_str 0000000000000000 >> 0000000000000097 l .debug_str 0000000000000000 >> 00000000000000a3 l .debug_str 0000000000000000 >> 00000000000000ac l .debug_str 0000000000000000 >> 00000000000000b5 l .debug_str 0000000000000000 >> 00000000000000bc l .debug_str 0000000000000000 >> 00000000000000c9 l .debug_str 0000000000000000 >> 00000000000000d4 l .debug_str 0000000000000000 >> 00000000000000dd l .debug_str 0000000000000000 >> 00000000000000e1 l .debug_str 0000000000000000 >> 00000000000000e5 l .debug_str 0000000000000000 >> 00000000000000ea l .debug_str 0000000000000000 >> 00000000000000f0 l .debug_str 0000000000000000 >> 00000000000000f9 l .debug_str 0000000000000000 >> 0000000000000103 l .debug_str 0000000000000000 >> 0000000000000113 l .debug_str 0000000000000000 >> 0000000000000122 l .debug_str 0000000000000000 >> 0000000000000131 l .debug_str 0000000000000000 >> 0000000000000000 l d prog 0000000000000000 prog >> 0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev >> 0000000000000000 l d .debug_info 0000000000000000 .debug_info >> 0000000000000000 l d .debug_frame 0000000000000000 .debug_frame >> 0000000000000000 l d .debug_line 0000000000000000 .debug_line >> 0000000000000000 g license 0000000000000000 _license >> 0000000000000000 g prog 0000000000000000 xdp_drop >> >> >> I assume this is because old LLVM versions simply don't emit that symbol >> information? > > Could you share xdp_drop.c or other test which I can compile and check > to understand the issue? It's just an empty program returning XDP_DROP: https://github.com/xdp-project/xdp-tools/blob/master/lib/testing/xdp_drop.c I basically just did this on Debian buster: $ sudo apt install gcc-multilib build-essential libpcap-dev libelf-dev git llc lld clang gcc-multilib pkt-config m4 $ git clone --recurse-submodules https://github.com/xdp-project/xdp-tools $ cd xdp-tools $ LLC=llc-7 ./configure $ make -k $ cd xdp-loader $ sudo ip link add dev testns type veth $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv (xdpdump will fail to build with llvm7, but the rest should work) -Toke ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-04 9:34 ` Toke Høiland-Jørgensen @ 2020-12-04 17:54 ` Yonghong Song 2020-12-04 19:23 ` Andrii Nakryiko 0 siblings, 1 reply; 20+ messages in thread From: Yonghong Song @ 2020-12-04 17:54 UTC (permalink / raw) To: Toke Høiland-Jørgensen, Andrii Nakryiko; +Cc: bpf On 12/4/20 1:34 AM, Toke Høiland-Jørgensen wrote: > Yonghong Song <yhs@fb.com> writes: > >> On 12/3/20 9:55 AM, Toke Høiland-Jørgensen wrote: >>> Hi Andrii >>> >>> I noticed that recent libbpf versions fail to load BPF files compiled >>> with old versions of LLVM. E.g., if I compile xdp-tools with LLVM 7 I >>> get: >>> >>> $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv >>> Loading 1 files on interface 'testns'. >>> libbpf: loading ../lib/testing/xdp_drop.o >>> libbpf: elf: section(3) prog, size 16, link 0, flags 6, type=1 >>> libbpf: sec 'prog': failed to find program symbol at offset 0 >>> Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid >>> >>> The 'failed to find program symbol' error seems to have been introduced >>> with commit c112239272c6 ("libbpf: Parse multi-function sections into >>> multiple BPF programs"). >>> >>> Looking at the object file in question, indeed it seems to not have any >>> function symbols defined: >>> >>> $ llvm-objdump --syms ../lib/testing/xdp_drop.o >>> >>> ../lib/testing/xdp_drop.o: file format elf64-bpf >>> >>> SYMBOL TABLE: >>> 0000000000000000 l .debug_str 0000000000000000 >>> 0000000000000037 l .debug_str 0000000000000000 >>> 0000000000000042 l .debug_str 0000000000000000 >>> 0000000000000068 l .debug_str 0000000000000000 >>> 0000000000000071 l .debug_str 0000000000000000 >>> 0000000000000076 l .debug_str 0000000000000000 >>> 000000000000008a l .debug_str 0000000000000000 >>> 0000000000000097 l .debug_str 0000000000000000 >>> 00000000000000a3 l .debug_str 0000000000000000 >>> 00000000000000ac l .debug_str 0000000000000000 >>> 00000000000000b5 l .debug_str 0000000000000000 >>> 00000000000000bc l .debug_str 0000000000000000 >>> 00000000000000c9 l .debug_str 0000000000000000 >>> 00000000000000d4 l .debug_str 0000000000000000 >>> 00000000000000dd l .debug_str 0000000000000000 >>> 00000000000000e1 l .debug_str 0000000000000000 >>> 00000000000000e5 l .debug_str 0000000000000000 >>> 00000000000000ea l .debug_str 0000000000000000 >>> 00000000000000f0 l .debug_str 0000000000000000 >>> 00000000000000f9 l .debug_str 0000000000000000 >>> 0000000000000103 l .debug_str 0000000000000000 >>> 0000000000000113 l .debug_str 0000000000000000 >>> 0000000000000122 l .debug_str 0000000000000000 >>> 0000000000000131 l .debug_str 0000000000000000 >>> 0000000000000000 l d prog 0000000000000000 prog >>> 0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev >>> 0000000000000000 l d .debug_info 0000000000000000 .debug_info >>> 0000000000000000 l d .debug_frame 0000000000000000 .debug_frame >>> 0000000000000000 l d .debug_line 0000000000000000 .debug_line >>> 0000000000000000 g license 0000000000000000 _license >>> 0000000000000000 g prog 0000000000000000 xdp_drop >>> >>> >>> I assume this is because old LLVM versions simply don't emit that symbol >>> information? Thanks for the below instruction and xdp_drop.c file. I can reproduce the issue now. I added another function 'xdp_drop1' in the same thing. Below is the symbol table with llvm7 vs. llvm12. -bash-4.4$ llvm-readelf -symbols xdp-7.o | grep xdp_drop 32: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 3 xdp_drop 33: 0000000000000010 0 NOTYPE GLOBAL DEFAULT 3 xdp_drop1 [ 3] prog PROGBITS 0000000000000000 000040 000020 00 AX 0 0 8 -bash-4.4$ llvm-readelf -symbols xdp-12.o | grep xdp_drop 32: 0000000000000000 16 FUNC GLOBAL DEFAULT 3 xdp_drop 33: 0000000000000010 16 FUNC GLOBAL DEFAULT 3 xdp_drop1 -bash-4.4$ [ 3] prog PROGBITS 0000000000000000 000040 000020 00 AX 0 0 8 Yes, llvm7 does not encode type and size for FUNC's. I guess libbpf can change to recognize NOTYPE and use the symbol value (representing the offset from the start of the section) and section size to calculate the individual function size. This is more complicated than elf file providing FUNC type and symbol size directly. Maybe in this case, libbpf can do some sanity check. If there are more than one functions in the 'prog' section and they are not marked at FUNC type, simply recommend newer compiler and bail out saying this feature not available with old llvm? >> >> Could you share xdp_drop.c or other test which I can compile and check >> to understand the issue? > > It's just an empty program returning XDP_DROP: > > https://github.com/xdp-project/xdp-tools/blob/master/lib/testing/xdp_drop.c > > I basically just did this on Debian buster: > > $ sudo apt install gcc-multilib build-essential libpcap-dev libelf-dev git llc lld clang gcc-multilib pkt-config m4 > $ git clone --recurse-submodules https://github.com/xdp-project/xdp-tools > $ cd xdp-tools > $ LLC=llc-7 ./configure > $ make -k > $ cd xdp-loader > $ sudo ip link add dev testns type veth > $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv > > (xdpdump will fail to build with llvm7, but the rest should work) > > -Toke > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-04 17:54 ` Yonghong Song @ 2020-12-04 19:23 ` Andrii Nakryiko 2020-12-07 10:59 ` Toke Høiland-Jørgensen 0 siblings, 1 reply; 20+ messages in thread From: Andrii Nakryiko @ 2020-12-04 19:23 UTC (permalink / raw) To: Yonghong Song; +Cc: Toke Høiland-Jørgensen, bpf On Fri, Dec 4, 2020 at 9:55 AM Yonghong Song <yhs@fb.com> wrote: > > > > On 12/4/20 1:34 AM, Toke Høiland-Jørgensen wrote: > > Yonghong Song <yhs@fb.com> writes: > > > >> On 12/3/20 9:55 AM, Toke Høiland-Jørgensen wrote: > >>> Hi Andrii > >>> > >>> I noticed that recent libbpf versions fail to load BPF files compiled > >>> with old versions of LLVM. E.g., if I compile xdp-tools with LLVM 7 I > >>> get: > >>> > >>> $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv > >>> Loading 1 files on interface 'testns'. > >>> libbpf: loading ../lib/testing/xdp_drop.o > >>> libbpf: elf: section(3) prog, size 16, link 0, flags 6, type=1 > >>> libbpf: sec 'prog': failed to find program symbol at offset 0 > >>> Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid > >>> > >>> The 'failed to find program symbol' error seems to have been introduced > >>> with commit c112239272c6 ("libbpf: Parse multi-function sections into > >>> multiple BPF programs"). > >>> > >>> Looking at the object file in question, indeed it seems to not have any > >>> function symbols defined: > >>> > >>> $ llvm-objdump --syms ../lib/testing/xdp_drop.o > >>> > >>> ../lib/testing/xdp_drop.o: file format elf64-bpf > >>> > >>> SYMBOL TABLE: > >>> 0000000000000000 l .debug_str 0000000000000000 > >>> 0000000000000037 l .debug_str 0000000000000000 > >>> 0000000000000042 l .debug_str 0000000000000000 > >>> 0000000000000068 l .debug_str 0000000000000000 > >>> 0000000000000071 l .debug_str 0000000000000000 > >>> 0000000000000076 l .debug_str 0000000000000000 > >>> 000000000000008a l .debug_str 0000000000000000 > >>> 0000000000000097 l .debug_str 0000000000000000 > >>> 00000000000000a3 l .debug_str 0000000000000000 > >>> 00000000000000ac l .debug_str 0000000000000000 > >>> 00000000000000b5 l .debug_str 0000000000000000 > >>> 00000000000000bc l .debug_str 0000000000000000 > >>> 00000000000000c9 l .debug_str 0000000000000000 > >>> 00000000000000d4 l .debug_str 0000000000000000 > >>> 00000000000000dd l .debug_str 0000000000000000 > >>> 00000000000000e1 l .debug_str 0000000000000000 > >>> 00000000000000e5 l .debug_str 0000000000000000 > >>> 00000000000000ea l .debug_str 0000000000000000 > >>> 00000000000000f0 l .debug_str 0000000000000000 > >>> 00000000000000f9 l .debug_str 0000000000000000 > >>> 0000000000000103 l .debug_str 0000000000000000 > >>> 0000000000000113 l .debug_str 0000000000000000 > >>> 0000000000000122 l .debug_str 0000000000000000 > >>> 0000000000000131 l .debug_str 0000000000000000 > >>> 0000000000000000 l d prog 0000000000000000 prog > >>> 0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev > >>> 0000000000000000 l d .debug_info 0000000000000000 .debug_info > >>> 0000000000000000 l d .debug_frame 0000000000000000 .debug_frame > >>> 0000000000000000 l d .debug_line 0000000000000000 .debug_line > >>> 0000000000000000 g license 0000000000000000 _license > >>> 0000000000000000 g prog 0000000000000000 xdp_drop > >>> > >>> > >>> I assume this is because old LLVM versions simply don't emit that symbol > >>> information? > > Thanks for the below instruction and xdp_drop.c file. I can reproduce > the issue now. > > I added another function 'xdp_drop1' in the same thing. Below is the > symbol table with llvm7 vs. llvm12. > > -bash-4.4$ llvm-readelf -symbols xdp-7.o | grep xdp_drop > 32: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 3 xdp_drop > 33: 0000000000000010 0 NOTYPE GLOBAL DEFAULT 3 xdp_drop1 > > [ 3] prog PROGBITS 0000000000000000 000040 000020 > 00 AX 0 0 8 > > -bash-4.4$ llvm-readelf -symbols xdp-12.o | grep xdp_drop > 32: 0000000000000000 16 FUNC GLOBAL DEFAULT 3 xdp_drop > 33: 0000000000000010 16 FUNC GLOBAL DEFAULT 3 xdp_drop1 > -bash-4.4$ > > [ 3] prog PROGBITS 0000000000000000 000040 000020 > 00 AX 0 0 8 > > > Yes, llvm7 does not encode type and size for FUNC's. I guess libbpf can > change to recognize NOTYPE and use the symbol value (representing the > offset from the start of the section) and section size to > calculate the individual function size. This is more complicated than > elf file providing FUNC type and symbol size directly. I think we should just face the fact that LLVM7 is way too old to produce a sensible BPF ELF file layout. We can extend: libbpf: sec 'prog': failed to find program symbol at offset 0 Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid with a suggestion to upgrade Clang/LLVM to something more recent, if that would be helpful. But I don't want to add error-prone checks and assumptions in the already quite complicated logic. Even the kernel itself maintains that Clang 10+ needs to be used for its compilation. BPF CO-RE is also not working with older than Clang10, so lots of people have already upgraded way beyond that. Speaking of legacy. Toke, can you please update all the samples in your xdp-tools repo to not use arbitrary sections names. I see SEC("prog"), where it should really be SEC("xdp"). It sets a bad example for newcomers, IMO. I'm also going to emit warnings in libbpf soon for section names that don't follow proper libbpf naming pattern, so it would be good if you could get ahead of the curve. > > Maybe in this case, libbpf can do some sanity check. If there are more > than one functions in the 'prog' section and they are not marked at FUNC > type, simply recommend newer compiler and bail out saying this feature > not available with old llvm? > > >> > >> Could you share xdp_drop.c or other test which I can compile and check > >> to understand the issue? > > > > It's just an empty program returning XDP_DROP: > > > > https://github.com/xdp-project/xdp-tools/blob/master/lib/testing/xdp_drop.c > > > > I basically just did this on Debian buster: > > > > $ sudo apt install gcc-multilib build-essential libpcap-dev libelf-dev git llc lld clang gcc-multilib pkt-config m4 > > $ git clone --recurse-submodules https://github.com/xdp-project/xdp-tools > > $ cd xdp-tools > > $ LLC=llc-7 ./configure > > $ make -k > > $ cd xdp-loader > > $ sudo ip link add dev testns type veth > > $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv > > > > (xdpdump will fail to build with llvm7, but the rest should work) > > > > -Toke > > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-04 19:23 ` Andrii Nakryiko @ 2020-12-07 10:59 ` Toke Høiland-Jørgensen 2020-12-07 15:55 ` Alexei Starovoitov 2020-12-08 2:47 ` Andrii Nakryiko 0 siblings, 2 replies; 20+ messages in thread From: Toke Høiland-Jørgensen @ 2020-12-07 10:59 UTC (permalink / raw) To: Andrii Nakryiko, Yonghong Song; +Cc: bpf Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > On Fri, Dec 4, 2020 at 9:55 AM Yonghong Song <yhs@fb.com> wrote: >> >> >> >> On 12/4/20 1:34 AM, Toke Høiland-Jørgensen wrote: >> > Yonghong Song <yhs@fb.com> writes: >> > >> >> On 12/3/20 9:55 AM, Toke Høiland-Jørgensen wrote: >> >>> Hi Andrii >> >>> >> >>> I noticed that recent libbpf versions fail to load BPF files compiled >> >>> with old versions of LLVM. E.g., if I compile xdp-tools with LLVM 7 I >> >>> get: >> >>> >> >>> $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv >> >>> Loading 1 files on interface 'testns'. >> >>> libbpf: loading ../lib/testing/xdp_drop.o >> >>> libbpf: elf: section(3) prog, size 16, link 0, flags 6, type=1 >> >>> libbpf: sec 'prog': failed to find program symbol at offset 0 >> >>> Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid >> >>> >> >>> The 'failed to find program symbol' error seems to have been introduced >> >>> with commit c112239272c6 ("libbpf: Parse multi-function sections into >> >>> multiple BPF programs"). >> >>> >> >>> Looking at the object file in question, indeed it seems to not have any >> >>> function symbols defined: >> >>> >> >>> $ llvm-objdump --syms ../lib/testing/xdp_drop.o >> >>> >> >>> ../lib/testing/xdp_drop.o: file format elf64-bpf >> >>> >> >>> SYMBOL TABLE: >> >>> 0000000000000000 l .debug_str 0000000000000000 >> >>> 0000000000000037 l .debug_str 0000000000000000 >> >>> 0000000000000042 l .debug_str 0000000000000000 >> >>> 0000000000000068 l .debug_str 0000000000000000 >> >>> 0000000000000071 l .debug_str 0000000000000000 >> >>> 0000000000000076 l .debug_str 0000000000000000 >> >>> 000000000000008a l .debug_str 0000000000000000 >> >>> 0000000000000097 l .debug_str 0000000000000000 >> >>> 00000000000000a3 l .debug_str 0000000000000000 >> >>> 00000000000000ac l .debug_str 0000000000000000 >> >>> 00000000000000b5 l .debug_str 0000000000000000 >> >>> 00000000000000bc l .debug_str 0000000000000000 >> >>> 00000000000000c9 l .debug_str 0000000000000000 >> >>> 00000000000000d4 l .debug_str 0000000000000000 >> >>> 00000000000000dd l .debug_str 0000000000000000 >> >>> 00000000000000e1 l .debug_str 0000000000000000 >> >>> 00000000000000e5 l .debug_str 0000000000000000 >> >>> 00000000000000ea l .debug_str 0000000000000000 >> >>> 00000000000000f0 l .debug_str 0000000000000000 >> >>> 00000000000000f9 l .debug_str 0000000000000000 >> >>> 0000000000000103 l .debug_str 0000000000000000 >> >>> 0000000000000113 l .debug_str 0000000000000000 >> >>> 0000000000000122 l .debug_str 0000000000000000 >> >>> 0000000000000131 l .debug_str 0000000000000000 >> >>> 0000000000000000 l d prog 0000000000000000 prog >> >>> 0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev >> >>> 0000000000000000 l d .debug_info 0000000000000000 .debug_info >> >>> 0000000000000000 l d .debug_frame 0000000000000000 .debug_frame >> >>> 0000000000000000 l d .debug_line 0000000000000000 .debug_line >> >>> 0000000000000000 g license 0000000000000000 _license >> >>> 0000000000000000 g prog 0000000000000000 xdp_drop >> >>> >> >>> >> >>> I assume this is because old LLVM versions simply don't emit that symbol >> >>> information? >> >> Thanks for the below instruction and xdp_drop.c file. I can reproduce >> the issue now. >> >> I added another function 'xdp_drop1' in the same thing. Below is the >> symbol table with llvm7 vs. llvm12. >> >> -bash-4.4$ llvm-readelf -symbols xdp-7.o | grep xdp_drop >> 32: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 3 xdp_drop >> 33: 0000000000000010 0 NOTYPE GLOBAL DEFAULT 3 xdp_drop1 >> >> [ 3] prog PROGBITS 0000000000000000 000040 000020 >> 00 AX 0 0 8 >> >> -bash-4.4$ llvm-readelf -symbols xdp-12.o | grep xdp_drop >> 32: 0000000000000000 16 FUNC GLOBAL DEFAULT 3 xdp_drop >> 33: 0000000000000010 16 FUNC GLOBAL DEFAULT 3 xdp_drop1 >> -bash-4.4$ >> >> [ 3] prog PROGBITS 0000000000000000 000040 000020 >> 00 AX 0 0 8 >> >> >> Yes, llvm7 does not encode type and size for FUNC's. I guess libbpf can >> change to recognize NOTYPE and use the symbol value (representing the >> offset from the start of the section) and section size to >> calculate the individual function size. This is more complicated than >> elf file providing FUNC type and symbol size directly. > > I think we should just face the fact that LLVM7 is way too old to > produce a sensible BPF ELF file layout. We can extend: > > libbpf: sec 'prog': failed to find program symbol at offset 0 > Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid > > with a suggestion to upgrade Clang/LLVM to something more recent, if > that would be helpful. > > But I don't want to add error-prone checks and assumptions in the > already quite complicated logic. Even the kernel itself maintains that > Clang 10+ needs to be used for its compilation. BPF CO-RE is also not > working with older than Clang10, so lots of people have already > upgraded way beyond that. Wait, what? This is a regression that *breaks people's programs* on compiler versions that are still very much in the wild! I mean, fine if you don't want to support new features on such files, but then surely we can at least revert back to the old behaviour? > Speaking of legacy. Toke, can you please update all the samples in > your xdp-tools repo to not use arbitrary sections names. I see > SEC("prog"), where it should really be SEC("xdp"). It sets a bad > example for newcomers, IMO. I used "prog" because that's what iproute2 looks for if you don't supply a section name, so it makes it convenient to load programs with 'ip' without supplying the section name. However, I do realise this is not the best of reasons, and I am not opposed to changing it. However... > I'm also going to emit warnings in libbpf soon for section names that > don't follow proper libbpf naming pattern, so it would be good if you > could get ahead of the curve. ...this sounds like just another way to annoy users by breaking things that were working before? :/ -Toke ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-07 10:59 ` Toke Høiland-Jørgensen @ 2020-12-07 15:55 ` Alexei Starovoitov 2020-12-07 16:15 ` Toke Høiland-Jørgensen 2020-12-08 2:47 ` Andrii Nakryiko 1 sibling, 1 reply; 20+ messages in thread From: Alexei Starovoitov @ 2020-12-07 15:55 UTC (permalink / raw) To: Toke Høiland-Jørgensen; +Cc: Andrii Nakryiko, Yonghong Song, bpf On Mon, Dec 7, 2020 at 3:03 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Wait, what? This is a regression that *breaks people's programs* on > compiler versions that are still very much in the wild! I mean, fine if > you don't want to support new features on such files, but then surely we > can at least revert back to the old behaviour? Those folks that care about compiling with old llvm would have to stick to whatever loader they have instead of using libbpf. It's not a backward compatibility breakage. > I used "prog" because that's what iproute2 looks for if you don't supply Please don't use iproute2 as a reason to do anything in libbpf. It won't fly. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-07 15:55 ` Alexei Starovoitov @ 2020-12-07 16:15 ` Toke Høiland-Jørgensen 2020-12-07 16:20 ` Alexei Starovoitov 0 siblings, 1 reply; 20+ messages in thread From: Toke Høiland-Jørgensen @ 2020-12-07 16:15 UTC (permalink / raw) To: Alexei Starovoitov; +Cc: Andrii Nakryiko, Yonghong Song, bpf Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: > On Mon, Dec 7, 2020 at 3:03 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Wait, what? This is a regression that *breaks people's programs* on >> compiler versions that are still very much in the wild! I mean, fine if >> you don't want to support new features on such files, but then surely we >> can at least revert back to the old behaviour? > > Those folks that care about compiling with old llvm would have to stick > to whatever loader they have instead of using libbpf. > It's not a backward compatibility breakage. What? It's a change in libbpf that breaks loading of existing BPF object files that were working (with libbpf) before. If that's not a backward compatibility break then that term has lost all meaning. >> I used "prog" because that's what iproute2 looks for if you don't supply > > Please don't use iproute2 as a reason to do anything in libbpf. It won't fly. Eh? Did you even read the email you're replying to? This issue has nothing to do with iproute2, that was an unrelated thing Andrii asked me to change when he was looking at my example... -Toke ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-07 16:15 ` Toke Høiland-Jørgensen @ 2020-12-07 16:20 ` Alexei Starovoitov 2020-12-07 16:51 ` Toke Høiland-Jørgensen 2020-12-07 18:02 ` David Ahern 0 siblings, 2 replies; 20+ messages in thread From: Alexei Starovoitov @ 2020-12-07 16:20 UTC (permalink / raw) To: Toke Høiland-Jørgensen; +Cc: Andrii Nakryiko, Yonghong Song, bpf On Mon, Dec 7, 2020 at 8:15 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: > > > On Mon, Dec 7, 2020 at 3:03 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> > >> Wait, what? This is a regression that *breaks people's programs* on > >> compiler versions that are still very much in the wild! I mean, fine if > >> you don't want to support new features on such files, but then surely we > >> can at least revert back to the old behaviour? > > > > Those folks that care about compiling with old llvm would have to stick > > to whatever loader they have instead of using libbpf. > > It's not a backward compatibility breakage. > > What? It's a change in libbpf that breaks loading of existing BPF object > files that were working (with libbpf) before. If that's not a backward > compatibility break then that term has lost all meaning. The user space library is not a kernel. The library will change its interface. It will remove functions, features, etc. That's what .map is for. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-07 16:20 ` Alexei Starovoitov @ 2020-12-07 16:51 ` Toke Høiland-Jørgensen 2020-12-07 17:16 ` Alexei Starovoitov 2020-12-07 18:02 ` David Ahern 1 sibling, 1 reply; 20+ messages in thread From: Toke Høiland-Jørgensen @ 2020-12-07 16:51 UTC (permalink / raw) To: Alexei Starovoitov; +Cc: Andrii Nakryiko, Yonghong Song, bpf Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: > On Mon, Dec 7, 2020 at 8:15 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: >> >> > On Mon, Dec 7, 2020 at 3:03 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> >> >> Wait, what? This is a regression that *breaks people's programs* on >> >> compiler versions that are still very much in the wild! I mean, fine if >> >> you don't want to support new features on such files, but then surely we >> >> can at least revert back to the old behaviour? >> > >> > Those folks that care about compiling with old llvm would have to stick >> > to whatever loader they have instead of using libbpf. >> > It's not a backward compatibility breakage. >> >> What? It's a change in libbpf that breaks loading of existing BPF object >> files that were working (with libbpf) before. If that's not a backward >> compatibility break then that term has lost all meaning. > > The user space library is not a kernel. > The library will change its interface. It will remove functions, features, etc. > That's what .map is for. Right, OK, so how do I use .map to get the old behaviour here? That's all I'm asking for, really... -Toke ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-07 16:51 ` Toke Høiland-Jørgensen @ 2020-12-07 17:16 ` Alexei Starovoitov 2020-12-07 22:18 ` Toke Høiland-Jørgensen 0 siblings, 1 reply; 20+ messages in thread From: Alexei Starovoitov @ 2020-12-07 17:16 UTC (permalink / raw) To: Toke Høiland-Jørgensen; +Cc: Andrii Nakryiko, Yonghong Song, bpf On Mon, Dec 7, 2020 at 8:51 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: > > > On Mon, Dec 7, 2020 at 8:15 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> > >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: > >> > >> > On Mon, Dec 7, 2020 at 3:03 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> >> > >> >> Wait, what? This is a regression that *breaks people's programs* on > >> >> compiler versions that are still very much in the wild! I mean, fine if > >> >> you don't want to support new features on such files, but then surely we > >> >> can at least revert back to the old behaviour? > >> > > >> > Those folks that care about compiling with old llvm would have to stick > >> > to whatever loader they have instead of using libbpf. > >> > It's not a backward compatibility breakage. > >> > >> What? It's a change in libbpf that breaks loading of existing BPF object > >> files that were working (with libbpf) before. If that's not a backward > >> compatibility break then that term has lost all meaning. > > > > The user space library is not a kernel. > > The library will change its interface. It will remove functions, features, etc. > > That's what .map is for. > > Right, OK, so how do I use .map to get the old behaviour here? That's > all I'm asking for, really... Fix old llvm. The users would have to upgrade either from llvm 7.x to 7.x+1 or to llvm 10+. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-07 17:16 ` Alexei Starovoitov @ 2020-12-07 22:18 ` Toke Høiland-Jørgensen 2020-12-08 1:20 ` Alexei Starovoitov 0 siblings, 1 reply; 20+ messages in thread From: Toke Høiland-Jørgensen @ 2020-12-07 22:18 UTC (permalink / raw) To: Alexei Starovoitov; +Cc: Andrii Nakryiko, Yonghong Song, bpf Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: > On Mon, Dec 7, 2020 at 8:51 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: >> >> > On Mon, Dec 7, 2020 at 8:15 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> >> >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: >> >> >> >> > On Mon, Dec 7, 2020 at 3:03 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> >> >> >> >> Wait, what? This is a regression that *breaks people's programs* on >> >> >> compiler versions that are still very much in the wild! I mean, fine if >> >> >> you don't want to support new features on such files, but then surely we >> >> >> can at least revert back to the old behaviour? >> >> > >> >> > Those folks that care about compiling with old llvm would have to stick >> >> > to whatever loader they have instead of using libbpf. >> >> > It's not a backward compatibility breakage. >> >> >> >> What? It's a change in libbpf that breaks loading of existing BPF object >> >> files that were working (with libbpf) before. If that's not a backward >> >> compatibility break then that term has lost all meaning. >> > >> > The user space library is not a kernel. >> > The library will change its interface. It will remove functions, features, etc. >> > That's what .map is for. >> >> Right, OK, so how do I use .map to get the old behaviour here? That's >> all I'm asking for, really... > > Fix old llvm. The users would have to upgrade either from llvm 7.x to > 7.x+1 or to llvm 10+. Right, so by "we keep a stable interface" you mean "we expect you to upgrade your entire toolchain every time you update the library". Gotcha! I'll rectify my newspeak dictionary straight away - doubleplusgood! -Toke ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-07 22:18 ` Toke Høiland-Jørgensen @ 2020-12-08 1:20 ` Alexei Starovoitov 0 siblings, 0 replies; 20+ messages in thread From: Alexei Starovoitov @ 2020-12-08 1:20 UTC (permalink / raw) To: Toke Høiland-Jørgensen; +Cc: Andrii Nakryiko, Yonghong Song, bpf On Mon, Dec 7, 2020 at 2:18 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: > > > On Mon, Dec 7, 2020 at 8:51 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> > >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: > >> > >> > On Mon, Dec 7, 2020 at 8:15 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> >> > >> >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: > >> >> > >> >> > On Mon, Dec 7, 2020 at 3:03 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> >> >> > >> >> >> Wait, what? This is a regression that *breaks people's programs* on > >> >> >> compiler versions that are still very much in the wild! I mean, fine if > >> >> >> you don't want to support new features on such files, but then surely we > >> >> >> can at least revert back to the old behaviour? > >> >> > > >> >> > Those folks that care about compiling with old llvm would have to stick > >> >> > to whatever loader they have instead of using libbpf. > >> >> > It's not a backward compatibility breakage. > >> >> > >> >> What? It's a change in libbpf that breaks loading of existing BPF object > >> >> files that were working (with libbpf) before. If that's not a backward > >> >> compatibility break then that term has lost all meaning. > >> > > >> > The user space library is not a kernel. > >> > The library will change its interface. It will remove functions, features, etc. > >> > That's what .map is for. > >> > >> Right, OK, so how do I use .map to get the old behaviour here? That's > >> all I'm asking for, really... > > > > Fix old llvm. The users would have to upgrade either from llvm 7.x to > > 7.x+1 or to llvm 10+. > > Right, so by "we keep a stable interface" you mean "we expect you to > upgrade your entire toolchain every time you update the library". Gotcha! No. It means that libbpf is not going to have a workaround for every possible llvm bug that was fixed over the years. libbpf already does a ton of fixups for things that users can actually hit. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-07 16:20 ` Alexei Starovoitov 2020-12-07 16:51 ` Toke Høiland-Jørgensen @ 2020-12-07 18:02 ` David Ahern 2020-12-07 18:14 ` Alexei Starovoitov 1 sibling, 1 reply; 20+ messages in thread From: David Ahern @ 2020-12-07 18:02 UTC (permalink / raw) To: Alexei Starovoitov, Toke Høiland-Jørgensen Cc: Andrii Nakryiko, Yonghong Song, bpf On 12/7/20 9:20 AM, Alexei Starovoitov wrote: > The user space library is not a kernel. > The library will change its interface. It will remove functions, features, etc. > That's what .map is for. So any user/package wanting to leverage libbpf can not expect stability or consistency with its APIs? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-07 18:02 ` David Ahern @ 2020-12-07 18:14 ` Alexei Starovoitov 2020-12-07 18:59 ` David Ahern 0 siblings, 1 reply; 20+ messages in thread From: Alexei Starovoitov @ 2020-12-07 18:14 UTC (permalink / raw) To: David Ahern Cc: Toke Høiland-Jørgensen, Andrii Nakryiko, Yonghong Song, bpf On Mon, Dec 7, 2020 at 10:02 AM David Ahern <dsahern@gmail.com> wrote: > > On 12/7/20 9:20 AM, Alexei Starovoitov wrote: > > The user space library is not a kernel. > > The library will change its interface. It will remove functions, features, etc. > > That's what .map is for. > > So any user/package wanting to leverage libbpf can not expect stability > or consistency with its APIs? If you're talking about iproute2 and your own convoluted definition of stability and consistency then certainly not. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-07 18:14 ` Alexei Starovoitov @ 2020-12-07 18:59 ` David Ahern 0 siblings, 0 replies; 20+ messages in thread From: David Ahern @ 2020-12-07 18:59 UTC (permalink / raw) To: Alexei Starovoitov Cc: Toke Høiland-Jørgensen, Andrii Nakryiko, Yonghong Song, bpf On 12/7/20 11:14 AM, Alexei Starovoitov wrote: > On Mon, Dec 7, 2020 at 10:02 AM David Ahern <dsahern@gmail.com> wrote: >> >> On 12/7/20 9:20 AM, Alexei Starovoitov wrote: >>> The user space library is not a kernel. >>> The library will change its interface. It will remove functions, features, etc. >>> That's what .map is for. >> >> So any user/package wanting to leverage libbpf can not expect stability >> or consistency with its APIs? > > If you're talking about iproute2 and your own convoluted definition of your record player seems to be stuck. Could you give it a little bump so it can move on to the next track? Thanks. I do work on other things besides iproute2. > stability and consistency then certainly not. > Your statement has huge impacts on what users can expect from a library and is inconsistent with other libraries I have used. Hence, my request for clarification on your comment about 'changing interfaces, removing functions and features.' ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-07 10:59 ` Toke Høiland-Jørgensen 2020-12-07 15:55 ` Alexei Starovoitov @ 2020-12-08 2:47 ` Andrii Nakryiko 2020-12-08 13:41 ` Toke Høiland-Jørgensen 1 sibling, 1 reply; 20+ messages in thread From: Andrii Nakryiko @ 2020-12-08 2:47 UTC (permalink / raw) To: Toke Høiland-Jørgensen; +Cc: Yonghong Song, bpf On Mon, Dec 7, 2020 at 3:00 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > > > On Fri, Dec 4, 2020 at 9:55 AM Yonghong Song <yhs@fb.com> wrote: > >> > >> > >> > >> On 12/4/20 1:34 AM, Toke Høiland-Jørgensen wrote: > >> > Yonghong Song <yhs@fb.com> writes: > >> > > >> >> On 12/3/20 9:55 AM, Toke Høiland-Jørgensen wrote: > >> >>> Hi Andrii > >> >>> > >> >>> I noticed that recent libbpf versions fail to load BPF files compiled > >> >>> with old versions of LLVM. E.g., if I compile xdp-tools with LLVM 7 I > >> >>> get: > >> >>> > >> >>> $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv > >> >>> Loading 1 files on interface 'testns'. > >> >>> libbpf: loading ../lib/testing/xdp_drop.o > >> >>> libbpf: elf: section(3) prog, size 16, link 0, flags 6, type=1 > >> >>> libbpf: sec 'prog': failed to find program symbol at offset 0 > >> >>> Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid > >> >>> > >> >>> The 'failed to find program symbol' error seems to have been introduced > >> >>> with commit c112239272c6 ("libbpf: Parse multi-function sections into > >> >>> multiple BPF programs"). > >> >>> > >> >>> Looking at the object file in question, indeed it seems to not have any > >> >>> function symbols defined: > >> >>> > >> >>> $ llvm-objdump --syms ../lib/testing/xdp_drop.o > >> >>> > >> >>> ../lib/testing/xdp_drop.o: file format elf64-bpf > >> >>> > >> >>> SYMBOL TABLE: > >> >>> 0000000000000000 l .debug_str 0000000000000000 > >> >>> 0000000000000037 l .debug_str 0000000000000000 > >> >>> 0000000000000042 l .debug_str 0000000000000000 > >> >>> 0000000000000068 l .debug_str 0000000000000000 > >> >>> 0000000000000071 l .debug_str 0000000000000000 > >> >>> 0000000000000076 l .debug_str 0000000000000000 > >> >>> 000000000000008a l .debug_str 0000000000000000 > >> >>> 0000000000000097 l .debug_str 0000000000000000 > >> >>> 00000000000000a3 l .debug_str 0000000000000000 > >> >>> 00000000000000ac l .debug_str 0000000000000000 > >> >>> 00000000000000b5 l .debug_str 0000000000000000 > >> >>> 00000000000000bc l .debug_str 0000000000000000 > >> >>> 00000000000000c9 l .debug_str 0000000000000000 > >> >>> 00000000000000d4 l .debug_str 0000000000000000 > >> >>> 00000000000000dd l .debug_str 0000000000000000 > >> >>> 00000000000000e1 l .debug_str 0000000000000000 > >> >>> 00000000000000e5 l .debug_str 0000000000000000 > >> >>> 00000000000000ea l .debug_str 0000000000000000 > >> >>> 00000000000000f0 l .debug_str 0000000000000000 > >> >>> 00000000000000f9 l .debug_str 0000000000000000 > >> >>> 0000000000000103 l .debug_str 0000000000000000 > >> >>> 0000000000000113 l .debug_str 0000000000000000 > >> >>> 0000000000000122 l .debug_str 0000000000000000 > >> >>> 0000000000000131 l .debug_str 0000000000000000 > >> >>> 0000000000000000 l d prog 0000000000000000 prog > >> >>> 0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev > >> >>> 0000000000000000 l d .debug_info 0000000000000000 .debug_info > >> >>> 0000000000000000 l d .debug_frame 0000000000000000 .debug_frame > >> >>> 0000000000000000 l d .debug_line 0000000000000000 .debug_line > >> >>> 0000000000000000 g license 0000000000000000 _license > >> >>> 0000000000000000 g prog 0000000000000000 xdp_drop > >> >>> > >> >>> > >> >>> I assume this is because old LLVM versions simply don't emit that symbol > >> >>> information? > >> > >> Thanks for the below instruction and xdp_drop.c file. I can reproduce > >> the issue now. > >> > >> I added another function 'xdp_drop1' in the same thing. Below is the > >> symbol table with llvm7 vs. llvm12. > >> > >> -bash-4.4$ llvm-readelf -symbols xdp-7.o | grep xdp_drop > >> 32: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 3 xdp_drop > >> 33: 0000000000000010 0 NOTYPE GLOBAL DEFAULT 3 xdp_drop1 > >> > >> [ 3] prog PROGBITS 0000000000000000 000040 000020 > >> 00 AX 0 0 8 > >> > >> -bash-4.4$ llvm-readelf -symbols xdp-12.o | grep xdp_drop > >> 32: 0000000000000000 16 FUNC GLOBAL DEFAULT 3 xdp_drop > >> 33: 0000000000000010 16 FUNC GLOBAL DEFAULT 3 xdp_drop1 > >> -bash-4.4$ > >> > >> [ 3] prog PROGBITS 0000000000000000 000040 000020 > >> 00 AX 0 0 8 > >> > >> > >> Yes, llvm7 does not encode type and size for FUNC's. I guess libbpf can > >> change to recognize NOTYPE and use the symbol value (representing the > >> offset from the start of the section) and section size to > >> calculate the individual function size. This is more complicated than > >> elf file providing FUNC type and symbol size directly. > > > > I think we should just face the fact that LLVM7 is way too old to > > produce a sensible BPF ELF file layout. We can extend: > > > > libbpf: sec 'prog': failed to find program symbol at offset 0 > > Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid > > > > with a suggestion to upgrade Clang/LLVM to something more recent, if > > that would be helpful. > > > > But I don't want to add error-prone checks and assumptions in the > > already quite complicated logic. Even the kernel itself maintains that > > Clang 10+ needs to be used for its compilation. BPF CO-RE is also not > > working with older than Clang10, so lots of people have already > > upgraded way beyond that. > > Wait, what? This is a regression that *breaks people's programs* on > compiler versions that are still very much in the wild! I mean, fine if > you don't want to support new features on such files, but then surely we > can at least revert back to the old behaviour? This is clearly a bug in LLVM7, which didn't produce correct ELF symbols, do we agree on that? libbpf used to handle such invalid ELF files *by accident* until it changed its internal logic to be more strict in v0.2. It became more strict and doesn't work with such invalid ELF files anymore. Does it need to add extra quirks to support such broken ELF? I don't think so. Surely, users that can't upgrade LLVM7 to something less ancient, can stick to libbpf v0.1, that was lenient enough to accept such invalid ELF files. libbpf v0.2 was released more than a month ago, and so far you are the only one who noticed this "regression". So hopefully it's not super annoying to people and they would be accommodating enough to use more up to date compiler (and save themselves lots of trouble along the way). > > > Speaking of legacy. Toke, can you please update all the samples in > > your xdp-tools repo to not use arbitrary sections names. I see > > SEC("prog"), where it should really be SEC("xdp"). It sets a bad > > example for newcomers, IMO. > > I used "prog" because that's what iproute2 looks for if you don't supply Ok. > a section name, so it makes it convenient to load programs with 'ip' > without supplying the section name. However, I do realise this is not > the best of reasons, and I am not opposed to changing it. However... > > > I'm also going to emit warnings in libbpf soon for section names that > > don't follow proper libbpf naming pattern, so it would be good if you > > could get ahead of the curve. > > ...this sounds like just another way to annoy users by breaking things > that were working before? :/ It won't break, libbpf will emit a warning about the need to use proper section name format, which will start to be enforced only with major version bump. So that will give users plenty of time to make sure their BPF programs are compatible with stricter libbpf. > > -Toke > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-08 2:47 ` Andrii Nakryiko @ 2020-12-08 13:41 ` Toke Høiland-Jørgensen 2020-12-08 18:39 ` Andrii Nakryiko 0 siblings, 1 reply; 20+ messages in thread From: Toke Høiland-Jørgensen @ 2020-12-08 13:41 UTC (permalink / raw) To: Andrii Nakryiko; +Cc: Yonghong Song, bpf Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > On Mon, Dec 7, 2020 at 3:00 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: >> >> > On Fri, Dec 4, 2020 at 9:55 AM Yonghong Song <yhs@fb.com> wrote: >> >> >> >> >> >> >> >> On 12/4/20 1:34 AM, Toke Høiland-Jørgensen wrote: >> >> > Yonghong Song <yhs@fb.com> writes: >> >> > >> >> >> On 12/3/20 9:55 AM, Toke Høiland-Jørgensen wrote: >> >> >>> Hi Andrii >> >> >>> >> >> >>> I noticed that recent libbpf versions fail to load BPF files compiled >> >> >>> with old versions of LLVM. E.g., if I compile xdp-tools with LLVM 7 I >> >> >>> get: >> >> >>> >> >> >>> $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv >> >> >>> Loading 1 files on interface 'testns'. >> >> >>> libbpf: loading ../lib/testing/xdp_drop.o >> >> >>> libbpf: elf: section(3) prog, size 16, link 0, flags 6, type=1 >> >> >>> libbpf: sec 'prog': failed to find program symbol at offset 0 >> >> >>> Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid >> >> >>> >> >> >>> The 'failed to find program symbol' error seems to have been introduced >> >> >>> with commit c112239272c6 ("libbpf: Parse multi-function sections into >> >> >>> multiple BPF programs"). >> >> >>> >> >> >>> Looking at the object file in question, indeed it seems to not have any >> >> >>> function symbols defined: >> >> >>> >> >> >>> $ llvm-objdump --syms ../lib/testing/xdp_drop.o >> >> >>> >> >> >>> ../lib/testing/xdp_drop.o: file format elf64-bpf >> >> >>> >> >> >>> SYMBOL TABLE: >> >> >>> 0000000000000000 l .debug_str 0000000000000000 >> >> >>> 0000000000000037 l .debug_str 0000000000000000 >> >> >>> 0000000000000042 l .debug_str 0000000000000000 >> >> >>> 0000000000000068 l .debug_str 0000000000000000 >> >> >>> 0000000000000071 l .debug_str 0000000000000000 >> >> >>> 0000000000000076 l .debug_str 0000000000000000 >> >> >>> 000000000000008a l .debug_str 0000000000000000 >> >> >>> 0000000000000097 l .debug_str 0000000000000000 >> >> >>> 00000000000000a3 l .debug_str 0000000000000000 >> >> >>> 00000000000000ac l .debug_str 0000000000000000 >> >> >>> 00000000000000b5 l .debug_str 0000000000000000 >> >> >>> 00000000000000bc l .debug_str 0000000000000000 >> >> >>> 00000000000000c9 l .debug_str 0000000000000000 >> >> >>> 00000000000000d4 l .debug_str 0000000000000000 >> >> >>> 00000000000000dd l .debug_str 0000000000000000 >> >> >>> 00000000000000e1 l .debug_str 0000000000000000 >> >> >>> 00000000000000e5 l .debug_str 0000000000000000 >> >> >>> 00000000000000ea l .debug_str 0000000000000000 >> >> >>> 00000000000000f0 l .debug_str 0000000000000000 >> >> >>> 00000000000000f9 l .debug_str 0000000000000000 >> >> >>> 0000000000000103 l .debug_str 0000000000000000 >> >> >>> 0000000000000113 l .debug_str 0000000000000000 >> >> >>> 0000000000000122 l .debug_str 0000000000000000 >> >> >>> 0000000000000131 l .debug_str 0000000000000000 >> >> >>> 0000000000000000 l d prog 0000000000000000 prog >> >> >>> 0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev >> >> >>> 0000000000000000 l d .debug_info 0000000000000000 .debug_info >> >> >>> 0000000000000000 l d .debug_frame 0000000000000000 .debug_frame >> >> >>> 0000000000000000 l d .debug_line 0000000000000000 .debug_line >> >> >>> 0000000000000000 g license 0000000000000000 _license >> >> >>> 0000000000000000 g prog 0000000000000000 xdp_drop >> >> >>> >> >> >>> >> >> >>> I assume this is because old LLVM versions simply don't emit that symbol >> >> >>> information? >> >> >> >> Thanks for the below instruction and xdp_drop.c file. I can reproduce >> >> the issue now. >> >> >> >> I added another function 'xdp_drop1' in the same thing. Below is the >> >> symbol table with llvm7 vs. llvm12. >> >> >> >> -bash-4.4$ llvm-readelf -symbols xdp-7.o | grep xdp_drop >> >> 32: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 3 xdp_drop >> >> 33: 0000000000000010 0 NOTYPE GLOBAL DEFAULT 3 xdp_drop1 >> >> >> >> [ 3] prog PROGBITS 0000000000000000 000040 000020 >> >> 00 AX 0 0 8 >> >> >> >> -bash-4.4$ llvm-readelf -symbols xdp-12.o | grep xdp_drop >> >> 32: 0000000000000000 16 FUNC GLOBAL DEFAULT 3 xdp_drop >> >> 33: 0000000000000010 16 FUNC GLOBAL DEFAULT 3 xdp_drop1 >> >> -bash-4.4$ >> >> >> >> [ 3] prog PROGBITS 0000000000000000 000040 000020 >> >> 00 AX 0 0 8 >> >> >> >> >> >> Yes, llvm7 does not encode type and size for FUNC's. I guess libbpf can >> >> change to recognize NOTYPE and use the symbol value (representing the >> >> offset from the start of the section) and section size to >> >> calculate the individual function size. This is more complicated than >> >> elf file providing FUNC type and symbol size directly. >> > >> > I think we should just face the fact that LLVM7 is way too old to >> > produce a sensible BPF ELF file layout. We can extend: >> > >> > libbpf: sec 'prog': failed to find program symbol at offset 0 >> > Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid >> > >> > with a suggestion to upgrade Clang/LLVM to something more recent, if >> > that would be helpful. >> > >> > But I don't want to add error-prone checks and assumptions in the >> > already quite complicated logic. Even the kernel itself maintains that >> > Clang 10+ needs to be used for its compilation. BPF CO-RE is also not >> > working with older than Clang10, so lots of people have already >> > upgraded way beyond that. >> >> Wait, what? This is a regression that *breaks people's programs* on >> compiler versions that are still very much in the wild! I mean, fine if >> you don't want to support new features on such files, but then surely we >> can at least revert back to the old behaviour? > > This is clearly a bug in LLVM7, which didn't produce correct ELF > symbols, do we agree on that? libbpf used to handle such invalid ELF > files *by accident* until it changed its internal logic to be more > strict in v0.2. It became more strict and doesn't work with such > invalid ELF files anymore. Does it need to add extra quirks to support > such broken ELF? I don't think so. I don't know enough about the intricacies of the ELF format to say, but I believe you when you say it's a bug. However, that doesn't change the fact that from a user's PoV, something that was working before is now broken, with the only change being a newer libbpf. This is not a theoretical concern, BTW, I discovered this due to feedback from a partner that we've been pushing to adopt libbpf. When they finally tried it out, the first thing they noticed is that their programs wouldn't load due to this issue. Sure, I can tell them to just upgrade their toolchain (and I will), but that still means we're back to "in order to use this library, you should expect to keep chasing the latest version of the entire toolchain". And this is a much harder sell than "this is a stable library and upstream takes backwards compatibility very serious", which I *thought* was the expectation. > Surely, users that can't upgrade LLVM7 to something less ancient, can > stick to libbpf v0.1, that was lenient enough to accept such invalid > ELF files. libbpf v0.2 was released more than a month ago, and so far > you are the only one who noticed this "regression". So hopefully it's > not super annoying to people and they would be accommodating enough to > use more up to date compiler (and save themselves lots of trouble > along the way). Oh, boy, do I envy your adoption rate for new versions! In my world I would expect that by one month a few people who are very early adopters have started noticing and maybe thinking about testing the new version :) >> > Speaking of legacy. Toke, can you please update all the samples in >> > your xdp-tools repo to not use arbitrary sections names. I see >> > SEC("prog"), where it should really be SEC("xdp"). It sets a bad >> > example for newcomers, IMO. >> >> I used "prog" because that's what iproute2 looks for if you don't supply > > Ok. Fixed now, BTW: https://github.com/xdp-project/xdp-tools/commit/83ab8aa1c29408aac842bebe704aa47ec5dc5bc3 >> a section name, so it makes it convenient to load programs with 'ip' >> without supplying the section name. However, I do realise this is not >> the best of reasons, and I am not opposed to changing it. However... >> >> > I'm also going to emit warnings in libbpf soon for section names that >> > don't follow proper libbpf naming pattern, so it would be good if you >> > could get ahead of the curve. >> >> ...this sounds like just another way to annoy users by breaking things >> that were working before? :/ > > It won't break, libbpf will emit a warning about the need to use > proper section name format, which will start to be enforced only with > major version bump. So that will give users plenty of time to make > sure their BPF programs are compatible with stricter libbpf. Well see above re: different expectations for "plenty of time". But OK, maybe this isn't as bad as I figured at first glance :) -Toke ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-08 13:41 ` Toke Høiland-Jørgensen @ 2020-12-08 18:39 ` Andrii Nakryiko 2020-12-08 22:38 ` Toke Høiland-Jørgensen 0 siblings, 1 reply; 20+ messages in thread From: Andrii Nakryiko @ 2020-12-08 18:39 UTC (permalink / raw) To: Toke Høiland-Jørgensen; +Cc: Yonghong Song, bpf On Tue, Dec 8, 2020 at 5:41 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > > > On Mon, Dec 7, 2020 at 3:00 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> > >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > >> > >> > On Fri, Dec 4, 2020 at 9:55 AM Yonghong Song <yhs@fb.com> wrote: > >> >> > >> >> > >> >> > >> >> On 12/4/20 1:34 AM, Toke Høiland-Jørgensen wrote: > >> >> > Yonghong Song <yhs@fb.com> writes: > >> >> > > >> >> >> On 12/3/20 9:55 AM, Toke Høiland-Jørgensen wrote: > >> >> >>> Hi Andrii > >> >> >>> > >> >> >>> I noticed that recent libbpf versions fail to load BPF files compiled > >> >> >>> with old versions of LLVM. E.g., if I compile xdp-tools with LLVM 7 I > >> >> >>> get: > >> >> >>> > >> >> >>> $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv > >> >> >>> Loading 1 files on interface 'testns'. > >> >> >>> libbpf: loading ../lib/testing/xdp_drop.o > >> >> >>> libbpf: elf: section(3) prog, size 16, link 0, flags 6, type=1 > >> >> >>> libbpf: sec 'prog': failed to find program symbol at offset 0 > >> >> >>> Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid > >> >> >>> > >> >> >>> The 'failed to find program symbol' error seems to have been introduced > >> >> >>> with commit c112239272c6 ("libbpf: Parse multi-function sections into > >> >> >>> multiple BPF programs"). > >> >> >>> > >> >> >>> Looking at the object file in question, indeed it seems to not have any > >> >> >>> function symbols defined: > >> >> >>> > >> >> >>> $ llvm-objdump --syms ../lib/testing/xdp_drop.o > >> >> >>> > >> >> >>> ../lib/testing/xdp_drop.o: file format elf64-bpf > >> >> >>> > >> >> >>> SYMBOL TABLE: > >> >> >>> 0000000000000000 l .debug_str 0000000000000000 > >> >> >>> 0000000000000037 l .debug_str 0000000000000000 > >> >> >>> 0000000000000042 l .debug_str 0000000000000000 > >> >> >>> 0000000000000068 l .debug_str 0000000000000000 > >> >> >>> 0000000000000071 l .debug_str 0000000000000000 > >> >> >>> 0000000000000076 l .debug_str 0000000000000000 > >> >> >>> 000000000000008a l .debug_str 0000000000000000 > >> >> >>> 0000000000000097 l .debug_str 0000000000000000 > >> >> >>> 00000000000000a3 l .debug_str 0000000000000000 > >> >> >>> 00000000000000ac l .debug_str 0000000000000000 > >> >> >>> 00000000000000b5 l .debug_str 0000000000000000 > >> >> >>> 00000000000000bc l .debug_str 0000000000000000 > >> >> >>> 00000000000000c9 l .debug_str 0000000000000000 > >> >> >>> 00000000000000d4 l .debug_str 0000000000000000 > >> >> >>> 00000000000000dd l .debug_str 0000000000000000 > >> >> >>> 00000000000000e1 l .debug_str 0000000000000000 > >> >> >>> 00000000000000e5 l .debug_str 0000000000000000 > >> >> >>> 00000000000000ea l .debug_str 0000000000000000 > >> >> >>> 00000000000000f0 l .debug_str 0000000000000000 > >> >> >>> 00000000000000f9 l .debug_str 0000000000000000 > >> >> >>> 0000000000000103 l .debug_str 0000000000000000 > >> >> >>> 0000000000000113 l .debug_str 0000000000000000 > >> >> >>> 0000000000000122 l .debug_str 0000000000000000 > >> >> >>> 0000000000000131 l .debug_str 0000000000000000 > >> >> >>> 0000000000000000 l d prog 0000000000000000 prog > >> >> >>> 0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev > >> >> >>> 0000000000000000 l d .debug_info 0000000000000000 .debug_info > >> >> >>> 0000000000000000 l d .debug_frame 0000000000000000 .debug_frame > >> >> >>> 0000000000000000 l d .debug_line 0000000000000000 .debug_line > >> >> >>> 0000000000000000 g license 0000000000000000 _license > >> >> >>> 0000000000000000 g prog 0000000000000000 xdp_drop > >> >> >>> > >> >> >>> > >> >> >>> I assume this is because old LLVM versions simply don't emit that symbol > >> >> >>> information? > >> >> > >> >> Thanks for the below instruction and xdp_drop.c file. I can reproduce > >> >> the issue now. > >> >> > >> >> I added another function 'xdp_drop1' in the same thing. Below is the > >> >> symbol table with llvm7 vs. llvm12. > >> >> > >> >> -bash-4.4$ llvm-readelf -symbols xdp-7.o | grep xdp_drop > >> >> 32: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 3 xdp_drop > >> >> 33: 0000000000000010 0 NOTYPE GLOBAL DEFAULT 3 xdp_drop1 > >> >> > >> >> [ 3] prog PROGBITS 0000000000000000 000040 000020 > >> >> 00 AX 0 0 8 > >> >> > >> >> -bash-4.4$ llvm-readelf -symbols xdp-12.o | grep xdp_drop > >> >> 32: 0000000000000000 16 FUNC GLOBAL DEFAULT 3 xdp_drop > >> >> 33: 0000000000000010 16 FUNC GLOBAL DEFAULT 3 xdp_drop1 > >> >> -bash-4.4$ > >> >> > >> >> [ 3] prog PROGBITS 0000000000000000 000040 000020 > >> >> 00 AX 0 0 8 > >> >> > >> >> > >> >> Yes, llvm7 does not encode type and size for FUNC's. I guess libbpf can > >> >> change to recognize NOTYPE and use the symbol value (representing the > >> >> offset from the start of the section) and section size to > >> >> calculate the individual function size. This is more complicated than > >> >> elf file providing FUNC type and symbol size directly. > >> > > >> > I think we should just face the fact that LLVM7 is way too old to > >> > produce a sensible BPF ELF file layout. We can extend: > >> > > >> > libbpf: sec 'prog': failed to find program symbol at offset 0 > >> > Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid > >> > > >> > with a suggestion to upgrade Clang/LLVM to something more recent, if > >> > that would be helpful. > >> > > >> > But I don't want to add error-prone checks and assumptions in the > >> > already quite complicated logic. Even the kernel itself maintains that > >> > Clang 10+ needs to be used for its compilation. BPF CO-RE is also not > >> > working with older than Clang10, so lots of people have already > >> > upgraded way beyond that. > >> > >> Wait, what? This is a regression that *breaks people's programs* on > >> compiler versions that are still very much in the wild! I mean, fine if > >> you don't want to support new features on such files, but then surely we > >> can at least revert back to the old behaviour? > > > > This is clearly a bug in LLVM7, which didn't produce correct ELF > > symbols, do we agree on that? libbpf used to handle such invalid ELF > > files *by accident* until it changed its internal logic to be more > > strict in v0.2. It became more strict and doesn't work with such > > invalid ELF files anymore. Does it need to add extra quirks to support > > such broken ELF? I don't think so. > > I don't know enough about the intricacies of the ELF format to say, but > I believe you when you say it's a bug. However, that doesn't change the > fact that from a user's PoV, something that was working before is now > broken, with the only change being a newer libbpf. > > This is not a theoretical concern, BTW, I discovered this due to > feedback from a partner that we've been pushing to adopt libbpf. When > they finally tried it out, the first thing they noticed is that their > programs wouldn't load due to this issue. > > Sure, I can tell them to just upgrade their toolchain (and I will), but > that still means we're back to "in order to use this library, you should > expect to keep chasing the latest version of the entire toolchain". And Migrating from LLVM7 to something like LLVM10 or LLVM11 (not asking for not-yet-released LLVM12) hardly qualifies as "chasing the latest version". LLVM8 or LLVM9 might work for their simple use case either, I haven't checked. Just please don't use the extremely outdated toolchain that is (now) known to be broken. That's all I'm asking. > this is a much harder sell than "this is a stable library and upstream > takes backwards compatibility very serious", which I *thought* was the > expectation. That's still true and I'd rather not go over the same discussion again. But libbpf is also not a dumpster of work-arounds for all possible bugs in the kernel and compiler. Libbpf does a lot of that for backwards compatibility reasons, no need to deal with quirks of buggy and very outdated compilers (and kernels, if there are obvious bugs like this). > > > Surely, users that can't upgrade LLVM7 to something less ancient, can > > stick to libbpf v0.1, that was lenient enough to accept such invalid > > ELF files. libbpf v0.2 was released more than a month ago, and so far > > you are the only one who noticed this "regression". So hopefully it's > > not super annoying to people and they would be accommodating enough to > > use more up to date compiler (and save themselves lots of trouble > > along the way). > > Oh, boy, do I envy your adoption rate for new versions! In my world I > would expect that by one month a few people who are very early adopters > have started noticing and maybe thinking about testing the new version :) In practice with the new libbpf releases I've been getting reports about something broken within a few days. So yeah, I'm a lucky guy, I suppose. > > >> > Speaking of legacy. Toke, can you please update all the samples in > >> > your xdp-tools repo to not use arbitrary sections names. I see > >> > SEC("prog"), where it should really be SEC("xdp"). It sets a bad > >> > example for newcomers, IMO. > >> > >> I used "prog" because that's what iproute2 looks for if you don't supply > > > > Ok. > > Fixed now, BTW: > https://github.com/xdp-project/xdp-tools/commit/83ab8aa1c29408aac842bebe704aa47ec5dc5bc3 Thanks! > > >> a section name, so it makes it convenient to load programs with 'ip' > >> without supplying the section name. However, I do realise this is not > >> the best of reasons, and I am not opposed to changing it. However... > >> > >> > I'm also going to emit warnings in libbpf soon for section names that > >> > don't follow proper libbpf naming pattern, so it would be good if you > >> > could get ahead of the curve. > >> > >> ...this sounds like just another way to annoy users by breaking things > >> that were working before? :/ > > > > It won't break, libbpf will emit a warning about the need to use > > proper section name format, which will start to be enforced only with > > major version bump. So that will give users plenty of time to make > > sure their BPF programs are compatible with stricter libbpf. > > Well see above re: different expectations for "plenty of time". But OK, > maybe this isn't as bad as I figured at first glance :) So far libbpf releases were timed to Linux releases, so roughly one every 2 months. libbpf 1.0 is unlikely to come sooner than 2 releases out. So it's not like 2 weeks notice, right? Waiting for a year or more seems excessive as well. > > -Toke > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Latest libbpf fails to load programs compiled with old LLVM 2020-12-08 18:39 ` Andrii Nakryiko @ 2020-12-08 22:38 ` Toke Høiland-Jørgensen 0 siblings, 0 replies; 20+ messages in thread From: Toke Høiland-Jørgensen @ 2020-12-08 22:38 UTC (permalink / raw) To: Andrii Nakryiko; +Cc: Yonghong Song, bpf Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > On Tue, Dec 8, 2020 at 5:41 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: >> >> > On Mon, Dec 7, 2020 at 3:00 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> >> >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: >> >> >> >> > On Fri, Dec 4, 2020 at 9:55 AM Yonghong Song <yhs@fb.com> wrote: >> >> >> >> >> >> >> >> >> >> >> >> On 12/4/20 1:34 AM, Toke Høiland-Jørgensen wrote: >> >> >> > Yonghong Song <yhs@fb.com> writes: >> >> >> > >> >> >> >> On 12/3/20 9:55 AM, Toke Høiland-Jørgensen wrote: >> >> >> >>> Hi Andrii >> >> >> >>> >> >> >> >>> I noticed that recent libbpf versions fail to load BPF files compiled >> >> >> >>> with old versions of LLVM. E.g., if I compile xdp-tools with LLVM 7 I >> >> >> >>> get: >> >> >> >>> >> >> >> >>> $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv >> >> >> >>> Loading 1 files on interface 'testns'. >> >> >> >>> libbpf: loading ../lib/testing/xdp_drop.o >> >> >> >>> libbpf: elf: section(3) prog, size 16, link 0, flags 6, type=1 >> >> >> >>> libbpf: sec 'prog': failed to find program symbol at offset 0 >> >> >> >>> Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid >> >> >> >>> >> >> >> >>> The 'failed to find program symbol' error seems to have been introduced >> >> >> >>> with commit c112239272c6 ("libbpf: Parse multi-function sections into >> >> >> >>> multiple BPF programs"). >> >> >> >>> >> >> >> >>> Looking at the object file in question, indeed it seems to not have any >> >> >> >>> function symbols defined: >> >> >> >>> >> >> >> >>> $ llvm-objdump --syms ../lib/testing/xdp_drop.o >> >> >> >>> >> >> >> >>> ../lib/testing/xdp_drop.o: file format elf64-bpf >> >> >> >>> >> >> >> >>> SYMBOL TABLE: >> >> >> >>> 0000000000000000 l .debug_str 0000000000000000 >> >> >> >>> 0000000000000037 l .debug_str 0000000000000000 >> >> >> >>> 0000000000000042 l .debug_str 0000000000000000 >> >> >> >>> 0000000000000068 l .debug_str 0000000000000000 >> >> >> >>> 0000000000000071 l .debug_str 0000000000000000 >> >> >> >>> 0000000000000076 l .debug_str 0000000000000000 >> >> >> >>> 000000000000008a l .debug_str 0000000000000000 >> >> >> >>> 0000000000000097 l .debug_str 0000000000000000 >> >> >> >>> 00000000000000a3 l .debug_str 0000000000000000 >> >> >> >>> 00000000000000ac l .debug_str 0000000000000000 >> >> >> >>> 00000000000000b5 l .debug_str 0000000000000000 >> >> >> >>> 00000000000000bc l .debug_str 0000000000000000 >> >> >> >>> 00000000000000c9 l .debug_str 0000000000000000 >> >> >> >>> 00000000000000d4 l .debug_str 0000000000000000 >> >> >> >>> 00000000000000dd l .debug_str 0000000000000000 >> >> >> >>> 00000000000000e1 l .debug_str 0000000000000000 >> >> >> >>> 00000000000000e5 l .debug_str 0000000000000000 >> >> >> >>> 00000000000000ea l .debug_str 0000000000000000 >> >> >> >>> 00000000000000f0 l .debug_str 0000000000000000 >> >> >> >>> 00000000000000f9 l .debug_str 0000000000000000 >> >> >> >>> 0000000000000103 l .debug_str 0000000000000000 >> >> >> >>> 0000000000000113 l .debug_str 0000000000000000 >> >> >> >>> 0000000000000122 l .debug_str 0000000000000000 >> >> >> >>> 0000000000000131 l .debug_str 0000000000000000 >> >> >> >>> 0000000000000000 l d prog 0000000000000000 prog >> >> >> >>> 0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev >> >> >> >>> 0000000000000000 l d .debug_info 0000000000000000 .debug_info >> >> >> >>> 0000000000000000 l d .debug_frame 0000000000000000 .debug_frame >> >> >> >>> 0000000000000000 l d .debug_line 0000000000000000 .debug_line >> >> >> >>> 0000000000000000 g license 0000000000000000 _license >> >> >> >>> 0000000000000000 g prog 0000000000000000 xdp_drop >> >> >> >>> >> >> >> >>> >> >> >> >>> I assume this is because old LLVM versions simply don't emit that symbol >> >> >> >>> information? >> >> >> >> >> >> Thanks for the below instruction and xdp_drop.c file. I can reproduce >> >> >> the issue now. >> >> >> >> >> >> I added another function 'xdp_drop1' in the same thing. Below is the >> >> >> symbol table with llvm7 vs. llvm12. >> >> >> >> >> >> -bash-4.4$ llvm-readelf -symbols xdp-7.o | grep xdp_drop >> >> >> 32: 0000000000000000 0 NOTYPE GLOBAL DEFAULT 3 xdp_drop >> >> >> 33: 0000000000000010 0 NOTYPE GLOBAL DEFAULT 3 xdp_drop1 >> >> >> >> >> >> [ 3] prog PROGBITS 0000000000000000 000040 000020 >> >> >> 00 AX 0 0 8 >> >> >> >> >> >> -bash-4.4$ llvm-readelf -symbols xdp-12.o | grep xdp_drop >> >> >> 32: 0000000000000000 16 FUNC GLOBAL DEFAULT 3 xdp_drop >> >> >> 33: 0000000000000010 16 FUNC GLOBAL DEFAULT 3 xdp_drop1 >> >> >> -bash-4.4$ >> >> >> >> >> >> [ 3] prog PROGBITS 0000000000000000 000040 000020 >> >> >> 00 AX 0 0 8 >> >> >> >> >> >> >> >> >> Yes, llvm7 does not encode type and size for FUNC's. I guess libbpf can >> >> >> change to recognize NOTYPE and use the symbol value (representing the >> >> >> offset from the start of the section) and section size to >> >> >> calculate the individual function size. This is more complicated than >> >> >> elf file providing FUNC type and symbol size directly. >> >> > >> >> > I think we should just face the fact that LLVM7 is way too old to >> >> > produce a sensible BPF ELF file layout. We can extend: >> >> > >> >> > libbpf: sec 'prog': failed to find program symbol at offset 0 >> >> > Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid >> >> > >> >> > with a suggestion to upgrade Clang/LLVM to something more recent, if >> >> > that would be helpful. >> >> > >> >> > But I don't want to add error-prone checks and assumptions in the >> >> > already quite complicated logic. Even the kernel itself maintains that >> >> > Clang 10+ needs to be used for its compilation. BPF CO-RE is also not >> >> > working with older than Clang10, so lots of people have already >> >> > upgraded way beyond that. >> >> >> >> Wait, what? This is a regression that *breaks people's programs* on >> >> compiler versions that are still very much in the wild! I mean, fine if >> >> you don't want to support new features on such files, but then surely we >> >> can at least revert back to the old behaviour? >> > >> > This is clearly a bug in LLVM7, which didn't produce correct ELF >> > symbols, do we agree on that? libbpf used to handle such invalid ELF >> > files *by accident* until it changed its internal logic to be more >> > strict in v0.2. It became more strict and doesn't work with such >> > invalid ELF files anymore. Does it need to add extra quirks to support >> > such broken ELF? I don't think so. >> >> I don't know enough about the intricacies of the ELF format to say, but >> I believe you when you say it's a bug. However, that doesn't change the >> fact that from a user's PoV, something that was working before is now >> broken, with the only change being a newer libbpf. >> >> This is not a theoretical concern, BTW, I discovered this due to >> feedback from a partner that we've been pushing to adopt libbpf. When >> they finally tried it out, the first thing they noticed is that their >> programs wouldn't load due to this issue. >> >> Sure, I can tell them to just upgrade their toolchain (and I will), but >> that still means we're back to "in order to use this library, you should >> expect to keep chasing the latest version of the entire toolchain". And > > Migrating from LLVM7 to something like LLVM10 or LLVM11 (not asking > for not-yet-released LLVM12) hardly qualifies as "chasing the latest > version". LLVM8 or LLVM9 might work for their simple use case either, > I haven't checked. Just please don't use the extremely outdated > toolchain that is (now) known to be broken. That's all I'm asking. "Extremely outdated" is very much subjective: It's only two years old, and still shipping as the current version in major Linux distributions. But fine, "we won't promise not to break compilers older than a year" is also a support statement, it's just not the one (I thought you were) making before. In any case, having a clearly stated expectation articulated somewhere would be very helpful. Something concrete like "expect to run an LLVM version no older than two stable releases"; just saying "it's stable" seems to be a tad too subjective, as evidenced by this discussion (and similar ones we've had before) :) >> this is a much harder sell than "this is a stable library and upstream >> takes backwards compatibility very serious", which I *thought* was the >> expectation. > > That's still true and I'd rather not go over the same discussion > again. But libbpf is also not a dumpster of work-arounds for all > possible bugs in the kernel and compiler. Libbpf does a lot of that > for backwards compatibility reasons, no need to deal with quirks of > buggy and very outdated compilers (and kernels, if there are obvious > bugs like this). Not adding new workarounds for things that were broken from the start is fine. But we're discussing a change that broke something that was working before. Look at the kernel stability policy: If something was working with an old kernel, we promise it will keep working on new ones, regardless of whether what it was doing is technically outside of some spec. *That's* a stability promise. >> > Surely, users that can't upgrade LLVM7 to something less ancient, can >> > stick to libbpf v0.1, that was lenient enough to accept such invalid >> > ELF files. libbpf v0.2 was released more than a month ago, and so far >> > you are the only one who noticed this "regression". So hopefully it's >> > not super annoying to people and they would be accommodating enough to >> > use more up to date compiler (and save themselves lots of trouble >> > along the way). >> >> Oh, boy, do I envy your adoption rate for new versions! In my world I >> would expect that by one month a few people who are very early adopters >> have started noticing and maybe thinking about testing the new version :) > > In practice with the new libbpf releases I've been getting reports > about something broken within a few days. So yeah, I'm a lucky guy, I > suppose. Don't mean to knock early testing, that's awesome. I'm just objecting to the converse implication (i.e., "if it doesn't show up in early testing it's not a real bug"). >> >> a section name, so it makes it convenient to load programs with 'ip' >> >> without supplying the section name. However, I do realise this is not >> >> the best of reasons, and I am not opposed to changing it. However... >> >> >> >> > I'm also going to emit warnings in libbpf soon for section names that >> >> > don't follow proper libbpf naming pattern, so it would be good if you >> >> > could get ahead of the curve. >> >> >> >> ...this sounds like just another way to annoy users by breaking things >> >> that were working before? :/ >> > >> > It won't break, libbpf will emit a warning about the need to use >> > proper section name format, which will start to be enforced only with >> > major version bump. So that will give users plenty of time to make >> > sure their BPF programs are compatible with stricter libbpf. >> >> Well see above re: different expectations for "plenty of time". But OK, >> maybe this isn't as bad as I figured at first glance :) > > So far libbpf releases were timed to Linux releases, so roughly one > every 2 months. libbpf 1.0 is unlikely to come sooner than 2 releases > out. So it's not like 2 weeks notice, right? Waiting for a year or > more seems excessive as well. Removing deprecated features on major release versions does seem reasonable. But do keep in mind that for some projects, adding a warning still constitutes "breaking the build" due to downstream policy. -Toke ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2020-12-08 22:40 UTC | newest] Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-12-03 17:55 Latest libbpf fails to load programs compiled with old LLVM Toke Høiland-Jørgensen 2020-12-04 1:42 ` Yonghong Song 2020-12-04 9:34 ` Toke Høiland-Jørgensen 2020-12-04 17:54 ` Yonghong Song 2020-12-04 19:23 ` Andrii Nakryiko 2020-12-07 10:59 ` Toke Høiland-Jørgensen 2020-12-07 15:55 ` Alexei Starovoitov 2020-12-07 16:15 ` Toke Høiland-Jørgensen 2020-12-07 16:20 ` Alexei Starovoitov 2020-12-07 16:51 ` Toke Høiland-Jørgensen 2020-12-07 17:16 ` Alexei Starovoitov 2020-12-07 22:18 ` Toke Høiland-Jørgensen 2020-12-08 1:20 ` Alexei Starovoitov 2020-12-07 18:02 ` David Ahern 2020-12-07 18:14 ` Alexei Starovoitov 2020-12-07 18:59 ` David Ahern 2020-12-08 2:47 ` Andrii Nakryiko 2020-12-08 13:41 ` Toke Høiland-Jørgensen 2020-12-08 18:39 ` Andrii Nakryiko 2020-12-08 22:38 ` Toke Høiland-Jørgensen
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.