bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH iproute2-next 0/5] iproute2: add libbpf support
@ 2020-10-23  3:38 Hangbin Liu
  2020-10-23  3:38 ` [PATCH iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
                   ` (6 more replies)
  0 siblings, 7 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-23  3:38 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

This series converts iproute2 to use libbpf for loading and attaching
BPF programs when it is available. This means that iproute2 will
correctly process BTF information and support the new-style BTF-defined
maps, while keeping compatibility with the old internal map definition
syntax.

This is achieved by checking for libbpf at './configure' time, and using
it if available. By default the system libbpf will be used, but static
linking against a custom libbpf version can be achieved by passing
LIBBPF_DIR to configure. FORCE_LIBBPF can be set to force configure to
abort if no suitable libbpf is found (useful for automatic packaging
that wants to enforce the dependency).

The old iproute2 bpf code is kept and will be used if no suitable libbpf
is available. When using libbpf, wrapper code ensures that iproute2 will
still understand the old map definition format, including populating
map-in-map and tail call maps before load.

The examples in bpf/examples are kept, and a separate set of examples
are added with BTF-based map definitions for those examples where this
is possible (libbpf doesn't currently support declaratively populating
tail call maps).

At last, Thanks a lot for Toke's help on this patch set.

Here are the test results with patched iproute2:

== setup env
# clang -O2 -Wall -g -target bpf -c bpf_graft.c -o btf_graft.o
# clang -O2 -Wall -g -target bpf -c bpf_map_in_map.c -o btf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c bpf_shared.c -o btf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_cyclic.c -o bpf_cyclic.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_graft.c -o bpf_graft.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_map_in_map.c -o bpf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_shared.c -o bpf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_tailcall.c -o bpf_tailcall.o
# rm -rf /sys/fs/bpf/xdp/globals
# /root/iproute2/ip/ip link add type veth
# /root/iproute2/ip/ip link set veth0 up
# /root/iproute2/ip/ip link set veth1 up


== Load objs
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 4 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
4: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:21-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 5
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 8 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
8: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:23-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 3
        btf_id 10
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 12 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
12: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:25-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 4
        btf_id 15
# /root/iproute2/ip/ip link set veth0 xdp off


== Load objs again to make sure maps could be reused
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 16 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
16: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:27-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 20
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 20 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show                                                                                                                                                                   [236/4518]
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
20: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:29-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 3
        btf_id 25
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 24 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
24: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:31-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 4
        btf_id 30
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals

== Testing if we can load new-style objects (using xdp-filter as an example)
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_all.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 28 tag e29eeda1489a6520 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
28: xdp  name xdpfilt_alw_all  tag e29eeda1489a6520  gpl
        loaded_at 2020-10-22T08:04:33-0400  uid 0
        xlated 2408B  jited 1405B  memlock 4096B  map_ids 9,5,7,8,6
        btf_id 35
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_ip.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 32 tag 2f2b9dbfb786a5a2 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
32: xdp  name xdpfilt_alw_ip  tag 2f2b9dbfb786a5a2  gpl
        loaded_at 2020-10-22T08:04:35-0400  uid 0
        xlated 1336B  jited 778B  memlock 4096B  map_ids 7,8,5
        btf_id 40
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_tcp.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 36 tag 18c1bb25084030bc jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
36: xdp  name xdpfilt_alw_tcp  tag 18c1bb25084030bc  gpl
        loaded_at 2020-10-22T08:04:37-0400  uid 0
        xlated 1128B  jited 690B  memlock 4096B  map_ids 6,5
        btf_id 45
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals


== Load new btf defined maps
# /root/iproute2/ip/ip link set veth0 xdp obj btf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 40 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
40: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:39-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 50
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj btf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 44 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
44: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:41-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 13
        btf_id 55
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj btf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 48 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer  map_sh
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
14: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
48: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:43-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 14
        btf_id 60
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals


== Test load objs by tc
# /root/iproute2/tc/tc qdisc add dev veth0 ingress
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_cyclic.o sec 0xabccba/0
# /root/iproute2/tc/tc filter add dev veth0 parent ffff: bpf obj bpf_graft.o
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/1
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 43/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec classifier
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
# ls /sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d /sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d:
jmp_tc

/sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f:
jmp_ex  jmp_tc  map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc
# bpftool map show
15: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
16: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
17: prog_array  name jmp_ex  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
18: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 2  memlock 4096B
        owner_prog_type sched_cls  owner jited
19: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
52: sched_cls  name cls_loop  tag 3e98a40b04099d36  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 168B  jited 133B  memlock 4096B  map_ids 15
        btf_id 65
56: sched_cls  name cls_entry  tag 0fbb4d9310a6ee26  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 144B  jited 121B  memlock 4096B  map_ids 16
        btf_id 70
60: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 75
66: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 80
72: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 85
78: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 90
79: sched_cls  name cls_case2  tag ee218ff893dca823  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 336B  jited 218B  memlock 4096B  map_ids 19,18
        btf_id 90
80: sched_cls  name cls_exit  tag e78a58140deed387  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 288B  jited 177B  memlock 4096B  map_ids 19
        btf_id 90

I also run the following upstream kselftest with patches iproute2 and
all passed.

test_lwt_ip_encap.sh
test_xdp_redirect.sh
test_tc_redirect.sh
test_xdp_meta.sh
test_xdp_veth.sh
test_xdp_vlan.sh


Hangbin Liu (5):
  configure: add check_libbpf() for later libbpf support
  lib: rename bpf.c to bpf_legacy.c
  lib: add libbpf support
  examples/bpf: move struct bpf_elf_map defined maps to legacy folder
  examples/bpf: add bpf examples with BTF defined maps

 configure                                |  48 ++++
 examples/bpf/README                      |  18 +-
 examples/bpf/bpf_graft.c                 |  14 +-
 examples/bpf/bpf_map_in_map.c            |  37 ++-
 examples/bpf/bpf_shared.c                |  14 +-
 examples/bpf/{ => legacy}/bpf_cyclic.c   |   2 +-
 examples/bpf/legacy/bpf_graft.c          |  66 +++++
 examples/bpf/legacy/bpf_map_in_map.c     |  56 ++++
 examples/bpf/legacy/bpf_shared.c         |  53 ++++
 examples/bpf/{ => legacy}/bpf_tailcall.c |   2 +-
 include/bpf_api.h                        |  13 +
 include/bpf_util.h                       |  17 +-
 ip/ipvrf.c                               |   4 +-
 lib/Makefile                             |   6 +-
 lib/{bpf.c => bpf_legacy.c}              | 184 +++++++++++-
 lib/bpf_libbpf.c                         | 338 +++++++++++++++++++++++
 16 files changed, 824 insertions(+), 48 deletions(-)
 rename examples/bpf/{ => legacy}/bpf_cyclic.c (95%)
 create mode 100644 examples/bpf/legacy/bpf_graft.c
 create mode 100644 examples/bpf/legacy/bpf_map_in_map.c
 create mode 100644 examples/bpf/legacy/bpf_shared.c
 rename examples/bpf/{ => legacy}/bpf_tailcall.c (98%)
 rename lib/{bpf.c => bpf_legacy.c} (94%)
 create mode 100644 lib/bpf_libbpf.c

-- 
2.25.4


^ permalink raw reply	[flat|nested] 167+ messages in thread

* [PATCH iproute2-next 1/5] configure: add check_libbpf() for later libbpf support
  2020-10-23  3:38 [PATCH iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
@ 2020-10-23  3:38 ` Hangbin Liu
  2020-10-23  3:38 ` [PATCH iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-23  3:38 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

This patch adds a check to see if we support libbpf. By default the
system libbpf will be used, but static linking against a custom libbpf
version can be achieved by passing LIBBPF_DIR to configure. FORCE_LIBBPF
can be set to force configure to abort if no suitable libbpf is found,
which is useful for automatic packaging that wants to enforce the
dependency.

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 configure | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/configure b/configure
index 307912aa..77f475d9 100755
--- a/configure
+++ b/configure
@@ -240,6 +240,51 @@ check_elf()
     fi
 }
 
+check_libbpf()
+{
+    if ${PKG_CONFIG} libbpf --exists || [ -n "$LIBBPF_DIR" ] ; then
+
+        if [ -n "$LIBBPF_DIR" ]; then
+            LIBBPF_CFLAGS="-I${LIBBPF_DIR}/include -L${LIBBPF_DIR}/lib64"
+            LIBBPF_LDLIBS="${LIBBPF_DIR}/lib64/libbpf.a -lz -lelf"
+        else
+            LIBBPF_CFLAGS=$(${PKG_CONFIG} libbpf --cflags)
+            LIBBPF_LDLIBS=$(${PKG_CONFIG} libbpf --libs)
+        fi
+
+        cat >$TMPDIR/libbpftest.c <<EOF
+#include <bpf/libbpf.h>
+int main(int argc, char **argv) {
+    void *ptr;
+    DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts, .relaxed_maps = true, .pin_root_path = "/path");
+    (void) bpf_object__open_file("file", &opts);
+    (void) bpf_map__name(ptr);
+    (void) bpf_map__ifindex(ptr);
+    (void) bpf_map__reuse_fd(ptr, 0);
+    (void) bpf_map__pin(ptr, "/path");
+    return 0;
+}
+EOF
+
+        if $CC -o $TMPDIR/libbpftest $TMPDIR/libbpftest.c $LIBBPF_CFLAGS -lbpf 2>&1; then
+            echo "HAVE_LIBBPF:=y" >>$CONFIG
+            echo 'CFLAGS += -DHAVE_LIBBPF ' $LIBBPF_CFLAGS >> $CONFIG
+            echo 'LDLIBS += ' $LIBBPF_LDLIBS >>$CONFIG
+            echo "yes"
+            return 0
+        fi
+    fi
+
+    echo "no"
+
+    # if set FORCE_LIBBPF but no libbpf support, just exist the config
+    # process to make sure we don't build without libbpf.
+    if [ -n "$FORCE_LIBBPF" ]; then
+	    echo "FORCE_LIBBPF set, but couldn't find a usable libbpf"
+	    exit 1
+    fi
+}
+
 check_selinux()
 # SELinux is a compile time option in the ss utility
 {
@@ -385,6 +430,9 @@ check_setns
 echo -n "SELinux support: "
 check_selinux
 
+echo -n "libbpf support: "
+check_libbpf
+
 echo -n "ELF support: "
 check_elf
 
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCH iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c
  2020-10-23  3:38 [PATCH iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
  2020-10-23  3:38 ` [PATCH iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
@ 2020-10-23  3:38 ` Hangbin Liu
  2020-10-23  3:38 ` [PATCH iproute2-next 3/5] lib: add libbpf support Hangbin Liu
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-23  3:38 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

This is a preparation for later libbpf support in iproute2. Function
bpf_prog_load() is also renamed to bpf_prog_load_buf() as there is a
conflict with libbpf.

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 include/bpf_util.h          | 6 +++---
 ip/ipvrf.c                  | 4 ++--
 lib/Makefile                | 2 +-
 lib/{bpf.c => bpf_legacy.c} | 6 +++---
 4 files changed, 9 insertions(+), 9 deletions(-)
 rename lib/{bpf.c => bpf_legacy.c} (99%)

diff --git a/include/bpf_util.h b/include/bpf_util.h
index 63db07ca..72d3a32c 100644
--- a/include/bpf_util.h
+++ b/include/bpf_util.h
@@ -274,9 +274,9 @@ int bpf_trace_pipe(void);
 
 void bpf_print_ops(struct rtattr *bpf_ops, __u16 len);
 
-int bpf_prog_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-		  size_t size_insns, const char *license, char *log,
-		  size_t size_log);
+int bpf_prog_load_buf(enum bpf_prog_type type, const struct bpf_insn *insns,
+		      size_t size_insns, const char *license, char *log,
+		      size_t size_log);
 
 int bpf_prog_attach_fd(int prog_fd, int target_fd, enum bpf_attach_type type);
 int bpf_prog_detach_fd(int target_fd, enum bpf_attach_type type);
diff --git a/ip/ipvrf.c b/ip/ipvrf.c
index 28dd8e25..33150ac2 100644
--- a/ip/ipvrf.c
+++ b/ip/ipvrf.c
@@ -256,8 +256,8 @@ static int prog_load(int idx)
 		BPF_EXIT_INSN(),
 	};
 
-	return bpf_prog_load(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
-			     "GPL", bpf_log_buf, sizeof(bpf_log_buf));
+	return bpf_prog_load_buf(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
+			         "GPL", bpf_log_buf, sizeof(bpf_log_buf));
 }
 
 static int vrf_configure_cgroup(const char *path, int ifindex)
diff --git a/lib/Makefile b/lib/Makefile
index 7cba1857..a326fb9f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -5,7 +5,7 @@ CFLAGS += -fPIC
 
 UTILOBJ = utils.o rt_names.o ll_map.o ll_types.o ll_proto.o ll_addr.o \
 	inet_proto.o namespace.o json_writer.o json_print.o \
-	names.o color.o bpf.o exec.o fs.o cg_map.o
+	names.o color.o bpf_legacy.o exec.o fs.o cg_map.o
 
 NLOBJ=libgenl.o libnetlink.o
 
diff --git a/lib/bpf.c b/lib/bpf_legacy.c
similarity index 99%
rename from lib/bpf.c
rename to lib/bpf_legacy.c
index c7d45077..2e6e0602 100644
--- a/lib/bpf.c
+++ b/lib/bpf_legacy.c
@@ -1109,9 +1109,9 @@ static int bpf_prog_load_dev(enum bpf_prog_type type,
 	return bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
 }
 
-int bpf_prog_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-		  size_t size_insns, const char *license, char *log,
-		  size_t size_log)
+int bpf_prog_load_buf(enum bpf_prog_type type, const struct bpf_insn *insns,
+		      size_t size_insns, const char *license, char *log,
+		      size_t size_log)
 {
 	return bpf_prog_load_dev(type, insns, size_insns, license, 0,
 				 log, size_log);
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCH iproute2-next 3/5] lib: add libbpf support
  2020-10-23  3:38 [PATCH iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
  2020-10-23  3:38 ` [PATCH iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
  2020-10-23  3:38 ` [PATCH iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
@ 2020-10-23  3:38 ` Hangbin Liu
  2020-10-23 14:34   ` David Ahern
  2020-10-24  0:21   ` Andrii Nakryiko
  2020-10-23  3:38 ` [PATCH iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-23  3:38 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

This patch converts iproute2 to use libbpf for loading and attaching
BPF programs when it is available, which is started by Toke's
implementation[1]. With libbpf iproute2 could correctly process BTF
information and support the new-style BTF-defined maps, while keeping
compatibility with the old internal map definition syntax.

The old iproute2 bpf code is kept and will be used if no suitable libbpf
is available. When using libbpf, wrapper code in bpf_legacy.c ensures that
iproute2 will still understand the old map definition format, including
populating map-in-map and tail call maps before load.

In bpf_libbpf.c, we init iproute2 ctx and elf info first to check the
legacy bytes. When handling the legacy maps, for map-in-maps, we create
them manually and re-use the fd as they are associated with id/inner_id.
For pin maps, we only set the pin path and let libbp load to handle it.
For tail calls, we find it first and update the element after prog load.

Other maps/progs will be loaded by libbpf directly.

Note: ip/ipvrf.c is not convert to use libbpf as it only encodes a few
instructions and load directly.

[1] https://lore.kernel.org/bpf/20190820114706.18546-1-toke@redhat.com/

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 include/bpf_util.h |  11 ++
 lib/Makefile       |   4 +
 lib/bpf_legacy.c   | 178 ++++++++++++++++++++++++
 lib/bpf_libbpf.c   | 338 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 531 insertions(+)
 create mode 100644 lib/bpf_libbpf.c

diff --git a/include/bpf_util.h b/include/bpf_util.h
index 72d3a32c..e200c107 100644
--- a/include/bpf_util.h
+++ b/include/bpf_util.h
@@ -300,4 +300,15 @@ static inline int bpf_recv_map_fds(const char *path, int *fds,
 	return -1;
 }
 #endif /* HAVE_ELF */
+
+#ifdef HAVE_LIBBPF
+int iproute2_bpf_elf_ctx_init(struct bpf_cfg_in *cfg);
+int iproute2_bpf_fetch_ancillary(void);
+int iproute2_get_root_path(char *root_path, size_t len);
+bool iproute2_is_pin_map(const char *libbpf_map_name, char *pathname);
+bool iproute2_is_map_in_map(const char *libbpf_map_name, struct bpf_elf_map *imap,
+			    struct bpf_elf_map *omap, char *omap_name);
+int iproute2_find_map_name_by_id(unsigned int map_id, char *name);
+int iproute2_load_libbpf(struct bpf_cfg_in *cfg);
+#endif /* HAVE_LIBBPF */
 #endif /* __BPF_UTIL__ */
diff --git a/lib/Makefile b/lib/Makefile
index a326fb9f..82d6e465 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -7,6 +7,10 @@ UTILOBJ = utils.o rt_names.o ll_map.o ll_types.o ll_proto.o ll_addr.o \
 	inet_proto.o namespace.o json_writer.o json_print.o \
 	names.o color.o bpf_legacy.o exec.o fs.o cg_map.o
 
+ifeq ($(HAVE_LIBBPF),y)
+UTILOBJ += bpf_libbpf.o
+endif
+
 NLOBJ=libgenl.o libnetlink.o
 
 all: libnetlink.a libutil.a
diff --git a/lib/bpf_legacy.c b/lib/bpf_legacy.c
index 2e6e0602..c5ff3e32 100644
--- a/lib/bpf_legacy.c
+++ b/lib/bpf_legacy.c
@@ -940,6 +940,9 @@ static int bpf_do_parse(struct bpf_cfg_in *cfg, const bool *opt_tbl)
 static int bpf_do_load(struct bpf_cfg_in *cfg)
 {
 	if (cfg->mode == EBPF_OBJECT) {
+#ifdef HAVE_LIBBPF
+		return iproute2_load_libbpf(cfg);
+#endif
 		cfg->prog_fd = bpf_obj_open(cfg->object, cfg->type,
 					    cfg->section, cfg->ifindex,
 					    cfg->verbose);
@@ -3165,3 +3168,178 @@ int bpf_recv_map_fds(const char *path, int *fds, struct bpf_map_aux *aux,
 	return ret;
 }
 #endif /* HAVE_ELF */
+
+#ifdef HAVE_LIBBPF
+/* The following functions are wrapper functions for libbpf code to be
+ * compatible with the legacy format. So all the functions have prefix
+ * with iproute2_
+ */
+int iproute2_bpf_elf_ctx_init(struct bpf_cfg_in *cfg)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+
+	return bpf_elf_ctx_init(ctx, cfg->object, cfg->type, cfg->ifindex, cfg->verbose);
+}
+
+int iproute2_bpf_fetch_ancillary(void)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	struct bpf_elf_sec_data data;
+	int i, ret = 0;
+
+	for (i = 1; i < ctx->elf_hdr.e_shnum; i++) {
+		ret = bpf_fill_section_data(ctx, i, &data);
+		if (ret < 0)
+			continue;
+
+		if (data.sec_hdr.sh_type == SHT_PROGBITS &&
+		    !strcmp(data.sec_name, ELF_SECTION_MAPS))
+			ret = bpf_fetch_maps_begin(ctx, i, &data);
+		else if (data.sec_hdr.sh_type == SHT_SYMTAB &&
+			 !strcmp(data.sec_name, ".symtab"))
+			ret = bpf_fetch_symtab(ctx, i, &data);
+		else if (data.sec_hdr.sh_type == SHT_STRTAB &&
+			 !strcmp(data.sec_name, ".strtab"))
+			ret = bpf_fetch_strtab(ctx, i, &data);
+		if (ret < 0) {
+			fprintf(stderr, "Error parsing section %d! Perhaps check with readelf -a?\n",
+				i);
+			return ret;
+		}
+	}
+
+	if (bpf_has_map_data(ctx)) {
+		ret = bpf_fetch_maps_end(ctx);
+		if (ret < 0) {
+			fprintf(stderr, "Error fixing up map structure, incompatible struct bpf_elf_map used?\n");
+			return ret;
+		}
+	}
+
+	return ret;
+}
+
+int iproute2_get_root_path(char *root_path, size_t len)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	int ret = 0;
+
+	snprintf(root_path, len, "%s/%s",
+		 bpf_get_work_dir(ctx->type), BPF_DIR_GLOBALS);
+
+	ret = mkdir(root_path, S_IRWXU);
+	if (ret && errno != EEXIST) {
+		fprintf(stderr, "mkdir %s failed: %s\n", root_path, strerror(errno));
+		return ret;
+	}
+
+	return 0;
+}
+
+bool iproute2_is_pin_map(const char *libbpf_map_name, char *pathname)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *map_name, *tmp;
+	unsigned int pinning;
+	int i, ret = 0;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		if (ctx->maps[i].pinning == PIN_OBJECT_NS &&
+		    ctx->noafalg) {
+			fprintf(stderr, "Missing kernel AF_ALG support for PIN_OBJECT_NS!\n");
+			return false;
+		}
+
+		map_name = bpf_map_fetch_name(ctx, i);
+		if (!map_name) {
+			return false;
+		}
+
+		if (strcmp(libbpf_map_name, map_name))
+			continue;
+
+		pinning = ctx->maps[i].pinning;
+
+		if (bpf_no_pinning(ctx, pinning) || !bpf_get_work_dir(ctx->type))
+			return false;
+
+		if (pinning == PIN_OBJECT_NS)
+			ret = bpf_make_obj_path(ctx);
+		else if ((tmp = bpf_custom_pinning(ctx, pinning)))
+			ret = bpf_make_custom_path(ctx, tmp);
+		if (ret < 0)
+			return false;
+
+		bpf_make_pathname(pathname, PATH_MAX, map_name, ctx, pinning);
+
+		return true;
+	}
+
+	return false;
+}
+
+bool iproute2_is_map_in_map(const char *libbpf_map_name, struct bpf_elf_map *imap,
+			    struct bpf_elf_map *omap, char *omap_name)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *inner_map_name, *outer_map_name;
+	int i, j;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		inner_map_name = bpf_map_fetch_name(ctx, i);
+		if (!inner_map_name) {
+			return false;
+		}
+
+		if (strcmp(libbpf_map_name, inner_map_name))
+			continue;
+
+		if (!ctx->maps[i].id ||
+		    ctx->maps[i].inner_id ||
+		    ctx->maps[i].inner_idx == -1)
+			continue;
+
+		*imap = ctx->maps[i];
+
+		for (j = 0; j < ctx->map_num; j++) {
+			if (!bpf_is_map_in_map_type(&ctx->maps[j]))
+				continue;
+			if (ctx->maps[j].inner_id != ctx->maps[i].id)
+				continue;
+
+			*omap = ctx->maps[j];
+			outer_map_name = bpf_map_fetch_name(ctx, j);
+			memcpy(omap_name, outer_map_name, strlen(outer_map_name) + 1);
+
+			return true;
+		}
+	}
+
+	return false;
+}
+
+int iproute2_find_map_name_by_id(unsigned int map_id, char *name)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *map_name;
+	int i, idx = -1;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		if (ctx->maps[i].id == map_id &&
+		    ctx->maps[i].type == BPF_MAP_TYPE_PROG_ARRAY) {
+			idx = i;
+			break;
+		}
+	}
+
+	if (idx < 0)
+		return -1;
+
+	map_name = bpf_map_fetch_name(ctx, idx);
+	if (!map_name)
+		return -1;
+
+	memcpy(name, map_name, strlen(map_name) + 1);
+	return 0;
+}
+#endif /* HAVE_LIBBPF */
diff --git a/lib/bpf_libbpf.c b/lib/bpf_libbpf.c
new file mode 100644
index 00000000..9e3b9787
--- /dev/null
+++ b/lib/bpf_libbpf.c
@@ -0,0 +1,338 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <errno.h>
+#include <fcntl.h>
+
+#include <libelf.h>
+#include <gelf.h>
+
+#include <bpf/libbpf.h>
+#include <bpf/bpf.h>
+
+#include "bpf_util.h"
+
+#define MAX_ERRNO       4095
+#define IS_ERR_VALUE(x) ((x) >= (unsigned long)-MAX_ERRNO)
+
+static inline bool IS_ERR_OR_NULL(const void *ptr)
+{
+	return (!ptr) || IS_ERR_VALUE((unsigned long)ptr);
+}
+
+static int verbose_print(enum libbpf_print_level level, const char *format, va_list args)
+{
+	return vfprintf(stderr, format, args);
+}
+
+static int silent_print(enum libbpf_print_level level, const char *format, va_list args)
+{
+	if (level > LIBBPF_WARN)
+		return 0;
+
+	/* Skip warning from bpf_object__init_user_maps() for legacy maps */
+	if (strstr(format, "has unrecognized, non-zero options"))
+		return 0;
+
+	return vfprintf(stderr, format, args);
+}
+
+static int create_map(const char *name, struct bpf_elf_map *map,
+		      __u32 ifindex, int inner_fd)
+{
+	struct bpf_create_map_attr map_attr = {};
+
+	map_attr.name = name;
+	map_attr.map_type = map->type;
+	map_attr.map_flags = map->flags;
+	map_attr.key_size = map->size_key;
+	map_attr.value_size = map->size_value;
+	map_attr.max_entries = map->max_elem;
+	map_attr.map_ifindex = ifindex;
+	map_attr.inner_map_fd = inner_fd;
+
+	return bpf_create_map_xattr(&map_attr);
+}
+
+static int create_map_in_map(struct bpf_object *obj, struct bpf_map *map,
+			     struct bpf_elf_map *elf_map, int inner_fd,
+			     bool *reuse_pin_map)
+{
+	char pathname[PATH_MAX];
+	const char *map_name;
+	bool pin_map = false;
+	int map_fd, ret = 0;
+
+	map_name = bpf_map__name(map);
+
+	if (iproute2_is_pin_map(map_name, pathname)) {
+		pin_map = true;
+
+		/* Check if there already has a pinned map */
+		map_fd = bpf_obj_get(pathname);
+		if (map_fd > 0) {
+			if (reuse_pin_map)
+				*reuse_pin_map = true;
+			close(map_fd);
+			return bpf_map__set_pin_path(map, pathname);
+		}
+	}
+
+	map_fd = create_map(map_name, elf_map, bpf_map__ifindex(map), inner_fd);
+	if (map_fd < 0) {
+		fprintf(stderr, "create map %s failed\n", map_name);
+		return map_fd;
+	}
+
+	ret = bpf_map__reuse_fd(map, map_fd);
+	if (ret < 0) {
+		fprintf(stderr, "map %s reuse fd failed\n", map_name);
+		goto err_out;
+	}
+
+	if (pin_map) {
+		ret = bpf_map__set_pin_path(map, pathname);
+		if (ret < 0)
+			goto err_out;
+	}
+
+	return 0;
+err_out:
+	close(map_fd);
+	return ret;
+}
+
+static int
+handle_legacy_map_in_map(struct bpf_object *obj, struct bpf_map *inner_map,
+			 const char *inner_map_name)
+{
+	int inner_fd, outer_fd, inner_idx, ret = 0;
+	struct bpf_elf_map imap, omap;
+	struct bpf_map *outer_map;
+	/* What's the size limit of map name? */
+	char outer_map_name[128];
+	bool reuse_pin_map = false;
+
+	/* Deal with map-in-map */
+	if (iproute2_is_map_in_map(inner_map_name, &imap, &omap, outer_map_name)) {
+		ret = create_map_in_map(obj, inner_map, &imap, -1, NULL);
+		if (ret < 0)
+			return ret;
+
+		inner_fd = bpf_map__fd(inner_map);
+		outer_map = bpf_object__find_map_by_name(obj, outer_map_name);
+		ret = create_map_in_map(obj, outer_map, &omap, inner_fd, &reuse_pin_map);
+		if (ret < 0)
+			return ret;
+
+		if (!reuse_pin_map) {
+			inner_idx = imap.inner_idx;
+			outer_fd = bpf_map__fd(outer_map);
+			ret = bpf_map_update_elem(outer_fd, &inner_idx, &inner_fd, 0);
+			if (ret < 0)
+				fprintf(stderr, "Cannot update inner_idx into outer_map\n");
+		}
+	}
+
+	return ret;
+}
+
+static int find_legacy_tail_calls(struct bpf_program *prog, struct bpf_object *obj)
+{
+	unsigned int map_id, key_id;
+	const char *sec_name;
+	struct bpf_map *map;
+	char map_name[128];
+	int ret;
+
+	/* Handle iproute2 tail call */
+	sec_name = bpf_program__section_name(prog);
+	ret = sscanf(sec_name, "%i/%i", &map_id, &key_id);
+	if (ret != 2)
+		return -1;
+
+	ret = iproute2_find_map_name_by_id(map_id, map_name);
+	if (ret < 0) {
+		fprintf(stderr, "unable to find map id %u for tail call\n", map_id);
+		return ret;
+	}
+
+	map = bpf_object__find_map_by_name(obj, map_name);
+	if (!map)
+		return -1;
+
+	/* Save the map here for later updating */
+	bpf_program__set_priv(prog, map, NULL);
+
+	return 0;
+}
+
+static int update_legacy_tail_call_maps(struct bpf_object *obj)
+{
+	int prog_fd, map_fd, ret = 0;
+	unsigned int map_id, key_id;
+	struct bpf_program *prog;
+	const char *sec_name;
+	struct bpf_map *map;
+
+	bpf_object__for_each_program(prog, obj) {
+		map = bpf_program__priv(prog);
+		if (!map)
+			continue;
+
+		prog_fd = bpf_program__fd(prog);
+		if (prog_fd < 0)
+			continue;
+
+		sec_name = bpf_program__section_name(prog);
+		ret = sscanf(sec_name, "%i/%i", &map_id, &key_id);
+		if (ret != 2)
+			continue;
+
+		map_fd = bpf_map__fd(map);
+		ret = bpf_map_update_elem(map_fd, &key_id, &prog_fd, 0);
+		if (ret < 0) {
+			fprintf(stderr, "Cannot update map key for tail call!\n");
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+static int handle_legacy_maps(struct bpf_object *obj)
+{
+	char pathname[PATH_MAX];
+	struct bpf_map *map;
+	const char *map_name;
+	int map_fd, ret = 0;
+
+	bpf_object__for_each_map(map, obj) {
+		map_name = bpf_map__name(map);
+
+		ret = handle_legacy_map_in_map(obj, map, map_name);
+		if (ret)
+			return ret;
+
+		/* If it is a iproute2 legacy pin maps, just set pin path
+		 * and let bpf_object__load() to deal with the map creation.
+		 * We need to ignore map-in-maps which have pinned maps manually
+		 */
+		map_fd = bpf_map__fd(map);
+		if (map_fd < 0 && iproute2_is_pin_map(map_name, pathname)) {
+			ret = bpf_map__set_pin_path(map, pathname);
+			if (ret) {
+				fprintf(stderr, "map '%s': couldn't set pin path.\n", map_name);
+				break;
+			}
+		}
+
+	}
+
+	return ret;
+}
+
+static int load_bpf_object(struct bpf_cfg_in *cfg)
+{
+	struct bpf_program *p, *prog = NULL;
+	struct bpf_object *obj;
+	char root_path[PATH_MAX];
+	struct bpf_map *map;
+	int prog_fd, ret = 0;
+
+	ret = iproute2_get_root_path(root_path, PATH_MAX);
+	if (ret)
+		return ret;
+
+	DECLARE_LIBBPF_OPTS(bpf_object_open_opts, open_opts,
+			.relaxed_maps = true,
+			.pin_root_path = root_path,
+	);
+
+	obj = bpf_object__open_file(cfg->object, &open_opts);
+	if (IS_ERR_OR_NULL(obj))
+		return -ENOENT;
+
+	bpf_object__for_each_program(p, obj) {
+		/* Only load the programs that will either be subsequently
+		 * attached or inserted into a tail call map */
+		if (find_legacy_tail_calls(p, obj) < 0 && cfg->section &&
+		    strcmp(bpf_program__section_name(p), cfg->section)) {
+			ret = bpf_program__set_autoload(p, false);
+			if (ret)
+				return -EINVAL;
+			continue;
+		}
+
+		bpf_program__set_type(p, cfg->type);
+		bpf_program__set_ifindex(p, cfg->ifindex);
+		if (!prog)
+			prog = p;
+	}
+
+	bpf_object__for_each_map(map, obj) {
+		if (!bpf_map__is_offload_neutral(map))
+			bpf_map__set_ifindex(map, cfg->ifindex);
+	}
+
+	if (!prog) {
+		fprintf(stderr, "object file doesn't contain sec %s\n", cfg->section);
+		return -ENOENT;
+	}
+
+	/* Handle iproute2 legacy pin maps and map-in-maps */
+	ret = handle_legacy_maps(obj);
+	if (ret)
+		goto unload_obj;
+
+	ret = bpf_object__load(obj);
+	if (ret)
+		goto unload_obj;
+
+	ret = update_legacy_tail_call_maps(obj);
+	if (ret)
+		goto unload_obj;
+
+	prog_fd = fcntl(bpf_program__fd(prog), F_DUPFD_CLOEXEC, 1);
+	if (prog_fd < 0)
+		ret = -errno;
+	else
+		cfg->prog_fd = prog_fd;
+
+unload_obj:
+	/* Close obj as we don't need it */
+	bpf_object__close(obj);
+	return ret;
+}
+
+/* Load ebpf and return prog fd */
+int iproute2_load_libbpf(struct bpf_cfg_in *cfg)
+{
+	int ret = 0;
+
+	if (cfg->verbose)
+		libbpf_set_print(verbose_print);
+	else
+		libbpf_set_print(silent_print);
+
+	ret = iproute2_bpf_elf_ctx_init(cfg);
+	if (ret < 0) {
+		fprintf(stderr, "Cannot initialize ELF context!\n");
+		return ret;
+	}
+
+	ret = iproute2_bpf_fetch_ancillary();
+	if (ret < 0) {
+		fprintf(stderr, "Error fetching ELF ancillary data!\n");
+		return ret;
+	}
+
+	ret = load_bpf_object(cfg);
+	if (ret)
+		return ret;
+
+	return cfg->prog_fd;
+}
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCH iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder
  2020-10-23  3:38 [PATCH iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
                   ` (2 preceding siblings ...)
  2020-10-23  3:38 ` [PATCH iproute2-next 3/5] lib: add libbpf support Hangbin Liu
@ 2020-10-23  3:38 ` Hangbin Liu
  2020-10-23  3:38 ` [PATCH iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-23  3:38 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 examples/bpf/README                        | 14 +++++++++-----
 examples/bpf/{ => legacy}/bpf_cyclic.c     |  2 +-
 examples/bpf/{ => legacy}/bpf_graft.c      |  2 +-
 examples/bpf/{ => legacy}/bpf_map_in_map.c |  2 +-
 examples/bpf/{ => legacy}/bpf_shared.c     |  2 +-
 examples/bpf/{ => legacy}/bpf_tailcall.c   |  2 +-
 6 files changed, 14 insertions(+), 10 deletions(-)
 rename examples/bpf/{ => legacy}/bpf_cyclic.c (95%)
 rename examples/bpf/{ => legacy}/bpf_graft.c (97%)
 rename examples/bpf/{ => legacy}/bpf_map_in_map.c (96%)
 rename examples/bpf/{ => legacy}/bpf_shared.c (97%)
 rename examples/bpf/{ => legacy}/bpf_tailcall.c (98%)

diff --git a/examples/bpf/README b/examples/bpf/README
index 1bbdda3f..732bcc83 100644
--- a/examples/bpf/README
+++ b/examples/bpf/README
@@ -1,8 +1,12 @@
 eBPF toy code examples (running in kernel) to familiarize yourself
 with syntax and features:
 
- - bpf_shared.c		-> Ingress/egress map sharing example
- - bpf_tailcall.c	-> Using tail call chains
- - bpf_cyclic.c		-> Simple cycle as tail calls
- - bpf_graft.c		-> Demo on altering runtime behaviour
- - bpf_map_in_map.c     -> Using map in map example
+ - legacy/bpf_shared.c		-> Ingress/egress map sharing example
+ - legacy/bpf_tailcall.c	-> Using tail call chains
+ - legacy/bpf_cyclic.c		-> Simple cycle as tail calls
+ - legacy/bpf_graft.c		-> Demo on altering runtime behaviour
+ - legacy/bpf_map_in_map.c	-> Using map in map example
+
+Note: Users should use new BTF way to defined the maps, the examples
+in legacy folder which is using struct bpf_elf_map defined maps is not
+recommanded.
diff --git a/examples/bpf/bpf_cyclic.c b/examples/bpf/legacy/bpf_cyclic.c
similarity index 95%
rename from examples/bpf/bpf_cyclic.c
rename to examples/bpf/legacy/bpf_cyclic.c
index 11d1c061..33590730 100644
--- a/examples/bpf/bpf_cyclic.c
+++ b/examples/bpf/legacy/bpf_cyclic.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* Cyclic dependency example to test the kernel's runtime upper
  * bound on loops. Also demonstrates on how to use direct-actions,
diff --git a/examples/bpf/bpf_graft.c b/examples/bpf/legacy/bpf_graft.c
similarity index 97%
rename from examples/bpf/bpf_graft.c
rename to examples/bpf/legacy/bpf_graft.c
index 07113d4a..f4c920cc 100644
--- a/examples/bpf/bpf_graft.c
+++ b/examples/bpf/legacy/bpf_graft.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* This example demonstrates how classifier run-time behaviour
  * can be altered with tail calls. We start out with an empty
diff --git a/examples/bpf/bpf_map_in_map.c b/examples/bpf/legacy/bpf_map_in_map.c
similarity index 96%
rename from examples/bpf/bpf_map_in_map.c
rename to examples/bpf/legacy/bpf_map_in_map.c
index ff0e623a..575f8812 100644
--- a/examples/bpf/bpf_map_in_map.c
+++ b/examples/bpf/legacy/bpf_map_in_map.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 #define MAP_INNER_ID	42
 
diff --git a/examples/bpf/bpf_shared.c b/examples/bpf/legacy/bpf_shared.c
similarity index 97%
rename from examples/bpf/bpf_shared.c
rename to examples/bpf/legacy/bpf_shared.c
index 21fe6f1e..05b2b9ef 100644
--- a/examples/bpf/bpf_shared.c
+++ b/examples/bpf/legacy/bpf_shared.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* Minimal, stand-alone toy map pinning example:
  *
diff --git a/examples/bpf/bpf_tailcall.c b/examples/bpf/legacy/bpf_tailcall.c
similarity index 98%
rename from examples/bpf/bpf_tailcall.c
rename to examples/bpf/legacy/bpf_tailcall.c
index 161eb606..8ebc554c 100644
--- a/examples/bpf/bpf_tailcall.c
+++ b/examples/bpf/legacy/bpf_tailcall.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 #define ENTRY_INIT	3
 #define ENTRY_0		0
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCH iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps
  2020-10-23  3:38 [PATCH iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
                   ` (3 preceding siblings ...)
  2020-10-23  3:38 ` [PATCH iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
@ 2020-10-23  3:38 ` Hangbin Liu
  2020-10-28 13:25 ` [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
  2020-11-29  6:16 ` [PATCH " Stephen Hemminger
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-23  3:38 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

Users should try use the new BTF defined maps instead of struct
bpf_elf_map defined maps. The tail call examples are not added yet
as libbpf doesn't currently support declaratively populating tail call
maps.

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 examples/bpf/README           |  6 ++++
 examples/bpf/bpf_graft.c      | 66 +++++++++++++++++++++++++++++++++++
 examples/bpf/bpf_map_in_map.c | 55 +++++++++++++++++++++++++++++
 examples/bpf/bpf_shared.c     | 53 ++++++++++++++++++++++++++++
 include/bpf_api.h             | 13 +++++++
 5 files changed, 193 insertions(+)
 create mode 100644 examples/bpf/bpf_graft.c
 create mode 100644 examples/bpf/bpf_map_in_map.c
 create mode 100644 examples/bpf/bpf_shared.c

diff --git a/examples/bpf/README b/examples/bpf/README
index 732bcc83..b7261191 100644
--- a/examples/bpf/README
+++ b/examples/bpf/README
@@ -1,6 +1,12 @@
 eBPF toy code examples (running in kernel) to familiarize yourself
 with syntax and features:
 
+- BTF defined map examples
+ - bpf_graft.c		-> Demo on altering runtime behaviour
+ - bpf_shared.c 	-> Ingress/egress map sharing example
+ - bpf_map_in_map.c	-> Using map in map example
+
+- legacy struct bpf_elf_map defined map examples
  - legacy/bpf_shared.c		-> Ingress/egress map sharing example
  - legacy/bpf_tailcall.c	-> Using tail call chains
  - legacy/bpf_cyclic.c		-> Simple cycle as tail calls
diff --git a/examples/bpf/bpf_graft.c b/examples/bpf/bpf_graft.c
new file mode 100644
index 00000000..8066dcce
--- /dev/null
+++ b/examples/bpf/bpf_graft.c
@@ -0,0 +1,66 @@
+#include "../../include/bpf_api.h"
+
+/* This example demonstrates how classifier run-time behaviour
+ * can be altered with tail calls. We start out with an empty
+ * jmp_tc array, then add section aaa to the array slot 0, and
+ * later on atomically replace it with section bbb. Note that
+ * as shown in other examples, the tc loader can prepopulate
+ * tail called sections, here we start out with an empty one
+ * on purpose to show it can also be done this way.
+ *
+ * tc filter add dev foo parent ffff: bpf obj graft.o
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-20229 [001] ..s. 138993.003923: : fallthrough
+ *   <idle>-0            [001] ..s. 138993.202265: : fallthrough
+ *   Socket Thread-20229 [001] ..s. 138994.004149: : fallthrough
+ *   [...]
+ *
+ * tc exec bpf graft m:globals/jmp_tc key 0 obj graft.o sec aaa
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-19818 [002] ..s. 139012.053587: : aaa
+ *   <idle>-0            [002] ..s. 139012.172359: : aaa
+ *   Socket Thread-19818 [001] ..s. 139012.173556: : aaa
+ *   [...]
+ *
+ * tc exec bpf graft m:globals/jmp_tc key 0 obj graft.o sec bbb
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-19818 [002] ..s. 139022.102967: : bbb
+ *   <idle>-0            [002] ..s. 139022.155640: : bbb
+ *   Socket Thread-19818 [001] ..s. 139022.156730: : bbb
+ *   [...]
+ */
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);
+} jmp_tc __section(".maps");
+
+__section("aaa")
+int cls_aaa(struct __sk_buff *skb)
+{
+	printt("aaa\n");
+	return TC_H_MAKE(1, 42);
+}
+
+__section("bbb")
+int cls_bbb(struct __sk_buff *skb)
+{
+	printt("bbb\n");
+	return TC_H_MAKE(1, 43);
+}
+
+__section_cls_entry
+int cls_entry(struct __sk_buff *skb)
+{
+	tail_call(skb, &jmp_tc, 0);
+	printt("fallthrough\n");
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/examples/bpf/bpf_map_in_map.c b/examples/bpf/bpf_map_in_map.c
new file mode 100644
index 00000000..39c86268
--- /dev/null
+++ b/examples/bpf/bpf_map_in_map.c
@@ -0,0 +1,55 @@
+#include "../../include/bpf_api.h"
+
+struct inner_map {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+} map_inner __section(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);
+	__array(values, struct inner_map);
+} map_outer __section(".maps") = {
+	.values = {
+		[0] = &map_inner,
+	},
+};
+
+__section("egress")
+int emain(struct __sk_buff *skb)
+{
+	struct bpf_elf_map *map_inner;
+	int key = 0, *val;
+
+	map_inner = map_lookup_elem(&map_outer, &key);
+	if (map_inner) {
+		val = map_lookup_elem(map_inner, &key);
+		if (val)
+			lock_xadd(val, 1);
+	}
+
+	return BPF_H_DEFAULT;
+}
+
+__section("ingress")
+int imain(struct __sk_buff *skb)
+{
+	struct bpf_elf_map *map_inner;
+	int key = 0, *val;
+
+	map_inner = map_lookup_elem(&map_outer, &key);
+	if (map_inner) {
+		val = map_lookup_elem(map_inner, &key);
+		if (val)
+			printt("map val: %d\n", *val);
+	}
+
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/examples/bpf/bpf_shared.c b/examples/bpf/bpf_shared.c
new file mode 100644
index 00000000..99a332f4
--- /dev/null
+++ b/examples/bpf/bpf_shared.c
@@ -0,0 +1,53 @@
+#include "../../include/bpf_api.h"
+
+/* Minimal, stand-alone toy map pinning example:
+ *
+ * clang -target bpf -O2 [...] -o bpf_shared.o -c bpf_shared.c
+ * tc filter add dev foo parent 1: bpf obj bpf_shared.o sec egress
+ * tc filter add dev foo parent ffff: bpf obj bpf_shared.o sec ingress
+ *
+ * Both classifier will share the very same map instance in this example,
+ * so map content can be accessed from ingress *and* egress side!
+ *
+ * This example has a pinning of PIN_OBJECT_NS, so it's private and
+ * thus shared among various program sections within the object.
+ *
+ * A setting of PIN_GLOBAL_NS would place it into a global namespace,
+ * so that it can be shared among different object files. A setting
+ * of PIN_NONE (= 0) means no sharing, so each tc invocation a new map
+ * instance is being created.
+ */
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);	/* or LIBBPF_PIN_NONE */
+} map_sh __section(".maps");
+
+__section("egress")
+int emain(struct __sk_buff *skb)
+{
+	int key = 0, *val;
+
+	val = map_lookup_elem(&map_sh, &key);
+	if (val)
+		lock_xadd(val, 1);
+
+	return BPF_H_DEFAULT;
+}
+
+__section("ingress")
+int imain(struct __sk_buff *skb)
+{
+	int key = 0, *val;
+
+	val = map_lookup_elem(&map_sh, &key);
+	if (val)
+		printt("map val: %d\n", *val);
+
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/include/bpf_api.h b/include/bpf_api.h
index 89d3488d..82c47089 100644
--- a/include/bpf_api.h
+++ b/include/bpf_api.h
@@ -19,6 +19,19 @@
 
 #include "bpf_elf.h"
 
+/** libbpf pin type. */
+enum libbpf_pin_type {
+	LIBBPF_PIN_NONE,
+	/* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */
+	LIBBPF_PIN_BY_NAME,
+};
+
+/** Type helper macros. */
+
+#define __uint(name, val) int (*name)[val]
+#define __type(name, val) typeof(val) *name
+#define __array(name, val) typeof(val) *name[]
+
 /** Misc macros. */
 
 #ifndef __stringify
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 3/5] lib: add libbpf support
  2020-10-23  3:38 ` [PATCH iproute2-next 3/5] lib: add libbpf support Hangbin Liu
@ 2020-10-23 14:34   ` David Ahern
  2020-10-25 15:13     ` Toke Høiland-Jørgensen
  2020-10-24  0:21   ` Andrii Nakryiko
  1 sibling, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-10-23 14:34 UTC (permalink / raw)
  To: Hangbin Liu, Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On 10/22/20 9:38 PM, Hangbin Liu wrote:
> Note: ip/ipvrf.c is not convert to use libbpf as it only encodes a few
> instructions and load directly.

for completeness, libbpf should be able to load a program from a buffer
as well.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 3/5] lib: add libbpf support
  2020-10-23  3:38 ` [PATCH iproute2-next 3/5] lib: add libbpf support Hangbin Liu
  2020-10-23 14:34   ` David Ahern
@ 2020-10-24  0:21   ` Andrii Nakryiko
  2020-10-25 15:11     ` Toke Høiland-Jørgensen
  2020-10-26  8:10     ` Hangbin Liu
  1 sibling, 2 replies; 167+ messages in thread
From: Andrii Nakryiko @ 2020-10-24  0:21 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: Stephen Hemminger, Daniel Borkmann, David Ahern,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Thu, Oct 22, 2020 at 8:39 PM Hangbin Liu <haliu@redhat.com> wrote:
>
> This patch converts iproute2 to use libbpf for loading and attaching
> BPF programs when it is available, which is started by Toke's
> implementation[1]. With libbpf iproute2 could correctly process BTF
> information and support the new-style BTF-defined maps, while keeping
> compatibility with the old internal map definition syntax.
>
> The old iproute2 bpf code is kept and will be used if no suitable libbpf
> is available. When using libbpf, wrapper code in bpf_legacy.c ensures that
> iproute2 will still understand the old map definition format, including
> populating map-in-map and tail call maps before load.
>
> In bpf_libbpf.c, we init iproute2 ctx and elf info first to check the
> legacy bytes. When handling the legacy maps, for map-in-maps, we create
> them manually and re-use the fd as they are associated with id/inner_id.
> For pin maps, we only set the pin path and let libbp load to handle it.
> For tail calls, we find it first and update the element after prog load.

I never implemented tail call map initialization using the same
approach as declarative map-in-map support in libbpf, because no one
asked and/or showed a use case. But all the pieces are there, and if
there's interest, we should probably support that in libbpf as well.

>
> Other maps/progs will be loaded by libbpf directly.
>
> Note: ip/ipvrf.c is not convert to use libbpf as it only encodes a few
> instructions and load directly.
>
> [1] https://lore.kernel.org/bpf/20190820114706.18546-1-toke@redhat.com/
>
> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
> Signed-off-by: Hangbin Liu <haliu@redhat.com>
> ---
>  include/bpf_util.h |  11 ++
>  lib/Makefile       |   4 +
>  lib/bpf_legacy.c   | 178 ++++++++++++++++++++++++
>  lib/bpf_libbpf.c   | 338 +++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 531 insertions(+)
>  create mode 100644 lib/bpf_libbpf.c
>

[...]

> +
> +static int load_bpf_object(struct bpf_cfg_in *cfg)
> +{
> +       struct bpf_program *p, *prog = NULL;
> +       struct bpf_object *obj;
> +       char root_path[PATH_MAX];
> +       struct bpf_map *map;
> +       int prog_fd, ret = 0;
> +
> +       ret = iproute2_get_root_path(root_path, PATH_MAX);
> +       if (ret)
> +               return ret;
> +
> +       DECLARE_LIBBPF_OPTS(bpf_object_open_opts, open_opts,
> +                       .relaxed_maps = true,
> +                       .pin_root_path = root_path,
> +       );
> +
> +       obj = bpf_object__open_file(cfg->object, &open_opts);
> +       if (IS_ERR_OR_NULL(obj))

libbpf defines libbpf_get_error() to check that the returned pointer
is not encoding error, you shouldn't need to define your IS_ERR
macros.

> +               return -ENOENT;
> +

[...]

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 3/5] lib: add libbpf support
  2020-10-24  0:21   ` Andrii Nakryiko
@ 2020-10-25 15:11     ` Toke Høiland-Jørgensen
  2020-10-26  8:10     ` Hangbin Liu
  1 sibling, 0 replies; 167+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-25 15:11 UTC (permalink / raw)
  To: Andrii Nakryiko, Hangbin Liu
  Cc: Stephen Hemminger, Daniel Borkmann, David Ahern,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko

Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:

> On Thu, Oct 22, 2020 at 8:39 PM Hangbin Liu <haliu@redhat.com> wrote:
>>
>> This patch converts iproute2 to use libbpf for loading and attaching
>> BPF programs when it is available, which is started by Toke's
>> implementation[1]. With libbpf iproute2 could correctly process BTF
>> information and support the new-style BTF-defined maps, while keeping
>> compatibility with the old internal map definition syntax.
>>
>> The old iproute2 bpf code is kept and will be used if no suitable libbpf
>> is available. When using libbpf, wrapper code in bpf_legacy.c ensures that
>> iproute2 will still understand the old map definition format, including
>> populating map-in-map and tail call maps before load.
>>
>> In bpf_libbpf.c, we init iproute2 ctx and elf info first to check the
>> legacy bytes. When handling the legacy maps, for map-in-maps, we create
>> them manually and re-use the fd as they are associated with id/inner_id.
>> For pin maps, we only set the pin path and let libbp load to handle it.
>> For tail calls, we find it first and update the element after prog load.
>
> I never implemented tail call map initialization using the same
> approach as declarative map-in-map support in libbpf, because no one
> asked and/or showed a use case. But all the pieces are there, and if
> there's interest, we should probably support that in libbpf as well.

Yeah, that's what we figured; and since this series maintains
compatibility with the old map definition format for declarative
tail-calls, this doesn't have to hold up the conversion: iproute2 will
just magically gain this when/if it lands in libbpf :)

-Toke


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 3/5] lib: add libbpf support
  2020-10-23 14:34   ` David Ahern
@ 2020-10-25 15:13     ` Toke Høiland-Jørgensen
  2020-10-25 22:12       ` David Ahern
  0 siblings, 1 reply; 167+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-25 15:13 UTC (permalink / raw)
  To: David Ahern, Hangbin Liu, Stephen Hemminger, Daniel Borkmann,
	Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko

David Ahern <dsahern@gmail.com> writes:

> On 10/22/20 9:38 PM, Hangbin Liu wrote:
>> Note: ip/ipvrf.c is not convert to use libbpf as it only encodes a few
>> instructions and load directly.
>
> for completeness, libbpf should be able to load a program from a buffer
> as well.

It can, but the particular use in ipvrf is just loading half a dozen
instructions defined inline in C - there's no object files, BTF or
anything. So why bother with going through libbpf in this case? The
actual attachment is using the existing code anyway...

-Toke


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 3/5] lib: add libbpf support
  2020-10-25 15:13     ` Toke Høiland-Jørgensen
@ 2020-10-25 22:12       ` David Ahern
  2020-10-26  8:56         ` Hangbin Liu
  0 siblings, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-10-25 22:12 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Hangbin Liu, Stephen Hemminger,
	Daniel Borkmann, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko

On 10/25/20 9:13 AM, Toke Høiland-Jørgensen wrote:
> David Ahern <dsahern@gmail.com> writes:
> 
>> On 10/22/20 9:38 PM, Hangbin Liu wrote:
>>> Note: ip/ipvrf.c is not convert to use libbpf as it only encodes a few
>>> instructions and load directly.
>>
>> for completeness, libbpf should be able to load a program from a buffer
>> as well.
> 
> It can, but the particular use in ipvrf is just loading half a dozen
> instructions defined inline in C - there's no object files, BTF or
> anything. So why bother with going through libbpf in this case? The
> actual attachment is using the existing code anyway...
> 

actually, it already does: bpf_load_program

I recalled figuring out how to do it, just did not remember if it was
local changes to libbpf. Does not look like any changes were needed:

https://github.com/dsahern/bpf-progs/blob/master/src/cgroup_sock.c

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 3/5] lib: add libbpf support
  2020-10-24  0:21   ` Andrii Nakryiko
  2020-10-25 15:11     ` Toke Høiland-Jørgensen
@ 2020-10-26  8:10     ` Hangbin Liu
  1 sibling, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-26  8:10 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Stephen Hemminger, Daniel Borkmann, David Ahern,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Fri, Oct 23, 2020 at 05:21:20PM -0700, Andrii Nakryiko wrote:
> > +       obj = bpf_object__open_file(cfg->object, &open_opts);
> > +       if (IS_ERR_OR_NULL(obj))
> 
> libbpf defines libbpf_get_error() to check that the returned pointer
> is not encoding error, you shouldn't need to define your IS_ERR
> macros.

Thanks for this tip, I will fix it in next version.

Hangbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 3/5] lib: add libbpf support
  2020-10-25 22:12       ` David Ahern
@ 2020-10-26  8:56         ` Hangbin Liu
  2020-10-26 15:15           ` David Ahern
  0 siblings, 1 reply; 167+ messages in thread
From: Hangbin Liu @ 2020-10-26  8:56 UTC (permalink / raw)
  To: David Ahern
  Cc: Toke Høiland-Jørgensen, Stephen Hemminger,
	Daniel Borkmann, Alexei Starovoitov, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Miller, Jesper Dangaard Brouer, netdev, bpf,
	Jiri Benc, Andrii Nakryiko


Hi David,

On Sun, Oct 25, 2020 at 04:12:34PM -0600, David Ahern wrote:
> On 10/25/20 9:13 AM, Toke Høiland-Jørgensen wrote:
> > David Ahern <dsahern@gmail.com> writes:
> > 
> >> On 10/22/20 9:38 PM, Hangbin Liu wrote:
> >>> Note: ip/ipvrf.c is not convert to use libbpf as it only encodes a few
> >>> instructions and load directly.
> >>
> >> for completeness, libbpf should be able to load a program from a buffer
> >> as well.
> > 
> > It can, but the particular use in ipvrf is just loading half a dozen
> > instructions defined inline in C - there's no object files, BTF or
> > anything. So why bother with going through libbpf in this case? The
> > actual attachment is using the existing code anyway...
> > 
> 
> actually, it already does: bpf_load_program

Thanks for this info. Do you want to convert ipvrf.c to:

@@ -256,8 +262,13 @@ static int prog_load(int idx)
 		BPF_EXIT_INSN(),
 	};
 
+#ifdef HAVE_LIBBPF
+	return bpf_load_program(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
+				"GPL", 0, bpf_log_buf, sizeof(bpf_log_buf));
+#else
 	return bpf_prog_load_buf(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
 			         "GPL", bpf_log_buf, sizeof(bpf_log_buf));
+#endif
 }
 
 static int vrf_configure_cgroup(const char *path, int ifindex)
@@ -288,7 +299,11 @@ static int vrf_configure_cgroup(const char *path, int ifindex)
 		goto out;
 	}
 
+#ifdef HAVE_LIBBPF
+	if (bpf_prog_attach(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE, 0)) {
+#else
 	if (bpf_prog_attach_fd(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE)) {
+#endif
 		fprintf(stderr, "Failed to attach prog to cgroup: '%s'\n",
 			strerror(errno));
 		goto out;

Thanks
Hangbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 3/5] lib: add libbpf support
  2020-10-26  8:56         ` Hangbin Liu
@ 2020-10-26 15:15           ` David Ahern
  2020-10-27  2:58             ` Hangbin Liu
  0 siblings, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-10-26 15:15 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: Toke Høiland-Jørgensen, Stephen Hemminger,
	Daniel Borkmann, Alexei Starovoitov, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Miller, Jesper Dangaard Brouer, netdev, bpf,
	Jiri Benc, Andrii Nakryiko

On 10/26/20 2:56 AM, Hangbin Liu wrote:
> 
> Hi David,
> 
> On Sun, Oct 25, 2020 at 04:12:34PM -0600, David Ahern wrote:
>> On 10/25/20 9:13 AM, Toke Høiland-Jørgensen wrote:
>>> David Ahern <dsahern@gmail.com> writes:
>>>
>>>> On 10/22/20 9:38 PM, Hangbin Liu wrote:
>>>>> Note: ip/ipvrf.c is not convert to use libbpf as it only encodes a few
>>>>> instructions and load directly.
>>>>
>>>> for completeness, libbpf should be able to load a program from a buffer
>>>> as well.
>>>
>>> It can, but the particular use in ipvrf is just loading half a dozen
>>> instructions defined inline in C - there's no object files, BTF or
>>> anything. So why bother with going through libbpf in this case? The
>>> actual attachment is using the existing code anyway...
>>>
>>
>> actually, it already does: bpf_load_program
> 
> Thanks for this info. Do you want to convert ipvrf.c to:
> 
> @@ -256,8 +262,13 @@ static int prog_load(int idx)
>  		BPF_EXIT_INSN(),
>  	};
>  
> +#ifdef HAVE_LIBBPF
> +	return bpf_load_program(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
> +				"GPL", 0, bpf_log_buf, sizeof(bpf_log_buf));
> +#else
>  	return bpf_prog_load_buf(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
>  			         "GPL", bpf_log_buf, sizeof(bpf_log_buf));
> +#endif
>  }
>  
>  static int vrf_configure_cgroup(const char *path, int ifindex)
> @@ -288,7 +299,11 @@ static int vrf_configure_cgroup(const char *path, int ifindex)
>  		goto out;
>  	}
>  
> +#ifdef HAVE_LIBBPF
> +	if (bpf_prog_attach(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE, 0)) {
> +#else
>  	if (bpf_prog_attach_fd(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE)) {
> +#endif
>  		fprintf(stderr, "Failed to attach prog to cgroup: '%s'\n",
>  			strerror(errno));
>  		goto out;
> 

works for me. The rename in patch 2 can be dropped as well correct?

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 3/5] lib: add libbpf support
  2020-10-26 15:15           ` David Ahern
@ 2020-10-27  2:58             ` Hangbin Liu
  0 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-27  2:58 UTC (permalink / raw)
  To: David Ahern
  Cc: Toke Høiland-Jørgensen, Stephen Hemminger,
	Daniel Borkmann, Alexei Starovoitov, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Miller, Jesper Dangaard Brouer, netdev, bpf,
	Jiri Benc, Andrii Nakryiko

On Mon, Oct 26, 2020 at 09:15:00AM -0600, David Ahern wrote:
> >> actually, it already does: bpf_load_program
> > 
> > Thanks for this info. Do you want to convert ipvrf.c to:
> > 
> > @@ -256,8 +262,13 @@ static int prog_load(int idx)
> >  		BPF_EXIT_INSN(),
> >  	};
> >  
> > +#ifdef HAVE_LIBBPF
> > +	return bpf_load_program(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
> > +				"GPL", 0, bpf_log_buf, sizeof(bpf_log_buf));
> > +#else
> >  	return bpf_prog_load_buf(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
> >  			         "GPL", bpf_log_buf, sizeof(bpf_log_buf));
> > +#endif
> >  }
> >  
> >  static int vrf_configure_cgroup(const char *path, int ifindex)
> > @@ -288,7 +299,11 @@ static int vrf_configure_cgroup(const char *path, int ifindex)
> >  		goto out;
> >  	}
> >  
> > +#ifdef HAVE_LIBBPF
> > +	if (bpf_prog_attach(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE, 0)) {
> > +#else
> >  	if (bpf_prog_attach_fd(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE)) {
> > +#endif
> >  		fprintf(stderr, "Failed to attach prog to cgroup: '%s'\n",
> >  			strerror(errno));
> >  		goto out;
> > 
> 
> works for me. The rename in patch 2 can be dropped as well correct?
> 

No, the BPF_MOV64_* micros are not defined in uapi, so we still need include
"bpf_util.h", which will got bpf_prog_load() conflicts.

Thanks
Hangbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-23  3:38 [PATCH iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
                   ` (4 preceding siblings ...)
  2020-10-23  3:38 ` [PATCH iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
@ 2020-10-28 13:25 ` Hangbin Liu
  2020-10-28 13:25   ` [PATCHv2 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
                     ` (7 more replies)
  2020-11-29  6:16 ` [PATCH " Stephen Hemminger
  6 siblings, 8 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-28 13:25 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

This series converts iproute2 to use libbpf for loading and attaching
BPF programs when it is available. This means that iproute2 will
correctly process BTF information and support the new-style BTF-defined
maps, while keeping compatibility with the old internal map definition
syntax.

This is achieved by checking for libbpf at './configure' time, and using
it if available. By default the system libbpf will be used, but static
linking against a custom libbpf version can be achieved by passing
LIBBPF_DIR to configure. FORCE_LIBBPF can be set to force configure to
abort if no suitable libbpf is found (useful for automatic packaging
that wants to enforce the dependency).

The old iproute2 bpf code is kept and will be used if no suitable libbpf
is available. When using libbpf, wrapper code ensures that iproute2 will
still understand the old map definition format, including populating
map-in-map and tail call maps before load.

The examples in bpf/examples are kept, and a separate set of examples
are added with BTF-based map definitions for those examples where this
is possible (libbpf doesn't currently support declaratively populating
tail call maps).

At last, Thanks a lot for Toke's help on this patch set.


v2:
a) Remove self defined IS_ERR_OR_NULL and use libbpf_get_error() instead.
b) Add ipvrf with libbpf support.


Here are the test results with patched iproute2:

== setup env
# clang -O2 -Wall -g -target bpf -c bpf_graft.c -o btf_graft.o
# clang -O2 -Wall -g -target bpf -c bpf_map_in_map.c -o btf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c bpf_shared.c -o btf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_cyclic.c -o bpf_cyclic.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_graft.c -o bpf_graft.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_map_in_map.c -o bpf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_shared.c -o bpf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_tailcall.c -o bpf_tailcall.o
# rm -rf /sys/fs/bpf/xdp/globals
# /root/iproute2/ip/ip link add type veth
# /root/iproute2/ip/ip link set veth0 up
# /root/iproute2/ip/ip link set veth1 up


== Load objs
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 4 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
4: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:21-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 5
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 8 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
8: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:23-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 3
        btf_id 10
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 12 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
12: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:25-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 4
        btf_id 15
# /root/iproute2/ip/ip link set veth0 xdp off


== Load objs again to make sure maps could be reused
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 16 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
16: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:27-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 20
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 20 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show                                                                                                                                                                   [236/4518]
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
20: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:29-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 3
        btf_id 25
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 24 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
24: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:31-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 4
        btf_id 30
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals

== Testing if we can load new-style objects (using xdp-filter as an example)
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_all.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 28 tag e29eeda1489a6520 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
28: xdp  name xdpfilt_alw_all  tag e29eeda1489a6520  gpl
        loaded_at 2020-10-22T08:04:33-0400  uid 0
        xlated 2408B  jited 1405B  memlock 4096B  map_ids 9,5,7,8,6
        btf_id 35
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_ip.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 32 tag 2f2b9dbfb786a5a2 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
32: xdp  name xdpfilt_alw_ip  tag 2f2b9dbfb786a5a2  gpl
        loaded_at 2020-10-22T08:04:35-0400  uid 0
        xlated 1336B  jited 778B  memlock 4096B  map_ids 7,8,5
        btf_id 40
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_tcp.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 36 tag 18c1bb25084030bc jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
36: xdp  name xdpfilt_alw_tcp  tag 18c1bb25084030bc  gpl
        loaded_at 2020-10-22T08:04:37-0400  uid 0
        xlated 1128B  jited 690B  memlock 4096B  map_ids 6,5
        btf_id 45
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals


== Load new btf defined maps
# /root/iproute2/ip/ip link set veth0 xdp obj btf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 40 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
40: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:39-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 50
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj btf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 44 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
44: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:41-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 13
        btf_id 55
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj btf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 48 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer  map_sh
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
14: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
48: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:43-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 14
        btf_id 60
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals


== Test load objs by tc
# /root/iproute2/tc/tc qdisc add dev veth0 ingress
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_cyclic.o sec 0xabccba/0
# /root/iproute2/tc/tc filter add dev veth0 parent ffff: bpf obj bpf_graft.o
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/1
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 43/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec classifier
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
# ls /sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d /sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d:
jmp_tc

/sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f:
jmp_ex  jmp_tc  map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc
# bpftool map show
15: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
16: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
17: prog_array  name jmp_ex  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
18: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 2  memlock 4096B
        owner_prog_type sched_cls  owner jited
19: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
52: sched_cls  name cls_loop  tag 3e98a40b04099d36  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 168B  jited 133B  memlock 4096B  map_ids 15
        btf_id 65
56: sched_cls  name cls_entry  tag 0fbb4d9310a6ee26  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 144B  jited 121B  memlock 4096B  map_ids 16
        btf_id 70
60: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 75
66: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 80
72: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 85
78: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 90
79: sched_cls  name cls_case2  tag ee218ff893dca823  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 336B  jited 218B  memlock 4096B  map_ids 19,18
        btf_id 90
80: sched_cls  name cls_exit  tag e78a58140deed387  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 288B  jited 177B  memlock 4096B  map_ids 19
        btf_id 90

I also run the following upstream kselftest with patches iproute2 and
all passed.

test_lwt_ip_encap.sh
test_xdp_redirect.sh
test_tc_redirect.sh
test_xdp_meta.sh
test_xdp_veth.sh
test_xdp_vlan.sh

Hangbin Liu (5):
  configure: add check_libbpf() for later libbpf support
  lib: rename bpf.c to bpf_legacy.c
  lib: add libbpf support
  examples/bpf: move struct bpf_elf_map defined maps to legacy folder
  examples/bpf: add bpf examples with BTF defined maps

 configure                                |  48 ++++
 examples/bpf/README                      |  18 +-
 examples/bpf/bpf_graft.c                 |  14 +-
 examples/bpf/bpf_map_in_map.c            |  37 ++-
 examples/bpf/bpf_shared.c                |  14 +-
 examples/bpf/{ => legacy}/bpf_cyclic.c   |   2 +-
 examples/bpf/legacy/bpf_graft.c          |  66 +++++
 examples/bpf/legacy/bpf_map_in_map.c     |  56 ++++
 examples/bpf/legacy/bpf_shared.c         |  53 ++++
 examples/bpf/{ => legacy}/bpf_tailcall.c |   2 +-
 include/bpf_api.h                        |  13 +
 include/bpf_util.h                       |  17 +-
 ip/ipvrf.c                               |  19 +-
 lib/Makefile                             |   6 +-
 lib/{bpf.c => bpf_legacy.c}              | 184 ++++++++++++-
 lib/bpf_libbpf.c                         | 332 +++++++++++++++++++++++
 16 files changed, 833 insertions(+), 48 deletions(-)
 rename examples/bpf/{ => legacy}/bpf_cyclic.c (95%)
 create mode 100644 examples/bpf/legacy/bpf_graft.c
 create mode 100644 examples/bpf/legacy/bpf_map_in_map.c
 create mode 100644 examples/bpf/legacy/bpf_shared.c
 rename examples/bpf/{ => legacy}/bpf_tailcall.c (98%)
 rename lib/{bpf.c => bpf_legacy.c} (94%)
 create mode 100644 lib/bpf_libbpf.c

-- 
2.25.4


^ permalink raw reply	[flat|nested] 167+ messages in thread

* [PATCHv2 iproute2-next 1/5] configure: add check_libbpf() for later libbpf support
  2020-10-28 13:25 ` [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
@ 2020-10-28 13:25   ` Hangbin Liu
  2020-10-28 13:25   ` [PATCHv2 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-28 13:25 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

This patch adds a check to see if we support libbpf. By default the
system libbpf will be used, but static linking against a custom libbpf
version can be achieved by passing LIBBPF_DIR to configure. FORCE_LIBBPF
can be set to force configure to abort if no suitable libbpf is found,
which is useful for automatic packaging that wants to enforce the
dependency.

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 configure | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/configure b/configure
index 307912aa..77f475d9 100755
--- a/configure
+++ b/configure
@@ -240,6 +240,51 @@ check_elf()
     fi
 }
 
+check_libbpf()
+{
+    if ${PKG_CONFIG} libbpf --exists || [ -n "$LIBBPF_DIR" ] ; then
+
+        if [ -n "$LIBBPF_DIR" ]; then
+            LIBBPF_CFLAGS="-I${LIBBPF_DIR}/include -L${LIBBPF_DIR}/lib64"
+            LIBBPF_LDLIBS="${LIBBPF_DIR}/lib64/libbpf.a -lz -lelf"
+        else
+            LIBBPF_CFLAGS=$(${PKG_CONFIG} libbpf --cflags)
+            LIBBPF_LDLIBS=$(${PKG_CONFIG} libbpf --libs)
+        fi
+
+        cat >$TMPDIR/libbpftest.c <<EOF
+#include <bpf/libbpf.h>
+int main(int argc, char **argv) {
+    void *ptr;
+    DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts, .relaxed_maps = true, .pin_root_path = "/path");
+    (void) bpf_object__open_file("file", &opts);
+    (void) bpf_map__name(ptr);
+    (void) bpf_map__ifindex(ptr);
+    (void) bpf_map__reuse_fd(ptr, 0);
+    (void) bpf_map__pin(ptr, "/path");
+    return 0;
+}
+EOF
+
+        if $CC -o $TMPDIR/libbpftest $TMPDIR/libbpftest.c $LIBBPF_CFLAGS -lbpf 2>&1; then
+            echo "HAVE_LIBBPF:=y" >>$CONFIG
+            echo 'CFLAGS += -DHAVE_LIBBPF ' $LIBBPF_CFLAGS >> $CONFIG
+            echo 'LDLIBS += ' $LIBBPF_LDLIBS >>$CONFIG
+            echo "yes"
+            return 0
+        fi
+    fi
+
+    echo "no"
+
+    # if set FORCE_LIBBPF but no libbpf support, just exist the config
+    # process to make sure we don't build without libbpf.
+    if [ -n "$FORCE_LIBBPF" ]; then
+	    echo "FORCE_LIBBPF set, but couldn't find a usable libbpf"
+	    exit 1
+    fi
+}
+
 check_selinux()
 # SELinux is a compile time option in the ss utility
 {
@@ -385,6 +430,9 @@ check_setns
 echo -n "SELinux support: "
 check_selinux
 
+echo -n "libbpf support: "
+check_libbpf
+
 echo -n "ELF support: "
 check_elf
 
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv2 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c
  2020-10-28 13:25 ` [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
  2020-10-28 13:25   ` [PATCHv2 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
@ 2020-10-28 13:25   ` Hangbin Liu
  2020-10-28 13:25   ` [PATCHv2 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-28 13:25 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

This is a preparation for later libbpf support in iproute2. Function
bpf_prog_load() is also renamed to bpf_prog_load_buf() as there is a
conflict with libbpf.

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 include/bpf_util.h          | 6 +++---
 ip/ipvrf.c                  | 4 ++--
 lib/Makefile                | 2 +-
 lib/{bpf.c => bpf_legacy.c} | 6 +++---
 4 files changed, 9 insertions(+), 9 deletions(-)
 rename lib/{bpf.c => bpf_legacy.c} (99%)

diff --git a/include/bpf_util.h b/include/bpf_util.h
index 63db07ca..72d3a32c 100644
--- a/include/bpf_util.h
+++ b/include/bpf_util.h
@@ -274,9 +274,9 @@ int bpf_trace_pipe(void);
 
 void bpf_print_ops(struct rtattr *bpf_ops, __u16 len);
 
-int bpf_prog_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-		  size_t size_insns, const char *license, char *log,
-		  size_t size_log);
+int bpf_prog_load_buf(enum bpf_prog_type type, const struct bpf_insn *insns,
+		      size_t size_insns, const char *license, char *log,
+		      size_t size_log);
 
 int bpf_prog_attach_fd(int prog_fd, int target_fd, enum bpf_attach_type type);
 int bpf_prog_detach_fd(int target_fd, enum bpf_attach_type type);
diff --git a/ip/ipvrf.c b/ip/ipvrf.c
index 28dd8e25..33150ac2 100644
--- a/ip/ipvrf.c
+++ b/ip/ipvrf.c
@@ -256,8 +256,8 @@ static int prog_load(int idx)
 		BPF_EXIT_INSN(),
 	};
 
-	return bpf_prog_load(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
-			     "GPL", bpf_log_buf, sizeof(bpf_log_buf));
+	return bpf_prog_load_buf(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
+			         "GPL", bpf_log_buf, sizeof(bpf_log_buf));
 }
 
 static int vrf_configure_cgroup(const char *path, int ifindex)
diff --git a/lib/Makefile b/lib/Makefile
index 7cba1857..a326fb9f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -5,7 +5,7 @@ CFLAGS += -fPIC
 
 UTILOBJ = utils.o rt_names.o ll_map.o ll_types.o ll_proto.o ll_addr.o \
 	inet_proto.o namespace.o json_writer.o json_print.o \
-	names.o color.o bpf.o exec.o fs.o cg_map.o
+	names.o color.o bpf_legacy.o exec.o fs.o cg_map.o
 
 NLOBJ=libgenl.o libnetlink.o
 
diff --git a/lib/bpf.c b/lib/bpf_legacy.c
similarity index 99%
rename from lib/bpf.c
rename to lib/bpf_legacy.c
index c7d45077..2e6e0602 100644
--- a/lib/bpf.c
+++ b/lib/bpf_legacy.c
@@ -1109,9 +1109,9 @@ static int bpf_prog_load_dev(enum bpf_prog_type type,
 	return bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
 }
 
-int bpf_prog_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-		  size_t size_insns, const char *license, char *log,
-		  size_t size_log)
+int bpf_prog_load_buf(enum bpf_prog_type type, const struct bpf_insn *insns,
+		      size_t size_insns, const char *license, char *log,
+		      size_t size_log)
 {
 	return bpf_prog_load_dev(type, insns, size_insns, license, 0,
 				 log, size_log);
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv2 iproute2-next 3/5] lib: add libbpf support
  2020-10-28 13:25 ` [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
  2020-10-28 13:25   ` [PATCHv2 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
  2020-10-28 13:25   ` [PATCHv2 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
@ 2020-10-28 13:25   ` Hangbin Liu
  2020-10-28 13:25   ` [PATCHv2 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-28 13:25 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

This patch converts iproute2 to use libbpf for loading and attaching
BPF programs when it is available, which is started by Toke's
implementation[1]. With libbpf iproute2 could correctly process BTF
information and support the new-style BTF-defined maps, while keeping
compatibility with the old internal map definition syntax.

The old iproute2 bpf code is kept and will be used if no suitable libbpf
is available. When using libbpf, wrapper code in bpf_legacy.c ensures that
iproute2 will still understand the old map definition format, including
populating map-in-map and tail call maps before load.

In bpf_libbpf.c, we init iproute2 ctx and elf info first to check the
legacy bytes. When handling the legacy maps, for map-in-maps, we create
them manually and re-use the fd as they are associated with id/inner_id.
For pin maps, we only set the pin path and let libbp load to handle it.
For tail calls, we find it first and update the element after prog load.

Other maps/progs will be loaded by libbpf directly.

[1] https://lore.kernel.org/bpf/20190820114706.18546-1-toke@redhat.com/

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---

v2:
Remove self defined IS_ERR_OR_NULL and use libbpf_get_error() instead.
Add ipvrf with libbpf support.
---
 include/bpf_util.h |  11 ++
 ip/ipvrf.c         |  15 ++
 lib/Makefile       |   4 +
 lib/bpf_legacy.c   | 178 ++++++++++++++++++++++++
 lib/bpf_libbpf.c   | 332 +++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 540 insertions(+)
 create mode 100644 lib/bpf_libbpf.c

diff --git a/include/bpf_util.h b/include/bpf_util.h
index 72d3a32c..e200c107 100644
--- a/include/bpf_util.h
+++ b/include/bpf_util.h
@@ -300,4 +300,15 @@ static inline int bpf_recv_map_fds(const char *path, int *fds,
 	return -1;
 }
 #endif /* HAVE_ELF */
+
+#ifdef HAVE_LIBBPF
+int iproute2_bpf_elf_ctx_init(struct bpf_cfg_in *cfg);
+int iproute2_bpf_fetch_ancillary(void);
+int iproute2_get_root_path(char *root_path, size_t len);
+bool iproute2_is_pin_map(const char *libbpf_map_name, char *pathname);
+bool iproute2_is_map_in_map(const char *libbpf_map_name, struct bpf_elf_map *imap,
+			    struct bpf_elf_map *omap, char *omap_name);
+int iproute2_find_map_name_by_id(unsigned int map_id, char *name);
+int iproute2_load_libbpf(struct bpf_cfg_in *cfg);
+#endif /* HAVE_LIBBPF */
 #endif /* __BPF_UTIL__ */
diff --git a/ip/ipvrf.c b/ip/ipvrf.c
index 33150ac2..afaf1de7 100644
--- a/ip/ipvrf.c
+++ b/ip/ipvrf.c
@@ -28,8 +28,14 @@
 #include "rt_names.h"
 #include "utils.h"
 #include "ip_common.h"
+
 #include "bpf_util.h"
 
+#ifdef HAVE_LIBBPF
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+#endif
+
 #define CGRP_PROC_FILE  "/cgroup.procs"
 
 static struct link_filter vrf_filter;
@@ -256,8 +262,13 @@ static int prog_load(int idx)
 		BPF_EXIT_INSN(),
 	};
 
+#ifdef HAVE_LIBBPF
+	return bpf_load_program(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
+				"GPL", 0, bpf_log_buf, sizeof(bpf_log_buf));
+#else
 	return bpf_prog_load_buf(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
 			         "GPL", bpf_log_buf, sizeof(bpf_log_buf));
+#endif
 }
 
 static int vrf_configure_cgroup(const char *path, int ifindex)
@@ -288,7 +299,11 @@ static int vrf_configure_cgroup(const char *path, int ifindex)
 		goto out;
 	}
 
+#ifdef HAVE_LIBBPF
+	if (bpf_prog_attach(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE, 0)) {
+#else
 	if (bpf_prog_attach_fd(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE)) {
+#endif
 		fprintf(stderr, "Failed to attach prog to cgroup: '%s'\n",
 			strerror(errno));
 		goto out;
diff --git a/lib/Makefile b/lib/Makefile
index a326fb9f..82d6e465 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -7,6 +7,10 @@ UTILOBJ = utils.o rt_names.o ll_map.o ll_types.o ll_proto.o ll_addr.o \
 	inet_proto.o namespace.o json_writer.o json_print.o \
 	names.o color.o bpf_legacy.o exec.o fs.o cg_map.o
 
+ifeq ($(HAVE_LIBBPF),y)
+UTILOBJ += bpf_libbpf.o
+endif
+
 NLOBJ=libgenl.o libnetlink.o
 
 all: libnetlink.a libutil.a
diff --git a/lib/bpf_legacy.c b/lib/bpf_legacy.c
index 2e6e0602..c5ff3e32 100644
--- a/lib/bpf_legacy.c
+++ b/lib/bpf_legacy.c
@@ -940,6 +940,9 @@ static int bpf_do_parse(struct bpf_cfg_in *cfg, const bool *opt_tbl)
 static int bpf_do_load(struct bpf_cfg_in *cfg)
 {
 	if (cfg->mode == EBPF_OBJECT) {
+#ifdef HAVE_LIBBPF
+		return iproute2_load_libbpf(cfg);
+#endif
 		cfg->prog_fd = bpf_obj_open(cfg->object, cfg->type,
 					    cfg->section, cfg->ifindex,
 					    cfg->verbose);
@@ -3165,3 +3168,178 @@ int bpf_recv_map_fds(const char *path, int *fds, struct bpf_map_aux *aux,
 	return ret;
 }
 #endif /* HAVE_ELF */
+
+#ifdef HAVE_LIBBPF
+/* The following functions are wrapper functions for libbpf code to be
+ * compatible with the legacy format. So all the functions have prefix
+ * with iproute2_
+ */
+int iproute2_bpf_elf_ctx_init(struct bpf_cfg_in *cfg)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+
+	return bpf_elf_ctx_init(ctx, cfg->object, cfg->type, cfg->ifindex, cfg->verbose);
+}
+
+int iproute2_bpf_fetch_ancillary(void)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	struct bpf_elf_sec_data data;
+	int i, ret = 0;
+
+	for (i = 1; i < ctx->elf_hdr.e_shnum; i++) {
+		ret = bpf_fill_section_data(ctx, i, &data);
+		if (ret < 0)
+			continue;
+
+		if (data.sec_hdr.sh_type == SHT_PROGBITS &&
+		    !strcmp(data.sec_name, ELF_SECTION_MAPS))
+			ret = bpf_fetch_maps_begin(ctx, i, &data);
+		else if (data.sec_hdr.sh_type == SHT_SYMTAB &&
+			 !strcmp(data.sec_name, ".symtab"))
+			ret = bpf_fetch_symtab(ctx, i, &data);
+		else if (data.sec_hdr.sh_type == SHT_STRTAB &&
+			 !strcmp(data.sec_name, ".strtab"))
+			ret = bpf_fetch_strtab(ctx, i, &data);
+		if (ret < 0) {
+			fprintf(stderr, "Error parsing section %d! Perhaps check with readelf -a?\n",
+				i);
+			return ret;
+		}
+	}
+
+	if (bpf_has_map_data(ctx)) {
+		ret = bpf_fetch_maps_end(ctx);
+		if (ret < 0) {
+			fprintf(stderr, "Error fixing up map structure, incompatible struct bpf_elf_map used?\n");
+			return ret;
+		}
+	}
+
+	return ret;
+}
+
+int iproute2_get_root_path(char *root_path, size_t len)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	int ret = 0;
+
+	snprintf(root_path, len, "%s/%s",
+		 bpf_get_work_dir(ctx->type), BPF_DIR_GLOBALS);
+
+	ret = mkdir(root_path, S_IRWXU);
+	if (ret && errno != EEXIST) {
+		fprintf(stderr, "mkdir %s failed: %s\n", root_path, strerror(errno));
+		return ret;
+	}
+
+	return 0;
+}
+
+bool iproute2_is_pin_map(const char *libbpf_map_name, char *pathname)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *map_name, *tmp;
+	unsigned int pinning;
+	int i, ret = 0;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		if (ctx->maps[i].pinning == PIN_OBJECT_NS &&
+		    ctx->noafalg) {
+			fprintf(stderr, "Missing kernel AF_ALG support for PIN_OBJECT_NS!\n");
+			return false;
+		}
+
+		map_name = bpf_map_fetch_name(ctx, i);
+		if (!map_name) {
+			return false;
+		}
+
+		if (strcmp(libbpf_map_name, map_name))
+			continue;
+
+		pinning = ctx->maps[i].pinning;
+
+		if (bpf_no_pinning(ctx, pinning) || !bpf_get_work_dir(ctx->type))
+			return false;
+
+		if (pinning == PIN_OBJECT_NS)
+			ret = bpf_make_obj_path(ctx);
+		else if ((tmp = bpf_custom_pinning(ctx, pinning)))
+			ret = bpf_make_custom_path(ctx, tmp);
+		if (ret < 0)
+			return false;
+
+		bpf_make_pathname(pathname, PATH_MAX, map_name, ctx, pinning);
+
+		return true;
+	}
+
+	return false;
+}
+
+bool iproute2_is_map_in_map(const char *libbpf_map_name, struct bpf_elf_map *imap,
+			    struct bpf_elf_map *omap, char *omap_name)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *inner_map_name, *outer_map_name;
+	int i, j;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		inner_map_name = bpf_map_fetch_name(ctx, i);
+		if (!inner_map_name) {
+			return false;
+		}
+
+		if (strcmp(libbpf_map_name, inner_map_name))
+			continue;
+
+		if (!ctx->maps[i].id ||
+		    ctx->maps[i].inner_id ||
+		    ctx->maps[i].inner_idx == -1)
+			continue;
+
+		*imap = ctx->maps[i];
+
+		for (j = 0; j < ctx->map_num; j++) {
+			if (!bpf_is_map_in_map_type(&ctx->maps[j]))
+				continue;
+			if (ctx->maps[j].inner_id != ctx->maps[i].id)
+				continue;
+
+			*omap = ctx->maps[j];
+			outer_map_name = bpf_map_fetch_name(ctx, j);
+			memcpy(omap_name, outer_map_name, strlen(outer_map_name) + 1);
+
+			return true;
+		}
+	}
+
+	return false;
+}
+
+int iproute2_find_map_name_by_id(unsigned int map_id, char *name)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *map_name;
+	int i, idx = -1;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		if (ctx->maps[i].id == map_id &&
+		    ctx->maps[i].type == BPF_MAP_TYPE_PROG_ARRAY) {
+			idx = i;
+			break;
+		}
+	}
+
+	if (idx < 0)
+		return -1;
+
+	map_name = bpf_map_fetch_name(ctx, idx);
+	if (!map_name)
+		return -1;
+
+	memcpy(name, map_name, strlen(map_name) + 1);
+	return 0;
+}
+#endif /* HAVE_LIBBPF */
diff --git a/lib/bpf_libbpf.c b/lib/bpf_libbpf.c
new file mode 100644
index 00000000..9c29abc1
--- /dev/null
+++ b/lib/bpf_libbpf.c
@@ -0,0 +1,332 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <errno.h>
+#include <fcntl.h>
+
+#include <libelf.h>
+#include <gelf.h>
+
+#include <bpf/libbpf.h>
+#include <bpf/bpf.h>
+
+#include "bpf_util.h"
+
+static int verbose_print(enum libbpf_print_level level, const char *format, va_list args)
+{
+	return vfprintf(stderr, format, args);
+}
+
+static int silent_print(enum libbpf_print_level level, const char *format, va_list args)
+{
+	if (level > LIBBPF_WARN)
+		return 0;
+
+	/* Skip warning from bpf_object__init_user_maps() for legacy maps */
+	if (strstr(format, "has unrecognized, non-zero options"))
+		return 0;
+
+	return vfprintf(stderr, format, args);
+}
+
+static int create_map(const char *name, struct bpf_elf_map *map,
+		      __u32 ifindex, int inner_fd)
+{
+	struct bpf_create_map_attr map_attr = {};
+
+	map_attr.name = name;
+	map_attr.map_type = map->type;
+	map_attr.map_flags = map->flags;
+	map_attr.key_size = map->size_key;
+	map_attr.value_size = map->size_value;
+	map_attr.max_entries = map->max_elem;
+	map_attr.map_ifindex = ifindex;
+	map_attr.inner_map_fd = inner_fd;
+
+	return bpf_create_map_xattr(&map_attr);
+}
+
+static int create_map_in_map(struct bpf_object *obj, struct bpf_map *map,
+			     struct bpf_elf_map *elf_map, int inner_fd,
+			     bool *reuse_pin_map)
+{
+	char pathname[PATH_MAX];
+	const char *map_name;
+	bool pin_map = false;
+	int map_fd, ret = 0;
+
+	map_name = bpf_map__name(map);
+
+	if (iproute2_is_pin_map(map_name, pathname)) {
+		pin_map = true;
+
+		/* Check if there already has a pinned map */
+		map_fd = bpf_obj_get(pathname);
+		if (map_fd > 0) {
+			if (reuse_pin_map)
+				*reuse_pin_map = true;
+			close(map_fd);
+			return bpf_map__set_pin_path(map, pathname);
+		}
+	}
+
+	map_fd = create_map(map_name, elf_map, bpf_map__ifindex(map), inner_fd);
+	if (map_fd < 0) {
+		fprintf(stderr, "create map %s failed\n", map_name);
+		return map_fd;
+	}
+
+	ret = bpf_map__reuse_fd(map, map_fd);
+	if (ret < 0) {
+		fprintf(stderr, "map %s reuse fd failed\n", map_name);
+		goto err_out;
+	}
+
+	if (pin_map) {
+		ret = bpf_map__set_pin_path(map, pathname);
+		if (ret < 0)
+			goto err_out;
+	}
+
+	return 0;
+err_out:
+	close(map_fd);
+	return ret;
+}
+
+static int
+handle_legacy_map_in_map(struct bpf_object *obj, struct bpf_map *inner_map,
+			 const char *inner_map_name)
+{
+	int inner_fd, outer_fd, inner_idx, ret = 0;
+	struct bpf_elf_map imap, omap;
+	struct bpf_map *outer_map;
+	/* What's the size limit of map name? */
+	char outer_map_name[128];
+	bool reuse_pin_map = false;
+
+	/* Deal with map-in-map */
+	if (iproute2_is_map_in_map(inner_map_name, &imap, &omap, outer_map_name)) {
+		ret = create_map_in_map(obj, inner_map, &imap, -1, NULL);
+		if (ret < 0)
+			return ret;
+
+		inner_fd = bpf_map__fd(inner_map);
+		outer_map = bpf_object__find_map_by_name(obj, outer_map_name);
+		ret = create_map_in_map(obj, outer_map, &omap, inner_fd, &reuse_pin_map);
+		if (ret < 0)
+			return ret;
+
+		if (!reuse_pin_map) {
+			inner_idx = imap.inner_idx;
+			outer_fd = bpf_map__fd(outer_map);
+			ret = bpf_map_update_elem(outer_fd, &inner_idx, &inner_fd, 0);
+			if (ret < 0)
+				fprintf(stderr, "Cannot update inner_idx into outer_map\n");
+		}
+	}
+
+	return ret;
+}
+
+static int find_legacy_tail_calls(struct bpf_program *prog, struct bpf_object *obj)
+{
+	unsigned int map_id, key_id;
+	const char *sec_name;
+	struct bpf_map *map;
+	char map_name[128];
+	int ret;
+
+	/* Handle iproute2 tail call */
+	sec_name = bpf_program__section_name(prog);
+	ret = sscanf(sec_name, "%i/%i", &map_id, &key_id);
+	if (ret != 2)
+		return -1;
+
+	ret = iproute2_find_map_name_by_id(map_id, map_name);
+	if (ret < 0) {
+		fprintf(stderr, "unable to find map id %u for tail call\n", map_id);
+		return ret;
+	}
+
+	map = bpf_object__find_map_by_name(obj, map_name);
+	if (!map)
+		return -1;
+
+	/* Save the map here for later updating */
+	bpf_program__set_priv(prog, map, NULL);
+
+	return 0;
+}
+
+static int update_legacy_tail_call_maps(struct bpf_object *obj)
+{
+	int prog_fd, map_fd, ret = 0;
+	unsigned int map_id, key_id;
+	struct bpf_program *prog;
+	const char *sec_name;
+	struct bpf_map *map;
+
+	bpf_object__for_each_program(prog, obj) {
+		map = bpf_program__priv(prog);
+		if (!map)
+			continue;
+
+		prog_fd = bpf_program__fd(prog);
+		if (prog_fd < 0)
+			continue;
+
+		sec_name = bpf_program__section_name(prog);
+		ret = sscanf(sec_name, "%i/%i", &map_id, &key_id);
+		if (ret != 2)
+			continue;
+
+		map_fd = bpf_map__fd(map);
+		ret = bpf_map_update_elem(map_fd, &key_id, &prog_fd, 0);
+		if (ret < 0) {
+			fprintf(stderr, "Cannot update map key for tail call!\n");
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+static int handle_legacy_maps(struct bpf_object *obj)
+{
+	char pathname[PATH_MAX];
+	struct bpf_map *map;
+	const char *map_name;
+	int map_fd, ret = 0;
+
+	bpf_object__for_each_map(map, obj) {
+		map_name = bpf_map__name(map);
+
+		ret = handle_legacy_map_in_map(obj, map, map_name);
+		if (ret)
+			return ret;
+
+		/* If it is a iproute2 legacy pin maps, just set pin path
+		 * and let bpf_object__load() to deal with the map creation.
+		 * We need to ignore map-in-maps which have pinned maps manually
+		 */
+		map_fd = bpf_map__fd(map);
+		if (map_fd < 0 && iproute2_is_pin_map(map_name, pathname)) {
+			ret = bpf_map__set_pin_path(map, pathname);
+			if (ret) {
+				fprintf(stderr, "map '%s': couldn't set pin path.\n", map_name);
+				break;
+			}
+		}
+
+	}
+
+	return ret;
+}
+
+static int load_bpf_object(struct bpf_cfg_in *cfg)
+{
+	struct bpf_program *p, *prog = NULL;
+	struct bpf_object *obj;
+	char root_path[PATH_MAX];
+	struct bpf_map *map;
+	int prog_fd, ret = 0;
+
+	ret = iproute2_get_root_path(root_path, PATH_MAX);
+	if (ret)
+		return ret;
+
+	DECLARE_LIBBPF_OPTS(bpf_object_open_opts, open_opts,
+			.relaxed_maps = true,
+			.pin_root_path = root_path,
+	);
+
+	obj = bpf_object__open_file(cfg->object, &open_opts);
+	if (libbpf_get_error(obj)) {
+		fprintf(stderr, "ERROR: opening BPF object file failed\n");
+		return -ENOENT;
+	}
+
+	bpf_object__for_each_program(p, obj) {
+		/* Only load the programs that will either be subsequently
+		 * attached or inserted into a tail call map */
+		if (find_legacy_tail_calls(p, obj) < 0 && cfg->section &&
+		    strcmp(bpf_program__section_name(p), cfg->section)) {
+			ret = bpf_program__set_autoload(p, false);
+			if (ret)
+				return -EINVAL;
+			continue;
+		}
+
+		bpf_program__set_type(p, cfg->type);
+		bpf_program__set_ifindex(p, cfg->ifindex);
+		if (!prog)
+			prog = p;
+	}
+
+	bpf_object__for_each_map(map, obj) {
+		if (!bpf_map__is_offload_neutral(map))
+			bpf_map__set_ifindex(map, cfg->ifindex);
+	}
+
+	if (!prog) {
+		fprintf(stderr, "object file doesn't contain sec %s\n", cfg->section);
+		return -ENOENT;
+	}
+
+	/* Handle iproute2 legacy pin maps and map-in-maps */
+	ret = handle_legacy_maps(obj);
+	if (ret)
+		goto unload_obj;
+
+	ret = bpf_object__load(obj);
+	if (ret)
+		goto unload_obj;
+
+	ret = update_legacy_tail_call_maps(obj);
+	if (ret)
+		goto unload_obj;
+
+	prog_fd = fcntl(bpf_program__fd(prog), F_DUPFD_CLOEXEC, 1);
+	if (prog_fd < 0)
+		ret = -errno;
+	else
+		cfg->prog_fd = prog_fd;
+
+unload_obj:
+	/* Close obj as we don't need it */
+	bpf_object__close(obj);
+	return ret;
+}
+
+/* Load ebpf and return prog fd */
+int iproute2_load_libbpf(struct bpf_cfg_in *cfg)
+{
+	int ret = 0;
+
+	if (cfg->verbose)
+		libbpf_set_print(verbose_print);
+	else
+		libbpf_set_print(silent_print);
+
+	ret = iproute2_bpf_elf_ctx_init(cfg);
+	if (ret < 0) {
+		fprintf(stderr, "Cannot initialize ELF context!\n");
+		return ret;
+	}
+
+	ret = iproute2_bpf_fetch_ancillary();
+	if (ret < 0) {
+		fprintf(stderr, "Error fetching ELF ancillary data!\n");
+		return ret;
+	}
+
+	ret = load_bpf_object(cfg);
+	if (ret)
+		return ret;
+
+	return cfg->prog_fd;
+}
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv2 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder
  2020-10-28 13:25 ` [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
                     ` (2 preceding siblings ...)
  2020-10-28 13:25   ` [PATCHv2 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
@ 2020-10-28 13:25   ` Hangbin Liu
  2020-10-28 13:25   ` [PATCHv2 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-28 13:25 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 examples/bpf/README                        | 14 +++++++++-----
 examples/bpf/{ => legacy}/bpf_cyclic.c     |  2 +-
 examples/bpf/{ => legacy}/bpf_graft.c      |  2 +-
 examples/bpf/{ => legacy}/bpf_map_in_map.c |  2 +-
 examples/bpf/{ => legacy}/bpf_shared.c     |  2 +-
 examples/bpf/{ => legacy}/bpf_tailcall.c   |  2 +-
 6 files changed, 14 insertions(+), 10 deletions(-)
 rename examples/bpf/{ => legacy}/bpf_cyclic.c (95%)
 rename examples/bpf/{ => legacy}/bpf_graft.c (97%)
 rename examples/bpf/{ => legacy}/bpf_map_in_map.c (96%)
 rename examples/bpf/{ => legacy}/bpf_shared.c (97%)
 rename examples/bpf/{ => legacy}/bpf_tailcall.c (98%)

diff --git a/examples/bpf/README b/examples/bpf/README
index 1bbdda3f..732bcc83 100644
--- a/examples/bpf/README
+++ b/examples/bpf/README
@@ -1,8 +1,12 @@
 eBPF toy code examples (running in kernel) to familiarize yourself
 with syntax and features:
 
- - bpf_shared.c		-> Ingress/egress map sharing example
- - bpf_tailcall.c	-> Using tail call chains
- - bpf_cyclic.c		-> Simple cycle as tail calls
- - bpf_graft.c		-> Demo on altering runtime behaviour
- - bpf_map_in_map.c     -> Using map in map example
+ - legacy/bpf_shared.c		-> Ingress/egress map sharing example
+ - legacy/bpf_tailcall.c	-> Using tail call chains
+ - legacy/bpf_cyclic.c		-> Simple cycle as tail calls
+ - legacy/bpf_graft.c		-> Demo on altering runtime behaviour
+ - legacy/bpf_map_in_map.c	-> Using map in map example
+
+Note: Users should use new BTF way to defined the maps, the examples
+in legacy folder which is using struct bpf_elf_map defined maps is not
+recommanded.
diff --git a/examples/bpf/bpf_cyclic.c b/examples/bpf/legacy/bpf_cyclic.c
similarity index 95%
rename from examples/bpf/bpf_cyclic.c
rename to examples/bpf/legacy/bpf_cyclic.c
index 11d1c061..33590730 100644
--- a/examples/bpf/bpf_cyclic.c
+++ b/examples/bpf/legacy/bpf_cyclic.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* Cyclic dependency example to test the kernel's runtime upper
  * bound on loops. Also demonstrates on how to use direct-actions,
diff --git a/examples/bpf/bpf_graft.c b/examples/bpf/legacy/bpf_graft.c
similarity index 97%
rename from examples/bpf/bpf_graft.c
rename to examples/bpf/legacy/bpf_graft.c
index 07113d4a..f4c920cc 100644
--- a/examples/bpf/bpf_graft.c
+++ b/examples/bpf/legacy/bpf_graft.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* This example demonstrates how classifier run-time behaviour
  * can be altered with tail calls. We start out with an empty
diff --git a/examples/bpf/bpf_map_in_map.c b/examples/bpf/legacy/bpf_map_in_map.c
similarity index 96%
rename from examples/bpf/bpf_map_in_map.c
rename to examples/bpf/legacy/bpf_map_in_map.c
index ff0e623a..575f8812 100644
--- a/examples/bpf/bpf_map_in_map.c
+++ b/examples/bpf/legacy/bpf_map_in_map.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 #define MAP_INNER_ID	42
 
diff --git a/examples/bpf/bpf_shared.c b/examples/bpf/legacy/bpf_shared.c
similarity index 97%
rename from examples/bpf/bpf_shared.c
rename to examples/bpf/legacy/bpf_shared.c
index 21fe6f1e..05b2b9ef 100644
--- a/examples/bpf/bpf_shared.c
+++ b/examples/bpf/legacy/bpf_shared.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* Minimal, stand-alone toy map pinning example:
  *
diff --git a/examples/bpf/bpf_tailcall.c b/examples/bpf/legacy/bpf_tailcall.c
similarity index 98%
rename from examples/bpf/bpf_tailcall.c
rename to examples/bpf/legacy/bpf_tailcall.c
index 161eb606..8ebc554c 100644
--- a/examples/bpf/bpf_tailcall.c
+++ b/examples/bpf/legacy/bpf_tailcall.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 #define ENTRY_INIT	3
 #define ENTRY_0		0
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv2 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps
  2020-10-28 13:25 ` [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
                     ` (3 preceding siblings ...)
  2020-10-28 13:25   ` [PATCHv2 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
@ 2020-10-28 13:25   ` Hangbin Liu
  2020-10-28 21:17   ` [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support Alexei Starovoitov
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-28 13:25 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

Users should try use the new BTF defined maps instead of struct
bpf_elf_map defined maps. The tail call examples are not added yet
as libbpf doesn't currently support declaratively populating tail call
maps.

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 examples/bpf/README           |  6 ++++
 examples/bpf/bpf_graft.c      | 66 +++++++++++++++++++++++++++++++++++
 examples/bpf/bpf_map_in_map.c | 55 +++++++++++++++++++++++++++++
 examples/bpf/bpf_shared.c     | 53 ++++++++++++++++++++++++++++
 include/bpf_api.h             | 13 +++++++
 5 files changed, 193 insertions(+)
 create mode 100644 examples/bpf/bpf_graft.c
 create mode 100644 examples/bpf/bpf_map_in_map.c
 create mode 100644 examples/bpf/bpf_shared.c

diff --git a/examples/bpf/README b/examples/bpf/README
index 732bcc83..b7261191 100644
--- a/examples/bpf/README
+++ b/examples/bpf/README
@@ -1,6 +1,12 @@
 eBPF toy code examples (running in kernel) to familiarize yourself
 with syntax and features:
 
+- BTF defined map examples
+ - bpf_graft.c		-> Demo on altering runtime behaviour
+ - bpf_shared.c 	-> Ingress/egress map sharing example
+ - bpf_map_in_map.c	-> Using map in map example
+
+- legacy struct bpf_elf_map defined map examples
  - legacy/bpf_shared.c		-> Ingress/egress map sharing example
  - legacy/bpf_tailcall.c	-> Using tail call chains
  - legacy/bpf_cyclic.c		-> Simple cycle as tail calls
diff --git a/examples/bpf/bpf_graft.c b/examples/bpf/bpf_graft.c
new file mode 100644
index 00000000..8066dcce
--- /dev/null
+++ b/examples/bpf/bpf_graft.c
@@ -0,0 +1,66 @@
+#include "../../include/bpf_api.h"
+
+/* This example demonstrates how classifier run-time behaviour
+ * can be altered with tail calls. We start out with an empty
+ * jmp_tc array, then add section aaa to the array slot 0, and
+ * later on atomically replace it with section bbb. Note that
+ * as shown in other examples, the tc loader can prepopulate
+ * tail called sections, here we start out with an empty one
+ * on purpose to show it can also be done this way.
+ *
+ * tc filter add dev foo parent ffff: bpf obj graft.o
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-20229 [001] ..s. 138993.003923: : fallthrough
+ *   <idle>-0            [001] ..s. 138993.202265: : fallthrough
+ *   Socket Thread-20229 [001] ..s. 138994.004149: : fallthrough
+ *   [...]
+ *
+ * tc exec bpf graft m:globals/jmp_tc key 0 obj graft.o sec aaa
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-19818 [002] ..s. 139012.053587: : aaa
+ *   <idle>-0            [002] ..s. 139012.172359: : aaa
+ *   Socket Thread-19818 [001] ..s. 139012.173556: : aaa
+ *   [...]
+ *
+ * tc exec bpf graft m:globals/jmp_tc key 0 obj graft.o sec bbb
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-19818 [002] ..s. 139022.102967: : bbb
+ *   <idle>-0            [002] ..s. 139022.155640: : bbb
+ *   Socket Thread-19818 [001] ..s. 139022.156730: : bbb
+ *   [...]
+ */
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);
+} jmp_tc __section(".maps");
+
+__section("aaa")
+int cls_aaa(struct __sk_buff *skb)
+{
+	printt("aaa\n");
+	return TC_H_MAKE(1, 42);
+}
+
+__section("bbb")
+int cls_bbb(struct __sk_buff *skb)
+{
+	printt("bbb\n");
+	return TC_H_MAKE(1, 43);
+}
+
+__section_cls_entry
+int cls_entry(struct __sk_buff *skb)
+{
+	tail_call(skb, &jmp_tc, 0);
+	printt("fallthrough\n");
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/examples/bpf/bpf_map_in_map.c b/examples/bpf/bpf_map_in_map.c
new file mode 100644
index 00000000..39c86268
--- /dev/null
+++ b/examples/bpf/bpf_map_in_map.c
@@ -0,0 +1,55 @@
+#include "../../include/bpf_api.h"
+
+struct inner_map {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+} map_inner __section(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);
+	__array(values, struct inner_map);
+} map_outer __section(".maps") = {
+	.values = {
+		[0] = &map_inner,
+	},
+};
+
+__section("egress")
+int emain(struct __sk_buff *skb)
+{
+	struct bpf_elf_map *map_inner;
+	int key = 0, *val;
+
+	map_inner = map_lookup_elem(&map_outer, &key);
+	if (map_inner) {
+		val = map_lookup_elem(map_inner, &key);
+		if (val)
+			lock_xadd(val, 1);
+	}
+
+	return BPF_H_DEFAULT;
+}
+
+__section("ingress")
+int imain(struct __sk_buff *skb)
+{
+	struct bpf_elf_map *map_inner;
+	int key = 0, *val;
+
+	map_inner = map_lookup_elem(&map_outer, &key);
+	if (map_inner) {
+		val = map_lookup_elem(map_inner, &key);
+		if (val)
+			printt("map val: %d\n", *val);
+	}
+
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/examples/bpf/bpf_shared.c b/examples/bpf/bpf_shared.c
new file mode 100644
index 00000000..99a332f4
--- /dev/null
+++ b/examples/bpf/bpf_shared.c
@@ -0,0 +1,53 @@
+#include "../../include/bpf_api.h"
+
+/* Minimal, stand-alone toy map pinning example:
+ *
+ * clang -target bpf -O2 [...] -o bpf_shared.o -c bpf_shared.c
+ * tc filter add dev foo parent 1: bpf obj bpf_shared.o sec egress
+ * tc filter add dev foo parent ffff: bpf obj bpf_shared.o sec ingress
+ *
+ * Both classifier will share the very same map instance in this example,
+ * so map content can be accessed from ingress *and* egress side!
+ *
+ * This example has a pinning of PIN_OBJECT_NS, so it's private and
+ * thus shared among various program sections within the object.
+ *
+ * A setting of PIN_GLOBAL_NS would place it into a global namespace,
+ * so that it can be shared among different object files. A setting
+ * of PIN_NONE (= 0) means no sharing, so each tc invocation a new map
+ * instance is being created.
+ */
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);	/* or LIBBPF_PIN_NONE */
+} map_sh __section(".maps");
+
+__section("egress")
+int emain(struct __sk_buff *skb)
+{
+	int key = 0, *val;
+
+	val = map_lookup_elem(&map_sh, &key);
+	if (val)
+		lock_xadd(val, 1);
+
+	return BPF_H_DEFAULT;
+}
+
+__section("ingress")
+int imain(struct __sk_buff *skb)
+{
+	int key = 0, *val;
+
+	val = map_lookup_elem(&map_sh, &key);
+	if (val)
+		printt("map val: %d\n", *val);
+
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/include/bpf_api.h b/include/bpf_api.h
index 89d3488d..82c47089 100644
--- a/include/bpf_api.h
+++ b/include/bpf_api.h
@@ -19,6 +19,19 @@
 
 #include "bpf_elf.h"
 
+/** libbpf pin type. */
+enum libbpf_pin_type {
+	LIBBPF_PIN_NONE,
+	/* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */
+	LIBBPF_PIN_BY_NAME,
+};
+
+/** Type helper macros. */
+
+#define __uint(name, val) int (*name)[val]
+#define __type(name, val) typeof(val) *name
+#define __array(name, val) typeof(val) *name[]
+
 /** Misc macros. */
 
 #ifndef __stringify
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-28 13:25 ` [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
                     ` (4 preceding siblings ...)
  2020-10-28 13:25   ` [PATCHv2 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
@ 2020-10-28 21:17   ` Alexei Starovoitov
  2020-10-28 23:02   ` David Ahern
  2020-10-29 15:11   ` [PATCHv3 " Hangbin Liu
  7 siblings, 0 replies; 167+ messages in thread
From: Alexei Starovoitov @ 2020-10-28 21:17 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: Stephen Hemminger, Daniel Borkmann, David Ahern,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Wed, Oct 28, 2020 at 09:25:24PM +0800, Hangbin Liu wrote:
> This series converts iproute2 to use libbpf for loading and attaching
> BPF programs when it is available. This means that iproute2 will
> correctly process BTF information and support the new-style BTF-defined
> maps, while keeping compatibility with the old internal map definition
> syntax.
> 
> This is achieved by checking for libbpf at './configure' time, and using
> it if available. By default the system libbpf will be used, but static
> linking against a custom libbpf version can be achieved by passing
> LIBBPF_DIR to configure. FORCE_LIBBPF can be set to force configure to
> abort if no suitable libbpf is found (useful for automatic packaging
> that wants to enforce the dependency).
> 
> The old iproute2 bpf code is kept and will be used if no suitable libbpf
> is available. When using libbpf, wrapper code ensures that iproute2 will
> still understand the old map definition format, including populating
> map-in-map and tail call maps before load.
> 
> The examples in bpf/examples are kept, and a separate set of examples
> are added with BTF-based map definitions for those examples where this
> is possible (libbpf doesn't currently support declaratively populating
> tail call maps).

Awesome to see this work continue! Thank you.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-28 13:25 ` [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
                     ` (5 preceding siblings ...)
  2020-10-28 21:17   ` [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support Alexei Starovoitov
@ 2020-10-28 23:02   ` David Ahern
  2020-10-29  2:06     ` Hangbin Liu
  2020-10-29 15:11   ` [PATCHv3 " Hangbin Liu
  7 siblings, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-10-28 23:02 UTC (permalink / raw)
  To: Hangbin Liu, Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On 10/28/20 7:25 AM, Hangbin Liu wrote:
> This series converts iproute2 to use libbpf for loading and attaching
> BPF programs when it is available. This means that iproute2 will
> correctly process BTF information and support the new-style BTF-defined
> maps, while keeping compatibility with the old internal map definition
> syntax.
> 
> This is achieved by checking for libbpf at './configure' time, and using
> it if available. By default the system libbpf will be used, but static
> linking against a custom libbpf version can be achieved by passing
> LIBBPF_DIR to configure. FORCE_LIBBPF can be set to force configure to
> abort if no suitable libbpf is found (useful for automatic packaging
> that wants to enforce the dependency).
> 
> The old iproute2 bpf code is kept and will be used if no suitable libbpf
> is available. When using libbpf, wrapper code ensures that iproute2 will
> still understand the old map definition format, including populating
> map-in-map and tail call maps before load.
> 
> The examples in bpf/examples are kept, and a separate set of examples
> are added with BTF-based map definitions for those examples where this
> is possible (libbpf doesn't currently support declaratively populating
> tail call maps).
> 
> At last, Thanks a lot for Toke's help on this patch set.
> 
> 
> v2:
> a) Remove self defined IS_ERR_OR_NULL and use libbpf_get_error() instead.
> b) Add ipvrf with libbpf support.
> 
> 

fails to compile on Ubuntu 20.10:

root@u2010-sfo3:~/iproute2.git# ./configure
TC schedulers
 ATM	yes
 IPT	using xtables
 IPSET  yes

iptables modules directory: /usr/lib/x86_64-linux-gnu/xtables
libc has setns: yes
SELinux support: yes
libbpf support: yes
ELF support: yes
libmnl support: yes
Berkeley DB: no
need for strlcpy: yes
libcap support: yes

root@u2010-sfo3:~/iproute2.git# make clean

root@u2010-sfo3:~/iproute2.git# make -j 4
...
/usr/bin/ld: ../lib/libutil.a(bpf_libbpf.o): in function `load_bpf_object':
bpf_libbpf.c:(.text+0x3cb): undefined reference to
`bpf_program__section_name'
/usr/bin/ld: bpf_libbpf.c:(.text+0x438): undefined reference to
`bpf_program__section_name'
/usr/bin/ld: bpf_libbpf.c:(.text+0x716): undefined reference to
`bpf_program__section_name'
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:27: ip] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [Makefile:64: all] Error 2



^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-28 23:02   ` David Ahern
@ 2020-10-29  2:06     ` Hangbin Liu
  2020-10-29  2:20       ` David Ahern
                         ` (2 more replies)
  0 siblings, 3 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-29  2:06 UTC (permalink / raw)
  To: David Ahern
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Wed, Oct 28, 2020 at 05:02:34PM -0600, David Ahern wrote:
> fails to compile on Ubuntu 20.10:
> 
> root@u2010-sfo3:~/iproute2.git# ./configure
> TC schedulers
>  ATM	yes
>  IPT	using xtables
>  IPSET  yes
> 
> iptables modules directory: /usr/lib/x86_64-linux-gnu/xtables
> libc has setns: yes
> SELinux support: yes
> libbpf support: yes
> ELF support: yes
> libmnl support: yes
> Berkeley DB: no
> need for strlcpy: yes
> libcap support: yes
> 
> root@u2010-sfo3:~/iproute2.git# make clean
> 
> root@u2010-sfo3:~/iproute2.git# make -j 4
> ...
> /usr/bin/ld: ../lib/libutil.a(bpf_libbpf.o): in function `load_bpf_object':
> bpf_libbpf.c:(.text+0x3cb): undefined reference to
> `bpf_program__section_name'
> /usr/bin/ld: bpf_libbpf.c:(.text+0x438): undefined reference to
> `bpf_program__section_name'
> /usr/bin/ld: bpf_libbpf.c:(.text+0x716): undefined reference to
> `bpf_program__section_name'
> collect2: error: ld returned 1 exit status
> make[1]: *** [Makefile:27: ip] Error 1
> make[1]: *** Waiting for unfinished jobs....
> make: *** [Makefile:64: all] Error 2

You need to update libbpf to latest version.

But this also remind me that I need to add bpf_program__section_name() to
configure checking. I will see if I missed other functions' checking.

Thanks
Hangbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29  2:06     ` Hangbin Liu
@ 2020-10-29  2:20       ` David Ahern
  2020-10-29  2:45         ` Hangbin Liu
  2020-10-29  2:27       ` Andrii Nakryiko
  2020-10-29  2:33       ` Stephen Hemminger
  2 siblings, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-10-29  2:20 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On 10/28/20 8:06 PM, Hangbin Liu wrote:
> On Wed, Oct 28, 2020 at 05:02:34PM -0600, David Ahern wrote:
>> fails to compile on Ubuntu 20.10:
>>
>> root@u2010-sfo3:~/iproute2.git# ./configure
>> TC schedulers
>>  ATM	yes
>>  IPT	using xtables
>>  IPSET  yes
>>
>> iptables modules directory: /usr/lib/x86_64-linux-gnu/xtables
>> libc has setns: yes
>> SELinux support: yes
>> libbpf support: yes
>> ELF support: yes
>> libmnl support: yes
>> Berkeley DB: no
>> need for strlcpy: yes
>> libcap support: yes
>>
>> root@u2010-sfo3:~/iproute2.git# make clean
>>
>> root@u2010-sfo3:~/iproute2.git# make -j 4
>> ...
>> /usr/bin/ld: ../lib/libutil.a(bpf_libbpf.o): in function `load_bpf_object':
>> bpf_libbpf.c:(.text+0x3cb): undefined reference to
>> `bpf_program__section_name'
>> /usr/bin/ld: bpf_libbpf.c:(.text+0x438): undefined reference to
>> `bpf_program__section_name'
>> /usr/bin/ld: bpf_libbpf.c:(.text+0x716): undefined reference to
>> `bpf_program__section_name'
>> collect2: error: ld returned 1 exit status
>> make[1]: *** [Makefile:27: ip] Error 1
>> make[1]: *** Waiting for unfinished jobs....
>> make: *** [Makefile:64: all] Error 2
> 
> You need to update libbpf to latest version.

nope. you need to be able to handle this. Ubuntu 20.10 was just
released, and it has a version of libbpf. If you are going to integrate
libbpf into other packages like iproute2, it needs to just work with
that version.

> 
> But this also remind me that I need to add bpf_program__section_name() to
> configure checking. I will see if I missed other functions' checking.

This is going to be an on-going problem. iproute2 should work with
whatever version of libbpf is installed on that system.


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29  2:06     ` Hangbin Liu
  2020-10-29  2:20       ` David Ahern
@ 2020-10-29  2:27       ` Andrii Nakryiko
  2020-10-29  2:33         ` David Ahern
  2020-10-29  2:34         ` Stephen Hemminger
  2020-10-29  2:33       ` Stephen Hemminger
  2 siblings, 2 replies; 167+ messages in thread
From: Andrii Nakryiko @ 2020-10-29  2:27 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: David Ahern, Stephen Hemminger, Daniel Borkmann,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Wed, Oct 28, 2020 at 7:06 PM Hangbin Liu <haliu@redhat.com> wrote:
>
> On Wed, Oct 28, 2020 at 05:02:34PM -0600, David Ahern wrote:
> > fails to compile on Ubuntu 20.10:
> >
> > root@u2010-sfo3:~/iproute2.git# ./configure
> > TC schedulers
> >  ATM  yes
> >  IPT  using xtables
> >  IPSET  yes
> >
> > iptables modules directory: /usr/lib/x86_64-linux-gnu/xtables
> > libc has setns: yes
> > SELinux support: yes
> > libbpf support: yes
> > ELF support: yes
> > libmnl support: yes
> > Berkeley DB: no
> > need for strlcpy: yes
> > libcap support: yes
> >
> > root@u2010-sfo3:~/iproute2.git# make clean
> >
> > root@u2010-sfo3:~/iproute2.git# make -j 4
> > ...
> > /usr/bin/ld: ../lib/libutil.a(bpf_libbpf.o): in function `load_bpf_object':
> > bpf_libbpf.c:(.text+0x3cb): undefined reference to
> > `bpf_program__section_name'
> > /usr/bin/ld: bpf_libbpf.c:(.text+0x438): undefined reference to
> > `bpf_program__section_name'
> > /usr/bin/ld: bpf_libbpf.c:(.text+0x716): undefined reference to
> > `bpf_program__section_name'
> > collect2: error: ld returned 1 exit status
> > make[1]: *** [Makefile:27: ip] Error 1
> > make[1]: *** Waiting for unfinished jobs....
> > make: *** [Makefile:64: all] Error 2
>
> You need to update libbpf to latest version.

Why not using libbpf from submodule?

>
> But this also remind me that I need to add bpf_program__section_name() to
> configure checking. I will see if I missed other functions' checking.
>
> Thanks
> Hangbin
>

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29  2:06     ` Hangbin Liu
  2020-10-29  2:20       ` David Ahern
  2020-10-29  2:27       ` Andrii Nakryiko
@ 2020-10-29  2:33       ` Stephen Hemminger
  2 siblings, 0 replies; 167+ messages in thread
From: Stephen Hemminger @ 2020-10-29  2:33 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: David Ahern, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Thu, 29 Oct 2020 10:06:37 +0800
Hangbin Liu <haliu@redhat.com> wrote:

> On Wed, Oct 28, 2020 at 05:02:34PM -0600, David Ahern wrote:
> > fails to compile on Ubuntu 20.10:
> > 
> > root@u2010-sfo3:~/iproute2.git# ./configure
> > TC schedulers
> >  ATM	yes
> >  IPT	using xtables
> >  IPSET  yes
> > 
> > iptables modules directory: /usr/lib/x86_64-linux-gnu/xtables
> > libc has setns: yes
> > SELinux support: yes
> > libbpf support: yes
> > ELF support: yes
> > libmnl support: yes
> > Berkeley DB: no
> > need for strlcpy: yes
> > libcap support: yes
> > 
> > root@u2010-sfo3:~/iproute2.git# make clean
> > 
> > root@u2010-sfo3:~/iproute2.git# make -j 4
> > ...
> > /usr/bin/ld: ../lib/libutil.a(bpf_libbpf.o): in function `load_bpf_object':
> > bpf_libbpf.c:(.text+0x3cb): undefined reference to
> > `bpf_program__section_name'
> > /usr/bin/ld: bpf_libbpf.c:(.text+0x438): undefined reference to
> > `bpf_program__section_name'
> > /usr/bin/ld: bpf_libbpf.c:(.text+0x716): undefined reference to
> > `bpf_program__section_name'
> > collect2: error: ld returned 1 exit status
> > make[1]: *** [Makefile:27: ip] Error 1
> > make[1]: *** Waiting for unfinished jobs....
> > make: *** [Makefile:64: all] Error 2  
> 
> You need to update libbpf to latest version.
> 
> But this also remind me that I need to add bpf_program__section_name() to
> configure checking. I will see if I missed other functions' checking.
> 
> Thanks
> Hangbin
> 

Then configure needs to check for this or every distro is going to get real mad...

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29  2:27       ` Andrii Nakryiko
@ 2020-10-29  2:33         ` David Ahern
  2020-10-29  2:46           ` Andrii Nakryiko
  2020-10-29  2:34         ` Stephen Hemminger
  1 sibling, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-10-29  2:33 UTC (permalink / raw)
  To: Andrii Nakryiko, Hangbin Liu
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On 10/28/20 8:27 PM, Andrii Nakryiko wrote:
> On Wed, Oct 28, 2020 at 7:06 PM Hangbin Liu <haliu@redhat.com> wrote:
>>
>> On Wed, Oct 28, 2020 at 05:02:34PM -0600, David Ahern wrote:
>>> fails to compile on Ubuntu 20.10:
>>>
>>> root@u2010-sfo3:~/iproute2.git# ./configure
>>> TC schedulers
>>>  ATM  yes
>>>  IPT  using xtables
>>>  IPSET  yes
>>>
>>> iptables modules directory: /usr/lib/x86_64-linux-gnu/xtables
>>> libc has setns: yes
>>> SELinux support: yes
>>> libbpf support: yes
>>> ELF support: yes
>>> libmnl support: yes
>>> Berkeley DB: no
>>> need for strlcpy: yes
>>> libcap support: yes
>>>
>>> root@u2010-sfo3:~/iproute2.git# make clean
>>>
>>> root@u2010-sfo3:~/iproute2.git# make -j 4
>>> ...
>>> /usr/bin/ld: ../lib/libutil.a(bpf_libbpf.o): in function `load_bpf_object':
>>> bpf_libbpf.c:(.text+0x3cb): undefined reference to
>>> `bpf_program__section_name'
>>> /usr/bin/ld: bpf_libbpf.c:(.text+0x438): undefined reference to
>>> `bpf_program__section_name'
>>> /usr/bin/ld: bpf_libbpf.c:(.text+0x716): undefined reference to
>>> `bpf_program__section_name'
>>> collect2: error: ld returned 1 exit status
>>> make[1]: *** [Makefile:27: ip] Error 1
>>> make[1]: *** Waiting for unfinished jobs....
>>> make: *** [Makefile:64: all] Error 2
>>
>> You need to update libbpf to latest version.
> 
> Why not using libbpf from submodule?
> 

no. iproute2 does not bring in libmnl, libc, ... a submodules. libbpf is
not special. OS versions provide it and it needs to co-exist with packages.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29  2:27       ` Andrii Nakryiko
  2020-10-29  2:33         ` David Ahern
@ 2020-10-29  2:34         ` Stephen Hemminger
  2020-10-29  2:50           ` Andrii Nakryiko
  1 sibling, 1 reply; 167+ messages in thread
From: Stephen Hemminger @ 2020-10-29  2:34 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Hangbin Liu, David Ahern, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Wed, 28 Oct 2020 19:27:20 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Wed, Oct 28, 2020 at 7:06 PM Hangbin Liu <haliu@redhat.com> wrote:
> >
> > On Wed, Oct 28, 2020 at 05:02:34PM -0600, David Ahern wrote:  
> > > fails to compile on Ubuntu 20.10:
> > >
> > > root@u2010-sfo3:~/iproute2.git# ./configure
> > > TC schedulers
> > >  ATM  yes
> > >  IPT  using xtables
> > >  IPSET  yes
> > >
> > > iptables modules directory: /usr/lib/x86_64-linux-gnu/xtables
> > > libc has setns: yes
> > > SELinux support: yes
> > > libbpf support: yes
> > > ELF support: yes
> > > libmnl support: yes
> > > Berkeley DB: no
> > > need for strlcpy: yes
> > > libcap support: yes
> > >
> > > root@u2010-sfo3:~/iproute2.git# make clean
> > >
> > > root@u2010-sfo3:~/iproute2.git# make -j 4
> > > ...
> > > /usr/bin/ld: ../lib/libutil.a(bpf_libbpf.o): in function `load_bpf_object':
> > > bpf_libbpf.c:(.text+0x3cb): undefined reference to
> > > `bpf_program__section_name'
> > > /usr/bin/ld: bpf_libbpf.c:(.text+0x438): undefined reference to
> > > `bpf_program__section_name'
> > > /usr/bin/ld: bpf_libbpf.c:(.text+0x716): undefined reference to
> > > `bpf_program__section_name'
> > > collect2: error: ld returned 1 exit status
> > > make[1]: *** [Makefile:27: ip] Error 1
> > > make[1]: *** Waiting for unfinished jobs....
> > > make: *** [Makefile:64: all] Error 2  
> >
> > You need to update libbpf to latest version.  
> 
> Why not using libbpf from submodule?

Because it makes it harder for people downloading tarballs and distributions.
Iproute2 has worked well by being standalone.

Want to merge libbpf into iproute2?? 
 


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29  2:20       ` David Ahern
@ 2020-10-29  2:45         ` Hangbin Liu
  2020-10-29  3:00           ` David Ahern
  0 siblings, 1 reply; 167+ messages in thread
From: Hangbin Liu @ 2020-10-29  2:45 UTC (permalink / raw)
  To: David Ahern
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Wed, Oct 28, 2020 at 08:20:55PM -0600, David Ahern wrote:
> >> root@u2010-sfo3:~/iproute2.git# make -j 4
> >> ...
> >> /usr/bin/ld: ../lib/libutil.a(bpf_libbpf.o): in function `load_bpf_object':
> >> bpf_libbpf.c:(.text+0x3cb): undefined reference to
> >> `bpf_program__section_name'
> >> /usr/bin/ld: bpf_libbpf.c:(.text+0x438): undefined reference to
> >> `bpf_program__section_name'
> >> /usr/bin/ld: bpf_libbpf.c:(.text+0x716): undefined reference to
> >> `bpf_program__section_name'
> >> collect2: error: ld returned 1 exit status
> >> make[1]: *** [Makefile:27: ip] Error 1
> >> make[1]: *** Waiting for unfinished jobs....
> >> make: *** [Makefile:64: all] Error 2
> > 
> > You need to update libbpf to latest version.
> 
> nope. you need to be able to handle this. Ubuntu 20.10 was just
> released, and it has a version of libbpf. If you are going to integrate
> libbpf into other packages like iproute2, it needs to just work with
> that version.

OK, I can replace bpf_program__section_name by bpf_program__title().
> 
> > 
> > But this also remind me that I need to add bpf_program__section_name() to
> > configure checking. I will see if I missed other functions' checking.
> 
> This is going to be an on-going problem. iproute2 should work with
> whatever version of libbpf is installed on that system.

I will make it works on Ubuntu 20.10, but with whatever version of libbpf?
That looks hard, especially with old libbpf.

Thanks
Hangbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29  2:33         ` David Ahern
@ 2020-10-29  2:46           ` Andrii Nakryiko
  0 siblings, 0 replies; 167+ messages in thread
From: Andrii Nakryiko @ 2020-10-29  2:46 UTC (permalink / raw)
  To: David Ahern
  Cc: Hangbin Liu, Stephen Hemminger, Daniel Borkmann,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Wed, Oct 28, 2020 at 7:33 PM David Ahern <dsahern@gmail.com> wrote:
>
> On 10/28/20 8:27 PM, Andrii Nakryiko wrote:
> > On Wed, Oct 28, 2020 at 7:06 PM Hangbin Liu <haliu@redhat.com> wrote:
> >>
> >> On Wed, Oct 28, 2020 at 05:02:34PM -0600, David Ahern wrote:
> >>> fails to compile on Ubuntu 20.10:
> >>>
> >>> root@u2010-sfo3:~/iproute2.git# ./configure
> >>> TC schedulers
> >>>  ATM  yes
> >>>  IPT  using xtables
> >>>  IPSET  yes
> >>>
> >>> iptables modules directory: /usr/lib/x86_64-linux-gnu/xtables
> >>> libc has setns: yes
> >>> SELinux support: yes
> >>> libbpf support: yes
> >>> ELF support: yes
> >>> libmnl support: yes
> >>> Berkeley DB: no
> >>> need for strlcpy: yes
> >>> libcap support: yes
> >>>
> >>> root@u2010-sfo3:~/iproute2.git# make clean
> >>>
> >>> root@u2010-sfo3:~/iproute2.git# make -j 4
> >>> ...
> >>> /usr/bin/ld: ../lib/libutil.a(bpf_libbpf.o): in function `load_bpf_object':
> >>> bpf_libbpf.c:(.text+0x3cb): undefined reference to
> >>> `bpf_program__section_name'
> >>> /usr/bin/ld: bpf_libbpf.c:(.text+0x438): undefined reference to
> >>> `bpf_program__section_name'
> >>> /usr/bin/ld: bpf_libbpf.c:(.text+0x716): undefined reference to
> >>> `bpf_program__section_name'
> >>> collect2: error: ld returned 1 exit status
> >>> make[1]: *** [Makefile:27: ip] Error 1
> >>> make[1]: *** Waiting for unfinished jobs....
> >>> make: *** [Makefile:64: all] Error 2
> >>
> >> You need to update libbpf to latest version.
> >
> > Why not using libbpf from submodule?
> >
>
> no. iproute2 does not bring in libmnl, libc, ... a submodules. libbpf is
> not special. OS versions provide it and it needs to co-exist with packages.

Not saying libbpf is special, but libbpf is a fast moving target right
now, so it's pragmatic to have it as submodule, because if you'd like
to use some latest functionality, you won't have to add all the
conditional compilation shenanigans to detect every single new API
you'd like to use from libbpf. And libbpf is pretty small to not worry
about saving memory through a shared library.

But it's up to you guys, of course.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29  2:34         ` Stephen Hemminger
@ 2020-10-29  2:50           ` Andrii Nakryiko
  2020-10-29 11:38             ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 167+ messages in thread
From: Andrii Nakryiko @ 2020-10-29  2:50 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Hangbin Liu, David Ahern, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Wed, Oct 28, 2020 at 7:34 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Wed, 28 Oct 2020 19:27:20 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > On Wed, Oct 28, 2020 at 7:06 PM Hangbin Liu <haliu@redhat.com> wrote:
> > >
> > > On Wed, Oct 28, 2020 at 05:02:34PM -0600, David Ahern wrote:
> > > > fails to compile on Ubuntu 20.10:
> > > >
> > > > root@u2010-sfo3:~/iproute2.git# ./configure
> > > > TC schedulers
> > > >  ATM  yes
> > > >  IPT  using xtables
> > > >  IPSET  yes
> > > >
> > > > iptables modules directory: /usr/lib/x86_64-linux-gnu/xtables
> > > > libc has setns: yes
> > > > SELinux support: yes
> > > > libbpf support: yes
> > > > ELF support: yes
> > > > libmnl support: yes
> > > > Berkeley DB: no
> > > > need for strlcpy: yes
> > > > libcap support: yes
> > > >
> > > > root@u2010-sfo3:~/iproute2.git# make clean
> > > >
> > > > root@u2010-sfo3:~/iproute2.git# make -j 4
> > > > ...
> > > > /usr/bin/ld: ../lib/libutil.a(bpf_libbpf.o): in function `load_bpf_object':
> > > > bpf_libbpf.c:(.text+0x3cb): undefined reference to
> > > > `bpf_program__section_name'
> > > > /usr/bin/ld: bpf_libbpf.c:(.text+0x438): undefined reference to
> > > > `bpf_program__section_name'
> > > > /usr/bin/ld: bpf_libbpf.c:(.text+0x716): undefined reference to
> > > > `bpf_program__section_name'
> > > > collect2: error: ld returned 1 exit status
> > > > make[1]: *** [Makefile:27: ip] Error 1
> > > > make[1]: *** Waiting for unfinished jobs....
> > > > make: *** [Makefile:64: all] Error 2
> > >
> > > You need to update libbpf to latest version.
> >
> > Why not using libbpf from submodule?
>
> Because it makes it harder for people downloading tarballs and distributions.

Genuinely curious, making harder how exactly? When packaging sources
as a tarball you'd check out submodules before packaging, right?

> Iproute2 has worked well by being standalone.

Again, maybe I'm missing something, but what makes it not a
standalone, if it is using a submodule? Pahole, for instance, is using
libbpf through submodule and just bypasses all the problems with
detection of features and library availability. I haven't heard anyone
complaining about it made working with pahole harder in any way.

>
> Want to merge libbpf into iproute2??

No... How did you come to this conclusion?..

>
>

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29  2:45         ` Hangbin Liu
@ 2020-10-29  3:00           ` David Ahern
  2020-10-29  3:17             ` Hangbin Liu
  2020-10-29 10:26             ` Hangbin Liu
  0 siblings, 2 replies; 167+ messages in thread
From: David Ahern @ 2020-10-29  3:00 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On 10/28/20 8:45 PM, Hangbin Liu wrote:
> On Wed, Oct 28, 2020 at 08:20:55PM -0600, David Ahern wrote:
>>>> root@u2010-sfo3:~/iproute2.git# make -j 4
>>>> ...
>>>> /usr/bin/ld: ../lib/libutil.a(bpf_libbpf.o): in function `load_bpf_object':
>>>> bpf_libbpf.c:(.text+0x3cb): undefined reference to
>>>> `bpf_program__section_name'
>>>> /usr/bin/ld: bpf_libbpf.c:(.text+0x438): undefined reference to
>>>> `bpf_program__section_name'
>>>> /usr/bin/ld: bpf_libbpf.c:(.text+0x716): undefined reference to
>>>> `bpf_program__section_name'
>>>> collect2: error: ld returned 1 exit status
>>>> make[1]: *** [Makefile:27: ip] Error 1
>>>> make[1]: *** Waiting for unfinished jobs....
>>>> make: *** [Makefile:64: all] Error 2
>>>
>>> You need to update libbpf to latest version.
>>
>> nope. you need to be able to handle this. Ubuntu 20.10 was just
>> released, and it has a version of libbpf. If you are going to integrate
>> libbpf into other packages like iproute2, it needs to just work with
>> that version.
> 
> OK, I can replace bpf_program__section_name by bpf_program__title().

I believe this one can be handled through a compatability check. Looks
the rename / deprecation is fairly recent (78cdb58bdf15f from Sept 2020).


>>
>>>
>>> But this also remind me that I need to add bpf_program__section_name() to
>>> configure checking. I will see if I missed other functions' checking.
>>
>> This is going to be an on-going problem. iproute2 should work with
>> whatever version of libbpf is installed on that system.
> 
> I will make it works on Ubuntu 20.10, but with whatever version of libbpf?
> That looks hard, especially with old libbpf.
> 

I meant what comes with the OS. I believe I read that Fedora 33 was just
released as well. Does it have a version of libbpf? If so, please verify
it compiles and works with that version too. Before committing I will
also verify it compiles and links against a local version of libbpf (top
of tree) just to get a range of versions.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29  3:00           ` David Ahern
@ 2020-10-29  3:17             ` Hangbin Liu
  2020-10-29 10:26             ` Hangbin Liu
  1 sibling, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-29  3:17 UTC (permalink / raw)
  To: David Ahern
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Wed, Oct 28, 2020 at 09:00:41PM -0600, David Ahern wrote:
> >>> You need to update libbpf to latest version.
> >>
> >> nope. you need to be able to handle this. Ubuntu 20.10 was just
> >> released, and it has a version of libbpf. If you are going to integrate
> >> libbpf into other packages like iproute2, it needs to just work with
> >> that version.
> > 
> > OK, I can replace bpf_program__section_name by bpf_program__title().
> 
> I believe this one can be handled through a compatability check. Looks

Do you mean add a check like

#ifdef has_section_name_support
	use bpf_program__section_name;
#else
	use bpf_program__title;
#endif

> the rename / deprecation is fairly recent (78cdb58bdf15f from Sept 2020).

Yeah... As Andrii said, libbpf is in fast moving..

> >>
> >>>
> >>> But this also remind me that I need to add bpf_program__section_name() to
> >>> configure checking. I will see if I missed other functions' checking.
> >>
> >> This is going to be an on-going problem. iproute2 should work with
> >> whatever version of libbpf is installed on that system.
> > 
> > I will make it works on Ubuntu 20.10, but with whatever version of libbpf?
> > That looks hard, especially with old libbpf.
> > 
> 
> I meant what comes with the OS. I believe I read that Fedora 33 was just
> released as well. Does it have a version of libbpf? If so, please verify
> it compiles and works with that version too. Before committing I will
> also verify it compiles and links against a local version of libbpf (top
> of tree) just to get a range of versions.
> 

Yes, it makes sense. I will also check the libbpf on Fedora 33.

Thanks
Hangbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29  3:00           ` David Ahern
  2020-10-29  3:17             ` Hangbin Liu
@ 2020-10-29 10:26             ` Hangbin Liu
  2020-10-29 10:51               ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 167+ messages in thread
From: Hangbin Liu @ 2020-10-29 10:26 UTC (permalink / raw)
  To: David Ahern
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Wed, Oct 28, 2020 at 09:00:41PM -0600, David Ahern wrote:
> >> nope. you need to be able to handle this. Ubuntu 20.10 was just
> >> released, and it has a version of libbpf. If you are going to integrate
> >> libbpf into other packages like iproute2, it needs to just work with
> >> that version.
> > 
> > OK, I can replace bpf_program__section_name by bpf_program__title().
> 
> I believe this one can be handled through a compatability check. Looks
> the rename / deprecation is fairly recent (78cdb58bdf15f from Sept 2020).

Hi David,

I just come up with another way. In configure, build a temp program and update
the function checking every time is not graceful. How about just check the
libbpf version, since libbpf has exported all functions in src/libbpf.map.

Currently, only bpf_program__section_name() is added in 0.2.0, all other
needed functions are supported in 0.1.0.

So in configure, the new check would like:

check_force_libbpf()
{
    # if set FORCE_LIBBPF but no libbpf support, just exist the config
    # process to make sure we don't build without libbpf.
    if [ -n "$FORCE_LIBBPF" ]; then
        echo "FORCE_LIBBPF set, but couldn't find a usable libbpf"
        exit 1
    fi
}

check_libbpf()
{
    if ! ${PKG_CONFIG} libbpf --exists && [ -z "$LIBBPF_DIR" ] ; then
        echo "no"
        check_force_libbpf
        return
    fi

    if [ $(uname -m) == x86_64 ]; then
        local LIBSUBDIR=lib64
    else
        local LIBSUBDIR=lib
    fi

    if [ -n "$LIBBPF_DIR" ]; then
        LIBBPF_CFLAGS="-I${LIBBPF_DIR}/include -L${LIBBPF_DIR}/${LIBSUBDIR}"
        LIBBPF_LDLIBS="${LIBBPF_DIR}/${LIBSUBDIR}/libbpf.a -lz -lelf"
    else
        LIBBPF_CFLAGS=$(${PKG_CONFIG} libbpf --cflags)
        LIBBPF_LDLIBS=$(${PKG_CONFIG} libbpf --libs)
    fi

    if ${PKG_CONFIG} libbpf --atleast-version 0.1.0 || \
        PKG_CONFIG_LIBDIR=${LIBBPF_DIR}/${LIBSUBDIR}/pkgconfig \
	${PKG_CONFIG} libbpf --atleast-version 0.1.0; then
        echo "HAVE_LIBBPF:=y" >>$CONFIG
        echo 'CFLAGS += -DHAVE_LIBBPF ' $LIBBPF_CFLAGS >> $CONFIG
        echo 'LDLIBS += ' $LIBBPF_LDLIBS >>$CONFIG
        echo "yes"
    else
        echo "no"
        check_force_libbpf
	return
    fi

    # bpf_program__title() is deprecated since libbpf 0.2.0, use
    # bpf_program__section_name() instead if we support
    if ${PKG_CONFIG} libbpf --atleast-version 0.2.0 || \
        PKG_CONFIG_LIBDIR=${LIBBPF_DIR}/${LIBSUBDIR}/pkgconfig \
	${PKG_CONFIG} libbpf --atleast-version 0.2.0; then
        echo 'CFLAGS += -DHAVE_LIBBPF_SECTION_NAME ' $LIBBPF_CFLAGS >> $CONFIG
    fi
}

And in lib/bpf_libbpf.c, we add a new helper like:

static const char *get_bpf_program__section_name(const struct bpf_program *prog)
{
#ifdef HAVE_LIBBPF_SECTION_NAME
	return bpf_program__section_name(prog);
#else
	return bpf_program__title(prog, false);
#endif
}

Thanks
Hangbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29 10:26             ` Hangbin Liu
@ 2020-10-29 10:51               ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 167+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-29 10:51 UTC (permalink / raw)
  To: Hangbin Liu, David Ahern
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko

Hangbin Liu <haliu@redhat.com> writes:

> On Wed, Oct 28, 2020 at 09:00:41PM -0600, David Ahern wrote:
>> >> nope. you need to be able to handle this. Ubuntu 20.10 was just
>> >> released, and it has a version of libbpf. If you are going to integrate
>> >> libbpf into other packages like iproute2, it needs to just work with
>> >> that version.
>> > 
>> > OK, I can replace bpf_program__section_name by bpf_program__title().
>> 
>> I believe this one can be handled through a compatability check. Looks
>> the rename / deprecation is fairly recent (78cdb58bdf15f from Sept 2020).
>
> Hi David,
>
> I just come up with another way. In configure, build a temp program and update
> the function checking every time is not graceful. How about just check the
> libbpf version, since libbpf has exported all functions in src/libbpf.map.
>
> Currently, only bpf_program__section_name() is added in 0.2.0, all other
> needed functions are supported in 0.1.0.
>
> So in configure, the new check would like:

Why is this easier than just checking for the function you need? In
xdp-tools configure we have a test like this:

check_perf_consume()
{
    cat >$TMPDIR/libbpftest.c <<EOF
#include <bpf/libbpf.h>
int main(int argc, char **argv) {
    perf_buffer__consume(NULL);
    return 0;
}
EOF
    libbpf_err=$($CC -o $TMPDIR/libbpftest $TMPDIR/libbpftest.c $LIBBPF_CFLAGS $LIBBPF_LDLIBS 2>&1)
    if [ "$?" -eq "0" ]; then
        echo "HAVE_LIBBPF_PERF_BUFFER__CONSUME:=y" >>"$CONFIG"
        echo "yes"
    else
        echo "HAVE_LIBBPF_PERF_BUFFER__CONSUME:=n" >>"$CONFIG"
        echo "no"
    fi
}

Just do that for __section_name(), and you'll also be able to work with
custom libbpf versions using LIBBPF_DIR.

> static const char *get_bpf_program__section_name(const struct bpf_program *prog)
> {
> #ifdef HAVE_LIBBPF_SECTION_NAME
> 	return bpf_program__section_name(prog);
> #else
> 	return bpf_program__title(prog, false);
> #endif
> }

This bit is fine :)

-Toke


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29  2:50           ` Andrii Nakryiko
@ 2020-10-29 11:38             ` Jesper Dangaard Brouer
  2020-10-29 20:30               ` Andrii Nakryiko
  0 siblings, 1 reply; 167+ messages in thread
From: Jesper Dangaard Brouer @ 2020-10-29 11:38 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Stephen Hemminger, Hangbin Liu, David Ahern, Daniel Borkmann,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Networking, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, brouer

On Wed, 28 Oct 2020 19:50:51 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Wed, Oct 28, 2020 at 7:34 PM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > On Wed, 28 Oct 2020 19:27:20 -0700
> > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >  
> > > On Wed, Oct 28, 2020 at 7:06 PM Hangbin Liu <haliu@redhat.com> wrote:  
> > > >
> > > > On Wed, Oct 28, 2020 at 05:02:34PM -0600, David Ahern wrote:  
> > > > > fails to compile on Ubuntu 20.10:
> > > > >
[...]
> > > > You need to update libbpf to latest version.  
> > >
> > > Why not using libbpf from submodule?  
> >
> > Because it makes it harder for people downloading tarballs and distributions.  
> 
> Genuinely curious, making harder how exactly? When packaging sources
> as a tarball you'd check out submodules before packaging, right?
> 
> > Iproute2 has worked well by being standalone.  
> 
> Again, maybe I'm missing something, but what makes it not a
> standalone, if it is using a submodule? Pahole, for instance, is using
> libbpf through submodule and just bypasses all the problems with
> detection of features and library availability. I haven't heard anyone
> complaining about it made working with pahole harder in any way.

I do believe you are missing something.  I guess I can be the relay for
complains, so you will officially hear about this.  Red Hat and Fedora
security is complaining that we are packaging a library (libbpf)
directly into the individual packages.  They complain because in case
of a security issue, they have to figure out to rebuild all the software
packages that are statically compiled with this library.

Maybe you say I don't care that Distro security teams have to do more
work and update more packages.  Then security team says, we expect
customers will use this library right, and if we ship it as a dynamic
loadable (.so) file, then we can update and fix security issues in
library without asking customers to recompile. (Notice the same story
goes if we can update the base-image used by a container).


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 167+ messages in thread

* [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-28 13:25 ` [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
                     ` (6 preceding siblings ...)
  2020-10-28 23:02   ` David Ahern
@ 2020-10-29 15:11   ` Hangbin Liu
  2020-10-29 15:11     ` [PATCHv3 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
                       ` (6 more replies)
  7 siblings, 7 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-29 15:11 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

This series converts iproute2 to use libbpf for loading and attaching
BPF programs when it is available. This means that iproute2 will
correctly process BTF information and support the new-style BTF-defined
maps, while keeping compatibility with the old internal map definition
syntax.

This is achieved by checking for libbpf at './configure' time, and using
it if available. By default the system libbpf will be used, but static
linking against a custom libbpf version can be achieved by passing
LIBBPF_DIR to configure. FORCE_LIBBPF can be set to force configure to
abort if no suitable libbpf is found (useful for automatic packaging
that wants to enforce the dependency).

The old iproute2 bpf code is kept and will be used if no suitable libbpf
is available. When using libbpf, wrapper code ensures that iproute2 will
still understand the old map definition format, including populating
map-in-map and tail call maps before load.

The examples in bpf/examples are kept, and a separate set of examples
are added with BTF-based map definitions for those examples where this
is possible (libbpf doesn't currently support declaratively populating
tail call maps).

At last, Thanks a lot for Toke's help on this patch set.

v3:
a) Update configure to Check function bpf_program__section_name() separately
b) Add a new function get_bpf_program__section_name() to choose whether to
use bpf_program__title() or not.
c) Test build the patch on Fedora 33 with libbpf-0.1.0-1.fc33 and
   libbpf-devel-0.1.0-1.fc33

v2:
a) Remove self defined IS_ERR_OR_NULL and use libbpf_get_error() instead.
b) Add ipvrf with libbpf support.


Here are the test results with patched iproute2:

== setup env
# clang -O2 -Wall -g -target bpf -c bpf_graft.c -o btf_graft.o
# clang -O2 -Wall -g -target bpf -c bpf_map_in_map.c -o btf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c bpf_shared.c -o btf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_cyclic.c -o bpf_cyclic.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_graft.c -o bpf_graft.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_map_in_map.c -o bpf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_shared.c -o bpf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_tailcall.c -o bpf_tailcall.o
# rm -rf /sys/fs/bpf/xdp/globals
# /root/iproute2/ip/ip link add type veth
# /root/iproute2/ip/ip link set veth0 up
# /root/iproute2/ip/ip link set veth1 up


== Load objs
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 4 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
4: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:21-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 5
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 8 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
8: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:23-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 3
        btf_id 10
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 12 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
12: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:25-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 4
        btf_id 15
# /root/iproute2/ip/ip link set veth0 xdp off


== Load objs again to make sure maps could be reused
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 16 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
16: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:27-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 20
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 20 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show                                                                                                                                                                   [236/4518]
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
20: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:29-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 3
        btf_id 25
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 24 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
24: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:31-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 4
        btf_id 30
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals

== Testing if we can load new-style objects (using xdp-filter as an example)
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_all.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 28 tag e29eeda1489a6520 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
28: xdp  name xdpfilt_alw_all  tag e29eeda1489a6520  gpl
        loaded_at 2020-10-22T08:04:33-0400  uid 0
        xlated 2408B  jited 1405B  memlock 4096B  map_ids 9,5,7,8,6
        btf_id 35
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_ip.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 32 tag 2f2b9dbfb786a5a2 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
32: xdp  name xdpfilt_alw_ip  tag 2f2b9dbfb786a5a2  gpl
        loaded_at 2020-10-22T08:04:35-0400  uid 0
        xlated 1336B  jited 778B  memlock 4096B  map_ids 7,8,5
        btf_id 40
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_tcp.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 36 tag 18c1bb25084030bc jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
36: xdp  name xdpfilt_alw_tcp  tag 18c1bb25084030bc  gpl
        loaded_at 2020-10-22T08:04:37-0400  uid 0
        xlated 1128B  jited 690B  memlock 4096B  map_ids 6,5
        btf_id 45
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals


== Load new btf defined maps
# /root/iproute2/ip/ip link set veth0 xdp obj btf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 40 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
40: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:39-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 50
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj btf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 44 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
44: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:41-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 13
        btf_id 55
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj btf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 48 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer  map_sh
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
14: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
48: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:43-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 14
        btf_id 60
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals


== Test load objs by tc
# /root/iproute2/tc/tc qdisc add dev veth0 ingress
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_cyclic.o sec 0xabccba/0
# /root/iproute2/tc/tc filter add dev veth0 parent ffff: bpf obj bpf_graft.o
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/1
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 43/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec classifier
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
# ls /sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d /sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d:
jmp_tc

/sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f:
jmp_ex  jmp_tc  map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc
# bpftool map show
15: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
16: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
17: prog_array  name jmp_ex  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
18: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 2  memlock 4096B
        owner_prog_type sched_cls  owner jited
19: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
52: sched_cls  name cls_loop  tag 3e98a40b04099d36  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 168B  jited 133B  memlock 4096B  map_ids 15
        btf_id 65
56: sched_cls  name cls_entry  tag 0fbb4d9310a6ee26  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 144B  jited 121B  memlock 4096B  map_ids 16
        btf_id 70
60: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 75
66: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 80
72: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 85
78: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 90
79: sched_cls  name cls_case2  tag ee218ff893dca823  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 336B  jited 218B  memlock 4096B  map_ids 19,18
        btf_id 90
80: sched_cls  name cls_exit  tag e78a58140deed387  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 288B  jited 177B  memlock 4096B  map_ids 19
        btf_id 90

I also run the following upstream kselftest with patches iproute2 and
all passed.

test_lwt_ip_encap.sh
test_xdp_redirect.sh
test_tc_redirect.sh
test_xdp_meta.sh
test_xdp_veth.sh
test_xdp_vlan.sh

Hangbin Liu (5):
  configure: add check_libbpf() for later libbpf support
  lib: rename bpf.c to bpf_legacy.c
  lib: add libbpf support
  examples/bpf: move struct bpf_elf_map defined maps to legacy folder
  examples/bpf: add bpf examples with BTF defined maps

 configure                                |  94 +++++++
 examples/bpf/README                      |  18 +-
 examples/bpf/bpf_graft.c                 |  14 +-
 examples/bpf/bpf_map_in_map.c            |  37 ++-
 examples/bpf/bpf_shared.c                |  14 +-
 examples/bpf/{ => legacy}/bpf_cyclic.c   |   2 +-
 examples/bpf/legacy/bpf_graft.c          |  66 +++++
 examples/bpf/legacy/bpf_map_in_map.c     |  56 ++++
 examples/bpf/legacy/bpf_shared.c         |  53 ++++
 examples/bpf/{ => legacy}/bpf_tailcall.c |   2 +-
 include/bpf_api.h                        |  13 +
 include/bpf_util.h                       |  17 +-
 ip/ipvrf.c                               |  19 +-
 lib/Makefile                             |   6 +-
 lib/{bpf.c => bpf_legacy.c}              | 184 +++++++++++-
 lib/bpf_libbpf.c                         | 341 +++++++++++++++++++++++
 16 files changed, 888 insertions(+), 48 deletions(-)
 rename examples/bpf/{ => legacy}/bpf_cyclic.c (95%)
 create mode 100644 examples/bpf/legacy/bpf_graft.c
 create mode 100644 examples/bpf/legacy/bpf_map_in_map.c
 create mode 100644 examples/bpf/legacy/bpf_shared.c
 rename examples/bpf/{ => legacy}/bpf_tailcall.c (98%)
 rename lib/{bpf.c => bpf_legacy.c} (94%)
 create mode 100644 lib/bpf_libbpf.c

-- 
2.25.4


^ permalink raw reply	[flat|nested] 167+ messages in thread

* [PATCHv3 iproute2-next 1/5] configure: add check_libbpf() for later libbpf support
  2020-10-29 15:11   ` [PATCHv3 " Hangbin Liu
@ 2020-10-29 15:11     ` Hangbin Liu
  2020-10-29 15:26       ` Toke Høiland-Jørgensen
  2020-11-02 15:37       ` David Ahern
  2020-10-29 15:11     ` [PATCHv3 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
                       ` (5 subsequent siblings)
  6 siblings, 2 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-29 15:11 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

This patch adds a check to see if we support libbpf. By default the
system libbpf will be used, but static linking against a custom libbpf
version can be achieved by passing LIBBPF_DIR to configure. FORCE_LIBBPF
can be set to force configure to abort if no suitable libbpf is found,
which is useful for automatic packaging that wants to enforce the
dependency.

Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
v3:
Check function bpf_program__section_name() separately and only use it
on higher libbpf version.

v2:
No update
---
 configure | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 94 insertions(+)

diff --git a/configure b/configure
index 307912aa..58a7176e 100755
--- a/configure
+++ b/configure
@@ -240,6 +240,97 @@ check_elf()
     fi
 }
 
+have_libbpf_basic()
+{
+    cat >$TMPDIR/libbpf_test.c <<EOF
+#include <bpf/libbpf.h>
+int main(int argc, char **argv) {
+    bpf_program__set_autoload(NULL, false);
+    bpf_map__ifindex(NULL);
+    bpf_map__set_pin_path(NULL, NULL);
+    bpf_object__open_file(NULL, NULL);
+    return 0;
+}
+EOF
+
+    $CC -o $TMPDIR/libbpf_test $TMPDIR/libbpf_test.c $LIBBPF_CFLAGS $LIBBPF_LDLIBS >/dev/null 2>&1
+    local ret=$?
+
+    rm -f $TMPDIR/libbpf_test.c $TMPDIR/libbpf_test
+    return $ret
+}
+
+have_libbpf_sec_name()
+{
+    cat >$TMPDIR/libbpf_sec_test.c <<EOF
+#include <bpf/libbpf.h>
+int main(int argc, char **argv) {
+    void *ptr;
+    bpf_program__section_name(NULL);
+    return 0;
+}
+EOF
+
+    $CC -o $TMPDIR/libbpf_sec_test $TMPDIR/libbpf_sec_test.c $LIBBPF_CFLAGS $LIBBPF_LDLIBS >/dev/null 2>&1
+    local ret=$?
+
+    rm -f $TMPDIR/libbpf_sec_test.c $TMPDIR/libbpf_sec_test
+    return $ret
+}
+
+check_force_libbpf()
+{
+    # if set FORCE_LIBBPF but no libbpf support, just exist the config
+    # process to make sure we don't build without libbpf.
+    if [ -n "$FORCE_LIBBPF" ]; then
+        echo "FORCE_LIBBPF set, but couldn't find a usable libbpf"
+        exit 1
+    fi
+}
+
+check_libbpf()
+{
+    if ! ${PKG_CONFIG} libbpf --exists && [ -z "$LIBBPF_DIR" ] ; then
+        echo "no"
+        check_force_libbpf
+        return
+    fi
+
+    if [ $(uname -m) == x86_64 ]; then
+        local LIBSUBDIR=lib64
+    else
+        local LIBSUBDIR=lib
+    fi
+
+    if [ -n "$LIBBPF_DIR" ]; then
+        LIBBPF_CFLAGS="-I${LIBBPF_DIR}/include -L${LIBBPF_DIR}/${LIBSUBDIR}"
+        LIBBPF_LDLIBS="${LIBBPF_DIR}/${LIBSUBDIR}/libbpf.a -lz -lelf"
+    else
+        LIBBPF_CFLAGS=$(${PKG_CONFIG} libbpf --cflags)
+        LIBBPF_LDLIBS=$(${PKG_CONFIG} libbpf --libs)
+    fi
+
+    if ! have_libbpf_basic; then
+        echo "no"
+        echo "	libbpf version is too low, please update it to at least 0.1.0"
+        check_force_libbpf
+        return
+    else
+        echo "HAVE_LIBBPF:=y" >>$CONFIG
+        echo 'CFLAGS += -DHAVE_LIBBPF ' $LIBBPF_CFLAGS >> $CONFIG
+        echo 'LDLIBS += ' $LIBBPF_LDLIBS >>$CONFIG
+    fi
+
+    # bpf_program__title() is deprecated since libbpf 0.2.0, use
+    # bpf_program__section_name() instead if we support
+    if have_libbpf_sec_name; then
+        echo "HAVE_LIBBPF_SECTION_NAME:=y" >>$CONFIG
+        echo 'CFLAGS += -DHAVE_LIBBPF_SECTION_NAME ' $LIBBPF_CFLAGS >> $CONFIG
+    fi
+
+    echo "yes"
+}
+
 check_selinux()
 # SELinux is a compile time option in the ss utility
 {
@@ -385,6 +476,9 @@ check_setns
 echo -n "SELinux support: "
 check_selinux
 
+echo -n "libbpf support: "
+check_libbpf
+
 echo -n "ELF support: "
 check_elf
 
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv3 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c
  2020-10-29 15:11   ` [PATCHv3 " Hangbin Liu
  2020-10-29 15:11     ` [PATCHv3 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
@ 2020-10-29 15:11     ` Hangbin Liu
  2020-10-29 15:11     ` [PATCHv3 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
                       ` (4 subsequent siblings)
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-29 15:11 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

This is a preparation for later libbpf support in iproute2. Function
bpf_prog_load() is also renamed to bpf_prog_load_buf() as there is a
conflict with libbpf.

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 include/bpf_util.h          | 6 +++---
 ip/ipvrf.c                  | 4 ++--
 lib/Makefile                | 2 +-
 lib/{bpf.c => bpf_legacy.c} | 6 +++---
 4 files changed, 9 insertions(+), 9 deletions(-)
 rename lib/{bpf.c => bpf_legacy.c} (99%)

diff --git a/include/bpf_util.h b/include/bpf_util.h
index 63db07ca..72d3a32c 100644
--- a/include/bpf_util.h
+++ b/include/bpf_util.h
@@ -274,9 +274,9 @@ int bpf_trace_pipe(void);
 
 void bpf_print_ops(struct rtattr *bpf_ops, __u16 len);
 
-int bpf_prog_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-		  size_t size_insns, const char *license, char *log,
-		  size_t size_log);
+int bpf_prog_load_buf(enum bpf_prog_type type, const struct bpf_insn *insns,
+		      size_t size_insns, const char *license, char *log,
+		      size_t size_log);
 
 int bpf_prog_attach_fd(int prog_fd, int target_fd, enum bpf_attach_type type);
 int bpf_prog_detach_fd(int target_fd, enum bpf_attach_type type);
diff --git a/ip/ipvrf.c b/ip/ipvrf.c
index 28dd8e25..33150ac2 100644
--- a/ip/ipvrf.c
+++ b/ip/ipvrf.c
@@ -256,8 +256,8 @@ static int prog_load(int idx)
 		BPF_EXIT_INSN(),
 	};
 
-	return bpf_prog_load(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
-			     "GPL", bpf_log_buf, sizeof(bpf_log_buf));
+	return bpf_prog_load_buf(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
+			         "GPL", bpf_log_buf, sizeof(bpf_log_buf));
 }
 
 static int vrf_configure_cgroup(const char *path, int ifindex)
diff --git a/lib/Makefile b/lib/Makefile
index 7cba1857..a326fb9f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -5,7 +5,7 @@ CFLAGS += -fPIC
 
 UTILOBJ = utils.o rt_names.o ll_map.o ll_types.o ll_proto.o ll_addr.o \
 	inet_proto.o namespace.o json_writer.o json_print.o \
-	names.o color.o bpf.o exec.o fs.o cg_map.o
+	names.o color.o bpf_legacy.o exec.o fs.o cg_map.o
 
 NLOBJ=libgenl.o libnetlink.o
 
diff --git a/lib/bpf.c b/lib/bpf_legacy.c
similarity index 99%
rename from lib/bpf.c
rename to lib/bpf_legacy.c
index c7d45077..2e6e0602 100644
--- a/lib/bpf.c
+++ b/lib/bpf_legacy.c
@@ -1109,9 +1109,9 @@ static int bpf_prog_load_dev(enum bpf_prog_type type,
 	return bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
 }
 
-int bpf_prog_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-		  size_t size_insns, const char *license, char *log,
-		  size_t size_log)
+int bpf_prog_load_buf(enum bpf_prog_type type, const struct bpf_insn *insns,
+		      size_t size_insns, const char *license, char *log,
+		      size_t size_log)
 {
 	return bpf_prog_load_dev(type, insns, size_insns, license, 0,
 				 log, size_log);
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv3 iproute2-next 3/5] lib: add libbpf support
  2020-10-29 15:11   ` [PATCHv3 " Hangbin Liu
  2020-10-29 15:11     ` [PATCHv3 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
  2020-10-29 15:11     ` [PATCHv3 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
@ 2020-10-29 15:11     ` Hangbin Liu
  2020-11-02 15:41       ` David Ahern
  2020-10-29 15:11     ` [PATCHv3 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
                       ` (3 subsequent siblings)
  6 siblings, 1 reply; 167+ messages in thread
From: Hangbin Liu @ 2020-10-29 15:11 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

This patch converts iproute2 to use libbpf for loading and attaching
BPF programs when it is available, which is started by Toke's
implementation[1]. With libbpf iproute2 could correctly process BTF
information and support the new-style BTF-defined maps, while keeping
compatibility with the old internal map definition syntax.

The old iproute2 bpf code is kept and will be used if no suitable libbpf
is available. When using libbpf, wrapper code in bpf_legacy.c ensures that
iproute2 will still understand the old map definition format, including
populating map-in-map and tail call maps before load.

In bpf_libbpf.c, we init iproute2 ctx and elf info first to check the
legacy bytes. When handling the legacy maps, for map-in-maps, we create
them manually and re-use the fd as they are associated with id/inner_id.
For pin maps, we only set the pin path and let libbp load to handle it.
For tail calls, we find it first and update the element after prog load.

Other maps/progs will be loaded by libbpf directly.

[1] https://lore.kernel.org/bpf/20190820114706.18546-1-toke@redhat.com/

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>

---
v3:
Add a new function get_bpf_program__section_name() to choose whether
use bpf_program__title() or not.

v2:
Remove self defined IS_ERR_OR_NULL and use libbpf_get_error() instead.
Add ipvrf with libbpf support.
---
 include/bpf_util.h |  11 ++
 ip/ipvrf.c         |  15 ++
 lib/Makefile       |   4 +
 lib/bpf_legacy.c   | 178 +++++++++++++++++++++++
 lib/bpf_libbpf.c   | 341 +++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 549 insertions(+)
 create mode 100644 lib/bpf_libbpf.c

diff --git a/include/bpf_util.h b/include/bpf_util.h
index 72d3a32c..e200c107 100644
--- a/include/bpf_util.h
+++ b/include/bpf_util.h
@@ -300,4 +300,15 @@ static inline int bpf_recv_map_fds(const char *path, int *fds,
 	return -1;
 }
 #endif /* HAVE_ELF */
+
+#ifdef HAVE_LIBBPF
+int iproute2_bpf_elf_ctx_init(struct bpf_cfg_in *cfg);
+int iproute2_bpf_fetch_ancillary(void);
+int iproute2_get_root_path(char *root_path, size_t len);
+bool iproute2_is_pin_map(const char *libbpf_map_name, char *pathname);
+bool iproute2_is_map_in_map(const char *libbpf_map_name, struct bpf_elf_map *imap,
+			    struct bpf_elf_map *omap, char *omap_name);
+int iproute2_find_map_name_by_id(unsigned int map_id, char *name);
+int iproute2_load_libbpf(struct bpf_cfg_in *cfg);
+#endif /* HAVE_LIBBPF */
 #endif /* __BPF_UTIL__ */
diff --git a/ip/ipvrf.c b/ip/ipvrf.c
index 33150ac2..afaf1de7 100644
--- a/ip/ipvrf.c
+++ b/ip/ipvrf.c
@@ -28,8 +28,14 @@
 #include "rt_names.h"
 #include "utils.h"
 #include "ip_common.h"
+
 #include "bpf_util.h"
 
+#ifdef HAVE_LIBBPF
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+#endif
+
 #define CGRP_PROC_FILE  "/cgroup.procs"
 
 static struct link_filter vrf_filter;
@@ -256,8 +262,13 @@ static int prog_load(int idx)
 		BPF_EXIT_INSN(),
 	};
 
+#ifdef HAVE_LIBBPF
+	return bpf_load_program(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
+				"GPL", 0, bpf_log_buf, sizeof(bpf_log_buf));
+#else
 	return bpf_prog_load_buf(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
 			         "GPL", bpf_log_buf, sizeof(bpf_log_buf));
+#endif
 }
 
 static int vrf_configure_cgroup(const char *path, int ifindex)
@@ -288,7 +299,11 @@ static int vrf_configure_cgroup(const char *path, int ifindex)
 		goto out;
 	}
 
+#ifdef HAVE_LIBBPF
+	if (bpf_prog_attach(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE, 0)) {
+#else
 	if (bpf_prog_attach_fd(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE)) {
+#endif
 		fprintf(stderr, "Failed to attach prog to cgroup: '%s'\n",
 			strerror(errno));
 		goto out;
diff --git a/lib/Makefile b/lib/Makefile
index a326fb9f..82d6e465 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -7,6 +7,10 @@ UTILOBJ = utils.o rt_names.o ll_map.o ll_types.o ll_proto.o ll_addr.o \
 	inet_proto.o namespace.o json_writer.o json_print.o \
 	names.o color.o bpf_legacy.o exec.o fs.o cg_map.o
 
+ifeq ($(HAVE_LIBBPF),y)
+UTILOBJ += bpf_libbpf.o
+endif
+
 NLOBJ=libgenl.o libnetlink.o
 
 all: libnetlink.a libutil.a
diff --git a/lib/bpf_legacy.c b/lib/bpf_legacy.c
index 2e6e0602..c5ff3e32 100644
--- a/lib/bpf_legacy.c
+++ b/lib/bpf_legacy.c
@@ -940,6 +940,9 @@ static int bpf_do_parse(struct bpf_cfg_in *cfg, const bool *opt_tbl)
 static int bpf_do_load(struct bpf_cfg_in *cfg)
 {
 	if (cfg->mode == EBPF_OBJECT) {
+#ifdef HAVE_LIBBPF
+		return iproute2_load_libbpf(cfg);
+#endif
 		cfg->prog_fd = bpf_obj_open(cfg->object, cfg->type,
 					    cfg->section, cfg->ifindex,
 					    cfg->verbose);
@@ -3165,3 +3168,178 @@ int bpf_recv_map_fds(const char *path, int *fds, struct bpf_map_aux *aux,
 	return ret;
 }
 #endif /* HAVE_ELF */
+
+#ifdef HAVE_LIBBPF
+/* The following functions are wrapper functions for libbpf code to be
+ * compatible with the legacy format. So all the functions have prefix
+ * with iproute2_
+ */
+int iproute2_bpf_elf_ctx_init(struct bpf_cfg_in *cfg)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+
+	return bpf_elf_ctx_init(ctx, cfg->object, cfg->type, cfg->ifindex, cfg->verbose);
+}
+
+int iproute2_bpf_fetch_ancillary(void)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	struct bpf_elf_sec_data data;
+	int i, ret = 0;
+
+	for (i = 1; i < ctx->elf_hdr.e_shnum; i++) {
+		ret = bpf_fill_section_data(ctx, i, &data);
+		if (ret < 0)
+			continue;
+
+		if (data.sec_hdr.sh_type == SHT_PROGBITS &&
+		    !strcmp(data.sec_name, ELF_SECTION_MAPS))
+			ret = bpf_fetch_maps_begin(ctx, i, &data);
+		else if (data.sec_hdr.sh_type == SHT_SYMTAB &&
+			 !strcmp(data.sec_name, ".symtab"))
+			ret = bpf_fetch_symtab(ctx, i, &data);
+		else if (data.sec_hdr.sh_type == SHT_STRTAB &&
+			 !strcmp(data.sec_name, ".strtab"))
+			ret = bpf_fetch_strtab(ctx, i, &data);
+		if (ret < 0) {
+			fprintf(stderr, "Error parsing section %d! Perhaps check with readelf -a?\n",
+				i);
+			return ret;
+		}
+	}
+
+	if (bpf_has_map_data(ctx)) {
+		ret = bpf_fetch_maps_end(ctx);
+		if (ret < 0) {
+			fprintf(stderr, "Error fixing up map structure, incompatible struct bpf_elf_map used?\n");
+			return ret;
+		}
+	}
+
+	return ret;
+}
+
+int iproute2_get_root_path(char *root_path, size_t len)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	int ret = 0;
+
+	snprintf(root_path, len, "%s/%s",
+		 bpf_get_work_dir(ctx->type), BPF_DIR_GLOBALS);
+
+	ret = mkdir(root_path, S_IRWXU);
+	if (ret && errno != EEXIST) {
+		fprintf(stderr, "mkdir %s failed: %s\n", root_path, strerror(errno));
+		return ret;
+	}
+
+	return 0;
+}
+
+bool iproute2_is_pin_map(const char *libbpf_map_name, char *pathname)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *map_name, *tmp;
+	unsigned int pinning;
+	int i, ret = 0;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		if (ctx->maps[i].pinning == PIN_OBJECT_NS &&
+		    ctx->noafalg) {
+			fprintf(stderr, "Missing kernel AF_ALG support for PIN_OBJECT_NS!\n");
+			return false;
+		}
+
+		map_name = bpf_map_fetch_name(ctx, i);
+		if (!map_name) {
+			return false;
+		}
+
+		if (strcmp(libbpf_map_name, map_name))
+			continue;
+
+		pinning = ctx->maps[i].pinning;
+
+		if (bpf_no_pinning(ctx, pinning) || !bpf_get_work_dir(ctx->type))
+			return false;
+
+		if (pinning == PIN_OBJECT_NS)
+			ret = bpf_make_obj_path(ctx);
+		else if ((tmp = bpf_custom_pinning(ctx, pinning)))
+			ret = bpf_make_custom_path(ctx, tmp);
+		if (ret < 0)
+			return false;
+
+		bpf_make_pathname(pathname, PATH_MAX, map_name, ctx, pinning);
+
+		return true;
+	}
+
+	return false;
+}
+
+bool iproute2_is_map_in_map(const char *libbpf_map_name, struct bpf_elf_map *imap,
+			    struct bpf_elf_map *omap, char *omap_name)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *inner_map_name, *outer_map_name;
+	int i, j;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		inner_map_name = bpf_map_fetch_name(ctx, i);
+		if (!inner_map_name) {
+			return false;
+		}
+
+		if (strcmp(libbpf_map_name, inner_map_name))
+			continue;
+
+		if (!ctx->maps[i].id ||
+		    ctx->maps[i].inner_id ||
+		    ctx->maps[i].inner_idx == -1)
+			continue;
+
+		*imap = ctx->maps[i];
+
+		for (j = 0; j < ctx->map_num; j++) {
+			if (!bpf_is_map_in_map_type(&ctx->maps[j]))
+				continue;
+			if (ctx->maps[j].inner_id != ctx->maps[i].id)
+				continue;
+
+			*omap = ctx->maps[j];
+			outer_map_name = bpf_map_fetch_name(ctx, j);
+			memcpy(omap_name, outer_map_name, strlen(outer_map_name) + 1);
+
+			return true;
+		}
+	}
+
+	return false;
+}
+
+int iproute2_find_map_name_by_id(unsigned int map_id, char *name)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *map_name;
+	int i, idx = -1;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		if (ctx->maps[i].id == map_id &&
+		    ctx->maps[i].type == BPF_MAP_TYPE_PROG_ARRAY) {
+			idx = i;
+			break;
+		}
+	}
+
+	if (idx < 0)
+		return -1;
+
+	map_name = bpf_map_fetch_name(ctx, idx);
+	if (!map_name)
+		return -1;
+
+	memcpy(name, map_name, strlen(map_name) + 1);
+	return 0;
+}
+#endif /* HAVE_LIBBPF */
diff --git a/lib/bpf_libbpf.c b/lib/bpf_libbpf.c
new file mode 100644
index 00000000..4fe0bc4b
--- /dev/null
+++ b/lib/bpf_libbpf.c
@@ -0,0 +1,341 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <errno.h>
+#include <fcntl.h>
+
+#include <libelf.h>
+#include <gelf.h>
+
+#include <bpf/libbpf.h>
+#include <bpf/bpf.h>
+
+#include "bpf_util.h"
+
+static int verbose_print(enum libbpf_print_level level, const char *format, va_list args)
+{
+	return vfprintf(stderr, format, args);
+}
+
+static int silent_print(enum libbpf_print_level level, const char *format, va_list args)
+{
+	if (level > LIBBPF_WARN)
+		return 0;
+
+	/* Skip warning from bpf_object__init_user_maps() for legacy maps */
+	if (strstr(format, "has unrecognized, non-zero options"))
+		return 0;
+
+	return vfprintf(stderr, format, args);
+}
+
+static const char *get_bpf_program__section_name(const struct bpf_program *prog)
+{
+#ifdef HAVE_LIBBPF_SECTION_NAME
+	return bpf_program__section_name(prog);
+#else
+	return bpf_program__title(prog, false);
+#endif
+}
+
+static int create_map(const char *name, struct bpf_elf_map *map,
+		      __u32 ifindex, int inner_fd)
+{
+	struct bpf_create_map_attr map_attr = {};
+
+	map_attr.name = name;
+	map_attr.map_type = map->type;
+	map_attr.map_flags = map->flags;
+	map_attr.key_size = map->size_key;
+	map_attr.value_size = map->size_value;
+	map_attr.max_entries = map->max_elem;
+	map_attr.map_ifindex = ifindex;
+	map_attr.inner_map_fd = inner_fd;
+
+	return bpf_create_map_xattr(&map_attr);
+}
+
+static int create_map_in_map(struct bpf_object *obj, struct bpf_map *map,
+			     struct bpf_elf_map *elf_map, int inner_fd,
+			     bool *reuse_pin_map)
+{
+	char pathname[PATH_MAX];
+	const char *map_name;
+	bool pin_map = false;
+	int map_fd, ret = 0;
+
+	map_name = bpf_map__name(map);
+
+	if (iproute2_is_pin_map(map_name, pathname)) {
+		pin_map = true;
+
+		/* Check if there already has a pinned map */
+		map_fd = bpf_obj_get(pathname);
+		if (map_fd > 0) {
+			if (reuse_pin_map)
+				*reuse_pin_map = true;
+			close(map_fd);
+			return bpf_map__set_pin_path(map, pathname);
+		}
+	}
+
+	map_fd = create_map(map_name, elf_map, bpf_map__ifindex(map), inner_fd);
+	if (map_fd < 0) {
+		fprintf(stderr, "create map %s failed\n", map_name);
+		return map_fd;
+	}
+
+	ret = bpf_map__reuse_fd(map, map_fd);
+	if (ret < 0) {
+		fprintf(stderr, "map %s reuse fd failed\n", map_name);
+		goto err_out;
+	}
+
+	if (pin_map) {
+		ret = bpf_map__set_pin_path(map, pathname);
+		if (ret < 0)
+			goto err_out;
+	}
+
+	return 0;
+err_out:
+	close(map_fd);
+	return ret;
+}
+
+static int
+handle_legacy_map_in_map(struct bpf_object *obj, struct bpf_map *inner_map,
+			 const char *inner_map_name)
+{
+	int inner_fd, outer_fd, inner_idx, ret = 0;
+	struct bpf_elf_map imap, omap;
+	struct bpf_map *outer_map;
+	/* What's the size limit of map name? */
+	char outer_map_name[128];
+	bool reuse_pin_map = false;
+
+	/* Deal with map-in-map */
+	if (iproute2_is_map_in_map(inner_map_name, &imap, &omap, outer_map_name)) {
+		ret = create_map_in_map(obj, inner_map, &imap, -1, NULL);
+		if (ret < 0)
+			return ret;
+
+		inner_fd = bpf_map__fd(inner_map);
+		outer_map = bpf_object__find_map_by_name(obj, outer_map_name);
+		ret = create_map_in_map(obj, outer_map, &omap, inner_fd, &reuse_pin_map);
+		if (ret < 0)
+			return ret;
+
+		if (!reuse_pin_map) {
+			inner_idx = imap.inner_idx;
+			outer_fd = bpf_map__fd(outer_map);
+			ret = bpf_map_update_elem(outer_fd, &inner_idx, &inner_fd, 0);
+			if (ret < 0)
+				fprintf(stderr, "Cannot update inner_idx into outer_map\n");
+		}
+	}
+
+	return ret;
+}
+
+static int find_legacy_tail_calls(struct bpf_program *prog, struct bpf_object *obj)
+{
+	unsigned int map_id, key_id;
+	const char *sec_name;
+	struct bpf_map *map;
+	char map_name[128];
+	int ret;
+
+	/* Handle iproute2 tail call */
+	sec_name = get_bpf_program__section_name(prog);
+	ret = sscanf(sec_name, "%i/%i", &map_id, &key_id);
+	if (ret != 2)
+		return -1;
+
+	ret = iproute2_find_map_name_by_id(map_id, map_name);
+	if (ret < 0) {
+		fprintf(stderr, "unable to find map id %u for tail call\n", map_id);
+		return ret;
+	}
+
+	map = bpf_object__find_map_by_name(obj, map_name);
+	if (!map)
+		return -1;
+
+	/* Save the map here for later updating */
+	bpf_program__set_priv(prog, map, NULL);
+
+	return 0;
+}
+
+static int update_legacy_tail_call_maps(struct bpf_object *obj)
+{
+	int prog_fd, map_fd, ret = 0;
+	unsigned int map_id, key_id;
+	struct bpf_program *prog;
+	const char *sec_name;
+	struct bpf_map *map;
+
+	bpf_object__for_each_program(prog, obj) {
+		map = bpf_program__priv(prog);
+		if (!map)
+			continue;
+
+		prog_fd = bpf_program__fd(prog);
+		if (prog_fd < 0)
+			continue;
+
+		sec_name = get_bpf_program__section_name(prog);
+		ret = sscanf(sec_name, "%i/%i", &map_id, &key_id);
+		if (ret != 2)
+			continue;
+
+		map_fd = bpf_map__fd(map);
+		ret = bpf_map_update_elem(map_fd, &key_id, &prog_fd, 0);
+		if (ret < 0) {
+			fprintf(stderr, "Cannot update map key for tail call!\n");
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+static int handle_legacy_maps(struct bpf_object *obj)
+{
+	char pathname[PATH_MAX];
+	struct bpf_map *map;
+	const char *map_name;
+	int map_fd, ret = 0;
+
+	bpf_object__for_each_map(map, obj) {
+		map_name = bpf_map__name(map);
+
+		ret = handle_legacy_map_in_map(obj, map, map_name);
+		if (ret)
+			return ret;
+
+		/* If it is a iproute2 legacy pin maps, just set pin path
+		 * and let bpf_object__load() to deal with the map creation.
+		 * We need to ignore map-in-maps which have pinned maps manually
+		 */
+		map_fd = bpf_map__fd(map);
+		if (map_fd < 0 && iproute2_is_pin_map(map_name, pathname)) {
+			ret = bpf_map__set_pin_path(map, pathname);
+			if (ret) {
+				fprintf(stderr, "map '%s': couldn't set pin path.\n", map_name);
+				break;
+			}
+		}
+
+	}
+
+	return ret;
+}
+
+static int load_bpf_object(struct bpf_cfg_in *cfg)
+{
+	struct bpf_program *p, *prog = NULL;
+	struct bpf_object *obj;
+	char root_path[PATH_MAX];
+	struct bpf_map *map;
+	int prog_fd, ret = 0;
+
+	ret = iproute2_get_root_path(root_path, PATH_MAX);
+	if (ret)
+		return ret;
+
+	DECLARE_LIBBPF_OPTS(bpf_object_open_opts, open_opts,
+			.relaxed_maps = true,
+			.pin_root_path = root_path,
+	);
+
+	obj = bpf_object__open_file(cfg->object, &open_opts);
+	if (libbpf_get_error(obj)) {
+		fprintf(stderr, "ERROR: opening BPF object file failed\n");
+		return -ENOENT;
+	}
+
+	bpf_object__for_each_program(p, obj) {
+		/* Only load the programs that will either be subsequently
+		 * attached or inserted into a tail call map */
+		if (find_legacy_tail_calls(p, obj) < 0 && cfg->section &&
+		    strcmp(get_bpf_program__section_name(p), cfg->section)) {
+			ret = bpf_program__set_autoload(p, false);
+			if (ret)
+				return -EINVAL;
+			continue;
+		}
+
+		bpf_program__set_type(p, cfg->type);
+		bpf_program__set_ifindex(p, cfg->ifindex);
+		if (!prog)
+			prog = p;
+	}
+
+	bpf_object__for_each_map(map, obj) {
+		if (!bpf_map__is_offload_neutral(map))
+			bpf_map__set_ifindex(map, cfg->ifindex);
+	}
+
+	if (!prog) {
+		fprintf(stderr, "object file doesn't contain sec %s\n", cfg->section);
+		return -ENOENT;
+	}
+
+	/* Handle iproute2 legacy pin maps and map-in-maps */
+	ret = handle_legacy_maps(obj);
+	if (ret)
+		goto unload_obj;
+
+	ret = bpf_object__load(obj);
+	if (ret)
+		goto unload_obj;
+
+	ret = update_legacy_tail_call_maps(obj);
+	if (ret)
+		goto unload_obj;
+
+	prog_fd = fcntl(bpf_program__fd(prog), F_DUPFD_CLOEXEC, 1);
+	if (prog_fd < 0)
+		ret = -errno;
+	else
+		cfg->prog_fd = prog_fd;
+
+unload_obj:
+	/* Close obj as we don't need it */
+	bpf_object__close(obj);
+	return ret;
+}
+
+/* Load ebpf and return prog fd */
+int iproute2_load_libbpf(struct bpf_cfg_in *cfg)
+{
+	int ret = 0;
+
+	if (cfg->verbose)
+		libbpf_set_print(verbose_print);
+	else
+		libbpf_set_print(silent_print);
+
+	ret = iproute2_bpf_elf_ctx_init(cfg);
+	if (ret < 0) {
+		fprintf(stderr, "Cannot initialize ELF context!\n");
+		return ret;
+	}
+
+	ret = iproute2_bpf_fetch_ancillary();
+	if (ret < 0) {
+		fprintf(stderr, "Error fetching ELF ancillary data!\n");
+		return ret;
+	}
+
+	ret = load_bpf_object(cfg);
+	if (ret)
+		return ret;
+
+	return cfg->prog_fd;
+}
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv3 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder
  2020-10-29 15:11   ` [PATCHv3 " Hangbin Liu
                       ` (2 preceding siblings ...)
  2020-10-29 15:11     ` [PATCHv3 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
@ 2020-10-29 15:11     ` Hangbin Liu
  2020-10-29 15:11     ` [PATCHv3 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
                       ` (2 subsequent siblings)
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-29 15:11 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 examples/bpf/README                        | 14 +++++++++-----
 examples/bpf/{ => legacy}/bpf_cyclic.c     |  2 +-
 examples/bpf/{ => legacy}/bpf_graft.c      |  2 +-
 examples/bpf/{ => legacy}/bpf_map_in_map.c |  2 +-
 examples/bpf/{ => legacy}/bpf_shared.c     |  2 +-
 examples/bpf/{ => legacy}/bpf_tailcall.c   |  2 +-
 6 files changed, 14 insertions(+), 10 deletions(-)
 rename examples/bpf/{ => legacy}/bpf_cyclic.c (95%)
 rename examples/bpf/{ => legacy}/bpf_graft.c (97%)
 rename examples/bpf/{ => legacy}/bpf_map_in_map.c (96%)
 rename examples/bpf/{ => legacy}/bpf_shared.c (97%)
 rename examples/bpf/{ => legacy}/bpf_tailcall.c (98%)

diff --git a/examples/bpf/README b/examples/bpf/README
index 1bbdda3f..732bcc83 100644
--- a/examples/bpf/README
+++ b/examples/bpf/README
@@ -1,8 +1,12 @@
 eBPF toy code examples (running in kernel) to familiarize yourself
 with syntax and features:
 
- - bpf_shared.c		-> Ingress/egress map sharing example
- - bpf_tailcall.c	-> Using tail call chains
- - bpf_cyclic.c		-> Simple cycle as tail calls
- - bpf_graft.c		-> Demo on altering runtime behaviour
- - bpf_map_in_map.c     -> Using map in map example
+ - legacy/bpf_shared.c		-> Ingress/egress map sharing example
+ - legacy/bpf_tailcall.c	-> Using tail call chains
+ - legacy/bpf_cyclic.c		-> Simple cycle as tail calls
+ - legacy/bpf_graft.c		-> Demo on altering runtime behaviour
+ - legacy/bpf_map_in_map.c	-> Using map in map example
+
+Note: Users should use new BTF way to defined the maps, the examples
+in legacy folder which is using struct bpf_elf_map defined maps is not
+recommanded.
diff --git a/examples/bpf/bpf_cyclic.c b/examples/bpf/legacy/bpf_cyclic.c
similarity index 95%
rename from examples/bpf/bpf_cyclic.c
rename to examples/bpf/legacy/bpf_cyclic.c
index 11d1c061..33590730 100644
--- a/examples/bpf/bpf_cyclic.c
+++ b/examples/bpf/legacy/bpf_cyclic.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* Cyclic dependency example to test the kernel's runtime upper
  * bound on loops. Also demonstrates on how to use direct-actions,
diff --git a/examples/bpf/bpf_graft.c b/examples/bpf/legacy/bpf_graft.c
similarity index 97%
rename from examples/bpf/bpf_graft.c
rename to examples/bpf/legacy/bpf_graft.c
index 07113d4a..f4c920cc 100644
--- a/examples/bpf/bpf_graft.c
+++ b/examples/bpf/legacy/bpf_graft.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* This example demonstrates how classifier run-time behaviour
  * can be altered with tail calls. We start out with an empty
diff --git a/examples/bpf/bpf_map_in_map.c b/examples/bpf/legacy/bpf_map_in_map.c
similarity index 96%
rename from examples/bpf/bpf_map_in_map.c
rename to examples/bpf/legacy/bpf_map_in_map.c
index ff0e623a..575f8812 100644
--- a/examples/bpf/bpf_map_in_map.c
+++ b/examples/bpf/legacy/bpf_map_in_map.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 #define MAP_INNER_ID	42
 
diff --git a/examples/bpf/bpf_shared.c b/examples/bpf/legacy/bpf_shared.c
similarity index 97%
rename from examples/bpf/bpf_shared.c
rename to examples/bpf/legacy/bpf_shared.c
index 21fe6f1e..05b2b9ef 100644
--- a/examples/bpf/bpf_shared.c
+++ b/examples/bpf/legacy/bpf_shared.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* Minimal, stand-alone toy map pinning example:
  *
diff --git a/examples/bpf/bpf_tailcall.c b/examples/bpf/legacy/bpf_tailcall.c
similarity index 98%
rename from examples/bpf/bpf_tailcall.c
rename to examples/bpf/legacy/bpf_tailcall.c
index 161eb606..8ebc554c 100644
--- a/examples/bpf/bpf_tailcall.c
+++ b/examples/bpf/legacy/bpf_tailcall.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 #define ENTRY_INIT	3
 #define ENTRY_0		0
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv3 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps
  2020-10-29 15:11   ` [PATCHv3 " Hangbin Liu
                       ` (3 preceding siblings ...)
  2020-10-29 15:11     ` [PATCHv3 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
@ 2020-10-29 15:11     ` Hangbin Liu
  2020-11-02 15:47     ` [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support David Ahern
  2020-11-09  7:07     ` [PATCHv4 " Hangbin Liu
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-10-29 15:11 UTC (permalink / raw)
  To: Stephen Hemminger, Daniel Borkmann, David Ahern, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, Hangbin Liu

Users should try use the new BTF defined maps instead of struct
bpf_elf_map defined maps. The tail call examples are not added yet
as libbpf doesn't currently support declaratively populating tail call
maps.

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 examples/bpf/README           |  6 ++++
 examples/bpf/bpf_graft.c      | 66 +++++++++++++++++++++++++++++++++++
 examples/bpf/bpf_map_in_map.c | 55 +++++++++++++++++++++++++++++
 examples/bpf/bpf_shared.c     | 53 ++++++++++++++++++++++++++++
 include/bpf_api.h             | 13 +++++++
 5 files changed, 193 insertions(+)
 create mode 100644 examples/bpf/bpf_graft.c
 create mode 100644 examples/bpf/bpf_map_in_map.c
 create mode 100644 examples/bpf/bpf_shared.c

diff --git a/examples/bpf/README b/examples/bpf/README
index 732bcc83..b7261191 100644
--- a/examples/bpf/README
+++ b/examples/bpf/README
@@ -1,6 +1,12 @@
 eBPF toy code examples (running in kernel) to familiarize yourself
 with syntax and features:
 
+- BTF defined map examples
+ - bpf_graft.c		-> Demo on altering runtime behaviour
+ - bpf_shared.c 	-> Ingress/egress map sharing example
+ - bpf_map_in_map.c	-> Using map in map example
+
+- legacy struct bpf_elf_map defined map examples
  - legacy/bpf_shared.c		-> Ingress/egress map sharing example
  - legacy/bpf_tailcall.c	-> Using tail call chains
  - legacy/bpf_cyclic.c		-> Simple cycle as tail calls
diff --git a/examples/bpf/bpf_graft.c b/examples/bpf/bpf_graft.c
new file mode 100644
index 00000000..8066dcce
--- /dev/null
+++ b/examples/bpf/bpf_graft.c
@@ -0,0 +1,66 @@
+#include "../../include/bpf_api.h"
+
+/* This example demonstrates how classifier run-time behaviour
+ * can be altered with tail calls. We start out with an empty
+ * jmp_tc array, then add section aaa to the array slot 0, and
+ * later on atomically replace it with section bbb. Note that
+ * as shown in other examples, the tc loader can prepopulate
+ * tail called sections, here we start out with an empty one
+ * on purpose to show it can also be done this way.
+ *
+ * tc filter add dev foo parent ffff: bpf obj graft.o
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-20229 [001] ..s. 138993.003923: : fallthrough
+ *   <idle>-0            [001] ..s. 138993.202265: : fallthrough
+ *   Socket Thread-20229 [001] ..s. 138994.004149: : fallthrough
+ *   [...]
+ *
+ * tc exec bpf graft m:globals/jmp_tc key 0 obj graft.o sec aaa
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-19818 [002] ..s. 139012.053587: : aaa
+ *   <idle>-0            [002] ..s. 139012.172359: : aaa
+ *   Socket Thread-19818 [001] ..s. 139012.173556: : aaa
+ *   [...]
+ *
+ * tc exec bpf graft m:globals/jmp_tc key 0 obj graft.o sec bbb
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-19818 [002] ..s. 139022.102967: : bbb
+ *   <idle>-0            [002] ..s. 139022.155640: : bbb
+ *   Socket Thread-19818 [001] ..s. 139022.156730: : bbb
+ *   [...]
+ */
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);
+} jmp_tc __section(".maps");
+
+__section("aaa")
+int cls_aaa(struct __sk_buff *skb)
+{
+	printt("aaa\n");
+	return TC_H_MAKE(1, 42);
+}
+
+__section("bbb")
+int cls_bbb(struct __sk_buff *skb)
+{
+	printt("bbb\n");
+	return TC_H_MAKE(1, 43);
+}
+
+__section_cls_entry
+int cls_entry(struct __sk_buff *skb)
+{
+	tail_call(skb, &jmp_tc, 0);
+	printt("fallthrough\n");
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/examples/bpf/bpf_map_in_map.c b/examples/bpf/bpf_map_in_map.c
new file mode 100644
index 00000000..39c86268
--- /dev/null
+++ b/examples/bpf/bpf_map_in_map.c
@@ -0,0 +1,55 @@
+#include "../../include/bpf_api.h"
+
+struct inner_map {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+} map_inner __section(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);
+	__array(values, struct inner_map);
+} map_outer __section(".maps") = {
+	.values = {
+		[0] = &map_inner,
+	},
+};
+
+__section("egress")
+int emain(struct __sk_buff *skb)
+{
+	struct bpf_elf_map *map_inner;
+	int key = 0, *val;
+
+	map_inner = map_lookup_elem(&map_outer, &key);
+	if (map_inner) {
+		val = map_lookup_elem(map_inner, &key);
+		if (val)
+			lock_xadd(val, 1);
+	}
+
+	return BPF_H_DEFAULT;
+}
+
+__section("ingress")
+int imain(struct __sk_buff *skb)
+{
+	struct bpf_elf_map *map_inner;
+	int key = 0, *val;
+
+	map_inner = map_lookup_elem(&map_outer, &key);
+	if (map_inner) {
+		val = map_lookup_elem(map_inner, &key);
+		if (val)
+			printt("map val: %d\n", *val);
+	}
+
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/examples/bpf/bpf_shared.c b/examples/bpf/bpf_shared.c
new file mode 100644
index 00000000..99a332f4
--- /dev/null
+++ b/examples/bpf/bpf_shared.c
@@ -0,0 +1,53 @@
+#include "../../include/bpf_api.h"
+
+/* Minimal, stand-alone toy map pinning example:
+ *
+ * clang -target bpf -O2 [...] -o bpf_shared.o -c bpf_shared.c
+ * tc filter add dev foo parent 1: bpf obj bpf_shared.o sec egress
+ * tc filter add dev foo parent ffff: bpf obj bpf_shared.o sec ingress
+ *
+ * Both classifier will share the very same map instance in this example,
+ * so map content can be accessed from ingress *and* egress side!
+ *
+ * This example has a pinning of PIN_OBJECT_NS, so it's private and
+ * thus shared among various program sections within the object.
+ *
+ * A setting of PIN_GLOBAL_NS would place it into a global namespace,
+ * so that it can be shared among different object files. A setting
+ * of PIN_NONE (= 0) means no sharing, so each tc invocation a new map
+ * instance is being created.
+ */
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);	/* or LIBBPF_PIN_NONE */
+} map_sh __section(".maps");
+
+__section("egress")
+int emain(struct __sk_buff *skb)
+{
+	int key = 0, *val;
+
+	val = map_lookup_elem(&map_sh, &key);
+	if (val)
+		lock_xadd(val, 1);
+
+	return BPF_H_DEFAULT;
+}
+
+__section("ingress")
+int imain(struct __sk_buff *skb)
+{
+	int key = 0, *val;
+
+	val = map_lookup_elem(&map_sh, &key);
+	if (val)
+		printt("map val: %d\n", *val);
+
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/include/bpf_api.h b/include/bpf_api.h
index 89d3488d..82c47089 100644
--- a/include/bpf_api.h
+++ b/include/bpf_api.h
@@ -19,6 +19,19 @@
 
 #include "bpf_elf.h"
 
+/** libbpf pin type. */
+enum libbpf_pin_type {
+	LIBBPF_PIN_NONE,
+	/* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */
+	LIBBPF_PIN_BY_NAME,
+};
+
+/** Type helper macros. */
+
+#define __uint(name, val) int (*name)[val]
+#define __type(name, val) typeof(val) *name
+#define __array(name, val) typeof(val) *name[]
+
 /** Misc macros. */
 
 #ifndef __stringify
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 1/5] configure: add check_libbpf() for later libbpf support
  2020-10-29 15:11     ` [PATCHv3 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
@ 2020-10-29 15:26       ` Toke Høiland-Jørgensen
  2020-11-02 15:37       ` David Ahern
  1 sibling, 0 replies; 167+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-10-29 15:26 UTC (permalink / raw)
  To: Hangbin Liu, Stephen Hemminger, Daniel Borkmann, David Ahern,
	Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Hangbin Liu

Hangbin Liu <haliu@redhat.com> writes:

> This patch adds a check to see if we support libbpf. By default the
> system libbpf will be used, but static linking against a custom libbpf
> version can be achieved by passing LIBBPF_DIR to configure. FORCE_LIBBPF
> can be set to force configure to abort if no suitable libbpf is found,
> which is useful for automatic packaging that wants to enforce the
> dependency.
>
> Signed-off-by: Hangbin Liu <haliu@redhat.com>

With one nit below, feel free to add back my:

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>

> ---
> v3:
> Check function bpf_program__section_name() separately and only use it
> on higher libbpf version.
>
> v2:
> No update
> ---
>  configure | 94 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 94 insertions(+)
>
> diff --git a/configure b/configure
> index 307912aa..58a7176e 100755
> --- a/configure
> +++ b/configure
> @@ -240,6 +240,97 @@ check_elf()
>      fi
>  }
>  
> +have_libbpf_basic()
> +{
> +    cat >$TMPDIR/libbpf_test.c <<EOF
> +#include <bpf/libbpf.h>
> +int main(int argc, char **argv) {
> +    bpf_program__set_autoload(NULL, false);
> +    bpf_map__ifindex(NULL);
> +    bpf_map__set_pin_path(NULL, NULL);
> +    bpf_object__open_file(NULL, NULL);
> +    return 0;
> +}
> +EOF
> +
> +    $CC -o $TMPDIR/libbpf_test $TMPDIR/libbpf_test.c $LIBBPF_CFLAGS $LIBBPF_LDLIBS >/dev/null 2>&1
> +    local ret=$?
> +
> +    rm -f $TMPDIR/libbpf_test.c $TMPDIR/libbpf_test
> +    return $ret
> +}
> +
> +have_libbpf_sec_name()
> +{
> +    cat >$TMPDIR/libbpf_sec_test.c <<EOF
> +#include <bpf/libbpf.h>
> +int main(int argc, char **argv) {
> +    void *ptr;
> +    bpf_program__section_name(NULL);
> +    return 0;
> +}
> +EOF
> +
> +    $CC -o $TMPDIR/libbpf_sec_test $TMPDIR/libbpf_sec_test.c $LIBBPF_CFLAGS $LIBBPF_LDLIBS >/dev/null 2>&1
> +    local ret=$?
> +
> +    rm -f $TMPDIR/libbpf_sec_test.c $TMPDIR/libbpf_sec_test
> +    return $ret
> +}
> +
> +check_force_libbpf()
> +{
> +    # if set FORCE_LIBBPF but no libbpf support, just exist the config
> +    # process to make sure we don't build without libbpf.
> +    if [ -n "$FORCE_LIBBPF" ]; then
> +        echo "FORCE_LIBBPF set, but couldn't find a usable libbpf"
> +        exit 1
> +    fi
> +}
> +
> +check_libbpf()
> +{
> +    if ! ${PKG_CONFIG} libbpf --exists && [ -z "$LIBBPF_DIR" ] ; then
> +        echo "no"
> +        check_force_libbpf
> +        return
> +    fi
> +
> +    if [ $(uname -m) == x86_64 ]; then
> +        local LIBSUBDIR=lib64
> +    else
> +        local LIBSUBDIR=lib
> +    fi
> +
> +    if [ -n "$LIBBPF_DIR" ]; then
> +        LIBBPF_CFLAGS="-I${LIBBPF_DIR}/include -L${LIBBPF_DIR}/${LIBSUBDIR}"
> +        LIBBPF_LDLIBS="${LIBBPF_DIR}/${LIBSUBDIR}/libbpf.a -lz -lelf"
> +    else
> +        LIBBPF_CFLAGS=$(${PKG_CONFIG} libbpf --cflags)
> +        LIBBPF_LDLIBS=$(${PKG_CONFIG} libbpf --libs)
> +    fi
> +
> +    if ! have_libbpf_basic; then
> +        echo "no"
> +        echo "	libbpf version is too low, please update it to at least 0.1.0"
> +        check_force_libbpf
> +        return
> +    else
> +        echo "HAVE_LIBBPF:=y" >>$CONFIG
> +        echo 'CFLAGS += -DHAVE_LIBBPF ' $LIBBPF_CFLAGS >> $CONFIG
> +        echo 'LDLIBS += ' $LIBBPF_LDLIBS >>$CONFIG
> +    fi
> +
> +    # bpf_program__title() is deprecated since libbpf 0.2.0, use
> +    # bpf_program__section_name() instead if we support
> +    if have_libbpf_sec_name; then
> +        echo "HAVE_LIBBPF_SECTION_NAME:=y" >>$CONFIG
> +        echo 'CFLAGS += -DHAVE_LIBBPF_SECTION_NAME ' $LIBBPF_CFLAGS >> $CONFIG

You already added $LIBBPF_CFLAGS above, so with this it ends up being
duplicated, doesn't it?

-Toke


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29 11:38             ` Jesper Dangaard Brouer
@ 2020-10-29 20:30               ` Andrii Nakryiko
  0 siblings, 0 replies; 167+ messages in thread
From: Andrii Nakryiko @ 2020-10-29 20:30 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Stephen Hemminger, Hangbin Liu, David Ahern, Daniel Borkmann,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Networking, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Thu, Oct 29, 2020 at 4:38 AM Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> On Wed, 28 Oct 2020 19:50:51 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > On Wed, Oct 28, 2020 at 7:34 PM Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > >
> > > On Wed, 28 Oct 2020 19:27:20 -0700
> > > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > >
> > > > On Wed, Oct 28, 2020 at 7:06 PM Hangbin Liu <haliu@redhat.com> wrote:
> > > > >
> > > > > On Wed, Oct 28, 2020 at 05:02:34PM -0600, David Ahern wrote:
> > > > > > fails to compile on Ubuntu 20.10:
> > > > > >
> [...]
> > > > > You need to update libbpf to latest version.
> > > >
> > > > Why not using libbpf from submodule?
> > >
> > > Because it makes it harder for people downloading tarballs and distributions.
> >
> > Genuinely curious, making harder how exactly? When packaging sources
> > as a tarball you'd check out submodules before packaging, right?
> >
> > > Iproute2 has worked well by being standalone.
> >
> > Again, maybe I'm missing something, but what makes it not a
> > standalone, if it is using a submodule? Pahole, for instance, is using
> > libbpf through submodule and just bypasses all the problems with
> > detection of features and library availability. I haven't heard anyone
> > complaining about it made working with pahole harder in any way.
>
> I do believe you are missing something.

I don't think I got an answer how submodules make it harder for people
downloading tarballs and distributions, and the standalone-ness issue.
Your security angle is a very different aspect.

>  I guess I can be the relay for
> complains, so you will officially hear about this.  Red Hat and Fedora
> security is complaining that we are packaging a library (libbpf)
> directly into the individual packages.  They complain because in case
> of a security issue, they have to figure out to rebuild all the software
> packages that are statically compiled with this library.

They must be having nightmares already about BCC, bpftool, pahole, as
well as perf built with libbpf statically (perf on my server is, at
least). I also wonder how many other projects do use either submodules
or static linking with libraries as well.

>
> Maybe you say I don't care that Distro security teams have to do more
> work and update more packages.  Then security team says, we expect
> customers will use this library right, and if we ship it as a dynamic
> loadable (.so) file, then we can update and fix security issues in
> library without asking customers to recompile. (Notice the same story
> goes if we can update the base-image used by a container).

It's a trade off, and everyone decides for themselves where they want
to stand on this.

On the one hand, there are security folks obsessing about hypothetical
security vulnerabilities in libbpf so bad that they will need to
update libbpf overnight.

On the other hand, extra complexity for multiple users of libbpf to do
feature detection and working around the lack of some of the APIs in
libbpf due to older versions in the system. That extra complexity
might lead to more problems, bugs, vulnerabilities in the long run.

I understand the concerns and how dynamic libraries make it easier. We
can't really know for sure which of those two aspects would lead to
more pain and problems overall. I personally choose simplicity,
though.

But as I said, it's up to iproute2 folks to decide. Was just curious
about some of the claims I cited.


>
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
>

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 1/5] configure: add check_libbpf() for later libbpf support
  2020-10-29 15:11     ` [PATCHv3 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
  2020-10-29 15:26       ` Toke Høiland-Jørgensen
@ 2020-11-02 15:37       ` David Ahern
  2020-11-03  5:54         ` Hangbin Liu
  1 sibling, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-11-02 15:37 UTC (permalink / raw)
  To: Hangbin Liu, Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On 10/29/20 9:11 AM, Hangbin Liu wrote:
> This patch adds a check to see if we support libbpf. By default the
> system libbpf will be used, but static linking against a custom libbpf
> version can be achieved by passing LIBBPF_DIR to configure. FORCE_LIBBPF
> can be set to force configure to abort if no suitable libbpf is found,
> which is useful for automatic packaging that wants to enforce the
> dependency.
> 

Add an option to force libbpf off and use of the legacy code. i.e, yes
it is installed, but don't use it.

configure script really needs a usage to dump options like disabling libbpf.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 3/5] lib: add libbpf support
  2020-10-29 15:11     ` [PATCHv3 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
@ 2020-11-02 15:41       ` David Ahern
  2020-11-03  5:48         ` Hangbin Liu
  2020-11-04  8:22         ` Hangbin Liu
  0 siblings, 2 replies; 167+ messages in thread
From: David Ahern @ 2020-11-02 15:41 UTC (permalink / raw)
  To: Hangbin Liu, Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On 10/29/20 9:11 AM, Hangbin Liu wrote:
> diff --git a/ip/ipvrf.c b/ip/ipvrf.c
> index 33150ac2..afaf1de7 100644
> --- a/ip/ipvrf.c
> +++ b/ip/ipvrf.c
> @@ -28,8 +28,14 @@
>  #include "rt_names.h"
>  #include "utils.h"
>  #include "ip_common.h"
> +
>  #include "bpf_util.h"
>  
> +#ifdef HAVE_LIBBPF
> +#include <bpf/bpf.h>
> +#include <bpf/libbpf.h>
> +#endif
> +
>  #define CGRP_PROC_FILE  "/cgroup.procs"
>  
>  static struct link_filter vrf_filter;
> @@ -256,8 +262,13 @@ static int prog_load(int idx)
>  		BPF_EXIT_INSN(),
>  	};
>  
> +#ifdef HAVE_LIBBPF
> +	return bpf_load_program(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
> +				"GPL", 0, bpf_log_buf, sizeof(bpf_log_buf));
> +#else
>  	return bpf_prog_load_buf(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
>  			         "GPL", bpf_log_buf, sizeof(bpf_log_buf));
> +#endif
>  }
>  
>  static int vrf_configure_cgroup(const char *path, int ifindex)
> @@ -288,7 +299,11 @@ static int vrf_configure_cgroup(const char *path, int ifindex)
>  		goto out;
>  	}
>  
> +#ifdef HAVE_LIBBPF
> +	if (bpf_prog_attach(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE, 0)) {
> +#else
>  	if (bpf_prog_attach_fd(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE)) {
> +#endif
>  		fprintf(stderr, "Failed to attach prog to cgroup: '%s'\n",
>  			strerror(errno));
>  		goto out;

I would prefer to have these #ifdef .. #endif checks consolidated in the
lib code. Create a bpf_compat file for these. e.g.,

int bpf_program_load(enum bpf_prog_type type, const struct bpf_insn *insns,
                     size_t size_insns, const char *license, char *log,
                     size_t size_log)
{
+#ifdef HAVE_LIBBPF
+	return bpf_load_program(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
+				"GPL", 0, bpf_log_buf, sizeof(bpf_log_buf));
+#else
 	return bpf_prog_load_buf(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
 			         "GPL", bpf_log_buf, sizeof(bpf_log_buf));
+#endif
}

Similarly for bpf_program_attach.


I think even the includes can be done once in bpf_util.h with a single
+#ifdef HAVE_LIBBPF
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+#endif
+

The iproute2_* functions added later in this patch can be in the compat
file as well.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29 15:11   ` [PATCHv3 " Hangbin Liu
                       ` (4 preceding siblings ...)
  2020-10-29 15:11     ` [PATCHv3 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
@ 2020-11-02 15:47     ` David Ahern
  2020-11-03  6:58       ` Andrii Nakryiko
  2020-11-09  7:07     ` [PATCHv4 " Hangbin Liu
  6 siblings, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-11-02 15:47 UTC (permalink / raw)
  To: Hangbin Liu, Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On 10/29/20 9:11 AM, Hangbin Liu wrote:
> This series converts iproute2 to use libbpf for loading and attaching
> BPF programs when it is available. This means that iproute2 will
> correctly process BTF information and support the new-style BTF-defined
> maps, while keeping compatibility with the old internal map definition
> syntax.
> 
> This is achieved by checking for libbpf at './configure' time, and using
> it if available. By default the system libbpf will be used, but static
> linking against a custom libbpf version can be achieved by passing
> LIBBPF_DIR to configure. FORCE_LIBBPF can be set to force configure to
> abort if no suitable libbpf is found (useful for automatic packaging
> that wants to enforce the dependency).
> 
> The old iproute2 bpf code is kept and will be used if no suitable libbpf
> is available. When using libbpf, wrapper code ensures that iproute2 will
> still understand the old map definition format, including populating
> map-in-map and tail call maps before load.
> 
> The examples in bpf/examples are kept, and a separate set of examples
> are added with BTF-based map definitions for those examples where this
> is possible (libbpf doesn't currently support declaratively populating
> tail call maps).
> 
> At last, Thanks a lot for Toke's help on this patch set.
> 

In regards to comments from v2 of the series:

iproute2 is a stable, production package that requires minimal support
from external libraries. The external packages it does require are also
stable with few to no relevant changes.

bpf and libbpf on the other hand are under active development and
rapidly changing month over month. The git submodule approach has its
conveniences for rapid development but is inappropriate for a package
like iproute2 and will not be considered.

To explicitly state what I think should be obvious to any experienced
Linux user, iproute2 code should always compile and work *without
functionality loss* on LTS versions N and N-1 of well known OS’es with
LTS releases (e.g., Debian, Ubuntu, RHEL). Meaning iproute2 will compile
and work with the external dependencies as they exist in that OS version.

I believe there are more than enough established compatibility and
library version checks to find the middle ground to integrate new
features requiring new versions of libbpf while maintaining stability
and compatibility with older releases. The biannual releases of Ubuntu
and Fedora serve as testing grounds for integrating new features
requiring a newer version of libbpf while continuing to work with
released versions of libbpf. It appears Debian Bullseye will also fall
into this category.

Finally, bpf-based features in iproute2 will only be committed once
relevant support exists in a released version of libbpf (ie., the github
version, not just commits to the in-kernel tree version). Patches can
and should be sent for review based on testing with the in-kernel tree
version of libbpf, but I will not commit them until the library has been
released.

Thanks for working on this, Hangbin. It is right direction in the long term.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 3/5] lib: add libbpf support
  2020-11-02 15:41       ` David Ahern
@ 2020-11-03  5:48         ` Hangbin Liu
  2020-11-03 17:19           ` David Ahern
  2020-11-04  8:22         ` Hangbin Liu
  1 sibling, 1 reply; 167+ messages in thread
From: Hangbin Liu @ 2020-11-03  5:48 UTC (permalink / raw)
  To: David Ahern
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Mon, Nov 02, 2020 at 08:41:09AM -0700, David Ahern wrote:
> 
> I would prefer to have these #ifdef .. #endif checks consolidated in the
> lib code. Create a bpf_compat file for these. e.g.,
> 
> int bpf_program_load(enum bpf_prog_type type, const struct bpf_insn *insns,
>                      size_t size_insns, const char *license, char *log,
>                      size_t size_log)
> {
> +#ifdef HAVE_LIBBPF
> +	return bpf_load_program(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
> +				"GPL", 0, bpf_log_buf, sizeof(bpf_log_buf));
> +#else
>  	return bpf_prog_load_buf(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
>  			         "GPL", bpf_log_buf, sizeof(bpf_log_buf));
> +#endif
> }
> 
> Similarly for bpf_program_attach.

> 
> I think even the includes can be done once in bpf_util.h with a single
> +#ifdef HAVE_LIBBPF
> +#include <bpf/bpf.h>
> +#include <bpf/libbpf.h>
> +#endif
> +
> 
> The iproute2_* functions added later in this patch can be in the compat
> file as well.

The iproute2_* functions need access static struct bpf_elf_ctx __ctx;
We need move the struct bpf_elf_ctx to another header file if add the
iproute2_* functions to compat file. Do you still want this?

Thanks
Hangbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 1/5] configure: add check_libbpf() for later libbpf support
  2020-11-02 15:37       ` David Ahern
@ 2020-11-03  5:54         ` Hangbin Liu
  2020-11-03 17:32           ` David Ahern
  0 siblings, 1 reply; 167+ messages in thread
From: Hangbin Liu @ 2020-11-03  5:54 UTC (permalink / raw)
  To: David Ahern
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Mon, Nov 02, 2020 at 08:37:37AM -0700, David Ahern wrote:
> On 10/29/20 9:11 AM, Hangbin Liu wrote:
> > This patch adds a check to see if we support libbpf. By default the
> > system libbpf will be used, but static linking against a custom libbpf
> > version can be achieved by passing LIBBPF_DIR to configure. FORCE_LIBBPF
> > can be set to force configure to abort if no suitable libbpf is found,
> > which is useful for automatic packaging that wants to enforce the
> > dependency.
> > 
> 
> Add an option to force libbpf off and use of the legacy code. i.e, yes
> it is installed, but don't use it.
> 
> configure script really needs a usage to dump options like disabling libbpf.
> 

Shouldn't we use libbpf by default if system support? The same like libmnl.
There is no options to force libnml off.

Thanks
Hangbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-02 15:47     ` [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support David Ahern
@ 2020-11-03  6:58       ` Andrii Nakryiko
  2020-11-03  8:42         ` Jiri Benc
  2020-11-03  8:46         ` Daniel Borkmann
  0 siblings, 2 replies; 167+ messages in thread
From: Andrii Nakryiko @ 2020-11-03  6:58 UTC (permalink / raw)
  To: David Ahern
  Cc: Hangbin Liu, Stephen Hemminger, Daniel Borkmann,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Mon, Nov 2, 2020 at 7:47 AM David Ahern <dsahern@gmail.com> wrote:
>
> On 10/29/20 9:11 AM, Hangbin Liu wrote:
> > This series converts iproute2 to use libbpf for loading and attaching
> > BPF programs when it is available. This means that iproute2 will
> > correctly process BTF information and support the new-style BTF-defined
> > maps, while keeping compatibility with the old internal map definition
> > syntax.
> >
> > This is achieved by checking for libbpf at './configure' time, and using
> > it if available. By default the system libbpf will be used, but static
> > linking against a custom libbpf version can be achieved by passing
> > LIBBPF_DIR to configure. FORCE_LIBBPF can be set to force configure to
> > abort if no suitable libbpf is found (useful for automatic packaging
> > that wants to enforce the dependency).
> >
> > The old iproute2 bpf code is kept and will be used if no suitable libbpf
> > is available. When using libbpf, wrapper code ensures that iproute2 will
> > still understand the old map definition format, including populating
> > map-in-map and tail call maps before load.
> >
> > The examples in bpf/examples are kept, and a separate set of examples
> > are added with BTF-based map definitions for those examples where this
> > is possible (libbpf doesn't currently support declaratively populating
> > tail call maps).
> >
> > At last, Thanks a lot for Toke's help on this patch set.
> >
>
> In regards to comments from v2 of the series:
>
> iproute2 is a stable, production package that requires minimal support
> from external libraries. The external packages it does require are also
> stable with few to no relevant changes.
>
> bpf and libbpf on the other hand are under active development and
> rapidly changing month over month. The git submodule approach has its
> conveniences for rapid development but is inappropriate for a package
> like iproute2 and will not be considered.

It's ok to not consider that, really. I'm trying to understand what's
so bad about the submodule approach, not convince you (anymore) to use
libbpf through submodule. And the submodule is not for rapid
development, it's mainly for guaranteed libbpf features and version,
and simplicity of iproute2 code when using libbpf.

But I don't think I got a real answer as to what's the exact reason
against the submodule. Like what "inappropriate" even means in this
case? Jesper's security argument so far was the only objective
criteria, as far as I can tell.

>
> To explicitly state what I think should be obvious to any experienced
> Linux user, iproute2 code should always compile and work *without
> functionality loss* on LTS versions N and N-1 of well known OS’es with
> LTS releases (e.g., Debian, Ubuntu, RHEL). Meaning iproute2 will compile
> and work with the external dependencies as they exist in that OS version.

I love the appeal to obviousness and "experienced Linux user" ;)

But I also see that using libbpf through submodule gives iproute2
exact control over which version of libbpf is being used. And that
does not depend at all on any specific Linux distribution, its
version, LTS vs non-LTS, etc. iproute2 will just work the same across
all of them. So matches your stated goals very directly and
explicitly.

>
> I believe there are more than enough established compatibility and
> library version checks to find the middle ground to integrate new
> features requiring new versions of libbpf while maintaining stability
> and compatibility with older releases. The biannual releases of Ubuntu
> and Fedora serve as testing grounds for integrating new features
> requiring a newer version of libbpf while continuing to work with
> released versions of libbpf. It appears Debian Bullseye will also fall
> into this category.

Beyond just more unnecessary complexity in iproute2 library to
accommodate older libbpf versions, users basically will need to pay
closer attention not just to which version of iproute2 they have, but
also which version of libbpf is installed on their system. Which is
ok, but an unnecessary burden, IMO. By controlling the libbpf version
through the submodule, it would be simple to say: "iproute2 vX uses
libbpf vY with features Z1, Z2, Z3". Then the user would just know
what to expect from iproute2 and its BPF support. And iproute2 code
base won't have to do as much feature detection and condition
compilation tricks.

That's what I don't understand, why settle for the lowest common
denominator of libbpf versions across a wide range of systems, when
you can take control and develop against a well-known version of
libbpf. I get security upgrades angle (even if I don't rank it higher
than simplicity). But I don't get the ideal behind a blanket statement
"libbpf through submodule is inappropriate".

>
> Finally, bpf-based features in iproute2 will only be committed once
> relevant support exists in a released version of libbpf (ie., the github
> version, not just commits to the in-kernel tree version). Patches can
> and should be sent for review based on testing with the in-kernel tree
> version of libbpf, but I will not commit them until the library has been
> released.

Makes sense. And the submodule approach gives you a great deal of
control and flexibility in this case. For testing, it's easy to use
either Github or even in-kernel sources (with a bit of symlinking,
though). But for upstreaming the submodule should only reference a
released tag from Github repo. Again, everything seems to work out,
no?

>
> Thanks for working on this, Hangbin. It is right direction in the long term.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-03  6:58       ` Andrii Nakryiko
@ 2020-11-03  8:42         ` Jiri Benc
  2020-11-03 17:45           ` David Ahern
  2020-11-03 17:48           ` Alexei Starovoitov
  2020-11-03  8:46         ` Daniel Borkmann
  1 sibling, 2 replies; 167+ messages in thread
From: Jiri Benc @ 2020-11-03  8:42 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David Ahern, Hangbin Liu, Stephen Hemminger, Daniel Borkmann,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Mon, 2 Nov 2020 22:58:06 -0800, Andrii Nakryiko wrote:
> But I don't think I got a real answer as to what's the exact reason
> against the submodule. Like what "inappropriate" even means in this
> case? Jesper's security argument so far was the only objective
> criteria, as far as I can tell.

It's the fundamental objection. Distributions in general have the "no
bundled libraries" policy. It is sometimes annoying but it helps to
understand that the policy is not a whim of distros, it's coming from
years of experience with package maintenance for security and stability.

> But I also see that using libbpf through submodule gives iproute2
> exact control over which version of libbpf is being used. And that
> does not depend at all on any specific Linux distribution, its
> version, LTS vs non-LTS, etc. iproute2 will just work the same across
> all of them. So matches your stated goals very directly and
> explicitly.

If you take this route, the end result would be all dependencies for
all projects being included as submodules and bundled. At the first
sight, this sounds easier for the developers. Why bother with dynamic
linking at all? Everything can be linked statically.

The result would be nightmare for both distros and users. No timely
security updates possible, critical bugs not being fixed in some
programs, etc. There is enough experience with this kind of setup to
conclude it is not the right way to go.

Yes, dynamic linking is initially more work for developers of both apps
and libraries. However, it pays off over time - there's no need to keep
track of security and other important fixes in the dependencies, it
comes for free from the distro work.

Btw, taking the bundling to the extreme, every app could bundle its own
well tested and compatible kernel version and be run in a VM. This
might sound far fetched but there were actual attempts to do that. It
didn't take off; I think part of the reason was that the Linux kernel
is very good in keeping its APIs stable.

And I'm convinced this is the way to go for libraries, too: put an
emphasis on API stability. Make it easy to get consumed and updated
under the hood. Everybody wins this way.

 Jiri


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-03  6:58       ` Andrii Nakryiko
  2020-11-03  8:42         ` Jiri Benc
@ 2020-11-03  8:46         ` Daniel Borkmann
  2020-11-03 17:35           ` David Ahern
  1 sibling, 1 reply; 167+ messages in thread
From: Daniel Borkmann @ 2020-11-03  8:46 UTC (permalink / raw)
  To: Andrii Nakryiko, David Ahern
  Cc: Hangbin Liu, Stephen Hemminger, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On 11/3/20 7:58 AM, Andrii Nakryiko wrote:
> On Mon, Nov 2, 2020 at 7:47 AM David Ahern <dsahern@gmail.com> wrote:
>> On 10/29/20 9:11 AM, Hangbin Liu wrote:
>>> This series converts iproute2 to use libbpf for loading and attaching
>>> BPF programs when it is available. This means that iproute2 will
>>> correctly process BTF information and support the new-style BTF-defined
>>> maps, while keeping compatibility with the old internal map definition
>>> syntax.
>>>
>>> This is achieved by checking for libbpf at './configure' time, and using
>>> it if available. By default the system libbpf will be used, but static
>>> linking against a custom libbpf version can be achieved by passing
>>> LIBBPF_DIR to configure. FORCE_LIBBPF can be set to force configure to
>>> abort if no suitable libbpf is found (useful for automatic packaging
>>> that wants to enforce the dependency).
>>>
>>> The old iproute2 bpf code is kept and will be used if no suitable libbpf
>>> is available. When using libbpf, wrapper code ensures that iproute2 will
>>> still understand the old map definition format, including populating
>>> map-in-map and tail call maps before load.
>>>
>>> The examples in bpf/examples are kept, and a separate set of examples
>>> are added with BTF-based map definitions for those examples where this
>>> is possible (libbpf doesn't currently support declaratively populating
>>> tail call maps).
>>>
>>> At last, Thanks a lot for Toke's help on this patch set.
>>
>> In regards to comments from v2 of the series:
>>
>> iproute2 is a stable, production package that requires minimal support
>> from external libraries. The external packages it does require are also
>> stable with few to no relevant changes.
>>
>> bpf and libbpf on the other hand are under active development and
>> rapidly changing month over month. The git submodule approach has its
>> conveniences for rapid development but is inappropriate for a package
>> like iproute2 and will not be considered.

I thought last time this discussion came up there was consensus that the
submodule could be an explicit opt in for the configure script at least?

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 3/5] lib: add libbpf support
  2020-11-03  5:48         ` Hangbin Liu
@ 2020-11-03 17:19           ` David Ahern
  0 siblings, 0 replies; 167+ messages in thread
From: David Ahern @ 2020-11-03 17:19 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On 11/2/20 10:48 PM, Hangbin Liu wrote:
> 
> The iproute2_* functions need access static struct bpf_elf_ctx __ctx;
> We need move the struct bpf_elf_ctx to another header file if add the
> iproute2_* functions to compat file. Do you still want this?
> 

ok, leave it in legacy for now.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 1/5] configure: add check_libbpf() for later libbpf support
  2020-11-03  5:54         ` Hangbin Liu
@ 2020-11-03 17:32           ` David Ahern
  2020-11-04  8:51             ` Hangbin Liu
  0 siblings, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-11-03 17:32 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On 11/2/20 10:54 PM, Hangbin Liu wrote:
> On Mon, Nov 02, 2020 at 08:37:37AM -0700, David Ahern wrote:
>> On 10/29/20 9:11 AM, Hangbin Liu wrote:
>>> This patch adds a check to see if we support libbpf. By default the
>>> system libbpf will be used, but static linking against a custom libbpf
>>> version can be achieved by passing LIBBPF_DIR to configure. FORCE_LIBBPF
>>> can be set to force configure to abort if no suitable libbpf is found,
>>> which is useful for automatic packaging that wants to enforce the
>>> dependency.
>>>
>>
>> Add an option to force libbpf off and use of the legacy code. i.e, yes
>> it is installed, but don't use it.
>>
>> configure script really needs a usage to dump options like disabling libbpf.
>>
> 
> Shouldn't we use libbpf by default if system support? The same like libmnl.
> There is no options to force libnml off.
> 

configure scripts usually allow you to control options directly,
overriding the autoprobe.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-03  8:46         ` Daniel Borkmann
@ 2020-11-03 17:35           ` David Ahern
  2020-11-03 17:47             ` Alexei Starovoitov
  0 siblings, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-11-03 17:35 UTC (permalink / raw)
  To: Daniel Borkmann, Andrii Nakryiko
  Cc: Hangbin Liu, Stephen Hemminger, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On 11/3/20 1:46 AM, Daniel Borkmann wrote:
> I thought last time this discussion came up there was consensus that the
> submodule could be an explicit opt in for the configure script at least?

I do not recall Stephen agreeing to that, and I certainly did not.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-03  8:42         ` Jiri Benc
@ 2020-11-03 17:45           ` David Ahern
  2020-11-03 17:48           ` Alexei Starovoitov
  1 sibling, 0 replies; 167+ messages in thread
From: David Ahern @ 2020-11-03 17:45 UTC (permalink / raw)
  To: Jiri Benc, Andrii Nakryiko
  Cc: Hangbin Liu, Stephen Hemminger, Daniel Borkmann,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On 11/3/20 1:42 AM, Jiri Benc wrote:
> And I'm convinced this is the way to go for libraries, too: put an
> emphasis on API stability. Make it easy to get consumed and updated
> under the hood. Everybody wins this way.

exactly. Libraries should export well thought out, easy to use, stable
APIs. Maintainers do not need to be concerned about how the code is
consumed by projects.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-03 17:35           ` David Ahern
@ 2020-11-03 17:47             ` Alexei Starovoitov
  2020-11-03 18:23               ` Stephen Hemminger
  2020-11-03 22:32               ` David Ahern
  0 siblings, 2 replies; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-03 17:47 UTC (permalink / raw)
  To: David Ahern
  Cc: Daniel Borkmann, Andrii Nakryiko, Hangbin Liu, Stephen Hemminger,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Tue, Nov 3, 2020 at 9:36 AM David Ahern <dsahern@gmail.com> wrote:
>
> On 11/3/20 1:46 AM, Daniel Borkmann wrote:
> > I thought last time this discussion came up there was consensus that the
> > submodule could be an explicit opt in for the configure script at least?
>
> I do not recall Stephen agreeing to that, and I certainly did not.

Daniel,

since David is deaf to technical arguments,
how about we fork iproute2 and maintain it separately?

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-03  8:42         ` Jiri Benc
  2020-11-03 17:45           ` David Ahern
@ 2020-11-03 17:48           ` Alexei Starovoitov
  1 sibling, 0 replies; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-03 17:48 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Andrii Nakryiko, David Ahern, Hangbin Liu, Stephen Hemminger,
	Daniel Borkmann, Alexei Starovoitov, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Miller, Jesper Dangaard Brouer, Networking,
	bpf, Andrii Nakryiko, Toke Høiland-Jørgensen

On Tue, Nov 3, 2020 at 12:42 AM Jiri Benc <jbenc@redhat.com> wrote:
> sight, this sounds easier for the developers. Why bother with dynamic
> linking at all? Everything can be linked statically.

That's exactly what some companies do.
Linking everything statically provides stronger security.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-03 17:47             ` Alexei Starovoitov
@ 2020-11-03 18:23               ` Stephen Hemminger
  2020-11-03 22:32               ` David Ahern
  1 sibling, 0 replies; 167+ messages in thread
From: Stephen Hemminger @ 2020-11-03 18:23 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David Ahern, Daniel Borkmann, Andrii Nakryiko, Hangbin Liu,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Tue, 3 Nov 2020 09:47:00 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Tue, Nov 3, 2020 at 9:36 AM David Ahern <dsahern@gmail.com> wrote:
> >
> > On 11/3/20 1:46 AM, Daniel Borkmann wrote:  
> > > I thought last time this discussion came up there was consensus that the
> > > submodule could be an explicit opt in for the configure script at least?  
> >
> > I do not recall Stephen agreeing to that, and I certainly did not.  
> 
> Daniel,
> 
> since David is deaf to technical arguments,
> how about we fork iproute2 and maintain it separately?

A submodule is not a practical viable option.

Please come back when you are ready to use distro libbpf packages.
This seems a microcosm of the Linux packaging problem that was discussed
around Kubernetes and "vendorizaton"

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-03 17:47             ` Alexei Starovoitov
  2020-11-03 18:23               ` Stephen Hemminger
@ 2020-11-03 22:32               ` David Ahern
  2020-11-03 22:55                 ` Alexei Starovoitov
  1 sibling, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-11-03 22:32 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Daniel Borkmann, Andrii Nakryiko, Hangbin Liu, Stephen Hemminger,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On 11/3/20 10:47 AM, Alexei Starovoitov wrote:
> since David is deaf to technical arguments,
It is not that I am "deaf to technical arguments"; you do not like my
response.

The scope of bpf in iproute2 is tiny - a few tc modules (and VRF but it
does not need libbpf) which is a small subset of the functionality and
commands within the package.

The configure script will allow you to use any libbpf version you wish.
Standard operating procedure for configuring a dependency within a package.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-03 22:32               ` David Ahern
@ 2020-11-03 22:55                 ` Alexei Starovoitov
  2020-11-04  1:40                   ` David Ahern
  2020-11-04  2:17                   ` Hangbin Liu
  0 siblings, 2 replies; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-03 22:55 UTC (permalink / raw)
  To: David Ahern
  Cc: Daniel Borkmann, Andrii Nakryiko, Hangbin Liu, Stephen Hemminger,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Tue, Nov 03, 2020 at 03:32:55PM -0700, David Ahern wrote:
> On 11/3/20 10:47 AM, Alexei Starovoitov wrote:
> > since David is deaf to technical arguments,
> It is not that I am "deaf to technical arguments"; you do not like my
> response.
> 
> The scope of bpf in iproute2 is tiny - a few tc modules (and VRF but it
> does not need libbpf) which is a small subset of the functionality and
> commands within the package.

When Hangbin sent this patch set I got excited that finally tc command
will start working with the latest bpf elf files.
Currently "tc" supports 4 year old files which caused plenty of pain to bpf users.
I got excited, but now I've realized that this patch set will make it worse.
The bpf support in "tc" command instead of being obviously old and obsolete
will be sort-of working with unpredictable delay between released kernel
and released iproute2 version. The iproute2 release that suppose to match kernel
release will be meaningless.
More so, the upgrade of shared libbpf.so can make older iproute2/tc to do 
something new and unpredictable.
The user experience will be awful. Not only the users won't know
what to expect out of 'tc' command they won't have a way to debug it.
All of it because iproute2 build will take system libbpf and link it
as shared library by default.
So I think iproute2 must not use libbpf. If I could remove bpf support
from iproute2 I would do so as well.
The current state of iproute2 is hurting bpf ecosystem and proposed
libbpf+iproute2 integration will make it worse.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-03 22:55                 ` Alexei Starovoitov
@ 2020-11-04  1:40                   ` David Ahern
  2020-11-04  2:45                     ` Alexei Starovoitov
  2020-11-04  2:17                   ` Hangbin Liu
  1 sibling, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-11-04  1:40 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Daniel Borkmann, Andrii Nakryiko, Hangbin Liu, Stephen Hemminger,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On 11/3/20 3:55 PM, Alexei Starovoitov wrote:
> The bpf support in "tc" command instead of being obviously old and obsolete
> will be sort-of working with unpredictable delay between released kernel
> and released iproute2 version. The iproute2 release that suppose to match kernel
> release will be meaningless.

iproute2, like all userspace commands, is written to an API and for well
written APIs the commands should be backward and forward compatible
across kernel versions. Kernel upgrades do not force an update of the
entire ecosystem. New userspace on old kernels should again just work.
New functionality in the new userpsace will not, but detection of that
is a different problem and relies on kernel APIs doing proper data
validation.


> More so, the upgrade of shared libbpf.so can make older iproute2/tc to do 
> something new and unpredictable.

How so? If libbpf is written against kernel APIs and properly versioned,
it should just work. A new version of libbpf changes the .so version, so
old commands will not load it.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-03 22:55                 ` Alexei Starovoitov
  2020-11-04  1:40                   ` David Ahern
@ 2020-11-04  2:17                   ` Hangbin Liu
  2020-11-04  3:11                     ` Alexei Starovoitov
  1 sibling, 1 reply; 167+ messages in thread
From: Hangbin Liu @ 2020-11-04  2:17 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David Ahern, Daniel Borkmann, Andrii Nakryiko, Stephen Hemminger,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Tue, Nov 03, 2020 at 02:55:54PM -0800, Alexei Starovoitov wrote:
> > The scope of bpf in iproute2 is tiny - a few tc modules (and VRF but it
> > does not need libbpf) which is a small subset of the functionality and
> > commands within the package.
> 
> When Hangbin sent this patch set I got excited that finally tc command
> will start working with the latest bpf elf files.
> Currently "tc" supports 4 year old files which caused plenty of pain to bpf users.
> I got excited, but now I've realized that this patch set will make it worse.
> The bpf support in "tc" command instead of being obviously old and obsolete
> will be sort-of working with unpredictable delay between released kernel
> and released iproute2 version. The iproute2 release that suppose to match kernel
> release will be meaningless.
> More so, the upgrade of shared libbpf.so can make older iproute2/tc to do 
> something new and unpredictable.
> The user experience will be awful. Not only the users won't know
> what to expect out of 'tc' command they won't have a way to debug it.
> All of it because iproute2 build will take system libbpf and link it
> as shared library by default.
> So I think iproute2 must not use libbpf. If I could remove bpf support
> from iproute2 I would do so as well.
> The current state of iproute2 is hurting bpf ecosystem and proposed
> libbpf+iproute2 integration will make it worse.

Hi Guys,

Please take it easy. IMHO, it always very hard to make a perfect solution.
From development side, it's easier and could get latest features by using
libbpf as submodule. But we need to take care of users, backward
compatibility, distros policy etc.

I like using iproute2 to load bpf objs. But it's not standardized and too old
to load the new BTF defined objs. I think all of us like to improve it by
using libbpf. But users and distros are slowly. Some user are still using
`ifconfig`. Distros have policies to link the shared .so, etc. We have to
compromise on something.

Our purpose is to push the user to use new features. As this patchset
does, push users to try libbpf instead of legacy code. But this need time.

Sorry if my word make you feel confused. I'm not a native speaker, but I hope
we could find a solution that all(we, users, distros) could accept instead of
break/give up.

Thanks
Hangbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04  1:40                   ` David Ahern
@ 2020-11-04  2:45                     ` Alexei Starovoitov
  2020-11-04  9:28                       ` Jiri Benc
  0 siblings, 1 reply; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-04  2:45 UTC (permalink / raw)
  To: David Ahern
  Cc: Daniel Borkmann, Andrii Nakryiko, Hangbin Liu, Stephen Hemminger,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Tue, Nov 03, 2020 at 06:40:44PM -0700, David Ahern wrote:
> On 11/3/20 3:55 PM, Alexei Starovoitov wrote:
> > The bpf support in "tc" command instead of being obviously old and obsolete
> > will be sort-of working with unpredictable delay between released kernel
> > and released iproute2 version. The iproute2 release that suppose to match kernel
> > release will be meaningless.
> 
> iproute2, like all userspace commands, is written to an API and for well
> written APIs the commands should be backward and forward compatible
> across kernel versions. Kernel upgrades do not force an update of the
> entire ecosystem. New userspace on old kernels should again just work.
> New functionality in the new userpsace will not, but detection of that
> is a different problem and relies on kernel APIs doing proper data
> validation.

commands ?!
libbpf is not a library that translates user input into kernel syscalls.
It's not libmnl that is a wrapper for netlink.
It's not libelf either.
libbpf probes kernel features and does different things depending on what it found.
libbpf is the only library I know that is backward and forward compatible.
All other libraries are backwards compatible only.
iproute2 itself is backward compatible only as well.
New devlink feature in iproute2 won't do anything on the kernel that doesn't
have the support.
libbpf, on the other side, has to work on older kernels. New libbpf features
have to gradually degrade when possible.
The users can upgrade and downgrade libbpf version at any time.
They can upgrade and downgrade kernel while keeping libbpf version the same.
The users can upgrade llvm as well and libbpf has to expect unexpected
and deal with all combinations.

> 
> > More so, the upgrade of shared libbpf.so can make older iproute2/tc to do 
> > something new and unpredictable.
> 
> How so? If libbpf is written against kernel APIs and properly versioned,
> it should just work. A new version of libbpf changes the .so version, so
> old commands will not load it.

Please point out where do you see this happening in the patch set.
See tools/lib/bpf/README.rst to understand the versioning.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04  2:17                   ` Hangbin Liu
@ 2020-11-04  3:11                     ` Alexei Starovoitov
  2020-11-04 10:01                       ` Jiri Benc
                                         ` (2 more replies)
  0 siblings, 3 replies; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-04  3:11 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: David Ahern, Daniel Borkmann, Andrii Nakryiko, Stephen Hemminger,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Wed, Nov 04, 2020 at 10:17:30AM +0800, Hangbin Liu wrote:
> On Tue, Nov 03, 2020 at 02:55:54PM -0800, Alexei Starovoitov wrote:
> > > The scope of bpf in iproute2 is tiny - a few tc modules (and VRF but it
> > > does not need libbpf) which is a small subset of the functionality and
> > > commands within the package.
> > 
> > When Hangbin sent this patch set I got excited that finally tc command
> > will start working with the latest bpf elf files.
> > Currently "tc" supports 4 year old files which caused plenty of pain to bpf users.
> > I got excited, but now I've realized that this patch set will make it worse.
> > The bpf support in "tc" command instead of being obviously old and obsolete
> > will be sort-of working with unpredictable delay between released kernel
> > and released iproute2 version. The iproute2 release that suppose to match kernel
> > release will be meaningless.
> > More so, the upgrade of shared libbpf.so can make older iproute2/tc to do 
> > something new and unpredictable.
> > The user experience will be awful. Not only the users won't know
> > what to expect out of 'tc' command they won't have a way to debug it.
> > All of it because iproute2 build will take system libbpf and link it
> > as shared library by default.
> > So I think iproute2 must not use libbpf. If I could remove bpf support
> > from iproute2 I would do so as well.
> > The current state of iproute2 is hurting bpf ecosystem and proposed
> > libbpf+iproute2 integration will make it worse.
> 
> Hi Guys,
> 
> Please take it easy. IMHO, it always very hard to make a perfect solution.
> From development side, it's easier and could get latest features by using
> libbpf as submodule. But we need to take care of users, backward
> compatibility, distros policy etc.
> 
> I like using iproute2 to load bpf objs. But it's not standardized and too old
> to load the new BTF defined objs. I think all of us like to improve it by
> using libbpf. But users and distros are slowly. Some user are still using
> `ifconfig`. Distros have policies to link the shared .so, etc. We have to
> compromise on something.
> 
> Our purpose is to push the user to use new features. As this patchset
> does, push users to try libbpf instead of legacy code. But this need time.

My problem with iproute2 picking random libbpf is unpredictability.
Such roll of dice gives no confidence to users on what is expected to work.
bpf_hello_world.o will load, but that's it.
What is going to work with this or that version of "tc" command? No one knows.
The user will do 'tc -V'. Does version mean anything from bpf loading pov?
It's not. The user will do "ldd `which tc`" and then what?
Such bpf support in "tc" is worse than the current one.
At least the current one is predictably old.

There are alternatives though.
Forking the whole iproute2 because of "tc" is pointless, of course.
My 'proposal' was a fire starter because people are too stubborn to
realize that their long term believes could be incorrect until the fire is burning.
"bpftool prog load" can load any kind of elf. It cannot operate on qdiscs
and shouldn't do qdisc manipulations, but may be we can combine them into pipe
of some sort. Like "bpftool prog load file.o | tc filter ... bpf pipe"
I think that would be better long term. It will be predictable.

When we release new version of libbpf it goes through rigorous testing.
bpftool gets a lot of test coverage as well.
iproute2 with shared libbpf will get nothing. It's the same random roll of dice.
New libbpf may or may not break iproute2. That's awful user experience.
So iproute2 has to use git submodule with particular libbpf sha.
Then libbpf release process can incorporate proper testing of libbpf
and iproute2 combination.
Or iproute2 should stay as-is with obsolete bpf support.

Few years from now the situation could be different and shared libbpf would
be the most appropriate choice. But that day is not today.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 3/5] lib: add libbpf support
  2020-11-02 15:41       ` David Ahern
  2020-11-03  5:48         ` Hangbin Liu
@ 2020-11-04  8:22         ` Hangbin Liu
  2020-11-05  2:33           ` David Ahern
  1 sibling, 1 reply; 167+ messages in thread
From: Hangbin Liu @ 2020-11-04  8:22 UTC (permalink / raw)
  To: David Ahern
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Mon, Nov 02, 2020 at 08:41:09AM -0700, David Ahern wrote:
> On 10/29/20 9:11 AM, Hangbin Liu wrote:
> > diff --git a/ip/ipvrf.c b/ip/ipvrf.c
> > index 33150ac2..afaf1de7 100644
> > --- a/ip/ipvrf.c
> > +++ b/ip/ipvrf.c
> > @@ -28,8 +28,14 @@
> >  #include "rt_names.h"
> >  #include "utils.h"
> >  #include "ip_common.h"
> > +
> >  #include "bpf_util.h"
> >  
> > +#ifdef HAVE_LIBBPF
> > +#include <bpf/bpf.h>
> > +#include <bpf/libbpf.h>
> > +#endif
> > +
> >  #define CGRP_PROC_FILE  "/cgroup.procs"
> >  
> >  static struct link_filter vrf_filter;
> > @@ -256,8 +262,13 @@ static int prog_load(int idx)
> >  		BPF_EXIT_INSN(),
> >  	};
> >  
> > +#ifdef HAVE_LIBBPF
> > +	return bpf_load_program(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
> > +				"GPL", 0, bpf_log_buf, sizeof(bpf_log_buf));
> > +#else
> >  	return bpf_prog_load_buf(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
> >  			         "GPL", bpf_log_buf, sizeof(bpf_log_buf));
> > +#endif
> >  }
> >  
> >  static int vrf_configure_cgroup(const char *path, int ifindex)
> > @@ -288,7 +299,11 @@ static int vrf_configure_cgroup(const char *path, int ifindex)
> >  		goto out;
> >  	}
> >  
> > +#ifdef HAVE_LIBBPF
> > +	if (bpf_prog_attach(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE, 0)) {
> > +#else
> >  	if (bpf_prog_attach_fd(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE)) {
> > +#endif
> >  		fprintf(stderr, "Failed to attach prog to cgroup: '%s'\n",
> >  			strerror(errno));
> >  		goto out;
> 
> I would prefer to have these #ifdef .. #endif checks consolidated in the
> lib code. Create a bpf_compat file for these. e.g.,
> 
> int bpf_program_load(enum bpf_prog_type type, const struct bpf_insn *insns,
>                      size_t size_insns, const char *license, char *log,
>                      size_t size_log)
> {
> +#ifdef HAVE_LIBBPF
> +	return bpf_load_program(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
> +				"GPL", 0, bpf_log_buf, sizeof(bpf_log_buf));
> +#else
>  	return bpf_prog_load_buf(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
>  			         "GPL", bpf_log_buf, sizeof(bpf_log_buf));
> +#endif
> }
> 
> Similarly for bpf_program_attach.
> 
> 
> I think even the includes can be done once in bpf_util.h with a single
> +#ifdef HAVE_LIBBPF
> +#include <bpf/bpf.h>
> +#include <bpf/libbpf.h>
> +#endif
> +

Oh, I just found why I didn't include libbpf.h in bpf_legacy.c.
The reason is there are more function conflicts. e.g.
bpf_obj_get, bpf_obj_pin, bpf_prog_attach.

If we move this #ifdef HAVE_LIBBPF to bpf_legacy.c, we need to rename
them all. With current patch, we limit all the legacy functions in bpf_legacy
and doesn't mix them with libbpf.h. What do you think?

Thanks
Hangbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 1/5] configure: add check_libbpf() for later libbpf support
  2020-11-03 17:32           ` David Ahern
@ 2020-11-04  8:51             ` Hangbin Liu
  2020-11-04 11:09               ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 167+ messages in thread
From: Hangbin Liu @ 2020-11-04  8:51 UTC (permalink / raw)
  To: David Ahern
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Tue, Nov 03, 2020 at 10:32:37AM -0700, David Ahern wrote:
> configure scripts usually allow you to control options directly,
> overriding the autoprobe.

What do you think of the follow update? It's a little rough and only controls
libbpf.

$ git diff
diff --git a/configure b/configure
index 711bb69c..be35c024 100755
--- a/configure
+++ b/configure
@@ -442,6 +442,35 @@ endif
 EOF
 }

+usage()
+{
+       cat <<EOF
+Usage: $0 [OPTIONS]
+  -h | --help                  Show this usage info
+  --no-libbpf                  build the package without libbpf
+  --libbpf-dir=DIR             build the package with self defined libbpf dir
+EOF
+       exit $1
+}
+
+while true; do
+       case "$1" in
+               --libbpf-dir)
+                       LIBBPF_DIR="$2"
+                       shift 2 ;;
+               --no-libbpf)
+                       NO_LIBBPF_CHECK=1
+                       shift ;;
+               -h | --help)
+                       usage 0 ;;
+               "")
+                       break ;;
+               *)
+                       usage 1 ;;
+       esac
+done
+
+
 echo "# Generated config based on" $INCLUDE >$CONFIG
 quiet_config >> $CONFIG

@@ -476,8 +505,10 @@ check_setns
 echo -n "SELinux support: "
 check_selinux

-echo -n "libbpf support: "
-check_libbpf
+if [ -z $NO_LIBBPF_CHECK ]; then
+       echo -n "libbpf support: "
+       check_libbpf
+fi

 echo -n "ELF support: "
 check_elf


$ ./configure -h
Usage: ./configure [OPTIONS]
  -h | --help                   Show this usage info
  --no-libbpf                   build the package without libbpf
  --libbpf-dir=DIR              build the package with self defined libbpf dir

Thanks
Hangbin


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04  2:45                     ` Alexei Starovoitov
@ 2020-11-04  9:28                       ` Jiri Benc
  2020-11-05  2:39                         ` David Ahern
  0 siblings, 1 reply; 167+ messages in thread
From: Jiri Benc @ 2020-11-04  9:28 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David Ahern, Daniel Borkmann, Andrii Nakryiko, Hangbin Liu,
	Stephen Hemminger, Alexei Starovoitov, Martin KaFai Lau,
	Song Liu, Yonghong Song, David Miller, Jesper Dangaard Brouer,
	Networking, bpf, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Tue, 3 Nov 2020 18:45:59 -0800, Alexei Starovoitov wrote:
> libbpf is the only library I know that is backward and forward compatible.

This is great to hear. It means there will be no problem with iproute2
using the system libbpf. As libbpf is both backward and forward
compatible, iproute2 will just work with whatever version it is used
with.

> All other libraries are backwards compatible only.

Backward compatibility would be enough for iproute2 but forward
compatibility does not hurt, of course.

> The users can upgrade and downgrade libbpf version at any time.
> They can upgrade and downgrade kernel while keeping libbpf version the same.
> The users can upgrade llvm as well and libbpf has to expect unexpected
> and deal with all combinations.

This actually goes beyond what would be needed for iproute2 dynamically
linked against system libbpf.

> > How so? If libbpf is written against kernel APIs and properly versioned,
> > it should just work. A new version of libbpf changes the .so version, so
> > old commands will not load it.  
> 
> Please point out where do you see this happening in the patch set.
> See tools/lib/bpf/README.rst to understand the versioning.

If the iproute2 binaries are linked against a symbol of a newer version than
is available in the system libbpf (which should not really happen
unless the system is broken), the dynamic linker will refuse to load
it. If the binary is linked against an old version of a particular
symbol, that old version will be used, if it's still provided by the
library. Otherwise, it will not load. I don't see a problem here?

The only problem would be if a particular function changed its
semantics while retaining ABI. But since libbpf is backward and forward
compatible, this should not happen.

 Jiri


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04  3:11                     ` Alexei Starovoitov
@ 2020-11-04 10:01                       ` Jiri Benc
  2020-11-04 10:21                       ` Daniel Borkmann
  2020-11-04 21:15                       ` Edward Cree
  2 siblings, 0 replies; 167+ messages in thread
From: Jiri Benc @ 2020-11-04 10:01 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Hangbin Liu, David Ahern, Daniel Borkmann, Andrii Nakryiko,
	Stephen Hemminger, Alexei Starovoitov, Martin KaFai Lau,
	Song Liu, Yonghong Song, David Miller, Jesper Dangaard Brouer,
	Networking, bpf, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Tue, 3 Nov 2020 19:11:45 -0800, Alexei Starovoitov wrote:
> When we release new version of libbpf it goes through rigorous testing.
> bpftool gets a lot of test coverage as well.
> iproute2 with shared libbpf will get nothing. It's the same random roll of dice.

"Random roll of dice" would be true only if libbpf did incredibly bad
job in keeping backward compatibility. In my experience it is not the
case. Sure, a bug in retaining the compatibility may occasionally
appear; after all, any software tends to contain bugs in various
places. You are right that such bug may not be caught by your testing.

I also believe that if there is a bug in backward compatibility
reported by someone, it will be fixed (if possible). So this is really
just a matter of testing, not a fundamental problem of ABI
compatibility.

Let the distros worry about the testing. Upstream may test (and
even recommend!) certain combinations of iproute2 + libbpf, such as the
latest of both at the time of testing. If distros want to use a
different combination, they can and should do their own testing. If
their testing reveals a bug in backward compatibility and a patch to
fix it is accepted, everything will work smoothly for the distro users.

Non-distro users (or small distros) may just rely on the upstream
tested combination of iproute2 + libbpf.

> Few years from now the situation could be different and shared libbpf would
> be the most appropriate choice. But that day is not today.

Interestingly, the major compatibility problems we had were with llvm
updates. After llvm update while keeping the same kernel version, llvm
started to emit code that the verifier did not accept. Meaning a bpf
program that was previously accepted by the kernel was rejected after
recompilation. This was solved by adding a translation code to libbpf
(which nicely demonstrates that indeed libbpf cares about backward
compatibility).

Now, with dynamically linked libbpf, a single package update was able
to solve the problem for everything on the system, including users' own
programs. All that was needed was making the llvm package force update
the libbpf package (which rpm can do easily with its Conflicts
dependency).

So, at least for us, there was so far no disadvantage (and no problem)
with dynamic linking and a quite substantial advantage.

 Jiri


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04  3:11                     ` Alexei Starovoitov
  2020-11-04 10:01                       ` Jiri Benc
@ 2020-11-04 10:21                       ` Daniel Borkmann
  2020-11-04 11:20                         ` Toke Høiland-Jørgensen
  2020-11-05  3:19                         ` David Ahern
  2020-11-04 21:15                       ` Edward Cree
  2 siblings, 2 replies; 167+ messages in thread
From: Daniel Borkmann @ 2020-11-04 10:21 UTC (permalink / raw)
  To: Alexei Starovoitov, Hangbin Liu
  Cc: David Ahern, Andrii Nakryiko, Stephen Hemminger,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On 11/4/20 4:11 AM, Alexei Starovoitov wrote:
> On Wed, Nov 04, 2020 at 10:17:30AM +0800, Hangbin Liu wrote:
>> On Tue, Nov 03, 2020 at 02:55:54PM -0800, Alexei Starovoitov wrote:
>>>> The scope of bpf in iproute2 is tiny - a few tc modules (and VRF but it
>>>> does not need libbpf) which is a small subset of the functionality and
>>>> commands within the package.
>>>
>>> When Hangbin sent this patch set I got excited that finally tc command
>>> will start working with the latest bpf elf files.
>>> Currently "tc" supports 4 year old files which caused plenty of pain to bpf users.
>>> I got excited, but now I've realized that this patch set will make it worse.
>>> The bpf support in "tc" command instead of being obviously old and obsolete
>>> will be sort-of working with unpredictable delay between released kernel
>>> and released iproute2 version. The iproute2 release that suppose to match kernel
>>> release will be meaningless.
>>> More so, the upgrade of shared libbpf.so can make older iproute2/tc to do
>>> something new and unpredictable.
>>> The user experience will be awful. Not only the users won't know
>>> what to expect out of 'tc' command they won't have a way to debug it.
>>> All of it because iproute2 build will take system libbpf and link it
>>> as shared library by default.
>>> So I think iproute2 must not use libbpf. If I could remove bpf support
>>> from iproute2 I would do so as well.
>>> The current state of iproute2 is hurting bpf ecosystem and proposed
>>> libbpf+iproute2 integration will make it worse.
>>
>> Please take it easy. IMHO, it always very hard to make a perfect solution.
>> From development side, it's easier and could get latest features by using
>> libbpf as submodule. But we need to take care of users, backward
>> compatibility, distros policy etc.
>>
>> I like using iproute2 to load bpf objs. But it's not standardized and too old
>> to load the new BTF defined objs. I think all of us like to improve it by
>> using libbpf. But users and distros are slowly. Some user are still using
>> `ifconfig`. Distros have policies to link the shared .so, etc. We have to
>> compromise on something.
>>
>> Our purpose is to push the user to use new features. As this patchset
>> does, push users to try libbpf instead of legacy code. But this need time.
> 
> My problem with iproute2 picking random libbpf is unpredictability.
> Such roll of dice gives no confidence to users on what is expected to work.
> bpf_hello_world.o will load, but that's it.
> What is going to work with this or that version of "tc" command? No one knows.
> The user will do 'tc -V'. Does version mean anything from bpf loading pov?
> It's not. The user will do "ldd `which tc`" and then what?
> Such bpf support in "tc" is worse than the current one.
> At least the current one is predictably old.

User experience will be crappy and predictability worse, agree on that. For libbpf
it's the same as with rest of iproute2 code in that features are developed along
with the kernel. Distros so far are more or less used to upgrade iproute2 along
with new kernel releases though it's not the first time that some major ones have
been shipping old iproute2 for several releases until we pinged them to finally
get their act together to upgrade. With libbpf dynamically linked it's one more
moving target and it's not clear whether distros will upgrade with same cadence
as iproute2 (or even add libbpf as dependency to their packaging). Only option
users might have if they were to rely on iproute2 and to have predictability is
to ship their stuff via container with current libbpf approach which is probably
not the goal of this set.

> There are alternatives though.
> Forking the whole iproute2 because of "tc" is pointless, of course.
> My 'proposal' was a fire starter because people are too stubborn to
> realize that their long term believes could be incorrect until the fire is burning.
> "bpftool prog load" can load any kind of elf. It cannot operate on qdiscs
> and shouldn't do qdisc manipulations, but may be we can combine them into pipe
> of some sort. Like "bpftool prog load file.o | tc filter ... bpf pipe"
> I think that would be better long term. It will be predictable.

We've been thinking about 'bpftool prog load' as well given we build it right of
the kernel tree and bpftool + libbpf are predictable since they are both built out
of the same git tree and with latest features. I don't think it needs to pipe, it
would be enough to just specify where the loaded progs should be pinned in bpf fs
and tc/ip(xdp) already has the option to pick fd of the entry point from pinned
file. But should be doable as well to just pass fd via pipe to avoid later potential
cleanup. Either way I think it makes sense to do 'bpftool prog load' regardless
since it's generic and useful also for other (non-networking) prog types that can
be attached elsewhere in the system.

> When we release new version of libbpf it goes through rigorous testing.
> bpftool gets a lot of test coverage as well.
> iproute2 with shared libbpf will get nothing. It's the same random roll of dice.
> New libbpf may or may not break iproute2. That's awful user experience.
> So iproute2 has to use git submodule with particular libbpf sha.

Alternatively, you have an uapi sync script already in order to not rely on the
distro system headers installed by the distro and to copy latest uapi ones from
kernel tree. Could as well be extended to have similar fixed built-in situation as
with the current lib/bpf.c in iproute2. Back in the days when developing lib/bpf.c,
it was explicitly done as built-in for iproute2 so that it doesn't take years for
users to actually get to the point where they can realistically make use of it. If
we were to extend the internal lib/bpf.c to similar feature state as libbpf today,
how is that different in the bigger picture compared to sync or submodule... so far
noone complained about lib/bpf.c.

> Then libbpf release process can incorporate proper testing of libbpf
> and iproute2 combination.
> Or iproute2 should stay as-is with obsolete bpf support.
> 
> Few years from now the situation could be different and shared libbpf would
> be the most appropriate choice. But that day is not today.

Yep, for libbpf to be in same situation as libelf or libmnl basically feature
development would have to pretty much come to a stop so that even minor or exotic
distros get to a point where they ship same libbpf version as major distros where
then users can start to rely on the base feature set for developing programs
against it.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 1/5] configure: add check_libbpf() for later libbpf support
  2020-11-04  8:51             ` Hangbin Liu
@ 2020-11-04 11:09               ` Toke Høiland-Jørgensen
  2020-11-04 11:40                 ` Hangbin Liu
  0 siblings, 1 reply; 167+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-11-04 11:09 UTC (permalink / raw)
  To: Hangbin Liu, David Ahern
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko

Hangbin Liu <haliu@redhat.com> writes:

> On Tue, Nov 03, 2020 at 10:32:37AM -0700, David Ahern wrote:
>> configure scripts usually allow you to control options directly,
>> overriding the autoprobe.
>
> What do you think of the follow update? It's a little rough and only controls
> libbpf.
>
> $ git diff
> diff --git a/configure b/configure
> index 711bb69c..be35c024 100755
> --- a/configure
> +++ b/configure
> @@ -442,6 +442,35 @@ endif
>  EOF
>  }
>
> +usage()
> +{
> +       cat <<EOF
> +Usage: $0 [OPTIONS]
> +  -h | --help                  Show this usage info
> +  --no-libbpf                  build the package without libbpf
> +  --libbpf-dir=DIR             build the package with self defined libbpf dir
> +EOF
> +       exit $1
> +}

This would be the only command line arg that configure takes; all other
options are passed via the environment. I think we should be consistent
here; and since converting the whole configure script is probably out of
scope for this patch, why not just use the existing FORCE_LIBBPF
variable?

I.e., FORCE_LIBBPF=on will fail if not libbpf is present,
FORCE_LIBBPF=off will disable libbpf entirely, and if the variable is
unset, libbpf will be used if found?

Alternatively, keep them as two separate variables (FORCE_LIBBPF and
DISABLE_LIBBPF?). I don't have any strong preference as to which of
those is best, but I think they'd both be more consistent with the
existing configure script logic...

-Toke


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04 10:21                       ` Daniel Borkmann
@ 2020-11-04 11:20                         ` Toke Høiland-Jørgensen
  2020-11-04 13:12                           ` Daniel Borkmann
  2020-11-05  3:19                         ` David Ahern
  1 sibling, 1 reply; 167+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-11-04 11:20 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov, Hangbin Liu
  Cc: David Ahern, Andrii Nakryiko, Stephen Hemminger,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko

Daniel Borkmann <daniel@iogearbox.net> writes:

> Back in the days when developing lib/bpf.c, it was explicitly done as
> built-in for iproute2 so that it doesn't take years for users to
> actually get to the point where they can realistically make use of it.
> If we were to extend the internal lib/bpf.c to similar feature state
> as libbpf today, how is that different in the bigger picture compared
> to sync or submodule... so far noone complained about lib/bpf.c.

Except that this whole effort started because lib/bpf.c is slowly
bitrotting into oblivion? If all the tools are dynamically linked
against libbpf, that's only one package the distros have to keep
up-to-date instead of a whole list of tools. How does that make things
*worse*?

-Toke


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 1/5] configure: add check_libbpf() for later libbpf support
  2020-11-04 11:09               ` Toke Høiland-Jørgensen
@ 2020-11-04 11:40                 ` Hangbin Liu
  0 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-04 11:40 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: David Ahern, Stephen Hemminger, Daniel Borkmann,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Andrii Nakryiko

On Wed, Nov 04, 2020 at 12:09:15PM +0100, Toke Høiland-Jørgensen wrote:
> > +usage()
> > +{
> > +       cat <<EOF
> > +Usage: $0 [OPTIONS]
> > +  -h | --help                  Show this usage info
> > +  --no-libbpf                  build the package without libbpf
> > +  --libbpf-dir=DIR             build the package with self defined libbpf dir
> > +EOF
> > +       exit $1
> > +}
> 
> This would be the only command line arg that configure takes; all other
> options are passed via the environment. I think we should be consistent
> here; and since converting the whole configure script is probably out of
> scope for this patch, why not just use the existing FORCE_LIBBPF
> variable?

Yes, converting the whole configure script should be split as another patch
work.
> 
> I.e., FORCE_LIBBPF=on will fail if not libbpf is present,
> FORCE_LIBBPF=off will disable libbpf entirely, and if the variable is
> unset, libbpf will be used if found?

I like this one, with only one variable. I will check how to re-organize the
script.

> 
> Alternatively, keep them as two separate variables (FORCE_LIBBPF and
> DISABLE_LIBBPF?). I don't have any strong preference as to which of
> those is best, but I think they'd both be more consistent with the
> existing configure script logic...

Please tell me if others have any other ideas.

Thanks
Hnagbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04 11:20                         ` Toke Høiland-Jørgensen
@ 2020-11-04 13:12                           ` Daniel Borkmann
  2020-11-04 19:17                             ` Jakub Kicinski
  0 siblings, 1 reply; 167+ messages in thread
From: Daniel Borkmann @ 2020-11-04 13:12 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Alexei Starovoitov, Hangbin Liu
  Cc: David Ahern, Andrii Nakryiko, Stephen Hemminger,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko

On 11/4/20 12:20 PM, Toke Høiland-Jørgensen wrote:
> Daniel Borkmann <daniel@iogearbox.net> writes:
> 
>> Back in the days when developing lib/bpf.c, it was explicitly done as
>> built-in for iproute2 so that it doesn't take years for users to
>> actually get to the point where they can realistically make use of it.
>> If we were to extend the internal lib/bpf.c to similar feature state
>> as libbpf today, how is that different in the bigger picture compared
>> to sync or submodule... so far noone complained about lib/bpf.c.
> 
> Except that this whole effort started because lib/bpf.c is slowly
> bitrotting into oblivion? If all the tools are dynamically linked
> against libbpf, that's only one package the distros have to keep
> up-to-date instead of a whole list of tools. How does that make things
> *worse*?

It sounds good in theory if that would all work out as expected, but reality
differs unfortunately. Today on vast majority of distros you are able to use
iproute2's BPF loader via lib/bpf.c given it's a fixed built-in, even if
it's bitrotting for a while now in terms of features^BTF, but the base functionality
that is in there can be used, and it is used in the wild today. If libbpf is
dynamically linked to iproute2, then I - as a user - am left with continuing
to assume that the current lib/bpf.c is the /only/ base that is really /guaranteed/
to be available as a loader across distros, but iproute2 + libbpf may not be
(it may be the case for RHEL but potentially not others). So from user PoV
I might be sticking to the current lib/bpf.c that iproute2 ships instead of
converting code over until even major distros catch up in maybe 2 years from now
(that is in fact how long it took Canonical to get bpftool included, not kidding).
If we would have done lib/bpf.c as a dynamic library back then, we wouldn't be
where we are today since users might be able to start consuming BPF functionality
just now, don't you agree? This was an explicit design choice back then for exactly
this reason. If we extend lib/bpf.c or import libbpf one way or another then there
is consistency across distros and users would be able to consume it in a predictable
way starting from next major releases. And you could start making this assumption
on all major distros in say, 3 months from now. The discussion is somehow focused
on the PoV of /a/ distro which is all nice and good, but the ones consuming the
loader shipping software /across/ distros are users writing BPF progs, all I'm
trying to say is that the _user experience_ should be the focus of this discussion
and right now we're trying hard making it rather painful for them to consume it.

Cheers,
Daniel

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04 13:12                           ` Daniel Borkmann
@ 2020-11-04 19:17                             ` Jakub Kicinski
  2020-11-04 20:43                               ` Andrii Nakryiko
  0 siblings, 1 reply; 167+ messages in thread
From: Jakub Kicinski @ 2020-11-04 19:17 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Toke Høiland-Jørgensen, Alexei Starovoitov,
	Hangbin Liu, David Ahern, Andrii Nakryiko, Stephen Hemminger,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko

On Wed, 4 Nov 2020 14:12:47 +0100 Daniel Borkmann wrote:
> If we would have done lib/bpf.c as a dynamic library back then, we wouldn't be
> where we are today since users might be able to start consuming BPF functionality
> just now, don't you agree? This was an explicit design choice back then for exactly
> this reason. If we extend lib/bpf.c or import libbpf one way or another then there
> is consistency across distros and users would be able to consume it in a predictable
> way starting from next major releases. And you could start making this assumption
> on all major distros in say, 3 months from now. The discussion is somehow focused
> on the PoV of /a/ distro which is all nice and good, but the ones consuming the
> loader shipping software /across/ distros are users writing BPF progs, all I'm
> trying to say is that the _user experience_ should be the focus of this discussion
> and right now we're trying hard making it rather painful for them to consume it.

IIUC you're saying that we cannot depend on libbpf updates from distro.
Isn't that a pretty bad experience for all users who would like to link
against it? There are 4 components (kernel, lib, tools, compiler) all
need to be kept up to date for optimal user experience. Cutting corners
with one of them leads nowhere medium term IMHO.

Unless what you guys are saying is that libbpf is _not_ supposed to be
backward compatible from the user side, and must be used a submodule.
But then why bother defining ABI versions, or build it as an .so at all.

I'm also confused by the testing argument. Surely the solution is to
add unit / system tests for iproute2. Distros will rebuild packages
when dependencies change and retest. If we have 0 tests doesn't matter
what update strategy there is.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04 19:17                             ` Jakub Kicinski
@ 2020-11-04 20:43                               ` Andrii Nakryiko
  2020-11-04 22:24                                 ` Toke Høiland-Jørgensen
  2020-11-05  3:48                                 ` David Ahern
  0 siblings, 2 replies; 167+ messages in thread
From: Andrii Nakryiko @ 2020-11-04 20:43 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Daniel Borkmann, Toke Høiland-Jørgensen,
	Alexei Starovoitov, Hangbin Liu, David Ahern, Stephen Hemminger,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko

On Wed, Nov 4, 2020 at 11:17 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 4 Nov 2020 14:12:47 +0100 Daniel Borkmann wrote:
> > If we would have done lib/bpf.c as a dynamic library back then, we wouldn't be
> > where we are today since users might be able to start consuming BPF functionality
> > just now, don't you agree? This was an explicit design choice back then for exactly
> > this reason. If we extend lib/bpf.c or import libbpf one way or another then there
> > is consistency across distros and users would be able to consume it in a predictable
> > way starting from next major releases. And you could start making this assumption
> > on all major distros in say, 3 months from now. The discussion is somehow focused
> > on the PoV of /a/ distro which is all nice and good, but the ones consuming the
> > loader shipping software /across/ distros are users writing BPF progs, all I'm
> > trying to say is that the _user experience_ should be the focus of this discussion
> > and right now we're trying hard making it rather painful for them to consume it.

This! Thanks, Daniel, for stating it very explicitly. Earlier I
mentioned iproute2 code simplification if using submodules, but that's
just a nice by-product, not the goal, so I'll just ignore that. I'll
try to emphasize the end user experience though.

What users writing BPF programs can expect from iproute2 in terms of
available BPF features is what matters. And by not enforcing a
specific minimal libbpf version, iproute2 version doesn't matter all
that much, because libbpf version that iproute2 ends up linking
against might be very old.

There was a lot of talk about API stability and backwards
compatibility. Libbpf has had a stable API and ABI for at least 1.5
years now and is very conscious about that when adding or extending
new APIs. That's not even a factor in me arguing for submodules. I'll
give a few specific examples of libbpf API not changing at all, but
how end user experience gets tremendously better.

Some of the most important APIs of libbpf are, arguably,
bpf_object__open() and bpf_object__load(). They accept a BPF ELF file,
do some preprocessing and in the end load BPF instructions into the
kernel for verification. But while API doesn't change across libbpf
versions, BPF-side code features supported changes quite a lot.

1. BTF sanitization. Newer versions of clang would emit a richer set
of BTF type information. Old kernels might not support BTF at all (but
otherwise would work just fine), or might not support some specific
newer additions to BTF. If someone was to use the latest Clang, but
outdated libbpf and old kernel, they would have a bad time, because
their BPF program would fail due to the kernel being strict about BTF.
But new libbpf would "sanitize" BTF, according to supported features
of the kernel, or just drop BTF altogether, if the kernel is that old.

If iproute2's latest version doesn't imply the latest libbpf version,
there is a high chance that the user's BPF program will fail to load.
Which requires users to be **aware** of all these complications, and
care about specific Clang versions and subsets of BTF that get
generated. With the latest libbpf all that goes away.

2. bpf_probe_read_user() falling back to bpf_probe_read(). Newer
kernels warn if a BPF application isn't using a proper _kernel() or
_user() variant of bpf_probe_read(), and eventually will just stop
supporting generic bpf_probe_read(). So what this means is that end
users would need to compile to variants of their BPF application, one
for older kernels with bpf_probe_read(), another with
bpf_probe_read_kernel()/bpf_probe_read_user(). That's a massive pain
in the butt. But newer libbpf versions provide a completely
transparent fallback from _user()/_kernel() variants to generic one,
if the kernel doesn't support new variants. So the instruction to
users becomes simple: always use
bpf_probe_read_user()/bpf_probe_read_kernel().

But with iproute2 not enforcing new enough versions of libbpf, all
that goes out of the window and puts the burden back on end users.

3. Another feature (and far from being the last of this kind in
libbpf) is a full support for individual *non-always-inlined*
functions in BPF code, which was added recently. This allows to
structure BPF code better, get better instruction cache use and for
newer kernels even get significant speed ups of BPF code verification.
This is purely a libbpf feature, no API was changed. Further, the
kernel understands the difference between global and static functions
in BPF code and optimizes verification, if possible. Libbpf takes care
of falling back to static functions for old kernels that are not yet
aware of global functions. All that is completely transparent and
works reliably without users having to deal with three variants of
doing helper functions in their BPF code.

And again, if, when using iproute2, the user doesn't know which
version of libbpf will be used, they have to assume the worst
(__always_inline) or maintain 2 or 3 different copies of their code.

And there are more conveniences like that significantly simplifying
BPF end users by hiding differences of kernel versions, clang
versions, etc.

Submodule is a way that I know of to make this better for end users.
If there are other ways to pull this off with shared library use, I'm
all for it, it will save the security angle that distros are arguing
for. E.g., if distributions will always have the latest libbpf
available almost as soon as it's cut upstream *and* new iproute2
versions enforce the latest libbpf when they are packaged/released,
then this might work equivalently for end users. If Linux distros
would be willing to do this faithfully and promptly, I have no
objections whatsoever. Because all that matters is BPF end user
experience, as Daniel explained above.

>
> IIUC you're saying that we cannot depend on libbpf updates from distro.

As I tried to explain above, a big part of libbpf is BPF loader,
which, while not changing the library API, does get more and advanced
features with newer versions. So yeah, you can totally use older
versions of libbpf, but you need to be aware of all the kernel + clang
+ BPF code features interactions, which newer libbpfs often
transparently alleviate for the user.

So if someone has some old BPF code not using anything fancy, they
might not care all that much, probably.


> Isn't that a pretty bad experience for all users who would like to link
> against it? There are 4 components (kernel, lib, tools, compiler) all
> need to be kept up to date for optimal user experience. Cutting corners
> with one of them leads nowhere medium term IMHO.
>
> Unless what you guys are saying is that libbpf is _not_ supposed to be
> backward compatible from the user side, and must be used a submodule.
> But then why bother defining ABI versions, or build it as an .so at all.

That's not what anyone is saying, I hope we established that in this
thread that libbpf does provide a stable API and ABI, with backwards
and forward compatibility. And takes it very seriously. User BPF
programs just tend to grow in complexity and features used and newer
libbpf versions are sometimes a requirement to utilize all that
effectively.

>
> I'm also confused by the testing argument. Surely the solution is to
> add unit / system tests for iproute2. Distros will rebuild packages
> when dependencies change and retest. If we have 0 tests doesn't matter
> what update strategy there is.

Tests are good, but I'm a bit sceptical about the surface area that
could be tested. Compiled BPF program (ELF file) is an input to BPF
loader APIs, and that compiled BPF program can be arbitrarily complex,
using a variety of different kernel/libbpf features. So a single
non-changing APIs accepts an infinite variety of inputs. selftests/bpf
mandate that each new kernel and libbpf feature gets a test, I'm
wondering if iproute2 test suite would be able to keep up with this.
And then again, some features are not supposed to work on older libbpf
versions, so not clear how iproute2 would test that. But regardless,
more testing is always better, so I hope this won't discourage testing
per se.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04  3:11                     ` Alexei Starovoitov
  2020-11-04 10:01                       ` Jiri Benc
  2020-11-04 10:21                       ` Daniel Borkmann
@ 2020-11-04 21:15                       ` Edward Cree
  2020-11-04 22:10                         ` Alexei Starovoitov
  2 siblings, 1 reply; 167+ messages in thread
From: Edward Cree @ 2020-11-04 21:15 UTC (permalink / raw)
  To: Alexei Starovoitov, Hangbin Liu
  Cc: David Ahern, Daniel Borkmann, Andrii Nakryiko, Stephen Hemminger,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On 04/11/2020 03:11, Alexei Starovoitov wrote:
> The user will do 'tc -V'. Does version mean anything from bpf loading pov?
> It's not. The user will do "ldd `which tc`" and then what?
Is it beyond the wit of man for 'tc -V' to output somethingabout
 libbpf version?
Other libraries seem to solve these problems all the time, I
 haven't seen anyone explain what makes libbpf so special that it
 has to be different.

-ed

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04 21:15                       ` Edward Cree
@ 2020-11-04 22:10                         ` Alexei Starovoitov
  2020-11-04 22:35                           ` Toke Høiland-Jørgensen
  2020-11-04 23:05                           ` Edward Cree
  0 siblings, 2 replies; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-04 22:10 UTC (permalink / raw)
  To: Edward Cree
  Cc: Hangbin Liu, David Ahern, Daniel Borkmann, Andrii Nakryiko,
	Stephen Hemminger, Alexei Starovoitov, Martin KaFai Lau,
	Song Liu, Yonghong Song, David Miller, Jesper Dangaard Brouer,
	Networking, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Wed, Nov 4, 2020 at 1:16 PM Edward Cree <ecree@solarflare.com> wrote:
>
> On 04/11/2020 03:11, Alexei Starovoitov wrote:
> > The user will do 'tc -V'. Does version mean anything from bpf loading pov?
> > It's not. The user will do "ldd `which tc`" and then what?
> Is it beyond the wit of man for 'tc -V' to output somethingabout
>  libbpf version?
> Other libraries seem to solve these problems all the time, I
>  haven't seen anyone explain what makes libbpf so special that it
>  has to be different.

slow vger? Please see Daniel and Andrii detailed explanations.

libbpf is not your traditional library.
Looking through the installed libraries on my devserver in /lib64/ directory
I think the closest is libbfd.so
Then think why gdb always statically links it.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04 20:43                               ` Andrii Nakryiko
@ 2020-11-04 22:24                                 ` Toke Høiland-Jørgensen
  2020-11-05 20:14                                   ` Andrii Nakryiko
  2020-11-05  3:48                                 ` David Ahern
  1 sibling, 1 reply; 167+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-11-04 22:24 UTC (permalink / raw)
  To: Andrii Nakryiko, Jakub Kicinski
  Cc: Daniel Borkmann, Alexei Starovoitov, Hangbin Liu, David Ahern,
	Stephen Hemminger, Alexei Starovoitov, Martin KaFai Lau,
	Song Liu, Yonghong Song, David Miller, Jesper Dangaard Brouer,
	Networking, bpf, Jiri Benc, Andrii Nakryiko

Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:

> Some of the most important APIs of libbpf are, arguably,
> bpf_object__open() and bpf_object__load(). They accept a BPF ELF file,
> do some preprocessing and in the end load BPF instructions into the
> kernel for verification. But while API doesn't change across libbpf
> versions, BPF-side code features supported changes quite a lot.

Yes, which means that nothing has to change in iproute2 *at all* to get
this; not the version, not even a rebuild: just update the system
libbpf, and you'll automatically gain all these features. How is that an
argument for *not* linking dynamically? It's a user *benefit* to not
have to care about the iproute2 version, but only have to care about
keeping libbpf up to date.

I mean, if iproute2 had started out by linking dynamically against
libbpf (setting aside the fact that libbpf didn't exist back then), we
wouldn't even be having this conversation: In that case its support for
new features in the BPF format would just automatically have kept up
along with the rest of the system as the library got upgraded...

-Toke


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04 22:10                         ` Alexei Starovoitov
@ 2020-11-04 22:35                           ` Toke Høiland-Jørgensen
  2020-11-04 23:05                           ` Edward Cree
  1 sibling, 0 replies; 167+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-11-04 22:35 UTC (permalink / raw)
  To: Alexei Starovoitov, Edward Cree
  Cc: Hangbin Liu, David Ahern, Daniel Borkmann, Andrii Nakryiko,
	Stephen Hemminger, Alexei Starovoitov, Martin KaFai Lau,
	Song Liu, Yonghong Song, David Miller, Jesper Dangaard Brouer,
	Networking, bpf, Jiri Benc, Andrii Nakryiko

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Wed, Nov 4, 2020 at 1:16 PM Edward Cree <ecree@solarflare.com> wrote:
>>
>> On 04/11/2020 03:11, Alexei Starovoitov wrote:
>> > The user will do 'tc -V'. Does version mean anything from bpf loading pov?
>> > It's not. The user will do "ldd `which tc`" and then what?
>> Is it beyond the wit of man for 'tc -V' to output somethingabout
>>  libbpf version?
>> Other libraries seem to solve these problems all the time, I
>>  haven't seen anyone explain what makes libbpf so special that it
>>  has to be different.
>
> slow vger? Please see Daniel and Andrii detailed explanations.
>
> libbpf is not your traditional library.
> Looking through the installed libraries on my devserver in /lib64/ directory
> I think the closest is libbfd.so
> Then think why gdb always statically links it.

The distinguishing feature is the tool, not the library. For a tool that
intimately depends detailed behaviour, sure it makes sense to statically
link to know exactly which version you have. But for BPF, that is
bpftool, not iproute2.

For iproute2, libbpf serves a very simple function: load a BPF program
from an object file and turn it into an fd that can be attached. For
that, dynamic linking is the right thing to do so library upgrades can
bring in new support without touching the tool itself.

Daniel's example from upthread illustrates it:

bpftool prog load | tc attach

i.e., decoupling load from attach. Which is *exactly* what dynamic
linking in iproute2 would mean, except using ld(1) instead of a pipe!

-Toke


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04 22:10                         ` Alexei Starovoitov
  2020-11-04 22:35                           ` Toke Høiland-Jørgensen
@ 2020-11-04 23:05                           ` Edward Cree
  2020-11-05 20:19                             ` Andrii Nakryiko
  1 sibling, 1 reply; 167+ messages in thread
From: Edward Cree @ 2020-11-04 23:05 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Hangbin Liu, David Ahern, Daniel Borkmann, Andrii Nakryiko,
	Stephen Hemminger, Alexei Starovoitov, Martin KaFai Lau,
	Song Liu, Yonghong Song, David Miller, Jesper Dangaard Brouer,
	Networking, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On 04/11/2020 22:10, Alexei Starovoitov wrote:
> On Wed, Nov 4, 2020 at 1:16 PM Edward Cree <ecree@solarflare.com> wrote:
>> On 04/11/2020 03:11, Alexei Starovoitov wrote:
>>> The user will do 'tc -V'. Does version mean anything from bpf loading pov?
>>> It's not. The user will do "ldd `which tc`" and then what?
>> Is it beyond the wit of man for 'tc -V' to output somethingabout
>>  libbpf version?
>> Other libraries seem to solve these problems all the time, I
>>  haven't seen anyone explain what makes libbpf so special that it
>>  has to be different.
> slow vger? Please see Daniel and Andrii detailed explanations.
Nah, I've seen that subthread(vger is fine).  I felt that subthread
 was missing this point about -V which is why I replied where it was
 brought up.
Daniel and Andrii have only explained why users will want to have an
 up-to-date libbpf, they (and you) haven't connected it to any
 argument about why static linking is the way to achieve that.
> libbpf is not your traditional library.
This has only been asserted, not explained.
I'm fully willing to entertain the possibility that libbpf is indeed
 special.  But if you want to win people over, you'll need to
 explain *why* it's special.
"Look at bfd and think why" is not enough, be more explicit.

AIUI the API between iproute2 and libbpf isn't changing, all that's
 happening is that libbpf is gaining new capabilities in things that
 are totally transparent to iproute2 (e.g. BTF fixups).  So the
 reasonable thing for users to expect is "I need new BPF features,
 I'll upgrade my libbpf", and with dynamic linking that works fine
 whether they upgrade iproute2 too or not.
This narrative is, on the face of it, just as plausible as "I'm
 getting an error from iproute2, I'll upgrade that".  And if distros
 decide that that's a common enough mistake to matter, then they can
 make the newer iproute2 package depend on a newer libbpf package,
 and apt or yum or whatever will automagically DTRT.
Whereas if you tightly couple them from the start, distros can't
 then go the other way if it turns out you made the wrong choice.
 (What if someone can't use the latest iproute2 release because it
 has a regression bug that breaks their use-case, but they need the
 latest libbpf for one of your shiny new features?)

Don't get me wrong, I'd love a world in which static linking was the
 norm and we all rebuilt our binaries locally every time we upgraded
 a piece.  But that's not the world we live in, and consistency
 *within* a distro matters too...

-ed

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 3/5] lib: add libbpf support
  2020-11-04  8:22         ` Hangbin Liu
@ 2020-11-05  2:33           ` David Ahern
  2020-11-05  7:51             ` Hangbin Liu
  0 siblings, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-11-05  2:33 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: Stephen Hemminger, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On 11/4/20 1:22 AM, Hangbin Liu wrote:
> If we move this #ifdef HAVE_LIBBPF to bpf_legacy.c, we need to rename
> them all. With current patch, we limit all the legacy functions in bpf_legacy
> and doesn't mix them with libbpf.h. What do you think?

Let's rename conflicts with a prefix -- like legacy. In fact, those
iproute2_ functions names could use the legacy_ prefix as well.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04  9:28                       ` Jiri Benc
@ 2020-11-05  2:39                         ` David Ahern
  0 siblings, 0 replies; 167+ messages in thread
From: David Ahern @ 2020-11-05  2:39 UTC (permalink / raw)
  To: Jiri Benc, Alexei Starovoitov
  Cc: Daniel Borkmann, Andrii Nakryiko, Hangbin Liu, Stephen Hemminger,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On 11/4/20 2:28 AM, Jiri Benc wrote:
> On Tue, 3 Nov 2020 18:45:59 -0800, Alexei Starovoitov wrote:
>> libbpf is the only library I know that is backward and forward compatible.
> 
> This is great to hear. It means there will be no problem with iproute2
> using the system libbpf. As libbpf is both backward and forward
> compatible, iproute2 will just work with whatever version it is used
> with.

That is how I read that as well. The bpf team is making sure libbpf is a
stable, robust front-end to kernel APIs. That stability is what controls
the user experience. With the due diligence in testing, packages using
libbpf can have confidence that using an libbpf API is not going to
change release over release regardless of kernel version installed
(i.e., as kernel versions go newer from an OS start point - typical
scenario for a distribution).


> 
> The only problem would be if a particular function changed its
> semantics while retaining ABI. But since libbpf is backward and forward
> compatible, this should not happen.

exactly.

Then, If libbpf needs to change something that affects users, it bumps
the soname version.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04 10:21                       ` Daniel Borkmann
  2020-11-04 11:20                         ` Toke Høiland-Jørgensen
@ 2020-11-05  3:19                         ` David Ahern
  2020-11-05 14:05                           ` Jamal Hadi Salim
  2020-11-05 20:45                           ` Andrii Nakryiko
  1 sibling, 2 replies; 167+ messages in thread
From: David Ahern @ 2020-11-05  3:19 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov, Hangbin Liu
  Cc: Andrii Nakryiko, Stephen Hemminger, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On 11/4/20 3:21 AM, Daniel Borkmann wrote:
> 
>> Then libbpf release process can incorporate proper testing of libbpf
>> and iproute2 combination.
>> Or iproute2 should stay as-is with obsolete bpf support.
>>
>> Few years from now the situation could be different and shared libbpf
>> would
>> be the most appropriate choice. But that day is not today.
> 
> Yep, for libbpf to be in same situation as libelf or libmnl basically
> feature
> development would have to pretty much come to a stop so that even minor
> or exotic
> distros get to a point where they ship same libbpf version as major
> distros where
> then users can start to rely on the base feature set for developing
> programs
> against it.

User experience keeps getting brought up, but I also keep reading the
stance that BPF users can not expect a consistent experience unless they
are constantly chasing latest greatest versions of *ALL* S/W related to
BPF. That is not a realistic expectation for users. Distributions exist
for a reason. They solve real packaging problems.

As libbpf and bpf in general reach a broader audience, the requirements
to use, deploy and even tryout BPF features needs to be more user
friendly and that starts with maintainers of the BPF code and how they
approach extensions and features. Telling libbpf consumers to make
libbpf a submodule of their project and update the reference point every
time a new release comes out is not user friendly.

Similarly, it is not realistic or user friendly to *require* general
Linux users to constantly chase latest versions of llvm, clang, dwarves,
bcc, bpftool, libbpf, (I am sure I am missing more), and, by extension
of what you want here, iproute2 just to upgrade their production kernel
to say v5.10, the next LTS, or to see what relevant new ebpf features
exists in the new kernel. As a specific example BTF extensions are added
in a way that is all or nothing. Meaning, you want to compile kernel
version X with CONFIG_DEBUG_INFO_BTF enabled, update your toolchain.
Sure, you are using the latest LTS of $distro, and it worked fine with
kernel version X-1 last week, but now compile fails completely unless
the pahole version is updated. Horrible user experience. Again, just an
example and one I brought up in July. I am sure there more.

Linux APIs are about stability and consistency. Commands and libraries
that work on v5.9 should work exactly the same on v5.10, 5.11, 5.12, ...
*IF* I want a new feature (kernel, bpf or libbpf), then the requirement
to upgrade is justified. But if I am just updating my kernel, or
updating my compiler, or updating iproute2 because I want to try out
some new nexthop feature, I should not be cornered into an all or
nothing scheme.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04 20:43                               ` Andrii Nakryiko
  2020-11-04 22:24                                 ` Toke Høiland-Jørgensen
@ 2020-11-05  3:48                                 ` David Ahern
  2020-11-05 20:53                                   ` Andrii Nakryiko
  1 sibling, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-11-05  3:48 UTC (permalink / raw)
  To: Andrii Nakryiko, Jakub Kicinski
  Cc: Daniel Borkmann, Toke Høiland-Jørgensen,
	Alexei Starovoitov, Hangbin Liu, Stephen Hemminger,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko

On 11/4/20 1:43 PM, Andrii Nakryiko wrote:
> 
> What users writing BPF programs can expect from iproute2 in terms of
> available BPF features is what matters. And by not enforcing a
> specific minimal libbpf version, iproute2 version doesn't matter all
> that much, because libbpf version that iproute2 ends up linking
> against might be very old.
> 
> There was a lot of talk about API stability and backwards
> compatibility. Libbpf has had a stable API and ABI for at least 1.5
> years now and is very conscious about that when adding or extending
> new APIs. That's not even a factor in me arguing for submodules. I'll
> give a few specific examples of libbpf API not changing at all, but
> how end user experience gets tremendously better.
> 
> Some of the most important APIs of libbpf are, arguably,
> bpf_object__open() and bpf_object__load(). They accept a BPF ELF file,
> do some preprocessing and in the end load BPF instructions into the
> kernel for verification. But while API doesn't change across libbpf
> versions, BPF-side code features supported changes quite a lot.
> 
> 1. BTF sanitization. Newer versions of clang would emit a richer set
> of BTF type information. Old kernels might not support BTF at all (but
> otherwise would work just fine), or might not support some specific
> newer additions to BTF. If someone was to use the latest Clang, but
> outdated libbpf and old kernel, they would have a bad time, because
> their BPF program would fail due to the kernel being strict about BTF.
> But new libbpf would "sanitize" BTF, according to supported features
> of the kernel, or just drop BTF altogether, if the kernel is that old.
> 

In my experience, compilers are the least likely change in a typical
Linux development environment. BPF should not be forcing new versions
(see me last response).

> 
> 2. bpf_probe_read_user() falling back to bpf_probe_read(). Newer
> kernels warn if a BPF application isn't using a proper _kernel() or
> _user() variant of bpf_probe_read(), and eventually will just stop
> supporting generic bpf_probe_read(). So what this means is that end
> users would need to compile to variants of their BPF application, one
> for older kernels with bpf_probe_read(), another with
> bpf_probe_read_kernel()/bpf_probe_read_user(). That's a massive pain
> in the butt. But newer libbpf versions provide a completely
> transparent fallback from _user()/_kernel() variants to generic one,
> if the kernel doesn't support new variants. So the instruction to
> users becomes simple: always use
> bpf_probe_read_user()/bpf_probe_read_kernel().
> 

I vaguely recall a thread about having BPF system call return user
friendly messages, but that was shot down. I take this example to mean
the solution is to have libbpf handle the quirks and various changes
which means that now libbpf takes on burden - the need for constant
updates to handle quirks. extack has been very successful at making
networking configuration mistakes more user friendly. Other kernel
features should be using the same kind of extension.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 3/5] lib: add libbpf support
  2020-11-05  2:33           ` David Ahern
@ 2020-11-05  7:51             ` Hangbin Liu
  2020-11-05 15:25               ` David Ahern
  0 siblings, 1 reply; 167+ messages in thread
From: Hangbin Liu @ 2020-11-05  7:51 UTC (permalink / raw)
  To: David Ahern
  Cc: Stephen Hemminger, netdev, bpf, Toke Høiland-Jørgensen

On Wed, Nov 04, 2020 at 07:33:40PM -0700, David Ahern wrote:
> On 11/4/20 1:22 AM, Hangbin Liu wrote:
> > If we move this #ifdef HAVE_LIBBPF to bpf_legacy.c, we need to rename
> > them all. With current patch, we limit all the legacy functions in bpf_legacy
> > and doesn't mix them with libbpf.h. What do you think?
> 
> Let's rename conflicts with a prefix -- like legacy. In fact, those
> iproute2_ functions names could use the legacy_ prefix as well.
> 

Sorry, when trying to rename the functions. I just found another issue.
Even we fix the conflicts right now. What if libbpf add new functions
and we got another conflict in future? There are too much bpf functions
in bpf_legacy.c which would have more risks for naming conflicts..

With bpf_libbpf.c, there are less functions and has less risk for naming
conflicts. So I think it maybe better to not include libbpf.h in bpf_legacy.c.
What do you think?

Thanks
Hangbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-05  3:19                         ` David Ahern
@ 2020-11-05 14:05                           ` Jamal Hadi Salim
  2020-11-05 21:01                             ` Andrii Nakryiko
  2020-11-10 12:47                             ` Edward Cree
  2020-11-05 20:45                           ` Andrii Nakryiko
  1 sibling, 2 replies; 167+ messages in thread
From: Jamal Hadi Salim @ 2020-11-05 14:05 UTC (permalink / raw)
  To: David Ahern, Daniel Borkmann, Alexei Starovoitov, Hangbin Liu
  Cc: Andrii Nakryiko, Stephen Hemminger, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On 2020-11-04 10:19 p.m., David Ahern wrote:

[..]
> 
> User experience keeps getting brought up, but I also keep reading the
> stance that BPF users can not expect a consistent experience unless they
> are constantly chasing latest greatest versions of *ALL* S/W related to
> BPF. That is not a realistic expectation for users. Distributions exist
> for a reason. They solve real packaging problems.
> 
> As libbpf and bpf in general reach a broader audience, the requirements
> to use, deploy and even tryout BPF features needs to be more user
> friendly and that starts with maintainers of the BPF code and how they
> approach extensions and features. Telling libbpf consumers to make
> libbpf a submodule of their project and update the reference point every
> time a new release comes out is not user friendly.
> 
> Similarly, it is not realistic or user friendly to *require* general
> Linux users to constantly chase latest versions of llvm, clang, dwarves,
> bcc, bpftool, libbpf, (I am sure I am missing more), and, by extension
> of what you want here, iproute2 just to upgrade their production kernel
> to say v5.10, the next LTS, or to see what relevant new ebpf features
> exists in the new kernel. As a specific example BTF extensions are added
> in a way that is all or nothing. Meaning, you want to compile kernel
> version X with CONFIG_DEBUG_INFO_BTF enabled, update your toolchain.
> Sure, you are using the latest LTS of $distro, and it worked fine with
> kernel version X-1 last week, but now compile fails completely unless
> the pahole version is updated. Horrible user experience. Again, just an
> example and one I brought up in July. I am sure there more.
> 


2cents feedback from a dabbler in ebpf on user experience:

What David described above *has held me back*.
Over time it seems things have gotten better with libbpf
(although a few times i find myself copying includes from the
latest iproute into libbpf). I ended up just doing static links.
The idea of upgrading clang/llvm every 2 months i revisit ebpf is
the most painful. At times code that used to compile just fine
earlier doesnt anymore. There's a minor issue of requiring i install
kernel headers every time i want to run something in samples, etc
but i am probably lacking knowledge on how to ease the pain in that
regard.

I find the loader and associated tooling in iproute2/tc to be quiet
stable (not shiny but works everytime).
And for that reason i often find myself sticking to just tc instead
of toying with other areas.
Slight tangent:
One thing that would help libbpf adoption is to include an examples/
directory. Put a bunch of sample apps for tc, probes, xdp etc.
And have them compile outside of the kernel. Maybe useful Makefiles
that people can cutnpaste from. Every time you add a new feature
put some sample code in the examples.

cheers,
jamal

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 3/5] lib: add libbpf support
  2020-11-05  7:51             ` Hangbin Liu
@ 2020-11-05 15:25               ` David Ahern
  2020-11-05 15:57                 ` Toke Høiland-Jørgensen
  2020-11-06  0:41                 ` Hangbin Liu
  0 siblings, 2 replies; 167+ messages in thread
From: David Ahern @ 2020-11-05 15:25 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: Stephen Hemminger, netdev, bpf, Toke Høiland-Jørgensen

On 11/5/20 12:51 AM, Hangbin Liu wrote:
> On Wed, Nov 04, 2020 at 07:33:40PM -0700, David Ahern wrote:
>> On 11/4/20 1:22 AM, Hangbin Liu wrote:
>>> If we move this #ifdef HAVE_LIBBPF to bpf_legacy.c, we need to rename
>>> them all. With current patch, we limit all the legacy functions in bpf_legacy
>>> and doesn't mix them with libbpf.h. What do you think?
>>
>> Let's rename conflicts with a prefix -- like legacy. In fact, those
>> iproute2_ functions names could use the legacy_ prefix as well.
>>
> 
> Sorry, when trying to rename the functions. I just found another issue.
> Even we fix the conflicts right now. What if libbpf add new functions
> and we got another conflict in future? There are too much bpf functions
> in bpf_legacy.c which would have more risks for naming conflicts..
> 
> With bpf_libbpf.c, there are less functions and has less risk for naming
> conflicts. So I think it maybe better to not include libbpf.h in bpf_legacy.c.
> What do you think?
> 
>

Is there a way to sort the code such that bpf_legacy.c is not used when
libbpf is enabled and bpf_libbpf.c is not compiled when libbpf is disabled.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 3/5] lib: add libbpf support
  2020-11-05 15:25               ` David Ahern
@ 2020-11-05 15:57                 ` Toke Høiland-Jørgensen
  2020-11-05 16:02                   ` David Ahern
  2020-11-06  0:41                 ` Hangbin Liu
  1 sibling, 1 reply; 167+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-11-05 15:57 UTC (permalink / raw)
  To: David Ahern, Hangbin Liu; +Cc: Stephen Hemminger, netdev, bpf

David Ahern <dsahern@gmail.com> writes:

> On 11/5/20 12:51 AM, Hangbin Liu wrote:
>> On Wed, Nov 04, 2020 at 07:33:40PM -0700, David Ahern wrote:
>>> On 11/4/20 1:22 AM, Hangbin Liu wrote:
>>>> If we move this #ifdef HAVE_LIBBPF to bpf_legacy.c, we need to rename
>>>> them all. With current patch, we limit all the legacy functions in bpf_legacy
>>>> and doesn't mix them with libbpf.h. What do you think?
>>>
>>> Let's rename conflicts with a prefix -- like legacy. In fact, those
>>> iproute2_ functions names could use the legacy_ prefix as well.
>>>
>> 
>> Sorry, when trying to rename the functions. I just found another issue.
>> Even we fix the conflicts right now. What if libbpf add new functions
>> and we got another conflict in future? There are too much bpf functions
>> in bpf_legacy.c which would have more risks for naming conflicts..
>> 
>> With bpf_libbpf.c, there are less functions and has less risk for naming
>> conflicts. So I think it maybe better to not include libbpf.h in bpf_legacy.c.
>> What do you think?
>> 
>>
>
> Is there a way to sort the code such that bpf_legacy.c is not used when
> libbpf is enabled and bpf_libbpf.c is not compiled when libbpf is disabled.

That's basically what we were going for, i.e.:

git mv lib/bpf.c lib/bpf_legacy.c
git add lib/bpf_libbpf.c

and then adding ifdefs to bpf_legacy.c and only including the other if
libbpf support is enabled.

I guess we could split it further into lib/bpf_{libbpf,legacy,glue}.c
and have the two former ones be completely devoid of ifdefs and
conditionally included based on whether or not libbpf support is
enabled?

-Toke


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 3/5] lib: add libbpf support
  2020-11-05 15:57                 ` Toke Høiland-Jørgensen
@ 2020-11-05 16:02                   ` David Ahern
  2020-11-06  0:56                     ` Hangbin Liu
  0 siblings, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-11-05 16:02 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Hangbin Liu
  Cc: Stephen Hemminger, netdev, bpf

On 11/5/20 8:57 AM, Toke Høiland-Jørgensen wrote:
> I guess we could split it further into lib/bpf_{libbpf,legacy,glue}.c
> and have the two former ones be completely devoid of ifdefs and
> conditionally included based on whether or not libbpf support is
> enabled?

that sounds reasonable.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04 22:24                                 ` Toke Høiland-Jørgensen
@ 2020-11-05 20:14                                   ` Andrii Nakryiko
  0 siblings, 0 replies; 167+ messages in thread
From: Andrii Nakryiko @ 2020-11-05 20:14 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Jakub Kicinski, Daniel Borkmann, Alexei Starovoitov, Hangbin Liu,
	David Ahern, Stephen Hemminger, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko

On Wed, Nov 4, 2020 at 2:24 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>
> > Some of the most important APIs of libbpf are, arguably,
> > bpf_object__open() and bpf_object__load(). They accept a BPF ELF file,
> > do some preprocessing and in the end load BPF instructions into the
> > kernel for verification. But while API doesn't change across libbpf
> > versions, BPF-side code features supported changes quite a lot.
>
> Yes, which means that nothing has to change in iproute2 *at all* to get
> this; not the version, not even a rebuild: just update the system
> libbpf, and you'll automatically gain all these features. How is that an
> argument for *not* linking dynamically? It's a user *benefit* to not
> have to care about the iproute2 version, but only have to care about
> keeping libbpf up to date.
>
> I mean, if iproute2 had started out by linking dynamically against
> libbpf (setting aside the fact that libbpf didn't exist back then), we
> wouldn't even be having this conversation: In that case its support for
> new features in the BPF format would just automatically have kept up
> along with the rest of the system as the library got upgraded...
>

I think it's a difference in the perspective.

You are seeing iproute2 as an explicit proxy to libbpf. Users should
be aware of the fact that iproute2 just uses libbpf to load whatever
BPF ELF file user provides. At that point iproute2 versions almost
doesn't matter. Whatever BPF application users provide (that rely on
iproute2 to load it) should still be very conscious about libbpf
version and depend on that explicitly.

I saw it differently. For me, the fact that iproute2 is using libbpf
is an implementation detail. User developing BPF application is
providing a BPF ELF file that follows a de facto BPF "spec" (all those
SEC() conventions, global variables, map references, etc). Yes, that
"spec" is being driven by libbpf currently, but libbpf is not the only
library that supports it. Go BPF library is trying to keep up and
support most of the same features. So in that sense, iproute2 is
another BPF loader, just like Go library and libbpf library. The fact
that it defers to libbpf should be not important to the end user. With
that view, if a user tested their BPF program with a specific iproute2
version, it should be enough.

But clearly that's not the view that most people on this thread hold
and prefer end users to know and care about libbpf versioning
explicitly. That's fine.

But can we at least make sure that when libbpf is integrated with
iproute2, it specifies the latest libbpf (v0.2) as a dependency?


> -Toke
>

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-04 23:05                           ` Edward Cree
@ 2020-11-05 20:19                             ` Andrii Nakryiko
  2020-11-06  8:44                               ` Jiri Benc
  0 siblings, 1 reply; 167+ messages in thread
From: Andrii Nakryiko @ 2020-11-05 20:19 UTC (permalink / raw)
  To: Edward Cree
  Cc: Alexei Starovoitov, Hangbin Liu, David Ahern, Daniel Borkmann,
	Stephen Hemminger, Alexei Starovoitov, Martin KaFai Lau,
	Song Liu, Yonghong Song, David Miller, Jesper Dangaard Brouer,
	Networking, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Wed, Nov 4, 2020 at 3:05 PM Edward Cree <ecree@solarflare.com> wrote:
>
> On 04/11/2020 22:10, Alexei Starovoitov wrote:
> > On Wed, Nov 4, 2020 at 1:16 PM Edward Cree <ecree@solarflare.com> wrote:
> >> On 04/11/2020 03:11, Alexei Starovoitov wrote:
> >>> The user will do 'tc -V'. Does version mean anything from bpf loading pov?
> >>> It's not. The user will do "ldd `which tc`" and then what?
> >> Is it beyond the wit of man for 'tc -V' to output somethingabout
> >>  libbpf version?
> >> Other libraries seem to solve these problems all the time, I
> >>  haven't seen anyone explain what makes libbpf so special that it
> >>  has to be different.
> > slow vger? Please see Daniel and Andrii detailed explanations.
> Nah, I've seen that subthread(vger is fine).  I felt that subthread
>  was missing this point about -V which is why I replied where it was
>  brought up.
> Daniel and Andrii have only explained why users will want to have an
>  up-to-date libbpf, they (and you) haven't connected it to any
>  argument about why static linking is the way to achieve that.

I'll just quote myself here for your convenience.

  Submodule is a way that I know of to make this better for end users.
  If there are other ways to pull this off with shared library use, I'm
  all for it, it will save the security angle that distros are arguing
  for. E.g., if distributions will always have the latest libbpf
  available almost as soon as it's cut upstream *and* new iproute2
  versions enforce the latest libbpf when they are packaged/released,
  then this might work equivalently for end users. If Linux distros
  would be willing to do this faithfully and promptly, I have no
  objections whatsoever. Because all that matters is BPF end user
  experience, as Daniel explained above.

No one replied to that, unfortunately.


> > libbpf is not your traditional library.
> This has only been asserted, not explained.
> I'm fully willing to entertain the possibility that libbpf is indeed
>  special.  But if you want to win people over, you'll need to
>  explain *why* it's special.
> "Look at bfd and think why" is not enough, be more explicit.
>
> AIUI the API between iproute2 and libbpf isn't changing, all that's
>  happening is that libbpf is gaining new capabilities in things that
>  are totally transparent to iproute2 (e.g. BTF fixups).  So the
>  reasonable thing for users to expect is "I need new BPF features,
>  I'll upgrade my libbpf", and with dynamic linking that works fine
>  whether they upgrade iproute2 too or not.
> This narrative is, on the face of it, just as plausible as "I'm
>  getting an error from iproute2, I'll upgrade that".  And if distros
>  decide that that's a common enough mistake to matter, then they can
>  make the newer iproute2 package depend on a newer libbpf package,
>  and apt or yum or whatever will automagically DTRT.
> Whereas if you tightly couple them from the start, distros can't
>  then go the other way if it turns out you made the wrong choice.
>  (What if someone can't use the latest iproute2 release because it
>  has a regression bug that breaks their use-case, but they need the
>  latest libbpf for one of your shiny new features?)
>
> Don't get me wrong, I'd love a world in which static linking was the
>  norm and we all rebuilt our binaries locally every time we upgraded
>  a piece.  But that's not the world we live in, and consistency
>  *within* a distro matters too...
>
> -ed

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-05  3:19                         ` David Ahern
  2020-11-05 14:05                           ` Jamal Hadi Salim
@ 2020-11-05 20:45                           ` Andrii Nakryiko
  2020-11-06  9:00                             ` Jiri Benc
  1 sibling, 1 reply; 167+ messages in thread
From: Andrii Nakryiko @ 2020-11-05 20:45 UTC (permalink / raw)
  To: David Ahern
  Cc: Daniel Borkmann, Alexei Starovoitov, Hangbin Liu,
	Stephen Hemminger, Alexei Starovoitov, Martin KaFai Lau,
	Song Liu, Yonghong Song, David Miller, Jesper Dangaard Brouer,
	Networking, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Wed, Nov 4, 2020 at 7:19 PM David Ahern <dsahern@gmail.com> wrote:
>
> On 11/4/20 3:21 AM, Daniel Borkmann wrote:
> >
> >> Then libbpf release process can incorporate proper testing of libbpf
> >> and iproute2 combination.
> >> Or iproute2 should stay as-is with obsolete bpf support.
> >>
> >> Few years from now the situation could be different and shared libbpf
> >> would
> >> be the most appropriate choice. But that day is not today.
> >
> > Yep, for libbpf to be in same situation as libelf or libmnl basically
> > feature
> > development would have to pretty much come to a stop so that even minor
> > or exotic
> > distros get to a point where they ship same libbpf version as major
> > distros where
> > then users can start to rely on the base feature set for developing
> > programs
> > against it.
>
> User experience keeps getting brought up, but I also keep reading the
> stance that BPF users can not expect a consistent experience unless they
> are constantly chasing latest greatest versions of *ALL* S/W related to

That's not true. If you need new functionality like BTF, CO-RE,
function-by-function verification, etc., then yes, you have to update
kernel, compiler, libbpf, sometimes pahole. But if you have an BPF
application that doesn't use and need any of the newer features, it
will keep working just fine with the old kernel, old libbpf, and old
compiler.

Life is a bit more nuanced, of course. Sometimes a Clang update will
cause a shift in code generation patterns and you'd need either kernel
update (to get improved verifier logic) and/or libbpf update (to
compensate for either kernel or Clang change). Or update Clang again
to get a fixed version. That's life, bugs and problems are real.

If you care about using BTF-powered features, yes, you might need to
update pahole to get basic BTF, or get new BTF funcs needed for
fentry/fexit, or soon you'll need v1.19 if you want kernel module
BTFs. If you don't care about BTF, don't set CONFIG_DEBUG_INFO_BTF=y
and you won't even need pahole. For kernel module BTFs, you can't
request module BTF generation, unless you have a recent enough pahole.
I'm not sure how this can be handled better.

But if you have a plain old boring BPF program using
BPF_MAP_ARRAY/BPF_MAP_HASH, no global variables, you attach it to old
and stable BPF hooks like tracepoint, kprobe, etc., then it will work
with pretty much every version of libbpf, clang, and kernel. Don't
pass '-g' to Clang and BTF won't be generated at all, so you won't
even need BTF sanitization at all. And so on.

The problem is that users do want those new features, because those
allow to do new things or do existing things better/easier/faster. So
then we do ask to upgrade regularly to provide adequate support. But
it's like complaining that you need to update Java VM, compiler, Java
standard library, when you do want to use some new functionality.

> BPF. That is not a realistic expectation for users. Distributions exist
> for a reason. They solve real packaging problems.
>
> As libbpf and bpf in general reach a broader audience, the requirements
> to use, deploy and even tryout BPF features needs to be more user
> friendly and that starts with maintainers of the BPF code and how they
> approach extensions and features. Telling libbpf consumers to make
> libbpf a submodule of their project and update the reference point every
> time a new release comes out is not user friendly.

I have all the rights to ask for this, if I believe it's a better way
to go. Users have the right to refuse. But also iproute2 is not
exactly an end user in this situation, it is part of the BPF
ecosystem. So I think it's reasonable to have a healthy discussion
about the best way to facilitate BPF end-users.

>
> Similarly, it is not realistic or user friendly to *require* general
> Linux users to constantly chase latest versions of llvm, clang, dwarves,
> bcc, bpftool, libbpf, (I am sure I am missing more), and, by extension
> of what you want here, iproute2 just to upgrade their production kernel
> to say v5.10, the next LTS, or to see what relevant new ebpf features
> exists in the new kernel. As a specific example BTF extensions are added
> in a way that is all or nothing. Meaning, you want to compile kernel
> version X with CONFIG_DEBUG_INFO_BTF enabled, update your toolchain.
> Sure, you are using the latest LTS of $distro, and it worked fine with
> kernel version X-1 last week, but now compile fails completely unless
> the pahole version is updated. Horrible user experience. Again, just an
> example and one I brought up in July. I am sure there more.
>
> Linux APIs are about stability and consistency. Commands and libraries
> that work on v5.9 should work exactly the same on v5.10, 5.11, 5.12, ...
> *IF* I want a new feature (kernel, bpf or libbpf), then the requirement
> to upgrade is justified. But if I am just updating my kernel, or
> updating my compiler, or updating iproute2 because I want to try out
> some new nexthop feature, I should not be cornered into an all or
> nothing scheme.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-05  3:48                                 ` David Ahern
@ 2020-11-05 20:53                                   ` Andrii Nakryiko
  0 siblings, 0 replies; 167+ messages in thread
From: Andrii Nakryiko @ 2020-11-05 20:53 UTC (permalink / raw)
  To: David Ahern
  Cc: Jakub Kicinski, Daniel Borkmann,
	Toke Høiland-Jørgensen, Alexei Starovoitov,
	Hangbin Liu, Stephen Hemminger, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko

On Wed, Nov 4, 2020 at 7:48 PM David Ahern <dsahern@gmail.com> wrote:
>
> On 11/4/20 1:43 PM, Andrii Nakryiko wrote:
> >
> > What users writing BPF programs can expect from iproute2 in terms of
> > available BPF features is what matters. And by not enforcing a
> > specific minimal libbpf version, iproute2 version doesn't matter all
> > that much, because libbpf version that iproute2 ends up linking
> > against might be very old.
> >
> > There was a lot of talk about API stability and backwards
> > compatibility. Libbpf has had a stable API and ABI for at least 1.5
> > years now and is very conscious about that when adding or extending
> > new APIs. That's not even a factor in me arguing for submodules. I'll
> > give a few specific examples of libbpf API not changing at all, but
> > how end user experience gets tremendously better.
> >
> > Some of the most important APIs of libbpf are, arguably,
> > bpf_object__open() and bpf_object__load(). They accept a BPF ELF file,
> > do some preprocessing and in the end load BPF instructions into the
> > kernel for verification. But while API doesn't change across libbpf
> > versions, BPF-side code features supported changes quite a lot.
> >
> > 1. BTF sanitization. Newer versions of clang would emit a richer set
> > of BTF type information. Old kernels might not support BTF at all (but
> > otherwise would work just fine), or might not support some specific
> > newer additions to BTF. If someone was to use the latest Clang, but
> > outdated libbpf and old kernel, they would have a bad time, because
> > their BPF program would fail due to the kernel being strict about BTF.
> > But new libbpf would "sanitize" BTF, according to supported features
> > of the kernel, or just drop BTF altogether, if the kernel is that old.
> >
>
> In my experience, compilers are the least likely change in a typical
> Linux development environment. BPF should not be forcing new versions
> (see me last response).
>

"My experience" and "typical" don't generalize well, I'd rather not
draw any specific conclusions from that. But as I replied to your last
response: if you have a BPF application that doesn't use BPF CO-RE and
doesn't need BTF, you'll most probably be just fine with older Clang
(<v10), no one is forcing anything.

We do recommend to use the latest Clang, so that you have to deal with
less work arounds, of course. And you get all the shiny BTF built-ins.
And some of the problematic code patterns are not generated by newer
Clangs so that you as a BPF developer have to deal with less painful
development and debugging process.

> >
> > 2. bpf_probe_read_user() falling back to bpf_probe_read(). Newer
> > kernels warn if a BPF application isn't using a proper _kernel() or
> > _user() variant of bpf_probe_read(), and eventually will just stop
> > supporting generic bpf_probe_read(). So what this means is that end
> > users would need to compile to variants of their BPF application, one
> > for older kernels with bpf_probe_read(), another with
> > bpf_probe_read_kernel()/bpf_probe_read_user(). That's a massive pain
> > in the butt. But newer libbpf versions provide a completely
> > transparent fallback from _user()/_kernel() variants to generic one,
> > if the kernel doesn't support new variants. So the instruction to
> > users becomes simple: always use
> > bpf_probe_read_user()/bpf_probe_read_kernel().
> >
>
> I vaguely recall a thread about having BPF system call return user
> friendly messages, but that was shot down. I take this example to mean
> the solution is to have libbpf handle the quirks and various changes
> which means that now libbpf takes on burden - the need for constant
> updates to handle quirks. extack has been very successful at making
> networking configuration mistakes more user friendly. Other kernel
> features should be using the same kind of extension.

I don't think this is relevant for this discussion at all. But yes,
libbpf tries to alleviate as much pain as possible. And no, extack
won't help with that in general, only with some error reporting,
potentially.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-05 14:05                           ` Jamal Hadi Salim
@ 2020-11-05 21:01                             ` Andrii Nakryiko
  2020-11-06 15:27                               ` Jamal Hadi Salim
  2020-11-10 12:47                             ` Edward Cree
  1 sibling, 1 reply; 167+ messages in thread
From: Andrii Nakryiko @ 2020-11-05 21:01 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: David Ahern, Daniel Borkmann, Alexei Starovoitov, Hangbin Liu,
	Stephen Hemminger, Alexei Starovoitov, Martin KaFai Lau,
	Song Liu, Yonghong Song, David Miller, Jesper Dangaard Brouer,
	Networking, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Thu, Nov 5, 2020 at 6:05 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
> On 2020-11-04 10:19 p.m., David Ahern wrote:
>
> [..]
> >
> > User experience keeps getting brought up, but I also keep reading the
> > stance that BPF users can not expect a consistent experience unless they
> > are constantly chasing latest greatest versions of *ALL* S/W related to
> > BPF. That is not a realistic expectation for users. Distributions exist
> > for a reason. They solve real packaging problems.
> >
> > As libbpf and bpf in general reach a broader audience, the requirements
> > to use, deploy and even tryout BPF features needs to be more user
> > friendly and that starts with maintainers of the BPF code and how they
> > approach extensions and features. Telling libbpf consumers to make
> > libbpf a submodule of their project and update the reference point every
> > time a new release comes out is not user friendly.
> >
> > Similarly, it is not realistic or user friendly to *require* general
> > Linux users to constantly chase latest versions of llvm, clang, dwarves,
> > bcc, bpftool, libbpf, (I am sure I am missing more), and, by extension
> > of what you want here, iproute2 just to upgrade their production kernel
> > to say v5.10, the next LTS, or to see what relevant new ebpf features
> > exists in the new kernel. As a specific example BTF extensions are added
> > in a way that is all or nothing. Meaning, you want to compile kernel
> > version X with CONFIG_DEBUG_INFO_BTF enabled, update your toolchain.
> > Sure, you are using the latest LTS of $distro, and it worked fine with
> > kernel version X-1 last week, but now compile fails completely unless
> > the pahole version is updated. Horrible user experience. Again, just an
> > example and one I brought up in July. I am sure there more.
> >
>
>
> 2cents feedback from a dabbler in ebpf on user experience:
>
> What David described above *has held me back*.
> Over time it seems things have gotten better with libbpf
> (although a few times i find myself copying includes from the
> latest iproute into libbpf). I ended up just doing static links.
> The idea of upgrading clang/llvm every 2 months i revisit ebpf is
> the most painful. At times code that used to compile just fine
> earlier doesnt anymore. There's a minor issue of requiring i install

Do you have a specific example of something that stopped compiling?
I'm not saying that can't happen, but we definitely try hard to avoid
any regressions. I might be forgetting something, but I don't recall
the situation when something would stop compiling just due to newer
libbpf.

> kernel headers every time i want to run something in samples, etc
> but i am probably lacking knowledge on how to ease the pain in that
> regard.
>
> I find the loader and associated tooling in iproute2/tc to be quiet
> stable (not shiny but works everytime).
> And for that reason i often find myself sticking to just tc instead
> of toying with other areas.

That's the part that others on this thread mentioned is bit rotting?
Doesn't seem like everyone is happy about that, though. Stopping any
development definitely makes things stable by definition. BPF and
libbpf try to be stable while not stagnating, which is harder than
just stopping any development, unfortunately.

> Slight tangent:
> One thing that would help libbpf adoption is to include an examples/
> directory. Put a bunch of sample apps for tc, probes, xdp etc.
> And have them compile outside of the kernel. Maybe useful Makefiles
> that people can cutnpaste from. Every time you add a new feature
> put some sample code in the examples.

That's what tools/testing/selftests/bpf in kernel source are for. It's
not the greatest showcase of examples, but all the new features have a
test demonstrating its usage. I do agree about having simple Makefiles
and we do have that at [0]. I'm also about to do another sample repo
with a lot of things pre-setup, for tinkering and using that as a
bootstrap for BPF development with libbpf.

  [0] https://github.com/iovisor/bcc/tree/master/libbpf-tools

>
> cheers,
> jamal

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 3/5] lib: add libbpf support
  2020-11-05 15:25               ` David Ahern
  2020-11-05 15:57                 ` Toke Høiland-Jørgensen
@ 2020-11-06  0:41                 ` Hangbin Liu
  1 sibling, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-06  0:41 UTC (permalink / raw)
  To: David Ahern
  Cc: Stephen Hemminger, netdev, bpf, Toke Høiland-Jørgensen

On Thu, Nov 05, 2020 at 08:25:12AM -0700, David Ahern wrote:
> > Sorry, when trying to rename the functions. I just found another issue.
> > Even we fix the conflicts right now. What if libbpf add new functions
> > and we got another conflict in future? There are too much bpf functions
> > in bpf_legacy.c which would have more risks for naming conflicts..
> > 
> > With bpf_libbpf.c, there are less functions and has less risk for naming
> > conflicts. So I think it maybe better to not include libbpf.h in bpf_legacy.c.
> > What do you think?
> > 
> >
> 
> Is there a way to sort the code such that bpf_legacy.c is not used when
> libbpf is enabled and bpf_libbpf.c is not compiled when libbpf is disabled.
> 

That what the current code did. In lib/Makefile we only compile bpf_libbpf.o
when libbpf enabled.

ifeq ($(HAVE_LIBBPF),y)
UTILOBJ += bpf_libbpf.o
endif

But bpf code in ipvrf.c is special as it calls both legacy code an libbpf code.
If we put it in bpf_legacy.c, then bpf_legacy.c will be corrupt by libbpf.h.
If we put it in bpf_libbpf.c, then we can't build without bpf_libbpf.o when
libbpf disable.

I haven't figured out a better solution.

Thanks
Hangbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 3/5] lib: add libbpf support
  2020-11-05 16:02                   ` David Ahern
@ 2020-11-06  0:56                     ` Hangbin Liu
  0 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-06  0:56 UTC (permalink / raw)
  To: David Ahern
  Cc: Toke Høiland-Jørgensen, Stephen Hemminger, netdev, bpf

On Thu, Nov 05, 2020 at 09:02:15AM -0700, David Ahern wrote:
> On 11/5/20 8:57 AM, Toke Høiland-Jørgensen wrote:
> > I guess we could split it further into lib/bpf_{libbpf,legacy,glue}.c
> > and have the two former ones be completely devoid of ifdefs and
> > conditionally included based on whether or not libbpf support is
> > enabled?
> 
> that sounds reasonable.
> 

OK, I will do this.

Thanks
Hangbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-05 20:19                             ` Andrii Nakryiko
@ 2020-11-06  8:44                               ` Jiri Benc
  2020-11-06 20:57                                 ` Andrii Nakryiko
  0 siblings, 1 reply; 167+ messages in thread
From: Jiri Benc @ 2020-11-06  8:44 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Edward Cree, Alexei Starovoitov, Hangbin Liu, David Ahern,
	Daniel Borkmann, Stephen Hemminger, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Thu, 5 Nov 2020 12:19:00 -0800, Andrii Nakryiko wrote:
> I'll just quote myself here for your convenience.

Sorry, I missed your original email for some reason.

>   Submodule is a way that I know of to make this better for end users.
>   If there are other ways to pull this off with shared library use, I'm
>   all for it, it will save the security angle that distros are arguing
>   for. E.g., if distributions will always have the latest libbpf
>   available almost as soon as it's cut upstream *and* new iproute2
>   versions enforce the latest libbpf when they are packaged/released,
>   then this might work equivalently for end users. If Linux distros
>   would be willing to do this faithfully and promptly, I have no
>   objections whatsoever. Because all that matters is BPF end user
>   experience, as Daniel explained above.

That's basically what we already do, for both Fedora and RHEL.

Of course, it follows the distro release cycle, i.e. no version
upgrades - or very limited ones - during lifetime of a particular
release. But that would not be different if libbpf was bundled in
individual projects.

 Jiri


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-05 20:45                           ` Andrii Nakryiko
@ 2020-11-06  9:00                             ` Jiri Benc
  2020-11-06 21:07                               ` Andrii Nakryiko
  0 siblings, 1 reply; 167+ messages in thread
From: Jiri Benc @ 2020-11-06  9:00 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David Ahern, Daniel Borkmann, Alexei Starovoitov, Hangbin Liu,
	Stephen Hemminger, Alexei Starovoitov, Martin KaFai Lau,
	Song Liu, Yonghong Song, David Miller, Jesper Dangaard Brouer,
	Networking, bpf, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Thu, 5 Nov 2020 12:45:39 -0800, Andrii Nakryiko wrote:
> That's not true. If you need new functionality like BTF, CO-RE,
> function-by-function verification, etc., then yes, you have to update
> kernel, compiler, libbpf, sometimes pahole. But if you have an BPF
> application that doesn't use and need any of the newer features, it
> will keep working just fine with the old kernel, old libbpf, and old
> compiler.

I'm fine with this.

It doesn't work that well in practice, we've found ourselves chasing
problems caused by llvm update (problems for older bpf programs, not
new ones), problems on non-x86_64 caused by kernel updates, etc. It can
be attributed to living on the edge and it should stabilize over time,
hopefully. But it's still what the users are experiencing and it's
probably what David is referring to. I expect it to smooth itself over
time.

Add to that the fact that something that is in fact a new feature is
perceived as a bug fix by some users. For example, a perfectly valid
and simple C program, not using anything shiny but a basic simple loop,
compiles just fine but is rejected by the kernel. A newer kernel and a
newer compiler and a newer libbpf and a newer pahole will cause the
same program to be accepted. Now, the user does not see that for this,
a new load of BTF functionality had to be added and all those mentioned
projects enhanced with substantial code. All they see is their simple
hello world test program did not work and now it does.

I'm not saying I have a solution nor I'm saying you should do something
about it. Just trying to explain the perception.

 Jiri


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-05 21:01                             ` Andrii Nakryiko
@ 2020-11-06 15:27                               ` Jamal Hadi Salim
  2020-11-06 21:25                                 ` Andrii Nakryiko
  0 siblings, 1 reply; 167+ messages in thread
From: Jamal Hadi Salim @ 2020-11-06 15:27 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David Ahern, Daniel Borkmann, Alexei Starovoitov, Hangbin Liu,
	Stephen Hemminger, Alexei Starovoitov, Martin KaFai Lau,
	Song Liu, Yonghong Song, David Miller, Jesper Dangaard Brouer,
	Networking, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On 2020-11-05 4:01 p.m., Andrii Nakryiko wrote:
> On Thu, Nov 5, 2020 at 6:05 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>>
>> On 2020-11-04 10:19 p.m., David Ahern wrote:
>>
>> [..]

[..]

>> 2cents feedback from a dabbler in ebpf on user experience:
>>
>> What David described above *has held me back*.
>> Over time it seems things have gotten better with libbpf
>> (although a few times i find myself copying includes from the
>> latest iproute into libbpf). I ended up just doing static links.
>> The idea of upgrading clang/llvm every 2 months i revisit ebpf is
>> the most painful. At times code that used to compile just fine
>> earlier doesnt anymore. There's a minor issue of requiring i install
> 
> Do you have a specific example of something that stopped compiling?
> I'm not saying that can't happen, but we definitely try hard to avoid
> any regressions. I might be forgetting something, but I don't recall
> the situation when something would stop compiling just due to newer
> libbpf.
> 

Unfortunately the ecosystem is more than libbpf; sometimes it is
the kernel code that is being exercised by libbpf that is problematic.
This may sound unfair to libbpf but it is hard to separate the two for
someone who is dabbling like me.

The last issue iirc correctly had to do with one of the tcp notifier
variants either in samples or selftests(both user space and kernel).
I can go back and look at the details.
The fix always more than half the time was need to upgrade
clang/llvm. At one point i think it required that i had to grab
the latest and greatest git version. I think the machine i have
right now has version 11. The first time i found out about these
clang upgrades was trying to go from 8->9 or maybe it was 9->10.
Somewhere along there also was discovery that something that
compiled under earlier version wasnt compiling under newer version.

>> kernel headers every time i want to run something in samples, etc
>> but i am probably lacking knowledge on how to ease the pain in that
>> regard.
>>
>> I find the loader and associated tooling in iproute2/tc to be quiet
>> stable (not shiny but works everytime).
>> And for that reason i often find myself sticking to just tc instead
>> of toying with other areas.
> 
> That's the part that others on this thread mentioned is bit rotting?

Yes. Reason is i dont have to deal with new discoveries of things
that require some upgrade or copying etc.
I should be clear on the "it is the ecosystem": this is not just because
of user space code but also the simplicity of writing the tc kernel code
and loading it with tc tooling and then have a separate user tool for
control.
Lately i started linking the control tool with static libbpf instead.

Bpftool seems improved last time i tried to load something in XDP. I 
like the load-map-then-attach-program approach that bpftool gets
out of libbpf. I dont think that feature is possible with tc tooling.

However, I am still loading with tc and xdp with ip because of old
habits and what i consider to be a very simple workflow.

> Doesn't seem like everyone is happy about that, though. Stopping any
> development definitely makes things stable by definition. BPF and
> libbpf try to be stable while not stagnating, which is harder than
> just stopping any development, unfortunately.
> 

I am for moving to libbpf. I think it is a bad idea to have multiple
loaders for example. Note: I am not a demanding user, but there
are a few useful features that i feel i need that are missing in
iproute2 version. e.g, one thing i was playing with about a month
ago was some TOCTOU issue in the kernel code and getting
the bpf_lock integrated into the tc code proved challenging.
I ended rewriting the code to work around the tooling.

The challenge - when making changes in the name of progress - is to
not burden a user like myself with a complex workflow but still give
me the features i need.

>> Slight tangent:
>> One thing that would help libbpf adoption is to include an examples/
>> directory. Put a bunch of sample apps for tc, probes, xdp etc.
>> And have them compile outside of the kernel. Maybe useful Makefiles
>> that people can cutnpaste from. Every time you add a new feature
>> put some sample code in the examples.
> 
> That's what tools/testing/selftests/bpf in kernel source are for. It's
> not the greatest showcase of examples, but all the new features have a
> test demonstrating its usage. I do agree about having simple Makefiles
> and we do have that at [0]. I'm also about to do another sample repo
> with a lot of things pre-setup, for tinkering and using that as a
> bootstrap for BPF development with libbpf.
> 
>    [0] https://github.com/iovisor/bcc/tree/master/libbpf-tools


I pull that tree regularly.
selftests is good for aggregating things developers submit and
then have the robots test.
For better usability, it has to be something that is standalone that 
would work out of the box with libbf.
selftests and samples are not what i would consider for the
faint-hearted.
It may look easy to you because you eat this stuff for
breakfast but consider all those masses you want to be part of this.
They dont have the skills and people with average skills dont
have the patience.

This again comes back to "the ecosystem" - just getting libbpf to get
things stable for userland is not enough. Maybe have part of the libbpf
testing also to copy things from selftests.

cheers,
jamal

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-06  8:44                               ` Jiri Benc
@ 2020-11-06 20:57                                 ` Andrii Nakryiko
  2020-11-06 21:04                                   ` Alexei Starovoitov
  0 siblings, 1 reply; 167+ messages in thread
From: Andrii Nakryiko @ 2020-11-06 20:57 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Edward Cree, Alexei Starovoitov, Hangbin Liu, David Ahern,
	Daniel Borkmann, Stephen Hemminger, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Fri, Nov 6, 2020 at 12:44 AM Jiri Benc <jbenc@redhat.com> wrote:
>
> On Thu, 5 Nov 2020 12:19:00 -0800, Andrii Nakryiko wrote:
> > I'll just quote myself here for your convenience.
>
> Sorry, I missed your original email for some reason.
>
> >   Submodule is a way that I know of to make this better for end users.
> >   If there are other ways to pull this off with shared library use, I'm
> >   all for it, it will save the security angle that distros are arguing
> >   for. E.g., if distributions will always have the latest libbpf
> >   available almost as soon as it's cut upstream *and* new iproute2
> >   versions enforce the latest libbpf when they are packaged/released,
> >   then this might work equivalently for end users. If Linux distros
> >   would be willing to do this faithfully and promptly, I have no
> >   objections whatsoever. Because all that matters is BPF end user
> >   experience, as Daniel explained above.
>
> That's basically what we already do, for both Fedora and RHEL.
>
> Of course, it follows the distro release cycle, i.e. no version
> upgrades - or very limited ones - during lifetime of a particular
> release. But that would not be different if libbpf was bundled in
> individual projects.

Alright. Hopefully this would be sufficient in practice.

>
>  Jiri
>

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-06 20:57                                 ` Andrii Nakryiko
@ 2020-11-06 21:04                                   ` Alexei Starovoitov
  2020-11-06 23:25                                     ` Stephen Hemminger
  0 siblings, 1 reply; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-06 21:04 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Benc, Edward Cree, Hangbin Liu, David Ahern,
	Daniel Borkmann, Stephen Hemminger, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Fri, Nov 6, 2020 at 12:58 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Nov 6, 2020 at 12:44 AM Jiri Benc <jbenc@redhat.com> wrote:
> >
> > On Thu, 5 Nov 2020 12:19:00 -0800, Andrii Nakryiko wrote:
> > > I'll just quote myself here for your convenience.
> >
> > Sorry, I missed your original email for some reason.
> >
> > >   Submodule is a way that I know of to make this better for end users.
> > >   If there are other ways to pull this off with shared library use, I'm
> > >   all for it, it will save the security angle that distros are arguing
> > >   for. E.g., if distributions will always have the latest libbpf
> > >   available almost as soon as it's cut upstream *and* new iproute2
> > >   versions enforce the latest libbpf when they are packaged/released,
> > >   then this might work equivalently for end users. If Linux distros
> > >   would be willing to do this faithfully and promptly, I have no
> > >   objections whatsoever. Because all that matters is BPF end user
> > >   experience, as Daniel explained above.
> >
> > That's basically what we already do, for both Fedora and RHEL.
> >
> > Of course, it follows the distro release cycle, i.e. no version
> > upgrades - or very limited ones - during lifetime of a particular
> > release. But that would not be different if libbpf was bundled in
> > individual projects.
>
> Alright. Hopefully this would be sufficient in practice.

I think bumping the minimal version of libbpf with every iproute2 release
is necessary as well.
Today iproute2-next should require 0.2.0. The cycle after it should be 0.3.0
and so on.
This way at least some correlation between iproute2 and libbpf will be
established.
Otherwise it's a mess of versions and functionality from user point of view.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-06  9:00                             ` Jiri Benc
@ 2020-11-06 21:07                               ` Andrii Nakryiko
  0 siblings, 0 replies; 167+ messages in thread
From: Andrii Nakryiko @ 2020-11-06 21:07 UTC (permalink / raw)
  To: Jiri Benc
  Cc: David Ahern, Daniel Borkmann, Alexei Starovoitov, Hangbin Liu,
	Stephen Hemminger, Alexei Starovoitov, Martin KaFai Lau,
	Song Liu, Yonghong Song, David Miller, Jesper Dangaard Brouer,
	Networking, bpf, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Fri, Nov 6, 2020 at 1:00 AM Jiri Benc <jbenc@redhat.com> wrote:
>
> On Thu, 5 Nov 2020 12:45:39 -0800, Andrii Nakryiko wrote:
> > That's not true. If you need new functionality like BTF, CO-RE,
> > function-by-function verification, etc., then yes, you have to update
> > kernel, compiler, libbpf, sometimes pahole. But if you have an BPF
> > application that doesn't use and need any of the newer features, it
> > will keep working just fine with the old kernel, old libbpf, and old
> > compiler.
>
> I'm fine with this.
>
> It doesn't work that well in practice, we've found ourselves chasing
> problems caused by llvm update (problems for older bpf programs, not
> new ones), problems on non-x86_64 caused by kernel updates, etc. It can
> be attributed to living on the edge and it should stabilize over time,
> hopefully. But it's still what the users are experiencing and it's
> probably what David is referring to. I expect it to smooth itself over
> time.

It's definitely going to be better over time, of course. I honestly
can't remember many cases where working applications stopped working
with newer kernels. I only remember cases when Clang changed the code
generation patterns. Also there were few too permissive checks fixed
in later kernels, which could break apps, if apps relied on buggy
logic. That did happen I think.

But anyway, I bet people just got a "something like that happened in
the past" flag in their head, but won't be able to recall specific
details anymore. My point is that we (BPF developers) don't take these
things lightly, so I'd just like to avoid the perception that we don't
care about this. Because we do, despite it sometimes being painful.
But there are layers upon layers of abstraction and it's not all
always under our control, so things might break.

>
> Add to that the fact that something that is in fact a new feature is
> perceived as a bug fix by some users. For example, a perfectly valid
> and simple C program, not using anything shiny but a basic simple loop,
> compiles just fine but is rejected by the kernel. A newer kernel and a
> newer compiler and a newer libbpf and a newer pahole will cause the
> same program to be accepted. Now, the user does not see that for this,
> a new load of BTF functionality had to be added and all those mentioned
> projects enhanced with substantial code. All they see is their simple
> hello world test program did not work and now it does.

Right. The unavoidable truth that anyone using BPF has to have at
least a surface-level idea about what BPF verifier is and what (and
sometimes how) it checks. It also gets better over time so much that
for some simpler application it will just work perfectly from the
first version of written code.

But let's also not lose perspective here. There aren't many examples
of practical static verification of program safety and termination,
right? It's tricky, and especially when making it also practical for a
wide variety of use cases.

>
> I'm not saying I have a solution nor I'm saying you should do something
> about it. Just trying to explain the perception.

Thanks for that, it's a good perspective. Hopefully my explanation
also makes sense ;)

>
>  Jiri
>

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-06 15:27                               ` Jamal Hadi Salim
@ 2020-11-06 21:25                                 ` Andrii Nakryiko
  0 siblings, 0 replies; 167+ messages in thread
From: Andrii Nakryiko @ 2020-11-06 21:25 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: David Ahern, Daniel Borkmann, Alexei Starovoitov, Hangbin Liu,
	Stephen Hemminger, Alexei Starovoitov, Martin KaFai Lau,
	Song Liu, Yonghong Song, David Miller, Jesper Dangaard Brouer,
	Networking, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Fri, Nov 6, 2020 at 7:27 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
> On 2020-11-05 4:01 p.m., Andrii Nakryiko wrote:
> > On Thu, Nov 5, 2020 at 6:05 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> >>
> >> On 2020-11-04 10:19 p.m., David Ahern wrote:
> >>
> >> [..]
>
> [..]
>
> >> 2cents feedback from a dabbler in ebpf on user experience:
> >>
> >> What David described above *has held me back*.
> >> Over time it seems things have gotten better with libbpf
> >> (although a few times i find myself copying includes from the
> >> latest iproute into libbpf). I ended up just doing static links.
> >> The idea of upgrading clang/llvm every 2 months i revisit ebpf is
> >> the most painful. At times code that used to compile just fine
> >> earlier doesnt anymore. There's a minor issue of requiring i install
> >
> > Do you have a specific example of something that stopped compiling?
> > I'm not saying that can't happen, but we definitely try hard to avoid
> > any regressions. I might be forgetting something, but I don't recall
> > the situation when something would stop compiling just due to newer
> > libbpf.
> >
>
> Unfortunately the ecosystem is more than libbpf; sometimes it is
> the kernel code that is being exercised by libbpf that is problematic.
> This may sound unfair to libbpf but it is hard to separate the two for
> someone who is dabbling like me.

I get that. Clang is also part of the ecosystem, along the kernel,
pahole, etc. It's a lot of moving parts and we strive to keep them all
working well together. It's not 100% smooth all the time, but that's
at least the goal.

>
> The last issue iirc correctly had to do with one of the tcp notifier
> variants either in samples or selftests(both user space and kernel).
> I can go back and look at the details.
> The fix always more than half the time was need to upgrade
> clang/llvm. At one point i think it required that i had to grab
> the latest and greatest git version. I think the machine i have
> right now has version 11. The first time i found out about these
> clang upgrades was trying to go from 8->9 or maybe it was 9->10.
> Somewhere along there also was discovery that something that
> compiled under earlier version wasnt compiling under newer version.

So with kernel's samples/bpf and selftests/bpf, we do quite often
expect the latest Clang, because it's not just examples, but also a
live set of tests. So to not accumulate too much cruft, we do update
those (sometimes, not all the time) with assumption of latest features
in Clang, libbpf, pahole, and kernel. That's reality and we set those
expectations quite explicitly a while ago. But that's not the
expectation for user applications outside of the kernel tree. Just
wanted to make this clear.

>
> >> kernel headers every time i want to run something in samples, etc
> >> but i am probably lacking knowledge on how to ease the pain in that
> >> regard.
> >>
> >> I find the loader and associated tooling in iproute2/tc to be quiet
> >> stable (not shiny but works everytime).
> >> And for that reason i often find myself sticking to just tc instead
> >> of toying with other areas.
> >
> > That's the part that others on this thread mentioned is bit rotting?
>
> Yes. Reason is i dont have to deal with new discoveries of things
> that require some upgrade or copying etc.
> I should be clear on the "it is the ecosystem": this is not just because
> of user space code but also the simplicity of writing the tc kernel code
> and loading it with tc tooling and then have a separate user tool for
> control.
> Lately i started linking the control tool with static libbpf instead.

There are also two broad categories of BPF applications: networking
and the rest (tracing, now security, etc). Networking historically
dealt with well-defined data structures (ip headers, tcp headers, etc)
and didn't need much to know about the ever-changing nature of kernel
memory layouts. That used to be, arguably, simpler use case from BPF
standpoint.

Tracing, on the other hand, was always challenging. The only viable
option was BCC's approach of bundling compiler, expecting
kernel-headers, etc. We started changing that with BPF CO-RE to make a
traditional per-compiled model viable. That obviously required changes
in all parts of the ecosystem. So tracing BPF apps went from
impossible, to hard, to constantly evolving, and we are right now in a
somewhat mixed evolving/stabilizing stage. Bleeding edge. As Jiri
said, it's to be expected that there would be rough corners. But the
choice is either to live dangerously or wait for a few years for
things to completely settle. Pick your poison ;)

>
> Bpftool seems improved last time i tried to load something in XDP. I
> like the load-map-then-attach-program approach that bpftool gets
> out of libbpf. I dont think that feature is possible with tc tooling.
>
> However, I am still loading with tc and xdp with ip because of old
> habits and what i consider to be a very simple workflow.
>
> > Doesn't seem like everyone is happy about that, though. Stopping any
> > development definitely makes things stable by definition. BPF and
> > libbpf try to be stable while not stagnating, which is harder than
> > just stopping any development, unfortunately.
> >
>
> I am for moving to libbpf. I think it is a bad idea to have multiple
> loaders for example. Note: I am not a demanding user, but there
> are a few useful features that i feel i need that are missing in
> iproute2 version. e.g, one thing i was playing with about a month
> ago was some TOCTOU issue in the kernel code and getting
> the bpf_lock integrated into the tc code proved challenging.
> I ended rewriting the code to work around the tooling.

Right, bpf_lock relies on BTF, that's probably why.

>
> The challenge - when making changes in the name of progress - is to
> not burden a user like myself with a complex workflow but still give
> me the features i need.

This takes time and work, and can't be done perfectly overnight.
That's all. But the thing is: we are working towards it, non-stop.

>
> >> Slight tangent:
> >> One thing that would help libbpf adoption is to include an examples/
> >> directory. Put a bunch of sample apps for tc, probes, xdp etc.
> >> And have them compile outside of the kernel. Maybe useful Makefiles
> >> that people can cutnpaste from. Every time you add a new feature
> >> put some sample code in the examples.
> >
> > That's what tools/testing/selftests/bpf in kernel source are for. It's
> > not the greatest showcase of examples, but all the new features have a
> > test demonstrating its usage. I do agree about having simple Makefiles
> > and we do have that at [0]. I'm also about to do another sample repo
> > with a lot of things pre-setup, for tinkering and using that as a
> > bootstrap for BPF development with libbpf.
> >
> >    [0] https://github.com/iovisor/bcc/tree/master/libbpf-tools
>
>
> I pull that tree regularly.
> selftests is good for aggregating things developers submit and
> then have the robots test.
> For better usability, it has to be something that is standalone that
> would work out of the box with libbf.

It's not yet ready for wider announcement, but give this a try:

https://github.com/anakryiko/libbpf-bootstrap

Should make it easier to play with libbpf and BPF.

> selftests and samples are not what i would consider for the
> faint-hearted.
> It may look easy to you because you eat this stuff for
> breakfast but consider all those masses you want to be part of this.
> They dont have the skills and people with average skills dont
> have the patience.

I acknowledged from the get-go that selftest/bpf are not the best
source of examples, just that's what we've got. It takes contributions
from lots of people to maintain a decent, nice and clean, easy to use
set of realistic examples. It's unrealistic, IMO, to expect a bunch of
core BPF developers to both develop core technology actively, and
provide great educational resources (however unfortunate that is).

>
> This again comes back to "the ecosystem" - just getting libbpf to get
> things stable for userland is not enough. Maybe have part of the libbpf
> testing also to copy things from selftests.
>
> cheers,
> jamal

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-06 21:04                                   ` Alexei Starovoitov
@ 2020-11-06 23:25                                     ` Stephen Hemminger
  2020-11-06 23:30                                       ` Andrii Nakryiko
  2020-11-06 23:38                                       ` David Ahern
  0 siblings, 2 replies; 167+ messages in thread
From: Stephen Hemminger @ 2020-11-06 23:25 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, Jiri Benc, Edward Cree, Hangbin Liu,
	David Ahern, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Fri, 6 Nov 2020 13:04:16 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Fri, Nov 6, 2020 at 12:58 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Fri, Nov 6, 2020 at 12:44 AM Jiri Benc <jbenc@redhat.com> wrote:  
> > >
> > > On Thu, 5 Nov 2020 12:19:00 -0800, Andrii Nakryiko wrote:  
> > > > I'll just quote myself here for your convenience.  
> > >
> > > Sorry, I missed your original email for some reason.
> > >  
> > > >   Submodule is a way that I know of to make this better for end users.
> > > >   If there are other ways to pull this off with shared library use, I'm
> > > >   all for it, it will save the security angle that distros are arguing
> > > >   for. E.g., if distributions will always have the latest libbpf
> > > >   available almost as soon as it's cut upstream *and* new iproute2
> > > >   versions enforce the latest libbpf when they are packaged/released,
> > > >   then this might work equivalently for end users. If Linux distros
> > > >   would be willing to do this faithfully and promptly, I have no
> > > >   objections whatsoever. Because all that matters is BPF end user
> > > >   experience, as Daniel explained above.  
> > >
> > > That's basically what we already do, for both Fedora and RHEL.
> > >
> > > Of course, it follows the distro release cycle, i.e. no version
> > > upgrades - or very limited ones - during lifetime of a particular
> > > release. But that would not be different if libbpf was bundled in
> > > individual projects.  
> >
> > Alright. Hopefully this would be sufficient in practice.  
> 
> I think bumping the minimal version of libbpf with every iproute2 release
> is necessary as well.
> Today iproute2-next should require 0.2.0. The cycle after it should be 0.3.0
> and so on.
> This way at least some correlation between iproute2 and libbpf will be
> established.
> Otherwise it's a mess of versions and functionality from user point of view.

As long as iproute2 6.0 and libbpf 0.11.0 continues to work on older kernel
(like oldest living LTS 4.19 in 2023?); then it is fine. 

Just don't want libbpf to cause visible breakage for users.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-06 23:25                                     ` Stephen Hemminger
@ 2020-11-06 23:30                                       ` Andrii Nakryiko
  2020-11-07  0:41                                         ` Stephen Hemminger
  2020-11-06 23:38                                       ` David Ahern
  1 sibling, 1 reply; 167+ messages in thread
From: Andrii Nakryiko @ 2020-11-06 23:30 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Alexei Starovoitov, Jiri Benc, Edward Cree, Hangbin Liu,
	David Ahern, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Fri, Nov 6, 2020 at 3:25 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Fri, 6 Nov 2020 13:04:16 -0800
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> > On Fri, Nov 6, 2020 at 12:58 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Fri, Nov 6, 2020 at 12:44 AM Jiri Benc <jbenc@redhat.com> wrote:
> > > >
> > > > On Thu, 5 Nov 2020 12:19:00 -0800, Andrii Nakryiko wrote:
> > > > > I'll just quote myself here for your convenience.
> > > >
> > > > Sorry, I missed your original email for some reason.
> > > >
> > > > >   Submodule is a way that I know of to make this better for end users.
> > > > >   If there are other ways to pull this off with shared library use, I'm
> > > > >   all for it, it will save the security angle that distros are arguing
> > > > >   for. E.g., if distributions will always have the latest libbpf
> > > > >   available almost as soon as it's cut upstream *and* new iproute2
> > > > >   versions enforce the latest libbpf when they are packaged/released,
> > > > >   then this might work equivalently for end users. If Linux distros
> > > > >   would be willing to do this faithfully and promptly, I have no
> > > > >   objections whatsoever. Because all that matters is BPF end user
> > > > >   experience, as Daniel explained above.
> > > >
> > > > That's basically what we already do, for both Fedora and RHEL.
> > > >
> > > > Of course, it follows the distro release cycle, i.e. no version
> > > > upgrades - or very limited ones - during lifetime of a particular
> > > > release. But that would not be different if libbpf was bundled in
> > > > individual projects.
> > >
> > > Alright. Hopefully this would be sufficient in practice.
> >
> > I think bumping the minimal version of libbpf with every iproute2 release
> > is necessary as well.
> > Today iproute2-next should require 0.2.0. The cycle after it should be 0.3.0
> > and so on.
> > This way at least some correlation between iproute2 and libbpf will be
> > established.
> > Otherwise it's a mess of versions and functionality from user point of view.
>
> As long as iproute2 6.0 and libbpf 0.11.0 continues to work on older kernel
> (like oldest living LTS 4.19 in 2023?); then it is fine.
>
> Just don't want libbpf to cause visible breakage for users.

libbpf CI validates a bunch of selftests on 4.9 kernel, see [0]. It
should work on even older ones. Not all BPF programs would load and be
verified successfully, but libbpf itself should work regardless.

  [0] https://travis-ci.com/github/libbpf/libbpf/jobs/429362146

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-06 23:25                                     ` Stephen Hemminger
  2020-11-06 23:30                                       ` Andrii Nakryiko
@ 2020-11-06 23:38                                       ` David Ahern
  2020-11-09  1:45                                         ` Alexei Starovoitov
  1 sibling, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-11-06 23:38 UTC (permalink / raw)
  To: Stephen Hemminger, Alexei Starovoitov
  Cc: Andrii Nakryiko, Jiri Benc, Edward Cree, Hangbin Liu,
	Daniel Borkmann, Alexei Starovoitov, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Miller, Jesper Dangaard Brouer, Networking,
	bpf, Andrii Nakryiko, Toke Høiland-Jørgensen

On 11/6/20 4:25 PM, Stephen Hemminger wrote:
>>
>> I think bumping the minimal version of libbpf with every iproute2 release
>> is necessary as well.
>> Today iproute2-next should require 0.2.0. The cycle after it should be 0.3.0
>> and so on.
>> This way at least some correlation between iproute2 and libbpf will be
>> established.
>> Otherwise it's a mess of versions and functionality from user point of view.

If existing bpf features in iproute2 work fine with version 0.1.0, what
is the justification for an arbitrary requirement for iproute2 to force
users to bump libbpf versions just to use iproute2 from v5.11?

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-06 23:30                                       ` Andrii Nakryiko
@ 2020-11-07  0:41                                         ` Stephen Hemminger
  2020-11-07  1:07                                           ` Andrii Nakryiko
  0 siblings, 1 reply; 167+ messages in thread
From: Stephen Hemminger @ 2020-11-07  0:41 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Jiri Benc, Edward Cree, Hangbin Liu,
	David Ahern, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Fri, 6 Nov 2020 15:30:38 -0800
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Fri, Nov 6, 2020 at 3:25 PM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > On Fri, 6 Nov 2020 13:04:16 -0800
> > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> >  
> > > On Fri, Nov 6, 2020 at 12:58 PM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:  
> > > >
> > > > On Fri, Nov 6, 2020 at 12:44 AM Jiri Benc <jbenc@redhat.com> wrote:  
> > > > >
> > > > > On Thu, 5 Nov 2020 12:19:00 -0800, Andrii Nakryiko wrote:  
> > > > > > I'll just quote myself here for your convenience.  
> > > > >
> > > > > Sorry, I missed your original email for some reason.
> > > > >  
> > > > > >   Submodule is a way that I know of to make this better for end users.
> > > > > >   If there are other ways to pull this off with shared library use, I'm
> > > > > >   all for it, it will save the security angle that distros are arguing
> > > > > >   for. E.g., if distributions will always have the latest libbpf
> > > > > >   available almost as soon as it's cut upstream *and* new iproute2
> > > > > >   versions enforce the latest libbpf when they are packaged/released,
> > > > > >   then this might work equivalently for end users. If Linux distros
> > > > > >   would be willing to do this faithfully and promptly, I have no
> > > > > >   objections whatsoever. Because all that matters is BPF end user
> > > > > >   experience, as Daniel explained above.  
> > > > >
> > > > > That's basically what we already do, for both Fedora and RHEL.
> > > > >
> > > > > Of course, it follows the distro release cycle, i.e. no version
> > > > > upgrades - or very limited ones - during lifetime of a particular
> > > > > release. But that would not be different if libbpf was bundled in
> > > > > individual projects.  
> > > >
> > > > Alright. Hopefully this would be sufficient in practice.  
> > >
> > > I think bumping the minimal version of libbpf with every iproute2 release
> > > is necessary as well.
> > > Today iproute2-next should require 0.2.0. The cycle after it should be 0.3.0
> > > and so on.
> > > This way at least some correlation between iproute2 and libbpf will be
> > > established.
> > > Otherwise it's a mess of versions and functionality from user point of view.  
> >
> > As long as iproute2 6.0 and libbpf 0.11.0 continues to work on older kernel
> > (like oldest living LTS 4.19 in 2023?); then it is fine.
> >
> > Just don't want libbpf to cause visible breakage for users.  
> 
> libbpf CI validates a bunch of selftests on 4.9 kernel, see [0]. It
> should work on even older ones. Not all BPF programs would load and be
> verified successfully, but libbpf itself should work regardless.
> 
>   [0] https://travis-ci.com/github/libbpf/libbpf/jobs/429362146

Look at the dates in my note, are you willing to promise that compatibility
in future versions.


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-07  0:41                                         ` Stephen Hemminger
@ 2020-11-07  1:07                                           ` Andrii Nakryiko
  0 siblings, 0 replies; 167+ messages in thread
From: Andrii Nakryiko @ 2020-11-07  1:07 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Alexei Starovoitov, Jiri Benc, Edward Cree, Hangbin Liu,
	David Ahern, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Fri, Nov 6, 2020 at 4:41 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Fri, 6 Nov 2020 15:30:38 -0800
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > On Fri, Nov 6, 2020 at 3:25 PM Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > >
> > > On Fri, 6 Nov 2020 13:04:16 -0800
> > > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > >
> > > > On Fri, Nov 6, 2020 at 12:58 PM Andrii Nakryiko
> > > > <andrii.nakryiko@gmail.com> wrote:
> > > > >
> > > > > On Fri, Nov 6, 2020 at 12:44 AM Jiri Benc <jbenc@redhat.com> wrote:
> > > > > >
> > > > > > On Thu, 5 Nov 2020 12:19:00 -0800, Andrii Nakryiko wrote:
> > > > > > > I'll just quote myself here for your convenience.
> > > > > >
> > > > > > Sorry, I missed your original email for some reason.
> > > > > >
> > > > > > >   Submodule is a way that I know of to make this better for end users.
> > > > > > >   If there are other ways to pull this off with shared library use, I'm
> > > > > > >   all for it, it will save the security angle that distros are arguing
> > > > > > >   for. E.g., if distributions will always have the latest libbpf
> > > > > > >   available almost as soon as it's cut upstream *and* new iproute2
> > > > > > >   versions enforce the latest libbpf when they are packaged/released,
> > > > > > >   then this might work equivalently for end users. If Linux distros
> > > > > > >   would be willing to do this faithfully and promptly, I have no
> > > > > > >   objections whatsoever. Because all that matters is BPF end user
> > > > > > >   experience, as Daniel explained above.
> > > > > >
> > > > > > That's basically what we already do, for both Fedora and RHEL.
> > > > > >
> > > > > > Of course, it follows the distro release cycle, i.e. no version
> > > > > > upgrades - or very limited ones - during lifetime of a particular
> > > > > > release. But that would not be different if libbpf was bundled in
> > > > > > individual projects.
> > > > >
> > > > > Alright. Hopefully this would be sufficient in practice.
> > > >
> > > > I think bumping the minimal version of libbpf with every iproute2 release
> > > > is necessary as well.
> > > > Today iproute2-next should require 0.2.0. The cycle after it should be 0.3.0
> > > > and so on.
> > > > This way at least some correlation between iproute2 and libbpf will be
> > > > established.
> > > > Otherwise it's a mess of versions and functionality from user point of view.
> > >
> > > As long as iproute2 6.0 and libbpf 0.11.0 continues to work on older kernel
> > > (like oldest living LTS 4.19 in 2023?); then it is fine.
> > >
> > > Just don't want libbpf to cause visible breakage for users.
> >
> > libbpf CI validates a bunch of selftests on 4.9 kernel, see [0]. It
> > should work on even older ones. Not all BPF programs would load and be
> > verified successfully, but libbpf itself should work regardless.
> >
> >   [0] https://travis-ci.com/github/libbpf/libbpf/jobs/429362146
>
> Look at the dates in my note, are you willing to promise that compatibility
> in future versions.
>

I don't understand why after so many emails in this thread it's still
not clear that backwards compatibility is in libbpf's DNA. And no one
can even point out where and when exactly libbpf even had a problem
with backwards compatibility in the first place! Yet, all of this
insinuation of libbpf API instability...

So for the last time (hopefully): yes!

We managed to do that for at least 2 last years, why would we suddenly
break this?

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-06 23:38                                       ` David Ahern
@ 2020-11-09  1:45                                         ` Alexei Starovoitov
  2020-11-10  4:09                                           ` David Ahern
  0 siblings, 1 reply; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-09  1:45 UTC (permalink / raw)
  To: David Ahern
  Cc: Stephen Hemminger, Andrii Nakryiko, Jiri Benc, Edward Cree,
	Hangbin Liu, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Fri, Nov 06, 2020 at 04:38:13PM -0700, David Ahern wrote:
> On 11/6/20 4:25 PM, Stephen Hemminger wrote:
> >>
> >> I think bumping the minimal version of libbpf with every iproute2 release
> >> is necessary as well.
> >> Today iproute2-next should require 0.2.0. The cycle after it should be 0.3.0
> >> and so on.
> >> This way at least some correlation between iproute2 and libbpf will be
> >> established.
> >> Otherwise it's a mess of versions and functionality from user point of view.
> 
> If existing bpf features in iproute2 work fine with version 0.1.0, what
> is the justification for an arbitrary requirement for iproute2 to force
> users to bump libbpf versions just to use iproute2 from v5.11?

I don't understand why on one side you're pointing out existing quirkiness with
bpf usability while at the same time arguing to make it _less_ user friendly
when myself, Daniel, Andrii explained in detail what libbpf does and how it
affects user experience?

The analogy of libbpf in iproute2 and libbfd in gdb is that both libraries
perform large percentage of functionality comparing to the rest of the tool.
When library is dynamic linked it makes user experience unpredictable. My guess
is that libbfd is ~50% of what gdb is doing. What will the users say if gdb
suddenly behaves differently (supports less or more elf files) because
libbfd.so got upgraded in the background? In case of tc+libbpf the break down
of funcionality is heavliy skewed towards libbpf. The amount of logic iproute2
code will do to perform "tc filter ... bpf..." command is 10% iproute2 / 90%
libbpf. Issuing few netlink calls to attach bpf prog to a qdisc is trivial
comparing to what libbpf is doing with an elf file. There is a linker inside
libbpf. It will separate different functions inside elf file. It will relocate
code and adjust instructions before sending it to the kernel. libbpf is not
a wrapper. It's a mini compiler: CO-RE logic, function relocation, dynamic
kernel feature probing, etc. When the users use a command line tool (like
iproute2 or bpftool) they are interfacing with the tool. It's not unix-like to
demand that users should check the version of a shared library and adjust their
expectations. The UI is the command line. Its version is as a promise of
features. iproute2 of certain version in one distro should behave the same as
iproute2 in another distro. By not doing git submodule that promise is broken.
Hence my preference is to use fixed libbpf sha for every iproute2 release. The
other alternative is to lag iproute2/libbpf one release behind. Hence
repeating what I said earlier: Today iproute2-next should require 0.2.0. The
iprtoute2 in the next cycle _must_ bump be the minimum libbpf version to 0.3.0.
Not bumping minimum version brings us to square one and unpredicatable user
experience. The users are jumping through enough hoops when they develop bpf
programs. We have to make it simpler and easier. Using libbpf in iproute2
can improve the user experience, but only if it's predictable.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* [PATCHv4 iproute2-next 0/5] iproute2: add libbpf support
  2020-10-29 15:11   ` [PATCHv3 " Hangbin Liu
                       ` (5 preceding siblings ...)
  2020-11-02 15:47     ` [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support David Ahern
@ 2020-11-09  7:07     ` Hangbin Liu
  2020-11-09  7:07       ` [PATCHv4 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
                         ` (5 more replies)
  6 siblings, 6 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-09  7:07 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Hangbin Liu

This series converts iproute2 to use libbpf for loading and attaching
BPF programs when it is available. This means that iproute2 will
correctly process BTF information and support the new-style BTF-defined
maps, while keeping compatibility with the old internal map definition
syntax.

This is achieved by checking for libbpf at './configure' time, and using
it if available. By default the system libbpf will be used, but static
linking against a custom libbpf version can be achieved by passing
LIBBPF_DIR to configure. LIBBPF_FORCE can be set to on to force configure
abort if no suitable libbpf is found (useful for automatic packaging
that wants to enforce the dependency), or set off to disable libbpf check
and build iproute2 with legacy bpf.

The old iproute2 bpf code is kept and will be used if no suitable libbpf
is available. When using libbpf, wrapper code ensures that iproute2 will
still understand the old map definition format, including populating
map-in-map and tail call maps before load.

The examples in bpf/examples are kept, and a separate set of examples
are added with BTF-based map definitions for those examples where this
is possible (libbpf doesn't currently support declaratively populating
tail call maps).

At last, Thanks a lot for Toke's help on this patch set.

v4:
a) Make variable LIBBPF_FORCE able to control whether build iproute2
   with libbpf or not.
b) Add new file bpf_glue.c to for libbpf/legacy mixed bpf calls.
c) Fix some build issues and shell compatibility error.

v3:
a) Update configure to Check function bpf_program__section_name() separately
b) Add a new function get_bpf_program__section_name() to choose whether to
use bpf_program__title() or not.
c) Test build the patch on Fedora 33 with libbpf-0.1.0-1.fc33 and
   libbpf-devel-0.1.0-1.fc33

v2:
a) Remove self defined IS_ERR_OR_NULL and use libbpf_get_error() instead.
b) Add ipvrf with libbpf support.


Here are the test results with patched iproute2:

== setup env
# clang -O2 -Wall -g -target bpf -c bpf_graft.c -o btf_graft.o
# clang -O2 -Wall -g -target bpf -c bpf_map_in_map.c -o btf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c bpf_shared.c -o btf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_cyclic.c -o bpf_cyclic.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_graft.c -o bpf_graft.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_map_in_map.c -o bpf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_shared.c -o bpf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_tailcall.c -o bpf_tailcall.o
# rm -rf /sys/fs/bpf/xdp/globals
# /root/iproute2/ip/ip link add type veth
# /root/iproute2/ip/ip link set veth0 up
# /root/iproute2/ip/ip link set veth1 up


== Load objs
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 4 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
4: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:21-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 5
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 8 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
8: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:23-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 3
        btf_id 10
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 12 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
12: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:25-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 4
        btf_id 15
# /root/iproute2/ip/ip link set veth0 xdp off


== Load objs again to make sure maps could be reused
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 16 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
16: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:27-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 20
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 20 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show                                                                                                                                                                   [236/4518]
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
20: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:29-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 3
        btf_id 25
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 24 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
24: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:31-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 4
        btf_id 30
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals

== Testing if we can load new-style objects (using xdp-filter as an example)
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_all.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 28 tag e29eeda1489a6520 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
28: xdp  name xdpfilt_alw_all  tag e29eeda1489a6520  gpl
        loaded_at 2020-10-22T08:04:33-0400  uid 0
        xlated 2408B  jited 1405B  memlock 4096B  map_ids 9,5,7,8,6
        btf_id 35
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_ip.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 32 tag 2f2b9dbfb786a5a2 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
32: xdp  name xdpfilt_alw_ip  tag 2f2b9dbfb786a5a2  gpl
        loaded_at 2020-10-22T08:04:35-0400  uid 0
        xlated 1336B  jited 778B  memlock 4096B  map_ids 7,8,5
        btf_id 40
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_tcp.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 36 tag 18c1bb25084030bc jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
36: xdp  name xdpfilt_alw_tcp  tag 18c1bb25084030bc  gpl
        loaded_at 2020-10-22T08:04:37-0400  uid 0
        xlated 1128B  jited 690B  memlock 4096B  map_ids 6,5
        btf_id 45
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals


== Load new btf defined maps
# /root/iproute2/ip/ip link set veth0 xdp obj btf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 40 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
40: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:39-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 50
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj btf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 44 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
44: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:41-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 13
        btf_id 55
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj btf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 48 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer  map_sh
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
14: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
48: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:43-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 14
        btf_id 60
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals


== Test load objs by tc
# /root/iproute2/tc/tc qdisc add dev veth0 ingress
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_cyclic.o sec 0xabccba/0
# /root/iproute2/tc/tc filter add dev veth0 parent ffff: bpf obj bpf_graft.o
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/1
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 43/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec classifier
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
# ls /sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d /sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d:
jmp_tc

/sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f:
jmp_ex  jmp_tc  map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc
# bpftool map show
15: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
16: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
17: prog_array  name jmp_ex  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
18: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 2  memlock 4096B
        owner_prog_type sched_cls  owner jited
19: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
52: sched_cls  name cls_loop  tag 3e98a40b04099d36  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 168B  jited 133B  memlock 4096B  map_ids 15
        btf_id 65
56: sched_cls  name cls_entry  tag 0fbb4d9310a6ee26  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 144B  jited 121B  memlock 4096B  map_ids 16
        btf_id 70
60: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 75
66: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 80
72: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 85
78: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 90
79: sched_cls  name cls_case2  tag ee218ff893dca823  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 336B  jited 218B  memlock 4096B  map_ids 19,18
        btf_id 90
80: sched_cls  name cls_exit  tag e78a58140deed387  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 288B  jited 177B  memlock 4096B  map_ids 19
        btf_id 90

I also run the following upstream kselftest with patches iproute2 and
all passed.

test_lwt_ip_encap.sh
test_xdp_redirect.sh
test_tc_redirect.sh
test_xdp_meta.sh
test_xdp_veth.sh
test_xdp_vlan.sh

Hangbin Liu (5):
  configure: add check_libbpf() for later libbpf support
  lib: rename bpf.c to bpf_legacy.c
  lib: add libbpf support
  examples/bpf: move struct bpf_elf_map defined maps to legacy folder
  examples/bpf: add bpf examples with BTF defined maps

 configure                                | 108 +++++++
 examples/bpf/README                      |  18 +-
 examples/bpf/bpf_graft.c                 |  14 +-
 examples/bpf/bpf_map_in_map.c            |  37 ++-
 examples/bpf/bpf_shared.c                |  14 +-
 examples/bpf/{ => legacy}/bpf_cyclic.c   |   2 +-
 examples/bpf/legacy/bpf_graft.c          |  66 +++++
 examples/bpf/legacy/bpf_map_in_map.c     |  56 ++++
 examples/bpf/legacy/bpf_shared.c         |  53 ++++
 examples/bpf/{ => legacy}/bpf_tailcall.c |   2 +-
 include/bpf_api.h                        |  13 +
 include/bpf_util.h                       |  21 +-
 ip/ipvrf.c                               |   6 +-
 lib/Makefile                             |   6 +-
 lib/bpf_glue.c                           |  35 +++
 lib/{bpf.c => bpf_legacy.c}              | 193 ++++++++++++-
 lib/bpf_libbpf.c                         | 353 +++++++++++++++++++++++
 17 files changed, 939 insertions(+), 58 deletions(-)
 rename examples/bpf/{ => legacy}/bpf_cyclic.c (95%)
 create mode 100644 examples/bpf/legacy/bpf_graft.c
 create mode 100644 examples/bpf/legacy/bpf_map_in_map.c
 create mode 100644 examples/bpf/legacy/bpf_shared.c
 rename examples/bpf/{ => legacy}/bpf_tailcall.c (98%)
 create mode 100644 lib/bpf_glue.c
 rename lib/{bpf.c => bpf_legacy.c} (94%)
 create mode 100644 lib/bpf_libbpf.c

-- 
2.25.4


^ permalink raw reply	[flat|nested] 167+ messages in thread

* [PATCHv4 iproute2-next 1/5] configure: add check_libbpf() for later libbpf support
  2020-11-09  7:07     ` [PATCHv4 " Hangbin Liu
@ 2020-11-09  7:07       ` Hangbin Liu
  2020-11-14  3:26         ` David Ahern
  2020-11-09  7:07       ` [PATCHv4 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
                         ` (4 subsequent siblings)
  5 siblings, 1 reply; 167+ messages in thread
From: Hangbin Liu @ 2020-11-09  7:07 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Hangbin Liu

This patch adds a check to see if we have libbpf support. By default the
system libbpf will be used, but static linking against a custom libbpf
version can be achieved by passing LIBBPF_DIR to configure.

Add another variable LIBBPF_FORCE to control whether to build iproute2
with libbpf. If set to on, then force to build with libbpf and exit if
not available. If set to off, then force to not build with libbpf.

Signed-off-by: Hangbin Liu <haliu@redhat.com>

v4:
1) Remove duplicate LIBBPF_CFLAGS
2) Remove un-needed -L since using static libbpf.a
3) Fix == not supported in dash
4) Extend LIBBPF_FORCE to support on/off, when set to on, stop building when
   there is no libbpf support. If set to off, discard libbpf check.
5) Print libbpf version after checking

v3:
Check function bpf_program__section_name() separately and only use it
on higher libbpf version.

v2:
No update
---
 configure | 108 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 108 insertions(+)

diff --git a/configure b/configure
index 307912aa..3081a2ac 100755
--- a/configure
+++ b/configure
@@ -2,6 +2,11 @@
 # SPDX-License-Identifier: GPL-2.0
 # This is not an autoconf generated configure
 #
+# Influential LIBBPF environment variables:
+#   LIBBPF_FORCE={on,off}   on: require link against libbpf;
+#                           off: disable libbpf probing
+#   LIBBPF_LIBDIR           Path to libbpf to use
+
 INCLUDE=${1:-"$PWD/include"}
 
 # Output file which is input to Makefile
@@ -240,6 +245,106 @@ check_elf()
     fi
 }
 
+have_libbpf_basic()
+{
+    cat >$TMPDIR/libbpf_test.c <<EOF
+#include <bpf/libbpf.h>
+int main(int argc, char **argv) {
+    bpf_program__set_autoload(NULL, false);
+    bpf_map__ifindex(NULL);
+    bpf_map__set_pin_path(NULL, NULL);
+    bpf_object__open_file(NULL, NULL);
+    return 0;
+}
+EOF
+
+    $CC -o $TMPDIR/libbpf_test $TMPDIR/libbpf_test.c $LIBBPF_CFLAGS $LIBBPF_LDLIBS >/dev/null 2>&1
+    local ret=$?
+
+    rm -f $TMPDIR/libbpf_test.c $TMPDIR/libbpf_test
+    return $ret
+}
+
+have_libbpf_sec_name()
+{
+    cat >$TMPDIR/libbpf_sec_test.c <<EOF
+#include <bpf/libbpf.h>
+int main(int argc, char **argv) {
+    void *ptr;
+    bpf_program__section_name(NULL);
+    return 0;
+}
+EOF
+
+    $CC -o $TMPDIR/libbpf_sec_test $TMPDIR/libbpf_sec_test.c $LIBBPF_CFLAGS $LIBBPF_LDLIBS >/dev/null 2>&1
+    local ret=$?
+
+    rm -f $TMPDIR/libbpf_sec_test.c $TMPDIR/libbpf_sec_test
+    return $ret
+}
+
+check_force_libbpf_on()
+{
+    # if set LIBBPF_FORCE=on but no libbpf support, just exist the config
+    # process to make sure we don't build without libbpf.
+    if [ "$LIBBPF_FORCE" = on ]; then
+        echo "	LIBBPF_FORCE=on set, but couldn't find a usable libbpf"
+        exit 1
+    fi
+}
+
+check_libbpf()
+{
+    # if set LIBBPF_FORCE=off, disable libbpf entirely
+    if [ "$LIBBPF_FORCE" = off ]; then
+        echo "no"
+        return
+    fi
+
+    if ! ${PKG_CONFIG} libbpf --exists && [ -z "$LIBBPF_DIR" ] ; then
+        echo "no"
+        check_force_libbpf_on
+        return
+    fi
+
+    if [ $(uname -m) = x86_64 ]; then
+        local LIBBPF_LIBDIR="${LIBBPF_DIR}/lib64"
+    else
+        local LIBBPF_LIBDIR="${LIBBPF_DIR}/lib"
+    fi
+
+    if [ -n "$LIBBPF_DIR" ]; then
+        LIBBPF_CFLAGS="-I${LIBBPF_DIR}/include"
+        LIBBPF_LDLIBS="${LIBBPF_LIBDIR}/libbpf.a -lz -lelf"
+        LIBBPF_VERSION=$(PKG_CONFIG_LIBDIR=${LIBBPF_LIBDIR}/pkgconfig ${PKG_CONFIG} libbpf --modversion)
+    else
+        LIBBPF_CFLAGS=$(${PKG_CONFIG} libbpf --cflags)
+        LIBBPF_LDLIBS=$(${PKG_CONFIG} libbpf --libs)
+        LIBBPF_VERSION=$(${PKG_CONFIG} libbpf --modversion)
+    fi
+
+    if ! have_libbpf_basic; then
+        echo "no"
+        echo "	libbpf version $LIBBPF_VERSION is too low, please update it to at least 0.1.0"
+        check_force_libbpf_on
+        return
+    else
+        echo "HAVE_LIBBPF:=y" >>$CONFIG
+        echo 'CFLAGS += -DHAVE_LIBBPF ' $LIBBPF_CFLAGS >> $CONFIG
+        echo 'LDLIBS += ' $LIBBPF_LDLIBS >>$CONFIG
+    fi
+
+    # bpf_program__title() is deprecated since libbpf 0.2.0, use
+    # bpf_program__section_name() instead if we support
+    if have_libbpf_sec_name; then
+        echo "HAVE_LIBBPF_SECTION_NAME:=y" >>$CONFIG
+        echo 'CFLAGS += -DHAVE_LIBBPF_SECTION_NAME ' >> $CONFIG
+    fi
+
+    echo "yes"
+    echo "	libbpf version $LIBBPF_VERSION"
+}
+
 check_selinux()
 # SELinux is a compile time option in the ss utility
 {
@@ -385,6 +490,9 @@ check_setns
 echo -n "SELinux support: "
 check_selinux
 
+echo -n "libbpf support: "
+check_libbpf
+
 echo -n "ELF support: "
 check_elf
 
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv4 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c
  2020-11-09  7:07     ` [PATCHv4 " Hangbin Liu
  2020-11-09  7:07       ` [PATCHv4 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
@ 2020-11-09  7:07       ` Hangbin Liu
  2020-11-14  3:24         ` David Ahern
  2020-11-09  7:08       ` [PATCHv4 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
                         ` (3 subsequent siblings)
  5 siblings, 1 reply; 167+ messages in thread
From: Hangbin Liu @ 2020-11-09  7:07 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Hangbin Liu

This is a preparation for later main libbpf support in iproute2.
bpf.c is moved to bpf_legacy.c first.

A new file bpf_glue.c is added which could call both legacy libbpf code.
There are two wrapper functions added for ipvrf. Function
bpf_prog_load() is removed as it's conflict with libbpf function name.

v4: Add new file bpf_glue.c to for libbpf/legacy mixed bpf calls.
v2-v3: no update

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 include/bpf_util.h          | 10 +++++++---
 ip/ipvrf.c                  |  6 +++---
 lib/Makefile                |  2 +-
 lib/bpf_glue.c              | 35 +++++++++++++++++++++++++++++++++++
 lib/{bpf.c => bpf_legacy.c} | 15 +++------------
 5 files changed, 49 insertions(+), 19 deletions(-)
 create mode 100644 lib/bpf_glue.c
 rename lib/{bpf.c => bpf_legacy.c} (99%)

diff --git a/include/bpf_util.h b/include/bpf_util.h
index 63db07ca..82217cc6 100644
--- a/include/bpf_util.h
+++ b/include/bpf_util.h
@@ -274,12 +274,16 @@ int bpf_trace_pipe(void);
 
 void bpf_print_ops(struct rtattr *bpf_ops, __u16 len);
 
-int bpf_prog_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-		  size_t size_insns, const char *license, char *log,
-		  size_t size_log);
+int bpf_prog_load_dev(enum bpf_prog_type type, const struct bpf_insn *insns,
+		      size_t size_insns, const char *license, __u32 ifindex,
+		      char *log, size_t size_log);
+int bpf_program_load(enum bpf_prog_type type, const struct bpf_insn *insns,
+		     size_t size_insns, const char *license, char *log,
+		     size_t size_log);
 
 int bpf_prog_attach_fd(int prog_fd, int target_fd, enum bpf_attach_type type);
 int bpf_prog_detach_fd(int target_fd, enum bpf_attach_type type);
+int bpf_program_attach(int prog_fd, int target_fd, enum bpf_attach_type type);
 
 int bpf_dump_prog_info(FILE *f, uint32_t id);
 
diff --git a/ip/ipvrf.c b/ip/ipvrf.c
index 28dd8e25..42779e5c 100644
--- a/ip/ipvrf.c
+++ b/ip/ipvrf.c
@@ -256,8 +256,8 @@ static int prog_load(int idx)
 		BPF_EXIT_INSN(),
 	};
 
-	return bpf_prog_load(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
-			     "GPL", bpf_log_buf, sizeof(bpf_log_buf));
+	return bpf_program_load(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
+			        "GPL", bpf_log_buf, sizeof(bpf_log_buf));
 }
 
 static int vrf_configure_cgroup(const char *path, int ifindex)
@@ -288,7 +288,7 @@ static int vrf_configure_cgroup(const char *path, int ifindex)
 		goto out;
 	}
 
-	if (bpf_prog_attach_fd(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE)) {
+	if (bpf_program_attach(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE)) {
 		fprintf(stderr, "Failed to attach prog to cgroup: '%s'\n",
 			strerror(errno));
 		goto out;
diff --git a/lib/Makefile b/lib/Makefile
index 7cba1857..c9502f6a 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -5,7 +5,7 @@ CFLAGS += -fPIC
 
 UTILOBJ = utils.o rt_names.o ll_map.o ll_types.o ll_proto.o ll_addr.o \
 	inet_proto.o namespace.o json_writer.o json_print.o \
-	names.o color.o bpf.o exec.o fs.o cg_map.o
+	names.o color.o bpf_legacy.o bpf_glue.o exec.o fs.o cg_map.o
 
 NLOBJ=libgenl.o libnetlink.o
 
diff --git a/lib/bpf_glue.c b/lib/bpf_glue.c
new file mode 100644
index 00000000..7626a893
--- /dev/null
+++ b/lib/bpf_glue.c
@@ -0,0 +1,35 @@
+/*
+ * bpf_glue.c	BPF code to call both legacy and libbpf code
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ *
+ * Authors:	Hangbin Liu <haliu@redhat.com>
+ *
+ */
+#include "bpf_util.h"
+#ifdef HAVE_LIBBPF
+#include <bpf/bpf.h>
+#endif
+
+int bpf_program_load(enum bpf_prog_type type, const struct bpf_insn *insns,
+		     size_t size_insns, const char *license, char *log,
+		     size_t size_log)
+{
+#ifdef HAVE_LIBBPF
+	return bpf_load_program(type, insns, size_insns, license, 0, log, size_log);
+#else
+	return bpf_load_load_dev(type, insns, size_insns, license, 0, log, size_log);
+#endif
+}
+
+int bpf_program_attach(int prog_fd, int target_fd, enum bpf_attach_type type)
+{
+#ifdef HAVE_LIBBPF
+	return bpf_prog_attach(prog_fd, target_fd, type, 0);
+#else
+	return bpf_prog_attach_fd(prog_fd, target_fd, type);
+#endif
+}
diff --git a/lib/bpf.c b/lib/bpf_legacy.c
similarity index 99%
rename from lib/bpf.c
rename to lib/bpf_legacy.c
index c7d45077..4246fb76 100644
--- a/lib/bpf.c
+++ b/lib/bpf_legacy.c
@@ -1087,10 +1087,9 @@ int bpf_prog_detach_fd(int target_fd, enum bpf_attach_type type)
 	return bpf(BPF_PROG_DETACH, &attr, sizeof(attr));
 }
 
-static int bpf_prog_load_dev(enum bpf_prog_type type,
-			     const struct bpf_insn *insns, size_t size_insns,
-			     const char *license, __u32 ifindex,
-			     char *log, size_t size_log)
+int bpf_prog_load_dev(enum bpf_prog_type type, const struct bpf_insn *insns,
+		      size_t size_insns, const char *license, __u32 ifindex,
+		      char *log, size_t size_log)
 {
 	union bpf_attr attr = {};
 
@@ -1109,14 +1108,6 @@ static int bpf_prog_load_dev(enum bpf_prog_type type,
 	return bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
 }
 
-int bpf_prog_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-		  size_t size_insns, const char *license, char *log,
-		  size_t size_log)
-{
-	return bpf_prog_load_dev(type, insns, size_insns, license, 0,
-				 log, size_log);
-}
-
 #ifdef HAVE_ELF
 struct bpf_elf_prog {
 	enum bpf_prog_type	type;
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv4 iproute2-next 3/5] lib: add libbpf support
  2020-11-09  7:07     ` [PATCHv4 " Hangbin Liu
  2020-11-09  7:07       ` [PATCHv4 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
  2020-11-09  7:07       ` [PATCHv4 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
@ 2020-11-09  7:08       ` Hangbin Liu
  2020-11-09  7:08       ` [PATCHv4 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
                         ` (2 subsequent siblings)
  5 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-09  7:08 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Hangbin Liu

This patch converts iproute2 to use libbpf for loading and attaching
BPF programs when it is available, which is started by Toke's
implementation[1]. With libbpf iproute2 could correctly process BTF
information and support the new-style BTF-defined maps, while keeping
compatibility with the old internal map definition syntax.

The old iproute2 bpf code is kept and will be used if no suitable libbpf
is available. When using libbpf, wrapper code in bpf_legacy.c ensures that
iproute2 will still understand the old map definition format, including
populating map-in-map and tail call maps before load.

In bpf_libbpf.c, we init iproute2 ctx and elf info first to check the
legacy bytes. When handling the legacy maps, for map-in-maps, we create
them manually and re-use the fd as they are associated with id/inner_id.
For pin maps, we only set the pin path and let libbp load to handle it.
For tail calls, we find it first and update the element after prog load.

Other maps/progs will be loaded by libbpf directly.

[1] https://lore.kernel.org/bpf/20190820114706.18546-1-toke@redhat.com/

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>

v4:
Move ipvrf code to patch 02
Move HAVE_LIBBPF inside HAVE_ELF definition as libbpf depends on elf.

v3:
Add a new function get_bpf_program__section_name() to choose whether
use bpf_program__title() or not.

v2:
Remove self defined IS_ERR_OR_NULL and use libbpf_get_error() instead.
Add ipvrf with libbpf support.
---
 include/bpf_util.h |  11 ++
 lib/Makefile       |   4 +
 lib/bpf_legacy.c   | 178 +++++++++++++++++++++++
 lib/bpf_libbpf.c   | 353 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 546 insertions(+)
 create mode 100644 lib/bpf_libbpf.c

diff --git a/include/bpf_util.h b/include/bpf_util.h
index 82217cc6..d4f66de9 100644
--- a/include/bpf_util.h
+++ b/include/bpf_util.h
@@ -304,4 +304,15 @@ static inline int bpf_recv_map_fds(const char *path, int *fds,
 	return -1;
 }
 #endif /* HAVE_ELF */
+
+#ifdef HAVE_LIBBPF
+int iproute2_bpf_elf_ctx_init(struct bpf_cfg_in *cfg);
+int iproute2_bpf_fetch_ancillary(void);
+int iproute2_get_root_path(char *root_path, size_t len);
+bool iproute2_is_pin_map(const char *libbpf_map_name, char *pathname);
+bool iproute2_is_map_in_map(const char *libbpf_map_name, struct bpf_elf_map *imap,
+			    struct bpf_elf_map *omap, char *omap_name);
+int iproute2_find_map_name_by_id(unsigned int map_id, char *name);
+int iproute2_load_libbpf(struct bpf_cfg_in *cfg);
+#endif /* HAVE_LIBBPF */
 #endif /* __BPF_UTIL__ */
diff --git a/lib/Makefile b/lib/Makefile
index c9502f6a..7c8c4c50 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -7,6 +7,10 @@ UTILOBJ = utils.o rt_names.o ll_map.o ll_types.o ll_proto.o ll_addr.o \
 	inet_proto.o namespace.o json_writer.o json_print.o \
 	names.o color.o bpf_legacy.o bpf_glue.o exec.o fs.o cg_map.o
 
+ifeq ($(HAVE_LIBBPF),y)
+UTILOBJ += bpf_libbpf.o
+endif
+
 NLOBJ=libgenl.o libnetlink.o
 
 all: libnetlink.a libutil.a
diff --git a/lib/bpf_legacy.c b/lib/bpf_legacy.c
index 4246fb76..bc869c3f 100644
--- a/lib/bpf_legacy.c
+++ b/lib/bpf_legacy.c
@@ -940,6 +940,9 @@ static int bpf_do_parse(struct bpf_cfg_in *cfg, const bool *opt_tbl)
 static int bpf_do_load(struct bpf_cfg_in *cfg)
 {
 	if (cfg->mode == EBPF_OBJECT) {
+#ifdef HAVE_LIBBPF
+		return iproute2_load_libbpf(cfg);
+#endif
 		cfg->prog_fd = bpf_obj_open(cfg->object, cfg->type,
 					    cfg->section, cfg->ifindex,
 					    cfg->verbose);
@@ -3155,4 +3158,179 @@ int bpf_recv_map_fds(const char *path, int *fds, struct bpf_map_aux *aux,
 	close(fd);
 	return ret;
 }
+
+#ifdef HAVE_LIBBPF
+/* The following functions are wrapper functions for libbpf code to be
+ * compatible with the legacy format. So all the functions have prefix
+ * with iproute2_
+ */
+int iproute2_bpf_elf_ctx_init(struct bpf_cfg_in *cfg)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+
+	return bpf_elf_ctx_init(ctx, cfg->object, cfg->type, cfg->ifindex, cfg->verbose);
+}
+
+int iproute2_bpf_fetch_ancillary(void)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	struct bpf_elf_sec_data data;
+	int i, ret = 0;
+
+	for (i = 1; i < ctx->elf_hdr.e_shnum; i++) {
+		ret = bpf_fill_section_data(ctx, i, &data);
+		if (ret < 0)
+			continue;
+
+		if (data.sec_hdr.sh_type == SHT_PROGBITS &&
+		    !strcmp(data.sec_name, ELF_SECTION_MAPS))
+			ret = bpf_fetch_maps_begin(ctx, i, &data);
+		else if (data.sec_hdr.sh_type == SHT_SYMTAB &&
+			 !strcmp(data.sec_name, ".symtab"))
+			ret = bpf_fetch_symtab(ctx, i, &data);
+		else if (data.sec_hdr.sh_type == SHT_STRTAB &&
+			 !strcmp(data.sec_name, ".strtab"))
+			ret = bpf_fetch_strtab(ctx, i, &data);
+		if (ret < 0) {
+			fprintf(stderr, "Error parsing section %d! Perhaps check with readelf -a?\n",
+				i);
+			return ret;
+		}
+	}
+
+	if (bpf_has_map_data(ctx)) {
+		ret = bpf_fetch_maps_end(ctx);
+		if (ret < 0) {
+			fprintf(stderr, "Error fixing up map structure, incompatible struct bpf_elf_map used?\n");
+			return ret;
+		}
+	}
+
+	return ret;
+}
+
+int iproute2_get_root_path(char *root_path, size_t len)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	int ret = 0;
+
+	snprintf(root_path, len, "%s/%s",
+		 bpf_get_work_dir(ctx->type), BPF_DIR_GLOBALS);
+
+	ret = mkdir(root_path, S_IRWXU);
+	if (ret && errno != EEXIST) {
+		fprintf(stderr, "mkdir %s failed: %s\n", root_path, strerror(errno));
+		return ret;
+	}
+
+	return 0;
+}
+
+bool iproute2_is_pin_map(const char *libbpf_map_name, char *pathname)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *map_name, *tmp;
+	unsigned int pinning;
+	int i, ret = 0;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		if (ctx->maps[i].pinning == PIN_OBJECT_NS &&
+		    ctx->noafalg) {
+			fprintf(stderr, "Missing kernel AF_ALG support for PIN_OBJECT_NS!\n");
+			return false;
+		}
+
+		map_name = bpf_map_fetch_name(ctx, i);
+		if (!map_name) {
+			return false;
+		}
+
+		if (strcmp(libbpf_map_name, map_name))
+			continue;
+
+		pinning = ctx->maps[i].pinning;
+
+		if (bpf_no_pinning(ctx, pinning) || !bpf_get_work_dir(ctx->type))
+			return false;
+
+		if (pinning == PIN_OBJECT_NS)
+			ret = bpf_make_obj_path(ctx);
+		else if ((tmp = bpf_custom_pinning(ctx, pinning)))
+			ret = bpf_make_custom_path(ctx, tmp);
+		if (ret < 0)
+			return false;
+
+		bpf_make_pathname(pathname, PATH_MAX, map_name, ctx, pinning);
+
+		return true;
+	}
+
+	return false;
+}
+
+bool iproute2_is_map_in_map(const char *libbpf_map_name, struct bpf_elf_map *imap,
+			    struct bpf_elf_map *omap, char *omap_name)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *inner_map_name, *outer_map_name;
+	int i, j;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		inner_map_name = bpf_map_fetch_name(ctx, i);
+		if (!inner_map_name) {
+			return false;
+		}
+
+		if (strcmp(libbpf_map_name, inner_map_name))
+			continue;
+
+		if (!ctx->maps[i].id ||
+		    ctx->maps[i].inner_id ||
+		    ctx->maps[i].inner_idx == -1)
+			continue;
+
+		*imap = ctx->maps[i];
+
+		for (j = 0; j < ctx->map_num; j++) {
+			if (!bpf_is_map_in_map_type(&ctx->maps[j]))
+				continue;
+			if (ctx->maps[j].inner_id != ctx->maps[i].id)
+				continue;
+
+			*omap = ctx->maps[j];
+			outer_map_name = bpf_map_fetch_name(ctx, j);
+			memcpy(omap_name, outer_map_name, strlen(outer_map_name) + 1);
+
+			return true;
+		}
+	}
+
+	return false;
+}
+
+int iproute2_find_map_name_by_id(unsigned int map_id, char *name)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *map_name;
+	int i, idx = -1;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		if (ctx->maps[i].id == map_id &&
+		    ctx->maps[i].type == BPF_MAP_TYPE_PROG_ARRAY) {
+			idx = i;
+			break;
+		}
+	}
+
+	if (idx < 0)
+		return -1;
+
+	map_name = bpf_map_fetch_name(ctx, idx);
+	if (!map_name)
+		return -1;
+
+	memcpy(name, map_name, strlen(map_name) + 1);
+	return 0;
+}
+#endif /* HAVE_LIBBPF */
 #endif /* HAVE_ELF */
diff --git a/lib/bpf_libbpf.c b/lib/bpf_libbpf.c
new file mode 100644
index 00000000..26694b43
--- /dev/null
+++ b/lib/bpf_libbpf.c
@@ -0,0 +1,353 @@
+/*
+ * bpf_libbpf.c	BPF code relay on libbpf
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ *
+ * Authors:	Hangbin Liu <haliu@redhat.com>
+ *
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <errno.h>
+#include <fcntl.h>
+
+#include <libelf.h>
+#include <gelf.h>
+
+#include <bpf/libbpf.h>
+#include <bpf/bpf.h>
+
+#include "bpf_util.h"
+
+static int verbose_print(enum libbpf_print_level level, const char *format, va_list args)
+{
+	return vfprintf(stderr, format, args);
+}
+
+static int silent_print(enum libbpf_print_level level, const char *format, va_list args)
+{
+	if (level > LIBBPF_WARN)
+		return 0;
+
+	/* Skip warning from bpf_object__init_user_maps() for legacy maps */
+	if (strstr(format, "has unrecognized, non-zero options"))
+		return 0;
+
+	return vfprintf(stderr, format, args);
+}
+
+static const char *get_bpf_program__section_name(const struct bpf_program *prog)
+{
+#ifdef HAVE_LIBBPF_SECTION_NAME
+	return bpf_program__section_name(prog);
+#else
+	return bpf_program__title(prog, false);
+#endif
+}
+
+static int create_map(const char *name, struct bpf_elf_map *map,
+		      __u32 ifindex, int inner_fd)
+{
+	struct bpf_create_map_attr map_attr = {};
+
+	map_attr.name = name;
+	map_attr.map_type = map->type;
+	map_attr.map_flags = map->flags;
+	map_attr.key_size = map->size_key;
+	map_attr.value_size = map->size_value;
+	map_attr.max_entries = map->max_elem;
+	map_attr.map_ifindex = ifindex;
+	map_attr.inner_map_fd = inner_fd;
+
+	return bpf_create_map_xattr(&map_attr);
+}
+
+static int create_map_in_map(struct bpf_object *obj, struct bpf_map *map,
+			     struct bpf_elf_map *elf_map, int inner_fd,
+			     bool *reuse_pin_map)
+{
+	char pathname[PATH_MAX];
+	const char *map_name;
+	bool pin_map = false;
+	int map_fd, ret = 0;
+
+	map_name = bpf_map__name(map);
+
+	if (iproute2_is_pin_map(map_name, pathname)) {
+		pin_map = true;
+
+		/* Check if there already has a pinned map */
+		map_fd = bpf_obj_get(pathname);
+		if (map_fd > 0) {
+			if (reuse_pin_map)
+				*reuse_pin_map = true;
+			close(map_fd);
+			return bpf_map__set_pin_path(map, pathname);
+		}
+	}
+
+	map_fd = create_map(map_name, elf_map, bpf_map__ifindex(map), inner_fd);
+	if (map_fd < 0) {
+		fprintf(stderr, "create map %s failed\n", map_name);
+		return map_fd;
+	}
+
+	ret = bpf_map__reuse_fd(map, map_fd);
+	if (ret < 0) {
+		fprintf(stderr, "map %s reuse fd failed\n", map_name);
+		goto err_out;
+	}
+
+	if (pin_map) {
+		ret = bpf_map__set_pin_path(map, pathname);
+		if (ret < 0)
+			goto err_out;
+	}
+
+	return 0;
+err_out:
+	close(map_fd);
+	return ret;
+}
+
+static int
+handle_legacy_map_in_map(struct bpf_object *obj, struct bpf_map *inner_map,
+			 const char *inner_map_name)
+{
+	int inner_fd, outer_fd, inner_idx, ret = 0;
+	struct bpf_elf_map imap, omap;
+	struct bpf_map *outer_map;
+	/* What's the size limit of map name? */
+	char outer_map_name[128];
+	bool reuse_pin_map = false;
+
+	/* Deal with map-in-map */
+	if (iproute2_is_map_in_map(inner_map_name, &imap, &omap, outer_map_name)) {
+		ret = create_map_in_map(obj, inner_map, &imap, -1, NULL);
+		if (ret < 0)
+			return ret;
+
+		inner_fd = bpf_map__fd(inner_map);
+		outer_map = bpf_object__find_map_by_name(obj, outer_map_name);
+		ret = create_map_in_map(obj, outer_map, &omap, inner_fd, &reuse_pin_map);
+		if (ret < 0)
+			return ret;
+
+		if (!reuse_pin_map) {
+			inner_idx = imap.inner_idx;
+			outer_fd = bpf_map__fd(outer_map);
+			ret = bpf_map_update_elem(outer_fd, &inner_idx, &inner_fd, 0);
+			if (ret < 0)
+				fprintf(stderr, "Cannot update inner_idx into outer_map\n");
+		}
+	}
+
+	return ret;
+}
+
+static int find_legacy_tail_calls(struct bpf_program *prog, struct bpf_object *obj)
+{
+	unsigned int map_id, key_id;
+	const char *sec_name;
+	struct bpf_map *map;
+	char map_name[128];
+	int ret;
+
+	/* Handle iproute2 tail call */
+	sec_name = get_bpf_program__section_name(prog);
+	ret = sscanf(sec_name, "%i/%i", &map_id, &key_id);
+	if (ret != 2)
+		return -1;
+
+	ret = iproute2_find_map_name_by_id(map_id, map_name);
+	if (ret < 0) {
+		fprintf(stderr, "unable to find map id %u for tail call\n", map_id);
+		return ret;
+	}
+
+	map = bpf_object__find_map_by_name(obj, map_name);
+	if (!map)
+		return -1;
+
+	/* Save the map here for later updating */
+	bpf_program__set_priv(prog, map, NULL);
+
+	return 0;
+}
+
+static int update_legacy_tail_call_maps(struct bpf_object *obj)
+{
+	int prog_fd, map_fd, ret = 0;
+	unsigned int map_id, key_id;
+	struct bpf_program *prog;
+	const char *sec_name;
+	struct bpf_map *map;
+
+	bpf_object__for_each_program(prog, obj) {
+		map = bpf_program__priv(prog);
+		if (!map)
+			continue;
+
+		prog_fd = bpf_program__fd(prog);
+		if (prog_fd < 0)
+			continue;
+
+		sec_name = get_bpf_program__section_name(prog);
+		ret = sscanf(sec_name, "%i/%i", &map_id, &key_id);
+		if (ret != 2)
+			continue;
+
+		map_fd = bpf_map__fd(map);
+		ret = bpf_map_update_elem(map_fd, &key_id, &prog_fd, 0);
+		if (ret < 0) {
+			fprintf(stderr, "Cannot update map key for tail call!\n");
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+static int handle_legacy_maps(struct bpf_object *obj)
+{
+	char pathname[PATH_MAX];
+	struct bpf_map *map;
+	const char *map_name;
+	int map_fd, ret = 0;
+
+	bpf_object__for_each_map(map, obj) {
+		map_name = bpf_map__name(map);
+
+		ret = handle_legacy_map_in_map(obj, map, map_name);
+		if (ret)
+			return ret;
+
+		/* If it is a iproute2 legacy pin maps, just set pin path
+		 * and let bpf_object__load() to deal with the map creation.
+		 * We need to ignore map-in-maps which have pinned maps manually
+		 */
+		map_fd = bpf_map__fd(map);
+		if (map_fd < 0 && iproute2_is_pin_map(map_name, pathname)) {
+			ret = bpf_map__set_pin_path(map, pathname);
+			if (ret) {
+				fprintf(stderr, "map '%s': couldn't set pin path.\n", map_name);
+				break;
+			}
+		}
+
+	}
+
+	return ret;
+}
+
+static int load_bpf_object(struct bpf_cfg_in *cfg)
+{
+	struct bpf_program *p, *prog = NULL;
+	struct bpf_object *obj;
+	char root_path[PATH_MAX];
+	struct bpf_map *map;
+	int prog_fd, ret = 0;
+
+	ret = iproute2_get_root_path(root_path, PATH_MAX);
+	if (ret)
+		return ret;
+
+	DECLARE_LIBBPF_OPTS(bpf_object_open_opts, open_opts,
+			.relaxed_maps = true,
+			.pin_root_path = root_path,
+	);
+
+	obj = bpf_object__open_file(cfg->object, &open_opts);
+	if (libbpf_get_error(obj)) {
+		fprintf(stderr, "ERROR: opening BPF object file failed\n");
+		return -ENOENT;
+	}
+
+	bpf_object__for_each_program(p, obj) {
+		/* Only load the programs that will either be subsequently
+		 * attached or inserted into a tail call map */
+		if (find_legacy_tail_calls(p, obj) < 0 && cfg->section &&
+		    strcmp(get_bpf_program__section_name(p), cfg->section)) {
+			ret = bpf_program__set_autoload(p, false);
+			if (ret)
+				return -EINVAL;
+			continue;
+		}
+
+		bpf_program__set_type(p, cfg->type);
+		bpf_program__set_ifindex(p, cfg->ifindex);
+		if (!prog)
+			prog = p;
+	}
+
+	bpf_object__for_each_map(map, obj) {
+		if (!bpf_map__is_offload_neutral(map))
+			bpf_map__set_ifindex(map, cfg->ifindex);
+	}
+
+	if (!prog) {
+		fprintf(stderr, "object file doesn't contain sec %s\n", cfg->section);
+		return -ENOENT;
+	}
+
+	/* Handle iproute2 legacy pin maps and map-in-maps */
+	ret = handle_legacy_maps(obj);
+	if (ret)
+		goto unload_obj;
+
+	ret = bpf_object__load(obj);
+	if (ret)
+		goto unload_obj;
+
+	ret = update_legacy_tail_call_maps(obj);
+	if (ret)
+		goto unload_obj;
+
+	prog_fd = fcntl(bpf_program__fd(prog), F_DUPFD_CLOEXEC, 1);
+	if (prog_fd < 0)
+		ret = -errno;
+	else
+		cfg->prog_fd = prog_fd;
+
+unload_obj:
+	/* Close obj as we don't need it */
+	bpf_object__close(obj);
+	return ret;
+}
+
+/* Load ebpf and return prog fd */
+int iproute2_load_libbpf(struct bpf_cfg_in *cfg)
+{
+	int ret = 0;
+
+	if (cfg->verbose)
+		libbpf_set_print(verbose_print);
+	else
+		libbpf_set_print(silent_print);
+
+	ret = iproute2_bpf_elf_ctx_init(cfg);
+	if (ret < 0) {
+		fprintf(stderr, "Cannot initialize ELF context!\n");
+		return ret;
+	}
+
+	ret = iproute2_bpf_fetch_ancillary();
+	if (ret < 0) {
+		fprintf(stderr, "Error fetching ELF ancillary data!\n");
+		return ret;
+	}
+
+	ret = load_bpf_object(cfg);
+	if (ret)
+		return ret;
+
+	return cfg->prog_fd;
+}
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv4 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder
  2020-11-09  7:07     ` [PATCHv4 " Hangbin Liu
                         ` (2 preceding siblings ...)
  2020-11-09  7:08       ` [PATCHv4 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
@ 2020-11-09  7:08       ` Hangbin Liu
  2020-11-09  7:08       ` [PATCHv4 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
  2020-11-16  6:53       ` [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
  5 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-09  7:08 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Hangbin Liu

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 examples/bpf/README                        | 14 +++++++++-----
 examples/bpf/{ => legacy}/bpf_cyclic.c     |  2 +-
 examples/bpf/{ => legacy}/bpf_graft.c      |  2 +-
 examples/bpf/{ => legacy}/bpf_map_in_map.c |  2 +-
 examples/bpf/{ => legacy}/bpf_shared.c     |  2 +-
 examples/bpf/{ => legacy}/bpf_tailcall.c   |  2 +-
 6 files changed, 14 insertions(+), 10 deletions(-)
 rename examples/bpf/{ => legacy}/bpf_cyclic.c (95%)
 rename examples/bpf/{ => legacy}/bpf_graft.c (97%)
 rename examples/bpf/{ => legacy}/bpf_map_in_map.c (96%)
 rename examples/bpf/{ => legacy}/bpf_shared.c (97%)
 rename examples/bpf/{ => legacy}/bpf_tailcall.c (98%)

diff --git a/examples/bpf/README b/examples/bpf/README
index 1bbdda3f..732bcc83 100644
--- a/examples/bpf/README
+++ b/examples/bpf/README
@@ -1,8 +1,12 @@
 eBPF toy code examples (running in kernel) to familiarize yourself
 with syntax and features:
 
- - bpf_shared.c		-> Ingress/egress map sharing example
- - bpf_tailcall.c	-> Using tail call chains
- - bpf_cyclic.c		-> Simple cycle as tail calls
- - bpf_graft.c		-> Demo on altering runtime behaviour
- - bpf_map_in_map.c     -> Using map in map example
+ - legacy/bpf_shared.c		-> Ingress/egress map sharing example
+ - legacy/bpf_tailcall.c	-> Using tail call chains
+ - legacy/bpf_cyclic.c		-> Simple cycle as tail calls
+ - legacy/bpf_graft.c		-> Demo on altering runtime behaviour
+ - legacy/bpf_map_in_map.c	-> Using map in map example
+
+Note: Users should use new BTF way to defined the maps, the examples
+in legacy folder which is using struct bpf_elf_map defined maps is not
+recommanded.
diff --git a/examples/bpf/bpf_cyclic.c b/examples/bpf/legacy/bpf_cyclic.c
similarity index 95%
rename from examples/bpf/bpf_cyclic.c
rename to examples/bpf/legacy/bpf_cyclic.c
index 11d1c061..33590730 100644
--- a/examples/bpf/bpf_cyclic.c
+++ b/examples/bpf/legacy/bpf_cyclic.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* Cyclic dependency example to test the kernel's runtime upper
  * bound on loops. Also demonstrates on how to use direct-actions,
diff --git a/examples/bpf/bpf_graft.c b/examples/bpf/legacy/bpf_graft.c
similarity index 97%
rename from examples/bpf/bpf_graft.c
rename to examples/bpf/legacy/bpf_graft.c
index 07113d4a..f4c920cc 100644
--- a/examples/bpf/bpf_graft.c
+++ b/examples/bpf/legacy/bpf_graft.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* This example demonstrates how classifier run-time behaviour
  * can be altered with tail calls. We start out with an empty
diff --git a/examples/bpf/bpf_map_in_map.c b/examples/bpf/legacy/bpf_map_in_map.c
similarity index 96%
rename from examples/bpf/bpf_map_in_map.c
rename to examples/bpf/legacy/bpf_map_in_map.c
index ff0e623a..575f8812 100644
--- a/examples/bpf/bpf_map_in_map.c
+++ b/examples/bpf/legacy/bpf_map_in_map.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 #define MAP_INNER_ID	42
 
diff --git a/examples/bpf/bpf_shared.c b/examples/bpf/legacy/bpf_shared.c
similarity index 97%
rename from examples/bpf/bpf_shared.c
rename to examples/bpf/legacy/bpf_shared.c
index 21fe6f1e..05b2b9ef 100644
--- a/examples/bpf/bpf_shared.c
+++ b/examples/bpf/legacy/bpf_shared.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* Minimal, stand-alone toy map pinning example:
  *
diff --git a/examples/bpf/bpf_tailcall.c b/examples/bpf/legacy/bpf_tailcall.c
similarity index 98%
rename from examples/bpf/bpf_tailcall.c
rename to examples/bpf/legacy/bpf_tailcall.c
index 161eb606..8ebc554c 100644
--- a/examples/bpf/bpf_tailcall.c
+++ b/examples/bpf/legacy/bpf_tailcall.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 #define ENTRY_INIT	3
 #define ENTRY_0		0
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv4 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps
  2020-11-09  7:07     ` [PATCHv4 " Hangbin Liu
                         ` (3 preceding siblings ...)
  2020-11-09  7:08       ` [PATCHv4 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
@ 2020-11-09  7:08       ` Hangbin Liu
  2020-11-16  6:53       ` [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
  5 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-09  7:08 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Hangbin Liu

Users should try use the new BTF defined maps instead of struct
bpf_elf_map defined maps. The tail call examples are not added yet
as libbpf doesn't currently support declaratively populating tail call
maps.

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 examples/bpf/README           |  6 ++++
 examples/bpf/bpf_graft.c      | 66 +++++++++++++++++++++++++++++++++++
 examples/bpf/bpf_map_in_map.c | 55 +++++++++++++++++++++++++++++
 examples/bpf/bpf_shared.c     | 53 ++++++++++++++++++++++++++++
 include/bpf_api.h             | 13 +++++++
 5 files changed, 193 insertions(+)
 create mode 100644 examples/bpf/bpf_graft.c
 create mode 100644 examples/bpf/bpf_map_in_map.c
 create mode 100644 examples/bpf/bpf_shared.c

diff --git a/examples/bpf/README b/examples/bpf/README
index 732bcc83..b7261191 100644
--- a/examples/bpf/README
+++ b/examples/bpf/README
@@ -1,6 +1,12 @@
 eBPF toy code examples (running in kernel) to familiarize yourself
 with syntax and features:
 
+- BTF defined map examples
+ - bpf_graft.c		-> Demo on altering runtime behaviour
+ - bpf_shared.c 	-> Ingress/egress map sharing example
+ - bpf_map_in_map.c	-> Using map in map example
+
+- legacy struct bpf_elf_map defined map examples
  - legacy/bpf_shared.c		-> Ingress/egress map sharing example
  - legacy/bpf_tailcall.c	-> Using tail call chains
  - legacy/bpf_cyclic.c		-> Simple cycle as tail calls
diff --git a/examples/bpf/bpf_graft.c b/examples/bpf/bpf_graft.c
new file mode 100644
index 00000000..8066dcce
--- /dev/null
+++ b/examples/bpf/bpf_graft.c
@@ -0,0 +1,66 @@
+#include "../../include/bpf_api.h"
+
+/* This example demonstrates how classifier run-time behaviour
+ * can be altered with tail calls. We start out with an empty
+ * jmp_tc array, then add section aaa to the array slot 0, and
+ * later on atomically replace it with section bbb. Note that
+ * as shown in other examples, the tc loader can prepopulate
+ * tail called sections, here we start out with an empty one
+ * on purpose to show it can also be done this way.
+ *
+ * tc filter add dev foo parent ffff: bpf obj graft.o
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-20229 [001] ..s. 138993.003923: : fallthrough
+ *   <idle>-0            [001] ..s. 138993.202265: : fallthrough
+ *   Socket Thread-20229 [001] ..s. 138994.004149: : fallthrough
+ *   [...]
+ *
+ * tc exec bpf graft m:globals/jmp_tc key 0 obj graft.o sec aaa
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-19818 [002] ..s. 139012.053587: : aaa
+ *   <idle>-0            [002] ..s. 139012.172359: : aaa
+ *   Socket Thread-19818 [001] ..s. 139012.173556: : aaa
+ *   [...]
+ *
+ * tc exec bpf graft m:globals/jmp_tc key 0 obj graft.o sec bbb
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-19818 [002] ..s. 139022.102967: : bbb
+ *   <idle>-0            [002] ..s. 139022.155640: : bbb
+ *   Socket Thread-19818 [001] ..s. 139022.156730: : bbb
+ *   [...]
+ */
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);
+} jmp_tc __section(".maps");
+
+__section("aaa")
+int cls_aaa(struct __sk_buff *skb)
+{
+	printt("aaa\n");
+	return TC_H_MAKE(1, 42);
+}
+
+__section("bbb")
+int cls_bbb(struct __sk_buff *skb)
+{
+	printt("bbb\n");
+	return TC_H_MAKE(1, 43);
+}
+
+__section_cls_entry
+int cls_entry(struct __sk_buff *skb)
+{
+	tail_call(skb, &jmp_tc, 0);
+	printt("fallthrough\n");
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/examples/bpf/bpf_map_in_map.c b/examples/bpf/bpf_map_in_map.c
new file mode 100644
index 00000000..39c86268
--- /dev/null
+++ b/examples/bpf/bpf_map_in_map.c
@@ -0,0 +1,55 @@
+#include "../../include/bpf_api.h"
+
+struct inner_map {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+} map_inner __section(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);
+	__array(values, struct inner_map);
+} map_outer __section(".maps") = {
+	.values = {
+		[0] = &map_inner,
+	},
+};
+
+__section("egress")
+int emain(struct __sk_buff *skb)
+{
+	struct bpf_elf_map *map_inner;
+	int key = 0, *val;
+
+	map_inner = map_lookup_elem(&map_outer, &key);
+	if (map_inner) {
+		val = map_lookup_elem(map_inner, &key);
+		if (val)
+			lock_xadd(val, 1);
+	}
+
+	return BPF_H_DEFAULT;
+}
+
+__section("ingress")
+int imain(struct __sk_buff *skb)
+{
+	struct bpf_elf_map *map_inner;
+	int key = 0, *val;
+
+	map_inner = map_lookup_elem(&map_outer, &key);
+	if (map_inner) {
+		val = map_lookup_elem(map_inner, &key);
+		if (val)
+			printt("map val: %d\n", *val);
+	}
+
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/examples/bpf/bpf_shared.c b/examples/bpf/bpf_shared.c
new file mode 100644
index 00000000..99a332f4
--- /dev/null
+++ b/examples/bpf/bpf_shared.c
@@ -0,0 +1,53 @@
+#include "../../include/bpf_api.h"
+
+/* Minimal, stand-alone toy map pinning example:
+ *
+ * clang -target bpf -O2 [...] -o bpf_shared.o -c bpf_shared.c
+ * tc filter add dev foo parent 1: bpf obj bpf_shared.o sec egress
+ * tc filter add dev foo parent ffff: bpf obj bpf_shared.o sec ingress
+ *
+ * Both classifier will share the very same map instance in this example,
+ * so map content can be accessed from ingress *and* egress side!
+ *
+ * This example has a pinning of PIN_OBJECT_NS, so it's private and
+ * thus shared among various program sections within the object.
+ *
+ * A setting of PIN_GLOBAL_NS would place it into a global namespace,
+ * so that it can be shared among different object files. A setting
+ * of PIN_NONE (= 0) means no sharing, so each tc invocation a new map
+ * instance is being created.
+ */
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);	/* or LIBBPF_PIN_NONE */
+} map_sh __section(".maps");
+
+__section("egress")
+int emain(struct __sk_buff *skb)
+{
+	int key = 0, *val;
+
+	val = map_lookup_elem(&map_sh, &key);
+	if (val)
+		lock_xadd(val, 1);
+
+	return BPF_H_DEFAULT;
+}
+
+__section("ingress")
+int imain(struct __sk_buff *skb)
+{
+	int key = 0, *val;
+
+	val = map_lookup_elem(&map_sh, &key);
+	if (val)
+		printt("map val: %d\n", *val);
+
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/include/bpf_api.h b/include/bpf_api.h
index 89d3488d..82c47089 100644
--- a/include/bpf_api.h
+++ b/include/bpf_api.h
@@ -19,6 +19,19 @@
 
 #include "bpf_elf.h"
 
+/** libbpf pin type. */
+enum libbpf_pin_type {
+	LIBBPF_PIN_NONE,
+	/* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */
+	LIBBPF_PIN_BY_NAME,
+};
+
+/** Type helper macros. */
+
+#define __uint(name, val) int (*name)[val]
+#define __type(name, val) typeof(val) *name
+#define __array(name, val) typeof(val) *name[]
+
 /** Misc macros. */
 
 #ifndef __stringify
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-09  1:45                                         ` Alexei Starovoitov
@ 2020-11-10  4:09                                           ` David Ahern
  2020-11-11  0:47                                             ` Alexei Starovoitov
  0 siblings, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-11-10  4:09 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Stephen Hemminger, Andrii Nakryiko, Jiri Benc, Edward Cree,
	Hangbin Liu, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On 11/8/20 6:45 PM, Alexei Starovoitov wrote:
> 
> I don't understand why on one side you're pointing out existing quirkiness with
> bpf usability while at the same time arguing to make it _less_ user friendly

I believe you have confused my comments with others. My comments have
focused on one aspect: The insistence by BPF maintainers that all code
bases and users constantly chase latest and greatest versions of
relevant S/W to use BPF - though I believe a lot of the tool chasing
stems from BTF. I am fairly certain I have been consistent in that theme
within this thread.

> when myself, Daniel, Andrii explained in detail what libbpf does and how it
> affects user experience?
> 
> The analogy of libbpf in iproute2 and libbfd in gdb is that both libraries

Your gdb / libbfd analogy misses the mark - by a lot. That analogy is
relevant for bpftool, not iproute2.

iproute2 can leverage libbpf for 3 or 4 tc modules and a few xdp hooks.
That is it, and it is a tiny percentage of the functionality in the package.


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-05 14:05                           ` Jamal Hadi Salim
  2020-11-05 21:01                             ` Andrii Nakryiko
@ 2020-11-10 12:47                             ` Edward Cree
  2020-11-11  0:53                               ` Alexei Starovoitov
  1 sibling, 1 reply; 167+ messages in thread
From: Edward Cree @ 2020-11-10 12:47 UTC (permalink / raw)
  To: Jamal Hadi Salim, David Ahern, Daniel Borkmann,
	Alexei Starovoitov, Hangbin Liu
  Cc: Andrii Nakryiko, Stephen Hemminger, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On 05/11/2020 14:05, Jamal Hadi Salim wrote:
> On 2020-11-04 10:19 p.m., David Ahern wrote:
>
> [..]
>> Similarly, it is not realistic or user friendly to *require* general
>> Linux users to constantly chase latest versions of llvm, clang, dwarves,
>> bcc, bpftool, libbpf, (I am sure I am missing more)
>
> 2cents feedback from a dabbler in ebpf on user experience:
>
> What David described above *has held me back*.
If we're doing 2¢... I gave up on trying to keep ebpf_asmabreast
 of all the latest BPF and BTF features quite some time ago, since
 there was rarely any documentation and the specifications for BPF
 elves were basically "whatever latest clang does".
The bpf developers seem to have taken the position that since
 they're in control of clang, libbpf and the kernel, they can make
 their changes across all three and not bother with the specs that
 would allow other toolchains to interoperate.  As a result of
 which, that belief has now become true — while ebpf_asm will
 still work for what it always did (simple XDP programs), it is
 unlikely ever to gain CO-RE support so is no longer a live
 alternative to clang for BPF in general.
Of course the bpf developers are well within their rights to not
 care about that.  But I think it illustrates why having to
 interoperate with systems outside their control and mix-and-match
 versioning of various components provides external discipline that
 is sorely needed if the BPF ecosystem is to remain healthy.
That is why I am opposed to iproute2 'vendoring' libbpf.

-ed

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-10  4:09                                           ` David Ahern
@ 2020-11-11  0:47                                             ` Alexei Starovoitov
  2020-11-11 11:02                                               ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-11  0:47 UTC (permalink / raw)
  To: David Ahern
  Cc: Stephen Hemminger, Andrii Nakryiko, Jiri Benc, Edward Cree,
	Hangbin Liu, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Mon, Nov 09, 2020 at 09:09:44PM -0700, David Ahern wrote:
> On 11/8/20 6:45 PM, Alexei Starovoitov wrote:
> > 
> > I don't understand why on one side you're pointing out existing quirkiness with
> > bpf usability while at the same time arguing to make it _less_ user friendly
> 
> I believe you have confused my comments with others. My comments have
> focused on one aspect: The insistence by BPF maintainers that all code
> bases and users constantly chase latest and greatest versions of
> relevant S/W to use BPF

yes, because we care about user experience while you're still insisting
on make it horrible.
With random pick of libbpf.so we would have no choice, but to actively tell
users to avoid using tc, because sooner or later they will be pissed. I'd
rather warn them ahead of time.

> - though I believe a lot of the tool chasing
> stems from BTF. I am fairly certain I have been consistent in that theme
> within this thread.

Right. A lot of features added in the last couple years depend on BTF:
static vs global linking, bpf_spin_lock, function by function verification, etc

> > when myself, Daniel, Andrii explained in detail what libbpf does and how it
> > affects user experience?
> > 
> > The analogy of libbpf in iproute2 and libbfd in gdb is that both libraries
> 
> Your gdb / libbfd analogy misses the mark - by a lot. That analogy is
> relevant for bpftool, not iproute2.
> 
> iproute2 can leverage libbpf for 3 or 4 tc modules and a few xdp hooks.
> That is it, and it is a tiny percentage of the functionality in the package.

cat tools/lib/bpf/*.[hc]|wc -l
23950
cat iproute2/tc/*.[hc]|wc -l
29542

The point is that for these few tc commands the amount logic in libbpf/tc is 90/10.

Let's play it out how libbpf+tc is going to get developed moving forward if
libbpf is a random version. Say, there is a patch for libbpf that makes
iproute2 experience better. bpf maintainers would have no choice, but to reject
it, since we don't add features/apis to libbpf if there is no active user.
Adding a new libbpf api that iproute2 few years from now may or may not take
advantage makes little sense.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-10 12:47                             ` Edward Cree
@ 2020-11-11  0:53                               ` Alexei Starovoitov
  2020-11-11 11:31                                 ` Edward Cree
  0 siblings, 1 reply; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-11  0:53 UTC (permalink / raw)
  To: Edward Cree
  Cc: Jamal Hadi Salim, David Ahern, Daniel Borkmann, Hangbin Liu,
	Andrii Nakryiko, Stephen Hemminger, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Tue, Nov 10, 2020 at 12:47:28PM +0000, Edward Cree wrote:
> On 05/11/2020 14:05, Jamal Hadi Salim wrote:
> > On 2020-11-04 10:19 p.m., David Ahern wrote:
> >
> > [..]
> >> Similarly, it is not realistic or user friendly to *require* general
> >> Linux users to constantly chase latest versions of llvm, clang, dwarves,
> >> bcc, bpftool, libbpf, (I am sure I am missing more)
> >
> > 2cents feedback from a dabbler in ebpf on user experience:
> >
> > What David described above *has held me back*.
> If we're doing 2¢... I gave up on trying to keep ebpf_asmabreast
>  of all the latest BPF and BTF features quite some time ago, since
>  there was rarely any documentation and the specifications for BPF
>  elves were basically "whatever latest clang does".
> The bpf developers seem to have taken the position that since
>  they're in control of clang, libbpf and the kernel, they can make
>  their changes across all three and not bother with the specs that
>  would allow other toolchains to interoperate.  As a result of
>  which, that belief has now become true — while ebpf_asm will
>  still work for what it always did (simple XDP programs), it is
>  unlikely ever to gain CO-RE support so is no longer a live
>  alternative to clang for BPF in general.
> Of course the bpf developers are well within their rights to not
>  care about that.  But I think it illustrates why having to
>  interoperate with systems outside their control and mix-and-match
>  versioning of various components provides external discipline that
>  is sorely needed if the BPF ecosystem is to remain healthy.

I think thriving public bpf projects, startups and established companies
that are obviously outside of control of few people that argue here
would disagree with your assessment.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-11  0:47                                             ` Alexei Starovoitov
@ 2020-11-11 11:02                                               ` Toke Høiland-Jørgensen
  2020-11-11 15:06                                                 ` Daniel Borkmann
  0 siblings, 1 reply; 167+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-11-11 11:02 UTC (permalink / raw)
  To: Alexei Starovoitov, David Ahern
  Cc: Stephen Hemminger, Andrii Nakryiko, Jiri Benc, Edward Cree,
	Hangbin Liu, Daniel Borkmann, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Andrii Nakryiko

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Mon, Nov 09, 2020 at 09:09:44PM -0700, David Ahern wrote:
>> On 11/8/20 6:45 PM, Alexei Starovoitov wrote:
>> > 
>> > I don't understand why on one side you're pointing out existing quirkiness with
>> > bpf usability while at the same time arguing to make it _less_ user friendly
>> 
>> I believe you have confused my comments with others. My comments have
>> focused on one aspect: The insistence by BPF maintainers that all code
>> bases and users constantly chase latest and greatest versions of
>> relevant S/W to use BPF
>
> yes, because we care about user experience while you're still insisting
> on make it horrible.
> With random pick of libbpf.so we would have no choice, but to actively tell
> users to avoid using tc, because sooner or later they will be pissed. I'd
> rather warn them ahead of time.

Could we *please* stop with this "my way or the highway" extortion? It's
incredibly rude, and it's not helping the discussion.

>> - though I believe a lot of the tool chasing
>> stems from BTF. I am fairly certain I have been consistent in that theme
>> within this thread.
>
> Right. A lot of features added in the last couple years depend on BTF:
> static vs global linking, bpf_spin_lock, function by function verification, etc
>
>> > when myself, Daniel, Andrii explained in detail what libbpf does and how it
>> > affects user experience?
>> > 
>> > The analogy of libbpf in iproute2 and libbfd in gdb is that both libraries
>> 
>> Your gdb / libbfd analogy misses the mark - by a lot. That analogy is
>> relevant for bpftool, not iproute2.
>> 
>> iproute2 can leverage libbpf for 3 or 4 tc modules and a few xdp hooks.
>> That is it, and it is a tiny percentage of the functionality in the package.
>
> cat tools/lib/bpf/*.[hc]|wc -l
> 23950
> cat iproute2/tc/*.[hc]|wc -l
> 29542
>
> The point is that for these few tc commands the amount logic in libbpf/tc is 90/10.
>
> Let's play it out how libbpf+tc is going to get developed moving forward if
> libbpf is a random version. Say, there is a patch for libbpf that makes
> iproute2 experience better. bpf maintainers would have no choice, but to reject
> it, since we don't add features/apis to libbpf if there is no active user.
> Adding a new libbpf api that iproute2 few years from now may or may not take
> advantage makes little sense.

What? No one has said that iproute2 would never use any new features,
just that they would be added conditionally on a compatibility check
with libbpf (like the check for bpf_program__section_name() in the
current patch series).

Besides, for the entire history of BPF support in iproute2 so far, the
benefit has come from all the features that libbpf has just started
automatically supporting on load (BTF, etc), so users would have
benefited from automatic library updates had it *not* been vendored in.

-Toke


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-11  0:53                               ` Alexei Starovoitov
@ 2020-11-11 11:31                                 ` Edward Cree
  2020-11-11 18:08                                   ` Alexei Starovoitov
  0 siblings, 1 reply; 167+ messages in thread
From: Edward Cree @ 2020-11-11 11:31 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jamal Hadi Salim, David Ahern, Daniel Borkmann, Hangbin Liu,
	Andrii Nakryiko, Stephen Hemminger, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On 11/11/2020 00:53, Alexei Starovoitov wrote:
> On Tue, Nov 10, 2020 at 12:47:28PM +0000, Edward Cree wrote:
>> But I think it illustrates why having to
>>  interoperate with systems outside their control and mix-and-match
>>  versioning of various components provides external discipline that
>>  is sorely needed if the BPF ecosystem is to remain healthy.
> 
> I think thriving public bpf projects, startups and established companies
> that are obviously outside of control of few people that argue here
> would disagree with your assessment.

Correct me if I'm wrong, but aren't those bpf projects and companies
 _things that are written in BPF_, rather than alternative toolchain
 components for compiling, loading and otherwise wrangling BPF once
 it's been written?
It is the latter that I am saying is needed in order to keep BPF
 infrastructure development "honest", rather than treating the clang
 frontend as The API and all layers below it as undocumented internal
 implementation details.
In a healthy ecosystem, it should be possible to use a compiler,
 assembler, linker and loader developed separately by four projects
 unrelated to each other and to the kernel and runtime.  Thanks to
 well-specified ABIs and file formats, in the C ecosystem this is
 actually possible, despite the existence of some projects that
 bundle together multiple components.
In the BPF ecosystem, instead, it seems like the only toolchain
 anyone cares to support is latest clang + latest libbpf, and if you
 try to replace any component of the toolchain with something else,
 the spec you have to program against is "Go and read the LLVM
 source code, figure out what it does, and copy that".
That is not sustainable in the long term.

-ed

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-11 11:02                                               ` Toke Høiland-Jørgensen
@ 2020-11-11 15:06                                                 ` Daniel Borkmann
  2020-11-11 16:33                                                   ` David Ahern
  2020-11-12 22:36                                                   ` Toke Høiland-Jørgensen
  0 siblings, 2 replies; 167+ messages in thread
From: Daniel Borkmann @ 2020-11-11 15:06 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Alexei Starovoitov, David Ahern
  Cc: Stephen Hemminger, Andrii Nakryiko, Jiri Benc, Edward Cree,
	Hangbin Liu, Alexei Starovoitov, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Miller, Jesper Dangaard Brouer, Networking,
	bpf, Andrii Nakryiko

On 11/11/20 12:02 PM, Toke Høiland-Jørgensen wrote:
> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>> On Mon, Nov 09, 2020 at 09:09:44PM -0700, David Ahern wrote:
>>> On 11/8/20 6:45 PM, Alexei Starovoitov wrote:
>>>>
>>>> I don't understand why on one side you're pointing out existing quirkiness with
>>>> bpf usability while at the same time arguing to make it _less_ user friendly
>>>
>>> I believe you have confused my comments with others. My comments have
>>> focused on one aspect: The insistence by BPF maintainers that all code
>>> bases and users constantly chase latest and greatest versions of
>>> relevant S/W to use BPF
>>
>> yes, because we care about user experience while you're still insisting
>> on make it horrible.
>> With random pick of libbpf.so we would have no choice, but to actively tell
>> users to avoid using tc, because sooner or later they will be pissed. I'd
>> rather warn them ahead of time.
> 
> Could we *please* stop with this "my way or the highway" extortion? It's
> incredibly rude, and it's not helping the discussion.
> 
>>> - though I believe a lot of the tool chasing
>>> stems from BTF. I am fairly certain I have been consistent in that theme
>>> within this thread.
>>
>> Right. A lot of features added in the last couple years depend on BTF:
>> static vs global linking, bpf_spin_lock, function by function verification, etc
>>
>>>> when myself, Daniel, Andrii explained in detail what libbpf does and how it
>>>> affects user experience?
>>>>
>>>> The analogy of libbpf in iproute2 and libbfd in gdb is that both libraries
>>>
>>> Your gdb / libbfd analogy misses the mark - by a lot. That analogy is
>>> relevant for bpftool, not iproute2.
>>>
>>> iproute2 can leverage libbpf for 3 or 4 tc modules and a few xdp hooks.
>>> That is it, and it is a tiny percentage of the functionality in the package.
>>
>> cat tools/lib/bpf/*.[hc]|wc -l
>> 23950
>> cat iproute2/tc/*.[hc]|wc -l
>> 29542
>>
>> The point is that for these few tc commands the amount logic in libbpf/tc is 90/10.
>>
>> Let's play it out how libbpf+tc is going to get developed moving forward if
>> libbpf is a random version. Say, there is a patch for libbpf that makes
>> iproute2 experience better. bpf maintainers would have no choice, but to reject
>> it, since we don't add features/apis to libbpf if there is no active user.
>> Adding a new libbpf api that iproute2 few years from now may or may not take
>> advantage makes little sense.
> 
> What? No one has said that iproute2 would never use any new features,
> just that they would be added conditionally on a compatibility check
> with libbpf (like the check for bpf_program__section_name() in the
> current patch series).
> 
> Besides, for the entire history of BPF support in iproute2 so far, the
> benefit has come from all the features that libbpf has just started
> automatically supporting on load (BTF, etc), so users would have
> benefited from automatic library updates had it *not* been vendored in.

Not really. What you imply here is that we're living in a perfect world and that
all distros follow suite and i) add libbpf dependency to their official iproute2
package, ii) upgrade iproute2 package along with new kernel releases and iii)
upgrade libbpf along with it so that users are able to develop BPF programs against
the feature set that the kernel offers (as intended). These are a lot of moving parts
to get right, and as I pointed out earlier in the conversation, it took major distros
2 years to get their act together to officially include bpftool as a package -
I'm not making this up, and this sort of pace is simply not sustainable. It's also
not clear whether distros will get point iii) correct. It's not about compatibility,
but rather about __users__ of the loader being able to __benefit__ of the latest
features their distro kernel ships from BPF (& libbpf) side just as they do with
iproute2 extensions. For the integrated lib/bpf.c in iproute2 this was never an
issue and for multiple years in the earlier days it was much further ahead than
libbpf which was only tracing-focused before we decided to put focus on the latter
as a more general loader instead. But if you ever want to start a deprecation process
of the lib/bpf.c then users should not need to worry whether iproute2 was even linked
to libbpf in the first place, they should be able to have a guarantee that it's
__generally available__ as with lib/bpf.c, otherwise they'll always just assume
the latter as the minimal available base when writing code against iproute2 loader.
Hypothetically speaking, if Hangbin would have presented patches here to extend the
existing lib/bpf.c to the point that it's feature complete (compared to libbpf),
we wouldn't even have this whole discussion here.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-11 15:06                                                 ` Daniel Borkmann
@ 2020-11-11 16:33                                                   ` David Ahern
  2020-11-12 22:36                                                   ` Toke Høiland-Jørgensen
  1 sibling, 0 replies; 167+ messages in thread
From: David Ahern @ 2020-11-11 16:33 UTC (permalink / raw)
  To: Daniel Borkmann, Toke Høiland-Jørgensen, Alexei Starovoitov
  Cc: Stephen Hemminger, Andrii Nakryiko, Jiri Benc, Edward Cree,
	Hangbin Liu, Alexei Starovoitov, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Miller, Jesper Dangaard Brouer, Networking,
	bpf, Andrii Nakryiko

On 11/11/20 8:06 AM, Daniel Borkmann wrote:
> 
> Not really. What you imply here is that we're living in a perfect world
> and that
> all distros follow suite and i) add libbpf dependency to their official
> iproute2
> package, ii) upgrade iproute2 package along with new kernel releases and
> iii)
> upgrade libbpf along with it so that users are able to develop BPF
> programs against
> the feature set that the kernel offers (as intended). These are a lot of
> moving parts
> to get right, and as I pointed out earlier in the conversation, it took
> major distros
> 2 years to get their act together to officially include bpftool as a
> package -

Yes, there are lot of moving parts and that puts a huge burden on
distributions. The trend that related s/w is outdated 2-3 months after a
release can be taken as a sign that bpf is not stable and ready for
distributions to take on and support.

bpftool is only 3 years old (Oct 2017 is first kernel commit). You can
not expect distributions to chase every whim from kernel developers, so
bpftool needed to evolve and prove its usefulness. It has now, so really
the disappointment should be limited to distributions over the past 12
months, especially Ubuntu 20.04 (most recent LTS) not having a libbpf
and bpftool releases. But again, 20.04 was too old for BTF 3 months
after it was released and that comes back to the bigger question of
whether bpf is really ready for distributions to support. More below.

Focusing on the future: for Ubuntu (and Debian?) bpftool is in the
linux-tools-common package. perf has already trained distributions to
release a tools package with kernel releases. That means bpftool updates
follow the kernel cadence. bpftool requires libbpf and I believe given
the feature dependencies will force libbpf versions to follow kernel
releases, so I believe your goal is going to be achieved by those
dependencies.

But there is an on-going nagging problem which needs to be acknowledged
and solved. As an *example*, Ubunutu has kernel updates to get new
hardware support (HWE releases). Updating kernels on an LTS is
problematic when the kernel update requires toolchain updates to
maintain features (DEBUG_INFO_BTF) and library updates to get tools for
that kernel version working. That is a huge disruption to their
customers who want stability — the whole reason for LTS distributions.





^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-11 11:31                                 ` Edward Cree
@ 2020-11-11 18:08                                   ` Alexei Starovoitov
  0 siblings, 0 replies; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-11 18:08 UTC (permalink / raw)
  To: Edward Cree
  Cc: Jamal Hadi Salim, David Ahern, Daniel Borkmann, Hangbin Liu,
	Andrii Nakryiko, Stephen Hemminger, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Networking, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Wed, Nov 11, 2020 at 11:31:47AM +0000, Edward Cree wrote:
> On 11/11/2020 00:53, Alexei Starovoitov wrote:
> > On Tue, Nov 10, 2020 at 12:47:28PM +0000, Edward Cree wrote:
> >> But I think it illustrates why having to
> >>  interoperate with systems outside their control and mix-and-match
> >>  versioning of various components provides external discipline that
> >>  is sorely needed if the BPF ecosystem is to remain healthy.
> > 
> > I think thriving public bpf projects, startups and established companies
> > that are obviously outside of control of few people that argue here
> > would disagree with your assessment.
> 
> Correct me if I'm wrong, but aren't those bpf projects and companies
>  _things that are written in BPF_, rather than alternative toolchain
>  components for compiling, loading and otherwise wrangling BPF once
>  it's been written?
> It is the latter that I am saying is needed in order to keep BPF
>  infrastructure development "honest", rather than treating the clang
>  frontend as The API and all layers below it as undocumented internal
>  implementation details.
> In a healthy ecosystem, it should be possible to use a compiler,
>  assembler, linker and loader developed separately by four projects
>  unrelated to each other and to the kernel and runtime.  Thanks to
>  well-specified ABIs and file formats, in the C ecosystem this is
>  actually possible, despite the existence of some projects that
>  bundle together multiple components.
> In the BPF ecosystem, instead, it seems like the only toolchain
>  anyone cares to support is latest clang + latest libbpf, and if you
>  try to replace any component of the toolchain with something else,
>  the spec you have to program against is "Go and read the LLVM
>  source code, figure out what it does, and copy that".
> That is not sustainable in the long term.

Absolutely. I agree 100% with above.
BPF ecosystem eventually will get to a point of fixed file format,
linker specification and 1000 page psABI document.
One can argue that when RISCV ISA was invented recently and it came with full
ABI document just like x86 long ago. BPF ISA is different. It grows
"organically". We don't add all possible instructions up front. We don't define
all possible relocation types to ELF. That fundamental difference vs all other
ISAs help BPF follow its own path. Take BTF, for example. No other ISA have
such concept. Yet due to BTF the BPF ecosystem can provide features no other
ISA can. Similar story happens with clang. BPF extended C language _already_.
The BPF C programs have a way to compare types. It is a C language extension.
Did we go to C standard committee and argue for years that such extension is
necessary? Obviously not. Today BPF is, as you correctly pointed out, layers of
undocumented internal details. Obviously we're not content with such situation.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-11 15:06                                                 ` Daniel Borkmann
  2020-11-11 16:33                                                   ` David Ahern
@ 2020-11-12 22:36                                                   ` Toke Høiland-Jørgensen
  2020-11-12 23:20                                                     ` Daniel Borkmann
  1 sibling, 1 reply; 167+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-11-12 22:36 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov, David Ahern
  Cc: Stephen Hemminger, Andrii Nakryiko, Jiri Benc, Edward Cree,
	Hangbin Liu, Alexei Starovoitov, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Miller, Jesper Dangaard Brouer, Networking,
	bpf, Andrii Nakryiko

Daniel Borkmann <daniel@iogearbox.net> writes:

>> Besides, for the entire history of BPF support in iproute2 so far, the
>> benefit has come from all the features that libbpf has just started
>> automatically supporting on load (BTF, etc), so users would have
>> benefited from automatic library updates had it *not* been vendored in.
>
> Not really. What you imply here is that we're living in a perfect
> world and that all distros follow suite and i) add libbpf dependency
> to their official iproute2 package, ii) upgrade iproute2 package along
> with new kernel releases and iii) upgrade libbpf along with it so that
> users are able to develop BPF programs against the feature set that
> the kernel offers (as intended). These are a lot of moving parts to
> get right, and as I pointed out earlier in the conversation, it took
> major distros 2 years to get their act together to officially include
> bpftool as a package - I'm not making this up, and this sort of pace
> is simply not sustainable. It's also not clear whether distros will
> get point iii) correct.

I totally get that you've been frustrated with the distro adoption and
packaging of BPF-related tools. And rightfully so. I just don't think
that the answer to this is to try to work around distros, but rather to
work with them to get things right.

I'm quite happy to take a shot at getting a cross-distro effort going in
this space; really, having well-supported BPF tooling ought to be in
everyone's interest!

-Toke


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-12 22:36                                                   ` Toke Høiland-Jørgensen
@ 2020-11-12 23:20                                                     ` Daniel Borkmann
  2020-11-13  0:04                                                       ` Stephen Hemminger
  2020-11-13  3:55                                                       ` David Ahern
  0 siblings, 2 replies; 167+ messages in thread
From: Daniel Borkmann @ 2020-11-12 23:20 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Alexei Starovoitov, David Ahern
  Cc: Stephen Hemminger, Andrii Nakryiko, Jiri Benc, Edward Cree,
	Hangbin Liu, Alexei Starovoitov, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Miller, Jesper Dangaard Brouer, Networking,
	bpf, Andrii Nakryiko

On 11/12/20 11:36 PM, Toke Høiland-Jørgensen wrote:
> Daniel Borkmann <daniel@iogearbox.net> writes:
> 
>>> Besides, for the entire history of BPF support in iproute2 so far, the
>>> benefit has come from all the features that libbpf has just started
>>> automatically supporting on load (BTF, etc), so users would have
>>> benefited from automatic library updates had it *not* been vendored in.
>>
>> Not really. What you imply here is that we're living in a perfect
>> world and that all distros follow suite and i) add libbpf dependency
>> to their official iproute2 package, ii) upgrade iproute2 package along
>> with new kernel releases and iii) upgrade libbpf along with it so that
>> users are able to develop BPF programs against the feature set that
>> the kernel offers (as intended). These are a lot of moving parts to
>> get right, and as I pointed out earlier in the conversation, it took
>> major distros 2 years to get their act together to officially include
>> bpftool as a package - I'm not making this up, and this sort of pace
>> is simply not sustainable. It's also not clear whether distros will
>> get point iii) correct.
> 
> I totally get that you've been frustrated with the distro adoption and
> packaging of BPF-related tools. And rightfully so. I just don't think
> that the answer to this is to try to work around distros, but rather to
> work with them to get things right.
> 
> I'm quite happy to take a shot at getting a cross-distro effort going in
> this space; really, having well-supported BPF tooling ought to be in
> everyone's interest!

Thanks, yes, that is worth a push either way! There is still a long tail
of distros that are not considered major and until they all catch up with
points i)-iii) it might take a much longer time until this becomes really
ubiquitous with iproute2 for users of the libbpf loader. Its that this
frustrating user experience could be avoided altogether. iproute2 is
shipped and run also on small / embedded devices hence it tries to have
external dependencies reduced to a bare minimum (well, except that libmnl
detour, but it's not a mandatory dependency). If I were a user and would
rely on the loader for my progs to be installed I'd probably end up
compiling my own version of iproute2 linked with libbpf to move forward
instead of being blocked on distro to catch up, but its an additional
hassle for shipping SW instead of just having it all pre-installed when
built-in given it otherwise comes with the base distro already. But then
my question is what is planned here as deprecation process for the built-in
lib/bpf.c code? I presume we'll remove it eventually to move on?

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-12 23:20                                                     ` Daniel Borkmann
@ 2020-11-13  0:04                                                       ` Stephen Hemminger
  2020-11-13  0:40                                                         ` Alexei Starovoitov
  2020-11-13  3:55                                                       ` David Ahern
  1 sibling, 1 reply; 167+ messages in thread
From: Stephen Hemminger @ 2020-11-13  0:04 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Toke Høiland-Jørgensen, Alexei Starovoitov,
	David Ahern, Andrii Nakryiko, Jiri Benc, Edward Cree,
	Hangbin Liu, Alexei Starovoitov, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Miller, Jesper Dangaard Brouer, Networking,
	bpf, Andrii Nakryiko

On Fri, 13 Nov 2020 00:20:52 +0100
Daniel Borkmann <daniel@iogearbox.net> wrote:

> On 11/12/20 11:36 PM, Toke Høiland-Jørgensen wrote:
> > Daniel Borkmann <daniel@iogearbox.net> writes:
> >   
> >>> Besides, for the entire history of BPF support in iproute2 so far, the
> >>> benefit has come from all the features that libbpf has just started
> >>> automatically supporting on load (BTF, etc), so users would have
> >>> benefited from automatic library updates had it *not* been vendored in.  
> >>
> >> Not really. What you imply here is that we're living in a perfect
> >> world and that all distros follow suite and i) add libbpf dependency
> >> to their official iproute2 package, ii) upgrade iproute2 package along
> >> with new kernel releases and iii) upgrade libbpf along with it so that
> >> users are able to develop BPF programs against the feature set that
> >> the kernel offers (as intended). These are a lot of moving parts to
> >> get right, and as I pointed out earlier in the conversation, it took
> >> major distros 2 years to get their act together to officially include
> >> bpftool as a package - I'm not making this up, and this sort of pace
> >> is simply not sustainable. It's also not clear whether distros will
> >> get point iii) correct.  
> > 
> > I totally get that you've been frustrated with the distro adoption and
> > packaging of BPF-related tools. And rightfully so. I just don't think
> > that the answer to this is to try to work around distros, but rather to
> > work with them to get things right.
> > 
> > I'm quite happy to take a shot at getting a cross-distro effort going in
> > this space; really, having well-supported BPF tooling ought to be in
> > everyone's interest!  
> 
> Thanks, yes, that is worth a push either way! There is still a long tail
> of distros that are not considered major and until they all catch up with
> points i)-iii) it might take a much longer time until this becomes really
> ubiquitous with iproute2 for users of the libbpf loader. Its that this
> frustrating user experience could be avoided altogether. iproute2 is
> shipped and run also on small / embedded devices hence it tries to have
> external dependencies reduced to a bare minimum (well, except that libmnl
> detour, but it's not a mandatory dependency). If I were a user and would
> rely on the loader for my progs to be installed I'd probably end up
> compiling my own version of iproute2 linked with libbpf to move forward
> instead of being blocked on distro to catch up, but its an additional
> hassle for shipping SW instead of just having it all pre-installed when
> built-in given it otherwise comes with the base distro already. But then
> my question is what is planned here as deprecation process for the built-in
> lib/bpf.c code? I presume we'll remove it eventually to move on?

Perf has a similar problem and it made it into most distributions because it is
a valuable tool. Maybe there is some lessons learned that could apply here.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-13  0:04                                                       ` Stephen Hemminger
@ 2020-11-13  0:40                                                         ` Alexei Starovoitov
  0 siblings, 0 replies; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-13  0:40 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Daniel Borkmann, Toke Høiland-Jørgensen, David Ahern,
	Andrii Nakryiko, Jiri Benc, Edward Cree, Hangbin Liu,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, Networking, bpf,
	Andrii Nakryiko

On Thu, Nov 12, 2020 at 4:35 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Fri, 13 Nov 2020 00:20:52 +0100
> Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> > On 11/12/20 11:36 PM, Toke Høiland-Jørgensen wrote:
> > > Daniel Borkmann <daniel@iogearbox.net> writes:
> > >
> > >>> Besides, for the entire history of BPF support in iproute2 so far, the
> > >>> benefit has come from all the features that libbpf has just started
> > >>> automatically supporting on load (BTF, etc), so users would have
> > >>> benefited from automatic library updates had it *not* been vendored in.
> > >>
> > >> Not really. What you imply here is that we're living in a perfect
> > >> world and that all distros follow suite and i) add libbpf dependency
> > >> to their official iproute2 package, ii) upgrade iproute2 package along
> > >> with new kernel releases and iii) upgrade libbpf along with it so that
> > >> users are able to develop BPF programs against the feature set that
> > >> the kernel offers (as intended). These are a lot of moving parts to
> > >> get right, and as I pointed out earlier in the conversation, it took
> > >> major distros 2 years to get their act together to officially include
> > >> bpftool as a package - I'm not making this up, and this sort of pace
> > >> is simply not sustainable. It's also not clear whether distros will
> > >> get point iii) correct.
> > >
> > > I totally get that you've been frustrated with the distro adoption and
> > > packaging of BPF-related tools. And rightfully so. I just don't think
> > > that the answer to this is to try to work around distros, but rather to
> > > work with them to get things right.
> > >
> > > I'm quite happy to take a shot at getting a cross-distro effort going in
> > > this space; really, having well-supported BPF tooling ought to be in
> > > everyone's interest!
> >
> > Thanks, yes, that is worth a push either way! There is still a long tail
> > of distros that are not considered major and until they all catch up with
> > points i)-iii) it might take a much longer time until this becomes really
> > ubiquitous with iproute2 for users of the libbpf loader. Its that this
> > frustrating user experience could be avoided altogether. iproute2 is
> > shipped and run also on small / embedded devices hence it tries to have
> > external dependencies reduced to a bare minimum (well, except that libmnl
> > detour, but it's not a mandatory dependency). If I were a user and would
> > rely on the loader for my progs to be installed I'd probably end up
> > compiling my own version of iproute2 linked with libbpf to move forward
> > instead of being blocked on distro to catch up, but its an additional
> > hassle for shipping SW instead of just having it all pre-installed when
> > built-in given it otherwise comes with the base distro already. But then
> > my question is what is planned here as deprecation process for the built-in
> > lib/bpf.c code? I presume we'll remove it eventually to move on?
>
> Perf has a similar problem and it made it into most distributions because it is
> a valuable tool. Maybe there is some lessons learned that could apply here.

Indeed.
Please read tools/perf/Documentation/Build.txt
and realize that perf binary _statically_ links libperf library.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-12 23:20                                                     ` Daniel Borkmann
  2020-11-13  0:04                                                       ` Stephen Hemminger
@ 2020-11-13  3:55                                                       ` David Ahern
  1 sibling, 0 replies; 167+ messages in thread
From: David Ahern @ 2020-11-13  3:55 UTC (permalink / raw)
  To: Daniel Borkmann, Toke Høiland-Jørgensen, Alexei Starovoitov
  Cc: Stephen Hemminger, Andrii Nakryiko, Jiri Benc, Edward Cree,
	Hangbin Liu, Alexei Starovoitov, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Miller, Jesper Dangaard Brouer, Networking,
	bpf, Andrii Nakryiko

On 11/12/20 4:20 PM, Daniel Borkmann wrote:
> built-in given it otherwise comes with the base distro already. But then
> my question is what is planned here as deprecation process for the built-in
> lib/bpf.c code? I presume we'll remove it eventually to move on?

It will need to follow the established deprecation pattern for N, N-1
releases (N here refers to distro LTS releases, not kernel or iproute2
releases). Meaning, for the next few years it needs to exist as an
option when libbpf is not installed. After that we can add a deprecation
warning that libbpf is preferred, and then at some point in the distant
future it can be removed.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv4 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c
  2020-11-09  7:07       ` [PATCHv4 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
@ 2020-11-14  3:24         ` David Ahern
  2020-11-16  3:55           ` Hangbin Liu
  0 siblings, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-11-14  3:24 UTC (permalink / raw)
  To: Hangbin Liu, Stephen Hemminger
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen

On 11/9/20 12:07 AM, Hangbin Liu wrote:
> diff --git a/lib/bpf_glue.c b/lib/bpf_glue.c
> new file mode 100644
> index 00000000..7626a893
> --- /dev/null
> +++ b/lib/bpf_glue.c

...

> +
> +int bpf_program_load(enum bpf_prog_type type, const struct bpf_insn *insns,
> +		     size_t size_insns, const char *license, char *log,
> +		     size_t size_log)
> +{
> +#ifdef HAVE_LIBBPF
> +	return bpf_load_program(type, insns, size_insns, license, 0, log, size_log);
> +#else
> +	return bpf_load_load_dev(type, insns, size_insns, license, 0, log, size_log);
> +#endif
> +}
> +

Fails to compile:

$ LIBBPF_FORCE=off ./configure
$ make
...
/usr/bin/ld: ../lib/libutil.a(bpf_glue.o): in function `bpf_program_load':
bpf_glue.c:(.text+0x13): undefined reference to `bpf_load_load_dev'
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:27: ip] Error 1
make: *** [Makefile:64: all] Error 2


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv4 iproute2-next 1/5] configure: add check_libbpf() for later libbpf support
  2020-11-09  7:07       ` [PATCHv4 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
@ 2020-11-14  3:26         ` David Ahern
  2020-11-16  4:30           ` Hangbin Liu
  0 siblings, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-11-14  3:26 UTC (permalink / raw)
  To: Hangbin Liu, Stephen Hemminger
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen

On 11/9/20 12:07 AM, Hangbin Liu wrote:
> This patch adds a check to see if we have libbpf support. By default the
> system libbpf will be used, but static linking against a custom libbpf
> version can be achieved by passing LIBBPF_DIR to configure.
> 
> Add another variable LIBBPF_FORCE to control whether to build iproute2
> with libbpf. If set to on, then force to build with libbpf and exit if
> not available. If set to off, then force to not build with libbpf.
> 
> Signed-off-by: Hangbin Liu <haliu@redhat.com>
> 
> v4:
> 1) Remove duplicate LIBBPF_CFLAGS
> 2) Remove un-needed -L since using static libbpf.a
> 3) Fix == not supported in dash
> 4) Extend LIBBPF_FORCE to support on/off, when set to on, stop building when
>    there is no libbpf support. If set to off, discard libbpf check.
> 5) Print libbpf version after checking
> 
> v3:
> Check function bpf_program__section_name() separately and only use it
> on higher libbpf version.
> 
> v2:
> No update
> ---
>  configure | 108 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 108 insertions(+)
> 
> diff --git a/configure b/configure
> index 307912aa..3081a2ac 100755
> --- a/configure
> +++ b/configure
> @@ -2,6 +2,11 @@
>  # SPDX-License-Identifier: GPL-2.0
>  # This is not an autoconf generated configure
>  #
> +# Influential LIBBPF environment variables:
> +#   LIBBPF_FORCE={on,off}   on: require link against libbpf;
> +#                           off: disable libbpf probing
> +#   LIBBPF_LIBDIR           Path to libbpf to use
> +
>  INCLUDE=${1:-"$PWD/include"}
>  
>  # Output file which is input to Makefile
> @@ -240,6 +245,106 @@ check_elf()
>      fi
>  }
>  
> +have_libbpf_basic()
> +{
> +    cat >$TMPDIR/libbpf_test.c <<EOF
> +#include <bpf/libbpf.h>
> +int main(int argc, char **argv) {
> +    bpf_program__set_autoload(NULL, false);
> +    bpf_map__ifindex(NULL);
> +    bpf_map__set_pin_path(NULL, NULL);
> +    bpf_object__open_file(NULL, NULL);
> +    return 0;
> +}
> +EOF
> +
> +    $CC -o $TMPDIR/libbpf_test $TMPDIR/libbpf_test.c $LIBBPF_CFLAGS $LIBBPF_LDLIBS >/dev/null 2>&1
> +    local ret=$?
> +
> +    rm -f $TMPDIR/libbpf_test.c $TMPDIR/libbpf_test
> +    return $ret
> +}
> +
> +have_libbpf_sec_name()
> +{
> +    cat >$TMPDIR/libbpf_sec_test.c <<EOF
> +#include <bpf/libbpf.h>
> +int main(int argc, char **argv) {
> +    void *ptr;
> +    bpf_program__section_name(NULL);
> +    return 0;
> +}
> +EOF
> +
> +    $CC -o $TMPDIR/libbpf_sec_test $TMPDIR/libbpf_sec_test.c $LIBBPF_CFLAGS $LIBBPF_LDLIBS >/dev/null 2>&1
> +    local ret=$?
> +
> +    rm -f $TMPDIR/libbpf_sec_test.c $TMPDIR/libbpf_sec_test
> +    return $ret
> +}
> +
> +check_force_libbpf_on()
> +{
> +    # if set LIBBPF_FORCE=on but no libbpf support, just exist the config
> +    # process to make sure we don't build without libbpf.
> +    if [ "$LIBBPF_FORCE" = on ]; then
> +        echo "	LIBBPF_FORCE=on set, but couldn't find a usable libbpf"
> +        exit 1
> +    fi
> +}
> +
> +check_libbpf()
> +{
> +    # if set LIBBPF_FORCE=off, disable libbpf entirely
> +    if [ "$LIBBPF_FORCE" = off ]; then
> +        echo "no"
> +        return
> +    fi
> +
> +    if ! ${PKG_CONFIG} libbpf --exists && [ -z "$LIBBPF_DIR" ] ; then
> +        echo "no"
> +        check_force_libbpf_on
> +        return
> +    fi
> +
> +    if [ $(uname -m) = x86_64 ]; then
> +        local LIBBPF_LIBDIR="${LIBBPF_DIR}/lib64"
> +    else
> +        local LIBBPF_LIBDIR="${LIBBPF_DIR}/lib"
> +    fi
> +
> +    if [ -n "$LIBBPF_DIR" ]; then
> +        LIBBPF_CFLAGS="-I${LIBBPF_DIR}/include"
> +        LIBBPF_LDLIBS="${LIBBPF_LIBDIR}/libbpf.a -lz -lelf"
> +        LIBBPF_VERSION=$(PKG_CONFIG_LIBDIR=${LIBBPF_LIBDIR}/pkgconfig ${PKG_CONFIG} libbpf --modversion)
> +    else
> +        LIBBPF_CFLAGS=$(${PKG_CONFIG} libbpf --cflags)
> +        LIBBPF_LDLIBS=$(${PKG_CONFIG} libbpf --libs)
> +        LIBBPF_VERSION=$(${PKG_CONFIG} libbpf --modversion)
> +    fi
> +
> +    if ! have_libbpf_basic; then
> +        echo "no"
> +        echo "	libbpf version $LIBBPF_VERSION is too low, please update it to at least 0.1.0"
> +        check_force_libbpf_on
> +        return
> +    else
> +        echo "HAVE_LIBBPF:=y" >>$CONFIG
> +        echo 'CFLAGS += -DHAVE_LIBBPF ' $LIBBPF_CFLAGS >> $CONFIG
> +        echo 'LDLIBS += ' $LIBBPF_LDLIBS >>$CONFIG
> +    fi
> +
> +    # bpf_program__title() is deprecated since libbpf 0.2.0, use
> +    # bpf_program__section_name() instead if we support
> +    if have_libbpf_sec_name; then
> +        echo "HAVE_LIBBPF_SECTION_NAME:=y" >>$CONFIG
> +        echo 'CFLAGS += -DHAVE_LIBBPF_SECTION_NAME ' >> $CONFIG
> +    fi
> +
> +    echo "yes"
> +    echo "	libbpf version $LIBBPF_VERSION"
> +}
> +
>  check_selinux()
>  # SELinux is a compile time option in the ss utility
>  {
> @@ -385,6 +490,9 @@ check_setns
>  echo -n "SELinux support: "
>  check_selinux
>  
> +echo -n "libbpf support: "
> +check_libbpf
> +
>  echo -n "ELF support: "
>  check_elf
>  
> 

Something is off with the version detection.

# LIBBPF_LIBDIR=/tmp/libbpf ./configure
TC schedulers
 ATM	no

libc has setns: yes
SELinux support: no
libbpf support: yes
	libbpf version 0.1.0
ELF support: yes

/tmp/libbpf has an install of top of tree as of today which is:

/tmp/libbpf/usr/lib64/libbpf.so.0.3.0

This is using Ubuntu 20.10.


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv4 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c
  2020-11-14  3:24         ` David Ahern
@ 2020-11-16  3:55           ` Hangbin Liu
  0 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-16  3:55 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, bpf

On Fri, Nov 13, 2020 at 08:24:41PM -0700, David Ahern wrote:
> On 11/9/20 12:07 AM, Hangbin Liu wrote:
> > diff --git a/lib/bpf_glue.c b/lib/bpf_glue.c
> > new file mode 100644
> > index 00000000..7626a893
> > --- /dev/null
> > +++ b/lib/bpf_glue.c
> 
> ...
> 
> > +
> > +int bpf_program_load(enum bpf_prog_type type, const struct bpf_insn *insns,
> > +		     size_t size_insns, const char *license, char *log,
> > +		     size_t size_log)
> > +{
> > +#ifdef HAVE_LIBBPF
> > +	return bpf_load_program(type, insns, size_insns, license, 0, log, size_log);
> > +#else
> > +	return bpf_load_load_dev(type, insns, size_insns, license, 0, log, size_log);
> > +#endif
> > +}
> > +
> 
> Fails to compile:
> 
> $ LIBBPF_FORCE=off ./configure
> $ make
> ...
> /usr/bin/ld: ../lib/libutil.a(bpf_glue.o): in function `bpf_program_load':
> bpf_glue.c:(.text+0x13): undefined reference to `bpf_load_load_dev'

Opps, sorry for the typo...


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv4 iproute2-next 1/5] configure: add check_libbpf() for later libbpf support
  2020-11-14  3:26         ` David Ahern
@ 2020-11-16  4:30           ` Hangbin Liu
  2020-11-16  4:33             ` David Ahern
  0 siblings, 1 reply; 167+ messages in thread
From: Hangbin Liu @ 2020-11-16  4:30 UTC (permalink / raw)
  To: David Ahern
  Cc: Stephen Hemminger, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Miller, Jesper Dangaard Brouer, netdev, bpf,
	Jiri Benc, Toke Høiland-Jørgensen

On Fri, Nov 13, 2020 at 08:26:51PM -0700, David Ahern wrote:
> > +# Influential LIBBPF environment variables:
> > +#   LIBBPF_FORCE={on,off}   on: require link against libbpf;
> > +#                           off: disable libbpf probing
> > +#   LIBBPF_LIBDIR           Path to libbpf to use
> > +

...
> > +check_libbpf()
> > +{
> > +    # if set LIBBPF_FORCE=off, disable libbpf entirely
> > +    if [ "$LIBBPF_FORCE" = off ]; then
> > +        echo "no"
> > +        return
> > +    fi
> > +
> > +    if ! ${PKG_CONFIG} libbpf --exists && [ -z "$LIBBPF_DIR" ] ; then
> > +        echo "no"
> > +        check_force_libbpf_on
> > +        return
> > +    fi

...

> 
> Something is off with the version detection.
> 
> # LIBBPF_LIBDIR=/tmp/libbpf ./configure

My copy-past error. It should take LIBBPF_DIR, but I wrote LIBBPF_LIBDIR in
the description... Also the folder should be libbpf dest dir, not libbpf
dir directly. To be consistent with the libbpf document. I will change
LIBBPF_DIR to LIBBPF_DESTDIR(Please tell me if you think the name is not
suitable). The fix diff will looks like

diff --git a/configure b/configure
index 3081a2ac..5ca10337 100755
--- a/configure
+++ b/configure
@@ -5,7 +5,7 @@
 # Influential LIBBPF environment variables:
 #   LIBBPF_FORCE={on,off}   on: require link against libbpf;
 #                           off: disable libbpf probing
-#   LIBBPF_LIBDIR           Path to libbpf to use
+#   LIBBPF_DESTDIR          Path to libbpf dest dir to use
 
 INCLUDE=${1:-"$PWD/include"}
 
@@ -301,20 +301,20 @@ check_libbpf()
         return
     fi
 
-    if ! ${PKG_CONFIG} libbpf --exists && [ -z "$LIBBPF_DIR" ] ; then
+    if ! ${PKG_CONFIG} libbpf --exists && [ -z "$LIBBPF_DESTDIR" ] ; then
         echo "no"
         check_force_libbpf_on
         return
     fi
 
     if [ $(uname -m) = x86_64 ]; then
-        local LIBBPF_LIBDIR="${LIBBPF_DIR}/lib64"
+        local LIBBPF_LIBDIR="${LIBBPF_DESTDIR}/usr/lib64"
     else
-        local LIBBPF_LIBDIR="${LIBBPF_DIR}/lib"
+        local LIBBPF_LIBDIR="${LIBBPF_DESTDIR}/usr/lib"
     fi
 
-    if [ -n "$LIBBPF_DIR" ]; then
-        LIBBPF_CFLAGS="-I${LIBBPF_DIR}/include"
+    if [ -n "$LIBBPF_DESTDIR" ]; then
+        LIBBPF_CFLAGS="-I${LIBBPF_DESTDIR}/usr/include"
         LIBBPF_LDLIBS="${LIBBPF_LIBDIR}/libbpf.a -lz -lelf"
         LIBBPF_VERSION=$(PKG_CONFIG_LIBDIR=${LIBBPF_LIBDIR}/pkgconfig ${PKG_CONFIG} libbpf --modversion)
     else

When you compile libbpf, it should like

$ mkdir /tmp/libbpf_destdir
$ cd libbpf/src/
$ make
...
  CC       libbpf.so.0.2.0
$ DESTDIR=/tmp/libbpf_destdir make install

Then in iproute2, configure it with

$ LIBBPF_DIR=/tmp/libbpf_destdir ./configure
TC schedulers
 ATM    no

libc has setns: yes
SELinux support: no
libbpf support: yes
        libbpf version 0.2.0
ELF support: yes
libmnl support: yes
Berkeley DB: no
need for strlcpy: yes
libcap support: yes

Thanks
Hangbin


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* Re: [PATCHv4 iproute2-next 1/5] configure: add check_libbpf() for later libbpf support
  2020-11-16  4:30           ` Hangbin Liu
@ 2020-11-16  4:33             ` David Ahern
  0 siblings, 0 replies; 167+ messages in thread
From: David Ahern @ 2020-11-16  4:33 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: Stephen Hemminger, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Miller, Jesper Dangaard Brouer, netdev, bpf,
	Jiri Benc, Toke Høiland-Jørgensen

On 11/15/20 9:30 PM, Hangbin Liu wrote:
> diff --git a/configure b/configure
> index 3081a2ac..5ca10337 100755
> --- a/configure
> +++ b/configure
> @@ -5,7 +5,7 @@
>  # Influential LIBBPF environment variables:
>  #   LIBBPF_FORCE={on,off}   on: require link against libbpf;
>  #                           off: disable libbpf probing
> -#   LIBBPF_LIBDIR           Path to libbpf to use
> +#   LIBBPF_DESTDIR          Path to libbpf dest dir to use

DESTDIR as a name applies to an install script. I think LIBBPF_DIR
 is fine. You can enhance the description to make it clear.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-09  7:07     ` [PATCHv4 " Hangbin Liu
                         ` (4 preceding siblings ...)
  2020-11-09  7:08       ` [PATCHv4 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
@ 2020-11-16  6:53       ` Hangbin Liu
  2020-11-16  6:53         ` [PATCHv5 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
                           ` (6 more replies)
  5 siblings, 7 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-16  6:53 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Hangbin Liu

This series converts iproute2 to use libbpf for loading and attaching
BPF programs when it is available. This means that iproute2 will
correctly process BTF information and support the new-style BTF-defined
maps, while keeping compatibility with the old internal map definition
syntax.

This is achieved by checking for libbpf at './configure' time, and using
it if available. By default the system libbpf will be used, but static
linking against a custom libbpf version can be achieved by passing
LIBBPF_DIR to configure. LIBBPF_FORCE can be set to on to force configure
abort if no suitable libbpf is found (useful for automatic packaging
that wants to enforce the dependency), or set off to disable libbpf check
and build iproute2 with legacy bpf.

The old iproute2 bpf code is kept and will be used if no suitable libbpf
is available. When using libbpf, wrapper code ensures that iproute2 will
still understand the old map definition format, including populating
map-in-map and tail call maps before load.

The examples in bpf/examples are kept, and a separate set of examples
are added with BTF-based map definitions for those examples where this
is possible (libbpf doesn't currently support declaratively populating
tail call maps).

At last, Thanks a lot for Toke's help on this patch set.

v5:
a) Fix LIBBPF_DIR typo and description, use libbpf DESTDIR as LIBBPF_DIR
   dest.
b) Fix bpf_prog_load_dev typo.
c) rebase to latest iproute2-next.

v4:
a) Make variable LIBBPF_FORCE able to control whether build iproute2
   with libbpf or not.
b) Add new file bpf_glue.c to for libbpf/legacy mixed bpf calls.
c) Fix some build issues and shell compatibility error.

v3:
a) Update configure to Check function bpf_program__section_name() separately
b) Add a new function get_bpf_program__section_name() to choose whether to
use bpf_program__title() or not.
c) Test build the patch on Fedora 33 with libbpf-0.1.0-1.fc33 and
   libbpf-devel-0.1.0-1.fc33

v2:
a) Remove self defined IS_ERR_OR_NULL and use libbpf_get_error() instead.
b) Add ipvrf with libbpf support.


Here are the test results with patched iproute2:

== setup env
# clang -O2 -Wall -g -target bpf -c bpf_graft.c -o btf_graft.o
# clang -O2 -Wall -g -target bpf -c bpf_map_in_map.c -o btf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c bpf_shared.c -o btf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_cyclic.c -o bpf_cyclic.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_graft.c -o bpf_graft.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_map_in_map.c -o bpf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_shared.c -o bpf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_tailcall.c -o bpf_tailcall.o
# rm -rf /sys/fs/bpf/xdp/globals
# /root/iproute2/ip/ip link add type veth
# /root/iproute2/ip/ip link set veth0 up
# /root/iproute2/ip/ip link set veth1 up


== Load objs
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 4 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
4: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:21-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 5
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 8 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
8: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:23-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 3
        btf_id 10
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 12 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
12: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:25-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 4
        btf_id 15
# /root/iproute2/ip/ip link set veth0 xdp off


== Load objs again to make sure maps could be reused
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 16 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
16: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:27-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 20
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 20 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show                                                                                                                                                                   [236/4518]
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
20: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:29-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 3
        btf_id 25
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 24 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
24: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:31-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 4
        btf_id 30
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals

== Testing if we can load new-style objects (using xdp-filter as an example)
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_all.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 28 tag e29eeda1489a6520 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
28: xdp  name xdpfilt_alw_all  tag e29eeda1489a6520  gpl
        loaded_at 2020-10-22T08:04:33-0400  uid 0
        xlated 2408B  jited 1405B  memlock 4096B  map_ids 9,5,7,8,6
        btf_id 35
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_ip.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 32 tag 2f2b9dbfb786a5a2 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
32: xdp  name xdpfilt_alw_ip  tag 2f2b9dbfb786a5a2  gpl
        loaded_at 2020-10-22T08:04:35-0400  uid 0
        xlated 1336B  jited 778B  memlock 4096B  map_ids 7,8,5
        btf_id 40
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_tcp.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 36 tag 18c1bb25084030bc jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
36: xdp  name xdpfilt_alw_tcp  tag 18c1bb25084030bc  gpl
        loaded_at 2020-10-22T08:04:37-0400  uid 0
        xlated 1128B  jited 690B  memlock 4096B  map_ids 6,5
        btf_id 45
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals


== Load new btf defined maps
# /root/iproute2/ip/ip link set veth0 xdp obj btf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 40 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
40: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:39-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 50
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj btf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 44 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
44: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:41-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 13
        btf_id 55
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj btf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 48 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer  map_sh
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
14: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
48: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:43-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 14
        btf_id 60
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals


== Test load objs by tc
# /root/iproute2/tc/tc qdisc add dev veth0 ingress
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_cyclic.o sec 0xabccba/0
# /root/iproute2/tc/tc filter add dev veth0 parent ffff: bpf obj bpf_graft.o
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/1
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 43/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec classifier
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
# ls /sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d /sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d:
jmp_tc

/sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f:
jmp_ex  jmp_tc  map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc
# bpftool map show
15: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
16: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
17: prog_array  name jmp_ex  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
18: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 2  memlock 4096B
        owner_prog_type sched_cls  owner jited
19: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
52: sched_cls  name cls_loop  tag 3e98a40b04099d36  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 168B  jited 133B  memlock 4096B  map_ids 15
        btf_id 65
56: sched_cls  name cls_entry  tag 0fbb4d9310a6ee26  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 144B  jited 121B  memlock 4096B  map_ids 16
        btf_id 70
60: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 75
66: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 80
72: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 85
78: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 90
79: sched_cls  name cls_case2  tag ee218ff893dca823  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 336B  jited 218B  memlock 4096B  map_ids 19,18
        btf_id 90
80: sched_cls  name cls_exit  tag e78a58140deed387  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 288B  jited 177B  memlock 4096B  map_ids 19
        btf_id 90

I also run the following upstream kselftest with patches iproute2 and
all passed.

test_lwt_ip_encap.sh
test_xdp_redirect.sh
test_tc_redirect.sh
test_xdp_meta.sh
test_xdp_veth.sh
test_xdp_vlan.sh

Hangbin Liu (5):
  configure: add check_libbpf() for later libbpf support
  lib: rename bpf.c to bpf_legacy.c
  lib: add libbpf support
  examples/bpf: move struct bpf_elf_map defined maps to legacy folder
  examples/bpf: add bpf examples with BTF defined maps

 configure                                | 108 +++++++
 examples/bpf/README                      |  18 +-
 examples/bpf/bpf_graft.c                 |  14 +-
 examples/bpf/bpf_map_in_map.c            |  37 ++-
 examples/bpf/bpf_shared.c                |  14 +-
 examples/bpf/{ => legacy}/bpf_cyclic.c   |   2 +-
 examples/bpf/legacy/bpf_graft.c          |  66 +++++
 examples/bpf/legacy/bpf_map_in_map.c     |  56 ++++
 examples/bpf/legacy/bpf_shared.c         |  53 ++++
 examples/bpf/{ => legacy}/bpf_tailcall.c |   2 +-
 include/bpf_api.h                        |  13 +
 include/bpf_util.h                       |  21 +-
 ip/ipvrf.c                               |   6 +-
 lib/Makefile                             |   6 +-
 lib/bpf_glue.c                           |  35 +++
 lib/{bpf.c => bpf_legacy.c}              | 193 ++++++++++++-
 lib/bpf_libbpf.c                         | 353 +++++++++++++++++++++++
 17 files changed, 939 insertions(+), 58 deletions(-)
 rename examples/bpf/{ => legacy}/bpf_cyclic.c (95%)
 create mode 100644 examples/bpf/legacy/bpf_graft.c
 create mode 100644 examples/bpf/legacy/bpf_map_in_map.c
 create mode 100644 examples/bpf/legacy/bpf_shared.c
 rename examples/bpf/{ => legacy}/bpf_tailcall.c (98%)
 create mode 100644 lib/bpf_glue.c
 rename lib/{bpf.c => bpf_legacy.c} (94%)
 create mode 100644 lib/bpf_libbpf.c

-- 
2.25.4


^ permalink raw reply	[flat|nested] 167+ messages in thread

* [PATCHv5 iproute2-next 1/5] configure: add check_libbpf() for later libbpf support
  2020-11-16  6:53       ` [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
@ 2020-11-16  6:53         ` Hangbin Liu
  2020-11-16  6:53         ` [PATCHv5 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
                           ` (5 subsequent siblings)
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-16  6:53 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Hangbin Liu

This patch adds a check to see if we have libbpf support. By default the
system libbpf will be used, but static linking against a custom libbpf
version can be achieved by passing libbpf DESTDIR to variable LIBBPF_DIR for
configure.

Add another variable LIBBPF_FORCE to control whether to build iproute2
with libbpf. If set to on, then force to build with libbpf and exit if
not available. If set to off, then force to not build with libbpf.

Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
v5:
1) Fix LIBBPF_DIR typo and description, use libbpf DESTDIR as LIBBPF_DIR
   dest.

v4:
1) Remove duplicate LIBBPF_CFLAGS
2) Remove un-needed -L since using static libbpf.a
3) Fix == not supported in dash
4) Extend LIBBPF_FORCE to support on/off, when set to on, stop building when
   there is no libbpf support. If set to off, discard libbpf check.
5) Print libbpf version after checking

v3:
Check function bpf_program__section_name() separately and only use it
on higher libbpf version.

v2:
No update
---
 configure | 108 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 108 insertions(+)

diff --git a/configure b/configure
index 307912aa..290b4c86 100755
--- a/configure
+++ b/configure
@@ -2,6 +2,11 @@
 # SPDX-License-Identifier: GPL-2.0
 # This is not an autoconf generated configure
 #
+# Influential LIBBPF environment variables:
+#   LIBBPF_FORCE={on,off}   on: require link against libbpf;
+#                           off: disable libbpf probing
+#   LIBBPF_DIR              Path to libbpf DESTDIR to use
+
 INCLUDE=${1:-"$PWD/include"}
 
 # Output file which is input to Makefile
@@ -240,6 +245,106 @@ check_elf()
     fi
 }
 
+have_libbpf_basic()
+{
+    cat >$TMPDIR/libbpf_test.c <<EOF
+#include <bpf/libbpf.h>
+int main(int argc, char **argv) {
+    bpf_program__set_autoload(NULL, false);
+    bpf_map__ifindex(NULL);
+    bpf_map__set_pin_path(NULL, NULL);
+    bpf_object__open_file(NULL, NULL);
+    return 0;
+}
+EOF
+
+    $CC -o $TMPDIR/libbpf_test $TMPDIR/libbpf_test.c $LIBBPF_CFLAGS $LIBBPF_LDLIBS >/dev/null 2>&1
+    local ret=$?
+
+    rm -f $TMPDIR/libbpf_test.c $TMPDIR/libbpf_test
+    return $ret
+}
+
+have_libbpf_sec_name()
+{
+    cat >$TMPDIR/libbpf_sec_test.c <<EOF
+#include <bpf/libbpf.h>
+int main(int argc, char **argv) {
+    void *ptr;
+    bpf_program__section_name(NULL);
+    return 0;
+}
+EOF
+
+    $CC -o $TMPDIR/libbpf_sec_test $TMPDIR/libbpf_sec_test.c $LIBBPF_CFLAGS $LIBBPF_LDLIBS >/dev/null 2>&1
+    local ret=$?
+
+    rm -f $TMPDIR/libbpf_sec_test.c $TMPDIR/libbpf_sec_test
+    return $ret
+}
+
+check_force_libbpf_on()
+{
+    # if set LIBBPF_FORCE=on but no libbpf support, just exist the config
+    # process to make sure we don't build without libbpf.
+    if [ "$LIBBPF_FORCE" = on ]; then
+        echo "	LIBBPF_FORCE=on set, but couldn't find a usable libbpf"
+        exit 1
+    fi
+}
+
+check_libbpf()
+{
+    # if set LIBBPF_FORCE=off, disable libbpf entirely
+    if [ "$LIBBPF_FORCE" = off ]; then
+        echo "no"
+        return
+    fi
+
+    if ! ${PKG_CONFIG} libbpf --exists && [ -z "$LIBBPF_DIR" ] ; then
+        echo "no"
+        check_force_libbpf_on
+        return
+    fi
+
+    if [ $(uname -m) = x86_64 ]; then
+        local LIBBPF_LIBDIR="${LIBBPF_DIR}/usr/lib64"
+    else
+        local LIBBPF_LIBDIR="${LIBBPF_DIR}/usr/lib"
+    fi
+
+    if [ -n "$LIBBPF_DIR" ]; then
+        LIBBPF_CFLAGS="-I${LIBBPF_DIR}/usr/include"
+        LIBBPF_LDLIBS="${LIBBPF_LIBDIR}/libbpf.a -lz -lelf"
+        LIBBPF_VERSION=$(PKG_CONFIG_LIBDIR=${LIBBPF_LIBDIR}/pkgconfig ${PKG_CONFIG} libbpf --modversion)
+    else
+        LIBBPF_CFLAGS=$(${PKG_CONFIG} libbpf --cflags)
+        LIBBPF_LDLIBS=$(${PKG_CONFIG} libbpf --libs)
+        LIBBPF_VERSION=$(${PKG_CONFIG} libbpf --modversion)
+    fi
+
+    if ! have_libbpf_basic; then
+        echo "no"
+        echo "	libbpf version $LIBBPF_VERSION is too low, please update it to at least 0.1.0"
+        check_force_libbpf_on
+        return
+    else
+        echo "HAVE_LIBBPF:=y" >>$CONFIG
+        echo 'CFLAGS += -DHAVE_LIBBPF ' $LIBBPF_CFLAGS >> $CONFIG
+        echo 'LDLIBS += ' $LIBBPF_LDLIBS >>$CONFIG
+    fi
+
+    # bpf_program__title() is deprecated since libbpf 0.2.0, use
+    # bpf_program__section_name() instead if we support
+    if have_libbpf_sec_name; then
+        echo "HAVE_LIBBPF_SECTION_NAME:=y" >>$CONFIG
+        echo 'CFLAGS += -DHAVE_LIBBPF_SECTION_NAME ' >> $CONFIG
+    fi
+
+    echo "yes"
+    echo "	libbpf version $LIBBPF_VERSION"
+}
+
 check_selinux()
 # SELinux is a compile time option in the ss utility
 {
@@ -385,6 +490,9 @@ check_setns
 echo -n "SELinux support: "
 check_selinux
 
+echo -n "libbpf support: "
+check_libbpf
+
 echo -n "ELF support: "
 check_elf
 
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv5 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c
  2020-11-16  6:53       ` [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
  2020-11-16  6:53         ` [PATCHv5 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
@ 2020-11-16  6:53         ` Hangbin Liu
  2020-11-16  6:53         ` [PATCHv5 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
                           ` (4 subsequent siblings)
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-16  6:53 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Hangbin Liu

This is a preparation for later main libbpf support in iproute2.
bpf.c is moved to bpf_legacy.c first.

A new file bpf_glue.c is added which could call both legacy libbpf code.
There are two wrapper functions added for ipvrf. Function
bpf_prog_load() is removed as it's conflict with libbpf function name.

Signed-off-by: Hangbin Liu <haliu@redhat.com>
---

v5: Fix bpf_prog_load_dev typo.
v4: Add new file bpf_glue.c
v2-v3: no update

---
 include/bpf_util.h          | 10 +++++++---
 ip/ipvrf.c                  |  6 +++---
 lib/Makefile                |  2 +-
 lib/bpf_glue.c              | 35 +++++++++++++++++++++++++++++++++++
 lib/{bpf.c => bpf_legacy.c} | 15 +++------------
 5 files changed, 49 insertions(+), 19 deletions(-)
 create mode 100644 lib/bpf_glue.c
 rename lib/{bpf.c => bpf_legacy.c} (99%)

diff --git a/include/bpf_util.h b/include/bpf_util.h
index 63db07ca..82217cc6 100644
--- a/include/bpf_util.h
+++ b/include/bpf_util.h
@@ -274,12 +274,16 @@ int bpf_trace_pipe(void);
 
 void bpf_print_ops(struct rtattr *bpf_ops, __u16 len);
 
-int bpf_prog_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-		  size_t size_insns, const char *license, char *log,
-		  size_t size_log);
+int bpf_prog_load_dev(enum bpf_prog_type type, const struct bpf_insn *insns,
+		      size_t size_insns, const char *license, __u32 ifindex,
+		      char *log, size_t size_log);
+int bpf_program_load(enum bpf_prog_type type, const struct bpf_insn *insns,
+		     size_t size_insns, const char *license, char *log,
+		     size_t size_log);
 
 int bpf_prog_attach_fd(int prog_fd, int target_fd, enum bpf_attach_type type);
 int bpf_prog_detach_fd(int target_fd, enum bpf_attach_type type);
+int bpf_program_attach(int prog_fd, int target_fd, enum bpf_attach_type type);
 
 int bpf_dump_prog_info(FILE *f, uint32_t id);
 
diff --git a/ip/ipvrf.c b/ip/ipvrf.c
index 28dd8e25..42779e5c 100644
--- a/ip/ipvrf.c
+++ b/ip/ipvrf.c
@@ -256,8 +256,8 @@ static int prog_load(int idx)
 		BPF_EXIT_INSN(),
 	};
 
-	return bpf_prog_load(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
-			     "GPL", bpf_log_buf, sizeof(bpf_log_buf));
+	return bpf_program_load(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
+			        "GPL", bpf_log_buf, sizeof(bpf_log_buf));
 }
 
 static int vrf_configure_cgroup(const char *path, int ifindex)
@@ -288,7 +288,7 @@ static int vrf_configure_cgroup(const char *path, int ifindex)
 		goto out;
 	}
 
-	if (bpf_prog_attach_fd(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE)) {
+	if (bpf_program_attach(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE)) {
 		fprintf(stderr, "Failed to attach prog to cgroup: '%s'\n",
 			strerror(errno));
 		goto out;
diff --git a/lib/Makefile b/lib/Makefile
index 13f4ee15..7c8a197c 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -5,7 +5,7 @@ CFLAGS += -fPIC
 
 UTILOBJ = utils.o rt_names.o ll_map.o ll_types.o ll_proto.o ll_addr.o \
 	inet_proto.o namespace.o json_writer.o json_print.o \
-	names.o color.o bpf.o exec.o fs.o cg_map.o
+	names.o color.o bpf_legacy.o bpf_glue.o exec.o fs.o cg_map.o
 
 NLOBJ=libgenl.o libnetlink.o mnl_utils.o
 
diff --git a/lib/bpf_glue.c b/lib/bpf_glue.c
new file mode 100644
index 00000000..fd4fcc94
--- /dev/null
+++ b/lib/bpf_glue.c
@@ -0,0 +1,35 @@
+/*
+ * bpf_glue.c	BPF code to call both legacy and libbpf code
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ *
+ * Authors:	Hangbin Liu <haliu@redhat.com>
+ *
+ */
+#include "bpf_util.h"
+#ifdef HAVE_LIBBPF
+#include <bpf/bpf.h>
+#endif
+
+int bpf_program_load(enum bpf_prog_type type, const struct bpf_insn *insns,
+		     size_t size_insns, const char *license, char *log,
+		     size_t size_log)
+{
+#ifdef HAVE_LIBBPF
+	return bpf_load_program(type, insns, size_insns, license, 0, log, size_log);
+#else
+	return bpf_prog_load_dev(type, insns, size_insns, license, 0, log, size_log);
+#endif
+}
+
+int bpf_program_attach(int prog_fd, int target_fd, enum bpf_attach_type type)
+{
+#ifdef HAVE_LIBBPF
+	return bpf_prog_attach(prog_fd, target_fd, type, 0);
+#else
+	return bpf_prog_attach_fd(prog_fd, target_fd, type);
+#endif
+}
diff --git a/lib/bpf.c b/lib/bpf_legacy.c
similarity index 99%
rename from lib/bpf.c
rename to lib/bpf_legacy.c
index c7d45077..4246fb76 100644
--- a/lib/bpf.c
+++ b/lib/bpf_legacy.c
@@ -1087,10 +1087,9 @@ int bpf_prog_detach_fd(int target_fd, enum bpf_attach_type type)
 	return bpf(BPF_PROG_DETACH, &attr, sizeof(attr));
 }
 
-static int bpf_prog_load_dev(enum bpf_prog_type type,
-			     const struct bpf_insn *insns, size_t size_insns,
-			     const char *license, __u32 ifindex,
-			     char *log, size_t size_log)
+int bpf_prog_load_dev(enum bpf_prog_type type, const struct bpf_insn *insns,
+		      size_t size_insns, const char *license, __u32 ifindex,
+		      char *log, size_t size_log)
 {
 	union bpf_attr attr = {};
 
@@ -1109,14 +1108,6 @@ static int bpf_prog_load_dev(enum bpf_prog_type type,
 	return bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
 }
 
-int bpf_prog_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-		  size_t size_insns, const char *license, char *log,
-		  size_t size_log)
-{
-	return bpf_prog_load_dev(type, insns, size_insns, license, 0,
-				 log, size_log);
-}
-
 #ifdef HAVE_ELF
 struct bpf_elf_prog {
 	enum bpf_prog_type	type;
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv5 iproute2-next 3/5] lib: add libbpf support
  2020-11-16  6:53       ` [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
  2020-11-16  6:53         ` [PATCHv5 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
  2020-11-16  6:53         ` [PATCHv5 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
@ 2020-11-16  6:53         ` Hangbin Liu
  2020-11-16  6:53         ` [PATCHv5 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
                           ` (3 subsequent siblings)
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-16  6:53 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Hangbin Liu

This patch converts iproute2 to use libbpf for loading and attaching
BPF programs when it is available, which is started by Toke's
implementation[1]. With libbpf iproute2 could correctly process BTF
information and support the new-style BTF-defined maps, while keeping
compatibility with the old internal map definition syntax.

The old iproute2 bpf code is kept and will be used if no suitable libbpf
is available. When using libbpf, wrapper code in bpf_legacy.c ensures that
iproute2 will still understand the old map definition format, including
populating map-in-map and tail call maps before load.

In bpf_libbpf.c, we init iproute2 ctx and elf info first to check the
legacy bytes. When handling the legacy maps, for map-in-maps, we create
them manually and re-use the fd as they are associated with id/inner_id.
For pin maps, we only set the pin path and let libbp load to handle it.
For tail calls, we find it first and update the element after prog load.

Other maps/progs will be loaded by libbpf directly.

[1] https://lore.kernel.org/bpf/20190820114706.18546-1-toke@redhat.com/

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>

---
v4:
Move ipvrf code to patch 02
Move HAVE_LIBBPF inside HAVE_ELF definition as libbpf depends on elf.

v3:
Add a new function get_bpf_program__section_name() to choose whether
use bpf_program__title() or not.

v2:
Remove self defined IS_ERR_OR_NULL and use libbpf_get_error() instead.
Add ipvrf with libbpf support.
---
 include/bpf_util.h |  11 ++
 lib/Makefile       |   4 +
 lib/bpf_legacy.c   | 178 +++++++++++++++++++++++
 lib/bpf_libbpf.c   | 353 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 546 insertions(+)
 create mode 100644 lib/bpf_libbpf.c

diff --git a/include/bpf_util.h b/include/bpf_util.h
index 82217cc6..d4f66de9 100644
--- a/include/bpf_util.h
+++ b/include/bpf_util.h
@@ -304,4 +304,15 @@ static inline int bpf_recv_map_fds(const char *path, int *fds,
 	return -1;
 }
 #endif /* HAVE_ELF */
+
+#ifdef HAVE_LIBBPF
+int iproute2_bpf_elf_ctx_init(struct bpf_cfg_in *cfg);
+int iproute2_bpf_fetch_ancillary(void);
+int iproute2_get_root_path(char *root_path, size_t len);
+bool iproute2_is_pin_map(const char *libbpf_map_name, char *pathname);
+bool iproute2_is_map_in_map(const char *libbpf_map_name, struct bpf_elf_map *imap,
+			    struct bpf_elf_map *omap, char *omap_name);
+int iproute2_find_map_name_by_id(unsigned int map_id, char *name);
+int iproute2_load_libbpf(struct bpf_cfg_in *cfg);
+#endif /* HAVE_LIBBPF */
 #endif /* __BPF_UTIL__ */
diff --git a/lib/Makefile b/lib/Makefile
index 7c8a197c..fb6c31ec 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -7,6 +7,10 @@ UTILOBJ = utils.o rt_names.o ll_map.o ll_types.o ll_proto.o ll_addr.o \
 	inet_proto.o namespace.o json_writer.o json_print.o \
 	names.o color.o bpf_legacy.o bpf_glue.o exec.o fs.o cg_map.o
 
+ifeq ($(HAVE_LIBBPF),y)
+UTILOBJ += bpf_libbpf.o
+endif
+
 NLOBJ=libgenl.o libnetlink.o mnl_utils.o
 
 all: libnetlink.a libutil.a
diff --git a/lib/bpf_legacy.c b/lib/bpf_legacy.c
index 4246fb76..bc869c3f 100644
--- a/lib/bpf_legacy.c
+++ b/lib/bpf_legacy.c
@@ -940,6 +940,9 @@ static int bpf_do_parse(struct bpf_cfg_in *cfg, const bool *opt_tbl)
 static int bpf_do_load(struct bpf_cfg_in *cfg)
 {
 	if (cfg->mode == EBPF_OBJECT) {
+#ifdef HAVE_LIBBPF
+		return iproute2_load_libbpf(cfg);
+#endif
 		cfg->prog_fd = bpf_obj_open(cfg->object, cfg->type,
 					    cfg->section, cfg->ifindex,
 					    cfg->verbose);
@@ -3155,4 +3158,179 @@ int bpf_recv_map_fds(const char *path, int *fds, struct bpf_map_aux *aux,
 	close(fd);
 	return ret;
 }
+
+#ifdef HAVE_LIBBPF
+/* The following functions are wrapper functions for libbpf code to be
+ * compatible with the legacy format. So all the functions have prefix
+ * with iproute2_
+ */
+int iproute2_bpf_elf_ctx_init(struct bpf_cfg_in *cfg)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+
+	return bpf_elf_ctx_init(ctx, cfg->object, cfg->type, cfg->ifindex, cfg->verbose);
+}
+
+int iproute2_bpf_fetch_ancillary(void)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	struct bpf_elf_sec_data data;
+	int i, ret = 0;
+
+	for (i = 1; i < ctx->elf_hdr.e_shnum; i++) {
+		ret = bpf_fill_section_data(ctx, i, &data);
+		if (ret < 0)
+			continue;
+
+		if (data.sec_hdr.sh_type == SHT_PROGBITS &&
+		    !strcmp(data.sec_name, ELF_SECTION_MAPS))
+			ret = bpf_fetch_maps_begin(ctx, i, &data);
+		else if (data.sec_hdr.sh_type == SHT_SYMTAB &&
+			 !strcmp(data.sec_name, ".symtab"))
+			ret = bpf_fetch_symtab(ctx, i, &data);
+		else if (data.sec_hdr.sh_type == SHT_STRTAB &&
+			 !strcmp(data.sec_name, ".strtab"))
+			ret = bpf_fetch_strtab(ctx, i, &data);
+		if (ret < 0) {
+			fprintf(stderr, "Error parsing section %d! Perhaps check with readelf -a?\n",
+				i);
+			return ret;
+		}
+	}
+
+	if (bpf_has_map_data(ctx)) {
+		ret = bpf_fetch_maps_end(ctx);
+		if (ret < 0) {
+			fprintf(stderr, "Error fixing up map structure, incompatible struct bpf_elf_map used?\n");
+			return ret;
+		}
+	}
+
+	return ret;
+}
+
+int iproute2_get_root_path(char *root_path, size_t len)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	int ret = 0;
+
+	snprintf(root_path, len, "%s/%s",
+		 bpf_get_work_dir(ctx->type), BPF_DIR_GLOBALS);
+
+	ret = mkdir(root_path, S_IRWXU);
+	if (ret && errno != EEXIST) {
+		fprintf(stderr, "mkdir %s failed: %s\n", root_path, strerror(errno));
+		return ret;
+	}
+
+	return 0;
+}
+
+bool iproute2_is_pin_map(const char *libbpf_map_name, char *pathname)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *map_name, *tmp;
+	unsigned int pinning;
+	int i, ret = 0;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		if (ctx->maps[i].pinning == PIN_OBJECT_NS &&
+		    ctx->noafalg) {
+			fprintf(stderr, "Missing kernel AF_ALG support for PIN_OBJECT_NS!\n");
+			return false;
+		}
+
+		map_name = bpf_map_fetch_name(ctx, i);
+		if (!map_name) {
+			return false;
+		}
+
+		if (strcmp(libbpf_map_name, map_name))
+			continue;
+
+		pinning = ctx->maps[i].pinning;
+
+		if (bpf_no_pinning(ctx, pinning) || !bpf_get_work_dir(ctx->type))
+			return false;
+
+		if (pinning == PIN_OBJECT_NS)
+			ret = bpf_make_obj_path(ctx);
+		else if ((tmp = bpf_custom_pinning(ctx, pinning)))
+			ret = bpf_make_custom_path(ctx, tmp);
+		if (ret < 0)
+			return false;
+
+		bpf_make_pathname(pathname, PATH_MAX, map_name, ctx, pinning);
+
+		return true;
+	}
+
+	return false;
+}
+
+bool iproute2_is_map_in_map(const char *libbpf_map_name, struct bpf_elf_map *imap,
+			    struct bpf_elf_map *omap, char *omap_name)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *inner_map_name, *outer_map_name;
+	int i, j;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		inner_map_name = bpf_map_fetch_name(ctx, i);
+		if (!inner_map_name) {
+			return false;
+		}
+
+		if (strcmp(libbpf_map_name, inner_map_name))
+			continue;
+
+		if (!ctx->maps[i].id ||
+		    ctx->maps[i].inner_id ||
+		    ctx->maps[i].inner_idx == -1)
+			continue;
+
+		*imap = ctx->maps[i];
+
+		for (j = 0; j < ctx->map_num; j++) {
+			if (!bpf_is_map_in_map_type(&ctx->maps[j]))
+				continue;
+			if (ctx->maps[j].inner_id != ctx->maps[i].id)
+				continue;
+
+			*omap = ctx->maps[j];
+			outer_map_name = bpf_map_fetch_name(ctx, j);
+			memcpy(omap_name, outer_map_name, strlen(outer_map_name) + 1);
+
+			return true;
+		}
+	}
+
+	return false;
+}
+
+int iproute2_find_map_name_by_id(unsigned int map_id, char *name)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *map_name;
+	int i, idx = -1;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		if (ctx->maps[i].id == map_id &&
+		    ctx->maps[i].type == BPF_MAP_TYPE_PROG_ARRAY) {
+			idx = i;
+			break;
+		}
+	}
+
+	if (idx < 0)
+		return -1;
+
+	map_name = bpf_map_fetch_name(ctx, idx);
+	if (!map_name)
+		return -1;
+
+	memcpy(name, map_name, strlen(map_name) + 1);
+	return 0;
+}
+#endif /* HAVE_LIBBPF */
 #endif /* HAVE_ELF */
diff --git a/lib/bpf_libbpf.c b/lib/bpf_libbpf.c
new file mode 100644
index 00000000..26694b43
--- /dev/null
+++ b/lib/bpf_libbpf.c
@@ -0,0 +1,353 @@
+/*
+ * bpf_libbpf.c	BPF code relay on libbpf
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ *
+ * Authors:	Hangbin Liu <haliu@redhat.com>
+ *
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <errno.h>
+#include <fcntl.h>
+
+#include <libelf.h>
+#include <gelf.h>
+
+#include <bpf/libbpf.h>
+#include <bpf/bpf.h>
+
+#include "bpf_util.h"
+
+static int verbose_print(enum libbpf_print_level level, const char *format, va_list args)
+{
+	return vfprintf(stderr, format, args);
+}
+
+static int silent_print(enum libbpf_print_level level, const char *format, va_list args)
+{
+	if (level > LIBBPF_WARN)
+		return 0;
+
+	/* Skip warning from bpf_object__init_user_maps() for legacy maps */
+	if (strstr(format, "has unrecognized, non-zero options"))
+		return 0;
+
+	return vfprintf(stderr, format, args);
+}
+
+static const char *get_bpf_program__section_name(const struct bpf_program *prog)
+{
+#ifdef HAVE_LIBBPF_SECTION_NAME
+	return bpf_program__section_name(prog);
+#else
+	return bpf_program__title(prog, false);
+#endif
+}
+
+static int create_map(const char *name, struct bpf_elf_map *map,
+		      __u32 ifindex, int inner_fd)
+{
+	struct bpf_create_map_attr map_attr = {};
+
+	map_attr.name = name;
+	map_attr.map_type = map->type;
+	map_attr.map_flags = map->flags;
+	map_attr.key_size = map->size_key;
+	map_attr.value_size = map->size_value;
+	map_attr.max_entries = map->max_elem;
+	map_attr.map_ifindex = ifindex;
+	map_attr.inner_map_fd = inner_fd;
+
+	return bpf_create_map_xattr(&map_attr);
+}
+
+static int create_map_in_map(struct bpf_object *obj, struct bpf_map *map,
+			     struct bpf_elf_map *elf_map, int inner_fd,
+			     bool *reuse_pin_map)
+{
+	char pathname[PATH_MAX];
+	const char *map_name;
+	bool pin_map = false;
+	int map_fd, ret = 0;
+
+	map_name = bpf_map__name(map);
+
+	if (iproute2_is_pin_map(map_name, pathname)) {
+		pin_map = true;
+
+		/* Check if there already has a pinned map */
+		map_fd = bpf_obj_get(pathname);
+		if (map_fd > 0) {
+			if (reuse_pin_map)
+				*reuse_pin_map = true;
+			close(map_fd);
+			return bpf_map__set_pin_path(map, pathname);
+		}
+	}
+
+	map_fd = create_map(map_name, elf_map, bpf_map__ifindex(map), inner_fd);
+	if (map_fd < 0) {
+		fprintf(stderr, "create map %s failed\n", map_name);
+		return map_fd;
+	}
+
+	ret = bpf_map__reuse_fd(map, map_fd);
+	if (ret < 0) {
+		fprintf(stderr, "map %s reuse fd failed\n", map_name);
+		goto err_out;
+	}
+
+	if (pin_map) {
+		ret = bpf_map__set_pin_path(map, pathname);
+		if (ret < 0)
+			goto err_out;
+	}
+
+	return 0;
+err_out:
+	close(map_fd);
+	return ret;
+}
+
+static int
+handle_legacy_map_in_map(struct bpf_object *obj, struct bpf_map *inner_map,
+			 const char *inner_map_name)
+{
+	int inner_fd, outer_fd, inner_idx, ret = 0;
+	struct bpf_elf_map imap, omap;
+	struct bpf_map *outer_map;
+	/* What's the size limit of map name? */
+	char outer_map_name[128];
+	bool reuse_pin_map = false;
+
+	/* Deal with map-in-map */
+	if (iproute2_is_map_in_map(inner_map_name, &imap, &omap, outer_map_name)) {
+		ret = create_map_in_map(obj, inner_map, &imap, -1, NULL);
+		if (ret < 0)
+			return ret;
+
+		inner_fd = bpf_map__fd(inner_map);
+		outer_map = bpf_object__find_map_by_name(obj, outer_map_name);
+		ret = create_map_in_map(obj, outer_map, &omap, inner_fd, &reuse_pin_map);
+		if (ret < 0)
+			return ret;
+
+		if (!reuse_pin_map) {
+			inner_idx = imap.inner_idx;
+			outer_fd = bpf_map__fd(outer_map);
+			ret = bpf_map_update_elem(outer_fd, &inner_idx, &inner_fd, 0);
+			if (ret < 0)
+				fprintf(stderr, "Cannot update inner_idx into outer_map\n");
+		}
+	}
+
+	return ret;
+}
+
+static int find_legacy_tail_calls(struct bpf_program *prog, struct bpf_object *obj)
+{
+	unsigned int map_id, key_id;
+	const char *sec_name;
+	struct bpf_map *map;
+	char map_name[128];
+	int ret;
+
+	/* Handle iproute2 tail call */
+	sec_name = get_bpf_program__section_name(prog);
+	ret = sscanf(sec_name, "%i/%i", &map_id, &key_id);
+	if (ret != 2)
+		return -1;
+
+	ret = iproute2_find_map_name_by_id(map_id, map_name);
+	if (ret < 0) {
+		fprintf(stderr, "unable to find map id %u for tail call\n", map_id);
+		return ret;
+	}
+
+	map = bpf_object__find_map_by_name(obj, map_name);
+	if (!map)
+		return -1;
+
+	/* Save the map here for later updating */
+	bpf_program__set_priv(prog, map, NULL);
+
+	return 0;
+}
+
+static int update_legacy_tail_call_maps(struct bpf_object *obj)
+{
+	int prog_fd, map_fd, ret = 0;
+	unsigned int map_id, key_id;
+	struct bpf_program *prog;
+	const char *sec_name;
+	struct bpf_map *map;
+
+	bpf_object__for_each_program(prog, obj) {
+		map = bpf_program__priv(prog);
+		if (!map)
+			continue;
+
+		prog_fd = bpf_program__fd(prog);
+		if (prog_fd < 0)
+			continue;
+
+		sec_name = get_bpf_program__section_name(prog);
+		ret = sscanf(sec_name, "%i/%i", &map_id, &key_id);
+		if (ret != 2)
+			continue;
+
+		map_fd = bpf_map__fd(map);
+		ret = bpf_map_update_elem(map_fd, &key_id, &prog_fd, 0);
+		if (ret < 0) {
+			fprintf(stderr, "Cannot update map key for tail call!\n");
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+static int handle_legacy_maps(struct bpf_object *obj)
+{
+	char pathname[PATH_MAX];
+	struct bpf_map *map;
+	const char *map_name;
+	int map_fd, ret = 0;
+
+	bpf_object__for_each_map(map, obj) {
+		map_name = bpf_map__name(map);
+
+		ret = handle_legacy_map_in_map(obj, map, map_name);
+		if (ret)
+			return ret;
+
+		/* If it is a iproute2 legacy pin maps, just set pin path
+		 * and let bpf_object__load() to deal with the map creation.
+		 * We need to ignore map-in-maps which have pinned maps manually
+		 */
+		map_fd = bpf_map__fd(map);
+		if (map_fd < 0 && iproute2_is_pin_map(map_name, pathname)) {
+			ret = bpf_map__set_pin_path(map, pathname);
+			if (ret) {
+				fprintf(stderr, "map '%s': couldn't set pin path.\n", map_name);
+				break;
+			}
+		}
+
+	}
+
+	return ret;
+}
+
+static int load_bpf_object(struct bpf_cfg_in *cfg)
+{
+	struct bpf_program *p, *prog = NULL;
+	struct bpf_object *obj;
+	char root_path[PATH_MAX];
+	struct bpf_map *map;
+	int prog_fd, ret = 0;
+
+	ret = iproute2_get_root_path(root_path, PATH_MAX);
+	if (ret)
+		return ret;
+
+	DECLARE_LIBBPF_OPTS(bpf_object_open_opts, open_opts,
+			.relaxed_maps = true,
+			.pin_root_path = root_path,
+	);
+
+	obj = bpf_object__open_file(cfg->object, &open_opts);
+	if (libbpf_get_error(obj)) {
+		fprintf(stderr, "ERROR: opening BPF object file failed\n");
+		return -ENOENT;
+	}
+
+	bpf_object__for_each_program(p, obj) {
+		/* Only load the programs that will either be subsequently
+		 * attached or inserted into a tail call map */
+		if (find_legacy_tail_calls(p, obj) < 0 && cfg->section &&
+		    strcmp(get_bpf_program__section_name(p), cfg->section)) {
+			ret = bpf_program__set_autoload(p, false);
+			if (ret)
+				return -EINVAL;
+			continue;
+		}
+
+		bpf_program__set_type(p, cfg->type);
+		bpf_program__set_ifindex(p, cfg->ifindex);
+		if (!prog)
+			prog = p;
+	}
+
+	bpf_object__for_each_map(map, obj) {
+		if (!bpf_map__is_offload_neutral(map))
+			bpf_map__set_ifindex(map, cfg->ifindex);
+	}
+
+	if (!prog) {
+		fprintf(stderr, "object file doesn't contain sec %s\n", cfg->section);
+		return -ENOENT;
+	}
+
+	/* Handle iproute2 legacy pin maps and map-in-maps */
+	ret = handle_legacy_maps(obj);
+	if (ret)
+		goto unload_obj;
+
+	ret = bpf_object__load(obj);
+	if (ret)
+		goto unload_obj;
+
+	ret = update_legacy_tail_call_maps(obj);
+	if (ret)
+		goto unload_obj;
+
+	prog_fd = fcntl(bpf_program__fd(prog), F_DUPFD_CLOEXEC, 1);
+	if (prog_fd < 0)
+		ret = -errno;
+	else
+		cfg->prog_fd = prog_fd;
+
+unload_obj:
+	/* Close obj as we don't need it */
+	bpf_object__close(obj);
+	return ret;
+}
+
+/* Load ebpf and return prog fd */
+int iproute2_load_libbpf(struct bpf_cfg_in *cfg)
+{
+	int ret = 0;
+
+	if (cfg->verbose)
+		libbpf_set_print(verbose_print);
+	else
+		libbpf_set_print(silent_print);
+
+	ret = iproute2_bpf_elf_ctx_init(cfg);
+	if (ret < 0) {
+		fprintf(stderr, "Cannot initialize ELF context!\n");
+		return ret;
+	}
+
+	ret = iproute2_bpf_fetch_ancillary();
+	if (ret < 0) {
+		fprintf(stderr, "Error fetching ELF ancillary data!\n");
+		return ret;
+	}
+
+	ret = load_bpf_object(cfg);
+	if (ret)
+		return ret;
+
+	return cfg->prog_fd;
+}
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv5 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder
  2020-11-16  6:53       ` [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
                           ` (2 preceding siblings ...)
  2020-11-16  6:53         ` [PATCHv5 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
@ 2020-11-16  6:53         ` Hangbin Liu
  2020-11-16  6:53         ` [PATCHv5 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
                           ` (2 subsequent siblings)
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-16  6:53 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Hangbin Liu

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 examples/bpf/README                        | 14 +++++++++-----
 examples/bpf/{ => legacy}/bpf_cyclic.c     |  2 +-
 examples/bpf/{ => legacy}/bpf_graft.c      |  2 +-
 examples/bpf/{ => legacy}/bpf_map_in_map.c |  2 +-
 examples/bpf/{ => legacy}/bpf_shared.c     |  2 +-
 examples/bpf/{ => legacy}/bpf_tailcall.c   |  2 +-
 6 files changed, 14 insertions(+), 10 deletions(-)
 rename examples/bpf/{ => legacy}/bpf_cyclic.c (95%)
 rename examples/bpf/{ => legacy}/bpf_graft.c (97%)
 rename examples/bpf/{ => legacy}/bpf_map_in_map.c (96%)
 rename examples/bpf/{ => legacy}/bpf_shared.c (97%)
 rename examples/bpf/{ => legacy}/bpf_tailcall.c (98%)

diff --git a/examples/bpf/README b/examples/bpf/README
index 1bbdda3f..732bcc83 100644
--- a/examples/bpf/README
+++ b/examples/bpf/README
@@ -1,8 +1,12 @@
 eBPF toy code examples (running in kernel) to familiarize yourself
 with syntax and features:
 
- - bpf_shared.c		-> Ingress/egress map sharing example
- - bpf_tailcall.c	-> Using tail call chains
- - bpf_cyclic.c		-> Simple cycle as tail calls
- - bpf_graft.c		-> Demo on altering runtime behaviour
- - bpf_map_in_map.c     -> Using map in map example
+ - legacy/bpf_shared.c		-> Ingress/egress map sharing example
+ - legacy/bpf_tailcall.c	-> Using tail call chains
+ - legacy/bpf_cyclic.c		-> Simple cycle as tail calls
+ - legacy/bpf_graft.c		-> Demo on altering runtime behaviour
+ - legacy/bpf_map_in_map.c	-> Using map in map example
+
+Note: Users should use new BTF way to defined the maps, the examples
+in legacy folder which is using struct bpf_elf_map defined maps is not
+recommanded.
diff --git a/examples/bpf/bpf_cyclic.c b/examples/bpf/legacy/bpf_cyclic.c
similarity index 95%
rename from examples/bpf/bpf_cyclic.c
rename to examples/bpf/legacy/bpf_cyclic.c
index 11d1c061..33590730 100644
--- a/examples/bpf/bpf_cyclic.c
+++ b/examples/bpf/legacy/bpf_cyclic.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* Cyclic dependency example to test the kernel's runtime upper
  * bound on loops. Also demonstrates on how to use direct-actions,
diff --git a/examples/bpf/bpf_graft.c b/examples/bpf/legacy/bpf_graft.c
similarity index 97%
rename from examples/bpf/bpf_graft.c
rename to examples/bpf/legacy/bpf_graft.c
index 07113d4a..f4c920cc 100644
--- a/examples/bpf/bpf_graft.c
+++ b/examples/bpf/legacy/bpf_graft.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* This example demonstrates how classifier run-time behaviour
  * can be altered with tail calls. We start out with an empty
diff --git a/examples/bpf/bpf_map_in_map.c b/examples/bpf/legacy/bpf_map_in_map.c
similarity index 96%
rename from examples/bpf/bpf_map_in_map.c
rename to examples/bpf/legacy/bpf_map_in_map.c
index ff0e623a..575f8812 100644
--- a/examples/bpf/bpf_map_in_map.c
+++ b/examples/bpf/legacy/bpf_map_in_map.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 #define MAP_INNER_ID	42
 
diff --git a/examples/bpf/bpf_shared.c b/examples/bpf/legacy/bpf_shared.c
similarity index 97%
rename from examples/bpf/bpf_shared.c
rename to examples/bpf/legacy/bpf_shared.c
index 21fe6f1e..05b2b9ef 100644
--- a/examples/bpf/bpf_shared.c
+++ b/examples/bpf/legacy/bpf_shared.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* Minimal, stand-alone toy map pinning example:
  *
diff --git a/examples/bpf/bpf_tailcall.c b/examples/bpf/legacy/bpf_tailcall.c
similarity index 98%
rename from examples/bpf/bpf_tailcall.c
rename to examples/bpf/legacy/bpf_tailcall.c
index 161eb606..8ebc554c 100644
--- a/examples/bpf/bpf_tailcall.c
+++ b/examples/bpf/legacy/bpf_tailcall.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 #define ENTRY_INIT	3
 #define ENTRY_0		0
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv5 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps
  2020-11-16  6:53       ` [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
                           ` (3 preceding siblings ...)
  2020-11-16  6:53         ` [PATCHv5 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
@ 2020-11-16  6:53         ` Hangbin Liu
  2020-11-16  7:19         ` [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support Alexei Starovoitov
  2020-11-23 13:11         ` [PATCHv6 " Hangbin Liu
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-16  6:53 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Hangbin Liu

Users should try use the new BTF defined maps instead of struct
bpf_elf_map defined maps. The tail call examples are not added yet
as libbpf doesn't currently support declaratively populating tail call
maps.

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 examples/bpf/README           |  6 ++++
 examples/bpf/bpf_graft.c      | 66 +++++++++++++++++++++++++++++++++++
 examples/bpf/bpf_map_in_map.c | 55 +++++++++++++++++++++++++++++
 examples/bpf/bpf_shared.c     | 53 ++++++++++++++++++++++++++++
 include/bpf_api.h             | 13 +++++++
 5 files changed, 193 insertions(+)
 create mode 100644 examples/bpf/bpf_graft.c
 create mode 100644 examples/bpf/bpf_map_in_map.c
 create mode 100644 examples/bpf/bpf_shared.c

diff --git a/examples/bpf/README b/examples/bpf/README
index 732bcc83..b7261191 100644
--- a/examples/bpf/README
+++ b/examples/bpf/README
@@ -1,6 +1,12 @@
 eBPF toy code examples (running in kernel) to familiarize yourself
 with syntax and features:
 
+- BTF defined map examples
+ - bpf_graft.c		-> Demo on altering runtime behaviour
+ - bpf_shared.c 	-> Ingress/egress map sharing example
+ - bpf_map_in_map.c	-> Using map in map example
+
+- legacy struct bpf_elf_map defined map examples
  - legacy/bpf_shared.c		-> Ingress/egress map sharing example
  - legacy/bpf_tailcall.c	-> Using tail call chains
  - legacy/bpf_cyclic.c		-> Simple cycle as tail calls
diff --git a/examples/bpf/bpf_graft.c b/examples/bpf/bpf_graft.c
new file mode 100644
index 00000000..8066dcce
--- /dev/null
+++ b/examples/bpf/bpf_graft.c
@@ -0,0 +1,66 @@
+#include "../../include/bpf_api.h"
+
+/* This example demonstrates how classifier run-time behaviour
+ * can be altered with tail calls. We start out with an empty
+ * jmp_tc array, then add section aaa to the array slot 0, and
+ * later on atomically replace it with section bbb. Note that
+ * as shown in other examples, the tc loader can prepopulate
+ * tail called sections, here we start out with an empty one
+ * on purpose to show it can also be done this way.
+ *
+ * tc filter add dev foo parent ffff: bpf obj graft.o
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-20229 [001] ..s. 138993.003923: : fallthrough
+ *   <idle>-0            [001] ..s. 138993.202265: : fallthrough
+ *   Socket Thread-20229 [001] ..s. 138994.004149: : fallthrough
+ *   [...]
+ *
+ * tc exec bpf graft m:globals/jmp_tc key 0 obj graft.o sec aaa
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-19818 [002] ..s. 139012.053587: : aaa
+ *   <idle>-0            [002] ..s. 139012.172359: : aaa
+ *   Socket Thread-19818 [001] ..s. 139012.173556: : aaa
+ *   [...]
+ *
+ * tc exec bpf graft m:globals/jmp_tc key 0 obj graft.o sec bbb
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-19818 [002] ..s. 139022.102967: : bbb
+ *   <idle>-0            [002] ..s. 139022.155640: : bbb
+ *   Socket Thread-19818 [001] ..s. 139022.156730: : bbb
+ *   [...]
+ */
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);
+} jmp_tc __section(".maps");
+
+__section("aaa")
+int cls_aaa(struct __sk_buff *skb)
+{
+	printt("aaa\n");
+	return TC_H_MAKE(1, 42);
+}
+
+__section("bbb")
+int cls_bbb(struct __sk_buff *skb)
+{
+	printt("bbb\n");
+	return TC_H_MAKE(1, 43);
+}
+
+__section_cls_entry
+int cls_entry(struct __sk_buff *skb)
+{
+	tail_call(skb, &jmp_tc, 0);
+	printt("fallthrough\n");
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/examples/bpf/bpf_map_in_map.c b/examples/bpf/bpf_map_in_map.c
new file mode 100644
index 00000000..39c86268
--- /dev/null
+++ b/examples/bpf/bpf_map_in_map.c
@@ -0,0 +1,55 @@
+#include "../../include/bpf_api.h"
+
+struct inner_map {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+} map_inner __section(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);
+	__array(values, struct inner_map);
+} map_outer __section(".maps") = {
+	.values = {
+		[0] = &map_inner,
+	},
+};
+
+__section("egress")
+int emain(struct __sk_buff *skb)
+{
+	struct bpf_elf_map *map_inner;
+	int key = 0, *val;
+
+	map_inner = map_lookup_elem(&map_outer, &key);
+	if (map_inner) {
+		val = map_lookup_elem(map_inner, &key);
+		if (val)
+			lock_xadd(val, 1);
+	}
+
+	return BPF_H_DEFAULT;
+}
+
+__section("ingress")
+int imain(struct __sk_buff *skb)
+{
+	struct bpf_elf_map *map_inner;
+	int key = 0, *val;
+
+	map_inner = map_lookup_elem(&map_outer, &key);
+	if (map_inner) {
+		val = map_lookup_elem(map_inner, &key);
+		if (val)
+			printt("map val: %d\n", *val);
+	}
+
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/examples/bpf/bpf_shared.c b/examples/bpf/bpf_shared.c
new file mode 100644
index 00000000..99a332f4
--- /dev/null
+++ b/examples/bpf/bpf_shared.c
@@ -0,0 +1,53 @@
+#include "../../include/bpf_api.h"
+
+/* Minimal, stand-alone toy map pinning example:
+ *
+ * clang -target bpf -O2 [...] -o bpf_shared.o -c bpf_shared.c
+ * tc filter add dev foo parent 1: bpf obj bpf_shared.o sec egress
+ * tc filter add dev foo parent ffff: bpf obj bpf_shared.o sec ingress
+ *
+ * Both classifier will share the very same map instance in this example,
+ * so map content can be accessed from ingress *and* egress side!
+ *
+ * This example has a pinning of PIN_OBJECT_NS, so it's private and
+ * thus shared among various program sections within the object.
+ *
+ * A setting of PIN_GLOBAL_NS would place it into a global namespace,
+ * so that it can be shared among different object files. A setting
+ * of PIN_NONE (= 0) means no sharing, so each tc invocation a new map
+ * instance is being created.
+ */
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);	/* or LIBBPF_PIN_NONE */
+} map_sh __section(".maps");
+
+__section("egress")
+int emain(struct __sk_buff *skb)
+{
+	int key = 0, *val;
+
+	val = map_lookup_elem(&map_sh, &key);
+	if (val)
+		lock_xadd(val, 1);
+
+	return BPF_H_DEFAULT;
+}
+
+__section("ingress")
+int imain(struct __sk_buff *skb)
+{
+	int key = 0, *val;
+
+	val = map_lookup_elem(&map_sh, &key);
+	if (val)
+		printt("map val: %d\n", *val);
+
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/include/bpf_api.h b/include/bpf_api.h
index 89d3488d..82c47089 100644
--- a/include/bpf_api.h
+++ b/include/bpf_api.h
@@ -19,6 +19,19 @@
 
 #include "bpf_elf.h"
 
+/** libbpf pin type. */
+enum libbpf_pin_type {
+	LIBBPF_PIN_NONE,
+	/* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */
+	LIBBPF_PIN_BY_NAME,
+};
+
+/** Type helper macros. */
+
+#define __uint(name, val) int (*name)[val]
+#define __type(name, val) typeof(val) *name
+#define __array(name, val) typeof(val) *name[]
+
 /** Misc macros. */
 
 #ifndef __stringify
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* Re: [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-16  6:53       ` [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
                           ` (4 preceding siblings ...)
  2020-11-16  6:53         ` [PATCHv5 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
@ 2020-11-16  7:19         ` Alexei Starovoitov
  2020-11-16 14:54           ` Jesper Dangaard Brouer
  2020-11-16 16:45           ` Stephen Hemminger
  2020-11-23 13:11         ` [PATCHv6 " Hangbin Liu
  6 siblings, 2 replies; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-16  7:19 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: Stephen Hemminger, David Ahern, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Network Development, bpf, Jiri Benc,
	Toke Høiland-Jørgensen

On Sun, Nov 15, 2020 at 10:56 PM Hangbin Liu <haliu@redhat.com> wrote:
>
> This series converts iproute2 to use libbpf for loading and attaching
> BPF programs when it is available. This means that iproute2 will
> correctly process BTF information and support the new-style BTF-defined
> maps, while keeping compatibility with the old internal map definition
> syntax.
>
> This is achieved by checking for libbpf at './configure' time, and using
> it if available. By default the system libbpf will be used, but static
> linking against a custom libbpf version can be achieved by passing
> LIBBPF_DIR to configure. LIBBPF_FORCE can be set to on to force configure
> abort if no suitable libbpf is found (useful for automatic packaging
> that wants to enforce the dependency), or set off to disable libbpf check
> and build iproute2 with legacy bpf.
>
> The old iproute2 bpf code is kept and will be used if no suitable libbpf
> is available. When using libbpf, wrapper code ensures that iproute2 will
> still understand the old map definition format, including populating
> map-in-map and tail call maps before load.
>
> The examples in bpf/examples are kept, and a separate set of examples
> are added with BTF-based map definitions for those examples where this
> is possible (libbpf doesn't currently support declaratively populating
> tail call maps).
>
> At last, Thanks a lot for Toke's help on this patch set.
>
> v5:
> a) Fix LIBBPF_DIR typo and description, use libbpf DESTDIR as LIBBPF_DIR
>    dest.
> b) Fix bpf_prog_load_dev typo.
> c) rebase to latest iproute2-next.

For the reasons explained multiple times earlier:
Nacked-by: Alexei Starovoitov <ast@kernel.org>

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-16  7:19         ` [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support Alexei Starovoitov
@ 2020-11-16 14:54           ` Jesper Dangaard Brouer
  2020-11-16 23:29             ` Toke Høiland-Jørgensen
                               ` (2 more replies)
  2020-11-16 16:45           ` Stephen Hemminger
  1 sibling, 3 replies; 167+ messages in thread
From: Jesper Dangaard Brouer @ 2020-11-16 14:54 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Hangbin Liu, Stephen Hemminger, David Ahern, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Network Development, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, brouer

On Sun, 15 Nov 2020 23:19:26 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Sun, Nov 15, 2020 at 10:56 PM Hangbin Liu <haliu@redhat.com> wrote:
> >
> > This series converts iproute2 to use libbpf for loading and attaching
> > BPF programs when it is available. This means that iproute2 will
> > correctly process BTF information and support the new-style BTF-defined
> > maps, while keeping compatibility with the old internal map definition
> > syntax.
> >
> > This is achieved by checking for libbpf at './configure' time, and using
> > it if available. By default the system libbpf will be used, but static
> > linking against a custom libbpf version can be achieved by passing
> > LIBBPF_DIR to configure. LIBBPF_FORCE can be set to on to force configure
> > abort if no suitable libbpf is found (useful for automatic packaging
> > that wants to enforce the dependency), or set off to disable libbpf check
> > and build iproute2 with legacy bpf.
> >
> > The old iproute2 bpf code is kept and will be used if no suitable libbpf
> > is available. When using libbpf, wrapper code ensures that iproute2 will
> > still understand the old map definition format, including populating
> > map-in-map and tail call maps before load.
> >
> > The examples in bpf/examples are kept, and a separate set of examples
> > are added with BTF-based map definitions for those examples where this
> > is possible (libbpf doesn't currently support declaratively populating
> > tail call maps).
> >
> > At last, Thanks a lot for Toke's help on this patch set.
> >
> > v5:
> > a) Fix LIBBPF_DIR typo and description, use libbpf DESTDIR as LIBBPF_DIR
> >    dest.
> > b) Fix bpf_prog_load_dev typo.
> > c) rebase to latest iproute2-next.  
> 
> For the reasons explained multiple times earlier:
> Nacked-by: Alexei Starovoitov <ast@kernel.org>

We really need to get another BPF-ELF loaded into iproute2.  I have
done a number of practical projects with TC-BPF and it sucks that
iproute2 have this out-dated (compiled in) BPF-loader.  Examples
jumping through hoops to get XDP + TC to collaborate[1], and dealing
with iproute2 map-elf layout[2].

Thus, IMHO we MUST move forward and get started with converting
iproute2 to libbpf, and start on the work to deprecate the build in
BPF-ELF-loader.  I would prefer ripping out the BPF-ELF-loader and
replace it with libbpf that handle the older binary elf-map layout, but
I do understand if you want to keep this around. (at least for the next
couple of releases).

Maybe we can get a little closer to what Alexei wants?

When compiled against dynamic libbpf, then I would use 'ldd' command to
see what libbpf lib version is used.  When compiled/linked statically
against a custom libbpf version (already supported via LIBBPF_DIR) then
*I* think is difficult to figure out that version of libbpf I'm using.
Could we add the libbpf version info in 'tc -V', as then it would
remove one of my concerns with static linking.

I actually fear that it will be a bad user experience, when we start to
have multiple userspace tools that load BPF, but each is compiled and
statically linked with it own version of libbpf (with git submodule an
increasing number of tools will have more variations!).  Small
variations in supported features can cause strange and difficult
troubleshooting. A practical example is xdp-cpumap-tc[1] where I had to
instruct the customer to load XDP-program *BEFORE* TC-program to have map
(that is shared between TC and XDP) being created correctly, for
userspace tool written in libbpf to have proper map-access and info.


I actually thinks it makes sense to have iproute2 require a specific
libbpf version, and also to move this version requirement forward, as
the kernel evolves features that gets added into libbpf.  I know this
is kind of controversial, and an attempt to pressure distro vendors to
update libbpf.  Maybe it will actually backfire, as the person
generating the DEB/RPM software package will/can choose to compile
iproute2 without ELF-BPF/libbpf support.


[1] https://github.com/xdp-project/xdp-cpumap-tc
[2] https://github.com/netoptimizer/bpf-examples/blob/71db45b28ec/traffic-pacing-edt/edt_pacer02.c#L33-L35
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

(p.s. I actually like dynamic libs, as I can do evil tricks like
LD_PRELOAD, e.g. if system had too old libbpf when I could have my own
and have iproute2 load it via LD_PRELOAD. I know we shouldn't encourage
tricks like this, but I've used these kind of trick with success before).


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-16  7:19         ` [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support Alexei Starovoitov
  2020-11-16 14:54           ` Jesper Dangaard Brouer
@ 2020-11-16 16:45           ` Stephen Hemminger
  1 sibling, 0 replies; 167+ messages in thread
From: Stephen Hemminger @ 2020-11-16 16:45 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Hangbin Liu, David Ahern, Daniel Borkmann, Martin KaFai Lau,
	Song Liu, Yonghong Song, David Miller, Jesper Dangaard Brouer,
	Network Development, bpf, Jiri Benc,
	Toke Høiland-Jørgensen

On Sun, 15 Nov 2020 23:19:26 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Sun, Nov 15, 2020 at 10:56 PM Hangbin Liu <haliu@redhat.com> wrote:
> >
> > This series converts iproute2 to use libbpf for loading and attaching
> > BPF programs when it is available. This means that iproute2 will
> > correctly process BTF information and support the new-style BTF-defined
> > maps, while keeping compatibility with the old internal map definition
> > syntax.
> >
> > This is achieved by checking for libbpf at './configure' time, and using
> > it if available. By default the system libbpf will be used, but static
> > linking against a custom libbpf version can be achieved by passing
> > LIBBPF_DIR to configure. LIBBPF_FORCE can be set to on to force configure
> > abort if no suitable libbpf is found (useful for automatic packaging
> > that wants to enforce the dependency), or set off to disable libbpf check
> > and build iproute2 with legacy bpf.
> >
> > The old iproute2 bpf code is kept and will be used if no suitable libbpf
> > is available. When using libbpf, wrapper code ensures that iproute2 will
> > still understand the old map definition format, including populating
> > map-in-map and tail call maps before load.
> >
> > The examples in bpf/examples are kept, and a separate set of examples
> > are added with BTF-based map definitions for those examples where this
> > is possible (libbpf doesn't currently support declaratively populating
> > tail call maps).
> >
> > At last, Thanks a lot for Toke's help on this patch set.
> >
> > v5:
> > a) Fix LIBBPF_DIR typo and description, use libbpf DESTDIR as LIBBPF_DIR
> >    dest.
> > b) Fix bpf_prog_load_dev typo.
> > c) rebase to latest iproute2-next.  
> 
> For the reasons explained multiple times earlier:
> Nacked-by: Alexei Starovoitov <ast@kernel.org>

Could you propose a trial balloon patch to show what you would like to see in iproute2?

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-16 14:54           ` Jesper Dangaard Brouer
@ 2020-11-16 23:29             ` Toke Høiland-Jørgensen
  2020-11-17  2:37             ` Alexei Starovoitov
  2020-11-17  3:38             ` David Ahern
  2 siblings, 0 replies; 167+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-11-16 23:29 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Alexei Starovoitov
  Cc: Hangbin Liu, Stephen Hemminger, David Ahern, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Network Development, bpf, Jiri Benc, brouer

Jesper Dangaard Brouer <brouer@redhat.com> writes:

> When compiled against dynamic libbpf, then I would use 'ldd' command to
> see what libbpf lib version is used.  When compiled/linked statically
> against a custom libbpf version (already supported via LIBBPF_DIR) then
> *I* think is difficult to figure out that version of libbpf I'm using.
> Could we add the libbpf version info in 'tc -V', as then it would
> remove one of my concerns with static linking.

Agreed, I think we should definitely add the libbpf version to the tool
version output.

-Toke


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-16 14:54           ` Jesper Dangaard Brouer
  2020-11-16 23:29             ` Toke Høiland-Jørgensen
@ 2020-11-17  2:37             ` Alexei Starovoitov
  2020-11-17  3:19               ` Hangbin Liu
  2020-11-17 11:56               ` Edward Cree
  2020-11-17  3:38             ` David Ahern
  2 siblings, 2 replies; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-17  2:37 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Hangbin Liu, Stephen Hemminger, David Ahern, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Network Development, bpf, Jiri Benc,
	Toke Høiland-Jørgensen

On Mon, Nov 16, 2020 at 03:54:46PM +0100, Jesper Dangaard Brouer wrote:
> 
> Thus, IMHO we MUST move forward and get started with converting
> iproute2 to libbpf, and start on the work to deprecate the build in
> BPF-ELF-loader.  I would prefer ripping out the BPF-ELF-loader and
> replace it with libbpf that handle the older binary elf-map layout, but
> I do understand if you want to keep this around. (at least for the next
> couple of releases).

I don't understand why legacy code has to be around.
Having the legacy code and an option to build tc without libbpf creates
backward compatibility risk to tc users:
Newer tc may not load bpf progs that older tc did.

> I actually fear that it will be a bad user experience, when we start to
> have multiple userspace tools that load BPF, but each is compiled and
> statically linked with it own version of libbpf (with git submodule an
> increasing number of tools will have more variations!).

So far people either freeze bpftool that they use to load progs
or they use libbpf directly in their applications.
Any other way means that the application behavior will be unpredictable.
If a company built a bpf-based product and wants to distibute such
product as a package it needs a way to specify this dependency in pkg config.
'tc -V' is not something that can be put in a spec.
The main iproute2 version can be used as a dependency, but it's meaningless
when presence of libbpf and its version is not strictly derived from
iproute2 spec.
The users should be able to write in their spec:
BuildRequires: iproute-tc >= 5.10
and be confident that tc will load the prog they've developed and tested.

> I actually thinks it makes sense to have iproute2 require a specific
> libbpf version, and also to move this version requirement forward, as
> the kernel evolves features that gets added into libbpf. 

+1

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-17  2:37             ` Alexei Starovoitov
@ 2020-11-17  3:19               ` Hangbin Liu
  2020-11-17 18:27                 ` Alexei Starovoitov
  2020-11-17 11:56               ` Edward Cree
  1 sibling, 1 reply; 167+ messages in thread
From: Hangbin Liu @ 2020-11-17  3:19 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jesper Dangaard Brouer, Stephen Hemminger, David Ahern,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Network Development, bpf, Jiri Benc,
	Toke Høiland-Jørgensen

On Mon, Nov 16, 2020 at 06:37:57PM -0800, Alexei Starovoitov wrote:
> On Mon, Nov 16, 2020 at 03:54:46PM +0100, Jesper Dangaard Brouer wrote:
> > 
> > Thus, IMHO we MUST move forward and get started with converting
> > iproute2 to libbpf, and start on the work to deprecate the build in
> > BPF-ELF-loader.  I would prefer ripping out the BPF-ELF-loader and
> > replace it with libbpf that handle the older binary elf-map layout, but
> > I do understand if you want to keep this around. (at least for the next
> > couple of releases).
> 
> I don't understand why legacy code has to be around.
> Having the legacy code and an option to build tc without libbpf creates
> backward compatibility risk to tc users:
> Newer tc may not load bpf progs that older tc did.

If a distro choose to compile iproute2 with libbpf, I don't think they will
compile iproute2 without libbpf in new version. So yum/apt-get update from
official source doesn't like a problem.

Unless a user choose to use a self build iproute2 version. Then the self build
version may also don't have other supports, like libelf, libnml, libcap etc.

> 
> > I actually fear that it will be a bad user experience, when we start to
> > have multiple userspace tools that load BPF, but each is compiled and
> > statically linked with it own version of libbpf (with git submodule an
> > increasing number of tools will have more variations!).
> 
> So far people either freeze bpftool that they use to load progs
> or they use libbpf directly in their applications.
> Any other way means that the application behavior will be unpredictable.
> If a company built a bpf-based product and wants to distibute such
> product as a package it needs a way to specify this dependency in pkg config.
> 'tc -V' is not something that can be put in a spec.
> The main iproute2 version can be used as a dependency, but it's meaningless
> when presence of libbpf and its version is not strictly derived from
> iproute2 spec.
> The users should be able to write in their spec:
> BuildRequires: iproute-tc >= 5.10
> and be confident that tc will load the prog they've developed and tested.

The current patch does have a libbpf version check, it need at least libbpf
0.1.0. So if a distro starts to build iproute2 based on libbpf, there will
have a dependence. The rule could be added to rpm spec file, or what else
the distro choose. That's the distro compiler's work.

Unless you want to say a company built a bpf-based product, they only
add iproute2 version dependence(let's say some distros has iproute2 5.12 with
libbpf supported), and somehow forgot add libbpf version dependence check
and distro check. At the same time a user run the product on a distro without
libbpf compiled on iproute2 5.12. That do will cause problem.

But if I'm the user, I will think the company is not professional for bpf
product that they even do not know libbpf is needed...

So my opinion: for end user, the distro should take care of libbpf and
iproute2 version control. For bpf company, they should take care if libbpf
is used by the iproute2 and what distros they support.

Please correct me if I missed something.

Thanks
Hangbin


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-16 14:54           ` Jesper Dangaard Brouer
  2020-11-16 23:29             ` Toke Høiland-Jørgensen
  2020-11-17  2:37             ` Alexei Starovoitov
@ 2020-11-17  3:38             ` David Ahern
  2020-11-17 18:19               ` Alexei Starovoitov
  2 siblings, 1 reply; 167+ messages in thread
From: David Ahern @ 2020-11-17  3:38 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Hangbin Liu, Stephen Hemminger, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Network Development, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Alexei Starovoitov

On 11/16/20 7:54 AM, Jesper Dangaard Brouer wrote:
> When compiled against dynamic libbpf, then I would use 'ldd' command to
> see what libbpf lib version is used.  When compiled/linked statically
> against a custom libbpf version (already supported via LIBBPF_DIR) then
> *I* think is difficult to figure out that version of libbpf I'm using.
> Could we add the libbpf version info in 'tc -V', as then it would
> remove one of my concerns with static linking.

Adding libbpf version to 'tc -V' and 'ip -V' seems reasonable.

As for the bigger problem, trying to force user space components to
constantly chase latest and greatest S/W versions is not the right answer.

The crux of the problem here is loading bpf object files and what will
most likely be a never ending stream of enhancements that impact the
proper loading of them. bpftool is much more suited to the job of
managing bpf files versus iproute2 which is the de facto implementation
for networking APIs. bpftool ships as part of a common linux tools
package, so it will naturally track kernel versions for those who want /
need latest and greatest versions. Users who are not building their own
agents for managing bpf files (which I think is much more appropriate
for production use cases than forking command line utilities) can use
bpftool to load files, manage maps which are then attached to the
programs, etc, and then invoke iproute2 to handle the networking attach
/ detach / list with detailed information.

That said, the legacy bpf code in iproute2 has created some
expectations, and iproute2 can not simply remove existing capabilities.
Moving iproute2 to libbpf provides an improvement over the current
status by allowing ‘modern’ bpf object files to be loaded without
affecting legacy users, even if it does not allow latest and greatest
bpf capabilities at every moment in time (again, a constantly moving
reference point).

iproute2 is a networking configuration tool, not a bpf management tool.
Hangbin’s approach gives full flexibility to those who roll their own
and for distributions who value stability, it allows iproute2 to use
latest and greatest libbpf for those who want to chase the pot of gold
at the end of the rainbow, or they can choose stability with an OS
distro’s libbpf or legacy bpf. I believe this is the right compromise at
this point in time.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-17  2:37             ` Alexei Starovoitov
  2020-11-17  3:19               ` Hangbin Liu
@ 2020-11-17 11:56               ` Edward Cree
  1 sibling, 0 replies; 167+ messages in thread
From: Edward Cree @ 2020-11-17 11:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Jesper Dangaard Brouer
  Cc: Hangbin Liu, Stephen Hemminger, David Ahern, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Network Development, bpf, Jiri Benc,
	Toke Høiland-Jørgensen

On 17/11/2020 02:37, Alexei Starovoitov wrote:
> If a company built a bpf-based product and wants to distibute such
> product as a package it needs a way to specify this dependency in pkg config.
> 'tc -V' is not something that can be put in a spec.
> The main iproute2 version can be used as a dependency, but it's meaningless
> when presence of libbpf and its version is not strictly derived from
> iproute2 spec.

But if libbpf is dynamically linked, they can put
Requires: libbpf >= 0.3.0
Requires: iproute-tc >= 5.10
and get the dependency behaviour they need.  No?

-ed

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-17  3:38             ` David Ahern
@ 2020-11-17 18:19               ` Alexei Starovoitov
  0 siblings, 0 replies; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-17 18:19 UTC (permalink / raw)
  To: David Ahern
  Cc: Jesper Dangaard Brouer, Hangbin Liu, Stephen Hemminger,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Network Development, bpf, Jiri Benc,
	Toke Høiland-Jørgensen

On Mon, Nov 16, 2020 at 08:38:15PM -0700, David Ahern wrote:
> 
> As for the bigger problem, trying to force user space components to
> constantly chase latest and greatest S/W versions is not the right answer.

Your own nexthop enhancements in the kernel code follow 1-1 with iproute2
changes. So the users do chase the latest kernel and the latest iproute2
if they want the networking feature.
Yet you're arguing that for bpf features they shouldn't have such expectations
with iproute2 which will not support the latest kernel bpf features.
I sense a lot of bias here.

> The crux of the problem here is loading bpf object files and what will
> most likely be a never ending stream of enhancements that impact the
> proper loading of them.

Please stop this misinformation spread.
Multiple people explained numerous times that libbpf takes care of
backward compatibility.

> That said, the legacy bpf code in iproute2 has created some
> expectations, and iproute2 can not simply remove existing capabilities.

It certainly can remove them by moving to libbpf.

> iproute2 is a networking configuration tool, not a bpf management tool.
> Hangbin’s approach gives full flexibility to those who roll their own
> and for distributions who value stability, it allows iproute2 to use
> latest and greatest libbpf for those who want to chase the pot of gold
> at the end of the rainbow, or they can choose stability with an OS
> distro’s libbpf or legacy bpf. I believe this is the right compromise at
> this point in time.

In other words you're saying that upstream iproute2 is a kitchen sink
of untested combinations of libraries and distros suppose to do a ton
of extra work to provide their users a quality iproute2.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-17  3:19               ` Hangbin Liu
@ 2020-11-17 18:27                 ` Alexei Starovoitov
  0 siblings, 0 replies; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-17 18:27 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: Jesper Dangaard Brouer, Stephen Hemminger, David Ahern,
	Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Network Development, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, ecree

On Tue, Nov 17, 2020 at 11:19:33AM +0800, Hangbin Liu wrote:
> On Mon, Nov 16, 2020 at 06:37:57PM -0800, Alexei Starovoitov wrote:
> > On Mon, Nov 16, 2020 at 03:54:46PM +0100, Jesper Dangaard Brouer wrote:
> > > 
> > > Thus, IMHO we MUST move forward and get started with converting
> > > iproute2 to libbpf, and start on the work to deprecate the build in
> > > BPF-ELF-loader.  I would prefer ripping out the BPF-ELF-loader and
> > > replace it with libbpf that handle the older binary elf-map layout, but
> > > I do understand if you want to keep this around. (at least for the next
> > > couple of releases).
> > 
> > I don't understand why legacy code has to be around.
> > Having the legacy code and an option to build tc without libbpf creates
> > backward compatibility risk to tc users:
> > Newer tc may not load bpf progs that older tc did.
> 
> If a distro choose to compile iproute2 with libbpf, I don't think they will
> compile iproute2 without libbpf in new version. So yum/apt-get update from
> official source doesn't like a problem.
> 
> Unless a user choose to use a self build iproute2 version. Then the self build
> version may also don't have other supports, like libelf, libnml, libcap etc.
> 
> > 
> > > I actually fear that it will be a bad user experience, when we start to
> > > have multiple userspace tools that load BPF, but each is compiled and
> > > statically linked with it own version of libbpf (with git submodule an
> > > increasing number of tools will have more variations!).
> > 
> > So far people either freeze bpftool that they use to load progs
> > or they use libbpf directly in their applications.
> > Any other way means that the application behavior will be unpredictable.
> > If a company built a bpf-based product and wants to distibute such
> > product as a package it needs a way to specify this dependency in pkg config.
> > 'tc -V' is not something that can be put in a spec.
> > The main iproute2 version can be used as a dependency, but it's meaningless
> > when presence of libbpf and its version is not strictly derived from
> > iproute2 spec.
> > The users should be able to write in their spec:
> > BuildRequires: iproute-tc >= 5.10
> > and be confident that tc will load the prog they've developed and tested.
> 
> The current patch does have a libbpf version check, it need at least libbpf
> 0.1.0. So if a distro starts to build iproute2 based on libbpf, there will
> have a dependence. The rule could be added to rpm spec file, or what else
> the distro choose. That's the distro compiler's work.
> 
> Unless you want to say a company built a bpf-based product, they only
> add iproute2 version dependence(let's say some distros has iproute2 5.12 with
> libbpf supported), and somehow forgot add libbpf version dependence check
> and distro check. At the same time a user run the product on a distro without
> libbpf compiled on iproute2 5.12. That do will cause problem.

right.
You've answered Ed's question:

> But if libbpf is dynamically linked, they can put
> Requires: libbpf >= 0.3.0
> Requires: iproute-tc >= 5.10
> and get the dependency behaviour they need.  No?

It is a problem because >= 5.10 cannot capture legacy vs libbpf.

> But if I'm the user, I will think the company is not professional for bpf
> product that they even do not know libbpf is needed...
> 
> So my opinion: for end user, the distro should take care of libbpf and
> iproute2 version control. For bpf company, they should take care if libbpf
> is used by the iproute2 and what distros they support.

So you're saying that bpf community shouldn't care about their users.
The distros suppose to step forward and provide proper bpf support
in tools like iproute2?
In other words iproute2 upstream doesn't care about shipping quality product.
It's distros job now.
Thanks, but no.
iproute2 should stay with legacy obsolete prog loader
and the users should switch to bpftool + iproute2 combination.
bpftool for loading progs and iproute2 for networking configs.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* [PATCHv6 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-16  6:53       ` [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
                           ` (5 preceding siblings ...)
  2020-11-16  7:19         ` [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support Alexei Starovoitov
@ 2020-11-23 13:11         ` Hangbin Liu
  2020-11-23 13:11           ` [PATCHv6 iproute2-next 1/5] iproute2: add check_libbpf() and get_libbpf_version() Hangbin Liu
                             ` (6 more replies)
  6 siblings, 7 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-23 13:11 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Jesper Dangaard Brouer,
	Alexei Starovoitov, Hangbin Liu

This series converts iproute2 to use libbpf for loading and attaching
BPF programs when it is available. This means that iproute2 will
correctly process BTF information and support the new-style BTF-defined
maps, while keeping compatibility with the old internal map definition
syntax.

This is achieved by checking for libbpf at './configure' time, and using
it if available. By default the system libbpf will be used, but static
linking against a custom libbpf version can be achieved by passing
LIBBPF_DIR to configure. LIBBPF_FORCE can be set to on to force configure
abort if no suitable libbpf is found (useful for automatic packaging
that wants to enforce the dependency), or set off to disable libbpf check
and build iproute2 with legacy bpf.

The old iproute2 bpf code is kept and will be used if no suitable libbpf
is available. When using libbpf, wrapper code ensures that iproute2 will
still understand the old map definition format, including populating
map-in-map and tail call maps before load.

The examples in bpf/examples are kept, and a separate set of examples
are added with BTF-based map definitions for those examples where this
is possible (libbpf doesn't currently support declaratively populating
tail call maps).

At last, Thanks a lot for Toke's help on this patch set.

v6:
a) print runtime libbpf version in ip -V and tc -V

v5:
a) Fix LIBBPF_DIR typo and description, use libbpf DESTDIR as LIBBPF_DIR
   dest.
b) Fix bpf_prog_load_dev typo.
c) rebase to latest iproute2-next.

v4:
a) Make variable LIBBPF_FORCE able to control whether build iproute2
   with libbpf or not.
b) Add new file bpf_glue.c to for libbpf/legacy mixed bpf calls.
c) Fix some build issues and shell compatibility error.

v3:
a) Update configure to Check function bpf_program__section_name() separately
b) Add a new function get_bpf_program__section_name() to choose whether to
use bpf_program__title() or not.
c) Test build the patch on Fedora 33 with libbpf-0.1.0-1.fc33 and
   libbpf-devel-0.1.0-1.fc33

v2:
a) Remove self defined IS_ERR_OR_NULL and use libbpf_get_error() instead.
b) Add ipvrf with libbpf support.


Here are the test results with patched iproute2:
== Show libbpf version
# ip -V
ip utility, iproute2-5.9.0, libbpf 0.1.0
# tc -V
tc utility, iproute2-5.9.0, libbpf 0.1.0

== setup env
# clang -O2 -Wall -g -target bpf -c bpf_graft.c -o btf_graft.o
# clang -O2 -Wall -g -target bpf -c bpf_map_in_map.c -o btf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c bpf_shared.c -o btf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_cyclic.c -o bpf_cyclic.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_graft.c -o bpf_graft.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_map_in_map.c -o bpf_map_in_map.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_shared.c -o bpf_shared.o
# clang -O2 -Wall -g -target bpf -c legacy/bpf_tailcall.c -o bpf_tailcall.o
# rm -rf /sys/fs/bpf/xdp/globals
# /root/iproute2/ip/ip link add type veth
# /root/iproute2/ip/ip link set veth0 up
# /root/iproute2/ip/ip link set veth1 up


== Load objs
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 4 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
4: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:21-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 5
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 8 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
8: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:23-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 3
        btf_id 10
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 12 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
12: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:25-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 4
        btf_id 15
# /root/iproute2/ip/ip link set veth0 xdp off


== Load objs again to make sure maps could be reused
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 16 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
16: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:27-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 20
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 20 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show                                                                                                                                                                   [236/4518]
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
20: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:29-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 3
        btf_id 25
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj bpf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 24 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef:
map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc  map_inner  map_outer
# bpftool map show
1: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
2: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
3: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
4: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
24: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:31-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 4
        btf_id 30
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/7a1422e90cd81478f97bc33fbd7782bcb3b868ef /sys/fs/bpf/xdp/globals

== Testing if we can load new-style objects (using xdp-filter as an example)
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_all.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 28 tag e29eeda1489a6520 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
28: xdp  name xdpfilt_alw_all  tag e29eeda1489a6520  gpl
        loaded_at 2020-10-22T08:04:33-0400  uid 0
        xlated 2408B  jited 1405B  memlock 4096B  map_ids 9,5,7,8,6
        btf_id 35
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_ip.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 32 tag 2f2b9dbfb786a5a2 jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
32: xdp  name xdpfilt_alw_ip  tag 2f2b9dbfb786a5a2  gpl
        loaded_at 2020-10-22T08:04:35-0400  uid 0
        xlated 1336B  jited 778B  memlock 4096B  map_ids 7,8,5
        btf_id 40
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj /usr/lib64/bpf/xdpfilt_alw_tcp.o sec xdp_filter
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 36 tag 18c1bb25084030bc jited
# ls /sys/fs/bpf/xdp/globals
filter_ethernet  filter_ipv4  filter_ipv6  filter_ports  xdp_stats_map
# bpftool map show
5: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        btf_id 35
6: percpu_array  name filter_ports  flags 0x0
        key 4B  value 8B  max_entries 65536  memlock 1576960B
        btf_id 35
7: percpu_hash  name filter_ipv4  flags 0x0
        key 4B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
8: percpu_hash  name filter_ipv6  flags 0x0
        key 16B  value 8B  max_entries 10000  memlock 1142784B
        btf_id 35
9: percpu_hash  name filter_ethernet  flags 0x0
        key 6B  value 8B  max_entries 10000  memlock 1064960B
        btf_id 35
# bpftool prog show
36: xdp  name xdpfilt_alw_tcp  tag 18c1bb25084030bc  gpl
        loaded_at 2020-10-22T08:04:37-0400  uid 0
        xlated 1128B  jited 690B  memlock 4096B  map_ids 6,5
        btf_id 45
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals


== Load new btf defined maps
# /root/iproute2/ip/ip link set veth0 xdp obj btf_graft.o sec aaa
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 40 tag 3056d2382e53f27c jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
40: xdp  name cls_aaa  tag 3056d2382e53f27c  gpl
        loaded_at 2020-10-22T08:04:39-0400  uid 0
        xlated 80B  jited 71B  memlock 4096B
        btf_id 50
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj btf_map_in_map.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 44 tag 4420e72b2a601ed7 jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
44: xdp  name imain  tag 4420e72b2a601ed7  gpl
        loaded_at 2020-10-22T08:04:41-0400  uid 0
        xlated 336B  jited 193B  memlock 4096B  map_ids 13
        btf_id 55
# /root/iproute2/ip/ip link set veth0 xdp off
# /root/iproute2/ip/ip link set veth0 xdp obj btf_shared.o sec ingress
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
    prog/xdp id 48 tag 9cbab549c3af3eab jited
# ls /sys/fs/bpf/xdp/globals
jmp_tc  map_outer  map_sh
# bpftool map show
10: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
11: array  name map_inner  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
13: array_of_maps  name map_outer  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
14: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
48: xdp  name imain  tag 9cbab549c3af3eab  gpl
        loaded_at 2020-10-22T08:04:43-0400  uid 0
        xlated 224B  jited 139B  memlock 4096B  map_ids 14
        btf_id 60
# /root/iproute2/ip/ip link set veth0 xdp off
# rm -rf /sys/fs/bpf/xdp/globals


== Test load objs by tc
# /root/iproute2/tc/tc qdisc add dev veth0 ingress
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_cyclic.o sec 0xabccba/0
# /root/iproute2/tc/tc filter add dev veth0 parent ffff: bpf obj bpf_graft.o
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 42/1
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec 43/0
# /root/iproute2/tc/tc filter add dev veth0 ingress bpf da obj bpf_tailcall.o sec classifier
# /root/iproute2/ip/ip link show veth0
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 6a:e6:fa:2b:4e:1f brd ff:ff:ff:ff:ff:ff
# ls /sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d /sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f /sys/fs/bpf/xdp/globals
/sys/fs/bpf/xdp/37e88cb3b9646b2ea5f99ab31069ad88db06e73d:
jmp_tc

/sys/fs/bpf/xdp/fc68fe3e96378a0cba284ea6acbe17e898d8b11f:
jmp_ex  jmp_tc  map_sh

/sys/fs/bpf/xdp/globals:
jmp_tc
# bpftool map show
15: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
16: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
17: prog_array  name jmp_ex  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
        owner_prog_type sched_cls  owner jited
18: prog_array  name jmp_tc  flags 0x0
        key 4B  value 4B  max_entries 2  memlock 4096B
        owner_prog_type sched_cls  owner jited
19: array  name map_sh  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
# bpftool prog show
52: sched_cls  name cls_loop  tag 3e98a40b04099d36  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 168B  jited 133B  memlock 4096B  map_ids 15
        btf_id 65
56: sched_cls  name cls_entry  tag 0fbb4d9310a6ee26  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 144B  jited 121B  memlock 4096B  map_ids 16
        btf_id 70
60: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 75
66: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 80
72: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 85
78: sched_cls  name cls_case1  tag e06a3bd62293d65d  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 328B  jited 216B  memlock 4096B  map_ids 19,17
        btf_id 90
79: sched_cls  name cls_case2  tag ee218ff893dca823  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 336B  jited 218B  memlock 4096B  map_ids 19,18
        btf_id 90
80: sched_cls  name cls_exit  tag e78a58140deed387  gpl
        loaded_at 2020-10-22T08:04:45-0400  uid 0
        xlated 288B  jited 177B  memlock 4096B  map_ids 19
        btf_id 90

I also run the following upstream kselftest with patches iproute2 and
all passed.

test_lwt_ip_encap.sh
test_xdp_redirect.sh
test_tc_redirect.sh
test_xdp_meta.sh
test_xdp_veth.sh
test_xdp_vlan.sh


Hangbin Liu (5):
  iproute2: add check_libbpf() and get_libbpf_version()
  lib: make ipvrf able to use libbpf and fix function name conflicts
  lib: add libbpf support
  examples/bpf: move struct bpf_elf_map defined maps to legacy folder
  examples/bpf: add bpf examples with BTF defined maps

 configure                                | 113 ++++++++
 examples/bpf/README                      |  18 +-
 examples/bpf/bpf_graft.c                 |  14 +-
 examples/bpf/bpf_map_in_map.c            |  37 ++-
 examples/bpf/bpf_shared.c                |  14 +-
 examples/bpf/{ => legacy}/bpf_cyclic.c   |   2 +-
 examples/bpf/legacy/bpf_graft.c          |  66 +++++
 examples/bpf/legacy/bpf_map_in_map.c     |  56 ++++
 examples/bpf/legacy/bpf_shared.c         |  53 ++++
 examples/bpf/{ => legacy}/bpf_tailcall.c |   2 +-
 include/bpf_api.h                        |  13 +
 include/bpf_util.h                       |  30 +-
 ip/ip.c                                  |  10 +-
 ip/ipvrf.c                               |   6 +-
 lib/Makefile                             |   8 +-
 lib/bpf_glue.c                           |  86 ++++++
 lib/{bpf.c => bpf_legacy.c}              | 193 ++++++++++++-
 lib/bpf_libbpf.c                         | 348 +++++++++++++++++++++++
 tc/tc.c                                  |  10 +-
 19 files changed, 1017 insertions(+), 62 deletions(-)
 rename examples/bpf/{ => legacy}/bpf_cyclic.c (95%)
 create mode 100644 examples/bpf/legacy/bpf_graft.c
 create mode 100644 examples/bpf/legacy/bpf_map_in_map.c
 create mode 100644 examples/bpf/legacy/bpf_shared.c
 rename examples/bpf/{ => legacy}/bpf_tailcall.c (98%)
 create mode 100644 lib/bpf_glue.c
 rename lib/{bpf.c => bpf_legacy.c} (94%)
 create mode 100644 lib/bpf_libbpf.c

-- 
2.25.4


^ permalink raw reply	[flat|nested] 167+ messages in thread

* [PATCHv6 iproute2-next 1/5] iproute2: add check_libbpf() and get_libbpf_version()
  2020-11-23 13:11         ` [PATCHv6 " Hangbin Liu
@ 2020-11-23 13:11           ` Hangbin Liu
  2020-11-23 13:11           ` [PATCHv6 iproute2-next 2/5] lib: make ipvrf able to use libbpf and fix function name conflicts Hangbin Liu
                             ` (5 subsequent siblings)
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-23 13:11 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Jesper Dangaard Brouer,
	Alexei Starovoitov, Hangbin Liu

This patch aim to add basic checking functions for later iproute2
libbpf support.

First we add check_libbpf() in configure to see if we have bpf library
support. By default the system libbpf will be used, but static linking
against a custom libbpf version can be achieved by passing libbpf DESTDIR
to variable LIBBPF_DIR for configure.

Another variable LIBBPF_FORCE is used to control whether to build iproute2
with libbpf. If set to on, then force to build with libbpf and exit if
not available. If set to off, then force to not build with libbpf.

When dynamically linking against libbpf, we can't be sure that the
version we discovered at compile time is actually the one we are
using at runtime. This can lead to hard-to-debug errors. So we add
a new file lib/bpf_glue.c and a helper function get_libbpf_version()
to get correct libbpf version at runtime.

Signed-off-by: Hangbin Liu <haliu@redhat.com>
---

v6:
1) Add a new helper get_libbpf_version() to get runtime libbpf version
  based on Toke's xdp-tools patch. The libbpf version will be printed
  when exec ip -V or tc -V.

v5:
1) Fix LIBBPF_DIR type and description, use libbpf DESTDIR as LIBBPF_DIR
   dest.

v4:
1) Remove duplicate LIBBPF_CFLAGS
2) Remove un-needed -L since using static libbpf.a
3) Fix == not supported in dash
4) Extend LIBBPF_FORCE to support on/off, when set to on, stop building when
   there is no libbpf support. If set to off, discard libbpf check.
5) Print libbpf version after checking

v3:
Check function bpf_program__section_name() separately and only use it
on higher libbpf version.

v2:
No update
---
 configure          | 113 +++++++++++++++++++++++++++++++++++++++++++++
 include/bpf_util.h |   3 ++
 ip/ip.c            |  10 +++-
 lib/Makefile       |   2 +-
 lib/bpf_glue.c     |  63 +++++++++++++++++++++++++
 tc/tc.c            |  10 +++-
 6 files changed, 196 insertions(+), 5 deletions(-)
 create mode 100644 lib/bpf_glue.c

diff --git a/configure b/configure
index 307912aa..2c363d3b 100755
--- a/configure
+++ b/configure
@@ -2,6 +2,11 @@
 # SPDX-License-Identifier: GPL-2.0
 # This is not an autoconf generated configure
 #
+# Influential LIBBPF environment variables:
+#   LIBBPF_FORCE={on,off}   on: require link against libbpf;
+#                           off: disable libbpf probing
+#   LIBBPF_DIR              Path to libbpf DESTDIR to use
+
 INCLUDE=${1:-"$PWD/include"}
 
 # Output file which is input to Makefile
@@ -240,6 +245,111 @@ check_elf()
     fi
 }
 
+have_libbpf_basic()
+{
+    cat >$TMPDIR/libbpf_test.c <<EOF
+#include <bpf/libbpf.h>
+int main(int argc, char **argv) {
+    bpf_program__set_autoload(NULL, false);
+    bpf_map__ifindex(NULL);
+    bpf_map__set_pin_path(NULL, NULL);
+    bpf_object__open_file(NULL, NULL);
+    return 0;
+}
+EOF
+
+    $CC -o $TMPDIR/libbpf_test $TMPDIR/libbpf_test.c $LIBBPF_CFLAGS $LIBBPF_LDLIBS >/dev/null 2>&1
+    local ret=$?
+
+    rm -f $TMPDIR/libbpf_test.c $TMPDIR/libbpf_test
+    return $ret
+}
+
+have_libbpf_sec_name()
+{
+    cat >$TMPDIR/libbpf_sec_test.c <<EOF
+#include <bpf/libbpf.h>
+int main(int argc, char **argv) {
+    void *ptr;
+    bpf_program__section_name(NULL);
+    return 0;
+}
+EOF
+
+    $CC -o $TMPDIR/libbpf_sec_test $TMPDIR/libbpf_sec_test.c $LIBBPF_CFLAGS $LIBBPF_LDLIBS >/dev/null 2>&1
+    local ret=$?
+
+    rm -f $TMPDIR/libbpf_sec_test.c $TMPDIR/libbpf_sec_test
+    return $ret
+}
+
+check_force_libbpf_on()
+{
+    # if set LIBBPF_FORCE=on but no libbpf support, just exist the config
+    # process to make sure we don't build without libbpf.
+    if [ "$LIBBPF_FORCE" = on ]; then
+        echo "	LIBBPF_FORCE=on set, but couldn't find a usable libbpf"
+        exit 1
+    fi
+}
+
+check_libbpf()
+{
+    # if set LIBBPF_FORCE=off, disable libbpf entirely
+    if [ "$LIBBPF_FORCE" = off ]; then
+        echo "no"
+        return
+    fi
+
+    if ! ${PKG_CONFIG} libbpf --exists && [ -z "$LIBBPF_DIR" ] ; then
+        echo "no"
+        check_force_libbpf_on
+        return
+    fi
+
+    if [ $(uname -m) = x86_64 ]; then
+        local LIBBPF_LIBDIR="${LIBBPF_DIR}/usr/lib64"
+    else
+        local LIBBPF_LIBDIR="${LIBBPF_DIR}/usr/lib"
+    fi
+
+    if [ -n "$LIBBPF_DIR" ]; then
+        LIBBPF_CFLAGS="-I${LIBBPF_DIR}/usr/include"
+        LIBBPF_LDLIBS="${LIBBPF_LIBDIR}/libbpf.a -lz -lelf"
+        LIBBPF_VERSION=$(PKG_CONFIG_LIBDIR=${LIBBPF_LIBDIR}/pkgconfig ${PKG_CONFIG} libbpf --modversion)
+    else
+        LIBBPF_CFLAGS=$(${PKG_CONFIG} libbpf --cflags)
+        LIBBPF_LDLIBS=$(${PKG_CONFIG} libbpf --libs)
+        LIBBPF_VERSION=$(${PKG_CONFIG} libbpf --modversion)
+    fi
+
+    if ! have_libbpf_basic; then
+        echo "no"
+        echo "	libbpf version $LIBBPF_VERSION is too low, please update it to at least 0.1.0"
+        check_force_libbpf_on
+        return
+    else
+        echo "HAVE_LIBBPF:=y" >> $CONFIG
+        echo 'CFLAGS += -DHAVE_LIBBPF ' $LIBBPF_CFLAGS >> $CONFIG
+        echo "CFLAGS += -DLIBBPF_VERSION=\\\"$LIBBPF_VERSION\\\"" >> $CONFIG
+        echo 'LDLIBS += ' $LIBBPF_LDLIBS >> $CONFIG
+
+        if [ -z "$LIBBPF_DIR" ]; then
+            echo "CFLAGS += -DLIBBPF_DYNAMIC" >> $CONFIG
+        fi
+    fi
+
+    # bpf_program__title() is deprecated since libbpf 0.2.0, use
+    # bpf_program__section_name() instead if we support
+    if have_libbpf_sec_name; then
+        echo "HAVE_LIBBPF_SECTION_NAME:=y" >> $CONFIG
+        echo 'CFLAGS += -DHAVE_LIBBPF_SECTION_NAME ' >> $CONFIG
+    fi
+
+    echo "yes"
+    echo "	libbpf version $LIBBPF_VERSION"
+}
+
 check_selinux()
 # SELinux is a compile time option in the ss utility
 {
@@ -385,6 +495,9 @@ check_setns
 echo -n "SELinux support: "
 check_selinux
 
+echo -n "libbpf support: "
+check_libbpf
+
 echo -n "ELF support: "
 check_elf
 
diff --git a/include/bpf_util.h b/include/bpf_util.h
index 63db07ca..dee5bb02 100644
--- a/include/bpf_util.h
+++ b/include/bpf_util.h
@@ -300,4 +300,7 @@ static inline int bpf_recv_map_fds(const char *path, int *fds,
 	return -1;
 }
 #endif /* HAVE_ELF */
+
+const char *get_libbpf_version(void);
+
 #endif /* __BPF_UTIL__ */
diff --git a/ip/ip.c b/ip/ip.c
index 5e31957f..466dbb52 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -24,6 +24,7 @@
 #include "namespace.h"
 #include "color.h"
 #include "rt_names.h"
+#include "bpf_util.h"
 
 int preferred_family = AF_UNSPEC;
 int human_readable;
@@ -147,8 +148,9 @@ static int batch(const char *name)
 
 int main(int argc, char **argv)
 {
-	char *basename;
+	const char *libbpf_version;
 	char *batch_file = NULL;
+	char *basename;
 	int color = 0;
 
 	/* to run vrf exec without root, capabilities might be set, drop them
@@ -229,7 +231,11 @@ int main(int argc, char **argv)
 			++timestamp;
 			++timestamp_short;
 		} else if (matches(opt, "-Version") == 0) {
-			printf("ip utility, iproute2-%s\n", version);
+			printf("ip utility, iproute2-%s", version);
+			libbpf_version = get_libbpf_version();
+			if (libbpf_version)
+				printf(", libbpf %s", libbpf_version);
+			printf("\n");
 			exit(0);
 		} else if (matches(opt, "-force") == 0) {
 			++force;
diff --git a/lib/Makefile b/lib/Makefile
index 13f4ee15..a02775a5 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -5,7 +5,7 @@ CFLAGS += -fPIC
 
 UTILOBJ = utils.o rt_names.o ll_map.o ll_types.o ll_proto.o ll_addr.o \
 	inet_proto.o namespace.o json_writer.o json_print.o \
-	names.o color.o bpf.o exec.o fs.o cg_map.o
+	names.o color.o bpf.o bpf_glue.o exec.o fs.o cg_map.o
 
 NLOBJ=libgenl.o libnetlink.o mnl_utils.o
 
diff --git a/lib/bpf_glue.c b/lib/bpf_glue.c
new file mode 100644
index 00000000..67c41c22
--- /dev/null
+++ b/lib/bpf_glue.c
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * bpf_glue.c:	BPF code to call both legacy and libbpf code
+ * Authors:	Hangbin Liu <haliu@redhat.com>
+ *
+ */
+#include "bpf_util.h"
+
+#ifdef HAVE_LIBBPF
+static const char *_libbpf_compile_version = LIBBPF_VERSION;
+static char _libbpf_version[10] = {};
+
+const char *get_libbpf_version(void)
+{
+	/* Start by copying compile-time version into buffer so we have a
+	 * fallback value in case we are dynamically linked, or can't find a
+	 * version in /proc/self/maps below.
+	 */
+	strncpy(_libbpf_version, _libbpf_compile_version,
+		sizeof(_libbpf_version)-1);
+#ifdef LIBBPF_DYNAMIC
+	char buf[PATH_MAX], *s;
+	bool found = false;
+	FILE *fp;
+
+	/* When dynamically linking against libbpf, we can't be sure that the
+	 * version we discovered at compile time is actually the one we are
+	 * using at runtime. This can lead to hard-to-debug errors, so we try to
+	 * discover the correct version at runtime.
+	 *
+	 * The simple solution to this would be if libbpf itself exported a
+	 * version in its API. But since it doesn't, we work around this by
+	 * parsing the mappings of the binary at runtime, looking for the full
+	 * filename of libbpf.so and using that.
+	 */
+	fp = fopen("/proc/self/maps", "r");
+	if (fp == NULL)
+		goto out;
+
+	while ((s = fgets(buf, sizeof(buf), fp)) != NULL) {
+		if ((s = strstr(buf, "libbpf.so.")) != NULL) {
+			strncpy(_libbpf_version, s+10, sizeof(_libbpf_version)-1);
+			strtok(_libbpf_version, "\n");
+			found = true;
+			break;
+		}
+	}
+
+	fclose(fp);
+out:
+	if (!found)
+		fprintf(stderr, "Couldn't find runtime libbpf version - falling back to compile-time value!\n");
+#endif /* LIBBPF_DYNAMIC */
+
+	_libbpf_version[sizeof(_libbpf_version)-1] = '\0';
+	return _libbpf_version;
+}
+#else
+const char *get_libbpf_version(void)
+{
+	return NULL;
+}
+#endif /* HAVE_LIBBPF */
diff --git a/tc/tc.c b/tc/tc.c
index af9b21da..7557b977 100644
--- a/tc/tc.c
+++ b/tc/tc.c
@@ -30,6 +30,7 @@
 #include "tc_common.h"
 #include "namespace.h"
 #include "rt_names.h"
+#include "bpf_util.h"
 
 int show_stats;
 int show_details;
@@ -259,8 +260,9 @@ static int batch(const char *name)
 
 int main(int argc, char **argv)
 {
-	int ret;
+	const char *libbpf_version;
 	char *batch_file = NULL;
+	int ret;
 
 	while (argc > 1) {
 		if (argv[1][0] != '-')
@@ -277,7 +279,11 @@ int main(int argc, char **argv)
 		} else if (matches(argv[1], "-graph") == 0) {
 			show_graph = 1;
 		} else if (matches(argv[1], "-Version") == 0) {
-			printf("tc utility, iproute2-%s\n", version);
+			printf("tc utility, iproute2-%s", version);
+			libbpf_version = get_libbpf_version();
+			if (libbpf_version)
+				printf(", libbpf %s", libbpf_version);
+			printf("\n");
 			return 0;
 		} else if (matches(argv[1], "-iec") == 0) {
 			++use_iec;
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv6 iproute2-next 2/5] lib: make ipvrf able to use libbpf and fix function name conflicts
  2020-11-23 13:11         ` [PATCHv6 " Hangbin Liu
  2020-11-23 13:11           ` [PATCHv6 iproute2-next 1/5] iproute2: add check_libbpf() and get_libbpf_version() Hangbin Liu
@ 2020-11-23 13:11           ` Hangbin Liu
  2020-11-23 13:11           ` [PATCHv6 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
                             ` (4 subsequent siblings)
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-23 13:11 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Jesper Dangaard Brouer,
	Alexei Starovoitov, Hangbin Liu

There are directly calls in libbpf for bpf program load/attach.
So we could just use two wrapper functions for ipvrf and convert
them with libbpf support.

Function bpf_prog_load() is removed as it's conflict with libbpf
function name.

bpf.c is moved to bpf_legacy.c for later main libbpf support in
iproute2.

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
v6: bpf_glue.c is created in previous patch. So I changed the commit
    description.
v5: Fix bpf_prog_load_dev typo.
v4: Add new file bpf_glue.c
v2-v3: no update
---
 include/bpf_util.h          | 10 +++++++---
 ip/ipvrf.c                  |  6 +++---
 lib/Makefile                |  2 +-
 lib/bpf_glue.c              | 23 +++++++++++++++++++++++
 lib/{bpf.c => bpf_legacy.c} | 15 +++------------
 5 files changed, 37 insertions(+), 19 deletions(-)
 rename lib/{bpf.c => bpf_legacy.c} (99%)

diff --git a/include/bpf_util.h b/include/bpf_util.h
index dee5bb02..3235c34e 100644
--- a/include/bpf_util.h
+++ b/include/bpf_util.h
@@ -274,12 +274,16 @@ int bpf_trace_pipe(void);
 
 void bpf_print_ops(struct rtattr *bpf_ops, __u16 len);
 
-int bpf_prog_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-		  size_t size_insns, const char *license, char *log,
-		  size_t size_log);
+int bpf_prog_load_dev(enum bpf_prog_type type, const struct bpf_insn *insns,
+		      size_t size_insns, const char *license, __u32 ifindex,
+		      char *log, size_t size_log);
+int bpf_program_load(enum bpf_prog_type type, const struct bpf_insn *insns,
+		     size_t size_insns, const char *license, char *log,
+		     size_t size_log);
 
 int bpf_prog_attach_fd(int prog_fd, int target_fd, enum bpf_attach_type type);
 int bpf_prog_detach_fd(int target_fd, enum bpf_attach_type type);
+int bpf_program_attach(int prog_fd, int target_fd, enum bpf_attach_type type);
 
 int bpf_dump_prog_info(FILE *f, uint32_t id);
 
diff --git a/ip/ipvrf.c b/ip/ipvrf.c
index 28dd8e25..42779e5c 100644
--- a/ip/ipvrf.c
+++ b/ip/ipvrf.c
@@ -256,8 +256,8 @@ static int prog_load(int idx)
 		BPF_EXIT_INSN(),
 	};
 
-	return bpf_prog_load(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
-			     "GPL", bpf_log_buf, sizeof(bpf_log_buf));
+	return bpf_program_load(BPF_PROG_TYPE_CGROUP_SOCK, prog, sizeof(prog),
+			        "GPL", bpf_log_buf, sizeof(bpf_log_buf));
 }
 
 static int vrf_configure_cgroup(const char *path, int ifindex)
@@ -288,7 +288,7 @@ static int vrf_configure_cgroup(const char *path, int ifindex)
 		goto out;
 	}
 
-	if (bpf_prog_attach_fd(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE)) {
+	if (bpf_program_attach(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE)) {
 		fprintf(stderr, "Failed to attach prog to cgroup: '%s'\n",
 			strerror(errno));
 		goto out;
diff --git a/lib/Makefile b/lib/Makefile
index a02775a5..7c8a197c 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -5,7 +5,7 @@ CFLAGS += -fPIC
 
 UTILOBJ = utils.o rt_names.o ll_map.o ll_types.o ll_proto.o ll_addr.o \
 	inet_proto.o namespace.o json_writer.o json_print.o \
-	names.o color.o bpf.o bpf_glue.o exec.o fs.o cg_map.o
+	names.o color.o bpf_legacy.o bpf_glue.o exec.o fs.o cg_map.o
 
 NLOBJ=libgenl.o libnetlink.o mnl_utils.o
 
diff --git a/lib/bpf_glue.c b/lib/bpf_glue.c
index 67c41c22..fa609bfe 100644
--- a/lib/bpf_glue.c
+++ b/lib/bpf_glue.c
@@ -5,6 +5,29 @@
  *
  */
 #include "bpf_util.h"
+#ifdef HAVE_LIBBPF
+#include <bpf/bpf.h>
+#endif
+
+int bpf_program_load(enum bpf_prog_type type, const struct bpf_insn *insns,
+		     size_t size_insns, const char *license, char *log,
+		     size_t size_log)
+{
+#ifdef HAVE_LIBBPF
+	return bpf_load_program(type, insns, size_insns, license, 0, log, size_log);
+#else
+	return bpf_prog_load_dev(type, insns, size_insns, license, 0, log, size_log);
+#endif
+}
+
+int bpf_program_attach(int prog_fd, int target_fd, enum bpf_attach_type type)
+{
+#ifdef HAVE_LIBBPF
+	return bpf_prog_attach(prog_fd, target_fd, type, 0);
+#else
+	return bpf_prog_attach_fd(prog_fd, target_fd, type);
+#endif
+}
 
 #ifdef HAVE_LIBBPF
 static const char *_libbpf_compile_version = LIBBPF_VERSION;
diff --git a/lib/bpf.c b/lib/bpf_legacy.c
similarity index 99%
rename from lib/bpf.c
rename to lib/bpf_legacy.c
index c7d45077..4246fb76 100644
--- a/lib/bpf.c
+++ b/lib/bpf_legacy.c
@@ -1087,10 +1087,9 @@ int bpf_prog_detach_fd(int target_fd, enum bpf_attach_type type)
 	return bpf(BPF_PROG_DETACH, &attr, sizeof(attr));
 }
 
-static int bpf_prog_load_dev(enum bpf_prog_type type,
-			     const struct bpf_insn *insns, size_t size_insns,
-			     const char *license, __u32 ifindex,
-			     char *log, size_t size_log)
+int bpf_prog_load_dev(enum bpf_prog_type type, const struct bpf_insn *insns,
+		      size_t size_insns, const char *license, __u32 ifindex,
+		      char *log, size_t size_log)
 {
 	union bpf_attr attr = {};
 
@@ -1109,14 +1108,6 @@ static int bpf_prog_load_dev(enum bpf_prog_type type,
 	return bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
 }
 
-int bpf_prog_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-		  size_t size_insns, const char *license, char *log,
-		  size_t size_log)
-{
-	return bpf_prog_load_dev(type, insns, size_insns, license, 0,
-				 log, size_log);
-}
-
 #ifdef HAVE_ELF
 struct bpf_elf_prog {
 	enum bpf_prog_type	type;
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv6 iproute2-next 3/5] lib: add libbpf support
  2020-11-23 13:11         ` [PATCHv6 " Hangbin Liu
  2020-11-23 13:11           ` [PATCHv6 iproute2-next 1/5] iproute2: add check_libbpf() and get_libbpf_version() Hangbin Liu
  2020-11-23 13:11           ` [PATCHv6 iproute2-next 2/5] lib: make ipvrf able to use libbpf and fix function name conflicts Hangbin Liu
@ 2020-11-23 13:11           ` Hangbin Liu
  2020-11-23 13:12           ` [PATCHv6 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
                             ` (3 subsequent siblings)
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-23 13:11 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Jesper Dangaard Brouer,
	Alexei Starovoitov, Hangbin Liu

This patch converts iproute2 to use libbpf for loading and attaching
BPF programs when it is available, which is started by Toke's
implementation[1]. With libbpf iproute2 could correctly process BTF
information and support the new-style BTF-defined maps, while keeping
compatibility with the old internal map definition syntax.

The old iproute2 bpf code is kept and will be used if no suitable libbpf
is available. When using libbpf, wrapper code in bpf_legacy.c ensures that
iproute2 will still understand the old map definition format, including
populating map-in-map and tail call maps before load.

In bpf_libbpf.c, we init iproute2 ctx and elf info first to check the
legacy bytes. When handling the legacy maps, for map-in-maps, we create
them manually and re-use the fd as they are associated with id/inner_id.
For pin maps, we only set the pin path and let libbp load to handle it.
For tail calls, we find it first and update the element after prog load.

Other maps/progs will be loaded by libbpf directly.

[1] https://lore.kernel.org/bpf/20190820114706.18546-1-toke@redhat.com/

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
v6: also make bpf_libbpf.c build depend on HAVE_ELF

v5: no update

v4:
Move ipvrf code to patch 02
Move HAVE_LIBBPF inside HAVE_ELF definition as libbpf depends on elf.

v3:
Add a new function get_bpf_program__section_name() to choose whether
use bpf_program__title() or not.

v2:
Remove self defined IS_ERR_OR_NULL and use libbpf_get_error() instead.
Add ipvrf with libbpf support.
---
 include/bpf_util.h |  17 +++
 lib/Makefile       |   6 +
 lib/bpf_legacy.c   | 178 +++++++++++++++++++++++
 lib/bpf_libbpf.c   | 348 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 549 insertions(+)
 create mode 100644 lib/bpf_libbpf.c

diff --git a/include/bpf_util.h b/include/bpf_util.h
index 3235c34e..53acc410 100644
--- a/include/bpf_util.h
+++ b/include/bpf_util.h
@@ -291,6 +291,16 @@ int bpf_dump_prog_info(FILE *f, uint32_t id);
 int bpf_send_map_fds(const char *path, const char *obj);
 int bpf_recv_map_fds(const char *path, int *fds, struct bpf_map_aux *aux,
 		     unsigned int entries);
+#ifdef HAVE_LIBBPF
+int iproute2_bpf_elf_ctx_init(struct bpf_cfg_in *cfg);
+int iproute2_bpf_fetch_ancillary(void);
+int iproute2_get_root_path(char *root_path, size_t len);
+bool iproute2_is_pin_map(const char *libbpf_map_name, char *pathname);
+bool iproute2_is_map_in_map(const char *libbpf_map_name, struct bpf_elf_map *imap,
+			    struct bpf_elf_map *omap, char *omap_name);
+int iproute2_find_map_name_by_id(unsigned int map_id, char *name);
+int iproute2_load_libbpf(struct bpf_cfg_in *cfg);
+#endif /* HAVE_LIBBPF */
 #else
 static inline int bpf_send_map_fds(const char *path, const char *obj)
 {
@@ -303,6 +313,13 @@ static inline int bpf_recv_map_fds(const char *path, int *fds,
 {
 	return -1;
 }
+#ifdef HAVE_LIBBPF
+static inline int iproute2_load_libbpf(struct bpf_cfg_in *cfg)
+{
+	fprintf(stderr, "No ELF library support compiled in.\n");
+	return -1;
+}
+#endif /* HAVE_LIBBPF */
 #endif /* HAVE_ELF */
 
 const char *get_libbpf_version(void);
diff --git a/lib/Makefile b/lib/Makefile
index 7c8a197c..e37585c6 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -7,6 +7,12 @@ UTILOBJ = utils.o rt_names.o ll_map.o ll_types.o ll_proto.o ll_addr.o \
 	inet_proto.o namespace.o json_writer.o json_print.o \
 	names.o color.o bpf_legacy.o bpf_glue.o exec.o fs.o cg_map.o
 
+ifeq ($(HAVE_ELF),y)
+ifeq ($(HAVE_LIBBPF),y)
+UTILOBJ += bpf_libbpf.o
+endif
+endif
+
 NLOBJ=libgenl.o libnetlink.o mnl_utils.o
 
 all: libnetlink.a libutil.a
diff --git a/lib/bpf_legacy.c b/lib/bpf_legacy.c
index 4246fb76..bc869c3f 100644
--- a/lib/bpf_legacy.c
+++ b/lib/bpf_legacy.c
@@ -940,6 +940,9 @@ static int bpf_do_parse(struct bpf_cfg_in *cfg, const bool *opt_tbl)
 static int bpf_do_load(struct bpf_cfg_in *cfg)
 {
 	if (cfg->mode == EBPF_OBJECT) {
+#ifdef HAVE_LIBBPF
+		return iproute2_load_libbpf(cfg);
+#endif
 		cfg->prog_fd = bpf_obj_open(cfg->object, cfg->type,
 					    cfg->section, cfg->ifindex,
 					    cfg->verbose);
@@ -3155,4 +3158,179 @@ int bpf_recv_map_fds(const char *path, int *fds, struct bpf_map_aux *aux,
 	close(fd);
 	return ret;
 }
+
+#ifdef HAVE_LIBBPF
+/* The following functions are wrapper functions for libbpf code to be
+ * compatible with the legacy format. So all the functions have prefix
+ * with iproute2_
+ */
+int iproute2_bpf_elf_ctx_init(struct bpf_cfg_in *cfg)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+
+	return bpf_elf_ctx_init(ctx, cfg->object, cfg->type, cfg->ifindex, cfg->verbose);
+}
+
+int iproute2_bpf_fetch_ancillary(void)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	struct bpf_elf_sec_data data;
+	int i, ret = 0;
+
+	for (i = 1; i < ctx->elf_hdr.e_shnum; i++) {
+		ret = bpf_fill_section_data(ctx, i, &data);
+		if (ret < 0)
+			continue;
+
+		if (data.sec_hdr.sh_type == SHT_PROGBITS &&
+		    !strcmp(data.sec_name, ELF_SECTION_MAPS))
+			ret = bpf_fetch_maps_begin(ctx, i, &data);
+		else if (data.sec_hdr.sh_type == SHT_SYMTAB &&
+			 !strcmp(data.sec_name, ".symtab"))
+			ret = bpf_fetch_symtab(ctx, i, &data);
+		else if (data.sec_hdr.sh_type == SHT_STRTAB &&
+			 !strcmp(data.sec_name, ".strtab"))
+			ret = bpf_fetch_strtab(ctx, i, &data);
+		if (ret < 0) {
+			fprintf(stderr, "Error parsing section %d! Perhaps check with readelf -a?\n",
+				i);
+			return ret;
+		}
+	}
+
+	if (bpf_has_map_data(ctx)) {
+		ret = bpf_fetch_maps_end(ctx);
+		if (ret < 0) {
+			fprintf(stderr, "Error fixing up map structure, incompatible struct bpf_elf_map used?\n");
+			return ret;
+		}
+	}
+
+	return ret;
+}
+
+int iproute2_get_root_path(char *root_path, size_t len)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	int ret = 0;
+
+	snprintf(root_path, len, "%s/%s",
+		 bpf_get_work_dir(ctx->type), BPF_DIR_GLOBALS);
+
+	ret = mkdir(root_path, S_IRWXU);
+	if (ret && errno != EEXIST) {
+		fprintf(stderr, "mkdir %s failed: %s\n", root_path, strerror(errno));
+		return ret;
+	}
+
+	return 0;
+}
+
+bool iproute2_is_pin_map(const char *libbpf_map_name, char *pathname)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *map_name, *tmp;
+	unsigned int pinning;
+	int i, ret = 0;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		if (ctx->maps[i].pinning == PIN_OBJECT_NS &&
+		    ctx->noafalg) {
+			fprintf(stderr, "Missing kernel AF_ALG support for PIN_OBJECT_NS!\n");
+			return false;
+		}
+
+		map_name = bpf_map_fetch_name(ctx, i);
+		if (!map_name) {
+			return false;
+		}
+
+		if (strcmp(libbpf_map_name, map_name))
+			continue;
+
+		pinning = ctx->maps[i].pinning;
+
+		if (bpf_no_pinning(ctx, pinning) || !bpf_get_work_dir(ctx->type))
+			return false;
+
+		if (pinning == PIN_OBJECT_NS)
+			ret = bpf_make_obj_path(ctx);
+		else if ((tmp = bpf_custom_pinning(ctx, pinning)))
+			ret = bpf_make_custom_path(ctx, tmp);
+		if (ret < 0)
+			return false;
+
+		bpf_make_pathname(pathname, PATH_MAX, map_name, ctx, pinning);
+
+		return true;
+	}
+
+	return false;
+}
+
+bool iproute2_is_map_in_map(const char *libbpf_map_name, struct bpf_elf_map *imap,
+			    struct bpf_elf_map *omap, char *omap_name)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *inner_map_name, *outer_map_name;
+	int i, j;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		inner_map_name = bpf_map_fetch_name(ctx, i);
+		if (!inner_map_name) {
+			return false;
+		}
+
+		if (strcmp(libbpf_map_name, inner_map_name))
+			continue;
+
+		if (!ctx->maps[i].id ||
+		    ctx->maps[i].inner_id ||
+		    ctx->maps[i].inner_idx == -1)
+			continue;
+
+		*imap = ctx->maps[i];
+
+		for (j = 0; j < ctx->map_num; j++) {
+			if (!bpf_is_map_in_map_type(&ctx->maps[j]))
+				continue;
+			if (ctx->maps[j].inner_id != ctx->maps[i].id)
+				continue;
+
+			*omap = ctx->maps[j];
+			outer_map_name = bpf_map_fetch_name(ctx, j);
+			memcpy(omap_name, outer_map_name, strlen(outer_map_name) + 1);
+
+			return true;
+		}
+	}
+
+	return false;
+}
+
+int iproute2_find_map_name_by_id(unsigned int map_id, char *name)
+{
+	struct bpf_elf_ctx *ctx = &__ctx;
+	const char *map_name;
+	int i, idx = -1;
+
+	for (i = 0; i < ctx->map_num; i++) {
+		if (ctx->maps[i].id == map_id &&
+		    ctx->maps[i].type == BPF_MAP_TYPE_PROG_ARRAY) {
+			idx = i;
+			break;
+		}
+	}
+
+	if (idx < 0)
+		return -1;
+
+	map_name = bpf_map_fetch_name(ctx, idx);
+	if (!map_name)
+		return -1;
+
+	memcpy(name, map_name, strlen(map_name) + 1);
+	return 0;
+}
+#endif /* HAVE_LIBBPF */
 #endif /* HAVE_ELF */
diff --git a/lib/bpf_libbpf.c b/lib/bpf_libbpf.c
new file mode 100644
index 00000000..d05737a4
--- /dev/null
+++ b/lib/bpf_libbpf.c
@@ -0,0 +1,348 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * bpf_libbpf.c		BPF code relay on libbpf
+ * Authors:		Hangbin Liu <haliu@redhat.com>
+ *
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <errno.h>
+#include <fcntl.h>
+
+#include <libelf.h>
+#include <gelf.h>
+
+#include <bpf/libbpf.h>
+#include <bpf/bpf.h>
+
+#include "bpf_util.h"
+
+static int verbose_print(enum libbpf_print_level level, const char *format, va_list args)
+{
+	return vfprintf(stderr, format, args);
+}
+
+static int silent_print(enum libbpf_print_level level, const char *format, va_list args)
+{
+	if (level > LIBBPF_WARN)
+		return 0;
+
+	/* Skip warning from bpf_object__init_user_maps() for legacy maps */
+	if (strstr(format, "has unrecognized, non-zero options"))
+		return 0;
+
+	return vfprintf(stderr, format, args);
+}
+
+static const char *get_bpf_program__section_name(const struct bpf_program *prog)
+{
+#ifdef HAVE_LIBBPF_SECTION_NAME
+	return bpf_program__section_name(prog);
+#else
+	return bpf_program__title(prog, false);
+#endif
+}
+
+static int create_map(const char *name, struct bpf_elf_map *map,
+		      __u32 ifindex, int inner_fd)
+{
+	struct bpf_create_map_attr map_attr = {};
+
+	map_attr.name = name;
+	map_attr.map_type = map->type;
+	map_attr.map_flags = map->flags;
+	map_attr.key_size = map->size_key;
+	map_attr.value_size = map->size_value;
+	map_attr.max_entries = map->max_elem;
+	map_attr.map_ifindex = ifindex;
+	map_attr.inner_map_fd = inner_fd;
+
+	return bpf_create_map_xattr(&map_attr);
+}
+
+static int create_map_in_map(struct bpf_object *obj, struct bpf_map *map,
+			     struct bpf_elf_map *elf_map, int inner_fd,
+			     bool *reuse_pin_map)
+{
+	char pathname[PATH_MAX];
+	const char *map_name;
+	bool pin_map = false;
+	int map_fd, ret = 0;
+
+	map_name = bpf_map__name(map);
+
+	if (iproute2_is_pin_map(map_name, pathname)) {
+		pin_map = true;
+
+		/* Check if there already has a pinned map */
+		map_fd = bpf_obj_get(pathname);
+		if (map_fd > 0) {
+			if (reuse_pin_map)
+				*reuse_pin_map = true;
+			close(map_fd);
+			return bpf_map__set_pin_path(map, pathname);
+		}
+	}
+
+	map_fd = create_map(map_name, elf_map, bpf_map__ifindex(map), inner_fd);
+	if (map_fd < 0) {
+		fprintf(stderr, "create map %s failed\n", map_name);
+		return map_fd;
+	}
+
+	ret = bpf_map__reuse_fd(map, map_fd);
+	if (ret < 0) {
+		fprintf(stderr, "map %s reuse fd failed\n", map_name);
+		goto err_out;
+	}
+
+	if (pin_map) {
+		ret = bpf_map__set_pin_path(map, pathname);
+		if (ret < 0)
+			goto err_out;
+	}
+
+	return 0;
+err_out:
+	close(map_fd);
+	return ret;
+}
+
+static int
+handle_legacy_map_in_map(struct bpf_object *obj, struct bpf_map *inner_map,
+			 const char *inner_map_name)
+{
+	int inner_fd, outer_fd, inner_idx, ret = 0;
+	struct bpf_elf_map imap, omap;
+	struct bpf_map *outer_map;
+	/* What's the size limit of map name? */
+	char outer_map_name[128];
+	bool reuse_pin_map = false;
+
+	/* Deal with map-in-map */
+	if (iproute2_is_map_in_map(inner_map_name, &imap, &omap, outer_map_name)) {
+		ret = create_map_in_map(obj, inner_map, &imap, -1, NULL);
+		if (ret < 0)
+			return ret;
+
+		inner_fd = bpf_map__fd(inner_map);
+		outer_map = bpf_object__find_map_by_name(obj, outer_map_name);
+		ret = create_map_in_map(obj, outer_map, &omap, inner_fd, &reuse_pin_map);
+		if (ret < 0)
+			return ret;
+
+		if (!reuse_pin_map) {
+			inner_idx = imap.inner_idx;
+			outer_fd = bpf_map__fd(outer_map);
+			ret = bpf_map_update_elem(outer_fd, &inner_idx, &inner_fd, 0);
+			if (ret < 0)
+				fprintf(stderr, "Cannot update inner_idx into outer_map\n");
+		}
+	}
+
+	return ret;
+}
+
+static int find_legacy_tail_calls(struct bpf_program *prog, struct bpf_object *obj)
+{
+	unsigned int map_id, key_id;
+	const char *sec_name;
+	struct bpf_map *map;
+	char map_name[128];
+	int ret;
+
+	/* Handle iproute2 tail call */
+	sec_name = get_bpf_program__section_name(prog);
+	ret = sscanf(sec_name, "%i/%i", &map_id, &key_id);
+	if (ret != 2)
+		return -1;
+
+	ret = iproute2_find_map_name_by_id(map_id, map_name);
+	if (ret < 0) {
+		fprintf(stderr, "unable to find map id %u for tail call\n", map_id);
+		return ret;
+	}
+
+	map = bpf_object__find_map_by_name(obj, map_name);
+	if (!map)
+		return -1;
+
+	/* Save the map here for later updating */
+	bpf_program__set_priv(prog, map, NULL);
+
+	return 0;
+}
+
+static int update_legacy_tail_call_maps(struct bpf_object *obj)
+{
+	int prog_fd, map_fd, ret = 0;
+	unsigned int map_id, key_id;
+	struct bpf_program *prog;
+	const char *sec_name;
+	struct bpf_map *map;
+
+	bpf_object__for_each_program(prog, obj) {
+		map = bpf_program__priv(prog);
+		if (!map)
+			continue;
+
+		prog_fd = bpf_program__fd(prog);
+		if (prog_fd < 0)
+			continue;
+
+		sec_name = get_bpf_program__section_name(prog);
+		ret = sscanf(sec_name, "%i/%i", &map_id, &key_id);
+		if (ret != 2)
+			continue;
+
+		map_fd = bpf_map__fd(map);
+		ret = bpf_map_update_elem(map_fd, &key_id, &prog_fd, 0);
+		if (ret < 0) {
+			fprintf(stderr, "Cannot update map key for tail call!\n");
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+static int handle_legacy_maps(struct bpf_object *obj)
+{
+	char pathname[PATH_MAX];
+	struct bpf_map *map;
+	const char *map_name;
+	int map_fd, ret = 0;
+
+	bpf_object__for_each_map(map, obj) {
+		map_name = bpf_map__name(map);
+
+		ret = handle_legacy_map_in_map(obj, map, map_name);
+		if (ret)
+			return ret;
+
+		/* If it is a iproute2 legacy pin maps, just set pin path
+		 * and let bpf_object__load() to deal with the map creation.
+		 * We need to ignore map-in-maps which have pinned maps manually
+		 */
+		map_fd = bpf_map__fd(map);
+		if (map_fd < 0 && iproute2_is_pin_map(map_name, pathname)) {
+			ret = bpf_map__set_pin_path(map, pathname);
+			if (ret) {
+				fprintf(stderr, "map '%s': couldn't set pin path.\n", map_name);
+				break;
+			}
+		}
+
+	}
+
+	return ret;
+}
+
+static int load_bpf_object(struct bpf_cfg_in *cfg)
+{
+	struct bpf_program *p, *prog = NULL;
+	struct bpf_object *obj;
+	char root_path[PATH_MAX];
+	struct bpf_map *map;
+	int prog_fd, ret = 0;
+
+	ret = iproute2_get_root_path(root_path, PATH_MAX);
+	if (ret)
+		return ret;
+
+	DECLARE_LIBBPF_OPTS(bpf_object_open_opts, open_opts,
+			.relaxed_maps = true,
+			.pin_root_path = root_path,
+	);
+
+	obj = bpf_object__open_file(cfg->object, &open_opts);
+	if (libbpf_get_error(obj)) {
+		fprintf(stderr, "ERROR: opening BPF object file failed\n");
+		return -ENOENT;
+	}
+
+	bpf_object__for_each_program(p, obj) {
+		/* Only load the programs that will either be subsequently
+		 * attached or inserted into a tail call map */
+		if (find_legacy_tail_calls(p, obj) < 0 && cfg->section &&
+		    strcmp(get_bpf_program__section_name(p), cfg->section)) {
+			ret = bpf_program__set_autoload(p, false);
+			if (ret)
+				return -EINVAL;
+			continue;
+		}
+
+		bpf_program__set_type(p, cfg->type);
+		bpf_program__set_ifindex(p, cfg->ifindex);
+		if (!prog)
+			prog = p;
+	}
+
+	bpf_object__for_each_map(map, obj) {
+		if (!bpf_map__is_offload_neutral(map))
+			bpf_map__set_ifindex(map, cfg->ifindex);
+	}
+
+	if (!prog) {
+		fprintf(stderr, "object file doesn't contain sec %s\n", cfg->section);
+		return -ENOENT;
+	}
+
+	/* Handle iproute2 legacy pin maps and map-in-maps */
+	ret = handle_legacy_maps(obj);
+	if (ret)
+		goto unload_obj;
+
+	ret = bpf_object__load(obj);
+	if (ret)
+		goto unload_obj;
+
+	ret = update_legacy_tail_call_maps(obj);
+	if (ret)
+		goto unload_obj;
+
+	prog_fd = fcntl(bpf_program__fd(prog), F_DUPFD_CLOEXEC, 1);
+	if (prog_fd < 0)
+		ret = -errno;
+	else
+		cfg->prog_fd = prog_fd;
+
+unload_obj:
+	/* Close obj as we don't need it */
+	bpf_object__close(obj);
+	return ret;
+}
+
+/* Load ebpf and return prog fd */
+int iproute2_load_libbpf(struct bpf_cfg_in *cfg)
+{
+	int ret = 0;
+
+	if (cfg->verbose)
+		libbpf_set_print(verbose_print);
+	else
+		libbpf_set_print(silent_print);
+
+	ret = iproute2_bpf_elf_ctx_init(cfg);
+	if (ret < 0) {
+		fprintf(stderr, "Cannot initialize ELF context!\n");
+		return ret;
+	}
+
+	ret = iproute2_bpf_fetch_ancillary();
+	if (ret < 0) {
+		fprintf(stderr, "Error fetching ELF ancillary data!\n");
+		return ret;
+	}
+
+	ret = load_bpf_object(cfg);
+	if (ret)
+		return ret;
+
+	return cfg->prog_fd;
+}
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv6 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder
  2020-11-23 13:11         ` [PATCHv6 " Hangbin Liu
                             ` (2 preceding siblings ...)
  2020-11-23 13:11           ` [PATCHv6 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
@ 2020-11-23 13:12           ` Hangbin Liu
  2020-11-23 13:12           ` [PATCHv6 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
                             ` (2 subsequent siblings)
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-23 13:12 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Jesper Dangaard Brouer,
	Alexei Starovoitov, Hangbin Liu

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 examples/bpf/README                        | 14 +++++++++-----
 examples/bpf/{ => legacy}/bpf_cyclic.c     |  2 +-
 examples/bpf/{ => legacy}/bpf_graft.c      |  2 +-
 examples/bpf/{ => legacy}/bpf_map_in_map.c |  2 +-
 examples/bpf/{ => legacy}/bpf_shared.c     |  2 +-
 examples/bpf/{ => legacy}/bpf_tailcall.c   |  2 +-
 6 files changed, 14 insertions(+), 10 deletions(-)
 rename examples/bpf/{ => legacy}/bpf_cyclic.c (95%)
 rename examples/bpf/{ => legacy}/bpf_graft.c (97%)
 rename examples/bpf/{ => legacy}/bpf_map_in_map.c (96%)
 rename examples/bpf/{ => legacy}/bpf_shared.c (97%)
 rename examples/bpf/{ => legacy}/bpf_tailcall.c (98%)

diff --git a/examples/bpf/README b/examples/bpf/README
index 1bbdda3f..732bcc83 100644
--- a/examples/bpf/README
+++ b/examples/bpf/README
@@ -1,8 +1,12 @@
 eBPF toy code examples (running in kernel) to familiarize yourself
 with syntax and features:
 
- - bpf_shared.c		-> Ingress/egress map sharing example
- - bpf_tailcall.c	-> Using tail call chains
- - bpf_cyclic.c		-> Simple cycle as tail calls
- - bpf_graft.c		-> Demo on altering runtime behaviour
- - bpf_map_in_map.c     -> Using map in map example
+ - legacy/bpf_shared.c		-> Ingress/egress map sharing example
+ - legacy/bpf_tailcall.c	-> Using tail call chains
+ - legacy/bpf_cyclic.c		-> Simple cycle as tail calls
+ - legacy/bpf_graft.c		-> Demo on altering runtime behaviour
+ - legacy/bpf_map_in_map.c	-> Using map in map example
+
+Note: Users should use new BTF way to defined the maps, the examples
+in legacy folder which is using struct bpf_elf_map defined maps is not
+recommanded.
diff --git a/examples/bpf/bpf_cyclic.c b/examples/bpf/legacy/bpf_cyclic.c
similarity index 95%
rename from examples/bpf/bpf_cyclic.c
rename to examples/bpf/legacy/bpf_cyclic.c
index 11d1c061..33590730 100644
--- a/examples/bpf/bpf_cyclic.c
+++ b/examples/bpf/legacy/bpf_cyclic.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* Cyclic dependency example to test the kernel's runtime upper
  * bound on loops. Also demonstrates on how to use direct-actions,
diff --git a/examples/bpf/bpf_graft.c b/examples/bpf/legacy/bpf_graft.c
similarity index 97%
rename from examples/bpf/bpf_graft.c
rename to examples/bpf/legacy/bpf_graft.c
index 07113d4a..f4c920cc 100644
--- a/examples/bpf/bpf_graft.c
+++ b/examples/bpf/legacy/bpf_graft.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* This example demonstrates how classifier run-time behaviour
  * can be altered with tail calls. We start out with an empty
diff --git a/examples/bpf/bpf_map_in_map.c b/examples/bpf/legacy/bpf_map_in_map.c
similarity index 96%
rename from examples/bpf/bpf_map_in_map.c
rename to examples/bpf/legacy/bpf_map_in_map.c
index ff0e623a..575f8812 100644
--- a/examples/bpf/bpf_map_in_map.c
+++ b/examples/bpf/legacy/bpf_map_in_map.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 #define MAP_INNER_ID	42
 
diff --git a/examples/bpf/bpf_shared.c b/examples/bpf/legacy/bpf_shared.c
similarity index 97%
rename from examples/bpf/bpf_shared.c
rename to examples/bpf/legacy/bpf_shared.c
index 21fe6f1e..05b2b9ef 100644
--- a/examples/bpf/bpf_shared.c
+++ b/examples/bpf/legacy/bpf_shared.c
@@ -1,4 +1,4 @@
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 /* Minimal, stand-alone toy map pinning example:
  *
diff --git a/examples/bpf/bpf_tailcall.c b/examples/bpf/legacy/bpf_tailcall.c
similarity index 98%
rename from examples/bpf/bpf_tailcall.c
rename to examples/bpf/legacy/bpf_tailcall.c
index 161eb606..8ebc554c 100644
--- a/examples/bpf/bpf_tailcall.c
+++ b/examples/bpf/legacy/bpf_tailcall.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#include "../../include/bpf_api.h"
+#include "../../../include/bpf_api.h"
 
 #define ENTRY_INIT	3
 #define ENTRY_0		0
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* [PATCHv6 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps
  2020-11-23 13:11         ` [PATCHv6 " Hangbin Liu
                             ` (3 preceding siblings ...)
  2020-11-23 13:12           ` [PATCHv6 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
@ 2020-11-23 13:12           ` Hangbin Liu
  2020-11-25  5:28           ` [PATCHv6 iproute2-next 0/5] iproute2: add libbpf support David Ahern
  2020-11-25  5:30           ` patchwork-bot+netdevbpf
  6 siblings, 0 replies; 167+ messages in thread
From: Hangbin Liu @ 2020-11-23 13:12 UTC (permalink / raw)
  To: Stephen Hemminger, David Ahern
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Jesper Dangaard Brouer,
	Alexei Starovoitov, Hangbin Liu

Users should try use the new BTF defined maps instead of struct
bpf_elf_map defined maps. The tail call examples are not added yet
as libbpf doesn't currently support declaratively populating tail call
maps.

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Hangbin Liu <haliu@redhat.com>
---
 examples/bpf/README           |  6 ++++
 examples/bpf/bpf_graft.c      | 66 +++++++++++++++++++++++++++++++++++
 examples/bpf/bpf_map_in_map.c | 55 +++++++++++++++++++++++++++++
 examples/bpf/bpf_shared.c     | 53 ++++++++++++++++++++++++++++
 include/bpf_api.h             | 13 +++++++
 5 files changed, 193 insertions(+)
 create mode 100644 examples/bpf/bpf_graft.c
 create mode 100644 examples/bpf/bpf_map_in_map.c
 create mode 100644 examples/bpf/bpf_shared.c

diff --git a/examples/bpf/README b/examples/bpf/README
index 732bcc83..b7261191 100644
--- a/examples/bpf/README
+++ b/examples/bpf/README
@@ -1,6 +1,12 @@
 eBPF toy code examples (running in kernel) to familiarize yourself
 with syntax and features:
 
+- BTF defined map examples
+ - bpf_graft.c		-> Demo on altering runtime behaviour
+ - bpf_shared.c 	-> Ingress/egress map sharing example
+ - bpf_map_in_map.c	-> Using map in map example
+
+- legacy struct bpf_elf_map defined map examples
  - legacy/bpf_shared.c		-> Ingress/egress map sharing example
  - legacy/bpf_tailcall.c	-> Using tail call chains
  - legacy/bpf_cyclic.c		-> Simple cycle as tail calls
diff --git a/examples/bpf/bpf_graft.c b/examples/bpf/bpf_graft.c
new file mode 100644
index 00000000..8066dcce
--- /dev/null
+++ b/examples/bpf/bpf_graft.c
@@ -0,0 +1,66 @@
+#include "../../include/bpf_api.h"
+
+/* This example demonstrates how classifier run-time behaviour
+ * can be altered with tail calls. We start out with an empty
+ * jmp_tc array, then add section aaa to the array slot 0, and
+ * later on atomically replace it with section bbb. Note that
+ * as shown in other examples, the tc loader can prepopulate
+ * tail called sections, here we start out with an empty one
+ * on purpose to show it can also be done this way.
+ *
+ * tc filter add dev foo parent ffff: bpf obj graft.o
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-20229 [001] ..s. 138993.003923: : fallthrough
+ *   <idle>-0            [001] ..s. 138993.202265: : fallthrough
+ *   Socket Thread-20229 [001] ..s. 138994.004149: : fallthrough
+ *   [...]
+ *
+ * tc exec bpf graft m:globals/jmp_tc key 0 obj graft.o sec aaa
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-19818 [002] ..s. 139012.053587: : aaa
+ *   <idle>-0            [002] ..s. 139012.172359: : aaa
+ *   Socket Thread-19818 [001] ..s. 139012.173556: : aaa
+ *   [...]
+ *
+ * tc exec bpf graft m:globals/jmp_tc key 0 obj graft.o sec bbb
+ * tc exec bpf dbg
+ *   [...]
+ *   Socket Thread-19818 [002] ..s. 139022.102967: : bbb
+ *   <idle>-0            [002] ..s. 139022.155640: : bbb
+ *   Socket Thread-19818 [001] ..s. 139022.156730: : bbb
+ *   [...]
+ */
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);
+} jmp_tc __section(".maps");
+
+__section("aaa")
+int cls_aaa(struct __sk_buff *skb)
+{
+	printt("aaa\n");
+	return TC_H_MAKE(1, 42);
+}
+
+__section("bbb")
+int cls_bbb(struct __sk_buff *skb)
+{
+	printt("bbb\n");
+	return TC_H_MAKE(1, 43);
+}
+
+__section_cls_entry
+int cls_entry(struct __sk_buff *skb)
+{
+	tail_call(skb, &jmp_tc, 0);
+	printt("fallthrough\n");
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/examples/bpf/bpf_map_in_map.c b/examples/bpf/bpf_map_in_map.c
new file mode 100644
index 00000000..39c86268
--- /dev/null
+++ b/examples/bpf/bpf_map_in_map.c
@@ -0,0 +1,55 @@
+#include "../../include/bpf_api.h"
+
+struct inner_map {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+} map_inner __section(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);
+	__array(values, struct inner_map);
+} map_outer __section(".maps") = {
+	.values = {
+		[0] = &map_inner,
+	},
+};
+
+__section("egress")
+int emain(struct __sk_buff *skb)
+{
+	struct bpf_elf_map *map_inner;
+	int key = 0, *val;
+
+	map_inner = map_lookup_elem(&map_outer, &key);
+	if (map_inner) {
+		val = map_lookup_elem(map_inner, &key);
+		if (val)
+			lock_xadd(val, 1);
+	}
+
+	return BPF_H_DEFAULT;
+}
+
+__section("ingress")
+int imain(struct __sk_buff *skb)
+{
+	struct bpf_elf_map *map_inner;
+	int key = 0, *val;
+
+	map_inner = map_lookup_elem(&map_outer, &key);
+	if (map_inner) {
+		val = map_lookup_elem(map_inner, &key);
+		if (val)
+			printt("map val: %d\n", *val);
+	}
+
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/examples/bpf/bpf_shared.c b/examples/bpf/bpf_shared.c
new file mode 100644
index 00000000..99a332f4
--- /dev/null
+++ b/examples/bpf/bpf_shared.c
@@ -0,0 +1,53 @@
+#include "../../include/bpf_api.h"
+
+/* Minimal, stand-alone toy map pinning example:
+ *
+ * clang -target bpf -O2 [...] -o bpf_shared.o -c bpf_shared.c
+ * tc filter add dev foo parent 1: bpf obj bpf_shared.o sec egress
+ * tc filter add dev foo parent ffff: bpf obj bpf_shared.o sec ingress
+ *
+ * Both classifier will share the very same map instance in this example,
+ * so map content can be accessed from ingress *and* egress side!
+ *
+ * This example has a pinning of PIN_OBJECT_NS, so it's private and
+ * thus shared among various program sections within the object.
+ *
+ * A setting of PIN_GLOBAL_NS would place it into a global namespace,
+ * so that it can be shared among different object files. A setting
+ * of PIN_NONE (= 0) means no sharing, so each tc invocation a new map
+ * instance is being created.
+ */
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(key_size, sizeof(uint32_t));
+	__uint(value_size, sizeof(uint32_t));
+	__uint(max_entries, 1);
+	__uint(pinning, LIBBPF_PIN_BY_NAME);	/* or LIBBPF_PIN_NONE */
+} map_sh __section(".maps");
+
+__section("egress")
+int emain(struct __sk_buff *skb)
+{
+	int key = 0, *val;
+
+	val = map_lookup_elem(&map_sh, &key);
+	if (val)
+		lock_xadd(val, 1);
+
+	return BPF_H_DEFAULT;
+}
+
+__section("ingress")
+int imain(struct __sk_buff *skb)
+{
+	int key = 0, *val;
+
+	val = map_lookup_elem(&map_sh, &key);
+	if (val)
+		printt("map val: %d\n", *val);
+
+	return BPF_H_DEFAULT;
+}
+
+BPF_LICENSE("GPL");
diff --git a/include/bpf_api.h b/include/bpf_api.h
index 89d3488d..82c47089 100644
--- a/include/bpf_api.h
+++ b/include/bpf_api.h
@@ -19,6 +19,19 @@
 
 #include "bpf_elf.h"
 
+/** libbpf pin type. */
+enum libbpf_pin_type {
+	LIBBPF_PIN_NONE,
+	/* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */
+	LIBBPF_PIN_BY_NAME,
+};
+
+/** Type helper macros. */
+
+#define __uint(name, val) int (*name)[val]
+#define __type(name, val) typeof(val) *name
+#define __array(name, val) typeof(val) *name[]
+
 /** Misc macros. */
 
 #ifndef __stringify
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 167+ messages in thread

* Re: [PATCHv6 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-23 13:11         ` [PATCHv6 " Hangbin Liu
                             ` (4 preceding siblings ...)
  2020-11-23 13:12           ` [PATCHv6 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
@ 2020-11-25  5:28           ` David Ahern
  2020-11-25  5:30           ` patchwork-bot+netdevbpf
  6 siblings, 0 replies; 167+ messages in thread
From: David Ahern @ 2020-11-25  5:28 UTC (permalink / raw)
  To: Hangbin Liu, Stephen Hemminger
  Cc: Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, netdev, bpf, Jiri Benc,
	Toke Høiland-Jørgensen, Jesper Dangaard Brouer,
	Alexei Starovoitov

On 11/23/20 6:11 AM, Hangbin Liu wrote:
> This series converts iproute2 to use libbpf for loading and attaching
> BPF programs when it is available. This means that iproute2 will
> correctly process BTF information and support the new-style BTF-defined
> maps, while keeping compatibility with the old internal map definition
> syntax.
> 


applied to iproute2-next.

Thanks for the detailed cover letter. In the future, please use '$'
instead of '#' for the prompt on the commands or offset the command
lines. Lines starting with '#' are considered comments by git.


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCHv6 iproute2-next 0/5] iproute2: add libbpf support
  2020-11-23 13:11         ` [PATCHv6 " Hangbin Liu
                             ` (5 preceding siblings ...)
  2020-11-25  5:28           ` [PATCHv6 iproute2-next 0/5] iproute2: add libbpf support David Ahern
@ 2020-11-25  5:30           ` patchwork-bot+netdevbpf
  6 siblings, 0 replies; 167+ messages in thread
From: patchwork-bot+netdevbpf @ 2020-11-25  5:30 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: stephen, dsahern, daniel, kafai, songliubraving, yhs, davem,
	netdev, bpf, jbenc, toke, brouer, alexei.starovoitov

Hello:

This series was applied to iproute2/iproute2-next.git (refs/heads/main):

On Mon, 23 Nov 2020 21:11:56 +0800 you wrote:
> This series converts iproute2 to use libbpf for loading and attaching
> BPF programs when it is available. This means that iproute2 will
> correctly process BTF information and support the new-style BTF-defined
> maps, while keeping compatibility with the old internal map definition
> syntax.
> 
> This is achieved by checking for libbpf at './configure' time, and using
> it if available. By default the system libbpf will be used, but static
> linking against a custom libbpf version can be achieved by passing
> LIBBPF_DIR to configure. LIBBPF_FORCE can be set to on to force configure
> abort if no suitable libbpf is found (useful for automatic packaging
> that wants to enforce the dependency), or set off to disable libbpf check
> and build iproute2 with legacy bpf.
> 
> [...]

Here is the summary with links:
  - [PATCHv6,iproute2-next,1/5] iproute2: add check_libbpf() and get_libbpf_version()
    https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=503e9229b054
  - [PATCHv6,iproute2-next,2/5] lib: make ipvrf able to use libbpf and fix function name conflicts
    https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=dc800a4ed4f3
  - [PATCHv6,iproute2-next,3/5] lib: add libbpf support
    https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=6d61a2b55799
  - [PATCHv6,iproute2-next,4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder
    https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=1ac8285a692e
  - [PATCHv6,iproute2-next,5/5] examples/bpf: add bpf examples with BTF defined maps
    https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id=71c7c1fb4ff0

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 0/5] iproute2: add libbpf support
  2020-10-23  3:38 [PATCH iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
                   ` (5 preceding siblings ...)
  2020-10-28 13:25 ` [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
@ 2020-11-29  6:16 ` Stephen Hemminger
  2020-11-29  6:22   ` Greg KH
                     ` (2 more replies)
  6 siblings, 3 replies; 167+ messages in thread
From: Stephen Hemminger @ 2020-11-29  6:16 UTC (permalink / raw)
  To: Hangbin Liu
  Cc: Daniel Borkmann, David Ahern, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Fri, 23 Oct 2020 11:38:50 +0800
Hangbin Liu <haliu@redhat.com> wrote:

> This series converts iproute2 to use libbpf for loading and attaching
> BPF programs when it is available. This means that iproute2 will
> correctly process BTF information and support the new-style BTF-defined
> maps, while keeping compatibility with the old internal map definition
> syntax.
> 
> This is achieved by checking for libbpf at './configure' time, and using
> it if available. By default the system libbpf will be used, but static
> linking against a custom libbpf version can be achieved by passing
> LIBBPF_DIR to configure. FORCE_LIBBPF can be set to force configure to
> abort if no suitable libbpf is found (useful for automatic packaging
> that wants to enforce the dependency).
> 
> The old iproute2 bpf code is kept and will be used if no suitable libbpf
> is available. When using libbpf, wrapper code ensures that iproute2 will
> still understand the old map definition format, including populating
> map-in-map and tail call maps before load.
> 
> The examples in bpf/examples are kept, and a separate set of examples
> are added with BTF-based map definitions for those examples where this
> is possible (libbpf doesn't currently support declaratively populating
> tail call maps).


Luca wants to put this in Debian 11 (good idea), but that means:

1. It has to work with 5.10 release and kernel.
2. Someone has to test it.
3. The 5.10 is a LTS kernel release which means BPF developers have
   to agree to supporting LTS releases.

If someone steps up to doing this then I would be happy to merge it now
for 5.10. Otherwise it won't show up until 5.11.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 0/5] iproute2: add libbpf support
  2020-11-29  6:16 ` [PATCH " Stephen Hemminger
@ 2020-11-29  6:22   ` Greg KH
  2020-11-30 11:39     ` Michal Kubecek
  2020-11-29 17:33   ` Alexei Starovoitov
  2020-11-29 19:41   ` David Ahern
  2 siblings, 1 reply; 167+ messages in thread
From: Greg KH @ 2020-11-29  6:22 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Hangbin Liu, Daniel Borkmann, David Ahern, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen

On Sat, Nov 28, 2020 at 10:16:35PM -0800, Stephen Hemminger wrote:
> On Fri, 23 Oct 2020 11:38:50 +0800
> Hangbin Liu <haliu@redhat.com> wrote:
> 
> > This series converts iproute2 to use libbpf for loading and attaching
> > BPF programs when it is available. This means that iproute2 will
> > correctly process BTF information and support the new-style BTF-defined
> > maps, while keeping compatibility with the old internal map definition
> > syntax.
> > 
> > This is achieved by checking for libbpf at './configure' time, and using
> > it if available. By default the system libbpf will be used, but static
> > linking against a custom libbpf version can be achieved by passing
> > LIBBPF_DIR to configure. FORCE_LIBBPF can be set to force configure to
> > abort if no suitable libbpf is found (useful for automatic packaging
> > that wants to enforce the dependency).
> > 
> > The old iproute2 bpf code is kept and will be used if no suitable libbpf
> > is available. When using libbpf, wrapper code ensures that iproute2 will
> > still understand the old map definition format, including populating
> > map-in-map and tail call maps before load.
> > 
> > The examples in bpf/examples are kept, and a separate set of examples
> > are added with BTF-based map definitions for those examples where this
> > is possible (libbpf doesn't currently support declaratively populating
> > tail call maps).
> 
> 
> Luca wants to put this in Debian 11 (good idea), but that means:
> 
> 1. It has to work with 5.10 release and kernel.
> 2. Someone has to test it.
> 3. The 5.10 is a LTS kernel release which means BPF developers have
>    to agree to supporting LTS releases.

Why would the bpf developers have to support any old releases?  That's
not their responsibility, that's the developers who want to create
stable/lts releases.

> If someone steps up to doing this then I would be happy to merge it now
> for 5.10. Otherwise it won't show up until 5.11.

Don't ever "rush" anything for a LTS/stable release, otherwise I am
going to have to go back to the old way of not announcing them until
_after_ they are released as people throw stuff that is not ready for
a normal merge.

This looks like a new feature, and shouldn't go in right now in the
development cycle anyway, all features for 5.10 had to be in linux-next
before 5.9 was released.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 0/5] iproute2: add libbpf support
  2020-11-29  6:16 ` [PATCH " Stephen Hemminger
  2020-11-29  6:22   ` Greg KH
@ 2020-11-29 17:33   ` Alexei Starovoitov
  2020-11-29 19:41   ` David Ahern
  2 siblings, 0 replies; 167+ messages in thread
From: Alexei Starovoitov @ 2020-11-29 17:33 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Hangbin Liu, Daniel Borkmann, David Ahern, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, David Miller,
	Jesper Dangaard Brouer, Network Development, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Sat, Nov 28, 2020 at 10:16 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Fri, 23 Oct 2020 11:38:50 +0800
> Hangbin Liu <haliu@redhat.com> wrote:
>
> > This series converts iproute2 to use libbpf for loading and attaching
> > BPF programs when it is available. This means that iproute2 will
> > correctly process BTF information and support the new-style BTF-defined
> > maps, while keeping compatibility with the old internal map definition
> > syntax.
> >
> > This is achieved by checking for libbpf at './configure' time, and using
> > it if available. By default the system libbpf will be used, but static
> > linking against a custom libbpf version can be achieved by passing
> > LIBBPF_DIR to configure. FORCE_LIBBPF can be set to force configure to
> > abort if no suitable libbpf is found (useful for automatic packaging
> > that wants to enforce the dependency).
> >
> > The old iproute2 bpf code is kept and will be used if no suitable libbpf
> > is available. When using libbpf, wrapper code ensures that iproute2 will
> > still understand the old map definition format, including populating
> > map-in-map and tail call maps before load.
> >
> > The examples in bpf/examples are kept, and a separate set of examples
> > are added with BTF-based map definitions for those examples where this
> > is possible (libbpf doesn't currently support declaratively populating
> > tail call maps).
>
>
> Luca wants to put this in Debian 11 (good idea), but that means:
>
> 1. It has to work with 5.10 release and kernel.
> 2. Someone has to test it.
> 3. The 5.10 is a LTS kernel release which means BPF developers have
>    to agree to supporting LTS releases.

That must be a bad joke.
You did the opposite of what we asked.
You folks are on your own.
5.10, 5.11 whatever release. When angry users come with questions
about random behavior you'll be answering them. Not us.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 0/5] iproute2: add libbpf support
  2020-11-29  6:16 ` [PATCH " Stephen Hemminger
  2020-11-29  6:22   ` Greg KH
  2020-11-29 17:33   ` Alexei Starovoitov
@ 2020-11-29 19:41   ` David Ahern
  2020-11-30 11:04     ` Toke Høiland-Jørgensen
  2020-12-01 14:22     ` Jesper Dangaard Brouer
  2 siblings, 2 replies; 167+ messages in thread
From: David Ahern @ 2020-11-29 19:41 UTC (permalink / raw)
  To: Stephen Hemminger, Hangbin Liu
  Cc: Daniel Borkmann, Alexei Starovoitov, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Miller, Jesper Dangaard Brouer, netdev, bpf,
	Jiri Benc, Andrii Nakryiko, Toke Høiland-Jørgensen

On 11/28/20 11:16 PM, Stephen Hemminger wrote:
> Luca wants to put this in Debian 11 (good idea), but that means:
> 
> 1. It has to work with 5.10 release and kernel.
> 2. Someone has to test it.
> 3. The 5.10 is a LTS kernel release which means BPF developers have
>    to agree to supporting LTS releases.
> 
> If someone steps up to doing this then I would be happy to merge it now
> for 5.10. Otherwise it won't show up until 5.11.

It would be good for Bullseye to have the option to use libbpf with
iproute2. If Debian uses the 5.10 kernel then it should use the 5.10
version of iproute2 and 5.10 version libbpf. All the components align
with consistent versioning.

I have some use cases I can move from bpftool loading to iproute2 as
additional testing to what Hangbin has already done. If that goes well,
I can re-send the patch series against iproute2-main branch by next weekend.

It would be good for others (Jesper, Toke, Jiri) to run their own
testing as well.

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 0/5] iproute2: add libbpf support
  2020-11-29 19:41   ` David Ahern
@ 2020-11-30 11:04     ` Toke Høiland-Jørgensen
  2020-12-01 14:22     ` Jesper Dangaard Brouer
  1 sibling, 0 replies; 167+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-11-30 11:04 UTC (permalink / raw)
  To: David Ahern, Stephen Hemminger, Hangbin Liu
  Cc: Daniel Borkmann, Alexei Starovoitov, Martin KaFai Lau, Song Liu,
	Yonghong Song, David Miller, Jesper Dangaard Brouer, netdev, bpf,
	Jiri Benc, Andrii Nakryiko

David Ahern <dsahern@gmail.com> writes:

> On 11/28/20 11:16 PM, Stephen Hemminger wrote:
>> Luca wants to put this in Debian 11 (good idea), but that means:
>> 
>> 1. It has to work with 5.10 release and kernel.
>> 2. Someone has to test it.
>> 3. The 5.10 is a LTS kernel release which means BPF developers have
>>    to agree to supporting LTS releases.
>> 
>> If someone steps up to doing this then I would be happy to merge it now
>> for 5.10. Otherwise it won't show up until 5.11.
>
> It would be good for Bullseye to have the option to use libbpf with
> iproute2. If Debian uses the 5.10 kernel then it should use the 5.10
> version of iproute2 and 5.10 version libbpf. All the components align
> with consistent versioning.
>
> I have some use cases I can move from bpftool loading to iproute2 as
> additional testing to what Hangbin has already done. If that goes well,
> I can re-send the patch series against iproute2-main branch by next weekend.

This is fine by me - there's nothing in the iproute2 patches that
depends on any particular version of libbpf newer than 0.1.0 (that was
the whole point), so it's just a matter of when you guys want to merge
it.

> It would be good for others (Jesper, Toke, Jiri) to run their own
> testing as well.

I'll do some manual testing, and once we get this into RHEL it'll be
part of automated testing there as well. The latter may take a while,
though, so don't count on it for any initial verification...

-Toke


^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 0/5] iproute2: add libbpf support
  2020-11-29  6:22   ` Greg KH
@ 2020-11-30 11:39     ` Michal Kubecek
  0 siblings, 0 replies; 167+ messages in thread
From: Michal Kubecek @ 2020-11-30 11:39 UTC (permalink / raw)
  To: Greg KH
  Cc: Stephen Hemminger, Hangbin Liu, Daniel Borkmann, David Ahern,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, Jesper Dangaard Brouer, netdev, bpf, Jiri Benc,
	Andrii Nakryiko, Toke Høiland-Jørgensen

On Sun, Nov 29, 2020 at 07:22:54AM +0100, Greg KH wrote:
> On Sat, Nov 28, 2020 at 10:16:35PM -0800, Stephen Hemminger wrote:
> 
> > If someone steps up to doing this then I would be happy to merge it now
> > for 5.10. Otherwise it won't show up until 5.11.
> 
> Don't ever "rush" anything for a LTS/stable release, otherwise I am
> going to have to go back to the old way of not announcing them until
> _after_ they are released as people throw stuff that is not ready for
> a normal merge.
> 
> This looks like a new feature, and shouldn't go in right now in the
> development cycle anyway, all features for 5.10 had to be in linux-next
> before 5.9 was released.

From the context, I believe Stephen meant merging into iproute2 5.10,
not kernel.

Michal Kubecek

^ permalink raw reply	[flat|nested] 167+ messages in thread

* Re: [PATCH iproute2-next 0/5] iproute2: add libbpf support
  2020-11-29 19:41   ` David Ahern
  2020-11-30 11:04     ` Toke Høiland-Jørgensen
@ 2020-12-01 14:22     ` Jesper Dangaard Brouer
  1 sibling, 0 replies; 167+ messages in thread
From: Jesper Dangaard Brouer @ 2020-12-01 14:22 UTC (permalink / raw)
  To: David Ahern
  Cc: Stephen Hemminger, Hangbin Liu, Daniel Borkmann,
	Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song,
	David Miller, netdev, bpf, Jiri Benc, Andrii Nakryiko,
	Toke Høiland-Jørgensen, brouer

On Sun, 29 Nov 2020 12:41:49 -0700
David Ahern <dsahern@gmail.com> wrote:

> On 11/28/20 11:16 PM, Stephen Hemminger wrote:
> > Luca wants to put this in Debian 11 (good idea), but that means:
> > 
> > 1. It has to work with 5.10 release and kernel.
> > 2. Someone has to test it.
> > 3. The 5.10 is a LTS kernel release which means BPF developers have
> >    to agree to supporting LTS releases.
> > 
> > If someone steps up to doing this then I would be happy to merge it now
> > for 5.10. Otherwise it won't show up until 5.11.  
> 
> It would be good for Bullseye to have the option to use libbpf with
> iproute2. If Debian uses the 5.10 kernel then it should use the 5.10
> version of iproute2 and 5.10 version libbpf. All the components align
> with consistent versioning.
> 
> I have some use cases I can move from bpftool loading to iproute2 as
> additional testing to what Hangbin has already done. If that goes well,
> I can re-send the patch series against iproute2-main branch by next weekend.
> 
> It would be good for others (Jesper, Toke, Jiri) to run their own
> testing as well.

I have tested this on a Ubuntu 20.04.1 LTS.

I had to compile tc my own "old" version (based it on iproute2 git
tree), because Ubuntu vendor tc util version didn't even support loading
BPF-ELF objects... weird!

Copy-pasted by compile instruction below signature (including one
failure, that people can find via Google search).

I tested difference combinations old vs. new loader with map pinning
and reuse of maps (as instructed by Toke over IRC), all the cases
worked.

I took it one step further and implemented tc libbpf detection:
 https://github.com/netoptimizer/bpf-examples/commit/048c960756eb65

So, my EDT-pacing code[1] now support BTF-maps, via configure detection
and code gets compiled with support, which allows me to inspect the
content really easily (data from production system):

$ bpftool map lookup id 1351 key 0x10 0x0 0x0 0x0
{
    "key": 16,
    "value": {
        "rate": 0,
        "t_last": 3299496947649930,
        "t_horizon_drop": 0,
        "t_horizon_ecn": 0,
        "codel": {
            "first_above_time": 3299496641781522,
            "drop_next": 3299497041788432,
            "count": 9,
            "dropping": 1
        }
    }
}

[1] https://github.com/netoptimizer/bpf-examples/tree/master/traffic-pacing-edt
- - 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Very recently iproute2 got support for using libbpf as BPF-ELF loader.

Testing this on Ubuntu 20.04.1 LTS.

Currently avail is iproute2-next tree:
- https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/
- git clone git://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git


First get libbpf:
  git clone https://github.com/libbpf/libbpf.git
  cd libbpf

Build libbpf and install it locally:

  cd ~/git/libbpf/
  mkdir build
  cd ~/git/libbpf/src
  DESTDIR=../build make install
  DESTDIR=../build make install_headers


Attempt#1: Try to get iproute2 compiling against:

  cd ~/git/iproute2-next
  $ LIBBPF_DIR=../libbpf/build/ ./configure 
  TC schedulers
   ATM	no
  
  libc has setns: yes
  SELinux support: no
  libbpf support: yes
  	libbpf version 0.3.0
  ELF support: yes
  libmnl support: yes
  Berkeley DB: no
  need for strlcpy: no
  libcap support: no

Make fails:
  $ make

  lib
      CC       bpf_libbpf.o
  bpf_libbpf.c:20:10: fatal error: bpf/libbpf.h: No such file or directory
     20 | #include <bpf/libbpf.h>
        |          ^~~~~~~~~~~~~~
  compilation terminated.


The problem is use of "relative path" in LIBBPF_DIR (../libbpf/build/), as
the Makefile enter subdir 'lib' and have these include path CFLAGS:

  CFLAGS += -DHAVE_LIBBPF  -I../libbpf/build//usr/include

Attempt#2 works: Try to get iproute2 compiling against:

  cd ~/git/iproute2-next
  $ LIBBPF_DIR=~/git/libbpf/build/ ./configure
  make


Install as stow version:

  export STOW=/usr/local/stow/iproute2-libbpf-next-git-c29f65db34
  make
  make PREFIX=$STOW SYSCONFDIR=$STOW CONFDIR=$STOW/etc/iproute2 SBINDIR=$STOW/sbin -n install
  make PREFIX=$STOW SYSCONFDIR=$STOW CONFDIR=$STOW/etc/iproute2 SBINDIR=$STOW/sbin install

Current state:
  $ tc -V
  tc utility, iproute2-5.9.0, libbpf 0.3.0


^ permalink raw reply	[flat|nested] 167+ messages in thread

end of thread, other threads:[~2020-12-01 14:24 UTC | newest]

Thread overview: 167+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-23  3:38 [PATCH iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
2020-10-23  3:38 ` [PATCH iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
2020-10-23  3:38 ` [PATCH iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
2020-10-23  3:38 ` [PATCH iproute2-next 3/5] lib: add libbpf support Hangbin Liu
2020-10-23 14:34   ` David Ahern
2020-10-25 15:13     ` Toke Høiland-Jørgensen
2020-10-25 22:12       ` David Ahern
2020-10-26  8:56         ` Hangbin Liu
2020-10-26 15:15           ` David Ahern
2020-10-27  2:58             ` Hangbin Liu
2020-10-24  0:21   ` Andrii Nakryiko
2020-10-25 15:11     ` Toke Høiland-Jørgensen
2020-10-26  8:10     ` Hangbin Liu
2020-10-23  3:38 ` [PATCH iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
2020-10-23  3:38 ` [PATCH iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
2020-10-28 13:25 ` [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
2020-10-28 13:25   ` [PATCHv2 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
2020-10-28 13:25   ` [PATCHv2 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
2020-10-28 13:25   ` [PATCHv2 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
2020-10-28 13:25   ` [PATCHv2 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
2020-10-28 13:25   ` [PATCHv2 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
2020-10-28 21:17   ` [PATCHv2 iproute2-next 0/5] iproute2: add libbpf support Alexei Starovoitov
2020-10-28 23:02   ` David Ahern
2020-10-29  2:06     ` Hangbin Liu
2020-10-29  2:20       ` David Ahern
2020-10-29  2:45         ` Hangbin Liu
2020-10-29  3:00           ` David Ahern
2020-10-29  3:17             ` Hangbin Liu
2020-10-29 10:26             ` Hangbin Liu
2020-10-29 10:51               ` Toke Høiland-Jørgensen
2020-10-29  2:27       ` Andrii Nakryiko
2020-10-29  2:33         ` David Ahern
2020-10-29  2:46           ` Andrii Nakryiko
2020-10-29  2:34         ` Stephen Hemminger
2020-10-29  2:50           ` Andrii Nakryiko
2020-10-29 11:38             ` Jesper Dangaard Brouer
2020-10-29 20:30               ` Andrii Nakryiko
2020-10-29  2:33       ` Stephen Hemminger
2020-10-29 15:11   ` [PATCHv3 " Hangbin Liu
2020-10-29 15:11     ` [PATCHv3 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
2020-10-29 15:26       ` Toke Høiland-Jørgensen
2020-11-02 15:37       ` David Ahern
2020-11-03  5:54         ` Hangbin Liu
2020-11-03 17:32           ` David Ahern
2020-11-04  8:51             ` Hangbin Liu
2020-11-04 11:09               ` Toke Høiland-Jørgensen
2020-11-04 11:40                 ` Hangbin Liu
2020-10-29 15:11     ` [PATCHv3 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
2020-10-29 15:11     ` [PATCHv3 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
2020-11-02 15:41       ` David Ahern
2020-11-03  5:48         ` Hangbin Liu
2020-11-03 17:19           ` David Ahern
2020-11-04  8:22         ` Hangbin Liu
2020-11-05  2:33           ` David Ahern
2020-11-05  7:51             ` Hangbin Liu
2020-11-05 15:25               ` David Ahern
2020-11-05 15:57                 ` Toke Høiland-Jørgensen
2020-11-05 16:02                   ` David Ahern
2020-11-06  0:56                     ` Hangbin Liu
2020-11-06  0:41                 ` Hangbin Liu
2020-10-29 15:11     ` [PATCHv3 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
2020-10-29 15:11     ` [PATCHv3 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
2020-11-02 15:47     ` [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support David Ahern
2020-11-03  6:58       ` Andrii Nakryiko
2020-11-03  8:42         ` Jiri Benc
2020-11-03 17:45           ` David Ahern
2020-11-03 17:48           ` Alexei Starovoitov
2020-11-03  8:46         ` Daniel Borkmann
2020-11-03 17:35           ` David Ahern
2020-11-03 17:47             ` Alexei Starovoitov
2020-11-03 18:23               ` Stephen Hemminger
2020-11-03 22:32               ` David Ahern
2020-11-03 22:55                 ` Alexei Starovoitov
2020-11-04  1:40                   ` David Ahern
2020-11-04  2:45                     ` Alexei Starovoitov
2020-11-04  9:28                       ` Jiri Benc
2020-11-05  2:39                         ` David Ahern
2020-11-04  2:17                   ` Hangbin Liu
2020-11-04  3:11                     ` Alexei Starovoitov
2020-11-04 10:01                       ` Jiri Benc
2020-11-04 10:21                       ` Daniel Borkmann
2020-11-04 11:20                         ` Toke Høiland-Jørgensen
2020-11-04 13:12                           ` Daniel Borkmann
2020-11-04 19:17                             ` Jakub Kicinski
2020-11-04 20:43                               ` Andrii Nakryiko
2020-11-04 22:24                                 ` Toke Høiland-Jørgensen
2020-11-05 20:14                                   ` Andrii Nakryiko
2020-11-05  3:48                                 ` David Ahern
2020-11-05 20:53                                   ` Andrii Nakryiko
2020-11-05  3:19                         ` David Ahern
2020-11-05 14:05                           ` Jamal Hadi Salim
2020-11-05 21:01                             ` Andrii Nakryiko
2020-11-06 15:27                               ` Jamal Hadi Salim
2020-11-06 21:25                                 ` Andrii Nakryiko
2020-11-10 12:47                             ` Edward Cree
2020-11-11  0:53                               ` Alexei Starovoitov
2020-11-11 11:31                                 ` Edward Cree
2020-11-11 18:08                                   ` Alexei Starovoitov
2020-11-05 20:45                           ` Andrii Nakryiko
2020-11-06  9:00                             ` Jiri Benc
2020-11-06 21:07                               ` Andrii Nakryiko
2020-11-04 21:15                       ` Edward Cree
2020-11-04 22:10                         ` Alexei Starovoitov
2020-11-04 22:35                           ` Toke Høiland-Jørgensen
2020-11-04 23:05                           ` Edward Cree
2020-11-05 20:19                             ` Andrii Nakryiko
2020-11-06  8:44                               ` Jiri Benc
2020-11-06 20:57                                 ` Andrii Nakryiko
2020-11-06 21:04                                   ` Alexei Starovoitov
2020-11-06 23:25                                     ` Stephen Hemminger
2020-11-06 23:30                                       ` Andrii Nakryiko
2020-11-07  0:41                                         ` Stephen Hemminger
2020-11-07  1:07                                           ` Andrii Nakryiko
2020-11-06 23:38                                       ` David Ahern
2020-11-09  1:45                                         ` Alexei Starovoitov
2020-11-10  4:09                                           ` David Ahern
2020-11-11  0:47                                             ` Alexei Starovoitov
2020-11-11 11:02                                               ` Toke Høiland-Jørgensen
2020-11-11 15:06                                                 ` Daniel Borkmann
2020-11-11 16:33                                                   ` David Ahern
2020-11-12 22:36                                                   ` Toke Høiland-Jørgensen
2020-11-12 23:20                                                     ` Daniel Borkmann
2020-11-13  0:04                                                       ` Stephen Hemminger
2020-11-13  0:40                                                         ` Alexei Starovoitov
2020-11-13  3:55                                                       ` David Ahern
2020-11-09  7:07     ` [PATCHv4 " Hangbin Liu
2020-11-09  7:07       ` [PATCHv4 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
2020-11-14  3:26         ` David Ahern
2020-11-16  4:30           ` Hangbin Liu
2020-11-16  4:33             ` David Ahern
2020-11-09  7:07       ` [PATCHv4 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
2020-11-14  3:24         ` David Ahern
2020-11-16  3:55           ` Hangbin Liu
2020-11-09  7:08       ` [PATCHv4 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
2020-11-09  7:08       ` [PATCHv4 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
2020-11-09  7:08       ` [PATCHv4 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
2020-11-16  6:53       ` [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support Hangbin Liu
2020-11-16  6:53         ` [PATCHv5 iproute2-next 1/5] configure: add check_libbpf() for later " Hangbin Liu
2020-11-16  6:53         ` [PATCHv5 iproute2-next 2/5] lib: rename bpf.c to bpf_legacy.c Hangbin Liu
2020-11-16  6:53         ` [PATCHv5 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
2020-11-16  6:53         ` [PATCHv5 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
2020-11-16  6:53         ` [PATCHv5 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
2020-11-16  7:19         ` [PATCHv5 iproute2-next 0/5] iproute2: add libbpf support Alexei Starovoitov
2020-11-16 14:54           ` Jesper Dangaard Brouer
2020-11-16 23:29             ` Toke Høiland-Jørgensen
2020-11-17  2:37             ` Alexei Starovoitov
2020-11-17  3:19               ` Hangbin Liu
2020-11-17 18:27                 ` Alexei Starovoitov
2020-11-17 11:56               ` Edward Cree
2020-11-17  3:38             ` David Ahern
2020-11-17 18:19               ` Alexei Starovoitov
2020-11-16 16:45           ` Stephen Hemminger
2020-11-23 13:11         ` [PATCHv6 " Hangbin Liu
2020-11-23 13:11           ` [PATCHv6 iproute2-next 1/5] iproute2: add check_libbpf() and get_libbpf_version() Hangbin Liu
2020-11-23 13:11           ` [PATCHv6 iproute2-next 2/5] lib: make ipvrf able to use libbpf and fix function name conflicts Hangbin Liu
2020-11-23 13:11           ` [PATCHv6 iproute2-next 3/5] lib: add libbpf support Hangbin Liu
2020-11-23 13:12           ` [PATCHv6 iproute2-next 4/5] examples/bpf: move struct bpf_elf_map defined maps to legacy folder Hangbin Liu
2020-11-23 13:12           ` [PATCHv6 iproute2-next 5/5] examples/bpf: add bpf examples with BTF defined maps Hangbin Liu
2020-11-25  5:28           ` [PATCHv6 iproute2-next 0/5] iproute2: add libbpf support David Ahern
2020-11-25  5:30           ` patchwork-bot+netdevbpf
2020-11-29  6:16 ` [PATCH " Stephen Hemminger
2020-11-29  6:22   ` Greg KH
2020-11-30 11:39     ` Michal Kubecek
2020-11-29 17:33   ` Alexei Starovoitov
2020-11-29 19:41   ` David Ahern
2020-11-30 11:04     ` Toke Høiland-Jørgensen
2020-12-01 14:22     ` Jesper Dangaard Brouer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).