BPF Archive on lore.kernel.org
 help / color / Atom feed
From: Andrey Ignatov <rdna@fb.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf 1/2] libbpf: Fix loading cgroup_skb/egress with ret in [2, 3]
Date: Sat, 11 Apr 2020 17:36:04 -0700
Message-ID: <20200412003604.GA15986@rdna-mbp.dhcp.thefacebook.com> (raw)
In-Reply-To: <20200411224214.32hiebejlsl2rc2k@ast-mbp>

Alexei Starovoitov <alexei.starovoitov@gmail.com> [Sat, 2020-04-11 15:42 -0700]:
> On Fri, Apr 10, 2020 at 03:57:39PM -0700, Andrey Ignatov wrote:
> > 
> > > Seems like kernels accept expected_attach_type
> > > for a while now, so it might be ok backwards compatibility-wise?
> > 
> > Sure, that commit is from 2018, but I guess backward compatibility
> > should still be maintained in this case? That's a question to
> > maintainers though. If simply changing BPF_APROG_SEC to BPF_EAPROG_SEC
> > is an option that works for me.
> > 
> > 
> > > Otherwise, we can teach libbpf to retry program load without expected
> > > attach type for cgroup_skb/egress?
> > 
> > Looks like a lot of work compared to simply adding a new section name
> > (or changing existing section if backward compatibility is not a concern
> > here).
> I still don't understand that backward compatiblity concern.
> Fixing libbpf to do BPF_EAPROG_SEC("cgroup_skb/egress"
> will make egress progs to fail at load time if they use > 1 return value on old
> kernels

No. Changing BPF_APROG_SEC to BPF_EAPROG_SEC will fail loading programs
on old kernels with any return value, incl. 0 and 1.  That's the point.

> and fail at load time for > 3 return value on new kernels. Without
> libbpf fix such progs would rely on old and new kernels internal implementation
> details. Since on the latest kernel with current libbpf behavior the egress
> prog will get loaded as ingress type with return value 4 and then gets attached
> at egress. Argh. One really need to deep dive into kernel sources to figure out
> what kernel will do with such return value. Such behavior is undefined and broken.
> Did I misunderstand the whole issue?

Let me try to explain with an example.

I patched libbpf and built bpftool with this patch:

  17:00:43 0 rdna@dev082.prn2:~/bpf-next$>git di tools/lib/bpf/
  diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
  index ff9174282a8c..cc4d5b74e64a 100644
  --- a/tools/lib/bpf/libbpf.c
  +++ b/tools/lib/bpf/libbpf.c
  @@ -6330,7 +6330,7 @@ static const struct bpf_sec_def section_defs[] = {
          BPF_PROG_SEC("lwt_seg6local",           BPF_PROG_TYPE_LWT_SEG6LOCAL),
          BPF_APROG_SEC("cgroup_skb/ingress",     BPF_PROG_TYPE_CGROUP_SKB,
  -       BPF_APROG_SEC("cgroup_skb/egress",      BPF_PROG_TYPE_CGROUP_SKB,
  +       BPF_EAPROG_SEC("cgroup_skb/egress",     BPF_PROG_TYPE_CGROUP_SKB,
          BPF_APROG_COMPAT("cgroup/skb",          BPF_PROG_TYPE_CGROUP_SKB),
          BPF_APROG_SEC("cgroup/sock",            BPF_PROG_TYPE_CGROUP_SOCK,

Wrote a simple cgroup_skb/egress program that returns 1 and built it:

  17:00:59 0 rdna@dev082.prn2:~/bpf-next$>cat tools/testing/selftests/bpf/progs/skb_ret1.c
  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>
  int _version SEC("version") = 1;
  char _license[] SEC("license") = "GPL";
  int skb_ret1(struct __sk_buff *skb)
          return 1;

Made sure that it can be loaded on new kernel:

  17:00:39 0 rdna@dev082.prn2:~/bpf-next$>uname -srm
  Linux 5.2.9-93_fbk11_rc1_3610_g6a108f4f4a2b x86_64
  17:01:10 0 rdna@dev082.prn2:~/bpf-next$>sudo ./tools/bpf/bpftool/bpftool p load tools/testing/selftests/bpf/skb_ret1.o /sys/fs/bpf/skb_ret1
  17:01:25 0 rdna@dev082.prn2:~/bpf-next$>sudo ./tools/bpf/bpftool/bpftool p l pinned /sys/fs/bpf/skb_ret1
  87347: cgroup_skb  name skb_ret1  tag 9bf8e2a8bee581f5  gpl
          loaded_at 2020-04-11T17:01:25-0700  uid 0
          xlated 16B  jited 40B  memlock 4096B
          btf_id 4952

Then copied both bpftool and the program to a server with old kernel
(that doesn't have 5e43f899b03a ("bpf: Check attach type at prog load
time")) and tried same thing (note "./bpftool"):

  [root@<some_host> ~]# uname -srm
  Linux 4.11.3-45_fbk11_3602_gd67c71c x86_64
  [root@<some_host> ~]# ./bpftool p load ./skb_ret1.o /sys/fs/bpf/skb_ret1
  libbpf: Error loading BTF: Invalid argument(22)
  libbpf: Error loading .BTF into kernel: -22.
  libbpf: load bpf program failed: Invalid argument
  libbpf: failed to load program 'cgroup_skb/egress'
  libbpf: failed to load object './skb_ret1.o'
  Error: failed to load object file
  [root@<some_host> ~]# echo $?
  [root@<some_host> ~]# ./bpftool p l pinned /sys/fs/bpf/skb_ret1
  Error: bpf obj get (/sys/fs/bpf/skb_ret1): No such file or directory

As you can see it fails to load (BTF errors are not relevant).

Then I tried to load same program on same old kernel but with prod
bpftool (w/o my patch) just to make sure BTF is not a problem and it
loads fine:

  [root@<some_host> ~]# bpftool p load ./skb_ret1.o /sys/fs/bpf/skb_ret1
  libbpf: Error loading BTF: Invalid argument(22)
  libbpf: Error loading .BTF into kernel: -22.
  [root@<some_host> ~]# echo $?
  [root@<some_host> ~]# bpftool p l pinned /sys/fs/bpf/skb_ret1
  23944: cgroup_skb  name skb_ret1  tag 9bf8e2a8bee581f5
          loaded_at 2020-04-11T17:12:07-0700  uid 0
          xlated 16B  jited 64B  memlock 4096B
  [root@<some_host> ~]#

That's expected becase the error at loading time happens long before
running verifier and doesn't have anything to do with what program
returns. It fails on `if (CHECK_ATTR(BPF_PROG_LOAD))` in kernel's
bpf_prog_load() -- the very first check that happens in that function.

Old kernel checks that passed by user-space bpf_attr has all bytes after
bpf_attr.prog_ifindex (the BPF_PROG_LOAD_LAST_FIELD known to that
kernel) zero, but finds that there is non-zero unknown field, that is
expected_attach_type, and fails.

So changing BPF_APROG_SEC to BPF_EAPROG_SEC will break loading cgroup
skb egress progs on any kernel before 5e43f899b03a ("bpf: Check attach
type at prog load time"), no matter what those programs do. That's why I
think it's not a good idea.

Does it clarify?

Andrey Ignatov

  reply index

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-10 19:53 [PATCH bpf 0/2] " Andrey Ignatov
2020-04-10 19:54 ` [PATCH bpf 1/2] " Andrey Ignatov
2020-04-10 20:39   ` Andrii Nakryiko
2020-04-10 21:14     ` Alexei Starovoitov
2020-04-10 21:52       ` Andrii Nakryiko
2020-04-10 23:02         ` Andrey Ignatov
2020-04-10 22:57     ` Andrey Ignatov
2020-04-11  0:09       ` Andrii Nakryiko
2020-04-12  6:01         ` Andrii Nakryiko
2020-04-13 20:24           ` Andrey Ignatov
2020-04-11 22:42       ` Alexei Starovoitov
2020-04-12  0:36         ` Andrey Ignatov [this message]
2020-04-10 19:54 ` [PATCH bpf 2/2] selftests/bpf: Test cgroup_skb/egress/expected section name Andrey Ignatov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200412003604.GA15986@rdna-mbp.dhcp.thefacebook.com \
    --to=rdna@fb.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \


* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

BPF Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/bpf/0 bpf/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 bpf bpf/ https://lore.kernel.org/bpf \
	public-inbox-index bpf

Example config snippet for mirrors

Newsgroup available over NNTP:

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git