netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexei Starovoitov <ast@fb.com>
To: "Daniel Borkmann" <daniel@iogearbox.net>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Andrii Nakryiko" <andrii.nakryiko@gmail.com>
Cc: Andrii Nakryiko <andriin@fb.com>, bpf <bpf@vger.kernel.org>,
	Networking <netdev@vger.kernel.org>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next 0/3] Introduce pinnable bpf_link kernel abstraction
Date: Tue, 3 Mar 2020 14:01:09 -0800	[thread overview]
Message-ID: <ccbc1e49-45c1-858b-1ad5-ee503e0497f2@fb.com> (raw)
In-Reply-To: <ab2f98f6-c712-d8a2-1fd3-b39abbaa9f64@iogearbox.net>

On 3/3/20 12:53 PM, Daniel Borkmann wrote:
> 
> I think it depends on the environment, and yes, whether the orchestrator
> of those progs controls the host [networking] as in case of Cilium. We
> actually had cases where a large user in prod was accidentally removing
> the Cilium k8s daemon set (and hence the user space agent as well) and only
> noticed 1hrs later since everything just kept running in the data path as
> expected w/o causing them an outage. So I think both attachment semantics
> have pros and cons. ;)

of course. that's why there is pinning of FD-based links.
There are cases where pinning is useful and there are cases where
pinning will cause outages.
During app restart temporary pinning might be useful too.

> But then are you also expecting that netlink requests which drop that tc
> filter that holds this BPF prog would get rejected given it has a bpf_link,
> is active & pinned and traffic goes through? If not the case, then what
> would be the point? If it is the case, then this seems rather complex to
> realize for rather little gain given there are two uapi interfaces (bpf,
> tc/netlink) which then mess around with the same underlying object in
> different ways.

Legacy api for tc, xdp, cgroup will not be able to override FD-based
link. For TC it's easy. cls-bpf allows multi-prog, so netlink
adding/removing progs will not be able to touch progs that are
attached via FD-based link.
Same thing for cgroups. FD-based link will be similar to 'multi' mode.
The owner of the link has a guarantee that their program will
stay attached to cgroup.
XDP is also easy. Since it has only one prog. Attaching FD-based link
will prevent netlink from overriding it.
This way the rootlet prog installed by libxdp (let's find a better name
for it) will stay attached. libxdp can choose to pin it in some libxdp
specific location, so other libxdp-enabled applications can find it
in the same location, detach, replace, modify, but random app that
wants to hack an xdp prog won't be able to mess with it.
We didn't come up with these design choices overnight. It came from
hard lessons learned while deploying xdp, tc and cgroup in production.
Legacy apis will not be deprecated, of course.

  reply	other threads:[~2020-03-03 22:01 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-28 22:39 [PATCH bpf-next 0/3] Introduce pinnable bpf_link kernel abstraction Andrii Nakryiko
2020-02-28 22:39 ` [PATCH bpf-next 1/3] bpf: introduce pinnable bpf_link abstraction Andrii Nakryiko
2020-03-02 10:13   ` Toke Høiland-Jørgensen
2020-03-02 18:06     ` Andrii Nakryiko
2020-03-02 21:40       ` Toke Høiland-Jørgensen
2020-03-02 23:37         ` Andrii Nakryiko
2020-03-03  2:50   ` Alexei Starovoitov
2020-03-03  4:18     ` Andrii Nakryiko
2020-02-28 22:39 ` [PATCH bpf-next 2/3] libbpf: add bpf_link pinning/unpinning Andrii Nakryiko
2020-03-02 10:16   ` Toke Høiland-Jørgensen
2020-03-02 18:09     ` Andrii Nakryiko
2020-03-02 21:45       ` Toke Høiland-Jørgensen
2020-02-28 22:39 ` [PATCH bpf-next 3/3] selftests/bpf: add link pinning selftests Andrii Nakryiko
2020-03-02 10:11 ` [PATCH bpf-next 0/3] Introduce pinnable bpf_link kernel abstraction Toke Høiland-Jørgensen
2020-03-02 18:05   ` Andrii Nakryiko
2020-03-02 22:24     ` Toke Høiland-Jørgensen
2020-03-02 23:35       ` Andrii Nakryiko
2020-03-03  8:12         ` Toke Høiland-Jørgensen
2020-03-03  8:12       ` Daniel Borkmann
2020-03-03 15:46         ` Alexei Starovoitov
2020-03-03 19:23           ` Daniel Borkmann
2020-03-03 19:46             ` Andrii Nakryiko
2020-03-03 20:24               ` Toke Høiland-Jørgensen
2020-03-03 20:53                 ` Daniel Borkmann
2020-03-03 22:01                   ` Alexei Starovoitov [this message]
2020-03-03 22:27                     ` Toke Høiland-Jørgensen
2020-03-04  4:36                       ` Alexei Starovoitov
2020-03-04  7:47                         ` Toke Høiland-Jørgensen
2020-03-04 15:47                           ` Alexei Starovoitov
2020-03-05 10:37                             ` Toke Høiland-Jørgensen
2020-03-05 16:34                               ` Alexei Starovoitov
2020-03-05 22:34                                 ` Daniel Borkmann
2020-03-05 22:50                                   ` Alexei Starovoitov
2020-03-05 23:42                                     ` Daniel Borkmann
2020-03-06  8:31                                       ` Toke Høiland-Jørgensen
2020-03-06 10:25                                         ` Daniel Borkmann
2020-03-06 10:42                                           ` Toke Høiland-Jørgensen
2020-03-06 18:09                                           ` David Ahern
2020-03-04 19:41                         ` Jakub Kicinski
2020-03-04 20:45                           ` Alexei Starovoitov
2020-03-04 21:24                             ` Jakub Kicinski
2020-03-05  1:07                               ` Alexei Starovoitov
2020-03-05  8:16                                 ` Jakub Kicinski
2020-03-05 11:05                                   ` Toke Høiland-Jørgensen
2020-03-05 18:13                                     ` Jakub Kicinski
2020-03-09 11:41                                       ` Toke Høiland-Jørgensen
2020-03-09 18:50                                         ` Jakub Kicinski
2020-03-10 12:22                                           ` Toke Høiland-Jørgensen
2020-03-05 16:39                                   ` Alexei Starovoitov
2020-03-03 22:40                 ` Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ccbc1e49-45c1-858b-1ad5-ee503e0497f2@fb.com \
    --to=ast@fb.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andriin@fb.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    --cc=toke@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).