bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Daniel Borkmann <daniel@iogearbox.net>,
	Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	Marek Majkowski <marek@cloudflare.com>,
	Lorenz Bauer <lmb@cloudflare.com>,
	Alan Maguire <alan.maguire@oracle.com>,
	David Miller <davem@davemloft.net>,
	netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH bpf-next v3 1/5] bpf: Support chain calling multiple BPF programs after each other
Date: Wed, 16 Oct 2019 13:16:13 +0200	[thread overview]
Message-ID: <878splgcua.fsf@toke.dk> (raw)
In-Reply-To: <20191016103501.GB21367@pc-63.home>

Daniel Borkmann <daniel@iogearbox.net> writes:
> We do use cls_bpf heavily in Cilium, but I don't necessarily agree on
> the notorious difficult to use aspect (at least for tc + BPF): i) this
> is abstracted away from the /user/ entirely to the point that this is an
> implementation detail he doesn't need to know about, ii) these days most
> access to these hooks is done programmatically, if this is a worry, then
> lets simply add a cls_bpf pendant for APIs like bpf_set_link_xdp_fd() we
> have in libbpf where you only pass in ifindex, direction (ingress/egress)
> and priority of the program so that underneath it sets up cls_act qdisc
> with a cls_bpf instance that makes the whole thing foolproof, e.g.:
>
>   int bpf_set_link_tc_fd(int ifindex, int fd, enum bpf_tc_dir dir,
>                          __u32 priority, __u32 flags);

Basically, what I'm trying to achieve with XDP chain calls is to be able
to do something similar to this, but for XDP programs. Just with the
added ability to also select on return code...

>> Second, the multiple "independent programs", are actually not
>> independent, because the current running program must return
>> TC_ACT_UNSPEC to allow next bpf-prog to run.  Thus, it is not really
>> usable.
>
> I'd argue that unless the only thing you do in your debugging program is
> to introspect (read-only) the packet at the current point, you'd run into
> a similar coordination issue, meaning, the "independent programs" works
> for simple cases where you only have ACCEPT and DROP policy, such that
> you could run through all the programs and have precedence on DROP.
>
> But once you have conflicting policies with regards to how these programs
> mangle and redirect packets, how would you handle these?

I imagine that in most relevant cases this can be handled by convention;
the most obvious convention being "chain call on XDP_PASS". But still
allowing the admin to override this if they know what they are doing.

> I'd argue it's a non-trivial task to outsource if /admins/ are
> supposed to do manual order adjustments and more importantly to
> troubleshoot issues due to them. Potentially debugging hooks would
> make that easier to avoid recompilation, but it's more of a developer
> task.

Sure, in the general case this could become arbitrarily complex; but I
think that the feature can still be useful.

> Often times orchestration tools i) assume they just own the data path
> to reduce complexity in an already complex system and ii) also keep
> 'refreshing' their setup. One random example for the latter is k8s'
> kube-proxy that reinstalls its iptables rules every x sec, in order to
> make sure there was no manual messing around and to keep the data path
> eventually consistent with the daemon view (if they got borked).

This is actually the reason I want the kernel state to be the source of
truth (instead of keeping state in a userspace daemon). If the kernel
keeps the state it can enforce consistency, whereas a userspace daemon
has to be able to deal with things in the kernel changing underneath
it...

> How would you make the loader aware of daemons automatically
> refreshing/ reconfiguring their BPF progs in the situation where
> admins changed the pipeline, adding similar handle as tc so whoever
> does the 'chain' assembly know which one to update?

My idea for this was to implement atomic updates in the form of a
"cmpxchg" type of operation. I.e., we'd add a parameter to the syscall
where userspace could say "I want to install this program in place of
this one", and if that "old program" value doesn't match the current
state of the kernel, it can be rejected, atomically.

-Toke

  reply	other threads:[~2019-10-16 11:16 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-07 17:20 [PATCH bpf-next v3 0/5] xdp: Support multiple programs on a single interface through chain calls Toke Høiland-Jørgensen
2019-10-07 17:20 ` [PATCH bpf-next v3 1/5] bpf: Support chain calling multiple BPF programs after each other Toke Høiland-Jørgensen
2019-10-07 20:42   ` Alexei Starovoitov
2019-10-08  8:07     ` Toke Høiland-Jørgensen
2019-10-09  1:51       ` Alexei Starovoitov
2019-10-09  8:03         ` Toke Høiland-Jørgensen
2019-10-10  4:41           ` Alexei Starovoitov
2019-10-14 12:35             ` Toke Høiland-Jørgensen
2019-10-14 17:08               ` John Fastabend
2019-10-14 18:48                 ` Toke Høiland-Jørgensen
2019-10-15 16:30                   ` Edward Cree
2019-10-15 16:42                     ` Toke Høiland-Jørgensen
2019-10-15 18:33                       ` Edward Cree
2019-10-17 12:11                         ` Toke Høiland-Jørgensen
2019-10-22 17:27                           ` Edward Cree
2019-10-22 18:07                             ` Toke Høiland-Jørgensen
2019-11-12  2:51                               ` static and dynamic linking. Was: [PATCH bpf-next v3 1/5] bpf: Support chain calling multiple BPF Alexei Starovoitov
2019-11-12 16:20                                 ` Toke Høiland-Jørgensen
2019-11-12 19:52                                   ` Alexei Starovoitov
2019-11-12 21:25                                     ` Edward Cree
2019-11-12 23:18                                       ` Alexei Starovoitov
2019-11-13 18:30                                         ` Edward Cree
2019-11-13 18:51                                           ` Andrii Nakryiko
2019-11-15  2:13                                           ` Alexei Starovoitov
2019-11-15 16:56                                             ` John Fastabend
2019-11-12 23:25                                     ` John Fastabend
2019-11-13  0:21                                       ` Alexei Starovoitov
2019-11-13  5:33                                         ` John Fastabend
2019-11-15  1:50                                           ` Alexei Starovoitov
2019-11-15 16:39                                             ` John Fastabend
2019-11-14 15:41                                     ` Toke Høiland-Jørgensen
2019-11-12 16:32                                 ` Edward Cree
2019-11-15 11:48                                 ` Lorenz Bauer
2019-11-15 23:02                                   ` Alexei Starovoitov
2019-11-18 13:29                                     ` Lorenz Bauer
2019-10-21 23:51                         ` [PATCH bpf-next v3 1/5] bpf: Support chain calling multiple BPF programs after each other Edward Cree
2019-10-16  2:28               ` Alexei Starovoitov
2019-10-16  8:27                 ` Jesper Dangaard Brouer
2019-10-16 10:35                   ` Daniel Borkmann
2019-10-16 11:16                     ` Toke Høiland-Jørgensen [this message]
2019-10-16 13:51                 ` Toke Høiland-Jørgensen
2019-10-19 20:09                   ` bpf indirect calls Alexei Starovoitov
2019-10-20 10:58                     ` Toke Høiland-Jørgensen
2019-10-25 16:30                       ` Alexei Starovoitov
2019-10-27 12:15                         ` Toke Høiland-Jørgensen
2023-09-27 13:27                     ` Matt Bobrowski
2023-09-29 21:06                       ` Alexei Starovoitov
2023-10-02 18:50                         ` Barret Rhoden
2023-10-06  9:36                         ` Matt Bobrowski
2023-10-06 18:49                           ` Alexei Starovoitov
2023-10-19 12:28                             ` Matt Bobrowski
2019-10-09 10:19         ` [PATCH bpf-next v3 1/5] bpf: Support chain calling multiple BPF programs after each other Jesper Dangaard Brouer
2019-10-09 17:57           ` Alexei Starovoitov
2019-10-07 17:20 ` [PATCH bpf-next v3 2/5] bpf: Add support for setting chain call sequence for programs Toke Høiland-Jørgensen
2019-10-07 20:38   ` Daniel Borkmann
2019-10-08  8:09     ` Toke Høiland-Jørgensen
2019-10-07 17:20 ` [PATCH bpf-next v3 3/5] tools: Update bpf.h header for program chain calls Toke Høiland-Jørgensen
2019-10-07 17:20 ` [PATCH bpf-next v3 4/5] libbpf: Add syscall wrappers for BPF_PROG_CHAIN_* commands Toke Høiland-Jørgensen
2019-10-07 17:20 ` [PATCH bpf-next v3 5/5] selftests: Add tests for XDP chain calls Toke Høiland-Jørgensen
2019-10-07 18:58 ` [PATCH bpf-next v3 0/5] xdp: Support multiple programs on a single interface through " John Fastabend
2019-10-08  8:42   ` Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=878splgcua.fsf@toke.dk \
    --to=toke@redhat.com \
    --cc=alan.maguire@oracle.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=kafai@fb.com \
    --cc=lmb@cloudflare.com \
    --cc=marek@cloudflare.com \
    --cc=netdev@vger.kernel.org \
    --cc=songliubraving@fb.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).