netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jamal Hadi Salim <jhs@mojatatu.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: ast@kernel.org, andrii@kernel.org, martin.lau@linux.dev,
	 razor@blackwall.org, sdf@google.com, john.fastabend@gmail.com,
	 kuba@kernel.org, dxu@dxuuu.xyz, joe@cilium.io, toke@kernel.org,
	 davem@davemloft.net, bpf@vger.kernel.org,
	netdev@vger.kernel.org
Subject: Re: [PATCH bpf-next v2 2/7] bpf: Add fd-based tcx multi-prog infra with link support
Date: Thu, 8 Jun 2023 15:46:05 -0400	[thread overview]
Message-ID: <CAM0EoM=FFsTNNKaMbRtuRxc8ieJgDFsBifBmZZ2_67u5=+-3BQ@mail.gmail.com> (raw)
In-Reply-To: <fe2e13a6-1fb6-c160-1d6f-31c09264911b@iogearbox.net>

Hi Daniel,

On Thu, Jun 8, 2023 at 6:12 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Hi Jamal,
>
> On 6/8/23 3:25 AM, Jamal Hadi Salim wrote:
> [...]
> > A general question (which i think i asked last time as well): who
> > decides what comes after/before what prog in this setup? And would
> > that same entity not have been able to make the same decision using tc
> > priorities?
>
> Back in the first version of the series I initially coded up this option
> that the tc_run() would basically be a fake 'bpf_prog' and it would have,
> say, fixed prio 1000. It would get executed via tcx_run() when iterating
> via bpf_mprog_foreach_prog() where bpf_prog_run() is called, and then users
> could pick for native BPF prio before or after that. But then the feedback
> was that sticking to prio is a bad user experience which led to the
> development of what is in patch 1 of this series (see the details there).
>

Thanks. I read the commit message in patch 1 and followed the thread
back including some of the discussion we had and i am still
disagreeing that this couldnt be solved with a smart priority based
scheme - but i think we can move on since this is standalone and
doesnt affect tc.

Daniel - i am still curious in the new scheme of things how would
cilium vs datadog food fight get resolved without some arbitration
entity?

> > The idea of protecting programs from being unloaded is very welcome
> > but feels would have made sense to be a separate patchset (we have
> > good need for it). Would it be possible to use that feature in tc and
> > xdp?
> BPF links are supported for XDP today, just tc BPF is one of the few
> remainders where it is not the case, hence the work of this series. What
> XDP lacks today however is multi-prog support. With the bpf_mprog concept
> that could be addressed with that common/uniform api (and Andrii expressed
> interest in integrating this also for cgroup progs), so yes, various hook
> points/program types could benefit from it.

Is there some sample XDP related i could look at?  Let me describe our
use case: lets say we load an ebpf program foo attached to XDP of a
netdev  and then something further upstream in the stack is consuming
the results of that ebpf XDP program. For some reason someone, at some
point, decides to replace the XDP prog with a different one - and the
new prog does a very different thing. Could we stop the replacement
with the link mechanism you describe? i.e the program is still loaded
but is no longer attached to the netdev.


> >> +struct tcx_entry {
> >> +       struct bpf_mprog_bundle         bundle;
> >> +       struct mini_Qdisc __rcu         *miniq;
> >> +};
> >> +
> >
> > Can you please move miniq to the front? From where i sit this looks:
> > struct tcx_entry {
> >          struct bpf_mprog_bundle    bundle
> > __attribute__((__aligned__(64))); /*     0  3264 */
> >
> >          /* XXX last struct has 36 bytes of padding */
> >
> >          /* --- cacheline 51 boundary (3264 bytes) --- */
> >          struct mini_Qdisc *        miniq;                /*  3264     8 */
> >
> >          /* size: 3328, cachelines: 52, members: 2 */
> >          /* padding: 56 */
> >          /* paddings: 1, sum paddings: 36 */
> >          /* forced alignments: 1 */
> > } __attribute__((__aligned__(64)));
> >
> > That is a _lot_ of cachelines - at the expense of the status quo
> > clsact/ingress qdiscs which access miniq.
>
> Ah yes, I'll fix this up.

Thanks.

cheers,
jamal
> Thanks,
> Daniel

  reply	other threads:[~2023-06-08 19:46 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-07 19:26 [PATCH bpf-next v2 0/7] BPF link support for tc BPF programs Daniel Borkmann
2023-06-07 19:26 ` [PATCH bpf-next v2 1/7] bpf: Add generic attach/detach/query API for multi-progs Daniel Borkmann
2023-06-08 17:23   ` Stanislav Fomichev
2023-06-08 20:59     ` Andrii Nakryiko
2023-06-08 21:52       ` Stanislav Fomichev
2023-06-08 22:13         ` Andrii Nakryiko
2023-06-08 23:06           ` Stanislav Fomichev
2023-06-08 23:54             ` Alexei Starovoitov
2023-06-09  0:08               ` Andrii Nakryiko
2023-06-09  0:38                 ` Stanislav Fomichev
2023-06-09  0:29             ` Toke Høiland-Jørgensen
2023-06-09  6:52               ` Daniel Borkmann
2023-06-09  7:15                 ` Daniel Borkmann
2023-06-09 11:04                 ` Toke Høiland-Jørgensen
2023-06-09 12:34                   ` Timo Beckers
2023-06-09 13:11                     ` Toke Høiland-Jørgensen
2023-06-09 14:15                       ` Daniel Borkmann
2023-06-09 16:41                         ` Stanislav Fomichev
2023-06-09 19:03                           ` Andrii Nakryiko
2023-06-10  2:52                             ` Daniel Xu
2023-06-09 18:58                         ` Andrii Nakryiko
2023-06-09 20:28                         ` Toke Høiland-Jørgensen
2023-06-12 11:21                         ` Dave Tucker
2023-06-12 12:43                           ` Daniel Borkmann
2023-06-09 18:56                       ` Andrii Nakryiko
2023-06-09 20:08                         ` Alexei Starovoitov
     [not found]                           ` <20230610022721.2950602-1-prankgup@fb.com>
2023-06-10  3:37                             ` Alexei Starovoitov
2023-06-09 20:20                         ` Toke Høiland-Jørgensen
2023-06-08 20:53   ` Andrii Nakryiko
2023-06-07 19:26 ` [PATCH bpf-next v2 2/7] bpf: Add fd-based tcx multi-prog infra with link support Daniel Borkmann
2023-06-08  1:25   ` Jamal Hadi Salim
2023-06-08 10:11     ` Daniel Borkmann
2023-06-08 19:46       ` Jamal Hadi Salim [this message]
2023-06-08 21:24         ` Andrii Nakryiko
2023-07-04 21:36           ` Jamal Hadi Salim
2023-07-04 22:01             ` Daniel Borkmann
2023-07-04 22:38               ` Jamal Hadi Salim
2023-07-05  7:34                 ` Daniel Borkmann
2023-07-06 13:31                   ` Jamal Hadi Salim
2023-06-08 17:50   ` Stanislav Fomichev
2023-06-08 21:20   ` Andrii Nakryiko
2023-06-09  3:06   ` Jakub Kicinski
2023-06-07 19:26 ` [PATCH bpf-next v2 3/7] libbpf: Add opts-based attach/detach/query API for tcx Daniel Borkmann
2023-06-08 21:37   ` Andrii Nakryiko
2023-06-07 19:26 ` [PATCH bpf-next v2 4/7] libbpf: Add link-based " Daniel Borkmann
2023-06-08 21:45   ` Andrii Nakryiko
2023-06-07 19:26 ` [PATCH bpf-next v2 5/7] bpftool: Extend net dump with tcx progs Daniel Borkmann
2023-06-07 19:26 ` [PATCH bpf-next v2 6/7] selftests/bpf: Add mprog API tests for BPF tcx opts Daniel Borkmann
2023-06-07 19:26 ` [PATCH bpf-next v2 7/7] selftests/bpf: Add mprog API tests for BPF tcx links Daniel Borkmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAM0EoM=FFsTNNKaMbRtuRxc8ieJgDFsBifBmZZ2_67u5=+-3BQ@mail.gmail.com' \
    --to=jhs@mojatatu.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=dxu@dxuuu.xyz \
    --cc=joe@cilium.io \
    --cc=john.fastabend@gmail.com \
    --cc=kuba@kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=netdev@vger.kernel.org \
    --cc=razor@blackwall.org \
    --cc=sdf@google.com \
    --cc=toke@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).