netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Zvi Effron <zeffron@riotgames.com>
To: Lorenz Bauer <lmb@cloudflare.com>
Cc: "Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Lorenzo Bianconi" <lbianconi@redhat.com>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"John Fastabend" <john.fastabend@gmail.com>,
	Networking <netdev@vger.kernel.org>, bpf <bpf@vger.kernel.org>
Subject: Re: Redux: Backwards compatibility for XDP multi-buff
Date: Fri, 24 Sep 2021 10:55:27 -0700	[thread overview]
Message-ID: <CAC1LvL2n+FYA997j8+=gnyNQz3y80socMwTqQ1fbbOdp2ZnfAg@mail.gmail.com> (raw)
In-Reply-To: <CACAyw9_N2Jh651hXL=P=cFM7O-n7Z0NXWy_D9j0ztVpEm+OgNA@mail.gmail.com>

On Fri, Sep 24, 2021 at 3:19 AM Lorenz Bauer <lmb@cloudflare.com> wrote:
>
> On Thu, 23 Sept 2021 at 13:59, Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >
> > I don't think it has to be quite that bleak :)
> >
> > Specifically, there is no reason to block mb-aware programs from loading
> > even when the multi-buffer mode is disabled. So a migration plan would
> > look something like:
>
> ...
>
> > 2. Start porting all your XDP programs to make them mb-aware, and switch
> > their program type as you do. In many cases this is just a matter of
> > checking that the programs don't care about packet length. [...]
>
> Porting is only easy if we are guaranteed that the first PAGE_SIZE
> bytes (or whatever the current limit is) are available via ->data
> without trickery. Otherwise we have to convert all direct packet
> access to the new API, whatever that ends up being. It seemed to me
> like you were saying there is no such guarantee, and it could be
> driver dependent, which is the worst possible outcome imo. This is the
> status quo for TC classifiers, which is a great source of hard to
> diagnose bugs.
>
> > 3. Once all your programs have been ported and marked as such, flip the
> > sysctl. This will make the system start refusing to load any XDP
> > programs that are not mb-aware.
>
> By this you mean reboot the system and early in boot change the
> sysctl? That could work I guess.
>

I don't think a reboot is required. Just an unload of all
mb-unaware XDP programs to allow adjusting the sysctl via normal sysctl
changing methods (i.e. sysctl -w). Even if rebooting, it should just need an
entry in /etc/sysctl.conf, nothing necessarily early in the boot process. The
sysctl would only need to be set before the first load of an mn-unaware XDP
program, which, if everything is ported correctly, would never happen.

> > > 2. Add a compatibility shim for mb-unaware programs receiving an mb frame.
> > >
> > > We'd still need a way to indicate "MB-OK", but it could be a piece of
> > > metadata on a bpf_prog. Whatever code dispatches to an XDP program
> > > would have to include a prologue that linearises the xdp_buff if
> > > necessary which implies allocating memory. I don't know how hard it is
> > > to implement this.
> >
> > I think it would be somewhat non-trivial, and more importantly would
> > absolutely slaughter performance. And if you're using XDP, presumably
> > you care about that, so I'm not sure we're doing anyone any favours by
> > implementing such a compatibility layer?
>
> I see your point: having a single non-mb-aware program trash
> performance is bad for marketing. Better to not let people bump the
> MTU in that case.
>
> > > 3. Make non-linearity invisible to the BPF program
> > >
> > > Something I've wished for often is that I didn't have to deal with
> > > nonlinearity at all, based on my experience with cls_redirect [2].
> > > It's really hard to write a BPF program that handles non-linear skb,
> > > especially when you have to call adjust_head, etc. which invalidates
> > > packet buffers. This is probably impossible, but maybe someone has a
> > > crazy idea? :)
> >
> > With the other helpers that we started discussing, I don't think you
> > have to? I.e., with an xdp_load_bytes() or an xdp_data_pointer()-type
> > helper that works across fragment boundaries I think you'd be fine, no?
>
> I'll take a look!
>
> Lorenz
>
> --
> Lorenz Bauer | Systems Engineer
> 6th Floor, County Hall/The Riverside Building, SE1 7PB, UK
>
> www.cloudflare.com

  reply	other threads:[~2021-09-24 17:55 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-21 16:06 Redux: Backwards compatibility for XDP multi-buff Toke Høiland-Jørgensen
2021-09-21 17:31 ` Zvi Effron
2021-09-21 18:22   ` Toke Høiland-Jørgensen
2021-09-21 19:17     ` Zvi Effron
2021-09-21 22:14       ` Toke Høiland-Jørgensen
2021-09-21 23:10         ` Zvi Effron
2021-09-22 20:13           ` Toke Høiland-Jørgensen
2021-09-21 20:12     ` Alexei Starovoitov
2021-09-21 22:20       ` Toke Høiland-Jørgensen
2021-09-21 22:51         ` Jakub Kicinski
2021-09-22 20:01           ` Toke Høiland-Jørgensen
2021-09-22 21:23             ` Zvi Effron
2021-09-23 18:45               ` Toke Høiland-Jørgensen
2021-09-23 13:46             ` Jakub Kicinski
2021-09-27 12:43               ` Jesper Dangaard Brouer
2021-09-21 22:54 ` Jakub Kicinski
2021-09-22 20:02   ` Toke Høiland-Jørgensen
2021-09-22 21:11     ` Zvi Effron
2021-09-23 19:00       ` Toke Høiland-Jørgensen
2021-09-23 10:33 ` Lorenz Bauer
2021-09-23 12:59   ` Toke Høiland-Jørgensen
2021-09-24 10:18     ` Lorenz Bauer
2021-09-24 17:55       ` Zvi Effron [this message]
2021-09-24 19:38       ` Toke Høiland-Jørgensen
2021-09-28  8:47         ` Lorenz Bauer
2021-09-28 13:43           ` Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAC1LvL2n+FYA997j8+=gnyNQz3y80socMwTqQ1fbbOdp2ZnfAg@mail.gmail.com' \
    --to=zeffron@riotgames.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=john.fastabend@gmail.com \
    --cc=lbianconi@redhat.com \
    --cc=lmb@cloudflare.com \
    --cc=netdev@vger.kernel.org \
    --cc=toke@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).