* Redux: Backwards compatibility for XDP multi-buff
@ 2021-09-21 16:06 Toke Høiland-Jørgensen
  2021-09-21 17:31 ` Zvi Effron
  ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread

From: Toke Høiland-Jørgensen @ 2021-09-21 16:06 UTC (permalink / raw)
To: Lorenz Bauer
Cc: Lorenzo Bianconi, Daniel Borkmann, John Fastabend, netdev, bpf

Hi Lorenz (Cc. the other people who participated in today's discussion)

Following our discussion at the LPC session today, I dug up my previous
summary of the issue and some possible solutions[0]. Seems no one
actually replied last time, which is why we went with the "do nothing"
approach, I suppose. I'm including the full text of the original email
below; please take a look, and let's see if we can converge on a
consensus here.

First off, a problem description: If an existing XDP program is exposed
to an xdp_buff that is really a multi-buffer, while it will continue to
run, it may end up with subtle and hard-to-debug bugs: If it's parsing
the packet it'll only see part of the payload and not be aware of that
fact, and if it's calculating the packet length, that will also be
wrong (only counting the first fragment).

So what to do about this? First of all, to do anything about it, XDP
programs need to be able to declare themselves "multi-buffer aware" (but
see point 1 below). We could try to auto-detect it in the verifier by
which helpers the program is using, but since existing programs could be
perfectly happy to just keep running, it probably needs to be something
the program communicates explicitly. One option is to use the
expected_attach_type to encode this; programs can then declare it in the
source by section name, or the userspace loader can set the type for
existing programs if needed.

With this, the kernel will know if a given XDP program is multi-buff
aware and can decide what to do with that information. For this we came
up with basically three options:

1. Do nothing. This would make it up to users / sysadmins to avoid
   anything breaking by manually making sure to not enable multi-buffer
   support while loading any XDP programs that will malfunction if
   presented with an mb frame. This will probably break in interesting
   ways, but it's nice and simple from an implementation PoV. With this
   we don't need the declaration discussed above either.

2. Add a check at runtime and drop the frames if they are mb-enabled and
   the program doesn't understand it. This is relatively simple to
   implement, but it also makes for difficult-to-understand issues (why
   are my packets suddenly being dropped?), and it will incur runtime
   overhead.

3. Reject loading of programs that are not MB-aware when running in an
   MB-enabled mode. This would make things break in more obvious ways,
   and still allow a userspace loader to declare a program "MB-aware" to
   force it to run if necessary. The problem then becomes at what level
   to block this?

   Doing this at the driver level is not enough: while a particular
   driver knows if it's running in multi-buff mode, we can't know for
   sure if a particular XDP program is multi-buff aware at attach time:
   it could be tail-calling other programs, or redirecting packets to
   another interface where it will be processed by a non-MB aware
   program.

   So another option is to make it a global toggle: e.g., create a new
   sysctl to enable multi-buffer. If this is set, reject loading any XDP
   program that doesn't support multi-buffer mode, and if it's unset,
   disable multi-buffer mode in all drivers. This will make it explicit
   when the multi-buffer mode is used, and prevent any accidental subtle
   malfunction of existing XDP programs. The drawback is that it's a
   mode switch, so more configuration complexity.

None of these options are ideal, of course, but I hope the above
explanation at least makes sense. If anyone has any better ideas (or can
spot any flaws in the reasoning above) please don't hesitate to let us
know!
-Toke [0] https://lore.kernel.org/r/8735srxglb.fsf@toke.dk ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-21 16:06 Redux: Backwards compatibility for XDP multi-buff Toke Høiland-Jørgensen @ 2021-09-21 17:31 ` Zvi Effron 2021-09-21 18:22 ` Toke Høiland-Jørgensen 2021-09-21 22:54 ` Jakub Kicinski 2021-09-23 10:33 ` Lorenz Bauer 2 siblings, 1 reply; 26+ messages in thread From: Zvi Effron @ 2021-09-21 17:31 UTC (permalink / raw) To: Toke Høiland-Jørgensen Cc: Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, netdev, bpf On Tue, Sep 21, 2021 at 9:06 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Hi Lorenz (Cc. the other people who participated in today's discussion) > > Following our discussion at the LPC session today, I dug up my previous > summary of the issue and some possible solutions[0]. Seems no on > actually replied last time, which is why we went with the "do nothing" > approach, I suppose. I'm including the full text of the original email > below; please take a look, and let's see if we can converge on a > consensus here. > > First off, a problem description: If an existing XDP program is exposed > to an xdp_buff that is really a multi-buffer, while it will continue to > run, it may end up with subtle and hard-to-debug bugs: If it's parsing > the packet it'll only see part of the payload and not be aware of that > fact, and if it's calculating the packet length, that will also only be > wrong (only counting the first fragment). > > So what to do about this? First of all, to do anything about it, XDP > programs need to be able to declare themselves "multi-buffer aware" (but > see point 1 below). We could try to auto-detect it in the verifier by > which helpers the program is using, but since existing programs could be > perfectly happy to just keep running, it probably needs to be something > the program communicates explicitly. 
One option is to use the > expected_attach_type to encode this; programs can then declare it in the > source by section name, or the userspace loader can set the type for > existing programs if needed. > > With this, the kernel will know if a given XDP program is multi-buff > aware and can decide what to do with that information. For this we came > up with basically three options: > > 1. Do nothing. This would make it up to users / sysadmins to avoid > anything breaking by manually making sure to not enable multi-buffer > support while loading any XDP programs that will malfunction if > presented with an mb frame. This will probably break in interesting > ways, but it's nice and simple from an implementation PoV. With this > we don't need the declaration discussed above either. > > 2. Add a check at runtime and drop the frames if they are mb-enabled and > the program doesn't understand it. This is relatively simple to > implement, but it also makes for difficult-to-understand issues (why > are my packets suddenly being dropped?), and it will incur runtime > overhead. > > 3. Reject loading of programs that are not MB-aware when running in an > MB-enabled mode. This would make things break in more obvious ways, > and still allow a userspace loader to declare a program "MB-aware" to > force it to run if necessary. The problem then becomes at what level > to block this? > I think there's another potential problem with this as well: what happens to already loaded programs that are not MB-aware? Are they forcibly unloaded? > Doing this at the driver level is not enough: while a particular > driver knows if it's running in multi-buff mode, we can't know for > sure if a particular XDP program is multi-buff aware at attach time: > it could be tail-calling other programs, or redirecting packets to > another interface where it will be processed by a non-MB aware > program. > > So another option is to make it a global toggle: e.g., create a new > sysctl to enable multi-buffer. 
If this is set, reject loading any XDP > program that doesn't support multi-buffer mode, and if it's unset, > disable multi-buffer mode in all drivers. This will make it explicit > when the multi-buffer mode is used, and prevent any accidental subtle > malfunction of existing XDP programs. The drawback is that it's a > mode switch, so more configuration complexity. > Could we combine the last two bits here into a global toggle that doesn't require a sysctl? If any driver is put into multi-buffer mode, then the system switches to requiring all programs be multi-buffer? When the last multi-buffer enabled driver switches out of multi-buffer, remove the system-wide restriction? Regarding my above question, if non-MB-aware XDP programs are not forcibly unloaded, then a global toggle is also insufficient. An existing non-MB-aware XDP program would still need to be rejected at attach time by the driver. > None of these options are ideal, of course, but I hope the above explanation at least makes sense. If anyone has any better ideas (or can spot any flaws in the reasoning above) please don't hesitate to let us know! > > -Toke > > [0] https://lore.kernel.org/r/8735srxglb.fsf@toke.dk > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-21 17:31 ` Zvi Effron @ 2021-09-21 18:22 ` Toke Høiland-Jørgensen 2021-09-21 19:17 ` Zvi Effron 2021-09-21 20:12 ` Alexei Starovoitov 0 siblings, 2 replies; 26+ messages in thread From: Toke Høiland-Jørgensen @ 2021-09-21 18:22 UTC (permalink / raw) To: Zvi Effron Cc: Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, netdev, bpf Zvi Effron <zeffron@riotgames.com> writes: > On Tue, Sep 21, 2021 at 9:06 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Hi Lorenz (Cc. the other people who participated in today's discussion) >> >> Following our discussion at the LPC session today, I dug up my previous >> summary of the issue and some possible solutions[0]. Seems no on >> actually replied last time, which is why we went with the "do nothing" >> approach, I suppose. I'm including the full text of the original email >> below; please take a look, and let's see if we can converge on a >> consensus here. >> >> First off, a problem description: If an existing XDP program is exposed >> to an xdp_buff that is really a multi-buffer, while it will continue to >> run, it may end up with subtle and hard-to-debug bugs: If it's parsing >> the packet it'll only see part of the payload and not be aware of that >> fact, and if it's calculating the packet length, that will also only be >> wrong (only counting the first fragment). >> >> So what to do about this? First of all, to do anything about it, XDP >> programs need to be able to declare themselves "multi-buffer aware" (but >> see point 1 below). We could try to auto-detect it in the verifier by >> which helpers the program is using, but since existing programs could be >> perfectly happy to just keep running, it probably needs to be something >> the program communicates explicitly. 
One option is to use the >> expected_attach_type to encode this; programs can then declare it in the >> source by section name, or the userspace loader can set the type for >> existing programs if needed. >> >> With this, the kernel will know if a given XDP program is multi-buff >> aware and can decide what to do with that information. For this we came >> up with basically three options: >> >> 1. Do nothing. This would make it up to users / sysadmins to avoid >> anything breaking by manually making sure to not enable multi-buffer >> support while loading any XDP programs that will malfunction if >> presented with an mb frame. This will probably break in interesting >> ways, but it's nice and simple from an implementation PoV. With this >> we don't need the declaration discussed above either. >> >> 2. Add a check at runtime and drop the frames if they are mb-enabled and >> the program doesn't understand it. This is relatively simple to >> implement, but it also makes for difficult-to-understand issues (why >> are my packets suddenly being dropped?), and it will incur runtime >> overhead. >> >> 3. Reject loading of programs that are not MB-aware when running in an >> MB-enabled mode. This would make things break in more obvious ways, >> and still allow a userspace loader to declare a program "MB-aware" to >> force it to run if necessary. The problem then becomes at what level >> to block this? >> > > I think there's another potential problem with this as well: what happens to > already loaded programs that are not MB-aware? Are they forcibly unloaded? I'd say probably the opposite: You can't toggle whatever switch we end up with if there are any non-MB-aware programs (you'd have to unload them first)... 
>> Doing this at the driver level is not enough: while a particular >> driver knows if it's running in multi-buff mode, we can't know for >> sure if a particular XDP program is multi-buff aware at attach time: >> it could be tail-calling other programs, or redirecting packets to >> another interface where it will be processed by a non-MB aware >> program. >> >> So another option is to make it a global toggle: e.g., create a new >> sysctl to enable multi-buffer. If this is set, reject loading any XDP >> program that doesn't support multi-buffer mode, and if it's unset, >> disable multi-buffer mode in all drivers. This will make it explicit >> when the multi-buffer mode is used, and prevent any accidental subtle >> malfunction of existing XDP programs. The drawback is that it's a >> mode switch, so more configuration complexity. >> > > Could we combine the last two bits here into a global toggle that doesn't > require a sysctl? If any driver is put into multi-buffer mode, then the system > switches to requiring all programs be multi-buffer? When the last multi-buffer > enabled driver switches out of multi-buffer, remove the system-wide > restriction? Well, the trouble here is that we don't necessarily have an explicit "multi-buf mode" for devices. For instance, you could raise the MTU of a device without it necessarily involving any XDP multi-buffer stuff (if you're not running XDP on that device). So if we did turn "raising the MTU" into such a mode switch, we would end up blocking any MTU changes if any XDP programs are loaded. Or having an MTU change cause a force-unload of all XDP programs. Neither of those are desirable outcomes, I think; and if we add a separate "XDP multi-buff" switch, we might as well make it system-wide? > Regarding my above question, if non-MB-aware XDP programs are not forcibly > unloaded, then a global toggle is also insufficient. An existing non-MB-aware > XDP program would still beed to be rejected at attach time by the > driver. 
See above. -Toke ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-21 18:22 ` Toke Høiland-Jørgensen @ 2021-09-21 19:17 ` Zvi Effron 2021-09-21 22:14 ` Toke Høiland-Jørgensen 2021-09-21 20:12 ` Alexei Starovoitov 1 sibling, 1 reply; 26+ messages in thread From: Zvi Effron @ 2021-09-21 19:17 UTC (permalink / raw) To: Toke Høiland-Jørgensen Cc: Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, netdev, bpf On Tue, Sep 21, 2021 at 11:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Zvi Effron <zeffron@riotgames.com> writes: > > > On Tue, Sep 21, 2021 at 9:06 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> > >> Hi Lorenz (Cc. the other people who participated in today's discussion) > >> > >> Following our discussion at the LPC session today, I dug up my previous > >> summary of the issue and some possible solutions[0]. Seems no on > >> actually replied last time, which is why we went with the "do nothing" > >> approach, I suppose. I'm including the full text of the original email > >> below; please take a look, and let's see if we can converge on a > >> consensus here. > >> > >> First off, a problem description: If an existing XDP program is exposed > >> to an xdp_buff that is really a multi-buffer, while it will continue to > >> run, it may end up with subtle and hard-to-debug bugs: If it's parsing > >> the packet it'll only see part of the payload and not be aware of that > >> fact, and if it's calculating the packet length, that will also only be > >> wrong (only counting the first fragment). > >> > >> So what to do about this? First of all, to do anything about it, XDP > >> programs need to be able to declare themselves "multi-buffer aware" (but > >> see point 1 below). We could try to auto-detect it in the verifier by > >> which helpers the program is using, but since existing programs could be > >> perfectly happy to just keep running, it probably needs to be something > >> the program communicates explicitly. 
One option is to use the > >> expected_attach_type to encode this; programs can then declare it in the > >> source by section name, or the userspace loader can set the type for > >> existing programs if needed. > >> > >> With this, the kernel will know if a given XDP program is multi-buff > >> aware and can decide what to do with that information. For this we came > >> up with basically three options: > >> > >> 1. Do nothing. This would make it up to users / sysadmins to avoid > >> anything breaking by manually making sure to not enable multi-buffer > >> support while loading any XDP programs that will malfunction if > >> presented with an mb frame. This will probably break in interesting > >> ways, but it's nice and simple from an implementation PoV. With this > >> we don't need the declaration discussed above either. > >> > >> 2. Add a check at runtime and drop the frames if they are mb-enabled and > >> the program doesn't understand it. This is relatively simple to > >> implement, but it also makes for difficult-to-understand issues (why > >> are my packets suddenly being dropped?), and it will incur runtime > >> overhead. > >> > >> 3. Reject loading of programs that are not MB-aware when running in an > >> MB-enabled mode. This would make things break in more obvious ways, > >> and still allow a userspace loader to declare a program "MB-aware" to > >> force it to run if necessary. The problem then becomes at what level > >> to block this? > >> > > > > I think there's another potential problem with this as well: what happens to > > already loaded programs that are not MB-aware? Are they forcibly unloaded? > > I'd say probably the opposite: You can't toggle whatever switch we end > up with if there are any non-MB-aware programs (you'd have to unload > them first)... > How would we communicate that issue? dmesg? 
I'm not very familiar with how sysctl change failure causes are communicated to users, so this might be a solved problem, but if I run `sysctl -w net.xdp.multibuffer=1` (or whatever ends up actually being the toggle) to activate multi-buffer, and it fails because there's a loaded non-aware program, that seems like a potential for a lot of administrator pain. > >> Doing this at the driver level is not enough: while a particular > >> driver knows if it's running in multi-buff mode, we can't know for > >> sure if a particular XDP program is multi-buff aware at attach time: > >> it could be tail-calling other programs, or redirecting packets to > >> another interface where it will be processed by a non-MB aware > >> program. > >> > >> So another option is to make it a global toggle: e.g., create a new > >> sysctl to enable multi-buffer. If this is set, reject loading any XDP > >> program that doesn't support multi-buffer mode, and if it's unset, > >> disable multi-buffer mode in all drivers. This will make it explicit > >> when the multi-buffer mode is used, and prevent any accidental subtle > >> malfunction of existing XDP programs. The drawback is that it's a > >> mode switch, so more configuration complexity. > >> > > > > Could we combine the last two bits here into a global toggle that doesn't > > require a sysctl? If any driver is put into multi-buffer mode, then the system > > switches to requiring all programs be multi-buffer? When the last multi-buffer > > enabled driver switches out of multi-buffer, remove the system-wide > > restriction? > > Well, the trouble here is that we don't necessarily have an explicit > "multi-buf mode" for devices. For instance, you could raise the MTU of a > device without it necessarily involving any XDP multi-buffer stuff (if > you're not running XDP on that device). So if we did turn "raising the > MTU" into such a mode switch, we would end up blocking any MTU changes > if any XDP programs are loaded.
Or having an MTU change cause a > force-unload of all XDP programs. Maybe I missed something then, but you had stated that "while a particular driver knows if it's running in multi-buff mode" so I assumed that the driver would be able to tell when to toggle the mode on. I had been thinking that when a driver turned multi-buffer off, it could trigger a check of all drivers, but that also seems like it could just be a global refcount of all the drivers that have requested multi-buffer mode. When a driver enables multi-buffer for itself, it increments the refcount, and when it disables, it decrements. A non-zero count means the system is in multi-buffer mode. Obviously this is more complex than just requiring the administrator to enable the system-wide mode. > > Neither of those are desirable outcomes, I think; and if we add a > separate "XDP multi-buff" switch, we might as well make it system-wide? > > > Regarding my above question, if non-MB-aware XDP programs are not forcibly > > unloaded, then a global toggle is also insufficient. An existing non-MB-aware > > XDP program would still beed to be rejected at attach time by the > > driver. > > See above. > > -Toke > --Zvi ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-21 19:17 ` Zvi Effron @ 2021-09-21 22:14 ` Toke Høiland-Jørgensen 2021-09-21 23:10 ` Zvi Effron 0 siblings, 1 reply; 26+ messages in thread From: Toke Høiland-Jørgensen @ 2021-09-21 22:14 UTC (permalink / raw) To: Zvi Effron Cc: Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, netdev, bpf Zvi Effron <zeffron@riotgames.com> writes: > On Tue, Sep 21, 2021 at 11:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Zvi Effron <zeffron@riotgames.com> writes: >> >> > On Tue, Sep 21, 2021 at 9:06 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> >> >> Hi Lorenz (Cc. the other people who participated in today's discussion) >> >> >> >> Following our discussion at the LPC session today, I dug up my previous >> >> summary of the issue and some possible solutions[0]. Seems no on >> >> actually replied last time, which is why we went with the "do nothing" >> >> approach, I suppose. I'm including the full text of the original email >> >> below; please take a look, and let's see if we can converge on a >> >> consensus here. >> >> >> >> First off, a problem description: If an existing XDP program is exposed >> >> to an xdp_buff that is really a multi-buffer, while it will continue to >> >> run, it may end up with subtle and hard-to-debug bugs: If it's parsing >> >> the packet it'll only see part of the payload and not be aware of that >> >> fact, and if it's calculating the packet length, that will also only be >> >> wrong (only counting the first fragment). >> >> >> >> So what to do about this? First of all, to do anything about it, XDP >> >> programs need to be able to declare themselves "multi-buffer aware" (but >> >> see point 1 below). We could try to auto-detect it in the verifier by >> >> which helpers the program is using, but since existing programs could be >> >> perfectly happy to just keep running, it probably needs to be something >> >> the program communicates explicitly. 
One option is to use the >> >> expected_attach_type to encode this; programs can then declare it in the >> >> source by section name, or the userspace loader can set the type for >> >> existing programs if needed. >> >> >> >> With this, the kernel will know if a given XDP program is multi-buff >> >> aware and can decide what to do with that information. For this we came >> >> up with basically three options: >> >> >> >> 1. Do nothing. This would make it up to users / sysadmins to avoid >> >> anything breaking by manually making sure to not enable multi-buffer >> >> support while loading any XDP programs that will malfunction if >> >> presented with an mb frame. This will probably break in interesting >> >> ways, but it's nice and simple from an implementation PoV. With this >> >> we don't need the declaration discussed above either. >> >> >> >> 2. Add a check at runtime and drop the frames if they are mb-enabled and >> >> the program doesn't understand it. This is relatively simple to >> >> implement, but it also makes for difficult-to-understand issues (why >> >> are my packets suddenly being dropped?), and it will incur runtime >> >> overhead. >> >> >> >> 3. Reject loading of programs that are not MB-aware when running in an >> >> MB-enabled mode. This would make things break in more obvious ways, >> >> and still allow a userspace loader to declare a program "MB-aware" to >> >> force it to run if necessary. The problem then becomes at what level >> >> to block this? >> >> >> > >> > I think there's another potential problem with this as well: what happens to >> > already loaded programs that are not MB-aware? Are they forcibly unloaded? >> >> I'd say probably the opposite: You can't toggle whatever switch we end >> up with if there are any non-MB-aware programs (you'd have to unload >> them first)... >> > > How would we communicate that issue? dmesg? 
I'm not very familiar with > how sysctl change failure causes are communicated to users, so this > might be a solved problem, but if I run `sysctl -w net.xdp.multibuffer > 1` (or whatever ends up actually being the toggle) to active > multi-buffer, and it fails because there's a loaded non-aware program, > that seems like a potential for a lot of administrator pain. Hmm, good question. Document that this only fails if there's a non-mb-aware XDP program loaded? Or use some other mechanism with better feedback? >> >> Doing this at the driver level is not enough: while a particular >> >> driver knows if it's running in multi-buff mode, we can't know for >> >> sure if a particular XDP program is multi-buff aware at attach time: >> >> it could be tail-calling other programs, or redirecting packets to >> >> another interface where it will be processed by a non-MB aware >> >> program. >> >> >> >> So another option is to make it a global toggle: e.g., create a new >> >> sysctl to enable multi-buffer. If this is set, reject loading any XDP >> >> program that doesn't support multi-buffer mode, and if it's unset, >> >> disable multi-buffer mode in all drivers. This will make it explicit >> >> when the multi-buffer mode is used, and prevent any accidental subtle >> >> malfunction of existing XDP programs. The drawback is that it's a >> >> mode switch, so more configuration complexity. >> >> >> > >> > Could we combine the last two bits here into a global toggle that doesn't >> > require a sysctl? If any driver is put into multi-buffer mode, then the system >> > switches to requiring all programs be multi-buffer? When the last multi-buffer >> > enabled driver switches out of multi-buffer, remove the system-wide >> > restriction? >> >> Well, the trouble here is that we don't necessarily have an explicit >> "multi-buf mode" for devices. 
For instance, you could raise the MTU of a >> device without it necessarily involving any XDP multi-buffer stuff (if >> you're not running XDP on that device). So if we did turn "raising the >> MTU" into such a mode switch, we would end up blocking any MTU changes >> if any XDP programs are loaded. Or having an MTU change cause a >> force-unload of all XDP programs. > > Maybe I missed something then, but you had stated that "while a > particular driver knows if it's running in multi-buff mode" so I > assumed that the driver would be able to tell when to toggle the mode > on. Well, a driver knows when it is attaching an XDP program whether it (the driver) is configured in a way such that this XDP program could encounter a multi-buf. > I had been thinking that when a driver turned multi-buffer off, it > could trigger a check of all drivers, but that also seems like it > could just be a global refcount of all the drivers that have requested > multi-buffer mode. When a driver enables multi-buffer for itself, it > increments the refcount, and when it disables, it decrements. A > non-zero count means the system is in multi-buffer mode. I guess we could do a refcount-type thing when a multi-buf XDP program is first attached (as per above). But I think it may be easier to just do it at load-time, then, so it doesn't have to be in the driver, but the BPF core could just enforce it. This would basically amount to a rule saying "you can't mix mb-aware and non-mb-aware programs", and the first type to be loaded determines which mode the system is in. This would be fairly simple to implement and enforce, I suppose. The drawback is that it's potentially racy in the order programs are loaded... -Toke ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-21 22:14 ` Toke Høiland-Jørgensen @ 2021-09-21 23:10 ` Zvi Effron 2021-09-22 20:13 ` Toke Høiland-Jørgensen 0 siblings, 1 reply; 26+ messages in thread From: Zvi Effron @ 2021-09-21 23:10 UTC (permalink / raw) To: Toke Høiland-Jørgensen Cc: Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, netdev, bpf On Tue, Sep 21, 2021 at 3:14 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Zvi Effron <zeffron@riotgames.com> writes: > > > On Tue, Sep 21, 2021 at 11:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> > >> Zvi Effron <zeffron@riotgames.com> writes: > >> > >> > On Tue, Sep 21, 2021 at 9:06 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> >> > >> >> Hi Lorenz (Cc. the other people who participated in today's discussion) > >> >> > >> >> Following our discussion at the LPC session today, I dug up my previous > >> >> summary of the issue and some possible solutions[0]. Seems no on > >> >> actually replied last time, which is why we went with the "do nothing" > >> >> approach, I suppose. I'm including the full text of the original email > >> >> below; please take a look, and let's see if we can converge on a > >> >> consensus here. > >> >> > >> >> First off, a problem description: If an existing XDP program is exposed > >> >> to an xdp_buff that is really a multi-buffer, while it will continue to > >> >> run, it may end up with subtle and hard-to-debug bugs: If it's parsing > >> >> the packet it'll only see part of the payload and not be aware of that > >> >> fact, and if it's calculating the packet length, that will also only be > >> >> wrong (only counting the first fragment). > >> >> > >> >> So what to do about this? First of all, to do anything about it, XDP > >> >> programs need to be able to declare themselves "multi-buffer aware" (but > >> >> see point 1 below). 
We could try to auto-detect it in the verifier by > >> >> which helpers the program is using, but since existing programs could be > >> >> perfectly happy to just keep running, it probably needs to be something > >> >> the program communicates explicitly. One option is to use the > >> >> expected_attach_type to encode this; programs can then declare it in the > >> >> source by section name, or the userspace loader can set the type for > >> >> existing programs if needed. > >> >> > >> >> With this, the kernel will know if a given XDP program is multi-buff > >> >> aware and can decide what to do with that information. For this we came > >> >> up with basically three options: > >> >> > >> >> 1. Do nothing. This would make it up to users / sysadmins to avoid > >> >> anything breaking by manually making sure to not enable multi-buffer > >> >> support while loading any XDP programs that will malfunction if > >> >> presented with an mb frame. This will probably break in interesting > >> >> ways, but it's nice and simple from an implementation PoV. With this > >> >> we don't need the declaration discussed above either. > >> >> > >> >> 2. Add a check at runtime and drop the frames if they are mb-enabled and > >> >> the program doesn't understand it. This is relatively simple to > >> >> implement, but it also makes for difficult-to-understand issues (why > >> >> are my packets suddenly being dropped?), and it will incur runtime > >> >> overhead. > >> >> > >> >> 3. Reject loading of programs that are not MB-aware when running in an > >> >> MB-enabled mode. This would make things break in more obvious ways, > >> >> and still allow a userspace loader to declare a program "MB-aware" to > >> >> force it to run if necessary. The problem then becomes at what level > >> >> to block this? > >> >> > >> > > >> > I think there's another potential problem with this as well: what happens to > >> > already loaded programs that are not MB-aware? Are they forcibly unloaded? 
> >> > >> I'd say probably the opposite: You can't toggle whatever switch we end > >> up with if there are any non-MB-aware programs (you'd have to unload > >> them first)... > >> > > > > How would we communicate that issue? dmesg? I'm not very familiar with > > how sysctl change failure causes are communicated to users, so this > > might be a solved problem, but if I run `sysctl -w net.xdp.multibuffer > > 1` (or whatever ends up actually being the toggle) to active > > multi-buffer, and it fails because there's a loaded non-aware program, > > that seems like a potential for a lot of administrator pain. > > Hmm, good question. Document that this only fails if there's a > non-mb-aware XDP program loaded? Or use some other mechanism with better > feedback? > > >> >> Doing this at the driver level is not enough: while a particular > >> >> driver knows if it's running in multi-buff mode, we can't know for > >> >> sure if a particular XDP program is multi-buff aware at attach time: > >> >> it could be tail-calling other programs, or redirecting packets to > >> >> another interface where it will be processed by a non-MB aware > >> >> program. > >> >> > >> >> So another option is to make it a global toggle: e.g., create a new > >> >> sysctl to enable multi-buffer. If this is set, reject loading any XDP > >> >> program that doesn't support multi-buffer mode, and if it's unset, > >> >> disable multi-buffer mode in all drivers. This will make it explicit > >> >> when the multi-buffer mode is used, and prevent any accidental subtle > >> >> malfunction of existing XDP programs. The drawback is that it's a > >> >> mode switch, so more configuration complexity. > >> >> > >> > > >> > Could we combine the last two bits here into a global toggle that doesn't > >> > require a sysctl? If any driver is put into multi-buffer mode, then the system > >> > switches to requiring all programs be multi-buffer? 
When the last multi-buffer > >> > enabled driver switches out of multi-buffer, remove the system-wide > >> > restriction? > >> > >> Well, the trouble here is that we don't necessarily have an explicit > >> "multi-buf mode" for devices. For instance, you could raise the MTU of a > >> device without it necessarily involving any XDP multi-buffer stuff (if > >> you're not running XDP on that device). So if we did turn "raising the > >> MTU" into such a mode switch, we would end up blocking any MTU changes > >> if any XDP programs are loaded. Or having an MTU change cause a > >> force-unload of all XDP programs. > > > > Maybe I missed something then, but you had stated that "while a > > particular driver knows if it's running in multi-buff mode" so I > > assumed that the driver would be able to tell when to toggle the mode > > on. > > Well, a driver knows when it is attaching an XDP program whether it (the > driver) is configured in a way such that this XDP program could > encounter a multi-buf.

I know drivers sometimes reconfigure themselves when an XDP program is attached, but is there any information provided by the attach (other than that an XDP program is attaching) that they use to make configuration decisions during that reconfiguration?

Without modifying the driver to intentionally configure itself differently based on whether or not the program is mb-aware (which I believe is currently not the case for any driver), won't the configuration of a driver be identical post XDP attach, regardless of whether the program is mb-aware?

I was thinking the driver would make its mb-aware determination (and refcount adjustments) when its configuration changes for any reason that could potentially affect mb-aware status (mostly MTU adjustments, I suspect).
> > > I had been thinking that when a driver turned multi-buffer off, it > > could trigger a check of all drivers, but that also seems like it > > could just be a global refcount of all the drivers that have requested > > multi-buffer mode. When a driver enables multi-buffer for itself, it > > increments the refcount, and when it disables, it decrements. A > > non-zero count means the system is in multi-buffer mode. > > I guess we could do a refcount-type thing when a multi-buf XDP program > is first attached (as per above). But I think it may be easier to just > do it at load-time, then, so it doesn't have to be in the driver, but > the BPF core could just enforce it. > > This would basically amount to a rule saying "you can't mix mb-aware and > non-mb-aware programs", and the first type to be loaded determines which > mode the system is in. This would be fairly simple to implement and > enforce, I suppose. The drawback is that it's potentially racy in the > order programs are loaded... >

Accepting or rejecting at load time would definitely simplify things a bit. But I think the raciness is worse than just depending on the first program to load. If we're doing refcounting at attach/detach time, then I can load an mb-aware and an mb-unaware program before attaching anything. What do I do when I attach one of them? The other would be in violation.

If instead of making the determination at attach time, we make it at load time, I think it'd be better to go back to the sysctl controlling it, and simply not allow changing the sysctl if any XDP program at all is loaded, as opposed to only if a non-aware program is installed.

Then we're back to the sysctl controlling whether or not mb-aware is required. We still have a communication-to-the-administrator problem, but it's simplified a bit from "some loaded program doesn't comply" (and having to track down which one) to "there is an XDP program installed".

> -Toke >

Side note: how do extension programs fit into this?
An extension program that's going to freplace a function in an XDP program (that receives the context) would also need to be mb-aware or not, but not all extension programs can attach to such functions, and we wouldn't want those programs to be impacted. Is this as simple as marking every XDP program and every extension program that takes an XDP context parameter as needing to be marked as mb-aware?

--Zvi

^ permalink raw reply [flat|nested] 26+ messages in thread
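[Editor's note: the global refcount idea discussed in the message above can be sketched in a few lines. This is a hypothetical userspace model, not kernel code: every name here is invented, the counter would be an atomic_t in the kernel, and the error would be a proper errno.]

```c
#include <assert.h>

/* Hypothetical sketch of the global "multi-buf mode" refcount discussed
 * above: each driver entering a multi-buf-capable configuration takes a
 * reference, and the BPF core refuses to load non-mb-aware XDP programs
 * while any references are held. All names are invented for illustration. */

static int xdp_mb_mode_refs; /* would be an atomic_t in the kernel */

static void xdp_mb_mode_get(void)    { xdp_mb_mode_refs++; }
static void xdp_mb_mode_put(void)    { xdp_mb_mode_refs--; }
static int  xdp_mb_mode_active(void) { return xdp_mb_mode_refs > 0; }

/* Load-time check the BPF core could apply to an XDP program:
 * reject non-mb-aware programs while the system is in mb mode. */
static int xdp_check_prog_load(int prog_is_mb_aware)
{
	if (xdp_mb_mode_active() && !prog_is_mb_aware)
		return -1; /* would be -EOPNOTSUPP or similar */
	return 0;
}
```

[The sketch only captures the counting; as discussed above, whether the check runs at load time or attach time changes which combinations can race.]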
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-21 23:10 ` Zvi Effron @ 2021-09-22 20:13 ` Toke Høiland-Jørgensen 0 siblings, 0 replies; 26+ messages in thread From: Toke Høiland-Jørgensen @ 2021-09-22 20:13 UTC (permalink / raw) To: Zvi Effron Cc: Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, netdev, bpf Zvi Effron <zeffron@riotgames.com> writes: > On Tue, Sep 21, 2021 at 3:14 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Zvi Effron <zeffron@riotgames.com> writes: >> >> > On Tue, Sep 21, 2021 at 11:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> >> >> Zvi Effron <zeffron@riotgames.com> writes: >> >> >> >> > On Tue, Sep 21, 2021 at 9:06 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> >> >> >> >> Hi Lorenz (Cc. the other people who participated in today's discussion) >> >> >> >> >> >> Following our discussion at the LPC session today, I dug up my previous >> >> >> summary of the issue and some possible solutions[0]. Seems no on >> >> >> actually replied last time, which is why we went with the "do nothing" >> >> >> approach, I suppose. I'm including the full text of the original email >> >> >> below; please take a look, and let's see if we can converge on a >> >> >> consensus here. >> >> >> >> >> >> First off, a problem description: If an existing XDP program is exposed >> >> >> to an xdp_buff that is really a multi-buffer, while it will continue to >> >> >> run, it may end up with subtle and hard-to-debug bugs: If it's parsing >> >> >> the packet it'll only see part of the payload and not be aware of that >> >> >> fact, and if it's calculating the packet length, that will also only be >> >> >> wrong (only counting the first fragment). >> >> >> >> >> >> So what to do about this? First of all, to do anything about it, XDP >> >> >> programs need to be able to declare themselves "multi-buffer aware" (but >> >> >> see point 1 below). 
We could try to auto-detect it in the verifier by >> >> >> which helpers the program is using, but since existing programs could be >> >> >> perfectly happy to just keep running, it probably needs to be something >> >> >> the program communicates explicitly. One option is to use the >> >> >> expected_attach_type to encode this; programs can then declare it in the >> >> >> source by section name, or the userspace loader can set the type for >> >> >> existing programs if needed. >> >> >> >> >> >> With this, the kernel will know if a given XDP program is multi-buff >> >> >> aware and can decide what to do with that information. For this we came >> >> >> up with basically three options: >> >> >> >> >> >> 1. Do nothing. This would make it up to users / sysadmins to avoid >> >> >> anything breaking by manually making sure to not enable multi-buffer >> >> >> support while loading any XDP programs that will malfunction if >> >> >> presented with an mb frame. This will probably break in interesting >> >> >> ways, but it's nice and simple from an implementation PoV. With this >> >> >> we don't need the declaration discussed above either. >> >> >> >> >> >> 2. Add a check at runtime and drop the frames if they are mb-enabled and >> >> >> the program doesn't understand it. This is relatively simple to >> >> >> implement, but it also makes for difficult-to-understand issues (why >> >> >> are my packets suddenly being dropped?), and it will incur runtime >> >> >> overhead. >> >> >> >> >> >> 3. Reject loading of programs that are not MB-aware when running in an >> >> >> MB-enabled mode. This would make things break in more obvious ways, >> >> >> and still allow a userspace loader to declare a program "MB-aware" to >> >> >> force it to run if necessary. The problem then becomes at what level >> >> >> to block this? >> >> >> >> >> > >> >> > I think there's another potential problem with this as well: what happens to >> >> > already loaded programs that are not MB-aware? 
Are they forcibly unloaded? >> >> >> >> I'd say probably the opposite: You can't toggle whatever switch we end >> >> up with if there are any non-MB-aware programs (you'd have to unload >> >> them first)... >> >> >> > >> > How would we communicate that issue? dmesg? I'm not very familiar with >> > how sysctl change failure causes are communicated to users, so this >> > might be a solved problem, but if I run `sysctl -w net.xdp.multibuffer >> > 1` (or whatever ends up actually being the toggle) to active >> > multi-buffer, and it fails because there's a loaded non-aware program, >> > that seems like a potential for a lot of administrator pain. >> >> Hmm, good question. Document that this only fails if there's a >> non-mb-aware XDP program loaded? Or use some other mechanism with better >> feedback? >> >> >> >> Doing this at the driver level is not enough: while a particular >> >> >> driver knows if it's running in multi-buff mode, we can't know for >> >> >> sure if a particular XDP program is multi-buff aware at attach time: >> >> >> it could be tail-calling other programs, or redirecting packets to >> >> >> another interface where it will be processed by a non-MB aware >> >> >> program. >> >> >> >> >> >> So another option is to make it a global toggle: e.g., create a new >> >> >> sysctl to enable multi-buffer. If this is set, reject loading any XDP >> >> >> program that doesn't support multi-buffer mode, and if it's unset, >> >> >> disable multi-buffer mode in all drivers. This will make it explicit >> >> >> when the multi-buffer mode is used, and prevent any accidental subtle >> >> >> malfunction of existing XDP programs. The drawback is that it's a >> >> >> mode switch, so more configuration complexity. >> >> >> >> >> > >> >> > Could we combine the last two bits here into a global toggle that doesn't >> >> > require a sysctl? If any driver is put into multi-buffer mode, then the system >> >> > switches to requiring all programs be multi-buffer? 
When the last multi-buffer >> >> > enabled driver switches out of multi-buffer, remove the system-wide >> >> > restriction? >> >> >> >> Well, the trouble here is that we don't necessarily have an explicit >> >> "multi-buf mode" for devices. For instance, you could raise the MTU of a >> >> device without it necessarily involving any XDP multi-buffer stuff (if >> >> you're not running XDP on that device). So if we did turn "raising the >> >> MTU" into such a mode switch, we would end up blocking any MTU changes >> >> if any XDP programs are loaded. Or having an MTU change cause a >> >> force-unload of all XDP programs. >> > >> > Maybe I missed something then, but you had stated that "while a >> > particular driver knows if it's running in multi-buff mode" so I >> > assumed that the driver would be able to tell when to toggle the mode >> > on. >> >> Well, a driver knows when it is attaching an XDP program whether it (the >> driver) is configured in a way such that this XDP program could >> encounter a multi-buf. > > I know drivers sometimes reconfigure themselves when an XDP program is > attached, but is there any information provided by the attach (other than that > an XDP program is attaching) that they use to make configuration decisions > during that reconfiguration? > > Without modifying the driver to intentionally configure itself differently > based on whether or not the program is mb-aware (which is believe is currently > not the case for any driver), won't the configuration of a driver be identical > post XDP attach regardless of whether or not the program is mb-aware or not? > > I was thinking the driver would make it's mb-aware determination (and refcount > adjustments) when its configuration changes for any reason that could > potentially affect mb-aware status (mostly MTU adjustments, I suspect). 
> >> >> > I had been thinking that when a driver turned multi-buffer off, it >> > could trigger a check of all drivers, but that also seems like it >> > could just be a global refcount of all the drivers that have requested >> > multi-buffer mode. When a driver enables multi-buffer for itself, it >> > increments the refcount, and when it disables, it decrements. A >> > non-zero count means the system is in multi-buffer mode. >> >> I guess we could do a refcount-type thing when an multi-buf XDP program >> is first attached (as per above). But I think it may be easier to just >> do it at load-time, then, so it doesn't have to be in the driver, but >> the BPF core could just enforce it. >> >> This would basically amount to a rule saying "you can't mix mb-aware and >> non-mb-aware programs", and the first type to be loaded determines which >> mode the system is in. This would be fairly simple to implement and >> enforce, I suppose. The drawback is that it's potentially racy in the >> order programs are loaded... >> > > Accepting or rejecting at load time would definitely simplify things a bit. But > I think the raciness is worse than just based on the first program to load. If > we're doing refcounting at attach/detach time, then I can load an mb-aware and > an mb-unaware program before attaching anything. What do I do when I attach one > of them? The other would be in violation. > > If instead of making the determination at attach time, we make it at load time, > I think it'd be better to go back to the sysctl controlling it, and simply not > allow changing the sysctl if any XDP program at all is loaded, as opposed to > if a non-aware program is installed. > > Then we're back to the sysctl controlling whether or not mb-aware is required. > We stil have a communication to the administrator problem, but it's simplified > a bit from "some loaded program doesn't comply" and having to track down which > one to "there is an XDP program installed". 
Right, that does simplify things. But if we encode the "mb-aware" property in the program type (or sub-type, AKA expected_attach_type), discovering which program is blocking the toggle should be fairly simple, no?

>> -Toke >>

> > Side note: how do extension programs fit into this? An extension program that's > going to freplace a function in an XDP program (that receives the context) > would also need to be mb-aware or not, but not all extension programs can attach > to such functions, and we wouldn't want those programs to be impacted. Is this > as simple as marking every XDP program and every extension program that takes > an XDP context parameter as needing to be marked as mb-aware?

Hmm, that's also a good question. I was mentally assuming that freplace programs could be limited by target type, which would mean that just encoding the "mb-aware" property in attach type would be enough. But I see that this is not, in fact, the case.

Right now expected_attach_type is unused for freplace programs, so we could use that if the kernel is to enforce this. Or we could make it up to the userspace loader to ensure this by whatever mechanism it wants. For libxdp that would mean setting the type of the dispatcher program based on the component program and refusing to mix those, for instance...

-Toke

^ permalink raw reply [flat|nested] 26+ messages in thread
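[Editor's note: the freplace rule being discussed in the message above could look something like the following check, assuming mb-awareness ends up encoded in expected_attach_type. Purely illustrative; the enum values and function are hypothetical, not an existing kernel API.]

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical encoding of mb-awareness in the (currently unused for
 * freplace) expected_attach_type, as suggested above. */
enum xdp_mb_attach_type {
	XDP_ATTACH_LEGACY,	/* program is not multi-buf aware */
	XDP_ATTACH_MB,		/* program is multi-buf aware */
};

/* An extension program only needs to match the target's mb-awareness
 * if the function it replaces actually receives the XDP context;
 * other freplace targets would be unaffected, per the side note above. */
static bool freplace_attach_ok(bool target_fn_takes_xdp_ctx,
			       enum xdp_mb_attach_type target,
			       enum xdp_mb_attach_type ext)
{
	if (!target_fn_takes_xdp_ctx)
		return true;
	return target == ext;
}
```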
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-21 18:22 ` Toke Høiland-Jørgensen 2021-09-21 19:17 ` Zvi Effron @ 2021-09-21 20:12 ` Alexei Starovoitov 2021-09-21 22:20 ` Toke Høiland-Jørgensen 1 sibling, 1 reply; 26+ messages in thread From: Alexei Starovoitov @ 2021-09-21 20:12 UTC (permalink / raw) To: Toke Høiland-Jørgensen Cc: Zvi Effron, Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, Network Development, bpf On Tue, Sep 21, 2021 at 11:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Zvi Effron <zeffron@riotgames.com> writes: > > > On Tue, Sep 21, 2021 at 9:06 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> > >> Hi Lorenz (Cc. the other people who participated in today's discussion) > >> > >> Following our discussion at the LPC session today, I dug up my previous > >> summary of the issue and some possible solutions[0]. Seems no on > >> actually replied last time, which is why we went with the "do nothing" > >> approach, I suppose. I'm including the full text of the original email > >> below; please take a look, and let's see if we can converge on a > >> consensus here. > >> > >> First off, a problem description: If an existing XDP program is exposed > >> to an xdp_buff that is really a multi-buffer, while it will continue to > >> run, it may end up with subtle and hard-to-debug bugs: If it's parsing > >> the packet it'll only see part of the payload and not be aware of that > >> fact, and if it's calculating the packet length, that will also only be > >> wrong (only counting the first fragment). > >> > >> So what to do about this? First of all, to do anything about it, XDP > >> programs need to be able to declare themselves "multi-buffer aware" (but > >> see point 1 below). 
We could try to auto-detect it in the verifier by > >> which helpers the program is using, but since existing programs could be > >> perfectly happy to just keep running, it probably needs to be something > >> the program communicates explicitly. One option is to use the > >> expected_attach_type to encode this; programs can then declare it in the > >> source by section name, or the userspace loader can set the type for > >> existing programs if needed. > >> > >> With this, the kernel will know if a given XDP program is multi-buff > >> aware and can decide what to do with that information. For this we came > >> up with basically three options: > >> > >> 1. Do nothing. This would make it up to users / sysadmins to avoid > >> anything breaking by manually making sure to not enable multi-buffer > >> support while loading any XDP programs that will malfunction if > >> presented with an mb frame. This will probably break in interesting > >> ways, but it's nice and simple from an implementation PoV. With this > >> we don't need the declaration discussed above either. > >> > >> 2. Add a check at runtime and drop the frames if they are mb-enabled and > >> the program doesn't understand it. This is relatively simple to > >> implement, but it also makes for difficult-to-understand issues (why > >> are my packets suddenly being dropped?), and it will incur runtime > >> overhead. > >> > >> 3. Reject loading of programs that are not MB-aware when running in an > >> MB-enabled mode. This would make things break in more obvious ways, > >> and still allow a userspace loader to declare a program "MB-aware" to > >> force it to run if necessary. The problem then becomes at what level > >> to block this? > >> > > > > I think there's another potential problem with this as well: what happens to > > already loaded programs that are not MB-aware? Are they forcibly unloaded? 
> > I'd say probably the opposite: You can't toggle whatever switch we end > up with if there are any non-MB-aware programs (you'd have to unload > them first)... > > >> Doing this at the driver level is not enough: while a particular > >> driver knows if it's running in multi-buff mode, we can't know for > >> sure if a particular XDP program is multi-buff aware at attach time: > >> it could be tail-calling other programs, or redirecting packets to > >> another interface where it will be processed by a non-MB aware > >> program. > >> > >> So another option is to make it a global toggle: e.g., create a new > >> sysctl to enable multi-buffer. If this is set, reject loading any XDP > >> program that doesn't support multi-buffer mode, and if it's unset, > >> disable multi-buffer mode in all drivers. This will make it explicit > >> when the multi-buffer mode is used, and prevent any accidental subtle > >> malfunction of existing XDP programs. The drawback is that it's a > >> mode switch, so more configuration complexity. > >> > > > > Could we combine the last two bits here into a global toggle that doesn't > > require a sysctl? If any driver is put into multi-buffer mode, then the system > > switches to requiring all programs be multi-buffer? When the last multi-buffer > > enabled driver switches out of multi-buffer, remove the system-wide > > restriction? > > Well, the trouble here is that we don't necessarily have an explicit > "multi-buf mode" for devices. For instance, you could raise the MTU of a > device without it necessarily involving any XDP multi-buffer stuff (if > you're not running XDP on that device). So if we did turn "raising the > MTU" into such a mode switch, we would end up blocking any MTU changes > if any XDP programs are loaded. Or having an MTU change cause a > force-unload of all XDP programs. 
An MTU change that bumps the driver into multi-buf mode, or enabling header split (which also bumps it into multi-buf mode), probably shouldn't be allowed when a non-mb-aware xdp prog is attached. That would be the simplest and least surprising behavior. Force unload could cause security issues.

> Neither of those are desirable outcomes, I think; and if we add a > separate "XDP multi-buff" switch, we might as well make it system-wide?

If we have an internal flag 'this driver supports multi-buf xdp', can't we make xdp_redirect linearize the packet in case it is being redirected to a non-multi-buf-aware driver (potentially with corresponding non-mb-aware xdp progs attached) from an mb-aware driver?

^ permalink raw reply [flat|nested] 26+ messages in thread
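[Editor's note: the linearize-on-redirect idea in the message above amounts to copying all fragments into one linear buffer before handing the frame to a non-mb-aware consumer. A simplified userspace sketch, not the kernel's xdp_buff/skb machinery; the struct and function names are invented.]

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for one fragment of a multi-buf frame. */
struct mb_frag {
	const unsigned char *data;
	size_t len;
};

/* Copy all fragments into one linear buffer, as xdp_redirect might do
 * before handing a frame to a non-multi-buf-aware driver or program.
 * Returns the total length, or -1 if the frame doesn't fit (in which
 * case the kernel would presumably have to drop it). */
static long xdp_linearize(unsigned char *dst, size_t dst_len,
			  const struct mb_frag *frags, int nr_frags)
{
	size_t off = 0;
	int i;

	for (i = 0; i < nr_frags; i++) {
		if (frags[i].len > dst_len - off)
			return -1;
		memcpy(dst + off, frags[i].data, frags[i].len);
		off += frags[i].len;
	}
	return (long)off;
}
```

[Whether the copy cost is acceptable on the redirect fast path is exactly the kind of trade-off the thread is weighing.]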
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-21 20:12 ` Alexei Starovoitov @ 2021-09-21 22:20 ` Toke Høiland-Jørgensen 2021-09-21 22:51 ` Jakub Kicinski 0 siblings, 1 reply; 26+ messages in thread From: Toke Høiland-Jørgensen @ 2021-09-21 22:20 UTC (permalink / raw) To: Alexei Starovoitov Cc: Zvi Effron, Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, Network Development, bpf Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: > On Tue, Sep 21, 2021 at 11:23 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Zvi Effron <zeffron@riotgames.com> writes: >> >> > On Tue, Sep 21, 2021 at 9:06 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> >> >> Hi Lorenz (Cc. the other people who participated in today's discussion) >> >> >> >> Following our discussion at the LPC session today, I dug up my previous >> >> summary of the issue and some possible solutions[0]. Seems no on >> >> actually replied last time, which is why we went with the "do nothing" >> >> approach, I suppose. I'm including the full text of the original email >> >> below; please take a look, and let's see if we can converge on a >> >> consensus here. >> >> >> >> First off, a problem description: If an existing XDP program is exposed >> >> to an xdp_buff that is really a multi-buffer, while it will continue to >> >> run, it may end up with subtle and hard-to-debug bugs: If it's parsing >> >> the packet it'll only see part of the payload and not be aware of that >> >> fact, and if it's calculating the packet length, that will also only be >> >> wrong (only counting the first fragment). >> >> >> >> So what to do about this? First of all, to do anything about it, XDP >> >> programs need to be able to declare themselves "multi-buffer aware" (but >> >> see point 1 below). 
We could try to auto-detect it in the verifier by >> >> which helpers the program is using, but since existing programs could be >> >> perfectly happy to just keep running, it probably needs to be something >> >> the program communicates explicitly. One option is to use the >> >> expected_attach_type to encode this; programs can then declare it in the >> >> source by section name, or the userspace loader can set the type for >> >> existing programs if needed. >> >> >> >> With this, the kernel will know if a given XDP program is multi-buff >> >> aware and can decide what to do with that information. For this we came >> >> up with basically three options: >> >> >> >> 1. Do nothing. This would make it up to users / sysadmins to avoid >> >> anything breaking by manually making sure to not enable multi-buffer >> >> support while loading any XDP programs that will malfunction if >> >> presented with an mb frame. This will probably break in interesting >> >> ways, but it's nice and simple from an implementation PoV. With this >> >> we don't need the declaration discussed above either. >> >> >> >> 2. Add a check at runtime and drop the frames if they are mb-enabled and >> >> the program doesn't understand it. This is relatively simple to >> >> implement, but it also makes for difficult-to-understand issues (why >> >> are my packets suddenly being dropped?), and it will incur runtime >> >> overhead. >> >> >> >> 3. Reject loading of programs that are not MB-aware when running in an >> >> MB-enabled mode. This would make things break in more obvious ways, >> >> and still allow a userspace loader to declare a program "MB-aware" to >> >> force it to run if necessary. The problem then becomes at what level >> >> to block this? >> >> >> > >> > I think there's another potential problem with this as well: what happens to >> > already loaded programs that are not MB-aware? Are they forcibly unloaded? 
>> >> I'd say probably the opposite: You can't toggle whatever switch we end >> up with if there are any non-MB-aware programs (you'd have to unload >> them first)... >> >> >> Doing this at the driver level is not enough: while a particular >> >> driver knows if it's running in multi-buff mode, we can't know for >> >> sure if a particular XDP program is multi-buff aware at attach time: >> >> it could be tail-calling other programs, or redirecting packets to >> >> another interface where it will be processed by a non-MB aware >> >> program. >> >> >> >> So another option is to make it a global toggle: e.g., create a new >> >> sysctl to enable multi-buffer. If this is set, reject loading any XDP >> >> program that doesn't support multi-buffer mode, and if it's unset, >> >> disable multi-buffer mode in all drivers. This will make it explicit >> >> when the multi-buffer mode is used, and prevent any accidental subtle >> >> malfunction of existing XDP programs. The drawback is that it's a >> >> mode switch, so more configuration complexity. >> >> >> > >> > Could we combine the last two bits here into a global toggle that doesn't >> > require a sysctl? If any driver is put into multi-buffer mode, then the system >> > switches to requiring all programs be multi-buffer? When the last multi-buffer >> > enabled driver switches out of multi-buffer, remove the system-wide >> > restriction? >> >> Well, the trouble here is that we don't necessarily have an explicit >> "multi-buf mode" for devices. For instance, you could raise the MTU of a >> device without it necessarily involving any XDP multi-buffer stuff (if >> you're not running XDP on that device). So if we did turn "raising the >> MTU" into such a mode switch, we would end up blocking any MTU changes >> if any XDP programs are loaded. Or having an MTU change cause a >> force-unload of all XDP programs. 
> > MTU change that bumps driver into multi-buf mode or enable > the header split that also bumps it into multi-buf mode > probably shouldn't be allowed when non-mb aware xdp prog is attached. > That would be the simplest and least surprising behavior. > Force unload could cause security issues. Yeah, I agree, force-unload is not a good solution - I mostly mentioned it for completeness. I think it would be fine for a driver to disallow config changes if a non-mb-aware program is loaded; that basically amounts to the checks drivers do already if an XDP-program is loaded, so we could just keep those around for the non-mb-case. >> Neither of those are desirable outcomes, I think; and if we add a >> separate "XDP multi-buff" switch, we might as well make it system-wide? > > If we have an internal flag 'this driver supports multi-buf xdp' cannot we > make xdp_redirect to linearize in case the packet is being redirected > to non multi-buf aware driver (potentially with corresponding non mb aware xdp > progs attached) from mb aware driver? Hmm, the assumption that XDP frames take up at most one page has been fundamental from the start of XDP. So what does linearise mean in this context? If we get a 9k packet, should we dynamically allocate a multi-page chunk of contiguous memory and copy the frame into that, or were you thinking something else? -Toke ^ permalink raw reply [flat|nested] 26+ messages in thread
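As a userspace model of what "linearise" would have to mean here (all names invented; this is not kernel code), the redirect path would allocate one contiguous, possibly multi-page buffer and copy the head plus every fragment into it, which is exactly the allocation-plus-copy cost Toke is asking about for a 9k frame:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Minimal model of a multi-buffer frame: a head area plus fragments. */
struct frag { const unsigned char *data; size_t len; };

struct mb_frame {
    const unsigned char *head;
    size_t head_len;
    const struct frag *frags;
    size_t nr_frags;
};

/* "Linearise": allocate one contiguous buffer big enough for the whole
 * packet and memcpy head and fragments into it, in order. */
static unsigned char *linearize(const struct mb_frame *f, size_t *out_len)
{
    size_t total = f->head_len, off;

    for (size_t i = 0; i < f->nr_frags; i++)
        total += f->frags[i].len;

    unsigned char *buf = malloc(total);  /* contiguous, possibly multi-page */
    if (!buf)
        return NULL;

    memcpy(buf, f->head, f->head_len);
    off = f->head_len;
    for (size_t i = 0; i < f->nr_frags; i++) {
        memcpy(buf + off, f->frags[i].data, f->frags[i].len);
        off += f->frags[i].len;
    }
    *out_len = total;
    return buf;
}
```

For a 9000-byte frame this means a >2-page contiguous allocation plus a full copy on every redirect, which is why the question matters.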
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-21 22:20 ` Toke Høiland-Jørgensen @ 2021-09-21 22:51 ` Jakub Kicinski 2021-09-22 20:01 ` Toke Høiland-Jørgensen 0 siblings, 1 reply; 26+ messages in thread From: Jakub Kicinski @ 2021-09-21 22:51 UTC (permalink / raw) To: Toke Høiland-Jørgensen Cc: Alexei Starovoitov, Zvi Effron, Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, Network Development, bpf On Wed, 22 Sep 2021 00:20:19 +0200 Toke Høiland-Jørgensen wrote: > >> Neither of those are desirable outcomes, I think; and if we add a > >> separate "XDP multi-buff" switch, we might as well make it system-wide? > > > > If we have an internal flag 'this driver supports multi-buf xdp' cannot we > > make xdp_redirect to linearize in case the packet is being redirected > > to non multi-buf aware driver (potentially with corresponding non mb aware xdp > > progs attached) from mb aware driver? > > Hmm, the assumption that XDP frames take up at most one page has been > fundamental from the start of XDP. So what does linearise mean in this > context? If we get a 9k packet, should we dynamically allocate a > multi-page chunk of contiguous memory and copy the frame into that, or > were you thinking something else? My $.02 would be to not care about redirect at all. It's not like the user experience with redirect is anywhere close to amazing right now. Besides (with the exception of SW devices which will likely gain mb support quickly) mixed-HW setups are very rare. If the source of the redirect supports mb so will likely the target. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-21 22:51 ` Jakub Kicinski @ 2021-09-22 20:01 ` Toke Høiland-Jørgensen 2021-09-22 21:23 ` Zvi Effron 2021-09-23 13:46 ` Jakub Kicinski 0 siblings, 2 replies; 26+ messages in thread From: Toke Høiland-Jørgensen @ 2021-09-22 20:01 UTC (permalink / raw) To: Jakub Kicinski Cc: Alexei Starovoitov, Zvi Effron, Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, Network Development, bpf Jakub Kicinski <kuba@kernel.org> writes: > On Wed, 22 Sep 2021 00:20:19 +0200 Toke Høiland-Jørgensen wrote: >> >> Neither of those are desirable outcomes, I think; and if we add a >> >> separate "XDP multi-buff" switch, we might as well make it system-wide? >> > >> > If we have an internal flag 'this driver supports multi-buf xdp' cannot we >> > make xdp_redirect to linearize in case the packet is being redirected >> > to non multi-buf aware driver (potentially with corresponding non mb aware xdp >> > progs attached) from mb aware driver? >> >> Hmm, the assumption that XDP frames take up at most one page has been >> fundamental from the start of XDP. So what does linearise mean in this >> context? If we get a 9k packet, should we dynamically allocate a >> multi-page chunk of contiguous memory and copy the frame into that, or >> were you thinking something else? > > My $.02 would be to not care about redirect at all. > > It's not like the user experience with redirect is anywhere close > to amazing right now. Besides (with the exception of SW devices which > will likely gain mb support quickly) mixed-HW setups are very rare. > If the source of the redirect supports mb so will likely the target. It's not about device support it's about XDP program support: If I run an MB-aware XDP program on a physical interface and redirect the (MB) frame into a container, and there's an XDP program running inside that container that isn't MB-aware, bugs will ensue. Doesn't matter if the veth driver itself supports MB... 
We could leave that as a "don't do that, then" kind of thing, but that was what we were proposing (as the "do nothing" option) and got some pushback on, hence why we're having this conversation :) -Toke ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-22 20:01 ` Toke Høiland-Jørgensen @ 2021-09-22 21:23 ` Zvi Effron 2021-09-23 18:45 ` Toke Høiland-Jørgensen 2021-09-23 13:46 ` Jakub Kicinski 1 sibling, 1 reply; 26+ messages in thread From: Zvi Effron @ 2021-09-22 21:23 UTC (permalink / raw) To: Toke Høiland-Jørgensen Cc: Jakub Kicinski, Alexei Starovoitov, Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, Network Development, bpf On Wed, Sep 22, 2021 at 1:01 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Jakub Kicinski <kuba@kernel.org> writes: > > > On Wed, 22 Sep 2021 00:20:19 +0200 Toke Høiland-Jørgensen wrote: > >> >> Neither of those are desirable outcomes, I think; and if we add a > >> >> separate "XDP multi-buff" switch, we might as well make it system-wide? > >> > > >> > If we have an internal flag 'this driver supports multi-buf xdp' cannot we > >> > make xdp_redirect to linearize in case the packet is being redirected > >> > to non multi-buf aware driver (potentially with corresponding non mb aware xdp > >> > progs attached) from mb aware driver? > >> > >> Hmm, the assumption that XDP frames take up at most one page has been > >> fundamental from the start of XDP. So what does linearise mean in this > >> context? If we get a 9k packet, should we dynamically allocate a > >> multi-page chunk of contiguous memory and copy the frame into that, or > >> were you thinking something else? > > > > My $.02 would be to not care about redirect at all. > > > > It's not like the user experience with redirect is anywhere close > > to amazing right now. Besides (with the exception of SW devices which > > will likely gain mb support quickly) mixed-HW setups are very rare. > > If the source of the redirect supports mb so will likely the target. 
> > It's not about device support it's about XDP program support: If I run > an MB-aware XDP program on a physical interface and redirect the (MB) > frame into a container, and there's an XDP program running inside that > container that isn't MB-aware, bugs will ensue. Doesn't matter if the > veth driver itself supports MB... > > We could leave that as a "don't do that, then" kind of thing, but that > was what we were proposing (as the "do nothing" option) and got some > pushback on, hence why we're having this conversation :) > > -Toke > I hadn't even considered the case of redirecting to a veth pair on the same system. I'm assuming from your statement that the buffers are passed directly to the ingress inside the container and don't go through the sort of egress process they would if leaving the system? And I'm assuming that's as an optimization? I'm not sure that makes a difference, though. It's not about whether the driver's code is mb-capable, it's about whether the driver _as currently configured_ could generate multiple buffers. If it can, then only an mb-aware program should be able to be attached to it (and tail called from whatever's attached to it). If it can't, then there should be no way to have multiple buffers come to it. So in the situation you've described, either the veth driver should be in a state where it coalesces the multiple buffers into one, fragmenting the frame if necessary, or drops the frame, or the program attached inside the container would need to be mb-aware. I'm assuming with the veth driver as written, this might mean that all programs attached to the veth driver would need to be mb-aware, which is obviously undesirable. All of which significantly adds to the complexity to support mb-aware, so maybe this could be developed later? Initially we could have a sysctl toggling the state: 0 = single-buffer only, 1 = multi-buffer allowed. 
Then later we _could_ add a state for dynamic control once all XDP-supporting drivers support the necessary dynamic functionality (if ever). At that point we'd have actual experience with the sysctl and could see how much of a burden having static control is. I may have been misinterpreting your use case though, and you were talking about the XDP program running on the egress side of the redirect? Is that the case you were talking about? --Zvi ^ permalink raw reply [flat|nested] 26+ messages in thread
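The static sysctl Zvi describes can be modeled as a simple load-time gate (names hypothetical, not an actual kernel interface): with the toggle on, only mb-aware programs may load, so no legacy program can ever meet an mb frame; with it off, drivers stay in single-buffer mode and anything may load.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical global sysctl state (values as Zvi proposes). */
enum xdp_mb_sysctl { XDP_MB_OFF = 0, XDP_MB_ON = 1 };

/* Decide at BPF_PROG_LOAD time whether an XDP program may load. */
static bool xdp_prog_load_allowed(enum xdp_mb_sysctl state, bool prog_mb_aware)
{
    if (state == XDP_MB_ON)
        return prog_mb_aware;  /* reject non-mb-aware programs */
    return true;               /* single-buffer mode: legacy progs are safe */
}
```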
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-22 21:23 ` Zvi Effron @ 2021-09-23 18:45 ` Toke Høiland-Jørgensen 0 siblings, 0 replies; 26+ messages in thread From: Toke Høiland-Jørgensen @ 2021-09-23 18:45 UTC (permalink / raw) To: Zvi Effron Cc: Jakub Kicinski, Alexei Starovoitov, Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, Network Development, bpf Zvi Effron <zeffron@riotgames.com> writes: > On Wed, Sep 22, 2021 at 1:01 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Jakub Kicinski <kuba@kernel.org> writes: >> >> > On Wed, 22 Sep 2021 00:20:19 +0200 Toke Høiland-Jørgensen wrote: >> >> >> Neither of those are desirable outcomes, I think; and if we add a >> >> >> separate "XDP multi-buff" switch, we might as well make it system-wide? >> >> > >> >> > If we have an internal flag 'this driver supports multi-buf xdp' cannot we >> >> > make xdp_redirect to linearize in case the packet is being redirected >> >> > to non multi-buf aware driver (potentially with corresponding non mb aware xdp >> >> > progs attached) from mb aware driver? >> >> >> >> Hmm, the assumption that XDP frames take up at most one page has been >> >> fundamental from the start of XDP. So what does linearise mean in this >> >> context? If we get a 9k packet, should we dynamically allocate a >> >> multi-page chunk of contiguous memory and copy the frame into that, or >> >> were you thinking something else? >> > >> > My $.02 would be to not care about redirect at all. >> > >> > It's not like the user experience with redirect is anywhere close >> > to amazing right now. Besides (with the exception of SW devices which >> > will likely gain mb support quickly) mixed-HW setups are very rare. >> > If the source of the redirect supports mb so will likely the target. 
>> >> It's not about device support it's about XDP program support: If I run >> an MB-aware XDP program on a physical interface and redirect the (MB) >> frame into a container, and there's an XDP program running inside that >> container that isn't MB-aware, bugs will ensue. Doesn't matter if the >> veth driver itself supports MB... >> >> We could leave that as a "don't do that, then" kind of thing, but that >> was what we were proposing (as the "do nothing" option) and got some >> pushback on, hence why we're having this conversation :) >> >> -Toke >> > > I hadn't even considered the case of redirecting to a veth pair on the same > system. I'm assuming from your statement that the buffers are passed directly > to the ingress inside the container and don't go through the sort of egress > process they would if leaving the system? And I'm assuming that's as an > optimization? Yeah, if we redirect an XDP frame to a veth, the peer will get the same xdp_frame, without ever building an SKB. > I'm not sure that makes a difference, though. It's not about whether the > driver's code is mb-capable, it's about whether the driver _as currently > configured_ could generate multiple buffers. If it can, then only an mb-aware > program should be able to be attached to it (and tail called from whatever's > attached to it). If it can't, then there should be no way to have multiple > buffers come to it. > > So in the situation you've described, either the veth driver should be in a > state where it coalesces the multiple buffers into one, fragmenting the frame > if necessary or drops the frame, or the program attached inside the container > would need to be mb-aware. I'm assuming with the veth driver as written, this > might mean that all programs attached to the veth driver would need to be > mb-aware, which is obviously undesirable. 
Hmm, I guess that as long as mb-frames only show up for large MTUs, the MTU of the veth device would be a limiting factor just like for physical devices, so we could just apply the same logic there. Not sure why I didn't consider that before :/ > All of which significantly adds to the complexity to support mb-aware, so maybe > this could be developed later? Initially we could have a sysctl toggling the > state 0 single-buffer only, 1 multibuffer allowed. Then later we _could_ add a > state for dynamic control once all XDP supporting drivers support the necessary > dynamic functionality (if ever). At that point we'd have actual experience with > the sysctl and could see how much of a burden having static control is. > > I may have been misinterpreting your use case though, and you were talking > about the XDP program running on the egress side of the redirect? Is that what > you were talking about case? No I was talking about exactly what you outlined above. Although longer term, I also think we can use XDP mb as a way to avoid having to linearise SKBs when running XDP on them in veth (and for generic XDP) :) -Toke ^ permalink raw reply [flat|nested] 26+ messages in thread
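The "MTU as the limiting factor" rule Toke lands on can be sketched as a single-buffer-fit check that applies equally to veth and physical NICs. The constants below are illustrative ballpark figures, not exact kernel values:

```c
#include <assert.h>
#include <stdbool.h>

#define XDP_PAGE_SIZE  4096  /* illustrative page size */
#define XDP_HEADROOM    256  /* reserved headroom before packet data */
#define XDP_TAILROOM    320  /* rough room reserved for shared info */

/* A device can only hand XDP a multi-buffer frame if its configured
 * MTU no longer fits in a single buffer; below that, even an mb-capable
 * driver produces single-buffer frames, so legacy programs are safe. */
static bool dev_needs_mb_xdp(int mtu, int hard_header_len)
{
    int frame = hard_header_len + mtu;
    return XDP_HEADROOM + frame + XDP_TAILROOM > XDP_PAGE_SIZE;
}
```

With this rule, a veth at MTU 1500 never needs mb-aware programs, and raising it to 9000 triggers the same gating a physical jumbo-frame device would.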
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-22 20:01 ` Toke Høiland-Jørgensen 2021-09-22 21:23 ` Zvi Effron @ 2021-09-23 13:46 ` Jakub Kicinski 2021-09-27 12:43 ` Jesper Dangaard Brouer 1 sibling, 1 reply; 26+ messages in thread From: Jakub Kicinski @ 2021-09-23 13:46 UTC (permalink / raw) To: Toke Høiland-Jørgensen Cc: Alexei Starovoitov, Zvi Effron, Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, Network Development, bpf On Wed, 22 Sep 2021 22:01:17 +0200 Toke Høiland-Jørgensen wrote: > Jakub Kicinski <kuba@kernel.org> writes: > >> Hmm, the assumption that XDP frames take up at most one page has been > >> fundamental from the start of XDP. So what does linearise mean in this > >> context? If we get a 9k packet, should we dynamically allocate a > >> multi-page chunk of contiguous memory and copy the frame into that, or > >> were you thinking something else? > > > > My $.02 would be to not care about redirect at all. > > > > It's not like the user experience with redirect is anywhere close > > to amazing right now. Besides (with the exception of SW devices which > > will likely gain mb support quickly) mixed-HW setups are very rare. > > If the source of the redirect supports mb so will likely the target. > > It's not about device support it's about XDP program support: If I run > an MB-aware XDP program on a physical interface and redirect the (MB) > frame into a container, and there's an XDP program running inside that > container that isn't MB-aware, bugs will ensue. Doesn't matter if the > veth driver itself supports MB... Ah, I see now. > We could leave that as a "don't do that, then" kind of thing, but that > was what we were proposing (as the "do nothing" option) and got some > pushback on, hence why we're having this conversation :) Let me make a general statement that we can't build large systems without division of responsibilities. Device specific logic has to be left to the driver. 
It's up to veth to apply its extra rules. As Zvi said, tho, this can be left for later. IMHO initial patches can drop all mb frames on redirect in the core, so we can make progress. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-23 13:46 ` Jakub Kicinski @ 2021-09-27 12:43 ` Jesper Dangaard Brouer 0 siblings, 0 replies; 26+ messages in thread From: Jesper Dangaard Brouer @ 2021-09-27 12:43 UTC (permalink / raw) To: Jakub Kicinski, Toke Høiland-Jørgensen Cc: brouer, Alexei Starovoitov, Zvi Effron, Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, Network Development, bpf On 23/09/2021 15.46, Jakub Kicinski wrote: > Let me make a general statement that we can't build large systems > without division of responsibilities. Device specific logic has to > be left to the driver. It's up to veth to apply its extra rules. > > As Zvi said, tho, this can be left for later. IMHO initial patches can > drop all mb frames on redirect in the core, so we can make progress. I agree that the *initial* patchset for REDIRECT (func calling ndo_xdp_xmit) can drop MB frames on redirect in the core, when the driver's ndo_xdp_xmit doesn't support this. BUT only so we can make progress. As soon as possible (preferably in the same kernel release) a follow-up patchset should make a serious attempt at updating all drivers' ndo_xdp_xmit to support multi-buffer (MB). As MB uses the same/compatible layout as skb_shared_info, it should be a fairly small effort to update drivers for *transmitting* multi-buff, as drivers already support this for normal SKBs. It will be a larger effort to support XDP multi-buff on RX, as often the RX-loop needs to be adjusted/reordered to "delay" calling the XDP-prog until all fragments have been "collected". And the MTU check adjusted. This implies that the driver's net_device xdp_features needs two feature bits for this MB feature. IMHO it makes sense to detangle RX and TX, as IMHO we should allow MB frames to be TX-redirected out of a driver that has an old non-MB-aware XDP-prog loaded. --Jesper ^ permalink raw reply [flat|nested] 26+ messages in thread
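Jesper's RX/TX split, combined with Jakub's interim "drop mb frames on redirect in the core" rule, could be modeled with two per-driver feature bits (flag names invented for illustration; they are not the actual xdp_features bits):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-driver capability bits, as Jesper proposes:
 * RX (building mb frames) and TX (ndo_xdp_xmit accepting them)
 * are detangled so one can land before the other. */
#define XDP_F_MB_RX  (1u << 0)  /* driver can build mb frames on RX  */
#define XDP_F_MB_TX  (1u << 1)  /* ndo_xdp_xmit accepts mb frames    */

/* Core redirect check: single-buffer frames always pass; mb frames
 * pass only if the target driver can transmit them, else drop. */
static bool xdp_redirect_frame_ok(unsigned int target_features, bool frame_is_mb)
{
    if (!frame_is_mb)
        return true;
    return (target_features & XDP_F_MB_TX) != 0;  /* else: drop in core */
}
```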
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-21 16:06 Redux: Backwards compatibility for XDP multi-buff Toke Høiland-Jørgensen 2021-09-21 17:31 ` Zvi Effron @ 2021-09-21 22:54 ` Jakub Kicinski 2021-09-22 20:02 ` Toke Høiland-Jørgensen 2021-09-23 10:33 ` Lorenz Bauer 2 siblings, 1 reply; 26+ messages in thread From: Jakub Kicinski @ 2021-09-21 22:54 UTC (permalink / raw) To: Toke Høiland-Jørgensen Cc: Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, netdev, bpf On Tue, 21 Sep 2021 18:06:35 +0200 Toke Høiland-Jørgensen wrote: > 1. Do nothing. This would make it up to users / sysadmins to avoid > anything breaking by manually making sure to not enable multi-buffer > support while loading any XDP programs that will malfunction if > presented with an mb frame. This will probably break in interesting > ways, but it's nice and simple from an implementation PoV. With this > we don't need the declaration discussed above either. > > 2. Add a check at runtime and drop the frames if they are mb-enabled and > the program doesn't understand it. This is relatively simple to > implement, but it also makes for difficult-to-understand issues (why > are my packets suddenly being dropped?), and it will incur runtime > overhead. > > 3. Reject loading of programs that are not MB-aware when running in an > MB-enabled mode. This would make things break in more obvious ways, > and still allow a userspace loader to declare a program "MB-aware" to > force it to run if necessary. The problem then becomes at what level > to block this? > > Doing this at the driver level is not enough: while a particular > driver knows if it's running in multi-buff mode, we can't know for > sure if a particular XDP program is multi-buff aware at attach time: > it could be tail-calling other programs, or redirecting packets to > another interface where it will be processed by a non-MB aware > program. 
> > So another option is to make it a global toggle: e.g., create a new > sysctl to enable multi-buffer. If this is set, reject loading any XDP > program that doesn't support multi-buffer mode, and if it's unset, > disable multi-buffer mode in all drivers. This will make it explicit > when the multi-buffer mode is used, and prevent any accidental subtle > malfunction of existing XDP programs. The drawback is that it's a > mode switch, so more configuration complexity. 4. Add new program type, XDP_MB. Do not allow mixing of XDP vs XDP_MB thru tail calls. IMHO that's very simple and covers majority of use cases. ^ permalink raw reply [flat|nested] 26+ messages in thread
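Jakub's option 4 amounts to a type-compatibility check on tail-call maps: a program may only be inserted into a prog-array whose owner type matches, so XDP and XDP_MB programs can never mix. A minimal model (enum values hypothetical):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical program types; a real implementation might instead use
 * expected_attach_type, but the check is the same shape. */
enum prog_type { PROG_XDP, PROG_XDP_MB };

/* Reject inserting a program into a tail-call map owned by the other
 * type, so non-mb-aware code can never be tail-called on an mb frame. */
static bool tail_call_insert_allowed(enum prog_type map_owner,
                                     enum prog_type prog)
{
    return map_owner == prog;  /* no mixing XDP vs XDP_MB */
}
```

This is analogous to how the kernel already pins a prog-array map to the program type of its first user.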
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-21 22:54 ` Jakub Kicinski @ 2021-09-22 20:02 ` Toke Høiland-Jørgensen 2021-09-22 21:11 ` Zvi Effron 0 siblings, 1 reply; 26+ messages in thread From: Toke Høiland-Jørgensen @ 2021-09-22 20:02 UTC (permalink / raw) To: Jakub Kicinski Cc: Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, netdev, bpf Jakub Kicinski <kuba@kernel.org> writes: > On Tue, 21 Sep 2021 18:06:35 +0200 Toke Høiland-Jørgensen wrote: >> 1. Do nothing. This would make it up to users / sysadmins to avoid >> anything breaking by manually making sure to not enable multi-buffer >> support while loading any XDP programs that will malfunction if >> presented with an mb frame. This will probably break in interesting >> ways, but it's nice and simple from an implementation PoV. With this >> we don't need the declaration discussed above either. >> >> 2. Add a check at runtime and drop the frames if they are mb-enabled and >> the program doesn't understand it. This is relatively simple to >> implement, but it also makes for difficult-to-understand issues (why >> are my packets suddenly being dropped?), and it will incur runtime >> overhead. >> >> 3. Reject loading of programs that are not MB-aware when running in an >> MB-enabled mode. This would make things break in more obvious ways, >> and still allow a userspace loader to declare a program "MB-aware" to >> force it to run if necessary. The problem then becomes at what level >> to block this? >> >> Doing this at the driver level is not enough: while a particular >> driver knows if it's running in multi-buff mode, we can't know for >> sure if a particular XDP program is multi-buff aware at attach time: >> it could be tail-calling other programs, or redirecting packets to >> another interface where it will be processed by a non-MB aware >> program. >> >> So another option is to make it a global toggle: e.g., create a new >> sysctl to enable multi-buffer. 
If this is set, reject loading any XDP >> program that doesn't support multi-buffer mode, and if it's unset, >> disable multi-buffer mode in all drivers. This will make it explicit >> when the multi-buffer mode is used, and prevent any accidental subtle >> malfunction of existing XDP programs. The drawback is that it's a >> mode switch, so more configuration complexity. > > 4. Add new program type, XDP_MB. Do not allow mixing of XDP vs XDP_MB > thru tail calls. > > IMHO that's very simple and covers majority of use cases. Using the program type (or maybe the expected_attach_type) was how I was imagining we'd encode the "I am MB aware" flag, yes. I hadn't actually considered that this could be used to also restrict tail call/freplace attachment, but that's a good point. So this leaves just the redirect issue, then, see my other reply. -Toke ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-22 20:02 ` Toke Høiland-Jørgensen @ 2021-09-22 21:11 ` Zvi Effron 2021-09-23 19:00 ` Toke Høiland-Jørgensen 0 siblings, 1 reply; 26+ messages in thread From: Zvi Effron @ 2021-09-22 21:11 UTC (permalink / raw) To: Toke Høiland-Jørgensen Cc: Jakub Kicinski, Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, netdev, bpf On Wed, Sep 22, 2021 at 1:03 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Jakub Kicinski <kuba@kernel.org> writes: > > > On Tue, 21 Sep 2021 18:06:35 +0200 Toke Høiland-Jørgensen wrote: > >> 1. Do nothing. This would make it up to users / sysadmins to avoid > >> anything breaking by manually making sure to not enable multi-buffer > >> support while loading any XDP programs that will malfunction if > >> presented with an mb frame. This will probably break in interesting > >> ways, but it's nice and simple from an implementation PoV. With this > >> we don't need the declaration discussed above either. > >> > >> 2. Add a check at runtime and drop the frames if they are mb-enabled and > >> the program doesn't understand it. This is relatively simple to > >> implement, but it also makes for difficult-to-understand issues (why > >> are my packets suddenly being dropped?), and it will incur runtime > >> overhead. > >> > >> 3. Reject loading of programs that are not MB-aware when running in an > >> MB-enabled mode. This would make things break in more obvious ways, > >> and still allow a userspace loader to declare a program "MB-aware" to > >> force it to run if necessary. The problem then becomes at what level > >> to block this? 
> >> > >> Doing this at the driver level is not enough: while a particular > >> driver knows if it's running in multi-buff mode, we can't know for > >> sure if a particular XDP program is multi-buff aware at attach time: > >> it could be tail-calling other programs, or redirecting packets to > >> another interface where it will be processed by a non-MB aware > >> program. > >> > >> So another option is to make it a global toggle: e.g., create a new > >> sysctl to enable multi-buffer. If this is set, reject loading any XDP > >> program that doesn't support multi-buffer mode, and if it's unset, > >> disable multi-buffer mode in all drivers. This will make it explicit > >> when the multi-buffer mode is used, and prevent any accidental subtle > >> malfunction of existing XDP programs. The drawback is that it's a > >> mode switch, so more configuration complexity. > > > > 4. Add new program type, XDP_MB. Do not allow mixing of XDP vs XDP_MB > > thru tail calls. > > > > IMHO that's very simple and covers majority of use cases. > > Using the program type (or maybe the expected_attach_type) was how I was > imagining we'd encode the "I am MB aware" flag, yes. I hadn't actually > considered that this could be used to also restrict tail call/freplace > attachment, but that's a good point. So this leaves just the redirect > issue, then, see my other reply. > I really like this approach as well, but before we commit to it, how likely are we to encounter this type of situation (where we need to indicate whether an XDP program supports a new capability) again in the future? And if we do, are we willing to require that all programs supporting that new feature are also mb-aware? Essentially, the suboptimal case I'm envisioning is needing to have XDP_MB, XDP_MB_NEW_THING, XDP_NEW_THING, and XDP all as program types. That leads to exponential explosion in program types. 
Also, every time we add a program type to encode a feature (instead of a truly new type), we're essentially forcing a recompilation (and redeployment) of all existing programs that take advantage of the libbpf section name program typing. (I'm sure there are tools that can rename a section in an ELF file without recompilation, but recompilation seems the simplest way to correct the ELF files for most people.) If we think this is a one-off, it's probably fine, but if we think it'll happen again, is it worth it to find a solution that will work for future cases now, instead of having XDP, XDP_MB, and then having to find a solution for future cases? --Zvi > -Toke > ^ permalink raw reply [flat|nested] 26+ messages in thread
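The kind of future-proof solution Zvi is asking about could look like one program type plus per-program capability bits, so n future features cost n flag bits rather than 2^n program types. A sketch (all names invented; this is not an existing kernel interface):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-program capability flags declared at load time. */
#define XDP_CAP_MB        (1u << 0)  /* multi-buffer aware            */
#define XDP_CAP_NEW_THING (1u << 1)  /* some hypothetical future feature */

/* The kernel computes which capabilities the environment requires
 * (e.g. mb mode enabled) and rejects programs that don't cover them. */
static bool xdp_caps_ok(unsigned int required, unsigned int declared)
{
    return (required & ~declared) == 0;  /* program covers every need */
}
```

Adding a third feature then means one more bit and no change to this check, instead of doubling the set of program types.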
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-22 21:11 ` Zvi Effron @ 2021-09-23 19:00 ` Toke Høiland-Jørgensen 0 siblings, 0 replies; 26+ messages in thread From: Toke Høiland-Jørgensen @ 2021-09-23 19:00 UTC (permalink / raw) To: Zvi Effron Cc: Jakub Kicinski, Lorenz Bauer, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, netdev, bpf Zvi Effron <zeffron@riotgames.com> writes: > On Wed, Sep 22, 2021 at 1:03 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Jakub Kicinski <kuba@kernel.org> writes: >> >> > On Tue, 21 Sep 2021 18:06:35 +0200 Toke Høiland-Jørgensen wrote: >> >> 1. Do nothing. This would make it up to users / sysadmins to avoid >> >> anything breaking by manually making sure to not enable multi-buffer >> >> support while loading any XDP programs that will malfunction if >> >> presented with an mb frame. This will probably break in interesting >> >> ways, but it's nice and simple from an implementation PoV. With this >> >> we don't need the declaration discussed above either. >> >> >> >> 2. Add a check at runtime and drop the frames if they are mb-enabled and >> >> the program doesn't understand it. This is relatively simple to >> >> implement, but it also makes for difficult-to-understand issues (why >> >> are my packets suddenly being dropped?), and it will incur runtime >> >> overhead. >> >> >> >> 3. Reject loading of programs that are not MB-aware when running in an >> >> MB-enabled mode. This would make things break in more obvious ways, >> >> and still allow a userspace loader to declare a program "MB-aware" to >> >> force it to run if necessary. The problem then becomes at what level >> >> to block this? 
>> >> >> >> Doing this at the driver level is not enough: while a particular >> >> driver knows if it's running in multi-buff mode, we can't know for >> >> sure if a particular XDP program is multi-buff aware at attach time: >> >> it could be tail-calling other programs, or redirecting packets to >> >> another interface where it will be processed by a non-MB aware >> >> program. >> >> >> >> So another option is to make it a global toggle: e.g., create a new >> >> sysctl to enable multi-buffer. If this is set, reject loading any XDP >> >> program that doesn't support multi-buffer mode, and if it's unset, >> >> disable multi-buffer mode in all drivers. This will make it explicit >> >> when the multi-buffer mode is used, and prevent any accidental subtle >> >> malfunction of existing XDP programs. The drawback is that it's a >> >> mode switch, so more configuration complexity. >> > >> > 4. Add new program type, XDP_MB. Do not allow mixing of XDP vs XDP_MB >> > thru tail calls. >> > >> > IMHO that's very simple and covers majority of use cases. >> >> Using the program type (or maybe the expected_attach_type) was how I was >> imagining we'd encode the "I am MB aware" flag, yes. I hadn't actually >> considered that this could be used to also restrict tail call/freplace >> attachment, but that's a good point. So this leaves just the redirect >> issue, then, see my other reply. >> > > I really like this apporoach as well, but before we commit to it, how likely > are we to encounter this type of situation (where we need to indicate whether > an XDP program supports a new capability) again in the future. And if we do, > are we willing to require that all programs supporting that new feature are > also mb-aware? Essentially, the suboptimal case I'm envisioning is needing to > have XDP_MB, XD_MB_NEW_THING, XDP_NEW_THING, and XDP all as program types. That > leads to exponential explosion in program types. Hmm, that's a good point. 
Maybe it would be better to communicate it via a flag; we do have a prog_flags attribute on BPF_PROG_LOAD we could use. > Also, every time we add a program type to encode a feature (instead of a truly > new type), we're essentially forcing a recompilation (and redeployment) of all > existing programs that take advantage of the libbpf section name program > typing. (I'm sure there are tools that can rename a section in an ELF file > without recompilation, but recompilation seems the simplest way to correct the > ELF files for most people.) Isn't this true regardless of how we encode it? I mean when we add a new feature that needs an explicit support declaration, programs need to declare that they support it somehow. No matter how it's actually encoded, this will either entail recompiling the program, or having the loader override the value at load-time. For instance, say we do use a flag instead of a new prog type, how would a BPF program set that flag? Defining a new section type in libbpf would be an obvious answer (i.e., SEC("xdp") sets prog_type=xdp, and SEC("xdp_mb") sets prog_type=xdp, flags=XDP_MB). > If we think this is a one-off, it's probably fine, but if we think > it'll happen again, is it worth it to find a solution that will work > for future cases now, instead of having XDP, XDP_MB, and then having > to find a solution for future cases? Well, really the right solution is a general "XDP feature flags" capability. There was some work done on this, but unfortunately it stalled out. Not sure it's feasible to resurrect that as a prerequisite for landing xdp_mb, though, so I guess we'll need a one-off thing here :/ -Toke ^ permalink raw reply [flat|nested] 26+ messages in thread
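The SEC("xdp_mb") idea above could look like this on the loader side: both section names map to the same program type and differ only in the flags passed at load time. BPF_F_XDP_MB and its bit value are hypothetical here, sketched from the discussion rather than taken from an existing UAPI:

```c
#include <stdint.h>
#include <string.h>

#define BPF_F_XDP_MB (1u << 4)  /* hypothetical prog_flags bit for BPF_PROG_LOAD */

/*
 * A libbpf-style loader mapping section names to load attributes:
 * SEC("xdp") -> prog_type=xdp, no flags; SEC("xdp_mb") -> same type,
 * mb flag set. New features become flags, not new program types.
 */
static uint32_t section_to_flags(const char *sec)
{
    if (strcmp(sec, "xdp_mb") == 0)
        return BPF_F_XDP_MB;
    return 0;  /* plain SEC("xdp") */
}
```

A loader could also apply the flag to an unmodified ELF, which is the "userspace loader overrides it" path mentioned earlier in the thread.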
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-21 16:06 Redux: Backwards compatibility for XDP multi-buff Toke Høiland-Jørgensen 2021-09-21 17:31 ` Zvi Effron 2021-09-21 22:54 ` Jakub Kicinski @ 2021-09-23 10:33 ` Lorenz Bauer 2021-09-23 12:59 ` Toke Høiland-Jørgensen 2 siblings, 1 reply; 26+ messages in thread From: Lorenz Bauer @ 2021-09-23 10:33 UTC (permalink / raw) To: Toke Høiland-Jørgensen Cc: Lorenzo Bianconi, Daniel Borkmann, John Fastabend, Networking, bpf On Tue, 21 Sept 2021 at 17:06, Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Hi Lorenz (Cc. the other people who participated in today's discussion) > > Following our discussion at the LPC session today, I dug up my previous > summary of the issue and some possible solutions[0]. Seems no on > actually replied last time, which is why we went with the "do nothing" > approach, I suppose. I'm including the full text of the original email > below; please take a look, and let's see if we can converge on a > consensus here. Hi Toke, Thanks for looping me in again. A bit of context on what XDP at Cloudflare looks like: * We have a chain of XDP programs attached to a real network device. This implements DDoS protection and L4 load balancing. This is maintained by the team I am on. * We have hundreds of network namespaces with veth that have XDP attached to them. Traffic is routed from the root namespace into these. This is maintained by the Magic Transit team, see this talk from last year's LPC [1] I'll try to summarise what I've picked up from the thread and add my own 2c. Options being considered: 1. Make sure mb-aware and mb-unaware programs don't mix. This could either be in the form of a sysctl or a dynamic property similar to a refcount. We'd need to discern mb-aware from mb-unaware somehow, most easily via a new program type. This means recompiling existing programs (but then we expect that to be necessary anyway).
We'd also have to be able to indicate "mb-awareness" for freplace programs. The implementation complexity seems OK, but operator UX is not good: it's not possible to slowly migrate a system to mb-awareness, it has to happen in one fell swoop. This would be really problematic for us, since we already have multiple teams writing and deploying XDP independently of each other. This number is only going to grow. It seems there will also be trickiness around redirecting into different devices? Not something we do today, but it's kind of an obvious optimization to start redirecting into network namespaces from XDP instead of relying on routing. 2. Add a compatibility shim for mb-unaware programs receiving an mb frame. We'd still need a way to indicate "MB-OK", but it could be a piece of metadata on a bpf_prog. Whatever code dispatches to an XDP program would have to include a prologue that linearises the xdp_buff if necessary which implies allocating memory. I don't know how hard it is to implement this. There is also the question of freplace: do we extend linearising to them, or do they have to support MB? You raised an interesting point: couldn't we hit programs that can't handle data_end - data being above a certain length? I think we (= Cloudflare) actually have one of those, since we in some cases need to traverse the entire buffer to calculate a checksum (we encapsulate UDPv4 in IPv6, don't ask). Turns out it's actually really hard to calculate the checksum on a variable length packet in BPF so we've had to introduce limits. However, this case isn't too important: we made this choice consciously, knowing that MTU changes would break it. Other than that I like this option a lot: mb-aware and mb-unaware programs can co-exist, at the cost of performance. This allows gradually migrating to our stack so that it can handle jumbo frames. 3. 
Make non-linearity invisible to the BPF program Something I've wished for often is that I didn't have to deal with nonlinearity at all, based on my experience with cls_redirect [2]. It's really hard to write a BPF program that handles non-linear skb, especially when you have to call adjust_head, etc. which invalidates packet buffers. This is probably impossible, but maybe someone has a crazy idea? :) Lorenz 1: https://youtu.be/UkvxPyIJAko?t=10057 2: https://elixir.bootlin.com/linux/latest/source/tools/testing/selftests/bpf/progs/test_cls_redirect.c -- Lorenz Bauer | Systems Engineer 6th Floor, County Hall/The Riverside Building, SE1 7PB, UK www.cloudflare.com ^ permalink raw reply [flat|nested] 26+ messages in thread
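On the checksum pain point above: the hard part is carrying the ones'-complement fold across fragment boundaries, including an odd trailing byte whose 16-bit partner lives in the next fragment. A userspace sketch of that walk (RFC 1071 style); the struct frag layout is illustrative, not the kernel's representation:

```c
#include <stddef.h>
#include <stdint.h>

struct frag { const uint8_t *data; size_t len; };

/* Internet checksum folded across multiple fragments. The `odd` flag
 * carries a dangling high byte from one fragment into the next, so the
 * result is independent of where the fragment boundaries fall. */
static uint16_t csum_frags(const struct frag *frags, size_t nfrags)
{
    uint32_t sum = 0;
    int odd = 0;

    for (size_t i = 0; i < nfrags; i++) {
        const uint8_t *p = frags[i].data;
        size_t len = frags[i].len;

        if (odd && len) {          /* pair the leftover high byte */
            sum += *p++;
            len--;
            odd = 0;
        }
        while (len >= 2) {
            sum += ((uint32_t)p[0] << 8) | p[1];
            p += 2;
            len -= 2;
        }
        if (len) {                 /* odd trailing byte: high half of a word */
            sum += (uint32_t)p[0] << 8;
            odd = 1;
        }
    }
    while (sum >> 16)              /* fold carries back into 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Demo: the same five payload bytes, linear vs. split into three frags. */
static uint16_t demo_csum_linear(void)
{
    static const uint8_t pkt[] = { 1, 2, 3, 4, 5 };
    struct frag f[] = { { pkt, 5 } };
    return csum_frags(f, 1);
}

static uint16_t demo_csum_split(void)
{
    static const uint8_t a[] = { 1, 2 }, b[] = { 3 }, c[] = { 4, 5 };
    struct frag f[] = { { a, 2 }, { b, 1 }, { c, 2 } };
    return csum_frags(f, 3);
}
```

The verifier-unfriendly part in actual BPF is bounding the loop, which is exactly why a variable-length packet forces the limits mentioned above; the arithmetic itself is the same.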
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-23 10:33 ` Lorenz Bauer @ 2021-09-23 12:59 ` Toke Høiland-Jørgensen 2021-09-24 10:18 ` Lorenz Bauer 0 siblings, 1 reply; 26+ messages in thread From: Toke Høiland-Jørgensen @ 2021-09-23 12:59 UTC (permalink / raw) To: Lorenz Bauer Cc: Lorenzo Bianconi, Daniel Borkmann, John Fastabend, Networking, bpf Lorenz Bauer <lmb@cloudflare.com> writes: > On Tue, 21 Sept 2021 at 17:06, Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Hi Lorenz (Cc. the other people who participated in today's discussion) >> >> Following our discussion at the LPC session today, I dug up my previous >> summary of the issue and some possible solutions[0]. Seems no on >> actually replied last time, which is why we went with the "do nothing" >> approach, I suppose. I'm including the full text of the original email >> below; please take a look, and let's see if we can converge on a >> consensus here. > > Hi Toke, > > Thanks for looping me in again. A bit of context what XDP at > Cloudflare looks like: > > * We have a chain of XDP programs attached to a real network device. > This implements DDoS protection and L4 load balancing. This is > maintained by the team I am on. > * We have hundreds of network namespaces with veth that have XDP > attached to them. Traffic is routed from the root namespace into > these. This is maintained by the Magic Transit team, see this talk > from last year's LPC [1] > I'll try to summarise what I've picked up from the thread and add my > own 2c. Options being considered: > > 1. Make sure mb-aware and mb-unaware programs don't mix. I think I would rather state this as "make sure mb-unaware programs never encounter an mb frame". The programs can mix just fine in a single-buffer world since mb-aware programs are backwards compatible. All the multibuf helpers will also do the right thing even if there's only a single buffer in a given packet. 
> This could either be in the form of a sysctl or a dynamic property > similar to a refcount. We'd need to discern mb-aware from mb-unaware > somehow, most easily via a new program type. This means recompiling > existing programs (but then we expect that to be necessary anyways). Small nit here: while in most cases this property will probably be set by recompilation, the loader can override it as well. So if you have a non-mb-aware program that you know won't break in an mb-setting (because it doesn't care about packet length, etc), your loader could just mark it as 'mb-aware'. Command-line loaders like xdp-loader and 'ip' would probably need a manual override switch to do this for the case where you have an old XDP program that you still want to run even though you've enabled MB mode. > We'd also have to be able to indicate "mb-awareness" for freplace > programs. > > The implementation complexity seems OK, but operator UX is not good: > it's not possible to slowly migrate a system to mb-awareness, it has > to happen in one fell swoop. I don't think it has to be quite that bleak :) Specifically, there is no reason to block mb-aware programs from loading even when the multi-buffer mode is disabled. So a migration plan would look something like: 1. Start out with the mb-sysctl toggled off. This will make the system behave like it does today, i.e., XDP programs won't load on interfaces with large MTUs. 2. Start porting all your XDP programs to make them mb-aware, and switch their program type as you do. In many cases this is just a matter of checking that the programs don't care about packet length. While this is ongoing you will have a mix of mb-aware and non-mb-aware programs running, but there will be no actual mb frames. 3. Once all your programs have been ported and marked as such, flip the sysctl. This will make the system start refusing to load any XDP programs that are not mb-aware. 4. Actually raise the MTU of your interfaces :) > 2. 
Add a compatibility shim for mb-unaware programs receiving an mb frame. > > We'd still need a way to indicate "MB-OK", but it could be a piece of > metadata on a bpf_prog. Whatever code dispatches to an XDP program > would have to include a prologue that linearises the xdp_buff if > necessary which implies allocating memory. I don't know how hard it is > to implement this. I think it would be somewhat non-trivial, and more importantly would absolutely slaughter performance. And if you're using XDP, presumably you care about that, so I'm not sure we're doing anyone any favours by implementing such a compatibility layer? > There is also the question of freplace: do we extend linearising to > them, or do they have to support MB? Well, today freplace programs just inherit the type of whatever they are attaching to, so we'd have to go out of our way to block this. I think logically it would be up to whatever loader is attaching freplace programs, because that's the one that knows the semantics. E.g., something like libxdp that uses freplace to chain load programs would have to make sure that the dispatcher program is only marked as mb-aware if *all* the constituent programs are. > You raised an interesting point: couldn't we hit programs that can't > handle data_end - data being above a certain length? I think we (= > Cloudflare) actually have one of those, since we in some cases need to > traverse the entire buffer to calculate a checksum (we encapsulate > UDPv4 in IPv6, don't ask). Turns out it's actually really hard to > calculate the checksum on a variable length packet in BPF so we've had > to introduce limits. However, this case isn't too important: we made > this choice consciously, knowing that MTU changes would break it. Yeah, for this I think you're on your own ;) > Other than that I like this option a lot: mb-aware and mb-unaware > programs can co-exist, at the cost of performance. 
This allows > gradually migrating to our stack so that it can handle jumbo frames. See above re: coexisting. > 3. Make non-linearity invisible to the BPF program > > Something I've wished for often is that I didn't have to deal with > nonlinearity at all, based on my experience with cls_redirect [2]. > It's really hard to write a BPF program that handles non-linear skb, > especially when you have to call adjust_head, etc. which invalidates > packet buffers. This is probably impossible, but maybe someone has a > crazy idea? :) With the other helpers that we started discussing, I don't think you have to? I.e., with an xdp_load_bytes() or an xdp_data_pointer()-type helper that works across fragment boundaries I think you'd be fine, no? -Toke ^ permalink raw reply [flat|nested] 26+ messages in thread
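The libxdp rule stated above (the dispatcher may be marked mb-aware only if *all* of the constituent programs are) is a plain conjunction over the chain; one legacy program taints the whole dispatcher. A minimal sketch, with the struct standing in for whatever per-program metadata a chain loader keeps:

```c
#include <stdbool.h>
#include <stddef.h>

struct prog { const char *name; bool mb_aware; };

/* The dispatcher inherits the weakest guarantee in the chain. */
static bool dispatcher_mb_aware(const struct prog *progs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!progs[i].mb_aware)
            return false;
    return true;
}

static bool demo_all_aware(void)
{
    struct prog chain[] = { { "ddos", true }, { "lb", true } };
    return dispatcher_mb_aware(chain, 2);
}

static bool demo_one_legacy(void)
{
    struct prog chain[] = { { "ddos", true }, { "legacy", false } };
    return dispatcher_mb_aware(chain, 2);
}
```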
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-23 12:59 ` Toke Høiland-Jørgensen @ 2021-09-24 10:18 ` Lorenz Bauer 2021-09-24 17:55 ` Zvi Effron 2021-09-24 19:38 ` Toke Høiland-Jørgensen 0 siblings, 2 replies; 26+ messages in thread From: Lorenz Bauer @ 2021-09-24 10:18 UTC (permalink / raw) To: Toke Høiland-Jørgensen Cc: Lorenzo Bianconi, Daniel Borkmann, John Fastabend, Networking, bpf On Thu, 23 Sept 2021 at 13:59, Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > I don't think it has to be quite that bleak :) > > Specifically, there is no reason to block mb-aware programs from loading > even when the multi-buffer mode is disabled. So a migration plan would > look something like: ... > 2. Start porting all your XDP programs to make them mb-aware, and switch > their program type as you do. In many cases this is just a matter of > checking that the programs don't care about packet length. [...] Porting is only easy if we are guaranteed that the first PAGE_SIZE bytes (or whatever the current limit is) are available via ->data without trickery. Otherwise we have to convert all direct packet access to the new API, whatever that ends up being. It seemed to me like you were saying there is no such guarantee, and it could be driver dependent, which is the worst possible outcome imo. This is the status quo for TC classifiers, which is a great source of hard to diagnose bugs. > 3. Once all your programs have been ported and marked as such, flip the > sysctl. This will make the system start refusing to load any XDP > programs that are not mb-aware. By this you mean reboot the system and early in boot change the sysctl? That could work I guess. > > 2. Add a compatibility shim for mb-unaware programs receiving an mb frame. > > > > We'd still need a way to indicate "MB-OK", but it could be a piece of > > metadata on a bpf_prog. 
Whatever code dispatches to an XDP program > > would have to include a prologue that linearises the xdp_buff if > > necessary which implies allocating memory. I don't know how hard it is > > to implement this. > > I think it would be somewhat non-trivial, and more importantly would > absolutely slaughter performance. And if you're using XDP, presumably > you care about that, so I'm not sure we're doing anyone any favours by > implementing such a compatibility layer? I see your point: having a single non-mb-aware program trash performance is bad for marketing. Better to not let people bump the MTU in that case. > > 3. Make non-linearity invisible to the BPF program > > > > Something I've wished for often is that I didn't have to deal with > > nonlinearity at all, based on my experience with cls_redirect [2]. > > It's really hard to write a BPF program that handles non-linear skb, > > especially when you have to call adjust_head, etc. which invalidates > > packet buffers. This is probably impossible, but maybe someone has a > > crazy idea? :) > > With the other helpers that we started discussing, I don't think you > have to? I.e., with an xdp_load_bytes() or an xdp_data_pointer()-type > helper that works across fragment boundaries I think you'd be fine, no? I'll take a look! Lorenz -- Lorenz Bauer | Systems Engineer 6th Floor, County Hall/The Riverside Building, SE1 7PB, UK www.cloudflare.com ^ permalink raw reply [flat|nested] 26+ messages in thread
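The xdp_load_bytes()-style helper mentioned just above can be modelled in userspace to show the semantics a ported program would rely on: copy a byte range starting at a packet offset, walking fragments as needed. The helper name and behaviour follow the proposal in the thread, not a settled kernel API:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct frag { const uint8_t *data; size_t len; };

/* Sketch of an xdp_load_bytes()-like copy across fragment boundaries. */
static int load_bytes(const struct frag *frags, size_t nfrags,
                      size_t off, void *dst, size_t len)
{
    uint8_t *out = dst;

    for (size_t i = 0; i < nfrags && len; i++) {
        if (off >= frags[i].len) {   /* skip fragments before the offset */
            off -= frags[i].len;
            continue;
        }
        size_t chunk = frags[i].len - off;
        if (chunk > len)
            chunk = len;
        memcpy(out, frags[i].data + off, chunk);
        out += chunk;
        len -= chunk;
        off = 0;
    }
    return len ? -EINVAL : 0;        /* ran past the end of the packet */
}

/* Demo: five payload bytes "abcde" split as "abc" + "de". */
static int demo_cross_boundary_ok(void)
{
    static const uint8_t a[] = { 'a', 'b', 'c' }, b[] = { 'd', 'e' };
    struct frag f[] = { { a, 3 }, { b, 2 } };
    uint8_t buf[3];

    if (load_bytes(f, 2, 1, buf, 3) != 0)
        return 0;
    return memcmp(buf, "bcd", 3) == 0;
}

static int demo_past_end_fails(void)
{
    static const uint8_t a[] = { 'a', 'b', 'c' }, b[] = { 'd', 'e' };
    struct frag f[] = { { a, 3 }, { b, 2 } };
    uint8_t buf[3];

    return load_bytes(f, 2, 3, buf, 3) == -EINVAL;
}
```

A program written against this copy-based API never sees fragment boundaries at all, which is what makes porting mechanical even without a first-fragment guarantee.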
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-24 10:18 ` Lorenz Bauer @ 2021-09-24 17:55 ` Zvi Effron 2021-09-24 19:38 ` Toke Høiland-Jørgensen 1 sibling, 0 replies; 26+ messages in thread From: Zvi Effron @ 2021-09-24 17:55 UTC (permalink / raw) To: Lorenz Bauer Cc: Toke Høiland-Jørgensen, Lorenzo Bianconi, Daniel Borkmann, John Fastabend, Networking, bpf On Fri, Sep 24, 2021 at 3:19 AM Lorenz Bauer <lmb@cloudflare.com> wrote: > > On Thu, 23 Sept 2021 at 13:59, Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > > > I don't think it has to be quite that bleak :) > > > > Specifically, there is no reason to block mb-aware programs from loading > > even when the multi-buffer mode is disabled. So a migration plan would > > look something like: > > ... > > > 2. Start porting all your XDP programs to make them mb-aware, and switch > > their program type as you do. In many cases this is just a matter of > > checking that the programs don't care about packet length. [...] > > Porting is only easy if we are guaranteed that the first PAGE_SIZE > bytes (or whatever the current limit is) are available via ->data > without trickery. Otherwise we have to convert all direct packet > access to the new API, whatever that ends up being. It seemed to me > like you were saying there is no such guarantee, and it could be > driver dependent, which is the worst possible outcome imo. This is the > status quo for TC classifiers, which is a great source of hard to > diagnose bugs. > > > 3. Once all your programs have been ported and marked as such, flip the > > sysctl. This will make the system start refusing to load any XDP > > programs that are not mb-aware. > > By this you mean reboot the system and early in boot change the > sysctl? That could work I guess. > I don't think a reboot is required. Just an unload of all mb-unaware XDP programs to allow adjusting the sysctl via normal sysctl changing methods (i.e. sysctl -w). 
Even if rebooting, it should just need an entry in /etc/sysctl.conf, nothing necessarily early in the boot process. The sysctl would only need to be set before the first load of an mb-unaware XDP program, which, if everything is ported correctly, would never happen. > > > 2. Add a compatibility shim for mb-unaware programs receiving an mb frame. > > > > > > We'd still need a way to indicate "MB-OK", but it could be a piece of > > > metadata on a bpf_prog. Whatever code dispatches to an XDP program > > > would have to include a prologue that linearises the xdp_buff if > > > necessary which implies allocating memory. I don't know how hard it is > > > to implement this. > > > > I think it would be somewhat non-trivial, and more importantly would > > absolutely slaughter performance. And if you're using XDP, presumably > > you care about that, so I'm not sure we're doing anyone any favours by > > implementing such a compatibility layer? > > I see your point: having a single non-mb-aware program trash > performance is bad for marketing. Better to not let people bump the > MTU in that case. > > > > 3. Make non-linearity invisible to the BPF program > > > > > > Something I've wished for often is that I didn't have to deal with > > > nonlinearity at all, based on my experience with cls_redirect [2]. > > > It's really hard to write a BPF program that handles non-linear skb, > > > especially when you have to call adjust_head, etc. which invalidates > > > packet buffers. This is probably impossible, but maybe someone has a > > > crazy idea? :) > > > > With the other helpers that we started discussing, I don't think you > > have to? I.e., with an xdp_load_bytes() or an xdp_data_pointer()-type > > helper that works across fragment boundaries I think you'd be fine, no? > > I'll take a look! 
> > Lorenz > > -- > Lorenz Bauer | Systems Engineer > 6th Floor, County Hall/The Riverside Building, SE1 7PB, UK > > www.cloudflare.com ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-24 10:18 ` Lorenz Bauer 2021-09-24 17:55 ` Zvi Effron @ 2021-09-24 19:38 ` Toke Høiland-Jørgensen 2021-09-28 8:47 ` Lorenz Bauer 1 sibling, 1 reply; 26+ messages in thread From: Toke Høiland-Jørgensen @ 2021-09-24 19:38 UTC (permalink / raw) To: Lorenz Bauer Cc: Lorenzo Bianconi, Daniel Borkmann, John Fastabend, Networking, bpf Lorenz Bauer <lmb@cloudflare.com> writes: > On Thu, 23 Sept 2021 at 13:59, Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> I don't think it has to be quite that bleak :) >> >> Specifically, there is no reason to block mb-aware programs from loading >> even when the multi-buffer mode is disabled. So a migration plan would >> look something like: > > ... > >> 2. Start porting all your XDP programs to make them mb-aware, and switch >> their program type as you do. In many cases this is just a matter of >> checking that the programs don't care about packet length. [...] > > Porting is only easy if we are guaranteed that the first PAGE_SIZE > bytes (or whatever the current limit is) are available via ->data > without trickery. Otherwise we have to convert all direct packet > access to the new API, whatever that ends up being. It seemed to me > like you were saying there is no such guarantee, and it could be > driver dependent, which is the worst possible outcome imo. This is the > status quo for TC classifiers, which is a great source of hard to > diagnose bugs. Well, for the changes we're proposing now it will certainly be the case that the first PAGE_SIZE will always be present. But once we have the capability, I would expect people would want to do more with it, so we can't really guarantee this in the future. We could require that any other use be opt-in at the driver level, I suppose, but not sure if that would be enough? -Toke ^ permalink raw reply [flat|nested] 26+ messages in thread
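One defensive pattern for porting under the uncertain first-fragment guarantee discussed above: bounds-check whether the headers the program needs fit inside the linear part before using direct data/data_end access, and fall back to a copying helper otherwise. A sketch of that decision, with the guarantee size as an explicit assumption rather than a kernel-defined constant:

```c
#include <stdbool.h>
#include <stddef.h>

#define FIRST_FRAG_LEN 4096  /* assumed guarantee; the thread leaves this open */

/* Direct packet access is only safe while the parse offset stays inside
 * the first fragment; past that, a ported program must switch to a
 * copying helper such as the proposed xdp_load_bytes(). */
static bool can_use_direct_access(size_t parse_off, size_t hdr_len,
                                  size_t first_frag_len)
{
    return parse_off + hdr_len <= first_frag_len;
}
```

If the guarantee ever shrinks below PAGE_SIZE, only `first_frag_len` changes; the parsing code keeps the same shape.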
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-24 19:38 ` Toke Høiland-Jørgensen @ 2021-09-28 8:47 ` Lorenz Bauer 2021-09-28 13:43 ` Toke Høiland-Jørgensen 0 siblings, 1 reply; 26+ messages in thread From: Lorenz Bauer @ 2021-09-28 8:47 UTC (permalink / raw) To: Toke Høiland-Jørgensen Cc: Lorenzo Bianconi, Daniel Borkmann, John Fastabend, Networking, bpf On Fri, 24 Sept 2021 at 20:38, Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > > > Porting is only easy if we are guaranteed that the first PAGE_SIZE > > bytes (or whatever the current limit is) are available via ->data > > without trickery. Otherwise we have to convert all direct packet > > access to the new API, whatever that ends up being. It seemed to me > > like you were saying there is no such guarantee, and it could be > > driver dependent, which is the worst possible outcome imo. This is the > > status quo for TC classifiers, which is a great source of hard to > > diagnose bugs. > > Well, for the changes we're proposing now it will certainly be the case > that the first PAGE_SIZE will always be present. But once we have the > capability, I would expect people would want to do more with it, so we > can't really guarantee this in the future. We could require that any > other use be opt-in at the driver level, I suppose, but not sure if that > would be enough? I'm not sure what you mean by "opt-in at driver level"? Make smaller initial fragments a feature on the driver? We shouldn't let drivers dictate the semantics of a program type, it defeats the purpose of the context abstraction. We're using XDP precisely because we don't want to deal with vendor specific network stack bypass, etc. I would prefer not specifying the first fragment size over the driver knob, unfortunately it invalidates your assumption that porting is going to be trivial. 
Lorenz -- Lorenz Bauer | Systems Engineer 6th Floor, County Hall/The Riverside Building, SE1 7PB, UK www.cloudflare.com ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Redux: Backwards compatibility for XDP multi-buff 2021-09-28 8:47 ` Lorenz Bauer @ 2021-09-28 13:43 ` Toke Høiland-Jørgensen 0 siblings, 0 replies; 26+ messages in thread From: Toke Høiland-Jørgensen @ 2021-09-28 13:43 UTC (permalink / raw) To: Lorenz Bauer Cc: Lorenzo Bianconi, Daniel Borkmann, John Fastabend, Networking, bpf Lorenz Bauer <lmb@cloudflare.com> writes: > On Fri, 24 Sept 2021 at 20:38, Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> > >> > Porting is only easy if we are guaranteed that the first PAGE_SIZE >> > bytes (or whatever the current limit is) are available via ->data >> > without trickery. Otherwise we have to convert all direct packet >> > access to the new API, whatever that ends up being. It seemed to me >> > like you were saying there is no such guarantee, and it could be >> > driver dependent, which is the worst possible outcome imo. This is the >> > status quo for TC classifiers, which is a great source of hard to >> > diagnose bugs. >> >> Well, for the changes we're proposing now it will certainly be the case >> that the first PAGE_SIZE will always be present. But once we have the >> capability, I would expect people would want to do more with it, so we >> can't really guarantee this in the future. We could require that any >> other use be opt-in at the driver level, I suppose, but not sure if that >> would be enough? > > I'm not sure what you mean by "opt-in at driver level"? Make smaller > initial fragments a feature on the driver? We shouldn't let drivers > dictate the semantics of a program type, it defeats the purpose of the > context abstraction. We're using XDP precisely because we don't want > to deal with vendor specific network stack bypass, etc. 
I would prefer > not specifying the first fragment size over the driver knob, I didn't mean that every driver should get to just define their own semantics; rather the opposite: If we do end up adding support for anything other than "first page of data is in the first fragment", we should define separate semantics for this and make it an explicit knob to turn on such a mode. > unfortunately it invalidates your assumption that porting is going to > be trivial. Well, I did say "for some programs" ;) But OK, point taken. It would be great if you could chime in on the helper discussion[0]. I.e., which helper API would make porting simplest for you? -Toke [0] https://lore.kernel.org/r/20210920110216.4c54c9a3@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2021-09-28 13:43 UTC | newest] Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-09-21 16:06 Redux: Backwards compatibility for XDP multi-buff Toke Høiland-Jørgensen 2021-09-21 17:31 ` Zvi Effron 2021-09-21 18:22 ` Toke Høiland-Jørgensen 2021-09-21 19:17 ` Zvi Effron 2021-09-21 22:14 ` Toke Høiland-Jørgensen 2021-09-21 23:10 ` Zvi Effron 2021-09-22 20:13 ` Toke Høiland-Jørgensen 2021-09-21 20:12 ` Alexei Starovoitov 2021-09-21 22:20 ` Toke Høiland-Jørgensen 2021-09-21 22:51 ` Jakub Kicinski 2021-09-22 20:01 ` Toke Høiland-Jørgensen 2021-09-22 21:23 ` Zvi Effron 2021-09-23 18:45 ` Toke Høiland-Jørgensen 2021-09-23 13:46 ` Jakub Kicinski 2021-09-27 12:43 ` Jesper Dangaard Brouer 2021-09-21 22:54 ` Jakub Kicinski 2021-09-22 20:02 ` Toke Høiland-Jørgensen 2021-09-22 21:11 ` Zvi Effron 2021-09-23 19:00 ` Toke Høiland-Jørgensen 2021-09-23 10:33 ` Lorenz Bauer 2021-09-23 12:59 ` Toke Høiland-Jørgensen 2021-09-24 10:18 ` Lorenz Bauer 2021-09-24 17:55 ` Zvi Effron 2021-09-24 19:38 ` Toke Høiland-Jørgensen 2021-09-28 8:47 ` Lorenz Bauer 2021-09-28 13:43 ` Toke Høiland-Jørgensen