netdev.vger.kernel.org archive mirror
* XDP multi-buffer design discussion
       [not found]           ` <ec2fd7f6da44410fbaeb021cf984f2f6@EX13D11EUC003.ant.amazon.com>
@ 2019-12-16 14:07             ` Jesper Dangaard Brouer
  2019-12-17  4:15               ` Luigi Rizzo
  0 siblings, 1 reply; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2019-12-16 14:07 UTC (permalink / raw)
  To: Jubran, Samih
  Cc: Machulsky, Zorik, Daniel Borkmann, David Miller, Tzalik, Guy,
	Ilias Apalodimas, Toke Høiland-Jørgensen, Kiyanovski,
	Arthur, brouer, Alexei Starovoitov, netdev, David Ahern


See answers inlined below (please get an email client that supports
inline replies... to interact with this community)

On Sun, 15 Dec 2019 13:57:12 +0000
"Jubran, Samih" <sameehj@amazon.com> wrote:

> I am currently working on writing a design document for the XDP multi
> buff and I am using your proposal as a base. 

[Base-doc]: https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org

I would really appreciate it if you could write your design document in a
text-based format, so that it can be included in the XDP-project
git-repo.  If you don't want to update the existing document
xdp-multi-buffer01-design.org, I suggest that you instead create
xdp-multi-buffer02-design.org to lay out your design proposal.

That said, don't write a huge document... instead interact with
netdev@vger.kernel.org as early as possible, and then update the design
doc with the input you get.  Let's start it now... Cc'ing netdev, as this
discussion should also be public.


> I have few questions in mind that weren't addressed in your draft and
> it would be great if you share your thoughts on them.
> 
> * Why should we provide the fragments to the bpf program if the
> program doesn't access them? If validating the length is what
> matters, we can provide only the full length info to the user with no
> issues.

My Proposal#1 (in [base-doc]) is that XDP only gets access to the
first buffer.  People are welcome to challenge this choice.

There are several sub-questions and challenges hidden inside this
choice.

As you hint, the total length... spawns some questions we should answer:

 (1) is it relevant for the BPF program to know this? Explain the use-case.

 (2) if so, how does the BPF prog access this info (without slowing down
     the baseline)?

 (3) if so, this implies the driver needs to collect the frags before
     calling the bpf_prog (which influences the driver RX-loop design).


> * In case we do need the fragments, should they be modifiable
> (Without helpers) by the xdp developer? 

It is okay to think about how we can give access to fragments in the
future, but IMHO we should avoid going too far down that path... If we
just make sure we can extend it later, that should be enough.


> * What about data_end? I believe it should point to the end of the
> first buffer, correct?

Yes, because this is part of BPF-verifier checks.
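
To illustrate why: the verifier only accepts direct packet reads that are
first bounds-checked against data_end, roughly like the minimal sketch
below (not code from this thread; include paths follow current libbpf
conventions):

#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("xdp")
int data_end_bounds_example(struct xdp_md *ctx)
{
	void *data     = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;
	struct ethhdr *eth = data;

	/* The verifier rejects the read of eth->h_proto below unless this
	 * comparison against data_end comes first, which is why data_end
	 * must keep pointing at the end of the buffer the program may
	 * actually touch (the first buffer, in the multi-buffer case). */
	if ((void *)(eth + 1) > data_end)
		return XDP_DROP;

	if (eth->h_proto == bpf_htons(ETH_P_ARP))
		return XDP_DROP;	/* arbitrary example verdict */

	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";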


> * Should the kernel indicate to the driver somehow that it supports
> multi buf? I suppose this shouldn't be an issue unless somehow the
> patches were back patched to old kernels.
> 

The other way around.  The driver needs to indicate to the kernel that
it supports/has enabled XDP multi-buffer.  This feature "indication"
interface is unfortunately not available today...

The reason this is needed: the BPF-helper bpf_xdp_adjust_tail() is
allowed to modify xdp_buff->data_end (as also described in [base-doc]).
Even though this only shrinks the frame, it seems very wrong to
change/shrink only the first fragment.

IMHO the BPF-loader (or XDP-attach) should simply reject programs using
bpf_xdp_adjust_tail() on a driver that has enabled XDP multi-buffer.
Something similar basically happens today when trying to attach XDP on a
NIC with a large MTU (one that requires two or more pages): the driver
rejects the attach.
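
For reference, a minimal sketch of that existing rejection pattern in a
driver's XDP setup path; the driver name, the MTU constant and the error
code below are made up for illustration, not taken from any particular
driver:

#include <linux/netdevice.h>
#include <linux/bpf.h>

/* Hypothetical: largest MTU whose frames still fit in a single buffer. */
#define MYDRV_XDP_MAX_MTU	3520

static int mydrv_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
{
	if (prog && dev->mtu > MYDRV_XDP_MAX_MTU) {
		netdev_warn(dev, "MTU %u too large for XDP (max %u)\n",
			    dev->mtu, MYDRV_XDP_MAX_MTU);
		return -EOPNOTSUPP;	/* frames would need more than one buffer */
	}

	/* ... otherwise swap in the new program under the usual locking ... */
	return 0;
}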

--Jesper



> > -----Original Message-----
> > From: Jesper Dangaard Brouer <brouer@redhat.com>
> > Sent: Wednesday, December 4, 2019 4:55 PM
> > To: Machulsky, Zorik <zorik@amazon.com>
> > Cc: Daniel Borkmann <borkmann@iogearbox.net>; David Miller
> > <davem@davemloft.net>; Jubran, Samih <sameehj@amazon.com>; Tzalik,
> > Guy <gtzalik@amazon.com>; brouer@redhat.com; Ilias Apalodimas
> > <ilias.apalodimas@linaro.org>; Toke Høiland-Jørgensen <toke@redhat.com>
> > Subject: Re: XDP_TX in ENA
> > 
> > On Mon, 2 Dec 2019 08:17:08 +0000
> > "Machulsky, Zorik" <zorik@amazon.com> wrote:
> >   
> > > Hi Jesper,
> > >
> > > Just wanted to inform you that Samih (cc-ed) started working on
> > > multi-buffer packets support. I hope it will be OK to reach out to
> > > this forum in case there will be questions during this work.  
> > 
> > Great to hear that you are continuing the work.
> > 
> > I did notice the patchset ("Introduce XDP to ena") from Sameeh, but net-
> > next is currently closed.  I will appreciate if you can Cc both me
> > (brouer@redhat.com) and Ilias Apalodimas <ilias.apalodimas@linaro.org>.
> > 
> > Ilias have signed up for doing driver XDP reviews.
> > 
> > --Jesper
> > 
> >   
> > > On 8/22/19, 11:47 PM, "Jesper Dangaard Brouer" <brouer@redhat.com>  
> > wrote:  
> > >
> > >     Hi Zorik,
> > >
> > >     How do you plan to handle multi-buffer packets (a.k.a jumbo-frames, and  
> > >     more)?
> > >
> > >     Most drivers, when XDP gets loaded, just limit the MTU and disable TSO
> > >     (notice GRO in software is still done). Or reject XDP loading if
> > >     MTU > 3520 or TSO is enabled.
> > >
> > >     You seemed to want XDP multi-buffer support.  For this to happen
> > >     someone needs to work on this.  I've written up a design proposal
> > >     here[1], but I don't have time to work on this... Can you allocate
> > >     resources to work on this?
> > >
> > >     [1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org
> > >
> > >     --Jesper
> > >     (top-post as your email client seems to be challenged ;-))


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer



* Re: XDP multi-buffer design discussion
  2019-12-16 14:07             ` XDP multi-buffer design discussion Jesper Dangaard Brouer
@ 2019-12-17  4:15               ` Luigi Rizzo
  2019-12-17  8:46                 ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 11+ messages in thread
From: Luigi Rizzo @ 2019-12-17  4:15 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Jubran, Samih, Machulsky, Zorik, Daniel Borkmann, David Miller,
	Tzalik, Guy, Ilias Apalodimas, Toke Høiland-Jørgensen,
	Kiyanovski, Arthur, Alexei Starovoitov, netdev, David Ahern

On Mon, Dec 16, 2019 at 6:07 AM Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
>
> See answers inlined below (please get an email client that support
> inline replies... to interact with this community)
>
> On Sun, 15 Dec 2019 13:57:12 +0000
> "Jubran, Samih" <sameehj@amazon.com> wrote:
...
> > * Why should we provide the fragments to the bpf program if the
> > program doesn't access them? If validating the length is what
> > matters, we can provide only the full length info to the user with no
> > issues.
>
> My Proposal#1 (in [base-doc]) is that XDP only get access to the
> first-buffer.  People are welcome to challenge this choice.
>
> There are a several sub-questions and challenges hidden inside this
> choice.
>
> As you hint, the total length... spawns some questions we should answer:
>
>  (1) is it relevant to the BPF program to know this, explain the use-case.
>
>  (2) if so, how does BPF prog access info (without slowdown baseline)

For some use cases, the bpf program could deduce the total length by
looking at the L3 header. It won't work for the XDP_TX response though.
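
As an illustration of that idea (a sketch, not code from this mail): for
IPv4 the total length can be read from the IP header even when only the
first buffer is visible; include paths follow current libbpf conventions:

#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("xdp")
int total_len_from_l3(struct xdp_md *ctx)
{
	void *data     = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;
	struct ethhdr *eth = data;
	struct iphdr *iph  = data + sizeof(*eth);

	if ((void *)(iph + 1) > data_end)
		return XDP_PASS;
	if (eth->h_proto != bpf_htons(ETH_P_IP))
		return XDP_PASS;

	/* tot_len covers the whole IP packet, i.e. all buffers of this
	 * frame, even though only the first buffer is accessible here. */
	__u32 total_len = bpf_ntohs(iph->tot_len) + sizeof(*eth);

	if (total_len < sizeof(*eth) + sizeof(*iph))
		return XDP_DROP;	/* malformed length field */

	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";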

cheers
luigi


* Re: XDP multi-buffer design discussion
  2019-12-17  4:15               ` Luigi Rizzo
@ 2019-12-17  8:46                 ` Jesper Dangaard Brouer
  2019-12-17  9:00                   ` Toke Høiland-Jørgensen
  2019-12-17 22:30                   ` Luigi Rizzo
  0 siblings, 2 replies; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2019-12-17  8:46 UTC (permalink / raw)
  To: Luigi Rizzo
  Cc: Jubran, Samih, Machulsky, Zorik, Daniel Borkmann, David Miller,
	Tzalik, Guy, Ilias Apalodimas, Toke Høiland-Jørgensen,
	Kiyanovski, Arthur, Alexei Starovoitov, netdev, David Ahern,
	brouer

On Mon, 16 Dec 2019 20:15:12 -0800
Luigi Rizzo <rizzo@iet.unipi.it> wrote:

> On Mon, Dec 16, 2019 at 6:07 AM Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> >
> >
> > See answers inlined below (please get an email client that support
> > inline replies... to interact with this community)
> >
> > On Sun, 15 Dec 2019 13:57:12 +0000
> > "Jubran, Samih" <sameehj@amazon.com> wrote:  
> ...
> > > * Why should we provide the fragments to the bpf program if the
> > > program doesn't access them? If validating the length is what
> > > matters, we can provide only the full length info to the user with no
> > > issues.  
> >
> > My Proposal#1 (in [base-doc]) is that XDP only get access to the
> > first-buffer.  People are welcome to challenge this choice.
> >
> > There are a several sub-questions and challenges hidden inside this
> > choice.
> >
> > As you hint, the total length... spawns some questions we should answer:
> >
> >  (1) is it relevant to the BPF program to know this, explain the use-case.
> >
> >  (2) if so, how does BPF prog access info (without slowdown baseline)  
> 
> For some use cases, the bpf program could deduct the total length
> looking at the L3 header. 

Yes, that is actually a good insight.  I guess the BPF-program could also
use this to detect that it doesn't have access to the full linear
packet(?)

> It won't work for XDP_TX response though.

The XDP_TX case also needs to be discussed/handled. IMHO we need to
support XDP_TX for multi-buffer frames.  XDP_TX *can* be driver specific,
but most drivers choose to convert the xdp_buff to an xdp_frame, which
makes it possible to use/share part of the XDP_REDIRECT code from
ndo_xdp_xmit.

We also need to handle XDP_REDIRECT, which becomes challenging, as the
ndo_xdp_xmit functions of *all* drivers need to be updated (or we
introduce a flag to handle this incrementally).
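
For context, a rough sketch of the xdp_buff -> xdp_frame pattern referred
to above.  convert_to_xdp_frame(), ndo_xdp_xmit() and XDP_XMIT_FLUSH are
the existing kernel interfaces; the mydrv_* function is a made-up
stand-in (real drivers usually share internal TX-ring helpers instead of
literally calling their own ndo):

#include <linux/netdevice.h>
#include <net/xdp.h>

static void mydrv_xdp_tx(struct net_device *dev, struct xdp_buff *xdp)
{
	struct xdp_frame *xdpf = convert_to_xdp_frame(xdp);

	if (unlikely(!xdpf))
		return;		/* conversion failed, treat as drop */

	/* Same transmit interface that XDP_REDIRECT uses; multi-buffer
	 * support means xdp_frame and this path must learn about the
	 * extra segments as well. */
	if (dev->netdev_ops->ndo_xdp_xmit(dev, 1, &xdpf, XDP_XMIT_FLUSH) != 1)
		xdp_return_frame(xdpf);
}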


Sameeh, I know you have read the section[1] on "Storage space for
multi-buffer references/segments", and that you updated the doc in the
git-tree.  So you should understand that I want to keep this compatible
with how the SKB stores segments, which will make XDP_PASS a lot
easier/faster.
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

[1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org#storage-space-for-multi-buffer-referencessegments



* Re: XDP multi-buffer design discussion
  2019-12-17  8:46                 ` Jesper Dangaard Brouer
@ 2019-12-17  9:00                   ` Toke Høiland-Jørgensen
  2019-12-17 15:44                     ` Jubran, Samih
  2019-12-17 22:30                   ` Luigi Rizzo
  1 sibling, 1 reply; 11+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-12-17  9:00 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Luigi Rizzo
  Cc: Jubran, Samih, Machulsky, Zorik, Daniel Borkmann, David Miller,
	Tzalik, Guy, Ilias Apalodimas, Kiyanovski, Arthur,
	Alexei Starovoitov, netdev, David Ahern, brouer

Jesper Dangaard Brouer <brouer@redhat.com> writes:

> On Mon, 16 Dec 2019 20:15:12 -0800
> Luigi Rizzo <rizzo@iet.unipi.it> wrote:
>
>> On Mon, Dec 16, 2019 at 6:07 AM Jesper Dangaard Brouer
>> <brouer@redhat.com> wrote:
>> >
>> >
>> > See answers inlined below (please get an email client that support
>> > inline replies... to interact with this community)
>> >
>> > On Sun, 15 Dec 2019 13:57:12 +0000
>> > "Jubran, Samih" <sameehj@amazon.com> wrote:  
>> ...
>> > > * Why should we provide the fragments to the bpf program if the
>> > > program doesn't access them? If validating the length is what
>> > > matters, we can provide only the full length info to the user with no
>> > > issues.  
>> >
>> > My Proposal#1 (in [base-doc]) is that XDP only get access to the
>> > first-buffer.  People are welcome to challenge this choice.
>> >
>> > There are a several sub-questions and challenges hidden inside this
>> > choice.
>> >
>> > As you hint, the total length... spawns some questions we should answer:
>> >
>> >  (1) is it relevant to the BPF program to know this, explain the use-case.
>> >
>> >  (2) if so, how does BPF prog access info (without slowdown baseline)  
>> 
>> For some use cases, the bpf program could deduct the total length
>> looking at the L3 header. 
>
> Yes, that actually good insight.  I guess the BPF-program could also
> use this to detect that it doesn't have access to the full-lineary
> packet this way(?)
>
>> It won't work for XDP_TX response though.
>
> The XDP_TX case also need to be discussed/handled. IMHO need to support
> XDP_TX for multi-buffer frames.  XDP_TX *can* be driver specific, but
> most drivers choose to convert xdp_buff to xdp_frame, which makes it
> possible to use/share part of the XDP_REDIRECT code from ndo_xdp_xmit.
>
> We also need to handle XDP_REDIRECT, which becomes challenging, as the
> ndo_xdp_xmit functions of *all* drivers need to be updated (or
> introduce a flag to handle this incrementally).

If we want to handle TX and REDIRECT (which I agree we do!), doesn't
that imply that we have to structure the drivers so the XDP program
isn't executed until we have the full packet (i.e., on the last
segment)?

-Toke



* RE: XDP multi-buffer design discussion
  2019-12-17  9:00                   ` Toke Høiland-Jørgensen
@ 2019-12-17 15:44                     ` Jubran, Samih
  0 siblings, 0 replies; 11+ messages in thread
From: Jubran, Samih @ 2019-12-17 15:44 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jesper Dangaard Brouer, Luigi Rizzo
  Cc: Machulsky, Zorik, Daniel Borkmann, David Miller, Tzalik, Guy,
	Ilias Apalodimas, Kiyanovski, Arthur, Alexei Starovoitov, netdev,
	David Ahern, brouer



> -----Original Message-----
> From: Toke Høiland-Jørgensen <toke@redhat.com>
> Sent: Tuesday, December 17, 2019 11:01 AM
> To: Jesper Dangaard Brouer <brouer@redhat.com>; Luigi Rizzo
> <rizzo@iet.unipi.it>
> Cc: Jubran, Samih <sameehj@amazon.com>; Machulsky, Zorik
> <zorik@amazon.com>; Daniel Borkmann <borkmann@iogearbox.net>; David
> Miller <davem@davemloft.net>; Tzalik, Guy <gtzalik@amazon.com>; Ilias
> Apalodimas <ilias.apalodimas@linaro.org>; Kiyanovski, Arthur
> <akiyano@amazon.com>; Alexei Starovoitov <ast@kernel.org>;
> netdev@vger.kernel.org; David Ahern <dsahern@gmail.com>;
> brouer@redhat.com
> Subject: Re: XDP multi-buffer design discussion
> 
> Jesper Dangaard Brouer <brouer@redhat.com> writes:
> 
> > On Mon, 16 Dec 2019 20:15:12 -0800
> > Luigi Rizzo <rizzo@iet.unipi.it> wrote:
> >
> >> On Mon, Dec 16, 2019 at 6:07 AM Jesper Dangaard Brouer
> >> <brouer@redhat.com> wrote:
> >> >
> >> >
> >> > See answers inlined below (please get an email client that support
> >> > inline replies... to interact with this community)
> >> >
> >> > On Sun, 15 Dec 2019 13:57:12 +0000
> >> > "Jubran, Samih" <sameehj@amazon.com> wrote:
> >> ...
> >> > > * Why should we provide the fragments to the bpf program if the
> >> > > program doesn't access them? If validating the length is what
> >> > > matters, we can provide only the full length info to the user
> >> > > with no issues.
> >> >
> >> > My Proposal#1 (in [base-doc]) is that XDP only get access to the
> >> > first-buffer.  People are welcome to challenge this choice.
> >> >
> >> > There are a several sub-questions and challenges hidden inside this
> >> > choice.
> >> >
> >> > As you hint, the total length... spawns some questions we should
> answer:
> >> >
> >> >  (1) is it relevant to the BPF program to know this, explain the use-case.
> >> >
> >> >  (2) if so, how does BPF prog access info (without slowdown
> >> > baseline)
> >>
> >> For some use cases, the bpf program could deduct the total length
> >> looking at the L3 header.
> >
> > Yes, that actually good insight.  I guess the BPF-program could also
> > use this to detect that it doesn't have access to the full-lineary
> > packet this way(?)
> >
> >> It won't work for XDP_TX response though.
> >
> > The XDP_TX case also need to be discussed/handled. IMHO need to
> > support XDP_TX for multi-buffer frames.  XDP_TX *can* be driver
> > specific, but most drivers choose to convert xdp_buff to xdp_frame,
> > which makes it possible to use/share part of the XDP_REDIRECT code from
> ndo_xdp_xmit.
> >
> > We also need to handle XDP_REDIRECT, which becomes challenging, as the
> > ndo_xdp_xmit functions of *all* drivers need to be updated (or
> > introduce a flag to handle this incrementally).
> 
> If we want to handle TX and REDIRECT (which I agree we do!), doesn't that
> imply that we have to structure the drivers so the XDP program isn't
> executed until we have the full packet (i.e., on the last segment)?
> 
> -Toke

Hi All,

Thank you for participating in this discussion; everything mentioned above is good insight.
I'd like to sum it up with the following key points. Please tell me if I'm missing something, and share your opinions.

* The rationale for supplying the frags is the following:
** XDP_PASS optimization: in case the verdict is XDP_PASS, creating the skb can save some work
** XDP_TX: when the verdict is XDP_TX, we need all the frags ready in order to send the packet
** The rx-loop of the driver must be modified anyway, so why not supply information that has already been
   deduced; this should also be great for the future, in case XDP becomes able to access the frags

* Since the XDP program won't be accessing the frags, we don't need to modify the xdp_convert_ctx_access() function, correct?

* We do need to add a way for the driver to indicate to the kernel that it supports multi-buff. This is needed so we can reject programs that use bpf_xdp_adjust_tail().

- Sameeh



* Re: XDP multi-buffer design discussion
  2019-12-17  8:46                 ` Jesper Dangaard Brouer
  2019-12-17  9:00                   ` Toke Høiland-Jørgensen
@ 2019-12-17 22:30                   ` Luigi Rizzo
  2019-12-18 16:03                     ` Jubran, Samih
  1 sibling, 1 reply; 11+ messages in thread
From: Luigi Rizzo @ 2019-12-17 22:30 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Jubran, Samih, Machulsky, Zorik, Daniel Borkmann, David Miller,
	Tzalik, Guy, Ilias Apalodimas, Toke Høiland-Jørgensen,
	Kiyanovski, Arthur, Alexei Starovoitov, netdev, David Ahern

On Tue, Dec 17, 2019 at 12:46 AM Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> On Mon, 16 Dec 2019 20:15:12 -0800
> Luigi Rizzo <rizzo@iet.unipi.it> wrote:
>...
> > For some use cases, the bpf program could deduct the total length
> > looking at the L3 header.
>
> Yes, that actually good insight.  I guess the BPF-program could also
> use this to detect that it doesn't have access to the full-lineary
> packet this way(?)
>
> > It won't work for XDP_TX response though.
>
> The XDP_TX case also need to be discussed/handled. IMHO need to support
> XDP_TX for multi-buffer frames.  XDP_TX *can* be driver specific, but
> most drivers choose to convert xdp_buff to xdp_frame, which makes it
> possible to use/share part of the XDP_REDIRECT code from ndo_xdp_xmit.
>
> We also need to handle XDP_REDIRECT, which becomes challenging, as the
> ndo_xdp_xmit functions of *all* drivers need to be updated (or
> introduce a flag to handle this incrementally).

Here is a possible course of action (please let me know if you find loose ends)

1. extend struct xdp_buff with a total length and an sk_buff * (NULL by default);
2. add a netdev callback to construct the skb for the current packet.
   This code obviously already exists in all drivers, it just needs to be
   exposed as a function callable by a bpf helper (next bullet);
3. add a new helper 'bpf_create_skb' that, when invoked, calls the
   previously mentioned netdev callback to construct an skb for the
   current packet, and sets the pointer in the xdp_buff, if not there
   already.  A bpf program that needs to access segments beyond the first
   one can call bpf_create_skb() if needed, and then use existing helpers
   (skb_load_bytes, skb_store_bytes, etc.) to access the skb.

   My rationale is that if we need to access multiple segments, we are
   already in expensive territory and it makes little sense to define a
   multi-segment format that would essentially be an skb.

4. implement a mechanism to let the driver know whether the currently
   loaded bpf program understands the new format.
   There are multiple ways to do that; a trivial one would be to check,
   during load, that the program calls some known helper, e.g.
   bpf_understands_fragments(), which is then jit-ed to something
   inexpensive.

   Note that today, a netdev that cannot guarantee single-segment packets
   would not be able to enable xdp.  Hence, without loss of functionality,
   such a netdev can refuse to load a program without
   bpf_understands_fragments().  (A rough sketch of the structures and
   helpers this proposal implies follows below.)
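
A rough spelling-out of items 1-4, purely illustrative; none of these
fields, callbacks or helpers exist in today's kernel:

#include <linux/types.h>
#include <linux/skbuff.h>
#include <linux/netdevice.h>
#include <linux/bpf.h>

/* Item 1: proposed additions to struct xdp_buff (existing fields elided). */
struct xdp_buff {
	/* ... existing fields: data, data_end, data_meta, ... */
	u32 total_len;			/* length across all segments */
	struct sk_buff *skb;		/* NULL until bpf_create_skb() runs */
};

/* Item 2: proposed netdev callback that builds an skb for the frame
 * currently being processed in the RX loop. */
struct sk_buff *(*ndo_xdp_build_skb)(struct net_device *dev,
				     struct xdp_buff *xdp);

/* Items 3 and 4: proposed helpers as seen from the program side. */
long bpf_create_skb(struct xdp_md *ctx);	/* 0 on success */
long bpf_understands_fragments(void);		/* marker, jit-ed away */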

With all the above, the generic xdp handler would do the following:

  if (skb_is_nonlinear(skb) && !bpf_understands_fragments())
          <linearize skb>;
  <construct xdp_buff with first segment and skb>  // skb is unused by old-style programs
  <call bpf program>

The native driver for a device that cannot guarantee a single segment
would just refuse to load a program that does not understand them (same
as today), so the code would be:

  <construct xdp_buff with first segment and empty skb>
  <call bpf program>

On return, we might find that an skb has been built by the xdp program,
and it can be immediately used for XDP_PASS (or dropped in case of
XDP_DROP).  For XDP_TX and XDP_REDIRECT, something similar: if the packet
is a single segment and there is no skb, use the existing accelerated
path.  If there are multiple segments, construct the skb if it does not
exist already, and pass it to the standard tx path.

cheers
luigi


* RE: XDP multi-buffer design discussion
  2019-12-17 22:30                   ` Luigi Rizzo
@ 2019-12-18 16:03                     ` Jubran, Samih
  2019-12-19 10:44                       ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 11+ messages in thread
From: Jubran, Samih @ 2019-12-18 16:03 UTC (permalink / raw)
  To: Luigi Rizzo, Jesper Dangaard Brouer
  Cc: Machulsky, Zorik, Daniel Borkmann, David Miller, Tzalik, Guy,
	Ilias Apalodimas, Toke Høiland-Jørgensen, Kiyanovski,
	Arthur, Alexei Starovoitov, netdev, David Ahern



> -----Original Message-----
> From: Luigi Rizzo <rizzo@iet.unipi.it>
> Sent: Wednesday, December 18, 2019 12:30 AM
> To: Jesper Dangaard Brouer <brouer@redhat.com>
> Cc: Jubran, Samih <sameehj@amazon.com>; Machulsky, Zorik
> <zorik@amazon.com>; Daniel Borkmann <borkmann@iogearbox.net>; David
> Miller <davem@davemloft.net>; Tzalik, Guy <gtzalik@amazon.com>; Ilias
> Apalodimas <ilias.apalodimas@linaro.org>; Toke Høiland-Jørgensen
> <toke@redhat.com>; Kiyanovski, Arthur <akiyano@amazon.com>; Alexei
> Starovoitov <ast@kernel.org>; netdev@vger.kernel.org; David Ahern
> <dsahern@gmail.com>
> Subject: Re: XDP multi-buffer design discussion
> 
> On Tue, Dec 17, 2019 at 12:46 AM Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> >
> > On Mon, 16 Dec 2019 20:15:12 -0800
> > Luigi Rizzo <rizzo@iet.unipi.it> wrote:
> >...
> > > For some use cases, the bpf program could deduct the total length
> > > looking at the L3 header.
> >
> > Yes, that actually good insight.  I guess the BPF-program could also
> > use this to detect that it doesn't have access to the full-lineary
> > packet this way(?)
> >
> > > It won't work for XDP_TX response though.
> >
> > The XDP_TX case also need to be discussed/handled. IMHO need to
> > support XDP_TX for multi-buffer frames.  XDP_TX *can* be driver
> > specific, but most drivers choose to convert xdp_buff to xdp_frame,
> > which makes it possible to use/share part of the XDP_REDIRECT code from
> ndo_xdp_xmit.
> >
> > We also need to handle XDP_REDIRECT, which becomes challenging, as the
> > ndo_xdp_xmit functions of *all* drivers need to be updated (or
> > introduce a flag to handle this incrementally).
> 
> Here is a possible course of action (please let me know if you find loose ends)
> 
> 1. extend struct xdp_buff with a total length and sk_buff * (NULL by default);
> 2. add a netdev callback to construct the skb for the current packet.
>   This code obviously already in all drivers, just needs to be exposed as
> function
>   callable by a bpf helper (next bullet); 3. add a new helper 'bpf_create_skb'
> that when invoked calls the previously
>   mentioned netdev callback to  constructs an skb for the current packet,
>   and sets the pointer in the xdp_buff, if not there already.
>   A bpf program that needs to access segments beyond the first one can
>   call bpf_create_skb() if needed, and then use existing helpers
>   skb_load_bytes, skb_store_bytes, etc) to access the skb.
> 
>   My rationale is that if we need to access multiple segments, we are already
>   in an expensive territory and it makes little sense to define a multi segment
>   format that would essentially be an skb.
> 
> 4. implement a mechanism to let so the driver know whether the currently
>   loaded bpf program understands the new format.
>   There are multiple ways to do that, a trivial one would be to check, during
> load,
>   that the program calls some known helper eg
> bpf_understands_fragments()
>   which is then jit-ed to somethijng inexpensive
> 
>   Note that today, a netdev that cannot guarantee single segment packets
> would not
>   be able to enable xdp. Hence, without loss of functionality, such netdev can
> refuse to
>   load a program without bpf_undersdands_fragments().
> 
> With all the above, the generic xdp handler would do the following:
>  if (!skb_is_linear() && !bpf_understands_fragments()) {
>     < linearize skb>;
>  }
>   <construct xdp_buff with first segment and skb> // skb is unused by old
> style programs
>   <call bpf program>
> 
> The native driver for a device that cannot guarantee a single segment would
> just refuse to load a program that does not understand them (same as
> today), so the code would be:
> 
> <construct xdp_buff with first segment and empty skb>  <call bpf program>
> 
> On return, we might find that an skb has been built by the xdp program, and
> can be immediately used for XDP_PASS (or dropped in case of XDP_DROP)
> For XDP_TX and XDP_REDIRECT, something similar: if the packet is a single
> segment and there is no skb, use the existing accelerated path. If there are
> multiple segments, construct the skb if not existing already, and pass it to the
> standard tx path.
> 
> cheers
> luigi

I have gone over your suggestion, and it looks good to me! I couldn't find any loose ends. One thing to note is that the driver now needs
to save the context of the currently processed packet for each queue so that it can support the netdev callback for creating the skb.
This sounds a bit messy, but I think it should work.

I'd love to hear more thoughts on this.
Jesper, Toke, what do you guys think?


* Re: XDP multi-buffer design discussion
  2019-12-18 16:03                     ` Jubran, Samih
@ 2019-12-19 10:44                       ` Jesper Dangaard Brouer
  2019-12-19 17:29                         ` Luigi Rizzo
  0 siblings, 1 reply; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2019-12-19 10:44 UTC (permalink / raw)
  To: Jubran, Samih
  Cc: Luigi Rizzo, Machulsky, Zorik, Daniel Borkmann, David Miller,
	Tzalik, Guy, Ilias Apalodimas, Toke Høiland-Jørgensen,
	Kiyanovski, Arthur, Alexei Starovoitov, netdev, David Ahern,
	brouer

On Wed, 18 Dec 2019 16:03:54 +0000
"Jubran, Samih" <sameehj@amazon.com> wrote:

> > -----Original Message-----
> > From: Luigi Rizzo <rizzo@iet.unipi.it>
> > Sent: Wednesday, December 18, 2019 12:30 AM
> > To: Jesper Dangaard Brouer <brouer@redhat.com>
> > Cc: Jubran, Samih <sameehj@amazon.com>; Machulsky, Zorik
> > <zorik@amazon.com>; Daniel Borkmann <borkmann@iogearbox.net>; David
> > Miller <davem@davemloft.net>; Tzalik, Guy <gtzalik@amazon.com>; Ilias
> > Apalodimas <ilias.apalodimas@linaro.org>; Toke Høiland-Jørgensen
> > <toke@redhat.com>; Kiyanovski, Arthur <akiyano@amazon.com>; Alexei
> > Starovoitov <ast@kernel.org>; netdev@vger.kernel.org; David Ahern
> > <dsahern@gmail.com>
> > Subject: Re: XDP multi-buffer design discussion
> > 
> > On Tue, Dec 17, 2019 at 12:46 AM Jesper Dangaard Brouer
> > <brouer@redhat.com> wrote:  
> > >
> > > On Mon, 16 Dec 2019 20:15:12 -0800
> > > Luigi Rizzo <rizzo@iet.unipi.it> wrote:
> > >...  
> > > > For some use cases, the bpf program could deduct the total length
> > > > looking at the L3 header.  
> > >
> > > Yes, that actually good insight.  I guess the BPF-program could also
> > > use this to detect that it doesn't have access to the full-lineary
> > > packet this way(?)
> > >  
> > > > It won't work for XDP_TX response though.  
> > >
> > > The XDP_TX case also need to be discussed/handled. IMHO need to
> > > support XDP_TX for multi-buffer frames.  XDP_TX *can* be driver
> > > specific, but most drivers choose to convert xdp_buff to
> > > xdp_frame, which makes it possible to use/share part of the
> > > XDP_REDIRECT code from ndo_xdp_xmit.  
> > >
> > > We also need to handle XDP_REDIRECT, which becomes challenging, as the
> > > ndo_xdp_xmit functions of *all* drivers need to be updated (or
> > > introduce a flag to handle this incrementally).  
> > 
> > Here is a possible course of action (please let me know if you find
> > loose ends)
> > 
> > 1. extend struct xdp_buff with a total length and sk_buff * (NULL by default);
> >

I don't like extending xdp_buff with an skb pointer. (Remember xdp_buff
is only "allocated" on the stack).

Perhaps extend xdp_buff with a total-length, *BUT* this can also be solved
in other ways.  Extend xdp_buff with a flag indicating that this xdp_buff
has multiple buffers (segments).  Then we know that the area where we
store the segments is valid, and we can also store the total-length there.
(We need this total-length later, after the xdp_buff is freed.)

To make life easier for BPF-developers, we could export/create a
BPF-helper bpf_xdp_total_len(ctx) that hides whether segments are there
or not.
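
A hypothetical sketch of that flag + helper idea, kernel side; the flag,
the ext accessor and the helper below are all invented to illustrate the
idea, none of them exist:

#include <linux/filter.h>
#include <net/xdp.h>

#define XDP_FLAGS_MULTI_BUFF	BIT(0)	/* proposed flag */

struct xdp_buff_ext {			/* stand-in for the proposed storage area */
	u32 flags;			/* XDP_FLAGS_MULTI_BUFF when frags follow */
	u32 total_len;			/* valid only when the flag is set */
};

/* Proposed BPF-helper: hides from the program whether segments exist. */
BPF_CALL_1(bpf_xdp_total_len, struct xdp_buff *, xdp)
{
	struct xdp_buff_ext *ext = xdp_buff_get_ext(xdp);  /* hypothetical accessor */

	if (ext->flags & XDP_FLAGS_MULTI_BUFF)
		return ext->total_len;

	return xdp->data_end - xdp->data;	/* single-buffer case */
}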


> > 2. add a netdev callback to construct the skb for the current
> >    packet. This code obviously already in all drivers, just needs to
> >    be exposed as function callable by a bpf helper (next bullet);
> >

Today we already have functions that create an SKB from an xdp_frame.

First the xdp_buff is converted to an xdp_frame, whose memory area
lives in the top (32 bytes) of the packet-page.
(See the function convert_to_xdp_frame() in include/net/xdp.h.)

Today two functions create an SKB from this xdp_frame, and they should
likely be consolidated: (1) cpu_map_build_skb and (2) veth_xdp_rcv_one
(dive into veth_build_skb).  (Hint: both use a variant of build_skb.)

The challenge for you, Samih, is the placement of skb_shared_info
within the packet-page memory area.  These two xdp_frame->SKB
functions place skb_shared_info after the xdp_frame->len packet length
(aligned).  IMHO the placement of skb_shared_info should NOT vary this
much. Instead it should be at xdp->data_hard_end - 320 bytes (the size of
skb_shared_info).  The first challenge is fixing this...
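
A sketch of that fixed placement; note that data_hard_end is not a field
of today's xdp_buff, it stands here for "data_hard_start plus the full
frame size", and a real version would probably also want SKB_DATA_ALIGN():

#include <linux/skbuff.h>

/* Hypothetical: place skb_shared_info at a fixed spot at the end of the
 * packet-page area, independent of the packet length, so drivers and the
 * xdp_frame->SKB paths all agree on where the frags live. */
static inline struct skb_shared_info *xdp_shared_info(void *data_hard_end)
{
	return (struct skb_shared_info *)
		((unsigned char *)data_hard_end - sizeof(struct skb_shared_info));
}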


> > 3. add a new helper 'bpf_create_skb' that when invoked calls the
> >    previously mentioned netdev callback to  constructs an skb for
> >    the current packet, and sets the pointer in the xdp_buff, if not
> >    there already. A bpf program that needs to access segments beyond
> >    the first one can call bpf_create_skb() if needed, and then use
> >    existing helpers skb_load_bytes, skb_store_bytes, etc) to access
> >    the skb.
> >

I respectfully disagree with this approach.

One reason is that we want to support creating SKBs on a remote CPU,
like we do today with CPUMAP redirect.  The proposed helper
'bpf_create_skb' implies that this happens in BPF-context, during the
NAPI loop.


> >   My rationale is that if we need to access multiple segments, we
> >   are already in an expensive territory and it makes little sense to
> >   define a multi segment format that would essentially be an skb.
> >

I also disagree. Multi-segment frames also have some high-speed
use-cases.  Especially HW header-split at the IP+TCP boundary is
interesting.


> > 
> > 4. implement a mechanism to let so the driver know whether the
> >    currently loaded bpf program understands the new format.  There
> >    are multiple ways to do that, a trivial one would be to check,
> >    during load, that the program calls some known helper eg
> >    bpf_understands_fragments() which is then jit-ed to somethijng
> >    inexpensive
> > 
> >   Note that today, a netdev that cannot guarantee single segment
> > packets would not be able to enable xdp. Hence, without loss of
> > functionality, such netdev can refuse to load a program without
> > bpf_undersdands_fragments().
> > 
> >
> > With all the above, the generic xdp handler would do the following:
> >  if (!skb_is_linear() && !bpf_understands_fragments()) {
> >     < linearize skb>;
> >  }
> >   <construct xdp_buff with first segment and skb> // skb is unused by old style programs
> >   <call bpf program>
> > 
> > The native driver for a device that cannot guarantee a single
> > segment would just refuse to load a program that does not
> > understand them (same as today), so the code would be:
> >
> > <construct xdp_buff with first segment and empty skb>  <call bpf program>
> >
> > On return, we might find that an skb has been built by the xdp
> > program, and can be immediately used for XDP_PASS (or dropped in
> > case of XDP_DROP)

I also disagree here.  The SKB should first be created AFTER we know if
this will be an XDP_PASS action.


> > For XDP_TX and XDP_REDIRECT, something similar:
> > if the packet is a single segment and there is no skb, use the
> > existing accelerated path. If there are multiple segments,
> > construct the skb if not existing already, and pass it to the
> > standard tx path.
> > 
> > cheers
> > luigi  
> 
> I have went over your suggestion, it looks good to me! I couldn't
> find any loose ends. One thing to note is that the driver now needs
> to save the context of the currently processed packet in for each
> queue so that it can support the netdev callback for creating the skb.
>
> This sounds a bit messy, but I think it should work.
> 
> I'd love to hear more thoughts on this,
> Jesper, Toke what do you guys think?

As you can see from my comments, I (respectfully) disagree with this
approach.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer



* Re: XDP multi-buffer design discussion
  2019-12-19 10:44                       ` Jesper Dangaard Brouer
@ 2019-12-19 17:29                         ` Luigi Rizzo
  2020-01-19  7:34                           ` Jubran, Samih
  0 siblings, 1 reply; 11+ messages in thread
From: Luigi Rizzo @ 2019-12-19 17:29 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Jubran, Samih, Machulsky, Zorik, Daniel Borkmann, David Miller,
	Tzalik, Guy, Ilias Apalodimas, Toke Høiland-Jørgensen,
	Kiyanovski, Arthur, Alexei Starovoitov, netdev, David Ahern

On Thu, Dec 19, 2019 at 2:44 AM Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> On Wed, 18 Dec 2019 16:03:54 +0000
> "Jubran, Samih" <sameehj@amazon.com> wrote:
>
> > > -----Original Message-----
> > > From: Luigi Rizzo <rizzo@iet.unipi.it>
> > >...
> > > Here is a possible course of action (please let me know if you find
> > > loose ends)
> > >
> > > 1. extend struct xdp_buff with a total length and sk_buff * (NULL by default);

Top comment: thank you for the feedback; I see you disagree with my
proposal :) so let me ask some questions to understand its shortcomings,
identify use cases, and perhaps address subproblems individually.

> I don't like extending xdp_buff with an skb pointer. (Remember xdp_buff
> is only "allocated" on the stack).

Nit: what are your specific concerns with adding a field or two
(pointer + total length) to the xdp_buff (and to the struct visible to
the bpf program)?

> Perhaps xdp_buff with a total-length, *BUT* this can also be solved in
> other ways.  Extend xdp_buff with a flag indicating that this xdp_buff
> have multiple-buffers (segments).  Then we know that the area we store
> the segments is valid, and we can also store the total-length there.
> (We need this total-length later after xdp_buff is freed)

No matter how we extend the structure, we must address problem #1:
 #1: old bpf programs may not be aware of the new format and
     hence may process packets incorrectly.
I don't think there is a way out of this other than having a mechanism
for the bpf programs to indicate which ABI(s) they understand.
Since there are several ways to address it (via BTF, etc.) at load time
and with minimal code changes in the kernel (and none in the loader
programs), I think we can discuss this separately if needed, and proceed
with the assumption that using a slightly different xdp_buff with
additional info (or bpf helpers to retrieve it) is a solved problem.

...
> > > 2. add a netdev callback to construct the skb for the current
> > >    packet. This code obviously already in all drivers, just needs to
> > >    be exposed as function callable by a bpf helper (next bullet);
> > >
>
> Today we already have functions that create an SKB from an xdp_frame.
> ...

I probably should have been more clear on this point.

I introduced the skb to address problem #2:
  #2: how do we access a multi-segment packet from within the bpf program
      (in other words, how do we represent a multi-segment packet)?

and suggested the helper to address problem #3:
  #3: how do we avoid constructing such a representation if it is not needed?

I don't care whether this format is an skb (more details below), but the
xdp_frame that you mention seems to have only information on the first
segment, so I am not sure how it can help with multi-segment frames.

To elaborate: for #2 we should definitely find some optimized solution
for the common cases. The current xdp_buff is optimized for one segment
(alas, it only supports that), and so drivers disable header split when
doing xdp to comply. Header split with hdr+body is possibly very common
too (especially if we don't artificially disable it), so we could/should
redefine the xdp_{buff|frame} with static room for two segments
(header + rest). This can be populated unconditionally at relatively low
cost, for both the caller and the bpf program.
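
Purely as an illustration of the "static room for two segments" idea
(nothing below exists in the kernel; the names are invented):

#include <linux/types.h>

struct xdp_seg {
	void *data;
	void *data_end;
};

struct xdp_buff_2seg {
	/* ... existing xdp_buff fields: data, data_end, data_meta, ... */
	struct xdp_seg seg1;	/* optional second segment (payload after header split) */
	u32 nr_extra_segs;	/* 0 in the common single-buffer case */
	u32 total_len;		/* length across all segments */
};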

For a larger number of segments, though, there is going to be an O(N)
space/time cost, with potentially large N (say when a NIC does RSC before
calling the driver, resulting in many segments), and non-negligible
constants (each segment may require a dma_sync, and perhaps a decision to
recycle or replenish the buffer). These are all things that we don't want
to do upfront because it could be wasted work; hence my suggestion for a
netdev method and matching bpf helper to create whatever multi-segment
representation only when the bpf program asks for it.

Again, this would be problem #3, which relates to the next bullet:

> > > 3. add a new helper 'bpf_create_skb' that when invoked calls the
> > >    previously mentioned netdev callback to  constructs an skb for
> > >    the current packet, and sets the pointer in the xdp_buff, if not
> > >    there already. A bpf program that needs to access segments beyond
> > >    the first one can call bpf_create_skb() if needed, and then use
> > >    existing helpers skb_load_bytes, skb_store_bytes, etc) to access
> > >    the skb.
> > >
>
> I respectfully disagree with this approach.
>
> One reason is that we want to support creating SKBs on a remote CPU.
> Like we do today with CPUMAP redirect.  The proposed helper 'bpf_create_skb'
> implies that this happens in BPF-context, during the NAPI loop.

Restricting the discussion to the non-optimized case (>2 segments):
there is an O(N) phase to grab the segments that has to happen in the
napi loop. Likewise, storage for the list/array of segments must be
procured in the napi loop (it could be in the page containing the packet,
but we have no guarantee that there is space), and then copied on the
remote CPU into a suitable format for xmit (which again does not have to
be an skb, but see more details in response to the last item).

> > >   My rationale is that if we need to access multiple segments, we
> > >   are already in an expensive territory and it makes little sense to
> > >   define a multi segment format that would essentially be an skb.
> > >
>
> I also disagree. Multi-egment frames also have some high speed
> use-cases.  Especially HW header-split at IP+TCP boundry is
> interesting.

Acknowledged and agreed. I have specifically added the 2-segment
case in the discussion above.

> > > The native driver for a device that cannot guarantee a single
> > > segment would just refuse to load a program that does not
> > > understand them (same as today), so the code would be:
> > >
> > > <construct xdp_buff with first segment and empty skb>  <call bpf program>
> > >
> > > On return, we might find that an skb has been built by the xdp
> > > program, and can be immediately used for XDP_PASS (or dropped in
> > > case of XDP_DROP)
>
> I also disagree here.  SKB should first be created AFTER we know if
> this will be a XDP_PASS action.

Note that I said "we might" - the decision to build the "full_descriptor"
(skb or other format) is made by the bpf program itself. As a consequence,
it won't build a full_descriptor before making a decision, UNLESS it needs to
see the whole packet to make a decision, in which case there is no better
solution anyway.

The only thing where we may look for optimizations is what format this
full_descriptor should have.  Depending on the outcome (rough sketches of
the TX-side formats follow below):

XDP_DROP:     don't care, it is going to be dropped
XDP_PASS:     an skb seems convenient, since that is what the network
              stack expects
XDP_TX:       if the buffers are mapped dma_bidir, just an array of
              dma_addr + len would work (no dma_map or page refcounts
              needed)
XDP_REDIRECT: an array of page fragments (since we need to dma-map them
              for the destination and play with refcounts)
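
To make the TX-side formats concrete, two hypothetical per-segment
descriptors (invented for illustration, not existing kernel structures):

#include <linux/types.h>
#include <linux/dma-mapping.h>
#include <linux/mm_types.h>

struct xdp_tx_seg {		/* XDP_TX on bidirectionally mapped buffers */
	dma_addr_t addr;
	u32 len;
};

struct xdp_redirect_seg {	/* XDP_REDIRECT: needs page refs + dma-map later */
	struct page *page;
	u32 offset;
	u32 len;
};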

cheers
luigi


* RE: XDP multi-buffer design discussion
  2019-12-19 17:29                         ` Luigi Rizzo
@ 2020-01-19  7:34                           ` Jubran, Samih
  2020-01-22 18:50                             ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 11+ messages in thread
From: Jubran, Samih @ 2020-01-19  7:34 UTC (permalink / raw)
  To: Luigi Rizzo, Jesper Dangaard Brouer
  Cc: Machulsky, Zorik, Daniel Borkmann, David Miller, Tzalik, Guy,
	Ilias Apalodimas, Toke Høiland-Jørgensen, Kiyanovski,
	Arthur, Alexei Starovoitov, netdev, David Ahern

> -----Original Message-----
> From: Luigi Rizzo <rizzo@iet.unipi.it>
> Sent: Thursday, December 19, 2019 7:29 PM
> To: Jesper Dangaard Brouer <brouer@redhat.com>
> Cc: Jubran, Samih <sameehj@amazon.com>; Machulsky, Zorik
> <zorik@amazon.com>; Daniel Borkmann <borkmann@iogearbox.net>; David
> Miller <davem@davemloft.net>; Tzalik, Guy <gtzalik@amazon.com>; Ilias
> Apalodimas <ilias.apalodimas@linaro.org>; Toke Høiland-Jørgensen
> <toke@redhat.com>; Kiyanovski, Arthur <akiyano@amazon.com>; Alexei
> Starovoitov <ast@kernel.org>; netdev@vger.kernel.org; David Ahern
> <dsahern@gmail.com>
> Subject: Re: XDP multi-buffer design discussion
> 
> On Thu, Dec 19, 2019 at 2:44 AM Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> >
> > On Wed, 18 Dec 2019 16:03:54 +0000
> > "Jubran, Samih" <sameehj@amazon.com> wrote:
> >
> > > > -----Original Message-----
> > > > From: Luigi Rizzo <rizzo@iet.unipi.it> ...
> > > > Here is a possible course of action (please let me know if you
> > > >find  loose ends)
> > > >
> > > > 1. extend struct xdp_buff with a total length and sk_buff * (NULL
> > > > by default);
> 
> Top comment: thank you for the feedback, I see you disagree with my
> proposal :) so let me ask some questions to understand its shortcoming,
> identify use cases, and perhaps address subproblems individually.
> 
> > I don't like extending xdp_buff with an skb pointer. (Remember
> > xdp_buff is only "allocated" on the stack).
> 
> Nit, what are your specific concerns in adding a field or two (pointer
> + total length)
> to the xdp_buff (and to the struct visible to the bpf program) ?
> 
> > Perhaps xdp_buff with a total-length, *BUT* this can also be solved in
> > other ways.  Extend xdp_buff with a flag indicating that this xdp_buff
> > have multiple-buffers (segments).  Then we know that the area we store
> > the segments is valid, and we can also store the total-length there.
> > (We need this total-length later after xdp_buff is freed)
> 
> No matter how we extend the structure we must address problem #1
>  #1: old bpf programs may not be aware of the new format
>        hence may process packets incorrectly.
> I don't think there is a way out of this other than have a mechanism for the
> bpf programs to indicate which ABI(s) they understand.
> Since there are several ways to address it (via BTF, etc.) at load time and with
> minimal code changes in the kernel (and none in the loader programs), I
> think we can discuss this separately if needed, and proceed with the
> assumption that using a slightly different xdp_buff with additional info (or
> bpf helpers to retrieve them) is a solved problem.
> 
> ...
> > > > 2. add a netdev callback to construct the skb for the current
> > > >    packet. This code obviously already in all drivers, just needs to
> > > >    be exposed as function callable by a bpf helper (next bullet);
> > > >
> >
> > Today we already have functions that create an SKB from an xdp_frame.
> > ...
> 
> I probably should have been more clear on this point.
> 
> I introduced the skb to address problem #2:
>   #2:  how do we access a multi-segment packet from within the bpf program
>       (in other words, how do we represent a multi-segment packet)
> 
> and suggested the helper to address problem #3
>   #3: how do avoid constructing such a representation if not needed ?
> 
> I don't care whether this format is an skb (more details below), but the
> xdp_frame that you mention seems to have only information on the first
> segment so I am not sure how it can help with multi-segment frames.
> 
> To elaborate: for #2 we should definitely find some optimized solution for
> the common cases. The current xdp_buff is optimized for one segment (alas,
> it only supports that), and so drivers disable header split when doing xdp to
> comply.
> Header split with hdr+body is possibly very common too (especially if we
> don't artificially disable it), so we could/should redefine the
> xdp_{buff|frame} with static room for two segments (header + rest). This
> can be populated unconditionally at relatively low cost, for both the caller
> and the bpf program.
> 
> For larger number of segments, though, there is going to be an O(N)
> space/time cost, with potentially large N (say when a NIC does RSC before
> calling the driver, resulting in many segments), and non-negligible constants
> (each segment may require a dma_sync, and perhaps a decision to recycle or
> replenish the buffer).
> These are all things that we don't want to do upfront because it could be
> wasted work, hence my suggestion for a netdev method and matching bpf
> helper to create whatever multi-segment representation only when the bpf
> program asks for it.
> 
> Again, this would be problem #3, which relates to the next bullet:
> 
> > > > 3. add a new helper 'bpf_create_skb' that when invoked calls the
> > > >    previously mentioned netdev callback to  constructs an skb for
> > > >    the current packet, and sets the pointer in the xdp_buff, if not
> > > >    there already. A bpf program that needs to access segments beyond
> > > >    the first one can call bpf_create_skb() if needed, and then use
> > > >    existing helpers skb_load_bytes, skb_store_bytes, etc) to access
> > > >    the skb.
> > > >
> >
> > I respectfully disagree with this approach.
> >
> > One reason is that we want to support creating SKBs on a remote CPU.
> > Like we do today with CPUMAP redirect.  The proposed helper
> 'bpf_create_skb'
> > implies that this happens in BPF-context, during the NAPI loop.
> 
> restricting the discussion to the non-optimized case (>2 segments):
> there is an O(N) phase to grab the segments that has to happen in the napi
> loop. Likewise, storage for the list/array of segments must be procured in the
> napi loop (it could be in the page containing the packet, but we have no
> guarantee that there is space), and then copied on the remote CPU into a
> suitable format for xmit (which again does not have to be an skb, but see
> more details in response to the last item).
> 
> > > >   My rationale is that if we need to access multiple segments, we
> > > >   are already in an expensive territory and it makes little sense to
> > > >   define a multi segment format that would essentially be an skb.
> > > >
> >
> > I also disagree. Multi-egment frames also have some high speed
> > use-cases.  Especially HW header-split at IP+TCP boundry is
> > interesting.
> 
> Acknowledged and agreed. I have specifically added the 2-segments case in
> the the discussion above.
> 
> > > > The native driver for a device that cannot guarantee a single
> > > > segment would just refuse to load a program that does not
> > > > understand them (same as today), so the code would be:
> > > >
> > > > <construct xdp_buff with first segment and empty skb>  <call bpf
> > > > program>
> > > >
> > > > On return, we might find that an skb has been built by the xdp
> > > > program, and can be immediately used for XDP_PASS (or dropped in
> > > > case of XDP_DROP)
> >
> > I also disagree here.  SKB should first be created AFTER we know if
> > this will be a XDP_PASS action.
> 
> Note that I said "we might" - the decision to build the "full_descriptor"
> (skb or other format) is made by the bpf program itself. As a consequence, it
> won't build a full_descriptor before making a decision, UNLESS it needs to
> see the whole packet to make a decision, in which case there is no better
> solution anyways.
> 
> The only thing where we may look for optimizations is what format this
> full_descriptor should have. Depending on the outcome:
> XDP_DROP: don't care, it is going to be dropped
> XDP_PASS: skb seems convenient since the network stack expects that
> XDP_TX if the buffers are mapped dma_bidir, just an array of dma_addr + len
>    would work (no dma_map or page refcounts needed) XDP_REDIRECT array
> of page fragments (since we need to dma-map them
>   for the destination and play with refcounts)
> 
> cheers
> Luigi

Hi all,

We have two proposed design solutions.  One (Jesper’s) seems easier to implement and gives the average XDP developer the framework to deal with multiple buffers.  The other (Luigi’s) seems more complete but raises a few questions:
1.	The netdev callback might be too intrusive for the net drivers, and requires the driver to somehow save the context of the currently processed packet
2.	The solution might be overkill for the average XDP developer.  Does the average XDP developer really need full access to the packet?

Since Jesper's design is easier to implement, and leaves a way to extend it toward Luigi's design in the future, I'm going to implement it and share it with you.


* Re: XDP multi-buffer design discussion
  2020-01-19  7:34                           ` Jubran, Samih
@ 2020-01-22 18:50                             ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2020-01-22 18:50 UTC (permalink / raw)
  To: Jubran, Samih
  Cc: Luigi Rizzo, Machulsky, Zorik, Daniel Borkmann, David Miller,
	Tzalik, Guy, Ilias Apalodimas, Toke Høiland-Jørgensen,
	Kiyanovski, Arthur, Alexei Starovoitov, netdev, David Ahern,
	brouer

On Sun, 19 Jan 2020 07:34:00 +0000 "Jubran, Samih" <sameehj@amazon.com> wrote:

[...]
> 
> We have two proposed design solutions.  One (Jesper’s) seems easier
> to implement and gives the average XDP developer the framework to
> deal with multiple buffers.  The other (Luigi’s) seems more complete
> but raises a few questions:
>
> 1.	The netdev's callback might be too intrusive to the net
> drivers and requires the driver to somehow save context of the
> current processed packet
>
> 2.	The solution might be an overkill to the average XDP
> developer.  Does the average XDP developer really need full access to
> the packet?
> 
> Since Jesper's design is easier to implement as well as leaves a way
> for future extension to Luigi's design, I'm going to implement and
> share it with you.

Thanks for letting us know you are still working on this.

_WHEN_ you hit issues, please feel free to reach out to me. (P.S. I'll be
in Brno at DevConf.cz for the next couple of days, but will be back
Tuesday, and back on IRC in the Freenode channel #xdp, nick: netoptimizer.)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer



end of thread

Thread overview: 11+ messages
     [not found] <BA8DC06A-C508-402D-8A18-B715FBA674A2@amazon.com>
     [not found] ` <b28504a3-4a55-d302-91fe-63915e4568d3@iogearbox.net>
     [not found]   ` <5FA3F980-29E6-4B91-8150-9F28C0E09C45@amazon.com>
     [not found]     ` <20190823084704.075aeebd@carbon>
     [not found]       ` <67C7F66A-A3F7-408F-9C9E-C53982BCCD40@amazon.com>
     [not found]         ` <20191204155509.6b517f75@carbon>
     [not found]           ` <ec2fd7f6da44410fbaeb021cf984f2f6@EX13D11EUC003.ant.amazon.com>
2019-12-16 14:07             ` XDP multi-buffer design discussion Jesper Dangaard Brouer
2019-12-17  4:15               ` Luigi Rizzo
2019-12-17  8:46                 ` Jesper Dangaard Brouer
2019-12-17  9:00                   ` Toke Høiland-Jørgensen
2019-12-17 15:44                     ` Jubran, Samih
2019-12-17 22:30                   ` Luigi Rizzo
2019-12-18 16:03                     ` Jubran, Samih
2019-12-19 10:44                       ` Jesper Dangaard Brouer
2019-12-19 17:29                         ` Luigi Rizzo
2020-01-19  7:34                           ` Jubran, Samih
2020-01-22 18:50                             ` Jesper Dangaard Brouer
