netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Cc: Tariq Toukan <ttoukan.linux@gmail.com>,
	Lorenzo Bianconi <lorenzo@kernel.org>,
	Jakub Kicinski <kuba@kernel.org>,
	Andy Gospodarek <andrew.gospodarek@broadcom.com>,
	ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net,
	hawk@kernel.org, john.fastabend@gmail.com, andrii@kernel.org,
	kafai@fb.com, songliubraving@fb.com, yhs@fb.com,
	kpsingh@kernel.org, lorenzo.bianconi@redhat.com,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Ilias Apalodimas <ilias.apalodimas@linaro.org>,
	gal@nvidia.com, Saeed Mahameed <saeedm@nvidia.com>,
	tariqt@nvidia.com
Subject: Re: [PATCH net-next v2] samples/bpf: fixup some tools to be able to support xdp multibuffer
Date: Thu, 05 Jan 2023 23:07:42 +0100	[thread overview]
Message-ID: <87v8lkzlch.fsf@toke.dk> (raw)
In-Reply-To: <Y7cBfE7GpX04EI97@C02YVCJELVCG.dhcp.broadcom.net>

Andy Gospodarek <andrew.gospodarek@broadcom.com> writes:

> On Thu, Jan 05, 2023 at 04:43:28PM +0100, Toke Høiland-Jørgensen wrote:
>> Tariq Toukan <ttoukan.linux@gmail.com> writes:
>> 
>> > On 04/01/2023 14:28, Toke Høiland-Jørgensen wrote:
>> >> Lorenzo Bianconi <lorenzo@kernel.org> writes:
>> >> 
>> >>>> On Tue, 03 Jan 2023 16:19:49 +0100 Toke Høiland-Jørgensen wrote:
>> >>>>> Hmm, good question! I don't think we've ever explicitly documented any
>> >>>>> assumptions one way or the other. My own mental model has certainly
>> >>>>> always assumed the first frag would continue to be the same size as in
>> >>>>> non-multi-buf packets.
>> >>>>
>> >>>> Interesting! :) My mental model was closer to GRO by frags
>> >>>> so the linear part would have no data, just headers.
>> >>>
>> >>> That is assumption as well.
>> >> 
>> >> Right, okay, so how many headers? Only Ethernet, or all the way up to
>> >> L4 (TCP/UDP)?
>> >> 
>> >> I do seem to recall a discussion around the header/data split for TCP
>> >> specifically, but I think I mentally put that down as "something people
>> >> may way to do at some point in the future", which is why it hasn't made
>> >> it into my own mental model (yet?) :)
>> >> 
>> >> -Toke
>> >> 
>> >
>> > I don't think that all the different GRO layers assume having their 
>> > headers/data in the linear part. IMO they will just perform better if 
>> > these parts are already there. Otherwise, the GRO flow manages, and 
>> > pulls the needed amount into the linear part.
>> > As examples, see calls to gro_pull_from_frag0 in net/core/gro.c, and the 
>> > call to pskb_may_pull() from skb_gro_header_slow().
>> >
>> > This resembles the bpf_xdp_load_bytes() API used here in the xdp prog.
>> 
>> Right, but that is kernel code; what we end up doing with the API here
>> affects how many programs need to make significant changes to work with
>> multibuf, and how many can just set the frags flag and continue working.
>> Which also has a performance impact, see below.
>> 
>> > The context of my questions is that I'm looking for the right memory 
>> > scheme for adding xdp-mb support to mlx5e striding RQ.
>> > In striding RQ, the RX buffer consists of "strides" of a fixed size set 
>> > by pthe driver. An incoming packet is written to the buffer starting from 
>> > the beginning of the next available stride, consuming as much strides as 
>> > needed.
>> >
>> > Due to the need for headroom and tailroom, there's no easy way of 
>> > building the xdp_buf in place (around the packet), so it should go to a 
>> > side buffer.
>> >
>> > By using 0-length linear part in a side buffer, I can address two 
>> > challenging issues: (1) save the in-driver headers memcpy (copy might 
>> > still exist in the xdp program though), and (2) conform to the 
>> > "fragments of the same size" requirement/assumption in xdp-mb. 
>> > Otherwise, if we pull from frag[0] into the linear part, frag[0] becomes 
>> > smaller than the next fragments.
>> 
>> Right, I see.
>> 
>> So my main concern would be that if we "allow" this, the only way to
>> write an interoperable XDP program will be to use bpf_xdp_load_bytes()
>> for every packet access. Which will be slower than DPA, so we may end up
>> inadvertently slowing down all of the XDP ecosystem, because no one is
>> going to bother with writing two versions of their programs. Whereas if
>> you can rely on packet headers always being in the linear part, you can
>> write a lot of the "look at headers and make a decision" type programs
>> using just DPA, and they'll work for multibuf as well.
>
> The question I would have is what is really the 'slow down' for
> bpf_xdp_load_bytes() vs DPA?  I know you and Jesper can tell me how many
> instructions each use. :)

I can try running some benchmarks to compare the two, sure!

-Toke


  parent reply	other threads:[~2023-01-05 22:09 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-21 17:54 [PATCH net-next v2] samples/bpf: fixup some tools to be able to support xdp multibuffer Andy Gospodarek
2022-06-22  2:00 ` patchwork-bot+netdevbpf
2023-01-03 12:55 ` Tariq Toukan
2023-01-03 15:19   ` Toke Høiland-Jørgensen
2023-01-04  1:21     ` Jakub Kicinski
2023-01-04  8:44       ` Lorenzo Bianconi
2023-01-04 12:28         ` Toke Høiland-Jørgensen
2023-01-05  1:17           ` Jakub Kicinski
2023-01-05  7:20           ` Tariq Toukan
2023-01-05 15:43             ` Toke Høiland-Jørgensen
2023-01-05 16:57               ` Andy Gospodarek
2023-01-05 18:16                 ` Jakub Kicinski
2023-01-06 13:56                   ` Andy Gospodarek
2023-01-08 12:33                   ` Tariq Toukan
     [not found]                   ` <8369e348-a8ec-cb10-f91f-4277e5041a27@nvidia.com>
2023-01-08 12:42                     ` Tariq Toukan
2023-01-09 13:50                       ` Toke Høiland-Jørgensen
2023-01-05 22:07                 ` Toke Høiland-Jørgensen [this message]
2023-01-06 17:54                   ` Toke Høiland-Jørgensen
2023-01-05 16:22       ` Andy Gospodarek
2023-01-10 20:59       ` Maxim Mikityanskiy
2023-01-13 21:07         ` Tariq Toukan
2023-01-25 12:49           ` Tariq Toukan
2023-01-05 16:18   ` Andy Gospodarek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87v8lkzlch.fsf@toke.dk \
    --to=toke@redhat.com \
    --cc=andrew.gospodarek@broadcom.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=gal@nvidia.com \
    --cc=hawk@kernel.org \
    --cc=ilias.apalodimas@linaro.org \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=lorenzo.bianconi@redhat.com \
    --cc=lorenzo@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    --cc=songliubraving@fb.com \
    --cc=tariqt@nvidia.com \
    --cc=ttoukan.linux@gmail.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).