From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: "Toke Høiland-Jørgensen" <toke@kernel.org>
Cc: <bpf@vger.kernel.org>, <ast@kernel.org>, <daniel@iogearbox.net>,
	<andrii@kernel.org>, <netdev@vger.kernel.org>,
	<magnus.karlsson@intel.com>, <bjorn@kernel.org>,
	<tirthendu.sarkar@intel.com>, <simon.horman@corigine.com>
Subject: Re: [PATCH v3 bpf-next 00/22] xsk: multi-buffer support
Date: Tue, 6 Jun 2023 14:50:41 +0200
Message-ID: <ZH8roRaXpova3Qwy@boxer>
In-Reply-To: <87edmp3ky6.fsf@toke.dk>

On Mon, Jun 05, 2023 at 06:58:25PM +0200, Toke Høiland-Jørgensen wrote:
> Great to see this proceeding! Thought I'd weigh in on this part:

Hey Toke, that is very nice to hear, and thanks for chiming in :)

> 
> > Unfortunately, we had to introduce a new bind flag (XDP_USE_SG) on the
> > AF_XDP level to enable multi-buffer support. It would be great if you
> > have ideas on how to get rid of it. The reason we need to
> > differentiate between non multi-buffer and multi-buffer is the
> > behaviour when the kernel gets a packet that is larger than the frame
> > size. Without multi-buffer, this packet is dropped and marked in the
> > stats. With multi-buffer on, we want to split it up into multiple
> > frames instead.
> >
> > At the start, we thought that riding on the .frags section name of
> > the XDP program was a good idea. You do not have to introduce yet
> > another flag and all AF_XDP users must load an XDP program anyway
> > to get any traffic up to the socket, so why not just say that the XDP
> > program decides if the AF_XDP socket should get multi-buffer packets
> > or not? The problem is that we can create an AF_XDP socket that is Tx
> > only and that works without having to load an XDP program at
> > all. Another problem is that the XDP program might change during the
> > execution, so we would have to check this for every single packet.
> 
> I agree that it's better to tie the enabling of this to a socket flag
> instead of to the XDP program, for a couple of reasons:
> 
> - The XDP program can, as you say, be changed, but it can also be shared
>   between several sockets in a single XSK, so this really needs to be
>   tied to the socket.

Exactly.

> 
> - The XDP program is often installed implicitly by libxdp, in which case
>   the program can't really set the flag on the program itself.
> 
> There's a related question of whether the frags flag on the XDP program
> should be a prerequisite for enabling it at the socket? I think probably
> it should, right?

These are two separate events (loading the XDP prog vs. loading the AF_XDP
socket) which are unordered, so you can load an mbuf AF_XDP socket first
and then a non-mbuf XDP prog, and it will still work in some circumstances.
I will quote the commit msg from patch 02 here:

<quote>
Such capability of the application needs to be independent of the
xdp_prog's frag support capability since there are cases where even a
single xdp_buffer may need to be split into multiple descriptors owing to
a smaller xsk frame size.

For e.g., with NIC rx_buffer size set to 4kB, a 3kB packet will
constitute of a single buffer and so will be sent as such to AF_XDP layer
irrespective of 'xdp.frags' capability of the XDP program. Now if the xsk
frame size is set to 2kB by the AF_XDP application, then the packet will
need to be split into 2 descriptors if AF_XDP application can handle
multi-buffer, else it needs to be dropped.
</quote>
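
To make the arithmetic in that example concrete, here is a minimal sketch
(not code from the series). The helper name xsk_descs_needed() is
hypothetical, and it assumes that every descriptor can carry a full
frame_size bytes of payload, i.e. it ignores any headroom:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: how many xsk frames
 * (descriptors) a packet of pkt_len bytes occupies when each frame
 * holds at most frame_size bytes. Headroom is ignored (assumption).
 */
static uint32_t xsk_descs_needed(uint32_t pkt_len, uint32_t frame_size)
{
	assert(frame_size > 0);

	/* Round up: a partially filled last frame still costs a descriptor. */
	return (pkt_len + frame_size - 1) / frame_size;
}
```

With the numbers from the quote, a 3kB packet and a 2kB xsk frame give
xsk_descs_needed(3072, 2048) == 2, while the same packet fits in a single
4kB frame.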

> 
> Also, related to the first point above, how does the driver respond to
> two different sockets being attached to the same device with two
> different values of the flag? (As you can probably tell I didn't look at
> the details of the implementation...)

If we talk about a zero-copy, multi-buffer-enabled driver, then it will
combine all of the frags that belong to a particular packet into one
xdp_buff, which will then be redirected. The AF_XDP core will check the
XDP_USE_SG flag against the length of the xdp_buff: if len is bigger than
the chunk size of the XSK pool (which implies mbuf) and the socket does
not have the XDP_USE_SG flag set, the packet will be dropped.

So the driver is agnostic to that. The AF_XDP core handles the case you
brought up accordingly.

Also, what we actually attach down to the driver is the XSK pool, not the
XSK socket itself, as you know. The XSK pool does not carry any info
regarding frags.
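
As a rough illustration of the per-socket decision described above, here
is a simplified sketch. It is not the kernel code: the enum, the function
name xsk_rx_verdict() and the boolean use_sg (standing in for the
XDP_USE_SG bind flag) are all made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical verdicts, for illustration only. */
enum xsk_rx_verdict {
	XSK_RX_SINGLE_DESC,	/* packet fits in one chunk */
	XSK_RX_MULTI_DESC,	/* packet is split across several chunks */
	XSK_RX_DROP,		/* too long and socket is not multi-buffer */
};

/*
 * Sketch of the check done per socket: len is the total length of the
 * xdp_buff, chunk_size comes from the XSK pool, and use_sg says whether
 * the receiving socket was bound with XDP_USE_SG.
 */
static enum xsk_rx_verdict xsk_rx_verdict(uint32_t len, uint32_t chunk_size,
					  bool use_sg)
{
	assert(chunk_size > 0);

	if (len <= chunk_size)
		return XSK_RX_SINGLE_DESC;

	return use_sg ? XSK_RX_MULTI_DESC : XSK_RX_DROP;
}
```

Under these assumptions, two sockets with different flag values simply
get different verdicts for the same 3kB packet and 2kB chunk size: the
one with the flag receives it in multiple descriptors, the one without
drops it. The driver does not need to know about either.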

> 
> -Toke

Thread overview: 29+ messages
2023-06-05 14:44 [PATCH v3 bpf-next 00/22] xsk: multi-buffer support Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 01/22] xsk: prepare 'options' in xdp_desc for multi-buffer use Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 02/22] xsk: introduce XSK_USE_SG bind flag for xsk socket Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 03/22] xsk: prepare both copy and zero-copy modes to co-exist Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 04/22] xsk: move xdp_buff's data length check to xsk_rcv_check Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 05/22] xsk: add support for AF_XDP multi-buffer on Rx path Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 06/22] xsk: introduce wrappers and helpers for supporting multi-buffer in Tx path Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 07/22] xsk: allow core/drivers to test EOP bit Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 08/22] xsk: add support for AF_XDP multi-buffer on Tx path Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 09/22] xsk: discard zero length descriptors in " Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 10/22] xsk: support mbuf on ZC RX Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 11/22] ice: xsk: add RX multi-buffer support Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 12/22] xsk: support ZC Tx multi-buffer in batch API Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 13/22] xsk: report zero-copy multi-buffer capability via xdp_features Maciej Fijalkowski
2023-06-05 21:43   ` Jakub Kicinski
2023-06-06 12:52     ` Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 14/22] ice: xsk: Tx multi-buffer support Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 15/22] xsk: add multi-buffer documentation Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 16/22] selftests/xsk: transmit and receive multi-buffer packets Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 17/22] selftests/xsk: add basic multi-buffer test Maciej Fijalkowski
2023-06-05 20:17   ` Simon Horman
2023-06-05 14:44 ` [PATCH v3 bpf-next 18/22] selftests/xsk: add unaligned mode test for multi-buffer Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 19/22] selftests/xsk: add invalid descriptor " Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 20/22] selftests/xsk: add metadata copy test for multi-buff Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 21/22] selftests/xsk: add test for too many frags Maciej Fijalkowski
2023-06-05 14:44 ` [PATCH v3 bpf-next 22/22] selftests/xsk: reset NIC settings to default after running test suite Maciej Fijalkowski
2023-06-05 16:58 ` [PATCH v3 bpf-next 00/22] xsk: multi-buffer support Toke Høiland-Jørgensen
2023-06-06 12:50   ` Maciej Fijalkowski [this message]
2023-06-06 20:35     ` Toke Høiland-Jørgensen
