From: "Björn Töpel" <bjorn.topel@intel.com>
To: Magnus Karlsson <magnus.karlsson@intel.com>,
	ast@kernel.org, daniel@iogearbox.net, netdev@vger.kernel.org,
	jonathan.lemon@gmail.com, maximmi@mellanox.com
Cc: bpf@vger.kernel.org, jeffrey.t.kirsher@intel.com,
	anthony.l.nguyen@intel.com, maciej.fijalkowski@intel.com,
	maciejromanfijalkowski@gmail.com, cristian.dumitrescu@intel.com
Subject: Re: [PATCH bpf-next v4 14/14] xsk: documentation for XDP_SHARED_UMEM between queues and netdevs
Date: Tue, 28 Jul 2020 09:18:48 +0200	[thread overview]
Message-ID: <59554482-dc1d-7758-c090-9fada8c66151@intel.com> (raw)
In-Reply-To: <1595307848-20719-15-git-send-email-magnus.karlsson@intel.com>



On 2020-07-21 07:04, Magnus Karlsson wrote:
> Add documentation for the XDP_SHARED_UMEM feature when a UMEM is
> shared between different queues and/or netdevs.
> 
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>

Acked-by: Björn Töpel <bjorn.topel@intel.com>


> ---
>   Documentation/networking/af_xdp.rst | 68 +++++++++++++++++++++++++++++++------
>   1 file changed, 58 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst
> index 5bc55a4..2ccc564 100644
> --- a/Documentation/networking/af_xdp.rst
> +++ b/Documentation/networking/af_xdp.rst
> @@ -258,14 +258,21 @@ socket into zero-copy mode or fail.
>   XDP_SHARED_UMEM bind flag
>   -------------------------
>   
> -This flag enables you to bind multiple sockets to the same UMEM, but
> -only if they share the same queue id. In this mode, each socket has
> -their own RX and TX rings, but the UMEM (tied to the fist socket
> -created) only has a single FILL ring and a single COMPLETION
> -ring. To use this mode, create the first socket and bind it in the normal
> -way. Create a second socket and create an RX and a TX ring, or at
> -least one of them, but no FILL or COMPLETION rings as the ones from
> -the first socket will be used. In the bind call, set he
> +This flag enables you to bind multiple sockets to the same UMEM. It
> +works on the same queue id, between queue ids and between
> +netdevs/devices. In this mode, each socket has its own RX and TX
> +rings as usual, but you are going to have one or more FILL and
> +COMPLETION ring pairs. You have to create one of these pairs per
> +unique netdev and queue id tuple that you bind to.
> +
> +We start with the case where we would like to share a UMEM between
> +sockets bound to the same netdev and queue id. The UMEM (tied to the
> +first socket created) will only have a single FILL ring and a single
> +COMPLETION ring as there is only one unique netdev,queue_id tuple that
> +we have bound to. To use this mode, create the first socket and bind
> +it in the normal way. Create a second socket and create an RX and a TX
> +ring, or at least one of them, but no FILL or COMPLETION rings as the
> +ones from the first socket will be used. In the bind call, set the
>   XDP_SHARED_UMEM option and provide the initial socket's fd in the
>   sxdp_shared_umem_fd field. You can attach an arbitrary number of extra
>   sockets this way.
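
FWIW, for anyone reading this in the archives: against the raw
socket API, the flow above might look roughly like the following
untested sketch. Ring setup and error handling are omitted, and
first_fd is the fd of the initial socket that registered the UMEM:

  #include <linux/if_xdp.h>
  #include <net/if.h>
  #include <sys/socket.h>

  /* Attach an extra socket to the UMEM registered on first_fd,
   * on the same netdev/queue id. The RX and/or TX rings must
   * have been created on fd via the XDP_RX_RING/XDP_TX_RING
   * setsockopts before the bind. */
  static int attach_shared(int first_fd, const char *ifname,
                           __u32 queue_id)
  {
          struct sockaddr_xdp sxdp = {};
          int fd = socket(AF_XDP, SOCK_RAW, 0);

          sxdp.sxdp_family = PF_XDP;
          sxdp.sxdp_ifindex = if_nametoindex(ifname);
          sxdp.sxdp_queue_id = queue_id;
          sxdp.sxdp_flags = XDP_SHARED_UMEM;
          sxdp.sxdp_shared_umem_fd = first_fd;

          return bind(fd, (struct sockaddr *)&sxdp,
                      sizeof(sxdp)) ? -1 : fd;
  }
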
> @@ -305,11 +312,41 @@ concurrently. There are no synchronization primitives in the
>   libbpf code that protects multiple users at this point in time.
>   
>   Libbpf uses this mode if you create more than one socket tied to the
> -same umem. However, note that you need to supply the
> +same UMEM. However, note that you need to supply the
>   XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD libbpf_flag with the
>   xsk_socket__create calls and load your own XDP program as there is no
>   built in one in libbpf that will route the traffic for you.
>   
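
On the libbpf side, the flag mentioned above goes into struct
xsk_socket_config, something like this (untested; xsk, umem, rx
and tx declared as usual):

  struct xsk_socket_config cfg = {
          .rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS,
          .tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
          .libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD,
  };

  /* Load and attach your own XDP program first, then: */
  err = xsk_socket__create(&xsk, ifname, queue_id, umem,
                           &rx, &tx, &cfg);
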
> +The second case is when you share a UMEM between sockets that are
> +bound to different queue ids and/or netdevs. In this case you have to
> +create one FILL ring and one COMPLETION ring for each unique
> +netdev,queue_id pair. Let us say you want to create two sockets bound
> +to two different queue ids on the same netdev. Create the first socket
> +and bind it in the normal way. Create a second socket and create an RX
> +and a TX ring, or at least one of them, and then one FILL and
> +COMPLETION ring for this socket. Then in the bind call, set the
> +XDP_SHARED_UMEM option and provide the initial socket's fd in the
> +sxdp_shared_umem_fd field as you registered the UMEM on that
> +socket. These two sockets will now share one and the same UMEM.
> +
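
At the raw socket level, those per-socket FILL and COMPLETION
rings are created on the *second* socket's own fd before the
bind, roughly like this (untested; 2048 is an arbitrary ring
size):

  int sz = 2048;

  setsockopt(fd2, SOL_XDP, XDP_UMEM_FILL_RING, &sz, sizeof(sz));
  setsockopt(fd2, SOL_XDP, XDP_UMEM_COMPLETION_RING, &sz,
             sizeof(sz));
  /* ...mmap() the rings, then bind fd2 with XDP_SHARED_UMEM and
   * sxdp_shared_umem_fd set to the first socket's fd. */
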
> +There is no need to supply an XDP program like the one in the previous
> +case where sockets were bound to the same queue id and
> +device. Instead, use the NIC's packet steering capabilities to steer
> +the packets to the right queue. In the previous example, there is only
> +one queue shared among sockets, so the NIC cannot do this steering. It
> +can only steer between queues.
> +
> +In libbpf, you need to use the xsk_socket__create_shared() API as it
> +takes a reference to a FILL ring and a COMPLETION ring that will be
> +created for you and bound to the shared UMEM. You can use this
> +function for all the sockets you create, or you can use it for the
> +second and following ones and use xsk_socket__create() for the first
> +one. Both methods yield the same result.
> +
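
Roughly, for a second socket on queue id 1 (untested; rx2, tx2
and cfg as in the single-socket case):

  struct xsk_ring_prod fq2;     /* new FILL ring for this tuple */
  struct xsk_ring_cons cq2;     /* new COMPLETION ring */
  struct xsk_socket *xsk2;

  err = xsk_socket__create_shared(&xsk2, ifname, 1, umem,
                                  &rx2, &tx2, &fq2, &cq2, &cfg);
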
> +Note that a UMEM can be shared between sockets on the same queue id
> +and device, as well as between queues on the same device and between
> +devices at the same time.
> +
>   XDP_USE_NEED_WAKEUP bind flag
>   -----------------------------
>   
> @@ -364,7 +401,7 @@ resources by only setting up one of them. Both the FILL ring and the
>   COMPLETION ring are mandatory as you need to have a UMEM tied to your
>   socket. But if the XDP_SHARED_UMEM flag is used, any socket after the
>   first one does not have a UMEM and should in that case not have any
> -FILL or COMPLETION rings created as the ones from the shared umem will
> +FILL or COMPLETION rings created as the ones from the shared UMEM will
>   be used. Note, that the rings are single-producer single-consumer, so
>   do not try to access them from multiple processes at the same
>   time. See the XDP_SHARED_UMEM section.
> @@ -567,6 +604,17 @@ A: The short answer is no, that is not supported at the moment. The
>      switch, or other distribution mechanism, in your NIC to direct
>      traffic to the correct queue id and socket.
>   
> +Q: My packets are sometimes corrupted. What is wrong?
> +
> +A: Care has to be taken not to feed the same buffer in the UMEM into
> +   more than one ring at the same time. If you, for example, feed the
> +   same buffer into the FILL ring and the TX ring at the same time, the
> +   NIC might receive data into the buffer at the same time it is
> +   sending it. This will cause some packets to become corrupted. Same
> +   thing goes for feeding the same buffer into the FILL rings
> +   belonging to different queue ids or netdevs bound with the
> +   XDP_SHARED_UMEM flag.
> +
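
Maybe worth adding the usual recycling pattern as an example at
some point: a buffer address should live on exactly one ring at a
time, so completed TX buffers go straight back onto the FILL
ring. Untested sketch with libbpf's ring helpers (assumes the
FILL ring has room for all n entries; real code must handle the
shortfall):

  __u32 idx_cq = 0, idx_fq = 0;
  unsigned int i, n;

  n = xsk_ring_cons__peek(&cq, 64, &idx_cq);
  if (n && xsk_ring_prod__reserve(&fq, n, &idx_fq) == n) {
          for (i = 0; i < n; i++)
                  *xsk_ring_prod__fill_addr(&fq, idx_fq++) =
                          *xsk_ring_cons__comp_addr(&cq, idx_cq++);
          xsk_ring_prod__submit(&fq, n);
          xsk_ring_cons__release(&cq, n);
  }
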
>   Credits
>   =======
>   
> 
