All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stanislav Fomichev <sdf@google.com>
To: bpf@vger.kernel.org
Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	martin.lau@linux.dev, song@kernel.org, yhs@fb.com,
	john.fastabend@gmail.com, kpsingh@kernel.org, sdf@google.com,
	haoluo@google.com, jolsa@kernel.org,
	David Ahern <dsahern@gmail.com>, Jakub Kicinski <kuba@kernel.org>,
	Willem de Bruijn <willemb@google.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Anatoly Burakov <anatoly.burakov@intel.com>,
	Alexander Lobakin <alexandr.lobakin@intel.com>,
	Magnus Karlsson <magnus.karlsson@gmail.com>,
	Maryam Tahhan <mtahhan@redhat.com>,
	xdp-hints@xdp-project.net, netdev@vger.kernel.org
Subject: [PATCH bpf-next v5 01/17] bpf: Document XDP RX metadata
Date: Tue, 20 Dec 2022 14:20:27 -0800	[thread overview]
Message-ID: <20221220222043.3348718-2-sdf@google.com> (raw)
In-Reply-To: <20221220222043.3348718-1-sdf@google.com>

Document all current use-cases and assumptions.

Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 Documentation/networking/index.rst           |   1 +
 Documentation/networking/xdp-rx-metadata.rst | 107 +++++++++++++++++++
 2 files changed, 108 insertions(+)
 create mode 100644 Documentation/networking/xdp-rx-metadata.rst

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 4f2d1f682a18..4ddcae33c336 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -120,6 +120,7 @@ Refer to :ref:`netdev-FAQ` for a guide on netdev development process specifics.
    xfrm_proc
    xfrm_sync
    xfrm_sysctl
+   xdp-rx-metadata
 
 .. only::  subproject and html
 
diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
new file mode 100644
index 000000000000..37e8192d9b60
--- /dev/null
+++ b/Documentation/networking/xdp-rx-metadata.rst
@@ -0,0 +1,107 @@
+===============
+XDP RX Metadata
+===============
+
+This document describes how an XDP program can access hardware metadata
+related to a packet using a set of helper functions, and how it can pass
+that metadata on to other consumers.
+
+General Design
+==============
+
+XDP has access to a set of kfuncs to manipulate the metadata in an XDP frame.
+Every device driver that wishes to expose additional packet metadata can
+implement these kfuncs. The set of kfuncs is declared in ``include/net/xdp.h``
+via ``XDP_METADATA_KFUNC_xxx``.
+
+Currently, the following kfuncs are supported. In the future, as more
+metadata is supported, this set will grow:
+
+- ``bpf_xdp_metadata_rx_timestamp`` returns a packet's RX timestamp
+- ``bpf_xdp_metadata_rx_hash`` returns a packet's RX hash
+
+The XDP program can use these kfuncs to read the metadata into stack
+variables for its own consumption. Or, to pass the metadata on to other
+consumers, an XDP program can store it into the metadata area carried
+ahead of the packet.
+
+Not all kfuncs have to be implemented by the device driver; when not
+implemented, the default ones that return ``-EOPNOTSUPP`` will be used.
+
+Within the XDP frame, the metadata layout is as follows::
+
+  +----------+-----------------+------+
+  | headroom | custom metadata | data |
+  +----------+-----------------+------+
+             ^                 ^
+             |                 |
+   xdp_buff->data_meta   xdp_buff->data
+
+The XDP program can store individual metadata items into this data_meta
+area in whichever format it chooses. Later consumers of the metadata
+will have to agree on the format by some out of band contract (like for
+the AF_XDP use case, see below).
+
+AF_XDP
+======
+
+``AF_XDP`` use-case implies that there is a contract between the BPF program
+that redirects XDP frames into the ``AF_XDP`` socket (``XSK``) and the final
+consumer. Thus the BPF program manually allocates a fixed number of
+bytes out of metadata via ``bpf_xdp_adjust_meta`` and calls a subset
+of kfuncs to populate it. The userspace ``XSK`` consumer computes
+``xsk_umem__get_data() - METADATA_SIZE`` to locate its metadata.
+
+Here is the ``AF_XDP`` consumer layout (note missing ``data_meta`` pointer)::
+
+  +----------+-----------------+------+
+  | headroom | custom metadata | data |
+  +----------+-----------------+------+
+                               ^
+                               |
+                        rx_desc->address
+
+XDP_PASS
+========
+
+This is the path where the packets processed by the XDP program are passed
+into the kernel. The kernel creates the ``skb`` out of the ``xdp_buff``
+contents. Currently, every driver has custom kernel code to parse
+the descriptors and populate ``skb`` metadata when doing this ``xdp_buff->skb``
+conversion, and the XDP metadata is not used by the kernel when building
+skbs. However, TC-BPF programs can access the XDP metadata area using
+the data_meta pointer.
+
+In the future, we'd like to support a case where an XDP program
+can override some of the metadata used for building skbs.
+
+bpf_redirect_map
+================
+
+``bpf_redirect_map`` can redirect the frame to a different device.
+Some devices (like virtual ethernet links) support running a second XDP
+program after the redirect. However, the final consumer doesn't have
+access to the original hardware descriptor and can't access any of
+the original metadata. The same applies to XDP programs installed
+into devmaps and cpumaps.
+
+This means that for redirected packets only custom metadata is
+currently supported, which has to be prepared by the initial XDP program
+before redirect. If the frame is eventually passed to the kernel, the
+skb created from such a frame won't have any hardware metadata populated
+in its skb. And if such a packet is later redirected into an ``XSK``,
+that will also only have access to the custom metadata.
+
+
+bpf_tail_call
+=============
+
+Adding programs that access metadata kfuncs to the ``BPF_MAP_TYPE_PROG_ARRAY``
+is currently not supported.
+
+Example
+=======
+
+See ``tools/testing/selftests/bpf/progs/xdp_metadata.c`` and
+``tools/testing/selftests/bpf/prog_tests/xdp_metadata.c`` for an example of
+BPF program that handles XDP metadata.
-- 
2.39.0.314.g84b9a713c41-goog


  reply	other threads:[~2022-12-20 22:21 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-20 22:20 [PATCH bpf-next v5 00/17] xdp: hints via kfuncs Stanislav Fomichev
2022-12-20 22:20 ` Stanislav Fomichev [this message]
2022-12-28 17:25   ` [PATCH bpf-next v5 01/17] bpf: Document XDP RX metadata David Vernet
2023-01-03 22:23     ` Stanislav Fomichev
2023-01-04 16:02       ` David Vernet
2022-12-20 22:20 ` [PATCH bpf-next v5 02/17] bpf: Rename bpf_{prog,map}_is_dev_bound to is_offloaded Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 03/17] bpf: Move offload initialization into late_initcall Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 04/17] bpf: Reshuffle some parts of bpf/offload.c Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 05/17] bpf: Introduce device-bound XDP programs Stanislav Fomichev
2022-12-23  0:19   ` Martin KaFai Lau
2022-12-23  4:06     ` Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 06/17] selftests/bpf: Update expected test_offload.py messages Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 07/17] bpf: XDP metadata RX kfuncs Stanislav Fomichev
2022-12-21  5:43   ` kernel test robot
2022-12-21 17:49     ` Stanislav Fomichev
2022-12-23  0:31   ` Martin KaFai Lau
2022-12-23  4:06     ` Stanislav Fomichev
2022-12-27 20:33   ` David Vernet
2023-01-03 22:23     ` Stanislav Fomichev
2023-01-03 22:35       ` David Vernet
2022-12-20 22:20 ` [PATCH bpf-next v5 08/17] bpf: Support consuming XDP HW metadata from fext programs Stanislav Fomichev
2022-12-23  0:37   ` Martin KaFai Lau
2022-12-23  4:06     ` Stanislav Fomichev
2023-01-04  1:51       ` Martin KaFai Lau
2023-01-04  3:59         ` Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 09/17] veth: Introduce veth_xdp_buff wrapper for xdp_buff Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 10/17] veth: Support RX XDP metadata Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 11/17] selftests/bpf: Verify xdp_metadata xdp->af_xdp path Stanislav Fomichev
2022-12-23  0:40   ` Martin KaFai Lau
2022-12-23  4:06     ` Stanislav Fomichev
2023-01-04  2:05       ` Martin KaFai Lau
2022-12-20 22:20 ` [PATCH bpf-next v5 12/17] net/mlx4_en: Introduce wrapper for xdp_buff Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 13/17] net/mlx4_en: Support RX XDP metadata Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 14/17] xsk: Add cb area to struct xdp_buff_xsk Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 15/17] net/mlx5e: Introduce wrapper for xdp_buff Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 16/17] net/mlx5e: Support RX XDP metadata Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 17/17] selftests/bpf: Simple program to dump XDP RX metadata Stanislav Fomichev
2022-12-23  0:53   ` Martin KaFai Lau
2022-12-23  4:07     ` Stanislav Fomichev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221220222043.3348718-2-sdf@google.com \
    --to=sdf@google.com \
    --cc=alexandr.lobakin@intel.com \
    --cc=anatoly.burakov@intel.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=dsahern@gmail.com \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=magnus.karlsson@gmail.com \
    --cc=martin.lau@linux.dev \
    --cc=mtahhan@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=song@kernel.org \
    --cc=willemb@google.com \
    --cc=xdp-hints@xdp-project.net \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.