bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stanislav Fomichev <sdf@google.com>
To: David Vernet <void@manifault.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, martin.lau@linux.dev, song@kernel.org,
	yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org,
	haoluo@google.com, jolsa@kernel.org,
	David Ahern <dsahern@gmail.com>, Jakub Kicinski <kuba@kernel.org>,
	Willem de Bruijn <willemb@google.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Anatoly Burakov <anatoly.burakov@intel.com>,
	Alexander Lobakin <alexandr.lobakin@intel.com>,
	Magnus Karlsson <magnus.karlsson@gmail.com>,
	Maryam Tahhan <mtahhan@redhat.com>,
	xdp-hints@xdp-project.net, netdev@vger.kernel.org
Subject: Re: [PATCH bpf-next v5 01/17] bpf: Document XDP RX metadata
Date: Tue, 3 Jan 2023 14:23:03 -0800	[thread overview]
Message-ID: <CAKH8qBs8+gUekdNDRKnU1hg8gCB4Q29eMBhF2NeTT7Y1pyitQQ@mail.gmail.com> (raw)
In-Reply-To: <Y6x7+BL7eWERwpGy@maniforge.lan>

On Wed, Dec 28, 2022 at 9:25 AM David Vernet <void@manifault.com> wrote:
>
> On Tue, Dec 20, 2022 at 02:20:27PM -0800, Stanislav Fomichev wrote:
> > Document all current use-cases and assumptions.
> >
> > Cc: John Fastabend <john.fastabend@gmail.com>
> > Cc: David Ahern <dsahern@gmail.com>
> > Cc: Martin KaFai Lau <martin.lau@linux.dev>
> > Cc: Jakub Kicinski <kuba@kernel.org>
> > Cc: Willem de Bruijn <willemb@google.com>
> > Cc: Jesper Dangaard Brouer <brouer@redhat.com>
> > Cc: Anatoly Burakov <anatoly.burakov@intel.com>
> > Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
> > Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
> > Cc: Maryam Tahhan <mtahhan@redhat.com>
> > Cc: xdp-hints@xdp-project.net
> > Cc: netdev@vger.kernel.org
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
> >  Documentation/networking/index.rst           |   1 +
> >  Documentation/networking/xdp-rx-metadata.rst | 107 +++++++++++++++++++
> >  2 files changed, 108 insertions(+)
> >  create mode 100644 Documentation/networking/xdp-rx-metadata.rst
> >
> > diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
> > index 4f2d1f682a18..4ddcae33c336 100644
> > --- a/Documentation/networking/index.rst
> > +++ b/Documentation/networking/index.rst
> > @@ -120,6 +120,7 @@ Refer to :ref:`netdev-FAQ` for a guide on netdev development process specifics.
> >     xfrm_proc
> >     xfrm_sync
> >     xfrm_sysctl
> > +   xdp-rx-metadata
> >
> >  .. only::  subproject and html
> >
> > diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
> > new file mode 100644
> > index 000000000000..37e8192d9b60
> > --- /dev/null
> > +++ b/Documentation/networking/xdp-rx-metadata.rst
>
> Hey Stanislav,
>
> This is looking excellent. Left a few more minor comments and
> suggestions.
>
> > @@ -0,0 +1,107 @@
> > +===============
> > +XDP RX Metadata
> > +===============
> > +
> > +This document describes how an XDP program can access hardware metadata
>
> In similar fashion to LWN articles, can we spell out what XDP means the
> first time it's used, e.g.:
>
> ...describes how an eXpress Data Path (XDP) program...
>
> In general this applies to other acronyms unless they're super obvious,
> like "CPU" (thanks for already having done it for XSK).

Sure. Hopefully no need to explain RX below? Don't see anything else..
LMK if I missed something

> > +related to a packet using a set of helper functions, and how it can pass
> > +that metadata on to other consumers.
> > +
> > +General Design
> > +==============
> > +
> > +XDP has access to a set of kfuncs to manipulate the metadata in an XDP frame.
> > +Every device driver that wishes to expose additional packet metadata can
> > +implement these kfuncs. The set of kfuncs is declared in ``include/net/xdp.h``
> > +via ``XDP_METADATA_KFUNC_xxx``.
> > +
> > +Currently, the following kfuncs are supported. In the future, as more
> > +metadata is supported, this set will grow:
> > +
> > +- ``bpf_xdp_metadata_rx_timestamp`` returns a packet's RX timestamp
> > +- ``bpf_xdp_metadata_rx_hash`` returns a packet's RX hash
>
> So, I leave this up to you as to whether or not you want to do this, but
> there is a built-in mechanism in sphinx that converts doxygen comments
> to rendered documentation for a function, struct, etc, and also
> automatically links other places in documentation where the function is
> referenced. See [0] for an example of this in code, and [1] for an
> example of how it's rendered.
>
> [0]: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/Documentation/bpf/kfuncs.rst#n239
> [1]: https://docs.kernel.org/bpf/kfuncs.html#c.bpf_task_acquire
>
> So you would do something like add function headers to the kfuncs, and
> then do:
>
> .. kernel-doc:: net/core/xdp.c
>    :identifiers: bpf_xdp_metadata_rx_timestamp bpf_xdp_metadata_rx_hash
>
> At some point we will need a consistent story for how we document
> kfuncs. That's not set in stone yet, which is why I'm saying it's up to
> you whether or not you want to do this or just leave it as teletype with
> a written description next to it.  Later on when we settle on a
> documentation story for kfuncs, we can update all of them to be
> documented in the same way.

Let me try and see how it looks in the html doc. I like the idea of
referencing the code directly, hopefully less chance it goes stale.

> > +The XDP program can use these kfuncs to read the metadata into stack
>
> s/The/An

Ack, ty!

> > +variables for its own consumption. Or, to pass the metadata on to other
> > +consumers, an XDP program can store it into the metadata area carried
> > +ahead of the packet.
> > +
> > +Not all kfuncs have to be implemented by the device driver; when not
> > +implemented, the default ones that return ``-EOPNOTSUPP`` will be used.
> > +
> > +Within the XDP frame, the metadata layout is as follows::
>
> s/the/an

Thx!

> > +
> > +  +----------+-----------------+------+
> > +  | headroom | custom metadata | data |
> > +  +----------+-----------------+------+
> > +             ^                 ^
> > +             |                 |
> > +   xdp_buff->data_meta   xdp_buff->data
> > +
> > +The XDP program can store individual metadata items into this data_meta
>
> Can we teletype ``data_meta``? Also, s/The XDP program/An XDP program

Sure.

> > +area in whichever format it chooses. Later consumers of the metadata
> > +will have to agree on the format by some out of band contract (like for
> > +the AF_XDP use case, see below).
> > +
> > +AF_XDP
> > +======
> > +
> > +``AF_XDP`` use-case implies that there is a contract between the BPF program
> > +that redirects XDP frames into the ``AF_XDP`` socket (``XSK``) and the final
> > +consumer. Thus the BPF program manually allocates a fixed number of
>
> Can we have ``AF_XDP`` link to the ``AF_XDP`` docs page for any reader
> that's not familiar with it? I think you can just do the following and
> it should just work:
>
> s/``AF_XDP``/:doc:`af_xdp`

Will try.

> > +bytes out of metadata via ``bpf_xdp_adjust_meta`` and calls a subset
> > +of kfuncs to populate it. The userspace ``XSK`` consumer computes
> > +``xsk_umem__get_data() - METADATA_SIZE`` to locate its metadata.
>
> s/to locate its metadata/to locate that metadata

SG.

> to make it clear that the consumer is consuming the metadata that was
> populated by bpf_xdp_adjust_meta()? It's pretty clear from context but I
> think "its metadata" may confuse some people as the metadata is not

[..]

> really owned by the consumer. Also, should we mention that
> xsk_umem__get_data() is defined in libxdp?

Add something like:

Note, ``xsk_umem__get_data`` is defined in ``libxdp`` and
``METADATA_SIZE`` is an application-specific constant.

?

> One other probably dumb question: is METADATA_SIZE static? If so, should
> we provide a libxdp wrapper to access it?

(added a note above)

> > +
> > +Here is the ``AF_XDP`` consumer layout (note missing ``data_meta`` pointer)::
> > +
> > +  +----------+-----------------+------+
> > +  | headroom | custom metadata | data |
> > +  +----------+-----------------+------+
> > +                               ^
> > +                               |
> > +                        rx_desc->address
> > +
> > +XDP_PASS
> > +========
> > +
> > +This is the path where the packets processed by the XDP program are passed
> > +into the kernel. The kernel creates the ``skb`` out of the ``xdp_buff``
> > +contents. Currently, every driver has custom kernel code to parse
> > +the descriptors and populate ``skb`` metadata when doing this ``xdp_buff->skb``
> > +conversion, and the XDP metadata is not used by the kernel when building
> > +skbs. However, TC-BPF programs can access the XDP metadata area using
> > +the data_meta pointer.
>
> ``data_meta``

Ack.

> > +
> > +In the future, we'd like to support a case where an XDP program
> > +can override some of the metadata used for building skbs.
> > +
> > +bpf_redirect_map
> > +================
> > +
> > +``bpf_redirect_map`` can redirect the frame to a different device.
> > +Some devices (like virtual ethernet links) support running a second XDP
> > +program after the redirect. However, the final consumer doesn't have
> > +access to the original hardware descriptor and can't access any of
> > +the original metadata. The same applies to XDP programs installed
> > +into devmaps and cpumaps.
> > +
> > +This means that for redirected packets only custom metadata is
> > +currently supported, which has to be prepared by the initial XDP program
> > +before redirect. If the frame is eventually passed to the kernel, the
> > +skb created from such a frame won't have any hardware metadata populated
> > +in its skb. And if such a packet is later redirected into an ``XSK``,
>
> s/And if/If
>
> Also, can you add teletype ``skb`` in this paragraph?

Sure, thanks.

> > +that will also only have access to the custom metadata.
> > +
> > +
> > +bpf_tail_call
> > +=============
> > +
> > +Adding programs that access metadata kfuncs to the ``BPF_MAP_TYPE_PROG_ARRAY``
> > +is currently not supported.
> > +
> > +Example
> > +=======
> > +
> > +See ``tools/testing/selftests/bpf/progs/xdp_metadata.c`` and
> > +``tools/testing/selftests/bpf/prog_tests/xdp_metadata.c`` for an example of
> > +BPF program that handles XDP metadata.
> > --
> > 2.39.0.314.g84b9a713c41-goog
>
> Looks great otherwise, thanks.

  reply	other threads:[~2023-01-03 22:27 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-20 22:20 [PATCH bpf-next v5 00/17] xdp: hints via kfuncs Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 01/17] bpf: Document XDP RX metadata Stanislav Fomichev
2022-12-28 17:25   ` David Vernet
2023-01-03 22:23     ` Stanislav Fomichev [this message]
2023-01-04 16:02       ` David Vernet
2022-12-20 22:20 ` [PATCH bpf-next v5 02/17] bpf: Rename bpf_{prog,map}_is_dev_bound to is_offloaded Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 03/17] bpf: Move offload initialization into late_initcall Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 04/17] bpf: Reshuffle some parts of bpf/offload.c Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 05/17] bpf: Introduce device-bound XDP programs Stanislav Fomichev
2022-12-23  0:19   ` Martin KaFai Lau
2022-12-23  4:06     ` Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 06/17] selftests/bpf: Update expected test_offload.py messages Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 07/17] bpf: XDP metadata RX kfuncs Stanislav Fomichev
     [not found]   ` <202212211311.e2ZWQLue-lkp@intel.com>
2022-12-21 17:49     ` Stanislav Fomichev
2022-12-23  0:31   ` Martin KaFai Lau
2022-12-23  4:06     ` Stanislav Fomichev
2022-12-27 20:33   ` David Vernet
2023-01-03 22:23     ` Stanislav Fomichev
2023-01-03 22:35       ` David Vernet
2022-12-20 22:20 ` [PATCH bpf-next v5 08/17] bpf: Support consuming XDP HW metadata from fext programs Stanislav Fomichev
2022-12-23  0:37   ` Martin KaFai Lau
2022-12-23  4:06     ` Stanislav Fomichev
2023-01-04  1:51       ` Martin KaFai Lau
2023-01-04  3:59         ` Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 09/17] veth: Introduce veth_xdp_buff wrapper for xdp_buff Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 10/17] veth: Support RX XDP metadata Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 11/17] selftests/bpf: Verify xdp_metadata xdp->af_xdp path Stanislav Fomichev
2022-12-23  0:40   ` Martin KaFai Lau
2022-12-23  4:06     ` Stanislav Fomichev
2023-01-04  2:05       ` Martin KaFai Lau
2022-12-20 22:20 ` [PATCH bpf-next v5 12/17] net/mlx4_en: Introduce wrapper for xdp_buff Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 13/17] net/mlx4_en: Support RX XDP metadata Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 14/17] xsk: Add cb area to struct xdp_buff_xsk Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 15/17] net/mlx5e: Introduce wrapper for xdp_buff Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 16/17] net/mlx5e: Support RX XDP metadata Stanislav Fomichev
2022-12-20 22:20 ` [PATCH bpf-next v5 17/17] selftests/bpf: Simple program to dump XDP RX metadata Stanislav Fomichev
2022-12-23  0:53   ` Martin KaFai Lau
2022-12-23  4:07     ` Stanislav Fomichev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKH8qBs8+gUekdNDRKnU1hg8gCB4Q29eMBhF2NeTT7Y1pyitQQ@mail.gmail.com \
    --to=sdf@google.com \
    --cc=alexandr.lobakin@intel.com \
    --cc=anatoly.burakov@intel.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=dsahern@gmail.com \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=magnus.karlsson@gmail.com \
    --cc=martin.lau@linux.dev \
    --cc=mtahhan@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=song@kernel.org \
    --cc=void@manifault.com \
    --cc=willemb@google.com \
    --cc=xdp-hints@xdp-project.net \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).