All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: sdf@google.com, Jakub Kicinski <kuba@kernel.org>
Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, martin.lau@linux.dev, song@kernel.org,
	yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org,
	haoluo@google.com, jolsa@kernel.org,
	Tariq Toukan <tariqt@nvidia.com>, David Ahern <dsahern@gmail.com>,
	Willem de Bruijn <willemb@google.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Anatoly Burakov <anatoly.burakov@intel.com>,
	Alexander Lobakin <alexandr.lobakin@intel.com>,
	Magnus Karlsson <magnus.karlsson@gmail.com>,
	Maryam Tahhan <mtahhan@redhat.com>,
	xdp-hints@xdp-project.net, netdev@vger.kernel.org
Subject: Re: [xdp-hints] Re: [PATCH bpf-next v2 6/8] mlx4: Introduce mlx4_xdp_buff wrapper for xdp_buff
Date: Wed, 23 Nov 2022 22:55:21 +0100	[thread overview]
Message-ID: <871qptuyie.fsf@toke.dk> (raw)
In-Reply-To: <Y3557Ecr80Y9ZD2z@google.com>

sdf@google.com writes:

> On 11/23, Jakub Kicinski wrote:
>> On Wed, 23 Nov 2022 10:26:41 -0800 Stanislav Fomichev wrote:
>> > > This embedding trick works for drivers that put xdp_buff on the stack,
>> > > but mlx5 supports XSK zerocopy, which uses the xsk_buff_pool for
>> > > allocating them. This makes it a bit awkward to do the same thing  
>> there;
>> > > and since it's probably going to be fairly common to do something like
>> > > this, how about we just add a 'void *drv_priv' pointer to struct
>> > > xdp_buff that the drivers can use? The xdp_buff already takes up a  
>> full
>> > > cache line anyway, so any data stuffed after it will spill over to a  
>> new
>> > > one; so I don't think there's much difference performance-wise.
>> >
>> > I guess the alternative is to extend xsk_buff_pool with some new
>> > argument for xdp_buff tailroom? (so it can kmalloc(sizeof(xdp_buff) +
>> > xdp_buff_tailroom))
>> > But it seems messy because there is no way of knowing what the target
>> > device's tailroom is, so it has to be a user setting :-/
>> > I've started with a priv pointer in xdp_buff initially, it seems fine
>> > to go back. I'll probably convert veth/mlx4 to the same mode as well
>> > to avoid having different approaches in different places..
>
>> Can we not do this please? Add 16B of "private driver space" after
>> the xdp_buff in xdp_buff_xsk (we have 16B to full cacheline), the
>> drivers decide how they use it. Drivers can do BUILD_BUG_ON() for their
>> expected size and cast that to whatever struct they want. This is how
>> various offloads work, the variable size tailroom would be an over
>> design IMO.
>
>> And this way non XSK paths can keep its normal typing.
>
> Good idea, prototyped below, lmk if it that's not what you had in mind.
>
> struct xdp_buff_xsk {
> 	struct xdp_buff            xdp;                  /*     0    56 */
> 	u8                         cb[16];               /*    56    16 */
> 	/* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */

As pahole helpfully says here, xdp_buff is actually only 8 bytes from
being a full cache line. I thought about adding a 'cb' field like this
to xdp_buff itself, but figured that since there's only room for a
single pointer, why not just add that and let the driver point it to
where it wants to store the extra context data?

I am not suggesting to make anything variable-size; the 'void *drv_priv'
is just a normal pointer. There's no changes to any typing; not sure
where you got that from, Jakub?

Also, the priv pointer approach works for both XSK and on-stack
allocations, unlike this approach (see below).

> 	dma_addr_t                 dma;                  /*    72     8 */
> 	dma_addr_t                 frame_dma;            /*    80     8 */
> 	struct xsk_buff_pool *     pool;                 /*    88     8 */
> 	u64                        orig_addr;            /*    96     8 */
> 	struct list_head           free_list_node;       /*   104    16 */
>
> 	/* size: 120, cachelines: 2, members: 7 */
> 	/* last cacheline: 56 bytes */
> };
>
> Toke, I can try to merge this into your patch + keep your SoB (or feel free
> to try this and retest yourself, whatever works).
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h  
> b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
> index bc2d9034af5b..837bf103b871 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
> @@ -44,6 +44,11 @@
>   	(MLX5E_XDP_INLINE_WQE_MAX_DS_CNT * MLX5_SEND_WQE_DS - \
>   	 sizeof(struct mlx5_wqe_inline_seg))
>
> +struct mlx5_xdp_cb {
> +	struct mlx5_cqe64 *cqe;
> +	struct mlx5e_rq *rq;
> +};
> +
>   struct mlx5e_xsk_param;
>   int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param  
> *xsk);
>   bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct page *page,
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c  
> b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> index c91b54d9ff27..84d23b2da7ce 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
> @@ -5,6 +5,7 @@
>   #include "en/xdp.h"
>   #include <net/xdp_sock_drv.h>
>   #include <linux/filter.h>
> +#include <linux/build_bug.h>
>
>   /* RX data path */
>
> @@ -286,8 +287,14 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct  
> mlx5e_rq *rq,
>   					      u32 cqe_bcnt)
>   {
>   	struct xdp_buff *xdp = wi->au->xsk;
> +	struct mlx5_xdp_cb *cb;
>   	struct bpf_prog *prog;
>
> +	BUILD_BUG_ON(sizeof(struct mlx5_xdp_cb) > XSKB_CB_SIZE);
> +	cb = xp_get_cb(xdp);
> +	cb->cqe = NULL /*cqe*/;
> +	cb->rq = rq;

So this works fine for the XSK path, but for the regular XDP path, mlx5
*does* indeed put the xdp_buff on the stack. So to re-use code there
would be an implicit assumption that both memory layout and size matches
between the two paths. I'm not sure that's better than just having a
pointer inside the xdp_buff and pointing it wherever makes sense for
that driver (as my patch did)?

-Toke


  parent reply	other threads:[~2022-11-23 21:56 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-21 18:25 [PATCH bpf-next v2 0/8] xdp: hints via kfuncs Stanislav Fomichev
2022-11-21 18:25 ` [PATCH bpf-next v2 1/8] bpf: Document XDP RX metadata Stanislav Fomichev
2022-11-21 18:25 ` [PATCH bpf-next v2 2/8] bpf: XDP metadata RX kfuncs Stanislav Fomichev
2022-11-21 23:31   ` kernel test robot
2022-11-23  6:34   ` Martin KaFai Lau
2022-11-23 18:43     ` Stanislav Fomichev
2022-11-23 14:24   ` [xdp-hints] " Toke Høiland-Jørgensen
2022-11-23 18:43     ` Stanislav Fomichev
2022-11-24  2:23   ` kernel test robot
2022-11-24 12:19   ` kernel test robot
2022-11-24 13:09   ` kernel test robot
2022-11-25 17:53   ` [xdp-hints] " Toke Høiland-Jørgensen
2022-11-28 18:53     ` Stanislav Fomichev
2022-11-28 19:21       ` Stanislav Fomichev
2022-11-28 22:25         ` Toke Høiland-Jørgensen
2022-11-28 22:10       ` [xdp-hints] " Toke Høiland-Jørgensen
2022-11-30 17:24   ` Larysa Zaremba
2022-11-30 19:06     ` Stanislav Fomichev
2022-11-30 20:17       ` Stanislav Fomichev
2022-12-01 13:52         ` Larysa Zaremba
2022-12-01 17:14           ` Stanislav Fomichev
2022-11-21 18:25 ` [PATCH bpf-next v2 3/8] veth: Introduce veth_xdp_buff wrapper for xdp_buff Stanislav Fomichev
2022-11-21 18:25 ` [PATCH bpf-next v2 4/8] veth: Support RX XDP metadata Stanislav Fomichev
2022-11-21 18:25 ` [PATCH bpf-next v2 5/8] selftests/bpf: Verify xdp_metadata xdp->af_xdp path Stanislav Fomichev
2022-11-29 10:06   ` Anton Protopopov
2022-11-29 18:52     ` Stanislav Fomichev
2022-11-21 18:25 ` [PATCH bpf-next v2 6/8] mlx4: Introduce mlx4_xdp_buff wrapper for xdp_buff Stanislav Fomichev
2022-11-22 13:49   ` Tariq Toukan
2022-11-22 18:08     ` Stanislav Fomichev
2022-11-23 14:33   ` [xdp-hints] " Toke Høiland-Jørgensen
2022-11-23 18:26     ` Stanislav Fomichev
2022-11-23 19:14       ` Jakub Kicinski
2022-11-23 19:52         ` sdf
2022-11-23 21:54           ` Maciej Fijalkowski
2022-11-23 21:55           ` Toke Høiland-Jørgensen [this message]
2022-11-24  1:47             ` [xdp-hints] " Jakub Kicinski
2022-11-24 14:39               ` Toke Høiland-Jørgensen
2022-11-24 15:17                 ` Maciej Fijalkowski
2022-11-24 16:11                   ` Maciej Fijalkowski
2022-11-25  0:36                     ` Toke Høiland-Jørgensen
2022-11-28 21:58                       ` Stanislav Fomichev
2022-11-28 22:11                         ` Toke Høiland-Jørgensen
2022-11-21 18:25 ` [PATCH bpf-next v2 7/8] mxl4: Support RX XDP metadata Stanislav Fomichev
2022-11-22 13:50   ` Tariq Toukan
2022-11-22 18:08     ` Stanislav Fomichev
2022-11-21 18:25 ` [PATCH bpf-next v2 8/8] selftests/bpf: Simple program to dump XDP RX metadata Stanislav Fomichev
2022-11-23 14:26   ` [xdp-hints] " Toke Høiland-Jørgensen
2022-11-23 18:29     ` Stanislav Fomichev
2022-11-23 19:17       ` Jakub Kicinski
2022-11-23 19:54         ` Stanislav Fomichev
2022-11-23 14:46 ` [PATCH bpf-next 1/2] xdp: Add drv_priv pointer to struct xdp_buff Toke Høiland-Jørgensen
2022-11-23 14:46   ` [PATCH bpf-next 2/2] mlx5: Support XDP RX metadata Toke Høiland-Jørgensen
2022-11-23 22:29     ` [xdp-hints] " Saeed Mahameed
2022-11-23 22:44       ` Stanislav Fomichev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=871qptuyie.fsf@toke.dk \
    --to=toke@redhat.com \
    --cc=alexandr.lobakin@intel.com \
    --cc=anatoly.burakov@intel.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=dsahern@gmail.com \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=magnus.karlsson@gmail.com \
    --cc=martin.lau@linux.dev \
    --cc=mtahhan@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=sdf@google.com \
    --cc=song@kernel.org \
    --cc=tariqt@nvidia.com \
    --cc=willemb@google.com \
    --cc=xdp-hints@xdp-project.net \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.