From: John Fastabend <john.fastabend@gmail.com>
To: Lorenzo Bianconi <lorenzo@kernel.org>,
	bpf@vger.kernel.org, netdev@vger.kernel.org
Cc: lorenzo.bianconi@redhat.com, davem@davemloft.net,
	kuba@kernel.org, ast@kernel.org, daniel@iogearbox.net,
	shayagr@amazon.com, john.fastabend@gmail.com, dsahern@kernel.org,
	brouer@redhat.com, echaudro@redhat.com, jasowang@redhat.com,
	alexander.duyck@gmail.com, saeed@kernel.org,
	maciej.fijalkowski@intel.com, magnus.karlsson@intel.com,
	tirthendu.sarkar@intel.com, toke@redhat.com
Subject: RE: [PATCH v12 bpf-next 17/18] net: xdp: introduce bpf_xdp_adjust_data helper
Date: Tue, 31 Aug 2021 17:36:54 -0700
Message-ID: <612ecb262b05_6b87208c0@john-XPS-13-9370.notmuch>
In-Reply-To: <14b99bc75ce0f8d4968208fb0b420a054e45433e.1629473234.git.lorenzo@kernel.org>

Lorenzo Bianconi wrote:
> For XDP frames split over multiple buffers, the xdp_md->data and
> xdp_md->data_end pointers will point to the start and end of the first
> fragment only. bpf_xdp_adjust_data can be used to access subsequent
> fragments by moving the data pointers. To use, an XDP program can call
> this helper with the byte offset of the packet payload that
> it wants to access; the helper will move xdp_md->data and xdp_md ->data_end
> so they point to the requested payload offset and to the end of the
> fragment containing this byte offset, and return the byte offset of the
> start of the fragment.
> To move back to the beginning of the packet, simply call the
> helper with an offset of '0'.
> Note also that the helpers that modify the packet boundaries
> (bpf_xdp_adjust_head(), bpf_xdp_adjust_tail() and
> bpf_xdp_adjust_meta()) will fail if the pointers have been
> moved; it is the responsibility of the BPF program to move them
> back before using these helpers.

I'm OK with this for a first iteration. I guess with more work we
can make the helpers use the updated pointers, though.

> 
> Suggested-by: John Fastabend <john.fastabend@gmail.com>
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>

Overall looks good; a couple of small nits/questions below. Thanks!

> ---
>  include/net/xdp.h              |  8 +++++
>  include/uapi/linux/bpf.h       | 32 ++++++++++++++++++
>  net/bpf/test_run.c             |  8 +++++
>  net/core/filter.c              | 62 +++++++++++++++++++++++++++++++++-
>  tools/include/uapi/linux/bpf.h | 32 ++++++++++++++++++
>  5 files changed, 141 insertions(+), 1 deletion(-)
> 
> diff --git a/include/net/xdp.h b/include/net/xdp.h
> index cdaecf8d4d61..ce4764c7cd40 100644
> --- a/include/net/xdp.h
> +++ b/include/net/xdp.h
> @@ -82,6 +82,11 @@ struct xdp_buff {
>  	struct xdp_txq_info *txq;
>  	u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom*/
>  	u16 flags; /* supported values defined in xdp_flags */
> +	/* xdp multi-buff metadata used for frags iteration */
> +	struct {
> +		u16 headroom;	/* frame headroom: data - data_hard_start */
> +		u16 headlen;	/* first buffer length: data_end - data */
> +	} mb;
>  };
>  
>  static __always_inline bool xdp_buff_is_mb(struct xdp_buff *xdp)
> @@ -127,6 +132,9 @@ xdp_prepare_buff(struct xdp_buff *xdp, unsigned char *hard_start,
>  	xdp->data = data;
>  	xdp->data_end = data + data_len;
>  	xdp->data_meta = meta_valid ? data : data + 1;
> +	/* mb metadata for frags iteration */
> +	xdp->mb.headroom = headroom;
> +	xdp->mb.headlen = data_len;
>  }
>  
>  /* Reserve memory area at end-of data area.
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 9e2c3b12ea49..a7b5185a718a 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -4877,6 +4877,37 @@ union bpf_attr {
>   *		Get the total size of a given xdp buff (linear and paged area)
>   *	Return
>   *		The total size of a given xdp buffer.
> + *
> + * long bpf_xdp_adjust_data(struct xdp_buff *xdp_md, u32 offset)
> + *	Description
> + *		For XDP frames split over multiple buffers, the
> + *		*xdp_md*\ **->data** and*xdp_md *\ **->data_end** pointers
                                       ^^^^
missing space?

> + *		will point to the start and end of the first fragment only.
> + *		This helper can be used to access subsequent fragments by
> + *		moving the data pointers. To use, an XDP program can call
> + *		this helper with the byte offset of the packet payload that
> + *		it wants to access; the helper will move *xdp_md*\ **->data**
> + *		and *xdp_md *\ **->data_end** so they point to the requested
> + *		payload offset and to the end of the fragment containing this
> + *		byte offset, and return the byte offset of the start of the
> + *		fragment.
> + *		To move back to the beginning of the packet, simply call the
> + *		helper with an offset of '0'.
> + *		Note also that the helpers that modify the packet boundaries
> + *		(*bpf_xdp_adjust_head()*, *bpf_xdp_adjust_tail()* and
> + *		*bpf_xdp_adjust_meta()*) will fail if the pointers have been
> + *		moved; it is the responsibility of the BPF program to move them
> + *		back before using these helpers.
> + *
> + *		A call to this helper is susceptible to change the underlying
> + *		packet buffer. Therefore, at load time, all checks on pointers
> + *		previously done by the verifier are invalidated and must be
> + *		performed again, if the helper is used in combination with
> + *		direct packet access.
> + *	Return
> + *		offset between the beginning of the current fragment and
> + *		original *xdp_md*\ **->data** on success, or a negative error
> + *		in case of failure.
>   */
>  #define __BPF_FUNC_MAPPER(FN)		\
>  	FN(unspec),			\
> @@ -5055,6 +5086,7 @@ union bpf_attr {
>  	FN(get_func_ip),		\
>  	FN(get_attach_cookie),		\
>  	FN(xdp_get_buff_len),		\
> +	FN(xdp_adjust_data),		\
>  	/* */
>  
>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 869dcf23a1ca..f09c2c8c0d6c 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -757,6 +757,8 @@ static int xdp_convert_md_to_buff(struct xdp_md *xdp_md, struct xdp_buff *xdp)
>  	}
>  
>  	xdp->data = xdp->data_meta + xdp_md->data;
> +	xdp->mb.headroom = xdp->data - xdp->data_hard_start;
> +	xdp->mb.headlen = xdp->data_end - xdp->data;
>  	return 0;
>  
>  free_dev:
> @@ -871,6 +873,12 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
>  	if (ret)
>  		goto out;
>  
> +	/* data pointers need to be reset after frag iteration */
> +	if (unlikely(xdp.data_hard_start + xdp.mb.headroom != xdp.data)) {
> +		ret = -EFAULT;
> +		goto out;
> +	}
> +
>  	size = xdp.data_end - xdp.data_meta + sinfo->xdp_frags_size;
>  	ret = bpf_test_finish(kattr, uattr, xdp.data_meta, sinfo, size,
>  			      retval, duration);
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 2122c00c680f..ed2a6632adce 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3827,6 +3827,10 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
>  	void *data_start = xdp_frame_end + metalen;
>  	void *data = xdp->data + offset;
>  
> +	/* data pointers need to be reset after frag iteration */
> +	if (unlikely(xdp->data_hard_start + xdp->mb.headroom != xdp->data))
> +		return -EINVAL;

-EFAULT? It might be nice if this error code were different from the
one below, for debugging.
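
A minimal sketch of the distinction, as a userspace model rather than
kernel code. The struct, its offset fields, and the name
model_adjust_head are hypothetical stand-ins for struct xdp_buff and the
helper; only the ordering of the two checks and their error codes is the
point:

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_ETH_HLEN 14

/* Hypothetical model: all positions are offsets from data_hard_start. */
struct model_head {
	uint32_t headroom;     /* saved value of data - data_hard_start */
	uint32_t data_off;     /* current position of the data pointer */
	uint32_t data_end_off; /* current position of data_end */
	uint32_t min_data_off; /* lowest offset data is allowed to reach */
};

static int model_adjust_head(struct model_head *b, int delta)
{
	int64_t data = (int64_t)b->data_off + delta;

	/* Pointers were moved by frag iteration and not reset. */
	if (b->data_off != b->headroom)
		return -EFAULT;

	/* Requested adjustment falls outside the allowed range. */
	if (data < b->min_data_off ||
	    data > (int64_t)b->data_end_off - MODEL_ETH_HLEN)
		return -EINVAL;

	b->data_off = (uint32_t)data;
	b->headroom = b->data_off; /* keep the saved headroom in sync */
	return 0;
}
```

With separate codes, a caller that sees -EFAULT knows it forgot to move
the pointers back, while -EINVAL still means the requested offset was
out of range.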

> +
>  	if (unlikely(data < data_start ||
>  		     data > xdp->data_end - ETH_HLEN))
>  		return -EINVAL;
> @@ -3836,6 +3840,9 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
>  			xdp->data_meta, metalen);
>  	xdp->data_meta += offset;
>  	xdp->data = data;
> +	/* update metada for multi-buff frag iteration */
> +	xdp->mb.headroom = xdp->data - xdp->data_hard_start;
> +	xdp->mb.headlen = xdp->data_end - xdp->data;
>  
>  	return 0;
>  }
> @@ -3910,6 +3917,10 @@ BPF_CALL_2(bpf_xdp_adjust_tail, struct xdp_buff *, xdp, int, offset)
>  	void *data_hard_end = xdp_data_hard_end(xdp); /* use xdp->frame_sz */
>  	void *data_end = xdp->data_end + offset;
>  
> +	/* data pointer needs to be reset after frag iteration */
> +	if (unlikely(xdp->data + xdp->mb.headlen != xdp->data_end))
> +		return -EINVAL;

-EFAULT here as well?

> +
>  	if (unlikely(xdp_buff_is_mb(xdp)))
>  		return bpf_xdp_mb_adjust_tail(xdp, offset);
>  
> @@ -3949,6 +3960,10 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset)
>  	void *meta = xdp->data_meta + offset;
>  	unsigned long metalen = xdp->data - meta;
>  
> +	/* data pointer needs to be reset after frag iteration */
> +	if (unlikely(xdp->data_hard_start + xdp->mb.headroom != xdp->data))
> +		return -EINVAL;

Same comment as above.

>  	if (xdp_data_meta_unsupported(xdp))
>  		return -ENOTSUPP;
>  	if (unlikely(meta < xdp_frame_end ||
> @@ -3970,6 +3985,48 @@ static const struct bpf_func_proto bpf_xdp_adjust_meta_proto = {
>  	.arg2_type	= ARG_ANYTHING,
>  };
>  
> +BPF_CALL_2(bpf_xdp_adjust_data, struct xdp_buff *, xdp, u32, offset)
> +{
> +	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
> +	u32 base_offset = xdp->mb.headlen;
> +	int i;
> +
> +	if (!xdp_buff_is_mb(xdp) || offset > sinfo->xdp_frags_size)
> +		return -EINVAL;

Do we need to return an error here? If it's not mb, we can just return
the same as offset==0?
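
One possible reading of this suggestion, sketched as a userspace model
(the function name and struct linear_window are made up for
illustration): a non-mb buffer is treated as a packet with no fragments,
so only the linear branch applies and a successful call always returns
0, the same value an mb buffer gets for offset==0.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical result type: packet-relative [start, end) of the window. */
struct linear_window {
	uint32_t start;
	uint32_t end;
};

/* Non-mb case only: the whole packet is the linear area of headlen bytes. */
static long model_adjust_data_non_mb(uint32_t headlen, uint32_t offset,
				     struct linear_window *win)
{
	if (offset >= headlen)
		return -EINVAL; /* nothing exists beyond the linear area */

	win->start = offset;
	win->end = headlen;
	return 0; /* the linear area always starts at packet offset 0 */
}
```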

> +
> +	if (offset < xdp->mb.headlen) {
> +		/* linear area */
> +		xdp->data = xdp->data_hard_start + xdp->mb.headroom + offset;
> +		xdp->data_end = xdp->data_hard_start + xdp->mb.headroom +
> +				xdp->mb.headlen;
> +		return 0;
> +	}
> +
> +	for (i = 0; i < sinfo->nr_frags; i++) {
> +		/* paged area */
> +		skb_frag_t *frag = &sinfo->frags[i];
> +		unsigned int size = skb_frag_size(frag);
> +
> +		if (offset < base_offset + size) {
> +			u8 *addr = skb_frag_address(frag);
> +
> +			xdp->data = addr + offset - base_offset;
> +			xdp->data_end = addr + size;
> +			break;
> +		}
> +		base_offset += size;
> +	}
> +	return base_offset;
> +}
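
To make the return value concrete, here is a userspace model of the walk
above, using packet-relative byte offsets instead of pointers. Everything
named model_* or MODEL_* is hypothetical, and the bounds check simply
mirrors the patch as posted:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_FRAGS 4

/* Hypothetical stand-in for the xdp_buff fields the helper reads. */
struct model_buff {
	bool is_mb;                          /* multi-buffer flag */
	uint32_t headlen;                    /* bytes in the linear area */
	uint32_t nr_frags;                   /* number of paged fragments */
	uint32_t frag_size[MODEL_MAX_FRAGS]; /* size of each fragment */
	uint32_t frags_size;                 /* sum of frag_size[] */
	uint32_t win_start;                  /* models the new data */
	uint32_t win_end;                    /* models the new data_end */
};

static long model_adjust_data(struct model_buff *b, uint32_t offset)
{
	uint32_t base, i;

	assert(b != NULL && b->nr_frags <= MODEL_MAX_FRAGS);
	base = b->headlen;

	/* Same condition as the posted patch. */
	if (!b->is_mb || offset > b->frags_size)
		return -EINVAL;

	if (offset < b->headlen) {
		/* linear area */
		b->win_start = offset;
		b->win_end = b->headlen;
		return 0;
	}

	for (i = 0; i < b->nr_frags; i++) {
		/* paged area */
		uint32_t size = b->frag_size[i];

		if (offset < base + size) {
			b->win_start = offset;
			b->win_end = base + size;
			break;
		}
		base += size;
	}
	return (long)base; /* offset at which the selected fragment starts */
}
```

With headlen = 64 and two fragments of 100 and 200 bytes, an offset of
200 lands in the second fragment, so the model returns 164 (the offset
at which that fragment starts) and the window covers bytes 200 to 364.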
