All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support
@ 2021-08-20 15:40 Lorenzo Bianconi
  2021-08-20 15:40 ` [PATCH v12 bpf-next 01/18] net: skbuff: add size metadata to skb_shared_info for xdp Lorenzo Bianconi
                   ` (18 more replies)
  0 siblings, 19 replies; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=y, Size: 7669 bytes --]

This series introduce XDP multi-buffer support. The mvneta driver is
the first to support these new "non-linear" xdp_{buff,frame}. Reviewers
please focus on how these new types of xdp_{buff,frame} packets
traverse the different layers and the layout design. It is on purpose
that BPF-helpers are kept simple, as we don't want to expose the
internal layout to allow later changes.

The main idea for the new multi-buffer layout is to reuse the same
structure used for non-linear SKB. This rely on the "skb_shared_info"
struct at the end of the first buffer to link together subsequent
buffers. Keeping the layout compatible with SKBs is also done to ease
and speedup creating a SKB from an xdp_{buff,frame}.
Converting xdp_frame to SKB and deliver it to the network stack is shown
in patch 05/18 (e.g. cpumaps).

A multi-buffer bit (mb) has been introduced in the flags field of xdp_{buff,frame}
structure to notify the bpf/network layer if this is a xdp multi-buffer frame
(mb = 1) or not (mb = 0).
The mb bit will be set by a xdp multi-buffer capable driver only for
non-linear frames maintaining the capability to receive linear frames
without any extra cost since the skb_shared_info structure at the end
of the first buffer will be initialized only if mb is set.
Moreover the flags field in xdp_{buff,frame} will be reused even for
xdp rx csum offloading in future series.

Typical use cases for this series are:
- Jumbo-frames
- Packet header split (please see Google’s use-case @ NetDevConf 0x14, [0])
- TSO/GRO

The two following ebpf helpers (and related selftests) has been introduced:
- bpf_xdp_adjust_data:
  Move xdp_md->data and xdp_md->data_end pointers in subsequent fragments
  according to the offset provided by the ebpf program. This helper can be
  used to read/write values in frame payload.
- bpf_xdp_get_buff_len:
  Return the total frame size (linear + paged parts)

bpf_xdp_adjust_tail and bpf_xdp_copy helpers have been modified to take into
account xdp multi-buff frames.

More info about the main idea behind this approach can be found here [1][2].

Changes since v11:
- add missing static to bpf_xdp_get_buff_len_proto structure
- fix bpf_xdp_adjust_data helper when offset is smaller than linear area length.

Changes since v10:
- move xdp->data to the requested payload offset instead of to the beginning of
  the fragment in bpf_xdp_adjust_data()

Changes since v9:
- introduce bpf_xdp_adjust_data helper and related selftest
- add xdp_frags_size and xdp_frags_tsize fields in skb_shared_info
- introduce xdp_update_skb_shared_info utility routine in ordere to not reset
  frags array in skb_shared_info converting from a xdp_buff/xdp_frame to a skb 
- simplify bpf_xdp_copy routine

Changes since v8:
- add proper dma unmapping if XDP_TX fails on mvneta for a xdp multi-buff
- switch back to skb_shared_info implementation from previous xdp_shared_info
  one
- avoid using a bietfield in xdp_buff/xdp_frame since it introduces performance
  regressions. Tested now on 10G NIC (ixgbe) to verify there are no performance
  penalties for regular codebase
- add bpf_xdp_get_buff_len helper and remove frame_length field in xdp ctx
- add data_len field in skb_shared_info struct
- introduce XDP_FLAGS_FRAGS_PF_MEMALLOC flag

Changes since v7:
- rebase on top of bpf-next
- fix sparse warnings
- improve comments for frame_length in include/net/xdp.h

Changes since v6:
- the main difference respect to previous versions is the new approach proposed
  by Eelco to pass full length of the packet to eBPF layer in XDP context
- reintroduce multi-buff support to eBPF kself-tests
- reintroduce multi-buff support to bpf_xdp_adjust_tail helper
- introduce multi-buffer support to bpf_xdp_copy helper
- rebase on top of bpf-next

Changes since v5:
- rebase on top of bpf-next
- initialize mb bit in xdp_init_buff() and drop per-driver initialization
- drop xdp->mb initialization in xdp_convert_zc_to_xdp_frame()
- postpone introduction of frame_length field in XDP ctx to another series
- minor changes

Changes since v4:
- rebase ontop of bpf-next
- introduce xdp_shared_info to build xdp multi-buff instead of using the
  skb_shared_info struct
- introduce frame_length in xdp ctx
- drop previous bpf helpers
- fix bpf_xdp_adjust_tail for xdp multi-buff
- introduce xdp multi-buff self-tests for bpf_xdp_adjust_tail
- fix xdp_return_frame_bulk for xdp multi-buff

Changes since v3:
- rebase ontop of bpf-next
- add patch 10/13 to copy back paged data from a xdp multi-buff frame to
  userspace buffer for xdp multi-buff selftests

Changes since v2:
- add throughput measurements
- drop bpf_xdp_adjust_mb_header bpf helper
- introduce selftest for xdp multibuffer
- addressed comments on bpf_xdp_get_frags_count
- introduce xdp multi-buff support to cpumaps

Changes since v1:
- Fix use-after-free in xdp_return_{buff/frame}
- Introduce bpf helpers
- Introduce xdp_mb sample program
- access skb_shared_info->nr_frags only on the last fragment

Changes since RFC:
- squash multi-buffer bit initialization in a single patch
- add mvneta non-linear XDP buff support for tx side

[0] https://netdevconf.info/0x14/session.html?talk-the-path-to-tcp-4k-mtu-and-rx-zerocopy
[1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org
[2] https://netdevconf.info/0x14/session.html?tutorial-add-XDP-support-to-a-NIC-driver (XDPmulti-buffers section)

Eelco Chaudron (3):
  bpf: add multi-buff support to the bpf_xdp_adjust_tail() API
  bpf: add multi-buffer support to xdp copy helpers
  bpf: update xdp_adjust_tail selftest to include multi-buffer

Lorenzo Bianconi (15):
  net: skbuff: add size metadata to skb_shared_info for xdp
  xdp: introduce flags field in xdp_buff/xdp_frame
  net: mvneta: update mb bit before passing the xdp buffer to eBPF layer
  net: mvneta: simplify mvneta_swbm_add_rx_fragment management
  net: xdp: add xdp_update_skb_shared_info utility routine
  net: marvell: rely on xdp_update_skb_shared_info utility routine
  xdp: add multi-buff support to xdp_return_{buff/frame}
  net: mvneta: add multi buffer support to XDP_TX
  net: mvneta: enable jumbo frames for XDP
  bpf: introduce bpf_xdp_get_buff_len helper
  bpf: move user_size out of bpf_test_init
  bpf: introduce multibuff support to bpf_prog_test_run_xdp()
  bpf: test_run: add xdp_shared_info pointer in bpf_test_finish
    signature
  net: xdp: introduce bpf_xdp_adjust_data helper
  bpf: add bpf_xdp_adjust_data selftest

 drivers/net/ethernet/marvell/mvneta.c         | 204 ++++++++++-------
 include/linux/skbuff.h                        |   6 +-
 include/net/xdp.h                             |  95 +++++++-
 include/uapi/linux/bpf.h                      |  39 ++++
 kernel/trace/bpf_trace.c                      |   3 +
 net/bpf/test_run.c                            | 117 ++++++++--
 net/core/filter.c                             | 213 +++++++++++++++++-
 net/core/xdp.c                                |  76 ++++++-
 tools/include/uapi/linux/bpf.h                |  39 ++++
 .../bpf/prog_tests/xdp_adjust_data.c          |  55 +++++
 .../bpf/prog_tests/xdp_adjust_tail.c          | 118 ++++++++++
 .../selftests/bpf/prog_tests/xdp_bpf2bpf.c    | 151 +++++++++----
 .../bpf/progs/test_xdp_adjust_tail_grow.c     |  10 +-
 .../bpf/progs/test_xdp_adjust_tail_shrink.c   |  32 ++-
 .../selftests/bpf/progs/test_xdp_bpf2bpf.c    |   2 +-
 .../bpf/progs/test_xdp_update_frags.c         |  41 ++++
 16 files changed, 1036 insertions(+), 165 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_adjust_data.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_update_frags.c

-- 
2.31.1


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 01/18] net: skbuff: add size metadata to skb_shared_info for xdp
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-08-31 23:13   ` John Fastabend
  2021-08-20 15:40 ` [PATCH v12 bpf-next 02/18] xdp: introduce flags field in xdp_buff/xdp_frame Lorenzo Bianconi
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Introduce xdp_frags_tsize field in skb_shared_info data structure
to store xdp_buff/xdp_frame truesize (xdp_frags_tsize will be used
in xdp multi-buff support). In order to not increase skb_shared_info
size we will use a hole due to skb_shared_info alignment.
Introduce xdp_frags_size field in skb_shared_info data structure
reusing gso_type field in order to store xdp_buff/xdp_frame paged size.
xdp_frags_size will be used in xdp multi-buff support.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 include/linux/skbuff.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6bdb0db3e825..1abeba7ef82e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -522,13 +522,17 @@ struct skb_shared_info {
 	unsigned short	gso_segs;
 	struct sk_buff	*frag_list;
 	struct skb_shared_hwtstamps hwtstamps;
-	unsigned int	gso_type;
+	union {
+		unsigned int	gso_type;
+		unsigned int	xdp_frags_size;
+	};
 	u32		tskey;
 
 	/*
 	 * Warning : all fields before dataref are cleared in __alloc_skb()
 	 */
 	atomic_t	dataref;
+	unsigned int	xdp_frags_tsize;
 
 	/* Intermediate layers must ensure that destructor_arg
 	 * remains valid until skb destructor */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 02/18] xdp: introduce flags field in xdp_buff/xdp_frame
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
  2021-08-20 15:40 ` [PATCH v12 bpf-next 01/18] net: skbuff: add size metadata to skb_shared_info for xdp Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-08-31 23:15   ` John Fastabend
  2021-08-20 15:40 ` [PATCH v12 bpf-next 03/18] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer Lorenzo Bianconi
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Introduce flags field in xdp_frame and xdp_buffer data structures
to define additional buffer features. At the moment the only
supported buffer feature is multi-buffer bit (mb). Multi-buffer bit
is used to specify if this is a linear buffer (mb = 0) or a multi-buffer
frame (mb = 1). In the latter case the driver is expected to initialize
the skb_shared_info structure at the end of the first buffer to link
together subsequent buffers belonging to the same frame.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 include/net/xdp.h | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index ad5b02dcb6f4..ed5ea784fd45 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -66,6 +66,10 @@ struct xdp_txq_info {
 	struct net_device *dev;
 };
 
+enum xdp_buff_flags {
+	XDP_FLAGS_MULTI_BUFF	= BIT(0), /* non-linear xdp buff */
+};
+
 struct xdp_buff {
 	void *data;
 	void *data_end;
@@ -74,13 +78,30 @@ struct xdp_buff {
 	struct xdp_rxq_info *rxq;
 	struct xdp_txq_info *txq;
 	u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom*/
+	u16 flags; /* supported values defined in xdp_flags */
 };
 
+static __always_inline bool xdp_buff_is_mb(struct xdp_buff *xdp)
+{
+	return !!(xdp->flags & XDP_FLAGS_MULTI_BUFF);
+}
+
+static __always_inline void xdp_buff_set_mb(struct xdp_buff *xdp)
+{
+	xdp->flags |= XDP_FLAGS_MULTI_BUFF;
+}
+
+static __always_inline void xdp_buff_clear_mb(struct xdp_buff *xdp)
+{
+	xdp->flags &= ~XDP_FLAGS_MULTI_BUFF;
+}
+
 static __always_inline void
 xdp_init_buff(struct xdp_buff *xdp, u32 frame_sz, struct xdp_rxq_info *rxq)
 {
 	xdp->frame_sz = frame_sz;
 	xdp->rxq = rxq;
+	xdp->flags = 0;
 }
 
 static __always_inline void
@@ -122,8 +143,14 @@ struct xdp_frame {
 	 */
 	struct xdp_mem_info mem;
 	struct net_device *dev_rx; /* used by cpumap */
+	u16 flags; /* supported values defined in xdp_flags */
 };
 
+static __always_inline bool xdp_frame_is_mb(struct xdp_frame *frame)
+{
+	return !!(frame->flags & XDP_FLAGS_MULTI_BUFF);
+}
+
 #define XDP_BULK_QUEUE_SIZE	16
 struct xdp_frame_bulk {
 	int count;
@@ -180,6 +207,7 @@ void xdp_convert_frame_to_buff(struct xdp_frame *frame, struct xdp_buff *xdp)
 	xdp->data_end = frame->data + frame->len;
 	xdp->data_meta = frame->data - frame->metasize;
 	xdp->frame_sz = frame->frame_sz;
+	xdp->flags = frame->flags;
 }
 
 static inline
@@ -206,6 +234,7 @@ int xdp_update_frame_from_buff(struct xdp_buff *xdp,
 	xdp_frame->headroom = headroom - sizeof(*xdp_frame);
 	xdp_frame->metasize = metasize;
 	xdp_frame->frame_sz = xdp->frame_sz;
+	xdp_frame->flags = xdp->flags;
 
 	return 0;
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 03/18] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
  2021-08-20 15:40 ` [PATCH v12 bpf-next 01/18] net: skbuff: add size metadata to skb_shared_info for xdp Lorenzo Bianconi
  2021-08-20 15:40 ` [PATCH v12 bpf-next 02/18] xdp: introduce flags field in xdp_buff/xdp_frame Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-08-31 23:23   ` John Fastabend
  2021-08-20 15:40 ` [PATCH v12 bpf-next 04/18] net: mvneta: simplify mvneta_swbm_add_rx_fragment management Lorenzo Bianconi
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Update multi-buffer bit (mb) in xdp_buff to notify XDP/eBPF layer and
XDP remote drivers if this is a "non-linear" XDP buffer. Access
skb_shared_info only if xdp_buff mb is set in order to avoid possible
cache-misses.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 5d1007e1b5c9..9f4858e35566 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -2037,9 +2037,14 @@ mvneta_xdp_put_buff(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 {
 	int i;
 
+	if (likely(!xdp_buff_is_mb(xdp)))
+		goto out;
+
 	for (i = 0; i < sinfo->nr_frags; i++)
 		page_pool_put_full_page(rxq->page_pool,
 					skb_frag_page(&sinfo->frags[i]), true);
+
+out:
 	page_pool_put_page(rxq->page_pool, virt_to_head_page(xdp->data),
 			   sync_len, true);
 }
@@ -2241,7 +2246,6 @@ mvneta_swbm_rx_frame(struct mvneta_port *pp,
 	int data_len = -MVNETA_MH_SIZE, len;
 	struct net_device *dev = pp->dev;
 	enum dma_data_direction dma_dir;
-	struct skb_shared_info *sinfo;
 
 	if (*size > MVNETA_MAX_RX_BUF_SIZE) {
 		len = MVNETA_MAX_RX_BUF_SIZE;
@@ -2261,11 +2265,9 @@ mvneta_swbm_rx_frame(struct mvneta_port *pp,
 
 	/* Prefetch header */
 	prefetch(data);
+	xdp_buff_clear_mb(xdp);
 	xdp_prepare_buff(xdp, data, pp->rx_offset_correction + MVNETA_MH_SIZE,
 			 data_len, false);
-
-	sinfo = xdp_get_shared_info_from_buff(xdp);
-	sinfo->nr_frags = 0;
 }
 
 static void
@@ -2299,6 +2301,9 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
 		skb_frag_off_set(frag, pp->rx_offset_correction);
 		skb_frag_size_set(frag, data_len);
 		__skb_frag_set_page(frag, page);
+
+		if (!xdp_buff_is_mb(xdp))
+			xdp_buff_set_mb(xdp);
 	} else {
 		page_pool_put_full_page(rxq->page_pool, page, true);
 	}
@@ -2320,8 +2325,12 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
 		      struct xdp_buff *xdp, u32 desc_status)
 {
 	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
-	int i, num_frags = sinfo->nr_frags;
 	struct sk_buff *skb;
+	u8 num_frags;
+	int i;
+
+	if (unlikely(xdp_buff_is_mb(xdp)))
+		num_frags = sinfo->nr_frags;
 
 	skb = build_skb(xdp->data_hard_start, PAGE_SIZE);
 	if (!skb)
@@ -2333,6 +2342,9 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
 	skb_put(skb, xdp->data_end - xdp->data);
 	skb->ip_summed = mvneta_rx_csum(pp, desc_status);
 
+	if (likely(!xdp_buff_is_mb(xdp)))
+		goto out;
+
 	for (i = 0; i < num_frags; i++) {
 		skb_frag_t *frag = &sinfo->frags[i];
 
@@ -2341,6 +2353,7 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
 				skb_frag_size(frag), PAGE_SIZE);
 	}
 
+out:
 	return skb;
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 04/18] net: mvneta: simplify mvneta_swbm_add_rx_fragment management
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (2 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 03/18] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-08-31 23:34   ` John Fastabend
  2021-08-20 15:40 ` [PATCH v12 bpf-next 05/18] net: xdp: add xdp_update_skb_shared_info utility routine Lorenzo Bianconi
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Relying on xdp mb bit, remove skb_shared_info structure allocated on the
stack in mvneta_rx_swbm routine and simplify mvneta_swbm_add_rx_fragment
accessing skb_shared_info in the xdp_buff structure directly. There is no
performance penalty in this approach since mvneta_swbm_add_rx_fragment
is run just for multi-buff use-case.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 42 ++++++++++-----------------
 1 file changed, 15 insertions(+), 27 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 9f4858e35566..cbf614d6b993 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -2032,9 +2032,9 @@ int mvneta_rx_refill_queue(struct mvneta_port *pp, struct mvneta_rx_queue *rxq)
 
 static void
 mvneta_xdp_put_buff(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
-		    struct xdp_buff *xdp, struct skb_shared_info *sinfo,
-		    int sync_len)
+		    struct xdp_buff *xdp, int sync_len)
 {
+	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
 	int i;
 
 	if (likely(!xdp_buff_is_mb(xdp)))
@@ -2182,7 +2182,6 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 	       struct bpf_prog *prog, struct xdp_buff *xdp,
 	       u32 frame_sz, struct mvneta_stats *stats)
 {
-	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
 	unsigned int len, data_len, sync;
 	u32 ret, act;
 
@@ -2203,7 +2202,7 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 
 		err = xdp_do_redirect(pp->dev, xdp, prog);
 		if (unlikely(err)) {
-			mvneta_xdp_put_buff(pp, rxq, xdp, sinfo, sync);
+			mvneta_xdp_put_buff(pp, rxq, xdp, sync);
 			ret = MVNETA_XDP_DROPPED;
 		} else {
 			ret = MVNETA_XDP_REDIR;
@@ -2214,7 +2213,7 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 	case XDP_TX:
 		ret = mvneta_xdp_xmit_back(pp, xdp);
 		if (ret != MVNETA_XDP_TX)
-			mvneta_xdp_put_buff(pp, rxq, xdp, sinfo, sync);
+			mvneta_xdp_put_buff(pp, rxq, xdp, sync);
 		break;
 	default:
 		bpf_warn_invalid_xdp_action(act);
@@ -2223,7 +2222,7 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 		trace_xdp_exception(pp->dev, prog, act);
 		fallthrough;
 	case XDP_DROP:
-		mvneta_xdp_put_buff(pp, rxq, xdp, sinfo, sync);
+		mvneta_xdp_put_buff(pp, rxq, xdp, sync);
 		ret = MVNETA_XDP_DROPPED;
 		stats->xdp_drop++;
 		break;
@@ -2275,9 +2274,9 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
 			    struct mvneta_rx_desc *rx_desc,
 			    struct mvneta_rx_queue *rxq,
 			    struct xdp_buff *xdp, int *size,
-			    struct skb_shared_info *xdp_sinfo,
 			    struct page *page)
 {
+	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
 	struct net_device *dev = pp->dev;
 	enum dma_data_direction dma_dir;
 	int data_len, len;
@@ -2295,8 +2294,11 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
 				len, dma_dir);
 	rx_desc->buf_phys_addr = 0;
 
-	if (data_len > 0 && xdp_sinfo->nr_frags < MAX_SKB_FRAGS) {
-		skb_frag_t *frag = &xdp_sinfo->frags[xdp_sinfo->nr_frags++];
+	if (!xdp_buff_is_mb(xdp))
+		sinfo->nr_frags = 0;
+
+	if (data_len > 0 && sinfo->nr_frags < MAX_SKB_FRAGS) {
+		skb_frag_t *frag = &sinfo->frags[sinfo->nr_frags++];
 
 		skb_frag_off_set(frag, pp->rx_offset_correction);
 		skb_frag_size_set(frag, data_len);
@@ -2307,16 +2309,6 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
 	} else {
 		page_pool_put_full_page(rxq->page_pool, page, true);
 	}
-
-	/* last fragment */
-	if (len == *size) {
-		struct skb_shared_info *sinfo;
-
-		sinfo = xdp_get_shared_info_from_buff(xdp);
-		sinfo->nr_frags = xdp_sinfo->nr_frags;
-		memcpy(sinfo->frags, xdp_sinfo->frags,
-		       sinfo->nr_frags * sizeof(skb_frag_t));
-	}
 	*size -= len;
 }
 
@@ -2364,7 +2356,6 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 {
 	int rx_proc = 0, rx_todo, refill, size = 0;
 	struct net_device *dev = pp->dev;
-	struct skb_shared_info sinfo;
 	struct mvneta_stats ps = {};
 	struct bpf_prog *xdp_prog;
 	u32 desc_status, frame_sz;
@@ -2373,8 +2364,6 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 	xdp_init_buff(&xdp_buf, PAGE_SIZE, &rxq->xdp_rxq);
 	xdp_buf.data_hard_start = NULL;
 
-	sinfo.nr_frags = 0;
-
 	/* Get number of received packets */
 	rx_todo = mvneta_rxq_busy_desc_num_get(pp, rxq);
 
@@ -2416,7 +2405,7 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 			}
 
 			mvneta_swbm_add_rx_fragment(pp, rx_desc, rxq, &xdp_buf,
-						    &size, &sinfo, page);
+						    &size, page);
 		} /* Middle or Last descriptor */
 
 		if (!(rx_status & MVNETA_RXD_LAST_DESC))
@@ -2424,7 +2413,7 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 			continue;
 
 		if (size) {
-			mvneta_xdp_put_buff(pp, rxq, &xdp_buf, &sinfo, -1);
+			mvneta_xdp_put_buff(pp, rxq, &xdp_buf, -1);
 			goto next;
 		}
 
@@ -2436,7 +2425,7 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 		if (IS_ERR(skb)) {
 			struct mvneta_pcpu_stats *stats = this_cpu_ptr(pp->stats);
 
-			mvneta_xdp_put_buff(pp, rxq, &xdp_buf, &sinfo, -1);
+			mvneta_xdp_put_buff(pp, rxq, &xdp_buf, -1);
 
 			u64_stats_update_begin(&stats->syncp);
 			stats->es.skb_alloc_error++;
@@ -2453,11 +2442,10 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 		napi_gro_receive(napi, skb);
 next:
 		xdp_buf.data_hard_start = NULL;
-		sinfo.nr_frags = 0;
 	}
 
 	if (xdp_buf.data_hard_start)
-		mvneta_xdp_put_buff(pp, rxq, &xdp_buf, &sinfo, -1);
+		mvneta_xdp_put_buff(pp, rxq, &xdp_buf, -1);
 
 	if (ps.xdp_redirect)
 		xdp_do_flush_map();
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 05/18] net: xdp: add xdp_update_skb_shared_info utility routine
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (3 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 04/18] net: mvneta: simplify mvneta_swbm_add_rx_fragment management Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-08-31 23:38   ` John Fastabend
  2021-08-20 15:40 ` [PATCH v12 bpf-next 06/18] net: marvell: rely on " Lorenzo Bianconi
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Introduce xdp_update_skb_shared_info routine to update frags array
metadata in skb_shared_info data structure converting to a skb from
a xdp_buff or xdp_frame.
According to the current skb_shared_info architecture in
xdp_frame/xdp_buff and to the xdp multi-buff support, there is
no need to run skb_add_rx_frag() and reset frags array converting the buffer
to a skb since the frag array will be in the same position for xdp_buff/xdp_frame
and for the skb, we just need to update memory metadata.
Introduce XDP_FLAGS_PF_MEMALLOC flag in xdp_buff_flags in order to mark
the xdp_buff or xdp_frame as under memory-pressure if pages of the frags array
are under memory pressure. Doing so we can avoid looping over all fragments in
xdp_update_skb_shared_info routine. The driver is expected to set the
flag constructing the xdp_buffer using xdp_buff_set_frag_pfmemalloc
utility routine.
Rely on xdp_update_skb_shared_info in __xdp_build_skb_from_frame routine
converting the multi-buff xdp_frame to a skb after performing a XDP_REDIRECT.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 include/net/xdp.h | 33 ++++++++++++++++++++++++++++++++-
 net/core/xdp.c    | 17 +++++++++++++++++
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index ed5ea784fd45..53cccdc9528c 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -67,7 +67,10 @@ struct xdp_txq_info {
 };
 
 enum xdp_buff_flags {
-	XDP_FLAGS_MULTI_BUFF	= BIT(0), /* non-linear xdp buff */
+	XDP_FLAGS_MULTI_BUFF		= BIT(0), /* non-linear xdp buff */
+	XDP_FLAGS_FRAGS_PF_MEMALLOC	= BIT(1), /* xdp multi-buff paged memory
+						   * is under pressure
+						   */
 };
 
 struct xdp_buff {
@@ -96,6 +99,16 @@ static __always_inline void xdp_buff_clear_mb(struct xdp_buff *xdp)
 	xdp->flags &= ~XDP_FLAGS_MULTI_BUFF;
 }
 
+static __always_inline bool xdp_buff_is_frag_pfmemalloc(struct xdp_buff *xdp)
+{
+	return !!(xdp->flags & XDP_FLAGS_FRAGS_PF_MEMALLOC);
+}
+
+static __always_inline void xdp_buff_set_frag_pfmemalloc(struct xdp_buff *xdp)
+{
+	xdp->flags |= XDP_FLAGS_FRAGS_PF_MEMALLOC;
+}
+
 static __always_inline void
 xdp_init_buff(struct xdp_buff *xdp, u32 frame_sz, struct xdp_rxq_info *rxq)
 {
@@ -151,6 +164,11 @@ static __always_inline bool xdp_frame_is_mb(struct xdp_frame *frame)
 	return !!(frame->flags & XDP_FLAGS_MULTI_BUFF);
 }
 
+static __always_inline bool xdp_frame_is_frag_pfmemalloc(struct xdp_frame *frame)
+{
+	return !!(frame->flags & XDP_FLAGS_FRAGS_PF_MEMALLOC);
+}
+
 #define XDP_BULK_QUEUE_SIZE	16
 struct xdp_frame_bulk {
 	int count;
@@ -186,6 +204,19 @@ static inline void xdp_scrub_frame(struct xdp_frame *frame)
 	frame->dev_rx = NULL;
 }
 
+static inline void
+xdp_update_skb_shared_info(struct sk_buff *skb, u8 nr_frags,
+			   unsigned int size, unsigned int truesize,
+			   bool pfmemalloc)
+{
+	skb_shinfo(skb)->nr_frags = nr_frags;
+
+	skb->len += size;
+	skb->data_len += size;
+	skb->truesize += truesize;
+	skb->pfmemalloc |= pfmemalloc;
+}
+
 /* Avoids inlining WARN macro in fast-path */
 void xdp_warn(const char *msg, const char *func, const int line);
 #define XDP_WARN(msg) xdp_warn(msg, __func__, __LINE__)
diff --git a/net/core/xdp.c b/net/core/xdp.c
index cc92ccb38432..504be3ce3ca9 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -531,8 +531,20 @@ struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 					   struct sk_buff *skb,
 					   struct net_device *dev)
 {
+	unsigned int frag_size, frag_tsize;
 	unsigned int headroom, frame_size;
 	void *hard_start;
+	u8 nr_frags;
+
+	/* xdp multi-buff frame */
+	if (unlikely(xdp_frame_is_mb(xdpf))) {
+		struct skb_shared_info *sinfo;
+
+		sinfo = xdp_get_shared_info_from_frame(xdpf);
+		frag_tsize = sinfo->xdp_frags_tsize;
+		frag_size = sinfo->xdp_frags_size;
+		nr_frags = sinfo->nr_frags;
+	}
 
 	/* Part of headroom was reserved to xdpf */
 	headroom = sizeof(*xdpf) + xdpf->headroom;
@@ -552,6 +564,11 @@ struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 	if (xdpf->metasize)
 		skb_metadata_set(skb, xdpf->metasize);
 
+	if (unlikely(xdp_frame_is_mb(xdpf)))
+		xdp_update_skb_shared_info(skb, nr_frags,
+					   frag_size, frag_tsize,
+					   xdp_frame_is_frag_pfmemalloc(xdpf));
+
 	/* Essential SKB info: protocol and skb->dev */
 	skb->protocol = eth_type_trans(skb, dev);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 06/18] net: marvell: rely on xdp_update_skb_shared_info utility routine
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (4 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 05/18] net: xdp: add xdp_update_skb_shared_info utility routine Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-08-31 23:41   ` John Fastabend
  2021-08-20 15:40 ` [PATCH v12 bpf-next 07/18] xdp: add multi-buff support to xdp_return_{buff/frame} Lorenzo Bianconi
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Rely on xdp_update_skb_shared_info routine in order to avoid
resetting frags array in skb_shared_info structure building
the skb in mvneta_swbm_build_skb(). Frags array is expected to
be initialized by the receiving driver building the xdp_buff
and here we just need to update memory metadata.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 35 +++++++++++++++------------
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index cbf614d6b993..b996eb49d813 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -2304,11 +2304,19 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
 		skb_frag_size_set(frag, data_len);
 		__skb_frag_set_page(frag, page);
 
-		if (!xdp_buff_is_mb(xdp))
+		if (!xdp_buff_is_mb(xdp)) {
+			sinfo->xdp_frags_size = *size;
 			xdp_buff_set_mb(xdp);
+		}
+		if (page_is_pfmemalloc(page))
+			xdp_buff_set_frag_pfmemalloc(xdp);
 	} else {
 		page_pool_put_full_page(rxq->page_pool, page, true);
 	}
+
+	/* last fragment */
+	if (len == *size)
+		sinfo->xdp_frags_tsize = sinfo->nr_frags * PAGE_SIZE;
 	*size -= len;
 }
 
@@ -2316,13 +2324,18 @@ static struct sk_buff *
 mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
 		      struct xdp_buff *xdp, u32 desc_status)
 {
-	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
+	unsigned int size, truesize;
 	struct sk_buff *skb;
 	u8 num_frags;
-	int i;
 
-	if (unlikely(xdp_buff_is_mb(xdp)))
+	if (unlikely(xdp_buff_is_mb(xdp))) {
+		struct skb_shared_info *sinfo;
+
+		sinfo = xdp_get_shared_info_from_buff(xdp);
+		truesize = sinfo->xdp_frags_tsize;
+		size = sinfo->xdp_frags_size;
 		num_frags = sinfo->nr_frags;
+	}
 
 	skb = build_skb(xdp->data_hard_start, PAGE_SIZE);
 	if (!skb)
@@ -2334,18 +2347,10 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
 	skb_put(skb, xdp->data_end - xdp->data);
 	skb->ip_summed = mvneta_rx_csum(pp, desc_status);
 
-	if (likely(!xdp_buff_is_mb(xdp)))
-		goto out;
-
-	for (i = 0; i < num_frags; i++) {
-		skb_frag_t *frag = &sinfo->frags[i];
-
-		skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
-				skb_frag_page(frag), skb_frag_off(frag),
-				skb_frag_size(frag), PAGE_SIZE);
-	}
+	if (unlikely(xdp_buff_is_mb(xdp)))
+		xdp_update_skb_shared_info(skb, num_frags, size, truesize,
+					   xdp_buff_is_frag_pfmemalloc(xdp));
 
-out:
 	return skb;
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 07/18] xdp: add multi-buff support to xdp_return_{buff/frame}
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (5 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 06/18] net: marvell: rely on " Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-08-31 23:43   ` John Fastabend
  2021-08-20 15:40 ` [PATCH v12 bpf-next 08/18] net: mvneta: add multi buffer support to XDP_TX Lorenzo Bianconi
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Take into account if the received xdp_buff/xdp_frame is non-linear
recycling/returning the frame memory to the allocator or into
xdp_frame_bulk.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 include/net/xdp.h | 18 ++++++++++++++--
 net/core/xdp.c    | 54 ++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 69 insertions(+), 3 deletions(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index 53cccdc9528c..d66e9877d773 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -306,10 +306,24 @@ void __xdp_release_frame(void *data, struct xdp_mem_info *mem);
 static inline void xdp_release_frame(struct xdp_frame *xdpf)
 {
 	struct xdp_mem_info *mem = &xdpf->mem;
+	struct skb_shared_info *sinfo;
+	int i;
 
 	/* Curr only page_pool needs this */
-	if (mem->type == MEM_TYPE_PAGE_POOL)
-		__xdp_release_frame(xdpf->data, mem);
+	if (mem->type != MEM_TYPE_PAGE_POOL)
+		return;
+
+	if (likely(!xdp_frame_is_mb(xdpf)))
+		goto out;
+
+	sinfo = xdp_get_shared_info_from_frame(xdpf);
+	for (i = 0; i < sinfo->nr_frags; i++) {
+		struct page *page = skb_frag_page(&sinfo->frags[i]);
+
+		__xdp_release_frame(page_address(page), mem);
+	}
+out:
+	__xdp_release_frame(xdpf->data, mem);
 }
 
 int xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq,
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 504be3ce3ca9..1346fb8b3f50 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -376,12 +376,38 @@ static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct,
 
 void xdp_return_frame(struct xdp_frame *xdpf)
 {
+	struct skb_shared_info *sinfo;
+	int i;
+
+	if (likely(!xdp_frame_is_mb(xdpf)))
+		goto out;
+
+	sinfo = xdp_get_shared_info_from_frame(xdpf);
+	for (i = 0; i < sinfo->nr_frags; i++) {
+		struct page *page = skb_frag_page(&sinfo->frags[i]);
+
+		__xdp_return(page_address(page), &xdpf->mem, false, NULL);
+	}
+out:
 	__xdp_return(xdpf->data, &xdpf->mem, false, NULL);
 }
 EXPORT_SYMBOL_GPL(xdp_return_frame);
 
 void xdp_return_frame_rx_napi(struct xdp_frame *xdpf)
 {
+	struct skb_shared_info *sinfo;
+	int i;
+
+	if (likely(!xdp_frame_is_mb(xdpf)))
+		goto out;
+
+	sinfo = xdp_get_shared_info_from_frame(xdpf);
+	for (i = 0; i < sinfo->nr_frags; i++) {
+		struct page *page = skb_frag_page(&sinfo->frags[i]);
+
+		__xdp_return(page_address(page), &xdpf->mem, true, NULL);
+	}
+out:
 	__xdp_return(xdpf->data, &xdpf->mem, true, NULL);
 }
 EXPORT_SYMBOL_GPL(xdp_return_frame_rx_napi);
@@ -417,7 +443,7 @@ void xdp_return_frame_bulk(struct xdp_frame *xdpf,
 	struct xdp_mem_allocator *xa;
 
 	if (mem->type != MEM_TYPE_PAGE_POOL) {
-		__xdp_return(xdpf->data, &xdpf->mem, false, NULL);
+		xdp_return_frame(xdpf);
 		return;
 	}
 
@@ -436,12 +462,38 @@ void xdp_return_frame_bulk(struct xdp_frame *xdpf,
 		bq->xa = rhashtable_lookup(mem_id_ht, &mem->id, mem_id_rht_params);
 	}
 
+	if (unlikely(xdp_frame_is_mb(xdpf))) {
+		struct skb_shared_info *sinfo;
+		int i;
+
+		sinfo = xdp_get_shared_info_from_frame(xdpf);
+		for (i = 0; i < sinfo->nr_frags; i++) {
+			skb_frag_t *frag = &sinfo->frags[i];
+
+			bq->q[bq->count++] = skb_frag_address(frag);
+			if (bq->count == XDP_BULK_QUEUE_SIZE)
+				xdp_flush_frame_bulk(bq);
+		}
+	}
 	bq->q[bq->count++] = xdpf->data;
 }
 EXPORT_SYMBOL_GPL(xdp_return_frame_bulk);
 
 void xdp_return_buff(struct xdp_buff *xdp)
 {
+	struct skb_shared_info *sinfo;
+	int i;
+
+	if (likely(!xdp_buff_is_mb(xdp)))
+		goto out;
+
+	sinfo = xdp_get_shared_info_from_buff(xdp);
+	for (i = 0; i < sinfo->nr_frags; i++) {
+		struct page *page = skb_frag_page(&sinfo->frags[i]);
+
+		__xdp_return(page_address(page), &xdp->rxq->mem, true, xdp);
+	}
+out:
 	__xdp_return(xdp->data, &xdp->rxq->mem, true, xdp);
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 08/18] net: mvneta: add multi buffer support to XDP_TX
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (6 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 07/18] xdp: add multi-buff support to xdp_return_{buff/frame} Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-08-31 23:44   ` John Fastabend
  2021-08-20 15:40 ` [PATCH v12 bpf-next 09/18] net: mvneta: enable jumbo frames for XDP Lorenzo Bianconi
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Introduce the capability to map non-linear xdp buffer running
mvneta_xdp_submit_frame() for XDP_TX and XDP_REDIRECT

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 112 +++++++++++++++++---------
 1 file changed, 76 insertions(+), 36 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index b996eb49d813..dde1c28b0ea8 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -1856,8 +1856,8 @@ static void mvneta_txq_bufs_free(struct mvneta_port *pp,
 			bytes_compl += buf->skb->len;
 			pkts_compl++;
 			dev_kfree_skb_any(buf->skb);
-		} else if (buf->type == MVNETA_TYPE_XDP_TX ||
-			   buf->type == MVNETA_TYPE_XDP_NDO) {
+		} else if ((buf->type == MVNETA_TYPE_XDP_TX ||
+			    buf->type == MVNETA_TYPE_XDP_NDO) && buf->xdpf) {
 			if (napi && buf->type == MVNETA_TYPE_XDP_TX)
 				xdp_return_frame_rx_napi(buf->xdpf);
 			else
@@ -2051,47 +2051,87 @@ mvneta_xdp_put_buff(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 
 static int
 mvneta_xdp_submit_frame(struct mvneta_port *pp, struct mvneta_tx_queue *txq,
-			struct xdp_frame *xdpf, bool dma_map)
+			struct xdp_frame *xdpf, int *nxmit_byte, bool dma_map)
 {
-	struct mvneta_tx_desc *tx_desc;
-	struct mvneta_tx_buf *buf;
-	dma_addr_t dma_addr;
+	struct skb_shared_info *sinfo = xdp_get_shared_info_from_frame(xdpf);
+	struct device *dev = pp->dev->dev.parent;
+	struct mvneta_tx_desc *tx_desc = NULL;
+	int i, num_frames = 1;
+	struct page *page;
+
+	if (unlikely(xdp_frame_is_mb(xdpf)))
+		num_frames += sinfo->nr_frags;
 
-	if (txq->count >= txq->tx_stop_threshold)
+	if (txq->count + num_frames >= txq->size)
 		return MVNETA_XDP_DROPPED;
 
-	tx_desc = mvneta_txq_next_desc_get(txq);
+	for (i = 0; i < num_frames; i++) {
+		struct mvneta_tx_buf *buf = &txq->buf[txq->txq_put_index];
+		skb_frag_t *frag = NULL;
+		int len = xdpf->len;
+		dma_addr_t dma_addr;
 
-	buf = &txq->buf[txq->txq_put_index];
-	if (dma_map) {
-		/* ndo_xdp_xmit */
-		dma_addr = dma_map_single(pp->dev->dev.parent, xdpf->data,
-					  xdpf->len, DMA_TO_DEVICE);
-		if (dma_mapping_error(pp->dev->dev.parent, dma_addr)) {
-			mvneta_txq_desc_put(txq);
-			return MVNETA_XDP_DROPPED;
+		if (unlikely(i)) { /* paged area */
+			frag = &sinfo->frags[i - 1];
+			len = skb_frag_size(frag);
 		}
-		buf->type = MVNETA_TYPE_XDP_NDO;
-	} else {
-		struct page *page = virt_to_page(xdpf->data);
 
-		dma_addr = page_pool_get_dma_addr(page) +
-			   sizeof(*xdpf) + xdpf->headroom;
-		dma_sync_single_for_device(pp->dev->dev.parent, dma_addr,
-					   xdpf->len, DMA_BIDIRECTIONAL);
-		buf->type = MVNETA_TYPE_XDP_TX;
+		tx_desc = mvneta_txq_next_desc_get(txq);
+		if (dma_map) {
+			/* ndo_xdp_xmit */
+			void *data;
+
+			data = unlikely(frag) ? skb_frag_address(frag)
+					      : xdpf->data;
+			dma_addr = dma_map_single(dev, data, len,
+						  DMA_TO_DEVICE);
+			if (dma_mapping_error(dev, dma_addr)) {
+				mvneta_txq_desc_put(txq);
+				goto unmap;
+			}
+
+			buf->type = MVNETA_TYPE_XDP_NDO;
+		} else {
+			page = unlikely(frag) ? skb_frag_page(frag)
+					      : virt_to_page(xdpf->data);
+			dma_addr = page_pool_get_dma_addr(page);
+			if (unlikely(frag))
+				dma_addr += skb_frag_off(frag);
+			else
+				dma_addr += sizeof(*xdpf) + xdpf->headroom;
+			dma_sync_single_for_device(dev, dma_addr, len,
+						   DMA_BIDIRECTIONAL);
+			buf->type = MVNETA_TYPE_XDP_TX;
+		}
+		buf->xdpf = unlikely(i) ? NULL : xdpf;
+
+		tx_desc->command = unlikely(i) ? 0 : MVNETA_TXD_F_DESC;
+		tx_desc->buf_phys_addr = dma_addr;
+		tx_desc->data_size = len;
+		*nxmit_byte += len;
+
+		mvneta_txq_inc_put(txq);
 	}
-	buf->xdpf = xdpf;
 
-	tx_desc->command = MVNETA_TXD_FLZ_DESC;
-	tx_desc->buf_phys_addr = dma_addr;
-	tx_desc->data_size = xdpf->len;
+	/*last descriptor */
+	if (likely(tx_desc))
+		tx_desc->command |= MVNETA_TXD_L_DESC | MVNETA_TXD_Z_PAD;
 
-	mvneta_txq_inc_put(txq);
-	txq->pending++;
-	txq->count++;
+	txq->pending += num_frames;
+	txq->count += num_frames;
 
 	return MVNETA_XDP_TX;
+
+unmap:
+	for (i--; i >= 0; i--) {
+		mvneta_txq_desc_put(txq);
+		tx_desc = txq->descs + txq->next_desc_to_proc;
+		dma_unmap_single(dev, tx_desc->buf_phys_addr,
+				 tx_desc->data_size,
+				 DMA_TO_DEVICE);
+	}
+
+	return MVNETA_XDP_DROPPED;
 }
 
 static int
@@ -2100,8 +2140,8 @@ mvneta_xdp_xmit_back(struct mvneta_port *pp, struct xdp_buff *xdp)
 	struct mvneta_pcpu_stats *stats = this_cpu_ptr(pp->stats);
 	struct mvneta_tx_queue *txq;
 	struct netdev_queue *nq;
+	int cpu, nxmit_byte = 0;
 	struct xdp_frame *xdpf;
-	int cpu;
 	u32 ret;
 
 	xdpf = xdp_convert_buff_to_frame(xdp);
@@ -2113,10 +2153,10 @@ mvneta_xdp_xmit_back(struct mvneta_port *pp, struct xdp_buff *xdp)
 	nq = netdev_get_tx_queue(pp->dev, txq->id);
 
 	__netif_tx_lock(nq, cpu);
-	ret = mvneta_xdp_submit_frame(pp, txq, xdpf, false);
+	ret = mvneta_xdp_submit_frame(pp, txq, xdpf, &nxmit_byte, false);
 	if (ret == MVNETA_XDP_TX) {
 		u64_stats_update_begin(&stats->syncp);
-		stats->es.ps.tx_bytes += xdpf->len;
+		stats->es.ps.tx_bytes += nxmit_byte;
 		stats->es.ps.tx_packets++;
 		stats->es.ps.xdp_tx++;
 		u64_stats_update_end(&stats->syncp);
@@ -2155,11 +2195,11 @@ mvneta_xdp_xmit(struct net_device *dev, int num_frame,
 
 	__netif_tx_lock(nq, cpu);
 	for (i = 0; i < num_frame; i++) {
-		ret = mvneta_xdp_submit_frame(pp, txq, frames[i], true);
+		ret = mvneta_xdp_submit_frame(pp, txq, frames[i], &nxmit_byte,
+					      true);
 		if (ret != MVNETA_XDP_TX)
 			break;
 
-		nxmit_byte += frames[i]->len;
 		nxmit++;
 	}
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 09/18] net: mvneta: enable jumbo frames for XDP
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (7 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 08/18] net: mvneta: add multi buffer support to XDP_TX Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-08-31 23:45   ` John Fastabend
  2021-08-20 15:40 ` [PATCH v12 bpf-next 10/18] bpf: add multi-buff support to the bpf_xdp_adjust_tail() API Lorenzo Bianconi
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Enable the capability to receive jumbo frames even if the interface is
running in XDP mode

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index dde1c28b0ea8..3dbcb067346c 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -3767,11 +3767,6 @@ static int mvneta_change_mtu(struct net_device *dev, int mtu)
 		mtu = ALIGN(MVNETA_RX_PKT_SIZE(mtu), 8);
 	}
 
-	if (pp->xdp_prog && mtu > MVNETA_MAX_RX_BUF_SIZE) {
-		netdev_info(dev, "Illegal MTU value %d for XDP mode\n", mtu);
-		return -EINVAL;
-	}
-
 	dev->mtu = mtu;
 
 	if (!netif_running(dev)) {
@@ -4481,11 +4476,6 @@ static int mvneta_xdp_setup(struct net_device *dev, struct bpf_prog *prog,
 	struct mvneta_port *pp = netdev_priv(dev);
 	struct bpf_prog *old_prog;
 
-	if (prog && dev->mtu > MVNETA_MAX_RX_BUF_SIZE) {
-		NL_SET_ERR_MSG_MOD(extack, "MTU too large for XDP");
-		return -EOPNOTSUPP;
-	}
-
 	if (pp->bm_priv) {
 		NL_SET_ERR_MSG_MOD(extack,
 				   "Hardware Buffer Management not supported on XDP");
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 10/18] bpf: add multi-buff support to the bpf_xdp_adjust_tail() API
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (8 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 09/18] net: mvneta: enable jumbo frames for XDP Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-09-01  0:10   ` John Fastabend
  2021-08-20 15:40 ` [PATCH v12 bpf-next 11/18] bpf: introduce bpf_xdp_get_buff_len helper Lorenzo Bianconi
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

From: Eelco Chaudron <echaudro@redhat.com>

This change adds support for tail growing and shrinking for XDP multi-buff.

When called on a multi-buffer packet with a grow request, it will always
work on the last fragment of the packet. So the maximum grow size is the
last fragments tailroom, i.e. no new buffer will be allocated.

When shrinking, it will work from the last fragment, all the way down to
the base buffer depending on the shrinking size. It's important to mention
that once you shrink down the fragment(s) are freed, so you can not grow
again to the original size.

Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
---
 include/net/xdp.h |  9 +++++++
 net/core/filter.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++
 net/core/xdp.c    |  5 ++--
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index d66e9877d773..cdaecf8d4d61 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -145,6 +145,13 @@ xdp_get_shared_info_from_buff(struct xdp_buff *xdp)
 	return (struct skb_shared_info *)xdp_data_hard_end(xdp);
 }
 
+static inline unsigned int xdp_get_frag_tailroom(const skb_frag_t *frag)
+{
+	struct page *page = skb_frag_page(frag);
+
+	return page_size(page) - skb_frag_size(frag) - skb_frag_off(frag);
+}
+
 struct xdp_frame {
 	void *data;
 	u16 len;
@@ -290,6 +297,8 @@ struct xdp_frame *xdp_convert_buff_to_frame(struct xdp_buff *xdp)
 	return xdp_frame;
 }
 
+void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct,
+		  struct xdp_buff *xdp);
 void xdp_return_frame(struct xdp_frame *xdpf);
 void xdp_return_frame_rx_napi(struct xdp_frame *xdpf);
 void xdp_return_buff(struct xdp_buff *xdp);
diff --git a/net/core/filter.c b/net/core/filter.c
index 59b8f5050180..e6a589705813 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3818,11 +3818,71 @@ static const struct bpf_func_proto bpf_xdp_adjust_head_proto = {
 	.arg2_type	= ARG_ANYTHING,
 };
 
+static int bpf_xdp_mb_adjust_tail(struct xdp_buff *xdp, int offset)
+{
+	struct skb_shared_info *sinfo;
+
+	sinfo = xdp_get_shared_info_from_buff(xdp);
+	if (offset >= 0) {
+		skb_frag_t *frag = &sinfo->frags[sinfo->nr_frags - 1];
+		int size;
+
+		if (unlikely(offset > xdp_get_frag_tailroom(frag)))
+			return -EINVAL;
+
+		size = skb_frag_size(frag);
+		memset(skb_frag_address(frag) + size, 0, offset);
+		skb_frag_size_set(frag, size + offset);
+		sinfo->xdp_frags_size += offset;
+	} else {
+		int i, n_frags_free = 0, len_free = 0, tlen_free = 0;
+
+		offset = abs(offset);
+		if (unlikely(offset > ((int)(xdp->data_end - xdp->data) +
+				       sinfo->xdp_frags_size - ETH_HLEN)))
+			return -EINVAL;
+
+		for (i = sinfo->nr_frags - 1; i >= 0 && offset > 0; i--) {
+			skb_frag_t *frag = &sinfo->frags[i];
+			int size = skb_frag_size(frag);
+			int shrink = min_t(int, offset, size);
+
+			len_free += shrink;
+			offset -= shrink;
+
+			if (unlikely(size == shrink)) {
+				struct page *page = skb_frag_page(frag);
+
+				__xdp_return(page_address(page), &xdp->rxq->mem,
+					     false, NULL);
+				tlen_free += page_size(page);
+				n_frags_free++;
+			} else {
+				skb_frag_size_set(frag, size - shrink);
+				break;
+			}
+		}
+		sinfo->nr_frags -= n_frags_free;
+		sinfo->xdp_frags_size -= len_free;
+		sinfo->xdp_frags_tsize -= tlen_free;
+
+		if (unlikely(offset > 0)) {
+			xdp_buff_clear_mb(xdp);
+			xdp->data_end -= offset;
+		}
+	}
+
+	return 0;
+}
+
 BPF_CALL_2(bpf_xdp_adjust_tail, struct xdp_buff *, xdp, int, offset)
 {
 	void *data_hard_end = xdp_data_hard_end(xdp); /* use xdp->frame_sz */
 	void *data_end = xdp->data_end + offset;
 
+	if (unlikely(xdp_buff_is_mb(xdp)))
+		return bpf_xdp_mb_adjust_tail(xdp, offset);
+
 	/* Notice that xdp_data_hard_end have reserved some tailroom */
 	if (unlikely(data_end > data_hard_end))
 		return -EINVAL;
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 1346fb8b3f50..a71cdea75306 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -339,8 +339,8 @@ EXPORT_SYMBOL_GPL(xdp_rxq_info_reg_mem_model);
  * is used for those calls sites.  Thus, allowing for faster recycling
  * of xdp_frames/pages in those cases.
  */
-static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct,
-			 struct xdp_buff *xdp)
+void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct,
+		  struct xdp_buff *xdp)
 {
 	struct xdp_mem_allocator *xa;
 	struct page *page;
@@ -373,6 +373,7 @@ static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct,
 		break;
 	}
 }
+EXPORT_SYMBOL_GPL(__xdp_return);
 
 void xdp_return_frame(struct xdp_frame *xdpf)
 {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 11/18] bpf: introduce bpf_xdp_get_buff_len helper
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (9 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 10/18] bpf: add multi-buff support to the bpf_xdp_adjust_tail() API Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-09-01  0:12   ` John Fastabend
  2021-08-20 15:40 ` [PATCH v12 bpf-next 12/18] bpf: add multi-buffer support to xdp copy helpers Lorenzo Bianconi
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Introduce bpf_xdp_get_buff_len helper in order to return the xdp buffer
total size (linear and paged area)

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 include/uapi/linux/bpf.h       |  7 +++++++
 net/core/filter.c              | 23 +++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h |  7 +++++++
 3 files changed, 37 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c4f7892edb2b..9e2c3b12ea49 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4871,6 +4871,12 @@ union bpf_attr {
  * 	Return
  *		Value specified by user at BPF link creation/attachment time
  *		or 0, if it was not specified.
+ *
+ * u64 bpf_xdp_get_buff_len(struct xdp_buff *xdp_md)
+ *	Description
+ *		Get the total size of a given xdp buff (linear and paged area)
+ *	Return
+ *		The total size of a given xdp buffer.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5048,6 +5054,7 @@ union bpf_attr {
 	FN(timer_cancel),		\
 	FN(get_func_ip),		\
 	FN(get_attach_cookie),		\
+	FN(xdp_get_buff_len),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/net/core/filter.c b/net/core/filter.c
index e6a589705813..53eaae2a5463 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3784,6 +3784,27 @@ static const struct bpf_func_proto sk_skb_change_head_proto = {
 	.arg2_type	= ARG_ANYTHING,
 	.arg3_type	= ARG_ANYTHING,
 };
+
+BPF_CALL_1(bpf_xdp_get_buff_len, struct  xdp_buff*, xdp)
+{
+	u64 len = xdp->data_end - xdp->data;
+
+	if (unlikely(xdp_buff_is_mb(xdp))) {
+		struct skb_shared_info *sinfo;
+
+		sinfo = xdp_get_shared_info_from_buff(xdp);
+		len += sinfo->xdp_frags_size;
+	}
+	return len;
+}
+
+static const struct bpf_func_proto bpf_xdp_get_buff_len_proto = {
+	.func		= bpf_xdp_get_buff_len,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+};
+
 static unsigned long xdp_get_metalen(const struct xdp_buff *xdp)
 {
 	return xdp_data_meta_unsupported(xdp) ? 0 :
@@ -7513,6 +7534,8 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_xdp_redirect_map_proto;
 	case BPF_FUNC_xdp_adjust_tail:
 		return &bpf_xdp_adjust_tail_proto;
+	case BPF_FUNC_xdp_get_buff_len:
+		return &bpf_xdp_get_buff_len_proto;
 	case BPF_FUNC_fib_lookup:
 		return &bpf_xdp_fib_lookup_proto;
 	case BPF_FUNC_check_mtu:
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index c4f7892edb2b..9e2c3b12ea49 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4871,6 +4871,12 @@ union bpf_attr {
  * 	Return
  *		Value specified by user at BPF link creation/attachment time
  *		or 0, if it was not specified.
+ *
+ * u64 bpf_xdp_get_buff_len(struct xdp_buff *xdp_md)
+ *	Description
+ *		Get the total size of a given xdp buff (linear and paged area)
+ *	Return
+ *		The total size of a given xdp buffer.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5048,6 +5054,7 @@ union bpf_attr {
 	FN(timer_cancel),		\
 	FN(get_func_ip),		\
 	FN(get_attach_cookie),		\
+	FN(xdp_get_buff_len),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 12/18] bpf: add multi-buffer support to xdp copy helpers
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (10 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 11/18] bpf: introduce bpf_xdp_get_buff_len helper Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-09-01  0:19   ` John Fastabend
  2021-08-20 15:40 ` [PATCH v12 bpf-next 13/18] bpf: move user_size out of bpf_test_init Lorenzo Bianconi
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

From: Eelco Chaudron <echaudro@redhat.com>

This patch adds support for multi-buffer for the following helpers:
  - bpf_xdp_output()
  - bpf_perf_event_output()

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 kernel/trace/bpf_trace.c                      |   3 +
 net/core/filter.c                             |  68 +++++++-
 .../selftests/bpf/prog_tests/xdp_bpf2bpf.c    | 151 +++++++++++++-----
 .../selftests/bpf/progs/test_xdp_bpf2bpf.c    |   2 +-
 4 files changed, 180 insertions(+), 44 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index cbc73c08c4a4..20b0fae9b6d4 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1424,6 +1424,7 @@ static const struct bpf_func_proto bpf_perf_event_output_proto_raw_tp = {
 
 extern const struct bpf_func_proto bpf_skb_output_proto;
 extern const struct bpf_func_proto bpf_xdp_output_proto;
+extern const struct bpf_func_proto bpf_xdp_get_buff_len_trace_proto;
 
 BPF_CALL_3(bpf_get_stackid_raw_tp, struct bpf_raw_tracepoint_args *, args,
 	   struct bpf_map *, map, u64, flags)
@@ -1521,6 +1522,8 @@ tracing_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_sock_from_file_proto;
 	case BPF_FUNC_get_socket_cookie:
 		return &bpf_get_socket_ptr_cookie_proto;
+	case BPF_FUNC_xdp_get_buff_len:
+		return &bpf_xdp_get_buff_len_trace_proto;
 #endif
 	case BPF_FUNC_seq_printf:
 		return prog->expected_attach_type == BPF_TRACE_ITER ?
diff --git a/net/core/filter.c b/net/core/filter.c
index 53eaae2a5463..2122c00c680f 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3805,6 +3805,15 @@ static const struct bpf_func_proto bpf_xdp_get_buff_len_proto = {
 	.arg1_type	= ARG_PTR_TO_CTX,
 };
 
+BTF_ID_LIST_SINGLE(bpf_xdp_get_buff_len_bpf_ids, struct, xdp_buff)
+
+const struct bpf_func_proto bpf_xdp_get_buff_len_trace_proto = {
+	.func		= bpf_xdp_get_buff_len,
+	.gpl_only	= false,
+	.arg1_type	= ARG_PTR_TO_BTF_ID,
+	.arg1_btf_id	= &bpf_xdp_get_buff_len_bpf_ids[0],
+};
+
 static unsigned long xdp_get_metalen(const struct xdp_buff *xdp)
 {
 	return xdp_data_meta_unsupported(xdp) ? 0 :
@@ -4619,10 +4628,52 @@ static const struct bpf_func_proto bpf_sk_ancestor_cgroup_id_proto = {
 };
 #endif
 
-static unsigned long bpf_xdp_copy(void *dst_buff, const void *src_buff,
+static unsigned long bpf_xdp_copy(void *dst_buff, const void *ctx,
 				  unsigned long off, unsigned long len)
 {
-	memcpy(dst_buff, src_buff + off, len);
+	unsigned long base_len, copy_len, frag_off_total;
+	struct xdp_buff *xdp = (struct xdp_buff *)ctx;
+	struct skb_shared_info *sinfo;
+	int i;
+
+	if (likely(!xdp_buff_is_mb(xdp))) {
+		memcpy(dst_buff, xdp->data + off, len);
+		return 0;
+	}
+
+	base_len = xdp->data_end - xdp->data;
+	frag_off_total = base_len;
+	sinfo = xdp_get_shared_info_from_buff(xdp);
+
+	/* If we need to copy data from the base buffer do it */
+	if (off < base_len) {
+		copy_len = min(len, base_len - off);
+		memcpy(dst_buff, xdp->data + off, copy_len);
+
+		off += copy_len;
+		len -= copy_len;
+		dst_buff += copy_len;
+	}
+
+	/* Copy any remaining data from the fragments */
+	for (i = 0; len && i < sinfo->nr_frags; i++) {
+		skb_frag_t *frag = &sinfo->frags[i];
+		unsigned long frag_len, frag_off;
+
+		frag_len = skb_frag_size(frag);
+		frag_off = off - frag_off_total;
+		if (frag_off < frag_len) {
+			copy_len = min(len, frag_len - frag_off);
+			memcpy(dst_buff,
+			       skb_frag_address(frag) + frag_off, copy_len);
+
+			off += copy_len;
+			len -= copy_len;
+			dst_buff += copy_len;
+		}
+		frag_off_total += frag_len;
+	}
+
 	return 0;
 }
 
@@ -4634,10 +4685,19 @@ BPF_CALL_5(bpf_xdp_event_output, struct xdp_buff *, xdp, struct bpf_map *, map,
 	if (unlikely(flags & ~(BPF_F_CTXLEN_MASK | BPF_F_INDEX_MASK)))
 		return -EINVAL;
 	if (unlikely(!xdp ||
-		     xdp_size > (unsigned long)(xdp->data_end - xdp->data)))
+		     (likely(!xdp_buff_is_mb(xdp)) &&
+		      xdp_size > (unsigned long)(xdp->data_end - xdp->data))))
 		return -EFAULT;
+	if (unlikely(xdp_buff_is_mb(xdp))) {
+		struct skb_shared_info *sinfo;
+
+		sinfo = xdp_get_shared_info_from_buff(xdp);
+		if (unlikely(xdp_size > ((int)(xdp->data_end - xdp->data) +
+					 sinfo->xdp_frags_size)))
+			return -EFAULT;
+	}
 
-	return bpf_event_output(map, flags, meta, meta_size, xdp->data,
+	return bpf_event_output(map, flags, meta, meta_size, xdp,
 				xdp_size, bpf_xdp_copy);
 }
 
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_bpf2bpf.c b/tools/testing/selftests/bpf/prog_tests/xdp_bpf2bpf.c
index 3bd5904b4db5..fe279c1c0e48 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_bpf2bpf.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_bpf2bpf.c
@@ -10,11 +10,20 @@ struct meta {
 	int pkt_len;
 };
 
+struct test_ctx_s {
+	bool passed;
+	int pkt_size;
+};
+
+struct test_ctx_s test_ctx;
+
 static void on_sample(void *ctx, int cpu, void *data, __u32 size)
 {
-	int duration = 0;
 	struct meta *meta = (struct meta *)data;
 	struct ipv4_packet *trace_pkt_v4 = data + sizeof(*meta);
+	unsigned char *raw_pkt = data + sizeof(*meta);
+	struct test_ctx_s *tst_ctx = ctx;
+	int duration = 0;
 
 	if (CHECK(size < sizeof(pkt_v4) + sizeof(*meta),
 		  "check_size", "size %u < %zu\n",
@@ -25,25 +34,114 @@ static void on_sample(void *ctx, int cpu, void *data, __u32 size)
 		  "meta->ifindex = %d\n", meta->ifindex))
 		return;
 
-	if (CHECK(meta->pkt_len != sizeof(pkt_v4), "check_meta_pkt_len",
-		  "meta->pkt_len = %zd\n", sizeof(pkt_v4)))
+	if (CHECK(meta->pkt_len != tst_ctx->pkt_size, "check_meta_pkt_len",
+		  "meta->pkt_len = %d\n", tst_ctx->pkt_size))
 		return;
 
 	if (CHECK(memcmp(trace_pkt_v4, &pkt_v4, sizeof(pkt_v4)),
 		  "check_packet_content", "content not the same\n"))
 		return;
 
-	*(bool *)ctx = true;
+	if (meta->pkt_len > sizeof(pkt_v4)) {
+		for (int i = 0; i < (meta->pkt_len - sizeof(pkt_v4)); i++) {
+			if (raw_pkt[i + sizeof(pkt_v4)] != (unsigned char)i) {
+				CHECK(true, "check_packet_content",
+				      "byte %zu does not match %u != %u\n",
+				      i + sizeof(pkt_v4),
+				      raw_pkt[i + sizeof(pkt_v4)],
+				      (unsigned char)i);
+				break;
+			}
+		}
+	}
+
+	tst_ctx->passed = true;
 }
 
-void test_xdp_bpf2bpf(void)
+#define BUF_SZ	9000
+
+static int run_xdp_bpf2bpf_pkt_size(int pkt_fd, struct perf_buffer *pb,
+				    struct test_xdp_bpf2bpf *ftrace_skel,
+				    int pkt_size)
 {
 	__u32 duration = 0, retval, size;
-	char buf[128];
+	__u8 *buf, *buf_in;
+	int err, ret = 0;
+
+	if (pkt_size > BUF_SZ || pkt_size < sizeof(pkt_v4))
+		return -EINVAL;
+
+	buf_in = malloc(BUF_SZ);
+	if (CHECK(!buf_in, "buf_in malloc()", "error:%s\n", strerror(errno)))
+		return -ENOMEM;
+
+	buf = malloc(BUF_SZ);
+	if (CHECK(!buf, "buf malloc()", "error:%s\n", strerror(errno))) {
+		ret = -ENOMEM;
+		goto free_buf_in;
+	}
+
+	test_ctx.passed = false;
+	test_ctx.pkt_size = pkt_size;
+
+	memcpy(buf_in, &pkt_v4, sizeof(pkt_v4));
+	if (pkt_size > sizeof(pkt_v4)) {
+		for (int i = 0; i < (pkt_size - sizeof(pkt_v4)); i++)
+			buf_in[i + sizeof(pkt_v4)] = i;
+	}
+
+	/* Run test program */
+	err = bpf_prog_test_run(pkt_fd, 1, buf_in, pkt_size,
+				buf, &size, &retval, &duration);
+
+	if (CHECK(err || retval != XDP_PASS || size != pkt_size,
+		  "ipv4", "err %d errno %d retval %d size %d\n",
+		  err, errno, retval, size)) {
+		ret = err ? err : -EINVAL;
+		goto free_buf;
+	}
+
+	/* Make sure bpf_xdp_output() was triggered and it sent the expected
+	 * data to the perf ring buffer.
+	 */
+	err = perf_buffer__poll(pb, 100);
+	if (CHECK(err <= 0, "perf_buffer__poll", "err %d\n", err)) {
+		ret = -EINVAL;
+		goto free_buf;
+	}
+
+	if (CHECK_FAIL(!test_ctx.passed)) {
+		ret = -EINVAL;
+		goto free_buf;
+	}
+
+	/* Verify test results */
+	if (CHECK(ftrace_skel->bss->test_result_fentry != if_nametoindex("lo"),
+		  "result", "fentry failed err %llu\n",
+		  ftrace_skel->bss->test_result_fentry)) {
+		ret = -EINVAL;
+		goto free_buf;
+	}
+
+	if (CHECK(ftrace_skel->bss->test_result_fexit != XDP_PASS, "result",
+		  "fexit failed err %llu\n",
+		  ftrace_skel->bss->test_result_fexit))
+		ret = -EINVAL;
+
+free_buf:
+	free(buf);
+free_buf_in:
+	free(buf_in);
+
+	return ret;
+}
+
+void test_xdp_bpf2bpf(void)
+{
 	int err, pkt_fd, map_fd;
-	bool passed = false;
-	struct iphdr *iph = (void *)buf + sizeof(struct ethhdr);
-	struct iptnl_info value4 = {.family = AF_INET};
+	__u32 duration = 0;
+	int pkt_sizes[] = {sizeof(pkt_v4), 1024, 4100, 8200};
+	struct iptnl_info value4 = {.family = AF_INET6};
 	struct test_xdp *pkt_skel = NULL;
 	struct test_xdp_bpf2bpf *ftrace_skel = NULL;
 	struct vip key4 = {.protocol = 6, .family = AF_INET};
@@ -87,40 +185,15 @@ void test_xdp_bpf2bpf(void)
 
 	/* Set up perf buffer */
 	pb_opts.sample_cb = on_sample;
-	pb_opts.ctx = &passed;
+	pb_opts.ctx = &test_ctx;
 	pb = perf_buffer__new(bpf_map__fd(ftrace_skel->maps.perf_buf_map),
-			      1, &pb_opts);
+			      8, &pb_opts);
 	if (!ASSERT_OK_PTR(pb, "perf_buf__new"))
 		goto out;
 
-	/* Run test program */
-	err = bpf_prog_test_run(pkt_fd, 1, &pkt_v4, sizeof(pkt_v4),
-				buf, &size, &retval, &duration);
-
-	if (CHECK(err || retval != XDP_TX || size != 74 ||
-		  iph->protocol != IPPROTO_IPIP, "ipv4",
-		  "err %d errno %d retval %d size %d\n",
-		  err, errno, retval, size))
-		goto out;
-
-	/* Make sure bpf_xdp_output() was triggered and it sent the expected
-	 * data to the perf ring buffer.
-	 */
-	err = perf_buffer__poll(pb, 100);
-	if (CHECK(err < 0, "perf_buffer__poll", "err %d\n", err))
-		goto out;
-
-	CHECK_FAIL(!passed);
-
-	/* Verify test results */
-	if (CHECK(ftrace_skel->bss->test_result_fentry != if_nametoindex("lo"),
-		  "result", "fentry failed err %llu\n",
-		  ftrace_skel->bss->test_result_fentry))
-		goto out;
-
-	CHECK(ftrace_skel->bss->test_result_fexit != XDP_TX, "result",
-	      "fexit failed err %llu\n", ftrace_skel->bss->test_result_fexit);
-
+	for (int i = 0; i < ARRAY_SIZE(pkt_sizes); i++)
+		run_xdp_bpf2bpf_pkt_size(pkt_fd, pb, ftrace_skel,
+					 pkt_sizes[i]);
 out:
 	if (pb)
 		perf_buffer__free(pb);
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_bpf2bpf.c b/tools/testing/selftests/bpf/progs/test_xdp_bpf2bpf.c
index a038e827f850..902b54190377 100644
--- a/tools/testing/selftests/bpf/progs/test_xdp_bpf2bpf.c
+++ b/tools/testing/selftests/bpf/progs/test_xdp_bpf2bpf.c
@@ -49,7 +49,7 @@ int BPF_PROG(trace_on_entry, struct xdp_buff *xdp)
 	void *data = (void *)(long)xdp->data;
 
 	meta.ifindex = xdp->rxq->dev->ifindex;
-	meta.pkt_len = data_end - data;
+	meta.pkt_len = bpf_xdp_get_buff_len((struct xdp_md *)xdp);
 	bpf_xdp_output(xdp, &perf_buf_map,
 		       ((__u64) meta.pkt_len << 32) |
 		       BPF_F_CURRENT_CPU,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 13/18] bpf: move user_size out of bpf_test_init
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (11 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 12/18] bpf: add multi-buffer support to xdp copy helpers Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-08-20 15:40 ` [PATCH v12 bpf-next 14/18] bpf: introduce multibuff support to bpf_prog_test_run_xdp() Lorenzo Bianconi
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Rely on data_size_in in bpf_test_init routine signature. This is a
preliminary patch to introduce xdp multi-buff selftest

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 net/bpf/test_run.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 2eb0e55ef54d..1013fafb9275 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -246,11 +246,10 @@ bool bpf_prog_test_check_kfunc_call(u32 kfunc_id)
 	return btf_id_set_contains(&test_sk_kfunc_ids, kfunc_id);
 }
 
-static void *bpf_test_init(const union bpf_attr *kattr, u32 size,
-			   u32 headroom, u32 tailroom)
+static void *bpf_test_init(const union bpf_attr *kattr, u32 user_size,
+			   u32 size, u32 headroom, u32 tailroom)
 {
 	void __user *data_in = u64_to_user_ptr(kattr->test.data_in);
-	u32 user_size = kattr->test.data_size_in;
 	void *data;
 
 	if (size < ETH_HLEN || size > PAGE_SIZE - headroom - tailroom)
@@ -571,7 +570,8 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
 	if (kattr->test.flags || kattr->test.cpu)
 		return -EINVAL;
 
-	data = bpf_test_init(kattr, size, NET_SKB_PAD + NET_IP_ALIGN,
+	data = bpf_test_init(kattr, kattr->test.data_size_in,
+			     size, NET_SKB_PAD + NET_IP_ALIGN,
 			     SKB_DATA_ALIGN(sizeof(struct skb_shared_info)));
 	if (IS_ERR(data))
 		return PTR_ERR(data);
@@ -782,7 +782,8 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
 	/* XDP have extra tailroom as (most) drivers use full page */
 	max_data_sz = 4096 - headroom - tailroom;
 
-	data = bpf_test_init(kattr, max_data_sz, headroom, tailroom);
+	data = bpf_test_init(kattr, kattr->test.data_size_in,
+			     max_data_sz, headroom, tailroom);
 	if (IS_ERR(data)) {
 		ret = PTR_ERR(data);
 		goto free_ctx;
@@ -866,7 +867,7 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
 	if (size < ETH_HLEN)
 		return -EINVAL;
 
-	data = bpf_test_init(kattr, size, 0, 0);
+	data = bpf_test_init(kattr, kattr->test.data_size_in, size, 0, 0);
 	if (IS_ERR(data))
 		return PTR_ERR(data);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 14/18] bpf: introduce multibuff support to bpf_prog_test_run_xdp()
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (12 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 13/18] bpf: move user_size out of bpf_test_init Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-08-20 15:40 ` [PATCH v12 bpf-next 15/18] bpf: test_run: add xdp_shared_info pointer in bpf_test_finish signature Lorenzo Bianconi
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Introduce the capability to allocate a xdp multi-buff in
bpf_prog_test_run_xdp routine. This is a preliminary patch to introduce
the selftests for new xdp multi-buff ebpf helpers

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 net/bpf/test_run.c | 54 ++++++++++++++++++++++++++++++++++++----------
 1 file changed, 43 insertions(+), 11 deletions(-)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 1013fafb9275..2173dba0742f 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -750,16 +750,16 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
 			  union bpf_attr __user *uattr)
 {
 	u32 tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
-	u32 headroom = XDP_PACKET_HEADROOM;
 	u32 size = kattr->test.data_size_in;
+	u32 headroom = XDP_PACKET_HEADROOM;
+	u32 retval, duration, max_data_sz;
 	u32 repeat = kattr->test.repeat;
 	struct netdev_rx_queue *rxqueue;
+	struct skb_shared_info *sinfo;
 	struct xdp_buff xdp = {};
-	u32 retval, duration;
+	int i, ret = -EINVAL;
 	struct xdp_md *ctx;
-	u32 max_data_sz;
 	void *data;
-	int ret = -EINVAL;
 
 	if (prog->expected_attach_type == BPF_XDP_DEVMAP ||
 	    prog->expected_attach_type == BPF_XDP_CPUMAP)
@@ -779,11 +779,10 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
 		headroom -= ctx->data;
 	}
 
-	/* XDP have extra tailroom as (most) drivers use full page */
 	max_data_sz = 4096 - headroom - tailroom;
+	size = min_t(u32, size, max_data_sz);
 
-	data = bpf_test_init(kattr, kattr->test.data_size_in,
-			     max_data_sz, headroom, tailroom);
+	data = bpf_test_init(kattr, size, max_data_sz, headroom, tailroom);
 	if (IS_ERR(data)) {
 		ret = PTR_ERR(data);
 		goto free_ctx;
@@ -793,11 +792,45 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
 	xdp_init_buff(&xdp, headroom + max_data_sz + tailroom,
 		      &rxqueue->xdp_rxq);
 	xdp_prepare_buff(&xdp, data, headroom, size, true);
+	sinfo = xdp_get_shared_info_from_buff(&xdp);
 
 	ret = xdp_convert_md_to_buff(ctx, &xdp);
 	if (ret)
 		goto free_data;
 
+	if (unlikely(kattr->test.data_size_in > size)) {
+		void __user *data_in = u64_to_user_ptr(kattr->test.data_in);
+
+		while (size < kattr->test.data_size_in) {
+			struct page *page;
+			skb_frag_t *frag;
+			int data_len;
+
+			page = alloc_page(GFP_KERNEL);
+			if (!page) {
+				ret = -ENOMEM;
+				goto out;
+			}
+
+			frag = &sinfo->frags[sinfo->nr_frags++];
+			__skb_frag_set_page(frag, page);
+
+			data_len = min_t(int, kattr->test.data_size_in - size,
+					 PAGE_SIZE);
+			skb_frag_size_set(frag, data_len);
+
+			if (copy_from_user(page_address(page), data_in + size,
+					   data_len)) {
+				ret = -EFAULT;
+				goto out;
+			}
+			sinfo->xdp_frags_tsize += PAGE_SIZE;
+			sinfo->xdp_frags_size += data_len;
+			size += data_len;
+		}
+		xdp_buff_set_mb(&xdp);
+	}
+
 	bpf_prog_change_xdp(NULL, prog);
 	ret = bpf_test_run(prog, &xdp, repeat, &retval, &duration, true);
 	/* We convert the xdp_buff back to an xdp_md before checking the return
@@ -808,10 +841,7 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
 	if (ret)
 		goto out;
 
-	if (xdp.data_meta != data + headroom ||
-	    xdp.data_end != xdp.data_meta + size)
-		size = xdp.data_end - xdp.data_meta;
-
+	size = xdp.data_end - xdp.data_meta + sinfo->xdp_frags_size;
 	ret = bpf_test_finish(kattr, uattr, xdp.data_meta, size, retval,
 			      duration);
 	if (!ret)
@@ -821,6 +851,8 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
 out:
 	bpf_prog_change_xdp(prog, NULL);
 free_data:
+	for (i = 0; i < sinfo->nr_frags; i++)
+		__free_page(skb_frag_page(&sinfo->frags[i]));
 	kfree(data);
 free_ctx:
 	kfree(ctx);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 15/18] bpf: test_run: add xdp_shared_info pointer in bpf_test_finish signature
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (13 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 14/18] bpf: introduce multibuff support to bpf_prog_test_run_xdp() Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-08-20 15:40 ` [PATCH v12 bpf-next 16/18] bpf: update xdp_adjust_tail selftest to include multi-buffer Lorenzo Bianconi
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

introduce xdp_shared_info pointer in bpf_test_finish signature in order
to copy back paged data from a xdp multi-buff frame to userspace buffer

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 net/bpf/test_run.c | 48 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 39 insertions(+), 9 deletions(-)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 2173dba0742f..869dcf23a1ca 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -129,7 +129,8 @@ static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat,
 
 static int bpf_test_finish(const union bpf_attr *kattr,
 			   union bpf_attr __user *uattr, const void *data,
-			   u32 size, u32 retval, u32 duration)
+			   struct skb_shared_info *sinfo, u32 size,
+			   u32 retval, u32 duration)
 {
 	void __user *data_out = u64_to_user_ptr(kattr->test.data_out);
 	int err = -EFAULT;
@@ -144,8 +145,36 @@ static int bpf_test_finish(const union bpf_attr *kattr,
 		err = -ENOSPC;
 	}
 
-	if (data_out && copy_to_user(data_out, data, copy_size))
-		goto out;
+	if (data_out) {
+		int len = sinfo ? copy_size - sinfo->xdp_frags_size : copy_size;
+
+		if (copy_to_user(data_out, data, len))
+			goto out;
+
+		if (sinfo) {
+			int i, offset = len, data_len;
+
+			for (i = 0; i < sinfo->nr_frags; i++) {
+				skb_frag_t *frag = &sinfo->frags[i];
+
+				if (offset >= copy_size) {
+					err = -ENOSPC;
+					break;
+				}
+
+				data_len = min_t(int, copy_size - offset,
+						 skb_frag_size(frag));
+
+				if (copy_to_user(data_out + offset,
+						 skb_frag_address(frag),
+						 data_len))
+					goto out;
+
+				offset += data_len;
+			}
+		}
+	}
+
 	if (copy_to_user(&uattr->test.data_size_out, &size, sizeof(size)))
 		goto out;
 	if (copy_to_user(&uattr->test.retval, &retval, sizeof(retval)))
@@ -674,7 +703,8 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
 	/* bpf program can never convert linear skb to non-linear */
 	if (WARN_ON_ONCE(skb_is_nonlinear(skb)))
 		size = skb_headlen(skb);
-	ret = bpf_test_finish(kattr, uattr, skb->data, size, retval, duration);
+	ret = bpf_test_finish(kattr, uattr, skb->data, NULL, size, retval,
+			      duration);
 	if (!ret)
 		ret = bpf_ctx_finish(kattr, uattr, ctx,
 				     sizeof(struct __sk_buff));
@@ -842,8 +872,8 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
 		goto out;
 
 	size = xdp.data_end - xdp.data_meta + sinfo->xdp_frags_size;
-	ret = bpf_test_finish(kattr, uattr, xdp.data_meta, size, retval,
-			      duration);
+	ret = bpf_test_finish(kattr, uattr, xdp.data_meta, sinfo, size,
+			      retval, duration);
 	if (!ret)
 		ret = bpf_ctx_finish(kattr, uattr, ctx,
 				     sizeof(struct xdp_md));
@@ -934,8 +964,8 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
 	if (ret < 0)
 		goto out;
 
-	ret = bpf_test_finish(kattr, uattr, &flow_keys, sizeof(flow_keys),
-			      retval, duration);
+	ret = bpf_test_finish(kattr, uattr, &flow_keys, NULL,
+			      sizeof(flow_keys), retval, duration);
 	if (!ret)
 		ret = bpf_ctx_finish(kattr, uattr, user_ctx,
 				     sizeof(struct bpf_flow_keys));
@@ -1039,7 +1069,7 @@ int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog, const union bpf_attr *kat
 		user_ctx->cookie = sock_gen_cookie(ctx.selected_sk);
 	}
 
-	ret = bpf_test_finish(kattr, uattr, NULL, 0, retval, duration);
+	ret = bpf_test_finish(kattr, uattr, NULL, NULL, 0, retval, duration);
 	if (!ret)
 		ret = bpf_ctx_finish(kattr, uattr, user_ctx, sizeof(*user_ctx));
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 16/18] bpf: update xdp_adjust_tail selftest to include multi-buffer
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (14 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 15/18] bpf: test_run: add xdp_shared_info pointer in bpf_test_finish signature Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-08-20 15:40 ` [PATCH v12 bpf-next 17/18] net: xdp: introduce bpf_xdp_adjust_data helper Lorenzo Bianconi
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

From: Eelco Chaudron <echaudro@redhat.com>

This change adds test cases for the multi-buffer scenarios when shrinking
and growing.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 .../bpf/prog_tests/xdp_adjust_tail.c          | 118 ++++++++++++++++++
 .../bpf/progs/test_xdp_adjust_tail_grow.c     |  10 +-
 .../bpf/progs/test_xdp_adjust_tail_shrink.c   |  32 ++++-
 3 files changed, 153 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_adjust_tail.c b/tools/testing/selftests/bpf/prog_tests/xdp_adjust_tail.c
index d5c98f2cb12f..40f7ae798fd1 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_adjust_tail.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_adjust_tail.c
@@ -130,6 +130,120 @@ void test_xdp_adjust_tail_grow2(void)
 	bpf_object__close(obj);
 }
 
+void test_xdp_adjust_mb_tail_shrink(void)
+{
+	const char *file = "./test_xdp_adjust_tail_shrink.o";
+	__u32 duration, retval, size, exp_size;
+	struct bpf_object *obj;
+	int err, prog_fd;
+	__u8 *buf;
+
+	/* For the individual test cases, the first byte in the packet
+	 * indicates which test will be run.
+	 */
+
+	err = bpf_prog_load(file, BPF_PROG_TYPE_XDP, &obj, &prog_fd);
+	if (CHECK_FAIL(err))
+		return;
+
+	buf = malloc(9000);
+	if (CHECK(!buf, "malloc()", "error:%s\n", strerror(errno)))
+		return;
+
+	memset(buf, 0, 9000);
+
+	/* Test case removing 10 bytes from last frag, NOT freeing it */
+	exp_size = 8990; /* 9000 - 10 */
+	err = bpf_prog_test_run(prog_fd, 1, buf, 9000,
+				buf, &size, &retval, &duration);
+
+	CHECK(err || retval != XDP_TX || size != exp_size,
+	      "9k-10b", "err %d errno %d retval %d[%d] size %d[%u]\n",
+	      err, errno, retval, XDP_TX, size, exp_size);
+
+	/* Test case removing one of two pages, assuming 4K pages */
+	buf[0] = 1;
+	exp_size = 4900; /* 9000 - 4100 */
+	err = bpf_prog_test_run(prog_fd, 1, buf, 9000,
+				buf, &size, &retval, &duration);
+
+	CHECK(err || retval != XDP_TX || size != exp_size,
+	      "9k-1p", "err %d errno %d retval %d[%d] size %d[%u]\n",
+	      err, errno, retval, XDP_TX, size, exp_size);
+
+	/* Test case removing two pages resulting in a non mb xdp_buff */
+	buf[0] = 2;
+	exp_size = 800; /* 9000 - 8200 */
+	err = bpf_prog_test_run(prog_fd, 1, buf, 9000,
+				buf, &size, &retval, &duration);
+
+	CHECK(err || retval != XDP_TX || size != exp_size,
+	      "9k-2p", "err %d errno %d retval %d[%d] size %d[%u]\n",
+	      err, errno, retval, XDP_TX, size, exp_size);
+
+	free(buf);
+
+	bpf_object__close(obj);
+}
+
+void test_xdp_adjust_mb_tail_grow(void)
+{
+	const char *file = "./test_xdp_adjust_tail_grow.o";
+	__u32 duration, retval, size, exp_size;
+	struct bpf_object *obj;
+	int err, i, prog_fd;
+	__u8 *buf;
+
+	err = bpf_prog_load(file, BPF_PROG_TYPE_XDP, &obj, &prog_fd);
+	if (CHECK_FAIL(err))
+		return;
+
+	buf = malloc(16384);
+	if (CHECK(!buf, "malloc()", "error:%s\n", strerror(errno)))
+		return;
+
+	/* Test case add 10 bytes to last frag */
+	memset(buf, 1, 16384);
+	size = 9000;
+	exp_size = size + 10;
+	err = bpf_prog_test_run(prog_fd, 1, buf, size,
+				buf, &size, &retval, &duration);
+
+	CHECK(err || retval != XDP_TX || size != exp_size,
+	      "9k+10b", "err %d retval %d[%d] size %d[%u]\n",
+	      err, retval, XDP_TX, size, exp_size);
+
+	for (i = 0; i < 9000; i++)
+		CHECK(buf[i] != 1, "9k+10b-old",
+		      "Old data not all ok, offset %i is failing [%u]!\n",
+		      i, buf[i]);
+
+	for (i = 9000; i < 9010; i++)
+		CHECK(buf[i] != 0, "9k+10b-new",
+		      "New data not all ok, offset %i is failing [%u]!\n",
+		      i, buf[i]);
+
+	for (i = 9010; i < 16384; i++)
+		CHECK(buf[i] != 1, "9k+10b-untouched",
+		      "Unused data not all ok, offset %i is failing [%u]!\n",
+		      i, buf[i]);
+
+	/* Test a too large grow */
+	memset(buf, 1, 16384);
+	size = 9001;
+	exp_size = size;
+	err = bpf_prog_test_run(prog_fd, 1, buf, size,
+				buf, &size, &retval, &duration);
+
+	CHECK(err || retval != XDP_DROP || size != exp_size,
+	      "9k+10b", "err %d retval %d[%d] size %d[%u]\n",
+	      err, retval, XDP_TX, size, exp_size);
+
+	free(buf);
+
+	bpf_object__close(obj);
+}
+
 void test_xdp_adjust_tail(void)
 {
 	if (test__start_subtest("xdp_adjust_tail_shrink"))
@@ -138,4 +252,8 @@ void test_xdp_adjust_tail(void)
 		test_xdp_adjust_tail_grow();
 	if (test__start_subtest("xdp_adjust_tail_grow2"))
 		test_xdp_adjust_tail_grow2();
+	if (test__start_subtest("xdp_adjust_mb_tail_shrink"))
+		test_xdp_adjust_mb_tail_shrink();
+	if (test__start_subtest("xdp_adjust_mb_tail_grow"))
+		test_xdp_adjust_mb_tail_grow();
 }
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_grow.c b/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_grow.c
index 3d66599eee2e..3d43defb0e00 100644
--- a/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_grow.c
+++ b/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_grow.c
@@ -7,11 +7,10 @@ int _xdp_adjust_tail_grow(struct xdp_md *xdp)
 {
 	void *data_end = (void *)(long)xdp->data_end;
 	void *data = (void *)(long)xdp->data;
-	unsigned int data_len;
+	int data_len = bpf_xdp_get_buff_len(xdp);
 	int offset = 0;
 
 	/* Data length determine test case */
-	data_len = data_end - data;
 
 	if (data_len == 54) { /* sizeof(pkt_v4) */
 		offset = 4096; /* test too large offset */
@@ -20,7 +19,12 @@ int _xdp_adjust_tail_grow(struct xdp_md *xdp)
 	} else if (data_len == 64) {
 		offset = 128;
 	} else if (data_len == 128) {
-		offset = 4096 - 256 - 320 - data_len; /* Max tail grow 3520 */
+		/* Max tail grow 3520 */
+		offset = 4096 - 256 - 320 - data_len;
+	} else if (data_len == 9000) {
+		offset = 10;
+	} else if (data_len == 9001) {
+		offset = 4096;
 	} else {
 		return XDP_ABORTED; /* No matching test */
 	}
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_shrink.c b/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_shrink.c
index 22065a9cfb25..64177597ac29 100644
--- a/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_shrink.c
+++ b/tools/testing/selftests/bpf/progs/test_xdp_adjust_tail_shrink.c
@@ -14,14 +14,38 @@ int _version SEC("version") = 1;
 SEC("xdp_adjust_tail_shrink")
 int _xdp_adjust_tail_shrink(struct xdp_md *xdp)
 {
-	void *data_end = (void *)(long)xdp->data_end;
-	void *data = (void *)(long)xdp->data;
+	__u8 *data_end = (void *)(long)xdp->data_end;
+	__u8 *data = (void *)(long)xdp->data;
 	int offset = 0;
 
-	if (data_end - data == 54) /* sizeof(pkt_v4) */
+	switch (bpf_xdp_get_buff_len(xdp)) {
+	case 54:
+		/* sizeof(pkt_v4) */
 		offset = 256; /* shrink too much */
-	else
+		break;
+	case 9000:
+		/* Multi-buffer test cases */
+		if (data + 1 > data_end)
+			return XDP_DROP;
+
+		switch (data[0]) {
+		case 0:
+			offset = 10;
+			break;
+		case 1:
+			offset = 4100;
+			break;
+		case 2:
+			offset = 8200;
+			break;
+		default:
+			return XDP_DROP;
+		}
+		break;
+	default:
 		offset = 20;
+		break;
+	}
 	if (bpf_xdp_adjust_tail(xdp, 0 - offset))
 		return XDP_DROP;
 	return XDP_TX;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 17/18] net: xdp: introduce bpf_xdp_adjust_data helper
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (15 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 16/18] bpf: update xdp_adjust_tail selftest to include multi-buffer Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-09-01  0:36   ` John Fastabend
  2021-08-20 15:40 ` [PATCH v12 bpf-next 18/18] bpf: add bpf_xdp_adjust_data selftest Lorenzo Bianconi
  2021-09-01  0:45 ` [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support John Fastabend
  18 siblings, 1 reply; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

For XDP frames split over multiple buffers, the xdp_md->data and
xdp_md->data_end pointers will point to the start and end of the first
fragment only. bpf_xdp_adjust_data can be used to access subsequent
fragments by moving the data pointers. To use, an XDP program can call
this helper with the byte offset of the packet payload that
it wants to access; the helper will move xdp_md->data and xdp_md ->data_end
so they point to the requested payload offset and to the end of the
fragment containing this byte offset, and return the byte offset of the
start of the fragment.
To move back to the beginning of the packet, simply call the
helper with an offset of '0'.
Note also that the helpers that modify the packet boundaries
(bpf_xdp_adjust_head(), bpf_xdp_adjust_tail() and
bpf_xdp_adjust_meta()) will fail if the pointers have been
moved; it is the responsibility of the BPF program to move them
back before using these helpers.

Suggested-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 include/net/xdp.h              |  8 +++++
 include/uapi/linux/bpf.h       | 32 ++++++++++++++++++
 net/bpf/test_run.c             |  8 +++++
 net/core/filter.c              | 62 +++++++++++++++++++++++++++++++++-
 tools/include/uapi/linux/bpf.h | 32 ++++++++++++++++++
 5 files changed, 141 insertions(+), 1 deletion(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index cdaecf8d4d61..ce4764c7cd40 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -82,6 +82,11 @@ struct xdp_buff {
 	struct xdp_txq_info *txq;
 	u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom*/
 	u16 flags; /* supported values defined in xdp_flags */
+	/* xdp multi-buff metadata used for frags iteration */
+	struct {
+		u16 headroom;	/* frame headroom: data - data_hard_start */
+		u16 headlen;	/* first buffer length: data_end - data */
+	} mb;
 };
 
 static __always_inline bool xdp_buff_is_mb(struct xdp_buff *xdp)
@@ -127,6 +132,9 @@ xdp_prepare_buff(struct xdp_buff *xdp, unsigned char *hard_start,
 	xdp->data = data;
 	xdp->data_end = data + data_len;
 	xdp->data_meta = meta_valid ? data : data + 1;
+	/* mb metadata for frags iteration */
+	xdp->mb.headroom = headroom;
+	xdp->mb.headlen = data_len;
 }
 
 /* Reserve memory area at end-of data area.
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 9e2c3b12ea49..a7b5185a718a 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4877,6 +4877,37 @@ union bpf_attr {
  *		Get the total size of a given xdp buff (linear and paged area)
  *	Return
  *		The total size of a given xdp buffer.
+ *
+ * long bpf_xdp_adjust_data(struct xdp_buff *xdp_md, u32 offset)
+ *	Description
+ *		For XDP frames split over multiple buffers, the
+ *		*xdp_md*\ **->data** and*xdp_md *\ **->data_end** pointers
+ *		will point to the start and end of the first fragment only.
+ *		This helper can be used to access subsequent fragments by
+ *		moving the data pointers. To use, an XDP program can call
+ *		this helper with the byte offset of the packet payload that
+ *		it wants to access; the helper will move *xdp_md*\ **->data**
+ *		and *xdp_md *\ **->data_end** so they point to the requested
+ *		payload offset and to the end of the fragment containing this
+ *		byte offset, and return the byte offset of the start of the
+ *		fragment.
+ *		To move back to the beginning of the packet, simply call the
+ *		helper with an offset of '0'.
+ *		Note also that the helpers that modify the packet boundaries
+ *		(*bpf_xdp_adjust_head()*, *bpf_xdp_adjust_tail()* and
+ *		*bpf_xdp_adjust_meta()*) will fail if the pointers have been
+ *		moved; it is the responsibility of the BPF program to move them
+ *		back before using these helpers.
+ *
+ *		A call to this helper is susceptible to change the underlying
+ *		packet buffer. Therefore, at load time, all checks on pointers
+ *		previously done by the verifier are invalidated and must be
+ *		performed again, if the helper is used in combination with
+ *		direct packet access.
+ *	Return
+ *		offset between the beginning of the current fragment and
+ *		original *xdp_md*\ **->data** on success, or a negative error
+ *		in case of failure.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5055,6 +5086,7 @@ union bpf_attr {
 	FN(get_func_ip),		\
 	FN(get_attach_cookie),		\
 	FN(xdp_get_buff_len),		\
+	FN(xdp_adjust_data),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 869dcf23a1ca..f09c2c8c0d6c 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -757,6 +757,8 @@ static int xdp_convert_md_to_buff(struct xdp_md *xdp_md, struct xdp_buff *xdp)
 	}
 
 	xdp->data = xdp->data_meta + xdp_md->data;
+	xdp->mb.headroom = xdp->data - xdp->data_hard_start;
+	xdp->mb.headlen = xdp->data_end - xdp->data;
 	return 0;
 
 free_dev:
@@ -871,6 +873,12 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
 	if (ret)
 		goto out;
 
+	/* data pointers need to be reset after frag iteration */
+	if (unlikely(xdp.data_hard_start + xdp.mb.headroom != xdp.data)) {
+		ret = -EFAULT;
+		goto out;
+	}
+
 	size = xdp.data_end - xdp.data_meta + sinfo->xdp_frags_size;
 	ret = bpf_test_finish(kattr, uattr, xdp.data_meta, sinfo, size,
 			      retval, duration);
diff --git a/net/core/filter.c b/net/core/filter.c
index 2122c00c680f..ed2a6632adce 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3827,6 +3827,10 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
 	void *data_start = xdp_frame_end + metalen;
 	void *data = xdp->data + offset;
 
+	/* data pointers need to be reset after frag iteration */
+	if (unlikely(xdp->data_hard_start + xdp->mb.headroom != xdp->data))
+		return -EINVAL;
+
 	if (unlikely(data < data_start ||
 		     data > xdp->data_end - ETH_HLEN))
 		return -EINVAL;
@@ -3836,6 +3840,9 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
 			xdp->data_meta, metalen);
 	xdp->data_meta += offset;
 	xdp->data = data;
+	/* update metada for multi-buff frag iteration */
+	xdp->mb.headroom = xdp->data - xdp->data_hard_start;
+	xdp->mb.headlen = xdp->data_end - xdp->data;
 
 	return 0;
 }
@@ -3910,6 +3917,10 @@ BPF_CALL_2(bpf_xdp_adjust_tail, struct xdp_buff *, xdp, int, offset)
 	void *data_hard_end = xdp_data_hard_end(xdp); /* use xdp->frame_sz */
 	void *data_end = xdp->data_end + offset;
 
+	/* data pointer needs to be reset after frag iteration */
+	if (unlikely(xdp->data + xdp->mb.headlen != xdp->data_end))
+		return -EINVAL;
+
 	if (unlikely(xdp_buff_is_mb(xdp)))
 		return bpf_xdp_mb_adjust_tail(xdp, offset);
 
@@ -3949,6 +3960,10 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset)
 	void *meta = xdp->data_meta + offset;
 	unsigned long metalen = xdp->data - meta;
 
+	/* data pointer needs to be reset after frag iteration */
+	if (unlikely(xdp->data_hard_start + xdp->mb.headroom != xdp->data))
+		return -EINVAL;
+
 	if (xdp_data_meta_unsupported(xdp))
 		return -ENOTSUPP;
 	if (unlikely(meta < xdp_frame_end ||
@@ -3970,6 +3985,48 @@ static const struct bpf_func_proto bpf_xdp_adjust_meta_proto = {
 	.arg2_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_2(bpf_xdp_adjust_data, struct xdp_buff *, xdp, u32, offset)
+{
+	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
+	u32 base_offset = xdp->mb.headlen;
+	int i;
+
+	if (!xdp_buff_is_mb(xdp) || offset > sinfo->xdp_frags_size)
+		return -EINVAL;
+
+	if (offset < xdp->mb.headlen) {
+		/* linear area */
+		xdp->data = xdp->data_hard_start + xdp->mb.headroom + offset;
+		xdp->data_end = xdp->data_hard_start + xdp->mb.headroom +
+				xdp->mb.headlen;
+		return 0;
+	}
+
+	for (i = 0; i < sinfo->nr_frags; i++) {
+		/* paged area */
+		skb_frag_t *frag = &sinfo->frags[i];
+		unsigned int size = skb_frag_size(frag);
+
+		if (offset < base_offset + size) {
+			u8 *addr = skb_frag_address(frag);
+
+			xdp->data = addr + offset - base_offset;
+			xdp->data_end = addr + size;
+			break;
+		}
+		base_offset += size;
+	}
+	return base_offset;
+}
+
+static const struct bpf_func_proto bpf_xdp_adjust_data_proto = {
+	.func		= bpf_xdp_adjust_data,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+	.arg2_type	= ARG_ANYTHING,
+};
+
 /* XDP_REDIRECT works by a three-step process, implemented in the functions
  * below:
  *
@@ -7243,7 +7300,8 @@ bool bpf_helper_changes_pkt_data(void *func)
 	    func == bpf_sock_ops_store_hdr_opt ||
 #endif
 	    func == bpf_lwt_in_push_encap ||
-	    func == bpf_lwt_xmit_push_encap)
+	    func == bpf_lwt_xmit_push_encap ||
+	    func == bpf_xdp_adjust_data)
 		return true;
 
 	return false;
@@ -7596,6 +7654,8 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_xdp_adjust_tail_proto;
 	case BPF_FUNC_xdp_get_buff_len:
 		return &bpf_xdp_get_buff_len_proto;
+	case BPF_FUNC_xdp_adjust_data:
+		return &bpf_xdp_adjust_data_proto;
 	case BPF_FUNC_fib_lookup:
 		return &bpf_xdp_fib_lookup_proto;
 	case BPF_FUNC_check_mtu:
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 9e2c3b12ea49..a7b5185a718a 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4877,6 +4877,37 @@ union bpf_attr {
  *		Get the total size of a given xdp buff (linear and paged area)
  *	Return
  *		The total size of a given xdp buffer.
+ *
+ * long bpf_xdp_adjust_data(struct xdp_buff *xdp_md, u32 offset)
+ *	Description
+ *		For XDP frames split over multiple buffers, the
+ *		*xdp_md*\ **->data** and*xdp_md *\ **->data_end** pointers
+ *		will point to the start and end of the first fragment only.
+ *		This helper can be used to access subsequent fragments by
+ *		moving the data pointers. To use, an XDP program can call
+ *		this helper with the byte offset of the packet payload that
+ *		it wants to access; the helper will move *xdp_md*\ **->data**
+ *		and *xdp_md *\ **->data_end** so they point to the requested
+ *		payload offset and to the end of the fragment containing this
+ *		byte offset, and return the byte offset of the start of the
+ *		fragment.
+ *		To move back to the beginning of the packet, simply call the
+ *		helper with an offset of '0'.
+ *		Note also that the helpers that modify the packet boundaries
+ *		(*bpf_xdp_adjust_head()*, *bpf_xdp_adjust_tail()* and
+ *		*bpf_xdp_adjust_meta()*) will fail if the pointers have been
+ *		moved; it is the responsibility of the BPF program to move them
+ *		back before using these helpers.
+ *
+ *		A call to this helper is susceptible to change the underlying
+ *		packet buffer. Therefore, at load time, all checks on pointers
+ *		previously done by the verifier are invalidated and must be
+ *		performed again, if the helper is used in combination with
+ *		direct packet access.
+ *	Return
+ *		offset between the beginning of the current fragment and
+ *		original *xdp_md*\ **->data** on success, or a negative error
+ *		in case of failure.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5055,6 +5086,7 @@ union bpf_attr {
 	FN(get_func_ip),		\
 	FN(get_attach_cookie),		\
 	FN(xdp_get_buff_len),		\
+	FN(xdp_adjust_data),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v12 bpf-next 18/18] bpf: add bpf_xdp_adjust_data selftest
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (16 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 17/18] net: xdp: introduce bpf_xdp_adjust_data helper Lorenzo Bianconi
@ 2021-08-20 15:40 ` Lorenzo Bianconi
  2021-09-01  0:45 ` [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support John Fastabend
  18 siblings, 0 replies; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-08-20 15:40 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Introduce kernel selftest for new bpf_xdp_adjust_data helper.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 .../bpf/prog_tests/xdp_adjust_data.c          | 55 +++++++++++++++++++
 .../bpf/progs/test_xdp_update_frags.c         | 41 ++++++++++++++
 2 files changed, 96 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_adjust_data.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_update_frags.c

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_adjust_data.c b/tools/testing/selftests/bpf/prog_tests/xdp_adjust_data.c
new file mode 100644
index 000000000000..a3e098b72fc9
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_adjust_data.c
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+#include <network_helpers.h>
+
+void test_xdp_update_frag(void)
+{
+	const char *file = "./test_xdp_update_frags.o";
+	__u32 duration, retval, size;
+	struct bpf_object *obj;
+	int err, prog_fd;
+	__u8 *buf;
+
+	err = bpf_prog_load(file, BPF_PROG_TYPE_XDP, &obj, &prog_fd);
+	if (CHECK_FAIL(err))
+		return;
+
+	buf = malloc(128);
+	if (CHECK(!buf, "malloc()", "error:%s\n", strerror(errno)))
+		return;
+
+	memset(buf, 0, 128);
+
+	err = bpf_prog_test_run(prog_fd, 1, buf, 128,
+				buf, &size, &retval, &duration);
+	free(buf);
+
+	CHECK(err || retval != XDP_DROP,
+	      "128b", "err %d errno %d retval %d size %d\n",
+	      err, errno, retval, size);
+
+	buf = malloc(9000);
+	if (CHECK(!buf, "malloc()", "error:%s\n", strerror(errno)))
+		return;
+
+	memset(buf, 0, 9000);
+	buf[5000] = 0xaa; /* marker at offset 5000 (frag0) */
+
+	err = bpf_prog_test_run(prog_fd, 1, buf, 9000,
+				buf, &size, &retval, &duration);
+
+	/* test_xdp_update_frags: buf[5000]: 0xaa -> 0xbb */
+	CHECK(err || retval != XDP_PASS || buf[5000] != 0xbb,
+	      "9000b", "err %d errno %d retval %d size %d\n",
+	      err, errno, retval, size);
+
+	free(buf);
+
+	bpf_object__close(obj);
+}
+
+void test_xdp_adjust_data(void)
+{
+	if (test__start_subtest("xdp_adjust_data"))
+		test_xdp_update_frag();
+}
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_update_frags.c b/tools/testing/selftests/bpf/progs/test_xdp_update_frags.c
new file mode 100644
index 000000000000..c2d5772007f3
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_xdp_update_frags.c
@@ -0,0 +1,41 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#include <linux/bpf.h>
+#include <linux/if_ether.h>
+#include <bpf/bpf_helpers.h>
+
+int _version SEC("version") = 1;
+
+SEC("xdp_adjust_frags")
+int _xdp_adjust_frags(struct xdp_md *xdp)
+{
+	__u8 *data_end = (void *)(long)xdp->data_end;
+	__u8 *data = (void *)(long)xdp->data;
+	__u32 offset = 5000; /* marker offset */
+	int base_offset, ret = XDP_DROP;
+
+	base_offset = bpf_xdp_adjust_data(xdp, offset);
+	if (base_offset < 0 || base_offset > offset)
+		return XDP_DROP;
+
+	data_end = (void *)(long)xdp->data_end;
+	data = (void *)(long)xdp->data;
+
+	if (data + 1 > data_end)
+		goto out;
+
+	if (*data != 0xaa) /* marker */
+		goto out;
+
+	*data = 0xbb; /* update the marker */
+	ret = XDP_PASS;
+out:
+	bpf_xdp_adjust_data(xdp, 0);
+	return ret;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* RE: [PATCH v12 bpf-next 01/18] net: skbuff: add size metadata to skb_shared_info for xdp
  2021-08-20 15:40 ` [PATCH v12 bpf-next 01/18] net: skbuff: add size metadata to skb_shared_info for xdp Lorenzo Bianconi
@ 2021-08-31 23:13   ` John Fastabend
  2021-09-03 17:13     ` Lorenzo Bianconi
  0 siblings, 1 reply; 38+ messages in thread
From: John Fastabend @ 2021-08-31 23:13 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Lorenzo Bianconi wrote:
> Introduce xdp_frags_tsize field in skb_shared_info data structure
> to store xdp_buff/xdp_frame truesize (xdp_frags_tsize will be used
> in xdp multi-buff support). In order to not increase skb_shared_info
> size we will use a hole due to skb_shared_info alignment.
> Introduce xdp_frags_size field in skb_shared_info data structure
> reusing gso_type field in order to store xdp_buff/xdp_frame paged size.
> xdp_frags_size will be used in xdp multi-buff support.
> 
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>

I assume we can use xdp_frags_tsize for anything else above XDP later?
Other than simple question looks OK to me.

Acked-by: John Fastabend <john.fastabend@gmail.com>

> ---
>  include/linux/skbuff.h | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 6bdb0db3e825..1abeba7ef82e 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -522,13 +522,17 @@ struct skb_shared_info {
>  	unsigned short	gso_segs;
>  	struct sk_buff	*frag_list;
>  	struct skb_shared_hwtstamps hwtstamps;
> -	unsigned int	gso_type;
> +	union {
> +		unsigned int	gso_type;
> +		unsigned int	xdp_frags_size;
> +	};
>  	u32		tskey;
>  
>  	/*
>  	 * Warning : all fields before dataref are cleared in __alloc_skb()
>  	 */
>  	atomic_t	dataref;
> +	unsigned int	xdp_frags_tsize;
>  
>  	/* Intermediate layers must ensure that destructor_arg
>  	 * remains valid until skb destructor */
> -- 
> 2.31.1
> 



^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v12 bpf-next 02/18] xdp: introduce flags field in xdp_buff/xdp_frame
  2021-08-20 15:40 ` [PATCH v12 bpf-next 02/18] xdp: introduce flags field in xdp_buff/xdp_frame Lorenzo Bianconi
@ 2021-08-31 23:15   ` John Fastabend
  0 siblings, 0 replies; 38+ messages in thread
From: John Fastabend @ 2021-08-31 23:15 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Lorenzo Bianconi wrote:
> Introduce flags field in xdp_frame and xdp_buffer data structures
> to define additional buffer features. At the moment the only
> supported buffer feature is multi-buffer bit (mb). Multi-buffer bit
> is used to specify if this is a linear buffer (mb = 0) or a multi-buffer
> frame (mb = 1). In the latter case the driver is expected to initialize
> the skb_shared_info structure at the end of the first buffer to link
> together subsequent buffers belonging to the same frame.
> 
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v12 bpf-next 03/18] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer
  2021-08-20 15:40 ` [PATCH v12 bpf-next 03/18] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer Lorenzo Bianconi
@ 2021-08-31 23:23   ` John Fastabend
  2021-09-03 17:23     ` Lorenzo Bianconi
  0 siblings, 1 reply; 38+ messages in thread
From: John Fastabend @ 2021-08-31 23:23 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Lorenzo Bianconi wrote:
> Update multi-buffer bit (mb) in xdp_buff to notify XDP/eBPF layer and
> XDP remote drivers if this is a "non-linear" XDP buffer. Access
> skb_shared_info only if xdp_buff mb is set in order to avoid possible
> cache-misses.
> 
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
>  drivers/net/ethernet/marvell/mvneta.c | 23 ++++++++++++++++++-----
>  1 file changed, 18 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> index 5d1007e1b5c9..9f4858e35566 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
> @@ -2037,9 +2037,14 @@ mvneta_xdp_put_buff(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
>  {
>  	int i;
>  
> +	if (likely(!xdp_buff_is_mb(xdp)))
> +		goto out;
> +

Wouldn't nr_frags = 0 in the !xdp_buff_is_mb case? Is the
xdp_buff_is_mb check with goto really required?

>  	for (i = 0; i < sinfo->nr_frags; i++)
>  		page_pool_put_full_page(rxq->page_pool,
>  					skb_frag_page(&sinfo->frags[i]), true);
> +
> +out:
>  	page_pool_put_page(rxq->page_pool, virt_to_head_page(xdp->data),
>  			   sync_len, true);
>  }
> @@ -2241,7 +2246,6 @@ mvneta_swbm_rx_frame(struct mvneta_port *pp,
>  	int data_len = -MVNETA_MH_SIZE, len;
>  	struct net_device *dev = pp->dev;
>  	enum dma_data_direction dma_dir;
> -	struct skb_shared_info *sinfo;
>  
>  	if (*size > MVNETA_MAX_RX_BUF_SIZE) {
>  		len = MVNETA_MAX_RX_BUF_SIZE;
> @@ -2261,11 +2265,9 @@ mvneta_swbm_rx_frame(struct mvneta_port *pp,
>  
>  	/* Prefetch header */
>  	prefetch(data);
> +	xdp_buff_clear_mb(xdp);
>  	xdp_prepare_buff(xdp, data, pp->rx_offset_correction + MVNETA_MH_SIZE,
>  			 data_len, false);
> -
> -	sinfo = xdp_get_shared_info_from_buff(xdp);
> -	sinfo->nr_frags = 0;
>  }
>  
>  static void
> @@ -2299,6 +2301,9 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
>  		skb_frag_off_set(frag, pp->rx_offset_correction);
>  		skb_frag_size_set(frag, data_len);
>  		__skb_frag_set_page(frag, page);
> +
> +		if (!xdp_buff_is_mb(xdp))
> +			xdp_buff_set_mb(xdp);
>  	} else {
>  		page_pool_put_full_page(rxq->page_pool, page, true);
>  	}
> @@ -2320,8 +2325,12 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
>  		      struct xdp_buff *xdp, u32 desc_status)
>  {
>  	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
> -	int i, num_frags = sinfo->nr_frags;
>  	struct sk_buff *skb;
> +	u8 num_frags;
> +	int i;
> +
> +	if (unlikely(xdp_buff_is_mb(xdp)))
> +		num_frags = sinfo->nr_frags;
>  
>  	skb = build_skb(xdp->data_hard_start, PAGE_SIZE);
>  	if (!skb)
> @@ -2333,6 +2342,9 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
>  	skb_put(skb, xdp->data_end - xdp->data);
>  	skb->ip_summed = mvneta_rx_csum(pp, desc_status);
>  
> +	if (likely(!xdp_buff_is_mb(xdp)))
> +		goto out;
> +

Not that I care much, but couldn't you just init num_frags = 0 and
avoid the goto?

Anyways its not my driver so no need to change it if you like it better
the way it is. Mostly just checking my understanding.

>  	for (i = 0; i < num_frags; i++) {
>  		skb_frag_t *frag = &sinfo->frags[i];
>  
> @@ -2341,6 +2353,7 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
>  				skb_frag_size(frag), PAGE_SIZE);
>  	}
>  
> +out:
>  	return skb;
>  }
>  
> -- 
> 2.31.1
> 



^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v12 bpf-next 04/18] net: mvneta: simplify mvneta_swbm_add_rx_fragment management
  2021-08-20 15:40 ` [PATCH v12 bpf-next 04/18] net: mvneta: simplify mvneta_swbm_add_rx_fragment management Lorenzo Bianconi
@ 2021-08-31 23:34   ` John Fastabend
  0 siblings, 0 replies; 38+ messages in thread
From: John Fastabend @ 2021-08-31 23:34 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Lorenzo Bianconi wrote:
> Relying on xdp mb bit, remove skb_shared_info structure allocated on the
> stack in mvneta_rx_swbm routine and simplify mvneta_swbm_add_rx_fragment
> accessing skb_shared_info in the xdp_buff structure directly. There is no
> performance penalty in this approach since mvneta_swbm_add_rx_fragment
> is run just for multi-buff use-case.
> 
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v12 bpf-next 05/18] net: xdp: add xdp_update_skb_shared_info utility routine
  2021-08-20 15:40 ` [PATCH v12 bpf-next 05/18] net: xdp: add xdp_update_skb_shared_info utility routine Lorenzo Bianconi
@ 2021-08-31 23:38   ` John Fastabend
  0 siblings, 0 replies; 38+ messages in thread
From: John Fastabend @ 2021-08-31 23:38 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Lorenzo Bianconi wrote:
> Introduce xdp_update_skb_shared_info routine to update frags array
> metadata in skb_shared_info data structure converting to a skb from
> a xdp_buff or xdp_frame.
> According to the current skb_shared_info architecture in
> xdp_frame/xdp_buff and to the xdp multi-buff support, there is
> no need to run skb_add_rx_frag() and reset frags array converting the buffer
> to a skb since the frag array will be in the same position for xdp_buff/xdp_frame
> and for the skb, we just need to update memory metadata.
> Introduce XDP_FLAGS_PF_MEMALLOC flag in xdp_buff_flags in order to mark
> the xdp_buff or xdp_frame as under memory-pressure if pages of the frags array
> are under memory pressure. Doing so we can avoid looping over all fragments in
> xdp_update_skb_shared_info routine. The driver is expected to set the
> flag constructing the xdp_buffer using xdp_buff_set_frag_pfmemalloc
> utility routine.
> Rely on xdp_update_skb_shared_info in __xdp_build_skb_from_frame routine
> converting the multi-buff xdp_frame to a skb after performing a XDP_REDIRECT.
> 
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v12 bpf-next 06/18] net: marvell: rely on xdp_update_skb_shared_info utility routine
  2021-08-20 15:40 ` [PATCH v12 bpf-next 06/18] net: marvell: rely on " Lorenzo Bianconi
@ 2021-08-31 23:41   ` John Fastabend
  2021-09-03 17:27     ` Lorenzo Bianconi
  0 siblings, 1 reply; 38+ messages in thread
From: John Fastabend @ 2021-08-31 23:41 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Lorenzo Bianconi wrote:
> Rely on xdp_update_skb_shared_info routine in order to avoid
> resetting frags array in skb_shared_info structure building
> the skb in mvneta_swbm_build_skb(). Frags array is expected to
> be initialized by the receiving driver building the xdp_buff
> and here we just need to update memory metadata.
> 
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

>  drivers/net/ethernet/marvell/mvneta.c | 35 +++++++++++++++------------
>  1 file changed, 20 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> index cbf614d6b993..b996eb49d813 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
> @@ -2304,11 +2304,19 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
>  		skb_frag_size_set(frag, data_len);
>  		__skb_frag_set_page(frag, page);
>  
> -		if (!xdp_buff_is_mb(xdp))
> +		if (!xdp_buff_is_mb(xdp)) {
> +			sinfo->xdp_frags_size = *size;
>  			xdp_buff_set_mb(xdp);
> +		}
> +		if (page_is_pfmemalloc(page))
> +			xdp_buff_set_frag_pfmemalloc(xdp);
>  	} else {
>  		page_pool_put_full_page(rxq->page_pool, page, true);
>  	}
> +
> +	/* last fragment */
> +	if (len == *size)
> +		sinfo->xdp_frags_tsize = sinfo->nr_frags * PAGE_SIZE;
>  	*size -= len;
>  }
>  
> @@ -2316,13 +2324,18 @@ static struct sk_buff *
>  mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
>  		      struct xdp_buff *xdp, u32 desc_status)
>  {
> -	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
> +	unsigned int size, truesize;
>  	struct sk_buff *skb;
>  	u8 num_frags;
> -	int i;
>  
> -	if (unlikely(xdp_buff_is_mb(xdp)))
> +	if (unlikely(xdp_buff_is_mb(xdp))) {

Just curious does the mvneta hardware support header split? If we
get to that point then we can drop the unlikely.

> +		struct skb_shared_info *sinfo;
> +
> +		sinfo = xdp_get_shared_info_from_buff(xdp);
> +		truesize = sinfo->xdp_frags_tsize;
> +		size = sinfo->xdp_frags_size;
>  		num_frags = sinfo->nr_frags;
> +	}
>  
>  	skb = build_skb(xdp->data_hard_start, PAGE_SIZE);
>  	if (!skb)
> @@ -2334,18 +2347,10 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
>  	skb_put(skb, xdp->data_end - xdp->data);
>  	skb->ip_summed = mvneta_rx_csum(pp, desc_status);
>  
> -	if (likely(!xdp_buff_is_mb(xdp)))
> -		goto out;
> -
> -	for (i = 0; i < num_frags; i++) {
> -		skb_frag_t *frag = &sinfo->frags[i];
> -
> -		skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
> -				skb_frag_page(frag), skb_frag_off(frag),
> -				skb_frag_size(frag), PAGE_SIZE);
> -	}
> +	if (unlikely(xdp_buff_is_mb(xdp)))
> +		xdp_update_skb_shared_info(skb, num_frags, size, truesize,
> +					   xdp_buff_is_frag_pfmemalloc(xdp));
>  
> -out:
>  	return skb;
>  }
>  
> -- 
> 2.31.1
> 



^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v12 bpf-next 07/18] xdp: add multi-buff support to xdp_return_{buff/frame}
  2021-08-20 15:40 ` [PATCH v12 bpf-next 07/18] xdp: add multi-buff support to xdp_return_{buff/frame} Lorenzo Bianconi
@ 2021-08-31 23:43   ` John Fastabend
  0 siblings, 0 replies; 38+ messages in thread
From: John Fastabend @ 2021-08-31 23:43 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Lorenzo Bianconi wrote:
> Take into account if the received xdp_buff/xdp_frame is non-linear
> recycling/returning the frame memory to the allocator or into
> xdp_frame_bulk.
> 
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v12 bpf-next 08/18] net: mvneta: add multi buffer support to XDP_TX
  2021-08-20 15:40 ` [PATCH v12 bpf-next 08/18] net: mvneta: add multi buffer support to XDP_TX Lorenzo Bianconi
@ 2021-08-31 23:44   ` John Fastabend
  0 siblings, 0 replies; 38+ messages in thread
From: John Fastabend @ 2021-08-31 23:44 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Lorenzo Bianconi wrote:
> Introduce the capability to map non-linear xdp buffer running
> mvneta_xdp_submit_frame() for XDP_TX and XDP_REDIRECT
> 
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v12 bpf-next 09/18] net: mvneta: enable jumbo frames for XDP
  2021-08-20 15:40 ` [PATCH v12 bpf-next 09/18] net: mvneta: enable jumbo frames for XDP Lorenzo Bianconi
@ 2021-08-31 23:45   ` John Fastabend
  0 siblings, 0 replies; 38+ messages in thread
From: John Fastabend @ 2021-08-31 23:45 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Lorenzo Bianconi wrote:
> Enable the capability to receive jumbo frames even if the interface is
> running in XDP mode
> 
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v12 bpf-next 10/18] bpf: add multi-buff support to the bpf_xdp_adjust_tail() API
  2021-08-20 15:40 ` [PATCH v12 bpf-next 10/18] bpf: add multi-buff support to the bpf_xdp_adjust_tail() API Lorenzo Bianconi
@ 2021-09-01  0:10   ` John Fastabend
  0 siblings, 0 replies; 38+ messages in thread
From: John Fastabend @ 2021-09-01  0:10 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Lorenzo Bianconi wrote:
> From: Eelco Chaudron <echaudro@redhat.com>
> 
> This change adds support for tail growing and shrinking for XDP multi-buff.
> 
> When called on a multi-buffer packet with a grow request, it will always
> work on the last fragment of the packet. So the maximum grow size is the
> last fragments tailroom, i.e. no new buffer will be allocated.
> 
> When shrinking, it will work from the last fragment, all the way down to
> the base buffer depending on the shrinking size. It's important to mention
> that once you shrink down the fragment(s) are freed, so you can not grow
> again to the original size.
> 
> Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---

LGTM.

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v12 bpf-next 11/18] bpf: introduce bpf_xdp_get_buff_len helper
  2021-08-20 15:40 ` [PATCH v12 bpf-next 11/18] bpf: introduce bpf_xdp_get_buff_len helper Lorenzo Bianconi
@ 2021-09-01  0:12   ` John Fastabend
  0 siblings, 0 replies; 38+ messages in thread
From: John Fastabend @ 2021-09-01  0:12 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Lorenzo Bianconi wrote:
> Introduce bpf_xdp_get_buff_len helper in order to return the xdp buffer
> total size (linear and paged area)
> 
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v12 bpf-next 12/18] bpf: add multi-buffer support to xdp copy helpers
  2021-08-20 15:40 ` [PATCH v12 bpf-next 12/18] bpf: add multi-buffer support to xdp copy helpers Lorenzo Bianconi
@ 2021-09-01  0:19   ` John Fastabend
  0 siblings, 0 replies; 38+ messages in thread
From: John Fastabend @ 2021-09-01  0:19 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Lorenzo Bianconi wrote:
> From: Eelco Chaudron <echaudro@redhat.com>
> 
> This patch adds support for multi-buffer for the following helpers:
>   - bpf_xdp_output()
>   - bpf_perf_event_output()
> 
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v12 bpf-next 17/18] net: xdp: introduce bpf_xdp_adjust_data helper
  2021-08-20 15:40 ` [PATCH v12 bpf-next 17/18] net: xdp: introduce bpf_xdp_adjust_data helper Lorenzo Bianconi
@ 2021-09-01  0:36   ` John Fastabend
  2021-09-03 17:57     ` Lorenzo Bianconi
  0 siblings, 1 reply; 38+ messages in thread
From: John Fastabend @ 2021-09-01  0:36 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Lorenzo Bianconi wrote:
> For XDP frames split over multiple buffers, the xdp_md->data and
> xdp_md->data_end pointers will point to the start and end of the first
> fragment only. bpf_xdp_adjust_data can be used to access subsequent
> fragments by moving the data pointers. To use, an XDP program can call
> this helper with the byte offset of the packet payload that
> it wants to access; the helper will move xdp_md->data and xdp_md ->data_end
> so they point to the requested payload offset and to the end of the
> fragment containing this byte offset, and return the byte offset of the
> start of the fragment.
> To move back to the beginning of the packet, simply call the
> helper with an offset of '0'.
> Note also that the helpers that modify the packet boundaries
> (bpf_xdp_adjust_head(), bpf_xdp_adjust_tail() and
> bpf_xdp_adjust_meta()) will fail if the pointers have been
> moved; it is the responsibility of the BPF program to move them
> back before using these helpers.

I'm ok with this for a first iteration I guess with more work we
can make the helpers use the updated pointers though.

> 
> Suggested-by: John Fastabend <john.fastabend@gmail.com>
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>

Overall looks good couple small nits/questions below. Thanks!

> ---
>  include/net/xdp.h              |  8 +++++
>  include/uapi/linux/bpf.h       | 32 ++++++++++++++++++
>  net/bpf/test_run.c             |  8 +++++
>  net/core/filter.c              | 62 +++++++++++++++++++++++++++++++++-
>  tools/include/uapi/linux/bpf.h | 32 ++++++++++++++++++
>  5 files changed, 141 insertions(+), 1 deletion(-)
> 
> diff --git a/include/net/xdp.h b/include/net/xdp.h
> index cdaecf8d4d61..ce4764c7cd40 100644
> --- a/include/net/xdp.h
> +++ b/include/net/xdp.h
> @@ -82,6 +82,11 @@ struct xdp_buff {
>  	struct xdp_txq_info *txq;
>  	u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom*/
>  	u16 flags; /* supported values defined in xdp_flags */
> +	/* xdp multi-buff metadata used for frags iteration */
> +	struct {
> +		u16 headroom;	/* frame headroom: data - data_hard_start */
> +		u16 headlen;	/* first buffer length: data_end - data */
> +	} mb;
>  };
>  
>  static __always_inline bool xdp_buff_is_mb(struct xdp_buff *xdp)
> @@ -127,6 +132,9 @@ xdp_prepare_buff(struct xdp_buff *xdp, unsigned char *hard_start,
>  	xdp->data = data;
>  	xdp->data_end = data + data_len;
>  	xdp->data_meta = meta_valid ? data : data + 1;
> +	/* mb metadata for frags iteration */
> +	xdp->mb.headroom = headroom;
> +	xdp->mb.headlen = data_len;
>  }
>  
>  /* Reserve memory area at end-of data area.
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 9e2c3b12ea49..a7b5185a718a 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -4877,6 +4877,37 @@ union bpf_attr {
>   *		Get the total size of a given xdp buff (linear and paged area)
>   *	Return
>   *		The total size of a given xdp buffer.
> + *
> + * long bpf_xdp_adjust_data(struct xdp_buff *xdp_md, u32 offset)
> + *	Description
> + *		For XDP frames split over multiple buffers, the
> + *		*xdp_md*\ **->data** and*xdp_md *\ **->data_end** pointers
                                       ^^^^
missing space?

> + *		will point to the start and end of the first fragment only.
> + *		This helper can be used to access subsequent fragments by
> + *		moving the data pointers. To use, an XDP program can call
> + *		this helper with the byte offset of the packet payload that
> + *		it wants to access; the helper will move *xdp_md*\ **->data**
> + *		and *xdp_md *\ **->data_end** so they point to the requested
> + *		payload offset and to the end of the fragment containing this
> + *		byte offset, and return the byte offset of the start of the
> + *		fragment.
> + *		To move back to the beginning of the packet, simply call the
> + *		helper with an offset of '0'.
> + *		Note also that the helpers that modify the packet boundaries
> + *		(*bpf_xdp_adjust_head()*, *bpf_xdp_adjust_tail()* and
> + *		*bpf_xdp_adjust_meta()*) will fail if the pointers have been
> + *		moved; it is the responsibility of the BPF program to move them
> + *		back before using these helpers.
> + *
> + *		A call to this helper is susceptible to change the underlying
> + *		packet buffer. Therefore, at load time, all checks on pointers
> + *		previously done by the verifier are invalidated and must be
> + *		performed again, if the helper is used in combination with
> + *		direct packet access.
> + *	Return
> + *		offset between the beginning of the current fragment and
> + *		original *xdp_md*\ **->data** on success, or a negative error
> + *		in case of failure.
>   */
>  #define __BPF_FUNC_MAPPER(FN)		\
>  	FN(unspec),			\
> @@ -5055,6 +5086,7 @@ union bpf_attr {
>  	FN(get_func_ip),		\
>  	FN(get_attach_cookie),		\
>  	FN(xdp_get_buff_len),		\
> +	FN(xdp_adjust_data),		\
>  	/* */
>  
>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 869dcf23a1ca..f09c2c8c0d6c 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -757,6 +757,8 @@ static int xdp_convert_md_to_buff(struct xdp_md *xdp_md, struct xdp_buff *xdp)
>  	}
>  
>  	xdp->data = xdp->data_meta + xdp_md->data;
> +	xdp->mb.headroom = xdp->data - xdp->data_hard_start;
> +	xdp->mb.headlen = xdp->data_end - xdp->data;
>  	return 0;
>  
>  free_dev:
> @@ -871,6 +873,12 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
>  	if (ret)
>  		goto out;
>  
> +	/* data pointers need to be reset after frag iteration */
> +	if (unlikely(xdp.data_hard_start + xdp.mb.headroom != xdp.data)) {
> +		ret = -EFAULT;
> +		goto out;
> +	}
> +
>  	size = xdp.data_end - xdp.data_meta + sinfo->xdp_frags_size;
>  	ret = bpf_test_finish(kattr, uattr, xdp.data_meta, sinfo, size,
>  			      retval, duration);
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 2122c00c680f..ed2a6632adce 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3827,6 +3827,10 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
>  	void *data_start = xdp_frame_end + metalen;
>  	void *data = xdp->data + offset;
>  
> +	/* data pointers need to be reset after frag iteration */
> +	if (unlikely(xdp->data_hard_start + xdp->mb.headroom != xdp->data))
> +		return -EINVAL;

-EFAULT? It might be nice if error code is different from below
for debugging?

> +
>  	if (unlikely(data < data_start ||
>  		     data > xdp->data_end - ETH_HLEN))
>  		return -EINVAL;
> @@ -3836,6 +3840,9 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
>  			xdp->data_meta, metalen);
>  	xdp->data_meta += offset;
>  	xdp->data = data;
> +	/* update metada for multi-buff frag iteration */
> +	xdp->mb.headroom = xdp->data - xdp->data_hard_start;
> +	xdp->mb.headlen = xdp->data_end - xdp->data;
>  
>  	return 0;
>  }
> @@ -3910,6 +3917,10 @@ BPF_CALL_2(bpf_xdp_adjust_tail, struct xdp_buff *, xdp, int, offset)
>  	void *data_hard_end = xdp_data_hard_end(xdp); /* use xdp->frame_sz */
>  	void *data_end = xdp->data_end + offset;
>  
> +	/* data pointer needs to be reset after frag iteration */
> +	if (unlikely(xdp->data + xdp->mb.headlen != xdp->data_end))
> +		return -EINVAL;

EFAULT?

> +
>  	if (unlikely(xdp_buff_is_mb(xdp)))
>  		return bpf_xdp_mb_adjust_tail(xdp, offset);
>  
> @@ -3949,6 +3960,10 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset)
>  	void *meta = xdp->data_meta + offset;
>  	unsigned long metalen = xdp->data - meta;
>  
> +	/* data pointer needs to be reset after frag iteration */
> +	if (unlikely(xdp->data_hard_start + xdp->mb.headroom != xdp->data))
> +		return -EINVAL;

same comment.

>  	if (xdp_data_meta_unsupported(xdp))
>  		return -ENOTSUPP;
>  	if (unlikely(meta < xdp_frame_end ||
> @@ -3970,6 +3985,48 @@ static const struct bpf_func_proto bpf_xdp_adjust_meta_proto = {
>  	.arg2_type	= ARG_ANYTHING,
>  };
>  
> +BPF_CALL_2(bpf_xdp_adjust_data, struct xdp_buff *, xdp, u32, offset)
> +{
> +	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
> +	u32 base_offset = xdp->mb.headlen;
> +	int i;
> +
> +	if (!xdp_buff_is_mb(xdp) || offset > sinfo->xdp_frags_size)
> +		return -EINVAL;

Do we need to error this? If its not mb we can just return the same
as offset==0?

> +
> +	if (offset < xdp->mb.headlen) {
> +		/* linear area */
> +		xdp->data = xdp->data_hard_start + xdp->mb.headroom + offset;
> +		xdp->data_end = xdp->data_hard_start + xdp->mb.headroom +
> +				xdp->mb.headlen;
> +		return 0;
> +	}
> +
> +	for (i = 0; i < sinfo->nr_frags; i++) {
> +		/* paged area */
> +		skb_frag_t *frag = &sinfo->frags[i];
> +		unsigned int size = skb_frag_size(frag);
> +
> +		if (offset < base_offset + size) {
> +			u8 *addr = skb_frag_address(frag);
> +
> +			xdp->data = addr + offset - base_offset;
> +			xdp->data_end = addr + size;
> +			break;
> +		}
> +		base_offset += size;
> +	}
> +	return base_offset;
> +}

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support
  2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (17 preceding siblings ...)
  2021-08-20 15:40 ` [PATCH v12 bpf-next 18/18] bpf: add bpf_xdp_adjust_data selftest Lorenzo Bianconi
@ 2021-09-01  0:45 ` John Fastabend
  2021-09-07  8:35   ` Lorenzo Bianconi
  18 siblings, 1 reply; 38+ messages in thread
From: John Fastabend @ 2021-09-01  0:45 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, magnus.karlsson,
	tirthendu.sarkar, toke

Lorenzo Bianconi wrote:
> This series introduce XDP multi-buffer support. The mvneta driver is
> the first to support these new "non-linear" xdp_{buff,frame}. Reviewers
> please focus on how these new types of xdp_{buff,frame} packets
> traverse the different layers and the layout design. It is on purpose
> that BPF-helpers are kept simple, as we don't want to expose the
> internal layout to allow later changes.
> 
> The main idea for the new multi-buffer layout is to reuse the same
> structure used for non-linear SKB. This rely on the "skb_shared_info"
> struct at the end of the first buffer to link together subsequent
> buffers. Keeping the layout compatible with SKBs is also done to ease
> and speedup creating a SKB from an xdp_{buff,frame}.
> Converting xdp_frame to SKB and deliver it to the network stack is shown
> in patch 05/18 (e.g. cpumaps).
> 
> A multi-buffer bit (mb) has been introduced in the flags field of xdp_{buff,frame}
> structure to notify the bpf/network layer if this is a xdp multi-buffer frame
> (mb = 1) or not (mb = 0).
> The mb bit will be set by a xdp multi-buffer capable driver only for
> non-linear frames maintaining the capability to receive linear frames
> without any extra cost since the skb_shared_info structure at the end
> of the first buffer will be initialized only if mb is set.
> Moreover the flags field in xdp_{buff,frame} will be reused even for
> xdp rx csum offloading in future series.
> 

The series is looking really close to me. Couple small comments/questions
inline. Also I think we should call out the potential issues in the cover
letter with regards to backwards compatibility. Something like, 

"
A multi-buffer enabled NIC may receive XDP frames with multiple frags.
If a BPF program does not understand mb layouts its possible to contrive
a BPF program that incorrectly views data_end as the end of data when
there is more data in the payload. Note helpers will generally due the
correct thing, for example perf_output will consume entire payload. But,
it is still possible some programs could do the wrong thing even if in
an edge case. Although we expect most BPF programs not to be impacted
we can't rule out, you've been warned.
"

I can't think of an elegant way around this and it does require at least
some type of opt-in by increasing the MTU limit so I'm OK with it given
I think it should impact few (no?) real programs.

> Typical use cases for this series are:
> - Jumbo-frames
> - Packet header split (please see Google���s use-case @ NetDevConf 0x14, [0])
> - TSO/GRO
> 
> The two following ebpf helpers (and related selftests) has been introduced:
> - bpf_xdp_adjust_data:
>   Move xdp_md->data and xdp_md->data_end pointers in subsequent fragments
>   according to the offset provided by the ebpf program. This helper can be
>   used to read/write values in frame payload.
> - bpf_xdp_get_buff_len:
>   Return the total frame size (linear + paged parts)
> 
> bpf_xdp_adjust_tail and bpf_xdp_copy helpers have been modified to take into
> account xdp multi-buff frames.
> 
> More info about the main idea behind this approach can be found here [1][2].
> 
> Changes since v11:
> - add missing static to bpf_xdp_get_buff_len_proto structure
> - fix bpf_xdp_adjust_data helper when offset is smaller than linear area length.
> 
> Changes since v10:
> - move xdp->data to the requested payload offset instead of to the beginning of
>   the fragment in bpf_xdp_adjust_data()
> 
> Changes since v9:
> - introduce bpf_xdp_adjust_data helper and related selftest
> - add xdp_frags_size and xdp_frags_tsize fields in skb_shared_info
> - introduce xdp_update_skb_shared_info utility routine in ordere to not reset
>   frags array in skb_shared_info converting from a xdp_buff/xdp_frame to a skb 
> - simplify bpf_xdp_copy routine
> 
> Changes since v8:
> - add proper dma unmapping if XDP_TX fails on mvneta for a xdp multi-buff
> - switch back to skb_shared_info implementation from previous xdp_shared_info
>   one
> - avoid using a bietfield in xdp_buff/xdp_frame since it introduces performance
>   regressions. Tested now on 10G NIC (ixgbe) to verify there are no performance
>   penalties for regular codebase
> - add bpf_xdp_get_buff_len helper and remove frame_length field in xdp ctx
> - add data_len field in skb_shared_info struct
> - introduce XDP_FLAGS_FRAGS_PF_MEMALLOC flag
> 
> Changes since v7:
> - rebase on top of bpf-next
> - fix sparse warnings
> - improve comments for frame_length in include/net/xdp.h
> 
> Changes since v6:
> - the main difference respect to previous versions is the new approach proposed
>   by Eelco to pass full length of the packet to eBPF layer in XDP context
> - reintroduce multi-buff support to eBPF kself-tests
> - reintroduce multi-buff support to bpf_xdp_adjust_tail helper
> - introduce multi-buffer support to bpf_xdp_copy helper
> - rebase on top of bpf-next
> 
> Changes since v5:
> - rebase on top of bpf-next
> - initialize mb bit in xdp_init_buff() and drop per-driver initialization
> - drop xdp->mb initialization in xdp_convert_zc_to_xdp_frame()
> - postpone introduction of frame_length field in XDP ctx to another series
> - minor changes
> 
> Changes since v4:
> - rebase ontop of bpf-next
> - introduce xdp_shared_info to build xdp multi-buff instead of using the
>   skb_shared_info struct
> - introduce frame_length in xdp ctx
> - drop previous bpf helpers
> - fix bpf_xdp_adjust_tail for xdp multi-buff
> - introduce xdp multi-buff self-tests for bpf_xdp_adjust_tail
> - fix xdp_return_frame_bulk for xdp multi-buff
> 
> Changes since v3:
> - rebase ontop of bpf-next
> - add patch 10/13 to copy back paged data from a xdp multi-buff frame to
>   userspace buffer for xdp multi-buff selftests
> 
> Changes since v2:
> - add throughput measurements
> - drop bpf_xdp_adjust_mb_header bpf helper
> - introduce selftest for xdp multibuffer
> - addressed comments on bpf_xdp_get_frags_count
> - introduce xdp multi-buff support to cpumaps
> 
> Changes since v1:
> - Fix use-after-free in xdp_return_{buff/frame}
> - Introduce bpf helpers
> - Introduce xdp_mb sample program
> - access skb_shared_info->nr_frags only on the last fragment
> 
> Changes since RFC:
> - squash multi-buffer bit initialization in a single patch
> - add mvneta non-linear XDP buff support for tx side
> 
> [0] https://netdevconf.info/0x14/session.html?talk-the-path-to-tcp-4k-mtu-and-rx-zerocopy
> [1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org
> [2] https://netdevconf.info/0x14/session.html?tutorial-add-XDP-support-to-a-NIC-driver (XDPmulti-buffers section)
> 
> Eelco Chaudron (3):
>   bpf: add multi-buff support to the bpf_xdp_adjust_tail() API
>   bpf: add multi-buffer support to xdp copy helpers
>   bpf: update xdp_adjust_tail selftest to include multi-buffer
> 
> Lorenzo Bianconi (15):
>   net: skbuff: add size metadata to skb_shared_info for xdp
>   xdp: introduce flags field in xdp_buff/xdp_frame
>   net: mvneta: update mb bit before passing the xdp buffer to eBPF layer
>   net: mvneta: simplify mvneta_swbm_add_rx_fragment management
>   net: xdp: add xdp_update_skb_shared_info utility routine
>   net: marvell: rely on xdp_update_skb_shared_info utility routine
>   xdp: add multi-buff support to xdp_return_{buff/frame}
>   net: mvneta: add multi buffer support to XDP_TX
>   net: mvneta: enable jumbo frames for XDP
>   bpf: introduce bpf_xdp_get_buff_len helper
>   bpf: move user_size out of bpf_test_init
>   bpf: introduce multibuff support to bpf_prog_test_run_xdp()
>   bpf: test_run: add xdp_shared_info pointer in bpf_test_finish
>     signature
>   net: xdp: introduce bpf_xdp_adjust_data helper
>   bpf: add bpf_xdp_adjust_data selftest
> 
>  drivers/net/ethernet/marvell/mvneta.c         | 204 ++++++++++-------
>  include/linux/skbuff.h                        |   6 +-
>  include/net/xdp.h                             |  95 +++++++-
>  include/uapi/linux/bpf.h                      |  39 ++++
>  kernel/trace/bpf_trace.c                      |   3 +
>  net/bpf/test_run.c                            | 117 ++++++++--
>  net/core/filter.c                             | 213 +++++++++++++++++-
>  net/core/xdp.c                                |  76 ++++++-
>  tools/include/uapi/linux/bpf.h                |  39 ++++
>  .../bpf/prog_tests/xdp_adjust_data.c          |  55 +++++
>  .../bpf/prog_tests/xdp_adjust_tail.c          | 118 ++++++++++
>  .../selftests/bpf/prog_tests/xdp_bpf2bpf.c    | 151 +++++++++----
>  .../bpf/progs/test_xdp_adjust_tail_grow.c     |  10 +-
>  .../bpf/progs/test_xdp_adjust_tail_shrink.c   |  32 ++-
>  .../selftests/bpf/progs/test_xdp_bpf2bpf.c    |   2 +-
>  .../bpf/progs/test_xdp_update_frags.c         |  41 ++++
>  16 files changed, 1036 insertions(+), 165 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_adjust_data.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_update_frags.c
> 
> -- 
> 2.31.1
> 



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v12 bpf-next 01/18] net: skbuff: add size metadata to skb_shared_info for xdp
  2021-08-31 23:13   ` John Fastabend
@ 2021-09-03 17:13     ` Lorenzo Bianconi
  0 siblings, 0 replies; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-09-03 17:13 UTC (permalink / raw)
  To: John Fastabend
  Cc: Lorenzo Bianconi, BPF-dev-list, Network Development,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Agroskin, Shay, David Ahern, Jesper Brouer,
	Eelco Chaudron, Jason Wang, Alexander Duyck, Saeed Mahameed,
	Maciej Fijalkowski, Karlsson, Magnus, Sarkar, Tirthendu,
	Toke Hoiland Jorgensen

>
> Lorenzo Bianconi wrote:
> > Introduce xdp_frags_tsize field in skb_shared_info data structure
> > to store xdp_buff/xdp_frame truesize (xdp_frags_tsize will be used
> > in xdp multi-buff support). In order to not increase skb_shared_info
> > size we will use a hole due to skb_shared_info alignment.
> > Introduce xdp_frags_size field in skb_shared_info data structure
> > reusing gso_type field in order to store xdp_buff/xdp_frame paged size.
> > xdp_frags_size will be used in xdp multi-buff support.
> >
> > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
>
> I assume we can use xdp_frags_tsize for anything else above XDP later?
> Other than simple question looks OK to me.

yes, right as we did for gso_type/xdp_frags_size.

Regards,
Lorenzo

>
> Acked-by: John Fastabend <john.fastabend@gmail.com>
>
> > ---
> >  include/linux/skbuff.h | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > index 6bdb0db3e825..1abeba7ef82e 100644
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -522,13 +522,17 @@ struct skb_shared_info {
> >       unsigned short  gso_segs;
> >       struct sk_buff  *frag_list;
> >       struct skb_shared_hwtstamps hwtstamps;
> > -     unsigned int    gso_type;
> > +     union {
> > +             unsigned int    gso_type;
> > +             unsigned int    xdp_frags_size;
> > +     };
> >       u32             tskey;
> >
> >       /*
> >        * Warning : all fields before dataref are cleared in __alloc_skb()
> >        */
> >       atomic_t        dataref;
> > +     unsigned int    xdp_frags_tsize;
> >
> >       /* Intermediate layers must ensure that destructor_arg
> >        * remains valid until skb destructor */
> > --
> > 2.31.1
> >
>
>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v12 bpf-next 03/18] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer
  2021-08-31 23:23   ` John Fastabend
@ 2021-09-03 17:23     ` Lorenzo Bianconi
  0 siblings, 0 replies; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-09-03 17:23 UTC (permalink / raw)
  To: John Fastabend
  Cc: Lorenzo Bianconi, BPF-dev-list, Network Development,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Agroskin, Shay, David Ahern, Jesper Brouer,
	Eelco Chaudron, Jason Wang, Alexander Duyck, Saeed Mahameed,
	Maciej Fijalkowski, Karlsson, Magnus, Sarkar, Tirthendu,
	Toke Hoiland Jorgensen

On Wed, Sep 1, 2021 at 1:24 AM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Lorenzo Bianconi wrote:
> > Update multi-buffer bit (mb) in xdp_buff to notify XDP/eBPF layer and
> > XDP remote drivers if this is a "non-linear" XDP buffer. Access
> > skb_shared_info only if xdp_buff mb is set in order to avoid possible
> > cache-misses.
> >
> > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > ---
> >  drivers/net/ethernet/marvell/mvneta.c | 23 ++++++++++++++++++-----
> >  1 file changed, 18 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> > index 5d1007e1b5c9..9f4858e35566 100644
> > --- a/drivers/net/ethernet/marvell/mvneta.c
> > +++ b/drivers/net/ethernet/marvell/mvneta.c
> > @@ -2037,9 +2037,14 @@ mvneta_xdp_put_buff(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
> >  {
> >       int i;
> >
> > +     if (likely(!xdp_buff_is_mb(xdp)))
> > +             goto out;
> > +
>
> Wouldn't nr_frags = 0 in the !xdp_buff_is_mb case? Is the
> xdp_buff_is_mb check with goto really required?

if xdp_buff_is_mb is false, nr_frags will not be initialized otherwise
we will trigger a cache-miss for the single-buffer use case (where
initializing skb_shared_info is not required).

>
> >       for (i = 0; i < sinfo->nr_frags; i++)
> >               page_pool_put_full_page(rxq->page_pool,
> >                                       skb_frag_page(&sinfo->frags[i]), true);
> > +
> > +out:
> >       page_pool_put_page(rxq->page_pool, virt_to_head_page(xdp->data),
> >                          sync_len, true);
> >  }
> > @@ -2241,7 +2246,6 @@ mvneta_swbm_rx_frame(struct mvneta_port *pp,
> >       int data_len = -MVNETA_MH_SIZE, len;
> >       struct net_device *dev = pp->dev;
> >       enum dma_data_direction dma_dir;
> > -     struct skb_shared_info *sinfo;
> >
> >       if (*size > MVNETA_MAX_RX_BUF_SIZE) {
> >               len = MVNETA_MAX_RX_BUF_SIZE;
> > @@ -2261,11 +2265,9 @@ mvneta_swbm_rx_frame(struct mvneta_port *pp,
> >
> >       /* Prefetch header */
> >       prefetch(data);
> > +     xdp_buff_clear_mb(xdp);
> >       xdp_prepare_buff(xdp, data, pp->rx_offset_correction + MVNETA_MH_SIZE,
> >                        data_len, false);
> > -
> > -     sinfo = xdp_get_shared_info_from_buff(xdp);
> > -     sinfo->nr_frags = 0;
> >  }
> >
> >  static void
> > @@ -2299,6 +2301,9 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
> >               skb_frag_off_set(frag, pp->rx_offset_correction);
> >               skb_frag_size_set(frag, data_len);
> >               __skb_frag_set_page(frag, page);
> > +
> > +             if (!xdp_buff_is_mb(xdp))
> > +                     xdp_buff_set_mb(xdp);
> >       } else {
> >               page_pool_put_full_page(rxq->page_pool, page, true);
> >       }
> > @@ -2320,8 +2325,12 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
> >                     struct xdp_buff *xdp, u32 desc_status)
> >  {
> >       struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
> > -     int i, num_frags = sinfo->nr_frags;
> >       struct sk_buff *skb;
> > +     u8 num_frags;
> > +     int i;
> > +
> > +     if (unlikely(xdp_buff_is_mb(xdp)))
> > +             num_frags = sinfo->nr_frags;
> >
> >       skb = build_skb(xdp->data_hard_start, PAGE_SIZE);
> >       if (!skb)
> > @@ -2333,6 +2342,9 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
> >       skb_put(skb, xdp->data_end - xdp->data);
> >       skb->ip_summed = mvneta_rx_csum(pp, desc_status);
> >
> > +     if (likely(!xdp_buff_is_mb(xdp)))
> > +             goto out;
> > +
>
> Not that I care much, but couldn't you just init num_frags = 0 and
> avoid the goto?

same here.

Regards,
Lorenzo

>
> Anyways its not my driver so no need to change it if you like it better
> the way it is. Mostly just checking my understanding.
>
> >       for (i = 0; i < num_frags; i++) {
> >               skb_frag_t *frag = &sinfo->frags[i];
> >
> > @@ -2341,6 +2353,7 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
> >                               skb_frag_size(frag), PAGE_SIZE);
> >       }
> >
> > +out:
> >       return skb;
> >  }
> >
> > --
> > 2.31.1
> >
>
>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v12 bpf-next 06/18] net: marvell: rely on xdp_update_skb_shared_info utility routine
  2021-08-31 23:41   ` John Fastabend
@ 2021-09-03 17:27     ` Lorenzo Bianconi
  0 siblings, 0 replies; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-09-03 17:27 UTC (permalink / raw)
  To: John Fastabend
  Cc: Lorenzo Bianconi, BPF-dev-list, Network Development,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Agroskin, Shay, David Ahern, Jesper Brouer,
	Eelco Chaudron, Jason Wang, Alexander Duyck, Saeed Mahameed,
	Maciej Fijalkowski, Karlsson, Magnus, Sarkar, Tirthendu,
	Toke Hoiland Jorgensen

>
> Lorenzo Bianconi wrote:
> > Rely on xdp_update_skb_shared_info routine in order to avoid
> > resetting frags array in skb_shared_info structure building
> > the skb in mvneta_swbm_build_skb(). Frags array is expected to
> > be initialized by the receiving driver building the xdp_buff
> > and here we just need to update memory metadata.
> >
> > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > ---
>
> Acked-by: John Fastabend <john.fastabend@gmail.com>
>
> >  drivers/net/ethernet/marvell/mvneta.c | 35 +++++++++++++++------------
> >  1 file changed, 20 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> > index cbf614d6b993..b996eb49d813 100644
> > --- a/drivers/net/ethernet/marvell/mvneta.c
> > +++ b/drivers/net/ethernet/marvell/mvneta.c
> > @@ -2304,11 +2304,19 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
> >               skb_frag_size_set(frag, data_len);
> >               __skb_frag_set_page(frag, page);
> >
> > -             if (!xdp_buff_is_mb(xdp))
> > +             if (!xdp_buff_is_mb(xdp)) {
> > +                     sinfo->xdp_frags_size = *size;
> >                       xdp_buff_set_mb(xdp);
> > +             }
> > +             if (page_is_pfmemalloc(page))
> > +                     xdp_buff_set_frag_pfmemalloc(xdp);
> >       } else {
> >               page_pool_put_full_page(rxq->page_pool, page, true);
> >       }
> > +
> > +     /* last fragment */
> > +     if (len == *size)
> > +             sinfo->xdp_frags_tsize = sinfo->nr_frags * PAGE_SIZE;
> >       *size -= len;
> >  }
> >
> > @@ -2316,13 +2324,18 @@ static struct sk_buff *
> >  mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
> >                     struct xdp_buff *xdp, u32 desc_status)
> >  {
> > -     struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
> > +     unsigned int size, truesize;
> >       struct sk_buff *skb;
> >       u8 num_frags;
> > -     int i;
> >
> > -     if (unlikely(xdp_buff_is_mb(xdp)))
> > +     if (unlikely(xdp_buff_is_mb(xdp))) {
>
> Just curious does the mvneta hardware support header split? If we
> get to that point then we can drop the unlikely.

nope, it does not :(

Regards,
Lorenzo

>
> > +             struct skb_shared_info *sinfo;
> > +
> > +             sinfo = xdp_get_shared_info_from_buff(xdp);
> > +             truesize = sinfo->xdp_frags_tsize;
> > +             size = sinfo->xdp_frags_size;
> >               num_frags = sinfo->nr_frags;
> > +     }
> >
> >       skb = build_skb(xdp->data_hard_start, PAGE_SIZE);
> >       if (!skb)
> > @@ -2334,18 +2347,10 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct page_pool *pool,
> >       skb_put(skb, xdp->data_end - xdp->data);
> >       skb->ip_summed = mvneta_rx_csum(pp, desc_status);
> >
> > -     if (likely(!xdp_buff_is_mb(xdp)))
> > -             goto out;
> > -
> > -     for (i = 0; i < num_frags; i++) {
> > -             skb_frag_t *frag = &sinfo->frags[i];
> > -
> > -             skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
> > -                             skb_frag_page(frag), skb_frag_off(frag),
> > -                             skb_frag_size(frag), PAGE_SIZE);
> > -     }
> > +     if (unlikely(xdp_buff_is_mb(xdp)))
> > +             xdp_update_skb_shared_info(skb, num_frags, size, truesize,
> > +                                        xdp_buff_is_frag_pfmemalloc(xdp));
> >
> > -out:
> >       return skb;
> >  }
> >
> > --
> > 2.31.1
> >
>
>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v12 bpf-next 17/18] net: xdp: introduce bpf_xdp_adjust_data helper
  2021-09-01  0:36   ` John Fastabend
@ 2021-09-03 17:57     ` Lorenzo Bianconi
  0 siblings, 0 replies; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-09-03 17:57 UTC (permalink / raw)
  To: John Fastabend
  Cc: Lorenzo Bianconi, BPF-dev-list, Network Development,
	David S. Miller, Jakub Kicinski, Alexei Starovoitov,
	Daniel Borkmann, Agroskin, Shay, David Ahern, Jesper Brouer,
	Eelco Chaudron, Jason Wang, Alexander Duyck, Saeed Mahameed,
	Maciej Fijalkowski, Karlsson, Magnus, Sarkar, Tirthendu,
	Toke Hoiland Jorgensen

>
> Lorenzo Bianconi wrote:
> > For XDP frames split over multiple buffers, the xdp_md->data and
> > xdp_md->data_end pointers will point to the start and end of the first
> > fragment only. bpf_xdp_adjust_data can be used to access subsequent
> > fragments by moving the data pointers. To use, an XDP program can call
> > this helper with the byte offset of the packet payload that
> > it wants to access; the helper will move xdp_md->data and xdp_md ->data_end
> > so they point to the requested payload offset and to the end of the
> > fragment containing this byte offset, and return the byte offset of the
> > start of the fragment.
> > To move back to the beginning of the packet, simply call the
> > helper with an offset of '0'.
> > Note also that the helpers that modify the packet boundaries
> > (bpf_xdp_adjust_head(), bpf_xdp_adjust_tail() and
> > bpf_xdp_adjust_meta()) will fail if the pointers have been
> > moved; it is the responsibility of the BPF program to move them
> > back before using these helpers.
>
> I'm ok with this for a first iteration I guess with more work we
> can make the helpers use the updated pointers though.
>
> >
> > Suggested-by: John Fastabend <john.fastabend@gmail.com>
> > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
>
> Overall looks good couple small nits/questions below. Thanks!
>
> > ---
> >  include/net/xdp.h              |  8 +++++
> >  include/uapi/linux/bpf.h       | 32 ++++++++++++++++++
> >  net/bpf/test_run.c             |  8 +++++
> >  net/core/filter.c              | 62 +++++++++++++++++++++++++++++++++-
> >  tools/include/uapi/linux/bpf.h | 32 ++++++++++++++++++
> >  5 files changed, 141 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/net/xdp.h b/include/net/xdp.h
> > index cdaecf8d4d61..ce4764c7cd40 100644
> > --- a/include/net/xdp.h
> > +++ b/include/net/xdp.h
> > @@ -82,6 +82,11 @@ struct xdp_buff {
> >       struct xdp_txq_info *txq;
> >       u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom*/
> >       u16 flags; /* supported values defined in xdp_flags */
> > +     /* xdp multi-buff metadata used for frags iteration */
> > +     struct {
> > +             u16 headroom;   /* frame headroom: data - data_hard_start */
> > +             u16 headlen;    /* first buffer length: data_end - data */
> > +     } mb;
> >  };
> >
> >  static __always_inline bool xdp_buff_is_mb(struct xdp_buff *xdp)
> > @@ -127,6 +132,9 @@ xdp_prepare_buff(struct xdp_buff *xdp, unsigned char *hard_start,
> >       xdp->data = data;
> >       xdp->data_end = data + data_len;
> >       xdp->data_meta = meta_valid ? data : data + 1;
> > +     /* mb metadata for frags iteration */
> > +     xdp->mb.headroom = headroom;
> > +     xdp->mb.headlen = data_len;
> >  }
> >
> >  /* Reserve memory area at end-of data area.
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 9e2c3b12ea49..a7b5185a718a 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -4877,6 +4877,37 @@ union bpf_attr {
> >   *           Get the total size of a given xdp buff (linear and paged area)
> >   *   Return
> >   *           The total size of a given xdp buffer.
> > + *
> > + * long bpf_xdp_adjust_data(struct xdp_buff *xdp_md, u32 offset)
> > + *   Description
> > + *           For XDP frames split over multiple buffers, the
> > + *           *xdp_md*\ **->data** and*xdp_md *\ **->data_end** pointers
>                                        ^^^^
> missing space?

ack, right. I will fix it.

>
> > + *           will point to the start and end of the first fragment only.
> > + *           This helper can be used to access subsequent fragments by
> > + *           moving the data pointers. To use, an XDP program can call
> > + *           this helper with the byte offset of the packet payload that
> > + *           it wants to access; the helper will move *xdp_md*\ **->data**
> > + *           and *xdp_md *\ **->data_end** so they point to the requested
> > + *           payload offset and to the end of the fragment containing this
> > + *           byte offset, and return the byte offset of the start of the
> > + *           fragment.
> > + *           To move back to the beginning of the packet, simply call the
> > + *           helper with an offset of '0'.
> > + *           Note also that the helpers that modify the packet boundaries
> > + *           (*bpf_xdp_adjust_head()*, *bpf_xdp_adjust_tail()* and
> > + *           *bpf_xdp_adjust_meta()*) will fail if the pointers have been
> > + *           moved; it is the responsibility of the BPF program to move them
> > + *           back before using these helpers.
> > + *
> > + *           A call to this helper is susceptible to change the underlying
> > + *           packet buffer. Therefore, at load time, all checks on pointers
> > + *           previously done by the verifier are invalidated and must be
> > + *           performed again, if the helper is used in combination with
> > + *           direct packet access.
> > + *   Return
> > + *           offset between the beginning of the current fragment and
> > + *           original *xdp_md*\ **->data** on success, or a negative error
> > + *           in case of failure.
> >   */
> >  #define __BPF_FUNC_MAPPER(FN)                \
> >       FN(unspec),                     \
> > @@ -5055,6 +5086,7 @@ union bpf_attr {
> >       FN(get_func_ip),                \
> >       FN(get_attach_cookie),          \
> >       FN(xdp_get_buff_len),           \
> > +     FN(xdp_adjust_data),            \
> >       /* */
> >
> >  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> > diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> > index 869dcf23a1ca..f09c2c8c0d6c 100644
> > --- a/net/bpf/test_run.c
> > +++ b/net/bpf/test_run.c
> > @@ -757,6 +757,8 @@ static int xdp_convert_md_to_buff(struct xdp_md *xdp_md, struct xdp_buff *xdp)
> >       }
> >
> >       xdp->data = xdp->data_meta + xdp_md->data;
> > +     xdp->mb.headroom = xdp->data - xdp->data_hard_start;
> > +     xdp->mb.headlen = xdp->data_end - xdp->data;
> >       return 0;
> >
> >  free_dev:
> > @@ -871,6 +873,12 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
> >       if (ret)
> >               goto out;
> >
> > +     /* data pointers need to be reset after frag iteration */
> > +     if (unlikely(xdp.data_hard_start + xdp.mb.headroom != xdp.data)) {
> > +             ret = -EFAULT;
> > +             goto out;
> > +     }
> > +
> >       size = xdp.data_end - xdp.data_meta + sinfo->xdp_frags_size;
> >       ret = bpf_test_finish(kattr, uattr, xdp.data_meta, sinfo, size,
> >                             retval, duration);
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 2122c00c680f..ed2a6632adce 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -3827,6 +3827,10 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
> >       void *data_start = xdp_frame_end + metalen;
> >       void *data = xdp->data + offset;
> >
> > +     /* data pointers need to be reset after frag iteration */
> > +     if (unlikely(xdp->data_hard_start + xdp->mb.headroom != xdp->data))
> > +             return -EINVAL;
>
> -EFAULT? It might be nice if error code is different from below
> for debugging?

ack, I will fix it in v13

>
> > +
> >       if (unlikely(data < data_start ||
> >                    data > xdp->data_end - ETH_HLEN))
> >               return -EINVAL;
> > @@ -3836,6 +3840,9 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
> >                       xdp->data_meta, metalen);
> >       xdp->data_meta += offset;
> >       xdp->data = data;
> > +     /* update metada for multi-buff frag iteration */
> > +     xdp->mb.headroom = xdp->data - xdp->data_hard_start;
> > +     xdp->mb.headlen = xdp->data_end - xdp->data;
> >
> >       return 0;
> >  }
> > @@ -3910,6 +3917,10 @@ BPF_CALL_2(bpf_xdp_adjust_tail, struct xdp_buff *, xdp, int, offset)
> >       void *data_hard_end = xdp_data_hard_end(xdp); /* use xdp->frame_sz */
> >       void *data_end = xdp->data_end + offset;
> >
> > +     /* data pointer needs to be reset after frag iteration */
> > +     if (unlikely(xdp->data + xdp->mb.headlen != xdp->data_end))
> > +             return -EINVAL;
>
> EFAULT?

ack, I will fix it in v13

>
> > +
> >       if (unlikely(xdp_buff_is_mb(xdp)))
> >               return bpf_xdp_mb_adjust_tail(xdp, offset);
> >
> > @@ -3949,6 +3960,10 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset)
> >       void *meta = xdp->data_meta + offset;
> >       unsigned long metalen = xdp->data - meta;
> >
> > +     /* data pointer needs to be reset after frag iteration */
> > +     if (unlikely(xdp->data_hard_start + xdp->mb.headroom != xdp->data))
> > +             return -EINVAL;
>
> same comment.

ack, I will fix it in v13

>
> >       if (xdp_data_meta_unsupported(xdp))
> >               return -ENOTSUPP;
> >       if (unlikely(meta < xdp_frame_end ||
> > @@ -3970,6 +3985,48 @@ static const struct bpf_func_proto bpf_xdp_adjust_meta_proto = {
> >       .arg2_type      = ARG_ANYTHING,
> >  };
> >
> > +BPF_CALL_2(bpf_xdp_adjust_data, struct xdp_buff *, xdp, u32, offset)
> > +{
> > +     struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
> > +     u32 base_offset = xdp->mb.headlen;
> > +     int i;
> > +
> > +     if (!xdp_buff_is_mb(xdp) || offset > sinfo->xdp_frags_size)
> > +             return -EINVAL;
>
> Do we need to error this? If its not mb we can just return the same
> as offset==0?

ack, we can check do something like:

u32 max_offset = xdp->mb.headlen;

if (xdp_buff_is_mb(xdp))
   max_offset += sinfo->xdp_frags_size;

if (offset > max_offset)
   return -EINVAL;

what do you think?

Regards,
Lorenzo

>
> > +
> > +     if (offset < xdp->mb.headlen) {
> > +             /* linear area */
> > +             xdp->data = xdp->data_hard_start + xdp->mb.headroom + offset;
> > +             xdp->data_end = xdp->data_hard_start + xdp->mb.headroom +
> > +                             xdp->mb.headlen;
> > +             return 0;
> > +     }
> > +
> > +     for (i = 0; i < sinfo->nr_frags; i++) {
> > +             /* paged area */
> > +             skb_frag_t *frag = &sinfo->frags[i];
> > +             unsigned int size = skb_frag_size(frag);
> > +
> > +             if (offset < base_offset + size) {
> > +                     u8 *addr = skb_frag_address(frag);
> > +
> > +                     xdp->data = addr + offset - base_offset;
> > +                     xdp->data_end = addr + size;
> > +                     break;
> > +             }
> > +             base_offset += size;
> > +     }
> > +     return base_offset;
> > +}
>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support
  2021-09-01  0:45 ` [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support John Fastabend
@ 2021-09-07  8:35   ` Lorenzo Bianconi
  0 siblings, 0 replies; 38+ messages in thread
From: Lorenzo Bianconi @ 2021-09-07  8:35 UTC (permalink / raw)
  To: John Fastabend
  Cc: Lorenzo Bianconi, bpf, netdev, davem, kuba, ast, daniel, shayagr,
	dsahern, brouer, echaudro, jasowang, alexander.duyck, saeed,
	maciej.fijalkowski, magnus.karlsson, tirthendu.sarkar, toke

[-- Attachment #1: Type: text/plain, Size: 9652 bytes --]

> Lorenzo Bianconi wrote:
> > This series introduce XDP multi-buffer support. The mvneta driver is
> > the first to support these new "non-linear" xdp_{buff,frame}. Reviewers
> > please focus on how these new types of xdp_{buff,frame} packets
> > traverse the different layers and the layout design. It is on purpose
> > that BPF-helpers are kept simple, as we don't want to expose the
> > internal layout to allow later changes.
> > 
> > The main idea for the new multi-buffer layout is to reuse the same
> > structure used for non-linear SKB. This rely on the "skb_shared_info"
> > struct at the end of the first buffer to link together subsequent
> > buffers. Keeping the layout compatible with SKBs is also done to ease
> > and speedup creating a SKB from an xdp_{buff,frame}.
> > Converting xdp_frame to SKB and deliver it to the network stack is shown
> > in patch 05/18 (e.g. cpumaps).
> > 
> > A multi-buffer bit (mb) has been introduced in the flags field of xdp_{buff,frame}
> > structure to notify the bpf/network layer if this is a xdp multi-buffer frame
> > (mb = 1) or not (mb = 0).
> > The mb bit will be set by a xdp multi-buffer capable driver only for
> > non-linear frames maintaining the capability to receive linear frames
> > without any extra cost since the skb_shared_info structure at the end
> > of the first buffer will be initialized only if mb is set.
> > Moreover the flags field in xdp_{buff,frame} will be reused even for
> > xdp rx csum offloading in future series.
> > 
> 
> The series is looking really close to me. Couple small comments/questions
> inline. Also I think we should call out the potential issues in the cover
> letter with regards to backwards compatibility. Something like, 
> 
> "
> A multi-buffer enabled NIC may receive XDP frames with multiple frags.
> If a BPF program does not understand mb layouts its possible to contrive
> a BPF program that incorrectly views data_end as the end of data when
> there is more data in the payload. Note helpers will generally due the
> correct thing, for example perf_output will consume entire payload. But,
> it is still possible some programs could do the wrong thing even if in
> an edge case. Although we expect most BPF programs not to be impacted
> we can't rule out, you've been warned.

ack, I will add it to the cover letter in v13.

Regards,
Lorenzo

> "
> 
> I can't think of an elegant way around this and it does require at least
> some type of opt-in by increasing the MTU limit so I'm OK with it given
> I think it should impact few (no?) real programs.
> 
> > Typical use cases for this series are:
> > - Jumbo-frames
> > - Packet header split (please see Google���s use-case @ NetDevConf 0x14, [0])
> > - TSO/GRO
> > 
> > The two following ebpf helpers (and related selftests) has been introduced:
> > - bpf_xdp_adjust_data:
> >   Move xdp_md->data and xdp_md->data_end pointers in subsequent fragments
> >   according to the offset provided by the ebpf program. This helper can be
> >   used to read/write values in frame payload.
> > - bpf_xdp_get_buff_len:
> >   Return the total frame size (linear + paged parts)
> > 
> > bpf_xdp_adjust_tail and bpf_xdp_copy helpers have been modified to take into
> > account xdp multi-buff frames.
> > 
> > More info about the main idea behind this approach can be found here [1][2].
> > 
> > Changes since v11:
> > - add missing static to bpf_xdp_get_buff_len_proto structure
> > - fix bpf_xdp_adjust_data helper when offset is smaller than linear area length.
> > 
> > Changes since v10:
> > - move xdp->data to the requested payload offset instead of to the beginning of
> >   the fragment in bpf_xdp_adjust_data()
> > 
> > Changes since v9:
> > - introduce bpf_xdp_adjust_data helper and related selftest
> > - add xdp_frags_size and xdp_frags_tsize fields in skb_shared_info
> > - introduce xdp_update_skb_shared_info utility routine in ordere to not reset
> >   frags array in skb_shared_info converting from a xdp_buff/xdp_frame to a skb 
> > - simplify bpf_xdp_copy routine
> > 
> > Changes since v8:
> > - add proper dma unmapping if XDP_TX fails on mvneta for a xdp multi-buff
> > - switch back to skb_shared_info implementation from previous xdp_shared_info
> >   one
> > - avoid using a bietfield in xdp_buff/xdp_frame since it introduces performance
> >   regressions. Tested now on 10G NIC (ixgbe) to verify there are no performance
> >   penalties for regular codebase
> > - add bpf_xdp_get_buff_len helper and remove frame_length field in xdp ctx
> > - add data_len field in skb_shared_info struct
> > - introduce XDP_FLAGS_FRAGS_PF_MEMALLOC flag
> > 
> > Changes since v7:
> > - rebase on top of bpf-next
> > - fix sparse warnings
> > - improve comments for frame_length in include/net/xdp.h
> > 
> > Changes since v6:
> > - the main difference respect to previous versions is the new approach proposed
> >   by Eelco to pass full length of the packet to eBPF layer in XDP context
> > - reintroduce multi-buff support to eBPF kself-tests
> > - reintroduce multi-buff support to bpf_xdp_adjust_tail helper
> > - introduce multi-buffer support to bpf_xdp_copy helper
> > - rebase on top of bpf-next
> > 
> > Changes since v5:
> > - rebase on top of bpf-next
> > - initialize mb bit in xdp_init_buff() and drop per-driver initialization
> > - drop xdp->mb initialization in xdp_convert_zc_to_xdp_frame()
> > - postpone introduction of frame_length field in XDP ctx to another series
> > - minor changes
> > 
> > Changes since v4:
> > - rebase ontop of bpf-next
> > - introduce xdp_shared_info to build xdp multi-buff instead of using the
> >   skb_shared_info struct
> > - introduce frame_length in xdp ctx
> > - drop previous bpf helpers
> > - fix bpf_xdp_adjust_tail for xdp multi-buff
> > - introduce xdp multi-buff self-tests for bpf_xdp_adjust_tail
> > - fix xdp_return_frame_bulk for xdp multi-buff
> > 
> > Changes since v3:
> > - rebase ontop of bpf-next
> > - add patch 10/13 to copy back paged data from a xdp multi-buff frame to
> >   userspace buffer for xdp multi-buff selftests
> > 
> > Changes since v2:
> > - add throughput measurements
> > - drop bpf_xdp_adjust_mb_header bpf helper
> > - introduce selftest for xdp multibuffer
> > - addressed comments on bpf_xdp_get_frags_count
> > - introduce xdp multi-buff support to cpumaps
> > 
> > Changes since v1:
> > - Fix use-after-free in xdp_return_{buff/frame}
> > - Introduce bpf helpers
> > - Introduce xdp_mb sample program
> > - access skb_shared_info->nr_frags only on the last fragment
> > 
> > Changes since RFC:
> > - squash multi-buffer bit initialization in a single patch
> > - add mvneta non-linear XDP buff support for tx side
> > 
> > [0] https://netdevconf.info/0x14/session.html?talk-the-path-to-tcp-4k-mtu-and-rx-zerocopy
> > [1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org
> > [2] https://netdevconf.info/0x14/session.html?tutorial-add-XDP-support-to-a-NIC-driver (XDPmulti-buffers section)
> > 
> > Eelco Chaudron (3):
> >   bpf: add multi-buff support to the bpf_xdp_adjust_tail() API
> >   bpf: add multi-buffer support to xdp copy helpers
> >   bpf: update xdp_adjust_tail selftest to include multi-buffer
> > 
> > Lorenzo Bianconi (15):
> >   net: skbuff: add size metadata to skb_shared_info for xdp
> >   xdp: introduce flags field in xdp_buff/xdp_frame
> >   net: mvneta: update mb bit before passing the xdp buffer to eBPF layer
> >   net: mvneta: simplify mvneta_swbm_add_rx_fragment management
> >   net: xdp: add xdp_update_skb_shared_info utility routine
> >   net: marvell: rely on xdp_update_skb_shared_info utility routine
> >   xdp: add multi-buff support to xdp_return_{buff/frame}
> >   net: mvneta: add multi buffer support to XDP_TX
> >   net: mvneta: enable jumbo frames for XDP
> >   bpf: introduce bpf_xdp_get_buff_len helper
> >   bpf: move user_size out of bpf_test_init
> >   bpf: introduce multibuff support to bpf_prog_test_run_xdp()
> >   bpf: test_run: add xdp_shared_info pointer in bpf_test_finish
> >     signature
> >   net: xdp: introduce bpf_xdp_adjust_data helper
> >   bpf: add bpf_xdp_adjust_data selftest
> > 
> >  drivers/net/ethernet/marvell/mvneta.c         | 204 ++++++++++-------
> >  include/linux/skbuff.h                        |   6 +-
> >  include/net/xdp.h                             |  95 +++++++-
> >  include/uapi/linux/bpf.h                      |  39 ++++
> >  kernel/trace/bpf_trace.c                      |   3 +
> >  net/bpf/test_run.c                            | 117 ++++++++--
> >  net/core/filter.c                             | 213 +++++++++++++++++-
> >  net/core/xdp.c                                |  76 ++++++-
> >  tools/include/uapi/linux/bpf.h                |  39 ++++
> >  .../bpf/prog_tests/xdp_adjust_data.c          |  55 +++++
> >  .../bpf/prog_tests/xdp_adjust_tail.c          | 118 ++++++++++
> >  .../selftests/bpf/prog_tests/xdp_bpf2bpf.c    | 151 +++++++++----
> >  .../bpf/progs/test_xdp_adjust_tail_grow.c     |  10 +-
> >  .../bpf/progs/test_xdp_adjust_tail_shrink.c   |  32 ++-
> >  .../selftests/bpf/progs/test_xdp_bpf2bpf.c    |   2 +-
> >  .../bpf/progs/test_xdp_update_frags.c         |  41 ++++
> >  16 files changed, 1036 insertions(+), 165 deletions(-)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_adjust_data.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_update_frags.c
> > 
> > -- 
> > 2.31.1
> > 
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2021-09-07  8:35 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-20 15:40 [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
2021-08-20 15:40 ` [PATCH v12 bpf-next 01/18] net: skbuff: add size metadata to skb_shared_info for xdp Lorenzo Bianconi
2021-08-31 23:13   ` John Fastabend
2021-09-03 17:13     ` Lorenzo Bianconi
2021-08-20 15:40 ` [PATCH v12 bpf-next 02/18] xdp: introduce flags field in xdp_buff/xdp_frame Lorenzo Bianconi
2021-08-31 23:15   ` John Fastabend
2021-08-20 15:40 ` [PATCH v12 bpf-next 03/18] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer Lorenzo Bianconi
2021-08-31 23:23   ` John Fastabend
2021-09-03 17:23     ` Lorenzo Bianconi
2021-08-20 15:40 ` [PATCH v12 bpf-next 04/18] net: mvneta: simplify mvneta_swbm_add_rx_fragment management Lorenzo Bianconi
2021-08-31 23:34   ` John Fastabend
2021-08-20 15:40 ` [PATCH v12 bpf-next 05/18] net: xdp: add xdp_update_skb_shared_info utility routine Lorenzo Bianconi
2021-08-31 23:38   ` John Fastabend
2021-08-20 15:40 ` [PATCH v12 bpf-next 06/18] net: marvell: rely on " Lorenzo Bianconi
2021-08-31 23:41   ` John Fastabend
2021-09-03 17:27     ` Lorenzo Bianconi
2021-08-20 15:40 ` [PATCH v12 bpf-next 07/18] xdp: add multi-buff support to xdp_return_{buff/frame} Lorenzo Bianconi
2021-08-31 23:43   ` John Fastabend
2021-08-20 15:40 ` [PATCH v12 bpf-next 08/18] net: mvneta: add multi buffer support to XDP_TX Lorenzo Bianconi
2021-08-31 23:44   ` John Fastabend
2021-08-20 15:40 ` [PATCH v12 bpf-next 09/18] net: mvneta: enable jumbo frames for XDP Lorenzo Bianconi
2021-08-31 23:45   ` John Fastabend
2021-08-20 15:40 ` [PATCH v12 bpf-next 10/18] bpf: add multi-buff support to the bpf_xdp_adjust_tail() API Lorenzo Bianconi
2021-09-01  0:10   ` John Fastabend
2021-08-20 15:40 ` [PATCH v12 bpf-next 11/18] bpf: introduce bpf_xdp_get_buff_len helper Lorenzo Bianconi
2021-09-01  0:12   ` John Fastabend
2021-08-20 15:40 ` [PATCH v12 bpf-next 12/18] bpf: add multi-buffer support to xdp copy helpers Lorenzo Bianconi
2021-09-01  0:19   ` John Fastabend
2021-08-20 15:40 ` [PATCH v12 bpf-next 13/18] bpf: move user_size out of bpf_test_init Lorenzo Bianconi
2021-08-20 15:40 ` [PATCH v12 bpf-next 14/18] bpf: introduce multibuff support to bpf_prog_test_run_xdp() Lorenzo Bianconi
2021-08-20 15:40 ` [PATCH v12 bpf-next 15/18] bpf: test_run: add xdp_shared_info pointer in bpf_test_finish signature Lorenzo Bianconi
2021-08-20 15:40 ` [PATCH v12 bpf-next 16/18] bpf: update xdp_adjust_tail selftest to include multi-buffer Lorenzo Bianconi
2021-08-20 15:40 ` [PATCH v12 bpf-next 17/18] net: xdp: introduce bpf_xdp_adjust_data helper Lorenzo Bianconi
2021-09-01  0:36   ` John Fastabend
2021-09-03 17:57     ` Lorenzo Bianconi
2021-08-20 15:40 ` [PATCH v12 bpf-next 18/18] bpf: add bpf_xdp_adjust_data selftest Lorenzo Bianconi
2021-09-01  0:45 ` [PATCH v12 bpf-next 00/18] mvneta: introduce XDP multi-buffer support John Fastabend
2021-09-07  8:35   ` Lorenzo Bianconi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.