bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support
@ 2021-01-19 20:20 Lorenzo Bianconi
  2021-01-19 20:20 ` [PATCH v6 bpf-next 1/8] xdp: introduce mb in xdp_buff/xdp_frame Lorenzo Bianconi
                   ` (8 more replies)
  0 siblings, 9 replies; 12+ messages in thread
From: Lorenzo Bianconi @ 2021-01-19 20:20 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, sameehj

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=y, Size: 4648 bytes --]

This series introduce XDP multi-buffer support. The mvneta driver is
the first to support these new "non-linear" xdp_{buff,frame}. Reviewers
please focus on how these new types of xdp_{buff,frame} packets
traverse the different layers and the layout design. It is on purpose
that BPF-helpers are kept simple, as we don't want to expose the
internal layout to allow later changes.

For now, to keep the design simple and to maintain performance, the XDP
BPF-prog (still) only have access to the first-buffer. It is left for
later (another patchset) to add payload access across multiple buffers.
This patchset should still allow for these future extensions. The goal
is to lift the XDP MTU restriction that comes with XDP, but maintain
same performance as before.

The main idea for the new multi-buffer layout is to reuse the same
layout used for non-linear SKB. We introduced a "xdp_shared_info" data
structure at the end of the first buffer to link together subsequent buffers.
xdp_shared_info will alias skb_shared_info allowing to keep most of the frags
in the same cache-line (while with skb_shared_info only the first fragment will
be placed in the first "shared_info" cache-line). Moreover we introduced some
xdp_shared_info helpers aligned to skb_frag* ones.
Converting xdp_frame to SKB and deliver it to the network stack is shown in
cpumap code (patch 7/8). Building the SKB, the xdp_shared_info structure
will be converted in a skb_shared_info one.

A multi-buffer bit (mb) has been introduced in xdp_{buff,frame} structure
to notify the bpf/network layer if this is a xdp multi-buffer frame (mb = 1)
or not (mb = 0).
The mb bit will be set by a xdp multi-buffer capable driver only for
non-linear frames maintaining the capability to receive linear frames
without any extra cost since the xdp_shared_info structure at the end
of the first buffer will be initialized only if mb is set.

Typical use cases for this series are:
- Jumbo-frames
- Packet header split (please see Google’s use-case @ NetDevConf 0x14, [0])
- TSO

bpf_xdp_adjust_tail helper has been modified to take info account xdp
multi-buff frames.

More info about the main idea behind this approach can be found here [1][2].

Changes since v5:
- rebase on top of bpf-next
- initialize mb bit in xdp_init_buff() and drop per-driver initialization
- drop xdp->mb initialization in xdp_convert_zc_to_xdp_frame()
- postpone introduction of frame_length field in XDP ctx to another series
- minor changes

Changes since v4:
- rebase ontop of bpf-next
- introduce xdp_shared_info to build xdp multi-buff instead of using the
  skb_shared_info struct
- introduce frame_length in xdp ctx
- drop previous bpf helpers
- fix bpf_xdp_adjust_tail for xdp multi-buff
- introduce xdp multi-buff self-tests for bpf_xdp_adjust_tail
- fix xdp_return_frame_bulk for xdp multi-buff

Changes since v3:
- rebase ontop of bpf-next
- add patch 10/13 to copy back paged data from a xdp multi-buff frame to
  userspace buffer for xdp multi-buff selftests

Changes since v2:
- add throughput measurements
- drop bpf_xdp_adjust_mb_header bpf helper
- introduce selftest for xdp multibuffer
- addressed comments on bpf_xdp_get_frags_count
- introduce xdp multi-buff support to cpumaps

Changes since v1:
- Fix use-after-free in xdp_return_{buff/frame}
- Introduce bpf helpers
- Introduce xdp_mb sample program
- access skb_shared_info->nr_frags only on the last fragment

Changes since RFC:
- squash multi-buffer bit initialization in a single patch
- add mvneta non-linear XDP buff support for tx side

[0] https://netdevconf.info/0x14/session.html?talk-the-path-to-tcp-4k-mtu-and-rx-zerocopy
[1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org
[2] https://netdevconf.info/0x14/session.html?tutorial-add-XDP-support-to-a-NIC-driver (XDPmulti-buffers section)

Eelco Chaudron (1):
  bpf: add multi-buff support to the bpf_xdp_adjust_tail() API

Lorenzo Bianconi (7):
  xdp: introduce mb in xdp_buff/xdp_frame
  xdp: add xdp_shared_info data structure
  net: mvneta: update mb bit before passing the xdp buffer to eBPF layer
  xdp: add multi-buff support to xdp_return_{buff/frame}
  net: mvneta: add multi buffer support to XDP_TX
  net: mvneta: enable jumbo frames for XDP
  net: xdp: add multi-buff support to xdp_build_skb_from_fram

 drivers/net/ethernet/marvell/mvneta.c | 182 +++++++++++++++-----------
 include/net/xdp.h                     |  88 +++++++++++--
 net/core/filter.c                     |  63 +++++++++
 net/core/xdp.c                        | 102 ++++++++++++++-
 4 files changed, 349 insertions(+), 86 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v6 bpf-next 1/8] xdp: introduce mb in xdp_buff/xdp_frame
  2021-01-19 20:20 [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
@ 2021-01-19 20:20 ` Lorenzo Bianconi
  2021-01-19 20:20 ` [PATCH v6 bpf-next 2/8] xdp: add xdp_shared_info data structure Lorenzo Bianconi
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Bianconi @ 2021-01-19 20:20 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, sameehj

Introduce multi-buffer bit (mb) in xdp_frame/xdp_buffer data structure
in order to specify if this is a linear buffer (mb = 0) or a multi-buffer
frame (mb = 1). In the latter case the shared_info area at the end of the
first buffer will be properly initialized to link together subsequent
buffers.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 include/net/xdp.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index c4bfdc9a8b79..945767813fdd 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -73,7 +73,8 @@ struct xdp_buff {
 	void *data_hard_start;
 	struct xdp_rxq_info *rxq;
 	struct xdp_txq_info *txq;
-	u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom*/
+	u32 frame_sz:31; /* frame size to deduce data_hard_end/reserved tailroom*/
+	u32 mb:1; /* xdp non-linear buffer */
 };
 
 static __always_inline void
@@ -81,6 +82,7 @@ xdp_init_buff(struct xdp_buff *xdp, u32 frame_sz, struct xdp_rxq_info *rxq)
 {
 	xdp->frame_sz = frame_sz;
 	xdp->rxq = rxq;
+	xdp->mb = 0;
 }
 
 static __always_inline void
@@ -116,7 +118,8 @@ struct xdp_frame {
 	u16 len;
 	u16 headroom;
 	u32 metasize:8;
-	u32 frame_sz:24;
+	u32 frame_sz:23;
+	u32 mb:1; /* xdp non-linear frame */
 	/* Lifetime of xdp_rxq_info is limited to NAPI/enqueue time,
 	 * while mem info is valid on remote CPU.
 	 */
@@ -178,6 +181,7 @@ void xdp_convert_frame_to_buff(struct xdp_frame *frame, struct xdp_buff *xdp)
 	xdp->data_end = frame->data + frame->len;
 	xdp->data_meta = frame->data - frame->metasize;
 	xdp->frame_sz = frame->frame_sz;
+	xdp->mb = frame->mb;
 }
 
 static inline
@@ -204,6 +208,7 @@ int xdp_update_frame_from_buff(struct xdp_buff *xdp,
 	xdp_frame->headroom = headroom - sizeof(*xdp_frame);
 	xdp_frame->metasize = metasize;
 	xdp_frame->frame_sz = xdp->frame_sz;
+	xdp_frame->mb = xdp->mb;
 
 	return 0;
 }
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 bpf-next 2/8] xdp: add xdp_shared_info data structure
  2021-01-19 20:20 [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
  2021-01-19 20:20 ` [PATCH v6 bpf-next 1/8] xdp: introduce mb in xdp_buff/xdp_frame Lorenzo Bianconi
@ 2021-01-19 20:20 ` Lorenzo Bianconi
  2021-01-19 20:20 ` [PATCH v6 bpf-next 3/8] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer Lorenzo Bianconi
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Bianconi @ 2021-01-19 20:20 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, sameehj

Introduce xdp_shared_info data structure to contain info about
"non-linear" xdp frame. xdp_shared_info will alias skb_shared_info
allowing to keep most of the frags in the same cache-line.
Introduce some xdp_shared_info helpers aligned to skb_frag* ones

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 62 +++++++++++++++------------
 include/net/xdp.h                     | 55 ++++++++++++++++++++++--
 2 files changed, 85 insertions(+), 32 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 6290bfb6494e..ed986ffff96b 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -2033,14 +2033,17 @@ int mvneta_rx_refill_queue(struct mvneta_port *pp, struct mvneta_rx_queue *rxq)
 
 static void
 mvneta_xdp_put_buff(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
-		    struct xdp_buff *xdp, struct skb_shared_info *sinfo,
+		    struct xdp_buff *xdp, struct xdp_shared_info *xdp_sinfo,
 		    int sync_len)
 {
 	int i;
 
-	for (i = 0; i < sinfo->nr_frags; i++)
+	for (i = 0; i < xdp_sinfo->nr_frags; i++) {
+		skb_frag_t *frag = &xdp_sinfo->frags[i];
+
 		page_pool_put_full_page(rxq->page_pool,
-					skb_frag_page(&sinfo->frags[i]), true);
+					xdp_get_frag_page(frag), true);
+	}
 	page_pool_put_page(rxq->page_pool, virt_to_head_page(xdp->data),
 			   sync_len, true);
 }
@@ -2179,7 +2182,7 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 	       struct bpf_prog *prog, struct xdp_buff *xdp,
 	       u32 frame_sz, struct mvneta_stats *stats)
 {
-	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
+	struct xdp_shared_info *xdp_sinfo = xdp_get_shared_info_from_buff(xdp);
 	unsigned int len, data_len, sync;
 	u32 ret, act;
 
@@ -2200,7 +2203,7 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 
 		err = xdp_do_redirect(pp->dev, xdp, prog);
 		if (unlikely(err)) {
-			mvneta_xdp_put_buff(pp, rxq, xdp, sinfo, sync);
+			mvneta_xdp_put_buff(pp, rxq, xdp, xdp_sinfo, sync);
 			ret = MVNETA_XDP_DROPPED;
 		} else {
 			ret = MVNETA_XDP_REDIR;
@@ -2211,7 +2214,7 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 	case XDP_TX:
 		ret = mvneta_xdp_xmit_back(pp, xdp);
 		if (ret != MVNETA_XDP_TX)
-			mvneta_xdp_put_buff(pp, rxq, xdp, sinfo, sync);
+			mvneta_xdp_put_buff(pp, rxq, xdp, xdp_sinfo, sync);
 		break;
 	default:
 		bpf_warn_invalid_xdp_action(act);
@@ -2220,7 +2223,7 @@ mvneta_run_xdp(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 		trace_xdp_exception(pp->dev, prog, act);
 		fallthrough;
 	case XDP_DROP:
-		mvneta_xdp_put_buff(pp, rxq, xdp, sinfo, sync);
+		mvneta_xdp_put_buff(pp, rxq, xdp, xdp_sinfo, sync);
 		ret = MVNETA_XDP_DROPPED;
 		stats->xdp_drop++;
 		break;
@@ -2241,9 +2244,9 @@ mvneta_swbm_rx_frame(struct mvneta_port *pp,
 {
 	unsigned char *data = page_address(page);
 	int data_len = -MVNETA_MH_SIZE, len;
+	struct xdp_shared_info *xdp_sinfo;
 	struct net_device *dev = pp->dev;
 	enum dma_data_direction dma_dir;
-	struct skb_shared_info *sinfo;
 
 	if (*size > MVNETA_MAX_RX_BUF_SIZE) {
 		len = MVNETA_MAX_RX_BUF_SIZE;
@@ -2266,8 +2269,8 @@ mvneta_swbm_rx_frame(struct mvneta_port *pp,
 	xdp_prepare_buff(xdp, data, pp->rx_offset_correction + MVNETA_MH_SIZE,
 			 data_len, false);
 
-	sinfo = xdp_get_shared_info_from_buff(xdp);
-	sinfo->nr_frags = 0;
+	xdp_sinfo = xdp_get_shared_info_from_buff(xdp);
+	xdp_sinfo->nr_frags = 0;
 }
 
 static void
@@ -2275,7 +2278,7 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
 			    struct mvneta_rx_desc *rx_desc,
 			    struct mvneta_rx_queue *rxq,
 			    struct xdp_buff *xdp, int *size,
-			    struct skb_shared_info *xdp_sinfo,
+			    struct xdp_shared_info *xdp_sinfo,
 			    struct page *page)
 {
 	struct net_device *dev = pp->dev;
@@ -2298,13 +2301,13 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
 	if (data_len > 0 && xdp_sinfo->nr_frags < MAX_SKB_FRAGS) {
 		skb_frag_t *frag = &xdp_sinfo->frags[xdp_sinfo->nr_frags++];
 
-		skb_frag_off_set(frag, pp->rx_offset_correction);
-		skb_frag_size_set(frag, data_len);
-		__skb_frag_set_page(frag, page);
+		xdp_set_frag_offset(frag, pp->rx_offset_correction);
+		xdp_set_frag_size(frag, data_len);
+		xdp_set_frag_page(frag, page);
 
 		/* last fragment */
 		if (len == *size) {
-			struct skb_shared_info *sinfo;
+			struct xdp_shared_info *sinfo;
 
 			sinfo = xdp_get_shared_info_from_buff(xdp);
 			sinfo->nr_frags = xdp_sinfo->nr_frags;
@@ -2321,10 +2324,13 @@ static struct sk_buff *
 mvneta_swbm_build_skb(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 		      struct xdp_buff *xdp, u32 desc_status)
 {
-	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
-	int i, num_frags = sinfo->nr_frags;
+	struct xdp_shared_info *xdp_sinfo = xdp_get_shared_info_from_buff(xdp);
+	int i, num_frags = xdp_sinfo->nr_frags;
+	skb_frag_t frag_list[MAX_SKB_FRAGS];
 	struct sk_buff *skb;
 
+	memcpy(frag_list, xdp_sinfo->frags, sizeof(skb_frag_t) * num_frags);
+
 	skb = build_skb(xdp->data_hard_start, PAGE_SIZE);
 	if (!skb)
 		return ERR_PTR(-ENOMEM);
@@ -2336,12 +2342,12 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 	mvneta_rx_csum(pp, desc_status, skb);
 
 	for (i = 0; i < num_frags; i++) {
-		skb_frag_t *frag = &sinfo->frags[i];
+		struct page *page = xdp_get_frag_page(&frag_list[i]);
 
 		skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
-				skb_frag_page(frag), skb_frag_off(frag),
-				skb_frag_size(frag), PAGE_SIZE);
-		page_pool_release_page(rxq->page_pool, skb_frag_page(frag));
+				page, xdp_get_frag_offset(&frag_list[i]),
+				xdp_get_frag_size(&frag_list[i]), PAGE_SIZE);
+		page_pool_release_page(rxq->page_pool, page);
 	}
 
 	return skb;
@@ -2354,7 +2360,7 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 {
 	int rx_proc = 0, rx_todo, refill, size = 0;
 	struct net_device *dev = pp->dev;
-	struct skb_shared_info sinfo;
+	struct xdp_shared_info xdp_sinfo;
 	struct mvneta_stats ps = {};
 	struct bpf_prog *xdp_prog;
 	u32 desc_status, frame_sz;
@@ -2363,7 +2369,7 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 	xdp_init_buff(&xdp_buf, PAGE_SIZE, &rxq->xdp_rxq);
 	xdp_buf.data_hard_start = NULL;
 
-	sinfo.nr_frags = 0;
+	xdp_sinfo.nr_frags = 0;
 
 	/* Get number of received packets */
 	rx_todo = mvneta_rxq_busy_desc_num_get(pp, rxq);
@@ -2407,7 +2413,7 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 			}
 
 			mvneta_swbm_add_rx_fragment(pp, rx_desc, rxq, &xdp_buf,
-						    &size, &sinfo, page);
+						    &size, &xdp_sinfo, page);
 		} /* Middle or Last descriptor */
 
 		if (!(rx_status & MVNETA_RXD_LAST_DESC))
@@ -2415,7 +2421,7 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 			continue;
 
 		if (size) {
-			mvneta_xdp_put_buff(pp, rxq, &xdp_buf, &sinfo, -1);
+			mvneta_xdp_put_buff(pp, rxq, &xdp_buf, &xdp_sinfo, -1);
 			goto next;
 		}
 
@@ -2427,7 +2433,7 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 		if (IS_ERR(skb)) {
 			struct mvneta_pcpu_stats *stats = this_cpu_ptr(pp->stats);
 
-			mvneta_xdp_put_buff(pp, rxq, &xdp_buf, &sinfo, -1);
+			mvneta_xdp_put_buff(pp, rxq, &xdp_buf, &xdp_sinfo, -1);
 
 			u64_stats_update_begin(&stats->syncp);
 			stats->es.skb_alloc_error++;
@@ -2444,12 +2450,12 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 		napi_gro_receive(napi, skb);
 next:
 		xdp_buf.data_hard_start = NULL;
-		sinfo.nr_frags = 0;
+		xdp_sinfo.nr_frags = 0;
 	}
 	rcu_read_unlock();
 
 	if (xdp_buf.data_hard_start)
-		mvneta_xdp_put_buff(pp, rxq, &xdp_buf, &sinfo, -1);
+		mvneta_xdp_put_buff(pp, rxq, &xdp_buf, &xdp_sinfo, -1);
 
 	if (ps.xdp_redirect)
 		xdp_do_flush_map();
diff --git a/include/net/xdp.h b/include/net/xdp.h
index 945767813fdd..ca7c4fb30b4e 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -107,10 +107,54 @@ xdp_prepare_buff(struct xdp_buff *xdp, unsigned char *hard_start,
 	((xdp)->data_hard_start + (xdp)->frame_sz -	\
 	 SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
 
-static inline struct skb_shared_info *
+struct xdp_shared_info {
+	u16 nr_frags;
+	u16 data_length; /* paged area length */
+	skb_frag_t frags[MAX_SKB_FRAGS];
+};
+
+static inline struct xdp_shared_info *
 xdp_get_shared_info_from_buff(struct xdp_buff *xdp)
 {
-	return (struct skb_shared_info *)xdp_data_hard_end(xdp);
+	BUILD_BUG_ON(sizeof(struct xdp_shared_info) >
+		     sizeof(struct skb_shared_info));
+	return (struct xdp_shared_info *)xdp_data_hard_end(xdp);
+}
+
+static inline struct page *xdp_get_frag_page(const skb_frag_t *frag)
+{
+	return frag->bv_page;
+}
+
+static inline unsigned int xdp_get_frag_offset(const skb_frag_t *frag)
+{
+	return frag->bv_offset;
+}
+
+static inline unsigned int xdp_get_frag_size(const skb_frag_t *frag)
+{
+	return frag->bv_len;
+}
+
+static inline void *xdp_get_frag_address(const skb_frag_t *frag)
+{
+	return page_address(xdp_get_frag_page(frag)) +
+	       xdp_get_frag_offset(frag);
+}
+
+static inline void xdp_set_frag_page(skb_frag_t *frag, struct page *page)
+{
+	frag->bv_page = page;
+}
+
+static inline void xdp_set_frag_offset(skb_frag_t *frag, u32 offset)
+{
+	frag->bv_offset = offset;
+}
+
+static inline void xdp_set_frag_size(skb_frag_t *frag, u32 size)
+{
+	frag->bv_len = size;
 }
 
 struct xdp_frame {
@@ -140,12 +184,15 @@ static __always_inline void xdp_frame_bulk_init(struct xdp_frame_bulk *bq)
 	bq->xa = NULL;
 }
 
-static inline struct skb_shared_info *
+static inline struct xdp_shared_info *
 xdp_get_shared_info_from_frame(struct xdp_frame *frame)
 {
 	void *data_hard_start = frame->data - frame->headroom - sizeof(*frame);
 
-	return (struct skb_shared_info *)(data_hard_start + frame->frame_sz -
+	/* xdp_shared_info struct must be aligned to skb_shared_info
+	 * area in buffer tailroom
+	 */
+	return (struct xdp_shared_info *)(data_hard_start + frame->frame_sz -
 				SKB_DATA_ALIGN(sizeof(struct skb_shared_info)));
 }
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 bpf-next 3/8] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer
  2021-01-19 20:20 [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
  2021-01-19 20:20 ` [PATCH v6 bpf-next 1/8] xdp: introduce mb in xdp_buff/xdp_frame Lorenzo Bianconi
  2021-01-19 20:20 ` [PATCH v6 bpf-next 2/8] xdp: add xdp_shared_info data structure Lorenzo Bianconi
@ 2021-01-19 20:20 ` Lorenzo Bianconi
  2021-01-19 20:20 ` [PATCH v6 bpf-next 4/8] xdp: add multi-buff support to xdp_return_{buff/frame} Lorenzo Bianconi
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Bianconi @ 2021-01-19 20:20 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, sameehj

Update multi-buffer bit (mb) in xdp_buff to notify XDP/eBPF layer and
XDP remote drivers if this is a "non-linear" XDP buffer. Access
xdp_shared_info only if xdp_buff mb is set.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index ed986ffff96b..3df0602239ad 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -2038,12 +2038,16 @@ mvneta_xdp_put_buff(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 {
 	int i;
 
+	if (likely(!xdp->mb))
+		goto out;
+
 	for (i = 0; i < xdp_sinfo->nr_frags; i++) {
 		skb_frag_t *frag = &xdp_sinfo->frags[i];
 
 		page_pool_put_full_page(rxq->page_pool,
 					xdp_get_frag_page(frag), true);
 	}
+out:
 	page_pool_put_page(rxq->page_pool, virt_to_head_page(xdp->data),
 			   sync_len, true);
 }
@@ -2244,7 +2248,6 @@ mvneta_swbm_rx_frame(struct mvneta_port *pp,
 {
 	unsigned char *data = page_address(page);
 	int data_len = -MVNETA_MH_SIZE, len;
-	struct xdp_shared_info *xdp_sinfo;
 	struct net_device *dev = pp->dev;
 	enum dma_data_direction dma_dir;
 
@@ -2268,9 +2271,6 @@ mvneta_swbm_rx_frame(struct mvneta_port *pp,
 	prefetch(data);
 	xdp_prepare_buff(xdp, data, pp->rx_offset_correction + MVNETA_MH_SIZE,
 			 data_len, false);
-
-	xdp_sinfo = xdp_get_shared_info_from_buff(xdp);
-	xdp_sinfo->nr_frags = 0;
 }
 
 static void
@@ -2305,12 +2305,18 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
 		xdp_set_frag_size(frag, data_len);
 		xdp_set_frag_page(frag, page);
 
+		if (!xdp->mb) {
+			xdp_sinfo->data_length = *size;
+			xdp->mb = 1;
+		}
 		/* last fragment */
 		if (len == *size) {
 			struct xdp_shared_info *sinfo;
 
 			sinfo = xdp_get_shared_info_from_buff(xdp);
 			sinfo->nr_frags = xdp_sinfo->nr_frags;
+			sinfo->data_length = xdp_sinfo->data_length;
+
 			memcpy(sinfo->frags, xdp_sinfo->frags,
 			       sinfo->nr_frags * sizeof(skb_frag_t));
 		}
@@ -2325,11 +2331,15 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 		      struct xdp_buff *xdp, u32 desc_status)
 {
 	struct xdp_shared_info *xdp_sinfo = xdp_get_shared_info_from_buff(xdp);
-	int i, num_frags = xdp_sinfo->nr_frags;
 	skb_frag_t frag_list[MAX_SKB_FRAGS];
+	int i, num_frags = 0;
 	struct sk_buff *skb;
 
-	memcpy(frag_list, xdp_sinfo->frags, sizeof(skb_frag_t) * num_frags);
+	if (unlikely(xdp->mb)) {
+		num_frags = xdp_sinfo->nr_frags;
+		memcpy(frag_list, xdp_sinfo->frags,
+		       sizeof(skb_frag_t) * num_frags);
+	}
 
 	skb = build_skb(xdp->data_hard_start, PAGE_SIZE);
 	if (!skb)
@@ -2341,6 +2351,9 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 	skb_put(skb, xdp->data_end - xdp->data);
 	mvneta_rx_csum(pp, desc_status, skb);
 
+	if (likely(!xdp->mb))
+		return skb;
+
 	for (i = 0; i < num_frags; i++) {
 		struct page *page = xdp_get_frag_page(&frag_list[i]);
 
@@ -2402,6 +2415,7 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 			frame_sz = size - ETH_FCS_LEN;
 			desc_status = rx_status;
 
+			xdp_buf.mb = 0;
 			mvneta_swbm_rx_frame(pp, rx_desc, rxq, &xdp_buf,
 					     &size, page);
 		} else {
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 bpf-next 4/8] xdp: add multi-buff support to xdp_return_{buff/frame}
  2021-01-19 20:20 [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (2 preceding siblings ...)
  2021-01-19 20:20 ` [PATCH v6 bpf-next 3/8] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer Lorenzo Bianconi
@ 2021-01-19 20:20 ` Lorenzo Bianconi
  2021-01-19 20:20 ` [PATCH v6 bpf-next 5/8] net: mvneta: add multi buffer support to XDP_TX Lorenzo Bianconi
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Bianconi @ 2021-01-19 20:20 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, sameehj

Take into account if the received xdp_buff/xdp_frame is non-linear
recycling/returning the frame memory to the allocator or into
xdp_frame_bulk.
Introduce xdp_return_num_frags_from_buff to return a given number of
fragments from a xdp multi-buff starting from the tail.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 include/net/xdp.h | 19 ++++++++++--
 net/core/xdp.c    | 76 ++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 92 insertions(+), 3 deletions(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index ca7c4fb30b4e..a2e09031b346 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -286,6 +286,7 @@ void xdp_return_buff(struct xdp_buff *xdp);
 void xdp_flush_frame_bulk(struct xdp_frame_bulk *bq);
 void xdp_return_frame_bulk(struct xdp_frame *xdpf,
 			   struct xdp_frame_bulk *bq);
+void xdp_return_num_frags_from_buff(struct xdp_buff *xdp, u16 num_frags);
 
 /* When sending xdp_frame into the network stack, then there is no
  * return point callback, which is needed to release e.g. DMA-mapping
@@ -296,10 +297,24 @@ void __xdp_release_frame(void *data, struct xdp_mem_info *mem);
 static inline void xdp_release_frame(struct xdp_frame *xdpf)
 {
 	struct xdp_mem_info *mem = &xdpf->mem;
+	struct xdp_shared_info *xdp_sinfo;
+	int i;
 
 	/* Curr only page_pool needs this */
-	if (mem->type == MEM_TYPE_PAGE_POOL)
-		__xdp_release_frame(xdpf->data, mem);
+	if (mem->type != MEM_TYPE_PAGE_POOL)
+		return;
+
+	if (likely(!xdpf->mb))
+		goto out;
+
+	xdp_sinfo = xdp_get_shared_info_from_frame(xdpf);
+	for (i = 0; i < xdp_sinfo->nr_frags; i++) {
+		struct page *page = xdp_get_frag_page(&xdp_sinfo->frags[i]);
+
+		__xdp_release_frame(page_address(page), mem);
+	}
+out:
+	__xdp_release_frame(xdpf->data, mem);
 }
 
 int xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq,
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 0d2630a35c3e..af9b9c1194fc 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -374,12 +374,38 @@ static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct,
 
 void xdp_return_frame(struct xdp_frame *xdpf)
 {
+	struct xdp_shared_info *xdp_sinfo;
+	int i;
+
+	if (likely(!xdpf->mb))
+		goto out;
+
+	xdp_sinfo = xdp_get_shared_info_from_frame(xdpf);
+	for (i = 0; i < xdp_sinfo->nr_frags; i++) {
+		struct page *page = xdp_get_frag_page(&xdp_sinfo->frags[i]);
+
+		__xdp_return(page_address(page), &xdpf->mem, false, NULL);
+	}
+out:
 	__xdp_return(xdpf->data, &xdpf->mem, false, NULL);
 }
 EXPORT_SYMBOL_GPL(xdp_return_frame);
 
 void xdp_return_frame_rx_napi(struct xdp_frame *xdpf)
 {
+	struct xdp_shared_info *xdp_sinfo;
+	int i;
+
+	if (likely(!xdpf->mb))
+		goto out;
+
+	xdp_sinfo = xdp_get_shared_info_from_frame(xdpf);
+	for (i = 0; i < xdp_sinfo->nr_frags; i++) {
+		struct page *page = xdp_get_frag_page(&xdp_sinfo->frags[i]);
+
+		__xdp_return(page_address(page), &xdpf->mem, true, NULL);
+	}
+out:
 	__xdp_return(xdpf->data, &xdpf->mem, true, NULL);
 }
 EXPORT_SYMBOL_GPL(xdp_return_frame_rx_napi);
@@ -415,7 +441,7 @@ void xdp_return_frame_bulk(struct xdp_frame *xdpf,
 	struct xdp_mem_allocator *xa;
 
 	if (mem->type != MEM_TYPE_PAGE_POOL) {
-		__xdp_return(xdpf->data, &xdpf->mem, false, NULL);
+		xdp_return_frame(xdpf);
 		return;
 	}
 
@@ -434,15 +460,63 @@ void xdp_return_frame_bulk(struct xdp_frame *xdpf,
 		bq->xa = rhashtable_lookup(mem_id_ht, &mem->id, mem_id_rht_params);
 	}
 
+	if (unlikely(xdpf->mb)) {
+		struct xdp_shared_info *xdp_sinfo;
+		int i;
+
+		xdp_sinfo = xdp_get_shared_info_from_frame(xdpf);
+		for (i = 0; i < xdp_sinfo->nr_frags; i++) {
+			skb_frag_t *frag = &xdp_sinfo->frags[i];
+
+			bq->q[bq->count++] = xdp_get_frag_address(frag);
+			if (bq->count == XDP_BULK_QUEUE_SIZE)
+				xdp_flush_frame_bulk(bq);
+		}
+	}
 	bq->q[bq->count++] = xdpf->data;
 }
 EXPORT_SYMBOL_GPL(xdp_return_frame_bulk);
 
 void xdp_return_buff(struct xdp_buff *xdp)
 {
+	struct xdp_shared_info *xdp_sinfo;
+	int i;
+
+	if (likely(!xdp->mb))
+		goto out;
+
+	xdp_sinfo = xdp_get_shared_info_from_buff(xdp);
+	for (i = 0; i < xdp_sinfo->nr_frags; i++) {
+		struct page *page = xdp_get_frag_page(&xdp_sinfo->frags[i]);
+
+		__xdp_return(page_address(page), &xdp->rxq->mem, true, xdp);
+	}
+out:
 	__xdp_return(xdp->data, &xdp->rxq->mem, true, xdp);
 }
 
+void xdp_return_num_frags_from_buff(struct xdp_buff *xdp, u16 num_frags)
+{
+	struct xdp_shared_info *xdp_sinfo;
+	int i;
+
+	if (unlikely(!xdp->mb))
+		return;
+
+	xdp_sinfo = xdp_get_shared_info_from_buff(xdp);
+	num_frags = min_t(u16, num_frags, xdp_sinfo->nr_frags);
+	for (i = 1; i <= num_frags; i++) {
+		skb_frag_t *frag = &xdp_sinfo->frags[xdp_sinfo->nr_frags - i];
+		struct page *page = xdp_get_frag_page(frag);
+
+		xdp_sinfo->data_length -= xdp_get_frag_size(frag);
+		__xdp_return(page_address(page), &xdp->rxq->mem, false, NULL);
+	}
+	xdp_sinfo->nr_frags -= num_frags;
+	xdp->mb = !!xdp_sinfo->nr_frags;
+}
+EXPORT_SYMBOL_GPL(xdp_return_num_frags_from_buff);
+
 /* Only called for MEM_TYPE_PAGE_POOL see xdp.h */
 void __xdp_release_frame(void *data, struct xdp_mem_info *mem)
 {
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 bpf-next 5/8] net: mvneta: add multi buffer support to XDP_TX
  2021-01-19 20:20 [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (3 preceding siblings ...)
  2021-01-19 20:20 ` [PATCH v6 bpf-next 4/8] xdp: add multi-buff support to xdp_return_{buff/frame} Lorenzo Bianconi
@ 2021-01-19 20:20 ` Lorenzo Bianconi
  2021-01-19 20:20 ` [PATCH v6 bpf-next 6/8] net: mvneta: enable jumbo frames for XDP Lorenzo Bianconi
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Bianconi @ 2021-01-19 20:20 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, sameehj

Introduce the capability to map non-linear xdp buffer running
mvneta_xdp_submit_frame() for XDP_TX and XDP_REDIRECT

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 94 ++++++++++++++++-----------
 1 file changed, 56 insertions(+), 38 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 3df0602239ad..c273e674e3de 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -1857,8 +1857,8 @@ static void mvneta_txq_bufs_free(struct mvneta_port *pp,
 			bytes_compl += buf->skb->len;
 			pkts_compl++;
 			dev_kfree_skb_any(buf->skb);
-		} else if (buf->type == MVNETA_TYPE_XDP_TX ||
-			   buf->type == MVNETA_TYPE_XDP_NDO) {
+		} else if ((buf->type == MVNETA_TYPE_XDP_TX ||
+			    buf->type == MVNETA_TYPE_XDP_NDO) && buf->xdpf) {
 			if (napi && buf->type == MVNETA_TYPE_XDP_TX)
 				xdp_return_frame_rx_napi(buf->xdpf);
 			else
@@ -2054,45 +2054,64 @@ mvneta_xdp_put_buff(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 
 static int
 mvneta_xdp_submit_frame(struct mvneta_port *pp, struct mvneta_tx_queue *txq,
-			struct xdp_frame *xdpf, bool dma_map)
+			struct xdp_frame *xdpf, int *nxmit_byte, bool dma_map)
 {
-	struct mvneta_tx_desc *tx_desc;
-	struct mvneta_tx_buf *buf;
-	dma_addr_t dma_addr;
+	struct xdp_shared_info *xdp_sinfo = xdp_get_shared_info_from_frame(xdpf);
+	int i, num_frames = xdpf->mb ? xdp_sinfo->nr_frags + 1 : 1;
+	struct mvneta_tx_desc *tx_desc = NULL;
+	struct page *page;
 
-	if (txq->count >= txq->tx_stop_threshold)
+	if (txq->count + num_frames >= txq->size)
 		return MVNETA_XDP_DROPPED;
 
-	tx_desc = mvneta_txq_next_desc_get(txq);
+	for (i = 0; i < num_frames; i++) {
+		struct mvneta_tx_buf *buf = &txq->buf[txq->txq_put_index];
+		skb_frag_t *frag = i ? &xdp_sinfo->frags[i - 1] : NULL;
+		int len = i ? xdp_get_frag_size(frag) : xdpf->len;
+		dma_addr_t dma_addr;
 
-	buf = &txq->buf[txq->txq_put_index];
-	if (dma_map) {
-		/* ndo_xdp_xmit */
-		dma_addr = dma_map_single(pp->dev->dev.parent, xdpf->data,
-					  xdpf->len, DMA_TO_DEVICE);
-		if (dma_mapping_error(pp->dev->dev.parent, dma_addr)) {
-			mvneta_txq_desc_put(txq);
-			return MVNETA_XDP_DROPPED;
+		tx_desc = mvneta_txq_next_desc_get(txq);
+		if (dma_map) {
+			/* ndo_xdp_xmit */
+			void *data;
+
+			data = frag ? xdp_get_frag_address(frag) : xdpf->data;
+			dma_addr = dma_map_single(pp->dev->dev.parent, data,
+						  len, DMA_TO_DEVICE);
+			if (dma_mapping_error(pp->dev->dev.parent, dma_addr)) {
+				for (; i >= 0; i--)
+					mvneta_txq_desc_put(txq);
+				return MVNETA_XDP_DROPPED;
+			}
+			buf->type = MVNETA_TYPE_XDP_NDO;
+		} else {
+			page = frag ? xdp_get_frag_page(frag)
+				    : virt_to_page(xdpf->data);
+			dma_addr = page_pool_get_dma_addr(page);
+			if (frag)
+				dma_addr += xdp_get_frag_offset(frag);
+			else
+				dma_addr += sizeof(*xdpf) + xdpf->headroom;
+			dma_sync_single_for_device(pp->dev->dev.parent,
+						   dma_addr, len,
+						   DMA_BIDIRECTIONAL);
+			buf->type = MVNETA_TYPE_XDP_TX;
 		}
-		buf->type = MVNETA_TYPE_XDP_NDO;
-	} else {
-		struct page *page = virt_to_page(xdpf->data);
+		buf->xdpf = i ? NULL : xdpf;
+
+		tx_desc->command = !i ? MVNETA_TXD_F_DESC : 0;
+		tx_desc->buf_phys_addr = dma_addr;
+		tx_desc->data_size = len;
+		*nxmit_byte += len;
 
-		dma_addr = page_pool_get_dma_addr(page) +
-			   sizeof(*xdpf) + xdpf->headroom;
-		dma_sync_single_for_device(pp->dev->dev.parent, dma_addr,
-					   xdpf->len, DMA_BIDIRECTIONAL);
-		buf->type = MVNETA_TYPE_XDP_TX;
+		mvneta_txq_inc_put(txq);
 	}
-	buf->xdpf = xdpf;
 
-	tx_desc->command = MVNETA_TXD_FLZ_DESC;
-	tx_desc->buf_phys_addr = dma_addr;
-	tx_desc->data_size = xdpf->len;
+	/*last descriptor */
+	tx_desc->command |= MVNETA_TXD_L_DESC | MVNETA_TXD_Z_PAD;
 
-	mvneta_txq_inc_put(txq);
-	txq->pending++;
-	txq->count++;
+	txq->pending += num_frames;
+	txq->count += num_frames;
 
 	return MVNETA_XDP_TX;
 }
@@ -2103,8 +2122,8 @@ mvneta_xdp_xmit_back(struct mvneta_port *pp, struct xdp_buff *xdp)
 	struct mvneta_pcpu_stats *stats = this_cpu_ptr(pp->stats);
 	struct mvneta_tx_queue *txq;
 	struct netdev_queue *nq;
+	int cpu, nxmit_byte = 0;
 	struct xdp_frame *xdpf;
-	int cpu;
 	u32 ret;
 
 	xdpf = xdp_convert_buff_to_frame(xdp);
@@ -2116,10 +2135,10 @@ mvneta_xdp_xmit_back(struct mvneta_port *pp, struct xdp_buff *xdp)
 	nq = netdev_get_tx_queue(pp->dev, txq->id);
 
 	__netif_tx_lock(nq, cpu);
-	ret = mvneta_xdp_submit_frame(pp, txq, xdpf, false);
+	ret = mvneta_xdp_submit_frame(pp, txq, xdpf, &nxmit_byte, false);
 	if (ret == MVNETA_XDP_TX) {
 		u64_stats_update_begin(&stats->syncp);
-		stats->es.ps.tx_bytes += xdpf->len;
+		stats->es.ps.tx_bytes += nxmit_byte;
 		stats->es.ps.tx_packets++;
 		stats->es.ps.xdp_tx++;
 		u64_stats_update_end(&stats->syncp);
@@ -2158,10 +2177,9 @@ mvneta_xdp_xmit(struct net_device *dev, int num_frame,
 
 	__netif_tx_lock(nq, cpu);
 	for (i = 0; i < num_frame; i++) {
-		ret = mvneta_xdp_submit_frame(pp, txq, frames[i], true);
-		if (ret == MVNETA_XDP_TX) {
-			nxmit_byte += frames[i]->len;
-		} else {
+		ret = mvneta_xdp_submit_frame(pp, txq, frames[i], &nxmit_byte,
+					      true);
+		if (ret != MVNETA_XDP_TX) {
 			xdp_return_frame_rx_napi(frames[i]);
 			nxmit--;
 		}
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 bpf-next 6/8] net: mvneta: enable jumbo frames for XDP
  2021-01-19 20:20 [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (4 preceding siblings ...)
  2021-01-19 20:20 ` [PATCH v6 bpf-next 5/8] net: mvneta: add multi buffer support to XDP_TX Lorenzo Bianconi
@ 2021-01-19 20:20 ` Lorenzo Bianconi
  2021-01-19 20:20 ` [PATCH v6 bpf-next 7/8] net: xdp: add multi-buff support to xdp_build_skb_from_fram Lorenzo Bianconi
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Bianconi @ 2021-01-19 20:20 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, sameehj

Enable the capability to receive jumbo frames even if the interface is
running in XDP mode

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index c273e674e3de..45a3a8cb12fa 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -3763,11 +3763,6 @@ static int mvneta_change_mtu(struct net_device *dev, int mtu)
 		mtu = ALIGN(MVNETA_RX_PKT_SIZE(mtu), 8);
 	}
 
-	if (pp->xdp_prog && mtu > MVNETA_MAX_RX_BUF_SIZE) {
-		netdev_info(dev, "Illegal MTU value %d for XDP mode\n", mtu);
-		return -EINVAL;
-	}
-
 	dev->mtu = mtu;
 
 	if (!netif_running(dev)) {
@@ -4465,11 +4460,6 @@ static int mvneta_xdp_setup(struct net_device *dev, struct bpf_prog *prog,
 	struct mvneta_port *pp = netdev_priv(dev);
 	struct bpf_prog *old_prog;
 
-	if (prog && dev->mtu > MVNETA_MAX_RX_BUF_SIZE) {
-		NL_SET_ERR_MSG_MOD(extack, "MTU too large for XDP");
-		return -EOPNOTSUPP;
-	}
-
 	if (pp->bm_priv) {
 		NL_SET_ERR_MSG_MOD(extack,
 				   "Hardware Buffer Management not supported on XDP");
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 bpf-next 7/8] net: xdp: add multi-buff support to xdp_build_skb_from_fram
  2021-01-19 20:20 [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (5 preceding siblings ...)
  2021-01-19 20:20 ` [PATCH v6 bpf-next 6/8] net: mvneta: enable jumbo frames for XDP Lorenzo Bianconi
@ 2021-01-19 20:20 ` Lorenzo Bianconi
  2021-01-19 20:20 ` [PATCH v6 bpf-next 8/8] bpf: add multi-buff support to the bpf_xdp_adjust_tail() API Lorenzo Bianconi
  2021-01-23  1:03 ` [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Daniel Borkmann
  8 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Bianconi @ 2021-01-19 20:20 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, sameehj

Introduce xdp multi-buff support to
__xdp_build_skb_from_frame/xdp_build_skb_from_fram utility
routines.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 net/core/xdp.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/net/core/xdp.c b/net/core/xdp.c
index af9b9c1194fc..a7cd1ffbab6b 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -592,9 +592,21 @@ struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 					   struct sk_buff *skb,
 					   struct net_device *dev)
 {
+	skb_frag_t frag_list[MAX_SKB_FRAGS];
 	unsigned int headroom, frame_size;
+	int i, num_frags = 0;
 	void *hard_start;
 
+	/* XDP multi-buff frame */
+	if (unlikely(xdpf->mb)) {
+		struct xdp_shared_info *xdp_sinfo;
+
+		xdp_sinfo = xdp_get_shared_info_from_frame(xdpf);
+		num_frags = xdp_sinfo->nr_frags;
+		memcpy(frag_list, xdp_sinfo->frags,
+		       sizeof(skb_frag_t) * num_frags);
+	}
+
 	/* Part of headroom was reserved to xdpf */
 	headroom = sizeof(*xdpf) + xdpf->headroom;
 
@@ -613,6 +625,20 @@ struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 	if (xdpf->metasize)
 		skb_metadata_set(skb, xdpf->metasize);
 
+	/* Single-buff XDP frame */
+	if (likely(!num_frags))
+		goto out;
+
+	for (i = 0; i < num_frags; i++) {
+		struct page *page = xdp_get_frag_page(&frag_list[i]);
+
+		skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
+				page, xdp_get_frag_offset(&frag_list[i]),
+				xdp_get_frag_size(&frag_list[i]),
+				xdpf->frame_sz);
+	}
+
+out:
 	/* Essential SKB info: protocol and skb->dev */
 	skb->protocol = eth_type_trans(skb, dev);
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v6 bpf-next 8/8] bpf: add multi-buff support to the bpf_xdp_adjust_tail() API
  2021-01-19 20:20 [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (6 preceding siblings ...)
  2021-01-19 20:20 ` [PATCH v6 bpf-next 7/8] net: xdp: add multi-buff support to xdp_build_skb_from_fram Lorenzo Bianconi
@ 2021-01-19 20:20 ` Lorenzo Bianconi
  2021-01-23  1:03 ` [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Daniel Borkmann
  8 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Bianconi @ 2021-01-19 20:20 UTC (permalink / raw)
  To: bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, daniel, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, sameehj

From: Eelco Chaudron <echaudro@redhat.com>

This change adds support for tail growing and shrinking for XDP multi-buff.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 include/net/xdp.h |  5 ++++
 net/core/filter.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 68 insertions(+)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index a2e09031b346..4bc86538e052 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -157,6 +157,11 @@ static inline void xdp_set_frag_size(skb_frag_t *frag, u32 size)
 	frag->bv_len = size;
 }
 
+static inline unsigned int xdp_get_frag_tailroom(const skb_frag_t *frag)
+{
+	return PAGE_SIZE - xdp_get_frag_size(frag) - xdp_get_frag_offset(frag);
+}
+
 struct xdp_frame {
 	void *data;
 	u16 len;
diff --git a/net/core/filter.c b/net/core/filter.c
index 9ab94e90d660..86d0adc6cecd 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3860,11 +3860,74 @@ static const struct bpf_func_proto bpf_xdp_adjust_head_proto = {
 	.arg2_type	= ARG_ANYTHING,
 };
 
+static int bpf_xdp_mb_adjust_tail(struct xdp_buff *xdp, int offset)
+{
+	struct xdp_shared_info *xdp_sinfo = xdp_get_shared_info_from_buff(xdp);
+
+	if (unlikely(xdp_sinfo->nr_frags == 0))
+		return -EINVAL;
+
+	if (offset >= 0) {
+		skb_frag_t *frag = &xdp_sinfo->frags[xdp_sinfo->nr_frags - 1];
+		int size;
+
+		if (unlikely(offset > xdp_get_frag_tailroom(frag)))
+			return -EINVAL;
+
+		size = xdp_get_frag_size(frag);
+		memset(xdp_get_frag_address(frag) + size, 0, offset);
+		xdp_set_frag_size(frag, size + offset);
+		xdp_sinfo->data_length += offset;
+	} else {
+		int i, frags_to_free = 0;
+
+		offset = abs(offset);
+
+		if (unlikely(offset > ((int)(xdp->data_end - xdp->data) +
+				       xdp_sinfo->data_length -
+				       ETH_HLEN)))
+			return -EINVAL;
+
+		for (i = xdp_sinfo->nr_frags - 1; i >= 0 && offset > 0; i--) {
+			skb_frag_t *frag = &xdp_sinfo->frags[i];
+			int size = xdp_get_frag_size(frag);
+			int shrink = min_t(int, offset, size);
+
+			offset -= shrink;
+			if (likely(size - shrink > 0)) {
+				/* When updating the final fragment we have
+				 * to adjust the data_length in line.
+				 */
+				xdp_sinfo->data_length -= shrink;
+				xdp_set_frag_size(frag, size - shrink);
+				break;
+			}
+
+			/* When we free the fragments,
+			 * xdp_return_frags_from_buff() will take care
+			 * of updating the xdp share info data_length.
+			 */
+			frags_to_free++;
+		}
+
+		if (unlikely(frags_to_free))
+			xdp_return_num_frags_from_buff(xdp, frags_to_free);
+
+		if (unlikely(offset > 0))
+			xdp->data_end -= offset;
+	}
+
+	return 0;
+}
+
 BPF_CALL_2(bpf_xdp_adjust_tail, struct xdp_buff *, xdp, int, offset)
 {
 	void *data_hard_end = xdp_data_hard_end(xdp); /* use xdp->frame_sz */
 	void *data_end = xdp->data_end + offset;
 
+	if (unlikely(xdp->mb))
+		return bpf_xdp_mb_adjust_tail(xdp, offset);
+
 	/* Notice that xdp_data_hard_end have reserved some tailroom */
 	if (unlikely(data_end > data_hard_end))
 		return -EINVAL;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support
  2021-01-19 20:20 [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (7 preceding siblings ...)
  2021-01-19 20:20 ` [PATCH v6 bpf-next 8/8] bpf: add multi-buff support to the bpf_xdp_adjust_tail() API Lorenzo Bianconi
@ 2021-01-23  1:03 ` Daniel Borkmann
  2021-01-31 17:23   ` Lorenzo Bianconi
  8 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2021-01-23  1:03 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: lorenzo.bianconi, davem, kuba, ast, shayagr, john.fastabend,
	dsahern, brouer, echaudro, jasowang, alexander.duyck, saeed,
	maciej.fijalkowski, sameehj

Hi Lorenzo,

On 1/19/21 9:20 PM, Lorenzo Bianconi wrote:
> This series introduce XDP multi-buffer support. The mvneta driver is
> the first to support these new "non-linear" xdp_{buff,frame}. Reviewers
> please focus on how these new types of xdp_{buff,frame} packets
> traverse the different layers and the layout design. It is on purpose
> that BPF-helpers are kept simple, as we don't want to expose the
> internal layout to allow later changes.
> 
> For now, to keep the design simple and to maintain performance, the XDP
> BPF-prog (still) only have access to the first-buffer. It is left for
> later (another patchset) to add payload access across multiple buffers.

I think xmas break has mostly wiped my memory from 2020 ;) so it would be
good to describe the sketched out design for how this will look like inside
the cover letter in terms of planned uapi exposure. (Additionally discussing
api design proposal could also be sth for BPF office hour to move things
quicker + posting a summary to the list for transparency of course .. just
a thought.)

Glancing over the series, while you've addressed the bpf_xdp_adjust_tail()
helper API, this series will be breaking one assumption of programs at least
for the mvneta driver from one kernel to another if you then use the multi
buff mode, and that is basically bpf_xdp_event_output() API: the assumption
is that you can do full packet capture by passing in the xdp buff len that
is data_end - data ptr. We use it this way for sampling & others might as well
(e.g. xdpcap). But bpf_xdp_copy() would only copy the first buffer today which
would break the full pkt visibility assumption. Just walking the frags if
xdp->mb bit is set would still need some sort of struct xdp_md exposure so
the prog can figure out the actual full size..

> This patchset should still allow for these future extensions. The goal
> is to lift the XDP MTU restriction that comes with XDP, but maintain
> same performance as before.
> 
> The main idea for the new multi-buffer layout is to reuse the same
> layout used for non-linear SKB. We introduced a "xdp_shared_info" data
> structure at the end of the first buffer to link together subsequent buffers.
> xdp_shared_info will alias skb_shared_info allowing to keep most of the frags
> in the same cache-line (while with skb_shared_info only the first fragment will
> be placed in the first "shared_info" cache-line). Moreover we introduced some
> xdp_shared_info helpers aligned to skb_frag* ones.
> Converting xdp_frame to SKB and deliver it to the network stack is shown in
> cpumap code (patch 7/8). Building the SKB, the xdp_shared_info structure
> will be converted in a skb_shared_info one.
> 
> A multi-buffer bit (mb) has been introduced in xdp_{buff,frame} structure
> to notify the bpf/network layer if this is a xdp multi-buffer frame (mb = 1)
> or not (mb = 0).
> The mb bit will be set by a xdp multi-buffer capable driver only for
> non-linear frames maintaining the capability to receive linear frames
> without any extra cost since the xdp_shared_info structure at the end
> of the first buffer will be initialized only if mb is set.
> 
> Typical use cases for this series are:
> - Jumbo-frames
> - Packet header split (please see Google’s use-case @ NetDevConf 0x14, [0])
> - TSO
> 
> bpf_xdp_adjust_tail helper has been modified to take info account xdp
> multi-buff frames.

Also in terms of logistics (I think mentioned earlier already), for the series to
be merged - as with other networking features spanning core + driver (example
af_xdp) - we also need a second driver (ideally mlx5, i40e or ice) implementing
this and ideally be submitted together in the same series for review. For that
it probably also makes sense to more cleanly split out the core pieces from the
driver ones. Either way, how is progress on that side coming along?

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support
  2021-01-23  1:03 ` [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Daniel Borkmann
@ 2021-01-31 17:23   ` Lorenzo Bianconi
  2021-02-03 13:24     ` Jubran, Samih
  0 siblings, 1 reply; 12+ messages in thread
From: Lorenzo Bianconi @ 2021-01-31 17:23 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: bpf, netdev, lorenzo.bianconi, davem, kuba, ast, shayagr,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, sameehj

[-- Attachment #1: Type: text/plain, Size: 5638 bytes --]

> Hi Lorenzo,

Hi Daniel,

sorry for the delay.

> 
> On 1/19/21 9:20 PM, Lorenzo Bianconi wrote:
> > This series introduce XDP multi-buffer support. The mvneta driver is
> > the first to support these new "non-linear" xdp_{buff,frame}. Reviewers
> > please focus on how these new types of xdp_{buff,frame} packets
> > traverse the different layers and the layout design. It is on purpose
> > that BPF-helpers are kept simple, as we don't want to expose the
> > internal layout to allow later changes.
> > 
> > For now, to keep the design simple and to maintain performance, the XDP
> > BPF-prog (still) only have access to the first-buffer. It is left for
> > later (another patchset) to add payload access across multiple buffers.
> 
> I think xmas break has mostly wiped my memory from 2020 ;) so it would be
> good to describe the sketched out design for how this will look like inside
> the cover letter in terms of planned uapi exposure. (Additionally discussing
> api design proposal could also be sth for BPF office hour to move things
> quicker + posting a summary to the list for transparency of course .. just
> a thought.)

I guess the main goal of this series is to add the multi-buffer support to the
xdp core (e.g. in xdp_frame/xdp_buff or in xdp_return_{buff/frame}) and to provide
the first driver with xdp mult-ibuff support. We tried to make the changes
independent from eBPF helpers since we do not have defined use cases for them
yet and we don't want to expose the internal layout to allow later changes.
One possible example is bpf_xdp_adjust_mb_header() helper we sent in v2 patch 6/9
[0] to try to address use-case explained by Eric @ NetDev 0x14 [1].
Anyway I agree there are some missing bits we need to address (e.g. what is the
behaviour when we redirect a mb xdp_frame to a driver not supporting it?)

Ack, I agree we can discuss about mb eBPF helper APIs in BPF office hour mtg in
order to speed-up the process.

> 
> Glancing over the series, while you've addressed the bpf_xdp_adjust_tail()
> helper API, this series will be breaking one assumption of programs at least
> for the mvneta driver from one kernel to another if you then use the multi
> buff mode, and that is basically bpf_xdp_event_output() API: the assumption
> is that you can do full packet capture by passing in the xdp buff len that
> is data_end - data ptr. We use it this way for sampling & others might as well
> (e.g. xdpcap). But bpf_xdp_copy() would only copy the first buffer today which
> would break the full pkt visibility assumption. Just walking the frags if
> xdp->mb bit is set would still need some sort of struct xdp_md exposure so
> the prog can figure out the actual full size..

ack, thx for pointing this out, I will take a look to it.
Eelco added xdp_len to xdp_md in the previous series (he is still working on
it). Another possible approach would be defining a helper, what do you think?

> 
> > This patchset should still allow for these future extensions. The goal
> > is to lift the XDP MTU restriction that comes with XDP, but maintain
> > same performance as before.
> > 
> > The main idea for the new multi-buffer layout is to reuse the same
> > layout used for non-linear SKB. We introduced a "xdp_shared_info" data
> > structure at the end of the first buffer to link together subsequent buffers.
> > xdp_shared_info will alias skb_shared_info allowing to keep most of the frags
> > in the same cache-line (while with skb_shared_info only the first fragment will
> > be placed in the first "shared_info" cache-line). Moreover we introduced some
> > xdp_shared_info helpers aligned to skb_frag* ones.
> > Converting xdp_frame to SKB and deliver it to the network stack is shown in
> > cpumap code (patch 7/8). Building the SKB, the xdp_shared_info structure
> > will be converted in a skb_shared_info one.
> > 
> > A multi-buffer bit (mb) has been introduced in xdp_{buff,frame} structure
> > to notify the bpf/network layer if this is a xdp multi-buffer frame (mb = 1)
> > or not (mb = 0).
> > The mb bit will be set by a xdp multi-buffer capable driver only for
> > non-linear frames maintaining the capability to receive linear frames
> > without any extra cost since the xdp_shared_info structure at the end
> > of the first buffer will be initialized only if mb is set.
> > 
> > Typical use cases for this series are:
> > - Jumbo-frames
> > - Packet header split (please see Google’s use-case @ NetDevConf 0x14, [0])
> > - TSO
> > 
> > bpf_xdp_adjust_tail helper has been modified to take info account xdp
> > multi-buff frames.
> 
> Also in terms of logistics (I think mentioned earlier already), for the series to
> be merged - as with other networking features spanning core + driver (example
> af_xdp) - we also need a second driver (ideally mlx5, i40e or ice) implementing
> this and ideally be submitted together in the same series for review. For that
> it probably also makes sense to more cleanly split out the core pieces from the
> driver ones. Either way, how is progress on that side coming along?

I do not have any updated news about it so far, but afaik amazon folks were working
on adding mb support to ena driver, while intel was planning to add it to af_xdp.
Moreover Jason was looking to add it to virtio-net.

> 
> Thanks,
> Daniel

Regards,
Lorenzo

[0] https://patchwork.ozlabs.org/project/netdev/patch/b7475687bb09aac6ec051596a8ccbb311a54cb8a.1599165031.git.lorenzo@kernel.org/
[1] https://netdevconf.info/0x14/session.html?talk-the-path-to-tcp-4k-mtu-and-rx-zerocopy

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support
  2021-01-31 17:23   ` Lorenzo Bianconi
@ 2021-02-03 13:24     ` Jubran, Samih
  0 siblings, 0 replies; 12+ messages in thread
From: Jubran, Samih @ 2021-02-03 13:24 UTC (permalink / raw)
  To: Lorenzo Bianconi, Daniel Borkmann
  Cc: bpf, netdev, lorenzo.bianconi, davem, kuba, ast, Agroskin, Shay,
	john.fastabend, dsahern, brouer, echaudro, jasowang,
	alexander.duyck, saeed, maciej.fijalkowski, Tzalik, Guy, Dagan,
	Noam



> -----Original Message-----
> From: Lorenzo Bianconi <lorenzo@kernel.org>
> Sent: Sunday, January 31, 2021 7:24 PM
> To: Daniel Borkmann <daniel@iogearbox.net>
> Cc: bpf@vger.kernel.org; netdev@vger.kernel.org;
> lorenzo.bianconi@redhat.com; davem@davemloft.net; kuba@kernel.org;
> ast@kernel.org; Agroskin, Shay <shayagr@amazon.com>;
> john.fastabend@gmail.com; dsahern@kernel.org; brouer@redhat.com;
> echaudro@redhat.com; jasowang@redhat.com;
> alexander.duyck@gmail.com; saeed@kernel.org;
> maciej.fijalkowski@intel.com; Jubran, Samih <sameehj@amazon.com>
> Subject: RE: [EXTERNAL] [PATCH v6 bpf-next 0/8] mvneta: introduce XDP
> multi-buffer support
> 
> > Hi Lorenzo,
> 
> Hi Daniel,
> 
> sorry for the delay.
> 
> >
> > On 1/19/21 9:20 PM, Lorenzo Bianconi wrote:
> > > This series introduce XDP multi-buffer support. The mvneta driver is
> > > the first to support these new "non-linear" xdp_{buff,frame}.
> > > Reviewers please focus on how these new types of xdp_{buff,frame}
> > > packets traverse the different layers and the layout design. It is
> > > on purpose that BPF-helpers are kept simple, as we don't want to
> > > expose the internal layout to allow later changes.
> > >
> > > For now, to keep the design simple and to maintain performance, the
> > > XDP BPF-prog (still) only have access to the first-buffer. It is
> > > left for later (another patchset) to add payload access across multiple
> buffers.
> >
> > I think xmas break has mostly wiped my memory from 2020 ;) so it would
> > be good to describe the sketched out design for how this will look
> > like inside the cover letter in terms of planned uapi exposure.
> > (Additionally discussing api design proposal could also be sth for BPF
> > office hour to move things quicker + posting a summary to the list for
> > transparency of course .. just a thought.)
> 
> I guess the main goal of this series is to add the multi-buffer support to the
> xdp core (e.g. in xdp_frame/xdp_buff or in xdp_return_{buff/frame}) and to
> provide the first driver with xdp mult-ibuff support. We tried to make the
> changes independent from eBPF helpers since we do not have defined use
> cases for them yet and we don't want to expose the internal layout to allow
> later changes.
> One possible example is bpf_xdp_adjust_mb_header() helper we sent in v2
> patch 6/9 [0] to try to address use-case explained by Eric @ NetDev 0x14 [1].
> Anyway I agree there are some missing bits we need to address (e.g. what is
> the behaviour when we redirect a mb xdp_frame to a driver not supporting
> it?)
> 
> Ack, I agree we can discuss about mb eBPF helper APIs in BPF office hour mtg
> in order to speed-up the process.
> 
> >
> > Glancing over the series, while you've addressed the
> > bpf_xdp_adjust_tail() helper API, this series will be breaking one
> > assumption of programs at least for the mvneta driver from one kernel
> > to another if you then use the multi buff mode, and that is basically
> > bpf_xdp_event_output() API: the assumption is that you can do full
> > packet capture by passing in the xdp buff len that is data_end - data
> > ptr. We use it this way for sampling & others might as well (e.g.
> > xdpcap). But bpf_xdp_copy() would only copy the first buffer today
> > which would break the full pkt visibility assumption. Just walking the
> > frags if
> > xdp->mb bit is set would still need some sort of struct xdp_md
> > xdp->exposure so
> > the prog can figure out the actual full size..
> 
> ack, thx for pointing this out, I will take a look to it.
> Eelco added xdp_len to xdp_md in the previous series (he is still working on
> it). Another possible approach would be defining a helper, what do you
> think?
> 
> >
> > > This patchset should still allow for these future extensions. The
> > > goal is to lift the XDP MTU restriction that comes with XDP, but
> > > maintain same performance as before.
> > >
> > > The main idea for the new multi-buffer layout is to reuse the same
> > > layout used for non-linear SKB. We introduced a "xdp_shared_info"
> > > data structure at the end of the first buffer to link together subsequent
> buffers.
> > > xdp_shared_info will alias skb_shared_info allowing to keep most of
> > > the frags in the same cache-line (while with skb_shared_info only
> > > the first fragment will be placed in the first "shared_info"
> > > cache-line). Moreover we introduced some xdp_shared_info helpers
> aligned to skb_frag* ones.
> > > Converting xdp_frame to SKB and deliver it to the network stack is
> > > shown in cpumap code (patch 7/8). Building the SKB, the
> > > xdp_shared_info structure will be converted in a skb_shared_info one.
> > >
> > > A multi-buffer bit (mb) has been introduced in xdp_{buff,frame}
> > > structure to notify the bpf/network layer if this is a xdp
> > > multi-buffer frame (mb = 1) or not (mb = 0).
> > > The mb bit will be set by a xdp multi-buffer capable driver only for
> > > non-linear frames maintaining the capability to receive linear
> > > frames without any extra cost since the xdp_shared_info structure at
> > > the end of the first buffer will be initialized only if mb is set.
> > >
> > > Typical use cases for this series are:
> > > - Jumbo-frames
> > > - Packet header split (please see Google’s use-case @ NetDevConf
> > > 0x14, [0])
> > > - TSO
> > >
> > > bpf_xdp_adjust_tail helper has been modified to take info account
> > > xdp multi-buff frames.
> >
> > Also in terms of logistics (I think mentioned earlier already), for
> > the series to be merged - as with other networking features spanning
> > core + driver (example
> > af_xdp) - we also need a second driver (ideally mlx5, i40e or ice)
> > implementing this and ideally be submitted together in the same series
> > for review. For that it probably also makes sense to more cleanly
> > split out the core pieces from the driver ones. Either way, how is progress
> on that side coming along?
> 
> I do not have any updated news about it so far, but afaik amazon folks were
> working on adding mb support to ena driver, while intel was planning to add
> it to af_xdp.
Hi all,

The ENA XDP MB implementation is currently being rebased on top of this
series. We are in final stages of polishing and testing the code and
looking forward to send it for review, hopefully till the end of March.

> Moreover Jason was looking to add it to virtio-net.
> 
> >
> > Thanks,
> > Daniel
> 
> Regards,
> Lorenzo
>
Best regards,
Sameeh
 
> [0]
> https://patchwork.ozlabs.org/project/netdev/patch/b7475687bb09aac6ec05
> 1596a8ccbb311a54cb8a.1599165031.git.lorenzo@kernel.org/
> [1] https://netdevconf.info/0x14/session.html?talk-the-path-to-tcp-4k-mtu-
> and-rx-zerocopy

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-02-03 13:26 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-19 20:20 [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 1/8] xdp: introduce mb in xdp_buff/xdp_frame Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 2/8] xdp: add xdp_shared_info data structure Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 3/8] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 4/8] xdp: add multi-buff support to xdp_return_{buff/frame} Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 5/8] net: mvneta: add multi buffer support to XDP_TX Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 6/8] net: mvneta: enable jumbo frames for XDP Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 7/8] net: xdp: add multi-buff support to xdp_build_skb_from_fram Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 8/8] bpf: add multi-buff support to the bpf_xdp_adjust_tail() API Lorenzo Bianconi
2021-01-23  1:03 ` [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Daniel Borkmann
2021-01-31 17:23   ` Lorenzo Bianconi
2021-02-03 13:24     ` Jubran, Samih

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).