* [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
@ 2020-10-02 14:41 Lorenzo Bianconi
  2020-10-02 14:41 ` [PATCH v4 bpf-next 01/13] xdp: introduce mb in xdp_buff/xdp_frame Lorenzo Bianconi
                   ` (13 more replies)
  0 siblings, 14 replies; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 14:41 UTC (permalink / raw)
  To: bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro


This series introduces XDP multi-buffer support. The mvneta driver is
the first to support these new "non-linear" xdp_{buff,frame} types.
Reviewers, please focus on how these new xdp_{buff,frame} packets
traverse the different layers and on the layout design. The BPF helpers
are deliberately kept simple so that the internal layout is not exposed
and can be changed later.

For now, to keep the design simple and to maintain performance, the XDP
BPF-prog (still) only has access to the first buffer. Payload access
across multiple buffers is left for a later patchset, and this patchset
should still allow for such future extensions. The goal is to lift the
MTU restriction that comes with XDP while maintaining the same
performance as before.

The main idea for the new multi-buffer layout is to reuse the same
layout used for non-linear SKBs. This relies on the "skb_shared_info"
struct at the end of the first buffer to link together subsequent
buffers. Keeping the layout compatible with SKBs also eases and speeds
up creating an SKB from an xdp_{buff,frame}. Converting an xdp_frame to
an SKB and delivering it to the network stack is shown in the cpumap
code (patch 13/13).

A multi-buffer bit (mb) has been introduced in the xdp_{buff,frame}
structures to notify the bpf/network layer whether this is an xdp
multi-buffer frame (mb = 1) or not (mb = 0).
The mb bit will be set by an xdp multi-buffer capable driver only for
non-linear frames. The capability to receive linear frames without any
extra cost is maintained, since the skb_shared_info structure at the end
of the first buffer is initialized only if mb is set.

In order to provide userspace with some metadata about the non-linear
xdp_{buff,frame}, we introduced 2 bpf helpers:
- bpf_xdp_get_frags_count:
  get the number of fragments for a given xdp multi-buffer.
- bpf_xdp_get_frags_total_size:
  get the total size of fragments for a given xdp multi-buffer.

Typical use cases for this series are:
- Jumbo-frames
- Packet header split (please see Google’s use-case @ NetDevConf 0x14, [0])
- TSO

More info about the main idea behind this approach can be found here [1][2].

We carried out some throughput tests in a standard linear frame scenario in order
to verify that we did not introduce any performance regression when adding xdp
multi-buff support to mvneta:

offered load is ~ 1000Kpps, packet size is 64B, mvneta descriptor size is one PAGE

commit: 879456bedbe5 ("net: mvneta: avoid possible cache misses in mvneta_rx_swbm")
- xdp-pass:      ~162Kpps
- xdp-drop:      ~701Kpps
- xdp-tx:        ~185Kpps
- xdp-redirect:  ~202Kpps

mvneta xdp multi-buff:
- xdp-pass:      ~163Kpps
- xdp-drop:      ~739Kpps
- xdp-tx:        ~182Kpps
- xdp-redirect:  ~202Kpps

Changes since v3:
- rebase ontop of bpf-next
- add patch 10/13 to copy back paged data from an xdp multi-buff frame to
  a userspace buffer for xdp multi-buff selftests

Changes since v2:
- add throughput measurements
- drop bpf_xdp_adjust_mb_header bpf helper
- introduce selftest for xdp multibuffer
- addressed comments on bpf_xdp_get_frags_count
- introduce xdp multi-buff support to cpumaps

Changes since v1:
- Fix use-after-free in xdp_return_{buff/frame}
- Introduce bpf helpers
- Introduce xdp_mb sample program
- access skb_shared_info->nr_frags only on the last fragment

Changes since RFC:
- squash multi-buffer bit initialization in a single patch
- add mvneta non-linear XDP buff support for tx side

[0] https://netdevconf.info/0x14/session.html?talk-the-path-to-tcp-4k-mtu-and-rx-zerocopy
[1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org
[2] https://netdevconf.info/0x14/session.html?tutorial-add-XDP-support-to-a-NIC-driver (XDPmulti-buffers section)

Lorenzo Bianconi (11):
  xdp: introduce mb in xdp_buff/xdp_frame
  xdp: initialize xdp_buff mb bit to 0 in all XDP drivers
  net: mvneta: update mb bit before passing the xdp buffer to eBPF layer
  xdp: add multi-buff support to xdp_return_{buff/frame}
  net: mvneta: add multi buffer support to XDP_TX
  bpf: move user_size out of bpf_test_init
  bpf: introduce multibuff support to bpf_prog_test_run_xdp()
  bpf: test_run: add skb_shared_info pointer in bpf_test_finish
    signature
  bpf: add xdp multi-buffer selftest
  net: mvneta: enable jumbo frames for XDP
  bpf: cpumap: introduce xdp multi-buff support

Sameeh Jubran (2):
  bpf: introduce bpf_xdp_get_frags_{count, total_size} helpers
  samples/bpf: add bpf program that uses xdp mb helpers

 drivers/net/ethernet/amazon/ena/ena_netdev.c  |   1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |   1 +
 .../net/ethernet/cavium/thunder/nicvf_main.c  |   1 +
 .../net/ethernet/freescale/dpaa2/dpaa2-eth.c  |   1 +
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   |   1 +
 drivers/net/ethernet/intel/ice/ice_txrx.c     |   1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   1 +
 .../net/ethernet/intel/ixgbevf/ixgbevf_main.c |   1 +
 drivers/net/ethernet/marvell/mvneta.c         | 131 +++++++------
 .../net/ethernet/marvell/mvpp2/mvpp2_main.c   |   1 +
 drivers/net/ethernet/mellanox/mlx4/en_rx.c    |   1 +
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   |   1 +
 .../ethernet/netronome/nfp/nfp_net_common.c   |   1 +
 drivers/net/ethernet/qlogic/qede/qede_fp.c    |   1 +
 drivers/net/ethernet/sfc/rx.c                 |   1 +
 drivers/net/ethernet/socionext/netsec.c       |   1 +
 drivers/net/ethernet/ti/cpsw.c                |   1 +
 drivers/net/ethernet/ti/cpsw_new.c            |   1 +
 drivers/net/hyperv/netvsc_bpf.c               |   1 +
 drivers/net/tun.c                             |   2 +
 drivers/net/veth.c                            |   1 +
 drivers/net/virtio_net.c                      |   2 +
 drivers/net/xen-netfront.c                    |   1 +
 include/net/xdp.h                             |  31 ++-
 include/uapi/linux/bpf.h                      |  14 ++
 kernel/bpf/cpumap.c                           |  45 +----
 net/bpf/test_run.c                            | 118 ++++++++++--
 net/core/dev.c                                |   1 +
 net/core/filter.c                             |  42 ++++
 net/core/xdp.c                                | 104 ++++++++++
 samples/bpf/Makefile                          |   3 +
 samples/bpf/xdp_mb_kern.c                     |  68 +++++++
 samples/bpf/xdp_mb_user.c                     | 182 ++++++++++++++++++
 tools/include/uapi/linux/bpf.h                |  14 ++
 .../testing/selftests/bpf/prog_tests/xdp_mb.c |  79 ++++++++
 .../selftests/bpf/progs/test_xdp_multi_buff.c |  24 +++
 36 files changed, 757 insertions(+), 123 deletions(-)
 create mode 100644 samples/bpf/xdp_mb_kern.c
 create mode 100644 samples/bpf/xdp_mb_user.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_mb.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_multi_buff.c

-- 
2.26.2


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v4 bpf-next 01/13] xdp: introduce mb in xdp_buff/xdp_frame
  2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
@ 2020-10-02 14:41 ` Lorenzo Bianconi
  2020-10-02 14:42 ` [PATCH v4 bpf-next 02/13] xdp: initialize xdp_buff mb bit to 0 in all XDP drivers Lorenzo Bianconi
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 14:41 UTC (permalink / raw)
  To: bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro

Introduce a multi-buffer bit (mb) in the xdp_frame/xdp_buff data structures
to specify whether this is a linear buffer (mb = 0) or a multi-buffer
frame (mb = 1). In the latter case the shared_info area at the end of the
first buffer is properly initialized to link together subsequent
buffers.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 include/net/xdp.h | 8 ++++++--
 net/core/xdp.c    | 1 +
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index 3814fb631d52..42f439f9fcda 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -72,7 +72,8 @@ struct xdp_buff {
 	void *data_hard_start;
 	struct xdp_rxq_info *rxq;
 	struct xdp_txq_info *txq;
-	u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom*/
+	u32 frame_sz:31; /* frame size to deduce data_hard_end/reserved tailroom*/
+	u32 mb:1; /* xdp non-linear buffer */
 };
 
 /* Reserve memory area at end-of data area.
@@ -96,7 +97,8 @@ struct xdp_frame {
 	u16 len;
 	u16 headroom;
 	u32 metasize:8;
-	u32 frame_sz:24;
+	u32 frame_sz:23;
+	u32 mb:1; /* xdp non-linear frame */
 	/* Lifetime of xdp_rxq_info is limited to NAPI/enqueue time,
 	 * while mem info is valid on remote CPU.
 	 */
@@ -141,6 +143,7 @@ void xdp_convert_frame_to_buff(struct xdp_frame *frame, struct xdp_buff *xdp)
 	xdp->data_end = frame->data + frame->len;
 	xdp->data_meta = frame->data - frame->metasize;
 	xdp->frame_sz = frame->frame_sz;
+	xdp->mb = frame->mb;
 }
 
 static inline
@@ -167,6 +170,7 @@ int xdp_update_frame_from_buff(struct xdp_buff *xdp,
 	xdp_frame->headroom = headroom - sizeof(*xdp_frame);
 	xdp_frame->metasize = metasize;
 	xdp_frame->frame_sz = xdp->frame_sz;
+	xdp_frame->mb = xdp->mb;
 
 	return 0;
 }
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 48aba933a5a8..884f140fc3be 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -454,6 +454,7 @@ struct xdp_frame *xdp_convert_zc_to_xdp_frame(struct xdp_buff *xdp)
 	xdpf->headroom = 0;
 	xdpf->metasize = metasize;
 	xdpf->frame_sz = PAGE_SIZE;
+	xdpf->mb = xdp->mb;
 	xdpf->mem.type = MEM_TYPE_PAGE_ORDER0;
 
 	xsk_buff_free(xdp);
-- 
2.26.2



* [PATCH v4 bpf-next 02/13] xdp: initialize xdp_buff mb bit to 0 in all XDP drivers
  2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
  2020-10-02 14:41 ` [PATCH v4 bpf-next 01/13] xdp: introduce mb in xdp_buff/xdp_frame Lorenzo Bianconi
@ 2020-10-02 14:42 ` Lorenzo Bianconi
  2020-10-02 14:42 ` [PATCH v4 bpf-next 03/13] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer Lorenzo Bianconi
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 14:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro

Initialize the multi-buffer bit (mb) to 0 in all XDP-capable drivers.
This is a preliminary patch to enable xdp multi-buffer support.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c        | 1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c       | 1 +
 drivers/net/ethernet/cavium/thunder/nicvf_main.c    | 1 +
 drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c    | 1 +
 drivers/net/ethernet/intel/i40e/i40e_txrx.c         | 1 +
 drivers/net/ethernet/intel/ice/ice_txrx.c           | 1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c       | 1 +
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c   | 1 +
 drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c     | 1 +
 drivers/net/ethernet/mellanox/mlx4/en_rx.c          | 1 +
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c     | 1 +
 drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 1 +
 drivers/net/ethernet/qlogic/qede/qede_fp.c          | 1 +
 drivers/net/ethernet/sfc/rx.c                       | 1 +
 drivers/net/ethernet/socionext/netsec.c             | 1 +
 drivers/net/ethernet/ti/cpsw.c                      | 1 +
 drivers/net/ethernet/ti/cpsw_new.c                  | 1 +
 drivers/net/hyperv/netvsc_bpf.c                     | 1 +
 drivers/net/tun.c                                   | 2 ++
 drivers/net/veth.c                                  | 1 +
 drivers/net/virtio_net.c                            | 2 ++
 drivers/net/xen-netfront.c                          | 1 +
 net/core/dev.c                                      | 1 +
 23 files changed, 25 insertions(+)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index e8131dadc22c..339319b97853 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -1595,6 +1595,7 @@ static int ena_clean_rx_irq(struct ena_ring *rx_ring, struct napi_struct *napi,
 	res_budget = budget;
 	xdp.rxq = &rx_ring->xdp_rxq;
 	xdp.frame_sz = ENA_PAGE_SIZE;
+	xdp.mb = 0;
 
 	do {
 		xdp_verdict = XDP_PASS;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
index fcc262064766..344644b6dd4d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
@@ -139,6 +139,7 @@ bool bnxt_rx_xdp(struct bnxt *bp, struct bnxt_rx_ring_info *rxr, u16 cons,
 	xdp.data_end = *data_ptr + *len;
 	xdp.rxq = &rxr->xdp_rxq;
 	xdp.frame_sz = PAGE_SIZE; /* BNXT_RX_PAGE_MODE(bp) when XDP enabled */
+	xdp.mb = 0;
 	orig_data = xdp.data;
 
 	rcu_read_lock();
diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
index 0a94c396173b..7fdabaabab1b 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c
@@ -553,6 +553,7 @@ static inline bool nicvf_xdp_rx(struct nicvf *nic, struct bpf_prog *prog,
 	xdp.data_end = xdp.data + len;
 	xdp.rxq = &rq->xdp_rxq;
 	xdp.frame_sz = RCV_FRAG_LEN + XDP_PACKET_HEADROOM;
+	xdp.mb = 0;
 	orig_data = xdp.data;
 
 	rcu_read_lock();
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
index fe4caf7aad7c..8410e713162e 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
@@ -366,6 +366,7 @@ static u32 dpaa2_eth_run_xdp(struct dpaa2_eth_priv *priv,
 
 	xdp.frame_sz = DPAA2_ETH_RX_BUF_RAW_SIZE -
 		(dpaa2_fd_get_offset(fd) - XDP_PACKET_HEADROOM);
+	xdp.mb = 0;
 
 	xdp_act = bpf_prog_run_xdp(xdp_prog, &xdp);
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index d43ce13a93c9..5df07bc98283 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2332,6 +2332,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 	xdp.frame_sz = i40e_rx_frame_truesize(rx_ring, 0);
 #endif
 	xdp.rxq = &rx_ring->xdp_rxq;
+	xdp.mb = 0;
 
 	while (likely(total_rx_packets < (unsigned int)budget)) {
 		struct i40e_rx_buffer *rx_buffer;
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index eae75260fe20..d641f513b8d9 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -1089,6 +1089,7 @@ int ice_clean_rx_irq(struct ice_ring *rx_ring, int budget)
 #if (PAGE_SIZE < 8192)
 	xdp.frame_sz = ice_rx_frame_truesize(rx_ring, 0);
 #endif
+	xdp.mb = 0;
 
 	/* start the loop to process Rx packets bounded by 'budget' */
 	while (likely(total_rx_pkts < (unsigned int)budget)) {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index a190d5c616fc..39f9d2032b9d 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -2298,6 +2298,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 #if (PAGE_SIZE < 8192)
 	xdp.frame_sz = ixgbe_rx_frame_truesize(rx_ring, 0);
 #endif
+	xdp.mb = 0;
 
 	while (likely(total_rx_packets < budget)) {
 		union ixgbe_adv_rx_desc *rx_desc;
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index 82fce27f682b..1fbc740c266e 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -1129,6 +1129,7 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
 	struct xdp_buff xdp;
 
 	xdp.rxq = &rx_ring->xdp_rxq;
+	xdp.mb = 0;
 
 	/* Frame size depend on rx_ring setup when PAGE_SIZE=4K */
 #if (PAGE_SIZE < 8192)
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index f6616c8933ca..01661ade9009 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -3558,6 +3558,7 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 			xdp.data = data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM;
 			xdp.data_end = xdp.data + rx_bytes;
 			xdp.frame_sz = PAGE_SIZE;
+			xdp.mb = 0;
 
 			if (bm_pool->pkt_size == MVPP2_BM_SHORT_PKT_SIZE)
 				xdp.rxq = &rxq->xdp_rxq_short;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 99d7737e8ad6..de1ae36b068e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -684,6 +684,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 	xdp_prog = rcu_dereference(ring->xdp_prog);
 	xdp.rxq = &ring->xdp_rxq;
 	xdp.frame_sz = priv->frag_info[0].frag_stride;
+	xdp.mb = 0;
 	doorbell_pending = 0;
 
 	/* We assume a 1:1 mapping between CQEs and Rx descriptors, so Rx
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 599f5b5ebc97..82c3e755dadd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1133,6 +1133,7 @@ static void mlx5e_fill_xdp_buff(struct mlx5e_rq *rq, void *va, u16 headroom,
 	xdp->data_end = xdp->data + len;
 	xdp->rxq = &rq->xdp_rxq;
 	xdp->frame_sz = rq->buff.frame0_sz;
+	xdp->mb = 0;
 }
 
 static struct sk_buff *
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index b150da43adb2..69fab1010752 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1824,6 +1824,7 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget)
 	true_bufsz = xdp_prog ? PAGE_SIZE : dp->fl_bufsz;
 	xdp.frame_sz = PAGE_SIZE - NFP_NET_RX_BUF_HEADROOM;
 	xdp.rxq = &rx_ring->xdp_rxq;
+	xdp.mb = 0;
 	tx_ring = r_vec->xdp_ring;
 
 	while (pkts_polled < budget) {
diff --git a/drivers/net/ethernet/qlogic/qede/qede_fp.c b/drivers/net/ethernet/qlogic/qede/qede_fp.c
index a2494bf85007..14a54094ca08 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_fp.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_fp.c
@@ -1096,6 +1096,7 @@ static bool qede_rx_xdp(struct qede_dev *edev,
 	xdp.data_end = xdp.data + *len;
 	xdp.rxq = &rxq->xdp_rxq;
 	xdp.frame_sz = rxq->rx_buf_seg_size; /* PAGE_SIZE when XDP enabled */
+	xdp.mb = 0;
 
 	/* Queues always have a full reset currently, so for the time
 	 * being until there's atomic program replace just mark read
diff --git a/drivers/net/ethernet/sfc/rx.c b/drivers/net/ethernet/sfc/rx.c
index aaa112877561..286feb510c21 100644
--- a/drivers/net/ethernet/sfc/rx.c
+++ b/drivers/net/ethernet/sfc/rx.c
@@ -301,6 +301,7 @@ static bool efx_do_xdp(struct efx_nic *efx, struct efx_channel *channel,
 	xdp.data_end = xdp.data + rx_buf->len;
 	xdp.rxq = &rx_queue->xdp_rxq_info;
 	xdp.frame_sz = efx->rx_page_buf_step;
+	xdp.mb = 0;
 
 	xdp_act = bpf_prog_run_xdp(xdp_prog, &xdp);
 	rcu_read_unlock();
diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c
index 806eb651cea3..0f0567083a6c 100644
--- a/drivers/net/ethernet/socionext/netsec.c
+++ b/drivers/net/ethernet/socionext/netsec.c
@@ -947,6 +947,7 @@ static int netsec_process_rx(struct netsec_priv *priv, int budget)
 
 	xdp.rxq = &dring->xdp_rxq;
 	xdp.frame_sz = PAGE_SIZE;
+	xdp.mb = 0;
 
 	rcu_read_lock();
 	xdp_prog = READ_ONCE(priv->xdp_prog);
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 9fd1f77190ad..558e0abb03c1 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -407,6 +407,7 @@ static void cpsw_rx_handler(void *token, int len, int status)
 		xdp.data_hard_start = pa;
 		xdp.rxq = &priv->xdp_rxq[ch];
 		xdp.frame_sz = PAGE_SIZE;
+		xdp.mb = 0;
 
 		port = priv->emac_port + cpsw->data.dual_emac;
 		ret = cpsw_run_xdp(priv, ch, &xdp, page, port);
diff --git a/drivers/net/ethernet/ti/cpsw_new.c b/drivers/net/ethernet/ti/cpsw_new.c
index f779d2e1b5c5..7baab97e302a 100644
--- a/drivers/net/ethernet/ti/cpsw_new.c
+++ b/drivers/net/ethernet/ti/cpsw_new.c
@@ -350,6 +350,7 @@ static void cpsw_rx_handler(void *token, int len, int status)
 		xdp.data_hard_start = pa;
 		xdp.rxq = &priv->xdp_rxq[ch];
 		xdp.frame_sz = PAGE_SIZE;
+		xdp.mb = 0;
 
 		ret = cpsw_run_xdp(priv, ch, &xdp, page, priv->emac_port);
 		if (ret != CPSW_XDP_PASS)
diff --git a/drivers/net/hyperv/netvsc_bpf.c b/drivers/net/hyperv/netvsc_bpf.c
index 440486d9c999..a4bafc64997f 100644
--- a/drivers/net/hyperv/netvsc_bpf.c
+++ b/drivers/net/hyperv/netvsc_bpf.c
@@ -50,6 +50,7 @@ u32 netvsc_run_xdp(struct net_device *ndev, struct netvsc_channel *nvchan,
 	xdp->data_end = xdp->data + len;
 	xdp->rxq = &nvchan->xdp_rxq;
 	xdp->frame_sz = PAGE_SIZE;
+	xdp->mb = 0;
 
 	memcpy(xdp->data, data, len);
 
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index be69d272052f..d8380feb7626 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1641,6 +1641,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
 		xdp.data_end = xdp.data + len;
 		xdp.rxq = &tfile->xdp_rxq;
 		xdp.frame_sz = buflen;
+		xdp.mb = 0;
 
 		act = bpf_prog_run_xdp(xdp_prog, &xdp);
 		if (act == XDP_REDIRECT || act == XDP_TX) {
@@ -2388,6 +2389,7 @@ static int tun_xdp_one(struct tun_struct *tun,
 		xdp_set_data_meta_invalid(xdp);
 		xdp->rxq = &tfile->xdp_rxq;
 		xdp->frame_sz = buflen;
+		xdp->mb = 0;
 
 		act = bpf_prog_run_xdp(xdp_prog, xdp);
 		err = tun_xdp_act(tun, xdp_prog, xdp, act);
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 091e5b4ba042..e25af95a532d 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -711,6 +711,7 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
 	/* SKB "head" area always have tailroom for skb_shared_info */
 	xdp.frame_sz = (void *)skb_end_pointer(skb) - xdp.data_hard_start;
 	xdp.frame_sz += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+	xdp.mb = 0;
 
 	orig_data = xdp.data;
 	orig_data_end = xdp.data_end;
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 7145c83c6c8c..3d39d7622840 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -690,6 +690,7 @@ static struct sk_buff *receive_small(struct net_device *dev,
 		xdp.data_meta = xdp.data;
 		xdp.rxq = &rq->xdp_rxq;
 		xdp.frame_sz = buflen;
+		xdp.mb = 0;
 		orig_data = xdp.data;
 		act = bpf_prog_run_xdp(xdp_prog, &xdp);
 		stats->xdp_packets++;
@@ -860,6 +861,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 		xdp.data_meta = xdp.data;
 		xdp.rxq = &rq->xdp_rxq;
 		xdp.frame_sz = frame_sz - vi->hdr_len;
+		xdp.mb = 0;
 
 		act = bpf_prog_run_xdp(xdp_prog, &xdp);
 		stats->xdp_packets++;
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 3e9895bec15f..00440ad34ca8 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -870,6 +870,7 @@ static u32 xennet_run_xdp(struct netfront_queue *queue, struct page *pdata,
 	xdp->data_end = xdp->data + len;
 	xdp->rxq = &queue->xdp_rxq;
 	xdp->frame_sz = XEN_PAGE_SIZE - XDP_PACKET_HEADROOM;
+	xdp->mb = 0;
 
 	act = bpf_prog_run_xdp(prog, xdp);
 	switch (act) {
diff --git a/net/core/dev.c b/net/core/dev.c
index 9d55bf5d1a65..1e78b028518d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4640,6 +4640,7 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
 	/* SKB "head" area always have tailroom for skb_shared_info */
 	xdp->frame_sz  = (void *)skb_end_pointer(skb) - xdp->data_hard_start;
 	xdp->frame_sz += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+	xdp->mb = 0;
 
 	orig_data_end = xdp->data_end;
 	orig_data = xdp->data;
-- 
2.26.2



* [PATCH v4 bpf-next 03/13] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer
  2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
  2020-10-02 14:41 ` [PATCH v4 bpf-next 01/13] xdp: introduce mb in xdp_buff/xdp_frame Lorenzo Bianconi
  2020-10-02 14:42 ` [PATCH v4 bpf-next 02/13] xdp: initialize xdp_buff mb bit to 0 in all XDP drivers Lorenzo Bianconi
@ 2020-10-02 14:42 ` Lorenzo Bianconi
  2020-10-02 14:42 ` [PATCH v4 bpf-next 04/13] xdp: add multi-buff support to xdp_return_{buff/frame} Lorenzo Bianconi
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 14:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro

Update the multi-buffer bit (mb) in xdp_buff to notify the XDP/eBPF layer
and XDP remote drivers whether this is a "non-linear" XDP buffer. Access
skb_shared_info only if the xdp_buff mb bit is set.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 42 +++++++++++++++++----------
 1 file changed, 26 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index d095718355d3..a431e8478297 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -2027,12 +2027,17 @@ static void
 mvneta_xdp_put_buff(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 		    struct xdp_buff *xdp, int sync_len, bool napi)
 {
-	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
+	struct skb_shared_info *sinfo;
 	int i;
 
+	if (likely(!xdp->mb))
+		goto out;
+
+	sinfo = xdp_get_shared_info_from_buff(xdp);
 	for (i = 0; i < sinfo->nr_frags; i++)
 		page_pool_put_full_page(rxq->page_pool,
 					skb_frag_page(&sinfo->frags[i]), napi);
+out:
 	page_pool_put_page(rxq->page_pool, virt_to_head_page(xdp->data),
 			   sync_len, napi);
 }
@@ -2234,7 +2239,6 @@ mvneta_swbm_rx_frame(struct mvneta_port *pp,
 	int data_len = -MVNETA_MH_SIZE, len;
 	struct net_device *dev = pp->dev;
 	enum dma_data_direction dma_dir;
-	struct skb_shared_info *sinfo;
 
 	if (*size > MVNETA_MAX_RX_BUF_SIZE) {
 		len = MVNETA_MAX_RX_BUF_SIZE;
@@ -2259,9 +2263,6 @@ mvneta_swbm_rx_frame(struct mvneta_port *pp,
 	xdp->data = data + pp->rx_offset_correction + MVNETA_MH_SIZE;
 	xdp->data_end = xdp->data + data_len;
 	xdp_set_data_meta_invalid(xdp);
-
-	sinfo = xdp_get_shared_info_from_buff(xdp);
-	sinfo->nr_frags = 0;
 }
 
 static void
@@ -2272,9 +2273,9 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
 			    struct page *page)
 {
 	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
+	int data_len, len, nfrags = xdp->mb ? sinfo->nr_frags : 0;
 	struct net_device *dev = pp->dev;
 	enum dma_data_direction dma_dir;
-	int data_len, len;
 
 	if (*size > MVNETA_MAX_RX_BUF_SIZE) {
 		len = MVNETA_MAX_RX_BUF_SIZE;
@@ -2288,17 +2289,21 @@ mvneta_swbm_add_rx_fragment(struct mvneta_port *pp,
 				rx_desc->buf_phys_addr,
 				len, dma_dir);
 
-	if (data_len > 0 && sinfo->nr_frags < MAX_SKB_FRAGS) {
-		skb_frag_t *frag = &sinfo->frags[sinfo->nr_frags];
+	if (data_len > 0 && nfrags < MAX_SKB_FRAGS) {
+		skb_frag_t *frag = &sinfo->frags[nfrags];
 
 		skb_frag_off_set(frag, pp->rx_offset_correction);
 		skb_frag_size_set(frag, data_len);
 		__skb_frag_set_page(frag, page);
-		sinfo->nr_frags++;
-
-		rx_desc->buf_phys_addr = 0;
+		nfrags++;
+	} else {
+		page_pool_put_full_page(rxq->page_pool, page, true);
 	}
+
+	rx_desc->buf_phys_addr = 0;
+	sinfo->nr_frags = nfrags;
 	*size -= len;
+	xdp->mb = 1;
 }
 
 static struct sk_buff *
@@ -2306,7 +2311,7 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 		      struct xdp_buff *xdp, u32 desc_status)
 {
 	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
-	int i, num_frags = sinfo->nr_frags;
+	int i, num_frags = xdp->mb ? sinfo->nr_frags : 0;
 	struct sk_buff *skb;
 
 	skb = build_skb(xdp->data_hard_start, PAGE_SIZE);
@@ -2319,6 +2324,9 @@ mvneta_swbm_build_skb(struct mvneta_port *pp, struct mvneta_rx_queue *rxq,
 	skb_put(skb, xdp->data_end - xdp->data);
 	mvneta_rx_csum(pp, desc_status, skb);
 
+	if (likely(!xdp->mb))
+		return skb;
+
 	for (i = 0; i < num_frags; i++) {
 		skb_frag_t *frag = &sinfo->frags[i];
 
@@ -2338,13 +2346,14 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 {
 	int rx_proc = 0, rx_todo, refill, size = 0;
 	struct net_device *dev = pp->dev;
-	struct xdp_buff xdp_buf = {
-		.frame_sz = PAGE_SIZE,
-		.rxq = &rxq->xdp_rxq,
-	};
 	struct mvneta_stats ps = {};
 	struct bpf_prog *xdp_prog;
 	u32 desc_status, frame_sz;
+	struct xdp_buff xdp_buf;
+
+	xdp_buf.data_hard_start = NULL;
+	xdp_buf.frame_sz = PAGE_SIZE;
+	xdp_buf.rxq = &rxq->xdp_rxq;
 
 	/* Get number of received packets */
 	rx_todo = mvneta_rxq_busy_desc_num_get(pp, rxq);
@@ -2377,6 +2386,7 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 			frame_sz = size - ETH_FCS_LEN;
 			desc_status = rx_status;
 
+			xdp_buf.mb = 0;
 			mvneta_swbm_rx_frame(pp, rx_desc, rxq, &xdp_buf,
 					     &size, page);
 		} else {
-- 
2.26.2



* [PATCH v4 bpf-next 04/13] xdp: add multi-buff support to xdp_return_{buff/frame}
  2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (2 preceding siblings ...)
  2020-10-02 14:42 ` [PATCH v4 bpf-next 03/13] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer Lorenzo Bianconi
@ 2020-10-02 14:42 ` Lorenzo Bianconi
  2020-10-02 14:42 ` [PATCH v4 bpf-next 05/13] net: mvneta: add multi buffer support to XDP_TX Lorenzo Bianconi
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 14:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro

Take into account whether the received xdp_buff/xdp_frame is non-linear
when recycling/returning the frame memory to the allocator.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 include/net/xdp.h | 18 ++++++++++++++++--
 net/core/xdp.c    | 39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index 42f439f9fcda..4d47076546ff 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -208,10 +208,24 @@ void __xdp_release_frame(void *data, struct xdp_mem_info *mem);
 static inline void xdp_release_frame(struct xdp_frame *xdpf)
 {
 	struct xdp_mem_info *mem = &xdpf->mem;
+	struct skb_shared_info *sinfo;
+	int i;
 
 	/* Curr only page_pool needs this */
-	if (mem->type == MEM_TYPE_PAGE_POOL)
-		__xdp_release_frame(xdpf->data, mem);
+	if (mem->type != MEM_TYPE_PAGE_POOL)
+		return;
+
+	if (likely(!xdpf->mb))
+		goto out;
+
+	sinfo = xdp_get_shared_info_from_frame(xdpf);
+	for (i = 0; i < sinfo->nr_frags; i++) {
+		struct page *page = skb_frag_page(&sinfo->frags[i]);
+
+		__xdp_release_frame(page_address(page), mem);
+	}
+out:
+	__xdp_release_frame(xdpf->data, mem);
 }
 
 int xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq,
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 884f140fc3be..6d4fd4dddb00 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -370,18 +370,57 @@ static void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct)
 
 void xdp_return_frame(struct xdp_frame *xdpf)
 {
+	struct skb_shared_info *sinfo;
+	int i;
+
+	if (likely(!xdpf->mb))
+		goto out;
+
+	sinfo = xdp_get_shared_info_from_frame(xdpf);
+	for (i = 0; i < sinfo->nr_frags; i++) {
+		struct page *page = skb_frag_page(&sinfo->frags[i]);
+
+		__xdp_return(page_address(page), &xdpf->mem, false);
+	}
+out:
 	__xdp_return(xdpf->data, &xdpf->mem, false);
 }
 EXPORT_SYMBOL_GPL(xdp_return_frame);
 
 void xdp_return_frame_rx_napi(struct xdp_frame *xdpf)
 {
+	struct skb_shared_info *sinfo;
+	int i;
+
+	if (likely(!xdpf->mb))
+		goto out;
+
+	sinfo = xdp_get_shared_info_from_frame(xdpf);
+	for (i = 0; i < sinfo->nr_frags; i++) {
+		struct page *page = skb_frag_page(&sinfo->frags[i]);
+
+		__xdp_return(page_address(page), &xdpf->mem, true);
+	}
+out:
 	__xdp_return(xdpf->data, &xdpf->mem, true);
 }
 EXPORT_SYMBOL_GPL(xdp_return_frame_rx_napi);
 
 void xdp_return_buff(struct xdp_buff *xdp)
 {
+	struct skb_shared_info *sinfo;
+	int i;
+
+	if (likely(!xdp->mb))
+		goto out;
+
+	sinfo = xdp_get_shared_info_from_buff(xdp);
+	for (i = 0; i < sinfo->nr_frags; i++) {
+		struct page *page = skb_frag_page(&sinfo->frags[i]);
+
+		__xdp_return(page_address(page), &xdp->rxq->mem, true);
+	}
+out:
 	__xdp_return(xdp->data, &xdp->rxq->mem, true);
 }
 
-- 
2.26.2



* [PATCH v4 bpf-next 05/13] net: mvneta: add multi buffer support to XDP_TX
  2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (3 preceding siblings ...)
  2020-10-02 14:42 ` [PATCH v4 bpf-next 04/13] xdp: add multi-buff support to xdp_return_{buff/frame} Lorenzo Bianconi
@ 2020-10-02 14:42 ` Lorenzo Bianconi
  2020-10-02 14:42 ` [PATCH v4 bpf-next 06/13] bpf: introduce bpf_xdp_get_frags_{count, total_size} helpers Lorenzo Bianconi
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 14:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro

Introduce the capability to map a non-linear xdp buffer in
mvneta_xdp_submit_frame() for XDP_TX and XDP_REDIRECT

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 79 +++++++++++++++++----------
 1 file changed, 49 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index a431e8478297..f709650974ea 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -1852,8 +1852,8 @@ static void mvneta_txq_bufs_free(struct mvneta_port *pp,
 			bytes_compl += buf->skb->len;
 			pkts_compl++;
 			dev_kfree_skb_any(buf->skb);
-		} else if (buf->type == MVNETA_TYPE_XDP_TX ||
-			   buf->type == MVNETA_TYPE_XDP_NDO) {
+		} else if ((buf->type == MVNETA_TYPE_XDP_TX ||
+			    buf->type == MVNETA_TYPE_XDP_NDO) && buf->xdpf) {
 			if (napi && buf->type == MVNETA_TYPE_XDP_TX)
 				xdp_return_frame_rx_napi(buf->xdpf);
 			else
@@ -2046,43 +2046,62 @@ static int
 mvneta_xdp_submit_frame(struct mvneta_port *pp, struct mvneta_tx_queue *txq,
 			struct xdp_frame *xdpf, bool dma_map)
 {
-	struct mvneta_tx_desc *tx_desc;
-	struct mvneta_tx_buf *buf;
-	dma_addr_t dma_addr;
+	struct skb_shared_info *sinfo = xdp_get_shared_info_from_frame(xdpf);
+	int i, num_frames = xdpf->mb ? sinfo->nr_frags + 1 : 1;
+	struct mvneta_tx_desc *tx_desc = NULL;
+	struct page *page;
 
-	if (txq->count >= txq->tx_stop_threshold)
+	if (txq->count + num_frames >= txq->tx_stop_threshold)
 		return MVNETA_XDP_DROPPED;
 
-	tx_desc = mvneta_txq_next_desc_get(txq);
+	for (i = 0; i < num_frames; i++) {
+		struct mvneta_tx_buf *buf = &txq->buf[txq->txq_put_index];
+		skb_frag_t *frag = i ? &sinfo->frags[i - 1] : NULL;
+		int len = frag ? skb_frag_size(frag) : xdpf->len;
+		dma_addr_t dma_addr;
 
-	buf = &txq->buf[txq->txq_put_index];
-	if (dma_map) {
-		/* ndo_xdp_xmit */
-		dma_addr = dma_map_single(pp->dev->dev.parent, xdpf->data,
-					  xdpf->len, DMA_TO_DEVICE);
-		if (dma_mapping_error(pp->dev->dev.parent, dma_addr)) {
-			mvneta_txq_desc_put(txq);
-			return MVNETA_XDP_DROPPED;
+		tx_desc = mvneta_txq_next_desc_get(txq);
+		if (dma_map) {
+			/* ndo_xdp_xmit */
+			void *data;
+
+			data = frag ? skb_frag_address(frag) : xdpf->data;
+			dma_addr = dma_map_single(pp->dev->dev.parent, data,
+						  len, DMA_TO_DEVICE);
+			if (dma_mapping_error(pp->dev->dev.parent, dma_addr)) {
+				for (; i >= 0; i--)
+					mvneta_txq_desc_put(txq);
+				return MVNETA_XDP_DROPPED;
+			}
+			buf->type = MVNETA_TYPE_XDP_NDO;
+		} else {
+			page = frag ? skb_frag_page(frag)
+				    : virt_to_page(xdpf->data);
+			dma_addr = page_pool_get_dma_addr(page);
+			if (frag)
+				dma_addr += skb_frag_off(frag);
+			else
+				dma_addr += sizeof(*xdpf) + xdpf->headroom;
+			dma_sync_single_for_device(pp->dev->dev.parent,
+						   dma_addr, len,
+						   DMA_BIDIRECTIONAL);
+			buf->type = MVNETA_TYPE_XDP_TX;
 		}
-		buf->type = MVNETA_TYPE_XDP_NDO;
-	} else {
-		struct page *page = virt_to_page(xdpf->data);
+		buf->xdpf = i ? NULL : xdpf;
 
-		dma_addr = page_pool_get_dma_addr(page) +
-			   sizeof(*xdpf) + xdpf->headroom;
-		dma_sync_single_for_device(pp->dev->dev.parent, dma_addr,
-					   xdpf->len, DMA_BIDIRECTIONAL);
-		buf->type = MVNETA_TYPE_XDP_TX;
+		if (!i)
+			tx_desc->command = MVNETA_TXD_F_DESC;
+		tx_desc->buf_phys_addr = dma_addr;
+		tx_desc->data_size = len;
+
+		mvneta_txq_inc_put(txq);
 	}
-	buf->xdpf = xdpf;
 
-	tx_desc->command = MVNETA_TXD_FLZ_DESC;
-	tx_desc->buf_phys_addr = dma_addr;
-	tx_desc->data_size = xdpf->len;
+	/*last descriptor */
+	tx_desc->command |= MVNETA_TXD_L_DESC | MVNETA_TXD_Z_PAD;
 
-	mvneta_txq_inc_put(txq);
-	txq->pending++;
-	txq->count++;
+	txq->pending += num_frames;
+	txq->count += num_frames;
 
 	return MVNETA_XDP_TX;
 }
-- 
2.26.2



* [PATCH v4 bpf-next 06/13] bpf: introduce bpf_xdp_get_frags_{count, total_size} helpers
  2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (4 preceding siblings ...)
  2020-10-02 14:42 ` [PATCH v4 bpf-next 05/13] net: mvneta: add multi buffer support to XDP_TX Lorenzo Bianconi
@ 2020-10-02 14:42 ` Lorenzo Bianconi
  2020-10-02 15:36   ` John Fastabend
  2020-10-02 14:42 ` [PATCH v4 bpf-next 07/13] samples/bpf: add bpf program that uses xdp mb helpers Lorenzo Bianconi
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 14:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro

From: Sameeh Jubran <sameehj@amazon.com>

Introduce the following two bpf helpers in order to provide some
metadata about an xdp multi-buff frame to the bpf layer:

- bpf_xdp_get_frags_count()
  get the number of fragments for a given xdp multi-buffer.

- bpf_xdp_get_frags_total_size()
  get the total size of fragments for a given xdp multi-buffer.

Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 include/uapi/linux/bpf.h       | 14 ++++++++++++
 net/core/filter.c              | 42 ++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h | 14 ++++++++++++
 3 files changed, 70 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4f556cfcbfbe..0715995eb18c 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3668,6 +3668,18 @@ union bpf_attr {
  * 	Return
  * 		The helper returns **TC_ACT_REDIRECT** on success or
  * 		**TC_ACT_SHOT** on error.
+ *
+ * int bpf_xdp_get_frags_count(struct xdp_buff *xdp_md)
+ *	Description
+ *		Get the number of fragments for a given xdp multi-buffer.
+ *	Return
+ *		The number of fragments
+ *
+ * int bpf_xdp_get_frags_total_size(struct xdp_buff *xdp_md)
+ *	Description
+ *		Get the total size of fragments for a given xdp multi-buffer.
+ *	Return
+ *		The total size of fragments for a given xdp multi-buffer.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -3823,6 +3835,8 @@ union bpf_attr {
 	FN(seq_printf_btf),		\
 	FN(skb_cgroup_classid),		\
 	FN(redirect_neigh),		\
+	FN(xdp_get_frags_count),	\
+	FN(xdp_get_frags_total_size),	\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/net/core/filter.c b/net/core/filter.c
index 3fb6adad1957..4c55b788c4c5 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3739,6 +3739,44 @@ static const struct bpf_func_proto bpf_xdp_adjust_head_proto = {
 	.arg2_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_1(bpf_xdp_get_frags_count, struct  xdp_buff*, xdp)
+{
+	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
+
+	return xdp->mb ? sinfo->nr_frags : 0;
+}
+
+const struct bpf_func_proto bpf_xdp_get_frags_count_proto = {
+	.func		= bpf_xdp_get_frags_count,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+};
+
+BPF_CALL_1(bpf_xdp_get_frags_total_size, struct  xdp_buff*, xdp)
+{
+	struct skb_shared_info *sinfo;
+	int nfrags, i, size = 0;
+
+	if (likely(!xdp->mb))
+		return 0;
+
+	sinfo = xdp_get_shared_info_from_buff(xdp);
+	nfrags = min_t(u8, sinfo->nr_frags, MAX_SKB_FRAGS);
+
+	for (i = 0; i < nfrags; i++)
+		size += skb_frag_size(&sinfo->frags[i]);
+
+	return size;
+}
+
+const struct bpf_func_proto bpf_xdp_get_frags_total_size_proto = {
+	.func		= bpf_xdp_get_frags_total_size,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+};
+
 BPF_CALL_2(bpf_xdp_adjust_tail, struct xdp_buff *, xdp, int, offset)
 {
 	void *data_hard_end = xdp_data_hard_end(xdp); /* use xdp->frame_sz */
@@ -7092,6 +7130,10 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_xdp_redirect_map_proto;
 	case BPF_FUNC_xdp_adjust_tail:
 		return &bpf_xdp_adjust_tail_proto;
+	case BPF_FUNC_xdp_get_frags_count:
+		return &bpf_xdp_get_frags_count_proto;
+	case BPF_FUNC_xdp_get_frags_total_size:
+		return &bpf_xdp_get_frags_total_size_proto;
 	case BPF_FUNC_fib_lookup:
 		return &bpf_xdp_fib_lookup_proto;
 #ifdef CONFIG_INET
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 4f556cfcbfbe..0715995eb18c 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3668,6 +3668,18 @@ union bpf_attr {
  * 	Return
  * 		The helper returns **TC_ACT_REDIRECT** on success or
  * 		**TC_ACT_SHOT** on error.
+ *
+ * int bpf_xdp_get_frags_count(struct xdp_buff *xdp_md)
+ *	Description
+ *		Get the number of fragments for a given xdp multi-buffer.
+ *	Return
+ *		The number of fragments
+ *
+ * int bpf_xdp_get_frags_total_size(struct xdp_buff *xdp_md)
+ *	Description
+ *		Get the total size of fragments for a given xdp multi-buffer.
+ *	Return
+ *		The total size of fragments for a given xdp multi-buffer.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -3823,6 +3835,8 @@ union bpf_attr {
 	FN(seq_printf_btf),		\
 	FN(skb_cgroup_classid),		\
 	FN(redirect_neigh),		\
+	FN(xdp_get_frags_count),	\
+	FN(xdp_get_frags_total_size),	\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.26.2



* [PATCH v4 bpf-next 07/13] samples/bpf: add bpf program that uses xdp mb helpers
  2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (5 preceding siblings ...)
  2020-10-02 14:42 ` [PATCH v4 bpf-next 06/13] bpf: introduce bpf_xdp_get_frags_{count, total_size} helpers Lorenzo Bianconi
@ 2020-10-02 14:42 ` Lorenzo Bianconi
  2020-10-02 14:42 ` [PATCH v4 bpf-next 08/13] bpf: move user_size out of bpf_test_init Lorenzo Bianconi
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 14:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro

From: Sameeh Jubran <sameehj@amazon.com>

The bpf program returns XDP_PASS for every packet and calculates the
total number of bytes in its linear and paged parts.

The program is executed with:
./xdp_mb [if name]

and has the following output format:
[if index]: [rx packet count] pkt/sec, [number of bytes] bytes/sec

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 samples/bpf/Makefile      |   3 +
 samples/bpf/xdp_mb_kern.c |  68 ++++++++++++++
 samples/bpf/xdp_mb_user.c | 182 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 253 insertions(+)
 create mode 100644 samples/bpf/xdp_mb_kern.c
 create mode 100644 samples/bpf/xdp_mb_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 4f1ed0e3cf9f..12e32516f02a 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -54,6 +54,7 @@ tprogs-y += task_fd_query
 tprogs-y += xdp_sample_pkts
 tprogs-y += ibumad
 tprogs-y += hbm
+tprogs-y += xdp_mb
 
 # Libbpf dependencies
 LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a
@@ -111,6 +112,7 @@ task_fd_query-objs := bpf_load.o task_fd_query_user.o $(TRACE_HELPERS)
 xdp_sample_pkts-objs := xdp_sample_pkts_user.o $(TRACE_HELPERS)
 ibumad-objs := bpf_load.o ibumad_user.o $(TRACE_HELPERS)
 hbm-objs := bpf_load.o hbm.o $(CGROUP_HELPERS)
+xdp_mb-objs := xdp_mb_user.o
 
 # Tell kbuild to always build the programs
 always-y := $(tprogs-y)
@@ -172,6 +174,7 @@ always-y += ibumad_kern.o
 always-y += hbm_out_kern.o
 always-y += hbm_edt_kern.o
 always-y += xdpsock_kern.o
+always-y += xdp_mb_kern.o
 
 ifeq ($(ARCH), arm)
 # Strip all except -D__LINUX_ARM_ARCH__ option needed to handle linux
diff --git a/samples/bpf/xdp_mb_kern.c b/samples/bpf/xdp_mb_kern.c
new file mode 100644
index 000000000000..f366bce92fc7
--- /dev/null
+++ b/samples/bpf/xdp_mb_kern.c
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/*
+ * Copyright 2020 Amazon.com, Inc. or its affiliates. All rights reserved.
+ */
+#define KBUILD_MODNAME "foo"
+#include <uapi/linux/bpf.h>
+#include <linux/in.h>
+#include <linux/if_ether.h>
+#include <linux/if_packet.h>
+#include <linux/if_vlan.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <bpf/bpf_helpers.h>
+
+/* count RX packets */
+struct {
+	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+	__type(key, u32);
+	__type(value, long);
+	__uint(max_entries, 1);
+} rx_cnt SEC(".maps");
+
+/* count RX fragments */
+struct {
+	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+	__type(key, u32);
+	__type(value, long);
+	__uint(max_entries, 1);
+} rx_frags SEC(".maps");
+
+/* count total number of bytes */
+struct {
+	__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+	__type(key, u32);
+	__type(value, long);
+	__uint(max_entries, 1);
+} tot_len SEC(".maps");
+
+SEC("xdp_mb")
+int xdp_mb_prog(struct xdp_md *ctx)
+{
+	void *data_end = (void *)(long)ctx->data_end;
+	void *data = (void *)(long)ctx->data;
+	u32 frag_offset = 0, frag_size = 0;
+	u32 key = 0, nfrags;
+	long *value;
+	int i, len;
+
+	value = bpf_map_lookup_elem(&rx_cnt, &key);
+	if (value)
+		*value += 1;
+
+	len = data_end - data;
+	nfrags = bpf_xdp_get_frags_count(ctx);
+	len += bpf_xdp_get_frags_total_size(ctx);
+
+	value = bpf_map_lookup_elem(&tot_len, &key);
+	if (value)
+		*value += len;
+
+	value = bpf_map_lookup_elem(&rx_frags, &key);
+	if (value)
+		*value += nfrags;
+
+	return XDP_PASS;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/samples/bpf/xdp_mb_user.c b/samples/bpf/xdp_mb_user.c
new file mode 100644
index 000000000000..6f555e94b748
--- /dev/null
+++ b/samples/bpf/xdp_mb_user.c
@@ -0,0 +1,182 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/*
+ * Copyright 2020 Amazon.com, Inc. or its affiliates. All rights reserved.
+ */
+#include <linux/bpf.h>
+#include <linux/if_link.h>
+#include <assert.h>
+#include <errno.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <libgen.h>
+#include <sys/resource.h>
+#include <net/if.h>
+
+#include "bpf_util.h"
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+
+static __u32 xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST | XDP_FLAGS_DRV_MODE;
+static __u32 prog_id;
+static int rx_cnt_fd, tot_len_fd, rx_frags_fd;
+static int ifindex;
+
+static void int_exit(int sig)
+{
+	__u32 curr_prog_id = 0;
+
+	if (bpf_get_link_xdp_id(ifindex, &curr_prog_id, xdp_flags)) {
+		printf("bpf_get_link_xdp_id failed\n");
+		exit(1);
+	}
+	if (prog_id == curr_prog_id)
+		bpf_set_link_xdp_fd(ifindex, -1, xdp_flags);
+	else if (!curr_prog_id)
+		printf("couldn't find a prog id on a given interface\n");
+	else
+		printf("program on interface changed, not removing\n");
+	exit(0);
+}
+
+/* count total packets and bytes per second */
+static void poll_stats(int interval)
+{
+	unsigned int nr_cpus = bpf_num_possible_cpus();
+	__u64 rx_frags_cnt[nr_cpus], rx_frags_cnt_prev[nr_cpus];
+	__u64 tot_len[nr_cpus], tot_len_prev[nr_cpus];
+	__u64 rx_cnt[nr_cpus], rx_cnt_prev[nr_cpus];
+	int i;
+
+	memset(rx_frags_cnt_prev, 0, sizeof(rx_frags_cnt_prev));
+	memset(tot_len_prev, 0, sizeof(tot_len_prev));
+	memset(rx_cnt_prev, 0, sizeof(rx_cnt_prev));
+
+	while (1) {
+		__u64 n_rx_pkts = 0, rx_frags = 0, rx_len = 0;
+		__u32 key = 0;
+
+		sleep(interval);
+
+		/* fetch rx cnt */
+		assert(bpf_map_lookup_elem(rx_cnt_fd, &key, rx_cnt) == 0);
+		for (i = 0; i < nr_cpus; i++)
+			n_rx_pkts += (rx_cnt[i] - rx_cnt_prev[i]);
+		memcpy(rx_cnt_prev, rx_cnt, sizeof(rx_cnt));
+
+		/* fetch rx frags */
+		assert(bpf_map_lookup_elem(rx_frags_fd, &key, rx_frags_cnt) == 0);
+		for (i = 0; i < nr_cpus; i++)
+			rx_frags += (rx_frags_cnt[i] - rx_frags_cnt_prev[i]);
+		memcpy(rx_frags_cnt_prev, rx_frags_cnt, sizeof(rx_frags_cnt));
+
+		/* count total bytes of packets */
+		assert(bpf_map_lookup_elem(tot_len_fd, &key, tot_len) == 0);
+		for (i = 0; i < nr_cpus; i++)
+			rx_len += (tot_len[i] - tot_len_prev[i]);
+		memcpy(tot_len_prev, tot_len, sizeof(tot_len));
+
+		if (n_rx_pkts)
+			printf("ifindex %i: %10llu pkt/s, %10llu frags/s, %10llu bytes/s\n",
+			       ifindex, n_rx_pkts / interval, rx_frags / interval,
+			       rx_len / interval);
+	}
+}
+
+static void usage(const char *prog)
+{
+	fprintf(stderr,
+		"%s: %s [OPTS] IFACE\n\n"
+		"OPTS:\n"
+		"    -F    force loading prog\n",
+		__func__, prog);
+}
+
+int main(int argc, char **argv)
+{
+	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
+	struct bpf_prog_load_attr prog_load_attr = {
+		.prog_type	= BPF_PROG_TYPE_XDP,
+	};
+	int prog_fd, opt;
+	struct bpf_prog_info info = {};
+	__u32 info_len = sizeof(info);
+	const char *optstr = "F";
+	struct bpf_program *prog;
+	struct bpf_object *obj;
+	char filename[256];
+	int err;
+
+	while ((opt = getopt(argc, argv, optstr)) != -1) {
+		switch (opt) {
+		case 'F':
+			xdp_flags &= ~XDP_FLAGS_UPDATE_IF_NOEXIST;
+			break;
+		default:
+			usage(basename(argv[0]));
+			return 1;
+		}
+	}
+
+	if (optind == argc) {
+		usage(basename(argv[0]));
+		return 1;
+	}
+
+	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
+		perror("setrlimit(RLIMIT_MEMLOCK)");
+		return 1;
+	}
+
+	ifindex = if_nametoindex(argv[optind]);
+	if (!ifindex) {
+		perror("if_nametoindex");
+		return 1;
+	}
+
+	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+	prog_load_attr.file = filename;
+
+	if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd))
+		return 1;
+
+	prog = bpf_program__next(NULL, obj);
+	if (!prog) {
+		printf("finding a prog in obj file failed\n");
+		return 1;
+	}
+
+	if (!prog_fd) {
+		printf("bpf_prog_load_xattr: %s\n", strerror(errno));
+		return 1;
+	}
+
+	rx_cnt_fd = bpf_object__find_map_fd_by_name(obj, "rx_cnt");
+	rx_frags_fd = bpf_object__find_map_fd_by_name(obj, "rx_frags");
+	tot_len_fd = bpf_object__find_map_fd_by_name(obj, "tot_len");
+	if (rx_cnt_fd < 0 || rx_frags_fd < 0 || tot_len_fd < 0) {
+		printf("bpf_object__find_map_fd_by_name failed\n");
+		return 1;
+	}
+
+	if (bpf_set_link_xdp_fd(ifindex, prog_fd, xdp_flags) < 0) {
+		printf("ERROR: link set xdp fd failed on %d\n", ifindex);
+		return 1;
+	}
+
+	err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len);
+	if (err) {
+		printf("can't get prog info - %s\n", strerror(errno));
+		return err;
+	}
+	prog_id = info.id;
+
+	signal(SIGINT, int_exit);
+	signal(SIGTERM, int_exit);
+
+	poll_stats(1);
+
+	return 0;
+}
-- 
2.26.2



* [PATCH v4 bpf-next 08/13] bpf: move user_size out of bpf_test_init
  2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (6 preceding siblings ...)
  2020-10-02 14:42 ` [PATCH v4 bpf-next 07/13] samples/bpf: add bpf program that uses xdp mb helpers Lorenzo Bianconi
@ 2020-10-02 14:42 ` Lorenzo Bianconi
  2020-10-02 14:42 ` [PATCH v4 bpf-next 09/13] bpf: introduce multibuff support to bpf_prog_test_run_xdp() Lorenzo Bianconi
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 14:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro

Move user_size out of bpf_test_init and pass it explicitly in the
routine signature instead of reading kattr->test.data_size_in
internally. This is a preliminary patch to introduce the xdp
multi-buff selftest

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 net/bpf/test_run.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index c1c30a9f76f3..bd291f5f539c 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -171,11 +171,10 @@ __diag_pop();
 
 ALLOW_ERROR_INJECTION(bpf_modify_return_test, ERRNO);
 
-static void *bpf_test_init(const union bpf_attr *kattr, u32 size,
-			   u32 headroom, u32 tailroom)
+static void *bpf_test_init(const union bpf_attr *kattr, u32 user_size,
+			   u32 size, u32 headroom, u32 tailroom)
 {
 	void __user *data_in = u64_to_user_ptr(kattr->test.data_in);
-	u32 user_size = kattr->test.data_size_in;
 	void *data;
 
 	if (size < ETH_HLEN || size > PAGE_SIZE - headroom - tailroom)
@@ -495,7 +494,8 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
 	if (kattr->test.flags || kattr->test.cpu)
 		return -EINVAL;
 
-	data = bpf_test_init(kattr, size, NET_SKB_PAD + NET_IP_ALIGN,
+	data = bpf_test_init(kattr, kattr->test.data_size_in,
+			     size, NET_SKB_PAD + NET_IP_ALIGN,
 			     SKB_DATA_ALIGN(sizeof(struct skb_shared_info)));
 	if (IS_ERR(data))
 		return PTR_ERR(data);
@@ -632,7 +632,8 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
 	/* XDP have extra tailroom as (most) drivers use full page */
 	max_data_sz = 4096 - headroom - tailroom;
 
-	data = bpf_test_init(kattr, max_data_sz, headroom, tailroom);
+	data = bpf_test_init(kattr, kattr->test.data_size_in,
+			     max_data_sz, headroom, tailroom);
 	if (IS_ERR(data))
 		return PTR_ERR(data);
 
@@ -698,7 +699,7 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
 	if (size < ETH_HLEN)
 		return -EINVAL;
 
-	data = bpf_test_init(kattr, size, 0, 0);
+	data = bpf_test_init(kattr, kattr->test.data_size_in, size, 0, 0);
 	if (IS_ERR(data))
 		return PTR_ERR(data);
 
-- 
2.26.2



* [PATCH v4 bpf-next 09/13] bpf: introduce multibuff support to bpf_prog_test_run_xdp()
  2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (7 preceding siblings ...)
  2020-10-02 14:42 ` [PATCH v4 bpf-next 08/13] bpf: move user_size out of bpf_test_init Lorenzo Bianconi
@ 2020-10-02 14:42 ` Lorenzo Bianconi
  2020-10-08  8:06   ` Shay Agroskin
  2020-10-02 14:42 ` [PATCH v4 bpf-next 10/13] bpf: test_run: add skb_shared_info pointer in bpf_test_finish signature Lorenzo Bianconi
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 14:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro

Introduce the capability to allocate an xdp multi-buff in the
bpf_prog_test_run_xdp() routine. This is a preliminary patch to
introduce the selftests for the new xdp multi-buff eBPF helpers

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 net/bpf/test_run.c | 51 ++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 43 insertions(+), 8 deletions(-)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index bd291f5f539c..ec7286cd051b 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -617,44 +617,79 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
 {
 	u32 tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 	u32 headroom = XDP_PACKET_HEADROOM;
-	u32 size = kattr->test.data_size_in;
 	u32 repeat = kattr->test.repeat;
 	struct netdev_rx_queue *rxqueue;
+	struct skb_shared_info *sinfo;
 	struct xdp_buff xdp = {};
+	u32 max_data_sz, size;
 	u32 retval, duration;
-	u32 max_data_sz;
+	int i, ret, data_len;
 	void *data;
-	int ret;
 
 	if (kattr->test.ctx_in || kattr->test.ctx_out)
 		return -EINVAL;
 
-	/* XDP have extra tailroom as (most) drivers use full page */
 	max_data_sz = 4096 - headroom - tailroom;
+	size = min_t(u32, kattr->test.data_size_in, max_data_sz);
+	data_len = size;
 
-	data = bpf_test_init(kattr, kattr->test.data_size_in,
-			     max_data_sz, headroom, tailroom);
+	data = bpf_test_init(kattr, size, max_data_sz, headroom, tailroom);
 	if (IS_ERR(data))
 		return PTR_ERR(data);
 
 	xdp.data_hard_start = data;
 	xdp.data = data + headroom;
 	xdp.data_meta = xdp.data;
-	xdp.data_end = xdp.data + size;
+	xdp.data_end = xdp.data + data_len;
 	xdp.frame_sz = headroom + max_data_sz + tailroom;
 
+	sinfo = xdp_get_shared_info_from_buff(&xdp);
+	if (unlikely(kattr->test.data_size_in > size)) {
+		void __user *data_in = u64_to_user_ptr(kattr->test.data_in);
+
+		while (size < kattr->test.data_size_in) {
+			skb_frag_t *frag = &sinfo->frags[sinfo->nr_frags];
+			struct page *page;
+			int data_len;
+
+			page = alloc_page(GFP_KERNEL);
+			if (!page) {
+				ret = -ENOMEM;
+				goto out;
+			}
+
+			__skb_frag_set_page(frag, page);
+			data_len = min_t(int, kattr->test.data_size_in - size,
+					 PAGE_SIZE);
+			skb_frag_size_set(frag, data_len);
+			if (copy_from_user(page_address(page), data_in + size,
+					   data_len)) {
+				ret = -EFAULT;
+				goto out;
+			}
+			sinfo->nr_frags++;
+			size += data_len;
+		}
+		xdp.mb = 1;
+	}
+
 	rxqueue = __netif_get_rx_queue(current->nsproxy->net_ns->loopback_dev, 0);
 	xdp.rxq = &rxqueue->xdp_rxq;
 	bpf_prog_change_xdp(NULL, prog);
 	ret = bpf_test_run(prog, &xdp, repeat, &retval, &duration, true);
 	if (ret)
 		goto out;
+
 	if (xdp.data != data + headroom || xdp.data_end != xdp.data + size)
-		size = xdp.data_end - xdp.data;
+		size += xdp.data_end - xdp.data - data_len;
+
 	ret = bpf_test_finish(kattr, uattr, xdp.data, size, retval, duration);
 out:
 	bpf_prog_change_xdp(prog, NULL);
+	for (i = 0; i < sinfo->nr_frags; i++)
+		__free_page(skb_frag_page(&sinfo->frags[i]));
 	kfree(data);
+
 	return ret;
 }
 
-- 
2.26.2



* [PATCH v4 bpf-next 10/13] bpf: test_run: add skb_shared_info pointer in bpf_test_finish signature
  2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (8 preceding siblings ...)
  2020-10-02 14:42 ` [PATCH v4 bpf-next 09/13] bpf: introduce multibuff support to bpf_prog_test_run_xdp() Lorenzo Bianconi
@ 2020-10-02 14:42 ` Lorenzo Bianconi
  2020-10-02 14:42 ` [PATCH v4 bpf-next 11/13] bpf: add xdp multi-buffer selftest Lorenzo Bianconi
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 14:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro

Introduce an skb_shared_info pointer in the bpf_test_finish() signature
in order to copy back paged data from an xdp multi-buff frame to the
userspace buffer

Tested-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 net/bpf/test_run.c | 58 ++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 51 insertions(+), 7 deletions(-)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index ec7286cd051b..7e33181f88ee 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -79,9 +79,23 @@ static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat,
 	return ret;
 }
 
+static int bpf_test_get_buff_data_len(struct skb_shared_info *sinfo)
+{
+	int i, size = 0;
+
+	if (likely(!sinfo))
+		return 0;
+
+	for (i = 0; i < sinfo->nr_frags; i++)
+		size += skb_frag_size(&sinfo->frags[i]);
+
+	return size;
+}
+
 static int bpf_test_finish(const union bpf_attr *kattr,
 			   union bpf_attr __user *uattr, const void *data,
-			   u32 size, u32 retval, u32 duration)
+			   struct skb_shared_info *sinfo, u32 size,
+			   u32 retval, u32 duration)
 {
 	void __user *data_out = u64_to_user_ptr(kattr->test.data_out);
 	int err = -EFAULT;
@@ -96,8 +110,35 @@ static int bpf_test_finish(const union bpf_attr *kattr,
 		err = -ENOSPC;
 	}
 
-	if (data_out && copy_to_user(data_out, data, copy_size))
-		goto out;
+	if (data_out) {
+		int len = copy_size - bpf_test_get_buff_data_len(sinfo);
+
+		if (copy_to_user(data_out, data, len))
+			goto out;
+
+		if (sinfo) {
+			int i, offset = len, data_len;
+
+			for (i = 0; i < sinfo->nr_frags; i++) {
+				skb_frag_t *frag = &sinfo->frags[i];
+
+				if (offset >= copy_size) {
+					err = -ENOSPC;
+					break;
+				}
+
+				data_len = min_t(int, copy_size - offset,
+						 skb_frag_size(frag));
+				if (copy_to_user(data_out + offset,
+						 skb_frag_address(frag),
+						 data_len))
+					goto out;
+
+				offset += data_len;
+			}
+		}
+	}
+
 	if (copy_to_user(&uattr->test.data_size_out, &size, sizeof(size)))
 		goto out;
 	if (copy_to_user(&uattr->test.retval, &retval, sizeof(retval)))
@@ -598,7 +639,8 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
 	/* bpf program can never convert linear skb to non-linear */
 	if (WARN_ON_ONCE(skb_is_nonlinear(skb)))
 		size = skb_headlen(skb);
-	ret = bpf_test_finish(kattr, uattr, skb->data, size, retval, duration);
+	ret = bpf_test_finish(kattr, uattr, skb->data, NULL, size, retval,
+			      duration);
 	if (!ret)
 		ret = bpf_ctx_finish(kattr, uattr, ctx,
 				     sizeof(struct __sk_buff));
@@ -683,7 +725,9 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
 	if (xdp.data != data + headroom || xdp.data_end != xdp.data + size)
 		size += xdp.data_end - xdp.data - data_len;
 
-	ret = bpf_test_finish(kattr, uattr, xdp.data, size, retval, duration);
+	ret = bpf_test_finish(kattr, uattr, xdp.data, sinfo, size, retval,
+			      duration);
+
 out:
 	bpf_prog_change_xdp(prog, NULL);
 	for (i = 0; i < sinfo->nr_frags; i++)
@@ -793,8 +837,8 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
 	do_div(time_spent, repeat);
 	duration = time_spent > U32_MAX ? U32_MAX : (u32)time_spent;
 
-	ret = bpf_test_finish(kattr, uattr, &flow_keys, sizeof(flow_keys),
-			      retval, duration);
+	ret = bpf_test_finish(kattr, uattr, &flow_keys, NULL,
+			      sizeof(flow_keys), retval, duration);
 	if (!ret)
 		ret = bpf_ctx_finish(kattr, uattr, user_ctx,
 				     sizeof(struct bpf_flow_keys));
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 bpf-next 11/13] bpf: add xdp multi-buffer selftest
  2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (9 preceding siblings ...)
  2020-10-02 14:42 ` [PATCH v4 bpf-next 10/13] bpf: test_run: add skb_shared_info pointer in bpf_test_finish signature Lorenzo Bianconi
@ 2020-10-02 14:42 ` Lorenzo Bianconi
  2020-10-02 14:42 ` [PATCH v4 bpf-next 12/13] net: mvneta: enable jumbo frames for XDP Lorenzo Bianconi
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 14:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro

Introduce an xdp multi-buffer selftest for the following eBPF helpers:
- bpf_xdp_get_frags_total_size
- bpf_xdp_get_frags_count

Co-developed-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 .../testing/selftests/bpf/prog_tests/xdp_mb.c | 79 +++++++++++++++++++
 .../selftests/bpf/progs/test_xdp_multi_buff.c | 24 ++++++
 2 files changed, 103 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_mb.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_multi_buff.c

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_mb.c b/tools/testing/selftests/bpf/prog_tests/xdp_mb.c
new file mode 100644
index 000000000000..4b1aca2d31e5
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_mb.c
@@ -0,0 +1,79 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <unistd.h>
+#include <linux/kernel.h>
+#include <test_progs.h>
+#include <network_helpers.h>
+
+#include "test_xdp_multi_buff.skel.h"
+
+static void test_xdp_mb_check_len(void)
+{
+	int test_sizes[] = { 128, 4096, 9000 };
+	struct test_xdp_multi_buff *pkt_skel;
+	__u8 *pkt_in = NULL, *pkt_out = NULL;
+	__u32 duration = 0, retval, size;
+	int err, pkt_fd, i;
+
+	/* Load XDP program */
+	pkt_skel = test_xdp_multi_buff__open_and_load();
+	if (CHECK(!pkt_skel, "pkt_skel_load", "test_xdp_mb skeleton failed\n"))
+		goto out;
+
+	/* Allocate resources */
+	pkt_out = malloc(test_sizes[ARRAY_SIZE(test_sizes) - 1]);
+	if (CHECK(!pkt_out, "malloc", "Failed pkt_out malloc\n"))
+		goto out;
+
+	pkt_in = malloc(test_sizes[ARRAY_SIZE(test_sizes) - 1]);
+	if (CHECK(!pkt_in, "malloc", "Failed pkt_in malloc\n"))
+		goto out;
+
+	pkt_fd = bpf_program__fd(pkt_skel->progs._xdp_check_mb_len);
+	if (pkt_fd < 0)
+		goto out;
+
+	/* Run test for specific set of packets */
+	for (i = 0; i < ARRAY_SIZE(test_sizes); i++) {
+		int frags_count;
+
+		/* Run test program */
+		err = bpf_prog_test_run(pkt_fd, 1, pkt_in, test_sizes[i],
+					pkt_out, &size, &retval, &duration);
+
+		if (CHECK(err || retval != XDP_PASS || size != test_sizes[i],
+			  "test_run", "err %d errno %d retval %d size %d[%d]\n",
+			  err, errno, retval, size, test_sizes[i]))
+			goto out;
+
+		/* Verify test results */
+		frags_count = DIV_ROUND_UP(
+			test_sizes[i] - pkt_skel->data->test_result_xdp_len,
+			getpagesize());
+
+		if (CHECK(pkt_skel->data->test_result_frags_count != frags_count,
+			  "result", "frags_count = %llu != %u\n",
+			  pkt_skel->data->test_result_frags_count, frags_count))
+			goto out;
+
+		if (CHECK(pkt_skel->data->test_result_frags_len != test_sizes[i] -
+			  pkt_skel->data->test_result_xdp_len,
+			  "result", "frags_len = %llu != %llu\n",
+			  pkt_skel->data->test_result_frags_len,
+			  test_sizes[i] - pkt_skel->data->test_result_xdp_len))
+			goto out;
+	}
+out:
+	if (pkt_out)
+		free(pkt_out);
+	if (pkt_in)
+		free(pkt_in);
+
+	test_xdp_multi_buff__destroy(pkt_skel);
+}
+
+void test_xdp_mb(void)
+{
+	if (test__start_subtest("xdp_mb_check_len_frags"))
+		test_xdp_mb_check_len();
+}
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_multi_buff.c b/tools/testing/selftests/bpf/progs/test_xdp_multi_buff.c
new file mode 100644
index 000000000000..b7527829a3ed
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_xdp_multi_buff.c
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <linux/if_ether.h>
+#include <bpf/bpf_helpers.h>
+#include <stdint.h>
+
+__u64 test_result_frags_count = UINT64_MAX;
+__u64 test_result_frags_len = UINT64_MAX;
+__u64 test_result_xdp_len = UINT64_MAX;
+
+SEC("xdp_check_mb_len")
+int _xdp_check_mb_len(struct xdp_md *xdp)
+{
+	void *data_end = (void *)(long)xdp->data_end;
+	void *data = (void *)(long)xdp->data;
+
+	test_result_xdp_len = (__u64)(data_end - data);
+	test_result_frags_len = bpf_xdp_get_frags_total_size(xdp);
+	test_result_frags_count = bpf_xdp_get_frags_count(xdp);
+	return XDP_PASS;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.26.2



* [PATCH v4 bpf-next 12/13] net: mvneta: enable jumbo frames for XDP
  2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (10 preceding siblings ...)
  2020-10-02 14:42 ` [PATCH v4 bpf-next 11/13] bpf: add xdp multi-buffer selftest Lorenzo Bianconi
@ 2020-10-02 14:42 ` Lorenzo Bianconi
  2020-10-02 14:42 ` [PATCH v4 bpf-next 13/13] bpf: cpumap: introduce xdp multi-buff support Lorenzo Bianconi
  2020-10-02 15:25 ` [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support John Fastabend
  13 siblings, 0 replies; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 14:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro

Enable receiving jumbo frames even when the interface is
running in XDP mode.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index f709650974ea..e3352ed13ea8 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -3743,11 +3743,6 @@ static int mvneta_change_mtu(struct net_device *dev, int mtu)
 		mtu = ALIGN(MVNETA_RX_PKT_SIZE(mtu), 8);
 	}
 
-	if (pp->xdp_prog && mtu > MVNETA_MAX_RX_BUF_SIZE) {
-		netdev_info(dev, "Illegal MTU value %d for XDP mode\n", mtu);
-		return -EINVAL;
-	}
-
 	dev->mtu = mtu;
 
 	if (!netif_running(dev)) {
@@ -4445,11 +4440,6 @@ static int mvneta_xdp_setup(struct net_device *dev, struct bpf_prog *prog,
 	struct mvneta_port *pp = netdev_priv(dev);
 	struct bpf_prog *old_prog;
 
-	if (prog && dev->mtu > MVNETA_MAX_RX_BUF_SIZE) {
-		NL_SET_ERR_MSG_MOD(extack, "Jumbo frames not supported on XDP");
-		return -EOPNOTSUPP;
-	}
-
 	if (pp->bm_priv) {
 		NL_SET_ERR_MSG_MOD(extack,
 				   "Hardware Buffer Management not supported on XDP");
-- 
2.26.2



* [PATCH v4 bpf-next 13/13] bpf: cpumap: introduce xdp multi-buff support
  2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (11 preceding siblings ...)
  2020-10-02 14:42 ` [PATCH v4 bpf-next 12/13] net: mvneta: enable jumbo frames for XDP Lorenzo Bianconi
@ 2020-10-02 14:42 ` Lorenzo Bianconi
  2020-10-02 15:25 ` [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support John Fastabend
  13 siblings, 0 replies; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 14:42 UTC (permalink / raw)
  To: bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro

Introduce __xdp_build_skb_from_frame and xdp_build_skb_from_frame
utility routines to build an skb from an xdp_frame.
Add xdp multi-buff support to cpumap.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 include/net/xdp.h   |  5 ++++
 kernel/bpf/cpumap.c | 45 +------------------------------
 net/core/xdp.c      | 64 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 70 insertions(+), 44 deletions(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index 4d47076546ff..8d9224ef75ee 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -134,6 +134,11 @@ void xdp_warn(const char *msg, const char *func, const int line);
 #define XDP_WARN(msg) xdp_warn(msg, __func__, __LINE__)
 
 struct xdp_frame *xdp_convert_zc_to_xdp_frame(struct xdp_buff *xdp);
+struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
+					   struct sk_buff *skb,
+					   struct net_device *dev);
+struct sk_buff *xdp_build_skb_from_frame(struct xdp_frame *xdpf,
+					 struct net_device *dev);
 
 static inline
 void xdp_convert_frame_to_buff(struct xdp_frame *frame, struct xdp_buff *xdp)
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index c61a23b564aa..fa07b4226836 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -155,49 +155,6 @@ static void cpu_map_kthread_stop(struct work_struct *work)
 	kthread_stop(rcpu->kthread);
 }
 
-static struct sk_buff *cpu_map_build_skb(struct xdp_frame *xdpf,
-					 struct sk_buff *skb)
-{
-	unsigned int hard_start_headroom;
-	unsigned int frame_size;
-	void *pkt_data_start;
-
-	/* Part of headroom was reserved to xdpf */
-	hard_start_headroom = sizeof(struct xdp_frame) +  xdpf->headroom;
-
-	/* Memory size backing xdp_frame data already have reserved
-	 * room for build_skb to place skb_shared_info in tailroom.
-	 */
-	frame_size = xdpf->frame_sz;
-
-	pkt_data_start = xdpf->data - hard_start_headroom;
-	skb = build_skb_around(skb, pkt_data_start, frame_size);
-	if (unlikely(!skb))
-		return NULL;
-
-	skb_reserve(skb, hard_start_headroom);
-	__skb_put(skb, xdpf->len);
-	if (xdpf->metasize)
-		skb_metadata_set(skb, xdpf->metasize);
-
-	/* Essential SKB info: protocol and skb->dev */
-	skb->protocol = eth_type_trans(skb, xdpf->dev_rx);
-
-	/* Optional SKB info, currently missing:
-	 * - HW checksum info		(skb->ip_summed)
-	 * - HW RX hash			(skb_set_hash)
-	 * - RX ring dev queue index	(skb_record_rx_queue)
-	 */
-
-	/* Until page_pool get SKB return path, release DMA here */
-	xdp_release_frame(xdpf);
-
-	/* Allow SKB to reuse area used by xdp_frame */
-	xdp_scrub_frame(xdpf);
-
-	return skb;
-}
-
 static void __cpu_map_ring_cleanup(struct ptr_ring *ring)
 {
 	/* The tear-down procedure should have made sure that queue is
@@ -364,7 +321,7 @@ static int cpu_map_kthread_run(void *data)
 			struct sk_buff *skb = skbs[i];
 			int ret;
 
-			skb = cpu_map_build_skb(xdpf, skb);
+			skb = __xdp_build_skb_from_frame(xdpf, skb, xdpf->dev_rx);
 			if (!skb) {
 				xdp_return_frame(xdpf);
 				continue;
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 6d4fd4dddb00..a6bdefed92e6 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -507,3 +507,67 @@ void xdp_warn(const char *msg, const char *func, const int line)
 	WARN(1, "XDP_WARN: %s(line:%d): %s\n", func, line, msg);
 };
 EXPORT_SYMBOL_GPL(xdp_warn);
+
+struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
+					   struct sk_buff *skb,
+					   struct net_device *dev)
+{
+	struct skb_shared_info *sinfo = xdp_get_shared_info_from_frame(xdpf);
+	unsigned int headroom = sizeof(*xdpf) +  xdpf->headroom;
+	int i, num_frags = xdpf->mb ? sinfo->nr_frags : 0;
+	void *hard_start = xdpf->data - headroom;
+
+	skb = build_skb_around(skb, hard_start, xdpf->frame_sz);
+	if (unlikely(!skb))
+		return NULL;
+
+	skb_reserve(skb, headroom);
+	__skb_put(skb, xdpf->len);
+	if (xdpf->metasize)
+		skb_metadata_set(skb, xdpf->metasize);
+
+	if (likely(!num_frags))
+		goto out;
+
+	for (i = 0; i < num_frags; i++) {
+		skb_frag_t *frag = &sinfo->frags[i];
+
+		skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
+				skb_frag_page(frag), skb_frag_off(frag),
+				skb_frag_size(frag), xdpf->frame_sz);
+	}
+
+out:
+	/* Essential SKB info: protocol and skb->dev */
+	skb->protocol = eth_type_trans(skb, dev);
+
+	/* Optional SKB info, currently missing:
+	 * - HW checksum info		(skb->ip_summed)
+	 * - HW RX hash			(skb_set_hash)
+	 * - RX ring dev queue index	(skb_record_rx_queue)
+	 */
+
+	/* Until page_pool get SKB return path, release DMA here */
+	xdp_release_frame(xdpf);
+
+	/* Allow SKB to reuse area used by xdp_frame */
+	xdp_scrub_frame(xdpf);
+
+	return skb;
+}
+EXPORT_SYMBOL_GPL(__xdp_build_skb_from_frame);
+
+struct sk_buff *xdp_build_skb_from_frame(struct xdp_frame *xdpf,
+					 struct net_device *dev)
+{
+	struct sk_buff *skb;
+
+	skb = kmem_cache_alloc(skbuff_head_cache, GFP_ATOMIC);
+	if (unlikely(!skb))
+		return NULL;
+
+	memset(skb, 0, offsetof(struct sk_buff, tail));
+
+	return __xdp_build_skb_from_frame(xdpf, skb, dev);
+}
+EXPORT_SYMBOL_GPL(xdp_build_skb_from_frame);
-- 
2.26.2



* RE: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
  2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
                   ` (12 preceding siblings ...)
  2020-10-02 14:42 ` [PATCH v4 bpf-next 13/13] bpf: cpumap: introduce xdp multi-buff support Lorenzo Bianconi
@ 2020-10-02 15:25 ` John Fastabend
  2020-10-02 16:06   ` Lorenzo Bianconi
  2020-10-02 19:53   ` Daniel Borkmann
  13 siblings, 2 replies; 31+ messages in thread
From: John Fastabend @ 2020-10-02 15:25 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro

Lorenzo Bianconi wrote:
> This series introduce XDP multi-buffer support. The mvneta driver is
> the first to support these new "non-linear" xdp_{buff,frame}. Reviewers
> please focus on how these new types of xdp_{buff,frame} packets
> traverse the different layers and the layout design. It is on purpose
> that BPF-helpers are kept simple, as we don't want to expose the
> internal layout to allow later changes.
> 
> For now, to keep the design simple and to maintain performance, the XDP
> BPF-prog (still) only have access to the first-buffer. It is left for
> later (another patchset) to add payload access across multiple buffers.
> This patchset should still allow for these future extensions. The goal
> is to lift the XDP MTU restriction that comes with XDP, but maintain
> same performance as before.
> 
> The main idea for the new multi-buffer layout is to reuse the same
> layout used for non-linear SKB. This rely on the "skb_shared_info"
> struct at the end of the first buffer to link together subsequent
> buffers. Keeping the layout compatible with SKBs is also done to ease
> and speedup creating an SKB from an xdp_{buff,frame}. Converting
> xdp_frame to SKB and deliver it to the network stack is shown in cpumap
> code (patch 13/13).

Using the end of the buffer for the skb_shared_info struct is going to
become driver API, so unwinding it if it proves to be a performance issue
is going to be ugly. So same question as before: for the use case where
we receive a packet and do XDP_TX with it, how do we avoid the cache miss
overhead? This is not just a hypothetical use case; the Facebook
load balancer does this, as does Cilium, and allowing it with
multi-buffer packets >1500B would be useful.

Can we write the skb_shared_info lazily? It should only be needed once
we know the packet is going up the stack to some place that needs the
info, which we could learn from the return code of the XDP program.

> 
> A multi-buffer bit (mb) has been introduced in xdp_{buff,frame} structure
> to notify the bpf/network layer if this is a xdp multi-buffer frame (mb = 1)
> or not (mb = 0).
> The mb bit will be set by a xdp multi-buffer capable driver only for
> non-linear frames maintaining the capability to receive linear frames
> without any extra cost since the skb_shared_info structure at the end
> of the first buffer will be initialized only if mb is set.

Thanks above is clearer.

> 
> In order to provide to userspace some metadata about the non-linear
> xdp_{buff,frame}, we introduced 2 bpf helpers:
> - bpf_xdp_get_frags_count:
>   get the number of fragments for a given xdp multi-buffer.
> - bpf_xdp_get_frags_total_size:
>   get the total size of fragments for a given xdp multi-buffer.

What's the use case for these? Do you have an example where knowing
the frags count is going to be something a BPF program will use?
Having the total size seems interesting, but perhaps we should push that
into the metadata so it's pulled into the cache if users are going to
be reading it on every packet or something.

> 
> Typical use cases for this series are:
> - Jumbo-frames
> > - Packet header split (please see Google's use-case @ NetDevConf 0x14, [0])
> - TSO
> 
> More info about the main idea behind this approach can be found here [1][2].
> 
> We carried out some throughput tests in a standard linear frame scenario in order
> to verify we did not introduced any performance regression adding xdp multi-buff
> support to mvneta:
> 
> offered load is ~ 1000Kpps, packet size is 64B, mvneta descriptor size is one PAGE
> 
> commit: 879456bedbe5 ("net: mvneta: avoid possible cache misses in mvneta_rx_swbm")
> - xdp-pass:      ~162Kpps
> - xdp-drop:      ~701Kpps
> - xdp-tx:        ~185Kpps
> - xdp-redirect:  ~202Kpps
> 
> mvneta xdp multi-buff:
> - xdp-pass:      ~163Kpps
> - xdp-drop:      ~739Kpps
> - xdp-tx:        ~182Kpps
> - xdp-redirect:  ~202Kpps
> 
> Changes since v3:
> - rebase ontop of bpf-next
> - add patch 10/13 to copy back paged data from a xdp multi-buff frame to
>   userspace buffer for xdp multi-buff selftests
> 
> Changes since v2:
> - add throughput measurements
> - drop bpf_xdp_adjust_mb_header bpf helper
> - introduce selftest for xdp multibuffer
> - addressed comments on bpf_xdp_get_frags_count
> - introduce xdp multi-buff support to cpumaps
> 
> Changes since v1:
> - Fix use-after-free in xdp_return_{buff/frame}
> - Introduce bpf helpers
> - Introduce xdp_mb sample program
> - access skb_shared_info->nr_frags only on the last fragment
> 
> Changes since RFC:
> - squash multi-buffer bit initialization in a single patch
> - add mvneta non-linear XDP buff support for tx side
> 
> [0] https://netdevconf.info/0x14/session.html?talk-the-path-to-tcp-4k-mtu-and-rx-zerocopy
> [1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org
> [2] https://netdevconf.info/0x14/session.html?tutorial-add-XDP-support-to-a-NIC-driver (XDPmulti-buffers section)
> 
> Lorenzo Bianconi (11):
>   xdp: introduce mb in xdp_buff/xdp_frame
>   xdp: initialize xdp_buff mb bit to 0 in all XDP drivers
>   net: mvneta: update mb bit before passing the xdp buffer to eBPF layer
>   xdp: add multi-buff support to xdp_return_{buff/frame}
>   net: mvneta: add multi buffer support to XDP_TX
>   bpf: move user_size out of bpf_test_init
>   bpf: introduce multibuff support to bpf_prog_test_run_xdp()
>   bpf: test_run: add skb_shared_info pointer in bpf_test_finish
>     signature
>   bpf: add xdp multi-buffer selftest
>   net: mvneta: enable jumbo frames for XDP
>   bpf: cpumap: introduce xdp multi-buff support
> 
> Sameeh Jubran (2):
>   bpf: introduce bpf_xdp_get_frags_{count, total_size} helpers
>   samples/bpf: add bpf program that uses xdp mb helpers
> 
>  drivers/net/ethernet/amazon/ena/ena_netdev.c  |   1 +
>  drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |   1 +
>  .../net/ethernet/cavium/thunder/nicvf_main.c  |   1 +
>  .../net/ethernet/freescale/dpaa2/dpaa2-eth.c  |   1 +
>  drivers/net/ethernet/intel/i40e/i40e_txrx.c   |   1 +
>  drivers/net/ethernet/intel/ice/ice_txrx.c     |   1 +
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   1 +
>  .../net/ethernet/intel/ixgbevf/ixgbevf_main.c |   1 +
>  drivers/net/ethernet/marvell/mvneta.c         | 131 +++++++------
>  .../net/ethernet/marvell/mvpp2/mvpp2_main.c   |   1 +
>  drivers/net/ethernet/mellanox/mlx4/en_rx.c    |   1 +
>  .../net/ethernet/mellanox/mlx5/core/en_rx.c   |   1 +
>  .../ethernet/netronome/nfp/nfp_net_common.c   |   1 +
>  drivers/net/ethernet/qlogic/qede/qede_fp.c    |   1 +
>  drivers/net/ethernet/sfc/rx.c                 |   1 +
>  drivers/net/ethernet/socionext/netsec.c       |   1 +
>  drivers/net/ethernet/ti/cpsw.c                |   1 +
>  drivers/net/ethernet/ti/cpsw_new.c            |   1 +
>  drivers/net/hyperv/netvsc_bpf.c               |   1 +
>  drivers/net/tun.c                             |   2 +
>  drivers/net/veth.c                            |   1 +
>  drivers/net/virtio_net.c                      |   2 +
>  drivers/net/xen-netfront.c                    |   1 +
>  include/net/xdp.h                             |  31 ++-
>  include/uapi/linux/bpf.h                      |  14 ++
>  kernel/bpf/cpumap.c                           |  45 +----
>  net/bpf/test_run.c                            | 118 ++++++++++--
>  net/core/dev.c                                |   1 +
>  net/core/filter.c                             |  42 ++++
>  net/core/xdp.c                                | 104 ++++++++++
>  samples/bpf/Makefile                          |   3 +
>  samples/bpf/xdp_mb_kern.c                     |  68 +++++++
>  samples/bpf/xdp_mb_user.c                     | 182 ++++++++++++++++++
>  tools/include/uapi/linux/bpf.h                |  14 ++
>  .../testing/selftests/bpf/prog_tests/xdp_mb.c |  79 ++++++++
>  .../selftests/bpf/progs/test_xdp_multi_buff.c |  24 +++
>  36 files changed, 757 insertions(+), 123 deletions(-)
>  create mode 100644 samples/bpf/xdp_mb_kern.c
>  create mode 100644 samples/bpf/xdp_mb_user.c
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_mb.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_multi_buff.c
> 
> -- 
> 2.26.2
> 




* RE: [PATCH v4 bpf-next 06/13] bpf: introduce bpf_xdp_get_frags_{count, total_size} helpers
  2020-10-02 14:42 ` [PATCH v4 bpf-next 06/13] bpf: introduce bpf_xdp_get_frags_{count, total_size} helpers Lorenzo Bianconi
@ 2020-10-02 15:36   ` John Fastabend
  2020-10-02 16:25     ` Lorenzo Bianconi
  0 siblings, 1 reply; 31+ messages in thread
From: John Fastabend @ 2020-10-02 15:36 UTC (permalink / raw)
  To: Lorenzo Bianconi, bpf, netdev
  Cc: davem, kuba, ast, daniel, shayagr, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro

Lorenzo Bianconi wrote:
> From: Sameeh Jubran <sameehj@amazon.com>
> 
> Introduce the two following bpf helpers in order to provide some
> metadata about a xdp multi-buff frame to the bpf layer:
> 
> - bpf_xdp_get_frags_count()
>   get the number of fragments for a given xdp multi-buffer.

Same comment as in the cover letter: can you provide a use case
for how/where I would use xdp_get_frags_count()? Is it just for
debug? If it's just debug, do we really want a uapi helper for it.

> 
> * bpf_xdp_get_frags_total_size()
>   get the total size of fragments for a given xdp multi-buffer.

This is awkward IMO. If the total size is needed, it should return the
total size in all cases, not just in the mb case; otherwise programs will
have two paths, the mb path and the non-mb path. And if you have a mixed
workload the branch predictor will miss. Plus it's extra instructions to
load.

And if it's useful for something beyond just debug, and it's going to be
read on every packet or something, I think we should put it in the
metadata so that it's not hidden behind a helper, which will likely show
up as overhead on a 40+Gbps NIC. The use case I have in mind is counting
bytes, maybe sliced by IP or protocol. There you will always read it,
and I don't want code with an if/else stuck in the middle when, if
we do it right, we have a single read.

> 
> Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
> Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
>  include/uapi/linux/bpf.h       | 14 ++++++++++++
>  net/core/filter.c              | 42 ++++++++++++++++++++++++++++++++++
>  tools/include/uapi/linux/bpf.h | 14 ++++++++++++
>  3 files changed, 70 insertions(+)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 4f556cfcbfbe..0715995eb18c 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -3668,6 +3668,18 @@ union bpf_attr {
>   * 	Return
>   * 		The helper returns **TC_ACT_REDIRECT** on success or
>   * 		**TC_ACT_SHOT** on error.
> + *
> + * int bpf_xdp_get_frags_count(struct xdp_buff *xdp_md)
> + *	Description
> + *		Get the number of fragments for a given xdp multi-buffer.
> + *	Return
> + *		The number of fragments
> + *
> + * int bpf_xdp_get_frags_total_size(struct xdp_buff *xdp_md)
> + *	Description
> + *		Get the total size of fragments for a given xdp multi-buffer.

Why just fragments? Will I have to also add the initial frag0 to it,
or not? I think the description is a bit ambiguous.

> + *	Return
> + *		The total size of fragments for a given xdp multi-buffer.
>   */

[...]

> +const struct bpf_func_proto bpf_xdp_get_frags_count_proto = {
> +	.func		= bpf_xdp_get_frags_count,
> +	.gpl_only	= false,
> +	.ret_type	= RET_INTEGER,
> +	.arg1_type	= ARG_PTR_TO_CTX,
> +};
> +
> +BPF_CALL_1(bpf_xdp_get_frags_total_size, struct  xdp_buff*, xdp)
> +{
> +	struct skb_shared_info *sinfo;
> +	int nfrags, i, size = 0;
> +
> +	if (likely(!xdp->mb))
> +		return 0;
> +
> +	sinfo = xdp_get_shared_info_from_buff(xdp);
> +	nfrags = min_t(u8, sinfo->nr_frags, MAX_SKB_FRAGS);
> +
> +	for (i = 0; i < nfrags; i++)
> +		size += skb_frag_size(&sinfo->frags[i]);

Won't the hardware just know this? Walking the frag list
just to get the total seems wrong. The hardware should have a
total_len field somewhere we can just read, no? If mvneta doesn't
know the total length, that seems like a driver limitation and we
shouldn't encode it in the helper.

> +
> +	return size;
> +}


* Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
  2020-10-02 15:25 ` [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support John Fastabend
@ 2020-10-02 16:06   ` Lorenzo Bianconi
  2020-10-02 18:06     ` John Fastabend
  2020-10-02 19:53   ` Daniel Borkmann
  1 sibling, 1 reply; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 16:06 UTC (permalink / raw)
  To: John Fastabend
  Cc: Lorenzo Bianconi, bpf, netdev, davem, kuba, ast, daniel, shayagr,
	sameehj, dsahern, brouer, echaudro


> Lorenzo Bianconi wrote:
> > This series introduce XDP multi-buffer support. The mvneta driver is
> > the first to support these new "non-linear" xdp_{buff,frame}. Reviewers
> > please focus on how these new types of xdp_{buff,frame} packets
> > traverse the different layers and the layout design. It is on purpose
> > that BPF-helpers are kept simple, as we don't want to expose the
> > internal layout to allow later changes.
> > 
> > For now, to keep the design simple and to maintain performance, the XDP
> > BPF-prog (still) only have access to the first-buffer. It is left for
> > later (another patchset) to add payload access across multiple buffers.
> > This patchset should still allow for these future extensions. The goal
> > is to lift the XDP MTU restriction that comes with XDP, but maintain
> > same performance as before.
> > 
> > The main idea for the new multi-buffer layout is to reuse the same
> > layout used for non-linear SKB. This rely on the "skb_shared_info"
> > struct at the end of the first buffer to link together subsequent
> > buffers. Keeping the layout compatible with SKBs is also done to ease
> > and speedup creating an SKB from an xdp_{buff,frame}. Converting
> > xdp_frame to SKB and deliver it to the network stack is shown in cpumap
> > code (patch 13/13).
> 
> Using the end of the buffer for the skb_shared_info struct is going to
> become driver API so unwinding it if it proves to be a performance issue
> is going to be ugly. So same question as before, for the use case where
> we receive packet and do XDP_TX with it how do we avoid cache miss
> overhead? This is not just a hypothetical use case, the Facebook
> load balancer is doing this as well as Cilium and allowing this with
> multi-buffer packets >1500B would be useful.
> 
> Can we write the skb_shared_info lazily? It should only be needed once
> we know the packet is going up the stack to some place that needs the
> info. Which we could learn from the return code of the XDP program.

Hi John,

I agree, I think for the XDP_TX use-case it is not strictly necessary to fill the
skb_shared_info. The driver can just keep this info on the stack and use it
when inserting the packet back into the DMA ring.
For mvneta I implemented it this way to keep the code aligned with the ndo_xdp_xmit
path, since it is a low-end device. I guess we are not introducing any API constraint
for XDP_TX; a high-end device can implement multi-buff for XDP_TX in a different way
in order to avoid the cache miss.

We need to fill the skb_shared_info only when we want to pass the frame to the
network stack (build_skb() can directly reuse skb_shared_info->frags[]) or for
the XDP_REDIRECT use-case.

> 
> > 
> > A multi-buffer bit (mb) has been introduced in xdp_{buff,frame} structure
> > to notify the bpf/network layer if this is a xdp multi-buffer frame (mb = 1)
> > or not (mb = 0).
> > The mb bit will be set by a xdp multi-buffer capable driver only for
> > non-linear frames maintaining the capability to receive linear frames
> > without any extra cost since the skb_shared_info structure at the end
> > of the first buffer will be initialized only if mb is set.
> 
> Thanks above is clearer.
> 
> > 
> > In order to provide to userspace some metadata about the non-linear
> > xdp_{buff,frame}, we introduced 2 bpf helpers:
> > - bpf_xdp_get_frags_count:
> >   get the number of fragments for a given xdp multi-buffer.
> > - bpf_xdp_get_frags_total_size:
> >   get the total size of fragments for a given xdp multi-buffer.
> 
> Whats the use case for these? Do you have an example where knowing
> the frags count is going to be something a BPF program will use?
> Having total size seems interesting but perhaps we should push that
> into the metadata so its pulled into the cache if users are going to
> be reading it on every packet or something.

At the moment we do not have any use-case for these helpers (not considering
the sample in the series :)). We introduced them to provide some basic metadata
about the non-linear xdp_frame.
IIRC we decided to introduce helpers instead of adding this info to xdp_frame
in order to save space in it (for XDP it is essential that xdp_frame fits in a
single cache-line).

Regards,
Lorenzo

> 
> > 
> > Typical use cases for this series are:
> > - Jumbo-frames
> > - Packet header split (please see Google's use-case @ NetDevConf 0x14, [0])
> > - TSO
> > 
> > More info about the main idea behind this approach can be found here [1][2].
> > 
> > We carried out some throughput tests in a standard linear frame scenario in order
> > to verify we did not introduced any performance regression adding xdp multi-buff
> > support to mvneta:
> > 
> > offered load is ~ 1000Kpps, packet size is 64B, mvneta descriptor size is one PAGE
> > 
> > commit: 879456bedbe5 ("net: mvneta: avoid possible cache misses in mvneta_rx_swbm")
> > - xdp-pass:      ~162Kpps
> > - xdp-drop:      ~701Kpps
> > - xdp-tx:        ~185Kpps
> > - xdp-redirect:  ~202Kpps
> > 
> > mvneta xdp multi-buff:
> > - xdp-pass:      ~163Kpps
> > - xdp-drop:      ~739Kpps
> > - xdp-tx:        ~182Kpps
> > - xdp-redirect:  ~202Kpps
> > 
> > Changes since v3:
> > - rebase ontop of bpf-next
> > - add patch 10/13 to copy back paged data from a xdp multi-buff frame to
> >   userspace buffer for xdp multi-buff selftests
> > 
> > Changes since v2:
> > - add throughput measurements
> > - drop bpf_xdp_adjust_mb_header bpf helper
> > - introduce selftest for xdp multibuffer
> > - addressed comments on bpf_xdp_get_frags_count
> > - introduce xdp multi-buff support to cpumaps
> > 
> > Changes since v1:
> > - Fix use-after-free in xdp_return_{buff/frame}
> > - Introduce bpf helpers
> > - Introduce xdp_mb sample program
> > - access skb_shared_info->nr_frags only on the last fragment
> > 
> > Changes since RFC:
> > - squash multi-buffer bit initialization in a single patch
> > - add mvneta non-linear XDP buff support for tx side
> > 
> > [0] https://netdevconf.info/0x14/session.html?talk-the-path-to-tcp-4k-mtu-and-rx-zerocopy
> > [1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org
> > [2] https://netdevconf.info/0x14/session.html?tutorial-add-XDP-support-to-a-NIC-driver (XDPmulti-buffers section)
> > 
> > Lorenzo Bianconi (11):
> >   xdp: introduce mb in xdp_buff/xdp_frame
> >   xdp: initialize xdp_buff mb bit to 0 in all XDP drivers
> >   net: mvneta: update mb bit before passing the xdp buffer to eBPF layer
> >   xdp: add multi-buff support to xdp_return_{buff/frame}
> >   net: mvneta: add multi buffer support to XDP_TX
> >   bpf: move user_size out of bpf_test_init
> >   bpf: introduce multibuff support to bpf_prog_test_run_xdp()
> >   bpf: test_run: add skb_shared_info pointer in bpf_test_finish
> >     signature
> >   bpf: add xdp multi-buffer selftest
> >   net: mvneta: enable jumbo frames for XDP
> >   bpf: cpumap: introduce xdp multi-buff support
> > 
> > Sameeh Jubran (2):
> >   bpf: introduce bpf_xdp_get_frags_{count, total_size} helpers
> >   samples/bpf: add bpf program that uses xdp mb helpers
> > 
> >  drivers/net/ethernet/amazon/ena/ena_netdev.c  |   1 +
> >  drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |   1 +
> >  .../net/ethernet/cavium/thunder/nicvf_main.c  |   1 +
> >  .../net/ethernet/freescale/dpaa2/dpaa2-eth.c  |   1 +
> >  drivers/net/ethernet/intel/i40e/i40e_txrx.c   |   1 +
> >  drivers/net/ethernet/intel/ice/ice_txrx.c     |   1 +
> >  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   1 +
> >  .../net/ethernet/intel/ixgbevf/ixgbevf_main.c |   1 +
> >  drivers/net/ethernet/marvell/mvneta.c         | 131 +++++++------
> >  .../net/ethernet/marvell/mvpp2/mvpp2_main.c   |   1 +
> >  drivers/net/ethernet/mellanox/mlx4/en_rx.c    |   1 +
> >  .../net/ethernet/mellanox/mlx5/core/en_rx.c   |   1 +
> >  .../ethernet/netronome/nfp/nfp_net_common.c   |   1 +
> >  drivers/net/ethernet/qlogic/qede/qede_fp.c    |   1 +
> >  drivers/net/ethernet/sfc/rx.c                 |   1 +
> >  drivers/net/ethernet/socionext/netsec.c       |   1 +
> >  drivers/net/ethernet/ti/cpsw.c                |   1 +
> >  drivers/net/ethernet/ti/cpsw_new.c            |   1 +
> >  drivers/net/hyperv/netvsc_bpf.c               |   1 +
> >  drivers/net/tun.c                             |   2 +
> >  drivers/net/veth.c                            |   1 +
> >  drivers/net/virtio_net.c                      |   2 +
> >  drivers/net/xen-netfront.c                    |   1 +
> >  include/net/xdp.h                             |  31 ++-
> >  include/uapi/linux/bpf.h                      |  14 ++
> >  kernel/bpf/cpumap.c                           |  45 +----
> >  net/bpf/test_run.c                            | 118 ++++++++++--
> >  net/core/dev.c                                |   1 +
> >  net/core/filter.c                             |  42 ++++
> >  net/core/xdp.c                                | 104 ++++++++++
> >  samples/bpf/Makefile                          |   3 +
> >  samples/bpf/xdp_mb_kern.c                     |  68 +++++++
> >  samples/bpf/xdp_mb_user.c                     | 182 ++++++++++++++++++
> >  tools/include/uapi/linux/bpf.h                |  14 ++
> >  .../testing/selftests/bpf/prog_tests/xdp_mb.c |  79 ++++++++
> >  .../selftests/bpf/progs/test_xdp_multi_buff.c |  24 +++
> >  36 files changed, 757 insertions(+), 123 deletions(-)
> >  create mode 100644 samples/bpf/xdp_mb_kern.c
> >  create mode 100644 samples/bpf/xdp_mb_user.c
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_mb.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_multi_buff.c
> > 
> > -- 
> > 2.26.2
> > 
> 
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 bpf-next 06/13] bpf: introduce bpf_xdp_get_frags_{count, total_size} helpers
  2020-10-02 15:36   ` John Fastabend
@ 2020-10-02 16:25     ` Lorenzo Bianconi
  0 siblings, 0 replies; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-02 16:25 UTC (permalink / raw)
  To: John Fastabend
  Cc: Lorenzo Bianconi, bpf, netdev, davem, kuba, ast, daniel, shayagr,
	sameehj, dsahern, brouer, echaudro

> Lorenzo Bianconi wrote:
> > From: Sameeh Jubran <sameehj@amazon.com>
> > 
> > Introduce the following two bpf helpers in order to provide some
> > metadata about an xdp multi-buff frame to the bpf layer:
> > 
> > - bpf_xdp_get_frags_count()
> >   get the number of fragments for a given xdp multi-buffer.
> 
> Same comment as in the cover letter can you provide a use case
> for how/where I would use xdp_get_frags_count()? Is it just for
> debug? If its just debug do we really want a uapi helper for it.

I have no strong opinion on it; I guess we can just drop this helper,
but I am not the original author of the patch :)

> 
> > 
> > - bpf_xdp_get_frags_total_size()
> >   get the total size of fragments for a given xdp multi-buffer.
> 
> This is awkward IMO. If total size is needed it should return the total size
> in all cases, not just in the mb case; otherwise programs will have two
> paths: the mb path and the non-mb path. And if you have a mixed workload
> the branch predictor will miss? Plus it's extra instructions to load.

ack, I am fine with making the helper report the total size instead of just the
paged area (we can compute the paged size from it if we really need it)

> 
> And if it's useful for something beyond just debug and it's going to be
> read on every packet or something, I think we should put it in the metadata
> so that it's not hidden behind a helper which likely will show up as
> overhead on a 40+gbps nic. The use case I have in mind is counting
> bytes, maybe sliced by IP or protocol. Here you will always read it
> and I don't want code with an if/else stuck in the middle when, if
> we do it right, we have a single read.

do you mean the xdp_frame or the data_meta area? As explained in the cover-letter we
chose this approach to save space in xdp_frame.

> 
> > 
> > Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
> > Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > ---
> >  include/uapi/linux/bpf.h       | 14 ++++++++++++
> >  net/core/filter.c              | 42 ++++++++++++++++++++++++++++++++++
> >  tools/include/uapi/linux/bpf.h | 14 ++++++++++++
> >  3 files changed, 70 insertions(+)
> > 
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 4f556cfcbfbe..0715995eb18c 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -3668,6 +3668,18 @@ union bpf_attr {
> >   * 	Return
> >   * 		The helper returns **TC_ACT_REDIRECT** on success or
> >   * 		**TC_ACT_SHOT** on error.
> > + *
> > + * int bpf_xdp_get_frags_count(struct xdp_buff *xdp_md)
> > + *	Description
> > + *		Get the number of fragments for a given xdp multi-buffer.
> > + *	Return
> > + *		The number of fragments
> > + *
> > + * int bpf_xdp_get_frags_total_size(struct xdp_buff *xdp_md)
> > + *	Description
> > + *		Get the total size of fragments for a given xdp multi-buffer.
> 
> Why just fragments? Will I have to also add the initial frag0 to it
> or not. I think the description is a bit ambiguous.
> 
> > + *	Return
> > + *		The total size of fragments for a given xdp multi-buffer.
> >   */
> 
> [...]
> 
> > +const struct bpf_func_proto bpf_xdp_get_frags_count_proto = {
> > +	.func		= bpf_xdp_get_frags_count,
> > +	.gpl_only	= false,
> > +	.ret_type	= RET_INTEGER,
> > +	.arg1_type	= ARG_PTR_TO_CTX,
> > +};
> > +
> > +BPF_CALL_1(bpf_xdp_get_frags_total_size, struct xdp_buff *, xdp)
> > +{
> > +	struct skb_shared_info *sinfo;
> > +	int nfrags, i, size = 0;
> > +
> > +	if (likely(!xdp->mb))
> > +		return 0;
> > +
> > +	sinfo = xdp_get_shared_info_from_buff(xdp);
> > +	nfrags = min_t(u8, sinfo->nr_frags, MAX_SKB_FRAGS);
> > +
> > +	for (i = 0; i < nfrags; i++)
> > +		size += skb_frag_size(&sinfo->frags[i]);
> 
> Wont the hardware just know this? I think walking the frag list
> just to get the total seems wrong. The hardware should have a
> total_len field somewhere we can just read no? If mvneta doesn't
> know the total length that seems like a driver limitation and we
> shouldn't encode it in the helper.

I have a couple of patches to improve this (not posted yet):
- https://github.com/LorenzoBianconi/bpf-next/commit/ff9b3a74a105b64947931f83fe86a4b8b1808103
- https://github.com/LorenzoBianconi/bpf-next/commit/712e67333cbc5f6304122b1009cdae1e18e6eb26

Regards,
Lorenzo

> 
> > +
> > +	return size;
> > +}
> 


* Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
  2020-10-02 16:06   ` Lorenzo Bianconi
@ 2020-10-02 18:06     ` John Fastabend
  2020-10-05  9:52       ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 31+ messages in thread
From: John Fastabend @ 2020-10-02 18:06 UTC (permalink / raw)
  To: Lorenzo Bianconi, John Fastabend
  Cc: Lorenzo Bianconi, bpf, netdev, davem, kuba, ast, daniel, shayagr,
	sameehj, dsahern, brouer, echaudro

Lorenzo Bianconi wrote:
> > Lorenzo Bianconi wrote:
> > > This series introduce XDP multi-buffer support. The mvneta driver is
> > > the first to support these new "non-linear" xdp_{buff,frame}. Reviewers
> > > please focus on how these new types of xdp_{buff,frame} packets
> > > traverse the different layers and the layout design. It is on purpose
> > > that BPF-helpers are kept simple, as we don't want to expose the
> > > internal layout to allow later changes.
> > > 
> > > For now, to keep the design simple and to maintain performance, the XDP
> > > BPF-prog (still) only have access to the first-buffer. It is left for
> > > later (another patchset) to add payload access across multiple buffers.
> > > This patchset should still allow for these future extensions. The goal
> > > is to lift the XDP MTU restriction that comes with XDP, but maintain
> > > same performance as before.
> > > 
> > > The main idea for the new multi-buffer layout is to reuse the same
> > > layout used for non-linear SKB. This rely on the "skb_shared_info"
> > > struct at the end of the first buffer to link together subsequent
> > > buffers. Keeping the layout compatible with SKBs is also done to ease
> > > and speedup creating an SKB from an xdp_{buff,frame}. Converting
> > > xdp_frame to SKB and deliver it to the network stack is shown in cpumap
> > > code (patch 13/13).
> > 
> > Using the end of the buffer for the skb_shared_info struct is going to
> > become driver API so unwinding it if it proves to be a performance issue
> > is going to be ugly. So same question as before, for the use case where
> > we receive packet and do XDP_TX with it how do we avoid cache miss
> > overhead? This is not just a hypothetical use case, the Facebook
> > load balancer is doing this as well as Cilium and allowing this with
> > multi-buffer packets >1500B would be useful.
> > 
> > Can we write the skb_shared_info lazily? It should only be needed once
> > we know the packet is going up the stack to some place that needs the
> > info. Which we could learn from the return code of the XDP program.
> 
> Hi John,

Hi, I'll try to join the two threads this one and the one on helpers here
so we don't get too fragmented.

> 
> I agree, I think for the XDP_TX use-case it is not strictly necessary to fill the
> skb_shared_info. The driver can just keep this info on the stack and use it
> when inserting the packet back into the DMA ring.
> For mvneta I implemented it this way to keep the code aligned with the ndo_xdp_xmit
> path since it is a low-end device. I guess we are not introducing any API constraint
> for XDP_TX. A high-end device can implement multi-buff for XDP_TX in a different way
> in order to avoid the cache miss.

Agree it would be an implementation detail for XDP_TX except the two helpers added
in this series currently require it to be there.

> 
> We need to fill the skb_shared info only when we want to pass the frame to the
> network stack (build_skb() can directly reuse skb_shared_info->frags[]) or for
> XDP_REDIRECT use-case.

It might be good to think about the XDP_REDIRECT case as well then. If the
frags list fits in the metadata/xdp_frame, would we expect better
performance?

Looking at skb_shared_info{}, that is a rather large structure with many
fields that look unnecessary for the XDP_REDIRECT case and are only needed when
passing to the stack. Fundamentally, a frag just needs

 struct bio_vec {
     struct page *bv_page;     // 8B
     unsigned int bv_len;      // 4B
     unsigned int bv_offset;   // 4B
 } // 16B

With header split + data we only need a single frag so we could use just
16B. And in the worst case, jumbo frame + header split, it seems 3 entries would be
enough, giving 48B (header plus 3 4k pages). Could we just stick this in
the metadata and make it read only? Then programs that care can read it
and get all the info they need without helpers. I would expect performance
to be better in the XDP_TX and XDP_REDIRECT cases. And copying an extra
worst-case 48B when passing to the stack I guess is not measurable given
all the work needed in that path.

> 
> > 
> > > 
> > > A multi-buffer bit (mb) has been introduced in xdp_{buff,frame} structure
> > > to notify the bpf/network layer if this is a xdp multi-buffer frame (mb = 1)
> > > or not (mb = 0).
> > > The mb bit will be set by a xdp multi-buffer capable driver only for
> > > non-linear frames maintaining the capability to receive linear frames
> > > without any extra cost since the skb_shared_info structure at the end
> > > of the first buffer will be initialized only if mb is set.
> > 
> > Thanks above is clearer.
> > 
> > > 
> > > In order to provide some metadata to userspace about the non-linear
> > > xdp_{buff,frame}, we introduced 2 bpf helpers:
> > > - bpf_xdp_get_frags_count:
> > >   get the number of fragments for a given xdp multi-buffer.
> > > - bpf_xdp_get_frags_total_size:
> > >   get the total size of fragments for a given xdp multi-buffer.
> > 
> > Whats the use case for these? Do you have an example where knowing
> > the frags count is going to be something a BPF program will use?
> > Having total size seems interesting but perhaps we should push that
> > into the metadata so its pulled into the cache if users are going to
> > be reading it on every packet or something.
> 
> At the moment we do not have any use-case for these helpers (not considering
> the sample in the series :)). We introduced them to provide some basic metadata
> about the non-linear xdp_frame.
> IIRC we decided to introduce some helpers instead of adding this info in xdp_frame
> in order to save space on it (for xdp it is essential xdp_frame to fit in a single
> cache-line).

Sure, how about in the metadata then? (From the other thread I was suggesting putting
the total length in the metadata.) We could even allow programs to overwrite it if
they wanted, provided it's not used by the stack for anything other than packet length
visibility. Of course users would then need to be a bit careful not to overwrite
it and then read it again expecting the length to be correct. I think from a
user's perspective though that would be expected.

> 
> Regards,
> Lorenzo
> 


* Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
  2020-10-02 15:25 ` [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support John Fastabend
  2020-10-02 16:06   ` Lorenzo Bianconi
@ 2020-10-02 19:53   ` Daniel Borkmann
  2020-10-05 15:50     ` Tirthendu Sarkar
  2020-10-06 12:39     ` Jubran, Samih
  1 sibling, 2 replies; 31+ messages in thread
From: Daniel Borkmann @ 2020-10-02 19:53 UTC (permalink / raw)
  To: John Fastabend, Lorenzo Bianconi, bpf, netdev
  Cc: davem, kuba, ast, shayagr, sameehj, dsahern, brouer,
	lorenzo.bianconi, echaudro

On 10/2/20 5:25 PM, John Fastabend wrote:
> Lorenzo Bianconi wrote:
>> This series introduce XDP multi-buffer support. The mvneta driver is
>> the first to support these new "non-linear" xdp_{buff,frame}. Reviewers
>> please focus on how these new types of xdp_{buff,frame} packets
>> traverse the different layers and the layout design. It is on purpose
>> that BPF-helpers are kept simple, as we don't want to expose the
>> internal layout to allow later changes.
>>
>> For now, to keep the design simple and to maintain performance, the XDP
>> BPF-prog (still) only have access to the first-buffer. It is left for
>> later (another patchset) to add payload access across multiple buffers.
>> This patchset should still allow for these future extensions. The goal
>> is to lift the XDP MTU restriction that comes with XDP, but maintain
>> same performance as before.
>>
>> The main idea for the new multi-buffer layout is to reuse the same
>> layout used for non-linear SKB. This rely on the "skb_shared_info"
>> struct at the end of the first buffer to link together subsequent
>> buffers. Keeping the layout compatible with SKBs is also done to ease
>> and speedup creating an SKB from an xdp_{buff,frame}. Converting
>> xdp_frame to SKB and deliver it to the network stack is shown in cpumap
>> code (patch 13/13).
> 
> Using the end of the buffer for the skb_shared_info struct is going to
> become driver API so unwinding it if it proves to be a performance issue
> is going to be ugly. So same question as before, for the use case where
> we receive packet and do XDP_TX with it how do we avoid cache miss
> overhead? This is not just a hypothetical use case, the Facebook
> load balancer is doing this as well as Cilium and allowing this with
> multi-buffer packets >1500B would be useful.
[...]

Fully agree. My other question would be if someone else right now is in the process
of implementing this scheme for a 40G+ NIC? My concern is the numbers below are rather
on the lower end of the spectrum, so I would like to see a comparison of XDP as-is
today vs XDP multi-buff on a higher end NIC so that we have a picture how well the
currently designed scheme works there and which performance issues we'll run into, e.g.
under a typical XDP L4 load balancer scenario with XDP_TX. I think this would be crucial
before the driver API becomes 'sort of' set in stone, where others start adapting to
it and changing the design becomes painful. Do the ena folks have an implementation ready
as well? And what about virtio_net, for example, is anyone committing there too? Typically,
the requirement for such features to land is at least 2 drivers implementing it.

>> Typical use cases for this series are:
>> - Jumbo-frames
>> - Packet header split (please see Google's use-case @ NetDevConf 0x14, [0])
>> - TSO
>>
>> More info about the main idea behind this approach can be found here [1][2].
>>
>> We carried out some throughput tests in a standard linear frame scenario in order
>> to verify we did not introduce any performance regression when adding xdp multi-buff
>> support to mvneta:
>>
>> offered load is ~ 1000Kpps, packet size is 64B, mvneta descriptor size is one PAGE
>>
>> commit: 879456bedbe5 ("net: mvneta: avoid possible cache misses in mvneta_rx_swbm")
>> - xdp-pass:      ~162Kpps
>> - xdp-drop:      ~701Kpps
>> - xdp-tx:        ~185Kpps
>> - xdp-redirect:  ~202Kpps
>>
>> mvneta xdp multi-buff:
>> - xdp-pass:      ~163Kpps
>> - xdp-drop:      ~739Kpps
>> - xdp-tx:        ~182Kpps
>> - xdp-redirect:  ~202Kpps
[...]


* Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
  2020-10-02 18:06     ` John Fastabend
@ 2020-10-05  9:52       ` Jesper Dangaard Brouer
  2020-10-05 21:22         ` John Fastabend
  0 siblings, 1 reply; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2020-10-05  9:52 UTC (permalink / raw)
  To: John Fastabend
  Cc: Lorenzo Bianconi, Lorenzo Bianconi, bpf, netdev, davem, kuba,
	ast, daniel, shayagr, sameehj, dsahern, echaudro, brouer

On Fri, 02 Oct 2020 11:06:12 -0700
John Fastabend <john.fastabend@gmail.com> wrote:

> Lorenzo Bianconi wrote:
> > > Lorenzo Bianconi wrote:  
> > > > This series introduce XDP multi-buffer support. The mvneta driver is
> > > > the first to support these new "non-linear" xdp_{buff,frame}. Reviewers
> > > > please focus on how these new types of xdp_{buff,frame} packets
> > > > traverse the different layers and the layout design. It is on purpose
> > > > that BPF-helpers are kept simple, as we don't want to expose the
> > > > internal layout to allow later changes.
> > > > 
> > > > For now, to keep the design simple and to maintain performance, the XDP
> > > > BPF-prog (still) only have access to the first-buffer. It is left for
> > > > later (another patchset) to add payload access across multiple buffers.
> > > > This patchset should still allow for these future extensions. The goal
> > > > is to lift the XDP MTU restriction that comes with XDP, but maintain
> > > > same performance as before.
> > > > 
> > > > The main idea for the new multi-buffer layout is to reuse the same
> > > > layout used for non-linear SKB. This rely on the "skb_shared_info"
> > > > struct at the end of the first buffer to link together subsequent
> > > > buffers. Keeping the layout compatible with SKBs is also done to ease
> > > > and speedup creating an SKB from an xdp_{buff,frame}. Converting
> > > > xdp_frame to SKB and deliver it to the network stack is shown in cpumap
> > > > code (patch 13/13).  
> > > 
> > > Using the end of the buffer for the skb_shared_info struct is going to
> > > become driver API so unwinding it if it proves to be a performance issue
> > > is going to be ugly. So same question as before, for the use case where
> > > we receive packet and do XDP_TX with it how do we avoid cache miss
> > > overhead? This is not just a hypothetical use case, the Facebook
> > > load balancer is doing this as well as Cilium and allowing this with
> > > multi-buffer packets >1500B would be useful.
> > > 
> > > Can we write the skb_shared_info lazily? It should only be needed once
> > > we know the packet is going up the stack to some place that needs the
> > > info. Which we could learn from the return code of the XDP program.  
> > 
> > Hi John,  
> 
> Hi, I'll try to join the two threads this one and the one on helpers here
> so we don't get too fragmented.
> 
> > 
> > I agree, I think for XDP_TX use-case it is not strictly necessary to fill the
> > > skb_shared_info. The driver can just keep this info on the stack and use it
> > inserting the packet back to the DMA ring.
> > For mvneta I implemented it in this way to keep the code aligned with ndo_xdp_xmit
> > path since it is a low-end device. I guess we are not introducing any API constraint
> > for XDP_TX. A high-end device can implement multi-buff for XDP_TX in a different way
> > in order to avoid the cache miss.  
> 
> Agree it would be an implementation detail for XDP_TX except the two
> helpers added in this series currently require it to be there.

That is a good point.  If you look at the details, the helpers use the
xdp_buff->mb bit to guard against accessing the "shared_info"
cacheline. Thus, for the normal single frame case XDP_TX should not see
a slowdown.  Do we really need to optimize the XDP_TX multi-frame case(?)


> > 
> > We need to fill the skb_shared info only when we want to pass the frame to the
> > network stack (build_skb() can directly reuse skb_shared_info->frags[]) or for
> > XDP_REDIRECT use-case.  
> 
> It might be good to think about the XDP_REDIRECT case as well then. If the
> frags list fit in the metadata/xdp_frame would we expect better
> performance?

I don't like to use space in xdp_frame for this. (1) We (Ahern and I)
are planning to use the space in xdp_frame for RX-csum + RX-hash + vlan,
which will be more common (e.g. all packets will have HW RX-csum).  (2)
I consider XDP multi-buffer an exception case that will not be used
in most cases, so why reserve space for it in this cache-line.

IMHO we CANNOT allow any slowdown for existing XDP use-cases, but IMHO
XDP multi-buffer use-cases are allowed to run "slower".


> Looking at skb_shared_info{} that is a rather large structure with many

A cache-line detail about skb_shared_info: the first frags[0] member is
in the first cache-line.  This means it is still fast to have xdp
frames with 1 extra buffer.

> fields that look unnecessary for XDP_REDIRECT case and only needed when
> passing to the stack. 

Yes, I think we can use the first cache-line of skb_shared_info more
optimally (by defining an xdp_shared_info struct). But I still want us
to use this specific cache-line.  Let me explain why below. (Avoiding
cache-line misses is all about the details, so I hope you can follow.)

Hopefully most driver developers understand/know this.  In the RX-loop
the current RX-descriptor has a status that indicates there are more
frames, usually expressed as non-EOP (End-Of-Packet).  Thus, a driver
can start a prefetchw of this shared_info cache-line prior to
processing the RX-desc that describes the multi-buffer.
 (Remember this shared_info is constructed prior to calling XDP and any
XDP_TX action, thus the XDP prog should not see a cache-line miss when
using the BPF-helper to read the shared_info area).


> Fundamentally, a frag just needs
> 
>  struct bio_vec {
>      struct page *bv_page;     // 8B
>      unsigned int bv_len;      // 4B
>      unsigned int bv_offset;   // 4B
>  } // 16B
> 
> With header split + data we only need a single frag so we could use just
> 16B. And worse case jumbo frame + header split seems 3 entries would be
> enough giving 48B (header plus 3 4k pages). 

For a jumbo-frame 9000 MTU, 2 entries might be enough, as we also have
room in the first buffer ((9000 - (4096-256-320)) / 4096 = 1.338).

The problem is that we need to support the TSO (TCP Segmentation Offload)
use-case, which can have more frames. Thus, 3 entries will not be
enough.

> Could we just stick this in the metadata and make it read only? Then
> programs that care can read it and get all the info they need without
> helpers.

I don't see how that is possible. (1) The metadata area is only 32
bytes. (2) When freeing an xdp_frame the kernel needs to know the layout,
as these pages will be freed.

> I would expect performance to be better in the XDP_TX and
> XDP_REDIRECT cases. And copying an extra worse case 48B in passing to
> the stack I guess is not measurable given all the work needed in that
> path.

I do agree that when passing to the netstack we can do a transformation
from xdp_shared_info to skb_shared_info at a fairly small cost.  (The
TSO case would require more copying.)

Notice that allocating an SKB, will always clear the first 32 bytes of
skb_shared_info.  If the XDP driver-code path have done the prefetch
as described above, then we should see a speedup for netstack delivery.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer



* Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
  2020-10-02 19:53   ` Daniel Borkmann
@ 2020-10-05 15:50     ` Tirthendu Sarkar
  2020-10-06 12:39     ` Jubran, Samih
  1 sibling, 0 replies; 31+ messages in thread
From: Tirthendu Sarkar @ 2020-10-05 15:50 UTC (permalink / raw)
  To: daniel
  Cc: ast, bpf, brouer, davem, dsahern, echaudro, john.fastabend, kuba,
	lorenzo.bianconi, lorenzo, netdev, sameehj, shayagr,
	Tirthendu Sarkar

On 10/2/20 5:25 PM, John Fastabend wrote:
>>[..] Typically, the requirement for such features to land is at least 2 drivers
>>implementing it.

I am working on making changes to the Intel NIC drivers for XDP multi-buffer based
on these patches. The respective patches will be posted once ready.


* Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
  2020-10-05  9:52       ` Jesper Dangaard Brouer
@ 2020-10-05 21:22         ` John Fastabend
  2020-10-05 22:24           ` Lorenzo Bianconi
  0 siblings, 1 reply; 31+ messages in thread
From: John Fastabend @ 2020-10-05 21:22 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, John Fastabend
  Cc: Lorenzo Bianconi, Lorenzo Bianconi, bpf, netdev, davem, kuba,
	ast, daniel, shayagr, sameehj, dsahern, echaudro, brouer

Jesper Dangaard Brouer wrote:
> On Fri, 02 Oct 2020 11:06:12 -0700
> John Fastabend <john.fastabend@gmail.com> wrote:
> 
> > Lorenzo Bianconi wrote:
> > > > Lorenzo Bianconi wrote:  
> > > > > This series introduce XDP multi-buffer support. The mvneta driver is
> > > > > the first to support these new "non-linear" xdp_{buff,frame}. Reviewers
> > > > > please focus on how these new types of xdp_{buff,frame} packets
> > > > > traverse the different layers and the layout design. It is on purpose
> > > > > that BPF-helpers are kept simple, as we don't want to expose the
> > > > > internal layout to allow later changes.
> > > > > 
> > > > > For now, to keep the design simple and to maintain performance, the XDP
> > > > > BPF-prog (still) only have access to the first-buffer. It is left for
> > > > > later (another patchset) to add payload access across multiple buffers.
> > > > > This patchset should still allow for these future extensions. The goal
> > > > > is to lift the XDP MTU restriction that comes with XDP, but maintain
> > > > > same performance as before.
> > > > > 
> > > > > The main idea for the new multi-buffer layout is to reuse the same
> > > > > layout used for non-linear SKB. This rely on the "skb_shared_info"
> > > > > struct at the end of the first buffer to link together subsequent
> > > > > buffers. Keeping the layout compatible with SKBs is also done to ease
> > > > > and speedup creating an SKB from an xdp_{buff,frame}. Converting
> > > > > xdp_frame to SKB and deliver it to the network stack is shown in cpumap
> > > > > code (patch 13/13).  
> > > > 
> > > > Using the end of the buffer for the skb_shared_info struct is going to
> > > > become driver API so unwinding it if it proves to be a performance issue
> > > > is going to be ugly. So same question as before, for the use case where
> > > > we receive packet and do XDP_TX with it how do we avoid cache miss
> > > > overhead? This is not just a hypothetical use case, the Facebook
> > > > load balancer is doing this as well as Cilium and allowing this with
> > > > multi-buffer packets >1500B would be useful.
> > > > 
> > > > Can we write the skb_shared_info lazily? It should only be needed once
> > > > we know the packet is going up the stack to some place that needs the
> > > > info. Which we could learn from the return code of the XDP program.  
> > > 
> > > Hi John,  
> > 
> > Hi, I'll try to join the two threads (this one and the one on helpers) here
> > so we don't get too fragmented.
> > 
> > > 
> > > I agree, I think for the XDP_TX use-case it is not strictly necessary to fill the
> > > skb_shared_info. The driver can just keep this info on the stack and use it
> > > when inserting the packet back into the DMA ring.
> > > For mvneta I implemented it in this way to keep the code aligned with ndo_xdp_xmit
> > > path since it is a low-end device. I guess we are not introducing any API constraint
> > > for XDP_TX. A high-end device can implement multi-buff for XDP_TX in a different way
> > > in order to avoid the cache miss.  
> > 
> > Agree it would be an implementation detail for XDP_TX except the two
> > helpers added in this series currently require it to be there.
> 
> That is a good point.  If you look at the details, the helpers use
> xdp_buff->mb bit to guard against accessing the "shared_info"
> cacheline. Thus, for the normal single frame case XDP_TX should not see
> a slowdown.  Do we really need to optimize XDP_TX multi-frame case(?)

Agree it is guarded by xdp_buff->mb which is why I asked for that detail
to be posted in the cover letter so it was easy to understand that bit
of info.

Do we really need to optimize the XDP_TX multi-frame case? Yes, I think so.
The use case is jumbo-frame (or 4kB) LB. XDP_TX is the common case in
many configurations. For our use cases these include cloud providers
and bare-metal data centers.

Keeping the implementation out of the helpers allows drivers to optimize
for this case. Also it doesn't seem like the helpers in this series
have a strong use case. Happy to hear what it is, but I can't see how
to use them myself.

> 
> 
> > > 
> > > We need to fill the skb_shared info only when we want to pass the frame to the
> > > network stack (build_skb() can directly reuse skb_shared_info->frags[]) or for
> > > XDP_REDIRECT use-case.  
> > 
> > It might be good to think about the XDP_REDIRECT case as well then. If the
> > frags list fit in the metadata/xdp_frame would we expect better
> > performance?
> 
> I don't like to use space in xdp_frame for this. (1) We (Ahern and I)
> are planning to use the space in xdp_frame for RX-csum + RX-hash +vlan,
> which will be more common (e.g. all packets will have HW RX+csum).  (2)
> I consider XDP multi-buffer the exception case, that will not be used
> in most cases, so why reserve space for that in this cache-line.

Sure.

> 
> IMHO we CANNOT allow any slowdown for existing XDP use-cases, but IMHO
> XDP multi-buffer use-cases are allowed to run "slower".

I agree we cannot slow down existing use cases. But I disagree that
multi-buffer use cases can be slower. If folks enable jumbo-frames and
things slow down, that's a problem.

> 
> 
> > Looking at skb_shared_info{} that is a rather large structure with many
> 
> A cache-line detail about skb_shared_info: The first frags[0] member is
> in the first cache-line.  Meaning that it is still fast to have xdp
> frames with 1 extra buffer.

That's nice in theory.

> 
> > fields that look unnecessary for XDP_REDIRECT case and only needed when
> > passing to the stack. 
> 
> Yes, I think we can use first cache-line of skb_shared_info more
> optimally (via defining a xdp_shared_info struct). But I still want us
> to use this specific cache-line.  Let me explain why below. (Avoiding
> cache-line misses is all about the details, so I hope you can follow).
> 
> Hopefully most driver developers understand/know this.  In the RX-loop
> the current RX-descriptor has a status that indicates there are more
> frames, usually expressed as non-EOP (End-Of-Packet).  Thus, a driver
> can start a prefetchw of this shared_info cache-line prior to
> processing the RX-desc that describes the multi-buffer.
>  (Remember this shared_info is constructed prior to calling XDP and any
> XDP_TX action, thus the XDP prog should not see a cache-line miss when
> using the BPF-helper to read the shared_info area).

In general I see no reason to populate these fields before the XDP
program runs. Someone needs to convince me why having frags info before
program runs is useful. In general headers should be preserved and first
frag already included in the data pointers. If users start parsing further
they might need it, but this series doesn't provide a way to do that
so IMO without those helpers it's a bit difficult to debate.

Specifically for XDP_TX case we can just flip the descriptors from RX
ring to TX ring and keep moving along. This is going to be ideal on
40/100Gbps nics.

I'm not arguing that it's likely possible to put some prefetch logic
in there and keep the pipe full, but I would need to see that on
a 100gbps nic to be convinced the details here are going to work. Or
at minimum a 40gbps nic.

> 
> 
> > Fundamentally, a frag just needs
> > 
> >  struct bio_vec {
> >      struct page *bv_page;     // 8B
> >      unsigned int bv_len;      // 4B
> >      unsigned int bv_offset;   // 4B
> >  } // 16B
> > 
> > With header split + data we only need a single frag so we could use just
> > 16B. And worse case jumbo frame + header split seems 3 entries would be
> > enough giving 48B (header plus 3 4k pages). 
> 
> For a jumbo-frame 9000 MTU, 2 entries might be enough, as we also have
> room in the first buffer ((9000 - (4096-256-320)) / 4096 = 1.33789).

Sure. I was just counting the first buffer as a frag, understanding it
wouldn't actually be in the frags list.

> 
> The problem is that we need to support TSO (TCP Segmentation Offload)
> use-case, which can have more frames. Thus, 3 entries will not be
> enough.

Sorry, not following, TSO? Explain how TSO is going to work for XDP_TX
and XDP_REDIRECT. I guess in theory you can header-split and coalesce,
but we are a ways off from that, and this series certainly doesn't
talk about TSO unless I missed something.

> 
> > Could we just stick this in the metadata and make it read only? Then
> > programs that care can read it and get all the info they need without
> > helpers.
> 
> I don't see how that is possible. (1) the metadata area is only 32
> bytes, (2) when freeing an xdp_frame the kernel needs to know the layout,
> as these pages will be freed.

Agree its tight, probably too tight to be useful.

> 
> > I would expect performance to be better in the XDP_TX and
> > XDP_REDIRECT cases. And copying an extra worse case 48B in passing to
> > the stack I guess is not measurable given all the work needed in that
> > path.
> 
> I do agree, that when passing to netstack we can do a transformation
> from xdp_shared_info to skb_shared_info with a fairly small cost.  (The
> TSO case would require more copying).

I'm lost on the TSO case. Explain how TSO is related here? 

> 
> Notice that allocating an SKB will always clear the first 32 bytes of
> skb_shared_info.  If the XDP driver-code path has done the prefetch
> as described above, then we should see a speedup for netstack delivery.

Not against it, but these things are a bit tricky. A couple of things I still
want to see/understand:

 - Let's see a 40gbps NIC use a prefetch and verify it works in practice
 - Explain why we can't just do this after the XDP program runs
 - How will we read data in the frag list if we need to parse headers
   inside the frags[]?

The above would be best to answer now rather than later IMO.

Thanks,
John

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
  2020-10-05 21:22         ` John Fastabend
@ 2020-10-05 22:24           ` Lorenzo Bianconi
  2020-10-06  4:29             ` John Fastabend
  0 siblings, 1 reply; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-05 22:24 UTC (permalink / raw)
  To: John Fastabend
  Cc: Jesper Dangaard Brouer, Lorenzo Bianconi, bpf, netdev, davem,
	kuba, ast, daniel, shayagr, sameehj, dsahern, echaudro

[...]

> 
> In general I see no reason to populate these fields before the XDP
> program runs. Someone needs to convince me why having frags info before
> program runs is useful. In general headers should be preserved and first
> frag already included in the data pointers. If users start parsing further
> they might need it, but this series doesn't provide a way to do that
> so IMO without those helpers its a bit difficult to debate.

We need to populate the skb_shared_info before running the xdp program in order to
allow the eBPF sandbox to access this data. If we restrict the access to the first
buffer only I guess we can avoid doing that, but I think there is value in allowing
the xdp program to access this data.
A possible optimization could be to access the shared_info only once before running
the eBPF program, constructing the shared_info using a struct allocated on the
stack.
Moreover, we can define an "xdp_shared_info" struct to alias the skb_shared_info
one in order to have most of the frags elements in the first "shared_info" cache line.

> 
> Specifically for XDP_TX case we can just flip the descriptors from RX
> ring to TX ring and keep moving along. This is going to be ideal on
> 40/100Gbps nics.
> 
> I'm not arguing that its likely possible to put some prefetch logic
> in there and keep the pipe full, but I would need to see that on
> a 100gbps nic to be convinced the details here are going to work. Or
> at minimum a 40gbps nic.
> 
> > 
> > 

[...]

> Not against it, but these things are a bit tricky. Couple things I still
> want to see/understand
> 
>  - Lets see a 40gbps use a prefetch and verify it works in practice
>  - Explain why we can't just do this after XDP program runs

how can we allow the ebpf program to access paged data if we do not do that?

>  - How will we read data in the frag list if we need to parse headers
>    inside the frags[].
> 
> The above would be best to answer now rather than later IMO.
> 
> Thanks,
> John

Regards,
Lorenzo



* Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
  2020-10-05 22:24           ` Lorenzo Bianconi
@ 2020-10-06  4:29             ` John Fastabend
  2020-10-06  7:30               ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 31+ messages in thread
From: John Fastabend @ 2020-10-06  4:29 UTC (permalink / raw)
  To: Lorenzo Bianconi, John Fastabend
  Cc: Jesper Dangaard Brouer, Lorenzo Bianconi, bpf, netdev, davem,
	kuba, ast, daniel, shayagr, sameehj, dsahern, echaudro

Lorenzo Bianconi wrote:
> [...]
> 
> > 
> > In general I see no reason to populate these fields before the XDP
> > program runs. Someone needs to convince me why having frags info before
> > program runs is useful. In general headers should be preserved and first
> > frag already included in the data pointers. If users start parsing further
> > they might need it, but this series doesn't provide a way to do that
> > so IMO without those helpers its a bit difficult to debate.
> 
> We need to populate the skb_shared_info before running the xdp program in order to
> allow the eBPF sandbox to access this data. If we restrict the access to the first
> buffer only I guess we can avoid doing that, but I think there is value in allowing
> the xdp program to access this data.

I agree. We could also only populate the fields if the program accesses
the fields.

> A possible optimization could be to access the shared_info only once before running
> the eBPF program, constructing the shared_info using a struct allocated on the
> stack.

Seems interesting, might be a good idea.

> Moreover, we can define an "xdp_shared_info" struct to alias the skb_shared_info
> one in order to have most of the frags elements in the first "shared_info" cache line.
> 
> > 
> > Specifically for XDP_TX case we can just flip the descriptors from RX
> > ring to TX ring and keep moving along. This is going to be ideal on
> > 40/100Gbps nics.
> > 
> > I'm not arguing that its likely possible to put some prefetch logic
> > in there and keep the pipe full, but I would need to see that on
> > a 100gbps nic to be convinced the details here are going to work. Or
> > at minimum a 40gbps nic.
> > 
> > > 
> > > 
> 
> [...]
> 
> > Not against it, but these things are a bit tricky. Couple things I still
> > want to see/understand
> > 
> >  - Lets see a 40gbps use a prefetch and verify it works in practice
> >  - Explain why we can't just do this after XDP program runs
> 
> how can we allow the ebpf program to access paged data if we do not do that?

I don't see an easy way, but also this series doesn't have the data
access support.

It's hard to tell until we get at least a 40gbps NIC whether my concern about
performance is real or not. Prefetching smartly could resolve some of the
issues, I guess.

If the Intel folks are working on it I think waiting would be great. Otherwise
at minimum drop the helpers and be prepared to revert things if needed.

> 
> >  - How will we read data in the frag list if we need to parse headers
> >    inside the frags[].
> > 
> > The above would be best to answer now rather than later IMO.
> > 
> > Thanks,
> > John
> 
> Regards,
> Lorenzo




* Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
  2020-10-06  4:29             ` John Fastabend
@ 2020-10-06  7:30               ` Jesper Dangaard Brouer
  2020-10-06 15:28                 ` Lorenzo Bianconi
  0 siblings, 1 reply; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2020-10-06  7:30 UTC (permalink / raw)
  To: John Fastabend
  Cc: Lorenzo Bianconi, Lorenzo Bianconi, bpf, netdev, davem, kuba,
	ast, daniel, shayagr, sameehj, dsahern, Eelco Chaudron, brouer,
	Tirthendu Sarkar, Toke Høiland-Jørgensen

On Mon, 05 Oct 2020 21:29:36 -0700
John Fastabend <john.fastabend@gmail.com> wrote:

> Lorenzo Bianconi wrote:
> > [...]
> >   
> > > 
> > > In general I see no reason to populate these fields before the XDP
> > > program runs. Someone needs to convince me why having frags info before
> > > program runs is useful. In general headers should be preserved and first
> > > frag already included in the data pointers. If users start parsing further
> > > they might need it, but this series doesn't provide a way to do that
> > > so IMO without those helpers its a bit difficult to debate.  
> > 
> > We need to populate the skb_shared_info before running the xdp program in order to
> > allow the eBPF sandbox to access this data. If we restrict the access to the first
> > buffer only I guess we can avoid doing that, but I think there is value in allowing
> > the xdp program to access this data.  
> 
> I agree. We could also only populate the fields if the program accesses
> the fields.

Notice, a driver will not initialize/use the shared_info area unless
there are more segments.  And (we have already established) the xdp->mb
bit is guarding BPF-prog from accessing shared_info area. 

> > A possible optimization could be to access the shared_info only once before running
> > the eBPF program, constructing the shared_info using a struct allocated on the
> > stack.  
> 
> Seems interesting, might be a good idea.

It *might* be a good idea ("alloc" shared_info on the stack), but we should
benchmark this.  The prefetch trick might be fast enough.  But also
keep in mind the performance target: with large-size frames the
packets-per-sec rate we need to handle drops dramatically.


On the TSO statement: I meant LRO (Large Receive Offload), but I want the
ability to XDP-redirect this frame out another netdev as TSO.  This
does mean that we need more than 3 pages (2 frag slots) to store LRO
frames.  Thus, if we store this shared_info on the stack it might need
to be larger than we like.



> > Moreover, we can define an "xdp_shared_info" struct to alias the skb_shared_info
> > one in order to have most of the frags elements in the first "shared_info" cache line.
> >   
> > > 
> > > Specifically for XDP_TX case we can just flip the descriptors from RX
> > > ring to TX ring and keep moving along. This is going to be ideal on
> > > 40/100Gbps nics.

I think both approaches will still allow these page-flips.

> > > I'm not arguing that its likely possible to put some prefetch logic
> > > in there and keep the pipe full, but I would need to see that on
> > > a 100gbps nic to be convinced the details here are going to work. Or
> > > at minimum a 40gbps nic.

I'm looking forward to see how this performs on faster NICs.  Once we
have a high-speed NIC driver with this I can also start doing testing
in my testlab.


> > [...]
> >   
> > > Not against it, but these things are a bit tricky. Couple things I still
> > > want to see/understand
> > > 
> > >  - Lets see a 40gbps use a prefetch and verify it works in practice
> > >  - Explain why we can't just do this after XDP program runs  
> > 
> > how can we allow the ebpf program to access paged data if we do not do that?  
> 
> I don't see an easy way, but also this series doesn't have the data
> access support.

Eelco (Cc'ed) is working on patches that allow access to data in these
fragments; so far internal patches, which (sorry to mention) got
shut down in internal review.


> Its hard to tell until we get at least a 40gbps nic if my concern about
> performance is real or not. Prefetching smartly could resolve some of the
> issue I guess.
> 
> If the Intel folks are working on it I think waiting would be great. Otherwise
> at minimum drop the helpers and be prepared to revert things if needed.

I do think it makes sense to drop the helpers for now, and focus on how
this new multi-buffer frame type is handled in the existing code, and do
some benchmarking on higher-speed NICs, before the BPF-helpers start to
lock down/restrict what we can change/revert, as they define UAPI.

E.g. existing code that needs to handle this is the existing helper
bpf_xdp_adjust_tail, which is something I have brought up before and even
described in [1].  Let's make sure existing code works with the proposed
design before introducing new helpers (this also makes it easier to
revert).

[1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org#xdp-tail-adjust
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer



* RE: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
  2020-10-02 19:53   ` Daniel Borkmann
  2020-10-05 15:50     ` Tirthendu Sarkar
@ 2020-10-06 12:39     ` Jubran, Samih
  1 sibling, 0 replies; 31+ messages in thread
From: Jubran, Samih @ 2020-10-06 12:39 UTC (permalink / raw)
  To: Daniel Borkmann, John Fastabend, Lorenzo Bianconi, bpf, netdev
  Cc: davem, kuba, ast, Agroskin, Shay, dsahern, brouer,
	lorenzo.bianconi, echaudro



> -----Original Message-----
> From: Daniel Borkmann <daniel@iogearbox.net>
> Sent: Friday, October 2, 2020 10:53 PM
> To: John Fastabend <john.fastabend@gmail.com>; Lorenzo Bianconi
> <lorenzo@kernel.org>; bpf@vger.kernel.org; netdev@vger.kernel.org
> Cc: davem@davemloft.net; kuba@kernel.org; ast@kernel.org; Agroskin,
> Shay <shayagr@amazon.com>; Jubran, Samih <sameehj@amazon.com>;
> dsahern@kernel.org; brouer@redhat.com; lorenzo.bianconi@redhat.com;
> echaudro@redhat.com
> Subject: RE: [EXTERNAL] [PATCH v4 bpf-next 00/13] mvneta: introduce XDP
> multi-buffer support
> 
> 
> 
> On 10/2/20 5:25 PM, John Fastabend wrote:
> > Lorenzo Bianconi wrote:
> >> This series introduce XDP multi-buffer support. The mvneta driver is
> >> the first to support these new "non-linear" xdp_{buff,frame}.
> >> Reviewers please focus on how these new types of xdp_{buff,frame}
> >> packets traverse the different layers and the layout design. It is on
> >> purpose that BPF-helpers are kept simple, as we don't want to expose
> >> the internal layout to allow later changes.
> >>
> >> For now, to keep the design simple and to maintain performance, the
> >> XDP BPF-prog (still) only have access to the first-buffer. It is left
> >> for later (another patchset) to add payload access across multiple buffers.
> >> This patchset should still allow for these future extensions. The
> >> goal is to lift the XDP MTU restriction that comes with XDP, but
> >> maintain same performance as before.
> >>
> >> The main idea for the new multi-buffer layout is to reuse the same
> >> layout used for non-linear SKB. This rely on the "skb_shared_info"
> >> struct at the end of the first buffer to link together subsequent
> >> buffers. Keeping the layout compatible with SKBs is also done to ease
> >> and speedup creating an SKB from an xdp_{buff,frame}. Converting
> >> xdp_frame to SKB and deliver it to the network stack is shown in
> >> cpumap code (patch 13/13).
> >
> > Using the end of the buffer for the skb_shared_info struct is going to
> > become driver API so unwinding it if it proves to be a performance
> > issue is going to be ugly. So same question as before, for the use
> > case where we receive packet and do XDP_TX with it how do we avoid
> > cache miss overhead? This is not just a hypothetical use case, the
> > Facebook load balancer is doing this as well as Cilium and allowing
> > this with multi-buffer packets >1500B would be useful.
> [...]
> 
> Fully agree. My other question would be whether someone else right now is in the
> process of implementing this scheme for a 40G+ NIC. My concern is that the
> numbers below are rather on the lower end of the spectrum, so I would like
> to see a comparison of XDP as-is today vs XDP multi-buff on a higher-end NIC,
> so that we have a picture of how well the currently designed scheme works there
> and which performance issues we'll run into, e.g.
> under a typical XDP L4 load-balancer scenario with XDP_TX. I think this would
> be crucial before the driver API becomes 'sort of' set in stone where others
> start adapting to it and changing the design becomes painful. Do the ena folks have
> an implementation ready as well? And what about virtio_net, for example,
> anyone committing there too? Typically, for such features to land we require
> at least 2 drivers implementing it.
>

We (ENA) expect to have an XDP MB implementation with performance results in around 4-6 weeks.

> >> Typical use cases for this series are:
> >> - Jumbo-frames
> >> - Packet header split (please see Google's use-case @ NetDevConf
> >> 0x14, [0])
> >> - TSO
> >>
> >> More info about the main idea behind this approach can be found here
> [1][2].
> >>
> >> We carried out some throughput tests in a standard linear frame
> >> scenario in order to verify we did not introduced any performance
> >> regression adding xdp multi-buff support to mvneta:
> >>
> >> offered load is ~ 1000Kpps, packet size is 64B, mvneta descriptor
> >> size is one PAGE
> >>
> >> commit: 879456bedbe5 ("net: mvneta: avoid possible cache misses in mvneta_rx_swbm")
> >> - xdp-pass:      ~162Kpps
> >> - xdp-drop:      ~701Kpps
> >> - xdp-tx:        ~185Kpps
> >> - xdp-redirect:  ~202Kpps
> >>
> >> mvneta xdp multi-buff:
> >> - xdp-pass:      ~163Kpps
> >> - xdp-drop:      ~739Kpps
> >> - xdp-tx:        ~182Kpps
> >> - xdp-redirect:  ~202Kpps
> [...]


* Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
  2020-10-06  7:30               ` Jesper Dangaard Brouer
@ 2020-10-06 15:28                 ` Lorenzo Bianconi
  2020-10-08 14:38                   ` John Fastabend
  0 siblings, 1 reply; 31+ messages in thread
From: Lorenzo Bianconi @ 2020-10-06 15:28 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: John Fastabend, Lorenzo Bianconi, bpf, netdev, davem, kuba, ast,
	daniel, shayagr, sameehj, dsahern, Eelco Chaudron,
	Tirthendu Sarkar, Toke Høiland-Jørgensen

> On Mon, 05 Oct 2020 21:29:36 -0700
> John Fastabend <john.fastabend@gmail.com> wrote:
> 
> > Lorenzo Bianconi wrote:
> > > [...]
> > >   
> > > > 
> > > > In general I see no reason to populate these fields before the XDP
> > > > program runs. Someone needs to convince me why having frags info before
> > > > program runs is useful. In general headers should be preserved and first
> > > > frag already included in the data pointers. If users start parsing further
> > > > they might need it, but this series doesn't provide a way to do that
> > > > so IMO without those helpers its a bit difficult to debate.  
> > > 
> > > We need to populate the skb_shared_info before running the xdp program in order to
> > > allow the eBPF sandbox to access this data. If we restrict the access to the first
> > > buffer only I guess we can avoid doing that, but I think there is value in allowing
> > > the xdp program to access this data.  
> > 
> > I agree. We could also only populate the fields if the program accesses
> > the fields.
> 
> Notice, a driver will not initialize/use the shared_info area unless
> there are more segments.  And (we have already established) the xdp->mb
> bit is guarding BPF-prog from accessing shared_info area. 
> 
> > > A possible optimization could be to access the shared_info only once before running
> > > the eBPF program, constructing the shared_info using a struct allocated on the
> > > stack.  
> > 
> > Seems interesting, might be a good idea.
> 
> It *might* be a good idea ("alloc" shared_info on the stack), but we should
> benchmark this.  The prefetch trick might be fast enough.  But also
> keep in mind the performance target: with large-size frames the
> packets-per-sec rate we need to handle drops dramatically.

right. I guess we need to define a workload we want to run for the
xdp multi-buff use-case (e.g. if the MTU is 9K we will have ~3 frames
for each packet and the number of pps will be much lower)

> 
> 

[...]

> 
> I do think it makes sense to drop the helpers for now, and focus on how
> this new multi-buffer frame type is handled in the existing code, and do
> some benchmarking on higher speed NIC, before the BPF-helper start to
> lockdown/restrict what we can change/revert as they define UAPI.

ack, I will drop them in v5.

Regards,
Lorenzo

> 
> E.g. existing code that need to handle this is existing helper
> bpf_xdp_adjust_tail, which is something I have broad up before and even
> described in[1].  Lets make sure existing code works with proposed
> design, before introducing new helpers (and this makes it easier to
> revert).
> 
> [1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org#xdp-tail-adjust
> -- 
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 bpf-next 09/13] bpf: introduce multibuff support to bpf_prog_test_run_xdp()
  2020-10-02 14:42 ` [PATCH v4 bpf-next 09/13] bpf: introduce multibuff support to bpf_prog_test_run_xdp() Lorenzo Bianconi
@ 2020-10-08  8:06   ` Shay Agroskin
  2020-10-08 10:46     ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 31+ messages in thread
From: Shay Agroskin @ 2020-10-08  8:06 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: bpf, netdev, davem, kuba, ast, daniel, sameehj, john.fastabend,
	dsahern, brouer, lorenzo.bianconi, echaudro


Lorenzo Bianconi <lorenzo@kernel.org> writes:

> Introduce the capability to allocate an xdp multi-buff in the
> bpf_prog_test_run_xdp routine. This is a preliminary patch to introduce
> the selftests for the new xdp multi-buff eBPF helpers
>
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
>  net/bpf/test_run.c | 51 ++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 43 insertions(+), 8 deletions(-)
>
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index bd291f5f539c..ec7286cd051b 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -617,44 +617,79 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
>  {
>  	u32 tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>  	u32 headroom = XDP_PACKET_HEADROOM;
> -	u32 size = kattr->test.data_size_in;
>  	u32 repeat = kattr->test.repeat;
>  	struct netdev_rx_queue *rxqueue;
> +	struct skb_shared_info *sinfo;
>  	struct xdp_buff xdp = {};
> +	u32 max_data_sz, size;
>  	u32 retval, duration;
> -	u32 max_data_sz;
> +	int i, ret, data_len;
>  	void *data;
> -	int ret;
>  
>  	if (kattr->test.ctx_in || kattr->test.ctx_out)
>  		return -EINVAL;
>  
> -	/* XDP have extra tailroom as (most) drivers use full page */
>  	max_data_sz = 4096 - headroom - tailroom;

For the sake of consistency, can this 4096 be changed to PAGE_SIZE?
Same as in the

     data_len = min_t(int, kattr->test.data_size_in - size, PAGE_SIZE);

expression below

> +	size = min_t(u32, kattr->test.data_size_in, max_data_sz);
> +	data_len = size;
>  
> -	data = bpf_test_init(kattr, kattr->test.data_size_in,
> -			     max_data_sz, headroom, tailroom);
> +	data = bpf_test_init(kattr, size, max_data_sz, headroom, tailroom);
>  	if (IS_ERR(data))
>  		return PTR_ERR(data);
>  
>  	xdp.data_hard_start = data;
>  	xdp.data = data + headroom;
>  	xdp.data_meta = xdp.data;
> -	xdp.data_end = xdp.data + size;
> +	xdp.data_end = xdp.data + data_len;
>  	xdp.frame_sz = headroom + max_data_sz + tailroom;
>  
> +	sinfo = xdp_get_shared_info_from_buff(&xdp);
> +	if (unlikely(kattr->test.data_size_in > size)) {
> +		void __user *data_in = u64_to_user_ptr(kattr->test.data_in);
> +
> +		while (size < kattr->test.data_size_in) {
> +			skb_frag_t *frag = &sinfo->frags[sinfo->nr_frags];
> +			struct page *page;
> +			int data_len;
> +
> +			page = alloc_page(GFP_KERNEL);
> +			if (!page) {
> +				ret = -ENOMEM;
> +				goto out;
> +			}
> +
> +			__skb_frag_set_page(frag, page);
> +			data_len = min_t(int, kattr->test.data_size_in - size,
> +					 PAGE_SIZE);
> +			skb_frag_size_set(frag, data_len);
> +			if (copy_from_user(page_address(page), data_in + size,
> +					   data_len)) {
> +				ret = -EFAULT;
> +				goto out;
> +			}
> +			sinfo->nr_frags++;
> +			size += data_len;
> +		}
> +		xdp.mb = 1;
> +	}
> +
>  	rxqueue = __netif_get_rx_queue(current->nsproxy->net_ns->loopback_dev, 0);
>  	xdp.rxq = &rxqueue->xdp_rxq;
>  	bpf_prog_change_xdp(NULL, prog);
>  	ret = bpf_test_run(prog, &xdp, repeat, &retval, &duration, true);
>  	if (ret)
>  		goto out;
> +
>  	if (xdp.data != data + headroom || xdp.data_end != xdp.data + size)
> -		size = xdp.data_end - xdp.data;
> +		size += xdp.data_end - xdp.data - data_len;

Can we please drop the variable shadowing of data_len? This is
confusing since the initial value of data_len is correct in the
`size` calculation, while its value inside the while loop is not.

This seems to be syntactically correct, but I think it's better
practice to avoid shadowing here.

> +
>  	ret = bpf_test_finish(kattr, uattr, xdp.data, size, retval, duration);
>  out:
>  	bpf_prog_change_xdp(prog, NULL);
> +	for (i = 0; i < sinfo->nr_frags; i++)
> +		__free_page(skb_frag_page(&sinfo->frags[i]));
>  	kfree(data);
> +
>  	return ret;
>  }



* Re: [PATCH v4 bpf-next 09/13] bpf: introduce multibuff support to bpf_prog_test_run_xdp()
  2020-10-08  8:06   ` Shay Agroskin
@ 2020-10-08 10:46     ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 31+ messages in thread
From: Jesper Dangaard Brouer @ 2020-10-08 10:46 UTC (permalink / raw)
  To: Shay Agroskin
  Cc: Lorenzo Bianconi, bpf, netdev, davem, kuba, ast, daniel, sameehj,
	john.fastabend, dsahern, lorenzo.bianconi, echaudro, brouer

On Thu, 8 Oct 2020 11:06:14 +0300
Shay Agroskin <shayagr@amazon.com> wrote:

> Lorenzo Bianconi <lorenzo@kernel.org> writes:
> 
> > Introduce the capability to allocate a xdp multi-buff in
> > bpf_prog_test_run_xdp routine. This is a preliminary patch to 
> > introduce
> > the selftests for new xdp multi-buff ebpf helpers
> >
> > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > ---
> >  net/bpf/test_run.c | 51 ++++++++++++++++++++++++++++++++++++++--------
> >  1 file changed, 43 insertions(+), 8 deletions(-)
> >
> > diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> > index bd291f5f539c..ec7286cd051b 100644
> > --- a/net/bpf/test_run.c
> > +++ b/net/bpf/test_run.c
> > @@ -617,44 +617,79 @@ int bpf_prog_test_run_xdp(struct bpf_prog 
> > *prog, const union bpf_attr *kattr,
> >  {
> >  	u32 tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> >  	u32 headroom = XDP_PACKET_HEADROOM;
> > -	u32 size = kattr->test.data_size_in;
> >  	u32 repeat = kattr->test.repeat;
> >  	struct netdev_rx_queue *rxqueue;
> > +	struct skb_shared_info *sinfo;
> >  	struct xdp_buff xdp = {};
> > +	u32 max_data_sz, size;
> >  	u32 retval, duration;
> > -	u32 max_data_sz;
> > +	int i, ret, data_len;
> >  	void *data;
> > -	int ret;
> >  
> >  	if (kattr->test.ctx_in || kattr->test.ctx_out)
> >  		return -EINVAL;
> >  
> > -	/* XDP have extra tailroom as (most) drivers use full page */
> >  	max_data_sz = 4096 - headroom - tailroom;
> 
> For the sake of consistency, can this 4096 be changed to PAGE_SIZE?

The size 4096 is used explicitly because the selftest xdp_adjust_tail
expects it; otherwise it would fail on ARCHs with a 64K PAGE_SIZE.  It
also seems excessive to create 64K packets for testing XDP.

See: tools/testing/selftests/bpf/prog_tests/xdp_adjust_tail.c

> Same as in
>      data_len = min_t(int, kattr->test.data_size_in - size, PAGE_SIZE);
> 
> expression below



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer



* Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
  2020-10-06 15:28                 ` Lorenzo Bianconi
@ 2020-10-08 14:38                   ` John Fastabend
  0 siblings, 0 replies; 31+ messages in thread
From: John Fastabend @ 2020-10-08 14:38 UTC (permalink / raw)
  To: Lorenzo Bianconi, Jesper Dangaard Brouer
  Cc: John Fastabend, Lorenzo Bianconi, bpf, netdev, davem, kuba, ast,
	daniel, shayagr, sameehj, dsahern, Eelco Chaudron,
	Tirthendu Sarkar, Toke Høiland-Jørgensen

Lorenzo Bianconi wrote:
> > On Mon, 05 Oct 2020 21:29:36 -0700
> > John Fastabend <john.fastabend@gmail.com> wrote:
> > 
> > > Lorenzo Bianconi wrote:
> > > > [...]
> > > >   
> > > > > 
> > > > > In general I see no reason to populate these fields before the XDP
> > > > > program runs. Someone needs to convince me why having frags info before
> > > > > program runs is useful. In general headers should be preserved and first
> > > > > frag already included in the data pointers. If users start parsing further
> > > > > they might need it, but this series doesn't provide a way to do that
> > > > > so IMO without those helpers its a bit difficult to debate.  
> > > > 
> > > > We need to populate the skb_shared_info before running the xdp program in order to
> > > > allow the ebpf sandbox to access this data. If we restrict access to the first
> > > > buffer only I guess we can avoid doing that, but I think there is value in allowing
> > > > the xdp program to access this data.  
> > > 
> > > I agree. We could also only populate the fields if the program accesses
> > > the fields.
> > 
> > Notice, a driver will not initialize/use the shared_info area unless
> > there are more segments.  And (we have already established) the xdp->mb
> > bit is guarding BPF-prog from accessing shared_info area. 
> > 
> > > > A possible optimization could be to access the shared_info only once before running
> > > > the ebpf program, constructing the shared_info in a struct allocated on the
> > > > stack.  
> > > 
> > > Seems interesting, might be a good idea.
> > 
> > It *might* be a good idea ("alloc" shared_info on stack), but we should
> > benchmark this.  The prefetch trick might be fast enough.  But also
> > keep in mind the performance target: with large frames, the
> > packets-per-sec rate we need to handle drops dramatically.
> 
> right. I guess we need to define a workload we want to run for the
> xdp multi-buff use-case (e.g. if MTU is 9K we will have ~3 frames
> for each packet and the # of pps will be much lower)

Right. Or configuring header split which would give 2 buffers with a much
smaller packet size. This would give some indication of the overhead. Then
we would likely want to look at XDP_TX and XDP_REDIRECT cases. At least
those would be my use cases.

> 
> > 
> > 
> 
> [...]
> 
> > 
> > I do think it makes sense to drop the helpers for now, and focus on how
> > this new multi-buffer frame type is handled in the existing code, and do
> > some benchmarking on higher speed NIC, before the BPF-helper start to
> > lockdown/restrict what we can change/revert as they define UAPI.
> 
> ack, I will drop them in v5.
> 
> Regards,
> Lorenzo
> 
> > 
> > E.g. existing code that needs to handle this is the existing helper
> > bpf_xdp_adjust_tail, which is something I have brought up before and even
> > described in [1].  Let's make sure the existing code works with the
> > proposed design before introducing new helpers (this also makes it
> > easier to revert).
> > 
> > [1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org#xdp-tail-adjust
> > -- 
> > Best regards,
> >   Jesper Dangaard Brouer
> >   MSc.CS, Principal Kernel Engineer at Red Hat
> >   LinkedIn: http://www.linkedin.com/in/brouer
> > 




Thread overview: 31+ messages
-- links below jump to the message on this page --
2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
2020-10-02 14:41 ` [PATCH v4 bpf-next 01/13] xdp: introduce mb in xdp_buff/xdp_frame Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 02/13] xdp: initialize xdp_buff mb bit to 0 in all XDP drivers Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 03/13] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 04/13] xdp: add multi-buff support to xdp_return_{buff/frame} Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 05/13] net: mvneta: add multi buffer support to XDP_TX Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 06/13] bpf: introduce bpf_xdp_get_frags_{count, total_size} helpers Lorenzo Bianconi
2020-10-02 15:36   ` John Fastabend
2020-10-02 16:25     ` Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 07/13] samples/bpf: add bpf program that uses xdp mb helpers Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 08/13] bpf: move user_size out of bpf_test_init Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 09/13] bpf: introduce multibuff support to bpf_prog_test_run_xdp() Lorenzo Bianconi
2020-10-08  8:06   ` Shay Agroskin
2020-10-08 10:46     ` Jesper Dangaard Brouer
2020-10-02 14:42 ` [PATCH v4 bpf-next 10/13] bpf: test_run: add skb_shared_info pointer in bpf_test_finish signature Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 11/13] bpf: add xdp multi-buffer selftest Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 12/13] net: mvneta: enable jumbo frames for XDP Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 13/13] bpf: cpumap: introduce xdp multi-buff support Lorenzo Bianconi
2020-10-02 15:25 ` [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support John Fastabend
2020-10-02 16:06   ` Lorenzo Bianconi
2020-10-02 18:06     ` John Fastabend
2020-10-05  9:52       ` Jesper Dangaard Brouer
2020-10-05 21:22         ` John Fastabend
2020-10-05 22:24           ` Lorenzo Bianconi
2020-10-06  4:29             ` John Fastabend
2020-10-06  7:30               ` Jesper Dangaard Brouer
2020-10-06 15:28                 ` Lorenzo Bianconi
2020-10-08 14:38                   ` John Fastabend
2020-10-02 19:53   ` Daniel Borkmann
2020-10-05 15:50     ` Tirthendu Sarkar
2020-10-06 12:39     ` Jubran, Samih
