Netdev Archive on lore.kernel.org
* [PATCH net-next 0/4] mvpp2: XDP support
@ 2020-06-30 18:09 Matteo Croce
  2020-06-30 18:09 ` [PATCH net-next 1/4] mvpp2: refactor BM pool init percpu code Matteo Croce
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Matteo Croce @ 2020-06-30 18:09 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, bpf, Sven Auhagen, Lorenzo Bianconi,
	David S. Miller, Jesper Dangaard Brouer, Stefan Chulski,
	Marcin Wojtas, maxime.chevallier, antoine.tenart,
	thomas.petazzoni

From: Matteo Croce <mcroce@microsoft.com>

Add XDP support to mvpp2. This series converts the driver to the
page_pool API for RX buffer management, and adds native XDP support.

These are the performance numbers, as measured by Sven:

SKB fwd page pool:
Rx bps     390.38 Mbps
Rx pps     762.46 Kpps

XDP fwd:
Rx bps     1.39 Gbps
Rx pps     2.72 Mpps

XDP Drop:
eth0: 12.9 Mpps
eth1: 4.1 Mpps

Matteo Croce (4):
  mvpp2: refactor BM pool init percpu code
  mvpp2: use page_pool allocator
  mvpp2: add basic XDP support
  mvpp2: XDP TX support

 drivers/net/ethernet/marvell/Kconfig          |   1 +
 drivers/net/ethernet/marvell/mvpp2/mvpp2.h    |  49 +-
 .../net/ethernet/marvell/mvpp2/mvpp2_main.c   | 600 ++++++++++++++++--
 3 files changed, 588 insertions(+), 62 deletions(-)

-- 
2.26.2



* [PATCH net-next 1/4] mvpp2: refactor BM pool init percpu code
  2020-06-30 18:09 [PATCH net-next 0/4] mvpp2: XDP support Matteo Croce
@ 2020-06-30 18:09 ` Matteo Croce
  2020-06-30 18:09 ` [PATCH net-next 2/4] mvpp2: use page_pool allocator Matteo Croce
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Matteo Croce @ 2020-06-30 18:09 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, bpf, Sven Auhagen, Lorenzo Bianconi,
	David S. Miller, Jesper Dangaard Brouer, Stefan Chulski,
	Marcin Wojtas, maxime.chevallier, antoine.tenart,
	thomas.petazzoni

From: Matteo Croce <mcroce@microsoft.com>

In mvpp2_swf_bm_pool_init_percpu(), a reference to a struct
mvpp2_bm_pool is obtained by traversing multiple structs, even though
a local variable already points to the same object.

Fix it and, while at it, give the variable a meaningful name.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
---
 .../net/ethernet/marvell/mvpp2/mvpp2_main.c   | 23 +++++++++----------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index 212fc3b54310..027de7291f92 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -907,28 +907,27 @@ static int mvpp2_swf_bm_pool_init_shared(struct mvpp2_port *port)
 /* Initialize pools for swf, percpu buffers variant */
 static int mvpp2_swf_bm_pool_init_percpu(struct mvpp2_port *port)
 {
-	struct mvpp2_bm_pool *p;
+	struct mvpp2_bm_pool *bm_pool;
 	int i;
 
 	for (i = 0; i < port->nrxqs; i++) {
-		p = mvpp2_bm_pool_use_percpu(port, MVPP2_BM_SHORT, i,
-					     mvpp2_pools[MVPP2_BM_SHORT].pkt_size);
-		if (!p)
+		bm_pool = mvpp2_bm_pool_use_percpu(port, MVPP2_BM_SHORT, i,
+						   mvpp2_pools[MVPP2_BM_SHORT].pkt_size);
+		if (!bm_pool)
 			return -ENOMEM;
 
-		port->priv->bm_pools[i].port_map |= BIT(port->id);
-		mvpp2_rxq_short_pool_set(port, i, port->priv->bm_pools[i].id);
+		bm_pool->port_map |= BIT(port->id);
+		mvpp2_rxq_short_pool_set(port, i, bm_pool->id);
 	}
 
 	for (i = 0; i < port->nrxqs; i++) {
-		p = mvpp2_bm_pool_use_percpu(port, MVPP2_BM_LONG, i + port->nrxqs,
-					     mvpp2_pools[MVPP2_BM_LONG].pkt_size);
-		if (!p)
+		bm_pool = mvpp2_bm_pool_use_percpu(port, MVPP2_BM_LONG, i + port->nrxqs,
+						   mvpp2_pools[MVPP2_BM_LONG].pkt_size);
+		if (!bm_pool)
 			return -ENOMEM;
 
-		port->priv->bm_pools[i + port->nrxqs].port_map |= BIT(port->id);
-		mvpp2_rxq_long_pool_set(port, i,
-					port->priv->bm_pools[i + port->nrxqs].id);
+		bm_pool->port_map |= BIT(port->id);
+		mvpp2_rxq_long_pool_set(port, i, bm_pool->id);
 	}
 
 	port->pool_long = NULL;
-- 
2.26.2



* [PATCH net-next 2/4] mvpp2: use page_pool allocator
  2020-06-30 18:09 [PATCH net-next 0/4] mvpp2: XDP support Matteo Croce
  2020-06-30 18:09 ` [PATCH net-next 1/4] mvpp2: refactor BM pool init percpu code Matteo Croce
@ 2020-06-30 18:09 ` Matteo Croce
  2020-07-02  7:31   ` ilias.apalodimas
  2020-06-30 18:09 ` [PATCH net-next 3/4] mvpp2: add basic XDP support Matteo Croce
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Matteo Croce @ 2020-06-30 18:09 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, bpf, Sven Auhagen, Lorenzo Bianconi,
	David S. Miller, Jesper Dangaard Brouer, Stefan Chulski,
	Marcin Wojtas, maxime.chevallier, antoine.tenart,
	thomas.petazzoni

From: Matteo Croce <mcroce@microsoft.com>

Use the page_pool API for memory management. This is a prerequisite for
native XDP support.

Tested-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
---
 drivers/net/ethernet/marvell/Kconfig          |   1 +
 drivers/net/ethernet/marvell/mvpp2/mvpp2.h    |   8 +
 .../net/ethernet/marvell/mvpp2/mvpp2_main.c   | 155 +++++++++++++++---
 3 files changed, 139 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/marvell/Kconfig b/drivers/net/ethernet/marvell/Kconfig
index cd8ddd1ef6f2..ef4f35ba077d 100644
--- a/drivers/net/ethernet/marvell/Kconfig
+++ b/drivers/net/ethernet/marvell/Kconfig
@@ -87,6 +87,7 @@ config MVPP2
 	depends on ARCH_MVEBU || COMPILE_TEST
 	select MVMDIO
 	select PHYLINK
+	select PAGE_POOL
 	help
 	  This driver supports the network interface units in the
 	  Marvell ARMADA 375, 7K and 8K SoCs.
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
index 543a310ec102..4c16c9e9c1e5 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
@@ -15,6 +15,7 @@
 #include <linux/phy.h>
 #include <linux/phylink.h>
 #include <net/flow_offload.h>
+#include <net/page_pool.h>
 
 /* Fifo Registers */
 #define MVPP2_RX_DATA_FIFO_SIZE_REG(port)	(0x00 + 4 * (port))
@@ -820,6 +821,9 @@ struct mvpp2 {
 
 	/* RSS Indirection tables */
 	struct mvpp2_rss_table *rss_tables[MVPP22_N_RSS_TABLES];
+
+	/* page_pool allocator */
+	struct page_pool *page_pool[MVPP2_PORT_MAX_RXQ];
 };
 
 struct mvpp2_pcpu_stats {
@@ -1161,6 +1165,10 @@ struct mvpp2_rx_queue {
 
 	/* Port's logic RXQ number to which physical RXQ is mapped */
 	int logic_rxq;
+
+	/* XDP memory accounting */
+	struct xdp_rxq_info xdp_rxq_short;
+	struct xdp_rxq_info xdp_rxq_long;
 };
 
 struct mvpp2_bm_pool {
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index 027de7291f92..9e2e8fb0a0b8 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -95,6 +95,22 @@ static inline u32 mvpp2_cpu_to_thread(struct mvpp2 *priv, int cpu)
 	return cpu % priv->nthreads;
 }
 
+static struct page_pool *
+mvpp2_create_page_pool(struct device *dev, int num, int len)
+{
+	struct page_pool_params pp_params = {
+		/* internal DMA mapping in page_pool */
+		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
+		.pool_size = num,
+		.nid = NUMA_NO_NODE,
+		.dev = dev,
+		.dma_dir = DMA_FROM_DEVICE,
+		.max_len = len,
+	};
+
+	return page_pool_create(&pp_params);
+}
+
 /* These accessors should be used to access:
  *
  * - per-thread registers, where each thread has its own copy of the
@@ -327,17 +343,26 @@ static inline int mvpp2_txq_phys(int port, int txq)
 	return (MVPP2_MAX_TCONT + port) * MVPP2_MAX_TXQ + txq;
 }
 
-static void *mvpp2_frag_alloc(const struct mvpp2_bm_pool *pool)
+/* Returns a struct page if page_pool is set, otherwise a buffer */
+static void *mvpp2_frag_alloc(const struct mvpp2_bm_pool *pool,
+			      struct page_pool *page_pool)
 {
+	if (page_pool)
+		return page_pool_alloc_pages(page_pool,
+					     GFP_ATOMIC | __GFP_NOWARN);
+
 	if (likely(pool->frag_size <= PAGE_SIZE))
 		return netdev_alloc_frag(pool->frag_size);
-	else
-		return kmalloc(pool->frag_size, GFP_ATOMIC);
+
+	return kmalloc(pool->frag_size, GFP_ATOMIC);
 }
 
-static void mvpp2_frag_free(const struct mvpp2_bm_pool *pool, void *data)
+static void mvpp2_frag_free(const struct mvpp2_bm_pool *pool,
+			    struct page_pool *page_pool, void *data)
 {
-	if (likely(pool->frag_size <= PAGE_SIZE))
+	if (page_pool)
+		page_pool_put_full_page(page_pool, virt_to_head_page(data), false);
+	else if (likely(pool->frag_size <= PAGE_SIZE))
 		skb_free_frag(data);
 	else
 		kfree(data);
@@ -442,6 +467,7 @@ static void mvpp2_bm_bufs_get_addrs(struct device *dev, struct mvpp2 *priv,
 static void mvpp2_bm_bufs_free(struct device *dev, struct mvpp2 *priv,
 			       struct mvpp2_bm_pool *bm_pool, int buf_num)
 {
+	struct page_pool *pp = NULL;
 	int i;
 
 	if (buf_num > bm_pool->buf_num) {
@@ -450,6 +476,9 @@ static void mvpp2_bm_bufs_free(struct device *dev, struct mvpp2 *priv,
 		buf_num = bm_pool->buf_num;
 	}
 
+	if (priv->percpu_pools)
+		pp = priv->page_pool[bm_pool->id];
+
 	for (i = 0; i < buf_num; i++) {
 		dma_addr_t buf_dma_addr;
 		phys_addr_t buf_phys_addr;
@@ -458,14 +487,15 @@ static void mvpp2_bm_bufs_free(struct device *dev, struct mvpp2 *priv,
 		mvpp2_bm_bufs_get_addrs(dev, priv, bm_pool,
 					&buf_dma_addr, &buf_phys_addr);
 
-		dma_unmap_single(dev, buf_dma_addr,
-				 bm_pool->buf_size, DMA_FROM_DEVICE);
+		if (!pp)
+			dma_unmap_single(dev, buf_dma_addr,
+					 bm_pool->buf_size, DMA_FROM_DEVICE);
 
 		data = (void *)phys_to_virt(buf_phys_addr);
 		if (!data)
 			break;
 
-		mvpp2_frag_free(bm_pool, data);
+		mvpp2_frag_free(bm_pool, pp, data);
 	}
 
 	/* Update BM driver with number of buffers removed from pool */
@@ -496,6 +526,9 @@ static int mvpp2_bm_pool_destroy(struct device *dev, struct mvpp2 *priv,
 	int buf_num;
 	u32 val;
 
+	if (priv->percpu_pools)
+		page_pool_destroy(priv->page_pool[bm_pool->id]);
+
 	buf_num = mvpp2_check_hw_buf_num(priv, bm_pool);
 	mvpp2_bm_bufs_free(dev, priv, bm_pool, buf_num);
 
@@ -548,8 +581,20 @@ static int mvpp2_bm_init(struct device *dev, struct mvpp2 *priv)
 {
 	int i, err, poolnum = MVPP2_BM_POOLS_NUM;
 
-	if (priv->percpu_pools)
+	if (priv->percpu_pools) {
 		poolnum = mvpp2_get_nrxqs(priv) * 2;
+		for (i = 0; i < poolnum; i++) {
+			/* the pool in use */
+			int pn = i / (poolnum / 2);
+
+			priv->page_pool[i] =
+				mvpp2_create_page_pool(dev,
+						       mvpp2_pools[pn].buf_num,
+						       mvpp2_pools[pn].pkt_size);
+			if (IS_ERR(priv->page_pool[i]))
+				return PTR_ERR(priv->page_pool[i]);
+		}
+	}
 
 	dev_info(dev, "using %d %s buffers\n", poolnum,
 		 priv->percpu_pools ? "per-cpu" : "shared");
@@ -632,23 +677,31 @@ static void mvpp2_rxq_short_pool_set(struct mvpp2_port *port,
 
 static void *mvpp2_buf_alloc(struct mvpp2_port *port,
 			     struct mvpp2_bm_pool *bm_pool,
+			     struct page_pool *page_pool,
 			     dma_addr_t *buf_dma_addr,
 			     phys_addr_t *buf_phys_addr,
 			     gfp_t gfp_mask)
 {
 	dma_addr_t dma_addr;
+	struct page *page;
 	void *data;
 
-	data = mvpp2_frag_alloc(bm_pool);
+	data = mvpp2_frag_alloc(bm_pool, page_pool);
 	if (!data)
 		return NULL;
 
-	dma_addr = dma_map_single(port->dev->dev.parent, data,
-				  MVPP2_RX_BUF_SIZE(bm_pool->pkt_size),
-				  DMA_FROM_DEVICE);
-	if (unlikely(dma_mapping_error(port->dev->dev.parent, dma_addr))) {
-		mvpp2_frag_free(bm_pool, data);
-		return NULL;
+	if (page_pool) {
+		page = (struct page *)data;
+		dma_addr = page_pool_get_dma_addr(page);
+		data = page_to_virt(page);
+	} else {
+		dma_addr = dma_map_single(port->dev->dev.parent, data,
+					  MVPP2_RX_BUF_SIZE(bm_pool->pkt_size),
+					  DMA_FROM_DEVICE);
+		if (unlikely(dma_mapping_error(port->dev->dev.parent, dma_addr))) {
+			mvpp2_frag_free(bm_pool, NULL, data);
+			return NULL;
+		}
 	}
 	*buf_dma_addr = dma_addr;
 	*buf_phys_addr = virt_to_phys(data);
@@ -706,6 +759,7 @@ static int mvpp2_bm_bufs_add(struct mvpp2_port *port,
 	int i, buf_size, total_size;
 	dma_addr_t dma_addr;
 	phys_addr_t phys_addr;
+	struct page_pool *pp = NULL;
 	void *buf;
 
 	if (port->priv->percpu_pools &&
@@ -726,8 +780,10 @@ static int mvpp2_bm_bufs_add(struct mvpp2_port *port,
 		return 0;
 	}
 
+	if (port->priv->percpu_pools)
+		pp = port->priv->page_pool[bm_pool->id];
 	for (i = 0; i < buf_num; i++) {
-		buf = mvpp2_buf_alloc(port, bm_pool, &dma_addr,
+		buf = mvpp2_buf_alloc(port, bm_pool, pp, &dma_addr,
 				      &phys_addr, GFP_KERNEL);
 		if (!buf)
 			break;
@@ -2374,10 +2430,11 @@ static int mvpp2_aggr_txq_init(struct platform_device *pdev,
 /* Create a specified Rx queue */
 static int mvpp2_rxq_init(struct mvpp2_port *port,
 			  struct mvpp2_rx_queue *rxq)
-
 {
+	struct mvpp2 *priv = port->priv;
 	unsigned int thread;
 	u32 rxq_dma;
+	int err;
 
 	rxq->size = port->rx_ring_size;
 
@@ -2415,7 +2472,41 @@ static int mvpp2_rxq_init(struct mvpp2_port *port,
 	/* Add number of descriptors ready for receiving packets */
 	mvpp2_rxq_status_update(port, rxq->id, 0, rxq->size);
 
+	if (priv->percpu_pools) {
+		err = xdp_rxq_info_reg(&rxq->xdp_rxq_short, port->dev, rxq->id);
+		if (err < 0)
+			goto err_free_dma;
+
+		err = xdp_rxq_info_reg(&rxq->xdp_rxq_long, port->dev, rxq->id);
+		if (err < 0)
+			goto err_unregister_rxq_short;
+
+		/* Every RXQ has a pool for short and another for long packets */
+		err = xdp_rxq_info_reg_mem_model(&rxq->xdp_rxq_short,
+						 MEM_TYPE_PAGE_POOL,
+						 priv->page_pool[rxq->logic_rxq]);
+		if (err < 0)
+			goto err_unregister_rxq_short;
+
+		err = xdp_rxq_info_reg_mem_model(&rxq->xdp_rxq_long,
+						 MEM_TYPE_PAGE_POOL,
+						 priv->page_pool[rxq->logic_rxq +
+								 port->nrxqs]);
+		if (err < 0)
+			goto err_unregister_rxq_long;
+	}
+
 	return 0;
+
+err_unregister_rxq_long:
+	xdp_rxq_info_unreg(&rxq->xdp_rxq_long);
+err_unregister_rxq_short:
+	xdp_rxq_info_unreg(&rxq->xdp_rxq_short);
+err_free_dma:
+	dma_free_coherent(port->dev->dev.parent,
+			  rxq->size * MVPP2_DESC_ALIGNED_SIZE,
+			  rxq->descs, rxq->descs_dma);
+	return err;
 }
 
 /* Push packets received by the RXQ to BM pool */
@@ -2449,6 +2540,12 @@ static void mvpp2_rxq_deinit(struct mvpp2_port *port,
 {
 	unsigned int thread;
 
+	if (xdp_rxq_info_is_reg(&rxq->xdp_rxq_short))
+		xdp_rxq_info_unreg(&rxq->xdp_rxq_short);
+
+	if (xdp_rxq_info_is_reg(&rxq->xdp_rxq_long))
+		xdp_rxq_info_unreg(&rxq->xdp_rxq_long);
+
 	mvpp2_rxq_drop_pkts(port, rxq);
 
 	if (rxq->descs)
@@ -2890,14 +2987,15 @@ static void mvpp2_rx_csum(struct mvpp2_port *port, u32 status,
 
 /* Allocate a new skb and add it to BM pool */
 static int mvpp2_rx_refill(struct mvpp2_port *port,
-			   struct mvpp2_bm_pool *bm_pool, int pool)
+			   struct mvpp2_bm_pool *bm_pool,
+			   struct page_pool *page_pool, int pool)
 {
 	dma_addr_t dma_addr;
 	phys_addr_t phys_addr;
 	void *buf;
 
-	buf = mvpp2_buf_alloc(port, bm_pool, &dma_addr, &phys_addr,
-			      GFP_ATOMIC);
+	buf = mvpp2_buf_alloc(port, bm_pool, page_pool,
+			      &dma_addr, &phys_addr, GFP_ATOMIC);
 	if (!buf)
 		return -ENOMEM;
 
@@ -2956,6 +3054,7 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 	while (rx_done < rx_todo) {
 		struct mvpp2_rx_desc *rx_desc = mvpp2_rxq_next_desc_get(rxq);
 		struct mvpp2_bm_pool *bm_pool;
+		struct page_pool *pp = NULL;
 		struct sk_buff *skb;
 		unsigned int frag_size;
 		dma_addr_t dma_addr;
@@ -2989,6 +3088,9 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 					DMA_FROM_DEVICE);
 		prefetch(data);
 
+		if (port->priv->percpu_pools)
+			pp = port->priv->page_pool[pool];
+
 		if (bm_pool->frag_size > PAGE_SIZE)
 			frag_size = 0;
 		else
@@ -3000,15 +3102,18 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 			goto err_drop_frame;
 		}
 
-		err = mvpp2_rx_refill(port, bm_pool, pool);
+		err = mvpp2_rx_refill(port, bm_pool, pp, pool);
 		if (err) {
 			netdev_err(port->dev, "failed to refill BM pools\n");
 			goto err_drop_frame;
 		}
 
-		dma_unmap_single_attrs(dev->dev.parent, dma_addr,
-				       bm_pool->buf_size, DMA_FROM_DEVICE,
-				       DMA_ATTR_SKIP_CPU_SYNC);
+		if (pp)
+			page_pool_release_page(pp, virt_to_page(data));
+		else
+			dma_unmap_single_attrs(dev->dev.parent, dma_addr,
+					       bm_pool->buf_size, DMA_FROM_DEVICE,
+					       DMA_ATTR_SKIP_CPU_SYNC);
 
 		rcvd_pkts++;
 		rcvd_bytes += rx_bytes;
-- 
2.26.2



* [PATCH net-next 3/4] mvpp2: add basic XDP support
  2020-06-30 18:09 [PATCH net-next 0/4] mvpp2: XDP support Matteo Croce
  2020-06-30 18:09 ` [PATCH net-next 1/4] mvpp2: refactor BM pool init percpu code Matteo Croce
  2020-06-30 18:09 ` [PATCH net-next 2/4] mvpp2: use page_pool allocator Matteo Croce
@ 2020-06-30 18:09 ` Matteo Croce
  2020-07-02  8:08   ` ilias.apalodimas
  2020-06-30 18:09 ` [PATCH net-next 4/4] mvpp2: XDP TX support Matteo Croce
  2020-07-01 19:18 ` [PATCH net-next 0/4] mvpp2: XDP support Jesper Dangaard Brouer
  4 siblings, 1 reply; 11+ messages in thread
From: Matteo Croce @ 2020-06-30 18:09 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, bpf, Sven Auhagen, Lorenzo Bianconi,
	David S. Miller, Jesper Dangaard Brouer, Stefan Chulski,
	Marcin Wojtas, maxime.chevallier, antoine.tenart,
	thomas.petazzoni

From: Matteo Croce <mcroce@microsoft.com>

Add native XDP support.
For now, only the XDP_DROP, XDP_PASS and XDP_REDIRECT
verdicts are supported.

Co-developed-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
---
 drivers/net/ethernet/marvell/mvpp2/mvpp2.h    |  28 ++-
 .../net/ethernet/marvell/mvpp2/mvpp2_main.c   | 166 +++++++++++++++++-
 2 files changed, 186 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
index 4c16c9e9c1e5..f351e41c9da6 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
@@ -16,6 +16,18 @@
 #include <linux/phylink.h>
 #include <net/flow_offload.h>
 #include <net/page_pool.h>
+#include <linux/bpf.h>
+#include <net/xdp.h>
+
+/* The PacketOffset field is measured in units of 32 bytes and is 3 bits wide,
+ * so the maximum offset is 7 * 32 = 224
+ */
+#define MVPP2_SKB_HEADROOM	min(max(XDP_PACKET_HEADROOM, NET_SKB_PAD), 224)
+
+#define MVPP2_XDP_PASS		0
+#define MVPP2_XDP_DROPPED	BIT(0)
+#define MVPP2_XDP_TX		BIT(1)
+#define MVPP2_XDP_REDIR		BIT(2)
 
 /* Fifo Registers */
 #define MVPP2_RX_DATA_FIFO_SIZE_REG(port)	(0x00 + 4 * (port))
@@ -629,10 +641,12 @@
 	ALIGN((mtu) + MVPP2_MH_SIZE + MVPP2_VLAN_TAG_LEN + \
 	      ETH_HLEN + ETH_FCS_LEN, cache_line_size())
 
-#define MVPP2_RX_BUF_SIZE(pkt_size)	((pkt_size) + NET_SKB_PAD)
+#define MVPP2_RX_BUF_SIZE(pkt_size)	((pkt_size) + MVPP2_SKB_HEADROOM)
 #define MVPP2_RX_TOTAL_SIZE(buf_size)	((buf_size) + MVPP2_SKB_SHINFO_SIZE)
 #define MVPP2_RX_MAX_PKT_SIZE(total_size) \
-	((total_size) - NET_SKB_PAD - MVPP2_SKB_SHINFO_SIZE)
+	((total_size) - MVPP2_SKB_HEADROOM - MVPP2_SKB_SHINFO_SIZE)
+
+#define MVPP2_MAX_RX_BUF_SIZE	(PAGE_SIZE - MVPP2_SKB_SHINFO_SIZE - MVPP2_SKB_HEADROOM)
 
 #define MVPP2_BIT_TO_BYTE(bit)		((bit) / 8)
 #define MVPP2_BIT_TO_WORD(bit)		((bit) / 32)
@@ -690,9 +704,9 @@ enum mvpp2_prs_l3_cast {
 #define MVPP2_BM_COOKIE_POOL_OFFS	8
 #define MVPP2_BM_COOKIE_CPU_OFFS	24
 
-#define MVPP2_BM_SHORT_FRAME_SIZE		512
-#define MVPP2_BM_LONG_FRAME_SIZE		2048
-#define MVPP2_BM_JUMBO_FRAME_SIZE		10240
+#define MVPP2_BM_SHORT_FRAME_SIZE	704	/* frame size 128 */
+#define MVPP2_BM_LONG_FRAME_SIZE	2240	/* frame size 1664 */
+#define MVPP2_BM_JUMBO_FRAME_SIZE	10432	/* frame size 9856 */
 /* BM short pool packet size
  * These value assure that for SWF the total number
  * of bytes allocated for each buffer will be 512
@@ -913,6 +927,8 @@ struct mvpp2_port {
 	unsigned int ntxqs;
 	struct net_device *dev;
 
+	struct bpf_prog *xdp_prog;
+
 	int pkt_size;
 
 	/* Per-CPU port control */
@@ -932,6 +948,8 @@ struct mvpp2_port {
 	struct mvpp2_pcpu_stats __percpu *stats;
 	u64 *ethtool_stats;
 
+	unsigned long state;
+
 	/* Per-port work and its lock to gather hardware statistics */
 	struct mutex gather_stats_lock;
 	struct delayed_work stats_work;
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index 9e2e8fb0a0b8..864d4789a0b3 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -36,6 +36,7 @@
 #include <net/ip.h>
 #include <net/ipv6.h>
 #include <net/tso.h>
+#include <linux/bpf_trace.h>
 
 #include "mvpp2.h"
 #include "mvpp2_prs.h"
@@ -105,6 +106,7 @@ mvpp2_create_page_pool(struct device *dev, int num, int len)
 		.nid = NUMA_NO_NODE,
 		.dev = dev,
 		.dma_dir = DMA_FROM_DEVICE,
+		.offset = MVPP2_SKB_HEADROOM,
 		.max_len = len,
 	};
 
@@ -2463,7 +2465,7 @@ static int mvpp2_rxq_init(struct mvpp2_port *port,
 	put_cpu();
 
 	/* Set Offset */
-	mvpp2_rxq_offset_set(port, rxq->id, NET_SKB_PAD);
+	mvpp2_rxq_offset_set(port, rxq->id, MVPP2_SKB_HEADROOM);
 
 	/* Set coalescing pkts and time */
 	mvpp2_rx_pkts_coal_set(port, rxq);
@@ -3036,16 +3038,69 @@ static u32 mvpp2_skb_tx_csum(struct mvpp2_port *port, struct sk_buff *skb)
 	return MVPP2_TXD_L4_CSUM_NOT | MVPP2_TXD_IP_CSUM_DISABLE;
 }
 
+static int
+mvpp2_run_xdp(struct mvpp2_port *port, struct mvpp2_rx_queue *rxq,
+	      struct bpf_prog *prog, struct xdp_buff *xdp,
+	      struct page_pool *pp)
+{
+	unsigned int len, sync, err;
+	struct page *page;
+	u32 ret, act;
+
+	len = xdp->data_end - xdp->data_hard_start - MVPP2_SKB_HEADROOM;
+	act = bpf_prog_run_xdp(prog, xdp);
+
+	/* Due xdp_adjust_tail: DMA sync for_device cover max len CPU touch */
+	sync = xdp->data_end - xdp->data_hard_start - MVPP2_SKB_HEADROOM;
+	sync = max(sync, len);
+
+	switch (act) {
+	case XDP_PASS:
+		ret = MVPP2_XDP_PASS;
+		break;
+	case XDP_REDIRECT:
+		err = xdp_do_redirect(port->dev, xdp, prog);
+		if (unlikely(err)) {
+			ret = MVPP2_XDP_DROPPED;
+			page = virt_to_head_page(xdp->data);
+			page_pool_put_page(pp, page, sync, true);
+		} else {
+			ret = MVPP2_XDP_REDIR;
+		}
+		break;
+	default:
+		bpf_warn_invalid_xdp_action(act);
+		fallthrough;
+	case XDP_ABORTED:
+		trace_xdp_exception(port->dev, prog, act);
+		fallthrough;
+	case XDP_DROP:
+		page = virt_to_head_page(xdp->data);
+		page_pool_put_page(pp, page, sync, true);
+		ret = MVPP2_XDP_DROPPED;
+		break;
+	}
+
+	return ret;
+}
+
 /* Main rx processing */
 static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 		    int rx_todo, struct mvpp2_rx_queue *rxq)
 {
 	struct net_device *dev = port->dev;
+	struct bpf_prog *xdp_prog;
+	struct xdp_buff xdp;
 	int rx_received;
 	int rx_done = 0;
+	u32 xdp_ret = 0;
 	u32 rcvd_pkts = 0;
 	u32 rcvd_bytes = 0;
 
+	rcu_read_lock();
+
+	xdp_prog = READ_ONCE(port->xdp_prog);
+
 	/* Get number of received packets and clamp the to-do */
 	rx_received = mvpp2_rxq_received(port, rxq->id);
 	if (rx_todo > rx_received)
@@ -3060,7 +3115,7 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 		dma_addr_t dma_addr;
 		phys_addr_t phys_addr;
 		u32 rx_status;
-		int pool, rx_bytes, err;
+		int pool, rx_bytes, err, ret;
 		void *data;
 
 		rx_done++;
@@ -3096,6 +3151,33 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 		else
 			frag_size = bm_pool->frag_size;
 
+		if (xdp_prog) {
+			xdp.data_hard_start = data;
+			xdp.data = data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM;
+			xdp.data_end = xdp.data + rx_bytes;
+			xdp.frame_sz = PAGE_SIZE;
+
+			if (bm_pool->pkt_size == MVPP2_BM_SHORT_PKT_SIZE)
+				xdp.rxq = &rxq->xdp_rxq_short;
+			else
+				xdp.rxq = &rxq->xdp_rxq_long;
+
+			xdp_set_data_meta_invalid(&xdp);
+
+			ret = mvpp2_run_xdp(port, rxq, xdp_prog, &xdp, pp);
+
+			if (ret) {
+				xdp_ret |= ret;
+				err = mvpp2_rx_refill(port, bm_pool, pp, pool);
+				if (err) {
+					netdev_err(port->dev, "failed to refill BM pools\n");
+					goto err_drop_frame;
+				}
+
+				continue;
+			}
+		}
+
 		skb = build_skb(data, frag_size);
 		if (!skb) {
 			netdev_warn(port->dev, "skb build failed\n");
@@ -3118,7 +3200,7 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 		rcvd_pkts++;
 		rcvd_bytes += rx_bytes;
 
-		skb_reserve(skb, MVPP2_MH_SIZE + NET_SKB_PAD);
+		skb_reserve(skb, MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM);
 		skb_put(skb, rx_bytes);
 		skb->protocol = eth_type_trans(skb, dev);
 		mvpp2_rx_csum(port, rx_status, skb);
@@ -3133,6 +3215,8 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 		mvpp2_bm_pool_put(port, pool, dma_addr, phys_addr);
 	}
 
+	rcu_read_unlock();
+
 	if (rcvd_pkts) {
 		struct mvpp2_pcpu_stats *stats = this_cpu_ptr(port->stats);
 
@@ -3608,6 +3692,8 @@ static void mvpp2_start_dev(struct mvpp2_port *port)
 	}
 
 	netif_tx_start_all_queues(port->dev);
+
+	clear_bit(0, &port->state);
 }
 
 /* Set hw internals when stopping port */
@@ -3615,6 +3701,8 @@ static void mvpp2_stop_dev(struct mvpp2_port *port)
 {
 	int i;
 
+	set_bit(0, &port->state);
+
 	/* Disable interrupts on all threads */
 	mvpp2_interrupts_disable(port);
 
@@ -4021,6 +4109,10 @@ static int mvpp2_change_mtu(struct net_device *dev, int mtu)
 	}
 
 	if (MVPP2_RX_PKT_SIZE(mtu) > MVPP2_BM_LONG_PKT_SIZE) {
+		if (port->xdp_prog) {
+			netdev_err(dev, "Jumbo frames are not supported with XDP\n");
+			return -EINVAL;
+		}
 		if (priv->percpu_pools) {
 			netdev_warn(dev, "mtu %d too high, switching to shared buffers", mtu);
 			mvpp2_bm_switch_buffers(priv, false);
@@ -4159,6 +4251,73 @@ static int mvpp2_set_features(struct net_device *dev,
 	return 0;
 }
 
+static int mvpp2_xdp_setup(struct mvpp2_port *port, struct netdev_bpf *bpf)
+{
+	struct bpf_prog *prog = bpf->prog, *old_prog;
+	bool running = netif_running(port->dev);
+	bool reset = !prog != !port->xdp_prog;
+
+	if (port->dev->mtu > ETH_DATA_LEN) {
+		netdev_err(port->dev, "Jumbo frames are not supported by XDP, current MTU %d.\n",
+			   port->dev->mtu);
+		return -EOPNOTSUPP;
+	}
+
+	if (!port->priv->percpu_pools) {
+		netdev_err(port->dev, "Per CPU Pools required for XDP");
+		return -EOPNOTSUPP;
+	}
+
+	/* device is up and bpf is added/removed, must setup the RX queues */
+	if (running && reset) {
+		mvpp2_stop_dev(port);
+		mvpp2_cleanup_rxqs(port);
+		mvpp2_cleanup_txqs(port);
+	}
+
+	old_prog = xchg(&port->xdp_prog, prog);
+	if (old_prog)
+		bpf_prog_put(old_prog);
+
+	/* bpf is just replaced, RXQ and MTU are already setup */
+	if (!reset)
+		return 0;
+
+	/* device was up, restore the link */
+	if (running) {
+		int ret = mvpp2_setup_rxqs(port);
+
+		if (ret) {
+			netdev_err(port->dev, "mvpp2_setup_rxqs failed\n");
+			return ret;
+		}
+		ret = mvpp2_setup_txqs(port);
+		if (ret) {
+			netdev_err(port->dev, "mvpp2_setup_txqs failed\n");
+			return ret;
+		}
+
+		mvpp2_start_dev(port);
+	}
+
+	return 0;
+}
+
+static int mvpp2_xdp(struct net_device *dev, struct netdev_bpf *xdp)
+{
+	struct mvpp2_port *port = netdev_priv(dev);
+
+	switch (xdp->command) {
+	case XDP_SETUP_PROG:
+		return mvpp2_xdp_setup(port, xdp);
+	case XDP_QUERY_PROG:
+		xdp->prog_id = port->xdp_prog ? port->xdp_prog->aux->id : 0;
+		return 0;
+	default:
+		return -EINVAL;
+	}
+}
+
 /* Ethtool methods */
 
 static int mvpp2_ethtool_nway_reset(struct net_device *dev)
@@ -4509,6 +4668,7 @@ static const struct net_device_ops mvpp2_netdev_ops = {
 	.ndo_vlan_rx_add_vid	= mvpp2_vlan_rx_add_vid,
 	.ndo_vlan_rx_kill_vid	= mvpp2_vlan_rx_kill_vid,
 	.ndo_set_features	= mvpp2_set_features,
+	.ndo_bpf		= mvpp2_xdp,
 };
 
 static const struct ethtool_ops mvpp2_eth_tool_ops = {
-- 
2.26.2



* [PATCH net-next 4/4] mvpp2: XDP TX support
  2020-06-30 18:09 [PATCH net-next 0/4] mvpp2: XDP support Matteo Croce
                   ` (2 preceding siblings ...)
  2020-06-30 18:09 ` [PATCH net-next 3/4] mvpp2: add basic XDP support Matteo Croce
@ 2020-06-30 18:09 ` Matteo Croce
  2020-07-01 19:18 ` [PATCH net-next 0/4] mvpp2: XDP support Jesper Dangaard Brouer
  4 siblings, 0 replies; 11+ messages in thread
From: Matteo Croce @ 2020-06-30 18:09 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, bpf, Sven Auhagen, Lorenzo Bianconi,
	David S. Miller, Jesper Dangaard Brouer, Stefan Chulski,
	Marcin Wojtas, maxime.chevallier, antoine.tenart,
	thomas.petazzoni

From: Matteo Croce <mcroce@microsoft.com>

Add the transmit part of XDP support, which includes:
- support for XDP_TX in mvpp2_xdp()
- .ndo_xdp_xmit hook for AF_XDP and XDP_REDIRECT with mvpp2 as destination

mvpp2_xdp_submit_frame() is a generic function called by
mvpp2_xdp_xmit_back() when doing XDP_TX, and by mvpp2_xdp_xmit() when
mvpp2 is the AF_XDP or XDP_REDIRECT target.

The buffer allocation has been reworked so that buffers can be mapped
as DMA_FROM_DEVICE or DMA_BIDIRECTIONAL, depending on whether a BPF
program is loaded.

Co-developed-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
---
 drivers/net/ethernet/marvell/mvpp2/mvpp2.h    |  13 +-
 .../net/ethernet/marvell/mvpp2/mvpp2_main.c   | 312 +++++++++++++++---
 2 files changed, 280 insertions(+), 45 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
index f351e41c9da6..c52955b33fab 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
@@ -1082,9 +1082,20 @@ struct mvpp2_rx_desc {
 	};
 };
 
+enum mvpp2_tx_buf_type {
+	MVPP2_TYPE_SKB,
+	MVPP2_TYPE_XDP_TX,
+	MVPP2_TYPE_XDP_NDO,
+};
+
 struct mvpp2_txq_pcpu_buf {
+	enum mvpp2_tx_buf_type type;
+
 	/* Transmitted SKB */
-	struct sk_buff *skb;
+	union {
+		struct xdp_frame *xdpf;
+		struct sk_buff *skb;
+	};
 
 	/* Physical address of transmitted buffer */
 	dma_addr_t dma;
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index 864d4789a0b3..ffc2a220613d 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -97,7 +97,8 @@ static inline u32 mvpp2_cpu_to_thread(struct mvpp2 *priv, int cpu)
 }
 
 static struct page_pool *
-mvpp2_create_page_pool(struct device *dev, int num, int len)
+mvpp2_create_page_pool(struct device *dev, int num, int len,
+		       enum dma_data_direction dma_dir)
 {
 	struct page_pool_params pp_params = {
 		/* internal DMA mapping in page_pool */
@@ -105,7 +106,7 @@ mvpp2_create_page_pool(struct device *dev, int num, int len)
 		.pool_size = num,
 		.nid = NUMA_NO_NODE,
 		.dev = dev,
-		.dma_dir = DMA_FROM_DEVICE,
+		.dma_dir = dma_dir,
 		.offset = MVPP2_SKB_HEADROOM,
 		.max_len = len,
 	};
@@ -299,12 +300,17 @@ static void mvpp2_txq_inc_get(struct mvpp2_txq_pcpu *txq_pcpu)
 
 static void mvpp2_txq_inc_put(struct mvpp2_port *port,
 			      struct mvpp2_txq_pcpu *txq_pcpu,
-			      struct sk_buff *skb,
-			      struct mvpp2_tx_desc *tx_desc)
+			      void *data,
+			      struct mvpp2_tx_desc *tx_desc,
+			      enum mvpp2_tx_buf_type buf_type)
 {
 	struct mvpp2_txq_pcpu_buf *tx_buf =
 		txq_pcpu->buffs + txq_pcpu->txq_put_index;
-	tx_buf->skb = skb;
+	tx_buf->type = buf_type;
+	if (buf_type == MVPP2_TYPE_SKB)
+		tx_buf->skb = data;
+	else
+		tx_buf->xdpf = data;
 	tx_buf->size = mvpp2_txdesc_size_get(port, tx_desc);
 	tx_buf->dma = mvpp2_txdesc_dma_addr_get(port, tx_desc) +
 		mvpp2_txdesc_offset_get(port, tx_desc);
@@ -528,9 +534,6 @@ static int mvpp2_bm_pool_destroy(struct device *dev, struct mvpp2 *priv,
 	int buf_num;
 	u32 val;
 
-	if (priv->percpu_pools)
-		page_pool_destroy(priv->page_pool[bm_pool->id]);
-
 	buf_num = mvpp2_check_hw_buf_num(priv, bm_pool);
 	mvpp2_bm_bufs_free(dev, priv, bm_pool, buf_num);
 
@@ -546,6 +549,9 @@ static int mvpp2_bm_pool_destroy(struct device *dev, struct mvpp2 *priv,
 	val |= MVPP2_BM_STOP_MASK;
 	mvpp2_write(priv, MVPP2_BM_POOL_CTRL_REG(bm_pool->id), val);
 
+	if (priv->percpu_pools)
+		page_pool_destroy(priv->page_pool[bm_pool->id]);
+
 	dma_free_coherent(dev, bm_pool->size_bytes,
 			  bm_pool->virt_addr,
 			  bm_pool->dma_addr);
@@ -581,9 +587,19 @@ static int mvpp2_bm_pools_init(struct device *dev, struct mvpp2 *priv)
 
 static int mvpp2_bm_init(struct device *dev, struct mvpp2 *priv)
 {
+	enum dma_data_direction dma_dir = DMA_FROM_DEVICE;
 	int i, err, poolnum = MVPP2_BM_POOLS_NUM;
+	struct mvpp2_port *port;
 
 	if (priv->percpu_pools) {
+		for (i = 0; i < priv->port_count; i++) {
+			port = priv->port_list[i];
+			if (port->xdp_prog) {
+				dma_dir = DMA_BIDIRECTIONAL;
+				break;
+			}
+		}
+
 		poolnum = mvpp2_get_nrxqs(priv) * 2;
 		for (i = 0; i < poolnum; i++) {
 			/* the pool in use */
@@ -592,7 +608,8 @@ static int mvpp2_bm_init(struct device *dev, struct mvpp2 *priv)
 			priv->page_pool[i] =
 				mvpp2_create_page_pool(dev,
 						       mvpp2_pools[pn].buf_num,
-						       mvpp2_pools[pn].pkt_size);
+						       mvpp2_pools[pn].pkt_size,
+						       dma_dir);
 			if (IS_ERR(priv->page_pool[i]))
 				return PTR_ERR(priv->page_pool[i]);
 		}
@@ -2319,11 +2336,15 @@ static void mvpp2_txq_bufs_free(struct mvpp2_port *port,
 		struct mvpp2_txq_pcpu_buf *tx_buf =
 			txq_pcpu->buffs + txq_pcpu->txq_get_index;
 
-		if (!IS_TSO_HEADER(txq_pcpu, tx_buf->dma))
+		if (!IS_TSO_HEADER(txq_pcpu, tx_buf->dma) &&
+		    tx_buf->type != MVPP2_TYPE_XDP_TX)
 			dma_unmap_single(port->dev->dev.parent, tx_buf->dma,
 					 tx_buf->size, DMA_TO_DEVICE);
-		if (tx_buf->skb)
+		if (tx_buf->type == MVPP2_TYPE_SKB && tx_buf->skb)
 			dev_kfree_skb_any(tx_buf->skb);
+		else if (tx_buf->type == MVPP2_TYPE_XDP_TX ||
+			 tx_buf->type == MVPP2_TYPE_XDP_NDO)
+			xdp_return_frame(tx_buf->xdpf);
 
 		mvpp2_txq_inc_get(txq_pcpu);
 	}
@@ -2809,7 +2830,7 @@ static int mvpp2_setup_rxqs(struct mvpp2_port *port)
 static int mvpp2_setup_txqs(struct mvpp2_port *port)
 {
 	struct mvpp2_tx_queue *txq;
-	int queue, err, cpu;
+	int queue, err;
 
 	for (queue = 0; queue < port->ntxqs; queue++) {
 		txq = port->txqs[queue];
@@ -2818,8 +2839,8 @@ static int mvpp2_setup_txqs(struct mvpp2_port *port)
 			goto err_cleanup;
 
 		/* Assign this queue to a CPU */
-		cpu = queue % num_present_cpus();
-		netif_set_xps_queue(port->dev, cpumask_of(cpu), queue);
+		if (queue < num_possible_cpus())
+			netif_set_xps_queue(port->dev, cpumask_of(queue), queue);
 	}
 
 	if (port->has_tx_irqs) {
@@ -3038,6 +3059,165 @@ static u32 mvpp2_skb_tx_csum(struct mvpp2_port *port, struct sk_buff *skb)
 	return MVPP2_TXD_L4_CSUM_NOT | MVPP2_TXD_IP_CSUM_DISABLE;
 }
 
+static void mvpp2_xdp_finish_tx(struct mvpp2_port *port, u16 txq_id, int nxmit, int nxmit_byte)
+{
+	unsigned int thread = mvpp2_cpu_to_thread(port->priv, smp_processor_id());
+	struct mvpp2_pcpu_stats *stats = per_cpu_ptr(port->stats, thread);
+	struct mvpp2_tx_queue *aggr_txq;
+	struct mvpp2_txq_pcpu *txq_pcpu;
+	struct mvpp2_tx_queue *txq;
+	struct netdev_queue *nq;
+
+	txq = port->txqs[txq_id];
+	txq_pcpu = per_cpu_ptr(txq->pcpu, thread);
+	nq = netdev_get_tx_queue(port->dev, txq_id);
+	aggr_txq = &port->priv->aggr_txqs[thread];
+
+	txq_pcpu->reserved_num -= nxmit;
+	txq_pcpu->count += nxmit;
+	aggr_txq->count += nxmit;
+
+	/* Enable transmit */
+	wmb();
+	mvpp2_aggr_txq_pend_desc_add(port, nxmit);
+
+	if (txq_pcpu->count >= txq_pcpu->stop_threshold)
+		netif_tx_stop_queue(nq);
+
+	u64_stats_update_begin(&stats->syncp);
+	stats->tx_bytes += nxmit_byte;
+	stats->tx_packets += nxmit;
+	u64_stats_update_end(&stats->syncp);
+
+	/* Finalize TX processing */
+	if (!port->has_tx_irqs && txq_pcpu->count >= txq->done_pkts_coal)
+		mvpp2_txq_done(port, txq, txq_pcpu);
+}
+
+static int
+mvpp2_xdp_submit_frame(struct mvpp2_port *port, u16 txq_id,
+		       struct xdp_frame *xdpf, bool dma_map)
+{
+	struct mvpp2_tx_desc *tx_desc;
+	struct mvpp2_tx_queue *aggr_txq;
+	struct mvpp2_tx_queue *txq;
+	struct mvpp2_txq_pcpu *txq_pcpu;
+	dma_addr_t dma_addr;
+	u32 tx_cmd = MVPP2_TXD_L4_CSUM_NOT | MVPP2_TXD_IP_CSUM_DISABLE |
+		     MVPP2_TXD_F_DESC | MVPP2_TXD_L_DESC;
+	enum mvpp2_tx_buf_type buf_type;
+	int ret = MVPP2_XDP_TX;
+
+	unsigned int thread = mvpp2_cpu_to_thread(port->priv, smp_processor_id());
+
+	txq = port->txqs[txq_id];
+	txq_pcpu = per_cpu_ptr(txq->pcpu, thread);
+	aggr_txq = &port->priv->aggr_txqs[thread];
+
+	/* Check number of available descriptors */
+	if (mvpp2_aggr_desc_num_check(port, aggr_txq, 1) ||
+	    mvpp2_txq_reserved_desc_num_proc(port, txq, txq_pcpu, 1)) {
+		ret = MVPP2_XDP_DROPPED;
+		goto out;
+	}
+
+	/* Get a descriptor for the first part of the packet */
+	tx_desc = mvpp2_txq_next_desc_get(aggr_txq);
+	mvpp2_txdesc_txq_set(port, tx_desc, txq->id);
+	mvpp2_txdesc_size_set(port, tx_desc, xdpf->len);
+
+	if (dma_map) {
+		/* XDP_REDIRECT or AF_XDP */
+		dma_addr = dma_map_single(port->dev->dev.parent, xdpf->data,
+					  xdpf->len, DMA_TO_DEVICE);
+
+		if (unlikely(dma_mapping_error(port->dev->dev.parent, dma_addr))) {
+			mvpp2_txq_desc_put(txq);
+			ret = MVPP2_XDP_DROPPED;
+			goto out;
+		}
+
+		buf_type = MVPP2_TYPE_XDP_NDO;
+	} else {
+		/* XDP_TX */
+		struct page *page = virt_to_page(xdpf->data);
+
+		dma_addr = page_pool_get_dma_addr(page) +
+			   sizeof(*xdpf) + xdpf->headroom;
+		dma_sync_single_for_device(port->dev->dev.parent, dma_addr,
+					   xdpf->len, DMA_BIDIRECTIONAL);
+
+		buf_type = MVPP2_TYPE_XDP_TX;
+	}
+
+	mvpp2_txdesc_dma_addr_set(port, tx_desc, dma_addr);
+
+	mvpp2_txdesc_cmd_set(port, tx_desc, tx_cmd);
+	mvpp2_txq_inc_put(port, txq_pcpu, xdpf, tx_desc, buf_type);
+
+out:
+	return ret;
+}
+
+static int
+mvpp2_xdp_xmit_back(struct mvpp2_port *port, struct xdp_buff *xdp)
+{
+	struct xdp_frame *xdpf;
+	u16 txq_id;
+	int ret;
+
+	xdpf = xdp_convert_buff_to_frame(xdp);
+	if (unlikely(!xdpf))
+		return MVPP2_XDP_DROPPED;
+
+	/* The first half of the TX queues is used for XPS,
+	 * the second half for XDP_TX
+	 */
+	txq_id = mvpp2_cpu_to_thread(port->priv, smp_processor_id()) + (port->ntxqs / 2);
+
+	ret = mvpp2_xdp_submit_frame(port, txq_id, xdpf, false);
+	if (ret == MVPP2_XDP_TX)
+		mvpp2_xdp_finish_tx(port, txq_id, 1, xdpf->len);
+
+	return ret;
+}
+
+static int
+mvpp2_xdp_xmit(struct net_device *dev, int num_frame,
+	       struct xdp_frame **frames, u32 flags)
+{
+	struct mvpp2_port *port = netdev_priv(dev);
+	int i, nxmit_byte = 0, nxmit = num_frame;
+	u32 ret;
+	u16 txq_id;
+
+	if (unlikely(test_bit(0, &port->state)))
+		return -ENETDOWN;
+
+	if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
+		return -EINVAL;
+
+	/* The first half of the TX queues is used for XPS,
+	 * the second half for XDP_TX
+	 */
+	txq_id = mvpp2_cpu_to_thread(port->priv, smp_processor_id()) + (port->ntxqs / 2);
+
+	for (i = 0; i < num_frame; i++) {
+		ret = mvpp2_xdp_submit_frame(port, txq_id, frames[i], true);
+		if (ret == MVPP2_XDP_TX) {
+			nxmit_byte += frames[i]->len;
+		} else {
+			xdp_return_frame_rx_napi(frames[i]);
+			nxmit--;
+		}
+	}
+
+	if (nxmit > 0)
+		mvpp2_xdp_finish_tx(port, txq_id, nxmit, nxmit_byte);
+
+	return nxmit;
+}
+
 static int
 mvpp2_run_xdp(struct mvpp2_port *port, struct mvpp2_rx_queue *rxq,
 	      struct bpf_prog *prog, struct xdp_buff *xdp,
@@ -3068,6 +3248,13 @@ mvpp2_run_xdp(struct mvpp2_port *port, struct mvpp2_rx_queue *rxq,
 			ret = MVPP2_XDP_REDIR;
 		}
 		break;
+	case XDP_TX:
+		ret = mvpp2_xdp_xmit_back(port, xdp);
+		if (ret != MVPP2_XDP_TX) {
+			page = virt_to_head_page(xdp->data);
+			page_pool_put_page(pp, page, sync, true);
+		}
+		break;
 	default:
 		bpf_warn_invalid_xdp_action(act);
 		fallthrough;
@@ -3089,6 +3276,7 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 		    int rx_todo, struct mvpp2_rx_queue *rxq)
 {
 	struct net_device *dev = port->dev;
+	enum dma_data_direction dma_dir;
 	struct bpf_prog *xdp_prog;
 	struct xdp_buff xdp;
 	int rx_received;
@@ -3138,13 +3326,19 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 		if (rx_status & MVPP2_RXD_ERR_SUMMARY)
 			goto err_drop_frame;
 
+		if (port->priv->percpu_pools) {
+			pp = port->priv->page_pool[pool];
+			dma_dir = page_pool_get_dma_dir(pp);
+		} else {
+			dma_dir = DMA_FROM_DEVICE;
+		}
+
 		dma_sync_single_for_cpu(dev->dev.parent, dma_addr,
 					rx_bytes + MVPP2_MH_SIZE,
-					DMA_FROM_DEVICE);
-		prefetch(data);
+					dma_dir);
 
-		if (port->priv->percpu_pools)
-			pp = port->priv->page_pool[pool];
+		/* Prefetch header */
+		prefetch(data);
 
 		if (bm_pool->frag_size > PAGE_SIZE)
 			frag_size = 0;
@@ -3217,6 +3411,9 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 
 	rcu_read_unlock();
 
+	if (xdp_ret & MVPP2_XDP_REDIR)
+		xdp_do_flush_map();
+
 	if (rcvd_pkts) {
 		struct mvpp2_pcpu_stats *stats = this_cpu_ptr(port->stats);
 
@@ -3283,11 +3480,11 @@ static int mvpp2_tx_frag_process(struct mvpp2_port *port, struct sk_buff *skb,
 			/* Last descriptor */
 			mvpp2_txdesc_cmd_set(port, tx_desc,
 					     MVPP2_TXD_L_DESC);
-			mvpp2_txq_inc_put(port, txq_pcpu, skb, tx_desc);
+			mvpp2_txq_inc_put(port, txq_pcpu, skb, tx_desc, MVPP2_TYPE_SKB);
 		} else {
 			/* Descriptor in the middle: Not First, Not Last */
 			mvpp2_txdesc_cmd_set(port, tx_desc, 0);
-			mvpp2_txq_inc_put(port, txq_pcpu, NULL, tx_desc);
+			mvpp2_txq_inc_put(port, txq_pcpu, NULL, tx_desc, MVPP2_TYPE_SKB);
 		}
 	}
 
@@ -3325,7 +3522,7 @@ static inline void mvpp2_tso_put_hdr(struct sk_buff *skb,
 	mvpp2_txdesc_cmd_set(port, tx_desc, mvpp2_skb_tx_csum(port, skb) |
 					    MVPP2_TXD_F_DESC |
 					    MVPP2_TXD_PADDING_DISABLE);
-	mvpp2_txq_inc_put(port, txq_pcpu, NULL, tx_desc);
+	mvpp2_txq_inc_put(port, txq_pcpu, NULL, tx_desc, MVPP2_TYPE_SKB);
 }
 
 static inline int mvpp2_tso_put_data(struct sk_buff *skb,
@@ -3354,14 +3551,14 @@ static inline int mvpp2_tso_put_data(struct sk_buff *skb,
 	if (!left) {
 		mvpp2_txdesc_cmd_set(port, tx_desc, MVPP2_TXD_L_DESC);
 		if (last) {
-			mvpp2_txq_inc_put(port, txq_pcpu, skb, tx_desc);
+			mvpp2_txq_inc_put(port, txq_pcpu, skb, tx_desc, MVPP2_TYPE_SKB);
 			return 0;
 		}
 	} else {
 		mvpp2_txdesc_cmd_set(port, tx_desc, 0);
 	}
 
-	mvpp2_txq_inc_put(port, txq_pcpu, NULL, tx_desc);
+	mvpp2_txq_inc_put(port, txq_pcpu, NULL, tx_desc, MVPP2_TYPE_SKB);
 	return 0;
 }
 
@@ -3474,12 +3671,12 @@ static netdev_tx_t mvpp2_tx(struct sk_buff *skb, struct net_device *dev)
 		/* First and Last descriptor */
 		tx_cmd |= MVPP2_TXD_F_DESC | MVPP2_TXD_L_DESC;
 		mvpp2_txdesc_cmd_set(port, tx_desc, tx_cmd);
-		mvpp2_txq_inc_put(port, txq_pcpu, skb, tx_desc);
+		mvpp2_txq_inc_put(port, txq_pcpu, skb, tx_desc, MVPP2_TYPE_SKB);
 	} else {
 		/* First but not Last */
 		tx_cmd |= MVPP2_TXD_F_DESC | MVPP2_TXD_PADDING_DISABLE;
 		mvpp2_txdesc_cmd_set(port, tx_desc, tx_cmd);
-		mvpp2_txq_inc_put(port, txq_pcpu, NULL, tx_desc);
+		mvpp2_txq_inc_put(port, txq_pcpu, NULL, tx_desc, MVPP2_TYPE_SKB);
 
 		/* Continue with other skb fragments */
 		if (mvpp2_tx_frag_process(port, skb, aggr_txq, txq)) {
@@ -4158,6 +4355,39 @@ static int mvpp2_change_mtu(struct net_device *dev, int mtu)
 	return err;
 }
 
+static int mvpp2_check_pagepool_dma(struct mvpp2_port *port)
+{
+	enum dma_data_direction dma_dir = DMA_FROM_DEVICE;
+	struct mvpp2 *priv = port->priv;
+	struct page_pool *page_pool;
+	int err = -1, i;
+
+	if (!priv->percpu_pools) {
+		netdev_warn(port->dev, "cannot change page_pool, it is not enabled\n");
+	} else {
+		for (i = 0; i < priv->port_count; i++) {
+			port = priv->port_list[i];
+			if (port->xdp_prog) {
+				dma_dir = DMA_BIDIRECTIONAL;
+				break;
+			}
+		}
+
+		if (!priv->page_pool)
+			return -ENOMEM;
+
+		/* All pools are equal in terms of dma direction */
+		page_pool = priv->page_pool[0];
+
+		if (page_pool->p.dma_dir != dma_dir) {
+			netdev_info(port->dev, "Changing pagepool dma to %d\n", dma_dir);
+			err = mvpp2_bm_switch_buffers(priv, true);
+		}
+	}
+
+	return err;
+}
+
 static void
 mvpp2_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 {
@@ -4268,13 +4498,16 @@ static int mvpp2_xdp_setup(struct mvpp2_port *port, struct netdev_bpf *bpf)
 		return -EOPNOTSUPP;
 	}
 
-	/* device is up and bpf is added/removed, must setup the RX queues */
-	if (running && reset) {
-		mvpp2_stop_dev(port);
-		mvpp2_cleanup_rxqs(port);
-		mvpp2_cleanup_txqs(port);
+	if (port->ntxqs < num_possible_cpus() * 2) {
+		netdev_err(port->dev, "XDP_TX needs two TX queues per CPU, but only %d queues are available\n",
+			   port->ntxqs);
+		return -EOPNOTSUPP;
 	}
 
+	/* device is up and bpf is added/removed, must setup the RX queues */
+	if (running && reset)
+		mvpp2_stop(port->dev);
+
 	old_prog = xchg(&port->xdp_prog, prog);
 	if (old_prog)
 		bpf_prog_put(old_prog);
@@ -4284,21 +4517,11 @@ static int mvpp2_xdp_setup(struct mvpp2_port *port, struct netdev_bpf *bpf)
 		return 0;
 
 	/* device was up, restore the link */
-	if (running) {
-		int ret = mvpp2_setup_rxqs(port);
-
-		if (ret) {
-			netdev_err(port->dev, "mvpp2_setup_rxqs failed\n");
-			return ret;
-		}
-		ret = mvpp2_setup_txqs(port);
-		if (ret) {
-			netdev_err(port->dev, "mvpp2_setup_txqs failed\n");
-			return ret;
-		}
+	if (running)
+		mvpp2_open(port->dev);
 
-		mvpp2_start_dev(port);
-	}
+	/* Check Page Pool DMA Direction */
+	mvpp2_check_pagepool_dma(port);
 
 	return 0;
 }
@@ -4669,6 +4892,7 @@ static const struct net_device_ops mvpp2_netdev_ops = {
 	.ndo_vlan_rx_kill_vid	= mvpp2_vlan_rx_kill_vid,
 	.ndo_set_features	= mvpp2_set_features,
 	.ndo_bpf		= mvpp2_xdp,
+	.ndo_xdp_xmit		= mvpp2_xdp_xmit,
 };
 
 static const struct ethtool_ops mvpp2_eth_tool_ops = {
-- 
2.26.2


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next 0/4] mvpp2: XDP support
  2020-06-30 18:09 [PATCH net-next 0/4] mvpp2: XDP support Matteo Croce
                   ` (3 preceding siblings ...)
  2020-06-30 18:09 ` [PATCH net-next 4/4] mvpp2: XDP TX support Matteo Croce
@ 2020-07-01 19:18 ` Jesper Dangaard Brouer
  4 siblings, 0 replies; 11+ messages in thread
From: Jesper Dangaard Brouer @ 2020-07-01 19:18 UTC (permalink / raw)
  To: Matteo Croce, Ilias Apalodimas
  Cc: netdev, linux-kernel, bpf, Sven Auhagen, Lorenzo Bianconi,
	David S. Miller, Stefan Chulski, Marcin Wojtas,
	maxime.chevallier, antoine.tenart, thomas.petazzoni, brouer

Hey Ilias,

Could you please review this mvpp2 driver series for XDP and page_pool usage?


On Tue, 30 Jun 2020 20:09:26 +0200
Matteo Croce <mcroce@linux.microsoft.com> wrote:

> From: Matteo Croce <mcroce@microsoft.com>
> 
> Add XDP support to mvpp2. This series converts the driver to the
> page_pool API for RX buffer management, and adds native XDP support.
> 
> These are the performance numbers, as measured by Sven:
> 
> SKB fwd page pool:
> Rx bps     390.38 Mbps
> Rx pps     762.46 Kpps
> 
> XDP fwd:
> Rx bps     1.39 Gbps
> Rx pps     2.72 Mpps
> 
> XDP Drop:
> eth0: 12.9 Mpps
> eth1: 4.1 Mpps
> 
> Matteo Croce (4):
>   mvpp2: refactor BM pool init percpu code
>   mvpp2: use page_pool allocator
>   mvpp2: add basic XDP support
>   mvpp2: XDP TX support
> 
>  drivers/net/ethernet/marvell/Kconfig          |   1 +
>  drivers/net/ethernet/marvell/mvpp2/mvpp2.h    |  49 +-
>  .../net/ethernet/marvell/mvpp2/mvpp2_main.c   | 600 ++++++++++++++++--
>  3 files changed, 588 insertions(+), 62 deletions(-)
> 

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer



* Re: [PATCH net-next 2/4] mvpp2: use page_pool allocator
  2020-06-30 18:09 ` [PATCH net-next 2/4] mvpp2: use page_pool allocator Matteo Croce
@ 2020-07-02  7:31   ` ilias.apalodimas
  2020-07-02  9:42     ` Matteo Croce
  0 siblings, 1 reply; 11+ messages in thread
From: ilias.apalodimas @ 2020-07-02  7:31 UTC (permalink / raw)
  To: Matteo Croce
  Cc: netdev, linux-kernel, bpf, Sven Auhagen, Lorenzo Bianconi,
	David S. Miller, Jesper Dangaard Brouer, Stefan Chulski,
	Marcin Wojtas, maxime.chevallier, antoine.tenart,
	thomas.petazzoni

Hi Matteo, 

Thanks for working on this!

On Tue, Jun 30, 2020 at 08:09:28PM +0200, Matteo Croce wrote:
> From: Matteo Croce <mcroce@microsoft.com>
> 
> Use the page_pool API for memory management. This is a prerequisite for
> native XDP support.
> 
> Tested-by: Sven Auhagen <sven.auhagen@voleatech.de>
> Signed-off-by: Matteo Croce <mcroce@microsoft.com>
> ---
>  drivers/net/ethernet/marvell/Kconfig          |   1 +
>  drivers/net/ethernet/marvell/mvpp2/mvpp2.h    |   8 +
>  .../net/ethernet/marvell/mvpp2/mvpp2_main.c   | 155 +++++++++++++++---
>  3 files changed, 139 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/net/ethernet/marvell/Kconfig b/drivers/net/ethernet/marvell/Kconfig
> index cd8ddd1ef6f2..ef4f35ba077d 100644
> --- a/drivers/net/ethernet/marvell/Kconfig
> +++ b/drivers/net/ethernet/marvell/Kconfig
> @@ -87,6 +87,7 @@ config MVPP2
>  	depends on ARCH_MVEBU || COMPILE_TEST
>  	select MVMDIO
>  	select PHYLINK
> +	select PAGE_POOL
>  	help
>  	  This driver supports the network interface units in the
>  	  Marvell ARMADA 375, 7K and 8K SoCs.
> diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
> index 543a310ec102..4c16c9e9c1e5 100644
> --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
> +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
> @@ -15,6 +15,7 @@
>  #include <linux/phy.h>
>  #include <linux/phylink.h>
>  #include <net/flow_offload.h>
> +#include <net/page_pool.h>
>  
>  /* Fifo Registers */
>  #define MVPP2_RX_DATA_FIFO_SIZE_REG(port)	(0x00 + 4 * (port))
> @@ -820,6 +821,9 @@ struct mvpp2 {
>  
>  	/* RSS Indirection tables */
>  	struct mvpp2_rss_table *rss_tables[MVPP22_N_RSS_TABLES];
> +
> +	/* page_pool allocator */
> +	struct page_pool *page_pool[MVPP2_PORT_MAX_RXQ];
>  };
>  
>  struct mvpp2_pcpu_stats {
> @@ -1161,6 +1165,10 @@ struct mvpp2_rx_queue {
>  
>  	/* Port's logic RXQ number to which physical RXQ is mapped */
>  	int logic_rxq;
> +
> +	/* XDP memory accounting */
> +	struct xdp_rxq_info xdp_rxq_short;
> +	struct xdp_rxq_info xdp_rxq_long;
>  };
>  
>  struct mvpp2_bm_pool {
> diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> index 027de7291f92..9e2e8fb0a0b8 100644
> --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> @@ -95,6 +95,22 @@ static inline u32 mvpp2_cpu_to_thread(struct mvpp2 *priv, int cpu)
>  	return cpu % priv->nthreads;
>  }
>  
> +static struct page_pool *
> +mvpp2_create_page_pool(struct device *dev, int num, int len)
> +{
> +	struct page_pool_params pp_params = {
> +		/* internal DMA mapping in page_pool */
> +		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
> +		.pool_size = num,
> +		.nid = NUMA_NO_NODE,
> +		.dev = dev,
> +		.dma_dir = DMA_FROM_DEVICE,
> +		.max_len = len,
> +	};
> +
> +	return page_pool_create(&pp_params);
> +}
> +
>  /* These accessors should be used to access:
>   *
>   * - per-thread registers, where each thread has its own copy of the
> @@ -327,17 +343,26 @@ static inline int mvpp2_txq_phys(int port, int txq)
>  	return (MVPP2_MAX_TCONT + port) * MVPP2_MAX_TXQ + txq;
>  }
>  
> -static void *mvpp2_frag_alloc(const struct mvpp2_bm_pool *pool)
> +/* Returns a struct page if page_pool is set, otherwise a buffer */
> +static void *mvpp2_frag_alloc(const struct mvpp2_bm_pool *pool,
> +			      struct page_pool *page_pool)
>  {
> +	if (page_pool)
> +		return page_pool_alloc_pages(page_pool,
> +					     GFP_ATOMIC | __GFP_NOWARN);

page_pool_dev_alloc_pages() can set these flags for you, instead of passing
them explicitly.

> +
>  	if (likely(pool->frag_size <= PAGE_SIZE))
>  		return netdev_alloc_frag(pool->frag_size);
> -	else
> -		return kmalloc(pool->frag_size, GFP_ATOMIC);
> +
> +	return kmalloc(pool->frag_size, GFP_ATOMIC);
>  }
>  
> -static void mvpp2_frag_free(const struct mvpp2_bm_pool *pool, void *data)
> +static void mvpp2_frag_free(const struct mvpp2_bm_pool *pool,
> +			    struct page_pool *page_pool, void *data)
>  {
> -	if (likely(pool->frag_size <= PAGE_SIZE))
> +	if (page_pool)
> +		page_pool_put_full_page(page_pool, virt_to_head_page(data), false);
> +	else if (likely(pool->frag_size <= PAGE_SIZE))
>  		skb_free_frag(data);
>  	else
>  		kfree(data);
> @@ -442,6 +467,7 @@ static void mvpp2_bm_bufs_get_addrs(struct device *dev, struct mvpp2 *priv,
>  static void mvpp2_bm_bufs_free(struct device *dev, struct mvpp2 *priv,
>  			       struct mvpp2_bm_pool *bm_pool, int buf_num)
>  {
> +	struct page_pool *pp = NULL;
>  	int i;
>  
>  	if (buf_num > bm_pool->buf_num) {
> @@ -450,6 +476,9 @@ static void mvpp2_bm_bufs_free(struct device *dev, struct mvpp2 *priv,
>  		buf_num = bm_pool->buf_num;
>  	}
>  
> +	if (priv->percpu_pools)
> +		pp = priv->page_pool[bm_pool->id];
> +
>  	for (i = 0; i < buf_num; i++) {
>  		dma_addr_t buf_dma_addr;
>  		phys_addr_t buf_phys_addr;
> @@ -458,14 +487,15 @@ static void mvpp2_bm_bufs_free(struct device *dev, struct mvpp2 *priv,
>  		mvpp2_bm_bufs_get_addrs(dev, priv, bm_pool,
>  					&buf_dma_addr, &buf_phys_addr);
>  
> -		dma_unmap_single(dev, buf_dma_addr,
> -				 bm_pool->buf_size, DMA_FROM_DEVICE);
> +		if (!pp)
> +			dma_unmap_single(dev, buf_dma_addr,
> +					 bm_pool->buf_size, DMA_FROM_DEVICE);
>  
>  		data = (void *)phys_to_virt(buf_phys_addr);
>  		if (!data)
>  			break;
>  
> -		mvpp2_frag_free(bm_pool, data);
> +		mvpp2_frag_free(bm_pool, pp, data);
>  	}
>  
>  	/* Update BM driver with number of buffers removed from pool */
> @@ -496,6 +526,9 @@ static int mvpp2_bm_pool_destroy(struct device *dev, struct mvpp2 *priv,
>  	int buf_num;
>  	u32 val;
>  
> +	if (priv->percpu_pools)
> +		page_pool_destroy(priv->page_pool[bm_pool->id]);
> +
>  	buf_num = mvpp2_check_hw_buf_num(priv, bm_pool);
>  	mvpp2_bm_bufs_free(dev, priv, bm_pool, buf_num);
>  
> @@ -548,8 +581,20 @@ static int mvpp2_bm_init(struct device *dev, struct mvpp2 *priv)
>  {
>  	int i, err, poolnum = MVPP2_BM_POOLS_NUM;
>  
> -	if (priv->percpu_pools)
> +	if (priv->percpu_pools) {
>  		poolnum = mvpp2_get_nrxqs(priv) * 2;
> +		for (i = 0; i < poolnum; i++) {
> +			/* the pool in use */
> +			int pn = i / (poolnum / 2);
> +
> +			priv->page_pool[i] =
> +				mvpp2_create_page_pool(dev,
> +						       mvpp2_pools[pn].buf_num,
> +						       mvpp2_pools[pn].pkt_size);
> +			if (IS_ERR(priv->page_pool[i]))
> +				return PTR_ERR(priv->page_pool[i]);
> +		}
> +	}
>  
>  	dev_info(dev, "using %d %s buffers\n", poolnum,
>  		 priv->percpu_pools ? "per-cpu" : "shared");
> @@ -632,23 +677,31 @@ static void mvpp2_rxq_short_pool_set(struct mvpp2_port *port,
>  
>  static void *mvpp2_buf_alloc(struct mvpp2_port *port,
>  			     struct mvpp2_bm_pool *bm_pool,
> +			     struct page_pool *page_pool,
>  			     dma_addr_t *buf_dma_addr,
>  			     phys_addr_t *buf_phys_addr,
>  			     gfp_t gfp_mask)
>  {
>  	dma_addr_t dma_addr;
> +	struct page *page;
>  	void *data;
>  
> -	data = mvpp2_frag_alloc(bm_pool);
> +	data = mvpp2_frag_alloc(bm_pool, page_pool);
>  	if (!data)
>  		return NULL;
>  
> -	dma_addr = dma_map_single(port->dev->dev.parent, data,
> -				  MVPP2_RX_BUF_SIZE(bm_pool->pkt_size),
> -				  DMA_FROM_DEVICE);
> -	if (unlikely(dma_mapping_error(port->dev->dev.parent, dma_addr))) {
> -		mvpp2_frag_free(bm_pool, data);
> -		return NULL;
> +	if (page_pool) {
> +		page = (struct page *)data;
> +		dma_addr = page_pool_get_dma_addr(page);
> +		data = page_to_virt(page);
> +	} else {
> +		dma_addr = dma_map_single(port->dev->dev.parent, data,
> +					  MVPP2_RX_BUF_SIZE(bm_pool->pkt_size),
> +					  DMA_FROM_DEVICE);
> +		if (unlikely(dma_mapping_error(port->dev->dev.parent, dma_addr))) {
> +			mvpp2_frag_free(bm_pool, NULL, data);
> +			return NULL;
> +		}
>  	}
>  	*buf_dma_addr = dma_addr;
>  	*buf_phys_addr = virt_to_phys(data);
> @@ -706,6 +759,7 @@ static int mvpp2_bm_bufs_add(struct mvpp2_port *port,
>  	int i, buf_size, total_size;
>  	dma_addr_t dma_addr;
>  	phys_addr_t phys_addr;
> +	struct page_pool *pp = NULL;
>  	void *buf;
>  
>  	if (port->priv->percpu_pools &&
> @@ -726,8 +780,10 @@ static int mvpp2_bm_bufs_add(struct mvpp2_port *port,
>  		return 0;
>  	}
>  
> +	if (port->priv->percpu_pools)
> +		pp = port->priv->page_pool[bm_pool->id];
>  	for (i = 0; i < buf_num; i++) {
> -		buf = mvpp2_buf_alloc(port, bm_pool, &dma_addr,
> +		buf = mvpp2_buf_alloc(port, bm_pool, pp, &dma_addr,
>  				      &phys_addr, GFP_KERNEL);
>  		if (!buf)
>  			break;
> @@ -2374,10 +2430,11 @@ static int mvpp2_aggr_txq_init(struct platform_device *pdev,
>  /* Create a specified Rx queue */
>  static int mvpp2_rxq_init(struct mvpp2_port *port,
>  			  struct mvpp2_rx_queue *rxq)
> -
>  {
> +	struct mvpp2 *priv = port->priv;
>  	unsigned int thread;
>  	u32 rxq_dma;
> +	int err;
>  
>  	rxq->size = port->rx_ring_size;
>  
> @@ -2415,7 +2472,41 @@ static int mvpp2_rxq_init(struct mvpp2_port *port,
>  	/* Add number of descriptors ready for receiving packets */
>  	mvpp2_rxq_status_update(port, rxq->id, 0, rxq->size);
>  
> +	if (priv->percpu_pools) {
> +		err = xdp_rxq_info_reg(&rxq->xdp_rxq_short, port->dev, rxq->id);
> +		if (err < 0)
> +			goto err_free_dma;
> +
> +		err = xdp_rxq_info_reg(&rxq->xdp_rxq_long, port->dev, rxq->id);
> +		if (err < 0)
> +			goto err_unregister_rxq_short;
> +
> +		/* Every RXQ has a pool for short and another for long packets */
> +		err = xdp_rxq_info_reg_mem_model(&rxq->xdp_rxq_short,
> +						 MEM_TYPE_PAGE_POOL,
> +						 priv->page_pool[rxq->logic_rxq]);
> +		if (err < 0)
> +			goto err_unregister_rxq_short;
> +
> +		err = xdp_rxq_info_reg_mem_model(&rxq->xdp_rxq_long,
> +						 MEM_TYPE_PAGE_POOL,
> +						 priv->page_pool[rxq->logic_rxq +
> +								 port->nrxqs]);
> +		if (err < 0)
> +			goto err_unregister_rxq_long;

Since mvpp2_rxq_init() will return an error, shouldn't we unregister the short
memory pool as well?

> +	}
> +
>  	return 0;
> +
> +err_unregister_rxq_long:
> +	xdp_rxq_info_unreg(&rxq->xdp_rxq_long);
> +err_unregister_rxq_short:
> +	xdp_rxq_info_unreg(&rxq->xdp_rxq_short);
> +err_free_dma:
> +	dma_free_coherent(port->dev->dev.parent,
> +			  rxq->size * MVPP2_DESC_ALIGNED_SIZE,
> +			  rxq->descs, rxq->descs_dma);
> +	return err;
>  }
>  
>  /* Push packets received by the RXQ to BM pool */
> @@ -2449,6 +2540,12 @@ static void mvpp2_rxq_deinit(struct mvpp2_port *port,
>  {
>  	unsigned int thread;
>  
> +	if (xdp_rxq_info_is_reg(&rxq->xdp_rxq_short))
> +		xdp_rxq_info_unreg(&rxq->xdp_rxq_short);
> +
> +	if (xdp_rxq_info_is_reg(&rxq->xdp_rxq_long))
> +		xdp_rxq_info_unreg(&rxq->xdp_rxq_long);
> +
>  	mvpp2_rxq_drop_pkts(port, rxq);
>  
>  	if (rxq->descs)
> @@ -2890,14 +2987,15 @@ static void mvpp2_rx_csum(struct mvpp2_port *port, u32 status,
>  
>  /* Allocate a new skb and add it to BM pool */
>  static int mvpp2_rx_refill(struct mvpp2_port *port,
> -			   struct mvpp2_bm_pool *bm_pool, int pool)
> +			   struct mvpp2_bm_pool *bm_pool,
> +			   struct page_pool *page_pool, int pool)
>  {
>  	dma_addr_t dma_addr;
>  	phys_addr_t phys_addr;
>  	void *buf;
>  
> -	buf = mvpp2_buf_alloc(port, bm_pool, &dma_addr, &phys_addr,
> -			      GFP_ATOMIC);
> +	buf = mvpp2_buf_alloc(port, bm_pool, page_pool,
> +			      &dma_addr, &phys_addr, GFP_ATOMIC);
>  	if (!buf)
>  		return -ENOMEM;
>  
> @@ -2956,6 +3054,7 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
>  	while (rx_done < rx_todo) {
>  		struct mvpp2_rx_desc *rx_desc = mvpp2_rxq_next_desc_get(rxq);
>  		struct mvpp2_bm_pool *bm_pool;
> +		struct page_pool *pp = NULL;
>  		struct sk_buff *skb;
>  		unsigned int frag_size;
>  		dma_addr_t dma_addr;
> @@ -2989,6 +3088,9 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
>  					DMA_FROM_DEVICE);
>  		prefetch(data);
>  
> +		if (port->priv->percpu_pools)
> +			pp = port->priv->page_pool[pool];
> +
>  		if (bm_pool->frag_size > PAGE_SIZE)
>  			frag_size = 0;
>  		else
> @@ -3000,15 +3102,18 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
>  			goto err_drop_frame;
>  		}
>  
> -		err = mvpp2_rx_refill(port, bm_pool, pool);
> +		err = mvpp2_rx_refill(port, bm_pool, pp, pool);
>  		if (err) {
>  			netdev_err(port->dev, "failed to refill BM pools\n");
>  			goto err_drop_frame;
>  		}
>  
> -		dma_unmap_single_attrs(dev->dev.parent, dma_addr,
> -				       bm_pool->buf_size, DMA_FROM_DEVICE,
> -				       DMA_ATTR_SKIP_CPU_SYNC);
> +		if (pp)
> +			page_pool_release_page(pp, virt_to_page(data));
> +		else
> +			dma_unmap_single_attrs(dev->dev.parent, dma_addr,
> +					       bm_pool->buf_size, DMA_FROM_DEVICE,
> +					       DMA_ATTR_SKIP_CPU_SYNC);
>  
>  		rcvd_pkts++;
>  		rcvd_bytes += rx_bytes;
> -- 
> 2.26.2
> 


* Re: [PATCH net-next 3/4] mvpp2: add basic XDP support
  2020-06-30 18:09 ` [PATCH net-next 3/4] mvpp2: add basic XDP support Matteo Croce
@ 2020-07-02  8:08   ` ilias.apalodimas
  2020-07-02  9:09     ` Maciej Fijalkowski
  0 siblings, 1 reply; 11+ messages in thread
From: ilias.apalodimas @ 2020-07-02  8:08 UTC (permalink / raw)
  To: Matteo Croce
  Cc: netdev, linux-kernel, bpf, Sven Auhagen, Lorenzo Bianconi,
	David S. Miller, Jesper Dangaard Brouer, Stefan Chulski,
	Marcin Wojtas, maxime.chevallier, antoine.tenart,
	thomas.petazzoni

On Tue, Jun 30, 2020 at 08:09:29PM +0200, Matteo Croce wrote:
> From: Matteo Croce <mcroce@microsoft.com>
> 
> Add XDP native support.
> By now only XDP_DROP, XDP_PASS and XDP_REDIRECT
> verdicts are supported.
> 
> Co-developed-by: Sven Auhagen <sven.auhagen@voleatech.de>
> Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
> Signed-off-by: Matteo Croce <mcroce@microsoft.com>
> ---

[...]

>  }
>  
> +static int
> +mvpp2_run_xdp(struct mvpp2_port *port, struct mvpp2_rx_queue *rxq,
> +	      struct bpf_prog *prog, struct xdp_buff *xdp,
> +	      struct page_pool *pp)
> +{
> +	unsigned int len, sync, err;
> +	struct page *page;
> +	u32 ret, act;
> +
> +	len = xdp->data_end - xdp->data_hard_start - MVPP2_SKB_HEADROOM;
> +	act = bpf_prog_run_xdp(prog, xdp);
> +
> +	/* Due to xdp_adjust_tail, the device DMA sync must cover the max length the CPU touched */
> +	sync = xdp->data_end - xdp->data_hard_start - MVPP2_SKB_HEADROOM;
> +	sync = max(sync, len);
> +
> +	switch (act) {
> +	case XDP_PASS:
> +		ret = MVPP2_XDP_PASS;
> +		break;
> +	case XDP_REDIRECT:
> +		err = xdp_do_redirect(port->dev, xdp, prog);
> +		if (unlikely(err)) {
> +			ret = MVPP2_XDP_DROPPED;
> +			page = virt_to_head_page(xdp->data);
> +			page_pool_put_page(pp, page, sync, true);
> +		} else {
> +			ret = MVPP2_XDP_REDIR;
> +		}
> +		break;
> +	default:
> +		bpf_warn_invalid_xdp_action(act);
> +		fallthrough;
> +	case XDP_ABORTED:
> +		trace_xdp_exception(port->dev, prog, act);
> +		fallthrough;
> +	case XDP_DROP:
> +		page = virt_to_head_page(xdp->data);
> +		page_pool_put_page(pp, page, sync, true);
> +		ret = MVPP2_XDP_DROPPED;
> +		break;
> +	}
> +
> +	return ret;
> +}
> +
>  /* Main rx processing */
>  static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
>  		    int rx_todo, struct mvpp2_rx_queue *rxq)
>  {
>  	struct net_device *dev = port->dev;
> +	struct bpf_prog *xdp_prog;
> +	struct xdp_buff xdp;
>  	int rx_received;
>  	int rx_done = 0;
> +	u32 xdp_ret = 0;
>  	u32 rcvd_pkts = 0;
>  	u32 rcvd_bytes = 0;
>  
> +	rcu_read_lock();
> +
> +	xdp_prog = READ_ONCE(port->xdp_prog);
> +
>  	/* Get number of received packets and clamp the to-do */
>  	rx_received = mvpp2_rxq_received(port, rxq->id);
>  	if (rx_todo > rx_received)
> @@ -3060,7 +3115,7 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
>  		dma_addr_t dma_addr;
>  		phys_addr_t phys_addr;
>  		u32 rx_status;
> -		int pool, rx_bytes, err;
> +		int pool, rx_bytes, err, ret;
>  		void *data;
>  
>  		rx_done++;
> @@ -3096,6 +3151,33 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
>  		else
>  			frag_size = bm_pool->frag_size;
>  
> +		if (xdp_prog) {
> +			xdp.data_hard_start = data;
> +			xdp.data = data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM;
> +			xdp.data_end = xdp.data + rx_bytes;
> +			xdp.frame_sz = PAGE_SIZE;
> +
> +			if (bm_pool->pkt_size == MVPP2_BM_SHORT_PKT_SIZE)
> +				xdp.rxq = &rxq->xdp_rxq_short;
> +			else
> +				xdp.rxq = &rxq->xdp_rxq_long;
> +
> +			xdp_set_data_meta_invalid(&xdp);
> +
> +			ret = mvpp2_run_xdp(port, rxq, xdp_prog, &xdp, pp);
> +
> +			if (ret) {
> +				xdp_ret |= ret;
> +				err = mvpp2_rx_refill(port, bm_pool, pp, pool);
> +				if (err) {
> +					netdev_err(port->dev, "failed to refill BM pools\n");
> +					goto err_drop_frame;
> +				}
> +
> +				continue;
> +			}
> +		}
> +
>  		skb = build_skb(data, frag_size);
>  		if (!skb) {
>  			netdev_warn(port->dev, "skb build failed\n");
> @@ -3118,7 +3200,7 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
>  		rcvd_pkts++;
>  		rcvd_bytes += rx_bytes;
>  
> -		skb_reserve(skb, MVPP2_MH_SIZE + NET_SKB_PAD);
> +		skb_reserve(skb, MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM);
>  		skb_put(skb, rx_bytes);
>  		skb->protocol = eth_type_trans(skb, dev);
>  		mvpp2_rx_csum(port, rx_status, skb);
> @@ -3133,6 +3215,8 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
>  		mvpp2_bm_pool_put(port, pool, dma_addr, phys_addr);
>  	}
>  
> +	rcu_read_unlock();
> +
>  	if (rcvd_pkts) {
>  		struct mvpp2_pcpu_stats *stats = this_cpu_ptr(port->stats);
>  
> @@ -3608,6 +3692,8 @@ static void mvpp2_start_dev(struct mvpp2_port *port)
>  	}
>  
>  	netif_tx_start_all_queues(port->dev);
> +
> +	clear_bit(0, &port->state);
>  }
>  
>  /* Set hw internals when stopping port */
> @@ -3615,6 +3701,8 @@ static void mvpp2_stop_dev(struct mvpp2_port *port)
>  {
>  	int i;
>  
> +	set_bit(0, &port->state);
> +
>  	/* Disable interrupts on all threads */
>  	mvpp2_interrupts_disable(port);
>  
> @@ -4021,6 +4109,10 @@ static int mvpp2_change_mtu(struct net_device *dev, int mtu)
>  	}
>  
>  	if (MVPP2_RX_PKT_SIZE(mtu) > MVPP2_BM_LONG_PKT_SIZE) {
> +		if (port->xdp_prog) {
> +			netdev_err(dev, "Jumbo frames are not supported with XDP\n");

Does it make sense to switch to NL_SET_ERR_MSG_MOD() here, so the user can get
immediate feedback?

> +			return -EINVAL;
> +		}
>  		if (priv->percpu_pools) {
>  			netdev_warn(dev, "mtu %d too high, switching to shared buffers", mtu);
>  			mvpp2_bm_switch_buffers(priv, false);
> @@ -4159,6 +4251,73 @@ static int mvpp2_set_features(struct net_device *dev,
>  	return 0;
>  }
>  
> +static int mvpp2_xdp_setup(struct mvpp2_port *port, struct netdev_bpf *bpf)
> +{
> +	struct bpf_prog *prog = bpf->prog, *old_prog;
> +	bool running = netif_running(port->dev);
> +	bool reset = !prog != !port->xdp_prog;
> +
> +	if (port->dev->mtu > ETH_DATA_LEN) {
> +		netdev_err(port->dev, "Jumbo frames are not supported by XDP, current MTU %d.\n",
> +			   port->dev->mtu);

ditto

> +		return -EOPNOTSUPP;
> +	}
> +
> +	if (!port->priv->percpu_pools) {
> +		netdev_err(port->dev, "Per CPU Pools required for XDP");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	/* device is up and bpf is added/removed, must setup the RX queues */
> +	if (running && reset) {
> +		mvpp2_stop_dev(port);
> +		mvpp2_cleanup_rxqs(port);
> +		mvpp2_cleanup_txqs(port);
> +	}
> +
> +	old_prog = xchg(&port->xdp_prog, prog);
> +	if (old_prog)
> +		bpf_prog_put(old_prog);
> +
> +	/* bpf is just replaced, RXQ and MTU are already setup */
> +	if (!reset)
> +		return 0;
> +
> +	/* device was up, restore the link */
> +	if (running) {
> +		int ret = mvpp2_setup_rxqs(port);
> +
> +		if (ret) {
> +			netdev_err(port->dev, "mvpp2_setup_rxqs failed\n");
> +			return ret;
> +		}
> +		ret = mvpp2_setup_txqs(port);
> +		if (ret) {
> +			netdev_err(port->dev, "mvpp2_setup_txqs failed\n");
> +			return ret;
> +		}
> +
> +		mvpp2_start_dev(port);
> +	}
> +
> +	return 0;
> +}
> +
> +static int mvpp2_xdp(struct net_device *dev, struct netdev_bpf *xdp)
> +{
> +	struct mvpp2_port *port = netdev_priv(dev);
> +
> +	switch (xdp->command) {
> +	case XDP_SETUP_PROG:
> +		return mvpp2_xdp_setup(port, xdp);
> +	case XDP_QUERY_PROG:
> +		xdp->prog_id = port->xdp_prog ? port->xdp_prog->aux->id : 0;
> +		return 0;
> +	default:
> +		return -EINVAL;
> +	}
> +}
> +
>  /* Ethtool methods */
>  
>  static int mvpp2_ethtool_nway_reset(struct net_device *dev)
> @@ -4509,6 +4668,7 @@ static const struct net_device_ops mvpp2_netdev_ops = {
>  	.ndo_vlan_rx_add_vid	= mvpp2_vlan_rx_add_vid,
>  	.ndo_vlan_rx_kill_vid	= mvpp2_vlan_rx_kill_vid,
>  	.ndo_set_features	= mvpp2_set_features,
> +	.ndo_bpf		= mvpp2_xdp,
>  };
>  
>  static const struct ethtool_ops mvpp2_eth_tool_ops = {
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread
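[Editor's note: the verdict dispatch in mvpp2_run_xdp() above relies on C switch fallthrough, so an unknown action falls into XDP_ABORTED and then into XDP_DROP, recycling the page back to the pool. The following is a minimal userspace sketch of that dispatch logic only; every name below (run_xdp_verdict, the enum values, the recycled flag) is a hypothetical stand-in, not the driver's real code.]

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical return codes mirroring the driver's MVPP2_XDP_* values */
enum xdp_verdict { XDP_PASS_V, XDP_REDIR_V, XDP_DROPPED_V };

/* Simplified model of the XDP action codes a BPF program can return */
enum xdp_action { XDP_ABORTED = 0, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT };

/* Model of the switch in mvpp2_run_xdp(): unknown actions fall through
 * to ABORTED, which falls through to DROP, so anything unexpected is
 * dropped and the buffer returned to the pool (modeled by *recycled). */
static enum xdp_verdict run_xdp_verdict(enum xdp_action act, bool redirect_ok,
					bool *recycled)
{
	*recycled = false;
	switch (act) {
	case XDP_PASS:
		return XDP_PASS_V;
	case XDP_REDIRECT:
		if (!redirect_ok) {
			*recycled = true;	/* failed redirect returns the page */
			return XDP_DROPPED_V;
		}
		return XDP_REDIR_V;
	default:	/* invalid action: the driver warns here */
	case XDP_ABORTED:	/* the driver traces an exception here */
	case XDP_DROP:
		*recycled = true;	/* page goes back to the page_pool */
		return XDP_DROPPED_V;
	}
}
```

Note how XDP_TX (unhandled in this patch) lands in the default arm and is dropped rather than silently ignored.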

* Re: [PATCH net-next 3/4] mvpp2: add basic XDP support
  2020-07-02  8:08   ` ilias.apalodimas
@ 2020-07-02  9:09     ` Maciej Fijalkowski
  2020-07-02 10:19       ` Matteo Croce
  0 siblings, 1 reply; 11+ messages in thread
From: Maciej Fijalkowski @ 2020-07-02  9:09 UTC (permalink / raw)
  To: ilias.apalodimas
  Cc: Matteo Croce, netdev, linux-kernel, bpf, Sven Auhagen,
	Lorenzo Bianconi, David S. Miller, Jesper Dangaard Brouer,
	Stefan Chulski, Marcin Wojtas, maxime.chevallier, antoine.tenart,
	thomas.petazzoni

On Thu, Jul 02, 2020 at 11:08:19AM +0300, ilias.apalodimas@linaro.org wrote:
> On Tue, Jun 30, 2020 at 08:09:29PM +0200, Matteo Croce wrote:
> > From: Matteo Croce <mcroce@microsoft.com>
> > 
> > Add XDP native support.
> > For now only XDP_DROP, XDP_PASS and XDP_REDIRECT
> > verdicts are supported.
> > 
> > Co-developed-by: Sven Auhagen <sven.auhagen@voleatech.de>
> > Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
> > Signed-off-by: Matteo Croce <mcroce@microsoft.com>
> > ---
> 
> [...]
> 
> >  }
> >  
> > +static int
> > +mvpp2_run_xdp(struct mvpp2_port *port, struct mvpp2_rx_queue *rxq,
> > +	      struct bpf_prog *prog, struct xdp_buff *xdp,
> > +	      struct page_pool *pp)
> > +{
> > +	unsigned int len, sync, err;
> > +	struct page *page;
> > +	u32 ret, act;
> > +
> > +	len = xdp->data_end - xdp->data_hard_start - MVPP2_SKB_HEADROOM;
> > +	act = bpf_prog_run_xdp(prog, xdp);
> > +
> > +	/* Due to xdp_adjust_tail, the DMA sync for_device must cover the max len the CPU touched */
> > +	sync = xdp->data_end - xdp->data_hard_start - MVPP2_SKB_HEADROOM;
> > +	sync = max(sync, len);
> > +
> > +	switch (act) {
> > +	case XDP_PASS:
> > +		ret = MVPP2_XDP_PASS;
> > +		break;
> > +	case XDP_REDIRECT:
> > +		err = xdp_do_redirect(port->dev, xdp, prog);
> > +		if (unlikely(err)) {
> > +			ret = MVPP2_XDP_DROPPED;
> > +			page = virt_to_head_page(xdp->data);
> > +			page_pool_put_page(pp, page, sync, true);
> > +		} else {
> > +			ret = MVPP2_XDP_REDIR;
> > +		}
> > +		break;
> > +	default:
> > +		bpf_warn_invalid_xdp_action(act);
> > +		fallthrough;
> > +	case XDP_ABORTED:
> > +		trace_xdp_exception(port->dev, prog, act);
> > +		fallthrough;
> > +	case XDP_DROP:
> > +		page = virt_to_head_page(xdp->data);
> > +		page_pool_put_page(pp, page, sync, true);
> > +		ret = MVPP2_XDP_DROPPED;
> > +		break;
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> >  /* Main rx processing */
> >  static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
> >  		    int rx_todo, struct mvpp2_rx_queue *rxq)
> >  {
> >  	struct net_device *dev = port->dev;
> > +	struct bpf_prog *xdp_prog;
> > +	struct xdp_buff xdp;
> >  	int rx_received;
> >  	int rx_done = 0;
> > +	u32 xdp_ret = 0;
> >  	u32 rcvd_pkts = 0;
> >  	u32 rcvd_bytes = 0;
> >  
> > +	rcu_read_lock();
> > +
> > +	xdp_prog = READ_ONCE(port->xdp_prog);
> > +
> >  	/* Get number of received packets and clamp the to-do */
> >  	rx_received = mvpp2_rxq_received(port, rxq->id);
> >  	if (rx_todo > rx_received)
> > @@ -3060,7 +3115,7 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
> >  		dma_addr_t dma_addr;
> >  		phys_addr_t phys_addr;
> >  		u32 rx_status;
> > -		int pool, rx_bytes, err;
> > +		int pool, rx_bytes, err, ret;
> >  		void *data;
> >  
> >  		rx_done++;
> > @@ -3096,6 +3151,33 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
> >  		else
> >  			frag_size = bm_pool->frag_size;
> >  
> > +		if (xdp_prog) {
> > +			xdp.data_hard_start = data;
> > +			xdp.data = data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM;
> > +			xdp.data_end = xdp.data + rx_bytes;
> > +			xdp.frame_sz = PAGE_SIZE;
> > +
> > +			if (bm_pool->pkt_size == MVPP2_BM_SHORT_PKT_SIZE)
> > +				xdp.rxq = &rxq->xdp_rxq_short;
> > +			else
> > +				xdp.rxq = &rxq->xdp_rxq_long;
> > +
> > +			xdp_set_data_meta_invalid(&xdp);
> > +
> > +			ret = mvpp2_run_xdp(port, rxq, xdp_prog, &xdp, pp);
> > +
> > +			if (ret) {
> > +				xdp_ret |= ret;
> > +				err = mvpp2_rx_refill(port, bm_pool, pp, pool);
> > +				if (err) {
> > +					netdev_err(port->dev, "failed to refill BM pools\n");
> > +					goto err_drop_frame;
> > +				}
> > +
> > +				continue;
> > +			}
> > +		}
> > +
> >  		skb = build_skb(data, frag_size);
> >  		if (!skb) {
> >  			netdev_warn(port->dev, "skb build failed\n");
> > @@ -3118,7 +3200,7 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
> >  		rcvd_pkts++;
> >  		rcvd_bytes += rx_bytes;
> >  
> > -		skb_reserve(skb, MVPP2_MH_SIZE + NET_SKB_PAD);
> > +		skb_reserve(skb, MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM);
> >  		skb_put(skb, rx_bytes);
> >  		skb->protocol = eth_type_trans(skb, dev);
> >  		mvpp2_rx_csum(port, rx_status, skb);
> > @@ -3133,6 +3215,8 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
> >  		mvpp2_bm_pool_put(port, pool, dma_addr, phys_addr);
> >  	}
> >  
> > +	rcu_read_unlock();
> > +
> >  	if (rcvd_pkts) {
> >  		struct mvpp2_pcpu_stats *stats = this_cpu_ptr(port->stats);
> >  
> > @@ -3608,6 +3692,8 @@ static void mvpp2_start_dev(struct mvpp2_port *port)
> >  	}
> >  
> >  	netif_tx_start_all_queues(port->dev);
> > +
> > +	clear_bit(0, &port->state);
> >  }
> >  
> >  /* Set hw internals when stopping port */
> > @@ -3615,6 +3701,8 @@ static void mvpp2_stop_dev(struct mvpp2_port *port)
> >  {
> >  	int i;
> >  
> > +	set_bit(0, &port->state);
> > +
> >  	/* Disable interrupts on all threads */
> >  	mvpp2_interrupts_disable(port);
> >  
> > @@ -4021,6 +4109,10 @@ static int mvpp2_change_mtu(struct net_device *dev, int mtu)
> >  	}
> >  
> >  	if (MVPP2_RX_PKT_SIZE(mtu) > MVPP2_BM_LONG_PKT_SIZE) {
> > +		if (port->xdp_prog) {
> > +			netdev_err(dev, "Jumbo frames are not supported with XDP\n");
> 
> Does it make sense to switch to NL_SET_ERR_MSG_MOD() here, so the user can get
> immediate feedback?

ndo_change_mtu does not provide netlink's extack, so that's not possible
here AFAIK.

> 
> > +			return -EINVAL;
> > +		}
> >  		if (priv->percpu_pools) {
> >  			netdev_warn(dev, "mtu %d too high, switching to shared buffers", mtu);
> >  			mvpp2_bm_switch_buffers(priv, false);
> > @@ -4159,6 +4251,73 @@ static int mvpp2_set_features(struct net_device *dev,
> >  	return 0;
> >  }
> >  
> > +static int mvpp2_xdp_setup(struct mvpp2_port *port, struct netdev_bpf *bpf)
> > +{
> > +	struct bpf_prog *prog = bpf->prog, *old_prog;
> > +	bool running = netif_running(port->dev);
> > +	bool reset = !prog != !port->xdp_prog;
> > +
> > +	if (port->dev->mtu > ETH_DATA_LEN) {
> > +		netdev_err(port->dev, "Jumbo frames are not supported by XDP, current MTU %d.\n",
> > +			   port->dev->mtu);
> 
> ditto

Here I agree and for every other netdev_err within mvpp2_xdp_setup().

> 
> > +		return -EOPNOTSUPP;
> > +	}
> > +
> > +	if (!port->priv->percpu_pools) {
> > +		netdev_err(port->dev, "Per CPU Pools required for XDP");
> > +		return -EOPNOTSUPP;
> > +	}
> > +
> > +	/* device is up and bpf is added/removed, must setup the RX queues */
> > +	if (running && reset) {
> > +		mvpp2_stop_dev(port);
> > +		mvpp2_cleanup_rxqs(port);
> > +		mvpp2_cleanup_txqs(port);
> > +	}
> > +
> > +	old_prog = xchg(&port->xdp_prog, prog);
> > +	if (old_prog)
> > +		bpf_prog_put(old_prog);
> > +
> > +	/* bpf is just replaced, RXQ and MTU are already setup */
> > +	if (!reset)
> > +		return 0;
> > +
> > +	/* device was up, restore the link */
> > +	if (running) {
> > +		int ret = mvpp2_setup_rxqs(port);
> > +
> > +		if (ret) {
> > +			netdev_err(port->dev, "mvpp2_setup_rxqs failed\n");
> > +			return ret;
> > +		}
> > +		ret = mvpp2_setup_txqs(port);
> > +		if (ret) {
> > +			netdev_err(port->dev, "mvpp2_setup_txqs failed\n");
> > +			return ret;
> > +		}
> > +
> > +		mvpp2_start_dev(port);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int mvpp2_xdp(struct net_device *dev, struct netdev_bpf *xdp)
> > +{
> > +	struct mvpp2_port *port = netdev_priv(dev);
> > +
> > +	switch (xdp->command) {
> > +	case XDP_SETUP_PROG:
> > +		return mvpp2_xdp_setup(port, xdp);
> > +	case XDP_QUERY_PROG:
> > +		xdp->prog_id = port->xdp_prog ? port->xdp_prog->aux->id : 0;
> > +		return 0;
> > +	default:
> > +		return -EINVAL;
> > +	}
> > +}
> > +
> >  /* Ethtool methods */
> >  
> >  static int mvpp2_ethtool_nway_reset(struct net_device *dev)
> > @@ -4509,6 +4668,7 @@ static const struct net_device_ops mvpp2_netdev_ops = {
> >  	.ndo_vlan_rx_add_vid	= mvpp2_vlan_rx_add_vid,
> >  	.ndo_vlan_rx_kill_vid	= mvpp2_vlan_rx_kill_vid,
> >  	.ndo_set_features	= mvpp2_set_features,
> > +	.ndo_bpf		= mvpp2_xdp,
> >  };
> >  
> >  static const struct ethtool_ops mvpp2_eth_tool_ops = {
> > -- 
> > 2.26.2
> > 
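[Editor's note: the program handoff above pairs READ_ONCE() on the RX path with xchg() in mvpp2_xdp_setup(): the poller snapshots the prog pointer once per NAPI run, and the updater swaps it in a single atomic step, then releases the old program. A userspace sketch of just that handoff using C11 atomics follows; names are hypothetical, and the model deliberately ignores the RCU and device-stop synchronization the real driver also relies on.]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Model of port->xdp_prog; int* stands in for struct bpf_prog* */
static _Atomic(int *) xdp_prog = NULL;

/* Like READ_ONCE(port->xdp_prog): one load, reused for the whole batch */
static int *rx_poll_snapshot(void)
{
	return atomic_load_explicit(&xdp_prog, memory_order_acquire);
}

/* Like xchg(&port->xdp_prog, prog): install the new program and return
 * the old one so the caller can drop its reference (bpf_prog_put). */
static int *install_prog(int *prog)
{
	return atomic_exchange_explicit(&xdp_prog, prog,
					memory_order_acq_rel);
}
```

The single atomic exchange guarantees no program reference is lost even if two installs race, which is why the driver uses xchg() rather than a plain read-then-write.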


* Re: [PATCH net-next 2/4] mvpp2: use page_pool allocator
  2020-07-02  7:31   ` ilias.apalodimas
@ 2020-07-02  9:42     ` Matteo Croce
  0 siblings, 0 replies; 11+ messages in thread
From: Matteo Croce @ 2020-07-02  9:42 UTC (permalink / raw)
  To: ilias.apalodimas
  Cc: netdev, linux-kernel, bpf, Sven Auhagen, Lorenzo Bianconi,
	David S. Miller, Jesper Dangaard Brouer, Stefan Chulski,
	Marcin Wojtas, maxime.chevallier, antoine.tenart,
	thomas.petazzoni

On Thu, Jul 2, 2020 at 9:31 AM <ilias.apalodimas@linaro.org> wrote:
>
> Hi Matteo,
>
> Thanks for working on this!
>

:)

> On Tue, Jun 30, 2020 at 08:09:28PM +0200, Matteo Croce wrote:
> > From: Matteo Croce <mcroce@microsoft.com>
> > -static void *mvpp2_frag_alloc(const struct mvpp2_bm_pool *pool)
> > +/* Returns a struct page if page_pool is set, otherwise a buffer */
> > +static void *mvpp2_frag_alloc(const struct mvpp2_bm_pool *pool,
> > +                           struct page_pool *page_pool)
> >  {
> > +     if (page_pool)
> > +             return page_pool_alloc_pages(page_pool,
> > +                                          GFP_ATOMIC | __GFP_NOWARN);
>
> page_pool_dev_alloc_pages() can set these flags for you, instead of explicitly
> calling them
>

Ok
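[Editor's note: the suggestion above is the classic convenience-wrapper pattern: page_pool_dev_alloc_pages() bakes in the GFP flags appropriate for driver context so callers need not repeat them. A toy userspace sketch of the same idea; the flag values and every name here are made up for illustration.]

```c
#include <assert.h>
#include <stddef.h>

#define GFP_ATOMIC_M	0x1u	/* stand-ins, not the real kernel values */
#define GFP_NOWARN_M	0x2u

static unsigned int last_flags;	/* records what the core allocator saw */

/* Model of the allocation entry point that takes explicit flags */
static void *pool_alloc_pages(unsigned int gfp)
{
	static int page;	/* dummy "page" with static storage */

	last_flags = gfp;
	return &page;
}

/* Convenience wrapper in the style of page_pool_dev_alloc_pages():
 * driver-context callers get atomic, no-warn semantics for free. */
static void *pool_dev_alloc_pages(void)
{
	return pool_alloc_pages(GFP_ATOMIC_M | GFP_NOWARN_M);
}
```

Using the wrapper keeps every RX-refill call site short and makes it impossible to forget a flag at one of them.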

> >
> > +     if (priv->percpu_pools) {
> > +             err = xdp_rxq_info_reg(&rxq->xdp_rxq_short, port->dev, rxq->id);
> > +             if (err < 0)
> > +                     goto err_free_dma;
> > +
> > +             err = xdp_rxq_info_reg(&rxq->xdp_rxq_long, port->dev, rxq->id);
> > +             if (err < 0)
> > +                     goto err_unregister_rxq_short;
> > +
> > +             /* Every RXQ has a pool for short and another for long packets */
> > +             err = xdp_rxq_info_reg_mem_model(&rxq->xdp_rxq_short,
> > +                                              MEM_TYPE_PAGE_POOL,
> > +                                              priv->page_pool[rxq->logic_rxq]);
> > +             if (err < 0)
> > +                     goto err_unregister_rxq_short;
> > +
> > +             err = xdp_rxq_info_reg_mem_model(&rxq->xdp_rxq_long,
> > +                                              MEM_TYPE_PAGE_POOL,
> > +                                              priv->page_pool[rxq->logic_rxq +
> > +                                                              port->nrxqs]);
> > +             if (err < 0)
> > +                     goto err_unregister_rxq_long;
>
> Since mvpp2_rxq_init() will return an error, shouldn't we unregister the short
> memory pool as well?
>

Ok, I'll add another label like:

err_unregister_mem_rxq_short:
        xdp_rxq_info_unreg_mem_model(&rxq->xdp_rxq_short);

-- 
per aspera ad upstream
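[Editor's note: the label Matteo agrees to add completes a standard goto-unwind ladder: each error label undoes, in reverse order, exactly the registrations that succeeded before the failure. A hedged userspace model follows; the names are hypothetical, with four numbered resources standing in for the two rxq_info registrations and the two memory-model registrations.]

```c
#include <assert.h>
#include <stdbool.h>

/* Which of the four registrations are live: 0 = short rxq info,
 * 1 = long rxq info, 2 = short mem model, 3 = long mem model. */
static bool registered[4];

/* Register resource i, failing if i == fail_at (fail_at < 0: no failure) */
static int do_register(int i, int fail_at)
{
	if (i == fail_at)
		return -1;
	registered[i] = true;
	return 0;
}

static void do_unregister(int i)
{
	registered[i] = false;
}

/* Sketch of the unwind ladder: a failure at step N jumps to the label
 * that tears down step N-1, and control falls through the earlier
 * labels so every prior registration is undone in reverse order. */
static int rxq_init_model(int fail_at)
{
	if (do_register(0, fail_at))
		goto err_out;
	if (do_register(1, fail_at))
		goto err_unreg_short;
	if (do_register(2, fail_at))
		goto err_unreg_long;
	if (do_register(3, fail_at))
		goto err_unreg_mem_short;	/* the label added above */
	return 0;

err_unreg_mem_short:
	do_unregister(2);
err_unreg_long:
	do_unregister(1);
err_unreg_short:
	do_unregister(0);
err_out:
	return -1;
}
```

Without the extra label, a failure at step 3 would leak registration 2, which is exactly the gap Ilias spotted in the patch.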


* Re: [PATCH net-next 3/4] mvpp2: add basic XDP support
  2020-07-02  9:09     ` Maciej Fijalkowski
@ 2020-07-02 10:19       ` Matteo Croce
  0 siblings, 0 replies; 11+ messages in thread
From: Matteo Croce @ 2020-07-02 10:19 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: ilias.apalodimas, netdev, linux-kernel, bpf, Sven Auhagen,
	Lorenzo Bianconi, David S. Miller, Jesper Dangaard Brouer,
	Stefan Chulski, Marcin Wojtas, maxime.chevallier, antoine.tenart,
	thomas.petazzoni

On Thu, Jul 2, 2020 at 11:14 AM Maciej Fijalkowski
<maciej.fijalkowski@intel.com> wrote:
> > > +   if (port->dev->mtu > ETH_DATA_LEN) {
> > > +           netdev_err(port->dev, "Jumbo frames are not supported by XDP, current MTU %d.\n",
> > > +                      port->dev->mtu);
> >
> > ditto
>
> Here I agree and for every other netdev_err within mvpp2_xdp_setup().
>

Nice idea, I'll add extack error reporting where possible.

-- 
per aspera ad upstream



Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-30 18:09 [PATCH net-next 0/4] mvpp2: XDP support Matteo Croce
2020-06-30 18:09 ` [PATCH net-next 1/4] mvpp2: refactor BM pool init percpu code Matteo Croce
2020-06-30 18:09 ` [PATCH net-next 2/4] mvpp2: use page_pool allocator Matteo Croce
2020-07-02  7:31   ` ilias.apalodimas
2020-07-02  9:42     ` Matteo Croce
2020-06-30 18:09 ` [PATCH net-next 3/4] mvpp2: add basic XDP support Matteo Croce
2020-07-02  8:08   ` ilias.apalodimas
2020-07-02  9:09     ` Maciej Fijalkowski
2020-07-02 10:19       ` Matteo Croce
2020-06-30 18:09 ` [PATCH net-next 4/4] mvpp2: XDP TX support Matteo Croce
2020-07-01 19:18 ` [PATCH net-next 0/4] mvpp2: XDP support Jesper Dangaard Brouer
