[PATCH v6 net-next 0/5] net: ethernet: ti: cpsw: Add XDP support

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v6 net-next 0/5] net: ethernet: ti: cpsw: Add XDP support
@ 2019-07-03 10:18 Ivan Khoronzhuk
  2019-07-03 10:18 ` [PATCH v6 net-next 1/5] xdp: allow same allocator usage Ivan Khoronzhuk
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Ivan Khoronzhuk @ 2019-07-03 10:18 UTC (permalink / raw)
  To: grygorii.strashko, hawk, davem
  Cc: ast, linux-kernel, linux-omap, xdp-newbies, ilias.apalodimas,
	netdev, daniel, jakub.kicinski, john.fastabend, Ivan Khoronzhuk

This patchset adds XDP support for TI cpsw driver and base it on
page_pool allocator. It was verified on af_xdp socket drop,
af_xdp l2f, ebpf XDP_DROP, XDP_REDIRECT, XDP_PASS, XDP_TX.

It was verified with following configs enabled:
CONFIG_JIT=y
CONFIG_BPFILTER=y
CONFIG_BPF_SYSCALL=y
CONFIG_XDP_SOCKETS=y
CONFIG_BPF_EVENTS=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_JIT=y
CONFIG_CGROUP_BPF=y

Link on previous v5:
https://lkml.org/lkml/2019/6/30/89

Also regular tests with iperf2 were done in order to verify impact on
regular netstack performance, compared with base commit:
https://pastebin.com/JSMT0iZ4

v5..v6:
- do changes that is rx_dev while redirect/flush cycle is kept the same
- dropped net: ethernet: ti: davinci_cpdma: return handler status
- other changes desc in patches

v4..v5:
- added two plreliminary patches:
  net: ethernet: ti: davinci_cpdma: allow desc split while down
  net: ethernet: ti: cpsw_ethtool: allow res split while down
- added xdp alocator refcnt on xdp level, avoiding page pool refcnt
- moved flush status as separate argument for cpdma_chan_process
- reworked cpsw code according to last changes to allocator
- added missed statistic counter

v3..v4:
- added page pool user counter
- use same pool for ndevs in dual mac
- restructured page pool create/destroy according to the last changes in API

v2..v3:
- each rxq and ndev has its own page pool

v1..v2:
- combined xdp_xmit functions
- used page allocation w/o refcnt juggle
- unmapped page for skb netstack
- moved rxq/page pool allocation to open/close pair
- added several preliminary patches:
  net: page_pool: add helper function to retrieve dma addresses
  net: page_pool: add helper function to unmap dma addresses
  net: ethernet: ti: cpsw: use cpsw as drv data
  net: ethernet: ti: cpsw_ethtool: simplify slave loops

Ivan Khoronzhuk (5):
  xdp: allow same allocator usage
  net: ethernet: ti: davinci_cpdma: add dma mapped submit
  net: ethernet: ti: davinci_cpdma: allow desc split while down
  net: ethernet: ti: cpsw_ethtool: allow res split while down
  net: ethernet: ti: cpsw: add XDP support

 drivers/net/ethernet/ti/Kconfig         |   1 +
 drivers/net/ethernet/ti/cpsw.c          | 485 +++++++++++++++++++++---
 drivers/net/ethernet/ti/cpsw_ethtool.c  |  76 +++-
 drivers/net/ethernet/ti/cpsw_priv.h     |   7 +
 drivers/net/ethernet/ti/davinci_cpdma.c |  99 ++++-
 drivers/net/ethernet/ti/davinci_cpdma.h |   7 +-
 include/net/xdp_priv.h                  |   2 +
 net/core/xdp.c                          |  55 +++
 8 files changed, 656 insertions(+), 76 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v6 net-next 1/5] xdp: allow same allocator usage
  2019-07-03 10:18 [PATCH v6 net-next 0/5] net: ethernet: ti: cpsw: Add XDP support Ivan Khoronzhuk
@ 2019-07-03 10:18 ` Ivan Khoronzhuk
  2019-07-03 17:40   ` Jesper Dangaard Brouer
  2019-07-03 10:19 ` [PATCH v6 net-next 2/5] net: ethernet: ti: davinci_cpdma: add dma mapped submit Ivan Khoronzhuk
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Ivan Khoronzhuk @ 2019-07-03 10:18 UTC (permalink / raw)
  To: grygorii.strashko, hawk, davem
  Cc: ast, linux-kernel, linux-omap, xdp-newbies, ilias.apalodimas,
	netdev, daniel, jakub.kicinski, john.fastabend, Ivan Khoronzhuk

First of all, it is an absolute requirement that each RX-queue have
their own page_pool object/allocator. And this change is intendant
to handle special case, where a single RX-queue can receive packets
from two different net_devices.

In order to protect against using same allocator for 2 different rx
queues, add queue_index to xdp_mem_allocator to catch the obvious
mistake where queue_index mismatch, as proposed by Jesper Dangaard
Brouer.

Adding this on xdp allocator level allows drivers with such dependency
change the allocators w/o modifications.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 include/net/xdp_priv.h |  2 ++
 net/core/xdp.c         | 55 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/include/net/xdp_priv.h b/include/net/xdp_priv.h
index 6a8cba6ea79a..9858a4057842 100644
--- a/include/net/xdp_priv.h
+++ b/include/net/xdp_priv.h
@@ -18,6 +18,8 @@ struct xdp_mem_allocator {
 	struct rcu_head rcu;
 	struct delayed_work defer_wq;
 	unsigned long defer_warn;
+	unsigned long refcnt;
+	u32 queue_index;
 };
 
 #endif /* __LINUX_NET_XDP_PRIV_H__ */
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 829377cc83db..4f0ddbb3717a 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -98,6 +98,18 @@ static bool __mem_id_disconnect(int id, bool force)
 		WARN(1, "Request remove non-existing id(%d), driver bug?", id);
 		return true;
 	}
+
+	/* to avoid calling hash lookup twice, decrement refcnt here till it
+	 * reaches zero, then it can be called from workqueue afterwards.
+	 */
+	if (xa->refcnt)
+		xa->refcnt--;
+
+	if (xa->refcnt) {
+		mutex_unlock(&mem_id_lock);
+		return true;
+	}
+
 	xa->disconnect_cnt++;
 
 	/* Detects in-flight packet-pages for page_pool */
@@ -312,6 +324,33 @@ static bool __is_supported_mem_type(enum xdp_mem_type type)
 	return true;
 }
 
+static struct xdp_mem_allocator *xdp_allocator_find(void *allocator)
+{
+	struct xdp_mem_allocator *xae, *xa = NULL;
+	struct rhashtable_iter iter;
+
+	if (!allocator)
+		return xa;
+
+	rhashtable_walk_enter(mem_id_ht, &iter);
+	do {
+		rhashtable_walk_start(&iter);
+
+		while ((xae = rhashtable_walk_next(&iter)) && !IS_ERR(xae)) {
+			if (xae->allocator == allocator) {
+				xa = xae;
+				break;
+			}
+		}
+
+		rhashtable_walk_stop(&iter);
+
+	} while (xae == ERR_PTR(-EAGAIN));
+	rhashtable_walk_exit(&iter);
+
+	return xa;
+}
+
 int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
 			       enum xdp_mem_type type, void *allocator)
 {
@@ -347,6 +386,20 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
 		}
 	}
 
+	mutex_lock(&mem_id_lock);
+	xdp_alloc = xdp_allocator_find(allocator);
+	if (xdp_alloc) {
+		/* One allocator per queue is supposed only */
+		if (xdp_alloc->queue_index != xdp_rxq->queue_index)
+			return -EINVAL;
+
+		xdp_rxq->mem.id = xdp_alloc->mem.id;
+		xdp_alloc->refcnt++;
+		mutex_unlock(&mem_id_lock);
+		return 0;
+	}
+	mutex_unlock(&mem_id_lock);
+
 	xdp_alloc = kzalloc(sizeof(*xdp_alloc), gfp);
 	if (!xdp_alloc)
 		return -ENOMEM;
@@ -360,6 +413,8 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
 	xdp_rxq->mem.id = id;
 	xdp_alloc->mem  = xdp_rxq->mem;
 	xdp_alloc->allocator = allocator;
+	xdp_alloc->refcnt = 1;
+	xdp_alloc->queue_index = xdp_rxq->queue_index;
 
 	/* Insert allocator into ID lookup table */
 	ptr = rhashtable_insert_slow(mem_id_ht, &id, &xdp_alloc->node);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 net-next 2/5] net: ethernet: ti: davinci_cpdma: add dma mapped submit
  2019-07-03 10:18 [PATCH v6 net-next 0/5] net: ethernet: ti: cpsw: Add XDP support Ivan Khoronzhuk
  2019-07-03 10:18 ` [PATCH v6 net-next 1/5] xdp: allow same allocator usage Ivan Khoronzhuk
@ 2019-07-03 10:19 ` Ivan Khoronzhuk
  2019-07-05 19:32   ` kbuild test robot
  2019-07-03 10:19 ` [PATCH v6 net-next 3/5] net: ethernet: ti: davinci_cpdma: allow desc split while down Ivan Khoronzhuk
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Ivan Khoronzhuk @ 2019-07-03 10:19 UTC (permalink / raw)
  To: grygorii.strashko, hawk, davem
  Cc: ast, linux-kernel, linux-omap, xdp-newbies, ilias.apalodimas,
	netdev, daniel, jakub.kicinski, john.fastabend, Ivan Khoronzhuk

In case if dma mapped packet needs to be sent, like with XDP
page pool, the "mapped" submit can be used. This patch adds dma
mapped submit based on regular one.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 drivers/net/ethernet/ti/davinci_cpdma.c | 89 ++++++++++++++++++++++---
 drivers/net/ethernet/ti/davinci_cpdma.h |  4 ++
 2 files changed, 83 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c b/drivers/net/ethernet/ti/davinci_cpdma.c
index 5cf1758d425b..8da46394c0e7 100644
--- a/drivers/net/ethernet/ti/davinci_cpdma.c
+++ b/drivers/net/ethernet/ti/davinci_cpdma.c
@@ -139,6 +139,7 @@ struct submit_info {
 	int directed;
 	void *token;
 	void *data;
+	int flags;
 	int len;
 };
 
@@ -184,6 +185,8 @@ static struct cpdma_control_info controls[] = {
 				 (directed << CPDMA_TO_PORT_SHIFT));	\
 	} while (0)
 
+#define CPDMA_DMA_EXT_MAP		BIT(16)
+
 static void cpdma_desc_pool_destroy(struct cpdma_ctlr *ctlr)
 {
 	struct cpdma_desc_pool *pool = ctlr->pool;
@@ -1015,6 +1018,7 @@ static int cpdma_chan_submit_si(struct submit_info *si)
 	struct cpdma_chan		*chan = si->chan;
 	struct cpdma_ctlr		*ctlr = chan->ctlr;
 	int				len = si->len;
+	int				swlen = len;
 	struct cpdma_desc __iomem	*desc;
 	dma_addr_t			buffer;
 	u32				mode;
@@ -1036,16 +1040,22 @@ static int cpdma_chan_submit_si(struct submit_info *si)
 		chan->stats.runt_transmit_buff++;
 	}
 
-	buffer = dma_map_single(ctlr->dev, si->data, len, chan->dir);
-	ret = dma_mapping_error(ctlr->dev, buffer);
-	if (ret) {
-		cpdma_desc_free(ctlr->pool, desc, 1);
-		return -EINVAL;
-	}
-
 	mode = CPDMA_DESC_OWNER | CPDMA_DESC_SOP | CPDMA_DESC_EOP;
 	cpdma_desc_to_port(chan, mode, si->directed);
 
+	if (si->flags & CPDMA_DMA_EXT_MAP) {
+		buffer = (u32)si->data;
+		dma_sync_single_for_device(ctlr->dev, buffer, len, chan->dir);
+		swlen |= CPDMA_DMA_EXT_MAP;
+	} else {
+		buffer = dma_map_single(ctlr->dev, si->data, len, chan->dir);
+		ret = dma_mapping_error(ctlr->dev, buffer);
+		if (ret) {
+			cpdma_desc_free(ctlr->pool, desc, 1);
+			return -EINVAL;
+		}
+	}
+
 	/* Relaxed IO accessors can be used here as there is read barrier
 	 * at the end of write sequence.
 	 */
@@ -1055,7 +1065,7 @@ static int cpdma_chan_submit_si(struct submit_info *si)
 	writel_relaxed(mode | len, &desc->hw_mode);
 	writel_relaxed((uintptr_t)si->token, &desc->sw_token);
 	writel_relaxed(buffer, &desc->sw_buffer);
-	writel_relaxed(len, &desc->sw_len);
+	writel_relaxed(swlen, &desc->sw_len);
 	desc_read(desc, sw_len);
 
 	__cpdma_chan_submit(chan, desc);
@@ -1079,6 +1089,32 @@ int cpdma_chan_idle_submit(struct cpdma_chan *chan, void *token, void *data,
 	si.data = data;
 	si.len = len;
 	si.directed = directed;
+	si.flags = 0;
+
+	spin_lock_irqsave(&chan->lock, flags);
+	if (chan->state == CPDMA_STATE_TEARDOWN) {
+		spin_unlock_irqrestore(&chan->lock, flags);
+		return -EINVAL;
+	}
+
+	ret = cpdma_chan_submit_si(&si);
+	spin_unlock_irqrestore(&chan->lock, flags);
+	return ret;
+}
+
+int cpdma_chan_idle_submit_mapped(struct cpdma_chan *chan, void *token,
+				  dma_addr_t data, int len, int directed)
+{
+	struct submit_info si;
+	unsigned long flags;
+	int ret;
+
+	si.chan = chan;
+	si.token = token;
+	si.data = (void *)(u32)data;
+	si.len = len;
+	si.directed = directed;
+	si.flags = CPDMA_DMA_EXT_MAP;
 
 	spin_lock_irqsave(&chan->lock, flags);
 	if (chan->state == CPDMA_STATE_TEARDOWN) {
@@ -1103,6 +1139,32 @@ int cpdma_chan_submit(struct cpdma_chan *chan, void *token, void *data,
 	si.data = data;
 	si.len = len;
 	si.directed = directed;
+	si.flags = 0;
+
+	spin_lock_irqsave(&chan->lock, flags);
+	if (chan->state != CPDMA_STATE_ACTIVE) {
+		spin_unlock_irqrestore(&chan->lock, flags);
+		return -EINVAL;
+	}
+
+	ret = cpdma_chan_submit_si(&si);
+	spin_unlock_irqrestore(&chan->lock, flags);
+	return ret;
+}
+
+int cpdma_chan_submit_mapped(struct cpdma_chan *chan, void *token,
+			     dma_addr_t data, int len, int directed)
+{
+	struct submit_info si;
+	unsigned long flags;
+	int ret;
+
+	si.chan = chan;
+	si.token = token;
+	si.data = (void *)(u32)data;
+	si.len = len;
+	si.directed = directed;
+	si.flags = CPDMA_DMA_EXT_MAP;
 
 	spin_lock_irqsave(&chan->lock, flags);
 	if (chan->state != CPDMA_STATE_ACTIVE) {
@@ -1140,10 +1202,17 @@ static void __cpdma_chan_free(struct cpdma_chan *chan,
 	uintptr_t			token;
 
 	token      = desc_read(desc, sw_token);
-	buff_dma   = desc_read(desc, sw_buffer);
 	origlen    = desc_read(desc, sw_len);
 
-	dma_unmap_single(ctlr->dev, buff_dma, origlen, chan->dir);
+	buff_dma   = desc_read(desc, sw_buffer);
+	if (origlen & CPDMA_DMA_EXT_MAP) {
+		origlen &= ~CPDMA_DMA_EXT_MAP;
+		dma_sync_single_for_cpu(ctlr->dev, buff_dma, origlen,
+					chan->dir);
+	} else {
+		dma_unmap_single(ctlr->dev, buff_dma, origlen, chan->dir);
+	}
+
 	cpdma_desc_free(pool, desc, 1);
 	(*chan->handler)((void *)token, outlen, status);
 }
diff --git a/drivers/net/ethernet/ti/davinci_cpdma.h b/drivers/net/ethernet/ti/davinci_cpdma.h
index 9343c8c73c1b..0271a20c2e09 100644
--- a/drivers/net/ethernet/ti/davinci_cpdma.h
+++ b/drivers/net/ethernet/ti/davinci_cpdma.h
@@ -77,8 +77,12 @@ int cpdma_chan_stop(struct cpdma_chan *chan);
 
 int cpdma_chan_get_stats(struct cpdma_chan *chan,
 			 struct cpdma_chan_stats *stats);
+int cpdma_chan_submit_mapped(struct cpdma_chan *chan, void *token,
+			     dma_addr_t data, int len, int directed);
 int cpdma_chan_submit(struct cpdma_chan *chan, void *token, void *data,
 		      int len, int directed);
+int cpdma_chan_idle_submit_mapped(struct cpdma_chan *chan, void *token,
+				  dma_addr_t data, int len, int directed);
 int cpdma_chan_idle_submit(struct cpdma_chan *chan, void *token, void *data,
 			   int len, int directed);
 int cpdma_chan_process(struct cpdma_chan *chan, int quota);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 net-next 3/5] net: ethernet: ti: davinci_cpdma: allow desc split while down
  2019-07-03 10:18 [PATCH v6 net-next 0/5] net: ethernet: ti: cpsw: Add XDP support Ivan Khoronzhuk
  2019-07-03 10:18 ` [PATCH v6 net-next 1/5] xdp: allow same allocator usage Ivan Khoronzhuk
  2019-07-03 10:19 ` [PATCH v6 net-next 2/5] net: ethernet: ti: davinci_cpdma: add dma mapped submit Ivan Khoronzhuk
@ 2019-07-03 10:19 ` Ivan Khoronzhuk
  2019-07-03 10:19 ` [PATCH v6 net-next 4/5] net: ethernet: ti: cpsw_ethtool: allow res " Ivan Khoronzhuk
  2019-07-03 10:19 ` [PATCH v6 net-next 5/5] net: ethernet: ti: cpsw: add XDP support Ivan Khoronzhuk
  4 siblings, 0 replies; 17+ messages in thread
From: Ivan Khoronzhuk @ 2019-07-03 10:19 UTC (permalink / raw)
  To: grygorii.strashko, hawk, davem
  Cc: ast, linux-kernel, linux-omap, xdp-newbies, ilias.apalodimas,
	netdev, daniel, jakub.kicinski, john.fastabend, Ivan Khoronzhuk

That's possible to set ring params while interfaces are down. When
interface gets up it uses number of descs to fill rx queue and on
later on changes to create rx pools. Usually, this resplit can happen
after phy is up, but it can be needed before this, so allow it to
happen while setting number of rx descs, when interfaces are down.
Also, if no dependency on intf state, move it to cpdma layer, where
it should be.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 drivers/net/ethernet/ti/cpsw_ethtool.c  |  9 ++++-----
 drivers/net/ethernet/ti/davinci_cpdma.c | 10 +++++++++-
 drivers/net/ethernet/ti/davinci_cpdma.h |  3 +--
 3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw_ethtool.c b/drivers/net/ethernet/ti/cpsw_ethtool.c
index f60dc1dfc443..08d7aaee8299 100644
--- a/drivers/net/ethernet/ti/cpsw_ethtool.c
+++ b/drivers/net/ethernet/ti/cpsw_ethtool.c
@@ -664,15 +664,14 @@ int cpsw_set_ringparam(struct net_device *ndev,
 
 	cpsw_suspend_data_pass(ndev);
 
-	cpdma_set_num_rx_descs(cpsw->dma, ering->rx_pending);
-
-	if (cpsw->usage_count)
-		cpdma_chan_split_pool(cpsw->dma);
+	ret = cpdma_set_num_rx_descs(cpsw->dma, ering->rx_pending);
+	if (ret)
+		goto err;
 
 	ret = cpsw_resume_data_pass(ndev);
 	if (!ret)
 		return 0;
-
+err:
 	dev_err(cpsw->dev, "cannot set ring params, closing device\n");
 	dev_close(ndev);
 	return ret;
diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c b/drivers/net/ethernet/ti/davinci_cpdma.c
index 8da46394c0e7..4167b0b77c8e 100644
--- a/drivers/net/ethernet/ti/davinci_cpdma.c
+++ b/drivers/net/ethernet/ti/davinci_cpdma.c
@@ -1423,8 +1423,16 @@ int cpdma_get_num_tx_descs(struct cpdma_ctlr *ctlr)
 	return ctlr->num_tx_desc;
 }
 
-void cpdma_set_num_rx_descs(struct cpdma_ctlr *ctlr, int num_rx_desc)
+int cpdma_set_num_rx_descs(struct cpdma_ctlr *ctlr, int num_rx_desc)
 {
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&ctlr->lock, flags);
 	ctlr->num_rx_desc = num_rx_desc;
 	ctlr->num_tx_desc = ctlr->pool->num_desc - ctlr->num_rx_desc;
+	ret = cpdma_chan_split_pool(ctlr);
+	spin_unlock_irqrestore(&ctlr->lock, flags);
+
+	return ret;
 }
diff --git a/drivers/net/ethernet/ti/davinci_cpdma.h b/drivers/net/ethernet/ti/davinci_cpdma.h
index 0271a20c2e09..d3cfe234d16a 100644
--- a/drivers/net/ethernet/ti/davinci_cpdma.h
+++ b/drivers/net/ethernet/ti/davinci_cpdma.h
@@ -116,8 +116,7 @@ enum cpdma_control {
 int cpdma_control_get(struct cpdma_ctlr *ctlr, int control);
 int cpdma_control_set(struct cpdma_ctlr *ctlr, int control, int value);
 int cpdma_get_num_rx_descs(struct cpdma_ctlr *ctlr);
-void cpdma_set_num_rx_descs(struct cpdma_ctlr *ctlr, int num_rx_desc);
+int cpdma_set_num_rx_descs(struct cpdma_ctlr *ctlr, int num_rx_desc);
 int cpdma_get_num_tx_descs(struct cpdma_ctlr *ctlr);
-int cpdma_chan_split_pool(struct cpdma_ctlr *ctlr);
 
 #endif
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 net-next 4/5] net: ethernet: ti: cpsw_ethtool: allow res split while down
  2019-07-03 10:18 [PATCH v6 net-next 0/5] net: ethernet: ti: cpsw: Add XDP support Ivan Khoronzhuk
                   ` (2 preceding siblings ...)
  2019-07-03 10:19 ` [PATCH v6 net-next 3/5] net: ethernet: ti: davinci_cpdma: allow desc split while down Ivan Khoronzhuk
@ 2019-07-03 10:19 ` Ivan Khoronzhuk
  2019-07-03 10:19 ` [PATCH v6 net-next 5/5] net: ethernet: ti: cpsw: add XDP support Ivan Khoronzhuk
  4 siblings, 0 replies; 17+ messages in thread
From: Ivan Khoronzhuk @ 2019-07-03 10:19 UTC (permalink / raw)
  To: grygorii.strashko, hawk, davem
  Cc: ast, linux-kernel, linux-omap, xdp-newbies, ilias.apalodimas,
	netdev, daniel, jakub.kicinski, john.fastabend, Ivan Khoronzhuk

That's possible to set channel num while interfaces are down. When
interface gets up it should resplit budget. This resplit can happen
after phy is up but only if speed is changed, so should be set before
this, for this allow it to happen while changing number of channels,
when interfaces are down.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 drivers/net/ethernet/ti/cpsw_ethtool.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw_ethtool.c b/drivers/net/ethernet/ti/cpsw_ethtool.c
index 08d7aaee8299..fa4d75f5548e 100644
--- a/drivers/net/ethernet/ti/cpsw_ethtool.c
+++ b/drivers/net/ethernet/ti/cpsw_ethtool.c
@@ -620,8 +620,7 @@ int cpsw_set_channels_common(struct net_device *ndev,
 		}
 	}
 
-	if (cpsw->usage_count)
-		cpsw_split_res(cpsw);
+	cpsw_split_res(cpsw);
 
 	ret = cpsw_resume_data_pass(ndev);
 	if (!ret)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v6 net-next 5/5] net: ethernet: ti: cpsw: add XDP support
  2019-07-03 10:18 [PATCH v6 net-next 0/5] net: ethernet: ti: cpsw: Add XDP support Ivan Khoronzhuk
                   ` (3 preceding siblings ...)
  2019-07-03 10:19 ` [PATCH v6 net-next 4/5] net: ethernet: ti: cpsw_ethtool: allow res " Ivan Khoronzhuk
@ 2019-07-03 10:19 ` Ivan Khoronzhuk
  2019-07-04  9:19   ` Jesper Dangaard Brouer
  4 siblings, 1 reply; 17+ messages in thread
From: Ivan Khoronzhuk @ 2019-07-03 10:19 UTC (permalink / raw)
  To: grygorii.strashko, hawk, davem
  Cc: ast, linux-kernel, linux-omap, xdp-newbies, ilias.apalodimas,
	netdev, daniel, jakub.kicinski, john.fastabend, Ivan Khoronzhuk

Add XDP support based on rx page_pool allocator, one frame per page.
Page pool allocator is used with assumption that only one rx_handler
is running simultaneously. DMA map/unmap is reused from page pool
despite there is no need to map whole page.

Due to specific of cpsw, the same TX/RX handler can be used by 2
network devices, so special fields in buffer are added to identify
an interface the frame is destined to. Thus XDP works for both
interfaces, that allows to test xdp redirect between two interfaces
easily. Aslo, each rx queue have own page pools, but common for both
netdevs.

XDP prog is common for all channels till appropriate changes are added
in XDP infrastructure. Also, once page_pool recycling becomes part of
skb netstack some simplifications can be added, like removing
page_pool_release_page() before skb receive.

In order to keep rx_dev while redirect, that can be somehow used in
future, do flush in rx_handler, that allows to keep rx dev the same
while reidrect. It allows to conform with tracing rx_dev pointed
by Jesper.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
---
 drivers/net/ethernet/ti/Kconfig        |   1 +
 drivers/net/ethernet/ti/cpsw.c         | 485 ++++++++++++++++++++++---
 drivers/net/ethernet/ti/cpsw_ethtool.c |  66 +++-
 drivers/net/ethernet/ti/cpsw_priv.h    |   7 +
 4 files changed, 502 insertions(+), 57 deletions(-)

diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig
index a800d3417411..834afca3a019 100644
--- a/drivers/net/ethernet/ti/Kconfig
+++ b/drivers/net/ethernet/ti/Kconfig
@@ -50,6 +50,7 @@ config TI_CPSW
 	depends on ARCH_DAVINCI || ARCH_OMAP2PLUS || COMPILE_TEST
 	select TI_DAVINCI_MDIO
 	select MFD_SYSCON
+	select PAGE_POOL
 	select REGMAP
 	---help---
 	  This driver supports TI's CPSW Ethernet Switch.
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 32b7b3b74a6b..6e9be22035a9 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -31,6 +31,10 @@
 #include <linux/if_vlan.h>
 #include <linux/kmemleak.h>
 #include <linux/sys_soc.h>
+#include <net/page_pool.h>
+#include <linux/bpf.h>
+#include <linux/bpf_trace.h>
+#include <linux/filter.h>
 
 #include <linux/pinctrl/consumer.h>
 #include <net/pkt_cls.h>
@@ -60,6 +64,10 @@ static int descs_pool_size = CPSW_CPDMA_DESCS_POOL_SIZE_DEFAULT;
 module_param(descs_pool_size, int, 0444);
 MODULE_PARM_DESC(descs_pool_size, "Number of CPDMA CPPI descriptors in pool");
 
+/* The buf includes headroom compatible with both skb and xdpf */
+#define CPSW_HEADROOM_NA (max(XDP_PACKET_HEADROOM, NET_SKB_PAD) + NET_IP_ALIGN)
+#define CPSW_HEADROOM  ALIGN(CPSW_HEADROOM_NA, sizeof(long))
+
 #define for_each_slave(priv, func, arg...)				\
 	do {								\
 		struct cpsw_slave *slave;				\
@@ -74,6 +82,11 @@ MODULE_PARM_DESC(descs_pool_size, "Number of CPDMA CPPI descriptors in pool");
 				(func)(slave++, ##arg);			\
 	} while (0)
 
+#define CPSW_XMETA_OFFSET	ALIGN(sizeof(struct xdp_frame), sizeof(long))
+
+#define CPSW_XDP_CONSUMED		1
+#define CPSW_XDP_PASS			0
+
 static int cpsw_ndo_vlan_rx_add_vid(struct net_device *ndev,
 				    __be16 proto, u16 vid);
 
@@ -337,24 +350,58 @@ void cpsw_intr_disable(struct cpsw_common *cpsw)
 	return;
 }
 
+static int cpsw_is_xdpf_handle(void *handle)
+{
+	return (unsigned long)handle & BIT(0);
+}
+
+static void *cpsw_xdpf_to_handle(struct xdp_frame *xdpf)
+{
+	return (void *)((unsigned long)xdpf | BIT(0));
+}
+
+static struct xdp_frame *cpsw_handle_to_xdpf(void *handle)
+{
+	return (struct xdp_frame *)((unsigned long)handle & ~BIT(0));
+}
+
+struct __aligned(sizeof(long)) cpsw_meta_xdp {
+	struct net_device *ndev;
+	int ch;
+};
+
 void cpsw_tx_handler(void *token, int len, int status)
 {
+	struct cpsw_meta_xdp	*xmeta;
+	struct xdp_frame	*xdpf;
+	struct net_device	*ndev;
 	struct netdev_queue	*txq;
-	struct sk_buff		*skb = token;
-	struct net_device	*ndev = skb->dev;
-	struct cpsw_common	*cpsw = ndev_to_cpsw(ndev);
+	struct sk_buff		*skb;
+	int			ch;
+
+	if (cpsw_is_xdpf_handle(token)) {
+		xdpf = cpsw_handle_to_xdpf(token);
+		xmeta = (void *)xdpf + CPSW_XMETA_OFFSET;
+		ndev = xmeta->ndev;
+		ch = xmeta->ch;
+		xdp_return_frame(xdpf);
+	} else {
+		skb = token;
+		ndev = skb->dev;
+		ch = skb_get_queue_mapping(skb);
+		cpts_tx_timestamp(ndev_to_cpsw(ndev)->cpts, skb);
+		dev_kfree_skb_any(skb);
+	}
 
 	/* Check whether the queue is stopped due to stalled tx dma, if the
 	 * queue is stopped then start the queue as we have free desc for tx
 	 */
-	txq = netdev_get_tx_queue(ndev, skb_get_queue_mapping(skb));
+	txq = netdev_get_tx_queue(ndev, ch);
 	if (unlikely(netif_tx_queue_stopped(txq)))
 		netif_tx_wake_queue(txq);
 
-	cpts_tx_timestamp(cpsw->cpts, skb);
 	ndev->stats.tx_packets++;
 	ndev->stats.tx_bytes += len;
-	dev_kfree_skb_any(skb);
 }
 
 static void cpsw_rx_vlan_encap(struct sk_buff *skb)
@@ -400,24 +447,236 @@ static void cpsw_rx_vlan_encap(struct sk_buff *skb)
 	}
 }
 
+static int cpsw_xdp_tx_frame(struct cpsw_priv *priv, struct xdp_frame *xdpf,
+			     struct page *page)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	struct cpsw_meta_xdp *xmeta;
+	struct cpdma_chan *txch;
+	dma_addr_t dma;
+	int ret, port;
+
+	xmeta = (void *)xdpf + CPSW_XMETA_OFFSET;
+	xmeta->ndev = priv->ndev;
+	xmeta->ch = 0;
+	txch = cpsw->txv[0].ch;
+
+	port = priv->emac_port + cpsw->data.dual_emac;
+	if (page) {
+		dma = page_pool_get_dma_addr(page);
+		dma += xdpf->data - (void *)xdpf;
+		ret = cpdma_chan_submit_mapped(txch, cpsw_xdpf_to_handle(xdpf),
+					       dma, xdpf->len, port);
+	} else {
+		if (sizeof(*xmeta) > xdpf->headroom) {
+			xdp_return_frame_rx_napi(xdpf);
+			return -EINVAL;
+		}
+
+		ret = cpdma_chan_submit(txch, cpsw_xdpf_to_handle(xdpf),
+					xdpf->data, xdpf->len, port);
+	}
+
+	if (ret) {
+		priv->ndev->stats.tx_dropped++;
+		xdp_return_frame_rx_napi(xdpf);
+	}
+
+	return ret;
+}
+
+static int cpsw_run_xdp(struct cpsw_priv *priv, int ch, struct xdp_buff *xdp,
+			struct page *page)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	struct net_device *ndev = priv->ndev;
+	int ret = CPSW_XDP_CONSUMED;
+	struct xdp_frame *xdpf;
+	struct bpf_prog *prog;
+	u32 act;
+
+	rcu_read_lock();
+
+	prog = READ_ONCE(priv->xdp_prog);
+	if (!prog) {
+		ret = CPSW_XDP_PASS;
+		goto out;
+	}
+
+	act = bpf_prog_run_xdp(prog, xdp);
+	switch (act) {
+	case XDP_PASS:
+		ret = CPSW_XDP_PASS;
+		break;
+	case XDP_TX:
+		xdpf = convert_to_xdp_frame(xdp);
+		if (unlikely(!xdpf))
+			goto drop;
+
+		cpsw_xdp_tx_frame(priv, xdpf, page);
+		break;
+	case XDP_REDIRECT:
+		if (xdp_do_redirect(ndev, xdp, prog))
+			goto drop;
+
+		/* as flush requires rx_dev to be per NAPI handle and there
+		 * is can be two devices putting packets on bulk queue,
+		 * do flush here avoid this just for sure.
+		 */
+		xdp_do_flush_map();
+		break;
+	default:
+		bpf_warn_invalid_xdp_action(act);
+		/* fall through */
+	case XDP_ABORTED:
+		trace_xdp_exception(ndev, prog, act);
+		/* fall through -- handle aborts by dropping packet */
+	case XDP_DROP:
+		goto drop;
+	}
+out:
+	rcu_read_unlock();
+	return ret;
+drop:
+	rcu_read_unlock();
+	page_pool_recycle_direct(cpsw->page_pool[ch], page);
+	return ret;
+}
+
+static unsigned int cpsw_rxbuf_total_len(unsigned int len)
+{
+	len += CPSW_HEADROOM;
+	len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+
+	return SKB_DATA_ALIGN(len);
+}
+
+static struct page_pool *cpsw_create_page_pool(struct cpsw_common *cpsw,
+					       int size)
+{
+	struct page_pool_params pp_params;
+	struct page_pool *pool;
+
+	pp_params.order = 0;
+	pp_params.flags = PP_FLAG_DMA_MAP;
+	pp_params.pool_size = size;
+	pp_params.nid = NUMA_NO_NODE;
+	pp_params.dma_dir = DMA_BIDIRECTIONAL;
+	pp_params.dev = cpsw->dev;
+
+	pool = page_pool_create(&pp_params);
+	if (IS_ERR(pool))
+		dev_err(cpsw->dev, "cannot create rx page pool\n");
+
+	return pool;
+}
+
+static int cpsw_create_rx_pool(struct cpsw_common *cpsw, int ch)
+{
+	struct page_pool *pool;
+	int ret = 0, pool_size;
+
+	pool_size = cpdma_chan_get_rx_buf_num(cpsw->rxv[ch].ch);
+	pool = cpsw_create_page_pool(cpsw, pool_size);
+	if (IS_ERR(pool))
+		ret = PTR_ERR(pool);
+	else
+		cpsw->page_pool[ch] = pool;
+
+	return ret;
+}
+
+static int cpsw_ndev_create_xdp_rxq(struct cpsw_priv *priv, int ch)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	int ret, new_pool = false;
+	struct xdp_rxq_info *rxq;
+
+	rxq = &priv->xdp_rxq[ch];
+
+	ret = xdp_rxq_info_reg(rxq, priv->ndev, ch);
+	if (ret)
+		return ret;
+
+	if (!cpsw->page_pool[ch]) {
+		ret =  cpsw_create_rx_pool(cpsw, ch);
+		if (ret)
+			goto err_rxq;
+
+		new_pool = true;
+	}
+
+	ret = xdp_rxq_info_reg_mem_model(rxq, MEM_TYPE_PAGE_POOL,
+					 cpsw->page_pool[ch]);
+	if (!ret)
+		return 0;
+
+	if (new_pool) {
+		page_pool_free(cpsw->page_pool[ch]);
+		cpsw->page_pool[ch] = NULL;
+	}
+
+err_rxq:
+	xdp_rxq_info_unreg(rxq);
+	return ret;
+}
+
+void cpsw_ndev_destroy_xdp_rxqs(struct cpsw_priv *priv)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	struct xdp_rxq_info *rxq;
+	int i;
+
+	for (i = 0; i < cpsw->rx_ch_num; i++) {
+		rxq = &priv->xdp_rxq[i];
+		if (xdp_rxq_info_is_reg(rxq))
+			xdp_rxq_info_unreg(rxq);
+	}
+}
+
+int cpsw_ndev_create_xdp_rxqs(struct cpsw_priv *priv)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	int i, ret;
+
+	for (i = 0; i < cpsw->rx_ch_num; i++) {
+		ret = cpsw_ndev_create_xdp_rxq(priv, i);
+		if (ret)
+			goto err_cleanup;
+	}
+
+	return 0;
+
+err_cleanup:
+	cpsw_ndev_destroy_xdp_rxqs(priv);
+
+	return ret;
+}
+
 static void cpsw_rx_handler(void *token, int len, int status)
 {
-	struct cpdma_chan	*ch;
-	struct sk_buff		*skb = token;
-	struct sk_buff		*new_skb;
-	struct net_device	*ndev = skb->dev;
-	int			ret = 0, port;
-	struct cpsw_common	*cpsw = ndev_to_cpsw(ndev);
+	struct page		*new_page, *page = token;
+	void			*pa = page_address(page);
+	struct cpsw_meta_xdp	*xmeta = pa + CPSW_XMETA_OFFSET;
+	struct cpsw_common	*cpsw = ndev_to_cpsw(xmeta->ndev);
+	int			pkt_size = cpsw->rx_packet_max;
+	int			ret = 0, port, ch = xmeta->ch;
+	int			headroom = CPSW_HEADROOM;
+	struct net_device	*ndev = xmeta->ndev;
 	struct cpsw_priv	*priv;
+	struct page_pool	*pool;
+	struct sk_buff		*skb;
+	struct xdp_buff		xdp;
+	dma_addr_t		dma;
 
-	if (cpsw->data.dual_emac) {
+	if (cpsw->data.dual_emac && status >= 0) {
 		port = CPDMA_RX_SOURCE_PORT(status);
-		if (port) {
+		if (port)
 			ndev = cpsw->slaves[--port].ndev;
-			skb->dev = ndev;
-		}
 	}
 
+	priv = netdev_priv(ndev);
+	pool = cpsw->page_pool[ch];
 	if (unlikely(status < 0) || unlikely(!netif_running(ndev))) {
 		/* In dual emac mode check for all interfaces */
 		if (cpsw->data.dual_emac && cpsw->usage_count &&
@@ -426,43 +685,87 @@ static void cpsw_rx_handler(void *token, int len, int status)
 			 * is already down and the other interface is up
 			 * and running, instead of freeing which results
 			 * in reducing of the number of rx descriptor in
-			 * DMA engine, requeue skb back to cpdma.
+			 * DMA engine, requeue page back to cpdma.
 			 */
-			new_skb = skb;
+			new_page = page;
 			goto requeue;
 		}
 
-		/* the interface is going down, skbs are purged */
-		dev_kfree_skb_any(skb);
+		/* the interface is going down, pages are purged */
+		page_pool_recycle_direct(pool, page);
 		return;
 	}
 
-	new_skb = netdev_alloc_skb_ip_align(ndev, cpsw->rx_packet_max);
-	if (new_skb) {
-		skb_copy_queue_mapping(new_skb, skb);
-		skb_put(skb, len);
-		if (status & CPDMA_RX_VLAN_ENCAP)
-			cpsw_rx_vlan_encap(skb);
-		priv = netdev_priv(ndev);
-		if (priv->rx_ts_enabled)
-			cpts_rx_timestamp(cpsw->cpts, skb);
-		skb->protocol = eth_type_trans(skb, ndev);
-		netif_receive_skb(skb);
-		ndev->stats.rx_bytes += len;
-		ndev->stats.rx_packets++;
-		kmemleak_not_leak(new_skb);
-	} else {
+	new_page = page_pool_dev_alloc_pages(pool);
+	if (unlikely(!new_page)) {
+		new_page = page;
 		ndev->stats.rx_dropped++;
-		new_skb = skb;
+		goto requeue;
 	}
 
+	if (priv->xdp_prog) {
+		if (status & CPDMA_RX_VLAN_ENCAP) {
+			xdp.data = pa + CPSW_HEADROOM +
+				   CPSW_RX_VLAN_ENCAP_HDR_SIZE;
+			xdp.data_end = xdp.data + len -
+				       CPSW_RX_VLAN_ENCAP_HDR_SIZE;
+		} else {
+			xdp.data = pa + CPSW_HEADROOM;
+			xdp.data_end = xdp.data + len;
+		}
+
+		xdp_set_data_meta_invalid(&xdp);
+
+		xdp.data_hard_start = pa;
+		xdp.rxq = &priv->xdp_rxq[ch];
+
+		ret = cpsw_run_xdp(priv, ch, &xdp, page);
+		if (ret != CPSW_XDP_PASS)
+			goto requeue;
+
+		/* XDP prog might have changed packet data and boundaries */
+		len = xdp.data_end - xdp.data;
+		headroom = xdp.data - xdp.data_hard_start;
+
+		/* XDP prog can modify vlan tag, so can't use encap header */
+		status &= ~CPDMA_RX_VLAN_ENCAP;
+	}
+
+	/* pass skb to netstack if no XDP prog or returned XDP_PASS */
+	skb = build_skb(pa, cpsw_rxbuf_total_len(pkt_size));
+	if (!skb) {
+		ndev->stats.rx_dropped++;
+		page_pool_recycle_direct(pool, page);
+		goto requeue;
+	}
+
+	skb_reserve(skb, headroom);
+	skb_put(skb, len);
+	skb->dev = ndev;
+	if (status & CPDMA_RX_VLAN_ENCAP)
+		cpsw_rx_vlan_encap(skb);
+	if (priv->rx_ts_enabled)
+		cpts_rx_timestamp(cpsw->cpts, skb);
+	skb->protocol = eth_type_trans(skb, ndev);
+
+	/* unmap page as no netstack skb page recycling */
+	page_pool_release_page(pool, page);
+	netif_receive_skb(skb);
+
+	ndev->stats.rx_bytes += len;
+	ndev->stats.rx_packets++;
+
 requeue:
-	ch = cpsw->rxv[skb_get_queue_mapping(new_skb)].ch;
-	ret = cpdma_chan_submit(ch, new_skb, new_skb->data,
-				skb_tailroom(new_skb), 0);
+	xmeta = page_address(new_page) + CPSW_XMETA_OFFSET;
+	xmeta->ndev = ndev;
+	xmeta->ch = ch;
+
+	dma = page_pool_get_dma_addr(new_page) + CPSW_HEADROOM;
+	ret = cpdma_chan_submit_mapped(cpsw->rxv[ch].ch, new_page, dma,
+				       pkt_size, 0);
 	if (ret < 0) {
 		WARN_ON(ret == -ENOMEM);
-		dev_kfree_skb_any(new_skb);
+		page_pool_recycle_direct(pool, new_page);
 	}
 }
 
@@ -1032,33 +1335,39 @@ static void cpsw_init_host_port(struct cpsw_priv *priv)
 int cpsw_fill_rx_channels(struct cpsw_priv *priv)
 {
 	struct cpsw_common *cpsw = priv->cpsw;
-	struct sk_buff *skb;
+	struct cpsw_meta_xdp *xmeta;
+	struct page_pool *pool;
+	struct page *page;
 	int ch_buf_num;
 	int ch, i, ret;
+	dma_addr_t dma;
 
 	for (ch = 0; ch < cpsw->rx_ch_num; ch++) {
+		pool = cpsw->page_pool[ch];
 		ch_buf_num = cpdma_chan_get_rx_buf_num(cpsw->rxv[ch].ch);
 		for (i = 0; i < ch_buf_num; i++) {
-			skb = __netdev_alloc_skb_ip_align(priv->ndev,
-							  cpsw->rx_packet_max,
-							  GFP_KERNEL);
-			if (!skb) {
-				cpsw_err(priv, ifup, "cannot allocate skb\n");
+			page = page_pool_dev_alloc_pages(pool);
+			if (!page) {
+				cpsw_err(priv, ifup, "allocate rx page err\n");
 				return -ENOMEM;
 			}
 
-			skb_set_queue_mapping(skb, ch);
-			ret = cpdma_chan_idle_submit(cpsw->rxv[ch].ch, skb,
-						     skb->data,
-						     skb_tailroom(skb), 0);
+			xmeta = page_address(page) + CPSW_XMETA_OFFSET;
+			xmeta->ndev = priv->ndev;
+			xmeta->ch = ch;
+
+			dma = page_pool_get_dma_addr(page) + CPSW_HEADROOM;
+			ret = cpdma_chan_idle_submit_mapped(cpsw->rxv[ch].ch,
+							    page, dma,
+							    cpsw->rx_packet_max,
+							    0);
 			if (ret < 0) {
 				cpsw_err(priv, ifup,
-					 "cannot submit skb to channel %d rx, error %d\n",
+					 "cannot submit page to channel %d rx, error %d\n",
 					 ch, ret);
-				kfree_skb(skb);
+				page_pool_recycle_direct(pool, page);
 				return ret;
 			}
-			kmemleak_not_leak(skb);
 		}
 
 		cpsw_info(priv, ifup, "ch %d rx, submitted %d descriptors\n",
@@ -1370,6 +1679,10 @@ static int cpsw_ndo_open(struct net_device *ndev)
 		cpsw_ale_add_vlan(cpsw->ale, cpsw->data.default_vlan,
 				  ALE_ALL_PORTS, ALE_ALL_PORTS, 0, 0);
 
+	ret = cpsw_ndev_create_xdp_rxqs(priv);
+	if (ret)
+		goto err_cleanup;
+
 	/* initialize shared resources for every ndev */
 	if (!cpsw->usage_count) {
 		/* disable priority elevation */
@@ -1422,9 +1735,10 @@ static int cpsw_ndo_open(struct net_device *ndev)
 err_cleanup:
 	if (!cpsw->usage_count) {
 		cpdma_ctlr_stop(cpsw->dma);
-		for_each_slave(priv, cpsw_slave_stop, cpsw);
+		memset(cpsw->page_pool, 0, sizeof(cpsw->page_pool));
 	}
 
+	for_each_slave(priv, cpsw_slave_stop, cpsw);
 	pm_runtime_put_sync(cpsw->dev);
 	netif_carrier_off(priv->ndev);
 	return ret;
@@ -1447,9 +1761,12 @@ static int cpsw_ndo_stop(struct net_device *ndev)
 		cpsw_intr_disable(cpsw);
 		cpdma_ctlr_stop(cpsw->dma);
 		cpsw_ale_stop(cpsw->ale);
+		memset(cpsw->page_pool, 0, sizeof(cpsw->page_pool));
 	}
 	for_each_slave(priv, cpsw_slave_stop, cpsw);
 
+	cpsw_ndev_destroy_xdp_rxqs(priv);
+
 	if (cpsw_need_resplit(cpsw))
 		cpsw_split_res(cpsw);
 
@@ -2004,6 +2321,64 @@ static int cpsw_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
 	}
 }
 
+static int cpsw_xdp_prog_setup(struct cpsw_priv *priv, struct netdev_bpf *bpf)
+{
+	struct bpf_prog *prog = bpf->prog;
+
+	if (!priv->xdpi.prog && !prog)
+		return 0;
+
+	if (!xdp_attachment_flags_ok(&priv->xdpi, bpf))
+		return -EBUSY;
+
+	WRITE_ONCE(priv->xdp_prog, prog);
+
+	xdp_attachment_setup(&priv->xdpi, bpf);
+
+	return 0;
+}
+
+static int cpsw_ndo_bpf(struct net_device *ndev, struct netdev_bpf *bpf)
+{
+	struct cpsw_priv *priv = netdev_priv(ndev);
+
+	switch (bpf->command) {
+	case XDP_SETUP_PROG:
+		return cpsw_xdp_prog_setup(priv, bpf);
+
+	case XDP_QUERY_PROG:
+		return xdp_attachment_query(&priv->xdpi, bpf);
+
+	default:
+		return -EINVAL;
+	}
+}
+
+static int cpsw_ndo_xdp_xmit(struct net_device *ndev, int n,
+			     struct xdp_frame **frames, u32 flags)
+{
+	struct cpsw_priv *priv = netdev_priv(ndev);
+	struct xdp_frame *xdpf;
+	int i, drops = 0;
+
+	if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
+		return -EINVAL;
+
+	for (i = 0; i < n; i++) {
+		xdpf = frames[i];
+		if (xdpf->len < CPSW_MIN_PACKET_SIZE) {
+			xdp_return_frame_rx_napi(xdpf);
+			drops++;
+			continue;
+		}
+
+		if (cpsw_xdp_tx_frame(priv, xdpf, NULL))
+			drops++;
+	}
+
+	return n - drops;
+}
+
 #ifdef CONFIG_NET_POLL_CONTROLLER
 static void cpsw_ndo_poll_controller(struct net_device *ndev)
 {
@@ -2032,6 +2407,8 @@ static const struct net_device_ops cpsw_netdev_ops = {
 	.ndo_vlan_rx_add_vid	= cpsw_ndo_vlan_rx_add_vid,
 	.ndo_vlan_rx_kill_vid	= cpsw_ndo_vlan_rx_kill_vid,
 	.ndo_setup_tc           = cpsw_ndo_setup_tc,
+	.ndo_bpf		= cpsw_ndo_bpf,
+	.ndo_xdp_xmit		= cpsw_ndo_xdp_xmit,
 };
 
 static void cpsw_get_drvinfo(struct net_device *ndev,
diff --git a/drivers/net/ethernet/ti/cpsw_ethtool.c b/drivers/net/ethernet/ti/cpsw_ethtool.c
index fa4d75f5548e..b39a598cb094 100644
--- a/drivers/net/ethernet/ti/cpsw_ethtool.c
+++ b/drivers/net/ethernet/ti/cpsw_ethtool.c
@@ -578,6 +578,48 @@ static int cpsw_update_channels_res(struct cpsw_priv *priv, int ch_num, int rx,
 	return 0;
 }
 
+static void cpsw_destroy_xdp_rxqs(struct cpsw_common *cpsw)
+{
+	struct net_device *ndev;
+	struct cpsw_priv *priv;
+	int i;
+
+	for (i = 0; i < cpsw->data.slaves; i++) {
+		ndev = cpsw->slaves[i].ndev;
+		if (!ndev || !netif_running(ndev))
+			continue;
+
+		priv = netdev_priv(ndev);
+		cpsw_ndev_destroy_xdp_rxqs(priv);
+	}
+
+	memset(cpsw->page_pool, 0, sizeof(cpsw->page_pool));
+}
+
+static int cpsw_create_xdp_rxqs(struct cpsw_common *cpsw)
+{
+	struct net_device *ndev;
+	struct cpsw_priv *priv;
+	int i, ret;
+
+	for (i = 0; i < cpsw->data.slaves; i++) {
+		ndev = cpsw->slaves[i].ndev;
+		if (!ndev || !netif_running(ndev))
+			continue;
+
+		priv = netdev_priv(ndev);
+		ret = cpsw_ndev_create_xdp_rxqs(priv);
+		if (ret)
+			goto err_cleanup;
+	}
+
+	return 0;
+
+err_cleanup:
+	cpsw_destroy_xdp_rxqs(cpsw);
+	return ret;
+}
+
 int cpsw_set_channels_common(struct net_device *ndev,
 			     struct ethtool_channels *chs,
 			     cpdma_handler_fn rx_handler)
@@ -585,7 +627,7 @@ int cpsw_set_channels_common(struct net_device *ndev,
 	struct cpsw_priv *priv = netdev_priv(ndev);
 	struct cpsw_common *cpsw = priv->cpsw;
 	struct net_device *sl_ndev;
-	int i, ret;
+	int i, new_pools, ret;
 
 	ret = cpsw_check_ch_settings(cpsw, chs);
 	if (ret < 0)
@@ -593,6 +635,10 @@ int cpsw_set_channels_common(struct net_device *ndev,
 
 	cpsw_suspend_data_pass(ndev);
 
+	new_pools = (chs->rx_count != cpsw->rx_ch_num) && cpsw->usage_count;
+	if (new_pools)
+		cpsw_destroy_xdp_rxqs(cpsw);
+
 	ret = cpsw_update_channels_res(priv, chs->rx_count, 1, rx_handler);
 	if (ret)
 		goto err;
@@ -622,6 +668,12 @@ int cpsw_set_channels_common(struct net_device *ndev,
 
 	cpsw_split_res(cpsw);
 
+	if (new_pools) {
+		ret = cpsw_create_xdp_rxqs(cpsw);
+		if (ret)
+			goto err;
+	}
+
 	ret = cpsw_resume_data_pass(ndev);
 	if (!ret)
 		return 0;
@@ -647,8 +699,7 @@ void cpsw_get_ringparam(struct net_device *ndev,
 int cpsw_set_ringparam(struct net_device *ndev,
 		       struct ethtool_ringparam *ering)
 {
-	struct cpsw_priv *priv = netdev_priv(ndev);
-	struct cpsw_common *cpsw = priv->cpsw;
+	struct cpsw_common *cpsw = ndev_to_cpsw(ndev);
 	int ret;
 
 	/* ignore ering->tx_pending - only rx_pending adjustment is supported */
@@ -663,10 +714,19 @@ int cpsw_set_ringparam(struct net_device *ndev,
 
 	cpsw_suspend_data_pass(ndev);
 
+	if (cpsw->usage_count)
+		cpsw_destroy_xdp_rxqs(cpsw);
+
 	ret = cpdma_set_num_rx_descs(cpsw->dma, ering->rx_pending);
 	if (ret)
 		goto err;
 
+	if (cpsw->usage_count) {
+		ret = cpsw_create_xdp_rxqs(cpsw);
+		if (ret)
+			goto err;
+	}
+
 	ret = cpsw_resume_data_pass(ndev);
 	if (!ret)
 		return 0;
diff --git a/drivers/net/ethernet/ti/cpsw_priv.h b/drivers/net/ethernet/ti/cpsw_priv.h
index 04795b97ee71..da68764e7f87 100644
--- a/drivers/net/ethernet/ti/cpsw_priv.h
+++ b/drivers/net/ethernet/ti/cpsw_priv.h
@@ -346,6 +346,7 @@ struct cpsw_common {
 	int				rx_ch_num, tx_ch_num;
 	int				speed;
 	int				usage_count;
+	struct page_pool		*page_pool[CPSW_MAX_QUEUES];
 };
 
 struct cpsw_priv {
@@ -360,6 +361,10 @@ struct cpsw_priv {
 	int				shp_cfg_speed;
 	int				tx_ts_enabled;
 	int				rx_ts_enabled;
+	struct bpf_prog			*xdp_prog;
+	struct xdp_rxq_info		xdp_rxq[CPSW_MAX_QUEUES];
+	struct xdp_attachment_info	xdpi;
+
 	u32 emac_port;
 	struct cpsw_common *cpsw;
 };
@@ -391,6 +396,8 @@ int cpsw_fill_rx_channels(struct cpsw_priv *priv);
 void cpsw_intr_enable(struct cpsw_common *cpsw);
 void cpsw_intr_disable(struct cpsw_common *cpsw);
 void cpsw_tx_handler(void *token, int len, int status);
+int cpsw_ndev_create_xdp_rxqs(struct cpsw_priv *priv);
+void cpsw_ndev_destroy_xdp_rxqs(struct cpsw_priv *priv);
 
 /* ethtool */
 u32 cpsw_get_msglevel(struct net_device *ndev);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 net-next 1/5] xdp: allow same allocator usage
  2019-07-03 10:18 ` [PATCH v6 net-next 1/5] xdp: allow same allocator usage Ivan Khoronzhuk
@ 2019-07-03 17:40   ` Jesper Dangaard Brouer
  2019-07-04 10:22     ` Ivan Khoronzhuk
  0 siblings, 1 reply; 17+ messages in thread
From: Jesper Dangaard Brouer @ 2019-07-03 17:40 UTC (permalink / raw)
  To: Ivan Khoronzhuk
  Cc: grygorii.strashko, hawk, davem, ast, linux-kernel, linux-omap,
	xdp-newbies, ilias.apalodimas, netdev, daniel, jakub.kicinski,
	john.fastabend, brouer

On Wed,  3 Jul 2019 13:18:59 +0300
Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:

> First of all, it is an absolute requirement that each RX-queue have
> their own page_pool object/allocator. And this change is intendant
> to handle special case, where a single RX-queue can receive packets
> from two different net_devices.
> 
> In order to protect against using same allocator for 2 different rx
> queues, add queue_index to xdp_mem_allocator to catch the obvious
> mistake where queue_index mismatch, as proposed by Jesper Dangaard
> Brouer.
> 
> Adding this on xdp allocator level allows drivers with such dependency
> change the allocators w/o modifications.
> 
> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> ---
>  include/net/xdp_priv.h |  2 ++
>  net/core/xdp.c         | 55 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 57 insertions(+)
> 
> diff --git a/include/net/xdp_priv.h b/include/net/xdp_priv.h
> index 6a8cba6ea79a..9858a4057842 100644
> --- a/include/net/xdp_priv.h
> +++ b/include/net/xdp_priv.h
> @@ -18,6 +18,8 @@ struct xdp_mem_allocator {
>  	struct rcu_head rcu;
>  	struct delayed_work defer_wq;
>  	unsigned long defer_warn;
> +	unsigned long refcnt;
> +	u32 queue_index;
>  };

I don't like this approach, because I think we need to extend struct
xdp_mem_allocator with a net_device pointer, for doing dev_hold(), to
correctly handle lifetime issues. (As I tried to explain previously).
This will be much harder after this change, which is why I proposed the
other patch.


>  #endif /* __LINUX_NET_XDP_PRIV_H__ */
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index 829377cc83db..4f0ddbb3717a 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -98,6 +98,18 @@ static bool __mem_id_disconnect(int id, bool force)
>  		WARN(1, "Request remove non-existing id(%d), driver bug?", id);
>  		return true;
>  	}
> +
> +	/* to avoid calling hash lookup twice, decrement refcnt here till it
> +	 * reaches zero, then it can be called from workqueue afterwards.
> +	 */
> +	if (xa->refcnt)
> +		xa->refcnt--;
> +
> +	if (xa->refcnt) {
> +		mutex_unlock(&mem_id_lock);
> +		return true;
> +	}
> +
>  	xa->disconnect_cnt++;
>  
>  	/* Detects in-flight packet-pages for page_pool */
> @@ -312,6 +324,33 @@ static bool __is_supported_mem_type(enum xdp_mem_type type)
>  	return true;
>  }
>  
> +static struct xdp_mem_allocator *xdp_allocator_find(void *allocator)
> +{
> +	struct xdp_mem_allocator *xae, *xa = NULL;
> +	struct rhashtable_iter iter;
> +
> +	if (!allocator)
> +		return xa;
> +
> +	rhashtable_walk_enter(mem_id_ht, &iter);
> +	do {
> +		rhashtable_walk_start(&iter);
> +
> +		while ((xae = rhashtable_walk_next(&iter)) && !IS_ERR(xae)) {
> +			if (xae->allocator == allocator) {
> +				xa = xae;
> +				break;
> +			}
> +		}
> +
> +		rhashtable_walk_stop(&iter);
> +
> +	} while (xae == ERR_PTR(-EAGAIN));
> +	rhashtable_walk_exit(&iter);
> +
> +	return xa;
> +}
> +
>  int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
>  			       enum xdp_mem_type type, void *allocator)
>  {
> @@ -347,6 +386,20 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
>  		}
>  	}
>  
> +	mutex_lock(&mem_id_lock);
> +	xdp_alloc = xdp_allocator_find(allocator);
> +	if (xdp_alloc) {
> +		/* One allocator per queue is supposed only */
> +		if (xdp_alloc->queue_index != xdp_rxq->queue_index)
> +			return -EINVAL;
> +
> +		xdp_rxq->mem.id = xdp_alloc->mem.id;
> +		xdp_alloc->refcnt++;
> +		mutex_unlock(&mem_id_lock);
> +		return 0;
> +	}
> +	mutex_unlock(&mem_id_lock);
> +
>  	xdp_alloc = kzalloc(sizeof(*xdp_alloc), gfp);
>  	if (!xdp_alloc)
>  		return -ENOMEM;
> @@ -360,6 +413,8 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
>  	xdp_rxq->mem.id = id;
>  	xdp_alloc->mem  = xdp_rxq->mem;
>  	xdp_alloc->allocator = allocator;
> +	xdp_alloc->refcnt = 1;
> +	xdp_alloc->queue_index = xdp_rxq->queue_index;
>  
>  	/* Insert allocator into ID lookup table */
>  	ptr = rhashtable_insert_slow(mem_id_ht, &id, &xdp_alloc->node);



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 net-next 5/5] net: ethernet: ti: cpsw: add XDP support
  2019-07-03 10:19 ` [PATCH v6 net-next 5/5] net: ethernet: ti: cpsw: add XDP support Ivan Khoronzhuk
@ 2019-07-04  9:19   ` Jesper Dangaard Brouer
  2019-07-04  9:39     ` Ilias Apalodimas
  2019-07-04  9:45     ` Ivan Khoronzhuk
  0 siblings, 2 replies; 17+ messages in thread
From: Jesper Dangaard Brouer @ 2019-07-04  9:19 UTC (permalink / raw)
  To: Ivan Khoronzhuk
  Cc: grygorii.strashko, hawk, davem, ast, linux-kernel, linux-omap,
	xdp-newbies, ilias.apalodimas, netdev, daniel, jakub.kicinski,
	john.fastabend, brouer

On Wed,  3 Jul 2019 13:19:03 +0300
Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:

> Add XDP support based on rx page_pool allocator, one frame per page.
> Page pool allocator is used with assumption that only one rx_handler
> is running simultaneously. DMA map/unmap is reused from page pool
> despite there is no need to map whole page.
> 
> Due to specific of cpsw, the same TX/RX handler can be used by 2
> network devices, so special fields in buffer are added to identify
> an interface the frame is destined to. Thus XDP works for both
> interfaces, that allows to test xdp redirect between two interfaces
> easily. Aslo, each rx queue have own page pools, but common for both
> netdevs.
> 
> XDP prog is common for all channels till appropriate changes are added
> in XDP infrastructure. Also, once page_pool recycling becomes part of
> skb netstack some simplifications can be added, like removing
> page_pool_release_page() before skb receive.
> 
> In order to keep rx_dev while redirect, that can be somehow used in
> future, do flush in rx_handler, that allows to keep rx dev the same
> while reidrect. It allows to conform with tracing rx_dev pointed
> by Jesper.

So, you simply call xdp_do_flush_map() after each xdp_do_redirect().
It will kill RX-bulk and performance, but I guess it will work.

I guess, we can optimized it later, by e.g. in function calling
cpsw_run_xdp() have a variable that detect if net_device changed
(priv->ndev) and then call xdp_do_flush_map() when needed.


> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> ---
>  drivers/net/ethernet/ti/Kconfig        |   1 +
>  drivers/net/ethernet/ti/cpsw.c         | 485 ++++++++++++++++++++++---
>  drivers/net/ethernet/ti/cpsw_ethtool.c |  66 +++-
>  drivers/net/ethernet/ti/cpsw_priv.h    |   7 +
>  4 files changed, 502 insertions(+), 57 deletions(-)
> 
[...]
> +static int cpsw_run_xdp(struct cpsw_priv *priv, int ch, struct xdp_buff *xdp,
> +			struct page *page)
> +{
> +	struct cpsw_common *cpsw = priv->cpsw;
> +	struct net_device *ndev = priv->ndev;
> +	int ret = CPSW_XDP_CONSUMED;
> +	struct xdp_frame *xdpf;
> +	struct bpf_prog *prog;
> +	u32 act;
> +
> +	rcu_read_lock();
> +
> +	prog = READ_ONCE(priv->xdp_prog);
> +	if (!prog) {
> +		ret = CPSW_XDP_PASS;
> +		goto out;
> +	}
> +
> +	act = bpf_prog_run_xdp(prog, xdp);
> +	switch (act) {
> +	case XDP_PASS:
> +		ret = CPSW_XDP_PASS;
> +		break;
> +	case XDP_TX:
> +		xdpf = convert_to_xdp_frame(xdp);
> +		if (unlikely(!xdpf))
> +			goto drop;
> +
> +		cpsw_xdp_tx_frame(priv, xdpf, page);
> +		break;
> +	case XDP_REDIRECT:
> +		if (xdp_do_redirect(ndev, xdp, prog))
> +			goto drop;
> +
> +		/* as flush requires rx_dev to be per NAPI handle and there
> +		 * is can be two devices putting packets on bulk queue,
> +		 * do flush here avoid this just for sure.
> +		 */
> +		xdp_do_flush_map();

> +		break;
> +	default:
> +		bpf_warn_invalid_xdp_action(act);
> +		/* fall through */
> +	case XDP_ABORTED:
> +		trace_xdp_exception(ndev, prog, act);
> +		/* fall through -- handle aborts by dropping packet */
> +	case XDP_DROP:
> +		goto drop;
> +	}
> +out:
> +	rcu_read_unlock();
> +	return ret;
> +drop:
> +	rcu_read_unlock();
> +	page_pool_recycle_direct(cpsw->page_pool[ch], page);
> +	return ret;
> +}

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 net-next 5/5] net: ethernet: ti: cpsw: add XDP support
  2019-07-04  9:19   ` Jesper Dangaard Brouer
@ 2019-07-04  9:39     ` Ilias Apalodimas
  2019-07-04  9:43       ` Ivan Khoronzhuk
  2019-07-04  9:45     ` Ivan Khoronzhuk
  1 sibling, 1 reply; 17+ messages in thread
From: Ilias Apalodimas @ 2019-07-04  9:39 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Ivan Khoronzhuk, grygorii.strashko, hawk, davem, ast,
	linux-kernel, linux-omap, xdp-newbies, netdev, daniel,
	jakub.kicinski, john.fastabend

On Thu, Jul 04, 2019 at 11:19:39AM +0200, Jesper Dangaard Brouer wrote:
> On Wed,  3 Jul 2019 13:19:03 +0300
> Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
> 
> > Add XDP support based on rx page_pool allocator, one frame per page.
> > Page pool allocator is used with assumption that only one rx_handler
> > is running simultaneously. DMA map/unmap is reused from page pool
> > despite there is no need to map whole page.
> > 
> > Due to specific of cpsw, the same TX/RX handler can be used by 2
> > network devices, so special fields in buffer are added to identify
> > an interface the frame is destined to. Thus XDP works for both
> > interfaces, that allows to test xdp redirect between two interfaces
> > easily. Aslo, each rx queue have own page pools, but common for both
> > netdevs.
> > 
> > XDP prog is common for all channels till appropriate changes are added
> > in XDP infrastructure. Also, once page_pool recycling becomes part of
> > skb netstack some simplifications can be added, like removing
> > page_pool_release_page() before skb receive.
> > 
> > In order to keep rx_dev while redirect, that can be somehow used in
> > future, do flush in rx_handler, that allows to keep rx dev the same
> > while reidrect. It allows to conform with tracing rx_dev pointed
> > by Jesper.
> 
> So, you simply call xdp_do_flush_map() after each xdp_do_redirect().
> It will kill RX-bulk and performance, but I guess it will work.
> 
> I guess, we can optimized it later, by e.g. in function calling
> cpsw_run_xdp() have a variable that detect if net_device changed
> (priv->ndev) and then call xdp_do_flush_map() when needed.
I tried something similar on the netsec driver on my initial development. 
On the 1gbit speed NICs i saw no difference between flushing per packet vs
flushing on the end of the NAPI handler. 
The latter is obviously better but since the performance impact is negligible on
this particular NIC, i don't think this should be a blocker. 
Please add a clear comment on this and why you do that on this driver,
so people won't go ahead and copy/paste this approach 


Thanks
/Ilias
> 
> 
> > Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> > ---
> >  drivers/net/ethernet/ti/Kconfig        |   1 +
> >  drivers/net/ethernet/ti/cpsw.c         | 485 ++++++++++++++++++++++---
> >  drivers/net/ethernet/ti/cpsw_ethtool.c |  66 +++-
> >  drivers/net/ethernet/ti/cpsw_priv.h    |   7 +
> >  4 files changed, 502 insertions(+), 57 deletions(-)
> > 
> [...]
> > +static int cpsw_run_xdp(struct cpsw_priv *priv, int ch, struct xdp_buff *xdp,
> > +			struct page *page)
> > +{
> > +	struct cpsw_common *cpsw = priv->cpsw;
> > +	struct net_device *ndev = priv->ndev;
> > +	int ret = CPSW_XDP_CONSUMED;
> > +	struct xdp_frame *xdpf;
> > +	struct bpf_prog *prog;
> > +	u32 act;
> > +
> > +	rcu_read_lock();
> > +
> > +	prog = READ_ONCE(priv->xdp_prog);
> > +	if (!prog) {
> > +		ret = CPSW_XDP_PASS;
> > +		goto out;
> > +	}
> > +
> > +	act = bpf_prog_run_xdp(prog, xdp);
> > +	switch (act) {
> > +	case XDP_PASS:
> > +		ret = CPSW_XDP_PASS;
> > +		break;
> > +	case XDP_TX:
> > +		xdpf = convert_to_xdp_frame(xdp);
> > +		if (unlikely(!xdpf))
> > +			goto drop;
> > +
> > +		cpsw_xdp_tx_frame(priv, xdpf, page);
> > +		break;
> > +	case XDP_REDIRECT:
> > +		if (xdp_do_redirect(ndev, xdp, prog))
> > +			goto drop;
> > +
> > +		/* as flush requires rx_dev to be per NAPI handle and there
> > +		 * is can be two devices putting packets on bulk queue,
> > +		 * do flush here avoid this just for sure.
> > +		 */
> > +		xdp_do_flush_map();
> 
> > +		break;
> > +	default:
> > +		bpf_warn_invalid_xdp_action(act);
> > +		/* fall through */
> > +	case XDP_ABORTED:
> > +		trace_xdp_exception(ndev, prog, act);
> > +		/* fall through -- handle aborts by dropping packet */
> > +	case XDP_DROP:
> > +		goto drop;
> > +	}
> > +out:
> > +	rcu_read_unlock();
> > +	return ret;
> > +drop:
> > +	rcu_read_unlock();
> > +	page_pool_recycle_direct(cpsw->page_pool[ch], page);
> > +	return ret;
> > +}
> 
> -- 
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 net-next 5/5] net: ethernet: ti: cpsw: add XDP support
  2019-07-04  9:39     ` Ilias Apalodimas
@ 2019-07-04  9:43       ` Ivan Khoronzhuk
  2019-07-04  9:49         ` Ilias Apalodimas
  0 siblings, 1 reply; 17+ messages in thread
From: Ivan Khoronzhuk @ 2019-07-04  9:43 UTC (permalink / raw)
  To: Ilias Apalodimas
  Cc: Jesper Dangaard Brouer, grygorii.strashko, hawk, davem, ast,
	linux-kernel, linux-omap, xdp-newbies, netdev, daniel,
	jakub.kicinski, john.fastabend

On Thu, Jul 04, 2019 at 12:39:02PM +0300, Ilias Apalodimas wrote:
>On Thu, Jul 04, 2019 at 11:19:39AM +0200, Jesper Dangaard Brouer wrote:
>> On Wed,  3 Jul 2019 13:19:03 +0300
>> Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
>>
>> > Add XDP support based on rx page_pool allocator, one frame per page.
>> > Page pool allocator is used with assumption that only one rx_handler
>> > is running simultaneously. DMA map/unmap is reused from page pool
>> > despite there is no need to map whole page.
>> >
>> > Due to specific of cpsw, the same TX/RX handler can be used by 2
>> > network devices, so special fields in buffer are added to identify
>> > an interface the frame is destined to. Thus XDP works for both
>> > interfaces, that allows to test xdp redirect between two interfaces
>> > easily. Aslo, each rx queue have own page pools, but common for both
>> > netdevs.
>> >
>> > XDP prog is common for all channels till appropriate changes are added
>> > in XDP infrastructure. Also, once page_pool recycling becomes part of
>> > skb netstack some simplifications can be added, like removing
>> > page_pool_release_page() before skb receive.
>> >
>> > In order to keep rx_dev while redirect, that can be somehow used in
>> > future, do flush in rx_handler, that allows to keep rx dev the same
>> > while reidrect. It allows to conform with tracing rx_dev pointed
>> > by Jesper.
>>
>> So, you simply call xdp_do_flush_map() after each xdp_do_redirect().
>> It will kill RX-bulk and performance, but I guess it will work.
>>
>> I guess, we can optimized it later, by e.g. in function calling
>> cpsw_run_xdp() have a variable that detect if net_device changed
>> (priv->ndev) and then call xdp_do_flush_map() when needed.
>I tried something similar on the netsec driver on my initial development.
>On the 1gbit speed NICs i saw no difference between flushing per packet vs
>flushing on the end of the NAPI handler.
>The latter is obviously better but since the performance impact is negligible on
>this particular NIC, i don't think this should be a blocker.
>Please add a clear comment on this and why you do that on this driver,
>so people won't go ahead and copy/paste this approach
Sry, but I did this already, is it not enouph?

-- 
Regards,
Ivan Khoronzhuk

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 net-next 5/5] net: ethernet: ti: cpsw: add XDP support
  2019-07-04  9:19   ` Jesper Dangaard Brouer
  2019-07-04  9:39     ` Ilias Apalodimas
@ 2019-07-04  9:45     ` Ivan Khoronzhuk
  1 sibling, 0 replies; 17+ messages in thread
From: Ivan Khoronzhuk @ 2019-07-04  9:45 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: grygorii.strashko, hawk, davem, ast, linux-kernel, linux-omap,
	xdp-newbies, ilias.apalodimas, netdev, daniel, jakub.kicinski,
	john.fastabend

On Thu, Jul 04, 2019 at 11:19:39AM +0200, Jesper Dangaard Brouer wrote:
>On Wed,  3 Jul 2019 13:19:03 +0300
>Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
>
>> Add XDP support based on rx page_pool allocator, one frame per page.
>> Page pool allocator is used with assumption that only one rx_handler
>> is running simultaneously. DMA map/unmap is reused from page pool
>> despite there is no need to map whole page.
>>
>> Due to specific of cpsw, the same TX/RX handler can be used by 2
>> network devices, so special fields in buffer are added to identify
>> an interface the frame is destined to. Thus XDP works for both
>> interfaces, that allows to test xdp redirect between two interfaces
>> easily. Aslo, each rx queue have own page pools, but common for both
>> netdevs.
>>
>> XDP prog is common for all channels till appropriate changes are added
>> in XDP infrastructure. Also, once page_pool recycling becomes part of
>> skb netstack some simplifications can be added, like removing
>> page_pool_release_page() before skb receive.
>>
>> In order to keep rx_dev while redirect, that can be somehow used in
>> future, do flush in rx_handler, that allows to keep rx dev the same
>> while reidrect. It allows to conform with tracing rx_dev pointed
>> by Jesper.
>
>So, you simply call xdp_do_flush_map() after each xdp_do_redirect().
>It will kill RX-bulk and performance, but I guess it will work.
>
>I guess, we can optimized it later, by e.g. in function calling
>cpsw_run_xdp() have a variable that detect if net_device changed
>(priv->ndev) and then call xdp_do_flush_map() when needed.

It's problem of cpsw already and can be optimized locally by own
bulk queues for instance, if it will be simple if really needed ofc.

>
>
>> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
>> ---
>>  drivers/net/ethernet/ti/Kconfig        |   1 +
>>  drivers/net/ethernet/ti/cpsw.c         | 485 ++++++++++++++++++++++---
>>  drivers/net/ethernet/ti/cpsw_ethtool.c |  66 +++-
>>  drivers/net/ethernet/ti/cpsw_priv.h    |   7 +
>>  4 files changed, 502 insertions(+), 57 deletions(-)
>>
>[...]
>> +static int cpsw_run_xdp(struct cpsw_priv *priv, int ch, struct xdp_buff *xdp,
>> +			struct page *page)
>> +{
>> +	struct cpsw_common *cpsw = priv->cpsw;
>> +	struct net_device *ndev = priv->ndev;
>> +	int ret = CPSW_XDP_CONSUMED;
>> +	struct xdp_frame *xdpf;
>> +	struct bpf_prog *prog;
>> +	u32 act;
>> +
>> +	rcu_read_lock();
>> +
>> +	prog = READ_ONCE(priv->xdp_prog);
>> +	if (!prog) {
>> +		ret = CPSW_XDP_PASS;
>> +		goto out;
>> +	}
>> +
>> +	act = bpf_prog_run_xdp(prog, xdp);
>> +	switch (act) {
>> +	case XDP_PASS:
>> +		ret = CPSW_XDP_PASS;
>> +		break;
>> +	case XDP_TX:
>> +		xdpf = convert_to_xdp_frame(xdp);
>> +		if (unlikely(!xdpf))
>> +			goto drop;
>> +
>> +		cpsw_xdp_tx_frame(priv, xdpf, page);
>> +		break;
>> +	case XDP_REDIRECT:
>> +		if (xdp_do_redirect(ndev, xdp, prog))
>> +			goto drop;
>> +
>> +		/* as flush requires rx_dev to be per NAPI handle and there
>> +		 * is can be two devices putting packets on bulk queue,
>> +		 * do flush here avoid this just for sure.
>> +		 */
>> +		xdp_do_flush_map();
>
>> +		break;
>> +	default:
>> +		bpf_warn_invalid_xdp_action(act);
>> +		/* fall through */
>> +	case XDP_ABORTED:
>> +		trace_xdp_exception(ndev, prog, act);
>> +		/* fall through -- handle aborts by dropping packet */
>> +	case XDP_DROP:
>> +		goto drop;
>> +	}
>> +out:
>> +	rcu_read_unlock();
>> +	return ret;
>> +drop:
>> +	rcu_read_unlock();
>> +	page_pool_recycle_direct(cpsw->page_pool[ch], page);
>> +	return ret;
>> +}
>
>-- 
>Best regards,
>  Jesper Dangaard Brouer
>  MSc.CS, Principal Kernel Engineer at Red Hat
>  LinkedIn: http://www.linkedin.com/in/brouer

-- 
Regards,
Ivan Khoronzhuk

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 net-next 5/5] net: ethernet: ti: cpsw: add XDP support
  2019-07-04  9:43       ` Ivan Khoronzhuk
@ 2019-07-04  9:49         ` Ilias Apalodimas
  2019-07-04  9:53           ` Ivan Khoronzhuk
  0 siblings, 1 reply; 17+ messages in thread
From: Ilias Apalodimas @ 2019-07-04  9:49 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, grygorii.strashko, hawk, davem, ast,
	linux-kernel, linux-omap, xdp-newbies, netdev, daniel,
	jakub.kicinski, john.fastabend

On Thu, Jul 04, 2019 at 12:43:30PM +0300, Ivan Khoronzhuk wrote:
> On Thu, Jul 04, 2019 at 12:39:02PM +0300, Ilias Apalodimas wrote:
> >On Thu, Jul 04, 2019 at 11:19:39AM +0200, Jesper Dangaard Brouer wrote:
> >>On Wed,  3 Jul 2019 13:19:03 +0300
> >>Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
> >>
> >>> Add XDP support based on rx page_pool allocator, one frame per page.
> >>> Page pool allocator is used with assumption that only one rx_handler
> >>> is running simultaneously. DMA map/unmap is reused from page pool
> >>> despite there is no need to map whole page.
> >>>
> >>> Due to specific of cpsw, the same TX/RX handler can be used by 2
> >>> network devices, so special fields in buffer are added to identify
> >>> an interface the frame is destined to. Thus XDP works for both
> >>> interfaces, that allows to test xdp redirect between two interfaces
> >>> easily. Aslo, each rx queue have own page pools, but common for both
> >>> netdevs.
> >>>
> >>> XDP prog is common for all channels till appropriate changes are added
> >>> in XDP infrastructure. Also, once page_pool recycling becomes part of
> >>> skb netstack some simplifications can be added, like removing
> >>> page_pool_release_page() before skb receive.
> >>>
> >>> In order to keep rx_dev while redirect, that can be somehow used in
> >>> future, do flush in rx_handler, that allows to keep rx dev the same
> >>> while reidrect. It allows to conform with tracing rx_dev pointed
> >>> by Jesper.
> >>
> >>So, you simply call xdp_do_flush_map() after each xdp_do_redirect().
> >>It will kill RX-bulk and performance, but I guess it will work.
> >>
> >>I guess, we can optimized it later, by e.g. in function calling
> >>cpsw_run_xdp() have a variable that detect if net_device changed
> >>(priv->ndev) and then call xdp_do_flush_map() when needed.
> >I tried something similar on the netsec driver on my initial development.
> >On the 1gbit speed NICs i saw no difference between flushing per packet vs
> >flushing on the end of the NAPI handler.
> >The latter is obviously better but since the performance impact is negligible on
> >this particular NIC, i don't think this should be a blocker.
> >Please add a clear comment on this and why you do that on this driver,
> >so people won't go ahead and copy/paste this approach
> Sry, but I did this already, is it not enouph?
The flush *must* happen there to avoid messing the following layers. The comment
says something like 'just to be sure'. It's not something that might break, it's
something that *will* break the code and i don't think that's clear with the
current comment.

So i'd prefer something like 
'We must flush here, per packet, instead of doing it in bulk at the end of
the napi handler.The RX devices on this particular hardware is sharing a
common queue, so the incoming device might change per packet'


Thanks
/Ilias
> 
> -- 
> Regards,
> Ivan Khoronzhuk

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 net-next 5/5] net: ethernet: ti: cpsw: add XDP support
  2019-07-04  9:49         ` Ilias Apalodimas
@ 2019-07-04  9:53           ` Ivan Khoronzhuk
  0 siblings, 0 replies; 17+ messages in thread
From: Ivan Khoronzhuk @ 2019-07-04  9:53 UTC (permalink / raw)
  To: Ilias Apalodimas
  Cc: Jesper Dangaard Brouer, grygorii.strashko, hawk, davem, ast,
	linux-kernel, linux-omap, xdp-newbies, netdev, daniel,
	jakub.kicinski, john.fastabend

On Thu, Jul 04, 2019 at 12:49:38PM +0300, Ilias Apalodimas wrote:
>On Thu, Jul 04, 2019 at 12:43:30PM +0300, Ivan Khoronzhuk wrote:
>> On Thu, Jul 04, 2019 at 12:39:02PM +0300, Ilias Apalodimas wrote:
>> >On Thu, Jul 04, 2019 at 11:19:39AM +0200, Jesper Dangaard Brouer wrote:
>> >>On Wed,  3 Jul 2019 13:19:03 +0300
>> >>Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
>> >>
>> >>> Add XDP support based on rx page_pool allocator, one frame per page.
>> >>> Page pool allocator is used with assumption that only one rx_handler
>> >>> is running simultaneously. DMA map/unmap is reused from page pool
>> >>> despite there is no need to map whole page.
>> >>>
>> >>> Due to specific of cpsw, the same TX/RX handler can be used by 2
>> >>> network devices, so special fields in buffer are added to identify
>> >>> an interface the frame is destined to. Thus XDP works for both
>> >>> interfaces, that allows to test xdp redirect between two interfaces
>> >>> easily. Aslo, each rx queue have own page pools, but common for both
>> >>> netdevs.
>> >>>
>> >>> XDP prog is common for all channels till appropriate changes are added
>> >>> in XDP infrastructure. Also, once page_pool recycling becomes part of
>> >>> skb netstack some simplifications can be added, like removing
>> >>> page_pool_release_page() before skb receive.
>> >>>
>> >>> In order to keep rx_dev while redirect, that can be somehow used in
>> >>> future, do flush in rx_handler, that allows to keep rx dev the same
>> >>> while reidrect. It allows to conform with tracing rx_dev pointed
>> >>> by Jesper.
>> >>
>> >>So, you simply call xdp_do_flush_map() after each xdp_do_redirect().
>> >>It will kill RX-bulk and performance, but I guess it will work.
>> >>
>> >>I guess, we can optimized it later, by e.g. in function calling
>> >>cpsw_run_xdp() have a variable that detect if net_device changed
>> >>(priv->ndev) and then call xdp_do_flush_map() when needed.
>> >I tried something similar on the netsec driver on my initial development.
>> >On the 1gbit speed NICs i saw no difference between flushing per packet vs
>> >flushing on the end of the NAPI handler.
>> >The latter is obviously better but since the performance impact is negligible on
>> >this particular NIC, i don't think this should be a blocker.
>> >Please add a clear comment on this and why you do that on this driver,
>> >so people won't go ahead and copy/paste this approach
>> Sry, but I did this already, is it not enouph?
>The flush *must* happen there to avoid messing the following layers. The comment
>says something like 'just to be sure'. It's not something that might break, it's
>something that *will* break the code and i don't think that's clear with the
>current comment.
>
>So i'd prefer something like
>'We must flush here, per packet, instead of doing it in bulk at the end of
>the napi handler.The RX devices on this particular hardware is sharing a
>common queue, so the incoming device might change per packet'
Sounds good, will replace on it.

-- 
Regards,
Ivan Khoronzhuk

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 net-next 1/5] xdp: allow same allocator usage
  2019-07-03 17:40   ` Jesper Dangaard Brouer
@ 2019-07-04 10:22     ` Ivan Khoronzhuk
  2019-07-04 12:41       ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 17+ messages in thread
From: Ivan Khoronzhuk @ 2019-07-04 10:22 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: grygorii.strashko, hawk, davem, ast, linux-kernel, linux-omap,
	xdp-newbies, ilias.apalodimas, netdev, daniel, jakub.kicinski,
	john.fastabend

On Wed, Jul 03, 2019 at 07:40:13PM +0200, Jesper Dangaard Brouer wrote:
>On Wed,  3 Jul 2019 13:18:59 +0300
>Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
>
>> First of all, it is an absolute requirement that each RX-queue have
>> their own page_pool object/allocator. And this change is intendant
>> to handle special case, where a single RX-queue can receive packets
>> from two different net_devices.
>>
>> In order to protect against using same allocator for 2 different rx
>> queues, add queue_index to xdp_mem_allocator to catch the obvious
>> mistake where queue_index mismatch, as proposed by Jesper Dangaard
>> Brouer.
>>
>> Adding this on xdp allocator level allows drivers with such dependency
>> change the allocators w/o modifications.
>>
>> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
>> ---
>>  include/net/xdp_priv.h |  2 ++
>>  net/core/xdp.c         | 55 ++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 57 insertions(+)
>>
>> diff --git a/include/net/xdp_priv.h b/include/net/xdp_priv.h
>> index 6a8cba6ea79a..9858a4057842 100644
>> --- a/include/net/xdp_priv.h
>> +++ b/include/net/xdp_priv.h
>> @@ -18,6 +18,8 @@ struct xdp_mem_allocator {
>>  	struct rcu_head rcu;
>>  	struct delayed_work defer_wq;
>>  	unsigned long defer_warn;
>> +	unsigned long refcnt;
>> +	u32 queue_index;
>>  };
>
>I don't like this approach, because I think we need to extend struct
>xdp_mem_allocator with a net_device pointer, for doing dev_hold(), to
>correctly handle lifetime issues. (As I tried to explain previously).
>This will be much harder after this change, which is why I proposed the
>other patch.
My concern comes not from zero also.
It's partly continuation of not answered questions from here:
https://lwn.net/ml/netdev/20190625122822.GC6485@khorivan/

"For me it's important to know only if it means that alloc.count is
freed at first call of __mem_id_disconnect() while shutdown.
The workqueue for the rest is connected only with ring cache protected
by ring lock and not supposed that alloc.count can be changed while
workqueue tries to shutdonwn the pool."

So patch you propose to leave works only because of luck, because fast
cache is cleared before workqueue is scheduled and no races between two
workqueues for fast cache later. I'm not really against this patch, but
I have to try smth better.

So, the patch is fine only because of specific of page_pool implementation.
I'm not sure that in future similar workqueue completion will be lucky for
another allocator (it easily can happen due to xdp frame can live longer
than an allocator). Similar problem can happen with other drivers having
same allocator, that can use zca (potentially can use smth similar),
af_xdp api allows to switch on it or some other allocators....

But not the essence. The concern about adding smth new to the allocator
later, like net device, can be solved with a little modification to the patch,
(despite here can be several more approaches) for instance, like this:
(by fact it's still the same, when mem_alloc instance per each register call
but with same void *allocator)


diff --git a/include/net/xdp_priv.h b/include/net/xdp_priv.h
index 6a8cba6ea79a..c7ad0f41e1b0 100644
--- a/include/net/xdp_priv.h
+++ b/include/net/xdp_priv.h
@@ -18,6 +18,8 @@ struct xdp_mem_allocator {
 	struct rcu_head rcu;
 	struct delayed_work defer_wq;
 	unsigned long defer_warn;
+	unsigned long *refcnt;
+	u32 queue_index;
 };
 
 #endif /* __LINUX_NET_XDP_PRIV_H__ */
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 829377cc83db..a44e3e4c8307 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -64,9 +64,37 @@ static const struct rhashtable_params mem_id_rht_params = {
 	.obj_cmpfn = xdp_mem_id_cmp,
 };
 
+static struct xdp_mem_allocator *xdp_allocator_find(void *allocator)
+{
+	struct xdp_mem_allocator *xae, *xa = NULL;
+	struct rhashtable_iter iter;
+
+	if (!allocator)
+		return xa;
+
+	rhashtable_walk_enter(mem_id_ht, &iter);
+	do {
+		rhashtable_walk_start(&iter);
+
+		while ((xae = rhashtable_walk_next(&iter)) && !IS_ERR(xae)) {
+			if (xae->allocator == allocator) {
+				xa = xae;
+				break;
+			}
+		}
+
+		rhashtable_walk_stop(&iter);
+
+	} while (xae == ERR_PTR(-EAGAIN));
+	rhashtable_walk_exit(&iter);
+
+	return xa;
+}
+
 static void __xdp_mem_allocator_rcu_free(struct rcu_head *rcu)
 {
 	struct xdp_mem_allocator *xa;
+	void *allocator;
 
 	xa = container_of(rcu, struct xdp_mem_allocator, rcu);
 
@@ -74,15 +102,27 @@ static void __xdp_mem_allocator_rcu_free(struct rcu_head *rcu)
 	if (xa->mem.type == MEM_TYPE_PAGE_POOL)
 		page_pool_free(xa->page_pool);
 
-	/* Allow this ID to be reused */
-	ida_simple_remove(&mem_id_pool, xa->mem.id);
+	kfree(xa->refcnt);
+	allocator = xa->allocator;
+	while (xa) {
+		xa = xdp_allocator_find(allocator);
+		if (!xa)
+			break;
+
+		mutex_lock(&mem_id_lock);
+		rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params);
+		mutex_unlock(&mem_id_lock);
 
-	/* Poison memory */
-	xa->mem.id = 0xFFFF;
-	xa->mem.type = 0xF0F0;
-	xa->allocator = (void *)0xDEAD9001;
+		/* Allow this ID to be reused */
+		ida_simple_remove(&mem_id_pool, xa->mem.id);
 
-	kfree(xa);
+		/* Poison memory */
+		xa->mem.id = 0xFFFF;
+		xa->mem.type = 0xF0F0;
+		xa->allocator = (void *)0xDEAD9001;
+
+		kfree(xa);
+	}
 }
 
 static bool __mem_id_disconnect(int id, bool force)
@@ -98,6 +138,18 @@ static bool __mem_id_disconnect(int id, bool force)
 		WARN(1, "Request remove non-existing id(%d), driver bug?", id);
 		return true;
 	}
+
+	/* to avoid calling hash lookup twice, decrement refcnt here till it
+	 * reaches zero, then it can be called from workqueue afterwards.
+	 */
+	if (*xa->refcnt)
+		(*xa->refcnt)--;
+
+	if (*xa->refcnt) {
+		mutex_unlock(&mem_id_lock);
+		return true;
+	}
+
 	xa->disconnect_cnt++;
 
 	/* Detects in-flight packet-pages for page_pool */
@@ -106,8 +158,7 @@ static bool __mem_id_disconnect(int id, bool force)
 
 	trace_mem_disconnect(xa, safe_to_remove, force);
 
-	if ((safe_to_remove || force) &&
-	    !rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params))
+	if (safe_to_remove || force)
 		call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
 
 	mutex_unlock(&mem_id_lock);
@@ -316,6 +367,7 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
 			       enum xdp_mem_type type, void *allocator)
 {
 	struct xdp_mem_allocator *xdp_alloc;
+	unsigned long *refcnt = NULL;
 	gfp_t gfp = GFP_KERNEL;
 	int id, errno, ret;
 	void *ptr;
@@ -347,6 +399,19 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
 		}
 	}
 
+	mutex_lock(&mem_id_lock);
+	xdp_alloc = xdp_allocator_find(allocator);
+	if (xdp_alloc) {
+		/* One allocator per queue is supposed only */
+		if (xdp_alloc->queue_index != xdp_rxq->queue_index) {
+			mutex_unlock(&mem_id_lock);
+			return -EINVAL;
+		}
+
+		refcnt = xdp_alloc->refcnt;
+	}
+	mutex_unlock(&mem_id_lock);
+
 	xdp_alloc = kzalloc(sizeof(*xdp_alloc), gfp);
 	if (!xdp_alloc)
 		return -ENOMEM;
@@ -360,6 +425,7 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
 	xdp_rxq->mem.id = id;
 	xdp_alloc->mem  = xdp_rxq->mem;
 	xdp_alloc->allocator = allocator;
+	xdp_alloc->queue_index = xdp_rxq->queue_index;
 
 	/* Insert allocator into ID lookup table */
 	ptr = rhashtable_insert_slow(mem_id_ht, &id, &xdp_alloc->node);
@@ -370,6 +436,16 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
 		goto err;
 	}
 
+	if (!refcnt) {
+		refcnt = kzalloc(sizeof(*xdp_alloc->refcnt), gfp);
+		if (!refcnt) {
+			errno = -ENOMEM;
+			goto err;
+		}
+	}
+
+	(*refcnt)++;
+	xdp_alloc->refcnt = refcnt;
 	mutex_unlock(&mem_id_lock);
 
 	trace_mem_connect(xdp_alloc, xdp_rxq);



>
>
>>  #endif /* __LINUX_NET_XDP_PRIV_H__ */
>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>> index 829377cc83db..4f0ddbb3717a 100644
>> --- a/net/core/xdp.c
>> +++ b/net/core/xdp.c
>> @@ -98,6 +98,18 @@ static bool __mem_id_disconnect(int id, bool force)
>>  		WARN(1, "Request remove non-existing id(%d), driver bug?", id);
>>  		return true;
>>  	}
>> +
>> +	/* to avoid calling hash lookup twice, decrement refcnt here till it
>> +	 * reaches zero, then it can be called from workqueue afterwards.
>> +	 */
>> +	if (xa->refcnt)
>> +		xa->refcnt--;
>> +
>> +	if (xa->refcnt) {
>> +		mutex_unlock(&mem_id_lock);
>> +		return true;
>> +	}
>> +
>>  	xa->disconnect_cnt++;
>>
>>  	/* Detects in-flight packet-pages for page_pool */
>> @@ -312,6 +324,33 @@ static bool __is_supported_mem_type(enum xdp_mem_type type)
>>  	return true;
>>  }
>>
>> +static struct xdp_mem_allocator *xdp_allocator_find(void *allocator)
>> +{
>> +	struct xdp_mem_allocator *xae, *xa = NULL;
>> +	struct rhashtable_iter iter;
>> +
>> +	if (!allocator)
>> +		return xa;
>> +
>> +	rhashtable_walk_enter(mem_id_ht, &iter);
>> +	do {
>> +		rhashtable_walk_start(&iter);
>> +
>> +		while ((xae = rhashtable_walk_next(&iter)) && !IS_ERR(xae)) {
>> +			if (xae->allocator == allocator) {
>> +				xa = xae;
>> +				break;
>> +			}
>> +		}
>> +
>> +		rhashtable_walk_stop(&iter);
>> +
>> +	} while (xae == ERR_PTR(-EAGAIN));
>> +	rhashtable_walk_exit(&iter);
>> +
>> +	return xa;
>> +}
>> +
>>  int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
>>  			       enum xdp_mem_type type, void *allocator)
>>  {
>> @@ -347,6 +386,20 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
>>  		}
>>  	}
>>
>> +	mutex_lock(&mem_id_lock);
>> +	xdp_alloc = xdp_allocator_find(allocator);
>> +	if (xdp_alloc) {
>> +		/* One allocator per queue is supposed only */
>> +		if (xdp_alloc->queue_index != xdp_rxq->queue_index)
>> +			return -EINVAL;
>> +
>> +		xdp_rxq->mem.id = xdp_alloc->mem.id;
>> +		xdp_alloc->refcnt++;
>> +		mutex_unlock(&mem_id_lock);
>> +		return 0;
>> +	}
>> +	mutex_unlock(&mem_id_lock);
>> +
>>  	xdp_alloc = kzalloc(sizeof(*xdp_alloc), gfp);
>>  	if (!xdp_alloc)
>>  		return -ENOMEM;
>> @@ -360,6 +413,8 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
>>  	xdp_rxq->mem.id = id;
>>  	xdp_alloc->mem  = xdp_rxq->mem;
>>  	xdp_alloc->allocator = allocator;
>> +	xdp_alloc->refcnt = 1;
>> +	xdp_alloc->queue_index = xdp_rxq->queue_index;
>>
>>  	/* Insert allocator into ID lookup table */
>>  	ptr = rhashtable_insert_slow(mem_id_ht, &id, &xdp_alloc->node);
>
>
>
>-- 
>Best regards,
>  Jesper Dangaard Brouer
>  MSc.CS, Principal Kernel Engineer at Red Hat
>  LinkedIn: http://www.linkedin.com/in/brouer

-- 
Regards,
Ivan Khoronzhuk

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 net-next 1/5] xdp: allow same allocator usage
  2019-07-04 10:22     ` Ivan Khoronzhuk
@ 2019-07-04 12:41       ` Jesper Dangaard Brouer
  2019-07-04 17:11         ` Ivan Khoronzhuk
  0 siblings, 1 reply; 17+ messages in thread
From: Jesper Dangaard Brouer @ 2019-07-04 12:41 UTC (permalink / raw)
  To: Ivan Khoronzhuk
  Cc: grygorii.strashko, davem, ast, linux-kernel, linux-omap,
	xdp-newbies, ilias.apalodimas, netdev, daniel, jakub.kicinski,
	john.fastabend, brouer

On Thu, 4 Jul 2019 13:22:40 +0300
Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:

> On Wed, Jul 03, 2019 at 07:40:13PM +0200, Jesper Dangaard Brouer wrote:
> >On Wed,  3 Jul 2019 13:18:59 +0300
> >Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
> >  
> >> First of all, it is an absolute requirement that each RX-queue have
> >> their own page_pool object/allocator. And this change is intendant
> >> to handle special case, where a single RX-queue can receive packets
> >> from two different net_devices.
> >>
> >> In order to protect against using same allocator for 2 different rx
> >> queues, add queue_index to xdp_mem_allocator to catch the obvious
> >> mistake where queue_index mismatch, as proposed by Jesper Dangaard
> >> Brouer.
> >>
> >> Adding this on xdp allocator level allows drivers with such dependency
> >> change the allocators w/o modifications.
> >>
> >> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
> >> ---
> >>  include/net/xdp_priv.h |  2 ++
> >>  net/core/xdp.c         | 55 ++++++++++++++++++++++++++++++++++++++++++
> >>  2 files changed, 57 insertions(+)
> >>
> >> diff --git a/include/net/xdp_priv.h b/include/net/xdp_priv.h
> >> index 6a8cba6ea79a..9858a4057842 100644
> >> --- a/include/net/xdp_priv.h
> >> +++ b/include/net/xdp_priv.h
> >> @@ -18,6 +18,8 @@ struct xdp_mem_allocator {
> >>  	struct rcu_head rcu;
> >>  	struct delayed_work defer_wq;
> >>  	unsigned long defer_warn;
> >> +	unsigned long refcnt;
> >> +	u32 queue_index;
> >>  };  
> >
> >I don't like this approach, because I think we need to extend struct
> >xdp_mem_allocator with a net_device pointer, for doing dev_hold(), to
> >correctly handle lifetime issues. (As I tried to explain previously).
> >This will be much harder after this change, which is why I proposed the
> >other patch.  
> My concern comes not from zero also.
> It's partly continuation of not answered questions from here:
> https://lwn.net/ml/netdev/20190625122822.GC6485@khorivan/
> 
> "For me it's important to know only if it means that alloc.count is
> freed at first call of __mem_id_disconnect() while shutdown.
> The workqueue for the rest is connected only with ring cache protected
> by ring lock and not supposed that alloc.count can be changed while
> workqueue tries to shutdonwn the pool."

Yes.  The alloc.count is only freed on first call.  I considered
changing the shutdown API, to have two shutdown calls, where the call
used from the work-queue will not have the loop emptying alloc.count,
but instead have a WARN_ON(alloc.count), as it MUST be empty (once is
code running from work-queue).

> So patch you propose to leave works only because of luck, because fast
> cache is cleared before workqueue is scheduled and no races between two
> workqueues for fast cache later. I'm not really against this patch, but
> I have to try smth better.

It is not "luck".  It does the correct thing as we never enter the
while loop in __page_pool_request_shutdown() from a work-queue, but it
is not obvious from the code.  The not-so-nice thing is that two
work-queue shutdowns will be racing with each-other, in the multi
netdev use-case, but access to the ptr_ring is safe/locked.


> So, the patch is fine only because of specific of page_pool implementation.
> I'm not sure that in future similar workqueue completion will be lucky for
> another allocator (it easily can happen due to xdp frame can live longer
> than an allocator). Similar problem can happen with other drivers having
> same allocator, that can use zca (potentially can use smth similar),
> af_xdp api allows to switch on it or some other allocators....
> 
> But not the essence. The concern about adding smth new to the allocator
> later, like net device, can be solved with a little modification to the patch,
> (despite here can be several more approaches) for instance, like this:
> (by fact it's still the same, when mem_alloc instance per each register call
> but with same void *allocator)

Okay, below you have demonstrated that it is possible to extend later,
although it will make the code (IMHO) "ugly" and more complicated...
So, I guess, I cannot object to this not being extensible.

 
> diff --git a/include/net/xdp_priv.h b/include/net/xdp_priv.h
> index 6a8cba6ea79a..c7ad0f41e1b0 100644
> --- a/include/net/xdp_priv.h
> +++ b/include/net/xdp_priv.h
> @@ -18,6 +18,8 @@ struct xdp_mem_allocator {
>  	struct rcu_head rcu;
>  	struct delayed_work defer_wq;
>  	unsigned long defer_warn;
> +	unsigned long *refcnt;
> +	u32 queue_index;
>  };
>  
>  #endif /* __LINUX_NET_XDP_PRIV_H__ */
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index 829377cc83db..a44e3e4c8307 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -64,9 +64,37 @@ static const struct rhashtable_params mem_id_rht_params = {
>  	.obj_cmpfn = xdp_mem_id_cmp,
>  };
>  
> +static struct xdp_mem_allocator *xdp_allocator_find(void *allocator)
> +{
> +	struct xdp_mem_allocator *xae, *xa = NULL;
> +	struct rhashtable_iter iter;
> +
> +	if (!allocator)
> +		return xa;
> +
> +	rhashtable_walk_enter(mem_id_ht, &iter);
> +	do {
> +		rhashtable_walk_start(&iter);
> +
> +		while ((xae = rhashtable_walk_next(&iter)) && !IS_ERR(xae)) {
> +			if (xae->allocator == allocator) {
> +				xa = xae;
> +				break;
> +			}
> +		}
> +
> +		rhashtable_walk_stop(&iter);
> +
> +	} while (xae == ERR_PTR(-EAGAIN));
> +	rhashtable_walk_exit(&iter);
> +
> +	return xa;
> +}
> +
>  static void __xdp_mem_allocator_rcu_free(struct rcu_head *rcu)
>  {
>  	struct xdp_mem_allocator *xa;
> +	void *allocator;
>  
>  	xa = container_of(rcu, struct xdp_mem_allocator, rcu);
>  
> @@ -74,15 +102,27 @@ static void __xdp_mem_allocator_rcu_free(struct rcu_head *rcu)
>  	if (xa->mem.type == MEM_TYPE_PAGE_POOL)
>  		page_pool_free(xa->page_pool);
>  
> -	/* Allow this ID to be reused */
> -	ida_simple_remove(&mem_id_pool, xa->mem.id);
> +	kfree(xa->refcnt);
> +	allocator = xa->allocator;
> +	while (xa) {
> +		xa = xdp_allocator_find(allocator);
> +		if (!xa)
> +			break;
> +
> +		mutex_lock(&mem_id_lock);
> +		rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params);
> +		mutex_unlock(&mem_id_lock);
>  
> -	/* Poison memory */
> -	xa->mem.id = 0xFFFF;
> -	xa->mem.type = 0xF0F0;
> -	xa->allocator = (void *)0xDEAD9001;
> +		/* Allow this ID to be reused */
> +		ida_simple_remove(&mem_id_pool, xa->mem.id);
>  
> -	kfree(xa);
> +		/* Poison memory */
> +		xa->mem.id = 0xFFFF;
> +		xa->mem.type = 0xF0F0;
> +		xa->allocator = (void *)0xDEAD9001;
> +
> +		kfree(xa);
> +	}
>  }
>  
>  static bool __mem_id_disconnect(int id, bool force)
> @@ -98,6 +138,18 @@ static bool __mem_id_disconnect(int id, bool force)
>  		WARN(1, "Request remove non-existing id(%d), driver bug?", id);
>  		return true;
>  	}
> +
> +	/* to avoid calling hash lookup twice, decrement refcnt here till it
> +	 * reaches zero, then it can be called from workqueue afterwards.
> +	 */
> +	if (*xa->refcnt)
> +		(*xa->refcnt)--;
> +
> +	if (*xa->refcnt) {
> +		mutex_unlock(&mem_id_lock);
> +		return true;
> +	}
> +
>  	xa->disconnect_cnt++;
>  
>  	/* Detects in-flight packet-pages for page_pool */
> @@ -106,8 +158,7 @@ static bool __mem_id_disconnect(int id, bool force)
>  
>  	trace_mem_disconnect(xa, safe_to_remove, force);
>  
> -	if ((safe_to_remove || force) &&
> -	    !rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params))
> +	if (safe_to_remove || force)
>  		call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
>  
>  	mutex_unlock(&mem_id_lock);
> @@ -316,6 +367,7 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
>  			       enum xdp_mem_type type, void *allocator)
>  {
>  	struct xdp_mem_allocator *xdp_alloc;
> +	unsigned long *refcnt = NULL;
>  	gfp_t gfp = GFP_KERNEL;
>  	int id, errno, ret;
>  	void *ptr;
> @@ -347,6 +399,19 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
>  		}
>  	}
>  
> +	mutex_lock(&mem_id_lock);
> +	xdp_alloc = xdp_allocator_find(allocator);
> +	if (xdp_alloc) {
> +		/* One allocator per queue is supposed only */
> +		if (xdp_alloc->queue_index != xdp_rxq->queue_index) {
> +			mutex_unlock(&mem_id_lock);
> +			return -EINVAL;
> +		}
> +
> +		refcnt = xdp_alloc->refcnt;
> +	}
> +	mutex_unlock(&mem_id_lock);
> +
>  	xdp_alloc = kzalloc(sizeof(*xdp_alloc), gfp);
>  	if (!xdp_alloc)
>  		return -ENOMEM;
> @@ -360,6 +425,7 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
>  	xdp_rxq->mem.id = id;
>  	xdp_alloc->mem  = xdp_rxq->mem;
>  	xdp_alloc->allocator = allocator;
> +	xdp_alloc->queue_index = xdp_rxq->queue_index;
>  
>  	/* Insert allocator into ID lookup table */
>  	ptr = rhashtable_insert_slow(mem_id_ht, &id, &xdp_alloc->node);
> @@ -370,6 +436,16 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
>  		goto err;
>  	}
>  
> +	if (!refcnt) {
> +		refcnt = kzalloc(sizeof(*xdp_alloc->refcnt), gfp);
> +		if (!refcnt) {
> +			errno = -ENOMEM;
> +			goto err;
> +		}
> +	}
> +
> +	(*refcnt)++;
> +	xdp_alloc->refcnt = refcnt;
>  	mutex_unlock(&mem_id_lock);
>  
>  	trace_mem_connect(xdp_alloc, xdp_rxq);
> 
 

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 net-next 1/5] xdp: allow same allocator usage
  2019-07-04 12:41       ` Jesper Dangaard Brouer
@ 2019-07-04 17:11         ` Ivan Khoronzhuk
  0 siblings, 0 replies; 17+ messages in thread
From: Ivan Khoronzhuk @ 2019-07-04 17:11 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: grygorii.strashko, davem, ast, linux-kernel, linux-omap,
	xdp-newbies, ilias.apalodimas, netdev, daniel, jakub.kicinski,
	john.fastabend

On Thu, Jul 04, 2019 at 02:41:44PM +0200, Jesper Dangaard Brouer wrote:
>On Thu, 4 Jul 2019 13:22:40 +0300
>Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
>
>> On Wed, Jul 03, 2019 at 07:40:13PM +0200, Jesper Dangaard Brouer wrote:
>> >On Wed,  3 Jul 2019 13:18:59 +0300
>> >Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> wrote:
>> >
>> >> First of all, it is an absolute requirement that each RX-queue have
>> >> their own page_pool object/allocator. And this change is intendant
>> >> to handle special case, where a single RX-queue can receive packets
>> >> from two different net_devices.
>> >>
>> >> In order to protect against using same allocator for 2 different rx
>> >> queues, add queue_index to xdp_mem_allocator to catch the obvious
>> >> mistake where queue_index mismatch, as proposed by Jesper Dangaard
>> >> Brouer.
>> >>
>> >> Adding this on xdp allocator level allows drivers with such dependency
>> >> change the allocators w/o modifications.
>> >>
>> >> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
>> >> ---
>> >>  include/net/xdp_priv.h |  2 ++
>> >>  net/core/xdp.c         | 55 ++++++++++++++++++++++++++++++++++++++++++
>> >>  2 files changed, 57 insertions(+)
>> >>
>> >> diff --git a/include/net/xdp_priv.h b/include/net/xdp_priv.h
>> >> index 6a8cba6ea79a..9858a4057842 100644
>> >> --- a/include/net/xdp_priv.h
>> >> +++ b/include/net/xdp_priv.h
>> >> @@ -18,6 +18,8 @@ struct xdp_mem_allocator {
>> >>  	struct rcu_head rcu;
>> >>  	struct delayed_work defer_wq;
>> >>  	unsigned long defer_warn;
>> >> +	unsigned long refcnt;
>> >> +	u32 queue_index;
>> >>  };
>> >
>> >I don't like this approach, because I think we need to extend struct
>> >xdp_mem_allocator with a net_device pointer, for doing dev_hold(), to
>> >correctly handle lifetime issues. (As I tried to explain previously).
>> >This will be much harder after this change, which is why I proposed the
>> >other patch.
>> My concern comes not from zero also.
>> It's partly continuation of not answered questions from here:
>> https://lwn.net/ml/netdev/20190625122822.GC6485@khorivan/
>>
>> "For me it's important to know only if it means that alloc.count is
>> freed at first call of __mem_id_disconnect() while shutdown.
>> The workqueue for the rest is connected only with ring cache protected
>> by ring lock and not supposed that alloc.count can be changed while
>> workqueue tries to shutdonwn the pool."
>
>Yes.  The alloc.count is only freed on first call.  I considered
>changing the shutdown API, to have two shutdown calls, where the call
>used from the work-queue will not have the loop emptying alloc.count,
>but instead have a WARN_ON(alloc.count), as it MUST be empty (once is
>code running from work-queue).
>
>> So patch you propose to leave works only because of luck, because fast
>> cache is cleared before workqueue is scheduled and no races between two
>> workqueues for fast cache later. I'm not really against this patch, but
>> I have to try smth better.
>
>It is not "luck".  It does the correct thing as we never enter the
>while loop in __page_pool_request_shutdown() from a work-queue, but it
>is not obvious from the code.  The not-so-nice thing is that two
>work-queue shutdowns will be racing with each-other, in the multi
>netdev use-case, but access to the ptr_ring is safe/locked.

So, having this, and being prudent to generic code changes, lets roll back
to idea from v.4:
https://lkml.org/lkml/2019/6/25/996
but use changes from following patch, reintroducing page destroy:
https://www.spinics.net/lists/netdev/msg583145.html
with appropriate small modifications for cpsw.

In case of some issue connected with it (not supposed), or two/more
allocators used by cpsw, or one more driver having such multi ndev
capabilities (supposed), would be nice to use this link as reference
and it can be base for similar modifications.

Unless Jesper disagrees with this ofc.

I will send v7 soon after verification is completed.

-- 
Regards,
Ivan Khoronzhuk

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v6 net-next 2/5] net: ethernet: ti: davinci_cpdma: add dma mapped submit
  2019-07-03 10:19 ` [PATCH v6 net-next 2/5] net: ethernet: ti: davinci_cpdma: add dma mapped submit Ivan Khoronzhuk
@ 2019-07-05 19:32   ` kbuild test robot
  0 siblings, 0 replies; 17+ messages in thread
From: kbuild test robot @ 2019-07-05 19:32 UTC (permalink / raw)
  To: Ivan Khoronzhuk
  Cc: kbuild-all, grygorii.strashko, hawk, davem, ast, linux-kernel,
	linux-omap, xdp-newbies, ilias.apalodimas, netdev, daniel,
	jakub.kicinski, john.fastabend, Ivan Khoronzhuk

[-- Attachment #1: Type: text/plain, Size: 5251 bytes --]

Hi Ivan,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Ivan-Khoronzhuk/xdp-allow-same-allocator-usage/20190706-003850
config: arm64-allmodconfig (attached as .config)
compiler: aarch64-linux-gcc (GCC) 7.4.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.4.0 make.cross ARCH=arm64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/net//ethernet/ti/davinci_cpdma.c: In function 'cpdma_chan_submit_si':
>> drivers/net//ethernet/ti/davinci_cpdma.c:1047:12: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
      buffer = (u32)si->data;
               ^
   drivers/net//ethernet/ti/davinci_cpdma.c: In function 'cpdma_chan_idle_submit_mapped':
>> drivers/net//ethernet/ti/davinci_cpdma.c:1114:12: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
     si.data = (void *)(u32)data;
               ^
   drivers/net//ethernet/ti/davinci_cpdma.c: In function 'cpdma_chan_submit_mapped':
   drivers/net//ethernet/ti/davinci_cpdma.c:1164:12: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
     si.data = (void *)(u32)data;
               ^

vim +1047 drivers/net//ethernet/ti/davinci_cpdma.c

  1015	
  1016	static int cpdma_chan_submit_si(struct submit_info *si)
  1017	{
  1018		struct cpdma_chan		*chan = si->chan;
  1019		struct cpdma_ctlr		*ctlr = chan->ctlr;
  1020		int				len = si->len;
  1021		int				swlen = len;
  1022		struct cpdma_desc __iomem	*desc;
  1023		dma_addr_t			buffer;
  1024		u32				mode;
  1025		int				ret;
  1026	
  1027		if (chan->count >= chan->desc_num)	{
  1028			chan->stats.desc_alloc_fail++;
  1029			return -ENOMEM;
  1030		}
  1031	
  1032		desc = cpdma_desc_alloc(ctlr->pool);
  1033		if (!desc) {
  1034			chan->stats.desc_alloc_fail++;
  1035			return -ENOMEM;
  1036		}
  1037	
  1038		if (len < ctlr->params.min_packet_size) {
  1039			len = ctlr->params.min_packet_size;
  1040			chan->stats.runt_transmit_buff++;
  1041		}
  1042	
  1043		mode = CPDMA_DESC_OWNER | CPDMA_DESC_SOP | CPDMA_DESC_EOP;
  1044		cpdma_desc_to_port(chan, mode, si->directed);
  1045	
  1046		if (si->flags & CPDMA_DMA_EXT_MAP) {
> 1047			buffer = (u32)si->data;
  1048			dma_sync_single_for_device(ctlr->dev, buffer, len, chan->dir);
  1049			swlen |= CPDMA_DMA_EXT_MAP;
  1050		} else {
  1051			buffer = dma_map_single(ctlr->dev, si->data, len, chan->dir);
  1052			ret = dma_mapping_error(ctlr->dev, buffer);
  1053			if (ret) {
  1054				cpdma_desc_free(ctlr->pool, desc, 1);
  1055				return -EINVAL;
  1056			}
  1057		}
  1058	
  1059		/* Relaxed IO accessors can be used here as there is read barrier
  1060		 * at the end of write sequence.
  1061		 */
  1062		writel_relaxed(0, &desc->hw_next);
  1063		writel_relaxed(buffer, &desc->hw_buffer);
  1064		writel_relaxed(len, &desc->hw_len);
  1065		writel_relaxed(mode | len, &desc->hw_mode);
  1066		writel_relaxed((uintptr_t)si->token, &desc->sw_token);
  1067		writel_relaxed(buffer, &desc->sw_buffer);
  1068		writel_relaxed(swlen, &desc->sw_len);
  1069		desc_read(desc, sw_len);
  1070	
  1071		__cpdma_chan_submit(chan, desc);
  1072	
  1073		if (chan->state == CPDMA_STATE_ACTIVE && chan->rxfree)
  1074			chan_write(chan, rxfree, 1);
  1075	
  1076		chan->count++;
  1077		return 0;
  1078	}
  1079	
  1080	int cpdma_chan_idle_submit(struct cpdma_chan *chan, void *token, void *data,
  1081				   int len, int directed)
  1082	{
  1083		struct submit_info si;
  1084		unsigned long flags;
  1085		int ret;
  1086	
  1087		si.chan = chan;
  1088		si.token = token;
  1089		si.data = data;
  1090		si.len = len;
  1091		si.directed = directed;
  1092		si.flags = 0;
  1093	
  1094		spin_lock_irqsave(&chan->lock, flags);
  1095		if (chan->state == CPDMA_STATE_TEARDOWN) {
  1096			spin_unlock_irqrestore(&chan->lock, flags);
  1097			return -EINVAL;
  1098		}
  1099	
  1100		ret = cpdma_chan_submit_si(&si);
  1101		spin_unlock_irqrestore(&chan->lock, flags);
  1102		return ret;
  1103	}
  1104	
  1105	int cpdma_chan_idle_submit_mapped(struct cpdma_chan *chan, void *token,
  1106					  dma_addr_t data, int len, int directed)
  1107	{
  1108		struct submit_info si;
  1109		unsigned long flags;
  1110		int ret;
  1111	
  1112		si.chan = chan;
  1113		si.token = token;
> 1114		si.data = (void *)(u32)data;
  1115		si.len = len;
  1116		si.directed = directed;
  1117		si.flags = CPDMA_DMA_EXT_MAP;
  1118	
  1119		spin_lock_irqsave(&chan->lock, flags);
  1120		if (chan->state == CPDMA_STATE_TEARDOWN) {
  1121			spin_unlock_irqrestore(&chan->lock, flags);
  1122			return -EINVAL;
  1123		}
  1124	
  1125		ret = cpdma_chan_submit_si(&si);
  1126		spin_unlock_irqrestore(&chan->lock, flags);
  1127		return ret;
  1128	}
  1129	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 65617 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-07-05 19:33 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-03 10:18 [PATCH v6 net-next 0/5] net: ethernet: ti: cpsw: Add XDP support Ivan Khoronzhuk
2019-07-03 10:18 ` [PATCH v6 net-next 1/5] xdp: allow same allocator usage Ivan Khoronzhuk
2019-07-03 17:40   ` Jesper Dangaard Brouer
2019-07-04 10:22     ` Ivan Khoronzhuk
2019-07-04 12:41       ` Jesper Dangaard Brouer
2019-07-04 17:11         ` Ivan Khoronzhuk
2019-07-03 10:19 ` [PATCH v6 net-next 2/5] net: ethernet: ti: davinci_cpdma: add dma mapped submit Ivan Khoronzhuk
2019-07-05 19:32   ` kbuild test robot
2019-07-03 10:19 ` [PATCH v6 net-next 3/5] net: ethernet: ti: davinci_cpdma: allow desc split while down Ivan Khoronzhuk
2019-07-03 10:19 ` [PATCH v6 net-next 4/5] net: ethernet: ti: cpsw_ethtool: allow res " Ivan Khoronzhuk
2019-07-03 10:19 ` [PATCH v6 net-next 5/5] net: ethernet: ti: cpsw: add XDP support Ivan Khoronzhuk
2019-07-04  9:19   ` Jesper Dangaard Brouer
2019-07-04  9:39     ` Ilias Apalodimas
2019-07-04  9:43       ` Ivan Khoronzhuk
2019-07-04  9:49         ` Ilias Apalodimas
2019-07-04  9:53           ` Ivan Khoronzhuk
2019-07-04  9:45     ` Ivan Khoronzhuk

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).