netdev.vger.kernel.org archive mirror
* [PATCH net-next v4 00/10] tsnep: XDP support
@ 2023-01-09 19:15 Gerhard Engleder
  2023-01-09 19:15 ` [PATCH net-next v4 01/10] tsnep: Use spin_lock_bh for TX Gerhard Engleder
                   ` (9 more replies)
  0 siblings, 10 replies; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-09 19:15 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, edumazet, pabeni, Gerhard Engleder

Implement XDP support for the tsnep driver. I tried to follow existing
drivers like igb/igc as far as possible. Some prework was already done
in previous patch series, so this series includes only the actual XDP
changes.

Thanks to the authors of the NetDev 0x14 slides "Add XDP support on a NIC driver".

Some commits contain changes not related to XDP but found during review
of XDP support patches.

v4:
- remove process context from spin_lock_bh commit message (Alexander Lobakin)
- move tsnep_adapter::state to prevent 4 byte hole (Alexander Lobakin)
- braces for bitops in combination logical ops (Alexander Lobakin)
- make various pointers const (Alexander Lobakin)
- '!i' instead of 'i == 0' (Alexander Lobakin)
- removed redundant braces (Alexander Lobakin)
- squash variables into same line if same type (Alexander Lobakin)
- use fact that ::skb and ::xdpf use same slot for simplification (Alexander Lobakin)
- use u32 for smp_processor_id() (Alexander Lobakin)
- don't add $(tsnep-y) to $(tsnep-objs) (Alexander Lobakin)
- use rev xmas tree in tsnep_netdev_open() (Alexander Lobakin)
- do not move tsnep_queue::napi (Alexander Lobakin)
- call xdp_init_buff() only once (Alexander Lobakin)
- get nq and tx only once for XDP TX (Alexander Lobakin)
- move XDP BPF program setup to end of patch series (Alexander Lobakin)
- check for XDP state change and prevent redundant down-ups (Alexander Lobakin)
- access tsnep_adapter::xdp_prog only with READ_ONCE in RX path (Alexander Lobakin)
- forward NAPI budget to napi_consume_skb() (Alexander Lobakin)
- fix errno leftover in tsnep_xdp_xmit_back() (Dan Carpenter)
- eliminate tsnep_xdp_is_enabled() by setting RX offset during init

v3:
- use spin_lock_bh for TX (Paolo Abeni)
- add comment for XDP TX descriptor available check (Maciej Fijalkowski)
- return value bool for tsnep_xdp_xmit_frame_ring() (Saeed Mahameed)
- do not print DMA mapping error (Saeed Mahameed)
- use reverse xmas tree variable declaration (Saeed Mahameed)
- move struct xdp_rxq_info to end of struct tsnep_rx (Maciej Fijalkowski)
- check __TSNEP_DOWN flag on close to prevent double free (Saeed Mahameed)
- describe TSNEP_RX_INLINE_METADATA_SIZE in comment (Maciej Fijalkowski)
- substract TSNEP_RX_INLINE_METADATA_SIZE after DMA sync (Maciej Fijalkowski)
- use enum tsnep_tx_type for tsnep_xdp_tx_map (Saeed Mahameed)
- use nxmit as loop iterator in tsnep_netdev_xdp_xmit (Saeed Mahameed)
- stop netdev in tsnep_netdev_close() which is called during BPF prog setup

v2:
- move tsnep_xdp_xmit_back() to commit where it is used (Paolo Abeni)
- remove inline from tsnep_rx_offset() (Paolo Abeni)
- remove inline from tsnep_rx_offset_xdp() (Paolo Abeni)
- simplify tsnep_xdp_run_prog() call by moving xdp_status update to it (Paolo Abeni)

Gerhard Engleder (10):
  tsnep: Use spin_lock_bh for TX
  tsnep: Forward NAPI budget to napi_consume_skb()
  tsnep: Do not print DMA mapping error
  tsnep: Add adapter down state
  tsnep: Add XDP TX support
  tsnep: Substract TSNEP_RX_INLINE_METADATA_SIZE once
  tsnep: Prepare RX buffer for XDP support
  tsnep: Add RX queue info for XDP support
  tsnep: Add XDP RX support
  tsnep: Support XDP BPF program setup

 drivers/net/ethernet/engleder/Makefile     |   2 +-
 drivers/net/ethernet/engleder/tsnep.h      |  24 +-
 drivers/net/ethernet/engleder/tsnep_main.c | 460 ++++++++++++++++++---
 drivers/net/ethernet/engleder/tsnep_xdp.c  |  29 ++
 4 files changed, 464 insertions(+), 51 deletions(-)
 create mode 100644 drivers/net/ethernet/engleder/tsnep_xdp.c

-- 
2.30.2


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH net-next v4 01/10] tsnep: Use spin_lock_bh for TX
  2023-01-09 19:15 [PATCH net-next v4 00/10] tsnep: XDP support Gerhard Engleder
@ 2023-01-09 19:15 ` Gerhard Engleder
  2023-01-10 15:49   ` Alexander H Duyck
  2023-01-09 19:15 ` [PATCH net-next v4 02/10] tsnep: Forward NAPI budget to napi_consume_skb() Gerhard Engleder
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-09 19:15 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, edumazet, pabeni, Gerhard Engleder

TX processing is done only within BH context. Therefore, the _irqsave
variant is not necessary.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
---
 drivers/net/ethernet/engleder/tsnep_main.c | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
index bf0190e1d2ea..7cc5e2407809 100644
--- a/drivers/net/ethernet/engleder/tsnep_main.c
+++ b/drivers/net/ethernet/engleder/tsnep_main.c
@@ -434,7 +434,6 @@ static int tsnep_tx_unmap(struct tsnep_tx *tx, int index, int count)
 static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
 					 struct tsnep_tx *tx)
 {
-	unsigned long flags;
 	int count = 1;
 	struct tsnep_tx_entry *entry;
 	int length;
@@ -444,7 +443,7 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
 	if (skb_shinfo(skb)->nr_frags > 0)
 		count += skb_shinfo(skb)->nr_frags;
 
-	spin_lock_irqsave(&tx->lock, flags);
+	spin_lock_bh(&tx->lock);
 
 	if (tsnep_tx_desc_available(tx) < count) {
 		/* ring full, shall not happen because queue is stopped if full
@@ -452,7 +451,7 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
 		 */
 		netif_stop_queue(tx->adapter->netdev);
 
-		spin_unlock_irqrestore(&tx->lock, flags);
+		spin_unlock_bh(&tx->lock);
 
 		return NETDEV_TX_BUSY;
 	}
@@ -468,7 +467,7 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
 
 		tx->dropped++;
 
-		spin_unlock_irqrestore(&tx->lock, flags);
+		spin_unlock_bh(&tx->lock);
 
 		netdev_err(tx->adapter->netdev, "TX DMA map failed\n");
 
@@ -496,20 +495,19 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
 		netif_stop_queue(tx->adapter->netdev);
 	}
 
-	spin_unlock_irqrestore(&tx->lock, flags);
+	spin_unlock_bh(&tx->lock);
 
 	return NETDEV_TX_OK;
 }
 
 static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
 {
-	unsigned long flags;
 	int budget = 128;
 	struct tsnep_tx_entry *entry;
 	int count;
 	int length;
 
-	spin_lock_irqsave(&tx->lock, flags);
+	spin_lock_bh(&tx->lock);
 
 	do {
 		if (tx->read == tx->write)
@@ -568,18 +566,17 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
 		netif_wake_queue(tx->adapter->netdev);
 	}
 
-	spin_unlock_irqrestore(&tx->lock, flags);
+	spin_unlock_bh(&tx->lock);
 
 	return (budget != 0);
 }
 
 static bool tsnep_tx_pending(struct tsnep_tx *tx)
 {
-	unsigned long flags;
 	struct tsnep_tx_entry *entry;
 	bool pending = false;
 
-	spin_lock_irqsave(&tx->lock, flags);
+	spin_lock_bh(&tx->lock);
 
 	if (tx->read != tx->write) {
 		entry = &tx->entry[tx->read];
@@ -589,7 +586,7 @@ static bool tsnep_tx_pending(struct tsnep_tx *tx)
 			pending = true;
 	}
 
-	spin_unlock_irqrestore(&tx->lock, flags);
+	spin_unlock_bh(&tx->lock);
 
 	return pending;
 }
-- 
2.30.2



* [PATCH net-next v4 02/10] tsnep: Forward NAPI budget to napi_consume_skb()
  2023-01-09 19:15 [PATCH net-next v4 00/10] tsnep: XDP support Gerhard Engleder
  2023-01-09 19:15 ` [PATCH net-next v4 01/10] tsnep: Use spin_lock_bh for TX Gerhard Engleder
@ 2023-01-09 19:15 ` Gerhard Engleder
  2023-01-10 15:50   ` Alexander H Duyck
  2023-01-09 19:15 ` [PATCH net-next v4 03/10] tsnep: Do not print DMA mapping error Gerhard Engleder
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-09 19:15 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, edumazet, pabeni, Gerhard Engleder

The NAPI budget must be forwarded to napi_consume_skb(), which uses it
to detect non-NAPI context: a budget of 0 means the skb cannot be
deferred to the NAPI skb cache and must be freed immediately.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
---
 drivers/net/ethernet/engleder/tsnep_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
index 7cc5e2407809..d148ba422b8c 100644
--- a/drivers/net/ethernet/engleder/tsnep_main.c
+++ b/drivers/net/ethernet/engleder/tsnep_main.c
@@ -550,7 +550,7 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
 			skb_tstamp_tx(entry->skb, &hwtstamps);
 		}
 
-		napi_consume_skb(entry->skb, budget);
+		napi_consume_skb(entry->skb, napi_budget);
 		entry->skb = NULL;
 
 		tx->read = (tx->read + count) % TSNEP_RING_SIZE;
-- 
2.30.2



* [PATCH net-next v4 03/10] tsnep: Do not print DMA mapping error
  2023-01-09 19:15 [PATCH net-next v4 00/10] tsnep: XDP support Gerhard Engleder
  2023-01-09 19:15 ` [PATCH net-next v4 01/10] tsnep: Use spin_lock_bh for TX Gerhard Engleder
  2023-01-09 19:15 ` [PATCH net-next v4 02/10] tsnep: Forward NAPI budget to napi_consume_skb() Gerhard Engleder
@ 2023-01-09 19:15 ` Gerhard Engleder
  2023-01-10 15:53   ` Alexander H Duyck
  2023-01-09 19:15 ` [PATCH net-next v4 04/10] tsnep: Add adapter down state Gerhard Engleder
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-09 19:15 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, edumazet, pabeni, Gerhard Engleder

Printing in the data path shall be avoided. DMA mapping errors are
already counted in stats, so printing is not necessary.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
---
 drivers/net/ethernet/engleder/tsnep_main.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
index d148ba422b8c..8c6d6e210494 100644
--- a/drivers/net/ethernet/engleder/tsnep_main.c
+++ b/drivers/net/ethernet/engleder/tsnep_main.c
@@ -469,8 +469,6 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
 
 		spin_unlock_bh(&tx->lock);
 
-		netdev_err(tx->adapter->netdev, "TX DMA map failed\n");
-
 		return NETDEV_TX_OK;
 	}
 	length = retval;
-- 
2.30.2



* [PATCH net-next v4 04/10] tsnep: Add adapter down state
  2023-01-09 19:15 [PATCH net-next v4 00/10] tsnep: XDP support Gerhard Engleder
                   ` (2 preceding siblings ...)
  2023-01-09 19:15 ` [PATCH net-next v4 03/10] tsnep: Do not print DMA mapping error Gerhard Engleder
@ 2023-01-09 19:15 ` Gerhard Engleder
  2023-01-10 16:05   ` Alexander H Duyck
  2023-01-09 19:15 ` [PATCH net-next v4 05/10] tsnep: Add XDP TX support Gerhard Engleder
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-09 19:15 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, edumazet, pabeni, Gerhard Engleder

Add an adapter state with a flag for the down state. This flag will be
used by the XDP TX path to deny TX if the adapter is down.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
---
 drivers/net/ethernet/engleder/tsnep.h      |  1 +
 drivers/net/ethernet/engleder/tsnep_main.c | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/drivers/net/ethernet/engleder/tsnep.h b/drivers/net/ethernet/engleder/tsnep.h
index f93ba48bac3f..d658413ceb14 100644
--- a/drivers/net/ethernet/engleder/tsnep.h
+++ b/drivers/net/ethernet/engleder/tsnep.h
@@ -147,6 +147,7 @@ struct tsnep_adapter {
 	bool suppress_preamble;
 	phy_interface_t phy_mode;
 	struct phy_device *phydev;
+	unsigned long state;
 	int msg_enable;
 
 	struct platform_device *pdev;
diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
index 8c6d6e210494..943de5a09693 100644
--- a/drivers/net/ethernet/engleder/tsnep_main.c
+++ b/drivers/net/ethernet/engleder/tsnep_main.c
@@ -43,6 +43,10 @@
 #define TSNEP_COALESCE_USECS_MAX     ((ECM_INT_DELAY_MASK >> ECM_INT_DELAY_SHIFT) * \
 				      ECM_INT_DELAY_BASE_US + ECM_INT_DELAY_BASE_US - 1)
 
+enum {
+	__TSNEP_DOWN,
+};
+
 static void tsnep_enable_irq(struct tsnep_adapter *adapter, u32 mask)
 {
 	iowrite32(mask, adapter->addr + ECM_INT_ENABLE);
@@ -1138,6 +1142,8 @@ static int tsnep_netdev_open(struct net_device *netdev)
 		tsnep_enable_irq(adapter, adapter->queue[i].irq_mask);
 	}
 
+	clear_bit(__TSNEP_DOWN, &adapter->state);
+
 	return 0;
 
 phy_failed:
@@ -1160,6 +1166,8 @@ static int tsnep_netdev_close(struct net_device *netdev)
 	struct tsnep_adapter *adapter = netdev_priv(netdev);
 	int i;
 
+	set_bit(__TSNEP_DOWN, &adapter->state);
+
 	tsnep_disable_irq(adapter, ECM_INT_LINK);
 	tsnep_phy_close(adapter);
 
@@ -1513,6 +1521,7 @@ static int tsnep_probe(struct platform_device *pdev)
 	adapter->msg_enable = NETIF_MSG_DRV | NETIF_MSG_PROBE |
 			      NETIF_MSG_LINK | NETIF_MSG_IFUP |
 			      NETIF_MSG_IFDOWN | NETIF_MSG_TX_QUEUED;
+	set_bit(__TSNEP_DOWN, &adapter->state);
 
 	netdev->min_mtu = ETH_MIN_MTU;
 	netdev->max_mtu = TSNEP_MAX_FRAME_SIZE;
@@ -1609,6 +1618,8 @@ static int tsnep_remove(struct platform_device *pdev)
 {
 	struct tsnep_adapter *adapter = platform_get_drvdata(pdev);
 
+	set_bit(__TSNEP_DOWN, &adapter->state);
+
 	unregister_netdev(adapter->netdev);
 
 	tsnep_rxnfc_cleanup(adapter);
-- 
2.30.2



* [PATCH net-next v4 05/10] tsnep: Add XDP TX support
  2023-01-09 19:15 [PATCH net-next v4 00/10] tsnep: XDP support Gerhard Engleder
                   ` (3 preceding siblings ...)
  2023-01-09 19:15 ` [PATCH net-next v4 04/10] tsnep: Add adapter down state Gerhard Engleder
@ 2023-01-09 19:15 ` Gerhard Engleder
  2023-01-10 16:56   ` Alexander H Duyck
  2023-01-09 19:15 ` [PATCH net-next v4 06/10] tsnep: Substract TSNEP_RX_INLINE_METADATA_SIZE once Gerhard Engleder
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-09 19:15 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, edumazet, pabeni, Gerhard Engleder

Implement ndo_xdp_xmit() for XDP TX support. Support for fragmented XDP
frames is included.

Also some const, braces, and logic cleanups are done in the normal TX
path to keep both TX paths in sync.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
---
 drivers/net/ethernet/engleder/tsnep.h      |  12 +-
 drivers/net/ethernet/engleder/tsnep_main.c | 211 +++++++++++++++++++--
 2 files changed, 210 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/engleder/tsnep.h b/drivers/net/ethernet/engleder/tsnep.h
index d658413ceb14..9cb267938794 100644
--- a/drivers/net/ethernet/engleder/tsnep.h
+++ b/drivers/net/ethernet/engleder/tsnep.h
@@ -57,6 +57,12 @@ struct tsnep_rxnfc_rule {
 	int location;
 };
 
+enum tsnep_tx_type {
+	TSNEP_TX_TYPE_SKB,
+	TSNEP_TX_TYPE_XDP_TX,
+	TSNEP_TX_TYPE_XDP_NDO,
+};
+
 struct tsnep_tx_entry {
 	struct tsnep_tx_desc *desc;
 	struct tsnep_tx_desc_wb *desc_wb;
@@ -65,7 +71,11 @@ struct tsnep_tx_entry {
 
 	u32 properties;
 
-	struct sk_buff *skb;
+	enum tsnep_tx_type type;
+	union {
+		struct sk_buff *skb;
+		struct xdp_frame *xdpf;
+	};
 	size_t len;
 	DEFINE_DMA_UNMAP_ADDR(dma);
 };
diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
index 943de5a09693..1ae73c706c9e 100644
--- a/drivers/net/ethernet/engleder/tsnep_main.c
+++ b/drivers/net/ethernet/engleder/tsnep_main.c
@@ -310,10 +310,12 @@ static void tsnep_tx_activate(struct tsnep_tx *tx, int index, int length,
 	struct tsnep_tx_entry *entry = &tx->entry[index];
 
 	entry->properties = 0;
+	/* xdpf is union with skb */
 	if (entry->skb) {
 		entry->properties = length & TSNEP_DESC_LENGTH_MASK;
 		entry->properties |= TSNEP_DESC_INTERRUPT_FLAG;
-		if (skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS)
+		if (entry->type == TSNEP_TX_TYPE_SKB &&
+		    (skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS))
 			entry->properties |= TSNEP_DESC_EXTENDED_WRITEBACK_FLAG;
 
 		/* toggle user flag to prevent false acknowledge
@@ -370,7 +372,8 @@ static int tsnep_tx_desc_available(struct tsnep_tx *tx)
 		return tx->read - tx->write - 1;
 }
 
-static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count)
+static int tsnep_tx_map(const struct sk_buff *skb, struct tsnep_tx *tx,
+			int count)
 {
 	struct device *dmadev = tx->adapter->dmadev;
 	struct tsnep_tx_entry *entry;
@@ -382,7 +385,7 @@ static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count)
 	for (i = 0; i < count; i++) {
 		entry = &tx->entry[(tx->write + i) % TSNEP_RING_SIZE];
 
-		if (i == 0) {
+		if (!i) {
 			len = skb_headlen(skb);
 			dma = dma_map_single(dmadev, skb->data, len,
 					     DMA_TO_DEVICE);
@@ -400,6 +403,8 @@ static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count)
 
 		entry->desc->tx = __cpu_to_le64(dma);
 
+		entry->type = TSNEP_TX_TYPE_SKB;
+
 		map_len += len;
 	}
 
@@ -417,12 +422,13 @@ static int tsnep_tx_unmap(struct tsnep_tx *tx, int index, int count)
 		entry = &tx->entry[(index + i) % TSNEP_RING_SIZE];
 
 		if (entry->len) {
-			if (i == 0)
+			if (!i && entry->type == TSNEP_TX_TYPE_SKB)
 				dma_unmap_single(dmadev,
 						 dma_unmap_addr(entry, dma),
 						 dma_unmap_len(entry, len),
 						 DMA_TO_DEVICE);
-			else
+			else if (entry->type == TSNEP_TX_TYPE_SKB ||
+				 entry->type == TSNEP_TX_TYPE_XDP_NDO)
 				dma_unmap_page(dmadev,
 					       dma_unmap_addr(entry, dma),
 					       dma_unmap_len(entry, len),
@@ -482,7 +488,7 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
 
 	for (i = 0; i < count; i++)
 		tsnep_tx_activate(tx, (tx->write + i) % TSNEP_RING_SIZE, length,
-				  i == (count - 1));
+				  i == count - 1);
 	tx->write = (tx->write + count) % TSNEP_RING_SIZE;
 
 	skb_tx_timestamp(skb);
@@ -502,12 +508,133 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
 	return NETDEV_TX_OK;
 }
 
+static int tsnep_xdp_tx_map(const struct xdp_frame *xdpf, struct tsnep_tx *tx,
+			    const struct skb_shared_info *shinfo, int count,
+			    enum tsnep_tx_type type)
+{
+	struct device *dmadev = tx->adapter->dmadev;
+	struct tsnep_tx_entry *entry;
+	const skb_frag_t *frag;
+	struct page *page;
+	unsigned int len;
+	int map_len = 0;
+	dma_addr_t dma;
+	void *data;
+	int i;
+
+	frag = NULL;
+	len = xdpf->len;
+	for (i = 0; i < count; i++) {
+		entry = &tx->entry[(tx->write + i) % TSNEP_RING_SIZE];
+		if (type == TSNEP_TX_TYPE_XDP_NDO) {
+			data = unlikely(frag) ? skb_frag_address(frag) :
+						xdpf->data;
+			dma = dma_map_single(dmadev, data, len, DMA_TO_DEVICE);
+			if (dma_mapping_error(dmadev, dma))
+				return -ENOMEM;
+
+			entry->type = TSNEP_TX_TYPE_XDP_NDO;
+		} else {
+			page = unlikely(frag) ? skb_frag_page(frag) :
+						virt_to_page(xdpf->data);
+			dma = page_pool_get_dma_addr(page);
+			if (unlikely(frag))
+				dma += skb_frag_off(frag);
+			else
+				dma += sizeof(*xdpf) + xdpf->headroom;
+			dma_sync_single_for_device(dmadev, dma, len,
+						   DMA_BIDIRECTIONAL);
+
+			entry->type = TSNEP_TX_TYPE_XDP_TX;
+		}
+
+		entry->len = len;
+		dma_unmap_addr_set(entry, dma, dma);
+
+		entry->desc->tx = __cpu_to_le64(dma);
+
+		map_len += len;
+
+		if (i + 1 < count) {
+			frag = &shinfo->frags[i];
+			len = skb_frag_size(frag);
+		}
+	}
+
+	return map_len;
+}
+
+/* This function requires __netif_tx_lock is held by the caller. */
+static bool tsnep_xdp_xmit_frame_ring(struct xdp_frame *xdpf,
+				      struct tsnep_tx *tx,
+				      enum tsnep_tx_type type)
+{
+	const struct skb_shared_info *shinfo =
+		xdp_get_shared_info_from_frame(xdpf);
+	struct tsnep_tx_entry *entry;
+	int count, length, retval, i;
+
+	count = 1;
+	if (unlikely(xdp_frame_has_frags(xdpf)))
+		count += shinfo->nr_frags;
+
+	spin_lock_bh(&tx->lock);
+
+	/* ensure that TX ring is not filled up by XDP, always MAX_SKB_FRAGS
+	 * will be available for normal TX path and queue is stopped there if
+	 * necessary
+	 */
+	if (tsnep_tx_desc_available(tx) < (MAX_SKB_FRAGS + 1 + count)) {
+		spin_unlock_bh(&tx->lock);
+
+		return false;
+	}
+
+	entry = &tx->entry[tx->write];
+	entry->xdpf = xdpf;
+
+	retval = tsnep_xdp_tx_map(xdpf, tx, shinfo, count, type);
+	if (retval < 0) {
+		tsnep_tx_unmap(tx, tx->write, count);
+		entry->xdpf = NULL;
+
+		tx->dropped++;
+
+		spin_unlock_bh(&tx->lock);
+
+		return false;
+	}
+	length = retval;
+
+	for (i = 0; i < count; i++)
+		tsnep_tx_activate(tx, (tx->write + i) % TSNEP_RING_SIZE, length,
+				  i == count - 1);
+	tx->write = (tx->write + count) % TSNEP_RING_SIZE;
+
+	/* descriptor properties shall be valid before hardware is notified */
+	dma_wmb();
+
+	spin_unlock_bh(&tx->lock);
+
+	return true;
+}
+
+static void tsnep_xdp_xmit_flush(struct tsnep_tx *tx)
+{
+	iowrite32(TSNEP_CONTROL_TX_ENABLE, tx->addr + TSNEP_CONTROL);
+}
+
 static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
 {
-	int budget = 128;
 	struct tsnep_tx_entry *entry;
-	int count;
+	struct xdp_frame_bulk bq;
+	int budget = 128;
 	int length;
+	int count;
+
+	xdp_frame_bulk_init(&bq);
+
+	rcu_read_lock(); /* need for xdp_return_frame_bulk */
 
 	spin_lock_bh(&tx->lock);
 
@@ -527,12 +654,17 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
 		dma_rmb();
 
 		count = 1;
-		if (skb_shinfo(entry->skb)->nr_frags > 0)
+		if (entry->type == TSNEP_TX_TYPE_SKB &&
+		    skb_shinfo(entry->skb)->nr_frags > 0)
 			count += skb_shinfo(entry->skb)->nr_frags;
+		else if (entry->type != TSNEP_TX_TYPE_SKB &&
+			 xdp_frame_has_frags(entry->xdpf))
+			count += xdp_get_shared_info_from_frame(entry->xdpf)->nr_frags;
 
 		length = tsnep_tx_unmap(tx, tx->read, count);
 
-		if ((skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS) &&
+		if (entry->type == TSNEP_TX_TYPE_SKB &&
+		    (skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS) &&
 		    (__le32_to_cpu(entry->desc_wb->properties) &
 		     TSNEP_DESC_EXTENDED_WRITEBACK_FLAG)) {
 			struct skb_shared_hwtstamps hwtstamps;
@@ -552,7 +684,18 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
 			skb_tstamp_tx(entry->skb, &hwtstamps);
 		}
 
-		napi_consume_skb(entry->skb, napi_budget);
+		switch (entry->type) {
+		case TSNEP_TX_TYPE_SKB:
+			napi_consume_skb(entry->skb, napi_budget);
+			break;
+		case TSNEP_TX_TYPE_XDP_TX:
+			xdp_return_frame_rx_napi(entry->xdpf);
+			break;
+		case TSNEP_TX_TYPE_XDP_NDO:
+			xdp_return_frame_bulk(entry->xdpf, &bq);
+			break;
+		}
+		/* xdpf is union with skb */
 		entry->skb = NULL;
 
 		tx->read = (tx->read + count) % TSNEP_RING_SIZE;
@@ -570,7 +713,11 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
 
 	spin_unlock_bh(&tx->lock);
 
-	return (budget != 0);
+	xdp_flush_frame_bulk(&bq);
+
+	rcu_read_unlock();
+
+	return budget != 0;
 }
 
 static bool tsnep_tx_pending(struct tsnep_tx *tx)
@@ -1330,6 +1477,45 @@ static ktime_t tsnep_netdev_get_tstamp(struct net_device *netdev,
 	return ns_to_ktime(timestamp);
 }
 
+static int tsnep_netdev_xdp_xmit(struct net_device *dev, int n,
+				 struct xdp_frame **xdp, u32 flags)
+{
+	struct tsnep_adapter *adapter = netdev_priv(dev);
+	u32 cpu = smp_processor_id();
+	struct netdev_queue *nq;
+	int nxmit, queue;
+	bool xmit;
+
+	if (unlikely(test_bit(__TSNEP_DOWN, &adapter->state)))
+		return -ENETDOWN;
+
+	if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
+		return -EINVAL;
+
+	queue = cpu % adapter->num_tx_queues;
+	nq = netdev_get_tx_queue(adapter->netdev, queue);
+
+	__netif_tx_lock(nq, cpu);
+
+	/* Avoid transmit queue timeout since we share it with the slow path */
+	txq_trans_cond_update(nq);
+
+	for (nxmit = 0; nxmit < n; nxmit++) {
+		xmit = tsnep_xdp_xmit_frame_ring(xdp[nxmit],
+						 &adapter->tx[queue],
+						 TSNEP_TX_TYPE_XDP_NDO);
+		if (!xmit)
+			break;
+	}
+
+	if (flags & XDP_XMIT_FLUSH)
+		tsnep_xdp_xmit_flush(&adapter->tx[queue]);
+
+	__netif_tx_unlock(nq);
+
+	return nxmit;
+}
+
 static const struct net_device_ops tsnep_netdev_ops = {
 	.ndo_open = tsnep_netdev_open,
 	.ndo_stop = tsnep_netdev_close,
@@ -1341,6 +1527,7 @@ static const struct net_device_ops tsnep_netdev_ops = {
 	.ndo_set_features = tsnep_netdev_set_features,
 	.ndo_get_tstamp = tsnep_netdev_get_tstamp,
 	.ndo_setup_tc = tsnep_tc_setup,
+	.ndo_xdp_xmit = tsnep_netdev_xdp_xmit,
 };
 
 static int tsnep_mac_init(struct tsnep_adapter *adapter)
-- 
2.30.2



* [PATCH net-next v4 06/10] tsnep: Substract TSNEP_RX_INLINE_METADATA_SIZE once
  2023-01-09 19:15 [PATCH net-next v4 00/10] tsnep: XDP support Gerhard Engleder
                   ` (4 preceding siblings ...)
  2023-01-09 19:15 ` [PATCH net-next v4 05/10] tsnep: Add XDP TX support Gerhard Engleder
@ 2023-01-09 19:15 ` Gerhard Engleder
  2023-01-10 17:05   ` Alexander H Duyck
  2023-01-09 19:15 ` [PATCH net-next v4 07/10] tsnep: Prepare RX buffer for XDP support Gerhard Engleder
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-09 19:15 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, edumazet, pabeni, Gerhard Engleder

Subtract the size of the metadata in front of the received data only
once. This simplifies the RX code.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
---
 drivers/net/ethernet/engleder/tsnep_main.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
index 1ae73c706c9e..1110530ec639 100644
--- a/drivers/net/ethernet/engleder/tsnep_main.c
+++ b/drivers/net/ethernet/engleder/tsnep_main.c
@@ -980,7 +980,7 @@ static struct sk_buff *tsnep_build_skb(struct tsnep_rx *rx, struct page *page,
 
 	/* update pointers within the skb to store the data */
 	skb_reserve(skb, TSNEP_SKB_PAD + TSNEP_RX_INLINE_METADATA_SIZE);
-	__skb_put(skb, length - TSNEP_RX_INLINE_METADATA_SIZE - ETH_FCS_LEN);
+	__skb_put(skb, length - ETH_FCS_LEN);
 
 	if (rx->adapter->hwtstamp_config.rx_filter == HWTSTAMP_FILTER_ALL) {
 		struct skb_shared_hwtstamps *hwtstamps = skb_hwtstamps(skb);
@@ -1052,6 +1052,13 @@ static int tsnep_rx_poll(struct tsnep_rx *rx, struct napi_struct *napi,
 		dma_sync_single_range_for_cpu(dmadev, entry->dma, TSNEP_SKB_PAD,
 					      length, dma_dir);
 
+		/* RX metadata with timestamps is in front of actual data,
+		 * subtract metadata size to get length of actual data and
+		 * consider metadata size as offset of actual data during RX
+		 * processing
+		 */
+		length -= TSNEP_RX_INLINE_METADATA_SIZE;
+
 		rx->read = (rx->read + 1) % TSNEP_RING_SIZE;
 		desc_available++;
 
@@ -1060,7 +1067,7 @@ static int tsnep_rx_poll(struct tsnep_rx *rx, struct napi_struct *napi,
 			page_pool_release_page(rx->page_pool, entry->page);
 
 			rx->packets++;
-			rx->bytes += length - TSNEP_RX_INLINE_METADATA_SIZE;
+			rx->bytes += length;
 			if (skb->pkt_type == PACKET_MULTICAST)
 				rx->multicast++;
 
-- 
2.30.2



* [PATCH net-next v4 07/10] tsnep: Prepare RX buffer for XDP support
  2023-01-09 19:15 [PATCH net-next v4 00/10] tsnep: XDP support Gerhard Engleder
                   ` (5 preceding siblings ...)
  2023-01-09 19:15 ` [PATCH net-next v4 06/10] tsnep: Substract TSNEP_RX_INLINE_METADATA_SIZE once Gerhard Engleder
@ 2023-01-09 19:15 ` Gerhard Engleder
  2023-01-09 19:15 ` [PATCH net-next v4 08/10] tsnep: Add RX queue info " Gerhard Engleder
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-09 19:15 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, edumazet, pabeni, Gerhard Engleder

Introduce tsnep_adapter::xdp_prog, which will later signal that XDP is
enabled.

Reserve XDP_PACKET_HEADROOM in front of RX buffer if XDP is enabled.
Also set DMA direction properly in this case.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
---
 drivers/net/ethernet/engleder/tsnep.h      |  3 +++
 drivers/net/ethernet/engleder/tsnep_main.c | 22 ++++++++++++++--------
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/engleder/tsnep.h b/drivers/net/ethernet/engleder/tsnep.h
index 9cb267938794..855738d31d73 100644
--- a/drivers/net/ethernet/engleder/tsnep.h
+++ b/drivers/net/ethernet/engleder/tsnep.h
@@ -117,6 +117,7 @@ struct tsnep_rx {
 	struct tsnep_adapter *adapter;
 	void __iomem *addr;
 	int queue_index;
+	unsigned int offset;
 
 	void *page[TSNEP_RING_PAGE_COUNT];
 	dma_addr_t page_dma[TSNEP_RING_PAGE_COUNT];
@@ -183,6 +184,8 @@ struct tsnep_adapter {
 	int rxnfc_count;
 	int rxnfc_max;
 
+	struct bpf_prog *xdp_prog;
+
 	int num_tx_queues;
 	struct tsnep_tx tx[TSNEP_MAX_QUEUES];
 	int num_rx_queues;
diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
index 1110530ec639..0c9669edb2dd 100644
--- a/drivers/net/ethernet/engleder/tsnep_main.c
+++ b/drivers/net/ethernet/engleder/tsnep_main.c
@@ -26,9 +26,10 @@
 #include <linux/etherdevice.h>
 #include <linux/phy.h>
 #include <linux/iopoll.h>
+#include <linux/bpf.h>
 
 #define TSNEP_SKB_PAD (NET_SKB_PAD + NET_IP_ALIGN)
-#define TSNEP_HEADROOM ALIGN(TSNEP_SKB_PAD, 4)
+#define TSNEP_HEADROOM ALIGN(max(TSNEP_SKB_PAD, XDP_PACKET_HEADROOM), 4)
 #define TSNEP_MAX_RX_BUF_SIZE (PAGE_SIZE - TSNEP_HEADROOM - \
 			       SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
 
@@ -838,9 +839,10 @@ static int tsnep_rx_ring_init(struct tsnep_rx *rx)
 	pp_params.pool_size = TSNEP_RING_SIZE;
 	pp_params.nid = dev_to_node(dmadev);
 	pp_params.dev = dmadev;
-	pp_params.dma_dir = DMA_FROM_DEVICE;
+	pp_params.dma_dir = !!rx->adapter->xdp_prog ?
+			    DMA_BIDIRECTIONAL : DMA_FROM_DEVICE;
 	pp_params.max_len = TSNEP_MAX_RX_BUF_SIZE;
-	pp_params.offset = TSNEP_SKB_PAD;
+	pp_params.offset = rx->offset;
 	rx->page_pool = page_pool_create(&pp_params);
 	if (IS_ERR(rx->page_pool)) {
 		retval = PTR_ERR(rx->page_pool);
@@ -875,7 +877,7 @@ static void tsnep_rx_set_page(struct tsnep_rx *rx, struct tsnep_rx_entry *entry,
 	entry->page = page;
 	entry->len = TSNEP_MAX_RX_BUF_SIZE;
 	entry->dma = page_pool_get_dma_addr(entry->page);
-	entry->desc->rx = __cpu_to_le64(entry->dma + TSNEP_SKB_PAD);
+	entry->desc->rx = __cpu_to_le64(entry->dma + rx->offset);
 }
 
 static int tsnep_rx_alloc_buffer(struct tsnep_rx *rx, int index)
@@ -979,14 +981,14 @@ static struct sk_buff *tsnep_build_skb(struct tsnep_rx *rx, struct page *page,
 		return NULL;
 
 	/* update pointers within the skb to store the data */
-	skb_reserve(skb, TSNEP_SKB_PAD + TSNEP_RX_INLINE_METADATA_SIZE);
+	skb_reserve(skb, rx->offset + TSNEP_RX_INLINE_METADATA_SIZE);
 	__skb_put(skb, length - ETH_FCS_LEN);
 
 	if (rx->adapter->hwtstamp_config.rx_filter == HWTSTAMP_FILTER_ALL) {
 		struct skb_shared_hwtstamps *hwtstamps = skb_hwtstamps(skb);
 		struct tsnep_rx_inline *rx_inline =
 			(struct tsnep_rx_inline *)(page_address(page) +
-						   TSNEP_SKB_PAD);
+						   rx->offset);
 
 		skb_shinfo(skb)->tx_flags |=
 			SKBTX_HW_TSTAMP_NETDEV;
@@ -1046,10 +1048,10 @@ static int tsnep_rx_poll(struct tsnep_rx *rx, struct napi_struct *napi,
 		 */
 		dma_rmb();
 
-		prefetch(page_address(entry->page) + TSNEP_SKB_PAD);
+		prefetch(page_address(entry->page) + rx->offset);
 		length = __le32_to_cpu(entry->desc_wb->properties) &
 			 TSNEP_DESC_LENGTH_MASK;
-		dma_sync_single_range_for_cpu(dmadev, entry->dma, TSNEP_SKB_PAD,
+		dma_sync_single_range_for_cpu(dmadev, entry->dma, rx->offset,
 					      length, dma_dir);
 
 		/* RX metadata with timestamps is in front of actual data,
@@ -1111,6 +1113,10 @@ static int tsnep_rx_open(struct tsnep_adapter *adapter, void __iomem *addr,
 	rx->adapter = adapter;
 	rx->addr = addr;
 	rx->queue_index = queue_index;
+	if (!!adapter->xdp_prog)
+		rx->offset = XDP_PACKET_HEADROOM;
+	else
+		rx->offset = TSNEP_SKB_PAD;
 
 	retval = tsnep_rx_ring_init(rx);
 	if (retval)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH net-next v4 08/10] tsnep: Add RX queue info for XDP support
  2023-01-09 19:15 [PATCH net-next v4 00/10] tsnep: XDP support Gerhard Engleder
                   ` (6 preceding siblings ...)
  2023-01-09 19:15 ` [PATCH net-next v4 07/10] tsnep: Prepare RX buffer for XDP support Gerhard Engleder
@ 2023-01-09 19:15 ` Gerhard Engleder
  2023-01-10 17:29   ` Alexander H Duyck
  2023-01-09 19:15 ` [PATCH net-next v4 09/10] tsnep: Add XDP RX support Gerhard Engleder
  2023-01-09 19:15 ` [PATCH net-next v4 10/10] tsnep: Support XDP BPF program setup Gerhard Engleder
  9 siblings, 1 reply; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-09 19:15 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, edumazet, pabeni, Gerhard Engleder, Saeed Mahameed

Register xdp_rxq_info with page_pool memory model. This is needed for
XDP buffer handling.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Saeed Mahameed <saeed@kernel.org>
---
 drivers/net/ethernet/engleder/tsnep.h      |  2 ++
 drivers/net/ethernet/engleder/tsnep_main.c | 39 ++++++++++++++++------
 2 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/engleder/tsnep.h b/drivers/net/ethernet/engleder/tsnep.h
index 855738d31d73..2268ff793edf 100644
--- a/drivers/net/ethernet/engleder/tsnep.h
+++ b/drivers/net/ethernet/engleder/tsnep.h
@@ -134,6 +134,8 @@ struct tsnep_rx {
 	u32 dropped;
 	u32 multicast;
 	u32 alloc_failed;
+
+	struct xdp_rxq_info xdp_rxq;
 };
 
 struct tsnep_queue {
diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
index 0c9669edb2dd..451ad1849b9d 100644
--- a/drivers/net/ethernet/engleder/tsnep_main.c
+++ b/drivers/net/ethernet/engleder/tsnep_main.c
@@ -792,6 +792,9 @@ static void tsnep_rx_ring_cleanup(struct tsnep_rx *rx)
 		entry->page = NULL;
 	}
 
+	if (xdp_rxq_info_is_reg(&rx->xdp_rxq))
+		xdp_rxq_info_unreg(&rx->xdp_rxq);
+
 	if (rx->page_pool)
 		page_pool_destroy(rx->page_pool);
 
@@ -807,7 +810,7 @@ static void tsnep_rx_ring_cleanup(struct tsnep_rx *rx)
 	}
 }
 
-static int tsnep_rx_ring_init(struct tsnep_rx *rx)
+static int tsnep_rx_ring_init(struct tsnep_rx *rx, unsigned int napi_id)
 {
 	struct device *dmadev = rx->adapter->dmadev;
 	struct tsnep_rx_entry *entry;
@@ -850,6 +853,15 @@ static int tsnep_rx_ring_init(struct tsnep_rx *rx)
 		goto failed;
 	}
 
+	retval = xdp_rxq_info_reg(&rx->xdp_rxq, rx->adapter->netdev,
+				  rx->queue_index, napi_id);
+	if (retval)
+		goto failed;
+	retval = xdp_rxq_info_reg_mem_model(&rx->xdp_rxq, MEM_TYPE_PAGE_POOL,
+					    rx->page_pool);
+	if (retval)
+		goto failed;
+
 	for (i = 0; i < TSNEP_RING_SIZE; i++) {
 		entry = &rx->entry[i];
 		next_entry = &rx->entry[(i + 1) % TSNEP_RING_SIZE];
@@ -1104,7 +1116,8 @@ static bool tsnep_rx_pending(struct tsnep_rx *rx)
 }
 
 static int tsnep_rx_open(struct tsnep_adapter *adapter, void __iomem *addr,
-			 int queue_index, struct tsnep_rx *rx)
+			 unsigned int napi_id, int queue_index,
+			 struct tsnep_rx *rx)
 {
 	dma_addr_t dma;
 	int retval;
@@ -1118,7 +1131,7 @@ static int tsnep_rx_open(struct tsnep_adapter *adapter, void __iomem *addr,
 	else
 		rx->offset = TSNEP_SKB_PAD;
 
-	retval = tsnep_rx_ring_init(rx);
+	retval = tsnep_rx_ring_init(rx, napi_id);
 	if (retval)
 		return retval;
 
@@ -1245,14 +1258,19 @@ static void tsnep_free_irq(struct tsnep_queue *queue, bool first)
 static int tsnep_netdev_open(struct net_device *netdev)
 {
 	struct tsnep_adapter *adapter = netdev_priv(netdev);
-	int i;
-	void __iomem *addr;
 	int tx_queue_index = 0;
 	int rx_queue_index = 0;
-	int retval;
+	unsigned int napi_id;
+	void __iomem *addr;
+	int i, retval;
 
 	for (i = 0; i < adapter->num_queues; i++) {
 		adapter->queue[i].adapter = adapter;
+
+		netif_napi_add(adapter->netdev, &adapter->queue[i].napi,
+			       tsnep_poll);
+		napi_id = adapter->queue[i].napi.napi_id;
+
 		if (adapter->queue[i].tx) {
 			addr = adapter->addr + TSNEP_QUEUE(tx_queue_index);
 			retval = tsnep_tx_open(adapter, addr, tx_queue_index,
@@ -1263,7 +1281,7 @@ static int tsnep_netdev_open(struct net_device *netdev)
 		}
 		if (adapter->queue[i].rx) {
 			addr = adapter->addr + TSNEP_QUEUE(rx_queue_index);
-			retval = tsnep_rx_open(adapter, addr,
+			retval = tsnep_rx_open(adapter, addr, napi_id,
 					       rx_queue_index,
 					       adapter->queue[i].rx);
 			if (retval)
@@ -1295,8 +1313,6 @@ static int tsnep_netdev_open(struct net_device *netdev)
 		goto phy_failed;
 
 	for (i = 0; i < adapter->num_queues; i++) {
-		netif_napi_add(adapter->netdev, &adapter->queue[i].napi,
-			       tsnep_poll);
 		napi_enable(&adapter->queue[i].napi);
 
 		tsnep_enable_irq(adapter, adapter->queue[i].irq_mask);
@@ -1317,6 +1333,8 @@ static int tsnep_netdev_open(struct net_device *netdev)
 			tsnep_rx_close(adapter->queue[i].rx);
 		if (adapter->queue[i].tx)
 			tsnep_tx_close(adapter->queue[i].tx);
+
+		netif_napi_del(&adapter->queue[i].napi);
 	}
 	return retval;
 }
@@ -1335,7 +1353,6 @@ static int tsnep_netdev_close(struct net_device *netdev)
 		tsnep_disable_irq(adapter, adapter->queue[i].irq_mask);
 
 		napi_disable(&adapter->queue[i].napi);
-		netif_napi_del(&adapter->queue[i].napi);
 
 		tsnep_free_irq(&adapter->queue[i], i == 0);
 
@@ -1343,6 +1360,8 @@ static int tsnep_netdev_close(struct net_device *netdev)
 			tsnep_rx_close(adapter->queue[i].rx);
 		if (adapter->queue[i].tx)
 			tsnep_tx_close(adapter->queue[i].tx);
+
+		netif_napi_del(&adapter->queue[i].napi);
 	}
 
 	return 0;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH net-next v4 09/10] tsnep: Add XDP RX support
  2023-01-09 19:15 [PATCH net-next v4 00/10] tsnep: XDP support Gerhard Engleder
                   ` (7 preceding siblings ...)
  2023-01-09 19:15 ` [PATCH net-next v4 08/10] tsnep: Add RX queue info " Gerhard Engleder
@ 2023-01-09 19:15 ` Gerhard Engleder
  2023-01-10 17:40   ` Alexander H Duyck
  2023-01-09 19:15 ` [PATCH net-next v4 10/10] tsnep: Support XDP BPF program setup Gerhard Engleder
  9 siblings, 1 reply; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-09 19:15 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, edumazet, pabeni, Gerhard Engleder

If a BPF program is set up, then run the BPF program for every received
frame and execute the selected action.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
---
 drivers/net/ethernet/engleder/tsnep_main.c | 122 ++++++++++++++++++++-
 1 file changed, 120 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
index 451ad1849b9d..002c879639db 100644
--- a/drivers/net/ethernet/engleder/tsnep_main.c
+++ b/drivers/net/ethernet/engleder/tsnep_main.c
@@ -27,6 +27,7 @@
 #include <linux/phy.h>
 #include <linux/iopoll.h>
 #include <linux/bpf.h>
+#include <linux/bpf_trace.h>
 
 #define TSNEP_SKB_PAD (NET_SKB_PAD + NET_IP_ALIGN)
 #define TSNEP_HEADROOM ALIGN(max(TSNEP_SKB_PAD, XDP_PACKET_HEADROOM), 4)
@@ -44,6 +45,9 @@
 #define TSNEP_COALESCE_USECS_MAX     ((ECM_INT_DELAY_MASK >> ECM_INT_DELAY_SHIFT) * \
 				      ECM_INT_DELAY_BASE_US + ECM_INT_DELAY_BASE_US - 1)
 
+#define TSNEP_XDP_TX		BIT(0)
+#define TSNEP_XDP_REDIRECT	BIT(1)
+
 enum {
 	__TSNEP_DOWN,
 };
@@ -625,6 +629,28 @@ static void tsnep_xdp_xmit_flush(struct tsnep_tx *tx)
 	iowrite32(TSNEP_CONTROL_TX_ENABLE, tx->addr + TSNEP_CONTROL);
 }
 
+static bool tsnep_xdp_xmit_back(struct tsnep_adapter *adapter,
+				struct xdp_buff *xdp,
+				struct netdev_queue *tx_nq, struct tsnep_tx *tx)
+{
+	struct xdp_frame *xdpf = xdp_convert_buff_to_frame(xdp);
+	bool xmit;
+
+	if (unlikely(!xdpf))
+		return false;
+
+	__netif_tx_lock(tx_nq, smp_processor_id());
+
+	/* Avoid transmit queue timeout since we share it with the slow path */
+	txq_trans_cond_update(tx_nq);
+
+	xmit = tsnep_xdp_xmit_frame_ring(xdpf, tx, TSNEP_TX_TYPE_XDP_TX);
+
+	__netif_tx_unlock(tx_nq);
+
+	return xmit;
+}
+
 static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
 {
 	struct tsnep_tx_entry *entry;
@@ -983,6 +1009,62 @@ static int tsnep_rx_refill(struct tsnep_rx *rx, int count, bool reuse)
 	return i;
 }
 
+static bool tsnep_xdp_run_prog(struct tsnep_rx *rx, struct bpf_prog *prog,
+			       struct xdp_buff *xdp, int *status,
+			       struct netdev_queue *tx_nq, struct tsnep_tx *tx)
+{
+	unsigned int length;
+	unsigned int sync;
+	u32 act;
+
+	length = xdp->data_end - xdp->data_hard_start - XDP_PACKET_HEADROOM;
+
+	act = bpf_prog_run_xdp(prog, xdp);
+
+	/* Due xdp_adjust_tail: DMA sync for_device cover max len CPU touch */
+	sync = xdp->data_end - xdp->data_hard_start - XDP_PACKET_HEADROOM;
+	sync = max(sync, length);
+
+	switch (act) {
+	case XDP_PASS:
+		return false;
+	case XDP_TX:
+		if (!tsnep_xdp_xmit_back(rx->adapter, xdp, tx_nq, tx))
+			goto out_failure;
+		*status |= TSNEP_XDP_TX;
+		return true;
+	case XDP_REDIRECT:
+		if (xdp_do_redirect(rx->adapter->netdev, xdp, prog) < 0)
+			goto out_failure;
+		*status |= TSNEP_XDP_REDIRECT;
+		return true;
+	default:
+		bpf_warn_invalid_xdp_action(rx->adapter->netdev, prog, act);
+		fallthrough;
+	case XDP_ABORTED:
+out_failure:
+		trace_xdp_exception(rx->adapter->netdev, prog, act);
+		fallthrough;
+	case XDP_DROP:
+		page_pool_put_page(rx->page_pool, virt_to_head_page(xdp->data),
+				   sync, true);
+		return true;
+	}
+}
+
+static void tsnep_finalize_xdp(struct tsnep_adapter *adapter, int status,
+			       struct netdev_queue *tx_nq, struct tsnep_tx *tx)
+{
+	if (status & TSNEP_XDP_TX) {
+		__netif_tx_lock(tx_nq, smp_processor_id());
+		tsnep_xdp_xmit_flush(tx);
+		__netif_tx_unlock(tx_nq);
+	}
+
+	if (status & TSNEP_XDP_REDIRECT)
+		xdp_do_flush();
+}
+
 static struct sk_buff *tsnep_build_skb(struct tsnep_rx *rx, struct page *page,
 				       int length)
 {
@@ -1018,15 +1100,29 @@ static int tsnep_rx_poll(struct tsnep_rx *rx, struct napi_struct *napi,
 			 int budget)
 {
 	struct device *dmadev = rx->adapter->dmadev;
-	int desc_available;
-	int done = 0;
 	enum dma_data_direction dma_dir;
 	struct tsnep_rx_entry *entry;
+	struct netdev_queue *tx_nq;
+	struct bpf_prog *prog;
+	struct xdp_buff xdp;
 	struct sk_buff *skb;
+	struct tsnep_tx *tx;
+	int desc_available;
+	int xdp_status = 0;
+	int done = 0;
 	int length;
 
 	desc_available = tsnep_rx_desc_available(rx);
 	dma_dir = page_pool_get_dma_dir(rx->page_pool);
+	prog = READ_ONCE(rx->adapter->xdp_prog);
+	if (prog) {
+		int queue = smp_processor_id() % rx->adapter->num_tx_queues;
+
+		tx_nq = netdev_get_tx_queue(rx->adapter->netdev, queue);
+		tx = &rx->adapter->tx[queue];
+
+		xdp_init_buff(&xdp, PAGE_SIZE, &rx->xdp_rxq);
+	}
 
 	while (likely(done < budget) && (rx->read != rx->write)) {
 		entry = &rx->entry[rx->read];
@@ -1076,6 +1172,25 @@ static int tsnep_rx_poll(struct tsnep_rx *rx, struct napi_struct *napi,
 		rx->read = (rx->read + 1) % TSNEP_RING_SIZE;
 		desc_available++;
 
+		if (prog) {
+			bool consume;
+
+			xdp_prepare_buff(&xdp, page_address(entry->page),
+					 XDP_PACKET_HEADROOM + TSNEP_RX_INLINE_METADATA_SIZE,
+					 length, false);
+
+			consume = tsnep_xdp_run_prog(rx, prog, &xdp,
+						     &xdp_status, tx_nq, tx);
+			if (consume) {
+				rx->packets++;
+				rx->bytes += length;
+
+				entry->page = NULL;
+
+				continue;
+			}
+		}
+
 		skb = tsnep_build_skb(rx, entry->page, length);
 		if (skb) {
 			page_pool_release_page(rx->page_pool, entry->page);
@@ -1094,6 +1209,9 @@ static int tsnep_rx_poll(struct tsnep_rx *rx, struct napi_struct *napi,
 		entry->page = NULL;
 	}
 
+	if (xdp_status)
+		tsnep_finalize_xdp(rx->adapter, xdp_status, tx_nq, tx);
+
 	if (desc_available)
 		tsnep_rx_refill(rx, desc_available, false);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH net-next v4 10/10] tsnep: Support XDP BPF program setup
  2023-01-09 19:15 [PATCH net-next v4 00/10] tsnep: XDP support Gerhard Engleder
                   ` (8 preceding siblings ...)
  2023-01-09 19:15 ` [PATCH net-next v4 09/10] tsnep: Add XDP RX support Gerhard Engleder
@ 2023-01-09 19:15 ` Gerhard Engleder
  2023-01-10 17:33   ` Alexander H Duyck
  9 siblings, 1 reply; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-09 19:15 UTC (permalink / raw)
  To: netdev; +Cc: davem, kuba, edumazet, pabeni, Gerhard Engleder

Implement setup of BPF programs for XDP RX path with command
XDP_SETUP_PROG of ndo_bpf(). This is the final step for XDP RX path
support.

tsnep_netdev_close() is called directly during BPF program setup. Add
netif_carrier_off() and netif_tx_stop_all_queues() calls to signal to
the network stack that the device is down. Otherwise the network stack
would continue transmitting packets.

The return value of tsnep_netdev_open() is not checked during BPF
program setup like in other drivers. Forwarding the return value would
result in a bpf_prog_put() call in dev_xdp_install(), which would make
removal of the BPF program necessary.

If tsnep_netdev_open() fails during BPF program setup, then the network
stack would call tsnep_netdev_close() anyway. Thus, tsnep_netdev_close()
now checks whether the device is already down.

Additionally remove $(tsnep-y) from $(tsnep-objs) because it is added
automatically.

Test results with A53 1.2GHz:

XDP_DROP (samples/bpf/xdp1)
proto 17:     883878 pkt/s

XDP_TX (samples/bpf/xdp2)
proto 17:     255693 pkt/s

XDP_REDIRECT (samples/bpf/xdpsock)
 sock0@eth2:0 rxdrop xdp-drv
                   pps            pkts           1.00
rx                 855,582        5,404,523
tx                 0              0

XDP_REDIRECT (samples/bpf/xdp_redirect)
eth2->eth1         613,267 rx/s   0 err,drop/s   613,272 xmit/s

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
---
 drivers/net/ethernet/engleder/Makefile     |  2 +-
 drivers/net/ethernet/engleder/tsnep.h      |  6 +++++
 drivers/net/ethernet/engleder/tsnep_main.c | 25 ++++++++++++++++---
 drivers/net/ethernet/engleder/tsnep_xdp.c  | 29 ++++++++++++++++++++++
 4 files changed, 58 insertions(+), 4 deletions(-)
 create mode 100644 drivers/net/ethernet/engleder/tsnep_xdp.c

diff --git a/drivers/net/ethernet/engleder/Makefile b/drivers/net/ethernet/engleder/Makefile
index b6e3b16623de..b98135f65eb7 100644
--- a/drivers/net/ethernet/engleder/Makefile
+++ b/drivers/net/ethernet/engleder/Makefile
@@ -6,5 +6,5 @@
 obj-$(CONFIG_TSNEP) += tsnep.o
 
 tsnep-objs := tsnep_main.o tsnep_ethtool.o tsnep_ptp.o tsnep_tc.o \
-	      tsnep_rxnfc.o $(tsnep-y)
+	      tsnep_rxnfc.o tsnep_xdp.o
 tsnep-$(CONFIG_TSNEP_SELFTESTS) += tsnep_selftests.o
diff --git a/drivers/net/ethernet/engleder/tsnep.h b/drivers/net/ethernet/engleder/tsnep.h
index 2268ff793edf..550aae24c8b9 100644
--- a/drivers/net/ethernet/engleder/tsnep.h
+++ b/drivers/net/ethernet/engleder/tsnep.h
@@ -197,6 +197,9 @@ struct tsnep_adapter {
 	struct tsnep_queue queue[TSNEP_MAX_QUEUES];
 };
 
+int tsnep_netdev_open(struct net_device *netdev);
+int tsnep_netdev_close(struct net_device *netdev);
+
 extern const struct ethtool_ops tsnep_ethtool_ops;
 
 int tsnep_ptp_init(struct tsnep_adapter *adapter);
@@ -220,6 +223,9 @@ int tsnep_rxnfc_add_rule(struct tsnep_adapter *adapter,
 int tsnep_rxnfc_del_rule(struct tsnep_adapter *adapter,
 			 struct ethtool_rxnfc *cmd);
 
+int tsnep_xdp_setup_prog(struct tsnep_adapter *adapter, struct bpf_prog *prog,
+			 struct netlink_ext_ack *extack);
+
 #if IS_ENABLED(CONFIG_TSNEP_SELFTESTS)
 int tsnep_ethtool_get_test_count(void);
 void tsnep_ethtool_get_test_strings(u8 *data);
diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
index 002c879639db..57c35c74dc08 100644
--- a/drivers/net/ethernet/engleder/tsnep_main.c
+++ b/drivers/net/ethernet/engleder/tsnep_main.c
@@ -1373,7 +1373,7 @@ static void tsnep_free_irq(struct tsnep_queue *queue, bool first)
 	memset(queue->name, 0, sizeof(queue->name));
 }
 
-static int tsnep_netdev_open(struct net_device *netdev)
+int tsnep_netdev_open(struct net_device *netdev)
 {
 	struct tsnep_adapter *adapter = netdev_priv(netdev);
 	int tx_queue_index = 0;
@@ -1436,6 +1436,8 @@ static int tsnep_netdev_open(struct net_device *netdev)
 		tsnep_enable_irq(adapter, adapter->queue[i].irq_mask);
 	}
 
+	netif_tx_start_all_queues(adapter->netdev);
+
 	clear_bit(__TSNEP_DOWN, &adapter->state);
 
 	return 0;
@@ -1457,12 +1459,16 @@ static int tsnep_netdev_open(struct net_device *netdev)
 	return retval;
 }
 
-static int tsnep_netdev_close(struct net_device *netdev)
+int tsnep_netdev_close(struct net_device *netdev)
 {
 	struct tsnep_adapter *adapter = netdev_priv(netdev);
 	int i;
 
-	set_bit(__TSNEP_DOWN, &adapter->state);
+	if (test_and_set_bit(__TSNEP_DOWN, &adapter->state))
+		return 0;
+
+	netif_carrier_off(netdev);
+	netif_tx_stop_all_queues(netdev);
 
 	tsnep_disable_irq(adapter, ECM_INT_LINK);
 	tsnep_phy_close(adapter);
@@ -1627,6 +1633,18 @@ static ktime_t tsnep_netdev_get_tstamp(struct net_device *netdev,
 	return ns_to_ktime(timestamp);
 }
 
+static int tsnep_netdev_bpf(struct net_device *dev, struct netdev_bpf *bpf)
+{
+	struct tsnep_adapter *adapter = netdev_priv(dev);
+
+	switch (bpf->command) {
+	case XDP_SETUP_PROG:
+		return tsnep_xdp_setup_prog(adapter, bpf->prog, bpf->extack);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
 static int tsnep_netdev_xdp_xmit(struct net_device *dev, int n,
 				 struct xdp_frame **xdp, u32 flags)
 {
@@ -1677,6 +1695,7 @@ static const struct net_device_ops tsnep_netdev_ops = {
 	.ndo_set_features = tsnep_netdev_set_features,
 	.ndo_get_tstamp = tsnep_netdev_get_tstamp,
 	.ndo_setup_tc = tsnep_tc_setup,
+	.ndo_bpf = tsnep_netdev_bpf,
 	.ndo_xdp_xmit = tsnep_netdev_xdp_xmit,
 };
 
diff --git a/drivers/net/ethernet/engleder/tsnep_xdp.c b/drivers/net/ethernet/engleder/tsnep_xdp.c
new file mode 100644
index 000000000000..5ced32cd9bb7
--- /dev/null
+++ b/drivers/net/ethernet/engleder/tsnep_xdp.c
@@ -0,0 +1,29 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2022 Gerhard Engleder <gerhard@engleder-embedded.com> */
+
+#include <linux/if_vlan.h>
+#include <net/xdp_sock_drv.h>
+
+#include "tsnep.h"
+
+int tsnep_xdp_setup_prog(struct tsnep_adapter *adapter, struct bpf_prog *prog,
+			 struct netlink_ext_ack *extack)
+{
+	struct net_device *dev = adapter->netdev;
+	struct bpf_prog *old_prog;
+	bool need_reset, running;
+
+	running = netif_running(dev);
+	need_reset = !!adapter->xdp_prog != !!prog;
+	if (running && need_reset)
+		tsnep_netdev_close(dev);
+
+	old_prog = xchg(&adapter->xdp_prog, prog);
+	if (old_prog)
+		bpf_prog_put(old_prog);
+
+	if (running && need_reset)
+		tsnep_netdev_open(dev);
+
+	return 0;
+}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 01/10] tsnep: Use spin_lock_bh for TX
  2023-01-09 19:15 ` [PATCH net-next v4 01/10] tsnep: Use spin_lock_bh for TX Gerhard Engleder
@ 2023-01-10 15:49   ` Alexander H Duyck
  2023-01-10 19:44     ` Gerhard Engleder
  0 siblings, 1 reply; 36+ messages in thread
From: Alexander H Duyck @ 2023-01-10 15:49 UTC (permalink / raw)
  To: Gerhard Engleder, netdev; +Cc: davem, kuba, edumazet, pabeni

On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
> TX processing is done only within BH context. Therefore, _irqsafe
> variant is not necessary.
> 
> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>

Rather than reducing the context of this, why not drop it completely? It
doesn't look like you are running with the NETIF_F_LLTX flag set so
from what I can tell it looks like you are taking the Tx lock in the
xmit path. So Tx queues are protected with the Tx queue lock at the
netdev level via the HARD_TX_LOCK macro.

Since it is already being used in the Tx path to protect multiple
access you could probably just look at getting rid of it entirely.

The only caveat you would need to watch out for is a race between the
cleaning and transmitting which can be addressed via a few barriers
like what was done in the intel drivers via something like the
__ixgbe_maybe_stop_tx function and the logic to wake the queue in the
clean function.

Alternatively if you really feel you need this in the non-xmit path
functions you could just drop the lock and instead use __netif_tx_lock
for those spots that are accessing the queue outside the normal
transmit path.
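The stop/wake scheme referenced here (as in __ixgbe_maybe_stop_tx) can be modeled in plain userspace C roughly as below. All names, the ring size, and the descriptor thresholds are illustrative assumptions for the sketch, not tsnep or ixgbe code; the kernel versions use netif_stop_queue()/netif_wake_queue() and smp_mb() where the comments indicate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the ixgbe-style stop/wake pattern.
 * RING_SIZE and MAX_DESC_PER_FRAME are assumed values for illustration.
 */
#define RING_SIZE 256
#define MAX_DESC_PER_FRAME 17 /* roughly MAX_SKB_FRAGS + 1 */

struct ring {
	int read;     /* advanced by the clean path */
	int write;    /* advanced by the xmit path */
	bool stopped; /* models netif_queue_stopped() */
};

static int desc_available(const struct ring *r)
{
	if (r->read <= r->write)
		return RING_SIZE - r->write + r->read - 1;
	return r->read - r->write - 1;
}

/* Called from xmit: stop the queue before the ring can overflow. In the
 * kernel, an smp_mb() sits between marking the queue stopped and
 * re-checking the free count, which closes the race with the clean path.
 */
static bool maybe_stop_tx(struct ring *r, int needed)
{
	if (desc_available(r) >= needed)
		return false;
	r->stopped = true; /* netif_stop_queue() */
	/* smp_mb() here in the kernel, then re-check */
	if (desc_available(r) >= needed) {
		r->stopped = false;
		return false;
	}
	return true;
}

/* Called from the clean path after descriptors were reclaimed. */
static void clean_wake(struct ring *r)
{
	if (r->stopped && desc_available(r) >= 2 * MAX_DESC_PER_FRAME)
		r->stopped = false; /* netif_wake_queue() */
}
```

With this pattern the producer and consumer coordinate purely through the ring indices and the stopped flag, so no extra spin lock is needed on the fast path.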

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 02/10] tsnep: Forward NAPI budget to napi_consume_skb()
  2023-01-09 19:15 ` [PATCH net-next v4 02/10] tsnep: Forward NAPI budget to napi_consume_skb() Gerhard Engleder
@ 2023-01-10 15:50   ` Alexander H Duyck
  0 siblings, 0 replies; 36+ messages in thread
From: Alexander H Duyck @ 2023-01-10 15:50 UTC (permalink / raw)
  To: Gerhard Engleder, netdev; +Cc: davem, kuba, edumazet, pabeni

On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
> NAPI budget must be forwarded to napi_consume_skb(). It is used to
> detect non-NAPI context.
> 
> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
> ---
>  drivers/net/ethernet/engleder/tsnep_main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
> index 7cc5e2407809..d148ba422b8c 100644
> --- a/drivers/net/ethernet/engleder/tsnep_main.c
> +++ b/drivers/net/ethernet/engleder/tsnep_main.c
> @@ -550,7 +550,7 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
>  			skb_tstamp_tx(entry->skb, &hwtstamps);
>  		}
>  
> -		napi_consume_skb(entry->skb, budget);
> +		napi_consume_skb(entry->skb, napi_budget);
>  		entry->skb = NULL;
>  
>  		tx->read = (tx->read + count) % TSNEP_RING_SIZE;

Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 03/10] tsnep: Do not print DMA mapping error
  2023-01-09 19:15 ` [PATCH net-next v4 03/10] tsnep: Do not print DMA mapping error Gerhard Engleder
@ 2023-01-10 15:53   ` Alexander H Duyck
  2023-01-10 19:47     ` Gerhard Engleder
  0 siblings, 1 reply; 36+ messages in thread
From: Alexander H Duyck @ 2023-01-10 15:53 UTC (permalink / raw)
  To: Gerhard Engleder, netdev; +Cc: davem, kuba, edumazet, pabeni

On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
> Printing in the data path should be avoided. A DMA mapping error is
> already counted in stats, so printing is not necessary.
> 
> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
> ---
>  drivers/net/ethernet/engleder/tsnep_main.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
> index d148ba422b8c..8c6d6e210494 100644
> --- a/drivers/net/ethernet/engleder/tsnep_main.c
> +++ b/drivers/net/ethernet/engleder/tsnep_main.c
> @@ -469,8 +469,6 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
>  
>  		spin_unlock_bh(&tx->lock);
>  
> -		netdev_err(tx->adapter->netdev, "TX DMA map failed\n");
> -
>  		return NETDEV_TX_OK;
>  	}
>  	length = retval;

It might be nice to add a stat to indicate that this is specifically a
mapping error rather than just incrementing dropped, but that could
also be done in a future patch.
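The suggestion amounts to keeping a dedicated cause counter next to the generic drop counter, roughly as in this sketch; the struct and field names are hypothetical, not the driver's actual stats layout:

```c
#include <assert.h>

/* Hypothetical per-queue TX stats with a dedicated counter for DMA
 * mapping failures, so the failure cause stays visible (e.g. via
 * ethtool stats) instead of being folded into the generic drop count.
 */
struct tx_stats {
	unsigned long dropped;        /* all dropped frames */
	unsigned long dma_map_failed; /* subset: DMA mapping errors */
};

static void on_dma_map_error(struct tx_stats *s)
{
	s->dropped++;
	s->dma_map_failed++; /* record the specific cause */
}
```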

Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 04/10] tsnep: Add adapter down state
  2023-01-09 19:15 ` [PATCH net-next v4 04/10] tsnep: Add adapter down state Gerhard Engleder
@ 2023-01-10 16:05   ` Alexander H Duyck
  2023-01-10 20:16     ` Gerhard Engleder
  0 siblings, 1 reply; 36+ messages in thread
From: Alexander H Duyck @ 2023-01-10 16:05 UTC (permalink / raw)
  To: Gerhard Engleder, netdev; +Cc: davem, kuba, edumazet, pabeni

On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
> Add adapter state with flag for down state. This flag will be used by
> the XDP TX path to deny TX if adapter is down.
> 
> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>

What value is this bit adding?

From what I can tell, you could probably just use netif_carrier_ok in
place of this and actually get better coverage in terms of identifying
the state in which the Tx queue is able to function. So in your XDP_TX
patch you could do that if you really need it.

As far as the use in your close function it is redundant since the
IFF_UP is only set if ndo_open completes, and ndo_stop is only called
if IFF_UP is set. So your down flag would be redundant with !IFF_UP in
that case.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 05/10] tsnep: Add XDP TX support
  2023-01-09 19:15 ` [PATCH net-next v4 05/10] tsnep: Add XDP TX support Gerhard Engleder
@ 2023-01-10 16:56   ` Alexander H Duyck
  2023-01-10 21:07     ` Gerhard Engleder
  0 siblings, 1 reply; 36+ messages in thread
From: Alexander H Duyck @ 2023-01-10 16:56 UTC (permalink / raw)
  To: Gerhard Engleder, netdev; +Cc: davem, kuba, edumazet, pabeni

On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
> Implement ndo_xdp_xmit() for XDP TX support. Support for fragmented XDP
> frames is included.
> 
> Also some const, braces and logic clean ups are done in normal TX path
> to keep both TX paths in sync.
> 
> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
> ---
>  drivers/net/ethernet/engleder/tsnep.h      |  12 +-
>  drivers/net/ethernet/engleder/tsnep_main.c | 211 +++++++++++++++++++--
>  2 files changed, 210 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/net/ethernet/engleder/tsnep.h b/drivers/net/ethernet/engleder/tsnep.h
> index d658413ceb14..9cb267938794 100644
> --- a/drivers/net/ethernet/engleder/tsnep.h
> +++ b/drivers/net/ethernet/engleder/tsnep.h
> @@ -57,6 +57,12 @@ struct tsnep_rxnfc_rule {
>  	int location;
>  };
>  
> +enum tsnep_tx_type {
> +	TSNEP_TX_TYPE_SKB,
> +	TSNEP_TX_TYPE_XDP_TX,
> +	TSNEP_TX_TYPE_XDP_NDO,
> +};
> +

I'm more a fan of using a bitmap rather than an enum for these types of
things, since otherwise you are having to check for 3 possible values
for everything.

Also you may want to reorder this so that SKB is the last option. With
that 0/1 would be your two XDP values and 2 would be your SKB value. It
should make it easier for this to be treated more as a decision tree.
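The bitmap idea can be sketched as below; the flag names mirror the enum in the patch, but the encoding and the combined masks are illustrative assumptions, not the driver's final API:

```c
#include <assert.h>

/* TX buffer type as bit flags instead of an enum, so related cases can
 * be tested with a single mask instead of comparing against each value.
 */
#define TSNEP_TX_TYPE_SKB     (1U << 0)
#define TSNEP_TX_TYPE_XDP_TX  (1U << 1)
#define TSNEP_TX_TYPE_XDP_NDO (1U << 2)

/* Buffers that the driver DMA-mapped itself and must dma_unmap_*():
 * SKB buffers and XDP_NDO frames, but not XDP_TX pages, which come from
 * the page pool already mapped.
 */
#define TSNEP_TX_TYPE_MAP (TSNEP_TX_TYPE_SKB | TSNEP_TX_TYPE_XDP_NDO)

static int needs_unmap(unsigned int type)
{
	return (type & TSNEP_TX_TYPE_MAP) != 0; /* one test, not two */
}
```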

>  struct tsnep_tx_entry {
>  	struct tsnep_tx_desc *desc;
>  	struct tsnep_tx_desc_wb *desc_wb;
> @@ -65,7 +71,11 @@ struct tsnep_tx_entry {
>  
>  	u32 properties;
>  
> -	struct sk_buff *skb;
> +	enum tsnep_tx_type type;
> +	union {
> +		struct sk_buff *skb;
> +		struct xdp_frame *xdpf;
> +	};
>  	size_t len;
>  	DEFINE_DMA_UNMAP_ADDR(dma);
>  };
> diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
> index 943de5a09693..1ae73c706c9e 100644
> --- a/drivers/net/ethernet/engleder/tsnep_main.c
> +++ b/drivers/net/ethernet/engleder/tsnep_main.c
> @@ -310,10 +310,12 @@ static void tsnep_tx_activate(struct tsnep_tx *tx, int index, int length,
>  	struct tsnep_tx_entry *entry = &tx->entry[index];
>  
>  	entry->properties = 0;
> +	/* xdpf is union with skb */
>  	if (entry->skb) {
>  		entry->properties = length & TSNEP_DESC_LENGTH_MASK;
>  		entry->properties |= TSNEP_DESC_INTERRUPT_FLAG;
> -		if (skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS)
> +		if (entry->type == TSNEP_TX_TYPE_SKB &&
> +		    (skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS))
>  			entry->properties |= TSNEP_DESC_EXTENDED_WRITEBACK_FLAG;
>  
>  		/* toggle user flag to prevent false acknowledge
> @@ -370,7 +372,8 @@ static int tsnep_tx_desc_available(struct tsnep_tx *tx)
>  		return tx->read - tx->write - 1;
>  }
>  
> -static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count)
> +static int tsnep_tx_map(const struct sk_buff *skb, struct tsnep_tx *tx,
> +			int count)
>  {

This change to const doesn't add anything since this is a static
function. You could probably just skip making this change since the
function will likely be inlined anyway.

>  	struct device *dmadev = tx->adapter->dmadev;
>  	struct tsnep_tx_entry *entry;
> @@ -382,7 +385,7 @@ static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count)
>  	for (i = 0; i < count; i++) {
>  		entry = &tx->entry[(tx->write + i) % TSNEP_RING_SIZE];
>  
> -		if (i == 0) {
> +		if (!i) {
>  			len = skb_headlen(skb);
>  			dma = dma_map_single(dmadev, skb->data, len,
>  					     DMA_TO_DEVICE);
> @@ -400,6 +403,8 @@ static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count)
>  
>  		entry->desc->tx = __cpu_to_le64(dma);
>  
> +		entry->type = TSNEP_TX_TYPE_SKB;
> +
>  		map_len += len;
>  	}
>  

I wonder if it wouldn't be better to change this so that you have a 4th
type for just "PAGE" or "FRAGMENT" since you only really need to
identify this as SKB or XDP on the first buffer, and then all the rest
are just going to be unmapped as page regardless of if it is XDP or
SKB.

> @@ -417,12 +422,13 @@ static int tsnep_tx_unmap(struct tsnep_tx *tx, int index, int count)
>  		entry = &tx->entry[(index + i) % TSNEP_RING_SIZE];
>  
>  		if (entry->len) {
> -			if (i == 0)
> +			if (!i && entry->type == TSNEP_TX_TYPE_SKB)
>  				dma_unmap_single(dmadev,
>  						 dma_unmap_addr(entry, dma),
>  						 dma_unmap_len(entry, len),
>  						 DMA_TO_DEVICE);
> -			else
> +			else if (entry->type == TSNEP_TX_TYPE_SKB ||
> +				 entry->type == TSNEP_TX_TYPE_XDP_NDO)
>  				dma_unmap_page(dmadev,
>  					       dma_unmap_addr(entry, dma),
>  					       dma_unmap_len(entry, len),

Rather than perform 2 checks here you could just verify type !=
TYPE_XDP_TX which would save you a check.

> @@ -482,7 +488,7 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
>  
>  	for (i = 0; i < count; i++)
>  		tsnep_tx_activate(tx, (tx->write + i) % TSNEP_RING_SIZE, length,
> -				  i == (count - 1));
> +				  i == count - 1);
>  	tx->write = (tx->write + count) % TSNEP_RING_SIZE;
>  
>  	skb_tx_timestamp(skb);
> @@ -502,12 +508,133 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
>  	return NETDEV_TX_OK;
>  }
>  
> +static int tsnep_xdp_tx_map(const struct xdp_frame *xdpf, struct tsnep_tx *tx,
> +			    const struct skb_shared_info *shinfo, int count,
> +			    enum tsnep_tx_type type)

Again the const here isn't adding any value since this is a static
function and will likely be inlined into the function below which calls
it.

> +{
> +	struct device *dmadev = tx->adapter->dmadev;
> +	struct tsnep_tx_entry *entry;
> +	const skb_frag_t *frag;
> +	struct page *page;
> +	unsigned int len;
> +	int map_len = 0;
> +	dma_addr_t dma;
> +	void *data;
> +	int i;
> +
> +	frag = NULL;
> +	len = xdpf->len;
> +	for (i = 0; i < count; i++) {
> +		entry = &tx->entry[(tx->write + i) % TSNEP_RING_SIZE];
> +		if (type == TSNEP_TX_TYPE_XDP_NDO) {
> +			data = unlikely(frag) ? skb_frag_address(frag) :
> +						xdpf->data;
> +			dma = dma_map_single(dmadev, data, len, DMA_TO_DEVICE);
> +			if (dma_mapping_error(dmadev, dma))
> +				return -ENOMEM;
> +
> +			entry->type = TSNEP_TX_TYPE_XDP_NDO;
> +		} else {
> +			page = unlikely(frag) ? skb_frag_page(frag) :
> +						virt_to_page(xdpf->data);
> +			dma = page_pool_get_dma_addr(page);
> +			if (unlikely(frag))
> +				dma += skb_frag_off(frag);
> +			else
> +				dma += sizeof(*xdpf) + xdpf->headroom;
> +			dma_sync_single_for_device(dmadev, dma, len,
> +						   DMA_BIDIRECTIONAL);
> +
> +			entry->type = TSNEP_TX_TYPE_XDP_TX;
> +		}
> +
> +		entry->len = len;
> +		dma_unmap_addr_set(entry, dma, dma);
> +
> +		entry->desc->tx = __cpu_to_le64(dma);
> +
> +		map_len += len;
> +
> +		if (i + 1 < count) {
> +			frag = &shinfo->frags[i];
> +			len = skb_frag_size(frag);
> +		}
> +	}
> +
> +	return map_len;
> +}
> +
> +/* This function requires __netif_tx_lock is held by the caller. */
> +static bool tsnep_xdp_xmit_frame_ring(struct xdp_frame *xdpf,
> +				      struct tsnep_tx *tx,
> +				      enum tsnep_tx_type type)
> +{
> +	const struct skb_shared_info *shinfo =
> +		xdp_get_shared_info_from_frame(xdpf);
> +	struct tsnep_tx_entry *entry;
> +	int count, length, retval, i;
> +
> +	count = 1;
> +	if (unlikely(xdp_frame_has_frags(xdpf)))
> +		count += shinfo->nr_frags;
> +
> +	spin_lock_bh(&tx->lock);
> +
> +	/* ensure that TX ring is not filled up by XDP, always MAX_SKB_FRAGS
> +	 * will be available for normal TX path and queue is stopped there if
> +	 * necessary
> +	 */
> +	if (tsnep_tx_desc_available(tx) < (MAX_SKB_FRAGS + 1 + count)) {
> +		spin_unlock_bh(&tx->lock);
> +
> +		return false;
> +	}
> +
> +	entry = &tx->entry[tx->write];
> +	entry->xdpf = xdpf;
> +
> +	retval = tsnep_xdp_tx_map(xdpf, tx, shinfo, count, type);
> +	if (retval < 0) {
> +		tsnep_tx_unmap(tx, tx->write, count);
> +		entry->xdpf = NULL;
> +
> +		tx->dropped++;
> +
> +		spin_unlock_bh(&tx->lock);
> +
> +		return false;
> +	}
> +	length = retval;
> +
> +	for (i = 0; i < count; i++)
> +		tsnep_tx_activate(tx, (tx->write + i) % TSNEP_RING_SIZE, length,
> +				  i == count - 1);
> +	tx->write = (tx->write + count) % TSNEP_RING_SIZE;
> +
> +	/* descriptor properties shall be valid before hardware is notified */
> +	dma_wmb();
> +
> +	spin_unlock_bh(&tx->lock);
> +
> +	return true;
> +}
> +
> +static void tsnep_xdp_xmit_flush(struct tsnep_tx *tx)
> +{
> +	iowrite32(TSNEP_CONTROL_TX_ENABLE, tx->addr + TSNEP_CONTROL);
> +}
> +
>  static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
>  {
> -	int budget = 128;
>  	struct tsnep_tx_entry *entry;
> -	int count;
> +	struct xdp_frame_bulk bq;
> +	int budget = 128;
>  	int length;
> +	int count;
> +
> +	xdp_frame_bulk_init(&bq);
> +
> +	rcu_read_lock(); /* need for xdp_return_frame_bulk */
>  

You should be able to get rid of both of these. See comments below.

>  	spin_lock_bh(&tx->lock);
>  
> @@ -527,12 +654,17 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
>  		dma_rmb();
>  
>  		count = 1;
> -		if (skb_shinfo(entry->skb)->nr_frags > 0)
> +		if (entry->type == TSNEP_TX_TYPE_SKB &&
> +		    skb_shinfo(entry->skb)->nr_frags > 0)
>  			count += skb_shinfo(entry->skb)->nr_frags;
> +		else if (entry->type != TSNEP_TX_TYPE_SKB &&
> +			 xdp_frame_has_frags(entry->xdpf))
> +			count += xdp_get_shared_info_from_frame(entry->xdpf)->nr_frags;
>  
>  		length = tsnep_tx_unmap(tx, tx->read, count);
>  
> -		if ((skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS) &&
> +		if (entry->type == TSNEP_TX_TYPE_SKB &&
> +		    (skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS) &&
>  		    (__le32_to_cpu(entry->desc_wb->properties) &
>  		     TSNEP_DESC_EXTENDED_WRITEBACK_FLAG)) {
>  			struct skb_shared_hwtstamps hwtstamps;
> @@ -552,7 +684,18 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
>  			skb_tstamp_tx(entry->skb, &hwtstamps);
>  		}
>  
> -		napi_consume_skb(entry->skb, napi_budget);
> +		switch (entry->type) {
> +		case TSNEP_TX_TYPE_SKB:
> +			napi_consume_skb(entry->skb, napi_budget);
> +			break;
> +		case TSNEP_TX_TYPE_XDP_TX:
> +			xdp_return_frame_rx_napi(entry->xdpf);
> +			break;
> +		case TSNEP_TX_TYPE_XDP_NDO:
> +			xdp_return_frame_bulk(entry->xdpf, &bq);
> +			break;
> +		}
> +		/* xdpf is union with skb */
>  		entry->skb = NULL;
>  
>  		tx->read = (tx->read + count) % TSNEP_RING_SIZE;

If I am not mistaken I think you can use xdp_return_frame_rx_napi for
both of these without having to resort to the bulk approach since the
tsnep_tx_poll is operating in the NAPI polling context.

When you free the buffers outside of polling context you would then
want to use the xdp_return_frame_bulk. So for example if you have any
buffers that weren't transmitted when you get to the point of
tsnep_tx_ring_cleanup then you would want to unmap and free them there
using something like the xdp_return_frame or xdp_return_frame_bulk.

> @@ -570,7 +713,11 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
>  
>  	spin_unlock_bh(&tx->lock);
>  
> -	return (budget != 0);
> +	xdp_flush_frame_bulk(&bq);
> +
> +	rcu_read_unlock();
> +
> +	return budget != 0;
>  }
>  
>  static bool tsnep_tx_pending(struct tsnep_tx *tx)
> @@ -1330,6 +1477,45 @@ static ktime_t tsnep_netdev_get_tstamp(struct net_device *netdev,
>  	return ns_to_ktime(timestamp);
>  }
>  
> +static int tsnep_netdev_xdp_xmit(struct net_device *dev, int n,
> +				 struct xdp_frame **xdp, u32 flags)
> +{
> +	struct tsnep_adapter *adapter = netdev_priv(dev);
> +	u32 cpu = smp_processor_id();
> +	struct netdev_queue *nq;
> +	int nxmit, queue;
> +	bool xmit;
> +
> +	if (unlikely(test_bit(__TSNEP_DOWN, &adapter->state)))
> +		return -ENETDOWN;
> +

Again, here you are probably better off checking netif_carrier_ok
rather than adding a new flag.

> +	if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
> +		return -EINVAL;
> +
> +	queue = cpu % adapter->num_tx_queues;

Modulo math here unconditionally will be expensive. Take a look at
using reciprocal_scale like we do in skb_tx_hash.

> +	nq = netdev_get_tx_queue(adapter->netdev, queue);
> +
> +	__netif_tx_lock(nq, cpu);
> +
> +	/* Avoid transmit queue timeout since we share it with the slow path */
> +	txq_trans_cond_update(nq);
> +

This may not be the right spot for it. It might make more sense to do
this after the call to xmit_frame_ring, after the break case, in order
to guarantee that you actually put something on the ring before
updating the state. Otherwise you can stall the Tx ring without
triggering the watchdog just by hammering this.

> +	for (nxmit = 0; nxmit < n; nxmit++) {
> +		xmit = tsnep_xdp_xmit_frame_ring(xdp[nxmit],
> +						 &adapter->tx[queue],
> +						 TSNEP_TX_TYPE_XDP_NDO);
> +		if (!xmit)
> +			break;

So move txq_trans_cond_update(nq) here.

> +	}
> +
> +	if (flags & XDP_XMIT_FLUSH)
> +		tsnep_xdp_xmit_flush(&adapter->tx[queue]);
> +
> +	__netif_tx_unlock(nq);
> +
> +	return nxmit;
> +}
> +
>  static const struct net_device_ops tsnep_netdev_ops = {
>  	.ndo_open = tsnep_netdev_open,
>  	.ndo_stop = tsnep_netdev_close,
> @@ -1341,6 +1527,7 @@ static const struct net_device_ops tsnep_netdev_ops = {
>  	.ndo_set_features = tsnep_netdev_set_features,
>  	.ndo_get_tstamp = tsnep_netdev_get_tstamp,
>  	.ndo_setup_tc = tsnep_tc_setup,
> +	.ndo_xdp_xmit = tsnep_netdev_xdp_xmit,
>  };
>  
>  static int tsnep_mac_init(struct tsnep_adapter *adapter)


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 06/10] tsnep: Substract TSNEP_RX_INLINE_METADATA_SIZE once
  2023-01-09 19:15 ` [PATCH net-next v4 06/10] tsnep: Substract TSNEP_RX_INLINE_METADATA_SIZE once Gerhard Engleder
@ 2023-01-10 17:05   ` Alexander H Duyck
  0 siblings, 0 replies; 36+ messages in thread
From: Alexander H Duyck @ 2023-01-10 17:05 UTC (permalink / raw)
  To: Gerhard Engleder, netdev; +Cc: davem, kuba, edumazet, pabeni

On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
> Subtrace size of metadata in front of received data only once. This
> simplifies the RX code.
> 
> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>

s/Subtrace/Subtract/

Otherwise the patch looks good to me.

Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>



* Re: [PATCH net-next v4 08/10] tsnep: Add RX queue info for XDP support
  2023-01-09 19:15 ` [PATCH net-next v4 08/10] tsnep: Add RX queue info " Gerhard Engleder
@ 2023-01-10 17:29   ` Alexander H Duyck
  2023-01-10 21:21     ` Gerhard Engleder
  0 siblings, 1 reply; 36+ messages in thread
From: Alexander H Duyck @ 2023-01-10 17:29 UTC (permalink / raw)
  To: Gerhard Engleder, netdev; +Cc: davem, kuba, edumazet, pabeni, Saeed Mahameed

On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
> Register xdp_rxq_info with page_pool memory model. This is needed for
> XDP buffer handling.
> 
> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
> Reviewed-by: Saeed Mahameed <saeed@kernel.org>
> ---
>  drivers/net/ethernet/engleder/tsnep.h      |  2 ++
>  drivers/net/ethernet/engleder/tsnep_main.c | 39 ++++++++++++++++------
>  2 files changed, 31 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/ethernet/engleder/tsnep.h b/drivers/net/ethernet/engleder/tsnep.h
> index 855738d31d73..2268ff793edf 100644
> --- a/drivers/net/ethernet/engleder/tsnep.h
> +++ b/drivers/net/ethernet/engleder/tsnep.h
> @@ -134,6 +134,8 @@ struct tsnep_rx {
>  	u32 dropped;
>  	u32 multicast;
>  	u32 alloc_failed;
> +
> +	struct xdp_rxq_info xdp_rxq;
>  };
>  

Rather than placing it in your Rx queue structure it might be better to
look at placing it in the tsnep_queue struct. The fact is this is more
closely associated with the napi struct than your Rx ring, so moving it
there might make more sense, as you can avoid a bunch of extra overhead
in having to pull the napi to it.

Basically what it comes down to is that it is much easier to access
queue[i].rx than it is for the rx to get access to queue[i].napi.

>  struct tsnep_queue {
> diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
> index 0c9669edb2dd..451ad1849b9d 100644
> --- a/drivers/net/ethernet/engleder/tsnep_main.c
> +++ b/drivers/net/ethernet/engleder/tsnep_main.c
> @@ -792,6 +792,9 @@ static void tsnep_rx_ring_cleanup(struct tsnep_rx *rx)
>  		entry->page = NULL;
>  	}
>  
> +	if (xdp_rxq_info_is_reg(&rx->xdp_rxq))
> +		xdp_rxq_info_unreg(&rx->xdp_rxq);
> +
>  	if (rx->page_pool)
>  		page_pool_destroy(rx->page_pool);
>  
> @@ -807,7 +810,7 @@ static void tsnep_rx_ring_cleanup(struct tsnep_rx *rx)
>  	}
>  }
>  
> -static int tsnep_rx_ring_init(struct tsnep_rx *rx)
> +static int tsnep_rx_ring_init(struct tsnep_rx *rx, unsigned int napi_id)
>  {
>  	struct device *dmadev = rx->adapter->dmadev;
>  	struct tsnep_rx_entry *entry;
> @@ -850,6 +853,15 @@ static int tsnep_rx_ring_init(struct tsnep_rx *rx)
>  		goto failed;
>  	}
>  
> +	retval = xdp_rxq_info_reg(&rx->xdp_rxq, rx->adapter->netdev,
> +				  rx->queue_index, napi_id);
> +	if (retval)
> +		goto failed;
> +	retval = xdp_rxq_info_reg_mem_model(&rx->xdp_rxq, MEM_TYPE_PAGE_POOL,
> +					    rx->page_pool);
> +	if (retval)
> +		goto failed;
> +
>  	for (i = 0; i < TSNEP_RING_SIZE; i++) {
>  		entry = &rx->entry[i];
>  		next_entry = &rx->entry[(i + 1) % TSNEP_RING_SIZE];
> @@ -1104,7 +1116,8 @@ static bool tsnep_rx_pending(struct tsnep_rx *rx)
>  }
>  
>  static int tsnep_rx_open(struct tsnep_adapter *adapter, void __iomem *addr,
> -			 int queue_index, struct tsnep_rx *rx)
> +			 unsigned int napi_id, int queue_index,
> +			 struct tsnep_rx *rx)
>  {
>  	dma_addr_t dma;
>  	int retval;
> @@ -1118,7 +1131,7 @@ static int tsnep_rx_open(struct tsnep_adapter *adapter, void __iomem *addr,
>  	else
>  		rx->offset = TSNEP_SKB_PAD;
>  
> -	retval = tsnep_rx_ring_init(rx);
> +	retval = tsnep_rx_ring_init(rx, napi_id);
>  	if (retval)
>  		return retval;
>  
> @@ -1245,14 +1258,19 @@ static void tsnep_free_irq(struct tsnep_queue *queue, bool first)
>  static int tsnep_netdev_open(struct net_device *netdev)
>  {
>  	struct tsnep_adapter *adapter = netdev_priv(netdev);
> -	int i;
> -	void __iomem *addr;
>  	int tx_queue_index = 0;
>  	int rx_queue_index = 0;
> -	int retval;
> +	unsigned int napi_id;
> +	void __iomem *addr;
> +	int i, retval;
>  
>  	for (i = 0; i < adapter->num_queues; i++) {
>  		adapter->queue[i].adapter = adapter;
> +
> +		netif_napi_add(adapter->netdev, &adapter->queue[i].napi,
> +			       tsnep_poll);
> +		napi_id = adapter->queue[i].napi.napi_id;
> +

This is a good example of what I am referring to.

>  		if (adapter->queue[i].tx) {
>  			addr = adapter->addr + TSNEP_QUEUE(tx_queue_index);
>  			retval = tsnep_tx_open(adapter, addr, tx_queue_index,
> @@ -1263,7 +1281,7 @@ static int tsnep_netdev_open(struct net_device *netdev)
>  		}
>  		if (adapter->queue[i].rx) {
>  			addr = adapter->addr + TSNEP_QUEUE(rx_queue_index);
> -			retval = tsnep_rx_open(adapter, addr,
> +			retval = tsnep_rx_open(adapter, addr, napi_id,
>  					       rx_queue_index,
>  					       adapter->queue[i].rx);
>  			if (retval)
> @@ -1295,8 +1313,6 @@ static int tsnep_netdev_open(struct net_device *netdev)
>  		goto phy_failed;
>  
>  	for (i = 0; i < adapter->num_queues; i++) {
> -		netif_napi_add(adapter->netdev, &adapter->queue[i].napi,
> -			       tsnep_poll);
>  		napi_enable(&adapter->queue[i].napi);
>  
>  		tsnep_enable_irq(adapter, adapter->queue[i].irq_mask);

What you could do here is look at making all the napi_add/napi_enable
and xdp_rxq_info items into one function to handle all the enabling.

> @@ -1317,6 +1333,8 @@ static int tsnep_netdev_open(struct net_device *netdev)
>  			tsnep_rx_close(adapter->queue[i].rx);
>  		if (adapter->queue[i].tx)
>  			tsnep_tx_close(adapter->queue[i].tx);
> +
> +		netif_napi_del(&adapter->queue[i].napi);
>  	}
>  	return retval;
>  }
> @@ -1335,7 +1353,6 @@ static int tsnep_netdev_close(struct net_device *netdev)
>  		tsnep_disable_irq(adapter, adapter->queue[i].irq_mask);
>  
>  		napi_disable(&adapter->queue[i].napi);
> -		netif_napi_del(&adapter->queue[i].napi);
>  
>  		tsnep_free_irq(&adapter->queue[i], i == 0);
>  

Likewise here you could take care of all the same items with the page
pool being freed after you have already unregistered and freed the napi
instance.

> @@ -1343,6 +1360,8 @@ static int tsnep_netdev_close(struct net_device *netdev)
>  			tsnep_rx_close(adapter->queue[i].rx);
>  		if (adapter->queue[i].tx)
>  			tsnep_tx_close(adapter->queue[i].tx);
> +
> +		netif_napi_del(&adapter->queue[i].napi);
>  	}
>  
>  	return 0;



* Re: [PATCH net-next v4 10/10] tsnep: Support XDP BPF program setup
  2023-01-09 19:15 ` [PATCH net-next v4 10/10] tsnep: Support XDP BPF program setup Gerhard Engleder
@ 2023-01-10 17:33   ` Alexander H Duyck
  2023-01-10 21:38     ` Gerhard Engleder
  0 siblings, 1 reply; 36+ messages in thread
From: Alexander H Duyck @ 2023-01-10 17:33 UTC (permalink / raw)
  To: Gerhard Engleder, netdev; +Cc: davem, kuba, edumazet, pabeni

On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
> Implement setup of BPF programs for XDP RX path with command
> XDP_SETUP_PROG of ndo_bpf(). This is the final step for XDP RX path
> support.
> 
> tsnep_netdev_close() is called directly during BPF program setup. Add
> netif_carrier_off() and netif_tx_stop_all_queues() calls to signal to
> network stack that device is down. Otherwise network stack would
> continue transmitting pakets.
> 
> Return value of tsnep_netdev_open() is not checked during BPF program
> setup like in other drivers. Forwarding the return value would result in
> a bpf_prog_put() call in dev_xdp_install(), which would make removal of
> BPF program necessary.
> 
> If tsnep_netdev_open() fails during BPF program setup, then the network
> stack would call tsnep_netdev_close() anyway. Thus, tsnep_netdev_close()
> checks now if device is already down.
> 
> Additionally remove $(tsnep-y) from $(tsnep-objs) because it is added
> automatically.
> 
> Test results with A53 1.2GHz:
> 
> XDP_DROP (samples/bpf/xdp1)
> proto 17:     883878 pkt/s
> 
> XDP_TX (samples/bpf/xdp2)
> proto 17:     255693 pkt/s
> 
> XDP_REDIRECT (samples/bpf/xdpsock)
>  sock0@eth2:0 rxdrop xdp-drv
>                    pps            pkts           1.00
> rx                 855,582        5,404,523
> tx                 0              0
> 
> XDP_REDIRECT (samples/bpf/xdp_redirect)
> eth2->eth1         613,267 rx/s   0 err,drop/s   613,272 xmit/s
> 
> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
> ---
>  drivers/net/ethernet/engleder/Makefile     |  2 +-
>  drivers/net/ethernet/engleder/tsnep.h      |  6 +++++
>  drivers/net/ethernet/engleder/tsnep_main.c | 25 ++++++++++++++++---
>  drivers/net/ethernet/engleder/tsnep_xdp.c  | 29 ++++++++++++++++++++++
>  4 files changed, 58 insertions(+), 4 deletions(-)
>  create mode 100644 drivers/net/ethernet/engleder/tsnep_xdp.c
> 
> 

<...>

> --- a/drivers/net/ethernet/engleder/tsnep_main.c
> +++ b/drivers/net/ethernet/engleder/tsnep_main.c
> @@ -1373,7 +1373,7 @@ static void tsnep_free_irq(struct tsnep_queue *queue, bool first)
>  	memset(queue->name, 0, sizeof(queue->name));
>  }
>  
> -static int tsnep_netdev_open(struct net_device *netdev)
> +int tsnep_netdev_open(struct net_device *netdev)
>  {
>  	struct tsnep_adapter *adapter = netdev_priv(netdev);
>  	int tx_queue_index = 0;
> @@ -1436,6 +1436,8 @@ static int tsnep_netdev_open(struct net_device *netdev)
>  		tsnep_enable_irq(adapter, adapter->queue[i].irq_mask);
>  	}
>  
> +	netif_tx_start_all_queues(adapter->netdev);
> +
>  	clear_bit(__TSNEP_DOWN, &adapter->state);
>  
>  	return 0;
> @@ -1457,12 +1459,16 @@ static int tsnep_netdev_open(struct net_device *netdev)
>  	return retval;
>  }
>  
> -static int tsnep_netdev_close(struct net_device *netdev)
> +int tsnep_netdev_close(struct net_device *netdev)
>  {
>  	struct tsnep_adapter *adapter = netdev_priv(netdev);
>  	int i;
>  
> -	set_bit(__TSNEP_DOWN, &adapter->state);
> +	if (test_and_set_bit(__TSNEP_DOWN, &adapter->state))
> +		return 0;
> +
> +	netif_carrier_off(netdev);
> +	netif_tx_stop_all_queues(netdev);
>  

As I called out earlier the __TSNEP_DOWN is just !IFF_UP so you don't
need that bit.

The fact that netif_carrier_off is here also points out the fact that
the code in the Tx path isn't needed regarding __TSNEP_DOWN and you can
probably just check netif_carrier_ok if you need the check.

>  	tsnep_disable_irq(adapter, ECM_INT_LINK);
>  	tsnep_phy_close(adapter);
> @@ -1627,6 +1633,18 @@ static ktime_t tsnep_netdev_get_tstamp(struct net_device *netdev,
>  	return ns_to_ktime(timestamp);
>  }
>  
> +static int tsnep_netdev_bpf(struct net_device *dev, struct netdev_bpf *bpf)
> +{
> +	struct tsnep_adapter *adapter = netdev_priv(dev);
> +
> +	switch (bpf->command) {
> +	case XDP_SETUP_PROG:
> +		return tsnep_xdp_setup_prog(adapter, bpf->prog, bpf->extack);
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
> +
>  static int tsnep_netdev_xdp_xmit(struct net_device *dev, int n,
>  				 struct xdp_frame **xdp, u32 flags)
>  {
> @@ -1677,6 +1695,7 @@ static const struct net_device_ops tsnep_netdev_ops = {
>  	.ndo_set_features = tsnep_netdev_set_features,
>  	.ndo_get_tstamp = tsnep_netdev_get_tstamp,
>  	.ndo_setup_tc = tsnep_tc_setup,
> +	.ndo_bpf = tsnep_netdev_bpf,
>  	.ndo_xdp_xmit = tsnep_netdev_xdp_xmit,
>  };
>  
> 


* Re: [PATCH net-next v4 09/10] tsnep: Add XDP RX support
  2023-01-09 19:15 ` [PATCH net-next v4 09/10] tsnep: Add XDP RX support Gerhard Engleder
@ 2023-01-10 17:40   ` Alexander H Duyck
  2023-01-10 21:28     ` Gerhard Engleder
  0 siblings, 1 reply; 36+ messages in thread
From: Alexander H Duyck @ 2023-01-10 17:40 UTC (permalink / raw)
  To: Gerhard Engleder, netdev; +Cc: davem, kuba, edumazet, pabeni

On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
> If BPF program is set up, then run BPF program for every received frame
> and execute the selected action.
> 
> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
> ---
>  drivers/net/ethernet/engleder/tsnep_main.c | 122 ++++++++++++++++++++-
>  1 file changed, 120 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
> index 451ad1849b9d..002c879639db 100644
> --- a/drivers/net/ethernet/engleder/tsnep_main.c
> +++ b/drivers/net/ethernet/engleder/tsnep_main.c
> @@ -27,6 +27,7 @@
>  #include <linux/phy.h>
>  #include <linux/iopoll.h>
>  #include <linux/bpf.h>
> +#include <linux/bpf_trace.h>
>  
>  #define TSNEP_SKB_PAD (NET_SKB_PAD + NET_IP_ALIGN)
>  #define TSNEP_HEADROOM ALIGN(max(TSNEP_SKB_PAD, XDP_PACKET_HEADROOM), 4)
> @@ -44,6 +45,9 @@
>  #define TSNEP_COALESCE_USECS_MAX     ((ECM_INT_DELAY_MASK >> ECM_INT_DELAY_SHIFT) * \
>  				      ECM_INT_DELAY_BASE_US + ECM_INT_DELAY_BASE_US - 1)
>  
> +#define TSNEP_XDP_TX		BIT(0)
> +#define TSNEP_XDP_REDIRECT	BIT(1)
> +
>  enum {
>  	__TSNEP_DOWN,
>  };
> @@ -625,6 +629,28 @@ static void tsnep_xdp_xmit_flush(struct tsnep_tx *tx)
>  	iowrite32(TSNEP_CONTROL_TX_ENABLE, tx->addr + TSNEP_CONTROL);
>  }
>  
> +static bool tsnep_xdp_xmit_back(struct tsnep_adapter *adapter,
> +				struct xdp_buff *xdp,
> +				struct netdev_queue *tx_nq, struct tsnep_tx *tx)
> +{
> +	struct xdp_frame *xdpf = xdp_convert_buff_to_frame(xdp);
> +	bool xmit;
> +
> +	if (unlikely(!xdpf))
> +		return false;
> +
> +	__netif_tx_lock(tx_nq, smp_processor_id());
> +
> +	/* Avoid transmit queue timeout since we share it with the slow path */
> +	txq_trans_cond_update(tx_nq);
> +
> +	xmit = tsnep_xdp_xmit_frame_ring(xdpf, tx, TSNEP_TX_TYPE_XDP_TX);
> +

Again, the trans_cond_update should come after the xmit, and only if
the xmit indicates the frame was actually queued.

> +	__netif_tx_unlock(tx_nq);
> +
> +	return xmit;
> +}
> +
>  static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
>  {
>  	struct tsnep_tx_entry *entry;
> @@ -983,6 +1009,62 @@ static int tsnep_rx_refill(struct tsnep_rx *rx, int count, bool reuse)
>  	return i;
>  }
>  
> +static bool tsnep_xdp_run_prog(struct tsnep_rx *rx, struct bpf_prog *prog,
> +			       struct xdp_buff *xdp, int *status,
> +			       struct netdev_queue *tx_nq, struct tsnep_tx *tx)
> +{
> +	unsigned int length;
> +	unsigned int sync;
> +	u32 act;
> +
> +	length = xdp->data_end - xdp->data_hard_start - XDP_PACKET_HEADROOM;
> +
> +	act = bpf_prog_run_xdp(prog, xdp);
> +
> +	/* Due xdp_adjust_tail: DMA sync for_device cover max len CPU touch */
> +	sync = xdp->data_end - xdp->data_hard_start - XDP_PACKET_HEADROOM;
> +	sync = max(sync, length);
> +
> +	switch (act) {
> +	case XDP_PASS:
> +		return false;
> +	case XDP_TX:
> +		if (!tsnep_xdp_xmit_back(rx->adapter, xdp, tx_nq, tx))
> +			goto out_failure;
> +		*status |= TSNEP_XDP_TX;
> +		return true;
> +	case XDP_REDIRECT:
> +		if (xdp_do_redirect(rx->adapter->netdev, xdp, prog) < 0)
> +			goto out_failure;
> +		*status |= TSNEP_XDP_REDIRECT;
> +		return true;
> +	default:
> +		bpf_warn_invalid_xdp_action(rx->adapter->netdev, prog, act);
> +		fallthrough;
> +	case XDP_ABORTED:
> +out_failure:
> +		trace_xdp_exception(rx->adapter->netdev, prog, act);
> +		fallthrough;
> +	case XDP_DROP:
> +		page_pool_put_page(rx->page_pool, virt_to_head_page(xdp->data),
> +				   sync, true);
> +		return true;
> +	}
> +}
> +
> +static void tsnep_finalize_xdp(struct tsnep_adapter *adapter, int status,
> +			       struct netdev_queue *tx_nq, struct tsnep_tx *tx)
> +{
> +	if (status & TSNEP_XDP_TX) {
> +		__netif_tx_lock(tx_nq, smp_processor_id());
> +		tsnep_xdp_xmit_flush(tx);
> +		__netif_tx_unlock(tx_nq);
> +	}
> +
> +	if (status & TSNEP_XDP_REDIRECT)
> +		xdp_do_flush();
> +}
> +
>  static struct sk_buff *tsnep_build_skb(struct tsnep_rx *rx, struct page *page,
>  				       int length)
>  {
> @@ -1018,15 +1100,29 @@ static int tsnep_rx_poll(struct tsnep_rx *rx, struct napi_struct *napi,
>  			 int budget)
>  {
>  	struct device *dmadev = rx->adapter->dmadev;
> -	int desc_available;
> -	int done = 0;
>  	enum dma_data_direction dma_dir;
>  	struct tsnep_rx_entry *entry;
> +	struct netdev_queue *tx_nq;
> +	struct bpf_prog *prog;
> +	struct xdp_buff xdp;
>  	struct sk_buff *skb;
> +	struct tsnep_tx *tx;
> +	int desc_available;
> +	int xdp_status = 0;
> +	int done = 0;
>  	int length;
>  
>  	desc_available = tsnep_rx_desc_available(rx);
>  	dma_dir = page_pool_get_dma_dir(rx->page_pool);
> +	prog = READ_ONCE(rx->adapter->xdp_prog);
> +	if (prog) {
> +		int queue = smp_processor_id() % rx->adapter->num_tx_queues;
> +

As I mentioned before, take a look at how this was addressed in
skb_tx_hash. The modulus division is really expensive.

Also, does this make sense? I am assuming you have a 1:1 Tx to Rx
mapping for your queues, don't you? If so it might make more sense to
use the Tx queue that you clean in this queue pair.

> +		tx_nq = netdev_get_tx_queue(rx->adapter->netdev, queue);
> +		tx = &rx->adapter->tx[queue];
> +
> +		xdp_init_buff(&xdp, PAGE_SIZE, &rx->xdp_rxq);
> +	}
>  
>  	while (likely(done < budget) && (rx->read != rx->write)) {
>  		entry = &rx->entry[rx->read];
> @@ -1076,6 +1172,25 @@ static int tsnep_rx_poll(struct tsnep_rx *rx, struct napi_struct *napi,
>  		rx->read = (rx->read + 1) % TSNEP_RING_SIZE;
>  		desc_available++;
>  
> +		if (prog) {
> +			bool consume;
> +
> +			xdp_prepare_buff(&xdp, page_address(entry->page),
> +					 XDP_PACKET_HEADROOM + TSNEP_RX_INLINE_METADATA_SIZE,
> +					 length, false);
> +
> +			consume = tsnep_xdp_run_prog(rx, prog, &xdp,
> +						     &xdp_status, tx_nq, tx);
> +			if (consume) {
> +				rx->packets++;
> +				rx->bytes += length;
> +
> +				entry->page = NULL;
> +
> +				continue;
> +			}
> +		}
> +
>  		skb = tsnep_build_skb(rx, entry->page, length);
>  		if (skb) {
>  			page_pool_release_page(rx->page_pool, entry->page);
> @@ -1094,6 +1209,9 @@ static int tsnep_rx_poll(struct tsnep_rx *rx, struct napi_struct *napi,
>  		entry->page = NULL;
>  	}
>  
> +	if (xdp_status)
> +		tsnep_finalize_xdp(rx->adapter, xdp_status, tx_nq, tx);
> +
>  	if (desc_available)
>  		tsnep_rx_refill(rx, desc_available, false);
>  



* Re: [PATCH net-next v4 01/10] tsnep: Use spin_lock_bh for TX
  2023-01-10 15:49   ` Alexander H Duyck
@ 2023-01-10 19:44     ` Gerhard Engleder
  0 siblings, 0 replies; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-10 19:44 UTC (permalink / raw)
  To: Alexander H Duyck, netdev; +Cc: davem, kuba, edumazet, pabeni

On 10.01.23 16:49, Alexander H Duyck wrote:
> On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
>> TX processing is done only within BH context. Therefore, _irqsafe
>> variant is not necessary.
>>
>> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
> 
> Rather than reducing the context of this why not drop it completely? It
> doesn't look like you are running with the NETIF_F_LLTX flag set so
> from what I can tell it looks like you are taking the Tx lock in the
> xmit path. So Tx queues are protected with the Tx queue lock at the
> netdev level via the HARD_TX_LOCK macro.
> 
> Since it is already being used in the Tx path to protect multiple
> access you could probably just look at getting rid of it entirely.
> 
> The only caveat you would need to watch out for is a race between the
> cleaning and transmitting which can be addressed via a few barriers
> like what was done in the intel drivers via something like the
> __ixgbe_maybe_stop_tx function and the logic to wake the queue in the
> clean function.

I know these barriers in the intel drivers. But I chose to use a lock
like the microchip driver to be on the safe side rather than having a
fully optimized driver.

> Alternatively if you really feel you need this in the non-xmit path
> functions you could just drop the lock and instead use __netif_tx_lock
> for those spots that are accessing the queue outside the normal
> transmit path.

Ok, I will work on that. One of your suggestions will be done.

Gerhard


* Re: [PATCH net-next v4 03/10] tsnep: Do not print DMA mapping error
  2023-01-10 15:53   ` Alexander H Duyck
@ 2023-01-10 19:47     ` Gerhard Engleder
  0 siblings, 0 replies; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-10 19:47 UTC (permalink / raw)
  To: Alexander H Duyck, netdev; +Cc: davem, kuba, edumazet, pabeni



On 10.01.23 16:53, Alexander H Duyck wrote:
> On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
>> Printing in data path shall be avoided. DMA mapping error is already
>> counted in stats so printing is not necessary.
>>
>> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
>> ---
>>   drivers/net/ethernet/engleder/tsnep_main.c | 2 --
>>   1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
>> index d148ba422b8c..8c6d6e210494 100644
>> --- a/drivers/net/ethernet/engleder/tsnep_main.c
>> +++ b/drivers/net/ethernet/engleder/tsnep_main.c
>> @@ -469,8 +469,6 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
>>   
>>   		spin_unlock_bh(&tx->lock);
>>   
>> -		netdev_err(tx->adapter->netdev, "TX DMA map failed\n");
>> -
>>   		return NETDEV_TX_OK;
>>   	}
>>   	length = retval;
> 
> It might be nice to add a stat to indicate that this is specifically a
> mapping error rather than just incrementing dropped but that could also
> be done in a future patch.

I took a note for future work.

Gerhard


* Re: [PATCH net-next v4 04/10] tsnep: Add adapter down state
  2023-01-10 16:05   ` Alexander H Duyck
@ 2023-01-10 20:16     ` Gerhard Engleder
  0 siblings, 0 replies; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-10 20:16 UTC (permalink / raw)
  To: Alexander H Duyck, netdev; +Cc: davem, kuba, edumazet, pabeni

On 10.01.23 17:05, Alexander H Duyck wrote:
> On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
>> Add adapter state with flag for down state. This flag will be used by
>> the XDP TX path to deny TX if adapter is down.
>>
>> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
> 
> What value is this bit adding?
> 
>  From what I can tell you could probably just use netif_carrier_ok in
> place of this and actually get better coverage in terms of identifying
> state in which the Tx queue is able to function. So in your XDP_TX
> patch you could do that if you really need it.

TX does not check the link state, because the hardware just drops all
packets if there is no link. I would like to keep it like that, because
it minimizes special behavior if the link is down. netif_carrier_ok()
would include the link state.

> As far as the use in your close function it is redundant since the
> IFF_UP is only set if ndo_open completes, and ndo_stop is only called
> if IFF_UP is set. So your down flag would be redundant with !IFF_UP in
> that case.

tsnep_netdev_close() is called directly during bpf prog setup (see last
commit). If the following tsnep_netdev_open() call fails, then this
flag signals that the device is actually down even if IFF_UP is set. So
in this case the down flag is not redundant to !IFF_UP.

Is this a good enough reason for the flag?

Gerhard


* Re: [PATCH net-next v4 05/10] tsnep: Add XDP TX support
  2023-01-10 16:56   ` Alexander H Duyck
@ 2023-01-10 21:07     ` Gerhard Engleder
  2023-01-10 22:38       ` Alexander Duyck
  0 siblings, 1 reply; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-10 21:07 UTC (permalink / raw)
  To: Alexander H Duyck, netdev; +Cc: davem, kuba, edumazet, pabeni

On 10.01.23 17:56, Alexander H Duyck wrote:
On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
>> Implement ndo_xdp_xmit() for XDP TX support. Support for fragmented XDP
>> frames is included.
>>
>> Also some const, braces and logic clean ups are done in normal TX path
>> to keep both TX paths in sync.
>>
>> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
>> ---
>>   drivers/net/ethernet/engleder/tsnep.h      |  12 +-
>>   drivers/net/ethernet/engleder/tsnep_main.c | 211 +++++++++++++++++++--
>>   2 files changed, 210 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/engleder/tsnep.h b/drivers/net/ethernet/engleder/tsnep.h
>> index d658413ceb14..9cb267938794 100644
>> --- a/drivers/net/ethernet/engleder/tsnep.h
>> +++ b/drivers/net/ethernet/engleder/tsnep.h
>> @@ -57,6 +57,12 @@ struct tsnep_rxnfc_rule {
>>   	int location;
>>   };
>>   
>> +enum tsnep_tx_type {
>> +	TSNEP_TX_TYPE_SKB,
>> +	TSNEP_TX_TYPE_XDP_TX,
>> +	TSNEP_TX_TYPE_XDP_NDO,
>> +};
>> +
> 
> I'm more a fan of using a bitmap rather than an enum for these type of
> things since otherwise you are having to check for 3 possible values
> for everything.
> 
> Also you may want to reorder this so that SKB is the last option. With
> that 0/1 would be your two XDP values and 2 would be your SKB value. It
> should make it easier for this to be treated more as a decision tree.

I will try to use bitmap.

>>   struct tsnep_tx_entry {
>>   	struct tsnep_tx_desc *desc;
>>   	struct tsnep_tx_desc_wb *desc_wb;
>> @@ -65,7 +71,11 @@ struct tsnep_tx_entry {
>>   
>>   	u32 properties;
>>   
>> -	struct sk_buff *skb;
>> +	enum tsnep_tx_type type;
>> +	union {
>> +		struct sk_buff *skb;
>> +		struct xdp_frame *xdpf;
>> +	};
>>   	size_t len;
>>   	DEFINE_DMA_UNMAP_ADDR(dma);
>>   };
>> diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
>> index 943de5a09693..1ae73c706c9e 100644
>> --- a/drivers/net/ethernet/engleder/tsnep_main.c
>> +++ b/drivers/net/ethernet/engleder/tsnep_main.c
>> @@ -310,10 +310,12 @@ static void tsnep_tx_activate(struct tsnep_tx *tx, int index, int length,
>>   	struct tsnep_tx_entry *entry = &tx->entry[index];
>>   
>>   	entry->properties = 0;
>> +	/* xdpf is union with skb */
>>   	if (entry->skb) {
>>   		entry->properties = length & TSNEP_DESC_LENGTH_MASK;
>>   		entry->properties |= TSNEP_DESC_INTERRUPT_FLAG;
>> -		if (skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS)
>> +		if (entry->type == TSNEP_TX_TYPE_SKB &&
>> +		    (skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS))
>>   			entry->properties |= TSNEP_DESC_EXTENDED_WRITEBACK_FLAG;
>>   
>>   		/* toggle user flag to prevent false acknowledge
>> @@ -370,7 +372,8 @@ static int tsnep_tx_desc_available(struct tsnep_tx *tx)
>>   		return tx->read - tx->write - 1;
>>   }
>>   
>> -static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count)
>> +static int tsnep_tx_map(const struct sk_buff *skb, struct tsnep_tx *tx,
>> +			int count)
>>   {
> 
> This change to const doesn't add anything since this is a static
> function. You could probably just skip making this change since the
> function will likely be inlined anyway.

const was requested for tsnep_xdp_tx_map() during the last review round,
so I added it here as well to keep both functions similar.

>>   	struct device *dmadev = tx->adapter->dmadev;
>>   	struct tsnep_tx_entry *entry;
>> @@ -382,7 +385,7 @@ static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count)
>>   	for (i = 0; i < count; i++) {
>>   		entry = &tx->entry[(tx->write + i) % TSNEP_RING_SIZE];
>>   
>> -		if (i == 0) {
>> +		if (!i) {
>>   			len = skb_headlen(skb);
>>   			dma = dma_map_single(dmadev, skb->data, len,
>>   					     DMA_TO_DEVICE);
>> @@ -400,6 +403,8 @@ static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count)
>>   
>>   		entry->desc->tx = __cpu_to_le64(dma);
>>   
>> +		entry->type = TSNEP_TX_TYPE_SKB;
>> +
>>   		map_len += len;
>>   	}
>>   
> 
> I wonder if it wouldn't be better to change this so that you have a 4th
> type for just "PAGE" or "FRAGMENT" since you only really need to
> identify this as SKB or XDP on the first buffer, and then all the rest
> are just going to be unmapped as page regardless of if it is XDP or
> SKB.

I will try it in combination with your bitmap suggestion.

>> @@ -417,12 +422,13 @@ static int tsnep_tx_unmap(struct tsnep_tx *tx, int index, int count)
>>   		entry = &tx->entry[(index + i) % TSNEP_RING_SIZE];
>>   
>>   		if (entry->len) {
>> -			if (i == 0)
>> +			if (!i && entry->type == TSNEP_TX_TYPE_SKB)
>>   				dma_unmap_single(dmadev,
>>   						 dma_unmap_addr(entry, dma),
>>   						 dma_unmap_len(entry, len),
>>   						 DMA_TO_DEVICE);
>> -			else
>> +			else if (entry->type == TSNEP_TX_TYPE_SKB ||
>> +				 entry->type == TSNEP_TX_TYPE_XDP_NDO)
>>   				dma_unmap_page(dmadev,
>>   					       dma_unmap_addr(entry, dma),
>>   					       dma_unmap_len(entry, len),
> 
> Rather than perform 2 checks here you could just verify type !=
> TYPE_XDP_TX which would save you a check.

Will be improved with your bitmap suggestion.

>> @@ -482,7 +488,7 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
>>   
>>   	for (i = 0; i < count; i++)
>>   		tsnep_tx_activate(tx, (tx->write + i) % TSNEP_RING_SIZE, length,
>> -				  i == (count - 1));
>> +				  i == count - 1);
>>   	tx->write = (tx->write + count) % TSNEP_RING_SIZE;
>>   
>>   	skb_tx_timestamp(skb);
>> @@ -502,12 +508,133 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
>>   	return NETDEV_TX_OK;
>>   }
>>   
>> +static int tsnep_xdp_tx_map(const struct xdp_frame *xdpf, struct tsnep_tx *tx,
>> +			    const struct skb_shared_info *shinfo, int count,
>> +			    enum tsnep_tx_type type)
> 
> Again the const here isn't adding any value since this is a static
> function and will likely be inlined into the function below which calls
> it.

const was requested here during the last review round, so I added it. It
may add some value by detecting some problems at compile time.

>> +{
>> +	struct device *dmadev = tx->adapter->dmadev;
>> +	struct tsnep_tx_entry *entry;
>> +	const skb_frag_t *frag;
>> +	struct page *page;
>> +	unsigned int len;
>> +	int map_len = 0;
>> +	dma_addr_t dma;
>> +	void *data;
>> +	int i;
>> +
>> +	frag = NULL;
>> +	len = xdpf->len;
>> +	for (i = 0; i < count; i++) {
>> +		entry = &tx->entry[(tx->write + i) % TSNEP_RING_SIZE];
>> +		if (type == TSNEP_TX_TYPE_XDP_NDO) {
>> +			data = unlikely(frag) ? skb_frag_address(frag) :
>> +						xdpf->data;
>> +			dma = dma_map_single(dmadev, data, len, DMA_TO_DEVICE);
>> +			if (dma_mapping_error(dmadev, dma))
>> +				return -ENOMEM;
>> +
>> +			entry->type = TSNEP_TX_TYPE_XDP_NDO;
>> +		} else {
>> +			page = unlikely(frag) ? skb_frag_page(frag) :
>> +						virt_to_page(xdpf->data);
>> +			dma = page_pool_get_dma_addr(page);
>> +			if (unlikely(frag))
>> +				dma += skb_frag_off(frag);
>> +			else
>> +				dma += sizeof(*xdpf) + xdpf->headroom;
>> +			dma_sync_single_for_device(dmadev, dma, len,
>> +						   DMA_BIDIRECTIONAL);
>> +
>> +			entry->type = TSNEP_TX_TYPE_XDP_TX;
>> +		}
>> +
>> +		entry->len = len;
>> +		dma_unmap_addr_set(entry, dma, dma);
>> +
>> +		entry->desc->tx = __cpu_to_le64(dma);
>> +
>> +		map_len += len;
>> +
>> +		if (i + 1 < count) {
>> +			frag = &shinfo->frags[i];
>> +			len = skb_frag_size(frag);
>> +		}
>> +	}
>> +
>> +	return map_len;
>> +}
>> +
>> +/* This function requires __netif_tx_lock is held by the caller. */
>> +static bool tsnep_xdp_xmit_frame_ring(struct xdp_frame *xdpf,
>> +				      struct tsnep_tx *tx,
>> +				      enum tsnep_tx_type type)
>> +{
>> +	const struct skb_shared_info *shinfo =
>> +		xdp_get_shared_info_from_frame(xdpf);
>> +	struct tsnep_tx_entry *entry;
>> +	int count, length, retval, i;
>> +
>> +	count = 1;
>> +	if (unlikely(xdp_frame_has_frags(xdpf)))
>> +		count += shinfo->nr_frags;
>> +
>> +	spin_lock_bh(&tx->lock);
>> +
>> +	/* ensure that TX ring is not filled up by XDP, always MAX_SKB_FRAGS
>> +	 * will be available for normal TX path and queue is stopped there if
>> +	 * necessary
>> +	 */
>> +	if (tsnep_tx_desc_available(tx) < (MAX_SKB_FRAGS + 1 + count)) {
>> +		spin_unlock_bh(&tx->lock);
>> +
>> +		return false;
>> +	}
>> +
>> +	entry = &tx->entry[tx->write];
>> +	entry->xdpf = xdpf;
>> +
>> +	retval = tsnep_xdp_tx_map(xdpf, tx, shinfo, count, type);
>> +	if (retval < 0) {
>> +		tsnep_tx_unmap(tx, tx->write, count);
>> +		entry->xdpf = NULL;
>> +
>> +		tx->dropped++;
>> +
>> +		spin_unlock_bh(&tx->lock);
>> +
>> +		return false;
>> +	}
>> +	length = retval;
>> +
>> +	for (i = 0; i < count; i++)
>> +		tsnep_tx_activate(tx, (tx->write + i) % TSNEP_RING_SIZE, length,
>> +				  i == count - 1);
>> +	tx->write = (tx->write + count) % TSNEP_RING_SIZE;
>> +
>> +	/* descriptor properties shall be valid before hardware is notified */
>> +	dma_wmb();
>> +
>> +	spin_unlock_bh(&tx->lock);
>> +
>> +	return true;
>> +}
>> +
>> +static void tsnep_xdp_xmit_flush(struct tsnep_tx *tx)
>> +{
>> +	iowrite32(TSNEP_CONTROL_TX_ENABLE, tx->addr + TSNEP_CONTROL);
>> +}
>> +
>>   static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
>>   {
>> -	int budget = 128;
>>   	struct tsnep_tx_entry *entry;
>> -	int count;
>> +	struct xdp_frame_bulk bq;
>> +	int budget = 128;
>>   	int length;
>> +	int count;
>> +
>> +	xdp_frame_bulk_init(&bq);
>> +
>> +	rcu_read_lock(); /* need for xdp_return_frame_bulk */
>>   
> 
> You should be able to get rid of both of these. See comments below.
> 
>>   	spin_lock_bh(&tx->lock);
>>   
>> @@ -527,12 +654,17 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
>>   		dma_rmb();
>>   
>>   		count = 1;
>> -		if (skb_shinfo(entry->skb)->nr_frags > 0)
>> +		if (entry->type == TSNEP_TX_TYPE_SKB &&
>> +		    skb_shinfo(entry->skb)->nr_frags > 0)
>>   			count += skb_shinfo(entry->skb)->nr_frags;
>> +		else if (entry->type != TSNEP_TX_TYPE_SKB &&
>> +			 xdp_frame_has_frags(entry->xdpf))
>> +			count += xdp_get_shared_info_from_frame(entry->xdpf)->nr_frags;
>>   
>>   		length = tsnep_tx_unmap(tx, tx->read, count);
>>   
>> -		if ((skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS) &&
>> +		if (entry->type == TSNEP_TX_TYPE_SKB &&
>> +		    (skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS) &&
>>   		    (__le32_to_cpu(entry->desc_wb->properties) &
>>   		     TSNEP_DESC_EXTENDED_WRITEBACK_FLAG)) {
>>   			struct skb_shared_hwtstamps hwtstamps;
>> @@ -552,7 +684,18 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
>>   			skb_tstamp_tx(entry->skb, &hwtstamps);
>>   		}
>>   
>> -		napi_consume_skb(entry->skb, napi_budget);
>> +		switch (entry->type) {
>> +		case TSNEP_TX_TYPE_SKB:
>> +			napi_consume_skb(entry->skb, napi_budget);
>> +			break;
>> +		case TSNEP_TX_TYPE_XDP_TX:
>> +			xdp_return_frame_rx_napi(entry->xdpf);
>> +			break;
>> +		case TSNEP_TX_TYPE_XDP_NDO:
>> +			xdp_return_frame_bulk(entry->xdpf, &bq);
>> +			break;
>> +		}
>> +		/* xdpf is union with skb */
>>   		entry->skb = NULL;
>>   
>>   		tx->read = (tx->read + count) % TSNEP_RING_SIZE;
> 
> If I am not mistaken I think you can use xdp_return_frame_rx_napi for
> both of these without having to resort to the bulk approach since the
> tsnep_tx_poll is operating in the NAPI polling context.
> 
> When you free the buffers outside of polling context you would then
> want to use the xdp_return_frame_bulk. So for example if you have any
> buffers that weren't transmitted when you get to the point of
> tsnep_tx_ring_cleanup then you would want to unmap and free them there
> using something like the xdp_return_frame or xdp_return_frame_bulk.

This is done similarly in other drivers, so I thought it was good practice.
I will compare the performance of xdp_return_frame_rx_napi() and
xdp_return_frame_bulk().

>> @@ -570,7 +713,11 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
>>   
>>   	spin_unlock_bh(&tx->lock);
>>   
>> -	return (budget != 0);
>> +	xdp_flush_frame_bulk(&bq);
>> +
>> +	rcu_read_unlock();
>> +
>> +	return budget != 0;
>>   }
>>   
>>   static bool tsnep_tx_pending(struct tsnep_tx *tx)
>> @@ -1330,6 +1477,45 @@ static ktime_t tsnep_netdev_get_tstamp(struct net_device *netdev,
>>   	return ns_to_ktime(timestamp);
>>   }
>>   
>> +static int tsnep_netdev_xdp_xmit(struct net_device *dev, int n,
>> +				 struct xdp_frame **xdp, u32 flags)
>> +{
>> +	struct tsnep_adapter *adapter = netdev_priv(dev);
>> +	u32 cpu = smp_processor_id();
>> +	struct netdev_queue *nq;
>> +	int nxmit, queue;
>> +	bool xmit;
>> +
>> +	if (unlikely(test_bit(__TSNEP_DOWN, &adapter->state)))
>> +		return -ENETDOWN;
>> +
> 
> Again, here you are better off probably checking for netif_carrier_ok
> rather than adding a new flag.
> 
>> +	if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
>> +		return -EINVAL;
>> +
>> +	queue = cpu % adapter->num_tx_queues;
> 
> Modulo math here unconditionally will be expensive. Take a look at
> using reciprocal_scale like we do in skb_tx_hash.

I will try to find a better solution.

>> +	nq = netdev_get_tx_queue(adapter->netdev, queue);
>> +
>> +	__netif_tx_lock(nq, cpu);
>> +
>> +	/* Avoid transmit queue timeout since we share it with the slow path */
>> +	txq_trans_cond_update(nq);
>> +
> 
> Doing this here may not be the right spot. It might make more sense to
> do this after the call to xmit_frame_ring after the break case in order
> to guarantee that you actually put something on the ring before
> updating the state otherwise you can stall the Tx ring without
> triggering the watchdog by just hammering this.
> 
>> +	for (nxmit = 0; nxmit < n; nxmit++) {
>> +		xmit = tsnep_xdp_xmit_frame_ring(xdp[nxmit],
>> +						 &adapter->tx[queue],
>> +						 TSNEP_TX_TYPE_XDP_NDO);
>> +		if (!xmit)
>> +			break;
> 
> So move txq_trans_cond_update(nq) here.

Will be fixed.

Gerhard

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 08/10] tsnep: Add RX queue info for XDP support
  2023-01-10 17:29   ` Alexander H Duyck
@ 2023-01-10 21:21     ` Gerhard Engleder
  2023-01-10 22:27       ` Alexander Duyck
  0 siblings, 1 reply; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-10 21:21 UTC (permalink / raw)
  To: Alexander H Duyck, netdev; +Cc: davem, kuba, edumazet, pabeni, Saeed Mahameed

On 10.01.23 18:29, Alexander H Duyck wrote:
> On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
>> Register xdp_rxq_info with page_pool memory model. This is needed for
>> XDP buffer handling.
>>
>> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
>> Reviewed-by: Saeed Mahameed <saeed@kernel.org>
>> ---
>>   drivers/net/ethernet/engleder/tsnep.h      |  2 ++
>>   drivers/net/ethernet/engleder/tsnep_main.c | 39 ++++++++++++++++------
>>   2 files changed, 31 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/engleder/tsnep.h b/drivers/net/ethernet/engleder/tsnep.h
>> index 855738d31d73..2268ff793edf 100644
>> --- a/drivers/net/ethernet/engleder/tsnep.h
>> +++ b/drivers/net/ethernet/engleder/tsnep.h
>> @@ -134,6 +134,8 @@ struct tsnep_rx {
>>   	u32 dropped;
>>   	u32 multicast;
>>   	u32 alloc_failed;
>> +
>> +	struct xdp_rxq_info xdp_rxq;
>>   };
>>   
> 
> Rather than placing it in your Rx queue structure it might be better to
> look at placing it in the tsnep_queue struct. The fact is this is more
> closely associated with the napi struct than your Rx ring so changing
> it to that might make more sense as you can avoid a bunch of extra
> overhead with having to pull napi to it.
> 
> Basically what it comes down it is that it is much easier to access
> queue[i].rx than it is for the rx to get access to queue[i].napi.

I will try it as suggested.

>>   struct tsnep_queue {
>> diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
>> index 0c9669edb2dd..451ad1849b9d 100644
>> --- a/drivers/net/ethernet/engleder/tsnep_main.c
>> +++ b/drivers/net/ethernet/engleder/tsnep_main.c
>> @@ -792,6 +792,9 @@ static void tsnep_rx_ring_cleanup(struct tsnep_rx *rx)
>>   		entry->page = NULL;
>>   	}
>>   
>> +	if (xdp_rxq_info_is_reg(&rx->xdp_rxq))
>> +		xdp_rxq_info_unreg(&rx->xdp_rxq);
>> +
>>   	if (rx->page_pool)
>>   		page_pool_destroy(rx->page_pool);
>>   
>> @@ -807,7 +810,7 @@ static void tsnep_rx_ring_cleanup(struct tsnep_rx *rx)
>>   	}
>>   }
>>   
>> -static int tsnep_rx_ring_init(struct tsnep_rx *rx)
>> +static int tsnep_rx_ring_init(struct tsnep_rx *rx, unsigned int napi_id)
>>   {
>>   	struct device *dmadev = rx->adapter->dmadev;
>>   	struct tsnep_rx_entry *entry;
>> @@ -850,6 +853,15 @@ static int tsnep_rx_ring_init(struct tsnep_rx *rx)
>>   		goto failed;
>>   	}
>>   
>> +	retval = xdp_rxq_info_reg(&rx->xdp_rxq, rx->adapter->netdev,
>> +				  rx->queue_index, napi_id);
>> +	if (retval)
>> +		goto failed;
>> +	retval = xdp_rxq_info_reg_mem_model(&rx->xdp_rxq, MEM_TYPE_PAGE_POOL,
>> +					    rx->page_pool);
>> +	if (retval)
>> +		goto failed;
>> +
>>   	for (i = 0; i < TSNEP_RING_SIZE; i++) {
>>   		entry = &rx->entry[i];
>>   		next_entry = &rx->entry[(i + 1) % TSNEP_RING_SIZE];
>> @@ -1104,7 +1116,8 @@ static bool tsnep_rx_pending(struct tsnep_rx *rx)
>>   }
>>   
>>   static int tsnep_rx_open(struct tsnep_adapter *adapter, void __iomem *addr,
>> -			 int queue_index, struct tsnep_rx *rx)
>> +			 unsigned int napi_id, int queue_index,
>> +			 struct tsnep_rx *rx)
>>   {
>>   	dma_addr_t dma;
>>   	int retval;
>> @@ -1118,7 +1131,7 @@ static int tsnep_rx_open(struct tsnep_adapter *adapter, void __iomem *addr,
>>   	else
>>   		rx->offset = TSNEP_SKB_PAD;
>>   
>> -	retval = tsnep_rx_ring_init(rx);
>> +	retval = tsnep_rx_ring_init(rx, napi_id);
>>   	if (retval)
>>   		return retval;
>>   
>> @@ -1245,14 +1258,19 @@ static void tsnep_free_irq(struct tsnep_queue *queue, bool first)
>>   static int tsnep_netdev_open(struct net_device *netdev)
>>   {
>>   	struct tsnep_adapter *adapter = netdev_priv(netdev);
>> -	int i;
>> -	void __iomem *addr;
>>   	int tx_queue_index = 0;
>>   	int rx_queue_index = 0;
>> -	int retval;
>> +	unsigned int napi_id;
>> +	void __iomem *addr;
>> +	int i, retval;
>>   
>>   	for (i = 0; i < adapter->num_queues; i++) {
>>   		adapter->queue[i].adapter = adapter;
>> +
>> +		netif_napi_add(adapter->netdev, &adapter->queue[i].napi,
>> +			       tsnep_poll);
>> +		napi_id = adapter->queue[i].napi.napi_id;
>> +
> 
> This is a good example of what I am referring to.
> 
>>   		if (adapter->queue[i].tx) {
>>   			addr = adapter->addr + TSNEP_QUEUE(tx_queue_index);
>>   			retval = tsnep_tx_open(adapter, addr, tx_queue_index,
>> @@ -1263,7 +1281,7 @@ static int tsnep_netdev_open(struct net_device *netdev)
>>   		}
>>   		if (adapter->queue[i].rx) {
>>   			addr = adapter->addr + TSNEP_QUEUE(rx_queue_index);
>> -			retval = tsnep_rx_open(adapter, addr,
>> +			retval = tsnep_rx_open(adapter, addr, napi_id,
>>   					       rx_queue_index,
>>   					       adapter->queue[i].rx);
>>   			if (retval)
>> @@ -1295,8 +1313,6 @@ static int tsnep_netdev_open(struct net_device *netdev)
>>   		goto phy_failed;
>>   
>>   	for (i = 0; i < adapter->num_queues; i++) {
>> -		netif_napi_add(adapter->netdev, &adapter->queue[i].napi,
>> -			       tsnep_poll);
>>   		napi_enable(&adapter->queue[i].napi);
>>   
>>   		tsnep_enable_irq(adapter, adapter->queue[i].irq_mask);
> 
> What you could do here is look at making all the napi_add/napi_enable
> and xdp_rxq_info items into one function to handle all the enabling.

I will rework that.

>> @@ -1317,6 +1333,8 @@ static int tsnep_netdev_open(struct net_device *netdev)
>>   			tsnep_rx_close(adapter->queue[i].rx);
>>   		if (adapter->queue[i].tx)
>>   			tsnep_tx_close(adapter->queue[i].tx);
>> +
>> +		netif_napi_del(&adapter->queue[i].napi);
>>   	}
>>   	return retval;
>>   }
>> @@ -1335,7 +1353,6 @@ static int tsnep_netdev_close(struct net_device *netdev)
>>   		tsnep_disable_irq(adapter, adapter->queue[i].irq_mask);
>>   
>>   		napi_disable(&adapter->queue[i].napi);
>> -		netif_napi_del(&adapter->queue[i].napi);
>>   
>>   		tsnep_free_irq(&adapter->queue[i], i == 0);
>>   
> 
> Likewise here you could take care of all the same items with the page
> pool being freed after you have already unregistered and freed the napi
> instance.

I'm not sure if I understand it right. According to your suggestion
above, napi and xdp_rxq_info should be freed here?

Gerhard

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 09/10] tsnep: Add XDP RX support
  2023-01-10 17:40   ` Alexander H Duyck
@ 2023-01-10 21:28     ` Gerhard Engleder
  2023-01-10 22:30       ` Alexander Duyck
  0 siblings, 1 reply; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-10 21:28 UTC (permalink / raw)
  To: Alexander H Duyck, netdev; +Cc: davem, kuba, edumazet, pabeni

On 10.01.23 18:40, Alexander H Duyck wrote:
> On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
>> If BPF program is set up, then run BPF program for every received frame
>> and execute the selected action.
>>
>> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
>> ---
>>   drivers/net/ethernet/engleder/tsnep_main.c | 122 ++++++++++++++++++++-
>>   1 file changed, 120 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
>> index 451ad1849b9d..002c879639db 100644
>> --- a/drivers/net/ethernet/engleder/tsnep_main.c
>> +++ b/drivers/net/ethernet/engleder/tsnep_main.c
>> @@ -27,6 +27,7 @@
>>   #include <linux/phy.h>
>>   #include <linux/iopoll.h>
>>   #include <linux/bpf.h>
>> +#include <linux/bpf_trace.h>
>>   
>>   #define TSNEP_SKB_PAD (NET_SKB_PAD + NET_IP_ALIGN)
>>   #define TSNEP_HEADROOM ALIGN(max(TSNEP_SKB_PAD, XDP_PACKET_HEADROOM), 4)
>> @@ -44,6 +45,9 @@
>>   #define TSNEP_COALESCE_USECS_MAX     ((ECM_INT_DELAY_MASK >> ECM_INT_DELAY_SHIFT) * \
>>   				      ECM_INT_DELAY_BASE_US + ECM_INT_DELAY_BASE_US - 1)
>>   
>> +#define TSNEP_XDP_TX		BIT(0)
>> +#define TSNEP_XDP_REDIRECT	BIT(1)
>> +
>>   enum {
>>   	__TSNEP_DOWN,
>>   };
>> @@ -625,6 +629,28 @@ static void tsnep_xdp_xmit_flush(struct tsnep_tx *tx)
>>   	iowrite32(TSNEP_CONTROL_TX_ENABLE, tx->addr + TSNEP_CONTROL);
>>   }
>>   
>> +static bool tsnep_xdp_xmit_back(struct tsnep_adapter *adapter,
>> +				struct xdp_buff *xdp,
>> +				struct netdev_queue *tx_nq, struct tsnep_tx *tx)
>> +{
>> +	struct xdp_frame *xdpf = xdp_convert_buff_to_frame(xdp);
>> +	bool xmit;
>> +
>> +	if (unlikely(!xdpf))
>> +		return false;
>> +
>> +	__netif_tx_lock(tx_nq, smp_processor_id());
>> +
>> +	/* Avoid transmit queue timeout since we share it with the slow path */
>> +	txq_trans_cond_update(tx_nq);
>> +
>> +	xmit = tsnep_xdp_xmit_frame_ring(xdpf, tx, TSNEP_TX_TYPE_XDP_TX);
>> +
> 
> Again the trans_cond_update should be after the xmit and only if it is
> not indicating it completed the transmit.

tsnep_xdp_xmit_frame_ring() only adds xdpf to the descriptor ring, so it
cannot complete the transmit. Therefore, and in line with your previous
comment, trans_cond_update() should be called here if xdpf is
successfully placed in the descriptor ring. Is that right?

>> +	__netif_tx_unlock(tx_nq);
>> +
>> +	return xmit;
>> +}
>> +
>>   static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
>>   {
>>   	struct tsnep_tx_entry *entry;
>> @@ -983,6 +1009,62 @@ static int tsnep_rx_refill(struct tsnep_rx *rx, int count, bool reuse)
>>   	return i;
>>   }
>>   
>> +static bool tsnep_xdp_run_prog(struct tsnep_rx *rx, struct bpf_prog *prog,
>> +			       struct xdp_buff *xdp, int *status,
>> +			       struct netdev_queue *tx_nq, struct tsnep_tx *tx)
>> +{
>> +	unsigned int length;
>> +	unsigned int sync;
>> +	u32 act;
>> +
>> +	length = xdp->data_end - xdp->data_hard_start - XDP_PACKET_HEADROOM;
>> +
>> +	act = bpf_prog_run_xdp(prog, xdp);
>> +
>> +	/* Due xdp_adjust_tail: DMA sync for_device cover max len CPU touch */
>> +	sync = xdp->data_end - xdp->data_hard_start - XDP_PACKET_HEADROOM;
>> +	sync = max(sync, length);
>> +
>> +	switch (act) {
>> +	case XDP_PASS:
>> +		return false;
>> +	case XDP_TX:
>> +		if (!tsnep_xdp_xmit_back(rx->adapter, xdp, tx_nq, tx))
>> +			goto out_failure;
>> +		*status |= TSNEP_XDP_TX;
>> +		return true;
>> +	case XDP_REDIRECT:
>> +		if (xdp_do_redirect(rx->adapter->netdev, xdp, prog) < 0)
>> +			goto out_failure;
>> +		*status |= TSNEP_XDP_REDIRECT;
>> +		return true;
>> +	default:
>> +		bpf_warn_invalid_xdp_action(rx->adapter->netdev, prog, act);
>> +		fallthrough;
>> +	case XDP_ABORTED:
>> +out_failure:
>> +		trace_xdp_exception(rx->adapter->netdev, prog, act);
>> +		fallthrough;
>> +	case XDP_DROP:
>> +		page_pool_put_page(rx->page_pool, virt_to_head_page(xdp->data),
>> +				   sync, true);
>> +		return true;
>> +	}
>> +}
>> +
>> +static void tsnep_finalize_xdp(struct tsnep_adapter *adapter, int status,
>> +			       struct netdev_queue *tx_nq, struct tsnep_tx *tx)
>> +{
>> +	if (status & TSNEP_XDP_TX) {
>> +		__netif_tx_lock(tx_nq, smp_processor_id());
>> +		tsnep_xdp_xmit_flush(tx);
>> +		__netif_tx_unlock(tx_nq);
>> +	}
>> +
>> +	if (status & TSNEP_XDP_REDIRECT)
>> +		xdp_do_flush();
>> +}
>> +
>>   static struct sk_buff *tsnep_build_skb(struct tsnep_rx *rx, struct page *page,
>>   				       int length)
>>   {
>> @@ -1018,15 +1100,29 @@ static int tsnep_rx_poll(struct tsnep_rx *rx, struct napi_struct *napi,
>>   			 int budget)
>>   {
>>   	struct device *dmadev = rx->adapter->dmadev;
>> -	int desc_available;
>> -	int done = 0;
>>   	enum dma_data_direction dma_dir;
>>   	struct tsnep_rx_entry *entry;
>> +	struct netdev_queue *tx_nq;
>> +	struct bpf_prog *prog;
>> +	struct xdp_buff xdp;
>>   	struct sk_buff *skb;
>> +	struct tsnep_tx *tx;
>> +	int desc_available;
>> +	int xdp_status = 0;
>> +	int done = 0;
>>   	int length;
>>   
>>   	desc_available = tsnep_rx_desc_available(rx);
>>   	dma_dir = page_pool_get_dma_dir(rx->page_pool);
>> +	prog = READ_ONCE(rx->adapter->xdp_prog);
>> +	if (prog) {
>> +		int queue = smp_processor_id() % rx->adapter->num_tx_queues;
>> +
> 
> As I mentioned before. Take a look at how this was addressed in
> skb_tx_hash. The modulus division is really expensive.
> 
> Also, does this make sense? I am assuming you have a 1:1 Tx to Rx
> mapping for your queues don't you? If so it might make more sense to
> use the Tx queue that you clean in this queue pair.

Sounds reasonable. I will work on that.

Gerhard

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 10/10] tsnep: Support XDP BPF program setup
  2023-01-10 17:33   ` Alexander H Duyck
@ 2023-01-10 21:38     ` Gerhard Engleder
  2023-01-10 23:00       ` Alexander H Duyck
  2023-01-11  0:12       ` Jakub Kicinski
  0 siblings, 2 replies; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-10 21:38 UTC (permalink / raw)
  To: Alexander H Duyck, netdev; +Cc: davem, kuba, edumazet, pabeni

On 10.01.23 18:33, Alexander H Duyck wrote:
> On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
>> Implement setup of BPF programs for XDP RX path with command
>> XDP_SETUP_PROG of ndo_bpf(). This is the final step for XDP RX path
>> support.
>>
>> tsnep_netdev_close() is called directly during BPF program setup. Add
>> netif_carrier_off() and netif_tx_stop_all_queues() calls to signal to
>> network stack that device is down. Otherwise network stack would
>> continue transmitting packets.
>>
>> Return value of tsnep_netdev_open() is not checked during BPF program
>> setup like in other drivers. Forwarding the return value would result in
>> a bpf_prog_put() call in dev_xdp_install(), which would make removal of
>> BPF program necessary.
>>
>> If tsnep_netdev_open() fails during BPF program setup, then the network
>> stack would call tsnep_netdev_close() anyway. Thus, tsnep_netdev_close()
>> checks now if device is already down.
>>
>> Additionally remove $(tsnep-y) from $(tsnep-objs) because it is added
>> automatically.
>>
>> Test results with A53 1.2GHz:
>>
>> XDP_DROP (samples/bpf/xdp1)
>> proto 17:     883878 pkt/s
>>
>> XDP_TX (samples/bpf/xdp2)
>> proto 17:     255693 pkt/s
>>
>> XDP_REDIRECT (samples/bpf/xdpsock)
>>   sock0@eth2:0 rxdrop xdp-drv
>>                     pps            pkts           1.00
>> rx                 855,582        5,404,523
>> tx                 0              0
>>
>> XDP_REDIRECT (samples/bpf/xdp_redirect)
>> eth2->eth1         613,267 rx/s   0 err,drop/s   613,272 xmit/s
>>
>> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
>> ---
>>   drivers/net/ethernet/engleder/Makefile     |  2 +-
>>   drivers/net/ethernet/engleder/tsnep.h      |  6 +++++
>>   drivers/net/ethernet/engleder/tsnep_main.c | 25 ++++++++++++++++---
>>   drivers/net/ethernet/engleder/tsnep_xdp.c  | 29 ++++++++++++++++++++++
>>   4 files changed, 58 insertions(+), 4 deletions(-)
>>   create mode 100644 drivers/net/ethernet/engleder/tsnep_xdp.c
>>
>>
> 
> <...>
> 
>> --- a/drivers/net/ethernet/engleder/tsnep_main.c
>> +++ b/drivers/net/ethernet/engleder/tsnep_main.c
>> @@ -1373,7 +1373,7 @@ static void tsnep_free_irq(struct tsnep_queue *queue, bool first)
>>   	memset(queue->name, 0, sizeof(queue->name));
>>   }
>>   
>> -static int tsnep_netdev_open(struct net_device *netdev)
>> +int tsnep_netdev_open(struct net_device *netdev)
>>   {
>>   	struct tsnep_adapter *adapter = netdev_priv(netdev);
>>   	int tx_queue_index = 0;
>> @@ -1436,6 +1436,8 @@ static int tsnep_netdev_open(struct net_device *netdev)
>>   		tsnep_enable_irq(adapter, adapter->queue[i].irq_mask);
>>   	}
>>   
>> +	netif_tx_start_all_queues(adapter->netdev);
>> +
>>   	clear_bit(__TSNEP_DOWN, &adapter->state);
>>   
>>   	return 0;
>> @@ -1457,12 +1459,16 @@ static int tsnep_netdev_open(struct net_device *netdev)
>>   	return retval;
>>   }
>>   
>> -static int tsnep_netdev_close(struct net_device *netdev)
>> +int tsnep_netdev_close(struct net_device *netdev)
>>   {
>>   	struct tsnep_adapter *adapter = netdev_priv(netdev);
>>   	int i;
>>   
>> -	set_bit(__TSNEP_DOWN, &adapter->state);
>> +	if (test_and_set_bit(__TSNEP_DOWN, &adapter->state))
>> +		return 0;
>> +
>> +	netif_carrier_off(netdev);
>> +	netif_tx_stop_all_queues(netdev);
>>   
> 
> As I called out earlier the __TSNEP_DOWN is just !IFF_UP so you don't
> need that bit.
> 
> The fact that netif_carrier_off is here also points out the fact that
> the code in the Tx path isn't needed regarding __TSNEP_DOWN and you can
> probably just check netif_carrier_ok if you need the check.

tsnep_netdev_close() is called directly during bpf prog setup (see
tsnep_xdp_setup_prog() in this commit). If the following
tsnep_netdev_open() call fails, then this flag signals that the device
is already down and nothing needs to be cleaned up if
tsnep_netdev_close() is called later (because IFF_UP is still set).

Thanks for the review!

Gerhard

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 08/10] tsnep: Add RX queue info for XDP support
  2023-01-10 21:21     ` Gerhard Engleder
@ 2023-01-10 22:27       ` Alexander Duyck
  0 siblings, 0 replies; 36+ messages in thread
From: Alexander Duyck @ 2023-01-10 22:27 UTC (permalink / raw)
  To: Gerhard Engleder; +Cc: netdev, davem, kuba, edumazet, pabeni, Saeed Mahameed

On Tue, Jan 10, 2023 at 1:21 PM Gerhard Engleder
<gerhard@engleder-embedded.com> wrote:
>
> On 10.01.23 18:29, Alexander H Duyck wrote:
> > On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
> >> Register xdp_rxq_info with page_pool memory model. This is needed for
> >> XDP buffer handling.
> >>

<...>

> >> @@ -1317,6 +1333,8 @@ static int tsnep_netdev_open(struct net_device *netdev)
> >>                      tsnep_rx_close(adapter->queue[i].rx);
> >>              if (adapter->queue[i].tx)
> >>                      tsnep_tx_close(adapter->queue[i].tx);
> >> +
> >> +            netif_napi_del(&adapter->queue[i].napi);
> >>      }
> >>      return retval;
> >>   }
> >> @@ -1335,7 +1353,6 @@ static int tsnep_netdev_close(struct net_device *netdev)
> >>              tsnep_disable_irq(adapter, adapter->queue[i].irq_mask);
> >>
> >>              napi_disable(&adapter->queue[i].napi);
> >> -            netif_napi_del(&adapter->queue[i].napi);
> >>
> >>              tsnep_free_irq(&adapter->queue[i], i == 0);
> >>
> >
> > Likewise here you could take care of all the same items with the page
> > pool being freed after you have already unregistered and freed the napi
> > instance.
>
> I'm not sure if I understand it right. According to your suggestion
> above napi and xdp_rxq_info should be freed here?

Right. Between the napi_disable and the netif_napi_del you could
unregister the mem model and the xdp_info. Basically the two are tied
more closely to the NAPI instance than to the Rx queue itself, so it
would make sense to just take care of them here. You might look at
putting together a function to handle all of it, since then you just
pass &adapter->queue once and use a local variable in the function.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 09/10] tsnep: Add XDP RX support
  2023-01-10 21:28     ` Gerhard Engleder
@ 2023-01-10 22:30       ` Alexander Duyck
  0 siblings, 0 replies; 36+ messages in thread
From: Alexander Duyck @ 2023-01-10 22:30 UTC (permalink / raw)
  To: Gerhard Engleder; +Cc: netdev, davem, kuba, edumazet, pabeni

On Tue, Jan 10, 2023 at 1:28 PM Gerhard Engleder
<gerhard@engleder-embedded.com> wrote:
>
> On 10.01.23 18:40, Alexander H Duyck wrote:
> > On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
> >> If BPF program is set up, then run BPF program for every received frame
> >> and execute the selected action.
> >>
> >> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
> >> ---
> >>   drivers/net/ethernet/engleder/tsnep_main.c | 122 ++++++++++++++++++++-
> >>   1 file changed, 120 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
> >> index 451ad1849b9d..002c879639db 100644
> >> --- a/drivers/net/ethernet/engleder/tsnep_main.c
> >> +++ b/drivers/net/ethernet/engleder/tsnep_main.c
> >> @@ -27,6 +27,7 @@
> >>   #include <linux/phy.h>
> >>   #include <linux/iopoll.h>
> >>   #include <linux/bpf.h>
> >> +#include <linux/bpf_trace.h>
> >>
> >>   #define TSNEP_SKB_PAD (NET_SKB_PAD + NET_IP_ALIGN)
> >>   #define TSNEP_HEADROOM ALIGN(max(TSNEP_SKB_PAD, XDP_PACKET_HEADROOM), 4)
> >> @@ -44,6 +45,9 @@
> >>   #define TSNEP_COALESCE_USECS_MAX     ((ECM_INT_DELAY_MASK >> ECM_INT_DELAY_SHIFT) * \
> >>                                    ECM_INT_DELAY_BASE_US + ECM_INT_DELAY_BASE_US - 1)
> >>
> >> +#define TSNEP_XDP_TX                BIT(0)
> >> +#define TSNEP_XDP_REDIRECT  BIT(1)
> >> +
> >>   enum {
> >>      __TSNEP_DOWN,
> >>   };
> >> @@ -625,6 +629,28 @@ static void tsnep_xdp_xmit_flush(struct tsnep_tx *tx)
> >>      iowrite32(TSNEP_CONTROL_TX_ENABLE, tx->addr + TSNEP_CONTROL);
> >>   }
> >>
> >> +static bool tsnep_xdp_xmit_back(struct tsnep_adapter *adapter,
> >> +                            struct xdp_buff *xdp,
> >> +                            struct netdev_queue *tx_nq, struct tsnep_tx *tx)
> >> +{
> >> +    struct xdp_frame *xdpf = xdp_convert_buff_to_frame(xdp);
> >> +    bool xmit;
> >> +
> >> +    if (unlikely(!xdpf))
> >> +            return false;
> >> +
> >> +    __netif_tx_lock(tx_nq, smp_processor_id());
> >> +
> >> +    /* Avoid transmit queue timeout since we share it with the slow path */
> >> +    txq_trans_cond_update(tx_nq);
> >> +
> >> +    xmit = tsnep_xdp_xmit_frame_ring(xdpf, tx, TSNEP_TX_TYPE_XDP_TX);
> >> +
> >
> > Again the trans_cond_update should be after the xmit and only if it is
> > not indicating it completed the transmit.
>
> tsnep_xdp_xmit_frame_ring() only adds the xdpf to the descriptor ring, so
> it cannot complete the transmit. Therefore, and in line with your previous
> comment, trans_cond_update() should be called here if the xdpf is
> successfully placed in the descriptor ring. Is that right?

Yes, that is what I meant by "complete the transmit": if it places
the xdpf on the descriptor ring then you can update this. Basically
the idea is that we should be updating the timer any time a frame goes
onto the ring. It shouldn't be an unconditional update, as a stalled
ring could then go undetected.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 05/10] tsnep: Add XDP TX support
  2023-01-10 21:07     ` Gerhard Engleder
@ 2023-01-10 22:38       ` Alexander Duyck
  2023-01-11 18:51         ` Gerhard Engleder
  0 siblings, 1 reply; 36+ messages in thread
From: Alexander Duyck @ 2023-01-10 22:38 UTC (permalink / raw)
  To: Gerhard Engleder; +Cc: netdev, davem, kuba, edumazet, pabeni

On Tue, Jan 10, 2023 at 1:07 PM Gerhard Engleder
<gerhard@engleder-embedded.com> wrote:
>
> On 10.01.23 17:56, Alexander H Duyck wrote:
> > On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
> >> Implement ndo_xdp_xmit() for XDP TX support. Support for fragmented XDP
> >> frames is included.
> >>
> >> Also some const, braces and logic clean ups are done in normal TX path
> >> to keep both TX paths in sync.
> >>
> >> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>

<...>

> >>   struct tsnep_tx_entry {
> >>      struct tsnep_tx_desc *desc;
> >>      struct tsnep_tx_desc_wb *desc_wb;
> >> @@ -65,7 +71,11 @@ struct tsnep_tx_entry {
> >>
> >>      u32 properties;
> >>
> >> -    struct sk_buff *skb;
> >> +    enum tsnep_tx_type type;
> >> +    union {
> >> +            struct sk_buff *skb;
> >> +            struct xdp_frame *xdpf;
> >> +    };
> >>      size_t len;
> >>      DEFINE_DMA_UNMAP_ADDR(dma);
> >>   };
> >> diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
> >> index 943de5a09693..1ae73c706c9e 100644
> >> --- a/drivers/net/ethernet/engleder/tsnep_main.c
> >> +++ b/drivers/net/ethernet/engleder/tsnep_main.c
> >> @@ -310,10 +310,12 @@ static void tsnep_tx_activate(struct tsnep_tx *tx, int index, int length,
> >>      struct tsnep_tx_entry *entry = &tx->entry[index];
> >>
> >>      entry->properties = 0;
> >> +    /* xdpf is union with skb */
> >>      if (entry->skb) {
> >>              entry->properties = length & TSNEP_DESC_LENGTH_MASK;
> >>              entry->properties |= TSNEP_DESC_INTERRUPT_FLAG;
> >> -            if (skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS)
> >> +            if (entry->type == TSNEP_TX_TYPE_SKB &&
> >> +                (skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS))
> >>                      entry->properties |= TSNEP_DESC_EXTENDED_WRITEBACK_FLAG;
> >>
> >>              /* toggle user flag to prevent false acknowledge
> >> @@ -370,7 +372,8 @@ static int tsnep_tx_desc_available(struct tsnep_tx *tx)
> >>              return tx->read - tx->write - 1;
> >>   }
> >>
> >> -static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count)
> >> +static int tsnep_tx_map(const struct sk_buff *skb, struct tsnep_tx *tx,
> >> +                    int count)
> >>   {
> >
> > This change to const doesn't add anything since this is a static
> > function. You could probably just skip making this change since the
> > function will likely be inlined anyway.
>
> const was requested for tsnep_xdp_tx_map() during the last review round,
> so I added it also here to keep both functions similar.

As a general rule it doesn't add anything to make an argument to a
static function const unless the callers are also making it a const.
Otherwise you just end up adding useless modifiers to the code that
will be thrown away, as the compiler can already take care of whatever
optimizations it can get out of it.

> >>      struct device *dmadev = tx->adapter->dmadev;
> >>      struct tsnep_tx_entry *entry;
> >> @@ -382,7 +385,7 @@ static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count)
> >>      for (i = 0; i < count; i++) {
> >>              entry = &tx->entry[(tx->write + i) % TSNEP_RING_SIZE];
> >>
> >> -            if (i == 0) {
> >> +            if (!i) {
> >>                      len = skb_headlen(skb);
> >>                      dma = dma_map_single(dmadev, skb->data, len,
> >>                                           DMA_TO_DEVICE);
> >> @@ -400,6 +403,8 @@ static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count)
> >>
> >>              entry->desc->tx = __cpu_to_le64(dma);
> >>
> >> +            entry->type = TSNEP_TX_TYPE_SKB;
> >> +
> >>              map_len += len;
> >>      }
> >>
> >
> > I wonder if it wouldn't be better to change this so that you have a 4th
> > type for just "PAGE" or "FRAGMENT", since you only really need to
> > identify this as SKB or XDP on the first buffer, and then all the rest
> > are just going to be unmapped as page regardless of whether it is XDP
> > or SKB.
>
> I will try it in combination with your bitmap suggestion.
>
> >> @@ -417,12 +422,13 @@ static int tsnep_tx_unmap(struct tsnep_tx *tx, int index, int count)
> >>              entry = &tx->entry[(index + i) % TSNEP_RING_SIZE];
> >>
> >>              if (entry->len) {
> >> -                    if (i == 0)
> >> +                    if (!i && entry->type == TSNEP_TX_TYPE_SKB)
> >>                              dma_unmap_single(dmadev,
> >>                                               dma_unmap_addr(entry, dma),
> >>                                               dma_unmap_len(entry, len),
> >>                                               DMA_TO_DEVICE);
> >> -                    else
> >> +                    else if (entry->type == TSNEP_TX_TYPE_SKB ||
> >> +                             entry->type == TSNEP_TX_TYPE_XDP_NDO)
> >>                              dma_unmap_page(dmadev,
> >>                                             dma_unmap_addr(entry, dma),
> >>                                             dma_unmap_len(entry, len),
> >
> > Rather than perform 2 checks here you could just verify type !=
> > TYPE_XDP_TX which would save you a check.
>
> Will be improved with your bitmap suggestion.
>
> >> @@ -482,7 +488,7 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
> >>
> >>      for (i = 0; i < count; i++)
> >>              tsnep_tx_activate(tx, (tx->write + i) % TSNEP_RING_SIZE, length,
> >> -                              i == (count - 1));
> >> +                              i == count - 1);
> >>      tx->write = (tx->write + count) % TSNEP_RING_SIZE;
> >>
> >>      skb_tx_timestamp(skb);
> >> @@ -502,12 +508,133 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
> >>      return NETDEV_TX_OK;
> >>   }
> >>
> >> +static int tsnep_xdp_tx_map(const struct xdp_frame *xdpf, struct tsnep_tx *tx,
> >> +                        const struct skb_shared_info *shinfo, int count,
> >> +                        enum tsnep_tx_type type)
> >
> > Again the const here isn't adding any value since this is a static
> > function and will likely be inlined into the function below which calls
> > it.
>
> const was requested here during last review round so I added it. It may
> add some value by detecting some problems at compile time.

I suppose, but really adding a const attribute here doesn't add much
unless you are also going to enforce it at higher levels such as the
xmit_frame_ring function itself. Also keep in mind that all the
const would protect is the xdp frame structure itself. It does nothing
to keep us from modifying the data in the pages and such.

> >> +{
> >> +    struct device *dmadev = tx->adapter->dmadev;
> >> +    struct tsnep_tx_entry *entry;
> >> +    const skb_frag_t *frag;
> >> +    struct page *page;
> >> +    unsigned int len;
> >> +    int map_len = 0;
> >> +    dma_addr_t dma;
> >> +    void *data;
> >> +    int i;
> >> +
> >> +    frag = NULL;
> >> +    len = xdpf->len;
> >> +    for (i = 0; i < count; i++) {
> >> +            entry = &tx->entry[(tx->write + i) % TSNEP_RING_SIZE];
> >> +            if (type == TSNEP_TX_TYPE_XDP_NDO) {
> >> +                    data = unlikely(frag) ? skb_frag_address(frag) :
> >> +                                            xdpf->data;
> >> +                    dma = dma_map_single(dmadev, data, len, DMA_TO_DEVICE);
> >> +                    if (dma_mapping_error(dmadev, dma))
> >> +                            return -ENOMEM;
> >> +
> >> +                    entry->type = TSNEP_TX_TYPE_XDP_NDO;
> >> +            } else {
> >> +                    page = unlikely(frag) ? skb_frag_page(frag) :
> >> +                                            virt_to_page(xdpf->data);
> >> +                    dma = page_pool_get_dma_addr(page);
> >> +                    if (unlikely(frag))
> >> +                            dma += skb_frag_off(frag);
> >> +                    else
> >> +                            dma += sizeof(*xdpf) + xdpf->headroom;
> >> +                    dma_sync_single_for_device(dmadev, dma, len,
> >> +                                               DMA_BIDIRECTIONAL);
> >> +
> >> +                    entry->type = TSNEP_TX_TYPE_XDP_TX;
> >> +            }
> >> +
> >> +            entry->len = len;
> >> +            dma_unmap_addr_set(entry, dma, dma);
> >> +
> >> +            entry->desc->tx = __cpu_to_le64(dma);
> >> +
> >> +            map_len += len;
> >> +
> >> +            if (i + 1 < count) {
> >> +                    frag = &shinfo->frags[i];
> >> +                    len = skb_frag_size(frag);
> >> +            }
> >> +    }
> >> +
> >> +    return map_len;
> >> +}
> >> +
> >> +/* This function requires __netif_tx_lock is held by the caller. */
> >> +static bool tsnep_xdp_xmit_frame_ring(struct xdp_frame *xdpf,
> >> +                                  struct tsnep_tx *tx,
> >> +                                  enum tsnep_tx_type type)
> >> +{
> >> +    const struct skb_shared_info *shinfo =
> >> +            xdp_get_shared_info_from_frame(xdpf);
> >> +    struct tsnep_tx_entry *entry;
> >> +    int count, length, retval, i;
> >> +
> >> +    count = 1;
> >> +    if (unlikely(xdp_frame_has_frags(xdpf)))
> >> +            count += shinfo->nr_frags;
> >> +
> >> +    spin_lock_bh(&tx->lock);
> >> +
> >> +    /* ensure that TX ring is not filled up by XDP, always MAX_SKB_FRAGS
> >> +     * will be available for normal TX path and queue is stopped there if
> >> +     * necessary
> >> +     */
> >> +    if (tsnep_tx_desc_available(tx) < (MAX_SKB_FRAGS + 1 + count)) {
> >> +            spin_unlock_bh(&tx->lock);
> >> +
> >> +            return false;
> >> +    }
> >> +
> >> +    entry = &tx->entry[tx->write];
> >> +    entry->xdpf = xdpf;
> >> +
> >> +    retval = tsnep_xdp_tx_map(xdpf, tx, shinfo, count, type);
> >> +    if (retval < 0) {
> >> +            tsnep_tx_unmap(tx, tx->write, count);
> >> +            entry->xdpf = NULL;
> >> +
> >> +            tx->dropped++;
> >> +
> >> +            spin_unlock_bh(&tx->lock);
> >> +
> >> +            return false;
> >> +    }
> >> +    length = retval;
> >> +
> >> +    for (i = 0; i < count; i++)
> >> +            tsnep_tx_activate(tx, (tx->write + i) % TSNEP_RING_SIZE, length,
> >> +                              i == count - 1);
> >> +    tx->write = (tx->write + count) % TSNEP_RING_SIZE;
> >> +
> >> +    /* descriptor properties shall be valid before hardware is notified */
> >> +    dma_wmb();
> >> +
> >> +    spin_unlock_bh(&tx->lock);
> >> +
> >> +    return true;
> >> +}
> >> +
> >> +static void tsnep_xdp_xmit_flush(struct tsnep_tx *tx)
> >> +{
> >> +    iowrite32(TSNEP_CONTROL_TX_ENABLE, tx->addr + TSNEP_CONTROL);
> >> +}
> >> +
> >>   static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
> >>   {
> >> -    int budget = 128;
> >>      struct tsnep_tx_entry *entry;
> >> -    int count;
> >> +    struct xdp_frame_bulk bq;
> >> +    int budget = 128;
> >>      int length;
> >> +    int count;
> >> +
> >> +    xdp_frame_bulk_init(&bq);
> >> +
> >> +    rcu_read_lock(); /* need for xdp_return_frame_bulk */
> >>
> >
> > You should be able to get rid of both of these. See comments below.
> >
> >>      spin_lock_bh(&tx->lock);
> >>
> >> @@ -527,12 +654,17 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
> >>              dma_rmb();
> >>
> >>              count = 1;
> >> -            if (skb_shinfo(entry->skb)->nr_frags > 0)
> >> +            if (entry->type == TSNEP_TX_TYPE_SKB &&
> >> +                skb_shinfo(entry->skb)->nr_frags > 0)
> >>                      count += skb_shinfo(entry->skb)->nr_frags;
> >> +            else if (entry->type != TSNEP_TX_TYPE_SKB &&
> >> +                     xdp_frame_has_frags(entry->xdpf))
> >> +                    count += xdp_get_shared_info_from_frame(entry->xdpf)->nr_frags;
> >>
> >>              length = tsnep_tx_unmap(tx, tx->read, count);
> >>
> >> -            if ((skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS) &&
> >> +            if (entry->type == TSNEP_TX_TYPE_SKB &&
> >> +                (skb_shinfo(entry->skb)->tx_flags & SKBTX_IN_PROGRESS) &&
> >>                  (__le32_to_cpu(entry->desc_wb->properties) &
> >>                   TSNEP_DESC_EXTENDED_WRITEBACK_FLAG)) {
> >>                      struct skb_shared_hwtstamps hwtstamps;
> >> @@ -552,7 +684,18 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
> >>                      skb_tstamp_tx(entry->skb, &hwtstamps);
> >>              }
> >>
> >> -            napi_consume_skb(entry->skb, napi_budget);
> >> +            switch (entry->type) {
> >> +            case TSNEP_TX_TYPE_SKB:
> >> +                    napi_consume_skb(entry->skb, napi_budget);
> >> +                    break;
> >> +            case TSNEP_TX_TYPE_XDP_TX:
> >> +                    xdp_return_frame_rx_napi(entry->xdpf);
> >> +                    break;
> >> +            case TSNEP_TX_TYPE_XDP_NDO:
> >> +                    xdp_return_frame_bulk(entry->xdpf, &bq);
> >> +                    break;
> >> +            }
> >> +            /* xdpf is union with skb */
> >>              entry->skb = NULL;
> >>
> >>              tx->read = (tx->read + count) % TSNEP_RING_SIZE;
> >
> > If I am not mistaken I think you can use xdp_return_frame_rx_napi for
> > both of these without having to resort to the bulk approach since the
> > tsnep_tx_poll is operating in the NAPI polling context.
> >
> > When you free the buffers outside of polling context you would then
> > want to use the xdp_return_frame_bulk. So for example if you have any
> > buffers that weren't transmitted when you get to the point of
> > tsnep_tx_ring_cleanup then you would want to unmap and free them there
> > using something like the xdp_return_frame or xdp_return_frame_bulk.
>
> This is done similarly in other drivers, so I thought it is good
> practice. I will compare the performance of
> xdp_return_frame_rx_napi() and xdp_return_frame_bulk().

Okay, sounds good.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 10/10] tsnep: Support XDP BPF program setup
  2023-01-10 21:38     ` Gerhard Engleder
@ 2023-01-10 23:00       ` Alexander H Duyck
  2023-01-11 18:58         ` Gerhard Engleder
  2023-01-11  0:12       ` Jakub Kicinski
  1 sibling, 1 reply; 36+ messages in thread
From: Alexander H Duyck @ 2023-01-10 23:00 UTC (permalink / raw)
  To: Gerhard Engleder, netdev; +Cc: davem, kuba, edumazet, pabeni

On Tue, 2023-01-10 at 22:38 +0100, Gerhard Engleder wrote:
> On 10.01.23 18:33, Alexander H Duyck wrote:
> > On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
> > > Implement setup of BPF programs for XDP RX path with command
> > > XDP_SETUP_PROG of ndo_bpf(). This is the final step for XDP RX path
> > > support.
> > > 
> > > tsnep_netdev_close() is called directly during BPF program setup. Add
> > > netif_carrier_off() and netif_tx_stop_all_queues() calls to signal to
> > > network stack that device is down. Otherwise network stack would
> > > continue transmitting packets.
> > > 
> > > Return value of tsnep_netdev_open() is not checked during BPF program
> > > setup like in other drivers. Forwarding the return value would result in
> > > a bpf_prog_put() call in dev_xdp_install(), which would make removal of
> > > BPF program necessary.
> > > 
> > > If tsnep_netdev_open() fails during BPF program setup, then the network
> > > stack would call tsnep_netdev_close() anyway. Thus, tsnep_netdev_close()
> > > checks now if device is already down.
> > > 
> > > Additionally remove $(tsnep-y) from $(tsnep-objs) because it is added
> > > automatically.
> > > 
> > > Test results with A53 1.2GHz:
> > > 
> > > XDP_DROP (samples/bpf/xdp1)
> > > proto 17:     883878 pkt/s
> > > 
> > > XDP_TX (samples/bpf/xdp2)
> > > proto 17:     255693 pkt/s
> > > 
> > > XDP_REDIRECT (samples/bpf/xdpsock)
> > >   sock0@eth2:0 rxdrop xdp-drv
> > >                     pps            pkts           1.00
> > > rx                 855,582        5,404,523
> > > tx                 0              0
> > > 
> > > XDP_REDIRECT (samples/bpf/xdp_redirect)
> > > eth2->eth1         613,267 rx/s   0 err,drop/s   613,272 xmit/s
> > > 
> > > Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
> > > ---
> > >   drivers/net/ethernet/engleder/Makefile     |  2 +-
> > >   drivers/net/ethernet/engleder/tsnep.h      |  6 +++++
> > >   drivers/net/ethernet/engleder/tsnep_main.c | 25 ++++++++++++++++---
> > >   drivers/net/ethernet/engleder/tsnep_xdp.c  | 29 ++++++++++++++++++++++
> > >   4 files changed, 58 insertions(+), 4 deletions(-)
> > >   create mode 100644 drivers/net/ethernet/engleder/tsnep_xdp.c
> > > 
> > > 
> > 
> > <...>
> > 
> > > --- a/drivers/net/ethernet/engleder/tsnep_main.c
> > > +++ b/drivers/net/ethernet/engleder/tsnep_main.c
> > > @@ -1373,7 +1373,7 @@ static void tsnep_free_irq(struct tsnep_queue *queue, bool first)
> > >   	memset(queue->name, 0, sizeof(queue->name));
> > >   }
> > >   
> > > -static int tsnep_netdev_open(struct net_device *netdev)
> > > +int tsnep_netdev_open(struct net_device *netdev)
> > >   {
> > >   	struct tsnep_adapter *adapter = netdev_priv(netdev);
> > >   	int tx_queue_index = 0;
> > > @@ -1436,6 +1436,8 @@ static int tsnep_netdev_open(struct net_device *netdev)
> > >   		tsnep_enable_irq(adapter, adapter->queue[i].irq_mask);
> > >   	}
> > >   
> > > +	netif_tx_start_all_queues(adapter->netdev);
> > > +
> > >   	clear_bit(__TSNEP_DOWN, &adapter->state);
> > >   
> > >   	return 0;
> > > @@ -1457,12 +1459,16 @@ static int tsnep_netdev_open(struct net_device *netdev)
> > >   	return retval;
> > >   }
> > >   
> > > -static int tsnep_netdev_close(struct net_device *netdev)
> > > +int tsnep_netdev_close(struct net_device *netdev)
> > >   {
> > >   	struct tsnep_adapter *adapter = netdev_priv(netdev);
> > >   	int i;
> > >   
> > > -	set_bit(__TSNEP_DOWN, &adapter->state);
> > > +	if (test_and_set_bit(__TSNEP_DOWN, &adapter->state))
> > > +		return 0;
> > > +
> > > +	netif_carrier_off(netdev);
> > > +	netif_tx_stop_all_queues(netdev);
> > >   
> > 
> > As I called out earlier the __TSNEP_DOWN is just !IFF_UP so you don't
> > need that bit.
> > 
> > The fact that netif_carrier_off is here also points out the fact that
> > the code in the Tx path isn't needed regarding __TSNEP_DOWN and you can
> > probably just check netif_carrier_ok if you need the check.
> 
> tsnep_netdev_close() is called directly during bpf prog setup (see
> tsnep_xdp_setup_prog() in this commit). If the following
> tsnep_netdev_open() call fails, then this flag signals that the device
> is already down and nothing needs to be cleaned up if
> tsnep_netdev_close() is called later (because IFF_UP is still set).

If the call to close was fouled up you should probably be blocking
access to the device via at least a netif_device_detach. I suppose you
could use the __LINK_STATE_PRESENT bit as the inverse of the
__TSNEP_DOWN bit. If your open fails you clean up, detach the device,
and in the close path you only run through it if the device is present.

Basically what we want to avoid is adding a bunch of extra state as
what we tend to see is that it will start to create a snarl as you add
more and more layers.



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 10/10] tsnep: Support XDP BPF program setup
  2023-01-10 21:38     ` Gerhard Engleder
  2023-01-10 23:00       ` Alexander H Duyck
@ 2023-01-11  0:12       ` Jakub Kicinski
  2023-01-11 19:11         ` Gerhard Engleder
  1 sibling, 1 reply; 36+ messages in thread
From: Jakub Kicinski @ 2023-01-11  0:12 UTC (permalink / raw)
  To: Gerhard Engleder; +Cc: Alexander H Duyck, netdev, davem, edumazet, pabeni

On Tue, 10 Jan 2023 22:38:04 +0100 Gerhard Engleder wrote:
> > As I called out earlier the __TSNEP_DOWN is just !IFF_UP so you don't
> > need that bit.
> > 
> > The fact that netif_carrier_off is here also points out the fact that
> > the code in the Tx path isn't needed regarding __TSNEP_DOWN and you can
> > probably just check netif_carrier_ok if you need the check.  
> 
> tsnep_netdev_close() is called directly during bpf prog setup (see
> tsnep_xdp_setup_prog() in this commit). If the following
> tsnep_netdev_open() call fails, then this flag signals that the device
> is already down and nothing needs to be cleaned up if
> tsnep_netdev_close() is called later (because IFF_UP is still set).

TBH we've been pushing pretty hard for a while now to stop people
from implementing the:

	close()
	change config
	open()

sort of reconfiguration. I did that myself when I was implementing
my first Ethernet driver and DaveM nacked the change. Must have been
a decade ago.

Imagine you're working on a remote box via SSH and the box is under
transient memory pressure. Allocations fail, we don't want the machine
to fall off the network :(

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 05/10] tsnep: Add XDP TX support
  2023-01-10 22:38       ` Alexander Duyck
@ 2023-01-11 18:51         ` Gerhard Engleder
  0 siblings, 0 replies; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-11 18:51 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: netdev, davem, kuba, edumazet, pabeni

On 10.01.23 23:38, Alexander Duyck wrote:
> On Tue, Jan 10, 2023 at 1:07 PM Gerhard Engleder
> <gerhard@engleder-embedded.com> wrote:
>>
>> On 10.01.23 17:56, Alexander H Duyck wrote:
>>> On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
>>>> Implement ndo_xdp_xmit() for XDP TX support. Support for fragmented XDP
>>>> frames is included.
>>>>
>>>> Also some const, braces and logic clean ups are done in normal TX path
>>>> to keep both TX paths in sync.
>>>>
>>>> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>

<...>

>>>> -static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count)
>>>> +static int tsnep_tx_map(const struct sk_buff *skb, struct tsnep_tx *tx,
>>>> +                    int count)
>>>>    {
>>>
>>> This change to const doesn't add anything since this is a static
>>> function. You could probably just skip making this change since the
>>> function will likely be inlined anyway.
>>
>> const was requested for tsnep_xdp_tx_map() during the last review round,
>> so I added it also here to keep both functions similar.
> 
> As a general rule it doesn't add anything to make an argument to a
> static function const unless the callers are also making it a const.
> Otherwise you just end up adding useless modifiers to the code that
> will be thrown away, as the compiler can already take care of whatever
> optimizations it can get out of it.

For me removing const is totally ok. I will remove it as I see no value
for static functions too. Let's see what the next reviewer says ;-)

<...>

>>>> +static int tsnep_xdp_tx_map(const struct xdp_frame *xdpf, struct tsnep_tx *tx,
>>>> +                        const struct skb_shared_info *shinfo, int count,
>>>> +                        enum tsnep_tx_type type)
>>>
>>> Again the const here isn't adding any value since this is a static
>>> function and will likely be inlined into the function below which calls
>>> it.
>>
>> const was requested here during last review round so I added it. It may
>> add some value by detecting some problems at compile time.
> 
> I suppose, but really adding a const attribute here doesn't add much
> unless you are also going to enforce it at higher levels such as the
> xmit_frame_ring function itself. Also keep in mind that all the
> const would protect is the xdp frame structure itself. It does nothing
> to keep us from modifying the data in the pages and such.

Of course.

Gerhard

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 10/10] tsnep: Support XDP BPF program setup
  2023-01-10 23:00       ` Alexander H Duyck
@ 2023-01-11 18:58         ` Gerhard Engleder
  0 siblings, 0 replies; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-11 18:58 UTC (permalink / raw)
  To: Alexander H Duyck, netdev; +Cc: davem, kuba, edumazet, pabeni

On 11.01.23 00:00, Alexander H Duyck wrote:
> On Tue, 2023-01-10 at 22:38 +0100, Gerhard Engleder wrote:
>> On 10.01.23 18:33, Alexander H Duyck wrote:
>>> On Mon, 2023-01-09 at 20:15 +0100, Gerhard Engleder wrote:
>>>> Implement setup of BPF programs for XDP RX path with command
>>>> XDP_SETUP_PROG of ndo_bpf(). This is the final step for XDP RX path
>>>> support.
>>>>
>>>> tsnep_netdev_close() is called directly during BPF program setup. Add
>>>> netif_carrier_off() and netif_tx_stop_all_queues() calls to signal to
>>>> network stack that device is down. Otherwise network stack would
>>>> continue transmitting packets.
>>>>
>>>> Return value of tsnep_netdev_open() is not checked during BPF program
>>>> setup like in other drivers. Forwarding the return value would result in
>>>> a bpf_prog_put() call in dev_xdp_install(), which would make removal of
>>>> BPF program necessary.
>>>>
>>>> If tsnep_netdev_open() fails during BPF program setup, then the network
>>>> stack would call tsnep_netdev_close() anyway. Thus, tsnep_netdev_close()
>>>> checks now if device is already down.
>>>>
>>>> Additionally remove $(tsnep-y) from $(tsnep-objs) because it is added
>>>> automatically.

<...>

>>>> +	netif_carrier_off(netdev);
>>>> +	netif_tx_stop_all_queues(netdev);
>>>>    
>>>
>>> As I called out earlier the __TSNEP_DOWN is just !IFF_UP so you don't
>>> need that bit.
>>>
>>> The fact that netif_carrier_off is here also points out the fact that
>>> the code in the Tx path isn't needed regarding __TSNEP_DOWN and you can
>>> probably just check netif_carrier_ok if you need the check.
>>
>> tsnep_netdev_close() is called directly during bpf prog setup (see
>> tsnep_xdp_setup_prog() in this commit). If the following
>> tsnep_netdev_open() call fails, then this flag signals that the device
>> is already down and nothing needs to be cleaned up if
>> tsnep_netdev_close() is called later (because IFF_UP is still set).
> 
> If the call to close was fouled up you should probably be blocking
> access to the device via at least a netif_device_detach. I suppose you
> could use the __LINK_STATE_PRESENT bit as the inverse of the
> __TSNEP_DOWN bit. If your open fails you clean up, detach the device,
> and in the close path you only run through it if the device is present.
> 
> Basically what we want to avoid is adding a bunch of extra state as
> what we tend to see is that it will start to create a snarl as you add
> more and more layers.

To be honest, I cannot argue that __TSNEP_DOWN is a great solution. It
is more about fighting symptoms of the

	close()
	change config
	open()

style, which according to Jakub should be prevented anyway. I will
suggest a different solution as a reply to Jakub's comment.

Gerhard

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH net-next v4 10/10] tsnep: Support XDP BPF program setup
  2023-01-11  0:12       ` Jakub Kicinski
@ 2023-01-11 19:11         ` Gerhard Engleder
  2023-01-11 20:06           ` Jakub Kicinski
  0 siblings, 1 reply; 36+ messages in thread
From: Gerhard Engleder @ 2023-01-11 19:11 UTC (permalink / raw)
  To: Jakub Kicinski, Alexander H Duyck; +Cc: netdev, davem, edumazet, pabeni

On 11.01.23 01:12, Jakub Kicinski wrote:
> On Tue, 10 Jan 2023 22:38:04 +0100 Gerhard Engleder wrote:
>>> As I called out earlier the __TSNEP_DOWN is just !IFF_UP so you don't
>>> need that bit.
>>>
>>> The fact that netif_carrier_off is here also points out the fact that
>>> the code in the Tx path isn't needed regarding __TSNEP_DOWN and you can
>>> probably just check netif_carrier_ok if you need the check.
>>
>> tsnep_netdev_close() is called directly during bpf prog setup (see
>> tsnep_xdp_setup_prog() in this commit). If the following
>> tsnep_netdev_open() call fails, then this flag signals that the device
>> is already down and nothing needs to be cleaned up if
>> tsnep_netdev_close() is called later (because IFF_UP is still set).
> 
> TBH we've been pushing pretty hard for a while now to stop people
> from implementing the:
> 
> 	close()
> 	change config
> 	open()
> 
> sort of reconfiguration. I did that myself when I was
> implementing my first Ethernet driver and DaveM nacked the change.
> Must have been a decade ago.
> 
> Imagine you're working on a remote box via SSH and the box is under
> transient memory pressure. Allocations fail, we don't want the machine
> to fall off the network :(

I agree with you that this pattern is bad. Most XDP BPF program setups
do it like that, but that is of course not a valid argument.

In the last review round I made the following suggestion (but got no
reply so far):

What about always using 'XDP_PACKET_HEADROOM' as offset in the RX
buffer? The offset 'NET_SKB_PAD + NET_IP_ALIGN' would not even be used
if XDP is not enabled. Changing this offset is the only task to be done
at the first XDP BPF prog setup call. By always using this offset
no

	close()
	change config
	open()

pattern is needed. As a result no handling for failed open() is needed
and __TSNEP_DOWN is not needed. Simpler code with less problems in my
opinion.

The only problem could be that NET_IP_ALIGN is not used, but
NET_IP_ALIGN is 0 anyway on the two platforms (x86, arm64) where this
driver is used.

Gerhard


* Re: [PATCH net-next v4 10/10] tsnep: Support XDP BPF program setup
  2023-01-11 19:11         ` Gerhard Engleder
@ 2023-01-11 20:06           ` Jakub Kicinski
  0 siblings, 0 replies; 36+ messages in thread
From: Jakub Kicinski @ 2023-01-11 20:06 UTC (permalink / raw)
  To: Gerhard Engleder; +Cc: Alexander H Duyck, netdev, davem, edumazet, pabeni

On Wed, 11 Jan 2023 20:11:44 +0100 Gerhard Engleder wrote:
> I agree with you that this pattern is bad. Most XDP BPF program setups
> do it like that, but that is of course not a valid argument.
> 
> In the last review round I made the following suggestion (but got no
> reply so far):
> 
> What about always using 'XDP_PACKET_HEADROOM' as offset in the RX
> buffer? The offset 'NET_SKB_PAD + NET_IP_ALIGN' would not even be used
> if XDP is not enabled. Changing this offset is the only task to be done
> at the first XDP BPF prog setup call. By always using this offset
> no
> 
> 	close()
> 	change config
> 	open()
> 
> pattern is needed. As a result no handling for failed open() is needed
> and __TSNEP_DOWN is not needed. Simpler code with less problems in my
> opinion.
> 
> The only problem could be that NET_IP_ALIGN is not used, but
> NET_IP_ALIGN is 0 anyway on the two platforms (x86, arm64) where this
> driver is used.

You can add NET_IP_ALIGN as well, AFAIU XDP_PACKET_HEADROOM is more 
of a minimum headroom than an exact one.

The other thing off the top of my head is that you'll always need to
DMA map the Rx buffers BIDIR.

If those are acceptable for your platform / applications then indeed
seems like an easy way out of the problem!


end of thread

Thread overview: 36+ messages
2023-01-09 19:15 [PATCH net-next v4 00/10] tsnep: XDP support Gerhard Engleder
2023-01-09 19:15 ` [PATCH net-next v4 01/10] tsnep: Use spin_lock_bh for TX Gerhard Engleder
2023-01-10 15:49   ` Alexander H Duyck
2023-01-10 19:44     ` Gerhard Engleder
2023-01-09 19:15 ` [PATCH net-next v4 02/10] tsnep: Forward NAPI budget to napi_consume_skb() Gerhard Engleder
2023-01-10 15:50   ` Alexander H Duyck
2023-01-09 19:15 ` [PATCH net-next v4 03/10] tsnep: Do not print DMA mapping error Gerhard Engleder
2023-01-10 15:53   ` Alexander H Duyck
2023-01-10 19:47     ` Gerhard Engleder
2023-01-09 19:15 ` [PATCH net-next v4 04/10] tsnep: Add adapter down state Gerhard Engleder
2023-01-10 16:05   ` Alexander H Duyck
2023-01-10 20:16     ` Gerhard Engleder
2023-01-09 19:15 ` [PATCH net-next v4 05/10] tsnep: Add XDP TX support Gerhard Engleder
2023-01-10 16:56   ` Alexander H Duyck
2023-01-10 21:07     ` Gerhard Engleder
2023-01-10 22:38       ` Alexander Duyck
2023-01-11 18:51         ` Gerhard Engleder
2023-01-09 19:15 ` [PATCH net-next v4 06/10] tsnep: Substract TSNEP_RX_INLINE_METADATA_SIZE once Gerhard Engleder
2023-01-10 17:05   ` Alexander H Duyck
2023-01-09 19:15 ` [PATCH net-next v4 07/10] tsnep: Prepare RX buffer for XDP support Gerhard Engleder
2023-01-09 19:15 ` [PATCH net-next v4 08/10] tsnep: Add RX queue info " Gerhard Engleder
2023-01-10 17:29   ` Alexander H Duyck
2023-01-10 21:21     ` Gerhard Engleder
2023-01-10 22:27       ` Alexander Duyck
2023-01-09 19:15 ` [PATCH net-next v4 09/10] tsnep: Add XDP RX support Gerhard Engleder
2023-01-10 17:40   ` Alexander H Duyck
2023-01-10 21:28     ` Gerhard Engleder
2023-01-10 22:30       ` Alexander Duyck
2023-01-09 19:15 ` [PATCH net-next v4 10/10] tsnep: Support XDP BPF program setup Gerhard Engleder
2023-01-10 17:33   ` Alexander H Duyck
2023-01-10 21:38     ` Gerhard Engleder
2023-01-10 23:00       ` Alexander H Duyck
2023-01-11 18:58         ` Gerhard Engleder
2023-01-11  0:12       ` Jakub Kicinski
2023-01-11 19:11         ` Gerhard Engleder
2023-01-11 20:06           ` Jakub Kicinski
