All of lore.kernel.org
* [Intel-wired-lan] [PATCH 00/10] igc: Add support for AF_XDP zero-copy
@ 2020-12-17 20:24 Andre Guedes
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 01/10] igc: Move igc_xdp_is_enabled() Andre Guedes
                   ` (10 more replies)
  0 siblings, 11 replies; 30+ messages in thread
From: Andre Guedes @ 2020-12-17 20:24 UTC (permalink / raw)
  To: intel-wired-lan

Hi,

This series adds AF_XDP zero-copy feature to igc driver.

The initial patches do some code refactoring, preparing the code base
to land the AF_XDP zero-copy feature while avoiding code duplication.
The last patches of the series implement the feature itself.

The last patch, which actually implements the AF_XDP zero-copy support,
was originally too lengthy, so for the sake of code review I broke it up
into two patches: one adding support for the RX functionality and the
other adding TX support. Feel free to squash those two patches when
pushing the series to net-next if you find it appropriate.

Best,
Andre

Andre Guedes (10):
  igc: Move igc_xdp_is_enabled()
  igc: Refactor igc_xdp_run_prog()
  igc: Refactor igc_clean_rx_ring()
  igc: Refactor XDP rxq info registration
  igc: Introduce igc_update_rx_stats()
  igc: Introduce igc_update_tx_stats()
  igc: Introduce igc_unmap_tx_buffer() helper
  igc: Replace IGC_TX_FLAGS_XDP flag by an enum
  igc: Enable RX via AF_XDP zero-copy
  igc: Enable TX via AF_XDP zero-copy

 drivers/net/ethernet/intel/igc/igc.h      |  33 +-
 drivers/net/ethernet/intel/igc/igc_base.h |   2 +
 drivers/net/ethernet/intel/igc/igc_main.c | 650 ++++++++++++++++++----
 drivers/net/ethernet/intel/igc/igc_xdp.c  | 107 +++-
 drivers/net/ethernet/intel/igc/igc_xdp.h  |   8 +-
 5 files changed, 672 insertions(+), 128 deletions(-)

-- 
2.29.2


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Intel-wired-lan] [PATCH 01/10] igc: Move igc_xdp_is_enabled()
  2020-12-17 20:24 [Intel-wired-lan] [PATCH 00/10] igc: Add support for AF_XDP zero-copy Andre Guedes
@ 2020-12-17 20:24 ` Andre Guedes
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 02/10] igc: Refactor igc_xdp_run_prog() Andre Guedes
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Andre Guedes @ 2020-12-17 20:24 UTC (permalink / raw)
  To: intel-wired-lan

This patch moves the helper igc_xdp_is_enabled() to igc_xdp.h so it can
be reused in igc_xdp.c by upcoming patches that will introduce AF_XDP
zero-copy support to the driver.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 5 -----
 drivers/net/ethernet/intel/igc/igc_xdp.h  | 5 +++++
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 8913c7f7be7f..0e582a4ee986 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -515,11 +515,6 @@ static int igc_setup_all_rx_resources(struct igc_adapter *adapter)
 	return err;
 }
 
-static bool igc_xdp_is_enabled(struct igc_adapter *adapter)
-{
-	return !!adapter->xdp_prog;
-}
-
 /**
  * igc_configure_rx_ring - Configure a receive ring after Reset
  * @adapter: board private structure
diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.h b/drivers/net/ethernet/intel/igc/igc_xdp.h
index cfecb515b718..412aa369e6ba 100644
--- a/drivers/net/ethernet/intel/igc/igc_xdp.h
+++ b/drivers/net/ethernet/intel/igc/igc_xdp.h
@@ -10,4 +10,9 @@ int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
 int igc_xdp_register_rxq_info(struct igc_ring *ring);
 void igc_xdp_unregister_rxq_info(struct igc_ring *ring);
 
+static inline bool igc_xdp_is_enabled(struct igc_adapter *adapter)
+{
+	return !!adapter->xdp_prog;
+}
+
 #endif /* _IGC_XDP_H_ */
-- 
2.29.2



* [Intel-wired-lan] [PATCH 02/10] igc: Refactor igc_xdp_run_prog()
  2020-12-17 20:24 [Intel-wired-lan] [PATCH 00/10] igc: Add support for AF_XDP zero-copy Andre Guedes
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 01/10] igc: Move igc_xdp_is_enabled() Andre Guedes
@ 2020-12-17 20:24 ` Andre Guedes
  2020-12-21 22:45   ` Maciej Fijalkowski
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 03/10] igc: Refactor igc_clean_rx_ring() Andre Guedes
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 30+ messages in thread
From: Andre Guedes @ 2020-12-17 20:24 UTC (permalink / raw)
  To: intel-wired-lan

This patch refactors the igc_xdp_run_prog() helper, preparing the code
for AF_XDP zero-copy support, which is added by upcoming patches.

With AF_XDP zero-copy support, igc_poll() will have a dedicated path
when zero-copy is enabled on the rx ring. To avoid code duplication as
much as possible, this patch encapsulates the code specific to handling
XDP program actions into a local helper so it can be reused by the
zero-copy path.
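The resulting split can be modeled outside the kernel as a pure mapping from XDP actions to driver verdicts. The sketch below is a minimal, runnable model of the switch in igc_xdp_do_run_prog(); the `igc_verdict()` helper and the `xmit_ok`/`redirect_ok` flags are illustrative stand-ins (the real code calls igc_xdp_xmit_back() and xdp_do_redirect()), not the driver's actual API.

```c
#include <assert.h>

/* Stand-ins for the XDP actions and the driver's internal verdict
 * codes; the enumerator values here are illustrative only. */
enum { XDP_ABORTED, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT };
enum { IGC_XDP_PASS, IGC_XDP_CONSUMED, IGC_XDP_TX, IGC_XDP_REDIRECT };

/* Maps a BPF program action to a driver verdict, mirroring the switch
 * in igc_xdp_do_run_prog(). xmit_ok/redirect_ok model whether the
 * XDP_TX transmit or the redirect succeeded. */
static int igc_verdict(int act, int xmit_ok, int redirect_ok)
{
	switch (act) {
	case XDP_PASS:
		return IGC_XDP_PASS;
	case XDP_TX:
		return xmit_ok ? IGC_XDP_TX : IGC_XDP_CONSUMED;
	case XDP_REDIRECT:
		return redirect_ok ? IGC_XDP_REDIRECT : IGC_XDP_CONSUMED;
	default:	/* XDP_ABORTED, XDP_DROP, unknown actions */
		return IGC_XDP_CONSUMED;
	}
}
```

Because the helper is side-effect free with respect to locking, the rcu_read_lock()/READ_ONCE() dance stays in the one caller that needs it.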

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 56 +++++++++++------------
 1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 0e582a4ee986..56b791b356dc 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -2029,38 +2029,22 @@ static int igc_xdp_xmit_back(struct igc_adapter *adapter, struct xdp_buff *xdp)
 	return res;
 }
 
-static struct sk_buff *igc_xdp_run_prog(struct igc_adapter *adapter,
-					struct xdp_buff *xdp)
+/* This function assumes rcu_read_lock() is held by the caller. */
+static int igc_xdp_do_run_prog(struct igc_adapter *adapter,
+			       struct bpf_prog *prog,
+			       struct xdp_buff *xdp)
 {
-	struct bpf_prog *prog;
-	int res;
-	u32 act;
-
-	rcu_read_lock();
-
-	prog = READ_ONCE(adapter->xdp_prog);
-	if (!prog) {
-		res = IGC_XDP_PASS;
-		goto unlock;
-	}
+	u32 act = bpf_prog_run_xdp(prog, xdp);
 
-	act = bpf_prog_run_xdp(prog, xdp);
 	switch (act) {
 	case XDP_PASS:
-		res = IGC_XDP_PASS;
-		break;
+		return IGC_XDP_PASS;
 	case XDP_TX:
-		if (igc_xdp_xmit_back(adapter, xdp) < 0)
-			res = IGC_XDP_CONSUMED;
-		else
-			res = IGC_XDP_TX;
-		break;
+		return igc_xdp_xmit_back(adapter, xdp) < 0 ?
+			IGC_XDP_CONSUMED : IGC_XDP_TX;
 	case XDP_REDIRECT:
-		if (xdp_do_redirect(adapter->netdev, xdp, prog) < 0)
-			res = IGC_XDP_CONSUMED;
-		else
-			res = IGC_XDP_REDIRECT;
-		break;
+		return xdp_do_redirect(adapter->netdev, xdp, prog) < 0 ?
+			IGC_XDP_CONSUMED : IGC_XDP_REDIRECT;
 	default:
 		bpf_warn_invalid_xdp_action(act);
 		fallthrough;
@@ -2068,9 +2052,25 @@ static struct sk_buff *igc_xdp_run_prog(struct igc_adapter *adapter,
 		trace_xdp_exception(adapter->netdev, prog, act);
 		fallthrough;
 	case XDP_DROP:
-		res = IGC_XDP_CONSUMED;
-		break;
+		return IGC_XDP_CONSUMED;
 	}
+}
+
+static struct sk_buff *igc_xdp_run_prog(struct igc_adapter *adapter,
+					struct xdp_buff *xdp)
+{
+	struct bpf_prog *prog;
+	int res;
+
+	rcu_read_lock();
+
+	prog = READ_ONCE(adapter->xdp_prog);
+	if (!prog) {
+		res = IGC_XDP_PASS;
+		goto unlock;
+	}
+
+	res = igc_xdp_do_run_prog(adapter, prog, xdp);
 
 unlock:
 	rcu_read_unlock();
-- 
2.29.2



* [Intel-wired-lan] [PATCH 03/10] igc: Refactor igc_clean_rx_ring()
  2020-12-17 20:24 [Intel-wired-lan] [PATCH 00/10] igc: Add support for AF_XDP zero-copy Andre Guedes
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 01/10] igc: Move igc_xdp_is_enabled() Andre Guedes
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 02/10] igc: Refactor igc_xdp_run_prog() Andre Guedes
@ 2020-12-17 20:24 ` Andre Guedes
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 04/10] igc: Refactor XDP rxq info registration Andre Guedes
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Andre Guedes @ 2020-12-17 20:24 UTC (permalink / raw)
  To: intel-wired-lan

This patch refactors the igc_clean_rx_ring() helper, preparing the code
for AF_XDP zero-copy support, which is added by upcoming patches.

The refactor consists of encapsulating the page-shared specific code
into its own helper and leaving in igc_clean_rx_ring() the common code
that will be shared by both the page-shared and xsk pool paths.
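The shape of the split can be sketched as a tiny runnable model: the mode-specific buffer teardown lives in its own function, while the index reset that every mode needs stays in the common entry point. All names (`toy_*`) are illustrative, not the driver's.

```c
#include <assert.h>

/* Toy model of the igc_clean_rx_ring() refactor. */
struct toy_ring {
	int buffers_in_use;
	int next_to_alloc, next_to_clean, next_to_use;
};

/* Mode-specific part: frees page-shared buffers. A future xsk-pool
 * counterpart would slot in beside it. */
static void toy_clean_page_shared(struct toy_ring *ring)
{
	ring->buffers_in_use = 0;
}

/* Common entry point: dispatch to the mode-specific cleanup, then do
 * the index reset shared by all modes. */
static void toy_clean_ring(struct toy_ring *ring)
{
	toy_clean_page_shared(ring);

	ring->next_to_alloc = 0;
	ring->next_to_clean = 0;
	ring->next_to_use = 0;
}
```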

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 25 ++++++++++++++---------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 56b791b356dc..d591c1e0581f 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -346,16 +346,10 @@ static int igc_setup_all_tx_resources(struct igc_adapter *adapter)
 	return err;
 }
 
-/**
- * igc_clean_rx_ring - Free Rx Buffers per Queue
- * @rx_ring: ring to free buffers from
- */
-static void igc_clean_rx_ring(struct igc_ring *rx_ring)
+static void igc_clean_rx_ring_page_shared(struct igc_ring *rx_ring)
 {
 	u16 i = rx_ring->next_to_clean;
 
-	clear_ring_uses_large_buffer(rx_ring);
-
 	dev_kfree_skb(rx_ring->skb);
 	rx_ring->skb = NULL;
 
@@ -385,10 +379,21 @@ static void igc_clean_rx_ring(struct igc_ring *rx_ring)
 		if (i == rx_ring->count)
 			i = 0;
 	}
+}
 
-	rx_ring->next_to_alloc = 0;
-	rx_ring->next_to_clean = 0;
-	rx_ring->next_to_use = 0;
+/**
+ * igc_clean_rx_ring - Free Rx Buffers per Queue
+ * @ring: ring to free buffers from
+ */
+static void igc_clean_rx_ring(struct igc_ring *ring)
+{
+	clear_ring_uses_large_buffer(ring);
+
+	igc_clean_rx_ring_page_shared(ring);
+
+	ring->next_to_alloc = 0;
+	ring->next_to_clean = 0;
+	ring->next_to_use = 0;
 }
 
 /**
-- 
2.29.2



* [Intel-wired-lan] [PATCH 04/10] igc: Refactor XDP rxq info registration
  2020-12-17 20:24 [Intel-wired-lan] [PATCH 00/10] igc: Add support for AF_XDP zero-copy Andre Guedes
                   ` (2 preceding siblings ...)
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 03/10] igc: Refactor igc_clean_rx_ring() Andre Guedes
@ 2020-12-17 20:24 ` Andre Guedes
  2020-12-21 22:53   ` Maciej Fijalkowski
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 05/10] igc: Introduce igc_update_rx_stats() Andre Guedes
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 30+ messages in thread
From: Andre Guedes @ 2020-12-17 20:24 UTC (permalink / raw)
  To: intel-wired-lan

This patch does a refactoring on the XDP rxq info registration code,
preparing the driver for AF_XDP zero-copy support which is added by
upcoming patches.

Currently, xdp_rxq and memory model are both registered during RX
resource setup time by igc_xdp_register_rxq_info() helper. With AF_XDP,
we want to register the memory model later on while configuring the ring
because we will know which memory model type to register
(MEM_TYPE_PAGE_SHARED or MEM_TYPE_XSK_BUFF_POOL).

The helpers igc_xdp_register_rxq_info() and
igc_xdp_unregister_rxq_info() are no longer needed, so they are removed.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 14 ++++++++----
 drivers/net/ethernet/intel/igc/igc_xdp.c  | 27 -----------------------
 drivers/net/ethernet/intel/igc/igc_xdp.h  |  3 ---
 3 files changed, 10 insertions(+), 34 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index d591c1e0581f..1870ca37d650 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -419,7 +419,7 @@ void igc_free_rx_resources(struct igc_ring *rx_ring)
 {
 	igc_clean_rx_ring(rx_ring);
 
-	igc_xdp_unregister_rxq_info(rx_ring);
+	xdp_rxq_info_unreg(&rx_ring->xdp_rxq);
 
 	vfree(rx_ring->rx_buffer_info);
 	rx_ring->rx_buffer_info = NULL;
@@ -460,9 +460,12 @@ int igc_setup_rx_resources(struct igc_ring *rx_ring)
 	struct device *dev = rx_ring->dev;
 	int size, desc_len, res;
 
-	res = igc_xdp_register_rxq_info(rx_ring);
-	if (res < 0)
+	res = xdp_rxq_info_reg(&rx_ring->xdp_rxq, ndev, rx_ring->queue_index,
+			       0);
+	if (res < 0) {
+		netdev_err(ndev, "Failed to register xdp rxq info\n");
 		return res;
+	}
 
 	size = sizeof(struct igc_rx_buffer) * rx_ring->count;
 	rx_ring->rx_buffer_info = vzalloc(size);
@@ -488,7 +491,7 @@ int igc_setup_rx_resources(struct igc_ring *rx_ring)
 	return 0;
 
 err:
-	igc_xdp_unregister_rxq_info(rx_ring);
+	xdp_rxq_info_unreg(&rx_ring->xdp_rxq);
 	vfree(rx_ring->rx_buffer_info);
 	rx_ring->rx_buffer_info = NULL;
 	netdev_err(ndev, "Unable to allocate memory for Rx descriptor ring\n");
@@ -536,6 +539,9 @@ static void igc_configure_rx_ring(struct igc_adapter *adapter,
 	u32 srrctl = 0, rxdctl = 0;
 	u64 rdba = ring->dma;
 
+	WARN_ON(xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
+					   MEM_TYPE_PAGE_SHARED, NULL));
+
 	if (igc_xdp_is_enabled(adapter))
 		set_ring_uses_large_buffer(ring);
 
diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.c b/drivers/net/ethernet/intel/igc/igc_xdp.c
index 11133c4619bb..27c886a254f1 100644
--- a/drivers/net/ethernet/intel/igc/igc_xdp.c
+++ b/drivers/net/ethernet/intel/igc/igc_xdp.c
@@ -31,30 +31,3 @@ int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
 
 	return 0;
 }
-
-int igc_xdp_register_rxq_info(struct igc_ring *ring)
-{
-	struct net_device *dev = ring->netdev;
-	int err;
-
-	err = xdp_rxq_info_reg(&ring->xdp_rxq, dev, ring->queue_index, 0);
-	if (err) {
-		netdev_err(dev, "Failed to register xdp rxq info\n");
-		return err;
-	}
-
-	err = xdp_rxq_info_reg_mem_model(&ring->xdp_rxq, MEM_TYPE_PAGE_SHARED,
-					 NULL);
-	if (err) {
-		netdev_err(dev, "Failed to register xdp rxq mem model\n");
-		xdp_rxq_info_unreg(&ring->xdp_rxq);
-		return err;
-	}
-
-	return 0;
-}
-
-void igc_xdp_unregister_rxq_info(struct igc_ring *ring)
-{
-	xdp_rxq_info_unreg(&ring->xdp_rxq);
-}
diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.h b/drivers/net/ethernet/intel/igc/igc_xdp.h
index 412aa369e6ba..cdaa2c39b03a 100644
--- a/drivers/net/ethernet/intel/igc/igc_xdp.h
+++ b/drivers/net/ethernet/intel/igc/igc_xdp.h
@@ -7,9 +7,6 @@
 int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
 		     struct netlink_ext_ack *extack);
 
-int igc_xdp_register_rxq_info(struct igc_ring *ring);
-void igc_xdp_unregister_rxq_info(struct igc_ring *ring);
-
 static inline bool igc_xdp_is_enabled(struct igc_adapter *adapter)
 {
 	return !!adapter->xdp_prog;
-- 
2.29.2



* [Intel-wired-lan] [PATCH 05/10] igc: Introduce igc_update_rx_stats()
  2020-12-17 20:24 [Intel-wired-lan] [PATCH 00/10] igc: Add support for AF_XDP zero-copy Andre Guedes
                   ` (3 preceding siblings ...)
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 04/10] igc: Refactor XDP rxq info registration Andre Guedes
@ 2020-12-17 20:24 ` Andre Guedes
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 06/10] igc: Introduce igc_update_tx_stats() Andre Guedes
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Andre Guedes @ 2020-12-17 20:24 UTC (permalink / raw)
  To: intel-wired-lan

In preparation for AF_XDP zero-copy support, this patch encapsulates the
code that updates the driver RX stats so it can be reused in the zero-copy
path.
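The helper's contract is simple: fold the per-batch packet and byte counts into both the per-ring counters and the queue-vector totals in one place. The sketch below is a runnable userspace model with illustrative `toy_*` names; the kernel version additionally brackets the ring counters with u64_stats_update_begin()/u64_stats_update_end(), which is omitted here.

```c
#include <assert.h>

struct toy_ring_stats { unsigned long long packets, bytes; };

struct toy_q_vector {
	struct toy_ring_stats ring;	/* stands in for ring->rx_stats */
	unsigned long long total_packets, total_bytes;
};

/* Simplified model of igc_update_rx_stats(): both counter sets are
 * updated together so callers (page-shared and zero-copy paths alike)
 * cannot forget one of them. */
static void toy_update_rx_stats(struct toy_q_vector *q,
				unsigned int packets, unsigned int bytes)
{
	q->ring.packets += packets;
	q->ring.bytes += bytes;
	q->total_packets += packets;
	q->total_bytes += bytes;
}
```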

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 1870ca37d650..e5cdebbe5637 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -2118,6 +2118,20 @@ static void igc_finalize_xdp(struct igc_adapter *adapter, int status)
 		xdp_do_flush();
 }
 
+static void igc_update_rx_stats(struct igc_q_vector *q_vector,
+				unsigned int packets, unsigned int bytes)
+{
+	struct igc_ring *ring = q_vector->rx.ring;
+
+	u64_stats_update_begin(&ring->rx_syncp);
+	ring->rx_stats.packets += packets;
+	ring->rx_stats.bytes += bytes;
+	u64_stats_update_end(&ring->rx_syncp);
+
+	q_vector->rx.total_packets += packets;
+	q_vector->rx.total_bytes += bytes;
+}
+
 static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
 {
 	unsigned int total_bytes = 0, total_packets = 0;
@@ -2241,12 +2255,7 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
 	/* place incomplete frames back on ring for completion */
 	rx_ring->skb = skb;
 
-	u64_stats_update_begin(&rx_ring->rx_syncp);
-	rx_ring->rx_stats.packets += total_packets;
-	rx_ring->rx_stats.bytes += total_bytes;
-	u64_stats_update_end(&rx_ring->rx_syncp);
-	q_vector->rx.total_packets += total_packets;
-	q_vector->rx.total_bytes += total_bytes;
+	igc_update_rx_stats(q_vector, total_packets, total_bytes);
 
 	if (cleaned_count)
 		igc_alloc_rx_buffers(rx_ring, cleaned_count);
-- 
2.29.2



* [Intel-wired-lan] [PATCH 06/10] igc: Introduce igc_update_tx_stats()
  2020-12-17 20:24 [Intel-wired-lan] [PATCH 00/10] igc: Add support for AF_XDP zero-copy Andre Guedes
                   ` (4 preceding siblings ...)
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 05/10] igc: Introduce igc_update_rx_stats() Andre Guedes
@ 2020-12-17 20:24 ` Andre Guedes
  2020-12-21 22:58   ` Maciej Fijalkowski
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 07/10] igc: Introduce igc_unmap_tx_buffer() helper Andre Guedes
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 30+ messages in thread
From: Andre Guedes @ 2020-12-17 20:24 UTC (permalink / raw)
  To: intel-wired-lan

As with the RX stats, this patch encapsulates the code that updates the
driver TX stats in its own local helper.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index e5cdebbe5637..26c2fc9977cc 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -2263,6 +2263,20 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
 	return total_packets;
 }
 
+static void igc_update_tx_stats(struct igc_q_vector *q_vector,
+				unsigned int packets, unsigned int bytes)
+{
+	struct igc_ring *ring = q_vector->tx.ring;
+
+	u64_stats_update_begin(&ring->tx_syncp);
+	ring->tx_stats.bytes += bytes;
+	ring->tx_stats.packets += packets;
+	u64_stats_update_end(&ring->tx_syncp);
+
+	q_vector->tx.total_bytes += bytes;
+	q_vector->tx.total_packets += packets;
+}
+
 /**
  * igc_clean_tx_irq - Reclaim resources after transmit completes
  * @q_vector: pointer to q_vector containing needed info
@@ -2365,12 +2379,8 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget)
 
 	i += tx_ring->count;
 	tx_ring->next_to_clean = i;
-	u64_stats_update_begin(&tx_ring->tx_syncp);
-	tx_ring->tx_stats.bytes += total_bytes;
-	tx_ring->tx_stats.packets += total_packets;
-	u64_stats_update_end(&tx_ring->tx_syncp);
-	q_vector->tx.total_bytes += total_bytes;
-	q_vector->tx.total_packets += total_packets;
+
+	igc_update_tx_stats(q_vector, total_packets, total_bytes);
 
 	if (test_bit(IGC_RING_FLAG_TX_DETECT_HANG, &tx_ring->flags)) {
 		struct igc_hw *hw = &adapter->hw;
-- 
2.29.2



* [Intel-wired-lan] [PATCH 07/10] igc: Introduce igc_unmap_tx_buffer() helper
  2020-12-17 20:24 [Intel-wired-lan] [PATCH 00/10] igc: Add support for AF_XDP zero-copy Andre Guedes
                   ` (5 preceding siblings ...)
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 06/10] igc: Introduce igc_update_tx_stats() Andre Guedes
@ 2020-12-17 20:24 ` Andre Guedes
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 08/10] igc: Replace IGC_TX_FLAGS_XDP flag by an enum Andre Guedes
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 30+ messages in thread
From: Andre Guedes @ 2020-12-17 20:24 UTC (permalink / raw)
  To: intel-wired-lan

In preparation for AF_XDP zero-copy support, this patch encapsulates the
code that unmaps tx buffers into its own local helper so we can reuse it
when adding zero-copy support, avoiding code duplication.

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc_main.c | 49 +++++++----------------
 1 file changed, 15 insertions(+), 34 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 26c2fc9977cc..60987a5b4b72 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -171,6 +171,14 @@ static void igc_get_hw_control(struct igc_adapter *adapter)
 	     ctrl_ext | IGC_CTRL_EXT_DRV_LOAD);
 }
 
+static void igc_unmap_tx_buffer(struct device *dev, struct igc_tx_buffer *buf)
+{
+	dma_unmap_single(dev, dma_unmap_addr(buf, dma),
+			 dma_unmap_len(buf, len), DMA_TO_DEVICE);
+
+	dma_unmap_len_set(buf, len, 0);
+}
+
 /**
  * igc_clean_tx_ring - Free Tx Buffers
  * @tx_ring: ring to be cleaned
@@ -188,11 +196,7 @@ static void igc_clean_tx_ring(struct igc_ring *tx_ring)
 		else
 			dev_kfree_skb_any(tx_buffer->skb);
 
-		/* unmap skb header data */
-		dma_unmap_single(tx_ring->dev,
-				 dma_unmap_addr(tx_buffer, dma),
-				 dma_unmap_len(tx_buffer, len),
-				 DMA_TO_DEVICE);
+		igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
 
 		/* check for eop_desc to determine the end of the packet */
 		eop_desc = tx_buffer->next_to_watch;
@@ -211,10 +215,7 @@ static void igc_clean_tx_ring(struct igc_ring *tx_ring)
 
 			/* unmap any remaining paged data */
 			if (dma_unmap_len(tx_buffer, len))
-				dma_unmap_page(tx_ring->dev,
-					       dma_unmap_addr(tx_buffer, dma),
-					       dma_unmap_len(tx_buffer, len),
-					       DMA_TO_DEVICE);
+				igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
 		}
 
 		/* move us one more past the eop_desc for start of next pkt */
@@ -1229,11 +1230,7 @@ static int igc_tx_map(struct igc_ring *tx_ring,
 	/* clear dma mappings for failed tx_buffer_info map */
 	while (tx_buffer != first) {
 		if (dma_unmap_len(tx_buffer, len))
-			dma_unmap_page(tx_ring->dev,
-				       dma_unmap_addr(tx_buffer, dma),
-				       dma_unmap_len(tx_buffer, len),
-				       DMA_TO_DEVICE);
-		dma_unmap_len_set(tx_buffer, len, 0);
+			igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
 
 		if (i-- == 0)
 			i += tx_ring->count;
@@ -1241,11 +1238,7 @@ static int igc_tx_map(struct igc_ring *tx_ring,
 	}
 
 	if (dma_unmap_len(tx_buffer, len))
-		dma_unmap_single(tx_ring->dev,
-				 dma_unmap_addr(tx_buffer, dma),
-				 dma_unmap_len(tx_buffer, len),
-				 DMA_TO_DEVICE);
-	dma_unmap_len_set(tx_buffer, len, 0);
+		igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
 
 	dev_kfree_skb_any(tx_buffer->skb);
 	tx_buffer->skb = NULL;
@@ -2327,14 +2320,7 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget)
 		else
 			napi_consume_skb(tx_buffer->skb, napi_budget);
 
-		/* unmap skb header data */
-		dma_unmap_single(tx_ring->dev,
-				 dma_unmap_addr(tx_buffer, dma),
-				 dma_unmap_len(tx_buffer, len),
-				 DMA_TO_DEVICE);
-
-		/* clear tx_buffer data */
-		dma_unmap_len_set(tx_buffer, len, 0);
+		igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
 
 		/* clear last DMA location and unmap remaining buffers */
 		while (tx_desc != eop_desc) {
@@ -2348,13 +2334,8 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget)
 			}
 
 			/* unmap any remaining paged data */
-			if (dma_unmap_len(tx_buffer, len)) {
-				dma_unmap_page(tx_ring->dev,
-					       dma_unmap_addr(tx_buffer, dma),
-					       dma_unmap_len(tx_buffer, len),
-					       DMA_TO_DEVICE);
-				dma_unmap_len_set(tx_buffer, len, 0);
-			}
+			if (dma_unmap_len(tx_buffer, len))
+				igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
 		}
 
 		/* move us one more past the eop_desc for start of next pkt */
-- 
2.29.2



* [Intel-wired-lan] [PATCH 08/10] igc: Replace IGC_TX_FLAGS_XDP flag by an enum
  2020-12-17 20:24 [Intel-wired-lan] [PATCH 00/10] igc: Add support for AF_XDP zero-copy Andre Guedes
                   ` (6 preceding siblings ...)
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 07/10] igc: Introduce igc_unmap_tx_buffer() helper Andre Guedes
@ 2020-12-17 20:24 ` Andre Guedes
  2020-12-21 23:09   ` Maciej Fijalkowski
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 09/10] igc: Enable RX via AF_XDP zero-copy Andre Guedes
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 30+ messages in thread
From: Andre Guedes @ 2020-12-17 20:24 UTC (permalink / raw)
  To: intel-wired-lan

Up to this point, tx buffers are associated with either an skb or an
xdpf, and the IGC_TX_FLAGS_XDP flag is enough to distinguish between
these two cases. However, upcoming patches that add AF_XDP zero-copy
support will introduce a third case, so this flag-based approach no
longer fits well.

In preparation to land AF_XDP zero-copy support, this patch replaces the
IGC_TX_FLAGS_XDP flag by an enum which will be extended once zero-copy
support is introduced to the driver.
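The advantage of the enum over a flag bit is that cleanup becomes an exhaustive switch: a new buffer type only needs a new enumerator and case, and an unknown value falls into a warning default instead of being silently treated as an skb. The runnable sketch below models that dispatch with illustrative `toy_*` names (the real cases call xdp_return_frame() and dev_kfree_skb_any()).

```c
#include <assert.h>
#include <string.h>

/* Illustrative model of the enum that replaces IGC_TX_FLAGS_XDP. */
enum toy_tx_buffer_type {
	TOY_TX_BUFFER_TYPE_SKB,
	TOY_TX_BUFFER_TYPE_XDP,
	/* a zero-copy XSK type slots in here later */
};

struct toy_tx_buffer {
	enum toy_tx_buffer_type type;
};

/* Returns the name of the cleanup action selected for the buffer,
 * mirroring the switch added to igc_clean_tx_ring()/igc_clean_tx_irq(). */
static const char *toy_free_action(const struct toy_tx_buffer *buf)
{
	switch (buf->type) {
	case TOY_TX_BUFFER_TYPE_XDP:
		return "xdp_return_frame";
	case TOY_TX_BUFFER_TYPE_SKB:
		return "dev_kfree_skb_any";
	default:
		return "warn_once";
	}
}
```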

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h      |  8 +++++--
 drivers/net/ethernet/intel/igc/igc_main.c | 27 ++++++++++++++++++-----
 2 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index 29be8833956a..b537488f5bae 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -376,8 +376,6 @@ enum igc_tx_flags {
 	/* olinfo flags */
 	IGC_TX_FLAGS_IPV4	= 0x10,
 	IGC_TX_FLAGS_CSUM	= 0x20,
-
-	IGC_TX_FLAGS_XDP	= 0x100,
 };
 
 enum igc_boards {
@@ -394,12 +392,18 @@ enum igc_boards {
 #define TXD_USE_COUNT(S)	DIV_ROUND_UP((S), IGC_MAX_DATA_PER_TXD)
 #define DESC_NEEDED	(MAX_SKB_FRAGS + 4)
 
+enum igc_tx_buffer_type {
+	IGC_TX_BUFFER_TYPE_SKB,
+	IGC_TX_BUFFER_TYPE_XDP,
+};
+
 /* wrapper around a pointer to a socket buffer,
  * so a DMA handle can be stored along with the buffer
  */
 struct igc_tx_buffer {
 	union igc_adv_tx_desc *next_to_watch;
 	unsigned long time_stamp;
+	enum igc_tx_buffer_type type;
 	union {
 		struct sk_buff *skb;
 		struct xdp_frame *xdpf;
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 60987a5b4b72..ec366643f996 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -191,10 +191,18 @@ static void igc_clean_tx_ring(struct igc_ring *tx_ring)
 	while (i != tx_ring->next_to_use) {
 		union igc_adv_tx_desc *eop_desc, *tx_desc;
 
-		if (tx_buffer->tx_flags & IGC_TX_FLAGS_XDP)
+		switch (tx_buffer->type) {
+		case IGC_TX_BUFFER_TYPE_XDP:
 			xdp_return_frame(tx_buffer->xdpf);
-		else
+			break;
+		case IGC_TX_BUFFER_TYPE_SKB:
 			dev_kfree_skb_any(tx_buffer->skb);
+			break;
+		default:
+			netdev_warn_once(tx_ring->netdev,
+					 "Unknown tx buffer type\n");
+			break;
+		}
 
 		igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
 
@@ -1371,6 +1379,7 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb,
 
 	/* record the location of the first descriptor for this packet */
 	first = &tx_ring->tx_buffer_info[tx_ring->next_to_use];
+	first->type = IGC_TX_BUFFER_TYPE_SKB;
 	first->skb = skb;
 	first->bytecount = skb->len;
 	first->gso_segs = 1;
@@ -1950,8 +1959,8 @@ static int igc_xdp_init_tx_buffer(struct igc_tx_buffer *buffer,
 		return -ENOMEM;
 	}
 
+	buffer->type = IGC_TX_BUFFER_TYPE_XDP;
 	buffer->xdpf = xdpf;
-	buffer->tx_flags = IGC_TX_FLAGS_XDP;
 	buffer->protocol = 0;
 	buffer->bytecount = xdpf->len;
 	buffer->gso_segs = 1;
@@ -2315,10 +2324,18 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget)
 		total_bytes += tx_buffer->bytecount;
 		total_packets += tx_buffer->gso_segs;
 
-		if (tx_buffer->tx_flags & IGC_TX_FLAGS_XDP)
+		switch (tx_buffer->type) {
+		case IGC_TX_BUFFER_TYPE_XDP:
 			xdp_return_frame(tx_buffer->xdpf);
-		else
+			break;
+		case IGC_TX_BUFFER_TYPE_SKB:
 			napi_consume_skb(tx_buffer->skb, napi_budget);
+			break;
+		default:
+			netdev_warn_once(tx_ring->netdev,
+					 "Unknown tx buffer type\n");
+			break;
+		}
 
 		igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
 
-- 
2.29.2



* [Intel-wired-lan] [PATCH 09/10] igc: Enable RX via AF_XDP zero-copy
  2020-12-17 20:24 [Intel-wired-lan] [PATCH 00/10] igc: Add support for AF_XDP zero-copy Andre Guedes
                   ` (7 preceding siblings ...)
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 08/10] igc: Replace IGC_TX_FLAGS_XDP flag by an enum Andre Guedes
@ 2020-12-17 20:24 ` Andre Guedes
  2020-12-22 13:14   ` Maciej Fijalkowski
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 10/10] igc: Enable TX " Andre Guedes
  2020-12-22 14:20 ` [Intel-wired-lan] [PATCH 00/10] igc: Add support for " Maciej Fijalkowski
  10 siblings, 1 reply; 30+ messages in thread
From: Andre Guedes @ 2020-12-17 20:24 UTC (permalink / raw)
  To: intel-wired-lan

This patch adds support for receiving packets via the AF_XDP zero-copy
mechanism.

A new flag is added to 'enum igc_ring_flags_t' to indicate the ring has
AF_XDP zero-copy enabled so proper ring setup is carried out during ring
configuration in igc_configure_rx_ring().

RX buffers can now be allocated via the shared pages mechanism (default
behavior of the driver) or via xsk pool (when AF_XDP zero-copy is
enabled) so a union is added to the 'struct igc_rx_buffer' to cover both
cases.

When AF_XDP zero-copy is enabled, rx buffers are allocated from the xsk
pool using the new helper igc_alloc_rx_buffers_zc(), which is the
counterpart of igc_alloc_rx_buffers().

Like other Intel drivers that support AF_XDP zero-copy, igc has a
dedicated path for cleaning up rx irqs when zero-copy is enabled. This
avoids adding too many checks within igc_clean_rx_irq(), resulting in
more readable and efficient code since this function is called from the
driver's hot path.
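The union described above is untagged: the active member is implied by the ring's IGC_RING_FLAG_AF_XDP_ZC flag rather than stored per buffer, so the struct costs no extra space in the default page-shared mode. The runnable sketch below models that layout; `toy_page`/`toy_xdp_buff` are stand-ins for the kernel's `struct page` and `struct xdp_buff`, and the field widths are illustrative.

```c
#include <assert.h>

struct toy_page;	/* stands in for struct page */
struct toy_xdp_buff;	/* stands in for struct xdp_buff */

/* Illustrative model of struct igc_rx_buffer after this patch: the same
 * slot holds either the page-shared fields or an xsk buffer pointer. */
struct toy_rx_buffer {
	union {
		struct {			/* page-shared mode */
			unsigned long long dma;	/* models dma_addr_t */
			struct toy_page *page;
			unsigned int page_offset;
			unsigned short pagecnt_bias;
		};
		struct toy_xdp_buff *xdp;	/* xsk pool mode */
	};
};
```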

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h      |  22 +-
 drivers/net/ethernet/intel/igc/igc_base.h |   1 +
 drivers/net/ethernet/intel/igc/igc_main.c | 334 +++++++++++++++++++++-
 drivers/net/ethernet/intel/igc/igc_xdp.c  |  98 +++++++
 drivers/net/ethernet/intel/igc/igc_xdp.h  |   2 +
 5 files changed, 438 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index b537488f5bae..d4b0e1db9804 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -113,6 +113,7 @@ struct igc_ring {
 	};
 
 	struct xdp_rxq_info xdp_rxq;
+	struct xsk_buff_pool *xsk_pool;
 } ____cacheline_internodealigned_in_smp;
 
 /* Board specific private data structure */
@@ -241,6 +242,9 @@ bool igc_has_link(struct igc_adapter *adapter);
 void igc_reset(struct igc_adapter *adapter);
 int igc_set_spd_dplx(struct igc_adapter *adapter, u32 spd, u8 dplx);
 void igc_update_stats(struct igc_adapter *adapter);
+void igc_disable_rx_ring(struct igc_ring *ring);
+void igc_enable_rx_ring(struct igc_ring *ring);
+int igc_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags);
 
 /* igc_dump declarations */
 void igc_rings_dump(struct igc_adapter *adapter);
@@ -418,14 +422,19 @@ struct igc_tx_buffer {
 };
 
 struct igc_rx_buffer {
-	dma_addr_t dma;
-	struct page *page;
+	union {
+		struct {
+			dma_addr_t dma;
+			struct page *page;
 #if (BITS_PER_LONG > 32) || (PAGE_SIZE >= 65536)
-	__u32 page_offset;
+			__u32 page_offset;
 #else
-	__u16 page_offset;
+			__u16 page_offset;
 #endif
-	__u16 pagecnt_bias;
+			__u16 pagecnt_bias;
+		};
+		struct xdp_buff *xdp;
+	};
 };
 
 struct igc_q_vector {
@@ -511,7 +520,8 @@ enum igc_ring_flags_t {
 	IGC_RING_FLAG_RX_SCTP_CSUM,
 	IGC_RING_FLAG_RX_LB_VLAN_BSWAP,
 	IGC_RING_FLAG_TX_CTX_IDX,
-	IGC_RING_FLAG_TX_DETECT_HANG
+	IGC_RING_FLAG_TX_DETECT_HANG,
+	IGC_RING_FLAG_AF_XDP_ZC,
 };
 
 #define ring_uses_large_buffer(ring) \
diff --git a/drivers/net/ethernet/intel/igc/igc_base.h b/drivers/net/ethernet/intel/igc/igc_base.h
index ea627ce52525..15d0086be949 100644
--- a/drivers/net/ethernet/intel/igc/igc_base.h
+++ b/drivers/net/ethernet/intel/igc/igc_base.h
@@ -81,6 +81,7 @@ union igc_adv_rx_desc {
 
 /* Additional Receive Descriptor Control definitions */
 #define IGC_RXDCTL_QUEUE_ENABLE	0x02000000 /* Ena specific Rx Queue */
+#define IGC_RXDCTL_SWFLUSH	0x04000000 /* Receive Software Flush */
 
 /* SRRCTL bit definitions */
 #define IGC_SRRCTL_BSIZEPKT_SHIFT		10 /* Shift _right_ */
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index ec366643f996..74b6138e5f2c 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -11,7 +11,7 @@
 #include <linux/pm_runtime.h>
 #include <net/pkt_sched.h>
 #include <linux/bpf_trace.h>
-
+#include <net/xdp_sock_drv.h>
 #include <net/ipv6.h>
 
 #include "igc.h"
@@ -390,6 +390,21 @@ static void igc_clean_rx_ring_page_shared(struct igc_ring *rx_ring)
 	}
 }
 
+static void igc_clean_rx_ring_xsk_pool(struct igc_ring *ring)
+{
+	struct igc_rx_buffer *bi;
+	u16 i;
+
+	for (i = 0; i < ring->count; i++) {
+		bi = &ring->rx_buffer_info[i];
+		if (!bi->xdp)
+			continue;
+
+		xsk_buff_free(bi->xdp);
+		bi->xdp = NULL;
+	}
+}
+
 /**
  * igc_clean_rx_ring - Free Rx Buffers per Queue
  * @ring: ring to free buffers from
@@ -398,7 +413,10 @@ static void igc_clean_rx_ring(struct igc_ring *ring)
 {
 	clear_ring_uses_large_buffer(ring);
 
-	igc_clean_rx_ring_page_shared(ring);
+	if (ring->xsk_pool)
+		igc_clean_rx_ring_xsk_pool(ring);
+	else
+		igc_clean_rx_ring_page_shared(ring);
 
 	ring->next_to_alloc = 0;
 	ring->next_to_clean = 0;
@@ -532,6 +550,16 @@ static int igc_setup_all_rx_resources(struct igc_adapter *adapter)
 	return err;
 }
 
+static struct xsk_buff_pool *igc_get_xsk_pool(struct igc_adapter *adapter,
+					      struct igc_ring *ring)
+{
+	if (!igc_xdp_is_enabled(adapter) ||
+	    !test_bit(IGC_RING_FLAG_AF_XDP_ZC, &ring->flags))
+		return NULL;
+
+	return xsk_get_pool_from_qid(ring->netdev, ring->queue_index);
+}
+
 /**
  * igc_configure_rx_ring - Configure a receive ring after Reset
  * @adapter: board private structure
@@ -547,9 +575,20 @@ static void igc_configure_rx_ring(struct igc_adapter *adapter,
 	int reg_idx = ring->reg_idx;
 	u32 srrctl = 0, rxdctl = 0;
 	u64 rdba = ring->dma;
-
-	WARN_ON(xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
-					   MEM_TYPE_PAGE_SHARED, NULL));
+	u32 buf_size;
+
+	xdp_rxq_info_unreg_mem_model(&ring->xdp_rxq);
+	ring->xsk_pool = igc_get_xsk_pool(adapter, ring);
+	if (ring->xsk_pool) {
+		WARN_ON(xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
+						   MEM_TYPE_XSK_BUFF_POOL,
+						   NULL));
+		xsk_pool_set_rxq_info(ring->xsk_pool, &ring->xdp_rxq);
+	} else {
+		WARN_ON(xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
+						   MEM_TYPE_PAGE_SHARED,
+						   NULL));
+	}
 
 	if (igc_xdp_is_enabled(adapter))
 		set_ring_uses_large_buffer(ring);
@@ -573,12 +612,15 @@ static void igc_configure_rx_ring(struct igc_adapter *adapter,
 	ring->next_to_clean = 0;
 	ring->next_to_use = 0;
 
-	/* set descriptor configuration */
-	srrctl = IGC_RX_HDR_LEN << IGC_SRRCTL_BSIZEHDRSIZE_SHIFT;
-	if (ring_uses_large_buffer(ring))
-		srrctl |= IGC_RXBUFFER_3072 >> IGC_SRRCTL_BSIZEPKT_SHIFT;
+	if (ring->xsk_pool)
+		buf_size = xsk_pool_get_rx_frame_size(ring->xsk_pool);
+	else if (ring_uses_large_buffer(ring))
+		buf_size = IGC_RXBUFFER_3072;
 	else
-		srrctl |= IGC_RXBUFFER_2048 >> IGC_SRRCTL_BSIZEPKT_SHIFT;
+		buf_size = IGC_RXBUFFER_2048;
+
+	srrctl = IGC_RX_HDR_LEN << IGC_SRRCTL_BSIZEHDRSIZE_SHIFT;
+	srrctl |= buf_size >> IGC_SRRCTL_BSIZEPKT_SHIFT;
 	srrctl |= IGC_SRRCTL_DESCTYPE_ADV_ONEBUF;
 
 	wr32(IGC_SRRCTL(reg_idx), srrctl);
@@ -1947,6 +1989,63 @@ static void igc_alloc_rx_buffers(struct igc_ring *rx_ring, u16 cleaned_count)
 	}
 }
 
+static bool igc_alloc_rx_buffers_zc(struct igc_ring *ring, u16 count)
+{
+	union igc_adv_rx_desc *desc;
+	u16 i = ring->next_to_use;
+	struct igc_rx_buffer *bi;
+	dma_addr_t dma;
+	bool ok = true;
+
+	if (!count)
+		return ok;
+
+	desc = IGC_RX_DESC(ring, i);
+	bi = &ring->rx_buffer_info[i];
+	i -= ring->count;
+
+	do {
+		bi->xdp = xsk_buff_alloc(ring->xsk_pool);
+		if (!bi->xdp) {
+			ok = false;
+			break;
+		}
+
+		dma = xsk_buff_xdp_get_dma(bi->xdp);
+		desc->read.pkt_addr = cpu_to_le64(dma);
+
+		desc++;
+		bi++;
+		i++;
+		if (unlikely(!i)) {
+			desc = IGC_RX_DESC(ring, 0);
+			bi = ring->rx_buffer_info;
+			i -= ring->count;
+		}
+
+		/* Clear the length for the next_to_use descriptor. */
+		desc->wb.upper.length = 0;
+
+		count--;
+	} while (count);
+
+	i += ring->count;
+
+	if (ring->next_to_use != i) {
+		ring->next_to_use = i;
+
+		/* Force memory writes to complete before letting h/w
+		 * know there are new descriptors to fetch.  (Only
+		 * applicable for weak-ordered memory model archs,
+		 * such as IA-64).
+		 */
+		wmb();
+		writel(i, ring->tail);
+	}
+
+	return ok;
+}
+
 static int igc_xdp_init_tx_buffer(struct igc_tx_buffer *buffer,
 				  struct xdp_frame *xdpf,
 				  struct igc_ring *ring)
@@ -2265,6 +2364,137 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
 	return total_packets;
 }
 
+static struct sk_buff *igc_construct_skb_zc(struct igc_ring *ring,
+					    struct xdp_buff *xdp)
+{
+	unsigned int metasize = xdp->data - xdp->data_meta;
+	unsigned int datasize = xdp->data_end - xdp->data;
+	struct sk_buff *skb;
+
+	skb = __napi_alloc_skb(&ring->q_vector->napi,
+			       xdp->data_end - xdp->data_hard_start,
+			       GFP_ATOMIC | __GFP_NOWARN);
+	if (unlikely(!skb))
+		return NULL;
+
+	skb_reserve(skb, xdp->data - xdp->data_hard_start);
+	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
+	if (metasize)
+		skb_metadata_set(skb, metasize);
+
+	return skb;
+}
+
+static void igc_dispatch_skb_zc(struct igc_q_vector *q_vector,
+				union igc_adv_rx_desc *desc,
+				struct xdp_buff *xdp)
+{
+	struct igc_ring *ring = q_vector->rx.ring;
+	struct sk_buff *skb;
+
+	skb = igc_construct_skb_zc(ring, xdp);
+	if (!skb) {
+		ring->rx_stats.alloc_failed++;
+		return;
+	}
+
+	if (igc_cleanup_headers(ring, desc, skb))
+		return;
+
+	igc_process_skb_fields(ring, desc, skb);
+	napi_gro_receive(&q_vector->napi, skb);
+}
+
+static int igc_clean_rx_irq_zc(struct igc_q_vector *q_vector, const int budget)
+{
+	struct igc_adapter *adapter = q_vector->adapter;
+	struct igc_ring *ring = q_vector->rx.ring;
+	u16 cleaned_count = igc_desc_unused(ring);
+	int total_bytes = 0, total_packets = 0;
+	struct bpf_prog *prog;
+	bool failure = false;
+	int xdp_status = 0;
+
+	rcu_read_lock();
+
+	prog = READ_ONCE(adapter->xdp_prog);
+
+	while (likely(total_packets < budget)) {
+		union igc_adv_rx_desc *desc;
+		struct igc_rx_buffer *bi;
+		unsigned int size;
+		int res;
+
+		desc = IGC_RX_DESC(ring, ring->next_to_clean);
+		size = le16_to_cpu(desc->wb.upper.length);
+		if (!size)
+			break;
+
+		/* This memory barrier is needed to keep us from reading
+		 * any other fields out of the rx_desc until we know the
+		 * descriptor has been written back
+		 */
+		dma_rmb();
+
+		bi = &ring->rx_buffer_info[ring->next_to_clean];
+
+		if (igc_test_staterr(desc, IGC_RXDADV_STAT_TSIP)) {
+			/* FIXME: For now, if packet buffer contains timestamp
+			 * information, we discard it. Once the XDP infrastructure
+			 * provides a way to report hardware timestamps, the
+			 * code below should be fixed.
+			 */
+			bi->xdp->data += IGC_TS_HDR_LEN;
+			size -= IGC_TS_HDR_LEN;
+		}
+
+		bi->xdp->data_end = bi->xdp->data + size;
+		xsk_buff_dma_sync_for_cpu(bi->xdp, ring->xsk_pool);
+
+		res = igc_xdp_do_run_prog(adapter, prog, bi->xdp);
+		switch (res) {
+		case IGC_XDP_PASS:
+			igc_dispatch_skb_zc(q_vector, desc, bi->xdp);
+			fallthrough;
+		case IGC_XDP_CONSUMED:
+			xsk_buff_free(bi->xdp);
+			break;
+		case IGC_XDP_TX:
+		case IGC_XDP_REDIRECT:
+			xdp_status |= res;
+			break;
+		}
+
+		bi->xdp = NULL;
+		total_bytes += size;
+		total_packets++;
+		cleaned_count++;
+		ring->next_to_clean++;
+		if (ring->next_to_clean == ring->count)
+			ring->next_to_clean = 0;
+	}
+
+	rcu_read_unlock();
+
+	if (cleaned_count >= IGC_RX_BUFFER_WRITE)
+		failure = !igc_alloc_rx_buffers_zc(ring, cleaned_count);
+
+	if (xdp_status)
+		igc_finalize_xdp(adapter, xdp_status);
+
+	igc_update_rx_stats(q_vector, total_packets, total_bytes);
+
+	if (xsk_uses_need_wakeup(ring->xsk_pool)) {
+		if (failure || ring->next_to_clean == ring->next_to_use)
+			xsk_set_rx_need_wakeup(ring->xsk_pool);
+		else
+			xsk_clear_rx_need_wakeup(ring->xsk_pool);
+		return total_packets;
+	}
+
+	return failure ? budget : total_packets;
+}
+
 static void igc_update_tx_stats(struct igc_q_vector *q_vector,
 				unsigned int packets, unsigned int bytes)
 {
@@ -2958,7 +3188,10 @@ static void igc_configure(struct igc_adapter *adapter)
 	for (i = 0; i < adapter->num_rx_queues; i++) {
 		struct igc_ring *ring = adapter->rx_ring[i];
 
-		igc_alloc_rx_buffers(ring, igc_desc_unused(ring));
+		if (ring->xsk_pool)
+			igc_alloc_rx_buffers_zc(ring, igc_desc_unused(ring));
+		else
+			igc_alloc_rx_buffers(ring, igc_desc_unused(ring));
 	}
 }
 
@@ -3573,14 +3806,18 @@ static int igc_poll(struct napi_struct *napi, int budget)
 	struct igc_q_vector *q_vector = container_of(napi,
 						     struct igc_q_vector,
 						     napi);
+	struct igc_ring *rx_ring = q_vector->rx.ring;
+
 	bool clean_complete = true;
 	int work_done = 0;
 
 	if (q_vector->tx.ring)
 		clean_complete = igc_clean_tx_irq(q_vector, budget);
 
-	if (q_vector->rx.ring) {
-		int cleaned = igc_clean_rx_irq(q_vector, budget);
+	if (rx_ring) {
+		int cleaned = rx_ring->xsk_pool ?
+			      igc_clean_rx_irq_zc(q_vector, budget) :
+			      igc_clean_rx_irq(q_vector, budget);
 
 		work_done += cleaned;
 		if (cleaned >= budget)
@@ -5161,6 +5398,9 @@ static int igc_bpf(struct net_device *dev, struct netdev_bpf *bpf)
 	switch (bpf->command) {
 	case XDP_SETUP_PROG:
 		return igc_xdp_set_prog(adapter, bpf->prog, bpf->extack);
+	case XDP_SETUP_XSK_POOL:
+		return igc_xdp_setup_pool(adapter, bpf->xsk.pool,
+					  bpf->xsk.queue_id);
 	default:
 		return -EOPNOTSUPP;
 	}
@@ -5206,6 +5446,43 @@ static int igc_xdp_xmit(struct net_device *dev, int num_frames,
 	return num_frames - drops;
 }
 
+static void igc_trigger_rxtxq_interrupt(struct igc_adapter *adapter,
+					struct igc_q_vector *q_vector)
+{
+	struct igc_hw *hw = &adapter->hw;
+	u32 eics = 0;
+
+	eics |= q_vector->eims_value;
+	wr32(IGC_EICS, eics);
+}
+
+int igc_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags)
+{
+	struct igc_adapter *adapter = netdev_priv(dev);
+	struct igc_q_vector *q_vector;
+	struct igc_ring *ring;
+
+	if (test_bit(__IGC_DOWN, &adapter->state))
+		return -ENETDOWN;
+
+	if (!igc_xdp_is_enabled(adapter))
+		return -ENXIO;
+
+	if (queue_id >= adapter->num_rx_queues)
+		return -EINVAL;
+
+	ring = adapter->rx_ring[queue_id];
+
+	if (!ring->xsk_pool)
+		return -ENXIO;
+
+	q_vector = adapter->q_vector[queue_id];
+	if (!napi_if_scheduled_mark_missed(&q_vector->napi))
+		igc_trigger_rxtxq_interrupt(adapter, q_vector);
+
+	return 0;
+}
+
 static const struct net_device_ops igc_netdev_ops = {
 	.ndo_open		= igc_open,
 	.ndo_stop		= igc_close,
@@ -5221,6 +5498,7 @@ static const struct net_device_ops igc_netdev_ops = {
 	.ndo_setup_tc		= igc_setup_tc,
 	.ndo_bpf		= igc_bpf,
 	.ndo_xdp_xmit		= igc_xdp_xmit,
+	.ndo_xsk_wakeup		= igc_xsk_wakeup,
 };
 
 /* PCIe configuration access */
@@ -5974,6 +6252,36 @@ struct net_device *igc_get_hw_dev(struct igc_hw *hw)
 	return adapter->netdev;
 }
 
+static void igc_disable_rx_ring_hw(struct igc_ring *ring)
+{
+	struct igc_hw *hw = &ring->q_vector->adapter->hw;
+	u8 idx = ring->reg_idx;
+	u32 rxdctl;
+
+	rxdctl = rd32(IGC_RXDCTL(idx));
+	rxdctl &= ~IGC_RXDCTL_QUEUE_ENABLE;
+	rxdctl |= IGC_RXDCTL_SWFLUSH;
+	wr32(IGC_RXDCTL(idx), rxdctl);
+}
+
+void igc_disable_rx_ring(struct igc_ring *ring)
+{
+	igc_disable_rx_ring_hw(ring);
+	igc_clean_rx_ring(ring);
+}
+
+void igc_enable_rx_ring(struct igc_ring *ring)
+{
+	struct igc_adapter *adapter = ring->q_vector->adapter;
+
+	igc_configure_rx_ring(adapter, ring);
+
+	if (ring->xsk_pool)
+		igc_alloc_rx_buffers_zc(ring, igc_desc_unused(ring));
+	else
+		igc_alloc_rx_buffers(ring, igc_desc_unused(ring));
+}
+
 /**
  * igc_init_module - Driver Registration Routine
  *
diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.c b/drivers/net/ethernet/intel/igc/igc_xdp.c
index 27c886a254f1..9515906d54e0 100644
--- a/drivers/net/ethernet/intel/igc/igc_xdp.c
+++ b/drivers/net/ethernet/intel/igc/igc_xdp.c
@@ -1,6 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2020, Intel Corporation. */
 
+#include <net/xdp_sock_drv.h>
+
 #include "igc.h"
 #include "igc_xdp.h"
 
@@ -31,3 +33,99 @@ int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
 
 	return 0;
 }
+
+static int igc_xdp_enable_pool(struct igc_adapter *adapter,
+			       struct xsk_buff_pool *pool, u16 queue_id)
+{
+	struct net_device *ndev = adapter->netdev;
+	struct device *dev = &adapter->pdev->dev;
+	struct igc_ring *rx_ring;
+	struct napi_struct *napi;
+	bool needs_reset;
+	u32 frame_size;
+	int err;
+
+	if (queue_id >= adapter->num_rx_queues)
+		return -EINVAL;
+
+	frame_size = xsk_pool_get_rx_frame_size(pool);
+	if (frame_size < ETH_FRAME_LEN + VLAN_HLEN * 2) {
+		/* For now, when XDP is enabled, the driver doesn't support
+		 * frames that span over multiple buffers. The max frame size
+		 * considered here is the ethernet frame size + vlan double
+		 * tagging.
+		 */
+		return -EOPNOTSUPP;
+	}
+
+	err = xsk_pool_dma_map(pool, dev, IGC_RX_DMA_ATTR);
+	if (err) {
+		netdev_err(ndev, "Failed to map xsk pool\n");
+		return err;
+	}
+
+	needs_reset = netif_running(adapter->netdev) && igc_xdp_is_enabled(adapter);
+
+	rx_ring = adapter->rx_ring[queue_id];
+	napi = &rx_ring->q_vector->napi;
+
+	if (needs_reset) {
+		igc_disable_rx_ring(rx_ring);
+		napi_disable(napi);
+	}
+
+	set_bit(IGC_RING_FLAG_AF_XDP_ZC, &rx_ring->flags);
+
+	if (needs_reset) {
+		napi_enable(napi);
+		igc_enable_rx_ring(rx_ring);
+
+		err = igc_xsk_wakeup(ndev, queue_id, XDP_WAKEUP_RX);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int igc_xdp_disable_pool(struct igc_adapter *adapter, u16 queue_id)
+{
+	struct xsk_buff_pool *pool;
+	struct igc_ring *rx_ring;
+	struct napi_struct *napi;
+	bool needs_reset;
+
+	if (queue_id >= adapter->num_rx_queues)
+		return -EINVAL;
+
+	pool = xsk_get_pool_from_qid(adapter->netdev, queue_id);
+	if (!pool)
+		return -EINVAL;
+
+	needs_reset = netif_running(adapter->netdev) && igc_xdp_is_enabled(adapter);
+
+	rx_ring = adapter->rx_ring[queue_id];
+	napi = &rx_ring->q_vector->napi;
+
+	if (needs_reset) {
+		igc_disable_rx_ring(rx_ring);
+		napi_disable(napi);
+	}
+
+	xsk_pool_dma_unmap(pool, IGC_RX_DMA_ATTR);
+	clear_bit(IGC_RING_FLAG_AF_XDP_ZC, &rx_ring->flags);
+
+	if (needs_reset) {
+		napi_enable(napi);
+		igc_enable_rx_ring(rx_ring);
+	}
+
+	return 0;
+}
+
+int igc_xdp_setup_pool(struct igc_adapter *adapter, struct xsk_buff_pool *pool,
+		       u16 queue_id)
+{
+	return pool ? igc_xdp_enable_pool(adapter, pool, queue_id) :
+		      igc_xdp_disable_pool(adapter, queue_id);
+}
diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.h b/drivers/net/ethernet/intel/igc/igc_xdp.h
index cdaa2c39b03a..a74e5487d199 100644
--- a/drivers/net/ethernet/intel/igc/igc_xdp.h
+++ b/drivers/net/ethernet/intel/igc/igc_xdp.h
@@ -6,6 +6,8 @@
 
 int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
 		     struct netlink_ext_ack *extack);
+int igc_xdp_setup_pool(struct igc_adapter *adapter, struct xsk_buff_pool *pool,
+		       u16 queue_id);
 
 static inline bool igc_xdp_is_enabled(struct igc_adapter *adapter)
 {
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Intel-wired-lan] [PATCH 10/10] igc: Enable TX via AF_XDP zero-copy
  2020-12-17 20:24 [Intel-wired-lan] [PATCH 00/10] igc: Add support for AF_XDP zero-copy Andre Guedes
                   ` (8 preceding siblings ...)
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 09/10] igc: Enable RX via AF_XDP zero-copy Andre Guedes
@ 2020-12-17 20:24 ` Andre Guedes
  2020-12-22 14:16   ` Maciej Fijalkowski
  2020-12-22 14:20 ` [Intel-wired-lan] [PATCH 00/10] igc: Add support for " Maciej Fijalkowski
  10 siblings, 1 reply; 30+ messages in thread
From: Andre Guedes @ 2020-12-17 20:24 UTC (permalink / raw)
  To: intel-wired-lan

Add support for transmitting packets via the AF_XDP zero-copy
mechanism.

The packet transmission itself is implemented by igc_xdp_xmit_zc(),
which is called from igc_clean_tx_irq() when the ring has AF_XDP
zero-copy enabled. As in the i40e and ice drivers, the transmission
budget used is the number of descriptors available on the ring.
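
That budget relies on the driver's igc_desc_unused() helper, which
keeps one slot vacant so that next_to_use == next_to_clean
unambiguously means an empty ring. A small sketch of that computation
(desc_unused() mirrors the helper's formula; treat it as an
illustration rather than the driver source):

```c
#include <assert.h>
#include <stdint.h>

/* Number of free descriptors on a ring with producer index next_to_use
 * and consumer index next_to_clean. One slot is always left unused so
 * the "full" and "empty" states are distinguishable.
 */
static uint16_t desc_unused(uint16_t count, uint16_t next_to_use,
                            uint16_t next_to_clean)
{
    return ((next_to_clean > next_to_use) ? 0 : count)
           + next_to_clean - next_to_use - 1;
}
```

Using the free-descriptor count as the Tx budget naturally throttles
the zero-copy xmit loop: it can never post more frames than the ring
can hold.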

A new tx buffer type is added to 'enum igc_tx_buffer_type' to indicate
that the tx buffer uses memory from the xsk pool, so it can be properly
cleaned up after transmission or when the ring is cleaned.

The I225 controller has only 4 Tx hardware queues, so the main
difference between igc and other Intel drivers that support AF_XDP
zero-copy is that there is no tx ring dedicated exclusively to XDP.
Instead, tx rings are shared between the network stack and XDP, and the
netdev queue lock is used to ensure mutual exclusion. This is the same
approach already used to support the XDP_TX and XDP_REDIRECT actions.
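
To make the descriptor setup in igc_xdp_xmit_zc() concrete, the sketch
below composes the two 32-bit control words the way the xmit loop in
this patch does. All constant values here are assumptions taken from
other Intel advanced-descriptor drivers (igb/ixgbe) and are not
verified against igc_defines.h:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout for an Intel advanced Tx data descriptor; the
 * authoritative values live in the driver's igc_defines.h.
 */
#define ADVTXD_DTYP_DATA    0x00300000u  /* advanced data descriptor */
#define ADVTXD_DCMD_EOP     0x01000000u  /* end of packet */
#define ADVTXD_DCMD_IFCS    0x02000000u  /* insert FCS */
#define ADVTXD_DCMD_RS      0x08000000u  /* report status */
#define ADVTXD_DCMD_DEXT    0x20000000u  /* descriptor extension */
#define ADVTXD_PAYLEN_SHIFT 14

/* cmd_type carries the frame length in its low bits plus command flags */
static uint32_t zc_cmd_type(uint32_t len)
{
    return ADVTXD_DTYP_DATA | ADVTXD_DCMD_DEXT | ADVTXD_DCMD_IFCS |
           ADVTXD_DCMD_EOP | ADVTXD_DCMD_RS | len;
}

/* olinfo_status carries the payload length shifted into place */
static uint32_t zc_olinfo_status(uint32_t len)
{
    return len << ADVTXD_PAYLEN_SHIFT;
}
```

Each xsk frame gets EOP plus RS set because the zero-copy path sends
one descriptor per frame and relies on write-back status to complete
frames back to the pool.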

Signed-off-by: Andre Guedes <andre.guedes@intel.com>
---
 drivers/net/ethernet/intel/igc/igc.h      |   3 +
 drivers/net/ethernet/intel/igc/igc_base.h |   1 +
 drivers/net/ethernet/intel/igc/igc_main.c | 115 +++++++++++++++++++++-
 drivers/net/ethernet/intel/igc/igc_xdp.c  |  20 +++-
 4 files changed, 131 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index d4b0e1db9804..dced32a098c0 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -244,6 +244,8 @@ int igc_set_spd_dplx(struct igc_adapter *adapter, u32 spd, u8 dplx);
 void igc_update_stats(struct igc_adapter *adapter);
 void igc_disable_rx_ring(struct igc_ring *ring);
 void igc_enable_rx_ring(struct igc_ring *ring);
+void igc_disable_tx_ring(struct igc_ring *ring);
+void igc_enable_tx_ring(struct igc_ring *ring);
 int igc_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags);
 
 /* igc_dump declarations */
@@ -399,6 +401,7 @@ enum igc_boards {
 enum igc_tx_buffer_type {
 	IGC_TX_BUFFER_TYPE_SKB,
 	IGC_TX_BUFFER_TYPE_XDP,
+	IGC_TX_BUFFER_TYPE_XSK,
 };
 
 /* wrapper around a pointer to a socket buffer,
diff --git a/drivers/net/ethernet/intel/igc/igc_base.h b/drivers/net/ethernet/intel/igc/igc_base.h
index 15d0086be949..4c12d514b448 100644
--- a/drivers/net/ethernet/intel/igc/igc_base.h
+++ b/drivers/net/ethernet/intel/igc/igc_base.h
@@ -78,6 +78,7 @@ union igc_adv_rx_desc {
 
 /* Additional Transmit Descriptor Control definitions */
 #define IGC_TXDCTL_QUEUE_ENABLE	0x02000000 /* Ena specific Tx Queue */
+#define IGC_TXDCTL_SWFLUSH	0x04000000 /* Transmit Software Flush */
 
 /* Additional Receive Descriptor Control definitions */
 #define IGC_RXDCTL_QUEUE_ENABLE	0x02000000 /* Ena specific Rx Queue */
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 74b6138e5f2c..ef95078af38f 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -187,16 +187,22 @@ static void igc_clean_tx_ring(struct igc_ring *tx_ring)
 {
 	u16 i = tx_ring->next_to_clean;
 	struct igc_tx_buffer *tx_buffer = &tx_ring->tx_buffer_info[i];
+	u32 xsk_frames = 0;
 
 	while (i != tx_ring->next_to_use) {
 		union igc_adv_tx_desc *eop_desc, *tx_desc;
 
 		switch (tx_buffer->type) {
+		case IGC_TX_BUFFER_TYPE_XSK:
+			xsk_frames++;
+			break;
 		case IGC_TX_BUFFER_TYPE_XDP:
 			xdp_return_frame(tx_buffer->xdpf);
+			igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
 			break;
 		case IGC_TX_BUFFER_TYPE_SKB:
 			dev_kfree_skb_any(tx_buffer->skb);
+			igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
 			break;
 		default:
 			netdev_warn_once(tx_ring->netdev,
@@ -204,8 +210,6 @@ static void igc_clean_tx_ring(struct igc_ring *tx_ring)
 			break;
 		}
 
-		igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
-
 		/* check for eop_desc to determine the end of the packet */
 		eop_desc = tx_buffer->next_to_watch;
 		tx_desc = IGC_TX_DESC(tx_ring, i);
@@ -235,6 +239,9 @@ static void igc_clean_tx_ring(struct igc_ring *tx_ring)
 		}
 	}
 
+	if (tx_ring->xsk_pool && xsk_frames)
+		xsk_tx_completed(tx_ring->xsk_pool, xsk_frames);
+
 	/* reset BQL for queue */
 	netdev_tx_reset_queue(txring_txq(tx_ring));
 
@@ -675,6 +682,8 @@ static void igc_configure_tx_ring(struct igc_adapter *adapter,
 	u64 tdba = ring->dma;
 	u32 txdctl = 0;
 
+	ring->xsk_pool = igc_get_xsk_pool(adapter, ring);
+
 	/* disable the queue */
 	wr32(IGC_TXDCTL(reg_idx), 0);
 	wrfl();
@@ -2509,6 +2518,67 @@ static void igc_update_tx_stats(struct igc_q_vector *q_vector,
 	q_vector->tx.total_packets += packets;
 }
 
+static void igc_xdp_xmit_zc(struct igc_ring *ring)
+{
+	struct xsk_buff_pool *pool = ring->xsk_pool;
+	struct netdev_queue *nq = txring_txq(ring);
+	int cpu = smp_processor_id();
+	struct xdp_desc xdp_desc;
+	bool work_done;
+	u16 budget;
+
+	if (!netif_carrier_ok(ring->netdev))
+		return;
+
+	__netif_tx_lock(nq, cpu);
+
+	budget = igc_desc_unused(ring);
+	work_done = false;
+
+	while (xsk_tx_peek_desc(pool, &xdp_desc) && budget--) {
+		u32 cmd_type, olinfo_status;
+		union igc_adv_tx_desc *desc;
+		struct igc_tx_buffer *bi;
+		dma_addr_t dma;
+
+		cmd_type = IGC_ADVTXD_DTYP_DATA | IGC_ADVTXD_DCMD_DEXT |
+			   IGC_ADVTXD_DCMD_IFCS | IGC_TXD_DCMD |
+			   xdp_desc.len;
+		olinfo_status = xdp_desc.len << IGC_ADVTXD_PAYLEN_SHIFT;
+
+		dma = xsk_buff_raw_get_dma(pool, xdp_desc.addr);
+		xsk_buff_raw_dma_sync_for_device(pool, dma, xdp_desc.len);
+
+		desc = IGC_TX_DESC(ring, ring->next_to_use);
+		desc->read.cmd_type_len = cpu_to_le32(cmd_type);
+		desc->read.olinfo_status = cpu_to_le32(olinfo_status);
+		desc->read.buffer_addr = cpu_to_le64(dma);
+
+		bi = &ring->tx_buffer_info[ring->next_to_use];
+		bi->type = IGC_TX_BUFFER_TYPE_XSK;
+		bi->protocol = 0;
+		bi->bytecount = xdp_desc.len;
+		bi->gso_segs = 1;
+		bi->time_stamp = jiffies;
+		bi->next_to_watch = desc;
+
+		netdev_tx_sent_queue(txring_txq(ring), xdp_desc.len);
+
+		ring->next_to_use++;
+		if (ring->next_to_use == ring->count)
+			ring->next_to_use = 0;
+
+		work_done = true;
+	}
+
+	if (work_done) {
+		igc_flush_tx_descriptors(ring);
+		xsk_tx_release(pool);
+	}
+
+	__netif_tx_unlock(nq);
+}
+
 /**
  * igc_clean_tx_irq - Reclaim resources after transmit completes
  * @q_vector: pointer to q_vector containing needed info
@@ -2525,6 +2595,7 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget)
 	unsigned int i = tx_ring->next_to_clean;
 	struct igc_tx_buffer *tx_buffer;
 	union igc_adv_tx_desc *tx_desc;
+	u32 xsk_frames = 0;
 
 	if (test_bit(__IGC_DOWN, &adapter->state))
 		return true;
@@ -2555,11 +2626,16 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget)
 		total_packets += tx_buffer->gso_segs;
 
 		switch (tx_buffer->type) {
+		case IGC_TX_BUFFER_TYPE_XSK:
+			xsk_frames++;
+			break;
 		case IGC_TX_BUFFER_TYPE_XDP:
 			xdp_return_frame(tx_buffer->xdpf);
+			igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
 			break;
 		case IGC_TX_BUFFER_TYPE_SKB:
 			napi_consume_skb(tx_buffer->skb, napi_budget);
+			igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
 			break;
 		default:
 			netdev_warn_once(tx_ring->netdev,
@@ -2567,8 +2643,6 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget)
 			break;
 		}
 
-		igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
-
 		/* clear last DMA location and unmap remaining buffers */
 		while (tx_desc != eop_desc) {
 			tx_buffer++;
@@ -2610,6 +2684,14 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget)
 
 	igc_update_tx_stats(q_vector, total_packets, total_bytes);
 
+	if (tx_ring->xsk_pool) {
+		if (xsk_frames)
+			xsk_tx_completed(tx_ring->xsk_pool, xsk_frames);
+		if (xsk_uses_need_wakeup(tx_ring->xsk_pool))
+			xsk_set_tx_need_wakeup(tx_ring->xsk_pool);
+		igc_xdp_xmit_zc(tx_ring);
+	}
+
 	if (test_bit(IGC_RING_FLAG_TX_DETECT_HANG, &tx_ring->flags)) {
 		struct igc_hw *hw = &adapter->hw;
 
@@ -6282,6 +6364,31 @@ void igc_enable_rx_ring(struct igc_ring *ring)
 		igc_alloc_rx_buffers(ring, igc_desc_unused(ring));
 }
 
+static void igc_disable_tx_ring_hw(struct igc_ring *ring)
+{
+	struct igc_hw *hw = &ring->q_vector->adapter->hw;
+	u8 idx = ring->reg_idx;
+	u32 txdctl;
+
+	txdctl = rd32(IGC_TXDCTL(idx));
+	txdctl &= ~IGC_TXDCTL_QUEUE_ENABLE;
+	txdctl |= IGC_TXDCTL_SWFLUSH;
+	wr32(IGC_TXDCTL(idx), txdctl);
+}
+
+void igc_disable_tx_ring(struct igc_ring *ring)
+{
+	igc_disable_tx_ring_hw(ring);
+	igc_clean_tx_ring(ring);
+}
+
+void igc_enable_tx_ring(struct igc_ring *ring)
+{
+	struct igc_adapter *adapter = ring->q_vector->adapter;
+
+	igc_configure_tx_ring(adapter, ring);
+}
+
 /**
  * igc_init_module - Driver Registration Routine
  *
diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.c b/drivers/net/ethernet/intel/igc/igc_xdp.c
index 9515906d54e0..5ee31b32df7a 100644
--- a/drivers/net/ethernet/intel/igc/igc_xdp.c
+++ b/drivers/net/ethernet/intel/igc/igc_xdp.c
@@ -39,13 +39,14 @@ static int igc_xdp_enable_pool(struct igc_adapter *adapter,
 {
 	struct net_device *ndev = adapter->netdev;
 	struct device *dev = &adapter->pdev->dev;
-	struct igc_ring *rx_ring;
+	struct igc_ring *rx_ring, *tx_ring;
 	struct napi_struct *napi;
 	bool needs_reset;
 	u32 frame_size;
 	int err;
 
-	if (queue_id >= adapter->num_rx_queues)
+	if (queue_id >= adapter->num_rx_queues ||
+	    queue_id >= adapter->num_tx_queues)
 		return -EINVAL;
 
 	frame_size = xsk_pool_get_rx_frame_size(pool);
@@ -67,18 +68,23 @@ static int igc_xdp_enable_pool(struct igc_adapter *adapter,
 	needs_reset = netif_running(adapter->netdev) && igc_xdp_is_enabled(adapter);
 
 	rx_ring = adapter->rx_ring[queue_id];
+	tx_ring = adapter->tx_ring[queue_id];
+	/* Rx and Tx rings share the same napi context. */
 	napi = &rx_ring->q_vector->napi;
 
 	if (needs_reset) {
 		igc_disable_rx_ring(rx_ring);
+		igc_disable_tx_ring(tx_ring);
 		napi_disable(napi);
 	}
 
 	set_bit(IGC_RING_FLAG_AF_XDP_ZC, &rx_ring->flags);
+	set_bit(IGC_RING_FLAG_AF_XDP_ZC, &tx_ring->flags);
 
 	if (needs_reset) {
 		napi_enable(napi);
 		igc_enable_rx_ring(rx_ring);
+		igc_enable_tx_ring(tx_ring);
 
 		err = igc_xsk_wakeup(ndev, queue_id, XDP_WAKEUP_RX);
 		if (err)
@@ -90,12 +96,13 @@ static int igc_xdp_enable_pool(struct igc_adapter *adapter,
 
 static int igc_xdp_disable_pool(struct igc_adapter *adapter, u16 queue_id)
 {
+	struct igc_ring *rx_ring, *tx_ring;
 	struct xsk_buff_pool *pool;
-	struct igc_ring *rx_ring;
 	struct napi_struct *napi;
 	bool needs_reset;
 
-	if (queue_id >= adapter->num_rx_queues)
+	if (queue_id >= adapter->num_rx_queues ||
+	    queue_id >= adapter->num_tx_queues)
 		return -EINVAL;
 
 	pool = xsk_get_pool_from_qid(adapter->netdev, queue_id);
@@ -105,19 +112,24 @@ static int igc_xdp_disable_pool(struct igc_adapter *adapter, u16 queue_id)
 	needs_reset = netif_running(adapter->netdev) && igc_xdp_is_enabled(adapter);
 
 	rx_ring = adapter->rx_ring[queue_id];
+	tx_ring = adapter->tx_ring[queue_id];
+	/* Rx and Tx rings share the same napi context. */
 	napi = &rx_ring->q_vector->napi;
 
 	if (needs_reset) {
 		igc_disable_rx_ring(rx_ring);
+		igc_disable_tx_ring(tx_ring);
 		napi_disable(napi);
 	}
 
 	xsk_pool_dma_unmap(pool, IGC_RX_DMA_ATTR);
 	clear_bit(IGC_RING_FLAG_AF_XDP_ZC, &rx_ring->flags);
+	clear_bit(IGC_RING_FLAG_AF_XDP_ZC, &tx_ring->flags);
 
 	if (needs_reset) {
 		napi_enable(napi);
 		igc_enable_rx_ring(rx_ring);
+		igc_enable_tx_ring(tx_ring);
 	}
 
 	return 0;
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Intel-wired-lan] [PATCH 02/10] igc: Refactor igc_xdp_run_prog()
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 02/10] igc: Refactor igc_xdp_run_prog() Andre Guedes
@ 2020-12-21 22:45   ` Maciej Fijalkowski
  2020-12-22  1:12     ` Andre Guedes
  0 siblings, 1 reply; 30+ messages in thread
From: Maciej Fijalkowski @ 2020-12-21 22:45 UTC (permalink / raw)
  To: intel-wired-lan

On Thu, Dec 17, 2020 at 12:24:07PM -0800, Andre Guedes wrote:
> This patch refactors igc_xdp_run_prog() helper, preparing the code for

s/This patch refactors/Refactor

to follow the guideline of using the imperative mood in commit msgs.

> AF_XDP zero-copy support which is added by upcoming patches.
> 
> With AF_XDP zero-copy support, igc_poll() will have a dedicated path
> when the rx ring has zero-copy is enabled. To avoid code duplication as

The end of this sentence is a bit bogus.

What about:

With AF_XDP zero-copy support, igc_poll() will have a dedicated path
when rx ring's memory model is MEM_TYPE_XSK_BUFF_POOL.

?

> much as possible, this patch encapsulates the code specific to handling

s/this patch encapsulates/encapsulate

git grep "This patch" Documentation/process/submitting-patches.rst

as a refresher :)

> XDP program actions into a local helper so it can be reused by the
> zero-copy path.
> 
> Signed-off-by: Andre Guedes <andre.guedes@intel.com>
> ---
>  drivers/net/ethernet/intel/igc/igc_main.c | 56 +++++++++++------------
>  1 file changed, 28 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> index 0e582a4ee986..56b791b356dc 100644
> --- a/drivers/net/ethernet/intel/igc/igc_main.c
> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> @@ -2029,38 +2029,22 @@ static int igc_xdp_xmit_back(struct igc_adapter *adapter, struct xdp_buff *xdp)
>  	return res;
>  }
>  
> -static struct sk_buff *igc_xdp_run_prog(struct igc_adapter *adapter,
> -					struct xdp_buff *xdp)
> +/* This function assumes rcu_read_lock() is held by the caller. */
> +static int igc_xdp_do_run_prog(struct igc_adapter *adapter,
> +			       struct bpf_prog *prog,
> +			       struct xdp_buff *xdp)
>  {
> -	struct bpf_prog *prog;
> -	int res;
> -	u32 act;
> -
> -	rcu_read_lock();
> -
> -	prog = READ_ONCE(adapter->xdp_prog);
> -	if (!prog) {
> -		res = IGC_XDP_PASS;
> -		goto unlock;
> -	}
> +	u32 act = bpf_prog_run_xdp(prog, xdp);
>  
> -	act = bpf_prog_run_xdp(prog, xdp);
>  	switch (act) {
>  	case XDP_PASS:
> -		res = IGC_XDP_PASS;
> -		break;
> +		return IGC_XDP_PASS;
>  	case XDP_TX:
> -		if (igc_xdp_xmit_back(adapter, xdp) < 0)
> -			res = IGC_XDP_CONSUMED;
> -		else
> -			res = IGC_XDP_TX;
> -		break;
> +		return igc_xdp_xmit_back(adapter, xdp) < 0 ?
> +			IGC_XDP_CONSUMED : IGC_XDP_TX;
>  	case XDP_REDIRECT:
> -		if (xdp_do_redirect(adapter->netdev, xdp, prog) < 0)
> -			res = IGC_XDP_CONSUMED;
> -		else
> -			res = IGC_XDP_REDIRECT;
> -		break;
> +		return xdp_do_redirect(adapter->netdev, xdp, prog) < 0 ?
> +			IGC_XDP_CONSUMED : IGC_XDP_REDIRECT;
>  	default:
>  		bpf_warn_invalid_xdp_action(act);
>  		fallthrough;
> @@ -2068,9 +2052,25 @@ static struct sk_buff *igc_xdp_run_prog(struct igc_adapter *adapter,
>  		trace_xdp_exception(adapter->netdev, prog, act);
>  		fallthrough;
>  	case XDP_DROP:
> -		res = IGC_XDP_CONSUMED;
> -		break;
> +		return IGC_XDP_CONSUMED;
>  	}
> +}
> +
> +static struct sk_buff *igc_xdp_run_prog(struct igc_adapter *adapter,
> +					struct xdp_buff *xdp)
> +{
> +	struct bpf_prog *prog;
> +	int res;
> +
> +	rcu_read_lock();
> +
> +	prog = READ_ONCE(adapter->xdp_prog);
> +	if (!prog) {
> +		res = IGC_XDP_PASS;
> +		goto unlock;
> +	}
> +
> +	res = igc_xdp_do_run_prog(adapter, prog, xdp);
>  
>  unlock:
>  	rcu_read_unlock();
> -- 
> 2.29.2
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Intel-wired-lan] [PATCH 04/10] igc: Refactor XDP rxq info registration
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 04/10] igc: Refactor XDP rxq info registration Andre Guedes
@ 2020-12-21 22:53   ` Maciej Fijalkowski
  2020-12-22  1:13     ` Andre Guedes
  0 siblings, 1 reply; 30+ messages in thread
From: Maciej Fijalkowski @ 2020-12-21 22:53 UTC (permalink / raw)
  To: intel-wired-lan

On Thu, Dec 17, 2020 at 12:24:09PM -0800, Andre Guedes wrote:
> This patch does a refactoring on the XDP rxq info registration code,
> preparing the driver for AF_XDP zero-copy support which is added by
> upcoming patches.
> 
> Currently, xdp_rxq and memory model are both registered during RX
> resource setup time by igc_xdp_register_rxq_info() helper. With AF_XDP,
> we want to register the memory model later on while configuring the ring
> because we will know which memory model type to register
> (MEM_TYPE_PAGE_SHARED or MEM_TYPE_XSK_BUFF_POOL).
> 
> The helpers igc_xdp_register_rxq_info() and
> igc_xdp_unregister_rxq_info() are not useful anymore so they are removed.
> 
> Signed-off-by: Andre Guedes <andre.guedes@intel.com>
> ---
>  drivers/net/ethernet/intel/igc/igc_main.c | 14 ++++++++----
>  drivers/net/ethernet/intel/igc/igc_xdp.c  | 27 -----------------------
>  drivers/net/ethernet/intel/igc/igc_xdp.h  |  3 ---
>  3 files changed, 10 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> index d591c1e0581f..1870ca37d650 100644
> --- a/drivers/net/ethernet/intel/igc/igc_main.c
> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> @@ -419,7 +419,7 @@ void igc_free_rx_resources(struct igc_ring *rx_ring)
>  {
>  	igc_clean_rx_ring(rx_ring);
>  
> -	igc_xdp_unregister_rxq_info(rx_ring);
> +	xdp_rxq_info_unreg(&rx_ring->xdp_rxq);
>  
>  	vfree(rx_ring->rx_buffer_info);
>  	rx_ring->rx_buffer_info = NULL;
> @@ -460,9 +460,12 @@ int igc_setup_rx_resources(struct igc_ring *rx_ring)
>  	struct device *dev = rx_ring->dev;
>  	int size, desc_len, res;
>  
> -	res = igc_xdp_register_rxq_info(rx_ring);
> -	if (res < 0)
> +	res = xdp_rxq_info_reg(&rx_ring->xdp_rxq, ndev, rx_ring->queue_index,
> +			       0);
> +	if (res < 0) {
> +		netdev_err(ndev, "Failed to register xdp rxq info\n");

Maybe print out q idx?

>  		return res;
> +	}
>  
>  	size = sizeof(struct igc_rx_buffer) * rx_ring->count;
>  	rx_ring->rx_buffer_info = vzalloc(size);
> @@ -488,7 +491,7 @@ int igc_setup_rx_resources(struct igc_ring *rx_ring)
>  	return 0;
>  
>  err:
> -	igc_xdp_unregister_rxq_info(rx_ring);
> +	xdp_rxq_info_unreg(&rx_ring->xdp_rxq);
>  	vfree(rx_ring->rx_buffer_info);
>  	rx_ring->rx_buffer_info = NULL;
>  	netdev_err(ndev, "Unable to allocate memory for Rx descriptor ring\n");
> @@ -536,6 +539,9 @@ static void igc_configure_rx_ring(struct igc_adapter *adapter,
>  	u32 srrctl = 0, rxdctl = 0;
>  	u64 rdba = ring->dma;
>  
> +	WARN_ON(xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
> +					   MEM_TYPE_PAGE_SHARED, NULL));

You should do the unroll in case it fails just like it was done in
igc_xdp_register_rxq_info.

> +
>  	if (igc_xdp_is_enabled(adapter))
>  		set_ring_uses_large_buffer(ring);
>  
> diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.c b/drivers/net/ethernet/intel/igc/igc_xdp.c
> index 11133c4619bb..27c886a254f1 100644
> --- a/drivers/net/ethernet/intel/igc/igc_xdp.c
> +++ b/drivers/net/ethernet/intel/igc/igc_xdp.c
> @@ -31,30 +31,3 @@ int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
>  
>  	return 0;
>  }
> -
> -int igc_xdp_register_rxq_info(struct igc_ring *ring)
> -{
> -	struct net_device *dev = ring->netdev;
> -	int err;
> -
> -	err = xdp_rxq_info_reg(&ring->xdp_rxq, dev, ring->queue_index, 0);
> -	if (err) {
> -		netdev_err(dev, "Failed to register xdp rxq info\n");
> -		return err;
> -	}
> -
> -	err = xdp_rxq_info_reg_mem_model(&ring->xdp_rxq, MEM_TYPE_PAGE_SHARED,
> -					 NULL);
> -	if (err) {
> -		netdev_err(dev, "Failed to register xdp rxq mem model\n");
> -		xdp_rxq_info_unreg(&ring->xdp_rxq);
> -		return err;
> -	}
> -
> -	return 0;
> -}
> -
> -void igc_xdp_unregister_rxq_info(struct igc_ring *ring)
> -{
> -	xdp_rxq_info_unreg(&ring->xdp_rxq);
> -}
> diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.h b/drivers/net/ethernet/intel/igc/igc_xdp.h
> index 412aa369e6ba..cdaa2c39b03a 100644
> --- a/drivers/net/ethernet/intel/igc/igc_xdp.h
> +++ b/drivers/net/ethernet/intel/igc/igc_xdp.h
> @@ -7,9 +7,6 @@
>  int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
>  		     struct netlink_ext_ack *extack);
>  
> -int igc_xdp_register_rxq_info(struct igc_ring *ring);
> -void igc_xdp_unregister_rxq_info(struct igc_ring *ring);
> -
>  static inline bool igc_xdp_is_enabled(struct igc_adapter *adapter)
>  {
>  	return !!adapter->xdp_prog;
> -- 
> 2.29.2
> 


* [Intel-wired-lan] [PATCH 06/10] igc: Introduce igc_update_tx_stats()
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 06/10] igc: Introduce igc_update_tx_stats() Andre Guedes
@ 2020-12-21 22:58   ` Maciej Fijalkowski
  2020-12-22  1:13     ` Andre Guedes
  0 siblings, 1 reply; 30+ messages in thread
From: Maciej Fijalkowski @ 2020-12-21 22:58 UTC (permalink / raw)
  To: intel-wired-lan

On Thu, Dec 17, 2020 at 12:24:11PM -0800, Andre Guedes wrote:
> As we do with the RX stats, this patch encapsulates the code that
> updates the driver TX stats in its own local helper.

I think you can squash this onto previous patch.

> 
> Signed-off-by: Andre Guedes <andre.guedes@intel.com>
> ---
>  drivers/net/ethernet/intel/igc/igc_main.c | 22 ++++++++++++++++------
>  1 file changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> index e5cdebbe5637..26c2fc9977cc 100644
> --- a/drivers/net/ethernet/intel/igc/igc_main.c
> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> @@ -2263,6 +2263,20 @@ static int igc_clean_rx_irq(struct igc_q_vector *q_vector, const int budget)
>  	return total_packets;
>  }
>  
> +static void igc_update_tx_stats(struct igc_q_vector *q_vector,
> +				unsigned int packets, unsigned int bytes)
> +{
> +	struct igc_ring *ring = q_vector->tx.ring;
> +
> +	u64_stats_update_begin(&ring->tx_syncp);
> +	ring->tx_stats.bytes += bytes;
> +	ring->tx_stats.packets += packets;
> +	u64_stats_update_end(&ring->tx_syncp);
> +
> +	q_vector->tx.total_bytes += bytes;
> +	q_vector->tx.total_packets += packets;
> +}
> +
>  /**
>   * igc_clean_tx_irq - Reclaim resources after transmit completes
>   * @q_vector: pointer to q_vector containing needed info
> @@ -2365,12 +2379,8 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget)
>  
>  	i += tx_ring->count;
>  	tx_ring->next_to_clean = i;
> -	u64_stats_update_begin(&tx_ring->tx_syncp);
> -	tx_ring->tx_stats.bytes += total_bytes;
> -	tx_ring->tx_stats.packets += total_packets;
> -	u64_stats_update_end(&tx_ring->tx_syncp);
> -	q_vector->tx.total_bytes += total_bytes;
> -	q_vector->tx.total_packets += total_packets;
> +
> +	igc_update_tx_stats(q_vector, total_packets, total_bytes);
>  
>  	if (test_bit(IGC_RING_FLAG_TX_DETECT_HANG, &tx_ring->flags)) {
>  		struct igc_hw *hw = &adapter->hw;
> -- 
> 2.29.2
> 


* [Intel-wired-lan] [PATCH 08/10] igc: Replace IGC_TX_FLAGS_XDP flag by an enum
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 08/10] igc: Replace IGC_TX_FLAGS_XDP flag by an enum Andre Guedes
@ 2020-12-21 23:09   ` Maciej Fijalkowski
  2020-12-22  1:13     ` Andre Guedes
  0 siblings, 1 reply; 30+ messages in thread
From: Maciej Fijalkowski @ 2020-12-21 23:09 UTC (permalink / raw)
  To: intel-wired-lan

On Thu, Dec 17, 2020 at 12:24:13PM -0800, Andre Guedes wrote:
> Up to this point, tx buffers are associated with either a skb or a xdpf,
> and the IGC_TX_FLAGS_XDP flag was enough to distinguish between these
> two cases. However, with upcoming patches that will add AF_XDP zero-copy
> support, a third case will be introduced so this flag-based approach
> won't fit well.
> 
> In preparation to land AF_XDP zero-copy support, this patch replaces the
> IGC_TX_FLAGS_XDP flag by an enum which will be extended once zero-copy
> support is introduced to the driver.
> 
> Signed-off-by: Andre Guedes <andre.guedes@intel.com>
> ---
>  drivers/net/ethernet/intel/igc/igc.h      |  8 +++++--
>  drivers/net/ethernet/intel/igc/igc_main.c | 27 ++++++++++++++++++-----
>  2 files changed, 28 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
> index 29be8833956a..b537488f5bae 100644
> --- a/drivers/net/ethernet/intel/igc/igc.h
> +++ b/drivers/net/ethernet/intel/igc/igc.h
> @@ -376,8 +376,6 @@ enum igc_tx_flags {
>  	/* olinfo flags */
>  	IGC_TX_FLAGS_IPV4	= 0x10,
>  	IGC_TX_FLAGS_CSUM	= 0x20,
> -
> -	IGC_TX_FLAGS_XDP	= 0x100,
>  };
>  
>  enum igc_boards {
> @@ -394,12 +392,18 @@ enum igc_boards {
>  #define TXD_USE_COUNT(S)	DIV_ROUND_UP((S), IGC_MAX_DATA_PER_TXD)
>  #define DESC_NEEDED	(MAX_SKB_FRAGS + 4)
>  
> +enum igc_tx_buffer_type {
> +	IGC_TX_BUFFER_TYPE_SKB,
> +	IGC_TX_BUFFER_TYPE_XDP,
> +};
> +
>  /* wrapper around a pointer to a socket buffer,
>   * so a DMA handle can be stored along with the buffer
>   */
>  struct igc_tx_buffer {
>  	union igc_adv_tx_desc *next_to_watch;
>  	unsigned long time_stamp;
> +	enum igc_tx_buffer_type type;
>  	union {
>  		struct sk_buff *skb;
>  		struct xdp_frame *xdpf;
> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> index 60987a5b4b72..ec366643f996 100644
> --- a/drivers/net/ethernet/intel/igc/igc_main.c
> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> @@ -191,10 +191,18 @@ static void igc_clean_tx_ring(struct igc_ring *tx_ring)
>  	while (i != tx_ring->next_to_use) {
>  		union igc_adv_tx_desc *eop_desc, *tx_desc;
>  
> -		if (tx_buffer->tx_flags & IGC_TX_FLAGS_XDP)
> +		switch (tx_buffer->type) {
> +		case IGC_TX_BUFFER_TYPE_XDP:
>  			xdp_return_frame(tx_buffer->xdpf);
> -		else
> +			break;
> +		case IGC_TX_BUFFER_TYPE_SKB:
>  			dev_kfree_skb_any(tx_buffer->skb);
> +			break;
> +		default:
> +			netdev_warn_once(tx_ring->netdev,
> +					 "Unknown tx buffer type\n");
> +			break;
> +		}

nit: you've been doing some effort in order to reduce the code duplication
as much as it's possible, yet here you introduce duplicated code for that
desc cleanup :p maybe add a little helper for that as well?

>  
>  		igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
>  
> @@ -1371,6 +1379,7 @@ static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb,
>  
>  	/* record the location of the first descriptor for this packet */
>  	first = &tx_ring->tx_buffer_info[tx_ring->next_to_use];
> +	first->type = IGC_TX_BUFFER_TYPE_SKB;
>  	first->skb = skb;
>  	first->bytecount = skb->len;
>  	first->gso_segs = 1;
> @@ -1950,8 +1959,8 @@ static int igc_xdp_init_tx_buffer(struct igc_tx_buffer *buffer,
>  		return -ENOMEM;
>  	}
>  
> +	buffer->type = IGC_TX_BUFFER_TYPE_XDP;
>  	buffer->xdpf = xdpf;
> -	buffer->tx_flags = IGC_TX_FLAGS_XDP;
>  	buffer->protocol = 0;
>  	buffer->bytecount = xdpf->len;
>  	buffer->gso_segs = 1;
> @@ -2315,10 +2324,18 @@ static bool igc_clean_tx_irq(struct igc_q_vector *q_vector, int napi_budget)
>  		total_bytes += tx_buffer->bytecount;
>  		total_packets += tx_buffer->gso_segs;
>  
> -		if (tx_buffer->tx_flags & IGC_TX_FLAGS_XDP)
> +		switch (tx_buffer->type) {
> +		case IGC_TX_BUFFER_TYPE_XDP:
>  			xdp_return_frame(tx_buffer->xdpf);
> -		else
> +			break;
> +		case IGC_TX_BUFFER_TYPE_SKB:
>  			napi_consume_skb(tx_buffer->skb, napi_budget);
> +			break;
> +		default:
> +			netdev_warn_once(tx_ring->netdev,
> +					 "Unknown tx buffer type\n");
> +			break;
> +		}
>  
>  		igc_unmap_tx_buffer(tx_ring->dev, tx_buffer);
>  
> -- 
> 2.29.2
> 


* [Intel-wired-lan] [PATCH 02/10] igc: Refactor igc_xdp_run_prog()
  2020-12-21 22:45   ` Maciej Fijalkowski
@ 2020-12-22  1:12     ` Andre Guedes
  0 siblings, 0 replies; 30+ messages in thread
From: Andre Guedes @ 2020-12-22  1:12 UTC (permalink / raw)
  To: intel-wired-lan

Quoting Maciej Fijalkowski (2020-12-21 14:45:12)
> On Thu, Dec 17, 2020 at 12:24:07PM -0800, Andre Guedes wrote:
> > This patch refactors igc_xdp_run_prog() helper, preparing the code for
> 
> s/This patch refactors/Refactor
> 
> to follow the guideline of using the imperative mood in commit msgs.
> 
> > AF_XDP zero-copy support which is added by upcoming patches.
> > 
> > With AF_XDP zero-copy support, igc_poll() will have a dedicated path
> > when the rx ring has zero-copy is enabled. To avoid code duplication as
> 
> The end of this sentence is a bit bogus.
> 
> What about:
> 
> With AF_XDP zero-copy support, igc_poll() will have a dedicated path
> when rx ring's memory model is MEM_TYPE_XSK_BUFF_POOL.
> 
> ?
> 
> > much as possible, this patch encapsulates the code specific to handling
> 
> s/this patch encapsulates/encapsulate
> 
> git grep "This patch" Documentation/process/submitting-patches.rst
> 
> as a refresher :)
> 

Thanks for the suggestions above. I'll fix all commit messages in the v2.


* [Intel-wired-lan] [PATCH 04/10] igc: Refactor XDP rxq info registration
  2020-12-21 22:53   ` Maciej Fijalkowski
@ 2020-12-22  1:13     ` Andre Guedes
  2020-12-22 12:32       ` Maciej Fijalkowski
  0 siblings, 1 reply; 30+ messages in thread
From: Andre Guedes @ 2020-12-22  1:13 UTC (permalink / raw)
  To: intel-wired-lan

Quoting Maciej Fijalkowski (2020-12-21 14:53:45)
> > @@ -460,9 +460,12 @@ int igc_setup_rx_resources(struct igc_ring *rx_ring)
> >       struct device *dev = rx_ring->dev;
> >       int size, desc_len, res;
> >  
> > -     res = igc_xdp_register_rxq_info(rx_ring);
> > -     if (res < 0)
> > +     res = xdp_rxq_info_reg(&rx_ring->xdp_rxq, ndev, rx_ring->queue_index,
> > +                            0);
> > +     if (res < 0) {
> > +             netdev_err(ndev, "Failed to register xdp rxq info\n");
> 
> Maybe print out q idx?

I'll add that.

> > @@ -536,6 +539,9 @@ static void igc_configure_rx_ring(struct igc_adapter *adapter,
> >       u32 srrctl = 0, rxdctl = 0;
> >       u64 rdba = ring->dma;
> >  
> > +     WARN_ON(xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
> > +                                        MEM_TYPE_PAGE_SHARED, NULL));
> 
> You should do the unroll in case it fails just like it was done in
> igc_xdp_register_rxq_info.

This was inspired by the ixgbe driver.

The only reason xdp_rxq_info_reg_mem_model() could fail here is if xdp_rxq
wasn't registered. However, this is very unlikely to happen since it is
registered in igc_setup_rx_resources() which is always called before
igc_configure_rx_ring(). The WARN_ON() macro is used just in case.

If we really want to unroll, we should propagate the error back in the call
chain, changing the returning type of igc_configure_rx_ring() as well as the
other functions in the call chain, so the unrolling is done in the proper
place.

IMO, such change isn't worth it. It seems like a lot of change to cover a case
that is never expected. WARN_ON() sounds more suitable in those cases. Also,
ixgbe is around for quite some time and this doesn't seem to be an issue.


* [Intel-wired-lan] [PATCH 06/10] igc: Introduce igc_update_tx_stats()
  2020-12-21 22:58   ` Maciej Fijalkowski
@ 2020-12-22  1:13     ` Andre Guedes
  0 siblings, 0 replies; 30+ messages in thread
From: Andre Guedes @ 2020-12-22  1:13 UTC (permalink / raw)
  To: intel-wired-lan

Quoting Maciej Fijalkowski (2020-12-21 14:58:11)
> On Thu, Dec 17, 2020 at 12:24:11PM -0800, Andre Guedes wrote:
> > As we do with the RX stats, this patch encapsulates the code that
> > updates the driver TX stats in its own local helper.
> 
> I think you can squash this onto previous patch.

I can do that.


* [Intel-wired-lan] [PATCH 08/10] igc: Replace IGC_TX_FLAGS_XDP flag by an enum
  2020-12-21 23:09   ` Maciej Fijalkowski
@ 2020-12-22  1:13     ` Andre Guedes
  2020-12-22 12:33       ` Maciej Fijalkowski
  0 siblings, 1 reply; 30+ messages in thread
From: Andre Guedes @ 2020-12-22  1:13 UTC (permalink / raw)
  To: intel-wired-lan

Quoting Maciej Fijalkowski (2020-12-21 15:09:05)
> > @@ -191,10 +191,18 @@ static void igc_clean_tx_ring(struct igc_ring *tx_ring)
> >       while (i != tx_ring->next_to_use) {
> >               union igc_adv_tx_desc *eop_desc, *tx_desc;
> >  
> > -             if (tx_buffer->tx_flags & IGC_TX_FLAGS_XDP)
> > +             switch (tx_buffer->type) {
> > +             case IGC_TX_BUFFER_TYPE_XDP:
> >                       xdp_return_frame(tx_buffer->xdpf);
> > -             else
> > +                     break;
> > +             case IGC_TX_BUFFER_TYPE_SKB:
> >                       dev_kfree_skb_any(tx_buffer->skb);
> > +                     break;
> > +             default:
> > +                     netdev_warn_once(tx_ring->netdev,
> > +                                      "Unknown tx buffer type\n");
> > +                     break;
> > +             }
> 
> nit: you've been doing some effort in order to reduce the code duplication
> as much as it's possible, yet here you introduce duplicated code for that
> desc cleanup :p maybe add a little helper for that as well?

Note that the handling in the IGC_TX_BUFFER_TYPE_SKB case is different in
igc_clean_tx_irq(): dev_kfree_skb_any() vs napi_consume_skb().


* [Intel-wired-lan] [PATCH 04/10] igc: Refactor XDP rxq info registration
  2020-12-22  1:13     ` Andre Guedes
@ 2020-12-22 12:32       ` Maciej Fijalkowski
  2020-12-22 17:43         ` Andre Guedes
  0 siblings, 1 reply; 30+ messages in thread
From: Maciej Fijalkowski @ 2020-12-22 12:32 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Dec 21, 2020 at 05:13:30PM -0800, Andre Guedes wrote:
> Quoting Maciej Fijalkowski (2020-12-21 14:53:45)
> > > @@ -460,9 +460,12 @@ int igc_setup_rx_resources(struct igc_ring *rx_ring)
> > >       struct device *dev = rx_ring->dev;
> > >       int size, desc_len, res;
> > >  
> > > -     res = igc_xdp_register_rxq_info(rx_ring);
> > > -     if (res < 0)
> > > +     res = xdp_rxq_info_reg(&rx_ring->xdp_rxq, ndev, rx_ring->queue_index,
> > > +                            0);
> > > +     if (res < 0) {
> > > +             netdev_err(ndev, "Failed to register xdp rxq info\n");
> > 
> > Maybe print out q idx?
> 
> I'll add that.
> 
> > > @@ -536,6 +539,9 @@ static void igc_configure_rx_ring(struct igc_adapter *adapter,
> > >       u32 srrctl = 0, rxdctl = 0;
> > >       u64 rdba = ring->dma;
> > >  
> > > +     WARN_ON(xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
> > > +                                        MEM_TYPE_PAGE_SHARED, NULL));
> > 
> > You should do the unroll in case it fails just like it was done in
> > igc_xdp_register_rxq_info.
> 
> This was inspired by the ixgbe driver.
> 
> The only reason xdp_rxq_info_reg_mem_model() could fail here is if xdp_rxq
> wasn't registered. However, this is very unlikely to happen since it is
> registered in igc_setup_rx_resources() which is always called before
> igc_configure_rx_ring(). The WARN_ON() macro is used just in case.

Agreed on that but let's not disregard the other failure cases that can
happen by saying that it can only fail when xdp_rxq wasn't registered.

I believe that functions returning statuses have been written in such way
for some reason, so I feel that ignoring error statuses is a wrong
attitude.

For example, igc_setup_all_rx_resources return value is ignored in
igc_request_irq, but in __igc_open it is checked. Why?

One last thing is that all other drivers besides igb/ixgbe do the error
handling.

> 
> If we really want to unroll, we should propagate the error back in the call
> chain, changing the returning type of igc_configure_rx_ring() as well as the
> other functions in the call chain, so the unrolling is done in the proper
> place.
> 
> IMO, such change isn't worth it. It seems like a lot of change to cover a case
> that is never expected. WARN_ON() sounds more suitable in those cases. Also,
> ixgbe is around for quite some time and this doesn't seem to be an issue.

Well, although I don't like it, I agree :P

The uncomfortable question is what will happen if we do fail to
register that mem model, given that the driver is written in a way
that makes unrolling unprofitable?


* [Intel-wired-lan] [PATCH 08/10] igc: Replace IGC_TX_FLAGS_XDP flag by an enum
  2020-12-22  1:13     ` Andre Guedes
@ 2020-12-22 12:33       ` Maciej Fijalkowski
  0 siblings, 0 replies; 30+ messages in thread
From: Maciej Fijalkowski @ 2020-12-22 12:33 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Dec 21, 2020 at 05:13:48PM -0800, Andre Guedes wrote:
> Quoting Maciej Fijalkowski (2020-12-21 15:09:05)
> > > @@ -191,10 +191,18 @@ static void igc_clean_tx_ring(struct igc_ring *tx_ring)
> > >       while (i != tx_ring->next_to_use) {
> > >               union igc_adv_tx_desc *eop_desc, *tx_desc;
> > >  
> > > -             if (tx_buffer->tx_flags & IGC_TX_FLAGS_XDP)
> > > +             switch (tx_buffer->type) {
> > > +             case IGC_TX_BUFFER_TYPE_XDP:
> > >                       xdp_return_frame(tx_buffer->xdpf);
> > > -             else
> > > +                     break;
> > > +             case IGC_TX_BUFFER_TYPE_SKB:
> > >                       dev_kfree_skb_any(tx_buffer->skb);
> > > +                     break;
> > > +             default:
> > > +                     netdev_warn_once(tx_ring->netdev,
> > > +                                      "Unknown tx buffer type\n");
> > > +                     break;
> > > +             }
> > 
> > nit: you've been doing some effort in order to reduce the code duplication
> > as much as it's possible, yet here you introduce duplicated code for that
> > desc cleanup :p maybe add a little helper for that as well?
> 
> Note that the handling in the IGC_TX_BUFFER_TYPE_SKB case is different in
> igc_clean_tx_irq(): dev_kfree_skb_any() vs napi_consume_skb().

Yikes! Sorry, it was late.


* [Intel-wired-lan] [PATCH 09/10] igc: Enable RX via AF_XDP zero-copy
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 09/10] igc: Enable RX via AF_XDP zero-copy Andre Guedes
@ 2020-12-22 13:14   ` Maciej Fijalkowski
  2020-12-22 17:43     ` Andre Guedes
  0 siblings, 1 reply; 30+ messages in thread
From: Maciej Fijalkowski @ 2020-12-22 13:14 UTC (permalink / raw)
  To: intel-wired-lan

On Thu, Dec 17, 2020 at 12:24:14PM -0800, Andre Guedes wrote:
> This patch adds support for receiving packets via AF_XDP zero-copy
> mechanism.
> 
> A new flag is added to 'enum igc_ring_flags_t' to indicate the ring has
> AF_XDP zero-copy enabled so proper ring setup is carried out during ring
> configuration in igc_configure_rx_ring().
> 
> RX buffers can now be allocated via the shared pages mechanism (default
> behavior of the driver) or via xsk pool (when AF_XDP zero-copy is
> enabled) so a union is added to the 'struct igc_rx_buffer' to cover both
> cases.
> 
> When AF_XDP zero-copy is enabled, rx buffers are allocated from the xsk
> pool using the new helper igc_alloc_rx_buffers_zc() which is the
> counterpart of igc_alloc_rx_buffers().
> 
> Like other Intel drivers that support AF_XDP zero-copy, igc has a
> dedicated path for cleaning up rx irqs when zero-copy is enabled.
> This avoids adding too many checks within igc_clean_rx_irq(), resulting
> in a more readable and efficient code since this function is called from
> the hot-path of the driver.
> 
> Signed-off-by: Andre Guedes <andre.guedes@intel.com>
> ---
>  drivers/net/ethernet/intel/igc/igc.h      |  22 +-
>  drivers/net/ethernet/intel/igc/igc_base.h |   1 +
>  drivers/net/ethernet/intel/igc/igc_main.c | 334 +++++++++++++++++++++-
>  drivers/net/ethernet/intel/igc/igc_xdp.c  |  98 +++++++
>  drivers/net/ethernet/intel/igc/igc_xdp.h  |   2 +
>  5 files changed, 438 insertions(+), 19 deletions(-)
> 

[...]

> +
> +static int igc_clean_rx_irq_zc(struct igc_q_vector *q_vector, const int budget)
> +{
> +	struct igc_adapter *adapter = q_vector->adapter;
> +	struct igc_ring *ring = q_vector->rx.ring;
> +	u16 cleaned_count = igc_desc_unused(ring);
> +	int total_bytes = 0, total_packets = 0;
> +	struct bpf_prog *prog;
> +	bool failure = false;
> +	int xdp_status = 0;
> +
> +	rcu_read_lock();
> +
> +	prog = READ_ONCE(adapter->xdp_prog);
> +
> +	while (likely(total_packets < budget)) {
> +		union igc_adv_rx_desc *desc;
> +		struct igc_rx_buffer *bi;
> +		unsigned int size;
> +		int res;
> +
> +		desc = IGC_RX_DESC(ring, ring->next_to_clean);
> +		size = le16_to_cpu(desc->wb.upper.length);
> +		if (!size)
> +			break;
> +
> +		/* This memory barrier is needed to keep us from reading
> +		 * any other fields out of the rx_desc until we know the
> +		 * descriptor has been written back
> +		 */
> +		dma_rmb();
> +
> +		bi = &ring->rx_buffer_info[ring->next_to_clean];
> +
> +		if (igc_test_staterr(desc, IGC_RXDADV_STAT_TSIP)) {
> +			/* FIXME: For now, if packet buffer contains timestamp
> +			 * information, we discard it. Once XDP infrastructe

nit: s/infrastructe/infrastructure

> +			 * provides a way to report hardware timestamps, the
> +			 * code below should be fixed.
> +			 */
> +			bi->xdp->data += IGC_TS_HDR_LEN;
> +			size -= IGC_TS_HDR_LEN;
> +		}
> +
> +		bi->xdp->data_end = bi->xdp->data + size;
> +		xsk_buff_dma_sync_for_cpu(bi->xdp, ring->xsk_pool);
> +
> +		res = igc_xdp_do_run_prog(adapter, prog, bi->xdp);
> +		switch (res) {
> +		case IGC_XDP_PASS:
> +			igc_dispatch_skb_zc(q_vector, desc, bi->xdp);
> +			fallthrough;
> +		case IGC_XDP_CONSUMED:
> +			xsk_buff_free(bi->xdp);
> +			break;
> +		case IGC_XDP_TX:
> +		case IGC_XDP_REDIRECT:
> +			xdp_status |= res;
> +			break;
> +		}
> +
> +		bi->xdp = NULL;
> +		total_bytes += size;
> +		total_packets++;
> +		cleaned_count++;
> +		ring->next_to_clean++;
> +		if (ring->next_to_clean == ring->count)
> +			ring->next_to_clean = 0;
> +	}
> +
> +	rcu_read_unlock();
> +
> +	if (cleaned_count >= IGC_RX_BUFFER_WRITE)
> +		failure = !igc_alloc_rx_buffers_zc(ring, cleaned_count);
> +
> +	if (xdp_status)
> +		igc_finalize_xdp(adapter, xdp_status);
> +
> +	igc_update_rx_stats(q_vector, total_packets, total_bytes);
> +
> +	if (xsk_uses_need_wakeup(ring->xsk_pool)) {
> +		if (failure || ring->next_to_clean == ring->next_to_use)
> +			xsk_set_rx_need_wakeup(ring->xsk_pool);
> +		else
> +			xsk_clear_rx_need_wakeup(ring->xsk_pool);
> +		return total_packets;
> +	}
> +
> +	return failure ? budget : total_packets;
> +}
> +
>  static void igc_update_tx_stats(struct igc_q_vector *q_vector,
>  				unsigned int packets, unsigned int bytes)
>  {
> @@ -2958,7 +3188,10 @@ static void igc_configure(struct igc_adapter *adapter)
>  	for (i = 0; i < adapter->num_rx_queues; i++) {
>  		struct igc_ring *ring = adapter->rx_ring[i];
>  
> -		igc_alloc_rx_buffers(ring, igc_desc_unused(ring));
> +		if (ring->xsk_pool)
> +			igc_alloc_rx_buffers_zc(ring, igc_desc_unused(ring));
> +		else
> +			igc_alloc_rx_buffers(ring, igc_desc_unused(ring));
>  	}
>  }
>  
> @@ -3573,14 +3806,18 @@ static int igc_poll(struct napi_struct *napi, int budget)
>  	struct igc_q_vector *q_vector = container_of(napi,
>  						     struct igc_q_vector,
>  						     napi);
> +	struct igc_ring *rx_ring = q_vector->rx.ring;
> +
>  	bool clean_complete = true;
>  	int work_done = 0;
>  
>  	if (q_vector->tx.ring)
>  		clean_complete = igc_clean_tx_irq(q_vector, budget);
>  
> -	if (q_vector->rx.ring) {
> -		int cleaned = igc_clean_rx_irq(q_vector, budget);
> +	if (rx_ring) {
> +		int cleaned = rx_ring->xsk_pool ?
> +			      igc_clean_rx_irq_zc(q_vector, budget) :
> +			      igc_clean_rx_irq(q_vector, budget);
>  
>  		work_done += cleaned;
>  		if (cleaned >= budget)
> @@ -5161,6 +5398,9 @@ static int igc_bpf(struct net_device *dev, struct netdev_bpf *bpf)
>  	switch (bpf->command) {
>  	case XDP_SETUP_PROG:
>  		return igc_xdp_set_prog(adapter, bpf->prog, bpf->extack);
> +	case XDP_SETUP_XSK_POOL:
> +		return igc_xdp_setup_pool(adapter, bpf->xsk.pool,
> +					  bpf->xsk.queue_id);
>  	default:
>  		return -EOPNOTSUPP;
>  	}
> @@ -5206,6 +5446,43 @@ static int igc_xdp_xmit(struct net_device *dev, int num_frames,
>  	return num_frames - drops;
>  }
>  
> +static void igc_trigger_rxtxq_interrupt(struct igc_adapter *adapter,
> +					struct igc_q_vector *q_vector)
> +{
> +	struct igc_hw *hw = &adapter->hw;
> +	u32 eics = 0;
> +
> +	eics |= q_vector->eims_value;
> +	wr32(IGC_EICS, eics);
> +}
> +
> +int igc_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags)
> +{
> +	struct igc_adapter *adapter = netdev_priv(dev);
> +	struct igc_q_vector *q_vector;
> +	struct igc_ring *ring;
> +
> +	if (test_bit(__IGC_DOWN, &adapter->state))
> +		return -ENETDOWN;
> +
> +	if (!igc_xdp_is_enabled(adapter))
> +		return -ENXIO;
> +
> +	if (queue_id >= adapter->num_rx_queues)
> +		return -EINVAL;
> +
> +	ring = adapter->rx_ring[queue_id];
> +
> +	if (!ring->xsk_pool)
> +		return -ENXIO;

ring local variable feels a bit unnecessary.

> +
> +	q_vector = adapter->q_vector[queue_id];
> +	if (!napi_if_scheduled_mark_missed(&q_vector->napi))
> +		igc_trigger_rxtxq_interrupt(adapter, q_vector);
> +
> +	return 0;
> +}
> +
>  static const struct net_device_ops igc_netdev_ops = {
>  	.ndo_open		= igc_open,
>  	.ndo_stop		= igc_close,
> @@ -5221,6 +5498,7 @@ static const struct net_device_ops igc_netdev_ops = {
>  	.ndo_setup_tc		= igc_setup_tc,
>  	.ndo_bpf		= igc_bpf,
>  	.ndo_xdp_xmit		= igc_xdp_xmit,
> +	.ndo_xsk_wakeup		= igc_xsk_wakeup,
>  };
>  
>  /* PCIe configuration access */
> @@ -5974,6 +6252,36 @@ struct net_device *igc_get_hw_dev(struct igc_hw *hw)
>  	return adapter->netdev;
>  }
>  
> +static void igc_disable_rx_ring_hw(struct igc_ring *ring)
> +{
> +	struct igc_hw *hw = &ring->q_vector->adapter->hw;
> +	u8 idx = ring->reg_idx;
> +	u32 rxdctl;
> +
> +	rxdctl = rd32(IGC_RXDCTL(idx));
> +	rxdctl &= ~IGC_RXDCTL_QUEUE_ENABLE;
> +	rxdctl |= IGC_RXDCTL_SWFLUSH;
> +	wr32(IGC_RXDCTL(idx), rxdctl);
> +}
> +
> +void igc_disable_rx_ring(struct igc_ring *ring)
> +{
> +	igc_disable_rx_ring_hw(ring);
> +	igc_clean_rx_ring(ring);
> +}
> +
> +void igc_enable_rx_ring(struct igc_ring *ring)
> +{
> +	struct igc_adapter *adapter = ring->q_vector->adapter;
> +
> +	igc_configure_rx_ring(adapter, ring);
> +
> +	if (ring->xsk_pool)
> +		igc_alloc_rx_buffers_zc(ring, igc_desc_unused(ring));
> +	else
> +		igc_alloc_rx_buffers(ring, igc_desc_unused(ring));
> +}
> +
>  /**
>   * igc_init_module - Driver Registration Routine
>   *
> diff --git a/drivers/net/ethernet/intel/igc/igc_xdp.c b/drivers/net/ethernet/intel/igc/igc_xdp.c
> index 27c886a254f1..9515906d54e0 100644
> --- a/drivers/net/ethernet/intel/igc/igc_xdp.c
> +++ b/drivers/net/ethernet/intel/igc/igc_xdp.c
> @@ -1,6 +1,8 @@
>  // SPDX-License-Identifier: GPL-2.0
>  /* Copyright (c) 2020, Intel Corporation. */
>  
> +#include <net/xdp_sock_drv.h>
> +
>  #include "igc.h"
>  #include "igc_xdp.h"
>  
> @@ -31,3 +33,99 @@ int igc_xdp_set_prog(struct igc_adapter *adapter, struct bpf_prog *prog,
>  
>  	return 0;
>  }
> +
> +static int igc_xdp_enable_pool(struct igc_adapter *adapter,
> +			       struct xsk_buff_pool *pool, u16 queue_id)
> +{
> +	struct net_device *ndev = adapter->netdev;
> +	struct device *dev = &adapter->pdev->dev;
> +	struct igc_ring *rx_ring;
> +	struct napi_struct *napi;
> +	bool needs_reset;
> +	u32 frame_size;
> +	int err;
> +
> +	if (queue_id >= adapter->num_rx_queues)
> +		return -EINVAL;
> +
> +	frame_size = xsk_pool_get_rx_frame_size(pool);
> +	if (frame_size < ETH_FRAME_LEN + VLAN_HLEN * 2) {
> +		/* For now, when XDP is enabled, the driver doesn't support
> +		 * frames that span over multiple buffers. The max frame size
> +		 * considered here is the ethernet frame size + vlan double
> +		 * tagging.
> +		 */

I don't really follow that. You check if chunk size is less than
ETH_FRAME_LEN + VLAN_HLEN * 2. chunk size will be at least 2k if I recall
correctly. How is that related to fragmented buffers?

> +		return -EOPNOTSUPP;
> +	}
> +
> +	err = xsk_pool_dma_map(pool, dev, IGC_RX_DMA_ATTR);
> +	if (err) {
> +		netdev_err(ndev, "Failed to map xsk pool\n");
> +		return err;
> +	}
> +
> +	needs_reset = netif_running(adapter->netdev) && igc_xdp_is_enabled(adapter);
> +
> +	rx_ring = adapter->rx_ring[queue_id];
> +	napi = &rx_ring->q_vector->napi;
> +
> +	if (needs_reset) {
> +		igc_disable_rx_ring(rx_ring);
> +		napi_disable(napi);
> +	}
> +
> +	set_bit(IGC_RING_FLAG_AF_XDP_ZC, &rx_ring->flags);
> +
> +	if (needs_reset) {
> +		napi_enable(napi);
> +		igc_enable_rx_ring(rx_ring);
> +
> +		err = igc_xsk_wakeup(ndev, queue_id, XDP_WAKEUP_RX);
> +		if (err)
> +			return err;
> +	}
> +
> +	return 0;
> +}

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Intel-wired-lan] [PATCH 10/10] igc: Enable TX via AF_XDP zero-copy
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 10/10] igc: Enable TX " Andre Guedes
@ 2020-12-22 14:16   ` Maciej Fijalkowski
  0 siblings, 0 replies; 30+ messages in thread
From: Maciej Fijalkowski @ 2020-12-22 14:16 UTC (permalink / raw)
  To: intel-wired-lan

On Thu, Dec 17, 2020 at 12:24:15PM -0800, Andre Guedes wrote:
> This patch adds support for transmitting packets via AF_XDP zero-copy
> mechanism.
> 
> The packet transmission itself is implemented by igc_xdp_xmit_zc() which
> is called from igc_clean_tx_irq() when the ring has AF_XDP zero-copy
> enabled. As in the i40e and ice drivers, the transmission budget used is
> the number of descriptors available on the ring.
> 
> A new tx buffer type is introduced to 'enum igc_tx_buffer_type' to
> indicate the tx buffer uses memory from xsk pool so it can be properly
> cleaned after transmission or when the ring is cleaned.
> 
> The I225 controller has only 4 Tx hardware queues so the main difference
> between igc and other Intel drivers that support AF_XDP zero-copy is
> that there is no tx ring dedicated exclusively to XDP. Instead, tx
> rings are shared between the network stack and XDP, and netdev queue
> lock is used to ensure mutual exclusion. This is the same approach
> implemented to support XDP_TX and XDP_REDIRECT actions.
> 
> Signed-off-by: Andre Guedes <andre.guedes@intel.com>
> ---
>  drivers/net/ethernet/intel/igc/igc.h      |   3 +
>  drivers/net/ethernet/intel/igc/igc_base.h |   1 +
>  drivers/net/ethernet/intel/igc/igc_main.c | 115 +++++++++++++++++++++-
>  drivers/net/ethernet/intel/igc/igc_xdp.c  |  20 +++-
>  4 files changed, 131 insertions(+), 8 deletions(-)
> 

[...]

>  
> +static void igc_xdp_xmit_zc(struct igc_ring *ring)
> +{
> +	struct xsk_buff_pool *pool = ring->xsk_pool;
> +	struct netdev_queue *nq = txring_txq(ring);
> +	int cpu = smp_processor_id();
> +	struct xdp_desc xdp_desc;
> +	bool work_done;
> +	u16 budget;
> +
> +	if (!netif_carrier_ok(ring->netdev))
> +		return;
> +
> +	__netif_tx_lock(nq, cpu);
> +
> +	budget = igc_desc_unused(ring);
> +	work_done = false;
> +
> +	while (xsk_tx_peek_desc(pool, &xdp_desc) && budget--) {
> +		u32 cmd_type, olinfo_status;
> +		union igc_adv_tx_desc *desc;
> +		struct igc_tx_buffer *bi;
> +		dma_addr_t dma;
> +
> +		cmd_type = IGC_ADVTXD_DTYP_DATA | IGC_ADVTXD_DCMD_DEXT |
> +			   IGC_ADVTXD_DCMD_IFCS | IGC_TXD_DCMD |
> +			   xdp_desc.len;
> +		olinfo_status = xdp_desc.len << IGC_ADVTXD_PAYLEN_SHIFT;
> +
> +		dma = xsk_buff_raw_get_dma(pool, xdp_desc.addr);
> +		xsk_buff_raw_dma_sync_for_device(pool, dma, xdp_desc.len);
> +
> +		desc = IGC_TX_DESC(ring, ring->next_to_use);
> +		desc->read.cmd_type_len = cpu_to_le32(cmd_type);
> +		desc->read.olinfo_status = cpu_to_le32(olinfo_status);
> +		desc->read.buffer_addr = cpu_to_le64(dma);
> +
> +		bi = &ring->tx_buffer_info[ring->next_to_use];
> +		bi->type = IGC_TX_BUFFER_TYPE_XSK;
> +		bi->protocol = 0;
> +		bi->bytecount = xdp_desc.len;
> +		bi->gso_segs = 1;
> +		bi->time_stamp = jiffies;
> +		bi->next_to_watch = desc;
> +
> +		netdev_tx_sent_queue(txring_txq(ring), xdp_desc.len);
> +
> +		ring->next_to_use++;
> +		if (ring->next_to_use == ring->count)
> +			ring->next_to_use = 0;
> +
> +		work_done = true;

nit: setting it on each iteration feels semi-optimal.

> +	}
> +
> +	if (work_done) {
> +		igc_flush_tx_descriptors(ring);
> +		xsk_tx_release(pool);
> +	}
> +
> +	__netif_tx_unlock(nq);
> +}

* [Intel-wired-lan] [PATCH 00/10] igc: Add support for AF_XDP zero-copy
  2020-12-17 20:24 [Intel-wired-lan] [PATCH 00/10] igc: Add support for AF_XDP zero-copy Andre Guedes
                   ` (9 preceding siblings ...)
  2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 10/10] igc: Enable TX " Andre Guedes
@ 2020-12-22 14:20 ` Maciej Fijalkowski
  2020-12-22 17:43   ` Andre Guedes
  10 siblings, 1 reply; 30+ messages in thread
From: Maciej Fijalkowski @ 2020-12-22 14:20 UTC (permalink / raw)
  To: intel-wired-lan

On Thu, Dec 17, 2020 at 12:24:05PM -0800, Andre Guedes wrote:
> Hi,
> 
> This series adds AF_XDP zero-copy feature to igc driver.
> 
> The initial patches do some code refactoring, preparing the code base to
> land the AF_XDP zero-copy feature, avoiding code duplications. The last
> patches of the series are the ones implementing the feature.
> 
> The last patch which indeed implements AF_XDP zero-copy support was
> originally way too lengthy so, for the sake of code review, I broke it
> up into two patches: one adding support for the RX functionality and the
> other one adding TX support. Feel free to squash those two patches when
> pushing the series to net-next if you find appropriate.

Generally this looks good! I had only minor comments.

I'm interested in how you are going to utilize the AF_XDP ZC, probably
something related to TSN ;)

> 
> Best,
> Andre
> 
> Andre Guedes (10):
>   igc: Move igc_xdp_is_enabled()
>   igc: Refactor igc_xdp_run_prog()
>   igc: Refactor igc_clean_rx_ring()
>   igc: Refactor XDP rxq info registration
>   igc: Introduce igc_update_rx_stats()
>   igc: Introduce igc_update_tx_stats()
>   igc: Introduce igc_unmap_tx_buffer() helper
>   igc: Replace IGC_TX_FLAGS_XDP flag by an enum
>   igc: Enable RX via AF_XDP zero-copy
>   igc: Enable TX via AF_XDP zero-copy
> 
>  drivers/net/ethernet/intel/igc/igc.h      |  33 +-
>  drivers/net/ethernet/intel/igc/igc_base.h |   2 +
>  drivers/net/ethernet/intel/igc/igc_main.c | 650 ++++++++++++++++++----
>  drivers/net/ethernet/intel/igc/igc_xdp.c  | 107 +++-
>  drivers/net/ethernet/intel/igc/igc_xdp.h  |   8 +-
>  5 files changed, 672 insertions(+), 128 deletions(-)
> 
> -- 
> 2.29.2
> 

* [Intel-wired-lan] [PATCH 04/10] igc: Refactor XDP rxq info registration
  2020-12-22 12:32       ` Maciej Fijalkowski
@ 2020-12-22 17:43         ` Andre Guedes
  2020-12-22 20:59           ` Maciej Fijalkowski
  0 siblings, 1 reply; 30+ messages in thread
From: Andre Guedes @ 2020-12-22 17:43 UTC (permalink / raw)
  To: intel-wired-lan

Quoting Maciej Fijalkowski (2020-12-22 04:32:05)
> > > > @@ -536,6 +539,9 @@ static void igc_configure_rx_ring(struct igc_adapter *adapter,
> > > >       u32 srrctl = 0, rxdctl = 0;
> > > >       u64 rdba = ring->dma;
> > > >  
> > > > +     WARN_ON(xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
> > > > +                                        MEM_TYPE_PAGE_SHARED, NULL));
> > > 
> > > You should do the unroll in case it fails just like it was done in
> > > igc_xdp_register_rxq_info.
> > 
> > This was inspired by the ixgbe driver.
> > 
> > The only reason xdp_rxq_info_reg_mem_model() could fail here is if xdp_rxq
> > wasn't registered. However, this is very unlikely to happen since it is
> > registered in igc_setup_rx_resources() which is always called before
> > igc_configure_rx_ring(). The WARN_ON() macro is used just in case.
> 
> Agreed on that but let's not disregard the other failure cases that can
> happen by saying that it can only fail when xdp_rxq wasn't registered.
> 
> I believe that functions returning statuses have been written in such a way
> for some reason, so I feel that ignoring error statuses is a wrong
> attitude.
> 
> For example, igc_setup_all_rx_resources return value is ignored in
> igc_request_irq, but in __igc_open it is checked. Why?
> 
> One last thing is that all other drivers besides igb/ixgbe do the error
> handling.

Agreed. I have another series with some fixes to the driver that I'm working
on. I can fix the issue with igc_request_irq() on that series.

> > If we really want to unroll, we should propagate the error back in the call
> > chain, changing the returning type of igc_configure_rx_ring() as well as the
> > other functions in the call chain, so the unrolling is done in the proper
> > place.
> > 
> > IMO, such change isn't worth it. It seems like a lot of change to cover a case
> > that is never expected. WARN_ON() sounds more suitable in those cases. Also,
> > ixgbe has been around for quite some time and this doesn't seem to be an issue.
> 
> Well, although I don't like it, I agree :P
> 
> The uncomfortable question would be what will happen if we fail to
> register that mem model but due to the fact that the driver is written in a
> way that makes unrolling impractical?

I see your point. In that case, I think we can tackle that in a separate series
changing all three drivers (igc, ixgbe, and igb). Sounds good?

* [Intel-wired-lan] [PATCH 09/10] igc: Enable RX via AF_XDP zero-copy
  2020-12-22 13:14   ` Maciej Fijalkowski
@ 2020-12-22 17:43     ` Andre Guedes
  2020-12-22 20:59       ` Maciej Fijalkowski
  0 siblings, 1 reply; 30+ messages in thread
From: Andre Guedes @ 2020-12-22 17:43 UTC (permalink / raw)
  To: intel-wired-lan

Quoting Maciej Fijalkowski (2020-12-22 05:14:28)
> > +static int igc_clean_rx_irq_zc(struct igc_q_vector *q_vector, const int budget)
> > +{
> > +     struct igc_adapter *adapter = q_vector->adapter;
> > +     struct igc_ring *ring = q_vector->rx.ring;
> > +     u16 cleaned_count = igc_desc_unused(ring);
> > +     int total_bytes = 0, total_packets = 0;
> > +     struct bpf_prog *prog;
> > +     bool failure = false;
> > +     int xdp_status = 0;
> > +
> > +     rcu_read_lock();
> > +
> > +     prog = READ_ONCE(adapter->xdp_prog);
> > +
> > +     while (likely(total_packets < budget)) {
> > +             union igc_adv_rx_desc *desc;
> > +             struct igc_rx_buffer *bi;
> > +             unsigned int size;
> > +             int res;
> > +
> > +             desc = IGC_RX_DESC(ring, ring->next_to_clean);
> > +             size = le16_to_cpu(desc->wb.upper.length);
> > +             if (!size)
> > +                     break;
> > +
> > +             /* This memory barrier is needed to keep us from reading
> > +              * any other fields out of the rx_desc until we know the
> > +              * descriptor has been written back
> > +              */
> > +             dma_rmb();
> > +
> > +             bi = &ring->rx_buffer_info[ring->next_to_clean];
> > +
> > +             if (igc_test_staterr(desc, IGC_RXDADV_STAT_TSIP)) {
> > +                     /* FIXME: For now, if packet buffer contains timestamp
> > +                      * information, we discard it. Once XDP infrastructe
> 
> nit: s/infrastructe/infrastructure

Thanks, I'll fix it in the v2.

> > +int igc_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags)
> > +{
> > +     struct igc_adapter *adapter = netdev_priv(dev);
> > +     struct igc_q_vector *q_vector;
> > +     struct igc_ring *ring;
> > +
> > +     if (test_bit(__IGC_DOWN, &adapter->state))
> > +             return -ENETDOWN;
> > +
> > +     if (!igc_xdp_is_enabled(adapter))
> > +             return -ENXIO;
> > +
> > +     if (queue_id >= adapter->num_rx_queues)
> > +             return -EINVAL;
> > +
> > +     ring = adapter->rx_ring[queue_id];
> > +
> > +     if (!ring->xsk_pool)
> > +             return -ENXIO;
> 
> ring local variable feels a bit unnecessary.

It's just for the sake of code readability. I thought it reads better than:

	if (!adapter->rx_ring[queue_id]->xsk_pool)
		return -ENXIO;

> > +static int igc_xdp_enable_pool(struct igc_adapter *adapter,
> > +                            struct xsk_buff_pool *pool, u16 queue_id)
> > +{
> > +     struct net_device *ndev = adapter->netdev;
> > +     struct device *dev = &adapter->pdev->dev;
> > +     struct igc_ring *rx_ring;
> > +     struct napi_struct *napi;
> > +     bool needs_reset;
> > +     u32 frame_size;
> > +     int err;
> > +
> > +     if (queue_id >= adapter->num_rx_queues)
> > +             return -EINVAL;
> > +
> > +     frame_size = xsk_pool_get_rx_frame_size(pool);
> > +     if (frame_size < ETH_FRAME_LEN + VLAN_HLEN * 2) {
> > +             /* For now, when XDP is enabled, the driver doesn't support
> > +              * frames that span over multiple buffers. The max frame size
> > +              * considered here is the ethernet frame size + vlan double
> > +              * tagging.
> > +              */
> 
> I don't really follow that. You check if chunk size is less than
> ETH_FRAME_LEN + VLAN_HLEN * 2. chunk size will be at least 2k if I recall
> correctly. How is that related to fragmented buffers?

Note that the code above checks the xsk frame size, not chunk size.

Yes, the chunk min size is 2k indeed. However, if I'm reading the code
correctly, the umem headroom can have an arbitrary value defined by the user
(see xdp_umem_reg()). In this case, the xsk frame size could potentially be
less than the ethernet frame size.

* [Intel-wired-lan] [PATCH 00/10] igc: Add support for AF_XDP zero-copy
  2020-12-22 14:20 ` [Intel-wired-lan] [PATCH 00/10] igc: Add support for " Maciej Fijalkowski
@ 2020-12-22 17:43   ` Andre Guedes
  0 siblings, 0 replies; 30+ messages in thread
From: Andre Guedes @ 2020-12-22 17:43 UTC (permalink / raw)
  To: intel-wired-lan

Quoting Maciej Fijalkowski (2020-12-22 06:20:45)
> On Thu, Dec 17, 2020 at 12:24:05PM -0800, Andre Guedes wrote:
> > Hi,
> > 
> > This series adds AF_XDP zero-copy feature to igc driver.
> > 
> > The initial patches do some code refactoring, preparing the code base to
> > land the AF_XDP zero-copy feature, avoiding code duplications. The last
> > patches of the series are the ones implementing the feature.
> > 
> > The last patch which indeed implements AF_XDP zero-copy support was
> > originally way too lengthy so, for the sake of code review, I broke it
> > up into two patches: one adding support for the RX functionality and the
> > other one adding TX support. Feel free to squash those two patches when
> > pushing the series to net-next if you find appropriate.
> 
> Generally this looks good! I had only minor comments.

Thanks for your review, Maciej. I'm working on the v2 and should submit it
soon.

> I'm interested in how you are going to utilize the AF_XDP ZC, probably
> something related to TSN ;)

Yes, that's right. In TSN use cases, tx and rx latency within the system
matters. The AF_XDP ZC infra provides us with a lower-latency alternative to
AF_PACKET.

* [Intel-wired-lan] [PATCH 09/10] igc: Enable RX via AF_XDP zero-copy
  2020-12-22 17:43     ` Andre Guedes
@ 2020-12-22 20:59       ` Maciej Fijalkowski
  2020-12-23 19:27         ` Andre Guedes
  0 siblings, 1 reply; 30+ messages in thread
From: Maciej Fijalkowski @ 2020-12-22 20:59 UTC (permalink / raw)
  To: intel-wired-lan

On Tue, Dec 22, 2020 at 09:43:39AM -0800, Andre Guedes wrote:
> Quoting Maciej Fijalkowski (2020-12-22 05:14:28)
> > > +static int igc_clean_rx_irq_zc(struct igc_q_vector *q_vector, const int budget)
> > > +{
> > > +     struct igc_adapter *adapter = q_vector->adapter;
> > > +     struct igc_ring *ring = q_vector->rx.ring;
> > > +     u16 cleaned_count = igc_desc_unused(ring);
> > > +     int total_bytes = 0, total_packets = 0;
> > > +     struct bpf_prog *prog;
> > > +     bool failure = false;
> > > +     int xdp_status = 0;
> > > +
> > > +     rcu_read_lock();
> > > +
> > > +     prog = READ_ONCE(adapter->xdp_prog);
> > > +
> > > +     while (likely(total_packets < budget)) {
> > > +             union igc_adv_rx_desc *desc;
> > > +             struct igc_rx_buffer *bi;
> > > +             unsigned int size;
> > > +             int res;
> > > +
> > > +             desc = IGC_RX_DESC(ring, ring->next_to_clean);
> > > +             size = le16_to_cpu(desc->wb.upper.length);
> > > +             if (!size)
> > > +                     break;
> > > +
> > > +             /* This memory barrier is needed to keep us from reading
> > > +              * any other fields out of the rx_desc until we know the
> > > +              * descriptor has been written back
> > > +              */
> > > +             dma_rmb();
> > > +
> > > +             bi = &ring->rx_buffer_info[ring->next_to_clean];
> > > +
> > > +             if (igc_test_staterr(desc, IGC_RXDADV_STAT_TSIP)) {
> > > +                     /* FIXME: For now, if packet buffer contains timestamp
> > > +                      * information, we discard it. Once XDP infrastructe
> > 
> > nit: s/infrastructe/infrastructure
> 
> Thanks, I'll fix it in the v2.
> 
> > > +int igc_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags)
> > > +{
> > > +     struct igc_adapter *adapter = netdev_priv(dev);
> > > +     struct igc_q_vector *q_vector;
> > > +     struct igc_ring *ring;
> > > +
> > > +     if (test_bit(__IGC_DOWN, &adapter->state))
> > > +             return -ENETDOWN;
> > > +
> > > +     if (!igc_xdp_is_enabled(adapter))
> > > +             return -ENXIO;
> > > +
> > > +     if (queue_id >= adapter->num_rx_queues)
> > > +             return -EINVAL;
> > > +
> > > +     ring = adapter->rx_ring[queue_id];
> > > +
> > > +     if (!ring->xsk_pool)
> > > +             return -ENXIO;
> > 
> > ring local variable feels a bit unnecessary.
> 
> It's just for the sake of code readability. I thought it reads better than:
> 
> 	if (!adapter->rx_ring[queue_id]->xsk_pool)
> 		return -ENXIO;

Yeah, okay then.

> 
> > > +static int igc_xdp_enable_pool(struct igc_adapter *adapter,
> > > +                            struct xsk_buff_pool *pool, u16 queue_id)
> > > +{
> > > +     struct net_device *ndev = adapter->netdev;
> > > +     struct device *dev = &adapter->pdev->dev;
> > > +     struct igc_ring *rx_ring;
> > > +     struct napi_struct *napi;
> > > +     bool needs_reset;
> > > +     u32 frame_size;
> > > +     int err;
> > > +
> > > +     if (queue_id >= adapter->num_rx_queues)
> > > +             return -EINVAL;
> > > +
> > > +     frame_size = xsk_pool_get_rx_frame_size(pool);
> > > +     if (frame_size < ETH_FRAME_LEN + VLAN_HLEN * 2) {
> > > +             /* For now, when XDP is enabled, the driver doesn't support
> > > +              * frames that span over multiple buffers. The max frame size
> > > +              * considered here is the ethernet frame size + vlan double
> > > +              * tagging.
> > > +              */
> > 
> > I don't really follow that. You check if chunk size is less than
> > ETH_FRAME_LEN + VLAN_HLEN * 2. chunk size will be at least 2k if I recall
> > correctly. How is that related to fragmented buffers?
> 
> Note that the code above checks the xsk frame size, not chunk size.
> 
> Yes, the chunk min size is 2k indeed. However, if I'm reading the code
> correctly, the umem headroom can have an arbitrary value defined by the user
> (see xdp_umem_reg()). In this case, the xsk frame size could potentially be
> less than the ethernet frame size.

Ok, I see. So you actually meant that for a lower frame size given to HW we
would start to fragment the frames?

Comment was a bit unclear, hence my confusion.

Anyway, I feel like this is something generic that should be checked
earlier at xsk code? Meaning, when we actually create the umem with user
provided config at setsockopt?

* [Intel-wired-lan] [PATCH 04/10] igc: Refactor XDP rxq info registration
  2020-12-22 17:43         ` Andre Guedes
@ 2020-12-22 20:59           ` Maciej Fijalkowski
  0 siblings, 0 replies; 30+ messages in thread
From: Maciej Fijalkowski @ 2020-12-22 20:59 UTC (permalink / raw)
  To: intel-wired-lan

On Tue, Dec 22, 2020 at 09:43:35AM -0800, Andre Guedes wrote:
> Quoting Maciej Fijalkowski (2020-12-22 04:32:05)
> > > > > @@ -536,6 +539,9 @@ static void igc_configure_rx_ring(struct igc_adapter *adapter,
> > > > >       u32 srrctl = 0, rxdctl = 0;
> > > > >       u64 rdba = ring->dma;
> > > > >  
> > > > > +     WARN_ON(xdp_rxq_info_reg_mem_model(&ring->xdp_rxq,
> > > > > +                                        MEM_TYPE_PAGE_SHARED, NULL));
> > > > 
> > > > You should do the unroll in case it fails just like it was done in
> > > > igc_xdp_register_rxq_info.
> > > 
> > > This was inspired by the ixgbe driver.
> > > 
> > > The only reason xdp_rxq_info_reg_mem_model() could fail here is if xdp_rxq
> > > wasn't registered. However, this is very unlikely to happen since it is
> > > registered in igc_setup_rx_resources() which is always called before
> > > igc_configure_rx_ring(). The WARN_ON() macro is used just in case.
> > 
> > Agreed on that but let's not disregard the other failure cases that can
> > happen by saying that it can only fail when xdp_rxq wasn't registered.
> > 
> > I believe that functions returning statuses have been written in such a way
> > for some reason, so I feel that ignoring error statuses is a wrong
> > attitude.
> > 
> > For example, igc_setup_all_rx_resources return value is ignored in
> > igc_request_irq, but in __igc_open it is checked. Why?
> > 
> > One last thing is that all other drivers besides igb/ixgbe do the error
> > handling.
> 
> Agreed. I have another series with some fixes to the driver that I'm working
> on. I can fix the issue with igc_request_irq() on that series.
> 
> > > If we really want to unroll, we should propagate the error back in the call
> > > chain, changing the returning type of igc_configure_rx_ring() as well as the
> > > other functions in the call chain, so the unrolling is done in the proper
> > > place.
> > > 
> > > IMO, such change isn't worth it. It seems like a lot of change to cover a case
> > > that is never expected. WARN_ON() sounds more suitable in those cases. Also,
> > > ixgbe has been around for quite some time and this doesn't seem to be an issue.
> > 
> > Well, although I don't like it, I agree :P
> > 
> > The uncomfortable question would be what will happen if we fail to
> > register that mem model but due to the fact that the driver is written in a
> > way that makes unrolling impractical?
> 
> I see your point. In that case, I think we can tackle that in a separate series
> changing all three drivers (igc, ixgbe, and igb). Sounds good?

Sure :)

* [Intel-wired-lan] [PATCH 09/10] igc: Enable RX via AF_XDP zero-copy
  2020-12-22 20:59       ` Maciej Fijalkowski
@ 2020-12-23 19:27         ` Andre Guedes
  0 siblings, 0 replies; 30+ messages in thread
From: Andre Guedes @ 2020-12-23 19:27 UTC (permalink / raw)
  To: intel-wired-lan

Quoting Maciej Fijalkowski (2020-12-22 12:59:25)
> > > > +static int igc_xdp_enable_pool(struct igc_adapter *adapter,
> > > > +                            struct xsk_buff_pool *pool, u16 queue_id)
> > > > +{
> > > > +     struct net_device *ndev = adapter->netdev;
> > > > +     struct device *dev = &adapter->pdev->dev;
> > > > +     struct igc_ring *rx_ring;
> > > > +     struct napi_struct *napi;
> > > > +     bool needs_reset;
> > > > +     u32 frame_size;
> > > > +     int err;
> > > > +
> > > > +     if (queue_id >= adapter->num_rx_queues)
> > > > +             return -EINVAL;
> > > > +
> > > > +     frame_size = xsk_pool_get_rx_frame_size(pool);
> > > > +     if (frame_size < ETH_FRAME_LEN + VLAN_HLEN * 2) {
> > > > +             /* For now, when XDP is enabled, the driver doesn't support
> > > > +              * frames that span over multiple buffers. The max frame size
> > > > +              * considered here is the ethernet frame size + vlan double
> > > > +              * tagging.
> > > > +              */
> > > 
> > > I don't really follow that. You check if chunk size is less than
> > > ETH_FRAME_LEN + VLAN_HLEN * 2. chunk size will be at least 2k if I recall
> > > correctly. How is that related to fragmented buffers?
> > 
> > Note that the code above checks the xsk frame size, not chunk size.
> > 
> > Yes, the chunk min size is 2k indeed. However, if I'm reading the code
> > correctly, the umem headroom can have an arbitrary value defined by the user
> > (see xdp_umem_reg()). In this case, the xsk frame size could potentially be
> > less than the ethernet frame size.
> 
> Ok I see. So you actually meant that for lower frame size given to HW we
> would start to fragment the frames?

Yes.

> Comment was a bit unclear, hence my confusion.

I'll try to make that comment more clear in the v2.

> Anyway, I feel like this is something generic that should be checked
> earlier at xsk code? Meaning, when we actually create the umem with user
> provided config at setsockopt?

Yeah, that might be a good idea. For now, I'll keep the check in the driver
since it doesn't work properly with smaller sizes. In the future, if we have
that check in xsk layer, we can remove it from the driver.

Thread overview: 30+ messages
2020-12-17 20:24 [Intel-wired-lan] [PATCH 00/10] igc: Add support for AF_XDP zero-copy Andre Guedes
2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 01/10] igc: Move igc_xdp_is_enabled() Andre Guedes
2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 02/10] igc: Refactor igc_xdp_run_prog() Andre Guedes
2020-12-21 22:45   ` Maciej Fijalkowski
2020-12-22  1:12     ` Andre Guedes
2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 03/10] igc: Refactor igc_clean_rx_ring() Andre Guedes
2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 04/10] igc: Refactor XDP rxq info registration Andre Guedes
2020-12-21 22:53   ` Maciej Fijalkowski
2020-12-22  1:13     ` Andre Guedes
2020-12-22 12:32       ` Maciej Fijalkowski
2020-12-22 17:43         ` Andre Guedes
2020-12-22 20:59           ` Maciej Fijalkowski
2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 05/10] igc: Introduce igc_update_rx_stats() Andre Guedes
2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 06/10] igc: Introduce igc_update_tx_stats() Andre Guedes
2020-12-21 22:58   ` Maciej Fijalkowski
2020-12-22  1:13     ` Andre Guedes
2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 07/10] igc: Introduce igc_unmap_tx_buffer() helper Andre Guedes
2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 08/10] igc: Replace IGC_TX_FLAGS_XDP flag by an enum Andre Guedes
2020-12-21 23:09   ` Maciej Fijalkowski
2020-12-22  1:13     ` Andre Guedes
2020-12-22 12:33       ` Maciej Fijalkowski
2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 09/10] igc: Enable RX via AF_XDP zero-copy Andre Guedes
2020-12-22 13:14   ` Maciej Fijalkowski
2020-12-22 17:43     ` Andre Guedes
2020-12-22 20:59       ` Maciej Fijalkowski
2020-12-23 19:27         ` Andre Guedes
2020-12-17 20:24 ` [Intel-wired-lan] [PATCH 10/10] igc: Enable TX " Andre Guedes
2020-12-22 14:16   ` Maciej Fijalkowski
2020-12-22 14:20 ` [Intel-wired-lan] [PATCH 00/10] igc: Add support for " Maciej Fijalkowski
2020-12-22 17:43   ` Andre Guedes
