All of lore.kernel.org
 help / color / mirror / Atom feed
* [next-queue v2 0/4] i40e: Add an i40e_napi_poll tracepoint
@ 2022-10-05 21:21 ` Joe Damato
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-05 21:21 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, kuba, davem, anthony.l.nguyen, jesse.brandeburg,
	maciej.fijalkowski, Joe Damato

Greetings:

Debugging and tuning the NAPI and i40e NIC parameters can be a bit tricky
as there are many different options to test.

This change adds a tracepoint to i40e_napi_poll which exposes a lot of
helpful debug information for users who'd like to get a better
understanding of how their NIC is performing as they adjust various
parameters and tuning knobs.

With this series applied, you can use the tracepoint with perf by running:

$ sudo perf trace -e i40e:i40e_napi_poll -a --call-graph=fp --libtraceevent_print

388.258 :0/0 i40e:i40e_napi_poll(i40e_napi_poll on dev eth2 q i40e-eth2-TxRx-9 irq 346 irq_mask 00000000,00000000,00000000,00000000,00000000,00800000 curr_cpu 23 budget 64 bpr 64 rx_cleaned 28 tx_cleaned 0 rx_clean_complete 1 tx_clean_complete 1)
	i40e_napi_poll ([i40e])
	i40e_napi_poll ([i40e])
	__napi_poll ([kernel.kallsyms])
	net_rx_action ([kernel.kallsyms])
	__do_softirq ([kernel.kallsyms])
	common_interrupt ([kernel.kallsyms])
	asm_common_interrupt ([kernel.kallsyms])
	intel_idle_irq ([kernel.kallsyms])
	cpuidle_enter_state ([kernel.kallsyms])
	cpuidle_enter ([kernel.kallsyms])
	do_idle ([kernel.kallsyms])
	cpu_startup_entry ([kernel.kallsyms])
	[0x243fd8] ([kernel.kallsyms])
	secondary_startup_64_no_verify ([kernel.kallsyms])

The output is verbose, but is helpful when trying to determine the impact of
various turning parameters.

Thanks,
Joe

v1 -> v2:
	- TX path modified to push an out parameter through the function
	  call chain instead of modifying control flow.
	- RX path modified to also use an out parameter to track the number
	  of packets processed.
	- Naming of tracepoint struct members and format string modified to
	  be more readable.


Joe Damato (4):
  i40e: Store the irq number in i40e_q_vector
  i40e: Record number TXes cleaned during NAPI
  i40e: Record number of RXes cleaned during NAPI
  i40e: Add i40e_napi_poll tracepoint

 drivers/net/ethernet/intel/i40e/i40e.h       |  1 +
 drivers/net/ethernet/intel/i40e/i40e_main.c  |  1 +
 drivers/net/ethernet/intel/i40e/i40e_trace.h | 49 ++++++++++++++++++++++++++++
 drivers/net/ethernet/intel/i40e/i40e_txrx.c  | 30 ++++++++++++-----
 drivers/net/ethernet/intel/i40e/i40e_xsk.c   | 21 +++++++++---
 drivers/net/ethernet/intel/i40e/i40e_xsk.h   |  6 ++--
 6 files changed, 93 insertions(+), 15 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [Intel-wired-lan] [next-queue v2 0/4] i40e: Add an i40e_napi_poll tracepoint
@ 2022-10-05 21:21 ` Joe Damato
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-05 21:21 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: netdev, Joe Damato, kuba, davem

Greetings:

Debugging and tuning the NAPI and i40e NIC parameters can be a bit tricky
as there are many different options to test.

This change adds a tracepoint to i40e_napi_poll which exposes a lot of
helpful debug information for users who'd like to get a better
understanding of how their NIC is performing as they adjust various
parameters and tuning knobs.

With this series applied, you can use the tracepoint with perf by running:

$ sudo perf trace -e i40e:i40e_napi_poll -a --call-graph=fp --libtraceevent_print

388.258 :0/0 i40e:i40e_napi_poll(i40e_napi_poll on dev eth2 q i40e-eth2-TxRx-9 irq 346 irq_mask 00000000,00000000,00000000,00000000,00000000,00800000 curr_cpu 23 budget 64 bpr 64 rx_cleaned 28 tx_cleaned 0 rx_clean_complete 1 tx_clean_complete 1)
	i40e_napi_poll ([i40e])
	i40e_napi_poll ([i40e])
	__napi_poll ([kernel.kallsyms])
	net_rx_action ([kernel.kallsyms])
	__do_softirq ([kernel.kallsyms])
	common_interrupt ([kernel.kallsyms])
	asm_common_interrupt ([kernel.kallsyms])
	intel_idle_irq ([kernel.kallsyms])
	cpuidle_enter_state ([kernel.kallsyms])
	cpuidle_enter ([kernel.kallsyms])
	do_idle ([kernel.kallsyms])
	cpu_startup_entry ([kernel.kallsyms])
	[0x243fd8] ([kernel.kallsyms])
	secondary_startup_64_no_verify ([kernel.kallsyms])

The output is verbose, but is helpful when trying to determine the impact of
various turning parameters.

Thanks,
Joe

v1 -> v2:
	- TX path modified to push an out parameter through the function
	  call chain instead of modifying control flow.
	- RX path modified to also use an out parameter to track the number
	  of packets processed.
	- Naming of tracepoint struct members and format string modified to
	  be more readable.


Joe Damato (4):
  i40e: Store the irq number in i40e_q_vector
  i40e: Record number TXes cleaned during NAPI
  i40e: Record number of RXes cleaned during NAPI
  i40e: Add i40e_napi_poll tracepoint

 drivers/net/ethernet/intel/i40e/i40e.h       |  1 +
 drivers/net/ethernet/intel/i40e/i40e_main.c  |  1 +
 drivers/net/ethernet/intel/i40e/i40e_trace.h | 49 ++++++++++++++++++++++++++++
 drivers/net/ethernet/intel/i40e/i40e_txrx.c  | 30 ++++++++++++-----
 drivers/net/ethernet/intel/i40e/i40e_xsk.c   | 21 +++++++++---
 drivers/net/ethernet/intel/i40e/i40e_xsk.h   |  6 ++--
 6 files changed, 93 insertions(+), 15 deletions(-)

-- 
2.7.4

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [next-queue v2 1/4] i40e: Store the irq number in i40e_q_vector
  2022-10-05 21:21 ` [Intel-wired-lan] " Joe Damato
@ 2022-10-05 21:21   ` Joe Damato
  -1 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-05 21:21 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, kuba, davem, anthony.l.nguyen, jesse.brandeburg,
	maciej.fijalkowski, Joe Damato

Make it easy to figure out the IRQ number for a particular i40e_q_vector by
storing the assigned IRQ in the structure itself.

Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 drivers/net/ethernet/intel/i40e/i40e.h      | 1 +
 drivers/net/ethernet/intel/i40e/i40e_main.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 9926c4e..8e1f395 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -992,6 +992,7 @@ struct i40e_q_vector {
 	struct rcu_head rcu;	/* to avoid race with update stats on free */
 	char name[I40E_INT_NAME_STR_LEN];
 	bool arm_wb_state;
+	int irq_num;		/* IRQ assigned to this q_vector */
 } ____cacheline_internodealigned_in_smp;
 
 /* lan device */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 6b7535a..6efe130 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -4123,6 +4123,7 @@ static int i40e_vsi_request_irq_msix(struct i40e_vsi *vsi, char *basename)
 		}
 
 		/* register for affinity change notifications */
+		q_vector->irq_num = irq_num;
 		q_vector->affinity_notify.notify = i40e_irq_affinity_notify;
 		q_vector->affinity_notify.release = i40e_irq_affinity_release;
 		irq_set_affinity_notifier(irq_num, &q_vector->affinity_notify);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Intel-wired-lan] [next-queue v2 1/4] i40e: Store the irq number in i40e_q_vector
@ 2022-10-05 21:21   ` Joe Damato
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-05 21:21 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: netdev, Joe Damato, kuba, davem

Make it easy to figure out the IRQ number for a particular i40e_q_vector by
storing the assigned IRQ in the structure itself.

Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 drivers/net/ethernet/intel/i40e/i40e.h      | 1 +
 drivers/net/ethernet/intel/i40e/i40e_main.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 9926c4e..8e1f395 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -992,6 +992,7 @@ struct i40e_q_vector {
 	struct rcu_head rcu;	/* to avoid race with update stats on free */
 	char name[I40E_INT_NAME_STR_LEN];
 	bool arm_wb_state;
+	int irq_num;		/* IRQ assigned to this q_vector */
 } ____cacheline_internodealigned_in_smp;
 
 /* lan device */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 6b7535a..6efe130 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -4123,6 +4123,7 @@ static int i40e_vsi_request_irq_msix(struct i40e_vsi *vsi, char *basename)
 		}
 
 		/* register for affinity change notifications */
+		q_vector->irq_num = irq_num;
 		q_vector->affinity_notify.notify = i40e_irq_affinity_notify;
 		q_vector->affinity_notify.release = i40e_irq_affinity_release;
 		irq_set_affinity_notifier(irq_num, &q_vector->affinity_notify);
-- 
2.7.4

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
  2022-10-05 21:21 ` [Intel-wired-lan] " Joe Damato
@ 2022-10-05 21:21   ` Joe Damato
  -1 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-05 21:21 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, kuba, davem, anthony.l.nguyen, jesse.brandeburg,
	maciej.fijalkowski, Joe Damato

Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores
the number TXs cleaned.

Likewise, update i40e_clean_xdp_tx_irq and i40e_xmit_zc to do the same.

Care has been taken to avoid changing the control flow of any functions
involved.

Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 16 +++++++++++-----
 drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 15 +++++++++++----
 drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
 3 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index b97c95f..a2cc98e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -923,11 +923,13 @@ void i40e_detect_recover_hung(struct i40e_vsi *vsi)
  * @vsi: the VSI we care about
  * @tx_ring: Tx ring to clean
  * @napi_budget: Used to determine if we are in netpoll
+ * @tx_cleaned: Out parameter set to the number of TXes cleaned
  *
  * Returns true if there's any budget left (e.g. the clean is finished)
  **/
 static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
-			      struct i40e_ring *tx_ring, int napi_budget)
+			      struct i40e_ring *tx_ring, int napi_budget,
+			      unsigned int *tx_cleaned)
 {
 	int i = tx_ring->next_to_clean;
 	struct i40e_tx_buffer *tx_buf;
@@ -1026,7 +1028,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
 	i40e_arm_wb(tx_ring, vsi, budget);
 
 	if (ring_is_xdp(tx_ring))
-		return !!budget;
+		goto out;
 
 	/* notify netdev of completed buffers */
 	netdev_tx_completed_queue(txring_txq(tx_ring),
@@ -1048,6 +1050,8 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
 		}
 	}
 
+out:
+	*tx_cleaned = total_packets;
 	return !!budget;
 }
 
@@ -2689,10 +2693,12 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 			       container_of(napi, struct i40e_q_vector, napi);
 	struct i40e_vsi *vsi = q_vector->vsi;
 	struct i40e_ring *ring;
+	bool tx_clean_complete = true;
 	bool clean_complete = true;
 	bool arm_wb = false;
 	int budget_per_ring;
 	int work_done = 0;
+	unsigned int tx_cleaned = 0;
 
 	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
 		napi_complete(napi);
@@ -2704,11 +2710,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 	 */
 	i40e_for_each_ring(ring, q_vector->tx) {
 		bool wd = ring->xsk_pool ?
-			  i40e_clean_xdp_tx_irq(vsi, ring) :
-			  i40e_clean_tx_irq(vsi, ring, budget);
+			  i40e_clean_xdp_tx_irq(vsi, ring, &tx_cleaned) :
+			  i40e_clean_tx_irq(vsi, ring, budget, &tx_cleaned);
 
 		if (!wd) {
-			clean_complete = false;
+			clean_complete = tx_clean_complete = false;
 			continue;
 		}
 		arm_wb |= ring->arm_wb;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index 790aaeff..f98ce7e4 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -530,18 +530,22 @@ static void i40e_set_rs_bit(struct i40e_ring *xdp_ring)
  * i40e_xmit_zc - Performs zero-copy Tx AF_XDP
  * @xdp_ring: XDP Tx ring
  * @budget: NAPI budget
+ * @tx_cleaned: Out parameter of the TX packets processed
  *
  * Returns true if the work is finished.
  **/
-static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
+static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget,
+			 unsigned int *tx_cleaned)
 {
 	struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
 	u32 nb_pkts, nb_processed = 0;
 	unsigned int total_bytes = 0;
 
 	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
-	if (!nb_pkts)
+	if (!nb_pkts) {
+		*tx_cleaned = 0;
 		return true;
+	}
 
 	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
 		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
@@ -558,6 +562,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
 
 	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
 
+	*tx_cleaned = nb_pkts;
 	return nb_pkts < budget;
 }
 
@@ -581,10 +586,12 @@ static void i40e_clean_xdp_tx_buffer(struct i40e_ring *tx_ring,
  * i40e_clean_xdp_tx_irq - Completes AF_XDP entries, and cleans XDP entries
  * @vsi: Current VSI
  * @tx_ring: XDP Tx ring
+ * @tx_cleaned: out parameter of number of TXes cleaned
  *
  * Returns true if cleanup/tranmission is done.
  **/
-bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring)
+bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring,
+			   unsigned int *tx_cleaned)
 {
 	struct xsk_buff_pool *bp = tx_ring->xsk_pool;
 	u32 i, completed_frames, xsk_frames = 0;
@@ -634,7 +641,7 @@ bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring)
 	if (xsk_uses_need_wakeup(tx_ring->xsk_pool))
 		xsk_set_tx_need_wakeup(tx_ring->xsk_pool);
 
-	return i40e_xmit_zc(tx_ring, I40E_DESC_UNUSED(tx_ring));
+	return i40e_xmit_zc(tx_ring, I40E_DESC_UNUSED(tx_ring), tx_cleaned);
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.h b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
index 821df24..396ed11 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
@@ -30,7 +30,8 @@ int i40e_xsk_pool_setup(struct i40e_vsi *vsi, struct xsk_buff_pool *pool,
 bool i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 cleaned_count);
 int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget);
 
-bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring);
+bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring,
+			   unsigned int *tx_cleaned);
 int i40e_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags);
 int i40e_realloc_rx_bi_zc(struct i40e_vsi *vsi, bool zc);
 void i40e_clear_rx_bi_zc(struct i40e_ring *rx_ring);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Intel-wired-lan] [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
@ 2022-10-05 21:21   ` Joe Damato
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-05 21:21 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: netdev, Joe Damato, kuba, davem

Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores
the number TXs cleaned.

Likewise, update i40e_clean_xdp_tx_irq and i40e_xmit_zc to do the same.

Care has been taken to avoid changing the control flow of any functions
involved.

Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 16 +++++++++++-----
 drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 15 +++++++++++----
 drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
 3 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index b97c95f..a2cc98e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -923,11 +923,13 @@ void i40e_detect_recover_hung(struct i40e_vsi *vsi)
  * @vsi: the VSI we care about
  * @tx_ring: Tx ring to clean
  * @napi_budget: Used to determine if we are in netpoll
+ * @tx_cleaned: Out parameter set to the number of TXes cleaned
  *
  * Returns true if there's any budget left (e.g. the clean is finished)
  **/
 static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
-			      struct i40e_ring *tx_ring, int napi_budget)
+			      struct i40e_ring *tx_ring, int napi_budget,
+			      unsigned int *tx_cleaned)
 {
 	int i = tx_ring->next_to_clean;
 	struct i40e_tx_buffer *tx_buf;
@@ -1026,7 +1028,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
 	i40e_arm_wb(tx_ring, vsi, budget);
 
 	if (ring_is_xdp(tx_ring))
-		return !!budget;
+		goto out;
 
 	/* notify netdev of completed buffers */
 	netdev_tx_completed_queue(txring_txq(tx_ring),
@@ -1048,6 +1050,8 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
 		}
 	}
 
+out:
+	*tx_cleaned = total_packets;
 	return !!budget;
 }
 
@@ -2689,10 +2693,12 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 			       container_of(napi, struct i40e_q_vector, napi);
 	struct i40e_vsi *vsi = q_vector->vsi;
 	struct i40e_ring *ring;
+	bool tx_clean_complete = true;
 	bool clean_complete = true;
 	bool arm_wb = false;
 	int budget_per_ring;
 	int work_done = 0;
+	unsigned int tx_cleaned = 0;
 
 	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
 		napi_complete(napi);
@@ -2704,11 +2710,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 	 */
 	i40e_for_each_ring(ring, q_vector->tx) {
 		bool wd = ring->xsk_pool ?
-			  i40e_clean_xdp_tx_irq(vsi, ring) :
-			  i40e_clean_tx_irq(vsi, ring, budget);
+			  i40e_clean_xdp_tx_irq(vsi, ring, &tx_cleaned) :
+			  i40e_clean_tx_irq(vsi, ring, budget, &tx_cleaned);
 
 		if (!wd) {
-			clean_complete = false;
+			clean_complete = tx_clean_complete = false;
 			continue;
 		}
 		arm_wb |= ring->arm_wb;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index 790aaeff..f98ce7e4 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -530,18 +530,22 @@ static void i40e_set_rs_bit(struct i40e_ring *xdp_ring)
  * i40e_xmit_zc - Performs zero-copy Tx AF_XDP
  * @xdp_ring: XDP Tx ring
  * @budget: NAPI budget
+ * @tx_cleaned: Out parameter of the TX packets processed
  *
  * Returns true if the work is finished.
  **/
-static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
+static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget,
+			 unsigned int *tx_cleaned)
 {
 	struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
 	u32 nb_pkts, nb_processed = 0;
 	unsigned int total_bytes = 0;
 
 	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
-	if (!nb_pkts)
+	if (!nb_pkts) {
+		*tx_cleaned = 0;
 		return true;
+	}
 
 	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
 		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
@@ -558,6 +562,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
 
 	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
 
+	*tx_cleaned = nb_pkts;
 	return nb_pkts < budget;
 }
 
@@ -581,10 +586,12 @@ static void i40e_clean_xdp_tx_buffer(struct i40e_ring *tx_ring,
  * i40e_clean_xdp_tx_irq - Completes AF_XDP entries, and cleans XDP entries
  * @vsi: Current VSI
  * @tx_ring: XDP Tx ring
+ * @tx_cleaned: out parameter of number of TXes cleaned
  *
  * Returns true if cleanup/tranmission is done.
  **/
-bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring)
+bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring,
+			   unsigned int *tx_cleaned)
 {
 	struct xsk_buff_pool *bp = tx_ring->xsk_pool;
 	u32 i, completed_frames, xsk_frames = 0;
@@ -634,7 +641,7 @@ bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring)
 	if (xsk_uses_need_wakeup(tx_ring->xsk_pool))
 		xsk_set_tx_need_wakeup(tx_ring->xsk_pool);
 
-	return i40e_xmit_zc(tx_ring, I40E_DESC_UNUSED(tx_ring));
+	return i40e_xmit_zc(tx_ring, I40E_DESC_UNUSED(tx_ring), tx_cleaned);
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.h b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
index 821df24..396ed11 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
@@ -30,7 +30,8 @@ int i40e_xsk_pool_setup(struct i40e_vsi *vsi, struct xsk_buff_pool *pool,
 bool i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 cleaned_count);
 int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget);
 
-bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring);
+bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring,
+			   unsigned int *tx_cleaned);
 int i40e_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags);
 int i40e_realloc_rx_bi_zc(struct i40e_vsi *vsi, bool zc);
 void i40e_clear_rx_bi_zc(struct i40e_ring *rx_ring);
-- 
2.7.4

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [next-queue v2 3/4] i40e: Record number of RXes cleaned during NAPI
  2022-10-05 21:21 ` [Intel-wired-lan] " Joe Damato
@ 2022-10-05 21:21   ` Joe Damato
  -1 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-05 21:21 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, kuba, davem, anthony.l.nguyen, jesse.brandeburg,
	maciej.fijalkowski, Joe Damato

Adjust i40e_clean_rx_irq and i40e_clean_rx_irq_zc to accept an out
parameter which records the number of RX packets cleaned.

Care has been taken to avoid any changes in control flow.

Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 11 ++++++++---
 drivers/net/ethernet/intel/i40e/i40e_xsk.c  |  6 +++++-
 drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index a2cc98e..8a0d4fd 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2426,6 +2426,7 @@ static void i40e_inc_ntc(struct i40e_ring *rx_ring)
  * i40e_clean_rx_irq - Clean completed descriptors from Rx ring - bounce buf
  * @rx_ring: rx descriptor ring to transact packets on
  * @budget: Total limit on number of packets to process
+ * @rx_cleaned: Out parameter of the number of packets processed
  *
  * This function provides a "bounce buffer" approach to Rx interrupt
  * processing.  The advantage to this is that on systems that have
@@ -2434,7 +2435,8 @@ static void i40e_inc_ntc(struct i40e_ring *rx_ring)
  *
  * Returns amount of work completed
  **/
-static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
+static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
+			     unsigned int *rx_cleaned)
 {
 	unsigned int total_rx_bytes = 0, total_rx_packets = 0, frame_sz = 0;
 	u16 cleaned_count = I40E_DESC_UNUSED(rx_ring);
@@ -2571,6 +2573,8 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 
 	i40e_update_rx_stats(rx_ring, total_rx_bytes, total_rx_packets);
 
+	*rx_cleaned = total_rx_packets;
+
 	/* guarantee a trip back through this routine if there was a failure */
 	return failure ? budget : (int)total_rx_packets;
 }
@@ -2699,6 +2703,7 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 	int budget_per_ring;
 	int work_done = 0;
 	unsigned int tx_cleaned = 0;
+	unsigned int rx_cleaned = 0;
 
 	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
 		napi_complete(napi);
@@ -2738,8 +2743,8 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 
 	i40e_for_each_ring(ring, q_vector->rx) {
 		int cleaned = ring->xsk_pool ?
-			      i40e_clean_rx_irq_zc(ring, budget_per_ring) :
-			      i40e_clean_rx_irq(ring, budget_per_ring);
+			      i40e_clean_rx_irq_zc(ring, budget_per_ring, &rx_cleaned) :
+			      i40e_clean_rx_irq(ring, budget_per_ring, &rx_cleaned);
 
 		work_done += cleaned;
 		/* if we clean as many as budgeted, we must not be done */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index f98ce7e4..b1f582a0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -378,10 +378,12 @@ static void i40e_handle_xdp_result_zc(struct i40e_ring *rx_ring,
  * i40e_clean_rx_irq_zc - Consumes Rx packets from the hardware ring
  * @rx_ring: Rx ring
  * @budget: NAPI budget
+ * @rx_cleaned: out parameter of the packets processed
  *
  * Returns amount of work completed
  **/
-int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget)
+int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget,
+			  unsigned int *rx_cleaned)
 {
 	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
 	u16 next_to_clean = rx_ring->next_to_clean;
@@ -452,6 +454,8 @@ int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget)
 	i40e_finalize_xdp_rx(rx_ring, xdp_xmit);
 	i40e_update_rx_stats(rx_ring, total_rx_bytes, total_rx_packets);
 
+	*rx_cleaned = total_rx_packets;
+
 	if (xsk_uses_need_wakeup(rx_ring->xsk_pool)) {
 		if (failure || next_to_clean == rx_ring->next_to_use)
 			xsk_set_rx_need_wakeup(rx_ring->xsk_pool);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.h b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
index 396ed11..1089cc0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
@@ -28,7 +28,8 @@ int i40e_queue_pair_enable(struct i40e_vsi *vsi, int queue_pair);
 int i40e_xsk_pool_setup(struct i40e_vsi *vsi, struct xsk_buff_pool *pool,
 			u16 qid);
 bool i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 cleaned_count);
-int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget);
+int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget,
+			  unsigned int *rx_cleaned);
 
 bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring,
 			   unsigned int *tx_cleaned);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Intel-wired-lan] [next-queue v2 3/4] i40e: Record number of RXes cleaned during NAPI
@ 2022-10-05 21:21   ` Joe Damato
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-05 21:21 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: netdev, Joe Damato, kuba, davem

Adjust i40e_clean_rx_irq and i40e_clean_rx_irq_zc to accept an out
parameter which records the number of RX packets cleaned.

Care has been taken to avoid any changes in control flow.

Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 11 ++++++++---
 drivers/net/ethernet/intel/i40e/i40e_xsk.c  |  6 +++++-
 drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index a2cc98e..8a0d4fd 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2426,6 +2426,7 @@ static void i40e_inc_ntc(struct i40e_ring *rx_ring)
  * i40e_clean_rx_irq - Clean completed descriptors from Rx ring - bounce buf
  * @rx_ring: rx descriptor ring to transact packets on
  * @budget: Total limit on number of packets to process
+ * @rx_cleaned: Out parameter of the number of packets processed
  *
  * This function provides a "bounce buffer" approach to Rx interrupt
  * processing.  The advantage to this is that on systems that have
@@ -2434,7 +2435,8 @@ static void i40e_inc_ntc(struct i40e_ring *rx_ring)
  *
  * Returns amount of work completed
  **/
-static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
+static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
+			     unsigned int *rx_cleaned)
 {
 	unsigned int total_rx_bytes = 0, total_rx_packets = 0, frame_sz = 0;
 	u16 cleaned_count = I40E_DESC_UNUSED(rx_ring);
@@ -2571,6 +2573,8 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 
 	i40e_update_rx_stats(rx_ring, total_rx_bytes, total_rx_packets);
 
+	*rx_cleaned = total_rx_packets;
+
 	/* guarantee a trip back through this routine if there was a failure */
 	return failure ? budget : (int)total_rx_packets;
 }
@@ -2699,6 +2703,7 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 	int budget_per_ring;
 	int work_done = 0;
 	unsigned int tx_cleaned = 0;
+	unsigned int rx_cleaned = 0;
 
 	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
 		napi_complete(napi);
@@ -2738,8 +2743,8 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 
 	i40e_for_each_ring(ring, q_vector->rx) {
 		int cleaned = ring->xsk_pool ?
-			      i40e_clean_rx_irq_zc(ring, budget_per_ring) :
-			      i40e_clean_rx_irq(ring, budget_per_ring);
+			      i40e_clean_rx_irq_zc(ring, budget_per_ring, &rx_cleaned) :
+			      i40e_clean_rx_irq(ring, budget_per_ring, &rx_cleaned);
 
 		work_done += cleaned;
 		/* if we clean as many as budgeted, we must not be done */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index f98ce7e4..b1f582a0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -378,10 +378,12 @@ static void i40e_handle_xdp_result_zc(struct i40e_ring *rx_ring,
  * i40e_clean_rx_irq_zc - Consumes Rx packets from the hardware ring
  * @rx_ring: Rx ring
  * @budget: NAPI budget
+ * @rx_cleaned: out parameter of the packets processed
  *
  * Returns amount of work completed
  **/
-int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget)
+int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget,
+			  unsigned int *rx_cleaned)
 {
 	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
 	u16 next_to_clean = rx_ring->next_to_clean;
@@ -452,6 +454,8 @@ int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget)
 	i40e_finalize_xdp_rx(rx_ring, xdp_xmit);
 	i40e_update_rx_stats(rx_ring, total_rx_bytes, total_rx_packets);
 
+	*rx_cleaned = total_rx_packets;
+
 	if (xsk_uses_need_wakeup(rx_ring->xsk_pool)) {
 		if (failure || next_to_clean == rx_ring->next_to_use)
 			xsk_set_rx_need_wakeup(rx_ring->xsk_pool);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.h b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
index 396ed11..1089cc0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
@@ -28,7 +28,8 @@ int i40e_queue_pair_enable(struct i40e_vsi *vsi, int queue_pair);
 int i40e_xsk_pool_setup(struct i40e_vsi *vsi, struct xsk_buff_pool *pool,
 			u16 qid);
 bool i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 cleaned_count);
-int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget);
+int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget,
+			  unsigned int *rx_cleaned);
 
 bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring,
 			   unsigned int *tx_cleaned);
-- 
2.7.4

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [next-queue v2 4/4] i40e: Add i40e_napi_poll tracepoint
  2022-10-05 21:21 ` [Intel-wired-lan] " Joe Damato
@ 2022-10-05 21:21   ` Joe Damato
  -1 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-05 21:21 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, kuba, davem, anthony.l.nguyen, jesse.brandeburg,
	maciej.fijalkowski, Joe Damato

Add a tracepoint for i40e_napi_poll that allows users to get detailed
information about the amount of work done. This information can help users
better tune the correct NAPI parameters (like weight and budget), as well
as debug NIC settings like rx-usecs and tx-usecs, etc.

An example of the output from this tracepoint:

$ sudo perf trace -e i40e:i40e_napi_poll -a --call-graph=fp --libtraceevent_print

[..snip..]

388.258 :0/0 i40e:i40e_napi_poll(i40e_napi_poll on dev eth2 q i40e-eth2-TxRx-9 irq 346 irq_mask 00000000,00000000,00000000,00000000,00000000,00800000 curr_cpu 23 budget 64 bpr 64 rx_cleaned 28 tx_cleaned 0 rx_clean_complete 1 tx_clean_complete 1)
	i40e_napi_poll ([i40e])
	i40e_napi_poll ([i40e])
	__napi_poll ([kernel.kallsyms])
	net_rx_action ([kernel.kallsyms])
	__do_softirq ([kernel.kallsyms])
	common_interrupt ([kernel.kallsyms])
	asm_common_interrupt ([kernel.kallsyms])
	intel_idle_irq ([kernel.kallsyms])
	cpuidle_enter_state ([kernel.kallsyms])
	cpuidle_enter ([kernel.kallsyms])
	do_idle ([kernel.kallsyms])
	cpu_startup_entry ([kernel.kallsyms])
	[0x243fd8] ([kernel.kallsyms])
	secondary_startup_64_no_verify ([kernel.kallsyms])

Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 drivers/net/ethernet/intel/i40e/i40e_trace.h | 49 ++++++++++++++++++++++++++++
 drivers/net/ethernet/intel/i40e/i40e_txrx.c  |  3 ++
 2 files changed, 52 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_trace.h b/drivers/net/ethernet/intel/i40e/i40e_trace.h
index b5b1229..7d7c161 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_trace.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_trace.h
@@ -55,6 +55,55 @@
  * being built from shared code.
  */
 
+#define NO_DEV "(i40e no_device)"
+
+TRACE_EVENT(i40e_napi_poll,
+
+	TP_PROTO(struct napi_struct *napi, struct i40e_q_vector *q, int budget,
+		 int budget_per_ring, unsigned int rx_cleaned, unsigned int tx_cleaned,
+		 bool rx_clean_complete, bool tx_clean_complete),
+
+	TP_ARGS(napi, q, budget, budget_per_ring, rx_cleaned, tx_cleaned,
+		rx_clean_complete, tx_clean_complete),
+
+	TP_STRUCT__entry(
+		__field(int, budget)
+		__field(int, budget_per_ring)
+		__field(unsigned int, rx_cleaned)
+		__field(unsigned int, tx_cleaned)
+		__field(int, rx_clean_complete)
+		__field(int, tx_clean_complete)
+		__field(int, irq_num)
+		__field(int, curr_cpu)
+		__string(qname, q->name)
+		__string(dev_name, napi->dev ? napi->dev->name : NO_DEV)
+		__bitmask(irq_affinity,	nr_cpumask_bits)
+	),
+
+	TP_fast_assign(
+		__entry->budget = budget;
+		__entry->budget_per_ring = budget_per_ring;
+		__entry->rx_cleaned = rx_cleaned;
+		__entry->tx_cleaned = tx_cleaned;
+		__entry->rx_clean_complete = rx_clean_complete;
+		__entry->tx_clean_complete = tx_clean_complete;
+		__entry->irq_num = q->irq_num;
+		__entry->curr_cpu = get_cpu();
+		__assign_str(qname, q->name);
+		__assign_str(dev_name, napi->dev ? napi->dev->name : NO_DEV);
+		__assign_bitmask(irq_affinity, cpumask_bits(&q->affinity_mask),
+				 nr_cpumask_bits);
+	),
+
+	TP_printk("i40e_napi_poll on dev %s q %s irq %d irq_mask %s curr_cpu %d "
+		  "budget %d bpr %d rx_cleaned %lu tx_cleaned %lu "
+		  "rx_clean_complete %d tx_clean_complete %d",
+		__get_str(dev_name), __get_str(qname), __entry->irq_num,
+		__get_bitmask(irq_affinity), __entry->curr_cpu, __entry->budget,
+		__entry->budget_per_ring, __entry->rx_cleaned, __entry->tx_cleaned,
+		__entry->rx_clean_complete, __entry->tx_clean_complete)
+);
+
 /* Events related to a vsi & ring */
 DECLARE_EVENT_CLASS(
 	i40e_tx_template,
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 8a0d4fd..cda7f59 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2752,6 +2752,9 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 			clean_complete = false;
 	}
 
+	trace_i40e_napi_poll(napi, q_vector, budget, budget_per_ring, rx_cleaned,
+			     tx_cleaned, clean_complete, tx_clean_complete);
+
 	/* If work not completed, return budget and polling will return */
 	if (!clean_complete) {
 		int cpu_id = smp_processor_id();
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [Intel-wired-lan] [next-queue v2 4/4] i40e: Add i40e_napi_poll tracepoint
@ 2022-10-05 21:21   ` Joe Damato
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-05 21:21 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: netdev, Joe Damato, kuba, davem

Add a tracepoint for i40e_napi_poll that allows users to get detailed
information about the amount of work done. This information can help users
better tune the correct NAPI parameters (like weight and budget), as well
as debug NIC settings like rx-usecs and tx-usecs, etc.

An example of the output from this tracepoint:

$ sudo perf trace -e i40e:i40e_napi_poll -a --call-graph=fp --libtraceevent_print

[..snip..]

388.258 :0/0 i40e:i40e_napi_poll(i40e_napi_poll on dev eth2 q i40e-eth2-TxRx-9 irq 346 irq_mask 00000000,00000000,00000000,00000000,00000000,00800000 curr_cpu 23 budget 64 bpr 64 rx_cleaned 28 tx_cleaned 0 rx_clean_complete 1 tx_clean_complete 1)
	i40e_napi_poll ([i40e])
	i40e_napi_poll ([i40e])
	__napi_poll ([kernel.kallsyms])
	net_rx_action ([kernel.kallsyms])
	__do_softirq ([kernel.kallsyms])
	common_interrupt ([kernel.kallsyms])
	asm_common_interrupt ([kernel.kallsyms])
	intel_idle_irq ([kernel.kallsyms])
	cpuidle_enter_state ([kernel.kallsyms])
	cpuidle_enter ([kernel.kallsyms])
	do_idle ([kernel.kallsyms])
	cpu_startup_entry ([kernel.kallsyms])
	[0x243fd8] ([kernel.kallsyms])
	secondary_startup_64_no_verify ([kernel.kallsyms])

Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 drivers/net/ethernet/intel/i40e/i40e_trace.h | 49 ++++++++++++++++++++++++++++
 drivers/net/ethernet/intel/i40e/i40e_txrx.c  |  3 ++
 2 files changed, 52 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_trace.h b/drivers/net/ethernet/intel/i40e/i40e_trace.h
index b5b1229..7d7c161 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_trace.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_trace.h
@@ -55,6 +55,55 @@
  * being built from shared code.
  */
 
+#define NO_DEV "(i40e no_device)"
+
+TRACE_EVENT(i40e_napi_poll,
+
+	TP_PROTO(struct napi_struct *napi, struct i40e_q_vector *q, int budget,
+		 int budget_per_ring, unsigned int rx_cleaned, unsigned int tx_cleaned,
+		 bool rx_clean_complete, bool tx_clean_complete),
+
+	TP_ARGS(napi, q, budget, budget_per_ring, rx_cleaned, tx_cleaned,
+		rx_clean_complete, tx_clean_complete),
+
+	TP_STRUCT__entry(
+		__field(int, budget)
+		__field(int, budget_per_ring)
+		__field(unsigned int, rx_cleaned)
+		__field(unsigned int, tx_cleaned)
+		__field(int, rx_clean_complete)
+		__field(int, tx_clean_complete)
+		__field(int, irq_num)
+		__field(int, curr_cpu)
+		__string(qname, q->name)
+		__string(dev_name, napi->dev ? napi->dev->name : NO_DEV)
+		__bitmask(irq_affinity,	nr_cpumask_bits)
+	),
+
+	TP_fast_assign(
+		__entry->budget = budget;
+		__entry->budget_per_ring = budget_per_ring;
+		__entry->rx_cleaned = rx_cleaned;
+		__entry->tx_cleaned = tx_cleaned;
+		__entry->rx_clean_complete = rx_clean_complete;
+		__entry->tx_clean_complete = tx_clean_complete;
+		__entry->irq_num = q->irq_num;
+		__entry->curr_cpu = get_cpu();
+		__assign_str(qname, q->name);
+		__assign_str(dev_name, napi->dev ? napi->dev->name : NO_DEV);
+		__assign_bitmask(irq_affinity, cpumask_bits(&q->affinity_mask),
+				 nr_cpumask_bits);
+	),
+
+	TP_printk("i40e_napi_poll on dev %s q %s irq %d irq_mask %s curr_cpu %d "
+		  "budget %d bpr %d rx_cleaned %lu tx_cleaned %lu "
+		  "rx_clean_complete %d tx_clean_complete %d",
+		__get_str(dev_name), __get_str(qname), __entry->irq_num,
+		__get_bitmask(irq_affinity), __entry->curr_cpu, __entry->budget,
+		__entry->budget_per_ring, __entry->rx_cleaned, __entry->tx_cleaned,
+		__entry->rx_clean_complete, __entry->tx_clean_complete)
+);
+
 /* Events related to a vsi & ring */
 DECLARE_EVENT_CLASS(
 	i40e_tx_template,
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 8a0d4fd..cda7f59 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2752,6 +2752,9 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 			clean_complete = false;
 	}
 
+	trace_i40e_napi_poll(napi, q_vector, budget, budget_per_ring, rx_cleaned,
+			     tx_cleaned, clean_complete, tx_clean_complete);
+
 	/* If work not completed, return budget and polling will return */
 	if (!clean_complete) {
 		int cpu_id = smp_processor_id();
-- 
2.7.4

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
  2022-10-05 21:21   ` [Intel-wired-lan] " Joe Damato
@ 2022-10-06  0:16     ` Samudrala, Sridhar
  -1 siblings, 0 replies; 30+ messages in thread
From: Samudrala, Sridhar @ 2022-10-06  0:16 UTC (permalink / raw)
  To: Joe Damato, intel-wired-lan
  Cc: netdev, kuba, davem, anthony.l.nguyen, jesse.brandeburg,
	maciej.fijalkowski

On 10/5/2022 4:21 PM, Joe Damato wrote:
> Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores
> the number TXs cleaned.
>
> Likewise, update i40e_clean_xdp_tx_irq and i40e_xmit_zc to do the same.
>
> Care has been taken to avoid changing the control flow of any functions
> involved.
>
> Signed-off-by: Joe Damato <jdamato@fastly.com>
> ---
>   drivers/net/ethernet/intel/i40e/i40e_txrx.c | 16 +++++++++++-----
>   drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 15 +++++++++++----
>   drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
>   3 files changed, 24 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> index b97c95f..a2cc98e 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> @@ -923,11 +923,13 @@ void i40e_detect_recover_hung(struct i40e_vsi *vsi)
>    * @vsi: the VSI we care about
>    * @tx_ring: Tx ring to clean
>    * @napi_budget: Used to determine if we are in netpoll
> + * @tx_cleaned: Out parameter set to the number of TXes cleaned
>    *
>    * Returns true if there's any budget left (e.g. the clean is finished)
>    **/
>   static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> -			      struct i40e_ring *tx_ring, int napi_budget)
> +			      struct i40e_ring *tx_ring, int napi_budget,
> +			      unsigned int *tx_cleaned)
>   {
>   	int i = tx_ring->next_to_clean;
>   	struct i40e_tx_buffer *tx_buf;
> @@ -1026,7 +1028,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
>   	i40e_arm_wb(tx_ring, vsi, budget);
>   
>   	if (ring_is_xdp(tx_ring))
> -		return !!budget;
> +		goto out;
>   
>   	/* notify netdev of completed buffers */
>   	netdev_tx_completed_queue(txring_txq(tx_ring),
> @@ -1048,6 +1050,8 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
>   		}
>   	}
>   
> +out:
> +	*tx_cleaned = total_packets;
>   	return !!budget;
>   }
>   
> @@ -2689,10 +2693,12 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
>   			       container_of(napi, struct i40e_q_vector, napi);
>   	struct i40e_vsi *vsi = q_vector->vsi;
>   	struct i40e_ring *ring;
> +	bool tx_clean_complete = true;
>   	bool clean_complete = true;
>   	bool arm_wb = false;
>   	int budget_per_ring;
>   	int work_done = 0;
> +	unsigned int tx_cleaned = 0;
>   
>   	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
>   		napi_complete(napi);
> @@ -2704,11 +2710,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
>   	 */
>   	i40e_for_each_ring(ring, q_vector->tx) {
>   		bool wd = ring->xsk_pool ?
> -			  i40e_clean_xdp_tx_irq(vsi, ring) :
> -			  i40e_clean_tx_irq(vsi, ring, budget);
> +			  i40e_clean_xdp_tx_irq(vsi, ring, &tx_cleaned) :
> +			  i40e_clean_tx_irq(vsi, ring, budget, &tx_cleaned);
>   
>   		if (!wd) {
> -			clean_complete = false;
> +			clean_complete = tx_clean_complete = false;
>   			continue;
>   		}
>   		arm_wb |= ring->arm_wb;
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> index 790aaeff..f98ce7e4 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> @@ -530,18 +530,22 @@ static void i40e_set_rs_bit(struct i40e_ring *xdp_ring)
>    * i40e_xmit_zc - Performs zero-copy Tx AF_XDP
>    * @xdp_ring: XDP Tx ring
>    * @budget: NAPI budget
> + * @tx_cleaned: Out parameter of the TX packets processed
>    *
>    * Returns true if the work is finished.
>    **/
> -static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> +static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget,
> +			 unsigned int *tx_cleaned)
>   {
>   	struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
>   	u32 nb_pkts, nb_processed = 0;
>   	unsigned int total_bytes = 0;
>   
>   	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
> -	if (!nb_pkts)
> +	if (!nb_pkts) {
> +		*tx_cleaned = 0;
>   		return true;
> +	}
>   
>   	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
>   		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
> @@ -558,6 +562,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
>   
>   	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
>   
> +	*tx_cleaned = nb_pkts;

With XDP, I don't think we should count these as tx_cleaned packets. These are transmitted
packets. The tx_cleaned would be the xsk_frames counter in i40e_clean_xdp_tx_irq
May be we need 2 counters for xdp.


>   	return nb_pkts < budget;
>   }
>   
> @@ -581,10 +586,12 @@ static void i40e_clean_xdp_tx_buffer(struct i40e_ring *tx_ring,
>    * i40e_clean_xdp_tx_irq - Completes AF_XDP entries, and cleans XDP entries
>    * @vsi: Current VSI
>    * @tx_ring: XDP Tx ring
> + * @tx_cleaned: out parameter of number of TXes cleaned
>    *
>    * Returns true if cleanup/tranmission is done.
>    **/
> -bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring)
> +bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring,
> +			   unsigned int *tx_cleaned)
>   {
>   	struct xsk_buff_pool *bp = tx_ring->xsk_pool;
>   	u32 i, completed_frames, xsk_frames = 0;
> @@ -634,7 +641,7 @@ bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring)
>   	if (xsk_uses_need_wakeup(tx_ring->xsk_pool))
>   		xsk_set_tx_need_wakeup(tx_ring->xsk_pool);
>   
> -	return i40e_xmit_zc(tx_ring, I40E_DESC_UNUSED(tx_ring));
> +	return i40e_xmit_zc(tx_ring, I40E_DESC_UNUSED(tx_ring), tx_cleaned);
>   }
>   
>   /**
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.h b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
> index 821df24..396ed11 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.h
> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
> @@ -30,7 +30,8 @@ int i40e_xsk_pool_setup(struct i40e_vsi *vsi, struct xsk_buff_pool *pool,
>   bool i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 cleaned_count);
>   int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget);
>   
> -bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring);
> +bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring,
> +			   unsigned int *tx_cleaned);
>   int i40e_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags);
>   int i40e_realloc_rx_bi_zc(struct i40e_vsi *vsi, bool zc);
>   void i40e_clear_rx_bi_zc(struct i40e_ring *rx_ring);


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Intel-wired-lan] [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
@ 2022-10-06  0:16     ` Samudrala, Sridhar
  0 siblings, 0 replies; 30+ messages in thread
From: Samudrala, Sridhar @ 2022-10-06  0:16 UTC (permalink / raw)
  To: Joe Damato, intel-wired-lan; +Cc: netdev, kuba, davem

On 10/5/2022 4:21 PM, Joe Damato wrote:
> Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores
> the number TXs cleaned.
>
> Likewise, update i40e_clean_xdp_tx_irq and i40e_xmit_zc to do the same.
>
> Care has been taken to avoid changing the control flow of any functions
> involved.
>
> Signed-off-by: Joe Damato <jdamato@fastly.com>
> ---
>   drivers/net/ethernet/intel/i40e/i40e_txrx.c | 16 +++++++++++-----
>   drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 15 +++++++++++----
>   drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
>   3 files changed, 24 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> index b97c95f..a2cc98e 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> @@ -923,11 +923,13 @@ void i40e_detect_recover_hung(struct i40e_vsi *vsi)
>    * @vsi: the VSI we care about
>    * @tx_ring: Tx ring to clean
>    * @napi_budget: Used to determine if we are in netpoll
> + * @tx_cleaned: Out parameter set to the number of TXes cleaned
>    *
>    * Returns true if there's any budget left (e.g. the clean is finished)
>    **/
>   static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> -			      struct i40e_ring *tx_ring, int napi_budget)
> +			      struct i40e_ring *tx_ring, int napi_budget,
> +			      unsigned int *tx_cleaned)
>   {
>   	int i = tx_ring->next_to_clean;
>   	struct i40e_tx_buffer *tx_buf;
> @@ -1026,7 +1028,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
>   	i40e_arm_wb(tx_ring, vsi, budget);
>   
>   	if (ring_is_xdp(tx_ring))
> -		return !!budget;
> +		goto out;
>   
>   	/* notify netdev of completed buffers */
>   	netdev_tx_completed_queue(txring_txq(tx_ring),
> @@ -1048,6 +1050,8 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
>   		}
>   	}
>   
> +out:
> +	*tx_cleaned = total_packets;
>   	return !!budget;
>   }
>   
> @@ -2689,10 +2693,12 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
>   			       container_of(napi, struct i40e_q_vector, napi);
>   	struct i40e_vsi *vsi = q_vector->vsi;
>   	struct i40e_ring *ring;
> +	bool tx_clean_complete = true;
>   	bool clean_complete = true;
>   	bool arm_wb = false;
>   	int budget_per_ring;
>   	int work_done = 0;
> +	unsigned int tx_cleaned = 0;
>   
>   	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
>   		napi_complete(napi);
> @@ -2704,11 +2710,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
>   	 */
>   	i40e_for_each_ring(ring, q_vector->tx) {
>   		bool wd = ring->xsk_pool ?
> -			  i40e_clean_xdp_tx_irq(vsi, ring) :
> -			  i40e_clean_tx_irq(vsi, ring, budget);
> +			  i40e_clean_xdp_tx_irq(vsi, ring, &tx_cleaned) :
> +			  i40e_clean_tx_irq(vsi, ring, budget, &tx_cleaned);
>   
>   		if (!wd) {
> -			clean_complete = false;
> +			clean_complete = tx_clean_complete = false;
>   			continue;
>   		}
>   		arm_wb |= ring->arm_wb;
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> index 790aaeff..f98ce7e4 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> @@ -530,18 +530,22 @@ static void i40e_set_rs_bit(struct i40e_ring *xdp_ring)
>    * i40e_xmit_zc - Performs zero-copy Tx AF_XDP
>    * @xdp_ring: XDP Tx ring
>    * @budget: NAPI budget
> + * @tx_cleaned: Out parameter of the TX packets processed
>    *
>    * Returns true if the work is finished.
>    **/
> -static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> +static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget,
> +			 unsigned int *tx_cleaned)
>   {
>   	struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
>   	u32 nb_pkts, nb_processed = 0;
>   	unsigned int total_bytes = 0;
>   
>   	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
> -	if (!nb_pkts)
> +	if (!nb_pkts) {
> +		*tx_cleaned = 0;
>   		return true;
> +	}
>   
>   	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
>   		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
> @@ -558,6 +562,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
>   
>   	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
>   
> +	*tx_cleaned = nb_pkts;

With XDP, I don't think we should count these as tx_cleaned packets. These are transmitted
packets. The tx_cleaned would be the xsk_frames counter in i40e_clean_xdp_tx_irq
May be we need 2 counters for xdp.


>   	return nb_pkts < budget;
>   }
>   
> @@ -581,10 +586,12 @@ static void i40e_clean_xdp_tx_buffer(struct i40e_ring *tx_ring,
>    * i40e_clean_xdp_tx_irq - Completes AF_XDP entries, and cleans XDP entries
>    * @vsi: Current VSI
>    * @tx_ring: XDP Tx ring
> + * @tx_cleaned: out parameter of number of TXes cleaned
>    *
>    * Returns true if cleanup/tranmission is done.
>    **/
> -bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring)
> +bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring,
> +			   unsigned int *tx_cleaned)
>   {
>   	struct xsk_buff_pool *bp = tx_ring->xsk_pool;
>   	u32 i, completed_frames, xsk_frames = 0;
> @@ -634,7 +641,7 @@ bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring)
>   	if (xsk_uses_need_wakeup(tx_ring->xsk_pool))
>   		xsk_set_tx_need_wakeup(tx_ring->xsk_pool);
>   
> -	return i40e_xmit_zc(tx_ring, I40E_DESC_UNUSED(tx_ring));
> +	return i40e_xmit_zc(tx_ring, I40E_DESC_UNUSED(tx_ring), tx_cleaned);
>   }
>   
>   /**
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.h b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
> index 821df24..396ed11 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.h
> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
> @@ -30,7 +30,8 @@ int i40e_xsk_pool_setup(struct i40e_vsi *vsi, struct xsk_buff_pool *pool,
>   bool i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 cleaned_count);
>   int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget);
>   
> -bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring);
> +bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring,
> +			   unsigned int *tx_cleaned);
>   int i40e_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags);
>   int i40e_realloc_rx_bi_zc(struct i40e_vsi *vsi, bool zc);
>   void i40e_clear_rx_bi_zc(struct i40e_ring *rx_ring);

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
  2022-10-06  0:16     ` [Intel-wired-lan] " Samudrala, Sridhar
@ 2022-10-06  0:31       ` Joe Damato
  -1 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-06  0:31 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: intel-wired-lan, netdev, kuba, davem, anthony.l.nguyen,
	jesse.brandeburg, maciej.fijalkowski

On Wed, Oct 05, 2022 at 07:16:56PM -0500, Samudrala, Sridhar wrote:
> On 10/5/2022 4:21 PM, Joe Damato wrote:
> >Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores
> >the number TXs cleaned.
> >
> >Likewise, update i40e_clean_xdp_tx_irq and i40e_xmit_zc to do the same.
> >
> >Care has been taken to avoid changing the control flow of any functions
> >involved.
> >
> >Signed-off-by: Joe Damato <jdamato@fastly.com>
> >---
> >  drivers/net/ethernet/intel/i40e/i40e_txrx.c | 16 +++++++++++-----
> >  drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 15 +++++++++++----
> >  drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
> >  3 files changed, 24 insertions(+), 10 deletions(-)
> >
> >diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> >index b97c95f..a2cc98e 100644
> >--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> >+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> >@@ -923,11 +923,13 @@ void i40e_detect_recover_hung(struct i40e_vsi *vsi)
> >   * @vsi: the VSI we care about
> >   * @tx_ring: Tx ring to clean
> >   * @napi_budget: Used to determine if we are in netpoll
> >+ * @tx_cleaned: Out parameter set to the number of TXes cleaned
> >   *
> >   * Returns true if there's any budget left (e.g. the clean is finished)
> >   **/
> >  static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> >-			      struct i40e_ring *tx_ring, int napi_budget)
> >+			      struct i40e_ring *tx_ring, int napi_budget,
> >+			      unsigned int *tx_cleaned)
> >  {
> >  	int i = tx_ring->next_to_clean;
> >  	struct i40e_tx_buffer *tx_buf;
> >@@ -1026,7 +1028,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> >  	i40e_arm_wb(tx_ring, vsi, budget);
> >  	if (ring_is_xdp(tx_ring))
> >-		return !!budget;
> >+		goto out;
> >  	/* notify netdev of completed buffers */
> >  	netdev_tx_completed_queue(txring_txq(tx_ring),
> >@@ -1048,6 +1050,8 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> >  		}
> >  	}
> >+out:
> >+	*tx_cleaned = total_packets;
> >  	return !!budget;
> >  }
> >@@ -2689,10 +2693,12 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> >  			       container_of(napi, struct i40e_q_vector, napi);
> >  	struct i40e_vsi *vsi = q_vector->vsi;
> >  	struct i40e_ring *ring;
> >+	bool tx_clean_complete = true;
> >  	bool clean_complete = true;
> >  	bool arm_wb = false;
> >  	int budget_per_ring;
> >  	int work_done = 0;
> >+	unsigned int tx_cleaned = 0;
> >  	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
> >  		napi_complete(napi);
> >@@ -2704,11 +2710,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> >  	 */
> >  	i40e_for_each_ring(ring, q_vector->tx) {
> >  		bool wd = ring->xsk_pool ?
> >-			  i40e_clean_xdp_tx_irq(vsi, ring) :
> >-			  i40e_clean_tx_irq(vsi, ring, budget);
> >+			  i40e_clean_xdp_tx_irq(vsi, ring, &tx_cleaned) :
> >+			  i40e_clean_tx_irq(vsi, ring, budget, &tx_cleaned);
> >  		if (!wd) {
> >-			clean_complete = false;
> >+			clean_complete = tx_clean_complete = false;
> >  			continue;
> >  		}
> >  		arm_wb |= ring->arm_wb;
> >diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> >index 790aaeff..f98ce7e4 100644
> >--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> >+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> >@@ -530,18 +530,22 @@ static void i40e_set_rs_bit(struct i40e_ring *xdp_ring)
> >   * i40e_xmit_zc - Performs zero-copy Tx AF_XDP
> >   * @xdp_ring: XDP Tx ring
> >   * @budget: NAPI budget
> >+ * @tx_cleaned: Out parameter of the TX packets processed
> >   *
> >   * Returns true if the work is finished.
> >   **/
> >-static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> >+static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget,
> >+			 unsigned int *tx_cleaned)
> >  {
> >  	struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
> >  	u32 nb_pkts, nb_processed = 0;
> >  	unsigned int total_bytes = 0;
> >  	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
> >-	if (!nb_pkts)
> >+	if (!nb_pkts) {
> >+		*tx_cleaned = 0;
> >  		return true;
> >+	}
> >  	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
> >  		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
> >@@ -558,6 +562,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> >  	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
> >+	*tx_cleaned = nb_pkts;
> 
> With XDP, I don't think we should count these as tx_cleaned packets. These are transmitted
> packets. The tx_cleaned would be the xsk_frames counter in i40e_clean_xdp_tx_irq
> May be we need 2 counters for xdp.

I think there's two issues you are describing, which are separate in my
mind.

  1.) The name "tx_cleaned", and
  2.) Whether nb_pkts is the right thing to write as the out param.

For #1: I'm OK to change the name if that's the blocker here; please
suggest a suitable alternative that you'll accept.

For #2: nb_pkts is, IMO, the right value to bubble up to the tracepoint because
nb_pkts affects clean_complete in i40e_napi_poll which in turn determines
whether or not polling mode is entered.

The purpose of the tracepoint is to determine when/why/how you are entering
polling mode, so if nb_pkts plays a role in that calculation, it's the
right number to output.


> >  	return nb_pkts < budget;
> >  }
> >@@ -581,10 +586,12 @@ static void i40e_clean_xdp_tx_buffer(struct i40e_ring *tx_ring,
> >   * i40e_clean_xdp_tx_irq - Completes AF_XDP entries, and cleans XDP entries
> >   * @vsi: Current VSI
> >   * @tx_ring: XDP Tx ring
> >+ * @tx_cleaned: out parameter of number of TXes cleaned
> >   *
> >   * Returns true if cleanup/tranmission is done.
> >   **/
> >-bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring)
> >+bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring,
> >+			   unsigned int *tx_cleaned)
> >  {
> >  	struct xsk_buff_pool *bp = tx_ring->xsk_pool;
> >  	u32 i, completed_frames, xsk_frames = 0;
> >@@ -634,7 +641,7 @@ bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring)
> >  	if (xsk_uses_need_wakeup(tx_ring->xsk_pool))
> >  		xsk_set_tx_need_wakeup(tx_ring->xsk_pool);
> >-	return i40e_xmit_zc(tx_ring, I40E_DESC_UNUSED(tx_ring));
> >+	return i40e_xmit_zc(tx_ring, I40E_DESC_UNUSED(tx_ring), tx_cleaned);
> >  }
> >  /**
> >diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.h b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
> >index 821df24..396ed11 100644
> >--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.h
> >+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
> >@@ -30,7 +30,8 @@ int i40e_xsk_pool_setup(struct i40e_vsi *vsi, struct xsk_buff_pool *pool,
> >  bool i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 cleaned_count);
> >  int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget);
> >-bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring);
> >+bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring,
> >+			   unsigned int *tx_cleaned);
> >  int i40e_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags);
> >  int i40e_realloc_rx_bi_zc(struct i40e_vsi *vsi, bool zc);
> >  void i40e_clear_rx_bi_zc(struct i40e_ring *rx_ring);
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Intel-wired-lan] [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
@ 2022-10-06  0:31       ` Joe Damato
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-06  0:31 UTC (permalink / raw)
  To: Samudrala, Sridhar; +Cc: netdev, kuba, intel-wired-lan, davem

On Wed, Oct 05, 2022 at 07:16:56PM -0500, Samudrala, Sridhar wrote:
> On 10/5/2022 4:21 PM, Joe Damato wrote:
> >Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores
> >the number TXs cleaned.
> >
> >Likewise, update i40e_clean_xdp_tx_irq and i40e_xmit_zc to do the same.
> >
> >Care has been taken to avoid changing the control flow of any functions
> >involved.
> >
> >Signed-off-by: Joe Damato <jdamato@fastly.com>
> >---
> >  drivers/net/ethernet/intel/i40e/i40e_txrx.c | 16 +++++++++++-----
> >  drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 15 +++++++++++----
> >  drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
> >  3 files changed, 24 insertions(+), 10 deletions(-)
> >
> >diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> >index b97c95f..a2cc98e 100644
> >--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> >+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> >@@ -923,11 +923,13 @@ void i40e_detect_recover_hung(struct i40e_vsi *vsi)
> >   * @vsi: the VSI we care about
> >   * @tx_ring: Tx ring to clean
> >   * @napi_budget: Used to determine if we are in netpoll
> >+ * @tx_cleaned: Out parameter set to the number of TXes cleaned
> >   *
> >   * Returns true if there's any budget left (e.g. the clean is finished)
> >   **/
> >  static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> >-			      struct i40e_ring *tx_ring, int napi_budget)
> >+			      struct i40e_ring *tx_ring, int napi_budget,
> >+			      unsigned int *tx_cleaned)
> >  {
> >  	int i = tx_ring->next_to_clean;
> >  	struct i40e_tx_buffer *tx_buf;
> >@@ -1026,7 +1028,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> >  	i40e_arm_wb(tx_ring, vsi, budget);
> >  	if (ring_is_xdp(tx_ring))
> >-		return !!budget;
> >+		goto out;
> >  	/* notify netdev of completed buffers */
> >  	netdev_tx_completed_queue(txring_txq(tx_ring),
> >@@ -1048,6 +1050,8 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> >  		}
> >  	}
> >+out:
> >+	*tx_cleaned = total_packets;
> >  	return !!budget;
> >  }
> >@@ -2689,10 +2693,12 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> >  			       container_of(napi, struct i40e_q_vector, napi);
> >  	struct i40e_vsi *vsi = q_vector->vsi;
> >  	struct i40e_ring *ring;
> >+	bool tx_clean_complete = true;
> >  	bool clean_complete = true;
> >  	bool arm_wb = false;
> >  	int budget_per_ring;
> >  	int work_done = 0;
> >+	unsigned int tx_cleaned = 0;
> >  	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
> >  		napi_complete(napi);
> >@@ -2704,11 +2710,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> >  	 */
> >  	i40e_for_each_ring(ring, q_vector->tx) {
> >  		bool wd = ring->xsk_pool ?
> >-			  i40e_clean_xdp_tx_irq(vsi, ring) :
> >-			  i40e_clean_tx_irq(vsi, ring, budget);
> >+			  i40e_clean_xdp_tx_irq(vsi, ring, &tx_cleaned) :
> >+			  i40e_clean_tx_irq(vsi, ring, budget, &tx_cleaned);
> >  		if (!wd) {
> >-			clean_complete = false;
> >+			clean_complete = tx_clean_complete = false;
> >  			continue;
> >  		}
> >  		arm_wb |= ring->arm_wb;
> >diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> >index 790aaeff..f98ce7e4 100644
> >--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> >+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> >@@ -530,18 +530,22 @@ static void i40e_set_rs_bit(struct i40e_ring *xdp_ring)
> >   * i40e_xmit_zc - Performs zero-copy Tx AF_XDP
> >   * @xdp_ring: XDP Tx ring
> >   * @budget: NAPI budget
> >+ * @tx_cleaned: Out parameter of the TX packets processed
> >   *
> >   * Returns true if the work is finished.
> >   **/
> >-static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> >+static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget,
> >+			 unsigned int *tx_cleaned)
> >  {
> >  	struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
> >  	u32 nb_pkts, nb_processed = 0;
> >  	unsigned int total_bytes = 0;
> >  	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
> >-	if (!nb_pkts)
> >+	if (!nb_pkts) {
> >+		*tx_cleaned = 0;
> >  		return true;
> >+	}
> >  	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
> >  		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
> >@@ -558,6 +562,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> >  	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
> >+	*tx_cleaned = nb_pkts;
> 
> With XDP, I don't think we should count these as tx_cleaned packets. These are transmitted
> packets. The tx_cleaned would be the xsk_frames counter in i40e_clean_xdp_tx_irq
> May be we need 2 counters for xdp.

I think there's two issues you are describing, which are separate in my
mind.

  1.) The name "tx_cleaned", and
  2.) Whether nb_pkts is the right thing to write as the out param.

For #1: I'm OK to change the name if that's the blocker here; please
suggest a suitable alternative that you'll accept.

For #2: nb_pkts is, IMO, the right value to bubble up to the tracepoint because
nb_pkts affects clean_complete in i40e_napi_poll which in turn determines
whether or not polling mode is entered.

The purpose of the tracepoint is to determine when/why/how you are entering
polling mode, so if nb_pkts plays a role in that calculation, it's the
right number to output.


> >  	return nb_pkts < budget;
> >  }
> >@@ -581,10 +586,12 @@ static void i40e_clean_xdp_tx_buffer(struct i40e_ring *tx_ring,
> >   * i40e_clean_xdp_tx_irq - Completes AF_XDP entries, and cleans XDP entries
> >   * @vsi: Current VSI
> >   * @tx_ring: XDP Tx ring
> >+ * @tx_cleaned: out parameter of number of TXes cleaned
> >   *
> >   * Returns true if cleanup/tranmission is done.
> >   **/
> >-bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring)
> >+bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring,
> >+			   unsigned int *tx_cleaned)
> >  {
> >  	struct xsk_buff_pool *bp = tx_ring->xsk_pool;
> >  	u32 i, completed_frames, xsk_frames = 0;
> >@@ -634,7 +641,7 @@ bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring)
> >  	if (xsk_uses_need_wakeup(tx_ring->xsk_pool))
> >  		xsk_set_tx_need_wakeup(tx_ring->xsk_pool);
> >-	return i40e_xmit_zc(tx_ring, I40E_DESC_UNUSED(tx_ring));
> >+	return i40e_xmit_zc(tx_ring, I40E_DESC_UNUSED(tx_ring), tx_cleaned);
> >  }
> >  /**
> >diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.h b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
> >index 821df24..396ed11 100644
> >--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.h
> >+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.h
> >@@ -30,7 +30,8 @@ int i40e_xsk_pool_setup(struct i40e_vsi *vsi, struct xsk_buff_pool *pool,
> >  bool i40e_alloc_rx_buffers_zc(struct i40e_ring *rx_ring, u16 cleaned_count);
> >  int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget);
> >-bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring);
> >+bool i40e_clean_xdp_tx_irq(struct i40e_vsi *vsi, struct i40e_ring *tx_ring,
> >+			   unsigned int *tx_cleaned);
> >  int i40e_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags);
> >  int i40e_realloc_rx_bi_zc(struct i40e_vsi *vsi, bool zc);
> >  void i40e_clear_rx_bi_zc(struct i40e_ring *rx_ring);
> 
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Intel-wired-lan] [next-queue v2 3/4] i40e: Record number of RXes cleaned during NAPI
  2022-10-05 21:21   ` [Intel-wired-lan] " Joe Damato
@ 2022-10-06  0:36     ` Joe Damato
  -1 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-06  0:36 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: netdev, kuba, davem

On Wed, Oct 05, 2022 at 02:21:52PM -0700, Joe Damato wrote:
> Adjust i40e_clean_rx_irq and i40e_clean_rx_irq_zc to accept an out
> parameter which records the number of RX packets cleaned.

I just realized that this change probably also needs to include an
"rx_clean_complete" as was added in the previous patch for the TX case so
that when the tracepoint is hit it will be more clear which of the two (RX or
TX) triggered clean_complete = false.

I think the tracepoint should have separate bool flags for each of these
cases (but neither will be used to modify control flow as Jesse asked
earlier).

I'll leave that fix for the v3, in addition to addressing any other feedback on
the rest of the changes.
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [next-queue v2 3/4] i40e: Record number of RXes cleaned during NAPI
@ 2022-10-06  0:36     ` Joe Damato
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-06  0:36 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, kuba, davem, anthony.l.nguyen, jesse.brandeburg,
	maciej.fijalkowski

On Wed, Oct 05, 2022 at 02:21:52PM -0700, Joe Damato wrote:
> Adjust i40e_clean_rx_irq and i40e_clean_rx_irq_zc to accept an out
> parameter which records the number of RX packets cleaned.

I just realized that this change probably also needs to include an
"rx_clean_complete" as was added in the previous patch for the TX case so
that when the tracepoint is hit it will be more clear which of the two (RX or
TX) triggered clean_complete = false.

I think the tracepoint should have separate bool flags for each of these
cases (but neither will be used to modify control flow as Jesse asked
earlier).

I'll leave that fix for the v3, in addition to addressing any other feedback on
the rest of the changes.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
  2022-10-06  0:31       ` [Intel-wired-lan] " Joe Damato
@ 2022-10-06  1:00         ` Joe Damato
  -1 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-06  1:00 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: intel-wired-lan, netdev, kuba, davem, anthony.l.nguyen,
	jesse.brandeburg, maciej.fijalkowski

On Wed, Oct 05, 2022 at 05:31:04PM -0700, Joe Damato wrote:
> On Wed, Oct 05, 2022 at 07:16:56PM -0500, Samudrala, Sridhar wrote:
> > On 10/5/2022 4:21 PM, Joe Damato wrote:
> > >Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores
> > >the number TXs cleaned.
> > >
> > >Likewise, update i40e_clean_xdp_tx_irq and i40e_xmit_zc to do the same.
> > >
> > >Care has been taken to avoid changing the control flow of any functions
> > >involved.
> > >
> > >Signed-off-by: Joe Damato <jdamato@fastly.com>
> > >---
> > >  drivers/net/ethernet/intel/i40e/i40e_txrx.c | 16 +++++++++++-----
> > >  drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 15 +++++++++++----
> > >  drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
> > >  3 files changed, 24 insertions(+), 10 deletions(-)
> > >
> > >diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > >index b97c95f..a2cc98e 100644
> > >--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > >+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > >@@ -923,11 +923,13 @@ void i40e_detect_recover_hung(struct i40e_vsi *vsi)
> > >   * @vsi: the VSI we care about
> > >   * @tx_ring: Tx ring to clean
> > >   * @napi_budget: Used to determine if we are in netpoll
> > >+ * @tx_cleaned: Out parameter set to the number of TXes cleaned
> > >   *
> > >   * Returns true if there's any budget left (e.g. the clean is finished)
> > >   **/
> > >  static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> > >-			      struct i40e_ring *tx_ring, int napi_budget)
> > >+			      struct i40e_ring *tx_ring, int napi_budget,
> > >+			      unsigned int *tx_cleaned)
> > >  {
> > >  	int i = tx_ring->next_to_clean;
> > >  	struct i40e_tx_buffer *tx_buf;
> > >@@ -1026,7 +1028,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> > >  	i40e_arm_wb(tx_ring, vsi, budget);
> > >  	if (ring_is_xdp(tx_ring))
> > >-		return !!budget;
> > >+		goto out;
> > >  	/* notify netdev of completed buffers */
> > >  	netdev_tx_completed_queue(txring_txq(tx_ring),
> > >@@ -1048,6 +1050,8 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> > >  		}
> > >  	}
> > >+out:
> > >+	*tx_cleaned = total_packets;
> > >  	return !!budget;
> > >  }
> > >@@ -2689,10 +2693,12 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> > >  			       container_of(napi, struct i40e_q_vector, napi);
> > >  	struct i40e_vsi *vsi = q_vector->vsi;
> > >  	struct i40e_ring *ring;
> > >+	bool tx_clean_complete = true;
> > >  	bool clean_complete = true;
> > >  	bool arm_wb = false;
> > >  	int budget_per_ring;
> > >  	int work_done = 0;
> > >+	unsigned int tx_cleaned = 0;
> > >  	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
> > >  		napi_complete(napi);
> > >@@ -2704,11 +2710,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> > >  	 */
> > >  	i40e_for_each_ring(ring, q_vector->tx) {
> > >  		bool wd = ring->xsk_pool ?
> > >-			  i40e_clean_xdp_tx_irq(vsi, ring) :
> > >-			  i40e_clean_tx_irq(vsi, ring, budget);
> > >+			  i40e_clean_xdp_tx_irq(vsi, ring, &tx_cleaned) :
> > >+			  i40e_clean_tx_irq(vsi, ring, budget, &tx_cleaned);
> > >  		if (!wd) {
> > >-			clean_complete = false;
> > >+			clean_complete = tx_clean_complete = false;
> > >  			continue;
> > >  		}
> > >  		arm_wb |= ring->arm_wb;
> > >diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > >index 790aaeff..f98ce7e4 100644
> > >--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > >+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > >@@ -530,18 +530,22 @@ static void i40e_set_rs_bit(struct i40e_ring *xdp_ring)
> > >   * i40e_xmit_zc - Performs zero-copy Tx AF_XDP
> > >   * @xdp_ring: XDP Tx ring
> > >   * @budget: NAPI budget
> > >+ * @tx_cleaned: Out parameter of the TX packets processed
> > >   *
> > >   * Returns true if the work is finished.
> > >   **/
> > >-static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> > >+static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget,
> > >+			 unsigned int *tx_cleaned)
> > >  {
> > >  	struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
> > >  	u32 nb_pkts, nb_processed = 0;
> > >  	unsigned int total_bytes = 0;
> > >  	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
> > >-	if (!nb_pkts)
> > >+	if (!nb_pkts) {
> > >+		*tx_cleaned = 0;
> > >  		return true;
> > >+	}
> > >  	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
> > >  		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
> > >@@ -558,6 +562,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> > >  	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
> > >+	*tx_cleaned = nb_pkts;
> > 
> > With XDP, I don't think we should count these as tx_cleaned packets. These are transmitted
> > packets. The tx_cleaned would be the xsk_frames counter in i40e_clean_xdp_tx_irq
> > May be we need 2 counters for xdp.
> 
> I think there's two issues you are describing, which are separate in my
> mind.
> 
>   1.) The name "tx_cleaned", and
>   2.) Whether nb_pkts is the right thing to write as the out param.
> 
> For #1: I'm OK to change the name if that's the blocker here; please
> suggest a suitable alternative that you'll accept.
> 
> For #2: nb_pkts is, IMO, the right value to bubble up to the tracepoint because
> nb_pkts affects clean_complete in i40e_napi_poll which in turn determines
> whether or not polling mode is entered.
> 
> The purpose of the tracepoint is to determine when/why/how you are entering
> polling mode, so if nb_pkts plays a role in that calculation, it's the
> right number to output.

I suppose the alternative is to only fire the tracepoint when *not* in XDP.
Then the changes to the XDP stuff can be dropped and a separate set of
tracepoints for XDP can be created in the future.

That might reduce the complexity a bit, and will probably still be pretty
useful for people tuning their non-XDP workloads.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Intel-wired-lan] [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
@ 2022-10-06  1:00         ` Joe Damato
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-06  1:00 UTC (permalink / raw)
  To: Samudrala, Sridhar; +Cc: netdev, kuba, intel-wired-lan, davem

On Wed, Oct 05, 2022 at 05:31:04PM -0700, Joe Damato wrote:
> On Wed, Oct 05, 2022 at 07:16:56PM -0500, Samudrala, Sridhar wrote:
> > On 10/5/2022 4:21 PM, Joe Damato wrote:
> > >Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores
> > >the number TXs cleaned.
> > >
> > >Likewise, update i40e_clean_xdp_tx_irq and i40e_xmit_zc to do the same.
> > >
> > >Care has been taken to avoid changing the control flow of any functions
> > >involved.
> > >
> > >Signed-off-by: Joe Damato <jdamato@fastly.com>
> > >---
> > >  drivers/net/ethernet/intel/i40e/i40e_txrx.c | 16 +++++++++++-----
> > >  drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 15 +++++++++++----
> > >  drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
> > >  3 files changed, 24 insertions(+), 10 deletions(-)
> > >
> > >diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > >index b97c95f..a2cc98e 100644
> > >--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > >+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > >@@ -923,11 +923,13 @@ void i40e_detect_recover_hung(struct i40e_vsi *vsi)
> > >   * @vsi: the VSI we care about
> > >   * @tx_ring: Tx ring to clean
> > >   * @napi_budget: Used to determine if we are in netpoll
> > >+ * @tx_cleaned: Out parameter set to the number of TXes cleaned
> > >   *
> > >   * Returns true if there's any budget left (e.g. the clean is finished)
> > >   **/
> > >  static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> > >-			      struct i40e_ring *tx_ring, int napi_budget)
> > >+			      struct i40e_ring *tx_ring, int napi_budget,
> > >+			      unsigned int *tx_cleaned)
> > >  {
> > >  	int i = tx_ring->next_to_clean;
> > >  	struct i40e_tx_buffer *tx_buf;
> > >@@ -1026,7 +1028,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> > >  	i40e_arm_wb(tx_ring, vsi, budget);
> > >  	if (ring_is_xdp(tx_ring))
> > >-		return !!budget;
> > >+		goto out;
> > >  	/* notify netdev of completed buffers */
> > >  	netdev_tx_completed_queue(txring_txq(tx_ring),
> > >@@ -1048,6 +1050,8 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> > >  		}
> > >  	}
> > >+out:
> > >+	*tx_cleaned = total_packets;
> > >  	return !!budget;
> > >  }
> > >@@ -2689,10 +2693,12 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> > >  			       container_of(napi, struct i40e_q_vector, napi);
> > >  	struct i40e_vsi *vsi = q_vector->vsi;
> > >  	struct i40e_ring *ring;
> > >+	bool tx_clean_complete = true;
> > >  	bool clean_complete = true;
> > >  	bool arm_wb = false;
> > >  	int budget_per_ring;
> > >  	int work_done = 0;
> > >+	unsigned int tx_cleaned = 0;
> > >  	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
> > >  		napi_complete(napi);
> > >@@ -2704,11 +2710,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> > >  	 */
> > >  	i40e_for_each_ring(ring, q_vector->tx) {
> > >  		bool wd = ring->xsk_pool ?
> > >-			  i40e_clean_xdp_tx_irq(vsi, ring) :
> > >-			  i40e_clean_tx_irq(vsi, ring, budget);
> > >+			  i40e_clean_xdp_tx_irq(vsi, ring, &tx_cleaned) :
> > >+			  i40e_clean_tx_irq(vsi, ring, budget, &tx_cleaned);
> > >  		if (!wd) {
> > >-			clean_complete = false;
> > >+			clean_complete = tx_clean_complete = false;
> > >  			continue;
> > >  		}
> > >  		arm_wb |= ring->arm_wb;
> > >diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > >index 790aaeff..f98ce7e4 100644
> > >--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > >+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > >@@ -530,18 +530,22 @@ static void i40e_set_rs_bit(struct i40e_ring *xdp_ring)
> > >   * i40e_xmit_zc - Performs zero-copy Tx AF_XDP
> > >   * @xdp_ring: XDP Tx ring
> > >   * @budget: NAPI budget
> > >+ * @tx_cleaned: Out parameter of the TX packets processed
> > >   *
> > >   * Returns true if the work is finished.
> > >   **/
> > >-static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> > >+static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget,
> > >+			 unsigned int *tx_cleaned)
> > >  {
> > >  	struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
> > >  	u32 nb_pkts, nb_processed = 0;
> > >  	unsigned int total_bytes = 0;
> > >  	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
> > >-	if (!nb_pkts)
> > >+	if (!nb_pkts) {
> > >+		*tx_cleaned = 0;
> > >  		return true;
> > >+	}
> > >  	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
> > >  		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
> > >@@ -558,6 +562,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> > >  	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
> > >+	*tx_cleaned = nb_pkts;
> > 
> > With XDP, I don't think we should count these as tx_cleaned packets. These are transmitted
> > packets. The tx_cleaned would be the xsk_frames counter in i40e_clean_xdp_tx_irq
> > May be we need 2 counters for xdp.
> 
> I think there's two issues you are describing, which are separate in my
> mind.
> 
>   1.) The name "tx_cleaned", and
>   2.) Whether nb_pkts is the right thing to write as the out param.
> 
> For #1: I'm OK to change the name if that's the blocker here; please
> suggest a suitable alternative that you'll accept.
> 
> For #2: nb_pkts is, IMO, the right value to bubble up to the tracepoint because
> nb_pkts affects clean_complete in i40e_napi_poll which in turn determines
> whether or not polling mode is entered.
> 
> The purpose of the tracepoint is to determine when/why/how you are entering
> polling mode, so if nb_pkts plays a role in that calculation, it's the
> right number to output.

I suppose the alternative is to only fire the tracepoint when *not* in XDP.
Then the changes to the XDP stuff can be dropped and a separate set of
tracepoints for XDP can be created in the future.

That might reduce the complexity a bit, and will probably still be pretty
useful for people tuning their non-XDP workloads.
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
  2022-10-06  1:00         ` [Intel-wired-lan] " Joe Damato
@ 2022-10-06 13:03           ` Maciej Fijalkowski
  -1 siblings, 0 replies; 30+ messages in thread
From: Maciej Fijalkowski @ 2022-10-06 13:03 UTC (permalink / raw)
  To: Joe Damato
  Cc: Samudrala, Sridhar, intel-wired-lan, netdev, kuba, davem,
	anthony.l.nguyen, jesse.brandeburg

On Wed, Oct 05, 2022 at 06:00:24PM -0700, Joe Damato wrote:
> On Wed, Oct 05, 2022 at 05:31:04PM -0700, Joe Damato wrote:
> > On Wed, Oct 05, 2022 at 07:16:56PM -0500, Samudrala, Sridhar wrote:
> > > On 10/5/2022 4:21 PM, Joe Damato wrote:
> > > >Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores
> > > >the number TXs cleaned.
> > > >
> > > >Likewise, update i40e_clean_xdp_tx_irq and i40e_xmit_zc to do the same.
> > > >
> > > >Care has been taken to avoid changing the control flow of any functions
> > > >involved.
> > > >
> > > >Signed-off-by: Joe Damato <jdamato@fastly.com>
> > > >---
> > > >  drivers/net/ethernet/intel/i40e/i40e_txrx.c | 16 +++++++++++-----
> > > >  drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 15 +++++++++++----
> > > >  drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
> > > >  3 files changed, 24 insertions(+), 10 deletions(-)
> > > >
> > > >diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > > >index b97c95f..a2cc98e 100644
> > > >--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > > >+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > > >@@ -923,11 +923,13 @@ void i40e_detect_recover_hung(struct i40e_vsi *vsi)
> > > >   * @vsi: the VSI we care about
> > > >   * @tx_ring: Tx ring to clean
> > > >   * @napi_budget: Used to determine if we are in netpoll
> > > >+ * @tx_cleaned: Out parameter set to the number of TXes cleaned
> > > >   *
> > > >   * Returns true if there's any budget left (e.g. the clean is finished)
> > > >   **/
> > > >  static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> > > >-			      struct i40e_ring *tx_ring, int napi_budget)
> > > >+			      struct i40e_ring *tx_ring, int napi_budget,
> > > >+			      unsigned int *tx_cleaned)
> > > >  {
> > > >  	int i = tx_ring->next_to_clean;
> > > >  	struct i40e_tx_buffer *tx_buf;
> > > >@@ -1026,7 +1028,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> > > >  	i40e_arm_wb(tx_ring, vsi, budget);
> > > >  	if (ring_is_xdp(tx_ring))
> > > >-		return !!budget;
> > > >+		goto out;
> > > >  	/* notify netdev of completed buffers */
> > > >  	netdev_tx_completed_queue(txring_txq(tx_ring),
> > > >@@ -1048,6 +1050,8 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> > > >  		}
> > > >  	}
> > > >+out:
> > > >+	*tx_cleaned = total_packets;
> > > >  	return !!budget;
> > > >  }
> > > >@@ -2689,10 +2693,12 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> > > >  			       container_of(napi, struct i40e_q_vector, napi);
> > > >  	struct i40e_vsi *vsi = q_vector->vsi;
> > > >  	struct i40e_ring *ring;
> > > >+	bool tx_clean_complete = true;
> > > >  	bool clean_complete = true;
> > > >  	bool arm_wb = false;
> > > >  	int budget_per_ring;
> > > >  	int work_done = 0;
> > > >+	unsigned int tx_cleaned = 0;
> > > >  	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
> > > >  		napi_complete(napi);
> > > >@@ -2704,11 +2710,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> > > >  	 */
> > > >  	i40e_for_each_ring(ring, q_vector->tx) {
> > > >  		bool wd = ring->xsk_pool ?
> > > >-			  i40e_clean_xdp_tx_irq(vsi, ring) :
> > > >-			  i40e_clean_tx_irq(vsi, ring, budget);
> > > >+			  i40e_clean_xdp_tx_irq(vsi, ring, &tx_cleaned) :
> > > >+			  i40e_clean_tx_irq(vsi, ring, budget, &tx_cleaned);
> > > >  		if (!wd) {
> > > >-			clean_complete = false;
> > > >+			clean_complete = tx_clean_complete = false;
> > > >  			continue;
> > > >  		}
> > > >  		arm_wb |= ring->arm_wb;
> > > >diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > > >index 790aaeff..f98ce7e4 100644
> > > >--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > > >+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > > >@@ -530,18 +530,22 @@ static void i40e_set_rs_bit(struct i40e_ring *xdp_ring)
> > > >   * i40e_xmit_zc - Performs zero-copy Tx AF_XDP
> > > >   * @xdp_ring: XDP Tx ring
> > > >   * @budget: NAPI budget
> > > >+ * @tx_cleaned: Out parameter of the TX packets processed
> > > >   *
> > > >   * Returns true if the work is finished.
> > > >   **/
> > > >-static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> > > >+static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget,
> > > >+			 unsigned int *tx_cleaned)
> > > >  {
> > > >  	struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
> > > >  	u32 nb_pkts, nb_processed = 0;
> > > >  	unsigned int total_bytes = 0;
> > > >  	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
> > > >-	if (!nb_pkts)
> > > >+	if (!nb_pkts) {
> > > >+		*tx_cleaned = 0;
> > > >  		return true;
> > > >+	}
> > > >  	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
> > > >  		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
> > > >@@ -558,6 +562,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> > > >  	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
> > > >+	*tx_cleaned = nb_pkts;
> > > 
> > > With XDP, I don't think we should count these as tx_cleaned packets. These are transmitted
> > > packets. The tx_cleaned would be the xsk_frames counter in i40e_clean_xdp_tx_irq
> > > May be we need 2 counters for xdp.
> > 
> > I think there's two issues you are describing, which are separate in my
> > mind.
> > 
> >   1.) The name "tx_cleaned", and
> >   2.) Whether nb_pkts is the right thing to write as the out param.
> > 
> > For #1: I'm OK to change the name if that's the blocker here; please
> > suggest a suitable alternative that you'll accept.
> > 
> > For #2: nb_pkts is, IMO, the right value to bubble up to the tracepoint because
> > nb_pkts affects clean_complete in i40e_napi_poll which in turn determines
> > whether or not polling mode is entered.
> > 
> > The purpose of the tracepoint is to determine when/why/how you are entering
> > polling mode, so if nb_pkts plays a role in that calculation, it's the
> > right number to output.
> 
> I suppose the alternative is to only fire the tracepoint when *not* in XDP.
> Then the changes to the XDP stuff can be dropped and a separate set of
> tracepoints for XDP can be created in the future.

Let's be clear that it's the AF_XDP quirk that we have in here that actual
xmit happens within NAPI polling routine.

Sridhar is right with having xsk_frames as tx_cleaned but you're also
right that nb_pkts affects napi polling. But then if you look at Rx side
there is an analogous case with buffer allocation affecting napi polling.

> 
> That might reduce the complexity a bit, and will probably still be pretty
> useful for people tuning their non-XDP workloads.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Intel-wired-lan] [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
@ 2022-10-06 13:03           ` Maciej Fijalkowski
  0 siblings, 0 replies; 30+ messages in thread
From: Maciej Fijalkowski @ 2022-10-06 13:03 UTC (permalink / raw)
  To: Joe Damato; +Cc: netdev, kuba, intel-wired-lan, davem

On Wed, Oct 05, 2022 at 06:00:24PM -0700, Joe Damato wrote:
> On Wed, Oct 05, 2022 at 05:31:04PM -0700, Joe Damato wrote:
> > On Wed, Oct 05, 2022 at 07:16:56PM -0500, Samudrala, Sridhar wrote:
> > > On 10/5/2022 4:21 PM, Joe Damato wrote:
> > > >Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores
> > > >the number TXs cleaned.
> > > >
> > > >Likewise, update i40e_clean_xdp_tx_irq and i40e_xmit_zc to do the same.
> > > >
> > > >Care has been taken to avoid changing the control flow of any functions
> > > >involved.
> > > >
> > > >Signed-off-by: Joe Damato <jdamato@fastly.com>
> > > >---
> > > >  drivers/net/ethernet/intel/i40e/i40e_txrx.c | 16 +++++++++++-----
> > > >  drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 15 +++++++++++----
> > > >  drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
> > > >  3 files changed, 24 insertions(+), 10 deletions(-)
> > > >
> > > >diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > > >index b97c95f..a2cc98e 100644
> > > >--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > > >+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > > >@@ -923,11 +923,13 @@ void i40e_detect_recover_hung(struct i40e_vsi *vsi)
> > > >   * @vsi: the VSI we care about
> > > >   * @tx_ring: Tx ring to clean
> > > >   * @napi_budget: Used to determine if we are in netpoll
> > > >+ * @tx_cleaned: Out parameter set to the number of TXes cleaned
> > > >   *
> > > >   * Returns true if there's any budget left (e.g. the clean is finished)
> > > >   **/
> > > >  static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> > > >-			      struct i40e_ring *tx_ring, int napi_budget)
> > > >+			      struct i40e_ring *tx_ring, int napi_budget,
> > > >+			      unsigned int *tx_cleaned)
> > > >  {
> > > >  	int i = tx_ring->next_to_clean;
> > > >  	struct i40e_tx_buffer *tx_buf;
> > > >@@ -1026,7 +1028,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> > > >  	i40e_arm_wb(tx_ring, vsi, budget);
> > > >  	if (ring_is_xdp(tx_ring))
> > > >-		return !!budget;
> > > >+		goto out;
> > > >  	/* notify netdev of completed buffers */
> > > >  	netdev_tx_completed_queue(txring_txq(tx_ring),
> > > >@@ -1048,6 +1050,8 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> > > >  		}
> > > >  	}
> > > >+out:
> > > >+	*tx_cleaned = total_packets;
> > > >  	return !!budget;
> > > >  }
> > > >@@ -2689,10 +2693,12 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> > > >  			       container_of(napi, struct i40e_q_vector, napi);
> > > >  	struct i40e_vsi *vsi = q_vector->vsi;
> > > >  	struct i40e_ring *ring;
> > > >+	bool tx_clean_complete = true;
> > > >  	bool clean_complete = true;
> > > >  	bool arm_wb = false;
> > > >  	int budget_per_ring;
> > > >  	int work_done = 0;
> > > >+	unsigned int tx_cleaned = 0;
> > > >  	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
> > > >  		napi_complete(napi);
> > > >@@ -2704,11 +2710,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> > > >  	 */
> > > >  	i40e_for_each_ring(ring, q_vector->tx) {
> > > >  		bool wd = ring->xsk_pool ?
> > > >-			  i40e_clean_xdp_tx_irq(vsi, ring) :
> > > >-			  i40e_clean_tx_irq(vsi, ring, budget);
> > > >+			  i40e_clean_xdp_tx_irq(vsi, ring, &tx_cleaned) :
> > > >+			  i40e_clean_tx_irq(vsi, ring, budget, &tx_cleaned);
> > > >  		if (!wd) {
> > > >-			clean_complete = false;
> > > >+			clean_complete = tx_clean_complete = false;
> > > >  			continue;
> > > >  		}
> > > >  		arm_wb |= ring->arm_wb;
> > > >diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > > >index 790aaeff..f98ce7e4 100644
> > > >--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > > >+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > > >@@ -530,18 +530,22 @@ static void i40e_set_rs_bit(struct i40e_ring *xdp_ring)
> > > >   * i40e_xmit_zc - Performs zero-copy Tx AF_XDP
> > > >   * @xdp_ring: XDP Tx ring
> > > >   * @budget: NAPI budget
> > > >+ * @tx_cleaned: Out parameter of the TX packets processed
> > > >   *
> > > >   * Returns true if the work is finished.
> > > >   **/
> > > >-static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> > > >+static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget,
> > > >+			 unsigned int *tx_cleaned)
> > > >  {
> > > >  	struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
> > > >  	u32 nb_pkts, nb_processed = 0;
> > > >  	unsigned int total_bytes = 0;
> > > >  	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
> > > >-	if (!nb_pkts)
> > > >+	if (!nb_pkts) {
> > > >+		*tx_cleaned = 0;
> > > >  		return true;
> > > >+	}
> > > >  	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
> > > >  		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
> > > >@@ -558,6 +562,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> > > >  	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
> > > >+	*tx_cleaned = nb_pkts;
> > > 
> > > With XDP, I don't think we should count these as tx_cleaned packets. These are transmitted
> > > packets. The tx_cleaned would be the xsk_frames counter in i40e_clean_xdp_tx_irq
> > > May be we need 2 counters for xdp.
> > 
> > I think there's two issues you are describing, which are separate in my
> > mind.
> > 
> >   1.) The name "tx_cleaned", and
> >   2.) Whether nb_pkts is the right thing to write as the out param.
> > 
> > For #1: I'm OK to change the name if that's the blocker here; please
> > suggest a suitable alternative that you'll accept.
> > 
> > For #2: nb_pkts is, IMO, the right value to bubble up to the tracepoint because
> > nb_pkts affects clean_complete in i40e_napi_poll which in turn determines
> > whether or not polling mode is entered.
> > 
> > The purpose of the tracepoint is to determine when/why/how you are entering
> > polling mode, so if nb_pkts plays a role in that calculation, it's the
> > right number to output.
> 
> I suppose the alternative is to only fire the tracepoint when *not* in XDP.
> Then the changes to the XDP stuff can be dropped and a separate set of
> tracepoints for XDP can be created in the future.

Let's be clear that it's the AF_XDP quirk that we have in here that actual
xmit happens within NAPI polling routine.

Sridhar is right with having xsk_frames as tx_cleaned but you're also
right that nb_pkts affects napi polling. But then if you look at Rx side
there is an analogous case with buffer allocation affecting napi polling.

> 
> That might reduce the complexity a bit, and will probably still be pretty
> useful for people tuning their non-XDP workloads.
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Intel-wired-lan] [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
  2022-10-06 13:03           ` [Intel-wired-lan] " Maciej Fijalkowski
@ 2022-10-06 14:57             ` Samudrala, Sridhar
  -1 siblings, 0 replies; 30+ messages in thread
From: Samudrala, Sridhar @ 2022-10-06 14:57 UTC (permalink / raw)
  To: Maciej Fijalkowski, Joe Damato; +Cc: netdev, kuba, intel-wired-lan, davem

On 10/6/2022 8:03 AM, Maciej Fijalkowski wrote:
> On Wed, Oct 05, 2022 at 06:00:24PM -0700, Joe Damato wrote:
>> On Wed, Oct 05, 2022 at 05:31:04PM -0700, Joe Damato wrote:
>>> On Wed, Oct 05, 2022 at 07:16:56PM -0500, Samudrala, Sridhar wrote:
>>>> On 10/5/2022 4:21 PM, Joe Damato wrote:
>>>>> Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores
>>>>> the number TXs cleaned.
>>>>>
>>>>> Likewise, update i40e_clean_xdp_tx_irq and i40e_xmit_zc to do the same.
>>>>>
>>>>> Care has been taken to avoid changing the control flow of any functions
>>>>> involved.
>>>>>
>>>>> Signed-off-by: Joe Damato <jdamato@fastly.com>
>>>>> ---
>>>>>   drivers/net/ethernet/intel/i40e/i40e_txrx.c | 16 +++++++++++-----
>>>>>   drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 15 +++++++++++----
>>>>>   drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
>>>>>   3 files changed, 24 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
>>>>> index b97c95f..a2cc98e 100644
>>>>> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
>>>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
>>>>> @@ -923,11 +923,13 @@ void i40e_detect_recover_hung(struct i40e_vsi *vsi)
>>>>>    * @vsi: the VSI we care about
>>>>>    * @tx_ring: Tx ring to clean
>>>>>    * @napi_budget: Used to determine if we are in netpoll
>>>>> + * @tx_cleaned: Out parameter set to the number of TXes cleaned
>>>>>    *
>>>>>    * Returns true if there's any budget left (e.g. the clean is finished)
>>>>>    **/
>>>>>   static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
>>>>> -			      struct i40e_ring *tx_ring, int napi_budget)
>>>>> +			      struct i40e_ring *tx_ring, int napi_budget,
>>>>> +			      unsigned int *tx_cleaned)
>>>>>   {
>>>>>   	int i = tx_ring->next_to_clean;
>>>>>   	struct i40e_tx_buffer *tx_buf;
>>>>> @@ -1026,7 +1028,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
>>>>>   	i40e_arm_wb(tx_ring, vsi, budget);
>>>>>   	if (ring_is_xdp(tx_ring))
>>>>> -		return !!budget;
>>>>> +		goto out;
>>>>>   	/* notify netdev of completed buffers */
>>>>>   	netdev_tx_completed_queue(txring_txq(tx_ring),
>>>>> @@ -1048,6 +1050,8 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
>>>>>   		}
>>>>>   	}
>>>>> +out:
>>>>> +	*tx_cleaned = total_packets;
>>>>>   	return !!budget;
>>>>>   }
>>>>> @@ -2689,10 +2693,12 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
>>>>>   			       container_of(napi, struct i40e_q_vector, napi);
>>>>>   	struct i40e_vsi *vsi = q_vector->vsi;
>>>>>   	struct i40e_ring *ring;
>>>>> +	bool tx_clean_complete = true;
>>>>>   	bool clean_complete = true;
>>>>>   	bool arm_wb = false;
>>>>>   	int budget_per_ring;
>>>>>   	int work_done = 0;
>>>>> +	unsigned int tx_cleaned = 0;
>>>>>   	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
>>>>>   		napi_complete(napi);
>>>>> @@ -2704,11 +2710,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
>>>>>   	 */
>>>>>   	i40e_for_each_ring(ring, q_vector->tx) {
>>>>>   		bool wd = ring->xsk_pool ?
>>>>> -			  i40e_clean_xdp_tx_irq(vsi, ring) :
>>>>> -			  i40e_clean_tx_irq(vsi, ring, budget);
>>>>> +			  i40e_clean_xdp_tx_irq(vsi, ring, &tx_cleaned) :
>>>>> +			  i40e_clean_tx_irq(vsi, ring, budget, &tx_cleaned);
>>>>>   		if (!wd) {
>>>>> -			clean_complete = false;
>>>>> +			clean_complete = tx_clean_complete = false;
>>>>>   			continue;
>>>>>   		}
>>>>>   		arm_wb |= ring->arm_wb;
>>>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
>>>>> index 790aaeff..f98ce7e4 100644
>>>>> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
>>>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
>>>>> @@ -530,18 +530,22 @@ static void i40e_set_rs_bit(struct i40e_ring *xdp_ring)
>>>>>    * i40e_xmit_zc - Performs zero-copy Tx AF_XDP
>>>>>    * @xdp_ring: XDP Tx ring
>>>>>    * @budget: NAPI budget
>>>>> + * @tx_cleaned: Out parameter of the TX packets processed
>>>>>    *
>>>>>    * Returns true if the work is finished.
>>>>>    **/
>>>>> -static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
>>>>> +static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget,
>>>>> +			 unsigned int *tx_cleaned)
>>>>>   {
>>>>>   	struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
>>>>>   	u32 nb_pkts, nb_processed = 0;
>>>>>   	unsigned int total_bytes = 0;
>>>>>   	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
>>>>> -	if (!nb_pkts)
>>>>> +	if (!nb_pkts) {
>>>>> +		*tx_cleaned = 0;
>>>>>   		return true;
>>>>> +	}
>>>>>   	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
>>>>>   		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
>>>>> @@ -558,6 +562,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
>>>>>   	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
>>>>> +	*tx_cleaned = nb_pkts;
>>>> With XDP, I don't think we should count these as tx_cleaned packets. These are transmitted
>>>> packets. The tx_cleaned would be the xsk_frames counter in i40e_clean_xdp_tx_irq
>>>> May be we need 2 counters for xdp.
>>> I think there's two issues you are describing, which are separate in my
>>> mind.
>>>
>>>    1.) The name "tx_cleaned", and
>>>    2.) Whether nb_pkts is the right thing to write as the out param.
>>>
>>> For #1: I'm OK to change the name if that's the blocker here; please
>>> suggest a suitable alternative that you'll accept.
>>>
>>> For #2: nb_pkts is, IMO, the right value to bubble up to the tracepoint because
>>> nb_pkts affects clean_complete in i40e_napi_poll which in turn determines
>>> whether or not polling mode is entered.
>>>
>>> The purpose of the tracepoint is to determine when/why/how you are entering
>>> polling mode, so if nb_pkts plays a role in that calculation, it's the
>>> right number to output.
>> I suppose the alternative is to only fire the tracepoint when *not* in XDP.
>> Then the changes to the XDP stuff can be dropped and a separate set of
>> tracepoints for XDP can be created in the future.
> Let's be clear that it's the AF_XDP quirk that we have in here that actual
> xmit happens within NAPI polling routine.
>
> Sridhar is right with having xsk_frames as tx_cleaned but you're also
> right that nb_pkts affects napi polling. But then if you look at Rx side
> there is an analogous case with buffer allocation affecting napi polling.

To be correct,  I would suggest 2 out parameters to i40e_clean_xdp_tx_irq()
tx_cleaned and xdp_transmitted.  tx_cleaned should be filled in
with xsk_frames. Add xdp_transmitted as an out parameter to i40e_xmit_zc()
and fill it with nb_pkts.

I am not completely clear on the reasoning behind setting clean_complete
based on number of packets transmitted in case of XDP.


>
>> That might reduce the complexity a bit, and will probably still be pretty
>> useful for people tuning their non-XDP workloads.

This option is fine too.


_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
@ 2022-10-06 14:57             ` Samudrala, Sridhar
  0 siblings, 0 replies; 30+ messages in thread
From: Samudrala, Sridhar @ 2022-10-06 14:57 UTC (permalink / raw)
  To: Maciej Fijalkowski, Joe Damato
  Cc: intel-wired-lan, netdev, kuba, davem, anthony.l.nguyen, jesse.brandeburg

On 10/6/2022 8:03 AM, Maciej Fijalkowski wrote:
> On Wed, Oct 05, 2022 at 06:00:24PM -0700, Joe Damato wrote:
>> On Wed, Oct 05, 2022 at 05:31:04PM -0700, Joe Damato wrote:
>>> On Wed, Oct 05, 2022 at 07:16:56PM -0500, Samudrala, Sridhar wrote:
>>>> On 10/5/2022 4:21 PM, Joe Damato wrote:
>>>>> Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores
>>>>> the number TXs cleaned.
>>>>>
>>>>> Likewise, update i40e_clean_xdp_tx_irq and i40e_xmit_zc to do the same.
>>>>>
>>>>> Care has been taken to avoid changing the control flow of any functions
>>>>> involved.
>>>>>
>>>>> Signed-off-by: Joe Damato <jdamato@fastly.com>
>>>>> ---
>>>>>   drivers/net/ethernet/intel/i40e/i40e_txrx.c | 16 +++++++++++-----
>>>>>   drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 15 +++++++++++----
>>>>>   drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
>>>>>   3 files changed, 24 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
>>>>> index b97c95f..a2cc98e 100644
>>>>> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
>>>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
>>>>> @@ -923,11 +923,13 @@ void i40e_detect_recover_hung(struct i40e_vsi *vsi)
>>>>>    * @vsi: the VSI we care about
>>>>>    * @tx_ring: Tx ring to clean
>>>>>    * @napi_budget: Used to determine if we are in netpoll
>>>>> + * @tx_cleaned: Out parameter set to the number of TXes cleaned
>>>>>    *
>>>>>    * Returns true if there's any budget left (e.g. the clean is finished)
>>>>>    **/
>>>>>   static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
>>>>> -			      struct i40e_ring *tx_ring, int napi_budget)
>>>>> +			      struct i40e_ring *tx_ring, int napi_budget,
>>>>> +			      unsigned int *tx_cleaned)
>>>>>   {
>>>>>   	int i = tx_ring->next_to_clean;
>>>>>   	struct i40e_tx_buffer *tx_buf;
>>>>> @@ -1026,7 +1028,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
>>>>>   	i40e_arm_wb(tx_ring, vsi, budget);
>>>>>   	if (ring_is_xdp(tx_ring))
>>>>> -		return !!budget;
>>>>> +		goto out;
>>>>>   	/* notify netdev of completed buffers */
>>>>>   	netdev_tx_completed_queue(txring_txq(tx_ring),
>>>>> @@ -1048,6 +1050,8 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
>>>>>   		}
>>>>>   	}
>>>>> +out:
>>>>> +	*tx_cleaned = total_packets;
>>>>>   	return !!budget;
>>>>>   }
>>>>> @@ -2689,10 +2693,12 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
>>>>>   			       container_of(napi, struct i40e_q_vector, napi);
>>>>>   	struct i40e_vsi *vsi = q_vector->vsi;
>>>>>   	struct i40e_ring *ring;
>>>>> +	bool tx_clean_complete = true;
>>>>>   	bool clean_complete = true;
>>>>>   	bool arm_wb = false;
>>>>>   	int budget_per_ring;
>>>>>   	int work_done = 0;
>>>>> +	unsigned int tx_cleaned = 0;
>>>>>   	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
>>>>>   		napi_complete(napi);
>>>>> @@ -2704,11 +2710,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
>>>>>   	 */
>>>>>   	i40e_for_each_ring(ring, q_vector->tx) {
>>>>>   		bool wd = ring->xsk_pool ?
>>>>> -			  i40e_clean_xdp_tx_irq(vsi, ring) :
>>>>> -			  i40e_clean_tx_irq(vsi, ring, budget);
>>>>> +			  i40e_clean_xdp_tx_irq(vsi, ring, &tx_cleaned) :
>>>>> +			  i40e_clean_tx_irq(vsi, ring, budget, &tx_cleaned);
>>>>>   		if (!wd) {
>>>>> -			clean_complete = false;
>>>>> +			clean_complete = tx_clean_complete = false;
>>>>>   			continue;
>>>>>   		}
>>>>>   		arm_wb |= ring->arm_wb;
>>>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
>>>>> index 790aaeff..f98ce7e4 100644
>>>>> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
>>>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
>>>>> @@ -530,18 +530,22 @@ static void i40e_set_rs_bit(struct i40e_ring *xdp_ring)
>>>>>    * i40e_xmit_zc - Performs zero-copy Tx AF_XDP
>>>>>    * @xdp_ring: XDP Tx ring
>>>>>    * @budget: NAPI budget
>>>>> + * @tx_cleaned: Out parameter of the TX packets processed
>>>>>    *
>>>>>    * Returns true if the work is finished.
>>>>>    **/
>>>>> -static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
>>>>> +static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget,
>>>>> +			 unsigned int *tx_cleaned)
>>>>>   {
>>>>>   	struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
>>>>>   	u32 nb_pkts, nb_processed = 0;
>>>>>   	unsigned int total_bytes = 0;
>>>>>   	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
>>>>> -	if (!nb_pkts)
>>>>> +	if (!nb_pkts) {
>>>>> +		*tx_cleaned = 0;
>>>>>   		return true;
>>>>> +	}
>>>>>   	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
>>>>>   		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
>>>>> @@ -558,6 +562,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
>>>>>   	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
>>>>> +	*tx_cleaned = nb_pkts;
>>>> With XDP, I don't think we should count these as tx_cleaned packets. These are transmitted
>>>> packets. The tx_cleaned would be the xsk_frames counter in i40e_clean_xdp_tx_irq
>>>> May be we need 2 counters for xdp.
>>> I think there's two issues you are describing, which are separate in my
>>> mind.
>>>
>>>    1.) The name "tx_cleaned", and
>>>    2.) Whether nb_pkts is the right thing to write as the out param.
>>>
>>> For #1: I'm OK to change the name if that's the blocker here; please
>>> suggest a suitable alternative that you'll accept.
>>>
>>> For #2: nb_pkts is, IMO, the right value to bubble up to the tracepoint because
>>> nb_pkts affects clean_complete in i40e_napi_poll which in turn determines
>>> whether or not polling mode is entered.
>>>
>>> The purpose of the tracepoint is to determine when/why/how you are entering
>>> polling mode, so if nb_pkts plays a role in that calculation, it's the
>>> right number to output.
>> I suppose the alternative is to only fire the tracepoint when *not* in XDP.
>> Then the changes to the XDP stuff can be dropped and a separate set of
>> tracepoints for XDP can be created in the future.
> Let's be clear that it's the AF_XDP quirk that we have in here that actual
> xmit happens within NAPI polling routine.
>
> Sridhar is right with having xsk_frames as tx_cleaned but you're also
> right that nb_pkts affects napi polling. But then if you look at Rx side
> there is an analogous case with buffer allocation affecting napi polling.

To be correct,  I would suggest 2 out parameters to i40e_clean_xdp_tx_irq()
tx_cleaned and xdp_transmitted.  tx_cleaned should be filled in
with xsk_frames. Add xdp_transmitted as an out parameter to i40e_xmit_zc()
and fill it with nb_pkts.

I am not completely clear on the reasoning behind setting clean_complete
based on number of packets transmitted in case of XDP.


>
>> That might reduce the complexity a bit, and will probably still be pretty
>> useful for people tuning their non-XDP workloads.

This option is fine too.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
  2022-10-06 14:57             ` Samudrala, Sridhar
@ 2022-10-06 17:32               ` Joe Damato
  -1 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-06 17:32 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: Maciej Fijalkowski, intel-wired-lan, netdev, kuba, davem,
	anthony.l.nguyen, jesse.brandeburg

On Thu, Oct 06, 2022 at 09:57:19AM -0500, Samudrala, Sridhar wrote:
> On 10/6/2022 8:03 AM, Maciej Fijalkowski wrote:
> >On Wed, Oct 05, 2022 at 06:00:24PM -0700, Joe Damato wrote:
> >>On Wed, Oct 05, 2022 at 05:31:04PM -0700, Joe Damato wrote:
> >>>On Wed, Oct 05, 2022 at 07:16:56PM -0500, Samudrala, Sridhar wrote:
> >>>>On 10/5/2022 4:21 PM, Joe Damato wrote:
> >>>>>Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores
> >>>>>the number TXs cleaned.
> >>>>>
> >>>>>Likewise, update i40e_clean_xdp_tx_irq and i40e_xmit_zc to do the same.
> >>>>>
> >>>>>Care has been taken to avoid changing the control flow of any functions
> >>>>>involved.
> >>>>>
> >>>>>Signed-off-by: Joe Damato <jdamato@fastly.com>
> >>>>>---
> >>>>>  drivers/net/ethernet/intel/i40e/i40e_txrx.c | 16 +++++++++++-----
> >>>>>  drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 15 +++++++++++----
> >>>>>  drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
> >>>>>  3 files changed, 24 insertions(+), 10 deletions(-)
> >>>>>
> >>>>>diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> >>>>>index b97c95f..a2cc98e 100644
> >>>>>--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> >>>>>+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> >>>>>@@ -923,11 +923,13 @@ void i40e_detect_recover_hung(struct i40e_vsi *vsi)
> >>>>>   * @vsi: the VSI we care about
> >>>>>   * @tx_ring: Tx ring to clean
> >>>>>   * @napi_budget: Used to determine if we are in netpoll
> >>>>>+ * @tx_cleaned: Out parameter set to the number of TXes cleaned
> >>>>>   *
> >>>>>   * Returns true if there's any budget left (e.g. the clean is finished)
> >>>>>   **/
> >>>>>  static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> >>>>>-			      struct i40e_ring *tx_ring, int napi_budget)
> >>>>>+			      struct i40e_ring *tx_ring, int napi_budget,
> >>>>>+			      unsigned int *tx_cleaned)
> >>>>>  {
> >>>>>  	int i = tx_ring->next_to_clean;
> >>>>>  	struct i40e_tx_buffer *tx_buf;
> >>>>>@@ -1026,7 +1028,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> >>>>>  	i40e_arm_wb(tx_ring, vsi, budget);
> >>>>>  	if (ring_is_xdp(tx_ring))
> >>>>>-		return !!budget;
> >>>>>+		goto out;
> >>>>>  	/* notify netdev of completed buffers */
> >>>>>  	netdev_tx_completed_queue(txring_txq(tx_ring),
> >>>>>@@ -1048,6 +1050,8 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> >>>>>  		}
> >>>>>  	}
> >>>>>+out:
> >>>>>+	*tx_cleaned = total_packets;
> >>>>>  	return !!budget;
> >>>>>  }
> >>>>>@@ -2689,10 +2693,12 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> >>>>>  			       container_of(napi, struct i40e_q_vector, napi);
> >>>>>  	struct i40e_vsi *vsi = q_vector->vsi;
> >>>>>  	struct i40e_ring *ring;
> >>>>>+	bool tx_clean_complete = true;
> >>>>>  	bool clean_complete = true;
> >>>>>  	bool arm_wb = false;
> >>>>>  	int budget_per_ring;
> >>>>>  	int work_done = 0;
> >>>>>+	unsigned int tx_cleaned = 0;
> >>>>>  	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
> >>>>>  		napi_complete(napi);
> >>>>>@@ -2704,11 +2710,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> >>>>>  	 */
> >>>>>  	i40e_for_each_ring(ring, q_vector->tx) {
> >>>>>  		bool wd = ring->xsk_pool ?
> >>>>>-			  i40e_clean_xdp_tx_irq(vsi, ring) :
> >>>>>-			  i40e_clean_tx_irq(vsi, ring, budget);
> >>>>>+			  i40e_clean_xdp_tx_irq(vsi, ring, &tx_cleaned) :
> >>>>>+			  i40e_clean_tx_irq(vsi, ring, budget, &tx_cleaned);
> >>>>>  		if (!wd) {
> >>>>>-			clean_complete = false;
> >>>>>+			clean_complete = tx_clean_complete = false;
> >>>>>  			continue;
> >>>>>  		}
> >>>>>  		arm_wb |= ring->arm_wb;
> >>>>>diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> >>>>>index 790aaeff..f98ce7e4 100644
> >>>>>--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> >>>>>+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> >>>>>@@ -530,18 +530,22 @@ static void i40e_set_rs_bit(struct i40e_ring *xdp_ring)
> >>>>>   * i40e_xmit_zc - Performs zero-copy Tx AF_XDP
> >>>>>   * @xdp_ring: XDP Tx ring
> >>>>>   * @budget: NAPI budget
> >>>>>+ * @tx_cleaned: Out parameter of the TX packets processed
> >>>>>   *
> >>>>>   * Returns true if the work is finished.
> >>>>>   **/
> >>>>>-static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> >>>>>+static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget,
> >>>>>+			 unsigned int *tx_cleaned)
> >>>>>  {
> >>>>>  	struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
> >>>>>  	u32 nb_pkts, nb_processed = 0;
> >>>>>  	unsigned int total_bytes = 0;
> >>>>>  	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
> >>>>>-	if (!nb_pkts)
> >>>>>+	if (!nb_pkts) {
> >>>>>+		*tx_cleaned = 0;
> >>>>>  		return true;
> >>>>>+	}
> >>>>>  	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
> >>>>>  		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
> >>>>>@@ -558,6 +562,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> >>>>>  	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
> >>>>>+	*tx_cleaned = nb_pkts;
> >>>>With XDP, I don't think we should count these as tx_cleaned packets. These are transmitted
> >>>>packets. The tx_cleaned would be the xsk_frames counter in i40e_clean_xdp_tx_irq
> >>>>May be we need 2 counters for xdp.
> >>>I think there's two issues you are describing, which are separate in my
> >>>mind.
> >>>
> >>>   1.) The name "tx_cleaned", and
> >>>   2.) Whether nb_pkts is the right thing to write as the out param.
> >>>
> >>>For #1: I'm OK to change the name if that's the blocker here; please
> >>>suggest a suitable alternative that you'll accept.
> >>>
> >>>For #2: nb_pkts is, IMO, the right value to bubble up to the tracepoint because
> >>>nb_pkts affects clean_complete in i40e_napi_poll which in turn determines
> >>>whether or not polling mode is entered.
> >>>
> >>>The purpose of the tracepoint is to determine when/why/how you are entering
> >>>polling mode, so if nb_pkts plays a role in that calculation, it's the
> >>>right number to output.
> >>I suppose the alternative is to only fire the tracepoint when *not* in XDP.
> >>Then the changes to the XDP stuff can be dropped and a separate set of
> >>tracepoints for XDP can be created in the future.
> >Let's be clear that it's the AF_XDP quirk that we have in here that actual
> >xmit happens within NAPI polling routine.
> >
> >Sridhar is right with having xsk_frames as tx_cleaned but you're also
> >right that nb_pkts affects napi polling. But then if you look at Rx side
> >there is an analogous case with buffer allocation affecting napi polling.
> 
> To be correct,  I would suggest 2 out parameters to i40e_clean_xdp_tx_irq()
> tx_cleaned and xdp_transmitted.  tx_cleaned should be filled in
> with xsk_frames. Add xdp_transmitted as an out parameter to i40e_xmit_zc()
> and fill it with nb_pkts.

Sorry, but I don't see the value in the second param. NAPI decides what to
do based on nb_pkts. That's the only parameter that matters for the purpose
of NAPI going into poll mode or not, right?

If so: I don't see any reason why a second parameter is necessary.

As I mentioned earlier: if it's just that the name of the parameter isn't
right (e.g., you want it to be 'tx_processed' instead of 'tx_cleaned') then
that's an easy fix; I'll just change the name.

It doesn't seem helpful to have xsk_frames as an out parameter for
i40e_napi_poll tracepoint; that value is not used to determine anything
about i40e's NAPI.

> I am not completely clear on the reasoning behind setting clean_complete
> based on number of packets transmitted in case of XDP.
> >
> >>That might reduce the complexity a bit, and will probably still be pretty
> >>useful for people tuning their non-XDP workloads.
> 
> This option is fine too.

I'll give Jesse a chance to weigh in before I proceed with spinning a v3.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Intel-wired-lan] [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
@ 2022-10-06 17:32               ` Joe Damato
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-06 17:32 UTC (permalink / raw)
  To: Samudrala, Sridhar; +Cc: netdev, intel-wired-lan, kuba, davem

On Thu, Oct 06, 2022 at 09:57:19AM -0500, Samudrala, Sridhar wrote:
> On 10/6/2022 8:03 AM, Maciej Fijalkowski wrote:
> >On Wed, Oct 05, 2022 at 06:00:24PM -0700, Joe Damato wrote:
> >>On Wed, Oct 05, 2022 at 05:31:04PM -0700, Joe Damato wrote:
> >>>On Wed, Oct 05, 2022 at 07:16:56PM -0500, Samudrala, Sridhar wrote:
> >>>>On 10/5/2022 4:21 PM, Joe Damato wrote:
> >>>>>Update i40e_clean_tx_irq to take an out parameter (tx_cleaned) which stores
> >>>>>the number TXs cleaned.
> >>>>>
> >>>>>Likewise, update i40e_clean_xdp_tx_irq and i40e_xmit_zc to do the same.
> >>>>>
> >>>>>Care has been taken to avoid changing the control flow of any functions
> >>>>>involved.
> >>>>>
> >>>>>Signed-off-by: Joe Damato <jdamato@fastly.com>
> >>>>>---
> >>>>>  drivers/net/ethernet/intel/i40e/i40e_txrx.c | 16 +++++++++++-----
> >>>>>  drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 15 +++++++++++----
> >>>>>  drivers/net/ethernet/intel/i40e/i40e_xsk.h  |  3 ++-
> >>>>>  3 files changed, 24 insertions(+), 10 deletions(-)
> >>>>>
> >>>>>diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> >>>>>index b97c95f..a2cc98e 100644
> >>>>>--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> >>>>>+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> >>>>>@@ -923,11 +923,13 @@ void i40e_detect_recover_hung(struct i40e_vsi *vsi)
> >>>>>   * @vsi: the VSI we care about
> >>>>>   * @tx_ring: Tx ring to clean
> >>>>>   * @napi_budget: Used to determine if we are in netpoll
> >>>>>+ * @tx_cleaned: Out parameter set to the number of TXes cleaned
> >>>>>   *
> >>>>>   * Returns true if there's any budget left (e.g. the clean is finished)
> >>>>>   **/
> >>>>>  static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> >>>>>-			      struct i40e_ring *tx_ring, int napi_budget)
> >>>>>+			      struct i40e_ring *tx_ring, int napi_budget,
> >>>>>+			      unsigned int *tx_cleaned)
> >>>>>  {
> >>>>>  	int i = tx_ring->next_to_clean;
> >>>>>  	struct i40e_tx_buffer *tx_buf;
> >>>>>@@ -1026,7 +1028,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> >>>>>  	i40e_arm_wb(tx_ring, vsi, budget);
> >>>>>  	if (ring_is_xdp(tx_ring))
> >>>>>-		return !!budget;
> >>>>>+		goto out;
> >>>>>  	/* notify netdev of completed buffers */
> >>>>>  	netdev_tx_completed_queue(txring_txq(tx_ring),
> >>>>>@@ -1048,6 +1050,8 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
> >>>>>  		}
> >>>>>  	}
> >>>>>+out:
> >>>>>+	*tx_cleaned = total_packets;
> >>>>>  	return !!budget;
> >>>>>  }
> >>>>>@@ -2689,10 +2693,12 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> >>>>>  			       container_of(napi, struct i40e_q_vector, napi);
> >>>>>  	struct i40e_vsi *vsi = q_vector->vsi;
> >>>>>  	struct i40e_ring *ring;
> >>>>>+	bool tx_clean_complete = true;
> >>>>>  	bool clean_complete = true;
> >>>>>  	bool arm_wb = false;
> >>>>>  	int budget_per_ring;
> >>>>>  	int work_done = 0;
> >>>>>+	unsigned int tx_cleaned = 0;
> >>>>>  	if (test_bit(__I40E_VSI_DOWN, vsi->state)) {
> >>>>>  		napi_complete(napi);
> >>>>>@@ -2704,11 +2710,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
> >>>>>  	 */
> >>>>>  	i40e_for_each_ring(ring, q_vector->tx) {
> >>>>>  		bool wd = ring->xsk_pool ?
> >>>>>-			  i40e_clean_xdp_tx_irq(vsi, ring) :
> >>>>>-			  i40e_clean_tx_irq(vsi, ring, budget);
> >>>>>+			  i40e_clean_xdp_tx_irq(vsi, ring, &tx_cleaned) :
> >>>>>+			  i40e_clean_tx_irq(vsi, ring, budget, &tx_cleaned);
> >>>>>  		if (!wd) {
> >>>>>-			clean_complete = false;
> >>>>>+			clean_complete = tx_clean_complete = false;
> >>>>>  			continue;
> >>>>>  		}
> >>>>>  		arm_wb |= ring->arm_wb;
> >>>>>diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> >>>>>index 790aaeff..f98ce7e4 100644
> >>>>>--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> >>>>>+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> >>>>>@@ -530,18 +530,22 @@ static void i40e_set_rs_bit(struct i40e_ring *xdp_ring)
> >>>>>   * i40e_xmit_zc - Performs zero-copy Tx AF_XDP
> >>>>>   * @xdp_ring: XDP Tx ring
> >>>>>   * @budget: NAPI budget
> >>>>>+ * @tx_cleaned: Out parameter of the TX packets processed
> >>>>>   *
> >>>>>   * Returns true if the work is finished.
> >>>>>   **/
> >>>>>-static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> >>>>>+static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget,
> >>>>>+			 unsigned int *tx_cleaned)
> >>>>>  {
> >>>>>  	struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
> >>>>>  	u32 nb_pkts, nb_processed = 0;
> >>>>>  	unsigned int total_bytes = 0;
> >>>>>  	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
> >>>>>-	if (!nb_pkts)
> >>>>>+	if (!nb_pkts) {
> >>>>>+		*tx_cleaned = 0;
> >>>>>  		return true;
> >>>>>+	}
> >>>>>  	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
> >>>>>  		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
> >>>>>@@ -558,6 +562,7 @@ static bool i40e_xmit_zc(struct i40e_ring *xdp_ring, unsigned int budget)
> >>>>>  	i40e_update_tx_stats(xdp_ring, nb_pkts, total_bytes);
> >>>>>+	*tx_cleaned = nb_pkts;
> >>>>With XDP, I don't think we should count these as tx_cleaned packets. These are transmitted
> >>>>packets. The tx_cleaned would be the xsk_frames counter in i40e_clean_xdp_tx_irq
> >>>>May be we need 2 counters for xdp.
> >>>I think there's two issues you are describing, which are separate in my
> >>>mind.
> >>>
> >>>   1.) The name "tx_cleaned", and
> >>>   2.) Whether nb_pkts is the right thing to write as the out param.
> >>>
> >>>For #1: I'm OK to change the name if that's the blocker here; please
> >>>suggest a suitable alternative that you'll accept.
> >>>
> >>>For #2: nb_pkts is, IMO, the right value to bubble up to the tracepoint because
> >>>nb_pkts affects clean_complete in i40e_napi_poll which in turn determines
> >>>whether or not polling mode is entered.
> >>>
> >>>The purpose of the tracepoint is to determine when/why/how you are entering
> >>>polling mode, so if nb_pkts plays a role in that calculation, it's the
> >>>right number to output.
> >>I suppose the alternative is to only fire the tracepoint when *not* in XDP.
> >>Then the changes to the XDP stuff can be dropped and a separate set of
> >>tracepoints for XDP can be created in the future.
> >Let's be clear that it's the AF_XDP quirk that we have in here that actual
> >xmit happens within NAPI polling routine.
> >
> >Sridhar is right with having xsk_frames as tx_cleaned but you're also
> >right that nb_pkts affects napi polling. But then if you look at Rx side
> >there is an analogous case with buffer allocation affecting napi polling.
> 
> To be correct,  I would suggest 2 out parameters to i40e_clean_xdp_tx_irq()
> tx_cleaned and xdp_transmitted.  tx_cleaned should be filled in
> with xsk_frames. Add xdp_transmitted as an out parameter to i40e_xmit_zc()
> and fill it with nb_pkts.

Sorry, but I don't see the value in the second param. NAPI decides what to
do based on nb_pkts. That's the only parameter that matters for the purpose
of NAPI going into poll mode or not, right?

If so: I don't see any reason why a second parameter is necessary.

As I mentioned earlier: if it's just that the name of the parameter isn't
right (e.g., you want it to be 'tx_processed' instead of 'tx_cleaned') then
that's an easy fix; I'll just change the name.

It doesn't seem helpful to have xsk_frames as an out parameter for
i40e_napi_poll tracepoint; that value is not used to determine anything
about i40e's NAPI.

> I am not completely clear on the reasoning behind setting clean_complete
> based on number of packets transmitted in case of XDP.
> >
> >>That might reduce the complexity a bit, and will probably still be pretty
> >>useful for people tuning their non-XDP workloads.
> 
> This option is fine too.

I'll give Jesse a chance to weigh in before I proceed with spinning a v3.
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
  2022-10-06 17:32               ` [Intel-wired-lan] " Joe Damato
@ 2022-10-06 22:35                 ` Jesse Brandeburg
  -1 siblings, 0 replies; 30+ messages in thread
From: Jesse Brandeburg @ 2022-10-06 22:35 UTC (permalink / raw)
  To: Joe Damato, Samudrala, Sridhar
  Cc: Maciej Fijalkowski, intel-wired-lan, netdev, kuba, davem,
	anthony.l.nguyen

On 10/6/2022 10:32 AM, Joe Damato wrote:
> Sorry, but I don't see the value in the second param. NAPI decides what to
> do based on nb_pkts. That's the only parameter that matters for the purpose
> of NAPI going into poll mode or not, right?
> 
> If so: I don't see any reason why a second parameter is necessary.

Sridhar and I talked about this offline. We agree now that you can just 
proceed with the single parameter.

> 
> As I mentioned earlier: if it's just that the name of the parameter isn't
> right (e.g., you want it to be 'tx_processed' instead of 'tx_cleaned') then
> that's an easy fix; I'll just change the name.

I think the name change isn't necessary, since we're not going to extend 
this patch with full XDP events printed (see below)

> 
> It doesn't seem helpful to have xsk_frames as an out parameter for
> i40e_napi_poll tracepoint; that value is not used to determine anything
> about i40e's NAPI.
> 
>> I am not completely clear on the reasoning behind setting clean_complete
>> based on number of packets transmitted in case of XDP.
>>>
>>>> That might reduce the complexity a bit, and will probably still be pretty
>>>> useful for people tuning their non-XDP workloads.
>>
>> This option is fine too.
> 
> I'll give Jesse a chance to weigh in before I proceed with spinning a v3.

I'm ok with the patch you have now, that shows nb_pkts because it's the 
input to the polling decision. We can add the detail about XDP transmits 
cleaned in a later series or patch that is by someone who wants the XDP 
details in the napi poll context.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Intel-wired-lan] [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
@ 2022-10-06 22:35                 ` Jesse Brandeburg
  0 siblings, 0 replies; 30+ messages in thread
From: Jesse Brandeburg @ 2022-10-06 22:35 UTC (permalink / raw)
  To: Joe Damato, Samudrala, Sridhar; +Cc: netdev, intel-wired-lan, kuba, davem

On 10/6/2022 10:32 AM, Joe Damato wrote:
> Sorry, but I don't see the value in the second param. NAPI decides what to
> do based on nb_pkts. That's the only parameter that matters for the purpose
> of NAPI going into poll mode or not, right?
> 
> If so: I don't see any reason why a second parameter is necessary.

Sridhar and I talked about this offline. We agree now that you can just 
proceed with the single parameter.

> 
> As I mentioned earlier: if it's just that the name of the parameter isn't
> right (e.g., you want it to be 'tx_processed' instead of 'tx_cleaned') then
> that's an easy fix; I'll just change the name.

I think the name change isn't necessary, since we're not going to extend 
this patch with full XDP events printed (see below)

> 
> It doesn't seem helpful to have xsk_frames as an out parameter for
> i40e_napi_poll tracepoint; that value is not used to determine anything
> about i40e's NAPI.
> 
>> I am not completely clear on the reasoning behind setting clean_complete
>> based on number of packets transmitted in case of XDP.
>>>
>>>> That might reduce the complexity a bit, and will probably still be pretty
>>>> useful for people tuning their non-XDP workloads.
>>
>> This option is fine too.
> 
> I'll give Jesse a chance to weigh in before I proceed with spinning a v3.

I'm ok with the patch you have now, that shows nb_pkts because it's the 
input to the polling decision. We can add the detail about XDP transmits 
cleaned in a later series or patch that is by someone who wants the XDP 
details in the napi poll context.

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
  2022-10-06 22:35                 ` [Intel-wired-lan] " Jesse Brandeburg
@ 2022-10-06 22:56                   ` Joe Damato
  -1 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-06 22:56 UTC (permalink / raw)
  To: Jesse Brandeburg
  Cc: Samudrala, Sridhar, Maciej Fijalkowski, intel-wired-lan, netdev,
	kuba, davem, anthony.l.nguyen

On Thu, Oct 06, 2022 at 03:35:36PM -0700, Jesse Brandeburg wrote:
> On 10/6/2022 10:32 AM, Joe Damato wrote:
> >Sorry, but I don't see the value in the second param. NAPI decides what to
> >do based on nb_pkts. That's the only parameter that matters for the purpose
> >of NAPI going into poll mode or not, right?
> >
> >If so: I don't see any reason why a second parameter is necessary.
> 
> Sridhar and I talked about this offline. We agree now that you can just
> proceed with the single parameter.

OK, thanks.

> >
> >As I mentioned earlier: if it's just that the name of the parameter isn't
> >right (e.g., you want it to be 'tx_processed' instead of 'tx_cleaned') then
> >that's an easy fix; I'll just change the name.
> 
> I think the name change isn't necessary, since we're not going to extend
> this patch with full XDP events printed (see below)
> 
> >
> >It doesn't seem helpful to have xsk_frames as an out parameter for
> >i40e_napi_poll tracepoint; that value is not used to determine anything
> >about i40e's NAPI.
> >
> >>I am not completely clear on the reasoning behind setting clean_complete
> >>based on number of packets transmitted in case of XDP.
> >>>
> >>>>That might reduce the complexity a bit, and will probably still be pretty
> >>>>useful for people tuning their non-XDP workloads.
> >>
> >>This option is fine too.
> >
> >I'll give Jesse a chance to weigh in before I proceed with spinning a v3.
> 
> I'm ok with the patch you have now, that shows nb_pkts because it's the
> input to the polling decision. We can add the detail about XDP transmits
> cleaned in a later series or patch that is by someone who wants the XDP
> details in the napi poll context.

Thanks for the detailed and thoughtful feedback, it is much appreciated.

I'll leave this patch the way it is then and tweak the RX patch to include
an rx_clean_complete boolean as I mentioned in my response to that patch
and send out a v3.

FWIW, I had assumed that you would suggest dropping the XDP stuff so I
pre-emptively spun a branch locally that dropped it... it is a much smaller
change of course, but I suspect that this tracepoint might useful for XDP
users, so I think the decision to leave it with nb_pkts makes sense.

Thanks again for the review. I'll send a v3 shortly.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Intel-wired-lan] [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
@ 2022-10-06 22:56                   ` Joe Damato
  0 siblings, 0 replies; 30+ messages in thread
From: Joe Damato @ 2022-10-06 22:56 UTC (permalink / raw)
  To: Jesse Brandeburg; +Cc: intel-wired-lan, netdev, kuba, davem

On Thu, Oct 06, 2022 at 03:35:36PM -0700, Jesse Brandeburg wrote:
> On 10/6/2022 10:32 AM, Joe Damato wrote:
> >Sorry, but I don't see the value in the second param. NAPI decides what to
> >do based on nb_pkts. That's the only parameter that matters for the purpose
> >of NAPI going into poll mode or not, right?
> >
> >If so: I don't see any reason why a second parameter is necessary.
> 
> Sridhar and I talked about this offline. We agree now that you can just
> proceed with the single parameter.

OK, thanks.

> >
> >As I mentioned earlier: if it's just that the name of the parameter isn't
> >right (e.g., you want it to be 'tx_processed' instead of 'tx_cleaned') then
> >that's an easy fix; I'll just change the name.
> 
> I think the name change isn't necessary, since we're not going to extend
> this patch with full XDP events printed (see below)
> 
> >
> >It doesn't seem helpful to have xsk_frames as an out parameter for
> >i40e_napi_poll tracepoint; that value is not used to determine anything
> >about i40e's NAPI.
> >
> >>I am not completely clear on the reasoning behind setting clean_complete
> >>based on number of packets transmitted in case of XDP.
> >>>
> >>>>That might reduce the complexity a bit, and will probably still be pretty
> >>>>useful for people tuning their non-XDP workloads.
> >>
> >>This option is fine too.
> >
> >I'll give Jesse a chance to weigh in before I proceed with spinning a v3.
> 
> I'm ok with the patch you have now, that shows nb_pkts because it's the
> input to the polling decision. We can add the detail about XDP transmits
> cleaned in a later series or patch that is by someone who wants the XDP
> details in the napi poll context.

Thanks for the detailed and thoughtful feedback, it is much appreciated.

I'll leave this patch the way it is then and tweak the RX patch to include
an rx_clean_complete boolean as I mentioned in my response to that patch
and send out a v3.

FWIW, I had assumed that you would suggest dropping the XDP stuff so I
pre-emptively spun a branch locally that dropped it... it is a much smaller
change of course, but I suspect that this tracepoint might useful for XDP
users, so I think the decision to leave it with nb_pkts makes sense.

Thanks again for the review. I'll send a v3 shortly.
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
  2022-10-06 22:56                   ` [Intel-wired-lan] " Joe Damato
@ 2022-10-07  8:08                     ` Maciej Fijalkowski
  -1 siblings, 0 replies; 30+ messages in thread
From: Maciej Fijalkowski @ 2022-10-07  8:08 UTC (permalink / raw)
  To: Joe Damato
  Cc: Jesse Brandeburg, Samudrala, Sridhar, intel-wired-lan, netdev,
	kuba, davem, anthony.l.nguyen

On Thu, Oct 06, 2022 at 03:56:57PM -0700, Joe Damato wrote:
> On Thu, Oct 06, 2022 at 03:35:36PM -0700, Jesse Brandeburg wrote:
> > On 10/6/2022 10:32 AM, Joe Damato wrote:
> > >Sorry, but I don't see the value in the second param. NAPI decides what to
> > >do based on nb_pkts. That's the only parameter that matters for the purpose
> > >of NAPI going into poll mode or not, right?
> > >
> > >If so: I don't see any reason why a second parameter is necessary.
> > 
> > Sridhar and I talked about this offline. We agree now that you can just
> > proceed with the single parameter.
> 
> OK, thanks.
> 
> > >
> > >As I mentioned earlier: if it's just that the name of the parameter isn't
> > >right (e.g., you want it to be 'tx_processed' instead of 'tx_cleaned') then
> > >that's an easy fix; I'll just change the name.
> > 
> > I think the name change isn't necessary, since we're not going to extend
> > this patch with full XDP events printed (see below)

So better to keep the twisted naming?

> > 
> > >
> > >It doesn't seem helpful to have xsk_frames as an out parameter for
> > >i40e_napi_poll tracepoint; that value is not used to determine anything
> > >about i40e's NAPI.
> > >
> > >>I am not completely clear on the reasoning behind setting clean_complete
> > >>based on number of packets transmitted in case of XDP.
> > >>>
> > >>>>That might reduce the complexity a bit, and will probably still be pretty
> > >>>>useful for people tuning their non-XDP workloads.
> > >>
> > >>This option is fine too.
> > >
> > >I'll give Jesse a chance to weigh in before I proceed with spinning a v3.
> > 
> > I'm ok with the patch you have now, that shows nb_pkts because it's the
> > input to the polling decision. We can add the detail about XDP transmits
> > cleaned in a later series or patch that is by someone who wants the XDP
> > details in the napi poll context.

Please spell out whole AF_XDP instead of referring to XDP. Future readers
might get confused. XDP is totally fine with what Joe is doing, I'm trying
to bring up whole AF_XDP term and I feel like I'm being ignored.

number of produced packets to HW tx ring != number of produced packets to
AF_XDP CQ ring.

> 
> Thanks for the detailed and thoughtful feedback, it is much appreciated.
> 
> I'll leave this patch the way it is then and tweak the RX patch to include
> an rx_clean_complete boolean as I mentioned in my response to that patch
> and send out a v3.
> 
> FWIW, I had assumed that you would suggest dropping the XDP stuff so I
> pre-emptively spun a branch locally that dropped it... it is a much smaller
> change of course, but I suspect that this tracepoint might useful for XDP
> users, so I think the decision to leave it with nb_pkts makes sense.
> 
> Thanks again for the review. I'll send a v3 shortly.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Intel-wired-lan] [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI
@ 2022-10-07  8:08                     ` Maciej Fijalkowski
  0 siblings, 0 replies; 30+ messages in thread
From: Maciej Fijalkowski @ 2022-10-07  8:08 UTC (permalink / raw)
  To: Joe Damato; +Cc: intel-wired-lan, netdev, kuba, davem

On Thu, Oct 06, 2022 at 03:56:57PM -0700, Joe Damato wrote:
> On Thu, Oct 06, 2022 at 03:35:36PM -0700, Jesse Brandeburg wrote:
> > On 10/6/2022 10:32 AM, Joe Damato wrote:
> > >Sorry, but I don't see the value in the second param. NAPI decides what to
> > >do based on nb_pkts. That's the only parameter that matters for the purpose
> > >of NAPI going into poll mode or not, right?
> > >
> > >If so: I don't see any reason why a second parameter is necessary.
> > 
> > Sridhar and I talked about this offline. We agree now that you can just
> > proceed with the single parameter.
> 
> OK, thanks.
> 
> > >
> > >As I mentioned earlier: if it's just that the name of the parameter isn't
> > >right (e.g., you want it to be 'tx_processed' instead of 'tx_cleaned') then
> > >that's an easy fix; I'll just change the name.
> > 
> > I think the name change isn't necessary, since we're not going to extend
> > this patch with full XDP events printed (see below)

So better to keep the twisted naming?

> > 
> > >
> > >It doesn't seem helpful to have xsk_frames as an out parameter for
> > >i40e_napi_poll tracepoint; that value is not used to determine anything
> > >about i40e's NAPI.
> > >
> > >>I am not completely clear on the reasoning behind setting clean_complete
> > >>based on number of packets transmitted in case of XDP.
> > >>>
> > >>>>That might reduce the complexity a bit, and will probably still be pretty
> > >>>>useful for people tuning their non-XDP workloads.
> > >>
> > >>This option is fine too.
> > >
> > >I'll give Jesse a chance to weigh in before I proceed with spinning a v3.
> > 
> > I'm ok with the patch you have now, that shows nb_pkts because it's the
> > input to the polling decision. We can add the detail about XDP transmits
> > cleaned in a later series or patch that is by someone who wants the XDP
> > details in the napi poll context.

Please spell out whole AF_XDP instead of referring to XDP. Future readers
might get confused. XDP is totally fine with what Joe is doing, I'm trying
to bring up whole AF_XDP term and I feel like I'm being ignored.

number of produced packets to HW tx ring != number of produced packets to
AF_XDP CQ ring.

> 
> Thanks for the detailed and thoughtful feedback, it is much appreciated.
> 
> I'll leave this patch the way it is then and tweak the RX patch to include
> an rx_clean_complete boolean as I mentioned in my response to that patch
> and send out a v3.
> 
> FWIW, I had assumed that you would suggest dropping the XDP stuff so I
> pre-emptively spun a branch locally that dropped it... it is a much smaller
> change of course, but I suspect that this tracepoint might useful for XDP
> users, so I think the decision to leave it with nb_pkts makes sense.
> 
> Thanks again for the review. I'll send a v3 shortly.
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2022-10-07  8:08 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-05 21:21 [next-queue v2 0/4] i40e: Add an i40e_napi_poll tracepoint Joe Damato
2022-10-05 21:21 ` [Intel-wired-lan] " Joe Damato
2022-10-05 21:21 ` [next-queue v2 1/4] i40e: Store the irq number in i40e_q_vector Joe Damato
2022-10-05 21:21   ` [Intel-wired-lan] " Joe Damato
2022-10-05 21:21 ` [next-queue v2 2/4] i40e: Record number TXes cleaned during NAPI Joe Damato
2022-10-05 21:21   ` [Intel-wired-lan] " Joe Damato
2022-10-06  0:16   ` Samudrala, Sridhar
2022-10-06  0:16     ` [Intel-wired-lan] " Samudrala, Sridhar
2022-10-06  0:31     ` Joe Damato
2022-10-06  0:31       ` [Intel-wired-lan] " Joe Damato
2022-10-06  1:00       ` Joe Damato
2022-10-06  1:00         ` [Intel-wired-lan] " Joe Damato
2022-10-06 13:03         ` Maciej Fijalkowski
2022-10-06 13:03           ` [Intel-wired-lan] " Maciej Fijalkowski
2022-10-06 14:57           ` Samudrala, Sridhar
2022-10-06 14:57             ` Samudrala, Sridhar
2022-10-06 17:32             ` Joe Damato
2022-10-06 17:32               ` [Intel-wired-lan] " Joe Damato
2022-10-06 22:35               ` Jesse Brandeburg
2022-10-06 22:35                 ` [Intel-wired-lan] " Jesse Brandeburg
2022-10-06 22:56                 ` Joe Damato
2022-10-06 22:56                   ` [Intel-wired-lan] " Joe Damato
2022-10-07  8:08                   ` Maciej Fijalkowski
2022-10-07  8:08                     ` [Intel-wired-lan] " Maciej Fijalkowski
2022-10-05 21:21 ` [next-queue v2 3/4] i40e: Record number of RXes " Joe Damato
2022-10-05 21:21   ` [Intel-wired-lan] " Joe Damato
2022-10-06  0:36   ` Joe Damato
2022-10-06  0:36     ` Joe Damato
2022-10-05 21:21 ` [next-queue v2 4/4] i40e: Add i40e_napi_poll tracepoint Joe Damato
2022-10-05 21:21   ` [Intel-wired-lan] " Joe Damato

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.