netdev.vger.kernel.org archive mirror
* [PATCH net-next 0/6] enetc: Add adaptive interrupt coalescing
@ 2020-07-13 12:56 Claudiu Manoil
  2020-07-13 12:56 ` [PATCH net-next 1/6] enetc: Refine buffer descriptor ring sizes Claudiu Manoil
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Claudiu Manoil @ 2020-07-13 12:56 UTC
  To: David S . Miller; +Cc: netdev

Apart from some related cleanup patches, this set
adds the support needed to enable and configure
interrupt coalescing for ENETC.

Patch 4 introduces the framework for configuring
interrupt coalescing parameters and switching between
moderated (int. coalescing) and per-packet interrupt modes.
When interrupt coalescing is enabled, the Rx/Tx time
thresholds are configurable while the packet thresholds are fixed.
To make this work reliably, patch 4 calls the traffic
pause procedure introduced in patch 2.

Patch 6 adds DIM (Dynamic Interrupt Moderation) to implement
adaptive coalescing based on time thresholds, for both Rx and
Tx processing 'channels' independently.
netperf -t TCP_MAERTS measurements show a significant CPU load
reduction correlated with reduced interrupt rates. For a single
TCP flow (mostly Rx), when both Rx and Tx paths are processed
on the same CPU, the CPU load improvement is modest even though
the interrupt rate is ~3x smaller than before. I think part of this
can be attributed to the overhead of supporting interrupt coalescing.
But if the Rx and Tx channels are processed on separate CPUs, the
improvement is stunning.
For a system load test involving 8 TCP threads, the CPU
utilization improvement is significant as well.  Below are the
measurement results pasted from patch 6's commit message, for reference:

System with 2 ARM Cortex-A72 CPUs @ 1.3 GHz, 32 KB L1 data cache,
using netperf @ 1Gbit link (maximum throughput):

1) 1 Rx TCP flow, both Rx and Tx processed by the same NAPI
thread on the same CPU:
        CPU utilization         int rate (ints/sec)
Before: 50%-60% (over 50%)              92k
After:  just under 50%                  35k
Comment:  Small CPU utilization improvement for a single
          Rx TCP flow (i.e. netperf -t TCP_MAERTS) on a single
          CPU.

2) 1 Rx TCP flow, Rx processing on CPU0, Tx on CPU1:
        Total CPU utilization   Total int rate (ints/sec)
Before: 60%-70%                 85k CPU0 + 42k CPU1
After:  15%                     3.5k CPU0 + 3.5k CPU1
Comment:  Huge improvement in total CPU utilization
          correlated w/ a huge decrease in interrupt rate.

3) 4 Rx TCP flows + 4 Tx TCP flows (+ pings to check the latency):
        Total CPU utilization   Total int rate (ints/sec)
Before: ~80% (spikes to 90%)            ~100k
After:   60% (more steady)               ~10k
Comment:  Important improvement for this load test, while the
          ping test outcome was not impacted.


Claudiu Manoil (6):
  enetc: Refine buffer descriptor ring sizes
  enetc: Factor out the traffic start/stop procedures
  enetc: Fix interrupt coalescing register naming
  enetc: Add interrupt coalescing support
  enetc: Drop redundant ____cacheline_aligned_in_smp
  enetc: Add adaptive interrupt coalescing

 drivers/net/ethernet/freescale/enetc/Kconfig  |   2 +
 drivers/net/ethernet/freescale/enetc/enetc.c  | 215 ++++++++++++++----
 drivers/net/ethernet/freescale/enetc/enetc.h  |  40 +++-
 .../ethernet/freescale/enetc/enetc_ethtool.c  |  92 +++++++-
 .../net/ethernet/freescale/enetc/enetc_hw.h   |  23 +-
 5 files changed, 322 insertions(+), 50 deletions(-)

-- 
2.17.1
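
The interrupt-rate reductions quoted in the results above can be worked out
with a few lines of C. This is only an arithmetic sketch over the figures
listed in the cover letter; the struct and function names are made up for
illustration, and the numbers for test 3 are approximate in the original.

```c
/* Interrupt rates (in thousands of interrupts per second) taken from the
 * three measurements above; per-CPU figures of test 2 are summed.
 */
struct int_rate_sample {
	const char *test;
	double before_kints;
	double after_kints;
};

static const struct int_rate_sample int_rate_samples[] = {
	{ "1 Rx flow, Rx and Tx on the same CPU", 92.0,        35.0 },
	{ "1 Rx flow, Rx on CPU0, Tx on CPU1",    85.0 + 42.0, 3.5 + 3.5 },
	{ "4 Rx + 4 Tx flows",                    100.0,       10.0 },
};

/* Factor by which the interrupt rate dropped: before / after. */
static inline double int_rate_reduction(const struct int_rate_sample *s)
{
	return s->before_kints / s->after_kints;
}
```

This gives a factor of roughly 2.6 for test 1 (the "~3x" mentioned above),
about 18 for test 2 and about 10 for test 3.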



* [PATCH net-next 1/6] enetc: Refine buffer descriptor ring sizes
  2020-07-13 12:56 [PATCH net-next 0/6] enetc: Add adaptive interrupt coalescing Claudiu Manoil
@ 2020-07-13 12:56 ` Claudiu Manoil
  2020-07-13 12:56 ` [PATCH net-next 2/6] enetc: Factor out the traffic start/stop procedures Claudiu Manoil
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Claudiu Manoil @ 2020-07-13 12:56 UTC
  To: David S . Miller; +Cc: netdev

It's time to differentiate between Rx and Tx ring sizes.
Not only are Tx rings processed differently from Rx rings,
but their default number also differs - i.e. up to 8 Tx rings
per device (8 traffic classes) vs. 2 Rx rings (one per CPU).
So let's set the Tx ring size to half the size of the Rx rings
for now, to be conservative.
The default ring sizes were decreased as well (to the next
lower power of 2), to reduce the memory footprint, buffering
etc., since the measurements I've made so far show that the
rings are very unlikely to get full.
This change also anticipates the introduction of the
dynamic interrupt moderation (dim) algorithm which operates
on maximum packet thresholds of 256 packets for Rx and 128
packets for Tx.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
---
 drivers/net/ethernet/freescale/enetc/enetc.c | 4 ++--
 drivers/net/ethernet/freescale/enetc/enetc.h | 5 +++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 3f32b85ba2cf..d91e52618681 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -1064,8 +1064,8 @@ void enetc_init_si_rings_params(struct enetc_ndev_priv *priv)
 	struct enetc_si *si = priv->si;
 	int cpus = num_online_cpus();
 
-	priv->tx_bd_count = ENETC_BDR_DEFAULT_SIZE;
-	priv->rx_bd_count = ENETC_BDR_DEFAULT_SIZE;
+	priv->tx_bd_count = ENETC_TX_RING_DEFAULT_SIZE;
+	priv->rx_bd_count = ENETC_RX_RING_DEFAULT_SIZE;
 
 	/* Enable all available TX rings in order to configure as many
 	 * priorities as possible, when needed.
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
index b705464f6882..0dd8ee179753 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc.h
@@ -44,8 +44,9 @@ struct enetc_ring_stats {
 	unsigned int rx_alloc_errs;
 };
 
-#define ENETC_BDR_DEFAULT_SIZE	1024
-#define ENETC_DEFAULT_TX_WORK	256
+#define ENETC_RX_RING_DEFAULT_SIZE	512
+#define ENETC_TX_RING_DEFAULT_SIZE	256
+#define ENETC_DEFAULT_TX_WORK		(ENETC_TX_RING_DEFAULT_SIZE / 2)
 
 struct enetc_bdr {
 	struct device *dev; /* for DMA mapping */
-- 
2.17.1
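
As a quick check of how the new defaults relate to the packet thresholds
mentioned in the commit message, here is a minimal user-space sketch. The
three ENETC_* values are copied from the enetc.h hunk above; min_u32() and the
two *_pkt_thr() helpers are hypothetical stand-ins that mirror the
"min(limit, ring size / 2)" form used by the threshold macros in patch 4 of
this series.

```c
#include <stdint.h>

/* Defaults introduced by this patch (from enetc.h). */
#define ENETC_RX_RING_DEFAULT_SIZE	512
#define ENETC_TX_RING_DEFAULT_SIZE	256
#define ENETC_DEFAULT_TX_WORK		(ENETC_TX_RING_DEFAULT_SIZE / 2)

/* Hypothetical helper standing in for the kernel's min_t(u32, a, b). */
static inline uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Packet thresholds: capped at 256 (Rx) and 128 (Tx), and never more
 * than half of the corresponding default ring size.
 */
static inline uint32_t rx_pkt_thr(void)
{
	return min_u32(256, ENETC_RX_RING_DEFAULT_SIZE / 2);
}

static inline uint32_t tx_pkt_thr(void)
{
	return min_u32(128, ENETC_TX_RING_DEFAULT_SIZE / 2);
}
```

With the 512/256 defaults both caps are reached exactly, i.e. 256 packets for
Rx and 128 packets for Tx, and the Tx work budget also comes out at 128.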



* [PATCH net-next 2/6] enetc: Factor out the traffic start/stop procedures
  2020-07-13 12:56 [PATCH net-next 0/6] enetc: Add adaptive interrupt coalescing Claudiu Manoil
  2020-07-13 12:56 ` [PATCH net-next 1/6] enetc: Refine buffer descriptor ring sizes Claudiu Manoil
@ 2020-07-13 12:56 ` Claudiu Manoil
  2020-07-13 12:56 ` [PATCH net-next 3/6] enetc: Fix interrupt coalescing register naming Claudiu Manoil
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Claudiu Manoil @ 2020-07-13 12:56 UTC
  To: David S . Miller; +Cc: netdev

A reliable traffic pause (and reconfiguration) procedure
is needed to be able to safely make h/w configuration
changes during run-time, like changing the mode in which the
interrupts are operating (i.e. with or without coalescing),
as opposed to making on-the-fly register updates that
may be subject to h/w or s/w concurrency issues.
To this end, the code responsible for the run-time device
configuration that starts and stops the traffic
flow through the device has been extracted from the
enetc_open/_close procedures into the separate standalone
enetc_start/_stop procedures. Traffic stop should be as
graceful as possible: it lets the executing napi threads
finish while the interrupts stay disabled.  But since
the napi thread will try to re-enable interrupts by clearing
the device's unmask register, the enable_irq/disable_irq
API has been used to avoid this potential concurrency issue
and make the traffic pause procedure more reliable.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
---
 drivers/net/ethernet/freescale/enetc/enetc.c | 74 +++++++++++++-------
 1 file changed, 49 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index d91e52618681..51a1c97aedac 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -1264,6 +1264,7 @@ static int enetc_setup_irqs(struct enetc_ndev_priv *priv)
 			dev_err(priv->dev, "request_irq() failed!\n");
 			goto irq_err;
 		}
+		disable_irq(irq);
 
 		v->tbier_base = hw->reg + ENETC_BDR(TX, 0, ENETC_TBIER);
 		v->rbier = hw->reg + ENETC_BDR(RX, i, ENETC_RBIER);
@@ -1306,7 +1307,7 @@ static void enetc_free_irqs(struct enetc_ndev_priv *priv)
 	}
 }
 
-static void enetc_enable_interrupts(struct enetc_ndev_priv *priv)
+static void enetc_setup_interrupts(struct enetc_ndev_priv *priv)
 {
 	int i;
 
@@ -1322,7 +1323,7 @@ static void enetc_enable_interrupts(struct enetc_ndev_priv *priv)
 	}
 }
 
-static void enetc_disable_interrupts(struct enetc_ndev_priv *priv)
+static void enetc_clear_interrupts(struct enetc_ndev_priv *priv)
 {
 	int i;
 
@@ -1369,10 +1370,33 @@ static int enetc_phy_connect(struct net_device *ndev)
 	return 0;
 }
 
+static void enetc_start(struct net_device *ndev)
+{
+	struct enetc_ndev_priv *priv = netdev_priv(ndev);
+	int i;
+
+	enetc_setup_interrupts(priv);
+
+	for (i = 0; i < priv->bdr_int_num; i++) {
+		int irq = pci_irq_vector(priv->si->pdev,
+					 ENETC_BDR_INT_BASE_IDX + i);
+
+		napi_enable(&priv->int_vector[i]->napi);
+		enable_irq(irq);
+	}
+
+	if (ndev->phydev)
+		phy_start(ndev->phydev);
+	else
+		netif_carrier_on(ndev);
+
+	netif_tx_start_all_queues(ndev);
+}
+
 int enetc_open(struct net_device *ndev)
 {
 	struct enetc_ndev_priv *priv = netdev_priv(ndev);
-	int i, err;
+	int err;
 
 	err = enetc_setup_irqs(priv);
 	if (err)
@@ -1390,8 +1414,6 @@ int enetc_open(struct net_device *ndev)
 	if (err)
 		goto err_alloc_rx;
 
-	enetc_setup_bdrs(priv);
-
 	err = netif_set_real_num_tx_queues(ndev, priv->num_tx_rings);
 	if (err)
 		goto err_set_queues;
@@ -1400,17 +1422,8 @@ int enetc_open(struct net_device *ndev)
 	if (err)
 		goto err_set_queues;
 
-	for (i = 0; i < priv->bdr_int_num; i++)
-		napi_enable(&priv->int_vector[i]->napi);
-
-	enetc_enable_interrupts(priv);
-
-	if (ndev->phydev)
-		phy_start(ndev->phydev);
-	else
-		netif_carrier_on(ndev);
-
-	netif_tx_start_all_queues(ndev);
+	enetc_setup_bdrs(priv);
+	enetc_start(ndev);
 
 	return 0;
 
@@ -1427,28 +1440,39 @@ int enetc_open(struct net_device *ndev)
 	return err;
 }
 
-int enetc_close(struct net_device *ndev)
+static void enetc_stop(struct net_device *ndev)
 {
 	struct enetc_ndev_priv *priv = netdev_priv(ndev);
 	int i;
 
 	netif_tx_stop_all_queues(ndev);
 
-	if (ndev->phydev) {
-		phy_stop(ndev->phydev);
-		phy_disconnect(ndev->phydev);
-	} else {
-		netif_carrier_off(ndev);
-	}
-
 	for (i = 0; i < priv->bdr_int_num; i++) {
+		int irq = pci_irq_vector(priv->si->pdev,
+					 ENETC_BDR_INT_BASE_IDX + i);
+
+		disable_irq(irq);
 		napi_synchronize(&priv->int_vector[i]->napi);
 		napi_disable(&priv->int_vector[i]->napi);
 	}
 
-	enetc_disable_interrupts(priv);
+	if (ndev->phydev)
+		phy_stop(ndev->phydev);
+	else
+		netif_carrier_off(ndev);
+
+	enetc_clear_interrupts(priv);
+}
+
+int enetc_close(struct net_device *ndev)
+{
+	struct enetc_ndev_priv *priv = netdev_priv(ndev);
+
+	enetc_stop(ndev);
 	enetc_clear_bdrs(priv);
 
+	if (ndev->phydev)
+		phy_disconnect(ndev->phydev);
 	enetc_free_rxtx_rings(priv);
 	enetc_free_rx_resources(priv);
 	enetc_free_tx_resources(priv);
-- 
2.17.1
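
The concurrency issue described in the commit message can be illustrated with
a toy model. This is not driver code: the struct, its fields and the toy_*
functions are all hypothetical, and the model assumes (as the commit message
states) that a finishing NAPI poll re-enables the device interrupt. It only
shows why gating the IRQ line first makes the stop sequence robust.

```c
#include <stdbool.h>

/* Toy state of one interrupt vector; all names are illustrative. */
struct toy_vector {
	bool irq_line_enabled;	/* models enable_irq()/disable_irq() */
	bool dev_int_enabled;	/* models the device interrupt enable state */
	bool napi_in_flight;	/* a NAPI poll is still executing */
};

/* The handler can only run if both gates are open. */
static inline bool toy_irq_can_fire(const struct toy_vector *v)
{
	return v->irq_line_enabled && v->dev_int_enabled;
}

/* A NAPI poll that completes re-enables the device interrupt. */
static inline void toy_napi_finish(struct toy_vector *v)
{
	if (!v->napi_in_flight)
		return;
	v->napi_in_flight = false;
	v->dev_int_enabled = true;
}

/* Register-only stop: an in-flight poll undoes it afterwards. */
static inline void toy_stop_register_only(struct toy_vector *v)
{
	v->dev_int_enabled = false;
	toy_napi_finish(v);
}

/* Ordering used by enetc_stop() in this patch. */
static inline void toy_stop_patched(struct toy_vector *v)
{
	v->irq_line_enabled = false;	/* disable_irq() */
	toy_napi_finish(v);		/* napi_synchronize() + napi_disable() */
	v->dev_int_enabled = false;	/* enetc_clear_interrupts() */
}
```

In this model, a register-only stop with a poll in flight leaves the interrupt
able to fire again, while the patched ordering ends with both gates closed.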



* [PATCH net-next 3/6] enetc: Fix interrupt coalescing register naming
  2020-07-13 12:56 [PATCH net-next 0/6] enetc: Add adaptive interrupt coalescing Claudiu Manoil
  2020-07-13 12:56 ` [PATCH net-next 1/6] enetc: Refine buffer descriptor ring sizes Claudiu Manoil
  2020-07-13 12:56 ` [PATCH net-next 2/6] enetc: Factor out the traffic start/stop procedures Claudiu Manoil
@ 2020-07-13 12:56 ` Claudiu Manoil
  2020-07-13 12:56 ` [PATCH net-next 4/6] enetc: Add interrupt coalescing support Claudiu Manoil
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Claudiu Manoil @ 2020-07-13 12:56 UTC
  To: David S . Miller; +Cc: netdev

The interrupt coalescing registers are named ICR in the current
revision of the Reference Manual (RM), which deprecates the ICIR
name used in earlier (draft) versions of the RM.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
---
 drivers/net/ethernet/freescale/enetc/enetc.c         | 4 ++--
 drivers/net/ethernet/freescale/enetc/enetc_ethtool.c | 2 +-
 drivers/net/ethernet/freescale/enetc/enetc_hw.h      | 8 ++++----
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 51a1c97aedac..be594c7af538 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -1140,7 +1140,7 @@ static void enetc_setup_txbdr(struct enetc_hw *hw, struct enetc_bdr *tx_ring)
 	tx_ring->next_to_clean = enetc_txbdr_rd(hw, idx, ENETC_TBCIR);
 
 	/* enable Tx ints by setting pkt thr to 1 */
-	enetc_txbdr_wr(hw, idx, ENETC_TBICIR0, ENETC_TBICIR0_ICEN | 0x1);
+	enetc_txbdr_wr(hw, idx, ENETC_TBICR0, ENETC_TBICR0_ICEN | 0x1);
 
 	tbmr = ENETC_TBMR_EN;
 	if (tx_ring->ndev->features & NETIF_F_HW_VLAN_CTAG_TX)
@@ -1174,7 +1174,7 @@ static void enetc_setup_rxbdr(struct enetc_hw *hw, struct enetc_bdr *rx_ring)
 	enetc_rxbdr_wr(hw, idx, ENETC_RBPIR, 0);
 
 	/* enable Rx ints by setting pkt thr to 1 */
-	enetc_rxbdr_wr(hw, idx, ENETC_RBICIR0, ENETC_RBICIR0_ICEN | 0x1);
+	enetc_rxbdr_wr(hw, idx, ENETC_RBICR0, ENETC_RBICR0_ICEN | 0x1);
 
 	rbmr = ENETC_RBMR_EN;
 
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c b/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c
index 34bd1f3fb415..8aeaa3de0012 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c
@@ -19,7 +19,7 @@ static const u32 enetc_txbdr_regs[] = {
 
 static const u32 enetc_rxbdr_regs[] = {
 	ENETC_RBMR, ENETC_RBSR, ENETC_RBBSR, ENETC_RBCIR, ENETC_RBBAR0,
-	ENETC_RBBAR1, ENETC_RBPIR, ENETC_RBLENR, ENETC_RBICIR0, ENETC_RBIER
+	ENETC_RBBAR1, ENETC_RBPIR, ENETC_RBLENR, ENETC_RBICR0, ENETC_RBIER
 };
 
 static const u32 enetc_port_regs[] = {
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_hw.h b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
index fc357bc56835..05bb4c525897 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
@@ -121,8 +121,8 @@ enum enetc_bdr_type {TX, RX};
 #define ENETC_RBIER	0xa0
 #define ENETC_RBIER_RXTIE	BIT(0)
 #define ENETC_RBIDR	0xa4
-#define ENETC_RBICIR0	0xa8
-#define ENETC_RBICIR0_ICEN	BIT(31)
+#define ENETC_RBICR0	0xa8
+#define ENETC_RBICR0_ICEN	BIT(31)
 
 /* TX BDR reg offsets */
 #define ENETC_TBMR	0
@@ -141,8 +141,8 @@ enum enetc_bdr_type {TX, RX};
 #define ENETC_TBIER	0xa0
 #define ENETC_TBIER_TXTIE	BIT(0)
 #define ENETC_TBIDR	0xa4
-#define ENETC_TBICIR0	0xa8
-#define ENETC_TBICIR0_ICEN	BIT(31)
+#define ENETC_TBICR0	0xa8
+#define ENETC_TBICR0_ICEN	BIT(31)
 
 #define ENETC_RTBLENR_LEN(n)	((n) & ~0x7)
 
-- 
2.17.1



* [PATCH net-next 4/6] enetc: Add interrupt coalescing support
  2020-07-13 12:56 [PATCH net-next 0/6] enetc: Add adaptive interrupt coalescing Claudiu Manoil
                   ` (2 preceding siblings ...)
  2020-07-13 12:56 ` [PATCH net-next 3/6] enetc: Fix interrupt coalescing register naming Claudiu Manoil
@ 2020-07-13 12:56 ` Claudiu Manoil
  2020-07-13 22:18   ` Jakub Kicinski
  2020-07-13 12:56 ` [PATCH net-next 5/6] enetc: Drop redundant ____cacheline_aligned_in_smp Claudiu Manoil
  2020-07-13 12:56 ` [PATCH net-next 6/6] enetc: Add adaptive interrupt coalescing Claudiu Manoil
  5 siblings, 1 reply; 11+ messages in thread
From: Claudiu Manoil @ 2020-07-13 12:56 UTC
  To: David S . Miller; +Cc: netdev

Enable programming of the interrupt coalescing registers
and allow manual configuration of the coalescing time
thresholds via ethtool.  Packet thresholds have been fixed
to predetermined values as there's no point in making them
run-time configurable, also anticipating the dynamic interrupt
moderation (dim) algorithm which uses fixed packet thresholds
as well.  If the interface is up when the operation mode of
traffic interrupt events is changed by the user (i.e. switching
from default per-packet interrupts to coalesced interrupts),
the traffic needs to be paused in the process.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
---
 drivers/net/ethernet/freescale/enetc/enetc.c  | 41 ++++++++--
 drivers/net/ethernet/freescale/enetc/enetc.h  | 19 +++++
 .../ethernet/freescale/enetc/enetc_ethtool.c  | 76 ++++++++++++++++++-
 .../net/ethernet/freescale/enetc/enetc_hw.h   | 19 ++++-
 4 files changed, 144 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index be594c7af538..e66405d1b791 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -265,9 +265,12 @@ static irqreturn_t enetc_msix(int irq, void *data)
 
 	/* disable interrupts */
 	enetc_wr_reg(v->rbier, 0);
+	enetc_wr_reg(v->ricr1, v->rx_ictt);
 
-	for_each_set_bit(i, &v->tx_rings_map, ENETC_MAX_NUM_TXQS)
+	for_each_set_bit(i, &v->tx_rings_map, ENETC_MAX_NUM_TXQS) {
 		enetc_wr_reg(v->tbier_base + ENETC_BDR_OFF(i), 0);
+		enetc_wr_reg(v->ticr1_base + ENETC_BDR_OFF(i), v->tx_ictt);
+	}
 
 	napi_schedule_irqoff(&v->napi);
 
@@ -1268,6 +1271,8 @@ static int enetc_setup_irqs(struct enetc_ndev_priv *priv)
 
 		v->tbier_base = hw->reg + ENETC_BDR(TX, 0, ENETC_TBIER);
 		v->rbier = hw->reg + ENETC_BDR(RX, i, ENETC_RBIER);
+		v->ticr1_base = hw->reg + ENETC_BDR(TX, 0, ENETC_TBICR1);
+		v->ricr1 = hw->reg + ENETC_BDR(RX, i, ENETC_RBICR1);
 
 		enetc_wr(hw, ENETC_SIMSIRRV(i), entry);
 
@@ -1309,17 +1314,39 @@ static void enetc_free_irqs(struct enetc_ndev_priv *priv)
 
 static void enetc_setup_interrupts(struct enetc_ndev_priv *priv)
 {
+	struct enetc_hw *hw = &priv->si->hw;
+	u32 icpt, ictt;
 	int i;
 
 	/* enable Tx & Rx event indication */
 	for (i = 0; i < priv->num_rx_rings; i++) {
-		enetc_rxbdr_wr(&priv->si->hw, i,
-			       ENETC_RBIER, ENETC_RBIER_RXTIE);
+		if (priv->ic_mode & ENETC_IC_RX_MANUAL) {
+			icpt = ENETC_RBICR0_SET_ICPT(ENETC_RXIC_PKTTHR);
+			/* init to non-0 minimum, will be adjusted later */
+			ictt = 0x1;
+		} else {
+			icpt = 0x1; /* enable Rx ints by setting pkt thr to 1 */
+			ictt = 0;
+		}
+
+		enetc_rxbdr_wr(hw, i, ENETC_RBICR1, ictt);
+		enetc_rxbdr_wr(hw, i, ENETC_RBICR0, ENETC_RBICR0_ICEN | icpt);
+		enetc_rxbdr_wr(hw, i, ENETC_RBIER, ENETC_RBIER_RXTIE);
 	}
 
 	for (i = 0; i < priv->num_tx_rings; i++) {
-		enetc_txbdr_wr(&priv->si->hw, i,
-			       ENETC_TBIER, ENETC_TBIER_TXTIE);
+		if (priv->ic_mode & ENETC_IC_TX_MANUAL) {
+			icpt = ENETC_TBICR0_SET_ICPT(ENETC_TXIC_PKTTHR);
+			/* init to non-0 minimum, will be adjusted later */
+			ictt = 0x1;
+		} else {
+			icpt = 0x1; /* enable Tx ints by setting pkt thr to 1 */
+			ictt = 0;
+		}
+
+		enetc_txbdr_wr(hw, i, ENETC_TBICR1, ictt);
+		enetc_txbdr_wr(hw, i, ENETC_TBICR0, ENETC_TBICR0_ICEN | icpt);
+		enetc_txbdr_wr(hw, i, ENETC_TBIER, ENETC_TBIER_TXTIE);
 	}
 }
 
@@ -1370,7 +1397,7 @@ static int enetc_phy_connect(struct net_device *ndev)
 	return 0;
 }
 
-static void enetc_start(struct net_device *ndev)
+void enetc_start(struct net_device *ndev)
 {
 	struct enetc_ndev_priv *priv = netdev_priv(ndev);
 	int i;
@@ -1440,7 +1467,7 @@ int enetc_open(struct net_device *ndev)
 	return err;
 }
 
-static void enetc_stop(struct net_device *ndev)
+void enetc_stop(struct net_device *ndev)
 {
 	struct enetc_ndev_priv *priv = netdev_priv(ndev);
 	int i;
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
index 0dd8ee179753..cec7e05ec523 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc.h
@@ -190,6 +190,10 @@ static inline bool enetc_si_is_pf(struct enetc_si *si)
 struct enetc_int_vector {
 	void __iomem *rbier;
 	void __iomem *tbier_base;
+	void __iomem *ricr1;
+	void __iomem *ticr1_base;
+	u32 rx_ictt;
+	u32 tx_ictt;
 	unsigned long tx_rings_map;
 	int count_tx_rings;
 	struct napi_struct napi;
@@ -221,6 +225,18 @@ enum enetc_active_offloads {
 	ENETC_F_QCI		= BIT(3),
 };
 
+/* interrupt coalescing modes */
+enum enetc_ic_mode {
+	/* one interrupt per frame */
+	ENETC_IC_NONE = 0,
+	/* activated when int coalescing time is set to a non-0 value */
+	ENETC_IC_RX_MANUAL = BIT(0),
+	ENETC_IC_TX_MANUAL = BIT(1),
+};
+
+#define ENETC_RXIC_PKTTHR	min_t(u32, 256, ENETC_RX_RING_DEFAULT_SIZE / 2)
+#define ENETC_TXIC_PKTTHR	min_t(u32, 128, ENETC_TX_RING_DEFAULT_SIZE / 2)
+
 struct enetc_ndev_priv {
 	struct net_device *ndev;
 	struct device *dev; /* dma-mapping device */
@@ -245,6 +261,7 @@ struct enetc_ndev_priv {
 
 	struct device_node *phy_node;
 	phy_interface_t if_mode;
+	int ic_mode;
 };
 
 /* Messaging */
@@ -274,6 +291,8 @@ void enetc_free_si_resources(struct enetc_ndev_priv *priv);
 
 int enetc_open(struct net_device *ndev);
 int enetc_close(struct net_device *ndev);
+void enetc_start(struct net_device *ndev);
+void enetc_stop(struct net_device *ndev);
 netdev_tx_t enetc_xmit(struct sk_buff *skb, struct net_device *ndev);
 struct net_device_stats *enetc_get_stats(struct net_device *ndev);
 int enetc_set_features(struct net_device *ndev,
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c b/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c
index 8aeaa3de0012..8e0867fa1af6 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c
@@ -14,12 +14,14 @@ static const u32 enetc_si_regs[] = {
 
 static const u32 enetc_txbdr_regs[] = {
 	ENETC_TBMR, ENETC_TBSR, ENETC_TBBAR0, ENETC_TBBAR1,
-	ENETC_TBPIR, ENETC_TBCIR, ENETC_TBLENR, ENETC_TBIER
+	ENETC_TBPIR, ENETC_TBCIR, ENETC_TBLENR, ENETC_TBIER, ENETC_TBICR0,
+	ENETC_TBICR1
 };
 
 static const u32 enetc_rxbdr_regs[] = {
 	ENETC_RBMR, ENETC_RBSR, ENETC_RBBSR, ENETC_RBCIR, ENETC_RBBAR0,
-	ENETC_RBBAR1, ENETC_RBPIR, ENETC_RBLENR, ENETC_RBICR0, ENETC_RBIER
+	ENETC_RBBAR1, ENETC_RBPIR, ENETC_RBLENR, ENETC_RBIER, ENETC_RBICR0,
+	ENETC_RBICR1
 };
 
 static const u32 enetc_port_regs[] = {
@@ -561,6 +563,68 @@ static void enetc_get_ringparam(struct net_device *ndev,
 	}
 }
 
+static int enetc_get_coalesce(struct net_device *ndev,
+			      struct ethtool_coalesce *ic)
+{
+	struct enetc_ndev_priv *priv = netdev_priv(ndev);
+	struct enetc_int_vector *v = priv->int_vector[0];
+
+	memset(ic, 0, sizeof(*ic));
+	ic->tx_coalesce_usecs = enetc_cycles_to_usecs(v->tx_ictt);
+	ic->rx_coalesce_usecs = enetc_cycles_to_usecs(v->rx_ictt);
+
+	ic->tx_max_coalesced_frames = ENETC_TXIC_PKTTHR;
+	ic->rx_max_coalesced_frames = ENETC_RXIC_PKTTHR;
+
+	return 0;
+}
+
+static int enetc_set_coalesce(struct net_device *ndev,
+			      struct ethtool_coalesce *ic)
+{
+	struct enetc_ndev_priv *priv = netdev_priv(ndev);
+	u32 rx_ictt, tx_ictt;
+	int i, ic_mode;
+
+	tx_ictt = enetc_usecs_to_cycles(ic->tx_coalesce_usecs);
+	rx_ictt = enetc_usecs_to_cycles(ic->rx_coalesce_usecs);
+
+	if (!ic->rx_max_coalesced_frames)
+		netif_warn(priv, hw, ndev, "rx-frames fixed to %d\n",
+			   ENETC_RXIC_PKTTHR);
+
+	if (!ic->tx_max_coalesced_frames)
+		netif_warn(priv, hw, ndev, "tx-frames fixed to %d\n",
+			   ENETC_TXIC_PKTTHR);
+
+	ic_mode = ENETC_IC_NONE;
+	if (tx_ictt)
+		ic_mode |= ENETC_IC_TX_MANUAL;
+	if (rx_ictt)
+		ic_mode |= ENETC_IC_RX_MANUAL;
+
+	/* commit the settings */
+	for (i = 0; i < priv->bdr_int_num; i++) {
+		struct enetc_int_vector *v = priv->int_vector[i];
+
+		v->tx_ictt = tx_ictt;
+		v->rx_ictt = rx_ictt;
+	}
+
+	if (netif_running(ndev) && ic_mode != priv->ic_mode) {
+		priv->ic_mode = ic_mode;
+		/* reconfigure the operation mode of h/w interrupts,
+		 * traffic needs to be paused in the process
+		 */
+		enetc_stop(ndev);
+		enetc_start(ndev);
+	} else {
+		priv->ic_mode = ic_mode;
+	}
+
+	return 0;
+}
+
 static int enetc_get_ts_info(struct net_device *ndev,
 			     struct ethtool_ts_info *info)
 {
@@ -617,6 +681,8 @@ static int enetc_set_wol(struct net_device *dev,
 }
 
 static const struct ethtool_ops enetc_pf_ethtool_ops = {
+	.supported_coalesce_params = ETHTOOL_COALESCE_USECS |
+				     ETHTOOL_COALESCE_MAX_FRAMES,
 	.get_regs_len = enetc_get_reglen,
 	.get_regs = enetc_get_regs,
 	.get_sset_count = enetc_get_sset_count,
@@ -629,6 +695,8 @@ static const struct ethtool_ops enetc_pf_ethtool_ops = {
 	.get_rxfh = enetc_get_rxfh,
 	.set_rxfh = enetc_set_rxfh,
 	.get_ringparam = enetc_get_ringparam,
+	.get_coalesce = enetc_get_coalesce,
+	.set_coalesce = enetc_set_coalesce,
 	.get_link_ksettings = phy_ethtool_get_link_ksettings,
 	.set_link_ksettings = phy_ethtool_set_link_ksettings,
 	.get_link = ethtool_op_get_link,
@@ -638,6 +706,8 @@ static const struct ethtool_ops enetc_pf_ethtool_ops = {
 };
 
 static const struct ethtool_ops enetc_vf_ethtool_ops = {
+	.supported_coalesce_params = ETHTOOL_COALESCE_USECS |
+				     ETHTOOL_COALESCE_MAX_FRAMES,
 	.get_regs_len = enetc_get_reglen,
 	.get_regs = enetc_get_regs,
 	.get_sset_count = enetc_get_sset_count,
@@ -649,6 +719,8 @@ static const struct ethtool_ops enetc_vf_ethtool_ops = {
 	.get_rxfh = enetc_get_rxfh,
 	.set_rxfh = enetc_set_rxfh,
 	.get_ringparam = enetc_get_ringparam,
+	.get_coalesce = enetc_get_coalesce,
+	.set_coalesce = enetc_set_coalesce,
 	.get_link = ethtool_op_get_link,
 	.get_ts_info = enetc_get_ts_info,
 };
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_hw.h b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
index 05bb4c525897..95f3c4d8f602 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
@@ -122,7 +122,10 @@ enum enetc_bdr_type {TX, RX};
 #define ENETC_RBIER_RXTIE	BIT(0)
 #define ENETC_RBIDR	0xa4
 #define ENETC_RBICR0	0xa8
-#define ENETC_RBICR0_ICEN	BIT(31)
+#define ENETC_RBICR0_ICEN		BIT(31)
+#define ENETC_RBICR0_ICPT_MASK		0x1ff
+#define ENETC_RBICR0_SET_ICPT(n)	((n) & ENETC_RBICR0_ICPT_MASK)
+#define ENETC_RBICR1	0xac
 
 /* TX BDR reg offsets */
 #define ENETC_TBMR	0
@@ -142,7 +145,10 @@ enum enetc_bdr_type {TX, RX};
 #define ENETC_TBIER_TXTIE	BIT(0)
 #define ENETC_TBIDR	0xa4
 #define ENETC_TBICR0	0xa8
-#define ENETC_TBICR0_ICEN	BIT(31)
+#define ENETC_TBICR0_ICEN		BIT(31)
+#define ENETC_TBICR0_ICPT_MASK		0xf
+#define ENETC_TBICR0_SET_ICPT(n) ((ilog2(n) + 1) & ENETC_TBICR0_ICPT_MASK)
+#define ENETC_TBICR1	0xac
 
 #define ENETC_RTBLENR_LEN(n)	((n) & ~0x7)
 
@@ -784,6 +790,15 @@ struct enetc_cbd {
 };
 
 #define ENETC_CLK  400000000ULL
+static inline u32 enetc_cycles_to_usecs(u32 cycles)
+{
+	return (u32)div_u64(cycles * 1000000ULL, ENETC_CLK);
+}
+
+static inline u32 enetc_usecs_to_cycles(u32 usecs)
+{
+	return (u32)div_u64(usecs * ENETC_CLK, 1000000ULL);
+}
 
 /* port time gating control register */
 #define ENETC_QBV_PTGCR_OFFSET		0x11a00
-- 
2.17.1



* [PATCH net-next 5/6] enetc: Drop redundant ____cacheline_aligned_in_smp
  2020-07-13 12:56 [PATCH net-next 0/6] enetc: Add adaptive interrupt coalescing Claudiu Manoil
                   ` (3 preceding siblings ...)
  2020-07-13 12:56 ` [PATCH net-next 4/6] enetc: Add interrupt coalescing support Claudiu Manoil
@ 2020-07-13 12:56 ` Claudiu Manoil
  2020-07-13 12:56 ` [PATCH net-next 6/6] enetc: Add adaptive interrupt coalescing Claudiu Manoil
  5 siblings, 0 replies; 11+ messages in thread
From: Claudiu Manoil @ 2020-07-13 12:56 UTC
  To: David S . Miller; +Cc: netdev

'struct enetc_bdr' is already '____cacheline_aligned_in_smp'.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
---
 drivers/net/ethernet/freescale/enetc/enetc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
index cec7e05ec523..af5a276ce02d 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc.h
@@ -199,7 +199,7 @@ struct enetc_int_vector {
 	struct napi_struct napi;
 	char name[ENETC_INT_NAME_MAX];
 
-	struct enetc_bdr rx_ring ____cacheline_aligned_in_smp;
+	struct enetc_bdr rx_ring;
 	struct enetc_bdr tx_ring[];
 };
 
-- 
2.17.1



* [PATCH net-next 6/6] enetc: Add adaptive interrupt coalescing
  2020-07-13 12:56 [PATCH net-next 0/6] enetc: Add adaptive interrupt coalescing Claudiu Manoil
                   ` (4 preceding siblings ...)
  2020-07-13 12:56 ` [PATCH net-next 5/6] enetc: Drop redundant ____cacheline_aligned_in_smp Claudiu Manoil
@ 2020-07-13 12:56 ` Claudiu Manoil
  2020-07-13 22:30   ` Jakub Kicinski
  5 siblings, 1 reply; 11+ messages in thread
From: Claudiu Manoil @ 2020-07-13 12:56 UTC
  To: David S . Miller; +Cc: netdev

Use the generic dynamic interrupt moderation (dim)
framework to implement adaptive interrupt coalescing
in ENETC.  With the per-packet interrupt scheme, a high
interrupt rate has been noted for moderate traffic flows,
leading to high CPU utilization.  The 'dim' scheme
implemented by the current patch addresses this issue,
improving CPU utilization while using minimal coalescing
time thresholds in order to preserve good latency.

Below are some measurement results from before and after
this patch (and its dependencies), for a system with
2 ARM Cortex-A72 CPUs @ 1.3 GHz (32 KB L1 data cache),
using netperf @ 1Gbit link (maximum throughput):

1) 1 Rx TCP flow, both Rx and Tx processed by the same NAPI
thread on the same CPU:
	CPU utilization		int rate (ints/sec)
Before:	50%-60% (over 50%)		92k
After:  just under 50%			35k
Comment:  Small CPU utilization improvement for a single
	  Rx TCP flow (i.e. netperf -t TCP_MAERTS) on a single
	  CPU.

2) 1 Rx TCP flow, Rx processing on CPU0, Tx on CPU1:
	Total CPU utilization	Total int rate (ints/sec)
Before:	60%-70%			85k CPU0 + 42k CPU1
After:  15%			3.5k CPU0 + 3.5k CPU1
Comment:  Huge improvement in total CPU utilization
	  correlated w/ a huge decrease in interrupt rate.

3) 4 Rx TCP flows + 4 Tx TCP flows (+ pings to check the latency):
	Total CPU utilization	Total int rate (ints/sec)
Before:	~80% (spikes to 90%)		~100k
After:   60% (more steady)		 ~10k
Comment:  Important improvement for this load test, while the
	  ping test outcome was not impacted.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
---
 drivers/net/ethernet/freescale/enetc/Kconfig  |   2 +
 drivers/net/ethernet/freescale/enetc/enetc.c  | 100 ++++++++++++++++--
 drivers/net/ethernet/freescale/enetc/enetc.h  |  14 ++-
 .../ethernet/freescale/enetc/enetc_ethtool.c  |  28 +++--
 4 files changed, 129 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/Kconfig b/drivers/net/ethernet/freescale/enetc/Kconfig
index 2b43848e1363..37b804f8bd76 100644
--- a/drivers/net/ethernet/freescale/enetc/Kconfig
+++ b/drivers/net/ethernet/freescale/enetc/Kconfig
@@ -4,6 +4,7 @@ config FSL_ENETC
 	depends on PCI && PCI_MSI
 	select FSL_ENETC_MDIO
 	select PHYLIB
+	select DIMLIB
 	help
 	  This driver supports NXP ENETC gigabit ethernet controller PCIe
 	  physical function (PF) devices, managing ENETC Ports at a privileged
@@ -15,6 +16,7 @@ config FSL_ENETC_VF
 	tristate "ENETC VF driver"
 	depends on PCI && PCI_MSI
 	select PHYLIB
+	select DIMLIB
 	help
 	  This driver supports NXP ENETC gigabit ethernet controller PCIe
 	  virtual function (VF) devices enabled by the ENETC PF driver.
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index e66405d1b791..0c623eaf431c 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -277,10 +277,68 @@ static irqreturn_t enetc_msix(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
-static bool enetc_clean_tx_ring(struct enetc_bdr *tx_ring, int napi_budget);
+static int enetc_clean_tx_ring(struct enetc_bdr *tx_ring, int napi_budget);
 static int enetc_clean_rx_ring(struct enetc_bdr *rx_ring,
 			       struct napi_struct *napi, int work_limit);
 
+static void enetc_rx_dim_work(struct work_struct *w)
+{
+	struct dim *dim = container_of(w, struct dim, work);
+	struct dim_cq_moder moder =
+		net_dim_get_rx_moderation(dim->mode, dim->profile_ix);
+	struct enetc_int_vector	*v =
+		container_of(dim, struct enetc_int_vector, rx_dim);
+
+	v->rx_ictt = enetc_usecs_to_cycles(moder.usec);
+	dim->state = DIM_START_MEASURE;
+}
+
+static void enetc_tx_dim_work(struct work_struct *w)
+{
+	struct dim *dim = container_of(w, struct dim, work);
+	struct dim_cq_moder moder =
+		net_dim_get_tx_moderation(dim->mode, dim->profile_ix);
+	struct enetc_int_vector	*v =
+		container_of(dim, struct enetc_int_vector, tx_dim);
+
+	v->tx_ictt = enetc_usecs_to_cycles(moder.usec);
+	dim->state = DIM_START_MEASURE;
+}
+
+static void enetc_rx_net_dim(struct enetc_int_vector *v)
+{
+	struct dim_sample dim_sample;
+
+	dim_update_sample(v->comp_cnt,
+			  v->rx_ring.stats.packets,
+			  v->rx_ring.stats.bytes,
+			  &dim_sample);
+	net_dim(&v->rx_dim, dim_sample);
+}
+
+static void enetc_tx_net_dim(struct enetc_int_vector *v)
+{
+	unsigned int packets = 0, bytes = 0;
+	struct dim_sample dim_sample;
+	int i;
+
+	for (i = 0; i < v->count_tx_rings; i++) {
+		packets += v->tx_ring[i].stats.packets;
+		bytes += v->tx_ring[i].stats.bytes;
+	}
+	dim_update_sample(v->comp_cnt, packets, bytes, &dim_sample);
+	net_dim(&v->tx_dim, dim_sample);
+}
+
+static void enetc_net_dim(struct enetc_int_vector *v)
+{
+	v->comp_cnt++;
+	if (v->rx_dim_en && v->rx_napi_work)
+		enetc_rx_net_dim(v);
+	if (v->tx_dim_en && v->tx_napi_work)
+		enetc_tx_net_dim(v);
+}
+
 static int enetc_poll(struct napi_struct *napi, int budget)
 {
 	struct enetc_int_vector
@@ -289,19 +347,31 @@ static int enetc_poll(struct napi_struct *napi, int budget)
 	int work_done;
 	int i;
 
-	for (i = 0; i < v->count_tx_rings; i++)
-		if (!enetc_clean_tx_ring(&v->tx_ring[i], budget))
+	for (i = 0; i < v->count_tx_rings; i++) {
+		work_done = enetc_clean_tx_ring(&v->tx_ring[i], budget);
+		if (work_done == ENETC_DEFAULT_TX_WORK)
 			complete = false;
+		if (work_done)
+			v->tx_napi_work = true;
+	}
 
 	work_done = enetc_clean_rx_ring(&v->rx_ring, napi, budget);
 	if (work_done == budget)
 		complete = false;
+	if (work_done)
+		v->rx_napi_work = true;
 
 	if (!complete)
 		return budget;
 
 	napi_complete_done(napi, work_done);
 
+	if (likely(v->rx_dim_en || v->tx_dim_en))
+		enetc_net_dim(v);
+
+	v->rx_napi_work = false;
+	v->tx_napi_work = false;
+
 	/* enable interrupts */
 	enetc_wr_reg(v->rbier, ENETC_RBIER_RXTIE);
 
@@ -343,7 +413,7 @@ static void enetc_tstamp_tx(struct sk_buff *skb, u64 tstamp)
 	}
 }
 
-static bool enetc_clean_tx_ring(struct enetc_bdr *tx_ring, int napi_budget)
+static int enetc_clean_tx_ring(struct enetc_bdr *tx_ring, int napi_budget)
 {
 	struct net_device *ndev = tx_ring->ndev;
 	int tx_frm_cnt = 0, tx_byte_cnt = 0;
@@ -419,7 +489,7 @@ static bool enetc_clean_tx_ring(struct enetc_bdr *tx_ring, int napi_budget)
 		netif_wake_subqueue(ndev, tx_ring->index);
 	}
 
-	return tx_frm_cnt != ENETC_DEFAULT_TX_WORK;
+	return tx_frm_cnt;
 }
 
 static bool enetc_new_page(struct enetc_bdr *rx_ring,
@@ -1077,6 +1147,7 @@ void enetc_init_si_rings_params(struct enetc_ndev_priv *priv)
 	priv->num_rx_rings = min_t(int, cpus, si->num_rx_rings);
 	priv->num_tx_rings = si->num_tx_rings;
 	priv->bdr_int_num = cpus;
+	priv->ic_mode = ENETC_IC_ADAPTIVE;
 
 	/* SI specific */
 	si->cbd_ring.bd_count = ENETC_CBDR_DEFAULT_SIZE;
@@ -1320,7 +1391,8 @@ static void enetc_setup_interrupts(struct enetc_ndev_priv *priv)
 
 	/* enable Tx & Rx event indication */
 	for (i = 0; i < priv->num_rx_rings; i++) {
-		if (priv->ic_mode & ENETC_IC_RX_MANUAL) {
+		if (priv->ic_mode &
+		    (ENETC_IC_RX_MANUAL | ENETC_IC_RX_ADAPTIVE)) {
 			icpt = ENETC_RBICR0_SET_ICPT(ENETC_RXIC_PKTTHR);
 			/* init to non-0 minimum, will be adjusted later */
 			ictt = 0x1;
@@ -1335,7 +1407,8 @@ static void enetc_setup_interrupts(struct enetc_ndev_priv *priv)
 	}
 
 	for (i = 0; i < priv->num_tx_rings; i++) {
-		if (priv->ic_mode & ENETC_IC_TX_MANUAL) {
+		if (priv->ic_mode &
+		    (ENETC_IC_TX_MANUAL | ENETC_IC_TX_ADAPTIVE)) {
 			icpt = ENETC_TBICR0_SET_ICPT(ENETC_TXIC_PKTTHR);
 			/* init to non-0 minimum, will be adjusted later */
 			ictt = 0x1;
@@ -1793,6 +1866,15 @@ int enetc_alloc_msix(struct enetc_ndev_priv *priv)
 
 		priv->int_vector[i] = v;
 
+		/* init defaults for adaptive IC */
+		if (priv->ic_mode == ENETC_IC_ADAPTIVE) {
+			v->tx_ictt = 0x1;
+			v->rx_ictt = 0x1;
+			v->tx_dim_en = true;
+			v->rx_dim_en = true;
+		}
+		INIT_WORK(&v->rx_dim.work, enetc_rx_dim_work);
+		INIT_WORK(&v->tx_dim.work, enetc_tx_dim_work);
 		netif_napi_add(priv->ndev, &v->napi, enetc_poll,
 			       NAPI_POLL_WEIGHT);
 		v->count_tx_rings = v_tx_rings;
@@ -1828,6 +1910,8 @@ int enetc_alloc_msix(struct enetc_ndev_priv *priv)
 fail:
 	while (i--) {
 		netif_napi_del(&priv->int_vector[i]->napi);
+		cancel_work_sync(&priv->int_vector[i]->rx_dim.work);
+		cancel_work_sync(&priv->int_vector[i]->tx_dim.work);
 		kfree(priv->int_vector[i]);
 	}
 
@@ -1844,6 +1928,8 @@ void enetc_free_msix(struct enetc_ndev_priv *priv)
 		struct enetc_int_vector *v = priv->int_vector[i];
 
 		netif_napi_del(&v->napi);
+		cancel_work_sync(&v->rx_dim.work);
+		cancel_work_sync(&v->tx_dim.work);
 	}
 
 	for (i = 0; i < priv->num_rx_rings; i++)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
index af5a276ce02d..b9d1215c7155 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc.h
@@ -10,6 +10,7 @@
 #include <linux/ethtool.h>
 #include <linux/if_vlan.h>
 #include <linux/phy.h>
+#include <linux/dim.h>
 
 #include "enetc_hw.h"
 
@@ -196,12 +197,17 @@ struct enetc_int_vector {
 	u32 tx_ictt;
 	unsigned long tx_rings_map;
 	int count_tx_rings;
-	struct napi_struct napi;
+	u16 comp_cnt;
+	bool rx_dim_en, tx_dim_en;
+	bool rx_napi_work, tx_napi_work;
+	struct napi_struct napi ____cacheline_aligned_in_smp;
+	struct dim rx_dim ____cacheline_aligned_in_smp;
+	struct dim tx_dim;
 	char name[ENETC_INT_NAME_MAX];
 
 	struct enetc_bdr rx_ring;
 	struct enetc_bdr tx_ring[];
-};
+} ____cacheline_aligned_in_smp;
 
 struct enetc_cls_rule {
 	struct ethtool_rx_flow_spec fs;
@@ -232,8 +238,12 @@ enum enetc_ic_mode {
 	/* activated when int coalescing time is set to a non-0 value */
 	ENETC_IC_RX_MANUAL = BIT(0),
 	ENETC_IC_TX_MANUAL = BIT(1),
+	/* use dynamic interrupt moderation */
+	ENETC_IC_RX_ADAPTIVE = BIT(2),
+	ENETC_IC_TX_ADAPTIVE = BIT(3),
 };
 
+#define ENETC_IC_ADAPTIVE	(ENETC_IC_RX_ADAPTIVE | ENETC_IC_TX_ADAPTIVE)
 #define ENETC_RXIC_PKTTHR	min_t(u32, 256, ENETC_RX_RING_DEFAULT_SIZE / 2)
 #define ENETC_TXIC_PKTTHR	min_t(u32, 128, ENETC_TX_RING_DEFAULT_SIZE / 2)
 
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c b/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c
index 8e0867fa1af6..344a1105444f 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c
@@ -576,6 +576,9 @@ static int enetc_get_coalesce(struct net_device *ndev,
 	ic->tx_max_coalesced_frames = ENETC_TXIC_PKTTHR;
 	ic->rx_max_coalesced_frames = ENETC_RXIC_PKTTHR;
 
+	ic->use_adaptive_rx_coalesce = priv->ic_mode & ENETC_IC_RX_ADAPTIVE;
+	ic->use_adaptive_tx_coalesce = priv->ic_mode & ENETC_IC_TX_ADAPTIVE;
+
 	return 0;
 }
 
@@ -598,10 +601,19 @@ static int enetc_set_coalesce(struct net_device *ndev,
 			   ENETC_TXIC_PKTTHR);
 
 	ic_mode = ENETC_IC_NONE;
-	if (tx_ictt)
-		ic_mode |= ENETC_IC_TX_MANUAL;
-	if (rx_ictt)
-		ic_mode |= ENETC_IC_RX_MANUAL;
+	if (ic->use_adaptive_rx_coalesce) {
+		ic_mode |= ENETC_IC_RX_ADAPTIVE;
+		rx_ictt = 0x1;
+	} else {
+		ic_mode |= rx_ictt ? ENETC_IC_RX_MANUAL : 0;
+	}
+
+	if (ic->use_adaptive_tx_coalesce) {
+		ic_mode |= ENETC_IC_TX_ADAPTIVE;
+		tx_ictt = 0x1;
+	} else {
+		ic_mode |= tx_ictt ? ENETC_IC_TX_MANUAL : 0;
+	}
 
 	/* commit the settings */
 	for (i = 0; i < priv->bdr_int_num; i++) {
@@ -609,6 +621,8 @@ static int enetc_set_coalesce(struct net_device *ndev,
 
 		v->tx_ictt = tx_ictt;
 		v->rx_ictt = rx_ictt;
+		v->tx_dim_en = !!(ic_mode & ENETC_IC_TX_ADAPTIVE);
+		v->rx_dim_en = !!(ic_mode & ENETC_IC_RX_ADAPTIVE);
 	}
 
 	if (netif_running(ndev) && ic_mode != priv->ic_mode) {
@@ -682,7 +696,8 @@ static int enetc_set_wol(struct net_device *dev,
 
 static const struct ethtool_ops enetc_pf_ethtool_ops = {
 	.supported_coalesce_params = ETHTOOL_COALESCE_USECS |
-				     ETHTOOL_COALESCE_MAX_FRAMES,
+				     ETHTOOL_COALESCE_MAX_FRAMES |
+				     ETHTOOL_COALESCE_USE_ADAPTIVE,
 	.get_regs_len = enetc_get_reglen,
 	.get_regs = enetc_get_regs,
 	.get_sset_count = enetc_get_sset_count,
@@ -707,7 +722,8 @@ static const struct ethtool_ops enetc_pf_ethtool_ops = {
 
 static const struct ethtool_ops enetc_vf_ethtool_ops = {
 	.supported_coalesce_params = ETHTOOL_COALESCE_USECS |
-				     ETHTOOL_COALESCE_MAX_FRAMES,
+				     ETHTOOL_COALESCE_MAX_FRAMES |
+				     ETHTOOL_COALESCE_USE_ADAPTIVE,
 	.get_regs_len = enetc_get_reglen,
 	.get_regs = enetc_get_regs,
 	.get_sset_count = enetc_get_sset_count,
-- 
2.17.1



* Re: [PATCH net-next 4/6] enetc: Add interrupt coalescing support
  2020-07-13 12:56 ` [PATCH net-next 4/6] enetc: Add interrupt coalescing support Claudiu Manoil
@ 2020-07-13 22:18   ` Jakub Kicinski
  0 siblings, 0 replies; 11+ messages in thread
From: Jakub Kicinski @ 2020-07-13 22:18 UTC (permalink / raw)
  To: Claudiu Manoil; +Cc: David S . Miller, netdev

On Mon, 13 Jul 2020 15:56:08 +0300 Claudiu Manoil wrote:
> +static int enetc_get_coalesce(struct net_device *ndev,
> +			      struct ethtool_coalesce *ic)
> +{
> +	struct enetc_ndev_priv *priv = netdev_priv(ndev);
> +	struct enetc_int_vector *v = priv->int_vector[0];
> +
> +	memset(ic, 0, sizeof(*ic));

nit: no need to zero this out

> +	ic->tx_coalesce_usecs = enetc_cycles_to_usecs(v->tx_ictt);
> +	ic->rx_coalesce_usecs = enetc_cycles_to_usecs(v->rx_ictt);
> +
> +	ic->tx_max_coalesced_frames = ENETC_TXIC_PKTTHR;
> +	ic->rx_max_coalesced_frames = ENETC_RXIC_PKTTHR;
> +
> +	return 0;
> +}
> +
> +static int enetc_set_coalesce(struct net_device *ndev,
> +			      struct ethtool_coalesce *ic)
> +{
> +	struct enetc_ndev_priv *priv = netdev_priv(ndev);
> +	u32 rx_ictt, tx_ictt;
> +	int i, ic_mode;
> +
> +	tx_ictt = enetc_usecs_to_cycles(ic->tx_coalesce_usecs);
> +	rx_ictt = enetc_usecs_to_cycles(ic->rx_coalesce_usecs);
> +
> +	if (!ic->rx_max_coalesced_frames)

Isn't it better to check != ENETC_RXIC_PKTTHR, rather than != 0?

> +		netif_warn(priv, hw, ndev, "rx-frames fixed to %d\n",
> +			   ENETC_RXIC_PKTTHR);
> +
> +	if (!ic->tx_max_coalesced_frames)
> +		netif_warn(priv, hw, ndev, "tx-frames fixed to %d\n",
> +			   ENETC_TXIC_PKTTHR);
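
One way to read the suggestion above is to warn whenever the requested
frame count differs from the fixed threshold, not only when it is zero.
The following is a minimal stand-alone sketch, not driver code:
FIXED_RX_PKTTHR and rx_frames_request_ignored() are hypothetical
stand-ins for ENETC_RXIC_PKTTHR and the check in enetc_set_coalesce(),
and the value 128 is only a placeholder.

```c
#include <stdbool.h>

/* Hypothetical stand-in for ENETC_RXIC_PKTTHR; the value is a placeholder. */
#define FIXED_RX_PKTTHR 128u

/*
 * Returns true when the requested rx-frames value cannot be honoured,
 * i.e. when it is anything other than the fixed hardware threshold.
 * A request of 0 is covered as well, since 0 != FIXED_RX_PKTTHR.
 */
static bool rx_frames_request_ignored(unsigned int requested)
{
	return requested != FIXED_RX_PKTTHR;
}
```

With this form, a request that merely repeats the fixed value (as
userspace typically does after reading the current settings) stays
silent, while any other value triggers the warning.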


* Re: [PATCH net-next 6/6] enetc: Add adaptive interrupt coalescing
  2020-07-13 12:56 ` [PATCH net-next 6/6] enetc: Add adaptive interrupt coalescing Claudiu Manoil
@ 2020-07-13 22:30   ` Jakub Kicinski
  2020-07-14 11:21     ` Claudiu Manoil
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2020-07-13 22:30 UTC (permalink / raw)
  To: Claudiu Manoil; +Cc: David S . Miller, netdev

On Mon, 13 Jul 2020 15:56:10 +0300 Claudiu Manoil wrote:
> Use the generic dynamic interrupt moderation (dim)
> framework to implement adaptive interrupt coalescing
> in ENETC.  With the per-packet interrupt scheme, a high
> interrupt rate has been noted for moderate traffic flows
> leading to high CPU utilization.  The 'dim' scheme
> implemented by the current patch addresses this issue
> improving CPU utilization while using minimal coalescing
> time thresholds in order to preserve a good latency.
> 
> Below are some measurement results from before and after
> this patch (and its dependencies), for a system with
> 2 ARM Cortex-A72 CPUs @ 1.3GHz (32 KB L1 data cache),
> using netperf over a 1Gbit link (maximum throughput):
> 
> 1) 1 Rx TCP flow, both Rx and Tx processed by the same NAPI
> thread on the same CPU:
> 	CPU utilization		int rate (ints/sec)
> Before:	50%-60% (over 50%)		92k
> After:  just under 50%			35k
> Comment:  Small CPU utilization improvement for a single Rx TCP
> 	  flow (i.e. netperf -t TCP_MAERTS) on a single CPU.
> 
> 2) 1 Rx TCP flow, Rx processing on CPU0, Tx on CPU1:
> 	Total CPU utilization	Total int rate (ints/sec)
> Before:	60%-70%			85k CPU0 + 42k CPU1
> After:  15%			3.5k CPU0 + 3.5k CPU1
> Comment:  Huge improvement in total CPU utilization,
> 	  correlated with a huge decrease in interrupt rate.
> 
> 3) 4 Rx TCP flows + 4 Tx TCP flows (+ pings to check the latency):
> 	Total CPU utilization	Total int rate (ints/sec)
> Before:	~80% (spikes to 90%)		~100k
> After:   60% (more steady)		 ~10k
> Comment:  Important improvement for this load test, while the
> 	  ping test outcome was not impacted.
> 
> Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>

Does it really make sense to implement DIM for TX?

For TX the only thing we care about is that no queue in the system
underflows. So the calculation is simply timeout = queue len / speed.
The only problem is which queue in the system is the smallest (TX 
ring, TSQ etc.) but IMHO there's little point in the extra work to
calculate the thresholds dynamically. On real life workloads the
scheduler overhead the async work structs introduce causes measurable
regressions.

That's just to share my experience, up to you to decide if you want 
to keep the TX-side DIM or not :)
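
The "timeout = queue len / speed" rule above can be sketched as follows.
This is only an illustration under stated assumptions: the function
names are hypothetical, queue sizes are expressed in bytes, and the
result is an upper bound based on the smallest queue in the path, not a
tuned value.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Time in microseconds needed to drain queue_bytes at link_mbps.
 * 1 Mbit/s is exactly 1 bit per microsecond, so bits / Mbit/s yields us.
 */
static uint64_t drain_time_us(uint64_t queue_bytes, uint32_t link_mbps)
{
	if (link_mbps == 0)
		return 0;	/* no link speed known: report no budget */
	return (queue_bytes * 8) / link_mbps;
}

/*
 * Upper bound for a Tx coalescing timeout: the drain time of the
 * smallest queue in the transmit path (Tx ring, TSQ, ...).
 */
static uint64_t tx_timeout_bound_us(const uint64_t *queue_bytes, size_t n,
				    uint32_t link_mbps)
{
	uint64_t smallest;
	size_t i;

	if (n == 0)
		return 0;

	smallest = queue_bytes[0];
	for (i = 1; i < n; i++)
		if (queue_bytes[i] < smallest)
			smallest = queue_bytes[i];

	return drain_time_us(smallest, link_mbps);
}
```

In practice some safety margin would be subtracted from the returned
bound, so that the completion interrupt fires before the smallest queue
actually runs dry.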


* RE: [PATCH net-next 6/6] enetc: Add adaptive interrupt coalescing
  2020-07-13 22:30   ` Jakub Kicinski
@ 2020-07-14 11:21     ` Claudiu Manoil
  2020-07-14 16:54       ` Jakub Kicinski
  0 siblings, 1 reply; 11+ messages in thread
From: Claudiu Manoil @ 2020-07-14 11:21 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: David S . Miller, netdev

>-----Original Message-----
>From: Jakub Kicinski <kuba@kernel.org>
[...]
>Subject: Re: [PATCH net-next 6/6] enetc: Add adaptive interrupt coalescing
>
>On Mon, 13 Jul 2020 15:56:10 +0300 Claudiu Manoil wrote:
>> Use the generic dynamic interrupt moderation (dim)
>> framework to implement adaptive interrupt coalescing
>> in ENETC.  With the per-packet interrupt scheme, a high
>> interrupt rate has been noted for moderate traffic flows
>> leading to high CPU utilization.  The 'dim' scheme
>> implemented by the current patch addresses this issue
>> improving CPU utilization while using minimal coalescing
>> time thresholds in order to preserve a good latency.
>>
>> Below are some measurement results from before and after
>> this patch (and its dependencies), for a system with
>> 2 ARM Cortex-A72 CPUs @ 1.3GHz (32 KB L1 data cache),
>> using netperf over a 1Gbit link (maximum throughput):
>>
>> 1) 1 Rx TCP flow, both Rx and Tx processed by the same NAPI
>> thread on the same CPU:
>> 	CPU utilization		int rate (ints/sec)
>> Before:	50%-60% (over 50%)		92k
>> After:  just under 50%			35k
>> Comment:  Small CPU utilization improvement for a single Rx TCP
>> 	  flow (i.e. netperf -t TCP_MAERTS) on a single CPU.
>>
>> 2) 1 Rx TCP flow, Rx processing on CPU0, Tx on CPU1:
>> 	Total CPU utilization	Total int rate (ints/sec)
>> Before:	60%-70%			85k CPU0 + 42k CPU1
>> After:  15%			3.5k CPU0 + 3.5k CPU1
>> Comment:  Huge improvement in total CPU utilization,
>> 	  correlated with a huge decrease in interrupt rate.
>>
>> 3) 4 Rx TCP flows + 4 Tx TCP flows (+ pings to check the latency):
>> 	Total CPU utilization	Total int rate (ints/sec)
>> Before:	~80% (spikes to 90%)		~100k
>> After:   60% (more steady)		 ~10k
>> Comment:  Important improvement for this load test, while the
>> 	  ping test outcome was not impacted.
>>
>> Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
>
>Does it really make sense to implement DIM for TX?
>
>For TX the only thing we care about is that no queue in the system
>underflows. So the calculation is simply timeout = queue len / speed.
>The only problem is which queue in the system is the smallest (TX
>ring, TSQ etc.) but IMHO there's little point in the extra work to
>calculate the thresholds dynamically. On real life workloads the
>scheduler overhead the async work structs introduce causes measurable
>regressions.
>
>That's just to share my experience, up to you to decide if you want
>to keep the TX-side DIM or not :)

Yeah, I'm not happy with Tx DIM either; it seems like too much overhead
for this device.
But there seemed to be no other option left: leaving coalescing disabled
for Tx is not an option because there are too many Tx interrupts, and
coming up with a single Tx coalescing time threshold that covers all
possible cases is not feasible either.  However, your suggestion to compute
the Tx coalescing values based on link speed (at least that's how I read it)
is worth investigating.  This device is supposed to handle link speeds
ranging from 10Mbit to 2.5G, so it would be great if Tx DIM could be
replaced in this case by a set of precomputed values based on link speed.
I'm going to look into this.  If you have any other suggestions, please let me know.
Thanks.
Claudiu
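
A set of precomputed per-speed values, as considered above, could be
organised as a small lookup table.  The sketch below is hypothetical
throughout: the struct, the table and the lookup function are not part
of the driver, and the entries are derived from an assumed 256-entry
ring of 64-byte frames purely to have concrete numbers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumptions for illustration only: a 256-entry Tx ring filled with
 * 64-byte frames.  Neither number is taken from the driver.
 */
#define SKETCH_RING_ENTRIES	256u
#define SKETCH_FRAME_BYTES	64u
/* 1 Mbit/s == 1 bit/us, so ring bits / Mbit/s gives microseconds. */
#define SKETCH_DRAIN_US(mbps) \
	((SKETCH_RING_ENTRIES * SKETCH_FRAME_BYTES * 8u) / (mbps))

struct tx_coal_entry {
	uint32_t speed_mbps;
	uint32_t usecs;
};

/* One entry per link speed, ordered from slowest to fastest. */
static const struct tx_coal_entry tx_coal_table[] = {
	{   10, SKETCH_DRAIN_US(10)   },
	{  100, SKETCH_DRAIN_US(100)  },
	{ 1000, SKETCH_DRAIN_US(1000) },
	{ 2500, SKETCH_DRAIN_US(2500) },
};

/* Look up the Tx coalescing time for the negotiated link speed. */
static uint32_t tx_coal_usecs_for_speed(uint32_t speed_mbps)
{
	size_t n = sizeof(tx_coal_table) / sizeof(tx_coal_table[0]);
	size_t i;

	for (i = 0; i < n; i++)
		if (tx_coal_table[i].speed_mbps == speed_mbps)
			return tx_coal_table[i].usecs;

	/* Unknown speed: use the fastest link's entry, the shortest timeout. */
	return tx_coal_table[n - 1].usecs;
}
```

The lookup would be run whenever the link comes up or renegotiates, so
no periodic work item is needed on the Tx side.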


* Re: [PATCH net-next 6/6] enetc: Add adaptive interrupt coalescing
  2020-07-14 11:21     ` Claudiu Manoil
@ 2020-07-14 16:54       ` Jakub Kicinski
  0 siblings, 0 replies; 11+ messages in thread
From: Jakub Kicinski @ 2020-07-14 16:54 UTC (permalink / raw)
  To: Claudiu Manoil; +Cc: David S . Miller, netdev

On Tue, 14 Jul 2020 11:21:45 +0000 Claudiu Manoil wrote:
> >Does it really make sense to implement DIM for TX?
> >
> >For TX the only thing we care about is that no queue in the system
> >underflows. So the calculation is simply timeout = queue len / speed.
> >The only problem is which queue in the system is the smallest (TX
> >ring, TSQ etc.) but IMHO there's little point in the extra work to
> >calculate the thresholds dynamically. On real life workloads the
> >scheduler overhead the async work structs introduce cause measurable
> >regressions.
> >
> >That's just to share my experience, up to you to decide if you want
> >to keep the TX-side DIM or not :)  
> 
> Yeah, I'm not happy either with Tx DIM, it seems too much for this device,
> too much overhead.
> But it seemed there's no other option left, because leaving coalescing as
> disabled for Tx is not an option as there are too many Tx interrupts, but
> on the other hand coming up with a single Tx coalescing time threshold to
> cover all the possible cases is not feasible either.  However your suggestion
> to compute the Tx coalescing values based on link speed, at least that's how
> I read it, is worth investigating.  This device is supposed to handle link speeds
> ranging from 10Mbit to 2.5G, so it would be great if TX DIM could be replaced
> in this case by a set of precomputed values based on link speed.
> I'm going to look into this.  If you have any other suggestion on this pls let me know.

If you were happy with TX DIM - my guess would be that even if you
leave the TX coalescing with the value optimal for 2.5G - it will be
perfectly fine for other speeds, too. TX DIM is quite aggressive, if
I'm reading the code correctly it maxes out at 64us - which is a low
value for TX.

In my experiments with 25G NICs and TCP workloads (and some synthetic
netperf TCP_RR) the optimal value seems to be TSQ / link speed (- some
safety margin). Which is ~360us for 25G, since the TSQ value was bumped
to 1MB in recent kernels.

Obviously YMMV if the system is running a routing or raw socket app.
Then you presumably want to sustain max throughput on 2.5G with min
sized frames. And your rings by default hold 256 entries - that's still
~50us to complete a ring.
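
The arithmetic behind these figures can be checked with a short helper.
This is a sketch under simplifying assumptions: wire_time_us() is a
hypothetical name, only frame bytes are counted (no preamble or
inter-frame gap), min-sized frames are taken as 64 bytes, and 1MB is
taken as 1048576 bytes.

```c
#include <stdint.h>

/*
 * Microseconds needed to put `bytes` on the wire at `gbps` Gbit/s.
 * 1 Gbit/s is 1000 bits per microsecond.
 */
static double wire_time_us(uint64_t bytes, double gbps)
{
	return (double)bytes * 8.0 / (gbps * 1000.0);
}
```

Under these assumptions, wire_time_us(256 * 64, 2.5) comes to about
52us, in line with the ~50us ring figure above, and
wire_time_us(1048576, 25.0) comes to about 336us, the same order as the
~360us quoted for 25G; the message does not spell out the exact inputs
behind that number.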
